How to speed up fillConvexPoly?

Witek · February 20, 2021, 7:31pm

Hi,

I am drawing a couple of convex polygons (that overlap one another - this could be important) and I wanted to draw them in parallel to speed things up. Here is the code:

vector<Point> face;
vector<vector<Point>> faces;
	
//polygon 1
face.push_back((Point)model2D.col(0));
face.push_back((Point)model2D.col(1));
face.push_back((Point)model2D.col(2));
face.push_back((Point)model2D.col(3));
faces.push_back(face);

//here I fill other faces

//and finally I draw them
for (int i = 0; i < faces.size(); i++)
	fillConvexPoly(img, faces[i], CV_RGB(255, 0, 0));

Now, I thought that adding the following line just before the drawing loop would speed up the process. Alas, it actually makes it slower!

#pragma omp parallel for

OpenMP is working in general; I set the number of threads to 12. Could the problem be access to the img data, since the polygons overlap one another? Or am I making some basic mistake? How do I speed it up?

matti.vuori · February 21, 2021, 10:37am

I don’t know anything about OpenMP, but I think it is likely that the whole image will be locked for each thread in turn.
You could test the effect of overlapping by trying

crackwitz · February 21, 2021, 2:08pm

that is the problem.

what result do you expect when multiple draw calls work on the same data concurrently? which thread wins when multiple write to the same byte/word/cache line?

I hate to say it but this is fundamental stuff in parallel programming. you’ll have to find a book or course or tutorial that covers these aspects.

perhaps some computer graphics introduction would be in order too. OpenGL/Vulkan/Direct3D if it has to be a specific API, but they share the principles.

Witek · February 21, 2021, 3:39pm

Yes, I am aware of the concurrent access problem, however I am not sure if it makes any difference if the polygons DO NOT overlap (I have some that do and some that don’t).

crackwitz · February 22, 2021, 5:43pm

ok, let’s ignore the issue of hazards and only look at slowdowns.

if multiple cores access the same “cache lines”, they’ll fight over it, which costs synchronization (cache coherence). a typical cache line is large enough to span a few pixels.

and that’s the absolute minimum of issues you’ll face.

as mentioned, if these threads happen to lock the whole picture for access, you’re back to serial execution.
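
To put a rough number on the cache-line point, here is a small sketch. It assumes a 64-byte cache line, an 8-bit 3-channel image (3 bytes per pixel) and a buffer that starts on a cache-line boundary. None of these values come from the thread, and `cacheLineOf` and `sharesCacheLine` are hypothetical helpers, not OpenCV functions.

```cpp
#include <cstddef>

// Assumed values for illustration only; check your own hardware and image type.
constexpr std::size_t kCacheLineBytes = 64;  // common cache-line size on x86-64
constexpr std::size_t kBytesPerPixel = 3;    // 8-bit, 3-channel pixel

// Index of the cache line that holds the first byte of pixel (row, col),
// assuming the buffer starts on a cache-line boundary and every row
// occupies rowStepBytes bytes.
constexpr std::size_t cacheLineOf(std::size_t row, std::size_t col,
                                  std::size_t rowStepBytes) {
    return (row * rowStepBytes + col * kBytesPerPixel) / kCacheLineBytes;
}

// True if both pixels start in the same cache line, i.e. two cores
// writing them would contend for that line even though the pixels differ.
constexpr bool sharesCacheLine(std::size_t rowA, std::size_t colA,
                               std::size_t rowB, std::size_t colB,
                               std::size_t rowStepBytes) {
    return cacheLineOf(rowA, colA, rowStepBytes) ==
           cacheLineOf(rowB, colB, rowStepBytes);
}
```

Under these assumptions one line covers about 21 neighbouring pixels of a row, so two polygons can slow each other down without overlapping, simply by having edges within a couple of dozen pixels of each other on the same row.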

kishor_durve · February 27, 2021, 2:20am

If you are drawing only a couple (or maybe a few more) polygons and simultaneous access by threads is causing the problem, maybe you could just create (memory and image size permitting) an image for each thread, have each thread write to its own image and, once all threads are done, OR all the images together? A brute-force method, but…
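
A minimal sketch of that idea, kept free of OpenCV so it is self-contained: `Box`, `fillBox` and `drawParallel` are hypothetical stand-ins (an axis-aligned box instead of a convex polygon, a plain byte vector instead of `cv::Mat`). With OpenCV, each layer would presumably be its own `cv::Mat` filled by `fillConvexPoly` and merged with `cv::bitwise_or`. Note that an OR merge only gives the same result as drawing into one image when all shapes share one colour, as in the question, or when the layers are masks.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a convex polygon: the box [x0, x1) x [y0, y1).
struct Box { int x0, y0, x1, y1; };

using Image = std::vector<std::uint8_t>;  // single-channel, row-major

// Stand-in for cv::fillConvexPoly: sets every pixel inside the box to 255.
// The box is assumed to lie fully inside the image (no clipping here).
void fillBox(Image& img, int width, const Box& b) {
    for (int y = b.y0; y < b.y1; ++y)
        for (int x = b.x0; x < b.x1; ++x)
            img[static_cast<std::size_t>(y) * width + x] = 255;
}

// Draws all boxes using one private layer per worker, then ORs the layers.
Image drawParallel(const std::vector<Box>& boxes, int width, int height,
                   int numLayers) {
    numLayers = std::max(1, numLayers);
    const std::size_t pixels = static_cast<std::size_t>(width) * height;
    std::vector<Image> layers(numLayers, Image(pixels, 0));

    // Each iteration writes only to layers[t], so no two threads ever touch
    // the same buffer. Without OpenMP enabled the pragma is ignored and the
    // loop simply runs serially with the same result.
    #pragma omp parallel for
    for (int t = 0; t < numLayers; ++t)
        for (std::size_t i = t; i < boxes.size(); i += numLayers)
            fillBox(layers[t], width, boxes[i]);

    // Serial merge: a pixel is set if any layer set it.
    Image result(pixels, 0);
    for (const Image& layer : layers)
        for (std::size_t p = 0; p < pixels; ++p)
            result[p] |= layer[p];
    return result;
}
```

The merge is a full pass over every layer, so for a handful of small polygons this may well be slower than the plain serial loop; it is worth timing both before committing to it.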
