How to speed up fillConvexPoly?

ok, let’s ignore the issue of hazards and only look at slowdowns.

if multiple cores write to the same cache line, they'll fight over it, which costs synchronization (cache coherence traffic). a cache line is typically 64 bytes, so a single line covers around 21 pixels of an 8-bit, 3-channel image.
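to put numbers on that, here's a rough sketch of the arithmetic. it assumes a 64-byte cache line (common, but check your CPU) and an image buffer that starts on a cache line boundary. the helper names are made up for illustration, they're not part of OpenCV.

```python
# Assumption: 64-byte cache lines and a buffer aligned to a cache line.
CACHE_LINE_BYTES = 64


def pixels_per_cache_line(channels, bytes_per_channel=1):
    """How many whole pixels fit into one cache line."""
    return CACHE_LINE_BYTES // (channels * bytes_per_channel)


def lines_touched(x, y, row_stride_bytes, bytes_per_pixel):
    """Set of cache line indices that the pixel at (x, y) occupies."""
    first = y * row_stride_bytes + x * bytes_per_pixel
    last = first + bytes_per_pixel - 1
    return set(range(first // CACHE_LINE_BYTES, last // CACHE_LINE_BYTES + 1))


def share_cache_line(p, q, row_stride_bytes, bytes_per_pixel):
    """True if pixels p and q (given as (x, y)) overlap in some cache line."""
    a = lines_touched(p[0], p[1], row_stride_bytes, bytes_per_pixel)
    b = lines_touched(q[0], q[1], row_stride_bytes, bytes_per_pixel)
    return bool(a & b)


if __name__ == "__main__":
    # Hypothetical 640-pixel-wide, 8-bit BGR image: 3 bytes/pixel, 1920 bytes/row.
    stride, bpp = 640 * 3, 3
    print(pixels_per_cache_line(3))                           # 21
    print(share_cache_line((0, 0), (20, 0), stride, bpp))     # True
    print(share_cache_line((0, 0), (22, 0), stride, bpp))     # False
    print(share_cache_line((0, 0), (0, 1), stride, bpp))      # False
```

so two threads drawing into horizontally neighboring pixels will likely contend, while threads working on different rows usually won't, as long as the row stride is a multiple of the cache line size.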

and that’s the absolute minimum of issues you’ll face.

as mentioned, if these threads happen to lock the whole picture for access, you’re back to serial execution.
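a toy illustration of that last point, using only Python's `threading` module. the lock and the `draw_polygon` worker are hypothetical stand-ins, I'm not claiming any particular library takes a lock like this. the sketch just counts how many threads are ever inside the "drawing" section at once when one lock guards the whole image.

```python
import threading


def run_with_global_lock(n_threads):
    """Return the max number of threads that were drawing at the same time."""
    image_lock = threading.Lock()  # hypothetical: one lock for the whole image
    state = {"inside": 0, "max_inside": 0}

    def draw_polygon():
        with image_lock:
            state["inside"] += 1
            state["max_inside"] = max(state["max_inside"], state["inside"])
            # ... the actual drawing into the shared image would go here ...
            state["inside"] -= 1

    threads = [threading.Thread(target=draw_polygon) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["max_inside"]


if __name__ == "__main__":
    print(run_with_global_lock(8))  # 1
```

no matter how many threads you start, only one is ever drawing, so the extra threads buy you nothing except overhead.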