Absurd thread management, probably in cv::kmeans

I have a performance problem with processing that uses, among other things, cv::kmeans. 90% of the execution time is lost in things like Concurrency::details::WorkSearchContext::StealLocalRunnable or Concurrency::details::SchedulingNode::FindVirtualProcessor. Total execution time is about 45 seconds. Trying various cv::setNumThreads() values did not change much. Does anyone have deeper knowledge of what’s happening with those schedulers? It makes no difference whether it’s OpenCV 3.4.1 or 4.5.5.

Maybe you could add what kind of “debugging” you used to get that information?
Then people here could try to reproduce it.
Also: which OS? Which hardware?

A minimal reproducible example (MRE) would be handy. I’m curious to reproduce this.

Well, meanwhile we’ve found out what’s happening. It turned out to be normal behaviour of the Visual Studio (VC++) profiler, which shows only the main thread’s performance (at least in the configuration that had been sufficient for me so far); the other threads (15 in this case) are visible only as calls into the scheduler mechanism.
cv::kmeans splits its work across threads and does indeed work as diligently as an ant, but I could see only 1/16th of it, and the rest looked like frantic, useless scheduler activity.
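To illustrate why a main-thread-only view shows just 1/16th of the real work, here is a toy sketch with std::thread. It is not OpenCV's implementation, and it does not use the Concurrency Runtime scheduler seen in the profile; assignRange and assignParallel are hypothetical names, and the 1-D "nearest center" step is only a stand-in for the per-sample work in k-means. The calling thread processes one chunk itself and hands the rest to workers, so with 16 threads it does 1/16th of the samples and spends the remaining time waiting.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Toy stand-in for the k-means assignment step on 1-D data:
// label each sample in [begin, end) with the index of its nearest center.
// Assumes centers is non-empty.
static void assignRange(const std::vector<float>& samples,
                        const std::vector<float>& centers,
                        std::vector<int>& labels,
                        std::size_t begin, std::size_t end)
{
    for (std::size_t i = begin; i < end; ++i) {
        int best = 0;
        float bestDist = (samples[i] - centers[0]) * (samples[i] - centers[0]);
        for (std::size_t c = 1; c < centers.size(); ++c) {
            const float d = (samples[i] - centers[c]) * (samples[i] - centers[c]);
            if (d < bestDist) { bestDist = d; best = static_cast<int>(c); }
        }
        labels[i] = best;
    }
}

// Splits the samples into numThreads contiguous chunks. The calling thread
// processes chunk 0 itself; the other chunks go to worker threads.
// Returns how many samples the calling thread handled.
std::size_t assignParallel(const std::vector<float>& samples,
                           const std::vector<float>& centers,
                           std::vector<int>& labels,
                           std::size_t numThreads)
{
    if (numThreads == 0) numThreads = 1;
    const std::size_t n = samples.size();
    labels.assign(n, -1);
    const std::size_t chunk = (n + numThreads - 1) / numThreads;

    std::vector<std::thread> workers;
    for (std::size_t t = 1; t < numThreads; ++t) {
        const std::size_t b = std::min(n, t * chunk);
        const std::size_t e = std::min(n, b + chunk);
        // Each worker writes to a disjoint range of labels, so no locking is needed.
        workers.emplace_back(assignRange, std::cref(samples), std::cref(centers),
                             std::ref(labels), b, e);
    }

    const std::size_t callerEnd = std::min(n, chunk);
    assignRange(samples, centers, labels, 0, callerEnd);  // the caller's own share
    for (auto& w : workers) w.join();                     // the rest of the time is waiting
    return callerEnd;
}
```

Calling assignParallel on 1600 samples with numThreads = 16 returns 100: that is the share of the work a profile limited to the calling thread would attribute to the actual computation.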
What alarmed me was that no other OpenCV function I have used so far has shown similar behaviour, so I thought it was some kind of failure. Or VS’s mood; in this respect it’s sometimes nearly intelligent ;->
I have also found information that k-means is simply slow by nature and that implementations are not well optimized, which would explain both why the execution took so long and why the work gets split into so many tasks.
