In an application with 4 independent threads, I have the problem that MergeMertens gets slower the more threads are active.
If I run MergeMertens in one thread, it takes 3400 ms.
If I run the same code in 4 independent threads simultaneously, each thread takes 8300 ms.
I am using an Intel Core i9-10900K with 64 GB RAM, so the hardware should be fast enough.
I wrote my example in C# with Emgu, but that should not be the problem either.
Can this be reproduced?
Does anyone have an explanation for me?
Here is the link to my test code: HDR Test.zip
i have no idea, what this does in c#, but it does NOT disable opencv’s internal parallelism in C++
(i also do not understand, how you want to start 3 threads after disabling this …)
in general, opencv has a lot of data-parallel optimization builtin. wrapping yet another thread-parallel algorithm around it rarely improves the situation
I would like to convert 4 different images at once. It is important that the evaluation runs as stable as possible.
If MultiThreading is active, then the evaluation fluctuates extremely, because too many threads are involved.
If I switch NumThreads to 1, then the values are much more stable.
How would you handle this in C++? Maybe this info will help me.
you start no threads of your own. you do not set NumThreads. you operate on one image. measure the time.
you start no threads of your own. you do set NumThreads, to 1. you operate on one image. measure the time.
restart the program for each test. do not put both tests in the same program.
what are the times?
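For the two measurements above, a small timing helper is enough. This is only a sketch using the C++ standard library; the name `time_ms` is made up for illustration, and the OpenCV call itself is left to the caller.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds. steady_clock is monotonic, so the
// result is not affected by system clock adjustments.
double time_ms(const std::function<void()>& work) {
    assert(static_cast<bool>(work));  // an empty std::function cannot be called
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

With OpenCV you would pass a lambda that wraps the single `process` call on the object returned by `cv::createMergeMertens()`, and for the second test call `cv::setNumThreads(1)` once before measuring.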
it was but my answer may not make sense to you.
do you understand that opencv may be starting its own threads, many of them? don’t be sure that NumThreads necessarily has an effect. it’s supposed to, but don’t rely on it.
do you understand that if you start four calls, and each call uses several threads (probably as many as your CPU has cores), that WILL create more threads than can be executed independently on your CPU?
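The arithmetic behind that can be sketched as follows. The function name is hypothetical, and the thread count per call is an assumption (one worker per hardware thread), not something measured in this thread. The i9-10900K has 10 cores and 20 hardware threads.

```cpp
#include <cassert>

// Hypothetical helper: ratio of runnable worker threads to hardware
// threads. A value above 1.0 means the threads have to share cores.
double oversubscription(int parallel_calls, int threads_per_call, int hardware_threads) {
    assert(parallel_calls > 0 && threads_per_call > 0 && hardware_threads > 0);
    return static_cast<double>(parallel_calls) * threads_per_call / hardware_threads;
}
```

Under the assumption of 20 workers per call, `oversubscription(4, 20, 20)` gives 4.0, i.e. four runnable threads per hardware thread. With one worker per call, `oversubscription(4, 1, 20)` gives 0.2. `std::thread::hardware_concurrency()` reports the hardware thread count at runtime.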
besides… you need to be a lot more exact in stating what you measure. did you measure the time for each individual call, and those four times are around 8 seconds each? or did you measure the time to complete all four calls?
if one call takes 4 seconds, and 4 calls in parallel take 8 seconds total, you saved time already, because 4x 4 seconds would have been 16 seconds.
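As a sketch of that comparison (the function name is made up for illustration): the speedup is the time the calls would take one after another, divided by the wall-clock time of the parallel run.

```cpp
#include <cassert>

// Hypothetical helper: how much faster the parallel run finishes
// compared to executing the same calls sequentially.
double parallel_speedup(double single_call_ms, int calls, double parallel_wall_ms) {
    assert(single_call_ms > 0.0 && calls > 0 && parallel_wall_ms > 0.0);
    return single_call_ms * calls / parallel_wall_ms;
}
```

For the example above, `parallel_speedup(4000.0, 4, 8000.0)` is 2.0: the four calls finish in half the time they would need back to back, even though each individual call looks slower.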
Thank you for your feedback!
I understand that one call may generate several threads. I am trying to understand whether this is controllable or an as-is situation.
I always measure the time of one call. 4 calls in parallel → each takes 8s → my total time is about 8 seconds.
To your question: the application was restarted for every test.
1. NumThreads not set, 1 single operation: 2827ms
2. NumThreads = 1, 1 single operation: 4369ms
3. NumThreads = 1, 4 parallel operations of different images: 8037ms, 8051ms, 8111ms, 8129ms
4. NumThreads not set, 4 parallel operations of different images: 5080ms, 8173ms, 9660ms, 9660ms
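To put a number on how much the two four-call runs fluctuate, the spread between the fastest and slowest call can be computed. This is a minimal sketch; `spread_ms` is a name chosen here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: difference between the slowest and the fastest
// measurement, in milliseconds.
double spread_ms(const std::vector<double>& times_ms) {
    assert(!times_ms.empty());
    const auto mm = std::minmax_element(times_ms.begin(), times_ms.end());
    return *mm.second - *mm.first;
}
```

With the values listed above, the run with NumThreads = 1 has a spread of 92 ms (8037 to 8129), while the run without NumThreads has a spread of 4580 ms (5080 to 9660).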
My interpretation is:
NumThreads does have an effect, as the difference between 1. and 2. shows.
I can also see that 4. overloads the CPU because of 4x (max threads). That's why the results of 3. are more stable.
But I do not understand why 2. is much slower, since in theory I have enough cores in my system. I am not sure whether I am doing something wrong.
Is there a way to find out how many threads are generated by one call?
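One possible way to check this, sketched here for Linux only: read the `Threads:` field of `/proc/self/status` before and during the call. The function name is hypothetical, and on other systems this sketch simply returns -1. On Windows, the Threads column in the Details tab of Task Manager shows the same figure per process. Separately, `cv::getNumThreads()` reports how many threads OpenCV's parallel framework is configured to use, which is not necessarily the number of OS threads that exist.

```cpp
#include <fstream>
#include <limits>
#include <string>

// Hypothetical helper (Linux only): number of OS threads in the current
// process, taken from /proc/self/status. Returns -1 if the file or the
// "Threads:" field is not available.
int os_thread_count() {
    std::ifstream status("/proc/self/status");
    std::string key;
    while (status >> key) {
        if (key == "Threads:") {
            int count = -1;
            if (status >> count) return count;
            return -1;
        }
        // Skip the rest of this line and continue with the next key.
        status.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return -1;
}
```

Calling it once before the MergeMertens call and once from another thread while the call is running would show how many extra threads appear. Pool threads usually stay alive after the call returns, so a count taken afterwards may remain elevated.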