I am running the same C++ code on different machines: an Intel PC, iOS, and Android. On PC and iOS, execution nicely parallelizes over several CPU cores, but not on Android devices.
Specifically, I am running DNN inference on a detection model in ONNX format.
I have used the official releases (4.5.1) for all platforms. But I have also tried my own builds for Android, based on the build scripts from the repository, optionally enabling OpenMP or Tengine. None of these ran inference in parallel.
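For reference, my own builds were configured roughly like this (a sketch; `WITH_OPENMP` and `WITH_TENGINE` are the standard OpenCV CMake options, while the NDK path, ABI, and API level shown are illustrative placeholders):

```shell
# Illustrative Android cross-compile configuration from an empty build dir;
# $ANDROID_NDK and the source path are placeholders.
cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
      -DANDROID_ABI=arm64-v8a \
      -DANDROID_PLATFORM=android-24 \
      -DWITH_OPENMP=ON \
      -DWITH_TENGINE=ON \
      ../opencv
```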
I have found this closely related discussion regarding the Raspberry Pi: How to use multi core for DNN in ARM cpu · Issue #16692 · opencv/opencv · GitHub
(which made me try out OpenMP), but it hasn't helped yet :-/
So the initial question concerns expectations: is OpenCV expected to run inference on multiple cores on ARM-based Android devices at all? And which threading library is the most likely (or proven) to work on Android?
For my own builds, I am using clang on macOS.
getNumThreads() tells me OpenCV is aware of the several available cores – it just doesn't use them!