Hi,
I am running the same C++ code on different machines: an Intel PC, iOS, and Android. On the PC and on iOS, execution nicely parallelizes over several CPU cores, but not on Android devices.
Specifically, I am running DNN inference on a detection model in ONNX format.
I have used the official releases (4.5.1) for all platforms, but I have also tried my own builds for Android, based on the build scripts from the repository, optionally enabling OpenMP or Tengine. None of these ran inference in parallel.
I have found this closely related discussion regarding the Raspberry Pi: How to use multi core for DNN in ARM cpu · Issue #16692 · opencv/opencv · GitHub
(which made me try OpenMP), but it hasn't helped yet :-/
So the initial question is about expectations: is it expected at all that OpenCV runs inference on multiple cores on ARM-based Android devices? Which is the most likely (or proven) threading library for Android?
Added info:
For my own builds, I am using clang on OSX.
More info:
getNumThreads() tells me OpenCV is aware of the availability of several cores – it just doesn't use them!
Thank you for the advice, crackwitz!
Do you expect this to be a matter of getting my build chain to use OpenCL? Or would I need to dig into OpenCV's cv::ocl:: implementation? I noticed that there is no VENDOR_ARM among VENDOR_AMD, VENDOR_INTEL, … in the vendor enum: OpenCV: cv::ocl::Device Class Reference
don’t know if I’d say “expected” but it should certainly be possible. OpenCV can use a ton of parallelism mechanisms. to name one, OpenMP is built into many compilers these days. you can see enabled “parallel” frameworks during cmake configuration step. my build says