I have seen some posts asking about using OpenCV in an ARM bare-metal environment.
I would say yes: OpenCV can be built with an ARM cross compiler and used in an ARM bare-metal environment, which is what I did.
I started with 2.4.8, then moved to 2.4.13.7. It took me a while to get it running on the Zynq dual-core ARM board I developed myself, with both cores running an RTOS plus OpenCV.
After playing with that for a while, I wanted to move to 4.5+, which has dnn, so I tried again. With the experience from porting 2.4.x, it took much less time to get it running this time. I have now ported 4.11.0 to my Zynq ARM board,
but I haven't tried advanced usage with dnn yet, which I plan to do. Basic functions work: reading BMP/JPG/PNG images, and cv::Mat operations such as putText, rotate and color fill all work fine.
I also used StereoBM with 2.4.13.7 on my ARM board. It works fine.
Regarding speed: I take two 480x272 images from my cameras, resize them to half size, rotate them by an angle, put them into the same image and show it on the LCD. This takes 60 ms, even though plenty of interrupts break into the processing.
Stereo processing (StereoBM) on 480x272 images takes 130 ms.
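For anyone who wants to take similar measurements, here is a minimal timing sketch. It is not the code I ran on the board: it assumes std::chrono::steady_clock is available, which is often not the case on bare metal, where you would read a hardware timer instead. The names average_ms and fps_from_ms are made up for illustration.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `runs` times and returns the average wall-clock time per run in ms.
// Assumption: steady_clock works on the target; on bare metal, swap in a hardware timer.
template <typename F>
double average_ms(F&& work, int runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / runs;
}

// Converts a per-frame time in milliseconds to frames per second.
double fps_from_ms(double ms_per_frame) {
    assert(ms_per_frame > 0.0);
    return 1000.0 / ms_per_frame;
}
```

Wrapping the resize/rotate/display step or the StereoBM call in a lambda and passing it to average_ms gives a per-frame figure. With the numbers above, 60 ms per frame is about 16.7 frames per second and 130 ms is about 7.7.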
Later, if I am able to run the DNN module on my ARM board, I will see how slow it is.
Well, if you want to know whether OpenCV can be used in an ARM bare-metal/RTOS environment, the answer is YES.
I tried a pretrained yolov4_tiny 320*320 model. Running it on my PC on the CPU with OpenCV 4.11.0 takes 380-690 ms.
On my Zynq, using a single core at 750 MHz with DDR at 533 MHz, L1 and L2 caches on, reading and writing via SD card, with no FPGA acceleration and no NEON instructions, it took 120 s when compiled with O0 and 60 s with O2. Compiled with O3 and NEON vectorization, it took 30 s.
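To put those timings side by side, here is a tiny sketch of the ratio arithmetic. The helper name speedup is made up for illustration, and the only inputs are the figures quoted above.

```cpp
#include <cassert>

// Ratio of two run times in seconds: how many times faster the second run is.
double speedup(double baseline_s, double optimized_s) {
    assert(baseline_s > 0.0 && optimized_s > 0.0);
    return baseline_s / optimized_s;
}
```

With the numbers above, speedup(120.0, 60.0) is 2 for O2 over O0, and speedup(120.0, 30.0) is 4 for O3 with NEON over O0. Against the PC range of 0.38-0.69 s, the best board time of 30 s is roughly 43 to 79 times slower.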
I plan to try a TensorFlow Lite model to see how slow it is.
I tried model files in .tflite format. I looked through many existing pretrained files, but OpenCV 4.11.0 seems to hit errors with them even in PC code, such as type errors and shape errors. So I gave up on that and looked for another lightweight model with a faster processing time.
I found that yolo_fastestv2 352*352 seems to work fine. It needs the protobuf module ported to the ARM side too, and it works fine once that is done.
First I tried compilation flag O0, with no optimization at all. It runs one frame within 3 seconds. I believed O3 would be faster.
Then I compiled with O3, using -neon or -vfpv3. There seems to be no big difference between the two. Running 20 frames took less than 20 s; each 352*352 frame takes around 1.8 s for yolo_fastestv2, model size 300 KB.
With the OpenCV macro switches use_intrinsic and compile_neon enabled, it looks like cv_try_neon gets turned on and some functions are not implemented, so compilation fails.
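As a side note for hand-written code outside OpenCV, here is a minimal sketch of guarding NEON intrinsics so that the same source still builds when NEON is off. It is not OpenCV's own dispatch mechanism and does not fix the compilation failure above. It relies on the compiler's predefined __ARM_NEON macro; HAVE_NEON, add_f32 and built_with_neon are made-up names for illustration.

```cpp
#include <cassert>
#include <cstddef>

// __ARM_NEON is predefined by GCC/Clang when NEON code generation is enabled.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

// Element-wise sum: out[i] = a[i] + b[i] for i in [0, n).
void add_f32(const float* a, const float* b, float* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    std::size_t i = 0;
#if HAVE_NEON
    // Process four floats per iteration with NEON.
    for (; i + 4 <= n; i += 4) {
        const float32x4_t va = vld1q_f32(a + i);
        const float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif
    // Scalar path: handles the tail, or everything when NEON is off.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Reports which path this build was compiled with.
bool built_with_neon() { return HAVE_NEON != 0; }
```

Checking built_with_neon() at startup is a cheap way to confirm which path a given set of compiler flags actually produced before comparing timings.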
The next step I plan is to optimize some parts using the FPGA.
Anyway, OpenCV 4.11.0 is fine to use in an ARM bare-metal environment.