OpenCV DNN module is slower on GPU than on CPU (Jetson Nano)

trojek · November 23, 2021, 2:41pm

Hi.

I have been experimenting with the OpenCV dnn module on both the CPU and the GPU of a Jetson Nano. I made a very similar post on the NVIDIA forum ("Poor performance of CUDA GPU, using OpenCV DNN module - Jetson Nano", NVIDIA Developer Forums), but I think the topic is more related to OpenCV than to CUDA.

I ran some tests using different super-resolution models. The results are as follows:

EDSR x2: CPU: timeout, GPU: timeout
ESPCN x4: CPU: 0.17469215393066406 s, GPU: 10.169917821884155 s
FSRCNN x4: CPU: 0.12776947021484375 s, GPU: 5.2502007484436035 s
LapSRN x4: CPU: 8.098081111907959 s, GPU: 6.410776138305664 s

One can see that for ESPCN and FSRCNN the execution time on the CPU is much shorter than on the GPU, which is counterintuitive (only LapSRN runs slightly faster on the GPU). I checked the resource load during the execution of both versions: with the CPU version the CPU load was 100%, but with the GPU version the load was only about 20%. It looks like the dnn module doesn't use the full power of the GPU.

Why is the performance on the GPU so much worse than on the CPU?
Is there a way to decrease the execution time of the GPU version?

mshabunin · November 23, 2021, 9:52pm

Initial GPU setup and moving data between GPU and system memory take time. In many cases GPU acceleration can be observed when processing large batches of data (e.g. frames of a video). Modify your test to process the same image 100 times and compare performance.
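
A minimal sketch of such a test, in case it helps: it times the first call separately from the following calls, so the one-time setup cost does not get mixed into the per-image time. The names `benchmark` and `fake_upscale` are hypothetical; `fake_upscale` is only a placeholder workload, and in a real test it would be replaced by the actual upscaling call (for example something like `sr.upsample(img)` if the `dnn_superres` API is used, which is an assumption here).

```python
import time
from statistics import mean


def benchmark(fn, warmup=1, runs=100):
    """Call fn() `warmup` times, then `runs` times, timing every call.

    The first warm-up call is reported separately because it typically
    includes one-time costs (backend initialisation, memory allocation).
    """
    warmup_times = []
    for _ in range(warmup):
        start = time.perf_counter()
        fn()
        warmup_times.append(time.perf_counter() - start)

    run_times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        run_times.append(time.perf_counter() - start)

    return {
        "first_call_s": warmup_times[0] if warmup_times else None,
        "mean_s": mean(run_times),
        "min_s": min(run_times),
    }


def fake_upscale():
    """Placeholder workload (assumption); swap in the real upscaling call."""
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    stats = benchmark(fake_upscale, warmup=1, runs=100)
    print(stats)
```

Running the same function once for the CPU configuration and once for the GPU configuration, and comparing `mean_s` rather than `first_call_s`, should give a fairer picture than a single timed call.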

trojek · November 24, 2021, 10:22am

You are right. I executed the “upscaling” part of the code in a loop (100 times), and I got much better results on the GPU than on the CPU (about 3x faster).

My goal is to upscale a few images (e.g. 6) with high resolution (e.g. 4000 px x 5000 px) in the shortest possible time (e.g. 1 s per image). How can one achieve such results?
Are there any known methods? For example, splitting one image into several sub-images, upscaling them, and then merging them back into one?
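
To make the split / upscale / merge idea concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the function names are hypothetical, the image is a list of pixel rows rather than an OpenCV array, and `nearest_upscale` is only a stand-in for the real model. Whether tiling actually brings the time down to the target on a Jetson Nano is not something this sketch shows.

```python
def tile_starts(length, tile, overlap=0):
    """Start offsets of tiles of size `tile` that together cover [0, length)."""
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    starts, pos = [], 0
    while True:
        starts.append(pos)
        if pos + tile >= length:  # this tile already reaches the edge
            return starts
        pos += tile - overlap


def tile_boxes(width, height, tile, overlap=0):
    """(x0, y0, x1, y1) boxes, clipped to the image, covering every pixel."""
    return [
        (x0, y0, min(x0 + tile, width), min(y0 + tile, height))
        for y0 in tile_starts(height, tile, overlap)
        for x0 in tile_starts(width, tile, overlap)
    ]


def nearest_upscale(rows, scale):
    """Stand-in for the real model (assumption): nearest-neighbour upscaling."""
    return [
        [px for px in row for _ in range(scale)]
        for row in rows
        for _ in range(scale)
    ]


def upscale_tiled(image, scale, tile, overlap=0, upscale=nearest_upscale):
    """Split `image` into tiles, upscale each tile, paste results into one output."""
    height, width = len(image), len(image[0])
    out = [[None] * (width * scale) for _ in range(height * scale)]
    for x0, y0, x1, y1 in tile_boxes(width, height, tile, overlap):
        patch = [row[x0:x1] for row in image[y0:y1]]
        big = upscale(patch, scale)
        # Paste the upscaled tile at its scaled position; where tiles
        # overlap, later tiles simply overwrite earlier ones.
        for dy, big_row in enumerate(big):
            out[y0 * scale + dy][x0 * scale:x1 * scale] = big_row
    return out


if __name__ == "__main__":
    # A 4000 x 5000 image cut into 1000 x 1000 tiles gives 4 * 5 tiles.
    print(len(tile_boxes(4000, 5000, 1000)))
```

With a real convolutional model, the pixels near a tile border are computed from less context than in the full image, so seams can appear. A non-zero `overlap` is the usual starting point for dealing with that, but this sketch only overwrites the overlapping area and does no blending or margin cropping.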
