OpenCV Optical Flow CUDA Naive Implementation Slower than CPU

haider_abbasi · April 3, 2024, 5:34pm

Thank you, @cudawarped. Your resources on CUDA optimization in OpenCV are invaluable.

Regarding your answer, there are two points I still don’t understand:

Sparse Optical Flow CPU vs. GPU Performance:
Your comparison of sparse optical flow on CPU vs. GPU showed a 56% speedup on the GPU with the same naive implementation. Even though my rig is different, the result shouldn’t reverse into a 40% slowdown, should it?
Comparison with Background Subtraction Test:
| That said I have no idea if the code will be faster on your RTX 3070 than your Ryzen 7 2700.
As I mentioned, I ran the naive CPU vs. GPU comparison for background subtraction from your optimization repository, and it gave me an 11x speedup using the following code:

import cv2 as cv
import numpy as np
from functools import partial

# frame_device, cols_big, rows_big, ProcVid0, lr, check_res and cpu_time_0
# are defined elsewhere in the original code and are not shown here.
bgmog2_device = cv.cuda.createBackgroundSubtractorMOG2()

def ProcFrameCuda0(frame, lr, store_res=False):
    frame_device.upload(frame)  # host -> device copy
    frame_device_big = cv.cuda.resize(frame_device, (cols_big, rows_big))
    fg_device_big = bgmog2_device.apply(frame_device_big, lr, cv.cuda.Stream_Null())
    fg_device = cv.cuda.resize(fg_device_big, frame_device.size())
    fg_host = fg_device.download()  # device -> host copy
    if store_res:
        gpu_res.append(np.copy(fg_host))

gpu_res = []
gpu_time_0, n_frames = ProcVid0(partial(ProcFrameCuda0, store_res=check_res), lr)
print(f'GPU 0 (naive): {n_frames} frames, {gpu_time_0:.2f} ms/frame')
print(f'Speedup over CPU: {cpu_time_0/gpu_time_0:.2f}')
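
For reference, since ProcVid0 is not shown above, here is a minimal sketch of the kind of timing loop I assume it performs. The names ms_per_frame and speedup are hypothetical and not taken from the repository; the warm-up frames are my own addition so that one-off costs (such as CUDA context creation) are not averaged into the result.

```python
import time

def ms_per_frame(process_frame, frames, warmup=1):
    """Return (mean ms/frame, number of timed frames).

    Hypothetical helper: the first `warmup` frames are processed but not
    timed, so one-off start-up costs do not skew the average.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        process_frame(frame)
    timed = frames[warmup:]
    start = time.perf_counter()
    for frame in timed:
        process_frame(frame)
    elapsed_s = time.perf_counter() - start
    return 1000.0 * elapsed_s / max(len(timed), 1), len(timed)

def speedup(cpu_ms, gpu_ms):
    """Speedup of GPU over CPU; a value below 1.0 means the GPU is slower."""
    return cpu_ms / gpu_ms

if __name__ == "__main__":
    # Stand-in workload so the sketch runs without OpenCV installed.
    dummy_frames = range(20)
    ms, n = ms_per_frame(lambda f: sum(range(1000)), dummy_frames, warmup=2)
    print(f"{n} frames, {ms:.4f} ms/frame")
```

Wall-clock timing like this should only be meaningful for the GPU path if each frame ends with a blocking call, which I believe download() on the default stream is.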

Could the frame size be the cause? In the optical flow case the test video was 640x360, whereas in the background subtraction test the frame is 1440x2560 after resizing.
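
To make the frame-size question concrete, here is a toy cost model of what I have in mind. All constants are made up for illustration and are not measurements from my machine or from the repository. It also assumes cost scales with pixel count, which is a simplification: sparse optical flow cost depends on the number of tracked points as well.

```python
def cpu_ms(pixels, cpu_ns_per_pixel):
    """CPU time per frame: pure per-pixel cost."""
    return pixels * cpu_ns_per_pixel / 1e6

def gpu_ms(pixels, gpu_ns_per_pixel, transfer_ns_per_pixel, fixed_overhead_ms):
    """GPU time per frame: fixed overhead (kernel launches, synchronisation)
    plus per-pixel compute and host<->device transfer cost."""
    per_pixel_ns = gpu_ns_per_pixel + transfer_ns_per_pixel
    return fixed_overhead_ms + pixels * per_pixel_ns / 1e6

# Hypothetical constants, chosen only to illustrate the shape of the effect.
CPU_NS_PER_PIXEL = 10.0
GPU_NS_PER_PIXEL = 0.5
TRANSFER_NS_PER_PIXEL = 0.5
FIXED_OVERHEAD_MS = 3.0

def model_speedup(width, height):
    """Modelled CPU/GPU time ratio for one frame of the given size."""
    pixels = width * height
    c = cpu_ms(pixels, CPU_NS_PER_PIXEL)
    g = gpu_ms(pixels, GPU_NS_PER_PIXEL, TRANSFER_NS_PER_PIXEL, FIXED_OVERHEAD_MS)
    return c / g

if __name__ == "__main__":
    for width, height in [(640, 360), (2560, 1440)]:
        print(f"{width}x{height}: modelled speedup {model_speedup(width, height):.2f}")
```

With these made-up numbers the small frame comes out slower on the GPU (ratio below 1) and the large frame faster (ratio above 1), which is the pattern I am seeing. Whether the real overheads are large enough to explain it is exactly what I am unsure about.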

Also, you were using OpenCV 4.1 for the sparse optical flow test, and I have 4.8. Could something have changed between versions that causes this?

My plan was to get a naive implementation performing as expected first and then investigate optimizations, if that makes sense.
