Why do some Mats not need to be uploaded when using the GPU?

Using your example:

cv::cuda::warpPerspective(InputArray src, OutputArray dst, InputArray M, Size dsize, int flags = INTER_LINEAR,
    int borderMode = BORDER_CONSTANT, Scalar borderValue = Scalar(), Stream& stream = Stream::Null())

In a nutshell, src and dst are the data you are going to work on, whereas M is just a collection of function arguments/parameters, in the same way that dsize, flags, borderMode and borderValue are.
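To make the distinction concrete, here is a minimal sketch (assuming OpenCV built with the CUDA modules; the image size and the identity transform are placeholders, not from your example): src and dst live in device memory as GpuMats, while M stays an ordinary host-side cv::Mat and is never uploaded.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/cudawarping.hpp>

int main() {
    // src and dst are the data: they must live in device memory
    cv::Mat hostSrc = cv::Mat::zeros(480, 640, CV_8UC1);
    cv::cuda::GpuMat src, dst;
    src.upload(hostSrc);  // explicit host -> device copy for the image data

    // M is just a parameter: a 3x3 host-side cv::Mat (identity here as a
    // placeholder), passed directly without any upload() call
    cv::Mat M = cv::Mat::eye(3, 3, CV_64F);

    cv::cuda::warpPerspective(src, dst, M, src.size());
    return 0;
}
```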

Now, I can't find anything in the CUDA programming guide that officially justifies passing arguments in host rather than device memory, so take the next paragraph with a pinch of salt.

It seems like the consensus is that there is a fixed latency, say N ms, in launching a kernel, because a launch requires communication between the host and the device. Since the size of this communication is small, it does not saturate the available bandwidth between host and device, meaning there is room for a small amount of extra data to be sent at the same time without increasing N. Therefore a few extra parameters (function arguments riding along with the kernel-launch overhead) can be sent without any penalty, and there would be no advantage to first copying your function arguments from the host to the device before launching the kernel. Conversely, if you could pass the image data itself as a host argument, this would saturate the available bandwidth between host and device and significantly increase N, which is what you can experience when using managed memory.
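One concrete mechanism behind this: CUDA kernel arguments are passed by value with the launch itself (the driver copies them into device constant memory, with a size limit of a few KB), so a 3x3 double matrix of 72 bytes rides along essentially for free, while the image buffers are far too large for that path. A rough sketch in plain CUDA, not OpenCV's actual kernel (the names warpKernel and Mat3x3 are illustrative):

```cpp
// 72 bytes: easily fits within the kernel parameter size limit
struct Mat3x3 { double m[9]; };

__global__ void warpKernel(const unsigned char* src, unsigned char* dst,
                           int w, int h, Mat3x3 M) {
    // M arrives by value with the launch; the driver places kernel
    // parameters in device constant memory, so no explicit upload is needed
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    // (real warping logic omitted; identity copy as a stand-in)
    dst[y * w + x] = src[y * w + x];
}

// src and dst are large buffers: they must be allocated on the device and
// copied there explicitly (cudaMalloc/cudaMemcpy) before the launch.
// M is tiny, so it simply rides along with the launch:
//   warpKernel<<<grid, block>>>(d_src, d_dst, w, h, hostM);
```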
