Use a pre-allocated CUDA memory pointer to initialize an OpenCV GpuMat in Python

Is there a way to use a memory pointer that has already been allocated to create a GpuMat in OpenCV Python?

There is a C++ constructor for GpuMat() which takes void* data as input.

Looking at the functions exported to Python:

    //! copy constructor
    CV_WRAP GpuMat(const GpuMat& m);

    //! constructor for GpuMat headers pointing to user-allocated data
    GpuMat(int rows, int cols, int type, void* data, size_t step = Mat::AUTO_STEP);
    GpuMat(Size size, int type, void* data, size_t step = Mat::AUTO_STEP);

    //! creates a GpuMat header for a part of the bigger matrix
    CV_WRAP GpuMat(const GpuMat& m, Range rowRange, Range colRange);
    CV_WRAP GpuMat(const GpuMat& m, Rect roi);

The constructors that take void* data do not have CV_WRAP, so they are not exported to Python?

Is there some other way to use pre-allocated CUDA pointers in OpenCV Python?

Hi, I’m sure this has been asked a lot before on the previous forum, but I cannot find a link to give you. As far as I am aware this is not possible; I think someone was looking at writing the code to enable it, but I am not sure whether they got anywhere with it.

As above, they are not wrapped because this won’t work.


wrap your data into a cv::Mat first.

GpuMat from Mat should be available.

edit: or rather, build a numpy array, and then OpenCV should have ways to turn that into a GpuMat.
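
For reference, that host round-trip path might look something like this (a minimal sketch, assuming a CUDA-enabled OpenCV build; the numpy array here is just a stand-in for whatever you copy down from the device):

    import cv2
    import numpy as np

    # host round trip: device data -> host numpy array -> GpuMat
    # (this is the one device-to-host plus one host-to-device copy mentioned below)
    host = np.zeros((480, 640), dtype=np.float32)  # stand-in for e.g. tensor.cpu().numpy()

    gm = cv2.cuda_GpuMat()
    gm.upload(host)  # allocates device memory and copies the host array into it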

Yes, this way would definitely work, but my input is already a device pointer from CUDA, so it would involve one device-to-host and one host-to-device transfer, which seems wasteful.

I tried looking at the old forums and only found a couple of similar unanswered questions. If it is not too much trouble, could you please provide the intuition behind why this approach wouldn’t work in Python?
I am not able to grasp why this would work in C++ but passing a similar pointer in Python would fail.

Without looking into it, I would assume you can’t get the pointer. If you can, I would think that it wouldn’t be in the same address space (CUDA context) in the OpenCV library as it was in the library you created it in. I may be wrong; a quick way to find out would be to give it a go: add CV_WRAP, re-build, and observe the addresses etc. in the debugger (if you’re using Windows, VS will allow you to debug Python C++ modules).

I tried that; it was taking the int value of the pointer as a Scalar, so I removed the CV_WRAP from the constructors which take a scalar value as input, but I still get the same result as when passing a scalar value.

I missed the fact that OpenCV might create a new CUDA context; I assumed it would use the default context.

If you just wrapped the signature with the data pointer, then I would expect you to get a compilation error, as the Python bindings can’t deal with a void*. Did you do something like the below?

    CV_WRAP inline GpuMat(int rows, int cols, int type, uint64 dataAddr, size_t step = Mat::AUTO_STEP) :
        GpuMat(rows, cols, type, (void*)dataAddr, step){}

Assuming 64 bit, did the memory address which you passed in as an 8-byte integer access the same location inside OpenCV? Which library are you creating the memory with, and how do you get the memory address?
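
For illustration, the call against a module rebuilt with that extra constructor might look something like this (purely hypothetical: in a stock build an integer in that position is interpreted as a Scalar fill value, and PyTorch is used here only as one way of obtaining a device address):

    import cv2
    import torch

    # hypothetical: requires an OpenCV build where the uint64 constructor above is CV_WRAP'd
    # and the competing Scalar overload is not, otherwise the int is treated as a fill value
    x = torch.zeros(480, 640, dtype=torch.float32, device="cuda")

    addr = x.data_ptr()  # raw device address as a plain Python int
    gm = cv2.cuda_GpuMat(480, 640, cv2.CV_32F, addr)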

Hi guys, any new findings in this discussion?
I’m trying to convert a torch.Tensor on CUDA to a cv2.cuda_GpuMat.
I managed to do it in the other direction by defining a __cuda_array_interface__ property, since torch knows how to handle the cudaPtr() of a GpuMat; so I wonder, why shouldn’t it work the other way?
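
In case it is useful to someone, that GpuMat-to-tensor direction looks roughly like this (a sketch for a single-channel GpuMat; the wrapper class is my own, and it assumes cudaPtr(), size(), depth(), step1() and elemSize1() are wrapped in your OpenCV build):

    import cv2
    import torch

    class GpuMatHolder:
        """Expose a single-channel GpuMat through __cuda_array_interface__ so torch can read it."""
        def __init__(self, gm):
            w, h = gm.size()  # GpuMat.size() returns (width, height)
            typestr = {cv2.CV_8U: "|u1", cv2.CV_32F: "<f4"}[gm.depth()]  # extend as needed
            self.__cuda_array_interface__ = {
                "version": 3,
                "shape": (h, w),
                "typestr": typestr,
                "data": (gm.cudaPtr(), False),
                # row pitch in bytes (GpuMat rows can be padded), then element size in bytes
                "strides": (gm.step1() * gm.elemSize1(), gm.elemSize1()),
            }

    gm = cv2.cuda_GpuMat(480, 640, cv2.CV_32F)
    t = torch.as_tensor(GpuMatHolder(gm), device="cuda")  # shares the device memory, no copy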

I found the solution!
From version 4.8.0 there is a function cv2.cuda.createGpuMatFromCudaMemory(), which accepts a pointer and builds a GpuMat.

For example, if x is my torch.Tensor, we can do:

    h, w = x.shape
    cv2.cuda.createGpuMatFromCudaMemory(h, w, cv2.CV_32F, x.data_ptr())

Hope it helps someone. I barely managed to find it, just by checking what other functions are in the cv2.cuda module.
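
A quick way to convince yourself that no copy is involved (a sketch, assuming OpenCV >= 4.8.0 built with CUDA and a CUDA build of PyTorch):

    import cv2
    import torch

    x = torch.rand(480, 640, dtype=torch.float32, device="cuda")

    h, w = x.shape
    # the default step (Mat::AUTO_STEP) assumes tightly packed rows, which matches a
    # contiguous torch tensor; pass the row pitch in bytes as a fifth argument otherwise
    gm = cv2.cuda.createGpuMatFromCudaMemory(h, w, cv2.CV_32F, x.data_ptr())

    x += 1  # modify the tensor in place on the GPU
    assert (gm.download() == x.cpu().numpy()).all()  # the GpuMat sees the change: same memory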
