cv::Matx in cuda kernel

Hello,

Does OpenCV support cv::Matx in CUDA kernels?

I mean, declaring a cv::Matx inside a CUDA kernel should be possible because there is no dynamic memory management involved. It would just require adding __device__ __host__ to the declarations of the cv::Matx member functions, starting with the constructor cv::Matx::Matx. Does any version of OpenCV support this?
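As a rough illustration, here is a host-compilable sketch of what that annotation could look like. The MY_HOST_DEVICE macro and the Matx22f stand-in are hypothetical, not actual OpenCV code:

```cpp
#include <cassert>

// Hypothetical annotation macro: expands to __host__ __device__ only when
// compiled by nvcc, so the same header still builds as plain C++.
#ifdef __CUDACC__
#define MY_HOST_DEVICE __host__ __device__
#else
#define MY_HOST_DEVICE
#endif

// Miniature stand-in for cv::Matx<float, 2, 2>, showing where the
// annotation would go on the constructor and element access.
struct Matx22f {
    float val[4];  // row-major storage, like cv::Matx
    MY_HOST_DEVICE Matx22f(float m00, float m01, float m10, float m11)
        : val{m00, m01, m10, m11} {}
    MY_HOST_DEVICE float operator()(int r, int c) const { return val[r * 2 + c]; }
};
```

Compiled with a plain C++ compiler the macro vanishes; compiled with nvcc the same type would become usable as an automatic variable inside a kernel.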

It would make multiplication of small matrices much easier in kernels. Eigen supports this, for example.

Hi, would you use this in OpenCV’s kernels? If so, can you give an example of how you would instantiate cv::Matx objects and pass them to a kernel?

I need to multiply each pixel of a 2-channel image by a 2x2 matrix, and the matrix is different for every pixel. Currently, the “matrix of matrices” is stored as a GpuMat uploaded from a 4-channel float matrix. But OpenCV doesn’t support this multiplication in kernels, so I have to cast each element to an Eigen::Matrix<float, 2, 2, Eigen::RowMajor> to do the multiplication in the kernel. (I have to specify row-major order because Eigen uses column-major order by default.)
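That storage-order pitfall can be shown on the host without any library: the same four floats read with a row-major index vs a column-major index give transposed matrices (a minimal sketch, not OpenCV or Eigen API):

```cpp
#include <cassert>

// Four floats as stored per pixel in a CV_32FC4 image: {m00, m01, m10, m11},
// i.e. the 2x2 matrix written out row by row.
static const float buf[4] = {1.f, 2.f, 3.f, 4.f};

// Row-major read (OpenCV's layout): element (r, c) = buf[r * 2 + c].
float row_major(int r, int c) { return buf[r * 2 + c]; }

// Column-major read (Eigen's default): element (r, c) = buf[c * 2 + r].
float col_major(int r, int c) { return buf[c * 2 + r]; }
```

With the column-major read the off-diagonal elements swap, so you silently work with the transpose; that is why Eigen::RowMajor has to be requested explicitly when mapping OpenCV data.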

If OpenCV allowed it, I wouldn’t need a lot of reinterpret_casts and I could drop Eigen in favour of OpenCV. It’s easier to use one library for all the math instead of casting back and forth between two.

Example (semi-pseudocode):



__global__ void mykernel(cv::cuda::PtrStepSz<cv::Matx<float, 2, 2>> matrices, cv::cuda::PtrStepSz<cv::Vec2f> image)
{
	/* ... some code ... */
	
	/* Not possible currently */
	image(y, x) = matrices(y, x) * image(y, x);
}

int main()
{
	cv::Mat matrices(100, 100, CV_32FC4);
	cv::cuda::GpuMat d_matrices; d_matrices.upload(matrices);
	
	mykernel<<<grid, block>>>(d_matrices, d_some_image_2_channels);
}

Passing the matrices is not the problem here; the problem is that automatic variables (so just normal local variables) of type cv::Matx declared inside a kernel don’t compile.

OK so you just want to use cv::Matx inside a kernel.

What happens when you compile with __device__ __host__?

How big are the cv::Matx? Are you storing them in shared memory?

They are not stored in shared memory; they are either in global memory or just local variables.

Do you mean changing the OpenCV functions? Well, I have not tried, because I hope I don’t need to change OpenCV and maintain my own fork. Or do I?

I didn’t see your post above before. Yes, I meant modifying the source, although now that I understand what you are trying to do, that may be a bit extreme for this use case.

In answer to your original question:

No, not as far as I can tell.

Whilst I can see that passing cv::Matx would lead to cleaner code, I am not sure it makes that much difference when you are applying a 2x2 matrix. Could you not simply perform the operation manually and avoid using Eigen as well? i.e.

float2 pt = image(row, col);
float4 M = matrices(row, col);
float2 res; /* temporary, so pt.x is not overwritten before the second line reads it */
res.x = pt.x * M.x + pt.y * M.z;
res.y = pt.x * M.y + pt.y * M.w;
image(row, col) = res;
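For what it’s worth, that arithmetic can be checked on the host with stand-in structs for CUDA’s float2/float4 (a hypothetical helper, not kernel code); the one thing to watch is writing into a temporary, since the second line still needs the original pt.x:

```cpp
#include <cassert>

// Host stand-ins for CUDA's built-in float2/float4 vector types.
struct f2 { float x, y; };
struct f4 { float x, y, z, w; };

// Applies the per-pixel 2x2 matrix packed as {M.x = m00, M.y = m01,
// M.z = m10, M.w = m11} to pt, with the same index pattern as above.
f2 apply(f2 pt, f4 M) {
    f2 res;  // temporary: res.y below still reads the original pt.x
    res.x = pt.x * M.x + pt.y * M.z;
    res.y = pt.x * M.y + pt.y * M.w;
    return res;
}
```

For pt = (1, 2) and M = (1, 2, 3, 4) this gives (7, 10), which matches working the product out by hand.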

Let me know how you get on. There may be a few issues, as you will have to include the CUDA headers in core\matx.hpp and only enable the annotations when HAVE_CUDA is defined. If it’s successful you could submit a PR.