Per-element CUDA operations such as cv::cuda::divide or cv::cuda::multiply in C++ can be applied to matrix-matrix inputs as well as to matrix-scalar inputs.
In fact, the following C++ code compiles flawlessly:
cv::Mat test;
cv::cuda::GpuMat cu_test(4, 4, CV_8UC1, 16);
cu_test.download(test);
std::cout << test << std::endl;
cv::cuda::divide(cu_test, 2, cu_test);
cu_test.download(test);
std::cout << test << std::endl;
and, as expected, outputs:
[ 16, 16, 16, 16;
16, 16, 16, 16;
16, 16, 16, 16;
16, 16, 16, 16]
[ 8, 8, 8, 8;
8, 8, 8, 8;
8, 8, 8, 8;
8, 8, 8, 8]
On the other hand, the same logic doesn’t work in Python.
For example, the Python equivalent of the above C++ code:
import cv2
cu_test = cv2.cuda_GpuMat(4, 4, cv2.CV_8UC1, 16)
test = cu_test.download()
print(test)
cu_test = cv2.cuda.divide(cu_test, 2)
test = cu_test.download()
print(test)
fails at cv2.cuda.divide:
[[16 16 16 16]
 [16 16 16 16]
 [16 16 16 16]
 [16 16 16 16]]
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    cu_test = cv2.cuda.divide(cu_test, 2)
cv2.error: OpenCV(4.10.0) :-1: error: (-5:Bad argument) in function 'divide'
> Overload resolution failed:
> - src1 is not a numpy array, neither a scalar
> - Expected Ptr<cv::cuda::GpuMat> for argument 'src2'
> - Expected Ptr<cv::UMat> for argument 'src1'
According to help(cv2.cuda.divide), a matrix-scalar division is supposed to work:
divide(...)
divide(src1, src2[, dst[, scale[, dtype[, stream]]]]) -> dst
. @brief Computes a matrix-matrix or matrix-scalar division.
.
. @param src1 First source matrix or a scalar.
. @param src2 Second source matrix or scalar.
. @param dst Destination matrix that has the same size and number of channels as the input array(s).
. The depth is defined by dtype or src1 depth.
. @param scale Optional scale factor.
. @param dtype Optional depth of the output array.
. @param stream Stream for the asynchronous version.
.
. This function, in contrast to divide, uses a round-down rounding mode.
.
. @sa divide
Is there anything I’m missing about how to pass the scalar as a proper scalar to cv2.cuda.divide?
Please note that I could get around this issue using a constant matrix that acts as the scalar:
cu_test = cv2.cuda_GpuMat(4, 4, cv2.CV_8UC1, 16)
cu_fake_scalar = cv2.cuda_GpuMat(4, 4, cv2.CV_8UC1, 2)
cu_test = cv2.cuda.divide(cu_test, cu_fake_scalar)
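However, this seems inefficient when working with high-resolution images, since it allocates a second matrix of the same size as the image just to hold one repeated value.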
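To put a rough number on that overhead, here is a minimal sketch in plain Python. The helper name constant_mat_bytes and the 3840x2160 frame size are only illustrative assumptions, not something taken from my tests. It counts one byte per element for CV_8UC1 and ignores any row padding the GPU allocator may add, so the real figure can be somewhat higher:

```python
def constant_mat_bytes(rows, cols, channels=1, bytes_per_channel=1):
    """Lower bound on the memory of a rows x cols constant matrix.

    Hypothetical helper for illustration only: it ignores any row
    padding (pitch) that a GPU allocation may add.
    """
    return rows * cols * channels * bytes_per_channel


# CV_8UC1 means one channel with one byte per element.
small = constant_mat_bytes(4, 4)        # the 4x4 test matrix above
uhd = constant_mat_bytes(2160, 3840)    # assumed 3840x2160 frame

print(small)  # 16
print(uhd)    # 8294400
```

So the "fake scalar" costs at least about 8.3 MB of device memory for a single-channel 8-bit frame of that size, and it has to be recreated (or cached) for every distinct image size and type.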
I tested with Python 3.6.9 and OpenCV versions 4.5.4 and 4.10.0.
Thanks in advance!
Massimo