it’s my test code
import cv2
matcher = cv2.cuda.createTemplateMatching(cv2.CV_8U, cv2.TM_CCOEFF_NORMED)
im_source = cv2.cuda.GpuMat(cv2.imread('tests/image/6.png'))
count = 0
while True:
matcher.match(im_source, im_source)
Every call templateMatch will use cuModuleLoadData and block for a long time.
it’s my cmake. use 3080ti in cuda11.6.
General configuration for OpenCV 4.5.5-dev =====================================
Version control: 4.5.5-358-gc8228e5789-dirty
Extra modules:
Location (extra): D:/c++/opencv_contrib/modules
Version control (extra): 4.5.5-72-gc271bd61
Platform:
Timestamp: 2022-05-12T12:33:10Z
Host: Windows 10.0.19044 AMD64
CMake: 3.23.1
CMake generator: Visual Studio 17 2022
CMake build tool: C:/Program Files/Microsoft Visual Studio/2022/Community/MSBuild/Current/Bin/amd64/MSBuild.exe
MSVC: 1932
Configuration: Debug Release
CPU/HW features:
Baseline: SSE SSE2 SSE3
requested: SSE3
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
SSE4_1 (16 files): + SSSE3 SSE4_1
SSE4_2 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (0 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (4 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
AVX2 (31 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
AVX512_SKX (5 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
C/C++:
Built as dynamic libs?: NO
C++ standard: 11
C++ Compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.32.31326/bin/Hostx64/x64/cl.exe (ver 19.32.31328.0)
C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP -openmp /MT /O2 /Ob2 /DNDEBUG
C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP -openmp /MTd /Zi /Ob0 /Od /RTC1
C Compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.32.31326/bin/Hostx64/x64/cl.exe
C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP -openmp /MT /O2 /Ob2 /DNDEBUG
C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP -openmp /MTd /Zi /Ob0 /Od /RTC1
Linker flags (Release): /machine:x64 /NODEFAULTLIB:atlthunk.lib /INCREMENTAL:NO /NODEFAULTLIB:libcmtd.lib /NODEFAULTLIB:libcpmtd.lib /NODEFAULTLIB:msvcrtd.lib
Linker flags (Debug): /machine:x64 /NODEFAULTLIB:atlthunk.lib /debug /INCREMENTAL /NODEFAULTLIB:libcmt.lib /NODEFAULTLIB:libcpmt.lib /NODEFAULTLIB:msvcrt.lib
ccache: NO
Precompiled headers: YES
Extra dependencies: wsock32 comctl32 gdi32 ole32 setupapi ws2_32 cudart_static.lib nppc.lib nppial.lib nppicc.lib nppidei.lib nppif.lib nppig.lib nppim.lib nppist.lib nppisu.lib nppitc.lib npps.lib cublas.lib cudnn.lib cufft.lib -LIBPATH:C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/lib/x64 -LIBPATH:C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/lib
3rdparty dependencies: libprotobuf ade ittnotify libjpeg-turbo libwebp libpng libtiff libopenjp2 IlmImf zlib quirc ippiw ippicv
OpenCV modules:
To be built: aruco barcode bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab xfeatures2d ximgproc xobjdetect xphoto
Disabled: java_bindings_generator python_tests wechat_qrcode world
Disabled by dependency: -
Unavailable: alphamat cvv freetype hdf java julia matlab ovis python2 sfm viz
Applications: apps
Documentation: NO
Non-free algorithms: NO
Windows RT support: NO
GUI: WIN32UI
Win32 UI: YES
VTK support: NO
Media I/O:
ZLib: build (ver 1.2.12)
JPEG: build-libjpeg-turbo (ver 2.1.2-62)
WEBP: build (ver encoder: 0x020f)
PNG: build (ver 1.6.37)
TIFF: build (ver 42 - 4.2.0)
JPEG 2000: build (ver 2.4.0)
OpenEXR: build (ver 2.3.0)
HDR: YES
SUNRASTER: YES
PXM: YES
PFM: YES
Video I/O:
DC1394: NO
FFMPEG: YES (prebuilt binaries)
avcodec: YES (58.134.100)
avformat: YES (58.76.100)
avutil: YES (56.70.100)
swscale: YES (5.9.100)
avresample: YES (4.0.0)
GStreamer: NO
DirectShow: YES
Media Foundation: YES
DXVA: YES
Parallel framework: OpenMP
Trace: YES (with Intel ITT)
Other third-party libraries:
Intel IPP: 2020.0.0 Gold [2020.0.0]
at: D:/c++/opencv_build_cuda11.6/3rdparty/ippicv/ippicv_win/icv
Intel IPP IW: sources (2020.0.0)
at: D:/c++/opencv_build_cuda11.6/3rdparty/ippicv/ippicv_win/iw
Lapack: NO
Eigen: NO
Custom HAL: NO
Protobuf: build (3.19.1)
NVIDIA CUDA: YES (ver 11.6, CUFFT CUBLAS)
NVIDIA GPU arch: 86
NVIDIA PTX archs: 86
cuDNN: YES (ver 8.4.0)
OpenCL: YES (NVD3D11)
Include path: D:/c++/opencv/3rdparty/include/opencl/1.2
Link libraries: Dynamic load
Install to: D:/c++/opencv_build_cuda11.6/install
-----------------------------------------------------------------