Hello, I have a problem: the DNN module is apparently not built with CUDA, so it falls back to the CPU for inference.
I get the following runtime warning when I set CUDA as the backend and target for a model loaded with the readNetFromDarknet function in C++:
[ WARN:0@0.031] global net_impl.cpp:178 cv::dnn::dnn4_v20230620::Net::Impl::setUpNet DNN module was not built with CUDA backend; switching to CPU
Since I couldn’t build OpenCV manually (loads of protobuf-related errors), I switched to vcpkg to install and build OpenCV. I built it with the following command, which took 2.2 hours and succeeded:
.\vcpkg.exe install opencv[cuda,contrib,cudnn,dnn,freetype,jpeg,openmp,png,webp,world]:x64-windows-static
After building, I linked these library files in my C++ project: cudart_static.lib, opencv_world4.lib and cudnn.lib, but I still get the same runtime warning.
OpenCV 4.8.0
CUDA 12.6
cuDNN 9.5
Windows 10 x64
I have also opened a GitHub issue for vcpkg; it provides a bit more information. https://github.com/microsoft/vcpkg/issues/41642
Does anyone have a solution to this problem?
What errors were you getting with protobuf? Did you try building the latest commits from the 4.x branches? On Windows, using Ninja from the command line, OpenCV should build in under an hour.
Hey, I can’t tell you which errors right now; I’ll have to try to build it again. But yes, I’ve tried building 4.8.0 and 4.10.0 using CMake. I haven’t tried Ninja yet; I’ll have to look into it and test it out.
I just tried with CMake and there are a bunch of protobuf-related errors. I have no idea how to fix them. I tried using Ninja from the guide you sent, but the build command always just returns “error: could not load cache”.
If you need a log file from the build, please tell me which one.
Your protobuf errors look to be related to OpenCV trying to use a combination of the vcpkg version and the downloaded version. Remove the vcpkg location from your path.
As for the guide, post the CMake arguments you use and the output from the configuration stage as shown in the guide.
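One quick way to see whether anything vcpkg-related is still on the PATH is to split the variable and look for matching entries. The following is only a sketch: the helper names `toLower` and `findPathEntries` are hypothetical, and the `;` separator assumes a Windows-style PATH.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lower-case a copy of the string so the search is case-insensitive.
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Split `pathValue` on `separator` and return every non-empty entry that
// contains `needle`, ignoring case.
std::vector<std::string> findPathEntries(const std::string& pathValue,
                                         const std::string& needle,
                                         char separator = ';') {
    std::vector<std::string> hits;
    const std::string loweredNeedle = toLower(needle);
    std::size_t start = 0;
    while (start <= pathValue.size()) {
        std::size_t end = pathValue.find(separator, start);
        if (end == std::string::npos) end = pathValue.size();
        const std::string entry = pathValue.substr(start, end - start);
        if (!entry.empty() && toLower(entry).find(loweredNeedle) != std::string::npos) {
            hits.push_back(entry);
        }
        start = end + 1;
    }
    return hits;
}
```

In a real program the input could come from `std::getenv("PATH")` (check the result for null first) with `"vcpkg"` as the needle. An empty result only shows that PATH is clean; it says nothing about CMake toolchain settings or packages that vcpkg has already installed.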
Configuration command:
set CMAKE_BUILD_PARALLEL_LEVEL=12
"C:/Program Files/CMake/bin/cmake.exe" ^
-H"C:/src/opencv-4.9.0" ^
-DOPENCV_EXTRA_MODULES_PATH="C:/src/opencv_contrib-4.x/modules" ^
-B"C:/src/opencv-4.9.0/build" ^
-G"Visual Studio 17 2022" ^
-DINSTALL_TESTS=ON ^
-DINSTALL_C_EXAMPLES=ON ^
-DBUILD_EXAMPLES=ON ^
-DBUILD_opencv_world=ON ^
-DENABLE_CUDA_FIRST_CLASS_LANGUAGE=ON ^
-DWITH_CUDA=ON ^
-DCUDA_GENERATION=Auto ^
-DBUILD_opencv_python3=ON ^
-DPYTHON3_INCLUDE_DIR="C:/Users/antia/miniforge-pypy3/include" ^
-DPYTHON3_LIBRARY="C:/Users/antia/miniforge-pypy3/libs/python39.lib" ^
-DPYTHON3_EXECUTABLE="C:/Users/antia/miniforge-pypy3/python.exe" ^
-DPYTHON3_NUMPY_INCLUDE_DIRS="C:/Users/antia/miniforge-pypy3/lib/site-packages/numpy/core/include" ^
-DPYTHON3_PACKAGES_PATH="C:/Users/antia/miniforge-pypy3/Lib/site-packages/" ^
-DCUDA_HOST_COMPILER="C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe" ^
-DCMAKE_TOOLCHAIN_FILE="" ^
-DVCPKG_TARGET_TRIPLET=""
And output:
--
-- General configuration for OpenCV 4.9.0 =====================================
-- Version control: unknown
--
-- Extra modules:
-- Location (extra): C:/src/opencv_contrib-4.x/modules
-- Version control (extra): unknown
--
-- Platform:
-- Timestamp: 2024-10-19T10:01:51Z
-- Host: Windows 10.0.19045 AMD64
-- CMake: 3.30.5
-- CMake generator: Visual Studio 17 2022
-- CMake build tool: C:/Program Files/Microsoft Visual Studio/2022/Community/MSBuild/Current/Bin/amd64/MSBuild.exe
-- MSVC: 1937
-- Configuration: Debug Release
--
-- CPU/HW features:
-- Baseline: SSE SSE2 SSE3
-- requested: SSE3
-- Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
-- requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
-- SSE4_1 (18 files): + SSSE3 SSE4_1
-- SSE4_2 (2 files): + SSSE3 SSE4_1 POPCNT SSE4_2
-- FP16 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
-- AVX (9 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
-- AVX2 (38 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
-- AVX512_SKX (8 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
--
-- C/C++:
-- Built as dynamic libs?: YES
-- C++ standard: 11
-- C++ Compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.37.32822/bin/Hostx64/x64/cl.exe (ver 19.37.32824.0)
-- C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /O2 /Ob2 /DNDEBUG
-- C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /Zi /Ob0 /Od /RTC1
-- C Compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.37.32822/bin/Hostx64/x64/cl.exe
-- C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /O2 /Ob2 /DNDEBUG
-- C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /Zi /Ob0 /Od /RTC1
-- Linker flags (Release): /machine:x64 /INCREMENTAL:NO
-- Linker flags (Debug): /machine:x64 /debug /INCREMENTAL
-- ccache: NO
-- Precompiled headers: NO
-- Extra dependencies: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/cudart_static.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppial.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppc.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppitc.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppig.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppist.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppidei.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/cublas.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/cublasLt.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/cufft.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppif.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppim.lib C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64/nppicc.lib
-- 3rdparty dependencies:
--
-- OpenCV modules:
-- To be built: aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape signal stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode world xfeatures2d ximgproc xobjdetect xphoto
-- Disabled: -
-- Disabled by dependency: -
-- Unavailable: alphamat cannops cvv freetype hdf java julia matlab ovis python2 python2 sfm viz
-- Applications: tests perf_tests examples apps
-- Documentation: NO
-- Non-free algorithms: NO
--
-- Windows RT support: NO
--
-- GUI:
-- Win32 UI: YES
-- VTK support: NO
--
-- Media I/O:
-- ZLib: build (ver 1.3)
-- JPEG: build-libjpeg-turbo (ver 2.1.3-62)
-- SIMD Support Request: YES
-- SIMD Support: NO
-- WEBP: build (ver encoder: 0x020f)
-- PNG: build (ver 1.6.37)
-- TIFF: build (ver 42 - 4.2.0)
-- JPEG 2000: build (ver 2.5.0)
-- OpenEXR: build (ver 2.3.0)
-- HDR: YES
-- SUNRASTER: YES
-- PXM: YES
-- PFM: YES
--
-- Video I/O:
-- DC1394: NO
-- FFMPEG: YES (prebuilt binaries)
-- avcodec: YES (58.134.100)
-- avformat: YES (58.76.100)
-- avutil: YES (56.70.100)
-- swscale: YES (5.9.100)
-- avresample: YES (4.0.0)
-- GStreamer: NO
-- DirectShow: YES
-- Media Foundation: YES
-- DXVA: YES
--
-- Parallel framework: Concurrency
--
-- Trace: YES (with Intel ITT)
--
-- Other third-party libraries:
-- Intel IPP: 2021.11.0 [2021.11.0]
-- at: C:/src/opencv-4.9.0/build/3rdparty/ippicv/ippicv_win/icv
-- Intel IPP IW: sources (2021.11.0)
-- at: C:/src/opencv-4.9.0/build/3rdparty/ippicv/ippicv_win/iw
-- Lapack: NO
-- Eigen: NO
-- Custom HAL: NO
-- Protobuf: build (3.19.1)
-- Flatbuffers: builtin/3rdparty (23.5.9)
--
-- NVIDIA CUDA: YES (ver 12.6.20, CUFFT CUBLAS)
-- NVIDIA GPU arch: 89
-- NVIDIA PTX archs:
--
-- cuDNN: YES (ver 9.3.0)
--
-- OpenCL: YES (NVD3D11)
-- Include path: C:/src/opencv-4.9.0/3rdparty/include/opencl/1.2
-- Link libraries: Dynamic load
--
-- Python 3:
-- Interpreter: C:/Users/antia/miniforge-pypy3/python.exe (ver 3.9.18)
-- Libraries: C:/Users/antia/miniforge-pypy3/libs/python39.lib (ver 3.9.18)
-- numpy: C:/Users/antia/miniforge-pypy3/Lib/site-packages/numpy/core/include (ver )
-- install path: C:/Users/antia/miniforge-pypy3/Lib/site-packages//cv2/python-3.9
--
-- Python (for build): C:/Users/antia/miniforge-pypy3/python.exe
--
-- Java:
-- ant: NO
-- Java: NO
-- JNI: NO
-- Java wrappers: NO
-- Java tests: NO
--
-- Install to: C:/src/opencv-4.9.0/build/install
-- -----------------------------------------------------------------
--
-- Configuring done (89.8s)
-- Generating done (45.7s)
-- Build files have been written to: C:/src/opencv-4.9.0/build
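For anyone reading along: the two lines that matter most for the DNN CUDA backend in a summary like the one above are "NVIDIA CUDA:" and "cuDNN:". Below is a minimal C++ sketch that checks them in a block of text. The helper names are hypothetical and nothing here calls OpenCV; in a real program the text could come from cv::getBuildInformation(), which returns this summary as a string. Both lines reading YES is only a rough indicator, not proof that the DNN CUDA backend was compiled in.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns true if some line of `buildInfo` contains `key` and, later on the
// same line, the word "YES" (e.g. "NVIDIA CUDA:  YES (ver 12.6.20, ...)").
bool summaryLineSaysYes(const std::string& buildInfo, const std::string& key) {
    std::istringstream lines(buildInfo);
    std::string line;
    while (std::getline(lines, line)) {
        const std::size_t keyPos = line.find(key);
        if (keyPos == std::string::npos) continue;
        // Only inspect the text that follows the key on this line.
        if (line.find("YES", keyPos + key.size()) != std::string::npos) return true;
    }
    return false;
}

// True when the summary reports both CUDA and cuDNN as found.
bool summaryReportsCudaAndCudnn(const std::string& buildInfo) {
    return summaryLineSaysYes(buildInfo, "NVIDIA CUDA:") &&
           summaryLineSaysYes(buildInfo, "cuDNN:");
}
```

If this returns false for the library a program actually links against, the "DNN module was not built with CUDA backend" warning is the expected result, whatever libraries are added to the linker input.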
I got the same errors; I’m trying again and disabling vcpkg temporarily for the cmd session.
EDIT 1: Disabling vcpkg temporarily did not work… I’ll try removing it from the system path.
EDIT 2: Never mind, I don’t even have it set in the PATH system/user environment variables.
set "VCPKG_ROOT="
set "VCPKG_DEFAULT_TRIPLET="
set "VCPKG_INSTALLATION_ROOT="
EDIT 3: I removed the protobuf install from vcpkg, and the build now gets past the point where the errors occurred. I’ll update if it builds successfully.
EDIT 4: It did not work, though I got new errors. All 4 of them are related to the ‘reduce’ function.
EDIT 5: I’m re-installing latest CUDA and cuDNN - should I use OpenCV 4.9.0 or 4.10.0?
You need to use the latest commits from the 4.x branch of both repos, which include compatibility updates for the latest versions of CUDA and cuDNN, released after 4.10.
Yes thanks, that worked. But I still believe there is something wrong.
"C:\Program Files\CMake\bin\cmake.exe" --build C:\src\opencv-4.10.0\build --target install --config Release
It does not say it built successfully; it just ends there without errors, so I assume it has built fine. But I can’t find any include files or library files in the build directory.
The files should be in the install folder, wherever that is for the new build you performed.
After your second most recent comment I switched to the 4.10.0 version. I’ll try again to see whether I missed something, and I’ll edit this reply if I didn’t.
EDIT: So I ran the build command below and I don’t have any of the .lib or include files, nor do I have the …/build/install folder. Am I missing a step?
"C:\Program Files\CMake\bin\cmake.exe" --build C:\src\opencv-4.10.0\build --target install --config Release
When you configured CMake, what was the install location?
-- Install to: C:/src/opencv-4.10.0/build/install
I nearly always build with Ninja, which doesn’t just stop without an error or without printing install output, so I don’t know what could have caused your Visual Studio build to stop.
Try to build with Ninja and post the errors you get.
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat"
"C:/Program Files/CMake/bin/cmake.exe" ^
-H"C:/src/opencv-4.10.0" ^
-DOPENCV_EXTRA_MODULES_PATH="C:/src/opencv_contrib-4.x/modules" ^
-B"C:/build/opencv-4.10.0/build" ^
-G"Ninja Multi-Config" ^
-DINSTALL_TESTS=ON ^
-DINSTALL_C_EXAMPLES=ON ^
-DBUILD_EXAMPLES=ON ^
-DBUILD_opencv_world=ON ^
-DENABLE_CUDA_FIRST_CLASS_LANGUAGE=ON ^
-DWITH_CUDA=ON ^
-DCUDA_GENERATION=Auto ^
-DBUILD_opencv_python3=ON ^
-DPYTHON3_INCLUDE_DIR="C:/Users/antia/miniforge-pypy3/include" ^
-DPYTHON3_LIBRARY="C:/Users/antia/miniforge-pypy3/libs/python39.lib" ^
-DPYTHON3_EXECUTABLE="C:/Users/antia/miniforge-pypy3/python.exe" ^
-DPYTHON3_NUMPY_INCLUDE_DIRS="C:/Users/antia/miniforge-pypy3/lib/site-packages/numpy/core/include" ^
-DPYTHON3_PACKAGES_PATH="C:/Users/antia/miniforge-pypy3/Lib/site-packages/"
I ran the configuration you sent, but it uses neither CUDA nor cuDNN.
I figured it out. I have now successfully built OpenCV with CUDA for the DNN module using Ninja, thanks!
Your configuration was good; I just had to open a new cmd window, and I also had to disable the Python bindings because they were giving some errors with numpy.
EDIT: If you could help me one last time, I’d appreciate it if you could tell me which config flags I need to set in order to build it statically.
Use
-DBUILD_SHARED_LIBS=OFF
Note: You will still need a lot of the shared libraries from the CUDA toolkit because static ones are only available for Linux.