Hello all. I'm new to OpenCV, and I apologize in advance for my English: I'm not a native speaker, but I'll describe the problem as clearly as I can.
I built OpenCV with CUDA to accelerate video encoding and decoding. Here is my build information:
General configuration for OpenCV 4.7.0 =====================================
Version control: unknown
Extra modules:
Location (extra): D:/opencv_build/opencv_contrib-4.7.0/modules
Version control (extra): unknown
Platform:
Timestamp: 2023-11-09T02:50:34Z
Host: Windows 10.0.17763 AMD64
CMake: 3.26.4
CMake generator: Visual Studio 16 2019
CMake build tool: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/MSBuild/Current/Bin/MSBuild.exe
MSVC: 1929
Configuration: Release
CPU/HW features:
Baseline: SSE SSE2 SSE3
requested: SSE3
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
SSE4_1 (18 files): + SSSE3 SSE4_1
SSE4_2 (2 files): + SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (5 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
AVX2 (34 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
AVX512_SKX (8 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
C/C++:
Built as dynamic libs?: YES
C++ standard: 11
C++ Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe (ver 19.29.30152.0)
C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /MD /O2 /Ob2 /DNDEBUG
C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /MDd /Zi /Ob0 /Od /RTC1
C Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe
C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /MP /MD /O2 /Ob2 /DNDEBUG
C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /MP /MDd /Zi /Ob0 /Od /RTC1
Linker flags (Release): /machine:x64 /INCREMENTAL:NO
Linker flags (Debug): /machine:x64 /debug /INCREMENTAL
ccache: NO
Precompiled headers: NO
Extra dependencies: cudart_static.lib nppc.lib nppial.lib nppicc.lib nppidei.lib nppif.lib nppig.lib nppim.lib nppist.lib nppisu.lib nppitc.lib npps.lib cublas.lib cudnn.lib cufft.lib -LIBPATH:C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2/lib/x64
3rdparty dependencies:
OpenCV modules:
To be built: alphamat aruco barcode bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode world xfeatures2d ximgproc xobjdetect xphoto
Disabled: -
Disabled by dependency: -
Unavailable: cvv freetype hdf java julia matlab ovis python2 python2 sfm viz
Applications: tests perf_tests apps
Documentation: NO
Non-free algorithms: NO
Windows RT support: NO
GUI:
Win32 UI: YES
VTK support: NO
Media I/O:
ZLib: build (ver 1.2.13)
JPEG: build-libjpeg-turbo (ver 2.1.3-62)
SIMD Support Request: YES
SIMD Support: NO
WEBP: build (ver encoder: 0x020f)
PNG: build (ver 1.6.37)
TIFF: build (ver 42 - 4.2.0)
JPEG 2000: build (ver 2.4.0)
OpenEXR: build (ver 2.3.0)
HDR: YES
SUNRASTER: YES
PXM: YES
PFM: YES
Video I/O:
DC1394: NO
FFMPEG: YES (prebuilt binaries)
avcodec: YES (58.134.100)
avformat: YES (58.76.100)
avutil: YES (56.70.100)
swscale: YES (5.9.100)
avresample: YES (4.0.0)
GStreamer: NO
DirectShow: YES
Media Foundation: YES
DXVA: YES
Parallel framework: Concurrency
Trace: YES (with Intel ITT)
Other third-party libraries:
Intel IPP: 2020.0.0 Gold [2020.0.0]
at: D:/opencv_build/build/3rdparty/ippicv/ippicv_win/icv
Intel IPP IW: sources (2020.0.0)
at: D:/opencv_build/build/3rdparty/ippicv/ippicv_win/iw
Lapack: YES (C:/openblas/lib/openblas.lib)
Eigen: YES (ver 3.4.0)
Custom HAL: NO
Protobuf: build (3.19.1)
NVIDIA CUDA: YES (ver 11.2, CUFFT CUBLAS NVCUVID NVCUVENC FAST_MATH)
NVIDIA GPU arch: 86
NVIDIA PTX archs:
cuDNN: YES (ver 8.6.0)
OpenCL: YES (NVD3D11)
Include path: D:/opencv_build/opencv-4.7.0/3rdparty/include/opencl/1.2
Link libraries: Dynamic load
Python 3:
Interpreter: D:/Anaconda3/envs/acc/python.exe (ver 3.8.8)
Libraries: D:/Anaconda3/libs/python38.lib (ver 3.8.8)
numpy: D:/Anaconda3/envs/acc/Lib/site-packages/numpy/core/include (ver 1.24.4)
install path: D:/Anaconda3/envs/acc/Lib/site-packages/cv2/python-3.8
Python (for build): D:/Anaconda3/envs/acc/python.exe
Java:
ant: NO
JNI: D:/JDK8/jdk1.8.0_101/include D:/JDK8/jdk1.8.0_101/include/win32 D:/JDK8/jdk1.8.0_101/include
Java wrappers: NO
Java tests: NO
Install to: D:/opencv_build/install
-----------------------------------------------------------------
I'm reading and writing video with multiple threads and multiple GPUs. Here is a shortened version of my code:
import threading

import cv2


def gpu_process(i, gpu_num):
    # Bind this thread to the requested GPU before creating any CUDA objects.
    cv2.cuda.setDevice(gpu_num)

    # Hardware-accelerated reader that outputs BGR frames.
    video_gpu = cv2.cudacodec.createVideoReader(r'video.mp4')
    video_gpu.set(cv2.cudacodec.COLOR_FORMAT_BGR)
    format_gpu = video_gpu.format()
    fps = int(format_gpu.fps)
    height = format_gpu.height
    width = format_gpu.width

    encoder_params_in = cv2.cudacodec.EncoderParams()
    stream = cv2.cuda.Stream()
    result_video_path = '{}.mp4'.format(i)
    # Hardware-accelerated H.264 writer; this is the call that fails.
    out = cv2.cudacodec.createVideoWriter(result_video_path, (width, height), cv2.cudacodec.H264, fps=fps,
                                          colorFormat=cv2.cudacodec.COLOR_FORMAT_BGR,
                                          params=encoder_params_in,
                                          stream=stream)
    # Copy every decoded frame straight to the encoder.
    while True:
        ret, frame = video_gpu.nextFrame()
        if not ret:
            break
        out.write(frame)
    out.release()


# Start 7 worker threads, alternating between GPU 0 and GPU 1.
for i in range(7):
    if i % 2 == 0:
        gpu_num = 0
    else:
        gpu_num = 1
    threading.Thread(target=gpu_process, args=(i, gpu_num)).start()
However, after 5 threads have started successfully, the next one fails with this error:
out = cv2.cudacodec.createVideoWriter(result_video_path, (width, height), cv2.cudacodec.H264, fps=fps,
cv2.error: OpenCV(4.7.0) D:\opencv_build\opencv_contrib-4.7.0\modules\cudacodec\src\video_writer.cpp:220: error: (-217:Gpu API call) in function 'cv::cudacodec::VideoWriterImpl::Init'
> Error initializing Nvidia Encoder. Refer to Nvidia's GPU Support Matrix to confirm your GPU supports hardware encoding, codec and surface format and check the encoder documentation to verify your choice of encoding paramaters are supported.OpenCV(4.7.0) D:\opencv_build\opencv_contrib-4.7.0\modules\cudacodec\src\NvEncoder.cpp:47: error: (-217:Gpu API call) NVENC returned error [Code = 10] in function 'cv::cudacodec::NvEncoder::NvEncoder'
>
It seems that I can only create 5 cv2.cudacodec.createVideoWriter objects successfully. I don't know whether this is expected behavior, whether I missed something when building OpenCV with CUDA, or whether my code is wrong.
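To check whether the limit applies to writers that are open at the same time, the number of concurrent workers could be capped with a semaphore. The following is only a sketch under assumptions: `MAX_WRITERS = 5` is taken from the observation above and is not a confirmed limit, and `encode_job` is a hypothetical stand-in for `gpu_process` so that the snippet runs without a GPU.

```python
import threading
import time

MAX_WRITERS = 5  # assumed cap, based only on the observed failure at the 6th writer
writer_slots = threading.BoundedSemaphore(MAX_WRITERS)

# Bookkeeping used only to observe how many jobs run at the same time.
active = 0
peak = 0
counter_lock = threading.Lock()


def encode_job(i, gpu_num):
    """Hypothetical stand-in for gpu_process(i, gpu_num)."""
    global active, peak
    with counter_lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)  # pretend to read and write frames
    with counter_lock:
        active -= 1


def limited_job(i, gpu_num):
    # Wait for a free slot, so at most MAX_WRITERS jobs are inside at once.
    with writer_slots:
        encode_job(i, gpu_num)


threads = [threading.Thread(target=limited_job, args=(i, i % 2)) for i in range(7)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With this pattern the sixth and seventh jobs wait until an earlier job finishes instead of trying to open another writer. If the error disappears with the real `gpu_process` in place of `encode_job`, that would suggest the limit is on simultaneously open writers rather than on the total number created.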
FYI, I'm using OpenCV 4.7.0, Python 3.8.8, CUDA 11.2, and Video_Codec_SDK_11.1.5, and I have two NVIDIA GeForce RTX 3080 Ti GPUs.
Can anyone help?