I was referring to the source of information, sorry for the confusion. The point is that `cv::cudacodec::VideoReader()` should work on all versions of CUDA; any information to the contrary may relate to specific circumstances, which I would be interested to investigate if you have a link to it.
Anyway, this thread is getting confusing, so I have just performed some tests on my build of 4.5.5 to confirm everything is working as it should, and it appears to be. I will try to condense the relevant information below.
Essentially for Nvidia hardware decoding you have two (possibly three) easy approaches:
- Build OpenCV against FFMpeg built against the Nvidia Video Codec SDK.
- Use `cv::cudacodec::VideoReader()`.
- Use GStreamer - I haven't personally tested this with OpenCV and RTSP, but I have used the Nvidia plugins outside of OpenCV (not streaming from RTSP) without any issues.
In my opinion, the pros and cons of each are as follows.
Build OpenCV against FFMpeg built against the Nvidia Video Codec SDK.
Pros:
- You can use the `VideoCapture()` api as is, frames are returned as `Mat`'s etc., so essentially the hardware decoding is hidden from you and should work without issues with existing code which uses `VideoCapture()`.
Cons:
- FFMpeg needs to be built against the Nvidia Video Codec SDK - if you get errors similar to the following from `VideoCapture::open()`
```
[ERROR:0@1.112] global /build/opencv/modules/videoio/src/cap_ffmpeg_impl.hpp (1117) open Could not find decoder 'h264_cuvid'
```
then this is not the case.
- If you use the old api (pre OpenCV 4.5.2) you need to set environment variables and make sure you know which codec you require in advance, e.g. setting `OPENCV_FFMPEG_CAPTURE_OPTIONS=video_codec;h264_cuvid` and then trying to decode an h265 file will result in the following error:
```
[h264_cuvid @ 00000269ae74a000] Codec type or id mismatches
```
- Decoding can be as slow as using the CPU (although this may have changed with the latest api); you simply get the benefit of reduced CPU utilization.
Procedure:
This varies depending on the version of OpenCV you are using. For all versions of OpenCV, adding one of the environment variables:
- `OPENCV_FFMPEG_CAPTURE_OPTIONS=video_codec;h264_cuvid`
- `OPENCV_FFMPEG_CAPTURE_OPTIONS=video_codec;hevc_cuvid`
or even
- `OPENCV_FFMPEG_CAPTURE_OPTIONS=video_codec;h264_qsv` (if you have an Intel chip)
should work, as long as the version of FFMpeg you are building against supports these codecs.
From 4.5.2 onwards this was made simpler, but it is still fairly new and I am not sure if everything works. You no longer need to add an environment variable, and opening an RTSP source as
```cpp
VideoCapture cap("rtsp://....", cv::CAP_FFMPEG, { CAP_PROP_HW_ACCELERATION, cv::VIDEO_ACCELERATION_ANY });
```
should result in hardware decoding and debug output similar to the below:
```
[ INFO:0@1.315] global repos/opencv/modules/videoio/src/cap_ffmpeg_hw.hpp (276) hw_check_device FFMPEG: Using d3d11va video acceleration on device: NVIDIA GeForce GTX 1060
[ INFO:0@1.316] global repos/opencv/modules/videoio/src/cap_ffmpeg_hw.hpp (566) hw_create_device FFMPEG: Created video acceleration context (av_hwdevice_ctx_create) for d3d11va on device 'default'
```
In addition, this should work out of the box on Windows as it is baked into `opencv_videoio_ffmpegxxx_64.dll`, whereas `video_codec;h264_cuvid` isn't and would require hacking the cmake files to link directly against FFMpeg libs built with hardware support.
cv::cudacodec::VideoReader().
Pros:
- The output is a `cv::cuda::GpuMat` (this could also be a con if you require a `cv::Mat`).
- You can get the raw encoded output at the same time as the decoded frame if required. This is useful if you want to record without the overhead of encoding, by writing the encoded data straight to a file.
- You don't need to build or find a version of FFMpeg which is built against the Nvidia Video Codec SDK.
- You don't need to mess around with environment variables, even on versions of OpenCV pre 4.5.2.
Cons:
- OpenCV needs to be built against both CUDA and the Nvidia Video Codec SDK; if you are not using CUDA for anything else this is probably overkill.
Procedure:
- Build OpenCV against CUDA and the Nvidia Video Codec SDK.
To summarize:
- If you have the latest version of OpenCV, access to FFMpeg built against the Nvidia Video Codec SDK, and don't use CUDA for processing the resulting frames, I would use approach 1).
- If you can't use a version of FFMpeg built against the Nvidia Video Codec SDK, or you use CUDA to process the resulting frames, use approach 2).
Personally I use 2) because most of the work I do is on the GPU. I would also add the following caveat: I am not sure if there is a limit on the number of concurrent hardware decoding sessions, or how stable approach 1) is.
You can see the results of some comparisons I ran a while back here to get an idea of the decoding performance of each approach.