Efficient way to get camera frames as GpuMat?

Hi,

I am trying to make a plugin for OBS Studio in C and C++. I have decided to use OpenCV to handle all frame manipulation (layering two images, converting pixel formats, applying matrix transforms, etc.).

Because OBS Studio renders in real time and is commonly used to record games that can consume a lot of system resources, I am trying to optimize the frame manipulation by doing all matrix calculations on the GPU with cv::cuda::GpuMat when available. I select the backend with preprocessor macros, so I can build two versions of my plugin: one that handles all frames on the CPU as cv::Mat, and one that works on the GPU using cv::cuda::GpuMat.

So far I am getting good results with image and video files (without audio management). I am now trying to add the camera as a video input source. To do that, I looked at how OBS Studio does it here, but the logic was a bit hard to understand and reuse in my own plugin. As far as I understand, they seem to use FFmpeg with DirectShow to retrieve and decode the camera frames.

Instead, I would like to use OpenCV's VideoCapture object. The good thing is that VideoCapture can also use the FFmpeg and DirectShow APIs. When working on the CPU, I can use VideoCapture cap(0, cv::CAP_DSHOW) with cap.read(Mat) to get camera frames. However, when working on the GPU, I am a bit lost on how to do it…

Of course, I could use the same code and just upload my frame to the GPU:

cv::Mat cpuFrame;
cv::cuda::GpuMat gpuFrame;
cv::VideoCapture cap(0, cv::CAP_DSHOW);

[...]

if(cap.read(cpuFrame)) {
    gpuFrame.upload(cpuFrame);
}

But then I would have to upload my Mat to the GPU on every OBS tick, which is far from optimal.

Instead, I could maybe use cv::cudacodec::createVideoReader. Sadly, when I try to use it, I get the following error:

Error: Unsupported format or combination of formats (Unsupported video source) in cv::cudacodec::detail::FFmpegVideoSource::FFmpegVideoSource, file E:.…\ffmpeg_video_source.cpp, line 135

When I looked at ffmpeg_video_source.cpp, line 134, I saw that it also uses VideoCapture, but with the CAP_FFMPEG API instead of CAP_DSHOW.

I then changed my previous code to use CAP_FFMPEG as well, but now I can no longer retrieve frames from my camera. Here is my code:

std::string name = "0";
cv::Ptr<cv::cudacodec::VideoReader> reader;
cv::VideoCapture cap;
bool open = cap.open(name, cv::CAP_FFMPEG, {cv::CAP_PROP_FORMAT, -1}); // open will be false
reader = cv::cudacodec::createVideoReader(name, {cv::CAP_PROP_OPEN_TIMEOUT_MSEC, 100}); // Exception: Error: Unsupported format or combination of formats ....

How can I use cv::cudacodec::createVideoReader or cv::VideoCapture with CAP_FFMPEG in order to directly get the frames as GpuMat? Thanks in advance for your help.

Note: this is the OpenCV build configuration I am using:

-- General configuration for OpenCV 4.11.0-pre =====================================
--   Version control:               4.10.0-496-gb42075f3e2
--
--   Extra modules:
--     Location (extra):            E:/.../OpenCV_Build/opencv_contrib/modules
--     Version control (extra):     4.10.0-51-g3e776c87
--
--   Platform:
--     Timestamp:                   2024-12-10T15:13:33Z
--     Host:                        Windows 10.0.22631 AMD64
--     CMake:                       3.29.5-msvc4
--     CMake generator:             Visual Studio 17 2022
--     CMake build tool:            C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/MSBuild/Current/Bin/amd64/MSBuild.exe
--     MSVC:                        1942
--     Configuration:               Debug Release
--     Algorithm Hint:              ALGO_HINT_ACCURATE
--
--   CPU/HW features:
--     Baseline:                    SSE SSE2 SSE3
--       requested:                 SSE3
--     Dispatched code generation:  SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
--       SSE4_1 (16 files):         + SSSE3 SSE4_1
--       SSE4_2 (1 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
--       AVX (8 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
--       FP16 (0 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 AVX FP16
--       AVX2 (36 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 AVX FP16 AVX2 FMA3
--       AVX512_SKX (5 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 AVX FP16 AVX2 FMA3 AVX_512F AVX512_COMMON AVX512_SKX
--
--   C/C++:
--     Built as dynamic libs?:      YES
--     C++ standard:                11
--     C++ Compiler:                C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe  (ver 19.42.34435.0)
--     C++ flags (Release):         /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise    /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP  /O2 /Ob2 /DNDEBUG
--     C++ flags (Debug):           /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise    /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP  /Zi /Ob0 /Od /RTC1
--     C Compiler:                  C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe
--     C flags (Release):           /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise    /MP   /O2 /Ob2 /DNDEBUG
--     C flags (Debug):             /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise    /MP /Zi /Ob0 /Od /RTC1
--     Linker flags (Release):      /machine:x64  /INCREMENTAL:NO
--     Linker flags (Debug):        /machine:x64  /debug /INCREMENTAL
--     ccache:                      NO
--     Precompiled headers:         NO
--     Extra dependencies:          cudart_static.lib nppc.lib nppial.lib nppicc.lib nppidei.lib nppif.lib nppig.lib nppim.lib nppist.lib nppisu.lib nppitc.lib npps.lib cublas.lib cufft.lib -LIBPATH:C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/x64
--     3rdparty dependencies:
--
--   OpenCV modules:
--     To be built:                 aruco bgsegm bioinspired calib3d core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot quality rapid reg rgbd saliency shape signal stereo stitching structured_light superres surface_matching text tracking video videoio videostab wechat_qrcode world xfeatures2d ximgproc xobjdetect xphoto
--     Disabled:                    highgui
--     Disabled by dependency:      ccalib
--     Unavailable:                 alphamat cannops cvv fastcv freetype hdf java julia matlab ovis python2 python3 sfm ts viz
--     Applications:                apps
--     Documentation:               NO
--     Non-free algorithms:         NO
--
--   Windows RT support:            NO
--
--   GUI:
--     Win32 UI:                    YES
--     VTK support:                 NO
--
--   Media I/O:
--     ZLib:                        build (ver 1.3.1)
--     JPEG:                        build-libjpeg-turbo (ver 3.0.3-70)
--       SIMD Support Request:      YES
--       SIMD Support:              NO
--     WEBP:                        build (ver decoder: 0x0209, encoder: 0x020f, demux: 0x0107)
--     AVIF:                        NO
--     PNG:                         build (ver 1.6.43)
--       SIMD Support Request:      YES
--       SIMD Support:              YES (Intel SSE)
--     TIFF:                        build (ver 42 - 4.6.0)
--     JPEG 2000:                   build (ver 2.5.0)
--     OpenEXR:                     build (ver 2.3.0)
--     GIF:                         NO
--     HDR:                         YES
--     SUNRASTER:                   YES
--     PXM:                         YES
--     PFM:                         YES
--
--   Video I/O:
--     FFMPEG:                      YES (prebuilt binaries)
--       avcodec:                   YES (58.134.100)
--       avformat:                  YES (58.76.100)
--       avutil:                    YES (56.70.100)
--       swscale:                   YES (5.9.100)
--       avresample:                YES (4.0.0)
--     GStreamer:                   NO
--     DirectShow:                  YES
--     Media Foundation:            YES
--       DXVA:                      YES
--
--   Parallel framework:            Concurrency
--
--   Trace:                         YES (with Intel ITT)
--
--   Other third-party libraries:
--     Intel IPP:                   2021.12.0 [2021.12.0]
--            at:                   E:/.../OpenCV_Build/build_opencv/3rdparty/ippicv/ippicv_win/icv
--     Intel IPP IW:                sources (2021.12.0)
--               at:                E:/.../OpenCV_Build/build_opencv/3rdparty/ippicv/ippicv_win/iw
--     Lapack:                      NO
--     Eigen:                       NO
--     Custom HAL:                  NO
--     Protobuf:                    build (3.19.1)
--     Flatbuffers:                 builtin/3rdparty (23.5.9)
--
--   NVIDIA CUDA:                   YES (ver 12.6, CUFFT CUBLAS NVCUVID NVCUVENC)
--     NVIDIA GPU arch:             50 52 60 61 70 75 80 86 89 90
--     NVIDIA PTX archs:            90
--
--   cuDNN:                         NO
--
--   OpenCL:                        YES (NVD3D11)
--     Include path:                E:/.../OpenCV_Build/opencv/3rdparty/include/opencl/1.2
--     Link libraries:              Dynamic load
--
--   Python (for build):            C:/.../AppData/Local/Microsoft/WindowsApps/python3.exe
--
--   Java:
--     ant:                         NO
--     Java:                        YES (ver 17.0.0)
--     JNI:                         C:/.../.jdks/openjdk-17/include C:/.../.jdks/openjdk-17/include/win32 C:/.../.jdks/openjdk-17/include
--     Java wrappers:               NO
--     Java tests:                  NO
--
--   Install to:                    E:/.../OpenCV_Build/lib_cuda/opencv
-- -----------------------------------------------------------------
--
-- Configuring done (14.5s)
-- Generating done (11.2s)
-- Build files have been written to: E:/.../OpenCV_Build/build_opencv

terrible idea. OBS already gives you video input, way better than OpenCV does. OpenCV’s VideoCapture really isn’t that great. It’s an ancient API that’s been extended to do a lot, but the API itself is extremely constrained. you aren’t even getting presentation timestamps for the frames from a VideoCapture.

if you’re writing a filter, such a filter should take any video source within OBS itself.

all that CUDA stuff also seems unjustified. you aren’t doing general computation, and not even much of it. you’re doing graphics effects.

instead of writing CUDA-specific code, you can just use cv::UMat and that’ll give you OpenCL acceleration transparently throughout most of OpenCV. yes, even nvidia GPUs can do OpenCL, and they do it fairly well.

or reach for OpenGL/Vulkan/D3D and write shaders operating on textures and meshes, use the GPU’s innate rasterization ability. OBS likely already does keep its data on the GPU as objects from one of those APIs.

Thanks for your response.

I didn’t know that VideoCapture was not that efficient. I have already used OBS functions to load images, but that code was easier to understand. I will look more into the OBS Device class and how to use it (it doesn’t look that hard to understand after all).

Would you consider doing the same for video file processing, or is VideoCapture ‘optimized enough’ in that case? I saw that OBS provides a media_playback object as well.

I am not writing a filter but a new type of OBS source. The goal is to make a visual scripting tool for OBS. I have looked a bit into OpenCL for acceleration, but I didn’t know about OpenCV’s UMat (I am still new to OpenCV). I will definitely take a look at it.