Slow frame read from webcam

Hi,

As per the title, I’m seeing very slow frame reads from the webcam. At a resolution where it can output 120 FPS, I’m only getting 20-40 FPS from a very simple OpenCV application, while ffmpeg fed directly from the webcam gets ~60 FPS while also doing H.264 encoding and streaming, so something is definitely off. One other odd thing: on one run it can stay locked at 50 FPS, while on another it never gets above 20 FPS for the entire run.

Any pointers on where to start looking, please?

Thanks!

any waitKey() involved?

what API is involved, DSHOW/MSMF? V4L?

use ffmpeg to query the camera’s modes (available for dshow and v4l, that I know of). is there any mention of mjpeg or other compressed formats?
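
for example (replace /dev/video2 with whatever your device node is):

ffmpeg -f v4l2 -list_formats all -i /dev/video2

or, if v4l-utils is installed:

v4l2-ctl -d /dev/video2 --list-formats-ext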

you’d have to set(CAP_PROP_FOURCC, ...) and see what works. "MJPG" fourcc has given me “higher” frame rates from some cameras. OpenCV doesn’t “try them all”.
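
a minimal sketch of what I mean, assuming a V4L2 camera at index 0 (the index, size and FPS here are just examples, not necessarily what your camera needs):

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    cv::VideoCapture cap(0, cv::CAP_V4L2);   // example index/backend
    // width/height first, then the codec
    cap.set(cv::CAP_PROP_FRAME_WIDTH, 1280);
    cap.set(cv::CAP_PROP_FRAME_HEIGHT, 720);
    cap.set(cv::CAP_PROP_FOURCC, cv::VideoWriter::fourcc('M', 'J', 'P', 'G'));
    cap.set(cv::CAP_PROP_FPS, 120);
    // check what the driver actually accepted
    std::cout << "fourcc: " << (int)cap.get(cv::CAP_PROP_FOURCC)
              << ", fps: " << cap.get(cv::CAP_PROP_FPS) << std::endl;
    return 0;
}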

No waitKey() involved :) I’m measuring how long each operation takes using gettimeofday() calls around both reading the frames and encoding them to JPEG (which is odd to need at all, given that the camera already outputs MJPG).
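
The measurement loop, trimmed down, is roughly this (index 2 is just my camera’s device number):

#include <sys/time.h>
#include <cstdio>
#include <vector>
#include <opencv2/opencv.hpp>

static double now_ms() {
    struct timeval tv;
    gettimeofday(&tv, nullptr);
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

int main() {
    cv::VideoCapture capture(2);            // /dev/video2 in my case
    cv::Mat frame;
    std::vector<uchar> jpeg;
    for (;;) {
        double t0 = now_ms();
        if (!capture.read(frame)) break;    // time the frame grab
        double t1 = now_ms();
        cv::imencode(".jpg", frame, jpeg);  // time the JPEG encode
        double t2 = now_ms();
        std::printf("read: %.1f ms, encode: %.1f ms\n", t1 - t0, t2 - t1);
    }
    return 0;
}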

I’m pretty sure it’s gstreamer, as I’m seeing some debug output as I start my app. I used capture.get(CAP_PROP_FOURCC) and confirmed the camera is set to output MJPG. As far as I can tell, I’ve checked everything.

ok then. try reading from the camera with a plain ffmpeg or gstreamer process from the terminal, without any opencv involved.

if either of those succeeds, it should be possible for opencv to do it too.

Double-checked: ffmpeg uses v4l for interacting with the webcam, and gets 40-45 FPS while doing encoding as well:

Input #0, video4linux2,v4l2, from '/dev/video2':
  Duration: N/A, start: 29580.442876, bitrate: N/A
    Stream #0:0: Video: mjpeg (Baseline), yuvj422p(pc, bt470bg/unknown/unknown), 1280x720, 120 fps, 120 tbr, 1000k tbn, 1000k tbc
Input #1, lavfi, from 'anullsrc':
  Duration: N/A, start: 0.000000, bitrate: 705 kb/s
    Stream #1:0: Audio: pcm_u8, 44100 Hz, stereo, u8, 705 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (pcm_u8 (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0xaaaadd0f7140] using cpu capabilities: ARMv8 NEON
[libx264 @ 0xaaaadd0f7140] profile High 4:2:2, level 3.1, 4:2:2, 8-bit
[libx264 @ 0xaaaadd0f7140] 264 - core 159 r2991M 1771b55 - H.264/MPEG-4 AVC codec - Copyleft 2003-2019 - http://www.videolan.org/x264.html - options: cabac=0 ref=1 deblock=0:0:0 analyse=0:0 me=dia subme=0 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=9 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=0 keyint=250 keyint_min=25 scenecut=0 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=0
Output #0, flv, to 'rtmp://a.rtmp.youtube.com/live2/a6ve-eh4g-v70c-6zwq-77c8':
  Metadata:
    encoder         : Lavf58.29.100
    Stream #0:0: Video: h264 (libx264) ([7][0][0][0] / 0x0007), yuvj422p(pc, progressive), 1280x720, q=-1--1, 30 fps, 1k tbn, 30 tbc
    Metadata:
      encoder         : Lavc58.54.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) ([10][0][0][0] / 0x000A), 44100 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.54.100 aac
[video4linux2,v4l2 @ 0xaaaadd0ed7c0] Thread message queue blocking; consider raising the thread_queue_size option (current value: 8)
[flv @ 0xaaaadd0f5e50] Failed to update header with correct duration.377.1kbits/s speed=1.36x
[flv @ 0xaaaadd0f5e50] Failed to update header with correct filesize.
frame=  841 fps= 41 q=-1.0 Lsize=   21833kB time=00:00:28.00 bitrate=6387.4kbits/s speed=1.37x

The weird thing is that I checked and OpenCV uses V4L2 as well, as
capture.getBackendName() revealed.

I put up a post with the ffmpeg output (above), but it got locked by the antispam filter… Essentially ffmpeg also uses the V4L2 backend, yet gets 40-45 FPS even while doing encoding and streaming.


Any ideas about how to approach this? I find it odd that OpenCV and FFmpeg use the same underlying module (V4L) for accessing the webcam, yet get such different results.

Is it possible that they use different versions of the Video4Linux module?

You probably need to do multithreading. Try just capturing the frames, without any processing, to check whether you can read frames from the webcam as fast as you expect. If that works, move the processing of the frames to one or more other threads.
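
A minimal sketch of the split I mean (the camera index and backend are just examples, error handling omitted):

#include <opencv2/opencv.hpp>
#include <atomic>
#include <mutex>
#include <thread>

int main() {
    cv::VideoCapture cap(0, cv::CAP_V4L2);   // index/backend are just examples
    std::mutex mtx;
    cv::Mat latest;                          // most recent frame, shared between threads
    std::atomic<bool> running{true};

    // capture thread: does nothing but read frames as fast as the camera delivers them
    std::thread grabber([&] {
        cv::Mat frame;
        while (running && cap.read(frame)) {
            std::lock_guard<std::mutex> lock(mtx);
            frame.copyTo(latest);
        }
        running = false;
    });

    // processing loop: always works on the newest frame, independent of the capture pace
    cv::Mat work;
    int processed = 0;
    while (running && processed < 300) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            if (latest.empty()) continue;
            latest.copyTo(work);
        }
        // ... processing / JPEG encoding of 'work' would go here ...
        ++processed;
    }
    running = false;
    grabber.join();
    return 0;
}

The point is that the grab thread never waits for the processing, so you can first verify that the camera side alone reaches the frame rate you expect.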

I have some updates on this, finally.

So I used a GStreamer pipeline to capture the frames from the camera and, since I could add/remove elements at will, I could troubleshoot the problem a bit further. With this pipeline, I was getting a horrible frame rate (2 FPS):

capture = VideoCapture("v4l2src device=/dev/video2 io-mode=2 ! image/jpeg, width=1920 ! v4l2jpegdec ! video/x-raw ! imxvideoconvert_g2d ! video/x-raw, format=BGRx ! queue ! videoconvert ! video/x-raw,format=BGR ! appsink", CAP_GSTREAMER);

However, if I removed the ‘videoconvert’ element, which uses the CPU simply to drop the 4th byte of the BGRx format, the FPS increased to 60, as expected. I must admit I’m lost as to how simply dropping one channel can take so much time. So, as with anything in software, one answer opens multiple questions:

  1. Is there any way around this? I’ve been reading about people having similar problems with Jetson units as well. Is there any way to use BGRx in OpenCV? (See the rough sketch after this list.)

  2. Is it a limitation of my camera? It outputs MJPG, so obviously the frames need to be decoded to a raw format before OpenCV can process them. However, everywhere I read (mainly the NVIDIA Jetson forums), people say they can’t improve the BGRx-to-BGR conversion speed because it’s done on the CPU. I genuinely can’t understand how there are so many products that overlay text on video, and what cameras and processing they use.
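
On point 1, a rough and untested sketch (an assumption on my part, not something I’ve verified): end the pipeline at BGRx, pull the samples straight from the appsink, and do the BGRx-to-BGR drop with cv::cvtColor in OpenCV instead of gstreamer’s videoconvert. Whether cvtColor actually beats videoconvert here is exactly what would need measuring, and this assumes the BGRx buffers are tightly packed (links against gstreamer-1.0 and gstreamer-app-1.0):

#include <gst/gst.h>
#include <gst/app/gstappsink.h>
#include <opencv2/opencv.hpp>

int main() {
    gst_init(nullptr, nullptr);

    GError* err = nullptr;
    // essentially the same elements as my pipeline above, but stopping at BGRx (no videoconvert)
    GstElement* pipeline = gst_parse_launch(
        "v4l2src device=/dev/video2 io-mode=2 ! image/jpeg, width=1920 ! "
        "v4l2jpegdec ! imxvideoconvert_g2d ! video/x-raw, format=BGRx ! "
        "appsink name=sink sync=false", &err);
    if (!pipeline) { g_printerr("%s\n", err->message); return 1; }

    GstElement* sink = gst_bin_get_by_name(GST_BIN(pipeline), "sink");
    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    for (;;) {
        GstSample* sample = gst_app_sink_pull_sample(GST_APP_SINK(sink));
        if (!sample) break;                             // EOS or error

        GstCaps* caps = gst_sample_get_caps(sample);
        GstStructure* st = gst_caps_get_structure(caps, 0);
        int w = 0, h = 0;
        gst_structure_get_int(st, "width", &w);
        gst_structure_get_int(st, "height", &h);

        GstBuffer* buf = gst_sample_get_buffer(sample);
        GstMapInfo map;
        gst_buffer_map(buf, &map, GST_MAP_READ);

        cv::Mat bgrx(h, w, CV_8UC4, map.data);          // zero-copy 4-channel view
        cv::Mat bgr;
        cv::cvtColor(bgrx, bgr, cv::COLOR_BGRA2BGR);    // drop the padding byte in OpenCV

        // ... process 'bgr' here ...

        gst_buffer_unmap(buf, &map);
        gst_sample_unref(sample);
    }

    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(sink);
    gst_object_unref(pipeline);
    return 0;
}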

Thanks!

There’s not much I can do with multithreading, I think; it’s the frame capture and conversion itself that takes longer than the ~16 ms per frame needed for 60 FPS. I’d be happy with 30 ms, as long as it’s consistent.

As I said in my other message, what bugs me is that this is such a common task, done in so many projects, that it seems like I’m on totally the wrong path in terms of the solutions I’ve chosen.

Hi, I’ve had a similar experience to you, @mars.
The surprising thing I’ve found is that OpenCV capture in Python was way faster:
1920x1080 video:
C++: 5 FPS
Python: 60+ FPS

The C++/Python scripts are extremely simple (no API/FOURCC selection) and virtually identical. Each only reads frames in a loop; the C++ one is a release build run from the console (outside the IDE).
For C++ I’ve tried both:
ret = cap.read(image);
and:
cap >> image;
with no perceivable difference.

Python:
ret, image = cap.read()

One thing worth noting is that the Python cap.open() takes very long (> 10 seconds), whereas the C++ one is instantaneous. Is Python perhaps internally trying different modes? If so, can this be emulated in C++?

different operating systems? different versions? if yes, that implies possibly different backend priorities/selection.
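
you can also pin the backend explicitly instead of letting VideoCapture pick, to see whether the difference really is MSMF vs DSHOW. a rough sketch (index 0 is just an example):

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    cv::VideoCapture cap(0, cv::CAP_MSMF);   // or cv::CAP_DSHOW for comparison
    if (!cap.isOpened()) {
        std::cout << "open failed" << std::endl;
        return 1;
    }
    std::cout << "backend: " << cap.getBackendName() << std::endl;
    return 0;
}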

Hi @crackwitz, it’s all on the same OS.

I’ve uploaded my C++ and Python test code here:

C++:
OpenCV: 4.4.0, 4.6.0
Visual Studio/compiler: 2019/v142, 2022/v143
Measured performance: ~5 FPS (quite constant)

Python:
OpenCV: 4.6.0
Python 3.8
Measured performance: 50~100 FPS (varies)

Things tried:
Different OpenCV versions (see above)
Different VS/compilers (see above)
Explicitly set CAP_PROP_FPS to 10 or 30
Frame acquisition: cap.read(image) and: cap >> image
Use VideoCapture constructor with camera index instead of using cap.open(x)
Use capture index -1 or 1 instead of 0 (not working)

All testing on the same OS: Windows 10 64bit

The reason for starting this testing was that on a Raspberry Pi with a Pi camera we similarly got very low frame rates using C++ OpenCV. So that’s a different architecture, a different OS and different camera hardware. I therefore think this should be very reproducible.

That’s all the information I can think of, but let me know if you’d like more details.

Hi, more on this:
I’ve found out that the backend selected in Python is MSMF,
while in C++ it is DSHOW.
Explicitly selecting MSMF in C++ does not work. The Windows binary of OpenCV does include MSMF and the corresponding DLL.

The verbose/debug output doesn’t say why the MSMF plugin fails:

[DEBUG:0@0.070] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\videoio_registry.cpp (197) cv::`anonymous-namespace'::VideoBackendRegistry::VideoBackendRegistry VIDEOIO: Builtin backends(8): FFMPEG(1000); GSTREAMER(990); INTEL_MFX(980); MSMF(970); DSHOW(960); CV_IMAGES(950); CV_MJPEG(940); UEYE(930)
[DEBUG:0@0.071] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\videoio_registry.cpp (221) cv::`anonymous-namespace'::VideoBackendRegistry::VideoBackendRegistry VIDEOIO: Available backends(8): FFMPEG(1000); GSTREAMER(990); INTEL_MFX(980); MSMF(970); DSHOW(960); CV_IMAGES(950); CV_MJPEG(940); UEYE(930)
[ INFO:0@0.072] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\videoio_registry.cpp (223) cv::`anonymous-namespace'::VideoBackendRegistry::VideoBackendRegistry VIDEOIO: Enabled backends(8, sorted by priority): FFMPEG(1000); GSTREAMER(990); INTEL_MFX(980); MSMF(970); DSHOW(960); CV_IMAGES(950); CV_MJPEG(940); UEYE(930)
[ WARN:0@0.072] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\cap.cpp (246) cv::VideoCapture::open VIDEOIO(GSTREAMER): trying capture cameraNum=0 ...
[ INFO:0@0.072] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_plugin.cpp (383) cv::impl::getPluginCandidates Found 2 plugin(s) for GSTREAMER
[ INFO:0@0.073] global c:\build\master_winpack-build-win64-vc15\opencv\modules\core\src\utils\plugin_loader.impl.hpp (67) cv::plugin::impl::DynamicLib::libraryLoad load E:\Personal\Projects\cvtest\x64\Debug\opencv_videoio_gstreamer460_64d.dll => FAILED
[ INFO:0@0.075] global c:\build\master_winpack-build-win64-vc15\opencv\modules\core\src\utils\plugin_loader.impl.hpp (67) cv::plugin::impl::DynamicLib::libraryLoad load opencv_videoio_gstreamer460_64d.dll => FAILED
[ WARN:0@0.075] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\cap.cpp (308) cv::VideoCapture::open VIDEOIO(GSTREAMER): backend is not available (plugin is missing, or can't be loaded due dependencies or it is not compatible)
[ WARN:0@0.075] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\cap.cpp (246) cv::VideoCapture::open VIDEOIO(MSMF): trying capture cameraNum=0 ...
[ INFO:0@0.075] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_plugin.cpp (383) cv::impl::getPluginCandidates Found 2 plugin(s) for MSMF
[ INFO:0@0.125] global c:\build\master_winpack-build-win64-vc15\opencv\modules\core\src\utils\plugin_loader.impl.hpp (67) cv::plugin::impl::DynamicLib::libraryLoad load E:\Personal\Projects\cvtest\x64\Debug\opencv_videoio_msmf460_64d.dll => OK
[ INFO:0@0.126] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_plugin.cpp (50) cv::impl::PluginBackend::initCaptureAPI Found entry: 'opencv_videoio_capture_plugin_init_v1'
[ INFO:0@0.126] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_plugin.cpp (169) cv::impl::PluginBackend::checkCompatibility Video I/O: initialized 'Microsoft Media Foundation OpenCV Video I/O plugin': built with OpenCV 4.6 (ABI/API = 1/1), current OpenCV version is '4.6.0' (ABI/API = 1/1)
[ INFO:0@0.126] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_plugin.cpp (69) cv::impl::PluginBackend::initCaptureAPI Video I/O: plugin is ready to use 'Microsoft Media Foundation OpenCV Video I/O plugin'
[ INFO:0@0.127] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_plugin.cpp (84) cv::impl::PluginBackend::initWriterAPI Found entry: 'opencv_videoio_writer_plugin_init_v1'
[ INFO:0@0.127] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_plugin.cpp (169) cv::impl::PluginBackend::checkCompatibility Video I/O: initialized 'Microsoft Media Foundation OpenCV Video I/O plugin': built with OpenCV 4.6 (ABI/API = 1/1), current OpenCV version is '4.6.0' (ABI/API = 1/1)
[ INFO:0@0.127] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_plugin.cpp (103) cv::impl::PluginBackend::initWriterAPI Video I/O: plugin is ready to use 'Microsoft Media Foundation OpenCV Video I/O plugin'
[ WARN:0@0.127] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\cap.cpp (269) cv::VideoCapture::open VIDEOIO(MSMF): can't create capture
[ WARN:0@0.127] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\cap.cpp (246) cv::VideoCapture::open VIDEOIO(DSHOW): trying capture cameraNum=0 ...
[ INFO:0@0.757] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_static.cpp (17) cv::applyParametersFallback VIDEOIO: Backend 'DSHOW' implementation doesn't support parameters in .open(). Applying 2 properties through .setProperty()
[ INFO:0@0.758] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_static.cpp (22) cv::applyParametersFallback VIDEOIO: apply parameter: [3]=1920 / 1920 / 0x0000000000000780
[ INFO:0@0.759] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\backend_static.cpp (22) cv::applyParametersFallback VIDEOIO: apply parameter: [4]=1080 / 1080 / 0x0000000000000438
[ WARN:0@1.528] global c:\build\master_winpack-build-win64-vc15\opencv\modules\videoio\src\cap.cpp (258) cv::VideoCapture::open VIDEOIO(DSHOW): created, isOpened=1
Camera back end: DSHOW

More testing, this time on Raspberry Pi 4 running Ubuntu 22.04.
Resolution 1280x720:

C++:
OpenCV: 4.5.4
Compiler: cmake
Backends tested: gstreamer, V4L2
Measured performance: ~10 FPS

Python:
OpenCV: 4.6.0.66
Python 3.10.6
Backend: V4L2
Measured performance: ~20 FPS

EDIT (forum doesn’t allow me to add more as I’m new)

After A LOT of digging I was able to solve this, getting 30 FPS at HD resolution. The camera mode is what ultimately seems to determine the FPS.
I’ve made a lot of progress using another package: openpnp-capture (GitHub - openpnp/openpnp-capture: A cross platform video capture library with a focus on machine vision.). This allows querying what the camera supports, including codecs and the corresponding FPS values for each combination.

So the codecs are YUY2, H264 and MJPG, with YUY2 the default. For reasons unknown to me, YUY2 has lower FPS values at high resolutions. H264 is not supported by any backend, but MJPG is; using this codec I was able to get normal FPS. I don’t know which codec is used in Python, as CAP_PROP_FOURCC does not return a valid fourcc value (the integer ‘22’).

*Note: when setting the codec using CAP_PROP_FOURCC, this needs to be done AFTER setting width/height, otherwise it doesn’t work. This also applies to the order when using a vector/list of params in cap.open(…, params)
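
Roughly the kind of open call I ended up with (the resolution and backend values here are examples rather than exactly my setup; the important part is that width/height come before the FOURCC in the params list):

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::VideoCapture cap;
    // width/height first, FOURCC after -- the other way around the codec setting is ignored
    std::vector<int> params = {
        cv::CAP_PROP_FRAME_WIDTH,  1280,
        cv::CAP_PROP_FRAME_HEIGHT, 720,
        cv::CAP_PROP_FOURCC, cv::VideoWriter::fourcc('M', 'J', 'P', 'G')
    };
    if (!cap.open(0, cv::CAP_DSHOW, params)) {
        std::cout << "open failed" << std::endl;
        return 1;
    }
    std::cout << "reported FPS: " << cap.get(cv::CAP_PROP_FPS) << std::endl;
    return 0;
}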

Outstanding questions:
Why is H264 not supported by any backend?
Why does the MSMF backend not work in C++?
What default codec is used in Python?