Camera Supports 255 FPS in FFMPEG, but Only Getting ~185 FPS in OpenCV

Hi All,

I have a cheap ELP camera that outputs the following formats, according to FFMPEG.

[dshow @ ] vcodec=mjpeg min s=640x360 fps=260.004 max s=640x360 fps=260.004
[dshow @ ] vcodec=mjpeg min s=640x360 fps=260.004 max s=640x360 fps=260.004 (pc, bt470bg/bt709/unknown, center)
[dshow @ ] vcodec=mjpeg min s=1280x720 fps=120 max s=1280x720 fps=120
[dshow @ ] vcodec=mjpeg min s=1280x720 fps=120 max s=1280x720 fps=120 (pc, bt470bg/bt709/unknown, center)
[dshow @ ] vcodec=mjpeg min s=1920x1080 fps=60.0002 max s=1920x1080 fps=60.0002
[dshow @ ] vcodec=mjpeg min s=1920x1080 fps=60.0002 max s=1920x1080 fps=60.0002 (pc, bt470bg/bt709/unknown, center)

When I use FFMPEG directly, I can get 255 FPS using MJPEG + yuvj422p. I can also get 255 FPS using their native Windows tool.

With OpenCV and the code below, I average only ~185 FPS while reading into memory and doing no other operations. Overall CPU/memory utilization is very low during the run. The system is a Xeon E5-1650 with 16 GB of RAM.

import cv2
import time

cam = cv2.VideoCapture(0, cv2.CAP_DSHOW)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 360)
cam.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))


count = 0

while count < 500:  # Warmup
    rval, frame = cam.read()
    count = count + 1

count = 0
start_time = time.time()
while (time.time() - start_time) < 5:
    rval, frame = cam.read()
    count = count + 1
end_time = time.time()
print("Total Time: {}".format(end_time - start_time))
print("Total Frames: {}".format(count))
print("Average FPS: {}".format(count / (end_time - start_time)))

Results:

Total Time: 5.003589630126953
Total Frames: 929
Average FPS: 185.66670504040297

Total Time: 5.007279634475708
Total Frames: 937
Average FPS: 187.12755595845798

Total Time: 5.00612735748291
Total Frames: 929
Average FPS: 185.57258608520556

I then looked at the time between each frame, and it seems like one in every 3 frames takes longer:

read frame processed : 12.14289665222168 ms
read frame processed : 2.002239227294922 ms
read frame processed : 1.9993782043457031 ms
read frame processed : 11.974811553955078 ms
read frame processed : 2.0225048065185547 ms
read frame processed : 1.9757747650146484 ms
read frame processed : 11.033296585083008 ms
read frame processed : 0.9982585906982422 ms
read frame processed : 2.0055770874023438 ms
read frame processed : 11.561155319213867 ms
read frame processed : 1.9941329956054688 ms
read frame processed : 1.9996166229248047 ms
read frame processed : 12.000560760498047 ms
read frame processed : 2.0232200622558594 ms
read frame processed : 1.9807815551757812 ms

I tried piping FFMPEG into OpenCV, but the pixel format is yuvj422p, not BGR24, and when switching the codec in FFMPEG from MJPEG to rawvideo, FFMPEG drops to 100 FPS and the bitrate skyrockets.
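One way to approach the piping idea without paying the rawvideo bitrate is to keep the stream as MJPEG through the pipe and decode each JPEG on the Python side, since `cv2.imdecode` returns BGR. The following is a minimal sketch under several assumptions: the helper names (`ffmpeg_mjpeg_cmd`, `iter_jpegs`, `decoded_frames`) and the device name are hypothetical, the camera's JPEGs are assumed to contain no embedded thumbnails (so the first end-of-image marker closes the frame), and nothing here has been measured against this camera, so it may or may not reach 255 FPS.

```python
import subprocess

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def ffmpeg_mjpeg_cmd(device_name, size="640x360", fps=260):
    """Build an FFmpeg command that copies the camera's MJPEG packets to stdout.

    device_name is a placeholder for the DirectShow device name.
    """
    return [
        "ffmpeg", "-f", "dshow",
        "-vcodec", "mjpeg", "-video_size", size, "-framerate", str(fps),
        "-i", "video=" + device_name,
        "-c:v", "copy", "-f", "mjpeg", "pipe:1",
    ]


def iter_jpegs(stream, chunk_size=65536):
    """Yield complete JPEG byte strings from a concatenated MJPEG byte stream."""
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return  # end of stream; any incomplete trailing frame is dropped
        buf += chunk
        while True:
            start = buf.find(SOI)
            if start < 0:
                buf = buf[-1:]  # keep one byte in case a marker is split
                break
            end = buf.find(EOI, start + 2)
            if end < 0:
                break  # frame not complete yet; read more data
            yield buf[start:end + 2]
            buf = buf[end + 2:]


def decoded_frames(device_name):
    """Yield BGR frames decoded by OpenCV (needs ffmpeg, numpy, opencv-python)."""
    import cv2
    import numpy as np

    proc = subprocess.Popen(ffmpeg_mjpeg_cmd(device_name),
                            stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    try:
        for jpeg in iter_jpegs(proc.stdout):
            frame = cv2.imdecode(np.frombuffer(jpeg, np.uint8), cv2.IMREAD_COLOR)
            if frame is not None:  # imdecode returns None on a corrupt frame
                yield frame
    finally:
        proc.terminate()
```

The JPEG decode now happens in the Python process, so it costs CPU time there; whether that keeps up at this frame rate would need to be measured.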

Any idea what would cause one in every 3 frames to take longer to capture?

Is there a way to use OpenCV to capture frames at ~250 FPS?

set CAP_PROP_FOURCC. examples all over the place. that should do the trick.

that is junk because you used time.time(). you should have used time.perf_counter_ns() or time.perf_counter()

besides, that merely measures when your program executes, not when the frames are made.

cameras make frames at their own rate and put them into a queue.
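As a rough illustration of the timing advice, here is a minimal sketch that measures each read with `time.perf_counter_ns()`. The helper names `time_reads` and `average_fps` are made up for this example, and `read_fn` stands in for something like `cam.read`. As noted above, this still only measures how long the call blocks in the program, not when the camera produced the frame.

```python
import time


def time_reads(read_fn, n_frames):
    """Call read_fn n_frames times; return each call's duration in milliseconds."""
    durations_ms = []
    for _ in range(n_frames):
        t0 = time.perf_counter_ns()
        read_fn()
        t1 = time.perf_counter_ns()
        durations_ms.append((t1 - t0) / 1e6)  # nanoseconds to milliseconds
    return durations_ms


def average_fps(durations_ms):
    """Average frame rate implied by a list of per-read durations."""
    total_s = sum(durations_ms) / 1000.0
    return len(durations_ms) / total_s if total_s > 0 else float("inf")
```

With a camera opened as `cam`, usage would look like `average_fps(time_reads(cam.read, 1000))`.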

Thanks! I added this line to the cap and no change (updated original post too):
cam.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))

I get the exact same results using perf_counter_ns

I have confirmed with FFMPEG and another DSHOW app that I can pull 255 FPS with this camera.



change up the order. it might require being first or last, where one of those will work and the other won’t.

if none of those work, pass the cap props in the constructor instead of using set() calls.

Tried in multiple places, no difference. I tried these two variations as well.

cam.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))

cam.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc("M", "J", "P", "G"))

What’s the best way to do that here? I couldn’t figure out the syntax.
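On the syntax question: recent OpenCV 4.x builds accept a flat list of `[property, value, property, value, ...]` integers as a third constructor argument. The sketch below assumes such a build; `fourcc` and `open_camera` are hypothetical helper names, and whether the DirectShow backend honors these parameters any differently from `set()` calls is not something this thread confirms.

```python
def fourcc(code):
    """Pack a four-character code into the integer OpenCV expects."""
    if len(code) != 4:
        raise ValueError("FOURCC must be exactly four characters")
    # Little-endian packing: the first character is the lowest byte.
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(code))


def open_camera(index=0, width=640, height=360):
    """Open a DirectShow camera with properties passed to the constructor."""
    import cv2  # imported here so fourcc() works without OpenCV installed

    params = [
        cv2.CAP_PROP_FOURCC, fourcc("MJPG"),
        cv2.CAP_PROP_FRAME_WIDTH, width,
        cv2.CAP_PROP_FRAME_HEIGHT, height,
    ]
    return cv2.VideoCapture(index, cv2.CAP_DSHOW, params)
```

`fourcc("MJPG")` produces the same integer as `cv2.VideoWriter_fourcc(*"MJPG")`, so the two variations tried above are equivalent to each other.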

Any idea why, when timing each frame read, it’s always every 3rd one that takes about 5x longer? I think if we can get that frame to take the same time as the rest, it would probably work as expected, especially given that ffmpeg and another Windows app are able to pull 255 frames.

It probably won’t make any difference, but just to rule it out, try passing a pre-allocated frame to
cam.read() to avoid allocating a new one on every invocation, e.g.

rval, _ = cam.read(frame)

you’re saying “low” CPU usage, but that still could be using an entire core…

set CAP_PROP_CONVERT_RGB to 0 and see how that affects the frame rate

Same results…

Total Frames: 917
Average FPS: 183.33326669092452

I ran the script multiple times and none of the cores seem to come close to max, picture attached:

no change at all.

the scheduler makes a process/thread hop around. it still uses “an entire core”, but spread around, and you won’t see it in that graph. you’ll only see it in a per-process/per-thread accounting.

that per-core graph looks to me like the CPU is fairly busy, i.e. there’s probably something taking up an entire core’s worth of time.

don’t expect parallelization. there might be some, in some parts, but there might not. processing doesn’t simply use multiple cores on its own. someone has to put effort into writing the algorithm to use parallelism. and that’s speaking generally.

you’re only doing a video capture. the only “heavy lifting” here is (1) data transfer, which is cheap and doesn’t take up CPU (2) VideoCapture’s color space conversion, which can be disabled (explained earlier). that too shouldn’t take up much time, but at high frame rates, it might become significant.
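One way to get the per-process accounting mentioned above from inside the script itself is to compare CPU time with wall-clock time. This is a minimal sketch; `cpu_share` is a hypothetical helper name, and `time.process_time()` counts CPU time across all threads of the current process only.

```python
import time


def cpu_share(fn):
    """Run fn once; return the process's CPU time divided by elapsed wall time.

    A value near 1.0 means roughly one full core was busy for the duration,
    no matter how the scheduler spread that work across physical cores.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0
```

Wrapping the 5-second capture loop in a function and passing it to `cpu_share` would show whether the process is close to using one core's worth of time or is mostly waiting.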

One possibility is that your read thread is sleeping or in a wait state, which could be the case here, see

That is, the long waits occur because you arrive before a frame is ready and the thread then enters a wait state, to be woken up when the frame is ready. The extra delay you are seeing is then due to the precision (10-15 ms on Windows) of the system timers. Then, because you waited so long, the next two frames are already ready when you call VideoCapture::read() and are processed immediately, taking ~2 ms each.
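A quick arithmetic check of this explanation against the timings posted in the question: if one long wait plus two fast reads form a repeating cycle, the cycle length should predict the observed frame rate. The sketch below only uses the numbers from the question; the variable names are made up for this example.

```python
# Per-read durations in milliseconds, copied from the question.
read_ms = [
    12.14289665222168, 2.002239227294922, 1.9993782043457031,
    11.974811553955078, 2.0225048065185547, 1.9757747650146484,
    11.033296585083008, 0.9982585906982422, 2.0055770874023438,
    11.561155319213867, 1.9941329956054688, 1.9996166229248047,
    12.000560760498047, 2.0232200622558594, 1.9807815551757812,
]

FRAMES_PER_CYCLE = 3  # one slow read followed by two fast ones

# Total time of each three-read cycle.
cycle_ms = [
    sum(read_ms[i:i + FRAMES_PER_CYCLE])
    for i in range(0, len(read_ms), FRAMES_PER_CYCLE)
]
mean_cycle_ms = sum(cycle_ms) / len(cycle_ms)

# Frame rate implied by delivering three frames per cycle.
implied_fps = 1000.0 * FRAMES_PER_CYCLE / mean_cycle_ms

print("mean cycle: {:.2f} ms".format(mean_cycle_ms))
print("implied FPS: {:.1f}".format(implied_fps))
```

Each cycle comes out at roughly 15 to 16 ms, which implies a little over 190 FPS. That is in the same range as the ~185 FPS measured, and the cycle length is close to the timer granularity mentioned above, which is consistent with the wait, rather than the camera, setting the pace.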

I haven’t checked but I would have thought you could see if changing

DWORD result = WaitForSingleObject(VDList[id]->sgCallback->hEvent, 1000);
if( result != WAIT_OBJECT_0) return false;

to

while (WaitForSingleObject(VDList[id]->sgCallback->hEvent, 0) != WAIT_OBJECT_0) continue;

makes a difference.

Either way, from reading this thread it looks like the implementation in OpenCV, which uses the DirectShow API rather than FFmpeg, is probably what is causing the difference you are observing.

you can try VideoCapture::waitAny(). that’s supposed to be able to poll.

still, there’s nothing you could do with that gained time except poll again and again.

If it’s a Windows precision thing, would trying this in Linux help before trying to custom compile a version on Windows?

Yes, if that’s easier; however, I have no idea whether this will make any difference.

That was it… I’m able to get 255 FPS in Linux with the exact same code.

Thanks for the help! Is there any info I can share to help get this solved in Windows, or is it a Windows limitation?