Capture speed / framerate of cv2.VideoCapture

Hi all,

I’ve been playing around with Python, OpenCV, a Raspberry Pi 4 and a “Pi HQ cam”. The thing is, the time it takes to capture a frame is really long. The Pi cam can, in theory, do 1920x1080@30fps. I’m not aiming for that, but what I’m getting is more like 2-3 fps when using VideoCapture. I’d be happy if I could get up to 10 fps. I would also like a minimal read time at full resolution.

My main questions are:
– How can I get the read times down as far as possible?
– What is going on in the background, are there maybe some “magic resolutions” that bring read time down considerably? (see drop of read time at specific resolutions below)

Below you’ll find some system info and my test code

The test code does the following at different resolutions:
– read a frame from videocapture
– crop the frame to an area of interest
– show the frame with imshow

These are the timing results. The read times are long but go down with the resolution. However at 1280x704 and 640x480 the read time drops considerably. Why is that?

4032 x 3040: 0.553347
3840 x 2144: 0.425402
1920 x 1088: 0.328975
 1280 x 704: 0.026025
 1024 x 768: 0.356915
  640 x 480: 0.031017

Some system info:
Raspberry Pi 4 Model B Rev 1.4 4GB
Raspbian GNU/Linux 10 (buster)
ARMv7 Processor rev 3 (v7l)
IDE: Pycharm, Python 3.7, opencv-python

And finally my code:

import cv2
import time

# Read frame from videocapture
def read_frame():
    _, frame = feed.read()
    return frame

# Crop to region of interest
#    x1                  x2
# y1 .___________________.
#    |                   |
#    |                   |
#    |                   |
#    |                   |
# y2 |___________________|
# frame[y1:y2, x1:x2]

def crop(frame, x, y, width, height):
    return frame[y:y+height, x:x+width]

# Show frame
def show(frame):
    cv2.imshow('Preview', frame)

if __name__ == '__main__':


    feed = cv2.VideoCapture(0)

    # Resolutions, the width and height values must be multiples of 32
    # Index:            0           1            2            3             4             5
    resolutions = [[640, 480], [1024, 768], [1280, 704], [1920, 1088], [3840, 2144], [4032, 3040]]
    res_index = 5

    feed.set(cv2.CAP_PROP_FRAME_WIDTH, resolutions[res_index][0])
    feed.set(cv2.CAP_PROP_FRAME_HEIGHT, resolutions[res_index][1])

    # Region of interest will be 1/3 of the width and 1/3 of the height of chosen resolution,
    full_width = resolutions[res_index][0]
    full_height = resolutions[res_index][1]

    roi_w = int(full_width/3)
    roi_h = int(full_height/3)
    roi_x = int(full_width/2 - roi_w/2)
    roi_y = int(full_height/2 - roi_h/2)

    while True:

        start = time.time()
        frame = read_frame()
        print(f'read: {(time.time() - start):.6f}')     # print with 6 decimals

        start = time.time()
        frame = crop(frame, roi_x, roi_y, roi_w, roi_h)
        print(f'crop: {(time.time() - start):.6f}')

        start = time.time()
        show(frame)
        print(f'show: {(time.time() - start):.6f}')

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

btw, please use a CPU clock like cv2.getTickCount() to profile rather than time.time(), which measures wall time

Ok, so I changed my code, see below.

Also I calculated the percentages for read, crop and show. It is now clearer that the read time drops significantly at 1280x704 and 640x480.

And another question: is it possible to read just a part of the sensor, instead of capturing at full resolution and then cropping? I came across CAP_PROP_ZOOM, but print(feed.get(cv2.CAP_PROP_ZOOM)) returns -1.

    while True:

        start = cv2.getTickCount()
        frame = read_frame()
        print(f'read: {(cv2.getTickCount() - start) / cv2.getTickFrequency():.6f}')

        start = cv2.getTickCount()
        frame = crop(frame, roi_x, roi_y, roi_w, roi_h)
        print(f'crop: {(cv2.getTickCount() - start) / cv2.getTickFrequency():.6f}')

        start = cv2.getTickCount()
        show(frame)
        print(f'show: {(cv2.getTickCount() - start) / cv2.getTickFrequency():.6f}')

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break


those huge resolutions are STILL FRAME resolutions.


“percentages” are irrelevant. taking a numpy slice is a O(1) operation. imshow also takes negligible time.

wall clock time is acceptable here because no CPU time should be used while the driver waits for a frame from the camera. the Python docs recommend time.perf_counter() for interval timing
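a minimal sketch of why wall clock time is the right measure here: time.perf_counter() captures the full wait, even though almost no CPU time is burned during it (the sleep stands in for the driver waiting on the camera):

```python
import time

start = time.perf_counter()
time.sleep(0.05)  # stand-in for blocking on the camera driver
elapsed = time.perf_counter() - start
print(f'waited: {elapsed:.6f} s')
```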

DO NOT expect reading a frame to take a fixed amount of time. frames are generated BY THE CAMERA AT ITS OWN RATE. the frame arrives at a specific time. if you do anything else in the meanwhile, that does not affect when it arrives.

I really oughta start writing FAQ articles somewhere… this stuff is asked like once a week.

Dude… that answer really made me feel welcome to the forum. At least I put some effort into it instead of dumping a crappy question with no info whatsoever…

Anyway, so a standard video frame size would be 1920x1080, right? If I use 1080 the colors are way off; 1088 is fine. If I request 1920x1080 the read time is still really long, essentially unchanged.

I would like to understand what is happening and why. Why is reading from the VideoCapture way faster at 1280x704 than at 1024x768, for example?


any resolution that isn’t among these is either impossible to request or it gets “made” somehow by driver and/or hardware, using a possibly unsuitable mode.

the pi cameras are notoriously messy to get working properly.

googling should have revealed some of this to you. don’t just use stuff without looking up the documentation. that’s the last post I’ll write (or read) in this thread. bye.

BTW: your post in the other thread… they probably already know how to use numpy slicing. a dedicated function for that makes very little sense.

no… , no… I’m not going to post what’s on my mind regarding the answer above… nope.

On topic:

I read those docs and the modes listed are for the V1 and V2 versions of the Pi cams, not for the Pi HQ cam.

If someone is actually prepared to help here, that would be most welcome.
Thank you in advance