How to add ffmpeg options to VideoCapture

Hi there, I am wondering how to add FFmpeg options to VideoCapture. I found some answers like:

  import os
  import cv2

  os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "video_codec;h264_cuvid"
  cap = cv2.VideoCapture(url)

So how do I add more options, such as preset or anything else?
Like:

  os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "video_codec;h264_cuvid,preset;slow"

I mean, which character should be used to separate two options?

It should be |, so if it is supported, the following should work:

os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "video_codec;h264_cuvid|preset;slow"
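If you have several options, a small helper can keep the string tidy; build_capture_options is just an illustrative name, assuming the key;value pairs joined by | format described above:

```python
import os

def build_capture_options(options: dict) -> str:
    # Each option is written as "key;value"; pairs are joined with "|".
    return "|".join(f"{key};{value}" for key, value in options.items())

# Set the variable before the capture is opened, since the FFmpeg
# backend reads it when the stream is opened.
os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = build_capture_options({
    "video_codec": "h264_cuvid",
    "preset": "slow",
})
```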

Is there any way to pass these options as a function parameter instead of using the environment variable? I want to specify which capture the options apply to. Something like:

 cap = cv2.VideoCapture(url,{"OPENCV_FFMPEG_CAPTURE_OPTIONS":"video_codec;h264_cuvid|preset;slow"})

As far as I am aware you can’t do the above; you would need to use the new hardware acceleration API. This may help.

Yeah, I have tried OpenCV 4.5.5, which has the cv2.cudacodec.createVideoReader API.
So how do I add options like preset, pix_format and so on?

I wasn’t referring to cv2.cudacodec; I was referring to the way to use hardware acceleration through cv2.VideoCapture without environment variables, i.e.

cap = cv2.VideoCapture("rtsp://....", cv2.CAP_FFMPEG, [cv2.CAP_PROP_HW_ACCELERATION, cv2.VIDEO_ACCELERATION_ANY])

Not all native FFmpeg options are supported by cv2.VideoCapture with the FFmpeg backend.

pix_format is unlikely to be supported because cv2.VideoCapture only outputs BGR frames. Isn’t preset an encoding option?

Which options do you need?

In my scenario, I want to decode a video stream to BGR frames on the GPU. Ideally both the decoding and the YUV-to-BGR conversion would happen on the GPU.
Does cv2.VideoCapture work like this now? Or does it decode to YUV on the GPU and convert to BGR on the CPU?

cv2.VideoCapture will only output host/CPU frames. I am not sure exactly how the hardware acceleration works internally.

cudacodec.VideoReader decodes directly to device/GPU memory. If you build from the master branch you now have the option to output to BGR, BGRA, GRAY or NV12(YUV), with the default being BGRA. The decoder currently decodes everything to NV12, so if you choose BGR output format then it will run an extra CUDA kernel over the frame to perform the conversion.

have the option to output to BGR, BGRA, GRAY or NV12(YUV)

Which option?
I want to get the BGR format. For now I use gpu_mat.download and convert it to BGR on the CPU.

I should have mentioned this doesn’t currently work from Python; see this PR for the update.

See here for details of how to call them.

In my scenario, I am using a Tesla V100 to decode from RTSP and encode to RTSP.

  1. After I use cudacodec to get the gpu_mat, is there any way to run inference on this gpu_mat with PyTorch, or must I download it to the CPU, make it a torch.tensor and then run inference?
  2. I have a video stream which is 2560x1920, and I wrote a pipeline that decodes it, then encodes it and pushes it to another server. Yes, it is like remuxing, but I will modify the frames, so I must decode and encode.
    I am decoding the frames with:
    cap = cv2.cudacodec.createVideoReader(url)
    while True:
        ret, frame = cap.nextFrame()
        if ret:
            image = frame.download()  
            image=cv2.cvtColor(image,cv2.COLOR_BGRA2BGR)
            pushing(image)

And I am pushing the raw frames to a pipe and on to the server with this command:

ffmpeg -hwaccel cuvid -y -f rawvideo -pix_fmt bgr24 -s wxh -i - -c:v h264_nvenc -pix_fmt yuv420p -f rtsp -rtsp_transport tcp rtsp://xxxxxx

And I get 20 fps (speed=0.8x) when pushing to the pipe. Is this a normal speed? Also, after running for a while I get

[OPENCV:FFMPEG:16] RTP: PT=60: bad cseq 47d2 expected=49d2te=N/A speed=0.799x 

What is the reason?

Not that I am aware of; there was talk of an external allocator for pytorch etc. but I still don’t think that would work with OpenCV. In the future the OpenCV CUDA DNN backend should support GpuMat input, so if it is just for inference, when that modification is made you could export your model from pytorch to ONNX and just use OpenCV. In the meantime you could try to hide the overhead of the download by using CUDA streams to overlap the download with some CPU work. I know this is very inefficient (downloading from the device in OpenCV and then uploading to the device in pytorch) but I can’t think of another way.

Unfortunately, if you are altering the frame and need to encode, there isn’t much I can suggest because cudacodec.VideoWriter() is out of action.

I am not sure what you mean by 0.8x for pushing to the pipe? Normal speed for what, processing your 2560x1920 video on a Tesla V100?
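One thing worth bearing in mind is how much data raw bgr24 at that resolution pushes through the pipe. A quick back-of-the-envelope check (pure arithmetic, assuming 25 fps, which may not match your stream):

```python
# Raw bgr24 is 3 bytes per pixel.
width, height = 2560, 1920
bytes_per_frame = width * height * 3               # 14_745_600 bytes, ~14 MiB per frame
mb_per_second_at_25fps = bytes_per_frame * 25 / 1e6
print(bytes_per_frame)          # 14745600
print(mb_per_second_at_25fps)   # 368.64 MB/s through the pipe at 25 fps
```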

I can only guess, but it is possible that you are not requesting frames at the source fps due to the resolution of the video combined with the overhead of the calls to download(), cvtColor() and pushing(). Have you tried streaming by just calling ret, frame = cap.nextFrame() to see if the issue disappears? Alternatively you could try the new allowFrameDrop flag to see if that works. If so, you need to remember that this is just a convenience flag for prototyping and not really for production, because you are dropping frames.

params = cv2.cudacodec.VideoReaderInitParams()
params.allowFrameDrop = True
cap = cv2.cudacodec.createVideoReader(url, [], params)
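On the bad cseq warning: I can only speculate, but the RTP sequence numbers in the log are hexadecimal, so the size of the jump can be computed directly. A gap like this is consistent with packets being dropped while the stream is consumed too slowly:

```python
received = int("47d2", 16)  # cseq in the packet that arrived
expected = int("49d2", 16)  # cseq FFmpeg expected next
print(expected - received)  # 512: the sequence numbers differ by 0x200 packets
```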

What I would suggest is removing the call to cv2.cvtColor and setting the output from cudacodec::VideoReader to BGR.

cap.set(cv2.cudacodec.ColorFormat_BGR)

The latest test code for me is the one below; is this the same as what you suggested? The issue appears as before.

cap = cv2.cudacodec.createVideoReader(url)
ret, frame = cap.nextFrame()
while ret:
    frame = cv2.cuda.cvtColor(frame,cv2.COLOR_BGRA2BGR)
    image = frame.download()  
    pushing(image)
    ret, frame = cap.nextFrame()

Yeah, I mean the speed of pushing the stream with the ffmpeg command line on a Tesla V100, writing frames to a subprocess which calls ffmpeg -hwaccel cuvid -y -f rawvideo -pix_fmt bgr24 -s wxh -i - -c:v h264_nvenc -pix_fmt yuv420p -f rtsp -rtsp_transport tcp rtsp://xxxxxx to push the frames.

I will try this one next week.
By the way, when the video is 1280x720 this problem almost disappears. However, I saw this performance report on NVIDIA VIDEO CODEC SDK | NVIDIA Developer


Can OpenCV and cudacodec reach this performance?

Whatever that did, please do this instead. I only rearranged the code to use a single nextFrame call. while True loops are okay.

cap = cv2.cudacodec.createVideoReader(url)
while True:
    (ret, frame) = cap.nextFrame()
    if not ret: break
    frame = cv2.cuda.cvtColor(frame,cv2.COLOR_BGRA2BGR)
    image = frame.download()  
    pushing(image)

If you use the latest commit from master you can slightly modify the code above from crackwitz; it should automatically drop frames for you if you are requesting frames more slowly than they are captured, and output BGR instead of BGRA frames.

params = cv2.cudacodec.VideoReaderInitParams()
params.allowFrameDrop = True
cap = cv2.cudacodec.createVideoReader(url, [], params)
cap.set(cv2.cudacodec.ColorFormat_BGR)
while True:
    (ret, frame) = cap.nextFrame()
    if not ret: break
    image = frame.download()  
    pushing(image)

In my experience the quoted performance is realistic. To achieve this you need to call nextFrame() fast enough to saturate the decoder; that is, to get 4x30=120 fps (H.264 8-bit, V100) you would have to be calling nextFrame() faster than 120 fps (every 8 ms), which you won’t be doing if you are calling download(), cvtColor() and pushing() after nextFrame(). As I said above, try

cap = cv2.cudacodec.createVideoReader(url)
while True:
    (ret, frame) = cap.nextFrame()
    if not ret: break

to see what fps you are getting first to check there are no other issues present.
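To put a number on it, a rough timing wrapper can be used; measure_fps is just an illustrative helper, timed here against a dummy callable so the sketch is self-contained. In the real test you would pass lambda: cap.nextFrame():

```python
import time

def measure_fps(get_frame, n_frames=100):
    # Time n_frames calls to get_frame() and return the achieved rate.
    start = time.perf_counter()
    for _ in range(n_frames):
        get_frame()
    elapsed = time.perf_counter() - start
    return n_frames / elapsed

# Dummy stand-in for cap.nextFrame() so this runs anywhere.
fps = measure_fps(lambda: (True, None), n_frames=1000)
print(f"{fps:.0f} calls/s")
```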