Convert h264 file to mp4 with `cv::cudacodec`

I am experimenting with OpenCV’s CUDA integration (cv::cudacodec) using the official example here. The sample code works. However, it seems that the file generated by cv::Ptr<cv::cudacodec::VideoWriter> is a raw H.264 file, without a proper container such as mp4/avi.

My question is whether or not there is a recommended way to wrap the h264 file (say in-memory?) into a proper mp4/avi file.

I can think of one circuitous approach to achieve this: apart from building OpenCV with cv::cudacodec, we first build FFmpeg with Nvidia Video Codec support, then build OpenCV against this particular FFmpeg build, and then use cv::VideoWriter’s FFmpeg backend to leverage the hardware codec. But I am wondering if there is a more straightforward way…

for context:

When I added cudacodec::VideoWriter back in, I didn’t include container formats as I wasn’t sure if it was really necessary and I didn’t want to have to mess with cv::VideoReader to get to FFmpeg on the backend.

What is your use case for mp4?

lol you are also here

I want the resultant file to be directly playable by 3rd-party media players.

You can play .h264 in most media players if you don’t need to skip through the file.

The other option is to just use FFmpeg directly, either by calling ffmpeg

ffmpeg -i input.h264 -c:v copy output.mp4

or using the libs to place it in a container.
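To make the command-line route concrete, here is a minimal sketch of a helper that assembles the same stream-copy remux as an argument list, which could then be handed to whatever process-spawning API you use. The names `remux_args` and `to_display_string` are hypothetical, not part of OpenCV or FFmpeg. The `-framerate` input option is an addition to the command above: as far as I know a raw .h264 file carries no container timestamps, so FFmpeg falls back to a default rate unless told otherwise, and the value passed here is an assumption you would need to match to your encoder settings.

```cpp
#include <string>
#include <vector>

// Build the argument list for an FFmpeg stream-copy remux of a raw H.264
// elementary stream into a container chosen by the output file extension.
// `fps` is supplied by the caller (assumption: it matches the rate used
// when the stream was encoded).
std::vector<std::string> remux_args(const std::string& input,
                                    const std::string& output,
                                    int fps) {
    return {
        "ffmpeg",
        "-framerate", std::to_string(fps),  // input option, so it precedes -i
        "-i", input,
        "-c:v", "copy",                     // copy the bitstream, no re-encode
        output,
    };
}

// Join the arguments with spaces for logging only. No shell quoting is done,
// so the result is not safe to pass to a shell when paths contain spaces.
std::string to_display_string(const std::vector<std::string>& args) {
    std::string out;
    for (const auto& a : args) {
        if (!out.empty()) out += ' ';
        out += a;
    }
    return out;
}
```

Because the video is copied rather than re-encoded, the remux itself should be cheap compared with encoding; the cost is an extra pass over the file after cv::cudacodec::VideoWriter has finished.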

I guess the question could be:

How to feed GpuMats to a cv::VideoWriter initialized with HW_ACCEL flags

I’m guessing that the source data resides in GpuMat instances, or at least cv::UMat

If your requirement for a container format is that strong then that would be a reasonable approach. Last time I checked, writing a 1080p h264 file with cv::cudacodec::VideoWriter was roughly twice as fast (540 fps vs 260 fps) as cv::VideoWriter with hardware acceleration.

The main use cases for cv::cudacodec::VideoWriter would be

  1. when your data is already on the device, maybe you’re reading video with cv::cudacodec::VideoReader,
  2. when you require maximum encoding performance (not that common, 540 fps is probably not that important to most people when the rest of their pipeline can only process images at say 20 fps),
  3. when you want to lower the processing overhead on the CPU (cv::VideoWriter with hardware accel still stresses the CPU).

Yes I am currently using this approach (cv::VideoWriter with hardware-accelerated FFmpeg backend) but I am exploring any better ways to achieve the same with better performance.

Which media player are you using which can’t play raw h264?

I tested a few players.
Windows Media Player and Windows 11’s “Photos” can’t play .h264 file.
Also, VLC can only play the file if the extension is “.h264”. If I set the extension to “mp4”, VLC doesn’t work either.

This is what I can achieve, but I am still figuring out whether I can avoid the FFmpeg backend altogether: invoking FFmpeg, even with hardware acceleration enabled, uses much more CPU (~2x-3x) than just using cv::cudacodec::VideoWriter.

That makes sense as it’s an h264 file, not an mp4 container.

right. it’s a bare video stream, a bitstream. it’s not a stream in a container.

no regular video player expects bare bitstreams. a container is required.

OpenCV only has its own code for reading/writing AVI containers. those are outdated.

if you want an actual MP4 media file, you need something to write the stream(s) into such a container.

ffmpeg is that.
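To illustrate the bitstream-versus-container distinction, here is a rough sketch that sniffs the first bytes of a file. It assumes the raw encoder output is an Annex B byte stream (NAL units prefixed by a 00 00 01 or 00 00 00 01 start code) and that an MP4 file opens with an `ftyp` box, whose type sits at bytes 4 to 7. `StreamKind` and `sniff` are hypothetical names, and real players use far more thorough probing than this.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

enum class StreamKind { AnnexBBitstream, IsoBaseMedia, Unknown };

// Classify a file from its leading bytes.
StreamKind sniff(const std::vector<std::uint8_t>& head) {
    // MP4/MOV: a 4-byte box size followed by the box type "ftyp".
    if (head.size() >= 8 && std::memcmp(head.data() + 4, "ftyp", 4) == 0) {
        return StreamKind::IsoBaseMedia;
    }
    // Annex B: a 3-byte (00 00 01) or 4-byte (00 00 00 01) start code.
    if (head.size() >= 3 && head[0] == 0 && head[1] == 0) {
        if (head[2] == 1) {
            return StreamKind::AnnexBBitstream;
        }
        if (head.size() >= 4 && head[2] == 0 && head[3] == 1) {
            return StreamKind::AnnexBBitstream;
        }
    }
    return StreamKind::Unknown;
}
```

A raw stream renamed to .mp4 would still sniff as a bitstream here, which fits with a player refusing it when the extension promises a container that is not there.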

VLC plays them just fine (if it has a .h264 extension it expects a h264 file, expecting it to play a raw h264 file with an mp4 extension as @mamsds did is asking a little too much), you just can’t seek effectively as there is no index.

and I tested one more important tool: Firefox can’t play it.

@mamsds This functionality has now been added to cudacodec::VideoWriter.

@cudawarped can you elaborate a bit? Like sharing a PR url or a new doc here?

The PR “`cudacodec::VideoWriter` add container output” (Pull Request #3569 in opencv/opencv_contrib) was merged on Nov 2nd.


Hi @cudawarped ,

I just took a look at the PR you shared and I am about to try the new functionality, great job!

One question tho: I am aware that the code is currently in branch 4.x but not in tag 4.8.1 (currently the latest tag). Does that mean it is yet to be officially “released”? Put another way, will the code be in the next release, probably 4.8.2?