Calculate optic flow for a video. Save as HSV. Keep 1:1 frame equivalence to original video

I am trying to load a video, convert it into a per-frame optic flow estimate, and save that out as a BGR encoding of the optical flow. The idea is that later on, when a BGR frame has been extracted from the video and converted to HSV, the per-pixel HSV value is an encoding of the magnitude and angle of each flow vector.
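
To make the encoding concrete, here is a minimal per-pixel sketch using only the Python standard library. It assumes hue carries the flow angle and value carries the magnitude, normalized by a hypothetical `max_mag`; the function names are made up for illustration. `colorsys` returns RGB floats in [0, 1], so the channel order would need swapping for BGR, and a real 8-bit, lossily compressed video will not round-trip this exactly.

```python
import colorsys
import math


def flow_to_rgb(dx, dy, max_mag):
    """Encode one flow vector as an RGB colour (floats in [0, 1])."""
    # Hue carries the direction of the vector, mapped from [0, 2*pi) to [0, 1).
    angle = math.atan2(dy, dx) % (2 * math.pi)
    hue = angle / (2 * math.pi)
    # Value carries the magnitude, clipped to max_mag (an assumed scale).
    value = min(math.hypot(dx, dy) / max_mag, 1.0)
    return colorsys.hsv_to_rgb(hue, 1.0, value)


def rgb_to_flow(r, g, b, max_mag):
    """Recover the flow vector from a colour produced by flow_to_rgb."""
    hue, _saturation, value = colorsys.rgb_to_hsv(r, g, b)
    angle = hue * 2 * math.pi
    magnitude = value * max_mag
    return magnitude * math.cos(angle), magnitude * math.sin(angle)


# Round trip for a single vector; no compression or 8-bit rounding involved.
dx, dy = rgb_to_flow(*flow_to_rgb(3.0, 4.0, max_mag=10.0), max_mag=10.0)
print(round(dx, 6), round(dy, 6))
```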

I can do this, no problem. I also insert 1 blank frame at the beginning of the video to account for the optic flow calculation requiring a difference of two frames (so the output would otherwise have num_frames-1 frames).

I can verify that the original and hsv encoding have the same number of frames. However, when I later use

original_video.set(cv2.CAP_PROP_POS_FRAMES, 2000)
hsv_video.set(cv2.CAP_PROP_POS_FRAMES, 2000)

and display the frames, they are offset by a small number of frames that is not constant across the video (i.e., the offset is different early in the video than it is later on). So far, I have found the offset to be as large as five frames.

I am not sure how to retain parity between frame numbers. Can anyone suggest a fix?

Some additional info:

  • When creating the VideoWriter object in preparation to transcode to optical flow, I set the fourcc to cv2.VideoWriter_fourcc(*'mp4v'), and the fps to the value of the original video returned by original_video.get(cv2.CAP_PROP_FPS)

  • After the optical flow video has been created, hsv_video.get(cv2.CAP_PROP_FOURCC) returns "1983148141.0". I get the same result for both videos.

  • The original video has a dynamic frame rate. When I compare the original video’s frame rate to the hsv video, I get: 29.852196518048217 for the original, and 29.852 for the hsv video. So, not equal, but that shouldn’t matter if seeking using frame number, right?
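
For reference, the FOURCC number above can be unpacked into its four characters. A small sketch (the helper name `decode_fourcc` is made up here, it is not an OpenCV function):

```python
def decode_fourcc(value):
    """Turn the number returned by CAP_PROP_FOURCC into its four-character code."""
    code = int(value)  # OpenCV returns the property as a float
    # The four characters are packed one per byte, lowest byte first.
    return "".join(chr((code >> (8 * i)) & 0xFF) for i in range(4))


print(decode_fourcc(1983148141.0))  # mp4v
```

So both files report the mp4v codec tag.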

Any advice greatly appreciated.

Thank you!

“seeking” in media files.

it’s like bus stops. you are dropped off at a stop, not wherever you’d like. you can walk the rest of the way, is the assumption.

does that make sense?

“walking” means decoding several frames until you end up where you want to be.

opencv does all of that for you though, which is why seeking can be expensive in opencv. more expensive than just the jump.
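
if you want to take the seek out of the picture entirely, you can do the "walking" yourself from the first frame. a rough sketch, assuming `cap` is a freshly opened cv2.VideoCapture (or anything with the same read() method); the function name and the file name in the usage comment are made up. slow for large indices, and it only keeps two videos in step if both files really contain the same frames in the same order.

```python
def read_frame_by_walking(cap, target_index):
    """Return frame `target_index` (0-based) by decoding from the start.

    Assumes `cap` has just been opened, so the next read() yields frame 0.
    `cap` needs a cv2.VideoCapture-style read() -> (ok, frame).
    """
    frame = None
    for index in range(target_index + 1):
        ok, frame = cap.read()  # decode the next frame in file order
        if not ok:
            raise IndexError(f"video ended before frame {index}")
    return frame


# Usage with OpenCV (hypothetical file name):
#   import cv2
#   frame = read_frame_by_walking(cv2.VideoCapture("original.mp4"), 2000)
```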

“intra” coding would allow you to jump to any frame. every intra frame is a bus stop.

also: media files don’t have numbered frames. they have timestamps. every frame has a timestamp, not an index.

indices are an illusion. and so is “fixed” frame rate. it’s all calculating with timestamps underneath. some videos have a fixed frame rate because each frame comes the same time after the previous one. some videos don’t have that. or the metadata (“frame rate”) is wrong.
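
a toy illustration of why an index can land on different frames. the timestamps below are made up, and the assumption is that a seek by index gets turned into a time using the nominal frame rate; whether a particular backend does exactly this is not something this sketch claims.

```python
from bisect import bisect_right


def frame_at_time(timestamps, t):
    """Index of the frame on screen at time t (seconds).

    `timestamps` is a sorted list of presentation times, one per frame.
    """
    index = bisect_right(timestamps, t) - 1  # last frame starting at or before t
    return max(index, 0)


nominal_fps = 30.0  # what the metadata claims

# Made-up presentation times for a variable-frame-rate clip.
vfr = [0.00, 0.03, 0.07, 0.10, 0.15, 0.17, 0.20]
# A constant-rate clip with the same number of frames.
cfr = [i / nominal_fps for i in range(len(vfr))]

target_index = 4
t = target_index / nominal_fps  # index converted to a time

print(frame_at_time(cfr, t), frame_at_time(vfr, t))  # 4 3
```

same frame count, same nominal rate, and "frame 4" is still a different frame in each file.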

Hi crackwitz!

You said, ““intra” coding would allow you to jump to any frame. every intra frame is a bus stop.”

Are you aware of a methodology to increase the number of i-frames in OpenCV?

If not, do you have any other advice?

I do understand a good amount of what you’re saying, but am having trouble translating it into action. Before working with OpenCV for video encoding I spent several months playing with pyav. Pts, dts, codecs, and … and, oh man, what a mess! I also have a limited grasp of video compression, and realize that any frame is really an assembly of keyframes, interframes, packets from the future, packets from the past, like Frankenstein’s monster of space/time. I could not solve this problem even by explicitly setting .pts and .dts and matching codecs, and so I moved on to OpenCV in the hope that it would sidestep this issue. No luck.

…so, I have enough of an understanding to know WHY seeking is difficult. Unfortunately, that doesn’t help me solve the problem that I am trying to solve.

So, any advice?

you’ll need pyav. you’ll need to copy the pts of the source frame onto your optical flow frame. take care that the timebase is right.
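
on the timebase point: a pts is an integer count of time_base ticks, so the same number means a different instant if the output stream uses a different time_base. a sketch of the arithmetic in plain python (no pyav calls; the function name and the two time bases are just example values):

```python
from fractions import Fraction


def rescale_pts(pts, src_time_base, dst_time_base):
    """Convert an integer pts between time bases, keeping the same instant."""
    seconds = pts * src_time_base  # exact rational time in seconds
    return round(seconds / dst_time_base)


src_tb = Fraction(1, 90000)  # example source stream time base
dst_tb = Fraction(1, 15360)  # example output stream time base

src_pts = 270000  # 3 seconds in the source time base
dst_pts = rescale_pts(src_pts, src_tb, dst_tb)
print(dst_pts, dst_pts * dst_tb)  # 46080 3
```

copying 270000 across unchanged would put that frame at about 17.6 s in the output instead of 3 s.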

Oof. Yeah, so I played that game for a few months. I tried matching codecs, time bases, and copying the frame PTS over to the new video. I opened the videos afterwards and compared pts to make sure they matched. I am hazy on the details, but I believe the issue was that I could not get them to match, no matter the approach I took to encoding. Nobody on the pyav forum was able to provide any sort of help. It is kind of a ghost town over there.

So, I do appreciate the suggestion, and it does make sense. Unfortunately, that path has been explored.