Best way to merge multiple video frames to create an action shot with OpenCV

I’ve been working on software to create an action shot and found OpenCV very helpful for that. I have implemented a few algorithms, but in all cases the result was of average quality, not exactly what I wanted. I want the moving object to be crystal clear on top of the static background content.

As input, I have a video and pick random frames from it that contain the moving object (usually a person). I was able to align the static environment, and now I’m looking for the best way to merge multiple frames into a single one. I tried multiple operators like Add, AddWeighted, Bitwise and Min (which gave me the best result so far), but none of them looks like a clean merge of two frames. What is the right way to do that?

background estimation/subtraction, yes. for each frame you need a mask of the foreground object. then you can compose all foregrounds into one picture.
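A minimal sketch of that compositing step, assuming the aligned frames, one binary mask per frame, and a clean background image already exist as NumPy arrays of matching size. The function name `compose_action_shot` is made up for illustration. Pixels are copied opaquely, so nothing is blended:

```python
import numpy as np

def compose_action_shot(background, frames, masks):
    """Paste each frame's foreground pixels onto a copy of the background."""
    result = background.copy()
    for frame, mask in zip(frames, masks):
        fg = mask > 0            # boolean H x W selector from a 0/255 mask
        result[fg] = frame[fg]   # opaque copy, later frames overwrite earlier ones
    return result
```

Because later frames overwrite earlier ones where masks overlap, the order of `frames` decides which pose ends up on top.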

alignment of frames (stabilization) should work satisfactorily with feature matching and maybe a round of ECC refinement afterwards.


I used createBackgroundSubtractorMOG2 to generate a mask for the moving object, but the mask is not solid, and if I copy the content from a frame using that mask, I get very unclear results. I’m looking for a result like the one in the screenshot above, and this is what I usually get with the subtractor:

Also, I now have the mask and the frame from which I can copy the moving object using that mask, but where can I get an image with the background only?

I found the answer to the background part in your reply: I need to use background estimation to create a plain background first: Video Background Estimation - YouTube

This is how it looks if, instead of a background subtractor, I just use something like addWeighted. I really like the result; I just want to do it without transparency. Each time I blend a new frame with the previous result, everything gets more and more transparent.
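The growing transparency follows from the arithmetic of chained blending. As a small illustration, assume each step computes `acc = (1 - alpha) * acc + alpha * frame` with `alpha = 0.5`. The actual weights used are not stated in this thread, and `blend_weights` is a made-up helper. The effective weight of every earlier frame then shrinks geometrically:

```python
def blend_weights(n_frames, alpha=0.5):
    """Effective weight of each frame after chaining acc = (1 - alpha) * acc + alpha * frame."""
    weights = [1.0]                       # the first frame starts fully opaque
    for _ in range(n_frames - 1):
        # every blend scales all earlier contributions and appends the new frame
        weights = [w * (1.0 - alpha) for w in weights] + [alpha]
    return weights

print(blend_weights(4))  # [0.125, 0.125, 0.25, 0.5]
```

With four frames the first pose contributes only 12.5 % of its pixel values, which is why a mask-based opaque copy is needed for a solid subject.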

Thank you for all your clues and comments @crackwitz ! I was able to do the following:

  1. Collected all video frames and aligned them with the first one using feature matching
  2. Used all aligned frames to estimate the background (just a simple median)
  3. Used the MOG2 background subtractor to estimate movement for the frames selected by the user (I use around 100 frames to estimate/subtract the background but only 3-5 frames to build the shot)
  4. With cv.copyTo(frame, frameMask, background) I created the resulting shot

Below you can see the result for a video with 100 frames, with the first frame applied based on the MOG2 movement mask. I can say that the background estimation and image alignment work really well here. But the background subtractor gives an unclear mask, which leads to unsatisfactory results. Is there a better way to create that movement mask? I want to have a complete moving object without holes in it:
