Measure pan, rotation and resize between frames

I need to determine the above values between consecutive frames.
I know about identifying reference points,
and also about optical flow (which doesn’t account for rotation and zooming).
I think this should be a common task, but I couldn’t find anything about it.
(Maybe I’m using the wrong keywords?)
Any hint is welcome!
EDIT:
Simplifying assumption: the content itself is constant. The manipulations
affect it as a whole.

Follow this example of video stabilization via optical flow calculation:

import numpy as np
import cv2
from affineTransformTools import getTranslationX, getTranslationY, getRotation, getAffineTransform

video = cv2.VideoCapture('…/record2.avi')
fps = video.get(cv2.CAP_PROP_FPS)
hasFrame, frame = video.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

out = cv2.VideoWriter()
out.open('…/record3.avi', cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (frame.shape[1], frame.shape[0]))

# accumulated translation and rotation
x = 0.0
y = 0.0
f = 0.0

while True:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # detect corners in the previous frame and track them into the current one
    prev_points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01, minDistance=30, blockSize=3)
    points, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, prev_points, None)

    # keep only successfully tracked points and fit a partial affine (similarity) transform
    indices = np.where(status == 1)[0]
    warp_matrix, _ = cv2.estimateAffinePartial2D(prev_points[indices], points[indices], method=cv2.LMEDS)

    # extract and accumulate the per-frame motion
    dx = getTranslationX(warp_matrix)
    dy = getTranslationY(warp_matrix)
    df = getRotation(warp_matrix)

    x += dx
    y += dy
    f += df

    # warp the current frame back by the accumulated motion
    final_warp_matrix = getAffineTransform(1.0, 1.0, f, -x, -y)
    stabilized = cv2.warpAffine(frame, final_warp_matrix, (frame.shape[1], frame.shape[0]))

    cv2.imshow("video", cv2.hconcat([frame, stabilized]))
    key = cv2.waitKey(1)
    if key == 27:  # ESC
        break

    out.write(stabilized)

    hasFrame, frame = video.read()
    if not hasFrame:
        break

    prev_gray = gray

out.release()
cv2.destroyAllWindows()
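
Note that affineTransformTools is not an OpenCV module but a small helper of its own. A minimal sketch of what such helpers could look like, assuming warp_matrix is the 2×3 similarity transform returned by estimateAffinePartial2D:

import numpy as np

def getTranslationX(m):
    return m[0, 2]

def getTranslationY(m):
    return m[1, 2]

def getRotation(m):
    # angle of a similarity transform [[s*cos f, -s*sin f], [s*sin f, s*cos f]]
    return np.arctan2(m[1, 0], m[0, 0])

def getAffineTransform(sx, sy, f, tx, ty):
    # build a 2x3 affine matrix from per-axis scale, rotation angle and translation
    c, s = np.cos(f), np.sin(f)
    return np.float32([[sx * c, -sy * s, tx],
                       [sx * s,  sy * c, ty]])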


Question on generalization: different scaling for each dimension.

I can’t use estimateAffinePartial2D then (it estimates only one scale factor).
So would it be necessary to compute dense optical flow and remap by the flow?
I wonder if that will produce good quality, as the mapping may be arbitrary,
while I know there are uniformly just translation, rotation and resizing.

It could be circumvented by ensuring identical scaling for both dimensions
via cropping, but then I would suffer data loss.

What might be the better solution?

This is code for a video taken by a camera. In that case the aspect ratio is constant, and it works fine for suppressing camera shake. If you have different scales in x and y (the video source must be strange), try cv2.estimateAffine2D instead of cv2.estimateAffinePartial2D. Show your data for more advice.
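
A minimal sketch of that route, reusing the feature tracking from the stabilization example; imgA and imgB are hypothetical names for the two grayscale images, and the decomposition below ignores any shear component:

import numpy as np
import cv2

# imgA, imgB: the two grayscale images to align (hypothetical names)
ptsA = cv2.goodFeaturesToTrack(imgA, maxCorners=200, qualityLevel=0.01, minDistance=30)
ptsB, status, err = cv2.calcOpticalFlowPyrLK(imgA, imgB, ptsA, None)
good = np.where(status == 1)[0]

# full affine: 6 degrees of freedom, so x and y may scale differently
M, inliers = cv2.estimateAffine2D(ptsA[good], ptsB[good], method=cv2.LMEDS)

# column norms give the per-axis scales; the first column gives the rotation
scale_x = np.hypot(M[0, 0], M[1, 0])
scale_y = np.hypot(M[0, 1], M[1, 1])
rotation = np.arctan2(M[1, 0], M[0, 0])
tx, ty = M[0, 2], M[1, 2]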

Yes indeed, it IS a strange video camera the frames are from.
I got it cheap, but the sensor changes its dimensions constantly.

🙂 🙂
SCNR. Just joking, I hope you don’t mind?

No, please excuse me, I used the wrong word: “frames” instead of “scans”.

These scans are of the same content, but on different paper and done by different scanners.
They differ in quality, crop area, resolution and scaling (sometimes even a different aspect ratio).
And I’m about to match them to take the “best of both (or several) worlds”.

I had a look at cv2.estimateAffine2D already, but I’m not sure
whether my differences are affine transformations at all…

I implemented all 3 variants:

  • sparse flow and affine transformation (general, not partial)
  • dense flow and affine transformation (general, not partial)
  • dense flow and direct mapping

For that I tried

  • translation (small (about 10 pixels) and moderate (about 30 pixels))
  • rotation (10 degrees)
  • scale (80% of content, adding a border).

The result:
“It depends.”

Which is about the worst result obtainable.
To my thinking, the main notion of an “algorithm” is that it produces
usable results quite independently of the inputs.

In detail:
For some of my images (especially the ones I actually care about) all variants
are nearly unusable (the transformation is not identified).
For other images (e.g. “Lena”) it does quite well
(except, e.g., scaling with a generated canvas for the sparse variant:
it doesn’t identify the scaling).

The direct mapping of the dense flow doesn’t produce usable results at the pixel level.
The mapping goes in the right direction, but produces really “warped” images.
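
For reference, a minimal sketch of the “dense flow and direct mapping” variant, assuming matGrayAdjusted and matGrayB are the two grayscale scans (names as in the Farneback call below):

import numpy as np
import cv2

flow = cv2.calcOpticalFlowFarneback(matGrayAdjusted, matGrayB, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# turn the relative flow into absolute sampling maps and warp B back onto A
h, w = flow.shape[:2]
grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
map_x = (grid_x + flow[..., 0]).astype(np.float32)
map_y = (grid_y + flow[..., 1]).astype(np.float32)
remapped = cv2.remap(matGrayB, map_x, map_y, cv2.INTER_LINEAR)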

So I wonder about the dense flow:

calcOpticalFlowFarneback(matGrayAdjusted, matGrayB, flow, 0.5, 3, 15, 3, 5, 1.2, 0);

Can this be significantly improved via the parameters?
Or are there better methods for dense flow?

As I’m dealing not with video frames but with single pairs of images,
runtime doesn’t matter so much.
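
Since runtime is not critical here, one alternative available directly in OpenCV is DIS optical flow, a more recent dense method than Farneback. A minimal sketch, again assuming matGrayAdjusted and matGrayB are the grayscale inputs:

import cv2

# DIS ("Dense Inverse Search") optical flow; the MEDIUM preset trades
# speed for quality, which is acceptable for single image pairs
dis = cv2.DISOpticalFlow_create(cv2.DISOPTICAL_FLOW_PRESET_MEDIUM)
flow = dis.calc(matGrayAdjusted, matGrayB, None)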