Calculation of IoU between 2 frames when the shot height changes

Hi (continuation of a previous question here). I want to extract images from a drone video, keeping only frames that differ meaningfully from each other, so I have a function that calculates the IoU between 2 images. When the drone does not change height between the 2 images, I get a result that seems reasonable, but when the drone rises/falls between the 2 images, I get an illogical result because the scale changes (for example, I get an IoU above 90% even though the images are really different, just because the shooting height changed). What do I need to change/add to handle these cases, or do I need a different method for this calculation altogether?
Thanks for the help!

(the previous question - Measuring the similarity percentage between two images)

image example -


my code -

import logging

import cv2
import numpy as np

logger = logging.getLogger(__name__)


def calculate_iou(image1, image2, ratio_threshold=0.75, min_good_matches=10, min_inliers=10):
    # detect SIFT features in both frames
    gray1 = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)
    gray2 = cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(gray1, None)
    kp2, des2 = sift.detectAndCompute(gray2, None)
    # match descriptors and keep matches that pass Lowe's ratio test
    bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=False)
    matches = bf.knnMatch(des1, des2, k=2)
    good_matches = []
    for m, n in matches:
        if m.distance < ratio_threshold * n.distance:
            good_matches.append(m)
    if len(good_matches) < min_good_matches:
        logger.info("Not enough good matches found. Returning IoU of 0.")
        return 0.0
    # estimate a homography that maps image2 coordinates into image1's frame
    src_pts = np.float32([kp1[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    M, mask = cv2.findHomography(dst_pts, src_pts, cv2.RANSAC, 5.0)
    if M is None:
        logger.info("Homography estimation failed. Returning IoU of 0.")
        return 0.0
    inliers = mask.ravel().sum()
    if inliers < min_inliers:
        logger.info("Not enough inliers found. Returning IoU of 0.")
        return 0.0
    # warp image2 onto image1's canvas and overlap the pixel data
    h, w, _ = image1.shape
    warped_image2 = cv2.warpPerspective(image2, M, (w, h))
    intersection = cv2.bitwise_and(image1, warped_image2)
    union = cv2.bitwise_or(image1, warped_image2)
    # need it to measure the percentage of difference between 2 images
    iou = (np.sum(intersection > 0) / np.sum(union > 0)) * 100
    return iou

those pictures, where they overlap, look the same. the only difference is the area of ground they cover.

an IoU should give you the fraction of coverage of the closer zoom over the wider zoom (when one footprint sits entirely inside the other, intersection over union reduces to the smaller area divided by the larger one). eyeballing this, it might be around 50% or less.

you should draw the image bounds of each image in the other’s view, assuming you have a homography. that’s a quad/4-polygon for each.
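a minimal sketch of that, assuming M maps image2 coordinates into image1's frame as in your calculate_iou (draw_projected_bounds is just an illustrative name):

import cv2
import numpy as np

def draw_projected_bounds(image1, image2, M):
    # project image2's corners into image1's frame with the homography
    h2, w2 = image2.shape[:2]
    corners = np.float32([[0, 0], [w2, 0], [w2, h2], [0, h2]]).reshape(-1, 1, 2)
    quad = cv2.perspectiveTransform(corners, M)
    # draw the resulting quad on a copy of image1
    vis = image1.copy()
    cv2.polylines(vis, [np.int32(quad)], isClosed=True, color=(0, 255, 0), thickness=3)
    return vis

if the quad spills past the canvas, that's the zoom showing itself.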

just… debug your code, step through it, look at the data you have at every step. visualize everything.

I think I understood what the problem is (I visualized the steps: there is a good match of features between the images, and then it goes wrong afterwards). But I still don't understand how to solve it. Could you elaborate on the solution you have in mind?
Do you have a good source I can learn from?
Thank you!

please provide the homography matrix you’re getting.

if the forum will let you, please also post a pair of pictures that are good for reproducing the issue. I am hesitant to base any investigation on the composite/screenshot in your first post.

thank you for your response!
on the images with the zoom between them (the pictures are above), this is the homography matrix -

[[ 1.59292577e+00  2.62401187e-02 -3.88218698e+02]
 [-1.10673356e-02  1.59030841e+00 -1.98918650e+02]
 [ 1.42779147e-05  1.75936786e-05  1.00000000e+00]]

and these are images where the IoU calculation comes out good (38%)

and this is their homography matrix -

[[ 9.28865579e-01 -4.65555462e-01  2.35448144e+02]
 [ 4.77404543e-01  9.19555927e-01 -7.01308649e+02]
 [ 1.92723649e-05  3.06340713e-05  1.00000000e+00]]

the matrices look plausible.

the first shows zoom and some translation.
the second shows some zoom, some rotation, some translation.
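for instance, pushing image2's corners through the first matrix shows that (the 1280x720 frame size here is just an assumption for illustration):

import cv2
import numpy as np

# first homography matrix from above
M = np.array([[ 1.59292577e+00,  2.62401187e-02, -3.88218698e+02],
              [-1.10673356e-02,  1.59030841e+00, -1.98918650e+02],
              [ 1.42779147e-05,  1.75936786e-05,  1.00000000e+00]])
w, h = 1280, 720  # assumed frame size, purely for illustration
corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
print(cv2.perspectiveTransform(corners, M).reshape(-1, 2))
# the ~1.59 scale puts the projected quad past a frame of this size on every side

with a frame around that size, the warped image2 would cover the entire canvas after cropping, which would explain an IoU near 100.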

you try to calculate the intersection and union as if the images were masks. at least two issues:

  • image1/image2 are image data, not masks. the bitwise ops react to pixel values, so dark pixels drop out of your "union" and bright ones count toward your "intersection", regardless of actual coverage.
  • these calculations don't consider the entire union area, only what fits inside image1's canvas. warpPerspective() crops the warped image2 to (w, h), so anything projecting outside image1's bounds is thrown away.

I would recommend calculating these areas from geometry, i.e. using the two quads (one projected/warped, one being just the rectangle covering the whole image). intersection and union will require algorithms that handle polygons. for the area, you could use contourArea() or an equivalent algorithm.
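a minimal sketch of that approach, assuming M maps image2 coordinates into image1's frame as in your code, and that the projected quad stays convex (intersectConvexConvex() needs convex inputs); geometric_iou is just an illustrative name:

import cv2
import numpy as np

def geometric_iou(image1, image2, M):
    # image1's footprint is simply its own full rectangle
    h1, w1 = image1.shape[:2]
    quad1 = np.float32([[0, 0], [w1, 0], [w1, h1], [0, h1]]).reshape(-1, 1, 2)
    # image2's footprint is its rectangle projected through the homography
    h2, w2 = image2.shape[:2]
    corners2 = np.float32([[0, 0], [w2, 0], [w2, h2], [0, h2]]).reshape(-1, 1, 2)
    quad2 = cv2.perspectiveTransform(corners2, M)
    # intersection area of the two convex quads; nothing gets cropped away
    inter_area, _ = cv2.intersectConvexConvex(quad1, quad2)
    union_area = cv2.contourArea(quad1) + cv2.contourArea(quad2) - inter_area
    return (inter_area / union_area) * 100 if union_area > 0 else 0.0

you'd call this right after findHomography() in place of the warp-and-bitwise step.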