Matching perspective of a camera snapshot and a warped image

I’m trying to implement a nodal offset calibration algorithm.

I have the following setup:

  • Screen displaying a randomly moving and rotating checkerboard
  • Webcam taking snapshots of the screen
  • A ‘virtual camera’ taking snapshots of the same image (which is really just the warpPerspective function)

I have calibrated the webcam lens and I’m using its matrix for the virtual camera. I also know the distance between the webcam and the screen in millimeters. The distance is converted into pixels using the size of my screen and used to translate the image away from the virtual camera.

Here is the code:

def proj_matrix_3d(width, height):
    """2D -> 3D projection matrix: maps pixel (x, y) to (x - w/2, y - h/2, 0)."""
    return np.array([[1, 0, -width / 2],
                     [0, 1, -height / 2],
                     [0, 0, 0],
                     [0, 0, 1]])

img = gen.next()

proj_3d = proj_matrix_3d(monitor.width, monitor.height)
mat = gen.camera_matrix @ translation_matrix_3d(0, 0, gen.z_dist) @ proj_3d

img2 = cv2.warpPerspective(img, mat, (monitor.width, monitor.height),
                           borderMode=cv2.BORDER_CONSTANT,
                           borderValue=(255, 0, 0, 255))

cv2.imshow('window', img2)
cv2.waitKey(250)

z_dist is calculated as follows:

pixels_per_mm = monitor.width / monitor.width_mm
self.z_dist = mm_from_camera * pixels_per_mm

where mm_from_camera is the value measured and entered by the user.
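With the concrete numbers from my reproducible example further down (1920 px screen that is 344 mm wide, camera 220 mm away), the conversion works out like this:

```python
# Example conversion: 1920 px wide screen, 344 mm physical width,
# camera measured 220 mm from the screen.
pixels_per_mm = 1920 / 344        # ~5.581 px per mm
z_dist = 220 * pixels_per_mm      # ~1227.9 px
print(round(z_dist, 1))           # 1227.9
```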

The distance and the camera matrix are the same in both cases. However, the virtual camera’s snapshot doesn’t fit into the screen (the original image before applying the 3D transformations does). I expect the perspective to align with that of the webcam snapshot.

I’m at a complete loss. It seems like it could be an issue with how I calculate the distance, but I don’t know what could be wrong here.
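To put numbers on "doesn’t fit": here’s a quick sanity check that maps the four screen corners through the same matrix chain, using the screen and calibration values from my reproducible example below. The corners land well outside the 1920×1080 frame, because the effective scale fx/dz ≈ 1666/1228 ≈ 1.36 is greater than 1:

```python
import numpy as np

# Screen and calibration values from the reproducible example.
SCREEN_W, SCREEN_H = 1920, 1080
SCREEN_W_MM = 344
CAMERA_DIST_MM = 220

camera_matrix = np.array([[1666.081, 0, 935.234],
                          [0, 1658.497, 611.434],
                          [0, 0, 1]])

# 2D -> 3D lift, z-translation, 3x4 intrinsic projection.
proj_3d = np.array([[1, 0, -SCREEN_W / 2],
                    [0, 1, -SCREEN_H / 2],
                    [0, 0, 0],
                    [0, 0, 1]])
dz = CAMERA_DIST_MM * (SCREEN_W / SCREEN_W_MM)   # ~1227.9 px
trans = np.eye(4)
trans[2, 3] = dz
K = np.hstack([camera_matrix, np.zeros((3, 1))])
mat = K @ trans @ proj_3d

# Map the four screen corners through the resulting homography.
corners = np.array([[0, 0, 1],
                    [SCREEN_W, 0, 1],
                    [0, SCREEN_H, 1],
                    [SCREEN_W, SCREEN_H, 1]], dtype=float).T
uvw = mat @ corners
uv = uvw[:2] / uvw[2]
print(uv.T)
# Top-left lands at roughly (-367, -118), bottom-right at roughly
# (2238, 1341) -- both outside the 1920x1080 screen.
```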

For reference, here is the exact same image as captured by the webcam and by the virtual camera:

Edit: I have since tried adding extrinsic parameters from camera calibration to the formula. Still doesn’t work properly.

crosspost:

I’m sorry about that. I’m very desperate for an answer and I can’t afford to wait a couple of days before posting elsewhere.

Anyway, I deleted my SO post and I’m going to update this one now.

My original post was a mess, so here’s a reproducible example.

Code:

import cv2
import numpy as np

IMAGE = cv2.imread('test_img.png')

# Distance from camera to screen in millimeters
CAMERA_DIST_MM = 220

SCREEN_W = 1920
SCREEN_H = 1080
SCREEN_W_MM = 344

CAMERA_MATRIX = np.array([
    [1.666081067492627653e+03, 0, 9.352342619726014163e+02],
    [0, 1.658496672263981964e+03, 6.114342743558016764e+02],
    [0, 0, 1]
])

OPTIMAL_CAMERA_MATRIX = np.array([
    [1.609521850585937500e+03, 0, 9.503673896186955972e+02],
    [0, 1.597573974609375000e+03, 5.920146129275963176e+02],
    [0, 0, 1]
])

DISTORTION = np.array([
    1.721865529753746626e-01, 
    -2.080112797862111673e+00,
    -2.175832103492915739e-02, 
    7.900438166936698051e-03, 
    3.798900597582000493e+00
])


# Physical camera
cam = cv2.VideoCapture(0)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, SCREEN_W)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, SCREEN_H)

cv2.namedWindow('window', cv2.WND_PROP_FULLSCREEN)
cv2.setWindowProperty('window', cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)
cv2.imshow('window', IMAGE)
cv2.waitKey(1000)

_, physical = cam.read()
cv2.destroyWindow('window')
cam.release()

physical = cv2.undistort(physical, CAMERA_MATRIX, DISTORTION, None, OPTIMAL_CAMERA_MATRIX)
cv2.imwrite('physical.png', physical)


# Virtual camera
proj_3d = np.array([
    [1, 0, -SCREEN_W / 2],
    [0, 1, -SCREEN_H / 2],
    [0, 0, 0],
    [0, 0, 1]
])

dz = CAMERA_DIST_MM * (SCREEN_W / SCREEN_W_MM)
trans = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, dz],
    [0, 0, 0, 1]
])

K = np.zeros((3, 4))
K[0:3, 0:3] = CAMERA_MATRIX

mat = K @ trans @ proj_3d

virtual = cv2.warpPerspective(IMAGE, mat, (SCREEN_W, SCREEN_H))
cv2.imwrite('virtual.png', virtual)

Steps to reproduce:

  1. Save this test image as test_img.png
  2. Edit the variables at the top to match the parameters of your screen and camera
  3. Position the camera so that its optical axis points at the approximate center of the screen
  4. Measure the distance between your camera and screen in millimeters and update the CAMERA_DIST_MM variable to match it
  5. Run the script

Here are the results I get:

Problem solved!

For future googlers: I found this library, which implements a correctly working virtual camera!

You can find an interactive demo in this github repo. It doesn’t contain the source code for the library itself, so if you want to see it you’ll need to install the package.