StereoCalibrate using known world coordinates

I am filming a table tennis table from two different angles. My end goal is to triangulate the 3D position of the ball relative to a coordinate system I define.

Because I know the ‘true’ world 3D coordinates of reference points on the table (corners, net posts), I have tried running stereoCalibrate with those known coordinates together with the corresponding 2D pixel coordinates of those points from both cameras. However, this approach gives me an extremely high RMSE of 272.87.

import numpy as np
import cv2 as cv

# 2D pixel coordinates of the reference points in the "black" camera view
imgpoints_black = np.array([(954,353),(1486,373),
                            (453,787),(1674,917),
                            (727,413),(730,478),(797,485),
                            (1628,450),(1542,525),(1622,526),
                            (1150,504)], dtype=np.float32)

# 2D pixel coordinates of the same reference points in the "blue" camera view
imgpoints_blue = np.array([(560,322),(1085,314),
                           (385,852),(1598,765),
                           (422,400),(430,473),(507,471),
                           (1313,376),(1246,450),(1308,446),
                           (888,460)], dtype=np.float32)

# Known world coordinates of the reference points (table corners, net posts, centre)
real_world_pts = np.array([(-76.25,137,0),(76.25,137,0),
                           (-76.25,-137,0),(76.25,-137,0),
                           (-91.5,0,15.25),(-91.5,0,0),(-76.25,0,0),
                           (91.5,0,15.25),(91.5,0,0),(76.25,0,0),
                           (0,0,0)], dtype=np.float32)

width = 1920
height = 1080

criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 100, 0.001)

def stereo_calibrate(mtx0, dist0, mtx1, dist1):
    # Default flags include CALIB_FIX_INTRINSIC, so only the extrinsics R, T (and E, F) are estimated
    ret, CM1, dist0, CM2, dist1, R, T, E, F = cv.stereoCalibrate(
        [real_world_pts], [imgpoints_black], [imgpoints_blue],
        mtx0, dist0, mtx1, dist1, (width, height), criteria=criteria)
    print('rmse: ', ret)
    return R, T

calib_black = np.load('calib_black.npz')
calib_blue = np.load('calib_blue.npz')
mtx0, dist0 = calib_black['mtx'], calib_black['dist']
mtx1, dist1 = calib_blue['mtx'], calib_blue['dist']

R, T = stereo_calibrate(mtx0=mtx0, dist0=dist0,
                        mtx1=mtx1, dist1=dist1)

np.savez('stereo_calib_R_T_world.npz', R=R, T=T)
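
For reference, the downstream triangulation I have in mind would look roughly like this (just a sketch; the ball pixel coordinates below are made-up placeholders, and since stereoCalibrate's R, T relate the two cameras to each other, the triangulated point comes out in the black camera's frame, so it would still need that camera's pose relative to the table, e.g. from cv.solvePnP on the known table points, to end up in my table coordinate system):

# Sketch: triangulating one ball observation with the calibrated pair.
# Projection matrices with the black camera as the reference frame.
P0 = mtx0 @ np.hstack([np.eye(3), np.zeros((3, 1))])
P1 = mtx1 @ np.hstack([R, T.reshape(3, 1)])

# Made-up ball detections in each view; undistort them before triangulating.
ball_black = cv.undistortPoints(np.array([[[1000., 500.]]], np.float32), mtx0, dist0, P=mtx0)
ball_blue = cv.undistortPoints(np.array([[[900., 480.]]], np.float32), mtx1, dist1, P=mtx1)

# triangulatePoints returns homogeneous coordinates; divide by the fourth component.
X_h = cv.triangulatePoints(P0, P1, ball_black.reshape(2, 1), ball_blue.reshape(2, 1))
ball_cam0 = (X_h[:3] / X_h[3]).ravel()   # 3D point in the black camera's frame
print('ball position (black camera frame):', ball_cam0)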

I am quite confident in the individual camera calibration parameters, as each camera's calibration had a very low RMSE.

I would appreciate any insight into why this approach might not be working, or what I can do to fix it.

for a good calibration, one picture having a handful of points is basically worthless.

Hello, thank you for the response!

To clarify, do you mean this doesn’t work because I am using a single image angle, or that it doesn’t work even with two image angles? If you meant the former, I do have two angles.

If it doesn’t work even with two angles, then I assume the only option is to purchase a relatively large grid that I can hold near the center of the table?

“two” is on the order of “one”.

and that’s still too few points, way too sparse.

the stereotypical “checkerboard” calibration pattern usually has close to 100 points. more are better, up to detection ability. plan on a dozen pictures of a calibration board, more if you aren’t practiced. that’s necessary but not sufficient. all the aspects discussed in the calib.io KB article still apply.
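
the usual per-camera loop looks roughly like this (a sketch only; the 9×6 inner-corner board, the 2.5 cm square size and the image folder are placeholders, adjust to your setup):

import glob
import numpy as np
import cv2 as cv

# 9x6 inner corners; square size in the same units you want to work in later
pattern_size = (9, 6)
square_size = 2.5
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2) * square_size

objpoints, imgpoints = [], []
for fname in glob.glob('calib_images_black/*.png'):   # hypothetical folder of board pictures
    gray = cv.cvtColor(cv.imread(fname), cv.COLOR_BGR2GRAY)
    found, corners = cv.findChessboardCorners(gray, pattern_size)
    if not found:
        continue
    corners = cv.cornerSubPix(gray, corners, (11, 11), (-1, -1),
                              (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001))
    objpoints.append(objp)
    imgpoints.append(corners)

rms, mtx, dist, rvecs, tvecs = cv.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
print('per-camera rms:', rms)
np.savez('calib_black.npz', mtx=mtx, dist=dist)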

either calibrate your camera(s) with a normal calibration pattern, or open up that can of worms labeled “autocalibration”. that basically estimates both the camera parameters and a 3D reconstruction of the environment. that is a lot more painful, takes a lot more data. you should want to stick to calibrating with a normal pattern.

no, you don’t need to wave around a table-sized calibration board. distance is irrelevant, as long as the camera’s focus stays fixed (adjusting focus usually changes the entire intrinsics matrix) and the pattern’s corner points are still reasonably in focus. if you focus on the table, but take the calibration pattern closer, the pattern will be out of focus, but the question is how much blur does that introduce. the tinier the camera/lens, the less you have to worry about.

I’d recommend patterns that need not be fully in view. those types let you get sample points right in the corners of the picture. that’s important. the “good old” checkerboard usually isn’t paired with an algorithm that can recover the partial board. people then thought, instead of coming up with such an algorithm, let’s be stupid and invent the “charuco” board. it works, it looks important, but it’s ridiculous. feel free to use it.
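
if you do use ChArUco, the loop looks roughly like this with the newer cv.aruco API (OpenCV 4.7+; the board geometry, dictionary and folder below are placeholders). partial views are fine because only the checker corners actually detected get matched:

import glob
import numpy as np
import cv2 as cv

# Hypothetical board: 7x5 squares, 3 cm squares with 2.2 cm markers
dictionary = cv.aruco.getPredefinedDictionary(cv.aruco.DICT_5X5_100)
board = cv.aruco.CharucoBoard((7, 5), 0.03, 0.022, dictionary)
detector = cv.aruco.CharucoDetector(board)

objpoints, imgpoints = [], []
for fname in glob.glob('charuco_black/*.png'):   # hypothetical folder of board pictures
    gray = cv.cvtColor(cv.imread(fname), cv.COLOR_BGR2GRAY)
    ch_corners, ch_ids, _, _ = detector.detectBoard(gray)
    if ch_ids is None or len(ch_ids) < 4:
        continue
    # Only the corners seen in this (possibly partial) view are kept
    obj_pts, img_pts = board.matchImagePoints(ch_corners, ch_ids)
    objpoints.append(obj_pts)
    imgpoints.append(img_pts)

rms, mtx, dist, rvecs, tvecs = cv.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
print('rms:', rms)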