Camera Intrinsic Calibration

I am trying to calibrate a camera (ZED 2) using OpenCV on Python 3.10. I am using a ChArUco calibration pattern and calibrateCameraCharucoExtended(), following the OpenCV tutorial:
https://docs.opencv.org/4.x/da/d13/tutorial_aruco_calibration.html

The calibration seems to do a good job of minimizing the reprojection error (error < 0.4), and visualizing the marker detection, the corner detection, and the board pose (using drawFrameAxes()) all looks good. However, the Z value from estimatePoseCharucoBoard() is way off (1.4 m instead of the ground truth of 0.65 m).

I have already checked:

  • the size of the calibration grid is correct
  • the pixel-to-distance ratio seems right in the plane of the calibration grid; the X and Y values of the corners are accurate across images, even with ~30° of tilt
  • distortion is very low (at least visibly), so the distortion parameters didn’t have a large influence on the result
  • using the factory-provided calibration parameters (from ZED) gives an accurate Z distance to the board, but the reprojections of the board pose with that calibration matrix are quite far from the true positions in the image. The factory calibration matrix differs mainly in fx and fy from the OpenCV result (fx, fy ≈ 1000 instead of ~2000)
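
One further sanity check: estimating roughly what fx should be from the capture geometry and comparing it to the calibrated value. A minimal sketch (hfov_deg is a placeholder, not an actual ZED 2 spec-sheet value; the width is inferred from cx ≈ 1100 in the results below):

import math

# Rough expectation for fx (in pixels) from the horizontal field of view.
# NOTE: hfov_deg is a placeholder, not an actual ZED 2 spec.
image_width_px = 2208    # inferred from cx ~ 1100 in the results below
hfov_deg = 102.0         # placeholder horizontal FOV
fx_expected = (image_width_px / 2.0) / math.tan(math.radians(hfov_deg / 2.0))
print(f"expected fx ~ {fx_expected:.0f} px")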

I am really stumped. Any help to further diagnose the issue is appreciated!

The things that come to mind:

  • Are you calibrating in the same resolution mode as when you are using the camera? If you calibrate at a lower resolution, your focal length will be shorter (and your image center will be off). See the snippet after this list for a quick check.

  • How many images are you using?

  • When you say the size of the calibration grid is correct, do you mean the number of columns/rows is right, or that the printed size is in agreement with the square / marker lengths used when generating the Charuco object?

  • Post your code and images, your calibration matrix and distortion results, as well as the factory calibration matrix.
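
For the resolution point, something quick like this would do (file names are placeholders):

import cv2

# Quick check that every calibration image shares one resolution
# (file names here are placeholders).
for fname in ["calib_01.png", "calib_02.png", "calib_03.png"]:
    img = cv2.imread(fname)
    h, w = img.shape[:2]
    print(fname, w, "x", h)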

@Steve_in_Denver

  1. Yes, I’ve checked that the camera resolution is the same in the images used for calibration and detection
  2. ~20 calibration images
  3. The number of rows × cols, as well as the physical measurements of the grid and marker size, are consistent with what I use in the code below
  4. Code and images are below

import cv2
import numpy as np
import matplotlib.pyplot as plt


def calibrate_camera_intrinsic(image_filenames):
    aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
    grid_size = 0.123333
    board = cv2.aruco.CharucoBoard((6, 4), grid_size, grid_size / 2.0,
                                   aruco_dict)

    allCorners = []
    allIds = []
    decimator = 0

    # sub pixel corner improvement criteria
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.00001)

    for img_file in image_filenames:
        print("=> Processing image {0}".format(img_file))
        img_bgr = cv2.imread(img_file)
        img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
        corners, ids, rejectedImgPoints = cv2.aruco.detectMarkers(img_gray, aruco_dict)

        if len(corners) > 0:
            # SUB PIXEL DETECTION
            for corner in corners:
                cv2.cornerSubPix(img_gray, corner,
                                 winSize=(3, 3),
                                 zeroZone=(-1, -1),
                                 criteria=criteria)
            retVal, chCorners, chIds = cv2.aruco.interpolateCornersCharuco(corners, ids, img_gray, board)
            # decimator % 1 == 0 keeps every frame; raise the modulus to subsample
            if chCorners is not None and chIds is not None and len(chCorners) > 3 and decimator % 1 == 0:
                allCorners.append(chCorners)
                allIds.append(chIds)
        decimator += 1

    imsize = img_gray.shape[::-1]  # (width, height), as expected by imageSize

    distCoeffsInit = np.zeros((8, 1))
    cameraMatrixInit = np.array([[773.029, 0.0, 1.09045081e+03],
                                 [0.0, 773.029, 6.18946622e+02],
                                 [0.0, 0.0, 1.0]])

    flags = (cv2.CALIB_USE_INTRINSIC_GUESS + cv2.CALIB_RATIONAL_MODEL)
    # flags = (cv2.CALIB_USE_INTRINSIC_GUESS + cv2.CALIB_FIX_FOCAL_LENGTH)
    # flags = (cv2.CALIB_USE_INTRINSIC_GUESS + cv2.CALIB_FIX_K1 + cv2.CALIB_FIX_K2 + cv2.CALIB_FIX_K3 + cv2.CALIB_FIX_K4+ cv2.CALIB_FIX_K5 + cv2.CALIB_FIX_K6 + cv2.CALIB_FIX_S1_S2_S3_S4 + cv2.CALIB_FIX_TANGENT_DIST + cv2.CALIB_FIX_TAUX_TAUY )
    # flags = cv2.CALIB_USE_INTRINSIC_GUESS
    # flags = (cv2.CALIB_RATIONAL_MODEL)

    (ret, camera_matrix, distortion_coefficients,
     rotation_vectors, translation_vectors,
     stdDeviationsIntrinsics, stdDeviationsExtrinsics,
     perViewErrors) = cv2.aruco.calibrateCameraCharucoExtended(
        charucoCorners=allCorners,
        charucoIds=allIds,
        board=board,
        imageSize=imsize,
        cameraMatrix=cameraMatrixInit,
        distCoeffs=distCoeffsInit,
        flags=flags,
        criteria=(cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_COUNT, 10000, 1e-9))

    print('ret:', ret)
    print('camera_matrix:', camera_matrix)
    print('distortion_coefficients:', distortion_coefficients)
    print('rotation_vectors:', rotation_vectors)
    print('translation_vectors:', translation_vectors)

    return True, camera_matrix, distortion_coefficients

def detect_calib_board(image_rgb, camera_matrix, dist_coeffs):

    aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
    grid_size = 0.123333
    board = cv2.aruco.CharucoBoard((6, 4), grid_size, grid_size / 2.0,
                                   aruco_dict)
    output_img_rgb = image_rgb.copy()
    image_gray = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2GRAY)
    corners, ids, rejected_img_points = cv2.aruco.detectMarkers(image_gray, aruco_dict)

    if len(corners) != 0:  # some markers were detected
        output_img_rgb = cv2.aruco.drawDetectedMarkers(output_img_rgb, corners, ids)

        ret_val, allCorners, allIds = cv2.aruco.interpolateCornersCharuco(corners, ids, image_gray, board)
        output_img_rgb = cv2.aruco.drawDetectedCornersCharuco(output_img_rgb, allCorners, allIds)

        board_rvec_rdg = np.zeros((1, 3))
        board_tvec = np.zeros((1, 3))
        ret_val, board_rvec_rdg, board_tvec = cv2.aruco.estimatePoseCharucoBoard(
            allCorners, allIds, board, camera_matrix, dist_coeffs,
            board_rvec_rdg, board_tvec, useExtrinsicGuess=False)
        if ret_val:  # board's pose could be estimated
            print('Board rvec: ')
            print(board_rvec_rdg)
            print('Board tvec: ')
            print(board_tvec)

            length_of_axis = 0.015
            output_img_rgb = cv2.drawFrameAxes(output_img_rgb, camera_matrix, dist_coeffs, board_rvec_rdg, board_tvec,
                                               length_of_axis)
            plt.imshow(output_img_rgb)
            plt.show()
            return True, board_rvec_rdg, board_tvec, output_img_rgb
        else:
            print('Error: Could not identify the pose of the calibration board in the image!')
            return False, None, None, output_img_rgb
    else:
        print('Error: Could not identify the calibration board in the image!')
        return False, None, None, output_img_rgb
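
For completeness, this is roughly how I drive the two functions (paths are placeholders):

import glob
import cv2

# Hypothetical driver; the paths are placeholders.
image_files = sorted(glob.glob("calib_images/*.png"))
ok, K, dist = calibrate_camera_intrinsic(image_files)

test_rgb = cv2.cvtColor(cv2.imread("test_view.png"), cv2.COLOR_BGR2RGB)
found, rvec, tvec, vis = detect_calib_board(test_rgb, K, dist)
if found:
    print("distance to board (m):", tvec[2, 0])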


Result:

error: 0.29379795366788053
camera_matrix: [[1.97924630e+03 0.00000000e+00 1.12217025e+03]
 [0.00000000e+00 1.97516547e+03 6.19508215e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
distortion_coefficients: [[-9.91883164e+00]
 [-2.51426039e+01]
 [-3.51000391e-04]
 [ 6.58633780e-04]
 [-7.93967022e+00]
 [-9.84978313e+00]
 [-2.69354144e+01]
 [ 5.34058048e+00]
 [ 0.00000000e+00]
 [ 0.00000000e+00]
 [ 0.00000000e+00]
 [ 0.00000000e+00]
 [ 0.00000000e+00]
 [ 0.00000000e+00]]

Board detection:
(cropped image)

Board rvec: 
[[-0.00455889]
 [-0.00658869]
 [ 1.56128103]]
Board tvec: 
[[ 0.18143146]
 [-0.32179749]
 [ 1.41092617]]

Expected distance to board ~0.65m

I only see one image, but if the others look similar then I suspect your problem is that you don’t have any tilt in the calibration target images. You need to have images of the board taken from a variety of angles, not just rotations in the plane.

If all of your images are taken with a “straight-on” view of the calibration target, the algorithm isn’t able to differentiate between Z translation and focal length changes. If you think about it, you could take a picture of a flat object from far away with a zoom lens, and a picture of the same flat object from up close with a wide-angle lens, and they would look the same. Only when there are depth changes in the object can you differentiate the two.

That’s my best guess at what is going on, so if that’s the case (all images taken with a straight-on view), try tilting the calibration target at least 20 degrees out of plane and taking pictures from that angle (or a variety of angles) - I suspect your results will start to make a lot more sense.
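
You can convince yourself of this ambiguity with a quick synthetic experiment (all numbers below are made up):

import numpy as np
import cv2

# Synthetic fronto-parallel planar target; all numbers are made up.
obj = np.array([[x, y, 0.0]
                for x in np.linspace(-0.3, 0.3, 5)
                for y in np.linspace(-0.2, 0.2, 4)], dtype=np.float64)

def project(f, z):
    K = np.array([[f, 0.0, 640.0],
                  [0.0, f, 360.0],
                  [0.0, 0.0, 1.0]])
    rvec = np.zeros(3)               # board parallel to the image plane
    tvec = np.array([0.0, 0.0, z])
    pts, _ = cv2.projectPoints(obj, rvec, tvec, K, None)
    return pts.reshape(-1, 2)

# Doubling both focal length and distance yields identical pixels,
# so calibration cannot separate the two from straight-on views.
a = project(1000.0, 0.65)
b = project(2000.0, 1.30)
print(np.abs(a - b).max())  # ~0.0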

-Steve

@Steve_in_Denver I do have images with various tilt angles; I’ve included one more view (it seems that, as a new user, I am limited to one media item per post).

The marker detection and the board pose drawn on these images are quite consistent across the images tested

I looked over the code and didn’t see anything obvious, and if you are including images with tilt, then my hypothesis about the perspective ambiguity is wrong. (I would make sure you have tilt about both axes, not just the one axis pictured, but I’m not sure that’s the problem at this point.)
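
If you want to quantify the tilt in each view, the per-view rvecs returned by calibrateCameraCharucoExtended can be checked along these lines (a sketch, assuming rotation_vectors is the list from your calibration call):

import numpy as np
import cv2

# Sketch: per-view out-of-plane tilt of the board, from the rvecs
# returned by cv2.aruco.calibrateCameraCharucoExtended (assumed in scope).
for i, rvec in enumerate(rotation_vectors):
    R, _ = cv2.Rodrigues(rvec)
    normal_cam = R @ np.array([0.0, 0.0, 1.0])  # board normal in camera frame
    tilt_deg = np.degrees(np.arccos(abs(normal_cam[2])))
    print(f"view {i}: out-of-plane tilt ~ {tilt_deg:.1f} deg")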

For fun I divided your calibrated focal length by the provided focal length and got 2.56. This is close enough to 2.54 that I have to wonder if there is an inch/cm error somewhere. What are the units of your ChArUco board’s square length (0.123333)? Are they meters?
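
For reference, the arithmetic (assuming the “provided” focal length is the 773.029 used as the initial guess in your code):

ratio = 1979.2463 / 773.029   # calibrated fx / initial-guess fx
print(round(ratio, 2))        # 2.56; 1 inch = 2.54 cm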