Overview
I have been working on a project that involves estimating the relative pose between multiple ArUco markers using OpenCV. I’ve implemented two different methods for computing the transformation matrices between the markers, but I’m observing significantly different results from each method. I would like to understand which method is theoretically correct and why these differences arise.
Method 1: Inversion Method
In the first method, I compute the relative transformation by inverting the 4x4 pose matrix of the first marker and left-multiplying the second marker's pose matrix by it, i.e. inv(T1) @ T2.
import logging

import numpy as np


def compute_marker_transformation(marker1_poses, marker2_poses):
    # Both inputs are expected to be 4x4 homogeneous pose matrices.
    if marker1_poses is None or marker2_poses is None:
        logging.error("Marker poses not available for transformation computation.")
        return None
    try:
        # Inverse of marker 1's pose matrix
        T_marker1_cam = np.linalg.inv(marker1_poses)
        T_cam_marker2 = marker2_poses
        # Relative transform: inv(T1) @ T2
        T_marker1_marker2 = T_marker1_cam @ T_cam_marker2
        logging.info("Computed transformation T_marker1_marker2.")
        logging.debug(f"Transformation Matrix:\n{T_marker1_marker2}")
        return T_marker1_marker2
    except np.linalg.LinAlgError as e:
        logging.error(f"Matrix inversion failed: {str(e)}")
        return None
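For reference, Method 1 can be written out in block form. This is a sketch under the assumption that each pose is a 4x4 homogeneous matrix T_i built from a rotation R_i and translation t_i that map points from marker i's frame into the camera frame (the usual meaning of OpenCV's rvec/tvec output, but an assumption here because the construction of marker1_poses and marker2_poses is not shown). The symbols R_i and t_i match the names used in Method 2.

```latex
T_i = \begin{bmatrix} R_i & t_i \\ 0 & 1 \end{bmatrix},
\qquad
T_1^{-1} = \begin{bmatrix} R_1^{\top} & -R_1^{\top} t_1 \\ 0 & 1 \end{bmatrix},
\qquad
T_1^{-1} T_2 = \begin{bmatrix} R_1^{\top} R_2 & R_1^{\top} (t_2 - t_1) \\ 0 & 1 \end{bmatrix}
```

Under that assumption, the rotation block of Method 1 is R_1^T R_2 and the translation block is R_1^T (t_2 - t_1).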
Method 2: Explicit Rotation and Translation Computation
In the second method, I decompose each pose matrix into its rotation and translation components and then compute the relative rotation as R_rel = R2 @ R1.T and the relative translation as t_rel = t2 - R_rel @ t1.
def compute_marker_transformation(marker1_poses, marker2_poses):
    if marker1_poses is None or marker2_poses is None:
        logging.error("Marker poses not available for transformation computation.")
        return None
    try:
        R1 = marker1_poses[0:3, 0:3]  # First marker's rotation matrix
        t1 = marker1_poses[0:3, 3]  # First marker's translation vector
        R2 = marker2_poses[0:3, 0:3]  # Second marker's rotation matrix
        t2 = marker2_poses[0:3, 3]  # Second marker's translation vector
        R_rel = R2 @ R1.T
        t_rel = t2 - R_rel @ t1
        T_rel = np.eye(4)
        T_rel[0:3, 0:3] = R_rel
        T_rel[0:3, 3] = t_rel
        logging.info("Computed relative transformation matrix.")
        logging.debug(f"Transformation Matrix:\n{T_rel}")
        return T_rel
    except np.linalg.LinAlgError as e:
        logging.error(f"Matrix operation failed: {str(e)}")
        return None
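As a sanity check on what Method 2 computes, the sketch below rebuilds its formulas on two synthetic poses and compares the result with both multiplication orders of the full 4x4 matrices. The helpers rot_z, rot_x, make_pose and method2, and the pose values themselves, are hypothetical and chosen only for illustration; they are not the real marker data.

```python
import numpy as np


def rot_z(theta):
    """Rotation about the z axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_x(theta):
    """Rotation about the x axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def make_pose(R, t):
    """Assemble a 4x4 homogeneous matrix from a rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def method2(T1, T2):
    """Same formulas as Method 2: R_rel = R2 R1^T, t_rel = t2 - R_rel t1."""
    R1, t1 = T1[:3, :3], T1[:3, 3]
    R2, t2 = T2[:3, :3], T2[:3, 3]
    R_rel = R2 @ R1.T
    t_rel = t2 - R_rel @ t1
    return make_pose(R_rel, t_rel)


# Synthetic poses (made-up values, not measurements)
T1 = make_pose(rot_z(0.3) @ rot_x(-0.2), [0.5, -0.1, 2.0])
T2 = make_pose(rot_x(0.7) @ rot_z(1.1), [-0.4, 0.2, 1.5])

T_m2 = method2(T1, T2)
right = T2 @ np.linalg.inv(T1)  # T2 * T1^-1
left = np.linalg.inv(T1) @ T2   # T1^-1 * T2, i.e. Method 1

matches_right = np.allclose(T_m2, right)
matches_left = np.allclose(T_m2, left)
print("Method 2 == T2 @ inv(T1):", matches_right)
print("Method 2 == inv(T1) @ T2:", matches_left)
```

On these synthetic poses, Method 2's formulas reproduce T2 @ inv(T1) rather than Method 1's inv(T1) @ T2, so the two methods differ in multiplication order and generally do not agree, since matrix multiplication is not commutative.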
Results
Here are the results I obtained from both methods:
Method 1 Results:
- Transformation of Marker 5 from Marker 2:
[[-0.0922232   0.04162427  0.99486798 -1.54370408]
 [ 0.00633191  0.99913021 -0.04121563  0.00485165]
 [-0.99571823  0.00249838 -0.09240655  0.54259928]
 [ 0.          0.          0.          1.        ]]
- Transformation of Marker 4 from Marker 2:
[[ 0.99783569 -0.01568401 -0.06385888  2.45007901]
 [ 0.01597381  0.99986429  0.00403008 -0.01533664]
 [ 0.06378701 -0.00504143  0.9979508   0.62770339]
 [ 0.          0.          0.          1.        ]]
Method 2 Results:
- Transformation of Marker 5 from Marker 2:
[[-0.09250655  0.00417393 -0.99570333  0.78924233]
 [ 0.02842489  0.99959473  0.00154941  0.03508036]
 [ 0.99530627 -0.02815943 -0.09258771  0.67293481]
 [ 0.          0.          0.          1.        ]]
- Transformation of Marker 4 from Marker 2:
[[ 9.97826842e-01  1.38630733e-02  6.44159049e-02  2.41009356e+00]
 [-1.39729729e-02  9.99901585e-01  1.25587701e-03  2.93172165e-02]
 [-6.43921551e-02 -2.15322948e-03  9.97922349e-01 -1.45258188e-01]
 [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]
Questions
- Which method is theoretically correct for computing the relative pose between two ArUco markers?
- Why do these two methods produce significantly different results?
- Is there a preferred method for ensuring numerical stability and accuracy in pose estimation using OpenCV?
Any insights or references to relevant documentation would be greatly appreciated. Thank you!