Transformation from Cam1 to Cam2 doesn't describe the real world

I have a stereo pair of cameras: camera 1 and camera 2. Camera 2 is to the right of camera 1. I want to calibrate them to get the relative position from cam1 to cam2 using the stereoCalibrate() function. For this I use a checkerboard (7,11) with a side length of 5 mm. The corners are found with findChessboardCornersSB() (OpenCV: Camera Calibration and 3D Reconstruction) in the images taken by cam1 and cam2. I use the same pictures to calibrate cam1 and cam2 separately with calibrateCamera() to get the intrinsic camera parameters and distortion coefficients.
After calibrating the cameras individually, I calibrate the stereo pair. I don't mix up images taken by cam1 with images taken by cam2. The result of the calibration is the following:

  • reprojection error: 1.19
  • R: array([[ 9.99977975e-01, -2.40794135e-04, -6.63262402e-03],
    [ 3.67111980e-04, 9.99818459e-01, 1.90502994e-02],
    [ 6.62683273e-03, -1.90523147e-02, 9.99796526e-01]])
  • T: array([[-65.23586202],
    [ 0.84279578],
    [ 4.82012402]])

The question is, why is T[0] negative? It doesn't reflect the real world. Cam2 is to the right of cam1, so T[0] should at least be positive. +6.5 cm would be plausible.

Thanks in advance!

what if you’ve got the transform from cam2 to cam1?

the biggest problem in engineering is naming things.

academics suck at this. especially the mathy ones suck at it. they think all things have to be single letters.

when a library heavily draws upon academia, the suck is imported into the library.

and then, everyone using the library adopts the suck.


But the description of the function stereoCalibrate() (OpenCV: Camera Calibration and 3D Reconstruction) says:
“R: […] In more technical terms, the tuple of R and T performs a change of basis from the first camera’s coordinate system to the second camera’s coordinate system […]” and my assumption is that cam1 (the left one) is the first camera and cam2 (the right one) is the second camera. I'm passing the corners found in image1 as imagePoints1 and the corners found in image2 as imagePoints2, so there is no mismatch.

I agree with your understanding of the documentation.

Possibilities:

  • docs disagree with implementation
  • your left/right pictures are swapped/mislabeled
  • your code swaps the left/right pictures somewhere

I’d like to see an image pair, with each “eye” labeled clearly.

why that? if you look from the right cam to the left, it’s along negative X from there, no?

it’s all in ‘pixels’, not cm, so far.

image
If you look at this image, x points to the right, so it should be positive.

It's a physical length, not pixels. As the docs say, you're transforming 3D points from cam1 to cam2, and those are obviously not in pixels.

Left:

Right:

No mismatch on my side or in the source.

this is stereo calibration. the translation is in 3D space, not in screen space. units of the translation vector are whatever units the calibration target is specified in.
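to make the units point concrete, here is a minimal sketch of how the object points for such a board are commonly built. it assumes that (7,11) means (columns, rows) of detected corners and that the 5 mm side length is entered in millimetres; the variable names are made up for illustration. whatever number goes into square_size fixes the unit of T.

```python
import numpy as np

pattern_size = (7, 11)  # from the question; assumed to be (columns, rows) of corners
square_size = 5.0       # side length in mm, so T will come out in mm as well

cols, rows = pattern_size
# One 3D point per corner, lying in the board plane (Z = 0), x varying fastest.
objp = np.zeros((rows * cols, 3), np.float32)
objp[:, :2] = np.mgrid[0:cols, 0:rows].T.reshape(-1, 2) * square_size

print(objp[:3])
```

with the board given in mm, a T[0] of about -65 corresponds to a baseline of roughly 6.5 cm.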

the translation from the right frame to the left frame should add.

hold on…

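you quoted the docs:

“the tuple of R and T performs a change of basis from the first camera’s coordinate system to the second camera’s coordinate system”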

that means the translation SHOULD be negative.

something straight ahead in the first/left camera, should appear to the left in the second/right view. that’s a negative X.

and that agrees with the T you showed.
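a quick numeric check with the R and T from the first post, as a sketch that assumes the documented convention X2 = R @ X1 + T (cam1 coordinates in, cam2 coordinates out). the test point 1000 units ahead is arbitrary. if that convention holds, the position of cam2 expressed in cam1's frame is -R.T @ T, not T itself.

```python
import numpy as np

# R and T exactly as posted in the question.
R = np.array([[9.99977975e-01, -2.40794135e-04, -6.63262402e-03],
              [3.67111980e-04,  9.99818459e-01,  1.90502994e-02],
              [6.62683273e-03, -1.90523147e-02,  9.99796526e-01]])
T = np.array([[-65.23586202], [0.84279578], [4.82012402]])

# cam2's optical centre is the point with X2 = 0, so solving 0 = R @ C + T
# gives its position in cam1 coordinates: C = -R.T @ T.
cam2_in_cam1 = -R.T @ T

# An arbitrary point straight ahead of cam1, expressed in cam2 coordinates.
straight_ahead = np.array([[0.0], [0.0], [1000.0]])
in_cam2 = R @ straight_ahead + T

print("cam2 position in cam1 frame:", cam2_in_cam1.ravel())
print("point ahead of cam1, seen from cam2:", in_cam2.ravel())
```

the first vector has a positive x component of about 65, which is the "cam2 is to the right of cam1" that was expected, and the second has a negative x, which is the "appears to the left in the right view" part.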


That makes sense. Thanks!