Camera Calibration with Known Extrinsic Parameters

I know my extrinsic matrix, and I want to specify it manually rather than have the code try to estimate it from my checkerboard calibration. Is there a way I can supply these parameters so that the code only needs to figure out the intrinsic matrix? I ask because, from what I have experienced so far, even if I specify the extrinsic matrix, it will still generate one of its own in the quest to find the intrinsic matrix, and then use that one instead of the one I have specified. I've seen some answers on Stack Overflow, but they are quite old, so I wanted to ask if anybody here has any tips or advice. Thank you once again. The following is the information I've measured:

rotation matrix:
[0.949 0 0.313]
[0 1 0]
[-0.313 0 0.949]

translation vector:
[0]
[16]
[0]

extrinsic matrix:
[0.949 0 0.313 0]
[0 1 0 16]
[-0.313 0 0.949 0]

internal corners:
6 columns x 4 rows

length of square:
22 mm

3d world coordinates of cameras (in cm):
cam1 = (10, 20, 27)
cam2 = (10, 36, 27)
angle between cams = 0.32 radians

How do you know your extrinsic matrix? Specifically, how do you know your translation vector?

If you are getting it from some robotic system (a pan/tilt unit?) you might know the nominal values, but the actual physical values are probably a little different. The translation vector is especially suspect - how do you know where the nodal point of the camera is?

Honestly, you are probably better off letting the algorithm calibrate the extrinsics and then using those (they are probably more accurate than your nominal values).

It might be good to take a step back, though. The typical process (assuming a camera with locked optics - no zoom or focus adjustments) is to calibrate the intrinsics one time using multiple images of the checkerboard pattern from different angles, and then use those intrinsics for your future steps. It sounds like maybe you are trying to calibrate the intrinsics with one image of the checkerboard? That won't work.
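For reference, the intrinsics-only step usually looks something like the sketch below (untested; the folder path, image format, and corner-refinement settings are just illustrative placeholders - substitute your own):

    import glob
    import cv2
    import numpy as np

    pattern = (6, 4)     # inner corners (columns, rows)
    square = 22.0        # square size in mm

    # 3D positions of the board corners in the board's own frame (z = 0)
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

    obj_points, img_points = [], []
    for path in glob.glob("cam1/*.png"):   # hypothetical folder of board images
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            corners = cv2.cornerSubPix(
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001))
            obj_points.append(objp)
            img_points.append(corners)

    # rms is the reprojection error; K is the camera matrix, dist the distortion
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)
    print(rms, K)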

It might help to have more detail about the goal you are trying to accomplish.

Hi,

I’m attempting to perform stereo calibration (2 cameras). This is roughly my setup (attached).

We are performing bee tracking within a cuboid-like structure, which is why we have three dimensions specified.

I’ve deduced the coordinates for each camera to be:
camera1: (10, 20, 27 cm)
camera2: (10, 36, 27 cm)
And the angle between the two cameras is 18.26 degrees.

The translation vector just captures the shift between the two camera positions, and the rotation matrix is just the rotation about the y axis. This is probably not the perfect way to do this, but this is my first time performing such a task and I would love to get as much insight and advice as possible. To add on: is there any specific way I could manually and accurately calculate the extrinsic parameters? I’ve tried Anipose, DeepLabCut, and MATLAB, but I’m not satisfied with the calibration, and I thought that if I could manually specify the parameters it might improve the performance. Thank you.

There is a lot of ground to cover on this. I’ll see what I can do.

First of all, a disclaimer: I don’t have a lot of experience working with the OpenCV stereo calibration algorithms, so take everything I say with a grain of salt.

I’m going to keep the first pass brief, because I don’t know what you do/don’t know.

Intrinsics: The intrinsic parameters describe physical properties of a given camera + lens pairing and include the focal length and optical image center. These parameters describe how 3D points in the camera reference frame project to the image sensor. Note that the focal length and image center are both expressed in pixels. Intrinsics are assumed not to change, which means that if you have an adjustable zoom or focus you must lock it down (preferably physically) so that it doesn’t change after you calibrate the intrinsics. The camera intrinsics are often represented in matrix form, commonly called the camera matrix, and typically written as C or K.
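As a minimal sketch of what the camera matrix encodes (the numeric values below are made up, just to illustrate the projection):

    import numpy as np

    fx, fy = 800.0, 800.0    # focal length in pixels (illustrative values)
    cx, cy = 320.0, 240.0    # optical center in pixels (illustrative values)
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])

    X_cam = np.array([0.10, 0.02, 0.50])   # a point in the camera frame
    uvw = K @ X_cam
    u, v = uvw[:2] / uvw[2]                # pinhole projection: divide by depth
    print(u, v)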

Extrinsics: The extrinsic parameters include 3 rotation angles and 3 translation values. These parameters describe the position and orientation of a camera in some 3D reference frame. Note that the origin of the camera reference frame is the nodal point, which corresponds to the pinhole of the theoretical pinhole camera.
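In code, the extrinsics usually show up as a 3x4 matrix [R | t] that maps world-frame points into the camera frame. Here is a small sketch using your R and t (the world point below is an arbitrary made-up value, and this assumes the usual world-to-camera convention; keep your units consistent):

    import numpy as np

    R = np.array([[ 0.949, 0.0, 0.313],
                  [ 0.0,   1.0, 0.0  ],
                  [-0.313, 0.0, 0.949]])
    t = np.array([0.0, 16.0, 0.0])

    E = np.hstack([R, t.reshape(3, 1)])         # 3x4 extrinsic matrix [R | t]

    X_world = np.array([5.0, 10.0, 12.0, 1.0])  # homogeneous world point (made up)
    X_cam = E @ X_world                         # same point in the camera frame
    print(X_cam)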

While it is possible to estimate the values for both intrinsic and extrinsic parameters (either from measurement or provided specifications), calibration will almost always give better results.

If I were doing this I’d approach it something like this:

  1. Calibrate intrinsics for each camera separately.
  2. Calibrate the stereo configuration using the intrinsics you calibrated (note that the flags parameter defaults to CALIB_FIX_INTRINSIC in the stereoCalibrate() call - this causes it to use the intrinsics you pass in and will not re-compute them; see the sketch after this list.)
  3. Do whatever validation you need to do so that you can trust the R and T results you get.
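A minimal sketch of step 2 (untested, just to show the shape of the call): the per-image corner lists and the intrinsics from step 1 go in, and with the default CALIB_FIX_INTRINSIC flag only the stereo R and T (plus E and F) come out.

    import cv2

    def calibrate_stereo(obj_points, img_points1, img_points2,
                         K1, dist1, K2, dist2, image_size):
        """obj_points / img_points1 / img_points2 are per-image checkerboard
        corner lists from image pairs where both cameras saw the board;
        image_size is (width, height) in pixels."""
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-6)
        rms, K1, dist1, K2, dist2, R, T, E, F = cv2.stereoCalibrate(
            obj_points, img_points1, img_points2,
            K1, dist1, K2, dist2, image_size,
            criteria=criteria, flags=cv2.CALIB_FIX_INTRINSIC)
        return rms, R, T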

Note that the R and T parameters describe a transformation from the camera 1 coordinate frame to the camera 2 coordinate frame. So you might be looking for your translation vector [0,16,0] in the T, but you won’t find it written that way. Assuming your translation measurement is accurate, you would expect the length of the translation vector to be the same as the length of your vector (16), but it will be transformed depending on what R is, so you will get something different (not just a translation along a single axis.) And what you actually get depends on how you chose your coordinate system. (Does your X,Y,Z agree with OpenCV? Is your rotation around the correct axis?)
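As a quick numeric sanity check (a sketch assuming OpenCV's convention X2 = R @ X1 + T, with your measured values plugged in):

    import numpy as np

    R = np.array([[ 0.949, 0.0, 0.313],
                  [ 0.0,   1.0, 0.0  ],
                  [-0.313, 0.0, 0.949]])
    c = np.array([0.0, 16.0, 0.0])   # camera 2 position expressed in camera 1's frame

    T = -R @ c                       # what a stereoCalibrate-style T would look like
    print(T)                         # components depend on R and your axis conventions
    print(np.linalg.norm(T))         # but the length should still come out as 16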

I prefer doing the camera calibration offline / as a separate step for a few reasons. It’s a more complex problem to solve, so doing it separately lets you focus on getting high quality results without the constraints you would face doing it in situ. It also probably makes the stereo (extrinsics) part easier, because you don’t require a large number of input images with different calibration target positions/orientations (which you would need to get good intrinsics).

This is all assuming you want to use OpenCV as the framework. If so, it’s best to buy in to the framework and let OpenCV do the work for you. Trying to estimate / measure things yourself is fraught - if for no other reason than that you have to make sure you are representing the results in the same way that OpenCV would. (And really, the calibration algorithms are going to give you far more accurate results, I promise.)