In the practical application of OpenCV camera calibration, I have a stereo camera system with two lenses embedded in a single plane. I have already calibrated both the left and right lenses separately, obtaining their intrinsic parameters and distortion coefficients. Now, I have established a field-of-view (FoV) coordinate system on the “plane of the lenses.” By applying a certain rigid transformation, I can determine the position of a point in the physical world within this FoV coordinate system. My question is whether this FoV coordinate system can be considered equivalent to the camera coordinate system in OpenCV, or if there is any inherent offset. My goal is to convert this point from the physical world to the image pixel plane to determine its visibility, specifically to check if both the left and right lenses can simultaneously view this point.

I’m not sure what you mean by a FoV coordinate system on the “plane of the lenses”, but I think I understand your goal: being able to determine if a given 3D point is visible in both cameras.

A few comments:

- Good job on calibrating the two lenses (intrinsics) separately. I think this is a good way to start.
- You say that you have two lenses “embedded in a single plane”. I’m imagining a single-board, two-camera setup, or something similar. Note that these two cameras might nominally/ideally be in the same plane, but the physical reality is that each one will have some small rotation and translation with respect to the nominal plane.

I think I would approach this in one of two ways, depending on the other requirements / use case of the system.

- Use the OpenCV stereo calibration function (cv::stereoCalibrate), providing it with the calibrated intrinsics for both cameras and passing the CALIB_FIX_INTRINSIC flag so that it doesn’t modify the intrinsics you pass in. (The assumption is that calibrating the intrinsics separately / beforehand gives more reliable results and you therefore don’t want the stereoCalibrate function to change them.) The result of this calibration is an R and T between cam 1 and cam 2, i.e. the transform that takes points from the cam 1 frame to the cam 2 frame. Cam 1 is the reference frame (analogous to your “plane of the lenses”, I think), and you would expect the rotation angles in R to be close to 0, and the translation to be primarily in X (again, assuming the common single-board stereo configuration).

This works as long as you can use the camera 1 coordinate system for your 3D / real-world coordinates. If you have a separate physical 3D coordinate system, then you need to calibrate your rig extrinsics to that coordinate system (with solvePnP, which gives you the pose of that frame relative to a camera) and then chain that transform with the R and T between your cameras.

Or.

Just calibrate both cameras to your 3D reference frame independently if all you want is to determine point visibility for both cameras (and don’t need the fundamental matrix / essential matrix or the transform between C1 and C2).

I think cv::stereoCalibrate() is the way to go.

In any case, you can use cv::projectPoints with the corresponding intrinsics and extrinsics to determine visibility for the two cameras.