Hi, I implemented a multi-view geometry pipeline in ROS to track an underwater robot’s pose using two fixed cameras:
GoPro (bird’s-eye view)
ArduCam B0497 (side view on tripod)
A single fixed ArUco marker is visible in both views for extrinsics.
Pipeline:
A CNN detects the ROV and always returns its center pixel.
I undistort the pixel, compute the 3D ray (including refraction with Snell’s law), and then transform to world coordinates via TF2.
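For reference, here is a minimal sketch of that step (assuming a flat, horizontal water surface and a frame whose +z axis points down into the water; the TF2 transform to the world frame and the intersection with the surface plane are omitted):

```cpp
#include <opencv2/calib3d.hpp>
#include <cmath>
#include <vector>

// Unit direction of the viewing ray after it enters the water.
// Assumes the ray is expressed in a frame where +z points down into the
// water; in the full pipeline, rotate into the world frame with TF2 first
// and intersect with the surface plane to get the refraction point.
cv::Vec3d pixelToRefractedRay(const cv::Point2d& px,
                              const cv::Mat& K, const cv::Mat& dist)
{
    // Undistort to normalized image coordinates (the z = 1 plane).
    std::vector<cv::Point2d> in{px}, out;
    cv::undistortPoints(in, out, K, dist);
    cv::Vec3d ray(out[0].x, out[0].y, 1.0);
    ray /= cv::norm(ray);

    // Snell's law at a flat air/water interface: n_air sin(i) = n_water sin(t).
    const double eta = 1.0 / 1.33;        // n_air / n_water
    const cv::Vec3d n(0.0, 0.0, -1.0);    // surface normal, pointing up at the camera
    const double cosI  = -n.dot(ray);
    const double sinT2 = eta * eta * (1.0 - cosI * cosI);
    const double cosT  = std::sqrt(1.0 - sinT2);  // air-to-water never totally reflects
    const cv::Vec3d t  = eta * ray + (eta * cosI - cosT) * n;
    return t / cv::norm(t);
}
```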
The trajectories from both cameras overlap nicely except when the robot moves toward the far side of the pool, near the edges of the USB camera’s FOV. There, the ArduCam trajectory (red) drifts significantly compared to the GoPro.
I suspect distortion model limitations.

Either you didn't capture many points near the corners of the image during calibration, or you are using a distortion model that isn't capable of modeling your lens. Take a picture of a calibration target so that it fills the full frame, then undistort that image: are the calibration target lines straight, or do they curve as you get close to the edges/corners? Curved lines mean the distortion calibration isn't very good.
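If you want more than an eyeball check, something like this sketch works; the file names and YAML keys are placeholders for however you store your intrinsics:

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/highgui.hpp>

int main()
{
    // Placeholder calibration file and keys.
    cv::Mat K, dist;
    cv::FileStorage fs("camera_calib.yaml", cv::FileStorage::READ);
    fs["camera_matrix"] >> K;
    fs["distortion_coefficients"] >> dist;

    cv::Mat img = cv::imread("target_full_frame.png");
    cv::Mat undistorted;
    cv::undistort(img, undistorted, K, dist);

    // Physically straight lines should now be straight everywhere in the
    // image; curvature near the edges/corners means the model isn't fitting there.
    cv::imshow("undistorted", undistorted);
    cv::waitKey(0);
    return 0;
}
```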
I have had good luck using the rational model with wide-FOV lenses. If you are using the standard 5-parameter model, I'd give the rational (8-parameter) model a try. It's also important to capture points near the corners of the image during calibration: the 5-parameter model does not extrapolate well, and while the 8-parameter model extrapolates better, you will still get the best results when you have points as far into the corners as you can manage.
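The switch is a one-flag change in cv::calibrateCamera; a minimal sketch, assuming you already have object/image point correspondences from your detector:

```cpp
#include <opencv2/calib3d.hpp>
#include <vector>

double calibrateRational(const std::vector<std::vector<cv::Point3f>>& objectPoints,
                         const std::vector<std::vector<cv::Point2f>>& imagePoints,
                         cv::Size imageSize, cv::Mat& K, cv::Mat& dist)
{
    std::vector<cv::Mat> rvecs, tvecs;
    // CALIB_RATIONAL_MODEL enables k4..k6, so dist comes back as
    // (k1, k2, p1, p2, k3, k4, k5, k6) instead of the usual 5 coefficients.
    return cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                               K, dist, rvecs, tvecs,
                               cv::CALIB_RATIONAL_MODEL);  // returns RMS reprojection error
}
```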
Getting points in the corners is difficult with the standard chessboard target (which has to be fully visible to be detected); I have had good success using a ChArUco target instead, which can use images that contain only part of the board. It is also important to check which points are actually being used in the calibration: just because a point is visible doesn't mean it will get used, particularly if you have significant distortion.
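A rough sketch of that workflow with the opencv_contrib aruco module (pre-4.7 API; the board geometry and dictionary are placeholders). Printing the interpolated corner count per image is a simple way to see how many points each image actually contributes:

```cpp
#include <opencv2/aruco/charuco.hpp>
#include <iostream>
#include <vector>

double calibrateWithCharuco(const std::vector<cv::Mat>& images,
                            cv::Mat& K, cv::Mat& dist)
{
    auto dict  = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_5X5_100);
    auto board = cv::aruco::CharucoBoard::create(7, 5, 0.04f, 0.03f, dict);

    std::vector<cv::Mat> allCorners, allIds;
    cv::Size imageSize;
    for (const auto& image : images) {
        imageSize = image.size();
        std::vector<int> markerIds;
        std::vector<std::vector<cv::Point2f>> markerCorners;
        cv::aruco::detectMarkers(image, dict, markerCorners, markerIds);
        if (markerIds.empty()) continue;

        cv::Mat charucoCorners, charucoIds;
        cv::aruco::interpolateCornersCharuco(markerCorners, markerIds, image,
                                             board, charucoCorners, charucoIds);
        // How many of the board's interior corners this image contributes.
        std::cout << charucoIds.total() << " of " << board->chessboardCorners.size()
                  << " corners used\n";
        if (charucoIds.total() >= 4) {
            allCorners.push_back(charucoCorners);
            allIds.push_back(charucoIds);
        }
    }
    std::vector<cv::Mat> rvecs, tvecs;
    return cv::aruco::calibrateCameraCharuco(allCorners, allIds, board, imageSize,
                                             K, dist, rvecs, tvecs,
                                             cv::CALIB_RATIONAL_MODEL);
}
```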
I have had to use an iterative approach when calibrating high-distortion lenses with the ChArUco process. Bootstrap with the standard ChArUco calibration, then use the recovered calibration to predict the locations of undetected corners and refine each prediction with cv::cornerSubPix to get a subpixel location. Feeding these new corners (along with the originally detected ones) back into the camera calibration algorithm gives a more accurate result. You may have to iterate on this several times to recover all of the points.
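One iteration of that loop might look like the sketch below for a single image: project the missing board corners using the current calibration and board pose, refine each prediction with cv::cornerSubPix, and keep only refinements that stay close to the prediction. The window size and acceptance threshold here are guesses to tune:

```cpp
#include <opencv2/aruco/charuco.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// gray: CV_8UC1 calibration image; rvec/tvec: board pose for this image,
// recovered with the current (bootstrap) calibration.
void recoverUndetectedCorners(const cv::Mat& gray,
                              const cv::Ptr<cv::aruco::CharucoBoard>& board,
                              const cv::Mat& K, const cv::Mat& dist,
                              const cv::Mat& rvec, const cv::Mat& tvec,
                              std::vector<cv::Point2f>& corners,  // detected corners (in/out)
                              std::vector<int>& ids)              // their board indices (in/out)
{
    const cv::Rect inner(6, 6, gray.cols - 12, gray.rows - 12);
    for (int i = 0; i < (int)board->chessboardCorners.size(); ++i) {
        if (std::find(ids.begin(), ids.end(), i) != ids.end()) continue;  // already have it

        // Predict where corner i should appear under the current calibration.
        std::vector<cv::Point2f> predicted;
        cv::projectPoints(std::vector<cv::Point3f>{board->chessboardCorners[i]},
                          rvec, tvec, K, dist, predicted);
        if (!inner.contains(cv::Point(predicted[0]))) continue;  // off-image or too near the border

        // Refine to subpixel accuracy; if the refinement wanders far from the
        // prediction, assume the corner isn't actually usable in this image.
        std::vector<cv::Point2f> refined = predicted;
        cv::cornerSubPix(gray, refined, cv::Size(5, 5), cv::Size(-1, -1),
                         cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.01));
        const cv::Point2f d = refined[0] - predicted[0];
        if (std::hypot(d.x, d.y) < 3.0) {  // pixel tolerance, a guess
            corners.push_back(refined[0]);
            ids.push_back(i);
        }
    }
}
```

Feed the augmented corner/id sets back into the ChArUco calibration and repeat until no new corners are recovered.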