How to reconstruct 3D coordinates from stereo camera pair?

Paul_Wolff · April 3, 2021, 12:52pm

Hi, I have a stereo camera pair whose cameras are offset only along the x-axis, so they are at the same height and their view axes are parallel to each other and to the ground.

My goal is to reconstruct 3D points of elements lying on the ground.

However, I am very confused about how to proceed.

My planned procedure is as follows:

1. Calibrate the stereo pair with cv.stereoCalibrate()
2. Use cv.stereoRectify() to obtain both projection matrices
3. Use a neural network to detect objects of interest in both images and use the centers of the bounding boxes as reference points
4. Use cv.triangulatePoints() to obtain real-world 3D points (X, Y and Z, in meters) relative to the camera

I am very new to computer vision, and this is the basic plan I managed to put together during my research.

Is this the proper way to proceed? Or am I missing something?

Thank you very much
Paul

crackwitz · April 3, 2021, 2:12pm

welcome.

that would work.

usually people reconstruct a “dense” per-pixel (disparity map and then) depth map and work with that. it’s an expensive process, unless done in hardware (then the hardware pays the price). the algorithm is called “block matching”.
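
to make "block matching" concrete, here is a deliberately naive NumPy sketch: for each pixel in the left image it slides a small window along the same row of the right image and keeps the shift with the lowest sum of absolute differences. the images are synthetic (random texture with a known shift), and the focal length and baseline in the depth conversion are made-up values. in practice you would use OpenCV's StereoBM or StereoSGBM on rectified images rather than anything like this.

```python
import numpy as np

def block_match(left, right, max_disp, half):
    """Naive SAD block matching on a rectified grayscale pair.

    Returns an integer disparity map; border pixels that cannot be
    matched are left at 0.
    """
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            costs = []
            for d in range(max_disp + 1):
                # A point at column x in the left image sits at x - d in the right.
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1]
                costs.append(np.abs(patch - cand).sum())
            disp[y, x] = int(np.argmin(costs))
    return disp

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Z = f * B / d for a rectified horizontal pair."""
    return focal_px * baseline_m / disparity_px

# Synthetic pair: the right image is the left image shifted by a known disparity.
rng = np.random.default_rng(0)
left = rng.random((20, 40))
true_d = 5
right = np.roll(left, -true_d, axis=1)

disp = block_match(left, right, max_disp=8, half=2)

# Hypothetical focal length (pixels) and baseline (meters).
depth = depth_from_disparity(float(true_d), focal_px=800.0, baseline_m=0.12)
```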

running DNN inference is also expensive. you have to decide whether two inferences are cheaper than one inference plus one block matching run.

taking a bounding box in each eye and hoping to throw that back into the scene can work but you’ll have to be careful with the geometry.

imagine a picture frame sitting around your object in the scene. in each view, you aren’t getting a frontal view of that picture frame, but a slightly side view, so it’s not an upright rectangle but a trapezoid or something like that. I mean… working with the bounding boxes as they are… is too simple.

without much thought I’d take the vertical center line of each bounding box, each represents a plane into the scene, and they should intersect in the vertical axis going through the object. that’s probably a good start. then you could take the widths of the bounding boxes and figure a radius from that, around the axis.
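
a minimal sketch of that idea, assuming rectified images, so both cameras share the same focal length and principal point and differ only by a horizontal baseline. the function name, the box layout (x_min, y_min, x_max, y_max) and all numbers are made up for illustration. the radius uses a small-angle approximation (pixel width scaled by depth over focal length), so treat it as a rough estimate.

```python
def axis_from_boxes(box_left, box_right, f, cx, baseline):
    """Estimate the vertical axis (X, Z) and a radius from two bounding boxes.

    Boxes are (x_min, y_min, x_max, y_max) in rectified pixel coordinates.
    f and cx are in pixels, baseline in meters.
    """
    u_left = 0.5 * (box_left[0] + box_left[2])
    u_right = 0.5 * (box_right[0] + box_right[2])

    # The two center-line planes intersect where the disparity fixes the depth.
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: boxes are mismatched or swapped")

    z = f * baseline / disparity
    x = (u_left - cx) * z / f

    # Average the two box widths and scale them back into the scene.
    width_px = 0.5 * ((box_left[2] - box_left[0]) + (box_right[2] - box_right[0]))
    radius = 0.5 * width_px * z / f
    return x, z, radius

# Hypothetical detections of the same object in the left and right image.
x, z, radius = axis_from_boxes(
    box_left=(360.0, 300.0, 400.0, 380.0),
    box_right=(336.0, 300.0, 376.0, 380.0),
    f=800.0, cx=320.0, baseline=0.12,
)
```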

or you could consider a “pyramid” going through each bounding box, and intersect those two pyramids to get a volume in space.
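
one way to work with that volume without building any explicit geometry: a 3D point lies inside both pyramids exactly when it is in front of both cameras and its projection lands inside the bounding box in each image. the sketch below tests that for a single point. the projection matrices are hand-written stand-ins for rectified P1/P2 with made-up intrinsics, and the boxes are hypothetical.

```python
import numpy as np

def in_both_pyramids(X, P_left, P_right, box_left, box_right):
    """True if 3D point X projects inside both bounding boxes.

    Boxes are (x_min, y_min, x_max, y_max) in pixels. Assumes the third row
    of each projection matrix yields the depth in that camera, which holds
    for rectified matrices of the form [0, 0, 1, 0].
    """
    X_h = np.append(np.asarray(X, dtype=float), 1.0)
    for P, (x0, y0, x1, y1) in ((P_left, box_left), (P_right, box_right)):
        u, v, w = P @ X_h
        if w <= 0:                      # behind the camera
            return False
        if not (x0 <= u / w <= x1 and y0 <= v / w <= y1):
            return False
    return True

# Assumed rectified projection matrices (f = 800 px, baseline = 0.12 m).
f, cx, cy, baseline = 800.0, 320.0, 240.0, 0.12
P_left = np.array([[f, 0, cx, 0], [0, f, cy, 0], [0, 0, 1, 0]], dtype=float)
P_right = np.array([[f, 0, cx, -f * baseline], [0, f, cy, 0], [0, 0, 1, 0]], dtype=float)

box_left = (360.0, 300.0, 400.0, 380.0)
box_right = (336.0, 300.0, 376.0, 380.0)

inside = in_both_pyramids((0.3, 0.5, 4.0), P_left, P_right, box_left, box_right)
too_far = in_both_pyramids((0.3, 0.5, 8.0), P_left, P_right, box_left, box_right)
```

sampling a grid of candidate points with a test like this gives a crude picture of the volume's extent.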

Paul_Wolff · April 3, 2021, 2:45pm

Hey, thank you for your quick reply. Yeah, I forgot that matching exists. Of course I can run inference on one image and match the points in the other one.

Furthermore, I would need to run undistortPoints() on the points I detected, right?

crackwitz · April 3, 2021, 2:55pm

if you run block matching for a depth map, you’ll undistort and rectify both eyes anyway, so you can run inference on these pictures and that’s it.

otherwise you do need undistortPoints.

Paul_Wolff · April 3, 2021, 3:19pm

Now I am confused. How come they are already undistorted and rectified?

And do I understand you correctly that I either need to run inference on both images, or run inference on one and compute block matching?

crackwitz · April 3, 2021, 4:54pm

they aren’t. undistortion and rectification are prerequisite steps for block matching.

and yes that’s what I said.
