Hi, I'm using OpenCV to try to retrieve the depth to an object I've already detected; I'm doing this to get pixels per cm at the object's location.
So far the constraints I have are:
- I can't use a depth camera or a depth sensor
- I can't place a reference object at the object's location
My current approach is to use my phone camera to take two pictures a certain distance (baseline) apart to simulate stereoscopy.
I have calibrated my camera using the standard chessboard pattern; I'm wondering, does using other patterns help?
I also tried to make the result metric by scaling the pattern's object points with a square_size parameter (in metres), like:
square_size = 0.024  # chessboard square edge length in metres (24 mm)
termCriteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# 3D corner coordinates on the Z = 0 board plane, scaled to metres
worldPtsCur = np.zeros((nRows*nCols, 3), np.float32)
worldPtsCur[:, :2] = np.mgrid[0:nCols, 0:nRows].T.reshape(-1, 2) * square_size
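For context, those object points then feed the standard calibration loop; a rough sketch of what that looks like (calib_images/ is just a placeholder folder name for my chessboard photos):

import glob
import cv2
import numpy as np

objPoints, imgPoints = [], []
for fname in glob.glob('calib_images/*.jpg'):  # placeholder folder of calibration shots
    gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (nCols, nRows), None)
    if found:
        corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), termCriteria)
        objPoints.append(worldPtsCur)   # metric object points from above
        imgPoints.append(corners)

# cameraMatrix comes out in pixels; square_size only scales the extrinsics (rvecs/tvecs)
ret, cameraMatrix, distCoeffs, rvecs, tvecs = cv2.calibrateCamera(
    objPoints, imgPoints, gray.shape[::-1], None, None)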
Then I do feature matching and use the matched points to estimate the essential matrix E and recover R, t, like:
# Estimate the essential matrix with RANSAC
E, mask = cv2.findEssentialMat(pts1, pts2, cameraMatrix, method=cv2.RANSAC, prob=0.999, threshold=1.0)
# Decompose the essential matrix into the relative pose R, t
# (recoverPose returns t as a unit vector, i.e. only up to scale)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, cameraMatrix)
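For reference, pts1 and pts2 come from a matching step roughly like this (a sketch assuming SIFT plus Lowe's ratio test; img1_gray and img2_gray are the two greyscale photos, and my actual code may differ in details):

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1_gray, None)
kp2, des2 = sift.detectAndCompute(img2_gray, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)

# Keep only distinctive matches (Lowe's ratio test)
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])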
I've noticed here that the x element of t is negative; since my images go from left to right, I expected it to be positive. Am I correct in that assumption?
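To sanity-check that, I also print the second camera's centre expressed in the first camera's frame (I assume this is the right way to interpret t, but I may be wrong):

# Camera 2 position in camera 1 coordinates; note t from recoverPose is only known up to scale
cam2_center = -R.T @ t
print("camera 2 centre in camera 1 frame:", cam2_center.ravel())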
Lastly, I tried two ways of getting depth. First, just using the x-axis disparity with this equation:
depth = f * baseline / disparity
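which I evaluate for the detected object roughly like this (point1 and point2 are placeholder names for the object's pixel coordinates in the two images; the baseline value is just an example):

fx = cameraMatrix[0, 0]            # focal length in pixels
baseline_m = 0.10                  # example: 10 cm between the two phone positions
disparity = point1[0] - point2[0]  # x difference in pixels (left image minus right image)
# only strictly valid if the two views are rectified / purely translated along x
depth_m = fx * baseline_m / disparity
print("depth from disparity (m):", depth_m)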
And second, triangulating the point with cv2 like:
# Triangulate the 3D point
points_4D = cv2.triangulatePoints(P1, P2,
                                  normalized_point1[:2].reshape(2, 1),
                                  normalized_point2[:2].reshape(2, 1))
# Convert from homogeneous coordinates to 3D
points_3D = points_4D[:3] / points_4D[3]  # normalize by the fourth coordinate
# Extract the Z-coordinate as the metric depth
depth = points_3D[2][0]
print("Metric depth (Z-coordinate):", depth)
However, the depth from both methods does not match my ground truth on my test set.
Is my approach generally correct?
What are the sources of inaccuracy that could be harming my depth calculation?
What can I do better overall?