Hello,
I’m currently trying to estimate the head pose from an image using OpenCV in Python.
The image above shows the detected image points (blue) and a projection of the three coordinate axes (2.5 cm long) into the image, drawn using the projectPoints function. Even though that projection looks correct, the values returned by solvePnP do not make sense to me.
If I do not use the projectPoints function, but instead apply the transformation myself following the documentation, I get negative pixel coordinates.
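For reference, the projection model from that documentation page, as far as I understand it (ignoring distortion), is

s * [u, v, 1]^T = K * [R | t] * [X, Y, Z, 1]^T

where K is the camera matrix, [R | t] is the pose returned by solvePnP, and s is the homogeneous scale factor.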
Here is example code with the image points and the 3D model points used, which reproduces the issue:
import cv2
import numpy as np
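# 3D model points (in millimetres here, divided by 1000 below to get metres)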
ref_pts = np.array([[ 0.43488255, 42.70063 , -20.705105 ],
[ 0.27277985, 26.4461 , -14.271904 ],
[ 0.13183136, 11.719142 , -4.9430103 ],
[ 0. , 0. , 0. ],
[-11.985651 , -11.618346 , -21.41968 ],
[ -6.7659035 , -13.994063 , -18.108206 ],
[ 0.1727357 , -15.1853285 , -16.894157 ],
[ 7.1862555 , -14.026736 , -18.05195 ],
[ 12.442382 , -11.639794 , -21.37848 ],
[-45.87925 , 36.742226 , -39.275238 ],
[-18.638481 , 33.563454 , -33.30136 ],
[ 19.544973 , 33.596954 , -33.56625 ],
[ 46.47897 , 36.47234 , -39.469 ],
[-23.46772 , -33.72577 , -31.594542 ],
[ 24.001362 , -33.768307 , -31.932142 ],
[ 0.7072544 , -77.33406 , -36.489365 ]], dtype=np.float32) / 1000.0
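# Corresponding 2D points detected in the image (pixels)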
im_pts = np.array([[1017., 463.],
[1025., 501.],
[1032., 539.],
[1040., 569.],
[ 987., 592.],
[1010., 592.],
[1032., 599.],
[1055., 592.],
[1070., 584.],
[ 882., 463.],
[ 950., 463.],
[1070., 456.],
[1138., 448.],
[ 950., 667.],
[1115., 644.],
[1032., 787.]])
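# Camera matrix: fx = fy = 1920, principal point at (960, 540)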
cam = np.array([[1920., 0., 960.],
[ 0., 1920., 540.],
[ 0., 0., 1.]])
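# No lens distortion assumed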
coeffs = np.zeros((4, 1), dtype=np.float64)
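# Estimate the pose: rot is a rotation vector, pos the translation (model frame -> camera frame)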
ret, rot, pos = cv2.solvePnP(ref_pts, im_pts, cam, coeffs, flags=cv2.SOLVEPNP_ITERATIVE)
rmat, _ = cv2.Rodrigues(rot)  # rotation vector -> 3x3 rotation matrix
# Manual projection of the model origin: build the 4x4 pose matrix,
# then apply the camera matrix
T = np.eye(4)
T[:3, :3] = rmat
T[:3, 3] = pos.flatten()
proj = cam.dot(T.dot([0, 0, 0, 1])[:3])
# Projection of the same point using OpenCV; note that projectPoints
# expects the object points as a float array
proj_cv, _ = cv2.projectPoints(np.zeros((1, 3)), rot, pos, cam, coeffs)
np.set_printoptions(suppress=True)
print(f"Manual Projection: {proj}")
print(f"projectPoints: {proj_cv[0,0]}")
This prints:
Manual Projection: [-725.07525257 -399.45861329 -0.7002508 ]
projectPoints: [1035.45080729 570.45078031]
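Note that dividing the first two components of the manual result by the third reproduces the projectPoints output exactly, but only because all three components are negative:

# Perspective divide on the homogeneous result from above:
# (-725.075 / -0.700, -399.459 / -0.700) = (1035.45, 570.45)
print(proj[:2] / proj[2])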
I can obviously just use the projectPoints function in this case, but I need to understand the underlying transformation, as I want to build on it. My suspicion is that the projected point ends up behind the image plane (hence the negative values), but I am unsure what causes this, or how it is detected and corrected inside projectPoints.
As an aside, I’m also confused by the figure on the documentation page linked above: it shows a left-handed camera coordinate system with the y-axis pointing up, whereas OpenCV uses a right-handed coordinate system (x right, y down, z forward) everywhere else.
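For what it is worth, here is a quick handedness check, assuming the camera frame is x right, y down, z forward:

# In a right-handed frame, the cross product of the x- and y-axes gives the z-axis:
print(np.cross([1, 0, 0], [0, 1, 0]))  # [0 0 1] -> z points forward, so right-handed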
Any help would be greatly appreciated.
Best Regards,
Mirko