solvePnP rotational vector

Carlos144Green · July 21, 2022, 3:08pm

So I am following this #project, the result video is at the end.

I want a way to extract the global 3D coordinates of the nose and of the vector pointing away from the face. I understand that we get the translation vector (which is the nose coordinates, I think) and the rotation vector (I think this is the camera angle?). I just want the vector pointing away from the face, expressed as two 3D points.

Can anyone help with this?

crackwitz · July 21, 2022, 3:48pm

rvec and tvec represent…

a transformation from the object coordinate frame to the camera coordinate frame
the pose of the object, expressed in the camera frame

rvec is an axis-angle encoding: its direction is the rotation axis, its length is the rotation angle in radians.
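To make the axis-angle reading concrete, here is a small NumPy sketch that splits a rotation vector into its unit axis and its angle. The numbers are the rotation_vector posted further down in this thread; the variable names are only for illustration.

```python
import numpy as np

# rotation_vector as posted later in this thread, shape (3, 1) like solvePnP returns
rvec = np.array([[-2.808854], [-0.01133365], [0.42841277]])

angle = float(np.linalg.norm(rvec))   # rotation angle in radians
axis = rvec.ravel() / angle           # unit vector along the rotation axis

print("angle (deg):", np.degrees(angle))
print("axis:", axis)
```

For these values the angle is roughly 163 degrees about an axis close to -X, i.e. nearly a half turn around the camera's X axis.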

berak · July 21, 2022, 5:16pm

rvec/tvec are the same for ALL the points. the tip of the nose isn't anything special here

Carlos144Green · July 21, 2022, 6:01pm

So how do we solve for the end point of the vector if its length is constant but location isn’t?

crackwitz · July 21, 2022, 9:42pm

express that point local to the “object” (face).

then use projectPoints on that data, along with rvec and tvec, and out comes the screen-space point for that 3D point.
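A rough NumPy-only sketch of what that step computes when lens distortion is ignored (cv.projectPoints does the same and additionally applies distortion coefficients). Assumptions that are not from this thread: the nose tip is the origin of the face model, +Z points out of the face, the 100-unit length, the camera matrix values, and the helper names rodrigues and project.

```python
import numpy as np

def rodrigues(rvec):
    """Axis-angle vector -> 3x3 rotation matrix (Rodrigues' formula)."""
    r = np.asarray(rvec, dtype=float).ravel()
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        return np.eye(3)
    k = r / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])   # cross-product matrix of the axis
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def project(points_obj, rvec, tvec, camera_matrix):
    """Object-local 3D points -> (pixel coordinates, camera-frame 3D points)."""
    R = rodrigues(rvec)
    t = np.asarray(tvec, dtype=float).ravel()
    points_cam = np.asarray(points_obj, dtype=float) @ R.T + t   # rotate, then translate
    uvw = points_cam @ camera_matrix.T                           # pinhole model
    return uvw[:, :2] / uvw[:, 2:3], points_cam                  # perspective divide

# rvec/tvec as posted later in this thread
rvec = np.array([[-2.808854], [-0.01133365], [0.42841277]])
tvec = np.array([[-167.53555238], [-203.05570512], [2571.1324635]])

# Hypothetical intrinsics: use the camera matrix you passed to solvePnP instead
camera_matrix = np.array([[1000.0, 0.0, 640.0],
                          [0.0, 1000.0, 360.0],
                          [0.0, 0.0, 1.0]])

# Assumed object-local points: nose tip, and a point 100 units out along +Z
points_obj = np.array([[0.0, 0.0, 0.0],
                       [0.0, 0.0, 100.0]])

pixels, points_cam = project(points_obj, rvec, tvec, camera_matrix)
print("camera-frame 3D points:\n", points_cam)
print("pixel coordinates:\n", pixels)
```

The two rows of points_cam are the two 3D points of the vector in the camera frame; the two rows of pixels are where to draw the line in the image.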

if you want to explore what’s going on… use the following combined with cv.perspectiveTransform()

import numpy as np
import cv2 as cv

def rtvec_to_matrix(rvec, tvec):
	"""
	Convert rotation vector and translation vector to 4x4 matrix
	"""
	rvec = np.asarray(rvec)
	tvec = np.asarray(tvec)

	T = np.eye(4)
	R, jac = cv.Rodrigues(rvec)
	T[:3, :3] = R
	T[:3, 3] = tvec
	return T

def matrix_to_rtvec(matrix):
	"""
	Convert 4x4 matrix to rotation vector and translation vector
	"""
	rvec, jac = cv.Rodrigues(matrix[:3, :3])
	tvec = matrix[:3, 3]
	return rvec, tvec
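As a rough illustration of how the 4x4 matrix is meant to be used: for 3D points, cv.perspectiveTransform() with a 4x4 matrix appends a 1 to each point, multiplies by the matrix, and divides by the resulting w. The NumPy sketch below does the same by hand. The helper name transform_points, the 100-unit length, and the assumption that the nose tip is the model origin with +Z pointing out of the face are illustrative; the matrix is the one worked out further down in this thread.

```python
import numpy as np

# 4x4 object-to-camera matrix for the rvec/tvec posted in this thread
T = np.array([
    [ 0.95552, -0.03688, -0.29262, -167.53555],
    [ 0.0523 , -0.95524,  0.29118, -203.05571],
    [-0.29026, -0.29354, -0.91082, 2571.13246],
    [ 0.     ,  0.     ,  0.     ,    1.     ],
])

def transform_points(T, points):
    """Apply a 4x4 transform to an (N, 3) array of points."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])  # append w = 1
    out = homogeneous @ T.T
    return out[:, :3] / out[:, 3:4]   # divide by w (w stays 1 for a rigid transform)

# Assumed object-local points: nose tip at the origin, and a point
# 100 units out along +Z (hypothetical length, same unit as tvec).
face_points = np.array([[0.0, 0.0, 0.0],
                        [0.0, 0.0, 100.0]])

nose_cam, tip_cam = transform_points(T, face_points)
print("nose in camera frame:    ", nose_cam)
print("endpoint in camera frame:", tip_cam)
```

With OpenCV, cv.perspectiveTransform(face_points.reshape(-1, 1, 3), T) should give the same two points.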

Carlos144Green · July 22, 2022, 2:38am

Sorry, I am still very new to all of this. Could you explain how to use cv.perspectiveTransform() and these functions, to help me understand all of this?

Also I dabbled around with the first function and it is throwing an error:
ValueError: could not broadcast input array from shape (3,1) into shape (3,) on line T[:3, 3] = tvec

Carlos144Green · July 22, 2022, 2:40am

Here are the values from the Mona Lisa image I included in the topmost post.

translation_vector:  [[-167.53555238]
 [-203.05570512]
 [2571.1324635 ]]
rotation_vector:  [[-2.808854  ]
 [-0.01133365]
 [ 0.42841277]]

crackwitz · July 22, 2022, 10:33am

fix:

def rtvec_to_matrix(rvec, tvec):
	"""
	Convert rotation vector and translation vector to 4x4 matrix
	"""
	rvec = np.asarray(rvec)
	tvec = np.asarray(tvec)

	T = np.eye(4)
	R, jac = cv.Rodrigues(rvec)
	T[:3, :3] = R
	T[:3, 3] = tvec.squeeze() # this is the fix
	return T
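Why the squeeze is needed, as a minimal NumPy-only sketch: solvePnP hands back tvec with shape (3, 1), while the slot T[:3, 3] has shape (3,), and NumPy will not broadcast one into the other.

```python
import numpy as np

# translation_vector as posted above, shape (3, 1)
tvec = np.array([[-167.53555238], [-203.05570512], [2571.1324635]])

T = np.eye(4)
try:
    T[:3, 3] = tvec            # (3, 1) does not fit a (3,) slot
except ValueError as err:
    print("fails:", err)

T[:3, 3] = tvec.squeeze()      # (3,) fits
print(T[:3, 3])
```

tvec.ravel() or tvec.reshape(3) would work just as well here.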

crackwitz · July 22, 2022, 10:39am

I don’t see that picture. your first post only includes a link to a blog post.

your tvec looks like the calibration might be in mm.

the 4x4 transformation matrix for your rvec and tvec is:

array([[   0.95552,   -0.03688,   -0.29262, -167.53555],
       [   0.0523 ,   -0.95524,    0.29118, -203.05571],
       [  -0.29026,   -0.29354,   -0.91082, 2571.13246],
       [   0.     ,    0.     ,    0.     ,    1.     ]])

I’m gonna talk about “markers” purely because I’m kinda stuck in that lingo. it comes from working with AR markers. I imagine if you work with faces, the axes are placed similarly (Z is nose/forward, X to subject’s left, Y up)

the rotation maps (read each column)…

X to +X, with a bit of -Z but not much
Y to -Y (so it points up, and a little near)
Z to -Z (so it points near, and a little down/left)

which rotates marker-local coordinates/directions into camera-local points/directions. so that means you’re facing the marker. it probably faces a little to your bottom left.

the translation then simply moves all that away/far by 2571, and a little to the top left of the image center.
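The "you're facing the marker" reading can be checked numerically. A sketch under the axis assumption stated above (object +Z points out of the face): take where +Z lands in the camera frame (third column of the rotation) and compare it with the direction from the object back to the camera, which is -tvec normalized. Variable names are illustrative.

```python
import numpy as np

# rotation part and translation part of the 4x4 matrix above
R = np.array([[ 0.95552, -0.03688, -0.29262],
              [ 0.0523 , -0.95524,  0.29118],
              [-0.29026, -0.29354, -0.91082]])
tvec = np.array([-167.53555, -203.05571, 2571.13246])

distance = np.linalg.norm(tvec)     # camera-to-object distance, same unit as tvec
forward_cam = R[:, 2]               # object +Z expressed in the camera frame
to_camera = -tvec / distance        # unit vector from the object toward the camera

cos_angle = float(forward_cam @ to_camera)
angle_deg = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
print("distance:", distance)
print("angle between face +Z and the line of sight (deg):", angle_deg)
```

A positive cosine means the face's +Z has a component toward the camera, i.e. the subject is facing it; the angle says how far it is turned away from looking straight into the lens.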

Carlos144Green · July 22, 2022, 1:28pm

Spot on, it's pointing towards the bottom left.

I am still a little confused how you read the matrix to determine the position of the vector though.

crackwitz · July 22, 2022, 3:34pm

the matrix transforms object-local points/vectors to camera-local.

you can get a sense for what it does by figuring where the (object-local) axes get mapped to. X is (1,0,0), Y is (0,1,0), Z is (0,0,1). append a 1 for points or a 0 for vectors.

the +Z vector gets mapped to…

$$
\begin{pmatrix}
0.95552 & -0.03688 & -0.29262 & -167.536 \\
0.0523 & -0.95524 & 0.29118 & -203.056 \\
-0.29026 & -0.29354 & -0.91082 & 2571.13 \\
0 & 0 & 0 & 1
\end{pmatrix}
\cdot
\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}
=
\begin{pmatrix} -0.29262 \\ 0.29118 \\ -0.91082 \\ 0 \end{pmatrix}
$$

so the +Z vector gets mapped to… the third column.

now, where does a camera-local vector point if that is its value? a little to -X (left), a little to +Y (down), and mostly to -Z, so that’s near. and that is where the face’s +Z vector points, as viewed by the camera.

same for the others.
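One way to do "the same for the others" in code: push each object axis through the matrix as a direction (w = 0) and the origin as a point (w = 1), then name the result in camera terms. This sketch assumes OpenCV's usual camera convention (+X right, +Y down, +Z away from the camera); the describe helper is hypothetical.

```python
import numpy as np

T = np.array([
    [ 0.95552, -0.03688, -0.29262, -167.53555],
    [ 0.0523 , -0.95524,  0.29118, -203.05571],
    [-0.29026, -0.29354, -0.91082, 2571.13246],
    [ 0.     ,  0.     ,  0.     ,    1.     ],
])

def describe(v):
    """Name the camera-frame directions a vector points along, largest component first."""
    names = [("right", "left"), ("down", "up"), ("far", "near")]
    order = np.argsort(-np.abs(v[:3]))
    return [names[i][0] if v[i] > 0 else names[i][1] for i in order]

# w = 0: directions, so the translation column has no effect
axes = {"X": [1, 0, 0, 0], "Y": [0, 1, 0, 0], "Z": [0, 0, 1, 0]}
for name, vec in axes.items():
    mapped = T @ np.array(vec, dtype=float)
    print(f"+{name} ->", mapped[:3], describe(mapped))

# w = 1: a point, so the translation is applied
origin = T @ np.array([0.0, 0.0, 0.0, 1.0])
print("origin ->", origin[:3])
```

Each direction comes out as the matching column of the rotation part, and the origin comes out as the translation column, which is why the matrix can be read column by column.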
