I have body tracking (BT) software that gives me the pose of a skeleton (from a Kinect) in world coordinates. The coordinate system is X right, Y up, and Z towards the camera.
The BT software provides the camera's position (0, 0.62, 2.5) as well as its orientation (yaw, pitch, roll).
The character/skeleton is given in world coordinates: the software provides the position of the Hips (the root of the skeleton) and then the rotation of each bone.
I have a second camera, a webcam, on top of the Kinect, which I calibrated against the Kinect IR camera using OpenCV (stereoCalibrate). So I have the intrinsics of the webcam and the Kinect→webcam transformation: a translation of (0, 0.05, 0) and a small X-axis rotation of 9° (or -9°, I don't remember the sign).
I want to be able to project the skeleton into the webcam image. What I have is:
- Skeleton in world coordinates (right-handed, X right, Y up, Z towards camera)
- Webcam intrinsics
- Transformation from kinect to webcam (translation and rotation)
- Pose of the kinect in world coordinates (so I should be able to compute webcam pose in world coordinates with this and previous info)
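To make that last point concrete, here is a minimal numpy sketch of how I would compose the webcam's world pose from the Kinect's world pose and the stereo calibration. The yaw/pitch/roll rotation order, the level Kinect orientation, and the sign of the 9° rotation are all assumptions here:

```python
import numpy as np

def ypr_to_matrix(yaw, pitch, roll):
    """Yaw (about Y), pitch (about X), roll (about Z), composed as Ry @ Rx @ Rz.
    NOTE: this rotation order is an assumption; the BT software may use another."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    Rz = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return Ry @ Rx @ Rz

# Kinect pose in world coordinates (orientation placeholder: level camera).
R_world_kinect = ypr_to_matrix(0.0, 0.0, 0.0)
t_world_kinect = np.array([0.0, 0.62, 2.5])

# stereoCalibrate's (R, T) map points FROM the kinect frame TO the webcam
# frame: x_webcam = R @ x_kinect + T. The webcam's *pose* relative to the
# kinect is therefore the inverse: (R.T, -R.T @ T).
R_kw = ypr_to_matrix(0.0, np.deg2rad(9.0), 0.0)  # sign of the 9 deg is a guess
T_kw = np.array([0.0, 0.05, 0.0])

# Webcam pose in world = kinect pose composed with webcam pose in kinect frame.
R_world_webcam = R_world_kinect @ R_kw.T
t_world_webcam = R_world_kinect @ (-R_kw.T @ T_kw) + t_world_kinect
```

Note that this does not just sum the translations: the Kinect→webcam offset has to be rotated into the world frame first, and the calibration (R, T) has to be inverted if it maps points rather than poses.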
To project the 3D points to 2D I use OpenCV's projectPoints, which returns the image coordinates given the 3D points, the camera intrinsics, and a transformation (rvec/tvec).
I compute this transformation from the camera pose that the body tracking software gives me. I can also edit the values (translation + yaw/pitch/roll) to adjust them visually, i.e., interactively move the camera until the projected skeleton fits the person, and read off the translation and rotation it should have.
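For reference, my understanding of what projectPoints computes (ignoring lens distortion) is the following; this is a numpy sketch, not the actual cv2 implementation, and the intrinsics below are made-up illustration values:

```python
import numpy as np

def project_points(X_world, R, t, K):
    """Pinhole projection equivalent to cv2.projectPoints with zero distortion.
    (R, t) must map WORLD coordinates into the CAMERA frame: x_cam = R @ X + t."""
    X_world = np.asarray(X_world, dtype=float).reshape(-1, 3)
    x_cam = (R @ X_world.T).T + t          # world -> camera frame
    x_img = (K @ x_cam.T).T                # apply intrinsics
    return x_img[:, :2] / x_img[:, 2:3]    # perspective divide -> pixel coords

# Assumed intrinsics for illustration only (fx = fy = 600, 640x480 image).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])

# A point 2 m straight ahead of an untransformed camera lands on the
# principal point.
uv = project_points([[0.0, 0.0, 2.0]], np.eye(3), np.zeros(3), K)
```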
The problems I have are:
- If I set the projectPoints transform using the values obtained from the BT software, the projected 2D points do not fit well, even if I add (just summing) the Kinect→webcam translation and yaw/pitch/roll offsets.
- I have to add a 180° X rotation to the camera (or invert Y and Z), because in OpenCV's camera frame Y points down and Z points out of the camera. Maybe this is messing things up somehow.
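A sketch of the axis flip I mean, assuming my world convention (X right, Y up, Z toward the camera) versus OpenCV's camera convention (X right, Y down, Z forward):

```python
import numpy as np

# A 180 deg rotation about X keeps X and flips Y and Z. It converts between a
# Y-up / Z-toward-camera frame and OpenCV's Y-down / Z-forward camera frame.
FLIP_YZ = np.diag([1.0, -1.0, -1.0])

up = np.array([0.0, 1.0, 0.0])             # world "up"
toward_camera = np.array([0.0, 0.0, 1.0])  # world Z, pointing at the camera
```

Applying FLIP_YZ maps "up" to OpenCV's -Y and "toward camera" to OpenCV's -Z, and applying it twice is the identity, so it is its own inverse.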
- The Y position of the camera has to be much higher than what the BT software says, even after adding the Kinect→webcam offset. In some places I have seen that you have to invert the camera pose via tvec = -Rinv * tvec, where R = Rodrigues(rvec), i.e., invert the camera's rotation, apply it to the camera's translation, and negate the result.
- In fact, what you provide to projectPoints is the transformation that maps 3D world coordinates into the camera's 3D coordinate frame, but what I have is the pose of the camera in world coordinates, i.e., the inverse.
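In other words, if (R_pose, C) is the camera's pose in the world (R_pose its orientation, C its center in world coordinates), I believe the extrinsics that projectPoints expects are the inverse of that pose. A numpy sketch of my understanding:

```python
import numpy as np

def pose_to_extrinsics(R_pose, C):
    """Invert a camera pose (orientation R_pose, center C in world coords)
    into the world->camera extrinsics that projectPoints expects:
        x_cam = R_ext @ X_world + t_ext
    so R_ext = R_pose.T (rotations are orthogonal, so .T is the inverse)
    and t_ext = -R_pose.T @ C."""
    R_ext = R_pose.T
    t_ext = -R_pose.T @ C
    return R_ext, t_ext

# Sanity check: the camera center must map to the camera-frame origin.
R_pose = np.eye(3)
C = np.array([0.0, 0.62, 2.5])  # camera position reported by the BT software
R_ext, t_ext = pose_to_extrinsics(R_pose, C)
origin = R_ext @ C + t_ext
```

This is the same tvec = -Rinv * tvec recipe mentioned above, written out as a pose inversion.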
Basically, with the information I have, I don't know what parameters to pass to OpenCV's projectPoints function to get the correct 2D points.