Fundamentals of rotation and translation in calibrateCamera


I want to understand the fundamentals behind how tvec and rvec are calculated in OpenCV's calibrateCamera() and solvePnP().

  1. In which direction does the OpenCV implementation apply rotation and translation: camera to world coordinates, or world to camera?
  2. Which origin point (for rotation and translation) does OpenCV use: the top-left corner (0,0,0) of the chessboard, or the center point of the chessboard pattern?

Please help me to understand.


Quick reply from memory, please double-check.

Origin is at camera center (3D center of projection). X to the right, Y down, Z forward.
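The scale unit is whatever unit you use for the chessboard object points; if you simply number the corners 0, 1, 2, ... it is one chessboard square side.
Chessboard rotation and translation are expressed in the camera reference frame.
The image shows the chessboard origin: only vertices where two black squares meet (inner corners) are used.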
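To make the origin and the unit concrete, here is a minimal sketch of how the chessboard object points are commonly built before calling calibrateCamera() or solvePnP(). The 9 x 6 inner-corner count and the 25 mm square size are assumptions for illustration only; use the values for your own board.

```python
import numpy as np

# Hypothetical board: 9 x 6 inner corners, 25 mm squares (both are assumptions).
pattern_size = (9, 6)   # inner corners per row, per column
square_size = 25.0      # this choice sets the unit of tvec (here: millimetres)

# One 3D point per inner corner, all lying on the board plane Z = 0.
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
objp *= square_size

print(objp[0])   # first inner corner: this is the board origin (0, 0, 0)
print(objp[1])   # next corner along the row: one square further along X
print(objp[9])   # first corner of the second row: one square further along Y
```

Whatever point you give the coordinates (0, 0, 0) here is the origin that rvec and tvec refer to, and whatever unit you use for `square_size` is the unit of tvec.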



I’d always talk about the camera’s “optical center”, which can be more complex than simply the center of the lens (e.g. when there are multiple lenses). It’s not either of the focal points of a single lens. I’ve seen that claimed somewhere before, so I thought I should point it out now. For typical webcams, the difference between optical center and focal points is just a few mm, so nobody will ever notice.


Thanks for your quick reply.
What if the chessboard is fixed and we roll the camera to capture images at multiple orientations, keeping the camera's yaw and pitch unchanged?

Actually, I want to know exactly how OpenCV implements these APIs, so that I can calculate 2D points from 3D points (with known chessboard dimensions).

Did you mean that OpenCV's rotation matrix describes the rotation from camera to world coordinates?

In the image that you shared, does OpenCV consider the origin to be the top-left corner of the chessboard, i.e. (x=0, y=0, z=0), or the first saddle point (where the ray is drawn)?

Thanks for helping me understand this.


Chessboard origin is the top-left inner vertex, i.e. the first saddle point, not the outer corner of the board. It is the same point on the board no matter the chessboard pose. The pattern in the picture is asymmetric, so there’s only one such point.

Rotation and translation applied to chessboard 3D coordinates will give you camera 3D coordinates. They transform from chessboard’s to camera’s reference.
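As a rough illustration of that chessboard-to-camera direction, the sketch below applies rvec and tvec to board points and then projects them with a plain pinhole model. It uses NumPy only and ignores lens distortion; in OpenCV itself, cv2.Rodrigues() does the rvec-to-matrix conversion and cv2.projectPoints() does the projection including distortion. The helper names `rodrigues` and `board_to_image` and all numeric values are made up for this example.

```python
import numpy as np

def rodrigues(rvec):
    """Axis-angle vector -> 3x3 rotation matrix (Rodrigues formula)."""
    rvec = np.asarray(rvec, dtype=float).reshape(3)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def board_to_image(points_board, rvec, tvec, camera_matrix):
    """Board coordinates -> camera coordinates -> pixels (no distortion)."""
    R = rodrigues(rvec)
    t = np.asarray(tvec, dtype=float).reshape(3)
    points_cam = points_board @ R.T + t          # X_cam = R @ X_board + t
    uvw = points_cam @ camera_matrix.T           # apply intrinsics
    return points_cam, uvw[:, :2] / uvw[:, 2:3]  # divide by depth Z

# Made-up intrinsics and pose, for illustration only.
camera_matrix = np.array([[800.0, 0.0, 320.0],
                          [0.0, 800.0, 240.0],
                          [0.0, 0.0, 1.0]])
rvec = np.array([0.0, 0.0, np.pi / 2])   # board rotated 90 degrees about Z
tvec = np.array([0.0, 0.0, 500.0])       # board origin 500 units in front of camera

pts_board = np.array([[0.0, 0.0, 0.0],    # board origin
                      [25.0, 0.0, 0.0]])  # one square along the board X axis
pts_cam, pts_img = board_to_image(pts_board, rvec, tvec, camera_matrix)
print(pts_cam)
print(pts_img)
```

Note that tvec is simply the position of the board origin in camera coordinates: the point (0, 0, 0) on the board maps to tvec.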

To get the camera pose with respect to the chessboard (aka world), the usual way is to construct a 4x4 rototranslation matrix (aka rigid or Euclidean transformation matrix) from the 3x3 rotation matrix (obtained from rvec with Rodrigues()) and the 3x1 translation vector (you may want to read up on it). This matrix is the transformation from chessboard to camera. Its inverse is what you are looking for: the transformation from camera to chessboard, whose translation part is your camera position with respect to the chessboard.
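A minimal sketch of that inversion, assuming you already have the 3x3 rotation matrix R and the translation t for the chessboard-to-camera transform. The numeric values and the variable names are made up for illustration.

```python
import numpy as np

# Made-up chessboard-to-camera pose (R would come from rvec via Rodrigues).
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([10.0, 20.0, 500.0])

# 4x4 transform that takes board coordinates to camera coordinates.
T_cam_from_board = np.eye(4)
T_cam_from_board[:3, :3] = R
T_cam_from_board[:3, 3] = t

# Its inverse takes camera coordinates to board coordinates.
T_board_from_cam = np.linalg.inv(T_cam_from_board)

# The translation part of the inverse is the camera position in the board frame.
camera_position = T_board_from_cam[:3, 3]
print(camera_position)

# Same result in closed form: the inverse rotation is R.T, the position is -R.T @ t.
print(-R.T @ t)
```

Because R is a rotation matrix, the closed form avoids a general matrix inverse; the 4x4 form is mainly convenient when you chain several transforms.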

It isn’t hard, but there are several pieces involved, so you need to read about it.
