How to improve PnP pose estimation?

and which axis is that, relative to the camera? why do you think it’s that axis? and why should the translation vector not change from that?

the translation vector describes the marker’s position in the camera frame; together with the rotation vector, it gives the marker’s pose. when applied, this transformation maps marker-local coordinates into camera-local coordinates.
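to make that concrete, here is a minimal numpy sketch of applying such a pose to marker-local points. the pose values and the helper names (`rodrigues`, `marker_to_camera`) are made up for illustration and are not from this thread. `rodrigues` is a hand-written axis-angle conversion, which should match what `cv2.Rodrigues` returns for the same rotation vector.

```python
import numpy as np

def rodrigues(rvec):
    """Convert an axis-angle rotation vector to a 3x3 rotation matrix."""
    rvec = np.asarray(rvec, dtype=float).reshape(3)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)  # no rotation
    k = rvec / theta  # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])  # cross-product matrix of k
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def marker_to_camera(points_marker, rvec, tvec):
    """Map Nx3 marker-local points into the camera frame: X_cam = R @ X_marker + t."""
    R = rodrigues(rvec)
    t = np.asarray(tvec, dtype=float).reshape(3)
    return np.asarray(points_marker, dtype=float) @ R.T + t

# hypothetical pose: marker 0.5 m in front of the camera (camera +z),
# rotated 90 degrees about the camera's z axis
rvec = np.array([0.0, 0.0, np.pi / 2])
tvec = np.array([0.0, 0.0, 0.5])

# the marker's origin lands exactly at tvec
origin_cam = marker_to_camera([[0.0, 0.0, 0.0]], rvec, tvec)[0]

# a point 0.1 m along the marker's x axis ends up along the camera's y axis
corner_cam = marker_to_camera([[0.1, 0.0, 0.0]], rvec, tvec)[0]

print(origin_cam, corner_cam)
```

the marker's origin always maps to `tvec`, whatever the rotation is, so `tvec` only changes if the marker's origin moves relative to the camera.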