Inconsistent results with `recoverPose`

I briefly reviewed section 9.6.2, "Extraction of cameras from the essential matrix," in Multiple View Geometry by Hartley and Zisserman. It reminded me that:

> For a given essential matrix E…and first camera matrix P = [I | 0], there are four possible choices for the second camera matrix P’

Two of the solutions differ only in the sign of the translation vector; the other two differ by a 180-degree rotation about the line joining the two camera centers.
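
If it helps to see the ambiguity concretely, OpenCV exposes the decomposition via `decomposeEssentialMat()`, so you can print all four candidate poses yourself. Here is a minimal Python sketch; the rotation and translation used to build a synthetic E are made up just so it runs on its own (in your case E would come from `findEssentialMat()`):

```python
import cv2
import numpy as np

# Build a synthetic essential matrix E = [t]x R from a made-up rotation and
# translation; in practice E comes from findEssentialMat on matched points.
R_true, _ = cv2.Rodrigues(np.array([[0.0], [0.1], [0.0]]))   # small rotation about y
t_true = np.array([1.0, 0.0, 0.0])                           # unit baseline along x
t_cross = np.array([[0.0, -t_true[2], t_true[1]],
                    [t_true[2], 0.0, -t_true[0]],
                    [-t_true[1], t_true[0], 0.0]])
E = t_cross @ R_true

# decomposeEssentialMat returns two rotations and a translation known only up
# to sign, i.e. the four candidates from H&Z section 9.6.2:
# (R1, t), (R1, -t), (R2, t), (R2, -t)
R1, R2, t = cv2.decomposeEssentialMat(E)
for i, (R, tv) in enumerate([(R1, t), (R1, -t), (R2, t), (R2, -t)]):
    print(f"candidate {i}: rvec = {cv2.Rodrigues(R)[0].ravel()}, t = {tv.ravel()}")
```

Printing these alongside what `recoverPose()` returns might at least show which of the four candidates it ends up picking from run to run.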

I don’t know how OpenCV handles this ambiguity in the recoverPose() call, or whether it could be the source of your error, but it definitely raises some questions. I also have to wonder if the whole approach is flawed, at least with two nearly identical images. If I recall correctly, E = [t]×R, so with no rotation or translation between the two cameras the essential matrix heads toward zero - is that a degenerate / unstable case?

I’m truly getting out of my depth here, and certainly there are others on the forum with a better understanding than I have. Maybe they will jump in to help.

A few thoughts:

  1. Have you undistorted the image points prior to calling findEssentialMat()? If not, try that (there is a short undistortPoints sketch after this list).
  2. Have you tried using an image pair that is not identical, but instead has movement between the two? Say a simple translation or rotation. Maybe take a sequence of images from one pose, then move the camera and take a second sequence. Randomly pick an image from the first sequence and pair it with a random image from the second sequence. Are your results more stable?
  3. You ask if I know of other ways to recover the pose. I do, but I’m not sure they are what you are looking for. I am assuming you are only looking for camera motion in a relative sense - how the camera pose in frame n compares to the camera pose in frame m. Most of the work I do involves calibrating the pose in an absolute sense - where the camera is in some reference frame that I care about. This takes 3D ground-truth points and corresponding image points from the camera you are working with (see the solvePnP sketch after this list).
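
For item 1, this is roughly what I mean by undistorting first. It is only a sketch: the 3D points, poses, intrinsics K, and distortion coefficients are synthetic stand-ins for your matches and calibration. Passing P=K keeps the undistorted points in pixel units, so the same K is still valid for findEssentialMat / recoverPose afterwards.

```python
import cv2
import numpy as np

# --- Synthetic stand-ins so the sketch runs on its own ------------------------
# In your pipeline, pts1/pts2 are matched pixel coordinates and K/dist come from
# your calibration. Here I just project some random 3D points into two views.
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.array([-0.2, 0.05, 0.0, 0.0, 0.0])             # made-up distortion
pts3d = rng.uniform([-1, -1, 4], [1, 1, 8], size=(100, 3))
rvec1, tvec1 = np.zeros(3), np.zeros(3)                   # first camera at origin
rvec2, tvec2 = np.array([0.0, 0.05, 0.0]), np.array([0.3, 0.0, 0.0])
pts1, _ = cv2.projectPoints(pts3d, rvec1, tvec1, K, dist)
pts2, _ = cv2.projectPoints(pts3d, rvec2, tvec2, K, dist)

# --- The part that matters: undistort before findEssentialMat -----------------
# P=K keeps the result in pixel coordinates, so the same K can still be handed
# to findEssentialMat / recoverPose below.
u1 = cv2.undistortPoints(pts1, K, dist, P=K)
u2 = cv2.undistortPoints(pts2, K, dist, P=K)

E, inliers = cv2.findEssentialMat(u1, u2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
retval, R, t, mask = cv2.recoverPose(E, u1, u2, K, mask=inliers)
print("recovered R:\n", R, "\nrecovered t:", t.ravel())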
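
And for item 3, the "absolute" calibration I am describing is essentially what solvePnP() does: given 3D ground-truth points in a world frame you care about and their image projections, it returns the camera pose in that frame. A rough sketch, with made-up numbers standing in for surveyed points and real detections:

```python
import cv2
import numpy as np

# Assumed inputs: obj_pts are 3D ground-truth points in the world frame you care
# about, img_pts are their detections in the image, and K / dist come from your
# camera calibration. All values here are invented for the example.
obj_pts = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [1.0, 1.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.5, 0.5, 0.5],
                    [0.2, 0.8, 0.3]])
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)

# Synthesize image points from a known pose so the example is self-consistent;
# in practice these would be measured in the image.
rvec_true = np.array([0.1, -0.2, 0.05])
tvec_true = np.array([0.0, 0.0, 5.0])
img_pts, _ = cv2.projectPoints(obj_pts, rvec_true, tvec_true, K, dist)

# solvePnP recovers the camera pose in the world frame (rotation as a Rodrigues
# vector plus translation), i.e. an absolute pose rather than a relative one.
ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, dist, flags=cv2.SOLVEPNP_ITERATIVE)
R, _ = cv2.Rodrigues(rvec)
print("ok:", ok, "\nR:\n", R, "\nt:", tvec.ravel())
```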

To summarize:
  1. Make sure you are using undistorted image points.
  2. Try to understand the four-solution ambiguity and whether it is contributing to your problem.
  3. Run your algorithm with images from two different views to avoid a possibly degenerate / unstable case.
  4. Maybe try a scene with more depth variation.
  5. Consider buying the book referenced above - if you read and understand everything in chapter 9, I’ll be asking you questions.