Feed known translation vector to SolvePnP

Hi all,
I’ve been trying to get pose estimation from a series of frames extracted from a video with georeferenced landmarks.

I know the position of my camera with great precision (it has a differential GPS attached) so I’d like to use that information to feed SolvePnP the tvec in order to a) make the solve faster and b) get better estimation of the attitude of the camera.

I did some research and it seems that ‘fixing’ the tvec is not possible under the current implementation of SolvePnP. I’d be more than willing to dig into the code if someone could point me in the right direction.

Any suggestions are appreciated.

what are your 2d / 3d points, and what are you trying to achieve here ?
and what exactly do you mean by ‘fixing the tvec’ ?

My 2D points are the fiducial points as seen in each of my video frames (i.e., regular images) and 3D points are the fiducial markers laid on the ground. I’m just trying to estimate the orientation (pitch, yaw, roll) of the camera in space.

I understand that SolvePnP attempts to solve for the translation vector tvec and rotation vector rvec that define the pose of the camera, given the correspondence between the 3D and 2D points. I’d like SolvePnP not to solve for the translation vector (given that I already know it from the GPS coordinates of the camera at every frame/image). Instead I’d like to feed this known vector to the algorithm so that there are fewer variables to solve for and I get more reliable results.

Maybe you can try using solvePnP() with SOLVEPNP_ITERATIVE and useExtrinsicGuess = true parameters.
And test to see if you have better results / performance against using the default method (SOLVEPNP_EPNP method followed by SOLVEPNP_ITERATIVE).

SOLVEPNP_ITERATIVE combined with an extrinsic guess will use the input rvec and tvec as the starting point and refine them by non-linear minimization, instead of first computing an initial pose from scratch.

See the corresponding documentation: OpenCV: Camera Calibration and 3D Reconstruction

Maybe it’s easier to ignore solvePnp and find a special solution instead.
Could be involved:

  • Translate your object by the known translation.
  • For each of your individual markers there are 2 possible rotations (using 2 of the 3 rotation’s degrees of freedom) so that the marker projects to that pixel.
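One possible closed-form reading of this idea (not necessarily what was meant above) is to treat it as a direction-alignment problem, sometimes called Wahba's problem, and solve it with the Kabsch/SVD method. The sketch below assumes the camera centre is known in the same world frame as the markers and that the pixels are already undistorted; the function name and the sample values are hypothetical.

```python
import numpy as np

def rotation_from_known_position(world_pts, pixels, K, cam_center):
    # Unit directions from the camera centre to each marker (world frame).
    d = np.asarray(world_pts, dtype=float) - np.asarray(cam_center, dtype=float)
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    # Unit bearing vector of each pixel (camera frame).
    homog = np.column_stack([pixels, np.ones(len(pixels))])
    b = (np.linalg.inv(K) @ homog.T).T
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    # Kabsch: rotation R minimising sum ||b_i - R d_i||^2.
    U, _, Vt = np.linalg.svd(d.T @ b)
    s = np.sign(np.linalg.det(Vt.T @ U.T))
    return Vt.T @ np.diag([1.0, 1.0, s]) @ U.T

# Synthetic check with made-up values.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
cam_center = np.array([0.5, 0.5, -5.0])
world_pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0.2]], dtype=float)
a = 0.3  # rotation about the optical axis, radians
R_true = np.array([[np.cos(a), -np.sin(a), 0],
                   [np.sin(a),  np.cos(a), 0],
                   [0, 0, 1]])
x_cam = (R_true @ (world_pts - cam_center).T).T
pixels = (K @ (x_cam / x_cam[:, 2:3]).T).T[:, :2]

R_est = rotation_from_known_position(world_pts, pixels, K, cam_center)
```

This minimises an angular alignment error rather than the pixel reprojection error, so it is probably best seen as a quick initial estimate to refine afterwards.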

I wonder if solvePNP would actually give you better translation estimates than the differential GPS does. I suppose it depends on the scale of the world points you are imaging and the quality of your intrinsic calibration. I also wonder if the GPS coordinates you get back are relative to the nodal point of the camera, or some other point on the camera, and if not the nodal point, how far off is it? And how much would an incorrect value for the tvec influence the orientation computation?

Have you tried using solvePNP directly and comparing the tvec it generates with the one your DGPS gives you? Are the angles you are getting back not as accurate as you need?

I obviously don’t have any real understanding of what you are trying to achieve, but I do have to wonder if you might be solving a problem that is “in the noise” so to speak.


Thanks, Eduardo. I’ve tried useExtrinsicGuess, but I think all it does is use the supplied vectors as a starting point for the minimization, without guaranteeing that it will converge to the supplied values. Regarding SOLVEPNP_ITERATIVE, I had similar, albeit less precise, results.

@Micka Yeah, I’m considering an ad-hoc solution for the problem. The thing is that SolvePnP works reasonably well. I get very good rotation and translation vectors so I’m hesitant to make a solution of my own, but I do need better attitude estimation, so that’d be the last approach I’d try.

@Steve_in_Denver The differential GPS does give me better estimates than SolvePnP. I get cm-level accuracy with the GPS and errors of 2-5% with SolvePnP’s translation vector (which is still pretty reasonable, but unnecessary given that I already have position information). However, I do need exceptional attitude reconstruction, which is why I was thinking that constraining a few variables of the PnP matrices could help me get more precise attitude solutions. Considering that I already have the GPS data, I thought that feeding it in as the ground truth for the translation vectors could improve my attitude results.

I’m not sure what you mean by “2-5% error with SolvePnP’s translation vector”, but the question that comes to mind is “how much does a 2-5% translation error affect your attitude value?” Is it a 0.01 degree difference, and how does that magnitude of error propagate to the values you are calculating? And maybe you have done that already, and it’s significant enough to try to address. I just know that I’m prone to trying to chase out every last speck of error, and sometimes I need to zoom back out and look for more fertile ground (for example, maybe image processing improvements to get better feature localization.)
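For a rough order-of-magnitude feel: if the camera position is off sideways by e at a distance d from a landmark, the line of sight to that landmark changes by about atan(e/d). The sketch below uses made-up numbers (landmarks 50 m away) and assumes, purely for illustration, that "2-5%" means a fraction of that range.

```python
import numpy as np

def bearing_shift_deg(position_error_m, range_m):
    # Change in the line-of-sight angle to a landmark when the camera is
    # displaced sideways by position_error_m at a distance of range_m.
    return np.degrees(np.arctan2(position_error_m, range_m))

range_m = 50.0  # hypothetical distance to the landmarks
for err in (0.02, 1.0, 2.5):  # 2 cm (DGPS-like), 2 % and 5 % of the range
    print(f"{err:5.2f} m -> {bearing_shift_deg(err, range_m):.3f} deg")
```

This is only a single-landmark geometric bound, not a full error propagation through the pose solve, but it gives a sense of whether the effect is anywhere near the attitude accuracy being targeted.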

Having said all that, I’d probably be trying to do the same thing if I already had high quality data for the translation vector (again, be mindful of any offset from the GPS tvec and your camera nodal point, and account for it if you can.)

I think you could probably do what you want by looking at calib3d/src/calibration.cpp. In my version the good stuff starts at line 1145.

CvLevMarq solver( 6, count*2, cvTermCriteria(CV_TERMCRIT_EPS+CV_TERMCRIT_ITER,max_iter,FLT_EPSILON), true);

I think you could change the 6 to 3 (number of parameters to solve for) and the _param mat to also be 3x1, and then call cvProjectPoints2 with the tvec that you pass in (and whatever other bookkeeping changes are necessary).

If you are willing to do that, and it might be a little tedious to track down the way the parameters are getting shadowed, you should probably be able to get what you want.

Good luck and let us know your results.

I just realized this thread is tagged as Python, so if you aren’t able/willing to build from the C++ source, this isn’t going to work. If I were using Python, I’d probably punt on changing the source (unless you are already set up for that) and instead just set up an optimizer on your own. You could look at the C++ source and set it up the same way in Python, but with 3 parameters as described above.

This SE question contains an example that should get you running on setting up a LM solver:

I bet you could cook something up pretty quickly. (famous last words?)


The 5% error comes from comparing the reconstructed translation vector to the actual GPS measurements. I’m not sure how this may affect the attitude reconstruction, but I do know from messing around with multivariate optimization that removing variables from the problem should help me get a more accurate result (or at least a less degenerate one).

I’m attacking this on several fronts, simultaneously trying to improve target localization but also trying to minimize other potential sources of error. It did seem reasonable to use all available information (the reliable GPS position information, in this case) to try and improve the result.

Given that I’m irrevocably on Python, I may need to look into building the C++ source if it comes to it. I’ll need to assess which would take less time: building the source with minor changes or writing an optimizer of my own. I think someone already had a similar problem before in the old opencv forum (see here) and might have solved it by modifying the source (?).

Thanks for the help, Steve! (and thanks to everyone else)

I guess the uncertainty here is about what the values are referenced to. A 5% difference could be enormous (say, if the coordinate system origin is the center of the earth), or it could be insignificant. It might be more useful to think about the error in an absolute sense, or normalized to the scale of your scene. Or maybe expressed in terms of your GPS measurement uncertainty. But I digress.

If you don’t already have a process for building OpenCV and using it in Python (and don’t have other reasons to spin that up), I’d definitely start with the example I pointed to. If you don’t have experience using optimization techniques it might seem daunting, but I think you could fairly readily translate what is happening in the C++ code to use the scipy.optimize tools:

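import numpy as np
from scipy.optimize import least_squares

def myfun(x, a, b):
    # Residuals of a small two-equation system; a and b are extra parameters.
    return [a*x[0] - x[1] - np.exp(-x[0]), b*x[0] + 2*x[1] - np.exp(-x[1])]

x0 = [-5, -5]
a, b = 2, -1  # placeholder values; the original snippet was cut off here
sol = least_squares(myfun, x0, method='lm', ftol=1e-9, xtol=1e-9, args=(a, b))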

I’m not saying it would be a trivial exercise, but basically you replace myfun() with something that calls projectPoints (with your fixed tvec, and the rvec that is being optimized) and all the magic happens inside of the optimizer.

Definitely a useful tool to have in your toolbox.