Camera pose based on solvePnP(): weird results

Hey guys, I’m currently trying to estimate the pose of my camera using solvePnP() in OpenCV. I printed the resulting tvec’s. Can anybody explain to me why I obtain x = -5.80 and y = -5.02?

you could have just printed some random numbers. we can’t know how they happened.

what information would anyone require to be able to help you?


I already have the intrinsic parameters for the camera. Then, with those intrinsic parameters, I use solvePnP() to get the extrinsic parameters. Accordingly, the values I printed out should not be random.

… and z ~13.5

they’re not random at all, but make sense in a 3d world, where (0,0,0) is the origin. this is the center of the tripod, in camera coords, right?

please add your code and the intrinsics to your question

ps: maybe you could also try with an empty distortion vec and a synthetic camera mat derived from the image size (for reference, and to rule out that your intrinsics are the problem)
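something like this (untested sketch, reusing gray, objp and corners2 from the scripts below; the focal-length guess derived from the image width is an assumption, not your real intrinsics):

import numpy as np
import cv2 as cv

# synthetic camera matrix derived only from the image size:
# principal point at the image center, focal length guessed as the image width
h, w = gray.shape[:2]
f_guess = float(w)
K_synth = np.array([[f_guess, 0.0, w / 2.0],
                    [0.0, f_guess, h / 2.0],
                    [0.0, 0.0, 1.0]])
dist_zero = np.zeros(5)  # empty distortion vector

ret, rvec, tvec = cv.solvePnP(objp, corners2, K_synth, dist_zero)
print(tvec)  # compare with the tvec you get from your calibrated intrinsics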


I guess I shouldn’t have called them “random” but “arbitrary”… because you merely showed some output that could have been produced by god knows what program, but you didn’t show that program.

stating that you called some API is NOT ENOUGH. I hope you realize that. for your sake.

you are expected to know by now that you have to make your issue reproducible. we don’t have magical crystal balls that see things you keep a secret from us.

these types of questions absolutely require complete code and all the data that’s required to reproduce the outputs you don’t like.

if you disagree with anything I just said, I’m very sorry but you want help and these things are the bare minimum to help you… and always remember this is free help. you are free to make this process significantly harder than it needs to be. don’t be surprised by the results then.


Thank you very much for the answers. I will keep that in mind and follow your tips. I have therefore added below the code with which I obtained the results shown in the image. For the calibration of the camera, I used the following code:

import numpy as np
import cv2
import glob

# termination criteria for the sub-pixel corner refinement
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ..., (7,4,0)
a = 8
b = 5

objp = np.zeros((b*a, 3), np.float32)
objp[:, :2] = np.mgrid[0:a, 0:b].T.reshape(-1, 2)

# arrays to store object points and image points from all the images
objpoints = []  # 3d points in real world space
imgpoints = []  # 2d points in image plane

images = glob.glob('data1/*.jpg')

for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # find the chess board corners
    ret, corners = cv2.findChessboardCorners(gray, (a, b), None)
    print(fname, ret)

    # if found, add object points and refined image points
    if ret == True:
        objpoints.append(objp)

        corners2 = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        imgpoints.append(corners2)

        # draw and display the corners
        cv2.drawChessboardCorners(img, (a, b), corners2, ret)
        cv2.imshow('img', img)
        cv2.waitKey(500)

cv2.destroyAllWindows()

ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
np.savez('c2.npz', mtx=mtx, dist=dist, rvecs=rvecs, tvecs=tvecs)

# mean reprojection error over all calibration images
tot_error = 0
for i in range(len(objpoints)):
    imgpoints2, _ = cv2.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist)
    error = cv2.norm(imgpoints[i], imgpoints2, cv2.NORM_L2) / len(imgpoints2)
    tot_error = tot_error + error

print("total error: ", tot_error / len(objpoints))

As you can see, I saved my results with np.savez.

Next, I used my saved intrinsic parameters to obtain the extrinsic parameters with solvePnP().

import numpy as np
import cv2 as cv

# load the previously saved intrinsic parameters
with np.load('c1.npz') as X:
    mtx, dist, _, _ = [X[i] for i in ('mtx', 'dist', 'rvecs', 'tvecs')]
print(mtx)
print(dist)

def draw(img, corners, imgpts):
    # draw the projected 3D axes starting at the first detected corner
    corner = tuple(corners[0].ravel().astype(int))
    img = cv.line(img, corner, tuple(imgpts[0].ravel().astype(int)), (255, 0, 0), 5)
    img = cv.line(img, corner, tuple(imgpts[1].ravel().astype(int)), (0, 255, 0), 5)
    img = cv.line(img, corner, tuple(imgpts[2].ravel().astype(int)), (0, 0, 255), 5)
    return img

criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)
objp = np.zeros((8*5, 3), np.float32)
objp[:, :2] = np.mgrid[0:8, 0:5].T.reshape(-1, 2)
axis = np.float32([[3, 0, 0], [0, 3, 0], [0, 0, -3]]).reshape(-1, 3)

cap = cv.VideoCapture(1)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    img = np.copy(frame)
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    ret, corners = cv.findChessboardCorners(gray, (8, 5), None)
    if ret == True:
        corners2 = cv.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        # find the rotation and translation vectors
        ret, rvecs, tvecs = cv.solvePnP(objp, corners2, mtx, dist)
        # project the 3D axis points onto the image plane
        imgpts, jac = cv.projectPoints(axis, rvecs, tvecs, mtx, dist)
        img = draw(img, corners2, imgpts)
        cv.imshow('img', img)
        print(tvecs)
    cv.waitKey(1)
    #k = cv.waitKey(0) & 0xFF
    #if k == ord('s'):
    #    cv.imwrite(frame[:6] + '.png', img)

cap.release()
cv.destroyAllWindows()

With both of these scripts I obtain the results shown in the image. I just want to know if my results make sense.

Thanks

Hi, I have a similar question to yours.

I don’t know how to interpret the rvecs and tvecs.

My thread is here: Coordinate frame of 'r' and 't' vectors from cv2.calibrateCamera?

First I would like to know the coordinate frame of the rvecs and tvecs.

All the documentation says is that they are in the ‘calibration pattern coordinate frame’. Having an image of the axes and the (0,0,0) would be awesome.

(I guessed the (0,0,0) was at the first detected point of the pattern, with X and Y in OpenCV style → X towards the right and Y downwards, but if I plot the camera pose in this style it ends up way off from the real position).

I hope someone could help us.

EDIT: Hey! I saw your Z is going rightwards, why? Maybe we can figure it out together; people seem to perceive our question as alien.


Ok!

I got my camera pose right.

For each rvec and tvec you can get a camera pose.

Important: the rvecs are in axis-angle representation. And the coordinate frame has its origin at the first detected point of the pattern, with X right, Y down, Z away.
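If you want to actually see that frame, you can let OpenCV draw it onto the image, roughly like this (just a sketch; it assumes you already have mtx, dist and one rvec/tvec pair from solvePnP, and drawFrameAxes needs a reasonably recent OpenCV):

import cv2 as cv

# draws the pattern coordinate frame into the image:
# origin at the first detected corner, X in red, Y in green, Z in blue
# (the axis length 3 is in the same unit as objp, i.e. chessboard squares)
cv.drawFrameAxes(img, mtx, dist, rvec, tvec, 3)
cv.imshow('pattern frame', img)
cv.waitKey(0)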

You can get a bunch of camera poses (they should be extremely close to each other) and average them if you want to do a fine job.

The process can be applied backwards as well: from your camera frame, apply the inverse of each vector and you get the pattern placed in 3D space.
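For reference, that inversion is only a couple of lines (sketch, using one rvec/tvec pair from solvePnP; the units are chessboard squares because that is how objp was defined):

import cv2 as cv
import numpy as np

# solvePnP gives the board-to-camera transform: X_cam = R @ X_board + t
R, _ = cv.Rodrigues(rvec)            # axis-angle -> 3x3 rotation matrix
R_cam = R.T                          # camera orientation expressed in the board frame
cam_pos = -R.T @ tvec.reshape(3)     # camera position in the board frame
print("camera position in pattern coordinates:", cam_pos)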


Hey, nice to hear!

I’m still struggling with the subject of rvec and tvec.

If we use cv2.calibrateCamera(), depending on the number of images, we obtain a bunch of rvec’s and tvec’s… but are they all related to the “first detected point”, or rather to one fixed point on the checkerboard? That would have to be the case, wouldn’t it? Because otherwise I would get results with a different reference point each time, and that would make it impossible to get a precise camera pose.

Each rvec and tvec gives the rotation and translation from the calibration pattern (chessboard) to the camera (the camera pose) for the corresponding photo.

So if you have two photos, one with the calibration pattern strongly rotated and far away from the camera, its rvec and tvec will both be bigger than for another photo where you held the calibration pattern closer and at a shallower angle to the camera.
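A quick way to check that yourself is to look at the magnitudes of the vectors calibrateCamera returned (just a sketch using the rvecs/tvecs lists from the calibration script above):

import numpy as np

# one rvec/tvec pair per calibration photo:
# |tvec| ~ distance from the board origin to the camera (in squares),
# |rvec| ~ rotation angle in radians (axis-angle representation)
for i, (r, t) in enumerate(zip(rvecs, tvecs)):
    print("image", i,
          "distance ~ %.1f squares," % np.linalg.norm(t),
          "rotation ~ %.1f deg" % np.degrees(np.linalg.norm(r)))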

Let me know if you got it now :wink: