Center of the ArucoGridBoard

Hi

I would like to know if it is possible to obtain the POSE (RVEC and TVEC) in relation to the center of the Aruco GridBoard. Here is an example image:

here’s a part of my code:

# Check for camera calibration data
# choose the calibration file here for the desired resolution
calibration_file = "calibration_640_480.pckl"

if not os.path.exists('./calibration_files/'+str(calibration_file)):
    print("You need to calibrate the camera you'll be using. See calibration project directory for details.")
    exit()
else:
    f = open('calibration_files/'+str(calibration_file), 'rb')
    (cameraMatrix, distCoeffs, _, _) = pickle.load(f)
    f.close()
    if cameraMatrix is None or distCoeffs is None:
        print("Calibration issue. Remove ./calibration_files/"+str(calibration_file)+" and recalibrate your camera with CalibrateCamera.py.")
        exit()

# Constant parameters used in Aruco methods
ARUCO_PARAMETERS = aruco.DetectorParameters_create()
ARUCO_DICT = aruco.Dictionary_get(aruco.DICT_4X4_50)

# Create grid board object we're using in our stream
board = aruco.GridBoard_create(
        markersX=2,
        markersY=2,
        markerLength=0.0615,
        markerSeparation=0.0615,
        dictionary=ARUCO_DICT)

# Create vectors we'll be using for rotations and translations for postures
rvecs, tvecs = None, None

i = 0

while True:
    
    ret = True
    QueryImg = frame   # 'frame' comes from the camera capture (omitted here)
    
    if ret == True:
        # grayscale image
        #gray = cv2.cvtColor(QueryImg, cv2.COLOR_BGR2GRAY)
    
        # Detect Aruco markers
        corners, ids, rejectedImgPoints = aruco.detectMarkers(frame, ARUCO_DICT, parameters=ARUCO_PARAMETERS)
  
        # Refine detected markers
        # Eliminates markers not part of our board, adds missing markers to the board
        corners, ids, rejectedImgPoints, recoveredIds = aruco.refineDetectedMarkers(
                image = frame,
                board = board,
                detectedCorners = corners,
                detectedIds = ids,
                rejectedCorners = rejectedImgPoints,
                cameraMatrix = cameraMatrix,
                distCoeffs = distCoeffs)   

        # Outline all of the markers detected in our image
        QueryImg = aruco.drawDetectedMarkers(QueryImg, corners, borderColor=(0, 0, 255))

        # (cameraMatrix[0][2], cameraMatrix[1][2]) is the camera's principal point, roughly the image center
        QueryImg = cv2.circle(QueryImg, (int(cameraMatrix[0][2]), int(cameraMatrix[1][2])), 5, (255, 0, 0), 2)


        # Require at least 3 detected markers before estimating the board pose
        if ids is not None and len(ids) > 2:
            # Estimate the posture of the gridboard, which is a construction of 3D space based on the 2D video 
            pose, rvec, tvec = aruco.estimatePoseBoard(corners, ids, board, cameraMatrix, distCoeffs)
            if pose:
                # Draw the camera posture calculated from the gridboard
                QueryImg = aruco.drawAxis(QueryImg, cameraMatrix, distCoeffs, rvec, tvec, 0.3)
            i = i + 1
            print('Count: ', i)
            print('TVEC: ', tvec)
        # Display our image
        cv2.imshow('QueryImage', QueryImg)

    # Exit at the end of the video on the 'q' keypress
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    
    #----------------------------------------------------------------------------------------------------------------------------------------#

cv2.destroyAllWindows()

thanks

without going into details…

rvec,tvec are representations without redundancy (which is good) but they’re hard to calculate with.

build 4x4 matrices. they’re common in computer graphics and robotics. they’re easy to work with. they do contain redundancy because the upper left 3x3 part doesn’t just encode rotation but also scaling, shearing,… even when the transformation doesn’t actually use it.

whenever something needs rvec and tvec, calculate those from your 4x4 matrices.

cv::Rodrigues is the function that does the conversion.

thanks for your answer

let’s say that the values received for TVEC and RVEC are:

('TVEC:', array([[-0.12476651], [ 0.06477034], [ 0.69848284]]))
('RVEC:', array([[ 3.1121458 ], [ 0.01443866], [-0.04946228]]))

and applying rmat = cv2.Rodrigues(rvec)[0], I got:

('rmat:', array([[ 0.99945202,  0.00973551, -0.03163661],
                 [ 0.00881331, -0.99953591, -0.02915969],
                 [-0.03190581,  0.02886488, -0.99907399]]))

this would be the 4x4 matrix?

| rmat[0][0]   rmat[0][1]   rmat[0][2]   tvec[0][0] |
| rmat[1][0]   rmat[1][1]   rmat[1][2]   tvec[1][0] |
| rmat[2][0]   rmat[2][1]   rmat[2][2]   tvec[2][0] |
|      0            0            0           1      |

how can I get the new POSE relative to the central point of the GridBoard, which is at 0.09225 m on the X axis and 0.09225 m on the Y axis?

you’ll want to stick with one unit (meters, mm, cm, inches, …) so it’s consistent between calibration and use. I’ll go with centimeters because you use that in your graphic; the tvec looks more like you’ve done this in meters. anyway, those quantities are lengths, and units are just conversion factors.

yes, that matrix is right.
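a quick numpy sketch of that assembly, using the rmat and tvec values you printed (the name T_cam_board is just my own label for it):

```python
import numpy as np

# rotation matrix and translation vector from the post above (meters)
rmat = np.array([[ 0.99945202,  0.00973551, -0.03163661],
                 [ 0.00881331, -0.99953591, -0.02915969],
                 [-0.03190581,  0.02886488, -0.99907399]])
tvec = np.array([[-0.12476651], [ 0.06477034], [ 0.69848284]])

# assemble the 4x4 homogeneous pose matrix (board frame -> camera frame)
T_cam_board = np.eye(4)
T_cam_board[:3, :3] = rmat
T_cam_board[:3, 3] = tvec.ravel()
```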

\newcommand{\cm}[0]{~\text{cm}}

^\text{camera}T_\text{board} = \begin{pmatrix} R & t \\ 0 & 1 \\ \end{pmatrix}

that matrix will transform coordinates from the object frame (board coordinate frame) into the camera frame (I use “frame” and “space” synonymously). examples:

  • object origin will end up at t in camera space.
  • a point at x + 1.23\cm in object space (i.e. (1.23\cm, 0, 0)^\intercal) will be rotated to align with camera space (rotation around rvec by rvec’s magnitude, in object space), then be translated just the same.

a point would look like (x, y, z, 1)^\intercal. you can describe vectors too, they’ll have a 0 there. vectors in space would map to “vanishing points” in the picture.
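a tiny numpy illustration of that point/vector distinction (the translation values are made up):

```python
import numpy as np

# a pure translation by (1, 2, 3)
T = np.eye(4)
T[:3, 3] = [1.0, 2.0, 3.0]

point  = np.array([1.23, 0.0, 0.0, 1.0])  # w = 1: a point, it gets translated
vector = np.array([1.23, 0.0, 0.0, 0.0])  # w = 0: a direction, translation ignores it
```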

I see you’ve calculated the center already. let’s say it’s a simple translation, x + 9.225 \cm, y + 9.225 \cm, z + 0, in board/object space.

that transformation matrix (pose matrix) would be

^\text{board}T_\text{center} = \begin{pmatrix} 1 & 0 & 0 & 9.225 \\ 0 & 1 & 0 & 9.225 \\ 0 & 0 & 1 & 0.0 \\ 0 & 0 & 0 & 1 \\ \end{pmatrix}

(or 0.09225… in meters)

and it transforms from “center space” to “board space”, or equivalently, it describes the center pose in board space.
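in numpy (sticking with meters here, since your tvec looks like meters) that translation matrix could be written as:

```python
import numpy as np

# ^board T_center: a pure translation to the board center (meters)
T_board_center = np.eye(4)
T_board_center[:3, 3] = [0.09225, 0.09225, 0.0]

# the origin of center space lands at the board center in board space
center_in_board = T_board_center @ np.array([0.0, 0.0, 0.0, 1.0])
```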

to go the other way with a transformation, you use the inverse of the matrix (numpy.linalg.inv, or cv::Mat::inv in C++). you might need that if some transformation is easier to express in the other direction, or if you have two poses in camera space and want to know the transformation from one to the other. you don’t need it here though.
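for a rigid transform the inverse also has a closed form, R.T in the rotation part and -R.T @ t in the translation part; a quick check with a made-up transform:

```python
import numpy as np

# a made-up rigid transform: 90-degree rotation about Z plus a translation
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

# the general-purpose inverse works fine for 4x4 pose matrices
T_inv = np.linalg.inv(T)
```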

you want

^\text{camera}T_\text{center} ~=~ ^\text{camera}T_\text{board} ~ \cdot ~ ^\text{board}T_\text{center}

and then you calculate that (np.dot, np.matmul, or just the infix @ operator between numpy arrays).

you’ll note the convenience of the notation. the “output” space of the right matrix matches the “input” space of the left matrix. things are mapped from center space to board space to camera space.

if anything needs rvec,tvec instead, you take that matrix apart and convert the 3 \times 3 rotation part using Rodrigues (it goes both ways).

I haven’t tested this, and it’s midnight, so student beware, but that’s all there is to it.
