Question about ground truth data

I see that the dataset description talks about a rotation matrix and translation vector for the cameras taking the images, but I can't seem to find any information on where the ground-truth poses for each object in each image are stored. Are we meant to derive them from other, existing information?

Excuse my ignorance; I've never done 6DoF pose estimation before, so this may be well understood by others.

The GT poses of the objects are available for the training data in scene_gt_<cam_id>.json. The object poses for the test data need to be estimated; that is the goal of the challenge.
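
In case it helps: if the dataset follows the standard BOP annotation format, each scene_gt_<cam_id>.json file maps an image ID to a list of per-object entries, where cam_R_m2c is the model-to-camera rotation (9 floats, row-major) and cam_t_m2c is the translation in millimetres. A minimal sketch of loading it (the file path here is just an example):

```python
import json
import numpy as np

# Hypothetical path to one scene's GT file; adjust to your dataset layout.
with open("train/000001/scene_gt_cam1.json") as f:
    scene_gt = json.load(f)

# Keys are image IDs; each value lists the annotated objects in that image.
for im_id, annotations in scene_gt.items():
    for ann in annotations:
        obj_id = ann["obj_id"]
        # 3x3 model-to-camera rotation, stored row-major as 9 floats.
        R = np.array(ann["cam_R_m2c"]).reshape(3, 3)
        # Model-to-camera translation, in millimetres.
        t = np.array(ann["cam_t_m2c"]).reshape(3, 1)
        print(im_id, obj_id, R, t, sep="\n")
```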
