The methods are not exactly equivalent: we use robot consistency instead of ground-truth poses (see the details here: https://openaccess.thecvf.com/content/CVPR2024/papers/Kalra_Towards_Co-Evaluation_of_Cameras_HDR_and_Algorithms_for_Industrial-Grade_6DoF_CVPR_2024_paper.pdf). However, the metrics should correlate well, so you can run the BOP benchmark for quick iterations and submit checkpoints to BPC from time to time to make sure you are moving in the right direction.