we should first agree on some conventions. correct me if your situation uses different ones.
- everything right-handed
- camera/picture: x right, y “down”, z outwards into scene (into screen).
- robot base/table/world: x right, y away, z up.
- let’s say it’s like a desk.
you wrote: “calibrationMat holds the calibration between the camera and the robot base. it’s basically the base-to-camera 4x4 matrix, I believe - am I describing it incorrectly?”
unclear anyway: “between” doesn’t tell you the direction of the transformation, which is vital. “base-to-camera” describes the situation precisely.
I compared that matrix to its inverse… they are almost the same, except translation x and y are swapped. if you use such a matrix in the wrong sense, you get slightly wrong results instead of catastrophically wrong results, which makes the mistake easy to miss. I believe that is the case here: you used the matrix in the wrong sense, and you need a matrix inversion. more on that below.
I’ll assume you arrived at both base-to-cam and object-to-cam matrices by measuring patterns using the camera.
>>> print(base2cam) # or cam_base, (cam <- base)
[[-0.00809 -0.99959 -0.02754 -0.04999]
 [-0.9998   0.00759  0.01848 -0.56534]
 [-0.01826  0.02768 -0.99945  0.80148]
 [ 0.       0.       0.       1.     ]]
>>> print(np.linalg.inv(base2cam)) # (base <- cam)
[[-0.00809 -0.9998  -0.01826 -0.551  ]
 [-0.99959  0.00759  0.02768 -0.06787]
 [-0.02754  0.01848 -0.99945  0.81011]
 [ 0.       0.       0.       1.     ]]
calibrationMat (base-to-cam) reads like this: the camera sits 0.81 m above the table and 0.55 m to the left of the origin, facing straight down, and the camera’s right side (+x, the first column of the inverse) faces towards you (base -y).
is all that correct?
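in code, that reading is just the columns of the inverse (same numbers as the printout above, assuming the same print options; cam2base is just my name for the inverse):

>>> cam2base = np.linalg.inv(base2cam)  # (base <- cam): camera pose in base frame
>>> cam2base[:3, 3]                     # camera position: left of origin, 0.81 m up
array([-0.551  , -0.06787,  0.81011])
>>> cam2base[:3, 2]                     # optical axis (+z) in base frame: straight down
array([-0.01826,  0.02768, -0.99945])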
>>> print(obj2cam) # or cam_obj, (cam <- obj)
[[-0.67101 -0.0098  -0.74138  0.03284]
 [-0.74072 -0.03524  0.67089  0.03055]
 [-0.0327   0.99933  0.01639  0.79936]
 [ 0.       0.       0.       1.     ]]
the object sits 0.8 m from the camera and pretty close to the optical axis (~4.5 cm off it), so it’s likely on the table’s surface.
the object also has some rotation, around 45 degrees: its +x points to the top left (in the camera frame), +y points away from the camera, i.e. down into the table, and +z points to the bottom left in the camera frame.
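same exercise in code:

>>> obj2cam[:3, 3]             # ~0.8 m out along the optical axis
array([0.03284, 0.03055, 0.79936])
>>> np.hypot(*obj2cam[:2, 3])  # lateral offset from the optical axis, ~4.5 cm
0.04485...
>>> obj2cam[:3, 0]             # object +x in camera coords: up and to the left in the image
array([-0.67101, -0.74072, -0.0327 ])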
>>> obj2base = np.linalg.inv(base2cam) @ obj2cam # (base <- cam) * (cam <- obj)
>>> print(obj2base) # (base <- obj)
[[ 0.7466   0.01706 -0.66505 -0.5964 ]
 [ 0.66421  0.03719  0.74662 -0.07833]
 [ 0.03747 -0.99916  0.01644  0.01086]
 [ 0.       0.       0.       1.     ]]
this is the object pose in the base frame (obj-to-base). it sits on the table (z = +0.01086), 60 cm to the left and ~8 cm towards you. the rotation agrees too: +x points away and to the right along the table, +y straight into the table, +z away and to the left. that is consistent with the camera’s view onto the table.
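to put a number on “around 45 degrees”: the object’s +x lies almost flat in the table plane, so its yaw about base z is a single atan2 over the first column:

>>> np.degrees(np.arctan2(obj2base[1, 0], obj2base[0, 0]))
41.66...

so “around 45” is really about 42.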
your formulation for offsetMat confuses me greatly. it appears to be the product of a simple offset and cameraPose; however, let’s not do that. let’s build the transformation up from parts instead.
ok so you need x+0.05 on the object. let’s say that’s the target. so you have
>>> print(target2obj) # (obj <- target)
[[1.   0.   0.   0.05]
 [0.   1.   0.   0.  ]
 [0.   0.   1.   0.  ]
 [0.   0.   0.   1.  ]]
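building that one is a single line, no matrix gymnastics needed:

>>> target2obj = np.eye(4)   # identity rotation: target axes = object axes
>>> target2obj[0, 3] = 0.05  # 5 cm along the object's own +x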
you have obj2base from above. you want target2base. just multiply the offset from the right:
>>> target2base = obj2base @ target2obj # (base <- obj) * (obj <- target)
>>> print(target2base) # (base <- target)
[[ 0.7466   0.01706 -0.66505 -0.55907]
 [ 0.66421  0.03719  0.74662 -0.04512]
 [ 0.03747 -0.99916  0.01644  0.01273]
 [ 0.       0.       0.       1.     ]]
compare to obj2base: the target should be ~5 cm away from those coordinates, but since the object sits at an angle, it’s a little hard to see. we shifted from [-0.5964, -0.07833] to [-0.55907, -0.04512] on the table. the difference is [+0.03733, +0.03321], and its length is the desired 50 mm.
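or just let numpy confirm it:

>>> round(np.linalg.norm(target2base[:3, 3] - obj2base[:3, 3]), 5)
0.05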
I believe that matrix is what you need.
my variables could be chosen to reflect matrix multiplication better. one could say
>>> base_target = base_obj @ obj_target
where the first part (before the _) is the reference frame (the “output” frame) and the second part is the described frame (the “input” frame).
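if you want that convention enforced rather than just followed, a toy sketch (the class and names are mine, not any library’s):

>>> class Labeled:
...     """a 4x4 transform tagged (dst <- src), i.e. dst_src."""
...     def __init__(self, mat, dst, src):
...         self.mat, self.dst, self.src = np.asarray(mat), dst, src
...     def __matmul__(self, other):
...         # inner frames must cancel: (a <- b) @ (b <- c) = (a <- c)
...         assert self.src == other.dst, f"frame mismatch: {self.src} != {other.dst}"
...         return Labeled(self.mat @ other.mat, self.dst, other.src)
...
>>> base_target = Labeled(obj2base, "base", "obj") @ Labeled(target2obj, "obj", "target")
>>> base_target.dst, base_target.src
('base', 'target')

multiply in the wrong order and the assert fires immediately, instead of handing you the slightly-wrong numbers from the beginning of this answer.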