# How to find surface orientation (relative to camera) from a single depth image?

Suppose I have access to the image stream of a depth camera, and there is a flat surface (e.g. floor, tabletop, etc.) within the camera’s FoV at all times. How could one estimate the surface’s vertical and horizontal orientation (or better yet, the rotation matrix/vector) from the camera’s perspective?

I have access to the camera matrix, so I can select multiple points on the surface and reconstruct their 3D coordinates in the camera frame. But how do I use those coordinates to build a transformation matrix? (Mostly the rotation; translation is irrelevant, I’d just need to orient my reference frame orthogonally to the surface.) My main limitation seems to be that I do not have the corresponding coordinates of the surface points (in an object/external reference frame), and therefore can’t use `cv::estimateAffine3D`, `cv::findHomography`, or `cv::solvePnP`.
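For reference, the “unproject” step only needs the intrinsics and the depth value. A minimal sketch of the pinhole back-projection (the intrinsic values `fx, fy, cx, cy` here are hypothetical, for illustration only):

```cpp
#include <array>

// Back-project a pixel (u, v) with depth z into a 3D point in the
// camera frame, using the pinhole model. fx, fy, cx, cy come from
// the camera/intrinsic matrix; z keeps whatever unit the depth map uses.
std::array<double, 3> unproject(double u, double v, double z,
                                double fx, double fy, double cx, double cy) {
    return { (u - cx) * z / fx,
             (v - cy) * z / fy,
             z };
}
```

Calling `unproject` on several pixels sampled from the surface yields the camera-frame 3D points mentioned above.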

I’ve tried to estimate the plane equation using `cv::SVD`, but the resulting fit doesn’t seem to be very precise, and I am not sure how I could use the plane equation to find the affine transformation matrix.

You can create 3D vectors between neighboring points (or any points on the table surface). Then calculate the normal vector of the surface by multiplying the two vectors.
The normal vector is perpendicular to the surface, so it basically gives you its orientation.

```
P0 -> P1     P0, P1, P2 are points on the plane.
|            N = V(P0->P1) x V(P0->P2)   (V are the vectors between the points)
V
P2
```
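The diagram above can be sketched directly in code; this is just the cross product of the two in-plane difference vectors, normalized to unit length:

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3& a, const Vec3& b) {
    return { a[0] - b[0], a[1] - b[1], a[2] - b[2] };
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0] };
}

// N = (P1 - P0) x (P2 - P0), normalized. P0, P1, P2 must not be collinear,
// or the cross product is zero and the normalization divides by zero.
Vec3 surfaceNormal(const Vec3& p0, const Vec3& p1, const Vec3& p2) {
    Vec3 n = cross(sub(p1, p0), sub(p2, p0));
    double len = std::sqrt(n[0] * n[0] + n[1] * n[1] + n[2] * n[2]);
    return { n[0] / len, n[1] / len, n[2] / len };
}
```

In practice you would average (or least-squares fit) over many point triples, since single depth pixels are noisy.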

that’s the cross-product, not the dot-product, to be precise.

Thanks, that will certainly work. How would I, however, find an affine/homography transformation matrix between the camera frame and the surface normal?

Alternatively, I’ve been able to fit a plane (i.e. find the plane equation coefficients) to the 3D points in the camera frame, and was attempting to find the corresponding 3D coordinates in the surface frame to pass to `solvePnP`, but I am not exactly sure how to do it.
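For the rotation alone, `solvePnP` isn’t strictly needed: given a unit plane normal you can build a rotation matrix directly by taking the normal as the new z-axis, Gram-Schmidt-ing an arbitrary seed vector for the x-axis, and completing the basis with a cross product. A sketch under the assumption that `n` is already unit length (the in-plane axes are arbitrary up to a rotation about `n`, since a plane gives no yaw reference):

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // rows

Vec3 crossv(const Vec3& a, const Vec3& b) {
    return { a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0] };
}

double dotv(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Rows of the result are an orthonormal, right-handed basis whose third
// axis is the plane normal n (unit length). Multiplying camera-frame
// points by this matrix expresses them in a frame whose z-axis is
// orthogonal to the surface.
Mat3 rotationFromNormal(const Vec3& n) {
    // Seed with whichever camera axis is least parallel to n.
    Vec3 seed = std::fabs(n[0]) < 0.9 ? Vec3{1, 0, 0} : Vec3{0, 1, 0};
    // Gram-Schmidt: remove the component of seed along n, renormalize.
    double d = dotv(seed, n);
    Vec3 x = { seed[0] - d * n[0], seed[1] - d * n[1], seed[2] - d * n[2] };
    double len = std::sqrt(dotv(x, x));
    x = { x[0] / len, x[1] / len, x[2] / len };
    Vec3 y = crossv(n, x);  // unit length; completes the right-handed basis
    return { x, y, n };
}
```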

you can take camera projection matrices and transformation matrices and multiply them together. some matrices appear as inverses in that expression (“unproject” from one image plane “into” space, transform, reproject into the other image plane, without actual depth information). you’d want to work with 4x4 matrices, or work out the math to reduce it to 3x3 in the first place. a homography is nothing but 4x4 projection + transformation matrices, but the model points on the plane have no z (in model coordinates), so the matrix collapses from 4x4 to 3x3.
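The collapse described above can be written out: for model points on the plane z = 0, the full projection K·[R|t]·(X, Y, 0, 1)ᵀ reduces to H·(X, Y, 1)ᵀ with H = K·[r1 r2 t], where r1 and r2 are the first two columns of R. A sketch of that composition with plain 3×3 arrays (matrix layout here is an assumption, rows of doubles):

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;  // rows

// For plane points with model z = 0, K * [R | t] * (X, Y, 0, 1)^T
// collapses to H * (X, Y, 1)^T with H = K * [r1 r2 t], where r1, r2
// are the first two columns of the rotation R.
Mat3 homographyFromPose(const Mat3& K, const Mat3& R, const Vec3& t) {
    // Assemble the 3x3 matrix [r1 r2 t] column by column.
    Mat3 Rt{};
    for (int r = 0; r < 3; ++r) {
        Rt[r][0] = R[r][0];
        Rt[r][1] = R[r][1];
        Rt[r][2] = t[r];
    }
    // H = K * [r1 r2 t]
    Mat3 H{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            for (int k = 0; k < 3; ++k)
                H[r][c] += K[r][k] * Rt[k][c];
    return H;
}
```

Note this goes pose → homography; the asker’s problem is the reverse direction, for which OpenCV’s `cv::decomposeHomographyMat` exists once a homography is known.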

I’m not in a position to explain the math right now. it’s been a while since I did it myself.

I am not sure I understood what you are suggesting, but I don’t have any transformation matrix other than the camera’s projection/intrinsic matrix - the extrinsic transformation matrix is precisely what I am looking for. In particular, I would like to have the point coordinates parallel/orthogonal to the surface, but I only have the depth images and camera intrinsics to work with - I can “unproject” the points just fine, but lack any external reference to calibrate extrinsically. The camera frame is at an angle to the surface, which greatly hinders the application I had in mind, so I want to first