OpenCV uncalibrated stereo is not robust

I am building a stereo vision system based on this playlist https://www.youtube.com/playlist?list=PL2zRqk16wsdoCCLpou-dGo7QQNks1Ppzo and a few scripts from GitHub.

The setup is two Raspberry Pi cameras mounted 15-20 cm apart (they point in roughly the same direction but are not precisely aligned), with 14 ArUco markers in view.

Test description:

  • both cameras take a photo
  • I detect the 14x4=56 marker corners in both photos and match them
  • estimate the fundamental matrix with cv.findFundamentalMat() and derive the essential matrix from it using K
  • recover the right camera pose with cv.recoverPose() and triangulate the 3D positions
  • calculate the reprojection error
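
For reference, the relations these steps rely on can be written out as below. This is only a sketch under the usual pinhole assumptions: both cameras share the same intrinsic matrix $K$ (as the code in this post assumes), $x_l$ and $x_r$ are homogeneous pixel coordinates of one matched point, and $[t]_\times$ is the skew-symmetric matrix built from the translation.

```latex
\begin{aligned}
x_r^\top F \, x_l &= 0 \\
E &= K^\top F K \\
E &\simeq [t]_\times R, \qquad \lVert t \rVert = 1
\end{aligned}
```

Because $E$ determines $t$ only up to scale, cv.recoverPose() returns a unit-length translation, so the triangulated coordinates are expressed in units of the (unknown) baseline.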

I ran 20 such tests. Between tests the detected ArUco corner coordinates shift by 1-3 pixels, and this significantly affects the result: the reprojection error can be up to 25 pixels.
Even in the best tests, where the reprojection error is small, the 3D coordinates can differ by 30% (probably just scale) and the fundamental matrix is very different (some entries are the same, but others even differ in sign).
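
Since the translation returned by cv.recoverPose() has unit length, each reconstruction is only defined up to a global scale, which may account for the 30% difference. One way to make runs comparable is to rescale each reconstruction with a known length such as the marker side. A minimal sketch, assuming the points are stored as 4 consecutive corners per marker, ordered around the marker outline, and that the marker side is 0.05 m (a hypothetical value). `rescale_to_marker_size` is an illustrative helper, not an OpenCV function.

```python
import numpy as np

MARKER_SIDE_M = 0.05  # hypothetical marker side length in metres


def rescale_to_marker_size(points_3d, marker_side=MARKER_SIDE_M):
    """Scale an (N, 3) up-to-scale reconstruction to metric units.

    Assumes N is a multiple of 4 and that rows are grouped as 4 consecutive
    corners per marker, ordered around the marker outline.
    """
    corners = np.asarray(points_3d, dtype=float).reshape(-1, 4, 3)
    # Side vectors: corner i -> corner (i + 1) % 4 for every marker.
    sides = np.roll(corners, -1, axis=1) - corners
    side_lengths = np.linalg.norm(sides, axis=2)
    # The median is less sensitive to a few badly triangulated corners.
    scale = marker_side / np.median(side_lengths)
    return corners.reshape(-1, 3) * scale, scale


# Synthetic check: one square marker with side 2.0 in arbitrary units.
demo = np.array([[0, 0, 10], [2, 0, 10], [2, 2, 10], [0, 2, 10]], dtype=float)
scaled, s = rescale_to_marker_size(demo)
print(s)
```

With the code from this post it would be called on `points_3d.T[:, :3]`.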

Fundamental:
[[ 1.02404361e-08  3.94424497e-07  4.19287599e-04]
 [-6.06709285e-07  1.48665703e-07  1.10003616e-02]
 [-9.95485243e-04 -1.09427077e-02  1.00000000e+00]]

Right camera rotate/transform:
[[ 0.99724478 -0.06396331  0.03757064 -0.9957763 ]
 [ 0.06468712  0.9977364  -0.01837518  0.06679281]
 [-0.03631026  0.02075489  0.99912502 -0.06299437]]

1st point coords:
[ 1.5546314   1.4780324  14.06956527]

Left reproj error: avg=0.41166796925350146, max=1.2600882536660383
Right reproj error: avg=0.41589630606977707, max=1.270719461314834

============================

Fundamental:
[[-1.34937569e-08  1.43070416e-07  5.18306040e-04]
 [-6.07612283e-07  1.61790904e-07  1.35069458e-02]
 [-9.94700681e-04 -1.32211279e-02  1.00000000e+00]]

Right camera rotate/transform:
[[ 0.99615666 -0.06037661  0.06345532 -0.99876592]
 [ 0.06158883  0.99795128 -0.01732246  0.04633638]
 [-0.06227945  0.02116402  0.99783433 -0.01787665]]

1st point coords:
[ 1.14391187  1.08137137 10.35207307]

Left reproj error: avg=0.5628295566779377, max=1.888344504666136
Right reproj error: avg=0.5669055402134264, max=1.8905979485852762
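
A fundamental matrix is only defined up to a non-zero scale factor (the outputs above are normalized so that the bottom-right entry is 1), so a sign flip in an entry of magnitude 1e-8 may matter less than it looks. A sketch of one way to compare the two results above: convert both to essential matrices with the K from the code, normalize to unit Frobenius norm, align the sign, and measure what is left. `f_distance` is an illustrative helper name.

```python
import numpy as np

K = np.array([[3420 / 2, 0, 2304 / 2], [0, 3420 / 2, 1296 / 2], [0, 0, 1]], dtype=float)

F1 = np.array([[ 1.02404361e-08,  3.94424497e-07,  4.19287599e-04],
               [-6.06709285e-07,  1.48665703e-07,  1.10003616e-02],
               [-9.95485243e-04, -1.09427077e-02,  1.00000000e+00]])
F2 = np.array([[-1.34937569e-08,  1.43070416e-07,  5.18306040e-04],
               [-6.07612283e-07,  1.61790904e-07,  1.35069458e-02],
               [-9.94700681e-04, -1.32211279e-02,  1.00000000e+00]])


def f_distance(Ma, Mb):
    """Frobenius distance between two matrices after removing scale and sign."""
    a = Ma / np.linalg.norm(Ma)
    b = Mb / np.linalg.norm(Mb)
    # M and -M describe the same epipolar geometry, so pick the closer sign.
    if np.sum(a * b) < 0:
        b = -b
    return np.linalg.norm(a - b)


E1 = K.T @ F1 @ K
E2 = K.T @ F2 @ K
print("F distance:", f_distance(F1, F2))
print("E distance:", f_distance(E1, E2))
```

The distance is 0 for matrices that differ only by scale or sign and approaches sqrt(2) for unrelated ones. Comparing E rather than F avoids the result being dominated by the entries that multiply large pixel coordinates.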

How can I make the 3D positions more accurate and stable? I know I can work on the ArUco detection, e.g. subpixel corner refinement, but I find it strange that a 1-3 pixel shift in the 2D coordinates produces a 25 pixel reprojection error.

Output of one of the bad tests:

Fundamental:
[[-3.42932607e-07 -6.04614005e-06  3.68265220e-03]
 [ 5.10704843e-06 -9.20612123e-07  1.93773428e-02]
 [-3.68138030e-03 -1.76805591e-02  1.00000000e+00]]

Right camera rotate/transform:
[[ 0.9964071  -0.06994167  0.04776044 -0.92437783]
 [ 0.06946588  0.99751744  0.01155218 -0.02865833]
 [-0.04844985 -0.00819295  0.99879201  0.38040022]]

1st point coords:
[ 1.29595325  1.10718262 11.72617239]

Left reproj error: avg=15.227097016132225, max=28.05660685973718
Right reproj error: avg=14.848556273144384, max=27.03773318546886
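
To see whether a large error like this comes from a few bad correspondences or from the estimate as a whole, the epipolar residual of every match can be inspected. A sketch, assuming the OpenCV convention x_right^T F x_left = 0 for cv.findFundamentalMat(points_left, points_right, ...). `epipolar_distances` is an illustrative helper and the data at the bottom is synthetic.

```python
import numpy as np


def epipolar_distances(F, pts_left, pts_right):
    """Pixel distance from each right point to the epipolar line of its left match.

    pts_left and pts_right are (N, 2) arrays of matched pixel coordinates.
    """
    ones = np.ones((len(pts_left), 1))
    xl = np.hstack([pts_left, ones])   # (N, 3) homogeneous coordinates
    xr = np.hstack([pts_right, ones])
    lines = xl @ F.T                   # row i is the line F @ xl[i] in the right image
    numer = np.abs(np.sum(lines * xr, axis=1))
    denom = np.hypot(lines[:, 0], lines[:, 1])
    return numer / denom


# Synthetic example: pure horizontal translation, so epipolar lines are image rows.
F_demo = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, -1.0],
                   [0.0, 1.0, 0.0]])
left = np.array([[100.0, 50.0], [200.0, 80.0]])
right = np.array([[90.0, 50.0], [185.0, 83.0]])  # second match is 3 px off its row
print(epipolar_distances(F_demo, left, right))
```

With the code from this post it would be called as `epipolar_distances(F, points_left, points_right)`. The inlier mask returned by cv.findFundamentalMat() (discarded as `_` in the code) marks which matches RANSAC accepted at the 1 px threshold.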

Code:

import numpy as np
import cv2 as cv

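def get_3d_points(proj_left, proj_right, pts_left, pts_right):
    """Triangulate matched (N, 2) points; returns a 4xN homogeneous array with w = 1."""
    # cv.triangulatePoints expects 2xN point arrays, hence the transposes.
    p3d = cv.triangulatePoints(proj_left, proj_right, pts_left.T, pts_right.T)
    p3d /= p3d[3]
    return p3d


def calc_reproj_error(p3d, pts, proj_matrix):
    """Return (average, maximum) reprojection error in pixels for one camera."""
    pts = np.transpose(pts)

    # Project the 4xN homogeneous 3D points and dehomogenize to pixel coordinates.
    reprojected_pt = np.matmul(proj_matrix, p3d)
    reprojected_pt /= reprojected_pt[2]
    reprojected_pt = reprojected_pt[:2, :]
    # Per-point Euclidean distance between reprojected and measured positions.
    error = np.linalg.norm(reprojected_pt - pts, axis=0)
    return np.average(error), np.max(error)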

K = np.array([[3420 / 2, 0, 2304 / 2], [0, 3420 / 2, 1296 / 2], [0, 0, 1]], dtype=float)
COUNT_DATAPOINTS = 20


def main():
    for t in range(COUNT_DATAPOINTS):
        print('=' * 100)
        points_left = np.loadtxt(f"data/pts_{t}_left.txt", delimiter=",")
        points_right = np.loadtxt(f"data/pts_{t}_right.txt", delimiter=",")

        F, _ = cv.findFundamentalMat(points_left, points_right, cv.FM_RANSAC, 1, 0.99999)
        print("Fundamental:", F, sep='\n')
        E = np.matmul(np.matmul(np.transpose(K), F), K)
        print()

        Rt_left = np.array([[1,0,0,0], [0,1,0,0], [0,0,1,0]], dtype=float)
        Rt_right = np.empty((3,4), dtype=float)
        retval, R, t, mask = cv.recoverPose(E, points_left, points_right, K)
        Rt_right[:3, :3] = R
        Rt_right[:3, 3] = t.ravel()

        P_left = np.matmul(K, Rt_left)
        P_right = np.matmul(K, Rt_right)

        print("Right camera rotate/transform:", Rt_right, sep='\n')
        print()

        points_3d = get_3d_points(P_left, P_right, points_left, points_right)
        print("1st point coords:", points_3d.T[0, :3], sep='\n')
        print()

        left_avg, left_max = calc_reproj_error(points_3d, points_left, P_left)
        print(f"Left reproj error: avg={left_avg}, max={left_max}")
        right_avg, right_max = calc_reproj_error(points_3d, points_right, P_right)
        print(f"Right reproj error: avg={right_avg}, max={right_max}")


if __name__ == "__main__":
    main()

Full code and images (150 MB): https://www.dropbox.com/scl/fi/3c5k1f5zcwh0ga9reqk0b/stereo_vision.zip?rlkey=n5ohswgs3wzh07w73zp83fclb&st=g50mlbek&dl=0