First of all I hope this is the right forum for my question. As the topic suggests, I’m trying to measure lengths using a single camera. Do I need to calibrate my camera to achieve this?

From what I understand, undistorting the image won’t eliminate any projective errors. The only thing I can think of is correcting lens distortions, and for that, I would need to calibrate the camera. Is that correct?

calibration gives you two things: an estimate of the lens distortion, and an estimate of the focal length, which combines the field of view and the pixel size of the sensor in a single value (no relation to “focus”).

you could assume the lens distortion to be zero. you could also just calculate the focal length if you have both the size and distance of a known object (yard stick or whatever).

with a single camera (and the focal length), if you want to measure the size or distance of an object, you’ll still need to know either the distance or the size of the object, respectively. “similar triangles”, the math is called.
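
A minimal sketch of the similar-triangles relation described above, assuming an ideal pinhole camera with no lens distortion and an object that lies parallel to the sensor. The function names and the numbers are made up for illustration.

```python
def focal_length_px(size_px, size_mm, distance_mm):
    """Focal length in pixels from one object of known size at a known distance."""
    return size_px * distance_mm / size_mm


def object_size_mm(size_px, distance_mm, f_px):
    """Real size of an object from its pixel extent, given its distance."""
    return size_px * distance_mm / f_px


def object_distance_mm(size_px, size_mm, f_px):
    """Distance of an object from its pixel extent, given its real size."""
    return f_px * size_mm / size_px


# Hypothetical numbers: a 1000 mm stick at 2000 mm covers 1500 px.
f_px = focal_length_px(1500, 1000, 2000)
# Another object at the same distance covers 300 px.
size_mm = object_size_mm(300, 2000, f_px)
# The same object, if its size were known instead of its distance.
distance_mm = object_distance_mm(300, size_mm, f_px)
print(f_px, size_mm, distance_mm)
```

Either the size or the distance has to come from somewhere else; the focal length alone only fixes the ratio between the two.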

Thank you for your reply. I aim to achieve measurement accuracy within ±0.05 mm. Do you think that’s possible? Furthermore, regarding accuracy, I would guess that it’s crucial to account for lens distortion.

Incorporating a reference object of known length alongside the object of interest would be possible, and it provides a means to measure a second object of unknown length.
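
A rough sketch of the reference-object idea, under strong assumptions: both objects lie in the same plane, that plane is parallel to the sensor, and lens distortion is negligible or already corrected. The function names and pixel coordinates are hypothetical.

```python
import math


def mm_per_pixel(ref_p1, ref_p2, ref_length_mm):
    """Image scale from the two end points (in pixels) of a reference of known length."""
    return ref_length_mm / math.dist(ref_p1, ref_p2)


def measure_mm(p1, p2, scale_mm_per_px):
    """Length between two image points, valid only in the plane of the reference."""
    return math.dist(p1, p2) * scale_mm_per_px


# Hypothetical example: a 50 mm gauge block spans 1000 px.
scale = mm_per_pixel((100, 200), (1100, 200), 50.0)
# The unknown object spans 500 px in the same plane.
length_mm = measure_mm((300, 400), (600, 800), scale)
print(scale, length_mm)
```

Under these assumptions no focal length is needed, because the reference already supplies the scale. Any tilt between the plane and the sensor, or any height difference between the two objects, breaks the single-scale assumption.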

Nevertheless regarding calibration: Do I even need the focal length in order to measure the size of an object, with a reference, as you described it? Do I need any of the calibration results at all? (Maybe except lens distortion.)

depends. with a telescope, probably not. with a microscope, I’d say it’s likely. can’t say, don’t have the details on your setup, or your planned setup.

I’d recommend browsing the corresponding chapter in some book on the matter. hartley & zisserman is one typical reference book. or maybe it was, idk what’s hot these days.

maybe you’d like to do whatever you do with a flatbed scanner, or a telecentric lens.

I have a Sony Alpha 7R with a macro lens. I’ve already gone through Hartley & Zisserman’s work, and I highly recommend it because it feels very comprehensive. It covers almost everything. While it serves as a solid theoretical reference, in my opinion it lacks real-world applications. The examples in the book are interesting, but they often lack more in-depth explanations.

Going back to my initial question, my current assumption is that I don’t necessarily need to perform a full camera calibration. That said, I’m skeptical about relying on algorithms to accurately capture the real distortions, especially when accounting for lens distortion. In Zhang’s “A Flexible New Technique for Camera Calibration”, the Maximum-Likelihood Method aims to improve the overall camera model rather than focus on obtaining the most precise distortion coefficients. The distortion model adds two degrees of freedom to enhance the model as a whole, rather than fine-tuning distortion accuracy. (In other words, the optimization finds locally optimal values for the entire set of model parameters, including the focal lengths, skew, principal point, distortion coefficients, etc. However, it doesn’t guarantee that the distortion coefficients found are optimal. Correct me if I’m wrong.)

that was several wild claims. once more, I’ll try to comment. I’m not going to argue and “force” you out of your notions. it might look like I’m giving in and “granting” you your notions. you should never ever construe that as “winning” or “being right”. just a disclaimer. I don’t know you. I know people. people usually disappoint me when it comes to communication and reasoning. I have no reason to invest myself into this thread or you personally. I’m just here to pass the time solving puzzles.

I notice that I’m having to interpret your words. you spoke fairly non-specifically. I suspect that you used some LLM to generate/improve your recent post. if you did, you shouldn’t do that. it often generates good-sounding but meaningless/empty phrases.

two issues with that.

(1) I don’t remember that you justified that. “full” depends on the camera model you consider, which is the model of the lens and projection. sure, if the model is complex enough, your application is simple enough, your real camera is good enough, you can “simplify” the model because it’s more than you need. that’s all speaking generally. you have not provided specifics.

(2) there is nothing but algorithms, and lens models. what else do you think you could “rely” on? here’s where I have to interpret, because the statement is hollow. it’s something ChatGPT and other “verbally strong” entities might emit.

that sounds absurd. I hope you see that. what else, if not “algorithms”, do you propose to obtain a useful model? there is nothing but algorithms to do this. those algorithms rely on measurements, which is what you get when you wave a calibration pattern in front of the camera. there are no algorithms that can compensate for user error. you need to understand any calibration algorithm in order to know how to generate the measurements, and how not to generate them. Calibration Best Practices – calib.io is one nice summary of what people do wrong that gives them junk calibrations.

aside from the 3x3 projection matrix (focal length and optical center), there is nothing but the distortion coefficients. those distortion coefficients are the lens model. they model how the lens itself distorts the picture. you make it sound as if they’re minor and irrelevant.

you say that like some lens models or estimation algorithms do not have the goal of capturing a good estimation.

that too is wild. adding a degree of freedom to a model causes more effort to its estimation. it does not inherently improve the model. it just allows it to fit one more dimension of complexity.

all models are fitted by optimizing for reprojection error, i.e. image space accuracy. you sound like you’re critiquing the desire to have “accurate” model parameters. sure. those are secondary, but they directly and uniquely determine the accuracy of the model.

there is no “local/global”. I can’t make any sense out of the statement that something could ensure “the parameters” for “the parameters”, not even if those are qualified with “local optimal” and “model”. when I say “I can’t make any sense out of” that, please take that to mean the statement has an issue, not that I have an issue.

I am here to learn and to have a well-founded discussion, and I value your insights and the time you put into helping me out.
You are right that my text is misleading. I tried to keep it short, but I guess I made a lot of “mental jumps”.

On (1):
Let me define the terms I used with my intention:
With “full” I mean calibrating every parameter. I am talking about the “standard” model with the two focal lengths, the skew parameter and the principal point, plus two parameters for radial lens distortion.
My intention with the sentence was that I think I don’t need f_x, f_y, s, x_0 and y_0 for my intended length measurements. My thought is that I only need to undistort the image (correct it for lens distortion) in order to measure correctly. So that’s what I meant with the sentence “no need for full calibration”.

On (2):
Maybe I should have added “on the algorithms stated in Zhang’s paper”. Generally speaking, without trusting algorithms I couldn’t even write this text. But I get your point that the sentence was misleading. (Furthermore, there are more distortions, coming from sensor misalignment, like tangential distortion etc. That’s why I clarified that my case is only radial lens distortion.) I went through a lot of math and I would say that I have reached a decent understanding of what camera calibration is doing and how it works.

Let me clarify my statement on algorithms:
I said: “the Maximum-Likelihood Method aims to improve the overall camera model”, where I guess you would agree. So indeed they find THE optimum, but only around the camera parameters you obtain from the least squares method, see page 6, equation (8) in Zhang’s paper. This “guess” is used as the starting point for the maximum likelihood estimation. (“It requires an initial guess of A”, Zhang, bottom of page 6.) The maximum likelihood method minimizes the error of the model (minimize: observed point - Model(real world point)). (Yes, indeed it searches for the mean of the normal distribution for the model parameters, aiming to maximize the likelihood of the observations.) Solving is done via the Levenberg-Marquardt algorithm. That part is based on articles/papers/Wikipedia I have read. But still, please correct me if I got anything wrong.

Yes I totally agree on that.
My intention with my sentence was that the added dimensions aim to improve the model. It’s not guaranteed, though.

In general I am afraid of what I guess is called overfitting. The parameters are not independent of each other. You see that when you set the distortion coefficients to zero, and then for a second calibration you allow distortion modeling. All calculated calibration parameters change! To clarify what I am getting at: if the algorithm had found the “right” focal length, it wouldn’t change it after I allow for distortion modeling. It should stay fixed. But it does change. That said, the solver optimizes all parameters together in order to obtain the optimal calibration parameters (measured through the reprojection error).

Now regarding the part where it is more my “opinion” or a guess: As far as my understanding goes, the parameters influence each other. The radial distortion parameters seem to have a similar effect to the two focal lengths. I guess that is because radial distortion models pixels far from the center as being stretched out. In order to reduce the mean reprojection error without accounting for lens distortion, the algorithm calculates increased pixel sizes.
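
The coupling described above can be reproduced with a toy model. This is a simplified 1D sketch, not Zhang’s method or OpenCV’s solver: it assumes the image radius follows r_obs = f * r * (1 + k1 * r^2) for a normalized radius r, and fits it once with the distortion forced to zero and once with k1 free. All names and numbers are hypothetical.

```python
def fit_focal_only(r, r_obs):
    """Least-squares fit of r_obs = f * r, i.e. distortion forced to zero."""
    return sum(o * x for x, o in zip(r, r_obs)) / sum(x * x for x in r)


def fit_focal_and_k1(r, r_obs):
    """Least-squares fit of r_obs = a*r + b*r^3 with a = f and b = f*k1."""
    s2 = sum(x ** 2 for x in r)
    s4 = sum(x ** 4 for x in r)
    s6 = sum(x ** 6 for x in r)
    t1 = sum(o * x for x, o in zip(r, r_obs))
    t3 = sum(o * x ** 3 for x, o in zip(r, r_obs))
    det = s2 * s6 - s4 * s4          # determinant of the 2x2 normal equations
    a = (t1 * s6 - s4 * t3) / det
    b = (s2 * t3 - s4 * t1) / det
    return a, b / a


F_TRUE, K1_TRUE = 3000.0, -0.1       # hypothetical ground truth
r = [0.05 * i for i in range(1, 11)]  # normalized radii 0.05 .. 0.5
r_obs = [F_TRUE * x * (1 + K1_TRUE * x * x) for x in r]

f_only = fit_focal_only(r, r_obs)
f_joint, k1_joint = fit_focal_and_k1(r, r_obs)
print(f_only, f_joint, k1_joint)
```

In this toy case the distortion-free fit lands on a focal length that differs from the true one, because the single scale factor is the only parameter left to absorb the radial term. Once k1 is free, the focal length moves to the true value. A focal length that changes when distortion is enabled is therefore what a correct least-squares fit would do with a mis-specified model, and is not by itself a sign of overfitting.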

But going back to basics, I think this is heading into some very high-level math, where I don’t know if I would be able to understand it even if someone tried to explain it.
Furthermore: See the discussion about geometric and algebraic error using different solvers in Multiple View Geometry in Computer Vision by Hartley and Zisserman on page 180ff. And I am not sure if solving for the geometric error (using Levenberg–Marquardt) finds the best distortion parameters. So my idea is kind of an analogy to why least squares only finds the best solution for the calibration matrix in an algebraic sense and not in a geometric one!

Best regards

Marius

P.S.: I use LLMs only for grammar checking and sentence structure. (Just for clarification, not as a justification.) This time, because it seemed like you prefer it, I didn’t use them. But be warned of my grammar!

P.S.2.: A discussion is never about winning. You would call it a fight then instead. But from your wording I guess you went through a lot of fights in this forum, and I am sorry to hear that.

My understanding is that f_x and f_y are allowed to vary independently because not all cameras use square pixels, and having different f_x and f_y parameters allows the model to fit the data. The vast majority of modern cameras do use square pixels, however, so it is common (and recommended) to use the CALIB_FIX_ASPECT_RATIO parameter when calibrating (this optimizes a single focal length parameter / doesn’t allow them to vary independently.)

You probably don’t need a skew parameter. See Hartley/Zisserman 6.2.4 for a discussion on skew.

I take x_0 and y_0 to be the image center (cx, cy). Those are necessary in order to handle the lens distortion because the distortion model is radially symmetric about the optical center of the image.

The camera calibration process in OpenCV accepts flags that let you control which parameters it optimizes (for example CALIB_FIX_ASPECT_RATIO). You can supply a camera matrix to the calibration process which will serve as a starting point for the optimization (with the flag CALIB_USE_INTRINSIC_GUESS), but by passing the correct flags you can also force the optimization algorithm to use the parameters you provide. For example CALIB_FIX_PRINCIPAL_POINT will not optimize the cx,cy parameters.

(there are many other flags, see the opencv documentation to learn about all of them)

Assuming you are using the OpenCV calibration algorithm to get your radial distortion parameters, I would suggest letting it optimize your focal length and image center, too. If you think you know your image center and focal length better than what the optimizer will generate, then you are free to provide a camera matrix (along with the appropriate flags) to force the optimization to use your input. I would suggest trying it both ways and comparing the reprojection error - I suspect you will get better results letting OpenCV optimize all of the relevant parameters.

Depending on the nature of your lens distortion you might be better served using something other than the basic k1 k2 model. I find the rational model to behave very well for high-distortion lenses with the caveat that you have to provide data (chessboard corners) that cover the parts of the image you want to undistort accurately. It (the rational model) can have pretty wild behavior when extrapolating beyond the input data coverage.
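
To illustrate the extrapolation caveat, here is a small sketch of the radial scaling factor in the two models, following the form OpenCV documents: a polynomial in r^2 for the basic model, and a ratio of two such polynomials for the rational model. The coefficients below are invented for the demonstration and do not come from any real lens.

```python
def radial_factor_poly(r2, k1, k2, k3=0.0):
    """Radial scaling of the basic model, with r2 the squared normalized radius."""
    return 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3


def radial_factor_rational(r2, k1, k2, k3, k4, k5, k6):
    """Radial scaling of the rational model: polynomial over polynomial."""
    num = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    den = 1.0 + k4 * r2 + k5 * r2 ** 2 + k6 * r2 ** 3
    return num / den


# Hypothetical coefficients: well behaved where the calibration data was,
# but the denominator approaches zero at r2 = 2.
coeffs = dict(k1=-0.3, k2=0.0, k3=0.0, k4=-0.5, k5=0.0, k6=0.0)
inside = radial_factor_rational(0.25, **coeffs)   # within the covered region
outside = radial_factor_rational(1.99, **coeffs)  # far outside of it
print(inside, outside)
```

Near a zero of the denominator the factor grows without bound, which is why coverage of the corners with calibration points matters more for this model than for the plain k1 k2 model.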

As for your original question, my thoughts.

I’m assuming you want to measure lengths where all points are located in the same world plane. If you are trying to do something different, then you’ll need a different approach to what I suggest here.

I would do the following:
Calibrate the camera intrinsics using OpenCV’s calibrateCamera function. I’d fix the aspect ratio (lock the two focal lengths) unless I had reason to think the camera used non-square pixels. I’d evaluate the distortion of the lens and pick the appropriate distortion model. K1,K2 model might be just fine, but the rational model is there if you need it.

I’d use the Charuco calibration target and associated functions because you don’t have to see the full calibration target in the input images - this makes it much easier to get measurements near the edges / corners of the image.

Once the intrinsics are calibrated I would calculate a homography that maps undistorted image coordinates to 2D plane coordinates. Once you have this, calculating distances between image points becomes pretty straightforward.
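
A sketch of the homography step in plain NumPy, assuming the image points have already been undistorted (in OpenCV that would typically be `cv2.undistortPoints`, and `cv2.findHomography` could replace the hand-written fit). The helper names, the synthetic homography and the 100 x 60 mm reference rectangle are assumptions for illustration.

```python
import numpy as np


def fit_homography(src_pts, dst_pts):
    """Least-squares homography (h33 fixed to 1) mapping src_pts to dst_pts."""
    A, b = [], []
    for (x, y), (X, Y) in zip(src_pts, dst_pts):
        A.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        b.append(X)
        A.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
        b.append(Y)
    h, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, pt):
    """Map one 2D point through H, including the perspective division."""
    v = H @ np.array([pt[0], pt[1], 1.0])
    return v[:2] / v[2]


def length_on_plane(H_img_to_plane, p1_px, p2_px):
    """Distance in plane units (here mm) between two undistorted image points."""
    a = apply_homography(H_img_to_plane, p1_px)
    b = apply_homography(H_img_to_plane, p2_px)
    return float(np.linalg.norm(a - b))


# Synthetic check: a made-up plane-to-image homography stands in for the camera.
H_true = np.array([[10.0, 1.0, 200.0], [0.5, 9.0, 150.0], [1e-4, 2e-4, 1.0]])
ref_plane = [(0.0, 0.0), (100.0, 0.0), (100.0, 60.0), (0.0, 60.0)]  # mm
ref_image = [apply_homography(H_true, p) for p in ref_plane]        # px

H_img_to_plane = fit_homography(ref_image, ref_plane)
a_px = apply_homography(H_true, (20.0, 10.0))
b_px = apply_homography(H_true, (80.0, 50.0))
measured_mm = length_on_plane(H_img_to_plane, a_px, b_px)
print(measured_mm)  # the true plane distance is hypot(60, 40) mm
```

Four reference points are the minimum. Using more points spread over the measurement area, and normalizing coordinates before the fit, should make the estimate less sensitive to corner detection noise.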

Whether or not you can achieve the accuracy you desire depends on a number of factors, but with the correct optics and sensor I am confident you can do it - but that might mean having a smaller FOV than you want, etc.
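
A back-of-the-envelope sketch of the FOV trade-off mentioned above. It assumes the 7360 px image width of the Sony Alpha 7R and treats "how many pixels one tolerance step should span" as a free, hypothetical design parameter; it ignores lens sharpness, noise and any sub-pixel edge localization.

```python
IMAGE_WIDTH_PX = 7360   # assumed horizontal resolution of the Sony Alpha 7R
TOLERANCE_MM = 0.05     # target accuracy from the question


def pixel_footprint_mm(fov_width_mm, image_width_px=IMAGE_WIDTH_PX):
    """Size of one pixel projected onto the object plane."""
    return fov_width_mm / image_width_px


def max_fov_width_mm(pixels_per_tolerance, tolerance_mm=TOLERANCE_MM,
                     image_width_px=IMAGE_WIDTH_PX):
    """Widest field of view at which the tolerance still spans the given pixel count."""
    return image_width_px * tolerance_mm / pixels_per_tolerance


fov_one_px = max_fov_width_mm(1.0)    # tolerance equals one pixel
fov_five_px = max_fov_width_mm(5.0)   # tolerance spans five pixels
print(fov_one_px, fov_five_px, pixel_footprint_mm(fov_five_px))
```

With these numbers the field of view has to stay well under half a meter even in the most optimistic case, which is the sense in which the required accuracy may force a smaller FOV.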