Do I even need camera calibration for length measurements with one camera?

Hello,

First of all I hope this is the right forum for my question. As the topic suggests, I’m trying to measure lengths using a single camera. Do I need to calibrate my camera to achieve this?

From what I understand, undistorting the image won’t eliminate any projective errors. The only thing I can think of is correcting lens distortions, and for that, I would need to calibrate the camera. Is that correct?

Best regards,

Marius

calibration gives you two things: an estimate of the lens distortion, and an estimate of the focal length, which combines the field of view and the pixel size of the sensor in a single value (no relation to “focus”).

you could assume the lens distortion to be zero. you could also just calculate the focal length if you have both the size and distance of a known object (yard stick or whatever).

with a single camera (and the focal length), if you want to measure the size or distance of an object, you’ll still need to know either the distance or the size of the object, respectively. “similar triangles”, the math is called.
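for illustration, the similar-triangles relation in code. all numbers below are made-up placeholders, not from any real setup:

```python
# object_size / distance = pixel_extent / focal_length_in_pixels
focal_px = 8000.0      # focal length in pixels (placeholder; from calibration or a known object)
object_px = 1200.0     # extent of the object in the image, in pixels (placeholder)
distance_mm = 500.0    # known camera-to-object distance (placeholder)

object_size_mm = object_px * distance_mm / focal_px   # -> 75.0 mm
# conversely, with a known object size you could solve for the distance:
# distance_mm = object_size_mm * focal_px / object_px
```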


Thank you for your reply. I aim to achieve measurement accuracy within ±0.05 mm. Do you think that's possible? Furthermore, regarding the accuracy, I would guess that it's crucial to consider lens distortions.

Incorporating a reference object of known length alongside the object of interest would be possible and provides a means to measure a second object of unknown length.

Nevertheless, regarding calibration: Do I even need the focal length in order to measure the size of an object with a reference, as you described? Do I need any of the calibration results at all? (Except maybe the lens distortion.)

Thank you in advance for your support and time.

Best regards

Marius

depends. with a telescope, probably not. with a microscope, I’d say it’s likely. can’t say, don’t have the details on your setup, or your planned setup.

I’d recommend browsing the corresponding chapter in some book on the matter. hartley & zisserman is one typical reference book. or maybe it was, idk what’s hot these days.

maybe you’d like to do whatever you do with a flatbed scanner, or a telecentric lens.


Thank you for your reply.

I have a Sony Alpha 7R with a macro lens. I've already gone through Hartley & Zisserman's work, and I highly recommend it because it feels very comprehensive. It covers almost everything. While it serves as a solid theoretical reference, in my opinion it lacks real-world applications. The examples in the book are interesting, but they often lack more in-depth explanations.

Going back to my initial question, my current assumption is that I don't necessarily need to perform a full camera calibration. That said, I'm skeptical about relying on algorithms to accurately capture the real distortions, especially when accounting for lens distortion. In Zhang's "A Flexible New Technique for Camera Calibration", the Maximum-Likelihood Method aims to improve the overall camera model rather than focus on obtaining the most precise distortion coefficients. The distortion model adds two degrees of freedom to enhance the model as a whole, rather than fine-tuning distortion accuracy. (In other words, the optimization finds locally optimal values for the entire set of model parameters, including the focal lengths, skew, principal point, distortion coefficients, etc. However, it doesn't guarantee that the distortion coefficients found are optimal. Correct me if I'm wrong.)

that was several wild claims. once more, I’ll try to comment. I’m not going to argue and “force” you out of your notions. it might look like I’m giving in and “granting” you your notions. you should never ever construe that as “winning” or “being right”. just a disclaimer. I don’t know you. I know people. people usually disappoint me when it comes to communication and reasoning. I have no reason to invest myself into this thread or you personally. I’m just here to pass the time solving puzzles.

I notice that I’m having to interpret your words. you spoke fairly non-specifically. I suspect that you used some LLM to generate/improve your recent post. if you did, you shouldn’t do that. it often generates good-sounding but meaningless/empty phrases.

two issues with that.

(1) I don’t remember that you justified that. “full” depends on the camera model you consider, which is the model of the lens and projection. sure, if the model is complex enough, your application is simple enough, your real camera is good enough, you can “simplify” the model because it’s more than you need. that’s all speaking generally. you have not provided specifics.

(2) there is nothing but algorithms, and lens models. what else do you think you could “rely” on? here’s where I have to interpret, because the statement is hollow. it’s something ChatGPT and other “verbally strong” entities might emit.

that sounds absurd. I hope you see that. what else, if not “algorithms”, do you propose to obtain a useful model? there is nothing but algorithms to do this. those algorithms rely on measurements, which is what you get when you wave a calibration pattern in front of the camera. there are no algorithms that can compensate for user error. you need to understand any calibration algorithm in order to know how to generate the measurements, and how not to generate them. Calibration Best Practices – calib.io is one nice summary of what people do wrong that gives them junk calibrations.

aside from the 3x3 projection matrix (focal length and optical center), there is nothing but the distortion coefficients. those distortion coefficients are the lens model. they model how the lens itself distorts the picture. you make it sound as if they’re minor and irrelevant.

you say that like some lens models or estimation algorithms do not have the goal of capturing a good estimation.

that too is wild. adding a degree of freedom to a model makes its estimation harder. it does not inherently improve the model. it just allows it to fit one more dimension of complexity.

all models are fitted by optimizing for reprojection error, i.e. image space accuracy. you sound like you’re critiquing the desire to have “accurate” model parameters. sure. those are secondary, but they directly and uniquely determine the accuracy of the model.

there is no “local/global”. I can’t make any sense out of the statement that something could ensure “the parameters” for “the parameters”, not even if those are qualified with “local optimal” and “model”. when I say “I can’t make any sense out of” that, please take that to mean the statement has an issue, not that I have an issue.


Hello crackwitz,

I am here to learn and improve based on the discussion, and I value your insights and the time you put into helping me out.
You are right that my text is misleading. I tried to keep it short, but I guess I made a lot of "mental jumps".

On (1):
Let me define the terms I used and what I intended by them:
By "full" I refer to calibrating every parameter. I am talking about the "standard" model with the two focal lengths, the skew parameter, and the principal point, plus two parameters for radial lens distortion.
My intention with the sentence was that I think I don't need f_x, f_y, s, x_0 and y_0 for my intended length measurements. My thought is that I only need to undistort the image for lens distortion in order to measure correctly. So that's what I meant with the sentence "no need for full calibration".
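For reference, this is the intrinsic matrix I am talking about (the numbers are just placeholders, not calibration results):

```python
import numpy as np

fx, fy = 8000.0, 8000.0   # focal lengths in pixels (placeholder)
s = 0.0                    # skew, usually negligible on modern sensors
x0, y0 = 4752.0, 3168.0    # principal point (placeholder: the numerical image center)

K = np.array([[fx,  s, x0],
              [0., fy, y0],
              [0., 0., 1.]])
```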

On (2):
Maybe I should have added "based on the algorithms stated in Zhang's paper". Generally speaking, without trusting algorithms I couldn't even write this text :wink: But I get your point that that sentence was misleading. (Furthermore, there are more distortions coming from sensor misalignment, like tangential distortion etc. That's why I clarified that my case concerns only radial lens distortion.) I went through a lot of math and I would say I reached a decent understanding of what camera calibration is doing and how it works.

Let me clarify my statement on algorithms:
I said: "the Maximum-Likelihood Method aims to improve the overall camera model", where I guess you would agree. So indeed it finds THE optimum, but only around the camera parameters you obtain from the least-squares method, see page 6, equation (8) in Zhang's paper. This "guess" is used as the starting point for the maximum-likelihood estimation. ("It requires an initial guess of A", Zhang, page 6, bottom of the page.) The maximum-likelihood method minimizes the error of the model (minimize: observed point - Model(real-world point)). (((Yes, indeed it searches for the mean of the normal distribution of the model parameters, aiming to maximize the likelihood of the observed observations occurring.))) Solving is done via the Levenberg-Marquardt algorithm. That part is based on articles/papers/Wikipedia I have read. But still, please correct me if I got anything wrong.
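As far as I understand it, the functional being minimized is (equation (8) in Zhang's paper; with distortion, k_1 and k_2 enter the projection as well, as in equation (10)):

$$\sum_{i=1}^{n}\sum_{j=1}^{m}\left\lVert \mathbf{m}_{ij} - \hat{\mathbf{m}}\left(\mathbf{A}, k_1, k_2, \mathbf{R}_i, \mathbf{t}_i, \mathbf{M}_j\right)\right\rVert^2$$

where m_ij is the detected image point, M_j the corresponding point on the calibration plane, and m-hat the model's projection using the intrinsics A and the pose R_i, t_i of view i.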

Yes, I totally agree on that.
My intention with my sentence was that the added dimensions aim to improve the model. It's not guaranteed, though.

Sure there is, see: "the LMA finds only a local minimum, which is not necessarily the global minimum", Levenberg-Marquardt algorithm - Wikipedia.

In general I am afraid of what I guess is called overfitting. The parameters are not independent of each other. You see that when you set the distortion coefficients to zero, and then, for a second calibration, allow distortion modelling. All calculated calibration parameters change! To clarify what I am getting at: if the algorithm had found the "right" focal length, it wouldn't change it after I allow distortion modelling. It should stay fixed. But it does change. That said, the solver optimizes all parameters jointly in order to obtain the optimal calibration parameters (measured through the reprojection error).
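The experiment can be reproduced directly with OpenCV's flags, roughly like this (a sketch; objpoints, imgpoints and image_size are assumed to come from your own corner detection):

```python
import cv2

def compare_with_and_without_distortion(objpoints, imgpoints, image_size):
    """Calibrate once with all distortion terms forced to zero, once with the
    plain k1/k2 radial model, and compare how the intrinsics shift."""
    no_dist = (cv2.CALIB_FIX_K1 | cv2.CALIB_FIX_K2 | cv2.CALIB_FIX_K3
               | cv2.CALIB_ZERO_TANGENT_DIST)
    rms0, K0, d0, _, _ = cv2.calibrateCamera(objpoints, imgpoints, image_size,
                                             None, None, flags=no_dist)
    radial_only = cv2.CALIB_FIX_K3 | cv2.CALIB_ZERO_TANGENT_DIST   # k1, k2 stay free
    rms1, K1, d1, _, _ = cv2.calibrateCamera(objpoints, imgpoints, image_size,
                                             None, None, flags=radial_only)
    return (rms0, K0, d0), (rms1, K1, d1)   # fx, fy, cx, cy typically shift between the runs
```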

Now regarding the part where it is more my "opinion" or a guess: as far as my understanding goes, the parameters influence each other. The radial distortion parameters seem to have a similar effect as the two focal lengths. I guess that's because radial distortion models that pixels far from the center are stretched out. In order to reduce the mean reprojection error without accounting for lens distortion, the algorithm calculates increased pixel sizes.

But going back to basics, I think this is getting into some very high-level math, where I don't know if I would be able to understand it even if someone tried to explain.
Furthermore: see the discussion about geometric and algebraic error using different solvers in Multiple View Geometry in Computer Vision by Hartley and Zisserman, page 180ff. And I am not sure if solving for the geometric error (using Levenberg-Marquardt) finds the best distortion parameters. So my idea is kind of an analogy to why least squares only finds the best solution for the calibration matrix in an algebraic sense and not in a geometric one!

Best regards

Marius

P.S.: I use LLMs only for grammar checking and sentence structure. (Just for clarification, not as a justification.) This time, because it seemed like you prefer it, I didn't use them :slight_smile: But be warned of my grammar!

P.S.2.: In a discussion it's never about winning :wink: You would call it a fight then instead. But from your wording I guess you went through a lot of fights in this forum, and I am sorry to hear that.

My understanding is that f_x and f_y are allowed to vary independently because not all cameras use square pixels, and having different f_x and f_y parameters allows the model to fit the data. The vast majority of modern cameras do use square pixels, however, so it is common (and recommended) to use the CALIB_FIX_ASPECT_RATIO parameter when calibrating (this optimizes a single focal length parameter / doesn’t allow them to vary independently.)

You probably don’t need a skew parameter. See Hartley/Zisserman 6.2.4 for a discussion on skew.

I take x_0 and y_0 to be the image center (cx, cy). Those are necessary in order to handle the lens distortion because the distortion model is radially symmetric about the optical center of the image.

The camera calibration process in OpenCV accepts flags that let you control what parameters it optimizes (for example CALIB_FIX_ASPECT_RATIO). You can supply a camera matrix to the calibration process which will serve as a starting point for the optimization (with the flag CALIB_USE_INTRINSIC_GUESS), but by passing the correct flags you can also force the optimization algorithm to use the parameters you provide. For example CALIB_FIX_PRINCIPAL_POINT will not optimize the cx, cy parameters.

(there are many other flags, see the opencv documentation to learn about all of them)

Opencv Calibration docs v3.4

Assuming you are using the OpenCV calibration algorithm to get your radial distortion parameters, I would suggest letting it optimize your focal length and image center, too. If you think you know your image center and focal length better than what the optimizer will generate, then you are free to provide a camera matrix (along with the appropriate flags) to force the optimization to use your input. I would suggest trying it both ways and comparing the reprojection error - I suspect you will get better results letting OpenCV optimize all of the relevant parameters.

Depending on the nature of your lens distortion you might be better served using something other than the basic k1 k2 model. I find the rational model to behave very well for high-distortion lenses with the caveat that you have to provide data (chessboard corners) that cover the parts of the image you want to undistort accurately. It (the rational model) can have pretty wild behavior when extrapolating beyond the input data coverage.
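To make the flag discussion concrete, here's a rough sketch (not a recipe; objpoints, imgpoints and image_size come from your own detection step, and the right flags depend on your lens):

```python
import cv2
import numpy as np

def calibrate(objpoints, imgpoints, image_size):
    flags = cv2.CALIB_FIX_ASPECT_RATIO   # square pixels: optimize a single focal length
    # optionally: flags |= cv2.CALIB_RATIONAL_MODEL              # high-distortion lenses
    # optionally: flags |= (cv2.CALIB_USE_INTRINSIC_GUESS
    #                       | cv2.CALIB_FIX_PRINCIPAL_POINT)     # keep your own cx, cy
    K_init = np.eye(3)   # with CALIB_FIX_ASPECT_RATIO only the fx/fy ratio (here 1.0) is read from this
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        objpoints, imgpoints, image_size, K_init, None, flags=flags)
    return rms, K, dist, rvecs, tvecs
```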

As for your original question, my thoughts.

I’m assuming you want to measure lengths where all points are located in the same world plane. If you are trying to do something different, then you’ll need a different approach to what I suggest here.

I would do the following:
Calibrate the camera intrinsics using OpenCV’s calibrateCamera function. I’d fix the aspect ratio (lock the two focal lengths) unless I had reason to think the camera used non-square pixels. I’d evaluate the distortion of the lens and pick the appropriate distortion model. K1,K2 model might be just fine, but the rational model is there if you need it.

I’d use the Charuco calibration target and associated functions because you don’t have to see the full calibration target in the input images - this makes it much easier to get measurements near the edges / corners of the image.
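A rough sketch of a Charuco calibration loop, using the contrib-style cv2.aruco API from the 3.4-era docs (newer OpenCV releases renamed parts of this module, so adapt as needed; the board geometry numbers are placeholders and must match your printed target):

```python
import cv2

dictionary = cv2.aruco.Dictionary_get(cv2.aruco.DICT_5X5_100)
board = cv2.aruco.CharucoBoard_create(11, 8, 0.020, 0.015, dictionary)  # squares, sizes in meters

def calibrate_charuco(images):
    all_corners, all_ids, image_size = [], [], None
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        image_size = gray.shape[::-1]
        corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
        if ids is None or len(ids) == 0:
            continue
        n, ch_corners, ch_ids = cv2.aruco.interpolateCornersCharuco(corners, ids, gray, board)
        if ch_corners is not None and n > 3:        # keep views with a handful of corners
            all_corners.append(ch_corners)
            all_ids.append(ch_ids)
    rms, K, dist, rvecs, tvecs = cv2.aruco.calibrateCameraCharuco(
        all_corners, all_ids, board, image_size, None, None)
    return rms, K, dist
```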

Once the intrinsics are calibrated I would calculate a homography that maps undistorted image coordinates to 2D plane coordinates. Once you have this, calculating distances between image points becomes pretty straightforward.
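A minimal sketch of that last step; K and dist stand in for your calibration results, and the four reference points and their plane coordinates are placeholders:

```python
import cv2
import numpy as np

K = np.array([[8000., 0., 4752.],                   # placeholder intrinsics
              [0., 8000., 3168.],
              [0., 0., 1.]])
dist = np.array([0.1, -0.05, 0., 0., 0.])           # k1, k2, p1, p2, k3 (placeholders)

# pixel positions of 4+ reference points whose plane coordinates (in mm) are known
img_pts = np.array([[1000., 800.], [5000., 820.], [5020., 4000.], [980., 3980.]], np.float32)
plane_mm = np.array([[0., 0.], [100., 0.], [100., 80.], [0., 80.]], np.float32)

# undistort first: a homography cannot model non-linear lens distortion
und = cv2.undistortPoints(img_pts.reshape(-1, 1, 2), K, dist, P=K).reshape(-1, 2)
H, _ = cv2.findHomography(und, plane_mm)            # undistorted pixels -> plane mm

def to_plane(p):
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

a, b = to_plane(und[0]), to_plane(und[1])
print(np.linalg.norm(a - b))                         # ~100 mm for these placeholder points
```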

Whether or not you can achieve the accuracy you desire depends on a number of factors, but with the correct optics and sensor I am confident you can do it - but that might mean having a smaller FOV than you want, etc.


Hello Steve,

I really appreciate your detailed reply.

Thanks for that tip. The camera I am using has indeed square pixels.

I totally agree. Correct me on that, but it seems OpenCV doesn't even allow any value other than 0 for the skew?

Oh, you are right, that didn't even come to my mind, but they are important for the radial distortion. I had already wondered why one needs to provide the calibration matrix in order to undistort.

Furthermore, I would guess you can't just set these to the known image center, right? Even if the size of the camera sensor is known.

That's a good point :slight_smile: I will also try mixing some flags to see which setting gets the best result. But I think I will need several test picture sets. (Otherwise I would just find the model that fits only one picture set.)

In the first place, yes.
As far as I know it wouldn't even be possible without using a second camera, or some sort of distance sensor, to measure points out of a plane, right?
A short thought on that topic:
Assuming I want to measure the length of a cylindrical object and I know its diameter, I would guess that I am able to measure its length even if it's not placed parallel to the image plane?

That really helps me out. I will switch to the Charuco board, that's a really valuable tip, also in combination with:

Did I get you right on that: if I place an object on an arbitrary rectangular plane, I can calculate a homography that maps that plane to a rectified plane (given at least 4 localized points on the plane). Through the homography I am able to recover angles, ratios, … But I guess I need a reference object of known size, right? Do I only need the size in one direction of the plane, or in two?
(I may use the checkerboard to place the objects on?)

Furthermore, do you have any suggestions for handling objects that are "high"? A coin, for example, has a low height and is the example in many measuring tutorials.
But if you measure something bigger, e.g. a Rubik's cube, would that be a problem? Its top face would be closer to the camera and therefore appear bigger, because of the projection?

Thank you for the supporting words :slight_smile:

P.S. Coming back to my original question, did I get you right: I only need the two calibrated image center coordinates to undistort the image?
I would imply: so f_x and f_y aren't necessary for measurements?

For the image center you really should calibrate it. Unless you are using synthetic / generated images with known intrinsics, you can’t just assume the optical image center is the numerical center. For example I just picked up a random camera that I have calibrated and the optical image center is 62 pixels from the numerical center. With high quality cameras and optics it will probably be closer to the numerical center, but it’s still not going to be exact.

You seem to be very interested in avoiding calibration for some reason that isn’t clear to me. All of my experience tells me that if you want accurate results you are better off calibrating the camera than using values that you calculate.

As an example, if you provide the focal length to the calibration algorithm and lock it down (CALIB_FIX_FOCAL_LENGTH or whatever it is) then you will force the optimizer to use the focal length you provide it, which could result in worse results for other parameters that you let it optimize. I don’t know how the optimization works for the distortion model, but it’s possible that your distortion parameters would be less accurate if you force it to use your estimated intrinsics vs letting it optimize the intrinsics itself.

My best advice, absent a compelling reason otherwise, is to use the OpenCV calibration algorithm to calculate focal length, image center and the distortion parameters.

I’m not sure I understand your question about measuring the cylinder.

As for measuring things that are out of plane, you are right about the measurements being wrong since the object is closer and will appear larger. You’ll have to handle that one way or another. Multiple cameras is one approach…but I would suggest you get a working system that can handle things in the plane first. No need to add complexity at this point.


First of all, I really appreciate your help and the effort you put into your replies. Thank you a lot!

Sounds solid, thanks for sharing your experience. My sensor size is 9504 × 6336 pixels, so I would "expect" v_0x = 4752 and v_0y = 3168. But I got 4551.8 and 2419.2. The latter one seems a bit far off :confused: Still I get what is, in my opinion, a pretty decent reprojection error of 0.84.
My guess would be that something went wrong. What do you think?

Kinda, yes. I've heard from a lot of people that the camera needs to be calibrated. It sounded important, calibration in the same sentence as measurement. They belong together, I thought.
So I started my "measuring project" by diving into calibration algorithms and everything. But calibrating for camera intrinsics/extrinsics and calibrating for measurement, using a reference, are very different approaches! I realized the hard way that having a calibrated camera, in terms of intrinsics and extrinsics, doesn't by itself allow you to do measurements!
(I had a calibrated camera and still wasn't able to measure anything.) (Don't judge me on that, I am new to the whole topic.)
That's the reason for my initial question. (E.g. none of the measuring tutorials out there seem to calibrate the camera at all! And furthermore, knowing f_x and f_y doesn't show up in any formulas used to calculate lengths. You already helped me out in the way that I now know that the principal point coordinates are important for undistorting the lens.)
I guess camera calibration has its strong use cases more in multiple view geometry.
So my question is more about general understanding. Even though I get f_x and f_y as free "gifts" when calibrating, I would claim: "They don't help me at all in measuring objects!"

True words :slight_smile: I will start with the simpler tasks first.

An example image of what I meant with the cylinder:
The upper image is in orthographic projection (camera infinitely far away).

The second one shows strong projective properties, and the cylinder isn't placed parallel to the image plane. So if I know the diameter D in the second picture, is it possible to calculate L?
image

P.S. Regarding what I meant with:

This is an image out of a lecture. It shows columns placed parallel to the image plane. Interestingly, the outer ones appear bigger, even though they are all equally sized.
[Picture 1] (I am only allowed to upload one picture, so I stuffed all the graphics into one single image.)

Also a sketch I made myself regarding this "effect":
[Picture 2]

Here is a model I made in a CAD program (I turned on perspective projection):
[Picture 3]

For one, you may see that, even though the two rods have the same diameter, the left one appears to have a greater diameter (at least slightly).

Still, the two rods are in the same plane. (At least they are touching the same plane. Maybe the issue is that they are "standing out" of that plane.) Did I get you right that mapping a homography to the four red corner points would solve this "issue"?
Definition of "issue": the two rods of the same diameter in the real world appear as having different diameters in an image.

In this one the “measuring plane” is placed parallel to the image plane:
[Picture 4]

It is already “rectified” (Turned into a rectangle :wink: ). So I guess this would be the expected output of applying a homography?

Still, D_1 and D_2 are not equal. So to my question: if I know the diameter D_1, am I able to calculate L_1? Or do I need an object of known length, call it "L_reference", in the direction of L?
Since D_1 in the image doesn't equal D_2, measured in pixels, even if the two rods are modeled with the same diameter. So calculating e.g. "10 pixels is one millimeter" by counting the pixels along D_1, knowing it's 8 mm, I would get a different result for D_2!
I guess measuring "3D" objects is harder than measuring flat objects like coins :confused:

Furthermore, I have successfully confused myself now.

Thanks to all of you in advance. Especially @Steve_in_Denver and also to @crackwitz

That whole topic drifts away from my original question.

For the types of cameras I work with (plastic S-mount lens holders with a through-hole PCB mount) that image center would be within range but on the high side. I googled that resolution of your sensor and most results were for Sony Alpha cameras. I would tend to expect the optical image center to be closer to the numerical center on a camera like that. I’m assuming the manufacturing tolerances would be much tighter on that type of camera and optics.

The reprojection error is pretty good. It’s worth taking a moment to consider what the reprojection error means (and what it doesn’t). It’s a measure of how well the calibration results (the model parameters) predict the mapping of the world points of the calibration target to the corresponding image points. Lower numbers are better, etc. BUT just as a high number isn’t a guarantee of a bad model (a few errant points can drive the score up even when the model itself is good), a low number doesn’t guarantee that you got a good model. It does mean that if you plug in the same world points used for calibration (along with the recovered pose/extrinsics for that calibration image) you will get a good estimate of the image point.

At the risk of sounding pedantic, this is important because you don’t just want the model to spit back the answers you already know, you want it to work in a variety of cases including for points that aren’t in your original data set. That’s why it’s important to feed the calibration process a range of images of the calibration target at different distances and (importantly) angles. Also the calibration target points (the chessboard corners) should cover as much of the image as possible - pay special attention to the corners / edges, because those are the hard ones to get. This is how you get a model that actually models what is physically going on with your camera, and not one that merely fits the data you gave it. For example, if you take pictures of a calibration target that doesn’t have much / any depth change, you might get a good score but your image center and focal length could be pretty far off.
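To make that concrete, the per-image error can be recomputed from the calibration outputs; a sketch (objpoints and imgpoints are the per-view board points and detections you fed to calibrateCamera):

```python
import cv2
import numpy as np

def per_view_rms(objpoints, imgpoints, rvecs, tvecs, K, dist):
    """Re-project the board points with each view's recovered pose and compare
    against the detected corners - this is what the reprojection error measures."""
    errors = []
    for obj, det, rvec, tvec in zip(objpoints, imgpoints, rvecs, tvecs):
        proj, _ = cv2.projectPoints(obj, rvec, tvec, K, dist)
        diff = proj.reshape(-1, 2) - det.reshape(-1, 2)
        errors.append(float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1)))))
    return errors   # per-image RMS in pixels; large values flag views the model fits poorly
```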

How about posting the input images you are using for calibration?

As for not being sure if you really need to be fully calibrating your camera or not…well, it’s complicated.

Yes, it’s true that you don’t need to know your image center or your focal length in some specific cases. For example, if all of your points are going to fall on a plane, and your lens has negligible distortion, you can just use a homography to calibrate the plane to image mapping and you should get pretty good results as long as the things you are measuring is on that plane. In many cases this is good enough! In the cases where you actually need more, starting with a homography still might be the right choice, but at some point you might need a full calibration to make progress to the end objective.

In your case it sounds like you would like to keep things simple and therefore want to use a homography (or something equivalent). Since your lens has a lot of distortion, you need to be able to correct for that first (because homographies can’t model non-linear distortion), which pushes you down the camera calibration path. At a minimum you need to get a good estimate of the image center in order for the distortion to work (it’s a radially symmetric function about the optical image center, if your image center is off you are cooked.) I think it also uses the focal length, so you should probably just let it optimize that for you too, but (I think?) the focal length is only used for normalizing, so you can provide it / lock it down if you prefer, as long as you use it consistently. (I think that’s right, but not certain.)

The problem I see with this approach is that you aren’t really even getting started but you already seem interested in making measurements of points / objects that aren’t constrained to a plane, which means the homography approach isn’t a good fit. Maybe you can get away with a fully calibrated camera + some knowledge about the Z-distance of the different points you are measuring, or maybe you need multiple cameras. In either case I think you will probably need high quality calibration of all parameters in order to get good results.

As for your cylinder question, I’m afraid I’m not smart enough to answer that. My instinct says that if you know D, you might be able to deduce L based on the ellipse equation of the projected circle. But maybe that’s not enough information and you’d need something else?


Hello Steve,

again thank you very much for your reply.

Well well, on point! :slight_smile:

Yes, I thought the same. I will try a new calibration with new pictures in the upcoming days.

I would agree, especially when considering that there are 60,000,000 pixels per image!

Yes, you're right on that. I even thought about that myself. I am going to try the per-image error that OpenCV provides, in order to gain the ability to "sort out" bad pictures used for calibration (e.g. ones that don't provide additional information because they are too similar).
Furthermore, the variance of the parameters is a very interesting feature (especially "stdDeviationsIntrinsics"). I tried diving into that, but got a bit overwhelmed. One thing that confuses me is that in usual error propagation you need your input uncertainties (like e.g. you know that your checkerboard was printed with an accuracy of 0.1 mm). But "stdDeviationsIntrinsics" does some other kind of magic and ignores your input uncertainties.
(Using covariance matrices as far as I know. But I wasn't able to grasp it. If you're into that topic I would also appreciate any help.)
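For reference, I think cv2.calibrateCameraExtended returns both diagnostics in one call; a sketch of how I understand it (please correct me if the ordering is off for your OpenCV version):

```python
import cv2
import numpy as np

def calibrate_with_diagnostics(objpoints, imgpoints, image_size):
    (rms, K, dist, rvecs, tvecs,
     std_intr, std_extr, per_view) = cv2.calibrateCameraExtended(
        objpoints, imgpoints, image_size, None, None)
    # std_intr: standard deviations of (fx, fy, cx, cy, k1, k2, p1, p2, k3, ...)
    worst_first = np.argsort(per_view.ravel())[::-1]   # candidate images to inspect or drop
    return rms, K, dist, std_intr, per_view, worst_first
```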

That's a good point. And thank you for the tips on calibration.
I already printed out the Charuco board based on your hint. That will cover the edges. In Zhang's paper the angles have been tested:


Zhengyou Zhang, A Flexible New Technique for Camera Calibration, page 11.
Around 45° proved to be good:

That point with the distances is the one I needed some time to grasp. (Not from your text but from my own experience.)
Maybe for future readers stumbling across this discussion: you aren't allowed to use a zoom lens. Zooming changes the focal length and so the intrinsics. But focusing is okay! That doesn't change the focal length.
(Sike!: From the datasheet of the lens that I am using:
"Depending on the lens mechanism, the focal length may change with any change in shooting distance. The focal lengths given above assume the lens is focused at infinity.", Sony manual for SEL90M28G. So maybe don't overuse different distances too much :wink: )

So with the lens I am using there is kind of a small distance range at which the checkerboard is not too big and not too small. But I totally agree with your point. The more different the pictures, the more general my model, even if the reprojection error gets worse. Better to have a worse reprojection error and a model that fits the camera more generally than a very low reprojection error only on selected calibration images.

I am going to take a few new ones with the Charuco chart and I will share them here. Thanks for offering to have a look at them.

Because that's where I was wondering. I thought: well, if I know the intrinsics and extrinsics I just apply the reverse projection and I am able to measure. But no: the projection matrix is a 3x4 matrix (Hartley and Zisserman, page 154), so it's not invertible. A point on my image plane (so a pixel with x and y) maps to a ray in 3D. I am not able to recover any 3D information using one camera. (Neglecting depth from focus and other fancy techniques.)
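A tiny sketch of what I mean, with placeholder numbers: inverting K only gives a viewing ray, and every depth along that ray projects to the same pixel.

```python
import numpy as np

K = np.array([[8000., 0., 4752.],                 # placeholder intrinsics
              [0., 8000., 3168.],
              [0., 0., 1.]])
u, v = 5200., 3000.                               # an (already undistorted) pixel

ray = np.linalg.inv(K) @ np.array([u, v, 1.0])    # viewing direction in camera coordinates
# every point  depth * ray  (depth > 0) projects back to (u, v), so a known depth,
# a plane constraint, or a second view is needed to pin the 3D point down
```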

Yes, absolutely. And that's where I started going down the rabbit hole.

Haha yes, the results I shared before showed that, proving you right :slight_smile: My "it's the middle of the sensor" approach would certainly fail!

Hmm, that's what I am interested in too. My initial question is more about a general understanding of the whole topic. My guess would be that the undistort function doesn't need it at all. But I want to avoid diving into the source code. I'm not quite fluent in C(++).

I am only interested in measuring the length of a single object. But maybe you are right that I should just start measuring and get into theory later on, if it doesn't work or isn't accurate enough.

That's a great hint. I didn't realize that it's not a circle anymore, but you're right. The known diameter information may be "encoded" in both directions by the ellipse.
It's the same "problem" that the circular calibration boards feature. (Nice paper I found covering that: Which Pattern? Biasing Aspects of Planar Calibration Patterns and Detection Methods by John Mallon and Paul F. Whelan. TL;DR: the checkerboard is "better", because points map to points under any projection AND distortion.)

Currently reading:

https://docs.opencv.org/3.4/d5/dae/tutorial_aruco_detection.html

The following is stated there: "You can also calibrate your camera using the aruco module as explained in the Calibration with ArUco and ChArUco tutorial. Note that this only needs to be done once unless the camera optics are modified (for instance changing its focus)."

I guess this statement is misleading, since focus shouldn’t change your calibration, ignoring details like:

what makes you hold that notion?

Calibration assumes the pinhole model, right?

For this model an infinitesimal “pinhole” is assumed.
For that model of camera (pinhole) there is no such thing as focus.
(It has an infinite depth of field.)

I will argue with this drawing I made:
image
It shows the defocusing of a "real" pinhole camera with a real pinhole.
Now imagine the diameter of the hole getting infinitesimally small, so it converges to a single point. → Everything will be in focus.

That point is the optical center assumed by the underlying pinhole model of most calibration algorithms, including the one used in OpenCV.

P.S.: I changed my wording from "wrong" to "misleading". Technically it's not wrong that changing focus changes the intrinsics. Also, Steve and many other sources I have read suggest taking pictures of your planar calibration target at different distances. But in order to do so, you need to refocus. (Or stop down your aperture to increase depth of field.)

practical calibration assumes the pinhole model for one specific arrangement of optical elements. if you move any of them, which happens if you change focus, the whole thing is all but a different lens. certainly, in practice, the focal length changes. slightly, if you’re lucky.

no. refocusing is worse than slightly out-of-focus images. because it changes the focal length, and due to reality, the distortion as well.

just make sure the lens is focused such that objects at the expected range of working distances are reasonably in focus.

“reasonably” meaning optical focus isn’t much more than one pixel’s worth of blur, or whatever you deem acceptable.

all of that is "conventional wisdom". I'd be surprised if it weren't found in dusty text books, i.e. it would be a challenge for you to avoid reading about it. it almost seems as if you picked those things and negated them, challenging us to correct you.

or maybe my view of the subject matter is skewed by practice, and all of this isn’t talked about in literature? no, I don’t read books. they were merely supplementary to the lecture and exam.

I’ll see myself out again.


I might have time to look at this in more detail later, but a few brief comments.

  1. please share the reference of that first image. I want to make sure I understand what the context is and what they are trying to illustrate.
  2. In that picture it looks to me like there is some sort of optical distortion - that could be affecting the apparent relative sizes.
  3. I think if you point a camera perpendicular to a plane (image sensor plane and world plane are parallel) equal sized things on the world plane would project to equal sized things on the image plane, so any apparent difference would be human perception related, not projection / image formation related (unless lighting / shadows are at play.)

I'm not clear on what you are getting at with the "appear bigger even if they are equally sized" - but if you are saying that equally sized objects in a world plane project to different sizes in the (parallel to the world plane) image plane, I don't think that's correct and I think a simple "similar triangles visual proof" would clear it up.


I might be able to reply to more of this later, but a few quick notes.

  1. I don’t think the OpenCV calibration function uses Zhang’s method anymore. I don’t have the reference at hand, but it’s something newer which, as I understand it, can detect / account for non-perfect calibration targets which might be relevant to your question about the error propagation / calculation. Speculating a bit here, but it’s probably worth considering.
  2. I think Crackwitz addressed this already, but changing the focus does change the focal length. Additionally, depending on the type of lens and what moves when the focus changes, it could change your image center.
  3. As for the focal length being used for distortion modeling, I think it gets used as a normalization / scale factor. For example if your focal length is 1000, then the radius in image space gets scaled (divided) by 1000, and then the distortion function has a domain that is something like [0, 2.0] instead of [0, 5000]. I'm out on a limb here a bit, but that's what I recall. I suspect the function is better behaved / easier to analyze / more consistent if the domain is constrained. So, it might not be that the focal length truly matters, but you probably shouldn't go changing the focal length when it's time to distort/undistort images. I think. (Rough sketch of what I mean below.)
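Roughly what I mean, as I recall it (placeholder values, plain k1/k2 model; a sketch, not a reference implementation):

```python
fx, fy, cx, cy = 8000., 8000., 4752., 3168.   # placeholder intrinsics
k1, k2 = 0.1, -0.05                            # placeholder radial coefficients

u, v = 5200., 3000.                            # an ideal (undistorted) pixel
xn, yn = (u - cx) / fx, (v - cy) / fy          # normalize using focal length / principal point
r2 = xn * xn + yn * yn
scale = 1 + k1 * r2 + k2 * r2 * r2             # radial distortion factor
ud, vd = fx * (xn * scale) + cx, fy * (yn * scale) + cy   # where the distorted pixel ends up
```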