Mat::convertTo(): scaling with respect to what?

In the OpenCV documentation the alpha parameter is often said to be the contrast of the image (see OpenCV: Changing the contrast and brightness of an image!). Why is that? If my understanding is right, it scales all pixels with respect to 0 instead of 128 (the mid-gray point), so in effect increasing alpha makes everything brighter and decreasing it below 1 makes the whole image darker.

To increase contrast without affecting brightness (making dark pixels darker and bright pixels brighter) you would have to compute

g(x,y) = alpha * (f(x,y) - 128) + 128

which expands to

g(x,y) = alpha * f(x,y) - alpha * 128 + 128

So to change contrast without affecting brightness in that way, you always have to set

beta = -alpha * 128 + 128

where g(x,y) is the new value and f(x,y) the old value of the pixel at (x,y).
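In code, that would be something like this (a sketch assuming an 8-bit grayscale image; alpha is just an arbitrary example value):

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat f = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    double alpha = 1.5;                // example gain
    double beta = -alpha * 128 + 128;  // bias that keeps mid-gray fixed
    cv::Mat g;
    // g(x,y) = saturate_cast<uchar>(alpha * f(x,y) + beta); rtype -1 keeps the source type
    f.convertTo(g, -1, alpha, beta);
    cv::imwrite("output.png", g);
    return 0;
}
```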

Is my understanding of the convertTo function correct?

“brightness” and “contrast” are mushy terms. their meanings depend on context.

in computer vision, we prefer that things have a somewhat physical motivation. sensor data is proportional to collected light. some cameras are so sensitive that their sensor pixels almost count individual photons.

a camera sensor turns physics into numbers. the transfer function for that can be simplified as y(x) = a*x + b.

a is a measure of light-gathering ability (bigger lens, longer exposure time). in the signal processing that follows, it’s a gain, which is nothing but a factor.

b, physically motivated, is a measure of “dark current” noise, or at least of its mean.

the “contrast” and “brightness” that you know from photo editing programs is physical nonsense, and even for photo editing it’s archaic.

at best, by applying “contrast” and “brightness”, you’re approaching a threshold operation.

there is nothing physical or natural about “128”. it’s not even needed to express “negative values”: you can have those without such a bias. just use signed number types. then “128” is nonsense, and the “zero” you think of is actually 0.
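a sketch of that, assuming an 8-bit grayscale input: convert to a signed/float type first, and the gain pivots around a true zero with no bias bookkeeping:

```cpp
#include <opencv2/opencv.hpp>

cv::Mat img = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
cv::Mat f;
// map [0, 255] to roughly [-0.5, +0.5]; mid-gray becomes ~0
img.convertTo(f, CV_32F, 1.0 / 255.0, -0.5);
f *= 2.0;  // gain applied around an actual zero
// back to 8-bit; saturate_cast clips to [0, 255]
f.convertTo(img, CV_8U, 255.0, 127.5);
```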

in fact, due to gamma mapping, which happens everywhere you don’t have actual RAW data, “50%” isn’t even 128 anymore. it’s some other number that’s impossible to state without knowing which gamma mapping was applied, or whatever other transfer functions were involved.
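to put a number on it: under the sRGB transfer function (just one common gamma mapping), linear 50% encodes to roughly 188 in 8 bits, not 128:

```cpp
#include <cmath>
#include <cstdio>

int main() {
    double lin = 0.5;  // 50% in linear light
    // sRGB encoding; other transfer functions give other numbers
    double enc = (lin <= 0.0031308)
        ? 12.92 * lin
        : 1.055 * std::pow(lin, 1.0 / 2.4) - 0.055;
    std::printf("linear 0.5 -> encoded %.3f -> 8-bit ~%d\n",
                enc, (int)std::lround(enc * 255.0));  // ~0.735 -> ~188
    return 0;
}
```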

that API isn’t for photo editing. it’s supposed to be a building block. “orthogonality” dictates that it makes as few assumptions as possible, in order to be easy to use. the presented equation is the simplest model. what you have in mind complicates things needlessly.

in the end, whatever you want to do, you can. convertTo will use the given arguments according to the documentation and give you whatever result you wish. you decide if that result is useful to you.

browse this for a rabbit hole:

and perhaps look for more structured learning material that gives you a solid introduction to image processing. it should answer many questions before you think of them.

Thanks for the extensive answer!

Maybe your assumption that I want all of my questions answered preemptively by hours of study is not quite accurate. Sure, most questions in the world wouldn’t have to be asked if the people asking them read some books first. But I think forums like this have the great advantage that people don’t have to read all those books, and that the people who have read them are forgiving about that and happy to help anyway; and if they’re not, they can simply decide not to help.

But your other assumption was quite right: being new to computer vision and having only used some image editing GUIs in the past, what I really wanted was an adaptive thresholding algorithm.
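In case it helps someone else landing here: OpenCV’s cv::adaptiveThreshold does exactly that. A minimal sketch (block size and offset constant picked arbitrarily):

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat src = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    cv::Mat dst;
    // threshold each pixel against the mean of its 11x11 neighborhood, minus 2
    cv::adaptiveThreshold(src, dst, 255,
                          cv::ADAPTIVE_THRESH_MEAN_C,
                          cv::THRESH_BINARY,
                          11,    // block size, must be odd
                          2.0);  // constant subtracted from the local mean
    cv::imwrite("output.png", dst);
    return 0;
}
```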