VideoWriter (ffmpeg): systematic color change with AVC1 (h264) movies?

The color change amounts to about -(2,2,2), i.e. each of the B, G and R channels is darkened by roughly 2.

My tests:

Take an AVC1-encoded movie and transcode it to lossless FFV1 (e.g. with Avidemux).

Read a movie with VideoCapture and write it again with VideoWriter, for all 4 combinations:
FFV1 ==> AVC1, AVC1 ==> FFV1, etc.

Then export the first frame of each movie of a pair (with the ffmpeg CLI):

ffmpeg -i input -frames:v 1 ffmpeg_screenshot.png

and compare the frames by the averages of their BGR and HSV channels:

(Stripped down, hopefully without introducing a slip.)

Mat imageMat, reReadMat,
  subMat, imageMatHSV, reReadMatHSV;

imageMat = imread ("ScreenShotA.png");
reReadMat = imread ("ScreenShotB.png");

cout << endl << "Averages:" << endl;

cv::Scalar meanVal = mean (imageMat);
cout << "A:\t Orig: B = " << meanVal.val[0] << "\tG = " << meanVal.val[1] << "\tR = " << meanVal.val[2] << endl;

meanVal = mean (reReadMat);
cout << "B:\t Orig: B = " << meanVal.val[0] << "\tG = " << meanVal.val[1] << "\tR = " << meanVal.val[2] << endl << endl;

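// Now both differences. Note: operator- saturates at 0 for 8-bit Mats,
// so A-B only captures the pixels where A is brighter than B.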
subMat = imageMat - reReadMat;

meanVal = mean (subMat);
cout << "A-B:\t Orig: B = " << meanVal.val[0] << "\tG = " << meanVal.val[1] << "\tR = " << meanVal.val[2] << endl;

subMat = reReadMat - imageMat;

cv::Scalar meanValBminusA = mean (subMat);
cout << "B-A:\t Orig: B = " << meanValBminusA.val[0] << "\tG = " << meanValBminusA.val[1] << "\tR = " << meanValBminusA.val[2] << endl << endl;

// mean (A-B) - mean (B-A), "difference of differences" (DOD)
meanVal -= meanValBminusA;

cout << "DOD:\t Orig: B = " << meanVal.val[0] << "\tG = " << meanVal.val[1] << "\tR = " << meanVal.val[2] << endl << endl << endl;

// Now HSV channels

cvtColor (imageMat, imageMatHSV, COLOR_BGR2HSV);
cvtColor (reReadMat, reReadMatHSV, COLOR_BGR2HSV);

meanVal = mean (imageMatHSV);
cout << "A:\t Orig: H = " << meanVal.val[0] << "\tS = " << meanVal.val[1] << "\tV = " << meanVal.val[2] << endl;

meanVal = mean (reReadMatHSV);
cout << "B:\t Orig: H = " << meanVal.val[0] << "\tS = " << meanVal.val[1] << "\tV = " << meanVal.val[2] << endl << endl;

// Now both differences (again saturating at 0):
subMat = imageMatHSV - reReadMatHSV;

meanVal = mean (subMat);
cout << "A-B:\t Orig: H = " << meanVal.val[0] << "\tS = " << meanVal.val[1] << "\tV = " << meanVal.val[2] << endl;

subMat = reReadMatHSV - imageMatHSV;

meanValBminusA = mean (subMat);
cout << "B-A:\t Orig: H = " << meanValBminusA.val[0] << "\tS = " << meanValBminusA.val[1] << "\tV = " << meanValBminusA.val[2] << endl << endl;

// mean (A-B) - mean (B-A), "difference of differences"
meanVal -= meanValBminusA;

cout << "DOD:\t Orig: H = " << meanVal.val[0] << "\tS = " << meanVal.val[1] << "\tV = " << meanVal.val[2] << endl << endl;
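A side note on the "difference of differences": because 8-bit subtraction saturates at 0, mean(A-B) minus mean(B-A) is the same number as the mean of the signed per-pixel difference. The following is only a minimal sketch of that arithmetic on plain byte vectors, without OpenCV; the helper names (saturatingSub, meanSaturatedDiff, meanSignedDiff) are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Saturating 8-bit subtraction, modelling what `a - b` does for CV_8U data:
// negative results are clipped to 0.
inline std::uint8_t saturatingSub(std::uint8_t a, std::uint8_t b)
{
    return a > b ? static_cast<std::uint8_t>(a - b) : 0;
}

// Mean of the saturated differences a[i] - b[i].
double meanSaturatedDiff(const std::vector<std::uint8_t>& a,
                         const std::vector<std::uint8_t>& b)
{
    assert(a.size() == b.size() && !a.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += saturatingSub(a[i], b[i]);
    return sum / static_cast<double>(a.size());
}

// Mean of the signed differences a[i] - b[i], computed without clipping.
double meanSignedDiff(const std::vector<std::uint8_t>& a,
                      const std::vector<std::uint8_t>& b)
{
    assert(a.size() == b.size() && !a.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += static_cast<double>(a[i]) - static_cast<double>(b[i]);
    return sum / static_cast<double>(a.size());
}
```

For any two equally sized inputs, meanSaturatedDiff(a, b) - meanSaturatedDiff(b, a) agrees with meanSignedDiff(a, b), which in turn is just mean(a) - mean(b). The two saturated means are still worth printing separately, since they show how much of the change goes in each direction.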

The outputs

FFV1 ==> AVC1 (VideoWriter impact):

A:       Orig: B = 60.2154      G = 62.571      R = 71.093
B:       Orig: B = 57.9042      G = 61.0538     R = 69.0174

A-B:     Orig: B = 2.53997      G = 1.61761     R = 2.15573
B-A:     Orig: B = 0.228726     G = 0.100314    R = 0.0800911

DOD:     Orig: B = 2.31124      G = 1.51729     R = 2.07564


A:       Orig: H = 76.933       S = 69.9563     V = 72.1645
B:       Orig: H = 71.4863      S = 75.3879     V = 70.1948

A-B:     Orig: H = 7.79836      S = 1.45931     V = 2.05884
B-A:     Orig: H = 2.35164      S = 6.89095     V = 0.0891775

DOD:     Orig: H = 5.44672      S = -5.43164    V = 1.96966

AVC1 ==> AVC1 (VideoCapture plus VideoWriter impact)
is the same as above (FFV1 ==> AVC1).

FFV1 ==> FFV1
no changes, identical image content.
(BTW: imwrite produces the same image content.)

AVC1 ==> FFV1 (VideoCapture impact):
no changes, identical image content.
(BTW: imwrite produces the same image content.)

Visually checking the pixel-wise differences of the pairs above
reveals that the differences in the RGB channels are always roughly homogeneous,
whereas the differences in the HSV channels are strongly structured.

Conclusion:

VideoCapture does fine for AVC1 (and FFV1).

Writing an AVC1 movie with VideoWriter should therefore be done as:

if (bAVC1Write)
    outputVideo.write (frame + VideoWriterCorrection4AVC1);
else
    outputVideo.write (frame);

with

const Scalar VideoWriterCorrection4AVC1 (2,2,2);

With that correction I did the test FFV1 ==> AVC1 again:

A:       Orig: B = 60.2154      G = 62.571      R = 71.093
B:       Orig: B = 59.8431      G = 62.9274     R = 70.9557

A-B:     Orig: B = 0.892717     G = 0.314124    R = 0.566224
B-A:     Orig: B = 0.520381     G = 0.670454    R = 0.428946

DOD:     Orig: B = 0.372336     G = -0.356329   R = 0.137278


A:       Orig: H = 76.933       S = 69.9563     V = 72.1645
B:       Orig: H = 71.7396      S = 70.4345     V = 72.13

A-B:     Orig: H = 7.52111      S = 2.98839     V = 0.506662
B-A:     Orig: H = 2.32764      S = 3.46663     V = 0.472235

DOD:     Orig: H = 5.19347      S = -0.47824    V = 0.0344271

That’s better I might say.

However it’s not the perfect correction, just best overall.
There are colors that need different correction.

I’m well aware that a lossy codec like AVC1 (decoding, encoding)
will produce differences.
But they shouldn’t be as systematic or as large as this.

So there is a bug waiting to be uncovered.

For scientific work lossless codecs will be used if possible;
but if there are big movies you’ll have to use compression.

I haven’t tested other codecs.
In my OpenCV 4.8.0 I used a newer ffmpeg (5.0.1).
I can’t tell about the (prehistoric) default one.

Can anyone confirm or contradict this?

EDIT:
The mean() of the Hue channel cannot be calculated as an arithmetic mean, because Hue values are circular. It must be done as here.
But the H channel is of no importance in this case, because its visual difference is structured rather than homogeneous (see above).
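To illustrate why the arithmetic mean fails for hue, here is a minimal sketch of a circular mean. It assumes OpenCV's 8-bit HSV convention, where H is stored as degrees/2 in the range 0..179; the function name circularMeanHue is hypothetical, and this is not necessarily the method behind the link mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Circular mean of 8-bit hue values (ASSUMPTION: range 0..179, one step = 2 degrees).
// Each hue is treated as a unit vector on the circle; the mean is the direction
// of the summed vectors. The result is ill-defined if the vectors cancel out.
double circularMeanHue(const std::vector<std::uint8_t>& hue)
{
    assert(!hue.empty());
    const double pi = std::acos(-1.0);
    const double toRad = 2.0 * pi / 180.0;   // hue step -> radians

    double sumSin = 0.0, sumCos = 0.0;
    for (std::uint8_t h : hue) {
        sumSin += std::sin(h * toRad);
        sumCos += std::cos(h * toRad);
    }

    double angle = std::atan2(sumSin, sumCos);   // in (-pi, pi]
    if (angle < 0.0)
        angle += 2.0 * pi;                       // map to [0, 2*pi)

    double mean = angle / toRad;                 // back to hue units
    if (mean >= 180.0)
        mean -= 180.0;                           // guard against rounding at the wrap
    return mean;
}
```

For the two hues 178 and 4, which sit on either side of the wrap-around, this gives a mean close to 1, whereas the arithmetic mean would be 91, on the opposite side of the color circle.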

ADDITION AND CORRECTION
Comparing the H and S channels needs special care.
As BGR2HSV is not injective for them, this has to be taken into account when comparing them.
Moreover, the H channel is circular.
So the difference of two H channels has to be computed as an absolute difference.
I checked this visually, and there are still large structural differences.
So the bug does not cause a simple shift of the H channel.
(If you are still interested in quantifying the differences of the means: see the link above.)
The difference of -2 in the V channel is quite homogeneous (except for compression artefacts).
This matches the identical correction value for the RGB channels.
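The absolute hue difference mentioned above can be sketched as follows. This is only an illustration under the assumption of OpenCV's 8-bit hue range 0..179; circularHueDiff is a made-up helper name, not an OpenCV function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Shortest distance between two hues on the circle
// (ASSUMPTION: 8-bit hue in 0..179, so the circle has 180 steps).
// The result lies in 0..90.
int circularHueDiff(std::uint8_t a, std::uint8_t b)
{
    assert(a < 180 && b < 180);
    const int d = std::abs(static_cast<int>(a) - static_cast<int>(b));
    return std::min(d, 180 - d);
}
```

With this, hues 178 and 2 are 4 steps apart rather than 176, which is what a plain absolute difference would report.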

With both the bug and the workaround, the values 254 and 255 will never occur in the movie.
The darkening caused by the bug leads to data loss at the low end (byte values 0 and 1).
The workaround “shifts” the loss to the high end (254 and 255).

Filed an issue about the matter:

BTW
Is there any reason why my original first post is hidden (“SPAM”),
(moreover some other posts of mine)?
and why there’s no reaction from admin to my PM?