The graphic is nice!
It also shows that “our model” here ends with the 7x7x512 maxpool (no fc layers).
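To double-check where the 7x7x512 comes from, here is a small shape trace. It is only a sketch and assumes the standard VGG16 layout with a 224x224 input (the block list and the `trace_shapes` helper are my own illustration, not code from the notebook):

```python
# (out_channels, number of 3x3 convs) per block, assuming the standard VGG16 layout
VGG16_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]

def trace_shapes(size=224):
    """Return the (height, width, channels) shape after each block's maxpool."""
    shapes = []
    for channels, _n_convs in VGG16_BLOCKS:
        # 3x3 convs with padding 1 keep the spatial size,
        # the 2x2 maxpool with stride 2 at the end of the block halves it
        size //= 2
        shapes.append((size, size, channels))
    return shapes

for shape in trace_shapes():
    print(shape)
```

The last entry is (7, 7, 512), i.e. the output of the fifth maxpool, which is where the model stops if the fc layers are left out.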
I also took a closer look at the training notebook.
Normally I’d expect to see some inference like:
result = VGG(input)
But the result is not used anywhere here. Instead, it seems to use (and manipulate) the input image “by reference”, which is very weird …
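One pattern that would look like this (an assumption on my side, I have not verified that the notebook does it) is optimizing the input itself: the forward pass is only needed to compute a loss, and the update step then changes the image in place, so no `result` ever gets returned. A toy sketch in plain Python, where `model`, `WEIGHTS` and `optimize_input` are made-up stand-ins and not the VGG from the notebook:

```python
# Toy stand-in for a frozen network: fixed weights, never updated
WEIGHTS = [0.5, -1.0, 2.0]

def model(image):
    # the "feature" is just a weighted sum of the pixels
    return sum(w * p for w, p in zip(WEIGHTS, image))

def optimize_input(image, target, steps=200, lr=0.05):
    """Gradient descent on the input; mutates `image` and returns nothing."""
    for _ in range(steps):
        out = model(image)          # forward pass, used only for the loss
        err = out - target          # loss = err ** 2
        for i, w in enumerate(WEIGHTS):
            # d(loss)/d(image[i]) = 2 * err * w, applied to the list in place
            image[i] -= lr * 2 * err * w

image = [0.0, 0.0, 0.0]
optimize_input(image, target=3.0)
print(image, model(image))
```

After the call, the caller's `image` list holds different values although nothing was assigned back, which is the same "by reference" effect.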