I ran into a problem because I assumed that SIFT descriptors are normalized float vectors.
So I printed them out and was surprised to see integer values stored in a float array!
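For reference, here is roughly how to reproduce this, assuming OpenCV >= 4.4 (where `cv::SIFT` lives in the main module); `"image.png"` is just a placeholder path:

```cpp
// Minimal reproduction: compute SIFT descriptors and print the first row.
#include <opencv2/opencv.hpp>
#include <cstdio>
#include <vector>

int main()
{
    cv::Mat img = cv::imread("image.png", cv::IMREAD_GRAYSCALE);
    cv::Ptr<cv::SIFT> sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kp;
    cv::Mat desc; // CV_32F, one 128-dimensional row per keypoint
    sift->detectAndCompute(img, cv::noArray(), kp, desc);
    if (!desc.empty())
        for (int k = 0; k < desc.cols; k++)
            std::printf("%g ", desc.at<float>(0, k)); // prints whole numbers: 0 17 113 ...
    std::printf("\n");
    return 0;
}
```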
I did some research, and according to this question on Stack Overflow, the normalized vector is multiplied by 512, then cast to an unsigned char, which is then cast back to a float.
This seems very inefficient in terms of both computation time and storage space. So what's the idea behind this?
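If I understand that answer correctly, the scheme is something like the sketch below. This is my own simplification, not OpenCV's actual code (the real implementation also clips large bins at 0.2 and renormalizes before this step, which I'm omitting):

```cpp
// Simplified sketch of the described quantization: normalize, scale by 512,
// saturate to [0, 255] like a cast to uchar, and store back into the float array.
#include <algorithm>
#include <cmath>

void quantize_descriptor(float* d, int len) // len is 128 for SIFT
{
    float nrm2 = 0.f;
    for (int k = 0; k < len; k++)
        nrm2 += d[k] * d[k]; // squared L2 norm
    float scale = 512.f / std::max(std::sqrt(nrm2), 1e-7f);
    for (int k = 0; k < len; k++)
    {
        int q = static_cast<int>(std::lround(d[k] * scale));
        d[k] = static_cast<float>(std::clamp(q, 0, 255)); // integer value in a float
    }
}
```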
It makes sense to compress/quantize the float values into uint8 values: 4x less storage definitely has an effect on CPU caches, and comparing/subtracting 8-bit integers is very cheap and vectorizable, versus having to handle 32-bit floats, where you don't actually need the precision.
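For example, the squared L2 distance between two quantized descriptors needs nothing but small-integer arithmetic (a sketch, not OpenCV's matcher code):

```cpp
// Squared L2 distance on uint8 descriptors in plain integer math.
// Differences fit in 16 bits and the sum in 32 bits (128 * 255^2 = 8323200),
// so compilers auto-vectorize this loop easily; no float pipeline needed.
#include <cstdint>

int sqdist_u8(const uint8_t* a, const uint8_t* b, int len) // len = 128
{
    int acc = 0;
    for (int k = 0; k < len; k++)
    {
        int d = int(a[k]) - int(b[k]);
        acc += d * d;
    }
    return acc;
}
```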
Look at this (highly optimized and undocumented) code: there is a case for when the output is explicitly float, and a case for when it is uint8.
Indeed, they apply the uint8 scaling (multiply by 512, saturate to the 0-255 range) to the float result case as well. That's silly, in my opinion.
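If you want descriptors back on (roughly) the unit sphere, you can divide the returned floats by 512 yourself. Note that the quantization and saturation are lossy, so this only approximates the original normalized vector; `renormalize` is a hypothetical helper name:

```cpp
// Hypothetical helper: scale SIFT's float output back toward unit norm.
#include <opencv2/core.hpp>

cv::Mat renormalize(const cv::Mat& desc) // desc: CV_32F, N x 128 from SIFT
{
    cv::Mat out;
    desc.convertTo(out, CV_32F, 1.0 / 512.0); // undo the 512 scale factor
    return out;
}
```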