That’s not a question but image processing spam
First an issue
Oh – that’s pretty cool.
It seems like I should be able to get something like a semantic embedding vector for each segment out of this model (i.e. a descriptor of what each segment contains). I see:
# 1x256x64x64 tensor; do the 256 channels (0..255) correspond to segments??
SamPredictor.features = model.image_encoder(input_image)
Can this be used to cluster segments by their semantic content? Or track (find similar) segments between video frames?
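To make the question concrete, here is a rough sketch of what I have in mind: average the encoder features under each mask to get one vector per segment, then compare vectors between frames by cosine similarity. This is plain masked average pooling, not anything SAM provides. It assumes the 256 channels are embedding dimensions rather than segment indices, and that the masks have already been resized to the 64x64 feature grid. The names `segment_descriptors` and `match_segments` are made up for illustration, and random arrays stand in for real encoder output.

```python
import numpy as np

def segment_descriptors(features, masks):
    """Masked average pooling.

    features: (C, H, W) float array, e.g. the encoder output with the batch dim dropped.
    masks:    (N, H, W) bool array, one mask per segment, at feature resolution.
    Returns an (N, C) array of L2-normalized per-segment descriptors.
    """
    c = features.shape[0]
    flat_feats = features.reshape(c, -1)                                   # (C, H*W)
    flat_masks = masks.reshape(masks.shape[0], -1).astype(features.dtype)  # (N, H*W)
    sums = flat_masks @ flat_feats.T                                       # (N, C)
    counts = flat_masks.sum(axis=1, keepdims=True)                         # (N, 1)
    means = sums / np.maximum(counts, 1.0)       # guard against empty masks
    norms = np.linalg.norm(means, axis=1, keepdims=True)
    return means / np.maximum(norms, 1e-8)

def match_segments(desc_a, desc_b):
    """For each descriptor in desc_a, index of the most similar one in desc_b."""
    sim = desc_a @ desc_b.T                      # cosine similarity (inputs are unit length)
    return sim.argmax(axis=1), sim

# Synthetic stand-in for a 1x256x64x64 feature tensor (assumption: not real SAM output).
rng = np.random.default_rng(0)
features_a = rng.standard_normal((1, 256, 64, 64)).astype(np.float32)

masks_a = np.zeros((2, 64, 64), dtype=bool)
masks_a[0, :20, :20] = True      # segment 0: top-left block
masks_a[1, 40:, 40:] = True      # segment 1: bottom-right block

# "Second frame": same features, but the segments are listed in the opposite order.
features_b = features_a.copy()
masks_b = masks_a[::-1].copy()

desc_a = segment_descriptors(features_a[0], masks_a)
desc_b = segment_descriptors(features_b[0], masks_b)
matches, sim = match_segments(desc_a, desc_b)
print(matches)
```

The same descriptors could be fed to any off-the-shelf clustering method to group segments. Whether the encoder features are actually discriminative enough for that is exactly what I'm asking.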