EfficientSAM input size

Hi, I am using OpenCV 4.10 and wanted to try some of the neural networks available in the opencv_zoo repository on GitHub (opencv/opencv_zoo: Model Zoo For OpenCV DNN and Benchmarks).

I am interested in EfficientSAM. I managed to make it work in C++, but I couldn’t find how to change the image resolution. It seems to always output 640x640 images and to run the segmentation at that resolution. With images in 4K or higher, it lacks precision.
I am really new to the DNN module, so the answer might be obvious…
Can you guide me?

Thank you!

did you run this file?

the _preprocess function resizes all inputs to be 640 by 640. if you give it something larger, it’ll take it. the output might be that size, but you can resize that to fit your input. yes, that means it won’t be “pixel-accurate”. it probably isn’t even if the input is sized exactly as the model requires.
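to make the resize round trip concrete, here is a small sketch of the coordinate arithmetic, assuming a fixed 640 by 640 model input as in the script. the helper names are made up for illustration and are not part of opencv_zoo.

```python
# Sketch only: these helper names are hypothetical, not part of opencv_zoo.
MODEL_SIZE = (640, 640)  # (width, height) that every input gets resized to


def scale_factors(image_size, model_size=MODEL_SIZE):
    """Return (sx, sy): how many source pixels one model pixel covers."""
    img_w, img_h = image_size
    mod_w, mod_h = model_size
    return img_w / mod_w, img_h / mod_h


def to_model_coords(point, image_size, model_size=MODEL_SIZE):
    """Map a point from original-image pixels to model-input pixels."""
    sx, sy = scale_factors(image_size, model_size)
    x, y = point
    return x / sx, y / sy


def to_image_coords(point, image_size, model_size=MODEL_SIZE):
    """Map a point from model-input pixels back to original-image pixels."""
    sx, sy = scale_factors(image_size, model_size)
    x, y = point
    return x * sx, y * sy


if __name__ == "__main__":
    uhd = (3840, 2160)  # a 4K frame, width x height
    print(scale_factors(uhd))                   # (6.0, 3.375)
    print(to_model_coords((1920, 1080), uhd))   # (320.0, 320.0)
```

for a 3840 by 2160 frame, one pixel of the 640 by 640 mask covers about 6 by 3.4 source pixels, which is why edges look coarse after scaling the mask back up.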

try changing the occurrences of 640 in that script. try doubling that, or halving. not all arbitrary numbers might work, only some specific sizes. if nothing works, the model is probably not fully convolutional.
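one way to try sizes systematically is a small loop that attempts each candidate and records which ones run. this is only a sketch: `run_at_size` is a hypothetical callable you would write yourself to wrap the script's inference with the 640s replaced by the given size, and the candidate list is a guess, not a set of sizes known to work.

```python
def find_working_sizes(run_at_size, candidates=(320, 640, 1024, 1280)):
    """Try each square input size; return (working_sizes, failures).

    run_at_size: hypothetical callable taking one int (the side length)
    and raising an exception if the network rejects that size.
    """
    working = []
    failures = {}
    for size in candidates:
        try:
            run_at_size(size)
        except Exception as exc:  # record the error message and keep going
            failures[size] = str(exc)
        else:
            working.append(size)
    return working, failures


if __name__ == "__main__":
    def fake_run(size):
        # Stand-in for real inference: pretend only 640 is accepted.
        if size != 640:
            raise ValueError(f"unsupported input size {size}")

    ok, bad = find_working_sizes(fake_run)
    print("working:", ok)
    print("failed:", sorted(bad))
```

the error messages collected in `failures` usually hint at which layer has a fixed shape baked in.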

Thank you for the quick reply!
I tried resizing my input image to 2*640 to see if the output would change, but it remained at 640x640.
I think this model cannot handle other dimensions.

Again, thanks for your quick response

I didn’t suggest that.

I suggested editing the code of that script that runs the network.

some places on the internet talk about input being 1024 by 1024, so who knows what’s possible.

From what I understand of the file you pointed out, the input size is applied through inputSize:

        image = cv.resize(image, self._inputSize)

which basically resizes the image before it is passed to the network.
Is there something I missed?