a real-world solution would be to figure out who drew those boxes and how to make them give you the coordinates properly.
this sounds like either an academic exercise, where the instructor needs to be told they're being unreasonable, or some corporate bureaucratic issue, where someone who was mistakenly put "in charge" needs to be told they're being unreasonable.
if you need to discuss the problem as you’ve posed it, you need to provide proper source data. that highly compressed thumbnail is impossible to experiment on.
Thanks for the reply!
I'd agree that a real-world solution / production-grade implementation would provide the co-ordinates directly. That would basically solve everything here.
Unfortunately, that isn't the case.
i) It isn't an academic or corporate issue; it's from my own project, where I'm experimenting with a closed-loop AI system.
ii) The bounding box is generated by the AI (an object detection model), which I can't modify to extract co-ordinates as it's a proprietary system. All I can receive from it are images with bounding boxes drawn on detected objects (in this case, a person).
iii) I'd of course love to discuss the approach further. More data is shared below.
Progress so far:
i) From the image with the bounding box, detected the yellow colour and created a mask
ii) Detected the edges on the mask
iii) After computing contours and polygon approximation, extracted the ROI from the source image
This approach isn't reliable: it works on maybe 1 or 2 of the test images I've shared in the link.
I'm limited to one embedded image here. The code for my current method and the test data are in the link below.