Interactive object detection in browser (NOT in real time)

Prout · April 30, 2021, 12:00pm

Hello there,

I’ve just started learning about computer vision, so I’m also discovering the most commonly CV associated language, which is Python. I’m a Python total beginner (only learnt C and Pascal in my youth), so sorry in advance for the noobish question.
I’ve found countless tutorials about object detections in pictures, videos, and even real time streaming (using web workers) but there’s nothing about doing the job on static webpages.
What I’d like to do is: launch a Chrome/Firefox session, browse to a site, perform object detection on that website, and maybe interact with it. Typically, I’d go to an animal care website, detect a dog among the different animals pictures, then click on the first detected dog picture to go to that URL.
Now, from what I understood after looking at the code, cv2 needs a path to a picture first, using the imread method:
image = cv2.imread(imagepath)
The question is: is there a way or a trick to specify a part of the screen as the imread input, instead of a jpg file location? Then I could give the Chrome window coordinates as boundaries for the input.
I first thought about a workaround, which is: take a screenshot of my desktop, define the ROI = box within the browser borders, perform object detection, calculate the X,Y coordinates of the target, then go back to Chrome and send a mouse click on X, Y. But that’s a bit awful, and painful. I’d prefer to do that “live”, i.e. directly on my browser screen. Since the page is static once everything is loaded (no embedded video), I don’t need RT detection.
Thank you in advance.

berak · April 30, 2021, 1:15pm

if you really want to do this in the browser (and you don’t control any of those websites you visit), you probably need to do this from javascript (running inside the browser), not python, which would have to run e.g. on some server.

have a look at the opencv.js tutorials

Prout · April 30, 2021, 4:35pm

oh, thank you very much. I got it

crackwitz · April 30, 2021, 5:53pm

finding specific objects and clicking on them… are you trying to solve captchas?

if you use python, you can find some screenshot module and take a screenshot

(for windows, I’d recommend d3dshot · PyPI )

you have no reason to deal with the browser directly.

you can simply send a mouse click. it’ll land in whatever program happens to be under the pointer. there’s much written about how to simulate a mouse click using python.

Prout · April 30, 2021, 7:08pm

Actually I’m currently looking for some bargains on vinted-like sites, especially used trench coat. As I’m quite lazy, I’m just trying to scrape to find a list of offers. Of course I could use the sort menu, it’s easier… but it’s also fun learning python/java while shopping

Topic		Replies	Views
Live stream object detection videoio , imgproc , objdetect , game-automation	9	1357	January 20, 2021
Object Detection On The Streaming Video Python	0	159	May 11, 2024
How can do object detection in video using OpenCV and Javascript opencvjs	2	541	October 10, 2022
Opencv on browser	4	202	August 19, 2024
On screen detections Python programming	5	744	February 28, 2022

Interactive object detection in browser (NOT in real time)

Related topics