Live stream object detection

hey, I was wondering if it’s possible (and how hard it is) to do live image character detection on a live feed/stream. the image always looks the same, so no training should be needed. can someone please tell me how this is done? is a video file necessary, or how does it process the information?

I am new to OpenCV but not new to Python.

please illustrate your problem with pictures. you say “live image character detection”. please elaborate, give examples. pictures are preferable to words.


so a sample image would be like this. it’s a drawing and it always looks the same. if this is coming from a video stream, how would I process it?

I want to add that the image is only on screen for a few seconds, then another image will pop up next round. I’m not sure if we have to process a series of screenshots (jpg, png, etc.) or how it works… if I’m trying to capture a specific moment from a video stream, process it in real time, and do this consistently, do I need a USB camera that looks at my monitor? or what can I use?

ok so it’s for game “automation”. you will find lots of resources on the topic on the internet. you aren’t the first to write a game bot.

no webcams. you just need desktop/screen capture. on Windows, I’d recommend the Python module “d3dshot”.

for recognizing this, collect all the possible symbols as templates, then check the precise location on screen: calculate the pixelwise difference between that screen region and each known symbol. the matching symbol should give the smallest difference, 0 or close to it.
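to make the pixelwise-difference idea concrete, here is a minimal sketch using only numpy. the function name `best_match` and the random arrays standing in for symbol images are made up for illustration; in practice the templates would be loaded from saved crops and the patch would be cut from a captured frame. it assumes the patch and every template have exactly the same shape.

```python
import numpy as np

def best_match(patch, templates):
    """Return (name, score) of the template with the smallest
    mean absolute pixel difference to the patch."""
    scores = {}
    for name, tmpl in templates.items():
        if tmpl.shape != patch.shape:
            raise ValueError(
                f"template {name!r} has shape {tmpl.shape}, "
                f"patch has shape {patch.shape}"
            )
        # cast to a signed type so uint8 subtraction cannot wrap around
        diff = np.abs(patch.astype(np.int16) - tmpl.astype(np.int16))
        scores[name] = float(diff.mean())
    name = min(scores, key=scores.get)
    return name, scores[name]

# synthetic stand-ins for real symbol images (hypothetical data)
rng = np.random.default_rng(0)
templates = {
    "circle": rng.integers(0, 256, (32, 32, 3), dtype=np.uint8),
    "square": rng.integers(0, 256, (32, 32, 3), dtype=np.uint8),
}

# a "screen patch" that is the square symbol with one pixel altered
patch = templates["square"].copy()
patch[0, 0] = 0

name, score = best_match(patch, templates)
print(name, score)
```

because the comparison is pixel by pixel, the templates should be cropped from the same screen region at the same resolution as the live capture. if the on-screen size can change, the patch would have to be resized to the template size first (for example with `cv2.resize`).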

by the way: you don’t need to use an avatar picture on this forum… I think the depicted man would agree.


thanks crackwitz, i will take a look into that python library. appreciate the help!

So i’ve read the documentation for d3dshot (thanks crackwitz!)

am I correct in understanding that to do object detection on a video, we need a buffer (which captures 60-100 frames), then we convert these frames into numpy arrays (or whatever format), and then compare them to the numpy arrays of our reference images? is that generally correct?

what happens if the numpy array in our database shows the object at a certain size/resolution? does the object from the screen capture have to be exactly the same size as the object in our “database”? sorry if I’m using the wrong nomenclature, I’m kinda new to this library.

there also seems to be limited documentation for d3dshot. would I be better off using a library such as pytorch or pillow that might have more documentation to get me started? there seem to be more examples using pytorch/pillow than d3dshot (I could barely find any code/examples).

you misunderstand the purpose of d3dshot.

it is purely for capturing the screen.
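to show how capturing and recognizing fit together, here is a rough sketch of a loop that handles one frame at a time, so no buffer of 60-100 images is involved. everything named here is hypothetical: `watch`, the `grab_frame` callable, the region coordinates and the threshold. `grab_frame` is whatever returns the current screen as an H x W x 3 uint8 numpy array; with d3dshot that would presumably be the screenshot method of an instance created with numpy output, but that part is an assumption and is not exercised below. fake frames are used instead so the sketch runs anywhere.

```python
import numpy as np

def watch(grab_frame, region, templates, threshold=5.0, max_frames=None):
    """Yield a template name each time the symbol in `region` changes.

    grab_frame: callable returning one H x W x 3 uint8 frame per call
    region: (left, top, right, bottom) in pixels
    """
    left, top, right, bottom = region
    last = None
    count = 0
    while max_frames is None or count < max_frames:
        frame = grab_frame()
        count += 1
        if frame is None:
            continue  # nothing captured this time, try again
        patch = frame[top:bottom, left:right].astype(np.int16)
        best, best_score = None, threshold
        for name, tmpl in templates.items():
            score = np.abs(patch - tmpl.astype(np.int16)).mean()
            if score < best_score:
                best, best_score = name, score
        if best != last:
            last = best
            if best is not None:
                yield best

# fake frames standing in for real screen captures (hypothetical data)
blank = np.zeros((100, 100, 3), np.uint8)
templates = {
    "A": np.full((10, 10, 3), 200, np.uint8),
    "B": np.full((10, 10, 3), 50, np.uint8),
}
region = (20, 30, 30, 40)  # left, top, right, bottom

def with_symbol(name):
    frame = blank.copy()
    frame[30:40, 20:30] = templates[name]
    return frame

frames = [blank, with_symbol("A"), with_symbol("A"), blank, with_symbol("B")]
seen = list(watch(iter(frames).__next__, region, templates,
                  max_frames=len(frames)))
print(seen)
```

the capture library only supplies the frames. the comparison itself is plain numpy (or OpenCV), so the amount of d3dshot documentation matters less than it might seem.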

oh okay thank you Crackwitz!