My question is: how do I make it real time? That is, I give a query image and, in real time, the system tracks the person I indicated by ID, recognizing that person at all times (currently it recognizes the query person, but only generates a photo with all of their appearances in the gallery).
I’m not looking to offend anyone, I’m just looking for first-step advice. I didn’t know there was documentation for everything; I’m new. I will follow the steps you suggest (including reading the documentation), and if I have a problem I’ll let you know.
If you have any more steps to suggest for changing the code to real time, I would appreciate it.
you can think of the cnn used here as a feature extraction tool: instead of comparing images, you compare a much lower-dimensional embedding.
the sample there is heavy on gallery processing, maybe it helps if we extract the bare necessities:
import cv2
import numpy as np

# once, on startup
net = cv2.dnn.readNet("youtu_reid_baseline_lite.onnx")

def process(net, img):
    # preprocess to the 128x256 input the reid model expects
    blob = cv2.dnn.blobFromImage(img, 0.5, (128, 256), (128, 128, 128), True, False)
    net.setInput(blob)
    res = net.forward()
    return cv2.normalize(res, None)  # L2-normalize the feature vector

# get features from cropped person images
a = process(net, person_img_a)
b = process(net, person_img_b)

# now you can compare person features using the dot product
# (this is a similarity: higher means more alike, since the features are normalized)
sim = np.matmul(a, b.T)
if sim > some_threshold:
    # same person
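To make the comparison step real time: once the features are L2-normalized, matching a detected crop against the stored query feature is just a dot product per frame. A pure-NumPy sketch of that matching step (the function names and the 0.6 threshold are my assumptions, not from the sample):

```python
import numpy as np

def l2_normalize(v):
    # scale the feature vector to unit length so the dot product
    # becomes cosine similarity in [-1, 1]
    return v / np.linalg.norm(v)

def is_same_person(query_feat, crop_feat, threshold=0.6):
    # higher dot product = more similar (it is a similarity,
    # not a distance, so compare with >, not <)
    sim = float(np.dot(l2_normalize(query_feat), l2_normalize(crop_feat)))
    return sim > threshold

# per frame, you would run the reid net on each detected person crop
# and compare its feature to the query feature:
f = np.array([0.1, 0.9, 0.3])
print(is_same_person(f, f))   # True (identical features, similarity 1.0)
print(is_same_person(f, -f))  # False (opposite direction, similarity -1.0)
```

The threshold has to be tuned on your own data; there is no universal value.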
Honestly, that approach is a bit complicated for me. I am studying HOG in depth. Which do you think is better? If I compare two HOG descriptors, that is, one from a stored image of a person and one from a video where that person appears in the same clothes, do you think it would have the same effect as ReID?
Look at the following code. It’s simple, but it analyzes the HOG of the image.
well, you’re at step 2 of 5 there, so, still work to do …
(extract histograms from the mag/angle features)
also, be cautious with satya’s tutorial there – it’s working with 3-channel images, so you get 3-channel gradient, magnitude, and angle images as well (and the sizes/numbers don’t quite add up there)
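For the next step (extracting histograms from the mag/angle features), each cell gets an orientation histogram where every pixel votes with its gradient magnitude. A minimal single-channel sketch (the function name and the simplification of skipping bilinear bin interpolation are mine, not from the tutorial):

```python
import numpy as np

def cell_histogram(mag, ang, bins=9):
    # unsigned gradients: fold angles into [0, 180)
    ang = ang % 180.0
    # each pixel votes into its orientation bin, weighted by magnitude
    # (real HOG also interpolates votes between neighbouring bins)
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, 180.0), weights=mag)
    return hist

# toy 2x2 "cell": two pixels pointing near 0 degrees, two near 90 degrees
mag = np.array([[1.0, 2.0], [3.0, 4.0]])
ang = np.array([[5.0, 10.0], [95.0, 100.0]])
h = cell_histogram(mag, ang)
print(h.sum())  # 10.0 -- total votes equal total magnitude
```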
Simply put, how is ReID different? I mean, why is it better? Because if it is better, then I will prefer ReID.
I will do the following with ReID:
I will use an object tracker (the well-known one that uses “MobileNetSSD_deploy.caffemodel”; I see that it is very good).
I will analyze, in the background (in another program), each person that appears, against my query. That is, I will run the ReID program automatically only to determine whether each person matches the query photo or not.
If it is the person in the query, I will take the position that “MobileNetSSD_deploy.caffemodel” gives me, keep following them, and keep the camera on them.
If an object is detected, it can be tracked via the position of its box, am I wrong? But I am using this code to assign an ID to each person it sees: Simple object tracking with OpenCV - PyImageSearch
That is, my camera will follow the ID that matches the ReID result.
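The plan above – track everyone by ID, run ReID only to decide which tracked ID is the query person, then follow that ID – could be glued together roughly like this (all names here are placeholders of mine, not from the PyImageSearch code):

```python
import numpy as np

def find_query_id(tracked, query_feat, reid_fn, threshold=0.6):
    # tracked: {object_id: person_crop} from the centroid tracker
    # reid_fn(crop) -> L2-normalized feature vector from the reid net
    # returns the ID whose crop best matches the query, or None
    best_id, best_sim = None, threshold
    for object_id, crop in tracked.items():
        sim = float(np.dot(query_feat, reid_fn(crop)))
        if sim > best_sim:
            best_id, best_sim = object_id, sim
    return best_id

# stub demo: the "crops" are already unit feature vectors here,
# so reid_fn is just the identity
q = np.array([1.0, 0.0])
tracked = {1: np.array([0.0, 1.0]), 2: np.array([0.9, 0.435889894])}
print(find_query_id(tracked, q, lambda c: c))  # 2
```

Once `find_query_id` returns an ID, the camera only needs that ID’s centroid from the tracker; ReID does not have to run on every frame, only often enough to confirm the match.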