Hello everyone. I need to ask you something.
I recently worked on two different projects:
1: Face Mask Detection
2: Empty Reception Detection (checks to see if the reception is empty or not)
When I trained both models, the average loss I got was under 0.4. I used YOLOv3 (Darknet) for both of them.
When I ran the face mask detection model on a live video feed (i.e. my webcam) using "cv2.VideoCapture(0)", the video was really smooth with good FPS and the accuracy was also good.
But when I ran the empty reception detection model on the live webcam feed with the same code, the video lagged a lot. The accuracy is still really good, but the video is laggy and drops frames. I tried the following two steps to solve it:
1: Used a higher-quality webcam, but that made no difference.
2: Someone advised me to use yolov3-tiny since it is lightweight, but the video was still very laggy.
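(For anyone trying to reproduce this: one way to pin down where the time goes is to time the detection step on its own, independent of the camera. Below is a minimal, self-contained timing sketch; `fake_inference` is a hypothetical stand-in for the real `cap.read()` + `net.forward(output_layers)` pass, so the snippet runs without a webcam or weights.)

```python
import time

def measure_fps(step, n_frames=30):
    """Run `step` n_frames times and return the average frames per second."""
    start = time.perf_counter()
    for _ in range(n_frames):
        step()
    elapsed = time.perf_counter() - start
    return n_frames / elapsed

# Hypothetical stand-in for one detection pass; in the real loop this
# would be cap.read() + blobFromImage + net.forward(output_layers).
def fake_inference():
    time.sleep(0.01)  # pretend each frame takes ~10 ms

print(f"approx. {measure_fps(fake_inference, n_frames=10):.1f} FPS")
```

Wrapping the real `net.forward` call the same way would show whether the network itself is the bottleneck or whether the drawing/display code is to blame.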
Can someone please tell me what I am missing or doing wrong?
I was hoping I would get a smoother inference run.
The code that I'm using is:
import cv2
import numpy as np

# Load the trained Darknet model
net = cv2.dnn.readNet("yolov3_training_last.weights", "yolov3_testing.cfg")
classes = ["person"]

cap = cv2.VideoCapture(0)

# Resolve the names of the YOLO output layers.
# getUnconnectedOutLayers() returns a 2-D array in older OpenCV versions
# and a 1-D array in newer ones; flatten() handles both.
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]
colors = np.random.uniform(0, 255, size=(len(classes), 3))

while cap.isOpened():
    success, img = cap.read()
    if not success:
        print("Ignoring empty camera frame")
        continue

    height, width, channels = img.shape

    # Preprocess: scale pixel values to [0, 1] and resize to the network input size
    blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.3:
                # Box center and size come back relative to the frame size
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Top-left corner of the rectangle
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    # Non-maximum suppression to drop overlapping boxes; flatten() copes with
    # the nested index format returned by older OpenCV versions.
    indexes = np.array(cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)).flatten()
    font = cv2.FONT_HERSHEY_PLAIN
    for i in indexes:
        x, y, w, h = boxes[i]
        label = str(classes[class_ids[i]])
        color = colors[class_ids[i]]
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 1)
        cv2.putText(img, label, (x, y + 30), font, 1, color, 1)

    cv2.imshow("Image", img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
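(One workaround not shown in the code above is frame skipping: run the expensive `net.forward()` pass only every Nth frame and redraw the last detected boxes on the frames in between. A minimal sketch of the bookkeeping, with `DETECT_EVERY` as a hypothetical interval to tune:)

```python
DETECT_EVERY = 3  # hypothetical: run full detection on every 3rd frame

def should_detect(frame_idx, every=DETECT_EVERY):
    """Return True when this frame should go through the full net.forward() pass."""
    return frame_idx % every == 0

# In the capture loop, something like:
#   if should_detect(frame_idx):
#       outs = net.forward(output_layers)   # expensive pass
#       last_boxes = ...                    # recompute boxes from outs
#   # draw last_boxes on every frame, then frame_idx += 1
```

Lowering the blob size in `blobFromImage` from (416, 416) to (320, 320) is another knob that trades some accuracy for speed, since YOLOv3 also supports a 320x320 input.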