With the ultralytics tool, I was able to get 86% confidence, while with the opencv code I got 25%.
I can’t share the modified code referenced above because it belongs to the company I work for, but can you give me any hints as to why this is happening? I’m running inference with the yolov5m6 model and a 1280x1280 input size.
An update: it turns out the Ultralytics tool also gives 25% confidence if the image is first resized to 1280x1280. This is essentially what happens in cv2.dnn.blobFromImage(frame, 1 / 255.0, (1280, 1280), swapRB=True, crop=False), where the image is resized.
Is there any way to change the blobFromImage flags to get the desired confidence?
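For reference, going by the documented behaviour of blobFromImage, neither value of crop keeps the whole frame at its original aspect ratio: crop=False stretches the image to the target size, and crop=True resizes it until it covers the target and then center-crops it. The sketch below only reproduces that geometry in plain Python for a 3840x2160 frame (the size is an assumption based on the 4K frames mentioned in this thread). The helper name blob_geometry is made up for illustration and is not part of OpenCV.

```python
def blob_geometry(src_w, src_h, dst_w, dst_h, crop):
    """Return (scale_x, scale_y, resized_w, resized_h) applied before any crop.

    Hypothetical helper: it mirrors the documented resize behaviour of
    cv2.dnn.blobFromImage, it does not call OpenCV.
    """
    if crop:
        # Uniform scale so the resized image covers the target, then center crop.
        scale = max(dst_w / src_w, dst_h / src_h)
        return scale, scale, round(src_w * scale), round(src_h * scale)
    # crop=False: each axis is scaled independently (aspect ratio is not kept).
    return dst_w / src_w, dst_h / src_h, dst_w, dst_h


stretch = blob_geometry(3840, 2160, 1280, 1280, crop=False)
cropped = blob_geometry(3840, 2160, 1280, 1280, crop=True)

# Ratio of vertical to horizontal scale: anything other than 1 means distortion.
distortion = stretch[1] / stretch[0]
# Fraction of the original width that survives the center crop.
kept_width = 1280 / cropped[2]

print(f"crop=False scales: x={stretch[0]:.3f}, y={stretch[1]:.3f}, distortion={distortion:.3f}")
print(f"crop=True resizes to {cropped[2]}x{cropped[3]} and keeps {kept_width:.0%} of the width")
```

So with crop=False a 16:9 frame is squeezed horizontally, and with crop=True the left and right edges of the frame are cut off. Neither flag pads the image.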
I fetch an image from a camera with requests, decode it with OpenCV, and then follow the same process as the LearnOpenCV tutorial mentioned above. To clarify, “for net in range(len(self.net)):” exists because I run the same image through several models, each trained on a single class.
import cv2
import numpy as np
import requests
from requests.auth import HTTPDigestAuth

def processa(self, url, username, password, index, sectorXY):
    imagem = requests.get(url, auth=HTTPDigestAuth(username, password))
    # np.frombuffer replaces the deprecated np.fromstring for binary data
    frame = cv2.imdecode(np.frombuffer(imagem.content, np.uint8), cv2.IMREAD_UNCHANGED)
    tamanho = frame.shape
    altura = frame.shape[0]
    comprimento = frame.shape[1]
    # crop=False stretches the frame to 1280x1280 without keeping its aspect ratio
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (1280, 1280), swapRB=True, crop=False)
    objetos_capturados_frame = []
    for net in range(len(self.net)):
        if str(net) not in self.filtros[index]:
            # The blob has to be set as the network input before forward()
            self.net[net].setInput(blob)
            layer_names = self.net[net].getLayerNames()
            outputlayers = [layer_names[i-1] for i in self.net[net].getUnconnectedOutLayers()]
            outputs = self.net[net].forward(outputlayers)
            class_ids = []
            confidences = []  # Confidence score of each kept detection
            caixas = []
            rows = outputs[0].shape[1]
            # Scale factors from the 1280x1280 input back to the frame (3840x2160 for 4K)
            x_factor = comprimento / 1280
            y_factor = altura / 1280
            for r in range(rows):
                row = outputs[0][0][r]
                confidence = row[4]
                # Discard bad detections and continue.
                if confidence >= (self.confidence/100):
                    classes_scores = row[5:]
                    if classes_scores[0] > 0.5:
                        cx, cy, w, h = row[0], row[1], row[2], row[3]
                        left = int((cx - w/2) * x_factor)
                        top = int((cy - h/2) * y_factor)
                        width = int(w * x_factor)
                        height = int(h * y_factor)
                        caixas.append([left, top, width, height])
                        # NMSBoxes needs one score per box, so record both together
                        confidences.append(float(confidence))
                        class_ids.append(int(np.argmax(classes_scores)))
            indexes = cv2.dnn.NMSBoxes(caixas, confidences, 0.5, 0.4)
            if len(indexes) > 0 and net == 0:
                for i in indexes:
                    objeto_no_frame = {}
                    # i = i[0]  (only needed on older OpenCV versions that return nested indices)
                    caixa = caixas[i]
                    x = caixa[0]
                    y = caixa[1]
                    w = caixa[2]
                    h = caixa[3]
                    objeto_no_frame["object_id"] = int(class_ids[i])
                    objeto_no_frame["confianca"] = round(confidences[i], 2)
                    objeto_no_frame["topLeft"] = [x, y]
                    # Note: this stores the box width and height, not the corner coordinates
                    objeto_no_frame["bottomRight"] = [w, h]
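One detail that may matter when comparing numbers with the Ultralytics tool: its post-processing reports the objectness multiplied by the class score, whereas the loop above thresholds on row[4] (objectness) alone. Below is a minimal sketch of that combination on a made-up output row. final_score is a hypothetical helper, not part of either library, and the row layout is the one assumed by the code above.

```python
def final_score(row):
    """Return (class_id, score) for one YOLOv5 output row.

    Assumed row layout: [cx, cy, w, h, objectness, class_0, class_1, ...].
    """
    objectness = row[4]
    class_scores = row[5:]
    # Index of the best class score (always 0 for a single-class model).
    class_id = max(range(len(class_scores)), key=lambda i: class_scores[i])
    return class_id, objectness * class_scores[class_id]


# Made-up row for a single-class model: objectness 0.90, class score 0.95.
example_row = [640.0, 360.0, 100.0, 50.0, 0.90, 0.95]
class_id, score = final_score(example_row)
print(class_id, round(score, 3))
```

If the two tools are compared on different quantities, a gap of a few percentage points would not be surprising, although that is only one possible contributor.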
I will look into the YOLO output layers. But as I mentioned in my third reply, I also got 25% confidence with the Ultralytics tool. I did the following:
In one test, I ran python detect.py --weights runs/train/exp/weights/best.onnx --source image.jpg --imgsz 1280 1280 --device 0, where image.jpg is a 4K image, and got 84% confidence.
In another test, I ran python detect.py --weights runs/train/exp/weights/best.onnx --source image_resized.jpg --imgsz 1280 1280 --device 0, where image_resized.jpg is a 1280x1280 image, and got 25% confidence.
I think cv2.dnn.blobFromImage is resizing the image, so inference runs on the resized image rather than the original. The problem is that I exported the ONNX model at 1280x1280, so the size parameter of blobFromImage can only be (1280, 1280).
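A possible way around this that keeps the (1280, 1280) input is to letterbox the frame before building the blob: scale it uniformly so that it fits inside 1280x1280 and pad the remainder, which is how the Ultralytics pre-processing treats a non-square image. The sketch below only computes the geometry and the inverse mapping for boxes, assuming a 3840x2160 frame. The function names are hypothetical, and the actual resize and padding would be done with something like cv2.resize and cv2.copyMakeBorder before passing the already-square image to blobFromImage.

```python
def letterbox_params(src_w, src_h, dst=1280):
    """Uniform scale and padding that fit a src_w x src_h frame into dst x dst."""
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst - new_w) / 2  # padding on the left (same on the right)
    pad_y = (dst - new_h) / 2  # padding on the top (same on the bottom)
    return scale, new_w, new_h, pad_x, pad_y


def box_to_original(cx, cy, w, h, scale, pad_x, pad_y):
    """Map a center-format box from letterboxed coordinates back to the frame.

    Returns [left, top, width, height] in original-image pixels.
    """
    left = (cx - w / 2 - pad_x) / scale
    top = (cy - h / 2 - pad_y) / scale
    return [round(left), round(top), round(w / scale), round(h / scale)]


scale, new_w, new_h, pad_x, pad_y = letterbox_params(3840, 2160)
print(scale, new_w, new_h, pad_x, pad_y)

# A made-up detection centered in the letterboxed input.
print(box_to_original(640, 640, 300, 150, scale, pad_x, pad_y))
```

With this approach a single scale replaces the separate x_factor and y_factor, and the padding has to be subtracted before scaling the boxes back.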
I have version 4.6.0. But I don’t think this is a version issue, as YOLOv5 is from 2020/2021 and my OpenCV version is from 2022. If an earlier OpenCV version were the problem, YOLOv5 would not have worked until OpenCV 4.8.0 was released in 2023. Something else is causing my issue. I have a feeling it is related to image resizing, but I don’t know how to get around it.
I still don’t know why a 5% discrepancy in confidence remains, but I’m on the right track.
Anyway, thank you for your help. Believe it or not, the word “outdated” that you wrote helped me a lot, as it led me to look for a more recent tutorial for the code.