dcox
December 22, 2021, 4:55pm
I have successfully exported a yolov5 model to ONNX and was able to read the model using readNetFromONNX(). I then set input using a test image and ran net.forward() which returned a Mat. I am now working on postprocessing to interpret the data contained in the returned Mat.
Most of the examples that I have found that illustrate calling forward() and interpreting the results assume that the returned Mat is 2D. In contrast, the Mat that I am getting is 3D. More specifically, the rank of the Mat is 3. The sizes of these three dimensions are 1x25200x8.
I have not been able to find any information about how to interpret such a result and was wondering if anyone has any suggestions?
Thank you.
berak
December 22, 2021, 5:18pm
you can simply reshape the output to 2d like:
Mat res = output.reshape(1, 25200); // 1 channel, 25200 rows: a 25200 x 8 Mat (printed as [8 x 25200])
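For Python users, the same flattening can be sketched with NumPy, since net.forward() returns an ndarray there. The zero-filled array below is only a stand-in for the real network output, and the names output and res simply mirror the C++ line above.

```python
import numpy as np

# Stand-in for the array returned by net.forward(): shape (1, 25200, 8).
output = np.zeros((1, 25200, 8), dtype=np.float32)

# Drop the leading batch dimension; -1 lets NumPy infer the row count.
res = output.reshape(-1, output.shape[-1])

print(res.shape)  # (25200, 8)
```

Using -1 instead of a hard-coded 25200 keeps the line valid if the model is exported with a different input size.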
yolov5 has 25200 candidate boxes for a 640x640 input, each row in the 2d Mat is:
cx, cy, w, h, box_prob, p1, p2, p3, ..., pn
where p1 … pn are the N class probabilities (N = 8 - 5 = 3 in your case)
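To make the numbers concrete, here is a small sketch of where 25200 and 8 come from and how one row splits into fields. It assumes the default YOLOv5 export at 640x640 (strides 8, 16 and 32, with 3 anchors per grid cell); the row values and the helper name split_row are made up for illustration.

```python
# Assumed defaults for a YOLOv5 export: 640x640 input, three strides,
# 3 anchors per grid cell.
INPUT_SIZE = 640
STRIDES = (8, 16, 32)
ANCHORS_PER_CELL = 3

# 80x80 + 40x40 + 20x20 grid cells, times 3 anchors each.
num_boxes = sum(ANCHORS_PER_CELL * (INPUT_SIZE // s) ** 2 for s in STRIDES)
print(num_boxes)  # 25200


def split_row(row):
    """Split one detection row into box, objectness and class scores."""
    cx, cy, w, h, box_prob = row[:5]
    class_probs = list(row[5:])
    return (cx, cy, w, h), box_prob, class_probs


# Made-up row with 8 values: 5 box fields + 3 class probabilities.
row = [320.0, 240.0, 100.0, 50.0, 0.9, 0.1, 0.7, 0.2]
box, box_prob, class_probs = split_row(row)
num_classes = len(row) - 5
best_class = class_probs.index(max(class_probs))
print(num_classes, best_class)  # 3 1
```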
someone supplied example code here:
dcox
December 22, 2021, 5:35pm
Beautiful! Thank you so much!
1 Like
As I failed to understand berak’s answer, I found a solution here and thought it might prove useful to other users:
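import cv2
import numpy as np

# Thresholds: example values (assumed), tune them for your model.
CONFIDENCE_THRESHOLD = 0.45  # minimum objectness (row[4])
SCORE_THRESHOLD = 0.5        # minimum best class score
NMS_THRESHOLD = 0.45         # IoU threshold for non-maximum suppression
# net is the model loaded with cv2.dnn.readNetFromONNX(); input_image is a BGR image from cv2.imread().
# Scale pixels to [0, 1] and resize to 640x640. The original line was cut off after [0,0,0];
# swapRB=True, crop=False is assumed here because YOLOv5 expects RGB input.
blob = cv2.dnn.blobFromImage(input_image, 1/255, (640, 640), [0,0,0], swapRB=True, crop=False)
net.setInput(blob)
output_layers = net.getUnconnectedOutLayersNames()
outputs = net.forward(output_layers)
# Class names, one per class the model was trained on, in training order.
classes = ['first class', 'second class']
# Lists to hold respective values while unwrapping.
class_ids = []
confidences = []
boxes = []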
# Number of candidate detections (25200 for a 640x640 input).
rows = outputs[0].shape[1]
image_height, image_width = input_image.shape[:2]
# Resizing factors from the 640x640 network input back to your loaded image.
x_factor = image_width / 640
y_factor = image_height / 640
# Iterate through the detections.
for r in range(rows):
    row = outputs[0][0][r]
    confidence = row[4]
    # Discard bad detections and continue.
    if confidence >= CONFIDENCE_THRESHOLD:
        classes_scores = row[5:]
        # Get the index of max class score.
        class_id = np.argmax(classes_scores)
        # Continue if the class score is above threshold.
        if classes_scores[class_id] > SCORE_THRESHOLD:
            confidences.append(float(confidence))
            class_ids.append(class_id)
            # Convert center/size to top-left/size in original image pixels.
            cx, cy, w, h = row[0], row[1], row[2], row[3]
            left = int((cx - w/2) * x_factor)
            top = int((cy - h/2) * y_factor)
            width = int(w * x_factor)
            height = int(h * y_factor)
            boxes.append([left, top, width, height])
# Perform non-maximum suppression to eliminate redundant overlapping boxes with
# lower confidences.
indices = cv2.dnn.NMSBoxes(boxes, confidences, CONFIDENCE_THRESHOLD, NMS_THRESHOLD)
# Drawing settings: example values (assumed).
BLUE = (255, 0, 0)  # BGR
THICKNESS = 1
# Older OpenCV versions return an Nx1 array, so flatten before iterating.
for i in np.array(indices).flatten():
    left, top, width, height = boxes[i]
    cv2.rectangle(input_image, (left, top), (left + width, top + height), BLUE, 3*THICKNESS)
    label = "{}:{:.2f}".format(classes[class_ids[i]], confidences[i])
    # draw_label is a user-defined helper that is not shown in this post.
    draw_label(input_image, label, left, top)
    print(label)  # class name and confidence
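For readers wondering what the NMS step does with boxes and confidences, here is a simplified pure-Python sketch of greedy non-maximum suppression on [left, top, width, height] boxes. It only illustrates the idea and is not OpenCV's actual implementation; iou and nms_sketch are hypothetical names, and the sample boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two [left, top, width, height] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms_sketch(boxes, scores, score_threshold, nms_threshold):
    """Greedy NMS: keep the best box, drop boxes that overlap it too much."""
    # Candidate indices above the score threshold, best score first.
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes and one separate box (made-up values).
boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [300, 300, 50, 50]]
scores = [0.9, 0.8, 0.7]
print(nms_sketch(boxes, scores, 0.45, 0.45))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.92, so only the higher-scoring one survives; the third box does not overlap anything and is kept.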