How to interpret the result from the yolov3 with blobFromImages as input?

SanSancr777 · February 17, 2022, 9:17pm

How to use the result for each image when processing the batch of images using the blobfromimages as the input to the yolo model ?

How the post processing for each image should be approached to get the bounding box values ?

Thanks

berak · February 18, 2022, 8:58am

kindly look at the samples for this

in general:

collect “box proposals” from several output layers (representing different scales)
each row in there is:
cx,cy,h,w,box_probability,p1,p2,…pn (where p1…pn are class probabilities)
filter out bad proposals by thresholding box / class probs
filter out near duplicates by applying NMS

SanSancr777 · February 18, 2022, 1:20pm

I have tried the following :

Implemented in C#:
I have choosen batch of 8 images.

Mat blob = CvDnn.BlobFromImages(inputImages, 1 / 255.0, swapRB: true);

#input shape = (8,3,672,672) as (Images,Channels,width,height)
model.SetInput(input);
model.SetPreferableBackend(Backend.OPENCV);
model.SetPreferableTarget(Target.CPU);

step 1:
var outNames = model.GetUnconnectedOutLayersNames();
var outs = outNames.Select(_ => new Mat()).ToArray();
model.Forward(outs, outNames); (“The outs will have the predictions from 3 yolo detection layers “yolo_82”,“yolo_94” and “yolo_106” and these should be used to get the result”)

outs has 3 matrices of these {Mat [ -1*-1*CV_32FC1]

Step 2:
var probs = model.Forward();
probs is a single matrix of {Mat [ -1*-1*CV_32FC1]

I am not sure on how to interpret these matrices and get predictions for each image.

Any comment would be greatly appreciated.
Thanks in advance.

berak · February 18, 2022, 1:59pm

quite unfortunate, as there is no official c# wrapper, and now you have an API problem, specific to that, and noone knows here ;(
at least kindly tell us, what exactly you’re using here ?

how did you retrieve that ? do the same for the outputs

that’s bogus. it’s probably printing out Mat::size(), which is 2d only, and cannot handle 4d tensors (so rows & cols are set to -1 on purpose)
the c++ Mat has a size member, which would hold the proper dims in this case

correct. so 3 output Mats per forward pass

why this ? seems wrong / unnessecary. Step1 should be all you need.
(this way, ,you only get the very last layer, “yolo_106”)

if you want my 2ct:
try to get it working for a single image first
(seems you’re not even there yet), then try with batches

SanSancr777 · February 18, 2022, 2:21pm

I have used OpenCVsharp for .Net and C# programming language. OpenCv provides function, to load the darknet weights directly using the config and the weights file.

I have successfully implemented the detection’s for single image and the above code is a tried version for batches.
Yes, I would also use the step1 from above to set the inputs and do the forwards pass.
The output matrices has size and it is giving me the width and height.
1st matrix , size = (width:1323 height:8) and (rows,columns) = -1,-1
2nd matrix , size = (width:5292 height:8)and (rows,columns) = -1,-1
3rd matrix, size = (width:21168 height:8) and (rows,columns) = -1,-1

The value height changes on the number of images used and this is what I have seen from debugging.

I have challenges unpacking the output matrices and get the results for each image.

berak · February 18, 2022, 2:27pm

hmm. numbers look familiar, however, i’d expect it more like [8,1323,85]
as in: 8 images, 1323 rows, 85 cols

SanSancr777 · February 18, 2022, 2:35pm

oh Okay. I don’t exactly know , how the result is provided as output from the model for batches.

It would be very helpful if you guide me further to achieve this. I have looked a lot and no proper implementation has been found.

I will continue to look into it. Thanks

berak · February 20, 2022, 6:14pm

sorry, but i cant help ay further.
again, it’s a problem with your c# API (which we have no knowledge about here)

raise an issue on that github repo

SanSancr777 · February 20, 2022, 6:27pm

Thanks for your reply.

I believe the challenge is to interpret the output from the opencv dnn module for the yolo model and nothing to do with C# or anything else.I have a opencv and python implementation too.

Thanks anyway.

berak · February 21, 2022, 9:32am

so, again, i made a small test from c++, using yolov3 and a batch size of 8.

printing out the output size members gives:

yolo_82   8 x 1200 x 85
yolo_94   8 x 4800 x 85
yolo_106  8 x 19200 x 85

(i think, this must be interpreted as [batch_size x height x width])

SanSancr777 · February 21, 2022, 9:44am

Hi,
Thanks for more help.

This is what I wanted to know. I do not have these output sizes from the model as we have seen above. I will try to get these results and proceed further. Thanks again.

berak · February 21, 2022, 12:56pm

one problem seems to be, finding the correct height (num rows).
(we already know the batch size(8), and the col length(85))
can you check, if e.g. outs[0].Total() / (8*85) == 1200 ?

from there on, you could build an ordinary 2d Mat from it, e.g. for the 3rd batch image of outs[0]:

IntPtr p = outs[0].Ptr(2);
Mat m = new Mat(h,w,CV_32F, p);

SanSancr777 · February 21, 2022, 2:23pm

I have the following information from the outs.
I have Blob Shape = (Images,Channels,height,width) = (8,3,608,608)

outs[0] has the following :
dims = 3, channels = 1
size = (width:1323 height:8) ( I get this by using , outs[0].size() )
size3d = 8 X 1323 X 7 (I have got this by using the size of each dimension, i.e size(dim))
(rows,columns) = -1,-1
outs.Total() = 74088

outs[1] has the following :
dims = 3, channels = 1
size = (width:5292 height:8) ( I get this by using , outs[1].size() )
size3d = 8 X 5292 X 7
(rows,columns) = -1,-1
outs.Total() = 296352

outs[2] has the following :
dims = 3, channels = 1
size = (width:21168 height:8) I get this by using , outs[2].size() )
size3d = 8 X 21168X 7
(rows,columns) = -1,-1
outs.Total() = 1185408

I don’t see 85 anywhere, am I missing anything about this. ?
Thanks

berak · February 21, 2022, 2:40pm

it is actually 5 + nclasses per col for the default pretrained networks
(80 COCO classes, iirc)

so you have a custom trained, 2 class network there ?

SanSancr777 · February 21, 2022, 8:13pm

Hi,

Yes, I have trained the model for 2 custom classes.

I have finished the implementation for batches and thanks for all the help from you.
Thanks a ton.

Topic		Replies	Views
Output from cv::dnn::Net.forward() is multidimensional with yolov5 C++ dnn , yolov5	3	2873	October 13, 2022
Is it possible to set batch size other than one with yolo models? Python dnn	5	975	September 1, 2021
The output type of net.forward after blobFromImages C++ dnn	7	92	September 29, 2024
Use Dnn.readNetFromModelOptimizer to detect objects Android/Java dnn , objdetect	0	725	February 4, 2021
How do I know what each of these index mean for output blob? dnn	4	44	February 23, 2025

How to interpret the result from the yolov3 with blobFromImages as input?

Related topics