The output type of net.forward after blobFromImages

Sea_Sea · September 29, 2024, 9:34am

Assume zidane.jpg has size (480, 640, 3).

In case 1 the network output is 1 x 84 x 6300, where the leading 1 is the batch: (Batch, Classes, Boxes).

In case 2 the network output is also 1 x 84 x 6300, but that is the shape of outs[0].
What does the index i of outs[i] (std::vector<cv::Mat>) mean?
Is i the batch index? But each outs[i] already has shape (Batch, Classes, Boxes).
Are there two batch dimensions?

I ask because the input of cv::dnn::blobFromImages can be a std::vector<cv::Mat>, where the index j of inpMats[j] is the batch index.
For example:
inpMats[0] : (480, 640, 3)
inpMats[1] : (480, 640, 3)

case1

    std::vector<cv::Mat> inpMats;
    // inpMats.push_back(cv::imread("source/data/zidane.jpg"));
    inpMats.push_back(cv::imread("source/data/zidane.jpg"));
    cv::Mat blob1 = cv::dnn::blobFromImages(inpMats, 1.0 / 255.0, cv::Size(640, 480), cv::Scalar(), true, false);
    // blob1 Dimensions: 1 x 3 x 480 x 640

    net.setInput(blob1);
    cv::Mat single_out = net.forward();
    // single_out Dimensions: 1 x 84 x 6300

case2

    std::vector<cv::Mat> inpMats;
    // inpMats.push_back(cv::imread("source/data/zidane.jpg"));
    inpMats.push_back(cv::imread("source/data/zidane.jpg"));
    cv::Mat blob1 = cv::dnn::blobFromImages(inpMats, 1.0 / 255.0, cv::Size(640, 480), cv::Scalar(), true, false);
    // blob1 Dimensions: 1 x 3 x 480 x 640

    net.setInput(blob1);
    std::vector<cv::Mat> outs;
    net.forward(outs, net.getUnconnectedOutLayersNames());
    // outs[0] Dimensions: 1 x 84 x 6300

berak · September 29, 2024, 9:37am

what kind of network is it ?
what are you trying to achieve ?
(the purpose of it ?)

what does net.getUnconnectedOutLayersNames() return ?

please, try to make MRE from your code snippets, so ppl here can actually try to reproduce it, ty.

Sea_Sea · September 29, 2024, 9:47am

I am just confused about how the case 2 network output (std::vector<cv::Mat> outs) is defined.

If each entry of outs has shape (Batch, Classes, Boxes), what does the index i of outs[i] mean?
I already know that for std::vector<cv::Mat> inpMats, each inpMats[j] is (480, 640, 3) and j is the batch index.

I am referring to the code below. I want to design an input of many images and a batched output:

https://github.com/ultralytics/ultralytics/blob/1fd21ca27e0670ae93be0ae3b132ebec3fc4ac82/examples/YOLOv8-CPP-Inference/inference.cpp#L24


      
          std::vector<Detection> Inference::runInference(const cv::Mat &input)
          {
              cv::Mat modelInput = input;
              if (letterBoxForSquare && modelShape.width == modelShape.height)
                  modelInput = formatToSquare(modelInput);
          
              cv::Mat blob;
              cv::dnn::blobFromImage(modelInput, blob, 1.0/255.0, modelShape, cv::Scalar(), true, false);
              net.setInput(blob);
          
              std::vector<cv::Mat> outputs;
              net.forward(outputs, net.getUnconnectedOutLayersNames());
          
              int rows = outputs[0].size[1];
              int dimensions = outputs[0].size[2];
          
              bool yolov8 = false;
              // yolov5 has an output of shape (batchSize, 25200, 85) (Num classes + box[x,y,w,h] + confidence[c])
              // yolov8 has an output of shape (batchSize, 84,  8400) (Num classes + box[x,y,w,h])
              if (dimensions > rows) // Check if the shape[2] is more than shape[1] (yolov8)
              {

opencv/modules/dnn/test/test_onnx_importer.cpp at 450e741f8d53ff12b4e194c7762adaefb952555a · opencv/opencv · GitHub

Reproduce

Just follow the code I pasted. I use blobFromImages:

    std::vector<cv::Mat> inpMats;
    inpMats.push_back(cv::imread("source/data/zidane.jpg"));
    cv::Mat blob1 = cv::dnn::blobFromImages(inpMats, 1.0 / 255.0, cv::Size(640, 480), cv::Scalar(), true, false);
    net.setInput(blob1);

    // case1
    cv::Mat single_out = net.forward();
    // single_out Dimensions: 1 x 84 x 6300

    // case2
    std::vector<cv::Mat> outs;
    net.forward(outs);
    //net.forward(outs, net.getUnconnectedOutLayersNames());
    // outs[0] Dimensions: 1 x 84 x 6300

berak · September 29, 2024, 10:42am

sorry, your reproducer code is incomplete (no network loaded)

however, IF it is really yolov8, we know, that it has a single output layer only, thus you only need to check outputs[0]
(and it’s the same as a simple net.forward())

(other yolo nn’s have 2 or 3 outputs (pyramid scale levels), which you have to collect, before applying NMS)
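To make the distinction concrete, here is a minimal sketch in plain C++ (no OpenCV calls) of what the two indices mean. `Shape` is a hypothetical stand-in for the `size[]` array of one output `cv::Mat`, and the helper names are made up for illustration. The sketch assumes every output keeps the batch as its leading dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the shape of one output cv::Mat (its size[] array).
using Shape = std::vector<std::size_t>;

// What outs.size() reports: one entry per output layer,
// independent of how many images were fed in.
inline std::size_t numOutputLayers(const std::vector<Shape>& outs) {
    return outs.size();
}

// The batch size is the leading dimension of any single output.
inline std::size_t batchSize(const Shape& out) {
    return out.empty() ? 0 : out[0];
}

// Under the assumption above, all output layers report the same batch size.
inline bool batchConsistent(const std::vector<Shape>& outs) {
    for (const Shape& s : outs)
        if (batchSize(s) != batchSize(outs.front())) return false;
    return true;
}
```

For the single-output case in this thread, a list holding one shape {1, 84, 6300} gives numOutputLayers == 1 and batchSize == 1. Feeding more images changes only the leading dimension, not the number of entries.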

Sea_Sea · September 29, 2024, 10:59am

Thanks for your reply.

I think the yolov8 code is simple, so I only pointed out the part I want to focus on.

According to your reply, if I use the net output from case 2, I only need to look at outs[0]: the net will only ever produce outs[0], and the batch is already included in outs[0]. Is that right?

And an output of 1 x 84 x 6300 for outs[1] is impossible?

    // case2
    std::vector<cv::Mat> outs;
    net.forward(outs);
    //net.forward(outs, net.getUnconnectedOutLayersNames());
    // outs[0] Dimensions: 1 x 84 x 6300
    // or outs[0] Dimensions: 10 x 84 x 6300
    // wrong: 1 x 84 x 6300 for outs[1] (impossible)

berak · September 29, 2024, 11:34am

ok, so part 2, - batches :

a batch of 7 input images will result in an input blob like

[ 7, 3, 480, 640 ] // NCHW

and (again, for v8, a single) nn output like:

[7, 84, 6300] // batch, box proposal, count

so you still have to parse / process box proposals per image.

maybe the short answer is:
the number/index of output layers is per scale level, not the image/batch count (which is the 1st index of the resp. output layer tensor)
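As a rough illustration of "parse per image", the sketch below models a contiguous float tensor of shape [N, 84, 6300] in plain C++ and shows how one image's slab and one box attribute are addressed. `Tensor3`, `image` and `boxAttr` are hypothetical names, not OpenCV API. It assumes the output is stored contiguously in row-major order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a contiguous float tensor of shape [N, C, B],
// e.g. [7, 84, 6300]. This is not an OpenCV type.
struct Tensor3 {
    std::size_t n, c, b;      // batch, attributes per box, box candidates
    std::vector<float> data;  // row-major: index = (i * c + j) * b + k

    Tensor3(std::size_t n_, std::size_t c_, std::size_t b_)
        : n(n_), c(c_), b(b_), data(n_ * c_ * b_, 0.0f) {}

    // Element access by (image, attribute, box candidate).
    float& at(std::size_t i, std::size_t j, std::size_t k) {
        return data[(i * c + j) * b + k];
    }

    // Pointer to the start of image i's [C, B] slab.
    const float* image(std::size_t i) const {
        return data.data() + i * c * b;
    }
};

// Read attribute j of box candidate k from one image's [C, B] slab.
inline float boxAttr(const float* slab, std::size_t b,
                     std::size_t j, std::size_t k) {
    return slab[j * b + k];
}
```

With a real `cv::Mat out` from `net.forward()`, the start of image i's slab would typically be `out.ptr<float>(i)`, assuming a continuous CV_32F matrix. The per-image box decoding and NMS then run once per slab.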

Sea_Sea · September 29, 2024, 12:56pm

Yes, you are right.

conclusion

yolov8 has a single output layer tensor, so with the case 2 design there is only outs[0].

If a network has three output layer tensors, the case 2 design gives three entries (outs[0], outs[1], outs[2]).

new question

But if a network has three output layer tensors, the case 1 design cannot be used, right?

case 1: network output type is cv::Mat

case 2: network output type is std::vector<cv::Mat>

berak · September 29, 2024, 1:01pm

indeed !

there are a lot of nn’s with multiple outputs, e.g. pose nn’s (with keypoint / connection outputs) or even yolo variants with additional segmentation maps
