YOLOv3 detection speed slows down (OpenCV 4.5.3)

Hello!
I am trying to use YOLOv3 object detection with OpenCV 4.5.3 and the Nvidia CUDA backend. When I run my program, the detection slows down after a few seconds: the detection time at the start is about 40-50 ms, but after a few seconds it increases to 400-500 ms.
Does anyone have an idea what causes this speed problem?

I tried switching the Net backend back to the CPU, and then the detection time stays constant at about 300-400 ms.
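
Switching back to the CPU just means changing the two backend/target lines in the snippet below to the standard CPU settings, e.g.:

// CPU comparison: use the default OpenCV backend and CPU target instead of CUDA
net2.setPreferableBackend(DNN_BACKEND_OPENCV);
net2.setPreferableTarget(DNN_TARGET_CPU);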

Below is the code snippet I used:
String modelConfiguration = "yolov3.cfg";
String modelWeights = "yolov3.weights";
Net net2 = readNetFromDarknet(modelConfiguration, modelWeights);

net2.setPreferableBackend(DNN_BACKEND_CUDA);
net2.setPreferableTarget(DNN_TARGET_CUDA);
TickMeter tickMeter2;
Mat blob;
for (int count = 0; ; ++count)
{
    tickMeter2.reset();
    tickMeter2.start();

    cap >> image;
    if (image.empty())
    {
        std::cerr << "Can't capture frame " << count << ". End of video stream?" << std::endl;
        break;
    }

    Mat render_image = image.clone();
    vector<Mat> outs;
    if (enable_yolo)
    {
        blobFromImage(image, blob, 1 / 255.0, cv::Size(inpWidth, inpHeight), Scalar(0, 0, 0), true, false);
        net2.setInput(blob);
        net2.forward(outs, getOutputsNames(net2));
    }

    if (enable_yolo)
    {
        postprocess(render_image, outs);
    }
    
    tickMeter2.stop();
    std::string timeLabel = format("Inference time: %.2f ms", tickMeter2.getTimeMilli());
    putText(render_image, timeLabel, Point(0, 40), FONT_HERSHEY_SIMPLEX, 1, Scalar(255, 0, 255), 2);

    imshow(winName, render_image);

    int c = waitKey(10);
}
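
Note that the TickMeter above measures the whole loop iteration (frame capture, forward pass, postprocessing, drawing), not only the network. To check whether it is really the inference that slows down, the pure forward time can also be read from the network itself, for example right after the forward() call:

// Pure inference time of the last forward() call, as reported by the network itself
std::vector<double> layersTimes;
double freq = getTickFrequency() / 1000.0;                    // ticks per millisecond
double inferenceMs = net2.getPerfProfile(layersTimes) / freq;
std::cout << format("forward() time: %.2f ms", inferenceMs) << std::endl;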

I’ve run another experiment with a TensorFlow model, also using CUDA, and I got a similar result.
I used the following code for testing:
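
A minimal sketch of that setup, assuming the same capture/forward loop as above; the .pb/.pbtxt file names are only placeholders:

// TensorFlow test: load a frozen graph instead of the Darknet model,
// then select the same CUDA backend/target as before.
Net net3 = readNetFromTensorflow("frozen_inference_graph.pb", "graph.pbtxt");
net3.setPreferableBackend(DNN_BACKEND_CUDA);
net3.setPreferableTarget(DNN_TARGET_CUDA);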

If I run the program with the CUDA backend and CUDA target, the network slows down dramatically after a few seconds.

My running environment is the following:
Windows 10
Intel i7-1165G7
Nvidia GeForce MX350

Does anyone have an idea what causes this speed problem?

I’ve run a number of experiments on this slowdown problem, but I haven’t found the right solution yet. I’ve also tested this code on another machine. In that environment the program ran well without slowing down, even though that machine is weaker than my development environment (described above).
From my experiments I could not determine what causes the detection speed to slow down.

I’ll briefly summarize the differences between the two systems:

  • development environment: Intel i7-1165G7, Nvidia GeForce MX350 - detection speed: 18 → 10 → 5 FPS (degrading over time)
  • other environment: Intel i5-7200U, Nvidia GeForce 940MX - detection speed: 20-25 FPS (steady)

On the older machine, I found that only one program was using the GPU. However, in my development environment, where I experienced the decrease in speed, several programs were using the GPU at the same time.
I tried changing the graphics acceleration settings in the operating system, but to no avail.

Perhaps someone can give me guidance on which direction to look for a solution to this problem.

@sgd7, this is not my forte, but since there has been no reply, here is mine.

You provided valuable info. It would be nice to get a reply from someone with experience of the same problem, but that is not the case for now.

As you may have suspected, it seems the problem is not in the OpenCV part, but in your system’s GPU workload. I believe any tool for visualizing GPU load will help. Nvidia has many tools for that, but each of them requires learning how to use it.
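
If you prefer checking it from code rather than from a GUI tool, the NVML library that ships with the Nvidia driver can be polled for GPU utilization and clock speed. A rough sketch (untested; on Windows you link against nvml.lib from the CUDA toolkit):

#include <nvml.h>
#include <chrono>
#include <iostream>
#include <thread>

int main()
{
    // Initialize NVML and take the first GPU in the system
    if (nvmlInit() != NVML_SUCCESS)
        return 1;
    nvmlDevice_t device;
    nvmlDeviceGetHandleByIndex(0, &device);

    // Poll utilization and SM clock once per second for a minute;
    // a dropping clock while utilization stays high points to throttling
    for (int i = 0; i < 60; ++i)
    {
        nvmlUtilization_t util;
        unsigned int smClockMHz = 0;
        nvmlDeviceGetUtilizationRates(device, &util);
        nvmlDeviceGetClockInfo(device, NVML_CLOCK_SM, &smClockMHz);
        std::cout << "GPU util: " << util.gpu << " %, SM clock: " << smClockMHz << " MHz" << std::endl;
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }

    nvmlShutdown();
    return 0;
}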

Thanks for the guidance! I am now profiling my program with the Nvidia tool you advised. Based on my current tests, I believe the problem is related to the power management settings of the operating system.
