Why is net.forward() slow for YOLOv8 when used in C++?

stkaskin · May 25, 2023, 7:00am

I trained a YOLOv8 model in Python and exported it to ONNX format. Since I don’t have a graphics card, I am testing it on the CPU. The C++ side is built with Visual Studio 2019; the Python side runs in VS Code. With the C++ code below, net.forward takes about 2570 ms to process a 1280x720 pixel image. When I process the same image in Python using model.predict, it takes between 430 and 570 ms.

Where could the problem be?
Python, ~450 ms:
model.predict('test.jpg')

C++, ~2500 ms:

#include "yolov8.h"

using namespace std;
using namespace cv;
using namespace cv::dnn;

bool Yolov8::ReadModel(Net& net, string& netPath, bool isCuda = false) {
	try {
		net = readNet(netPath);
#if CV_VERSION_MAJOR == 4 && CV_VERSION_MINOR == 7 && CV_VERSION_REVISION == 0
		net.enableWinograd(false);  // Works around an OpenCV 4.7.x bug on AVX-only platforms; see https://github.com/opencv/opencv/pull/23112 and https://github.com/opencv/opencv/issues/23080
		//net.enableWinograd(true);  // If your CPU supports AVX2, you can set this to true to speed things up
#endif
	}
	catch (const std::exception&) {
		return false;
	}

	if (isCuda) {
		//cuda
		net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
		net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA); //or DNN_TARGET_CUDA_FP16
	}
	else {
		//cpu
		cout << "Inference device: CPU" << endl;
		net.setPreferableBackend(cv::dnn::DNN_BACKEND_DEFAULT);
		net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
	}
	return true;
}


bool Yolov8::Detect(Mat& srcImg, Net& net, vector<OutputSeg>& output,bool one_multi) {
	Mat blob;
	output.clear();
	int col = srcImg.cols;
	int row = srcImg.rows;
	Mat netInputImg;
	Vec4d params;
	try
	{
		LetterBox(srcImg, netInputImg, params, cv::Size(_netWidth, _netHeight));

	}
	catch (const std::exception&)
	{
		// LetterBox threw; there is no valid network input, so give up on this image.
		return false;
	}
	blobFromImage(netInputImg, blob, 1 / 255.0, cv::Size(_netWidth, _netHeight), cv::Scalar(0, 0, 0), true, false);
	net.setInput(blob);
	std::vector<cv::Mat> net_output_img;

	net.forward(net_output_img, net.getUnconnectedOutLayersNames()); //get outputs
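
A single timing of the forward call can be noisy, and a first call may include one-time setup work. As a rough sketch only (the helper name `MeanMs` and its `warmup` and `runs` parameters are hypothetical, not part of OpenCV), the call could be timed by discarding a warm-up run and averaging several timed runs:

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not an OpenCV API): calls fn `warmup` times without
// timing, then returns the mean wall-clock time in milliseconds over `runs`
// timed calls. Returns 0.0 when runs == 0.
template <typename Fn>
double MeanMs(Fn&& fn, std::size_t warmup = 1, std::size_t runs = 10) {
    // Untimed warm-up: a first call may include one-time setup cost.
    for (std::size_t i = 0; i < warmup; ++i) fn();

    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = Clock::now();
    for (std::size_t i = 0; i < runs; ++i) fn();
    const auto stop = Clock::now();

    const double total =
        std::chrono::duration<double, std::milli>(stop - start).count();
    return runs == 0 ? 0.0 : total / static_cast<double>(runs);
}
```

With the variables from the code above, it could be called as `MeanMs([&] { net.forward(net_output_img, net.getUnconnectedOutLayersNames()); })`.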

stkaskin · May 25, 2023, 8:56am

The issue was that I was running the program in debug mode. After switching to release mode, the problem was resolved.

:(chatgpt-3)
The problem occurred because a debug build carries extra overhead compared with a release build. In debug mode the compiler's optimizations are typically turned off, and additional checks, logging, and other debugging aids are enabled, all of which slow down execution.

A release build turns optimizations on and drops those extra checks, which is why it is the configuration normally used for deployment, and why switching from debug to release resolved the slow net.forward call.
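
To avoid benchmarking a debug build by accident, the configuration can be checked at compile time. The following is a minimal sketch, assuming the usual conventions that MSVC defines `_DEBUG` when a debug runtime is selected and that release configurations define `NDEBUG`; the function names `IsDebugBuild` and `BuildModeNotice` are hypothetical. It only reports how the calling code was compiled, not which OpenCV binaries were linked:

```cpp
#include <string>

// Reports whether this translation unit was compiled as a debug build.
// Assumption: _DEBUG is defined by MSVC for debug runtimes (/MDd, /MTd),
// and NDEBUG is defined by release configurations.
inline bool IsDebugBuild() {
#if defined(_DEBUG) || !defined(NDEBUG)
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: returns a one-line notice to print before timing.
inline std::string BuildModeNotice() {
    return IsDebugBuild()
        ? "Debug build: timings are not representative, rebuild in Release."
        : "Release build: timings reflect optimized code.";
}
```

Printing `BuildModeNotice()` once at startup, next to the existing "Inference device: CPU" message, makes it obvious which configuration produced a given timing.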
