Yolov5 image classification in C++

Hi, I’ve exported a yolov5-cls model to ONNX and I would like to run inference on the OpenCV C++ side. I wrote this part, but the result is not correct. Could you guide me?

int inpWidth = 224;
int inpHeight = 224;
std::string modelFilepath{ "model.onnx" };
std::string imageFilepath{ "18836.jpg" };

cv::Mat image = cv::imread(imageFilepath, cv::ImreadModes::IMREAD_COLOR);

cv::resize(image, image, cv::Size(inpWidth, inpHeight));

cv::Mat blob;
cv::Scalar mean{ 0.4151, 0.3771, 0.4568 };
cv::Scalar std{ 0.2011, 0.2108, 0.1896 };
bool swapRB = false;
bool crop = false;
cv::dnn::blobFromImage(image, blob, 1.0, cv::Size(inpWidth, inpHeight), mean, swapRB, crop);
if (std.val[0] != 0.0 && std.val[1] != 0.0 && std.val[2] != 0.0) {
    cv::divide(blob, std, blob);
}

cv::dnn::Net net = cv::dnn::readNetFromONNX(modelFilepath);
net.setInput(blob);
cv::Mat prob = net.forward();
std::cout << prob << std::endl;

cv::Mat probReshaped = prob.reshape(1, prob.total() * prob.channels());
std::vector<float> probVec =
    probReshaped.isContinuous() ? probReshaped : probReshaped.clone();
std::vector<float> probNormalized = sigmoid_(probVec);

cv::Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;
std::cout << " ID " << classId << " - " << " confidence "
    << confidence << std::endl;

std::vector<float> res;
float sum = 0.0f;
float t;
for (int i = 0; i < probVec.size(); i++) {
    auto sec = probVec[i];
    if (sec > 0) {
        t = expf(sec);
        res.push_back(t);
        sum += t;
    }
}
for (int i = 0; i < res.size(); i++) {
    res[i] /= sum;
}
const int topk = std::min(5, (int)res.size());

for (size_t i = 0; i < topk; ++i) {
    const auto& conf = res[i];
    std::cout << " ID " << i << " - " << " confidence "
        << conf << std::endl;
}

Thanks

so, you’re basically trying to port this to dnn / c++ ?

(didn’t even know they had classification models now …)

and, btw, your mean/std values are in [0…1], while the img is [0…255]

if i print out the im tensor from predict.py (for bus.jpg), i get:


im tensor([[[ 0.17681,  0.07406,  2.00916,  ...,  0.99880,  1.01593,  0.99880],
         [-1.12467,  1.59817,  0.51931,  ...,  1.01593,  1.03305,  1.03305],
         [ 1.42692,  1.05018, -0.18281,  ...,  0.99880,  1.03305,  1.03305],
         ...,
         [ 2.04341,  2.09479,  1.25567,  ..., -0.52530, -0.37118, -0.81642],
         [ 1.75229,  2.11191,  1.73517,  ..., -1.14179, -0.54243, -0.97055],
         [ 1.71804,  1.44404,  1.76942,  ..., -0.86780, -0.35405,  0.60493]],

look at how predict.py preprocesses its input; imo, you need to better adapt your preprocessing
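a minimal sketch of what i mean, assuming the model wants ((pixel/255) - mean) / std per channel, with your mean/std kept in the [0…1] range (swapRB=true is my assumption, since pytorch models are usually trained on RGB input):

cv::Mat blob;
// scale pixels to [0..1]; mean/std are applied by hand below, because
// blobFromImage only has a single scalefactor and cannot do a per-channel std
cv::dnn::blobFromImage(image, blob, 1.0 / 255.0, cv::Size(inpWidth, inpHeight),
                       cv::Scalar(), /*swapRB=*/true, /*crop=*/false);
for (int c = 0; c < 3; c++) {
    // wrap one HxW channel plane of the NCHW blob (no copy), then normalize in place
    cv::Mat ch(inpHeight, inpWidth, CV_32F, blob.ptr<float>(0, c));
    ch -= mean[c];
    ch /= std[c];
}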

also, these are the first 5 results, not the top 5, which would require some sorting
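untested sketch (topkSoftmax is just a name i made up): softmax over all the logits (no filtering), then sort the indices by probability:

#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// returns the class ids of the k highest probabilities; probs receives the full softmax
std::vector<int> topkSoftmax(const std::vector<float>& logits, std::vector<float>& probs, int k)
{
    // subtract the max logit only for numerical stability
    float maxLogit = *std::max_element(logits.begin(), logits.end());
    probs.resize(logits.size());
    float sum = 0.f;
    for (size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - maxLogit);
        sum += probs[i];
    }
    for (auto& p : probs) p /= sum;

    // sort the indices (not the values) so the class ids are preserved
    std::vector<int> idx(logits.size());
    std::iota(idx.begin(), idx.end(), 0);
    k = std::min<int>(k, (int)idx.size());
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                      [&](int a, int b) { return probs[a] > probs[b]; });
    idx.resize(k);
    return idx;
}

and then, e.g. with the probVec from your snippet:

std::vector<float> probs;
for (int id : topkSoftmax(probVec, probs, 5))
    std::cout << " ID " << id << " - confidence " << probs[id] << std::endl;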