YuNet face detection is returning face rect with extremely large values

gsalsero · March 15, 2023, 8:28pm

I am attempting to detect a face in an nv12 image that contains only a single face. I couldn’t attach the nv12 data file here, so instead I have attached the corresponding png file.

Normally it works fine and returns a single face rect in the correct location. Occasionally, however, the detector returns a face rect with extremely large values, and sometimes I get 2 faces. Here is an excerpt of the log I get:

Face 0, top-left coordinates: (-5.17639e+26, -inf), box width: 0, box height: inf, score: 1.00
Face 1, top-left coordinates: (137.535, 44.9726), box width: 31.8052, box height: 41.9043, score: 1.00

OpenCV was built at:

86fa0308fc (HEAD → 4.x, origin/HEAD, origin/4.x) Merge pull request #23139 from AleksandrPanov:add_py_charuco_sample

using Visual Studio 2019 with the following command

cmake '-GVisual Studio 16 2019' -D BUILD_SHARED_LIBS=OFF -D BUILD_WITH_STATIC_CRT=OFF -D 'CMAKE_CXX_FLAGS_RELEASE= /MD ' -D 'CMAKE_CXX_FLAGS_DEBUG= /MDd ' -D WITH_IPP=ON -D WITH_MKL=ON -DBUILD_PERF_TESTS:BOOL=OFF -DBUILD_TESTS:BOOL=OFF -DBUILD_DOCS:BOOL=OFF -DWITH_CUDA:BOOL=OFF -DBUILD_EXAMPLES:BOOL=OFF -DINSTALL_CREATE_DISTRIB=ON -DOPENCV_EXTRA_MODULES_PATH=/c/lib2/opencv_contrib/modules -DCMAKE_INSTALL_PREFIX=/c/lib2/install/opencv /c/lib2/opencv

Is there anything wrong with my code?
Here is the C++ code I used to reproduce this:

#include "gtest/gtest.h"
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/core/types.hpp>
#include <opencv2/objdetect/objdetect.hpp>

#include <filesystem>
#include <fstream>

using namespace std;
namespace fs = std::filesystem;

int YU_NET_INPUT_SIZE = 320;

cv::Ptr<cv::FaceDetectorYN> createFaceDetector()
{
    // Filter out faces of score < score_threshold
    float scoreThreshold = 0.9;
    // Suppress bounding boxes of iou >= nms_threshold
    float nmsThreshold = 0.3;
    // Keep top_k bounding boxes before NMS
    int topK = 5000;

    auto modelPath = fs::path("opencv-zoo") / "models" / "face_detection_yunet" / "face_detection_yunet_2022mar.onnx";

    // Initialize FaceDetectorYN
    auto faceDetectorYN = cv::FaceDetectorYN::create(
        modelPath.string(),
        "",
        cv::Size(YU_NET_INPUT_SIZE, YU_NET_INPUT_SIZE),
        scoreThreshold,
        nmsThreshold,
        topK
    );

    return faceDetectorYN;
}

vector<unsigned char> loadNv12Data()
{
    ifstream input("1920x720.nv12", std::ios::binary);

    vector<unsigned char> buffer(std::istreambuf_iterator<char>(input), {});

    return buffer;
}

bool test()
{
    auto faceDetectorYN = createFaceDetector();
    auto dataVector = loadNv12Data();

    int frameWidth = 1920;
    int frameHeight = 720;

    // find the face(s)
    cv::Mat picNV12 = cv::Mat(frameHeight * 3 / 2, frameWidth, CV_8UC1, dataVector.data());

    cv::Mat picBgr;
    cv::cvtColor(picNV12, picBgr, cv::COLOR_YUV2BGR_NV12);

    // Scale factor used to resize input video frames
    // optimal size for YuNet is 320 x 320, scale the image
    float scale = YU_NET_INPUT_SIZE / (float)frameWidth;

    int imageWidth = int(picBgr.cols * scale);
    int imageHeight = int(picBgr.rows * scale);
    cv::Mat smallImg;
    cv::resize(picBgr, smallImg, cv::Size(imageWidth, imageHeight));
    // Set input size before inference
    faceDetectorYN->setInputSize(smallImg.size());
    cv::Mat faces;
    faceDetectorYN->detect(smallImg, faces);

    bool badScore = false;
    for (int i = 0; i < faces.rows; i++)
    {
        float score = faces.at<float>(i, 4);
        badScore = score < 0 ? true : badScore;


        cout << "Face " << i
            << ", top-left coordinates: (" << faces.at<float>(i, 0) << ", " << faces.at<float>(i, 1) << "), "
            << "box width: " << faces.at<float>(i, 2) << ", box height: " << faces.at<float>(i, 3) << ", "
            << "score: " << cv::format("%.2f", faces.at<float>(i, 14))
            << endl;
    }

    return badScore;
}


TEST(deleteme, test)
{
    for (int i = 0; i < 100; ++i)
    {
       bool badScore = test();
       ASSERT_FALSE(badScore) << "Failed i = " << i << endl;
    }
}
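
A side note on the reproduction above: `loadNv12Data()` does not check that the file was opened or that it holds a full frame, and the resulting buffer is wrapped in a `cv::Mat` without a size check. The sketch below shows one way to validate the buffer first. It is not claimed to be the cause of the bad rects; the helper names `nv12FrameBytes` and `isValidNv12Buffer` are hypothetical and not part of OpenCV.

```cpp
#include <cstddef>
#include <vector>

// Bytes in one NV12 frame: a full-resolution Y plane (width * height)
// followed by a half-height interleaved UV plane (width * height / 2).
constexpr std::size_t nv12FrameBytes(int width, int height)
{
    return static_cast<std::size_t>(width) * static_cast<std::size_t>(height) * 3 / 2;
}

// Returns true only if the buffer holds exactly one NV12 frame of the given size.
// NV12 subsamples chroma by 2 in both directions, so both dimensions must be even.
bool isValidNv12Buffer(const std::vector<unsigned char>& buffer, int width, int height)
{
    if (width <= 0 || height <= 0 || width % 2 != 0 || height % 2 != 0)
    {
        return false;
    }
    return buffer.size() == nv12FrameBytes(width, height);
}
```

In `test()`, this could be called as `ASSERT_TRUE(isValidNv12Buffer(dataVector, frameWidth, frameHeight));` before the `cv::Mat` is constructed, so that a missing or truncated `1920x720.nv12` fails the test immediately instead of feeding undefined data to the detector.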

berak · March 17, 2023, 3:18pm

faces.at<float>(i, 4) is righteye.x, not the ‘score’, which is at index 14:

https://github.com/opencv/opencv/blob/752ac19a2f6c70b4230ecd5e08f9fc0b6ff775e6/modules/objdetect/src/face_detect.cpp#L209

          float clsScore = conf_v[i*2+1];
          float iouScore = iou_v[i];
          // Clamp
          if (iouScore < 0.f) {
              iouScore = 0.f;
          }
          else if (iouScore > 1.f) {
              iouScore = 1.f;
          }
          float score = std::sqrt(clsScore * iouScore);
          face.at<float>(0, 14) = score;
          
          // Get bounding box
          float cx = (priors[i].x + loc_v[i*14+0] * variance[0] * priors[i].width)  * inputW;
          float cy = (priors[i].y + loc_v[i*14+1] * variance[0] * priors[i].height) * inputH;
          float w  = priors[i].width  * exp(loc_v[i*14+2] * variance[0]) * inputW;
          float h  = priors[i].height * exp(loc_v[i*14+3] * variance[1]) * inputH;
          float x1 = cx - w / 2;
          float y1 = cy - h / 2;
          face.at<float>(0, 0) = x1;
          face.at<float>(0, 1) = y1;

apart from that, please tell us about the image. where does the red border come from? and the ‘bleeding’ on the right side?

wouldn’t it make sense to crop away the border before resizing it?
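
For reference, the row layout mentioned above can be written down as named column indices, which makes it harder to mix up index 4 and index 14. This is a minimal sketch based on the documented output of `FaceDetectorYN::detect` (15 floats per face); the enum, the `FaceBox` struct and `readFaceBox` are hypothetical names, not OpenCV API. It takes a raw float pointer so that it compiles without OpenCV.

```cpp
#include <cstddef>

// Column indices of one detection row (15 floats per face).
enum FaceColumn : std::size_t
{
    kBoxX = 0,        kBoxY = 1,         // top-left corner of the bounding box
    kBoxWidth = 2,    kBoxHeight = 3,    // size of the bounding box
    kRightEyeX = 4,   kRightEyeY = 5,
    kLeftEyeX = 6,    kLeftEyeY = 7,
    kNoseX = 8,       kNoseY = 9,
    kMouthRightX = 10, kMouthRightY = 11,
    kMouthLeftX = 12,  kMouthLeftY = 13,
    kScore = 14,                         // face score
    kFaceColumnCount = 15
};

struct FaceBox
{
    float x;
    float y;
    float width;
    float height;
    float score;
};

// Reads the bounding box and the score from one row.
// `row` must point to at least kFaceColumnCount floats.
FaceBox readFaceBox(const float* row)
{
    return FaceBox{row[kBoxX], row[kBoxY], row[kBoxWidth], row[kBoxHeight], row[kScore]};
}
```

With the `faces` matrix from the question, `readFaceBox(faces.ptr<float>(i))` should return the box and score for face `i`, and the score check in the loop would then read `.score` rather than column 4.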

gsalsero · March 17, 2023, 4:17pm

It’s an image I created in GIMP for unit testing a class that uses OpenCV. The person is AI-generated and I originally placed it on a black background. Later I used GIMP’s “Bucket Fill Tool” to change the black background to red and it bled into the “foreground”.

It’s not representative of the images our product receives. We get NV12 data from a third party and depending on the original aspect ratio, it might be letterboxed. We see both vertical and horizontal letterboxing.

This specific image was not intended to simulate letterboxing. I just wanted to test face detection. We do have other unit tests that test our ability to crop out letterboxing. One of them uses this image centered on a black 1920x1080 image. Under normal conditions, our code does crop out the black letterboxing prior to using face detection.

I’ll create a different test image to see if it helps.

gsalsero · March 17, 2023, 6:59pm

I tried it with this image with similar results.

I also used a beach background, a city background, and a jungle background, all with similar results. I can upload those images if you like.

gsalsero · March 27, 2023, 5:17pm

Could this be a bug in OpenCV?
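
Whatever the root cause turns out to be, rows like "Face 0" in the log above can be screened out on the caller's side. The following is a minimal sketch of such a check, assuming each result row holds the 15 floats documented for `FaceDetectorYN::detect`; the name `isPlausibleFaceRow` and the constant `kColumnsPerFace` are hypothetical, not part of OpenCV. It is a defensive workaround, not a fix for the underlying problem.

```cpp
#include <cmath>
#include <cstddef>

// Number of floats per detection row (box, five landmarks, score).
constexpr std::size_t kColumnsPerFace = 15;

// Returns false if any value in the row is NaN or infinite,
// or if the box width (column 2) or height (column 3) is not positive.
// `row` must point to at least kColumnsPerFace floats.
bool isPlausibleFaceRow(const float* row)
{
    for (std::size_t c = 0; c < kColumnsPerFace; ++c)
    {
        if (!std::isfinite(row[c]))
        {
            return false;
        }
    }
    return row[2] > 0.f && row[3] > 0.f;
}
```

In the loop from the question, `if (!isPlausibleFaceRow(faces.ptr<float>(i))) continue;` would skip such rows. A very large but finite value such as -5.17639e+26 passes `std::isfinite`, so an additional bounds check against the input image size may be worth adding.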
