OpenCV Classes with YoloV3

Hello there,

I developed a program to run YoloV3.
This code allows me to select some classes from coco.names (80 classes) by writing a bit value in a config file for each desired class.
Also, the code works with 4 cameras in sequential mode (one at a time) to preserve the resources of my embedded computer.
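
For illustration only, here is a minimal sketch of how such bit values could be turned into a set of selected class ids. The config format is an assumption (one character per COCO class, '1' for selected and '0' for ignored), and `selectedClassIds` is a hypothetical helper name, not something from my actual program.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical config format (an assumption): one character per COCO class,
// '1' = detect this class, '0' = ignore it.
std::set<int> selectedClassIds(const std::string& bits, std::size_t classCount)
{
    std::set<int> ids;
    // Characters beyond classCount are ignored.
    for (std::size_t i = 0; i < bits.size() && i < classCount; ++i)
    {
        assert(bits[i] == '0' || bits[i] == '1'); // anything else is a malformed line
        if (bits[i] == '1')
            ids.insert(static_cast<int>(i));
    }
    return ids;
}
```

With one such line per camera, each camera would end up with its own set of class ids.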

My main problem is that I don’t know how to write C++ classes, which I would like to use to avoid repeating my code 4 times.

Would it be possible for one of you to help me build these classes?

Thanks for your help and support.

Regards,

can you explain what “select”, “build” or “generate” classes means here?

Hi Berak,

Thanks for your quick reply.
I am self-taught in C++, but I don’t know how to write classes based on my code.
Can I send you my code by mail?
It’s a long code… This is why I need to build classes to reduce it.

For example, I have 4 cameras, and the system will work as follows:

The user selects the desired classes (Human, car,…) for each camera.
Once that’s done, the code runs each camera (one by one) and checks whether there is a detection (Yolo). If yes, it writes an image to a directory; if no, it goes on to the next camera.

To do this, I repeat the code 4 times, but I suppose I could do better with C++ classes to automate this process.
I am asking because you, as developers, may be able to help me build these C++ classes.

Regards,

so, it’s about code organization.

how exactly are you trying to restrict the detection to some desired classes ?
(we probably need to see some code, now)

Restricting the classes is easy.

I set up one string per target (Yolo class):

string target11 = ""; // H - human
string target12 = ""; // V - car
string target13 = ""; // M - motorbike
string target14 = ""; // A - plane
string target15 = ""; // C - truck
string target16 = ""; // B - boat

Once that was done, I modified the postprocess function like this:

void postprocess1(Mat& frame1, const vector<Mat>& outs1)
{
    vector<int> classIds1;
    vector<float> confidences1;
    vector<Rect> boxes1;
    //string name;

    for(size_t i = 0; i < outs1.size(); ++i)
    {
        float* data1 = (float*)outs1[i].data;

        for(int j = 0; j < outs1[i].rows; ++j, data1 += outs1[i].cols)
        {
            Mat scores1 = outs1[i].row(j).colRange(5, outs1[i].cols);
            Point classIdPoint1;
            double confidence1;

            minMaxLoc(scores1, 0, &confidence1, 0, &classIdPoint1);

            //--detect only defined labels--//
            //------------------------------//
            if((classes[classIdPoint1.x] == target11) ||(classes[classIdPoint1.x] == target12) ||(classes[classIdPoint1.x] == target13)
            		||(classes[classIdPoint1.x] == target14) ||(classes[classIdPoint1.x] == target15) ||(classes[classIdPoint1.x] == target16))
            {
            	//--if conf. threshold > threshold set--//
            	//--------------------------------------//
            	if(confidence1 > confThreshold)
            	{
            		//--then draw bounding boxes--//
            		//----------------------------//
            		int centerX = (int)(data1[0] * frame1.cols);
            		int centerY = (int)(data1[1] * frame1.rows);
            		int width = (int)(data1[2] * frame1.cols);
            		int height = (int)(data1[3] * frame1.rows);
            		int left = centerX - width / 2;
            		int top = centerY - height / 2;

            		//--save class ids and correspondent confidences and boxes--//
            		//----------------------------------------------------------//
            		classIds1.push_back(classIdPoint1.x);
            		confidences1.push_back((float)confidence1);
            		boxes1.push_back(Rect(left, top, width, height));
            	}
            }
        }
    }

    vector<int> indices1;
    NMSBoxes(boxes1, confidences1, confThreshold, nmsThreshold, indices1);

    //--detect if a classification exists--//
    //-------------------------------------//
    AI1 = !indices1.empty();

    //--draw every detection that survived NMS--//
    //------------------------------------------//
    for(size_t i = 0; i < indices1.size(); ++i)
    {
        int idx = indices1[i];
        Rect box1 = boxes1[idx];
        drawPred1(classIds1[idx], confidences1[idx], box1.x, box1.y, box1.x + box1.width, box1.y + box1.height, frame1);
    }
}

The class restriction is done in the “detect only defined labels” block:

 if((classes[classIdPoint1.x] == target11) ||(classes[classIdPoint1.x] == target12) ||(classes[classIdPoint1.x] == target13)
            		||(classes[classIdPoint1.x] == target14) ||(classes[classIdPoint1.x] == target15) ||(classes[classIdPoint1.x] == target16))

And that’s it.

But the real problem is that I have 4 cameras.
So, “TARGET 11-16” represents the 6 classes for the 1st camera.
The other cameras use the labels “TARGET 2X” for camera 2, “TARGET 3X” for camera 3,…
One of my problems is making the postprocess function use the right set of targets depending on which camera is running.

I hope you understand the problem.

After that, I have another function, “drawPred”.
I modified it to show:
- a bounding box drawn as a black rectangle with a white one just inside it, so the selection stays visible whatever the image brightness and/or contrast;
- a circle at the upper-left corner of the bounding box, in one of 3 colors:
  - green (confidence > 66%)
  - orange (33% < confidence < 66%)
  - red (confidence < 33%)
- a letter inside each circle that identifies the class (Human, car,…).
This is what I did to achieve it:

void drawPred1(int classId1, float conf, int left, int top, int right, int bottom, Mat& frame1)
{
	int pixel = 1;

    rectangle(frame1, Point(left, top), Point(right, bottom), noir, pixel);
    rectangle(frame1, Point(left+pixel, top+pixel), Point(right-pixel, bottom-pixel), blanc, pixel);

	string label = format("%.1f", conf*100);
	string label1;

    if (!classes.empty())
    {
        CV_Assert(classId1 < (int)classes.size());

        label1 = classes[classId1];

        //--show label and conf. threshold detected--//
        //-------------------------------------------//
        cout << label1 << ": " << label << "%" << endl;
    }

    //--display the label at the top of the bounding box--//
    //----------------------------------------------------//
    int baseLine1;
    Size labelSize1 = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine1);
    top = max(top, labelSize1.height);

    int a = atoi(label.c_str());
    int radius = 10; // define radius of classification circle

    //--detect low confidence threshold--//
    //-----------------------------------//
    if((a > confThreshold) && (a <= th1))
    {
    	circle(frame1, Point(left, top), radius, rouge, FILLED);

    	if(label1 == target11)
    	{
    		putText(frame1, "H", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target12)
    	{
    		putText(frame1, "V", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target13)
    	{
    		putText(frame1, "M", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target14)
    	{
    		putText(frame1, "A", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target15)
    	{
    		putText(frame1, "C", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target16)
    	{
    		putText(frame1, "B", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    }

    //--detect medium confidence threshold--//
    //--------------------------------------//
    if((a > th1) && (a <= th2))
    {
    	circle(frame1, Point(left, top), radius, orange, FILLED);

    	if(label1 == target11)
    	{
    		putText(frame1, "H", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target12)
    	{
    		putText(frame1, "V", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target13)
    	{
    		putText(frame1, "M", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target14)
    	{
    		putText(frame1, "A", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target15)
    	{
    		putText(frame1, "C", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target16)
    	{
    		putText(frame1, "B", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    }

    //--detect high confidence threshold--//
    //------------------------------------//
    if((a > th2) && (a <= 100))
    {
    	circle(frame1, Point(left, top), radius, vert, FILLED);

    	if(label1 == target11)
    	{
    		putText(frame1, "H", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target12)
    	{
    		putText(frame1, "V", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target13)
    	{
    		putText(frame1, "M", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target14)
    	{
    		putText(frame1, "A", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target15)
    	{
    		putText(frame1, "C", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    	if(label1 == target16)
    	{
    		putText(frame1, "B", Point(left-(radius/2), top+(radius/2)), 1, 1, noir, 1);
    	}
    }
}

The problem remains the same: I don’t know how to build classes to reduce the source code.
Ideally, I would like to have one class containing the 2 functions I showed and, depending on the ID of the camera (between 0 and 3), give it the targets dedicated to that camera.

object oriented programming is beside the point here.

you need to learn about data structures and algorithms, specifically about arrays/vectors, sets, “hash maps”.

your task can be approached in many different ways. I’ll present one.

your long if((classes[classIdPoint1.x] == target11) ||(classes[classIdPoint1.x] == ... can be expressed as checking whether a value is in a set.
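
A minimal sketch of that set check, using only the standard library so it stays independent of OpenCV. `Candidate` and `keepTargets` are hypothetical names standing in for the per-row filtering inside postprocess1. NMS and drawing are left out. The point is that the target set is a parameter, so one function can serve all four cameras.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one row of YOLO output after minMaxLoc:
// the best class id and its confidence (0..1).
struct Candidate
{
    int classId;
    float confidence;
};

// Keep only candidates whose class name is in `targets` and whose
// confidence exceeds `confThreshold`.
std::vector<Candidate> keepTargets(const std::vector<Candidate>& candidates,
                                   const std::vector<std::string>& classes,
                                   const std::set<std::string>& targets,
                                   float confThreshold)
{
    std::vector<Candidate> kept;
    for (const Candidate& c : candidates)
    {
        assert(c.classId >= 0 && c.classId < static_cast<int>(classes.size()));
        // One set lookup replaces the chain of == comparisons.
        if (targets.count(classes[c.classId]) != 0 && c.confidence > confThreshold)
            kept.push_back(c);
    }
    return kept;
}
```

Each camera could then own one such set, for example in a vector indexed by camera id, and the call site would pass the set of the camera that is currently running.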

your if(label1 == target11) { putText...} if(label == target12)... can be turned into a loop, or into an operation on a map
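
For the drawing side, a rough sketch of how the three near-identical blocks in drawPred1 might collapse. `Tier`, `tierFor` and `letterFor` are hypothetical names. `th1` and `th2` are assumed to be 33 and 66, following the description of the colors. Detections below the confidence threshold never reach drawPred1, because postprocess1 already drops them, so no lower bound is checked here.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class Tier { Low, Medium, High };

// Map a confidence in percent to a tier (red / orange / green in drawPred1).
Tier tierFor(int percent, int th1, int th2)
{
    assert(th1 <= th2);
    if (percent <= th1) return Tier::Low;
    if (percent <= th2) return Tier::Medium;
    return Tier::High;
}

// Look up the one-letter marker for a class name; returns an empty string
// if the class is not a target. Replaces the six if-blocks per tier.
std::string letterFor(const std::map<std::string, std::string>& letters,
                      const std::string& className)
{
    auto it = letters.find(className);
    return it == letters.end() ? std::string() : it->second;
}
```

drawPred1 would then pick the circle color from the tier, draw one circle, and call putText once with the looked-up letter.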

#include <iostream>
#include <string>
#include <vector>
#include <set>
#include <map>
#include <algorithm> // std::find
#include <iterator>  // std::distance

using namespace std;

// COCO: 80 classes, in this order
std::vector<std::string> classes {
	"person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "sofa", "pottedplant", "bed", "diningtable", "toilet", "tvmonitor", "laptop", "mouse", "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"
};

// your data
std::map<std::string, std::string> targets {
	{"person",    "H"}, // human/person
	{"car",       "V"},
	{"motorbike", "M"},
	{"aeroplane", "A"}, // plane/aeroplane
	{"truck",     "C"},
	{"boat",      "B"},
};

// filled below
std::map<int, std::string> id_to_shortname;

int main(void)
{
	// build the map from class id to short name

	for (const auto& [longname, shortname] : targets) // C++17 syntax
	{
		// find `longname` in `classes` (get an iterator)
		auto iterator = std::find(classes.begin(), classes.end(), longname);

		if (iterator == classes.end())
		{
			cout << "ERROR: can't find class '" << longname << "' in classes list!" << endl;
			continue;
		}

		// calculate index into classes vector
		int index = static_cast<int>(std::distance(classes.begin(), iterator));

		// fill the id -> short name map
		id_to_shortname[index] = shortname;

		// note: std::map is ordered by key (it is not a hash map), so this loop
		// visits the targets in alphabetical order of `longname`
		cout << "[" << index << "] " << shortname << " : " << longname << endl;
	}

	cout << endl;

	// test class by id (number)
	// it's simpler than going id (int) -> name (string) -> short name (string)

	std::vector<int> some_class_ids { 0, 1, 2, 3, 4, 5, 6, 7 };

	for (const auto class_id : some_class_ids)
	{
		// we could check using id_to_shortname.contains(class_id), but only from c++20

		auto iterator = id_to_shortname.find(class_id);

		if (iterator == id_to_shortname.end())
		{
			cout << "// ID " << class_id << " (" << classes[class_id] << ") not among targets" << endl;
		}
		else
		{

			// maps support access using []
			//const auto& shortname = id_to_shortname[class_id];

			// or we can use the iterator from above:
			// iterator "points to" a "pair" containing the key (first) and value (second)
			//const auto [id, shortname] = *iterator;

			// also possible:
			auto& shortname = iterator->second;

			cout << "putText: " << shortname << ", ... // " << classes[class_id] << endl;
		}
	}

	return 0;
}
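
If a per-camera object is still wanted, a thin wrapper around such an id-to-letter map might look like the sketch below. `CameraTargets` and its members are hypothetical names, not part of OpenCV or of the original code. Class ids are indices into coco.names (person = 0, car = 2 in the list above), and the 0..3 range for camera ids comes from the four cameras described in the question.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical class: one instance per camera, holding that camera's
// targets as a map from class id to marker letter.
class CameraTargets
{
public:
    CameraTargets(int cameraId, std::map<int, std::string> letters)
        : cameraId_(cameraId), letters_(std::move(letters))
    {
        assert(cameraId >= 0 && cameraId <= 3); // 4 cameras, ids 0..3
    }

    int cameraId() const { return cameraId_; }

    // True if detections of this class should be kept for this camera.
    bool accepts(int classId) const { return letters_.count(classId) != 0; }

    // Marker letter drawn in the circle; empty if the class is not a target.
    std::string letter(int classId) const
    {
        auto it = letters_.find(classId);
        return it == letters_.end() ? std::string() : it->second;
    }

private:
    int cameraId_;
    std::map<int, std::string> letters_;
};
```

A single postprocess and a single drawPred could then take a `const CameraTargets&` argument, or become member functions, and the main loop would walk over a vector of four such objects instead of four copies of the code.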

Hello Christoph,

Thanks a lot for your feedback and information.
Yes, I know arrays, vectors and maps.
I will test your solution.

Keep in mind that I would like to have a C++ class that lets me change the targets (Yolo labels) depending on the active camera (webcam ID).

So, once again, thanks for your time and solution.

Regards,