Boost training issue: vector subscript out of range

I am having trouble training a Boosted Trees model for regression. I am using the code below to train the model, but I get a runtime error: “vector subscript out of range”.

Ptr<TrainData> trainingData = TrainData::create(dataset_opencv.trainData, ROW_SAMPLE, dataset_opencv.trainLabels);
Ptr<Boost> Mdl = Boost::create();
Mdl->setWeakCount(200);
Mdl->setBoostType(cv::ml::Boost::GENTLE);
Mdl->setUseSurrogates(false);
Mdl->setMaxCategories(200);
Mdl->setRegressionAccuracy(0.05);
Mdl->train(trainingData);

My data and responses are cv::Mats with exactly the same number of rows. The training data has 25 columns, the labels have 1 column, and both are CV_32F. Training a Random Forest model on the same data runs with no issue. Is there something I’m doing wrong in setting the parameters for the Boosted Trees model?

is that the exact and complete error message? if there’s more info, please post it; it’s useful.

perhaps this helps (I don’t understand a thing of it myself): “Vector subscript out of range. On a boosting procedure.”, Issue #8384 in opencv/opencv on GitHub

Yes, it appears I am having a similar issue to that post. However, I’m not sure how the issue was resolved there. Perhaps what would be helpful to know is how to appropriately set the variable type to numerical in cv::ml::TrainData()?

The exact error message I am getting is the following:

Debug Assertion Failed!
Program: …_RegressionEngine\References\OpenCV\x64\bin\opencv_ml412d.dll
File: C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.22.27905\include\vector
Line: 1465
Expression: vector subscript out of range

I see that you run OpenCV v4.1.2. you should update to latest or very close. chasing bugs in old versions is a waste of time. they might have been fixed a long time ago.

the assertion is fairly useless because it comes from within C++ STL and it doesn’t say where in user code (OpenCV) it happens.

if you use float labels/responses, it will try to perform regression; with integer ones, classification (that’s how it got “solved” in the issue mentioned above)

(unfortunately, if you want to do a regression, CV_32F is the correct type)

and btw, adding a few lines for synthetic data, like:

Mat d(100,25,CV_32F);
Mat l(100,1,CV_32F);
randu(d,0,1);
randu(l,0,1);
Ptr<TrainData> trainingData = TrainData::create(d, ROW_SAMPLE, l);

to your example i get:

OpenCV(4.5.2-dev) Error: Assertion failed (std::abs(w->ord_responses[si]) == 1) in updateWeightsAndTrim, file C:\p\opencv\modules\ml\src\boost.cpp, line 270

so, updating seems a good idea (at least, we’re on the same level, then…)

[edit]
changing ml::Boost::GENTLE to ml::Boost::DISCRETE fixed the error here.

You only get that error in debug builds because the Microsoft compiler only includes that check in debug builds. The problem still exists in a Release build.
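To illustrate the point about debug-only checks, here is a minimal sketch using only the standard library. The helper name `checked_get` is hypothetical and not part of any code in this thread. It relies on `std::vector::at()`, which performs its bounds check in every build configuration and throws `std::out_of_range`, whereas an out-of-range `operator[]` is undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the original code): copies v[i] into `out`
// and returns true, or returns false when i is out of range.
// vector::at() checks the index in debug AND release builds.
bool checked_get(const std::vector<float>& v, std::size_t i, float& out) {
    try {
        out = v.at(i);  // throws std::out_of_range if i >= v.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;   // the bad index is reported instead of silently corrupting memory
    }
}

// Small self-check showing the intended use.
void demo_checked_get() {
    std::vector<float> v{1.0f, 2.0f};
    float x = 0.0f;
    assert(checked_get(v, 1, x) && x == 2.0f);  // valid index
    assert(!checked_get(v, 5, x));              // out of range: detected in any build
}
```

Temporarily swapping `[]` for `at()` in your own code is one way to find where the bad index comes from when the debug assertion itself points only into the STL header.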

If world.rank() is zero, unsorted_dist_list will be an empty vector when you use it in unsorted_dist_list[j]. At a minimum, you should add:

unsorted_dist_list.resize(world.size());

before the for (size_t j = 0; loop.

If world.rank() is not zero, unsorted_dist_list will be empty when passed to mpi::scatter.
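The resize-before-indexing fix can be sketched without MPI as follows. This is only an illustration under assumptions: `build_dist_list`, the `int` element type, and the placeholder values are hypothetical, and `world_size` merely stands in for `world.size()`; no MPI call is made.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the loop on the root rank: one entry per process.
// world_size plays the role of world.size(); the element type is assumed.
std::vector<int> build_dist_list(std::size_t world_size) {
    std::vector<int> unsorted_dist_list;     // default-constructed: size() == 0

    // Without this line, unsorted_dist_list[j] below would index an empty vector.
    unsorted_dist_list.resize(world_size);   // indices 0 .. world_size-1 are now valid

    for (std::size_t j = 0; j < world_size; ++j) {
        unsorted_dist_list[j] = static_cast<int>(j) * 10;  // placeholder payload
    }

    assert(unsorted_dist_list.size() == world_size);
    return unsorted_dist_list;
}
```

Calling `build_dist_list(4)` returns a vector of four elements; the same sizing step is what the real code needs before its own loop.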
