How do I train an OpenCV LBPHFaceRecognizer Model with a very large number of images?
I’m trying to create a model in OpenCV that classifies faces by sex and age bracket. The data set I have has about 50,000 images. The problem is that when I try to train a model on that amount of data, memory usage just keeps growing and the process runs out of memory after about 100 images. Even when I batch the training and release each loaded cv::Mat afterwards, the model itself just grows in size (which I guess is to be expected).
I tried loading and training one image at a time (the function below), and I also tried a variant that trains in batches (sketched after the main function), but both give the same result.
The ImageCollection class is essentially a multimap from SEXAGEBRACKET labels to image file paths. The model I’m using is LBPHFaceRecognizer.
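For context, the parts of ImageCollection that the training function relies on look roughly like this (a simplified sketch; the enumerator names are placeholders and the real class does more):

#include <cstddef>
#include <map>

// SEXAGEBRACKET is an enum of sex/age-bracket labels (enumerator names here are placeholders).
enum SEXAGEBRACKET : int { /* e.g. MALE_18_25 = 0, FEMALE_18_25, ... */ };

class ImageCollection
{
public:
    size_t GetNumberOfImages() const;                                      // roughly 50,000 in my case
    std::multimap<SEXAGEBRACKET, char*>* GetBracketToFileNameMap() const;  // label -> image file path
};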
Here is what my “load and train” function looks like:
void ModelGenerator::LoadImagesAndTrain(const ImageCollection* imageCollection, const cv::Ptr<cv::face::LBPHFaceRecognizer> modelToTrain, const bool isUpdatingModel)
{
    cv::Size loadedImageSize;
    cv::Mat currLoadedImage;
    std::vector<cv::Mat> imagesToUse;
    std::vector<int> labelsToUse;
    std::multimap<SEXAGEBRACKET, char*>* bracketToImageFileNameMap;

    if (imageCollection != NULL && modelToTrain != NULL)
    {
        std::cout << "Number of images provided for training:" << imageCollection->GetNumberOfImages() << std::endl;
        bracketToImageFileNameMap = imageCollection->GetBracketToFileNameMap();

        // Check if this is the first time we are training the model; all other calls will be an update instead.
        if (isUpdatingModel == false)
        {
            // Initial train() call with a single image so that update() can be used afterwards.
            currLoadedImage = cv::imread(bracketToImageFileNameMap->begin()->second, cv::IMREAD_GRAYSCALE);
            imagesToUse.push_back(currLoadedImage);
            labelsToUse.push_back(bracketToImageFileNameMap->begin()->first);
            modelToTrain->train(imagesToUse, labelsToUse);
            currLoadedImage.release();
        }

        imagesToUse.clear();
        labelsToUse.clear();

        // Load, train on, and release one image at a time.
        for (std::multimap<SEXAGEBRACKET, char*>::iterator it = bracketToImageFileNameMap->begin(); it != bracketToImageFileNameMap->end(); it++)
        {
            std::cout << "training model...." << std::endl;
            imagesToUse.clear();
            labelsToUse.clear();
            std::cout << it->first << ":" << it->second << std::endl;

            currLoadedImage = cv::imread(it->second, cv::IMREAD_GRAYSCALE);
            imagesToUse.push_back(currLoadedImage);
            labelsToUse.push_back(it->first);
            modelToTrain->update(imagesToUse, labelsToUse);
            currLoadedImage.release();
        }
    }
}
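The batched variant I mentioned replaces the per-image loop above with something roughly like this (just a sketch; batchSize is a placeholder value, and it uses the same bracketToImageFileNameMap and modelToTrain as the function above):

// Load up to batchSize images, call update() on the whole batch,
// then clear the vectors so the cv::Mat data can be released before the next batch.
const size_t batchSize = 500; // placeholder value
std::vector<cv::Mat> batchImages;
std::vector<int> batchLabels;
for (std::multimap<SEXAGEBRACKET, char*>::iterator it = bracketToImageFileNameMap->begin(); it != bracketToImageFileNameMap->end(); it++)
{
    batchImages.push_back(cv::imread(it->second, cv::IMREAD_GRAYSCALE));
    batchLabels.push_back(it->first);
    if (batchImages.size() >= batchSize)
    {
        modelToTrain->update(batchImages, batchLabels);
        batchImages.clear();
        batchLabels.clear();
    }
}
// Train on whatever is left over after the last full batch.
if (!batchImages.empty())
{
    modelToTrain->update(batchImages, batchLabels);
    batchImages.clear();
    batchLabels.clear();
}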
Is it possible to create a very large model (with something like 50,000 images) in OpenCV without needing something like 300 GB of RAM? Or would it be better to do this with a neural network instead? I’m open to suggestions if I’m going about this the wrong way.