Using an MLP to classify lists of numbers


Hello and thank you for taking the time,

This is my very first attempt to work with neural networks after reading some theory about them.
What I am trying to do is classify long lists of numbers (600+ numbers per list, typically consisting of the values 1-10). Each such list indicates a certain activity, so I have written a C++ program that uses OpenCV's ANN_MLP (from the ml module) to create an MLP network to get the job done.
To train the MLP, I label each row of my training-data matrix with the most prevalent value in that row (this is not necessarily a strong indicator of the activity the row represents, but it will do for now), standardize the matrix, and perform a PCA. The MLP consists of 3 layers (3, 5, and 1 perceptrons), and I use a sigmoid activation function.
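To be concrete, the labeling step is essentially this (a simplified sketch, not my exact code):

#include <opencv2/core.hpp>
#include <map>

float mostPrevalentValue(const cv::Mat &row)  // row is 1 x N, CV_32F
{
    // count occurrences of each value in the row
    std::map<int, int> counts;
    for (int c = 0; c < row.cols; ++c)
        ++counts[static_cast<int>(row.at<float>(0, c))];

    if (counts.empty())
        return 0.f;

    // the label is the value with the highest count
    auto best = counts.begin();
    for (auto it = counts.begin(); it != counts.end(); ++it)
        if (it->second > best->second)
            best = it;
    return static_cast<float>(best->first);
}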
To predict, I take a single vector (= one row), standardize it, run PCA on it, and call the OpenCV prediction routine.
Unfortunately the results are not good even after several training sessions: the prediction seems to be biased towards a constant value that is within range but does not reflect the actual activity.
I am using OpenCV C++ on Linux.
My questions are:

  1. Is an MLP suitable to address the problem I am trying to solve?
  2. Is it worth fiddling with the MLP layout to get better results?
  3. Are there any other relevant details you can think of that may help me?

As I pointed out above, this is my very first DNN program ever. Any comments would be greatly appreciated!

please show example data, your current code, papers explaining your approach – add anything helpful, ty…

Hi,
Thanks for your reply. Here is the logic to load the model and perform training / prediction:

void DeepLearningModel::loadModel(cv::Mat &training_data)
{
    // Loading of the stored training data (needed for the PCA in predict())
    // is omitted here; train() receives its training data "as is".

    cv::FileStorage dnn_file(m_model_path, cv::FileStorage::READ);

    if (dnn_file.isOpened())
    {
        m_neural_network = ml::ANN_MLP::load(m_model_path);
    }
    else
    {
        m_neural_network = ml::ANN_MLP::create();
    }
}

void DeepLearningModel::train(cv::Mat &training_data)
{
    std::cout << "train" << std::endl;

    // Train a neural network to identify patterns in a list of numbers

    cv::Mat dummy_training_data;
    loadModel(dummy_training_data);

    auto matrix_prepared_for_learning{
        m_data_process.getMatrixPreparedForLearning()};

    copyMatrixToOpenCvMatrix(matrix_prepared_for_learning, training_data);

    // Every row gets its own label, representing the activity demonstrated by the row
    auto number_of_samples =
        m_data_process.getMatrixPreparedForLearning().size();
    Mat labels(number_of_samples, 1, CV_32FC1);
    assignLabels(training_data, labels);

    // calculate standardized values before PCA
    standardizeMatrix(training_data);

    // Perform PCA on the training data to reduce its dimensionality
    PCA pca(training_data, Mat(), PCA::DATA_AS_ROW);
    Mat reduced_data;
    pca.project(training_data, reduced_data);

    Mat layers_sizes = Mat(5, 1, CV_32S);
    // the input layer size must equal the number of columns of 'reduced_data'
    layers_sizes.row(0) = Scalar(reduced_data.cols);
    layers_sizes.row(1) = Scalar(5);
    layers_sizes.row(2) = Scalar(8);
    layers_sizes.row(3) = Scalar(5);
    layers_sizes.row(4) = Scalar(1);
    m_neural_network->setLayerSizes(layers_sizes);

    // Set activation function and termination criteria
    m_neural_network->setActivationFunction(ml::ANN_MLP::SIGMOID_SYM);
    m_neural_network->setTermCriteria(
        TermCriteria(TermCriteria::MAX_ITER, 10000, 0.0001));

    Mat outputData;
    labels.convertTo(outputData, CV_32F);  // labels as a floating-point matrix

    // Train the neural network
    m_neural_network->train(reduced_data, ml::ROW_SAMPLE, outputData);
}
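
For context, standardizeMatrix is not shown; assuming a plain per-column z-score standardization, it would look roughly like this (sketch, my actual helper may differ in details):

void DeepLearningModel::standardizeMatrix(cv::Mat &m)
{
    // zero mean / unit variance (z-score) per column
    for (int c = 0; c < m.cols; ++c)
    {
        cv::Scalar mean, stddev;
        cv::meanStdDev(m.col(c), mean, stddev);

        cv::Mat col = m.col(c);      // header only, shares data with m
        col -= mean[0];              // subtract the column mean in place
        if (stddev[0] > 0.0)         // skip scaling for constant columns
            col *= 1.0 / stddev[0];  // scale to unit variance
    }
}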

void DeepLearningModel::predict()
{
    std::cout << "predict" << std::endl;

    cv::Mat                      loaded_model;
    DataProcess::MatrixLogicalId training_data;

    loadModel(loaded_model);

    copyOpenCvMatrixToMatrix(loaded_model, training_data);

    size_t num_features = training_data[0].size();

    Mat test_data(1, static_cast<int>(num_features), CV_32F);

    int index{0};

    auto vector_prepared_for_prediction{
        m_data_process.getVectorPreparedForPrediction()};

    for (auto const &logical_thread_id : vector_prepared_for_prediction)
    {
        test_data.at<float>(0, index++) = logical_thread_id;
    }

    standardizeMatrix(test_data);

    Mat reduced_test_data;
    PCA pca(loaded_model, Mat(), PCA::DATA_AS_ROW);
    pca.project(test_data, reduced_test_data);

    Mat prediction;
    m_neural_network->predict(reduced_test_data, prediction);

    std::cout << "Activity predicted to be " << prediction << std::endl;
}

A sample of the training data looks like this (one example row; each row holds 600 integers, and the matrix has up to 10 rows):
5,6,1,71,4,4,4,5,9,1,9,9,3,7,7,7,7,6,1,2,6,7,5,5,1,1,3,7,4…

is this from some kind of sensor, or is this categorical data (think “lunch choices” or “names of dog breeds”)?

no, a straight perceptron is pretty worthless on any time series or sequence data.

time series/sequence data requires recurrent or convolutional networks.

Hi,
Thanks. You can think of this data as identifiers of program parts being executed. I suppose you can consider this categorical data, then? OK, I will need to switch to an RNN instead - I suspected this, but was not sure…

ANN_MLP expects one-hot encoded labels, that is:
Mat labels(number_of_samples, number_of_classes, CV_32FC1);
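
a minimal sketch (assuming 0-based integer class ids; the helper name is made up):

// one-hot encode 0-based class ids for ANN_MLP training
cv::Mat oneHotLabels(const std::vector<int> &class_ids, int number_of_classes)
{
    cv::Mat labels = cv::Mat::zeros((int)class_ids.size(), number_of_classes, CV_32FC1);
    for (int r = 0; r < (int)class_ids.size(); ++r)
        labels.at<float>(r, class_ids[r]) = 1.f;
    return labels;
}

predict() then returns a row of number_of_classes scores, and the predicted class is the index of the maximum score (e.g. via cv::minMaxLoc).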

to do what the comment says, you need to specify how many components to keep. currently, it does not reduce anything.
you also need to use exactly the same PCA for train & test data
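
e.g. like this (a sketch; the component count and file name are just examples):

// train: keep a fixed number of components ...
int num_components = 20;   // example value, tune for your data
PCA pca(training_data, Mat(), PCA::DATA_AS_ROW, num_components);
Mat reduced_data;
pca.project(training_data, reduced_data);

// ... and persist the pca, so predict() can reuse the SAME projection
FileStorage fs("pca.yml", FileStorage::WRITE);
pca.write(fs);
fs.release();

// predict: load the stored pca instead of building a new one
FileStorage fs_in("pca.yml", FileStorage::READ);
PCA pca_loaded;
pca_loaded.read(fs_in.root());
Mat reduced_test_data;
pca_loaded.project(test_data, reduced_test_data);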

Right, this is an obvious mistake! :ok_hand: