cv.PCACompute Results Seem Wrong

Hello,

I am exploring PCA with OpenCV, and I do not understand the results I am seeing.

Compared to the PCA in sklearn.decomposition, the results seem wrong:

  1. The same numbers appear far too often, and some rows of the eigenvector matrix are identical (or only sign-flipped).
  2. How do you project the samples and then back-project them in Python, to validate the result?

import numpy as np
import cv2 as cv

x1 = np.asarray([1,1,1,1,1,1],dtype=np.float32)
x2 = np.asarray([20,20,20,20,20,20],dtype=np.float32)
x3 = np.asarray([300,300,300,300,300,300],dtype=np.float32)
x4 = np.asarray([400,400,400,400,400,400],dtype=np.float32)

matrix_test = np.vstack((x1, x2, x3, x4))

# An empty mean array asks OpenCV to compute the mean from the data itself
mean, eigenvectors = cv.PCACompute(matrix_test, mean=np.array([]), maxComponents=10)

print(mean)
#[[180.25 180.25 180.25 180.25 180.25 180.25]]

print(eigenvectors)
# [[ 0.40824828  0.40824828  0.40824828  0.40824828  0.40824828  0.40824828]
#  [-0.4082483  -0.4082483  -0.4082483  -0.4082483  -0.4082483  -0.4082483 ]
#  [ 0.4082483   0.4082483   0.4082483   0.4082483   0.4082483   0.4082483 ]
#  [ 0.4082483   0.4082483   0.4082483   0.4082483   0.4082483   0.4082483 ]]

from sklearn.decomposition import PCA

pca = PCA()
components = pca.fit_transform(matrix_test)
print(components)
# [[-4.3907104e+02 -5.0779790e-06 -5.4410724e-13  1.9930799e-20]
#  [-3.9253073e+02 -4.5397310e-06  7.3777291e-13  1.2022075e-20]
#  [ 2.9332635e+02  3.7512091e-05  6.0963089e-14  1.6197749e-20]
#  [ 5.3827533e+02 -2.7894388e-05  6.0963008e-14  1.6197741e-20]]

out = pca.inverse_transform(components)
print(out)
# [[  0.99998474   0.99998474   0.99998474   0.99998474   0.99998474
#     0.99998474]
#  [ 20.          20.          20.          20.          20.
#    20.        ]
#  [299.99994    300.         300.         300.         300.
#   300.        ]
#  [400.         400.         400.         400.         400.
#   400.        ]]

Am I missing something obvious?

Are you sure pca.fit_transform returns eigenvectors? It sounds more like the projected values.

(you might be comparing apples to pears …)

You are indeed correct. The projection is also required.

res = cv.PCAProject(matrix_test, mean, eigenvectors)
print(res)


[[-2.68394051e+06  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00 -3.49245965e-10]
 [-2.68389397e+06  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00 -3.49245965e-10]
 [-2.68320811e+06  0.00000000e+00  0.00000000e+00  5.82076609e-11
   0.00000000e+00 -4.65661287e-10]
 [-2.68296316e+06 -5.82076609e-11  0.00000000e+00  1.16415322e-10
   0.00000000e+00 -4.65661287e-10]
 [-2.67169551e+06  0.00000000e+00  0.00000000e+00  5.82076609e-11
   0.00000000e+00 -2.32830644e-10]
 [-2.53697358e+06 -1.16415322e-10  0.00000000e+00  1.16415322e-10
   0.00000000e+00 -1.16415322e-10]
 [-9.69300140e+05  0.00000000e+00  0.00000000e+00 -2.91038305e-11
   0.00000000e+00 -1.16415322e-10]
 [ 1.69119750e+07  0.00000000e+00  0.00000000e+00 -4.65661287e-10
   0.00000000e+00  2.79396772e-09]]

res = cv.PCABackProject(res, mean, eigenvectors)
print(res)


[[1.e+00 1.e+00 1.e+00 1.e+00 1.e+00 1.e+00]
 [2.e+01 2.e+01 2.e+01 2.e+01 2.e+01 2.e+01]
 [3.e+02 3.e+02 3.e+02 3.e+02 3.e+02 3.e+02]
 [4.e+02 4.e+02 4.e+02 4.e+02 4.e+02 4.e+02]
 [5.e+03 5.e+03 5.e+03 5.e+03 5.e+03 5.e+03]
 [6.e+04 6.e+04 6.e+04 6.e+04 6.e+04 6.e+04]
 [7.e+05 7.e+05 7.e+05 7.e+05 7.e+05 7.e+05]
 [8.e+06 8.e+06 8.e+06 8.e+06 8.e+06 8.e+06]]