
difference between pca and pcaFromStatToolbox

1 view (last 30 days)
Amir on 22 Feb 2016
Commented: the cyclist on 23 Feb 2016
It might sound stupid, but I am actually confused by the results of pca and pcaFromStatToolbox. I just noticed that the outputs differ, and I am wondering why:
[coeff1,score1]=pcaFromStatToolbox(ran)
[coeff2,score2]=pca(x)
So let's look at an example:
>> pcaData=rand(4,5)
pcaData =
0.4638 0.7937 0.6250 0.1400 0.4149
0.7046 0.5080 0.3831 0.8778 0.0977
0.0153 0.8616 0.8466 0.7827 0.0962
0.5929 0.9365 0.0800 0.0978 0.8779
----
>> [n,nn]=pcaFromStatToolbox(pcaData)
n =
-0.2633 0.6508 -0.4266
-0.1495 -0.3995 0.3599
0.4276 -0.4884 -0.4736
0.6094 0.4031 0.5820
-0.5951 -0.1256 0.3542
nn =
-0.1772 -0.2040 -0.2480
0.3372 0.5222 -0.0219
0.6069 -0.3321 0.1240
-0.7668 0.0139 0.1458
------
>> [m,mm]=pca(pcaData)
netlab pca: using eig
netlab pca: sorting evec
m =
0.3671
0.1416
0.0329
0.0000
0.0000
mm =
0.2633 -0.6508 -0.4266 -0.1072 0.5601
0.1495 0.3995 0.3599 0.3753 0.7400
-0.4276 0.4884 -0.4736 -0.5079 0.3106
-0.6094 -0.4031 0.5820 -0.2916 0.2056
0.5951 0.1256 0.3542 -0.7104 0
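For reference, the two coefficient matrices printed above agree on the first three components up to a sign flip per column (PCA directions are only defined up to sign). A quick sketch to check that numerically, assuming n and mm hold the matrices shown above:
% n is the 5-by-3 output of pcaFromStatToolbox, mm the 5-by-5 output of the other pca.
% Compare absolute values, since each column may be flipped in sign.
max(max(abs(abs(n) - abs(mm(:,1:3)))))   % ~0, up to the 4-digit display rounding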
  3 comments
Amir on 23 Feb 2016
Good call! I noticed the other pca comes from a tool that was on my path:
Estimation_of_Distribution_Algorithms/BNT/KPMstats/pca.m
and its help text says:
function [PCcoeff, PCvec] = pca(data, N)
%PCA Principal Components Analysis
%
% Description
% PCCOEFF = PCA(DATA) computes the eigenvalues of the covariance
% matrix of the dataset DATA and returns them as PCCOEFF. These
% coefficients give the variance of DATA along the corresponding
% principal components.
%
% PCCOEFF = PCA(DATA, N) returns the largest N eigenvalues.
%
% [PCCOEFF, PCVEC] = PCA(DATA) returns the principal components as well
% as the coefficients. This is considerably more computationally
% demanding than just computing the eigenvalues.
%
% See also
% EIGDEC, GTMINIT, PPCA
%
% Copyright (c) Ian T Nabney (1996-2001)
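A quick way to confirm which pca actually runs is to ask MATLAB for every file of that name on the path (a sketch; the folder passed to rmpath is taken from the path quoted above):
which -all pca   % lists every pca.m on the path; the first entry is the one that runs
% If the KPMstats version shadows the Statistics Toolbox one, its folder can be removed:
% rmpath('Estimation_of_Distribution_Algorithms/BNT/KPMstats')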
the cyclist on 23 Feb 2016
Edited: the cyclist on 23 Feb 2016
I just did a quick search on KPMstats and MATLAB. I found this annotation:
"KPMstats is a directory of miscellaneous statistics functions written by Kevin Patrick Murphy and various other people (see individual file headers)."
Personally, I would need to dig in to get more confidence in Murphy et al. (who are surely fine fellows). I have a fair amount of experience with the MATLAB pca, and I am very confident in its output.


Answers (1)

the cyclist on 23 Feb 2016
Edited: the cyclist on 23 Feb 2016
Even without knowing the source of the other function, I can make a guess.
Notice that for the input, you have 4 observations (4 rows) of 5 variables. So, you can fully explain 100% of the variation with just 4 principal components. Furthermore, because MATLAB centers the variables, you can do it with 3 principal components.
Notice that MATLAB outputs 3 principal component coefficient vectors, whereas your other software outputs 5. That other software is clearly making a different assumption in the case where you only need 3 vectors to fully span the space. My guess is that the 4th and 5th vectors (the ones that differ from MATLAB's) are linear combinations of the first 3.
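Here is a sketch of that counting argument, using the data from the question (centering via bsxfun so it also runs on older releases; the last line assumes the Statistics Toolbox pca is the one on the path):
pcaData = [0.4638 0.7937 0.6250 0.1400 0.4149
           0.7046 0.5080 0.3831 0.8778 0.0977
           0.0153 0.8616 0.8466 0.7827 0.0962
           0.5929 0.9365 0.0800 0.0978 0.8779];
centered = bsxfun(@minus, pcaData, mean(pcaData));  % pca centers each column by default
rank(centered)       % 3: four rows minus one degree of freedom lost to centering
size(pca(pcaData))   % 5-by-3: one coefficient column per component with nonzero variance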
  1 comment
the cyclist on 23 Feb 2016
My speculation that the other two output vectors are linear combinations of the first three seems not to be true. I'm not sure what's going on there.
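One way to test that speculation (a sketch, assuming mm is the 5-by-5 coefficient matrix printed in the question): if columns 4 and 5 were linear combinations of columns 1-3, appending them would not increase the rank. Since eig returns mutually orthogonal eigenvectors for the symmetric covariance matrix, the rank does increase, which matches the comment above.
rank(mm(:,1:3))   % 3
rank(mm)          % should be 5, so columns 4 and 5 are not in the span of the first 3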

