STATISTICAL DEVELOPMENTS AND APPLICATIONS

Nonlinear Principal Components Analysis With CATPCA: A Tutorial

Pages 12-25 | Received 22 Jun 2010, Published online: 16 Dec 2011
 

Abstract

This article is set up as a tutorial for nonlinear principal components analysis (NLPCA), systematically guiding the reader through the process of analyzing actual data on personality assessment by the Rorschach Inkblot Test. NLPCA is a more flexible alternative to linear PCA that can handle the analysis of possibly nonlinearly related variables with different types of measurement level. The method is particularly suited to analyze nominal (qualitative) and ordinal (e.g., Likert-type) data, possibly combined with numeric data. The program CATPCA from the Categories module in SPSS is used in the analyses, but the method description can easily be generalized to other software packages.

Notes

Note that Likert-type rating scales are often analyzed as numeric, assuming the categories to be equally spaced. However, such an a priori assumption might not be justified.

In Figures 1 and 3 we have depicted the loading vector in the category plot for didactic purposes. However, in actual output, the category vectors and loading vectors are displayed in separate plots. Then, it is important to realize that the importance of variables in the solution is indicated by the length of the loading vector in the loadings plot (or table), and not by the length of the variable vector in a category plot.

As this restriction is imposed on the quantifications in each iteration of the optimal scaling process, the final result of ordinal quantification is not equal to the result of directly applying the order restriction to the final nominal category quantifications.
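A single order-restriction step can be sketched as follows. This is only an illustration, assuming the restriction takes the form of a weighted monotone (nondecreasing) regression computed with the pool-adjacent-violators algorithm, with category frequencies as weights. The function name, the quantifications, and the frequencies are hypothetical and do not come from the article or from CATPCA output.

```python
def monotone_regression(values, weights):
    """Weighted least-squares nondecreasing fit (pool-adjacent-violators)."""
    blocks = []  # each block holds [weighted mean, total weight, number of categories]
    for value, weight in zip(values, weights):
        blocks.append([value, weight, 1])
        # Merge the last two blocks for as long as they violate the order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, weight2, count2 = blocks.pop()
            mean1, weight1, count1 = blocks.pop()
            total = weight1 + weight2
            pooled = (mean1 * weight1 + mean2 * weight2) / total
            blocks.append([pooled, total, count1 + count2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical nominal quantifications for four ordered categories,
# with hypothetical category frequencies used as weights.
nominal = [-1.2, 0.3, -0.1, 0.9]
frequencies = [10, 5, 15, 20]
ordinal = monotone_regression(nominal, frequencies)
```

Here the second and third categories violate the order, so they are pooled into one tied value. In the iterative process described in the note, a step of this kind would be repeated in every iteration, which is why applying it once to the final nominal quantifications gives a different result.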

For a variable with numerical analysis level, the quantifications can be more easily obtained by simply standardizing the observed variable (as in linear PCA). However, we indicate how they can be obtained by optimal scaling for didactic purposes.
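As a minimal sketch of the standardization shortcut, the snippet below converts hypothetical observed scores to z-scores. Dividing by n rather than n - 1 when computing the variance is an assumption made here for illustration; the note does not specify which is used.

```python
import math


def standardize(values):
    """Return z-scores, using the population variance (divide by n; an assumption)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    sd = math.sqrt(variance)
    if sd == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - mean) / sd for v in values]


# Hypothetical observed scores on a numeric variable.
scores = [1, 2, 3, 4, 5]
quantified = standardize(scores)
```

The resulting values have mean 0 and variance 1, and they are a linear function of the observed scores, so the spacing between the original categories is preserved.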

Note that, when the missing-data option Passive is used, the Model Summary table does not include percentage VAF values, because in that case proportion VAF is not exactly equal to the eigenvalue divided by the number of analysis variables. However, when the number of missing values is not very large, this ratio still gives a proper indication of proportion VAF.
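The relation referred to in this note, proportion VAF as the eigenvalue divided by the number of analysis variables, can be illustrated with a small computation. The eigenvalues and the number of variables below are hypothetical and are not results from the article. With the Passive option, the outcome should be read as an approximation only.

```python
def proportion_vaf(eigenvalues, n_variables):
    """Proportion of variance accounted for by each component."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return [eigenvalue / n_variables for eigenvalue in eigenvalues]


# Hypothetical eigenvalues of a two-dimensional solution with eight analysis variables.
eigenvalues = [3.2, 1.6]
vaf = proportion_vaf(eigenvalues, n_variables=8)
total_vaf = sum(vaf)
```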

Remember that it is necessary to look at scree plots in different dimensionalities, because NLPCA solutions are not nested.

Eigenvalues are from the bottom row of the Correlations transformed variables table.

As rotation options are not available within CATPCA in SPSS 17, VARIMAX rotation was performed by saving the transformed variables and submitting them to a linear PCA with VARIMAX rotation.

The multiple nominal level is referred to as multiple because such a variable obtains category quantifications for each principal component, in contrast to the single, overall quantifications obtained with the other analysis levels.

Although not referred to as such in NLPCA, this measure is found in the VAF table in the Mean column under Centroid coordinates.
