Abstract
We review graphical representations of the correlation matrix obtained by different multivariate statistical methods, compare the procedures on an example dataset, and propose an improved representation with better fit. Principal component analysis is widely used for making pictures of correlation structure. However, as we show, a weighted alternating least squares approach that avoids fitting the diagonal of the correlation matrix outperforms both principal component analysis and principal factor analysis in approximating a correlation matrix. Weighted alternating least squares is a very strong competitor for principal component analysis, in particular if the correlation matrix is the focus of the study: it improves the representation of the correlation matrix, often at the expense of only a minor percentage of explained variance for the original data matrix when the latter is mapped onto the correlation biplot by regression. In this article, we propose to combine weighted alternating least squares with an additive adjustment of the correlation matrix, which leads to a further improvement in the approximation of the correlation matrix.
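To make the idea of not fitting the diagonal concrete, the following is a minimal sketch in Python with NumPy. It is not the algorithm implemented in the Correlplot package. It assumes a simple imputation scheme: the diagonal of the correlation matrix is repeatedly replaced by its current fitted values, and a rank-2 eigen-truncation is then recomputed, which gives the diagonal zero weight in the least squares criterion. The data are synthetic, and all function names (`rank_k_symmetric`, `offdiag_rmse`, `fit_ignoring_diagonal`) are hypothetical.

```python
import numpy as np

def rank_k_symmetric(X, k):
    """Best rank-k least squares approximation of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(X)
    # Keep the k eigenvalues that are largest in absolute value.
    keep = np.argsort(np.abs(vals))[::-1][:k]
    return (vecs[:, keep] * vals[keep]) @ vecs[:, keep].T

def offdiag_rmse(R, R_hat):
    """Root mean squared error over the off-diagonal entries only."""
    mask = ~np.eye(R.shape[0], dtype=bool)
    return float(np.sqrt(np.mean((R[mask] - R_hat[mask]) ** 2)))

def fit_ignoring_diagonal(R, k=2, n_iter=500):
    """Rank-k fit with zero weight on the diagonal (assumed imputation scheme)."""
    R_hat = rank_k_symmetric(R, k)  # start from the PCA-type solution
    for _ in range(n_iter):
        X = R.copy()
        # Replace the diagonal by the current fit so that it contributes
        # nothing to the next least squares step.
        np.fill_diagonal(X, np.diag(R_hat))
        R_hat = rank_k_symmetric(X, k)
    return R_hat

# Synthetic data (an assumption for illustration, not the article's dataset).
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 6)) + rng.normal(size=(100, 6))
R = np.corrcoef(data, rowvar=False)

R_pca = rank_k_symmetric(R, 2)       # rank-2 eigen-truncation, diagonal included
R_wls = fit_ignoring_diagonal(R, 2)  # rank-2 fit of the off-diagonal only

rmse_pca = offdiag_rmse(R, R_pca)
rmse_wls = offdiag_rmse(R, R_wls)
print(f"off-diagonal RMSE, PCA-type fit:     {rmse_pca:.4f}")
print(f"off-diagonal RMSE, diagonal ignored: {rmse_wls:.4f}")
```

Because each iteration starts from the previous fit and solves a least squares problem that majorizes the off-diagonal criterion, the off-diagonal error of this sketch cannot exceed that of the initial eigen-truncation. How large the gain is depends on the data.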
4 Supplementary Materials
R-package Correlplot: R-package Correlplot (version 1.0.8) contains code to calculate the different approximations to the correlation matrix and to create the graphics shown in the article. The package contains all datasets used in the article. R-package Correlplot has a vignette containing a detailed example showing how to generate all graphical representations of the correlation matrix (GNU zipped tar file).
Approximations: The file approximations.pdf contains the approximations to the correlation matrix of the Heart attack data. Each table in the supplement gives the sample correlations above the diagonal, and the approximations obtained with a particular method on and/or below the diagonal (PDF file).
Disclosure Statement
The authors report there are no competing interests to declare.
Acknowledgments
Part of this work (Graffelman 2022) was presented at the 17th Conference of the International Federation of Classification Societies (IFCS 2022) at the "Fifty years of biplots" session organized by Professor Niël le Roux (Stellenbosch University) in Porto, Portugal. We thank two anonymous reviewers whose comments on the manuscript have helped to improve it.