Abstract
Principal component analysis (PCA) is a widely used unsupervised learning method with broad applications in both descriptive and inferential analytics. It is commonly employed for representation learning, extracting key features from a dataset and visualizing them in a lower-dimensional space. With the growing adoption of neural network-based methods, autoencoders (AEs) have gained popularity for dimensionality reduction tasks. In this paper, we explore the close relationship between PCA and AEs and demonstrate, through examples, how the two approaches yield similar results in the case of so-called linear AEs (LAEs). This study provides insights into the evolving landscape of unsupervised learning and highlights the relevance of both PCA and AEs in modern data analysis.
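The equivalence claimed in the abstract can be illustrated with a small NumPy sketch (not the paper's own code, which is linked below): a linear AE, i.e. an encoder-decoder pair with no activation functions or biases, trained to minimize squared reconstruction error, learns a subspace that matches the top-k principal subspace from PCA. The toy data, dimensions, and learning rate here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 400 samples, 6 features with a clear variance gap,
# so the top-2 principal subspace is well defined (illustrative setup).
n, d, k = 400, 6, 2
X = rng.normal(size=(n, d)) * np.array([3.0, 2.0, 1.0, 0.5, 0.5, 0.5])
X -= X.mean(axis=0)  # PCA and the linear AE both assume centered data

# --- PCA: top-k principal directions via SVD ---
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:k].T  # (d, k) orthonormal basis of the principal subspace

# --- Linear AE: minimize ||X - X We Wd||_F^2 by gradient descent ---
We = 0.1 * rng.normal(size=(d, k))  # encoder weights (no bias, no activation)
Wd = 0.1 * rng.normal(size=(k, d))  # decoder weights
lr = 0.05
for _ in range(3000):
    R = X @ We @ Wd - X          # reconstruction residual
    g_e = 2 / n * X.T @ R @ Wd.T  # gradient w.r.t. encoder
    g_d = 2 / n * We.T @ X.T @ R  # gradient w.r.t. decoder
    We -= lr * g_e
    Wd -= lr * g_d

# The decoder's row space should span the same subspace as V, even though
# We and Wd need not equal the principal directions themselves (the learned
# basis is identified only up to an invertible linear transformation).
Q, _ = np.linalg.qr(Wd.T)                      # orthonormal basis of AE subspace
alignment = np.linalg.norm(V.T @ Q) ** 2 / k   # 1.0 means identical subspaces
print(f"subspace alignment: {alignment:.4f}")
```

The alignment score approaches 1, showing that the linear AE recovers the PCA subspace rather than the individual principal components: the AE's solution is unique only up to a change of basis in the latent space.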
Notes
1 The code used to generate the results is available at: https://github.com/dcacciarelli/pca-vs-autoencoders
Additional information
Notes on contributors
Davide Cacciarelli
Davide Cacciarelli is a PhD student at the Technical University of Denmark and the Norwegian University of Science and Technology. His research focuses on active learning and statistical process monitoring.
Murat Kulahci
Murat Kulahci is a professor at the Technical University of Denmark and Luleå University of Technology in Sweden. His research currently focuses on large-scale data analytics for descriptive, inferential, and predictive purposes. Many of his research applications involve high-dimensional, high-frequency data that demand analysis methods from chemometrics and machine learning. He has collaborated with various industries on numerous industrial statistics and digital manufacturing projects.