Abstract
We develop methodology for three-dimensional (3D) radial visualization (RadViz) of multidimensional datasets. The classical two-dimensional (2D) RadViz visualizes multivariate data in the 2D plane by mapping every observation to a point inside the unit circle. Our tool, RadViz3D, distributes anchor points uniformly on the 3D unit sphere. We show that this uniform distribution provides the best visualization with minimal artificial visual correlation for data with uncorrelated variables. However, anchor points can be placed exactly equi-distant from each other only for the five Platonic solids, so we provide equi-distant anchor points for these five settings, and approximately equi-distant anchor points via a Fibonacci grid for the other cases. Our methodology, implemented in the R package radviz3d, makes fully 3D RadViz possible and is shown to improve the ability of this nonlinear technique to display more faithfully both simulated data and the crabs, olive oils and wine datasets. Additionally, because radial visualization is naturally suited for compositional data, we use RadViz3D to illustrate (i) the chemical composition of Longquan celadon ceramics and their Jingdezhen imitation over centuries, and (ii) United States regional SARS-CoV-2 variants' prevalence in the Covid-19 pandemic during the summer 2021 surge of the Delta variant.
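As an illustration only, the two ingredients named above (a Fibonacci grid of anchor points on the unit sphere and a radial map of each observation into the unit ball) can be sketched in a few lines of Python. This is not the radviz3d package or its API. The function names `fibonacci_anchors` and `radviz3d_map` are hypothetical, the grid uses one common Fibonacci lattice construction, and the map assumes the classical RadViz rule (min-max scale each variable to [0, 1], then take the weighted average of the anchors); the article's exact construction may differ.

```python
import math


def fibonacci_anchors(p):
    """Return p approximately evenly spread unit vectors (Fibonacci lattice).

    Assumption: a common lattice variant with heights z_i = 1 - (2i + 1)/p
    and longitudes advancing by the golden angle.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    anchors = []
    for i in range(p):
        z = 1.0 - (2.0 * i + 1.0) / p      # strictly inside (-1, 1)
        r = math.sqrt(1.0 - z * z)         # radius of the circle at height z
        theta = golden_angle * i
        anchors.append((r * math.cos(theta), r * math.sin(theta), z))
    return anchors


def radviz3d_map(rows, anchors):
    """Map each p-variate row to a point in the unit ball.

    Assumption: classical RadViz weighting. Each variable is min-max scaled
    to [0, 1]; a row is sent to the weighted average of the anchors, with
    the scaled values as weights. Rows whose weights are all zero go to the
    origin.
    """
    p = len(anchors)
    lo = [min(row[j] for row in rows) for j in range(p)]
    hi = [max(row[j] for row in rows) for j in range(p)]
    points = []
    for row in rows:
        w = [(row[j] - lo[j]) / (hi[j] - lo[j]) if hi[j] > lo[j] else 0.0
             for j in range(p)]
        total = sum(w)
        if total == 0.0:
            points.append((0.0, 0.0, 0.0))
            continue
        points.append(tuple(
            sum(w[j] * anchors[j][k] for j in range(p)) / total
            for k in range(3)
        ))
    return points


if __name__ == "__main__":
    anchors = fibonacci_anchors(7)
    data = [[(3 * i + 5 * j) % 11 for j in range(7)] for i in range(10)]
    for point in radviz3d_map(data, anchors):
        print(point)
```

Because every mapped point is a convex combination of unit vectors, it necessarily lies inside the unit ball, which is the 3D analogue of the classical display inside the unit circle.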
Supplementary Materials
The following supplementary materials are available:
A gzipped tar file (supplement.tgz) containing:
A gzipped tar file (html-resource.tgz) containing HTML source code (also available online at https://radviz3d.github.io/). The file supplement_jcgs.html provides overlap maps, RadViz2D displays, and interactive Viz3D and RadViz3D plots for the illustrative and real-data examples discussed in this article. See README for details.
A gzipped tar file (radviz3d-codes.tgz) containing documented codes and necessary datasets for reproducing the results.
radviz3d: An R package implementing the algorithm in this article, available at https://github.com/fanne-stat/radviz3d.
Acknowledgments
The authors thank an Associate Editor and three anonymous reviewers for their insightful comments on an earlier version of the manuscript. We are also grateful to N. Kunwar, H. Nguyen, P. Lu, F.S. Aguilar, G. Agadilov and I. Agbemafle for helpful discussions during an introductory graduate multivariate statistics class (Stat 501, Spring 2018 semester) at Iowa State University where this methodology was conceived. Our thanks also go to Z. He and H. Zhang for their explanations of how the ceramics observations were collected and to K. S. Dorman for her help in identifying and obtaining the SARS-CoV-2 variant proportions dataset. A portion of this article won the first author a 2021 Student Paper Competition award from the American Statistical Association (ASA) Section on Statistical Computing and Graphics. The content of this article, however, is solely the responsibility of the authors and does not represent the official views of the USDA.