Abstract
We investigate a generalized framework to estimate a latent low-rank plus sparse tensor, where the low-rank tensor often captures the multi-way principal components and the sparse tensor accounts for potential model misspecifications or heterogeneous signals that are unexplainable by the low-rank part. The framework flexibly covers both linear and generalized linear models, and can easily handle continuous or categorical variables. We propose a fast algorithm by integrating Riemannian gradient descent and a novel gradient pruning procedure. Under suitable conditions, the algorithm converges linearly and can simultaneously estimate both the low-rank and sparse tensors. The statistical error bounds of the final estimates are established in terms of the gradient of the loss function. The error bounds are generally sharp under specific statistical models, for example, the sub-Gaussian robust PCA and Bernoulli tensor model. Moreover, our method achieves nontrivial error bounds for heavy-tailed tensor PCA whenever the noise has a finite moment. We apply our method to analyze the international trade flow dataset and the statistician hypergraph coauthorship network, both yielding new and interesting findings. Supplementary materials for this article are available online.
Supplementary Materials
The supplementary materials include the theoretical guarantees on Poisson tensor robust PCA, more numerical simulation results, more real data analysis, the performance of RGrad on exact low-rank tensor models, and all the technical proofs.
Notes
1 We will show, in Section 4, that obtaining a good initialization for is, under suitable conditions, easy once a good initialization for is available.
2 Egypt lies at the junction of Eastern Africa and Western Asia. For simplicity, we treat it as an Asian country. In addition, Turkey is treated as an Eastern European country rather than a Western Asian country.