I congratulate the authors on an excellent overview of an important research area. Sufficient dimension reduction methods are based on the model-free driving condition that $Y \perp\!\!\!\perp X \mid P_{\mathcal{S}}X$, where $X \in \mathbb{R}^p$ is multivariate and potentially high-dimensional, and $P_{\mathcal{S}}$ is the projection onto the dimension reduction subspace $\mathcal{S} \subseteq \mathbb{R}^p$. Equipped with variable selection and variable screening techniques, many modern sparse sufficient dimension reduction methods have been developed in the past few years, and they can work very well in the model development stage of high-dimensional data analysis. This review paper is very timely: it provides a thorough overview of sparse sufficient dimension reduction methods and sheds light on future research directions.
Specific contributions of this paper include the following. First, it recasts many moment-based sufficient dimension reduction methods as a generalised eigen-decomposition problem and also as a constrained trace optimisation (Equation (2.1) in the paper). These two formulations are crucial for studying sparse sufficient dimension reduction in high-dimensional settings where $p > n$ or even $p \gg n$. This paper also provides a comprehensive overview of the key developments in the sparse sufficient dimension reduction literature, from the first paper by Ni et al. (Citation2005) to the newest theoretical and computational breakthroughs in Tan et al. (Citation2020). Various techniques for sparse sufficient dimension reduction are discussed, including shrinkage and regularised estimation, variable screening and trace pursuit, and sequential procedures. Advantages and possible pitfalls of the different approaches are well explained by the authors. Finally, the paper provides the theoretical foundations for establishing the minimax rates of convergence for estimating a sparse dimension reduction subspace. Further calculations are also included to provide insights into second-order dimension reduction methods such as SAVE and DR.
This paper is inspiring for substantive research in high-dimensional multivariate statistics, and encouraging for current and future studies in dimension reduction. In what follows, I draw two connections which may be suggestive of future directions.
1. Least squares formulations
In the seminal work of Li and Duan (Citation1989), a well-known connection between the ordinary least squares estimator in the regression of Y on $X$ and a one-dimensional dimension reduction subspace was derived. Specifically, consider the following bivariate loss function,
(1) $L(a, b) = \mathrm{E}\,\ell(a + b^{\top}X, Y),$
where $\ell(u, y)$ is convex in its first argument (e.g. $\ell(u, y) = (y - u)^2$ for least squares), and define the unique minimiser as follows,
(2) $(\alpha, \beta) = \arg\min_{a, b} L(a, b).$
Then, under the linearity condition on the dimension reduction subspace, it was shown that $\beta$ is contained in the central subspace (or any dimension reduction subspace).
A direct consequence of this somewhat surprising result is that one can simply use OLS to extract the direction in nonlinear models of the following form,
(3) $Y = g(\beta_1^{\top}X, \ldots, \beta_d^{\top}X, \epsilon),$
where g is some unknown function, ϵ is the error term, and $\mathrm{span}(\beta_1, \ldots, \beta_d)$ is the central subspace under this model. The model (3) is known as the single-index model when d = 1, and as the multiple-index model when d>1. The Li–Duan Theorem (Li & Duan, Citation1989, Theorem 2.1) then implies that the solution $\beta$ from ordinary least squares estimation, or more generally from (2), lies within the central subspace: $\beta \in \mathcal{S}_{Y \mid X}$.
For single-index models, this means that the population OLS coefficient vector, $\beta$, is the same as $\beta_1$ up to scalar multiplication (i.e. $\beta = c\beta_1$ for some constant c). As such, we might still use OLS to study regression graphics even when there is a nonlinear relationship between Y and $X$ (Cook, Citation1998). In high-dimensional settings, one may replace OLS with a penalised version such as LASSO regression and achieve consistent variable selection and directional estimation (Neykov, Lin, et al., Citation2016; Neykov, Liu, et al., Citation2016).
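To see the Li–Duan phenomenon numerically, here is a minimal Python sketch (my own illustration, not from the paper or from Li and Duan): it simulates a single-index model with Gaussian predictors, so the linearity condition holds, and checks that the OLS direction aligns with $\beta_1$ up to a scalar.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 10

# True single-index direction (sparse, unit norm); X is Gaussian,
# so the linearity condition required by Li-Duan holds.
beta1 = np.zeros(p)
beta1[:2] = 1 / np.sqrt(2)

X = rng.standard_normal((n, p))
Y = (X @ beta1) ** 3 + 0.5 * rng.standard_normal(n)  # Y = g(beta1'X) + eps, g(u) = u^3

# Plain OLS fit on centred data (no intercept needed)
beta_ols = np.linalg.lstsq(X - X.mean(0), Y - Y.mean(), rcond=None)[0]

# The OLS direction matches beta1 up to a scalar multiple
cos = abs(beta_ols @ beta1) / np.linalg.norm(beta_ols)
print(round(cos, 3))
```

In the high-dimensional setting, the `lstsq` step could be replaced by a LASSO fit to obtain the penalised variant discussed above.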
For multiple-index models and model-free sufficient dimension reduction, the (sparse) estimation of the central subspace is much more challenging. In high-dimensional sparse sufficient dimension reduction, it is thus desirable to have a penalised least squares formulation. Indeed, as this paper points out, the computationally tractable and rate-optimal sparse SIR is eventually obtained by the adaptive estimation scheme based on a least squares formulation (Tan et al., Citation2020). Finally, the paper concludes that ‘Since SAVE and DR can not be rewritten as a least-square formulation, we do not define refined sparse SAVE and DR estimator’. Not surprisingly, developing theoretically solid and computationally feasible methods for high-dimensional SAVE and DR is very challenging and requires substantial efforts.
2. Sparse/constrained canonical correlation
As discussed in Section 3.1 of this paper, a sparse sufficient dimension reduction subspace can be obtained by the C method: constrained canonical correlation (Zhou & He, Citation2008). The idea is to estimate the constrained canonical variates between the B-spline basis functions of the response transformation, $\{B_1(Y), \ldots, B_{m+k}(Y)\}$, where m is the spline order and k is the number of interior knots in the B-spline transformation, and the predictor $X$. This procedure can then be viewed as an estimation of sparse SIR directions. However, as the authors noted, this method may not be directly applicable to very high-dimensional settings, at least in theory. In the past few years, there have been advances in both the theoretical and computational aspects of high-dimensional canonical correlation analysis. Mai and Zhang (Citation2019) solved the sparse canonical correlation analysis (CCA) problem using an iterative penalised least squares approach, which can be directly applied to estimate sparse sufficient dimension reduction directions when combined with B-spline transformations of the response.
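As a small illustration of the connection between the B-spline transformation and SIR, the sketch below (the function name `slice_basis` and the knot placement are my own choices) builds the degree-zero B-spline basis of the response with knots at sample quantiles; degree-zero B-splines are indicators of the knot intervals, which is exactly the slice-indicator transformation underlying SIR.

```python
import numpy as np

def slice_basis(y, n_slices=5):
    """Degree-zero B-spline (slice-indicator) basis of the response.

    Interior knots are placed at equally spaced sample quantiles, so each
    basis function is the indicator of one slice of Y, as in SIR.
    """
    knots = np.quantile(y, np.linspace(0, 1, n_slices + 1)[1:-1])
    labels = np.searchsorted(knots, y)   # slice membership in {0, ..., n_slices - 1}
    return np.eye(n_slices)[labels]      # n x n_slices one-hot indicator matrix

rng = np.random.default_rng(1)
y = rng.standard_normal(200)
B = slice_basis(y, n_slices=5)
print(B.shape, B.sum(axis=0))  # each column counts the observations in one slice
```

With quantile knots the slices are (nearly) equal-sized, which matches the usual SIR slicing scheme; higher-order B-splines smooth these indicators.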
To illustrate the idea, we consider the estimation of the leading CCA directions. For a multivariate $X \in \mathbb{R}^p$ and a multivariate $Y \in \mathbb{R}^q$, the leading CCA directions are defined through a pair of linear combinations $\alpha^{\top}X$ and $\beta^{\top}Y$ such that the correlation between them is maximised. When the dimensions p and q are large relative to the sample size n, the sparse CCA problem assumes that the population solutions $\alpha$ and $\beta$ are both sparse so that we can estimate them with a limited sample size. Then the leading sparse CCA directions can be obtained by solving the following constrained optimisation problem,
(4) $(\widehat{\alpha}, \widehat{\beta}) = \arg\max_{a, b}\, a^{\top}\widehat{\Sigma}_{XY}\, b - \lambda_{\alpha}\|a\|_1 - \lambda_{\beta}\|b\|_1,$
subject to the constraints that $a^{\top}\widehat{\Sigma}_{X}\, a \le 1$ and $b^{\top}\widehat{\Sigma}_{Y}\, b \le 1$. Note that the data are centred so that $\bar{X} = 0$ and $\bar{Y} = 0$, and also that the sample covariances $\widehat{\Sigma}_{X}$ and $\widehat{\Sigma}_{Y}$ do not need to be positive definite. It can then be shown that the above optimisation can be solved by iteratively solving the following two LASSO regression problems. Specifically, the sparse CCA solutions can be obtained as follows,
(5) $\widehat{\alpha} \propto \arg\min_{a}\, \frac{1}{n}\sum_{i=1}^{n}\big(\widehat{\beta}^{\top}Y_i - a^{\top}X_i\big)^2 + \lambda_{\alpha}\|a\|_1,$
(6) $\widehat{\beta} \propto \arg\min_{b}\, \frac{1}{n}\sum_{i=1}^{n}\big(\widehat{\alpha}^{\top}X_i - b^{\top}Y_i\big)^2 + \lambda_{\beta}\|b\|_1,$
where each update is rescaled to satisfy its unit sample-variance constraint.
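A minimal implementation sketch of the alternating LASSO updates (5) and (6) is below. This is my own illustration: the initial value, penalty levels, and simulated data are assumptions, and the actual algorithm of Mai and Zhang (Citation2019) includes further refinements.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p, q = 1000, 20, 15

# Simulated data sharing one latent factor z through sparse loadings:
# X loads on coordinates {0, 1}, Y loads on coordinates {0, 2}.
z = rng.standard_normal(n)
X = rng.standard_normal((n, p)); X[:, 0] += 2 * z; X[:, 1] += 2 * z
Y = rng.standard_normal((n, q)); Y[:, 0] += 2 * z; Y[:, 2] += 2 * z
X -= X.mean(0); Y -= Y.mean(0)                 # centre the data

def unit_variance(M, w):
    """Rescale w so that the sample variance of M @ w equals one."""
    s = np.std(M @ w)
    return w / s if s > 0 else w

b = unit_variance(Y, np.ones(q))               # crude initial value (assumption)
for _ in range(10):
    # Update (5): LASSO regression of the current Y-variate onto X
    a = Lasso(alpha=0.05, fit_intercept=False).fit(X, Y @ b).coef_
    a = unit_variance(X, a)
    # Update (6): LASSO regression of the current X-variate onto Y
    b = Lasso(alpha=0.05, fit_intercept=False).fit(Y, X @ a).coef_
    b = unit_variance(Y, b)

rho = np.corrcoef(X @ a, Y @ b)[0, 1]
print(np.nonzero(a)[0], np.nonzero(b)[0], round(rho, 2))
```

Each iteration costs only two LASSO fits, which is the source of the speed and scalability discussed below.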
For sequential sparse CCA directions, $(\alpha_k, \beta_k)$ for $k = 2, \ldots, d$, we can use a similar iterative penalised least squares formulation after deflation of the data from the previously estimated directions. Theoretical results show that this approach can consistently estimate the population directions (for any fixed number of pairs d) with overwhelming probability in ultra-high dimensions. More importantly, due to its simplicity, the iterative penalised least squares approach can be extremely fast and scalable – even much faster than some existing convex formulations. This approach might be useful for sparse sufficient dimension reduction when the response is multivariate and even high-dimensional.
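For concreteness, one common deflation choice (an assumption for illustration, not necessarily the exact scheme of Mai and Zhang, Citation2019) is to project each data matrix onto the orthogonal complement of its estimated canonical variate before rerunning the leading-pair procedure:

```python
import numpy as np

def deflate(M, w):
    """Remove the estimated canonical variate u = M @ w from the data matrix M
    by projecting every column of M onto the orthogonal complement of u.
    This is one common deflation choice; other schemes exist.
    """
    u = M @ w
    return M - np.outer(u, u @ M) / (u @ u)

# Toy check: after deflation, the variate carries no remaining signal
rng = np.random.default_rng(3)
M = rng.standard_normal((100, 6))
w = rng.standard_normal(6)
Md = deflate(M, w)
print(np.abs((M @ w) @ Md).max())  # numerically zero
```

Applying `deflate` to both data matrices with the estimated $(\widehat{\alpha}_k, \widehat{\beta}_k)$ and repeating the alternating updates yields the next pair of directions.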
Disclosure statement
No potential conflict of interest was reported by the author(s).
Additional information
Notes on contributors
Xin Zhang
Xin Zhang is an Associate Professor in Statistics at Florida State University.
References
- Cook, R. D. (1998). Regression graphics: Ideas for studying regressions through graphics (Vol. 318). John Wiley & Sons.
- Li, K.-C., & Duan, N. (1989). Regression analysis under link violation. The Annals of Statistics, 17(3), 1009–1052. https://doi.org/10.1214/aos/1176347254
- Mai, Q., & Zhang, X. (2019). An iterative penalized least squares approach to sparse canonical correlation analysis. Biometrics, 75(3), 734–744. https://doi.org/10.1111/biom.13043
- Neykov, M., Lin, Q., & Liu, J. S. (2016). Signed support recovery for single index models in high-dimensions. Annals of Mathematical Sciences and Applications, 1(2), 379–426. https://doi.org/10.4310/AMSA.2016.v1.n2.a5
- Neykov, M., Liu, J. S., & Cai, T. (2016). L1-regularized least squares for support recovery of high dimensional single index models with gaussian designs. The Journal of Machine Learning Research, 17(1), 2976–3012.
- Ni, L., Cook, R. D., & Tsai, C.-L. (2005). A note on shrinkage sliced inverse regression. Biometrika, 92(1), 242–247. https://doi.org/10.1093/biomet/92.1.242
- Tan, K., Shi, L., & Yu, Z. (2020). Sparse SIR: Optimal rates and adaptive estimation. The Annals of Statistics, 48(1), 64–85. https://doi.org/10.1214/18-AOS1791
- Zhou, J., & He, X. (2008). Dimension reduction based on constrained canonical correlation and variable filtering. The Annals of Statistics, 36(4), 1649–1668. https://doi.org/10.1214/07-AOS529