Theory and Methods

RANK: Large-Scale Inference With Graphical Nonlinear Knockoffs

Pages 362-379 | Received 16 Jul 2018, Accepted 31 Oct 2018, Published online: 11 Apr 2019


  • Abramovich, F., Benjamini, Y., Donoho, D. L., and Johnstone, I. M. (2006), “Adapting to Unknown Sparsity by Controlling the False Discovery Rate,” The Annals of Statistics, 34, 584–653. DOI: 10.1214/009053606000000074.
  • Barber, R. F., and Candès, E. J. (2015), “Controlling the False Discovery Rate via Knockoffs,” The Annals of Statistics, 43, 2055–2085. DOI: 10.1214/15-AOS1337.
  • Barber, R. F., and Candès, E. J. (2016), “A Knockoff Filter for High-Dimensional Selective Inference,” arXiv:1602.03574.
  • Benjamini, Y., and Hochberg, Y. (1995), “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing,” Journal of the Royal Statistical Society, Series B, 57, 289–300.
  • Benjamini, Y., and Yekutieli, D. (2001), “The Control of the False Discovery Rate in Multiple Testing Under Dependency,” The Annals of Statistics, 29, 1165–1188. DOI: 10.1214/aos/1013699998.
  • Bickel, P. J., Ritov, Y., and Tsybakov, A. B. (2009), “Simultaneous Analysis of Lasso and Dantzig Selector,” The Annals of Statistics, 37, 1705–1732. DOI: 10.1214/08-AOS620.
  • Bühlmann, P., and van de Geer, S. (2011), Statistics for High-Dimensional Data: Methods, Theory and Applications, Berlin: Springer.
  • Candès, E. J., Fan, Y., Janson, L., and Lv, J. (2018), “Panning for Gold: ‘Model-X’ Knockoffs for High Dimensional Controlled Variable Selection,” Journal of the Royal Statistical Society, Series B, 80, 551–577.
  • Chen, M., Ren, Z., Zhao, H., and Zhou, H. H. (2016), “Asymptotically Normal and Efficient Estimation of Covariate-Adjusted Gaussian Graphical Model,” Journal of the American Statistical Association, 111, 394–406. DOI: 10.1080/01621459.2015.1010039.
  • Chouldechova, A., and Hastie, T. (2015), “Generalized Additive Model Selection,” arXiv:1506.03850.
  • Clarke, S., and Hall, P. (2009), “Robustness of Multiple Testing Procedures Against Dependence,” The Annals of Statistics, 37, 332–358. DOI: 10.1214/07-AOS557.
  • Efron, B. (2007a), “Correlation and Large-Scale Simultaneous Significance Testing,” Journal of the American Statistical Association, 102, 93–103. DOI: 10.1198/016214506000001211.
  • Efron, B. (2007b), “Size, Power and False Discovery Rates,” The Annals of Statistics, 35, 1351–1377.
  • Efron, B., and Tibshirani, R. (2002), “Empirical Bayes Methods and False Discovery Rates for Microarrays,” Genetic Epidemiology, 23, 70–86.
  • Engle, R., Granger, C., Rice, J., and Weiss, A. (1986), “Semiparametric Estimates of the Relation Between Weather and Electricity Sales,” Journal of the American Statistical Association, 81, 310–320. DOI: 10.1080/01621459.1986.10478274.
  • Fan, J., and Fan, Y. (2008), “High-Dimensional Classification Using Features Annealed Independence Rules,” The Annals of Statistics, 36, 2605–2637. DOI: 10.1214/07-AOS504.
  • Fan, J., and Gijbels, I. (1996), Local Polynomial Modelling and Its Applications, London: Chapman & Hall/CRC.
  • Fan, J., Guo, S., and Hao, N. (2012), “Variance Estimation Using Refitted Cross-Validation in Ultrahigh Dimensional Regression,” Journal of the Royal Statistical Society, Series B, 74, 37–65.
  • Fan, J., Hall, P., and Yao, Q. (2007), “To How Many Simultaneous Hypothesis Tests Can Normal, Student’s t or Bootstrap Calibration be Applied?,” Journal of the American Statistical Association, 102, 1282–1288.
  • Fan, J., Han, X., and Gu, W. (2012), “Estimating False Discovery Proportion Under Arbitrary Covariance Dependence,” Journal of the American Statistical Association, 107, 1019–1035.
  • Fan, J., and Li, R. (2001), “Variable Selection via Nonconcave Penalized Likelihood and Its Oracle Properties,” Journal of the American Statistical Association, 96, 1348–1360.
  • Fan, J., and Lv, J. (2008), “Sure Independence Screening for Ultrahigh Dimensional Feature Space (with discussion),” Journal of the Royal Statistical Society, Series B, 70, 849–911.
  • Fan, J., and Lv, J. (2010), “A Selective Overview of Variable Selection in High Dimensional Feature Space (invited review article),” Statistica Sinica, 20, 101–148.
  • Fan, J., Samworth, R. J., and Wu, Y. (2009), “Ultrahigh Dimensional Variable Selection: Beyond the Linear Model,” Journal of Machine Learning Research, 10, 1829–1853.
  • Fan, Y., Demirkaya, E., and Lv, J. (2017), “Nonuniformity of p-values Can Occur Early in Diverging Dimensions,” arXiv:1705.03604.
  • Fan, Y., and Fan, J. (2011), “Testing and Detecting Jumps Based on a Discretely Observed Process,” Journal of Econometrics, 164, 331–344.
  • Fan, Y., Kong, Y., Li, D., and Zheng, Z. (2015), “Innovated Interaction Screening for High-Dimensional Nonlinear Classification,” The Annals of Statistics, 43, 1243–1272. DOI: 10.1214/14-AOS1308.
  • Fan, Y., and Lv, J. (2013), “Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space,” Journal of the American Statistical Association, 108, 1044–1061. DOI: 10.1080/01621459.2013.803972.
  • Fan, Y., and Lv, J. (2016), “Innovated Scalable Efficient Estimation in Ultra-Large Gaussian Graphical Models,” The Annals of Statistics, 44, 2098–2126.
  • Hall, P., and Wang, Q. (2010), “Strong Approximations of Level Exceedences Related to Multiple Hypothesis Testing,” Bernoulli, 16, 418–434.
  • Härdle, W., Liang, H., and Gao, J. T. (2000), Partially Linear Models, Heidelberg: Springer Physica-Verlag.
  • Härdle, W., and Stoker, T. M. (1989), “Investigating Smooth Multiple Regression by the Method of Average Derivatives,” Journal of the American Statistical Association, 84, 986–995. DOI: 10.1080/01621459.1989.10478863.
  • Hastie, T., and Tibshirani, R. (1990), Generalized Additive Models, London: Chapman & Hall/CRC.
  • Hastie, T., Tibshirani, R., and Friedman, J. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.), Berlin: Springer.
  • Horowitz, J. L. (2009), Semiparametric and Nonparametric Methods in Econometrics, Berlin: Springer.
  • Horvath, D. P., Schaffer, R., and Wisman, E. (2003), “Identification of Genes Induced in Emerging Tillers of Wild Oat (Avena fatua) Using Arabidopsis Microarrays,” Weed Science, 51, 503–508.
  • Huber, P. J. (1973), “Robust Regression: Asymptotics, Conjectures and Monte Carlo,” The Annals of Statistics, 1, 799–821. DOI: 10.1214/aos/1176342503.
  • Ichimura, H. (1993), “Semiparametric Least Squares (SLS) and Weighted SLS Estimation of Single-Index Models,” Journal of Econometrics, 58, 71–120.
  • Lauritzen, S. L. (1996), Graphical Models, Oxford: Oxford University Press.
  • Li, Q., and Racine, J. S. (2007), Nonparametric Econometrics: Theory and Practice, Princeton, NJ: Princeton University Press.
  • Lin, Q., Zhao, Z., and Liu, J. S. (2016), “Sparse Sliced Inverse Regression for High Dimensional Data,” arXiv:1611.06655.
  • Liu, W., and Shao, Q.-M. (2014), “Phase Transition and Regularized Bootstrap in Large-Scale t-tests With False Discovery Rate Control,” The Annals of Statistics, 42, 2003–2025.
  • Lv, J. (2013), “Impacts of High Dimensionality in Finite Samples,” The Annals of Statistics, 41, 2236–2262. DOI: 10.1214/13-AOS1149.
  • McCullagh, P., and Nelder, J. A. (1989), Generalized Linear Models, London: Chapman and Hall.
  • Meier, L., van de Geer, S., and Bühlmann, P. (2009), “High-Dimensional Additive Modeling,” The Annals of Statistics, 37, 3779–3821. DOI: 10.1214/09-AOS692.
  • Meng, L., Sun, F., Zhang, X., and Waterman, M. S. (2011), “Sequence Alignment as Hypothesis Testing,” Journal of Computational Biology, 18, 677–691. DOI: 10.1089/cmb.2010.0328.
  • Prelić, A., Bleuler, S., Zimmermann, P., Wille, A., Bühlmann, P., Gruissem, W., Hennig, L., Thiele, L., and Zitzler, E. (2006), “A Systematic Comparison and Evaluation of Biclustering Methods for Gene Expression Data,” Bioinformatics, 22, 1122–1129. DOI: 10.1093/bioinformatics/btl060.
  • Ramel, F., Sulmon, C., Bogard, M., Couée, I., and Gouesbet, G. (2009), “Differential Patterns of Reactive Oxygen Species and Antioxidative Mechanisms During Atrazine Injury and Sucrose-Induced Tolerance in Arabidopsis thaliana Plantlets,” BMC Plant Biology, 9, 1–18.
  • Ravikumar, P., Liu, H., Lafferty, J., and Wasserman, L. (2009), “Spam: Sparse Additive Models,” Journal of the Royal Statistical Society, Series B, 71, 1009–1030.
  • Ren, Z., Kang, Y., Fan, Y., and Lv, J. (2018), “Tuning-Free Heterogeneous Inference in Massive Networks,” Journal of the American Statistical Association. DOI: 10.1080/01621459.2018.1537920.
  • Schäfer, J., and Strimmer, K. (2005), “A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics,” Statistical Applications in Genetics and Molecular Biology, 4, 1544–1615.
  • Schmitt, B. A. (1992), “Perturbation Bounds for Matrix Square Roots and Pythagorean Sums,” Linear Algebra and Its Applications, 174, 215–227. DOI: 10.1016/0024-3795(92)90052-C.
  • Shah, R. D., and Samworth, R. J. (2013), “Variable Selection With Error Control: Another Look at Stability Selection,” Journal of the Royal Statistical Society, Series B, 75, 55–80.
  • Stoker, T. M. (1986), “Consistent Estimation of Scaled Coefficients,” Econometrica, 54, 1461–1481.
  • Storey, J. D. (2002), “A Direct Approach to False Discovery Rates,” Journal of the Royal Statistical Society, Series B, 64, 479–498.
  • Storey, J. D., Taylor, J. E., and Siegmund, D. (2004), “Strong Control, Conservative Point Estimation and Simultaneous Conservative Consistency of False Discovery Rates: A Unified Approach,” Journal of the Royal Statistical Society, Series B, 66, 187–205.
  • Su, W., and Candès, E. J. (2016), “SLOPE Is Adaptive to Unknown Sparsity and Asymptotically Minimax,” The Annals of Statistics, 44, 1038–1068. DOI: 10.1214/15-AOS1397.
  • Sur, P., Chen, Y., and Candès, E. J. (2017), “The Likelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square,” arXiv:1706.01191.
  • Tibshirani, R. (1996), “Regression Shrinkage and Selection via the Lasso,” Journal of the Royal Statistical Society, Series B, 58, 267–288.
  • Wienkoop, S., Glinski, M., Tanaka, N., Tolstikov, V., Fiehn, O., and Weckwerth, W. (2004), “Linking Protein Fractionation With Multidimensional Monolithic Reversed-Phase Peptide Chromatography/Mass Spectrometry Enhances Protein Identification From Complex Mixtures Even in the Presence of Abundant Proteins,” Rapid Communications in Mass Spectrometry, 18, 643–650.
  • Wille, A., Zimmermann, P., Vranová, E., Fürholz, A., Laule, O., Bleuler, S., Hennig, L., Prelić, A., von Rohr, P., Thiele, L., Zitzler, E., Gruissem, W., and Bühlmann, P. (2004), “Sparse Graphical Gaussian Modeling of the Isoprenoid Gene Network in Arabidopsis thaliana,” Genome Biology, 5, R92. DOI: 10.1186/gb-2004-5-11-r92.
  • Wu, W. B. (2008), “On False Discovery Control Under Dependence,” The Annals of Statistics, 36, 364–380. DOI: 10.1214/009053607000000730.
  • Yang, E., Lozano, A., and Ravikumar, P. (2014), “Elementary Estimators for High-Dimensional Linear Regression,” in Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp. 388–396.
  • Zhang, Y., and Liu, J. S. (2011), “Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies,” Journal of the American Statistical Association, 106, 846–857. DOI: 10.1198/jasa.2011.ap10657.
