Journal of Computational and Graphical Statistics Volume 29, 2020 - Issue 2

Scalable and Efficient Computation

Compressed and Penalized Linear Regression

Darren Homrighausen, Department of Statistics, Colorado State University, Fort Collins, CO

Daniel J. McDonald, Department of Statistics, Indiana University, Bloomington, IN (corresponding author)

Pages 309-322 | Received 23 May 2018, Accepted 14 Aug 2019, Published online: 30 Sep 2019

https://doi.org/10.1080/10618600.2019.1660179

References

Achlioptas, D. (2003), “Database-Friendly Random Projections: Johnson-Lindenstrauss With Binary Coins,” Journal of Computer and System Sciences, 66, 671–687. DOI: 10.1016/S0022-0000(03)00025-4.
Ailon, N., and Chazelle, B. (2006), “Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform,” in Proceedings of the Thirty-Eighth Annual ACM Symposium on Theory of Computing, ACM, pp. 557–563.
Avron, H., Maymounkov, P., and Toledo, S. (2010), “Blendenpik: Supercharging LAPACK’s Least-Squares Solver,” SIAM Journal on Scientific Computing, 32, 1217–1236. DOI: 10.1137/090767911.
Bair, E., Hastie, T., Paul, D., and Tibshirani, R. (2006), “Prediction by Supervised Principal Components,” Journal of the American Statistical Association, 101, 119–137. DOI: 10.1198/016214505000000628.
Becker, S., Kawas, B., Petrik, M., and Ramamurthy, K. N. (2017), “Robust Partially-Compressed Least-Squares,” in The Thirty-First AAAI Conference on Artificial Intelligence.
Cloonan, N., Forrest, A. R. R., Kolle, G., Gardiner, B. B. A., Faulkner, G. J., Brown, M. K., Taylor, D. F., Steptoe, A. L., Wani, S., Bethel, G., Robertson, A. J., Perkins, A. C., Bruce, S. J., Lee, C. C., Ranade, S. S., Peckham, H. E., Manning, J. M., McKernan, K. J., and Grimmond, S. M. (2008), “Stem Cell Transcriptome Profiling via Massive-Scale mRNA Sequencing,” Nature Methods, 5, 613–619. DOI: 10.1038/nmeth.1223.
Collins, D. L., Zijdenbos, A. P., Kollokian, V., Sled, J. G., Kabani, N. J., Holmes, C. J., and Evans, A. C. (1998), “Design and Construction of a Realistic Digital Brain Phantom,” IEEE Transactions on Medical Imaging, 17, 463–468. DOI: 10.1109/42.712135.
Coupe, P., Yger, P., Prima, S., Hellier, P., Kervrann, C., and Barillot, C. (2008), “An Optimized Blockwise Nonlocal Means Denoising Filter for 3-D Magnetic Resonance Images,” IEEE Transactions on Medical Imaging, 27, 425–441. DOI: 10.1109/TMI.2007.906087.
Dalpiaz, D., He, X., and Ma, P. (2013), “Bias Correction in RNA-Seq Short-Read Counts Using Penalized Regression,” Statistics in Biosciences, 5, 88–99. DOI: 10.1007/s12561-012-9057-6.
Dasgupta, A., Kumar, R., and Sarlós, T. (2010), “A Sparse Johnson-Lindenstrauss Transform,” in Proceedings of the 42nd ACM Symposium on Theory of Computing, ACM, pp. 341–350.
Ding, L., and McDonald, D. J. (2017), “Predicting Phenotypes From Microarrays Using Amplified, Initially Marginal, Eigenvector Regression,” Bioinformatics, 33, i350–i358. DOI: 10.1093/bioinformatics/btx265.
Drineas, P., Magdon-Ismail, M., Mahoney, M. W., and Woodruff, D. P. (2012), “Fast Approximation of Matrix Coherence and Statistical Leverage,” Journal of Machine Learning Research, 13, 3475–3506.
Drineas, P., Mahoney, M. W., Muthukrishnan, S., and Sarlós, T. (2011), “Faster Least Squares Approximation,” Numerische Mathematik, 117, 219–249. DOI: 10.1007/s00211-010-0331-6.
Efron, B. (1986), “How Biased Is the Apparent Error Rate of a Prediction Rule?,” Journal of the American Statistical Association, 81, 461–470. DOI: 10.1080/01621459.1986.10478291.
Frey, R. A., Ackerman, S., and Soden, B. J. (1996), “Climate Parameters From Satellite Spectral Measurements. Part 1: Collocated AVHRR and HIRS/2 Observations of Spectral Greenhouse Parameter,” Journal of Climate, 9, 327–344. DOI: 10.1175/1520-0442(1996)009<0327:CPFSSM>2.0.CO;2.
Friedman, J., Hastie, T., and Tibshirani, R. (2010), “Regularization Paths for Generalized Linear Models via Coordinate Descent,” Journal of Statistical Software, 33, 1–22. DOI: 10.18637/jss.v033.i01.
Gittens, A., and Mahoney, M. (2013), “Revisiting the Nystrom Method for Improved Large-Scale Machine Learning,” in Proceedings of the 30th International Conference on Machine Learning (ICML-13), JMLR Workshop and Conference Proceedings (Vol. 28), eds. S. Dasgupta and D. McAllester, pp. 567–575.
Golub, G. H., Heath, M., and Wahba, G. (1979), “Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter,” Technometrics, 21, 215–223. DOI: 10.1080/00401706.1979.10489751.
Golub, G. H., and Van Loan, C. F. (2012), Matrix Computations (Vol. 3), Baltimore, MD: JHU Press.
Halko, N., Martinsson, P.-G., and Tropp, J. A. (2011), “Finding Structure With Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions,” SIAM Review, 53, 217–288. DOI: 10.1137/090771806.
Hoerl, A. E., and Kennard, R. W. (1970), “Ridge Regression: Biased Estimation for Nonorthogonal Problems,” Technometrics, 12, 55–67. DOI: 10.1080/00401706.1970.10488634.
Homrighausen, D., and McDonald, D. J. (2016), “On the Nyström and Column-Sampling Methods for the Approximate Principal Components Analysis of Large Data Sets,” Journal of Computational and Graphical Statistics, 25, 344–362. DOI: 10.1080/10618600.2014.995799.
Ingrassia, S., and Morlini, I. (2007), Equivalent Number of Degrees of Freedom for Neural Networks, Advances in Data Analysis, Berlin, Heidelberg: Springer Berlin Heidelberg.
Kane, D. M., and Nelson, J. (2014), “Sparser Johnson-Lindenstrauss Transforms,” Journal of the ACM, 61, Article 4. DOI: 10.1145/2559902.
Lang, M., Bischl, B., and Surmann, D. (2017), “batchtools: Tools for R to Work on Batch Systems,” The Journal of Open Source Software, 2, 135. DOI: 10.21105/joss.00135.
Li, J., Jiang, H., and Wong, W. H. (2010), “Modeling Non-uniformity in Short-Read Rates in RNA-Seq Data,” Genome Biology, 11, 1–11. DOI: 10.1186/gb-2010-11-5-r50.
Ma, P., Mahoney, M. W., and Yu, B. (2015), “A Statistical Perspective on Algorithmic Leveraging,” The Journal of Machine Learning Research, 16, 861–911.
Mallows, C. L. (1973), “Some Comments on Cp,” Technometrics, 15, 661–675. DOI: 10.1080/00401706.1973.10489103.
Meng, X., Saunders, M. A., and Mahoney, M. W. (2014), “LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems,” SIAM Journal on Scientific Computing, 36, C95–C118. DOI: 10.1137/120866580.
Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L., and Wold, B. (2008), “Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq,” Nature Methods, 5, 621–628. DOI: 10.1038/nmeth.1226.
Paul, D., Bair, E., Hastie, T., and Tibshirani, R. (2008), “Preconditioning for Feature Selection and Regression in High-Dimensional Problems,” The Annals of Statistics, 36, 1595–1618. DOI: 10.1214/009053607000000578.
Pilanci, M., and Wainwright, M. J. (2016), “Iterative Hessian Sketch: Fast and Accurate Solution Approximation for Constrained Least-Squares,” Journal of Machine Learning Research, 17, 1842–1879.
R Core Team (2019), R: A Language and Environment for Statistical Computing, Vienna, Austria: R Foundation for Statistical Computing, available at https://www.R-project.org/.
Raskutti, G., and Mahoney, M. (2015), “Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares,” in Proceedings of the 32nd International Conference on Machine Learning (ICML), PMLR, Lille, France (Vol. 37), eds. F. Bach and D. Blei, pp. 617–625.
Rokhlin, V., and Tygert, M. (2008), “A Fast Randomized Algorithm for Overdetermined Linear Least-Squares Regression,” Proceedings of the National Academy of Sciences of the United States of America, 105, 13212–13217. DOI: 10.1073/pnas.0804869105.
Rudelson, M., and Vershynin, R. (2010), “Non-Asymptotic Theory of Random Matrices: Extreme Singular Values,” in Proceedings of the International Congress of Mathematicians 2010 (ICM 2010), eds. R. Bhatia, A. Pal, G. Rangarajan, V. Srinivas, and M. Vanninathan, pp. 1576–1602.
Saint-Marc, P., Chen, J.-S., and Medioni, G. (1989), “Adaptive Smoothing: A General Tool for Early Vision,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, pp. 618–624.
Sapiro, G. (1996), “From Active Contours to Anisotropic Diffusion: Connections Between Basic PDE’s in Image Processing,” in Proceedings of the International Conference on Image Processing (Vol. 1), IEEE, pp. 477–480.
Simon, N., Friedman, J., Hastie, T., and Tibshirani, R. (2011), “Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent,” Journal of Statistical Software, 39, 1–13. DOI: 10.18637/jss.v039.i05.
Staten, P. W., Kahn, B. H., Schreier, M. M., and Heidinger, A. K. (2016), “Subpixel Characterization of HIRS Spectral Radiances Using Cloud Properties From AVHRR,” Journal of Atmospheric and Oceanic Technology, 33, 1519–1538. DOI: 10.1175/JTECH-D-15-0187.1.
Stein, C. M. (1981), “Estimation of the Mean of a Multivariate Normal Distribution,” The Annals of Statistics, 9, 1135–1151. DOI: 10.1214/aos/1176345632.
Wang, E. T., Sandberg, R., Luo, S., Khrebtukova, I., Zhang, L., Mayr, C., Kingsmore, S. F., Schroth, G. P., and Burge, C. B. (2008), “Alternative Isoform Regulation in Human Tissue Transcriptomes,” Nature, 456, 470–476. DOI: 10.1038/nature07509.
Wang, J., Lee, J., Mahdavi, M., Kolar, M., and Srebro, N. (2017), “Sketching Meets Random Projection in the Dual: A Provable Recovery Algorithm for Big and High-Dimensional Data,” in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Proceedings of Machine Learning Research, PMLR, Fort Lauderdale, FL, USA (Vol. 54), eds. A. Singh and J. Zhu, pp. 1150–1158.
Wickham, H. (2017), “tidyverse: Easily Install and Load the ‘tidyverse’,” R Package Version 1.2.1, available at https://www.tidyverse.org.
Woodruff, D. P. (2014), “Sketching as a Tool for Numerical Linear Algebra,” Foundations and Trends® in Theoretical Computer Science, 10, 1–157. DOI: 10.1561/0400000060.
Xie, Y. (2015), Dynamic Documents With R and knitr (2nd ed.), Boca Raton, FL: Chapman and Hall/CRC.
Xie, Y. (2019), “knitr: A General-Purpose Package for Dynamic Report Generation in R,” R Package Version 1.22, available at https://yihui.name/knitr/.
Xie, Y., Allaire, J., and Grolemund, G. (2018), R Markdown: The Definitive Guide, Boca Raton, FL: Chapman and Hall/CRC.
Zhang, L., Mahdavi, M., Jin, R., Yang, T., and Zhu, S. (2013), “Recovering the Optimal Solution by Dual Random Projection,” in Proceedings of the 26th Annual Conference on Learning Theory, Proceedings of Machine Learning Research, PMLR (Vol. 30), eds. S. Shalev-Shwartz and I. Steinwart, pp. 135–157.
Zhou, S., Lafferty, J., and Wasserman, L. (2009), “Compressed and Privacy-Sensitive Sparse Regression,” IEEE Transactions on Information Theory, 55, 846–866. DOI: 10.1109/TIT.2008.2009605.
