Theory and Methods

PUlasso: High-Dimensional Variable Selection With Presence-Only Data

Pages 334-347 | Received 22 Nov 2017, Accepted 29 Oct 2018, Published online: 11 Apr 2019

