Theory and Methods

PUlasso: High-Dimensional Variable Selection With Presence-Only Data

Pages 334-347 | Received 22 Nov 2017, Accepted 29 Oct 2018, Published online: 11 Apr 2019

