Abstract
Over the past decade, penalized likelihood methods have been widely used in variable selection problems, where the penalty functions are typically symmetric about 0, continuous, and nondecreasing on (0, ∞). We propose a new penalized likelihood method, reciprocal Lasso (rLasso for short), based on a new class of penalty functions that are decreasing on (0, ∞), discontinuous at 0, and diverge to infinity as the coefficients approach zero. The new penalty functions assign infinite penalties to nearly zero coefficients; in contrast, the conventional penalty functions assign nearly zero coefficients either nearly zero penalties (e.g., Lasso and smoothly clipped absolute deviation [SCAD]) or constant penalties (e.g., the L0 penalty). This distinguishing feature makes rLasso very attractive for variable selection: it can effectively avoid selecting overly dense models. We establish the consistency of rLasso for variable selection and coefficient estimation in both low- and high-dimensional settings. Since the rLasso penalty functions induce an objective function with multiple local minima, we also propose an efficient Monte Carlo optimization algorithm to solve the resulting minimization problem. Our simulation results show that rLasso outperforms other popular penalized likelihood methods, such as Lasso, SCAD, minimax concave penalty, sure independence screening, iterative sure independence screening, and extended Bayesian information criterion. It can produce sparser and more accurate coefficient estimates, and identify the true model with a higher probability. Supplementary materials for this article are available online.
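The contrast between the penalty shapes described in the abstract can be illustrated with a small numerical sketch. The reciprocal form λ/|β| for nonzero β (and 0 at β = 0) used below is an assumption chosen to match the stated properties (decreasing on (0, ∞), discontinuous at 0, diverging near zero); the function names and the default λ = 1 are hypothetical and are not taken from the article.

```python
def lasso_penalty(beta, lam=1.0):
    """Lasso (L1) penalty: nearly zero for nearly zero coefficients."""
    return lam * abs(beta)


def l0_penalty(beta, lam=1.0):
    """L0 penalty: constant for every nonzero coefficient."""
    return lam if beta != 0 else 0.0


def reciprocal_penalty(beta, lam=1.0):
    """Assumed reciprocal form: lam / |beta| for beta != 0, and 0 at beta == 0.

    Decreasing on (0, inf), discontinuous at 0, and unbounded as beta -> 0.
    """
    return lam / abs(beta) if beta != 0 else 0.0


if __name__ == "__main__":
    # Compare the three penalties from an exact zero up to a large coefficient.
    for b in (0.0, 0.001, 0.1, 1.0, 10.0):
        print(b, lasso_penalty(b), l0_penalty(b), reciprocal_penalty(b))
```

Under this assumed form, a coefficient of 0.001 receives a far larger penalty than a coefficient of 10, while an exact zero costs nothing, so small nonzero coefficients are discouraged from entering the model.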
Additional information
Notes on contributors
Qifan Song
Qifan Song is Assistant Professor, Department of Statistics, Purdue University, West Lafayette, IN 47906 (E-mail: [email protected]).
Faming Liang
Faming Liang is Professor, Department of Biostatistics, University of Florida, Gainesville, FL 32611 (E-mail: [email protected]). Liang’s research was partially supported by the National Science Foundation grants DMS-1106494 and DMS-1317131. The authors thank the editor, associate editor, and two referees for their constructive comments that have led to significant improvements of this article.