Piecewise linear regularized solution paths

Cited by: 305
Authors
Rosset, Saharon [1]
Zhu, Ji [2]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Predict Modeling Grp, Yorktown Hts, NY 10598 USA
[2] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
Keywords
$\ell_1$-norm penalty; polynomial splines; regularization; solution paths; sparsity; total variation
DOI
10.1214/009053606000001370
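Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract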
We consider the generic regularized optimization problem $\hat{\beta}(\lambda) = \arg\min_{\beta} L(y, X\beta) + \lambda J(\beta)$. Efron, Hastie, Johnstone and Tibshirani [Ann. Statist. 32 (2004) 407-499] have shown that for the LASSO (that is, if $L$ is squared error loss and $J(\beta) = \|\beta\|_1$ is the $\ell_1$ norm of $\beta$) the optimal coefficient path is piecewise linear, that is, $\partial \hat{\beta}(\lambda) / \partial \lambda$ is piecewise constant. We derive a general characterization of the properties of (loss $L$, penalty $J$) pairs which give piecewise linear coefficient paths. Such pairs allow for efficient generation of the full regularized coefficient paths. We investigate the nature of efficient path following algorithms which arise. We use our results to suggest robust versions of the LASSO for regression and classification, and to develop new, efficient algorithms for existing problems in the literature, including Mammen and van de Geer's locally adaptive regression splines.
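As a small illustration of the piecewise linearity described in the abstract, the sketch below uses the special case of an orthonormal design, where the LASSO solution has a closed form (soft-thresholding). This is not the paper's path-following algorithm. It assumes the objective is scaled as $\frac{1}{2}\|y - X\beta\|^2 + \lambda\|\beta\|_1$ with $X^\top X = I$, and the function names and the vector `z` (standing in for $X^\top y$) are hypothetical.

```python
import numpy as np

def lasso_path_orthonormal(z, lambdas):
    """Closed-form LASSO solutions for an orthonormal design.

    Assumes the objective 0.5 * ||y - X b||^2 + lam * ||b||_1 with X.T @ X = I,
    where z = X.T @ y. Each coefficient is the soft-thresholded value of z.
    """
    z = np.asarray(z, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    # Rows index lambda values, columns index coefficients.
    return np.sign(z) * np.maximum(np.abs(z) - lambdas[:, None], 0.0)

def path_slopes(z, lam):
    """Derivative of each coefficient with respect to lambda at a non-knot lam."""
    z = np.asarray(z, dtype=float)
    # Active coefficients shrink toward zero at unit rate; inactive ones stay at zero.
    return np.where(np.abs(z) > lam, -np.sign(z), 0.0)

z = np.array([3.0, -1.5, 0.5])        # hypothetical values of X.T @ y
knots = np.sort(np.abs(z))[::-1]      # lambda values at which a slope changes
lambdas = np.linspace(0.0, 3.5, 8)    # grid 0.0, 0.5, ..., 3.5
path = lasso_path_orthonormal(z, lambdas)
```

In this special case the slopes only change at the knots $|z_j|$, so the whole path is determined by the solutions at those few values of $\lambda$, which is the property that makes full-path computation cheap.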
Pages: 1012-1030
Number of pages: 19
Related papers
21 in total
[1]  
Cherkassky V, 1997, IEEE Trans Neural Netw, V8, P1564, DOI 10.1109/TNN.1997.641482
[2]   Local extremes, runs, strings and multiresolution - Rejoinder [J].
Davies, PL ;
Kovac, A .
ANNALS OF STATISTICS, 2001, 29 (01) :61-65
[3]  
DONOHO DL, 1995, J ROY STAT SOC B MET, V57, P301
[4]   Least angle regression - Rejoinder [J].
Efron, B ;
Hastie, T ;
Johnstone, I ;
Tibshirani, R .
ANNALS OF STATISTICS, 2004, 32 (02) :494-499
[5]   Nonconcave penalized likelihood with a diverging number of parameters [J].
Fan, JQ ;
Peng, H .
ANNALS OF STATISTICS, 2004, 32 (03) :928-961
[6]   Variable selection via nonconcave penalized likelihood and its oracle properties [J].
Fan, JQ ;
Li, RZ .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) :1348-1360
[7]  
Freund Y, 1996, ICML
[8]  
Hastie T, 2004, J MACH LEARN RES, V5, P1391
[9]  
Hastie T., 2009, The Elements of Statistical Learning, P9
[10]   RIDGE REGRESSION - BIASED ESTIMATION FOR NONORTHOGONAL PROBLEMS [J].
HOERL, AE ;
KENNARD, RW .
TECHNOMETRICS, 1970, 12 (01) :55-&