Support vector machines with adaptive Lq penalty

Cited by: 71
Authors
Liu, Yufeng [1 ]
Zhang, Hao Helen
Park, Cheolwoo
Ahn, Jeongyoun
Affiliations
[1] Univ N Carolina, Dept Stat & Operat Res, Carolina Ctr Genome Sci, Chapel Hill, NC 27515 USA
[2] N Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[3] Univ Georgia, Dept Stat, Athens, GA 30602 USA
Funding
US National Science Foundation;
Keywords
adaptive penalty; classification; shrinkage; support vector machine; variable selection;
DOI
10.1016/j.csda.2007.02.006
CLC classification
TP39 [Computer applications];
Discipline codes
081203 ; 0835 ;
Abstract
The standard support vector machine (SVM) minimizes the hinge loss function subject to the L-2 penalty or the roughness penalty. Recently, the L-1 SVM was suggested for variable selection because it produces sparse solutions [Bradley, P., Mangasarian, O., 1998. Feature selection via concave minimization and support vector machines. In: Shavlik, J. (Ed.), ICML'98. Morgan Kaufmann, Los Altos, CA; Zhu, J., Hastie, T., Rosset, S., Tibshirani, R., 2003. 1-norm support vector machines. Neural Inform. Process. Systems 16]. These learning methods are non-adaptive since their penalty forms are pre-determined before looking at the data, and each tends to perform well only in certain situations. For instance, the L-2 SVM generally works well except when there are too many noise inputs, while the L-1 SVM is preferred in the presence of many noise variables. In this article we propose and explore an adaptive learning procedure called the L-q SVM, where the best q > 0 is automatically chosen by the data. Both two- and multi-class classification problems are considered. We show that the new adaptive approach combines the benefits of a class of non-adaptive procedures and achieves the best performance of this class across a variety of situations. Moreover, we observe that the proposed L-q penalty is more robust to noise variables than the L-1 and L-2 penalties. An iterative algorithm is suggested to solve the L-q SVM efficiently. Simulations and real data applications support the effectiveness of the proposed procedure. (C) 2007 Elsevier B.V. All rights reserved.
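To make the idea concrete, the following is a minimal sketch, not the authors' iterative algorithm: a linear SVM fit by subgradient descent on mean hinge loss plus a smoothed Lq penalty lam * sum_j |w_j|^q, with q picked by validation error as a stand-in for the paper's data-driven choice. All names (`lq_svm_fit`, `lq_svm_select_q`) and the solver itself are hypothetical illustrations.

```python
import numpy as np

def lq_svm_fit(X, y, q=1.0, lam=0.1, lr=0.01, n_iter=2000, eps=1e-8):
    """Linear SVM with an Lq penalty, fit by plain subgradient descent.

    Objective (illustrative, not the paper's solver):
        mean(max(0, 1 - y_i (x_i . w + b))) + lam * sum_j |w_j|^q
    The |w|^q term is smoothed by eps near zero so q < 1 is usable.
    Labels y must be in {-1, +1}.
    """
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(n_iter):
        margins = y * (X @ w + b)
        active = margins < 1                        # margin violators
        grad_w = -(y[active] @ X[active]) / n       # hinge subgradient
        grad_b = -y[active].sum() / n
        # subgradient of lam * |w|^q (zero at w_j = 0 via sign(0) = 0)
        grad_w += lam * q * np.sign(w) * (np.abs(w) + eps) ** (q - 1)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def lq_svm_select_q(X, y, Xv, yv, qs=(0.5, 1.0, 1.5, 2.0), lam=0.1):
    """Choose q from a grid by validation misclassification rate."""
    best = None
    for q in qs:
        w, b = lq_svm_fit(X, y, q=q, lam=lam)
        err = np.mean(np.sign(Xv @ w + b) != yv)
        if best is None or err < best[0]:
            best = (err, q, w, b)
    return best[1], best[2], best[3]
```

Choosing q on held-out data is what makes the penalty adaptive: with many noise inputs a small q (sparser) wins, while with mostly informative inputs a larger q (closer to ridge) wins, matching the trade-off the abstract describes.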
Pages: 6380-6394
Page count: 15
Cited references
23 items
[1] [Anonymous], 1999, SUPPORT VECTOR MACHI
[2] [Anonymous], ICML '98
[3] Antoniadis, A., Fan, J. Regularization of wavelet approximations - Rejoinder [J]. Journal of the American Statistical Association, 2001, 96(455):964-967.
[4] Boser, B., 1992, 5 ANN C COMP LEARN T, p. 142
[5] Chen, S.S.B., Donoho, D.L., Saunders, M.A. Atomic decomposition by basis pursuit [J]. SIAM Journal on Scientific Computing, 1998, 20(1):33-61.
[6] Crammer, K., 2001, Journal of Machine Learning Research, Vol. 2, p. 265
[7] Donoho, D.L., Johnstone, I.M. Ideal spatial adaptation by wavelet shrinkage [J]. Biometrika, 1994, 81(3):425-455.
[8] Efron, B., Hastie, T., Johnstone, I., Tibshirani, R. Least angle regression - Rejoinder [J]. Annals of Statistics, 2004, 32(2):494-499.
[9] Fan, J.Q., Li, R.Z. Variable selection via nonconcave penalized likelihood and its oracle properties [J]. Journal of the American Statistical Association, 2001, 96(456):1348-1360.
[10] Frank, I.E., Friedman, J.H. A statistical view of some chemometrics regression tools [J]. Technometrics, 1993, 35(2):109-135.