Arbitrary Norm Support Vector Machines

Cited by: 15
Authors
Huang, Kaizhu [1]
Zheng, Danian [2]
King, Irwin [1]
Lyu, Michael R. [1]
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong, Peoples R China
[2] Fujitsu Res & Dev Ctr, Informat Technol Lab, Beijing 100025, Peoples R China
Keywords
REGRESSION
DOI
10.1162/neco.2008.12-07-667
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Support vector machines (SVMs) are state-of-the-art classifiers. Typically the L2-norm or L1-norm is adopted as a regularization term in SVMs, while other norm-based SVMs, for example, the L0-norm SVM or even the L∞-norm SVM, are rarely seen in the literature. The major reason is that the L0-norm describes a discontinuous and nonconvex term, leading to a combinatorially NP-hard optimization problem. In this letter, motivated by Bayesian learning, we propose a novel framework that can implement arbitrary norm-based SVMs in polynomial time. One significant feature of this framework is that only a sequence of sequential minimal optimization problems needs to be solved, thus making it practical in many real applications. The proposed framework is important in the sense that Bayesian priors can be efficiently plugged into most learning methods without knowing the explicit form. Hence, this builds a connection between Bayesian learning and the kernel machines. We derive the theoretical framework, demonstrate how our approach works on the L0-norm SVM as a typical example, and perform a series of experiments to validate its advantages. Experimental results on nine benchmark data sets are very encouraging. The implemented L0-norm SVM is competitive with or even better than the standard L2-norm SVM in terms of accuracy but with a reduced number of support vectors: 9.46% of the number on average. When compared with another sparse model, the relevance vector machine, our proposed algorithm also demonstrates better sparse properties with a training speed over seven times faster.
Pages: 560-582
Number of pages: 23