An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization

Cited by: 1936
Authors
Dietterich, TG [1]
Affiliation
[1] Oregon State Univ, Dept Comp Sci, Corvallis, OR 97331 USA
Funding
National Science Foundation (USA);
Keywords
decision trees; ensemble learning; bagging; boosting; C4.5; Monte Carlo methods;
DOI
10.1023/A:1007607513941
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Bagging and boosting are methods that generate a diverse ensemble of classifiers by manipulating the training data given to a "base" learning algorithm. Breiman has pointed out that they rely for their effectiveness on the instability of the base learning algorithm. An alternative approach to generating an ensemble is to randomize the internal decisions made by the base algorithm. This general approach has been studied previously by Ali and Pazzani and by Dietterich and Kong. This paper compares the effectiveness of randomization, bagging, and boosting for improving the performance of the decision-tree algorithm C4.5. The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting. In situations with substantial classification noise, bagging is much better than boosting, and sometimes better than randomization.
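The two ways of generating a diverse ensemble that the abstract contrasts, manipulating the training data (bagging) and randomizing the learner's internal decisions, can be illustrated with a minimal sketch. It is not the paper's experimental setup: it assumes one-feature threshold classifiers in place of C4.5 trees, a toy data set, and a hypothetical `top_k` parameter as a stand-in for randomized split selection. All function names are illustrative.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement (the resampling step of bagging)."""
    return [rng.choice(data) for _ in data]

def train_stump(data, rng=None, top_k=1):
    """Fit a one-feature threshold classifier on (x, label) pairs with labels 0/1.

    With top_k=1 the lowest-error threshold is chosen deterministically.
    With top_k>1 one of the top_k best candidates is picked uniformly at
    random (rng is then required): a toy form of randomizing the learner's
    internal decision, assumed here for illustration only.
    """
    candidates = []
    for t in sorted({x for x, _ in data}):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in data)
            candidates.append((errors, t, left, right))
    candidates.sort()
    pick = candidates[0] if top_k == 1 else rng.choice(candidates[:top_k])
    _, t, left, right = pick
    return lambda x: left if x <= t else right

def majority_vote(models, x):
    """Combine the ensemble members by unweighted plurality vote."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

# Toy data (an assumption): label 0 for x < 5, label 1 otherwise.
toy = [(x, 0 if x < 5 else 1) for x in range(10)]
rng = random.Random(0)

# Bagging: the same deterministic learner, trained on different bootstrap samples.
bagged = [train_stump(bootstrap_sample(toy, rng)) for _ in range(25)]

# Randomization: the same training set, with a randomized learner.
randomized = [train_stump(toy, rng, top_k=3) for _ in range(25)]
```

In both cases diversity comes from a different source, but the members are combined the same way, by voting. Boosting is omitted from the sketch because it additionally reweights the training examples between rounds.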
Pages: 139-157
Page count: 19
Related papers
16 in total
[1] Ali K, 1995, 9547 U CAL DEP INF C
[2] Ali KM, 1996, MACH LEARN, V24, P173, DOI 10.1007/BF00058611
[3] [Anonymous], 1993, C4.5: Programs for empirical learning
[4] Bauer E, Kohavi R. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants [J]. Machine Learning, 1999, 36(1-2): 105-139
[5] Breiman L. Bagging predictors [J]. Machine Learning, 1996, 24(2): 123-140
[6] Breiman L, 1994, 416 U CAL DEP STAT
[7] Breiman L, 1996, Bias, variance, and arcing classifiers
[8] Dietterich TG. Approximate statistical tests for comparing supervised classification learning algorithms [J]. Neural Computation, 1998, 10(7): 1895-1923
[9] Dietterich TG, 1995, MACH LEARNING BIAS S
[10] Freund Y, 1996, Experiments with a new boosting algorithm. In Proceedings 13th Int Conf Mach Learn, pp. 148-156, P45