Boosting for tumor classification with gene expression data

被引:260
作者
Dettling, M [1 ]
Bühlmann, P [1 ]
机构
[1] ETH, Seminar Stat, CH-8092 Zurich, Switzerland
关键词
D O I
10.1093/bioinformatics/btf867
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Microarray experiments generate large datasets with expression values for thousands of genes but not more than a few dozens of samples. Accurate supervised classification of tissue samples in such high-dimensional problems is difficult but often crucial for successful diagnosis and treatment. A promising way to meet this challenge is by using boosting in conjunction with decision trees. Results: We demonstrate that the generic boosting algorithm needs some modification to become an accurate classifier in the context of gene expression data. In particular, we present a feature preselection method, a more robust boosting procedure and a new approach for multi-categorical problems. This allows for slight to drastic increase in performance and yields competitive results on several publicly available datasets.
引用
收藏
页码:1061 / 1069
页数:9
相关论文
共 25 条
[1]  
AGRAWALA AK, 1977, MACHINE RECOGNITION
[2]   Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling [J].
Alizadeh, AA ;
Eisen, MB ;
Davis, RE ;
Ma, C ;
Lossos, IS ;
Rosenwald, A ;
Boldrick, JG ;
Sabet, H ;
Tran, T ;
Yu, X ;
Powell, JI ;
Yang, LM ;
Marti, GE ;
Moore, T ;
Hudson, J ;
Lu, LS ;
Lewis, DB ;
Tibshirani, R ;
Sherlock, G ;
Chan, WC ;
Greiner, TC ;
Weisenburger, DD ;
Armitage, JO ;
Warnke, R ;
Levy, R ;
Wilson, W ;
Grever, MR ;
Byrd, JC ;
Botstein, D ;
Brown, PO ;
Staudt, LM .
NATURE, 2000, 403 (6769) :503-511
[3]  
Allwein E. L., 2000, J MACHINE LEARNING R, V1, P113, DOI DOI 10.1162/15324430152733133
[4]   Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays [J].
Alon, U ;
Barkai, N ;
Notterman, DA ;
Gish, K ;
Ybarra, S ;
Mack, D ;
Levine, AJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) :6745-6750
[5]   Tissue classification with gene expression profiles [J].
Ben-Dor, A ;
Bruhn, L ;
Friedman, N ;
Nachman, I ;
Schummer, M ;
Yakhini, Z .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2000, 7 (3-4) :559-583
[6]   Prediction games and arcing algorithms [J].
Breiman, L .
NEURAL COMPUTATION, 1999, 11 (07) :1493-1517
[7]  
Breiman L., 1984, BIOMETRICS, DOI DOI 10.2307/2530946
[8]   Comparison of discrimination methods for the classification of tumors using gene expression data [J].
Dudoit, S ;
Fridlyand, J ;
Speed, TP .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2002, 97 (457) :77-87
[9]  
Fix E., 1951, JOSEPH
[10]  
Freund Y., 1996, Machine Learning. Proceedings of the Thirteenth International Conference (ICML '96), P148