ProbCons: Probabilistic consistency-based multiple sequence alignment

Cited by: 786
Authors
Do, CB [1]
Mahabhashyam, MSP [1]
Brudno, M [1]
Batzoglou, S [1]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
DOI
10.1101/gr.2821705
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
To study gene evolution across a wide range of organisms, biologists need accurate tools for multiple sequence alignment of protein families. Obtaining accurate alignments, however, is a difficult computational problem because of not only the high computational cost but also the lack of proper objective functions for measuring alignment quality. In this paper, we introduce probabilistic consistency, a novel scoring function for multiple sequence comparisons. We present ProbCons, a practical tool for progressive protein multiple sequence alignment based on probabilistic consistency, and evaluate its performance on several standard alignment benchmark data sets. On the BAliBASE, SABmark, and PREFAB benchmark alignment databases, ProbCons achieves statistically significant improvement over other leading methods while maintaining practical speed. ProbCons is publicly available as a Web resource.
Pages: 330-340
Page count: 11