COBALT: constraint-based alignment tool for multiple protein sequences

Cited by: 857
Authors
Papadopoulos, Jason S. [1]
Agarwala, Richa [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Dept Hlth & Human Serv, Bethesda, MD 20894 USA
DOI
10.1093/bioinformatics/btm076
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: A tool that simultaneously aligns multiple protein sequences, automatically utilizes information about protein domains, and has a good compromise between speed and accuracy will have practical advantages over current tools.
Results: We describe COBALT, a constraint-based alignment tool that implements a general framework for multiple alignment of protein sequences. COBALT finds a collection of pairwise constraints derived from database searches, sequence similarity and user input, combines these pairwise constraints, and then incorporates them into a progressive multiple alignment. We show that using constraints derived from the conserved domain database (CDD) and PROSITE protein-motif database improves COBALT's alignment quality. We also show that COBALT has reasonable runtime performance and alignment accuracy comparable to or exceeding that of other tools for a broad range of problems.
Pages: 1073-1079
Page count: 7