PairWise and SearchWise: Finding the optimal alignment in a simultaneous comparison of a protein profile against all DNA translation frames

Cited by: 128
Authors
Birney, E [1]
Thompson, JD [1]
Gibson, TJ [1]
Affiliations
[1] UNIV OXFORD BALLIOL COLL, OXFORD OX1 3BJ, ENGLAND
DOI
10.1093/nar/24.14.2730
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
DNA translation frames can be disrupted for several reasons, including: (i) errors in sequence determination; (ii) RNA processing, such as intron removal and guide RNA editing; (iii) less commonly, polymerase frameshifting during transcription or ribosomal frameshifting during translation. Frameshifts frequently confound computational activities involving homologous sequences, such as database searches and inferences on structure, function or phylogeny made from multiple alignments. A dynamic alignment algorithm is reported here which compares a protein profile (a residue scoring matrix for one or more aligned sequences) against the three translation frames of a DNA strand, allowing frameshifting. The algorithm has been incorporated into a new package, WiseTools, for comparison of biological sequences. A protein profile can be compared against either a DNA sequence or a protein sequence. The program PairWise may be used interactively for alignment of any two sequence inputs. SearchWise can perform combinations of searches through DNA or protein databases by a protein profile or DNA sequence. Routine application of the programs has revealed a set of database entries with frameshifts caused by errors in sequence determination.
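The idea of aligning a protein profile against all three translation frames at once, with frameshifts allowed, can be illustrated with a small dynamic programming sketch. This is not the WiseTools implementation: the function names (`profile_from_sequence`, `align_score`), the local-alignment formulation, and all scores and penalties below are assumptions chosen for illustration. The DNA is indexed by nucleotide rather than by codon, so a match consumes three bases, while a frameshift move skips one or two bases at a fixed penalty.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def profile_from_sequence(protein, match=5):
    """Hypothetical helper: build a one-sequence profile as a list of
    {residue: score} columns. Residues absent from a column receive the
    mismatch score inside align_score."""
    return [{aa: match} for aa in protein]


def align_score(profile, dna, gap=6, frameshift=3, mismatch=-4):
    """Best local alignment score of a profile against one DNA strand.

    S[i][j] is the best score of an alignment that ends after profile
    column i and after nucleotide j. Because an alignment may start at
    any j, all three reading frames are searched simultaneously.
    Penalties are illustrative assumptions, not values from the paper.
    """
    dna = dna.upper()
    m, n = len(profile), len(dna)
    S = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(m + 1):
        for j in range(n + 1):
            cands = [0]  # local alignment: restart anywhere
            if i >= 1 and j >= 3:
                # Translate the codon ending at j; unknown codons become 'X'.
                aa = CODON_TABLE.get(dna[j - 3:j], "X")
                cands.append(S[i - 1][j - 3] + profile[i - 1].get(aa, mismatch))
            if i >= 1:
                cands.append(S[i - 1][j] - gap)  # skip a profile column
            if j >= 3:
                cands.append(S[i][j - 3] - gap)  # skip a whole codon
            if j >= 1:
                cands.append(S[i][j - 1] - frameshift)  # skip 1 base
            if j >= 2:
                cands.append(S[i][j - 2] - frameshift)  # skip 2 bases
            S[i][j] = max(cands)
            best = max(best, S[i][j])
    return best


profile = profile_from_sequence("MKV")
print(align_score(profile, "ATGAAAGTT"))   # in-frame M-K-V: prints 15
print(align_score(profile, "ATGAAACGTT"))  # one inserted base: prints 12
```

In the second call, a single inserted `C` moves the valine codon into another frame. The frameshift move lets the alignment cross that base at a cost of 3, so all three residues still match; setting `frameshift` to a prohibitively large value reduces the best score to the two in-frame matches.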
Pages: 2730-2739
Page count: 10