FINDING ERRORS IN DNA-SEQUENCES

Cited by: 33
Authors
POSFAI, J
ROBERTS, RJ
Affiliations
[1] COLD SPRING HARBOR LAB,POB 100,COLD SPRING HARBOR,NY 11724
[2] HUNGARIAN ACAD SCI,BIOL RES CTR,INST BIOPHYS,H-6701 SZEGED,HUNGARY
Keywords
READING FRAMES; FRAMESHIFTS
DOI
10.1073/pnas.89.10.4698
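Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes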
07 ; 0710 ; 09 ;
Abstract
An algorithm is described that can detect certain errors within coding regions of DNA sequences. The algorithm is based on the idea that an insertion or deletion error within a coding sequence would interrupt the reading frame and cause the correct translation of a DNA sequence to require one or more frameshifts. If the coding sequence shows similarity to a known protein sequence then such errors can be detected by comparing the conceptual translations of DNA sequences in all six reading frames with every sequence in a protein sequence data base. We have incorporated these ideas into a computer program, called DETECT, that can serve as an aid to the experimentalist who is determining new DNA sequences so that obvious errors may be located and corrected. The program has been tested using raw experimental data and against sequences from the European Molecular Biology Laboratory data base, annotated as containing frameshifts. We have also tested it using unidentified open reading frames that flank known, annotated genes in the GenBank data base. Many potential errors are apparent and in some cases functions can be suggested for the "corrected" versions of these reading frames leading to the identification of new genes. As more sequences are determined the power of this method will increase substantially.
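The core idea in the abstract, translating a DNA sequence in all six reading frames and checking where a known protein matches, can be illustrated with a minimal sketch. This is not the DETECT program: the function names (`six_frame_translations`, `frames_containing`) and the exact-substring matching are assumptions made for illustration, whereas the paper describes similarity comparison against a whole protein database. The sketch assumes the standard genetic code and marks codons containing non-ACGT characters as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels +1, +2, +3, -1, -2, -3 to conceptual translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


def frames_containing(dna, peptide):
    """Hypothetical helper: list the frames whose translation contains peptide."""
    return sorted(
        frame
        for frame, protein in six_frame_translations(dna).items()
        if peptide in protein
    )


# Toy example: a 30-base coding sequence and a copy with one base deleted.
dna = "ATGAAAACCGCTTATATTGCGAAACAGCGT"  # frame +1 translates to MKTAYIAKQR
mutant = dna[:12] + dna[13:]            # single-base deletion at index 12

print(frames_containing(mutant, "MKTA"))   # N-terminal part of the protein
print(frames_containing(mutant, "IAKQR"))  # C-terminal part of the protein
```

In this toy case the N-terminal peptide is found in frame +1 of the mutant while the C-terminal peptide is found in frame +3. A known protein whose two halves match different reading frames of the same strand is the signature of an insertion or deletion error that the abstract describes.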
Pages: 4698-4702
Page count: 5
Related papers
31 in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]  
LAUTENBERGER JA, 1982, NUCLEIC ACIDS RES, V10, P27
[3]   TN917 TRANSPOSASE - SEQUENCE CORRECTION REVEALS A SINGLE OPEN READING FRAME CORRESPONDING TO THE TNPA DETERMINANT OF TN3-FAMILY ELEMENTS [J].
AN, FY ;
CLEWELL, DB .
PLASMID, 1991, 25 (02) :121-124
[4]   RIBOSOME GYMNASTICS - DEGREE OF DIFFICULTY 9.5, STYLE 10.0 [J].
ATKINS, JF ;
WEISS, RB ;
GESTELAND, RF .
CELL, 1990, 62 (03) :413-423
[5]  
BARRY EM, 1991, J BACTERIOL, V173, P1720
[6]   ORGANIZATION OF MULTISPECIFIC DNA METHYLTRANSFERASES ENCODED BY TEMPERATE BACILLUS-SUBTILIS PHAGES [J].
BEHRENS, B ;
NOYERWEIDNER, M ;
PAWLEK, B ;
LAUSTER, R ;
BALGANESH, TS ;
TRAUTNER, TA .
EMBO JOURNAL, 1987, 6 (04) :1137-1142
[7]   BACTERIAL PEPTIDE-CHAIN RELEASE FACTORS - CONSERVED PRIMARY STRUCTURE AND POSSIBLE FRAMESHIFT REGULATION OF RELEASE FACTOR-II [J].
CRAIGEN, WJ ;
COOK, RG ;
TATE, WP ;
CASKEY, CT .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1985, 82 (11) :3616-3620
[8]  
DAYHOFF MO, 1983, METHOD ENZYMOL, V91, P524
[9]   RECOGNITION OF PROTEIN CODING REGIONS IN DNA-SEQUENCES [J].
FICKETT, JW .
NUCLEIC ACIDS RESEARCH, 1982, 10 (17) :5303-5318
[10]   COMPUTER-PROGRAMS FOR THE ASSEMBLY OF DNA SEQUENCES [J].
GINGERAS, TR ;
MILAZZO, JP ;
SCIAKY, D ;
ROBERTS, RJ .
NUCLEIC ACIDS RESEARCH, 1979, 7 (02) :529-545