ASSIGNMENT OF POSITION-SPECIFIC ERROR-PROBABILITY TO PRIMARY DNA-SEQUENCE DATA

Cited by: 27
Authors
LAWRENCE, CB
SOLOVYEV, VV
Affiliation
[1] Department of Cell Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
DOI
10.1093/nar/22.7.1272
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
DNA sequence predicted from polyacrylamide gel-based technologies is inaccurate because of variations in the quality of the primary data due to limitations of the technology, and because of sequence-specific variations due to nucleotide interactions within the DNA molecule and with the gel. The ability to recognize the probability of error in the primary data will be useful in reconstructing the target sequence of a DNA sequencing project, and in estimating the accuracy of the final sequence. This paper describes the use of linear discriminant analysis to assign position-specific probabilities of incorrect prediction, over-prediction, and under-prediction of nucleotides for each predicted nucleotide position in primary sequence data generated by a gel-based DNA sequencing technology. Using this method, most of the error potential in primary sequence data can be assigned to a limited number of discrete positions. The use of probability values in the sequence reconstruction process, and in estimating the accuracy of consensus sequence determination, is described.
Pages: 1272-1280
Page count: 9
Related papers
14 in total
[1]  
AFIFI AA, 1979, STATISTICAL ANAL COM
[2]  
BOLCH BW, 1974, MULTIVARIATE STATIST
[3]   NEIGHBORING NUCLEOTIDE INTERACTIONS DURING DNA SEQUENCING GEL-ELECTROPHORESIS [J].
BOWLING, JM ;
BRUNER, KL ;
CMARIK, JL ;
TIBBETTS, C .
NUCLEIC ACIDS RESEARCH, 1991, 19 (11) :3089-3097
[4]  
Chen W Q, 1992, DNA Seq, V2, P335, DOI 10.3109/10425179209020814
[5]   THE ACCURACY OF DNA-SEQUENCES - ESTIMATING SEQUENCE QUALITY [J].
CHURCHILL, GA ;
WATERMAN, MS .
GENOMICS, 1992, 14 (01) :89-98
[6]   A TIME-EFFICIENT, LINEAR-SPACE LOCAL SIMILARITY ALGORITHM [J].
HUANG, XQ ;
MILLER, W .
ADVANCES IN APPLIED MATHEMATICS, 1991, 12 (03) :337-357
[7]   A CONTIG ASSEMBLY PROGRAM BASED ON SENSITIVE DETECTION OF FRAGMENT OVERLAPS [J].
HUANG, XQ .
GENOMICS, 1992, 14 (01) :18-25
[9]  
KRISTENSEN T, 1992, DNA SEQUENCE, V2, P343
[10]  
LAWRENCE CC, UNPUB