HaploRec: efficient and accurate large-scale reconstruction of haplotypes

Cited by: 37
Authors
Eronen, Lauri [1]
Geerts, Floris
Toivonen, Hannu
Affiliations
[1] Univ Helsinki, Dept Comp Sci, HIIT BRU, FIN-00014 Helsinki, Finland
[2] Univ Edinburgh, Lab Fdn Comp Sci, Edinburgh EH8 9YL, Midlothian, Scotland
[3] Univ Freiburg, Dept Comp Sci, D-7800 Freiburg, Germany
DOI
10.1186/1471-2105-7-542
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Haplotypes extracted from human DNA can be used for gene mapping and for other analyses of genetic patterns within and across populations. A fundamental problem, however, is that current practical laboratory methods do not give haplotype information. Estimating the phased haplotypes of unrelated individuals from their unphased genotypes is known as the haplotype reconstruction or phasing problem. Results: We define three novel statistical models and give an efficient algorithm for haplotype reconstruction, jointly called HaploRec. HaploRec exploits local regularities conserved in haplotypes: it reconstructs haplotypes so that they have maximal local coherence. This approach, which does not assume statistical dependence between remotely located markers, has two useful properties: it is well suited to sparse marker maps, such as those used in gene mapping, and it can actually take advantage of long maps. Conclusion: Our experimental results with simulated and real data show that HaploRec is a powerful method for the large-scale haplotyping needed in association studies. With sample sizes large enough for gene mapping, it appeared to be the best of all tested methods (Phase, fastPhase, PL-EM, Snphap, Gerbil; simulated data); with small samples, it was competitive with the best available methods (real data). HaploRec is several orders of magnitude faster than Phase and comparable in speed to the other methods; its running times are roughly linear in the number of subjects and in the number of markers. HaploRec is publicly available at http://www.cs.helsinki.fi/group/genetics/haplotyping.html.
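To make the phasing problem stated in the abstract concrete, the following minimal Python sketch enumerates every haplotype pair consistent with one unphased genotype. It is not HaploRec and uses none of its models; the function name `compatible_haplotype_pairs` and the 0/1/2 genotype coding (number of copies of allele 1 at each biallelic marker) are assumptions made for illustration only.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Return all unordered haplotype pairs consistent with an unphased genotype.

    Assumed coding (illustrative): each entry is 0, 1 or 2, the number of
    copies of allele 1 at that marker. Entries equal to 1 are heterozygous,
    so their phase is unknown.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous markers are fixed on both haplotypes: 0 -> allele 0, 2 -> allele 1.
    base = [g // 2 for g in genotype]

    if not het_sites:
        hap = tuple(base)
        return [(hap, hap)]

    pairs = []
    # Pin allele 0 on the first haplotype at the first heterozygous marker so
    # that mirrored pairs (h1, h2) and (h2, h1) are not counted twice.
    for rest in product((0, 1), repeat=len(het_sites) - 1):
        h1, h2 = list(base), list(base)
        for site, allele in zip(het_sites, (0,) + rest):
            h1[site] = allele
            h2[site] = 1 - allele
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


if __name__ == "__main__":
    for h1, h2 in compatible_haplotype_pairs((1, 0, 1, 2)):
        print(h1, h2)
```

A genotype with k heterozygous markers has 2^(k-1) compatible pairs, so exhaustive enumeration is infeasible for long marker maps. A phasing method has to choose among these candidates using a statistical model rather than by listing them.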
Pages: 18