2SNP: scalable phasing based on 2-SNP haplotypes

Cited by: 26
Authors
Brinza, D [1]
Zelikovsky, A [1]
Affiliations
[1] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30303 USA
Keywords
DOI
10.1093/bioinformatics/bti785
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
The 2SNP software package implements a new, very fast, scalable algorithm for haplotype inference based on genotype statistics collected only for pairs of SNPs. The software can be used for comparatively accurate phasing of large numbers of long genome sequences, e.g. those obtained from DNA arrays. As input, 2SNP takes a genotype matrix, and it outputs the corresponding haplotype matrix. On datasets across 79 regions from HapMap, 2SNP is several orders of magnitude faster than GERBIL and PHASE while matching them in quality, as measured by the number of correctly phased genotypes and by single-site and switching errors. For example, 2SNP requires 41 s on a 2 GHz Pentium 4 processor to phase 30 genotypes with 1381 SNPs (ENm010.7p15:2 data from HapMap), whereas GERBIL and PHASE require more than a week and make no fewer errors than 2SNP.
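To make the idea of "genotype statistics collected only for pairs of SNPs" concrete, here is a minimal Python sketch. It is not the 2SNP implementation. The genotype encoding (0 and 1 for the two homozygous states, 2 for heterozygous), the function names, and the product-comparison rule for double heterozygotes are all assumptions chosen for illustration. The sketch counts the 2-SNP haplotypes that can be read directly from genotypes with at most one heterozygous site in the pair, then uses those counts to guess the phase of a genotype that is heterozygous at both sites.

```python
from collections import Counter

HET = 2  # assumed code for a heterozygous site; 0 and 1 are homozygous


def pair_haplotype_counts(genotypes, i, j):
    """Count 2-SNP haplotypes at sites i and j from unambiguous genotypes."""
    counts = Counter()
    for g in genotypes:
        a, b = g[i], g[j]
        if a == HET and b == HET:
            # Phase is ambiguous here: either {00, 11} or {01, 10}.
            continue
        alleles_i = (0, 1) if a == HET else (a, a)
        alleles_j = (0, 1) if b == HET else (b, b)
        # At most one site is heterozygous, so pairing in order is exact.
        for x, y in zip(alleles_i, alleles_j):
            counts[(x, y)] += 1
    return counts


def resolve_double_het(counts):
    """Guess the phase of a double heterozygote (illustrative rule only)."""
    cis = counts[(0, 0)] * counts[(1, 1)]    # support for {00, 11}
    trans = counts[(0, 1)] * counts[(1, 0)]  # support for {01, 10}
    return "cis" if cis >= trans else "trans"


# Hypothetical genotype matrix: rows are individuals, columns are SNPs.
genotypes = [
    [0, 0],
    [1, 1],
    [2, 2],
    [0, 2],
    [2, 1],
]
counts = pair_haplotype_counts(genotypes, 0, 1)
phase = resolve_double_het(counts)
```

Collecting such counts needs only one pass over the genotype matrix for each pair of SNPs, which gives some intuition for why pairwise statistics are cheap compared with full likelihood-based phasing.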
Pages: 371-373
Number of pages: 3
Related papers
12 in total
[1]  
CLARK AG, 1990, MOL BIOL EVOL, V7, P111
[2]   High-resolution haplotype structure in the human genome [J].
Daly, MJ ;
Rioux, JD ;
Schaffner, SE ;
Hudson, TJ ;
Lander, ES .
NATURE GENETICS, 2001, 29 (02) :229-232
[3]   The structure of haplotype blocks in the human genome [J].
Gabriel, SB ;
Schaffner, SF ;
Nguyen, H ;
Moore, JM ;
Roy, J ;
Blumenstiel, B ;
Higgins, J ;
DeFelice, M ;
Lochner, A ;
Faggart, M ;
Liu-Cordero, SN ;
Rotimi, C ;
Adeyemo, A ;
Cooper, R ;
Ward, R ;
Lander, ES ;
Daly, MJ ;
Altshuler, D .
SCIENCE, 2002, 296 (5576) :2225-2229
[4]   The International HapMap Project [J].
Gibbs, RA ;
Belmont, JW ;
Hardenbol, P ;
Willis, TD ;
Yu, FL ;
Yang, HM ;
Ch'ang, LY ;
Huang, W ;
Liu, B ;
Shen, Y ;
Tam, PKH ;
Tsui, LC ;
Waye, MMY ;
Wong, JTF ;
Zeng, CQ ;
Zhang, QR ;
Chee, MS ;
Galver, LM ;
Kruglyak, S ;
Murray, SS ;
Oliphant, AR ;
Montpetit, A ;
Hudson, TJ ;
Chagnon, F ;
Ferretti, V ;
Leboeuf, M ;
Phillips, MS ;
Verner, A ;
Kwok, PY ;
Duan, SH ;
Lind, DL ;
Miller, RD ;
Rice, JP ;
Saccone, NL ;
Taillon-Miller, P ;
Xiao, M ;
Nakamura, Y ;
Sekine, A ;
Sorimachi, K ;
Tanaka, T ;
Tanaka, Y ;
Tsunoda, T ;
Yoshino, E ;
Bentley, DR ;
Deloukas, P ;
Hunt, S ;
Powell, D ;
Altshuler, D ;
Gabriel, SB ;
Qiu, RZ .
NATURE, 2003, 426 (6968) :789-796
[5]  
Gusfield D, 2003, LECT NOTES COMPUT SC, V2676, P144
[6]   Haplotype reconstruction from genotype data using Imperfect Phylogeny [J].
Halperin, E ;
Eskin, E .
BIOINFORMATICS, 2004, 20 (12) :1842-1849
[7]   Haplotype mapping of the bronchiolitis susceptibility locus near IL8 [J].
Hull, J ;
Rowlands, K ;
Lockhart, E ;
Sharland, M ;
Moore, C ;
Hanchard, N ;
Kwiatkowski, DP .
HUMAN GENETICS, 2004, 114 (03) :272-279
[8]   GERBIL: Genotype resolution and block identification using likelihood [J].
Kimmel, G ;
Shamir, R .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2005, 102 (01) :158-162
[9]   Variation is the spice of life [J].
Kruglyak, L ;
Nickerson, DA .
NATURE GENETICS, 2001, 27 (03) :234-236
[10]   Algorithms for inferring haplotypes [J].
Niu, TH .
GENETIC EPIDEMIOLOGY, 2004, 27 (04) :334-347