Aligning multiple genomic sequences with the threaded blockset aligner

Cited by: 1092

Authors:

Blanchette, M

Kent, WJ

Riemer, C

Elnitski, L

Smit, AFA

Roskin, KM

Baertsch, R

Rosenbloom, K

Clawson, H

Green, ED

Haussler, D

Miller, W ^{[1]}

Affiliations:

[1] Penn State Univ, Ctr Comparat Genom & Bioinformat, University Pk, PA 16802 USA

[2] Univ Calif Santa Cruz, Howard Hughes Med Inst, Santa Cruz, CA 95064 USA

[3] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA

[4] Inst Syst Biol, Seattle, WA 98103 USA

[5] NHGRI, Genome Technol Branch, NIH, Bethesda, MD 20892 USA

[6] NHGRI, NIH Intramural Sequencing Ctr, NIH, Bethesda, MD 20892 USA

Source:

GENOME RESEARCH | 2004 / Vol. 14 / Issue 04

DOI:

10.1101/gr.1933104

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

We define a "threaded blockset," which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for "threaded blockset aligner") builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser.

Pages: 708-715

Page count: 8
