Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation

Cited by: 180
Authors
Ng, P
Wei, CL
Sung, WK
Chiu, KP
Lipovich, L
Ang, CC
Gupta, S
Shahab, A
Ridwan, A
Wong, CH
Liu, ET
Ruan, Y
Affiliations
[1] Genome Inst Singapore, Singapore 138672, Singapore
[2] Bioinformat Inst, Singapore 138671, Singapore
DOI
10.1038/NMETH733
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
We have developed a DNA tag sequencing and mapping strategy called gene identification signature (GIS) analysis, in which 5' and 3' signatures of full-length cDNAs are accurately extracted into paired-end ditags (PETs) that are concatenated for efficient sequencing and mapped to genome sequences to demarcate the transcription boundaries of every gene. GIS analysis is potentially 30-fold more efficient than standard cDNA sequencing approaches for transcriptome characterization. We demonstrated this approach with 116,252 PET sequences derived from mouse embryonic stem cells. Initial analysis of this dataset identified hundreds of previously uncharacterized transcripts, including alternative transcripts of known genes. We also uncovered several intergenically spliced and unusual fusion transcripts, one of which was confirmed as a trans-splicing event and was differentially expressed. The concept of paired-end ditagging described here for transcriptome analysis can also be applied to whole-genome analysis of cis-regulatory and other DNA elements and represents an important technological advance for genome annotation.
Pages: 105-111
Number of pages: 7