Prominent use of distal 5′ transcription start sites and discovery of a large number of additional exons in ENCODE regions

Cited by: 141
Authors
Denoeud, France
Kapranov, Philipp
Ucla, Catherine
Frankish, Adam
Castelo, Robert
Drenkow, Jorg
Lagarde, Julien
Alioto, Tyler
Manzano, Caroline
Chrast, Jacqueline
Dike, Sujit
Wyss, Carine
Henrichsen, Charlotte N.
Holroyd, Nancy
Dickson, Mark C.
Taylor, Ruth
Hance, Zahra
Foissac, Sylvain
Myers, Richard M.
Rogers, Jane
Hubbard, Tim
Harrow, Jennifer
Guigo, Roderic
Gingeras, Thomas R.
Antonarakis, Stylianos E.
Reymond, Alexandre [1]
Affiliations
[1] Univ Geneva, Sch Med, Dept Genet Med & Dev, CH-1211 Geneva, Switzerland
[2] Univ Pompeu Fabra, Grup Recerca Informat Biomed, Inst Municipal Invest Med, Barcelona 08003, Spain
[3] Affymetrix Inc, Santa Clara, CA 95051 USA
[4] Wellcome Trust Sanger Inst, Hinxton CB10 1HH, Cambs, England
[5] Ctr Genom Regulat, Barcelona 08003, Spain
[6] Univ Lausanne, Ctr Integrat Genom, CH-1015 Lausanne, Switzerland
[7] Stanford Univ, Sch Med, Dept Genet, Stanford Human Genome Ctr, Stanford, CA 94305 USA
Funding
Wellcome Trust (UK)
DOI
10.1101/gr.5660607
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA Elements (ENCODE) pilot project using a combination of 5′ rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5′ distal to the annotated 5′ terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.
Pages: 746-759
Page count: 14