The sequence read archive: explosive growth of sequencing data

Cited by: 646

Authors:

Kodama, Yuichi [1,2]

Shumway, Martin [3]

Leinonen, Rasko [4]

Affiliations:

[1] Res Org Informat & Syst, Ctr Informat Biol, Mishima, Shizuoka 4118540, Japan

[2] Res Org Informat & Syst, DNA Data Bank Japan, Natl Inst Genet, Mishima, Shizuoka 4118540, Japan

[3] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA

[4] Wellcome Trust Genome Campus, European Bioinformat Inst, Cambridge CB10 1SD, England

Source:

NUCLEIC ACIDS RESEARCH | 2012 / Vol. 40 / Issue D1

Funding:

Wellcome Trust (UK);

DOI:

10.1093/nar/gkr854

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

New generation sequencing platforms are producing data with significantly higher throughput and lower cost. A portion of this capacity is devoted to individual and community scientific projects. As these projects reach publication, raw sequencing datasets are submitted into the primary next-generation sequence data archive, the Sequence Read Archive (SRA). Archiving experimental data is the key to the progress of reproducible science. The SRA was established as a public repository for next-generation sequence data as a part of the International Nucleotide Sequence Database Collaboration (INSDC). INSDC is composed of the National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EBI) and the DNA Data Bank of Japan (DDBJ). The SRA is accessible at www.ebi.ac.uk/ena from EBI and at trace.ddbj.nig.ac.jp from DDBJ. In this article, we present the content and structure of the SRA and report on updated metadata structures, submission file formats and supported sequencing platforms. We also briefly outline our various responses to the challenge of explosive data growth.

Pages: D54-D56

Page count: 3

