The outcomes of pathway database computations depend on pathway ontology

Cited by: 62
Authors
Green, M. L. [1]
Karp, P. D. [1]
Affiliations
[1] SRI International, Artificial Intelligence Center, Bioinformatics Research Group, Menlo Park, CA 94025, USA
DOI
10.1093/nar/gkl438
CLC classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Different biological notions of pathways are used in different pathway databases. Those pathway ontologies significantly impact pathway computations. Computational users of pathway databases will obtain different results depending on the pathway ontology used by the databases they employ, and different pathway ontologies are preferable for different end uses. We explore differences in pathway ontologies by comparing the BioCyc and KEGG ontologies. The BioCyc ontology defines a pathway as a conserved, atomic module of the metabolic network of a single organism, i.e. often regulated as a unit, whose boundaries are defined at high-connectivity stable metabolites. KEGG pathways are on average 4.2 times larger than BioCyc pathways, and combine multiple biological processes from different organisms to produce a substrate-centered reaction mosaic. We compared KEGG and BioCyc pathways using genome context methods, which determine the functional relatedness of pairs of genes. For each method we employed, a pair of genes randomly selected from a BioCyc pathway is more likely to be related by that method than is a pair of genes randomly selected from a KEGG pathway, supporting the conclusion that the BioCyc pathway conceptualization is closer to a single conserved biological process than is that of KEGG.
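The comparison described in the abstract can be illustrated with a minimal sketch. It assumes each pathway is a set of gene identifiers and that a genome context method has already produced a set of related gene pairs; the function name `related_pair_fraction`, the gene and pathway names, and the toy data are all hypothetical and do not come from the paper. The sketch pools all within-pathway gene pairs and reports the fraction that are related, which is one simple way to estimate the probability that a randomly selected pair is linked, and is not necessarily the authors' exact procedure.

```python
from itertools import combinations


def related_pair_fraction(pathways, related_pairs):
    """Return the fraction of within-pathway gene pairs that are related.

    pathways: dict mapping a pathway name to a set of gene identifiers.
    related_pairs: set of frozensets, each holding two related genes.
    """
    total = 0
    linked = 0
    for genes in pathways.values():
        # Enumerate every unordered pair of genes inside this pathway.
        for a, b in combinations(sorted(genes), 2):
            total += 1
            if frozenset((a, b)) in related_pairs:
                linked += 1
    return linked / total if total else 0.0


# Hypothetical toy data: three gene pairs flagged as related by some method.
related = {frozenset(p) for p in [("g1", "g2"), ("g2", "g3"), ("g4", "g5")]}

# Two small pathways versus one large pathway covering the same genes.
small_pathways = {"p1": {"g1", "g2", "g3"}, "p2": {"g4", "g5"}}
large_pathways = {"P": {"g1", "g2", "g3", "g4", "g5"}}

small_fraction = related_pair_fraction(small_pathways, related)
large_fraction = related_pair_fraction(large_pathways, related)
print(small_fraction, large_fraction)
```

In this toy setup, merging the two small pathways into one large pathway adds many unrelated gene pairs, so the related fraction drops. This mirrors, in a simplified way, the effect the abstract reports when comparing smaller and larger pathway definitions.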
Pages: 3687-3697
Page count: 11