Assigning protein functions by comparative genome analysis: Protein phylogenetic profiles

Cited by: 1328
Authors
Pellegrini, M
Marcotte, EM
Thompson, MJ
Eisenberg, D
Yeates, TO
Affiliations
[1] Univ Calif Los Angeles, Inst Mol Biol, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Energy Lab Struct Biol & Mol Med, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Dept Chem & Biochem, Los Angeles, CA 90095 USA
Keywords
genomic; bioinformatics; metabolic pathways; structural complexes
DOI
10.1073/pnas.96.8.4285
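Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07 [Science]; 0710 [Biology]; 09 [Agriculture]
Abstract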
Determining protein functions from genomic sequences is a central goal of bioinformatics. We present a method based on the assumption that proteins that function together in a pathway or structural complex are likely to evolve in a correlated fashion. During evolution, all such functionally linked proteins tend to be either preserved or eliminated in a new species. We describe this property of correlated evolution by characterizing each protein by its phylogenetic profile, a string that encodes the presence or absence of a protein in every known genome. We show that proteins having matching or similar profiles strongly tend to be functionally linked. This method of phylogenetic profiling allows us to predict the function of uncharacterized proteins.
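The idea of a phylogenetic profile can be illustrated with a minimal sketch. It assumes toy data: the genome names, protein names, and presence sets below are hypothetical, and the use of Hamming distance with a `max_distance` threshold is an assumption made for illustration, since the abstract does not specify how profile similarity is measured.

```python
from itertools import combinations

# Hypothetical toy data: the genomes in which each protein has a homolog.
GENOMES = ["genome_A", "genome_B", "genome_C", "genome_D"]
PRESENCE = {
    "P1": {"genome_A", "genome_C"},
    "P2": {"genome_A", "genome_C"},
    "P3": {"genome_A", "genome_B", "genome_C"},
    "P4": {"genome_B", "genome_D"},
}


def phylogenetic_profile(present_in, genomes):
    """Encode presence (1) or absence (0) of a protein in each genome."""
    return "".join("1" if g in present_in else "0" for g in genomes)


def hamming_distance(a, b):
    """Count the positions at which two equal-length profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    return sum(x != y for x, y in zip(a, b))


def linked_pairs(profiles, max_distance=0):
    """Return protein pairs whose profiles differ in at most max_distance bits.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        (p, q)
        for p, q in combinations(sorted(profiles), 2)
        if hamming_distance(profiles[p], profiles[q]) <= max_distance
    ]


profiles = {
    name: phylogenetic_profile(present_in, GENOMES)
    for name, present_in in PRESENCE.items()
}
```

In this toy set, `P1` and `P2` have identical profiles and would be flagged as candidate functional partners; raising `max_distance` to 1 also admits `P3`, whose profile differs from theirs by a single genome.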
Pages: 4285-4288
Page count: 4
Related papers
9 in total
[1]
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]
Bioinformatics: From genome data to biological knowledge [J].
Andrade, MA ;
Sander, C .
CURRENT OPINION IN BIOTECHNOLOGY, 1997, 8 (06) :675-683
[3]
The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1998 [J].
Bairoch, A ;
Apweiler, R .
NUCLEIC ACIDS RESEARCH, 1998, 26 (01) :38-42
[4]
The complete genome sequence of Escherichia coli K-12 [J].
Blattner, FR ;
Plunkett, G ;
Bloch, CA ;
Perna, NT ;
Burland, V ;
Riley, M ;
ColladoVides, J ;
Glasner, JD ;
Rode, CK ;
Mayhew, GF ;
Gregor, J ;
Davis, NW ;
Kirkpatrick, HA ;
Goeden, MA ;
Rose, DJ ;
Mau, B ;
Shao, Y .
SCIENCE, 1997, 277 (5331) :1453-+
[5]
Predicting function: From genes to genomes and back [J].
Bork, P ;
Dandekar, T ;
Diaz-Lazcoz, Y ;
Eisenhaber, F ;
Huynen, M ;
Yuan, YP .
JOURNAL OF MOLECULAR BIOLOGY, 1998, 283 (04) :707-725
[6]
Gaasterland T, 1998, Microb Comp Genomics, V3, P177, DOI 10.1089/omi.1.1998.3.177
[7]
EcoCyc: Encyclopedia of Escherichia coli genes and metabolism [J].
Karp, PD ;
Riley, M ;
Paley, SM ;
Pellegrini-Toole, A ;
Krummenacker, M .
NUCLEIC ACIDS RESEARCH, 1998, 26 (01) :50-53
[8]
Genes and proteins of Escherichia coli K-12 (GenProtEC) [J].
Riley, M .
NUCLEIC ACIDS RESEARCH, 1998, 26 (01) :54-54
[9]
Metabolism and evolution of Haemophilus influenzae deduced from a whole-genome comparison with Escherichia coli [J].
Tatusov, RL ;
Mushegian, AR ;
Bork, P ;
Brown, NP ;
Hayes, WS ;
Borodovsky, M ;
Rudd, KE ;
Koonin, EV .
CURRENT BIOLOGY, 1996, 6 (03) :279-291