Berkeley PHOG: PhyloFacts orthology group prediction web server

Times cited: 54
Authors
Datta, Ruchira S. [1]
Meacham, Christopher [1]
Samad, Bushra [2]
Neyer, Christoph [2]
Sjolander, Kimmen [1,2,3]
Affiliations
[1] Univ Calif Berkeley, Inst QB3, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Bioengn, Berkeley, CA 94720 USA
[3] Univ Calif Berkeley, Dept Plant & Microbial Sci, Berkeley, CA 94720 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
PHYLOGENOMIC INFERENCE; PROTEIN; DATABASE; ANNOTATION; PARALOGS; HOMOLOGY; ERRORS; TREE;
DOI
10.1093/nar/gkp373
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Ortholog detection is essential in functional annotation of genomes, with applications to phylogenetic tree construction, prediction of protein-protein interactions and other bioinformatics tasks. We present the PHOG web server, which employs a novel algorithm to identify orthologs based on phylogenetic analysis. Results on a benchmark dataset from the manually curated TreeFam-A orthology database show that PHOG provides a combination of high recall and precision competitive with both InParanoid and OrthoMCL, and allows users to target different taxonomic distances and precision levels through the use of tree-distance thresholds. For instance, OrthoMCL-DB achieved 76% recall and 66% precision on this dataset; at a slightly higher precision (68%), PHOG achieves recall that is 10 percentage points higher (86%). InParanoid achieved 87% recall at 24% precision on this dataset, while a PHOG variant designed for high recall achieves 88% recall at 61% precision, an increase of 37 percentage points in precision over InParanoid. PHOG is based on pre-computed trees in the PhyloFacts resource, and contains over 366K orthology groups with a minimum of three species. Predicted orthologs are linked to GO annotations, pathway information and biological literature. The PHOG web server is available at http://phylofacts.berkeley.edu/orthologs/.
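The recall and precision figures quoted in the abstract are pairwise measures against a reference set. The following is a minimal sketch of how such figures can be computed, not the authors' evaluation code. It assumes that both the predictions and the benchmark are given as unordered pairs of sequence identifiers; the function names and the toy identifiers are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Pair = Tuple[str, str]


def as_pair_set(pairs: Iterable[Pair]) -> Set[FrozenSet[str]]:
    """Normalise pairs so that (a, b) and (b, a) count as the same pair."""
    return {frozenset(p) for p in pairs}


def precision_recall(predicted: Iterable[Pair],
                     reference: Iterable[Pair]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted pairs against reference pairs.

    precision = true positives / number of predicted pairs
    recall    = true positives / number of reference pairs
    An empty denominator yields 0.0 rather than raising an error.
    """
    pred = as_pair_set(predicted)
    ref = as_pair_set(reference)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy data: four reference ortholog pairs, three predictions.
    reference = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]
    predicted = [("b1", "a1"), ("a2", "b2"), ("a3", "b9")]
    p, r = precision_recall(predicted, reference)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Under this definition, a difference such as 61% versus 24% precision is a gap of 37 percentage points, which is how the comparison with InParanoid should be read.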
Pages: W84-W89
Number of pages: 6