TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes

Cited by: 220
Authors
Selengut, Jeremy D. [1]
Haft, Daniel H. [1]
Davidsen, Tanja [1]
Ganapathy, Anurhada [1]
Gwinn-Giglio, Michelle [1]
Nelson, William C. [1]
Richter, Alexander R. [1]
White, Owen [1]
Affiliations
[1] TIGR, Bioinformat Dept, Rockville, MD 20850 USA
Funding
National Science Foundation (USA)
DOI
10.1093/nar/gkl1043
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
TIGRFAMs is a collection of protein family definitions built to aid in high-throughput annotation of specific protein functions. Each family is based on a hidden Markov model (HMM), where both cutoff scores and membership in the seed alignment are chosen so that the HMMs can classify numerous proteins according to their specific molecular functions. Most TIGRFAMs models describe 'equivalog' families, where both orthology and lateral gene transfer may be part of the evolutionary history, but where a single molecular function has been conserved. The Genome Properties system contains a queriable set of metabolic reconstructions, genome metrics and extractions of information from the scientific literature. Its genome-by-genome assertions of whether or not specific structures, pathways or systems are present provide high-level conceptual descriptions of genomic content. These assertions enable comparative genomics, provide a meaningful biological context to aid in manual annotation, support assignments of Gene Ontology (GO) biological process terms and help validate HMM-based predictions of protein function. The Genome Properties system is particularly useful as a generator of phylogenetic profiles, through which new protein family functions may be discovered.
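The phylogenetic-profile idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each genome has already been given a yes/no assertion for some property, and that each protein family has a set of genomes containing a hit above the family's cutoff score. The function name, the genome and family identifiers, and the simple agreement score are all hypothetical choices made for illustration; this is not the scoring method used by the Genome Properties system.

```python
from typing import Dict, List, Set, Tuple


def rank_families_by_profile(
    property_genomes: Set[str],
    family_genomes: Dict[str, Set[str]],
    all_genomes: Set[str],
) -> List[Tuple[str, float]]:
    """Rank families by how well their presence/absence pattern across
    genomes agrees with the presence/absence pattern of a property.

    The score is the fraction of genomes in which the family and the
    property are either both present or both absent (an assumed,
    deliberately simple measure).
    """
    ranked = []
    for family, present_in in family_genomes.items():
        agree = sum(
            (g in present_in) == (g in property_genomes) for g in all_genomes
        )
        ranked.append((family, agree / len(all_genomes)))
    # Best agreement first; ties broken alphabetically by family name.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Toy data: four hypothetical genomes, one property, three families.
genomes = {"g1", "g2", "g3", "g4"}
has_property = {"g1", "g3"}
families = {
    "famA": {"g1", "g3"},              # matches the property exactly
    "famB": {"g1", "g2", "g3", "g4"},  # present everywhere, uninformative
    "famC": {"g2"},                    # mostly anti-correlated
}
ranking = rank_families_by_profile(has_property, families, genomes)
```

In this toy example the family whose distribution mirrors the property ranks first, which is the kind of signal that could suggest a functional link worth checking manually.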
Pages: D260-D264
Page count: 5