A novel method for automatic functional annotation of proteins

Cited by: 65
Authors
Fleischmann, W [1]
Möller, S [1]
Gateau, A [1]
Apweiler, R [1]
Affiliations
[1] European Bioinformat Inst, EMBL Outstn, Cambridge CB10 1SD, England
DOI
10.1093/bioinformatics/15.3.228
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: To cope with the increasing amount of sequence data, reliable automatic annotation tools are required. Together with SWISS-PROT, the TrEMBL database contains nearly all publicly available protein sequences, but in contrast to SWISS-PROT it carries only limited functional annotation. To improve this situation, we had to develop a method of automatic annotation that makes highly reliable functional predictions using the language and syntax of SWISS-PROT.
Results: An algorithm was developed and successfully used for the automatic annotation of a test set of unknown proteins. The predicted information included description, function, catalytic activity, cofactors, pathway, subcellular location, quaternary structure, similarity to other proteins, active sites, and keywords. The algorithm showed a low coverage (10%), but a high specificity and reliability.
Pages: 228-233
Page count: 6