Method for prediction of protein function from sequence using the sequence-to-structure-to-function paradigm with application to glutaredoxins/thioredoxins and T1 ribonucleases

Cited by: 173
Authors
Fetrow, JS
Skolnick, J
Affiliations
[1] Scripps Res Inst, Dept Mol Biol, La Jolla, CA 92037 USA
[2] SUNY Albany, Dept Biol Sci, Ctr Biochem & Biophys, Albany, NY 12222 USA
Keywords
protein function prediction; ab initio folding algorithm; threading algorithm; ribotoxin; functional genomics
DOI
10.1006/jmbi.1998.1993
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The practical exploitation of the vast numbers of sequences in the genome sequence databases is crucially dependent on the ability to identify the function of each sequence. Unfortunately, current methods, including global sequence alignment and local sequence motif identification, are limited by the extent of sequence similarity between sequences of unknown and known function; these methods increasingly fail as the sequence identity diverges into and beyond the twilight zone of sequence identity. To address this problem, a novel method for identification of protein function based directly on the sequence-to-structure-to-function paradigm is described. Descriptors of protein active sites, termed "fuzzy functional forms" or FFFs, are created based on the geometry and conformation of the active site. By way of illustration, the active sites responsible for the disulfide oxidoreductase activity of the glutaredoxin/thioredoxin family and the RNA hydrolytic activity of the T1 ribonuclease family are presented. First, the FFFs are shown to correctly identify their corresponding active sites in a library of exact protein models produced by crystallography or NMR spectroscopy, most of which lack the specified activity. Next, these FFFs are used to screen for active sites in low-to-moderate resolution models produced by ab initio folding or threading prediction algorithms. Again, the FFFs can specifically identify the functional sites of these proteins from their predicted structures. The results demonstrate that low-to-moderate resolution models as produced by state-of-the-art tertiary structure prediction algorithms are sufficient to identify protein active sites. Prediction of a novel function for the gamma subunit of a yeast glycosyl transferase and prediction of the function of two hypothetical yeast proteins whose models were produced via threading are presented. This work suggests a means for the large-scale functional screening of genomic sequence databases based on the prediction of structure from sequence, then on the identification of functional active sites in the predicted structure. © 1998 Academic Press.
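The abstract describes active-site descriptors built from geometry and then used to screen both exact and low-to-moderate resolution models. As a rough illustration only, the idea of matching a geometric descriptor with tolerances can be sketched as below. Everything in this sketch is an assumption made for illustration: the names `Residue`, `DistanceRule`, `Descriptor`, and `find_sites`, the use of alpha-carbon distances alone, and every distance and tolerance value in the toy data. None of it is taken from the paper, and it is not the authors' FFF definition or implementation.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple

Coord = Tuple[float, float, float]

@dataclass(frozen=True)
class Residue:
    index: int   # position in the sequence
    name: str    # three-letter residue code
    ca: Coord    # alpha-carbon coordinates in angstroms

@dataclass(frozen=True)
class DistanceRule:
    i: int            # slot in the descriptor's residue list
    j: int            # another slot in the same list
    target: float     # expected CA-CA distance (hypothetical value)
    tolerance: float  # allowed deviation; this is the "fuzzy" part

@dataclass(frozen=True)
class Descriptor:
    residues: Tuple[str, ...]        # required residue type per slot
    rules: Tuple[DistanceRule, ...]  # pairwise distance constraints

def find_sites(structure: List[Residue], descriptor: Descriptor) -> List[Tuple[int, ...]]:
    """Return sequence indices of every residue set satisfying the descriptor."""
    # Candidate residues for each slot, filtered by residue type.
    candidates = [[r for r in structure if r.name == name]
                  for name in descriptor.residues]
    hits: List[Tuple[int, ...]] = []

    def extend(partial: List[Residue]) -> None:
        slot = len(partial)
        if slot == len(descriptor.residues):
            hits.append(tuple(r.index for r in partial))
            return
        for residue in candidates[slot]:
            if residue in partial:
                continue  # one residue cannot fill two slots
            trial = partial + [residue]
            # Check only the rules completed by the newly placed slot.
            ok = all(
                abs(dist(trial[rule.i].ca, trial[rule.j].ca) - rule.target)
                <= rule.tolerance
                for rule in descriptor.rules
                if max(rule.i, rule.j) == slot
            )
            if ok:
                extend(trial)

    extend([])
    return hits

# Toy descriptor: two cysteines and a proline. All numbers are invented.
toy_descriptor = Descriptor(
    residues=("CYS", "CYS", "PRO"),
    rules=(
        DistanceRule(0, 1, 5.5, 1.0),
        DistanceRule(0, 2, 9.0, 1.0),
        DistanceRule(1, 2, 3.5, 1.0),
    ),
)

# Toy "exact" structure with one matching site and two decoy residues.
toy_structure = [
    Residue(10, "CYS", (0.0, 0.0, 0.0)),
    Residue(13, "CYS", (5.5, 0.0, 0.0)),
    Residue(30, "CYS", (20.0, 20.0, 20.0)),
    Residue(41, "ALA", (3.0, 3.0, 3.0)),
    Residue(50, "PRO", (9.0, 0.0, 0.0)),
]

# The same site with perturbed coordinates, standing in for a coarser model.
toy_model = [
    Residue(10, "CYS", (0.4, 0.3, 0.0)),
    Residue(13, "CYS", (5.2, -0.4, 0.3)),
    Residue(50, "PRO", (9.3, 0.5, -0.2)),
]

print(find_sites(toy_structure, toy_descriptor))
print(find_sites(toy_model, toy_descriptor))
```

The tolerance on each distance is what lets the same descriptor accept a perturbed model while still rejecting the decoy cysteine; how wide such tolerances should be in practice is a question the sketch does not address.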
Pages: 949-968
Page count: 20