SCOOP: a simple method for identification of novel protein superfamily relationships

Cited by: 32
Authors
Bateman, Alex [1]
Finn, Robert D. [1]
Affiliations
[1] Wellcome Trust Sanger Inst, Hinxton CB10 1SA, England
Funding
Wellcome Trust (UK);
DOI
10.1093/bioinformatics/btm034
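Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 [Biochemistry and Molecular Biology]; 081704 [Applied Chemistry];
Abstract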
Motivation: Profile searches of sequence databases are a sensitive way to detect sequence relationships. Sophisticated profile-profile comparison algorithms that have been recently introduced increase search sensitivity even further.
Results: In this article, a simpler approach than profile-profile comparison is presented that has a comparable performance to state-of-the-art tools such as COMPASS, HHsearch and PRC. This approach is called SCOOP (Simple Comparison Of Outputs Program), and is shown to find known relationships between families in the Pfam database as well as detect novel distant relationships between families. Several novel discoveries are presented including the discovery that a domain of unknown function (DUF283) found in Dicer proteins is related to double-stranded RNA-binding domains.
Pages: 809-814
Page count: 6