Multi-domain protein

 

I tested two members of the Src tyrosine kinase family: Src (2src) and Hck (2hck).
These proteins are 450 and 438 residues long, respectively, and according to the CATH classification each chain contains four domains.


The number in parentheses in the "Non-trivial solutions found" column is the number of true positives retrieved.

2src (resolution 1.5 Å, 450 residues, 49 % SSE)

Server    Trivial solution found (itself)    Non-trivial solutions found    Rank of first false positive    Time (minutes)
CE        yes                                yes                            last 5                          8
Dali      yes                                yes                            33, 42                          5
Dejavu    no                                 yes (51)                       no                              190
Lock      no                                 yes                            6                               16
Matras    yes                                yes (30)                       no                              33
Pride     no                                 yes (1)                        2                               seconds
SSM       yes                                yes (6)                        no                              5
Top       yes                                yes (10)                       no                              11
Tops      no                                 no                             1                               5
Topscan   no                                 no                             1                               seconds
Vast      yes                                yes (52)                       no                              31




2hck (resolution 3.0 Å, 438 residues, 50 % SSE)

Server    Trivial solution found (itself)    Non-trivial solutions found    Rank of first false positive    Time (minutes)
CE        yes                                yes                            far away                        ND
Dali      yes                                yes                            31                              52
Dejavu    no                                 yes (45)                       no                              210
Lock      no                                 yes                            6                               16
Matras    yes                                yes                            no                              30
Pride     no                                 no                             1                               seconds
SSM       yes                                yes (6)                        no                              5
Top       yes                                yes                            no                              10
Tops      no                                 no                             1                               6
Topscan   no                                 yes (1 in first 20)            2                               seconds
Vast      yes                                yes                            no                              ND

CE processing time is unknown, because we used pre-computed results.
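
For readers who want to reproduce the table columns from a server's output, the following is a minimal Python sketch of how "Trivial solution found", the true-positive count and "Rank of first false positive" could be derived from a ranked hit list. The function name, the hit identifiers and the reference set of true positives are hypothetical placeholders, not data from this evaluation. The sketch also assumes that ranks are 1-based and that the query itself occupies a rank if the server returns it.

```python
from typing import Iterable, Optional, Set, Tuple


def evaluate_hits(
    ranked_hits: Iterable[str],
    true_positives: Set[str],
    query: str,
) -> Tuple[bool, int, Optional[int]]:
    """Summarise a ranked hit list in the style of the table columns.

    Returns (trivial_found, n_true_positives, rank_of_first_false_positive).
    The rank is None when the list contains no false positive.
    """
    trivial_found = False
    n_true = 0
    first_fp_rank = None
    for rank, hit in enumerate(ranked_hits, start=1):
        if hit == query:
            trivial_found = True      # the server returned the query itself
        elif hit in true_positives:
            n_true += 1               # a non-trivial true positive
        elif first_fp_rank is None:
            first_fp_rank = rank      # first hit that is neither of the above
    return trivial_found, n_true, first_fp_rank


# Hypothetical example: identifiers are placeholders, not real results.
hits = ["query", "tp_a", "tp_b", "fp_x", "tp_c"]
print(evaluate_hits(hits, {"tp_a", "tp_b", "tp_c"}, "query"))  # (True, 3, 4)
```

A result of `None` in the last position corresponds to the "no" entries in the "Rank of first false positive" column.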


Comments on the results:

CE: more results for 2hck than for 2src (78 versus 67), but also more false positives. The results are properly ranked: first come the structures with four matched domains, then a large group of structures with two matched domains, and then structures with one matched domain, ordered by domain size. All CATH-defined domains of these structures are covered. False positives appear only at the very end.
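
The ranking described for CE amounts to a sort on two keys, which can be checked mechanically. The sketch below is only an illustration: the field names `matched_domains` and `matched_residues` and the example hits are hypothetical. It also assumes that "ordered by domain size" means larger matches first, which the text above does not state.

```python
from typing import Dict, List


def rank_hits(hits: List[Dict]) -> List[Dict]:
    """Return hits ordered by matched-domain count, then by matched size."""
    # Primary key: number of matched domains, descending.
    # Secondary key: matched size in residues, descending (an assumption).
    return sorted(
        hits,
        key=lambda h: (-h["matched_domains"], -h["matched_residues"]),
    )


def is_properly_ranked(hits: List[Dict]) -> bool:
    """True if the list is already in the order rank_hits would produce."""
    return hits == rank_hits(hits)


# Hypothetical hits, deliberately given out of order.
example = [
    {"id": "one_domain_small", "matched_domains": 1, "matched_residues": 60},
    {"id": "four_domains", "matched_domains": 4, "matched_residues": 430},
    {"id": "one_domain_large", "matched_domains": 1, "matched_residues": 250},
    {"id": "two_domains", "matched_domains": 2, "matched_residues": 160},
]
print([h["id"] for h in rank_hits(example)])
print(is_properly_ranked(example))
```

Python's `sorted` is stable, so hits that tie on both keys keep the order in which the server returned them.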

Dali: the same as CE, with perfectly ranked results.

Dejavu: failed to find proteins that contain only an SH2 or only an SH3 domain, but found the structures that share at least two domains with the Src kinases.

Lock: in neither case did it find the best results (that is, structures with four matched domains). My explanation is that these structures are not contained in its database. Nevertheless, the program was able to find essentially all the domain types it was supposed to find, although they were mixed with false positives almost from the beginning.

Matras: similar to CE and Dali, with no false positives, but also with missing results (perhaps because of a higher Z-score cut-off).

Pride: found only one hit, to the kinase domain; no SH3 or SH2 domains.

SSM: able to find only structures with all four domains present; it cannot find structures containing only some of the domains present in Src kinases.

Top: no false positives, but I could not find any SH2 or SH3 domains among the hits. It found only the kinase domain, which is not so good.

Tops: cannot handle multidomain proteins; no hits.

Topscan: one hit to the kinase domain in one case, but otherwise nothing. It cannot handle multidomain proteins.

Vast: no false positives; the same as Dali, Matras and CE. Perfect.

