Uppsala Software Factory Tutorial - SPASM tutorial

This page describes how to use programs of the SPASM package together with O to find and inspect proteins that share a structural motif with your own protein. It uses the following programs:

  1. SPASM to find candidate hits
  2. SAVANT to scrutinise candidate hits
  3. DEJANA to throw away poor hits (this program is briefly discussed in the DEJAVU manual)
  4. O to look at the hits

Reference: Kleywegt, G.J. (1999). Recognition of spatial motifs in protein structures. J Mol Biol 285, 1887-1897.

1 - Prepare your motif

First, put the residues you want to look for in a small PDB file. In this example, we will use residues Trp A279 and Met B33 of the complex between acetylcholinesterase and the snake toxin fasciculin II. You can cut these residues out of PDB entry 1FSS, or copy and paste them from here (put them in a PDB file called 1fss.pdb):

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
ATOM   2144  N   TRP A 279      10.520  24.780  31.275  1.00 53.83      1FSS2369
ATOM   2145  CA  TRP A 279       9.591  25.831  30.895  1.00 53.83      1FSS2370
ATOM   2146  C   TRP A 279       8.589  26.213  31.972  1.00 53.83      1FSS2371
ATOM   2147  O   TRP A 279       7.863  27.194  31.833  1.00 53.83      1FSS2372
ATOM   2148  CB  TRP A 279       8.936  25.480  29.566  1.00 53.83      1FSS2373
ATOM   2149  CG  TRP A 279       9.955  25.461  28.453  1.00 53.83      1FSS2374
ATOM   2150  CD1 TRP A 279      10.577  24.364  27.912  1.00 53.83      1FSS2375
ATOM   2151  CD2 TRP A 279      10.491  26.601  27.772  1.00 53.83      1FSS2376
ATOM   2152  NE1 TRP A 279      11.464  24.759  26.937  1.00 53.83      1FSS2377
ATOM   2153  CE2 TRP A 279      11.429  26.126  26.830  1.00 53.83      1FSS2378
ATOM   2154  CE3 TRP A 279      10.266  27.981  27.864  1.00 53.83      1FSS2379
ATOM   2155  CZ2 TRP A 279      12.138  26.985  25.988  1.00 53.83      1FSS2380
ATOM   2156  CZ3 TRP A 279      10.969  28.830  27.029  1.00 53.83      1FSS2381
ATOM   2157  CH2 TRP A 279      11.893  28.330  26.103  1.00 53.83      1FSS2382
ATOM   4499  N   MET B  33       4.078  25.286  24.096  1.00 17.83      1FSS4724
ATOM   4500  CA  MET B  33       5.102  24.475  23.450  1.00 17.83      1FSS4725
ATOM   4501  C   MET B  33       4.918  22.975  23.595  1.00 17.83      1FSS4726
ATOM   4502  O   MET B  33       4.592  22.477  24.676  1.00 17.83      1FSS4727
ATOM   4503  CB  MET B  33       6.455  24.846  24.022  1.00 17.83      1FSS4728
ATOM   4504  CG  MET B  33       6.516  24.705  25.534  1.00 17.83      1FSS4729
ATOM   4505  SD  MET B  33       7.770  25.770  26.209  1.00 17.83      1FSS4730
ATOM   4506  CE  MET B  33       8.499  26.448  24.607  1.00 17.83      1FSS4731
END                                                                     1FSS5000
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
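
If you would rather cut the residues out of the full PDB entry with a script than by hand, the following is a minimal Python sketch. It assumes the standard fixed-column PDB layout (chain identifier in column 22, residue number in columns 23-26). The function name extract_residues is made up for this illustration and is not part of the SPASM package.

```python
def extract_residues(pdb_lines, wanted):
    """Return the ATOM/HETATM records whose (chain, residue number) is in `wanted`.

    `wanted` is a set of (chain_id, residue_number) tuples, e.g. {("A", 279)}.
    Assumes fixed-column PDB records: chain ID in column 22 and the
    residue number in columns 23-26 (counting from 1).
    """
    selected = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip REMARK, END, etc.
        chain = line[21]
        resnum = int(line[22:26])
        if (chain, resnum) in wanted:
            selected.append(line.rstrip("\n"))
    return selected


# Small demonstration on two records copied from the example above.
EXAMPLE_LINES = [
    "ATOM   2144  N   TRP A 279      10.520  24.780  31.275  1.00 53.83      1FSS2369",
    "ATOM   4499  N   MET B  33       4.078  25.286  24.096  1.00 17.83      1FSS4724",
    "END",
]
motif = extract_residues(EXAMPLE_LINES, {("A", 279), ("B", 33)})
print(len(motif), "records selected")
```

To use this on a real file, pass the open file object of your local copy of 1FSS as pdb_lines and write the returned records, followed by an END record, to 1fss.pdb.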
   

2 - Run SPASM

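Make sure that SPASM is installed, and that the PDB file names in the SPASM database have been changed so that they point to your local copy of the PDB. Then start the program and answer its questions as shown below (an empty answer accepts the default value shown in parentheses):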

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 % 713 gerard sarek 21:05:46 spasm/sites > run spasm
[...]
 SPASM database file ? (/home/gerard/lib/spasm.lib) 
[...]
 Which PDB file ? (0xyz.pdb) 1fss.pdb
[...]
 Four-character ID for this run ? (1FSS) 
[...]
 Max superpositioning RMSD ? (   1.500) 
[...]
 Max CA-CA distance mismatch ? (   2.000) 3
 Max SC-SC distance mismatch ? (   2.000) 1
[...]
 Resolution cut-off (A) ? ( 999.900) 
[...]
[do not allow any residue substitutions :]
 Substitution option ? (       3) 1
[...]
 Conserve sequence directionality ? (N) 
[...]
 Conserve neighbouring residues ? (N) 
[...]
 Conserve sequence gaps ? (N) 
[...]
 Print distance matrices ? (N) 
[...]
 Print operators ? (N) 
[...]
 Extensive output ? (N) 
[...]
[since we have only 2 residues, we *must* include MC and SC :]
 0=SC, 1=MC+SC, 2=MC ? (       1) 
[...]
[since we will use SAVANT, we don't need the O macro file :]
 O macro and operator file ? (N) 
[...]
 SAVANT input file ? (Y) 
 SAVANT input file ? (1fss.savant) 
[...]
 LSQMAN input file ? (N) 
[...]
 MSEQPRO sequence file ? (N) 
[...]
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   
SPASM now compares the proteins in its database to your motif. All Trp-Met pairs that are in roughly the same relative spatial positions as those in 1FSS are listed as "hits" (note that some proteins yield more than one hit). In about half a minute (36.6 CPU seconds in this run), SPASM finds 346 hits in 283 distinct PDB entries:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
[...]
 ==> HIT  : (8FAB) 
 Compound : (FAB FRAGMENT FROM HUMAN IMMUNOGLOBULIN IGG1 (LAMBDA, HIL)) 
 File     : (/nfs/pdb/full/8fab.pdb) 
 Residues : (        428) 
 Resol (A): (   1.800) 

 MATCH with RMSD   1.28 A for   4 pseudo-atoms
 TRP A 279  <---> TRP D  36  =  15.0
 MET B  33  <---> MET D  81  =   6.0
 Total BLOSUM-45 score : (  21.000) 

 --------------------------------------------------------

 ==> HIT  : (9LDT) 
 Compound : (LACTATE DEHYDROGENASE (E.C.1.1.1.27) COMPLEX WITH NADH AND 
  OXAMATE) 
 File     : (/nfs/pdb/full/9ldt.pdb) 
 Residues : (        331) 
 Resol (A): (   2.000) 

 MATCH with RMSD   0.92 A for   4 pseudo-atoms
 TRP A 279  <---> TRP A 150  =  15.0
 MET B  33  <---> MET A 274  =   6.0
 Total BLOSUM-45 score : (  21.000) 

 --------------------------------------------------------

 Nr of proteins found : (        283) 
 Nr of proteins tried : (       2590) 
 Total number of hits : (        346) 
 CPU total/user/sys :      36.6      35.8       0.8

 Run again ? (Y) n
[...]
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
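
With several hundred hits, it can be handy to summarise a saved copy of the SPASM output with a script. The Python sketch below pulls the PDB code and RMSD out of the "==> HIT" and "MATCH with RMSD" lines shown above. It is only an illustration: the function name summarise_hits is invented here, and the sketch assumes that every MATCH line belongs to the most recent HIT header, which may not hold for every SPASM version.

```python
import re

HIT_RE = re.compile(r"==> HIT\s*:\s*\((\w{4})\)")
MATCH_RE = re.compile(r"MATCH with RMSD\s+([\d.]+) A for\s+(\d+) pseudo-atoms")


def summarise_hits(log_text):
    """Return one (pdb_id, rmsd, n_pseudo_atoms) tuple per MATCH line."""
    hits = []
    current_id = None
    for line in log_text.splitlines():
        header = HIT_RE.search(line)
        if header:
            current_id = header.group(1)
            continue
        match = MATCH_RE.search(line)
        if match and current_id is not None:
            hits.append((current_id, float(match.group(1)), int(match.group(2))))
    return hits


# Excerpt copied from the SPASM output above.
EXCERPT = """
 ==> HIT  : (8FAB)
 MATCH with RMSD   1.28 A for   4 pseudo-atoms
 ==> HIT  : (9LDT)
 MATCH with RMSD   0.92 A for   4 pseudo-atoms
"""

for pdb_id, rmsd, n_atoms in sorted(summarise_hits(EXCERPT), key=lambda h: h[1]):
    print(pdb_id, rmsd, n_atoms)
```

Sorting on the RMSD gives a quick first impression, but keep in mind that these are RMSDs for pseudo-atoms only; SAVANT (next section) is the tool for a proper all-atom comparison.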
   

3 - Run SAVANT

Now, if you had produced an O macro to look at the hits in O, you would have noticed that many of the hits are rather poor: the Trp and Met are at similar distances, but their interaction is quite different from the one seen in 1FSS. Moreover, O's database is quite likely to fill up before the macro has read all 283 proteins, and switching 346 O graphics objects on and off is unwieldy. This is where SAVANT comes in! It enables you to scrutinise the hits in more detail by doing a superpositioning on all (main-chain and/or side-chain) atoms. Here we will use only the side-chain atoms (we are not at all interested in the main chain), which takes about a minute of CPU time.

Note: SAVANT expects a sub-directory called savant to exist; if it does not, SAVANT will refuse to run. In that case, create the directory by typing: mkdir savant

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 % 715 gerard sarek 21:05:46 spasm/sites > run savant
[...]
 SAVANT library file ? (/home/gerard/lib/savant.lib) 
[...]
 SAVANT input file ? (savant.inp) 1fss.savant 
 SAVANT O macro    ? (savant.omac) 
 SAVANT ODB file   ? (savant.odb) 
 O2D plot file     ? (savant.plt) 
[...]
[do the detailed superpositioning using *only* the side-chain atoms:]
 Select mode: 0 = SC, 1 = MC + SC, 2 = MC
 Mode (0/1/2) ? (       1) 0
[...]
 Hit PDB file : (/nfs/pdb/full/8fab.pdb) 
 Residues in hit : (       2) 
 Hit will go to PDB file : (savant/1fss_8fab_1.pdb) 
 Atoms in residues : (      14        8) 
 Atoms in common : (      14) 
 RMS distance : (   1.093) 

 Hit PDB file : (/nfs/pdb/full/9ldt.pdb) 
 Residues in hit : (       2) 
 Hit will go to PDB file : (savant/1fss_9ldt_1.pdb) 
 Atoms in residues : (      14        8) 
 Atoms in common : (      14) 
 RMS distance : (   1.202) 
[...]
 CPU-time taken :
 User    -     60.9 Sys    -      7.9 Total   -     68.8
[...]
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

4 - Run DEJANA

Next we can use DEJANA to select only the best hits for display in O:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 % 716 gerard sarek 21:05:46 spasm/sites > run dejana
[...]
 O macro (DEJAVU/LSQMAN/SPASM/RIGOR/SAVANT) ? (lsqman.omac) savant.omac

 Reading hits ...
 #     1 ID 119l1  Nres    14 RMSD   1.43 A
 #     2 ID 119l2  Nres    14 RMSD   1.35 A
[...]
 #   345 ID 8fab1  Nres    14 RMSD   1.09 A
 #   346 ID 9ldt1  Nres    14 RMSD   1.20 A

 Nr of hits (> 0 atoms/residues/SSEs) : (        346) 

 ------------------------------------------

 Min nr of matched atoms/residues/SSEs   ? (          1) 
 Max RMSD of matched atoms/residues/SSEs ? ( 999.990) 

 Sorting hits ...

 Nr of hits left : (        346) 

 #     1 ID 1req1  Nres    14 RMSD   0.54 A
 #     2 ID 1cto1  Nres    14 RMSD   0.57 A
 #     3 ID 1aro1  Nres    14 RMSD   0.69 A
 #     4 ID 2hbg1  Nres    14 RMSD   0.69 A
 #     5 ID 1ppf1  Nres    14 RMSD   0.74 A
 #     6 ID 4mdh1  Nres    14 RMSD   0.76 A
 #     7 ID 1pnk1  Nres    14 RMSD   0.77 A
 #     8 ID 1kmm1  Nres    14 RMSD   0.80 A
 #     9 ID 1phc1  Nres    14 RMSD   0.80 A
 #    10 ID 3fru1  Nres    14 RMSD   0.80 A
 #    11 ID 1bdm1  Nres    14 RMSD   0.81 A
 #    12 ID 1bak2  Nres    14 RMSD   0.82 A
[...]
 #   344 ID 1yge2  Nres    14 RMSD   2.45 A
 #   345 ID 1ihp1  Nres    14 RMSD   2.47 A
 #   346 ID 1gow1  Nres    14 RMSD   2.57 A

 Select one of the following options:
 0 = re-enter criteria and re-sort
 1 = write new O macro with current hits
 2 = quit program without writing new O macro
 Option (0, 1, 2) ? (          0) 

 ------------------------------------------

 Min nr of matched atoms/residues/SSEs   ? (          1) 10
 Max RMSD of matched atoms/residues/SSEs ? ( 999.990) 0.7

 Sorting hits ...

 Nr of hits left : (          4) 

 #     1 ID 1req1  Nres    14 RMSD   0.54 A
 #     2 ID 1cto1  Nres    14 RMSD   0.57 A
 #     3 ID 1aro1  Nres    14 RMSD   0.69 A
 #     4 ID 2hbg1  Nres    14 RMSD   0.69 A

 Select one of the following options:
 0 = re-enter criteria and re-sort
 1 = write new O macro with current hits
 2 = quit program without writing new O macro
 Option (0, 1, 2) ? (          0) 1
 New O macro file ? (dejana.omac) 

 Writing hits ...

 Processing PDB code : (1req1) 
 Processing PDB code : (1cto1) 
 Processing PDB code : (1aro1) 
 Processing PDB code : (2hbg1) 

 New O macro written ...
[...]
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
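
The selection that DEJANA performs here boils down to a filter on two criteria followed by a sort on RMSD. The Python sketch below mimics that on hit lines in the format shown above. It is an illustration only, not DEJANA's actual code: the function names are invented, and whether DEJANA treats the RMSD limit as inclusive is an assumption.

```python
import re

HIT_LINE = re.compile(
    r"#\s*\d+\s+ID\s+(\S+)\s+(?:Nres|Nmatch)\s+(\d+)\s+RMSD\s+([\d.]+)\s+A"
)


def parse_hits(listing):
    """Turn DEJANA-style hit lines into (hit_id, n_matched, rmsd) tuples."""
    hits = []
    for line in listing.splitlines():
        m = HIT_LINE.search(line)
        if m:
            hits.append((m.group(1), int(m.group(2)), float(m.group(3))))
    return hits


def select_hits(hits, min_matched=1, max_rmsd=999.99):
    """Keep hits that satisfy both criteria, sorted by increasing RMSD.

    Assumption: both limits are inclusive.
    """
    kept = [h for h in hits if h[1] >= min_matched and h[2] <= max_rmsd]
    return sorted(kept, key=lambda h: h[2])


# A few lines copied from the sorted DEJANA listing above.
LISTING = """
 #     1 ID 1req1  Nres    14 RMSD   0.54 A
 #     2 ID 1cto1  Nres    14 RMSD   0.57 A
 #     3 ID 1aro1  Nres    14 RMSD   0.69 A
 #     4 ID 2hbg1  Nres    14 RMSD   0.69 A
 #     5 ID 1ppf1  Nres    14 RMSD   0.74 A
 #   346 ID 1gow1  Nres    14 RMSD   2.57 A
"""

best = select_hits(parse_hits(LISTING), min_matched=10, max_rmsd=0.7)
print([hit_id for hit_id, _, _ in best])
```

With the criteria used in the DEJANA run above (at least 10 matched atoms, RMSD at most 0.7), this keeps the same four hits.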
   

5 - Run O

The result of running DEJANA is an O macro that displays only the top 4 hits (all of which have an RMSD below 0.7 Å for 14 side-chain atoms). Simply start up O and execute the macro:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 % 717 gerard sarek 21:05:46 spasm/sites > ono
[...]
 As4> File not found in path: on_startup
 As4> Indirect file does not exist.
@dejana.omac
 As4> Macro in computer file-system.
 Heap>  Created by SAVANT V. 990907/0.3 at Tue Sep 7 21:24:55 1999 for gerard
 Sam> File type is PDB
 Sam>  Database compressed.
[...]
 Sam> Molecule 2HBG1 contained 2 residues and 22 atoms
 Sam>  PDB          is not a visible command.
 Mol>  Object not in list, cannot delete it: 2HBG1
 Mol> No connectivity Db for 2HBG1
 Mol>  Database compressed.
 mol connectivity is          30
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   
Hint: if you want the hits displayed as ball-and-stick models as well, do the following before executing the macro:
      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
sed -e 's/! ske/ske/' dejana.omac > q && mv q dejana.omac
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Hint: don't forget that your savant sub-directory still contains one little PDB file for each of the 346 hits ... You may want to remove some of those.
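
To decide which of those little PDB files can go, you can compare the file names with the hits you kept in DEJANA. The Python sketch below only computes the list of candidate files; it does not delete anything. The naming rule it uses (run ID, then PDB code, then hit number, as in savant/1fss_8fab_1.pdb for hit 8fab1) is inferred from the examples on this page and is an assumption, as are the function names.

```python
def hit_file(run_id, hit_id):
    """Guess the SAVANT file name for a DEJANA hit ID such as '8fab1'.

    Assumption: the first four characters of the hit ID are the PDB code
    and the remainder is the hit number within that entry.
    """
    code, number = hit_id[:4], hit_id[4:]
    return f"{run_id.lower()}_{code.lower()}_{number}.pdb"


def files_to_remove(all_files, run_id, kept_hit_ids):
    """Return the .pdb file names that do not belong to any kept hit."""
    keep = {hit_file(run_id, hit_id) for hit_id in kept_hit_ids}
    return sorted(f for f in all_files if f.endswith(".pdb") and f not in keep)


# Illustration with file names that follow the pattern seen above.
present = ["1fss_1req_1.pdb", "1fss_8fab_1.pdb", "1fss_9ldt_1.pdb"]
print(files_to_remove(present, "1FSS", ["1req1", "1cto1", "1aro1", "2hbg1"]))
```

In practice you would fill the list of present files from the contents of the savant sub-directory, inspect the result, and only then remove the files.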

Does your result look something like this ?

Picture

6 - "Fuzzy motif matching"

If you have a motif, but you want to allow variations (e.g., Arg could also be Lys, Tyr could also be Phe, etc.), you can do "fuzzy matching" in SPASM. As an example, prepare a PDB file containing Arg 111, Arg 132, and Tyr 134 of PDB entry 1CBS (these residues constitute the fatty-acid-binding motif in cellular retinoic-acid-binding protein).

Start up SPASM and answer the questions, until you get to:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 You may opt to allow substitutions of certain
 residue types.  You have the following options:
  (1) Do not allow substitutions
  (2) Only allow D/E, N/Q, L/I, F/Y and R/K
  (3) Use BLOSUM-45 to decide
  (4) User-defined substitutions
 Substitution option ? (       3) 2
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

In this case, we only need to allow F/Y and R/K substitutions, so we can select option 2. If you prefer, you can use the BLOSUM-45 substitution matrix instead (option 3). SPASM then lists the matrix values for those residue types that occur in your motif. You can then decide on a cut-off value, and all residue types whose matrix value is not less than your cut-off value will be allowed:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 You may opt to allow substitutions of certain
 residue types.  You have the following options:
  (1) Do not allow substitutions
  (2) Only allow D/E, N/Q, L/I, F/Y and R/K
  (3) Use BLOSUM-45 to decide
  (4) User-defined substitutions
 Substitution option ? (       3) 3

      ALA  ARG  ASN  ASP  CYS  GLU  GLN  GLY  HIS  ILE  LEU  LYS  MET  PHE  PRO  SER  THR  TRP  TYR  VAL
 ARG -2.0  7.0  0.0 -1.0 -3.0  1.0  0.0 -2.0  0.0 -3.0 -2.0  3.0 -1.0 -2.0 -2.0 -1.0 -1.0 -2.0 -1.0 -2.0
 TYR -2.0 -1.0 -2.0 -2.0 -3.0 -1.0 -2.0 -3.0  2.0  0.0  0.0 -1.0  0.0  3.0 -3.0 -2.0 -1.0  3.0  8.0 -1.0

 Statistics for entire BLOSUM-45 matrix:
 Average value : (  -0.918) 
 St. dev.      : (   2.467) 
 Minimum       : (  -5.000) 
 Maximum       : (  15.000) 
 Matrix value cut-off ? (   3.000) 

 Allowed substitutions :
 ARG (ARG LYS) 
 TYR (TYR PHE TRP) 
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
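
The effect of the cut-off is easy to reproduce by hand. The Python sketch below applies the "not less than the cut-off" rule to the two BLOSUM-45 rows printed above. The function name is invented for this illustration, and the choice to always list the residue type itself first is an assumption made to mirror the printout.

```python
RESIDUES = ("ALA ARG ASN ASP CYS GLU GLN GLY HIS ILE "
            "LEU LYS MET PHE PRO SER THR TRP TYR VAL").split()

# The two matrix rows exactly as printed by SPASM above.
BLOSUM45_ROWS = {
    "ARG": [-2, 7, 0, -1, -3, 1, 0, -2, 0, -3, -2, 3, -1, -2, -2, -1, -1, -2, -1, -2],
    "TYR": [-2, -1, -2, -2, -3, -1, -2, -3, 2, 0, 0, -1, 0, 3, -3, -2, -1, 3, 8, -1],
}


def allowed_substitutions(residue, cutoff):
    """List the residue types whose matrix value is >= cutoff.

    The residue type itself is always listed first (an assumption).
    """
    row = BLOSUM45_ROWS[residue]
    others = [name for name, value in zip(RESIDUES, row)
              if value >= cutoff and name != residue]
    return [residue] + others


for motif_residue in ("ARG", "TYR"):
    print(motif_residue, allowed_substitutions(motif_residue, 3.0))
```

With the default cut-off of 3.0 this gives ARG/LYS for Arg and TYR/PHE/TRP for Tyr, as in the SPASM output. Lowering the cut-off to 2.0 would additionally allow His in place of Tyr.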
   

Alternatively, you can indicate yourself which substitutions will be allowed (and SPASM will allow you to do silly things here, e.g. to allow Ala-Trp substitutions, if you so insist):

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 You may opt to allow substitutions of certain
 residue types.  You have the following options:
  (1) Do not allow substitutions
  (2) Only allow D/E, N/Q, L/I, F/Y and R/K
  (3) Use BLOSUM-45 to decide
  (4) User-defined substitutions
 Substitution option ? (       3) 4

 Enter allowed substitutions in 3-letter code:
 Which types to allow for ARG ? (ARG) arg lys
 Which types to allow for TYR ? (TYR) tyr phe
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Once you have decided what to do, run SPASM as usual. (Note: you may want to be extra generous with the allowed mismatch distances.)

Run SAVANT (version 1.0 or later) and make sure to feed it a library file that describes which atom types in different residue types you consider matchable (except for CB atoms, which SAVANT always considers matchable), e.g.:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
[...]
ALT PHE  CG  TYR  CG
ALT PHE  CD1 TYR  CD1
ALT PHE  CD1 TYR  CD2
ALT PHE  CE1 TYR  CE1
ALT PHE  CE1 TYR  CE2
ALT PHE  CD2 TYR  CD2
ALT PHE  CD2 TYR  CD1
ALT PHE  CE2 TYR  CE1
ALT PHE  CE2 TYR  CE2
ALT PHE  CZ  TYR  CZ
!
ALT PHE  CG  TRP  CG
ALT PHE  CG  HIS  CG
ALT TYR  CG  TRP  CG
ALT TYR  CG  HIS  CG
ALT TRP  CG  HIS  CG
!
ALT ARG  CG  LYS  CG
ALT ARG  CD  LYS  CD
[...]
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
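
To check what a library file allows before running SAVANT, you can read the ALT lines into a lookup table. The Python sketch below does this for lines in the format shown above. It is only an illustration: the function name is invented, and treating each ALT line as valid in both directions is an assumption (it is suggested by the output further down, where a Tyr in the pattern is matched to a Phe in the hit, but it is not documented here).

```python
from collections import defaultdict


def parse_alt_lines(library_text, symmetric=True):
    """Map (residue, atom) to the set of (residue, atom) pairs it may match."""
    table = defaultdict(set)
    for line in library_text.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] != "ALT":
            continue  # skips "!" separator lines and anything else
        _, res1, atom1, res2, atom2 = fields
        table[(res1, atom1)].add((res2, atom2))
        if symmetric:
            table[(res2, atom2)].add((res1, atom1))
    return table


# Lines copied from the library excerpt above.
LIBRARY_EXCERPT = """
ALT PHE  CG  TYR  CG
ALT PHE  CD1 TYR  CD1
ALT PHE  CD1 TYR  CD2
ALT PHE  CD2 TYR  CD2
ALT PHE  CD2 TYR  CD1
!
ALT ARG  CG  LYS  CG
ALT ARG  CD  LYS  CD
"""

alternatives = parse_alt_lines(LIBRARY_EXCERPT)
print(sorted(alternatives[("TYR", "CD1")]))
```

Atoms with more than one entry in the table (such as Tyr CD1 here) are the ones for which the matching is ambiguous.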
   

Note that with our motif, any lysine that is matched to one of our arginines will be included in the least-squares superpositioning with not only its main-chain atoms (if you include those, that is), but also with its CB (standard behaviour of SAVANT) and its CG and CD atoms (read from the library file).

Also note that the matching of Tyr and Phe is ambiguous for several atom types (e.g., Phe CD1 can be matched with Tyr CD1 and CD2). SAVANT will use the match that gives the lowest overall RMSD. However, SAVANT does not try all possible permutations! This means that the same atom can sometimes be used twice (in the first hit in the output below, Tyr CD1 and CD2 are both matched to Phe CD2). You may also occasionally get "silly" matches such as CD1-CD1/CD2-CD2 combined with CE1-CE2/CE2-CE1. Usually things work fine, though, and when they do not, it tends to be for the poorer hits (compare the second hit in the output below with the first).

Now run SAVANT, and only include the side-chain atoms in the superpositioning:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 SAVANT library file ? (/home/gerard/lib/savant.lib) 
 Nr of AMB lines : (      11) 
 Nr of ALT lines : (      37) 

 SAVANT input file ? (savant.inp) cra2.savant
 SAVANT O macro    ? (savant.omac) 
 SAVANT ODB file   ? (savant.odb) 
 O2D plot file     ? (savant.plt) 

 Select mode: 0 = SC, 1 = MC + SC, 2 = MC
 Mode (0/1/2) ? (       1) 0
 MODE : (       0) 
 SC only

 Pattern PDB file : (cra2.pdb) 
 Residues in pattern : (       3) 
 Atoms in residues : (      11       11       12) 
 Total : (      34) 
 Ambiguous   : (       8) 
 Alternative : (      10) 

 Hit PDB file : (/nfs/pdb/full/169l.pdb) 
 Residues in hit : (       3) 
 Hit will go to PDB file : (savant/cra2_169l_1.pdb) 
 Atoms in residues : (       9       11       11) 
 ==> Match ARG CB  <-> LYS CB 
 ==> Alternative ? ARG CG 
 ==> Alternative ? ARG CD 
 ==> Not matchable ARG NE 
 ==> Not matchable ARG CZ 
 ==> Not matchable ARG NH1
 ==> Not matchable ARG NH2
 ==> Match ARG CB  <-> ARG CB 
 ==> Match ARG CG  <-> ARG CG 
 ==> Match ARG CD  <-> ARG CD 
 ==> Match ARG NE  <-> ARG NE 
 ==> Match ARG CZ  <-> ARG CZ 
 ==> Match ARG NH1 <-> ARG NH1
 ==> Match ARG NH2 <-> ARG NH2
 ==> Match TYR CB  <-> PHE CB 
 ==> Alternative ? TYR CG 
 ==> Alternative ? TYR CD1
 ==> Alternative ? TYR CE1
 ==> Alternative ? TYR CD2
 ==> Alternative ? TYR CE2
 ==> Alternative ? TYR CZ 
 ==> Not matchable TYR OH 
 Atoms in common : (       9) 
 RMS distance : (   1.145) 
 Nr ambiguous : (       2) 
 ARG   132   NH1 <->  NH2 => RMSD (A) : (   1.404) 
 Lowest RMSD  : (   1.145) 
 Nr alternatives : (       8) 
 ARG   111   CG  <-> LYS A  60   CG  => RMSD (A) : (   1.228) 
 ARG   111   CD  <-> LYS A  60   CD  => RMSD (A) : (   1.425) 
 TYR   134   CG  <-> PHE A   4   CG  => RMSD (A) : (   1.440) 
 TYR   134   CD1 <-> PHE A   4   CD1 => RMSD (A) : (   1.526) 
 TYR   134   CD1 <-> PHE A   4   CD2 => RMSD (A) : (   1.418) 
 TYR   134   CE1 <-> PHE A   4   CE1 => RMSD (A) : (   1.648) 
 TYR   134   CE1 <-> PHE A   4   CE2 => RMSD (A) : (   1.452) 
 TYR   134   CD2 <-> PHE A   4   CD1 => RMSD (A) : (   1.698) 
 TYR   134   CD2 <-> PHE A   4   CD2 => RMSD (A) : (   1.493) 
 TYR   134   CE2 <-> PHE A   4   CE1 => RMSD (A) : (   1.884) 
 TYR   134   CE2 <-> PHE A   4   CE2 => RMSD (A) : (   1.637) 
 TYR   134   CZ  <-> PHE A   4   CZ  => RMSD (A) : (   1.791) 
 Final RMSD   : (   1.791) 
 Nr matches   : (      17) 

[...]

 Hit PDB file : (/nfs/pdb/full/1waj.pdb) 
 Residues in hit : (       3) 
 Hit will go to PDB file : (savant/cra2_1waj_1.pdb) 
 Atoms in residues : (       9       11       11) 
 ==> Match ARG CB  <-> LYS CB 
 ==> Alternative ? ARG CG 
 ==> Alternative ? ARG CD 
 ==> Not matchable ARG NE 
 ==> Not matchable ARG CZ 
 ==> Not matchable ARG NH1
 ==> Not matchable ARG NH2
 ==> Match ARG CB  <-> ARG CB 
 ==> Match ARG CG  <-> ARG CG 
 ==> Match ARG CD  <-> ARG CD 
 ==> Match ARG NE  <-> ARG NE 
 ==> Match ARG CZ  <-> ARG CZ 
 ==> Match ARG NH1 <-> ARG NH1
 ==> Match ARG NH2 <-> ARG NH2
 ==> Match TYR CB  <-> PHE CB 
 ==> Alternative ? TYR CG 
 ==> Alternative ? TYR CD1
 ==> Alternative ? TYR CE1
 ==> Alternative ? TYR CD2
 ==> Alternative ? TYR CE2
 ==> Alternative ? TYR CZ 
 ==> Not matchable TYR OH 
 Atoms in common : (       9) 
 RMS distance : (   0.703) 
 Nr ambiguous : (       2) 
 ARG   132   NH1 <->  NH2 => RMSD (A) : (   1.239) 
 Lowest RMSD  : (   0.703) 
 Nr alternatives : (       8) 
 ARG   111   CG  <-> LYS   240   CG  => RMSD (A) : (   0.738) 
 ARG   111   CD  <-> LYS   240   CD  => RMSD (A) : (   0.774) 
 TYR   134   CG  <-> PHE   266   CG  => RMSD (A) : (   0.761) 
 TYR   134   CD1 <-> PHE   266   CD1 => RMSD (A) : (   0.907) 
 TYR   134   CD1 <-> PHE   266   CD2 => RMSD (A) : (   0.984) 
 TYR   134   CE1 <-> PHE   266   CE1 => RMSD (A) : (   1.120) 
 TYR   134   CE1 <-> PHE   266   CE2 => RMSD (A) : (   1.163) 
 TYR   134   CD2 <-> PHE   266   CD1 => RMSD (A) : (   1.173) 
 TYR   134   CD2 <-> PHE   266   CD2 => RMSD (A) : (   1.094) 
 TYR   134   CE2 <-> PHE   266   CE1 => RMSD (A) : (   1.190) 
 TYR   134   CE2 <-> PHE   266   CE2 => RMSD (A) : (   1.066) 
 TYR   134   CZ  <-> PHE   266   CZ  => RMSD (A) : (   1.128) 
 Final RMSD   : (   1.128) 
 Nr matches   : (      17) 

[...]

 Hit PDB file : (/nfs/pdb/full/7aat.pdb) 
 Residues in hit : (       3) 
 Hit will go to PDB file : (savant/cra2_7aat_1.pdb) 
 Atoms in residues : (      11        9       12) 
 ==> Match ARG CB  <-> ARG CB 
 ==> Match ARG CG  <-> ARG CG 
 ==> Match ARG CD  <-> ARG CD 
 ==> Match ARG NE  <-> ARG NE 
 ==> Match ARG CZ  <-> ARG CZ 
 ==> Match ARG NH1 <-> ARG NH1
 ==> Match ARG NH2 <-> ARG NH2
 ==> Match ARG CB  <-> LYS CB 
 ==> Alternative ? ARG CG 
 ==> Alternative ? ARG CD 
 ==> Not matchable ARG NE 
 ==> Not matchable ARG CZ 
 ==> Not matchable ARG NH1
 ==> Not matchable ARG NH2
 ==> Match TYR CB  <-> TYR CB 
 ==> Match TYR CG  <-> TYR CG 
 ==> Match TYR CD1 <-> TYR CD1
 ==> Match TYR CE1 <-> TYR CE1
 ==> Match TYR CD2 <-> TYR CD2
 ==> Match TYR CE2 <-> TYR CE2
 ==> Match TYR CZ  <-> TYR CZ 
 ==> Match TYR OH  <-> TYR OH 
 Atoms in common : (      16) 
 RMS distance : (   2.209) 
 Nr ambiguous : (       6) 
 ARG   111   NH1 <->  NH2 => RMSD (A) : (   2.212) 
 TYR   134   CD1 <->  CD2 => RMSD (A) : (   2.123) 
 TYR   134   CE1 <->  CE2 => RMSD (A) : (   2.020) 
 Lowest RMSD  : (   2.020) 
 Nr alternatives : (       2) 
 ARG   132   CG  <-> LYS A 258   CG  => RMSD (A) : (   2.037) 
 ARG   132   CD  <-> LYS A 258   CD  => RMSD (A) : (   2.036) 
 Final RMSD   : (   2.036) 
 Nr matches   : (      18) 
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
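
The double use of an atom mentioned earlier can be read off the trial RMSDs in the output above. The Python sketch below takes the trial values printed for the Tyr 134 ring atoms in the 169L and 1WAJ hits, picks the lowest value for each pattern atom independently, and reports hit atoms that end up being used more than once. It only post-processes numbers from the printout; it is not SAVANT's code, and the function names are invented.

```python
from collections import Counter

# (pattern atom, hit atom, trial RMSD) copied from the output above.
TRIALS_169L = [
    ("TYR CD1", "PHE CD1", 1.526), ("TYR CD1", "PHE CD2", 1.418),
    ("TYR CE1", "PHE CE1", 1.648), ("TYR CE1", "PHE CE2", 1.452),
    ("TYR CD2", "PHE CD1", 1.698), ("TYR CD2", "PHE CD2", 1.493),
    ("TYR CE2", "PHE CE1", 1.884), ("TYR CE2", "PHE CE2", 1.637),
]
TRIALS_1WAJ = [
    ("TYR CD1", "PHE CD1", 0.907), ("TYR CD1", "PHE CD2", 0.984),
    ("TYR CE1", "PHE CE1", 1.120), ("TYR CE1", "PHE CE2", 1.163),
    ("TYR CD2", "PHE CD1", 1.173), ("TYR CD2", "PHE CD2", 1.094),
    ("TYR CE2", "PHE CE1", 1.190), ("TYR CE2", "PHE CE2", 1.066),
]


def pick_lowest(trials):
    """For each pattern atom, keep the hit atom with the lowest trial RMSD."""
    best = {}
    for pattern_atom, hit_atom, rmsd in trials:
        if pattern_atom not in best or rmsd < best[pattern_atom][1]:
            best[pattern_atom] = (hit_atom, rmsd)
    return {pattern: hit for pattern, (hit, _) in best.items()}


def reused_atoms(choice):
    """Return the hit atoms that were chosen for more than one pattern atom."""
    counts = Counter(choice.values())
    return sorted(atom for atom, n in counts.items() if n > 1)


print("169L:", reused_atoms(pick_lowest(TRIALS_169L)))
print("1WAJ:", reused_atoms(pick_lowest(TRIALS_1WAJ)))
```

For 169L both Phe CD2 and Phe CE2 are used twice, whereas for the better hit 1WAJ every ring atom gets its own partner. Trying all permutations would avoid the duplicates, at the cost of more superpositioning work per hit.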
   

Use DEJANA as usual to select the best hits (you may want to sort on RMSD first, and number of matched atoms second; this can be done using option 3 in DEJANA version 1.4 or later).

Does your result look something like this ?

Picture

Just to show that SPASM rocks, check out this (big) picture to see the hits in a different way ...

7 - Main-chain motifs

Let's see how SPASM works when we are only interested in a chunk of main-chain (e.g., an unusual or otherwise interesting loop). For this purpose, first generate a 7-residue bit of left-handed helix with MOLEMAN2:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 MOLEMAN2 > auto spink alpha 7
[...]
 MOLEMAN2 > xyz mirror x
[...]
 MOLEMAN2 > write lal7.pdb
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Now run SPASM:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 SPASM database file ? (/home/gerard/lib/spasm.lib) 
[...]
 Which PDB file ? (0xyz.pdb) lal7.pdb
[...]
 Four-character ID for this run ? (LAL7) 
[...]
 Max superpositioning RMSD ? (   1.500) 1.0
[...]
 Max CA-CA distance mismatch ? (   2.000) 
 Max SC-SC distance mismatch ? (   2.000) 
[...]
 Resolution cut-off (A) ? ( 999.900) 
[...]
 Max nr of residues ? (    9999)
[...]
 Substitution option ? (       3) 1
[...]
 Conserve sequence directionality ? (N) y
[...]
 Conserve neighbouring residues ? (N) y
[...]
 Conserve sequence gaps ? (N) 
[...]
 Print distance matrices ? (N) 
[...]
 Print operators ? (N) 
[...]
 Extensive output ? (N) 
[...]
 0=SC, 1=MC+SC, 2=MC ? (       1) 2
[...]
 O macro and operator file ? (N) y

 O macro file ? (lal7.omac) 

 O operator file ? (lal7.odb) 
[...]
 SAVANT input file ? (Y) 

 SAVANT input file ? (lal7.savant) 
[...]
 LSQMAN input file ? (N) 
[...]
 MSEQPRO sequence file ? (N) 
[...]
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

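Note that, in general, main-chain searches are quite a bit slower than searches for small motifs (the search below took about 374 s of CPU time, compared to about 37 s for the two-residue motif in section 2).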

The results may look as follows:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 ... Searching ...

 --------------------------------------------------------

 ==> HIT  : (1ACP) 
 Compound : (ACYL CARRIER PROTEIN (NMR, 2 STRUCTURES)) 
 File     : (/nfs/pdb/full/1acp.pdb) 
 Residues : (         77) 
 Resol (A): (  99.990) 

 MATCH with RMSD   0.85 A for   7 pseudo-atoms
 ALA Z   1  <---> LYS     9     -1.0
 ALA Z   2  <---> ILE    10     -1.0
 ALA Z   3  <---> ILE    11     -1.0
 ALA Z   4  <---> GLY    12      0.0
 ALA Z   5  <---> GLU    13     -1.0
 ALA Z   6  <---> GLN    14     -1.0
 ALA Z   7  <---> LEU    15     -1.0
 Total BLOSUM-45 score : (  -6.000) 

 --------------------------------------------------------

 ==> HIT  : (1AK0) 
 Compound : (MOL_ID: 1; MOLECULE: P1 NUCLEASE; CHAIN: NULL; EC: 3.1.30.1) 
 File     : (/nfs/pdb/full/1ak0.pdb) 
 Residues : (        264) 
 Resol (A): (   1.800) 

 MATCH with RMSD   0.62 A for   7 pseudo-atoms
 ALA Z   1  <---> ALA   129  =   5.0
 ALA Z   2  <---> TYR   130     -2.0
 ALA Z   3  <---> ALA   131  =   5.0
 ALA Z   4  <---> VAL   132      0.0
 ALA Z   5  <---> GLY   133      0.0
 ALA Z   6  <---> GLY   134      0.0
 ALA Z   7  <---> ASN   135     -1.0
 Total BLOSUM-45 score : (   7.000) 

 --------------------------------------------------------

 ==> HIT  : (1BD0) 
 Compound : (MOL_ID: 1; MOLECULE: ALANINE RACEMASE; CHAIN: A, B; EC: 
  5.1.1.1; BIOLOGICA) 
 File     : (/nfs/pdb/full/1bd0.pdb) 
 Residues : (        381) 
 Resol (A): (   1.600) 

 MATCH with RMSD   0.40 A for   7 pseudo-atoms
 ALA Z   1  <---> LYS A  39     -1.0
 ALA Z   2  <---> ALA A  40  =   5.0
 ALA Z   3  <---> ASN A  41     -1.0
 ALA Z   4  <---> ALA A  42  =   5.0
 ALA Z   5  <---> TYR A  43     -2.0
 ALA Z   6  <---> GLY A  44      0.0
 ALA Z   7  <---> HIS A  45     -2.0
 Total BLOSUM-45 score : (   4.000) 

 --------------------------------------------------------

 ==> HIT  : (2OMF) 
 Compound : (MOL_ID: 1; MOLECULE: MATRIX PORIN OUTER MEMBRANE PROTEIN F; 
  CHAIN: NULL; S) 
 File     : (/nfs/pdb/full/2omf.pdb) 
 Residues : (        340) 
 Resol (A): (   2.400) 

 MATCH with RMSD   0.77 A for   7 pseudo-atoms
 ALA Z   1  <---> ASN   141     -1.0
 ALA Z   2  <---> SER   142  +   1.0
 ALA Z   3  <---> ASN   143     -1.0
 ALA Z   4  <---> PHE   144     -2.0
 ALA Z   5  <---> PHE   145     -2.0
 ALA Z   6  <---> GLY   146      0.0
 ALA Z   7  <---> LEU   147     -1.0
 Total BLOSUM-45 score : (  -6.000) 

 --------------------------------------------------------

 ==> HIT  : (2SXL) 
 Compound : (MOL_ID: 1; MOLECULE: SEX-LETHAL PROTEIN; CHAIN: NULL; 
  FRAGMENT: RNA-BINDIN) 
 File     : (/nfs/pdb/full/2sxl.pdb) 
 Residues : (         88) 
 Resol (A): (  99.990) 

 MATCH with RMSD   0.92 A for   7 pseudo-atoms
 ALA Z   1  <---> PRO    82     -1.0
 ALA Z   2  <---> GLY    83      0.0
 ALA Z   3  <---> GLY    84      0.0
 ALA Z   4  <---> GLU    85     -1.0
 ALA Z   5  <---> SER    86  +   1.0
 ALA Z   6  <---> ILE    87     -1.0
 ALA Z   7  <---> LYS    88     -1.0
 Total BLOSUM-45 score : (  -3.000) 

 --------------------------------------------------------

 Skipped (resolution)  : (          0) 
 Skipped (nr residues) : (          0) 
 Nr of proteins tried  : (       2590) 
 Nr of proteins found  : (          5) 
 Total number of hits  : (          5) 
 CPU total/user/sys :     373.7     372.8       0.9
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Now, you can look at the hits by executing the lal7.omac macro in O. Doesn't look so bad, aye ?

If you want, you can also run SAVANT, of course. Let's see how good the matches are if we include all main-chain atoms and the CB atoms:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 SAVANT library file ? (/home/gerard/lib/savant.lib) 
 Nr of AMB lines : (      11) 
 Nr of ALT lines : (      37) 

 SAVANT input file ? (savant.inp) lal7.savant 
 SAVANT O macro    ? (savant.omac) 
 SAVANT ODB file   ? (savant.odb) 
 O2D plot file     ? (savant.plt) 

 Select mode: 0 = SC, 1 = MC + SC, 2 = MC
 Mode (0/1/2) ? (       2) 1
 MODE : (       1) 
 MC + SC
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Run DEJANA:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Nr of hits left : (          5) 

 #     1 ID 1bd01  Nmatch    34 RMSD   1.18 A
 #     2 ID 1ak01  Nmatch    33 RMSD   1.25 A
 #     3 ID 2omf1  Nmatch    34 RMSD   1.49 A
 #     4 ID 1acp1  Nmatch    34 RMSD   1.61 A
 #     5 ID 2sxl1  Nmatch    33 RMSD   1.64 A
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Now look at the hits again in O. In particular, check the match of 1BD0. It looks pretty good - except the CBs are pointing in different directions compared to our motif ! Explain why this is so (remember how the motif was generated ...) !
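
If you want to check your explanation numerically, one possible approach is to look at the handedness of the four atoms around a CA before and after a mirror operation. The Python sketch below computes a signed volume from the N, CA, C and CB positions of Trp A279 (coordinates taken from the 1FSS example in section 1) and repeats the calculation after negating all x coordinates. The function names are invented, and the mirror function is a plain coordinate flip that stands in for what "xyz mirror x" is assumed to do in MOLEMAN2.

```python
def signed_volume(n, ca, c, cb):
    """Triple product (N-CA) . ((C-CA) x (CB-CA)); its sign encodes handedness."""
    a = [n[i] - ca[i] for i in range(3)]
    b = [c[i] - ca[i] for i in range(3)]
    d = [cb[i] - ca[i] for i in range(3)]
    cross = (
        b[1] * d[2] - b[2] * d[1],
        b[2] * d[0] - b[0] * d[2],
        b[0] * d[1] - b[1] * d[0],
    )
    return sum(a[i] * cross[i] for i in range(3))


def mirror_x(point):
    """Reflect a point in the plane x = 0."""
    return (-point[0], point[1], point[2])


# Trp A279 of 1FSS (see section 1).
N = (10.520, 24.780, 31.275)
CA = (9.591, 25.831, 30.895)
C = (8.589, 26.213, 31.972)
CB = (8.936, 25.480, 29.566)

original = signed_volume(N, CA, C, CB)
mirrored = signed_volume(*(mirror_x(p) for p in (N, CA, C, CB)))
print("original and mirrored volumes have opposite signs:", original * mirrored < 0)
```

A mirror operation changes the sign of this volume, whereas the rotations and translations used in a least-squares superpositioning never do.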

If you only use the main-chain atoms, the results are as follows:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Nr of hits left : (          5) 

 #     1 ID 1bd01  Nmatch    28 RMSD   0.81 A
 #     2 ID 1ak01  Nmatch    28 RMSD   1.05 A
 #     3 ID 2omf1  Nmatch    28 RMSD   1.21 A
 #     4 ID 1acp1  Nmatch    28 RMSD   1.40 A
 #     5 ID 2sxl1  Nmatch    28 RMSD   1.56 A
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
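
The Nmatch values in the two DEJANA listings can be understood with a little arithmetic: each of the 7 residues contributes 4 main-chain atoms, and with the CB atoms included every residue in the hit except glycine adds one more. The Python sketch below applies this to the hit sequences from the SPASM output above. It assumes that the main-chain atoms are N, CA, C and O; the function name is invented for this illustration.

```python
MAIN_CHAIN = ("N", "CA", "C", "O")

# Hit sequences copied from the SPASM output above.
HITS = {
    "1bd0": "LYS ALA ASN ALA TYR GLY HIS".split(),
    "1ak0": "ALA TYR ALA VAL GLY GLY ASN".split(),
    "2omf": "ASN SER ASN PHE PHE GLY LEU".split(),
    "1acp": "LYS ILE ILE GLY GLU GLN LEU".split(),
    "2sxl": "PRO GLY GLY GLU SER ILE LYS".split(),
}


def expected_matches(hit_residues, include_cb=True):
    """Count the atoms that can be matched to a poly-Ala motif.

    Every residue has the four main-chain atoms; glycine has no CB.
    """
    count = len(MAIN_CHAIN) * len(hit_residues)
    if include_cb:
        count += sum(1 for name in hit_residues if name != "GLY")
    return count


for pdb_id, residues in HITS.items():
    print(pdb_id, expected_matches(residues), expected_matches(residues, include_cb=False))
```

This gives 34 for hits with one glycine (1BD0, 2OMF, 1ACP), 33 for hits with two (1AK0, 2SXL), and 28 for all hits when only main-chain atoms are used, in agreement with the two listings.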
   

If you include the CB atoms, the fit of 1BD0 and the motif should look something like this:

Picture

If you only use the main-chain atoms, the fit should look something like this:

Picture

8 - Your turn

If you want to play some more, here are some suggestions:

4TMN

The zinc-binding motif in thermolysin, His 142 - His 146 - Glu 166. Here are the hits I found with a side-chain RMSD of no more than 1.0 Å in SAVANT:

Picture

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Nr of hits left : (         17) 

 #     1 ID 8tln2  Nmatch    17 RMSD   0.19 A
 #     2 ID 1ezm2  Nmatch    17 RMSD   0.20 A
 #     3 ID 1npc2  Nmatch    17 RMSD   0.27 A
 #     4 ID 1prc2  Nmatch    17 RMSD   0.62 A
 #     5 ID 1toh2  Nmatch    17 RMSD   0.69 A
 #     6 ID 1prc3  Nmatch    17 RMSD   0.76 A
 #     7 ID 1prc4  Nmatch    17 RMSD   0.83 A
 #     8 ID 1aij2  Nmatch    17 RMSD   0.84 A
 #     9 ID 1prc7  Nmatch    17 RMSD   0.85 A
 #    10 ID 1prc10 Nmatch    17 RMSD   0.85 A
 #    11 ID 1fua5  Nmatch    17 RMSD   0.89 A
 #    12 ID 1guq6  Nmatch    17 RMSD   0.91 A
 #    13 ID 2mhr10 Nmatch    17 RMSD   0.91 A
 #    14 ID 1fua7  Nmatch    17 RMSD   0.93 A
 #    15 ID 1prc6  Nmatch    17 RMSD   0.93 A
 #    16 ID 1guq4  Nmatch    17 RMSD   0.94 A
 #    17 ID 2hmz9  Nmatch    17 RMSD   0.95 A
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

1CEL

The catalytic residues of cellobiohydrolase I, Glu 212 - Asp 214 - Glu 217. Here are the hits I found with a side-chain RMSD of no more than 1.2 Å in SAVANT:

Picture

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Nr of hits left : (          7) 

 #     1 ID 3ovw1  Nmatch    14 RMSD   0.30 A
 #     2 ID 2ayh1  Nmatch    14 RMSD   0.34 A
 #     3 ID 1gbg1  Nmatch    14 RMSD   0.41 A
 #     4 ID 3ovw2  Nmatch    14 RMSD   0.95 A
 #     5 ID 2ayh2  Nmatch    14 RMSD   1.02 A
 #     6 ID 1gbg2  Nmatch    14 RMSD   1.05 A
 #     7 ID 1dhk1  Nmatch    14 RMSD   1.19 A
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

2AFN

The intra-molecular copper-binding site in nitrite reductase, His 95 - Cys 136 - His 145 - Met 150. Here are the (30 !) hits I found with a side-chain RMSD of no more than 1.1 Å in SAVANT:

Picture

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Nr of hits left : (         30) 

 #     1 ID 1as81  Nmatch    18 RMSD   0.15 A
 #     2 ID 1nif1  Nmatch    18 RMSD   0.18 A
 #     3 ID 2cbp1  Nmatch    18 RMSD   0.29 A
 #     4 ID 1adw1  Nmatch    18 RMSD   0.33 A
 #     5 ID 1pmy1  Nmatch    18 RMSD   0.35 A
 #     6 ID 1ndr1  Nmatch    18 RMSD   0.40 A
 #     7 ID 1paz1  Nmatch    18 RMSD   0.40 A
 #     8 ID 1iuz1  Nmatch    18 RMSD   0.44 A
 #     9 ID 1ndt1  Nmatch    18 RMSD   0.44 A
 #    10 ID 1plc1  Nmatch    18 RMSD   0.45 A
 #    11 ID 1zia1  Nmatch    18 RMSD   0.46 A
 #    12 ID 7pcy1  Nmatch    18 RMSD   0.47 A
 #    13 ID 1pcs1  Nmatch    18 RMSD   0.48 A
 #    14 ID 1a8z1  Nmatch    18 RMSD   0.49 A
 #    15 ID 1rcy1  Nmatch    18 RMSD   0.49 A
 #    16 ID 1ag61  Nmatch    18 RMSD   0.50 A
 #    17 ID 2plt1  Nmatch    18 RMSD   0.53 A
 #    18 ID 1aoz1  Nmatch    18 RMSD   0.63 A
 #    19 ID 2azu1  Nmatch    18 RMSD   0.63 A
 #    20 ID 9pcy1  Nmatch    18 RMSD   0.63 A
 #    21 ID 1joi1  Nmatch    18 RMSD   0.64 A
 #    22 ID 1nwp1  Nmatch    18 RMSD   0.65 A
 #    23 ID 2aza1  Nmatch    18 RMSD   0.65 A
 #    24 ID 1arn1  Nmatch    18 RMSD   0.67 A
 #    25 ID 1plb1  Nmatch    18 RMSD   0.68 A
 #    26 ID 1rkr1  Nmatch    18 RMSD   0.73 A
 #    27 ID 1bxa1  Nmatch    18 RMSD   0.77 A
 #    28 ID 1nin1  Nmatch    18 RMSD   0.91 A
 #    29 ID 1kcw2  Nmatch    18 RMSD   0.98 A
 #    30 ID 1kcw1  Nmatch    18 RMSD   1.06 A
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


USF Latest update at 18 November, 1999.