In this practical you will use a number of web-based servers that enable you to compare a (often newly determined) protein structure to a structural database in order to find out if the new structure is similar to any known structures, or whether it perhaps has a novel fold.

If the fold turns out to be known, the list of its "structural neighbours" may help you to classify your new structure (e.g., in terms of the CATH, SCOP, or FSSP classification systems).

In favourable cases, the structure-based sequence alignment may provide you with valuable clues regarding the function of key residues. For example, if your structure turns out to be very similar to that of an enzyme that has an Asp-Asp-Glu catalytic triad, and if your protein contains identical residues in a similar spatial arrangement to that triad, there is a good chance that your protein is also an enzyme with a related function.

Before you begin this practical, please check that you have Rasmol installed and that your browser is configured appropriately!

Some of the servers return their results by E-mail. This means that you can only use these servers if you have access to a valid E-mail account!


The coordinates of the unknown protein structure can be accessed as a PDB file or as a text file. The structure contains 132 amino-acid residues. However, to make your task a bit more challenging, the real identity of the residues has been removed... Instead, all residues are labelled as "ALA" (alanine) or "GLY" (glycine) in the structure file.
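As a quick sanity check on the downloaded file, the residue count and the residue labels can be tallied with a few lines of Python. This is only a sketch: the function name `residue_summary` is made up for this example, and it assumes the file follows the standard fixed-column PDB layout for ATOM records (residue name in columns 18-20, chain in column 22, residue number plus insertion code in columns 23-27).

```python
from collections import Counter


def residue_summary(pdb_lines):
    """Return (number of residues, Counter of residue names) from ATOM records.

    Assumes standard fixed-column PDB formatting; HETATM and all other
    record types are ignored.
    """
    seen = set()
    names = Counter()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()
        # Chain ID + residue number + insertion code identify one residue.
        res_id = (line[21:22], line[22:27])
        if res_id not in seen:
            seen.add(res_id)
            names[res_name] += 1
    return len(seen), names
```

Calling `residue_summary(open("unknown.pdb"))` (the file name is a placeholder for wherever you saved the structure) should report 132 residues, labelled only ALA and GLY, if the download went well.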

Download the structure file to a directory that you own. Have a look at the structure with Rasmol. Is this a mostly alpha, a mostly beta, a mixed alpha-beta or some other class of structure?
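If you would like a crude numerical cross-check of your visual impression, the distance between C-alpha atoms i and i+3 is a rough indicator: it is about 5 Å in an ideal alpha-helix and about 10 Å in an extended strand. The sketch below uses that heuristic. The function names and the cut-offs (6.0 Å and 9.0 Å) are assumptions chosen for illustration, chain breaks and alternate conformations are ignored, and the result is no substitute for looking at the structure in Rasmol or for a proper secondary-structure assignment program.

```python
import math


def read_ca_coords(pdb_lines):
    """Collect (x, y, z) of every CA atom in ATOM records (fixed-column PDB)."""
    coords = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def rough_class_fractions(ca, helix_max=6.0, strand_min=9.0):
    """Classify each CA(i)-CA(i+3) distance and return the three fractions.

    helix_max and strand_min are illustrative cut-offs in Angstrom,
    not established standards.
    """
    counts = {"helix-like": 0, "strand-like": 0, "other": 0}
    for i in range(len(ca) - 3):
        d = math.dist(ca[i], ca[i + 3])
        if d <= helix_max:
            counts["helix-like"] += 1
        elif d >= strand_min:
            counts["strand-like"] += 1
        else:
            counts["other"] += 1
    total = sum(counts.values())
    return {k: (v / total if total else 0.0) for k, v in counts.items()}
```

A structure dominated by "helix-like" distances is probably mostly alpha, one dominated by "strand-like" distances mostly beta; comparable fractions of both suggest a mixed class.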

Your mission, should you choose to accept it, is two-fold:

  1. Find out if your structure has a novel fold or not. If you think the fold is new, it should not be recognised by any of the structure recognition servers! If the fold is not new, what is its classification in CATH, SCOP and FSSP? Can you find any clues with respect to the possible function of your protein?

  2. If you have tried a few servers, compare and contrast them (if not, pool your results with some of your fellow students!). Do they give similar results? What kind of information do you obtain apart from a list of structural neighbours? What are their advantages and disadvantages? Are they easy to use? How easy or difficult is it to find the information you are interested in? How much time do they take?

In some cases, you may get hits but no option to display the hits superimposed on your query structure. In such cases, you can use one of the following servers to superimpose them and inspect the result with Rasmol:

Note that several of the servers below also offer a facility for superimposing your structure on any structure in the PDB, and you can find others (e.g., ProSup, Kenobi) on the web using a standard search engine such as Google.
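Once a server has returned a superimposed hit together with a list of equivalent residues, the quality of the fit is usually summarised as the root-mean-square distance (RMSD) over the matched C-alpha atoms. The sketch below computes only that number. It assumes the two coordinate lists are already superimposed and already paired in alignment order; it does not perform the superposition itself, and the function name `rmsd` is simply chosen for this example.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    The lists must already be superimposed and paired: coords_a[k] is
    the atom equivalent to coords_b[k].
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

When comparing servers, keep in mind that an RMSD is only meaningful together with the number of matched residues: a low RMSD over a handful of atoms says less than a slightly higher one over most of the chain.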

For your information, there is no page with correct answers for this practical!


Dali is one of the oldest and most popular methods for structure recognition. It was in fact one of the very first structure-recognition methods to be offered as a web-based service. Dali is also the structure-comparison "engine" for FSSP.

Some relevant links:

Upload your structure to the server. It may be a while until you receive the results, so you probably want to continue with some of the other servers. Alternatively, you can read the JMB paper about Dali.


VAST is the program that is used by NCBI to determine structure neighbours for all PDB entries as part of the Entrez system.

Some relevant links:

Upload your structure to the server. Use "embo2001" as your password so you won't forget it. Submit the form and wait for the next page to load. Hit the "Do it !" button and wait for the next page, which will contain your search ID. Write down this ID; you will need it to access the results of your search.


TOPS is a bit of an odd duck in the structure-recognition world. It originated as a service to produce two-dimensional topology diagrams for structures. It wasn't until later that the option to search a "TOPS-ified" database with a structure was implemented. An interesting feature of TOPS is that its description of a fold is in terms of secondary structure elements and their hydrogen-bonding patterns, i.e. much more qualitative than the other methods (and, hence, possibly more robust with respect to domain movements etc.). This qualitative description also makes the searches against the databases very fast. A useful feature of the server is that one can immediately compare a structure against CATH and SCOP representatives, which makes classifying structures simpler.

Some relevant links:

The topology cartoon for your structure looks like this:

Upload your structure to the server. Results will be sent to you by E-mail and shouldn't take too long. You can try using the TOPS, CATH and SCOP databases in turn.


TOP is another program that compares a structure to a database of structures derived from SCOP.

Some relevant links:

Upload your structure to the server. When the job is done you will be notified by E-mail.


3dSearch is an as yet unpublished method for comparing a structure to a SCOP-derived database.

Some relevant links:

With this server you have to cut and paste the contents of the structure file into the form on the web page!

NOTE: in November/December 2001, this server was not functioning!


Some relevant links for CE (Combinatorial Extension):

Upload your structure to the server.


Some relevant links:

NOTE: in November/December 2001, this server was not functioning!


The last server we will look at is DEJAVU. This program was initially developed as a tool for protein crystallographers to accelerate their building of structures if they turned out to be similar to a known structure. More recently, a web-based server has also been developed.

Some relevant links:

Go to the DEJAVU server and select a database. Hit the "Submit" button, upload your structure, select sensible parameters and wait for the results to appear in your browser.


Now collect the results from a number of different servers and discuss the second part of the "mission" of this practical.
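When pooling hit lists (your own or those of fellow students), it helps to see at a glance which hits are reported by several servers and which by only one. The following sketch does this bookkeeping. The function name `pool_hits` and the input layout (a dictionary mapping a server name to its list of hit identifiers) are assumptions made for this example; you have to bring the identifiers into a common form yourself, since servers differ in how they write PDB codes and chain labels.

```python
from collections import defaultdict


def pool_hits(hits_by_server):
    """Return [(hit, [servers...]), ...], most widely reported hits first.

    hits_by_server maps a server name to the list of hit identifiers it
    returned. Identifiers are compared as given, apart from stripping
    surrounding whitespace.
    """
    pooled = defaultdict(set)
    for server, hits in hits_by_server.items():
        for hit in hits:
            pooled[hit.strip()].add(server)
    # Sort by number of servers (descending), then alphabetically by hit.
    return sorted(
        ((hit, sorted(servers)) for hit, servers in pooled.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )
```

For example, `pool_hits({"serverA": ["hit1", "hit2"], "serverB": ["hit1"]})` (placeholder names, not real results) lists `hit1` first because both servers reported it.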

Okay, so this was not a difficult structure to recognise. If you want to really put the servers to the test, try to crack this much tougher nut (PDB file; text file) which has the following topology cartoon:


Practical "Structure Recognition" - EMBO Bioinformatics Course - Uppsala 2001 - Gerard Kleywegt

Latest update on 29 March, 2004.