Homo Crystallographicus - Quo Vadis ?

Gerard J Kleywegt & T Alwyn Jones

Supplementary material to: G J Kleywegt & T A Jones (2002). Homo Crystallographicus - Quo Vadis ? Structure 10 (4), 465-472.

Raw data, statistics, and plots. The raw data was gathered and analysed using "jiffy" shell scripts and programs written by GJK. The plot files were generated with ODBMAN, and converted into PostScript with O2D.

New: try Harry Plotter, a Java-based interactive plotting program that links each data point in a scatter plot directly to the corresponding page at the RCSB-PDB.

  1. master.list - ASCII text file containing the raw results of our analysis (~780 kB). The file contains one line per PDB entry with the following tab-delimited fields: PDB identifier, resolution (Å), year of deposition, R-value, free R-value, number of amino-acid residues, number of Ramachandran-plot outliers, percentage of Ramachandran-plot outliers, method of test-set selection, number of test-set reflections, percentage of test-set reflections, (unused counter 1), (unused counter 2), and a flag indicating whether an electron-density map is present in EDS. The list is sorted first by resolution and then by percentage of Ramachandran-plot outliers. The first few lines of this file look as follows:
     ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
    1ejg    0.54    2000    0.090   0.094   37      0       0.000   1_REFLECTION_OUT_OF_20  11220   5.000   10336   9119    HAS_AN_OMAP_IN_EDS
    1d8g    0.74    1999    0.105   0.131   0       0       0       _RANDOM_        NULL    NULL    10335   9118    NULL
    1dj4    0.75    1999    0.135   NULL    0       0       0       NULL    NULL    NULL    10333   9116    HAS_AN_OMAP_IN_EDS
     ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

  2. rval.out - output from the initial analysis program (~60 kB). This includes overall statistics, lists of PDB entries with the highest and lowest values for the various statistics, etc.

  3. Analysis I - 10,674 entries with resolution and R-value.

  4. Analysis II - 6,560 entries with resolution, R-value and free R-value.

  5. Analysis III - 10,215 entries with resolution, R-value and Ramachandran plot.

  6. Analysis IV - 6,316 entries with resolution, R-value, free R-value, and Ramachandran plot.

  7. Analysis V - 5,421 entries with more than one protein or peptide chain (at least 10 residues).
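
The tab-delimited layout of master.list described in item 1 can be read with a few lines of Python. The following is a minimal sketch, not the scripts used for the published analysis: the names Entry, parse_line and read_entries are hypothetical, and the code assumes that the literal string NULL marks a missing value (as the example lines suggest) and that no field contains embedded whitespace.

```python
from typing import NamedTuple, Optional

N_FIELDS = 14  # fields per line, as listed in item 1
EDS_FLAG = "HAS_AN_OMAP_IN_EDS"  # flag value seen in the example lines


def _field(text, cast):
    """Convert one field; the string NULL is assumed to mean 'missing'."""
    return None if text == "NULL" else cast(text)


class Entry(NamedTuple):
    pdb_id: str
    resolution: Optional[float]
    year: Optional[int]
    r_value: Optional[float]
    r_free: Optional[float]
    n_residues: Optional[int]
    n_rama_outliers: Optional[int]
    pct_rama_outliers: Optional[float]
    test_set_method: Optional[str]
    n_test_reflections: Optional[int]
    pct_test_reflections: Optional[float]
    has_eds_map: bool


def parse_line(line):
    """Parse one line of master.list into an Entry."""
    # split() accepts tabs or runs of spaces (assumes no empty fields).
    parts = line.split()
    if len(parts) != N_FIELDS:
        raise ValueError(f"expected {N_FIELDS} fields, got {len(parts)}")
    return Entry(
        pdb_id=parts[0],
        resolution=_field(parts[1], float),
        year=_field(parts[2], int),
        r_value=_field(parts[3], float),
        r_free=_field(parts[4], float),
        n_residues=_field(parts[5], int),
        n_rama_outliers=_field(parts[6], int),
        pct_rama_outliers=_field(parts[7], float),
        test_set_method=_field(parts[8], str),
        n_test_reflections=_field(parts[9], int),
        pct_test_reflections=_field(parts[10], float),
        # parts[11] and parts[12] are the two unused counters.
        has_eds_map=(parts[13] == EDS_FLAG),
    )


def read_entries(lines):
    """Parse every non-blank line."""
    return [parse_line(line) for line in lines if line.strip()]


# The three example lines from item 1.
SAMPLE = """\
1ejg\t0.54\t2000\t0.090\t0.094\t37\t0\t0.000\t1_REFLECTION_OUT_OF_20\t11220\t5.000\t10336\t9119\tHAS_AN_OMAP_IN_EDS
1d8g\t0.74\t1999\t0.105\t0.131\t0\t0\t0\t_RANDOM_\tNULL\tNULL\t10335\t9118\tNULL
1dj4\t0.75\t1999\t0.135\tNULL\t0\t0\t0\tNULL\tNULL\tNULL\t10333\t9116\tHAS_AN_OMAP_IN_EDS
"""

if __name__ == "__main__":
    entries = read_entries(SAMPLE.splitlines())
    # Entries that report resolution, R-value and free R-value.
    with_r_free = [e for e in entries
                   if None not in (e.resolution, e.r_value, e.r_free)]
    print([e.pdb_id for e in with_r_free])
```

To process the full file, pass an open file object instead of the sample, for example read_entries(open("master.list")). Selecting on which fields are present, as in the last few lines, is one plausible way to form subsets such as those in Analyses I and II, but the exact selection criteria used there are not stated here.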


Last updated on 11 April 2006.