Bioinformatics Group
University of Cambridge 
Protein Structure Modelling  

Fiser et al. loop modelling decoy sets

As described in DePristo et al. and de Bakker et al., we generated ensembles of loop conformations for the loop targets in the Fiser et al. loop modelling benchmark set. These ensembles can be useful for evaluating the discriminatory power of selection mechanisms such as statistical potentials or molecular mechanics force fields.

We provide here the set of protein structures in the Fiser et al. benchmark set used in our studies (see papers), updated with revised structures from the PDB and renumbered so that residue numbers run continuously from 1, with no insertion codes. The archive can be downloaded here: Fiser et al. test set. This file, like all files on this site, has been tarred and compressed with bzip2. Under Unix you can expand the archive with:

bzip2 -d FILE.tar.bz2
tar xvf FILE.tar
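On systems without the bzip2 and tar command-line tools, the same expansion can be sketched with Python's standard library. This is only an illustrative sketch and not part of the distribution; the function name expand_archive and its dest_dir parameter are assumptions made for the example.

```python
import tarfile
from pathlib import Path


def expand_archive(archive_path, dest_dir="."):
    """Expand a .tar.bz2 archive into dest_dir and return the member names.

    Hypothetical helper: the name and signature are illustrative only.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r:bz2" decompresses and untars in one pass. Unlike `bzip2 -d`,
    # it leaves the original .tar.bz2 file in place.
    with tarfile.open(archive_path, mode="r:bz2") as archive:
        names = archive.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters that reject
            # unsafe members (absolute paths, links escaping dest_dir).
            archive.extractall(path=dest, filter="data")
        else:
            archive.extractall(path=dest)
    return names
```

For example, expand_archive("fiser-fours.tar.bz2", "decoys") would unpack that archive under a decoys directory, assuming the file has already been downloaded to the working directory.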

A list of all targets for loop lengths two to twelve can be downloaded here: Fiser et al. target list

For each target in the above list, we have generated 1000 conformations with RAPPER, assigned sidechains with SCWRL, and calculated both the conformational probability with the RAPDF all-atom statistical potential and the molecular mechanics energy with the AMBER force field and the GB/SA continuum solvation model. Since there are many targets per length, we have created one archive per length containing all of the targets for that length. WARNING: most of these files are over 100 MB!

Loop Length    Decoy Set File
Twos           fiser-twos.tar.bz2
Threes         fiser-threes.tar.bz2
Fours          fiser-fours.tar.bz2
Fives          fiser-fives.tar.bz2
Sixes          fiser-sixes.tar.bz2
Sevens         fiser-sevens.tar.bz2
Eights         fiser-eights.tar.bz2
Nines          fiser-nines.tar.bz2
Tens           fiser-tens.tar.bz2
Elevens        fiser-elevens.tar.bz2
Twelves        fiser-twelves.tar.bz2

Each archive file uncompresses to a subdirectory called LENGTH. This subdirectory contains one directory for each target of that LENGTH in the Fiser et al. benchmark set. The conformations for a particular target TARGET starting at residue START_RESIDUE can be found in the subdirectory TARGET-START_RESIDUE. Each directory contains many files, all of which are described below:

benchmark.dat
Information about the conformations generated by RAPPER. The entries are the target name, starting residue, length, number of attempts, number of conformations generated, the division number, and the best, average, and standard deviation of the global and local mainchain RMSD of the generated ensemble with respect to the native loop conformation.
build.out / build.xml
Output log of the RAPPER program. Contains detailed information about the progress and operation of RAPPER. build.xml is an XML version of build.out.
framework.pdb
The target protein structure with the loop target residues clipped out.
native.pdb
The loop residues of the target protein structure without the surrounding structure.
looptest.pdb and looptest-best.pdb
These files contain the RAPPER-generated conformations for the loop residues of the target protein only, one conformation per MODEL in the PDB file. looptest-best.pdb holds the conformation with the lowest global mainchain RMSD to the native loop. The native loop is included in both files as the first model (MODEL 0). The remarks preceding each model give the RMSD of the conformation to the native loop. The model conformations contain the mainchain heavy atoms N, CA, C, and O plus the CB atom, whereas the native conformation (MODEL 0) contains all heavy atoms.

repreXXX.pdb
These files are the loop conformations generated by RAPPER (in looptest.pdb) split into individual PDB files, one for each model in looptest.pdb.
scwrl.XXX.pdb
The repreXXX.pdb conformation with sidechain atoms added by SCWRL within the context of the surrounding protein structure in framework.pdb.
tinkered.XXX.pdb, tinker.XXX.out, and tinker.XXX-minimize.out
tinkered.XXX.pdb is the whole protein structure (framework.pdb + scwrl.XXX.pdb) after minimization of the loop residues in the AMBER + GB/SA force field. The tinker.XXX-minimize.out and tinker.XXX.out files are, respectively, the log file produced during the minimization and a detailed summary of the energy of the final conformation.

rapdf.dat and mm.dat
The data file containing the molecular mechanics energy of the 50 best conformations from RAPPER. The file is a space-separated list of fields, in this order: model id; native structure flag (1 or 0); global then local mainchain RMSD, global then local mainchain + CB RMSD, and global then local all-atom RMSD for the initial RAPPER structure; the same six RMSDs again for the minimized conformation; the anchor RMSD of the initial conformation; the SCWRL energy; the molecular mechanics energy with and without GB/SA; and two forms of the RAPDF all-atom statistical potential. The rapdf.dat file is computed before the MM calculations, so its two MM fields are 0.0, and since its conformations are not minimized, its minimized RMSDs are also 0.0.
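The field order described for rapdf.dat and mm.dat can be turned into a small reader. The following is a sketch under stated assumptions, not a tool shipped with the decoy sets: it assumes one record per line, no header row, and exactly the 20 fields listed above, and every field name in it (for example mm_energy_gbsa or rapdf_a) is a label invented here for illustration. A line with a different number of fields raises an error, so a mismatch with these assumptions shows up immediately.

```python
RMSD_KINDS = ("mainchain", "mainchain_cb", "all_atom")


def _rmsd_names(prefix):
    # Global then local, for each of the three atom selections.
    return [f"{prefix}_{scope}_{kind}_rmsd"
            for kind in RMSD_KINDS
            for scope in ("global", "local")]


# Hypothetical field labels, following the order given in the description.
FIELD_NAMES = (
    ["model_id", "is_native"]
    + _rmsd_names("initial")
    + _rmsd_names("minimized")
    + ["anchor_rmsd", "scwrl_energy",
       "mm_energy_gbsa", "mm_energy_no_gbsa",
       "rapdf_a", "rapdf_b"]
)


def parse_record(line):
    """Parse one whitespace-separated record into a dict keyed by field label."""
    parts = line.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, found {len(parts)}")
    record = {"model_id": parts[0],                # kept as text
              "is_native": float(parts[1]) != 0}   # 1 = native, 0 = decoy
    for name, value in zip(FIELD_NAMES[2:], parts[2:]):
        record[name] = float(value)
    return record


def read_datafile(path):
    """Return the records in a rapdf.dat or mm.dat file, skipping blank lines."""
    with open(path) as handle:
        return [parse_record(line) for line in handle if line.strip()]
```

With such records, the decoy that the force field ranks first could be picked with min((r for r in records if not r["is_native"]), key=lambda r: r["mm_energy_gbsa"]) and its RMSD fields compared against the rest of the ensemble. For rapdf.dat the MM and minimized-RMSD fields would all read 0.0, as noted above.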

© 2001-2006 The RAPPER Team 