Bioinformatics Group
University of Cambridge 
Protein Structure Modelling  

X-ray calculation HOW-TO

Introduction

RAPPER can be used to perform a variety of X-ray structure calculations, including model building into electron density maps, X-ray restrained conformational sampling, X-ray restrained side chain reassignment, and even automatic identification of poorly fit regions of protein structure and subsequent all-atom reconstruction of these regions. However, getting everything working smoothly can be a bit complex. The rest of this document describes how to carry out all of the above crystallographic calculations with RAPPER.

Basics

First and foremost, all crystallographic calculations with RAPPER begin by providing an electron density map to RAPPER. RAPPER can read both CCP4 and CNS format maps, and automatically detects the type of the input map from header data in the map file itself. Consequently, to supply RAPPER with an electron density map, you need only:

--map map_file_name

The file extension doesn't matter (I use .map for all of my files, for example), so you can supply CNS or CCP4 maps without worrying about extensions. However, RAPPER does not utilize crystallographic symmetry operators to extend maps from the ASU to the unit cell. So the incoming map must either surround the molecule or cover the entire unit cell.

If you neglect to provide RAPPER with maps that either surround the molecule or cover the unit cell, you're likely to get error messages about not being able to generate models under the X-ray restraints, because there will be no electron density around your molecule.
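How RAPPER itself inspects the header is not described here. As a rough illustration of header-based detection, the sketch below relies on the fact that CCP4 maps carry the four-character stamp 'MAP ' in header word 53 (bytes 208-211), and treats any other ASCII file as a CNS text map. The function name and the fallback rule are assumptions made for this example, not RAPPER code.

```python
def sniff_map_format(path):
    """Guess whether a map file is CCP4 (binary) or CNS (ASCII text)."""
    with open(path, "rb") as handle:
        header = handle.read(212)
    if not header:
        return "unknown"
    # CCP4 maps store the stamp 'MAP ' in header word 53 (bytes 208-211).
    if len(header) >= 212 and header[208:212] == b"MAP ":
        return "ccp4"
    # Assumption: a file whose first bytes are plain ASCII is a CNS map.
    try:
        header.decode("ascii")
    except UnicodeDecodeError:
        return "unknown"
    return "cns"
```

A check like this can be run before a long calculation to confirm that a file with an arbitrary extension is what you think it is.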

Common X-ray parameters

The following parameters are very useful for all X-ray calculations. They can be added to any X-ray mode in RAPPER, and are often major determinants of the quality of the resulting structures:

--chi-squared-electron-density-scoring [True|False]

--cryst-d-high [Float > 0.0]

If chi-squared (or correlation coefficient) scoring is enabled, RAPPER uses the correlation coefficient between the calculated electron density (from the putative generated conformation) and that provided by the --map parameter to score conformations. This is important for any calculation that uses scoring, such as --edm-fit, --use-edm-filters, side chain reassignment, and automatic reconstruction of poorly fit regions.

The cryst-d-high parameter should be set to the high resolution limit of the input PDB file for atomic scattering calculations. It must be provided when chi-squared-electron-density-scoring is enabled.
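The exact scoring function is not spelled out above. To make the idea of a correlation coefficient between calculated and observed density concrete, here is a minimal sketch of the standard Pearson correlation over density values sampled at the same grid points. That RAPPER's score takes this form, and the function name itself, are assumptions for illustration.

```python
import math

def density_correlation(observed, calculated):
    """Pearson correlation between two equally long lists of density values."""
    n = len(observed)
    if n != len(calculated) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    if var_o == 0 or var_c == 0:
        return 0.0  # a flat map carries no information about the fit
    return cov / math.sqrt(var_o * var_c)
```

A value near 1.0 means the calculated density rises and falls with the observed density; values near 0.0 indicate no relationship.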

--sidechain-radius-reduction [Float > 0.0]

The factor by which we reduce the radii of hard-sphere excluded volume interactions when at least one atom is from a side chain. That is, if this parameter is 0.5, then a side chain atom can approach up to twice as close to any other atom as normal. It is very important to set this parameter to a value < 1.0, as it leads to improved fitting and fewer failures, especially in the protein core where everything is tightly packed. I recommend using a value of 0.75.
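The arithmetic behind this parameter can be sketched as follows, assuming that the closest allowed approach of two atoms is the sum of their hard-sphere radii and that the reduction factor scales that sum whenever either atom belongs to a side chain. The function name and the example radii are hypothetical.

```python
def min_contact_distance(radius_a, radius_b, a_is_sidechain, b_is_sidechain,
                         reduction=0.75):
    """Closest allowed distance between two hard-sphere atoms."""
    limit = radius_a + radius_b
    # The reduction applies when at least one atom is from a side chain.
    if a_is_sidechain or b_is_sidechain:
        limit *= reduction
    return limit
```

With two atoms of radius 1.7, the unreduced limit is 3.4; a factor of 0.5 gives 1.7 (twice as close) and the recommended 0.75 gives 2.55.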

--output-pdb-without-hydrogens [True|False]

Write out PDB files without hydrogens. Useful for the SCL and with REFMAC, where hydrogens are confusing.

--write-user-remarks [True|False]

RAPPER stores a great deal of information about the generated conformations in USER remarks in the produced PDB files. This can confuse many programs and can be turned off by setting this parameter to False.

--models-get-native-bfactors [True|False]

If true, then RAPPER will set the B-factors of the generated conformations to those in the input PDB file. If false, the main chain atoms will have their B-factors set to 20.0 and side chain to 30.0.
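The default B-factor rule can be illustrated on a single PDB ATOM record. This is a sketch, not RAPPER code: it assumes the main chain consists of the atoms N, CA, C and O, and uses the standard PDB layout (atom name in columns 13-16, B-factor in columns 61-66).

```python
MAINCHAIN_ATOMS = {"N", "CA", "C", "O"}  # assumption: backbone heavy atoms only

def set_default_bfactor(line, mainchain_b=20.0, sidechain_b=30.0):
    """Return a PDB ATOM line with its B-factor replaced by the default."""
    if not line.startswith("ATOM"):
        return line
    padded = line.rstrip("\n").ljust(66)
    atom_name = padded[12:16].strip()
    b_factor = mainchain_b if atom_name in MAINCHAIN_ATOMS else sidechain_b
    # Columns 61-66 hold the B-factor, formatted as a 6-character float.
    return padded[:60] + "%6.2f" % b_factor + padded[66:]
```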

Model building into electron density maps (a.k.a. X-ray restrained conformational sampling)

RAPPER can be used to generate ensembles of conformations that fit into electron density maps at all levels of resolution. The generated residues can either be rebuilt based on amino acids already present in the PDB file or generated de novo if the residues are missing but the amino acid sequence and C and N anchor residues are known.

--enforce-mainchain-min-sigma-restraints [True|False]

--enforce-sidechain-min-sigma-restraints [True|False]

If true, then electron density map restraints will be added to main chain or side chain atoms during conformational sampling. These restraints ensure that all sampled main chain (or side chain) atoms lie in electron density at least 'edm-mainchain-min-sigma' (or 'edm-sidechain-min-sigma') standard deviations above the mean ASU electron density of the provided electron density map. Note that, by default, both main chain and side chain restraints are enabled when an electron density map is provided to RAPPER. You must explicitly set these parameters to False if you want to do unrestrained sampling of the main chain or side chain while providing an electron density map.

--edm-mainchain-min-sigma [Float >= 0.0]

--edm-sidechain-min-sigma [Float >= 0.0]

Only main chain (or side chain) atoms positioned in density more than this many standard deviations above the mean are considered to satisfy the electron density map restraints.
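As a worked illustration of the sigma test, the sketch below expresses a density value in standard deviations above the map mean and compares it with the threshold. It assumes the mean and standard deviation are taken over a plain list of map values, and it treats a value exactly at the threshold as passing. Both function names are hypothetical.

```python
import statistics

def sigma_level(density_value, map_values):
    """Express a density value in standard deviations above the map mean."""
    mean = statistics.mean(map_values)
    spread = statistics.pstdev(map_values)
    if spread == 0:
        raise ValueError("flat map: standard deviation is zero")
    return (density_value - mean) / spread

def satisfies_min_sigma(density_value, map_values, min_sigma):
    """True if the atom sits in density at or above the sigma threshold."""
    return sigma_level(density_value, map_values) >= min_sigma
```

For map values 0, 1, 2, 3, 4 (mean 2, standard deviation about 1.41), an atom at density 3 sits at roughly 0.71 sigma, so it passes a 0.5 sigma restraint but fails a 1.0 sigma restraint.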

--optional-edm-mainchain-restraints [True|False]

--optional-edm-sidechain-restraints [True|False]

If true, then the electron density map restraints will be made optional. If false, then the mainchain will be unconditionally forced to lie in electron density of at least edm-mainchain-min-sigma deviations above mean density. Making the restraints optional is primarily useful when tracing through a structure with regions in very poor or non-existent density.

--use-edm-filters [True|False]

If true, then electron density filters will be used during conformational sampling. An ED filter is used to rank and filter out putative conformations sampled for each residue. After several rounds of filtering, the fit conformations are enriched in structures that fit the electron density map very well.
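The rank-and-filter idea can be sketched in a few lines. This is only an illustration under the assumption that a filter keeps a fixed fraction of the best-scoring candidates at each round. The keep_fraction parameter and the function name are invented for the example and are not RAPPER options.

```python
def edm_filter(candidates, score, keep_fraction=0.5):
    """Rank candidates by density fit and keep the best-scoring fraction."""
    ranked = sorted(candidates, key=score, reverse=True)
    # Always keep at least one candidate so sampling can continue.
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]
```

Applying such a filter residue after residue is what enriches the surviving population in well-fitting conformations.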

--edm-fit [True|False]

If true, then a lot of effort will be put into getting conformations that fit the electron density map. Specifically, instead of stochastic side chain sampling, all available side chain conformations are evaluated with respect to their fit to the electron density map, and the one with the highest 'fit' to the density that also satisfies all restraints is selected. This parameter significantly increases the computational burden for RAPPER.
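The selection rule described for edm-fit (evaluate every available conformation, discard those that violate restraints, keep the best fit) can be sketched as below. The two callables stand in for RAPPER's internal scoring and restraint checks and are purely hypothetical.

```python
def best_fitting_rotamer(rotamers, fit_score, satisfies_restraints):
    """Pick the restraint-satisfying rotamer with the highest density fit."""
    allowed = [r for r in rotamers if satisfies_restraints(r)]
    if not allowed:
        return None  # no rotamer passes the restraints
    return max(allowed, key=fit_score)
```

Because every rotamer is scored rather than a random sample, the cost grows with the size of the rotamer library, which is consistent with the extra computational burden noted above.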

Examples

The following example fits 11 residues (15-25 of the A chain) into 0.5 sigma electron density from HIV protease structure 1g35.pdb using a model density map calculated with CNS (1g35.map).

./rapper params.xml ca-trace --range "A15-25" --pdb 1g35.pdb --map 1g35.map --enforce-mainchain-min-sigma-restraints true --edm-mainchain-min-sigma 0.5

The output PDB file can be found here.

Just like the previous example, but also fitting side chains:

./rapper params.xml ca-trace --range "A15-25" --pdb 1g35.pdb --map 1g35.map --enforce-mainchain-min-sigma-restraints true --edm-mainchain-min-sigma 0.5 --sidechain-mode smart

The output PDB file can be found here.

Same as the all-atom modelling, but with explicit side chain density fitting (edm-fit) and filters (use-edm-filters):

./rapper params.xml ca-trace --range "A15-25" --pdb 1g35.pdb --map 1g35.map --enforce-mainchain-min-sigma-restraints true --edm-mainchain-min-sigma 0.5 --sidechain-mode smart --edm-fit true --use-edm-filters true

The output PDB file can be found here.

Same as above, but the amino acids are missing from the input PDB file:

./rapper params.xml model-loops --range "A15-25" --sequence "IGGQLKEALLD" --pdb 1g35_chopped.pdb --map 1g35.map --enforce-mainchain-min-sigma-restraints true --edm-mainchain-min-sigma 0.5 --sidechain-mode smart --edm-fit true --use-edm-filters true

The output PDB file can be found here.
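When residues are built de novo, a quick sanity check is that the sequence has one letter per residue in the range. The sketch below assumes the range syntax is exactly the form used in these examples (one chain letter followed by start-end) and that --sequence covers exactly the residues in --range. Both helper names are hypothetical.

```python
import re

def parse_range(spec):
    """Split a range such as 'A15-25' into (chain, start, end)."""
    match = re.fullmatch(r"([A-Za-z])(\d+)-(\d+)", spec)
    if match is None:
        raise ValueError("unrecognised range: %r" % spec)
    chain, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if end < start:
        raise ValueError("range end precedes start: %r" % spec)
    return chain, start, end

def sequence_matches_range(spec, sequence):
    """True if the sequence has one letter per residue in the range."""
    _, start, end = parse_range(spec)
    return len(sequence) == end - start + 1
```

For the example above, the range A15-25 spans 11 residues and the sequence IGGQLKEALLD has 11 letters, so the check passes.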

Here are the command line options to generate 1 all-atom sample around HIV protease, similar to those used in the initial step of the heterogeneity and inaccuracy analysis described in the DePristo et al. Structure paper:

./rapper params.xml ca-trace --chain-id '*' --pdb 1g35.pdb --start 1 --enforce-ca-restraints true --mainchain-restraint-threshold 2 --sidechain-mode smart --models-get-native-headers false --models-get-native-hetatms false --models-get-native-bfactors false --write-individual-models true --map 1g35.map --edm-sidechain-min-sigma 0.0 --edm-mainchain-min-sigma 0.0 --divide-and-conquer false --sidechain-radius-reduction 0.75

The output PDB file can be found here.

X-ray restrained side chain reassignment

X-ray restrained side chain assignment reads in a fixed main chain conformation and an electron density map, and assigns rotameric side chains to all amino acids in the PDB structure, according to their fit to the electron density map. The scoring of the fit is performed with either (1) the average sigma (a la DePristo et al, Structure) or (2) using the correlation coefficient if --chi-squared-electron-density-scoring and --cryst-d-high have been provided.

Side chain assignment is a mode in RAPPER (just like ca-trace and model-loops). So all you need to do is to follow the template:

./rapper params.xml edm-sidechain-assignment --pdb 1g35.pdb --map 1g35.map --chain-id '*'

The output PDB file can be found here.

This next example uses the sophisticated correlation coefficient for X-ray scoring and reduced side chain radii:

./rapper params.xml edm-sidechain-assignment --pdb 1g35.pdb --map 1g35.map --chain-id '*' --chi-squared-electron-density-scoring true --cryst-d-high 1.8 --sidechain-radius-reduction 0.75

The output PDB file can be found here.

Automatic identification of poorly fit regions of protein structure and subsequent all-atom reconstruction of these regions

This approach is identical to the above conformational sampling examples, except that, instead of providing an explicit range of amino acids to rebuild, RAPPER computes the regions of poor fit and automatically generates new conformations for those regions. It can be enabled by setting --edm-rebuild-poor-regions to True, and the strictness of the poor region identification can be controlled with --edm-poor-region-threshold. The ca-trace mode should be used when reconstructing regions automatically, since the amount of conformational movement from the initial structure can be controlled with the --mainchain-restraint-threshold parameter. It is essential that you use chi-squared electron density scoring and provide cryst-d-high for these calculations.

--edm-rebuild-poor-regions [True|False]

If true, then the regions to rebuild will be calculated according to consistency with the provided electron density map.

--edm-poor-region-threshold [Float between 0.0-1.0]

Regions that fit worse than this number are considered 'poor'. If chi-squared scoring is used (as it should be), then amino acids with correlation coefficients between the calculated and observed maps less than this number will be flagged for reconstruction.

--edm-poor-region-buffer-size [Integer >= 0]

If a region fits poorly, the entire region plus this number of residues on the N-terminal side are flagged for rebuilding. That is, if residues A15-25 are poorly fit, and edm-poor-region-buffer-size = 2, then residues A13-25 will be rebuilt.
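The threshold and buffer rules can be combined in a short sketch. It assumes per-residue correlation coefficients are available as a mapping from residue number to score, groups consecutive poor residues into regions, and extends each region on the N-terminal side by the buffer size as described above. Merging regions that touch after extension is a choice made for this example, and the function name is hypothetical.

```python
def poor_regions(scores, threshold, buffer_size=0):
    """Return (start, end) residue ranges whose fit is below the threshold.

    scores maps residue number to correlation coefficient.
    """
    if not scores:
        return []
    lowest = min(scores)
    regions = []
    for number in sorted(scores):
        if scores[number] >= threshold:
            continue
        if regions and number == regions[-1][1] + 1:
            regions[-1][1] = number          # extend the current region
        else:
            regions.append([number, number])  # start a new region
    merged = []
    for start, end in regions:
        # Extend on the N-terminal side only, without passing the first residue.
        start = max(lowest, start - buffer_size)
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged
```

With residues 15-25 scoring below 0.9 and a buffer of 2, this returns the single region 13-25, matching the example above.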

Examples

The following example rebuilds all residues with correlation coefficients < 0.9 into 0.0 sigma electron density from HIV protease structure 1g35.pdb using a model density map calculated with CNS (1g35.map). Further, --edm-fit and --use-edm-filters are enabled to provide the best fit to the electron density map currently possible with RAPPER. The --mainchain-restraint-threshold 2 allows up to 2 Å of movement from the initial PDB CA atoms in the generated conformation.

./rapper params.xml ca-trace --edm-rebuild-poor-regions true --edm-poor-region-threshold 0.9 --pdb 1g35.pdb --map 1g35.map --mainchain-restraint-threshold 2 --chi-squared-electron-density-scoring true --cryst-d-high 1.8 --sidechain-radius-reduction 0.75 --edm-fit true --use-edm-filters true

The output PDB file can be found here.
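If you run this kind of rebuild over many structures, a small wrapper can assemble the command line and catch a missing cryst-d-high before RAPPER starts. The sketch below uses only options documented on this page. The wrapper function itself, its argument names, and the choice to print rather than execute the command are assumptions for illustration.

```python
import shlex

def rebuild_command(pdb, map_file, threshold, d_high=None, chi_squared=True,
                    params="params.xml"):
    """Assemble a ca-trace command line for automatic poor-region rebuilding."""
    # cryst-d-high must accompany chi-squared electron density scoring.
    if chi_squared and d_high is None:
        raise ValueError("--cryst-d-high is required with chi-squared scoring")
    args = [
        "./rapper", params, "ca-trace",
        "--edm-rebuild-poor-regions", "true",
        "--edm-poor-region-threshold", str(threshold),
        "--pdb", pdb,
        "--map", map_file,
        "--chi-squared-electron-density-scoring",
        "true" if chi_squared else "false",
    ]
    if d_high is not None:
        args += ["--cryst-d-high", str(d_high)]
    return shlex.join(args)

print(rebuild_command("1g35.pdb", "1g35.map", 0.9, d_high=1.8))
```

Further options such as --sidechain-radius-reduction or --edm-fit can be appended to the argument list in the same way.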
© 2001-2006 The RAPPER Team 