S-SAD on bovine insulin

The aim of this experiment is to determine the X-ray structure of bovine insulin from a cubic crystal form. The starting point is a data set collected at the BESSY synchrotron. The phase problem is solved by sulfur SAD phasing, i.e., the anomalous signal of the sulfur atoms of insulin is used to determine the protein phases. SAD stands for "single wavelength anomalous dispersion". A model is built manually in COOT and refined with REFMAC. Finally, the structure is analyzed and molecular figures are prepared for the protocol.

The chemistry students should proceed with this exercise as far as they get in the 3-day practical. The focus is on understanding the practical aspects of experimental phasing as well as the process of building an initial model into electron density. The path towards the molecular model is the actual content of this training.

How to write the protocol

The protocol should contain a description of the determination and analysis of the structure of insulin, similar in style to a journal paper. Take the reprints as a guide. The "paper" should focus more on the crystallographic part than is usual in journal papers nowadays (which focus on the biological implications). The paper should have the common structure: Title, Author, Abstract, Introduction, Experimental procedures, Results, Discussion, References. For some of the experimental parts (Protein Production, Crystallization), include the information given in the paper that is linked in the section "Integration with MOSFLM". The introduction should contain information on insulin, except for the structure: write the paper as if the 3D structure of insulin were not known. The paper is written in English. Format the document in two columns, as a scientific paper is typeset.

Sequence of bovine insulin

chain A

G I V E Q C C A S V C S L Y Q L E N Y C N

chain B

F V N Q H L C G S H L V E A L Y L V C G E R G F F Y T P K A

Integration with MOSFLM

Choose the computer type

There are different laptop types that are used in your practical. Please choose accordingly.

Create a new project in CCP4I2 named "insulin-[your_last_name]". To do so, create a new folder "insulin-[your_last_name]" in the folder "practical".

Process the raw data images with MOSFLM as in practical 2 (Data reduction of diffraction data of a lysozyme crystal). The images can be found in P:\practical_protein_crystallography\data\insulin and are named ins_ssad_1_???.img.
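Before starting MOSFLM it can be useful to check that the image files are actually present. The following is a minimal sketch in Python, not part of the practical itself: the function name list_images is hypothetical, and each "?" in the pattern is assumed to stand for exactly one character.

```python
from pathlib import Path


def list_images(directory, pattern="ins_ssad_1_???.img"):
    """Return the sorted names of all files in `directory` matching `pattern`.

    Each "?" in the pattern matches exactly one character.
    """
    return sorted(p.name for p in Path(directory).glob(pattern))


if __name__ == "__main__":
    # Path as given in the text; adjust it if the data are stored elsewhere.
    data_dir = r"P:\practical_protein_crystallography\data\insulin"
    names = list_images(data_dir)
    print(f"{len(names)} images found")
```

A gap in the sorted list of names would indicate a missing image.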

As you have already learned all aspects of data processing with MOSFLM in the previous part of the practical, you only need to make one snapshot of an unmasked image and one of an image with a mask applied.

Furthermore, omit the "Strategy" part, as a full data set has already been collected. By default, the integration procedure writes its output to a file ins_ssad_1_001.mtz.

After the integration is finished, close MOSFLM. The integrated data is saved automatically to the CCP4 database.

Merging and scaling with AIMLESS

The program AIMLESS scales and merges the symmetry equivalent reflections.

in the Task menu activate the submenu → "X-ray data reduction and analysis"
choose → "Data reduction - AIMLESS"

A new window opens to input the data for the program AIMLESS. The "Input Data" is under normal circumstances selected automatically. The red background color, however, indicates that you have to fill in some of the other fields.

Only those fields need to be changed that are mentioned in the points below. The instructions are also ordered according to their position in the CCP4 interface.

Always input meaningful job titles, e.g., "scaling and merging of mosflm data"
In the field "Select unmerged data files", select the data output from the MOSFLM run.
Use "insulin" as crystal name as well as dataset name
In "Options for symmetry determination" select "Choose a known or previous solution"
Below "Options for choice of space group or Laue group" select "Perform search then choose spacegroup". In the empty field on the right choose the space group I2₁3 (number 199). Write it like "I 21 3".
You have to choose this space group by hand, because I23 and I2₁3 have the same systematic absences and therefore cannot be distinguished from the diffraction data alone.
In the section "Optional existing FreeR set, define to copy or extend if necessary" choose 0.10 for "Fraction of reflections in generated freeR set".
Ten percent of the reflections will be flagged as reflections to calculate the free R factor. These reflections will not be used for refinement. This so-called test data set should contain at least 1000 reflections, but also not much more, in order not to reduce the number of observations in the refinement too much.
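The idea of flagging a random fraction of the reflections can be illustrated with a few lines of Python. This is only a sketch of the principle, not the procedure used by AIMLESS; the function name select_free_set and the number of reflections are hypothetical.

```python
import random


def select_free_set(n_reflections, fraction=0.10, seed=0):
    """Return a sorted list of reflection indices flagged for the free R set."""
    n_free = round(n_reflections * fraction)
    rng = random.Random(seed)  # fixed seed so that the selection is reproducible
    return sorted(rng.sample(range(n_reflections), n_free))


if __name__ == "__main__":
    n_total = 12000  # hypothetical number of unique reflections
    free = select_free_set(n_total)
    print(f"{len(free)} of {n_total} reflections are set aside as test set")
```

With such a function it is easy to see how the size of the test set depends on the chosen fraction and on the number of unique reflections.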

Choose → "Run" to start the calculation.

A table listing the details of data collection and data reduction.

Information in an MTZ file

Before actually starting the phasing procedure, it is helpful to examine the contents of the mtz file. Click on the symbol (triangle or plus sign) left of the AIMLESS job in the "Job list" (left big window). Click on "/insulin/insulin" (this name depends on the crystal and dataset name you have given in the AIMLESS run). The MTZ file may contain the following data. Later, you can open other MTZ files (like maps) in the same way. Have a look at the column labels, which are needed for different purposes (like storage of diffraction data, calculation of maps, phasing...).

H, K, L: Miller indices of a reflection
Iplus: the intensity of the +h,+k,+l reflection
SIGIplus: the standard deviation (error) of Iplus
Iminus: the intensity of the -h,-k,-l reflection
SIGIminus: the standard deviation (error) of Iminus
FREER: reflections set aside for calculation of the "free" R factor
F: structure factor amplitudes
PHI: phase values
FOM: figure of merit, a measure of the reliability of the phase values (related to the phase error)
HLA, HLB, HLC, HLD: Hendrickson-Lattman coefficients describing a phase probability distribution

In addition, the intensities and their standard deviations are given separately for the two members of each Bijvoet pair.

F is obtained by averaging all symmetry-related reflections. F(+) and F(-) are obtained by averaging F(+h,+k,+l) with its rotational symmetry mates and F(-h,-k,-l) with its rotational symmetry mates, respectively.
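How the columns Iplus, SIGIplus, Iminus and SIGIminus relate to a mean intensity and an anomalous difference can be sketched as follows. This is a simplified illustration that assumes an unweighted mean and independent errors; the merging done by AIMLESS is more elaborate, and the function name bijvoet_summary is hypothetical.

```python
import math


def bijvoet_summary(i_plus, sig_plus, i_minus, sig_minus):
    """Return (mean intensity, anomalous difference, sigma of the difference)."""
    i_mean = 0.5 * (i_plus + i_minus)                 # unweighted mean of the pair
    d_ano = i_plus - i_minus                          # anomalous (Bijvoet) difference
    sig_d = math.sqrt(sig_plus ** 2 + sig_minus ** 2) # error propagation
    return i_mean, d_ano, sig_d


if __name__ == "__main__":
    # Made-up numbers for a single Bijvoet pair.
    print(bijvoet_summary(104.0, 3.0, 100.0, 4.0))
```

The example shows why weak anomalous signals such as that of sulfur require accurate data: the difference is small compared to its standard deviation.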

Nothing to document here.

Estimation of the contents of the asymmetric unit

For structure determination you should have an idea how many protein molecules are located in the asymmetric unit. This is possible based on the typical solvent content of protein crystals, the size of the asymmetric unit and the size of the protein.

The necessary program is found under menu → "Import merged data, AU contents, alignments and coordinates" and then → "Define AU contents".

Click on the → "plus sign" and select to → "Enter text".
Give the name "chain A" and copy/paste or type in the sequence of chain A.

If you type in the sequence, always use capital letters. Spaces are allowed.

Now → "Save".
Repeat the last two steps for chain B.
In the field "Solvent content analysis" select the correct "Experimental data".
→ "Run".

In this step, we have both defined the contents of the asymmetric unit and also calculated the so-called Matthews coefficient. Based on this value, we can estimate the probability (Matthews probability) for the presence of a certain number of protein molecules in the asymmetric unit (Number of copies). The corresponding solvent content (Solvent %) is also derived.
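The calculation behind this analysis can be reproduced by hand. The sketch below computes the Matthews coefficient V_M = V_cell / (Z × M) and the usual solvent estimate 1 - 1.23 / V_M. The cell edge of 78 Å is only an assumed placeholder (use the value from your own data processing), the molecular mass is estimated from average residue masses, and all function names are hypothetical.

```python
# Average residue masses in Da (amino acid minus one water molecule).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per chain for the free N- and C-terminus

CHAIN_A = "G I V E Q C C A S V C S L Y Q L E N Y C N"
CHAIN_B = "F V N Q H L C G S H L V E A L Y L V C G E R G F F Y T P K A"


def chain_mass(sequence):
    """Approximate mass of one chain in Da (hydrogens lost in S-S bonds ignored)."""
    residues = sequence.replace(" ", "")
    return sum(RESIDUE_MASS[aa] for aa in residues) + WATER


def matthews(cell_volume, n_asu, mass_per_asu):
    """Return (V_M in cubic angstrom per Da, estimated solvent fraction)."""
    v_m = cell_volume / (n_asu * mass_per_asu)
    return v_m, 1.0 - 1.23 / v_m


if __name__ == "__main__":
    a = 78.0    # ASSUMED cubic cell edge in angstrom, replace with your value
    n_asu = 24  # asymmetric units per unit cell in space group I 21 3
    mass = chain_mass(CHAIN_A) + chain_mass(CHAIN_B)
    for copies in (1, 2):
        v_m, solvent = matthews(a ** 3, n_asu, copies * mass)
        print(f"{copies} copies: V_M = {v_m:.2f}, solvent = {100 * solvent:.0f} %")
```

Comparing the values for different numbers of copies with the typical range for protein crystals shows which number is plausible.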

Results of the analysis.

Phasing with SHELX

The programs of the SHELX program suite are used for locating the anomalous scatterers and calculating the phases. SHELX uses the anomalous differences to determine the location of the anomalous scatterers by direct methods. Following that, the protein phases are calculated. The phases are improved by phase refinement via solvent flattening. The programs output an MTZ file with the calculated phases and figures of merit and a PDB file with the heavy atom positions.

Use in the Task menu → "Experimental Phasing" → "Automated structure solution - SHELXC/D/E phasing and building" .

For "Start pipeline with" choose "Substructure detection" and for "and end with" select "Den.mod + poly-ALA tracing"
Uncheck "Input protein sequence" and fill in the number of amino acids of insulin in "Number of residues per monomer".
In "Crystal #1 composition and collected anomalous dataset(s)" specify the type of heavy atoms which should be found ("Substructure atom") and how many you expect.
You expect all cysteine residues to form cystine (disulfide) bridges. Fill in the number of expected S-S pairs into the field after "Number of S-S pairs searched for as 1 supersulfur".
In the tab "Advanced options", lower the number of trials in substructure detection ("Num. trials") to 1000.
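The numbers requested in the fields above follow directly from the sequences given at the beginning. A minimal Python sketch for counting them is shown below; the function name composition is hypothetical, and the number of S-S pairs assumes, as stated above, that all cysteines are paired.

```python
CHAIN_A = "G I V E Q C C A S V C S L Y Q L E N Y C N"
CHAIN_B = "F V N Q H L C G S H L V E A L Y L V C G E R G F F Y T P K A"


def composition(sequence):
    """Count residues and sulfur-containing residues in a one-letter sequence."""
    residues = sequence.replace(" ", "")
    return {
        "length": len(residues),
        "cys": residues.count("C"),
        "met": residues.count("M"),
    }


if __name__ == "__main__":
    comp_a, comp_b = composition(CHAIN_A), composition(CHAIN_B)
    n_res = comp_a["length"] + comp_b["length"]
    n_cys = comp_a["cys"] + comp_b["cys"]
    n_met = comp_a["met"] + comp_b["met"]
    print(f"residues: {n_res}, Cys: {n_cys}, Met: {n_met}")
    # Assumption from the text: every cysteine takes part in a disulfide bridge.
    print(f"expected S-S pairs: {n_cys // 2}")
```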

Description and results of the phasing procedure

Examination of the phasing results in COOT

When determining heavy atom positions by either the SIR method or by anomalous dispersion (SAD) using Patterson methods or direct methods, there is always a 50 % chance that the wrong hand is obtained, i.e., the inverted (mirror image) coordinates of the heavy atoms are found instead of the correct coordinates. When phasing with SIR, both heavy atom solutions give maps of similar quality, but the wrong hand produces a mirror image of the correct electron density map, i.e., the alpha helices are left-handed. When phasing with anomalous dispersion, as in the current example, the correct hand of the heavy atom solution gives the better and more interpretable map. The hand is usually identified by manually inspecting the electron density maps for both solutions. SHELX therefore provides phases obtained from both solutions.
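What "the other hand" means for the substructure can be illustrated with fractional coordinates: the sites are inverted through the origin. The sketch below shows only this basic operation with made-up coordinates. It is not the SHELX implementation, and it does not handle space groups that require an inversion through a different point or a change to the enantiomorphic space group. The function name invert_hand is hypothetical.

```python
def invert_hand(sites):
    """Invert fractional coordinates through the origin.

    (x, y, z) becomes (-x, -y, -z), wrapped back into the range 0 to 1.
    """
    return [tuple((-c) % 1.0 for c in site) for site in sites]


if __name__ == "__main__":
    # Made-up fractional coordinates of two anomalous scatterers.
    sites = [(0.10, 0.20, 0.30), (0.25, 0.50, 0.75)]
    for original, inverted in zip(sites, invert_hand(sites)):
        print(original, "->", inverted)
```

Applying the inversion twice returns the original coordinates, which is why there are exactly two candidate solutions.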

Below the SHELX result you will find the option to run the program COOT. Choose then "anomalous substructure coordinates" for "Coordinates". In "Electron density maps" you load the maps of both hands in parallel. By default the correct solution (hand) is chosen as "Map coefficients". It has the name "Best electron density map coefficients". To add another density map, click → "Show list" and then on the → "plus sign". Select the second "Best density - other hand" as "Map coefficient". Now → "Run".

Both density maps are loaded with a quite similar color. You can choose to show/hide a map via the "Display Manager" in the top menu of COOT. Just tick/untick the "Display" box next to the density you want to show/hide.

Analyze the electron density maps and the graphs of the SHELX run:

How many S atom positions have been found by SHELX and what is the occupancy of these sites?
Compare the electron density maps of the two solutions. Has a correct and good solution been obtained?
Can you identify side chains in the electron density?
Based on the positions of the disulfide bridges, try to find the sequence around one or two of the cysteines.

Document your "sequencing procedure" for the protocol by making snapshots.

Manual model building with COOT

With an electron density map of this quality, automatic model building is usually the first choice and works well, as we will see later. It is, however, more instructive to manually build a model based on the electron density map. In this practical, the aim is to build a model of insulin with the program COOT based on the experimental electron density map.

General tips and procedures:

Building a protein structure from scratch is like building a house (or car, ...). You need the right tools, used in a logical order. Look at the flowchart below. It tries to summarize the general route you might want to follow in the practical. The order, however, is not as fixed as the scheme suggests. For example, if you want to renumber the residues first and then do the mutation step, that is perfectly fine.
It is very helpful to activate the names of the COOT tools on the right side. Go to → "Edit" → "Preferences". Select the tab → "Refinement Toolbar" and in the section "Toolbar Style" click on → "Icons and Text".
As soon as you add a new C-alpha trace, a unique number is associated with that chain in COOT. You can see the number in the Display Manager. It is advisable to remember that number (e.g., by writing it down), especially for the merging or renumbering step or for saving the coordinates (COOT will ask you to "Select molecule number to save"). Because each chain is built as a separate molecule, you need to merge the molecules at the end. Use → "Calculate" → "Merge Molecules..." to do so.
After converting any C-alpha trace to a poly-ALA model or before beginning to model a new C-alpha trace, you should always delete the "Baton Atoms" (in the "Display manager")! Forgetting this will usually create a very confusing model that is extremely hard to fix.
The electron density map from the SHELX run is calculated from imperfect phases. These were derived from the positions of the anomalous substructure and the subsequent density modification. As a direct result of the phase imperfection, the electron density might contain errors or be weak or even absent for side chains or complete residues. Only build the side chains or amino acids you can clearly see in the electron density.
Give the chains the correct chain ID ("A" in the PDB file should be chain A of insulin) and the right numbering (the first amino acid should have the number 1). The name of the chains is either the character "A" or "B".
Although the experimental map also reveals some water positions (electron density blobs beside the main chain), we first want to refine the model and add the water molecules later.

It is a good idea to save the results from time to time in order to avoid a loss of the model due to a program crash or user errors.

Before you close COOT, you need to save the model to CCP4 as well! Do it via → "File" → "Save mol to CCP4". Only this step creates an entry in the CCP4I2 database. Immediately talk to the supervisor(s), if something appears wrong. Do not start another job in CCP4! In almost all cases, the last models can be rescued.

Flowchart of the model building process

          Initial electron density
                     |
                     |
             Activate map skeleton
                     |
                     |
    ╭──────> C-alpha baton mode  <------>  Reverse direction?
    │         (create CA trace)           (correct C->N error)
    │                |
    │                |
    │        Ca Zone -> Mainchain    <-->  Reverse direction?
    │         (convert CA trace to        (correct C->N error)
   Add          poly-ALA chain)           
  second             |
  chain              |
    │         Mutate & AutoFit   <------> Add residues?
    │         (assign sequence)           (extend termini)            
    │                | 
    │                |
    │        Renumber residues +
    │          Change Chain ID
    ╰─────< (correct register and
              names of chains) 
                     |
                     |
                Merge chains
                     |
                     |
                Initial model

Activate the map skeleton.

Why:

This is a helpful tool showing an automatic trace through the electron density (the "skeleton") for orientation.

How:

Go to "Calculate" → "Map Skeleton". Select the map you want and activate the skeleton ("on"). It will go through parts with strong density, i.e., both the main chains and some side chains.

Create a C-alpha trace.

Why:

Build a part of the protein chain by defining the positions of its C-alpha atoms. The C-alpha trace is later converted into a poly-alanine (poly-ALA) model of the protein.

How:

Center on a well-recognizable CA position. Such a position can typically be found at a branch in the map skeleton. Unfortunately, COOT cannot center on skeletons, so use CTRL + left mouse button to move the center to the position where the CA trace should start. Remember that you are in 3D space: you need to center the marker from at least two viewing angles approx. 90° apart. The center of the graphics window is marked by a point. To make sure that there are no baton atoms left over from a previous main chain, go to the Display Manager and delete the "Baton Atoms", if necessary. Then open the menu "Calculate" → "Other Modelling Tools..." and activate "C-alpha Baton Mode...".

A "Baton" menu appears immediately, and so-called "Baton Atom Guide Points", possible positions of C-alpha atoms 3.8 Å apart, are displayed as red crosses in the main window of COOT. Take your time to understand which of these points really are C-alpha atoms, because the starting point is crucial for the Baton Build procedure. Build the C-alpha trace as far as you can recognize it.
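The distance of 3.8 Å between consecutive C-alpha atoms can also be used to check a finished trace after it has been saved. The following Python sketch reads C-alpha positions from PDB-format lines and lists the distances between neighbours. The tolerance of 0.3 Å and all function names are assumptions made for illustration only.

```python
import math


def parse_ca(pdb_lines):
    """Extract C-alpha coordinates from PDB-format ATOM records (fixed columns)."""
    coords = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def ca_distances(coords):
    """Distances between consecutive C-alpha atoms in angstrom."""
    return [math.dist(p, q) for p, q in zip(coords, coords[1:])]


def suspicious(distances, ideal=3.8, tolerance=0.3):
    """Indices of distances that deviate from the ideal value by more than the tolerance."""
    return [i for i, d in enumerate(distances) if abs(d - ideal) > tolerance]


if __name__ == "__main__":
    # "my_trace.pdb" is a hypothetical file name for a trace saved from COOT.
    with open("my_trace.pdb") as handle:
        dists = ca_distances(parse_ca(handle))
    print([round(d, 2) for d in dists])
    print("check positions:", suspicious(dists))
```

A distance far from 3.8 Å usually points to a skipped residue or a wrongly chosen guide point.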

Possible problems:

If you (accidentally) built the C-alpha trace in the direction from the C-terminus towards the N-terminus (direction C→N), you should reverse the C-alpha trace direction at this stage. To do that, click on "Calculate" → "Other Modelling Tools..." → "Reverse Direction..." and then on one of the atoms of the C-alpha trace. Note that only the numbering is changed and there is no direct visualization of this inversion.

Convert C-alpha trace to poly-ALA chain

Why:

This is the first step towards a full-atom model of insulin. After the conversion into a poly-ALA chain, it is typically easier to identify where the side chains are located. The next step will be to assign the correct amino acid sequence.

How:

Go to "Calculate" → "Other Modelling Tools..." and then click on "Ca Zone → Mainchain". After you activated this, you have to click on one of the atoms in the C-alpha trace you want to convert. There are always two new poly-ALA models built! "mainchain-forwards" is the poly-ALA model, which followed the numbering of the CA trace. "mainchain-backwards" is going in the opposite direction.

At this point, you need to figure out which of the two chains is the correct one. Try to fit parts of both chains into the density ("Real Space Refine Zone") to do so.

Possible problems:

Sometimes you recognize too late that your C-alpha trace was built in the direction C→N. This has to be corrected before you can go on. Click on "Calculate" → "Other Modelling Tools..." → "Reverse Direction..." and then on one of the atoms of the poly-ALA model. This will immediately convert the poly-ALA model into a C-alpha trace again. Convert it back to a poly-ALA model as described above.

Mutate & AutoFit and Add terminal residues

Why:

If you start from a poly-ALA model or if you erroneously assigned the wrong amino acid, you have to assign the correct amino acid "identity". It should be done with the "Mutate" feature of COOT. Use "Autofit" to automatically fit the side chain into the electron density.

Sometimes you need to add further residues to either the N- or C-terminal end of your model. This is especially the case when you have built only a portion of your protein with batons.

How:

Based on the poly-ALA model, the assignment of the correct amino acid sequence should be straightforward. The positions of the cysteines can be found with the help of the anomalous scatterers (i.e., the sulfur atoms of the cysteine side chains) identified by SHELX. Identify the amino acids next to the cysteines to get an idea which part of insulin you are looking at. Modify a residue by first activating "Mutate & Autofit" and then clicking on one of the atoms of the target residue. A small window will pop up. Choose the appropriate amino acid. The "Autofit" procedure will try to fit the side chain of the residue into the electron density.

The orientation of the side chains can be deduced from the positions of the C-alpha atoms. Sometimes, however, the first and/or last residue is oriented wrongly after the conversion of the batons to the poly-ALA model. Use "Real Space Refine Zone" to drag the wrongly positioned atom to where you want it. This step is easier when done before the mutate step.

Adding a residue (the default is an alanine) can be done with the option "Add Terminal Residue..." from the menu on the right side. Then click on one atom of the terminus you want to add the residue to. A new alanine residue will appear, fitted into the nearby electron density.

Make sure that the orientation of the terminal residue is correct, i.e., that the amine of the N-terminal end and the carbonyl carbon of the C-terminal end point towards the previous and the next residue, respectively.

Possible problems:

As mentioned, COOT will put a new terminal residue into the electron density adjacent to the amine of the N-terminus or carbonyl carbon of the C-terminus, respectively. If the termini are not fitted correctly into the electron density, COOT might place the new alanine into the density belonging to a sidechain or even a neighboring protein chain. In that case, use "Real-space refine zone..." to solve the problem.

Renumber residues + Change Chain ID

Why:

To obtain the correct residue numbering and the correct chain names, you might have to renumber the residues and might also have to rename the chains.

How:

Renumber residues: Choose "Edit" → "Renumber Residues..." and then select the molecule and the chain you want to renumber. Select the start and end residue. For most purposes, the start is the N-terminus and the end is the C-terminus. Note that you have to enter an offset, not the new number: if the residue currently numbered "10" should become number "1", the offset is "-9".

Change Chain ID: With "Edit" → "Change Chain IDs..." you have to select the molecule and the chain you want to rename. Select either the complete chain or the residues between "from" and "to". The latter is only needed for merging specific fragments. For example, if you want to merge chain A and B of one object into one chain A, you have to select the specific range.
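The offset arithmetic for renumbering is simple but easy to get wrong by one. A tiny Python sketch with hypothetical function names makes it explicit:

```python
def renumber_offset(current_number, target_number):
    """Offset to enter so that `current_number` becomes `target_number`."""
    return target_number - current_number


def apply_offset(numbers, offset):
    """Residue numbers after the offset has been applied."""
    return [n + offset for n in numbers]


if __name__ == "__main__":
    offset = renumber_offset(10, 1)
    print("offset:", offset)
    print(apply_offset([10, 11, 12], offset))
```

The same offset is applied to every residue in the selected range, so it is sufficient to work it out for one residue whose identity you are sure of.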

Possible problems:

None?

Merge chains

Why:

If you have modeled the two chains as described above, each will have its own ID in COOT. This means that COOT distinguishes between the two chains and treats them as separate objects. You need to combine both objects into one.

How:

Go to "Edit" → "Merge Molecules...". Choose the molecules (look at their IDs) you want to "Append/Insert Molecule(s)" and then select the molecule (ID!) you want to merge them into. It is wise to select the object containing chain B in the "Append" section and then to append it to the molecule with chain A. Both chains will then belong to the molecule you have appended to.

Merging molecules cannot easily be undone. Make sure to select the correct molecules beforehand.

Possible problems:

If you have appended/inserted the wrong molecules into your target, you cannot undo it.

The molecule(s) you have selected to be appended are not deleted, but only deactivated.

Thus you only need to extract the original molecule present in the object you have appended to. Use "Edit" → "Copy Fragment" and select the chain you want to extract. The selection syntax is e.g., "//A/" to select chain A.

The fragment will then be copied into a new object (check the ID, you know the game by now) and will have a rather strange name ("Atom selection from ..."). Now go on and merge the correct molecules.

Automatic model building

Automatic model building will take a while. If it is not time for lunch, consider beginning with the next task (Refinement of the model) in the meantime: first start the refinement with REFMAC5 and then start the automated model building.

Open the correct program via → "Model building and Graphics" → "Autobuild with ModelCraft, Buccaneer and Nautilus".

In the field "Reflection data" select for "Reflections" the output of the AIMLESS job. It should have the name ("/insulin/insulin"). The "Free R set" should also be chosen from the same job.
Untick "Get initial phases from refining the starting model", as you want to use the phases you obtained in the SHELX job. Set "Phases" to the "Best" phases.
In "Asymmetric unit contents" choose the appropriate "Asu content file".
In "Optional pipeline steps" select the following steps and unselect the others: → "Classical density modification with Parrot" and → "Addition of waters".
Hit → "Run".

Automatic model building is in general the first step towards a new molecular structure. The initial density is not always as good as in this case. Using the information from the molecular model can then help to improve the phases and thus to obtain a better and more interpretable electron density map.

The automatically built model might not be placed at the same position in the asymmetric unit as your model. Open your model as well as the automatically built model in COOT. To compare both structures more easily, use a superposition method to place the automatic structure onto your model. In COOT go to → "Calculate" → "SSM Superpose...". As "Reference Structure" choose the model you built, "Moving structure" is the autobuilt model. Hit → "Apply".

Report the results of the automatic model building process in a very brief form. Describe how many residues were built in how many chains. State the number of correctly identified residues, too. This information should be given in the log output of the respective program. Describe very briefly how the density and its interpretability changed. Look especially at regions where you were not able to complete the model (probably at the more flexible termini of the protein chains). State whether the automatic model had the correct connectivity between the chains (i.e., the disulfide bonds).

Refinement of the model

Refining a structure needs both reciprocal space refinement and real space refinement. Below is a flowchart of this process:

                Initial model      
                     |
                     |
    ╭───────>  Reciprocal space 
    │           refinement
    │        (geometry, B factors)
    │                |
    │                |
    ╰─────────<  Real space      <------>  Add waters, ligands, ... 
                 refinement               (New things visible in
           (side chain conformations,      the electron density?)
           water positions, ligand       
             conformation, ...)
                     |
                     |
                Final model

In CCP4I2, open in the task menu "Refinement" and then "Refinement - REFMAC5".

In "Main inputs" choose as "Atomic model" in the first run the initial model you built manually in COOT. If the model is not available in the drop-down list, use the "Browse files" symbol next to the menu.
"Reflections" and "Free R set" should be selected from the AIMLESS job.
You want to "Use the anomalous data" to "only calculate an anomalous map".
Under "Options" specify that you want to do 20 cycles of refinement.

Start the refinement. After the refinement has been started, the log output will show the results from the refinement run, even if the job is not yet finished. Check that there are no error messages and everything works as expected. Watch how the R-factor and free R-factors change.

Note the R and free R factors before and after refinement for the protocol, and include the graph of the R factors vs. cycle number (screenshot). After how many cycles have the R and Rfree factors converged? Was the number of refinement cycles appropriate?
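As a reminder of what these numbers mean, the R factor compares observed and calculated structure factor amplitudes, R = Σ| |Fo| - |Fc| | / Σ|Fo|. The free R factor is the same quantity computed only for the test set. The Python sketch below uses made-up amplitudes that are assumed to be on the same scale (REFMAC5 takes care of scaling internally). The convergence criterion and all function names are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(|Fo - Fc|) / sum(Fo) for amplitudes that are on the same scale."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def first_converged_cycle(values, tolerance=0.001):
    """Index of the first cycle whose value differs from the previous one by less
    than `tolerance`, or None if that never happens."""
    for i in range(1, len(values)):
        if abs(values[i] - values[i - 1]) < tolerance:
            return i
    return None


if __name__ == "__main__":
    # Made-up amplitudes and made-up R values per cycle.
    print(round(r_factor([100.0, 200.0, 300.0], [90.0, 210.0, 300.0]), 4))
    print(first_converged_cycle([0.40, 0.30, 0.25, 0.2495, 0.2494]))
```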

Next examine the refined model in COOT.

In CCP4I2 first define the coordinate file from REFMAC5 as "Coordinates".
Then as "Electron density map" use the "Weighted map from refinement" and as "Difference density map" you need to select the "Weighted difference map from refinement".
Inspect all residues from the N-terminus to the C-terminus of both chains to see if everything appears to be correct and fits the density well. The density should be better than the first experimental electron density from phasing, since the model phases are better than the experimental phases (if the model is mostly built correctly and has a reasonable completeness). If necessary, correct conformations or add missing residues.
Helpful options when analysing the structure: Display symmetry related molecules via "Draw" → "Cell & Symmetry" and then activate "Symmetry On". With this option you can analyze, if density at crystal contacts is explained by symmetry related molecules. Analyzing interactions of a residue/molecule: "Measures" → "Environment Distances" → "Show Residue Environment?".

Rebuild parts of the model, if necessary.

Alternate conformations. Based on the high resolution electron density map, alternate conformations of residues may become apparent. Add these with the option "Add Alt Conf..." in "Model/Fit/Refine" or on the toolbar on the right side. If there is only one alternate conformation for the side chain (usual case), choose "split at Calpha". If also the main chain atoms show an alternate conformation, choose "Split all of a single residue". Often, this requires also the previous/next residue to have an alternate conformation.
For the Gln, Asn, and His residues also check if the side chain conformation is reasonable with respect to a 180° flip based on hydrogen bonding interactions. You may need to check interactions to symmetry related atoms for this purpose (option "Draw" → "Cell & Symmetry"). If necessary, change the conformation.
Are residues with ill defined density present? What is the cause of the "bad" density?
Adding water molecules: Before going through the refined model, water molecules should be added, such that their positions can be checked together with the protein residues. Use the option "Calculate" → "Other Modelling Tools..." → "Find waters ..." to add the water molecules. Choose the current difference density map and the current model. Pick peaks higher than 3.0 sigma. Some water molecules will be placed in protein residues that have not yet been built or in density due to alternate side chain conformations. Delete these waters when going through the chains ("Delete" in "Calculate" → "Model/Fit/Refine" or on the toolbar on the right side).
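The meaning of a peak threshold in "sigma" can be illustrated as follows: the density values are expressed in units of their standard deviation around the mean, and only values above the chosen level are kept. The sketch below works on a plain list of made-up numbers and is not the water search implemented in COOT, which applies further criteria. The function name peaks_above is hypothetical.

```python
import statistics


def peaks_above(values, n_sigma=3.0):
    """Indices of values lying more than `n_sigma` standard deviations above the mean."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0.0:
        return []  # a completely flat map has no peaks
    return [i for i, v in enumerate(values) if (v - mean) / sigma > n_sigma]


if __name__ == "__main__":
    # Made-up "map": mostly flat density with a single strong peak at the end.
    density = [0.0] * 99 + [10.0]
    print(peaks_above(density))
```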

Save the molecule to a PDB file and do another refinement with REFMAC5 (20 cycles). After that, check the protein model and the water molecules again based on the electron density and the points mentioned above. The R-factors and density should further improve due to the addition of water molecules in the previous cycle. Search for further water molecules.

Always monitor the R factors during the refinement with REFMAC5. The refinement with alternate cycles of model building and refinement should proceed until convergence is reached and there is both no further improvement of the free R-factor and, more importantly, the model cannot be improved any further according to the electron density (or your knowledge).

After you finished the refinement cycle:

Describe the course of structure refinement (R and Rfree factors, how many residues and water molecules have been part of the model). Prepare a table containing the most important refinement parameters.

Model validation

In CCP4I2:

In the task menu "Validation and Analysis", choose "Multimetric model geometry validation". Select the correct model and reflections (from the AIMLESS run).

The output will be printed in the "Results" section. Use a screenshot to get an image of the Ramachandran plot(s) for the protocol. Furthermore, you get the number of residues in the favored, allowed and high-energy backbone-conformations (outliers) from the Ramachandran plot.
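The Ramachandran plot shows the backbone torsion angles phi and psi of each residue. Phi is the torsion angle defined by the atoms C(i-1), N(i), CA(i), C(i), and psi the one defined by N(i), CA(i), C(i), N(i+1). How a torsion angle is obtained from four atom positions can be sketched in Python as below. This is a generic textbook calculation with made-up coordinates, not the code of the validation program, and the function names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees around the p1-p2 bond for four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = tuple(b0[i] - _dot(b0, b1) * b1[i] for i in range(3))
    w = tuple(b2[i] - _dot(b2, b1) * b1[i] for i in range(3))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


if __name__ == "__main__":
    # Made-up coordinates of four atoms.
    print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```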

If the Ramachandran plot is empty (no dots visible), please use COOT to get a valid plot. Start COOT with the last refined model and go to → "Validate" → "Ramachandran Plot". Select the structure you want to analyze.

Are there also rotamer outliers? If so, is the density well defined for these residues? The program also analyzes the orientation of the side-chains of asparagine, glutamine and histidine and suggests, whether they need to be flipped. Check the hydrogen bonding interactions of the side chain to validate, if a flip is necessary.

Ramachandran plot

Analysis of the molecular structure

Load the last REFMAC5 refined structure with COOT. This time, you need to add an "Anomalous density map". Choose "Weighted anomalous difference map from refinement" from the last REFMAC5 job.
Analyze the disulfide bridges. Which cysteine residues are involved in disulfide bridges? Which disulfide bridges link the two insulin chains?
Analyze the crystal packing: This is best done by using a C-alpha trace representation of the molecule (in the Display Manager) and the symmetry related molecules. Use "Draw" → "Cell and Symmetry" and in "Symmetry by Molecule" switch to "Symmetry on" and activate "Display as CAs" for the current model. Activate "Show Unit Cell". Increase the radius to 50 Å.

How is the molecule packed? Save a snapshot for the protocol.
Analyze the anomalous map based on the measured anomalous differences and the final model phases. What is the peak height of the weakest and the strongest sulfur atom? What is the peak height of the strongest anomalous density peak that is not caused by a sulfur atom?
Analyze where the hydrophobic and where the hydrophilic residues are mainly located. Where are the water molecules located? Are water molecules found in the center of the protein?
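The visual analysis of the disulfide bridges in COOT can be cross-checked numerically from a PDB file of the refined model. The following is a minimal sketch under stated assumptions: the function name "find_disulfides" is hypothetical, the file is assumed to follow the fixed-column PDB format, and the cutoff of 2.3 Å is an assumed generous limit around the typical S-S bond length of about 2.05 Å. Only the molecule in the file is considered, not its symmetry mates.

```python
import math
from itertools import combinations


def find_disulfides(pdb_lines, max_distance=2.3):
    """Return ((chain, resnum), (chain, resnum), distance) for close Cys SG pairs."""
    sg_atoms = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Fixed PDB columns: atom name 13-16, residue name 18-20,
        # chain 22, residue number 23-26, x/y/z 31-54.
        if line[12:16].strip() != "SG" or line[17:20].strip() != "CYS":
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        sg_atoms.append((line[21], int(line[22:26]), xyz))

    bridges = []
    for (c1, r1, p1), (c2, r2, p2) in combinations(sg_atoms, 2):
        if (c1, r1) == (c2, r2):
            continue  # alternate conformations of the same residue
        distance = math.dist(p1, p2)
        if distance <= max_distance:
            bridges.append(((c1, r1), (c2, r2), round(distance, 2)))
    return bridges


# Usage with the saved model (file name as an example):
#     with open("refmac_final.pdb") as handle:
#         for bridge in find_disulfides(handle):
#             print(bridge)
```

Pairs with different chain identifiers are the bridges that link the two insulin chains. Compare the result with what you see in the electron density.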

For the next step in this practical you need to export the electron density maps. In COOT go to "File" → "Export Map..." and choose the 2Fo-Fc map ("Weighted map from refinement"), the Fo-Fc map ("Weighted difference map from refinement") as well as the anomalous map ("Weighted anomalous difference map"). Save them to the "insulin" folder as "refmac_2fo-fc_final.ccp4", "refmac_fo-fc_final.ccp4" and "refmac_ano.ccp4", respectively.

Save the model coordinates into the same folder as well. This can be done in COOT with "File" → "Save coordinates...". Use the file name "refmac_final.pdb".

Preparing molecular figures with PyMOL

Before you can start working with PyMOL, you need to export some more coordinates and maps with COOT.

In CCP4I2 reload the SHELX run by clicking once on its entry in the CCP4I2 database. Choose to start COOT (at the bottom of the results). In the section "Coordinates" select as "Atomic Model" the "Anomalous substructure coordinates".

The "electron density map" should be the "Best electron density map coefficients".

After COOT is loaded, export the "Best electron density map coefficients" as "shelx_phasing.ccp4" in the "insulin" folder. Also save the coordinates from the anomalous substructure as "shelx_substructure.pdb".

Now make yourself familiar with PyMOL. The best way is simply to load your structures and start!

The following figures should be prepared:

An overview of the fold of insulin in cartoon representation. The two chains are represented in different colours and the secondary structure elements as well as the termini of the two chains are labeled. The disulfide bridges are shown, too.

First, change the background to white. All figures should be prepared with white background. Most figures look better with the "Maximum quality" settings. Activate it in "Display" → "Quality".

Load the pdb file "refmac_final.pdb".

Create two objects ("fold", "ss-bonds") from the loaded structure with the command 'copy'. Hide all. Show object 'fold' in cartoon representation and object 'ss-bonds' with the cysteine residues involved in disulfide bridges as sticks. Color the two chains of 'fold' in different colors, e.g., marine and red. Change the color of the 'ss-bonds' residues to yellow. Label every alpha and beta element as well as the termini.

Orient the molecule in a way you like and in which you can see all the labels. Use the "Editing" mode of PyMOL to move labels, if necessary.

Save the current orientation matrix with the command "get_view". That matrix will be printed into the console output of PyMOL.

You can ray-trace the figure with the command "ray". It might take a while to render on your computer. Save the rendered figure (further on called scene) with the name "scene_A" via "File" → "Export Image As" → "PNG...".
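The steps for figure (A) can be collected in a PyMOL command script. The following is a minimal sketch that only assembles the command lines as text, so it runs without PyMOL. Several names are assumptions: the function "figure_a_script" is hypothetical, the loaded object is called "insulin", the chain identifiers are taken to be A and B, the object name is written "ss_bonds" with an underscore to stay clear of the selection syntax, and it is assumed that every cysteine takes part in a disulfide bridge (check this in COOT).

```python
def figure_a_script(pdb_file="refmac_final.pdb", chain_a="A", chain_b="B"):
    """Return the PyMOL commands for the overview figure as a list of lines."""
    return [
        "bg_color white",
        f"load {pdb_file}, insulin",
        "copy fold, insulin",        # copy target, source
        "copy ss_bonds, insulin",
        "hide everything",
        "show cartoon, fold",
        f"color marine, fold and chain {chain_a}",
        f"color red, fold and chain {chain_b}",
        # Side chains of the cysteines only (backbone N, C, O hidden).
        "show sticks, ss_bonds and resn CYS and not name N+C+O",
        "color yellow, ss_bonds and resn CYS",
        "get_view",                  # prints the set_view block to the console
    ]


print("\n".join(figure_a_script()))
```

Paste the printed lines into the PyMOL command input area or save them as a ".pml" file. Labels are best added after the orientation has been chosen; "ray" followed by "png scene_A.png" then writes the rendered image.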
A C-alpha representation of the fold in the same orientation. Use the same settings as before, except that no labeling of the secondary structure elements is given. Use the orientation matrix from figure (A). To do that, mark the lines with the "set_view" command and copy these into the PyMOL command input area.

Copy a new object ('calpha') from 'fold'. Disable the 'fold' object. Hide labels. In the new object (rename it to "fold2") hide cartoon and show ribbon. If you activated "Maximum Quality", your ribbons will be drawn in a roundish fashion; however, ribbons are often shown with sharp edges. Use the command "set ribbon_sampling, 1" to get the sharper edges and use "set ribbon_sampling, 10" for the rounded ribbons. Choose the representation you like most.

Save scene as "scene_B".
A view of an all atom representation (sticks) of the whole structure. The two chains are distinguished by different colors of the carbon atoms. The water molecules should also be shown.

Create new object 'sticks-all' from 'calpha'. Disable every other object. Show sticks for 'sticks-all'. Enable wall-eye stereo. You need to adjust the view afterwards.

Save scene as "scene_C".

Besides wall-eye stereo representation, there are other ways of depicting 3D figures (like cross-eye stereo or anaglyph stereo). Thus, mention in the protocol which type of representation you have chosen.
Figures of residues and their density. This includes a disulfide bridge, a residue with alternate conformations, disordered regions, a well defined region and water molecules.

Disable every other object. Create new objects for the residues to be displayed. Load the maps "refmac_2fo-fc_final.ccp4" and "refmac_fo-fc_final.ccp4". From the two maps you need to create three electron density objects (with the command "isomesh"): one for the 2Fo-Fc density and one each for the positive and the negative Fo-Fc density. Show the 2Fo-Fc electron density in a different color. Use green and red for the Fo-Fc density. This is also the default of PyMOL.

Save scenes.
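The density objects for these figures can be set up with a few "isomesh" commands. The following is a minimal sketch that assembles the PyMOL command lines as text. The function name "density_mesh_script", the object names, the contour levels (1.0 for 2Fo-Fc, ±3.0 for Fo-Fc), the carve radius of 1.6 Å and the example selection are assumptions to be adapted. The levels are meant in sigma units, which assumes that PyMOL normalizes the CCP4 maps on loading, as it does with its default settings.

```python
def density_mesh_script(selection, level_2fofc=1.0, level_fofc=3.0, carve=1.6):
    """Return PyMOL commands that draw the three meshes around `selection`."""
    return [
        "load refmac_2fo-fc_final.ccp4, map_2fofc",
        "load refmac_fo-fc_final.ccp4, map_fofc",
        # isomesh name, map, level, selection, carve=distance
        f"isomesh mesh_2fofc, map_2fofc, {level_2fofc}, ({selection}), carve={carve}",
        f"isomesh mesh_fofc_pos, map_fofc, {level_fofc}, ({selection}), carve={carve}",
        f"isomesh mesh_fofc_neg, map_fofc, {-level_fofc}, ({selection}), carve={carve}",
        "color blue, mesh_2fofc",
        "color green, mesh_fofc_pos",
        "color red, mesh_fofc_neg",
    ]


# Hypothetical example selection; replace it with the residues you want to show.
print("\n".join(density_mesh_script("chain A and resi 7")))
```

Calling the function once per figure with a different selection keeps the contour levels identical across all density figures, which makes them easier to compare.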
The side chain of an Asn, Gln or His residue for which the correct orientation (original or flipped by 180°) can be unambiguously determined based on the hydrogen bonding pattern. Show the residue and the hydrogen bonds (as dashed lines with the distance in Å) in both conformations.

Disable former objects. Create a new object with the chosen amino acid and the surrounding amino acids (within 4 Å distance). Show sticks and measure the hydrogen bonds with the 'Measurement' wizard. Move the distance labels so that they are visible.

Save scene.
An anomalous |F+ - F-| Fourier map with the final model phases. Show the S atoms found by SHELX and the cysteines of the final refined model together.

Disable former objects. Load the coordinate file "shelx_substructure.pdb". Show the sulfur atoms of the SHELX anomalous substructure and of the corresponding cysteine residues as spheres, but scale down the sphere radius to 0.3 (with the "alter" command). Color the sulfurs yellow. Load and show the anomalous density map "refmac_ano.ccp4" around the "shelx_substructure".

SHELX does not necessarily place the sulfur atoms of the anomalous substructure such that they coincide with the final model of insulin. In fact, their positions are chosen arbitrarily among the symmetry-equivalent positions. You can create the symmetry related sulfurs via the "Action" menu → "Generate" → "symmetry mates". Choose a "within" radius appropriate for you.

Save scene.
The experimental electron density map together with the final model. Show only part of the model such that the quality of the map can be judged well.

Disable former objects. Create a new object, show ribbons and lines. Zoom into a region for which you want to display the electron density. Use COOT to decide which region you want to show. You could, for example, show a disulfide bond or a region inside the protein with well defined aromatic residues.

Hide ribbons, show cartoon. Fine tune helix and strand appearance. Change the color of the helices and strands as you want. Create another object for residues you want to show as sticks, show them as sticks.

Load the density map "shelx_phasing.ccp4" and show the density.

Save the scene.

The molecular figures with proper figure legends and an analysis of the results in the main text.

References

Faust, A., Panjikar, S., Mueller, U., Parthasarathy, V., Schmidt, A., Lamzin, V.S., Weiss, M.S. (2008) 'A tutorial for learning and teaching macromolecular crystallography', Journal of Applied Crystallography, 41 (6), 1161-1172.