Working with PDB Files

The ChemDoodle Web Components library can handle advanced rendering of Protein Data Bank files in 3D components.

Loading PDB Files

There are three methods for loading PDB files:

1. Locally included in HTML. As described on the loading data page, PDB data can be included as a JavaScript variable with all line returns replaced by "\n" character pairs. Then use the ChemDoodle.readPDB() function to generate the chemical data structure, as shown in the following code:

<script>
  // print the PDB file to a Javascript variable
  var pdbFile = '...';
  // read the PDB data and store the returned Molecule data structure as the variable, structure
  var structure = ChemDoodle.readPDB(pdbFile);
</script>

NOTE: PDB files may contain quotes, so to properly create JavaScript strings with quotes that match the construction quotes, be sure to escape the inner quotes by placing a backslash before each of them. For example, if you use single quotes, as in the source above, to define the string containing the PDB data, you will need to replace all single quotes in that PDB data with backslash-single quote character pairs (\').
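The escaping described in the note can be automated when preparing a page. The following is a minimal sketch, assuming the raw PDB text is available as a string; `toSingleQuotedLiteral` is a hypothetical helper and not part of ChemDoodle. It produces a single-quoted literal that could be pasted into the `pdbFile` assignment.

```javascript
// Hypothetical helper, not part of the ChemDoodle API.
// Converts raw PDB text into a single-quoted JavaScript string literal.
function toSingleQuotedLiteral(pdbText) {
  var escaped = pdbText
    .replace(/\\/g, '\\\\')          // escape existing backslashes first
    .replace(/'/g, "\\'")            // escape single quotes, e.g. in atom names such as O5'
    .replace(/\r\n|\r|\n/g, '\\n');  // replace each line return with the two characters \ and n
  return "'" + escaped + "'";
}

// Example: an atom name containing a single quote, followed by a second line.
var literal = toSingleQuotedLiteral("ATOM      1  O5'  DG A   1\nEND");
```

Backslashes are escaped before quotes so that the backslashes added for the quotes are not escaped a second time.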

2. PDB files can be loaded with the iChemLabs.getOptimizedPDBStructure() function. This method is significantly faster than parsing the file in JavaScript. The same file may take minutes to parse using method 1 above, but will take less than 2 seconds using this method or the next.

The getOptimizedPDBStructure() function is simple to use. The first parameter is the PDB code of the structure to be returned. The second parameter is an options object; its withAtoms option is a boolean that states whether ligand atoms (atoms defined by the HETATM tag, as opposed to the ATOM tag, in PDB files) will be present in the returned structure. The final parameter is the callback function that will be executed when the data is returned. The following example demonstrates how to use this function:

<script>
  // declare the pdb_1bna variable in the top scope that will be set as the data from the function call
  var pdb_1bna = null;
  // call the iChemLabs function
  // we will get the structure for 1BNA
  // the 'withAtoms' option states that ligand atoms are to be retained
  ChemDoodle.iChemLabs.getOptimizedPDBStructure('1BNA', {
    'withAtoms' : true
  }, function(structure) {
    // set the top pdb_1bna variable to be used later or directly load it into a Canvas
    pdb_1bna = structure;
  });
</script>

3. Use ChemDoodle desktop to generate an optimized JSON input file for you. In ChemDoodle's File menu, the Optimize PDB to JSON… function will guide you through creating a JSON file for an input PDB file. Include this file in your HTML page with a URI. This will place a variable named pdb_* into the global JavaScript namespace, where * is the name of the input file. This variable will be a ready-to-use Molecule data structure that can be loaded directly into a 3D canvas. This process is the fastest method to load PDB files, typically taking less than a second.

Proteins

Proteins will be represented by ribbon models. There are several visual specifications to control how the ribbons are displayed. By default, ribbons will be displayed in a continuous style. Both sides of the ribbon can have a unique color, or they can be the same color. These colors can be modified with the proteins_primaryColor and proteins_secondaryColor specifications.

The ribbons can also be displayed in a cartoon style, with alpha helices and beta sheets rendered with a flat starting end and ending with an arrowhead. To do this, set the proteins_ribbonCartoonize specification to true. The segments of the ribbon representing helices can be bicolored independently of the rest of the ribbon. Sheets will have the same color on both sides and can also be colored independently of the rest of the ribbon. The specifications to control these colors are proteins_ribbonCartoonHelixPrimaryColor, proteins_ribbonCartoonHelixSecondaryColor and proteins_ribbonCartoonSheetColor and are self-explanatory.
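As a minimal sketch of the cartoon settings just described, the function below sets the relevant specifications on a specs object, such as the `specs` property of a 3D component. The function name and the color values are illustrative assumptions; only the specification names come from the text above.

```javascript
// Illustrative helper; the name and the colors are assumptions, not library defaults.
function applyCartoonRibbon(specs) {
  // render helices and sheets in cartoon style
  specs.proteins_ribbonCartoonize = true;
  // helices can use a different color on each side of the ribbon
  specs.proteins_ribbonCartoonHelixPrimaryColor = '#ff0000';
  specs.proteins_ribbonCartoonHelixSecondaryColor = '#ff8080';
  // sheets use a single color for both sides
  specs.proteins_ribbonCartoonSheetColor = '#ffff00';
  // the rest of the ribbon keeps the general two-sided ribbon colors
  specs.proteins_primaryColor = '#cccccc';
  specs.proteins_secondaryColor = '#999999';
  return specs;
}

// A plain object stands in for a component's specs here; with a real
// 3D component the call would be applyCartoonRibbon(canvas.specs).
var cartoonSpecs = applyCartoonRibbon({});
```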

Ribbons can also be segmented and colored by amino acid. To segment the ribbon by amino acid, set any of the following specifications to true:

  1. proteins_useShapelyColors – Use the Shapely color set for amino acids.
  2. proteins_useAminoColors – Use the Amino color set for amino acids.
  3. proteins_usePolarityColors – Polar amino acids will be colored red and non-polar amino acids will be colored white.
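Choosing one of the three color sets can be sketched as a small helper. The scheme labels and the helper itself are assumptions made for illustration. The text does not say what happens when more than one of these specifications is true, so this sketch cautiously enables exactly one of them.

```javascript
// Specification names from the list above, keyed by a short scheme label.
// The labels and the helper are assumptions made for this sketch.
var RESIDUE_COLOR_SPECS = {
  shapely: 'proteins_useShapelyColors',
  amino: 'proteins_useAminoColors',
  polarity: 'proteins_usePolarityColors'
};

function setResidueColoring(specs, scheme) {
  if (!Object.prototype.hasOwnProperty.call(RESIDUE_COLOR_SPECS, scheme)) {
    throw new Error('Unknown residue color scheme: ' + scheme);
  }
  // switch every scheme off, then enable the requested one
  Object.keys(RESIDUE_COLOR_SPECS).forEach(function (key) {
    specs[RESIDUE_COLOR_SPECS[key]] = false;
  });
  specs[RESIDUE_COLOR_SPECS[scheme]] = true;
  return specs;
}

// A plain object stands in for a component's specs object.
var residueSpecs = setResidueColoring({}, 'polarity');
```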

The proteins_ribbonThickness specification will control the ribbon thickness, but this specification must be set before the ribbon is generated from the Molecule data structure.

You may want to display alpha carbon backbone traces for proteins, rather than a ribbon. Both traces and ribbons can be displayed at the same time. To display traces, set the proteins_displayBackbone specification to true. You can also control the trace thickness with the proteins_backboneThickness specification, and the color with the proteins_backboneColor specification.

Nucleic Acids

Nucleic acids will be represented in cartoon style, with tubes representing the phosphate backbone and with shaped platforms representing the bases connected to the backbone tube.

The thickness of the tubes is controlled by the nucleics_tubeThickness visual specification, while the color is determined by the nucleics_tubeColor specification.

By default, the platforms for the bases will be colored by the Shapely color set. You can set them all to a uniform color by setting the nucleics_useShapelyColors specification to false, and then setting the nucleics_baseColor specification to the desired color.
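The nucleic acid settings can be grouped in one place, as in the sketch below. The helper name, the option names, and the example values are assumptions; the specification names are the ones given above. A uniform base color only takes effect once the Shapely colors are switched off, so the helper does both together.

```javascript
// Illustrative helper; `options` and its property names are assumptions.
function styleNucleicAcids(specs, options) {
  if (options.tubeThickness !== undefined) {
    specs.nucleics_tubeThickness = options.tubeThickness;
  }
  if (options.tubeColor !== undefined) {
    specs.nucleics_tubeColor = options.tubeColor;
  }
  if (options.baseColor !== undefined) {
    // disable the default Shapely coloring so the uniform color is used
    specs.nucleics_useShapelyColors = false;
    specs.nucleics_baseColor = options.baseColor;
  }
  return specs;
}

// A plain object stands in for a component's specs object.
var nucleicSpecs = styleNucleicAcids({}, { tubeColor: '#f06000', baseColor: '#c10000' });
```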

Color by Chain

Discrete protein and nucleic structures can be colored by chain by setting the macro_colorByChain specification to true. The colors will be unique for each structure, by iterating through HSL color space to provide the highest contrast.
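The library's exact color assignment is not described here. The sketch below only illustrates the general idea of stepping through HSL space, by spacing hues evenly around the color wheel for a given number of chains. The function name, the fixed saturation and lightness, and the even spacing are all assumptions.

```javascript
// Illustration only: evenly spaced hues, one per chain.
// This is not the library's own algorithm.
function chainColors(chainCount) {
  var colors = [];
  for (var i = 0; i < chainCount; i++) {
    // spread the hues over the full 360 degree color wheel
    var hue = Math.round((360 * i) / chainCount);
    colors.push('hsl(' + hue + ', 100%, 50%)');
  }
  return colors;
}

// Three chains give hues of 0, 120 and 240 degrees.
var threeChains = chainColors(3);
```

In the library itself, none of this is needed: setting `macro_colorByChain` to true on the specs object is the only step the text describes.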

Ligands

In general, the atoms from PDB files will be classified into 2 groups by the ChemDoodle Web Components library. The first group contains water and ligands, defined by the HETATM tag, and the second group contains the atoms from proteins and nucleic acids, defined by the ATOM tag. By default, atoms defined by the HETATM tag are displayed (except for water, covered in the next section), and atoms defined by the ATOM tag are hidden.

Controlling the representation of the ligand atoms is done using the specs.set3DRepresentation() function of the owner Canvas.

To show the protein and nucleic acid atoms and bonds, set the macro_displayAtoms and macro_displayBonds specifications to true. You can then control the representation of these atoms and bonds, separately from the ligand structures, by initializing a new VisualSpecifications object, calling its set3DRepresentation() function, and then setting that new VisualSpecifications object to the owning Canvas's residueSpecs variable.

The following source code demonstrates the instructions in this section:

<script>
  // we have our pdb structure loaded by one of the methods from above
  var pdbStructure = ...;
  // declare our 3D component
  var display3d = new ChemDoodle.TransformCanvas3D('display3d', 400, 400);
  // set the 3D representation for ligand atoms
  display3d.specs.set3DRepresentation('van der Waals Spheres');
  // set the 3D representation for the protein and nucleic acid atoms and set them to be displayed
  // first, create a new VisualSpecifications object
  var newSpecs = new ChemDoodle.structures.VisualSpecifications();
  // display these atoms in wireframe
  newSpecs.set3DRepresentation('Wireframe');
  // set the residueSpecs variable for the Canvas3D to bind it
  display3d.residueSpecs = newSpecs;
  // set the original specifications to display the protein and nucleic acid atoms and bonds
  display3d.specs.macro_displayAtoms = true;
  display3d.specs.macro_displayBonds = true;
  // load the molecule into the Canvas
  display3d.loadMolecule(pdbStructure);
</script>

You can also set a cutoff value to display protein and nucleic acid atoms, by their distance to the closest ligand atom. This is controlled by the macro_atomToLigandDistance specification. Water is not considered a ligand. If no ligands are present, then this specification will have no effect.
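To make the cutoff concrete, the sketch below filters a list of atoms by their distance to the closest ligand atom. It illustrates the concept and is not the library's implementation. The function names, the plain `{x, y, z}` atom objects, and the inclusive comparison at the cutoff are assumptions.

```javascript
// Euclidean distance between two points with x, y and z coordinates.
function distance(a, b) {
  var dx = a.x - b.x;
  var dy = a.y - b.y;
  var dz = a.z - b.z;
  return Math.sqrt(dx * dx + dy * dy + dz * dz);
}

// Keep only the protein or nucleic acid atoms that lie within `cutoff`
// of the closest ligand atom. Water is assumed to be excluded from
// `ligandAtoms` by the caller, since it is not considered a ligand.
function atomsNearLigands(macroAtoms, ligandAtoms, cutoff) {
  // with no ligands the cutoff has no effect, so nothing is filtered out
  if (ligandAtoms.length === 0) {
    return macroAtoms.slice();
  }
  return macroAtoms.filter(function (atom) {
    var closest = Infinity;
    ligandAtoms.forEach(function (ligand) {
      closest = Math.min(closest, distance(atom, ligand));
    });
    return closest <= cutoff;
  });
}

// One atom is exactly 5 units from the ligand, the other is farther away.
var nearby = atomsNearLigands(
  [{ x: 0, y: 0, z: 0 }, { x: 10, y: 0, z: 0 }],
  [{ x: 3, y: 4, z: 0 }],
  5
);
```

In the library, the equivalent step is setting `macro_atomToLigandDistance` on the specs object to the desired cutoff.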

Note that you must specifically account for this distance data when reading the PDB file. Using method 1, you must create a PDBInterpreter instance and set its calculateRibbonDistances variable to true before using it. Structures returned by method 2 currently do not contain protein or nucleic acid atoms. Method 3 will automatically perform this calculation for you.
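For method 1, the interpreter setup might look like the sketch below. Only the PDBInterpreter name and the calculateRibbonDistances variable come from the text above. That the constructor is reachable as `ChemDoodle.io.PDBInterpreter`, and that it exposes a `read()` method returning the Molecule data structure, are assumptions to check against the API documentation. The constructor is passed in as a parameter so the sketch does not depend on where it lives.

```javascript
// InterpreterClass is expected to be the library's PDBInterpreter
// constructor (assumed to be available as ChemDoodle.io.PDBInterpreter).
function readPDBWithDistances(InterpreterClass, pdbText) {
  var interpreter = new InterpreterClass();
  // set the flag before reading, so the distance data is calculated
  interpreter.calculateRibbonDistances = true;
  // read() is assumed to return the Molecule data structure
  return interpreter.read(pdbText);
}

// Assumed usage with the library loaded and pdbFile holding the PDB text:
// var structure = readPDBWithDistances(ChemDoodle.io.PDBInterpreter, pdbFile);
```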

Show/Hide Water

Water can be shown and hidden with the macro_showWater specification. By default, the specification is false.