Query Sketcher

Structure queries can be created in the SketcherCanvas component. This page explains what a query structure is and how to create one. If you have not yet read the SketcherCanvas page, please do so first.

  1. What are Query Structures?
  2. Variables
  3. Initializing the Query Sketcher
  4. Additional Setup
  5. Simple Query Match
  6. Database Applications

What are Query Structures?

Structure matches will help to find molecules exactly as drawn, but users will also want to search for a range of results that fit some rules. To do this, the ChemDoodle Web Components library provides a query sketcher to create query structures. Query structures are rules that define a set of chemical structures. A user can set various properties on query atoms, such as element types, aromaticity and connectivity definitions. The same can be done for query bonds. The search then returns all results that fit those rules, allowing users to quickly discover a vast amount of desired information.

A query structure resembles other chemical structures, but variables are set on the nodes (query atoms) and edges (query bonds) to describe the search criteria. For instance, in the sketcher above, the query structure defines any molecule with a benzene ring where a halogen (x) is ortho to a saturated oxygen, sulfur or selenium atom and para to a 3-membered chain whose second bond is either a single or double bond and is contained in no rings.

Note that query atoms and query bonds with no variables defined default to matching only their current definition. A carbon atom will be converted into a query atom that only matches carbon, while a single bond will be converted into a query bond that only matches single bonds.

Variables

There are two types of variables: identity variables and attribute variables. Identity variables define the entity type (element types for atoms and bond orders for bonds) and are graphically represented in bold between square brackets. Attribute variables define all remaining criteria, such as the number of hydrogens on an atom or whether a bond is in a ring or not. Attribute variables are graphically represented in italics between parentheses below the identity variables.

Attribute variables are further categorized into several types. Each type corresponds to the type of data it matches and defines the match values.

Type | Description | Value
Boolean | This datatype matches on a true/false basis, for instance declaring that a matched atom must be aromatic. | The value is implied at true. Use the not operator to force a false.
Integer | This datatype matches an integer or some range of values, for instance declaring that the atom’s charge must be a certain value. | The value can be a specific integer value or an arbitrary range. For instance, users can define the charge to be -3, -1, or from 2-3.
Hash | This datatype matches a value that is mapped to some set of characters, for instance declaring that a matched atom’s chirality must be either ‘R’ or ‘S’. | The value is a character of the allowed values defined for the variable. For instance, for atom chirality, the allowed values are ‘R’, ‘S’ or ‘A’.
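
To make the Integer datatype concrete, here is a minimal sketch of how a value could be tested against a specification such as the charge example above. The helper name matchesIntegerSpec and the comma-separated text format are assumptions made for this illustration; they are not part of the ChemDoodle Web Components API.

```javascript
// Hypothetical helper, not a ChemDoodle function. The spec format
// ("-3,-1,2-3") is an assumption chosen for this illustration.
function matchesIntegerSpec(spec, value) {
  // the value matches if it falls inside any one of the listed parts
  return spec.split(',').some(function (part) {
    // one integer, or two integers joined by '-', each optionally negative
    var m = /^\s*(-?\d+)\s*(?:-\s*(-?\d+))?\s*$/.exec(part);
    if (!m) {
      throw new Error('Unrecognized range: ' + part);
    }
    var low = parseInt(m[1], 10);
    // a single integer is treated as a range of width one
    var high = m[2] === undefined ? low : parseInt(m[2], 10);
    return value >= low && value <= high;
  });
}

// charge of -3, -1, or from 2 to 3
console.log(matchesIntegerSpec('-3,-1,2-3', 2));
console.log(matchesIntegerSpec('-3,-1,2-3', 0));
```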

Atom Variables

The identity variables for query atoms define the allowed elemental values for matched atoms. The following values are allowed:

  1. Element Symbols – H, C, N, Fe, Mn, U, etc. These are the element symbols from the periodic table and will match any atom of that element.
  2. Any – The any wildcard, specified by a lowercase ‘a’ symbol, will match any atom label. This is the default.
  3. Non-hydrogen – The non-hydrogen wildcard, specified by a lowercase ‘r’ symbol, will match any atom label except hydrogen.
  4. Heteroatom – The heteroatom wildcard, specified by a lowercase ‘q’ symbol, will match any atom label except hydrogen or carbon.
  5. Halide – The halide wildcard, specified by a lowercase ‘x’ symbol, will match any halogen symbol: F, Cl, Br, I, At.
  6. Metal – The metal wildcard, specified by a lowercase ‘m’ symbol, will match any metal (transition metals, other metals).

A single variable or any combination of variables is allowed for the identity definition.
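
The wildcard rules above can be sketched as a small predicate. This is an illustration only: matchesAtomIdentity is a hypothetical name, not a ChemDoodle function, and the metal wildcard is left out because it would need a full table of metallic elements.

```javascript
// Halogens as listed for the 'x' wildcard
var HALOGENS = ['F', 'Cl', 'Br', 'I', 'At'];

// Hypothetical helper, not a ChemDoodle function.
// allowed: array of identity values, e.g. ['O', 'S', 'Se'] or ['x']
// symbol:  element symbol of the candidate atom, e.g. 'Cl'
function matchesAtomIdentity(allowed, symbol) {
  // a combination of values matches if any single value matches
  return allowed.some(function (entry) {
    switch (entry) {
      case 'a': return true;                               // any atom
      case 'r': return symbol !== 'H';                     // non-hydrogen
      case 'q': return symbol !== 'H' && symbol !== 'C';   // heteroatom
      case 'x': return HALOGENS.indexOf(symbol) !== -1;    // halogen
      case 'm': throw new Error('metal wildcard is not covered by this sketch');
      default: return entry === symbol;                    // plain element symbol
    }
  });
}

console.log(matchesAtomIdentity(['O', 'S', 'Se'], 'S'));
console.log(matchesAtomIdentity(['x'], 'O'));
```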

The following query atom attribute variables are available:

Variable | Symbol | Datatype | Description
Aromaticity | A | Boolean | Matches any atom that is aromatic
Charge | C | Integer | Matches any atom with charge defined by range
Chirality | @ | Hash | Matches any atom with CIP stereochemistry of ‘R’ or ‘S’; ‘A’ will match any stereocenter
Connectivity | X | Integer | Matches any atom with number of connections defined by range
Connectivity (No H) | x | Integer | Matches any atom with number of connections defined by range, not including any connections to hydrogen
Hydrogens | H | Integer | Matches any atom with number of connections to hydrogens defined by range, both implicit and explicit
Rings | R | Integer | Matches any atom contained in the number of rings defined by range (SSSR set)
Saturation | S | Boolean | Matches any atom that is saturated (connected to only single bonds)

Any combination of these attribute variables may be defined for a query atom, with or without the identity variable (the default is any element symbol). Combined with the not operator, very advanced queries are possible.

Bond Variables

The identity variables for query bonds define the allowed bond type values for matched bonds. The following values are allowed:

  1. Non-negative Whole Numbers – 0, 1, 2, 3, 4, and so on. These represent the standard zero, single, double, triple, quadruple and so forth bonds.
  2. Any – The any wildcard, specified by a lowercase ‘a’ symbol, will match any bond type. This is the default.
  3. Half – The half bond, specified by a lowercase ‘h’ symbol, will match any half bond, or 0.5 bond order bond (a single electron).
  4. Resonance – The resonance bond, specified by a lowercase ‘r’ symbol, will match any resonance bond, or 1.5 bond order bond (delocalized).

A single variable or any combination of variables is allowed for the identity definition.
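
The bond identity values can be sketched the same way. The helper matchesBondIdentity is hypothetical and not part of the library; it assumes the candidate bond order is given as a number and that whole-number identity values may be written as strings, as they are in the sample JSON content further down this page.

```javascript
// Hypothetical helper, not a ChemDoodle function.
// allowed:   array of identity values, e.g. ['1', '2'] or ['r']
// bondOrder: numeric bond order of the candidate bond, e.g. 1.5
function matchesBondIdentity(allowed, bondOrder) {
  // a combination of values matches if any single value matches
  return allowed.some(function (entry) {
    if (entry === 'a') return true;              // any bond type
    if (entry === 'h') return bondOrder === 0.5; // half bond
    if (entry === 'r') return bondOrder === 1.5; // resonance bond
    return Number(entry) === bondOrder;          // 0, 1, 2, 3, ...
  });
}

console.log(matchesBondIdentity(['1', '2'], 2));
console.log(matchesBondIdentity(['1', '2'], 3));
```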

The following query bond attribute variables are available:

Variable | Symbol | Datatype | Description
Aromaticity | A | Boolean | Matches any bond that is aromatic
Rings | R | Integer | Matches any bonds that are members of the number of rings defined by range (SSSR set)
Stereochemistry | @ | Hash | Matches any bond with CIP stereochemistry of ‘E’ or ‘Z’; ‘A’ will match any stereocenter

Any combination of these attribute variables may be defined for a query bond, with or without the identity variable (the default is any bond type). Combined with the not operator, very advanced queries are possible.

NOT Operator

The not operator inverts a query variable. Any variable, either identity or attribute, can be inverted. For instance, a user can apply the not operator to the saturation variable on a query atom to match only unsaturated atoms.
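
Conceptually, the not operator wraps a test and flips its result. The sketch below shows this for the saturation example; the atom shape ({bondOrders: [...]}) and both function names are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: returns a test that passes exactly when the
// original test fails.
function not(predicate) {
  return function (candidate) {
    return !predicate(candidate);
  };
}

// Saturated means connected to only single bonds.
// The {bondOrders: [...]} atom shape is an assumption for this sketch.
function isSaturated(atom) {
  return atom.bondOrders.every(function (order) {
    return order === 1;
  });
}

var isUnsaturated = not(isSaturated);

console.log(isUnsaturated({ bondOrders: [1, 2] }));
console.log(isUnsaturated({ bondOrders: [1, 1, 1, 1] }));
```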

Initializing the Query Sketcher

The Query Sketcher is simply a SketcherCanvas created with the includeQuery option set to true. The SketcherCanvas class is a child of the Canvas class, so working with it is the same as working with any other Canvas in the ChemDoodle Web Components library. To initialize it, call its constructor, which places it into the HTML page, and include the includeQuery option in the options object:

new ChemDoodle.SketcherCanvas(name, width, height, options);

Additional Setup

In addition to calling the constructor, we recommend setting the following common query sketcher options, which adjust the graphics to improve the user experience. In particular, the default bond length should be doubled to 40 pixels, as query labels take more space than typical atom labels. Additional decorations such as implicit hydrogens, terminal carbon labels and colors should not be enabled, as they have no significance for query structures. The following code is the exact code used to initialize the Query Sketcher at the top of this page:

<script>
  // create the query sketcher by calling the SketcherCanvas constructor and setting the includeQuery variable to true
  var sketcher = new ChemDoodle.SketcherCanvas('sketcher', 500, 300, {useServices:true, oneMolecule:true, includeQuery:true});
  // this option helps users see perspective if bonds overlap
  sketcher.specs.bonds_clearOverlaps_2D = true;
  // double the bond length to 40 so query labels are readable
  sketcher.specs.bondLength_2D = 40;
  // this is the ChemDoodle JSON content for the loaded query structure
  var content = {"m":[{"a":[{"x":146.645,"i":"a0","y":98.1956},{"x":186.6451,"i":"a1","y":98.1956},{"x":126.645,"i":"a2","y":132.8366},{"x":126.645,"i":"a3","y":63.5546},{"x":206.645,"i":"a4","y":132.8366},{"x":86.645,"i":"a5","y":132.8366},{"x":86.645,"i":"a6","y":63.5546},{"x":246.645,"i":"a7","y":132.8366},{"x":66.645,"i":"a8","y":98.1956},{"q":{"as":{"v":["O","S","Se"]},"S":{"v":true}},"x":66.645,"i":"a9","y":167.4776},{"q":{"as":{"v":["x"]}},"x":26.645,"i":"a10","y":98.1956}],"b":[{"b":0,"e":3,"i":"b0","o":2},{"b":3,"e":6,"i":"b1"},{"b":6,"e":8,"i":"b2","o":2},{"b":8,"e":5,"i":"b3"},{"b":5,"e":2,"i":"b4","o":2},{"b":2,"e":0,"i":"b5"},{"b":5,"e":9,"i":"b6"},{"b":8,"e":10,"i":"b7"},{"b":0,"e":1,"i":"b8"},{"q":{"bs":{"v":["1","2"]},"R":{"v":"0"}},"b":1,"e":4,"i":"b9"},{"b":4,"e":7,"i":"b10"}]}]};
  // read and load the content
  sketcher.loadContent(new ChemDoodle.io.JSONInterpreter().contentFrom(content).molecules, []);
  sketcher.repaint();
</script>
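
In the content object above, the atoms and bonds that carry explicit query definitions are the ones with a "q" property ("as" holds the atom identity values, "bs" the bond identity values, and attribute variables such as "S" and "R" sit beside them). As a rough sketch, those definitions could be collected with plain JavaScript. The helper listQueryDefinitions is hypothetical, and the content here is a shortened copy of the object above, not a complete structure.

```javascript
// Shortened copy of the content above: three atoms and two bonds
var content = {"m":[{"a":[
  {"x":146.645,"i":"a0","y":98.1956},
  {"q":{"as":{"v":["O","S","Se"]},"S":{"v":true}},"x":66.645,"i":"a9","y":167.4776},
  {"q":{"as":{"v":["x"]}},"x":26.645,"i":"a10","y":98.1956}
],"b":[
  {"b":0,"e":1,"i":"b8"},
  {"q":{"bs":{"v":["1","2"]},"R":{"v":"0"}},"b":1,"e":2,"i":"b9"}
]}]};

// Hypothetical helper, not a ChemDoodle function: returns the id and
// query definition of every atom and bond that has a "q" property.
function listQueryDefinitions(content) {
  var found = [];
  content.m.forEach(function (mol) {
    (mol.a || []).concat(mol.b || []).forEach(function (item) {
      if (item.q) {
        found.push({ id: item.i, query: item.q });
      }
    });
  });
  return found;
}

listQueryDefinitions(content).forEach(function (entry) {
  console.log(entry.id, JSON.stringify(entry.query));
});
```

Atoms and bonds without a "q" property, such as a0 and b8 here, are the ones that fall back to matching only their current definition.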

Simple Query Match

A query structure is a powerful definition, but you will want to do something with it. iChemLabs Cloud services contain several graph isomorphism algorithms for comparing molecular structures: iChemLabs.isGraphIsomorphism(), iChemLabs.isSubgraphIsomorphism() and iChemLabs.isSupergraphIsomorphism(). All three of these functions can take a query structure as the arrow, the pattern that is checked against the target molecule. This is useful for setting up educational systems or quick structure validators. Access to our services is free for academic organizations. For other organizations, our rates are very reasonable; please view our support options.
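
A call to one of these functions might be wrapped as in the sketch below. Several things here are assumptions rather than documented facts: the argument order (arrow, target, callback, errorback), the callback receiving a true/false result, and the wrapper name checkSubstructure. The services object is passed in so the wrapper can be tried with a stand-in; on a real page it would presumably be the library's iChemLabs object. Please check the API documentation for the actual signatures before relying on this.

```javascript
// Hypothetical wrapper. ASSUMES the service function has the shape
// isSubgraphIsomorphism(arrow, target, callback, errorback) and that
// the callback receives a boolean result.
function checkSubstructure(services, arrow, target) {
  return new Promise(function (resolve, reject) {
    services.isSubgraphIsomorphism(arrow, target, resolve, reject);
  });
}

// Stand-in for the real services object, used only to exercise the
// wrapper without contacting a server; it always reports a match.
var fakeServices = {
  isSubgraphIsomorphism: function (arrow, target, callback, errorback) {
    callback(true);
  }
};

checkSubstructure(fakeServices, 'query structure', 'target molecule')
  .then(function (matches) {
    console.log(matches ? 'The target matches the query.' : 'No match.');
  });
```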

Database Applications

iChemLabs provides complete APIs for working with our query structures and database systems. Please contact us for information.