Package org.jmol.symmetry
Class CIPChirality
java.lang.Object
org.jmol.symmetry.CIPChirality
A fully validated, relatively efficient implementation of Cahn-Ingold-Prelog
rules for assigning R/S, M/P, and E/Z stereochemical descriptors. Based on
IUPAC Blue Book rules of 2013 and assorted corrections.
IUPAC Project: Corrections, Revisions and Extension for the Nomenclature of
Organic Chemistry - IUPAC Recommendations and Preferred Names 2013 (the IUPAC
Blue Book)
https://iupac.org/projects/project-details/?project_nr=2001-043-1-800
http://www.sbcs.qmul.ac.uk/iupac/bibliog/BBerrors.html
Settable options:
set testflag1 use advanced in/out-sensitive Rule 6 (r,r-bicyclo[2.2.2]octane)
set testflag2 turn off tracking (saving of _M.CIPInfo) for speed
Features include:
- deeply validated
- includes revised Rules 1b and 2
- includes a proposed Rule 6
- implemented in Java (Jmol) and JavaScript (JSmol)
- only a few Java classes; < 1000 lines
- efficient, one-pass process for each center using a single finite digraph
for all auxiliary descriptors
- exhaustive processing of all 9 sequence rules (1a, 1b, 2, 3, 4a, 4b, 4c, 5,
6)
- includes R/S, r/s, M/P (axial, not planar), E/Z
- covers any-length odd and even cumulenes
- uses Jmol conformational SMARTS to detect atropisomers and helicenes
- covers chiral phosphorus and sulfur, including trigonal pyramidal and
tetrahedral
- properly treats complex combinations of R/S, M/P, and seqCis/seqTrans
centers (Rule 4b)
- properly treats neutral-species resonance structures using fractional
atomic mass and a modified Rule 1b
- implements CIP spiro rule (BB P-93.5.3.1) as part of Rule 6
- detects small rings (fewer than 8 members) and removes E/Z specifications
for such
- detects chiral bridgehead nitrogens and E/Z imines and diazines
- reports atom descriptor along with the rule that ultimately decided it
- fills _M.CIPInfo with detailed information about how each ligand was decided
(feature turned off by set testflag2)
- generates advanced Rule 6 descriptors (generally 'r') for cubane and the
like, using set testflag1
Primary 236-compound Chapter-9 validation set (AY-236) provided by Andrey
Yerin, ACD/Labs (Moscow).
Mikko Vainio also supplied a 64-compound testing suite (MV-64), which is
available on SourceForge in the Jmol-datafiles directory.
(https://sourceforge.net/p/jmol/code/HEAD/tree/trunk/Jmol-datafiles/cip).
Additional test structures provided by John Mayfield.
Additional thanks to the IUPAC Blue Book Revision project, specifically
Karl-Heinz Hellwich for alerting me to the errata page for the 2013 IUPAC
specs (http://www.chem.qmul.ac.uk/iupac/bibliog/BBerrors.html), Gerry Moss
for discussions, Andrey Yerin for discussion and digraph checking.
Many thanks to the members of the BlueObelisk-Discuss group, particularly
Mikko Vainio, John Mayfield (aka John May), Wolf Ihlenfeldt, and Egon
Willighagen, for encouragement, examples, serious skepticism, and extremely
helpful advice.
References:
CIP(1966) R.S. Cahn, C. Ingold, V. Prelog, Specification of Molecular
Chirality, Angew.Chem. Internat. Edit. 5, 385ff
Custer(1986) Roland H. Custer, Mathematical Statements About the Revised
CIP-System, MATCH, 21, 1986, 3-31
http://match.pmf.kg.ac.rs/electronic_versions/Match21/match21_3-31.pdf
Mata(1993) Paulina Mata, Ana M. Lobo, Chris Marshall, A. Peter Johnson, The CIP
sequence rules: Analysis and proposal for a revision, Tetrahedron: Asymmetry,
Volume 4, Issue 4, April 1993, Pages 657-668
Mata(1994) Paulina Mata, Ana M. Lobo, Chris Marshall, and A. Peter Johnson,
Implementation of the Cahn-Ingold-Prelog System for Stereochemical Perception
in the LHASA Program, J. Chem. Inf. Comput. Sci. 1994, 34, 491-504
http://pubs.acs.org/doi/abs/10.1021/ci00019a004
Mata(2005) Paulina Mata, Ana M. Lobo, The Cahn, Ingold and Prelog System:
eliminating ambiguity in the comparison of diastereomorphic and
enantiomorphic ligands, Tetrahedron: Asymmetry, Volume 16, Issue 13, 4 July
2005, Pages 2215-2223
Favre(2013) Henri A Favre, Warren H Powell, Nomenclature of Organic Chemistry
: IUPAC Recommendations and Preferred Names 2013 DOI:10.1039/9781849733069
http://pubs.rsc.org/en/content/ebook/9780854041824#!divbookcontent
code history:
5/12/18 Jmol 14.29.14 fixes minor Rule 5 bug and adds advanced Rule 6 in/out testflag1 option (857 lines)
5/1/18 Jmol 14.29.14 fixes enantiomorphic Rule 5 R/S check for BH64_85 and BH64_86
4/25/18 Jmol 14.29.14 fixes spiroallene Rule 6 issue for BH64_84
4/23/18 Jmol 14.29.14 fixes Rule 2 for JM_008, involving mass and duplicates (824 lines)
4/11/18 Jmol 14.29.13 adds optional CIPDataTracker class (822 lines)
4/2/18 Jmol 14.29.13 adds optional CIPDataSmiles class
4/2/18 Jmol 14.29.13 adds John's "mancude-like" cyclic conjugated ene Kekule
averaging
12/10/17 Jmol 14.29.9 adds CIPData, mancude Kekule averaging
11/11/17 Jmol 14.25.1 adds "duplicate over terminal" in Rule 1b; streamlined
(777 lines)
11/05/17 Jmol 14.24.1 fixes a problem with seqCis/seqTrans and also with Rule
2 (799 lines)
10/17/17 Jmol 14.20.10 adds S4 check in Rule 6 and also fixes bug in aux
descriptors being skipped when two ligands are equivalent for the root (798
lines)
9/19/17 CIPChirality code simplification (778 lines)
9/14/17 Jmol 14.20.6 switching to Mikko's idea for Rule 4b and 5. Abandons
"thread" idea. Uses breadth-first algorithm for generating bitsets for R and
S. Processing time reduced by 50%. Still could be optimized some. (820 lines)
7/25/17 Jmol 14.20.4 consolidates all ene determinations; moves auxiliary
descriptor generation to prior to Rule 3 (850 lines)
7/23/17 Jmol 14.20.4 adds Rule 6; rewrite/consolidate spiro, C3, double spiran
code (853 lines)
7/19/17 Jmol 14.20.3 fixing Rule 2 (880 lines)
7/13/17 Jmol 14.20.3 more thorough spiro testing (858 lines)
7/10/17 Jmol 14.20.2 adding check for C3 and double spiran (CIP 1966 #32 and
#33)
7/8/17 Jmol 14.20.2 adding presort for Rules 4a and 4c (test12.mol; 828 lines)
7/7/17 Jmol 14.20.1 minor coding efficiencies (833 lines)
7/6/17 Jmol 14.20.1 major rewrite to correct and simplify logic; full
validation for 433 structures (many duplicates) in AY236, BH64, MV64, MV116,
JM, and L (836 lines)
6/30/17 Jmol 14.20.1 major rewrite of Rule 4b (999 lines)
6/25/17 Jmol 14.19.1 minor fixes for Rule 4b and 5 for BH64_012-015; better
atropisomer check
6/12/2017 Jmol 14.18.2 tested for Rule 1b sphere (AY236.53, 163, 173, 192);
957 lines
6/8/2017 Jmol 14.18.2 removed unnecessary presort for Rule 1b
5/27/17 Jmol 14.17.2 fully interfaced using SimpleNode and SimpleEdge
5/27/17 Jmol 14.17.1 fully validated; simplified code; 978 lines
5/17/17 Jmol 14.16.1. adds helicene M/P chirality; 959 lines validated using
CCDC structures HEXHEL02 HEXHEL03 HEXHEL04 ODAGOS ODAHAF
http://pubs.rsc.org/en/content/articlehtml/2017/CP/C6CP07552E
5/14/17 Jmol 14.15.5. trimmed up and documented; no need for lone pairs; 948
lines
5/13/17 Jmol 14.15.4. algorithm simplified; validated for mixed Rule 4b
systems involving auxiliary R/S, M/P, and seqCis/seqTrans; 959 lines
5/06/17 validated for 236 compound set AY-236.
5/02/17 validated for 161 compounds, including M/P, m/p (axial chirality for
biaryls and odd-number cumulenes)
4/29/17 validated for 160 compounds, including M/P, m/p (axial chirality for
biaryls and odd-number cumulenes)
4/28/17 Validated for 146 compounds, including imines and diazines, sulfur,
phosphorus
4/27/17 Rules 3-5 preliminary version 14.15.1
4/6/17 Introduced in Jmol 14.12.0; validated for Rules 1 and 2 in Jmol
14.13.2; 100 lines
NOTE! NOTE! NOTE! NOTE! NOTE! NOTE! NOTE! NOTE! NOTE! NOTE! NOTE! NOTE! NOTE!
Added logic to Rule 1b:
Rule 1b: In comparing duplicate atoms, the one with lower root distance has
precedence, where root distance is defined as: (a) in the case of
ring-closure duplicates, the sphere of the duplicated atom; and (b) in the
case of multiple-bond duplicates, the sphere of the atom to which the
duplicate atom is attached.
Rationale: Using only the distance of the duplicated atom (current
definition) introduces a Kekule bias, which can be illustrated with various
simple models. By moving that distance to be the sphere of the parent atom of
the duplicate, the problem is resolved.
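The modified root-distance definition can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in CIPChirality: the Node class, its fields, and the method names rootDistance and compareRule1b are hypothetical, and the comparison is applied only when both nodes are duplicates, as the rule text states.

```java
/** Illustrative sketch only; all names here are hypothetical, not Jmol's. */
class Rule1bSketch {

  /** Minimal stand-in for a node of the hierarchical digraph. */
  static final class Node {
    final int sphere;            // distance of this node from the root
    final Node parent;           // node to which this node is attached
    final boolean isDuplicate;   // true for a duplicate atom
    final boolean isRingClosure; // true: ring-closure duplicate; false: multiple-bond duplicate
    final int duplicatedSphere;  // sphere of the real atom that a duplicate stands for

    Node(int sphere, Node parent, boolean isDuplicate, boolean isRingClosure,
        int duplicatedSphere) {
      this.sphere = sphere;
      this.parent = parent;
      this.isDuplicate = isDuplicate;
      this.isRingClosure = isRingClosure;
      this.duplicatedSphere = duplicatedSphere;
    }
  }

  /** Root distance as defined in the added Rule 1b logic, cases (a) and (b). */
  static int rootDistance(Node dup) {
    if (!dup.isDuplicate)
      throw new IllegalArgumentException("not a duplicate node");
    return dup.isRingClosure
        ? dup.duplicatedSphere  // (a) sphere of the duplicated atom
        : dup.parent.sphere;    // (b) sphere of the atom the duplicate is attached to
  }

  /** Negative if a has precedence, positive if b has precedence, 0 if tied. */
  static int compareRule1b(Node a, Node b) {
    if (!a.isDuplicate || !b.isDuplicate)
      return 0; // assumption: the rule only compares duplicate atoms
    return Integer.compare(rootDistance(a), rootDistance(b));
  }

  public static void main(String[] args) {
    Node root = new Node(0, null, false, false, 0);
    Node c1 = new Node(1, root, false, false, 0);
    Node c2 = new Node(2, c1, false, false, 0);
    Node ringDup = new Node(3, c2, true, true, 1);  // closes a ring back onto c1
    Node bondDup = new Node(3, c2, true, false, 0); // duplicate from a multiple bond at c2
    System.out.println(compareRule1b(ringDup, bondDup) < 0
        ? "ring-closure duplicate has precedence"
        : "no precedence for ring-closure duplicate");
  }
}
```

In this example the ring-closure duplicate has root distance 1 and the multiple-bond duplicate has root distance 2, so the former has precedence.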
Added clarification to Rule 2:
Rule 2: Higher mass precedes lower mass, where mass is defined in the case of
nonduplicate atoms with identified isotopes for elements as their exact
isotopic mass and, in all other cases, as their element's atomic weight.
Rationale: BB is not self-consistent, including both "mass number" (in the
rule) and "atomic mass" in the description, where "79Br < Br < 81Br". And
again we have the same Kekule-ambiguous issue as in Rule 1b. The added
clarification fixes the Kekule issue (not using isotope mass number for
duplicate atoms), solves the problem that F < 19F (though 100% nat.
abundance), and is easily programmable.
In Jmol the logic is very simple, actually using the isotope mass number, but
doing two checks:
a) if one of four specific isotopes (16O, 52Cr, 96Mo, 175Lu), reverse the test, and
b) if on the list of 100% natural isotopes or one of the non-natural
elements, use the element's accepted atomic weight.
See CIPAtom.getMass();
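The two checks can be sketched as follows. This is a minimal illustration, not the code of CIPAtom.getMass(): the class name, the method signature, and the small lookup tables are assumptions made for the example, the list of 100% natural isotopes is only a subset, and the handling of non-natural elements is omitted. The 0.1 reduction follows the description of RULE_2_REDUCE_ISOTOPE_MASS_NUMBER (16 becomes 15.9, and so on).

```java
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names and tables are hypothetical, not Jmol's. */
class Rule2MassSketch {

  // Hypothetical subset of average atomic weights, keyed by element symbol.
  static final Map<String, Double> ATOMIC_WEIGHT = Map.of(
      "H", 1.008, "C", 12.011, "O", 15.999, "F", 18.998, "Br", 79.904);

  // Check (a): isotopes whose mass number lies slightly above the element's
  // average atomic weight.
  static final Set<String> REDUCE = Set.of("16O", "52Cr", "96Mo", "175Lu");

  // Check (b): hypothetical subset of isotopes with 100% natural abundance.
  static final Set<String> MONOISOTOPIC = Set.of("19F", "23Na", "31P", "127I");

  /**
   * @param symbol element symbol; must be a key of ATOMIC_WEIGHT in this sketch
   * @param isotopeMassNumber 0 if no isotope is specified
   * @param isDuplicate true for a duplicate atom, which never uses isotope mass
   * @return the mass used for the Rule 2 comparison
   */
  static double getMass(String symbol, int isotopeMassNumber, boolean isDuplicate) {
    double weight = ATOMIC_WEIGHT.get(symbol);
    if (isDuplicate || isotopeMassNumber == 0)
      return weight;
    String key = isotopeMassNumber + symbol;
    if (MONOISOTOPIC.contains(key))
      return weight;                  // check (b): same as the unspecified atom
    if (REDUCE.contains(key))
      return isotopeMassNumber - 0.1; // check (a): e.g. 16 becomes 15.9
    return isotopeMassNumber;         // all other isotopes: integer mass number
  }

  public static void main(String[] args) {
    System.out.println(getMass("Br", 79, false) < getMass("Br", 0, false)); // 79Br < Br
    System.out.println(getMass("Br", 0, false) < getMass("Br", 81, false)); // Br < 81Br
    System.out.println(getMass("O", 16, false) < getMass("O", 0, false));   // 16O < O
    System.out.println(getMass("F", 19, false) == getMass("F", 0, false));  // 19F ties with F
  }
}
```

With these tables, all four comparisons in main evaluate to true, which matches the ordering "79Br < Br < 81Br" and avoids ranking F below 19F.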
PROPOSED Rule 6: An undifferentiated reference node has priority over any
other undifferentiated node.
Rationale: This rule is stated in CIP(1966) p. 357.
Author:
Bob Hanson hansonr@stolaf.edu
-
Nested Class Summary
Nested Classes -
Field Summary
Fields
Modifier and Type, Field, Description
(package private) static final int  A_WINS
(package private) static final int  B_WINS
(package private) javajs.util.BS  bsNeedRule
    set bits RULE_1a - RULE_6 to indicate a need for that rule based on what is in the model
(package private) int  currentRule
    the current rule being applied exhaustively
(package private) CIPData  data
    collected bitsets and more specialized SMILES/SMARTS searches and vwr references
(package private) boolean  doTrack
    are we tracking pathways for _M.CIPInfo?
(package private) boolean  havePseudoAuxiliary
    do we have r or s and so will need to recalculate Mata like/unlike lists in Rule 5?
(package private) static final int  IGNORE
(package private) boolean  isAux
    are we in the midst of auxiliary center creation?
(package private) static final int  MAX_PATH
    maximum path to display for debugging only using SET DEBUG in Jmol
(package private) static final int  NO_CHIRALITY
(package private) int  ptIDLogger
    incremental pointer providing a unique ID to every CIPAtom for debugging
(package private) CIPChirality.CIPAtom  root
    The atom for which we are determining the stereochemistry
(package private) static final int  RULE_1a
(package private) static final int  RULE_1b
(package private) static final int  RULE_2
(package private) static final String  RULE_2_nXX_EQ_XX
    These elements have 100% natural abundance; we will use their isotope mass number instead of their actual average mass, since there is no difference
(package private) static final String  RULE_2_REDUCE_ISOTOPE_MASS_NUMBER
    These elements have an isotope number that is a bit higher than the average mass, even though their actual isotope mass is a bit lower.
(package private) static final int  RULE_3
(package private) static final int  RULE_4a
(package private) static final int  RULE_4b
(package private) static final int  RULE_4c
(package private) static final int  RULE_5
(package private) static final int  RULE_6
(package private) static final String[]  ruleNames
(package private) static final int  SMALL_RING_MAX
    maximum ring size that can have a double bond with no E/Z designation; also used for identifying aromatic rings and bridgehead nitrogens
(package private) static final int  STEREO_BOTH_EZ
(package private) static final int  STEREO_BOTH_RS
(package private) static final int  STEREO_E
(package private) static final int  STEREO_M
(package private) static final int  STEREO_P
(package private) static final int  STEREO_R
(package private) static final int  STEREO_S
(package private) static final int  STEREO_Z
(package private) static final int  TIED
(package private) static final int  UNDETERMINED
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
private void  clearSmallRingEZ(SimpleNode[] atoms, javajs.util.Lst<int[]> lstEZ)
    Remove E/Z designations for small-ring double bonds (IUPAC 2013.P-93.5.1.4.1).
private void  getAtomBondChirality(SimpleNode atom, javajs.util.Lst<int[]> lstEZ, javajs.util.BS bsToDo)
    Get E/Z characteristics for specific atoms.
(package private) int  getAtomChiralityLimited(SimpleNode atom, CIPChirality.CIPAtom cipAtom, SimpleNode parentAtom)
    Determine R/S or one half of E/Z determination
private int  getBondChiralityLimited(SimpleEdge bond, SimpleNode a)
    Determine the axial or E/Z chirality for this bond, with the given starting atom a
void  getChiralityForAtoms(CIPData data)
    A general determination of chirality that involves ultimately all of Rules 1-6.
(package private) int  getEneChirality(CIPChirality.CIPAtom winner1, CIPChirality.CIPAtom end1, CIPChirality.CIPAtom end2, CIPChirality.CIPAtom winner2, boolean isAxial, boolean allowPseudo)
    Determine the stereochemistry of a bond
private SimpleNode  getLastCumuleneAtom(SimpleEdge bond, SimpleNode atom, int[] nSP2, SimpleNode[] parents)
getRuleName(int rule)
(package private) static boolean  isFirstRow
    Check if an atom is 1st row.
void  logInfo
private boolean  preFilterAtomList(SimpleNode[] atoms, javajs.util.BS bsToDo, javajs.util.BS bsEnes)
    Remove unnecessary atoms from the list and let us know if we have alkenes to consider.
private int  setBondChirality(SimpleNode a, SimpleNode pa, SimpleNode pb, SimpleNode b, boolean isAxial)
    Determine the axial or E/Z chirality for the a-b bond.
private void  setStereoFromSmiles(javajs.util.BS bsHelix, int stereo, SimpleNode[] atoms)
-
Field Details
-
RULE_2_nXX_EQ_XX
These elements have 100% natural abundance; we will use their isotope mass number instead of their actual average mass, since there is no difference
-
RULE_2_REDUCE_ISOTOPE_MASS_NUMBER
These elements have an isotope number that is a bit higher than the average mass, even though their actual isotope mass is a bit lower. We will change 16 to 15.9, 52 to 51.9, 96 to 95.9, 175 to 174.9 so as to force the unspecified mass atom to be higher priority than the specified one. All other isotopes can use their integer isotope mass number instead of looking up their exact isotope mass.
-
NO_CHIRALITY
static final int NO_CHIRALITY
-
TIED
static final int TIED
-
A_WINS
static final int A_WINS
-
B_WINS
static final int B_WINS
-
IGNORE
static final int IGNORE
-
UNDETERMINED
static final int UNDETERMINED
-
STEREO_R
static final int STEREO_R
-
STEREO_S
static final int STEREO_S
-
STEREO_M
static final int STEREO_M
-
STEREO_P
static final int STEREO_P
-
STEREO_Z
static final int STEREO_Z
-
STEREO_E
static final int STEREO_E
-
STEREO_BOTH_RS
static final int STEREO_BOTH_RS
-
STEREO_BOTH_EZ
static final int STEREO_BOTH_EZ
-
RULE_1a
static final int RULE_1a
-
RULE_1b
static final int RULE_1b
-
RULE_2
static final int RULE_2
-
RULE_3
static final int RULE_3
-
RULE_4a
static final int RULE_4a
-
RULE_4b
static final int RULE_4b
-
RULE_4c
static final int RULE_4c
-
RULE_5
static final int RULE_5
-
RULE_6
static final int RULE_6
-
ruleNames
-
MAX_PATH
static final int MAX_PATH
maximum path to display for debugging only using SET DEBUG in Jmol
-
SMALL_RING_MAX
static final int SMALL_RING_MAX
maximum ring size that can have a double bond with no E/Z designation; also used for identifying aromatic rings and bridgehead nitrogens
-
currentRule
int currentRule
the current rule being applied exhaustively
-
root
CIPChirality.CIPAtom root
The atom for which we are determining the stereochemistry
-
data
CIPData data
collected bitsets and more specialized SMILES/SMARTS searches and vwr references
-
doTrack
boolean doTrack
are we tracking pathways for _M.CIPInfo?
-
isAux
boolean isAux
are we in the midst of auxiliary center creation?
-
bsNeedRule
javajs.util.BS bsNeedRule
set bits RULE_1a - RULE_6 to indicate a need for that rule based on what is in the model
-
havePseudoAuxiliary
boolean havePseudoAuxiliary
do we have r or s and so will need to recalculate Mata like/unlike lists in Rule 5?
-
ptIDLogger
int ptIDLogger
incremental pointer providing a unique ID to every CIPAtom for debugging
-
-
Constructor Details
-
CIPChirality
public CIPChirality()
-
-
Method Details
-
getRuleName
-
getChiralityForAtoms
A general determination of chirality that involves ultimately all of Rules 1-6.
Parameters:
data -
-
setStereoFromSmiles
-
preFilterAtomList
Remove unnecessary atoms from the list and let us know if we have alkenes to consider.
Parameters:
atoms -
bsToDo -
bsEnes -
Returns:
whether we have any alkenes that could be EZ
-
isFirstRow
Check if an atom is 1st row.
Parameters:
a -
Returns:
elemno > 2 && elemno <= 10
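The returned condition can be written out directly. The sketch below is only an illustration: it takes the atomic number as an int, whereas the documented method takes an atom argument a, and the class name FirstRowSketch is hypothetical.

```java
/** Illustrative sketch only; takes an atomic number rather than an atom object. */
class FirstRowSketch {

  /** True for Li (3) through Ne (10), following the documented return expression. */
  static boolean isFirstRow(int elemno) {
    return elemno > 2 && elemno <= 10;
  }

  public static void main(String[] args) {
    System.out.println("C (6): " + isFirstRow(6));
    System.out.println("P (15): " + isFirstRow(15));
  }
}
```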
-
clearSmallRingEZ
Remove E/Z designations for small-ring double bonds (IUPAC 2013.P-93.5.1.4.1).
Parameters:
atoms -
lstEZ -
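The idea of clearing E/Z designations in small rings can be sketched as follows. This is a minimal illustration, not the documented method: the data layout (a String descriptor per atom and an int[] of {atom1, atom2, smallestRingSize} per double bond) is hypothetical, and the value 7 for SMALL_RING_MAX is an assumption based on the class description's "fewer than 8 members".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; the data layout is hypothetical, not Jmol's. */
class SmallRingEZSketch {

  // Assumption: "fewer than 8 members" means rings of up to 7 atoms.
  static final int SMALL_RING_MAX = 7;

  /**
   * @param descriptors per-atom descriptor ("E", "Z", or null), modified in place
   * @param lstEZ double bonds as {atom1, atom2, smallestRingSize};
   *        a ring size of 0 means the bond is acyclic
   */
  static void clearSmallRingEZ(String[] descriptors, List<int[]> lstEZ) {
    for (int[] bond : lstEZ) {
      int ringSize = bond[2];
      if (ringSize > 0 && ringSize <= SMALL_RING_MAX) {
        // the ring is too small for the double bond to have E/Z isomers
        descriptors[bond[0]] = null;
        descriptors[bond[1]] = null;
      }
    }
  }

  public static void main(String[] args) {
    String[] d = { "Z", "Z", "E", "E", "Z", "Z" };
    List<int[]> bonds = new ArrayList<>();
    bonds.add(new int[] { 0, 1, 6 }); // in a six-membered ring: cleared
    bonds.add(new int[] { 2, 3, 0 }); // acyclic: kept
    bonds.add(new int[] { 4, 5, 8 }); // in an eight-membered ring: kept
    clearSmallRingEZ(d, bonds);
    System.out.println(Arrays.toString(d));
  }
}
```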
-
getAtomBondChirality
private void getAtomBondChirality(SimpleNode atom, javajs.util.Lst<int[]> lstEZ, javajs.util.BS bsToDo)
Get E/Z characteristics for specific atoms. Also check here for atropisomeric M/P designations.
Parameters:
atom -
lstEZ -
bsToDo -
-
getLastCumuleneAtom
private SimpleNode getLastCumuleneAtom(SimpleEdge bond, SimpleNode atom, int[] nSP2, SimpleNode[] parents)
Parameters:
bond -
atom -
nSP2 - returns the number of sp2 carbons in this alkene or cumulene
parents -
Returns:
the terminal atom of this alkene or cumulene
-
getAtomChiralityLimited
Determine R/S or one half of E/Z determination
Parameters:
atom - ignored if a is not null (just checking ene end top priority)
cipAtom - ignored if atom is not null
parentAtom - null for tetrahedral, other alkene carbon for E/Z
Returns:
if an E/Z test, [0:none, 1: atoms[0] is higher, 2: atoms[1] is higher]; otherwise [0:none, 1:R, 2:S]
-
getBondChiralityLimited
Determine the axial or E/Z chirality for this bond, with the given starting atom a
Parameters:
bond -
a - first atom to consider, or null
Returns:
one of: {NO_CHIRALITY | STEREO_Z | STEREO_E | STEREO_Ra | STEREO_Sa | STEREO_ra | STEREO_sa}
-
setBondChirality
private int setBondChirality(SimpleNode a, SimpleNode pa, SimpleNode pb, SimpleNode b, boolean isAxial)
Determine the axial or E/Z chirality for the a-b bond.
Parameters:
a -
pa -
pb -
b -
isAxial -
Returns:
one of: {NO_CHIRALITY | STEREO_Z | STEREO_E | STEREO_M | STEREO_P | STEREO_m | STEREO_p}
-
getEneChirality
int getEneChirality(CIPChirality.CIPAtom winner1, CIPChirality.CIPAtom end1, CIPChirality.CIPAtom end2, CIPChirality.CIPAtom winner2, boolean isAxial, boolean allowPseudo)
Determine the stereochemistry of a bond
Parameters:
winner1 -
end1 -
end2 -
winner2 -
isAxial - if an odd-cumulene
allowPseudo - if we are working from a high-level bond stereochemistry method
Returns:
STEREO_M, STEREO_P, STEREO_Z, STEREO_E, or NO_CHIRALITY
-
logInfo
-