4.3.3. Water Bridge analysis — MDAnalysis.analysis.hydrogenbonds.WaterBridgeAnalysis

Author:

Zhiyi Wu

Year:

2017-2018

Copyright:

GNU Public License v3

Maintainer:

Zhiyi Wu <zhiyi.wu@gtc.ox.ac.uk>, @xiki-tempula on GitHub

Given a Universe (simulation trajectory with 1 or more frames) measure all water bridges for each frame between selections 1 and 2. Water bridge is defined as a bridging water which simultaneously forms two hydrogen bonds with atoms from both selection 1 and selection 2.

A water bridge can form between two hydrogen bond acceptors.

e.g. -CO2-:···H−O−H···:-O2C-

A water bridge can also form between two hydrogen bond donors.

e.g. -NH···:O:···HN- (where O is the oxygen of a bridging water)

A hydrogen bond acceptor and another hydrogen bond donor can be bridged by a water.

e.g. -CO2-:···H−O:···HN- (where H−O is part of H−O−H)

A higher order water bridge is defined as more than one water bridging hydrogen bond acceptor and donor. An example of a second order water bridge:

e.g. -CO2-:···H−O:···H−O:···HN- (where H−O is part of H−O−H)

The following keyword arguments are important to control the behaviour of the water bridge analysis:

  • water_selection (resname SOL): the selection string for the bridging water

  • order the maximum number of water bridging both ends

  • donor-acceptor distance (Å): 3.0

  • Angle cutoff (degrees): 120.0

  • forcefield to switch between default values for different force fields

  • donors and acceptors atom types (to add additional atom names)

4.3.3.1. Theory

This module attempts to find multi-order water bridge by an approach similar to breadth-first search, where the first solvation shell of selection 1 is selected, followed by the selection of the second solvation shell as well as any hydrogen bonding partner from selection 1. After that, the third solvation shell, as well as any binding partners from selection 2, are detected. This process is repeated until the maximum order of water bridges is reached.

4.3.3.2. Output as Network

Since the waters connecting the two ends of the selections are by nature a network. We provide a network representation of the water network. Water bridge data are returned per frame, which is stored in WaterBridgeAnalysis.results.network. Each frame is represented as a dictionary, where the keys are the hydrogen bonds originating from selection 1 and the values are new dictionaries representing the hydrogen bonds coming out of the corresponding molecules making hydrogen bonds with selection 1.

As for the hydrogen bonds which reach the selection 2, the values of the corresponding keys are None. One example where selection 1 and selection 2 are joined by one water molecule (A) which also hydrogen bond to another water (B) which also hydrogen bond to selection 2 would be represented as

# (selection 1)-O:···H-O(water 1):···H-(selection 2)
#                      |             :
#                      H·············O-H(water2)
#                                    H
{(sele1_acceptor, None, water1_donor, water1_donor_heavy, distance, angle):
     {(water1_acceptor, None, sele2_donor, sele2_donor_heavy,
     distance, angle): None},
     {(water1_donor, water1_donor_heavy, water2_acceptor, None,
     distance, angle):
          {(water2_acceptor, None, sele2_donor, sele2_donor_heavy,
          distance, angle): None}
      },
}

The atoms are represented by atom index and if the atom is hydrogen bond donor, it is followed by the index of the corresponding heavy atom (donor_proton, donor_heavy_atom). If the atom is a hydrogen bond acceptor, it is followed by none.

4.3.3.3. Output as Timeseries

For lower order water bridges, it might be desirable to represent the connections as WaterBridgeAnalysis.results.timeseries. The results are returned per frame and are a list of hydrogen bonds between the selection 1 or selection 2 and the bridging waters. Due to the complexity of the higher order water bridge and the fact that one hydrogen bond between two waters can appear in both third and fourth order water bridge, the hydrogen bonds in the WaterBridgeAnalysis.results.timeseries attribute are generated in a depth-first search manner to avoid duplication. Example code of how WaterBridgeAnalysis.results.timeseries is generated:

def network2timeseries(network, timeseries):
    '''Traverse the network in a depth-first fashion.
    expand_timeseries will expand the compact representation to the
    familiar timeseries representation.'''

    if network is None:
        return
    else:
        for node in network:
            timeseries.append(expand_timeseries(node))
            network2timeseries(network[node], timeseries)

timeseries = []
network2timeseries(network, timeseries)

An example would be.

results = [
    [ # frame 1
       [ <donor index>, <acceptor index>,
        (<donor residue name>, <donor residue number>, <donor atom name>),
        (<acceptor residue name>, <acceptor residue number>,
        <acceptor atom name>),
         <distance>, <angle>],
       ....
    ],
    [ # frame 2
      [ ... ], [ ... ], ...
    ],
    ...
]

Using the WaterBridgeAnalysis.generate_table() method one can reformat the results as a flat “normalised” table that is easier to import into a database or dataframe for further processing.

4.3.3.4. Detection of water bridges

Water bridges are recorded if a bridging water simultaneously forms hydrogen bonds with selection 1 and selection 2.

Hydrogen bonds are detected based on a geometric criterion:

  1. The distance between acceptor and hydrogen is less than or equal to distance (default is 3 Å).

  2. The angle between donor-hydrogen-acceptor is greater than or equal to angle (default is 120º).

The cut-off values angle and distance can be set as keywords to WaterBridgeAnalysis.

Donor and acceptor heavy atoms are detected from atom names. The current defaults are appropriate for the CHARMM27 and GLYCAM06 force fields as defined in Table Default atom names for water bridge analysis.

Hydrogen atoms bonded to a donor are searched based on its distance to the donor. The algorithm searches for all hydrogens (name “H*” or name “[123]H” or type “H”) in the same residue as the donor atom within a cut-off distance of 1.2 Å.

Default heavy atom names for CHARMM27 force field.

group

donor

acceptor

comments

main chain

N

O, OC1, OC2

OC1, OC2 from amber99sb-ildn (Gromacs)

water

OH2, OW

OH2, OW

SPC, TIP3P, TIP4P (CHARMM27,Gromacs)

ARG

NE, NH1, NH2

ASN

ND2

OD1

ASP

OD1, OD2

CYS

SG

CYH

SG

possible false positives for CYS

GLN

NE2

OE1

GLU

OE1, OE2

HIS

ND1, NE2

ND1, NE2

presence of H determines if donor

HSD

ND1

NE2

HSE

NE2

ND1

HSP

ND1, NE2

LYS

NZ

MET

SD

see e.g. [Gregoret1991]

SER

OG

OG

THR

OG1

OG1

TRP

NE1

TYR

OH

OH

Heavy atom types for GLYCAM06 force field.

element

donor

acceptor

N

N,NT,N3

N,NT

O

OH,OW

O,O2,OH,OS,OW,OY

S

SM

Donor and acceptor names for the CHARMM27 force field will also work for e.g. OPLS/AA or amber (tested in Gromacs). Residue names in the table are for information only and are not taken into account when determining acceptors and donors. This can potentially lead to some ambiguity in the assignment of donors/acceptors for residues such as histidine or cytosine.

For more information about the naming convention in GLYCAM06 have a look at the Carbohydrate Naming Convention in Glycam.

The lists of donor and acceptor names can be extended by providing lists of atom names in the donors and acceptors keywords to WaterBridgeAnalysis. If the lists are entirely inappropriate (e.g. when analysing simulations done with a force field that uses very different atom names) then one should either use the value “other” for forcefield to set no default values or derive a new class and set the default list oneself:

class WaterBridgeAnalysis_OtherFF(WaterBridgeAnalysis):
      DEFAULT_DONORS = {"OtherFF": tuple(set([...]))}
      DEFAULT_ACCEPTORS = {"OtherFF": tuple(set([...]))}

Then simply use the new class instead of the parent class and call it with `forcefield` = "OtherFF". Please also consider contributing the list of heavy atom names to MDAnalysis.

References

[Gregoret1991]

Lydia M. Gregoret, Stephen D. Rader, Robert J. Fletterick, and Fred E. Cohen. Hydrogen bonds involving sulfur atoms in proteins. Proteins: Structure, Function, and Bioinformatics, 9(2):99–107, 1991. doi:https://doi.org/10.1002/prot.340090204.