Developers

Note

The following content is autogenerated from the Python docstrings. Don’t expect a well-written story!

The package

Code structure

Garleek code is organized in two layers: the library and the application layers. The former controls the business logic while the latter exposes this functionality as a command-line interface and, if useful, a high-level Python function.

Library layer

  • qm and mm subpackages host the low-level interfaces to QM and MM software. Each features a documented, standardized dictionary representation of the data it expects. The communication logic between the qm and mm modules lives in the connectors module. Refer to each module's documentation for more information.
  • atom_types controls atom_type file parsing, and units stores unit conversion factors.

Application layer

  • cli module lists the CLI entry points for users (frontend) and for the QM software handling the ONIOM calculation (backend).

Library layer

mm/

The mm subpackage hosts all the code that handles calculations involving MM software.

Each module in this subpackage is expected to perform the following tasks:

  1. Take the standardized dictionary (as explained in the qm module) and convert the contained data into the representation required by the interfaced MM software (units included).
  2. Calculate the requested data (depending on derivatives value) with the MM software.
  3. Organize the obtained data into a standardized representation, as described below.

Standardized object for interfaced data

The QM engine expects a dictionary containing the following keys and values. Unit conversion is not handled here (that is the responsibility of the connector), so just use the units employed by the MM software (but document them in the docstring!).

energy : float
Potential energy
gradients : np.array with shape (3*n_atom,)
Gradient on each atom
hessian : np.array with shape (9*n_atom,)
Flattened hessian matrix (force constants)
dipole_moment : np.array with shape (3,), optional
Dipole X,Y,Z-Components
polarizability : np.array with shape (6,), optional
Atom polarizability
dipole_derivatives : np.array with shape (9*n_atom,)
Dipole derivatives with respect to X,Y,Z components for each atom

mm.tinker.py

Garleek - Tinker bridge

garleek.mm.tinker.patch_tinker_output_for_inactive_atoms(results, indices, n_atoms)

TODO: Patch ‘hessian’ to support FREQ calculations with inactive

garleek.mm.tinker.prepare_tinker_key(forcefield, atoms=None, version=None)

Prepare a file ready for TINKER’s -k option.

Parameters:
  • forcefield (str) –

    forcefield should be one of:

    • *.prm: proper forcefield file
    • *.key, *.par: key file that can call *.prm files and add more parameters

    If a .prm file is provided, a .key file will be written to accommodate the forcefield in a parameters * call.

  • atoms (OrderedDict, optional=None) – Set of atoms to write, following convention defined in garleek.qm.
  • version (str, optional=None) –

    Specific behavior flag. Supports:

    • qmcharges, which writes the charges provided by the QM engine. Needs atoms to be passed.
Returns:

path – Absolute path to the generated TINKER .key file

Return type:

str
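
The .prm handling described for prepare_tinker_key can be sketched as follows. This is a minimal illustration, not Garleek's implementation: the helper name write_key_for_prm is hypothetical, and the only TINKER feature assumed is the parameters keyword in key files, which points to a forcefield file.

```python
import os
import tempfile


def write_key_for_prm(forcefield):
    """Return the absolute path to a .key file usable with TINKER's -k option.

    Hypothetical helper: a .prm path is wrapped in a new .key file holding a
    single `parameters` line; .key and .par paths are returned untouched.
    """
    path = os.path.abspath(forcefield)
    ext = os.path.splitext(path)[1].lower()
    if ext in ('.key', '.par'):
        return path
    if ext != '.prm':
        raise ValueError('Expected a .prm, .key or .par file: ' + forcefield)
    # delete=False keeps the file on disk after the handle is closed
    with tempfile.NamedTemporaryFile('w', suffix='.key', delete=False) as f:
        f.write('parameters {}\n'.format(path))
    return f.name
```

The sketch does not check that the forcefield file exists, and it leaves the temporary .key file for the caller to clean up.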

garleek.mm.tinker.prepare_tinker_xyz(atoms, bonds=None, version=None)

Write a TINKER-style XYZ file. This is similar to a normal XYZ, but with more fields:

atom_index element x y z type bonded_atom_1 bonded_order_1 ...

TINKER expects coordinates in Angstrom.

Parameters:
  • atoms (OrderedDict) – Set of atoms to write, following convention defined in garleek.qm.
  • bonds (OrderedDict) – Connectivity information, following convention defined in garleek.qm.
  • version (str, optional=None) – Specific behavior flag, if needed (for example, ‘qmcharges’).
Returns:

xyzblock – String with XYZ contents

Return type:

str

garleek.mm.tinker.run_tinker(xyz_data, n_atoms, key, energy=True, dipole_moment=True, gradients=True, hessian=True)

qm/

The qm subpackage hosts all the code that handles calculations involving QM software.

All modules listed here are expected to perform the following tasks:

  • Patch the INPUT file with proper garleek-backend calls and atom type conversion.
  • Parse the intermediate files as provided by the QM software into a standardized object (details below).
  • Write the output file expected by the QM software.
  • List supported versions and the default ones in two tuples named supported_versions and default_version, respectively.

Standardized object for interfaced data

The intermediate representation of the parsed data, which will be passed to the MM engine, should be a dict with these keys and values:

n_atoms : int
Number of atoms in the structure or substructure
derivatives : int

Calculations requested for the MM part:

  • 0: energy only
  • 1: calculate gradient
  • 2: calculate hessian

These values are cumulative, so if 2 is requested, 0 and 1 should be computed as well.

charge : float
Global charge of the structure
spin : int
Multiplicity of the system
atoms : OrderedDict of dicts

Ordered dictionary mapping each atom index to another dictionary containing these values:

  • element : str. Chemical element
  • type : str. Atom type as expected by the MM part
  • xyz : np.array with shape (3,). Cartesian coordinates
  • charge : float. Atom point charge for the MM part
bonds : OrderedDict of lists of 2-tuples
Ordered dictionary mapping atom-index to a list of 2-tuples containing bonded atom index (int) and bond order (float)
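
As an illustration of the layout described above, here is a hand-written dictionary for a hypothetical two-atom fragment, together with a small helper that expands the cumulative derivatives flag. All values are made up, requested_quantities is not part of Garleek, and plain tuples stand in for the np.array coordinates so that the sketch has no third-party imports.

```python
from collections import OrderedDict

# Hypothetical two-atom fragment; values are illustrative only.
# Garleek documents xyz as np.array with shape (3,); tuples are used here
# only to keep the sketch dependency-free.
qm_data = {
    'n_atoms': 2,
    'derivatives': 1,
    'charge': 0.0,
    'spin': 1,
    'atoms': OrderedDict([
        (1, {'element': 'C', 'type': '1', 'xyz': (0.0, 0.0, 0.0), 'charge': 0.0}),
        (2, {'element': 'H', 'type': '5', 'xyz': (1.09, 0.0, 0.0), 'charge': 0.0}),
    ]),
    'bonds': OrderedDict([
        (1, [(2, 1.0)]),  # atom 1 is bonded to atom 2 with bond order 1.0
        (2, [(1, 1.0)]),
    ]),
}


def requested_quantities(derivatives):
    """Expand the cumulative derivatives flag into the quantities to compute."""
    names = ('energy', 'gradients', 'hessian')
    if derivatives not in (0, 1, 2):
        raise ValueError('derivatives must be 0, 1 or 2')
    return names[:derivatives + 1]
```

With derivatives set to 1, requested_quantities returns the energy and the gradients, matching the cumulative rule stated above.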

qm.gaussian.py

Garleek - Gaussian bridge

class garleek.qm.gaussian.GaussianPatcher(filename, atom_types, mm='tinker', qm='gaussian', forcefield=None, version='16')

Bases: object

patch()
garleek.qm.gaussian.parse_gaussian_EIn(ein_filename, version='16')

Parse the *.EIn file produced by Gaussian external keyword.

This file contains the following data (taken from http://gaussian.com/external)

n_atoms  derivatives-requested  charge  spin
atom_name  x  y  z  MM-charge [atom_type]
atom_name  x  y  z  MM-charge [atom_type]
atom_name  x  y  z  MM-charge [atom_type]
...
  • derivatives-requested can be 0 (energy only), 1 (first derivatives) or 2 (second derivatives).
  • version must be one of garleek.qm.gaussian.supported_versions
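
The layout listed above can be read with a few lines of plain Python. The following is a sketch under stated assumptions, not the code behind parse_gaussian_EIn: parse_ein_text is a hypothetical name, fields are assumed to be whitespace-separated, version differences are ignored, and the sample text is invented for the example.

```python
from collections import OrderedDict


def parse_ein_text(text):
    """Parse EIn-like text laid out as in the listing above.

    Illustrative only: assumes whitespace-separated fields and ignores any
    lines beyond the n_atoms atom records.
    """
    lines = text.strip().splitlines()
    n_atoms, derivatives, charge, spin = lines[0].split()[:4]
    data = {
        'n_atoms': int(n_atoms),
        'derivatives': int(derivatives),
        'charge': float(charge),
        'spin': int(spin),
        'atoms': OrderedDict(),
    }
    for index, line in enumerate(lines[1:data['n_atoms'] + 1], start=1):
        fields = line.split()
        data['atoms'][index] = {
            'element': fields[0],
            'xyz': tuple(float(v) for v in fields[1:4]),
            'charge': float(fields[4]),
            # The atom type column is optional
            'type': fields[5] if len(fields) > 5 else None,
        }
    return data


# Invented sample: two atoms, first derivatives requested
example = """\
2 1 0 1
C 0.0 0.0 0.0 0.0 CT
H 2.06 0.0 0.0 0.0 HC
"""
parsed = parse_ein_text(example)
```
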
garleek.qm.gaussian.patch_gaussian_input(*a, **kw)
garleek.qm.gaussian.prepare_gaussian_EOu(n_atoms, energy, dipole_moment, gradients=None, hessian=None, polarizability=None, dipole_polarizability=None)

Generate the *.EOu file Gaussian expects after external launch.

After performing the MM calculations, Gaussian expects a file with the following information (all in atomic units; taken from http://gaussian.com/external)

Items                          Pseudo Code                             Line Format
energy, dipole-moment (xyz)    E, Dip(I), I=1,3                        4D20.12
gradient on atom (xyz)         FX(J,I), J=1,3; I=1,NAtoms              3D20.12
polarizability                 Polar(I), I=1,6                         3D20.12
dipole derivatives             DDip(I), I=1,9*NAtoms                   3D20.12
force constants                FFX(I), I=1,(3*NAtoms*(3*NAtoms+1))/2   3D20.12
The second section is present only if first derivatives or frequencies were requested, and the final section is present only if frequencies were requested. In the latter case, the Hessian is given in lower triangular form: αij, i=1 to N, j=1 to i. The dipole moment, polarizability, and dipole derivatives can be zero if none are available.
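
To make the record layout concrete, the sketch below formats hypothetical single-atom data into 20-character fields and flattens a Hessian into lower triangular order. It is not the code behind prepare_gaussian_EOu. The helper names are invented, and the D exponent is produced by swapping the letter in Python's E notation, so the mantissa is normalized differently from what a Fortran D20.12 edit descriptor would print.

```python
def fortran_d(value, width=20, decimals=12):
    """Format a float in a fixed-width field with a Fortran-style D exponent."""
    return '{:{w}.{d}E}'.format(value, w=width, d=decimals).replace('E', 'D')


def format_records(values, per_line):
    """Join values into lines holding at most per_line fields each."""
    lines = []
    for i in range(0, len(values), per_line):
        lines.append(''.join(fortran_d(v) for v in values[i:i + per_line]))
    return lines


def lower_triangle(matrix):
    """Flatten a square matrix row by row, keeping elements with j <= i only."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i + 1)]


# Hypothetical data for a single atom (3 coordinates), in atomic units
energy = -0.5
dipole = [0.0, 0.0, 0.1]
gradients = [0.01, -0.02, 0.03]
hessian = [[1.0, 0.2, 0.3],
           [0.2, 2.0, 0.4],
           [0.3, 0.4, 3.0]]

eou_lines = format_records([energy] + dipole, 4)   # energy and dipole, 4D20.12
eou_lines += format_records(gradients, 3)          # gradients, 3D20.12
eou_lines += format_records([0.0] * 6, 3)          # polarizability (zeros)
eou_lines += format_records([0.0] * 9, 3)          # dipole derivatives (zeros)
eou_lines += format_records(lower_triangle(hessian), 3)  # force constants
eou_text = '\n'.join(eou_lines) + '\n'
```

For one atom the lower triangle holds 3*(3+1)/2 = 6 force constants, in line with the FFX count in the table.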

connectors.py

This module hosts high-level functions that connect different engines together, handling input/output files delivery and unit conversion.

A CONNECTORS dict is maintained at the end of the file listing the connectors available. It’s a dict of dicts, where the primary keys are QM engines and secondary keys, MM engines.
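
A dict of dicts of this kind can be queried with two chained lookups. The sketch below uses a stand-in function instead of a real connector; gaussian_tinker_stub and get_connector are hypothetical names introduced for the example.

```python
def gaussian_tinker_stub(qmargs, **kwargs):
    """Stand-in for a real connector; it only echoes what it received."""
    return 'gaussian+tinker called with {}'.format(list(qmargs))


# Primary keys are QM engines, secondary keys are MM engines
CONNECTORS = {
    'gaussian': {
        'tinker': gaussian_tinker_stub,
    },
}


def get_connector(qm, mm):
    """Look up the connector for a QM/MM pair, with a readable error."""
    try:
        return CONNECTORS[qm][mm]
    except KeyError:
        raise ValueError('No connector for QM={!r} and MM={!r}'.format(qm, mm))
```

Adding support for a new engine pair then amounts to registering one more function under the matching keys.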

garleek.connectors.gaussian_tinker(qmargs, forcefield='mm3.prm', write_file=True, qm_version='16', mm_version=None, **kwargs)

Connects QM engine gaussian with MM engine tinker.

When Gaussian does an ONIOM calculation with Garleek, the MM part is configured with the external keyword, meaning that Gaussian will write a series of files to disk and call the requested program. Gaussian expects some data written back to an *.EOu file, which should contain the potential energy, dipole moment, polarizability and/or Hessian matrix, depending on the calculation. The called program is expected to take those input files, convert them to the format expected by the MM program, obtain the needed data and write it to the EOu file with the appropriate syntax. So, that’s what we are doing here:

  1. Parse Gaussian EIn file
  2. Convert it to TINKER’s XYZ and KEY files
  3. Run TINKER to obtain energy, dipole, etc
  4. Convert units and write the EOu file
Parameters:
  • qmargs (tuple) – CLI arguments passed by Gaussian. Depending on the version, its length can vary, but we only care about qmargs[1], so it’s not usually a problem
  • forcefield (str, optional=mm3.prm) – Path to file listing the TINKER forcefield to use. It can be a *.prm file or a *.key file. PRM files are full forcefields with no modifications. KEY files can import PRM files with parameters and then list custom parameters below.
  • write_file (bool, optional=True) – Whether to write the resulting EOu file to disk.
  • qm_version (string, optional=16) – Gaussian version in use. Needed to cover the slight differences between Gaussian versions (EIn/EOu syntax, number of args, and so on).
  • mm_version (string, optional=None) – TINKER behavior. If QM-charges must be considered for the MM part, set it to ‘qmcharges’
Returns:

eou_data – Contents of the EOu file Gaussian expects back.

Return type:

str

atom_types.py

Utilities to deal with atom_types mappings.

These files are needed to convert atom types found in the QM engine to those expected by the MM engine. This is a key part of the whole QM/MM calculation, so those types must be chosen wisely. While we provide a few default mappings (check BUILTIN_TYPES list), the user is encouraged to define his or her own conversions if needed.

The atom_types file format is very simple: just two columns of plain text, where the first field is the QM type and the second field is the MM type. garleek-prepare will replace each QM type with the corresponding MM type. If a QM type is not found in the file, it will raise an error.

Comments can be inserted with a preceding # character, on their own line or after any valid content (just like Python). Blank lines are ignored as well.

This is valid syntax:

# atomic number, mm3 type, description

1          5            # H_norm
2          51           # He
3          163          # Li
4          165          # Be
5          26           # B_sp2
6          1            # C_sp3
7          8            # N_sp3
8          6            # O_sp3
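
The rules above (two columns, # comments, blank lines ignored, an error on unknown types) fit in a short parser. This is a sketch, not the code behind garleek.atom_types.parse: both function names are hypothetical, and the uppercasing follows the case-insensitivity that frontend_app documents for its types argument.

```python
def parse_atom_types_text(text):
    """Map QM types to MM types from two-column text with # comments."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split('#', 1)[0].strip()  # drop comments
        if not line:
            continue  # blank or comment-only line
        fields = line.split()
        if len(fields) < 2:
            raise ValueError('Expected two columns, got: ' + raw)
        mapping[fields[0].upper()] = fields[1]
    return mapping


def convert_type(mapping, qm_type):
    """Return the MM type for qm_type, failing loudly if it is unmapped."""
    try:
        return mapping[qm_type.upper()]
    except KeyError:
        raise KeyError('QM type not found in atom_types mapping: ' + qm_type)
```
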
garleek.atom_types.get_file(filename)

Get file from one of the default locations

garleek.atom_types.parse(atom_types_filename)

Parse atom_types file

units.py

Conversion factors are listed here

The convention is to import this module when needed, using u as an alias:

>>> from garleek import units as u
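
As an illustration of the kind of factor such a module holds, the sketch below converts TINKER-style energies (kcal/mol) and gradients (kcal/mol/Angstrom) into the atomic units Gaussian expects. The constant and function names are hypothetical and may not match what garleek.units defines; only the numeric values of the Hartree and the Bohr are standard.

```python
# Hypothetical names; check garleek.units for the ones actually defined
KCAL_MOL_PER_HARTREE = 627.509474   # 1 Hartree expressed in kcal/mol
ANGSTROM_PER_BOHR = 0.52917721067   # 1 Bohr expressed in Angstrom


def energy_to_hartree(kcal_mol):
    """Convert an energy from kcal/mol to Hartree."""
    return kcal_mol / KCAL_MOL_PER_HARTREE


def gradient_to_hartree_per_bohr(kcal_mol_per_angstrom):
    """Convert a gradient from kcal/mol/Angstrom to Hartree/Bohr."""
    # (kcal/mol/A) * (A/Bohr) / (kcal/mol per Hartree) = Hartree/Bohr
    return kcal_mol_per_angstrom * ANGSTROM_PER_BOHR / KCAL_MOL_PER_HARTREE
```
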

Application layer

cli.py

This module contains the command-line interfaces for both garleek-prepare (the user-friendly patcher) and garleek-backend (the program the QM engine calls behind the scenes to handle the MM calculations).

garleek-prepare takes a naive QM input file for ONIOM and patches it so the MM part is performed through garleek-backend, which will be interfacing with the configured MM engine. For this to work, the atom types featured in the QM input file should be understandable by the MM engine, so garleek-prepare will replace those too, using the atom_types file mapping to do so.

In general, the workflow is the following:

  1. Build a standard ONIOM calculation, with layers, link atoms and so on. The garleek-prepare keyword should be present in the MM layer configuration so the patcher can find it and properly configure it.

  2. Patch the QM input file with garleek-prepare:

    garleek-prepare --qm <QM_engine> --mm <MM_engine> --ff <MM_forcefield> \
                    --types <QM/MM_atom_type_dictionary> QM_input_file.in
    
  3. Submit the calculation with the resulting patched file, named QM_input_file.garleek.in, using the desired QM software:

    QM_engine QM_input_file.garleek.in
    
  4. Profit!

garleek.cli.backend_app(qmargs, qm='gaussian', mm='tinker', ff='mm3.prm', **kw)

garleek-backend Python entry-point

Parameters:
  • qmargs (tuple) – CLI arguments passed by the QM engine. This can be anything!
  • qm (str) – QM engine to use. Must be one of QM_ENGINES, optionally followed by _version to indicate slight differences in the QM logic. For example, gaussian defaults to gaussian_16, but gaussian_09a exports the connectivity differently.
  • mm (str) – MM engine to use. Must be one of MM_ENGINES, optionally followed by _version to indicate slight differences in the MM logic.
  • ff (str) – Forcefield to use in the MM part. This can be anything that the MM engine is able to use as a forcefield (normally a path to a file).
Returns:

Whatever the QM-MM connector returns

Return type:

result
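
The engine_version convention used by the qm and mm arguments can be unpacked with a single str.partition call. The helper below is a hypothetical sketch, not how backend_app parses its arguments; the only default it knows is the documented one, where gaussian means gaussian_16.

```python
def split_engine(name, default_versions=None):
    """Split 'engine_version' into (engine, version).

    Hypothetical helper: default_versions maps an engine to the version
    assumed when no suffix is given.
    """
    default_versions = default_versions or {'gaussian': '16'}
    engine, _, version = name.partition('_')
    return engine, version or default_versions.get(engine)
```

An engine with no suffix and no registered default yields None as its version, which leaves the choice to the caller.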

garleek.cli.backend_app_main(argv=None)

garleek-backend CLI entry-point

garleek.cli.frontend_app(input_file, types='uff_to_mm3', qm='gaussian', mm='tinker', ff='mm3.prm', **kw)

garleek-prepare Python entry-point

Parameters:
  • input_file (str) – Path to the QM input file that should be patched so Garleek can handle the MM part through the desired MM engine.
  • types (str, default=uff_to_mm3) – Path to a file listing the mapping between the QM atom types present in input_file and the MM atom types expected by the MM engine given the current forcefield. Atom types are case INSENSITIVE. They will be uppercased upon processing.
  • qm (str) – QM engine to use. Must be one of QM_ENGINES, optionally followed by _version to indicate slight differences in the QM logic. For example, gaussian defaults to gaussian_16, but gaussian_09a exports the connectivity differently. This is only needed so the patched garleek-backend calls include this argument.
  • mm (str) – MM engine to use. Must be one of MM_ENGINES, optionally followed by _version to indicate slight differences in the MM logic. This is only needed so the patched garleek-backend calls include this argument.
  • ff (str) – Path to the forcefield the MM engine will be using to compute values requested by the QM engine. It should conform to the specified types mapping. This is only needed so the patched garleek-backend calls include this argument.
Returns:

outname – Path to patched input file. It will always be a derivative of input_file. If input_file is input.in, outname will be input.garleek.in.

Return type:

str
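
The naming rule for outname (input.in becomes input.garleek.in) can be reproduced with os.path.splitext. The function name patched_name is hypothetical, and the sketch says nothing about how frontend_app builds the path internally.

```python
import os


def patched_name(input_file, tag='garleek'):
    """Insert tag before the extension: input.in -> input.garleek.in"""
    root, ext = os.path.splitext(input_file)
    return '{}.{}{}'.format(root, tag, ext)
```
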

garleek.cli.frontend_app_main(argv=None)

garleek-prepare CLI entry-point