Developers¶

Note

The following content is autogenerated from the Python docstrings. Don’t expect a well written story!

The package¶

Code structure¶

Garleek code is organized in two layers: the library and the application layers. The former controls the business logic while the latter exposes this functionality as a command-line interface and, if useful, a high-level Python function.

Library layer¶

The qm and mm subpackages host the low-level interfaces to QM and MM software. Each features a documented, standardized dictionary representation of the data it expects. The communication logic between the qm and mm modules lives in the connectors module. Refer to each module's documentation for more information.
atom_types controls atom_types file parsing, and units stores unit conversion factors.

Application layer¶

The cli module lists the CLI entry points for users (frontend) and for the QM software handling the ONIOM calculation (backend).

Library layer¶

mm/¶

The mm subpackage hosts all the code that handles calculations involving MM software.

Each module in this subpackage is expected to perform the following tasks:

Take the standardized dictionary (as explained in the qm module) and convert the contained data into the representation requested by the interfaced MM software (units included).
Calculate the requested data (depending on derivatives value) with the MM software.
Organize the obtained data into a standardized representation, as described below.

Standardized object for interfaced data¶

The QM engine will expect a dictionary containing the following keys and values. Unit conversion is not handled here (that is the responsibility of the connector), so just use the units employed by the MM software (but document them in the docstring!).

energy : float

Potential energy

gradients : np.array with shape (3*n_atom,)

Gradient on each atom

hessian : np.array with shape (9*n_atom,)

Flattened hessian matrix (force constants)

dipole_moment : np.array with shape (3,), optional

Dipole X,Y,Z-Components

polarizability : np.array with shape (6,), optional

Atom polarizability

dipole_derivatives : np.array with shape (9*n_atom,)

Dipole derivatives with respect to X,Y,Z components for each atom
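As a rough illustration of the keys listed above, the sketch below builds such a dictionary for a gradient-only calculation, so the hessian and the other second-derivative entries are left out. The helper name `example_mm_results` is hypothetical and the values are placeholders, not real MM output.

```python
import numpy as np


def example_mm_results(n_atoms):
    """Build a placeholder results dict (hypothetical helper, not part of Garleek)."""
    return {
        'energy': 0.0,                       # potential energy, in the MM program's units
        'gradients': np.zeros(3 * n_atoms),  # flattened x, y, z gradient for each atom
        'dipole_moment': np.zeros(3),        # optional X, Y, Z components
    }


results = example_mm_results(3)
print(sorted(results))
print(results['gradients'].shape)
```

Keeping the arrays flat means the connector can convert units and write them out without reshaping.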

mm.tinker.py¶

Garleek - Tinker bridge

garleek.mm.tinker.patch_tinker_output_for_inactive_atoms(results, indices, n_atoms)¶: TODO: Patch ‘hessian’ to support FREQ calculations with inactive

garleek.mm.tinker.prepare_tinker_key(forcefield, atoms=None, version=None)¶

Prepare a file ready for TINKER’s -k option.

Parameters:
forcefield (str) – Either a `.prm` file (a proper forcefield file) or a `.key` or `.par` file (a key file that can call `.prm` files and add more parameters). If a `.prm` file is provided, a `.key` file will be written to load the forcefield through a `parameters` call.
atoms (OrderedDict, optional=None) – Set of atoms to write, following the convention defined in `garleek.qm`.
version (str, optional=None) – Specific behavior flag. Supports `qmcharges`, which writes the charges provided by the QM engine and needs `atoms` to be passed.
Returns:	path – Absolute path to the generated TINKER .key file
Return type:	str
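The `.prm` to `.key` wrapping described above might look roughly like the following sketch. It is not Garleek's implementation: the function name `sketch_prepare_key` and the use of a temporary file are assumptions, and the `atoms` and `version` handling is omitted. The only TINKER feature relied upon is the `parameters` keyword of key files.

```python
import os
import tempfile


def sketch_prepare_key(forcefield):
    """Return the absolute path to a key file for `forcefield` (illustrative only)."""
    forcefield = os.path.abspath(forcefield)
    _, ext = os.path.splitext(forcefield)
    if ext.lower() == '.prm':
        # Wrap the bare forcefield in a key file that points at it
        handle, key_path = tempfile.mkstemp(suffix='.key')
        with os.fdopen(handle, 'w') as key_file:
            key_file.write('parameters {}\n'.format(forcefield))
        return key_path
    # Anything else is assumed to be a key file already
    return forcefield


key = sketch_prepare_key('mm3.prm')
with open(key) as key_file:
    print(key_file.read())
```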

garleek.mm.tinker.prepare_tinker_xyz(atoms, bonds=None, version=None)¶

Write a TINKER-style XYZ file. This is similar to a normal XYZ, but with more fields:

atom_index element x y z type bonded_atom_1 bonded_order_1 ...

TINKER expects coordinates in Angstrom.

Parameters:
atoms (OrderedDict) – Set of atoms to write, following the convention defined in `garleek.qm`.
bonds (OrderedDict) – Connectivity information, following the convention defined in `garleek.qm`.
version (str, optional=None) – Specific behavior flag, if needed, such as ‘qmcharges’.
Returns:	xyzblock – String with XYZ contents
Return type:	str

garleek.mm.tinker.run_tinker(xyz_data, n_atoms, key, energy=True, dipole_moment=True, gradients=True, hessian=True)¶

qm/¶

The qm subpackage hosts all the code that handles calculations involving QM software.

All modules listed here are expected to perform the following tasks:

Patch the INPUT file with proper garleek-backend calls and atom type conversion.
Parse the intermediate files as provided by the QM software into a standardized object (details below).
Write the output file expected by the QM software.
List supported versions and the default ones in two tuples named supported_versions and default_version, respectively.

Standardized object for interfaced data¶

The intermediate representation of the parsed data, which will be passed to the MM engine, should be a dict with these keys and values:

n_atoms : int

Number of atoms in the structure or substructure

derivatives : int

Calculations requested for the MM part:

0: energy only

1: calculate gradient

2: calculate hessian

These values are cumulative, so if 2 is requested, 0 and 1 should be computed as well.

charge : float

Global charge of the structure

spin : int

Multiplicity of the system

atoms : OrderedDict of dicts

Ordered dictionary mapping atom index with another dictionary containing these values:

element : str. Chemical element

type : str. Atom type as expected by the MM part

xyz : np.array with shape (3,). Cartesian coordinates

charge : float. Atom point charge for the MM part

bonds : OrderedDict of lists of 2-tuples

Ordered dictionary mapping atom-index to a list of 2-tuples containing bonded atom index (int) and bond order (float)
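To make the layout concrete, here is a minimal sketch of such a dictionary for a water molecule. The coordinates, atom types and charges are illustrative placeholders, and the two `need_*` flags are hypothetical names that only show how the cumulative `derivatives` value could be read.

```python
from collections import OrderedDict

import numpy as np

# Atom records keyed by atom index (placeholder values)
atoms = OrderedDict()
atoms[1] = {'element': 'O', 'type': '6', 'xyz': np.array([0.000, 0.000, 0.117]), 'charge': 0.0}
atoms[2] = {'element': 'H', 'type': '5', 'xyz': np.array([0.000, 0.757, -0.469]), 'charge': 0.0}
atoms[3] = {'element': 'H', 'type': '5', 'xyz': np.array([0.000, -0.757, -0.469]), 'charge': 0.0}

# Each atom index maps to a list of (bonded atom index, bond order) tuples
bonds = OrderedDict()
bonds[1] = [(2, 1.0), (3, 1.0)]
bonds[2] = [(1, 1.0)]
bonds[3] = [(1, 1.0)]

data = {
    'n_atoms': len(atoms),
    'derivatives': 1,   # energy and gradients
    'charge': 0.0,
    'spin': 1,
    'atoms': atoms,
    'bonds': bonds,
}

# Derivative levels are cumulative, so a simple threshold test is enough
need_gradients = data['derivatives'] >= 1
need_hessian = data['derivatives'] >= 2
print(need_gradients, need_hessian)
```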

qm.gaussian.py¶

Garleek - Gaussian bridge

class garleek.qm.gaussian.GaussianPatcher(filename, atom_types, mm='tinker', qm='gaussian', forcefield=None, version='16')¶

Bases: object

patch()¶

garleek.qm.gaussian.parse_gaussian_EIn(ein_filename, version='16')¶

Parse the *.EIn file produced by Gaussian's external keyword.

This file contains the following data (taken from http://gaussian.com/external):

n_atoms  derivatives-requested  charge  spin
atom_name  x  y  z  MM-charge [atom_type]
atom_name  x  y  z  MM-charge [atom_type]
atom_name  x  y  z  MM-charge [atom_type]
...

derivatives-requested can be 0 (energy only), 1 (first derivatives) or 2 (second derivatives).
version must be one of garleek.qm.gaussian.supported_versions
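A simplified reader for the layout shown above could be sketched as follows. This is not the actual `parse_gaussian_EIn` code: the function name `parse_ein_text` is hypothetical, version-specific differences are not handled, and any lines after the atom records are ignored.

```python
from collections import OrderedDict

import numpy as np


def parse_ein_text(text):
    """Parse text laid out as in the listing above (illustrative, not Garleek's parser)."""
    lines = [line for line in text.splitlines() if line.strip()]
    n_atoms, derivatives, charge, spin = lines[0].split()[:4]
    data = {
        'n_atoms': int(n_atoms),
        'derivatives': int(derivatives),
        'charge': float(charge),
        'spin': int(spin),
        'atoms': OrderedDict(),
    }
    for index, line in enumerate(lines[1:data['n_atoms'] + 1], start=1):
        fields = line.split()
        data['atoms'][index] = {
            'element': fields[0],
            'xyz': np.array([float(value) for value in fields[1:4]]),
            'charge': float(fields[4]),
            # The atom type column is optional in the listing above
            'type': fields[5] if len(fields) > 5 else None,
        }
    return data


sample = """3 1 0 1
O 0.000 0.000 0.117 -0.8 6
H 0.000 0.757 -0.469 0.4 5
H 0.000 -0.757 -0.469 0.4
"""
parsed = parse_ein_text(sample)
print(parsed['n_atoms'], parsed['derivatives'])
```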

garleek.qm.gaussian.patch_gaussian_input(*a, **kw)¶

garleek.qm.gaussian.prepare_gaussian_EOu(n_atoms, energy, dipole_moment, gradients=None, hessian=None, polarizability=None, dipole_polarizability=None)¶

Generate the *.EOu file Gaussian expects after external launch.

After performing the MM calculations, Gaussian expects a file with the following information (all in atomic units; taken from http://gaussian.com/external):

Items	Pseudo Code	Line Format
energy, dipole-moment (xyz)	E, Dip(I), I=1,3	4D20.12
gradient on atom (xyz)	FX(J,I), J=1,3; I=1,NAtoms	3D20.12
polarizability	Polar(I), I=1,6	3D20.12
dipole derivatives	DDip(I), I=1,9*NAtoms	3D20.12
force constants	FFX(I), I=1,(3*NAtoms*(3*NAtoms+1))/2	3D20.12

The second section is present only if first derivatives or frequencies were requested, and the final section is present only if frequencies were requested. In the latter case, the Hessian is given in lower triangular form: αij, i=1 to N, j=1 to i. The dipole moment, polarizability, and dipole derivatives can be zero if none are available.
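The sections in the table could be assembled along the lines of the sketch below. It is not the actual `prepare_gaussian_EOu` code: all function names are hypothetical, inputs are assumed to be in atomic units already, and missing polarizability and dipole derivatives are written as zeros, which the text above allows. Python has no `D` exponent format, so the `E` exponent is swapped for `D` after formatting.

```python
import numpy as np


def fortran_d(value):
    """Format a number in a 20-character field with 12 decimals and a 'D' exponent."""
    return '{:20.12E}'.format(value).replace('E', 'D')


def format_block(values, per_line):
    """Split values into rows of `per_line` fields and format each row as one line."""
    values = list(values)
    rows = [values[i:i + per_line] for i in range(0, len(values), per_line)]
    return [''.join(fortran_d(value) for value in row) for row in rows]


def lower_triangle(hessian):
    """Return the elements (i, j) with j <= i of a square matrix, row by row."""
    hessian = np.asarray(hessian)
    rows, cols = np.tril_indices(hessian.shape[0])
    return hessian[rows, cols]


def sketch_eou(energy, dipole, gradients=None, hessian=None, n_atoms=0):
    lines = format_block([energy] + list(dipole), 4)       # 4D20.12
    if gradients is not None:
        lines += format_block(gradients, 3)                # 3D20.12
    if hessian is not None:
        lines += format_block(np.zeros(6), 3)              # polarizability (zeros)
        lines += format_block(np.zeros(9 * n_atoms), 3)    # dipole derivatives (zeros)
        lines += format_block(lower_triangle(hessian), 3)  # force constants
    return '\n'.join(lines) + '\n'


text = sketch_eou(-1.5, [0.0, 0.0, 0.0], gradients=[0.1, 0.2, 0.3],
                  hessian=np.eye(3), n_atoms=1)
print(text)
```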

connectors.py¶

This module hosts high-level functions that connect different engines together, handling input/output files delivery and unit conversion.

A CONNECTORS dict is maintained at the end of the file listing the connectors available. It’s a dict of dicts, where the primary keys are QM engines and secondary keys, MM engines.
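A dict-of-dicts registry like this one can be looked up in two steps, as the sketch below shows. The names `CONNECTORS_SKETCH`, `gaussian_tinker_stub` and `get_connector` are hypothetical stand-ins and do not reproduce the real module contents.

```python
def gaussian_tinker_stub(qmargs, **kwargs):
    """Stand-in for a real connector; it only echoes its input."""
    return 'gaussian-tinker called with {}'.format(qmargs)


# Primary keys are QM engines; secondary keys are MM engines
CONNECTORS_SKETCH = {
    'gaussian': {
        'tinker': gaussian_tinker_stub,
    },
}


def get_connector(qm, mm, registry=CONNECTORS_SKETCH):
    """Return the connector registered for a QM/MM engine pair."""
    try:
        return registry[qm][mm]
    except KeyError:
        raise ValueError(
            'No connector for QM engine {!r} and MM engine {!r}'.format(qm, mm))


connector = get_connector('gaussian', 'tinker')
print(connector(('input.EIn',)))
```

With this layout, supporting a new engine pair only requires adding one entry to the registry.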

garleek.connectors.gaussian_tinker(qmargs, forcefield='mm3.prm', write_file=True, qm_version='16', mm_version=None, **kwargs)¶

Connects QM engine gaussian with MM engine tinker.

When Gaussian does an ONIOM calculation with Garleek, the MM part is configured with the external keyword, meaning that Gaussian will write a series of files to disk and call the requested program. Gaussian expects some data written back to an *.EOu file, which should contain potential energy, dipole moment, polarizability and/or hessian matrix, depending on the calculation. The called program is expected to take those input files, convert them to the format expected by the MM program, obtain the needed data and write them to the EOu file with the adequate syntax. So, that’s what we are doing here:

Parse Gaussian EIn file

Convert it to TINKER’s XYZ and KEY files

Run TINKER to obtain energy, dipole, etc

Convert units and write the EOu file

Parameters:
qmargs (tuple) – CLI arguments passed by Gaussian. Depending on the version, its length can vary, but we only care about qmargs[1], so it’s not usually a problem.
forcefield (str, optional=mm3.prm) – Path to the file listing the TINKER forcefield to use. It can be a `.prm` file or a `.key` file. PRM files are full forcefields with no modifications. KEY files can import PRM files with `parameters` and then list custom parameters below.
write_file (bool, optional=True) – Whether to write the resulting EOu file to disk.
qm_version (string, optional=16) – Gaussian version in use. Needed to cover the slight differences between Gaussian versions (EIn/EOu syntax, number of args, and so on).
mm_version (string, optional=None) – TINKER behavior. If QM charges must be considered for the MM part, set it to ‘qmcharges’.
Returns:	eou_data – Contents of the EOu file Gaussian expects back.
Return type:	str

atom_types.py¶

Utilities to deal with atom_types mappings.

These files are needed to convert atom types found in the QM engine to those expected by the MM engine. This is a key part of the whole QM/MM calculation, so those types must be chosen wisely. While we provide a few default mappings (check BUILTIN_TYPES list), the user is encouraged to define his or her own conversions if needed.

The atom_types file format is very simple: just two columns of plain text, where the first field is the QM type and the second field is the MM type. garleek-prepare will just replace each QM type with the corresponding MM type. If a QM type is not found in the file, it will throw an error.

Comments can be inserted with a preceding # character, in its own line or after any valid content (just like Python). Blank lines are ignored as well.

This is valid syntax:

# atomic number, mm3 type, description

1       5            # H_norm
2       51           # He
3       163          # Li
4       165          # Be
5       26           # B_sp2
6       1            # C_sp3
7       8            # N_sp3
8       6            # O_sp3

garleek.atom_types.get_file(filename)¶: Get file from one of the default locations

garleek.atom_types.parse(atom_types_filename)¶: Parse atom_types file

units.py¶

Conversion factors are listed here

The convention is to import this module when needed, using u as an alias:

>>> from garleek import units as u
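As an example of the kind of conversion a connector needs, the sketch below converts TINKER-style values (kcal/mol and Angstrom) into the atomic units Gaussian expects. The constant and function names are hypothetical and are not taken from `garleek.units`; only the numerical factors are standard.

```python
# Standard conversion factors (names are hypothetical, not Garleek's)
KCAL_MOL_PER_HARTREE = 627.509474
ANGSTROM_PER_BOHR = 0.529177210903


def kcal_mol_to_hartree(energy):
    """Convert an energy from kcal/mol to Hartree."""
    return energy / KCAL_MOL_PER_HARTREE


def gradient_to_atomic_units(gradient):
    """Convert a gradient from kcal/mol/Angstrom to Hartree/Bohr."""
    return gradient * ANGSTROM_PER_BOHR / KCAL_MOL_PER_HARTREE


print(kcal_mol_to_hartree(627.509474))
print(gradient_to_atomic_units(1.0))
```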

Application layer¶

cli.py¶

This module contains the command-line interfaces for both garleek-prepare (the user-friendly patcher) and garleek-backend (the program the QM engine calls behind the scenes to handle the MM calculations).

garleek-prepare takes a naive QM input file for ONIOM and patches it so the MM part is performed through garleek-backend, which will be interfacing with the configured MM engine. For this to work, the atom types featured in the QM input file should be understandable by the MM engine, so garleek-prepare will replace those too, using the atom_types file mapping to do so.

In general, the workflow is the following:

Build a standard ONIOM calculation, with layers, link atoms and so on. The garleek-prepare keyword should be present in the MM layer configuration so the patcher can find it and properly configure it.

Patch the QM input file with garleek-prepare:

garleek-prepare --qm <QM_engine> --mm <MM_engine> --ff <MM_forcefield> \
                --types <QM/MM_atom_type_dictionary> QM_input_file.in

Submit the calculation with the resulting patched file, named QM_input_file.garleek.in, using the desired QM software:
```
QM_engine QM_input_file.garleek.in
```
Profit!

garleek.cli.backend_app(qmargs, qm='gaussian', mm='tinker', ff='mm3.prm', **kw)¶

garleek-backend Python entry-point

Parameters:
qmargs (tuple) – CLI arguments passed by the QM engine. This can be anything!
qm (str) – QM engine to use. Must be one of `QM_ENGINES`, optionally followed by `_version` to indicate slight differences in the QM logic. For example, `gaussian` defaults to `gaussian_16`, but `gaussian_09a` exports the connectivity differently.
mm (str) – MM engine to use. Must be one of `MM_ENGINES`, optionally followed by `_version` to indicate slight differences in the MM logic.
ff (str) – Forcefield to use in the MM part. This can be anything that the MM engine is able to use as a forcefield (normally a path to a file).
Returns:	Whatever the QM-MM connector returns
Return type:	result
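The `engine_version` naming described for `qm` and `mm` could be split as in the sketch below. The helper `split_engine` and the `DEFAULT_VERSIONS` table are hypothetical. The sketch also assumes that engine names themselves contain no underscore.

```python
# Hypothetical defaults, mirroring the default arguments shown in this document
DEFAULT_VERSIONS = {'gaussian': '16', 'tinker': None}


def split_engine(name, defaults=DEFAULT_VERSIONS):
    """Split 'engine_version' into (engine, version), falling back to a default version."""
    engine, _, version = name.partition('_')
    if not version:
        version = defaults.get(engine)
    return engine, version


print(split_engine('gaussian'))
print(split_engine('gaussian_09a'))
```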

garleek.cli.backend_app_main(argv=None)¶: garleek-backend CLI entry-point

garleek.cli.frontend_app(input_file, types='uff_to_mm3', qm='gaussian', mm='tinker', ff='mm3.prm', **kw)¶

garleek-prepare Python entry-point

Parameters:
input_file (str) – Path to the QM input file that should be patched so Garleek can handle the MM part through the desired MM engine.
types (str, default=uff_to_mm3) – Path to a file listing the mapping between the QM atom types present in `input_file` and the MM atom types expected by the MM engine given the current forcefield. Atom types are case INSENSITIVE. They will be uppercased upon processing.
qm (str) – QM engine to use. Must be one of `QM_ENGINES`, optionally followed by `_version` to indicate slight differences in the QM logic. For example, `gaussian` defaults to `gaussian_16`, but `gaussian_09a` exports the connectivity differently. This is only needed so the patched `garleek-backend` calls include this argument.
mm (str) – MM engine to use. Must be one of `MM_ENGINES`, optionally followed by `_version` to indicate slight differences in the MM logic. This is only needed so the patched `garleek-backend` calls include this argument.
ff (str) – Path to the forcefield the MM engine will be using to compute values requested by the QM engine. It should conform to the specified `types` mapping. This is only needed so the patched `garleek-backend` calls include this argument.
Returns:	outname – Path to patched input file. It will always be a derivative of `input_file`. If `input_file` is `input.in`, `outname` will be `input.garleek.in`.
Return type:	str
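The naming rule for the patched file (`input.in` becomes `input.garleek.in`) can be reproduced with `os.path.splitext`, as in the sketch below. The function name `patched_name` is hypothetical and this is not necessarily how Garleek derives the name internally.

```python
import os


def patched_name(input_file, tag='garleek'):
    """Insert `tag` before the extension of `input_file`."""
    root, ext = os.path.splitext(input_file)
    return '{}.{}{}'.format(root, tag, ext)


print(patched_name('input.in'))
```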

garleek.cli.frontend_app_main(argv=None)¶: garleek-prepare CLI entry-point