examol.simulate

Functions associated with evaluating the properties of molecules

examol.simulate.base

Base class defining the interfaces for common simulation operations

class examol.simulate.base.BaseSimulator(scratch_dir: Path | str | None, retain_failed: bool = True)

Bases: object

Uniform interface for common types of computations

Creating a New Simulator

There are a few considerations to weigh when fulfilling the abstract methods:

  • Use underscores in the name of method configurations.

Parameters:
  • scratch_dir – Path in which to create temporary directories

  • retain_failed – Whether to retain failed computations

compute_energy(mol_key: str, xyz: str, config_name: str, charge: int = 0, solvent: str | None = None, forces: bool = True, **kwargs) → tuple[SimResult, str | None]

Get the energy and forces of a structure

Parameters:
  • mol_key – InChI key of the molecule being evaluated

  • xyz – 3D geometry of the molecule

  • config_name – Name of the method

  • charge – Charge on the molecule

  • solvent – Name of the solvent

  • forces – Whether to compute forces

  • **kwargs – Any other arguments for the method

Returns:

  • Energy result

  • Other metadata produced by the computation

create_configuration(name: str, xyz: str, charge: int, solvent: str | None, **kwargs) → Any

Create the configuration needed for a certain computation

Parameters:
  • name – Name of the computational method

  • xyz – Structure being evaluated in XYZ format

  • charge – Charge on the system

  • solvent – Name of any solvent

optimize_structure(mol_key: str, xyz: str, config_name: str, charge: int = 0, solvent: str | None = None, **kwargs) → tuple[SimResult, list[SimResult], str | None]

Minimize the energy of a structure

Parameters:
  • mol_key – InChI key of the molecule being evaluated

  • xyz – 3D geometry of the molecule

  • config_name – Name of the method

  • charge – Charge on the molecule

  • solvent – Name of the solvent

  • **kwargs – Any other arguments for the method

Returns:

  • The minimized structure

  • Any intermediate structures

  • Other metadata produced by the computation

class examol.simulate.base.SimResult(config_name: str, charge: int, solvent: str | None, xyz: str, energy: float | None = None, forces: ndarray | None = None)

Bases: object

Stores the results from a calculation in a code-agnostic format

property atoms: Atoms

ASE Atoms object representation of the structure

charge: int

Charge of the molecule

config_name: str

Name of the configuration used to compute the energy

energy: float | None = None

Energy of the molecule (units: eV)

forces: ndarray | None = None

Forces acting on each atom (units: eV/Ang)

json(**kwargs) → str

Write the record to JSON format

solvent: str | None

Solvent around the molecule, if any

xyz: str

XYZ-format structure, adjusted such that the center of mass is at the origin
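The centering described for the xyz attribute can be illustrated with a minimal, standard-library sketch. It is not examol's implementation: the helper name center_xyz and the small MASSES table (approximate masses for four elements only) are assumptions made for illustration.

```python
# Hypothetical helper for illustration; not part of examol.
# Approximate atomic masses (amu) for a few common elements only.
MASSES = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def center_xyz(xyz: str) -> str:
    """Shift an XYZ-format structure so its center of mass is at the origin."""
    lines = xyz.strip().splitlines()
    n_atoms = int(lines[0])  # first line: number of atoms
    comment = lines[1] if len(lines) > 1 else ""  # second line: comment
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    # Mass-weighted average position along each Cartesian axis
    total_mass = sum(MASSES[a[0]] for a in atoms)
    com = [
        sum(MASSES[a[0]] * a[i] for a in atoms) / total_mass
        for i in (1, 2, 3)
    ]

    out = [str(n_atoms), comment]
    for symbol, x, y, z in atoms:
        out.append(
            f"{symbol} {x - com[0]:.6f} {y - com[1]:.6f} {z - com[2]:.6f}"
        )
    return "\n".join(out) + "\n"
```

For an H2 molecule placed along the z axis, the two atoms end up at equal and opposite z coordinates after centering.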

examol.simulate.initialize

Functions needed to start evaluating a certain molecule

examol.simulate.initialize.add_initial_conformer(record: MoleculeRecord) → MoleculeRecord

Add an initial conformation to the record

Generates an XYZ structure using RDKit if none is available, then adds the MMFF94 energy to all neutral conformers.

The generated conformer is stored with a config name of mmff, a charge of 0, and a source of optimize. MMFF energies are stored using the configuration name mmff.

Parameters:

record – Record to be processed

Returns:

Input record

examol.simulate.initialize.fix_cyclopropenyl(xyz: str, mol_string: str) → str

Detect cyclopropenyl groups and ensure they are planar.

Parameters:
  • xyz – Current structure in XYZ format

  • mol_string – SMILES or InChI string of the molecule

Returns:

Version of atoms with the rings flattened

examol.simulate.initialize.generate_inchi_and_xyz(mol_string: str, special_cases: bool = True) → tuple[str, str]

Generate the XYZ coordinates and InChI string for a molecule using a standard procedure:

  1. Generate 3D coordinates with RDKit, using a fixed random number seed

  2. Assign yet-undetermined stereochemistry based on the 3D geometry

  3. Generate an InChI string for the molecule

If special_cases is enabled, then perform post-processing steps that correct common mistakes in generated geometries:

  1. Ensure cyclopropenyl groups are planar

Parameters:
  • mol_string – SMILES or InChI string

  • special_cases – Whether to perform the post-processing

Returns:

  • InChI string for the molecule

  • XYZ coordinates for the molecule

examol.simulate.initialize.write_xyz_from_mol(mol: Mol, comment: str = '')

Write an RDKit Mol object to an XYZ-format string

Parameters:
  • mol – Molecule to write

  • comment – Comment line for the file

Returns:

XYZ-format version of the molecule
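As a reminder of what an XYZ-format string looks like, here is a minimal sketch that writes one from plain tuples. It deliberately avoids RDKit, so it is not the implementation of write_xyz_from_mol; the function name write_xyz and the six-decimal formatting are assumptions.

```python
# Hypothetical helper for illustration; it does not use RDKit.
def write_xyz(
    atoms: list[tuple[str, float, float, float]], comment: str = ""
) -> str:
    """Write (symbol, x, y, z) tuples to an XYZ-format string."""
    # Line 1: atom count. Line 2: free-form comment (may be empty).
    lines = [str(len(atoms)), comment]
    # One line per atom: element symbol followed by Cartesian coordinates
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"
```

The comment argument plays the same role as the comment parameter documented above: it fills the second line of the file.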

examol.simulate.ase

Utilities for simulation using ASE

class examol.simulate.ase.ASESimulator(cp2k_command: str | None = None, gaussian_command: str | None = None, optimization_steps: int = 250, scratch_dir: Path | str | None = None, clean_after_run: bool = True, ase_db_path: Path | str | None = None, retain_failed: bool = True)

Bases: BaseSimulator

Use ASE to perform quantum chemistry calculations

The calculator supports calculations with the following codes:

  • XTB: Tight binding using the GFN2-xTB parameterization

  • Gaussian: Supports any of the methods and basis sets of Gaussian using names of the format gaussian_[method]_[basis]. Supply additional arguments to Gaussian as keyword arguments.

  • MOPAC: Semiempirical quantum chemistry. Choose a method by providing a configuration name of the form mopac_[method]

  • CP2K: Supports only a few combinations of basis sets and XC functionals, those for which we have determined appropriate cutoff energies. All are named cp2k_[xc name]_[basis]
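The naming patterns listed above can be illustrated with a small parser. This is a sketch under stated assumptions, not the logic ASESimulator uses: the function name split_config_name is hypothetical, only the three documented patterns are handled, and the example names in the comments are illustrative rather than a list of supported configurations.

```python
# Hypothetical helper for illustration; not part of examol.
def split_config_name(name: str) -> dict[str, str]:
    """Split a configuration name into the code and its method settings."""
    code, _, rest = name.partition("_")
    if code == "mopac":
        # Pattern: mopac_[method], e.g. "mopac_pm7" (illustrative name)
        if not rest:
            raise ValueError(f"Missing method in: {name}")
        return {"code": code, "method": rest}
    if code in ("gaussian", "cp2k"):
        # Patterns: gaussian_[method]_[basis] and cp2k_[xc name]_[basis].
        # Split only once so a basis name containing "_" stays intact.
        parts = rest.split("_", 1)
        if len(parts) != 2 or not all(parts):
            raise ValueError(f"Expected {code}_[method]_[basis], got: {name}")
        return {"code": code, "method": parts[0], "basis": parts[1]}
    raise ValueError(f"Unrecognized configuration name: {name}")
```

Note that the names use underscores as separators, consistent with the guidance for new simulators to use underscores in the names of method configurations.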

Parameters:
  • cp2k_command – Command to launch CP2K

  • gaussian_command – Command to launch Gaussian. Only the path to the executable is generally needed

  • optimization_steps – Maximum number of optimization steps

  • scratch_dir – Path in which to create temporary directories

  • clean_after_run – Whether to clean output files after a run exits successfully

  • ase_db_path – Path to an ASE db in which to store results

  • retain_failed – Whether to retain output files after a run fails

compute_energy(mol_key: str, xyz: str, config_name: str, charge: int = 0, solvent: str | None = None, forces: bool = True, **kwargs) → tuple[SimResult, str | None]

Get the energy and forces of a structure

Parameters:
  • mol_key – InChI key of the molecule being evaluated

  • xyz – 3D geometry of the molecule

  • config_name – Name of the method

  • charge – Charge on the molecule

  • solvent – Name of the solvent

  • forces – Whether to compute forces

  • **kwargs – Any other arguments for the method

Returns:

  • Energy result

  • Other metadata produced by the computation

create_configuration(name: str, xyz: str, charge: int, solvent: str | None, **kwargs) → dict

Create the configuration needed for a certain computation

Parameters:
  • name – Name of the computational method

  • xyz – Structure being evaluated in XYZ format

  • charge – Charge on the system

  • solvent – Name of any solvent

optimize_structure(mol_key: str, xyz: str, config_name: str, charge: int = 0, solvent: str | None = None, **kwargs) → tuple[SimResult, list[SimResult], str | None]

Minimize the energy of a structure

Parameters:
  • mol_key – InChI key of the molecule being evaluated

  • xyz – 3D geometry of the molecule

  • config_name – Name of the method

  • charge – Charge on the molecule

  • solvent – Name of the solvent

  • **kwargs – Any other arguments for the method

Returns:

  • The minimized structure

  • Any intermediate structures

  • Other metadata produced by the computation

update_database(atoms_to_write: list[Atoms], config_name: str, charge: int, solvent: str | None)

Update the ASE database associated with this simulator

Parameters:
  • atoms_to_write – List of Atoms objects to store in DB

  • config_name – Name of the configuration used to compute energies

  • charge – Charge on the system

  • solvent – Name of solvent, if any

examol.simulate.ase.utils

Utilities related to using ASE

examol.simulate.ase.utils.add_vacuum_buffer(atoms: Atoms, buffer_size: float, cubic: bool = False)

Add a vacuum buffer around a molecule

Parameters:
  • atoms – Atoms object to be edited

  • buffer_size – Length of vacuum on each side of the molecule (unit: Angstrom)

  • cubic – Whether the resultant box should be cubic
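The geometry behind a vacuum buffer can be sketched without ASE. The helper below, box_with_vacuum, is hypothetical and works on plain coordinate tuples, whereas the documented function edits an Atoms object; how examol handles centering and the cubic case internally is an assumption here.

```python
# Hypothetical helper for illustration; operates on plain tuples, not Atoms.
def box_with_vacuum(positions, buffer_size, cubic=False):
    """Return (cell_lengths, shifted_positions) for a padded, centered box."""
    mins = [min(p[i] for p in positions) for i in range(3)]
    maxs = [max(p[i] for p in positions) for i in range(3)]
    # Molecular extent plus a buffer on both sides of each axis
    lengths = [hi - lo + 2 * buffer_size for lo, hi in zip(mins, maxs)]
    if cubic:
        # Assumption: a cubic box uses the longest edge for all three axes
        lengths = [max(lengths)] * 3
    # Move the middle of the molecule's bounding box to the box center
    centers = [(lo + hi) / 2 for lo, hi in zip(mins, maxs)]
    shifted = [
        tuple(p[i] - centers[i] + lengths[i] / 2 for i in range(3))
        for p in positions
    ]
    return lengths, shifted
```

With a buffer of 5 Angstrom, two atoms 1 Angstrom apart along z give a 10 x 10 x 11 Angstrom box, or an 11 Angstrom cube when cubic is set.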

examol.simulate.ase.utils.initialize_charges(atoms: Atoms, charge: int)

Set initial charges to sum up to a certain value

Parameters:
  • atoms – Atoms object to be manipulated

  • charge – Total charge for the system
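The documentation does not say how the total charge is distributed over the atoms. One simple possibility, shown here purely as an assumption, is to spread it uniformly; the helper name uniform_initial_charges is hypothetical and the sketch returns a plain list rather than modifying an Atoms object.

```python
# Hypothetical helper for illustration; the real distribution scheme
# used by examol is not stated in this documentation.
def uniform_initial_charges(n_atoms: int, charge: int) -> list[float]:
    """Return per-atom charges that sum to the requested total charge."""
    if n_atoms < 1:
        raise ValueError("Need at least one atom")
    # Every atom receives an equal share of the total charge
    return [charge / n_atoms] * n_atoms
```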

examol.simulate.ase.utils.make_ephemeral_calculator(calc: Calculator | dict) → Iterator[Calculator]

Make a calculator then tear it down after completion

Parameters:

calc – Already-defined calculator or a dict defining it. The dict must contain the key “name”, which names the code, and may contain the keys “args” and “kwargs”, which give the positional and keyword arguments for creating a new calculator.

Yields:

A Calculator that is torn down as the context manager exits
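The create-then-tear-down pattern can be sketched with contextlib. Everything below is a stand-in written for illustration: DummyCalculator, the FACTORIES registry, and the use of a close() method for teardown are assumptions, not the ASE or examol API.

```python
from contextlib import contextmanager
from typing import Any, Iterator


class DummyCalculator:
    """Stand-in for a calculator that holds resources until closed."""

    def __init__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs
        self.closed = False

    def close(self):
        self.closed = True


# Hypothetical registry mapping the "name" key to a constructor
FACTORIES = {"dummy": DummyCalculator}


@contextmanager
def ephemeral_calculator(calc: Any) -> Iterator[Any]:
    """Yield a calculator, then tear it down when the block exits."""
    if isinstance(calc, dict):
        # Build a new calculator from "name", plus optional "args"/"kwargs"
        factory = FACTORIES[calc["name"]]
        instance = factory(*calc.get("args", ()), **calc.get("kwargs", {}))
    else:
        instance = calc  # already-defined calculator
    try:
        yield instance
    finally:
        # Teardown runs even if the body raised an exception.
        # Assumption: teardown means calling close() when it exists.
        close = getattr(instance, "close", None)
        if close is not None:
            close()
```

Used in a with statement, the calculator is available inside the block and is closed as soon as the block exits, whether or not an error occurred.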