examol.store¶
Tools related to storing then retrieving data about molecules
examol.store.db¶
Tools for interfacing with data stores
examol.store.db.base¶
Base classes for storage utilities
- class examol.store.db.base.MoleculeStore[source]¶
Bases:
AbstractContextManager, ABC
Base class defining how to interface with a dataset of molecule records.
Data stores persist the data collected by ExaMol to disk during a run. The update_record() call need not persist the data immediately, but it should ensure that the data is eventually stored on disk. In fact, it is better if the update operation does not block until the resulting write has completed.
Stores do not need to support concurrent access from multiple clients, which is why this documentation avoids the word “database.”
- export_records(path: Path)[source]¶
Save a current copy of the data store to disk as line-delimited JSON
- Parameters:
path – Path in which to save all data. Use a “.json.gz” extension
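The line-delimited JSON layout can be illustrated with a minimal sketch. It uses only the standard library and plain dictionaries in place of MoleculeRecord objects. The function names and the rule of choosing gzip from the file suffix are assumptions made for illustration, not ExaMol's implementation.

```python
from __future__ import annotations

import gzip
import json
from pathlib import Path


def export_json_lines(records: list[dict], path: Path) -> None:
    """Write one JSON document per line (hypothetical helper)."""
    # Assumption: compress with gzip when the file name ends in ".gz"
    opener = gzip.open if path.name.endswith(".gz") else open
    with opener(path, "wt", encoding="utf-8") as fp:
        for record in records:
            fp.write(json.dumps(record) + "\n")


def load_json_lines(path: Path) -> list[dict]:
    """Read the records back, skipping blank lines."""
    opener = gzip.open if path.name.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as fp:
        return [json.loads(line) for line in fp if line.strip()]
```

Because each line is a complete JSON document, a reader can process the file one record at a time without loading the whole dataset.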
- get_or_make_record(mol_string: str) MoleculeRecord [source]¶
Get the existing record for a molecule, or make a new one if none exists
- Parameters:
mol_string – String describing a molecule (e.g., SMILES string)
- Returns:
Record
- iterate_over_records() Iterable[MoleculeRecord] [source]¶
Iterate over all records in data
- Yields:
A single record
- update_record(record: MoleculeRecord)[source]¶
Update a single record
- Parameters:
record – Record to be updated
- update_records(records: Iterable[MoleculeRecord])[source]¶
Update many records at once
- Parameters:
records – Iterator over records to be stored
examol.store.db.memory¶
Stores that keep the entire dataset in memory
- class examol.store.db.memory.InMemoryStore(path: Path | None, write_freq: float = 10.0)[source]¶
Bases:
MoleculeStore
Store all molecule records in memory, write to disk as a single file
The class will start checkpointing as soon as any record is updated, but no more frequently than once every write_freq
- Parameters:
path – Path from which to read data. Must be a JSON file, which can be compressed with GZIP. Set to None if you do not want data to be stored
write_freq – Minimum time between writing checkpoints
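The checkpointing rule described above can be sketched as follows. ThrottledStore is a hypothetical stand-in written for this example, not the InMemoryStore implementation: it writes synchronously, and it assumes write_freq is measured in seconds. A real store would likely defer the write so that updates do not block.

```python
from __future__ import annotations

import json
import time
from pathlib import Path


class ThrottledStore:
    """Hypothetical stand-in that checkpoints at most once per write_freq."""

    def __init__(self, path: Path | None, write_freq: float = 10.0):
        self.path = path
        self.write_freq = write_freq  # assumed to be in seconds
        self.records: dict[str, dict] = {}
        self._last_write = float("-inf")  # no checkpoint written yet
        self.writes = 0

    def update_record(self, key: str, record: dict) -> None:
        # Always update memory; write to disk only if enough time has passed
        self.records[key] = record
        self.maybe_checkpoint()

    def maybe_checkpoint(self, force: bool = False) -> bool:
        if self.path is None:
            return False  # storage disabled
        now = time.monotonic()
        if not force and now - self._last_write < self.write_freq:
            return False  # too soon since the last checkpoint
        with open(self.path, "w", encoding="utf-8") as fp:
            for record in self.records.values():
                fp.write(json.dumps(record) + "\n")
        self._last_write = now
        self.writes += 1
        return True
```

In this sketch an update that arrives inside the throttle window stays in memory only, so a final forced checkpoint is needed before exiting.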
- iterate_over_records() Iterable[MoleculeRecord] [source]¶
Iterate over all records in data
- Yields:
A single record
- update_record(record: MoleculeRecord)[source]¶
Update a single record
- Parameters:
record – Record to be updated
examol.store.models¶
Data models used for molecular data
- class examol.store.models.Conformer(*, xyz: str, xyz_hash: str, date_created: datetime, source: str | None = None, config_name: str | None = None, charge: int, energies: list[EnergyEvaluation] = None)[source]¶
Bases:
BaseModel
Describes a single conformer of a molecule
- add_energy(sim_result: SimResult) bool [source]¶
Add the energy from a simulation result
- Parameters:
sim_result – Result to be added
- property atoms: Atoms¶
- energies: list[EnergyEvaluation]¶
List of energies for this structure
- classmethod from_simulation_result(sim_result: SimResult, source: str = 'relaxation') Conformer [source]¶
Create a new object from a simulation
- Parameters:
sim_result – Simulation result
source – How this conformer was determined
- Returns:
An initialized conformer record which includes energies
- classmethod from_xyz(xyz: str, **kwargs)[source]¶
Create a new object from an XYZ-format string
- Parameters:
xyz – XYZ-format description of the molecule
- Returns:
An initialized conformer object
- get_energy(config_name: str, charge: int, solvent: str | None) float [source]¶
Get the energy for a certain level
- Parameters:
config_name – Name of the compute configuration
charge – Charge of the molecule
solvent – Solvent in which the molecule is dissolved
- Returns:
Energy of the target conformer
- Raises:
NoSuchConformer – If there is no such energy for this conformer
- get_energy_index(config_name: str, charge: int, solvent: str | None) int | None [source]¶
Get the index of the record for a certain level of energy
- Parameters:
config_name – Name of the compute configuration
charge – Charge of the molecule
solvent – Solvent in which the molecule is dissolved
- Returns:
Index of the record, if available, or None if not.
- class examol.store.models.EnergyEvaluation(*, energy: float, config_name: str, charge: int, solvent: str | None = None, completed: datetime = None)[source]¶
Bases:
BaseModel
Energy of a conformer under a certain condition
- class examol.store.models.Identifiers(*, smiles: str, inchi: str, pubchem_id: int | None = None)[source]¶
Bases:
BaseModel
IDs known for a molecule
- exception examol.store.models.MissingData(config_name: str = Ellipsis, charge: int = Ellipsis, solvent: str | None = Ellipsis)[source]¶
Bases:
ValueError
No conformer or energy with the desired settings was found
- class examol.store.models.MoleculeRecord(*, key: ConstrainedStrValue, identifier: Identifiers, names: list[str] = None, subsets: list[str] = None, conformers: list[Conformer] = None, properties: dict[str, dict[str, float]] = None)[source]¶
Bases:
BaseModel
Defines whatever we know about a molecule
- add_energies(result: SimResult, opt_steps: Collection[SimResult] = (), match_tol: float = 0.001) bool [source]¶
Add a new set of energies to a structure
Will add a new conformer if the structure does not yet exist
If provided, will match the energies of any materials within the optimization steps
- Parameters:
result – Energy computation to be added
opt_steps – Optimization steps, if available
match_tol – Maximum absolute difference between XYZ coordinates to match
- Returns:
Whether a new conformer was added
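One way to read match_tol is as a bound on the largest coordinate difference between two structures. The sketch below assumes both XYZ strings list the same atoms in the same order and applies no alignment or rotation. The helper names are hypothetical, and ExaMol's own matching may differ.

```python
from __future__ import annotations


def parse_xyz(xyz: str) -> tuple[list[str], list[tuple[float, ...]]]:
    """Parse an XYZ string: atom count, comment line, then one atom per line."""
    lines = xyz.strip().splitlines()
    n_atoms = int(lines[0])
    symbols, coords = [], []
    for line in lines[2:2 + n_atoms]:
        parts = line.split()
        symbols.append(parts[0])
        coords.append(tuple(float(v) for v in parts[1:4]))
    return symbols, coords


def structures_match(xyz_a: str, xyz_b: str, match_tol: float = 1e-3) -> bool:
    """True if every coordinate differs by at most match_tol."""
    symbols_a, coords_a = parse_xyz(xyz_a)
    symbols_b, coords_b = parse_xyz(xyz_b)
    if symbols_a != symbols_b:
        return False  # different atoms or ordering
    return all(
        abs(a - b) <= match_tol
        for pos_a, pos_b in zip(coords_a, coords_b)
        for a, b in zip(pos_a, pos_b)
    )
```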
- find_lowest_conformer(config_name: str, charge: int, solvent: str | None, optimized_only: bool = True) tuple[Conformer, float] [source]¶
Get the energy of the lowest-energy conformer of a molecule in a certain state
- Parameters:
config_name – Name of the compute configuration
charge – Charge of the molecule
solvent – Solvent in which the molecule is dissolved
optimized_only – Only match conformers which were optimized with the specified configuration and charge
- Returns:
Lowest-energy conformer
Energy of the structure (eV)
- Raises:
NoSuchConformer – If we lack a conformer with these settings
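The search can be sketched with plain dataclasses. EnergyEntry and ConformerEntry are simplified, hypothetical stand-ins for EnergyEvaluation and Conformer, and the sketch raises a plain ValueError rather than the library's exception.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class EnergyEntry:
    config_name: str
    charge: int
    solvent: str | None
    energy: float  # eV


@dataclass
class ConformerEntry:
    config_name: str | None  # configuration used to optimize the geometry
    charge: int  # charge used to optimize the geometry
    energies: list[EnergyEntry] = field(default_factory=list)


def lowest_conformer(conformers, config_name, charge, solvent, optimized_only=True):
    """Return (conformer, energy) for the lowest matching energy."""
    best = None
    for conf in conformers:
        # Optionally skip geometries optimized under other settings
        if optimized_only and (conf.config_name, conf.charge) != (config_name, charge):
            continue
        for entry in conf.energies:
            if (entry.config_name, entry.charge, entry.solvent) != (config_name, charge, solvent):
                continue
            if best is None or entry.energy < best[1]:
                best = (conf, entry.energy)
    if best is None:
        raise ValueError("no conformer has an energy at the requested settings")
    return best
```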
- classmethod from_identifier(mol_string: str)[source]¶
Parse the molecule from either the SMILES or InChI string
- Parameters:
mol_string – Molecule to parse
- Returns:
Empty record for this molecule
- identifier: Identifiers¶
Collection of identifiers which define the molecule
examol.store.recipes¶
Tools for computing the properties of molecules from their record
- class examol.store.recipes.PropertyRecipe(name: str, level: str)[source]¶
Bases:
object
Compute the property given a MoleculeRecord
Creating a New Recipe
Define a recipe by implementing four operations:
__init__(): Take the user's options for the recipe (e.g., what level of accuracy to use), then define a name and level for the recipe. Pass the name and level to the superclass’s constructor. It is better to avoid underscores in the name, as underscores are used in the names of simulation configurations.
recipe(): Return a mapping of the different types of geometries, defined using RequiredGeometry, to the energies which must be computed for each geometry, defined using RequiredEnergy.
compute_property(): Compute the property using the record and raise either a ValueError, KeyError, or AssertionError if the record lacks the required information.
from_name(): Restore a recipe from its name and level.
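The shape of a recipe subclass can be sketched as follows. To keep the example runnable without ExaMol, Geometry, Energy, and Recipe are simplified stand-ins for RequiredGeometry, RequiredEnergy, and PropertyRecipe. NeutralEnergy is a hypothetical recipe, and the record is assumed to be a plain dictionary keyed by (config_name, charge) rather than a MoleculeRecord.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:  # stand-in for RequiredGeometry
    config_name: str
    charge: int


@dataclass(frozen=True)
class Energy:  # stand-in for RequiredEnergy
    config_name: str
    charge: int
    solvent: str | None = None


class Recipe:  # stand-in for PropertyRecipe
    def __init__(self, name: str, level: str):
        self.name = name
        self.level = level


class NeutralEnergy(Recipe):
    """Hypothetical recipe: energy of the neutral molecule at its relaxed geometry."""

    def __init__(self, config_name: str):
        # The name has no underscores; the level records the user's option
        super().__init__("neutralenergy", config_name)
        self.config_name = config_name

    @property
    def recipe(self) -> dict[Geometry, list[Energy]]:
        # One geometry (neutral, relaxed) and one energy computed on it
        return {Geometry(self.config_name, 0): [Energy(self.config_name, 0)]}

    def compute_property(self, record: dict) -> float:
        # Raises KeyError if the record lacks the required energy
        return record[(self.config_name, 0)]

    @classmethod
    def from_name(cls, name: str, level: str) -> NeutralEnergy:
        # The level stores everything needed to rebuild the recipe
        return cls(config_name=level)
```

Storing every option in the level is what allows from_name() to rebuild an equivalent recipe from the two strings alone.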
- compute_property(record: MoleculeRecord) float [source]¶
Compute the property
- Parameters:
record – Data about the molecule
- Returns:
Property value
- classmethod from_name(name: str, level: str) PropertyRecipe [source]¶
Generate a recipe from the name
- Parameters:
name – Name of the property
level – Level at which it is computed
- lookup(record: MoleculeRecord, recompute: bool = False) float | None [source]¶
Lookup the value of a property from a record
- Parameters:
record – Record to be evaluated
recompute – Whether we should attempt to recompute the property beforehand
- Returns:
Value of the property, if available, or None if not
- property recipe: dict[RequiredGeometry, list[RequiredEnergy]]¶
List of the geometries required for this recipe and the energies which must be computed for them
- suggest_computations(record: MoleculeRecord) list[SimulationRequest] [source]¶
Generate a list of computations that should be performed next on a molecule
The list of computations may not be sufficient to complete a recipe. For example, you may need to first relax a structure and then compute the energy of the relaxed structure under different conditions.
- Parameters:
record – Data about the molecule
- Returns:
List of computations to perform
- update_record(record: MoleculeRecord) float [source]¶
Compute a property and update the record
- Parameters:
record – Record to be updated
- Returns:
Value of the property being computed
- class examol.store.recipes.RedoxEnergy(charge: int, energy_config: str, vertical: bool = False, solvent: str | None = None)[source]¶
Bases:
PropertyRecipe
Compute the redox energy for a molecule
The level is named by the configuration used to compute the energy, whether a solvent was included, and whether we are computing the vertical or adiabatic energy.
- Parameters:
charge – Amount the charge of the molecule should change by
energy_config – Configuration used to compute the energy
vertical – Whether to compute the vertical rather than the adiabatic energy
solvent – Solvent in which molecule is dissolved, if any
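The difference between vertical and adiabatic energies can be sketched with the usual textbook definitions: the vertical energy evaluates the charged state at the neutral geometry, while the adiabatic energy uses a geometry relaxed in the charged state. The function, its input layout, and the sign convention (charged minus neutral) are assumptions for illustration and may not match ExaMol's implementation.

```python
from __future__ import annotations


def redox_energy(
    energies: dict[tuple[int, int], float],
    charge: int,
    vertical: bool = False,
) -> float:
    """Energy change on changing the charge of a molecule.

    Hypothetical input: energies maps (geometry_charge, electronic_charge)
    to a total energy, where geometry_charge is the charge used to relax
    the geometry.
    """
    neutral = energies[(0, 0)]  # neutral molecule at its own relaxed geometry
    # Vertical: keep the neutral geometry. Adiabatic: use the charged geometry.
    geometry_charge = 0 if vertical else charge
    charged = energies[(geometry_charge, charge)]
    return charged - neutral
```

The adiabatic value requires an extra relaxation in the charged state, which is why the two variants need different sets of geometries.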
- compute_property(record: MoleculeRecord) float [source]¶
Compute the property
- Parameters:
record – Data about the molecule
- Returns:
Property value
- classmethod from_name(name: str, level: str) RedoxEnergy [source]¶
Generate a recipe from the name
- Parameters:
name – Name of the property
level – Level at which it is computed
- property recipe: dict[RequiredGeometry, list[RequiredEnergy]]¶
List of the geometries required for this recipe and the energies which must be computed for them
- class examol.store.recipes.RequiredEnergy(config_name: str = Ellipsis, charge: int = Ellipsis, solvent: str | None = None)[source]¶
Bases:
object
Energy computation level required for a geometry
- class examol.store.recipes.RequiredGeometry(config_name: str = Ellipsis, charge: int = Ellipsis)[source]¶
Bases:
object
Geometry level required for a recipe
- class examol.store.recipes.SimulationRequest(xyz: str, optimize: bool = Ellipsis, config_name: str = Ellipsis, charge: int = Ellipsis, solvent: str | None = Ellipsis)[source]¶
Bases:
object
Request for a specific simulation type
- class examol.store.recipes.SolvationEnergy(config_name: str, solvent: str)[source]¶
Bases:
PropertyRecipe
Compute the solvation energy in kcal/mol
- Parameters:
config_name – Name of the configuration used to compute energy
solvent – Target solvent
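A minimal sketch of the arithmetic, assuming the stored energies are in eV (the unit given for find_lowest_conformer) and that the solvation energy is the energy in solvent minus the energy in vacuum. The function name and both assumptions are illustrative and not taken from ExaMol.

```python
# Conversion factor: 1 eV per molecule expressed in kcal/mol
EV_TO_KCAL_PER_MOL = 23.060548


def solvation_energy(vacuum_ev: float, solvent_ev: float) -> float:
    """Solvation energy in kcal/mol from two total energies in eV."""
    return (solvent_ev - vacuum_ev) * EV_TO_KCAL_PER_MOL
```

A negative value in this convention means the molecule is stabilized by the solvent.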
- compute_property(record: MoleculeRecord) float [source]¶
Compute the property
- Parameters:
record – Data about the molecule
- Returns:
Property value
- classmethod from_name(name: str, level: str) SolvationEnergy [source]¶
Generate a recipe from the name
- Parameters:
name – Name of the property
level – Level at which it is computed
- property recipe: dict[RequiredGeometry, list[RequiredEnergy]]¶
List of the geometries required for this recipe and the energies which must be computed for them