Store

The Store module handles capturing data about molecules and using collected data to compute derived properties.

Data Stores

ExaMol provides access to molecule data through the MoleculeStore interface. Several implementations of this interface exist, each suited to a different use case.

Store           Description
InMemoryStore   Stores all data in memory and periodically writes it to a single file

Using a Data Store

All MoleculeStore classes are used the same way. Most importantly, perform all operations on a store inside a context, which ensures that changes to the dataset are persisted to disk after the context closes.

with InMemoryStore('db.json') as store:
    for record in records:
        store.update_record(record)
    print(f'The store contains {len(store)} records')

Stores are thread safe but must not be accessed from separate processes. Specifically, any data written to a store with the update_record() method will be available to all threads before the call exits.
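The two guarantees described above (persist when the context closes, writes visible to every thread before the call returns) can be sketched with a toy class. This is only an illustration under stated assumptions: ToyStore, its dict-based records, and the fill_store helper are hypothetical stand-ins and are not ExaMol's InMemoryStore or its file format.

```python
import json
import tempfile
import threading
from pathlib import Path


class ToyStore:
    """Hypothetical stand-in for a store; NOT ExaMol's implementation."""

    def __init__(self, path):
        self.path = Path(path)
        self._records = {}
        self._lock = threading.Lock()

    def __enter__(self):
        # Load records persisted by an earlier context, if any
        if self.path.exists():
            self._records = json.loads(self.path.read_text())
        return self

    def __exit__(self, exc_type, exc, tb):
        # Persist everything once the context closes
        with self._lock:
            self.path.write_text(json.dumps(self._records))
        return False

    def update_record(self, key, value):
        # Holding the lock means the write is complete before the call returns
        with self._lock:
            self._records[key] = value

    def __len__(self):
        with self._lock:
            return len(self._records)


def fill_store(path, n_threads=4, per_thread=25):
    """Write records from several threads, then reopen the file and count them."""

    def worker(store, offset):
        for i in range(per_thread):
            store.update_record(f'mol-{offset + i}', {'index': offset + i})

    with ToyStore(path) as store:
        threads = [
            threading.Thread(target=worker, args=(store, t * per_thread))
            for t in range(n_threads)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        in_memory = len(store)

    # A second context reads back what the first one wrote on exit
    with ToyStore(path) as reopened:
        on_disk = len(reopened)
    return in_memory, on_disk


with tempfile.TemporaryDirectory() as tmp:
    in_memory, on_disk = fill_store(Path(tmp) / 'db.json')
print(in_memory, on_disk)
```

A single lock inside one process is enough for threads, but it offers no protection when two processes open the same file, which is consistent with the restriction on multi-process access.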

Data Models

The MoleculeRecord captures information on a molecule.[1] Information includes identifiers (e.g., SMILES string, project-specific names), organizational metadata (e.g., membership in project-specific subsets), energies of the molecule in different geometries (i.e., conformers), and properties derived from the energies.

Energies are stored as a list of Conformer objects. Each Conformer object represents a different geometry,[2] and the energies computed for that geometry under different conditions (e.g., computational methods, charge states) are stored as EnergyEvaluation objects.

Create a record and populate it by first creating a blank MoleculeRecord from a molecule identifier (i.e., SMILES) and then passing a simulation result to its add_energies method.

record = MoleculeRecord.from_identifier('C')
sim_result = SimResult(
    xyz='5\nmethane\n0.0000...',
    charge=0,
    energy=-1,
    config_name='test',
    solvent=None
)  # an example result (normally not created manually)
record.add_energies(sim_result)

You can then look up the stored energies for a molecule from the record. For example, ExaMol provides a utility operation for finding the lowest-energy conformer:

from math import isclose

conf, energy = record.find_lowest_conformer(config_name='test', charge=0, solvent=None)
assert isclose(energy, -1)
assert conf.xyz.startswith('5\nmethane\n0.0000')
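To make the nesting of records, conformers, and energy evaluations concrete, here is a minimal sketch of how a lowest-energy lookup could work over such a structure. ToyEnergy, ToyConformer, and find_lowest are hypothetical names invented for this illustration; they mirror the description above but are not ExaMol's classes or its algorithm.

```python
from dataclasses import dataclass, field
from math import isclose
from typing import List, Optional


@dataclass
class ToyEnergy:
    """One energy for a geometry under specific conditions (hypothetical)."""
    energy: float
    config_name: str
    charge: int
    solvent: Optional[str] = None


@dataclass
class ToyConformer:
    """One geometry plus every energy evaluated at that geometry (hypothetical)."""
    xyz: str
    energies: List[ToyEnergy] = field(default_factory=list)

    def get_energy(self, config_name, charge, solvent):
        # Return the energy matching the requested conditions, or None
        for evaluation in self.energies:
            key = (evaluation.config_name, evaluation.charge, evaluation.solvent)
            if key == (config_name, charge, solvent):
                return evaluation.energy
        return None


def find_lowest(conformers, config_name, charge, solvent=None):
    """Scan all conformers and keep the one with the lowest matching energy."""
    best = None
    for conf in conformers:
        energy = conf.get_energy(config_name, charge, solvent)
        if energy is not None and (best is None or energy < best[1]):
            best = (conf, energy)
    if best is None:
        raise ValueError('No energy matches the requested conditions')
    return best


conformers = [
    ToyConformer('geometry A', [ToyEnergy(-1.0, 'test', 0), ToyEnergy(-0.4, 'test', 1)]),
    ToyConformer('geometry B', [ToyEnergy(-1.5, 'test', 0)]),
]
conf, energy = find_lowest(conformers, config_name='test', charge=0)
assert conf.xyz == 'geometry B' and isclose(energy, -1.5)
```

Conformers that lack an energy for the requested conditions are skipped, so asking for charge=1 in this example returns 'geometry A' even though it is not the lowest-energy neutral geometry.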

Recipes

Recipes define how to compute a property of a molecule from multiple energy computations. All are based on the PropertyRecipe object and provide two functions: one to compute the property from a molecule data record, and a second to generate the list of computations still required before the property can be computed.

Use an existing recipe by specifying details on the property (e.g., which solvent?) and the target level of accuracy. Consult the API docs for properties available in ExaMol.

The recipe will then create an informative name for the property and a level of accuracy:

recipe = RedoxEnergy(charge=1, config_name='test', solvent='acn', vertical=False)
print(recipe.name)  # reduction_potential
print(recipe.level)  # test_acn_vertical

You can then use the recipe to determine which computations remain for a record

to_do = recipe.suggest_computations(record)

or compute the property then store it in a data record.

recipe.update_record(record)
print(record.properties[recipe.name][recipe.level])  # Value of the property
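The recipe pattern described in this section (one hook that computes a property, one that lists missing computations) can be sketched as follows. Everything here is an assumption made for illustration: ToyRecipe, ToyIonizationEnergy, the dict-based record, the property name, and the level format are hypothetical and do not reproduce ExaMol's PropertyRecipe or RedoxEnergy.

```python
class ToyRecipe:
    """Hypothetical base class: a name, a level, and two hooks."""

    def __init__(self, name, level):
        self.name = name
        self.level = level

    def required(self):
        """Energy computations, as (config_name, charge, solvent), the property needs."""
        raise NotImplementedError

    def compute(self, record):
        """Compute the property from energies already in the record."""
        raise NotImplementedError

    def suggest_computations(self, record):
        # Anything required but not yet stored is still left to do
        return [key for key in self.required() if key not in record['energies']]

    def update_record(self, record):
        # Store the value under properties[name][level]
        value = self.compute(record)
        record['properties'].setdefault(self.name, {})[self.level] = value
        return value


class ToyIonizationEnergy(ToyRecipe):
    """Energy of the +1 state minus energy of the neutral state (toy definition)."""

    def __init__(self, config_name, solvent=None):
        level = config_name if solvent is None else f'{config_name}_{solvent}'
        super().__init__('toy_ionization_energy', level)
        self.config_name = config_name
        self.solvent = solvent

    def required(self):
        return [
            (self.config_name, 0, self.solvent),
            (self.config_name, 1, self.solvent),
        ]

    def compute(self, record):
        neutral, charged = (record['energies'][key] for key in self.required())
        return charged - neutral


# A toy record holding only the neutral-state energy so far
record = {'energies': {('test', 0, 'acn'): -1.0}, 'properties': {}}
recipe = ToyIonizationEnergy('test', solvent='acn')

to_do = recipe.suggest_computations(record)  # the charged state is missing
record['energies'][('test', 1, 'acn')] = -0.75  # pretend that computation finished
value = recipe.update_record(record)
print(record['properties'][recipe.name][recipe.level])
```

Keeping the list of required computations in one method lets both hooks share it, so the suggestion step and the property calculation cannot drift apart.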