Conformers
This workflow generates a set of conformers from a SMILES string.
Unlike most other conformer generation tools, the Sierra conformers workflow optimizes conformer structures with the semi-empirical QM method GFN-xTB and ranks the conformers by energy, computed either with one of several QM methods or with OrbNet machine learning methods.
The workflow is inspired by ReSCoSS and contains the following steps:
- The input SMILES string is checked for chemical soundness and standardized based on a set of rules.
- An initial pool of conformers (default: 250) is generated using RDKit.
- The initial conformers are optimized using the MMFF94 force field.
- The initial conformers are clustered based on a set of descriptors, and the final pool of conformers (35 at default settings) is selected.
- The selected conformers are optimized at the GFN1-xTB level of theory.
- The optimized structures are checked for consistency with the input SMILES string and duplicates are removed.
- The final unique conformers are sorted by energy computed using energy_method (GFN1-xTB at default settings).
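The last two steps (duplicate removal and energy sorting) can be illustrated with a minimal sketch. It is not the workflow's implementation: the helper names `rmsd` and `unique_sorted` are hypothetical, the geometries are assumed to be pre-aligned with identical atom order, and the thresholds simply mirror the documented defaults of duplicate_energy_threshold (0.1 kcal/mol) and duplicate_rmsd_threshold (0.5 angstrom).

```python
import math

HARTREE_TO_KCAL_PER_MOL = 627.509474


def rmsd(coords_a, coords_b):
    """RMSD between two geometries with identical atom order.

    Assumes the geometries are already aligned; no superposition is done here.
    """
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def unique_sorted(conformers, energy_threshold=0.1, rmsd_threshold=0.5):
    """Drop duplicates and return conformers sorted by ascending energy.

    conformers: list of (energy_in_hartree, coords) tuples.
    A conformer is a duplicate only if BOTH the energy difference (kcal/mol)
    and the RMSD (angstrom) to an already kept conformer are below threshold.
    """
    kept = []
    for energy, coords in sorted(conformers, key=lambda c: c[0]):
        is_duplicate = any(
            abs(energy - kept_energy) * HARTREE_TO_KCAL_PER_MOL < energy_threshold
            and rmsd(coords, kept_coords) < rmsd_threshold
            for kept_energy, kept_coords in kept
        )
        if not is_duplicate:
            kept.append((energy, coords))
    return kept


# Toy two-atom "conformers" (made-up numbers, for illustration only).
geom_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
geom_b = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]  # near copy of geom_a
geom_c = [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)]  # same energy, different shape
pool = [(-1.00000, geom_a), (-1.00001, geom_b), (-1.00000, geom_c)]
unique = unique_sorted(pool)
print([energy for energy, _ in unique])
```

Because both criteria must hold, two conformers with equal energies but clearly different geometries are both kept in this sketch.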
Examples
The following example demonstrates conformer generation for butane using default settings.
import sierra
from sierra.inputs import *
# Generate conformers of butane:
butane_input = ConformersInput(
    smiles="CCCC",
)
ret = sierra.run(butane_input)
# There are two particularly interesting results in the returned object.
# `ret.energies` is a list containing the energies of the generated
# conformers, and `ret.conformers` is a list containing the generated
# conformers as `sierra.inputs.Molecule` objects. The conformers are sorted by
# energy (lowest to highest), and the geometry of each conformer is aligned
# to the previous one in the list.
# Conformer Energies:
for index, energy in enumerate(ret.energies):
    print(f" {index + 1:2d}: {energy:.6f}")
#> 1: -13.866303
#> 2: -13.865710
#> 3: -13.865710
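Absolute energies in Hartree are hard to compare by eye, so a common follow-up is to convert them to energies relative to the lowest conformer and to estimate Boltzmann populations. The sketch below works on a plain list of floats such as `ret.energies`; the helper names are hypothetical and not part of Sierra, and the populations are a simple estimate from electronic energies only (no thermal or entropic corrections).

```python
import math

HARTREE_TO_KCAL_PER_MOL = 627.509474
GAS_CONSTANT = 0.0019872041  # kcal/(mol*K)


def relative_energies(energies_hartree):
    """Energies relative to the lowest conformer, in kcal/mol."""
    lowest = min(energies_hartree)
    return [(e - lowest) * HARTREE_TO_KCAL_PER_MOL for e in energies_hartree]


def boltzmann_populations(energies_hartree, temperature=298.15):
    """Normalized Boltzmann weights at the given temperature (in kelvin)."""
    rel = relative_energies(energies_hartree)
    weights = [math.exp(-d / (GAS_CONSTANT * temperature)) for d in rel]
    total = sum(weights)
    return [w / total for w in weights]


# Energies copied from the butane example output above.
energies = [-13.866303, -13.865710, -13.865710]
rel = relative_energies(energies)
pops = boltzmann_populations(energies)
for index, (delta, pop) in enumerate(zip(rel, pops), start=1):
    print(f"{index:2d}: {delta:6.3f} kcal/mol  {pop:6.1%}")
```

For the butane energies above, the second and third conformers lie roughly 0.37 kcal/mol above the first.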
Note that the workflow only generates unique conformers. For example, for benzene the workflow will generate only 1 final conformer, regardless of the value of optimized_conformers.
The default size of the starting pool of initial conformers is 250. For small molecules with only a few conformers, this may be reduced, although generating the initial pool takes very little time compared to the duration of the subsequent GFN1-xTB optimizations.
Below is an example showing more options for the conformer workflow:
import sierra
from sierra.inputs import *
# Generate conformers for butane
# and rank them by energy in ascending order
# at the wB97XD3/def2-svp level of theory
butane_input = ConformersInput(
    smiles="CCCC",
    details={
        # Maximum number of conformers to generate in the end (default 35)
        "optimized_conformers": 10,
        # The size of the starting pool of conformers (default 250)
        "initial_conformers": 500,
        # Fix the seed if reproducibility is needed
        "rng_seed": 2,
    },
    # Rank the conformers by energy at the wB97XD3/def2-svp level of theory
    energy_method=DFTMethod(xc="wB97XD3", ao="def2-svp"),
)
ret = sierra.run(butane_input)
# Conformer Energies:
for index, energy in enumerate(ret.energies):
    print(f" {index + 1:2d}: {energy:.6f}")
#> 1: -158.300445
#> 2: -158.299741
#> 3: -158.299722
ConformersInput
- Representation of input for the conformers workflow
Fields
data
-
A string containing a file block describing a chemical structure. Currently only .mol file format is allowed.
- Type: Optional[str]
energy_method
-
The method used to compute the energy of the conformers.
- Type: One of: [MethodBase, CustomMethod, XTBMethod, HFMethod, DFTMethod, EMFTMethod, OrbNetMethod]
- Default: XTBMethod(model='GFN1')
smiles
-
The SMILES string representing a chemical structure.
- Type: Optional[str]
workflow
-
- Type: WorkflowIdentifier
- Default: WorkflowIdentifier(name='conformers', image=None)
details
-
Detail Fields
allow_complex
-
Flag indicating whether to allow conformer generation to be performed on a molecular complex, i.e. a chemical structure containing more than one disconnected fragment.
- Type: bool
- Default: False
duplicate_energy_threshold
-
The energy threshold used to identify duplicate conformers. Conformers that have an energy difference of less than this value and an RMSD of less than duplicate_rmsd_threshold are flagged as duplicates.
- Type: EnergyQuantity
- Default: "0.1 kcal/mol"
duplicate_rmsd_threshold
-
The RMSD threshold used to identify duplicate conformers. Conformers that have an RMSD of less than this value and an energy difference of less than duplicate_energy_threshold are flagged as duplicates.
- Type: LengthQuantity
- Default: "0.5 angstrom"
enforce_saturated_hydrogens
-
Requires the organic molecule to be closed-shell and properly saturated with hydrogen atoms.
- Type: bool
- Default: False
error_flags
-
- Type: List[ConformerFlags]
- Default: [ConformerFlags.optimization_fail,ConformerFlags.properties_fail,ConformerFlags.energy_fail,ConformerFlags.duplicate,ConformerFlags.bond_length_check,ConformerFlags.ez_stereo_check]
initial_conformers
-
The maximum number of initial conformers to begin with.
- Type: PositiveInt
- Default: 250
initial_energy_cutoff
-
The energy cutoff used to prune the initial set of conformers. Conformers whose energy exceeds the minimum energy by more than this value are purged. If None, no pruning based on energy will be applied to the initial conformers.
- Type: Optional[EnergyQuantity]
initial_optimization
-
Flag indicating whether to perform force field optimization for the initially generated conformers.
- Type: bool
- Default: True
initial_rmsd_threshold
-
The RMSD threshold used to prune the initial set of conformers. If None, no pruning based on RMSD will be applied to the initial conformers.
- Type: Optional[LengthQuantity]
- Default: "0.2 angstrom"
n_optimizer_steps
-
Number of optimizer steps in each optimizer macro cycle.
- Type: PositiveInt
- Default: 50
optimized_conformers
-
The number of conformers to optimize using the provided energy_method.
- Type: PositiveInt
- Default: 35
prefer_lower_energies
-
Flag indicating whether to obtain the final conformers with lower energies or with higher diversity in terms of geometry.
- Type: bool
- Default: True
remove_salt
-
Flag indicating whether to remove salt counterions from the input SMILES string.
- Type: bool
- Default: False
remove_solvent
-
Flag indicating whether to remove solvent from the input SMILES string.
- Type: bool
- Default: False
rng_seed
-
Seed for the random number generator when generating conformers.
- Type: Optional[int]
skip_final_optimization
-
Flag indicating whether to skip final conformer optimization. If true, the initial conformers will be returned.
- Type: bool
- Default: False
standardize
-
Flag indicating whether to standardize the chemical structure from the input SMILES string.
- Type: bool
- Default: False
warning_flags
-
- Type: List[ConformerFlags]
- Default: [ConformerFlags.bond_order_check]
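The behaviour described for initial_energy_cutoff can be sketched as a small filter. This is only an illustration of the documented rule (drop conformers more than the cutoff above the minimum, or keep everything when the cutoff is None); the function name `prune_by_energy` is hypothetical, and energies and cutoff are assumed to share the same unit.

```python
def prune_by_energy(energies, cutoff=None):
    """Return the indices of conformers that survive energy pruning.

    energies: list of conformer energies (any unit, same as `cutoff`).
    cutoff:   maximum allowed energy above the minimum; None disables pruning.
    """
    if cutoff is None:
        return list(range(len(energies)))
    lowest = min(energies)
    return [i for i, energy in enumerate(energies) if energy - lowest <= cutoff]


# Made-up relative energies in kcal/mol, for illustration only.
example_energies = [0.0, 1.5, 7.2, 3.0]
kept_with_cutoff = prune_by_energy(example_energies, cutoff=5.0)
kept_without_cutoff = prune_by_energy(example_energies)
print(kept_with_cutoff, kept_without_cutoff)
```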
ConformersResult
Representation of results of the conformers workflow.
Fields
All the fields in ConformersInput and the following:
conformers
-
The list of conformers generated by the workflow.
- Type: Optional[List[Molecule]]
energies
-
The energies (in Hartree) of each conformer generated by the workflow, arranged in ascending order. The order of energies matches the order of conformers.
- Type: Optional[List[float]]
radii_of_gyration
-
The radius of gyration of each conformer generated by the workflow. The order of radii matches the order of conformers.
- Type: Optional[List[float]]
warnings
-
A list of warnings generated while evaluating the workflow.
- Type: Optional[List[List[ConformerFlags]]]
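For readers unfamiliar with the radius of gyration reported in radii_of_gyration, the sketch below computes the standard mass-weighted definition from atomic masses and Cartesian coordinates. Whether the workflow uses mass weighting, and which length unit it reports, is not stated here, so treat both as assumptions; the function name `radius_of_gyration` is hypothetical.

```python
import math


def radius_of_gyration(masses, coords):
    """Mass-weighted radius of gyration about the center of mass.

    masses: list of atomic masses.
    coords: list of (x, y, z) tuples, one per atom, in the same order.
    The result has the same length unit as the coordinates.
    """
    total_mass = sum(masses)
    # Center of mass, one component at a time.
    center = [
        sum(m * atom[axis] for m, atom in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    weighted_sq = sum(
        m * sum((atom[axis] - center[axis]) ** 2 for axis in range(3))
        for m, atom in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Two equal masses placed symmetrically about the origin.
rg_symmetric = radius_of_gyration([1.0, 1.0], [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
print(rg_symmetric)
```

A more compact conformer gives a smaller value than an extended one, which is why the quantity is useful for comparing the overall shapes of conformers in the result list.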