Conformers

This workflow generates a set of conformers from a SMILES string.

Unlike most other conformer generation tools, the Sierra conformers workflow optimizes conformer structures with the semi-empirical QM method GFN-xTB and ranks the conformers by energy, computed with one of several QM methods or with OrbNet machine learning methods.

The workflow is inspired by ReSCoSS and contains the following steps:

  1. The input SMILES string is checked for chemical soundness and standardized based on a set of rules.
  2. An initial pool of conformers (default: 250) is generated using RDKit.
  3. The initial conformers are optimized using the MMFF94 force field.
  4. The initial conformers are clustered based on a set of descriptors, and the final pool of conformers (35 at default settings) is selected.
  5. The selected conformers are optimized at the GFN1-xTB level of theory.
  6. The optimized structures are checked for consistency with the input SMILES string and duplicates are removed.
  7. The final unique conformers are sorted by energy computed using energy_method (GFN1-xTB at default settings).

Examples

The following example demonstrates conformer generation for butane using default settings.

import sierra
from sierra.inputs import *

# Generate conformers of butane:
butane_input = ConformersInput(
    smiles="CCCC",
)
ret = sierra.run(butane_input)

# Two fields of the returned object are of particular interest.
# `ret.energies` is a list containing the energies of the generated
# conformers, and `ret.conformers` is a list containing the generated
# conformers as `sierra.inputs.Molecule` objects. The conformers are sorted by
# energy (lowest to highest) and the geometry of each conformer is aligned
# to the previous one in the list.

# Conformer Energies:
for index, energy in enumerate(ret.energies):
    print(f"   {index + 1:2d}:  {energy:.6f}")
    #>     1:  -13.866303
    #>     2:  -13.865710
    #>     3:  -13.865710

Note that the workflow only generates unique conformers. For example, for benzene the workflow will generate only 1 final conformer, regardless of the value of optimized_conformers.

The default size of the starting pool of initial conformers is 250. For small molecules with very few conformers this may be reduced, although generating the initial pool takes very little time compared to the duration of the subsequent GFN1-xTB optimizations.

Below is an example showing more options for the conformer workflow:

import sierra
from sierra.inputs import *

# Generate conformers for butane
# and rank them by energy in ascending order
# at wB97XD3/def2-svp level of theory
butane_input = ConformersInput(
    smiles="CCCC",
    details={
        # Maximum number of conformers to generate in the end (default 35)
        "optimized_conformers": 10,
        # The size of the starting pool of conformers (default 250)
        "initial_conformers": 500,
        # Fix the seed if reproducibility is needed
        "rng_seed": 2,
    },
    # rank the conformers by energy at wB97XD3/def2-svp level of theory
    energy_method=DFTMethod(xc="wB97XD3", ao="def2-svp"),
)
ret = sierra.run(butane_input)

# Conformer Energies:
for index, energy in enumerate(ret.energies):
    print(f"   {index + 1:2d}:  {energy:.6f}")
    #>     1:  -158.300445
    #>     2:  -158.299741
    #>     3:  -158.299722

ConformersInput

Representation of the input for the conformers workflow.

Fields

data

A string containing a file block describing a chemical structure. Currently only .mol file format is allowed.

  • Type: Optional[str]
energy_method

The method used to compute the energy of the conformers.

smiles

The SMILES string representing a chemical structure.

  • Type: Optional[str]
workflow
  • Type: WorkflowIdentifier
  • Default: WorkflowIdentifier(name='conformers', image=None)
details

Detail Fields

allow_complex

Flag indicating whether to allow conformer generation to be performed on a molecular complex, i.e. a chemical structure containing more than one disconnected fragment.

  • Type: bool
  • Default: False
duplicate_energy_threshold

The energy threshold used to identify duplicate conformers. Conformers that have an energy difference of less than this value and an RMSD of less than duplicate_rmsd_threshold are flagged as duplicates.

duplicate_rmsd_threshold

The RMSD threshold used to identify duplicate conformers. Conformers that have an RMSD of less than this and an energy difference of less than duplicate_energy_threshold are flagged as duplicates.
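
The combined criterion described for duplicate_energy_threshold and duplicate_rmsd_threshold can be illustrated with a minimal sketch. This is not the Sierra implementation: the function names `rmsd` and `flag_duplicates`, the toy data, and the threshold values are hypothetical, the units are left unspecified because the page does not state them, and the RMSD here is computed without any structural alignment.

```python
import math


def rmsd(a, b):
    """Plain coordinate RMSD between two equally sized lists of (x, y, z) tuples.

    Assumption: no superposition or atom reordering is performed here; a real
    workflow would normally align the structures before comparing them.
    """
    if len(a) != len(b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return math.sqrt(squared / len(a))


def flag_duplicates(energies, coords, energy_threshold, rmsd_threshold):
    """Return indices of conformers that duplicate an earlier, kept conformer.

    A conformer is flagged only if BOTH the energy difference and the RMSD
    fall below their respective thresholds.
    """
    kept, duplicates = [], []
    for i, (e_i, xyz_i) in enumerate(zip(energies, coords)):
        is_duplicate = any(
            abs(e_i - energies[j]) < energy_threshold
            and rmsd(xyz_i, coords[j]) < rmsd_threshold
            for j in kept
        )
        (duplicates if is_duplicate else kept).append(i)
    return duplicates


# Hypothetical toy data: conformer 1 nearly coincides with conformer 0, while
# conformer 2 has a similar energy but a clearly different geometry.
energies = [-13.866303, -13.866302, -13.866300]
coords = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.5, 0.01, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.5, 0.0)],
]
duplicates = flag_duplicates(
    energies, coords, energy_threshold=1e-4, rmsd_threshold=0.1
)
print(duplicates)
```

In this toy case only the second conformer is flagged: the third one passes the energy test but fails the RMSD test, which is why both thresholds are needed.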

enforce_saturated_hydrogens

Enforces the organic molecule to be closed-shell and properly saturated with hydrogen atoms.

  • Type: bool
  • Default: False
error_flags
  • Type: List[ConformerFlags]
  • Default: [ConformerFlags.optimization_fail,ConformerFlags.properties_fail,ConformerFlags.energy_fail,ConformerFlags.duplicate,ConformerFlags.bond_length_check,ConformerFlags.ez_stereo_check]
initial_conformers

The maximum number of initial conformers to begin with.

  • Type: PositiveInt
  • Default: 250
initial_energy_cutoff

The energy cutoff used to prune the initial set of conformers. Conformers that have an energy larger than the minimum energy by more than this value are purged. If None, no pruning based on energy will be applied to the initial conformers.
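
As a rough illustration of the pruning rule described above, the following sketch keeps every conformer whose energy lies within the cutoff of the lowest energy and skips pruning when the cutoff is None. The function name `prune_by_energy` and the example values are hypothetical and the units are arbitrary; this is not the code Sierra runs.

```python
def prune_by_energy(energies, cutoff=None):
    """Return the indices of conformers that survive energy-based pruning.

    A conformer is purged when its energy exceeds the minimum energy by more
    than `cutoff`. With `cutoff=None`, nothing is purged.
    """
    if cutoff is None or not energies:
        return list(range(len(energies)))
    e_min = min(energies)
    return [i for i, e in enumerate(energies) if e - e_min <= cutoff]


# Hypothetical relative energies in arbitrary units.
example_energies = [0.0, 2.5, 7.0, 12.0]
kept = prune_by_energy(example_energies, cutoff=10.0)
kept_all = prune_by_energy(example_energies, cutoff=None)
print(kept, kept_all)
```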

initial_optimization

Flag indicating whether to perform force field optimization for the initially generated conformers.

  • Type: bool
  • Default: True
initial_rmsd_threshold

The RMSD threshold used to prune the initial set of conformers. If None, no pruning based on RMSD will be applied to the initial conformers.

n_optimizer_steps

Number of optimizer steps in each optimizer macro cycle.

  • Type: PositiveInt
  • Default: 50
optimized_conformers

The number of conformers to optimize using the provided energy_method.

  • Type: PositiveInt
  • Default: 35
prefer_lower_energies

Flag indicating whether the final conformers are selected to favor lower energies (True) or higher geometric diversity (False).

  • Type: bool
  • Default: True
remove_salt

Flag indicating whether to remove salt counterions from the input SMILES string.

  • Type: bool
  • Default: False
remove_solvent

Flag indicating whether to remove solvent from the input SMILES string.

  • Type: bool
  • Default: False
rng_seed

Seed for the random number generator when generating conformers.

  • Type: Optional[int]
skip_final_optimization

Flag indicating whether to skip final conformer optimization. If true, the initial conformers will be returned.

  • Type: bool
  • Default: False
standardize

Flag indicating whether to standardize the chemical structure from the input SMILES string.

  • Type: bool
  • Default: False
warning_flags
  • Type: List[ConformerFlags]
  • Default: [ConformerFlags.bond_order_check]

ConformersResult

Representation of results of the conformers workflow.

Fields

All the fields in ConformersInput and the following:

conformers

The list of conformers generated by the workflow.

energies

The energy (in Hartree) of each conformer generated by the workflow, arranged in ascending order. The order of energies matches the order of conformers.

  • Type: Optional[List[float]]
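
Because the energies are absolute values in Hartree, it is often easier to read them as energies relative to the lowest conformer. The sketch below does that conversion on the three butane energies printed in the second example above; the helper name `relative_energies_kcal` is hypothetical and not part of Sierra, and the conversion factor is the standard 1 Hartree ≈ 627.509 kcal/mol.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509474


def relative_energies_kcal(energies_hartree):
    """Convert absolute energies in Hartree to kcal/mol relative to the minimum."""
    e_min = min(energies_hartree)
    return [(e - e_min) * HARTREE_TO_KCAL_PER_MOL for e in energies_hartree]


# Energies as printed in the wB97XD3 butane example on this page.
energies = [-158.300445, -158.299741, -158.299722]
relative = relative_energies_kcal(energies)
for index, delta in enumerate(relative, start=1):
    print(f"   {index:2d}:  {delta:.3f} kcal/mol")
```

Since the list is already sorted in ascending order, the first entry is always zero and the remaining entries are non-decreasing.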
radii_of_gyration

The radius of gyration of each conformer generated by the workflow. The order of radii matches the order of conformers.

  • Type: Optional[List[float]]
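
For readers unfamiliar with the quantity, a radius of gyration can be computed from atomic coordinates as sketched below. The page does not say whether Sierra uses mass weighting or which length unit it reports, so both are assumptions here: the hypothetical function `radius_of_gyration` accepts optional masses and falls back to equal weights, and the coordinates are made-up values.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Root-mean-square distance of the atoms from their (weighted) center.

    `coords` is a list of (x, y, z) tuples. If `masses` is omitted, all atoms
    are weighted equally (an assumption, not documented Sierra behavior).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    center = [
        sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((r[k] - center[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


# Hypothetical four-atom chains: one fully extended, one folded into a square.
extended = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 0.0)]
rg_extended = radius_of_gyration(extended)
rg_folded = radius_of_gyration(folded)
print(f"extended: {rg_extended:.3f}, folded: {rg_folded:.3f}")
```

The extended chain gives the larger value, which is the usual reason this quantity is reported: it distinguishes compact conformers from stretched ones.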
warnings

A list of warnings generated while evaluating the workflow.

  • Type: Optional[List[List[ConformerFlags]]]