String Path
Introduction
Sierra provides a string path tool for interpolating structures (images) between stable intermediates in a reaction pathway and optimizing the string path. These tools can be used to provide good launching points for transition-state optimizations, as well as to approximate minimum-energy pathways.
In this tutorial we use the interpolation feature to build a trial string of structures between the reactant and product of a simple SN2 reaction:
CH3Br + OH– → CH3OH + Br–
To begin, we will need the converged reactants and products structures to serve as endpoints for the string. Next we will optimize that string path to provide an approximate model for the reaction pathway.
Python API
To begin, we will setup the initial structures in concatenated XYZ+ and parse them into Python objects.
import sierra
from sierra.inputs import *
from breeze.string_path import build_initial_string
reactant_and_product_str = """
7
-1 1
C -1.90229 2.06889 -0.00347
Br 0.01695 2.38946 -0.00958
H -2.38271 2.94268 -0.44537
H -2.20876 1.92960 1.03395
H -2.08740 1.17206 -0.59590
O -5.13908 1.52355 0.00721
H -6.05937 1.33238 0.01316
7
-1 1
C -3.13497 2.11847 -0.08125
Br -0.70778 2.15117 0.13672
H -2.95652 3.12426 -0.46771
H -2.74763 2.04262 0.93761
H -2.64808 1.38474 -0.72797
O -4.53634 1.86861 -0.06632
H -4.63031 0.96103 0.26892
"""
reactant_and_product = sierra.sources.molecules_from_multi_xyz(reactant_and_product_str)
Next we interpolate structures in internal coordinates, adding six additional images between the reactant and product structures, and resulting in 8 structures overall:
interpolated_string = build_initial_string(reactant_and_product, 6)
Then we optimize the string path using the B97-3c level of theory.
b97_3c = DFTMethod(xc="b97-3c")
inp = StringPathInput(
initial_string=interpolated_string,
optimization_method=b97_3c,
energy_method=b97_3c
)
result = sierra.run(inp, stream_output=True)
We can examine the energy of each image as follows. There are eight energies - one each for the reactant and product, and six more for the images that lie between then on them approximate reaction path.
e_min = min(result.final_energies)
hartree_to_kj_per_mol = sierra.constants.cf("hartree", "kJ / mol")
delta_energies = [(e - e_min) * hartree_to_kj_per_mol for e in result.final_energies]
print(" Image # Δenergy [kJ / mol]")
for n, e in enumerate(delta_energies):
print(f" {(n + 1):8d} {e:20.3f}")
The result
object also contains final_string
, a list of Molecule
objects, one for each image along the string. The Molecule
objects can be inspected, used in further stages, or exported through their .write(format="xyz")
method.
Further configuration
Additional options are available to further customize the calculation and output. Refinement of the convergence criteria, maximum iterations, and other settings are available through the details
argument of StringPathInput
, with the default settings appearing below:
details = {
"convergence_threshold": "0.01 angstrom",
"trust_radius": "0.2 angstrom",
"max_iterations": 60,
"smooth_factor": 0.01,
"freeze_start_image": False,
"freeze_end_image": False,
"n_images": 12,
}
The electronic structure method for the geometry optimizations (optimization_method
) and for final energy evaluation (energy_method
) can be selected as required.
Command-line interface
1: Interpolating between endpoints to produce an initial string
We first need to setup a file with the reactant and product structures in concatenated XYZ+ format and write to a file named sn2_endpoints.xyz
. (The comment line in XYZ+ contains charge and multiplicity.)
7
-1 1
C -1.90229 2.06889 -0.00347
Br 0.01695 2.38946 -0.00958
H -2.38271 2.94268 -0.44537
H -2.20876 1.92960 1.03395
H -2.08740 1.17206 -0.59590
O -5.13908 1.52355 0.00721
H -6.05937 1.33238 0.01316
7
-1 1
C -3.13497 2.11847 -0.08125
Br -0.70778 2.15117 0.13672
H -2.95652 3.12426 -0.46771
H -2.74763 2.04262 0.93761
H -2.64808 1.38474 -0.72797
O -4.53634 1.86861 -0.06632
H -4.63031 0.96103 0.26892
Next, we interpolate structures in internal coordinates, adding 6 additional "beads" between the reactant and product structures, resulting in 8 structures overall. This can be done on the command line in the following way:
sierra string-path interpolate sn2_endpoints.xyz --n-intermediates=6 --output=sn2_initial_guess.xyz
The resulting structures are output in sn2_initial_guess.xyz
.
2: Optimizing the string path
Now that we have initial images, next we will optimize the string path:
sierra string-path optimize sn2_initial_guess.xyz --output=gfn_optimized_string.xyz
The command will print a number of setting and iteration options to the shell and finalize by writing a concatenated XYZ+ of the optimized string to gfn_optimized_string.xyz
. By default the string path is optimized at the GFN-xTB level of theory, however this optimized string can be inspected and/or evaluated at a higher level of theory. For example, the energies at point along the string can be evaluated using the B97-3c level of theory as follows:
sierra string-path optimize gfn_optimized_string.xyz --output=b97_optimized_string.xyz --energy-method="{'kind': 'dft', 'xc': 'b97-3c'}"
Optimizing first at lower levels of theory is fast for initial optimization and can speed up convergence of more expensive methods. The level of theory used for the geometry optimizations can be set using --optimization-method
, independent of the --energy-method
used for the final energy calculation along the string.