Molecule
A Molecule defines a collection of atoms by their geometry (in Bohr), atomic numbers, charge, and multiplicity. It can be created from a variety of sources and data formats.
The Molecule building block is a critical input (and often output) of most workflows.
Creating a new Molecule
A common method of Molecule creation is to directly set the fields:
import sierra
from sierra.inputs import *
# Build a molecule from raw data, note the distances are in Bohr
he2 = Molecule(atomic_numbers=[2, 2], geometry=[0, 0, 0, 0, 0, 5])
print(he2)
#> Molecule(formula='He2', eoi='6e4877e')
print(he2.measure([0, 1]))
#> 5.0
Here, the charge and multiplicity are set to the defaults of 0 and 1, respectively.
Element symbols can also be used in place of atomic numbers for initialization:
from sierra.inputs import *
# Build a molecule from symbols, note the distances are in Bohr
he2 = Molecule(symbols=["He", "He"], geometry=[0, 0, 0, 0, 0, 5])
print(he2)
#> Molecule(formula='He2', eoi='6e4877e')
Importing common file formats
It is also common to construct a Molecule from SDF, XYZ or XYZ+ text. These formats specify positions in Angstrom, and Molecule converts them to Bohr for storage in the geometry field.
A compatible file can be loaded with the file field:
from sierra.inputs import *
# Build a molecule from a SDF, XYZ or XYZ+ file
water = Molecule(file="examples/atoms.xyz")
Alternatively, the file content can be passed to the data field:
from sierra.inputs import *
# Build a molecule from SDF, XYZ or XYZ+ contents
# Note the distances are in Angstrom
water = Molecule(
data="""
O 0 0 0
H 0 0 1
H 0 1 0
"""
)
print(water)
#> Molecule(formula='H2O', eoi='6398bd8')
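The Angstrom-to-Bohr conversion mentioned above can be illustrated with a small standalone sketch. The helper name and the constant below are illustrative assumptions, not part of sierra, and the exact conversion factor used internally may differ in its trailing digits.

```python
# Sketch only: these names are hypothetical and not part of sierra.
# CODATA 2018 value of the Bohr radius expressed in Angstrom.
BOHR_RADIUS_ANGSTROM = 0.529177210903
BOHR_PER_ANGSTROM = 1.0 / BOHR_RADIUS_ANGSTROM


def angstrom_to_bohr(coords):
    """Convert a list of (x, y, z) tuples from Angstrom to Bohr."""
    return [tuple(c * BOHR_PER_ANGSTROM for c in xyz) for xyz in coords]


# Same positions as the water example above, in Angstrom
water_angstrom = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
water_bohr = angstrom_to_bohr(water_angstrom)
print(round(water_bohr[1][2], 4))
#> 1.8897
```

A 1 Angstrom O-H separation in the input therefore corresponds to roughly 1.89 Bohr in the stored geometry.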
Generating from a SMILES string
A molecule can be generated from a SMILES string passed to the smiles field.
from sierra.inputs import *
butane = Molecule(smiles="CCCC")
print(butane)
#> Molecule(formula='C4H10', eoi='0aabcca')
Here, our internal conformer tools generate a structure from the SMILES string. Note that this implementation prioritizes speed of execution and aims for a reasonable structure rather than performing a rigorous conformational search. Please use the Conformer workflow for full control over geometry generation.
Importing from PubChem
A convenient way to create a Molecule is via the PubChem interface. The pubchem attribute can be used to automatically search PubChem for the best common-name match and generate a Molecule.
from sierra.inputs import *
caffeine = Molecule(pubchem="caffeine")
print(caffeine)
#> Molecule(formula='C8H10N4O2', eoi='6812f19')
Warning
The pubchem interface sends data to PubChem servers and should not be used for proprietary material. This is the only operation in Sierra that contacts an outside server; all other calls, including the Conformer workflow, run locally.
Exporting a Molecule
Molecule objects can be exported to a file or a string variable in XYZ+ format:
from pathlib import Path
from sierra.inputs import *
mol = Molecule(pubchem="caffeine")
# Write a molecule as XYZ+ format
xyz_text = mol.write()
print(xyz_text)
"""
24
0 1
O 0.470000000014 2.568799999980 0.000600000013
O -3.127099999991 -0.443600000008 -0.000299999980
N -0.968599999993 -1.312500000015 0.000000000000
N 2.218199999976 0.141200000002 -0.000299999980
N -1.347700000022 1.079700000022 -0.000099999993
N 1.411900000022 -1.937199999987 0.000199999987
C 0.857899999991 0.259199999998 -0.000800000000
C 0.389700000018 -1.026399999992 -0.000399999974
C 0.030699999986 1.421999999991 -0.000600000013
C -1.906099999974 -0.249500000003 -0.000399999974
C 2.503200000019 -1.199799999986 0.000299999980
C -1.427599999992 -2.696000000005 0.000800000000
C 3.192600000010 1.206099999994 0.000299999980
C -2.296900000025 2.188100000003 0.000700000007
H 3.516299999989 -1.578699999975 0.000800000000
H -1.045099999975 -3.197300000018 -0.893700000012
H -2.518600000009 -2.759599999991 0.001099999980
H -1.044700000002 -3.196299999978 0.895699999986
H 4.199199999985 0.780099999989 0.000199999987
H 3.046799999995 1.809200000014 -0.899200000019
H 3.046600000008 1.808300000021 0.900399999993
H -1.808699999994 3.165100000025 -0.000299999980
H -2.932199999985 2.102700000026 0.888100000011
H -2.934599999986 2.102100000013 -0.884900000010
"""
# Write to a file
mol.write(filename=Path("caffeine.xyz+"))
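The output above suggests that XYZ+ text consists of an atom-count line, a line holding the charge and multiplicity, and one line per atom with coordinates in Angstrom. The following standalone sketch reproduces that layout under those assumptions. The function to_xyz_plus is hypothetical and is not how sierra implements write.

```python
# Sketch only: layout inferred from the example output, not from a format spec.
BOHR_RADIUS_ANGSTROM = 0.529177210903  # CODATA 2018


def to_xyz_plus(symbols, geometry_bohr, charge=0, multiplicity=1):
    """Format atoms as XYZ+-style text; input coordinates are in Bohr."""
    lines = [str(len(symbols)), f"{charge} {multiplicity}"]
    for symbol, (x, y, z) in zip(symbols, geometry_bohr):
        # Convert each coordinate back to Angstrom before printing
        ax, ay, az = (v * BOHR_RADIUS_ANGSTROM for v in (x, y, z))
        lines.append(f"{symbol} {ax:.12f} {ay:.12f} {az:.12f}")
    return "\n".join(lines) + "\n"


# He2 with a 5 Bohr separation, as in the first example on this page
text = to_xyz_plus(["He", "He"], [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0)])
print(text)
```

The round trip between Angstrom and Bohr is the likely source of the small trailing digits in the caffeine coordinates shown above.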
Fields
atomic_numbers
-
The (n, ) atomic numbers of the Molecule.
- Type: Optional[Array]
- Additional Details: shape: (-1,)
charge
-
The overall charge of the molecule.
- Type: int
- Default: 0
geometry
-
The (n, 3) coordinates for the molecule in Bohr.
- Type: Array
- Additional Details: shape: (-1, 3)
multiplicity
-
The overall multiplicity of the molecule. A value of None refers to the lowest multiplicity given the electron number parity.
- Type: Optional[int]
masses
-
The (n, ) array of masses of the atoms. This field is read-only.
- Type: Array[float]
- Additional Details: shape: (-1,)
symbols
-
The (n, ) array of symbols of the atoms. This field is read-only.
- Type: Array[str]
- Additional Details: shape: (-1,)
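The parity rule described for the multiplicity field can be read as follows: an even electron count gives a singlet (1) and an odd count gives a doublet (2). The sketch below is one interpretation of that rule. The function lowest_multiplicity is hypothetical and not part of sierra.

```python
# Sketch only: one reading of "lowest multiplicity given the electron
# number parity"; this helper is not a sierra function.
def lowest_multiplicity(atomic_numbers, charge=0):
    """Return 1 for an even electron count and 2 for an odd one."""
    n_electrons = sum(atomic_numbers) - charge
    if n_electrons < 0:
        raise ValueError("charge exceeds the total nuclear charge")
    return 1 if n_electrons % 2 == 0 else 2


print(lowest_multiplicity([8, 1, 1]))            # neutral water, 10 electrons
#> 1
print(lowest_multiplicity([8, 1, 1], charge=1))  # water cation, 9 electrons
#> 2
```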
Functions
Molecule.measure
measure((List[int]) indices) -> float:
For a list of two, three or four atom indices, this function returns the corresponding bond length, angle or dihedral angle, respectively.
Arguments
indices
-
A list of two, three or four atom indices.
- Type: List[int]
Returns
The corresponding value as a float.
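To make the three cases concrete, here is a standalone sketch of the geometry behind such a measurement. It is not sierra's implementation. In particular, returning angles in degrees and the sign convention of the dihedral are assumptions of this sketch, since the documentation above does not state them.

```python
import math


def measure(geometry, indices):
    """Distance (2 indices), angle (3) or dihedral (4) from (x, y, z) rows.

    Distances come out in the units of `geometry`. Angles are returned in
    degrees, which is an assumption of this sketch.
    """
    pts = [geometry[i] for i in indices]

    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0],
        )

    def norm(a):
        return math.sqrt(dot(a, a))

    if len(pts) == 2:
        return norm(sub(pts[1], pts[0]))
    if len(pts) == 3:
        # Angle at the middle atom between the two bonds leaving it
        u, v = sub(pts[0], pts[1]), sub(pts[2], pts[1])
        cos_t = dot(u, v) / (norm(u) * norm(v))
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    if len(pts) == 4:
        # Signed angle between the planes (0, 1, 2) and (1, 2, 3)
        b1 = sub(pts[1], pts[0])
        b2 = sub(pts[2], pts[1])
        b3 = sub(pts[3], pts[2])
        y = norm(b2) * dot(b1, cross(b2, b3))
        x = dot(cross(b1, b2), cross(b2, b3))
        return math.degrees(math.atan2(y, x))
    raise ValueError("indices must contain two, three or four atoms")


he2 = [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0)]
print(measure(he2, [0, 1]))
#> 5.0

water = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
print(round(measure(water, [1, 0, 2]), 6))  # H-O-H angle
#> 90.0
```

The first call mirrors the he2.measure([0, 1]) example near the top of this page, where the two atoms are 5 Bohr apart.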
Molecule.write
write((str) format, (Optional[Union[str, Path]]) filename) -> Optional[str]:
Write the molecule into a file or string, using the specified file format.
Arguments
format
-
The file format.
- Type: str
- Default: "xyz"
filename
-
If specified, the molecule is written to this file. If no filename is provided, the result is returned as a str.
- Type: Optional[Union[str, Path]]
- Default: None
Returns
None if a filename has been specified, and otherwise a str with the same content.