The openbabel module provides direct access to the C++ Open Babel library from Python. This binding is generated using the SWIG package and provides access to almost all of the Open Babel interfaces via Python, including the base classes OBMol, OBAtom, OBBond, and OBResidue, as well as the conversion framework OBConversion. As such, essentially any call in the C++ API is available to Python scripts with very little difference in syntax. As a result, the principal documentation is the Open Babel C++ API documentation.
Here we give some examples of common Python syntax for the openbabel module and pointers to the appropriate sections of the API documentation.
The example script below creates atoms and bonds one-by-one using the OBMol, OBAtom, and OBBond classes.
import openbabel
mol = openbabel.OBMol()
print 'Should print 0 (atoms)'
print mol.NumAtoms()
a = mol.NewAtom()
a.SetAtomicNum(6) # carbon atom
a.SetVector(0.0, 1.0, 2.0) # coordinates
b = mol.NewAtom()
mol.AddBond(1, 2, 1) # atoms indexed from 1
print 'Should print 2 (atoms)'
print mol.NumAtoms()
print 'Should print 1 (bond)'
print mol.NumBonds()
mol.Clear();
More commonly, Open Babel can be used to read in molecules using the OBConversion framework. The following script reads in molecular information (a SMI file) from a string, adds hydrogens, and writes out an MDL file as a string.
import openbabel
obConversion = openbabel.OBConversion()
obConversion.SetInAndOutFormats("smi", "mdl")
mol = openbabel.OBMol()
obConversion.ReadString(mol, "C1=CC=CS1")
print 'Should print 5 (atoms)'
print mol.NumAtoms()
mol.AddHydrogens()
print 'Should print 9 (atoms) after adding hydrogens'
print mol.NumAtoms()
outMDL = obConversion.WriteString(mol)
The following script writes out a file using a filename, rather than reading and writing to a Python string.
import openbabel
obConversion = openbabel.OBConversion()
obConversion.SetInAndOutFormats("pdb", "mol2")
mol = openbabel.OBMol()
obConversion.ReadFile(mol, "1ABC.pdb.gz") # Open Babel will uncompress automatically
mol.AddHydrogens()
print mol.NumAtoms()
print mol.NumBonds()
print mol.NumResidues()
obConversion.WriteFile(mol, '1abc.mol2')
A number of Open Babel toolkit classes provide iterators over various objects; these classes are identifiable by the suffix “Iter” in the list of toolkit classes in the API:
These iterator classes can be used using the typical Python syntax for iterators:
for obatom in openbabel.OBMolAtomIter(obmol):
print obatom.GetAtomicMass()
Note that OBMolTorsionIter returns atom IDs which are off by one. That is, you need to add one to each ID to get the correct ID. Also, if you add or remove atoms, you will need to delete the existing TorsionData before using OBMolTorsionIter. This is done as follows:
mol.DeleteData(openbabel.TorsionData)
Some Open Babel toolkit methods, for example OBMol::Rotate()(), require an array of doubles. It’s not possible to directly use a list of floats when calling such a function from Python. Instead, you need to first explicitly create a C array using the double_array() function:
obMol.Rotate([1.0, -54.7, 3])
# Error!
myarray = openbabel.double_array([1.0, -54.7, 3])
obMol.Rotate(myarray)
# Works!
If you want to access any OBPairData or OBUnitCell associated with a molecule, you need to ‘cast’ the OBGenericData returned by OBMol.GetData using the toPairData() or toUnitCell() functions:
pairdata = [openbabel.toPairData(x) for x in obMol.GetData() if x.GetDataType()==openbabel.PairData]
print pairdata[0].GetAttribute(), pairdata[0].GetValue()
unitcell = openbabel.toUnitCell(obMol.GetData(openbabel.UnitCell))
print unitcell.GetAlpha(), unitcell.GetSpaceGroup()
If you are using the Python numerical extension, numpy, and you try to pass values from a numpy array to Open Babel, it may not work unless you convert the values to Python built-in types first:
import numpy, openbabel
mol = openbabel.OBMol()
atom = mol.NewAtom()
coord = numpy.array([1.2, 2.3, 4.6], "float32")
atom.SetVector(coord[0], coord[1], coord[2])
# Error
atom.SetVector(float(coord[0]), float(coord[1]), float(coord[2]))
# No error
coord = numpy.array([1.2, 2.3, 4.6], "float64")
atom.SetVector(coord[0], coord[1], coord[2])
# No error either - not all numpy arrays will cause an error