Basic Walk-Through: Command-Line Interface
This example shows basic usage of presto through its command-line interface. We'll run a ligand of TYK2 (a common benchmark system for FEP) with the SMILES CCC(CC)C(=O)Nc2cc(NC(=O)c1c(Cl)cccc1Cl)ccn2. The entire workflow can be run with a single command (after activating the environment):
presto train --parameterisation-settings.smiles "CCC(CC)C(=O)Nc2cc(NC(=O)c1c(Cl)cccc1Cl)ccn2"
but we'll go through this in more detail below.
The ! and % symbols appear before commands in notebook cells so that they behave as if they were run on the command line. If you're following along in a terminal, omit them and run the commands directly.
Setup
After activating your environment (e.g. with pixi shell), navigate to a new directory and use presto write-default-yaml to write a default settings file:
! mkdir bespoke-fitting-example-cli
%cd bespoke-fitting-example-cli
! presto write-default-yaml
Have a look at the contents of the workflow_settings.yaml file, which comes pre-populated with all of the default settings for every available option:
! cat workflow_settings.yaml
Some particularly important settings are:
- smiles under parameterisation_settings. You must tell the program what molecule you want to run!
- ml_potential under training_sampling_settings, testing_sampling_settings, and msm_settings. The default model is aceff-2.0, which can handle charged and neutral species. Other MLPs, such as egret-1, are available.
- sampling_protocol under training_sampling_settings. Using mm_md_metadynamics_torsion_minimisation means that we run MD with the molecular mechanics force field (mm_md) and use well-tempered metadynamics on rotatable bonds to enhance sampling (metadynamics). We also mix in structures from very short minimisations with the MLP (torsion_minimisation) to introduce structures closer to the MLP potential energy surface, which may be missed with purely MM sampling (for example, configurations with strong clashes). The minimisations are short enough that there is little relaxation of the torsions.
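To give a feel for what "well-tempered" means in the sampling protocol above, here is a minimal one-dimensional sketch on a single torsion angle. It is not presto's implementation: the function names and every parameter value (Gaussian width, initial height, bias factor) are illustrative assumptions. The key idea is that each new Gaussian is scaled down by the bias already deposited at that angle, so the bias grows more and more slowly in regions that have been visited often.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def bias(phi: float, centres: list[float], heights: list[float], sigma: float) -> float:
    """Sum of the periodic Gaussians deposited so far, evaluated at angle phi (radians)."""
    total = 0.0
    for centre, height in zip(centres, heights):
        # Wrap the angular difference into [-pi, pi) so the bias is periodic.
        d = (phi - centre + math.pi) % (2 * math.pi) - math.pi
        total += height * math.exp(-d * d / (2 * sigma * sigma))
    return total


def deposit(
    phi: float,
    centres: list[float],
    heights: list[float],
    sigma: float = 0.35,        # illustrative Gaussian width in radians
    w0: float = 1.0,            # illustrative initial height in kJ/mol
    temperature: float = 300.0,
    bias_factor: float = 10.0,  # illustrative; not a presto default
) -> float:
    """Deposit one well-tempered Gaussian at phi and return its height."""
    delta_t = (bias_factor - 1.0) * temperature
    # Well-tempered scaling: the height decays with the bias already present at phi.
    height = w0 * math.exp(-bias(phi, centres, heights, sigma) / (KB * delta_t))
    centres.append(phi)
    heights.append(height)
    return height


centres, heights = [], []
for _ in range(3):
    deposit(1.0, centres, heights)  # keep revisiting the same torsion angle
print(heights)  # each height is smaller than the one before
```

Repeatedly depositing at the same angle gives steadily smaller Gaussians, which is what lets the bias converge rather than grow without limit.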
Change the SMILES and any other settings you'd like in the yaml file.
! sed -i 's/ smiles: CHANGEME/ smiles: "CCC(CC)C(=O)Nc2cc(NC(=O)c1c(Cl)cccc1Cl)ccn2"/' workflow_settings.yaml
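If sed is not available (for example on Windows), the same substitution can be sketched in plain Python. This assumes, as the sed command does, that the default file contains the placeholder text smiles: CHANGEME; the helper name set_smiles is hypothetical and not part of presto.

```python
from pathlib import Path

SMILES = "CCC(CC)C(=O)Nc2cc(NC(=O)c1c(Cl)cccc1Cl)ccn2"


def set_smiles(yaml_text: str, smiles: str) -> str:
    """Replace the first 'smiles: CHANGEME' placeholder with a quoted SMILES string."""
    placeholder = "smiles: CHANGEME"
    if placeholder not in yaml_text:
        raise ValueError("No 'smiles: CHANGEME' placeholder found")
    # Quote the value so YAML reads the whole SMILES as a single string.
    return yaml_text.replace(placeholder, f'smiles: "{smiles}"', 1)


settings_path = Path("workflow_settings.yaml")
if settings_path.exists():  # only edit the file once it has been written
    settings_path.write_text(set_smiles(settings_path.read_text(), SMILES))
```

Raising an error when the placeholder is missing avoids silently leaving the file unchanged, which sed would do without complaint.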
Now we're ready to run!
Execution
Run the fitting with presto train-from-yaml. This takes around 20 minutes with a GPU and a few hours on CPUs.
! presto train-from-yaml workflow_settings.yaml
Analysis
We now have a bespoke force field: check out training_iteration_2/bespoke_ff.offxml. Have a look for the bespoke types at the end of each section:
! cat training_iteration_2/bespoke_ff.offxml
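The full file is long, so it can help to print only the tail of each section. The sketch below uses only the Python standard library and relies on the general layout of SMIRNOFF .offxml files, where each section (for example Bonds or ProperTorsions) holds child elements carrying smirks and id attributes. The helper name last_parameters is hypothetical, and how presto names its bespoke parameter ids is not assumed here.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def last_parameters(root: ET.Element, n: int = 3) -> dict[str, list[tuple[str, str]]]:
    """Return the last n (id, smirks) pairs of every section containing SMIRKS-typed parameters."""
    summary = {}
    for section in root:
        typed = [(p.get("id", ""), p.get("smirks")) for p in section if p.get("smirks")]
        if typed:  # skip sections without SMIRKS-typed children
            summary[section.tag] = typed[-n:]
    return summary


ff_path = Path("training_iteration_2/bespoke_ff.offxml")
if ff_path.exists():
    for section, params in last_parameters(ET.parse(ff_path).getroot()).items():
        print(section)
        for param_id, smirks in params:
            print(f"  {param_id}: {smirks}")
```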
We can check out the standard plots to get more information on how well the fitting has gone and how the parameters have changed -- take a look in plots:
! ls plots
For example, loss.png displays how the training loss (computed on the training set) and the test loss (computed on a separate set of samples generated with the MLP) change during training at each iteration (indexed from 0). Our loss looks reasonably well-converged.
from IPython.display import Image, display
display(Image(filename='plots/loss.png'))
Using your bespoke force field
Now that we have our bespoke force field, we can use it in our intended application. As a quick illustration, we can run some vacuum MD with OpenFF's Toolkit and Interchange packages, and OpenMM. First, we create an Interchange object, which contains all of the information required to start molecular dynamics.
from openff.toolkit import Molecule, ForceField, Topology
force_field = ForceField('training_iteration_2/bespoke_ff.offxml')
molecule = Molecule.from_smiles('CCC(CC)C(=O)Nc2cc(NC(=O)c1c(Cl)cccc1Cl)ccn2')
molecule.generate_conformers(n_conformers=1)
interchange = force_field.create_interchange(molecule.to_topology())
Now we can run MD with OpenMM:
import mdtraj
import nglview
import openmm
import openmm.app  # needed explicitly for openmm.app.PDBReporter
import openmm.unit
from openff.interchange import Interchange


def run_openmm(
    interchange: Interchange,
    reporter_frequency: int = 1000,  # Decrease this to save more frames!
    trajectory_name: str = "small_mol_solvated.pdb",
) -> None:
    """Run a simulation using OpenMM."""
    simulation = interchange.to_openmm_simulation(
        integrator=openmm.LangevinMiddleIntegrator(
            300 * openmm.unit.kelvin,
            1 / openmm.unit.picosecond,
            0.002 * openmm.unit.picoseconds,
        ),
    )

    # Write a PDB frame every reporter_frequency steps
    pdb_reporter = openmm.app.PDBReporter(trajectory_name, reporter_frequency)
    simulation.reporters.append(pdb_reporter)

    simulation.context.setVelocitiesToTemperature(300 * openmm.unit.kelvin)
    # Run for 10 seconds of wall-clock time, not simulated time
    simulation.runForClockTime(10 * openmm.unit.second)


def visualise_traj(
    topology: Topology, filename: str = "small_mol_solvated.pdb"
) -> nglview.NGLWidget:
    """Visualise a trajectory using nglview."""
    traj = mdtraj.load(
        filename,
        top=mdtraj.Topology.from_openmm(topology.to_openmm()),
    )

    view = nglview.show_mdtraj(traj)
    view.add_representation("licorice", selection="water")

    return view


run_openmm(interchange)
visualise_traj(interchange.topology)
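Because the run above is limited by wall-clock time rather than a step count, the number of saved frames depends on your hardware. As a rough guide, the spacing between frames follows directly from the 0.002 ps time step and the reporter frequency. The helper names below are hypothetical and exist only to spell out the arithmetic.

```python
def frame_spacing_ps(timestep_ps: float = 0.002, reporter_frequency: int = 1000) -> float:
    """Simulated time between saved frames, in picoseconds."""
    return timestep_ps * reporter_frequency


def n_frames(total_steps: int, reporter_frequency: int = 1000) -> int:
    """Number of frames written when a report is made every reporter_frequency steps."""
    return total_steps // reporter_frequency


print(frame_spacing_ps())  # defaults: 2 fs steps, one frame every 1000 steps
print(n_frames(50_000))    # frames saved if 50,000 steps complete in the time limit
```

With the defaults, consecutive frames are 2 ps apart, so lowering reporter_frequency to 100 would save a frame every 0.2 ps.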
Cleaning up
To remove all files created by presto, you can run presto clean workflow_settings.yaml. This does not remove workflow_settings.yaml itself; instead, presto uses it to find the expected output files and remove them. The command raises an error and exits if it finds any files that it did not generate in directories it would otherwise delete.
! presto clean workflow_settings.yaml
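The safety check described above can be illustrated with a short standard-library sketch. This is not presto's code: safe_clean and its arguments are hypothetical, and it only shows the general pattern of refusing to delete a directory that contains files outside an expected set.

```python
import shutil
from pathlib import Path


def safe_clean(directory: Path, expected: set[str]) -> None:
    """Delete directory only if every file inside it is listed in expected.

    Entries in expected are POSIX-style paths relative to directory.
    """
    found = {
        p.relative_to(directory).as_posix()
        for p in directory.rglob("*")
        if p.is_file()
    }
    unexpected = sorted(found - expected)
    if unexpected:
        # Stop before deleting anything so that user files are never lost.
        raise RuntimeError(
            f"Refusing to delete {directory}: unexpected files {unexpected}"
        )
    shutil.rmtree(directory)
```

Checking everything before deleting anything means a failed clean leaves the directory exactly as it was, so you can move your own files out and run it again.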