In this post, I’ll link to a few training materials and software packages that might be useful to new starters in our group (or elsewhere). I intend for this to be updated regularly, so please feel free to suggest changes or additions, particularly if links break. A big thank you to everyone who has made these resources available online; we find them invaluable!

Workshops

CCP-BioSim (requires log-in). Excellent set of training modules and YouTube channel. Subjects include biomolecular dynamics, FESetup, BioSimSpace and Python for biomolecular modelling.

RSC Chemical Information and Computer Applications Group. Highly recommended series of webinars on open source tools for chemistry, including cheminformatics, docking (gnina), ChimeraX and PyMOL.

Bespoke-fit. A workshop describing the use of Open Force Field’s bespoke-fit package for fitting molecule-specific torsion parameters, given by our own Josh Horton.

TeachOpenCADD. Really useful set of talktorials on the use of open source cheminformatics tools for computer-aided drug design.

Practical cheminformatics. A set of Jupyter notebooks by Pat Walters for learning cheminformatics.

The Alan Turing Institute offers online Introduction to Research Data Science (GitHub repo) and Software Engineering (GitHub repo) courses.

Molecular Sciences Software Institute (MolSSI). Highly recommended workshops on topics including programming and molecular modelling. See in particular the workshop on best practices in software development.

The AI3SD network have kindly made available a series of recorded research seminars, as well as a Skills 4 Scientists series (including research data management, Python, version control, ethics, and career development).

Scientific Computing From Scratch. A summer bootcamp on scientific computing for beginners with Python and PyTorch, organized by Pratyush Tiwary.

Research Software

QUBEKit. Our own Quantum Mechanical Bespoke Force Field Derivation Toolkit, by Chris Ringrose and Josh Horton. More tutorials coming soon, watch this space!

Open Force Field. We’re very happy to be working with the Open Force Field initiative, and we make extensive use of their software infrastructure in our work. A good place to start is with the documentation examples on building and interacting with molecules, and a set of notebooks, which show how to use the toolkit to parameterise a system and run a short simulation in OpenMM. The new examples page assembles examples drawn from throughout the OpenFF stack.

OpenMM. All of our atomistic molecular dynamics simulations are performed through OpenMM. A cookbook demonstrating a few protocols is available here. Additionally, teaching materials from a recent workshop can be found here, and tutorials demonstrating the OpenMM toolkit on the Google Colab framework from the paper “Making it rain: Cloud-based molecular simulations for everyone” can be found here.

BioSimSpace / SOMD. A Python framework for biomolecular simulation, which includes a set of tutorials for common simulation types. Our own tutorials, written with Julien Michel’s lab, show how to use SOMD in protein-ligand binding and hydration free energy calculations with different force fields.

RDKit. Open source cheminformatics toolkit, which we use in virtually all our workflows. You’ll find many tutorials online, including Getting started with RDKit in Python and introductory YouTube tutorials by Jan Jensen: Tutorial 1; Tutorial 2.

QCArchive. Josh in particular has done a lot of work with MolSSI’s quantum chemistry archive, which is a resource for compiling and sharing QM data. Lots of documentation is available, including for QCEngine, a program executor and IO standardiser for quantum chemistry.

ONETEP. Several of our projects involve large-scale DFT calculations, often working with the ONETEP community. Frequent workshops are run and tutorials are available online.

DeLinker. Deep generative models for 3D linker design. We’ve found this to be very promising in fragment-based design projects.

FEGrow. Our own open-source molecular builder and free energy preparation workflow. We hope it’s useful for building molecules into protein binding pockets, scoring the poses, and outputting structures for free energy calculations, all through a Jupyter notebook. A tutorial is provided.

Gnina. We don’t use docking very often, to be honest, but we do like the ease of use and promising accuracy of gnina. David Koes also has an excellent webinar describing its use.

DeepChem. Open-source toolchain demonstrating the use of deep learning in drug discovery, materials science, quantum chemistry, and biology, with extensive online tutorials.

CADD vault. A comprehensive repository dedicated to sharing resources, software tools, and knowledge in the field of computer-aided drug design.

Miscellaneous

Learning Scientific Programming with Python. The course textbook for my undergraduate Python classes (Newcastle students, feel free to message me for access to course materials). The textbook should be available through Newcastle University library.

Linux Command Line for Beginners. Feel free to send me other Linux tutorials that you find useful.

Getting started with Conda. Official tutorial for getting started with Conda environment and package management. See also the pip Python package manager.
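
Once an environment has been created with Conda or pip, it can be handy to confirm which packages are actually visible to the Python interpreter you are running. The snippet below is a minimal sketch of one way to do that using only the standard library. The helper name installed_versions and the package names in the example list are illustrative assumptions rather than anything prescribed by the Conda or pip documentation, and a package's distribution name can differ between Conda and pip, so a "not installed" result is worth double-checking.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it only inspects the environment
    of the interpreter that runs it.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # No installed distribution metadata was found under this name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Example names only (an assumption); swap in whatever your project needs.
    for name, version in installed_versions(["rdkit", "openmm", "numpy"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside an activated environment prints one line per package, which makes it easy to spot a package that was installed into a different environment by mistake.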

Deep Learning for Molecules and Materials. I’m only on Chapter 2 so far, but this looks like a really good online textbook by Andrew White, with code examples.

Papers for molecular design using deep learning. Collection of literature related to Generative AI and Deep Learning for molecular/drug design and molecular conformation generation.

In Silico Medicinal Chemistry: Computational Methods to Support Drug Design. Book by Nathan Brown on computational tools for drug design. Should be available through Newcastle University library.

Quantum Chemistry. Set of hands-on quantum chemistry notebooks and exercises from eChem.

Writing. David Mobley has some nice tips on academic writing style, and he also pointed us to this paper for a more detailed overview.

Narrative CVs. Collection of resources on writing a narrative CV.