RMol

class fegrow.RMol(*args, id=None, template=None, **kwargs)[source]

RMol is essentially a wrapper around RDKit Mol with tailored functionalities for attaching R groups, etc.

Parameters:
  • rmol (RMol) – when provided, energies and additional metadata are preserved.

  • template – Provide the original molecule template used for this RMol.

df()[source]

Generate a pandas dataframe row for this molecule with SMILES.

Returns:

pandas dataframe row.

generate_conformers(num_conf: int, minimum_conf_rms: float | None = [], **kwargs)[source]

Generate conformers using RDKit’s ETKDG. The generated conformers are embedded into the template structure. In other words, any atoms that are common with the template structure should have the same coordinates.

Parameters:
  • num_conf (int) – The number of conformers to generate.

  • minimum_conf_rms (float) – The minimum RMS difference required between a newly generated conformer and those already accepted. Conformers that are too similar are discarded.

  • flexible (List[int]) – A list of indices that are common with the template molecule that should have new coordinates.
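The effect of minimum_conf_rms can be illustrated with a small greedy filter. This is only a sketch under simplifying assumptions: coordinates are plain tuples, no alignment is performed before measuring the RMS difference, and the helper names rms_difference and prune_similar are hypothetical rather than part of fegrow or RDKit.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Conformer = Sequence[Point]


def rms_difference(a: Conformer, b: Conformer) -> float:
    """Root-mean-square distance between matching atoms (no alignment step)."""
    if len(a) != len(b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))


def prune_similar(conformers: Sequence[Conformer], minimum_conf_rms: float) -> List[int]:
    """Return indices of conformers that differ from every kept one by at least the threshold."""
    kept: List[int] = []
    for idx, conf in enumerate(conformers):
        # Keep the conformer only if it is far enough from all conformers kept so far.
        if all(rms_difference(conf, conformers[k]) >= minimum_conf_rms for k in kept):
            kept.append(idx)
    return kept


if __name__ == "__main__":
    # Illustrative two-atom "conformers"; the second is nearly identical to the first.
    confs = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (1.05, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
    ]
    print(prune_similar(confs, minimum_conf_rms=0.5))
```

A larger threshold yields fewer, more diverse conformers; a smaller one keeps more near-duplicates.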

gnina(receptor_file)[source]

Use gnina to extract CNNaffinity, and convert it into IC50.

LIMITATION: currently the gnina binaries do not support Mac.

Parameters:

receptor_file (str) – Path to the receptor file.
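The conversion from CNNaffinity to IC50 can be sketched as below. This is a minimal illustration, assuming CNNaffinity is expressed in pK units (the negative base-10 logarithm of a molar concentration) and that the IC50 is wanted in nanomolar. The helper name cnnaffinity_to_ic50_nm is hypothetical and is not part of fegrow's API.

```python
def cnnaffinity_to_ic50_nm(cnn_affinity: float) -> float:
    """Convert a pK-style affinity into an IC50 in nanomolar.

    Assumption: cnn_affinity = -log10(IC50 in mol/L).
    """
    ic50_molar = 10.0 ** (-cnn_affinity)
    return ic50_molar * 1e9  # mol/L -> nmol/L


if __name__ == "__main__":
    for pk in (5.0, 7.0, 9.0):
        print(f"CNNaffinity {pk:.1f} -> IC50 {cnnaffinity_to_ic50_nm(pk):.3g} nM")
```

Under this assumption each unit increase in CNNaffinity corresponds to a tenfold lower IC50.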

optimise_in_receptor(*args, **kwargs)[source]

Enumerate the conformers inside of the receptor by employing ANI2x in a hybrid machine learning / molecular mechanics (ML/MM) approach. ANI2x is a neural network potential for the ligand energetics, but it works only for the following atoms: H, C, N, O, F, S, Cl.

Open Force Field Parsley force field is used for intermolecular interactions with the receptor.

Parameters:
  • sigma_scale_factor – is used to scale the Lennard-Jones radii of the atoms.

  • relative_permittivity – is used to scale the electrostatic interactions with the protein.

  • water_model – can be used to set the force field for any water molecules present in the binding site.

remove_clashing_confs(prot, min_dst_allowed=1.0)[source]

Remove conformations that clash with the protein. Note that the original conformer should be well docked into the protein, ideally with some space between the area of growth and the protein, so that any growth on the template doesn’t automatically cause clashes.

Parameters:
  • prot (Prody instance) – The protein against which the conformers should be tested.

  • min_dst_allowed (float in Angstroms) – If any atom is within this distance in a conformer, the conformer will be deleted.
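The clash criterion described above can be sketched in plain Python. This is an illustration of the distance test only, assuming coordinates are given as (x, y, z) tuples in Angstroms and that "within this distance" means strictly closer than min_dst_allowed. The helper names has_clash and keep_non_clashing are hypothetical, and the real method works on a ProDy protein rather than on tuples.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def has_clash(conformer: Sequence[Point], protein: Sequence[Point],
              min_dst_allowed: float = 1.0) -> bool:
    """Return True if any conformer atom is closer than min_dst_allowed to any protein atom."""
    return any(
        math.dist(lig_atom, prot_atom) < min_dst_allowed
        for lig_atom in conformer
        for prot_atom in protein
    )


def keep_non_clashing(conformers: Sequence[Sequence[Point]], protein: Sequence[Point],
                      min_dst_allowed: float = 1.0) -> List[Sequence[Point]]:
    """Drop every conformer that has at least one clashing atom."""
    return [c for c in conformers if not has_clash(c, protein, min_dst_allowed)]


if __name__ == "__main__":
    protein_atoms = [(0.0, 0.0, 0.0)]          # illustrative single protein atom
    conformers = [[(0.5, 0.0, 0.0)],           # 0.5 A away: clashes
                  [(3.0, 0.0, 0.0)]]           # 3.0 A away: kept
    print(keep_non_clashing(conformers, protein_atoms))
```

The all-pairs loop is quadratic in the number of atoms; for a full protein a spatial index or a vectorised distance matrix would be the more practical choice.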

rep2D(**kwargs)[source]

Use RDKit to get a 2D diagram. Uses the Compute2DCoords and Draw.MolToImage functions.

Works with IPython Notebook.

Parameters:

**kwargs – are passed on to the Draw.MolToImage function.

rep3D(view=None, prody=None, template=False, confIds: List[int] | None = None)[source]

Use py3Dmol to obtain the 3D view of the molecule.

Works with IPython Notebook.

Parameters:
  • view (py3Dmol view instance (None)) – a view to which the visualisation is added. Useful if one wants to view multiple conformers in 3D in one view.

  • prody (Prody instance (Default: None)) – A prody protein around which a 3D view can be created.

  • template (bool (False)) – Whether to visualise the original 3D template as well from which the molecule was made.

  • confIds (List[int]) – Select the conformations for display.

static set_gnina(loc)[source]

Set the location of the binary file gnina. This could be your own compiled directory, or a directory where you’d like it to be downloaded.

By default, the gnina path is set to the working directory (~500MB).

Parameters:

loc (str) – path to gnina binary file. E.g. /dir/path/gnina. Note that right now gnina should be a binary file with that specific filename “gnina”.

sort_conformers(energy_range=5)[source]

For the given molecule and its conformer energies, order the conformers by energy and keep only those within the energy range of the lowest-energy conformer.

Parameters:

energy_range – The energy range (kcal/mol), above the minimum, for which conformers should be kept.
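The energy-window selection can be sketched as follows. This is a minimal illustration on a plain list of energies, assuming the values are in kcal/mol and that a conformer exactly at the edge of the window is kept. The function name filter_by_energy is hypothetical and does not reflect how fegrow stores conformer energies.

```python
from typing import List, Sequence, Tuple


def filter_by_energy(energies: Sequence[float],
                     energy_range: float = 5.0) -> List[Tuple[int, float]]:
    """Return (conformer index, relative energy) pairs, lowest energy first.

    Only conformers within energy_range of the minimum are kept.
    """
    if not energies:
        return []
    lowest = min(energies)
    ranked = sorted(enumerate(energies), key=lambda pair: pair[1])
    return [(idx, e - lowest) for idx, e in ranked if e - lowest <= energy_range]


if __name__ == "__main__":
    # Illustrative energies (kcal/mol) for four conformers.
    print(filter_by_energy([3.0, 1.0, 7.5, 2.0], energy_range=5))
```

Returning relative energies alongside the original indices makes it easy to see both the ranking and how far each conformer lies above the minimum.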

to_file(file_name: str)[source]

Write the molecule and all conformers to file.

Note:

The file type is worked out from the file name extension by splitting on “.”.
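A sketch of how an extension can be derived by splitting on the dot character is shown below. It assumes the last segment after splitting is taken as the file type. The helper name extension_from_name is hypothetical.

```python
def extension_from_name(file_name: str) -> str:
    """Return the text after the last '.' in file_name.

    Caveat: if there is no '.', the whole name is returned unchanged.
    """
    return file_name.split(".")[-1]


if __name__ == "__main__":
    for name in ("ligands.sdf", "run.v2.pdb", "noextension"):
        print(name, "->", extension_from_name(name))
```

With this approach a name without a dot yields the whole name as the "type", so it is safest to pass a file name with an explicit extension.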

toxicity()[source]
Assess various ADMET properties, including
  • Lipinski rule of 5 properties,

  • the presence of unwanted substructures

  • problematic functional groups

  • synthetic accessibility

Returns:

a row of a dataframe with the descriptors

Return type:

dataframe
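As an illustration of the Lipinski rule of 5 check listed above, the sketch below counts violations from precomputed descriptor values. It uses the commonly cited thresholds (molecular weight of at most 500 Da, logP of at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) and treats at most one violation as a pass. The Descriptors class and function names are hypothetical, the input values are made up for demonstration, and this is not necessarily how toxicity() computes or reports these properties.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float        # Daltons
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-5 criteria are violated."""
    checks = (
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    )
    return sum(checks)


def passes_rule_of_five(d: Descriptors) -> bool:
    """Assumption: a molecule passes if it violates at most one criterion."""
    return lipinski_violations(d) <= 1


if __name__ == "__main__":
    # Made-up descriptor values, not data for any real compound.
    small = Descriptors(mol_weight=350.0, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)
    large = Descriptors(mol_weight=650.0, logp=6.2, h_bond_donors=2, h_bond_acceptors=5)
    print(lipinski_violations(small), passes_rule_of_five(small))
    print(lipinski_violations(large), passes_rule_of_five(large))
```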