settings
#
Pydantic models which control/validate the settings.
Classes:
- MMMDSamplingSettings – Settings for molecular dynamics sampling using a molecular mechanics force field.
- MLMDSamplingSettings – Settings for molecular dynamics sampling using a machine learning potential.
- MMMDMetadynamicsSamplingSettings – Settings for molecular dynamics sampling using a molecular mechanics force field with metadynamics.
- MMMDMetadynamicsTorsionMinimisationSamplingSettings – Settings for MM MD metadynamics sampling with additional torsion-restrained minimisation structures.
- PreComputedDatasetSettings – Settings for loading pre-computed datasets from disk.
- TrainingSettings – Settings for the training process.
- OutlierFilterSettings – Settings for filtering outliers from datasets based on MM vs MLP differences.
- TypeGenerationSettings – Settings for generating tagged SMARTS types for a given potential type.
- MSMSettings – Settings for the modified Seminario method.
- ParameterisationSettings – Settings for the starting parameterisation.
- WorkflowSettings – Overall settings for the full fitting workflow.
Attributes:
- SamplingSettings – Union type for all sampling settings. See the associated sampling_protocol field in each class.
SamplingSettings
module-attribute
#
SamplingSettings = Union[
MMMDSamplingSettings,
MLMDSamplingSettings,
MMMDMetadynamicsSamplingSettings,
MMMDMetadynamicsTorsionMinimisationSamplingSettings,
PreComputedDatasetSettings,
]
Union type for all sampling settings. See the associated sampling_protocol field
in each class for the string identifier which should be supplied to
training_sampling_settings and testing_sampling_settings fields in
WorkflowSettings.
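As a minimal sketch of how a sampling_protocol identifier selects a settings class, the lookup below lists only the identifiers that appear as const values in the JSON schemas in this section. PreComputedDatasetSettings is left out because its identifier is not shown here. The names PROTOCOL_TO_CLASS_NAME and class_name_for_protocol are hypothetical helpers for illustration and are not part of presto.

```python
# Hypothetical lookup table for illustration only (not part of presto).
# Keys are sampling_protocol identifiers; values are settings class names.
PROTOCOL_TO_CLASS_NAME = {
    "mm_md": "MMMDSamplingSettings",
    "ml_md": "MLMDSamplingSettings",
    "mm_md_metadynamics": "MMMDMetadynamicsSamplingSettings",
    "mm_md_metadynamics_torsion_minimisation": (
        "MMMDMetadynamicsTorsionMinimisationSamplingSettings"
    ),
}


def class_name_for_protocol(sampling_protocol: str) -> str:
    """Return the settings class name for a sampling_protocol identifier."""
    try:
        return PROTOCOL_TO_CLASS_NAME[sampling_protocol]
    except KeyError:
        known = ", ".join(sorted(PROTOCOL_TO_CLASS_NAME))
        raise ValueError(
            f"Unknown sampling_protocol {sampling_protocol!r}; known: {known}"
        ) from None
```

In the real models this dispatch is presumably handled by Pydantic using sampling_protocol as the discriminator, so a lookup like this should not be needed in user code.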
_DefaultSettings
pydantic-model
#
Bases: BaseModel, ABC
Default configuration for all models.
output_types
property
#
output_types: set[OutputType]
Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.
to_yaml
#
_SamplingSettingsBase
pydantic-model
#
Bases: _DefaultSettings, ABC
Settings for sampling (usually molecular dynamics).
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for sampling (usually molecular dynamics).",
"properties": {
"sampling_protocol": {
"description": "Type of sampling protocol. Each sampling settings subclass should set this to a unique value. This is used as a discriminator when loading from YAML.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
}
},
"required": [
"sampling_protocol"
],
"title": "_SamplingSettingsBase",
"type": "object"
}
Fields:
- sampling_protocol (str)
- ml_potential (Literal[AvailableModels])
- timestep (OpenMMQuantity[femtoseconds])
- temperature (OpenMMQuantity[kelvin])
- snapshot_interval (OpenMMQuantity[femtoseconds])
- n_conformers (int)
- equilibration_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- production_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- loss_energy_weight (float)
- loss_force_weight (float)
Validators:
- validate_sampling_times
sampling_protocol
pydantic-field
#
Type of sampling protocol. Each sampling settings subclass should set this to a unique value. This is used as a discriminator when loading from YAML.
ml_potential
pydantic-field
#
ml_potential: Literal[AvailableModels] = 'aceff-2.0'
The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.
temperature
pydantic-field
#
Temperature to run MD at
snapshot_interval
pydantic-field
#
Interval between saving snapshots during production sampling
n_conformers
pydantic-field
#
The number of conformers to generate, from which sampling is started
equilibration_sampling_time_per_conformer
pydantic-field
#
Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.
production_sampling_time_per_conformer
pydantic-field
#
Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.
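The relationship between the two sampling times can be checked with a little arithmetic. The sketch below uses hypothetical values, since the defaults for these fields are not listed in the schema. It also assumes that one snapshot is saved per snapshot_interval during production, for each conformer.

```python
from fractions import Fraction

# Hypothetical example values; the library defaults are not listed above.
equilibration_ps = Fraction(1, 10)    # 0.1 ps equilibration per conformer
production_ps = Fraction(2)           # 2 ps production per conformer
snapshot_interval_fs = Fraction(100)  # save a snapshot every 100 fs
n_conformers = 10                     # matches the documented default

# Total sampling time per conformer is equilibration plus production.
total_ps = equilibration_ps + production_ps

# Assumption: snapshots are saved once per snapshot interval during
# production only, so equilibration contributes no snapshots.
snapshots_per_conformer = (production_ps * 1000) / snapshot_interval_fs
total_snapshots = n_conformers * snapshots_per_conformer

print(float(total_ps))       # 2.1
print(int(total_snapshots))  # 200
```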
loss_energy_weight
pydantic-field
#
Scaling factor for the energy loss term for samples from this protocol.
loss_force_weight
pydantic-field
#
Scaling factor for the force loss term for samples from this protocol.
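To illustrate how the two weights might combine, here is a minimal sketch that assumes each loss term is a mean squared error and that the total is their weighted sum. The actual loss used during training is not specified in this section, so mean_squared_error and weighted_loss are hypothetical names and the functional form is an assumption.

```python
def mean_squared_error(predicted, reference):
    """Mean squared difference between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return sum(squared) / len(squared)


def weighted_loss(
    predicted_energies,
    reference_energies,
    predicted_forces,
    reference_forces,
    loss_energy_weight=1000.0,  # documented default
    loss_force_weight=0.1,      # documented default
):
    """Assumed form: weighted sum of energy and force mean squared errors."""
    energy_term = mean_squared_error(predicted_energies, reference_energies)
    force_term = mean_squared_error(predicted_forces, reference_forces)
    return loss_energy_weight * energy_term + loss_force_weight * force_term
```

With the default weights, an energy error contributes 10,000 times more per unit of squared error than a force error, which is worth keeping in mind when the two quantities are in different units.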
validate_sampling_times
pydantic-validator
#
Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.
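The divisibility rules described above can be sketched as follows. This is not the validator from presto/settings.py, only an illustration of the stated rules with hypothetical function names. Times are passed as plain numbers in the units given in the field list, and exact fractions are used so that values such as 0.1 ps do not fail because of floating-point rounding.

```python
from fractions import Fraction


def _exact(value) -> Fraction:
    """Convert a number to an exact fraction via its decimal string form."""
    return Fraction(str(value))


def divides_exactly(total, step) -> bool:
    """Return True if total is an integer multiple of step (same units)."""
    return (_exact(total) / _exact(step)).denominator == 1


def check_sampling_times(
    timestep_fs, snapshot_interval_fs, equilibration_ps, production_ps
) -> list[str]:
    """Collect a message for every divisibility rule that is violated."""
    equilibration_fs = _exact(equilibration_ps) * 1000
    production_fs = _exact(production_ps) * 1000
    problems = []
    if not divides_exactly(equilibration_fs, timestep_fs):
        problems.append("equilibration time is not a multiple of the timestep")
    if not divides_exactly(production_fs, timestep_fs):
        problems.append("production time is not a multiple of the timestep")
    if not divides_exactly(production_fs, snapshot_interval_fs):
        problems.append(
            "production time is not a multiple of the snapshot interval"
        )
    return problems
```

For example, a 2 ps production run with a 300 fs snapshot interval would be flagged, because 2000 fs is not a whole number of 300 fs intervals.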
Source code in presto/settings.py
to_yaml
#
MMMDSamplingSettings
pydantic-model
#
Bases: _SamplingSettingsBase
Settings for molecular dynamics sampling using a molecular mechanics force field. This is initially the force field supplied in the parameterisation settings, but is updated as the bespoke force field is trained.
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for molecular dynamics sampling using a molecular mechanics\nforce field. This is initially the force field supplied in the parameterisation\nsettings, but is updated as the bespoke force field is trained.",
"properties": {
"sampling_protocol": {
"const": "mm_md",
"default": "mm_md",
"description": "Sampling protocol to use.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
}
},
"title": "MMMDSamplingSettings",
"type": "object"
}
Fields:
- ml_potential (Literal[AvailableModels])
- timestep (OpenMMQuantity[femtoseconds])
- temperature (OpenMMQuantity[kelvin])
- snapshot_interval (OpenMMQuantity[femtoseconds])
- n_conformers (int)
- equilibration_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- production_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- loss_energy_weight (float)
- loss_force_weight (float)
- sampling_protocol (Literal['mm_md'])
Validators:
- validate_sampling_times
sampling_protocol
pydantic-field
#
Sampling protocol to use.
ml_potential
pydantic-field
#
ml_potential: Literal[AvailableModels] = 'aceff-2.0'
The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.
temperature
pydantic-field
#
Temperature to run MD at
snapshot_interval
pydantic-field
#
Interval between saving snapshots during production sampling
n_conformers
pydantic-field
#
The number of conformers to generate, from which sampling is started
equilibration_sampling_time_per_conformer
pydantic-field
#
Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.
production_sampling_time_per_conformer
pydantic-field
#
Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.
loss_energy_weight
pydantic-field
#
Scaling factor for the energy loss term for samples from this protocol.
loss_force_weight
pydantic-field
#
Scaling factor for the force loss term for samples from this protocol.
to_yaml
#
from_yaml
classmethod
#
validate_sampling_times
pydantic-validator
#
Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.
Source code in presto/settings.py
MLMDSamplingSettings
pydantic-model
#
Bases: _SamplingSettingsBase
Settings for molecular dynamics sampling using a machine learning potential. This protocol uses the ML reference potential for sampling as well as for energy and force calculations.
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for molecular dynamics sampling using a machine learning\npotential. This protocol uses the ML reference potential for sampling as\nwell as for energy and force calculations.",
"properties": {
"sampling_protocol": {
"const": "ml_md",
"default": "ml_md",
"description": "Sampling protocol to use.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
}
},
"title": "MLMDSamplingSettings",
"type": "object"
}
Fields:
- ml_potential (Literal[AvailableModels])
- timestep (OpenMMQuantity[femtoseconds])
- temperature (OpenMMQuantity[kelvin])
- snapshot_interval (OpenMMQuantity[femtoseconds])
- n_conformers (int)
- equilibration_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- production_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- loss_energy_weight (float)
- loss_force_weight (float)
- sampling_protocol (Literal['ml_md'])
Validators:
- validate_sampling_times
sampling_protocol
pydantic-field
#
Sampling protocol to use.
ml_potential
pydantic-field
#
ml_potential: Literal[AvailableModels] = 'aceff-2.0'
The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.
temperature
pydantic-field
#
Temperature to run MD at
snapshot_interval
pydantic-field
#
Interval between saving snapshots during production sampling
n_conformers
pydantic-field
#
The number of conformers to generate, from which sampling is started
equilibration_sampling_time_per_conformer
pydantic-field
#
Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.
production_sampling_time_per_conformer
pydantic-field
#
Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.
loss_energy_weight
pydantic-field
#
Scaling factor for the energy loss term for samples from this protocol.
loss_force_weight
pydantic-field
#
Scaling factor for the force loss term for samples from this protocol.
to_yaml
#
from_yaml
classmethod
#
validate_sampling_times
pydantic-validator
#
Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.
Source code in presto/settings.py
MMMDMetadynamicsSamplingSettings
pydantic-model
#
Bases: _SamplingSettingsBase
Settings for molecular dynamics sampling using a molecular mechanics force field with metadynamics. This is initially the force field supplied in the parameterisation settings, but is updated as the bespoke force field is trained.
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for molecular dynamics sampling using a molecular mechanics\nforce field with metadynamics. This is initially the force field supplied in the parameterisation\nsettings, but is updated as the bespoke force field is trained.",
"properties": {
"sampling_protocol": {
"const": "mm_md_metadynamics",
"default": "mm_md_metadynamics",
"description": "Sampling protocol to use.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
},
"bias_width": {
"default": 0.3141592653589793,
"description": "Width of the bias (in radians)",
"title": "Bias Width",
"type": "number"
},
"bias_factor": {
"default": 20.0,
"description": "Bias factor for well-tempered metadynamics. Typical range: 5-20",
"title": "Bias Factor",
"type": "number"
},
"bias_height": {
"description": "Initial height of the bias",
"title": "Bias Height",
"type": "string"
},
"bias_frequency": {
"description": "Frequency at which to add bias",
"title": "Bias Frequency",
"type": "string"
},
"bias_save_frequency": {
"description": "Frequency at which to save the bias",
"title": "Bias Save Frequency",
"type": "string"
},
"torsions_to_include_smarts": {
"description": "SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.",
"items": {
"type": "string"
},
"title": "Torsions To Include Smarts",
"type": "array"
},
"torsions_to_exclude_smarts": {
"description": "SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.",
"items": {
"type": "string"
},
"title": "Torsions To Exclude Smarts",
"type": "array"
}
},
"title": "MMMDMetadynamicsSamplingSettings",
"type": "object"
}
Fields:
- ml_potential (Literal[AvailableModels])
- timestep (OpenMMQuantity[femtoseconds])
- temperature (OpenMMQuantity[kelvin])
- snapshot_interval (OpenMMQuantity[femtoseconds])
- n_conformers (int)
- equilibration_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- production_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- loss_energy_weight (float)
- loss_force_weight (float)
- sampling_protocol (Literal['mm_md_metadynamics'])
- bias_width (float)
- bias_factor (float)
- bias_height (OpenMMQuantity[kilojoules_per_mole])
- bias_frequency (OpenMMQuantity[picoseconds])
- bias_save_frequency (OpenMMQuantity[picoseconds])
- torsions_to_include_smarts (list[str])
- torsions_to_exclude_smarts (list[str])
Validators:
- validate_sampling_times
- validate_frequencies
sampling_protocol
pydantic-field
#
Sampling protocol to use.
bias_factor
pydantic-field
#
Bias factor for well-tempered metadynamics. Typical range: 5-20
bias_height
pydantic-field
#
Initial height of the bias
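For readers unfamiliar with well-tempered metadynamics, the sketch below shows the standard relationship between the bias factor, the initial hill height and the height of hills deposited later. In the usual formulation the hill height is scaled by exp(-V / (kB * dT)), where V is the bias already accumulated at the current position and dT = T * (bias_factor - 1). Whether presto applies exactly this form is an assumption. The function name, the 300 K temperature and the 2 kJ/mol initial height are hypothetical.

```python
import math

# Molar gas constant in kJ/(mol K), matching the kJ/mol unit of bias_height.
R_KJ_PER_MOL_K = 0.0083144626

# The documented default bias_width, 0.3141592653589793 rad, is pi / 10,
# i.e. 18 degrees.
DEFAULT_BIAS_WIDTH_RAD = math.pi / 10


def well_tempered_hill_height(
    initial_height_kj_mol, current_bias_kj_mol, temperature_k, bias_factor
):
    """Height of the next hill in standard well-tempered metadynamics."""
    if bias_factor <= 1.0:
        raise ValueError("bias_factor must be greater than 1")
    delta_t = temperature_k * (bias_factor - 1.0)
    scale = math.exp(-current_bias_kj_mol / (R_KJ_PER_MOL_K * delta_t))
    return initial_height_kj_mol * scale


# Hills shrink as bias accumulates; a larger bias factor slows the decay.
for bias_factor in (5.0, 20.0):
    height = well_tempered_hill_height(2.0, 10.0, 300.0, bias_factor)
    print(bias_factor, round(height, 3))
```

A bias factor near the top of the typical 5-20 range therefore keeps hills closer to their initial height for longer, which favours faster exploration over faster convergence of the bias.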
bias_frequency
pydantic-field
#
Frequency at which to add bias
bias_save_frequency
pydantic-field
#
Frequency at which to save the bias
torsions_to_include_smarts
pydantic-field
#
SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.
torsions_to_exclude_smarts
pydantic-field
#
SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.
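The include and exclude patterns match different things: four atoms for includes, two atoms for excludes. The sketch below shows one way the two sets of matches could be combined. It assumes the SMARTS matching has already been done elsewhere and that an excluded bond is compared against the central bond of each torsion. filter_torsions is a hypothetical helper, not a presto function.

```python
def filter_torsions(included_torsions, excluded_bonds):
    """Drop torsions whose central bond matches an excluded bond.

    included_torsions: iterable of 4-tuples of atom indices (i, j, k, l).
    excluded_bonds: iterable of 2-tuples of atom indices, in either order.
    """
    # Bonds are unordered, so (1, 2) and (2, 1) must compare equal.
    excluded = {frozenset(bond) for bond in excluded_bonds}
    return [
        torsion
        for torsion in included_torsions
        if frozenset(torsion[1:3]) not in excluded
    ]


# Hypothetical atom-index matches for a small molecule.
torsions = [(0, 1, 2, 3), (1, 2, 3, 4), (4, 1, 2, 5)]
print(filter_torsions(torsions, excluded_bonds=[(2, 1)]))
# [(1, 2, 3, 4)]
```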
ml_potential
pydantic-field
#
ml_potential: Literal[AvailableModels] = 'aceff-2.0'
The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.
temperature
pydantic-field
#
Temperature to run MD at
snapshot_interval
pydantic-field
#
Interval between saving snapshots during production sampling
n_conformers
pydantic-field
#
The number of conformers to generate, from which sampling is started
equilibration_sampling_time_per_conformer
pydantic-field
#
Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.
production_sampling_time_per_conformer
pydantic-field
#
Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.
loss_energy_weight
pydantic-field
#
Scaling factor for the energy loss term for samples from this protocol.
loss_force_weight
pydantic-field
#
Scaling factor for the force loss term for samples from this protocol.
to_yaml
#
from_yaml
classmethod
#
validate_sampling_times
pydantic-validator
#
Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.
Source code in presto/settings.py
MMMDMetadynamicsTorsionMinimisationSamplingSettings
pydantic-model
#
Bases: MMMDMetadynamicsSamplingSettings
Settings for MM MD metadynamics sampling with additional torsion-restrained minimisation structures. This extends MMMDMetadynamicsSamplingSettings by generating additional training data from torsion-restrained minimisations.
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for MM MD metadynamics sampling with additional torsion-restrained\nminimisation structures. This extends MMMDMetadynamicsSamplingSettings by generating\nadditional training data from torsion-restrained minimisations.",
"properties": {
"sampling_protocol": {
"const": "mm_md_metadynamics_torsion_minimisation",
"default": "mm_md_metadynamics_torsion_minimisation",
"description": "Sampling protocol to use.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
},
"bias_width": {
"default": 0.3141592653589793,
"description": "Width of the bias (in radians)",
"title": "Bias Width",
"type": "number"
},
"bias_factor": {
"default": 20.0,
"description": "Bias factor for well-tempered metadynamics. Typical range: 5-20",
"title": "Bias Factor",
"type": "number"
},
"bias_height": {
"description": "Initial height of the bias",
"title": "Bias Height",
"type": "string"
},
"bias_frequency": {
"description": "Frequency at which to add bias",
"title": "Bias Frequency",
"type": "string"
},
"bias_save_frequency": {
"description": "Frequency at which to save the bias",
"title": "Bias Save Frequency",
"type": "string"
},
"torsions_to_include_smarts": {
"description": "SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.",
"items": {
"type": "string"
},
"title": "Torsions To Include Smarts",
"type": "array"
},
"torsions_to_exclude_smarts": {
"description": "SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.",
"items": {
"type": "string"
},
"title": "Torsions To Exclude Smarts",
"type": "array"
},
"ml_minimisation_steps": {
"default": 10,
"description": "Number of MLP minimisation steps with restrained torsions.",
"title": "Ml Minimisation Steps",
"type": "integer"
},
"mm_minimisation_steps": {
"default": 10,
"description": "Number of MM minimisation steps with restrained torsions.",
"title": "Mm Minimisation Steps",
"type": "integer"
},
"torsion_restraint_force_constant": {
"description": "Force constant for torsion restraints.",
"title": "Torsion Restraint Force Constant",
"type": "string"
},
"map_ml_coords_energy_to_mm_coords_energy": {
"default": false,
"description": "Whether to substitute the MLP energy for the MM-minimised coordinates with the MLP energy for the corresponding MLP-minimised coordinates.",
"title": "Map Ml Coords Energy To Mm Coords Energy",
"type": "boolean"
},
"loss_energy_weight_mm_torsion_min": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.",
"title": "Loss Energy Weight Mm Torsion Min",
"type": "number"
},
"loss_force_weight_mm_torsion_min": {
"default": 0.1,
"description": "Scaling factor for the force loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.",
"title": "Loss Force Weight Mm Torsion Min",
"type": "number"
},
"loss_energy_weight_ml_torsion_min": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.",
"title": "Loss Energy Weight Ml Torsion Min",
"type": "number"
},
"loss_force_weight_ml_torsion_min": {
"default": 0.1,
"description": "Scaling factor for the force loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.",
"title": "Loss Force Weight Ml Torsion Min",
"type": "number"
}
},
"title": "MMMDMetadynamicsTorsionMinimisationSamplingSettings",
"type": "object"
}
Fields:
- ml_potential (Literal[AvailableModels])
- timestep (OpenMMQuantity[femtoseconds])
- temperature (OpenMMQuantity[kelvin])
- snapshot_interval (OpenMMQuantity[femtoseconds])
- n_conformers (int)
- equilibration_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- production_sampling_time_per_conformer (OpenMMQuantity[picoseconds])
- loss_energy_weight (float)
- loss_force_weight (float)
- bias_width (float)
- bias_factor (float)
- bias_height (OpenMMQuantity[kilojoules_per_mole])
- bias_frequency (OpenMMQuantity[picoseconds])
- bias_save_frequency (OpenMMQuantity[picoseconds])
- torsions_to_include_smarts (list[str])
- torsions_to_exclude_smarts (list[str])
- sampling_protocol (Literal['mm_md_metadynamics_torsion_minimisation'])
- ml_minimisation_steps (int)
- mm_minimisation_steps (int)
- torsion_restraint_force_constant (OpenMMQuantity[kilojoules_per_mole / radian ** 2])
- map_ml_coords_energy_to_mm_coords_energy (bool)
- loss_energy_weight_mm_torsion_min (float)
- loss_force_weight_mm_torsion_min (float)
- loss_energy_weight_ml_torsion_min (float)
- loss_force_weight_ml_torsion_min (float)
Validators:
- validate_sampling_times
- validate_frequencies
sampling_protocol
pydantic-field
#
sampling_protocol: Literal[
"mm_md_metadynamics_torsion_minimisation"
] = "mm_md_metadynamics_torsion_minimisation"
Sampling protocol to use.
ml_minimisation_steps
pydantic-field
#
Number of MLP minimisation steps with restrained torsions.
mm_minimisation_steps
pydantic-field
#
Number of MM minimisation steps with restrained torsions.
torsion_restraint_force_constant
pydantic-field
#
torsion_restraint_force_constant: OpenMMQuantity[
kilojoules_per_mole / radian**2
] = (0.0 * kilojoules_per_mole / radian**2)
Force constant for torsion restraints.
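The units of the force constant (kJ/mol per radian squared) suggest a harmonic restraint on the torsion angle. The sketch below assumes the common form E = 0.5 * k * d**2, where d is the angle difference wrapped into [-pi, pi). The exact functional form used by presto, including the factor of 0.5, is an assumption, and both function names are hypothetical.

```python
import math


def wrapped_angle_difference(phi, phi0):
    """Signed difference phi - phi0 in radians, wrapped into [-pi, pi)."""
    return (phi - phi0 + math.pi) % (2.0 * math.pi) - math.pi


def torsion_restraint_energy(phi, phi0, force_constant):
    """Assumed harmonic restraint energy in kJ/mol.

    phi, phi0: current and target torsion angles in radians.
    force_constant: restraint strength in kJ/mol/rad**2.
    """
    delta = wrapped_angle_difference(phi, phi0)
    return 0.5 * force_constant * delta**2
```

Wrapping matters near the periodic boundary: angles of pi - 0.05 and -pi + 0.05 are only 0.1 rad apart, not nearly 2 pi.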
map_ml_coords_energy_to_mm_coords_energy
pydantic-field
#
Whether to substitute the MLP energy for the MM-minimised coordinates with the MLP energy for the corresponding MLP-minimised coordinates.
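The substitution described above can be pictured with a minimal sketch. It assumes that each MM-minimised structure has exactly one corresponding MLP-minimised structure and that energies are held in two parallel lists. The function and argument names are hypothetical and do not come from presto.

```python
def reference_energies_for_mm_structures(
    mlp_energies_at_mm_coords,
    mlp_energies_at_ml_coords,
    map_ml_coords_energy_to_mm_coords_energy=False,  # documented default
):
    """Choose the MLP reference energy paired with each MM-minimised structure.

    When the flag is False, each MM-minimised structure keeps the MLP energy
    evaluated at its own coordinates. When True, it is paired instead with
    the MLP energy of the corresponding MLP-minimised structure.
    """
    if len(mlp_energies_at_mm_coords) != len(mlp_energies_at_ml_coords):
        raise ValueError("each MM structure needs one matching MLP structure")
    if map_ml_coords_energy_to_mm_coords_energy:
        return list(mlp_energies_at_ml_coords)
    return list(mlp_energies_at_mm_coords)
```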
loss_energy_weight_mm_torsion_min
pydantic-field
#
Scaling factor for the energy loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.
loss_force_weight_mm_torsion_min
pydantic-field
#
Scaling factor for the force loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.
loss_energy_weight_ml_torsion_min
pydantic-field
#
Scaling factor for the energy loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.
loss_force_weight_ml_torsion_min
pydantic-field
#
Scaling factor for the force loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.
ml_potential
pydantic-field
#
ml_potential: Literal[AvailableModels] = 'aceff-2.0'
The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.
temperature
pydantic-field
#
Temperature to run MD at
snapshot_interval
pydantic-field
#
Interval between saving snapshots during production sampling
n_conformers
pydantic-field
#
The number of conformers to generate, from which sampling is started
equilibration_sampling_time_per_conformer
pydantic-field
#
Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.
production_sampling_time_per_conformer
pydantic-field
#
Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.
loss_energy_weight
pydantic-field
#
Scaling factor for the energy loss term for samples from this protocol.
loss_force_weight
pydantic-field
#
Scaling factor for the force loss term for samples from this protocol.
bias_factor
pydantic-field
#
Bias factor for well-tempered metadynamics. Typical range: 5-20
bias_height
pydantic-field
#
Initial height of the bias
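As a rough illustration of how bias_factor and bias_height interact, the sketch below applies the standard well-tempered metadynamics rule, in which each new Gaussian is scaled down according to the bias already deposited at the current point. The function name and the use of plain floats (kJ/mol and kelvin) are assumptions made for this example; the source does not state how presto implements the scaling.

```python
import math

# Molar gas constant in kJ/(mol*K).
R_KJ_PER_MOL_K = 0.0083144626


def well_tempered_height(
    initial_height: float,
    current_bias: float,
    temperature: float,
    bias_factor: float,
) -> float:
    """Height (kJ/mol) of the next Gaussian in well-tempered metadynamics.

    Hypothetical helper: inputs are plain floats, not OpenMM quantities.
    """
    if bias_factor <= 1.0:
        raise ValueError("bias_factor must be greater than 1")
    # The bias factor is (T + delta_T) / T, so delta_T = T * (bias_factor - 1).
    delta_t = temperature * (bias_factor - 1.0)
    return initial_height * math.exp(-current_bias / (R_KJ_PER_MOL_K * delta_t))
```

A larger bias factor makes the heights decay more slowly, so more of the free energy surface is flattened before the bias stops growing.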
bias_frequency
pydantic-field
#
Frequency at which to add bias
bias_save_frequency
pydantic-field
#
Frequency at which to save the bias
torsions_to_include_smarts
pydantic-field
#
SMARTS patterns for torsions to include in metadynamics biasing. The default patterns match single bonds not in rings and single bonds in aliphatic rings of size 5 or more. Each pattern should match the entire torsion (4 atoms), not just the rotatable bond.
torsions_to_exclude_smarts
pydantic-field
#
SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.
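The interplay between the two fields can be sketched without any cheminformatics toolkit: include patterns yield 4-atom torsions, exclude patterns yield 2-atom bonds, and a torsion is dropped when its central bond is excluded. The helper below is hypothetical and works on atom-index tuples that a SMARTS matcher is assumed to have produced already.

```python
Torsion = tuple[int, int, int, int]
Bond = tuple[int, int]


def filter_torsions(
    included: list[Torsion], excluded_bonds: list[Bond]
) -> list[Torsion]:
    """Drop torsions whose central (rotatable) bond matches an excluded bond."""
    # Bonds are unordered, so compare them as frozensets of atom indices.
    excluded = {frozenset(bond) for bond in excluded_bonds}
    return [t for t in included if frozenset(t[1:3]) not in excluded]


# Illustrative atom indices only.
torsions = [(0, 1, 2, 3), (1, 2, 3, 4), (5, 4, 3, 2)]
kept = filter_torsions(torsions, excluded_bonds=[(3, 2)])
```

Here the torsion around the 2-3 bond is removed regardless of the order in which the exclude pattern lists the two atoms.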
to_yaml
#
from_yaml
classmethod
#
validate_sampling_times
pydantic-validator
#
Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.
Source code in presto/settings.py
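A minimal sketch of the kind of divisibility check described above, assuming all times are given as plain floats in femtoseconds. The function names and the tolerance are choices made for this example and are not taken from presto.

```python
import math


def n_intervals(total_fs: float, interval_fs: float, label: str) -> int:
    """Return total / interval as an int, raising if it is not a whole number."""
    ratio = total_fs / interval_fs
    nearest = round(ratio)
    # Compare against the nearest integer to tolerate floating point error.
    if not math.isclose(ratio, nearest, rel_tol=0.0, abs_tol=1e-6):
        raise ValueError(
            f"{label} ({total_fs} fs) is not a multiple of {interval_fs} fs"
        )
    return nearest


def check_sampling_times(
    timestep_fs: float,
    equilibration_fs: float,
    production_fs: float,
    snapshot_interval_fs: float,
) -> tuple[int, int, int]:
    """Return (equilibration steps, production steps, snapshots per conformer)."""
    n_equil = n_intervals(equilibration_fs, timestep_fs, "equilibration time")
    n_prod = n_intervals(production_fs, timestep_fs, "production time")
    # Only production sampling saves snapshots, so only it is checked here.
    n_snap = n_intervals(production_fs, snapshot_interval_fs, "production time")
    return n_equil, n_prod, n_snap
```

Checking up front means a mismatch is reported when the settings are loaded rather than partway through a simulation.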
PreComputedDatasetSettings
pydantic-model
#
Bases: _DefaultSettings
Settings for loading pre-computed datasets from disk.
For single-molecule fits, provide a single Path. For multi-molecule fits, provide a list of Paths (one per molecule).
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for loading pre-computed datasets from disk.\n\nFor single-molecule fits, provide a single Path.\nFor multi-molecule fits, provide a list of Paths (one per molecule).",
"properties": {
"sampling_protocol": {
"const": "pre_computed",
"default": "pre_computed",
"description": "Sampling protocol identifier.",
"title": "Sampling Protocol",
"type": "string"
},
"dataset_paths": {
"description": "Path(s) to pre-computed dataset(s) saved with dataset.save_to_disk(). For single-molecule fits, provide a single Path. For multi-molecule fits, provide a list of Paths (one per molecule in order).",
"items": {
"format": "path",
"type": "string"
},
"title": "Dataset Paths",
"type": "array"
}
},
"required": [
"dataset_paths"
],
"title": "PreComputedDatasetSettings",
"type": "object"
}
Fields:
-
sampling_protocol(Literal['pre_computed']) -
dataset_paths(list[Path])
Validators:
-
normalize_dataset_paths→dataset_paths
sampling_protocol
pydantic-field
#
Sampling protocol identifier.
output_types
property
#
output_types: set[OutputType]
Pre-computed datasets don't produce any output files.
normalize_dataset_paths
pydantic-validator
#
Normalize dataset_paths to always be a list internally.
Source code in presto/settings.py
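The normalisation step can be pictured as follows. This is a standalone sketch using only pathlib, not the actual pydantic validator, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Union


def normalise_dataset_paths(
    value: Union[str, Path, list[Union[str, Path]]],
) -> list[Path]:
    """Wrap a single path in a list and coerce every entry to Path."""
    if isinstance(value, (str, Path)):
        return [Path(value)]
    return [Path(item) for item in value]
```

Downstream code can then iterate over dataset_paths without special-casing single-molecule fits.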
to_yaml
#
TrainingSettings
pydantic-model
#
Bases: _DefaultSettings
Settings for the training process.
Show JSON schema:
{
"$defs": {
"AttributeConfig": {
"description": "Configuration for how a potential's attributes should be trained.",
"properties": {
"cols": {
"description": "The parameters to train, e.g. 'k', 'length', 'epsilon'.",
"items": {
"type": "string"
},
"title": "Cols",
"type": "array"
},
"scales": {
"additionalProperties": {
"type": "number"
},
"default": {},
"description": "The scales to apply to each parameter, e.g. 'k': 1.0, 'length': 1.0, 'epsilon': 1.0.",
"title": "Scales",
"type": "object"
},
"limits": {
"additionalProperties": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
]
},
{
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
]
}
],
"type": "array"
},
"default": {},
"description": "The min and max values to clamp each parameter within, e.g. 'k': (0.0, None), 'angle': (0.0, pi), 'epsilon': (0.0, None), where none indicates no constraint.",
"title": "Limits",
"type": "object"
},
"regularize": {
"additionalProperties": {
"type": "number"
},
"default": {},
"description": "The regularization strength to apply to each parameter, e.g. 'k': 0.01, 'epsilon': 0.001. Parameters not listed are not regularized.",
"title": "Regularize",
"type": "object"
}
},
"required": [
"cols"
],
"title": "AttributeConfig",
"type": "object"
},
"ParameterConfig": {
"description": "Configuration for how a potential's parameters should be trained.",
"properties": {
"cols": {
"description": "The parameters to train, e.g. 'k', 'length', 'epsilon'.",
"items": {
"type": "string"
},
"title": "Cols",
"type": "array"
},
"scales": {
"additionalProperties": {
"type": "number"
},
"default": {},
"description": "The scales to apply to each parameter, e.g. 'k': 1.0, 'length': 1.0, 'epsilon': 1.0.",
"title": "Scales",
"type": "object"
},
"limits": {
"additionalProperties": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
]
},
{
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
]
}
],
"type": "array"
},
"default": {},
"description": "The min and max values to clamp each parameter within, e.g. 'k': (0.0, None), 'angle': (0.0, pi), 'epsilon': (0.0, None), where none indicates no constraint.",
"title": "Limits",
"type": "object"
},
"regularize": {
"additionalProperties": {
"type": "number"
},
"default": {},
"description": "The regularization strength to apply to each parameter, e.g. 'k': 0.01, 'epsilon': 0.001. Parameters not listed are not regularized.",
"title": "Regularize",
"type": "object"
},
"include": {
"anyOf": [
{
"items": {
"$ref": "#/$defs/_PotentialKey"
},
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "The keys (see ``smee.TensorPotential.parameter_keys`` for details) corresponding to specific parameters to be trained. If ``None``, all parameters will be trained.",
"title": "Include"
},
"exclude": {
"anyOf": [
{
"items": {
"$ref": "#/$defs/_PotentialKey"
},
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "The keys (see ``smee.TensorPotential.parameter_keys`` for details) corresponding to specific parameters to be excluded from training. If ``None``, no parameters will be excluded.",
"title": "Exclude"
}
},
"required": [
"cols"
],
"title": "ParameterConfig",
"type": "object"
},
"_PotentialKey": {
"description": "TODO: Needed until interchange upgrades to pydantic >=2",
"properties": {
"id": {
"title": "Id",
"type": "string"
},
"mult": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"default": null,
"title": "Mult"
},
"associated_handler": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Associated Handler"
},
"bond_order": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"title": "Bond Order"
}
},
"required": [
"id"
],
"title": "_PotentialKey",
"type": "object"
}
},
"additionalProperties": false,
"description": "Settings for the training process.",
"properties": {
"optimiser": {
"default": "adam",
"description": "Optimiser to use for the training. 'adam' is Adam, 'lm' is Levenberg-Marquardt",
"enum": [
"adam",
"lm"
],
"title": "Optimiser",
"type": "string"
},
"parameter_configs": {
"additionalProperties": {
"$ref": "#/$defs/ParameterConfig"
},
"description": "Configuration for the force field parameters to be trained.",
"propertyNames": {
"enum": [
"Bonds",
"LinearBonds",
"Angles",
"LinearAngles",
"ProperTorsions",
"ImproperTorsions"
]
},
"title": "Parameter Configs",
"type": "object"
},
"attribute_configs": {
"additionalProperties": {
"$ref": "#/$defs/AttributeConfig"
},
"default": {},
"description": "Configuration for the force field attributes to be trained. This allows 1-4 scaling for 'vdW' and 'Electrostatics' to be trained.",
"propertyNames": {
"enum": [
"vdW",
"Electrostatics"
]
},
"title": "Attribute Configs",
"type": "object"
},
"n_epochs": {
"default": 1000,
"description": "Number of epochs in the ML fit",
"title": "N Epochs",
"type": "integer"
},
"learning_rate": {
"default": 0.01,
"description": "Learning Rate in the ML fit",
"title": "Learning Rate",
"type": "number"
},
"learning_rate_decay": {
"default": 1.0,
"description": "Learning Rate Decay. 0.99 is 1%, and 1.0 is no decay.",
"title": "Learning Rate Decay",
"type": "number"
},
"learning_rate_decay_step": {
"default": 10,
"description": "Learning Rate Decay Step",
"title": "Learning Rate Decay Step",
"type": "integer"
},
"regularisation_target": {
"default": "initial",
"description": "Target value to regularise parameters towards. 'initial' is the initial parameter value, 'zero' is zero.",
"enum": [
"initial",
"zero"
],
"title": "Regularisation Target",
"type": "string"
}
},
"title": "TrainingSettings",
"type": "object"
}
Fields:
-
optimiser(OptimiserName) -
parameter_configs(dict[ValenceType, ParameterConfig]) -
attribute_configs(dict[AllowedAttributeType, AttributeConfig]) -
n_epochs(int) -
learning_rate(float) -
learning_rate_decay(float) -
learning_rate_decay_step(int) -
regularisation_target(Literal['initial', 'zero'])
optimiser
pydantic-field
#
optimiser: OptimiserName = 'adam'
Optimiser to use for the training. 'adam' is Adam, 'lm' is Levenberg-Marquardt
parameter_configs
pydantic-field
#
Configuration for the force field parameters to be trained.
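As an illustration of the shape described by the JSON schema above, the dictionary below configures two valence types. The column names follow the examples given in the schema descriptions; the numerical values are placeholders rather than recommended settings, and the checking function is a hypothetical stand-in for the validation that pydantic performs.

```python
import math

# Keys permitted by the schema's propertyNames enum.
ALLOWED_VALENCE_TYPES = {
    "Bonds",
    "LinearBonds",
    "Angles",
    "LinearAngles",
    "ProperTorsions",
    "ImproperTorsions",
}

parameter_configs = {
    "Bonds": {
        "cols": ["k", "length"],
        "scales": {"k": 1.0, "length": 1.0},
        "limits": {"k": (0.0, None), "length": (0.0, None)},
        "regularize": {"k": 0.01},
    },
    "Angles": {
        "cols": ["k", "angle"],
        "limits": {"k": (0.0, None), "angle": (0.0, math.pi)},
    },
}


def check_parameter_configs(configs: dict) -> None:
    """Lightweight structural check mirroring the schema constraints."""
    for valence_type, config in configs.items():
        if valence_type not in ALLOWED_VALENCE_TYPES:
            raise ValueError(f"Unknown valence type: {valence_type}")
        if "cols" not in config:
            raise ValueError(f"'cols' is required for {valence_type}")
        for name, bounds in config.get("limits", {}).items():
            if len(bounds) != 2:
                raise ValueError(f"Limits for {name} must be a (min, max) pair")


check_parameter_configs(parameter_configs)
```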
attribute_configs
pydantic-field
#
Configuration for the force field attributes to be trained. This allows 1-4 scaling for 'vdW' and 'Electrostatics' to be trained.
learning_rate_decay
pydantic-field
#
Multiplicative learning rate decay factor. A value of 0.99 gives a 1% decay, and 1.0 gives no decay.
learning_rate_decay_step
pydantic-field
#
Learning rate decay step: the interval at which the decay factor is applied.
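One plausible reading of how learning_rate, learning_rate_decay and learning_rate_decay_step combine is a step schedule, where the rate is multiplied by the decay factor once every learning_rate_decay_step epochs. The source does not spell out the schedule, so treat the sketch below as an assumption; the defaults are those listed in the JSON schema.

```python
def learning_rate_at(
    epoch: int,
    learning_rate: float = 0.01,
    decay: float = 1.0,
    decay_step: int = 10,
) -> float:
    """Learning rate at a given epoch under an assumed step-decay schedule."""
    # Integer division counts how many decay steps have been completed.
    return learning_rate * decay ** (epoch // decay_step)
```

With the default decay of 1.0 the rate stays constant for the whole fit.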
regularisation_target
pydantic-field
#
Target value to regularise parameters towards. 'initial' is the initial parameter value, 'zero' is zero.
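The difference between the two targets can be sketched with a simple penalty term. A quadratic penalty is assumed here purely for illustration, since the source does not state the functional form, and the function name is hypothetical.

```python
def regularisation_penalty(
    params: dict[str, float],
    initial: dict[str, float],
    strengths: dict[str, float],
    target: str = "initial",
) -> float:
    """Sum of strength * (value - reference)^2 over the regularised parameters."""
    if target not in ("initial", "zero"):
        raise ValueError(f"Unknown regularisation target: {target}")
    penalty = 0.0
    # Parameters without a listed strength are not regularised.
    for name, strength in strengths.items():
        reference = initial[name] if target == "initial" else 0.0
        penalty += strength * (params[name] - reference) ** 2
    return penalty
```

With the 'initial' target the penalty vanishes when the parameters are unchanged, whereas the 'zero' target pulls regularised parameters towards zero.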
to_yaml
#
OutlierFilterSettings
pydantic-model
#
Bases: _DefaultSettings
Settings for filtering outliers from datasets based on MM vs MLP differences.
Outliers are identified by comparing MM and reference (typically MLP) energies and forces. Conformations where the absolute difference exceeds a threshold are removed.
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for filtering outliers from datasets based on MM vs MLP differences.\n\nOutliers are identified by comparing MM and reference (typically MLP) energies\nand forces. Conformations where the absolute difference exceeds a threshold\nare removed.",
"properties": {
"energy_outlier_threshold": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"default": 2.0,
"description": "Absolute threshold in kcal/mol/atom for energy outlier detection. Conformations where |energy_mm - energy_ref| / n_atoms (relative to minimum) exceeds this threshold will be removed. Set to None to disable energy-based filtering.",
"title": "Energy Outlier Threshold"
},
"force_outlier_threshold": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"default": 500.0,
"description": "Absolute threshold in kcal/mol/\u00c5 for force outlier detection. Conformations where max |force_mm - force_ref| exceeds this threshold will be removed. Set to None to disable force-based filtering.",
"title": "Force Outlier Threshold"
},
"min_conformations": {
"default": 1,
"description": "Minimum number of conformations to keep per molecule. If filtering would remove too many conformations, all conformations will be kept for that molecule.",
"title": "Min Conformations",
"type": "integer"
}
},
"title": "OutlierFilterSettings",
"type": "object"
}
Fields:
-
energy_outlier_threshold(float | None) -
force_outlier_threshold(float | None) -
min_conformations(int)
energy_outlier_threshold
pydantic-field
#
Absolute threshold in kcal/mol/atom for energy outlier detection. Conformations where |energy_mm - energy_ref| / n_atoms (relative to minimum) exceeds this threshold will be removed. Set to None to disable energy-based filtering.
force_outlier_threshold
pydantic-field
#
Absolute threshold in kcal/mol/Å for force outlier detection. Conformations where max |force_mm - force_ref| exceeds this threshold will be removed. Set to None to disable force-based filtering.
min_conformations
pydantic-field
#
Minimum number of conformations to keep per molecule. If filtering would remove too many conformations, all conformations will be kept for that molecule.
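The three fields can be read together as the filtering procedure sketched below for a single molecule. Two details are assumptions made for this example: each energy set is shifted by its own minimum before comparison, and the fallback keeps everything when fewer than min_conformations would survive. The function name and the use of plain lists are also hypothetical.

```python
from typing import Optional


def filter_outliers(
    energies_mm: list[float],
    energies_ref: list[float],
    max_force_diffs: list[float],
    n_atoms: int,
    energy_threshold: Optional[float] = 2.0,
    force_threshold: Optional[float] = 500.0,
    min_conformations: int = 1,
) -> list[int]:
    """Return the indices of the conformations to keep."""
    mm_min, ref_min = min(energies_mm), min(energies_ref)
    keep = []
    for i, (e_mm, e_ref, f_diff) in enumerate(
        zip(energies_mm, energies_ref, max_force_diffs)
    ):
        # Per-atom difference between relative MM and reference energies.
        e_diff = abs((e_mm - mm_min) - (e_ref - ref_min)) / n_atoms
        if energy_threshold is not None and e_diff > energy_threshold:
            continue
        # f_diff is the largest |force_mm - force_ref| for this conformation.
        if force_threshold is not None and f_diff > force_threshold:
            continue
        keep.append(i)
    # Fall back to keeping everything if too few conformations survive.
    if len(keep) < min_conformations:
        return list(range(len(energies_mm)))
    return keep
```

Passing None for either threshold disables that filter, matching the field descriptions above.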
output_types
property
#
output_types: set[OutputType]
Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.
to_yaml
#
TypeGenerationSettings
pydantic-model
#
Bases: _DefaultSettings
Settings for generating tagged SMARTS types for a given potential type.
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for generating tagged SMARTS types for a given potential type.",
"properties": {
"max_extend_distance": {
"default": -1,
"description": "Maximum number of bonds to extend from the atoms to which the potential is applied when generating tagged SMARTS patterns. A value of -1 means no limit.",
"title": "Max Extend Distance",
"type": "integer"
},
"include": {
"default": [],
"description": "List of SMARTS present in the initial force field for which to generate new SMARTS patterns. This allows you to split specific types for reparameterisation. This is mutually exclusive with the exclude field.",
"items": {
"type": "string"
},
"title": "Include",
"type": "array"
},
"exclude": {
"default": [],
"description": "List of SMARTS patterns to exclude when generating tagged SMARTS types. If present, these patterns will remain the same as in the initial force field. This is mutually exclusive with the include field.",
"items": {
"type": "string"
},
"title": "Exclude",
"type": "array"
}
},
"title": "TypeGenerationSettings",
"type": "object"
}
Fields:
-
max_extend_distance(int) -
include(list[str]) -
exclude(list[str])
Validators:
-
validate_include_exclude
max_extend_distance
pydantic-field
#
Maximum number of bonds to extend from the atoms to which the potential is applied when generating tagged SMARTS patterns. A value of -1 means no limit.
include
pydantic-field
#
List of SMARTS present in the initial force field for which to generate new SMARTS patterns. This allows you to split specific types for reparameterisation. This is mutually exclusive with the exclude field.
exclude
pydantic-field
#
List of SMARTS patterns to exclude when generating tagged SMARTS types. If present, these patterns will remain the same as in the initial force field. This is mutually exclusive with the include field.
output_types
property
#
output_types: set[OutputType]
Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.
validate_include_exclude
pydantic-validator
#
Ensure that only one of include or exclude is set.
Source code in presto/settings.py
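The mutual-exclusivity rule can be sketched with a plain dataclass. This stands in for the pydantic model only for the purpose of illustration; the class name and error message are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TypeGenerationSketch:
    """Simplified stand-in for TypeGenerationSettings (not the real model)."""

    max_extend_distance: int = -1
    include: list[str] = field(default_factory=list)
    exclude: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Both lists default to empty; setting both at once is rejected.
        if self.include and self.exclude:
            raise ValueError("Only one of 'include' or 'exclude' may be set")
```

Leaving both lists empty is allowed, which corresponds to the defaults shown in the JSON schema.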
to_yaml
#
MSMSettings
pydantic-model
#
Bases: _DefaultSettings
Settings for the modified Seminario method.
Show JSON schema:
{
"additionalProperties": false,
"description": "Settings for the modified Seminario method.",
"properties": {
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating the Hessian matrix",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"finite_step": {
"description": "Finite step to calculate Hessian (Angstrom)",
"title": "Finite Step",
"type": "string"
},
"tolerance": {
"description": "Tolerance for the geometry optimizer",
"title": "Tolerance",
"type": "string"
},
"vib_scaling": {
"default": 0.958,
"description": "Vibrational scaling factor. This is a reasonable default for \u03c9B97M-V/def2-TZVPPD (AceFF-2.0 LOT), see https://doi-org.libproxy.ncl.ac.uk/10.1063/5.0152838",
"title": "Vib Scaling",
"type": "number"
},
"n_conformers": {
"default": 1,
"description": "Number of conformers to generate and calculate MSM parameters for. The resulting bond and angle parameters will be averaged over all conformers.",
"title": "N Conformers",
"type": "integer"
}
},
"title": "MSMSettings",
"type": "object"
}
Fields:
-
ml_potential(Literal[AvailableModels]) -
finite_step(OpenMMQuantity[nanometers]) -
tolerance(OpenMMQuantity[kilocalories_per_mole / angstrom]) -
vib_scaling(float) -
n_conformers(int)
ml_potential
pydantic-field
#
ml_potential: Literal[AvailableModels] = 'aceff-2.0'
The machine learning potential to use for calculating the Hessian matrix
finite_step
pydantic-field
#
Finite step to calculate Hessian (Angstrom)
tolerance
pydantic-field
#
tolerance: OpenMMQuantity[
kilocalories_per_mole / angstrom
] = (0.005291772 * kilocalories_per_mole / angstrom)
Tolerance for the geometry optimizer
vib_scaling
pydantic-field
#
Vibrational scaling factor. The default is reasonable for ωB97M-V/def2-TZVPPD (the AceFF-2.0 level of theory), see https://doi.org/10.1063/5.0152838
n_conformers
pydantic-field
#
Number of conformers to generate and calculate MSM parameters for. The resulting bond and angle parameters will be averaged over all conformers.
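The averaging step might look like the sketch below, which takes one parameter dictionary per conformer and returns the arithmetic mean of each entry. The arithmetic mean, the dictionary layout and the function name are assumptions for illustration; the source only states that bond and angle parameters are averaged over conformers.

```python
def average_parameters(per_conformer: list[dict[str, float]]) -> dict[str, float]:
    """Arithmetic mean of each parameter across conformers."""
    if not per_conformer:
        raise ValueError("At least one conformer is required")
    n = len(per_conformer)
    # Every conformer is assumed to provide the same set of parameter names.
    return {
        name: sum(conf[name] for conf in per_conformer) / n
        for name in per_conformer[0]
    }
```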
output_types
property
#
output_types: set[OutputType]
Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.
to_yaml
#
ParameterisationSettings
pydantic-model
#
Bases: _DefaultSettings
Settings for the starting parameterisation.
Show JSON schema:
{
"$defs": {
"MSMSettings": {
"additionalProperties": false,
"description": "Settings for the modified Seminario method.",
"properties": {
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating the Hessian matrix",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"finite_step": {
"description": "Finite step to calculate Hessian (Angstrom)",
"title": "Finite Step",
"type": "string"
},
"tolerance": {
"description": "Tolerance for the geometry optimizer",
"title": "Tolerance",
"type": "string"
},
"vib_scaling": {
"default": 0.958,
"description": "Vibrational scaling factor. This is a reasonable default for \u03c9B97M-V/def2-TZVPPD (AceFF-2.0 LOT), see https://doi-org.libproxy.ncl.ac.uk/10.1063/5.0152838",
"title": "Vib Scaling",
"type": "number"
},
"n_conformers": {
"default": 1,
"description": "Number of conformers to generate and calculate MSM parameters for. The resulting bond and angle parameters will be averaged over all conformers.",
"title": "N Conformers",
"type": "integer"
}
},
"title": "MSMSettings",
"type": "object"
},
"TypeGenerationSettings": {
"additionalProperties": false,
"description": "Settings for generating tagged SMARTS types for a given potential type.",
"properties": {
"max_extend_distance": {
"default": -1,
"description": "Maximum number of bonds to extend from the atoms to which the potential is applied when generating tagged SMARTS patterns. A value of -1 means no limit.",
"title": "Max Extend Distance",
"type": "integer"
},
"include": {
"default": [],
"description": "List of SMARTS present in the initial force field for which to generate new SMARTS patterns. This allows you to split specific types for reparameterisation. This is mutually exclusive with the exclude field.",
"items": {
"type": "string"
},
"title": "Include",
"type": "array"
},
"exclude": {
"default": [],
"description": "List of SMARTS patterns to exclude when generating tagged SMARTS types. If present, these patterns will remain the same as in the initial force field. This is mutually exclusive with the include field.",
"items": {
"type": "string"
},
"title": "Exclude",
"type": "array"
}
},
"title": "TypeGenerationSettings",
"type": "object"
}
},
"additionalProperties": false,
"description": "Settings for the starting parameterisation.",
"properties": {
"smiles": {
"description": "SMILES string or list of SMILES for molecules to fit",
"items": {
"type": "string"
},
"title": "Smiles",
"type": "array"
},
"initial_force_field": {
"default": "openff_unconstrained-2.3.0.offxml",
"description": "The force field from which to start. This can be any OpenFF force field, or your own .offxml file.",
"title": "Initial Force Field",
"type": "string"
},
"expand_torsions": {
"default": true,
"description": "Whether to expand the torsion periodicities up to 4.",
"title": "Expand Torsions",
"type": "boolean"
},
"linearise_harmonics": {
"default": true,
"description": "Linearise the harmonic potentials in the Force Field (Default)",
"title": "Linearise Harmonics",
"type": "boolean"
},
"msm_settings": {
"anyOf": [
{
"$ref": "#/$defs/MSMSettings"
},
{
"type": "null"
}
],
"description": "Settings for the modified Seminario method to initialise force field parameters."
},
"type_generation_settings": {
"additionalProperties": {
"$ref": "#/$defs/TypeGenerationSettings"
},
"description": "Settings for generating tagged SMARTS types for each valence type.",
"propertyNames": {
"enum": [
"Bonds",
"Angles",
"ProperTorsions",
"ImproperTorsions"
]
},
"title": "Type Generation Settings",
"type": "object"
}
},
"required": [
"smiles"
],
"title": "ParameterisationSettings",
"type": "object"
}
Fields:
-
smiles(list[str]) -
initial_force_field(str) -
expand_torsions(bool) -
linearise_harmonics(bool) -
msm_settings(MSMSettings | None) -
type_generation_settings(dict[NonLinearValenceType, TypeGenerationSettings])
Validators:
-
validate_smiles
initial_force_field
pydantic-field
#
The force field from which to start. This can be any OpenFF force field, or your own .offxml file.
expand_torsions
pydantic-field
#
Whether to expand the torsion periodicities up to 4.
linearise_harmonics
pydantic-field
#
Whether to linearise the harmonic potentials in the force field. Enabled by default.
msm_settings
pydantic-field
#
msm_settings: MSMSettings | None
Settings for the modified Seminario method to initialise force field parameters.
type_generation_settings
pydantic-field
#
type_generation_settings: dict[
NonLinearValenceType, TypeGenerationSettings
]
Settings for generating tagged SMARTS types for each valence type.
molecules
property
#
Return the list of OpenFF Molecule objects for the SMILES strings.
output_types
property
#
output_types: set[OutputType]
Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.
validate_smiles
pydantic-validator
#
Validate that all SMILES are valid and unique. Accepts a single string or a list of strings.
Source code in presto/settings.py
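The string-or-list handling and the uniqueness check can be sketched as follows. Chemical validity is deliberately left out because it needs a cheminformatics toolkit, and uniqueness is tested on the raw strings only, so two different spellings of the same molecule would not be caught here. The function name is hypothetical.

```python
from typing import Union


def normalise_smiles(value: Union[str, list[str]]) -> list[str]:
    """Return a list of SMILES, rejecting exact duplicate strings."""
    smiles = [value] if isinstance(value, str) else list(value)
    duplicates = sorted({s for s in smiles if smiles.count(s) > 1})
    if duplicates:
        raise ValueError(f"Duplicate SMILES: {duplicates}")
    return smiles
```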
to_yaml
#
WorkflowSettings
pydantic-model
#
Bases: _DefaultSettings
Overall settings for the full fitting workflow.
Show JSON schema:
{
"$defs": {
"AttributeConfig": {
"description": "Configuration for how a potential's attributes should be trained.",
"properties": {
"cols": {
"description": "The parameters to train, e.g. 'k', 'length', 'epsilon'.",
"items": {
"type": "string"
},
"title": "Cols",
"type": "array"
},
"scales": {
"additionalProperties": {
"type": "number"
},
"default": {},
"description": "The scales to apply to each parameter, e.g. 'k': 1.0, 'length': 1.0, 'epsilon': 1.0.",
"title": "Scales",
"type": "object"
},
"limits": {
"additionalProperties": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
]
},
{
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
]
}
],
"type": "array"
},
"default": {},
"description": "The min and max values to clamp each parameter within, e.g. 'k': (0.0, None), 'angle': (0.0, pi), 'epsilon': (0.0, None), where none indicates no constraint.",
"title": "Limits",
"type": "object"
},
"regularize": {
"additionalProperties": {
"type": "number"
},
"default": {},
"description": "The regularization strength to apply to each parameter, e.g. 'k': 0.01, 'epsilon': 0.001. Parameters not listed are not regularized.",
"title": "Regularize",
"type": "object"
}
},
"required": [
"cols"
],
"title": "AttributeConfig",
"type": "object"
},
"MLMDSamplingSettings": {
"additionalProperties": false,
"description": "Settings for molecular dynamics sampling using a machine learning\npotential. This protocol uses the ML reference potential for sampling as\nwell as for energy and force calculations.",
"properties": {
"sampling_protocol": {
"const": "ml_md",
"default": "ml_md",
"description": "Sampling protocol to use.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
}
},
"title": "MLMDSamplingSettings",
"type": "object"
},
"MMMDMetadynamicsSamplingSettings": {
"additionalProperties": false,
"description": "Settings for molecular dynamics sampling using a molecular mechanics\nforce field with metadynamics. This is initially the force field supplied in the parameterisation\nsettings, but is updated as the bespoke force field is trained.",
"properties": {
"sampling_protocol": {
"const": "mm_md_metadynamics",
"default": "mm_md_metadynamics",
"description": "Sampling protocol to use.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
},
"bias_width": {
"default": 0.3141592653589793,
"description": "Width of the bias (in radians)",
"title": "Bias Width",
"type": "number"
},
"bias_factor": {
"default": 20.0,
"description": "Bias factor for well-tempered metadynamics. Typical range: 5-20",
"title": "Bias Factor",
"type": "number"
},
"bias_height": {
"description": "Initial height of the bias",
"title": "Bias Height",
"type": "string"
},
"bias_frequency": {
"description": "Frequency at which to add bias",
"title": "Bias Frequency",
"type": "string"
},
"bias_save_frequency": {
"description": "Frequency at which to save the bias",
"title": "Bias Save Frequency",
"type": "string"
},
"torsions_to_include_smarts": {
"description": "SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.",
"items": {
"type": "string"
},
"title": "Torsions To Include Smarts",
"type": "array"
},
"torsions_to_exclude_smarts": {
"description": "SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.",
"items": {
"type": "string"
},
"title": "Torsions To Exclude Smarts",
"type": "array"
}
},
"title": "MMMDMetadynamicsSamplingSettings",
"type": "object"
},
"MMMDMetadynamicsTorsionMinimisationSamplingSettings": {
"additionalProperties": false,
"description": "Settings for MM MD metadynamics sampling with additional torsion-restrained\nminimisation structures. This extends MMMDMetadynamicsSamplingSettings by generating\nadditional training data from torsion-restrained minimisations.",
"properties": {
"sampling_protocol": {
"const": "mm_md_metadynamics_torsion_minimisation",
"default": "mm_md_metadynamics_torsion_minimisation",
"description": "Sampling protocol to use.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
},
"bias_width": {
"default": 0.3141592653589793,
"description": "Width of the bias (in radians)",
"title": "Bias Width",
"type": "number"
},
"bias_factor": {
"default": 20.0,
"description": "Bias factor for well-tempered metadynamics. Typical range: 5-20",
"title": "Bias Factor",
"type": "number"
},
"bias_height": {
"description": "Initial height of the bias",
"title": "Bias Height",
"type": "string"
},
"bias_frequency": {
"description": "Frequency at which to add bias",
"title": "Bias Frequency",
"type": "string"
},
"bias_save_frequency": {
"description": "Frequency at which to save the bias",
"title": "Bias Save Frequency",
"type": "string"
},
"torsions_to_include_smarts": {
"description": "SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.",
"items": {
"type": "string"
},
"title": "Torsions To Include Smarts",
"type": "array"
},
"torsions_to_exclude_smarts": {
"description": "SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.",
"items": {
"type": "string"
},
"title": "Torsions To Exclude Smarts",
"type": "array"
},
"ml_minimisation_steps": {
"default": 10,
"description": "Number of MLP minimisation steps with restrained torsions.",
"title": "Ml Minimisation Steps",
"type": "integer"
},
"mm_minimisation_steps": {
"default": 10,
"description": "Number of MM minimisation steps with restrained torsions.",
"title": "Mm Minimisation Steps",
"type": "integer"
},
"torsion_restraint_force_constant": {
"description": "Force constant for torsion restraints.",
"title": "Torsion Restraint Force Constant",
"type": "string"
},
"map_ml_coords_energy_to_mm_coords_energy": {
"default": false,
"description": "Whether to substitute the MLP energy for the MM-minimised coordinates with the MLP energy for the corresponding MLP-minimised coordinates.",
"title": "Map Ml Coords Energy To Mm Coords Energy",
"type": "boolean"
},
"loss_energy_weight_mm_torsion_min": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.",
"title": "Loss Energy Weight Mm Torsion Min",
"type": "number"
},
"loss_force_weight_mm_torsion_min": {
"default": 0.1,
"description": "Scaling factor for the force loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.",
"title": "Loss Force Weight Mm Torsion Min",
"type": "number"
},
"loss_energy_weight_ml_torsion_min": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.",
"title": "Loss Energy Weight Ml Torsion Min",
"type": "number"
},
"loss_force_weight_ml_torsion_min": {
"default": 0.1,
"description": "Scaling factor for the force loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.",
"title": "Loss Force Weight Ml Torsion Min",
"type": "number"
}
},
"title": "MMMDMetadynamicsTorsionMinimisationSamplingSettings",
"type": "object"
},
"MMMDSamplingSettings": {
"additionalProperties": false,
"description": "Settings for molecular dynamics sampling using a molecular mechanics\nforce field. This is initially the force field supplied in the parameterisation\nsettings, but is updated as the bespoke force field is trained.",
"properties": {
"sampling_protocol": {
"const": "mm_md",
"default": "mm_md",
"description": "Sampling protocol to use.",
"title": "Sampling Protocol",
"type": "string"
},
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"timestep": {
"description": "MD timestep",
"title": "Timestep",
"type": "string"
},
"temperature": {
"description": "Temperature to run MD at",
"title": "Temperature",
"type": "string"
},
"snapshot_interval": {
"description": "Interval between saving snapshots during production sampling",
"title": "Snapshot Interval",
"type": "string"
},
"n_conformers": {
"default": 10,
"description": "The number of conformers to generate, from which sampling is started",
"title": "N Conformers",
"type": "integer"
},
"equilibration_sampling_time_per_conformer": {
"description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
"title": "Equilibration Sampling Time Per Conformer",
"type": "string"
},
"production_sampling_time_per_conformer": {
"description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
"title": "Production Sampling Time Per Conformer",
"type": "string"
},
"loss_energy_weight": {
"default": 1000.0,
"description": "Scaling factor for the energy loss term for samples from this protocol.",
"title": "Loss Energy Weight",
"type": "number"
},
"loss_force_weight": {
"default": 0.1,
"description": "Scaling factor for the force loss term for samples from this protocol.",
"title": "Loss Force Weight",
"type": "number"
}
},
"title": "MMMDSamplingSettings",
"type": "object"
},
"MSMSettings": {
"additionalProperties": false,
"description": "Settings for the modified Seminario method.",
"properties": {
"ml_potential": {
"default": "aceff-2.0",
"description": "The machine learning potential to use for calculating the Hessian matrix",
"enum": [
"aceff-2.0",
"mace-off23-small",
"mace-off23-medium",
"mace-off23-large",
"egret-1",
"aimnet2_b973c_d3_ens",
"aimnet2_wb97m_d3_ens"
],
"title": "Ml Potential",
"type": "string"
},
"finite_step": {
"description": "Finite step to calculate Hessian (Angstrom)",
"title": "Finite Step",
"type": "string"
},
"tolerance": {
"description": "Tolerance for the geometry optimizer",
"title": "Tolerance",
"type": "string"
},
"vib_scaling": {
"default": 0.958,
"description": "Vibrational scaling factor. This is a reasonable default for \u03c9B97M-V/def2-TZVPPD (AceFF-2.0 LOT), see https://doi.org/10.1063/5.0152838",
"title": "Vib Scaling",
"type": "number"
},
"n_conformers": {
"default": 1,
"description": "Number of conformers to generate and calculate MSM parameters for. The resulting bond and angle parameters will be averaged over all conformers.",
"title": "N Conformers",
"type": "integer"
}
},
"title": "MSMSettings",
"type": "object"
},
"OutlierFilterSettings": {
"additionalProperties": false,
"description": "Settings for filtering outliers from datasets based on MM vs MLP differences.\n\nOutliers are identified by comparing MM and reference (typically MLP) energies\nand forces. Conformations where the absolute difference exceeds a threshold\nare removed.",
"properties": {
"energy_outlier_threshold": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"default": 2.0,
"description": "Absolute threshold in kcal/mol/atom for energy outlier detection. Conformations where |energy_mm - energy_ref| / n_atoms (relative to minimum) exceeds this threshold will be removed. Set to None to disable energy-based filtering.",
"title": "Energy Outlier Threshold"
},
"force_outlier_threshold": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"default": 500.0,
"description": "Absolute threshold in kcal/mol/\u00c5 for force outlier detection. Conformations where max |force_mm - force_ref| exceeds this threshold will be removed. Set to None to disable force-based filtering.",
"title": "Force Outlier Threshold"
},
"min_conformations": {
"default": 1,
"description": "Minimum number of conformations to keep per molecule. If filtering would remove too many conformations, all conformations will be kept for that molecule.",
"title": "Min Conformations",
"type": "integer"
}
},
"title": "OutlierFilterSettings",
"type": "object"
},
"ParameterConfig": {
"description": "Configuration for how a potential's parameters should be trained.",
"properties": {
"cols": {
"description": "The parameters to train, e.g. 'k', 'length', 'epsilon'.",
"items": {
"type": "string"
},
"title": "Cols",
"type": "array"
},
"scales": {
"additionalProperties": {
"type": "number"
},
"default": {},
"description": "The scales to apply to each parameter, e.g. 'k': 1.0, 'length': 1.0, 'epsilon': 1.0.",
"title": "Scales",
"type": "object"
},
"limits": {
"additionalProperties": {
"maxItems": 2,
"minItems": 2,
"prefixItems": [
{
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
]
},
{
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
]
}
],
"type": "array"
},
"default": {},
"description": "The min and max values to clamp each parameter within, e.g. 'k': (0.0, None), 'angle': (0.0, pi), 'epsilon': (0.0, None), where None indicates no constraint.",
"title": "Limits",
"type": "object"
},
"regularize": {
"additionalProperties": {
"type": "number"
},
"default": {},
"description": "The regularization strength to apply to each parameter, e.g. 'k': 0.01, 'epsilon': 0.001. Parameters not listed are not regularized.",
"title": "Regularize",
"type": "object"
},
"include": {
"anyOf": [
{
"items": {
"$ref": "#/$defs/_PotentialKey"
},
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "The keys (see ``smee.TensorPotential.parameter_keys`` for details) corresponding to specific parameters to be trained. If ``None``, all parameters will be trained.",
"title": "Include"
},
"exclude": {
"anyOf": [
{
"items": {
"$ref": "#/$defs/_PotentialKey"
},
"type": "array"
},
{
"type": "null"
}
],
"default": null,
"description": "The keys (see ``smee.TensorPotential.parameter_keys`` for details) corresponding to specific parameters to be excluded from training. If ``None``, no parameters will be excluded.",
"title": "Exclude"
}
},
"required": [
"cols"
],
"title": "ParameterConfig",
"type": "object"
},
"ParameterisationSettings": {
"additionalProperties": false,
"description": "Settings for the starting parameterisation.",
"properties": {
"smiles": {
"description": "SMILES string or list of SMILES for molecules to fit",
"items": {
"type": "string"
},
"title": "Smiles",
"type": "array"
},
"initial_force_field": {
"default": "openff_unconstrained-2.3.0.offxml",
"description": "The force field from which to start. This can be any OpenFF force field, or your own .offxml file.",
"title": "Initial Force Field",
"type": "string"
},
"expand_torsions": {
"default": true,
"description": "Whether to expand the torsion periodicities up to 4.",
"title": "Expand Torsions",
"type": "boolean"
},
"linearise_harmonics": {
"default": true,
"description": "Whether to linearise the harmonic potentials in the force field. Enabled by default.",
"title": "Linearise Harmonics",
"type": "boolean"
},
"msm_settings": {
"anyOf": [
{
"$ref": "#/$defs/MSMSettings"
},
{
"type": "null"
}
],
"description": "Settings for the modified Seminario method to initialise force field parameters."
},
"type_generation_settings": {
"additionalProperties": {
"$ref": "#/$defs/TypeGenerationSettings"
},
"description": "Settings for generating tagged SMARTS types for each valence type.",
"propertyNames": {
"enum": [
"Bonds",
"Angles",
"ProperTorsions",
"ImproperTorsions"
]
},
"title": "Type Generation Settings",
"type": "object"
}
},
"required": [
"smiles"
],
"title": "ParameterisationSettings",
"type": "object"
},
"PreComputedDatasetSettings": {
"additionalProperties": false,
"description": "Settings for loading pre-computed datasets from disk.\n\nFor single-molecule fits, provide a single Path.\nFor multi-molecule fits, provide a list of Paths (one per molecule).",
"properties": {
"sampling_protocol": {
"const": "pre_computed",
"default": "pre_computed",
"description": "Sampling protocol identifier.",
"title": "Sampling Protocol",
"type": "string"
},
"dataset_paths": {
"description": "Path(s) to pre-computed dataset(s) saved with dataset.save_to_disk(). For single-molecule fits, provide a single Path. For multi-molecule fits, provide a list of Paths (one per molecule in order).",
"items": {
"format": "path",
"type": "string"
},
"title": "Dataset Paths",
"type": "array"
}
},
"required": [
"dataset_paths"
],
"title": "PreComputedDatasetSettings",
"type": "object"
},
"TrainingSettings": {
"additionalProperties": false,
"description": "Settings for the training process.",
"properties": {
"optimiser": {
"default": "adam",
"description": "Optimiser to use for the training. 'adam' is Adam, 'lm' is Levenberg-Marquardt",
"enum": [
"adam",
"lm"
],
"title": "Optimiser",
"type": "string"
},
"parameter_configs": {
"additionalProperties": {
"$ref": "#/$defs/ParameterConfig"
},
"description": "Configuration for the force field parameters to be trained.",
"propertyNames": {
"enum": [
"Bonds",
"LinearBonds",
"Angles",
"LinearAngles",
"ProperTorsions",
"ImproperTorsions"
]
},
"title": "Parameter Configs",
"type": "object"
},
"attribute_configs": {
"additionalProperties": {
"$ref": "#/$defs/AttributeConfig"
},
"default": {},
"description": "Configuration for the force field attributes to be trained. This allows 1-4 scaling for 'vdW' and 'Electrostatics' to be trained.",
"propertyNames": {
"enum": [
"vdW",
"Electrostatics"
]
},
"title": "Attribute Configs",
"type": "object"
},
"n_epochs": {
"default": 1000,
"description": "Number of epochs in the ML fit",
"title": "N Epochs",
"type": "integer"
},
"learning_rate": {
"default": 0.01,
"description": "Learning Rate in the ML fit",
"title": "Learning Rate",
"type": "number"
},
"learning_rate_decay": {
"default": 1.0,
"description": "Learning rate decay factor. 0.99 gives a 1% decay, and 1.0 gives no decay.",
"title": "Learning Rate Decay",
"type": "number"
},
"learning_rate_decay_step": {
"default": 10,
"description": "Learning Rate Decay Step",
"title": "Learning Rate Decay Step",
"type": "integer"
},
"regularisation_target": {
"default": "initial",
"description": "Target value to regularise parameters towards. 'initial' is the initial parameter value, 'zero' is zero.",
"enum": [
"initial",
"zero"
],
"title": "Regularisation Target",
"type": "string"
}
},
"title": "TrainingSettings",
"type": "object"
},
"TypeGenerationSettings": {
"additionalProperties": false,
"description": "Settings for generating tagged SMARTS types for a given potential type.",
"properties": {
"max_extend_distance": {
"default": -1,
"description": "Maximum number of bonds to extend from the atoms to which the potential is applied when generating tagged SMARTS patterns. A value of -1 means no limit.",
"title": "Max Extend Distance",
"type": "integer"
},
"include": {
"default": [],
"description": "List of SMARTS present in the initial force field for which to generate new SMARTS patterns. This allows you to split specific types for reparameterisation. This is mutually exclusive with the exclude field.",
"items": {
"type": "string"
},
"title": "Include",
"type": "array"
},
"exclude": {
"default": [],
"description": "List of SMARTS patterns to exclude when generating tagged SMARTS types. If present, these patterns will remain the same as in the initial force field. This is mutually exclusive with the include field.",
"items": {
"type": "string"
},
"title": "Exclude",
"type": "array"
}
},
"title": "TypeGenerationSettings",
"type": "object"
},
"_PotentialKey": {
"description": "TODO: Needed until interchange upgrades to pydantic >=2",
"properties": {
"id": {
"title": "Id",
"type": "string"
},
"mult": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"default": null,
"title": "Mult"
},
"associated_handler": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Associated Handler"
},
"bond_order": {
"anyOf": [
{
"type": "number"
},
{
"type": "null"
}
],
"default": null,
"title": "Bond Order"
}
},
"required": [
"id"
],
"title": "_PotentialKey",
"type": "object"
}
},
"additionalProperties": false,
"description": "Overall settings for the full fitting workflow.",
"properties": {
"version": {
"default": "0.1.dev1+g55bd96965",
"description": "Version of presto used to create these settings",
"title": "Version",
"type": "string"
},
"output_dir": {
"default": ".",
"description": "Directory where the output files will be saved",
"format": "path",
"title": "Output Dir",
"type": "string"
},
"device_type": {
"default": "cuda",
"description": "Device type for training, either 'cpu' or 'cuda'",
"enum": [
"cpu",
"cuda"
],
"title": "Device Type",
"type": "string"
},
"n_iterations": {
"default": 2,
"description": "Number of iterations of sampling, then training the FF to run",
"title": "N Iterations",
"type": "integer"
},
"memory": {
"default": false,
"description": "Whether to append new training data to the training data from previous iterations (True), or overwrite it (False).",
"title": "Memory",
"type": "boolean"
},
"parameterisation_settings": {
"$ref": "#/$defs/ParameterisationSettings",
"description": "Settings for the starting parameterisation"
},
"training_sampling_settings": {
"description": "Settings for sampling for generating the training data (usually molecular dynamics)",
"discriminator": {
"mapping": {
"ml_md": "#/$defs/MLMDSamplingSettings",
"mm_md": "#/$defs/MMMDSamplingSettings",
"mm_md_metadynamics": "#/$defs/MMMDMetadynamicsSamplingSettings",
"mm_md_metadynamics_torsion_minimisation": "#/$defs/MMMDMetadynamicsTorsionMinimisationSamplingSettings",
"pre_computed": "#/$defs/PreComputedDatasetSettings"
},
"propertyName": "sampling_protocol"
},
"oneOf": [
{
"$ref": "#/$defs/MMMDSamplingSettings"
},
{
"$ref": "#/$defs/MLMDSamplingSettings"
},
{
"$ref": "#/$defs/MMMDMetadynamicsSamplingSettings"
},
{
"$ref": "#/$defs/MMMDMetadynamicsTorsionMinimisationSamplingSettings"
},
{
"$ref": "#/$defs/PreComputedDatasetSettings"
}
],
"title": "Training Sampling Settings"
},
"testing_sampling_settings": {
"description": "Settings for sampling for generating the testing data (usually molecular dynamics)",
"discriminator": {
"mapping": {
"ml_md": "#/$defs/MLMDSamplingSettings",
"mm_md": "#/$defs/MMMDSamplingSettings",
"mm_md_metadynamics": "#/$defs/MMMDMetadynamicsSamplingSettings",
"mm_md_metadynamics_torsion_minimisation": "#/$defs/MMMDMetadynamicsTorsionMinimisationSamplingSettings",
"pre_computed": "#/$defs/PreComputedDatasetSettings"
},
"propertyName": "sampling_protocol"
},
"oneOf": [
{
"$ref": "#/$defs/MMMDSamplingSettings"
},
{
"$ref": "#/$defs/MLMDSamplingSettings"
},
{
"$ref": "#/$defs/MMMDMetadynamicsSamplingSettings"
},
{
"$ref": "#/$defs/MMMDMetadynamicsTorsionMinimisationSamplingSettings"
},
{
"$ref": "#/$defs/PreComputedDatasetSettings"
}
],
"title": "Testing Sampling Settings"
},
"training_settings": {
"$ref": "#/$defs/TrainingSettings",
"description": "Settings for the training process"
},
"outlier_filter_settings": {
"anyOf": [
{
"$ref": "#/$defs/OutlierFilterSettings"
},
{
"type": "null"
}
],
"description": "Settings for filtering outliers from training data. Set to None to disable outlier filtering."
}
},
"required": [
"parameterisation_settings"
],
"title": "WorkflowSettings",
"type": "object"
}
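As a rough illustration of the schema above, the following sketch builds a minimal settings mapping as a plain Python dictionary. According to the schema, only `parameterisation_settings` (and, within it, `smiles`) is marked as required. The SMILES string and the other explicit values are illustrative, and the `missing_required` helper is hypothetical; it is not part of presto and only mirrors the `required` arrays of the schema.

```python
import json

# Required keys, copied from the "required" arrays in the schema above.
REQUIRED_TOP_LEVEL = ["parameterisation_settings"]
REQUIRED_PARAMETERISATION = ["smiles"]

# Illustrative settings document; every value other than the schema
# defaults noted below is an assumption made for this example.
settings = {
    "parameterisation_settings": {
        "smiles": ["CCO"],  # illustrative molecule (ethanol)
        "initial_force_field": "openff_unconstrained-2.3.0.offxml",  # schema default
    },
    "training_sampling_settings": {
        "sampling_protocol": "mm_md",  # identifier for MMMDSamplingSettings
        "n_conformers": 10,  # schema default
    },
    "n_iterations": 2,  # schema default
    "device_type": "cpu",
}


def missing_required(doc: dict) -> list[str]:
    """Hypothetical helper: list required keys absent from a settings mapping."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in doc]
    parameterisation = doc.get("parameterisation_settings")
    if isinstance(parameterisation, dict):
        missing += [
            f"parameterisation_settings.{key}"
            for key in REQUIRED_PARAMETERISATION
            if key not in parameterisation
        ]
    return missing


print(json.dumps(settings, indent=2))
print(missing_required(settings))
```

Fields left out of the mapping would fall back to the defaults listed in the schema, where a default is given.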
Fields:
- version (str)
- output_dir (Path)
- device_type (TorchDevice)
- n_iterations (int)
- memory (bool)
- parameterisation_settings (ParameterisationSettings)
- training_sampling_settings (SamplingSettings)
- testing_sampling_settings (SamplingSettings)
- training_settings (TrainingSettings)
- outlier_filter_settings (OutlierFilterSettings | None)
output_dir
pydantic-field
#
Directory where the output files will be saved
device_type
pydantic-field
#
Device type for training, either 'cpu' or 'cuda'
n_iterations
pydantic-field
#
Number of iterations of sampling, then training the FF to run
memory
pydantic-field
#
Whether to append new training data to the training data from previous iterations (True), or overwrite it (False).
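The interaction between `n_iterations` and `memory` can be sketched as follows. This is a minimal illustration of the behaviour described above, not presto's implementation: `run_iterations` and `fake_sample` are hypothetical names, and the training step is reduced to a comment.

```python
def run_iterations(n_iterations, memory, sample):
    """Return the training-set size seen at each iteration.

    `sample` is a hypothetical callable that returns a list of
    snapshots for a given iteration index.
    """
    training_data = []
    sizes = []
    for iteration in range(n_iterations):
        new_data = sample(iteration)
        if memory:
            # Keep data from earlier iterations and add the new samples.
            training_data = training_data + new_data
        else:
            # Discard earlier data and train on the new samples only.
            training_data = new_data
        # (Training the force field on `training_data` would happen here.)
        sizes.append(len(training_data))
    return sizes


def fake_sample(iteration):
    """Stand-in sampler producing three labelled snapshots per iteration."""
    return [f"iter{iteration}_snap{i}" for i in range(3)]


print(run_iterations(3, memory=False, sample=fake_sample))  # [3, 3, 3]
print(run_iterations(3, memory=True, sample=fake_sample))   # [3, 6, 9]
```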
parameterisation_settings
pydantic-field
#
parameterisation_settings: ParameterisationSettings
Settings for the starting parameterisation
training_sampling_settings
pydantic-field
#
training_sampling_settings: SamplingSettings
Settings for sampling for generating the training data (usually molecular dynamics)
testing_sampling_settings
pydantic-field
#
testing_sampling_settings: SamplingSettings
Settings for sampling for generating the testing data (usually molecular dynamics)
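Both sampling fields accept any member of the `SamplingSettings` union, and the schema selects the concrete model through the `sampling_protocol` string. The sketch below reproduces that lookup with a plain dictionary whose entries are copied from the schema's discriminator mapping. The `settings_class_name` function is a hypothetical helper for illustration; in practice Pydantic performs this dispatch.

```python
# Protocol identifier -> settings model name, as listed in the
# "discriminator" mapping of the JSON schema.
PROTOCOL_TO_SETTINGS = {
    "ml_md": "MLMDSamplingSettings",
    "mm_md": "MMMDSamplingSettings",
    "mm_md_metadynamics": "MMMDMetadynamicsSamplingSettings",
    "mm_md_metadynamics_torsion_minimisation": (
        "MMMDMetadynamicsTorsionMinimisationSamplingSettings"
    ),
    "pre_computed": "PreComputedDatasetSettings",
}


def settings_class_name(raw: dict) -> str:
    """Hypothetical helper: pick the settings model for a raw mapping."""
    protocol = raw.get("sampling_protocol")
    if protocol not in PROTOCOL_TO_SETTINGS:
        known = ", ".join(sorted(PROTOCOL_TO_SETTINGS))
        raise ValueError(f"Unknown sampling_protocol {protocol!r}; expected one of: {known}")
    return PROTOCOL_TO_SETTINGS[protocol]


print(settings_class_name({"sampling_protocol": "mm_md_metadynamics"}))
print(settings_class_name({"sampling_protocol": "pre_computed", "dataset_paths": ["data/mol0"]}))
```

The `data/mol0` path is a placeholder; `dataset_paths` is the only field the schema marks as required for `PreComputedDatasetSettings`.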
training_settings
pydantic-field
#
training_settings: TrainingSettings
Settings for the training process
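To give a feel for how `learning_rate`, `learning_rate_decay` and `learning_rate_decay_step` might combine, the sketch below assumes a step-wise schedule in which the rate is multiplied by the decay factor once every `learning_rate_decay_step` epochs. That schedule is an assumption made for illustration; the schema only states the defaults and that 1.0 means no decay. `learning_rate_at` is a hypothetical function.

```python
def learning_rate_at(
    epoch,
    learning_rate=0.01,           # schema default
    learning_rate_decay=1.0,      # schema default (no decay)
    learning_rate_decay_step=10,  # schema default
):
    """Learning rate at a given epoch under an ASSUMED step-decay schedule."""
    n_decays = epoch // learning_rate_decay_step
    return learning_rate * learning_rate_decay**n_decays


# With the default decay of 1.0 the rate stays constant.
print(learning_rate_at(500))

# With a decay of 0.99 the rate drops by 1% every 10 epochs.
for epoch in (0, 10, 25):
    print(epoch, learning_rate_at(epoch, learning_rate_decay=0.99))
```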
outlier_filter_settings
pydantic-field
#
outlier_filter_settings: OutlierFilterSettings | None
Settings for filtering outliers from training data. Set to None to disable outlier filtering.
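The filtering rule described by `OutlierFilterSettings` can be sketched in plain Python. This is an illustrative reading of the field descriptions, not presto's code: `filter_outliers` is a hypothetical function, energies are taken relative to the minimum of each set (one interpretation of "relative to minimum"), and forces are passed as flat lists of components per conformation.

```python
def filter_outliers(
    energy_mm,
    energy_ref,
    force_mm,
    force_ref,
    n_atoms,
    energy_outlier_threshold=2.0,   # kcal/mol/atom, schema default
    force_outlier_threshold=500.0,  # kcal/mol/Å, schema default
    min_conformations=1,            # schema default
):
    """Return the indices of the conformations to keep."""
    n_confs = len(energy_mm)
    min_mm, min_ref = min(energy_mm), min(energy_ref)
    keep = []
    for i in range(n_confs):
        ok = True
        if energy_outlier_threshold is not None:
            # Per-atom difference between relative MM and reference energies.
            diff = abs((energy_mm[i] - min_mm) - (energy_ref[i] - min_ref)) / n_atoms
            ok = ok and diff <= energy_outlier_threshold
        if force_outlier_threshold is not None:
            # Largest absolute difference over all force components.
            max_diff = max(abs(a - b) for a, b in zip(force_mm[i], force_ref[i]))
            ok = ok and max_diff <= force_outlier_threshold
        if ok:
            keep.append(i)
    # If too few conformations survive, keep everything for this molecule.
    if len(keep) < min_conformations:
        return list(range(n_confs))
    return keep


# Three conformations of a three-atom molecule (made-up numbers).
e_mm, e_ref = [0.0, 1.0, 30.0], [0.0, 1.5, 2.0]
f_mm = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
f_ref = [[0.0, 0.0, 0.0], [600.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

print(filter_outliers(e_mm, e_ref, f_mm, f_ref, n_atoms=3))
print(filter_outliers(e_mm, e_ref, f_mm, f_ref, n_atoms=3, min_conformations=2))
```

In this made-up data, conformation 1 fails the force check and conformation 2 fails the energy check, so only conformation 0 survives unless `min_conformations` forces all three to be kept.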
output_types
property
#
output_types: set[OutputType]
Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.
validate_version
classmethod
#
Validate version format and check compatibility.
Source code in presto/settings.py
validate_device_type
classmethod
#
Ensure that the requested device type is available.
Source code in presto/settings.py
validate_parameterisation_training_consistency
#
Validate that linearise_harmonics argument in parameterisation settings is consistent with the valence types in the training settings.
Source code in presto/settings.py
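One plausible form of this consistency check is sketched below. It assumes that `linearise_harmonics=True` calls for the `LinearBonds` and `LinearAngles` keys in `parameter_configs`, and that `False` calls for `Bonds` and `Angles`. That pairing is an assumption inferred from the key names in the schema, and `conflicting_valence_types` is a hypothetical helper rather than the validator itself.

```python
LINEAR_TYPES = {"LinearBonds", "LinearAngles"}
HARMONIC_TYPES = {"Bonds", "Angles"}


def conflicting_valence_types(linearise_harmonics, parameter_config_keys):
    """Return the valence types that clash with `linearise_harmonics`.

    ASSUMPTION: linearised fits train the Linear* types only, and
    non-linearised fits train the plain harmonic types only.
    """
    forbidden = HARMONIC_TYPES if linearise_harmonics else LINEAR_TYPES
    return sorted(set(parameter_config_keys) & forbidden)


# Consistent: linearised harmonics with Linear* valence types.
print(conflicting_valence_types(True, ["LinearBonds", "LinearAngles", "ProperTorsions"]))

# Inconsistent: 'Bonds' is configured although harmonics are linearised.
print(conflicting_valence_types(True, ["Bonds", "LinearAngles"]))
```

A validator built on this idea would raise an error whenever the returned list is non-empty.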
get_path_manager
#
get_path_manager() -> WorkflowPathManager
Get the output paths manager for this workflow settings object.
Source code in presto/settings.py
to_yaml
#
_model_to_yaml
#
Save the settings to a YAML file