settings #

Pydantic models which control/validate the settings.

Attributes:

  • SamplingSettings

    Union type for all sampling settings.

SamplingSettings module-attribute #

Union type for all sampling settings. See the sampling_protocol field in each class for the string identifier to supply to the training_sampling_settings and testing_sampling_settings fields in WorkflowSettings.
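
As a rough illustration of how a sampling_protocol string can select a settings class, here is a standard-library sketch. The dataclasses and the load_sampling_settings helper are hypothetical stand-ins, not presto's classes (which are Pydantic models); only the identifier strings "mm_md" and "ml_md" are taken from this page.

```python
from dataclasses import dataclass


@dataclass
class MMMD:
    """Hypothetical stand-in for a settings class identified by "mm_md"."""

    sampling_protocol: str = "mm_md"
    n_conformers: int = 10


@dataclass
class MLMD:
    """Hypothetical stand-in for a settings class identified by "ml_md"."""

    sampling_protocol: str = "ml_md"
    n_conformers: int = 10


# Map each discriminator string to the class that declares it.
REGISTRY = {cls.sampling_protocol: cls for cls in (MMMD, MLMD)}


def load_sampling_settings(raw: dict):
    """Pick the class from the sampling_protocol key, then build it."""
    protocol = raw.get("sampling_protocol")
    try:
        cls = REGISTRY[protocol]
    except KeyError:
        raise ValueError(f"Unknown sampling_protocol: {protocol!r}") from None
    return cls(**raw)


settings = load_sampling_settings({"sampling_protocol": "ml_md", "n_conformers": 5})
```

The discriminator must be unique per class, otherwise the registry lookup would be ambiguous.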

_DefaultSettings pydantic-model #

Bases: BaseModel, ABC

Default configuration for all models.

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Default configuration for all models.",
  "properties": {},
  "title": "_DefaultSettings",
  "type": "object"
}

output_types property #

output_types: set[OutputType]

Return the set of output types expected from the function which implements this settings object. Subclasses should override this property.

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

_SamplingSettingsBase pydantic-model #

Bases: _DefaultSettings, ABC

Settings for sampling (usually molecular dynamics).

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for sampling (usually molecular dynamics).",
  "properties": {
    "sampling_protocol": {
      "description": "Type of sampling protocol. Each sampling settings subclass should set this to a unique value. This is used as a discriminator when loading from YAML.",
      "title": "Sampling Protocol",
      "type": "string"
    },
    "ml_potential": {
      "default": "aceff-2.0",
      "description": "The machine learning potential to use for calculating energies and forces of  the snapshots. Note that this is not generally the potential used for sampling.",
      "enum": [
        "aceff-2.0",
        "mace-off23-small",
        "mace-off23-medium",
        "mace-off23-large",
        "egret-1",
        "aimnet2_b973c_d3_ens",
        "aimnet2_wb97m_d3_ens"
      ],
      "title": "Ml Potential",
      "type": "string"
    },
    "timestep": {
      "description": "MD timestep",
      "title": "Timestep",
      "type": "string"
    },
    "temperature": {
      "description": "Temperature to run MD at",
      "title": "Temperature",
      "type": "string"
    },
    "snapshot_interval": {
      "description": "Interval between saving snapshots during production sampling",
      "title": "Snapshot Interval",
      "type": "string"
    },
    "n_conformers": {
      "default": 10,
      "description": "The number of conformers to generate, from which sampling is started",
      "title": "N Conformers",
      "type": "integer"
    },
    "equilibration_sampling_time_per_conformer": {
      "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
      "title": "Equilibration Sampling Time Per Conformer",
      "type": "string"
    },
    "production_sampling_time_per_conformer": {
      "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
      "title": "Production Sampling Time Per Conformer",
      "type": "string"
    },
    "loss_energy_weight": {
      "default": 1000.0,
      "description": "Scaling factor for the energy loss term for samples from this protocol.",
      "title": "Loss Energy Weight",
      "type": "number"
    },
    "loss_force_weight": {
      "default": 0.1,
      "description": "Scaling factor for the force loss term for samples from this protocol.",
      "title": "Loss Force Weight",
      "type": "number"
    }
  },
  "required": [
    "sampling_protocol"
  ],
  "title": "_SamplingSettingsBase",
  "type": "object"
}

sampling_protocol pydantic-field #

sampling_protocol: str

Type of sampling protocol. Each sampling settings subclass should set this to a unique value. This is used as a discriminator when loading from YAML.

ml_potential pydantic-field #

ml_potential: Literal[AvailableModels] = 'aceff-2.0'

The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.

timestep pydantic-field #

timestep: OpenMMQuantity[femtoseconds] = 1 * femtoseconds

MD timestep

temperature pydantic-field #

temperature: OpenMMQuantity[kelvin] = 500 * kelvin

Temperature to run MD at

snapshot_interval pydantic-field #

snapshot_interval: OpenMMQuantity[femtoseconds] = (
    0.5 * picoseconds
)

Interval between saving snapshots during production sampling

n_conformers pydantic-field #

n_conformers: int = 10

The number of conformers to generate, from which sampling is started

equilibration_sampling_time_per_conformer pydantic-field #

equilibration_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (0.0 * picoseconds)

Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.

production_sampling_time_per_conformer pydantic-field #

production_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (100 * picoseconds)

Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.

loss_energy_weight pydantic-field #

loss_energy_weight: float = 1000.0

Scaling factor for the energy loss term for samples from this protocol.

loss_force_weight pydantic-field #

loss_force_weight: float = 0.1

Scaling factor for the force loss term for samples from this protocol.
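
To show how the two weights might interact, the sketch below combines an energy term and a force term into one value. The mean-squared-error form and the weighted_loss function are assumptions made for illustration; this page only states that the weights scale the energy and force loss terms.

```python
def weighted_loss(energy_errors, force_errors, energy_weight=1000.0, force_weight=0.1):
    """Combine mean squared energy and force errors using the two weights.

    The mean-squared-error form is an assumption, not presto's documented loss.
    """
    energy_term = sum(e * e for e in energy_errors) / len(energy_errors)
    force_term = sum(f * f for f in force_errors) / len(force_errors)
    return energy_weight * energy_term + force_weight * force_term


# Illustrative residuals only; units and magnitudes are made up.
loss = weighted_loss([0.01, -0.02], [1.0, -2.0, 0.5])
```

With the default weights, an energy residual contributes 10,000 times more per unit of squared error than a force residual.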

validate_sampling_times pydantic-validator #

validate_sampling_times() -> Self

Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.

Source code in presto/settings.py
@model_validator(mode="after")
def validate_sampling_times(self) -> Self:
    """Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval."""
    for time, name in [
        (
            self.equilibration_sampling_time_per_conformer,
            "equilibration_sampling_time_per_conformer",
        ),
        (
            self.production_sampling_time_per_conformer,
            "production_sampling_time_per_conformer",
        ),
    ]:
        n_steps = time / self.timestep
        if not n_steps.is_integer():
            raise InvalidSettingsError(
                f"{name} ({time}) must be divisible by the timestep ({self.timestep})."
            )

    # Additionally check that production sampling time divides by snapshot interval
    n_snapshots = self.production_sampling_time_per_conformer / self.snapshot_interval
    if not n_snapshots.is_integer():
        raise InvalidSettingsError(
            f"production_sampling_time_per_conformer ({self.production_sampling_time_per_conformer}) must be divisible by the snapshot_interval ({self.snapshot_interval})."
        )

    return self
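
The divisibility rule can be checked by hand for the default values. The following standard-library sketch works in plain femtosecond numbers rather than OpenMM quantities; divides_exactly is a hypothetical helper, and the snapshot count assumes exactly one snapshot is saved per interval.

```python
from fractions import Fraction


def divides_exactly(total, step):
    """Return True if `total` is an exact integer multiple of `step`."""
    ratio = Fraction(str(total)) / Fraction(str(step))
    return ratio.denominator == 1


# Defaults from this page, all expressed in femtoseconds.
timestep_fs = 1
snapshot_interval_fs = 500  # 0.5 ps
equilibration_fs = 0  # 0.0 ps
production_fs = 100_000  # 100 ps
n_conformers = 10

defaults_ok = (
    divides_exactly(equilibration_fs, timestep_fs)
    and divides_exactly(production_fs, timestep_fs)
    and divides_exactly(production_fs, snapshot_interval_fs)
)

# Derived counts (assuming one snapshot per snapshot interval).
steps_per_conformer = (equilibration_fs + production_fs) // timestep_fs
snapshots_per_conformer = production_fs // snapshot_interval_fs
total_snapshots = n_conformers * snapshots_per_conformer

# A 0.3 ps snapshot interval does not divide 100 ps, so it would be rejected.
bad_interval_ok = divides_exactly(production_fs, 300)
```

Using exact fractions avoids the floating-point rounding that can make a ratio such as 0.3 / 0.1 appear non-integer.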

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

MMMDSamplingSettings pydantic-model #

Bases: _SamplingSettingsBase

Settings for molecular dynamics sampling using a molecular mechanics force field. This is initially the force field supplied in the parameterisation settings, but is updated as the bespoke force field is trained.

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for molecular dynamics sampling using a molecular mechanics\nforce field. This is initially the force field supplied in the parameterisation\nsettings, but is updated as the bespoke force field is trained.",
  "properties": {
    "sampling_protocol": {
      "const": "mm_md",
      "default": "mm_md",
      "description": "Sampling protocol to use.",
      "title": "Sampling Protocol",
      "type": "string"
    },
    "ml_potential": {
      "default": "aceff-2.0",
      "description": "The machine learning potential to use for calculating energies and forces of  the snapshots. Note that this is not generally the potential used for sampling.",
      "enum": [
        "aceff-2.0",
        "mace-off23-small",
        "mace-off23-medium",
        "mace-off23-large",
        "egret-1",
        "aimnet2_b973c_d3_ens",
        "aimnet2_wb97m_d3_ens"
      ],
      "title": "Ml Potential",
      "type": "string"
    },
    "timestep": {
      "description": "MD timestep",
      "title": "Timestep",
      "type": "string"
    },
    "temperature": {
      "description": "Temperature to run MD at",
      "title": "Temperature",
      "type": "string"
    },
    "snapshot_interval": {
      "description": "Interval between saving snapshots during production sampling",
      "title": "Snapshot Interval",
      "type": "string"
    },
    "n_conformers": {
      "default": 10,
      "description": "The number of conformers to generate, from which sampling is started",
      "title": "N Conformers",
      "type": "integer"
    },
    "equilibration_sampling_time_per_conformer": {
      "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
      "title": "Equilibration Sampling Time Per Conformer",
      "type": "string"
    },
    "production_sampling_time_per_conformer": {
      "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
      "title": "Production Sampling Time Per Conformer",
      "type": "string"
    },
    "loss_energy_weight": {
      "default": 1000.0,
      "description": "Scaling factor for the energy loss term for samples from this protocol.",
      "title": "Loss Energy Weight",
      "type": "number"
    },
    "loss_force_weight": {
      "default": 0.1,
      "description": "Scaling factor for the force loss term for samples from this protocol.",
      "title": "Loss Force Weight",
      "type": "number"
    }
  },
  "title": "MMMDSamplingSettings",
  "type": "object"
}

sampling_protocol pydantic-field #

sampling_protocol: Literal['mm_md'] = 'mm_md'

Sampling protocol to use.

ml_potential pydantic-field #

ml_potential: Literal[AvailableModels] = 'aceff-2.0'

The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.

timestep pydantic-field #

timestep: OpenMMQuantity[femtoseconds] = 1 * femtoseconds

MD timestep

temperature pydantic-field #

temperature: OpenMMQuantity[kelvin] = 500 * kelvin

Temperature to run MD at

snapshot_interval pydantic-field #

snapshot_interval: OpenMMQuantity[femtoseconds] = (
    0.5 * picoseconds
)

Interval between saving snapshots during production sampling

n_conformers pydantic-field #

n_conformers: int = 10

The number of conformers to generate, from which sampling is started

equilibration_sampling_time_per_conformer pydantic-field #

equilibration_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (0.0 * picoseconds)

Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.

production_sampling_time_per_conformer pydantic-field #

production_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (100 * picoseconds)

Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.

loss_energy_weight pydantic-field #

loss_energy_weight: float = 1000.0

Scaling factor for the energy loss term for samples from this protocol.

loss_force_weight pydantic-field #

loss_force_weight: float = 0.1

Scaling factor for the force loss term for samples from this protocol.

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

validate_sampling_times pydantic-validator #

validate_sampling_times() -> Self

Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.

Source code in presto/settings.py
@model_validator(mode="after")
def validate_sampling_times(self) -> Self:
    """Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval."""
    for time, name in [
        (
            self.equilibration_sampling_time_per_conformer,
            "equilibration_sampling_time_per_conformer",
        ),
        (
            self.production_sampling_time_per_conformer,
            "production_sampling_time_per_conformer",
        ),
    ]:
        n_steps = time / self.timestep
        if not n_steps.is_integer():
            raise InvalidSettingsError(
                f"{name} ({time}) must be divisible by the timestep ({self.timestep})."
            )

    # Additionally check that production sampling time divides by snapshot interval
    n_snapshots = self.production_sampling_time_per_conformer / self.snapshot_interval
    if not n_snapshots.is_integer():
        raise InvalidSettingsError(
            f"production_sampling_time_per_conformer ({self.production_sampling_time_per_conformer}) must be divisible by the snapshot_interval ({self.snapshot_interval})."
        )

    return self

MLMDSamplingSettings pydantic-model #

Bases: _SamplingSettingsBase

Settings for molecular dynamics sampling using a machine learning potential. This protocol uses the ML reference potential for sampling as well as for energy and force calculations.

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for molecular dynamics sampling using a machine learning\npotential. This protocol uses the ML reference potential for sampling as\nwell as for energy and force calculations.",
  "properties": {
    "sampling_protocol": {
      "const": "ml_md",
      "default": "ml_md",
      "description": "Sampling protocol to use.",
      "title": "Sampling Protocol",
      "type": "string"
    },
    "ml_potential": {
      "default": "aceff-2.0",
      "description": "The machine learning potential to use for calculating energies and forces of  the snapshots. Note that this is not generally the potential used for sampling.",
      "enum": [
        "aceff-2.0",
        "mace-off23-small",
        "mace-off23-medium",
        "mace-off23-large",
        "egret-1",
        "aimnet2_b973c_d3_ens",
        "aimnet2_wb97m_d3_ens"
      ],
      "title": "Ml Potential",
      "type": "string"
    },
    "timestep": {
      "description": "MD timestep",
      "title": "Timestep",
      "type": "string"
    },
    "temperature": {
      "description": "Temperature to run MD at",
      "title": "Temperature",
      "type": "string"
    },
    "snapshot_interval": {
      "description": "Interval between saving snapshots during production sampling",
      "title": "Snapshot Interval",
      "type": "string"
    },
    "n_conformers": {
      "default": 10,
      "description": "The number of conformers to generate, from which sampling is started",
      "title": "N Conformers",
      "type": "integer"
    },
    "equilibration_sampling_time_per_conformer": {
      "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
      "title": "Equilibration Sampling Time Per Conformer",
      "type": "string"
    },
    "production_sampling_time_per_conformer": {
      "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
      "title": "Production Sampling Time Per Conformer",
      "type": "string"
    },
    "loss_energy_weight": {
      "default": 1000.0,
      "description": "Scaling factor for the energy loss term for samples from this protocol.",
      "title": "Loss Energy Weight",
      "type": "number"
    },
    "loss_force_weight": {
      "default": 0.1,
      "description": "Scaling factor for the force loss term for samples from this protocol.",
      "title": "Loss Force Weight",
      "type": "number"
    }
  },
  "title": "MLMDSamplingSettings",
  "type": "object"
}

sampling_protocol pydantic-field #

sampling_protocol: Literal['ml_md'] = 'ml_md'

Sampling protocol to use.

ml_potential pydantic-field #

ml_potential: Literal[AvailableModels] = 'aceff-2.0'

The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.

timestep pydantic-field #

timestep: OpenMMQuantity[femtoseconds] = 1 * femtoseconds

MD timestep

temperature pydantic-field #

temperature: OpenMMQuantity[kelvin] = 500 * kelvin

Temperature to run MD at

snapshot_interval pydantic-field #

snapshot_interval: OpenMMQuantity[femtoseconds] = (
    0.5 * picoseconds
)

Interval between saving snapshots during production sampling

n_conformers pydantic-field #

n_conformers: int = 10

The number of conformers to generate, from which sampling is started

equilibration_sampling_time_per_conformer pydantic-field #

equilibration_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (0.0 * picoseconds)

Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.

production_sampling_time_per_conformer pydantic-field #

production_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (100 * picoseconds)

Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.

loss_energy_weight pydantic-field #

loss_energy_weight: float = 1000.0

Scaling factor for the energy loss term for samples from this protocol.

loss_force_weight pydantic-field #

loss_force_weight: float = 0.1

Scaling factor for the force loss term for samples from this protocol.

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

validate_sampling_times pydantic-validator #

validate_sampling_times() -> Self

Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.

Source code in presto/settings.py
@model_validator(mode="after")
def validate_sampling_times(self) -> Self:
    """Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval."""
    for time, name in [
        (
            self.equilibration_sampling_time_per_conformer,
            "equilibration_sampling_time_per_conformer",
        ),
        (
            self.production_sampling_time_per_conformer,
            "production_sampling_time_per_conformer",
        ),
    ]:
        n_steps = time / self.timestep
        if not n_steps.is_integer():
            raise InvalidSettingsError(
                f"{name} ({time}) must be divisible by the timestep ({self.timestep})."
            )

    # Additionally check that production sampling time divides by snapshot interval
    n_snapshots = self.production_sampling_time_per_conformer / self.snapshot_interval
    if not n_snapshots.is_integer():
        raise InvalidSettingsError(
            f"production_sampling_time_per_conformer ({self.production_sampling_time_per_conformer}) must be divisible by the snapshot_interval ({self.snapshot_interval})."
        )

    return self

MMMDMetadynamicsSamplingSettings pydantic-model #

Bases: _SamplingSettingsBase

Settings for molecular dynamics sampling using a molecular mechanics force field with metadynamics. This is initially the force field supplied in the parameterisation settings, but is updated as the bespoke force field is trained.

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for molecular dynamics sampling using a molecular mechanics\nforce field with metadynamics. This is initially the force field supplied in the parameterisation\nsettings, but is updated as the bespoke force field is trained.",
  "properties": {
    "sampling_protocol": {
      "const": "mm_md_metadynamics",
      "default": "mm_md_metadynamics",
      "description": "Sampling protocol to use.",
      "title": "Sampling Protocol",
      "type": "string"
    },
    "ml_potential": {
      "default": "aceff-2.0",
      "description": "The machine learning potential to use for calculating energies and forces of  the snapshots. Note that this is not generally the potential used for sampling.",
      "enum": [
        "aceff-2.0",
        "mace-off23-small",
        "mace-off23-medium",
        "mace-off23-large",
        "egret-1",
        "aimnet2_b973c_d3_ens",
        "aimnet2_wb97m_d3_ens"
      ],
      "title": "Ml Potential",
      "type": "string"
    },
    "timestep": {
      "description": "MD timestep",
      "title": "Timestep",
      "type": "string"
    },
    "temperature": {
      "description": "Temperature to run MD at",
      "title": "Temperature",
      "type": "string"
    },
    "snapshot_interval": {
      "description": "Interval between saving snapshots during production sampling",
      "title": "Snapshot Interval",
      "type": "string"
    },
    "n_conformers": {
      "default": 10,
      "description": "The number of conformers to generate, from which sampling is started",
      "title": "N Conformers",
      "type": "integer"
    },
    "equilibration_sampling_time_per_conformer": {
      "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
      "title": "Equilibration Sampling Time Per Conformer",
      "type": "string"
    },
    "production_sampling_time_per_conformer": {
      "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
      "title": "Production Sampling Time Per Conformer",
      "type": "string"
    },
    "loss_energy_weight": {
      "default": 1000.0,
      "description": "Scaling factor for the energy loss term for samples from this protocol.",
      "title": "Loss Energy Weight",
      "type": "number"
    },
    "loss_force_weight": {
      "default": 0.1,
      "description": "Scaling factor for the force loss term for samples from this protocol.",
      "title": "Loss Force Weight",
      "type": "number"
    },
    "bias_width": {
      "default": 0.3141592653589793,
      "description": "Width of the bias (in radians)",
      "title": "Bias Width",
      "type": "number"
    },
    "bias_factor": {
      "default": 20.0,
      "description": "Bias factor for well-tempered metadynamics. Typical range: 5-20",
      "title": "Bias Factor",
      "type": "number"
    },
    "bias_height": {
      "description": "Initial height of the bias",
      "title": "Bias Height",
      "type": "string"
    },
    "bias_frequency": {
      "description": "Frequency at which to add bias",
      "title": "Bias Frequency",
      "type": "string"
    },
    "bias_save_frequency": {
      "description": "Frequency at which to save the bias",
      "title": "Bias Save Frequency",
      "type": "string"
    },
    "torsions_to_include_smarts": {
      "description": "SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.",
      "items": {
        "type": "string"
      },
      "title": "Torsions To Include Smarts",
      "type": "array"
    },
    "torsions_to_exclude_smarts": {
      "description": "SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.",
      "items": {
        "type": "string"
      },
      "title": "Torsions To Exclude Smarts",
      "type": "array"
    }
  },
  "title": "MMMDMetadynamicsSamplingSettings",
  "type": "object"
}

sampling_protocol pydantic-field #

sampling_protocol: Literal["mm_md_metadynamics"] = (
    "mm_md_metadynamics"
)

Sampling protocol to use.

bias_width pydantic-field #

bias_width: float = pi / 10

Width of the bias (in radians)
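
For a sense of scale, the default width of pi / 10 radians is 18 degrees. The sketch below evaluates a Gaussian bias kernel on a periodic torsion angle; treating bias_width as the Gaussian standard deviation is an assumption, and periodic_gaussian is a hypothetical helper rather than part of presto.

```python
import math


def periodic_gaussian(angle, centre, width=math.pi / 10, height=1.0):
    """Gaussian kernel on a torsion angle (radians), respecting 2*pi periodicity.

    `width` is assumed to be the standard deviation of the Gaussian.
    """
    # Wrap the difference into [-pi, pi) so that -179 and +179 degrees are neighbours.
    delta = (angle - centre + math.pi) % (2.0 * math.pi) - math.pi
    return height * math.exp(-0.5 * (delta / width) ** 2)


width_in_degrees = math.degrees(math.pi / 10)
at_centre = periodic_gaussian(0.0, 0.0)
one_width_away = periodic_gaussian(math.pi / 10, 0.0)
across_boundary = periodic_gaussian(math.pi - 0.1, -math.pi + 0.1)
```

The last call shows why the wrap matters: the two angles sit on opposite sides of the +/- pi boundary but are only 0.2 radians apart.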

bias_factor pydantic-field #

bias_factor: float = 20.0

Bias factor for well-tempered metadynamics. Typical range: 5-20
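
In the standard well-tempered formulation, the bias factor sets an effective temperature offset, delta_T = T * (bias_factor - 1), and each new Gaussian is scaled down by exp(-V / (R * delta_T)), where V is the bias already accumulated at the current position. The sketch below applies that textbook formula; whether presto's backend uses exactly this form is an assumption, and well_tempered_height is a hypothetical helper.

```python
import math

MOLAR_GAS_CONSTANT = 0.0083144626  # kJ/(mol*K)


def well_tempered_height(initial_height, accumulated_bias, temperature, bias_factor):
    """Height (kJ/mol) of the next Gaussian in well-tempered metadynamics."""
    delta_t = temperature * (bias_factor - 1.0)
    return initial_height * math.exp(
        -accumulated_bias / (MOLAR_GAS_CONSTANT * delta_t)
    )


# Defaults from this page: 500 K, bias_factor 20, initial height 1 kJ/mol.
delta_t_default = 500.0 * (20.0 - 1.0)
first_height = well_tempered_height(1.0, 0.0, 500.0, 20.0)
later_height = well_tempered_height(1.0, 20.0, 500.0, 20.0)
# A smaller bias factor damps the deposited height more quickly.
damped_height = well_tempered_height(1.0, 20.0, 500.0, 5.0)
```

A larger bias factor therefore keeps depositing near-full-height Gaussians for longer, which explores higher barriers at the cost of slower convergence.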

bias_height pydantic-field #

bias_height: OpenMMQuantity[kilojoules_per_mole] = (
    1.0 * kilojoules_per_mole
)

Initial height of the bias

bias_frequency pydantic-field #

bias_frequency: OpenMMQuantity[picoseconds] = (
    0.1 * picoseconds
)

Frequency at which to add bias

bias_save_frequency pydantic-field #

bias_save_frequency: OpenMMQuantity[picoseconds] = (
    10 * picoseconds
)

Frequency at which to save the bias

torsions_to_include_smarts pydantic-field #

torsions_to_include_smarts: list[str]

SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.

torsions_to_exclude_smarts pydantic-field #

torsions_to_exclude_smarts: list[str]

SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.
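
The include and exclude patterns operate on different tuple sizes: four atoms for a torsion, two for a bond. A minimal sketch of the filtering step, using plain atom-index tuples in place of real SMARTS matches, might look like this. The filter_torsions helper and the rule of comparing each torsion's central bond against the excluded bonds are assumptions about how the removal is done.

```python
def filter_torsions(included_torsions, excluded_bonds):
    """Drop torsions whose central bond matches an excluded bond.

    Bonds are compared as unordered pairs, so (2, 3) and (3, 2) are the same bond.
    """
    excluded = {frozenset(bond) for bond in excluded_bonds}
    return [
        torsion
        for torsion in included_torsions
        if frozenset(torsion[1:3]) not in excluded
    ]


# Made-up atom indices standing in for SMARTS matches.
torsions = [(0, 1, 2, 3), (1, 2, 3, 4), (5, 4, 3, 2)]
excluded_bonds = [(3, 2)]
kept = filter_torsions(torsions, excluded_bonds)
```

Here only the torsion rotating about the 2-3 bond is removed; the torsions about bonds 1-2 and 4-3 are kept.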

ml_potential pydantic-field #

ml_potential: Literal[AvailableModels] = 'aceff-2.0'

The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.

timestep pydantic-field #

timestep: OpenMMQuantity[femtoseconds] = 1 * femtoseconds

MD timestep

temperature pydantic-field #

temperature: OpenMMQuantity[kelvin] = 500 * kelvin

Temperature to run MD at

snapshot_interval pydantic-field #

snapshot_interval: OpenMMQuantity[femtoseconds] = (
    0.5 * picoseconds
)

Interval between saving snapshots during production sampling

n_conformers pydantic-field #

n_conformers: int = 10

The number of conformers to generate, from which sampling is started

equilibration_sampling_time_per_conformer pydantic-field #

equilibration_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (0.0 * picoseconds)

Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.

production_sampling_time_per_conformer pydantic-field #

production_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (100 * picoseconds)

Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.

loss_energy_weight pydantic-field #

loss_energy_weight: float = 1000.0

Scaling factor for the energy loss term for samples from this protocol.

loss_force_weight pydantic-field #

loss_force_weight: float = 0.1

Scaling factor for the force loss term for samples from this protocol.

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

validate_sampling_times pydantic-validator #

validate_sampling_times() -> Self

Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.

Source code in presto/settings.py
@model_validator(mode="after")
def validate_sampling_times(self) -> Self:
    """Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval."""
    for time, name in [
        (
            self.equilibration_sampling_time_per_conformer,
            "equilibration_sampling_time_per_conformer",
        ),
        (
            self.production_sampling_time_per_conformer,
            "production_sampling_time_per_conformer",
        ),
    ]:
        n_steps = time / self.timestep
        if not n_steps.is_integer():
            raise InvalidSettingsError(
                f"{name} ({time}) must be divisible by the timestep ({self.timestep})."
            )

    # Additionally check that production sampling time divides by snapshot interval
    n_snapshots = self.production_sampling_time_per_conformer / self.snapshot_interval
    if not n_snapshots.is_integer():
        raise InvalidSettingsError(
            f"production_sampling_time_per_conformer ({self.production_sampling_time_per_conformer}) must be divisible by the snapshot_interval ({self.snapshot_interval})."
        )

    return self

MMMDMetadynamicsTorsionMinimisationSamplingSettings pydantic-model #

Bases: MMMDMetadynamicsSamplingSettings

Settings for MM MD metadynamics sampling with additional torsion-restrained minimisation structures. This extends MMMDMetadynamicsSamplingSettings by generating additional training data from torsion-restrained minimisations.

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for MM MD metadynamics sampling with additional torsion-restrained\nminimisation structures. This extends MMMDMetadynamicsSamplingSettings by generating\nadditional training data from torsion-restrained minimisations.",
  "properties": {
    "sampling_protocol": {
      "const": "mm_md_metadynamics_torsion_minimisation",
      "default": "mm_md_metadynamics_torsion_minimisation",
      "description": "Sampling protocol to use.",
      "title": "Sampling Protocol",
      "type": "string"
    },
    "ml_potential": {
      "default": "aceff-2.0",
      "description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
      "enum": [
        "aceff-2.0",
        "mace-off23-small",
        "mace-off23-medium",
        "mace-off23-large",
        "egret-1",
        "aimnet2_b973c_d3_ens",
        "aimnet2_wb97m_d3_ens"
      ],
      "title": "Ml Potential",
      "type": "string"
    },
    "timestep": {
      "description": "MD timestep",
      "title": "Timestep",
      "type": "string"
    },
    "temperature": {
      "description": "Temperature to run MD at",
      "title": "Temperature",
      "type": "string"
    },
    "snapshot_interval": {
      "description": "Interval between saving snapshots during production sampling",
      "title": "Snapshot Interval",
      "type": "string"
    },
    "n_conformers": {
      "default": 10,
      "description": "The number of conformers to generate, from which sampling is started",
      "title": "N Conformers",
      "type": "integer"
    },
    "equilibration_sampling_time_per_conformer": {
      "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
      "title": "Equilibration Sampling Time Per Conformer",
      "type": "string"
    },
    "production_sampling_time_per_conformer": {
      "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
      "title": "Production Sampling Time Per Conformer",
      "type": "string"
    },
    "loss_energy_weight": {
      "default": 1000.0,
      "description": "Scaling factor for the energy loss term for samples from this protocol.",
      "title": "Loss Energy Weight",
      "type": "number"
    },
    "loss_force_weight": {
      "default": 0.1,
      "description": "Scaling factor for the force loss term for samples from this protocol.",
      "title": "Loss Force Weight",
      "type": "number"
    },
    "bias_width": {
      "default": 0.3141592653589793,
      "description": "Width of the bias (in radians)",
      "title": "Bias Width",
      "type": "number"
    },
    "bias_factor": {
      "default": 20.0,
      "description": "Bias factor for well-tempered metadynamics. Typical range: 5-20",
      "title": "Bias Factor",
      "type": "number"
    },
    "bias_height": {
      "description": "Initial height of the bias",
      "title": "Bias Height",
      "type": "string"
    },
    "bias_frequency": {
      "description": "Frequency at which to add bias",
      "title": "Bias Frequency",
      "type": "string"
    },
    "bias_save_frequency": {
      "description": "Frequency at which to save the bias",
      "title": "Bias Save Frequency",
      "type": "string"
    },
    "torsions_to_include_smarts": {
      "description": "SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.",
      "items": {
        "type": "string"
      },
      "title": "Torsions To Include Smarts",
      "type": "array"
    },
    "torsions_to_exclude_smarts": {
      "description": "SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.",
      "items": {
        "type": "string"
      },
      "title": "Torsions To Exclude Smarts",
      "type": "array"
    },
    "ml_minimisation_steps": {
      "default": 10,
      "description": "Number of MLP minimisation steps with restrained torsions.",
      "title": "Ml Minimisation Steps",
      "type": "integer"
    },
    "mm_minimisation_steps": {
      "default": 10,
      "description": "Number of MM minimisation steps with restrained torsions.",
      "title": "Mm Minimisation Steps",
      "type": "integer"
    },
    "torsion_restraint_force_constant": {
      "description": "Force constant for torsion restraints.",
      "title": "Torsion Restraint Force Constant",
      "type": "string"
    },
    "map_ml_coords_energy_to_mm_coords_energy": {
      "default": false,
      "description": "Whether to substitute the MLP energy for the MM-minimised coordinates with the MLP energy for the corresponding MLP-minimised coordinates.",
      "title": "Map Ml Coords Energy To Mm Coords Energy",
      "type": "boolean"
    },
    "loss_energy_weight_mm_torsion_min": {
      "default": 1000.0,
      "description": "Scaling factor for the energy loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.",
      "title": "Loss Energy Weight Mm Torsion Min",
      "type": "number"
    },
    "loss_force_weight_mm_torsion_min": {
      "default": 0.1,
      "description": "Scaling factor for the force loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.",
      "title": "Loss Force Weight Mm Torsion Min",
      "type": "number"
    },
    "loss_energy_weight_ml_torsion_min": {
      "default": 1000.0,
      "description": "Scaling factor for the energy loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.",
      "title": "Loss Energy Weight Ml Torsion Min",
      "type": "number"
    },
    "loss_force_weight_ml_torsion_min": {
      "default": 0.1,
      "description": "Scaling factor for the force loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.",
      "title": "Loss Force Weight Ml Torsion Min",
      "type": "number"
    }
  },
  "title": "MMMDMetadynamicsTorsionMinimisationSamplingSettings",
  "type": "object"
}

sampling_protocol pydantic-field #

sampling_protocol: Literal[
    "mm_md_metadynamics_torsion_minimisation"
] = "mm_md_metadynamics_torsion_minimisation"

Sampling protocol to use.
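
Since `sampling_protocol` acts as the identifier that selects a settings class, the lookup can be pictured as a simple mapping from string to class. The sketch below is illustrative only: the registry `PROTOCOL_TO_CLASS_NAME` and the helper `resolve_settings_class_name` are hypothetical names, only the two identifiers documented in this section are included, and the real loading mechanism may differ.

```python
# Hypothetical registry mapping each documented identifier to its settings class name.
PROTOCOL_TO_CLASS_NAME = {
    "mm_md_metadynamics_torsion_minimisation": (
        "MMMDMetadynamicsTorsionMinimisationSamplingSettings"
    ),
    "pre_computed": "PreComputedDatasetSettings",
}


def resolve_settings_class_name(raw_settings: dict) -> str:
    """Pick the settings class name from the 'sampling_protocol' key of a raw mapping."""
    protocol = raw_settings.get("sampling_protocol")
    if protocol not in PROTOCOL_TO_CLASS_NAME:
        raise ValueError(f"Unknown sampling_protocol: {protocol!r}")
    return PROTOCOL_TO_CLASS_NAME[protocol]


# A raw mapping such as one parsed from a YAML file.
raw = {"sampling_protocol": "pre_computed", "dataset_paths": ["data/mol_0"]}
print(resolve_settings_class_name(raw))
```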

ml_minimisation_steps pydantic-field #

ml_minimisation_steps: int = 10

Number of MLP minimisation steps with restrained torsions.

mm_minimisation_steps pydantic-field #

mm_minimisation_steps: int = 10

Number of MM minimisation steps with restrained torsions.

torsion_restraint_force_constant pydantic-field #

torsion_restraint_force_constant: OpenMMQuantity[
    kilojoules_per_mole / radian**2
] = (0.0 * kilojoules_per_mole / radian**2)

Force constant for torsion restraints.

map_ml_coords_energy_to_mm_coords_energy pydantic-field #

map_ml_coords_energy_to_mm_coords_energy: bool = False

Whether to substitute the MLP energy for the MM-minimised coordinates with the MLP energy for the corresponding MLP-minimised coordinates.

loss_energy_weight_mm_torsion_min pydantic-field #

loss_energy_weight_mm_torsion_min: float = 1000.0

Scaling factor for the energy loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.

loss_force_weight_mm_torsion_min pydantic-field #

loss_force_weight_mm_torsion_min: float = 0.1

Scaling factor for the force loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.

loss_energy_weight_ml_torsion_min pydantic-field #

loss_energy_weight_ml_torsion_min: float = 1000.0

Scaling factor for the energy loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.

loss_force_weight_ml_torsion_min pydantic-field #

loss_force_weight_ml_torsion_min: float = 0.1

Scaling factor for the force loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.

ml_potential pydantic-field #

ml_potential: Literal[AvailableModels] = 'aceff-2.0'

The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.

timestep pydantic-field #

timestep: OpenMMQuantity[femtoseconds] = 1 * femtoseconds

MD timestep

temperature pydantic-field #

temperature: OpenMMQuantity[kelvin] = 500 * kelvin

Temperature to run MD at

snapshot_interval pydantic-field #

snapshot_interval: OpenMMQuantity[femtoseconds] = (
    0.5 * picoseconds
)

Interval between saving snapshots during production sampling

n_conformers pydantic-field #

n_conformers: int = 10

The number of conformers to generate, from which sampling is started

equilibration_sampling_time_per_conformer pydantic-field #

equilibration_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (0.0 * picoseconds)

Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.

production_sampling_time_per_conformer pydantic-field #

production_sampling_time_per_conformer: OpenMMQuantity[
    picoseconds
] = (100 * picoseconds)

Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.
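
As a rough worked example, the defaults listed for this class imply the following sampling budget. The arithmetic assumes one snapshot per full snapshot interval during production (whether an initial frame is also saved is not stated in the source), and uses plain floats in femtoseconds in place of the OpenMM quantities.

```python
# Default values from the field listings above, converted to femtoseconds.
timestep_fs = 1.0
snapshot_interval_fs = 500.0      # 0.5 ps
equilibration_fs = 0.0            # 0.0 ps
production_fs = 100_000.0         # 100 ps
n_conformers = 10

# MD steps per conformer cover both equilibration and production.
steps_per_conformer = round((equilibration_fs + production_fs) / timestep_fs)

# Snapshots are only saved during production (assumed one per interval).
snapshots_per_conformer = round(production_fs / snapshot_interval_fs)
total_snapshots = n_conformers * snapshots_per_conformer

print(steps_per_conformer, snapshots_per_conformer, total_snapshots)
```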

loss_energy_weight pydantic-field #

loss_energy_weight: float = 1000.0

Scaling factor for the energy loss term for samples from this protocol.

loss_force_weight pydantic-field #

loss_force_weight: float = 0.1

Scaling factor for the force loss term for samples from this protocol.

bias_width pydantic-field #

bias_width: float = pi / 10

Width of the bias (in radians)

bias_factor pydantic-field #

bias_factor: float = 20.0

Bias factor for well-tempered metadynamics. Typical range: 5-20

bias_height pydantic-field #

bias_height: OpenMMQuantity[kilojoules_per_mole] = (
    1.0 * kilojoules_per_mole
)

Initial height of the bias
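
To show how `bias_factor`, `bias_height` and `temperature` interact, the sketch below applies the standard well-tempered metadynamics rule, in which each newly added Gaussian is scaled down according to the bias already deposited at the current position. It assumes that this class follows the standard scheme with ΔT = T(γ − 1); the function name and the use of plain floats (kJ/mol and kelvin) are choices made for the example.

```python
import math

# Molar gas constant in kJ/(mol K).
MOLAR_GAS_CONSTANT = 0.0083144626


def well_tempered_height(
    initial_height: float,   # bias_height, kJ/mol
    current_bias: float,     # bias already deposited at this point, kJ/mol
    temperature: float,      # kelvin
    bias_factor: float,      # gamma; must be greater than 1
) -> float:
    """Height of the next Gaussian under standard well-tempered metadynamics."""
    delta_t = temperature * (bias_factor - 1.0)
    return initial_height * math.exp(
        -current_bias / (MOLAR_GAS_CONSTANT * delta_t)
    )


# With no bias deposited yet, the full initial height is added.
print(well_tempered_height(1.0, 0.0, 500.0, 20.0))
# Once 50 kJ/mol of bias has accumulated, new Gaussians are smaller.
print(well_tempered_height(1.0, 50.0, 500.0, 20.0))
```

A smaller bias factor makes the added height fall off faster as bias accumulates, which is why lower values give more conservative exploration.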

bias_frequency pydantic-field #

bias_frequency: OpenMMQuantity[picoseconds] = (
    0.1 * picoseconds
)

Frequency at which to add bias

bias_save_frequency pydantic-field #

bias_save_frequency: OpenMMQuantity[picoseconds] = (
    10 * picoseconds
)

Frequency at which to save the bias

torsions_to_include_smarts pydantic-field #

torsions_to_include_smarts: list[str]

SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.

torsions_to_exclude_smarts pydantic-field #

torsions_to_exclude_smarts: list[str]

SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.
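
The include and exclude patterns differ in what they match (four atoms versus two), so the exclusion step has to compare each torsion's rotatable bond with the excluded bonds. A minimal sketch of that filtering follows. It works on atom-index tuples rather than SMARTS matches, assumes the rotatable bond is the central pair of each torsion, and uses the hypothetical helper name `filter_torsions`.

```python
def filter_torsions(
    included_torsions: list[tuple[int, int, int, int]],
    excluded_bonds: list[tuple[int, int]],
) -> list[tuple[int, int, int, int]]:
    """Drop torsions whose central bond appears in excluded_bonds.

    Bonds are compared without regard to atom order.
    """
    excluded = {frozenset(bond) for bond in excluded_bonds}
    return [
        torsion
        for torsion in included_torsions
        if frozenset(torsion[1:3]) not in excluded
    ]


# Hypothetical atom indices, as if produced by SMARTS matching.
torsions = [(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5)]
excluded = [(3, 2)]  # the 2-3 bond, written in reverse order
print(filter_torsions(torsions, excluded))
```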

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

validate_sampling_times pydantic-validator #

validate_sampling_times() -> Self

Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval.

Source code in presto/settings.py
@model_validator(mode="after")
def validate_sampling_times(self) -> Self:
    """Ensure that the sampling times divide exactly by the timestep and (for production) the snapshot interval."""
    for time, name in [
        (
            self.equilibration_sampling_time_per_conformer,
            "equilibration_sampling_time_per_conformer",
        ),
        (
            self.production_sampling_time_per_conformer,
            "production_sampling_time_per_conformer",
        ),
    ]:
        n_steps = time / self.timestep
        if not n_steps.is_integer():
            raise InvalidSettingsError(
                f"{name} ({time}) must be divisible by the timestep ({self.timestep})."
            )

    # Additionally check that production sampling time divides by snapshot interval
    n_snapshots = self.production_sampling_time_per_conformer / self.snapshot_interval
    if not n_snapshots.is_integer():
        raise InvalidSettingsError(
            f"production_sampling_time_per_conformer ({self.production_sampling_time_per_conformer}) must be divisible by the snapshot_interval ({self.snapshot_interval})."
        )

    return self

PreComputedDatasetSettings pydantic-model #

Bases: _DefaultSettings

Settings for loading pre-computed datasets from disk.

For single-molecule fits, provide a single Path. For multi-molecule fits, provide a list of Paths (one per molecule).

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for loading pre-computed datasets from disk.\n\nFor single-molecule fits, provide a single Path.\nFor multi-molecule fits, provide a list of Paths (one per molecule).",
  "properties": {
    "sampling_protocol": {
      "const": "pre_computed",
      "default": "pre_computed",
      "description": "Sampling protocol identifier.",
      "title": "Sampling Protocol",
      "type": "string"
    },
    "dataset_paths": {
      "description": "Path(s) to pre-computed dataset(s) saved with dataset.save_to_disk(). For single-molecule fits, provide a single Path. For multi-molecule fits, provide a list of Paths (one per molecule in order).",
      "items": {
        "format": "path",
        "type": "string"
      },
      "title": "Dataset Paths",
      "type": "array"
    }
  },
  "required": [
    "dataset_paths"
  ],
  "title": "PreComputedDatasetSettings",
  "type": "object"
}

sampling_protocol pydantic-field #

sampling_protocol: Literal['pre_computed'] = 'pre_computed'

Sampling protocol identifier.

output_types property #

output_types: set[OutputType]

Pre-computed datasets don't produce any output files.

normalize_dataset_paths pydantic-validator #

normalize_dataset_paths(
    value: Path | list[Path],
) -> list[Path]

Normalize dataset_paths to always be a list internally.

Source code in presto/settings.py
@field_validator("dataset_paths", mode="before")
@classmethod
def normalize_dataset_paths(cls, value: Path | list[Path]) -> list[Path]:
    """Normalize dataset_paths to always be a list internally."""
    if isinstance(value, (str, Path)):
        return [Path(value)]
    return [Path(p) for p in value]
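
The effect of this validator can be seen by running the same logic as a standalone function. The sketch below copies the body shown above outside of pydantic, so it only illustrates the normalisation and not the model's full validation.

```python
from pathlib import Path


def normalize_dataset_paths(value):
    """Standalone copy of the validator logic: always return a list of Path objects."""
    # A single string or Path is wrapped in a one-element list.
    if isinstance(value, (str, Path)):
        return [Path(value)]
    # Any other iterable is converted element by element.
    return [Path(p) for p in value]


# Single-molecule fit: one path in, one-element list out.
print(normalize_dataset_paths("data/mol_0"))
# Multi-molecule fit: one path per molecule, order preserved.
print(normalize_dataset_paths(["data/mol_0", Path("data/mol_1")]))
```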

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

TrainingSettings pydantic-model #

Bases: _DefaultSettings

Settings for the training process.

Show JSON schema:
{
  "$defs": {
    "AttributeConfig": {
      "description": "Configuration for how a potential's attributes should be trained.",
      "properties": {
        "cols": {
          "description": "The parameters to train, e.g. 'k', 'length', 'epsilon'.",
          "items": {
            "type": "string"
          },
          "title": "Cols",
          "type": "array"
        },
        "scales": {
          "additionalProperties": {
            "type": "number"
          },
          "default": {},
          "description": "The scales to apply to each parameter, e.g. 'k': 1.0, 'length': 1.0, 'epsilon': 1.0.",
          "title": "Scales",
          "type": "object"
        },
        "limits": {
          "additionalProperties": {
            "maxItems": 2,
            "minItems": 2,
            "prefixItems": [
              {
                "anyOf": [
                  {
                    "type": "number"
                  },
                  {
                    "type": "null"
                  }
                ]
              },
              {
                "anyOf": [
                  {
                    "type": "number"
                  },
                  {
                    "type": "null"
                  }
                ]
              }
            ],
            "type": "array"
          },
          "default": {},
          "description": "The min and max values to clamp each parameter within, e.g. 'k': (0.0, None), 'angle': (0.0, pi), 'epsilon': (0.0, None), where none indicates no constraint.",
          "title": "Limits",
          "type": "object"
        },
        "regularize": {
          "additionalProperties": {
            "type": "number"
          },
          "default": {},
          "description": "The regularization strength to apply to each parameter, e.g. 'k': 0.01, 'epsilon': 0.001. Parameters not listed are not regularized.",
          "title": "Regularize",
          "type": "object"
        }
      },
      "required": [
        "cols"
      ],
      "title": "AttributeConfig",
      "type": "object"
    },
    "ParameterConfig": {
      "description": "Configuration for how a potential's parameters should be trained.",
      "properties": {
        "cols": {
          "description": "The parameters to train, e.g. 'k', 'length', 'epsilon'.",
          "items": {
            "type": "string"
          },
          "title": "Cols",
          "type": "array"
        },
        "scales": {
          "additionalProperties": {
            "type": "number"
          },
          "default": {},
          "description": "The scales to apply to each parameter, e.g. 'k': 1.0, 'length': 1.0, 'epsilon': 1.0.",
          "title": "Scales",
          "type": "object"
        },
        "limits": {
          "additionalProperties": {
            "maxItems": 2,
            "minItems": 2,
            "prefixItems": [
              {
                "anyOf": [
                  {
                    "type": "number"
                  },
                  {
                    "type": "null"
                  }
                ]
              },
              {
                "anyOf": [
                  {
                    "type": "number"
                  },
                  {
                    "type": "null"
                  }
                ]
              }
            ],
            "type": "array"
          },
          "default": {},
          "description": "The min and max values to clamp each parameter within, e.g. 'k': (0.0, None), 'angle': (0.0, pi), 'epsilon': (0.0, None), where none indicates no constraint.",
          "title": "Limits",
          "type": "object"
        },
        "regularize": {
          "additionalProperties": {
            "type": "number"
          },
          "default": {},
          "description": "The regularization strength to apply to each parameter, e.g. 'k': 0.01, 'epsilon': 0.001. Parameters not listed are not regularized.",
          "title": "Regularize",
          "type": "object"
        },
        "include": {
          "anyOf": [
            {
              "items": {
                "$ref": "#/$defs/_PotentialKey"
              },
              "type": "array"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "The keys (see ``smee.TensorPotential.parameter_keys`` for details) corresponding to specific parameters to be trained. If ``None``, all parameters will be trained.",
          "title": "Include"
        },
        "exclude": {
          "anyOf": [
            {
              "items": {
                "$ref": "#/$defs/_PotentialKey"
              },
              "type": "array"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "The keys (see ``smee.TensorPotential.parameter_keys`` for details) corresponding to specific parameters to be excluded from training. If ``None``, no parameters will be excluded.",
          "title": "Exclude"
        }
      },
      "required": [
        "cols"
      ],
      "title": "ParameterConfig",
      "type": "object"
    },
    "_PotentialKey": {
      "description": "TODO: Needed until interchange upgrades to pydantic >=2",
      "properties": {
        "id": {
          "title": "Id",
          "type": "string"
        },
        "mult": {
          "anyOf": [
            {
              "type": "integer"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Mult"
        },
        "associated_handler": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Associated Handler"
        },
        "bond_order": {
          "anyOf": [
            {
              "type": "number"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Bond Order"
        }
      },
      "required": [
        "id"
      ],
      "title": "_PotentialKey",
      "type": "object"
    }
  },
  "additionalProperties": false,
  "description": "Settings for the training process.",
  "properties": {
    "optimiser": {
      "default": "adam",
      "description": "Optimiser to use for the training. 'adam' is Adam, 'lm' is Levenberg-Marquardt",
      "enum": [
        "adam",
        "lm"
      ],
      "title": "Optimiser",
      "type": "string"
    },
    "parameter_configs": {
      "additionalProperties": {
        "$ref": "#/$defs/ParameterConfig"
      },
      "description": "Configuration for the force field parameters to be trained.",
      "propertyNames": {
        "enum": [
          "Bonds",
          "LinearBonds",
          "Angles",
          "LinearAngles",
          "ProperTorsions",
          "ImproperTorsions"
        ]
      },
      "title": "Parameter Configs",
      "type": "object"
    },
    "attribute_configs": {
      "additionalProperties": {
        "$ref": "#/$defs/AttributeConfig"
      },
      "default": {},
      "description": "Configuration for the force field attributes to be trained. This allows 1-4 scaling for 'vdW' and 'Electrostatics' to be trained.",
      "propertyNames": {
        "enum": [
          "vdW",
          "Electrostatics"
        ]
      },
      "title": "Attribute Configs",
      "type": "object"
    },
    "n_epochs": {
      "default": 1000,
      "description": "Number of epochs in the ML fit",
      "title": "N Epochs",
      "type": "integer"
    },
    "learning_rate": {
      "default": 0.01,
      "description": "Learning Rate in the ML fit",
      "title": "Learning Rate",
      "type": "number"
    },
    "learning_rate_decay": {
      "default": 1.0,
      "description": "Learning Rate Decay. 0.99 is 1%, and 1.0 is no decay.",
      "title": "Learning Rate Decay",
      "type": "number"
    },
    "learning_rate_decay_step": {
      "default": 10,
      "description": "Learning Rate Decay Step",
      "title": "Learning Rate Decay Step",
      "type": "integer"
    },
    "regularisation_target": {
      "default": "initial",
      "description": "Target value to regularise parameters towards. 'initial' is the initial parameter value, 'zero' is zero.",
      "enum": [
        "initial",
        "zero"
      ],
      "title": "Regularisation Target",
      "type": "string"
    }
  },
  "title": "TrainingSettings",
  "type": "object"
}

optimiser pydantic-field #

optimiser: OptimiserName = 'adam'

Optimiser to use for the training. 'adam' is Adam, 'lm' is Levenberg-Marquardt

parameter_configs pydantic-field #

parameter_configs: dict[ValenceType, ParameterConfig]

Configuration for the force field parameters to be trained.

attribute_configs pydantic-field #

attribute_configs: dict[
    AllowedAttributeType, AttributeConfig
] = {}

Configuration for the force field attributes to be trained. This allows 1-4 scaling for 'vdW' and 'Electrostatics' to be trained.

n_epochs pydantic-field #

n_epochs: int = 1000

Number of epochs in the ML fit

learning_rate pydantic-field #

learning_rate: float = 0.01

Learning Rate in the ML fit

learning_rate_decay pydantic-field #

learning_rate_decay: float = 1.0

Learning rate decay factor. A value of 0.99 corresponds to a 1% decay, and 1.0 to no decay.

learning_rate_decay_step pydantic-field #

learning_rate_decay_step: int = 10

Learning Rate Decay Step
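
One plausible reading of `learning_rate`, `learning_rate_decay` and `learning_rate_decay_step` is a step schedule, where the learning rate is multiplied by the decay factor once every `learning_rate_decay_step` epochs. The source does not confirm this, so the sketch below is an assumption about the schedule's shape rather than a description of the implementation.

```python
def learning_rate_at_epoch(
    epoch: int,
    learning_rate: float = 0.01,
    learning_rate_decay: float = 1.0,
    learning_rate_decay_step: int = 10,
) -> float:
    """Learning rate under an assumed step-decay schedule."""
    # Number of completed decay intervals so far.
    n_decays = epoch // learning_rate_decay_step
    return learning_rate * learning_rate_decay**n_decays


# With the default decay of 1.0 the learning rate never changes.
print(learning_rate_at_epoch(25))
# With a decay of 0.99, two decays have been applied by epoch 25.
print(learning_rate_at_epoch(25, learning_rate_decay=0.99))
```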

regularisation_target pydantic-field #

regularisation_target: Literal["initial", "zero"] = (
    "initial"
)

Target value to regularise parameters towards. 'initial' is the initial parameter value, 'zero' is zero.
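
The difference between the two targets can be illustrated with a simple penalty term. The sketch assumes a squared (L2) penalty weighted by the per-parameter strengths from the `regularize` setting; the source does not state the functional form, and the function name `regularisation_penalty` is hypothetical.

```python
def regularisation_penalty(
    values: list[float],
    initial_values: list[float],
    strengths: list[float],
    regularisation_target: str = "initial",
) -> float:
    """Sum of strength * (value - target)**2 under an assumed L2 penalty."""
    if regularisation_target == "initial":
        targets = initial_values
    elif regularisation_target == "zero":
        targets = [0.0] * len(values)
    else:
        raise ValueError(f"Unknown target: {regularisation_target!r}")
    return sum(
        strength * (value - target) ** 2
        for value, target, strength in zip(values, targets, strengths)
    )


# Parameters that have not moved incur no penalty towards 'initial'...
print(regularisation_penalty([1.0, 2.0], [1.0, 2.0], [0.01, 0.01], "initial"))
# ...but are still penalised for being non-zero when the target is 'zero'.
print(regularisation_penalty([1.0, 2.0], [1.0, 2.0], [0.01, 0.01], "zero"))
```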

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

OutlierFilterSettings pydantic-model #

Bases: _DefaultSettings

Settings for filtering outliers from datasets based on MM vs MLP differences.

Outliers are identified by comparing MM and reference (typically MLP) energies and forces. Conformations where the absolute difference exceeds a threshold are removed.

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for filtering outliers from datasets based on MM vs MLP differences.\n\nOutliers are identified by comparing MM and reference (typically MLP) energies\nand forces. Conformations where the absolute difference exceeds a threshold\nare removed.",
  "properties": {
    "energy_outlier_threshold": {
      "anyOf": [
        {
          "type": "number"
        },
        {
          "type": "null"
        }
      ],
      "default": 2.0,
      "description": "Absolute threshold in kcal/mol/atom for energy outlier detection. Conformations where |energy_mm - energy_ref| / n_atoms (relative to minimum) exceeds this threshold will be removed. Set to None to disable energy-based filtering.",
      "title": "Energy Outlier Threshold"
    },
    "force_outlier_threshold": {
      "anyOf": [
        {
          "type": "number"
        },
        {
          "type": "null"
        }
      ],
      "default": 500.0,
      "description": "Absolute threshold in kcal/mol/\u00c5 for force outlier detection. Conformations where max |force_mm - force_ref| exceeds this threshold will be removed. Set to None to disable force-based filtering.",
      "title": "Force Outlier Threshold"
    },
    "min_conformations": {
      "default": 1,
      "description": "Minimum number of conformations to keep per molecule. If filtering would remove too many conformations, all conformations will be kept for that molecule.",
      "title": "Min Conformations",
      "type": "integer"
    }
  },
  "title": "OutlierFilterSettings",
  "type": "object"
}

energy_outlier_threshold pydantic-field #

energy_outlier_threshold: float | None = 2.0

Absolute threshold in kcal/mol/atom for energy outlier detection. Conformations where |energy_mm - energy_ref| / n_atoms (relative to minimum) exceeds this threshold will be removed. Set to None to disable energy-based filtering.

force_outlier_threshold pydantic-field #

force_outlier_threshold: float | None = 500.0

Absolute threshold in kcal/mol/Å for force outlier detection. Conformations where max |force_mm - force_ref| exceeds this threshold will be removed. Set to None to disable force-based filtering.

min_conformations pydantic-field #

min_conformations: int = 1

Minimum number of conformations to keep per molecule. If filtering would remove too many conformations, all conformations will be kept for that molecule.
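
The three fields above combine into a single filtering rule, sketched here in plain Python. Several details are assumptions of the example: energies are shifted so that each set's minimum is zero (one reading of "relative to minimum"), forces are given as a flat list of components per conformation, and the fallback keeps every conformation when fewer than `min_conformations` would survive. The function name `filter_outliers` is hypothetical.

```python
def filter_outliers(
    energy_mm, energy_ref, forces_mm, forces_ref, n_atoms,
    energy_outlier_threshold=2.0,    # kcal/mol/atom, or None to disable
    force_outlier_threshold=500.0,   # kcal/mol/Å, or None to disable
    min_conformations=1,
):
    """Return the indices of conformations kept after outlier filtering."""
    # Shift each energy set so that its minimum is zero (assumed convention).
    mm_rel = [e - min(energy_mm) for e in energy_mm]
    ref_rel = [e - min(energy_ref) for e in energy_ref]

    kept = []
    for i in range(len(energy_mm)):
        energy_ok = (
            energy_outlier_threshold is None
            or abs(mm_rel[i] - ref_rel[i]) / n_atoms <= energy_outlier_threshold
        )
        max_force_diff = max(
            abs(f_mm - f_ref) for f_mm, f_ref in zip(forces_mm[i], forces_ref[i])
        )
        force_ok = (
            force_outlier_threshold is None
            or max_force_diff <= force_outlier_threshold
        )
        if energy_ok and force_ok:
            kept.append(i)

    # If too few conformations survive, keep them all for this molecule.
    if len(kept) < min_conformations:
        return list(range(len(energy_mm)))
    return kept


energy_mm = [0.0, 1.0, 10.0]
energy_ref = [0.0, 1.0, 1.0]
forces_mm = [[1.0, -1.0], [1.0, -1.0], [1.0, -1.0]]
forces_ref = [[1.5, -1.0], [1.5, -1.0], [1.5, -1.0]]
# The third conformation differs by 9 kcal/mol over 2 atoms and is dropped.
print(filter_outliers(energy_mm, energy_ref, forces_mm, forces_ref, n_atoms=2))
```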

output_types property #

output_types: set[OutputType]

Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

TypeGenerationSettings pydantic-model #

Bases: _DefaultSettings

Settings for generating tagged SMARTS types for a given potential type.

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for generating tagged SMARTS types for a given potential type.",
  "properties": {
    "max_extend_distance": {
      "default": -1,
      "description": "Maximum number of bonds to extend from the atoms to which the potential is applied when generating tagged SMARTS patterns. A value of -1 means no limit.",
      "title": "Max Extend Distance",
      "type": "integer"
    },
    "include": {
      "default": [],
      "description": "List of SMARTS present in the initial force field for which to generate new SMARTS patterns. This allows you to split specific types for reparameterisation. This is mutually exclusive with the exclude field.",
      "items": {
        "type": "string"
      },
      "title": "Include",
      "type": "array"
    },
    "exclude": {
      "default": [],
      "description": "List of SMARTS patterns to exclude when generating tagged SMARTS types. If present, these patterns will remain the same as in the initial force field. This is mutually exclusive with the include field.",
      "items": {
        "type": "string"
      },
      "title": "Exclude",
      "type": "array"
    }
  },
  "title": "TypeGenerationSettings",
  "type": "object"
}

max_extend_distance pydantic-field #

max_extend_distance: int = -1

Maximum number of bonds to extend from the atoms to which the potential is applied when generating tagged SMARTS patterns. A value of -1 means no limit.

include pydantic-field #

include: list[str] = []

List of SMARTS present in the initial force field for which to generate new SMARTS patterns. This allows you to split specific types for reparameterisation. This is mutually exclusive with the exclude field.

exclude pydantic-field #

exclude: list[str] = []

List of SMARTS patterns to exclude when generating tagged SMARTS types. If present, these patterns will remain the same as in the initial force field. This is mutually exclusive with the include field.

output_types property #

output_types: set[OutputType]

Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.

validate_include_exclude pydantic-validator #

validate_include_exclude() -> Self

Ensure that only one of include or exclude is set.

Source code in presto/settings.py
@model_validator(mode="after")
def validate_include_exclude(self) -> Self:
    """Ensure that only one of include or exclude is set."""
    if self.include and self.exclude:
        raise InvalidSettingsError(
            "Only one of include or exclude can be set in TypeGenerationSettings."
        )
    return self
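
The mutual-exclusion rule above can be tried out without presto installed. The following is a minimal standard-library sketch, not presto's implementation: `TypeGenerationSketch` and `is_valid` are hypothetical names, and `ValueError` stands in for `InvalidSettingsError`.

```python
from dataclasses import dataclass, field


@dataclass
class TypeGenerationSketch:
    """Hypothetical stand-in for TypeGenerationSettings (not the presto class)."""

    max_extend_distance: int = -1
    include: list[str] = field(default_factory=list)
    exclude: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Mirrors validate_include_exclude: setting both lists is an error,
        # while leaving both empty (the default) is allowed.
        if self.include and self.exclude:
            raise ValueError("Only one of include or exclude can be set.")


def is_valid(**kwargs) -> bool:
    """Return True if the sketch accepts the given field values."""
    try:
        TypeGenerationSketch(**kwargs)
    except ValueError:
        return False
    return True
```

Because the check tests truthiness, an empty list counts as "not set", so passing `include=[]` together with a non-empty `exclude` is accepted.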

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

MSMSettings pydantic-model #

Bases: _DefaultSettings

Settings for the modified Seminario method.

Show JSON schema:
{
  "additionalProperties": false,
  "description": "Settings for the modified Seminario method.",
  "properties": {
    "ml_potential": {
      "default": "aceff-2.0",
      "description": "The machine learning potential to use for calculating the Hessian matrix",
      "enum": [
        "aceff-2.0",
        "mace-off23-small",
        "mace-off23-medium",
        "mace-off23-large",
        "egret-1",
        "aimnet2_b973c_d3_ens",
        "aimnet2_wb97m_d3_ens"
      ],
      "title": "Ml Potential",
      "type": "string"
    },
    "finite_step": {
      "description": "Finite step to calculate Hessian (Angstrom)",
      "title": "Finite Step",
      "type": "string"
    },
    "tolerance": {
      "description": "Tolerance for the geometry optimizer",
      "title": "Tolerance",
      "type": "string"
    },
    "vib_scaling": {
      "default": 0.958,
      "description": "Vibrational scaling factor. This is a reasonable default for \u03c9B97M-V/def2-TZVPPD (AceFF-2.0 LOT), see https://doi.org/10.1063/5.0152838",
      "title": "Vib Scaling",
      "type": "number"
    },
    "n_conformers": {
      "default": 1,
      "description": "Number of conformers to generate and calculate MSM parameters for. The resulting bond and angle parameters will be averaged over all conformers.",
      "title": "N Conformers",
      "type": "integer"
    }
  },
  "title": "MSMSettings",
  "type": "object"
}

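Fields:

  • ml_potential (Literal[AvailableModels])
  • finite_step (OpenMMQuantity[nanometers])
  • tolerance (OpenMMQuantity[kilocalories_per_mole / angstrom])
  • vib_scaling (float)
  • n_conformers (int)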

ml_potential pydantic-field #

ml_potential: Literal[AvailableModels] = 'aceff-2.0'

The machine learning potential to use for calculating the Hessian matrix

finite_step pydantic-field #

finite_step: OpenMMQuantity[nanometers] = (
    0.0005291772 * nanometers
)

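Finite step used to calculate the Hessian. Note that the field is typed in nanometers: the default of 0.0005291772 nm corresponds to 0.005291772 Å.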
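
Since the description mentions Angstrom while the default is declared in nanometers, a short conversion helps when comparing values. This sketch only does the arithmetic on the default shown above. The Bohr radius constant is the CODATA 2018 value, and reading the default as roughly 0.01 bohr is an observation from the numbers, not something the source states.

```python
# Unit conversion for the default finite_step (plain floats, no OpenMM needed).
NM_TO_ANGSTROM = 10.0               # 1 nm = 10 Å (exact)
BOHR_IN_ANGSTROM = 0.529177210903   # CODATA 2018 Bohr radius in Å

finite_step_nm = 0.0005291772       # default from the field definition
finite_step_angstrom = finite_step_nm * NM_TO_ANGSTROM
finite_step_bohr = finite_step_angstrom / BOHR_IN_ANGSTROM

print(f"{finite_step_angstrom:.9f} Å")
print(f"{finite_step_bohr:.4f} bohr")
```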

tolerance pydantic-field #

tolerance: OpenMMQuantity[
    kilocalories_per_mole / angstrom
] = (0.005291772 * kilocalories_per_mole / angstrom)

Tolerance for the geometry optimizer

vib_scaling pydantic-field #

vib_scaling: float = 0.958

Vibrational scaling factor. This is a reasonable default for ωB97M-V/def2-TZVPPD (the AceFF-2.0 level of theory); see https://doi.org/10.1063/5.0152838

n_conformers pydantic-field #

n_conformers: int = 1

Number of conformers to generate and calculate MSM parameters for. The resulting bond and angle parameters will be averaged over all conformers.
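
The source does not say how the averaging over conformers is performed. As a minimal sketch, assuming an unweighted arithmetic mean per parameter, it could look like the following. The function name, the parameter keys, and the numbers are all made up for illustration and are not presto's API or data.

```python
from statistics import fmean


def average_over_conformers(
    per_conformer: list[dict[str, float]],
) -> dict[str, float]:
    """Average each parameter over conformers (unweighted mean, an assumption)."""
    if not per_conformer:
        raise ValueError("need at least one conformer")
    # Every conformer is expected to provide the same parameter keys.
    keys = per_conformer[0].keys()
    return {key: fmean(conf[key] for conf in per_conformer) for key in keys}


# Made-up bond force constants for two conformers (illustrative values only).
example = [
    {"C-C k": 300.0, "C-H k": 340.0},
    {"C-C k": 310.0, "C-H k": 350.0},
]
averaged = average_over_conformers(example)
```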

output_types property #

output_types: set[OutputType]

Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

ParameterisationSettings pydantic-model #

Bases: _DefaultSettings

Settings for the starting parameterisation.

Show JSON schema:
{
  "$defs": {
    "MSMSettings": {
      "additionalProperties": false,
      "description": "Settings for the modified Seminario method.",
      "properties": {
        "ml_potential": {
          "default": "aceff-2.0",
          "description": "The machine learning potential to use for calculating the Hessian matrix",
          "enum": [
            "aceff-2.0",
            "mace-off23-small",
            "mace-off23-medium",
            "mace-off23-large",
            "egret-1",
            "aimnet2_b973c_d3_ens",
            "aimnet2_wb97m_d3_ens"
          ],
          "title": "Ml Potential",
          "type": "string"
        },
        "finite_step": {
          "description": "Finite step to calculate Hessian (Angstrom)",
          "title": "Finite Step",
          "type": "string"
        },
        "tolerance": {
          "description": "Tolerance for the geometry optimizer",
          "title": "Tolerance",
          "type": "string"
        },
        "vib_scaling": {
          "default": 0.958,
          "description": "Vibrational scaling factor. This is a reasonable default for \u03c9B97M-V/def2-TZVPPD (AceFF-2.0 LOT), see https://doi.org/10.1063/5.0152838",
          "title": "Vib Scaling",
          "type": "number"
        },
        "n_conformers": {
          "default": 1,
          "description": "Number of conformers to generate and calculate MSM parameters for. The resulting bond and angle parameters will be averaged over all conformers.",
          "title": "N Conformers",
          "type": "integer"
        }
      },
      "title": "MSMSettings",
      "type": "object"
    },
    "TypeGenerationSettings": {
      "additionalProperties": false,
      "description": "Settings for generating tagged SMARTS types for a given potential type.",
      "properties": {
        "max_extend_distance": {
          "default": -1,
          "description": "Maximum number of bonds to extend from the atoms to which the potential is applied when generating tagged SMARTS patterns. A value of -1 means no limit.",
          "title": "Max Extend Distance",
          "type": "integer"
        },
        "include": {
          "default": [],
          "description": "List of SMARTS present in the initial force field for which to generate new SMARTS patterns. This allows you to split specific types for reparameterisation. This is mutually exclusive with the exclude field.",
          "items": {
            "type": "string"
          },
          "title": "Include",
          "type": "array"
        },
        "exclude": {
          "default": [],
          "description": "List of SMARTS patterns to exclude when generating tagged SMARTS types. If present, these patterns will remain the same as in the initial force field. This is mutually exclusive with the include field.",
          "items": {
            "type": "string"
          },
          "title": "Exclude",
          "type": "array"
        }
      },
      "title": "TypeGenerationSettings",
      "type": "object"
    }
  },
  "additionalProperties": false,
  "description": "Settings for the starting parameterisation.",
  "properties": {
    "smiles": {
      "description": "SMILES string or list of SMILES for molecules to fit",
      "items": {
        "type": "string"
      },
      "title": "Smiles",
      "type": "array"
    },
    "initial_force_field": {
      "default": "openff_unconstrained-2.3.0.offxml",
      "description": "The force field from which to start. This can be any OpenFF force field, or your own .offxml file.",
      "title": "Initial Force Field",
      "type": "string"
    },
    "expand_torsions": {
      "default": true,
      "description": "Whether to expand the torsion periodicities up to 4.",
      "title": "Expand Torsions",
      "type": "boolean"
    },
    "linearise_harmonics": {
      "default": true,
      "description": "Whether to linearise the harmonic potentials in the force field. Enabled by default.",
      "title": "Linearise Harmonics",
      "type": "boolean"
    },
    "msm_settings": {
      "anyOf": [
        {
          "$ref": "#/$defs/MSMSettings"
        },
        {
          "type": "null"
        }
      ],
      "description": "Settings for the modified Seminario method to initialise force field parameters."
    },
    "type_generation_settings": {
      "additionalProperties": {
        "$ref": "#/$defs/TypeGenerationSettings"
      },
      "description": "Settings for generating tagged SMARTS types for each valence type.",
      "propertyNames": {
        "enum": [
          "Bonds",
          "Angles",
          "ProperTorsions",
          "ImproperTorsions"
        ]
      },
      "title": "Type Generation Settings",
      "type": "object"
    }
  },
  "required": [
    "smiles"
  ],
  "title": "ParameterisationSettings",
  "type": "object"
}

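Fields:

  • smiles (list[str])
  • initial_force_field (str)
  • expand_torsions (bool)
  • linearise_harmonics (bool)
  • msm_settings (MSMSettings | None)
  • type_generation_settings (dict[NonLinearValenceType, TypeGenerationSettings])

Validators:

  • validate_smiles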

smiles pydantic-field #

smiles: list[str]

SMILES string, or list of SMILES strings, for the molecules to fit. A single string is converted to a one-element list during validation.

initial_force_field pydantic-field #

initial_force_field: str = (
    "openff_unconstrained-2.3.0.offxml"
)

The force field from which to start. This can be any OpenFF force field, or your own .offxml file.

expand_torsions pydantic-field #

expand_torsions: bool = True

Whether to expand the torsion periodicities up to 4.

linearise_harmonics pydantic-field #

linearise_harmonics: bool = True

Whether to linearise the harmonic potentials in the force field. Enabled by default.

msm_settings pydantic-field #

msm_settings: MSMSettings | None

Settings for the modified Seminario method to initialise force field parameters.

type_generation_settings pydantic-field #

type_generation_settings: dict[
    NonLinearValenceType, TypeGenerationSettings
]

Settings for generating tagged SMARTS types for each valence type.

molecules property #

molecules: list[Molecule]

Return the list of OpenFF Molecule objects for the SMILES strings.

output_types property #

output_types: set[OutputType]

Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.

validate_smiles pydantic-validator #

validate_smiles(value: str | list[str]) -> list[str]

Validate that all SMILES are valid and unique. Accepts a single string or a list of strings.

Source code in presto/settings.py
@field_validator("smiles", mode="before")
def validate_smiles(cls, value: str | list[str]) -> list[str]:
    """Validate all SMILES are valid, unique. Accepts string or list."""
    # Convert single string to list for backward compatibility
    if isinstance(value, str):
        value = [value]

    if not value:
        raise ValueError("smiles list cannot be empty")

    # Check for duplicates
    if len(value) != len(set(value)):
        duplicates = [s for s in value if value.count(s) > 1]
        unique_duplicates = list(set(duplicates))
        raise ValueError(f"Duplicate SMILES found: {unique_duplicates}")

    # Validate each SMILES string
    for smiles in value:
        if Chem.MolFromSmiles(smiles) is None:
            raise ValueError(f"Invalid SMILES string: {smiles}")
    return value
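
The list handling in this validator can be exercised without RDKit. The sketch below is a standard-library approximation, not presto's code: `normalise_smiles` is a hypothetical name, the `Chem.MolFromSmiles` validity check is deliberately left out, and `collections.Counter` replaces the repeated `list.count` calls.

```python
from __future__ import annotations

from collections import Counter


def normalise_smiles(value: str | list[str]) -> list[str]:
    """Stdlib-only sketch of the list handling in validate_smiles.

    The RDKit validity check is omitted, so any non-empty string is accepted.
    """
    # A bare string becomes a one-element list.
    if isinstance(value, str):
        value = [value]
    if not value:
        raise ValueError("smiles list cannot be empty")
    # Counter finds repeated entries in a single pass over the list.
    duplicates = sorted(s for s, n in Counter(value).items() if n > 1)
    if duplicates:
        raise ValueError(f"Duplicate SMILES found: {duplicates}")
    return list(value)
```

As in the source listing, duplicates are detected by plain string comparison, so two different SMILES strings that describe the same molecule (for example `CCO` and `OCC`) are not flagged.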

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

WorkflowSettings pydantic-model #

Bases: _DefaultSettings

Overall settings for the full fitting workflow.

Show JSON schema:
{
  "$defs": {
    "AttributeConfig": {
      "description": "Configuration for how a potential's attributes should be trained.",
      "properties": {
        "cols": {
          "description": "The parameters to train, e.g. 'k', 'length', 'epsilon'.",
          "items": {
            "type": "string"
          },
          "title": "Cols",
          "type": "array"
        },
        "scales": {
          "additionalProperties": {
            "type": "number"
          },
          "default": {},
          "description": "The scales to apply to each parameter, e.g. 'k': 1.0, 'length': 1.0, 'epsilon': 1.0.",
          "title": "Scales",
          "type": "object"
        },
        "limits": {
          "additionalProperties": {
            "maxItems": 2,
            "minItems": 2,
            "prefixItems": [
              {
                "anyOf": [
                  {
                    "type": "number"
                  },
                  {
                    "type": "null"
                  }
                ]
              },
              {
                "anyOf": [
                  {
                    "type": "number"
                  },
                  {
                    "type": "null"
                  }
                ]
              }
            ],
            "type": "array"
          },
          "default": {},
          "description": "The min and max values to clamp each parameter within, e.g. 'k': (0.0, None), 'angle': (0.0, pi), 'epsilon': (0.0, None), where None indicates no constraint.",
          "title": "Limits",
          "type": "object"
        },
        "regularize": {
          "additionalProperties": {
            "type": "number"
          },
          "default": {},
          "description": "The regularization strength to apply to each parameter, e.g. 'k': 0.01, 'epsilon': 0.001. Parameters not listed are not regularized.",
          "title": "Regularize",
          "type": "object"
        }
      },
      "required": [
        "cols"
      ],
      "title": "AttributeConfig",
      "type": "object"
    },
    "MLMDSamplingSettings": {
      "additionalProperties": false,
      "description": "Settings for molecular dynamics sampling using a machine learning\npotential. This protocol uses the ML reference potential for sampling as\nwell as for energy and force calculations.",
      "properties": {
        "sampling_protocol": {
          "const": "ml_md",
          "default": "ml_md",
          "description": "Sampling protocol to use.",
          "title": "Sampling Protocol",
          "type": "string"
        },
        "ml_potential": {
          "default": "aceff-2.0",
          "description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
          "enum": [
            "aceff-2.0",
            "mace-off23-small",
            "mace-off23-medium",
            "mace-off23-large",
            "egret-1",
            "aimnet2_b973c_d3_ens",
            "aimnet2_wb97m_d3_ens"
          ],
          "title": "Ml Potential",
          "type": "string"
        },
        "timestep": {
          "description": "MD timestep",
          "title": "Timestep",
          "type": "string"
        },
        "temperature": {
          "description": "Temperature to run MD at",
          "title": "Temperature",
          "type": "string"
        },
        "snapshot_interval": {
          "description": "Interval between saving snapshots during production sampling",
          "title": "Snapshot Interval",
          "type": "string"
        },
        "n_conformers": {
          "default": 10,
          "description": "The number of conformers to generate, from which sampling is started",
          "title": "N Conformers",
          "type": "integer"
        },
        "equilibration_sampling_time_per_conformer": {
          "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
          "title": "Equilibration Sampling Time Per Conformer",
          "type": "string"
        },
        "production_sampling_time_per_conformer": {
          "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
          "title": "Production Sampling Time Per Conformer",
          "type": "string"
        },
        "loss_energy_weight": {
          "default": 1000.0,
          "description": "Scaling factor for the energy loss term for samples from this protocol.",
          "title": "Loss Energy Weight",
          "type": "number"
        },
        "loss_force_weight": {
          "default": 0.1,
          "description": "Scaling factor for the force loss term for samples from this protocol.",
          "title": "Loss Force Weight",
          "type": "number"
        }
      },
      "title": "MLMDSamplingSettings",
      "type": "object"
    },
    "MMMDMetadynamicsSamplingSettings": {
      "additionalProperties": false,
      "description": "Settings for molecular dynamics sampling using a molecular mechanics\nforce field with metadynamics. This is initially the force field supplied in the parameterisation\nsettings, but is updated as the bespoke force field is trained.",
      "properties": {
        "sampling_protocol": {
          "const": "mm_md_metadynamics",
          "default": "mm_md_metadynamics",
          "description": "Sampling protocol to use.",
          "title": "Sampling Protocol",
          "type": "string"
        },
        "ml_potential": {
          "default": "aceff-2.0",
          "description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
          "enum": [
            "aceff-2.0",
            "mace-off23-small",
            "mace-off23-medium",
            "mace-off23-large",
            "egret-1",
            "aimnet2_b973c_d3_ens",
            "aimnet2_wb97m_d3_ens"
          ],
          "title": "Ml Potential",
          "type": "string"
        },
        "timestep": {
          "description": "MD timestep",
          "title": "Timestep",
          "type": "string"
        },
        "temperature": {
          "description": "Temperature to run MD at",
          "title": "Temperature",
          "type": "string"
        },
        "snapshot_interval": {
          "description": "Interval between saving snapshots during production sampling",
          "title": "Snapshot Interval",
          "type": "string"
        },
        "n_conformers": {
          "default": 10,
          "description": "The number of conformers to generate, from which sampling is started",
          "title": "N Conformers",
          "type": "integer"
        },
        "equilibration_sampling_time_per_conformer": {
          "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
          "title": "Equilibration Sampling Time Per Conformer",
          "type": "string"
        },
        "production_sampling_time_per_conformer": {
          "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
          "title": "Production Sampling Time Per Conformer",
          "type": "string"
        },
        "loss_energy_weight": {
          "default": 1000.0,
          "description": "Scaling factor for the energy loss term for samples from this protocol.",
          "title": "Loss Energy Weight",
          "type": "number"
        },
        "loss_force_weight": {
          "default": 0.1,
          "description": "Scaling factor for the force loss term for samples from this protocol.",
          "title": "Loss Force Weight",
          "type": "number"
        },
        "bias_width": {
          "default": 0.3141592653589793,
          "description": "Width of the bias (in radians)",
          "title": "Bias Width",
          "type": "number"
        },
        "bias_factor": {
          "default": 20.0,
          "description": "Bias factor for well-tempered metadynamics. Typical range: 5-20",
          "title": "Bias Factor",
          "type": "number"
        },
        "bias_height": {
          "description": "Initial height of the bias",
          "title": "Bias Height",
          "type": "string"
        },
        "bias_frequency": {
          "description": "Frequency at which to add bias",
          "title": "Bias Frequency",
          "type": "string"
        },
        "bias_save_frequency": {
          "description": "Frequency at which to save the bias",
          "title": "Bias Save Frequency",
          "type": "string"
        },
        "torsions_to_include_smarts": {
          "description": "SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.",
          "items": {
            "type": "string"
          },
          "title": "Torsions To Include Smarts",
          "type": "array"
        },
        "torsions_to_exclude_smarts": {
          "description": "SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.",
          "items": {
            "type": "string"
          },
          "title": "Torsions To Exclude Smarts",
          "type": "array"
        }
      },
      "title": "MMMDMetadynamicsSamplingSettings",
      "type": "object"
    },
    "MMMDMetadynamicsTorsionMinimisationSamplingSettings": {
      "additionalProperties": false,
      "description": "Settings for MM MD metadynamics sampling with additional torsion-restrained\nminimisation structures. This extends MMMDMetadynamicsSamplingSettings by generating\nadditional training data from torsion-restrained minimisations.",
      "properties": {
        "sampling_protocol": {
          "const": "mm_md_metadynamics_torsion_minimisation",
          "default": "mm_md_metadynamics_torsion_minimisation",
          "description": "Sampling protocol to use.",
          "title": "Sampling Protocol",
          "type": "string"
        },
        "ml_potential": {
          "default": "aceff-2.0",
          "description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
          "enum": [
            "aceff-2.0",
            "mace-off23-small",
            "mace-off23-medium",
            "mace-off23-large",
            "egret-1",
            "aimnet2_b973c_d3_ens",
            "aimnet2_wb97m_d3_ens"
          ],
          "title": "Ml Potential",
          "type": "string"
        },
        "timestep": {
          "description": "MD timestep",
          "title": "Timestep",
          "type": "string"
        },
        "temperature": {
          "description": "Temperature to run MD at",
          "title": "Temperature",
          "type": "string"
        },
        "snapshot_interval": {
          "description": "Interval between saving snapshots during production sampling",
          "title": "Snapshot Interval",
          "type": "string"
        },
        "n_conformers": {
          "default": 10,
          "description": "The number of conformers to generate, from which sampling is started",
          "title": "N Conformers",
          "type": "integer"
        },
        "equilibration_sampling_time_per_conformer": {
          "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
          "title": "Equilibration Sampling Time Per Conformer",
          "type": "string"
        },
        "production_sampling_time_per_conformer": {
          "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
          "title": "Production Sampling Time Per Conformer",
          "type": "string"
        },
        "loss_energy_weight": {
          "default": 1000.0,
          "description": "Scaling factor for the energy loss term for samples from this protocol.",
          "title": "Loss Energy Weight",
          "type": "number"
        },
        "loss_force_weight": {
          "default": 0.1,
          "description": "Scaling factor for the force loss term for samples from this protocol.",
          "title": "Loss Force Weight",
          "type": "number"
        },
        "bias_width": {
          "default": 0.3141592653589793,
          "description": "Width of the bias (in radians)",
          "title": "Bias Width",
          "type": "number"
        },
        "bias_factor": {
          "default": 20.0,
          "description": "Bias factor for well-tempered metadynamics. Typical range: 5-20",
          "title": "Bias Factor",
          "type": "number"
        },
        "bias_height": {
          "description": "Initial height of the bias",
          "title": "Bias Height",
          "type": "string"
        },
        "bias_frequency": {
          "description": "Frequency at which to add bias",
          "title": "Bias Frequency",
          "type": "string"
        },
        "bias_save_frequency": {
          "description": "Frequency at which to save the bias",
          "title": "Bias Save Frequency",
          "type": "string"
        },
        "torsions_to_include_smarts": {
          "description": "SMARTS patterns for torsions to include in metadynamics biasing. Matches single bonds not in rings and single bonds in aliphatic rings of size 5 or more. These should match the entire torsion (4 atoms), not just the rotatable bond.",
          "items": {
            "type": "string"
          },
          "title": "Torsions To Include Smarts",
          "type": "array"
        },
        "torsions_to_exclude_smarts": {
          "description": "SMARTS patterns for bonds to exclude from metadynamics biasing. These are removed from the list of torsions matched by the include patterns. These should match only the rotatable bond (2 atoms), not the full torsion.",
          "items": {
            "type": "string"
          },
          "title": "Torsions To Exclude Smarts",
          "type": "array"
        },
        "ml_minimisation_steps": {
          "default": 10,
          "description": "Number of MLP minimisation steps with restrained torsions.",
          "title": "Ml Minimisation Steps",
          "type": "integer"
        },
        "mm_minimisation_steps": {
          "default": 10,
          "description": "Number of MM minimisation steps with restrained torsions.",
          "title": "Mm Minimisation Steps",
          "type": "integer"
        },
        "torsion_restraint_force_constant": {
          "description": "Force constant for torsion restraints.",
          "title": "Torsion Restraint Force Constant",
          "type": "string"
        },
        "map_ml_coords_energy_to_mm_coords_energy": {
          "default": false,
          "description": "Whether to substitute the MLP energy for the MM-minimised coordinates with the MLP energy for the corresponding MLP-minimised coordinates.",
          "title": "Map Ml Coords Energy To Mm Coords Energy",
          "type": "boolean"
        },
        "loss_energy_weight_mm_torsion_min": {
          "default": 1000.0,
          "description": "Scaling factor for the energy loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.",
          "title": "Loss Energy Weight Mm Torsion Min",
          "type": "number"
        },
        "loss_force_weight_mm_torsion_min": {
          "default": 0.1,
          "description": "Scaling factor for the force loss term for torsion-minimised samples, using MM minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.",
          "title": "Loss Force Weight Mm Torsion Min",
          "type": "number"
        },
        "loss_energy_weight_ml_torsion_min": {
          "default": 1000.0,
          "description": "Scaling factor for the energy loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_energy_weight field.",
          "title": "Loss Energy Weight Ml Torsion Min",
          "type": "number"
        },
        "loss_force_weight_ml_torsion_min": {
          "default": 0.1,
          "description": "Scaling factor for the force loss term for torsion-minimised samples, using MLP minimisation. Note that the weights for the MMMD samples are controlled by the loss_force_weight field.",
          "title": "Loss Force Weight Ml Torsion Min",
          "type": "number"
        }
      },
      "title": "MMMDMetadynamicsTorsionMinimisationSamplingSettings",
      "type": "object"
    },
    "MMMDSamplingSettings": {
      "additionalProperties": false,
      "description": "Settings for molecular dynamics sampling using a molecular mechanics\nforce field. This is initially the force field supplied in the parameterisation\nsettings, but is updated as the bespoke force field is trained.",
      "properties": {
        "sampling_protocol": {
          "const": "mm_md",
          "default": "mm_md",
          "description": "Sampling protocol to use.",
          "title": "Sampling Protocol",
          "type": "string"
        },
        "ml_potential": {
          "default": "aceff-2.0",
          "description": "The machine learning potential to use for calculating energies and forces of the snapshots. Note that this is not generally the potential used for sampling.",
          "enum": [
            "aceff-2.0",
            "mace-off23-small",
            "mace-off23-medium",
            "mace-off23-large",
            "egret-1",
            "aimnet2_b973c_d3_ens",
            "aimnet2_wb97m_d3_ens"
          ],
          "title": "Ml Potential",
          "type": "string"
        },
        "timestep": {
          "description": "MD timestep",
          "title": "Timestep",
          "type": "string"
        },
        "temperature": {
          "description": "Temperature to run MD at",
          "title": "Temperature",
          "type": "string"
        },
        "snapshot_interval": {
          "description": "Interval between saving snapshots during production sampling",
          "title": "Snapshot Interval",
          "type": "string"
        },
        "n_conformers": {
          "default": 10,
          "description": "The number of conformers to generate, from which sampling is started",
          "title": "N Conformers",
          "type": "integer"
        },
        "equilibration_sampling_time_per_conformer": {
          "description": "Equilibration sampling time per conformer. No snapshots are saved during equilibration sampling. The total sampling time per conformer will be this plus the production_sampling_time_per_conformer.",
          "title": "Equilibration Sampling Time Per Conformer",
          "type": "string"
        },
        "production_sampling_time_per_conformer": {
          "description": "Production sampling time per conformer. The total sampling time per conformer will be this plus the equilibration_sampling_time_per_conformer.",
          "title": "Production Sampling Time Per Conformer",
          "type": "string"
        },
        "loss_energy_weight": {
          "default": 1000.0,
          "description": "Scaling factor for the energy loss term for samples from this protocol.",
          "title": "Loss Energy Weight",
          "type": "number"
        },
        "loss_force_weight": {
          "default": 0.1,
          "description": "Scaling factor for the force loss term for samples from this protocol.",
          "title": "Loss Force Weight",
          "type": "number"
        }
      },
      "title": "MMMDSamplingSettings",
      "type": "object"
    },
    "MSMSettings": {
      "additionalProperties": false,
      "description": "Settings for the modified Seminario method.",
      "properties": {
        "ml_potential": {
          "default": "aceff-2.0",
          "description": "The machine learning potential to use for calculating the Hessian matrix",
          "enum": [
            "aceff-2.0",
            "mace-off23-small",
            "mace-off23-medium",
            "mace-off23-large",
            "egret-1",
            "aimnet2_b973c_d3_ens",
            "aimnet2_wb97m_d3_ens"
          ],
          "title": "Ml Potential",
          "type": "string"
        },
        "finite_step": {
          "description": "Finite step to calculate Hessian (Angstrom)",
          "title": "Finite Step",
          "type": "string"
        },
        "tolerance": {
          "description": "Tolerance for the geometry optimizer",
          "title": "Tolerance",
          "type": "string"
        },
        "vib_scaling": {
          "default": 0.958,
          "description": "Vibrational scaling factor. This is a reasonable default for \u03c9B97M-V/def2-TZVPPD (AceFF-2.0 LOT),  see https://doi-org.libproxy.ncl.ac.uk/10.1063/5.0152838",
          "title": "Vib Scaling",
          "type": "number"
        },
        "n_conformers": {
          "default": 1,
          "description": "Number of conformers to generate and calculate MSM parameters for. The resulting bond and angle parameters will be averaged over all conformers.",
          "title": "N Conformers",
          "type": "integer"
        }
      },
      "title": "MSMSettings",
      "type": "object"
    },
    "OutlierFilterSettings": {
      "additionalProperties": false,
      "description": "Settings for filtering outliers from datasets based on MM vs MLP differences.\n\nOutliers are identified by comparing MM and reference (typically MLP) energies\nand forces. Conformations where the absolute difference exceeds a threshold\nare removed.",
      "properties": {
        "energy_outlier_threshold": {
          "anyOf": [
            {
              "type": "number"
            },
            {
              "type": "null"
            }
          ],
          "default": 2.0,
          "description": "Absolute threshold in kcal/mol/atom for energy outlier detection. Conformations where |energy_mm - energy_ref| / n_atoms (relative to minimum) exceeds this threshold will be removed. Set to None to disable energy-based filtering.",
          "title": "Energy Outlier Threshold"
        },
        "force_outlier_threshold": {
          "anyOf": [
            {
              "type": "number"
            },
            {
              "type": "null"
            }
          ],
          "default": 500.0,
          "description": "Absolute threshold in kcal/mol/\u00c5 for force outlier detection. Conformations where max |force_mm - force_ref| exceeds this threshold will be removed. Set to None to disable force-based filtering.",
          "title": "Force Outlier Threshold"
        },
        "min_conformations": {
          "default": 1,
          "description": "Minimum number of conformations to keep per molecule. If filtering would remove too many conformations, all conformations will be kept for that molecule.",
          "title": "Min Conformations",
          "type": "integer"
        }
      },
      "title": "OutlierFilterSettings",
      "type": "object"
    },
    "ParameterConfig": {
      "description": "Configuration for how a potential's parameters should be trained.",
      "properties": {
        "cols": {
          "description": "The parameters to train, e.g. 'k', 'length', 'epsilon'.",
          "items": {
            "type": "string"
          },
          "title": "Cols",
          "type": "array"
        },
        "scales": {
          "additionalProperties": {
            "type": "number"
          },
          "default": {},
          "description": "The scales to apply to each parameter, e.g. 'k': 1.0, 'length': 1.0, 'epsilon': 1.0.",
          "title": "Scales",
          "type": "object"
        },
        "limits": {
          "additionalProperties": {
            "maxItems": 2,
            "minItems": 2,
            "prefixItems": [
              {
                "anyOf": [
                  {
                    "type": "number"
                  },
                  {
                    "type": "null"
                  }
                ]
              },
              {
                "anyOf": [
                  {
                    "type": "number"
                  },
                  {
                    "type": "null"
                  }
                ]
              }
            ],
            "type": "array"
          },
          "default": {},
          "description": "The min and max values to clamp each parameter within, e.g. 'k': (0.0, None), 'angle': (0.0, pi), 'epsilon': (0.0, None), where none indicates no constraint.",
          "title": "Limits",
          "type": "object"
        },
        "regularize": {
          "additionalProperties": {
            "type": "number"
          },
          "default": {},
          "description": "The regularization strength to apply to each parameter, e.g. 'k': 0.01, 'epsilon': 0.001. Parameters not listed are not regularized.",
          "title": "Regularize",
          "type": "object"
        },
        "include": {
          "anyOf": [
            {
              "items": {
                "$ref": "#/$defs/_PotentialKey"
              },
              "type": "array"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "The keys (see ``smee.TensorPotential.parameter_keys`` for details) corresponding to specific parameters to be trained. If ``None``, all parameters will be trained.",
          "title": "Include"
        },
        "exclude": {
          "anyOf": [
            {
              "items": {
                "$ref": "#/$defs/_PotentialKey"
              },
              "type": "array"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "The keys (see ``smee.TensorPotential.parameter_keys`` for details) corresponding to specific parameters to be excluded from training. If ``None``, no parameters will be excluded.",
          "title": "Exclude"
        }
      },
      "required": [
        "cols"
      ],
      "title": "ParameterConfig",
      "type": "object"
    },
    "ParameterisationSettings": {
      "additionalProperties": false,
      "description": "Settings for the starting parameterisation.",
      "properties": {
        "smiles": {
          "description": "SMILES string or list of SMILES for molecules to fit",
          "items": {
            "type": "string"
          },
          "title": "Smiles",
          "type": "array"
        },
        "initial_force_field": {
          "default": "openff_unconstrained-2.3.0.offxml",
          "description": "The force field from which to start. This can be any OpenFF force field, or your own .offxml file.",
          "title": "Initial Force Field",
          "type": "string"
        },
        "expand_torsions": {
          "default": true,
          "description": "Whether to expand the torsion periodicities up to 4.",
          "title": "Expand Torsions",
          "type": "boolean"
        },
        "linearise_harmonics": {
          "default": true,
          "description": "Linearise the harmonic potentials in the Force Field (Default)",
          "title": "Linearise Harmonics",
          "type": "boolean"
        },
        "msm_settings": {
          "anyOf": [
            {
              "$ref": "#/$defs/MSMSettings"
            },
            {
              "type": "null"
            }
          ],
          "description": "Settings for the modified Seminario method to initialise force field parameters."
        },
        "type_generation_settings": {
          "additionalProperties": {
            "$ref": "#/$defs/TypeGenerationSettings"
          },
          "description": "Settings for generating tagged SMARTS types for each valence type.",
          "propertyNames": {
            "enum": [
              "Bonds",
              "Angles",
              "ProperTorsions",
              "ImproperTorsions"
            ]
          },
          "title": "Type Generation Settings",
          "type": "object"
        }
      },
      "required": [
        "smiles"
      ],
      "title": "ParameterisationSettings",
      "type": "object"
    },
    "PreComputedDatasetSettings": {
      "additionalProperties": false,
      "description": "Settings for loading pre-computed datasets from disk.\n\nFor single-molecule fits, provide a single Path.\nFor multi-molecule fits, provide a list of Paths (one per molecule).",
      "properties": {
        "sampling_protocol": {
          "const": "pre_computed",
          "default": "pre_computed",
          "description": "Sampling protocol identifier.",
          "title": "Sampling Protocol",
          "type": "string"
        },
        "dataset_paths": {
          "description": "Path(s) to pre-computed dataset(s) saved with dataset.save_to_disk(). For single-molecule fits, provide a single Path. For multi-molecule fits, provide a list of Paths (one per molecule in order).",
          "items": {
            "format": "path",
            "type": "string"
          },
          "title": "Dataset Paths",
          "type": "array"
        }
      },
      "required": [
        "dataset_paths"
      ],
      "title": "PreComputedDatasetSettings",
      "type": "object"
    },
    "TrainingSettings": {
      "additionalProperties": false,
      "description": "Settings for the training process.",
      "properties": {
        "optimiser": {
          "default": "adam",
          "description": "Optimiser to use for the training. 'adam' is Adam, 'lm' is Levenberg-Marquardt",
          "enum": [
            "adam",
            "lm"
          ],
          "title": "Optimiser",
          "type": "string"
        },
        "parameter_configs": {
          "additionalProperties": {
            "$ref": "#/$defs/ParameterConfig"
          },
          "description": "Configuration for the force field parameters to be trained.",
          "propertyNames": {
            "enum": [
              "Bonds",
              "LinearBonds",
              "Angles",
              "LinearAngles",
              "ProperTorsions",
              "ImproperTorsions"
            ]
          },
          "title": "Parameter Configs",
          "type": "object"
        },
        "attribute_configs": {
          "additionalProperties": {
            "$ref": "#/$defs/AttributeConfig"
          },
          "default": {},
          "description": "Configuration for the force field attributes to be trained. This allows 1-4 scaling for 'vdW' and 'Electrostatics' to be trained.",
          "propertyNames": {
            "enum": [
              "vdW",
              "Electrostatics"
            ]
          },
          "title": "Attribute Configs",
          "type": "object"
        },
        "n_epochs": {
          "default": 1000,
          "description": "Number of epochs in the ML fit",
          "title": "N Epochs",
          "type": "integer"
        },
        "learning_rate": {
          "default": 0.01,
          "description": "Learning Rate in the ML fit",
          "title": "Learning Rate",
          "type": "number"
        },
        "learning_rate_decay": {
          "default": 1.0,
          "description": "Learning Rate Decay. 0.99 is 1%, and 1.0 is no decay.",
          "title": "Learning Rate Decay",
          "type": "number"
        },
        "learning_rate_decay_step": {
          "default": 10,
          "description": "Learning Rate Decay Step",
          "title": "Learning Rate Decay Step",
          "type": "integer"
        },
        "regularisation_target": {
          "default": "initial",
          "description": "Target value to regularise parameters towards. 'initial' is the initial parameter value, 'zero' is zero.",
          "enum": [
            "initial",
            "zero"
          ],
          "title": "Regularisation Target",
          "type": "string"
        }
      },
      "title": "TrainingSettings",
      "type": "object"
    },
    "TypeGenerationSettings": {
      "additionalProperties": false,
      "description": "Settings for generating tagged SMARTS types for a given potential type.",
      "properties": {
        "max_extend_distance": {
          "default": -1,
          "description": "Maximum number of bonds to extend from the atoms to which the potential is applied when generating tagged SMARTS patterns. A value of -1 means no limit.",
          "title": "Max Extend Distance",
          "type": "integer"
        },
        "include": {
          "default": [],
          "description": "List of SMARTS present in the initial force field for which to generate new SMARTS  patterns. This allows you to split specific types for reparameterisation. This is mutually exclusive with the exclude field.",
          "items": {
            "type": "string"
          },
          "title": "Include",
          "type": "array"
        },
        "exclude": {
          "default": [],
          "description": "List of SMARTS patterns to exclude when generating tagged SMARTS types. If present,  these patterns will remain the same as in the initial force field. This is mutually exclusive with the include field.",
          "items": {
            "type": "string"
          },
          "title": "Exclude",
          "type": "array"
        }
      },
      "title": "TypeGenerationSettings",
      "type": "object"
    },
    "_PotentialKey": {
      "description": "TODO: Needed until interchange upgrades to pydantic >=2",
      "properties": {
        "id": {
          "title": "Id",
          "type": "string"
        },
        "mult": {
          "anyOf": [
            {
              "type": "integer"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Mult"
        },
        "associated_handler": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Associated Handler"
        },
        "bond_order": {
          "anyOf": [
            {
              "type": "number"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Bond Order"
        }
      },
      "required": [
        "id"
      ],
      "title": "_PotentialKey",
      "type": "object"
    }
  },
  "additionalProperties": false,
  "description": "Overall settings for the full fitting workflow.",
  "properties": {
    "version": {
      "default": "0.1.dev1+g55bd96965",
      "description": "Version of presto used to create these settings",
      "title": "Version",
      "type": "string"
    },
    "output_dir": {
      "default": ".",
      "description": "Directory where the output files will be saved",
      "format": "path",
      "title": "Output Dir",
      "type": "string"
    },
    "device_type": {
      "default": "cuda",
      "description": "Device type for training, either 'cpu' or 'cuda'",
      "enum": [
        "cpu",
        "cuda"
      ],
      "title": "Device Type",
      "type": "string"
    },
    "n_iterations": {
      "default": 2,
      "description": "Number of iterations of sampling, then training the FF to run",
      "title": "N Iterations",
      "type": "integer"
    },
    "memory": {
      "default": false,
      "description": "Whether to append new training data to training data from the previous iterations, or overwrite it (False).",
      "title": "Memory",
      "type": "boolean"
    },
    "parameterisation_settings": {
      "$ref": "#/$defs/ParameterisationSettings",
      "description": "Settings for the starting parameterisation"
    },
    "training_sampling_settings": {
      "description": "Settings for sampling for generating the training data (usually molecular dynamics)",
      "discriminator": {
        "mapping": {
          "ml_md": "#/$defs/MLMDSamplingSettings",
          "mm_md": "#/$defs/MMMDSamplingSettings",
          "mm_md_metadynamics": "#/$defs/MMMDMetadynamicsSamplingSettings",
          "mm_md_metadynamics_torsion_minimisation": "#/$defs/MMMDMetadynamicsTorsionMinimisationSamplingSettings",
          "pre_computed": "#/$defs/PreComputedDatasetSettings"
        },
        "propertyName": "sampling_protocol"
      },
      "oneOf": [
        {
          "$ref": "#/$defs/MMMDSamplingSettings"
        },
        {
          "$ref": "#/$defs/MLMDSamplingSettings"
        },
        {
          "$ref": "#/$defs/MMMDMetadynamicsSamplingSettings"
        },
        {
          "$ref": "#/$defs/MMMDMetadynamicsTorsionMinimisationSamplingSettings"
        },
        {
          "$ref": "#/$defs/PreComputedDatasetSettings"
        }
      ],
      "title": "Training Sampling Settings"
    },
    "testing_sampling_settings": {
      "description": "Settings for sampling for generating the testing data (usually molecular dynamics)",
      "discriminator": {
        "mapping": {
          "ml_md": "#/$defs/MLMDSamplingSettings",
          "mm_md": "#/$defs/MMMDSamplingSettings",
          "mm_md_metadynamics": "#/$defs/MMMDMetadynamicsSamplingSettings",
          "mm_md_metadynamics_torsion_minimisation": "#/$defs/MMMDMetadynamicsTorsionMinimisationSamplingSettings",
          "pre_computed": "#/$defs/PreComputedDatasetSettings"
        },
        "propertyName": "sampling_protocol"
      },
      "oneOf": [
        {
          "$ref": "#/$defs/MMMDSamplingSettings"
        },
        {
          "$ref": "#/$defs/MLMDSamplingSettings"
        },
        {
          "$ref": "#/$defs/MMMDMetadynamicsSamplingSettings"
        },
        {
          "$ref": "#/$defs/MMMDMetadynamicsTorsionMinimisationSamplingSettings"
        },
        {
          "$ref": "#/$defs/PreComputedDatasetSettings"
        }
      ],
      "title": "Testing Sampling Settings"
    },
    "training_settings": {
      "$ref": "#/$defs/TrainingSettings",
      "description": "Settings for the training process"
    },
    "outlier_filter_settings": {
      "anyOf": [
        {
          "$ref": "#/$defs/OutlierFilterSettings"
        },
        {
          "type": "null"
        }
      ],
      "description": "Settings for filtering outliers from training data. Set to None to disable outlier filtering."
    }
  },
  "required": [
    "parameterisation_settings"
  ],
  "title": "WorkflowSettings",
  "type": "object"
}

Fields:

version pydantic-field #

version: str = __version__

Version of presto used to create these settings

output_dir pydantic-field #

output_dir: Path = Path('.')

Directory where the output files will be saved

device_type pydantic-field #

device_type: TorchDevice = 'cuda'

Device type for training, either 'cpu' or 'cuda'

n_iterations pydantic-field #

n_iterations: int = 2

Number of iterations to run, each consisting of sampling followed by training the force field

memory pydantic-field #

memory: bool = False

Whether to append new training data to the training data from previous iterations (True), or overwrite it (False).
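
The interaction between `n_iterations` and `memory` can be pictured with the minimal sketch below. It assumes that each iteration yields a list of new samples; `run_iterations`, `sample_fn` and `fake_sampler` are hypothetical names used only for illustration and are not part of presto.

```python
from typing import Callable


def run_iterations(
    n_iterations: int,
    memory: bool,
    sample_fn: Callable[[int], list[str]],
) -> list[list[str]]:
    """Return the training set that would be used at each iteration."""
    training_data: list[str] = []
    history: list[list[str]] = []
    for iteration in range(n_iterations):
        new_samples = sample_fn(iteration)
        if memory:
            # Keep samples from earlier iterations and add the new ones.
            training_data = training_data + new_samples
        else:
            # Discard earlier samples; train only on the newest data.
            training_data = new_samples
        history.append(training_data)
    return history


def fake_sampler(iteration: int) -> list[str]:
    # Hypothetical stand-in for a sampling run.
    return [f"iter{iteration}_snapshot{i}" for i in range(2)]


with_memory = run_iterations(2, True, fake_sampler)
without_memory = run_iterations(2, False, fake_sampler)
print(len(with_memory[-1]), len(without_memory[-1]))
```

With `memory=True` the final training set holds the samples from both iterations; with the default `memory=False` it holds only those from the last one.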

parameterisation_settings pydantic-field #

parameterisation_settings: ParameterisationSettings

Settings for the starting parameterisation

training_sampling_settings pydantic-field #

training_sampling_settings: SamplingSettings

Settings for sampling for generating the training data (usually molecular dynamics)

testing_sampling_settings pydantic-field #

testing_sampling_settings: SamplingSettings

Settings for sampling for generating the testing data (usually molecular dynamics)
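
In the JSON schema, both sampling fields carry a discriminator that maps the `sampling_protocol` string to a settings class. The sketch below imitates that dispatch with plain dataclasses rather than pydantic, so it is only an analogy: `MMMDSketch`, `PreComputedSketch`, `PROTOCOL_TO_CLASS` and `build_sampling_settings` are hypothetical names, and only two of the five protocols are shown.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class MMMDSketch:
    # Hypothetical stand-in for MMMDSamplingSettings.
    n_conformers: int = 10
    sampling_protocol: str = "mm_md"


@dataclass
class PreComputedSketch:
    # Hypothetical stand-in for PreComputedDatasetSettings.
    dataset_paths: list[str]
    sampling_protocol: str = "pre_computed"


# Mirrors the shape of the discriminator mapping in the schema (subset only).
PROTOCOL_TO_CLASS = {
    "mm_md": MMMDSketch,
    "pre_computed": PreComputedSketch,
}


def build_sampling_settings(data: dict[str, Any]):
    """Pick the class from the sampling_protocol value, then build it."""
    protocol = data.get("sampling_protocol")
    if protocol not in PROTOCOL_TO_CLASS:
        raise ValueError(f"Unknown sampling_protocol: {protocol!r}")
    return PROTOCOL_TO_CLASS[protocol](**data)


settings = build_sampling_settings(
    {"sampling_protocol": "pre_computed", "dataset_paths": ["mol0_dataset"]}
)
print(type(settings).__name__)
```

The same idea applies when loading from YAML: the `sampling_protocol` key in each sampling block decides which settings class validates the remaining keys.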

training_settings pydantic-field #

training_settings: TrainingSettings

Settings for the training process

outlier_filter_settings pydantic-field #

outlier_filter_settings: OutlierFilterSettings | None

Settings for filtering outliers from training data. Set to None to disable outlier filtering.
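
The filtering rules described in the `OutlierFilterSettings` schema can be sketched roughly as follows. This is not presto's implementation: `keep_mask` is a hypothetical helper, the inputs are plain Python sequences, and two readings of the schema text are assumed (that "relative to minimum" means each energy set is shifted by its own minimum, and that all conformations are kept when fewer than `min_conformations` would survive).

```python
from typing import Optional, Sequence


def keep_mask(
    energy_mm: Sequence[float],
    energy_ref: Sequence[float],
    max_force_diff: Sequence[float],
    n_atoms: int,
    energy_outlier_threshold: Optional[float] = 2.0,
    force_outlier_threshold: Optional[float] = 500.0,
    min_conformations: int = 1,
) -> list[bool]:
    """Return True for each conformation that survives filtering."""
    # Assumption: "relative to minimum" means each energy set is shifted
    # so that its own minimum is zero before the comparison.
    mm_min, ref_min = min(energy_mm), min(energy_ref)
    mask = []
    for e_mm, e_ref, f_diff in zip(energy_mm, energy_ref, max_force_diff):
        keep = True
        if energy_outlier_threshold is not None:
            per_atom = abs((e_mm - mm_min) - (e_ref - ref_min)) / n_atoms
            keep = keep and per_atom <= energy_outlier_threshold
        if force_outlier_threshold is not None:
            # f_diff is the precomputed max |force_mm - force_ref|.
            keep = keep and f_diff <= force_outlier_threshold
        mask.append(keep)
    # Assumption: if too few conformations survive, keep them all.
    if sum(mask) < min_conformations:
        return [True] * len(mask)
    return mask


mask = keep_mask(
    energy_mm=[0.0, 1.0, 50.0],
    energy_ref=[0.0, 1.5, 2.0],
    max_force_diff=[10.0, 600.0, 20.0],
    n_atoms=10,
)
print(mask)
```

Here the second conformation fails the force check and the third fails the energy check, so only the first is kept. Passing `None` for a threshold skips that check.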

output_types property #

output_types: set[OutputType]

Return a set of expected output types for the function which implements this settings object. Subclasses should override this method.

validate_version classmethod #

validate_version(value: str) -> str

Validate version format and check compatibility.

Source code in presto/settings.py
@field_validator("version")
@classmethod
def validate_version(cls, value: str) -> str:
    """Validate version format and check compatibility."""
    try:
        parsed = Version(value)
    except Exception as e:
        raise ValueError(f"Invalid version format: {value}") from e

    actual_version = Version(__version__)

    # Warn the user if major or minor versions do not match
    if parsed.major != actual_version.major or parsed.minor != actual_version.minor:
        logger.warning(
            f"Version mismatch: settings version {value} may not be compatible with current version {__version__}."
        )

    return value
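
The compatibility rule used by the validator (matching major and minor components) can be illustrated without any third-party package. The source above parses versions with `Version`; the sketch below approximates that with a regular expression, so it is a simplification, and `major_minor` and `versions_compatible` are hypothetical helper names.

```python
import re


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Invalid version format: {version}")
    return int(match.group(1)), int(match.group(2))


def versions_compatible(settings_version: str, current_version: str) -> bool:
    """True when major and minor components both match."""
    return major_minor(settings_version) == major_minor(current_version)


print(versions_compatible("0.1.dev1+g55bd96965", "0.1.3"))
print(versions_compatible("0.1.dev1+g55bd96965", "0.2.0"))
```

Note that the validator itself only logs a warning on a mismatch and still returns the value; it raises only when the version string cannot be parsed.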

validate_device_type classmethod #

validate_device_type(value: TorchDevice) -> TorchDevice

Ensure that the requested device type is available.

Source code in presto/settings.py
@field_validator("device_type")
@classmethod
def validate_device_type(cls, value: TorchDevice) -> TorchDevice:
    """Ensure that the requested device type is available."""
    if value == "cuda" and not torch.cuda.is_available():
        raise ValueError("CUDA is not available on this system.")

    if value == "cpu":
        warnings.warn(
            "Using CPU for training and sampling. This may be slow. Consider using CUDA if available.",
            UserWarning,
            stacklevel=2,
        )

    return value

validate_parameterisation_training_consistency #

validate_parameterisation_training_consistency() -> Self

Validate that linearise_harmonics argument in parameterisation settings is consistent with the valence types in the training settings.

Source code in presto/settings.py
@model_validator(mode="after")
def validate_parameterisation_training_consistency(self) -> Self:
    """Validate that linearise_harmonics argument in parameterisation settings is consistent with the valence types
    in the training settings."""

    harmonics_linearised = self.parameterisation_settings.linearise_harmonics
    excluded_valence_types = (
        ("Bonds", "Angles")
        if harmonics_linearised
        else ("LinearBonds", "LinearAngles")
    )
    if any(
        valence_type in self.training_settings.parameter_configs
        for valence_type in excluded_valence_types
    ):
        raise InvalidSettingsError(
            f"ParameterisationSettings.linearise_harmonics is {harmonics_linearised}, but TrainingSettings.parameter_configs "
            f"contains valence types that are inconsistent with this setting: {excluded_valence_types}. "
        )

    return self
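
To see which combinations this validator rejects, the same check can be restated as a standalone function. `inconsistent_valence_types` is a hypothetical helper written for illustration; it takes a plain dict in place of `TrainingSettings.parameter_configs`.

```python
def inconsistent_valence_types(
    linearise_harmonics: bool, parameter_configs: dict
) -> list[str]:
    """List the configured valence types that clash with linearise_harmonics."""
    excluded = (
        ("Bonds", "Angles") if linearise_harmonics else ("LinearBonds", "LinearAngles")
    )
    return [name for name in excluded if name in parameter_configs]


# With linearised harmonics, "Bonds" is rejected but "LinearBonds" is fine.
print(inconsistent_valence_types(True, {"Bonds": {}, "ProperTorsions": {}}))
print(inconsistent_valence_types(True, {"LinearBonds": {}, "LinearAngles": {}}))
```

In practice this means that with the default `linearise_harmonics=True`, `parameter_configs` should use the `LinearBonds` and `LinearAngles` keys, and with `linearise_harmonics=False` it should use `Bonds` and `Angles`.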

get_path_manager #

get_path_manager() -> WorkflowPathManager

Get the output paths manager for this workflow settings object.

Source code in presto/settings.py
def get_path_manager(self) -> WorkflowPathManager:
    """Get the output paths manager for this workflow settings object."""
    # Get the number of molecules from the smiles list
    smiles = self.parameterisation_settings.smiles
    n_mols = len(smiles) if isinstance(smiles, list) else 1
    return WorkflowPathManager(
        output_dir=self.output_dir,
        n_iterations=self.n_iterations,
        n_mols=n_mols,
        training_settings=self.training_settings,
        training_sampling_settings=self.training_sampling_settings,
        testing_sampling_settings=self.testing_sampling_settings,
    )

to_yaml #

to_yaml(yaml_path: PathLike) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def to_yaml(self, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    _model_to_yaml(self, yaml_path)

from_yaml classmethod #

from_yaml(yaml_path: PathLike) -> Self

Load settings from a YAML file

Source code in presto/settings.py
@classmethod
def from_yaml(cls, yaml_path: PathLike) -> Self:
    """Load settings from a YAML file"""
    return _model_from_yaml(cls, yaml_path)

_model_to_yaml #

_model_to_yaml(
    model: BaseModel, yaml_path: PathLike
) -> None

Save the settings to a YAML file

Source code in presto/settings.py
def _model_to_yaml(model: BaseModel, yaml_path: PathLike) -> None:
    """Save the settings to a YAML file"""
    data = model.model_dump(mode="json")
    with open(yaml_path, "w") as file:
        yaml.dump(data, file, default_flow_style=False, sort_keys=False, indent=4)

_model_from_yaml #

_model_from_yaml(cls: type[_T], yaml_path: PathLike) -> _T

Load settings from a YAML file

Source code in presto/settings.py
def _model_from_yaml(cls: type[_T], yaml_path: PathLike) -> _T:
    """Load settings from a YAML file"""
    with open(yaml_path, "r") as file:
        settings_data = yaml.safe_load(file)
    return cls(**settings_data)