ichor.core.analysis package

Subpackages

Submodules

ichor.core.analysis.predictions module

get_predicted(models: Models, points: ListOfAtoms, atoms: List[str] | None = None, types: List[str] | None = None) DataFrame

Returns the predicted values for a given ListOfAtoms given Models

Parameters:
  • models – the models to use for predicting the values of points

  • points – a ListOfAtoms containing geometries to predict

  • atoms – optional list of atoms to predict the values of points for, defaults to all atoms in models

  • types – optional list of property types, such as iqa, q00, etc. to predict the values of points for, defaults to all types in models

Returns:

predictions of points given models as a ModelsResult

get_true(validation_set: PointsDirectory, atoms: List[str], types: List[str]) DataFrame

Returns the true values for a given PointsDirectory as a ModelsResult

Parameters:
  • validation_set – the PointsDirectory containing the true values

  • atoms – List of atoms to get the true values for

  • types – List of property types, such as iqa, q00, etc. to get the true values for

Returns:

ModelsResult containing the true values requested from the validation set

get_true_predicted(models: Models, validation_set: PointsDirectory, atoms: List[str] | None = None, types: List[str] | None = None) Tuple[DataFrame, DataFrame]

Returns the true and predicted values of the given model and validation set for each of the specified atoms and types

Parameters:
  • models – models to use for the predictions

  • validation_set – validation set containing geometry data and true values

  • atoms – optional list of atoms to predict, defaults to all atoms found in models

  • types – optional list of types to predict, such as iqa, q00, etc. Defaults to all types found in models

Returns:

ModelsResult for true and predicted values from the models and validation set provided

ichor.core.analysis.trajectory_analysis module

class Stability(reference_path: Path | str, trajectory_path: Path | str, threshold: float)

Bases: TrajectoryAnalysis

Class used to check whether a simulation of a molecule is stable given a threshold value. Each connected bond distance is compared with a reference bond distance (e.g. that of the corresponding optimised molecule), and the difference is checked against the threshold.

Parameters:
  • reference_path – path to the reference structure of a molecule. Can be .gjf or .xyz file.

  • trajectory_path – path to the trajectory file. Can be a DLPOLY4 trajectory or a .xyz trajectory file.

  • threshold – threshold value for stability across the trajectory.

Example usage:

traj = Stability('path/to/reference.xyz','path/to/trajectory.xyz', threshold=0.5)
traj.stable_trajectory()
traj.hr(nbins=1000, max_dist=10.0)
traj.plot_hr('path/to/figure.png')

bond_lengths_differences_matrix() ndarray

Computes a matrix of the differences of distances between a reference geometry and all the timesteps of a trajectory.

Returns:

numpy array of dimensions (timesteps,natoms,natoms)
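The shape of the result can be illustrated with a minimal standard-library sketch. It does not use ichor: the function names, the plain-tuple coordinates, and the sign convention (trajectory distance minus reference distance) are all assumptions made for illustration, and every atom pair is included rather than bonded pairs only.

```python
import math

def distance_matrix(coords):
    """Return the natoms x natoms matrix of pairwise distances for one geometry."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def distance_differences(reference, trajectory):
    """Subtract the reference distance matrix from each timestep's matrix.

    The result is indexed as [timestep][i][j].
    Sign convention (trajectory minus reference) is an assumption.
    """
    ref = distance_matrix(reference)
    n = len(reference)
    out = []
    for frame in trajectory:
        cur = distance_matrix(frame)
        out.append([[cur[i][j] - ref[i][j] for j in range(n)] for i in range(n)])
    return out

# Hypothetical diatomic: reference separation of 1.0, followed by two frames.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.75, 0.0)],
]
diffs = distance_differences(reference, trajectory)
```

Here diffs has one natoms x natoms matrix per timestep, mirroring the (timesteps,natoms,natoms) layout described above.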

property bond_lengths_matrix: ndarray

Returns a distances matrix only for bonds.

Returns:

numpy array of (timesteps,natoms,natoms) dimensions

path: Path | str

stable_trajectory()

Computes the bond_lengths_differences_matrix and identifies the timesteps at which the trajectory is not stable. The original trajectory attribute is then overwritten with the stable trajectory only, so that the distribution of distances hr can be computed on the stable part of the trajectory.
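The stability check can be sketched as follows. This is not ichor's implementation: the function name stable_prefix, the explicit bonds list (ichor determines connectivity itself), and the choice to truncate at the first unstable timestep are assumptions made for illustration.

```python
import math

def stable_prefix(reference, trajectory, bonds, threshold):
    """Return the frames that precede the first unstable frame.

    A frame counts as unstable when any bond length deviates from the
    reference bond length by more than the threshold (assumed criterion).
    """
    ref_lengths = {(i, j): math.dist(reference[i], reference[j]) for i, j in bonds}
    stable = []
    for frame in trajectory:
        deviations = (
            abs(math.dist(frame[i], frame[j]) - ref_lengths[(i, j)])
            for i, j in bonds
        )
        if any(d > threshold for d in deviations):
            break  # assumption: everything from here on is discarded
        stable.append(frame)
    return stable

# Hypothetical diatomic with a reference bond length of 1.0.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.0), (1.25, 0.0, 0.0)],  # deviation 0.25, within threshold
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],   # deviation 1.0, exceeds threshold
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],   # after the first unstable frame
]
kept = stable_prefix(reference, trajectory, bonds=[(0, 1)], threshold=0.5)
```

Only the first frame is kept in this example, because the second frame stretches the bond beyond the threshold.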

class TrajectoryAnalysis(trajectory_path: Path | str)

Bases: ReadFile

TrajectoryAnalysis is a class used for general analysis of molecular dynamics trajectories. The functionality currently covers only the distribution of distances for a single molecule.

Parameters:

trajectory_path (Union[Path, str]) – Path to the trajectory file. Can be an xyz or a DLPOLY4 trajectory.

Example usage:

traj = TrajectoryAnalysis('path/to/trajectory')
traj.hr(nbins=1000,max_dist=10.0)
traj.plot_hr('path/to/figure.png')

delta_dirac(r0: float, r1: float) int

Computes a modified version of the Dirac delta function. In other words, this uses the distances_matrix attribute to find all distances that lie between two float values, and returns the number of distances, across all timesteps, that fall between them.

Parameters:
  • r0 – lower bound of the interval

  • r1 – upper bound of the interval

Returns:

number of values that are within that interval
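The counting step can be sketched in a few lines. The function name count_in_interval and the flat list of distances are hypothetical, and treating the interval as half-open (r0 included, r1 excluded) is an assumption, since the boundary handling is not stated above.

```python
def count_in_interval(distances, r0, r1):
    """Count the distances d with r0 <= d < r1 (boundary handling is assumed)."""
    return sum(1 for d in distances if r0 <= d < r1)

# Hypothetical distances collected over all timesteps.
distances = [0.9, 1.0, 1.1, 1.5, 2.0]
n = count_in_interval(distances, 1.0, 1.5)
```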

hr(nbins: int | None = 1000, max_dist: float | None = 10.0) List[float]

Computes the distribution of all pairwise distances across a whole trajectory. Call this function to compute the bins and distance_hist attributes.

Parameters:
  • nbins – number of bins spanning the distance range considered, defaults to 1000

  • max_dist – maximum distance to consider for the distribution, defaults to 10.0
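How nbins and max_dist shape such a distribution can be seen in a minimal standard-library sketch. It is not ichor's implementation: the function name distance_histogram, equal-width bins on [0, max_dist), and returning raw counts rather than a normalised distribution are assumptions.

```python
import math
from itertools import combinations

def distance_histogram(trajectory, nbins, max_dist):
    """Histogram all unique pairwise distances over every frame.

    Uses nbins equal-width bins on [0, max_dist); larger distances are ignored.
    Returns (bin_edges, raw_counts); normalisation is left out on purpose.
    """
    width = max_dist / nbins
    counts = [0] * nbins
    for frame in trajectory:
        for a, b in combinations(frame, 2):
            d = math.dist(a, b)
            if d < max_dist:
                # min() guards against rounding pushing d / width up to nbins
                counts[min(int(d / width), nbins - 1)] += 1
    edges = [k * width for k in range(nbins + 1)]
    return edges, counts

# Hypothetical three-atom trajectory with two frames.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 9.0, 0.0)],
]
edges, counts = distance_histogram(trajectory, nbins=4, max_dist=8.0)
```

With these inputs each bin is 2.0 wide; the two distances above max_dist in the second frame are left out of the counts.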

path: Path | str

plot_hr(nbins: int = 1000, max_dist: float = 10.0, ax=None, label=None)

Helper function which plots a quick graph for visualising the hr distribution.

Parameters:
  • nbins – The number of bins to use to calculate hr

  • max_dist – The maximum pairwise relative distance to plot

  • save_path – path and name of the file ending in .png

r(nbins: int, max_dist: float)

Module contents

Stability and TrajectoryAnalysis are also exposed at the package level; their documentation is identical to the entries under ichor.core.analysis.trajectory_analysis above.