ichor.hpc.main package

Submodules

ichor.hpc.main.aimall module

add_method_and_get_wfn_paths(points: PointsDirectory, method: str) → List[Path]

AIMAll reads the method from the .wfn file, so the method must be written into the .wfn file. Otherwise AIMAll assumes the wrong method and gives wrong results.

submit_points_directory_to_aimall(points_directory: PointsDirectory | Path, method='B3LYP', ncores: int = 2, naat: int = 1, aimall_atoms: List[str] | None = None, force_calculate_ints=False, hold: JobID | None = None, script_name: str = PosixPath('.DATA/SCRIPTS/AIMALL.sh'), outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS'), **kwargs) → JobID | None

Submits .wfn files which will be partitioned into .int files by AIMAll. Each topological atom in the system has its own .int file.

Parameters:

points_directory – A path to a PointsDirectory-structured directory or a PointsDirectory instance
method – Functional to be written to the .wfn file because AIMAll needs to know it to function correctly. Note that only HF, B3LYP, M062X, PBE are supported.
ncores – Number of cores to run AIMAll with, defaults to 2
naat – Number of atoms at a time, defaults to 1
aimall_atoms – A list of atom names (e.g. [C1, H2, etc.]) to integrate over, defaults to None. If left as None, AIMAll will do calculations for all atoms.
hold – Hold for a specific JobID, defaults to None

Raises:

ValueError – if the provided method is not one of the supported functionals.

Returns:

The job id of the submitted job

Return type:

Optional[ichor.hpc.batch_system.jobs.JobID]
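The method check described under Raises can be sketched roughly as below. This is a minimal illustration, not ichor's implementation: the helper name validate_method and the constant SUPPORTED_METHODS are hypothetical, and only the list of functionals and the ValueError come from the text above.

```python
from typing import Tuple

# Functionals listed as supported in the parameter description above.
# The constant name is an assumption made for this sketch.
SUPPORTED_METHODS: Tuple[str, ...] = ("HF", "B3LYP", "M062X", "PBE")


def validate_method(method: str) -> str:
    """Return the method unchanged, or raise ValueError if it is unsupported.

    Hypothetical helper: it only mirrors the documented behaviour of
    rejecting functionals outside the supported list.
    """
    if method not in SUPPORTED_METHODS:
        raise ValueError(
            f"Method {method!r} is not supported; choose one of {SUPPORTED_METHODS}."
        )
    return method
```

Running such a check before writing the method into any .wfn file means an unsupported functional fails early, before a job is queued.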

submit_wfns(wfns: List[Path], aimall_atoms: List[str] | None = None, script_name: str = PosixPath('.DATA/SCRIPTS/AIMALL.sh'), ncores=2, naat=1, force_calculate_ints=False, hold: JobID | None = None, outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS'), **kwargs) → JobID | None

Writes out a submission script and submits wavefunctions to AIMAll on a cluster.

Parameters:

wfns – A list of wavefunction paths to write to the submission script
aimall_atoms – A list of strings corresponding to atom names. These will be the atoms for which AIMAll computes properties.
force_calculate_ints – Whether or not to run AIMAll for a .wfn file again. If True, AIMAll will be run again.
hold – An optional JobID to hold for. The AIMAll job will not run until that other job is finished.

ichor.hpc.main.check_for_missing_files module

submit_check_points_directory_for_missing_files(points_dir_path: str | Path)

Submits a job that checks whether the expected files are present and have contents. The results are written to the file which contains the standard output from the job.

Parameters:

points_dir_path – Path to a PointsDirectory or PointsDirectoryParent-like directory

ichor.hpc.main.database module

submit_make_csvs_from_database(db_path: Path, db_type: str, ncores: int, alf: List[ALF] | None = None, float_difference_iqa_wfn: float = 4.184, float_integration_error: float = 0.001, rotate_multipole_moments: bool = True, calculate_feature_forces: bool = False)

Submits a job to a compute node that makes csv files from a database. Note that the csv-making code is parallelized per atom, meaning that each atomic csv is made using 1 core. Using the same number of cores as the number of atoms in the system is the optimal choice.

Parameters:

db_path – pathlib.Path object that holds path to database
db_type – The type of database, sqlite or json
ncores – Number of cores to run job with
float_difference_iqa_wfn – Absolute tolerance for difference of energy between WFN and sum of IQA energies.
submit_on_compute – Whether to submit on compute or not
float_integration_error – Absolute tolerance for integration error.
alf – A list of ALF for the whole system. If not given, it will be calculated automatically.
rotate_multipole_moments – Whether or not to rotate multipole moments, defaults to True
calculate_feature_forces – Whether or not to calculate ALF forces, defaults to False
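The two tolerance parameters can be read as simple absolute-value filters. The sketch below is a hypothetical illustration, not the code ichor runs: the function name passes_checks and its inputs are assumptions, and the units are whatever the database stores, which the text above does not specify.

```python
from typing import Sequence


def passes_checks(
    wfn_energy: float,
    iqa_energies: Sequence[float],
    integration_errors: Sequence[float],
    float_difference_iqa_wfn: float = 4.184,
    float_integration_error: float = 0.001,
) -> bool:
    """Return True if a point satisfies both absolute tolerances.

    Assumes all energies share the units of float_difference_iqa_wfn.
    """
    # Difference between the WFN energy and the sum of atomic IQA energies.
    energy_gap = abs(wfn_energy - sum(iqa_energies))
    if energy_gap > float_difference_iqa_wfn:
        return False
    # Every atom's integration error must be within tolerance.
    return all(abs(err) <= float_integration_error for err in integration_errors)
```

A point failing either check would presumably be left out of the csv files, although the exact handling is not described here.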

submit_make_database(points_dir_path: Path, database_format: str = 'sqlite', ncores=1)

Makes a database from a PointsDirectory or from a parent directory of several PointsDirectory directories. Whether the path is a PointsDirectory or a PointsDirectoryParent is inferred from the suffix of the directory.

Parameters:

points_dir_path – Path to a PointsDirectory or to a parent directory of several PointsDirectory directories
database_format – the format, currently sqlite and json are supported
ncores – number of cores to use on compute node

ichor.hpc.main.gaussian module

submit_gjfs(gjfs: List[Path], force_calculate_wfn: bool = False, script_name: str | Path | None = PosixPath('.DATA/SCRIPTS/GAUSSIAN.sh'), hold: JobID | None = None, ncores=2, outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS')) → JobID

Writes out a submission script which contains an array of Gaussian jobs to be run on compute nodes. When called from a login node, it writes out the submission script and a datafile (a file which contains the names of all the .gjf files that need to be run through Gaussian), and then submits the submission script to the compute nodes. When called from a compute node (which happens when ichor is run in auto-run mode), it only writes out the datafile and does not submit any new jobs, because jobs cannot be submitted from compute nodes on CSF3.

Parameters:

gjfs – A list of Path objects pointing to .gjf files
force_calculate_wfn – Run Gaussian calculations on given .gjf files, even if .wfn files already exist. Defaults to False.
hold – An optional JobID to hold this job for. This is used in auto-run to make this job wait for the previous job to finish, defaults to None
script_name – Path to write the submission script out to, defaults to ichor.hpc.global_variables.SCRIPT_NAMES["gaussian"]

Returns:

The JobID of this job given by the submission system.
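The behaviour described for submit_gjfs (skip inputs that already have a .wfn unless forced, always write the datafile, submit only from a login node) can be sketched as follows. This is a minimal sketch under stated assumptions: both function names are hypothetical, the .wfn file is assumed to sit next to the .gjf file with the same stem, and the submit callable stands in for whatever the batch system interface actually is.

```python
from pathlib import Path
from typing import Callable, List, Optional


def select_gjfs(gjfs: List[Path], force_calculate_wfn: bool = False) -> List[Path]:
    """Keep only the inputs that still need to be run through Gaussian.

    Assumption: each .wfn is written next to its .gjf with the same stem.
    """
    if force_calculate_wfn:
        return list(gjfs)
    return [gjf for gjf in gjfs if not gjf.with_suffix(".wfn").exists()]


def write_datafile_and_maybe_submit(
    gjfs: List[Path],
    datafile: Path,
    on_login_node: bool,
    submit: Callable[[Path], str],
) -> Optional[str]:
    """Always write the datafile; call submit only when on a login node."""
    datafile.parent.mkdir(parents=True, exist_ok=True)
    datafile.write_text("\n".join(str(p) for p in gjfs) + "\n")
    if on_login_node:
        return submit(datafile)  # hypothetical: returns a job id
    return None  # on a compute node nothing new is submitted
```

Keeping the datafile write separate from the submission is what lets the same function serve both the login-node and the auto-run (compute-node) case.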

submit_points_directory_to_gaussian(points_directory: Path | PointsDirectory, overwrite_existing=False, force_calculate_wfn: bool = False, ncores=2, hold: JobID | None = None, script_name: str = PosixPath('.DATA/SCRIPTS/GAUSSIAN.sh'), outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS'), **kwargs) → JobID | None

Function that writes out .gjf files from .xyz files that are in each directory and calls submit_gjfs which submits all .gjf files in a directory to Gaussian. Gaussian outputs .wfn files.

Parameters:

points_directory – A Path object which is the path of the directory, or a PointsDirectory instance (commonly the training set path, sample pool path, etc.).
force_calculate_wfn – Run Gaussian calculations on given .gjf files, even if .wfn files already exist. Defaults to False.
kwargs – Keyword arguments to pass to the GJF class. These are things like number of cores, basis set, level of theory, spin multiplicity, charge, etc. These will be used in the newly written .gjf files (overwriting settings from previously existing .gjf files).

write_gjfs(points_directory: PointsDirectory, overwrite_existing: bool, **kwargs) → List[Path]

Writes out .gjf files in every PointDirectory which is contained in a PointsDirectory. Each PointDirectory should always have a .xyz file in it, which contains only one molecular geometry. This .xyz file can be used to write out the .gjf file in the PointDirectory (if it does not exist already).

Parameters:

points_directory – A PointsDirectory instance which wraps around a whole directory containing points (such as TRAINING_SET).

Returns:

A list of Path objects which point to .gjf files in each PointDirectory that is contained in the PointsDirectory.
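The write-if-missing behaviour of write_gjfs can be illustrated with plain pathlib. This is a hedged sketch, not ichor's code: the function name write_gjfs_sketch, the default route line, and the default charge and multiplicity are all placeholders chosen for the example, and the real function works on PointsDirectory objects and passes its keyword arguments to the GJF class.

```python
from pathlib import Path
from typing import List


def write_gjfs_sketch(
    points_root: Path,
    overwrite_existing: bool,
    route: str = "#p B3LYP/6-31+G(d,p)",  # placeholder route line
    charge: int = 0,
    multiplicity: int = 1,
) -> List[Path]:
    """Write one .gjf per sub-directory that contains a .xyz file."""
    gjf_paths: List[Path] = []
    for point_dir in sorted(p for p in points_root.iterdir() if p.is_dir()):
        xyz_files = sorted(point_dir.glob("*.xyz"))
        if not xyz_files:
            continue  # nothing to build an input from
        xyz = xyz_files[0]
        gjf = xyz.with_suffix(".gjf")
        if overwrite_existing or not gjf.exists():
            # .xyz layout: atom count, comment line, then one line per atom.
            atom_lines = [ln for ln in xyz.read_text().splitlines()[2:] if ln.strip()]
            lines = [route, "", point_dir.name, "", f"{charge} {multiplicity}"]
            lines += atom_lines + ["", ""]  # input ends with a blank line
            gjf.write_text("\n".join(lines))
        gjf_paths.append(gjf)
    return gjf_paths
```

Existing .gjf files are returned untouched unless overwrite_existing is True, which matches the "if it does not exist already" wording above.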

ichor.hpc.main.orca module

submit_orca_to_compute(orca_inputs: List[Path], force_calculate_wfn: bool = False, script_name: str | Path | None = PosixPath('.DATA/SCRIPTS/ORCA.sh'), hold: JobID | None = None, ncores=2, outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS')) → JobID

Writes out a submission script which contains an array of ORCA jobs to be run on compute nodes. When called from a login node, it writes out the submission script and a datafile (a file which contains the names of all the .orcainput files that need to be run through ORCA), and then submits the submission script to the compute nodes. When called from a compute node (which happens when ichor is run in auto-run mode), it only writes out the datafile and does not submit any new jobs, because jobs cannot be submitted from compute nodes on CSF3.

Parameters:

orca_inputs – A list of Path objects pointing to ORCA .inp files
force_calculate_wfn – Run ORCA calculation on the given files, even if .wfn files already exist. Defaults to False.
hold – An optional JobID to hold this job for. This is used in auto-run to make this job wait for the previous job to finish, defaults to None
script_name – Path to write the submission script out to, defaults to ichor.hpc.global_variables.SCRIPT_NAMES["orca"]

Returns:

The JobID of this job given by the submission system.

submit_points_directory_to_orca(points_directory: Path | PointsDirectory, overwrite_existing=False, force_calculate_wfn: bool = False, ncores=2, hold: JobID | None = None, script_name: str = PosixPath('.DATA/SCRIPTS/ORCA.sh'), outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS'), **kwargs) → JobID | None

Function that writes out .inp files from .xyz files that are in each directory and submits ORCA jobs to compute nodes.

Parameters:

points_directory – A Path object which is the path of the directory, or a PointsDirectory instance (commonly the training set path, sample pool path, etc.).
force_calculate_wfn – Run ORCA calculations on given input files, even if .wfn files already exist. Defaults to False.
kwargs – Keyword arguments to pass to the ORCA input file class. These are things like number of cores, basis set, level of theory, spin multiplicity, charge, etc. These will be used in the newly written input files (overwriting settings from previously existing input files).

write_orca_inputs(points_directory: PointsDirectory, overwrite_existing: bool, **kwargs) → List[Path]

Writes out .inp files in every PointDirectory which is contained in a PointsDirectory. Each PointDirectory should always have a .xyz file in it, which contains only one molecular geometry. This .xyz file can be used to write out the orca input file in the PointDirectory (if it does not exist already).

Parameters:

points_directory – A PointsDirectory instance which wraps around a whole directory containing points (such as TRAINING_SET).

Returns:

A list of Path objects which point to ORCA input files in each PointDirectory that is contained in the PointsDirectory.

ichor.hpc.main.trajectory module

submit_center_trajectory_on_atom(trajectory_path: str | Path, central_atom_name: str, alf_dict: Dict[str, ALF], xyz_output_path='ALF_centered_trajectory.xyz', ncores=1)

Submits a job to a compute node that centers a trajectory on the given atom.

Parameters:

trajectory_path – Path of trajectory file to center
central_atom_name – Name of the atom to center on, e.g. C1
alf_dict – dictionary of atom_name: ALF, which must contain the central atom as key
xyz_output_path – xyz file to write centered geometries to, defaults to “ALF_centered_trajectory.xyz”
ncores – number of cores for job, defaults to 1
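As a rough picture of what centering on an atom means for a single geometry, the sketch below shifts every atom so that the chosen atom sits at the origin. It shows the translation step only; the alf_dict parameter suggests the real job also uses the atom's ALF to orient the geometry, which is not shown. The function name center_on_atom and the tuple-based coordinate type are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Coordinate = Tuple[float, float, float]


def center_on_atom(
    atom_names: Sequence[str],
    coordinates: Sequence[Coordinate],
    central_atom_name: str,
) -> List[Coordinate]:
    """Translate one geometry so the named atom is at the origin."""
    # list.index raises ValueError if the central atom is not present.
    index = list(atom_names).index(central_atom_name)
    cx, cy, cz = coordinates[index]
    return [(x - cx, y - cy, z - cz) for x, y, z in coordinates]
```

Applying this to each frame of a trajectory in turn would give a translated trajectory with the central atom fixed at the origin.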

Module contents

submit_check_points_directory_for_missing_files(points_dir_path: str | Path)

Submits a job that checks whether the expected files are present and have contents. The results are written to the file which contains the standard output from the job.

Parameters:

points_dir_path – Path to a PointsDirectory or PointsDirectoryParent-like directory

submit_gjfs(gjfs: List[Path], force_calculate_wfn: bool = False, script_name: str | Path | None = PosixPath('.DATA/SCRIPTS/GAUSSIAN.sh'), hold: JobID | None = None, ncores=2, outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS')) → JobID

Writes out a submission script which contains an array of Gaussian jobs to be run on compute nodes. When called from a login node, it writes out the submission script and a datafile (a file which contains the names of all the .gjf files that need to be run through Gaussian), and then submits the submission script to the compute nodes. When called from a compute node (which happens when ichor is run in auto-run mode), it only writes out the datafile and does not submit any new jobs, because jobs cannot be submitted from compute nodes on CSF3.

Parameters:

gjfs – A list of Path objects pointing to .gjf files
force_calculate_wfn – Run Gaussian calculations on given .gjf files, even if .wfn files already exist. Defaults to False.
hold – An optional JobID to hold this job for. This is used in auto-run to make this job wait for the previous job to finish, defaults to None
script_name – Path to write the submission script out to, defaults to ichor.hpc.global_variables.SCRIPT_NAMES["gaussian"]

Returns:

The JobID of this job given by the submission system.

submit_make_csvs_from_database(db_path: Path, db_type: str, ncores: int, alf: List[ALF] | None = None, float_difference_iqa_wfn: float = 4.184, float_integration_error: float = 0.001, rotate_multipole_moments: bool = True, calculate_feature_forces: bool = False)

Submits a job to a compute node that makes csv files from a database. Note that the csv-making code is parallelized per atom, meaning that each atomic csv is made using 1 core. Using the same number of cores as the number of atoms in the system is the optimal choice.

Parameters:

db_path – pathlib.Path object that holds path to database
db_type – The type of database, sqlite or json
ncores – Number of cores to run job with
float_difference_iqa_wfn – Absolute tolerance for difference of energy between WFN and sum of IQA energies.
submit_on_compute – Whether to submit on compute or not
float_integration_error – Absolute tolerance for integration error.
alf – A list of ALF for the whole system. If not given, it will be calculated automatically.
rotate_multipole_moments – Whether or not to rotate multipole moments, defaults to True
calculate_feature_forces – Whether or not to calculate ALF forces, defaults to False

submit_points_directory_to_aimall(points_directory: PointsDirectory | Path, method='B3LYP', ncores: int = 2, naat: int = 1, aimall_atoms: List[str] | None = None, force_calculate_ints=False, hold: JobID | None = None, script_name: str = PosixPath('.DATA/SCRIPTS/AIMALL.sh'), outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS'), **kwargs) → JobID | None

Submits .wfn files which will be partitioned into .int files by AIMAll. Each topological atom in the system has its own .int file.

Parameters:

points_directory – A path to a PointsDirectory-structured directory or a PointsDirectory instance
method – Functional to be written to the .wfn file because AIMAll needs to know it to function correctly. Note that only HF, B3LYP, M062X, PBE are supported.
ncores – Number of cores to run AIMAll with, defaults to 2
naat – Number of atoms at a time, defaults to 1
aimall_atoms – A list of atom names (e.g. [C1, H2, etc.]) to integrate over, defaults to None. If left as None, AIMAll will do calculations for all atoms.
hold – Hold for a specific JobID, defaults to None

Raises:

ValueError – if the provided method is not one of the supported functionals.

Returns:

The job id of the submitted job

Return type:

Optional[ichor.hpc.batch_system.jobs.JobID]

submit_points_directory_to_gaussian(points_directory: Path | PointsDirectory, overwrite_existing=False, force_calculate_wfn: bool = False, ncores=2, hold: JobID | None = None, script_name: str = PosixPath('.DATA/SCRIPTS/GAUSSIAN.sh'), outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS'), **kwargs) → JobID | None

Function that writes out .gjf files from .xyz files that are in each directory and calls submit_gjfs which submits all .gjf files in a directory to Gaussian. Gaussian outputs .wfn files.

Parameters:

points_directory – A Path object which is the path of the directory, or a PointsDirectory instance (commonly the training set path, sample pool path, etc.).
force_calculate_wfn – Run Gaussian calculations on given .gjf files, even if .wfn files already exist. Defaults to False.
kwargs – Keyword arguments to pass to the GJF class. These are things like number of cores, basis set, level of theory, spin multiplicity, charge, etc. These will be used in the newly written .gjf files (overwriting settings from previously existing .gjf files).

submit_points_directory_to_orca(points_directory: Path | PointsDirectory, overwrite_existing=False, force_calculate_wfn: bool = False, ncores=2, hold: JobID | None = None, script_name: str = PosixPath('.DATA/SCRIPTS/ORCA.sh'), outputs_dir_path=PosixPath('.DATA/SCRIPTS/OUTPUTS'), errors_dir_path=PosixPath('.DATA/SCRIPTS/ERRORS'), **kwargs) → JobID | None

Function that writes out .inp files from .xyz files that are in each directory and submits ORCA jobs to compute nodes.

Parameters:

points_directory – A Path object which is the path of the directory, or a PointsDirectory instance (commonly the training set path, sample pool path, etc.).
force_calculate_wfn – Run ORCA calculations on given input files, even if .wfn files already exist. Defaults to False.
kwargs – Keyword arguments to pass to the ORCA input file class. These are things like number of cores, basis set, level of theory, spin multiplicity, charge, etc. These will be used in the newly written input files (overwriting settings from previously existing input files).