ichor.core.database package
Subpackages
Submodules
ichor.core.database.query_database module
- check_supported_db_types(db_type: str)
Checks the given database type and raises ValueError if it is not one of the supported types.
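The check described above can be sketched as follows. This is a minimal stand-in, not ichor's implementation: the names `SUPPORTED_DB_TYPES` and `check_supported_db_types_sketch` and the error message are assumptions. Only the two supported types (sqlite and json) and the ValueError come from this page.

```python
# Hypothetical stand-in for the check; not ichor's actual source.
SUPPORTED_DB_TYPES = ("sqlite", "json")  # the two types this page names


def check_supported_db_types_sketch(db_type: str) -> None:
    """Raise ValueError if db_type is not one of the supported types."""
    if db_type not in SUPPORTED_DB_TYPES:
        raise ValueError(
            f"Unsupported database type {db_type!r}; "
            f"expected one of {SUPPORTED_DB_TYPES}."
        )
```

Validating db_type once at the entry point lets the functions below fail early, before any database is opened.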
- get_alf_from_first_db_geometry(db_path: str | Path, db_type: str, alf_calc_func=<function calculate_alf_cahn_ingold_prelog>, echo=False) → List[ALF]
Returns the atomic local frame (ALF) of every atom, calculated from the first point (geometry) in the database.
- Parameters:
db_path – Path to SQLite3 database containing Points, AtomNames, and Dataset tables.
db_type – The type of database; currently only sqlite and json are supported.
alf_calc_func – The function to calculate ALF with on an Atoms instance
echo – Whether to echo executed SQL queries, defaults to False
- Returns:
A list of ALF instances for every atom in the system.
- get_database_info_from_db_type(db_path: str | Path, db_type: str, echo=False)
Gets the information required from the database to make processed csv files. Works for sqlite or json databases.
- Parameters:
db_path – path to database
db_type – the type of database containing info, currently only “json” and “sqlite” are supported.
- Raises:
ValueError – If the value of db_type is not in supported databases.
- rotate_multipole_moments(row_with_atom_info, C)
- worker(x)
- worker_init(func)
- write_processed_data_for_atoms(db_path: str | Path, db_type: str, alf: List[ALF], max_integration_error: float = 0.001, write_index_col=False, echo=False, atom_names: List | None = None, calc_multipoles: bool = True, calc_forces: bool = False)
Writes a csv containing the features, wfn energy, -dE/df (note that these are forces with respect to features), iqa energy, and rotated multipoles for every atom in the database. Only points whose absolute integration error for the atom of interest is below the threshold are added to the corresponding atomic dataset.
- Parameters:
db_path – Path to SQLite3 database containing Points, AtomNames, and Dataset tables.
db_type – type of database, sqlite or json
alf – A list of ALF instances to be used when calculating features and calculating C matrices.
max_integration_error – Maximum absolute integration error allowed for the atom of interest. A point whose absolute integration error for that atom exceeds this value is left out of that atom's dataset, but the same point can still be added to another atom's dataset if its integration error for that atom is acceptable. Defaults to 0.001.
write_index_col – Whether to write the index col in the final .csv file, defaults to False
echo – Whether to echo executed SQL statements, defaults to False
atom_names – A list of atom names for which to write csv files. If None, csv files are written for all atoms.
calc_multipoles – Whether to calculate the rotated multipole moments, defaults to True.
calc_forces – Whether to calculate -dE/df, defaults to False.
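The per-atom effect of max_integration_error can be illustrated with a small sketch. The in-memory dictionary, the atom names, and the helper `points_for_atom` are hypothetical. The real function reads integration errors from the database, and this is not its implementation.

```python
# Hypothetical data: integration error per point id, per atom name.
integration_errors = {
    0: {"O1": 0.0002, "H2": 0.0004},
    1: {"O1": 0.0030, "H2": 0.0005},    # too large for O1, fine for H2
    2: {"O1": -0.0008, "H2": -0.0020},  # fine for O1, too large for H2
}


def points_for_atom(errors, atom_name, max_integration_error=0.001):
    """Return ids of points whose absolute integration error for
    atom_name is below the threshold."""
    return [
        point_id
        for point_id, per_atom in sorted(errors.items())
        if abs(per_atom[atom_name]) < max_integration_error
    ]


o1_points = points_for_atom(integration_errors, "O1")  # [0, 2]
h2_points = points_for_atom(integration_errors, "H2")  # [0, 1]
```

Point 1 is dropped from the O1 dataset but kept for H2, so the atomic datasets can end up with different numbers of points.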
- write_processed_data_for_atoms_parallel(db_path: str | Path, db_type: List[str], alf: List[ALF], ncores: int, max_diff_iqa_wfn: float = 4.184, max_integration_error: float = 0.001, atom_names: List[str] | None = None, write_index_col=False, echo=False, calc_multipoles: bool = True, calc_forces: bool = False, parent_directory: Path = PosixPath('processed_csvs'))
Function uses the concurrent.futures.ProcessPoolExecutor class to parallelize the calculations on multiple cores, so that multiple atom calculations can be done in parallel.
Writes a csv containing the features, wfn energy, -dE/df (note that these are forces with respect to features), iqa energy, and rotated multipoles for every atom in the database. Only points whose absolute integration error for the atom of interest is below the threshold are added to the corresponding atomic dataset.
- Parameters:
db_path – Path to an SQLite3 database containing Points, AtomNames, and Dataset tables, or to a json database (a directory), potentially containing multiple directories.
db_type – The type of database that is given, currently only json or sqlite formats supported.
alf – A list of ALF instances to be used when calculating features and calculating C matrices.
ncores – The number of cores to use for the parallel calculations. Each core will calculate the data for an individual atom.
max_diff_iqa_wfn – The maximum allowed difference between the sum of IQA energies and the wfn energy (in kJ mol⁻¹). Any point above this threshold is filtered out before the integration error filter is applied.
max_integration_error – Maximum absolute integration error allowed for the atom of interest. A point whose absolute integration error for that atom exceeds this value is left out of that atom's dataset, but the same point can still be added to another atom's dataset if its integration error for that atom is acceptable. Defaults to 0.001.
write_index_col – Whether to write the index col in the final .csv file, defaults to False
echo – Whether to echo executed SQL statements, defaults to False
atom_names – A list of atom names for which to write csv files. If None, csv files are written for all atoms.
calc_forces – Whether to calculate -dE/df, default False.
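The one-task-per-atom pattern described above can be sketched with concurrent.futures.ProcessPoolExecutor. The functions `process_one_atom` and `process_atoms_in_parallel` and the returned file name are hypothetical placeholders. The real per-atom task computes features and writes a csv, which is omitted here.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def process_one_atom(atom_name: str, max_integration_error: float = 0.001) -> str:
    """Hypothetical per-atom task; a real worker would write a csv here."""
    return f"{atom_name}_processed_data.csv"


def process_atoms_in_parallel(atom_names, ncores, max_integration_error=0.001):
    """Run one task per atom on a pool of ncores worker processes."""
    # Bind the shared settings so each task only needs the atom name.
    task = partial(process_one_atom, max_integration_error=max_integration_error)
    with ProcessPoolExecutor(max_workers=ncores) as executor:
        # executor.map returns results in the same order as atom_names.
        return list(executor.map(task, atom_names))
```

When used from a script, a call such as `process_atoms_in_parallel(["O1", "H2", "H3"], ncores=2)` should sit under an `if __name__ == "__main__":` guard, because process start methods that launch a fresh interpreter re-import the main module.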
- write_processed_one_atom_data_to_csv(full_df: DataFrame, point_ids: List[int], atom_name: str, alf: List[ALF], max_diff_iqa_wfn: float = 4.184, max_integration_error: float = 0.001, write_index_col=False, calc_multipoles: bool = True, calc_forces: bool = False, parent_directory: Path = PosixPath('processed_csvs'))
Writes features, iqa energy, as well as rotated multipole moments (given an ALF) to a csv file for all points (as long as integration error for the atom of interest is below a threshold integration error).
- Parameters:
full_df – DataFrame object extracted from SQLite database. This object contains information for all points (and all atoms in every point)
point_ids – A list of integers representing the id column of the points table of the SQLite database.
atom_name – The atom for which features, local multipole moments, as well as local forces are going to be calculated for every point in the dataset
alf – A list of ALF instances to be used when calculating features and calculating C matrices.
max_diff_iqa_wfn – The maximum allowed difference between the sum of IQA energies and the wfn energy (in kJ mol⁻¹). Any point above this threshold is filtered out before the integration error filter is applied.
max_integration_error – Maximum absolute integration error allowed for the atom of interest. A point whose absolute integration error for that atom exceeds this value is left out of that atom's dataset, but the same point can still be added to another atom's dataset if its integration error for that atom is acceptable. Defaults to 0.001.
calc_forces – Whether to calculate -dE/df forces (which takes a long time currently), default False.
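The max_diff_iqa_wfn check can be sketched as below. This is an illustration under stated assumptions, not ichor's code: it assumes the stored energies are in Hartree and converts the difference to kJ mol⁻¹ before comparing. The helper name and the example energies are hypothetical. The default of 4.184 kJ mol⁻¹ equals 1 kcal mol⁻¹.

```python
# Assumption: energies are stored in Hartree; 1 Hartree is about 2625.4996 kJ/mol.
HARTREE_TO_KJ_MOL = 2625.4996


def passes_iqa_wfn_check(iqa_energies, wfn_energy, max_diff_iqa_wfn=4.184):
    """True if |sum(IQA) - wfn|, converted to kJ/mol, is within the threshold."""
    diff_kj_mol = abs(sum(iqa_energies) - wfn_energy) * HARTREE_TO_KJ_MOL
    return diff_kj_mol <= max_diff_iqa_wfn


# Hypothetical three-atom point: atomic IQA energies in Hartree.
iqa = [-75.70, -0.35, -0.35]
good_point = passes_iqa_wfn_check(iqa, wfn_energy=-76.4005)  # about 1.3 kJ/mol off
bad_point = passes_iqa_wfn_check(iqa, wfn_energy=-76.4100)   # about 26 kJ/mol off
```

A point that fails this check is discarded for every atom, whereas the integration error filter that follows is applied separately for each atom.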
Module contents
- get_alf_from_first_db_geometry(db_path: str | Path, db_type: str, alf_calc_func=<function calculate_alf_cahn_ingold_prelog>, echo=False) → List[ALF]
Returns the atomic local frame (ALF) of every atom, calculated from the first point (geometry) in the database.
- Parameters:
db_path – Path to SQLite3 database containing Points, AtomNames, and Dataset tables.
db_type – The type of database; currently only sqlite and json are supported.
alf_calc_func – The function to calculate ALF with on an Atoms instance
echo – Whether to echo executed SQL queries, defaults to False
- Returns:
A list of ALF instances for every atom in the system.
- get_database_info_from_db_type(db_path: str | Path, db_type: str, echo=False)
Gets the information required from the database to make processed csv files. Works for sqlite or json databases.
- Parameters:
db_path – path to database
db_type – the type of database containing info, currently only “json” and “sqlite” are supported.
- Raises:
ValueError – If the value of db_type is not in supported databases.
- write_processed_data_for_atoms_parallel(db_path: str | Path, db_type: List[str], alf: List[ALF], ncores: int, max_diff_iqa_wfn: float = 4.184, max_integration_error: float = 0.001, atom_names: List[str] | None = None, write_index_col=False, echo=False, calc_multipoles: bool = True, calc_forces: bool = False, parent_directory: Path = PosixPath('processed_csvs'))
Function uses the concurrent.futures.ProcessPoolExecutor class to parallelize the calculations on multiple cores, so that multiple atom calculations can be done in parallel.
Writes a csv containing the features, wfn energy, -dE/df (note that these are forces with respect to features), iqa energy, and rotated multipoles for every atom in the database. Only points whose absolute integration error for the atom of interest is below the threshold are added to the corresponding atomic dataset.
- Parameters:
db_path – Path to an SQLite3 database containing Points, AtomNames, and Dataset tables, or to a json database (a directory), potentially containing multiple directories.
db_type – The type of database that is given, currently only json or sqlite formats supported.
alf – A list of ALF instances to be used when calculating features and calculating C matrices.
ncores – The number of cores to use for the parallel calculations. Each core will calculate the data for an individual atom.
max_diff_iqa_wfn – The maximum allowed difference between the sum of IQA energies and the wfn energy (in kJ mol⁻¹). Any point above this threshold is filtered out before the integration error filter is applied.
max_integration_error – Maximum absolute integration error allowed for the atom of interest. A point whose absolute integration error for that atom exceeds this value is left out of that atom's dataset, but the same point can still be added to another atom's dataset if its integration error for that atom is acceptable. Defaults to 0.001.
write_index_col – Whether to write the index col in the final .csv file, defaults to False
echo – Whether to echo executed SQL statements, defaults to False
atom_names – A list of atom names for which to write csv files. If None, csv files are written for all atoms.
calc_forces – Whether to calculate -dE/df, default False.