API documentation

PGaudi is a package that improves the performance of the GaudiMM suite through external parallelization. It consists of five modules:

pgaudi.main

Main module of the package from which the main process is run and the parallelization is controlled.

pgaudi.main.main()

Main function that gathers the command-line arguments with parse_cli() and executes the function run().

pgaudi.main.parse_cli()

Function to parse the command-line arguments.

Returns: args – Namespace holding the arguments gathered from the command line.
Return type: argparse.Namespace
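
The snippet below is a minimal sketch of how such a parser could be built with argparse. The option names (cfg, -p/--processes, -c/--complexity) and their defaults are assumptions chosen to mirror the parameters of run(); they are not taken from the PGaudi source.

```python
import argparse


def build_parser():
    """Build a parser whose options mirror the parameters of run().

    All option names here are hypothetical, not PGaudi's actual CLI.
    """
    parser = argparse.ArgumentParser(prog="pgaudi")
    parser.add_argument("cfg", help="path to the YAML input file")
    parser.add_argument("-p", "--processes", type=int, default=None,
                        help="number of subprocesses (None: use all cores)")
    parser.add_argument("-c", "--complexity", action="store_true",
                        help="keep the full workload in every subprocess")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["input.yaml", "-p", "4"])
print(args.cfg, args.processes, args.complexity)
```
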
pgaudi.main.run(cfg, processes, complexity)

Function that executes the whole job.

Parameters:
  • cfg (str or gaudi.parse.Settings) – Path to the YAML input file, or the input already parsed into a gaudi.parse.Settings object.
  • processes (int) – Number of processes into which the main process is divided. Default = number of cores detected on the machine's CPU.
  • complexity (bool) – If True, each new subprocess is computationally equal to the main process. Default = False.
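
A minimal sketch of how the default value of processes could be resolved. resolve_processes is a hypothetical helper, and the use of multiprocessing.cpu_count() to detect the cores is an assumption, not necessarily what run() does internally.

```python
import multiprocessing


def resolve_processes(processes=None):
    """Return the requested process count, or the detected core count if None.

    Hypothetical helper; it only illustrates the documented default.
    """
    if processes is None:
        return multiprocessing.cpu_count()
    if processes < 1:
        raise ValueError("processes must be a positive integer")
    return processes


print(resolve_processes(4))
print(resolve_processes())  # depends on the machine
```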

pgaudi.parallel

Module with helper functions for the parallel process: similarity checks and gaudi runs.

pgaudi.parallel.divide_cfg(cfg, processes, complexity)

Creates the new YAML files for the parallel execution from the input cfg (gaudi.parse.Settings).

Parameters:
  • cfg (gaudi.parse.Settings) – The loaded input file in a gaudi.parse.Settings object.
  • processes (int) – Number of processes into which the main process is divided.
  • complexity (bool) – If True, the computational complexity of each new subprocess will be the same as that of the main process.
Returns:
  • pcfg_names (list) – A list with the names of the new yaml files generated.
  • pcfgs (list) – A list with the contents of the gaudi.parse.Settings of each new yaml file.
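
The idea of splitting one configuration into several can be sketched with plain dictionaries. This is an illustration only: divide_settings, the keys "name" and "population", and the rule of dividing the population size among the subprocesses when complexity is False are all assumptions, not the behavior of divide_cfg().

```python
import copy


def divide_settings(cfg, processes, complexity=False):
    """Split one settings dict into `processes` independent copies.

    Hypothetical stand-in for a Settings object: `cfg` is a plain dict
    with the assumed keys "name" and "population".
    """
    names, parts = [], []
    for index in range(processes):
        part = copy.deepcopy(cfg)
        part["name"] = "{}_{}".format(cfg["name"], index)
        if not complexity:
            # Assumption: the workload is shared by shrinking the population.
            part["population"] = max(1, cfg["population"] // processes)
        names.append(part["name"] + ".yaml")
        parts.append(part)
    return names, parts


names, parts = divide_settings({"name": "job", "population": 100}, 4)
print(names)
print([part["population"] for part in parts])
```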

pgaudi.parallel.gaudi_parallel(yaml)

Helper function that launches the gaudi run command in a bash terminal, for use in the parallel run.

Parameters: yaml (str) – Name of the input YAML file.
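
A minimal sketch of launching one run from Python, assuming the gaudi executable is on the PATH and is invoked as "gaudi run <file>". build_gaudi_command and run_gaudi are hypothetical names; how gaudi_parallel() actually spawns the command is not shown in this documentation.

```python
import subprocess


def build_gaudi_command(yaml_path):
    """Return the argument list for one run (assumed form: gaudi run <file>)."""
    return ["gaudi", "run", yaml_path]


def run_gaudi(yaml_path):
    """Run the command and return its exit code.

    Requires the gaudi executable to be installed; not called in this sketch.
    """
    return subprocess.call(build_gaudi_command(yaml_path))


print(build_gaudi_command("job_0.yaml"))
```

A function of this shape takes a single argument, which makes it easy to map over a list of YAML file names with a process pool.
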
pgaudi.parallel.similarity_parallel(pair_list, cfg)

Helper function that applies similarity.rmsd() in parallel to detect duplicate solutions.

Parameters: pair_list (tuple) – Tuple of two populations; every individual of one population is compared with every individual of the other.
Returns: pairs_selected – List of tuples, each holding a pair of identical individuals.
Return type: list
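
The all-against-all comparison of two populations can be sketched as follows. select_identical_pairs and the is_same callback are hypothetical; in PGaudi the comparison is delegated to similarity.rmsd(), which is replaced here by a toy criterion on an assumed "x" key.

```python
from itertools import product


def select_identical_pairs(pair_list, is_same):
    """Return every (a, b) pair across two populations that is_same accepts."""
    population_a, population_b = pair_list
    return [(a, b) for a, b in product(population_a, population_b)
            if is_same(a, b)]


# Toy individuals: dicts with a single hypothetical coordinate "x".
pop_a = [{"id": 1, "x": 0.0}, {"id": 2, "x": 5.0}]
pop_b = [{"id": 3, "x": 0.1}]
pairs = select_identical_pairs(
    (pop_a, pop_b), lambda a, b: abs(a["x"] - b["x"]) < 0.5)
print([(a["id"], b["id"]) for a, b in pairs])
```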

pgaudi.treatment

pgaudi.treatment.parse_zip(directory)

Function to parse the output zip files of gaudi and store their contents as individuals in a population.

Parameters: directory (str) – Path to the directory where the output zip files are located.
Returns: population – List of individuals represented as dictionaries.
Return type: list
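
A minimal sketch of collecting zip archives from a directory into a list of dictionaries. The dictionary keys ("name", "files") and the archive contents used in the demonstration are assumptions; the actual layout of the individuals built by parse_zip() is not described here.

```python
import glob
import os
import tempfile
import zipfile


def collect_zip_individuals(directory):
    """Return one dict per zip archive found in `directory` (hypothetical layout)."""
    population = []
    for path in sorted(glob.glob(os.path.join(directory, "*.zip"))):
        with zipfile.ZipFile(path) as archive:
            population.append({
                "name": os.path.basename(path),
                "files": archive.namelist(),
            })
    return population


# Demonstration with a throwaway archive; the file names are made up.
with tempfile.TemporaryDirectory() as tmp:
    with zipfile.ZipFile(os.path.join(tmp, "result_000.zip"), "w") as archive:
        archive.writestr("summary.yaml", "score: 1.0\n")
    population = collect_zip_individuals(tmp)

print(population)
```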

pgaudi.similarity

Module for similarity checks and the removal of duplicate solutions.

pgaudi.similarity.remove_equal(pairs_selected, full_pop)

Function to remove duplicate solutions.

Parameters:
  • pairs_selected (list) – List of pairs of identical individuals.
  • full_pop (list) – List of the whole populations of all subprocesses.
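
One way to drop duplicates, given the list of identical pairs, is to keep the first member of each pair and discard the second. This is a sketch under that assumption; drop_duplicates is a hypothetical name, and the documentation does not state which member remove_equal() keeps or what it returns.

```python
def drop_duplicates(pairs_selected, full_pop):
    """Return full_pop without the second member of each identical pair.

    Individuals are dicts (unhashable), so they are tracked by identity.
    """
    to_drop = {id(second) for _first, second in pairs_selected}
    return [ind for ind in full_pop if id(ind) not in to_drop]


a, b, c = {"name": "a"}, {"name": "b"}, {"name": "c"}
unique = drop_duplicates([(a, b)], [a, b, c])
print([ind["name"] for ind in unique])
```
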
pgaudi.similarity.rmsd(ind1, ind2, subjects, threshold, *args, **kwargs)

Function to check whether two individuals are identical solutions.

Parameters:
  • ind1, ind2 (dict) – Dictionaries, each representing one individual.
  • subjects (list) – List of molecules to measure.
  • threshold (float) – Maximum RMSD value at which two individuals are still considered identical. If rmsd > threshold, they are considered different.
Returns: True if both individuals are equal.
Return type: bool
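
The threshold test can be illustrated with a plain RMSD over matched coordinates. This sketch assumes the two structures are given as equally long lists of (x, y, z) tuples in the same atom order and performs no superposition; coordinate_rmsd and is_same_solution are hypothetical names, and how rmsd() extracts coordinates from the individuals and subjects is not covered.

```python
import math


def coordinate_rmsd(coords1, coords2):
    """RMSD between two equally long lists of (x, y, z) tuples, no alignment."""
    if not coords1 or len(coords1) != len(coords2):
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(coords1, coords2)
    )
    return math.sqrt(squared / len(coords1))


def is_same_solution(coords1, coords2, threshold):
    """True when the RMSD does not exceed the threshold."""
    return coordinate_rmsd(coords1, coords2) <= threshold


first = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
second = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # shifted by 1 along z
print(coordinate_rmsd(first, second))
print(is_same_solution(first, second, threshold=0.5))
```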

pgaudi.create_output

Module to generate the two global output files: file.gaudi-output and file.gaudi-log.

pgaudi.create_output.generate_out(population, cfg)

Function to write the global gaudi-output file.

Parameters:
  • population (list) – List of unique individuals.
  • cfg (gaudi.parse.Settings) – gaudi.parse.Settings object of the main process (the input file).

pgaudi.create_output.merge_log(pcfgs, cfg)

Function to merge the gaudi-log files of the different subprocesses.

Parameters: pcfgs, cfg (gaudi.parse.Settings) – gaudi.parse.Settings objects for the YAML files of the subprocesses and of the main process (input file), respectively.
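
Merging per-subprocess logs can be sketched as a simple concatenation with a header per source file. merge_logs, the header format and the file names below are assumptions; merge_log() takes Settings objects rather than paths, and its actual output format is not documented here.

```python
import os
import tempfile


def merge_logs(log_paths, merged_path):
    """Concatenate text logs into merged_path, one header line per source."""
    with open(merged_path, "w") as merged:
        for path in log_paths:
            merged.write("# ---- {} ----\n".format(os.path.basename(path)))
            with open(path) as part:
                content = part.read()
            merged.write(content)
            if not content.endswith("\n"):
                merged.write("\n")
    return merged_path


# Demonstration with two throwaway log files.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for index, text in enumerate(["started\n", "finished"]):
        path = os.path.join(tmp, "job_{}.gaudi-log".format(index))
        with open(path, "w") as handle:
            handle.write(text)
        paths.append(path)
    merged_path = merge_logs(paths, os.path.join(tmp, "job.gaudi-log"))
    with open(merged_path) as handle:
        merged_text = handle.read()

print(merged_text)
```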