deepdrivemd.data.api
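Functions
glob_file_from_dirs(dirs, pattern): Return a list of all items matching pattern in multiple dirs.
Classes
DeepDriveMD_API(experiment_directory)
Stage_API(experiment_dir, stage_dir_name)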
- class deepdrivemd.data.api.DeepDriveMD_API(experiment_directory: Union[str, pathlib.Path])
- AGENT_DIR = 'agent_runs'
- AGGREGATE_DIR = 'aggregation_runs'
- MACHINE_LEARNING_DIR = 'machine_learning_runs'
- MODEL_SELECTION_DIR = 'model_selection_runs'
- MOLECULAR_DYNAMICS_DIR = 'molecular_dynamics_runs'
- static get_initial_pdbs(initial_pdb_dir: Union[str, pathlib.Path]) List[pathlib.Path]
Return a list of PDB paths from the initial_pdb_dir.
- Parameters
initial_pdb_dir (Union[str, Path]) – Initial data directory containing PDBs and optional topologies.
- Returns
List[Path] – List of paths to initial PDB files.
- Raises
ValueError – If any of the PDB file names contain a double underscore __.
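The lookup described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the function name `find_initial_pdbs` is hypothetical, and the layout (one subdirectory per system, with PDB files directly inside it) is an assumption based on the get_topology description.

```python
from pathlib import Path
from typing import List, Union


def find_initial_pdbs(initial_pdb_dir: Union[str, Path]) -> List[Path]:
    """Hypothetical stand-in for get_initial_pdbs (assumed layout: <dir>/<system>/*.pdb)."""
    pdbs = sorted(Path(initial_pdb_dir).glob("*/*.pdb"))
    for pdb in pdbs:
        # A double underscore is reserved for separating the system name
        # from the rest of the file name, so it is rejected in initial files.
        if "__" in pdb.name:
            raise ValueError(f"Initial PDB file name contains '__': {pdb}")
    return pdbs
```

Rejecting `__` up front keeps later name parsing unambiguous, since the system name is recovered by splitting on that separator.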
- get_last_n_md_runs(n: Optional[int] = None, data_file_suffix: str = '.h5', traj_file_suffix: str = '.dcd', structure_file_suffix: str = '.pdb') Dict[str, List[str]]
Get the data file paths from the last n MD run directories.
Return a dictionary of data file paths for the last n MD runs including the training data files, the trajectory files, and the coordinate files.
- Parameters
n (int, optional) – Number of latest MD run directories to glob data files from. Defaults to all MD run directories.
data_file_suffix (str, optional) – The suffix of the training data file. Defaults to “.h5”.
traj_file_suffix (str, optional) – The suffix of the traj file. Defaults to “.dcd”.
structure_file_suffix (str, optional) – The suffix of the structure file. Defaults to “.pdb”.
- Returns
Dict[str, List[str]] – A dictionary with keys “data_files”, “traj_files” and “structure_files”, each containing a list of n paths globbed from the latest n MD run directories.
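The shape of the returned dictionary can be illustrated with a small standard-library sketch. It is an approximation under stated assumptions: `last_n_run_files` is a hypothetical helper, the run directories are passed in explicitly, and "latest" is assumed to mean last in sorted name order.

```python
from pathlib import Path
from typing import Dict, List, Optional, Sequence, Union


def last_n_run_files(
    run_dirs: Sequence[Union[str, Path]],
    n: Optional[int] = None,
    data_file_suffix: str = ".h5",
    traj_file_suffix: str = ".dcd",
    structure_file_suffix: str = ".pdb",
) -> Dict[str, List[str]]:
    """Hypothetical sketch: collect files by suffix from the last n run directories."""
    dirs = sorted(Path(d) for d in run_dirs)  # assumption: name order is run order
    if n is not None:
        # Slice from the end; max() keeps n larger than len(dirs) safe,
        # and n == 0 yields an empty selection.
        dirs = dirs[max(len(dirs) - n, 0):]

    def collect(suffix: str) -> List[str]:
        return [str(p) for d in dirs for p in sorted(d.glob(f"*{suffix}"))]

    return {
        "data_files": collect(data_file_suffix),
        "traj_files": collect(traj_file_suffix),
        "structure_files": collect(structure_file_suffix),
    }
```

If each run directory holds exactly one file per suffix, each list has n entries, and entries at the same position in the three lists belong to the same run.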
- get_restart_pdb(index: int, stage_idx: int = - 1, task_idx: int = 0) Dict[str, Any]
Get a single datum from the restart points JSON file.
- Parameters
index (int) – Index into the agent_{}.json file of the latest DeepDriveMD iteration.
- Returns
Dict[str, Any] – Dictionary entry written by the outlier detector.
- static get_system_name(pdb_file: Union[str, pathlib.Path]) str
Parse the system name from a PDB file.
- Parameters
pdb_file (Union[str, Path]) – The PDB file to parse. Can be absolute path, relative path, or filename.
- Returns
str – The system name used to identify system topology.
Examples
>>> pdb_file = "/path/to/system_name__anything.pdb"
>>> DeepDriveMD_API.get_system_name(pdb_file)
'system_name'
>>> pdb_file = "/path/to/system_name/anything.pdb"
>>> DeepDriveMD_API.get_system_name(pdb_file)
'system_name'
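The two examples suggest a simple parsing rule: take the text before `__` in the file name if the separator is present, otherwise use the parent directory name. The sketch below reproduces that rule with `pathlib`. The name `parse_system_name` is hypothetical and the rule is inferred from the examples, not taken from the library source.

```python
from pathlib import Path
from typing import Union


def parse_system_name(pdb_file: Union[str, Path]) -> str:
    """Hypothetical sketch of the naming rule implied by the examples."""
    pdb_file = Path(pdb_file)
    if "__" in pdb_file.name:
        # "system_name__anything.pdb" -> "system_name"
        return pdb_file.name.split("__")[0]
    # No separator in the file name: fall back to the parent directory name.
    return pdb_file.parent.name


print(parse_system_name("/path/to/system_name__anything.pdb"))  # system_name
print(parse_system_name("/path/to/system_name/anything.pdb"))   # system_name
```

Under this rule, a bare file name without `__` has no parent directory to fall back on, so the sketch returns an empty string in that case.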
- static get_system_pdb_name(pdb_file: Union[str, pathlib.Path]) str
Generate PDB file name with correct system name.
Parse pdb_file for the system name and generate a PDB file name that is parseable by DeepDriveMD. If pdb_file name is already compatible with DeepDriveMD, the returned name will be the same.
- Parameters
pdb_file (Union[str, Path]) – The PDB file to parse. Can be absolute path, relative path, or filename.
- Returns
str – The new PDB file name. File is not created.
- Raises
ValueError – If pdb_file contains more than one __.
Examples
>>> pdb_file = "/path/to/system_name__anything.pdb"
>>> DeepDriveMD_API.get_system_pdb_name(pdb_file)
'system_name__anything.pdb'
>>> pdb_file = "/path/to/system_name/anything.pdb"
>>> DeepDriveMD_API.get_system_pdb_name(pdb_file)
'system_name__anything.pdb'
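A minimal sketch of the renaming rule shown in these examples follows. The helper `system_pdb_name` is hypothetical. Counting `__` in the file name only (not in the whole path) is an assumption, since the documentation says only that pdb_file must not contain more than one `__`.

```python
from pathlib import Path
from typing import Union


def system_pdb_name(pdb_file: Union[str, Path]) -> str:
    """Hypothetical sketch: build a '<system>__<name>.pdb' file name."""
    pdb_file = Path(pdb_file)
    separators = pdb_file.name.count("__")  # assumption: only the file name is checked
    if separators > 1:
        raise ValueError(f"PDB file name contains more than one '__': {pdb_file}")
    if separators == 1:
        # Already in the expected form; return the name unchanged.
        return pdb_file.name
    # Prefix the file name with the parent directory (system) name.
    return f"{pdb_file.parent.name}__{pdb_file.name}"


print(system_pdb_name("/path/to/system_name__anything.pdb"))  # system_name__anything.pdb
print(system_pdb_name("/path/to/system_name/anything.pdb"))   # system_name__anything.pdb
```

Like the documented method, the sketch only computes a name and never touches the filesystem.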
- static get_topology(initial_pdb_dir: Union[str, pathlib.Path], pdb_file: Union[str, pathlib.Path], suffix: str = '.top') Optional[pathlib.Path]
Get the topology file for the system.
Parse pdb_file for the system name and then retrieve the topology file from the correct subdirectory, given by the system name, in the initial_pdb_dir directory, or return None if the system doesn’t have a topology.
- Parameters
initial_pdb_dir (Union[str, Path]) – Initial data directory containing system subdirectories with PDBs and optional topologies.
pdb_file (Union[str, Path]) – The PDB file to parse. Can be absolute path, relative path, or filename.
suffix (str) – Suffix of the topology file (.top, .prmtop, etc).
- Returns
Optional[Path] – The path to the topology file, or None if system has no topology.
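The lookup can be sketched as a glob inside the system subdirectory. This is an illustration under assumptions: `find_topology` is a hypothetical name, the system name is parsed with the `__` or parent-directory rule shown in the get_system_name examples, and the first match in sorted order is returned if several files share the suffix.

```python
from pathlib import Path
from typing import Optional, Union


def find_topology(
    initial_pdb_dir: Union[str, Path],
    pdb_file: Union[str, Path],
    suffix: str = ".top",
) -> Optional[Path]:
    """Hypothetical sketch: locate <initial_pdb_dir>/<system_name>/*<suffix>."""
    pdb_file = Path(pdb_file)
    name = pdb_file.name
    # Assumed parsing rule: text before "__", else the parent directory name.
    system_name = name.split("__")[0] if "__" in name else pdb_file.parent.name
    matches = sorted((Path(initial_pdb_dir) / system_name).glob(f"*{suffix}"))
    # Implicit-solvent systems may have no topology file at all.
    return matches[0] if matches else None
```

Returning None instead of raising lets callers treat a missing topology as a normal case and branch on it.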
- get_total_iterations() int
- static write_pdb(output_pdb_file: Union[str, pathlib.Path], input_pdb_file: Union[str, pathlib.Path], traj_file: Union[str, pathlib.Path], frame: int, in_memory: bool = False) None
Write a PDB file.
Writes output_pdb_file to disk containing the coordinates of a single frame, taken from the trajectory file traj_file and the input PDB file input_pdb_file.
- Parameters
output_pdb_file (Union[str, Path]) – The path of the output PDB file to be written to.
input_pdb_file (Union[str, Path]) – The path of the input PDB file used to open traj_file in MDAnalysis.Universe().
traj_file (Union[str, Path]) – The path of the trajectory file to be read from.
frame (int) – The frame index into traj_file used to write output_pdb_file.
in_memory (bool, optional) – If true, will load the MDAnalysis.Universe() trajectory into memory.
Examples
>>> output_pdb_file = "/path/to/output.pdb"
>>> input_pdb_file = "/path/to/input.pdb"
>>> traj_file = "/path/to/traj.dcd"
>>> frame = 10
>>> DeepDriveMD_API.write_pdb(output_pdb_file, input_pdb_file, traj_file, frame)
- class deepdrivemd.data.api.Stage_API(experiment_dir: pathlib.Path, stage_dir_name: str)
- config_path(stage_idx: int = - 1, task_idx: int = 0) Optional[pathlib.Path]
- static get_count(path: pathlib.Path, pattern: str, is_dir: bool = False) int
- static get_latest(path: pathlib.Path, pattern: str, is_dir: bool = False, key: typing.Callable[[pathlib.Path], pathlib.Path] = <function Stage_API.<lambda>>) Optional[pathlib.Path]
- json_path(stage_idx: int = - 1, task_idx: int = 0) Optional[pathlib.Path]
- read_task_json(stage_idx: int = - 1, task_idx: int = 0) Optional[List[Dict[str, Any]]]
- property runs_dir: pathlib.Path
- stage_dir(stage_idx: int = - 1) Optional[pathlib.Path]
Return the stage directory containing task subdirectories.
Each stage type has a directory containing subdirectories named stageXXXX. Each stageXXXX directory contains several task directories labeled taskXXXX. This function returns the stageXXXX directory selected by stage_idx. Each iteration of DeepDriveMD corresponds to one stageXXXX directory; the directories are labeled in increasing order.
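The directory layout can be illustrated with a short sketch. Two points are assumptions rather than documented behavior: that XXXX is a zero-padded four-digit index, and that the default stage_idx of -1 selects the most recent stage (by analogy with Python's negative indexing). The function `pick_stage_dir` is a hypothetical helper, and the `stage_name` and `task_name` functions here are local illustrations, not the library's static methods.

```python
from pathlib import Path
from typing import Optional


def stage_name(stage_idx: int) -> str:
    # Assumption: four-digit zero padding, e.g. 3 -> "stage0003".
    return f"stage{stage_idx:04d}"


def task_name(task_idx: int) -> str:
    # Assumption: same padding for tasks, e.g. 12 -> "task0012".
    return f"task{task_idx:04d}"


def pick_stage_dir(runs_dir: Path, stage_idx: int = -1) -> Optional[Path]:
    """Hypothetical sketch: return the selected stage directory, or None if absent."""
    if stage_idx == -1:
        # Zero padding makes lexicographic order match numeric order.
        stages = sorted(p for p in runs_dir.glob("stage*") if p.is_dir())
        return stages[-1] if stages else None
    candidate = runs_dir / stage_name(stage_idx)
    return candidate if candidate.is_dir() else None


print(stage_name(3))   # stage0003
print(task_name(12))   # task0012
```

Under these assumptions, a task directory for stage 3, task 12 would be `runs_dir / "stage0003" / "task0012"`.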
- stage_dir_count() int
Return the number of stage directories.
- static stage_name(stage_idx: int) str
- task_dir(stage_idx: int = - 1, task_idx: int = 0, mkdir: bool = False) Optional[pathlib.Path]
- static task_name(task_idx: int) str
- static unique_name(task_path: pathlib.Path) str
- write_task_json(data: List[Dict[str, Any]], stage_idx: int = - 1, task_idx: int = 0) None
Dump data to a new JSON file for the agent.
Dump data to a JSON file written to the directory specified by stage_idx and task_idx.
- Parameters
data (List[Dict[str, Any]]) – List of dictionaries to pass to json.dump(). Values in the dictionaries must be JSON serializable.
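The serialization requirement can be shown with a small round trip. This is a sketch, not the method itself: `dump_task_json` and the file name `example_task.json` are hypothetical, and the task directory is passed in directly instead of being resolved from stage_idx and task_idx.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def dump_task_json(
    task_dir: Path, data: List[Dict[str, Any]], file_name: str = "example_task.json"
) -> Path:
    """Hypothetical sketch: write a list of dictionaries to a JSON file in task_dir."""
    task_dir.mkdir(parents=True, exist_ok=True)
    path = task_dir / file_name
    with open(path, "w") as f:
        json.dump(data, f)
    return path
```

Values such as `pathlib.Path` objects are not JSON serializable and make `json.dump()` raise a `TypeError`, so they need to be converted first, for example with `str(path)`.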
- deepdrivemd.data.api.glob_file_from_dirs(dirs: List[str], pattern: str) List[str]
Return a list of all items matching pattern in multiple dirs.
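A function with this signature can be sketched using the standard `glob` module. The name `glob_from_dirs` is a hypothetical stand-in, and sorting the matches within each directory is an added assumption to make the output order deterministic.

```python
import glob
import os
from typing import List


def glob_from_dirs(dirs: List[str], pattern: str) -> List[str]:
    """Hypothetical sketch: collect all paths matching pattern in each directory."""
    matches: List[str] = []
    for d in dirs:
        # Join the directory and pattern, e.g. "/runs/task0000" + "*.h5".
        matches.extend(sorted(glob.glob(os.path.join(d, pattern))))
    return matches
```

For example, `glob_from_dirs(["/path/to/task0000", "/path/to/task0001"], "*.h5")` would return the `.h5` files from both directories in a single flat list, ordered by directory.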