Merge pull request #2 from Kitaolab/ver1.1
Ver1.1
kh01734 authored Sep 14, 2024
2 parents 9827c26 + 090a0c5 commit ed3120f
Showing 42 changed files with 497 additions and 181 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -164,3 +164,5 @@ trial*
!trial.py
inputs
docs/book/
MSM/*_test.ipynb
test/
53 changes: 27 additions & 26 deletions README.md
@@ -1,26 +1,27 @@
# PaCS-ToolKit
# PaCS-Toolkit

PaCS-ToolKit enables the execution of PaCS-MD (Parallel Cascade Selection Molecular Dynamic Simulation), a non-bias-enhanced sampling method, across various environments. Additionally, it offers tools for result analysis and visualization.
PaCS-Toolkit enables the execution of PaCS-MD (Parallel Cascade Selection Molecular Dynamics Simulation), a non-bias-enhanced sampling method, across various environments. Additionally, it offers tools for result analysis and visualization.
While PaCS-MD offers a wide range of applications with existing evaluation types, our toolkit also allows for the integration of additional types as needed.

We believe our package will benefit your research.

- [PaCS-ToolKit](#pacs-toolkit)
- [PaCS-Toolkit](#pacs-toolkit)
- [Document](#document)
- [Quick install](#quick-install)
- [Example command](#example-command)
- [Citation](#citation)
- [LICENSE](#license)


## Document
- The documentation of PaCS-ToolKit is [here](https://kitaolab.github.io/PaCS-Toolkit/).
- The documentation of PaCS-Toolkit is [here](https://kitaolab.github.io/PaCS-Toolkit/).

## Quick install

<details><summary> 1. Install by pip </summary>

~~~shell
# Install all feautres of PaCS-ToolKit
# Install all features of PaCS-Toolkit
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
~~~

@@ -32,10 +33,10 @@ see [document](https://kitaolab.github.io/PaCS-Toolkit/) for more information.
<details><summary> 2. Install by conda and pip </summary>

~~~shell
conda create -n pacs "python>=3.7" -y
conda create -n pacs "python>=3.8" -y
conda activate pacs

# Install all features of PaCS-ToolKit
# Install all features of PaCS-Toolkit
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
~~~

@@ -44,32 +45,32 @@ see [document](https://kitaolab.github.io/PaCS-Toolkit/) for more information.
</details>


## Example command
```sh
pacs mdrun -t 1 -f input.toml
```
See the help message (`pacs --help`) and the [documentation](https://kitaolab.github.io/PaCS-Toolkit/) for more information.

## Citation
~~~txt
- PaCS-Toolkit
[1] Ikizawa, S.*, Hori, T.*, Wijana, T.N.*, Kono, H., Bai, Z., Kimizono, T., Lu, W., Tran, D.P., & Kitao, A. PaCS-Toolkit: Optimized software utilities for parallel cascade selection molecular dynamics (PaCS-MD) simulations and subsequent analyses. J. Phys. Chem. B. 128, 15, 3631-3642 (2024). https://doi.org/10.1021/acs.jpcb.4c01271
- [1] PaCS-Toolkit: Ikizawa, S.*, Hori, T.*, Wijana, T.N.*, Kono, H., Bai, Z., Kimizono, T., Lu, W., Tran, D.P., & Kitao, A. PaCS-Toolkit: Optimized software utilities for parallel cascade selection molecular dynamics (PaCS-MD) simulations and subsequent analyses. *J. Phys. Chem. B.*, **128**, 15, 3631-3642 (2024). https://doi.org/10.1021/acs.jpcb.4c01271

- Original PaCS-MD or targeted-PaCS-MD (t-PaCS-MD)
[2] Harada, R., & Kitao, A. Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway. J. Chem. Phys. 139, 035103 (2013). https://doi.org/10.1063/1.4813023
- [2] Original PaCS-MD or targeted-PaCS-MD (t-PaCS-MD): Harada, R., & Kitao, A. Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway. *J. Chem. Phys.* **139**, 035103 (2013). https://doi.org/10.1063/1.4813023

- Dissociation PaCS-MD (dPaCS-MD)
[3] Tran, D. P., Takemura, K., Kuwata, K., & Kitao, A. Protein–Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. J. Chem. Theory Comput. 14, 404–417 (2018). https://doi.org/10.1021/acs.jctc.7b00504
[4] Tran, D. P., & Kitao, A. Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model. J. Phys. Chem. B , 123, 11, 2469–2478 (2019). https://doi.org/10.1021/acs.jpcb.8b10309
[5] Hata, H., Phuoc Tran, D., Marzouk Sobeh, M., & Kitao, A. Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model. Biophysics and Physicobiology, 18, 305–31 (2021). https://doi.org/10.2142/biophysico.bppb-v18.037
- [3] Dissociation PaCS-MD (dPaCS-MD): Tran, D. P., Takemura, K., Kuwata, K., & Kitao, A. Protein–Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. *J. Chem. Theory Comput*. **14**, 404–417 (2018). https://doi.org/10.1021/acs.jctc.7b00504

- Application to protein domain motion
[6] Inoue, Y., Ogawa, Y., Kinoshita, M., Terahara, N., Shimada, M., Kodera, N., Ando, T., Namba, K., Kitao, A., Imada, K., & Minamino, T. Structural Insights into the Substrate Specificity Switch Mechanism of the Type III Protein Export Apparatus. Structure, 27 , 965-976 (2019). https://doi.org/10.1016/j.str.2019.03.017
- [4] Dissociation PaCS-MD (dPaCS-MD): Tran, D. P., & Kitao, A. Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model. *J. Phys. Chem. B*, **123**, 11, 2469–2478 (2019). https://doi.org/10.1021/acs.jpcb.8b10309

- Association and dissociation PaCS-MD (a/dPaCS-MD)
[7] Tran, D. P., & Kitao, A. Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding. J. Chem. Theory Comput. 16, 2835–2845 (2020). https://doi.org/10.1021/acs.jctc.9b01203
- [5] Dissociation PaCS-MD (dPaCS-MD): Hata, H., Phuoc Tran, D., Marzouk Sobeh, M., & Kitao, A. Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model. *Biophysics and Physicobiology*, **18**, 305–31 (2021). https://doi.org/10.2142/biophysico.bppb-v18.037

- Edge expansion PaCS-MD (eePaCS-MD)
[8] Takaba, K., Tran, D. P., & Kitao, A. Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins. J. Chem. Phys. 152, 225101 (2020). https://doi.org/10.1063/5.0004654
[9] Takaba, K., Tran, D. P., & Kitao, A. Erratum: "Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins" [J. Chem. Phys. 152, 225101 (2020)]. . J. Chem. Phys. 153, 179902 (2020). https://doi.org/10.1063/5.0032465
- [6] Application to protein domain motion: Inoue, Y., Ogawa, Y., Kinoshita, M., Terahara, N., Shimada, M., Kodera, N., Ando, T., Namba, K., Kitao, A., Imada, K., & Minamino, T. Structural Insights into the Substrate Specificity Switch Mechanism of the Type III Protein Export Apparatus. *Structure*, **27** , 965-976 (2019). https://doi.org/10.1016/j.str.2019.03.017

- rmsdPaCS-MD
[10] Tran, D. P., Taira, Y., Ogawa, T., Misu, R., Miyazawa, Y., & Kitao, A. Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer. Sci Rep 12, 3860 (2022). https://doi.org/10.1038/s41598-022-07792-2
~~~
- [7] Association and dissociation PaCS-MD (a/dPaCS-MD): Tran, D. P., & Kitao, A. Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding. *J. Chem. Theory Comput.*, **16**, 2835–2845 (2020). https://doi.org/10.1021/acs.jctc.9b01203

- [8] Edge expansion PaCS-MD (eePaCS-MD): Takaba, K., Tran, D. P., & Kitao, A. Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins. *J. Chem. Phys.* **152**, 225101 (2020). https://doi.org/10.1063/5.0004654

- [9] Edge expansion PaCS-MD (eePaCS-MD): Takaba, K., Tran, D. P., & Kitao, A. Erratum: "Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins" [J. Chem. Phys. 152, 225101 (2020)]. *J. Chem. Phys.* **153**, 179902 (2020). https://doi.org/10.1063/5.0032465

- [10] rmsdPaCS-MD: Tran, D. P., Taira, Y., Ogawa, T., Misu, R., Miyazawa, Y., & Kitao, A. Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer. *Sci Rep* **12**, 3860 (2022). https://doi.org/10.1038/s41598-022-07792-2


## LICENSE
8 changes: 4 additions & 4 deletions docs/src/fit.md
@@ -17,19 +17,19 @@ pacs fit traj mdtraj -tf ./trial001/cycle001/replica001/prd.xtc -top ./inputs/in

#### for single trial
```shell
pacs fit trial mdtraj -t 1 -s ./trial001/cycle001/replica001/prd.pdb -r ./trial001/cycle001/replica001/prd.pdb -ts "protein" -rs "protein" -tf prd.xtc -p 10
pacs fit trial mdtraj -t 1 -top ./trial001/cycle001/replica001/prd.pdb -r ./trial001/cycle001/replica001/prd.pdb -ts "protein" -rs "protein" -tf prd.xtc -p 10
```

### Arguments

#### for single trajectory
```plaintext
usage: pacs fit mdtraj [-h] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
usage: pacs fit traj mdtraj [-h] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
```
- `-tf, --trj_file` (str):
- file name of the trajectory to be fitted (e.g. `-tf prd.xtc`)
- `-top, --topology` (str):
- topology file path for loading trajectory (e.g. `-s trial001/cycle000/replica001/prd.pdb`)
- topology file path for loading trajectory (e.g. `-top trial001/cycle000/replica001/prd.pdb`)
- `-r, --ref_structure` (str):
- reference structure file path for fitting reference (e.g. `-r trial001/cycle000/replica001/prd.pdb`)
- `-ts, --trj_selection` (str):
@@ -53,7 +53,7 @@ usage: pacs fit trial mdtraj [-h] [-t] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
- `-tf, --trj_file` (str):
- file name of the trajectory to be fitted (e.g. `-tf prd.xtc`)
- `-top, --topology` (str):
- topology file path for loading trajectory (e.g. `-s trial001/cycle000/replica001/prd.pdb`)
- topology file path for loading trajectory (e.g. `-top trial001/cycle000/replica001/prd.pdb`)
- `-r, --ref_structure` (str):
- reference structure file path for fitting reference (e.g. `-r trial001/cycle000/replica001/prd.pdb`)
- `-ts, --trj_selection` (str):
29 changes: 16 additions & 13 deletions docs/src/genfeature.md
@@ -1,14 +1,17 @@
# genfeature
- This command is used after executing `pacs mdrun`.
- This command generates data that will be used for MSM analysis.
- This command supports parallel process.


| feature | mdtraj | gmx | cpptraj |
| ------- | ------ | --- | ------- |
| comdist | o | x | x |
| comvec | o | x | x |
| pca | o | x | x |
| tica | o | x | x |
| rmsd | o | x | x |
| xyz | o | x | x |
- This command should be executed after running `pacs mdrun`.
- It generates feature data in `.npy` format, which is convenient for MSM analysis in Python.
- Feature data files (e.g., `t001c002r010.npy`) are stored in the directory specified with the `-od` option.
- Each `.npy` file contains an `np.array` whose shape is listed in the table below.
- This command supports parallel processing.


The table below lists the currently implemented analysis tools and the shape of the output data in each `.npy` file.


| feature | mdtraj | gmx | cpptraj | shape of `.npy` |
| ------- | ------ | --- | ------- | ---------------------- |
| comdist | o | x | x | (n_frames,) |
| comvec | o | x | x | (n_frames, 3) |
| rmsd | o | x | x | (n_frames,) |
| xyz | o | x | x | (n_frames, n_atoms, 3) |
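The shapes in the table can be checked after loading a feature file with NumPy. The following is only a minimal sketch: it writes placeholder arrays into a temporary directory instead of reading real `pacs genfeature` output, and the frame and atom counts, the per-feature sub-directories, and the reuse of the file name `t001c002r010.npy` are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical sizes, chosen only for this illustration.
n_frames, n_atoms = 5, 4

# Shapes as listed in the table above.
expected_shapes = {
    "comdist": (n_frames,),
    "comvec": (n_frames, 3),
    "rmsd": (n_frames,),
    "xyz": (n_frames, n_atoms, 3),
}

loaded_shapes = {}
with tempfile.TemporaryDirectory() as tmp:
    for feature, shape in expected_shapes.items():
        # Stand-in for the directory that would be given with `-od`.
        out_dir = Path(tmp) / feature
        out_dir.mkdir()
        npy_path = out_dir / "t001c002r010.npy"

        # Placeholder data; real files are written by `pacs genfeature`.
        np.save(npy_path, np.zeros(shape))

        # Loading works the same way for real feature files.
        loaded_shapes[feature] = np.load(npy_path).shape

for feature, shape in loaded_shapes.items():
    print(feature, shape)
```

With real data, only the `np.load` call is needed; the first axis of every array is the frame index, so per-replica arrays can be concatenated along axis 0 before MSM analysis.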
2 changes: 2 additions & 0 deletions docs/src/genfeature/comvec.md
@@ -1,5 +1,7 @@
# comvec
- Center of mass vector
- Calculate the vector between the centers of mass of `s1` and `s2`
- The vector is calculated as `s1` - `s2`
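The sign convention (`s1` minus `s2`) can be illustrated with a small NumPy sketch. This is not the toolkit's implementation: the helper `center_of_mass`, the coordinates, and the masses are hypothetical and only show how a per-frame vector of shape `(n_frames, 3)` arises.

```python
import numpy as np


def center_of_mass(xyz, masses):
    """Mass-weighted mean position per frame.

    xyz has shape (n_frames, n_atoms, 3); masses has shape (n_atoms,).
    """
    weights = masses / masses.sum()
    return np.einsum("fad,a->fd", xyz, weights)


# One frame; selection s1 has two atoms of equal mass, s2 has one atom.
xyz_s1 = np.array([[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]])
masses_s1 = np.array([1.0, 1.0])
xyz_s2 = np.array([[[0.0, 3.0, 0.0]]])
masses_s2 = np.array([12.0])

# COM(s1) = (1, 0, 0) and COM(s2) = (0, 3, 0), so s1 - s2 = (1, -3, 0).
comvec = center_of_mass(xyz_s1, masses_s1) - center_of_mass(xyz_s2, masses_s2)
print(comvec.shape, comvec)
```

Swapping the two selections flips the sign of every component, which matters when the vector is used as a directional feature.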

### Example
- The following example generates features about COM vector for MSM analysis
3 changes: 2 additions & 1 deletion docs/src/genfeature/rmsd.md
@@ -1,5 +1,6 @@
# RMSD
- Root Mean Square Deviation
- Calculate the RMSD relative to the structure specified in `ref`
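As a rough illustration of the quantity being computed, the sketch below evaluates the RMSD of each frame against a reference with NumPy. It assumes the frames are already superimposed on the reference and omits any fitting step; the function name `rmsd_to_reference` and the coordinates are hypothetical and not part of the toolkit.

```python
import numpy as np


def rmsd_to_reference(xyz, ref):
    """RMSD of each frame in xyz (n_frames, n_atoms, 3) to ref (n_atoms, 3).

    No superposition is performed here; frames are assumed to be pre-fitted.
    """
    diff = xyz - ref[np.newaxis, :, :]
    squared_dist = (diff ** 2).sum(axis=2)      # (n_frames, n_atoms)
    return np.sqrt(squared_dist.mean(axis=1))   # (n_frames,)


ref = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])

# Frame 0 equals the reference; frame 1 is the reference shifted by 2 along y.
xyz = np.stack([ref, ref + np.array([0.0, 2.0, 0.0])])

rmsd = rmsd_to_reference(xyz, ref)
print(rmsd.shape, rmsd)
```

The result has one value per frame, matching the `(n_frames,)` shape of the RMSD feature.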

### Example
- The following example generates features about RMSD for MSM analysis
@@ -14,7 +15,7 @@ pacs genfeature rmsd mdtraj -t 1 -tf prd.xtc -top ./inputs/input.gro -ref ./inpu

#### mdtraj
```plaintext
usage: pacs genfeature pca mdtraj [-h] [-tf] [-top] [-od] [-p] [-ref] [-ft] [-fr] [-ct] [-cr]
usage: pacs genfeature rmsd mdtraj [-h] [-tf] [-top] [-od] [-p] [-ref] [-ft] [-fr] [-ct] [-cr]
```

- `-t, --trial` (int):
30 changes: 12 additions & 18 deletions docs/src/install.md
@@ -11,23 +11,22 @@
- [2.2. Install by pip locally](#22-install-by-pip-locally)

## Requirements
- [Python](https://www.python.org/) >= 3.7
- PaCS-ToolKit currently supports 3 simulator
- [Gromacs](https://www.gromacs.org/)
- [Amber](https://ambermd.org/index.php)
- [Namd](https://www.ks.uiuc.edu/Research/namd/)
- [Python](https://www.python.org/) >= 3.7 (but python >= 3.8 is recommended because of deeptime)
- PaCS-Toolkit currently supports 3 simulators
- [Gromacs](https://www.gromacs.org/) >= 2022.2 tested
- [Amber](https://ambermd.org/index.php) >= 2023 tested
- [Namd](https://www.ks.uiuc.edu/Research/namd/) >= 2021-02-20 tested

## 1. Install by pip
### 1.1 Install by conda and pip
~~~shell
conda create -n pacsmd "python>=3.7" -y
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
~~~

- if using whole pacstk function
~~~shell
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
pip install pyemma
~~~

- elif using "pacs mdrun" and analyzer == "mdtraj"
@@ -41,9 +40,9 @@ pip install "pacs @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
~~~

- elif performing MSM
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install "pacs[msm] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
pip install pyemma
~~~

### 1.2. Install by pip
@@ -63,9 +62,9 @@ pip install "pacs @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
~~~

- elif performing MSM
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install "pacs[msm] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
pip install pyemma
~~~


@@ -87,15 +86,14 @@ cd pacsmd-${version}

### 2.1. Install by conda and pip locally
~~~shell
conda create -n pacsmd "python>=3.7" -y
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
~~~

- if using whole pacstk function
- pyemma does not recommend pip-install
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install -e ".[all]"
conda install -c conda-forge pyemma
~~~

- elif using "pacs mdrun" and analyzer == "mdtraj"
Expand All @@ -114,18 +112,15 @@ pip install -e "."
~~~

- elif performing MSM
- pyemma does not recommend pip-install
~~~
pip install -e ".[msm]"
conda install -c conda-forge pyemma
~~~

### 2.2. Install by pip locally
- if using whole pacstk function
- pyemma does not work, conda is recommend
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install -e ".[all]"
pip install pyemma
~~~

- elif using "pacs mdrun" and analyzer == "mdtraj"
@@ -144,8 +139,7 @@ pip install -e "."
~~~

- elif performing MSM
- sometimes pyemma does not work, conda is recommend
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install -e ".[msm]"
pip install pyemma
~~~
44 changes: 29 additions & 15 deletions docs/src/mdrun/inputfile.md
@@ -4,21 +4,22 @@
- input file must be in [toml format](https://toml.io/en/).

*Contents*
- [sample input file](#sample-input-file)
- [basic option](#basic-option)
- [simulator option](#simulator-option)
- [Gromacs](#gromacs)
- [Amber](#amber)
- [NAMD](#namd)
- [analyzer option](#analyzer-option)
- [Target](#target)
- [RMSD](#rmsd)
- [Association](#association)
- [Dissociation](#dissociation)
- [EdgeExpansion](#edgeexpansion)
- [A\_D](#a_d)
- [Template](#template)
- [hidden option (No need to specify)](#hidden-option-no-need-to-specify)
- [Input file](#input-file)
- [sample input file](#sample-input-file)
- [basic option](#basic-option)
- [simulator option](#simulator-option)
- [Gromacs](#gromacs)
- [Amber](#amber)
- [NAMD](#namd)
- [analyzer option](#analyzer-option)
- [Target](#target)
- [RMSD](#rmsd)
- [Association](#association)
- [Dissociation](#dissociation)
- [EdgeExpansion](#edgeexpansion)
- [A\_D](#a_d)
- [Template](#template)
- [hidden option (No need to specify)](#hidden-option-no-need-to-specify)

## sample input file
- please check [here](https://github.com/Kitaolab/PaCS-Toolkit/tree/main/jobscripts)
@@ -94,6 +95,16 @@ rmfile = true # Whether rmfile is executed after trial
- Gromacs index file
- **trajectory_extension: str, required**
- Trajectory file extension. ("." is necessary)
- **nojump: bool, default=false**
- Whether to apply the `-pbc nojump` treatment during feature calculation for selection in `analyzer`, snapshot extraction in `exporter`, and rmmol.
- **Valid only when `analyzer` is also gromacs.**
- If `true`, molecules are allowed to leave the simulation box, which avoids errors in MSM caused by molecules jumping or breaking across the PBC box.
- If `false`, molecules are only made whole by `-pbc mol` and can wrap across the PBC box.
- Note that the output `prd.xtc` files are not processed with these `-pbc` options (only `prd_rmmol.xtc` files are processed).
- This option is recommended when association/dissociation or a_d PaCS-MD is performed using gromacs as both simulator and analyzer.
- `nojump=true` can produce coordinate values large enough to cause overflow or loss-of-significance problems. This will not happen in most cases, but be careful if your ligand is very small and the simulation box is very large.
- When this option is applied, the analyzer can take the distance into account even if the ligand leaves the simulation box.
- This option is not present in the example inputs in the [sample input repository](https://github.com/Kitaolab/PaCS-Toolkit-example/tree/main) because it was added in version 1.1.0.
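The effect of wrapped versus unwrapped coordinates on a distance feature can be sketched in one dimension with NumPy. This is only an illustration under simplifying assumptions (a fixed cubic box, a static protein position, made-up coordinates); it is not how Gromacs or PaCS-Toolkit implement the treatment.

```python
import numpy as np

box = 10.0        # hypothetical box length
protein_x = 5.0   # hypothetical, fixed protein position

# The ligand steadily moves away and crosses the box boundary at x = 10.
true_x = np.array([8.0, 9.5, 11.0, 12.5])

# Coordinates as they appear when wrapped back into the box.
wrapped_x = true_x % box

# "No jump" style unwrapping: any frame-to-frame step larger than half
# the box is treated as a periodic jump and shifted back by one box length.
steps = np.diff(wrapped_x)
steps -= box * np.round(steps / box)
unwrapped_x = wrapped_x[0] + np.concatenate(([0.0], np.cumsum(steps)))

dist_wrapped = np.abs(wrapped_x - protein_x)      # 3.0, 4.5, 4.0, 2.5
dist_unwrapped = np.abs(unwrapped_x - protein_x)  # 3.0, 4.5, 6.0, 7.5

print("wrapped:  ", dist_wrapped)
print("unwrapped:", dist_unwrapped)
```

With wrapped coordinates the distance appears to shrink once the ligand crosses the boundary, whereas the unwrapped coordinates keep growing monotonically. The growing coordinate values are also the source of the precision caveat mentioned above.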

</details>

@@ -107,6 +118,7 @@ topology = "/work/topol.top" # Topology file such as top, parm7, psf,
mdconf = "/work/parameter.mdp" # Parameter file such as mdp, mdin, namd, etc.
index_file = "/work/index.ndx" # Gromacs index file
trajectory_extension = ".xtc" # Trajectory file extension. ("." is necessary)
nojump = true # Whether to apply the nojump treatment (gmx only)
```


@@ -527,6 +539,7 @@ user-defined-variable2 = "hoge"

## hidden option (No need to specify)
<details><summary> click here </summary>

- **cmd_gmx: str**
- Gromacs command (ex. gmx, gmx_mpi)
- will be created from `cmd_serial`
@@ -536,4 +549,5 @@ user-defined-variable2 = "hoge"
- **structure_extension: str**
- Structure file extension
- will be created from `structure`

</details>
6 changes: 3 additions & 3 deletions docs/src/quickstart.md
@@ -26,7 +26,7 @@ git clone https://github.com/Kitaolab/PaCS-Toolkit.git
pip install -e ".[mdtraj]"

# Or install by conda and pip
conda create -n pacsmd "python>=3.7" -y
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
pip install -e ".[mdtraj]"
```
@@ -140,13 +140,13 @@ $ pacs fit trial mdtraj -t 1 -tf prd_rmmol.xtc -top rmmol_top.pdb -r ref.gro -ts
- So if you want to use other specific CVs, you need to write the code yourself.

~~~shell
$ pacs genfeature comdist mdtraj -t 1 -tf prd.xtc -top inputs/example_gromacs/input.gro -s1 "residue 1" -s2 "residue 9"
$ pacs genfeature comdist mdtraj -t 1 -tf prd.xtc -top inputs/example_gromacs/input.gro -s1 "residue 1" -s2 "residue 9"
$ ls
comdist-CV/
~~~


## Step7: Building MSM and predicting free energy
- After extracting CVs, various analyses can be performed on them.
- After extracting CVs, various analyses can be performed on them.
- PaCS-MD is especially compatible with analyses using MSM.
