Commit

Merge pull request #67 from volkamerlab/dev
[v2.0.0] CustomKinFragLib release
PaulaKramer authored Sep 18, 2024
2 parents ab5775c + fd62244 commit 4b801b2
Showing 102 changed files with 172,357 additions and 127 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -74,6 +74,6 @@ jobs:
- name: Run tests
shell: bash -l {0}
run: |
PYTEST_ARGS="--nbval-lax --nbval-current-env --nbval-cell-timeout=1800"
pytest $PYTEST_ARGS
PYTEST_ARGS="--nbval-lax --nbval-current-env --nbval-cell-timeout=7200"
PYTEST_IGNORE="--ignore=notebooks/custom_kinfraglib/2_3_custom_filters_paper.ipynb"
pytest $PYTEST_ARGS $PYTEST_IGNORE
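
The updated step passes `$PYTEST_ARGS $PYTEST_IGNORE` unquoted, so the shell splits both strings on whitespace into separate pytest arguments. The sketch below reproduces the resulting argument list in Python, using `shlex.split` as a stand-in for shell word splitting. That approximation is only assumed to hold here because neither string contains quotes or glob characters. The helper name `build_pytest_command` is hypothetical.

```python
import shlex

# Values copied from the updated workflow step.
PYTEST_ARGS = "--nbval-lax --nbval-current-env --nbval-cell-timeout=7200"
PYTEST_IGNORE = "--ignore=notebooks/custom_kinfraglib/2_3_custom_filters_paper.ipynb"


def build_pytest_command(*arg_strings):
    """Return the argv list that `pytest $A $B` would receive after word splitting."""
    argv = ["pytest"]
    for arg_string in arg_strings:
        # shlex.split approximates unquoted shell expansion for these simple strings.
        argv.extend(shlex.split(arg_string))
    return argv


command = build_pytest_command(PYTEST_ARGS, PYTEST_IGNORE)
for argument in command:
    print(argument)
```

Keeping the ignore flag in its own variable leaves the nbval options untouched when a notebook is added to or removed from the exclusion list.
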
25 changes: 15 additions & 10 deletions README.md
@@ -1,6 +1,6 @@
# KinFragLib: Kinase-focused fragment library

[![GitHub Actions Build Status](https://github.com/volkamerlab/KinFragLib/workflows/CI/badge.svg)](https://github.com/volkamerlab/KinFragLib/actions?query=workflow%3ACI)
[![GitHub Actions Build Status](https://github.com/volkamerlab/KinFragLib/actions/workflows/ci.yml/badge.svg)](https://github.com/volkamerlab/KinFragLib/actions?query=branch%3Amaster+workflow%3ACI)

![KinFragLib workflow](./docs/img/toc_github_kinfraglib.png)

@@ -10,31 +10,36 @@ You can retrieve the repository state for the published KinFragLib paper in rele

## Table of contents

- [Description](#description)
- [Repository content](#repository-content)
- [Description](#description)
- [Quick start](#quick-start)
- [Contact](#contact)
- [License](#license)
- [Citation](#citation)
- [List of publications](#list-of-publications)


## Repository content

This repository holds the following resources:

1. Fragment library data and a link to the combinatorial library data.
2. *Quick start* notebook explaining how to load and use the fragment library.
3. Notebooks covering the full analyses regarding the fragment and combinatorial libraries as described in
the corresponding paper.
3. Notebooks

3.1. *KinFragLib*: Notebooks covering the full analyses regarding the fragment and combinatorial libraries as described in
the corresponding paper.
3.2. *CustomKinFragLib*: Notebooks providing a custom filtering framework to reduce the fragment library size.

Please find detailed description of files in `data/` and `notebooks/` in the folders' `README` files.
Please find detailed descriptions of files in `data/` and `notebooks/` in the folders' `README` files.

## Description

**Exploring the kinase inhibitor space using subpocket-focused fragmentation and recombination**

Protein kinases play a crucial role in many cell signaling processes,
making them one of the most important families of drug targets.
Fragment-based drug design has proven useful as one approach to develop novel kinase inhibitors.
Fragment-based drug design has proven useful as one approach to developing novel kinase inhibitors.
Usually, fragment-based methods follow a knowledge-driven approach, i.e., optimizing a focused set of fragments into
molecular hits.

@@ -46,9 +51,11 @@ well as back pocket 1 and 2 (B1 and B2), based on defined pocket-spanning residu
Each co-crystallized ligand is fragmented using the BRICS algorithm and its fragments are assigned to the respective
subpocket they occupy.
Following this approach, a fragment library is created with respective subpocket pools. This fragment library enables
an in-depth analysis of the chemical space of known kinase inhibitors, and can be used to enumerate recombined
an in-depth analysis of the chemical space of known kinase inhibitors and can be used to enumerate recombined
fragments in order to generate novel potential inhibitors.

We have added an extension, *CustomKinFragLib*, which provides a pipeline that filters the fragments in KinFragLib for unwanted substructures (PAINS and Brenk et al.), drug-likeness (Rule of Three and QED), synthesizability (similarity to buyable building blocks and SYBA), and pairwise retrosynthesizability. Each filter can be activated or deactivated, and its parameters can be modified by the user, to create a customized filtered fragment library.
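
The idea of switchable, parameterized filters can be sketched as follows. This is a minimal illustration and not the CustomKinFragLib implementation: the names `FilterStep`, `passes_ro3`, `passes_qed`, and `run_pipeline` are hypothetical, the descriptor values are made up, and the QED cutoff of 0.5 is an arbitrary placeholder. The Rule of Three limits used are the four commonly cited upper bounds (molecular weight 300, logP 3, at most 3 hydrogen-bond donors and 3 acceptors). The sketch also assumes that a fragment is kept only if every active filter passes, which may differ from how the actual pipeline combines filter results.

```python
from dataclasses import dataclass, field
from typing import Callable

# Commonly cited Rule of Three upper bounds; treated here as adjustable defaults.
RO3_LIMITS = {"mw": 300.0, "logp": 3.0, "hbd": 3, "hba": 3}


def passes_ro3(descriptors, limits=None):
    """True if every descriptor is at or below its Rule of Three limit."""
    limits = RO3_LIMITS if limits is None else limits
    return all(descriptors[name] <= limit for name, limit in limits.items())


def passes_qed(descriptors, min_qed=0.5):
    """True if the precomputed QED score reaches the (hypothetical) cutoff."""
    return descriptors["qed"] >= min_qed


@dataclass
class FilterStep:
    name: str
    check: Callable
    enabled: bool = True
    params: dict = field(default_factory=dict)


def run_pipeline(fragments, steps):
    """Keep fragments that pass every enabled step (assumed combination rule)."""
    active = [step for step in steps if step.enabled]
    return [
        fragment
        for fragment in fragments
        if all(step.check(fragment, **step.params) for step in active)
    ]


# Made-up descriptor records standing in for precomputed fragment properties.
fragments = [
    {"id": "frag_a", "mw": 150.2, "logp": 1.1, "hbd": 1, "hba": 2, "qed": 0.62},
    {"id": "frag_b", "mw": 342.4, "logp": 2.0, "hbd": 2, "hba": 3, "qed": 0.71},
    {"id": "frag_c", "mw": 210.3, "logp": 2.8, "hbd": 0, "hba": 3, "qed": 0.35},
]

strict = [
    FilterStep("ro3", passes_ro3),
    FilterStep("qed", passes_qed, params={"min_qed": 0.5}),
]
relaxed = [
    FilterStep("ro3", passes_ro3),
    FilterStep("qed", passes_qed, enabled=False),
]

strict_ids = [f["id"] for f in run_pipeline(fragments, strict)]
relaxed_ids = [f["id"] for f in run_pipeline(fragments, relaxed)]
print(strict_ids)   # ['frag_a']
print(relaxed_ids)  # ['frag_a', 'frag_c']
```

Representing each filter as data (a flag plus a parameter dictionary) means a customized library is produced by changing configuration rather than code.
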

## Quick start

1. Clone this repository.
@@ -98,7 +105,7 @@ We are looking forward to hearing from you!

## License

This resource is licensed under the [MIT](https://opensource.org/licenses/MIT) license, a permissive open source license.
This resource is licensed under the [MIT](https://opensource.org/licenses/MIT) license, a permissive open-source license.

## Citation

@@ -146,5 +153,3 @@ Backenköhler M, Groß J, Wolf V, Volkamer A.
- **Constructing Innovative Covalent and Noncovalent Compound Libraries: Insights from 3D Protein–Ligand Interactions** Xiaohe Xu, Weijie Han, Xiangzhen Ning, Chengdong Zang, Chengcheng Xu, Chen Zeng, Chengtao Pu, Yanmin Zhang, Yadong Chen, and Haichun Liu *Journal of Chemical Information and Modeling* **2024**[10.1021/acs.jcim.3c01689](https://pubs.acs.org/doi/10.1021/acs.jcim.3c01689)




3 changes: 3 additions & 0 deletions data/README.md
@@ -9,3 +9,6 @@ Overview of data content:
- `fragment_library_reduced/`: Reduced fragment library: Select a diverse set of fragments (per subpocket) for recombination starting from the filtered fragment library.
- `combinatorial_library/`: Combinatorial library based on the reduced fragment library.
- `external/`: Data from external resources.
- `filters/`: Data used for custom filters.
- `fragment_library_custom_filtered/`: Custom filtered fragment library: Pre-filtered (remove pool X, deduplicate per subpocket, remove unfragmented ligands, remove all fragments that connect only to pool X), and filtered for unwanted substructures (PAINS and Brenk), drug-likeness (Ro3 and QED), synthesizability (buyable building blocks and SYBA) and pairwise retrosynthesizability (using ASKCOS).
- `fragment_library_old/`: Full fragment library v1.1.0 which was described in the KinFragLib paper.
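
The pre-filtering steps listed for `fragment_library_custom_filtered/` can be sketched in plain Python. The record layout is an assumption made for illustration only: the keys `smiles`, `subpocket`, and `connects_to` and the function `prefilter` are hypothetical and do not reflect the library's actual data format, and duplicates are assumed to be identified by identical SMILES within a subpocket. The steps are applied in the order in which they are listed.

```python
def prefilter(fragments):
    """Apply the four pre-filtering steps in the order listed (assumed semantics)."""
    # 1. Remove pool X.
    kept = [f for f in fragments if f["subpocket"] != "X"]

    # 2. Deduplicate per subpocket; the first occurrence wins.
    seen = set()
    unique = []
    for f in kept:
        key = (f["subpocket"], f["smiles"])
        if key not in seen:
            seen.add(key)
            unique.append(f)

    # 3. Remove unfragmented ligands, i.e. records without any connection.
    connected = [f for f in unique if f["connects_to"]]

    # 4. Remove fragments whose connections all point to pool X.
    return [f for f in connected if any(n != "X" for n in f["connects_to"])]


# Made-up records for illustration.
fragments = [
    {"smiles": "c1ccncc1", "subpocket": "B1", "connects_to": ["B2"]},
    {"smiles": "c1ccncc1", "subpocket": "B1", "connects_to": ["B2"]},  # duplicate
    {"smiles": "CCO", "subpocket": "X", "connects_to": ["B1"]},        # pool X
    {"smiles": "CCN", "subpocket": "B2", "connects_to": []},           # unfragmented
    {"smiles": "CCC", "subpocket": "B2", "connects_to": ["X"]},        # only to X
    {"smiles": "c1ccncc1", "subpocket": "B2", "connects_to": ["B1", "X"]},
]

filtered = prefilter(fragments)
print(len(filtered))  # 2
```

The same SMILES survives in both B1 and B2 because deduplication is done per subpocket, not across the whole library.
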
4 changes: 2 additions & 2 deletions data/combinatorial_library/README.md
@@ -9,13 +9,13 @@ In order to run the analysis notebooks, please download this dataset to this fol

## Raw data

- `combinatorial_library.json`: Full combinatorial library, please refer to `notebooks/4_1_combinatorial_library_data_preparation.ipynb` at https://github.com/volkamerlab/KinFragLib for detailed information about this data format
- `combinatorial_library.json`: Full combinatorial library, please refer to `notebooks/kinfraglib/4_1_combinatorial_library_data_preparation.ipynb` at https://github.com/volkamerlab/KinFragLib for detailed information about this data format
- `combinatorial_library_deduplicated.json`: Deduplicated combinatorial library (based on InChIs)
- `chembl_standardized_inchi.csv`: Standardized ChEMBL 33 molecules in the form of InChI strings.

## Processed data

Data extracted from `combinatorial_library_deduplicated.json`, performed in `notebooks/4_1_combinatorial_library_data_preparation.ipynb` at https://github.com/volkamerlab/KinFragLib.
Data extracted from `combinatorial_library_deduplicated.json`, performed in `notebooks/kinfraglib/4_1_combinatorial_library_data_preparation.ipynb` at https://github.com/volkamerlab/KinFragLib.

- `n_atoms.csv`: Number of atoms for each recombined ligand
- `ro5.csv`: Number of ligands that fulfill Lipinski's rule of five (Ro5) and its individual criteria; number of ligands in total
3 changes: 3 additions & 0 deletions data/filters/Brenk/README.md
@@ -0,0 +1,3 @@
# Brenk et al.

- `unwanted_substructures.csv`: File with unwanted substructures provided by Brenk et al. [(Chem. Med. Chem. (2008), 3, 535-44)](https://chemistry-europe.onlinelibrary.wiley.com/doi/full/10.1002/cmdc.200700139) containing the name and the SMARTS string of the unwanted substructure.
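
A minimal sketch of reading such a name/SMARTS file with the standard library is given below. It assumes that the two columns are separated by whitespace, as they appear in this diff view; the real file may use a different delimiter that the page rendering flattened. The function name `load_unwanted_substructures` is hypothetical. Several names occur more than once in the file (for example `chinone`), so the sketch keeps a list of pairs instead of a dictionary keyed by name, which would silently drop patterns.

```python
import io
from collections import defaultdict


def load_unwanted_substructures(handle):
    """Parse 'name smarts' rows into (name, smarts) tuples, skipping the header."""
    entries = []
    for line_number, line in enumerate(handle):
        line = line.strip()
        if line_number == 0 or not line:
            continue  # header row or blank line
        # Split on the first run of whitespace only (assumed delimiter).
        name, smarts = line.split(None, 1)
        entries.append((name, smarts))
    return entries


# A few rows copied from the diff, used in place of the real file.
sample = io.StringIO(
    "name smarts\n"
    "chinone C1(=[O,N])C=CC(=[O,N])C=C1\n"
    "chinone C1(=[O,N])C(=[O,N])C=CC=C1\n"
    "thiol [SH]\n"
)

entries = load_unwanted_substructures(sample)

# Group patterns by name so repeated names keep all of their SMARTS.
patterns_by_name = defaultdict(list)
for name, smarts in entries:
    patterns_by_name[name].append(smarts)

print(len(entries))                      # 3
print(len(patterns_by_name["chinone"]))  # 2
```

Matching these patterns against molecules additionally requires a cheminformatics toolkit such as RDKit, which is left out here to keep the sketch dependency-free.
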
105 changes: 105 additions & 0 deletions data/filters/Brenk/unwanted_substructures.csv
@@ -0,0 +1,105 @@
name smarts
>2EsterGroups C(=O)O[C,H1].C(=O)O[C,H1].C(=O)O[C,H1]
2-haloPyridine n1c([F,Cl,Br,I])cccc1
acidHalide C(=O)[Cl,Br,I,F]
acyclic-C=C-O C=[C!r]O
acylCyanide N#CC(=O)
acylHydrazine C(=O)N[NH2]
aldehyde [CH1](=O)
Aliphatic-long-chain [R0;D2][R0;D2][R0;D2][R0;D2]
alkyl-halide [CX4][Cl,Br,I]
amidotetrazole c1nnnn1C=O
aniline c1cc([NH2])ccc1
azepane [CH2R2]1N[CH2R2][CH2R2][CH2R2][CH2R2][CH2R2]1
Azido-group N=[N+]=[N-]
Azo-group N#N
azocane [CH2R2]1N[CH2R2][CH2R2][CH2R2][CH2R2][CH2R2][CH2R2]1
benzidine [cR2]1[cR2][cR2]([Nv3X3,Nv4X4])[cR2][cR2][cR2]1[cR2]2[cR2][cR2][cR2]([Nv3X3,Nv4X4])[cR2][cR2]2
betaketo/anhydride [C,c](=O)[CX4,CR0X3,O][C,c](=O)
biotin-analogue C12C(NC(N1)=O)CSC2
Carbo-cation/anion [C+,c+,C-,c-]
catechol c1c([OH])c([OH,NH2,NH])ccc1
charged-oxygen/sulfur-atoms [O+,o+,S+,s+]
chinone C1(=[O,N])C=CC(=[O,N])C=C1
chinone C1(=[O,N])C(=[O,N])C=CC=C1
conjugated-nitrile-group C=[C!r]C#N
crown-ether [OR2,NR2]@[CR2]@[CR2]@[OR2,NR2]@[CR2]@[CR2]@[OR2,NR2]
cumarine c1ccc2c(c1)ccc(=O)o2
cyanamide N[CH2]C#N
cyanate/aminonitrile/thiocyanate [N,O,S]C#N
cyanohydrins N#CC[OH]
cycloheptane [CR2]1[CR2][CR2][CR2][CR2][CR2][CR2]1
cycloheptane [CR2]1[CR2][CR2]cc[CR2][CR2]1
cyclooctane [CR2]1[CR2][CR2][CR2][CR2][CR2][CR2][CR2]1
cyclooctane [CR2]1[CR2][CR2]cc[CR2][CR2][CR2]1
diaminobenzene [cR2]1[cR2]c([N+0X3R0,nX3R0])c([N+0X3R0,nX3R0])[cR2][cR2]1
diaminobenzene [cR2]1[cR2]c([N+0X3R0,nX3R0])[cR2]c([N+0X3R0,nX3R0])[cR2]1
diaminobenzene [cR2]1[cR2]c([N+0X3R0,nX3R0])[cR2][cR2]c1([N+0X3R0,nX3R0])
diazo-group [N!R]=[N!R]
diketo-group [C,c](=O)[C,c](=O)
disulphide SS
enamine [CX2R0][NX3R0]
ester-of-HOBT C(=O)Onnn
four-member-lactones C1(=O)OCC1
halogenated-ring c1cc([Cl,Br,I,F])cc([Cl,Br,I,F])c1[Cl,Br,I,F]
halogenated-ring c1ccc([Cl,Br,I,F])c([Cl,Br,I,F])c1[Cl,Br,I,F]
heavy-metal [Hg,Fe,As,Sb,Zn,Se,se,Te,B,Si]
het-C-het-not-in-ring [NX3R0,NX4R0,OR0,SX2R0][CX4][NX3R0,NX4R0,OR0,SX2R0]
hydantoin C1NC(=O)NC(=O)1
hydrazine N[NH2]
hydroquinone [OH]c1ccc([OH,NH2,NH])cc1
hydroxamic-acid C(=O)N[OH]
imine C=[N!R]
imine N=[CR0][N,n,O,S]
iodine I
isocyanate N=C=O
isolate-alkene [$([CH2]),$([CH][CX4]),$(C([CX4])[CX4])]=[$([CH2]),$([CH][CX4]),$(C([CX4])[CX4])]
ketene C=C=O
methylidene-1,3-dithiole S1C=CSC1=S
Michael-acceptor C=!@CC=[O,S]
Michael-acceptor [$([CH]),$(CC)]#CC(=O)[C,c]
Michael-acceptor [$([CH]),$(CC)]#CS(=O)(=O)[C,c]
Michael-acceptor C=C(C=O)C=O
Michael-acceptor [$([CH]),$(CC)]#CC(=O)O[C,c]
N-oxide [NX2,nX3][OX1]
N-acyl-2-amino-5-mercapto-1,3,4-thiadiazole s1c(S)nnc1NC=O
N-C-halo NC[F,Cl,Br,I]
N-halo [NX3,NX4][F,Cl,Br,I]
N-hydroxyl-pyridine n[OH]
nitro-group [N+](=O)[O-]
N-nitroso [#7]-N=O
oxime [C,c]=N[OH]
oxime [C,c]=NOC=O
Oxygen-nitrogen-single-bond [OR0,NR0][OR0,NR0]
perfluorinated-chain [CX4](F)(F)[CX4](F)F
peroxide OO
phenol-ester c1ccccc1OC(=O)[#6]
phenyl-carbonate c1ccccc1OC(=O)O
phosphor-P-phthalimide [cR,CR]~C(=O)NC(=O)~[cR,CR]
Polycyclic-aromatic-hydrocarbon a1aa2a3a(a1)A=AA=A3=AA=A2
Polycyclic-aromatic-hydrocarbon a21aa3a(aa1aaaa2)aaaa3
Polycyclic-aromatic-hydrocarbon a31a(a2a(aa1)aaaa2)aaaa3
polyene [CR0]=[CR0][CR0]=[CR0]
quaternary-nitrogen [s,S,c,C,n,N,o,O]~[nX3+,NX3+](~[s,S,c,C,n,N])~[s,S,c,C,n,N]
quaternary-nitrogen [s,S,c,C,n,N,o,O]~[n+,N+](~[s,S,c,C,n,N,o,O])(~[s,S,c,C,n,N,o,O])~[s,S,c,C,n,N,o,O]
quaternary-nitrogen [*]=[N+]=[*]
saponine-derivative O1CCCCC1OC2CCC3CCCCC3C2
silicon-halogen [Si][F,Cl,Br,I]
stilbene c1ccccc1C=Cc2ccccc2
sulfinic-acid [SX3](=O)[O-,OH]
Sulfonic-acid [C,c]S(=O)(=O)O[C,c]
Sulfonic-acid S(=O)(=O)[O-,OH]
sulfonyl-cyanide S(=O)(=O)C#N
sulfur-oxygen-single-bond [SX2]O
sulphate OS(=O)(=O)[O-]
sulphur-nitrogen-single-bond [SX2H0][N]
Thiobenzothiazole c12ccccc1(SC(S)=N2)
thiobenzothiazole c12ccccc1(SC(=S)N2)
Thiocarbonyl-group [C,c]=S
thioester SC=O
thiol [S-]
thiol [SH]
Three-membered-heterocycle *1[O,S,N]*1
triflate OS(=O)(=O)C(F)(F)F
triphenyl-methylsilyl [SiR0,CR0](c1ccccc1)(c2ccccc2)(c3ccccc3)
triple-bond C#C