Merge pull request #45 from saezlab/42-expand-method-utilities
42 expand method utilities
vicpaton authored Aug 5, 2024
2 parents ac65afc + 2ef398c commit 2d80616
Showing 15 changed files with 510 additions and 61 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -35,4 +35,4 @@ api
report*
trace*
work/
*.png
test.ipynb
Binary file added docs/src/_static/ap.png
Binary file added docs/src/_static/ppr.png
Binary file added docs/src/_static/reach.png
Binary file added docs/src/_static/sign.png
Binary file added docs/src/_static/sp.png
43 changes: 28 additions & 15 deletions docs/src/api.rst
@@ -60,9 +60,6 @@ CORNETO
:recursive:

methods.run_corneto_carnival
methods.to_cornetograph
methods.to_networkx



Prior Knowledge
@@ -120,7 +117,9 @@ PANACEA
:toctree: api
:recursive:

data.omics.panacea
data.omics.panacea_experiments
data.omics.panacea_datatypes
data.omics.panacea_tables

scPerturb
~~~~~~~~~
@@ -164,17 +163,6 @@ NCI60
data.omics.nci60_table


Other
~~~~~~~~
.. module::networkcommons.data.omics
.. currentmodule:: networkcommons

.. autosummary::
:toctree: api
:recursive:

data.omics.moon


Evaluation and description
==========================
@@ -231,3 +219,28 @@ Visualization
visual.plot_density
visual.plot_scatter
visual.plot_rank


Utilities
=========

.. module:: networkcommons.utils
.. currentmodule:: networkcommons

.. autosummary::
:toctree: api
:recursive:


utils.to_cornetograph
utils.to_networkx
utils.read_network_from_file
utils.network_from_df
utils.get_subnetwork
utils.decoupler_formatter
utils.targetlayer_formatter
utils.handle_missing_values
utils.subset_df_with_nodes
utils.node_attrs_from_corneto
utils.edge_attrs_from_corneto

8 changes: 8 additions & 0 deletions docs/src/contents.rst
@@ -10,12 +10,20 @@ NetworkCommons: Table of Contents
installation
api

.. toctree::
:maxdepth: 2
:caption: Details

datasets
methods


.. toctree::
:maxdepth: 2
:caption: Contribution guidelines

guidelines/guide_1_data
guidelines/guide_2_methods


.. toctree::
114 changes: 114 additions & 0 deletions docs/src/datasets.rst
@@ -0,0 +1,114 @@
####
Data
####
NetworkCommons provides a collection of omics datasets and prior knowledge resources. The datasets are available as files that can be downloaded and used for further analysis. The prior knowledge resources are available as networks (either Network objects or pandas DataFrames).
All the data can be accessed via the NetworkCommons API.

----------
Omics data
----------
Below, we provide a list of all the omics datasets currently available in NetworkCommons. For each dataset, we provide a link to the original publication, a description, processing details (if applicable), and a link to the data location.


DecryptM
--------

**Alias:** decryptm

**Description:** Drug perturbation proteomics and phosphoproteomics data

**Publication Link:** `Jana Zecha et al. Decrypting drug actions and protein modifications by dose- and time-resolved proteomics. Science 380,93-101(2023). <https://doi.org/10.1126/science.ade3925>`_

**Data location:** `PRIDE <https://ftp.pride.ebi.ac.uk/pride/data/archive/2023/03/PXD037285/>`_

**Detailed Description:** This dataset contains the profiling of 31 cancer drugs in 13 human cancer cell line models, resulting in 1.8 million dose-response curves. The data includes 47,502 regulated phosphopeptides, 7,316 ubiquitinylated peptides, and 546 regulated acetylated peptides.
NetworkCommons provides files that contain, for each phosphosite, the EC50 values obtained by fitting the intensity values of the 10 drug concentration points to a four-parameter logistic function.
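
As an illustration of the curve model mentioned above, the following is a minimal sketch of a four-parameter logistic function in one common parametrization. The parameter names (``front``, ``back``, ``slope``) and all example values are assumptions made for illustration; the exact parametrization and fitting procedure used by the DecryptM authors are not described here.

```python
def four_parameter_logistic(dose, front, back, ec50, slope):
    """Response at `dose` under a four-parameter logistic model.

    front: response as the dose approaches zero
    back:  response as the dose becomes very large
    ec50:  dose at which the response is halfway between front and back
    slope: steepness of the transition (Hill-type coefficient)
    """
    if dose < 0 or ec50 <= 0:
        raise ValueError("dose must be non-negative and ec50 must be positive")
    return back + (front - back) / (1.0 + (dose / ec50) ** slope)


# Hypothetical example: 10 concentration points in arbitrary units.
doses = [1, 3, 10, 30, 100, 300, 1000, 3000, 10000, 30000]
curve = [
    four_parameter_logistic(d, front=1.0, back=0.2, ec50=100.0, slope=1.5)
    for d in doses
]
```

In this form, the EC50 is the dose at which the modelled response lies exactly midway between the two plateaus, which is why it serves as a per-phosphosite summary of potency.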


PANACEA
-------

**Alias:** panacea

**Description:** Pancancer Analysis of Chemical Entity Activity RNA-Seq data

**Publication Link:** `Eugene F. Douglass et al. A community challenge for a pancancer drug mechanism of action inference from perturbational profile data. Cell Reports Medicine (2022). <https://doi.org/10.1016/j.xcrm.2021.100492>`_

**Data location:** `NCBI GEO <https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE186341>`_

**Detailed Description:** PANACEA contains dose-response and perturbational profiles for 32 kinase inhibitors in 11 cancer cell lines, in addition to a DMSO control. Originally, this resource served as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions.
NetworkCommons provides the raw count data and metadata files, as retrieved from the original page. In addition, differential expression and TF activity tables are provided.

**Data processing:** The differential expression statistics were obtained via `FLOP <https://doi.org/10.1093/nar/gkae552>`_, using filterByExpr and DESeq2, one of the top-performing combinations in the benchmarking study.
The contrasts were set, per cell line, between each drug and the DMSO control. The TF activity tables were also obtained via `FLOP <https://doi.org/10.1093/nar/gkae552>`_, using univariate linear models as implemented
in `decoupler <https://doi.org/10.1093/bioadv/vbac016>`_.
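
The contrast design described above (each drug against the DMSO control, within each cell line) can be sketched as follows. This is not the FLOP implementation: the function name ``build_contrasts``, the metadata keys ``cell_line`` and ``treatment``, and the sample values are hypothetical and only illustrate the pairing logic.

```python
def build_contrasts(samples, control="DMSO"):
    """Return (cell_line, drug, control) tuples, one per drug and cell line.

    `samples` is a list of dicts with the hypothetical keys
    'cell_line' and 'treatment'.
    """
    treatments_by_line = {}
    for sample in samples:
        treatments_by_line.setdefault(sample["cell_line"], set()).add(
            sample["treatment"]
        )

    contrasts = []
    for cell_line in sorted(treatments_by_line):
        treatments = treatments_by_line[cell_line]
        # A cell line without control samples cannot be contrasted.
        if control not in treatments:
            continue
        for drug in sorted(treatments - {control}):
            contrasts.append((cell_line, drug, control))
    return contrasts


# Hypothetical metadata rows; real PANACEA metadata may use other columns.
example_samples = [
    {"cell_line": "cell_A", "treatment": "DMSO"},
    {"cell_line": "cell_A", "treatment": "drug_1"},
    {"cell_line": "cell_A", "treatment": "drug_2"},
    {"cell_line": "cell_B", "treatment": "drug_1"},
]
example_contrasts = build_contrasts(example_samples)
```

Here ``cell_B`` yields no contrast because it has no DMSO sample, which mirrors the requirement that every comparison is made against a control from the same cell line.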


CPTAC
-----

**Alias:** CPTAC

**Description:** Clinical Proteomic Tumor Analysis Consortium data

**Publication Link:** `Ellis, M. J. et al. Connecting genomic alterations to cancer biology with proteomics: the NCI Clinical Proteomic Tumor Analysis Consortium. Cancer Discov. 3, 1108–1112 (2013). <https://doi.org/10.1158/2159-8290.CD-13-0219>`_

**Data location:** `NIH NCI Proteomic Data Commons <https://pdc.cancer.gov/pdc/cptac-pancancer>`_

**Detailed Description:** This dataset contains data from the Clinical Proteomic Tumor Analysis Consortium. It includes various cancer types and proteomic data.
We included only the data processed by the University of Michigan team's pipeline, and then post-processed by the Baylor College of Medicine's pipeline. Details
can be found in the STAR Methods of `'Proteogenomic Data and Resources for Pan-Cancer Analysis' <https://doi.org/10.1016/j.ccell.2023.06.009>`_ (i.e., 'BCM pipeline for pan-cancer multi-omics data harmonization').


NCI60
-----

**Alias:** NCI60

**Description:** NCI-60 cell line data

**Publication Link:** `Shoemaker, R. The NCI60 human tumour cell line anticancer drug screen. Nat Rev Cancer 6, 813–823 (2006). <https://doi.org/10.1038/nrc1951>`_

**Data location:** `COSMOS R package - Bioconductor <https://www.bioconductor.org/packages/release/bioc/html/cosmosR.html>`_

**Detailed Description:** This dataset contains data from the NCI-60 cell line panel. It includes three files: TF activities from transcriptomics data, metabolite abundances, and gene reads.

---------------
Prior Knowledge
---------------
Below, we provide a list of all the prior knowledge resources currently available in NetworkCommons. For each resource, we provide a description and a link to the original publication.

OmniPath
--------

**Alias:** omnipath

**Description:** OmniPath database

**Publication Link:** `Türei, D. et al. OmniPath: guidelines and gateway for literature-curated signaling pathway resources. Nat Methods 13, 966–967 (2016). <https://doi.org/10.1038/nmeth.4077>`_

**Detailed Description:** OmniPath is a comprehensive collection of signaling pathways and regulatory interactions. Currently, NetworkCommons includes the signed and directed PPI network that can be obtained from Omnipath.Interactions.
Our aim is to expand the API to more data sources within OmniPath. For more information, please refer to the `OmniPath website <https://omnipathdb.org/>`_ and the `OmniPath documentation page <https://omnipath.readthedocs.io/>`_.

Liana
-----

**Alias:** liana

**Description:** Liana database

**Publication Link:** `Dimitrov, D., Türei, D., Garrido-Rodriguez, M. et al. Comparison of methods and resources for cell-cell communication inference from single-cell RNA-Seq data. Nat Commun 13, 3224 (2022). <https://doi.org/10.1038/s41467-022-30755-0>`_

**Detailed Description:** The Prior Knowledge from Liana contains ligand-receptor interactions. For more information, please refer to the `Liana documentation page <https://liana-py.readthedocs.io/en/latest/>`_.

PhosphositePlus
---------------

**Alias:** phosphositeplus

**Description:** PhosphositePlus database

**Publication Link:** `Hornbeck, P. V. et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res 43, D512–D520 (2015). <https://doi.org/10.1093/nar/gku1267>`_

**Detailed Description:** PhosphositePlus is a comprehensive resource that contains, among other PTM interactions, kinase-substrate interactions, which can be used to infer kinase activities from phosphoproteomics data.
For more information, please refer to the `PhosphositePlus website <https://www.phosphosite.org/>`_.
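
To make the idea of inferring kinase activities from kinase-substrate interactions concrete, here is a deliberately simple sketch that scores each kinase by the mean change of its measured substrate sites. It is an assumption-laden illustration, not the method used by NetworkCommons or any specific tool; the function name ``kinase_scores``, the identifiers, and the values are all hypothetical, and dedicated tools use more robust statistics.

```python
from statistics import mean


def kinase_scores(substrate_map, site_changes, min_sites=2):
    """Score each kinase by the mean change of its measured substrate sites.

    substrate_map: dict mapping a kinase to a list of substrate site IDs
    site_changes:  dict mapping a site ID to its measured change
                   (for example, a log fold change)
    min_sites:     minimum number of measured sites needed to score a kinase
    """
    scores = {}
    for kinase, sites in substrate_map.items():
        observed = [site_changes[s] for s in sites if s in site_changes]
        if len(observed) >= min_sites:
            scores[kinase] = mean(observed)
    return scores


# Hypothetical kinase-substrate sets and phosphosite measurements.
example_map = {
    "KIN_A": ["P1_S10", "P2_T20", "P4_S3"],
    "KIN_B": ["P3_Y5"],
}
example_changes = {"P1_S10": 1.0, "P2_T20": 3.0, "P3_Y5": -2.0}
example_scores = kinase_scores(example_map, example_changes)
```

Kinases with too few measured substrates are skipped rather than scored, since a mean over one site says little about the kinase itself.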
51 changes: 18 additions & 33 deletions docs/src/guidelines/guide_1_data.ipynb
@@ -29,8 +29,6 @@
"cell_type": "markdown",
"metadata": {},
"source": [
".. code-block:: yaml\n",
"\n",
" NCI60:\n",
" name: NCI60\n",
" description: NCI-60 cell line data\n",
@@ -51,7 +49,16 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import networkcommons as nc"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
@@ -97,31 +104,17 @@
" <td>PANACEA contains dose-response and perturbational profiles for 32 kinase inhibitors in 11 cancer cell lines, in addition to a DMSO control. Originally, this resource served as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions.</td>\n",
" </tr>\n",
" <tr>\n",
" <th>moon</th>\n",
" <td>MOON</td>\n",
" <td>Database files for running MOON</td>\n",
" <td>https://example.com/moon</td>\n",
" <td>This dataset contains database files required for running the MOON software.</td>\n",
" </tr>\n",
" <tr>\n",
" <th>cosmos</th>\n",
" <td>COSMOS</td>\n",
" <td>Database files for running COSMOS (MetaPKN)</td>\n",
" <td>https://example.com/cosmos</td>\n",
" <td>This dataset includes database files for the COSMOS software (MetaPKN).</td>\n",
" </tr>\n",
" <tr>\n",
" <th>CPTAC</th>\n",
" <td>CPTAC</td>\n",
" <td>Clinical Proteomic Tumor Analysis Consortium data</td>\n",
" <td>https://example.com/CPTAC</td>\n",
" <td>https://doi.org/10.1158/2159-8290.CD-13-0219</td>\n",
" <td>This dataset contains data from the Clinical Proteomic Tumor Analysis Consortium. It includes various cancer types and proteomic data.</td>\n",
" </tr>\n",
" <tr>\n",
" <th>NCI60</th>\n",
" <td>NCI60</td>\n",
" <td>NCI-60 cell line data</td>\n",
" <td>https://example.com/NCI60</td>\n",
" <td>https://doi.org/10.1038/nrc1951</td>\n",
" <td>This dataset contains data from the NCI-60 cell line panel. It includes three files: TF activities from transcriptomics data, metabolite abundances and gene reads.</td>\n",
" </tr>\n",
" </tbody>\n",
@@ -132,37 +125,29 @@
" name \\\n",
"decryptm DecryptM \n",
"panacea Panacea \n",
"moon MOON \n",
"cosmos COSMOS \n",
"CPTAC CPTAC \n",
"NCI60 NCI60 \n",
"\n",
" description \\\n",
"decryptm Drug perturbation proteomics and phosphoproteomics data \n",
"panacea Pancancer Analysis of Chemical Entity Activity RNA-Seq data \n",
"moon Database files for running MOON \n",
"cosmos Database files for running COSMOS (MetaPKN) \n",
"CPTAC Clinical Proteomic Tumor Analysis Consortium data \n",
"NCI60 NCI-60 cell line data \n",
"\n",
" publication_link \\\n",
"decryptm https://doi.org/10.1126/science.ade3925 \n",
"panacea https://doi.org/10.1016/j.xcrm.2021.100492 \n",
"moon https://example.com/moon \n",
"cosmos https://example.com/cosmos \n",
"CPTAC https://example.com/CPTAC \n",
"NCI60 https://example.com/NCI60 \n",
" publication_link \\\n",
"decryptm https://doi.org/10.1126/science.ade3925 \n",
"panacea https://doi.org/10.1016/j.xcrm.2021.100492 \n",
"CPTAC https://doi.org/10.1158/2159-8290.CD-13-0219 \n",
"NCI60 https://doi.org/10.1038/nrc1951 \n",
"\n",
" detailed_description \n",
"decryptm This dataset contains the profiling of 31 cancer drugs in 13 human cancer cell line models resulted in 1.8 million dose-response curves, including 47,502 regulated phosphopeptides, 7316 ubiquitinylated peptides, and 546 regulated acetylated peptides. \n",
"panacea PANACEA contains dose-response and perturbational profiles for 32 kinase inhibitors in 11 cancer cell lines, in addition to a DMSO control. Originally, this resource served as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions. \n",
"moon This dataset contains database files required for running the MOON software. \n",
"cosmos This dataset includes database files for the COSMOS software (MetaPKN). \n",
"CPTAC This dataset contains data from the Clinical Proteomic Tumor Analysis Consortium. It includes various cancer types and proteomic data. \n",
"NCI60 This dataset contains data from the NCI-60 cell line panel. It includes three files: TF activities from transcriptomics data, metabolite abundances and gene reads. "
]
},
"execution_count": 21,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}