48 add benchmark details contrib guidelines #49

Merged
merged 12 commits into from
Aug 14, 2024
Binary file modified docs/src/_static/ap.png
Binary file added docs/src/_static/nc_ec50.png
Binary file added docs/src/_static/nc_multiomics.png
Binary file added docs/src/_static/nc_offtarget.png
Binary file added docs/src/_static/nc_pathway.png
Binary file modified docs/src/_static/ppr.png
Binary file modified docs/src/_static/reach.png
Binary file modified docs/src/_static/sign.png
Binary file modified docs/src/_static/sp.png
106 changes: 106 additions & 0 deletions docs/src/benchmarks.rst
@@ -0,0 +1,106 @@
#####################
Evaluation strategies
#####################

One of the main aims of NetworkCommons is to provide a comprehensive set of metrics to evaluate the performance of different network inference methodologies.
Currently, we provide four different evaluation strategies.
If you want to contribute your own, please check our :doc:`Contribution guidelines <guidelines/guide_3_eval>`.

.. _eval-offtarget:

------------------
Offtarget recovery
------------------

**Data**: perturbational scenarios, e.g., a drug perturbation, for which there are differential expression profiles between control and drug-perturbed samples.
See :ref:`PANACEA <details-panacea>`.

**Assumption**: In this setting, we assume that, in a perturbational context, the effects measured via omics data are not only a product of the primary perturbation origin
(e.g., KO, KD, drug perturbation), but also of other perturbation origins that are not directly targeted by the perturbation agent (e.g., a drug offtarget).

.. image:: ./_static/nc_offtarget.png
   :alt: Evaluation based on offtarget recovery
   :width: 1000px

**Performance metric:** Share (%) of offtargets recovered in the solution network

.. note::
   Methods that recover a higher share of offtargets than a random control are more successful in contextualising the perturbation, since the resulting network incorporates
   the effect of the offtargets.

**Example:** :doc:`Vignette 1: A simple example <vignettes/1_simple_example>`
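
As a minimal sketch of this metric, assume the solution network is available as a set of node names and the known offtargets as another set. The helpers ``offtarget_share`` and ``random_control_share`` and the toy node names are hypothetical and not part of the NetworkCommons API; the random control here simply draws node sets of the same size from the prior knowledge network.

```python
import random


def offtarget_share(network_nodes, offtargets):
    """Return the percentage of known offtargets present in the network."""
    offtargets = set(offtargets)
    if not offtargets:
        return 0.0
    recovered = offtargets & set(network_nodes)
    return 100.0 * len(recovered) / len(offtargets)


def random_control_share(all_nodes, n_nodes, offtargets, n_iter=1000, seed=0):
    """Average share recovered by random node sets of the same size."""
    rng = random.Random(seed)
    all_nodes = sorted(all_nodes)  # fixed order keeps the sampling reproducible
    shares = [
        offtarget_share(rng.sample(all_nodes, n_nodes), offtargets)
        for _ in range(n_iter)
    ]
    return sum(shares) / len(shares)


# Toy data (hypothetical node names)
pkn_nodes = {f"N{i}" for i in range(100)}
solution_nodes = {"N1", "N2", "N3", "N4", "N5"}
offtargets = {"N1", "N2", "N50", "N60"}

observed = offtarget_share(solution_nodes, offtargets)
control = random_control_share(pkn_nodes, len(solution_nodes), offtargets)
```

A method would be considered informative in this setting when ``observed`` clearly exceeds ``control``.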

.. _eval-ec50:

------------------------------------------------
Phosphorylation sensitivity to drug perturbation
------------------------------------------------

**Data**: phosphoproteomics dose-response curves, EC50 values, time-course data.
See :ref:`DecryptM <details-decryptm>`.

**Assumption**: In this setting, we assume that, in a perturbational context, those elements in a network that respond more sensitively to a perturbation (i.e., have a lower EC50) will be more
important in the contextualisation of said perturbation.

**Performance metric:** EC50 values of the nodes included in and excluded from the solution network.

.. image:: ./_static/nc_ec50.png
   :alt: Evaluation based on sensitivity to drug perturbation
   :width: 1000px

.. note::
   Methods producing result networks whose nodes have a low average EC50 (compared to nodes not included in the network) are better performers than those producing networks
   where this difference (EC50_in - EC50_out) is closer to zero.

**Example:** :doc:`Vignette 3: Sensitive response to drug perturbation using phosphoproteomics <vignettes/3_evaluation_decryptm>`
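
The difference mentioned in the note can be sketched as follows, assuming EC50 values are available as a plain mapping from node name to value. The function ``ec50_difference`` and the toy values are hypothetical illustrations, not the package's implementation.

```python
from statistics import mean


def ec50_difference(ec50_by_node, network_nodes):
    """Return mean EC50 of nodes in the network minus mean EC50 of the rest."""
    inside = [v for n, v in ec50_by_node.items() if n in network_nodes]
    outside = [v for n, v in ec50_by_node.items() if n not in network_nodes]
    if not inside or not outside:
        raise ValueError("Both groups need at least one node with an EC50 value")
    return mean(inside) - mean(outside)


# Toy EC50 values (hypothetical)
ec50 = {"A": 1.0, "B": 2.0, "C": 8.0, "D": 10.0}
diff = ec50_difference(ec50, {"A", "B"})
```

Under the assumption above, a more negative ``diff`` means the nodes selected by the method are, on average, more sensitive to the drug than the nodes left out.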

.. _eval-pathway:

---------------------------
Pathway enrichment analysis
---------------------------

**Data**: perturbational scenarios, dysregulation (e.g., cancer basal profiles), basal profiles (e.g., tissue-specific profiles).
See :ref:`PANACEA <details-panacea>`.

**Assumption**: In this setting, we use the nodes of the subnetworks to perform overrepresentation analysis (ORA) against a collection of predefined gene sets, among which we expect one to be especially enriched
(for example, a specific pathway will be overrepresented if said pathway is perturbed, or is especially active/inactive in a given profile).

**Performance metric:** rank of the selected gene set among all gene sets, according to ORA score

.. image:: ./_static/nc_pathway.png
   :alt: Evaluation based on pathway enrichment
   :width: 1000px

.. note::
   Having preselected a gene set of interest, methods producing networks in which said gene set ranks high among all gene sets, according to its ORA score, perform better.

**Example:** :doc:`Vignette 1: A simple example <vignettes/1_simple_example>`
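
To illustrate the ranking idea, the sketch below scores each gene set with a one-sided hypergeometric test and reports the rank of a preselected gene set. This is an assumption made for illustration: the ORA score used in NetworkCommons may be computed differently, and the function names, gene names and gene sets are hypothetical.

```python
from math import comb


def ora_pvalue(query, gene_set, universe_size):
    """One-sided hypergeometric p-value for the overlap of query and gene_set."""
    query, gene_set = set(query), set(gene_set)
    overlap = len(query & gene_set)
    n_query, n_set = len(query), len(gene_set)
    # P(X >= overlap) when drawing n_query genes from the universe,
    # n_set of which belong to the gene set
    tail = sum(
        comb(n_set, i) * comb(universe_size - n_set, n_query - i)
        for i in range(overlap, min(n_query, n_set) + 1)
    )
    return tail / comb(universe_size, n_query)


def gene_set_rank(query, gene_sets, target, universe_size):
    """Rank (1 = best) of the target gene set, ordered by ORA p-value."""
    pvals = {
        name: ora_pvalue(query, genes, universe_size)
        for name, genes in gene_sets.items()
    }
    ordered = sorted(pvals, key=pvals.get)
    return ordered.index(target) + 1


# Toy example (hypothetical genes and gene sets)
network_genes = {"G1", "G2", "G3", "G4"}
gene_sets = {
    "pathway_a": {"G1", "G2", "G3", "G10"},
    "pathway_b": {"G4", "G20", "G21", "G22"},
    "pathway_c": {"G30", "G31"},
}
rank = gene_set_rank(network_genes, gene_sets, "pathway_a", universe_size=100)
```

A method whose network places the gene set of interest at a lower rank number (closer to 1) would be the better performer in this setting.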

.. _eval-multiomics:

--------------------------------
Recovery of dysregulated kinases
--------------------------------

**Data**: perturbational scenarios, dysregulation (e.g., cancer basal profiles).
See :ref:`CPTAC <details-cptac>`.

**Assumption**: In this setting, we use three different types of omics data:

* **Proteomics**: we identify the most differentially abundant receptors in the proteomics profiles between healthy and tumor samples. We assume that if they are differentially abundant, they will be activated/inhibited.
* **Transcriptomics**: we perform TF enrichment analysis to obtain the TFs that are dysregulated in the tumor samples compared to the healthy control.
* **Phosphoproteomics**: we perform kinase activity estimation and then evaluate the level of dysregulation in the resulting subnetwork.

**Performance metric:** difference between the kinase activity scores in the solution network and those in the overall PKN.

.. image:: ./_static/nc_multiomics.png
   :alt: Evaluation based on recovery of dysregulated kinases
   :width: 1000px

.. note::
   Methods whose result subnetworks have a higher average kinase activity score than the overall PKN will be better performers.

**Example:** :doc:`Vignette 4: Recovery of dysregulated kinases in response to cancer mutations <vignettes/4_cptac_phosphoactivity>`
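
A rough sketch of this comparison is given below. It assumes that kinase activities are available as a mapping from kinase name to a signed score and that dysregulation is summarised by the absolute value of that score; both the helper ``kinase_activity_difference`` and this choice of summary are assumptions for illustration, not the package's implementation.

```python
from statistics import mean


def kinase_activity_difference(activity, network_nodes, pkn_nodes):
    """Mean |activity| of kinases in the subnetwork minus that of all PKN kinases."""
    in_network = [abs(s) for k, s in activity.items() if k in network_nodes]
    in_pkn = [abs(s) for k, s in activity.items() if k in pkn_nodes]
    if not in_network or not in_pkn:
        raise ValueError("No kinase activity scores found for one of the node sets")
    return mean(in_network) - mean(in_pkn)


# Toy data (hypothetical kinases and scores)
activity = {"K1": 2.0, "K2": -3.0, "K3": 0.5, "K4": -0.5}
pkn_nodes = {"K1", "K2", "K3", "K4", "X"}
subnetwork_nodes = {"K1", "K2"}

delta = kinase_activity_difference(activity, subnetwork_nodes, pkn_nodes)
```

A positive ``delta`` indicates that the subnetwork is enriched in dysregulated kinases relative to the PKN as a whole.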


2 changes: 2 additions & 0 deletions docs/src/contents.rst
@@ -16,6 +16,7 @@ NetworkCommons: Table of Contents

    datasets
    methods
    benchmarks


.. toctree::
@@ -24,6 +25,7 @@ NetworkCommons: Table of Contents

    guidelines/guide_1_data
    guidelines/guide_2_methods
    guidelines/guide_3_eval


.. toctree::
18 changes: 17 additions & 1 deletion docs/src/datasets.rst
@@ -2,13 +2,17 @@
Data
####
NetworkCommons provides a collection of omics datasets and prior knowledge resources. The datasets are available in the form of files that can be downloaded and used for further analysis. The prior knowledge resources are available in the form of networks (either Network objects or pd.DataFrames).
All the data can be accessed via the NetworkCommons API.
If you want to contribute your own, please check our :doc:`Contribution guidelines <guidelines/guide_1_data>`.

.. _details-omics:

----------
Omics data
----------
Below, we provide a list of all the omics datasets currently available in NetworkCommons. For each dataset, we provide a link to the original publication, a description, processing (if applicable), and a link to the data location.

.. _details-decryptm:

DecryptM
--------
@@ -26,6 +30,7 @@ Networkcommons contains the files containing, per phosphosite, EC50 values obtai

**Functions:** See API documentation for :ref:`DecryptM <api-decryptm>`.

.. _details-panacea:

PANACEA
-------
@@ -49,6 +54,7 @@ in `decoupler <https://doi.org/10.1093/bioadv/vbac016>`_.

**Functions:** See API documentation for :ref:`PANACEA <api-panacea>`.

.. _details-cptac:

CPTAC
-----
@@ -67,6 +73,8 @@ can be found in the STAR Methods of `'Proteogenomic Data and Resources for Pan-C

**Functions:** See API documentation for :ref:`CPTAC <api-cptac>`.

.. _details-nci60:

NCI60
-----

@@ -82,11 +90,15 @@ NCI60

**Functions:** See API documentation for :ref:`NCI60 <api-nci60>`.

.. _details-pk:

---------------
Prior Knowledge
---------------
Below, we provide a list of all the prior knowledge resources currently available in NetworkCommons. For each resource, we provide a description and a link to the original publication.

.. _details-omnipath:

OmniPath
--------

@@ -101,6 +113,8 @@ Our aim is to expand the API to more data sources within OmniPath. For more info

**Functions:** See API documentation for :ref:`Prior knowledge <api-pk>`.

.. _details-liana:

Liana
-----

@@ -114,6 +128,8 @@ Liana

**Functions:** See API documentation for :ref:`Prior knowledge <api-pk>`.

.. _details-phosphositeplus:

PhosphositePlus
---------------

2 changes: 1 addition & 1 deletion docs/src/guidelines/guide_1_data.ipynb
@@ -11,7 +11,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Thank you very much for considering contributing to the data collection of **NetworkCommons**! In order to make the resource as user-friendly as possible, we aim to be as transparent as possible, which means that all contributions should contain at least the following elements.\n",
"Thank you very much for considering contributing to the data collection of **NetworkCommons**! In order to make the resource as user-friendly as possible, we aim to be as transparent as possible, which means that all contributions should contain at least the following elements. For other examples, see [the Datasets details.](../datasets.html)\n",
"\n",
"## 1. Data information\n",
"* Experimental design: number of samples, number of experiments (if applicable), confounding factors\n",
115 changes: 0 additions & 115 deletions docs/src/guidelines/guide_2_methods.ipynb

This file was deleted.

35 changes: 35 additions & 0 deletions docs/src/guidelines/guide_2_methods.rst
@@ -0,0 +1,35 @@
#################################
Contribution guidelines: Methods
#################################

Thank you very much for considering contributing to the methods collection of **NetworkCommons**! For methods, it is especially important that inputs and outputs are
compatible with the rest of the package, the purpose is stated and the assumptions of the method are clear.


----------------
1. Documentation
----------------

In the :doc:`./docs/src/methods.rst file <../methods>`, contributors should add:

* The description of the method
* A figure showcasing the basics (if possible)
* Input/output definition
* Link to publication and repository (if available)

Functions should be documented using `Google style Python docstrings <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html>`_.
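
For orientation, a Google style docstring could look like the following sketch. The function ``filter_by_score`` and its parameters are hypothetical and only serve to show the ``Args``, ``Returns`` and ``Raises`` sections.

```python
def filter_by_score(scores, threshold=0.0):
    """Keep the nodes whose score exceeds a threshold.

    Args:
        scores (dict): Mapping from node name to numeric score.
        threshold (float): Minimum score (exclusive) a node needs to be kept.
            Defaults to 0.0.

    Returns:
        dict: The subset of ``scores`` with values greater than ``threshold``.

    Raises:
        TypeError: If ``scores`` is not a dictionary.
    """
    if not isinstance(scores, dict):
        raise TypeError("scores must be a dict mapping node names to scores")
    return {node: score for node, score in scores.items() if score > threshold}
```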

In the :doc:`./docs/src/api.rst file <../api>`, contributors should add a new documentation module that contains the newly implemented classes/functions:

.. literalinclude:: ../api.rst
   :language: rest
   :lines: 40-53

------
2. API
------

* Every new method should be implemented in a separate file (e.g., ``_moon.py``) inside ``/networkcommons/methods/``.
* Contributors can then implement their own set of functionalities and expose those necessary to the public API via the ``__all__`` variable (see other files for examples).
* The overall pipeline must take at least a ``Network`` object as input and return at least a ``Network`` object containing the contextualised network.
  This does not apply to intermediate functions in a pipeline containing several functions, such as MOON (e.g., ``Network`` --function 1--> ``pd.DataFrame`` --function 2--> ``Network``).