# BIONIC-evals

Evaluation library for BIONIC. This library contains code to reproduce the co-annotation prediction, module detection, and gene function prediction evaluations from Figs. 2a, 3a, 4 and 5 of the BIONIC manuscript.

**NOTE:** The module detection and gene function prediction evaluations take a considerable amount of time to complete (on the order of hours). You can speed them up by reducing the size of the parameter search space, reducing the number of trials, or using more CPUs.

## ⚙️ Installation

The library can be installed using Poetry.

1. First, install Poetry.

2. Create a virtual Python 3.8 environment using conda:

   ```
   $ conda create -n bionic-evals python=3.8
   ```

3. Make sure your virtual environment is active for the following steps:

   ```
   $ conda activate bionic-evals
   ```

4. Clone this repository by running

   ```
   $ git clone https://github.com/duncster94/BIONIC-evals.git
   ```

5. Make sure you are in the same directory as the `pyproject.toml` file. Install the `bioniceval` library as follows:

   ```
   $ poetry install
   ```

6. Test that `bioniceval` is installed properly by running

   ```
   $ bioniceval --help
   ```

   You should see a help message.

## ⚡ Usage

You can run `bioniceval` by simply passing in a config file as follows:

```
$ bioniceval path/to/config/file.json
```

### Configuration File

`bioniceval` is driven by a configuration file: a JSON file specifying all the relevant file paths and evaluation parameters. You can create a uniquely named config file for each evaluation scenario you want to run. An example config file can be found here, and a minimal sketch is shown after the table below.

The configuration keys are as follows:

| Argument | Description |
| --- | --- |
| **Input files** | |
| `networks.name` | Name for the given network. |
| `networks.path` | Filepath to the input network. |
| `networks.delimiter` | Delimiter of the network file. |
| `features.name` | Name for the given feature set. |
| `features.path` | Filepath to the input feature set. |
| `features.delimiter` | Delimiter of the feature file. |
| **Evaluation standards** | |
| `standards.name` | Name for the given standard. |
| `standards.task` | The type of evaluation task. Valid values are `"coannotation"`, `"module_detection"`, and `"function_prediction"`. |
| `standards.path` | Filepath to the standard. |
| `standards.delimiter` | Delimiter of the standard file. |
| **Module detection specific parameters** | |
| `standards.samples` | Number of flat module set samples to perform evaluations for. |
| `standards.methods` | A list of valid linkage methods to perform clustering for. See here for more information. |
| `standards.metrics` | A list of valid distance metrics to perform clustering for. See here for more information. |
| `standards.thresholds` | Number of clustering thresholds to extract clusters for and evaluate. |
| **Function prediction specific parameters** | |
| `standards.test_size` | Held-out test set size. A value of 0.1 corresponds to a test set of 10% of genes. |
| `standards.folds` | Number of folds to use for cross-validation. |
| `standards.trials` | Number of trials to repeat function prediction evaluations for. |
| `standards.gamma.minimum` | Lower bound of the radial basis function kernel coefficient. |
| `standards.gamma.maximum` | Upper bound of the radial basis function kernel coefficient. |
| `standards.gamma.samples` | Number of coefficients to sample from the range defined by the `minimum` and `maximum` arguments. |
| `standards.regularization.minimum` | Lower bound of the regularization parameter (`C` in scikit-learn `SVC`). |
| `standards.regularization.maximum` | Upper bound of the regularization parameter. |
| `standards.regularization.samples` | Number of regularization parameters to sample from the range defined by the `minimum` and `maximum` arguments. |
| **Miscellaneous** | |
| `consolidation` | Whether to consolidate differences in gene sets between datasets by extending datasets to the union of genes (`"union"`) or reducing datasets to the intersection of genes (`"intersection"`). `"union"` was used for analyses in the BIONIC manuscript. |
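For reference, here is a minimal sketch of a co-annotation config assembled from the keys above. All names, paths, and delimiters are hypothetical placeholders, and the nesting of `networks`, `features`, and `standards` as JSON arrays of objects is an assumption inferred from the dotted key names; consult the example config file linked above for the authoritative structure.

```json
{
    "networks": [
        {
            "name": "example-network",
            "path": "path/to/network.txt",
            "delimiter": " "
        }
    ],
    "features": [
        {
            "name": "example-features",
            "path": "path/to/features.tsv",
            "delimiter": "\t"
        }
    ],
    "standards": [
        {
            "name": "example-standard",
            "task": "coannotation",
            "path": "path/to/standard.tsv",
            "delimiter": "\t"
        }
    ],
    "consolidation": "union"
}
```

Saved as, say, `coannotation-config.json` (a hypothetical name), this file would be run with `bioniceval coannotation-config.json`. The module detection and function prediction keys are omitted here since they only apply to their respective tasks.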