From db6bd5c154130b5ab2ef25381dbb660ddae714b3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?M=C3=A1t=C3=A9=20Balajti?=
Date: Thu, 31 Oct 2024 14:19:42 +0100
Subject: [PATCH] docs: update README, guides

---
 README.md                    | 102 +++++------------------------------
 docs/guides/examples.rst     |  10 ++--
 docs/guides/installation.rst |   4 +-
 docs/guides/usage.rst        |  38 ++++++++++---
 4 files changed, 52 insertions(+), 102 deletions(-)

diff --git a/README.md b/README.md
index 6a1f23c..7557cb7 100644
--- a/README.md
+++ b/README.md
@@ -11,41 +11,27 @@ HTSinfer infers metadata from Illumina high-throughput sequencing (HTS) data.
 
 ## Quick start
 
-## Installation
+
+For a more in-depth guide please refer to the [HTSinfer documentation][docs-documentation].
+
+### Installation
 
 In order to use the HTSinfer, clone the repository and install the
-dependencies via [Conda][conda]:
+dependencies via [Conda][conda] or [Mamba][mamba]:
 
 ```sh
 git clone https://github.com/zavolanlab/htsinfer
 cd htsinfer
 conda env create --file environment.yml
-# Alternatively, to install with development dependencies,
-# run the following instead
-conda env create --file environment-dev.yml
 ```
 
-> Note that creating the environment takes non-trivial time and it is strongly
-> recommended that you install [Mamba][mamba] and replace `conda` with `mamba`
-> in the previous command.
-
 Then, activate the `htsinfer` Conda environment with:
 
 ```sh
 conda activate htsinfer
 ```
 
-If you have installed the development/testing dependencies, you may first want
-to verify that HTSinfer was installed correctly by executing the tests shipped
-with the package:
-
-```sh
-python -m pytest
-```
-
-Otherwise just go ahead and try one of the [examples](#Examples).
-
-## General usage
+### General usage
 
 ```sh
 htsinfer [--output-directory PATH]
@@ -69,15 +55,15 @@ htsinfer [--output-directory PATH]
          PATH [PATH]
 ```
 
-## Examples
+### Examples
 
-**Single-ended library***
+**Single-ended library**
 
 ```sh
 htsinfer tests/files/adapter_single.fastq
 ```
 
-**Paired-ended library***
+**Paired-ended library**
 
 ```sh
 htsinfer tests/files/adapter_1.fastq tests/files/adapter_2.fastq
@@ -146,74 +132,13 @@ example library:
 ```
 
 To better understand the output, please refer to the [`Results`
-model][docs-api-results] in the [API documentation][badge-url-docs]. Note that
-`Results` model has several nested child models, such as enumerators of
-possible outcomes. Simply follow the references in each parent model for
-detailed descriptions of each child model's attributes.
-
-## General usage
-
-```sh
-htsinfer [--output-directory PATH]
-         [--temporary-directory PATH]
-         [--cleanup-regime {DEFAULT,KEEP_ALL,KEEP_NONE,KEEP_RESULTS}]
-         [--records INT]
-         [--threads INT]
-         [--transcripts FASTA]
-         [--read-layout-adapters PATH]
-         [--read-layout-min-match-percentage FLOAT]
-         [--read-layout-min-frequency-ratio FLOAT]
-         [--library-source-min-match-percentage FLOAT]
-         [--library-source-min-frequency-ratio FLOAT]
-         [--library-type-max-distance INT]
-         [--library-type-mates-cutoff FLOAT]
-         [--read-orientation-min-mapped-reads INT]
-         [--read-orientation-min-fraction FLOAT]
-         [--tax-id INT]
-         [--verbosity {DEBUG,INFO,WARN,ERROR,CRITICAL}]
-         [-h] [--version]
-         PATH [PATH]
-```
-
-## Installation
-
-In order to use the HTSinfer, clone the repository and install the
-dependencies via [Conda][conda]:
-
-```sh
-git clone https://github.com/zavolanlab/htsinfer
-cd htsinfer
-conda env create --file environment.yml
-# Alternatively, to install with development dependencies,
-# run the following instead
-conda env create --file environment-dev.yml
-```
-
-> Note that creating the environment takes non-trivial time and it is strongly
-> recommended that you install [Mamba][mamba] and replace `conda` with `mamba`
-> in the previous command.
-
-Then, activate the `htsinfer` Conda environment with:
-
-```sh
-conda activate htsinfer
-```
-
-If you have installed the development/testing dependencies, you may first want
-to verify that HTSinfer was installed correctly by executing the tests shipped
-with the package:
-
-```sh
-python -m pytest
-```
-
-Otherwise just go ahead and try one of the [examples](#Examples).
+model][docs-api-results] in the [API documentation][badge-url-docs].
 
-## API documentation
+### API documentation
 
 Auto-built API documentation is hosted on [ReadTheDocs][badge-url-docs].
 
-## Contributing
+### Contributing
 
 This project lives off your contributions, be it in the form of bug reports,
 feature requests, discussions, or fixes and other code changes. Please refer
@@ -221,7 +146,7 @@ to the [contributing guidelines](CONTRIBUTING.md) if you are interested to
 contribute. Please mind the [code of conduct](CODE_OF_CONDUCT.md) for all
 interactions with the community.
 
-## Contact
+### Contact
 
 For questions or suggestions regarding the code, please use the
 [issue tracker][issue-tracker]. For any other inquiries, please contact us
@@ -245,6 +170,7 @@ by email:
 [badge-url-doi-zenodo]:
 [conda]:
 [contact]:
+[docs-documentation]:
 [docs-api-results]:
 [issue-tracker]:
 [mamba]:
diff --git a/docs/guides/examples.rst b/docs/guides/examples.rst
index 97919dc..a3be423 100644
--- a/docs/guides/examples.rst
+++ b/docs/guides/examples.rst
@@ -1,13 +1,13 @@
 Examples
 ========
 
-HTSinfer provides easy-to-use commands for analyzing single- and paired-ended RNA-Seq libraries.
+`HTSinfer` provides easy-to-use commands for analyzing single- and paired-ended RNA-Seq libraries.
 
 
 Single-ended Library Example
 ----------------------------
 
-To run HTSinfer on a single-ended RNA-Seq library, use the following command:
+To run `HTSinfer` on a single-ended RNA-Seq library, use the following command:
 
 .. code-block:: bash
 
@@ -16,13 +16,13 @@ To run HTSinfer on a single-ended RNA-Seq library, use the following command:
 Paired-ended Library Example
 ----------------------------
 
-To run HTSinfer on a paired-ended RNA-Seq library, use the following command:
+To run `HTSinfer` on a paired-ended RNA-Seq library, use the following command:
 
 .. code-block:: bash
 
     htsinfer tests/files/adapter_1.fastq tests/files/adapter_2.fastq
 
-Both commands will output the results in JSON format to `STDOUT` and the log to `STDERR`.
+Both commands will output the results in JSON format to :code:`STDOUT` and the log to :code:`STDERR`.
 
 Example Output
 --------------
@@ -84,4 +84,4 @@ Here is a sample output for the paired-ended library:
         }
     }
 
-For more details on the output structure, refer to the `Results` model in the API documentation.
+For more details on the output structure, refer to the :code:`Results` model in the API documentation.
diff --git a/docs/guides/installation.rst b/docs/guides/installation.rst
index dd767cc..c71ea5f 100644
--- a/docs/guides/installation.rst
+++ b/docs/guides/installation.rst
@@ -19,12 +19,12 @@ To install `HTSinfer`, first clone the repository and install the dependencies v
 
 .. note::
 
-    Creating the environment may take some time. It is strongly recommended to install `Mamba `_ and replace ``conda`` with ``mamba`` in the previous commands for faster installation.
+    Creating the environment may take some time. It is strongly recommended to install `Mamba `_ and replace :code:`conda` with :code:`mamba` in the previous commands for faster installation.
 
 Activate the Conda Environment
 ------------------------------
 
-After the installation is complete, activate the `htsinfer` Conda environment with:
+After the installation is complete, activate the :code:`htsinfer` Conda environment with:
 
 .. code-block:: bash
 
diff --git a/docs/guides/usage.rst b/docs/guides/usage.rst
index 1971df4..15b25ce 100644
--- a/docs/guides/usage.rst
+++ b/docs/guides/usage.rst
@@ -1,7 +1,7 @@
 Usage
 =====
 
-This sections describes the general usage of `HTSinfer`.
+This section describes the general usage of `HTSinfer`.
 
 General Usage
 -------------
@@ -28,20 +28,44 @@ General Usage
              [-h] [--version]
              PATH [PATH]
 
-The above command allows the user to infer metadata for single- or paired-ended RNA-Seq libraries by specifying file paths and relevant parameters. The tool outputs metadata in JSON format to `STDOUT` and logs to `STDERR`.
+The above command allows the user to infer metadata for single- or paired-ended RNA-Seq libraries by specifying file paths and relevant parameters. The tool outputs metadata in JSON format to :code:`STDOUT` and logs to :code:`STDERR`.
 
 Command-line Options
 ---------------------
 
 Available command-line parameters are categorized as follows:
 
-- **General Options**: These include specifying directories, verbosity level, and other global settings.
-- **Library-specific Options**: These parameters allow the user to modify settings related to the input data, such as transcript references, adapter sequences, and match thresholds.
-- **Output Options**: These settings control the output format, including the number of records and the output destination.
-- **Meta Options**: The user can also control the behavior of the tool with meta options such as cleanup regimes, thread count, and version information.
+- **General Options**:
+  - :code:`--output-directory`: Path where output data will be saved.
+  - :code:`--temporary-directory`: Path for storing temporary files generated during execution.
+  - :code:`--cleanup-regime`: Specifies which data should be kept after completion. Available options are :code:`DEFAULT`, :code:`KEEP_ALL`, :code:`KEEP_NONE`, and :code:`KEEP_RESULTS`.
+  - :code:`--verbosity`: Controls the verbosity level of log output; options are :code:`DEBUG`, :code:`INFO`, :code:`WARN`, :code:`ERROR`, and :code:`CRITICAL`.
 
-For a complete list of all available options, use the following command:
+- **Library-specific Options**:
+  - :code:`PATH [PATH]`: Path(s) to the RNA-Seq input data. For paired-end libraries, provide paths to both mate files.
+  - :code:`--transcripts`: Path to the FASTA file containing transcript sequences for reference.
+  - :code:`--read-layout-adapters`: Path to a file with 3' adapter sequences (one sequence per line) used to identify adapter content.
+  - :code:`--read-layout-min-match-percentage`: Minimum percentage of reads containing an adapter for it to be considered as the library’s 3’-end adapter.
+  - :code:`--read-layout-min-frequency-ratio`: Minimum frequency ratio between the most and second most frequent adapters to select the 3’-end adapter.
+  - :code:`--library-source-min-match-percentage`: Minimum percentage of reads aligning with a library source for it to be considered representative of the library.
+  - :code:`--library-source-min-frequency-ratio`: Minimum frequency ratio between primary and secondary library sources, ensuring only the most prominent source is identified.
+  - :code:`--library-type-max-distance`: Maximum allowable distance between read pairs to classify the library type.
+  - :code:`--library-type-mates-cutoff`: Ratio cutoff to determine the consistency of mate orientation in paired-end reads.
+  - :code:`--read-orientation-min-mapped-reads`: Minimum number of mapped reads to ensure reliable inference of read orientation.
+  - :code:`--read-orientation-min-fraction`: Minimum fraction (must exceed 0.5) of reads supporting a given orientation to confirm its accuracy.
+
+- **Processing and Performance Options**:
+  - :code:`--records`: Limits the number of input records to process; setting this to 0 will process all records.
+  - :code:`--threads`: Specifies the number of threads for concurrent processing to optimize performance.
+  - :code:`--tax-id`: Taxonomy ID for the sample source, aiding in organism-specific analyses.
+
+Meta Options
+------------
+
+For help or version information, use the following:
 
 .. code-block:: bash
 
     htsinfer --help
+    htsinfer --version
+
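
The usage guide touched by this patch says that `htsinfer` writes its results as JSON to `STDOUT` and its log to `STDERR`. As a rough illustration of what that allows, the sketch below wraps the command in Python. The helper names `build_command` and `run_htsinfer` are hypothetical and not part of HTSinfer; only flags listed in the usage text (`--output-directory`, `--records`, `--threads`) are used. It assumes the `htsinfer` executable is on `PATH`, that it exits non-zero on failure, and that `STDOUT` carries nothing but the JSON document.

```python
import json
import subprocess
from typing import List, Optional


def build_command(
    paths: List[str],
    records: Optional[int] = None,
    threads: Optional[int] = None,
    output_directory: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list (hypothetical helper, not part of HTSinfer)."""
    # The usage synopsis is `PATH [PATH]`: one file for single-ended
    # libraries, two mate files for paired-ended libraries.
    if len(paths) not in (1, 2):
        raise ValueError("expected one path (single-ended) or two (paired-ended)")
    command = ["htsinfer"]
    if output_directory is not None:
        command += ["--output-directory", output_directory]
    if records is not None:
        command += ["--records", str(records)]
    if threads is not None:
        command += ["--threads", str(threads)]
    return command + list(paths)


def run_htsinfer(paths: List[str], **options) -> dict:
    """Run htsinfer and parse the JSON it prints to STDOUT.

    Assumption: STDOUT holds only the JSON results, as the guide describes;
    the log on STDERR is captured separately and not parsed here.
    """
    completed = subprocess.run(
        build_command(paths, **options),
        capture_output=True,
        text=True,
        check=True,  # assumes a non-zero exit status signals failure
    )
    return json.loads(completed.stdout)
```

With HTSinfer installed, `run_htsinfer(["tests/files/adapter_single.fastq"])` would return the parsed results as a dictionary; its keys follow the `Results` model described in the API documentation.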