deploy: 9495509
joshuailevy committed Nov 20, 2024
0 parents commit 1c827a0
Showing 135 changed files with 13,799 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 3d4aa8dc7132818b73224c05d4cb6b10
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added .nojekyll
Empty file.
Binary file added _images/covariants_heatmap.png
Binary file added _images/dataupload.png
Binary file added _images/dockstore.png
Binary file added _images/downloadresult.png
Binary file added _images/example_covariants_plot.png
Binary file added _images/igv_screenshot.png
Binary file added _images/import.png
Binary file added _images/lineage_prevalence_monthly.png
Binary file added _images/lineage_prevalence_sample.png
Binary file added _images/progress.png
Binary file added _images/setup.png
Binary file added _images/stacked_area_plot.png
Binary file added _images/test.png
Binary file added _images/test0.png
Binary file added _images/test2.png
Binary file added _images/testSummary.png
Binary file added _images/variant_prevalence_monthly.png
Binary file added _images/variant_prevalence_sample.png
Binary file added _images/workflowoptions.png
Binary file added _images/workspaces.png
42 changes: 42 additions & 0 deletions _sources/index.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
Freyja Documentation
==================================
Freyja is a tool to recover relative lineage abundances from mixed SARS-CoV-2 samples from a sequencing dataset (BAM aligned to the Hu-1 reference). The method uses lineage-determining mutational "barcodes" derived from the UShER global phylogenetic tree as a basis set to solve the constrained (unit sum, non-negative) de-mixing problem.
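
As a rough illustration of the constrained de-mixing idea, the sketch below recovers abundances from toy barcodes. It is not Freyja's solver: it swaps in projected gradient descent on a squared-error objective purely for readability, and the barcode matrix, lineage labels, site frequencies, and helper names (``project_to_simplex``, ``demix_toy``) are all hypothetical.

.. code-block:: python

   # Toy de-mixing sketch. All barcodes and frequencies are made up.
   # Assumption: squared error with projected gradient descent is used here
   # only as a simple stand-in for the real solver.

   def project_to_simplex(v):
       """Euclidean projection onto {x : x >= 0, sum(x) == 1}."""
       ordered = sorted(v, reverse=True)
       cumulative = 0.0
       theta = 0.0
       for j, value in enumerate(ordered, start=1):
           cumulative += value
           candidate = (cumulative - 1.0) / j
           if value - candidate > 0:
               theta = candidate
       return [max(x - theta, 0.0) for x in v]

   def demix_toy(barcodes, observed, steps=2000):
       """barcodes[k][i] is 1 if lineage k carries the mutation at site i."""
       n_lineages = len(barcodes)
       n_sites = len(observed)
       # Conservative step size: inverse of the squared Frobenius norm.
       step = 1.0 / sum(b * b for row in barcodes for b in row)
       x = [1.0 / n_lineages] * n_lineages
       for _ in range(steps):
           predicted = [sum(x[k] * barcodes[k][i] for k in range(n_lineages))
                        for i in range(n_sites)]
           residual = [p - o for p, o in zip(predicted, observed)]
           gradient = [sum(barcodes[k][i] * residual[i] for i in range(n_sites))
                       for k in range(n_lineages)]
           x = project_to_simplex([xk - step * g for xk, g in zip(x, gradient)])
       return x

   barcodes = [
       [1, 1, 0, 0, 0],  # hypothetical lineage A
       [0, 1, 1, 0, 0],  # hypothetical lineage B
       [0, 0, 0, 1, 1],  # hypothetical lineage C
   ]
   observed = [0.6, 0.9, 0.3, 0.1, 0.1]  # made-up SNV frequency per site
   abundances = demix_toy(barcodes, observed)
   print([round(a, 3) for a in abundances])

The projection step is what enforces both constraints at once: every iterate is non-negative and sums to one, so the result can be read directly as relative abundances.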

Freyja is intended as a post-processing step after primer trimming and variant calling in `iVar (Grubaugh and Gangavarapu et al., 2019) <https://github.com/andersen-lab/ivar>`_. From measurements of SNV frequency and sequencing depth at each position in the genome, Freyja returns an estimate of the true lineage abundances in the sample.

To ensure reproducibility of results, we provide old (timestamped) barcodes and metadata in the separate `Freyja-data <https://github.com/andersen-lab/Freyja-data>`_ repository. Barcode version can be checked using the ``freyja demix --version`` command.

.. toctree::
   :maxdepth: 2
   :caption: Usage:

   src/installation
   src/usage/demix
   src/usage/variants
   src/usage/barcode-build
   src/usage/get-lineage-def
   src/usage/update
   src/usage/boot
   src/usage/aggregate
   src/usage/plot
   src/usage/dash
   src/usage/relgrowthrate
   src/usage/extract
   src/usage/filter
   src/usage/covariants
   src/usage/plot-covariants

.. toctree::
   :maxdepth: 2
   :caption: Wiki:

   src/wiki/command_line_workflow
   src/wiki/cryptic_variants
   src/wiki/custom_plotting_tutorial
   src/wiki/freyja-measles
   src/wiki/custom-plots-with-R
   src/wiki/lineage_barcode_extract
   src/wiki/read_analysis_tutorial
   src/wiki/terra_workflow
   src/wiki/expanded_pathogens

22 changes: 22 additions & 0 deletions _sources/src/installation.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
Installation
-------------------------------------------------------------------------------

Freyja is written entirely in Python 3, but requires preprocessing by tools like iVar and `samtools <https://github.com/samtools/samtools>`_ mpileup to generate the required input data. We recommend using Python 3.7, but Freyja has been tested on Python versions up to 3.10.

Install via Conda::

   conda install -c bioconda freyja


Local build from source::

   git clone https://github.com/andersen-lab/Freyja.git
   cd Freyja
   pip install -e .

Docker::

   docker pull staphb/freyja
   docker run --rm -it staphb/freyja [command]

27 changes: 27 additions & 0 deletions _sources/src/usage/aggregate.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
.. click:: freyja._cli:aggregate
   :prog: freyja aggregate
   :nested: full
   :commands: aggregate

------------

**Example Usage:**

For rapid visualization of results, we also offer two utility methods
for manipulating the “demixed” output files. The first is an aggregation
method:

::

   freyja aggregate [directory-of-output-files] --output [aggregated-filename.tsv]

By default, the minimum genome coverage is set at 60 percent. To adjust
this, use the ``--mincov`` option (e.g. ``--mincov 75``). The user can
also specify a file extension of their choosing with the ``--ext``
option (for example, for ``demix`` outputs called ``X.output``):

::

   freyja aggregate [directory-of-output-files] --output [aggregated-filename.tsv] --ext output
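
The effect of a minimum-coverage threshold can be pictured with the short sketch below. It only illustrates thresholding on a per-sample coverage value and is not Freyja's own code: the sample names and numbers are invented, ``passes_mincov`` is a hypothetical helper, and treating the boundary as inclusive is an assumption.

.. code-block:: python

   # Hypothetical genome coverage (percent) for three demix outputs.
   coverage_by_sample = {
       "sample_A.tsv": 95.8,
       "sample_B.tsv": 58.2,
       "sample_C.tsv": 80.1,
   }

   def passes_mincov(coverage, mincov=60.0):
       """Return True if a sample meets the minimum coverage (assumed inclusive)."""
       return coverage >= mincov

   # Default threshold of 60 percent.
   kept_default = [name for name, cov in coverage_by_sample.items()
                   if passes_mincov(cov)]
   # Stricter threshold of 90 percent.
   kept_strict = [name for name, cov in coverage_by_sample.items()
                  if passes_mincov(cov, mincov=90.0)]
   print(kept_default, kept_strict)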


5 changes: 5 additions & 0 deletions _sources/src/usage/barcode-build.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
.. click:: freyja._cli:barcode_build
   :prog: freyja barcode-build
   :nested: full
   :commands: barcode-build

27 changes: 27 additions & 0 deletions _sources/src/usage/boot.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
.. click:: freyja._cli:boot
   :prog: freyja boot
   :nested: full
   :commands: boot

------------

**Example Usage:**

We provide a fast bootstrapping method for freyja, which can be run
using the command

::

   freyja boot [variants-file] [depth-file] --nt [number-of-cpus] --nb [number-of-bootstraps] --output_basename [base-name]

which results in two output files: ``base-name_lineages.csv`` and
``base-name_summarized.csv``, which contain the 0.025, 0.05, 0.25, 0.5
(median), 0.75, 0.95, and 0.975 quantiles for each lineage and WHO
designated VOI/VOC, respectively, as obtained via the bootstrap. A
custom lineage hierarchy file can be provided using the ``--lineageyml``
option. If the ``--rawboots`` option is used, two additional output
files are returned, ``base-name_lineages_boot.csv`` and
``base-name_summarized_boot.csv``, which contain the bootstrap estimates
(rather than summary statistics). We also provide the ``--eps``,
``--barcodes``, and ``--meta`` options as in ``freyja demix``. A
``--boxplot`` option is also available; specify it in the form
``--boxplot pdf`` if you want the boxplot in pdf format.
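
To make the summary statistics concrete, the sketch below computes the same seven quantiles from a list of bootstrap estimates for a single lineage. The estimates are invented, ``quantile`` is a hypothetical helper, and linear interpolation between order statistics is an assumed convention rather than a statement about how Freyja computes them.

.. code-block:: python

   def quantile(sorted_values, q):
       """Linearly interpolated quantile q (between 0 and 1) of sorted values."""
       position = q * (len(sorted_values) - 1)
       lower = int(position)
       upper = min(lower + 1, len(sorted_values) - 1)
       fraction = position - lower
       return (sorted_values[lower] * (1 - fraction)
               + sorted_values[upper] * fraction)

   # Hypothetical bootstrap estimates of one lineage's abundance.
   boot_estimates = [0.48, 0.52, 0.50, 0.47, 0.55, 0.51,
                     0.49, 0.53, 0.50, 0.46, 0.54]
   ordered = sorted(boot_estimates)

   levels = [0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975]
   summary = {q: round(quantile(ordered, q), 4) for q in levels}
   for q, value in summary.items():
       print(q, value)

The outer pair (0.025 and 0.975) brackets a 95 percent interval around the median, which is how such a table is typically read.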
30 changes: 30 additions & 0 deletions _sources/src/usage/covariants.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
.. click:: freyja._cli:covariants
   :prog: freyja covariants
   :nested: full
   :commands: covariants

------------

**Example Usage:**

In many cases, it can be useful to study covariant mutations
(i.e. mutations co-occurring on the same read pair). This command
outputs a tsv file that includes the mutations present in each
set of covariants, their absolute counts (the number of read pairs with
the mutations), their coverage ranges (the minimum and maximum position
for read pairs with the mutations), their “maximum” counts (the number
of read pairs that span the positions in the mutations), and their
frequencies (the absolute count divided by the maximum count).

Should the user wish to only consider read pairs that span the entire
genomic region defined by (min_site, max_site), they may include the
``--spans_region`` flag. By default, the covariant patterns are sorted
in descending order by count; they can also be sorted in
descending order by frequency by setting the ``--sort_by`` option to
“freq”, or sorted sequentially by mutation site by setting the
``--sort_by`` option to “site”.

The ``--ref-genome`` argument defaults
to ``freyja/data/NC_045512_Hu-1.fasta``. If you are using a different
build to perform alignment, it is important to pass that file to
``--ref-genome`` instead. Optionally, a gff file
(e.g. ``freyja/data/NC_045512_Hu-1.gff``) may be included via the
``--annot`` option to output amino acid mutations alongside
nucleotide mutations. Inclusion thresholds for read-mapping quality and
the number of observed instances of a set of covariants can be set using
``--min_quality`` and ``--min_count``, respectively.
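
The relationship between counts and frequencies, and the three sort orders, can be sketched in a few lines of Python. The patterns, counts, and dictionary keys below are invented for illustration and do not reflect the actual column names of the tsv output; ordering by the first mutation's position is an assumption about what sorting by site means.

.. code-block:: python

   # Hypothetical covariant patterns: (mutations, absolute count, max count).
   patterns = [
       (["C22995A", "A23013C"], 40, 100),
       (["G22578A"], 90, 300),
       (["C21618T", "T22200G"], 15, 20),
   ]

   rows = [
       {
           "covariants": mutations,
           "count": count,
           "max_count": max_count,
           # Frequency is the absolute count divided by the maximum count.
           "freq": count / max_count,
       }
       for mutations, count, max_count in patterns
   ]

   by_count = sorted(rows, key=lambda r: r["count"], reverse=True)
   by_freq = sorted(rows, key=lambda r: r["freq"], reverse=True)
   # Assumption: "site" ordering uses the position of the first mutation.
   by_site = sorted(rows, key=lambda r: int(r["covariants"][0][1:-1]))

   print([r["count"] for r in by_count])
   print([r["freq"] for r in by_freq])
   print([r["covariants"][0] for r in by_site])

Note that the pattern with the highest count is not the one with the highest frequency here, which is why the choice of sort order can change which patterns appear first.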
45 changes: 45 additions & 0 deletions _sources/src/usage/dash.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
.. click:: freyja._cli:dash
   :prog: freyja dash
   :nested: full
   :commands: dash

------------

**Example Usage:**

We are now providing functionality to rapidly prepare a dashboard web
page, directly from aggregated freyja output. This can be done with the
command

::

   freyja dash [aggregated-filename.tsv] [sample-metadata.csv] [dashboard-title.txt] [introContent.txt] --output [outputname.html] --lineageyml [path-to-lineage.yml-file]

where the metadata file should have this
`form <freyja/data/sweep_metadata.csv>`__. See example
`title <freyja/data/title.txt>`__ and
`intro-text <freyja/data/introContent.txt>`__ files as well. For samples
taken the same day, we average the freyja outputs by default. To take
viral loads into account when averaging, use
the ``--scale_by_viral_load`` flag. The header and body colors can be
changed with the ``--headerColor [mycolorname/hexcolor]`` and
``--bodyColor [mycolorname/hexcolor]`` options, respectively. The
``--mincov`` option is also available, as in ``plot``. The resulting
dashboard will look like
`this <https://htmlpreview.github.io/?https://github.com/andersen-lab/Freyja/blob/main/freyja/data/test0.html>`__.
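
The difference between plain averaging and viral-load-scaled averaging of same-day samples can be sketched as follows. This is a hedged illustration only: the samples, lineage names, viral loads, and the ``average`` helper are hypothetical, and a simple load-weighted mean is an assumption about what scaling by viral load amounts to.

.. code-block:: python

   # Hypothetical same-day samples: lineage abundances plus a viral load.
   samples = [
       {"abundances": {"BA.2": 0.8, "XBB.1.5": 0.2}, "viral_load": 1000.0},
       {"abundances": {"BA.2": 0.4, "XBB.1.5": 0.6}, "viral_load": 3000.0},
   ]

   def average(samples, scale_by_viral_load=False):
       """Average abundances, optionally weighting each sample by viral load."""
       weights = [s["viral_load"] if scale_by_viral_load else 1.0
                  for s in samples]
       total = sum(weights)
       lineages = sorted({name for s in samples for name in s["abundances"]})
       return {
           name: sum(w * s["abundances"].get(name, 0.0)
                     for w, s in zip(weights, samples)) / total
           for name in lineages
       }

   plain = average(samples)
   weighted = average(samples, scale_by_viral_load=True)
   print(plain)
   print(weighted)

In this toy case the second sample carries three times the viral load, so its lineage mix pulls the weighted average toward itself.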

The plot can now be configured using the
``--config [path-to-plot-config-file]`` option. The `plot config
file <freyja/data/plot_config.yml>`__ is a yaml file. More information
about the plot config file can be found in the `sample config
file <freyja/data/plot_config.yml>`__. By default, this will use the
lineage hierarchy information present in ``freyja/dash/lineages.yml``,
but a custom hierarchy can be supplied using the
``--lineageyml [path-to-hierarchy-file]`` option. The
``--keep_plot_files`` option can be used to keep the intermediate html
for the core plot (by default, it is deleted after incorporation into
the main html output).

A CSV file will also be created along with the html dashboard which will
contain the relative growth rates for each lineage. The lineages will be
grouped together based on the ``Lineages`` key specified in the config
file if provided.
48 changes: 48 additions & 0 deletions _sources/src/usage/demix.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
.. click:: freyja._cli:demix
   :prog: freyja demix
   :nested: full
   :commands: demix

------------

**Example Usage:**

After running ``freyja variants`` we can run:
``freyja demix [variants-file] [depth-file] --output [output-file]``

This outputs to a tsv file that includes the lineages present, their
corresponding abundances, and summarization by constellation. This
method also includes an ``--eps`` option, which enables the user to
define the minimum lineage abundance returned
(e.g. ``--eps 0.0001``). A custom barcode file can be provided using the
``--barcodes [path-to-barcode-file]`` option. By default, freyja uses
the lineage hierarchy file located in the ``freyja/data`` directory,
which is updated every time the ``freyja update`` command is run. The
user, however, can define a custom lineage hierarchy file
using ``--lineageyml [path-to-lineage-file]``. Historic ``lineage.yml``
files are available in the Freyja-data GitHub repository
`here <https://github.com/andersen-lab/Freyja-data/tree/main/history_lineage_hierarchy>`_.
Because the UShER tree now includes proposed lineages, we offer the
``--confirmedonly`` flag, which removes unconfirmed lineages from the
analysis. For additional flexibility and reproducibility of analyses, a
custom lineage-to-constellation mapping metadata file can be provided
using the ``--meta`` option. A coverage depth minimum can be specified
using the ``--depthcutoff`` option, which excludes sites with coverage
less than the specified value. An example output should have the format:

+-------------+------------------------------------------------------+
|             | filename                                             |
+=============+======================================================+
| summarized  | [('Delta', 0.65), ('Other', 0.25), ('Alpha', 0.1)]   |
+-------------+------------------------------------------------------+
| lineages    | ['B.1.617.2' 'B.1.2' 'AY.6' 'Q.3']                   |
+-------------+------------------------------------------------------+
| abundances  | "[0.5 0.25 0.15 0.1]"                                |
+-------------+------------------------------------------------------+
| resid       | 3.14159                                              |
+-------------+------------------------------------------------------+
| coverage    | 95.8                                                 |
+-------------+------------------------------------------------------+

Here, ``summarized`` denotes the sum of all lineage abundances within a particular WHO designation (e.g. the B.1.617.2 and AY.6 abundances are summed under Delta in the above example); lineages without a designation are grouped into "Other". The ``lineages`` array lists the identified lineages in descending order of abundance, and ``abundances`` contains the corresponding abundance estimates. Using the ``--depthcutoff`` option may result in some distinct lineages having identical barcodes; these are grouped in the output using the format ``[lineage]-like(num)`` (based on their shared phylogeny). A summary of this lineage grouping is written to ``[output-file]_collapsed_lineages.yml``. The value of ``resid`` corresponds to the residual of the weighted least absolute deviation problem used to estimate lineage abundances. The ``coverage`` value provides the 10x coverage estimate (the percent of sites with 10 or more reads; 10 is the default but can be modified using the ``--covcut`` option in ``demix``). If there is a solver error during the ``demix`` step (generally associated with poor data quality), an error message is returned, along with an output containing empty ``summarized``, ``lineages``, and ``abundances`` fields and ``resid = -1``.
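
For downstream analysis, the fields in the example table above can be unpacked with a short Python sketch. It assumes the field values are strings laid out exactly as in that table; ``split_array`` is a hypothetical helper, and the ``designation`` dictionary is a made-up stand-in for the lineage-to-constellation metadata.

.. code-block:: python

   import ast
   import re

   # Field values copied from the example table above.
   record = {
       "summarized": "[('Delta', 0.65), ('Other', 0.25), ('Alpha', 0.1)]",
       "lineages": "['B.1.617.2' 'B.1.2' 'AY.6' 'Q.3']",
       "abundances": "[0.5 0.25 0.15 0.1]",
       "resid": "3.14159",
       "coverage": "95.8",
   }

   def split_array(text):
       """Split a whitespace-separated array string, dropping brackets/quotes."""
       return re.sub(r"[\[\]'\"]", "", text).split()

   lineages = split_array(record["lineages"])
   abundances = [float(a) for a in split_array(record["abundances"])]
   by_lineage = dict(zip(lineages, abundances))

   # Hypothetical mapping; unmapped lineages fall into "Other".
   designation = {"B.1.617.2": "Delta", "AY.6": "Delta", "Q.3": "Alpha"}
   totals = {}
   for name, value in by_lineage.items():
       group = designation.get(name, "Other")
       totals[group] = totals.get(group, 0.0) + value

   reported = dict(ast.literal_eval(record["summarized"]))
   print(by_lineage)
   print(totals)

Summing the per-lineage abundances by designation reproduces the ``summarized`` row, which is a convenient sanity check when post-processing results.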

**NOTE**: The ``freyja variants`` output is stable in time, and does not need to be re-run to incorporate updated lineage designations/corresponding mutational barcodes, whereas the outputs of ``freyja demix`` will change as barcodes are updated (and thus ``demix`` should be re-run as new information is made available).
22 changes: 22 additions & 0 deletions _sources/src/usage/extract.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
.. click:: freyja._cli:extract
   :prog: freyja extract
   :nested: full
   :commands: extract

------------

**Example Usage:**

The above command will extract reads containing one or more mutations of
interest and save them to ``[input-bam]_extracted.bam``. Additionally,
the ``--same_read`` flag can be included to specify that all query
mutations must occur on the same set of paired reads (same UMI).

Formatting requirements for ``[query-mutations.csv]`` can be
found below, where SNPs, insertions, and deletions are listed in
separate lines.

::

   C75T,G230A,A543C
   (732:'TT'),(1349:'A'),(12333:'A')
   (1443:32),(1599:2),(2036:3)
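
The three-line layout above can be checked before running the command with a small parser such as the one sketched here. This is not Freyja's own parsing code: the variable names are hypothetical, and reading the second number of each deletion entry as a length is an assumption based on the example.

.. code-block:: python

   import re

   # The three lines of the example query file shown above.
   query_lines = [
       "C75T,G230A,A543C",
       "(732:'TT'),(1349:'A'),(12333:'A')",
       "(1443:32),(1599:2),(2036:3)",
   ]
   snp_line, insertion_line, deletion_line = query_lines

   # SNPs: reference base, position, alternate base.
   snps = [(m[0], int(m[1:-1]), m[-1]) for m in snp_line.split(",")]
   # Insertions: position and inserted sequence.
   insertions = [
       (int(pos), seq)
       for pos, seq in re.findall(r"\((\d+):'([A-Za-z]+)'\)", insertion_line)
   ]
   # Deletions: position and (assumed) deletion length.
   deletions = [
       (int(pos), int(length))
       for pos, length in re.findall(r"\((\d+):(\d+)\)", deletion_line)
   ]

   print(snps)
   print(insertions)
   print(deletions)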
20 changes: 20 additions & 0 deletions _sources/src/usage/filter.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
.. click:: freyja._cli:filter
   :prog: freyja filter
   :nested: full
   :commands: filter

------------

**Example Usage:**

Excludes reads containing one or more mutations, akin to ``freyja extract`` but with the opposite effect.

``[min-site]`` and ``[max-site]`` specify the range of reads to
include. Formatting requirements for ``[query-mutations.csv]`` can be
found below, where SNPs, insertions, and deletions are listed in
separate lines.

::

   C75T,G230A,A543C
   (732:'TT'),(1349:'A'),(12333:'A')
   (1443:32),(1599:2),(2036:3)
39 changes: 39 additions & 0 deletions _sources/src/usage/get-lineage-def.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
.. click:: freyja._cli:get_lineage_def
   :prog: freyja get-lineage-def
   :nested: full
   :commands: get_lineage_def

------------

**Example Usage:**

This command fetches the lineage-defining mutations for the specified lineage. Defaults to stdout if no output file is specified.

.. code-block:: bash

   freyja get-lineage-def B.1.1.7
   C241T
   C913T
   C3037T
   C3267T
   C5388A
   (cont...)

If a gene annotation file (gff3) is provided along with a reference genome (fasta), the mutations will include the respective amino acid changes for non-synonymous mutations.

.. code-block:: bash

   freyja get-lineage-def BA.2.12 --annot freyja/data/NC_045512_Hu-1.gff --ref freyja/data/NC_045512_Hu-1.fasta
   ...
   C21618T(S:T19I)
   T22200G(S:V213G)
   G22578A(S:G339D)
   C22674T(S:S371F)
   T22679C(S:S373P)
   C22686T(S:S375F)
   A22688G(S:T376A)
   G22775A(S:D405N)
   A22786C(S:R408S)
   G22813T(S:K417N)
   (cont...)
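
Output lines of this kind are easy to split into their parts for further processing. The sketch below is a minimal, hypothetical parser (``parse_mutation`` is not part of Freyja) that handles only the single-nucleotide forms shown in the examples above, with or without an amino acid annotation.

.. code-block:: python

   import re

   # Matches e.g. "C241T" or "C21618T(S:T19I)"; other forms are not covered.
   MUTATION = re.compile(
       r"^(?P<ref>[ACGT])(?P<pos>\d+)(?P<alt>[ACGT])"
       r"(?:\((?P<gene>[^:()]+):(?P<aa>[^()]+)\))?$"
   )

   def parse_mutation(text):
       """Split one mutation string into reference, position, alternate,
       and (if present) gene and amino acid change."""
       match = MUTATION.match(text.strip())
       if match is None:
           raise ValueError(f"unrecognized mutation: {text!r}")
       fields = match.groupdict()
       fields["pos"] = int(fields["pos"])
       return fields

   plain = parse_mutation("C241T")
   annotated = parse_mutation("C21618T(S:T19I)")
   print(plain)
   print(annotated)

Unannotated mutations simply come back with ``gene`` and ``aa`` set to ``None``, so the same function works on both kinds of output.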
24 changes: 24 additions & 0 deletions _sources/src/usage/plot-covariants.rst.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
.. click:: freyja._cli:plot_covariants
   :prog: freyja plot-covariants
   :nested: full
   :commands: plot-covariants

------------

**Example Usage:**

This generates an informative heatmap visualization showing the
frequency and coverage ranges of each observed covariant pattern. ``--num_clusters`` sets the
number of patterns (i.e. rows in the plot) to display, while the
threshold ``--min_mutations`` sets a minimum number of mutations per set
of covariants for inclusion in the plot. The flag ``--nt_muts`` can be
included to provide nucleotide-specific information in the plot
(e.g. C22995A(S:T478K) as opposed to S:T478K), making it possible to
view multi-nucleotide variants.



**Example output:**

|image2|

.. |image2| image:: ../../../freyja/data/example_covariants_plot.png