docs: regenerate automatic docs

taxprofiler · Aug 24, 2023 · 3a83e40
1 parent 346e1fd
Showing 3 changed files with 68 additions and 58 deletions.
diff --git a/docs/quick_reference/merge_help.txt b/docs/quick_reference/merge_help.txt
@@ -8,7 +8,7 @@ Arguments:
                              Filenames will be parsed as sample names.
 
 Options:
-  -p, --profiler [bracken|centrifuge|diamond|kaiju|kraken2|krakenuniq|megan6|metaphlan|motus]
+  -p, --profiler [bracken|centrifuge|diamond|ganon|kaiju|kmcp|kraken2|krakenuniq|megan6|metaphlan|motus]
                                   The taxonomic profiler used. All provided
                                   profiles must come from the same tool!
                                   [required]
@@ -66,4 +66,7 @@ Options:
   --add-id-lineage                Add the taxon's entire lineage to the
                                   output. These are taxon identifiers
                                   separated by semi-colons.
+  --add-rank-lineage              Add the taxon's entire rank lineage to the
+                                  output. These are taxon ranks separated by
+                                  semi-colons.
   -h, --help                      Show this message and exit.
diff --git a/docs/quick_reference/standardise_help.txt b/docs/quick_reference/standardise_help.txt
@@ -6,7 +6,7 @@ Arguments:
   PROFILE  A file containing a taxonomic profile.  [required]
 
 Options:
-  -p, --profiler [bracken|centrifuge|diamond|kaiju|kraken2|krakenuniq|megan6|metaphlan|motus]
+  -p, --profiler [bracken|centrifuge|diamond|ganon|kaiju|kmcp|kraken2|krakenuniq|megan6|metaphlan|motus]
                                   The taxonomic profiler used.  [required]
   -o, --output PATH               The desired output file. By default, the
                                   file extension will be used to determine the
@@ -46,4 +46,7 @@ Options:
   --add-id-lineage                Add the taxon's entire lineage to the
                                   output. These are taxon identifiers
                                   separated by semi-colons.
+  --add-rank-lineage              Add the taxon's entire rank lineage to the
+                                  output. These are taxon ranks separated by
+                                  semi-colons.
   -h, --help                      Show this message and exit.
diff --git a/docs/tutorials/tutorial_r_snippets.md b/docs/tutorials/tutorial_r_snippets.md
@@ -1,30 +1,31 @@
+
 <!-- --8<-- [start:software] -->
 
--   Unix terminal (e.g., `bash`)
+- Unix terminal (e.g., `bash`)
 
--   [Python environment with taxpasta](../index.md#install)
+- [Python environment with taxpasta](/#install)
 
--   [R](https://rstudio-education.github.io/hopr/starting.html#how-to-download-and-install-r)
+- [R](https://rstudio-education.github.io/hopr/starting.html#how-to-download-and-install-r)
 
--   Package dependencies
+- Package dependencies
 
-    ```r
-    install.packages(c("readr", "dplyr"), dependencies = TRUE)
-    ```
+  ``` r
+  install.packages(c("readr", "dplyr"), dependencies = TRUE)
+  ```
 
 <!-- --8<-- [end:software] -->
 <!-- --8<-- [start:raw-motus] -->
 
 We can try loading a mOTUs profile into R using the common table reading
 function `read_tsv()` from the `readr` package with default arguments.
 
-```r
+``` r
 requireNamespace("readr")
 ```
 
     Loading required namespace: readr
 
-```r
+``` r
 profile_motus <- readr::read_tsv("2612_pe-ERR5766176-db_mOTU.out")
 ```
 
@@ -42,15 +43,16 @@ profile_motus <- readr::read_tsv("2612_pe-ERR5766176-db_mOTU.out")
     ℹ Use `spec()` to retrieve the full column specification for this data.
     ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
 
-You can see we immediately hit an error, as there is a ‘comment’ line at
-the top of the mOTUs profile with information on how the profile was
-generated.
+You can see we immediately hit an error because, as we saw above, there
+is a ‘comment’ line at the top of the mOTUs profile with information on
+how the profile was generated.
 
-While such a comment is very useful for reproducibility, to load this we
-have to instead add extra options to the function, which makes loading
-the table less than smooth for downstream analyses.
+While such a comment is very useful for reproducibility, to load this
+into software expecting ‘true’ tabular data, we have to instead add
+extra options to the function, which makes loading the table less than
+smooth for downstream analyses.
 
-```r
+``` r
 profile_motus <- readr::read_tsv("2612_pe-ERR5766176-db_mOTU.out", comment = "#")
 ```
 
@@ -68,11 +70,11 @@ profile_motus <- readr::read_tsv("2612_pe-ERR5766176-db_mOTU.out", comment = "#"
     ℹ Use `spec()` to retrieve the full column specification for this data.
     ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
 
-However, once again we hit another error: the column headers are _also_
+However, once again we hit another error: the column headers are *also*
 specified as a comment line… Instead we can try to skip the first two
 lines entirely.
 
-```r
+``` r
 profile_motus <- readr::read_tsv("2612_pe-ERR5766176-db_mOTU.out", skip = 2)
 ```
 
@@ -85,7 +87,7 @@ profile_motus <- readr::read_tsv("2612_pe-ERR5766176-db_mOTU.out", skip = 2)
     ℹ Use `spec()` to retrieve the full column specification for this data.
     ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
 
-```r
+``` r
 profile_motus
 ```
 
@@ -108,7 +110,7 @@ profile_motus
 <!-- --8<-- [end:raw-motus] -->
 <!-- --8<-- [start:raw-kraken2] -->
 
-```r
+``` r
 profile_kraken2 <- readr::read_tsv("2612_pe-ERR5766176-db1.kraken2.report.txt")
 ```
 
@@ -122,30 +124,30 @@ profile_kraken2 <- readr::read_tsv("2612_pe-ERR5766176-db1.kraken2.report.txt")
     • `627680` -> `627680...2`
     • `627680` -> `627680...3`
 
-```r
+``` r
 profile_kraken2
 ```
 
     # A tibble: 43 × 6
-       `99.97` `627680...2` `627680...3` U        `0` unclassified
-         <dbl>        <dbl>        <dbl> <chr>  <dbl> <chr>
-     1    0.03          168            0 R          1 root
+       `99.97` `627680...2` `627680...3` U        `0` unclassified      
+         <dbl>        <dbl>        <dbl> <chr>  <dbl> <chr>             
+     1    0.03          168            0 R          1 root              
      2    0.03          168            0 R1    131567 cellular organisms
-     3    0.03          168            0 D       2759 Eukaryota
-     4    0.03          168            0 D1     33154 Opisthokonta
-     5    0.02          152            0 K      33208 Metazoa
-     6    0.02          152            0 K1      6072 Eumetazoa
-     7    0.02          152            0 K2     33213 Bilateria
-     8    0.02          152            0 K3     33511 Deuterostomia
-     9    0.02          152            0 P       7711 Chordata
-    10    0.02          152            0 P1     89593 Craniata
+     3    0.03          168            0 D       2759 Eukaryota         
+     4    0.03          168            0 D1     33154 Opisthokonta      
+     5    0.02          152            0 K      33208 Metazoa           
+     6    0.02          152            0 K1      6072 Eumetazoa         
+     7    0.02          152            0 K2     33213 Bilateria         
+     8    0.02          152            0 K3     33511 Deuterostomia     
+     9    0.02          152            0 P       7711 Chordata          
+    10    0.02          152            0 P1     89593 Craniata          
     # … with 33 more rows
 
 This doesn’t fail to load but unfortunately the column headers look a
 bit weird. It seems the Kraken2 file does not include a column header!
 In this case we have to specify these ourselves.
 
-```r
+``` r
 profile_kraken2 <- readr::read_tsv(
     "2612_pe-ERR5766176-db1.kraken2.report.txt",
     col_names = c(
@@ -168,15 +170,15 @@ profile_kraken2 <- readr::read_tsv(
     ℹ Use `spec()` to retrieve the full column specification for this data.
     ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
 
-```r
+``` r
 profile_kraken2
 ```
 
     # A tibble: 44 × 6
-       percent clade_assigned_reads direct_assigned_reads taxonomy_lvl taxon…¹ name
+       percent clade_assigned_reads direct_assigned_reads taxonomy_lvl taxon…¹ name 
          <dbl>                <dbl>                 <dbl> <chr>          <dbl> <chr>
      1  100.                 627680                627680 U                  0 uncl…
-     2    0.03                  168                     0 R                  1 root
+     2    0.03                  168                     0 R                  1 root 
      3    0.03                  168                     0 R1            131567 cell…
      4    0.03                  168                     0 D               2759 Euka…
      5    0.03                  168                     0 D1             33154 Opis…
@@ -194,13 +196,13 @@ In the tidyverse flavour of R, we can do this with the `full_join`
 function of the `dplyr` package. This form of joining tables includes
 all rows both from the left and right table in the resulting table.
 
-```r
+``` r
 requireNamespace("dplyr")
 ```
 
     Loading required namespace: dplyr
 
-```r
+``` r
 dplyr::full_join(profile_motus, profile_kraken2)
 ```
 
@@ -210,27 +212,29 @@ dplyr::full_join(profile_motus, profile_kraken2)
 
 The error `by must be supplied when x and y have no common variables`
 occurs because the column names are not the same between the two tables
-for the different profilers’ outputs. We need to specify which column of
-the left table should be joined with what column of the right table.
+for the different profilers’ outputs.
+
+We need to specify which column of the left table should be joined with
+what column of the right table.
 
-```r
+``` r
 raw_merged_table <- dplyr::full_join(profile_motus, profile_kraken2, by = c("NCBI_tax_id" = "taxonomy_id"))
 raw_merged_table
 ```
 
     # A tibble: 33,615 × 8
-       `#consensus_taxonomy`   NCBI_…¹ 2612_…² percent clade…³ direc…⁴ taxon…⁵ name
+       `#consensus_taxonomy`   NCBI_…¹ 2612_…² percent clade…³ direc…⁴ taxon…⁵ name 
        <chr>                     <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <chr>   <chr>
-     1 Leptospira alexanderi …  100053       0      NA      NA      NA <NA>    <NA>
-     2 Leptospira weilii [ref…   28184       0      NA      NA      NA <NA>    <NA>
-     3 Chryseobacterium sp. […      NA       0      NA      NA      NA <NA>    <NA>
-     4 Chryseobacterium galli… 1324352       0      NA      NA      NA <NA>    <NA>
-     5 Chryseobacterium indol…     253       0      NA      NA      NA <NA>    <NA>
-     6 Chryseobacterium artoc…      NA       0      NA      NA      NA <NA>    <NA>
-     7 Chryseobacterium jejue…  445960       0      NA      NA      NA <NA>    <NA>
-     8 Chryseobacterium sp. G… 1805473       0      NA      NA      NA <NA>    <NA>
-     9 Chryseobacterium conta… 1423959       0      NA      NA      NA <NA>    <NA>
-    10 Chryseobacterium indol…     253       0      NA      NA      NA <NA>    <NA>
+     1 Leptospira alexanderi …  100053       0      NA      NA      NA <NA>    <NA> 
+     2 Leptospira weilii [ref…   28184       0      NA      NA      NA <NA>    <NA> 
+     3 Chryseobacterium sp. […      NA       0      NA      NA      NA <NA>    <NA> 
+     4 Chryseobacterium galli… 1324352       0      NA      NA      NA <NA>    <NA> 
+     5 Chryseobacterium indol…     253       0      NA      NA      NA <NA>    <NA> 
+     6 Chryseobacterium artoc…      NA       0      NA      NA      NA <NA>    <NA> 
+     7 Chryseobacterium jejue…  445960       0      NA      NA      NA <NA>    <NA> 
+     8 Chryseobacterium sp. G… 1805473       0      NA      NA      NA <NA>    <NA> 
+     9 Chryseobacterium conta… 1423959       0      NA      NA      NA <NA>    <NA> 
+    10 Chryseobacterium indol…     253       0      NA      NA      NA <NA>    <NA> 
     # … with 33,605 more rows, and abbreviated variable names ¹NCBI_tax_id,
     #   ²`2612_pe-ERR5766176-db_mOTU`, ³clade_assigned_reads,
     #   ⁴direct_assigned_reads, ⁵taxonomy_lvl
@@ -241,7 +245,7 @@ raw_merged_table
 Now let’s try to load the taxpasta standardised Kraken2 result into R
 again.
 
-```r
+``` r
 profile_kraken2_std <- readr::read_tsv("2612_pe-ERR5766176-db1_kraken2.tsv")
 ```
 
@@ -253,7 +257,7 @@ profile_kraken2_std <- readr::read_tsv("2612_pe-ERR5766176-db1_kraken2.tsv")
     ℹ Use `spec()` to retrieve the full column specification for this data.
     ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
 
-```r
+``` r
 profile_kraken2_std
 ```
 
@@ -278,7 +282,7 @@ profile_kraken2_std
 Once again, let’s try loading the standardised and merged mOTUs result
 into R.
 
-```r
+``` r
 profile_motus_merged <- readr::read_tsv("dbMOTUs_motus.tsv")
 ```
 
@@ -290,7 +294,7 @@ profile_motus_merged <- readr::read_tsv("dbMOTUs_motus.tsv")
     ℹ Use `spec()` to retrieve the full column specification for this data.
     ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
 
-```r
+``` r
 profile_motus_merged
 ```