Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move DBs to separate package #10

Merged
merged 16 commits into from
Jun 27, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -6,3 +6,5 @@
^codecov\.yml$
^data-raw$
^R/abx_index_old\.R$
^\.github$
^.*\.Rproj$
1 change: 1 addition & 0 deletions .github/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
*.html
52 changes: 52 additions & 0 deletions .github/workflows/R-CMD-check.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
on:
push:
branches: [main, master]
pull_request:
branches: [main, master]

name: R-CMD-check

permissions: read-all

jobs:
R-CMD-check:
runs-on: ${{ matrix.config.os }}

name: ${{ matrix.config.os }} (${{ matrix.config.r }})

strategy:
fail-fast: false
matrix:
config:
- {os: macos-latest, r: 'release'}
- {os: windows-latest, r: 'release'}
- {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'}
- {os: ubuntu-latest, r: 'release'}
- {os: ubuntu-latest, r: 'oldrel-1'}

env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
R_KEEP_PKG_SOURCE: yes

steps:
- uses: actions/checkout@v4

- uses: r-lib/actions/setup-pandoc@v2

- uses: r-lib/actions/setup-r@v2
with:
r-version: ${{ matrix.config.r }}
http-user-agent: ${{ matrix.config.http-user-agent }}
use-public-rspm: true

- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: any::rcmdcheck
needs: check

- uses: r-lib/actions/check-r-package@v2
with:
upload-snapshots: true
build_args: 'c("--no-manual","--compact-vignettes=gs+qpdf")'
61 changes: 61 additions & 0 deletions .github/workflows/test-coverage.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
on:
push:
branches: [main, master]
pull_request:
branches: [main, master]

name: test-coverage

permissions: read-all

jobs:
test-coverage:
runs-on: ubuntu-latest
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

steps:
- uses: actions/checkout@v4

- uses: r-lib/actions/setup-r@v2
with:
use-public-rspm: true

- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: any::covr, any::xml2
needs: coverage

- name: Test coverage
run: |
cov <- covr::package_coverage(
quiet = FALSE,
clean = FALSE,
install_path = file.path(normalizePath(Sys.getenv("RUNNER_TEMP"), winslash = "/"), "package")
)
covr::to_cobertura(cov)
shell: Rscript {0}

- uses: codecov/codecov-action@v4
with:
fail_ci_if_error: ${{ github.event_name != 'pull_request' && true || false }}
file: ./cobertura.xml
plugin: noop
disable_search: true
token: ${{ secrets.CODECOV_TOKEN }}

- name: Show testthat output
if: always()
run: |
## --------------------------------------------------------------------
find '${{ runner.temp }}/package' -name 'testthat.Rout*' -exec cat '{}' \; || true
shell: bash

- name: Upload test results
if: failure()
uses: actions/upload-artifact@v4
with:
name: coverage-test-failures
path: ${{ runner.temp }}/package
14 changes: 0 additions & 14 deletions .travis.yml

This file was deleted.

9 changes: 6 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -3,16 +3,19 @@ Title: Modeling the Susceptibility of a Bacterial Community to Antibiotics
Version: 0.0.1.0
Author: Vincent Tu <tuv@email.chop.edu>
Maintainer: Vincent Tu <tuv@email.chop.edu>
Description: This package calculates an index for a given bacterial community's susceptibility to the specified antibiotics.
Description: Calculates an index for a given bacterial community's susceptibility to specified antibiotics.
License: GPL-3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.2
RoxygenNote: 7.2.3
Suggests:
knitr,
rmarkdown,
covr,
testthat
Depends:
Depends:
R (>= 2.10)
Imports:
whatbacteria
Remotes: PennChopMicrobiomeProgram/whatbacteria
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,4 @@ export(mirix_metronidazole)
export(mirix_vancomycin)
export(phenotype_susceptibility)
export(predict_abundance)
import(whatbacteria)
17 changes: 9 additions & 8 deletions R/abx_index.R
Original file line number Diff line number Diff line change
Expand Up @@ -24,8 +24,8 @@ NULL
mirix_vancomycin <- function(abundance,
lineage,
replace_zero = 1e-4,
antibiotic_db = taxon_susceptibility,
phenotype_db = taxon_phenotypes) {
antibiotic_db = whatbacteria::taxon_susceptibility,
phenotype_db = whatbacteria::taxon_phenotypes) {
susceptibility <- antibiotic_susceptibility_vancomycin(
lineage, antibiotic_db, phenotype_db)
mirix(abundance, susceptibility, replace_zero)
Expand All @@ -36,7 +36,7 @@ mirix_vancomycin <- function(abundance,
mirix_doxycycline <- function(abundance,
lineage,
replace_zero = 1e-4,
antibiotic_db = taxon_susceptibility) {
antibiotic_db = whatbacteria::taxon_susceptibility) {
susceptibility <- antibiotic_susceptibility_tetracycline(
lineage, antibiotic_db)
mirix(abundance, susceptibility, replace_zero)
Expand All @@ -47,7 +47,7 @@ mirix_doxycycline <- function(abundance,
mirix_amoxicillin <- function(abundance,
lineage,
replace_zero = 1e-4,
antibiotic_db = taxon_susceptibility) {
antibiotic_db = whatbacteria::taxon_susceptibility) {
susceptibility <- antibiotic_susceptibility_penicillin(
lineage, antibiotic_db)
mirix(abundance, susceptibility, replace_zero)
Expand All @@ -58,7 +58,7 @@ mirix_amoxicillin <- function(abundance,
mirix_metronidazole <- function(abundance,
lineage,
replace_zero = 1e-4,
phenotype_db = taxon_phenotypes) {
phenotype_db = whatbacteria::taxon_phenotypes) {
susceptibility <- phenotype_susceptibility(
lineage = lineage,
phenotype = "aerobic_status",
Expand All @@ -75,7 +75,7 @@ mirix_metronidazole <- function(abundance,
mirix_ciprofloxacin <- function(abundance,
lineage,
replace_zero = 1e-4,
phenotype_db = taxon_phenotypes) {
phenotype_db = whatbacteria::taxon_phenotypes) {
susceptibility <- phenotype_susceptibility(
lineage = lineage,
phenotype = "aerobic_status",
Expand All @@ -92,8 +92,8 @@ mirix_ciprofloxacin <- function(abundance,
mirix_gentamicin <- function(abundance,
lineage,
replace_zero = 1e-4,
antibiotic_db = taxon_susceptibility,
phenotype_db = taxon_phenotypes) {
antibiotic_db = whatbacteria::taxon_susceptibility,
phenotype_db = whatbacteria::taxon_phenotypes) {
susceptibility <- antibiotic_susceptibility_aminoglycoside(
lineage, antibiotic_db, phenotype_db)
mirix(abundance, susceptibility, replace_zero)
Expand All @@ -111,6 +111,7 @@ mirix_gentamicin <- function(abundance,
#' 0.5 is typical. For relative abundances, a number that is slightly lower
#' than the lowest relative abundance will work.
#'
#' @import whatbacteria
#' @return The MiRIx value
#' @export
mirix <- function (abundance, susceptibility, replace_zero = 1e-4) {
Expand Down
28 changes: 0 additions & 28 deletions R/data.R
Original file line number Diff line number Diff line change
@@ -1,31 +1,3 @@
#' Gram stain and aerobic status of bacterial taxa
#' @format A data frame with the following columns:
#' \describe{
#' \item{taxon}{The name of the taxon}
#' \item{rank}{The rank of the taxon}
#' \item{aerobic_status}{
#' The aerobic status. One of "aerobe", "facultative anaerobe", or
#' "obligate anaerobe".}
#' \item{gram_stain}{
#' How the taxon appears when Gram-stained. One of "Gram-positive" or
#' "Gram-negative".}
#' \item{doi}{DOI of the publication from which the information was obtained.}
#' }
"taxon_phenotypes"

#' Antibiotic susceptibility of bacterial taxa
#' @format A data frame with the following columns:
#' \describe{
#' \item{taxon}{The name of the taxon}
#' \item{rank}{The rank of the taxon}
#' \item{antibiotic}{The antibiotic or antibiotic class}
#' \item{value}{
#' The susceptibility of the taxon to the antibiotic, one of "susceptible"
#' or "resistant".}
#' \item{doi}{DOI of the publication from which the information was obtained.}
#' }
"taxon_susceptibility"

#' Example data from Weiss et al.
#'
#' @format A data frame with the following columns:
Expand Down
110 changes: 4 additions & 106 deletions R/match.R
Original file line number Diff line number Diff line change
Expand Up @@ -28,12 +28,8 @@
#' @export
antibiotic_susceptibility <- function (lineage,
antibiotic,
db = taxon_susceptibility) {
is_relevant <- db$antibiotic %in% antibiotic
db <- db[is_relevant, c("taxon", "rank", "value")]

susceptibility_values <- match_annotation(lineage, db)
susceptibility_values
db = whatbacteria::taxon_susceptibility) {
whatbacteria::what_antibiotic(lineage, antibiotic, db)
}

#' Evaluate antibiotic susceptibility based on phenotype
Expand Down Expand Up @@ -74,107 +70,9 @@ antibiotic_susceptibility <- function (lineage,
phenotype_susceptibility <- function (lineage,
phenotype,
susceptibility,
db = taxon_phenotypes) {
is_relevant <- db[[phenotype]] %in% names(susceptibility)
db <- db[is_relevant, c("taxon", "rank", phenotype)]
# match_annotation() requires a column named "value"
colnames(db)[3] <- "value"

phenotype_values <- match_annotation(lineage, db)

db = whatbacteria::taxon_phenotypes) {
phenotype_values <- whatbacteria::what_phenotype(lineage, phenotype, db)
susceptibility_values <- susceptibility[phenotype_values]
susceptibility_values <- unname(susceptibility_values)
susceptibility_values
}

# Determine the annotation values for each lineage
#
# @param lineage A vector of taxonomic assignments or lineages
# @param db A data frame with columns named "taxon", "rank", and "value"
# @return A vector of assigned values
match_annotation <- function (lineage, db) {
get_rank_specific_db <- function (r) {
rank_is_r <- db[["rank"]] %in% r
db[rank_is_r,]
}
db_ranks <- lapply(rev(taxonomic_ranks), get_rank_specific_db)
names(db_ranks) <- rev(taxonomic_ranks)

get_values_by_rank <- function (rank_specific_db) {
taxa_idx <- match_taxa(lineage, rank_specific_db[["taxon"]])
rank_specific_db[["value"]][taxa_idx]
}
values_by_rank <- vapply(
db_ranks,
get_values_by_rank,
rep("a", length(lineage)))

if (length(lineage) == 1) {
assigned_values <- first_non_na_value(values_by_rank)
} else {
assigned_values <- apply(values_by_rank, 1, first_non_na_value)
}
assigned_values
}

# The 'official' taxonomic ranks supported by this package
taxonomic_ranks <- c(
"Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")

# Return the first value that is not NA. If all values are NA, return NA. The
# resultant vector will not have names.
first_non_na_value <- function (x) {
unname(x[first_true_idx(!is.na(x))])
}

# For each lineage, return the index of the taxon that is found within the
# lineage. If no taxa are found, return NA for that element. If multiple taxa
# are found, we issue a warning and return the index of the first taxon in the
# vector of taxa.
match_taxa <- function (lineages, taxa) {
n_lineages <- length(lineages)
if (length(taxa) == 0) {
return(rep_len(NA_character_, length(lineages)))
}

taxa_patterns <- paste0("(?<=__|\\b)(?:", taxa, ")\\b")
lineage_matches <- vapply(
X = taxa_patterns,
FUN = grepl,
FUN.VALUE = rep_len(TRUE, n_lineages),
x = lineages,
perl = TRUE,
USE.NAMES = TRUE)

# If the user passes only one lineage, lineage_matches will be a vector
# rather than an array. After some trial and error, I found that it's better
# to deal with this at each stage of the computation, rather than trying to
# coerce the vector to an array up front.
if (n_lineages == 1) {
multi_matches <- sum(lineage_matches) > 1
} else {
multi_matches <- rowSums(lineage_matches) > 1
}
if (any(multi_matches)) {
warning(
"The following lineages match more than one taxon:\n",
paste(lineages[multi_matches], collapse = "\n"), "\n")
}

if (n_lineages == 1) {
taxon_idx <- first_true_idx(lineage_matches)
} else {
taxon_idx <- apply(lineage_matches, 1, first_true_idx)
}
taxon_idx
}

# Return the first index of a boolean vector that is TRUE. If all elements of
# the vector are FALSE, return NA. Tempted to call this function minwhich.
first_true_idx <- function (x) {
if (any(x)) {
min(which(x == TRUE))
} else {
NA_integer_
}
}
2 changes: 1 addition & 1 deletion R/predict.R
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
#' Predict taxon abundances at given values of an index
#'
#' @param index_value Value or values of the index at which to make predictions.
#' @param abundances A vector of taxon abundances in a sample.
#' @param abundance A vector of taxon abundances in a sample.
#' @param susceptibility A character vector of antibiotic susceptibility, with
#' values that are "susceptible", "resistant", or \code{NA}.
#' @return A new vector of abundances if \code{index_value} has length 1. If
Expand Down
Loading
Loading