Add new merge_clin() function
- new function wrapper to allow users
  to merge in clinical variables to `soma_adat`
  objects easily
- closes #80
stufield committed Mar 11, 2024
1 parent e433267 commit 989966c
Showing 5 changed files with 174 additions and 0 deletions.
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -106,6 +106,7 @@ export(loadAdatsAsList)
export(locateSeqId)
export(matchSeqIds)
export(meltExpressionSet)
export(merge_clin)
export(mutate)
export(parseHeader)
export(pivotExpressionSet)
@@ -172,6 +173,7 @@ importFrom(tidyr,unite)
importFrom(tools,md5sum)
importFrom(utils,capture.output)
importFrom(utils,head)
importFrom(utils,read.csv)
importFrom(utils,read.delim)
importFrom(utils,tail)
importFrom(utils,write.table)
84 changes: 84 additions & 0 deletions R/merge-clin.R
@@ -0,0 +1,84 @@
#' Merge Clinical Data into Data Frame
#'
#' Occasionally, additional clinical data is obtained _after_ samples
#' have been submitted to SomaLogic, Inc. or even after 'SomaScan'
#' results have been delivered.
#' This requires the new clinical, i.e. non-proteomic, data to be
#' merged with 'SomaScan' data into a "new" ADAT prior to analysis.
#' This function merges such clinical variables into an
#' existing `soma_adat` object and is a thin wrapper
#' around [dplyr::left_join()].
#'
#' This is a package export of the `merge_clin.R` command-line tool (R script)
#' that lives in the `cli/merge` system file directory. Please see:
#' \itemize{
#' \item `dir(system.file("cli/merge", package = "SomaDataIO"), full.names = TRUE)`
#' \item `vignette("cli-merge-tool", package = "SomaDataIO")`
#' }
#'
#' @inheritParams params
#' @param clin_data One of 2 options:
#' \itemize{
#' \item A data frame containing clinical variables to merge into `x`, or
#' \item A path to a file, typically a `*.csv`,
#' containing clinical variables to merge into `x`.
#' }
#' @param by A character vector of variables to join by.
#' See [dplyr::left_join()] for more details.
#' @param ... Additional parameters passed to [dplyr::left_join()].
#' @return An object of the same class as `x` with new clinical
#' variables merged.
#' @author Stu Field
#' @seealso [dplyr::left_join()]
#' @examples
#' # retrieve clinical data
#' clin_file <- system.file("cli/merge/meta.csv", package = "SomaDataIO", mustWork = TRUE)
#' clin_file
#'
#' # view clinical data to be merged:
#' # 1) `group`
#' # 2) `newvar`
#' clin_df <- read.csv(clin_file, header = TRUE)
#' clin_df
#'
#' # ensure compatible type for `by =`
#' clin_df$SampleId <- as.character(clin_df$SampleId)
#'
#' # create mini-adat
#' apts <- withr::with_seed(123, sample(getAnalytes(example_data), 3L))
#' adat <- head(example_data, 10L) |>
#'   dplyr::select(SampleId, all_of(apts))
#'
#' # merge clinical variables
#' adat_merged <- merge_clin(adat, clin_df, by = "SampleId")
#' adat_merged
#'
#' # Alternative syntax:
#' # merge on different variable names
#' clin_df2 <- system.file("cli/merge/meta2.csv", package = "SomaDataIO",
#'                         mustWork = TRUE) |> read.csv(header = TRUE)
#' clin_df2
#'
#' clin_df2$ClinKey <- as.character(clin_df2$ClinKey)
#' adat_merged2 <- merge_clin(adat, clin_df2, by = c(SampleId = "ClinKey"))
#' adat_merged2
#' @importFrom utils read.csv
#' @importFrom dplyr left_join
#' @export
merge_clin <- function(x, clin_data, by = NULL, ...) {

  stopifnot("`x` must be a `soma_adat`." = is.soma_adat(x))

  if ( inherits(clin_data, "data.frame") ) {
    clin_df <- clin_data
  } else if ( is.character(clin_data) && length(clin_data) == 1L &&
              file.exists(clin_data) ) {
    clin_df <- normalizePath(clin_data, mustWork = TRUE) |>
      utils::read.csv(header = TRUE)
  } else {
    stop("Invalid `clin_data` argument: ", .value(class(clin_data)), call. = FALSE)
  }

  dplyr::left_join(x, clin_df, by = by, ...)  # forward `...` as documented
}
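
The join semantics that `merge_clin()` relies on can be sketched with plain data frames. This is a minimal illustration, not part of the commit: the toy data, the analyte column name `seq.1234.56`, and the values are all hypothetical, and only `dplyr::left_join()` itself is taken from the source. It mirrors the `by = c(SampleId = "ClinKey")` form from the roxygen examples: every row of the left table is kept, unmatched samples get `NA`, and the right-hand key column is dropped.

```r
# Hypothetical stand-in for a `soma_adat` (a plain data frame here)
adat <- data.frame(
  SampleId    = c("1", "2", "3"),
  seq.1234.56 = c(10.5, 20.1, 30.7),  # illustrative analyte column
  stringsAsFactors = FALSE
)

# Hypothetical clinical table keyed on a differently named column;
# the key is character so that it matches the type of `SampleId`
clin <- data.frame(
  ClinKey = c("1", "3"),
  group   = c("a", "b"),
  stringsAsFactors = FALSE
)

# Named `by`: left-hand name = right-hand name
merged <- dplyr::left_join(adat, clin, by = c(SampleId = "ClinKey"))

nrow(merged)    # all 3 rows of `adat` are retained
names(merged)   # "SampleId" "seq.1234.56" "group"
merged$group    # "a" NA "b"
```

The key columns must have compatible types, which is why the examples coerce `SampleId` and `ClinKey` with `as.character()` before merging.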

1 change: 1 addition & 0 deletions _pkgdown.yml
@@ -136,6 +136,7 @@ reference:
- starts_with("getAnalyte")
- getMeta
- diffAdats
- merge_clin

- title: Transform Between SomaScan Versions
desc: >
84 changes: 84 additions & 0 deletions man/merge_clin.Rd

3 changes: 3 additions & 0 deletions vignettes/cli-merge-tool.Rmd
@@ -39,6 +39,9 @@ in the `cli/merge/` directory, which allows one to
generate an updated `*.adat` file via the command-line without
having to launch an integrated development environment ("IDE"), e.g. `RStudio`.

To use `SomaDataIO`'s exported functionality from _within_ an R session,
please see `merge_clin()`.


----------------
