Closes #36 update readme #42

Merged on Aug 8, 2023 (10 commits)
7 changes: 7 additions & 0 deletions .github/CODEOWNERS.txt
@@ -0,0 +1,7 @@
# This is a CODEOWNERS file, where you can establish code owners.
# Code owners are automatically requested for review when someone opens a pull request
# that modifies code that they own.
#
#
data/ts.rda @kaz462
R/ts.R @kaz462
7 changes: 0 additions & 7 deletions R/data.R
@@ -93,13 +93,6 @@
#' @author Antonio Rodríguez Contestí
"pp"

#' Questionnaire Dataset
#'
#' An SDTM QS dataset from the CDISC pilot project & Ophthalmology test data
#'
#' @source \url{https://github.com/pharmaverse/admiral.test/blob/main/data/admiral_qs.rda} # nolint
"qs"

#' Ophthalmology Questionnaire Dataset
#'
#' An example Questionnaires SDTM dataset with ophthalmology-specific questionnaire of NEI VFQ-25
54 changes: 29 additions & 25 deletions README.md
@@ -2,23 +2,21 @@

<!-- badges: start -->

[<img src="http://pharmaverse.org/shields/admiral.svg">](https://pharmaverse.org)
[![Test Coverage](https://raw.githubusercontent.com/pharmaverse/pharmaversesdtm/badges/main/test-coverage.svg)](https://github.com/pharmaverse/pharmaversesdtm/actions/workflows/code-coverage.yml)
[<img src="http://pharmaverse.org/shields/admiral.svg"/>](https://pharmaverse.org) [![Test Coverage](https://raw.githubusercontent.com/pharmaverse/pharmaversesdtm/badges/main/test-coverage.svg)](https://github.com/pharmaverse/pharmaversesdtm/actions/workflows/code-coverage.yml)

<!-- badges: end -->

Test data (SDTM) for the pharmaverse family of packages
Test data (SDTM) for the pharmaverse family of packages

# Purpose

To provide a one-stop-shop for SDTM test data in the pharmaverse family of packages. This includes datasets that are therapeutic area (TA)-agnostic (`DM`, `VS`, `EG`, etc.) as well as TA-specific ones (`RS`, `TR`, `OE`, etc.).
To provide a one-stop-shop for SDTM test data in the pharmaverse family of packages. This includes datasets that are therapeutic area (TA)-agnostic (`DM`, `VS`, `EG`, etc.) as well as TA-specific ones (`RS`, `TR`, `OE`, etc.).

# Installation

The package is available from CRAN and can be installed by running `install.packages("pharmaversesdtm")`.
To install the latest development version of the package directly from GitHub, use the following code:
The package is available from CRAN and can be installed by running `install.packages("pharmaversesdtm")`. To install the latest development version of the package directly from GitHub, use the following code:

```r
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("pharmaverse/pharmaversesdtm", ref = "devel")
```
@@ -28,39 +26,45 @@

# Data Sources

Some of the test datasets have been sourced from the [CDISC pilot project](https://github.com/cdisc-org/sdtm-adam-pilot-project), while other datasets have been constructed ad hoc by the admiral team. Please check the [GitHub repository](https://github.com/pharmaverse/admiral.test/tree/main/data) for detailed information regarding the source of specific datasets.
Some of the test datasets have been sourced from the [CDISC pilot project](https://github.com/cdisc-org/sdtm-adam-pilot-project), while other datasets have been constructed ad hoc by the admiral team. Please check the [Reference page](https://pharmaverse.github.io/pharmaversesdtm/cran-release/reference/index.html) for detailed information regarding the source of specific datasets.

# Naming Conventions {#naming}

# Naming Conventions
* Datasets that are TA-agnostic: same as SDTM domain name (e.g., `dm`, `rs`).
* Datasets that are TA-specific: prefix the domain name with the TA (e.g., `onco_rs`, `ophtha_oe`).
* Datasets that are TA-specific: `<domain>_<TA>_<others>`, where any `<others>` parts go from broader categories to more specific ones (e.g., `oe_ophtha`, `rs_onco`, `rs_onco_irecist`).

**Note**: *If an SDTM domain is used by multiple TAs, `{pharmaversesdtm}` may provide multiple versions of the corresponding test dataset. For instance, the package contains `ex` and `ophtha_ex` as the latter contains ophthalmology-specific variables such as `EXLAT` and `EXLOC`, and `EXROUTE` is exchanged for a plausible ophthalmology value.*
**Note**: *If an SDTM domain is used by multiple TAs, `{pharmaversesdtm}` may provide multiple versions of the corresponding test dataset. For instance, the package contains `ex` and `ex_ophtha` as the latter contains ophthalmology-specific variables such as `EXLAT` and `EXLOC`, and `EXROUTE` is exchanged for a plausible ophthalmology value.*
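
The naming pattern can be sketched as a small helper. This is only an illustration: `make_dataset_name()` is a hypothetical function, not part of `{pharmaversesdtm}`, and it assumes a name is simply the lower-cased parts joined with underscores.

```r
# Hypothetical helper (not part of {pharmaversesdtm}): builds a dataset name
# as domain, then TA, then any further qualifiers from broad to specific.
make_dataset_name <- function(domain, ta = NULL, others = character()) {
  parts <- c(domain, ta, others) # NULL and empty vectors drop out here
  tolower(paste(parts, collapse = "_"))
}

make_dataset_name("DM") # TA-agnostic: "dm"
make_dataset_name("OE", "ophtha") # TA-specific: "oe_ophtha"
make_dataset_name("RS", "onco", "irecist") # with a qualifier: "rs_onco_irecist"
```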

# How To Update

First, open a GitHub issue in this repo with the planned updates and tag `@pharmaverse/admiral` so that a member of the core development team can sanity-check the request.
First, open a GitHub issue in [`{pharmaversesdtm}`](https://github.com/pharmaverse/pharmaversesdtm) with the planned updates and tag `@pharmaverse/admiral` so that a member of the core development team can sanity-check the request.
Then there are two main ways to extend the test data: either by adding new datasets or extending existing datasets with new records/variables. Whichever method you choose, it is worth noting the following:

* Programs that generate test data are stored in the `dev/` folder.
* Each of these programs is written as a standalone R script: if any packages need to be loaded for a given program, then call `library()` at the start of the program (but please do __not__ call `library(pharmaversesdtm)`).
* Most of the packages that you are likely to need will already be specified in the `renv.lock` file, so they will already be installed if you have been keeping in sync--you can check this by entering `renv::status()` in the Console. However, you may also wish to install `{metatools}` and `{ggplot2}`, which are currently not specified in the `renv.lock` file. If you feel that you need to install any other packages in addition to those just mentioned, then please tag `@pharmaverse/admiral` to discuss with the development core team.
* When you have created a program in the `dev/` folder, you need to run it as a standalone R script, in order to generate a test dataset that will become part of the `{pharmaversesdtm}` package, but you do not need to build the package.
* Following [best practice](https://r-pkgs.org/data.html#sec-data-data), each dataset is stored as a `.rda` file whose name is consistent with the name of the dataset: for example, the dataset `dm` should be renamed to `raw_dm` before saving it as `raw_dm.rda`; if you save `dm` as `raw_dm.rda` and subsequently load the `.rda` file, then `dm` (not `raw_dm`) will be loaded into the global environment.
* The programs in `dev/` are stored within the `{pharmaversesdtm}` GitHub repository, but they are __not__ part of the `{pharmaversesdtm}` package--the `dev/` folder is specified in `.Rbuildignore`.
* When you run a program that is in the `dev/` folder, you generate a dataset that is written to the `data/` folder, which will become part of the `{pharmaversesdtm}` package.
* The names of test datasets are specified in `R/data.R`, for the purpose of generating documentation in the `man/` folder.
* Programs that generate test data are stored in the `data-raw/` folder.
* Each of these programs is written as a standalone R script: if any packages need to be loaded for a given program, then call `library()` at the start of the program (but please do **not** call `library(pharmaversesdtm)`).
* Most of the packages that you are likely to need will already be specified in the `renv.lock` file, so they will already be installed if you have been keeping in sync--you can check this by entering `renv::status()` in the Console. However, you may also wish to install `{metatools}`, which is currently not specified in the `renv.lock` file. If you feel that you need to install any other packages in addition to those just mentioned, then please tag `@pharmaverse/admiral` to discuss with the development core team.
* When you have created a program in the `data-raw/` folder, you need to run it as a standalone R script, in order to generate a test dataset that will become part of the `{pharmaversesdtm}` package, but you do not need to build the package.
* Following [best practice](https://r-pkgs.org/data.html#sec-data-data), each dataset is stored as a `.rda` file whose name is consistent with the name of the dataset, e.g., dataset `xx` is stored as `xx.rda`. The easiest way to achieve this is to use `usethis::use_data(xx)`.
* The programs in `data-raw/` are stored within the `{pharmaversesdtm}` GitHub repository, but they are **not** part of the `{pharmaversesdtm}` package--the `data-raw/` folder is specified in `.Rbuildignore`.
* When you run a program that is in the `data-raw/` folder, you generate a dataset that is written to the `data/` folder, which will become part of the `{pharmaversesdtm}` package.
* The names and sources of test datasets are specified in `R/data.R`, for the purpose of generating documentation in the `man/` folder.
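
The point about keeping the file name and the object name consistent can be illustrated with base R. This is a minimal sketch, not one of the package's programs: the dataset `xx` and its values are made up, and the file is written to a temporary folder rather than `data/` so that the code can be run anywhere.

```r
# Toy stand-in for a generated test dataset (hypothetical values).
xx <- data.frame(
  USUBJID = c("01-001", "01-002"),
  XXSEQ = c(1L, 1L),
  stringsAsFactors = FALSE
)

# A real program would write to "data/"; a temporary folder is used here
# so that the sketch can be run anywhere.
out_dir <- file.path(tempdir(), "data")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "xx.rda")

# File name matches the object name: `xx` is stored as `xx.rda`.
save(xx, file = out_file, compress = "bzip2")

# load() restores the object under the name it was saved with,
# whatever the file is called, which is why the two should agree.
env <- new.env()
loaded_names <- load(out_file, envir = env)
```

Inside the package project, `usethis::use_data(xx)` gives the same result by writing `data/xx.rda` for you.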

## Adding New SDTM Datasets

* Create a program in the `dev/` folder, named `<name>.R`, where `<name>` is the SDTM domain name, (e.g. `rs.R`), to generate the test data and output `<name>.rda` to the `data/` folder. Use CDISC pilot data such as `dm` as input in this program in order to create realistic synthetic data that remains consistent with other domains. Note that __no personal data should be used__ as part of this package, even if anonymized.
* Create a program in the `data-raw/` folder, named `<name>.R`, where `<name>` should follow the [naming convention](#naming), to generate the test data and output `<name>.rda` to the `data/` folder.
* Use CDISC pilot data such as `dm` as input in this program in order to create realistic synthetic data that remains consistent with other domains (not mandatory).
* Note that **no personal data should be used** as part of this package, even if anonymized.
* Run the program.
* Reflect this update, by specifying `<name>` in `R/data.R`.
* Run `devtools::document()` in order to update `NAMESPACE` and update the `.Rd` files in `man/`.
* Add your GitHub handle to `.github/CODEOWNERS`.
* Update `NEWS.md`.
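
As a rough illustration of the first steps, a new program might derive its records from the subjects in `dm`. Everything in this sketch is hypothetical: the domain `xx`, its variables, and the toy `dm` values are made up for the example, and a real program would read the actual pilot data instead.

```r
# Sketch of a hypothetical data-raw/xx.R; all names and values are illustrative.

# Toy stand-in for the CDISC pilot `dm` dataset. A real program would read
# the actual `dm` data instead.
dm <- data.frame(
  STUDYID = "CDISCPILOT01",
  USUBJID = c("01-701-1015", "01-701-1023"),
  stringsAsFactors = FALSE
)

# Build the new domain from the subjects in `dm` so that identifiers
# stay consistent across domains.
xx <- data.frame(
  STUDYID = dm$STUDYID,
  DOMAIN = "XX",
  USUBJID = dm$USUBJID,
  XXSEQ = 1L,
  XXTESTCD = "TEST1",
  stringsAsFactors = FALSE
)

# Every subject in the new dataset must exist in `dm`.
stopifnot(all(xx$USUBJID %in% dm$USUBJID))
```

The program would then end by saving the dataset to `data/`, in the same way the existing `data-raw/pc.R` does with `save(pc, file = "data/pc.rda", compress = "bzip2")`, and the matching entry in `R/data.R` would be a roxygen block ending in the quoted dataset name.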

## Updating Existing SDTM Datasets

* Rename the source dataset as `raw_<name>`, where `<name>` is the SDTM domain name (e.g. rename `ds` to `raw_ds`), and then save it to the `data/` folder as `raw_<name>.rda` (e.g. `save(raw_ds, file = "data/raw_ds.rda")`).
* Create a program in the `dev/` folder, named `update_<name>.R`, to load `raw_<name>.rda`, make the updates, and output `<name>.rda` to the `data/` folder.
* Run the program.
* Reflect this update, by specifying both `raw_<name>` and `<name>` in `R/data.R`.
* Locate the existing program `<name>.R` in the `data-raw/` folder and update it accordingly.
* Run the program to output the updated `<name>.rda` to the `data/` folder.
* Run `devtools::document()` in order to update `NAMESPACE` and update the `.Rd` files in `man/`.
* Add your GitHub handle to `.github/CODEOWNERS`.
* Update `NEWS.md`.

52 changes: 5 additions & 47 deletions _pkgdown.yml
@@ -12,56 +12,14 @@ repo:
user: https://github.com/
news:
cran_dates: true
reference:

- title: Derivations for Adding Variables
- subtitle: ADXX-specific
desc: Derivation Functions helpful for building the ADXX dataset
- contents:
- has_keyword("der_adxx")

- title: Derivations for Adding Parameters
- subtitle: ADXX-specific
desc: Parameter Derivation Functions helpful for building the ADXX dataset
- contents:
- has_keyword("der_prm_adxx")

- title: Advanced Functions
- subtitle: Pre-Defined Source Objects
desc: Source objects defined by {pharmaversesdtm}
- contents:
- has_keyword("source_specifications")

- title: Utility Functions
- subtitle: Utilities for Formatting Observations
- contents:
- has_keyword('utils_fmt')

- subtitle: Utilities for Dataset Checking
- contents:
- has_keyword('utils_ds_chk')

- subtitle: Utilities for Filtering Observations
- contents:
- has_keyword('utils_fil')

- title: Example Datasets
desc: You can run `admiral::use_ad_template()` to produce additional datasets
- contents:
- has_keyword('datasets')

navbar:
structure:
left: [getstarted, reference, articles, news]
left: [reference, news]
components:
getstarted:
text: Get Started
href: articles/pharmaversesdtm.html
reference:
text: Reference
href: reference/index.html
articles:
text: User Guides
menu:
- text: Creating ADXX
href: articles/adxx.html
text: Reference
href: reference/index.html


6 changes: 0 additions & 6 deletions data-raw/pc.R
@@ -2,7 +2,6 @@ library(haven) # Load xpt
library(plyr)
library(dplyr) # apply distincts
library(lubridate)
library(ggplot2)
library(labelled)
library(admiral)

@@ -152,11 +151,6 @@ pc <- pc %>%
)


# Some test to look the overall figure
plot <- ggplot(pc, aes(x = PCTPTNUM, y = PCSTRESN, group = USUBJID)) +
geom_line() +
geom_point()


# ---- Save output ----
save(pc, file = "data/pc.rda", compress = "bzip2")
1 change: 0 additions & 1 deletion data-raw/pp.R
@@ -1,7 +1,6 @@
library(haven) # Load xpt
library(dplyr) # apply distincts
library(lubridate)
library(ggplot2)
library(labelled)
library(admiral)

107 changes: 0 additions & 107 deletions data-raw/qs.R

This file was deleted.

Binary file removed data/qs.rda
Binary file not shown.
19 changes: 0 additions & 19 deletions man/qs.Rd

This file was deleted.

11 changes: 7 additions & 4 deletions staged_dependencies.yaml
@@ -1,12 +1,15 @@
---
current_repo:
repo: pharmaverse/admiraltemplate
repo: pharmaverse/pharmaversesdtm
host: https://github.com
upstream_repos:
downstream_repos:
- repo: pharmaverse/admiral
host: https://github.com
- repo: pharmaverse/admiral.test
- repo: pharmaverse/admiralonco
host: https://github.com
- repo: pharmaverse/admiraldev
- repo: pharmaverse/admiralvaccine
host: https://github.com
downstream_repos:
- repo: pharmaverse/admiralophtha
host: https://github.com

20 changes: 0 additions & 20 deletions vignettes/adxx.Rmd

This file was deleted.

21 changes: 0 additions & 21 deletions vignettes/pharmaversesdtm.Rmd

This file was deleted.
