-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First draft for text description of the data #69
Conversation
Relates to #35 |
…uture improvements. Added description of raw data. National Patient Register only for now.
Moved changes section to #78
…same recnum value is repeated for all diagnoses for that specific contact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I have added some minor suggestions/questions.
Do you expect to fill in the todo items in this PR as well, or is your plan to create new PRs for those? :)
vignettes/algorithm_logic.Rmd
Outdated
in future revisions. Refer to the other vignettes for background | ||
information and a more general description of the algorithm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could add links to specific vignettes with relevant information here?
vignettes/algorithm_data.Rmd
Outdated
|
||
## Contents | ||
|
||
This document describes the structure of the data components processed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have used the word "data sources". Do we prefer "data components"? Or "data objects" as in the title? I think it makes sense to align this, so the same words are used for the same things throughout the documentation :)
vignettes/algorithm_data.Rmd
Outdated
In a future revision, the algorithm can also utilise the Danish Medical | ||
Birth Register to extend the period of time of valid inclusions further | ||
back in time compared to what is possible using obstetric codes from the | ||
National Patient Register. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this to #78 ?
vignettes/algorithm_data.Rmd
Outdated
assumes that raw data is stored/structured in the most common format for | ||
raw data provided on Statistics Denmark's servers (from our experience). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add which specific formats you are talking about here? :)
vignettes/algorithm_data.Rmd
Outdated
`d_inddto`/`dato_start`. | ||
|
||
- Named `lpr_adm` in the LPR2-formatted data prior to 2019, and | ||
`kontakter` in contact-based LPR3-formatted data from 2019 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to use the English abbreviation DNPR
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also add what the abbreviation comes from? I.e., Landspatientregisteret?
Or do we assume readers know this?
Co-authored-by: Signe Kirk Brødbæk <40836345+signekb@users.noreply.github.com>
rlang::check_installed("glue") | ||
rlang::check_installed("knitr") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever you use a function from a package that you set as "suggests" as a dependency, you need to include a check function to inform the user to install these packages if they are not installed. So that's what these do (when you use use_package("packagename", "suggests")
, it tells you exactly what to do)
|
||
variable_description |> | ||
dplyr::select( | ||
.data$register_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I mentioned this already, but the .data$
is used to masked the variable so that CRAN doesn't warn of "undeclared variables". Since we declare this placeholder variable ".data" already.
the `dw_ek_kontakt` variable (LPR3 data). | ||
|
||
```{r} | ||
for (register in osdc:::get_register_abbrev()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I almost always suggest not to use for loops, this is one of those cases that you need to, since these functions are used to create Markdown text, and it doesn't really work in a "functional" way.
register, | ||
caption = glue::glue("Variables and their descriptions within the `{register}` register.") | ||
) |> | ||
print() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to print because only the last thing in a for loop is output, but we want all these things to output.
```{r, include = FALSE} | ||
knitr::opts_chunk$set( | ||
echo = FALSE, | ||
results = "asis", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This tells knitr to treat all the output as plain text rather than as code output text
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AWESOME!
sources: | ||
|
||
```{r, results='asis'} | ||
osdc:::registers_as_md_table("Danish registers used in the OSDC algorithm.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The triple-colons is because this is an internal function/object or what is going on here?
Also, a note to my future self: you need to build the package in order to access internal objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haah yes! Or use Ctrl-Shift-L
to load the package. And yea, :::
accesses all internal objects in a package, neat trick!
Just want some feedback on the current format before I fill out the remaining sections describing the raw data with similar content.
I suspect @lwjohnst86 will want to automate the data source descriptions at some point. Then we might have to add a column with the english register names to the variable description csv.