-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: add some unit tests for hba1c inclusion #101
Conversation
"498718589803", "20220101", "NPU27300", 49, | ||
"498718589804", "20220101", "NPU27300", 47, | ||
# Duplicate patient_cpr but with the old units. | ||
"498718589803", "20220101", "NPU03835", 6.5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when one patient has both the old and new units that match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way it's been done so far, in cases of 03835 NPU-codes and 27300 NPU-codes within the same pnr and date, the 03835-row is filtered out/deduplicated.
I think that might be a bit over-complicated- I could look into if simply deduplicating all rows with HbA1c-related NPU-codes by pnr and date works out the same. Then we'll only have one HbA1c measurement/inclusion event per day, so some inclusions might get delayed, but if you follow diagnostic guidelines to the letter you have to have two elevated HbA1c samples taken on different dates for them to be diagnostic of diabetes anyway, so it'd align more with that.
By the way, 03835-%-values are converted to 27300-mmol/mol-values using the following formula:
value_27300 = (value_03835 * 100 * 10.93) - 23.5
or in code, e.g.:
value = ifelse(
analysiscode == "NPU03835",
(value * 100 * 10.93) - 23.5,
value
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is helpful. Just to confirm, if an old unit is measured on the same day as a new unit (purely hypothetical), and the old unit is above the threshold but the new unit isn't, is the new unit's threshold used or the old?
And also, if someone had a measurement on the same day and one value was above the threshold and another was below it, as long as there is a true condition in the date, it is used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your example is actually very real, since the test results would usually be reported back in both formats (unless the doctor actively unticks a box when filling out the form to request analyses). So you get two rows of data where everything is identical, except for the NPU-code and the value of the result - down to the exact time of the sample being drawn. This is done to appease clinicians who are more comfortable reading the old units. Since only one analysis is actually performed in the lab, the "old" format is simply generated in the lab IT system by mathematically converting from the "new" format, so both should be identical in terms of absolute values and relative to the diagnostic threshold.
You can have multiple samples taken/recorded on the same day, and those can differ in terms of the result. Even if just one of them is positive, it counts as an inclusion event, yes.
tests/testthat/test-include-hba1c.R
Outdated
) | ||
|
||
expected <- tibble::tribble( | ||
~pnr, ~include_hba1c, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we want to keep date here?
@@ -0,0 +1,3 @@ | |||
name,logic | |||
hba1c,(analysiscode == 'NPU27300' AND value >= 48) OR (analysiscode == 'NPU03835' AND value >= 6.5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how I was thinking the logic would be stored and written down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very clever! For HbA1c we (currently) convert NPU03835 (%) values to NPU27300 (mmol/mol) values in the pipeline, so we could potentially simplify the logic to hba1c,(analysiscode %in% c('NPU27300', 'NPU03835') AND value >= 48)
or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be faster from a performance point of view to filter only and not do an additional calculation step before hand.
tests/testthat/test-include-hba1c.R
Outdated
|
||
expected <- tibble::tribble( | ||
~pnr, ~include_hba1c, | ||
"498718589803", TRUE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't have duplicates, keep the two earliest dates of being true, drop all later dates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can tell, this looks great 👍
I'm not sure whether this PR needs some changes based on Anders' comments in relation to whether the values with the old (%) unit should be converted to the new (mmol/mol) unit?
"498718589807", "20200101", "NPU03835", 6.6, | ||
"498718589807", "20200101", "NPU27300", 47, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Aastedet So here, if there is a duplicate date, with new below but old above the threshold, should that be counted as a true condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that should be counted as a positive inclusion event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever you approve this PR, I'll merge it in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😉 😉 @signekb 😄 😄
I'll merge this in now that it has had a very rounds of reviews. |
I'm starting by adding some tests without the actual code to test because I want to make sure we're on the same page with how the output should look like.
We can go over this during the meeting.