Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove normal approximation #153

Merged
merged 10 commits into from
Jul 17, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,12 @@
# cfr (development version)

# cfr 0.1.2

Updated version to fix instability in normal approximation with displayed Ebola example. This release includes:

1. Removal of normal approximation from the `.estimate_severity()` function, instead using the binomial likelihood unless criteria for a Poisson approximation met.
2. Updated README example focusing on first 30 days of outbreak, to emphasise effects of not accounting for delays to outcome.

# cfr 0.1.1

Maintainer is changing to @adamkucharski (#143).
Expand Down
65 changes: 30 additions & 35 deletions R/estimate_severity.R
Original file line number Diff line number Diff line change
Expand Up @@ -30,16 +30,8 @@
#' the estimate and confidence intervals cannot be calculated and the output
#' `<data.frame>` contains only `NA`s.
#'
#' - When `total_cases == total_deaths` _and_ `total_outcomes <= total_deaths`,
#' while `total_cases < poisson_threshold`, the confidence intervals cannot be
#' calculated and are returned as `NA`. The severity is returned as the lowest
#' possible value for the method used when cases are below the Poisson
#' threshold, which is 0.001.
#'
#' - When `total_outcomes == total_deaths` while
#' `total_cases < poisson_threshold` the confidence intervals cannot be
#' calculated and are returned as `NA`s while the severity estimate is returned
#' as `0.999`.
#' - When `total_outcomes <= total_deaths`, estimate and confidence intervals
#' cannot be reliably calculated and are returned as `NA`.
.estimate_severity <- function(total_cases,
total_deaths,
total_outcomes,
Expand Down Expand Up @@ -68,16 +60,38 @@
)

# maximum likelihood estimation for corrected severity
# using increments of 0.1% severity
pprange <- seq(from = 1e-4, to = 1.0, by = 1e-4)

# if more expected outcomes than observed deaths, set outcomes equal to deaths
if (total_outcomes >= total_deaths) {
total_outcomes_checked <- total_outcomes
} else {
total_outcomes_checked <- NA
message(
"Total deaths = ", total_deaths,
" and expected outcomes = ", round(total_outcomes),
" so setting expected outcomes = NA. If we were to assume
total deaths = expected outcomes, it would produce an estimate of 1."
)
}

# get likelihoods using selected function
lik <- func_likelihood(total_outcomes, total_deaths, pprange)
lik <- func_likelihood(total_outcomes_checked, total_deaths, pprange)

# maximum likelihood estimate - if this is empty, return NA
# Otherwise return 95% confidence interval of likelihood
severity_estimate <- pprange[which.max(lik)]

# 95% confidence interval of likelihood
severity_lims <- range(pprange[lik >= (max(lik) - 1.92)])
if (length(severity_estimate) == 0) {
severity_estimate <- NA
severity_lims <- c(NA, NA)
} else {
severity_lims <- range(
pprange[lik >=
(max(lik, na.rm = TRUE) - 1.92)],
na.rm = TRUE
)
}

# return a vector for easy conversion to data
severity_estimate <- c(severity_estimate, severity_lims)
Expand Down Expand Up @@ -111,9 +125,6 @@
#' - Poisson approximation: when `total_cases >= poisson_threshold` but
#' when `p_mid` < 0.05;
#'
#' - Normal approximation: when `total_cases >= poisson_threshold` and
#' `p_mid >=` 0.05.
#'
#' @return A function with three arguments, `total_outcomes`, `total_deaths`,
#' and `pp`, which is used to generate the profile likelihood.
#' Also prints messages to the screen when a Poisson or Normal approximation
Expand All @@ -123,7 +134,7 @@
# NOTE: internal function is not input checked
# switch likelihood function based on total cases and p_mid
# Binomial approx
if (total_cases < poisson_threshold) {
if (total_cases < poisson_threshold || (p_mid >= 0.05)) {
func_likelihood <- function(total_outcomes, total_deaths, pp) {
lchoose(round(total_outcomes), total_deaths) +
(total_deaths * log(pp)) +
Expand All @@ -132,7 +143,7 @@
}

# Poisson approx
if ((total_cases >= poisson_threshold) && p_mid < 0.05) {
if ((total_cases >= poisson_threshold) && (p_mid < 0.05)) {
func_likelihood <- function(total_outcomes, total_deaths, pp) {
stats::dpois(
total_deaths, pp * round(total_outcomes),
Expand All @@ -145,21 +156,5 @@
)
}

# Normal approx
if ((total_cases >= poisson_threshold) && p_mid >= 0.05) {
func_likelihood <- function(total_outcomes, total_deaths, pp) {
stats::dnorm(
total_deaths,
mean = pp * round(total_outcomes),
sd = pp * (1 - pp) * round(total_outcomes),
log = TRUE
)
}
message(
"Total cases = ", total_cases, " and p = ", signif(p_mid, 3),
": using Normal approximation to binomial likelihood."
)
}

func_likelihood
}
7 changes: 5 additions & 2 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -66,12 +66,15 @@ library(cfr)
# Load the Ebola 1976 data provided with the package
data(ebola1976)

# Focus on the first 20 days the outbreak
ebola1976_first_30 <- ebola1976[1:30, ]

# Calculate the static CFR without correcting for delays
cfr_static(data = ebola1976)
cfr_static(data = ebola1976_first_30)

# Calculate the static CFR while correcting for delays
cfr_static(
data = ebola1976,
data = ebola1976_first_30,
delay_density = function(x) dgamma(x, shape = 2.40, scale = 3.33)
)
```
Expand Down
17 changes: 7 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -73,21 +73,21 @@ library(cfr)
# Load the Ebola 1976 data provided with the package
data(ebola1976)

# Focus on the first 20 days the outbreak
ebola1976_first_30 <- ebola1976[1:30, ]

# Calculate the static CFR without correcting for delays
cfr_static(data = ebola1976)
cfr_static(data = ebola1976_first_30)
#> severity_estimate severity_low severity_high
#> 1 0.955102 0.9210866 0.9773771
```

``` r
#> 1 0.4740741 0.3875497 0.5617606

# Calculate the static CFR while correcting for delays
cfr_static(
data = ebola1976,
data = ebola1976_first_30,
delay_density = function(x) dgamma(x, shape = 2.40, scale = 3.33)
)
#> severity_estimate severity_low severity_high
#> 1 0.9742 0.8356 0.9877
#> 1 0.9422 0.8701 0.9819
```

### Change in real-time estimates of overall severity during the 1976 Ebola outbreak
Expand Down Expand Up @@ -118,9 +118,6 @@ head(rolling_cfr_naive)
#> 4 1976-08-28 0 0 0.975
#> 5 1976-08-29 0 0 0.975
#> 6 1976-08-30 0 0 0.975
```

``` r

# Calculate the rolling daily CFR while correcting for delays
rolling_cfr_corrected <- cfr_rolling(
Expand Down
10 changes: 2 additions & 8 deletions man/dot-estimate_severity.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 0 additions & 2 deletions man/dot-select_func_likelihood.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Binary file modified man/figures/README-fig-rolling-cfr-ebola-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
8 changes: 2 additions & 6 deletions tests/testthat/_snaps/estimate_ascertainment.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,20 +11,16 @@
Code
estimate_ascertainment(data = ebola1976, delay_density = function(x) dgamma(x,
shape = 2.4, scale = 3.33), severity_baseline = 0.7)
Message
Total cases = 245 and p = 0.959: using Normal approximation to binomial likelihood.
Output
ascertainment_estimate ascertainment_low ascertainment_high
1 0.7185383 0.7087172 0.8377214
1 0.7297748 0.7147963 0.7530931

# Static ascertainment from vignette

Code
estimate_ascertainment(data = covid_uk, delay_density = function(x) dlnorm(x,
meanlog = 2.577, sdlog = 0.44), severity_baseline = 0.014)
Message
Total cases = 283420 and p = 0.206: using Normal approximation to binomial likelihood.
Output
ascertainment_estimate ascertainment_low ascertainment_high
1 0.09810792 0.02316347 0.2167183
1 0.06779661 0.06734007 0.06829268

2 changes: 1 addition & 1 deletion tests/testthat/_snaps/estimate_severity.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
severity_estimate
Output
severity_estimate severity_low severity_high
0.9742 0.8356 0.9877
0.9592 0.9295 0.9793

---

Expand Down
2 changes: 1 addition & 1 deletion tests/testthat/_snaps/estimate_static.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,5 +12,5 @@
scfr_corrected
Output
severity_estimate severity_low severity_high
1 0.9742 0.8356 0.9877
1 0.9592 0.9295 0.9793

2 changes: 1 addition & 1 deletion tests/testthat/test-estimate_ascertainment.R
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ test_that("Ascertainment > 1.0 throws a warning", {
estimate_ascertainment(
data = ebola1976,
delay_density = function(x) dgamma(x, shape = 2.40, scale = 3.33),
severity_baseline = 0.9
severity_baseline = 0.99
),
regexp = "Ascertainment ratios > 1.0 detected, setting these values to 1.0"
)
Expand Down
19 changes: 3 additions & 16 deletions tests/testthat/test-estimate_severity.R
Original file line number Diff line number Diff line change
Expand Up @@ -131,22 +131,9 @@ test_that("Special cases of `.estimate_severity()`", {
poisson_threshold = 100
),
c(
severity_estimate = 1e-4, # lowest possible severity under this method
severity_low = NA_real_,
severity_high = NA_real_
)
)

total_outcomes <- 99
expect_identical(
.estimate_severity(
total_cases, total_deaths, total_outcomes,
poisson_threshold = 100
),
c(
severity_estimate = 1 - 1e-4, # highest possible severity
severity_low = NA_real_,
severity_high = NA_real_
severity_estimate = NA, # set NA because not valid calculation
severity_low = NA,
severity_high = NA
)
)

Expand Down
Binary file added tests/testthat/testthat-problems.rds
Binary file not shown.
2 changes: 0 additions & 2 deletions vignettes/cfr.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down
2 changes: 0 additions & 2 deletions vignettes/data_from_incidence2.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
vignette: >
%\VignetteIndexEntry{Handling data from {incidence2}}
%\VignetteEngine{knitr::rmarkdown}
Expand Down
2 changes: 0 additions & 2 deletions vignettes/delay_distributions.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down
2 changes: 0 additions & 2 deletions vignettes/estimate_ascertainment.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down
4 changes: 1 addition & 3 deletions vignettes/estimate_static_severity.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down Expand Up @@ -172,7 +170,7 @@ head(df_covid_uk)
We retrieve the appropriate distribution for Covid-19 from @linton2020; this is a lognormal distribution with $\mu$ = 2.577 and $\sigma$ = 0.440.

::: {.alert .alert-warning}
**Note that** @linton2020 fitted a discrete lognormal distribution and we use a continuous distribution, and that we are ignoring uncertainty in the distribution parameters and likely under-estimating uncertainty in the CFR.
**Note that** @linton2020 fitted a discrete lognormal distribution and we use a continuous distribution, and that we are ignoring uncertainty in the distribution parameters and hence likely under-estimating uncertainty in the CFR.
:::

### Estimating the naive and corrected CFR
Expand Down
2 changes: 0 additions & 2 deletions vignettes/estimate_time_varying_severity.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down
Loading