1301 feature request add confidence intervals for quantiles in surv time #1306

Open
wants to merge 17 commits into
base: main
Changes from 7 commits
45 changes: 40 additions & 5 deletions R/survival_time.R
@@ -70,10 +70,24 @@ s_surv_time <- function(df,
conf.type = conf_type
)
srv_tab <- summary(srv_fit, extend = TRUE)$table
srv_qt_tab <- stats::quantile(srv_fit, probs = quantiles)$quantile
srv_qt_tab_pre <- stats::quantile(srv_fit, probs = quantiles)
srv_qt_tab <- srv_qt_tab_pre$quantile
range_censor <- range_noinf(df[[.var]][!df[[is_event]]], na.rm = TRUE)
range_event <- range_noinf(df[[.var]][df[[is_event]]], na.rm = TRUE)
range <- range_noinf(df[[.var]], na.rm = TRUE)

names(quantiles) <- as.character(100 * quantiles)
srv_qt_tab_pre <- unlist(srv_qt_tab_pre)
srv_qt_ci <- lapply(quantiles, function(x) {
name <- as.character(100 * x)

c(
srv_qt_tab_pre[[paste0("quantile.", name)]],
srv_qt_tab_pre[[paste0("lower.", name)]],
srv_qt_tab_pre[[paste0("upper.", name)]]
)
})

list(
median = formatters::with_label(unname(srv_tab["median"]), "Median"),
median_ci = formatters::with_label(
@@ -84,7 +98,20 @@ s_surv_time <- function(df,
),
range_censor = formatters::with_label(range_censor, "Range (censored)"),
range_event = formatters::with_label(range_event, "Range (event)"),
range = formatters::with_label(range, "Range")
range = formatters::with_label(range, "Range"),
median_ci_1_line = formatters::with_label(
c(
unname(srv_tab["median"]),
unname(srv_tab[paste0(srv_fit$conf.int, c("LCL", "UCL"))])
),
paste0("Median ", f_conf_level(conf_level))
),
quantiles_ci_1 = formatters::with_label(
unname(srv_qt_ci[[1]]), paste0(quantiles[1] * 100, "%-ile with ", f_conf_level(conf_level))
),
quantiles_ci_2 = formatters::with_label(
unname(srv_qt_ci[[2]]), paste0(quantiles[2] * 100, "%-ile with ", f_conf_level(conf_level))
)
)
}
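
The `s_surv_time` change above relies on `stats::quantile()` returning point estimates together with confidence limits for a `survfit` object. The following is a minimal sketch of that behaviour on simulated data, assuming the `survival` package is installed. The data frame, the seed, and the positional indexing (used here instead of the name-based lookup in the diff) are illustrative choices and not part of the PR.

```r
library(survival)

# Simulated time-to-event data (illustrative only, not the PR's test data).
set.seed(1)
df <- data.frame(
  time = rexp(100, rate = 0.05),
  is_event = runif(100) < 0.7
)

quantiles <- c(0.25, 0.75)

fit <- survfit(
  Surv(time, is_event) ~ 1,
  data = df,
  conf.int = 0.95,
  conf.type = "plain"
)

# For a survfit object, quantile() returns a list with the components
# `quantile`, `lower` and `upper`, one entry per requested probability.
qt <- quantile(fit, probs = quantiles)

# Collect (estimate, lower, upper) for each quantile by position.
qt_ci <- lapply(seq_along(quantiles), function(i) {
  c(estimate = qt$quantile[[i]], lower = qt$lower[[i]], upper = qt$upper[[i]])
})
names(qt_ci) <- paste0(100 * quantiles, "%-ile")

qt_ci
```

An upper limit can be `NA` when the upper confidence band of the survival curve never drops to the required level, which is what the `80%-ile with 99% CI` entry in the snapshot file shows.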

@@ -121,8 +148,17 @@ a_surv_time <- function(df,
rng_censor_upr <- x_stats[["range_censor"]][2]

# Use method-specific defaults
fmts <- c(median_ci = "(xx.x, xx.x)", quantiles = "xx.x, xx.x", range = "xx.x to xx.x")
lbls <- c(median_ci = "95% CI", range = "Range", range_censor = "Range (censored)", range_event = "Range (event)")
fmts <- c(
median_ci = "(xx.x, xx.x)", quantiles = "xx.x, xx.x", range = "xx.x to xx.x",
median_ci_1_line = "xx.x (xx.x - xx.x)",
quantiles_ci_1 = "xx.x (xx.x - xx.x)", quantiles_ci_2 = "xx.x (xx.x - xx.x)"
)
lbls <- c(
median_ci = "95% CI", range = "Range", range_censor = "Range (censored)", range_event = "Range (event)",
median_ci_1_line = "Median 95% CI",
quantiles_ci_1 = "25%-ile with 95% CI",
quantiles_ci_2 = "75%-ile with 95% CI"
Melkiades marked this conversation as resolved.
)
lbls_custom <- .labels
.formats <- c(.formats, fmts[setdiff(names(fmts), names(.formats))])
.labels <- c(.labels, lbls[setdiff(names(lbls), names(lbls_custom))])
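
The two lines above merge user-supplied formats and labels with the method-specific defaults. A minimal base R sketch of that pattern follows; the override value assigned to `.formats` is a hypothetical user input chosen for illustration.

```r
# Method-specific defaults, taken from the diff above (shortened).
fmts <- c(
  median_ci = "(xx.x, xx.x)",
  median_ci_1_line = "xx.x (xx.x - xx.x)"
)

# Hypothetical user-supplied override for one statistic.
.formats <- c(median_ci_1_line = "xx.xx (xx.xx, xx.xx)")

# Append only those defaults whose names the user did not supply,
# so user values always take precedence.
.formats <- c(.formats, fmts[setdiff(names(fmts), names(.formats))])

.formats
```

The user value for `median_ci_1_line` is kept, and the default for `median_ci` is appended after it.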
@@ -209,7 +245,6 @@ surv_time <- function(lyt,
var_labels = var_labels,
show_labels = show_labels,
table_names = table_names,
na_str = na_str,
Contributor

why not?

Collaborator Author

It is already passed to the analysis function via extra_args. If na_str is passed as a named list (so that a statistic whose values are all missing can be shown as "NA (NA, NA)"), having na_str both in extra_args and in the analyze call gave an error.

Contributor
@Melkiades, Sep 9, 2024

I see! good catch! thanks :)

Contributor

I think this may also be present elsewhere. Can you add an issue regarding this please?

Collaborator Author

After looking at it again, it might be easier to remove na_str from extra_args and keep it in analyze only. Then "NA (NA, NA)" is much easier to achieve, and na_str would not need to be a named list.
Would it be OK to remove it from the in_rows call and keep it only in analyze?

Contributor

The use of na_str seems a bit duplicated. It is more straightforward to set it in the analyze function than to pass it down. That said, the statistical functions may be used and tested independently of the chosen analyze call. For simplicity, I would keep na_str only at the top level. Users who want to use custom analysis functions can provide their own NA handling directly there.

Contributor

Sounds fine to me!
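
To make the "NA (NA, NA)" goal from the thread above concrete, here is a small base R sketch of formatting an estimate with its confidence limits while substituting a single NA string. `format_est_ci()` is a hypothetical helper written for illustration only. It is not part of tern, rtables, or formatters, and it does not show how those packages apply `na_str` internally.

```r
# Hypothetical helper (not a tern/rtables function): format an estimate and
# its confidence limits, replacing each missing value with `na_str`.
format_est_ci <- function(x, digits = 1, na_str = "NA") {
  stopifnot(length(x) == 3)
  x <- as.numeric(x)
  vals <- ifelse(
    is.na(x),
    na_str,
    sprintf(paste0("%.", digits, "f"), x)
  )
  sprintf("%s (%s, %s)", vals[1], vals[2], vals[3])
}

# Upper limit missing, as in the 80%-ile snapshot value.
format_est_ci(c(51.09487, 37.97055, NA), na_str = "NE")
# → "51.1 (38.0, NE)"

# All values missing.
format_est_ci(c(NA, NA, NA))
# → "NA (NA, NA)"
```

With one scalar NA string applied to every missing element, no per-statistic named list is needed, which is the simplification discussed in the comments.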

nested = nested,
extra_args = extra_args
)
25 changes: 24 additions & 1 deletion R/utils_default_stats_formats_labels.R
@@ -344,6 +344,20 @@ labels_use_control <- function(labels_default, control, labels_custom = NULL) {
labels_default["quantiles"]
)
}
if ("quantiles" %in% names(control) && "quantiles_ci_1" %in% names(labels_default) &&
!"quantiles_ci_1" %in% names(labels_custom)) { # nolint
labels_default["quantiles_ci_1"] <- gsub(
"[0-9]+%-ile", paste0(control[["quantiles"]][1] * 100, "%-ile", ""),
labels_default["quantiles_ci_1"]
)
}
if ("quantiles" %in% names(control) && "quantiles_ci_2" %in% names(labels_default) &&
!"quantiles_ci_2" %in% names(labels_custom)) { # nolint
labels_default["quantiles_ci_2"] <- gsub(
"[0-9]+%-ile", paste0(control[["quantiles"]][2] * 100, "%-ile", ""),
labels_default["quantiles_ci_2"]
)
}
if ("test_mean" %in% names(control) && "mean_pval" %in% names(labels_default) &&
!"mean_pval" %in% names(labels_custom)) { # nolint
labels_default["mean_pval"] <- gsub(
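
The two new `if` blocks in this hunk rewrite the percentile part of the default labels from `control[["quantiles"]]`. A minimal base R sketch of that substitution follows. It uses a loop instead of the two separate blocks in the diff and skips the `labels_custom` check; the `control` values are an assumed example.

```r
labels_default <- c(
  quantiles_ci_1 = "25%-ile with 95% CI",
  quantiles_ci_2 = "75%-ile with 95% CI"
)

# Assumed example of a control list with non-default quantiles.
control <- list(quantiles = c(0.2, 0.8))

for (i in seq_along(control[["quantiles"]])) {
  stat <- paste0("quantiles_ci_", i)
  # The pattern only matches digits directly followed by "%-ile",
  # so the "95% CI" part of the label is left untouched.
  labels_default[stat] <- gsub(
    "[0-9]+%-ile",
    paste0(control[["quantiles"]][i] * 100, "%-ile"),
    labels_default[stat]
  )
}

labels_default
# quantiles_ci_1: "20%-ile with 95% CI"
# quantiles_ci_2: "80%-ile with 95% CI"
```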
@@ -392,7 +406,10 @@ tern_default_stats <- list(
summarize_glm_count = c("n", "rate", "rate_ci", "rate_ratio", "rate_ratio_ci", "pval"),
summarize_num_patients = c("unique", "nonunique", "unique_count"),
summarize_patients_events_in_cols = c("unique", "all"),
surv_time = c("median", "median_ci", "quantiles", "range_censor", "range_event", "range"),
surv_time = c(
"median", "median_ci", "median_ci_1_line", "quantiles",
"quantiles_ci_1", "quantiles_ci_2", "range_censor", "range_event", "range"
),
surv_timepoint = c("pt_at_risk", "event_free_rate", "rate_se", "rate_ci", "rate_diff", "rate_diff_ci", "ztest_pval"),
tabulate_rsp_biomarkers = c("n_tot", "n_rsp", "prop", "or", "ci", "pval"),
tabulate_rsp_subgroups = c("n", "n_rsp", "prop", "n_tot", "or", "ci", "pval"),
@@ -431,7 +448,10 @@ tern_default_formats <- c(
median = "xx.x",
mad = "xx.x",
median_ci = "(xx.xx, xx.xx)",
median_ci_1_line = "xx.xx (xx.xx, xx.xx)",
quantiles = "xx.x - xx.x",
quantiles_ci_1 = "xx.xx (xx.xx, xx.xx)",
quantiles_ci_2 = "xx.xx (xx.xx, xx.xx)",
iqr = "xx.x",
range = "xx.x - xx.x",
min = "xx.x",
@@ -480,7 +500,10 @@ tern_default_labels <- c(
median = "Median",
mad = "Median Absolute Deviation",
median_ci = "Median 95% CI",
median_ci_1_line = "Median 95% CI",
quantiles = "25% and 75%-ile",
quantiles_ci_1 = "25%-ile 95% CI",
quantiles_ci_2 = "75%-ile 95% CI",
iqr = "IQR",
range = "Min - Max",
min = "Minimum",
68 changes: 52 additions & 16 deletions tests/testthat/_snaps/survival_time.md
@@ -33,6 +33,21 @@
attr(,"label")
[1] "Range"

$median_ci_1_line
[1] 23.91143 18.25878 32.85945
attr(,"label")
[1] "Median 95% CI"

$quantiles_ci_1
[1] 9.822926 5.628823 16.690121
attr(,"label")
[1] "25%-ile with 95% CI"

$quantiles_ci_2
[1] 41.98181 32.85945 53.41445
attr(,"label")
[1] "75%-ile with 95% CI"


# s_surv_time works with customized arguments

@@ -69,6 +84,21 @@
attr(,"label")
[1] "Range"

$median_ci_1_line
[1] 23.91143 13.59124 37.97055
attr(,"label")
[1] "Median 99% CI"

$quantiles_ci_1
[1] 6.649204 1.887860 12.771697
attr(,"label")
[1] "20%-ile with 99% CI"

$quantiles_ci_2
[1] 51.09487 37.97055 NA
attr(,"label")
[1] "80%-ile with 99% CI"


# a_surv_time works with default arguments

@@ -77,13 +107,16 @@
Output
RowsVerticalSection (in_rows) object print method:
----------------------------
  row_name          formatted_cell  indent_mod  row_label
1 Median            24.8            0           Median
2 95% CI            (21.1, 31.3)    0           95% CI
3 25% and 75%-ile   10.8, 47.6      0           25% and 75%-ile
4 Range (censored)  0.8 to 78.9     0           Range (censored)
5 Range (event)     0.1 to 155.5    0           Range (event)
6 Range             0.1 to 155.5    0           Range
  row_name             formatted_cell      indent_mod  row_label
1 Median               24.8                0           Median
2 95% CI               (21.1, 31.3)        0           95% CI
3 Median 95% CI        24.8 (21.1 - 31.3)  0           Median 95% CI
4 25% and 75%-ile      10.8, 47.6          0           25% and 75%-ile
5 25%-ile with 95% CI  10.8 (6.6 - 13.4)   0           25%-ile with 95% CI
6 75%-ile with 95% CI  47.6 (39.3 - 57.8)  0           75%-ile with 95% CI
7 Range (censored)     0.8 to 78.9         0           Range (censored)
8 Range (event)        0.1 to 155.5        0           Range (event)
9 Range                0.1 to 155.5        0           Range

# a_surv_time works with customized arguments

@@ -115,15 +148,18 @@
Code
res
Output
                      ARM A          ARM B          ARM C
———————————————————————————————————————————————————————————————————
Survival Time (Months)
  Median              32.0           23.9           20.8
  90% CI              (25.6, 49.3)   (18.9, 32.1)   (13.0, 26.0)
  40% and 60%-ile     25.6, 46.5     18.3, 29.2     13.0, 25.7
  Range (censored)    0.8 to 63.5    6.2 to 78.9    3.4 to 52.4
  Range (event)       0.3 to 155.5   0.1 to 154.1   0.6 to 80.7
  Range               0.3 to 155.5   0.1 to 154.1   0.6 to 80.7
                        ARM A                ARM B                ARM C
—————————————————————————————————————————————————————————————————————————————————————
Survival Time (Months)
  Median                32.0                 23.9                 20.8
  90% CI                (25.6, 49.3)         (18.9, 32.1)         (13.0, 26.0)
  Median 90% CI         32.0 (25.6 - 49.3)   23.9 (18.9 - 32.1)   20.8 (13.0 - 26.0)
  40% and 60%-ile       25.6, 46.5           18.3, 29.2           13.0, 25.7
  40%-ile with 90% CI   25.6 (20.7 - 33.4)   18.3 (12.8 - 23.9)   13.0 (10.1 - 24.8)
  60%-ile with 90% CI   46.5 (32.0 - 57.8)   29.2 (23.9 - 41.3)   25.7 (20.8 - 37.1)
  Range (censored)      0.8 to 63.5          6.2 to 78.9          3.4 to 52.4
  Range (event)         0.3 to 155.5         0.1 to 154.1         0.6 to 80.7
  Range                 0.3 to 155.5         0.1 to 154.1         0.6 to 80.7

# surv_time works with referential footnotes
