Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1301 feature request add confidence intervals for quantiles in surv time #1306

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

iaugusty
Copy link
Collaborator

@iaugusty iaugusty commented Sep 9, 2024

Pull Request

Fixes #1301
@Melkiades
I'm not sure about the impact of adding new stats to an analyze function on the remainder of the code/packages.
Could you review and share your thoughts?
This is the first function for which we'd like to add extra stats, there will be more functions for which we'd like to combine the stat and it's confidence interval into 1 line.
Once we know the approach we can follow, more of these types of updates might follow.

@iaugusty iaugusty linked an issue Sep 9, 2024 that may be closed by this pull request
3 tasks
Copy link
Contributor

github-actions bot commented Sep 9, 2024

✅ All contributors have signed the CLA
Posted by the CLA Assistant Lite bot.

Copy link
Contributor

github-actions bot commented Sep 9, 2024

Unit Tests Summary

    1 files     84 suites   1m 13s ⏱️
  855 tests   843 ✅  12 💤 0 ❌
1 836 runs  1 154 ✅ 682 💤 0 ❌

Results for commit e7bfd91.

♻️ This comment has been updated with latest results.

Copy link
Contributor

github-actions bot commented Sep 9, 2024

Unit Test Performance Difference

Test Suite $Status$ Time on main $±Time$ $±Tests$ $±Skipped$ $±Failures$ $±Errors$
analyze_vars_in_cols 💔 $2.42$ $+3.71$ $+17$ $-7$ $0$ $0$
count_occurrences 💔 $0.75$ $+1.62$ $+10$ $-8$ $0$ $0$
summarize_coxreg 💔 $3.51$ $+2.57$ $+13$ $-13$ $0$ $0$
summarize_num_patients 💔 $1.15$ $+1.21$ $+18$ $-16$ $0$ $0$
utils_rtables 💔 $3.28$ $+1.27$ $+16$ $-19$ $0$ $0$
Additional test case details
Test Suite $Status$ Time on main $±Time$ Test Case
analyze_vars_in_cols 💔 $0.48$ $+1.71$ summarize_works_with_nested_analyze

Results for commit e9b6f9c

♻️ This comment has been updated with latest results.

Copy link
Contributor

github-actions bot commented Sep 9, 2024

badge

Code Coverage Summary

Filename                                   Stmts    Miss  Cover    Missing
---------------------------------------  -------  ------  -------  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
R/abnormal_by_baseline.R                      65       0  100.00%
R/abnormal_by_marked.R                        55       5  90.91%   92-96
R/abnormal_by_worst_grade_worsen.R           116       3  97.41%   262-264
R/abnormal_by_worst_grade.R                   60       0  100.00%
R/abnormal.R                                  43       0  100.00%
R/analyze_variables.R                        172       2  98.84%   500, 640
R/analyze_vars_in_cols.R                     176      13  92.61%   178, 221, 235-236, 244-252
R/bland_altman.R                              92       1  98.91%   43
R/combination_function.R                       9       0  100.00%
R/compare_variables.R                         84       2  97.62%   255, 314
R/control_incidence_rate.R                    10       0  100.00%
R/control_logistic.R                           7       0  100.00%
R/control_step.R                              23       1  95.65%   58
R/control_survival.R                          15       0  100.00%
R/count_cumulative.R                          50       1  98.00%   73
R/count_missed_doses.R                        34       0  100.00%
R/count_occurrences_by_grade.R               113       1  99.12%   166
R/count_occurrences.R                        115       1  99.13%   115
R/count_patients_events_in_cols.R             67       1  98.51%   59
R/count_patients_with_event.R                 47       0  100.00%
R/count_patients_with_flags.R                 58       0  100.00%
R/count_values.R                              27       0  100.00%
R/cox_regression_inter.R                     154       0  100.00%
R/cox_regression.R                           161       0  100.00%
R/coxph.R                                    167       7  95.81%   191-195, 238, 253, 261, 267-268
R/d_pkparam.R                                406       0  100.00%
R/decorate_grob.R                            113       0  100.00%
R/desctools_binom_diff.R                     621      64  89.69%   53, 88-89, 125-126, 129, 199, 223-232, 264, 266, 286, 290, 294, 298, 353, 356, 359, 362, 422, 430, 439, 444-447, 454, 457, 466, 469, 516-517, 519-520, 522-523, 525-526, 593, 604-616, 620, 663, 676, 680
R/df_explicit_na.R                            30       0  100.00%
R/estimate_multinomial_rsp.R                  50       1  98.00%   64
R/estimate_proportion.R                      205      11  94.63%   82-89, 93, 98, 319, 485
R/fit_rsp_step.R                              36       0  100.00%
R/fit_survival_step.R                         36       0  100.00%
R/formatting_functions.R                     183       2  98.91%   143, 278
R/g_forest.R                                 585      59  89.91%   241, 253-256, 261-262, 278, 288-291, 336-339, 346, 415, 502, 515, 519-520, 525-526, 539, 555, 602, 633, 708, 717, 723, 742, 797-817, 820, 831, 850, 905, 908, 1043-1048
R/g_ipp.R                                    133       0  100.00%
R/g_km.R                                     350      57  83.71%   286-289, 308-310, 364-367, 401, 429, 433-476, 483-487
R/g_lineplot.R                               243      22  90.95%   196, 370-377, 416-426, 518, 526
R/g_step.R                                    68       1  98.53%   109
R/g_waterfall.R                               47       0  100.00%
R/h_adsl_adlb_merge_using_worst_flag.R        73       0  100.00%
R/h_biomarkers_subgroups.R                    46       0  100.00%
R/h_cox_regression.R                         110       0  100.00%
R/h_incidence_rate.R                          45       0  100.00%
R/h_km.R                                     508      41  91.93%   137, 189-194, 287, 378, 380-381, 392-394, 413, 420-421, 423-425, 433-435, 460, 465-468, 651-654, 1108-1119
R/h_logistic_regression.R                    468       3  99.36%   203-204, 273
R/h_map_for_count_abnormal.R                  54       0  100.00%
R/h_pkparam_sort.R                            15       0  100.00%
R/h_response_biomarkers_subgroups.R           90      12  86.67%   50-55, 107-112
R/h_response_subgroups.R                     178      18  89.89%   257-270, 329-334
R/h_stack_by_baskets.R                        64       1  98.44%   89
R/h_step.R                                   180       0  100.00%
R/h_survival_biomarkers_subgroups.R           88       6  93.18%   111-116
R/h_survival_duration_subgroups.R            207      18  91.30%   259-271, 336-341
R/imputation_rule.R                           17       0  100.00%
R/incidence_rate.R                            86       7  91.86%   66-71, 151
R/logistic_regression.R                      102       0  100.00%
R/missing_data.R                              21       3  85.71%   32, 66, 76
R/odds_ratio.R                               110       0  100.00%
R/prop_diff_test.R                            91       0  100.00%
R/prop_diff.R                                265      15  94.34%   69-72, 104, 289-296, 439, 604
R/prune_occurrences.R                         57       0  100.00%
R/response_biomarkers_subgroups.R             69       6  91.30%   196-201
R/response_subgroups.R                       213       8  96.24%   100-105, 260-261
R/riskdiff.R                                  65       5  92.31%   102-105, 114
R/rtables_access.R                            38       0  100.00%
R/score_occurrences.R                         20       1  95.00%   124
R/split_cols_by_groups.R                      49       0  100.00%
R/stat.R                                      59       0  100.00%
R/summarize_ancova.R                         106       2  98.11%   182, 187
R/summarize_change.R                          30       0  100.00%
R/summarize_colvars.R                         10       0  100.00%
R/summarize_coxreg.R                         172       0  100.00%
R/summarize_glm_count.R                      209       3  98.56%   192-193, 489
R/summarize_num_patients.R                    94       4  95.74%   116-118, 265
R/summarize_patients_exposure_in_cols.R       96       1  98.96%   55
R/survival_biomarkers_subgroups.R             78       6  92.31%   117-122
R/survival_coxph_pairwise.R                   84      12  85.71%   50-51, 63-72
R/survival_duration_subgroups.R              211       6  97.16%   124-129
R/survival_time.R                            111       0  100.00%
R/survival_timepoint.R                       124      10  91.94%   130-139
R/utils_checkmate.R                           68       0  100.00%
R/utils_default_stats_formats_labels.R       136       0  100.00%
R/utils_factor.R                             109       2  98.17%   84, 302
R/utils_ggplot.R                             110       0  100.00%
R/utils_grid.R                               126       5  96.03%   164, 279-286
R/utils_rtables.R                            100       4  96.00%   39, 46, 403-404
R/utils_split_funs.R                          52       2  96.15%   82, 94
R/utils.R                                    141       7  95.04%   118, 121, 124, 128, 137-138, 332
TOTAL                                      10581     463  95.62%

Diff against main

Filename                                  Stmts    Miss  Cover
--------------------------------------  -------  ------  --------
R/analyze_variables.R                        +6       0  +0.04%
R/survival_coxph_pairwise.R                  +5      +1  -0.36%
R/survival_time.R                           +32       0  +100.00%
R/survival_timepoint.R                      +11      +3  -1.87%
R/utils_default_stats_formats_labels.R      +12       0  +100.00%
TOTAL                                       +66      +4  -0.01%

Results for commit: e7bfd91

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

@iaugusty
Copy link
Collaborator Author

iaugusty commented Sep 9, 2024

I have read the CLA Document and I hereby sign the CLA

iaugusty and others added 3 commits September 9, 2024 13:39
Merge remote-tracking branch 'refs/remotes/origin/1301-feature-request-add-confidence-intervals-for-quantiles-in-surv_time' into 1301-feature-request-add-confidence-intervals-for-quantiles-in-surv_time

# Conflicts:
#	R/survival_time.R
@@ -209,7 +245,6 @@ surv_time <- function(lyt,
var_labels = var_labels,
show_labels = show_labels,
table_names = table_names,
na_str = na_str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is already passed as extra_args into the analysis function . If the na_str is passed as a named list (to be able to show the formatted value as "NA (NA, NA)" when a stat is all missing), having na_str as extra_args and inside analyze gave an error.

Copy link
Contributor

@Melkiades Melkiades Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! good catch! thanks :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may also be present elsewhere. Can you add an issue regarding this please?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after looking at it again, it might be more easy to remove na_str from extra_args and keep it in analyze only. Then it is much easier to accomplish "NA (NA, NA)", and there would not be the need for a named list to na_str to accomplish this.
Would it be ok to remove it from in_rows call and keep it only in analyze?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of na_str seems a bit duplicated. It is more straightforward to modify it from the analyze function instead of sending it down. Yet the statistical functions may be used and tested externally from the decided analyze. For simplicity, I would keep only the na_str on the top level. If users want to use custom analyze functions can provide their own NA handling directly there

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds fine to me!

R/survival_time.R Outdated Show resolved Hide resolved
Copy link
Contributor

@Melkiades Melkiades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good! Thanks. Just a couple of comments (could you also add NEWS entry ;))

@Melkiades
Copy link
Contributor

Pull Request

Fixes #1301 @Melkiades I'm not sure about the impact of adding new stats to an analyze function on the remainder of the code/packages. Could you review and share your thoughts? This is the first function for which we'd like to add extra stats, there will be more functions for which we'd like to combine the stat and it's confidence interval into 1 line. Once we know the approach we can follow, more of these types of updates might follow.

For the moment, it looks good. I will check downstream if there are breaking changes due to using all possible values by default but in principle that should be changed. @shajoezhu @edelarua, what do you think about freely adding stats like this?

@iaugusty, we are already trying to introduce a bit more flexibility in custom stats' addition, so for me, it is practically good to go. Lets hear the others ;)

@edelarua
Copy link
Contributor

edelarua commented Sep 9, 2024

@edelarua, what do you think about freely adding stats like this?

Sounds fine to me, I would just try to stick with naming conventions used throughout the package as much as possible for any added statistics :)

iaugusty and others added 5 commits September 10, 2024 09:14
Merge remote-tracking branch 'refs/remotes/origin/1301-feature-request-add-confidence-intervals-for-quantiles-in-surv_time' into 1301-feature-request-add-confidence-intervals-for-quantiles-in-surv_time

# Conflicts:
#	R/survival_time.R
#	R/utils_default_stats_formats_labels.R
@Melkiades
Copy link
Contributor

Lgtm anyway! @iaugusty could you just fix the checks? Thanks

@shajoezhu
Copy link
Contributor

block this PR by #1311

@shajoezhu shajoezhu removed the blocked label Sep 24, 2024
Copy link
Contributor

@Melkiades Melkiades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm! Thanks @iaugusty for the great work. I took a bit to check the na_str behavior and we should go back to it and make an uniform choice at a certain point. Tested your additions and works very well ;) I am moving the discussion for NAs handling to the issue you created.

@Melkiades
Copy link
Contributor

@iaugusty could you update the snapshots please? Thanks!!

@iaugusty
Copy link
Collaborator Author

iaugusty commented Sep 27, 2024

@iaugusty could you update the snapshots please? Thanks!!

When I try to run the tests, it seems I have to update the svg files for the graphs, for font changes? Should I have some special settings to ensure I match the fonts that are being used at your end?

I added the long version also to s_summary.numeric, as median_long was added to utils_default_stats_formats_labels.R, and this is being utilized by s_summary.numeric as well. This triggered me to add this stat in s_summary.numeric, and as we want the same long version for mean there as well, I added it at the same time.
I hope this is fine within this same branch.

@Melkiades
Copy link
Contributor

Melkiades commented Sep 27, 2024

@iaugusty could you update the snapshots please? Thanks!!

When I try to run the tests, it seems I have to update the svg files for the graphs, for font changes? Should I have some special settings to ensure I match the fonts that are being used at your end?

I added the long version also to s_summary.numeric, as median_long was added to utils_default_stats_formats_labels.R, and this is being utilized by s_summary.numeric as well. This triggered me to add this stat in s_summary.numeric, and as we want the same long version for mean there as well, I added it at the same time. I hope this is fine within this same branch.

You can turn off the plot diff with the following (in test/setup.R):

# expect_snapshot_ggplot - set custom plot dimensions
expect_snapshot_ggplot <- function(title, fig, width = NA, height = NA) {
  testthat::skip()
  testthat::skip_on_ci()
  testthat::skip_if_not_installed("svglite")

Looking now at mean_long etc I think it would be better to have mean_ci etc as it is exactly that. Thanks Ilse!!

mean_ci is already in, which is just the CI of the mean, not including the mean, while mean_long is a 3-d stat: mean + CI

@iaugusty
Copy link
Collaborator Author

iaugusty commented Oct 1, 2024

@Melkiades : not sure if you saw my reply on your comment
Looking now at mean_long etc I think it would be better to have mean_ci etc as it is exactly that. Thanks Ilse!!

mean_ci is already in, which is just the CI of the mean, not including the mean, while mean_long is a 3-d stat: mean + CI

@Melkiades
Copy link
Contributor

@Melkiades : not sure if you saw my reply on your comment Looking now at mean_long etc I think it would be better to have mean_ci etc as it is exactly that. Thanks Ilse!!

mean_ci is already in, which is just the CI of the mean, not including the mean, while mean_long is a 3-d stat: mean + CI

I see! I would call it mean_and_ci or mean_ci_3d then ;)

@shajoezhu
Copy link
Contributor

hi @iaugusty , I was wondering if you could update and resolve this conflict please? the PR is mostly fine to be merged now. Thanks!

@shajoezhu
Copy link
Contributor

shajoezhu commented Oct 16, 2024

hi @iaugusty , I was wondering could you do the following please.

  1. resolve the conflict
  2. in https://github.com/insightsengineering/scda.test, on branch 1301-feature-request-add-confidence-intervals-for-quantiles-in-surv_time, so the test will kick off (1301 feature request add confidence intervals for quantiles in surv time scda.test#157 I have done this for you. lets wait and see)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request]: add confidence intervals for quantiles in surv_time
4 participants