-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add rlang type checkers #1387
Add rlang type checkers #1387
Conversation
args <- list(...) | ||
|
||
check_string(subclass, call = call) | ||
.prefix <- rlang::arg_match0(.prefix, c("step_", "check_", ""), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have some code in update()
that assigns a prefix of ""
and I didn't want to mess with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall like what is happening. good idea to check for common arguments in step()
and check()
as all the steps run through there.
I want to see what happens if we extend it to a few common arguments like keep_original_cols
and case_weights
that has very specific meanings, but aren't used all the time
Here are the most common arguments for recipes steps. So I'll go back on my word before, and way we shouldn't check for library(recipes)
library(tidyverse)
fns <- lsf.str("package:recipes")
fn_args <- lapply(fns, formals)
names(fn_args) <- fns
fn_args <- fn_args[str_detect(fns, "^step_") & str_detect(fns, "_new", negate = TRUE)]
map(fn_args, ~ tibble(arg = names(.x))) |>
list_rbind(names_to = "step") |>
count(arg, sort = TRUE) |>
print(n = 20)
#> # A tibble: 127 × 2
#> arg n
#> <chr> <int>
#> 1 id 98
#> 2 recipe 98
#> 3 role 98
#> 4 skip 98
#> 5 trained 98
#> 6 ... 96
#> 7 columns 36
#> 8 keep_original_cols 36
#> 9 options 26
#> 10 prefix 16
#> 11 objects 11
#> 12 res 9
#> 13 threshold 9
#> 14 degree 8
#> 15 num_comp 8
#> 16 deg_free 7
#> 17 inputs 7
#> 18 complete_set 6
#> 19 levels 6
#> 20 na_rm 6
#> # ℹ 107 more rows Created on 2024-10-24 with reprex v2.1.0 |
Extending that to extension packages: suppressPackageStartupMessages(library(tidymodels))
suppressPackageStartupMessages(library(revdepcheck))
recipe_pkgs <- revdepcheck::cran_revdeps("recipes", dependencies = c("Depends", "Imports"))
recipe_pkgs <- c(recipe_pkgs, "recipes")
recipe_pkgs <- sort(unique(c(recipe_pkgs)))
excl <- c("hydrorecipes", "healthcareai", "viruslearner")
recipe_pkgs <- recipe_pkgs[!(recipe_pkgs %in% excl)]
# pak::pak(recipe_pkgs, ask = FALSE)
excl_internal <- c("check_type", "check_nominal_type", "check_factor_vars",
"check_name", "check_is_lat_lon", "check_new_data",
"check_role_requirements", "check_bake_role_requirements",
"check_step_check_args")
listify <- function(x, nm) {
tibble::tibble(fn = nm, argument = names(x))
}
get_steps_args <- function(pkg, excl = character(0)) {
suppressPackageStartupMessages(library(pkg, character.only = TRUE))
fns <- lsf.str(paste0("package:", pkg), pattern = ("(step_)|(check_)"))
fns <- fns[!(fns %in% excl_internal)]
fn_args <- lapply(fns, formals)
purrr::map2_dfr(fn_args, fns, listify) %>% dplyr::mutate(package = pkg) %>% relocate(package)
}
pkg_arg_tbl <-
map_dfr(recipe_pkgs, get_steps_args) %>%
anti_join(tibble::tibble(package = "recipes", fn = excl_internal))
#> Joining with `by = join_by(package, fn)`
pkg_arg_tbl %>% count(argument) %>% arrange(desc(n))
#> # A tibble: 295 × 2
#> argument n
#> <chr> <int>
#> 1 ... 216
#> 2 id 199
#> 3 recipe 199
#> 4 role 199
#> 5 skip 199
#> 6 trained 194
#> 7 columns 90
#> 8 keep_original_cols 57
#> 9 options 41
#> 10 prefix 39
#> # ℹ 285 more rows Created on 2024-10-25 with reprex v2.1.0 Maybe add |
yes, that sounds reasonable! |
Co-authored-by: Emil Hvitfeldt <emil.hvitfeldt@posit.co>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amazing!
Now on to part two! |
This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue. |
Does some additional tests for basic things. The next step would be adding these checks to the steps.