Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rlang type checkers #1387

Merged
merged 28 commits into from
Nov 1, 2024
Merged

Add rlang type checkers #1387

merged 28 commits into from
Nov 1, 2024

Conversation

topepo
Copy link
Member

@topepo topepo commented Oct 22, 2024

Does some additional tests for basic things. The next step would be adding these checks to the steps.

args <- list(...)

check_string(subclass, call = call)
.prefix <- rlang::arg_match0(.prefix, c("step_", "check_", ""),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do have some code in update() that assigns a prefix of "" and I didn't want to mess with that.

@topepo topepo marked this pull request as ready for review October 22, 2024 16:38
@topepo topepo requested a review from EmilHvitfeldt October 22, 2024 16:39
Copy link
Member

@EmilHvitfeldt EmilHvitfeldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall like what is happening. good idea to check for common arguments in step() and check() as all the steps run through there.

I want to see what happens if we extend it to a few common arguments like keep_original_cols and case_weights that has very specific meanings, but aren't used all the time

R/steps_and_checks.R Show resolved Hide resolved
R/steps_and_checks.R Outdated Show resolved Hide resolved
R/steps_and_checks.R Outdated Show resolved Hide resolved
tests/testthat/test-basics.R Outdated Show resolved Hide resolved
@EmilHvitfeldt
Copy link
Member

Here are the most common arguments for recipes steps. So I'll go back on my word before, and way we shouldn't check for case_weights

library(recipes)
library(tidyverse)

fns <- lsf.str("package:recipes")
fn_args <- lapply(fns, formals)
names(fn_args) <- fns

fn_args <- fn_args[str_detect(fns, "^step_") & str_detect(fns, "_new", negate = TRUE)]

map(fn_args, ~ tibble(arg = names(.x))) |>
  list_rbind(names_to = "step") |>
  count(arg, sort = TRUE) |>
  print(n = 20)
#> # A tibble: 127 × 2
#>    arg                    n
#>    <chr>              <int>
#>  1 id                    98
#>  2 recipe                98
#>  3 role                  98
#>  4 skip                  98
#>  5 trained               98
#>  6 ...                   96
#>  7 columns               36
#>  8 keep_original_cols    36
#>  9 options               26
#> 10 prefix                16
#> 11 objects               11
#> 12 res                    9
#> 13 threshold              9
#> 14 degree                 8
#> 15 num_comp               8
#> 16 deg_free               7
#> 17 inputs                 7
#> 18 complete_set           6
#> 19 levels                 6
#> 20 na_rm                  6
#> # ℹ 107 more rows

Created on 2024-10-24 with reprex v2.1.0

@topepo
Copy link
Member Author

topepo commented Oct 25, 2024

Extending that to extension packages:

suppressPackageStartupMessages(library(tidymodels))
suppressPackageStartupMessages(library(revdepcheck))

recipe_pkgs <- revdepcheck::cran_revdeps("recipes", dependencies = c("Depends", "Imports"))
recipe_pkgs <- c(recipe_pkgs, "recipes")

recipe_pkgs <- sort(unique(c(recipe_pkgs)))
excl <- c("hydrorecipes", "healthcareai", "viruslearner")
recipe_pkgs <- recipe_pkgs[!(recipe_pkgs %in% excl)]

# pak::pak(recipe_pkgs, ask = FALSE)

excl_internal <- c("check_type", "check_nominal_type", "check_factor_vars", 
                   "check_name", "check_is_lat_lon", "check_new_data", 
                   "check_role_requirements", "check_bake_role_requirements", 
                   "check_step_check_args")         
        
listify <- function(x, nm) {
  tibble::tibble(fn = nm, argument = names(x))
} 

get_steps_args <- function(pkg, excl = character(0)) {
  suppressPackageStartupMessages(library(pkg, character.only = TRUE))
  fns <- lsf.str(paste0("package:", pkg), pattern = ("(step_)|(check_)"))
  fns <- fns[!(fns %in% excl_internal)]
  fn_args <- lapply(fns, formals)
  purrr::map2_dfr(fn_args, fns, listify) %>% dplyr::mutate(package = pkg) %>% relocate(package)
} 

pkg_arg_tbl <- 
  map_dfr(recipe_pkgs, get_steps_args) %>% 
  anti_join(tibble::tibble(package = "recipes", fn = excl_internal))
#> Joining with `by = join_by(package, fn)`

pkg_arg_tbl %>% count(argument) %>% arrange(desc(n))
#> # A tibble: 295 × 2
#>    argument               n
#>    <chr>              <int>
#>  1 ...                  216
#>  2 id                   199
#>  3 recipe               199
#>  4 role                 199
#>  5 skip                 199
#>  6 trained              194
#>  7 columns               90
#>  8 keep_original_cols    57
#>  9 options               41
#> 10 prefix                39
#> # ℹ 285 more rows

Created on 2024-10-25 with reprex v2.1.0

Maybe add role?

@EmilHvitfeldt
Copy link
Member

yes, that sounds reasonable!

@topepo topepo requested a review from EmilHvitfeldt October 31, 2024 14:33
Copy link
Member

@EmilHvitfeldt EmilHvitfeldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

amazing!

@EmilHvitfeldt EmilHvitfeldt merged commit 28e3e4d into main Nov 1, 2024
13 checks passed
@EmilHvitfeldt EmilHvitfeldt deleted the type-checkers branch November 1, 2024 17:32
@topepo
Copy link
Member Author

topepo commented Nov 1, 2024

Now on to part two!

Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Nov 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants