selected_features for learners that don't support it should be the entirety of features seen in training #935

Closed
mb706 opened this issue Jun 29, 2023 · 7 comments

@mb706
Collaborator

mb706 commented Jun 29, 2023

This way we could correctly query a pipeline that selects features first and passes the result to a learner. The GraphLearner could then ask the learner at the end how many features it used. If it is a learner that supports embedded feature selection (e.g. rpart), this would give the correct value, but even for learners that do not support it the result would make sense.

Also this would solve mlr-org/mlr3fselect#87

@be-marc
Member

be-marc commented Aug 17, 2024

library(mlr3)
library(mlr3learners)

learner = lrn("classif.rpart")
task = tsk("spam")

learner$train(task)
learner$selected_features()

#> [1] "charDollar"      "hp"             
#> [3] "remove"          "charExclamation"
#> [5] "capitalTotal"    "free"  

learner = lrn("classif.log_reg")
learner$train(task)
learner$selected_features()
#> Error: attempt to apply non-function

@berndbischl berndbischl self-assigned this Aug 17, 2024
@berndbischl
Member

So the first order of business here would be to extend the docs; they don't say what happens if the property does not exist.

@mb706
Collaborator Author

mb706 commented Dec 19, 2024

Currently mlr3pipelines handles this on its own end if this flag is set:

https://github.com/mlr-org/mlr3pipelines/blob/b1042d7967d13207276f6c1e429dfca86c76416f/R/GraphLearner.R#L73-L84

@berndbischl
Member

I would suggest this:
a) The learner has a member variable / option:
selected_features_not_supported = "error" / "all"
This controls what happens in "selected_features".

b) selected_features as a method is present in all learners.
By default it returns an error (if not implemented).
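
The proposed behavior could be sketched roughly as follows. This is only an illustration in base R, not the mlr3 implementation: the helper `resolve_selected_features()` and its arguments `trained_features`, `embedded_selection` and `not_supported` are hypothetical names made up for this example, and the real method would live on the learner object.

```r
# Hypothetical helper, NOT part of mlr3: decides what selected_features()
# should return, following the "error" / "all" option proposed above.
resolve_selected_features = function(trained_features,
                                     embedded_selection = NULL,
                                     not_supported = c("error", "all")) {
  # Defaults to "error" (the first choice) when the option is not given.
  not_supported = match.arg(not_supported)

  if (!is.null(embedded_selection)) {
    # The learner supports embedded feature selection: report its own result.
    return(embedded_selection)
  }

  if (not_supported == "all") {
    # Unsupported, but configured to fall back to all features seen in training.
    return(trained_features)
  }

  # Unsupported and no fallback requested: fail loudly.
  stop("Learner does not support feature selection")
}

features = c("free", "hp", "remove")

# Learner with embedded selection (assumed here to have picked "hp")
resolve_selected_features(features, embedded_selection = "hp")

# Learner without support, option set to "all"
resolve_selected_features(features, not_supported = "all")

# Learner without support, default option: the error is caught for display
tryCatch(
  resolve_selected_features(features),
  error = function(e) conditionMessage(e)
)
```

With the "all" setting, a GraphLearner could call the method on any final learner without special-casing learners that lack the property.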

@mb706
Collaborator Author

mb706 commented Dec 19, 2024

selected_features_impute

@berndbischl berndbischl assigned be-marc and unassigned berndbischl Dec 19, 2024
@berndbischl
Member

We also need to remove this from mlr3pipelines then.

@be-marc
Member

be-marc commented Dec 20, 2024

Closed by #1230

@be-marc be-marc closed this as completed Dec 20, 2024