- Minor tweaks to URLs and tests to pass CRAN checks.
- This NEWS file now follows the Keep a Changelog format.
- Removed lifecycle badge from `README` file.
- The training data has to be explicitly passed in more cases when using `vi_permute()`, `vi_shap()`, and `vi_firm()`.
- Raised R version dependency to >= 4.1.0 (which introduced the native pipe operator `|>`).
- The `vi_permute()` function now relies on the yardstick package for computing performance measures (e.g., RMSE and log loss); consequently, user-supplied metric functions now need to conform to yardstick metric argument names.
- The `var_fun` argument in `vi_firm()` has been deprecated; use the new `var_continuous` and `var_categorical` arguments instead.
- The explicit `ice` argument in `vi_firm()` has been removed; it was not really needed since it can be passed via the `...` argument.
- Removed magrittr from imports; it's easy enough to just load the package if you need it or use R's newer native pipe operator.
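
As a rough illustration of the yardstick naming convention mentioned above, a user-supplied metric might look like the sketch below. It assumes the relevant argument names are `truth` and `estimate` (the names yardstick's vector metrics use); `my_rmse` and `pfun` are hypothetical names, and the `vi_permute()` call is guarded so it only runs when vip (>= 0.4.0) is installed.

```r
# Hypothetical user-supplied metric using yardstick-style argument names
# (`truth` = observed values, `estimate` = predictions); base R only.
my_rmse <- function(truth, estimate) {
  sqrt(mean((truth - estimate)^2))
}

# Sanity check on toy vectors: the errors are 0, 0, and 2
toy_rmse <- my_rmse(truth = c(1, 2, 3), estimate = c(1, 2, 5))

# Optional: pass the metric to vi_permute(), supplying the training data
# explicitly. Guarded so the sketch still runs without vip installed.
if (requireNamespace("vip", quietly = TRUE) &&
    utils::packageVersion("vip") >= "0.4.0") {
  fit <- lm(mpg ~ ., data = mtcars)
  pfun <- function(object, newdata) predict(object, newdata = newdata)
  set.seed(101)  # permutations are random
  vis <- vip::vi_permute(
    fit,
    train = mtcars,            # training data passed explicitly
    target = "mpg",
    metric = my_rmse,
    smaller_is_better = TRUE,  # needed because the metric is a function
    pred_wrapper = pfun
  )
  print(vis)
}
```

Passing a metric name such as `metric = "rmse"` instead of a function should avoid the need to write the metric by hand.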
- Tweaked examples.
- Tests based on fastshap now check to make sure it's available.
- Suppress loading of mixOmics in tests.
- Switched lifecycle badge from "maturing", which has been superseded, to "experimental."
- Fixed H2O URL in `vi_model.R`.
- Removed the unnecessary `LazyData: true` line from the `DESCRIPTION` file.
- Switched to using markdown syntax in `roxygen2` comments.
- `vi_model()` now supports lightgbm models. Thanks to @nipnipj for the suggestion (#146).
- The permutation importance method (i.e., function `vi_permute()`) now integrates with and uses yardstick performance metrics.
- `list_metrics()` gained an additional `smaller_is_better` column indicating whether the corresponding metric should be minimized (`smaller_is_better = TRUE`) or maximized (`smaller_is_better = FALSE`); thanks to @topedo. Additionally, all the column names are now in lower case.
- Added support for partial least squares via the mixOmics package (PR #129); thanks to @topedo.
- Added support for the workflows and parsnip packages from the tidymodels ecosystem (PR #128); thanks to @topedo.
- New pkgdown site and vignette based on our original R Journal article.
- Function `add_sparklines()` seems out of scope and has been removed.
- Function `vint()` also seems out of scope and is too slow to implement for most practical problems; for now, the function will likely live on in the moreparty package.
- Add `tools/` to .Rbuildignore.
- Change http://spark.rstudio.com/mlib/ to https://spark.rstudio.com/mlib/ in NEWS.md.
- Remove unnecessary codecov.yml file.
- Removed deprecated arguments from `vip()`; in particular, `bar`, `width`, `alpha`, `color`, `fill`, `size`, and `shape`. Users should instead rely on the `mapping` and `aesthetics` arguments; see `?vip::vip` for details.
- Fixed a couple of bugs that occurred when using `vi_model()` with the glmnet package. In particular, we added a new `lambda` parameter for specifying the value of the penalty term to use when extracting the estimated coefficients. This is equivalent to the `s` argument in `glmnet::coef()`; the name `lambda` was chosen to not conflict with other arguments in `vi()`. Additionally, `vi_model()` did not return the absolute value of the estimated coefficients for glmnet models as advertised; this is now fixed (#103).
- Switched from Travis-CI to GitHub Actions for continuous integration.
- Added a CITATION file and PDF-based vignette based on the published article in The R Journal (#109).
- Switch from `tibble::as.tibble()`, which was deprecated in tibble 2.0.0, to `tibble::as_tibble()` in a few function calls (#101).
- The `Importance` column from `vi_model()` no longer contains "inner" names, in accordance with breaking changes in tibble 3.0.0.
- Added support for SHAP-based feature importance which makes use of the recent fastshap package on CRAN. To use, simply call `vi()` or `vip()` and specify `method = "shap"`, or you can just call `vi_shap()` directly (#87).
- Added support for the parsnip, mlr, and mlr3 packages (#94).
- Added support for `"mvr"` objects from the pls package (currently just calls `caret::varImp()`) (#35).
- The `"lm"` method for `vi_model()` gained a new `type` argument that allows users to use either (1) the raw coefficients if the features were properly standardized (`type = "raw"`), or (2) the absolute value of the corresponding t- or z-statistic (`type = "stat"`, the default) (#77).
- New function `gen_friedman()` for simulating data from the Friedman 1 benchmark problem; see `?vip::gen_friedman` for details.
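
For readers unfamiliar with the Friedman 1 benchmark mentioned above, the sketch below simulates it in base R. It is not the code behind `gen_friedman()`; the helper name `sim_friedman1()` and its arguments are hypothetical, and the formula is the standard Friedman 1 regression function, in which only the first five features influence the response.

```r
# Hypothetical base-R helper (not vip's gen_friedman()) that simulates the
# Friedman 1 benchmark: only the first five features affect the response.
sim_friedman1 <- function(n = 100, p = 10, sigma = 1) {
  stopifnot(p >= 5)
  # Independent Uniform(0, 1) features named x1, ..., xp
  x <- matrix(runif(n * p), nrow = n, ncol = p)
  colnames(x) <- paste0("x", seq_len(p))
  # Nonlinear signal plus Gaussian noise with standard deviation `sigma`
  y <- 10 * sin(pi * x[, 1] * x[, 2]) + 20 * (x[, 3] - 0.5)^2 +
    10 * x[, 4] + 5 * x[, 5] + rnorm(n, sd = sigma)
  data.frame(y = y, x)
}

set.seed(101)  # for reproducibility
friedman <- sim_friedman1(n = 200, p = 10)
str(friedman)
```

Because features `x6` through `x10` are pure noise, data like this make a convenient sanity check for any variable importance method.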
- The `vi_pdp()` and `vi_ice()` functions have been deprecated and merged into a single new function called `vi_firm()`. Consequently, setting `method = "pdp"` and `method = "ice"` has also been deprecated; use `method = "firm"` instead.
- The `metric` and `pred_wrapper` arguments to `vi_permute()` are no longer optional.
- The `vip()` function gained a new argument, `geom`, for specifying which type of plot to construct. Current options are `geom = "col"` (the default), `geom = "point"`, `geom = "boxplot"`, or `geom = "violin"` (the latter two only work for permutation-based importance with `nsim > 1`) (#79). Consequently, the `bar` argument has been removed.
- The `vip()` function gained two new arguments for specifying aesthetics: `mapping` and `aesthetics` (for fixed aesthetics like `color = "red"`). Consequently, the arguments `color`, `fill`, etc. have been removed (#80).

An example illustrating the above two changes is given below:

```r
# Load required packages
library(ggplot2)    # for `aes_string()` function
library(gridExtra)  # for `grid.arrange()` function
library(vip)

# Load the sample data
data(mtcars)

# Fit a linear regression model
model <- lm(mpg ~ ., data = mtcars)

# Construct variable importance plots
p1 <- vip(model)
p2 <- vip(model, mapping = aes_string(color = "Sign"))
p3 <- vip(model, geom = "point")
p4 <- vip(model, geom = "point", mapping = aes_string(color = "Variable"),
          aesthetics = list(size = 3))
grid.arrange(p1, p2, p3, p4, nrow = 2)
```

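
The idea behind the `vi_firm()` function introduced above can be sketched in base R: compute a feature's partial dependence curve and score the feature by how far that curve is from flat. This is only an illustration under simplifying assumptions (one continuous feature at a time, standard deviation as the spread measure); it is not vip's implementation, and `pd_curve()` is a hypothetical helper.

```r
# Base-R sketch of the idea behind FIRM for continuous features; this is
# NOT vip's implementation, and `pd_curve()` is a hypothetical helper.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Partial dependence of the prediction on `feature`: fix the feature at each
# grid value for every row, predict, and average the predictions.
pd_curve <- function(object, data, feature, grid_size = 20) {
  grid <- seq(min(data[[feature]]), max(data[[feature]]),
              length.out = grid_size)
  vapply(grid, function(value) {
    data[[feature]] <- value  # modifies a local copy only
    mean(predict(object, newdata = data))
  }, numeric(1))
}

# A flat curve means the feature has little effect, so the spread of the
# curve (here, its standard deviation) serves as the importance score.
features <- c("wt", "hp", "qsec")
firm_scores <- vapply(features, function(f) sd(pd_curve(fit, mtcars, f)),
                      numeric(1))
sort(firm_scores, decreasing = TRUE)
```

For a linear model the partial dependence curve is a straight line, so each score here is just the absolute coefficient times the spread of the grid; the approach becomes more informative for nonlinear models.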
- The `vip()` function gained a new argument, `include_type`, which defaults to `FALSE`. If `TRUE`, the type of variable importance that was computed is included in the appropriate axis label. Set `include_type = TRUE` to revert to the old behavior.
- Removed dependency on ModelMetrics; the built-in family of performance metrics (`metric_*()`) is now documented and exported. See, for example, `?vip::metric_rmse` (#93).
- Minor documentation improvements.
- The internal (i.e., not exported) `get_feature_names()` function does a better job with `"nnet"` objects containing factors. It also does a better job at extracting feature names from model objects containing a `"formula"` component.
- `vi_model()` now works correctly for `"glm"` objects with non-Gaussian families (e.g., logistic regression) (#74).
- Added appropriate sparklyr version dependency (#59).
- Removed warnings from experimental functions.
- `vi_permute()` gained a `type` argument (i.e., `type = "difference"` or `type = "ratio"`); this argument can be passed via `vi()` or `vip()` as well.
- `add_sparklines()` creates an HTML widget to display variable importance scores with a sparkline representation of each feature's effect (i.e., its partial dependence function) (#64).
- Added support for the Olden and Garson algorithms with neural networks fit using the neuralnet, nnet, and RSNNS packages (#28).
- Added support for GLMNET models fit using the glmnet package (with and without cross-validation).
- The `pred_fun` argument in `vi_permute()` has been changed to `pred_wrapper`.
- The `FUN` argument to `vi()`, `vi_pdp()`, and `vi_ice()` has been changed to `var_fun`.
- Only the predicted class probabilities for the reference class are required (as a numeric vector) for binary classification when `metric = "auc"` or `metric = "logloss"`.
- `vi_permute()` gained a new logical `keep` argument. If `TRUE` (the default), the raw permutation scores from all `nsim` repetitions (provided `nsim > 1`) will be stored in an attribute called `"raw_scores"`.
- `vip()` gained new logical arguments `all_permutations` and `jitter` which help to visualize the raw permutation scores for all `nsim` repetitions (provided `nsim > 1`).
- You can now pass a `type` argument to `vi_permute()` specifying how to compare the baseline and permuted performance metrics. Current choices are `"difference"` (the default) and `"ratio"`.
- Improved documentation (especially for `vi_permute()` and `vi_model()`).
- Results from `vi_model()`, `vi_pdp()`, `vi_ice()`, and `vi_permute()` now have class `"vi"`, making them easier to plot with `vip()`.
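
To make the `type`, `nsim`, and raw-score entries above more concrete, here is a base-R sketch of permutation importance for a single feature. It is not vip's implementation: `permute_importance()` and `rmse()` are hypothetical helpers, and the sketch assumes a "smaller is better" metric, so a larger difference or ratio indicates a more important feature.

```r
# Base-R sketch of permutation importance with repeated permutations; this is
# NOT vip's implementation, and `permute_importance()` is a hypothetical name.
rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))

permute_importance <- function(object, data, target, feature, nsim = 10,
                               type = c("difference", "ratio")) {
  type <- match.arg(type)
  baseline <- rmse(data[[target]], predict(object, newdata = data))
  # One value per repetition: performance after shuffling a single feature
  permuted <- vapply(seq_len(nsim), function(i) {
    shuffled <- data
    shuffled[[feature]] <- sample(shuffled[[feature]])
    rmse(shuffled[[target]], predict(object, newdata = shuffled))
  }, numeric(1))
  # Compare permuted and baseline performance as a difference or a ratio
  raw_scores <- if (type == "difference") {
    permuted - baseline
  } else {
    permuted / baseline
  }
  # Averaging over repetitions reduces the variability due to permuting
  list(importance = mean(raw_scores), raw_scores = raw_scores)
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
set.seed(101)
res_diff <- permute_importance(fit, mtcars, "mpg", "wt", nsim = 25)
set.seed(101)
res_ratio <- permute_importance(fit, mtcars, "mpg", "wt", nsim = 25,
                                type = "ratio")
```

Keeping the per-repetition scores alongside the average is what makes plots of the full permutation distribution possible.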
- Added `nsim` argument to `vi_permute()` for reducing the sampling variability induced by permuting each predictor (#36).
- Added `sample_size` and `sample_frac` arguments to `vi_permute()` for reducing the size of the training sample for every Monte Carlo repetition (#41).
- Greatly improved the documentation for `vi_model()` and the various objects it supports.
- New argument `rank`, which defaults to `FALSE`, available in `vi()` (#55).
- Added support for Spark (G)LMs.
- `vi()` is now a generic which makes adding new methods easier (e.g., to support DataRobot models).
- Bug fixes.
- Fixed bug in `get_feature_names.ranger()` such that it never returns `NULL`; it either returns the feature names or throws an error if they cannot be recovered from the model object (#43).
- Added `pkgdown` site: https://github.com/koalaverse/vip.
- Changed `truncate_feature_names` argument of `vi()` to `abbreviate_feature_names` which abbreviates all feature names, rather than just truncating them.
- Added CRAN-related badges (#32).
- New generic `vi_permute()` for constructing permutation-based variable importance scores (#19).
- Fixed bug and unnecessary error check in `vint()` (#38).
- New vignette on using `vip` with unsupported models (using the Keras API to TensorFlow as an example).
- Added basic sparklyr support.
- Added support for XGBoost models (i.e., objects of class `"xgb.booster"`).
- Added support for ranger models (i.e., objects of class `"ranger"`).
- Added support for random forest models from the `party` package (i.e., objects of class `"RandomForest"`).
- `vip()` gained a new argument, `num_features`, for specifying how many variable importance scores to plot. The default is set to `10`.
- `.` was changed to `_` in all argument names.
- `vi()` gained three new arguments: `truncate_feature_names` (for truncating feature names in the returned tibble), `sort` (a logical argument specifying whether or not the resulting variable importance scores should be sorted), and `decreasing` (a logical argument specifying whether or not the variable importance scores should be sorted in decreasing order).
- `vi_model.lm()`, and hence `vi()`, contains an additional column called `Sign` that contains the sign of the original coefficients (#27).
- `vi()` gained a new argument, `scale`, for scaling the variable importance scores so that the largest is 100. Default is `FALSE` (#24).
- `vip()` gained two new arguments, `size` and `shape`, for controlling the size and shape of the points whenever `bar = FALSE` (#9).
- Added support for `"H2OBinomialModel"`, `"H2OMultinomialModel"`, and `"H2ORegressionModel"` objects (#8).
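
The `Sign` column and the `scale`, `sort`, and `decreasing` arguments described above can be mimicked in base R for a linear model, as in the sketch below. This is an illustration only, not vip's code: it assumes importance is the absolute t-statistic, and the `"POS"`/`"NEG"` labels are an illustrative choice.

```r
# Base-R sketch (not vip's code): t-statistic-based importance for a linear
# model, with a sign column and scaling so that the largest score is 100.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
coefs <- summary(fit)$coefficients
coefs <- coefs[rownames(coefs) != "(Intercept)", , drop = FALSE]

scores <- data.frame(
  Variable = rownames(coefs),
  Importance = abs(coefs[, "t value"]),                   # magnitude only
  Sign = ifelse(coefs[, "Estimate"] >= 0, "POS", "NEG"),  # direction
  row.names = NULL
)

# Similar in spirit to `scale = TRUE`, `sort = TRUE`, and `decreasing = TRUE`
scores$Importance <- scores$Importance / max(scores$Importance) * 100
scores <- scores[order(scores$Importance, decreasing = TRUE), ]
scores
```

Keeping the sign separate from the magnitude lets a plot rank features by strength while still showing the direction of each effect.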
- Initial release.