Extended StatsModels.jl @formula syntax for regression modeling.
Note that the functionality in this package is very new: please verify that the resulting schematized formulae and the model coefficients (and their names) are what you expect, especially if you are combining multiple "advanced" formula features.
a / b expands to a + fulldummy(a) & b.
Numeric constants are special-cased so that / performs division, making it possible to rescale a variable within the formula, e.g. to convert milliseconds to seconds:
julia> fit(MyModelType, @formula(time_in_milliseconds / 1000 ~ 1 + x), my_data)
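To make the nesting expansion a / b concrete, here is a minimal plain-Julia sketch (no packages) of the columns that a + fulldummy(a) & b would contribute when a is categorical and b is continuous. It assumes an intercept is present, so that the main effect of a uses treatment (drop-first) coding, and that fulldummy(a) yields one indicator column per level. The data and the helper names indicator_columns and nested_columns are hypothetical and are not part of any package.

```julia
# Hypothetical data: a categorical grouping variable and a continuous covariate.
a = ["x", "x", "y", "y", "z", "z"]
b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# One 0/1 indicator column per requested level.
indicator_columns(v, lvls) = Float64[v[i] == l for i in eachindex(v), l in lvls]

function nested_columns(a, b)
    lvls = sort(unique(a))
    main = indicator_columns(a, lvls[2:end])  # a: treatment coding, first level is the reference
    full = indicator_columns(a, lvls)         # fulldummy(a): an indicator for every level
    slopes = full .* b                        # fulldummy(a) & b: one b column per level of a
    return hcat(main, slopes)
end

X = nested_columns(a, b)
```

Each of the last three columns of X is nonzero only for rows belonging to one level of a, which is why the nested term gives b its own slope within each level.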
Raising a sum of terms to a power with ^ generates all main effects and interactions up to the specified order. For instance, (a+b+c)^2 generates a + b + c + a&b + a&c + b&c, but not a&b&c.
NB: The presence of interaction terms within the base will result in redundant terms and is currently unsupported.
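The enumeration behind the power syntax can be illustrated independently of the package. The following is a minimal plain-Julia sketch, not the package's implementation: it lists every combination of term names up to a given order and joins each with &. The helper names subsets and expand_power are hypothetical.

```julia
# Enumerate all k-element subsets of `items`, preserving their original order.
function subsets(items::Vector{T}, k::Int) where {T}
    k == 0 && return [T[]]
    length(items) < k && return Vector{T}[]
    head, rest = items[1], items[2:end]
    with_head = [vcat(head, s) for s in subsets(rest, k - 1)]
    return vcat(with_head, subsets(rest, k))
end

# All main effects and interactions up to `order`, written with `&` as in a formula.
function expand_power(names::Vector{String}, order::Int)
    terms = String[]
    for k in 1:min(order, length(names))
        append!(terms, join.(subsets(names, k), "&"))
    end
    return terms
end

expanded = expand_power(["a", "b", "c"], 2)
println(join(expanded, " + "))   # prints: a + b + c + a&b + a&c + b&c
```

Raising the order to 3 would add the three-way term a&b&c as the final entry.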
Extended syntax is supported at two levels. First, RegressionFormulae.jl defines apply_schema methods that capture calls within a @formula to the special syntax (^, /, etc.). Second, we define methods for the corresponding functions in Base (Base.:(^), Base.:(/), etc.) for arguments that are <:AbstractTerm, which implement the special behavior, returning the appropriate terms. This allows the syntax to be used both within a @formula and for constructing terms programmatically at run-time.
If using apply_schema directly, please note that you need to pass an appropriate model type as context. Currently, the extensions here are defined for StatsAPI.RegressionModel and subtypes:
f = apply_schema(f, s, RegressionModel)
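For context, the call above might sit in a workflow like the following sketch. The data table and its column names are hypothetical, and the example assumes that StatsModels, StatsAPI, and RegressionFormulae are all installed. No expected output is shown, since the printed form of the terms may differ between versions.

```julia
using StatsModels          # @formula, schema, apply_schema
using RegressionFormulae   # loads the methods for the extended syntax
using StatsAPI: RegressionModel

# Hypothetical data as a column table (a NamedTuple of vectors).
data = (
    y = [1.2, 0.7, 3.1, 2.4, 1.9, 2.8],
    a = [0.1, 0.4, 0.2, 0.9, 0.5, 0.3],
    b = [1.0, 2.0, 1.5, 0.5, 2.5, 3.0],
    c = [10.0, 12.0, 9.0, 11.0, 13.0, 8.0],
)

f = @formula(y ~ 1 + (a + b + c)^2)
s = schema(f, data)                       # infer term types from the data
f = apply_schema(f, s, RegressionModel)   # context for which the extensions are defined

# Inspect the schematized formula and coefficient names before fitting.
println(f)
println(StatsModels.coefnames(f.rhs))
```

Passing a context that is not a subtype of RegressionModel would bypass the methods described here, so checking the printed terms is a quick way to confirm that the extended syntax was actually applied.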