Allow Multiple Values of the Log Marginal Likelihood #574

Open
ajnafa opened this issue Oct 22, 2022 · 3 comments
Labels
Enhancement 💥 Implemented features can be improved or revised

Comments

@ajnafa
Contributor

ajnafa commented Oct 22, 2022

Describe the solution you'd like

As I mentioned to @bwiernik and @mattansb on Twitter yesterday, it appears the bayesfactor_.* functions only support a single value for the log marginal likelihood. As far as I can tell, the current behavior of the package is to throw a warning if fewer than 40,000 post-warmup samples are taken. However, as noted in Schad et al. (2022) and as I discuss in the context of Bayesian Model Averaging here, this doesn't necessarily guarantee that repeated runs of the bridge sampling algorithm will produce stable results. A preferable approach, where tractable, is to calculate a distribution of estimates for the log marginal likelihood by passing the repetitions argument to bridgesampling::bridge_sampler.

I looked into implementing this for brmsfit objects in #573, since you can store the log marginal likelihood estimates internally in the model object using brms::add_criterion. However, it seems that all of the functions that inherit from bayesfactor_models expect a single estimate of the log ML for each model, so for the time being it just takes the median of the stored values. Longer term, though, it is probably worth considering adding support for repeated estimates of the log ML.

How could we do it?

As far as implementation is concerned, the first step would be to allow users to pass additional arguments to the call to bridgesampling::bridge_sampler through bayesfactor_models. After that, things get a bit more complicated because the other bayesfactor_.* functions need to be modified to handle a vector of values for each model. One way to do this that would integrate nicely into the current structure of the print functions might be to simply return the median and quantiles of the Bayes factor estimates, but I'd need to dig more into the structure of the package to figure out how feasible this is.
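
To make the "median and quantiles" idea concrete, here is a rough sketch of the summary step only. It is written in Python purely for illustration and is not code from the package, which is R. The function name summarize_bayes_factor is hypothetical, the log ML values in the usage example are made up, and pairing every repetition of one model with every repetition of the other is an assumption about how the repetitions might be combined.

```python
import math
from itertools import product
from statistics import median, quantiles


def summarize_bayes_factor(logml_num, logml_den):
    """Summarize the Bayes factor implied by repeated log-ML estimates.

    logml_num, logml_den: repeated log marginal likelihood estimates
    (at least two each) for the numerator and denominator models.
    """
    # Assumption: repetitions are independent, so every numerator
    # estimate is paired with every denominator estimate.
    log_bfs = sorted(a - b for a, b in product(logml_num, logml_den))

    # quantiles(n=20) returns 19 cut points: the first is the 5th
    # percentile and the last is the 95th percentile.
    cuts = quantiles(log_bfs, n=20, method="inclusive")
    log_bf_median = median(log_bfs)

    return {
        "log_bf_median": log_bf_median,
        "log_bf_q05": cuts[0],
        "log_bf_q95": cuts[-1],
        # exp() is monotone, so this is also the median Bayes factor.
        "bf_median": math.exp(log_bf_median),
    }


if __name__ == "__main__":
    # Made-up log-ML estimates from three hypothetical repetitions.
    logml_m1 = [-105.2, -105.6, -104.9]
    logml_m0 = [-108.1, -107.8, -108.4]
    print(summarize_bayes_factor(logml_m1, logml_m0))
```

Working on the log scale and exponentiating only the final summary avoids overflow when the log ML values differ by a lot. A wide gap between the 5% and 95% values would be one sign that the estimates are unstable.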

@mattansb
Member

  1. Yes to adding a repetitions argument (or perhaps just allowing one to be passed via ...).
  2. When multiple repetitions are used, we should return the median/mean for our users' convenience.
    (bayesfactor_models() for Stan models is really just an easystats-flavored wrapper around the functions from {bridgesampling}, so any user wanting something more advanced would probably be using those functions directly anyway...)

@ajnafa
Contributor Author

ajnafa commented Oct 23, 2022

I think that sounds reasonable. Passing arguments via ... to bridge_sampler would allow a bit more control over things like the maximum number of iterations, parallel computation on Linux and macOS, etc.

With the target user base in mind, it might also be a good idea to include an explicit warning/message somewhere that advises users to run the algorithm more than once or set the repetitions argument to ensure their estimates aren't wildly unstable.

@mattansb
Member

Yes, I like all of these (:

@strengejacke strengejacke added the Enhancement 💥 Implemented features can be improved or revised label Mar 14, 2023