Attribute or other way to distinguish MCMC vs MC #239

Open
avehtari opened this issue May 17, 2022 · 6 comments
Labels
feature New feature or request

Comments

@avehtari
Collaborator

The posterior package started with a focus on multi-chain MCMC and stores chain and iteration ids. These are useful when computing multi-chain Rhat, ESS, and MCSE. It is also possible to set weights for the draws, which is useful for importance sampling. It would be useful to think about the default behavior of some functions and whether the current draws objects contain sufficient information to do the right thing. For example, if we want to compute MCSE, we have 4 different cases:

  • MCMC draws (e.g. usual Stan posterior draws, use MCMC-ESS to compute MCSE)
  • MCMC draws with weights (e.g. Stan posterior draws + IS, e.g. in loo, use MCMC-ESS and IS-ESS)
  • MC draws (e.g. draws from Gaussian posterior approximation, use MC-ESS)
  • MC draws with weights (e.g. draws from Gaussian posterior approximation + IS, use IS-ESS)

I guess we could assume that if iteration information is available, then the draws are from MCMC. But at the moment, we don't have support for independent (weighted) MC draws. Would it make sense to set the iteration to 1 for all independent draws? Any other ideas for making the distinction?

This issue is related to the psis() function in the loo package complaining if the r_eff argument is not set. r_eff is used to pass the previously computed (MCMC-ESS)/S. If we could determine whether the draws are from MCMC or MC, we would not need to complain in the latter case (and could compute r_eff internally in the former case).
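The four cases listed above can be sketched in a few lines. This is only an illustration, written in Python for brevity rather than as posterior or loo code: the function names are hypothetical, `r_eff` is taken to be (MCMC-ESS)/S as described above, IS-ESS is computed with the usual 1 / sum(w^2) formula for normalized weights, and combining the two by multiplication is an assumption of this sketch.

```python
import math


def effective_sample_size(n_draws, weights=None, r_eff=None):
    """Hypothetical helper covering the four cases discussed above.

    n_draws : number of draws S
    weights : importance weights (unnormalized is fine), or None
    r_eff   : (MCMC-ESS) / S for MCMC draws, or None for independent MC draws
    """
    if r_eff is None:
        # Independent MC draws: no autocorrelation penalty.
        r_eff = 1.0
    if weights is None:
        # Unweighted: MC-ESS is S, MCMC-ESS is S * r_eff.
        return n_draws * r_eff
    total = sum(weights)
    normalized = [w / total for w in weights]
    # IS-ESS for normalized weights.
    is_ess = 1.0 / sum(w * w for w in normalized)
    # Assumption of this sketch: scale IS-ESS by r_eff for weighted MCMC draws.
    return is_ess * r_eff


def mcse_mean(sd, ess):
    """MCSE of a mean estimate, given a standard deviation and an ESS."""
    return sd / math.sqrt(ess)
```

With a known draw type, a caller would pass `r_eff=None` for MC draws and would not need to be warned about a missing `r_eff`; for MCMC draws the value could be computed internally before the call.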

@paul-buerkner
Collaborator

Not all formats have iteration information. In fact, in a way, only draws_df has. The other ones just store iteration implicitly through the ordering of the draws.

Adding an attribute would not be difficult I think. The only question is how to proceed with it when the draws objects are transformed somehow. Attributes in R are timid things that tend to vanish into the dark as soon as one lightly touches the object they belong to.

For rvars it might be the easiest(?) to maintain an attribute, as all transformations are fully custom there anyway. For the other objects, I am not sure. I didn't want to go through the effort of reimplementing every standard transformation such as +, * etc. for every format to make sure the attributes are kept/altered correctly.

Does anybody have other ideas for how to differentiate these types of draws?

@mjskay
Collaborator

mjskay commented Jun 19, 2022

> For rvars it might be the easiest(?) to maintain an attribute, as all transformations are fully custom there anyway.

Yeah, the annoying stuff to make this work in rvars has already been figured out to track chain information, so it could certainly be done. It does seem like we wouldn't want this feature to be limited to rvars though.

> For the other objects, I am not sure. I didn't want to go through the effort of reimplementing every standard transformation such as +, * etc. for every format to make sure the attributes are kept/altered correctly.

Yeah I feel that. If this feature is desired that may end up being the only feasible way unfortunately. In the end it may not be that hard, since most of those operations can be implemented using group generics instead of one-by-one, because we aren't changing their fundamental functionality, just passing on to the superclass and then making sure the attribute is maintained on the result.

The only other mechanism I can think of is a "special" variable like the one used to store weights. Seems wasteful though since presumably it would always hold the same value for every draw.

@paul-buerkner paul-buerkner added the feature New feature or request label Aug 1, 2022
@mjskay mjskay mentioned this issue Nov 24, 2022
@avehtari
Collaborator Author

avehtari commented Aug 4, 2023

Now that cmdstanr is getting a laplace() method to get draws from the normal approximation, it would be great to tag those draws as not being from MCMC, and by default not to show Rhat, ESS-Bulk and ESS-Tail in summarize_draws().

@mjskay
Collaborator

mjskay commented Aug 4, 2023

In order to implement this, it might make sense to create some systematic infrastructure for resolving subtype conflicts amongst MC / MCMC / weighted MC / weighted MCMC. This would probably have to include a way for people to do coercion manually if needed, particularly if we decide some of the subtype combinations result in an error that has to be resolved by the user.

It could be helpful to fill out a table like this:

| x | y | op | result |
|---|---|---|---|
| any | same as x | any? | same as x |
| MCMC | MC | +,-,*,/,... | MC? MCMC? error? |
| MCMC | MC | bind draws | MC? MCMC? error? |
| MCMC | weighted MCMC | +,-,*,/,... | resample y to MCMC, then MC or MCMC? error? |
| ... | ... | ... | ... |

Something like a mc_subtype attribute on objects and a resolve_mc_subtype(x, y, op_type) internal method? Not sure what user-facing methods would be needed as well. Coercion from weighted to non-weighted is already handled by resample_draws(), so we might only need a new user-facing method if we decide combining MC and MCMC draws is an error so that people would have to do a coercion first (though I don't really think combining those two should be an error, since it would make the API pretty clunky in a lot of places).

@avehtari
Collaborator Author

avehtari commented Aug 8, 2023

I'd be fine with explicit coercion, but if it is automatic, then a couple of examples are:

  • MCMC draws of the parameters are used to compute the parameters of a predictive distribution, and we then sample from that distribution. In the normal case this could be written as mu + sigma*r, where mu and sigma are of MCMC type but r consists of independent draws. The natural type of the combination is thus MCMC.
  • Comparison of MCMC draws and MC draws to analyse how much the inferences differ. Again, the autocorrelation from the MCMC draws remains in the result.

Binding MCMC and MC seems less likely, with binding as independent chains maybe a bit more likely. I would coerce to MCMC, and the MC draws would lose the information that they were independent also over iterations.

Resampling weighted draws with some default is a non-trivial choice. I'm not sure what the use case would be for combining non-weighted and weighted draws. For non-weighted MCMC we could assume the weights are equal. We could also consider the case of two weighted MCMC or weighted MC variables with different weights, which would also complicate the generic math operations. For variables with equal weights it makes no difference whether we do the arithmetic first and resample afterwards or vice versa, except that the diagnostics can be better if we do the arithmetic first and keep the weights.
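One possible reading of the resolution rule discussed in this thread, as a minimal sketch. It is written in Python for brevity and is not posterior code: the `Subtype` class, the `"arithmetic"` and `"bind"` labels, and the decision to raise an error on mismatched weights are all assumptions made for illustration. Only the name `resolve_mc_subtype(x, y, op_type)` and the "MCMC wins" rule come from the comments above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Subtype:
    """Hypothetical subtype tag carried by a draws object."""
    kind: str  # "MC" or "MCMC"
    weights: Optional[Tuple[float, ...]] = None  # None means unweighted


def resolve_mc_subtype(x, y, op_type):
    """Resolve the subtype of combining x and y (sketch, conservative policy)."""
    if op_type not in ("arithmetic", "bind"):
        raise ValueError(f"unknown op_type: {op_type!r}")
    if op_type == "bind" and (x.weights is not None or y.weights is not None):
        # Binding weighted draws is left out of this sketch.
        raise ValueError("binding weighted draws is not covered here")
    if x.weights != y.weights:
        # Different weights: ask the user to resample or coerce explicitly.
        raise ValueError("draws carry different weights; coerce explicitly first")
    # Autocorrelation survives the operation, so MCMC dominates MC.
    kind = "MCMC" if "MCMC" in (x.kind, y.kind) else "MC"
    return Subtype(kind, x.weights)
```

For the mu + sigma*r example, `resolve_mc_subtype(Subtype("MCMC"), Subtype("MC"), "arithmetic")` gives a result of kind `"MCMC"`. Raising on mismatched weights is the strict option; treating unweighted draws as equally weighted would be a more permissive alternative.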

@mjskay mjskay mentioned this issue Feb 3, 2024
2 tasks
@n-kall
Collaborator

n-kall commented Mar 11, 2024

As @mjskay mentioned in #331, this attribute could be added to rvars. If rvars are then passed to summary functions in summarise_draws(), as discussed regarding weight support in #184, then summary functions could use this info too.

I think this would mostly affect mcse_* and ess_* functions as @avehtari mentions in the original post. I'm already adjusting them to handle weights, so it wouldn't be too much more to add MC vs MCMC.
