
feature request: mixtures over groups / participants #1659

Open
pmusfeld opened this issue May 22, 2024 · 0 comments
Hi Paul!

We discussed the possibility of specifying mixture models in brms over grouping variables. Right now, the default is that mixtures are computed at the level of individual observations, but sometimes we encounter use cases in which we want to specify different candidate models for an entire sequence of data points (such as all the data of one participant) and fit a mixture over these models. This is essentially what the Stan User's Guide describes as the "erroneous way" of vectorizing mixtures: https://mc-stan.org/docs/2_19/stan-users-guide/vectorizing-mixtures.html; see also the math in the attached note for why mixtures over single observations and mixtures over sequences of observations lead to different likelihoods.
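To make the difference concrete, here is a small numerical sketch (the per-observation log-likelihoods and the mixing weight are made-up toy values, not taken from the issue): mixing each observation separately and then summing gives a different total log-likelihood than summing each component's log-likelihood over the whole sequence first and mixing once.

```python
import math

# Toy values (hypothetical): per-observation log-likelihoods of one
# participant's 3 data points under each of two candidate models.
ll_m1 = [-1.0, -1.2, -0.8]  # log-likelihoods under model 1
ll_m2 = [-2.0, -0.5, -1.5]  # log-likelihoods under model 2
lam = 0.6                   # mixing proportion for model 1

def log_mix(lam, lp1, lp2):
    # log(lam * exp(lp1) + (1 - lam) * exp(lp2)), via log-sum-exp for stability
    a = math.log(lam) + lp1
    b = math.log(1 - lam) + lp2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Observation-level mixture: mix each data point separately, then sum.
per_obs = sum(log_mix(lam, a, b) for a, b in zip(ll_m1, ll_m2))

# Participant-level mixture: sum each model's log-likelihood over the
# whole sequence first, then mix once.
per_group = log_mix(lam, sum(ll_m1), sum(ll_m2))

print(per_obs, per_group)  # the two totals differ in general
```

The two quantities coincide only in degenerate cases (e.g. a sequence of length one), which is why the mixture level needs to be an explicit modeling choice.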

For implementing this in brms, we discussed adding a new "mix" argument to the aterms of a brmsformula, which specifies over which grouping variable a mixture should be computed. For example, to specify a mixture over participants ("ID"), one could write: y | mix(gr = "ID") ~ ...

For translating this into Stan code, one could generate temporary variables that first accumulate the group-level likelihood components, which are then passed to the log_mix function. For a two-component mixture over a grouping variable, this could look something like the example code below, where L_1 and L_2 are the two likelihood components, N_1 is the number of grouping levels, J_mix is the grouping indicator per observation (similar to how it is done for random effects), N is the total number of observations, and lambda is the mixing proportion:

model {
  vector[N_1] L_1 = rep_vector(0, N_1);  // likelihood component for model 1
  vector[N_1] L_2 = rep_vector(0, N_1);  // likelihood component for model 2

  for (i in 1:N) {
    L_1[J_mix[i]] += ...;  // log-likelihood of observation i under model 1
    L_2[J_mix[i]] += ...;  // log-likelihood of observation i under model 2
  }

  for (j in 1:N_1) {
    target += log_mix(lambda, L_1[j], L_2[j]);
  }
}

If the mixture is instead specified over single observations rather than groups of observations, this simplifies to:

model {
  vector[N] L_1 = rep_vector(0, N);  // likelihood component for model 1
  vector[N] L_2 = rep_vector(0, N);  // likelihood component for model 2

  for (i in 1:N) {
    L_1[i] += ...;  // log-likelihood of observation i under model 1
    L_2[i] += ...;  // log-likelihood of observation i under model 2
  }

  for (j in 1:N) {
    target += log_mix(lambda, L_1[j], L_2[j]);
  }
}
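The group-level accumulation can also be sketched outside of Stan. The following Python/NumPy snippet mirrors the two loops of the grouped Stan code above (all numbers and the group assignment are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical setup mirroring the Stan sketch: N observations grouped into
# N_1 participants via J_mix; ll1/ll2 are per-observation log-likelihoods
# under the two mixture components (made-up values).
N, N_1 = 6, 2
J_mix = np.array([0, 0, 0, 1, 1, 1])  # 0-based group index per observation
ll1 = np.array([-1.0, -1.2, -0.8, -0.9, -1.1, -1.3])
ll2 = np.array([-2.0, -0.5, -1.5, -0.7, -1.0, -2.1])
lam = 0.6

# First loop: accumulate each component's log-likelihood within each group.
L1 = np.zeros(N_1)
L2 = np.zeros(N_1)
np.add.at(L1, J_mix, ll1)  # unbuffered in-place scatter-add per group
np.add.at(L2, J_mix, ll2)

# Second loop: log_mix per group, i.e. log(lam*exp(L1) + (1-lam)*exp(L2)),
# computed with the log-sum-exp trick, then summed into the target.
a = np.log(lam) + L1
b = np.log(1 - lam) + L2
m = np.maximum(a, b)
target = np.sum(m + np.log(np.exp(a - m) + np.exp(b - m)))

print(target)  # total log-likelihood of the group-level mixture
```

Accumulating into L1/L2 first is the same trick as the Stan code: the mixture is applied once per group, not once per observation.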

Thanks for considering this feature, and I hope this summary of what we discussed helps with the implementation!

Best,
Philipp

Note 21. May 2024.pdf
