Hi everyone,
I have some questions regarding the right way of accessing values from the posterior, depending on what I want to compare.
Based on PyMC-Marketing I implemented an MMM (business requirements and legacy-code constraints made a custom implementation necessary) with the following structure:
Part of the control variables are trend, holiday, and seasonality components created with Prophet (so I did not use the Fourier contributions provided by PyMC-Marketing).
For the control coefficients I used a truncated normal distribution, allowing negative values for all of them except the ones related to the Prophet components.
To make this more structured I will number my questions:
1. To get control and channel contributions, I currently take the median (I also tried the mean) of all parameters, then apply the transformations and multiply by the coefficients, the same way it is done e.g. in the budget-allocation code. When I instead access the posterior median of the channel or control contributions directly (as is also done in some places in the PyMC-Marketing code), I get different values. What could cause this, and what is the right way to access these values?
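Regarding the first question: one likely cause is that summary statistics do not commute with the model's operations. The median of a sum of posterior draws is generally not the sum of the per-component medians (and similarly, nonlinear transforms of summarised parameters differ from summaries of transformed draws). A minimal NumPy sketch with made-up toy draws (not actual model output):

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws = 4000

# Toy posterior draws of two channel contributions and the intercept
# for a single time period (right-skewed, as contributions from
# positivity-constrained coefficients with saturation typically are).
contrib_ch1 = rng.lognormal(mean=0.0, sigma=0.5, size=n_draws)
contrib_ch2 = rng.lognormal(mean=0.3, sigma=0.7, size=n_draws)
intercept = rng.normal(loc=1.0, scale=0.2, size=n_draws)

# Consistent order: build the total per posterior draw, then summarise.
median_of_sum = np.median(contrib_ch1 + contrib_ch2 + intercept)

# Tempting but different: summarise each component, then combine.
sum_of_medians = (
    np.median(contrib_ch1) + np.median(contrib_ch2) + np.median(intercept)
)

print(median_of_sum, sum_of_medians)  # the two values disagree
```

Working per draw and summarising only at the very end keeps all quantities on the same footing; summarising parameters first and then transforming is a different (and generally inconsistent) estimator.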
2. What is the difference between extracting the posterior median of mu and the posterior median of y? What confuses me is this: when interpreting the influence of variables, we basically only look at their influence on mu and not on the likelihood y, right? What implications does this have, for example when sampling from the posterior predictive distribution?
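On mu versus y: mu is the model's expected value for each observation (intercept plus contributions), while posterior predictive draws of y additionally include the likelihood's observation noise. With a symmetric likelihood the central summaries are close, but the predictive intervals for y are wider. A toy sketch with made-up posterior draws:

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 4000

# Toy posterior draws for a single observation: mu is the model's
# expected value (intercept + contributions), sigma the likelihood's
# observation-noise scale.
mu = rng.normal(loc=10.0, scale=0.5, size=n_draws)
sigma = np.abs(rng.normal(loc=1.0, scale=0.1, size=n_draws))

# Posterior predictive draws of y put likelihood noise on top of mu.
y_pred = rng.normal(loc=mu, scale=sigma)

# Central summaries are close here (symmetric noise) ...
print(np.median(mu), np.median(y_pred))

# ... but the predictive interval for y is wider than the one for mu,
# because y carries observation noise in addition to parameter
# uncertainty.
mu_lo, mu_hi = np.percentile(mu, [2.5, 97.5])
y_lo, y_hi = np.percentile(y_pred, [2.5, 97.5])
print(mu_hi - mu_lo, y_hi - y_lo)
```

This is why variable contributions are interpreted through mu, while posterior predictive checks compare observed data against y.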
3. This question may or may not be connected to the previous ones. Below you see:
a) the posterior predictive plot (which looks quite good)
b) a residual dependence plot (comparing the observed values with the sum of channel contributions, control contributions, and intercept); the dashed line is at 0
c) a goodness of fit plot (comparing the observed values (red) with the sum of median channel contributions, control contributions and intercept (blue))
Unfortunately I had to remove the axis labels.
What is directly noticeable, however, is a kind of constant bias, especially visible in the residual dependence plot. It almost looks like the model is predicting my observed values plus a constant. In the posterior predictive plot this is not as apparent.
Do you have any idea what could lead to this bias? I have tried different priors, which reduced the bias to the level you see in the plots but did not remove it completely. Am I accessing the wrong quantity by using the sum of the median channel contributions, median control contributions, and median intercept? If so, why would this be (vastly) different from the median of y and lead to this constant bias?
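For what it's worth, a reconstruction built from per-component medians can produce exactly this symptom: when the component posteriors are skewed (as positivity-constrained channel contributions typically are), the sum of the per-component medians differs from the median of the per-draw totals by a roughly constant amount across time points, which then shows up as a constant offset in the residuals. A toy sketch with made-up, hypothetical posterior shapes:

```python
import numpy as np

rng = np.random.default_rng(7)
n_draws, n_obs = 2000, 50

# Toy per-draw component contributions for each time point
# (made-up, right-skewed shapes standing in for MMM posteriors).
channel = rng.lognormal(mean=0.0, sigma=0.6, size=(n_draws, n_obs))
control = rng.lognormal(mean=-0.5, sigma=0.4, size=(n_draws, n_obs))
intercept = rng.normal(loc=2.0, scale=0.1, size=(n_draws, 1))

total = channel + control + intercept  # per-draw fitted values

# Pretend the observed series sits at the median of the per-draw
# totals, plus a little noise.
observed = np.median(total, axis=0) + rng.normal(0, 0.05, size=n_obs)

# Residuals against the median of the per-draw totals: centred at zero.
resid_ok = observed - np.median(total, axis=0)

# Residuals against the sum of per-component medians: a roughly
# constant offset appears across all time points.
reconstruction = (
    np.median(channel, axis=0)
    + np.median(control, axis=0)
    + np.median(intercept)
)
resid_biased = observed - reconstruction

print(resid_ok.mean(), resid_biased.mean())
```

If this is what is happening, forming the total per posterior draw and summarising last should make the offset shrink or disappear.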
I hope I managed to convey what my problem is. If you need more information to answer any of these questions, please let me know!
I would highly appreciate an answer to any of these questions!