Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

negative yerr not handled by cabinetry.visualize.data_mc #388

Closed
rmnmllr opened this issue Jan 25, 2023 · 9 comments
Closed

negative yerr not handled by cabinetry.visualize.data_mc #388

rmnmllr opened this issue Jan 25, 2023 · 9 comments

Comments

@rmnmllr
Copy link
Contributor

rmnmllr commented Jan 25, 2023

Probably a minor issue and more of a error handling request, but while trying to visualize the prefit plots I receive an error from matplotlib that yerr can't contain negative values:

Traceback (most recent call last):
  File "/terabig/muellerr/Leptoquarks/fitting/run_cabinetry.py", line 333, in <module>
    run_cabinetry(args["inputs"], args["variables"], args["couplings"], gen_input, plot, limit, ranking, args["signi"])
  File "/terabig/muellerr/Leptoquarks/fitting/run_cabinetry.py", line 178, in run_cabinetry
    cabinetry.visualize.data_mc(prediction_prefit, data, config=config, figure_folder=figurespath, log_scale=True, close_figure=True)
  File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/cabinetry/visualize/__init__.py", line 236, in data_mc
    fig = plot_model.data_mc(
  File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/cabinetry/visualize/plot_model.py", line 174, in data_mc
    ax2.errorbar(
  File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/matplotlib/__init__.py", line 1423, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/terabig/muellerr/.conda/envs/fit/lib/python3.9/site-packages/matplotlib/axes/_axes.py", line 3587, in errorbar
    raise ValueError(
ValueError: 'yerr' must not contain negative values

This is probably related to few statistics in the selected region, but can this be logged and or be catched somewhere?

@alexander-held
Copy link
Member

This looks to be happening in the ratio panel, where the yerr kwarg is filled with the Poisson uncertainty on data divided by the total model prediction. The former should be strictly positive, so the pre-fit model prediction for a given bin must be negative here. That's unphysical so we should probably catch this even earlier, for example in model_utils.prediction. Do you expect that the model prediction may be negative? Perhaps this is a consequence of initial parameter values not having suitable values? (There may also be bugs with pre-fit model predictions in combinations with custom modifiers #382, that has not been tested much.)

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 26, 2023

Thank you @alexander-held!

Do you expect that the model prediction may be negative? Perhaps this is a consequence of initial parameter values not having suitable values?

No the model prediction is not expected to be negative, but I see in the prefit yield table that for one specific region the total is indeed negative :/ We have to go back to the inputs then..

(There may also be bugs with pre-fit model predictions in combinations with custom modifiers #382, that has not been tested much.)

For this error above I used the latest stable version of cabinetry, custom modifiers are not included.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 26, 2023

I made it work, it was indeed a consequence of initial paramater values not having suitable values. For my signal I changed the Nominal in NormFactors from:

NormFactors:
  - Name: "Signal_norm"
    Samples: "LQ"
    Nominal: 1
    Bounds: [0, 30]

to

NormFactors:
  - Name: "Signal_norm"
    Samples: "LQ"
    Nominal: 0
    Bounds: [0, 30]

This then also sets my signal LQ to 0.00 in the yields resulting the total to be positive.

@alexander-held
Copy link
Member

Is the signal sample predicting a negative number of events? I do not see how otherwise the nominal value of 1 could be causing issues.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Jan 26, 2023

Yes they are negative interference terms. The yield table for the mention region for Nominal: 0 looks like:

sample        rec_smim_vs_Nbjet_HighMassOSs2thh_1bjets  
----------    ------------------------------------------
LQ            0.00 ± 0.00                               
Wjets         0.15 ± 0.02                               
Zll           0.00 ± 0.00                               
Ztautau       0.66 ± 0.07                               
multiboson    0.05 ± 0.00                               
singletop     0.44 ± 0.04                               
ttbar         2.31 ± 0.92                               
total         3.61 ± 0.92                               
data          3.61                                        

while for Nominal: 1:

sample        rec_smim_vs_Nbjet_HighMassOSs2thh_1bjets  
----------    ------------------------------------------
LQ            -7.97 ± 0.16                              
Wjets         0.15 ± 0.02                               
Zll           0.00 ± 0.00                               
Ztautau       0.66 ± 0.07                               
multiboson    0.05 ± 0.00                               
singletop     0.44 ± 0.04                               
ttbar         2.31 ± 0.92                               
total         -4.36 ± 0.93                              
data          3.61                                      

It's the only region where the total drops below 0, for other regions there is enough background to stay in the positive. For my understanding, is it wrong to use Nominal: 0 or do I kill the signal with it?

@alexander-held
Copy link
Member

Hi @rmnmllr, sorry I forgot to reply here earlier: Nominal: 0 only sets the initial value for the parameter, the best-fit value will then still be determined in the fit so this is fine to use.

@rmnmllr
Copy link
Contributor Author

rmnmllr commented Mar 22, 2023

hey @alexander-held, no worries -- thanks for taking the time!

After a while we have a new set of inputs and now again I stumble upon the ValueError: 'yerr' must not contain negative values. This time though the ValueError happens at the plotting of the syst. templates cabinetry.visualize.templates. The error message including some additional logger info is:

INFO - cabinetry.visualize.plot_model - label = region: HighMassOSs2thh_1bjets
sample: Zll
systematic: FT_EFF_Eigen_B_0
INFO - cabinetry.visualize.plot_model - nominal_histo['yields']=[9.999999717180685e-10, 9.999999717180685e-10, 0.030537374317646027, 0.04678068682551384, -0.09997741878032684, 0.03829693794250488, 0.004840612877160311, 9.999999717180685e-10, 9.999999717180685e-10]
INFO - cabinetry.visualize.plot_model - template['yields'] =    [9.999999717180685e-10, 9.999999717180685e-10, 0.030371561646461487, 0.04616935923695564, -0.09442667663097382, 0.03830915689468384, 0.004814603365957737, 9.999999717180685e-10, 9.999999717180685e-10]
INFO - cabinetry.visualize.plot_model - template_ratio_plot =   [1.0, 1.0, 0.9945701726199582, 0.986932051877768, 0.944480041422661, 1.0003190582024417, 0.9946268144421762, 1.0, 1.0]
INFO - cabinetry.visualize.plot_model - template['stdev'] =     [0.0, 0.0, 0.026351320503366565, 0.03463904655528507, 0.11993040470068965, 0.038309157288696605, 0.0036683593526272563, 0.0, 0.0]
INFO - cabinetry.visualize.plot_model - yerr =                  [0.0, 0.0, 0.8629203096920959, 0.7404561349106399, -1.1995749256560029, 1.0003190684908039, 0.7578295240125165, 0.0, 0.0]
Traceback (most recent call last):
  File "/terabig/muellerr/Leptoquarks/fitting/run_cabinetry.py", line 356, in <module>
    run_cabinetry(args["inputs"], args["variables"], args["couplings"], gen_input, plot, limit, ranking, args["signi"], args["mlfit"], args["prune_MC"], args["custom_function"])
  File "/terabig/muellerr/Leptoquarks/fitting/run_cabinetry.py", line 97, in run_cabinetry
    cabinetry.visualize.templates(config, figure_folder=figurespath, close_figure=True)
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/visualize/__init__.py", line 362, in templates
    fig = plot_model.templates(
  File "/terabig/muellerr/.conda/envs/fit-dev/lib/python3.9/site-packages/cabinetry/visualize/plot_model.py", line 365, in templates
    ax2.errorbar(
  File "/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/__init__.py", line 1442, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/axes/_axes.py", line 3642, in errorbar
    raise ValueError(
ValueError: 'yerr' must not contain negative values

Now for fixing this, I took the absolute value of nominal_histo["yields"] on line 357 e.g.:

yerr=template["stdev"] / abs(nominal_histo["yields"]),

which makes it work.

Do you see any dealbreaker by doing this? At least now I can get some plots, cabinetry later fails at cabinetry.model_utils.prediction with a RuntimeError: [True, False, False, False, False, False, False, False, False] is neither all-True nor all-False, so not compressible but I need to investigate more there before I come (again) back to you 😅

@alexander-held
Copy link
Member

RuntimeError: [True, False, False, False, False, False, False, False, False] is neither all-True nor all-False, so not compressible

This probably comes from #390. I have an idea of how to fix that, so hopefully I can get that done soon and release a new version.

Fixing the negative error issue by taking the absolute value at that point seems reasonable to me. Please feel free to submit a PR with this change (perhaps with np.abs instead to make it more obvious that this is an array)!

@alexander-held
Copy link
Member

With #390 addressed via #396 and the changes introduced in #394, I believe everything here is addressed in the latest release v0.5.2. Please feel free to open this again (or a new issue) if you run into additional issues!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants