Bounds estimation for produced photons / electrons #66
Specifically, the problem is estimating the number of produced [photons/electrons]. There is a stackexchange post on this here. The accepted answer suggests a procedure involving the negative binomial. Unless I made a mistake while implementing it, it has an unacceptably high false exclusion rate (low coverage) for low N and p. This is for 99% intervals:

You could also imagine an oversimplified solution like this:
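Returning to the negative-binomial procedure and the coverage check mentioned above, here is a minimal sketch of how one could test such an interval. It assumes k detected quanta follow Binomial(N, p) and uses the flat-prior posterior N - k ~ NegBin(k + 1, p); this is one reading of the negative-binomial approach and may not match the linked answer or my original implementation exactly. All function names are made up for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(k detected | n produced), with k ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def nb_posterior_pmf(n, k, p):
    """P(N = n | k) for a flat prior on N, i.e. N - k ~ NegBin(k + 1, p).

    Assumption: this is one plausible form of the negative-binomial procedure.
    """
    if n < k:
        return 0.0
    return comb(n, k) * p ** (k + 1) * (1 - p) ** (n - k)


def nb_interval(k, p, cl=0.99):
    """Central interval (lo, hi) on N with at most (1 - cl) / 2 mass below lo
    and at most (1 - cl) / 2 mass above hi."""
    tail = (1 - cl) / 2
    cdf, n, lo = 0.0, k, None
    while True:
        cdf += nb_posterior_pmf(n, k, p)
        if lo is None and cdf > tail:
            lo = n  # first N at which the lower tail would exceed its budget
        if cdf >= 1 - tail:
            return lo, n  # first N at which the upper tail fits its budget
        n += 1


def coverage(n_true, p, cl=0.99):
    """Probability that the interval built from the observed k contains n_true."""
    covered = 0.0
    for k in range(n_true + 1):
        lo, hi = nb_interval(k, p, cl)
        if lo <= n_true <= hi:
            covered += binom_pmf(k, n_true, p)
    return covered
```

Scanning `coverage(n_true, p)` over a grid of small `n_true` and `p` and comparing against `cl` is the kind of check that produces a false-exclusion plot; `1 - coverage(...)` is the false exclusion rate.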
Currently, flamedisx uses a mysterious equation here, leading to an interval of

However, you can see the asymptote is no longer correct. For higher values of p the mystery interval is much too conservative: the false exclusion rate is in the 10^-4 range around N=4, then drops even more. This makes flamedisx slower, but at least its values are not wrong.

For the moment, we could keep the mystery equation, but it is a bit embarrassing and makes things unnecessarily slow. Perhaps we could instead use one of the other motivated solutions, with an 'empirical fix' for the low p, low N regime?

Finally, an option is to compute and cache exact confidence intervals using the Neyman construction in the low (N, p) regime. We'd have to do it for several p's and interpolate somehow. Fortunately, this is all in the numpy part of flamedisx, specifically the annotation stage, so we can be creative.
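To make the Neyman-construction option concrete, here is a rough sketch for a single fixed p. It assumes k ~ Binomial(N, p), uses central acceptance regions, and takes the hull of all accepted N as the interval (slightly conservative if the accepted set has gaps). The names and the `n_max` cutoff are hypothetical, nothing here exists in flamedisx, and interpolation between different p's is not covered.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(k detected | n produced), with k ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def acceptance_region(n, p, cl=0.99):
    """Central acceptance region (k_lo, k_hi) for k ~ Binomial(n, p).

    Each excluded tail carries at most (1 - cl) / 2 probability.
    """
    tail = (1 - cl) / 2
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    k_lo, acc = 0, 0.0
    while acc + pmf[k_lo] <= tail:  # trim the lower tail
        acc += pmf[k_lo]
        k_lo += 1
    k_hi, acc = n, 0.0
    while acc + pmf[k_hi] <= tail:  # trim the upper tail
        acc += pmf[k_hi]
        k_hi -= 1
    return k_lo, k_hi


def neyman_intervals(p, n_max, cl=0.99):
    """Cacheable table {k: (lo, hi)} of intervals on N, scanning N = 0..n_max.

    An interval whose hi equals n_max is cut off by the scan range and
    should not be trusted.
    """
    intervals = {}
    for n in range(n_max + 1):
        k_lo, k_hi = acceptance_region(n, p, cl)
        for k in range(k_lo, k_hi + 1):
            lo, hi = intervals.get(k, (n, n))
            intervals[k] = (min(lo, n), max(hi, n))
    return intervals
```

Because every N up to `n_max` is accepted for at least `cl` of its own k distribution, the coverage of these intervals is at least `cl` by construction for any true N within the scan range. The table only depends on (p, cl), so it could be computed once per p and stored.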
Believe it or not, but exactly a year after I wrote the long post above, the derivation of the mystery equation has washed back up from the seas of time... Let me try to translate, but remember it's a hand-wavy argument, though apparently somewhat successful:
This is indeed the 1 sigma bound used in the mystery equation, assuming I made a typo in coding up q=1/p instead of q=1-p. Why q = (1-p)/p = 1/p-1 worked better for @pelssers I have no idea; maybe q=1-p is even better ;-) I also didn't consider that the bounds might be asymmetric. Repeating the exercise with mean -> mean - σ gives
Hi @JelleAalbers!
If you set max_sigma high enough, flamedisx will give accurate results regardless of how bad the bounds estimation is. Empirically we found that max_sigma 5 is reasonable for our current bound estimation.
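As a toy illustration of why a large enough max_sigma compensates for rough bounds: the sketch below builds a deliberately crude window on N and measures how much of the total weight sum over N of Binomial(k | N, p) falls inside it. The window formula is invented for this example and is not the flamedisx equation; the flat weighting in N is also an assumption, and the function names are hypothetical.

```python
from math import ceil, comb, floor, sqrt


def binom_pmf(k, n, p):
    """P(k detected | n produced), with k ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def crude_bounds(k, p, max_sigma):
    """Hypothetical rough window on N: k / p +- max_sigma * sigma.

    NOT the flamedisx formula; sigma is an ad hoc guess with a floor so
    that k = 0 still gets a nonzero width.
    """
    mean = k / p
    sigma = sqrt(max(k, 1) * (1 - p)) / p
    lo = max(k, floor(mean - max_sigma * sigma))  # N can never be below k
    hi = ceil(mean + max_sigma * sigma)
    return lo, hi


def captured_fraction(k, p, max_sigma):
    """Fraction of sum_N Binomial(k | N, p) that lies inside the window.

    The full sum over N >= k equals 1 / p, hence the factor p.
    """
    lo, hi = crude_bounds(k, p, max_sigma)
    return p * sum(binom_pmf(k, n, p) for n in range(lo, hi + 1))
```

Calling `captured_fraction(k, p, s)` for increasing `s` shows the fraction rising toward 1 even though the window is centered and scaled crudely; the price is a wider range of N to sum over, which is the speed / accuracy trade-off.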
It would be good to redo the derivations of the bound estimation formulas, and see if we can improve them. This would improve our speed / accuracy profile.