How to properly do uniqueness tracking #111
Open · s6pepaul opened this issue May 3, 2019 · 12 comments

@s6pepaul commented May 3, 2019

Hello,

While I am getting towards the final stages of parts of my analysis, I am wondering how to properly take care of multiple combinations in one event. By that I do not mean multiple combos from random tagger hits; that was well explained by Richard in a tutorial given at a Physics WG meeting. I mean the additional combos arising from allowing additional tracks or showers in the event. I have brought this topic up a couple of times with various people, and so far the consensus has always been: this is not an easy topic and we should discuss it more.

I wrote a short document stating what the current recommendations are for uniqueness tracking and where I see problems with that. You find it here: https://halldweb.jlab.org/doc-private/DocDB/ShowDocument?docid=3993

Short summary of the problems (not complete, but meant as a starting point for thinking about uniqueness tracking):

  • histogram the phi angle of the proton to measure the (e.g. pi0) beam asymmetry:
    what happens if two possible protons survive all cuts? Which one should we take? Going by the current recommendations we should take both, but this inflates the signal (and messes up the errors)
  • histogram variables from kinematic-fit four-vectors:
    if there are additional tracks/showers, all particles will have slightly different values depending on which subset (i.e. combo) of tracks/showers was used for the fit
  • histogram inv. mass in bins of momentum transfer (measure dsigma/dt):
    stay with gp -> pi0 p plus an additional track that could be the proton as the example. The inv. mass only depends on the photons from the pi0 decay, but the momentum transfer is calculated from the proton. So depending on which track is used as the proton, the event could even end up in different bins.

A possible solution (not tested, just an idea):
Don't do uniqueness tracking anymore. Instead, always take all combinations. If, after applying all cuts (i.e. using all the discriminatory information you have on a given event), n combos survive in an event that do not merely originate from different beam photons, then weight each combo with 1/n.
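To make the 1/n idea concrete, a minimal sketch of the bookkeeping it implies (untested; the vector, histogram, and variable names are made up for illustration):

    // Sketch only (hypothetical names): the n combos surviving all cuts in one
    // event are each filled with weight 1/n, so the event contributes exactly 1
    // to the histogram integral.
    vector<UInt_t> locSurvivingCombos; // filled in the combo loop, after all cuts
    double locComboWeight = locSurvivingCombos.empty() ? 0.0 : 1.0/locSurvivingCombos.size();
    for(UInt_t loc_i : locSurvivingCombos)
    {
        dComboWrapper->Set_ComboIndex(loc_i);
        dHist_Example->Fill(locSomeVariable, locComboWeight); // integral per event = 1
    }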

I admit this might not be a huge effect for some analyses, but given that we really push the statistical uncertainties to less than 1% in some channels, it might be something that biases the results and/or their errors. In any case it is surely something we want to get right.

I hope this triggers a discussion that yields some sort of definite prescription for how to properly account for multiple combinations.

@zihlmann commented May 7, 2019 via email

@s6pepaul commented May 7, 2019

> The track ID is unique to the measured track. But the combo ID is different for the two different combos even if the proton track input is the same.

Ok, that means that if you use a set to keep track of the used unique track IDs when histogramming the kin. fitted values, you would only get the 'first' combo, whichever that may be. So this is not what you want to do to plot kin. fitted values.
Do you then use both? Or just one, based on a figure of merit (missing mass sq., e.g.)?
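For reference, the pattern under discussion is the usual DSelector uniqueness-tracking idiom; a minimal sketch (the histogram and variable names here are illustrative):

    // Usual uniqueness-tracking idiom (illustrative names): fill once per unique
    // set of measured IDs. If two surviving combos share the same proton track,
    // only the first combo encountered fills the histogram -- even though its
    // kinematically fitted four-vectors differ from the second combo's.
    set<map<Particle_t, set<Int_t> > > locUsedSoFar_ProtonPhi;
    map<Particle_t, set<Int_t> > locUsedThisCombo;
    locUsedThisCombo[Proton].insert(locProtonTrackID);
    if(locUsedSoFar_ProtonPhi.find(locUsedThisCombo) == locUsedSoFar_ProtonPhi.end())
    {
        dHist_ProtonPhi->Fill(locProtonPhi);
        locUsedSoFar_ProtonPhi.insert(locUsedThisCombo);
    }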

@zihlmann commented May 7, 2019 via email

@gvasil commented May 17, 2019

Hello Peter,

I don't know exactly how you would implement this procedure in your analysis code, but I gave it a try following your suggestion and Beni's comments. I am attaching the results from the eta pi+ pi- and the eta pi0 pi0 analyses. The blue histograms were filled using Uniqueness Tracking; the red ones were filled without Uniqueness Tracking, assigning weights to each combo instead.

[attached plots: MEtaPiPlusPiMinusWithAndWithoutUniqueness, MEtaPi0Pi0WithAndWithoutUniqueness]

@zihlmann commented May 17, 2019

Can you put your DSelector code somewhere where one could look at it? To me it is not clear what you mean by "using Uniqueness tracking" and what you mean by "using Weights". Not only do you lose events in the red histogram, but the loss is not the same everywhere, changing the shapes of some of the peaks!

Even more to the point: the eta pi0 pi0 final state, I assume, is constructed from 6 gammas in the final state. Taking only the first combination and filling it with weight 1 is, I think, wrong. All surviving combinations should be used and filled with weight W = 1/N, where N is the number of surviving candidates. The combinations with out-of-time photons that survive in such an event will carry a weight of W = 1/N * 1/N_beam_bunches.
This is of course only a proposal; others may disagree. But it will make sure that the integral over one trigger event gives you only one good event.
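In code, this proposed weight could look like the following sketch (untested; the variable names are placeholders, and the minus sign on the out-of-time weight is the usual accidental-subtraction convention):

    // Sketch of the proposed combined weight (placeholder names).
    // locN            = number of combinations surviving all cuts in this event
    // locNBeamBunches = number of out-of-time beam bunches used for the subtraction
    double locComboWeight  = 1.0 / locN;
    double locTimingWeight = locIsPromptPeak ? 1.0 : -1.0 / locNBeamBunches;
    double locWeight       = locComboWeight * locTimingWeight; // event integral -> 1 good event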

@s6pepaul commented May 17, 2019

Hi George,
very interesting to see. Thank you for trying it! I plan to do a similar test on MC in the next week. There it is also possible to compare to the actual number of events. I have a few questions regarding your test:

  1. What exactly did you include in your uniqueness tracking when you made those plots?
  2. Are those measured or kin. fitted quantities?
  3. How exactly did you implement the weights?

I was thinking about doing something along the following lines in the combo loop:

vector<Int_t> locCombosPassedCuts;
set<map<Particle_t, set<Int_t> > > AdditionalCombos; // similar to uniqueness tracking; makes sure an additional combo doesn't just come from a different beam photon -- we want to keep those and subtract them with the side peaks
Int_t MultComboWeight = 0; // start at 0 because we will add at least 1 to it -> weight 1
//Loop over combos
for(UInt_t loc_i = 0; loc_i < Get_NumCombos(); ++loc_i)
{
    if(locComboFailsCuts){ // placeholder: true if the combo fails any of your cuts
        dComboWrapper->Set_IsComboCut(true);
    }
    else{
        locCombosPassedCuts.push_back(loc_i);

        map<Particle_t, set<Int_t> > AdditionalComboMap; // only for non-beam particles
        AdditionalComboMap[KPlus].insert(locKPlusTrackID);
        AdditionalComboMap[KMinus].insert(locKMinusTrackID);
        AdditionalComboMap[Proton].insert(locProtonTrackID);
        if(AdditionalCombos.find(AdditionalComboMap) == AdditionalCombos.end()){
            MultComboWeight++; // one more distinct final-state permutation
            AdditionalCombos.insert(AdditionalComboMap);
        }
    }
}

This gives you a vector with the combo IDs that passed all your cuts. Then, in a second loop, go over all the elements in the vector and assign the weight 1/MultComboWeight * timingWeight, where timingWeight is 1 for the main peak and -1/N_sidepeaks otherwise. This should be the same as Beni's suggestion, I think. I haven't tried this implementation yet; I will do it next week.
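For completeness, a sketch of that second loop (untested; the histogram and timing-weight names are placeholders):

    // Second pass (sketch): fill only the combos that passed all cuts, weighted by
    // 1/MultComboWeight times the timing weight from the accidental subtraction.
    for(Int_t loc_i : locCombosPassedCuts)
    {
        dComboWrapper->Set_ComboIndex(loc_i);
        double timingWeight = locIsMainRFPeak ? 1.0 : -1.0/locNumSidePeaks; // placeholders
        dHist_InvMass->Fill(locInvMassKinFit, timingWeight/MultComboWeight);
    }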

@gvasil commented May 20, 2019

Hi,

Below you can find the part of the eta pi+ pi- code that I used to get these histograms, omitting every line that is irrelevant to this study. The code is similar for the eta pi0 pi0 final state.

As for your questions, Peter:

  1. In the uniqueness tracking I included all the final-state particles as well as the beam photon and the proton.

  2. The quantities that I plotted are the kin. fitted ones.

  3. As for the implementation of the weights for each combo, I followed the suggestions in this discussion. You can find this implementation in the last 15 lines of my code below.

Kind regards,
George

    // the number of combos that survive after applying the cuts
    UInt_t locNumCombosAfterAllTheCuts = 0;

    //Loop over combos
    for(UInt_t loc_i = 0; loc_i < Get_NumCombos(); ++loc_i)
    {
            //Set branch array indices for combo and all combo particles
            dComboWrapper->Set_ComboIndex(loc_i);

            // store the information about the combos that were cut
            if(locAllCutsExcludingOneCut[8]) // apply all the cuts but exclude the RF beam bunch cut
            {
                    dComboWrapper->Set_IsComboCut(true);
            }

            // if the current combo survives all the cuts, increment the relevant variable
            if(!locAllCuts)
            {
                    locNumCombosAfterAllTheCuts += 1;
            }

            // Weight for drawing accidental-subtracted histograms
            double locAcciWeight = 0.0;

            if(fabs(locDeltaTBeamMinusRF) < (0.5 * 4.008)) { // prompt signal receives a weight of 1
                    locAcciWeight = 1.0;
            }
            else { // accidentals receive a weight of -1/(# RF bunches included in the TTree; 8 in this case)
                    locAcciWeight = -1.0 / 8.0;
            }

            // Fill the MEtaPiPlusPiMinus mass histogram that uses Uniqueness tracking
            if(!locAllCutsExcludingOneCut[8]) // include the beam photon in the uniqueness tracking
            {
                    map<Particle_t, set<Int_t> > locUsedThisCombo_EtaPiPlusPiMinusAllCuts3;
                    locUsedThisCombo_EtaPiPlusPiMinusAllCuts3[Unknown].insert(locBeamID); //beam
                    locUsedThisCombo_EtaPiPlusPiMinusAllCuts3[PiPlus].insert(locPiPlusTrackID);
                    locUsedThisCombo_EtaPiPlusPiMinusAllCuts3[PiMinus].insert(locPiMinusTrackID);
                    locUsedThisCombo_EtaPiPlusPiMinusAllCuts3[Proton].insert(locProtonTrackID);
                    locUsedThisCombo_EtaPiPlusPiMinusAllCuts3[Gamma].insert(locPhoton1NeutralID);
                    locUsedThisCombo_EtaPiPlusPiMinusAllCuts3[Gamma].insert(locPhoton2NeutralID);

                    if(locUsedSoFar_EtaPiPlusPiMinusAllCuts3.find(locUsedThisCombo_EtaPiPlusPiMinusAllCuts3) == locUsedSoFar_EtaPiPlusPiMinusAllCuts3.end())
                    {
                            dHist_MEtaPiPlusPiMinusPostCutsHighEnergies->Fill(locEtaPiPlusPiMinusMassKinFit, locAcciWeight);
                            locUsedSoFar_EtaPiPlusPiMinusAllCuts3.insert(locUsedThisCombo_EtaPiPlusPiMinusAllCuts3);
                    }
            }
    } // end of 1st combo loop

    // loop again over the combos that survived, in order to plot the weighted histogram
    if(locNumCombosAfterAllTheCuts != 0)
    {
            for(UInt_t loc_i = 0; loc_i < Get_NumCombos(); ++loc_i)
            {
                    //Set branch array indices for combo and all combo particles
                    dComboWrapper->Set_ComboIndex(loc_i);

                    // Is used to indicate when combos have been cut
                    if(dComboWrapper->Get_IsComboCut()) // Is false when tree originally created
                            continue; // Combo has been cut previously

                    // define the weights for the accidentals and for the combos that survived the cuts

                    // Weight for drawing accidental-subtracted histograms
                    double locAcciWeight = 0.0;
                    if(fabs(locDeltaTBeamMinusRF) < (0.5 * 4.008)) { // prompt signal receives a weight of 1
                            locAcciWeight = 1.0;
                    }
                    else { // accidentals receive a weight of -1/(# RF bunches included in the TTree; 8 in this case)
                            locAcciWeight = -1.0 / 8.0;
                    }

                    double locAcciAndComboWeight = 1.0; // weight for accidental-subtracted and combo-weighted histograms
                    if(locNumCombosAfterAllTheCuts != 0)
                    {
                            locAcciAndComboWeight = locAcciWeight / locNumCombosAfterAllTheCuts;
                    }


                    // fill the mass histo using weights for each combo (i.e. without Uniqueness tracking)
                    dHist_MEtaPiPlusPiMinusWithoutUniqueness->Fill(locEtaPiPlusPiMinusMassKinFit, locAcciAndComboWeight);

            } // end of second combo loop
    }

@s6pepaul commented

Hi George,
sorry it took me so long to get back to you. So if I understand correctly, then

        // if the current combo survives all the cuts, increment the relevant variable
        if(!locAllCuts)
        {
                locNumCombosAfterAllTheCuts += 1;
        }

also includes cases where the additional combination arises from having two beam photons that could have triggered the reaction. I don't think you want to include these; they should be taken care of by your locAcciWeight. You really just want to include additional combinations from final-state permutations.

I am currently testing the idea on MC and hope to have something ready to show soon.

@gvasil commented May 28, 2019

Hi Peter,

Apologies for my cryptic cut flow; I didn't include the relevant lines where the cuts are defined. The if-statement:

if(!locAllCuts)

also applies a cut on the RF beam bunches, allowing only the central RF peak. Therefore, I do not include beam photons from the other bunches. If, however, we have events with two photons that belong in the central RF peak for the same reaction, then these have indeed been included in the permutations (even though they shouldn't be).

When I tried running the same code over some MC data for both the eta pi0 pi0 and eta pi+ pi- final states, the mass histograms "using Uniqueness tracking" and "using Weights" were almost identical. I am curious to see what will happen in your case.

@s6pepaul commented

I finally got round to testing my idea on a bit of MC data. First let me explain what the aim was and what I did:
I wanted to test the idea of using weights to handle multiple combinations arising from the final state, NOT from beam photons. Therefore I created a sample of Lambda(1520) decaying into the pKK final state. I used MCwrapper and set "BKG=None" so as not to have any additional beam photons in the sample. The deltaTRF histogram just shows a peak at 0 without any side peaks:
[attached plot: DeltaTRF]
I didn't apply any cuts in addition to what ReactionFilter does, in order to keep a sample with a relatively high combinatoric background.
The inv. mass constructed from the thrown information in the tree looks like this:
[attached plot: pKm_thrown]
Note the integral of 32739.
The inv. mass constructed from the measured information using uniqueness tracking:
[attached plot: pKm_measured]
Note the integral of 35098. One can also clearly see additional entries above 2 GeV from combinatorics.
The measured beam energy using uniqueness tracking:
[attached plot: BeamEnergy]
Note the integral of 32739. Since there was always exactly one beam photon per event, this agrees with the thrown integral.
The inv. mass constructed from the measured information using weights:
[attached plot: pKm_measured_weighted]
Note the integral of 32739, which agrees with the number of thrown events. Of course there are still entries above 2 GeV, but the weights got the overall number of events right again. So assuming one would measure a (very bad) cross section by counting all the events in the pKm mass spectrum, the weights would give the right answer and uniqueness tracking would be about 7% out ((35098 - 32739)/32739 ≈ 7.2%).

I looked at the inv. mass, missing mass squared and beam energy histograms with uniqueness tracking and with the weights method (both kinfit and measured, but they agreed since I didn't place any additional cuts). Here is a summary of the integrals I got in each histogram:
               uniq.    weight   thrown
beam energy    32739    32739    32739
inv. mass      35098    32739    32739
MM squared     35259    32739    32739

In a real analysis one would place cuts to reduce the number of wrong combinations, and the effect in this final state would be much smaller than 7%. I just wanted to keep many wrong combinations in this test to have a visual confirmation (entries > 2 GeV). It is really hard to guess how big the effect is in general; I think some final states will be more sensitive than others. I can imagine that especially final states with many particles of the same type (multiple photons, e.g.) can have multiple combinations surviving all cuts.

I think this test has shown that uniqueness tracking in its current form is not a good way of doing things, and that using the weights might be a much better solution. How big the effect is for any given analysis has to be assessed individually, of course, but in any case the weights should give the "more correct" answer.

Let me know what you think of this.
Cheers,
Peter

PS: The root file containing all the histograms and the DSelector code I use to make them is here. Please have a look.
As a comment: I produced the tree with the B4 option, hence the part in the code for the accidental subtraction, so I can try it on real data as well. In this example it doesn't make a difference since there are no side peaks from accidentals.

PPS: It is a bit harder to repeat this test with background beam photons since the comparison to thrown data isn't as easy anymore. The IsCorrectCombo flag seems to be a bit broken: when I looked at the histograms produced with it (also in the root file), I see many events in the L1520 peak region that are marked as wrong combinations. If someone has an idea of how to do this test, please let me know.

@gvasil commented Jun 26, 2019

Hi Peter,

Sorry for the delayed reply. Your results are very interesting and made me wonder whether this could be the reason for the differences that I see in the cross sections I get for the eta pi+ pi- and the eta pi0 pi0 channels. The eta pi0 pi0 channel has 6 photons in the final state, which gives rise to a huge number of combinations, contrary to the eta pi+ pi- channel.

I have modified my code to match yours as much as possible (i.e. excluding the beam photons when calculating the weight for each combo). Below I attach two histograms with the dsigma/dt results for both channels. The first one is what I get when using Uniqueness Tracking, the second one is what I get when I follow the "weights" procedure:

[attached plots: RatioOfDSigmaDtForThe6To12GeVInterval, RatioOfDSigmaDtForThe6To12GeVIntervalAttempt14]

As you can see, the results are encouraging, given that the agreement between the two different decay modes improves.

The only thing that bothers me is that we are still partially using uniqueness tracking when we create the map that stores the particles of each combo that passed the cuts. I am referring to these lines in your code:

map<Particle_t, set<Int_t> > AdditionalComboMap; // only for non beam particles
AdditionalComboMap[KPlus].insert(locKPlusTrackID);
AdditionalComboMap[KMinus].insert(locKMinusTrackID);
AdditionalComboMap[Proton].insert(locProtonTrackID);
if(AdditionalCombos.find(AdditionalComboMap) == AdditionalCombos.end()){
MultComboWeight++;
AdditionalCombos.insert(AdditionalComboMap);
}

I plan to show these results and continue this discussion tomorrow, in the Cross Section Workshop.

Kind regards,
George

@s6pepaul commented

Thank you for trying it again. Very interesting results, and I am sure this will spark a discussion tomorrow! It is interesting to see that the red points don't just "drop down" onto the black points; both seem to move (e.g. in bin 2).
I slightly disagree that we would still be using uniqueness tracking when building the combo map you referred to. It looks very similar and uses the same idea, but with a very different intention: it is used not to avoid double counting, but to make sure that combos differing only in the beam photon are not counted as extra final-state permutations. So I get your point, but I don't think it is what people have called uniqueness tracking so far.

PS. I updated the note on docdb and included the results from the MC study (https://halldweb.jlab.org/doc-private/DocDB/ShowDocument?docid=3993).
