Add csv file output to PrintDQ (#328)
* add .csv output for storing run metrics
S81D authored Dec 20, 2024
1 parent 0bc220f commit 4a88c36
Showing 4 changed files with 42 additions and 2 deletions.
37 changes: 37 additions & 0 deletions UserTools/PrintDQ/PrintDQ.cpp
@@ -146,6 +146,36 @@ bool PrintDQ::Finalise()
     std::cout << "" << std::endl;
     std::cout << "" << std::endl;

+    // write metrics to .csv file
+    std::ofstream csv_file("R" + std::to_string(fRunNumber) + "_PrintDQ.csv");
+
+    if (!csv_file.is_open()) {
+        Log("PrintDQ Error: Unable to open CSV file for writing", v_error, verbosity);
+        return false;
+    }
+
+    csv_file << "Metric,Counts,Rate (%),Rate Error (+/- %)\n";
+    WritetoCSV(csv_file, "Total Events", totalevents, 0.0, 0.0); // no corresponding rate for total events
+    WritetoCSV(csv_file, "Has LAPPD Data", totalhas_lappd, has_lappd, er_has_lappd);
+    WritetoCSV(csv_file, "Has BRF Fit", totalhas_BRF, has_BRF, er_has_BRF);
+    WritetoCSV(csv_file, "EventTimeTank = 0", totaltimezero, timezero, er_timezero);
+    WritetoCSV(csv_file, "Beam OK", totalokay_beam, okay_beam, er_okay_beam);
+    WritetoCSV(csv_file, "Total Clusters (rate given as clusters / event)", totalclusters, events_per_cluster, er_events_per_cluster);
+    WritetoCSV(csv_file, "Prompt Clusters", totalclusters_in_prompt, clusters_in_prompt, er_clusters_in_prompt);
+    WritetoCSV(csv_file, "Spill Clusters", totalclusters_in_spill, clusters_in_spill, er_clusters_in_spill);
+    WritetoCSV(csv_file, "Ext Clusters", totalclusters_in_ext, clusters_in_ext, er_clusters_in_ext);
+    WritetoCSV(csv_file, "Extended (CC)", totalext_rate_1, ext_rate_1, er_ext_rate_1);
+    WritetoCSV(csv_file, "Extended (NC)", totalext_rate_2, ext_rate_2, er_ext_rate_2);
+    WritetoCSV(csv_file, "Tank+MRD Coinc", totaltmrd_coinc, tmrd_coinc, er_tmrd_coinc);
+    WritetoCSV(csv_file, "1 MRD Track", totalhas_track, has_track, er_has_track);
+    WritetoCSV(csv_file, "Tank+Veto Coinc", totalveto_hit, veto_hit, er_veto_hit);
+    WritetoCSV(csv_file, "Tank+MRD+Veto Coinc", totalveto_tmrd_coinc, veto_tmrd_coinc, er_veto_tmrd_coinc);
+
+    csv_file.close();
+
+    std::cout << "Run metrics written to " << "R" + std::to_string(fRunNumber) + "_PrintDQ.csv" << std::endl;
+    std::cout << "" << std::endl;
+
     return true;
 }
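
The output filename is assembled twice in this hunk, once for the `std::ofstream` and once for the closing log message. A minimal sketch of building it in one place follows. `CsvFileName` is a hypothetical free function that is not part of the commit, and it assumes the run number is an `int` (the type of `fRunNumber` is not shown in this diff).

```cpp
#include <string>

// Hypothetical helper (not in the commit): build the per-run CSV filename
// in one place so the stream and the log message cannot drift apart.
// Assumes the run number is an int; the diff does not show fRunNumber's type.
std::string CsvFileName(int run_number) {
    return "R" + std::to_string(run_number) + "_PrintDQ.csv";
}
```

For run 4314 this yields `R4314_PrintDQ.csv`, which matches the `R<run>_PrintDQ.csv` pattern described in the tool README.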

@@ -441,6 +471,13 @@ bool PrintDQ::GrabVariables() {
 }


+//------------------------------------------------------------------------------
+
+void PrintDQ::WritetoCSV(std::ofstream& file, const std::string& metric, int count, float percentage, float error) {
+    file << metric << "," << count << "," << percentage << "," << error << "\n";
+}
+
+
 // ************************************************************************* //

 // done
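
`WritetoCSV` writes the metric name unquoted, which is fine for the labels used in this commit because none of them contains a comma. The sketch below shows one way the same row format could be produced against a generic `std::ostream`, with quoting added as a precaution. `EscapeCsvField`, `WriteCsvRow` and `FormatCsvRow` are hypothetical names that do not appear in the commit, and the quoting rule (wrap in double quotes, double any embedded quote) is the common CSV convention rather than anything the tool currently does.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not in the commit): quote a field only when it
// contains a comma, a double quote, or a newline; embedded quotes are doubled.
std::string EscapeCsvField(const std::string& field) {
    if (field.find_first_of(",\"\n") == std::string::npos) return field;
    std::string quoted = "\"";
    for (char c : field) {
        if (c == '"') quoted += '"';  // double the embedded quote
        quoted += c;
    }
    quoted += '"';
    return quoted;
}

// Same column order as the commit's WritetoCSV: metric, count, rate, error.
// Taking std::ostream& (an assumption, the commit uses std::ofstream&)
// lets the row be written to a file or to an in-memory stream.
void WriteCsvRow(std::ostream& out, const std::string& metric,
                 int count, float rate, float error) {
    out << EscapeCsvField(metric) << ',' << count << ','
        << rate << ',' << error << '\n';
}

// Convenience wrapper that returns the row as a string.
std::string FormatCsvRow(const std::string& metric,
                         int count, float rate, float error) {
    std::ostringstream ss;
    WriteCsvRow(ss, metric, count, rate, error);
    return ss.str();
}
```

Because `std::ofstream` derives from `std::ostream`, a function with this signature could still be handed the file stream opened in `Finalise()`.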
2 changes: 2 additions & 0 deletions UserTools/PrintDQ/PrintDQ.h
@@ -31,6 +31,8 @@ class PrintDQ: public Tool {
     bool GrabVariables(); ///< Assign values to tricky variables (clusterTime, Grouped Triggers, MRD Tracks)
     void FindCounts(); ///< Loop over extracted event information and count them up
     float CalculateStatError(float numerator, float denominator); ///< Statistical error calculation for the rates
+    void WritetoCSV(std::ofstream& file, const std::string& metric,
+                    int count, float percentage, float error); ///< Write each row of metrics to a .csv file

   private:

2 changes: 1 addition & 1 deletion UserTools/PrintDQ/README.md
@@ -13,7 +13,7 @@

 ## Data

-`PrintDQ` currently just prints out the run statistics and does not add anything to the Store. Here's an example of the print output from the tool:
+`PrintDQ` prints out the run statistics and does not add anything to the Store. It also populates a .csv file with the same statistics (`R<run>_PrintDQ.csv`). Here's an example of the print output from the tool:
 ```
 **************************************
 Run 4314
3 changes: 2 additions & 1 deletion configfiles/PrintDQ/README.md
@@ -12,11 +12,12 @@ The `PrintDQ` toolchain runs the clustering tools (MRD + PMT) over the Processed

 - Populate the `my_inputs.txt` file with all part files from a Processed runs. Running the script `sh create_my_inputs.sh <run_number>` will automatically populate the input file with all Processed Data part files for that run.
 - Run the toolchain via: `./Analyse ./configfiles/PrintDQ/ToolChain`
-- Run statistics will be outputted via `std::out` once the toolchain completes.
+- Run statistics will be outputted via `std::out` once the toolchain completes, in addition to a .csv file containing the metrics.


 ************************
 # Additional information
 ************************

 - The current version of the `PrintDQ` tool is intended to be run over 1 run at a time. As the clustering tools may take some time to compile, the processing time of this toolchain may take several minutes (for a ~100 part file run) to ~1 hour (~thousands of part files) depending on how many part files exist.
+- Find [here](https://github.com/S81D/PrintDQ/tree/main) a set of scripts to run this toolchain on the grid and produce data quality plots from the outputted .csv files.
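
The linked scripts are not shown in this commit, so the following is not taken from them. It is only a sketch of how a downstream consumer might read the four-column file back in, written in C++ to match the tool. `MetricRow`, `ReadMetrics` and `ReadMetricsFromString` are hypothetical names. Each row is split on its last three commas, so a metric label that happens to contain a comma would still parse.

```cpp
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One parsed row of "Metric,Counts,Rate (%),Rate Error (+/- %)".
struct MetricRow {
    std::string metric;
    int count;
    float rate;
    float error;
};

// Hypothetical reader (not in the commit): skips the header line, then
// splits each row from the right so the label keeps any commas it contains.
// Malformed rows are skipped rather than aborting the whole read.
std::vector<MetricRow> ReadMetrics(std::istream& in) {
    std::vector<MetricRow> rows;
    std::string line;
    std::getline(in, line);  // discard the header row
    while (std::getline(in, line)) {
        const std::string::size_type npos = std::string::npos;
        std::string::size_type c3 = line.rfind(',');
        if (c3 == npos || c3 == 0) continue;
        std::string::size_type c2 = line.rfind(',', c3 - 1);
        if (c2 == npos || c2 == 0) continue;
        std::string::size_type c1 = line.rfind(',', c2 - 1);
        if (c1 == npos) continue;
        try {
            MetricRow row;
            row.metric = line.substr(0, c1);
            row.count = std::stoi(line.substr(c1 + 1, c2 - c1 - 1));
            row.rate = std::stof(line.substr(c2 + 1, c3 - c2 - 1));
            row.error = std::stof(line.substr(c3 + 1));
            rows.push_back(row);
        } catch (const std::exception&) {
            continue;  // non-numeric column: skip this row
        }
    }
    return rows;
}

// Convenience wrapper for in-memory text.
std::vector<MetricRow> ReadMetricsFromString(const std::string& text) {
    std::istringstream ss(text);
    return ReadMetrics(ss);
}
```

To read a real output file, an `std::ifstream` opened on the `R<run>_PrintDQ.csv` path can be passed to `ReadMetrics` directly.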
