Differentially Private Inductive Miner (DPIM)

Before executing the DPIM, please make sure the requirements are installed. If not, please install the requirements by running the following command:

python3 -m pip install -r requirements.txt

The DPIM offers two modes of execution:

  1. Differentially private: In this mode, the DPIM is executed with differential privacy. The user can specify the epsilon ($\epsilon$) value as well as the lower and upper bounds, and the DPIM then runs with these values. To avoid errors, the lower bound must be at least the number of unique activities $(\#unique\ activities)$, and the upper bound must be at most $(\#unique\ activities)^2 - 1$.
  2. Non-differentially private: In this mode, the DPIM is executed with $\epsilon \rightarrow \infty$. The lower and upper bounds are not needed, as only those permutations that occur in at least one trace are considered.

Of these two modes, the differentially private mode is the default. To execute the DPIM in non-differentially private mode, specify the --no-dp flag.
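The limits on the bounds described above depend only on the number of unique activities in the log. The following is a minimal sketch, not part of the DPIM code base, that counts the distinct `concept:name` values on the events of an XES log and derives the smallest allowed lower bound and the largest allowed upper bound. The function names `unique_activities` and `bound_limits` and the tiny inline log are hypothetical and serve only as illustration; the sketch assumes that activities are identified by the standard `concept:name` attribute.

```python
import xml.etree.ElementTree as ET

# Tiny hypothetical XES-like log with two traces and two distinct activities.
SAMPLE_LOG = """<log xmlns="http://www.xes-standard.org/">
  <trace>
    <event><string key="concept:name" value="a"/></event>
    <event><string key="concept:name" value="b"/></event>
  </trace>
  <trace>
    <event><string key="concept:name" value="a"/></event>
  </trace>
</log>"""


def local_name(tag):
    """Strip an XML namespace prefix of the form '{namespace}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def unique_activities(xes_text):
    """Return the set of distinct 'concept:name' values found on events."""
    root = ET.fromstring(xes_text)
    activities = set()
    for element in root.iter():
        if local_name(element.tag) != "event":
            continue
        # Event attributes are child elements carrying 'key' and 'value'.
        for attribute in element:
            if attribute.get("key") == "concept:name":
                activities.add(attribute.get("value"))
    return activities


def bound_limits(n_activities):
    """Return (smallest allowed lower bound, largest allowed upper bound)."""
    return n_activities, n_activities ** 2 - 1


activities = unique_activities(SAMPLE_LOG)
print(sorted(activities), bound_limits(len(activities)))
```

For a log with six unique activities, `bound_limits(6)` returns `(6, 35)`.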

To test the DPIM on a specific event log, you can run the following command:

python3 main.py <eventlog> --epsilon <epsilon> --lower <lower_bound> --upper <upper_bound>

where:

  • <eventlog> is the path to the event log. This is a required argument.
  • <epsilon> is the epsilon value.
  • <lower_bound> is the lower bound.
  • <upper_bound> is the upper bound.

Or, for non-differentially private mode, run the following command:

python3 main.py <eventlog> --no-dp

All synthetic event logs and the URLs to the BPI Challenges are in the event_logs directory.

Info: If no flag is given, or a required value is missing, the DPIM asks the user to input the missing values.

Arguments

The following arguments are available for the DPIM:

  • eventlog The path to the event log. This is a required argument.
  • -e, --epsilon The $\epsilon$ value for differential privacy. The default value is 1.0.
  • -l, --lower The lower bound for the number of permutations. The default value is the $\#unique\ activities$ in the event log.
  • -u, --upper The upper bound for the number of permutations. The default value is the $(\#unique\ activities)^2 -1$.
  • -t, --threshold The threshold used by the rejection sampler to accept the generated PST. The default value is 0.95.
  • --no-dp The flag to run the DPIM in non-differentially private mode, $\epsilon \rightarrow \infty$.
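The argument list above can be mirrored with Python's `argparse` module. This is only a sketch of how such an interface could be declared and is not the parser used in `main.py`; in particular, the real tool prompts for missing values, which is not reproduced here. The helper `resolve_bounds` is hypothetical and fills in the documented defaults once the number of unique activities is known.

```python
import argparse


def build_parser():
    """Declare the documented options; a sketch, not the DPIM's own parser."""
    parser = argparse.ArgumentParser(description="Sketch of the DPIM command line")
    parser.add_argument("eventlog", help="path to the event log (required)")
    parser.add_argument("-e", "--epsilon", type=float, default=1.0,
                        help="epsilon value for differential privacy")
    parser.add_argument("-l", "--lower", type=int, default=None,
                        help="lower bound for the number of permutations")
    parser.add_argument("-u", "--upper", type=int, default=None,
                        help="upper bound for the number of permutations")
    parser.add_argument("-t", "--threshold", type=float, default=0.95,
                        help="acceptance threshold of the rejection sampler")
    parser.add_argument("--no-dp", action="store_true",
                        help="run in non-differentially private mode")
    return parser


def resolve_bounds(args, n_activities):
    """Apply the documented defaults when a bound was not given (hypothetical helper)."""
    lower = args.lower if args.lower is not None else n_activities
    upper = args.upper if args.upper is not None else n_activities ** 2 - 1
    return lower, upper


args = build_parser().parse_args(["log.xes", "-e", "0.5", "--no-dp"])
print(args.epsilon, args.threshold, args.no_dp, resolve_bounds(args, 6))
```

The bounds default to `None` in this sketch because their documented defaults depend on the event log and can only be resolved after the log has been read.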

Example

To test the DPIM on the TF_5 event log with $\epsilon = 1.0$, the user can run the following command:

python3 main.py event_logs\synthetic_EventLogs\TF_5.xes -e 1.0 -l 6 -u 35 -t 0.9
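The path in the command above uses backslashes as separators, which works on Windows but not on Linux or macOS. As a hedged sketch, the same invocation can be assembled in Python with `pathlib`, which chooses the separator for the current platform. The helper name `build_dpim_command` is hypothetical, and the sketch assumes it is run from the repository root, where `main.py` is located.

```python
import sys
from pathlib import Path


def build_dpim_command(log_path, epsilon, lower, upper, threshold):
    """Return the argument list for a differentially private DPIM run."""
    return [
        sys.executable, "main.py", str(log_path),
        "-e", str(epsilon),
        "-l", str(lower),
        "-u", str(upper),
        "-t", str(threshold),
    ]


# Path components taken from the example command; joined per platform.
log_path = Path("event_logs") / "synthetic_EventLogs" / "TF_5.xes"
command = build_dpim_command(log_path, 1.0, 6, 35, 0.9)
print(command)

# To actually start the run from the repository root:
# import subprocess
# subprocess.run(command, check=True)
```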

Bounds used

The following tables show the lower and upper bounds used for the BPI Challenge datasets (all links can be found at BPI) and the synthetic logs.

BPI Challenges

| Event Log | Lower Bound | Upper Bound |
| --- | --- | --- |
| BPI_Challenge_2011 | 4280 | 4310 |
| BPI_Challenge_2012 | 120 | 150 |
| BPI_Challenge_2013_closed_problems | 5 | 20 |
| BPI_Challenge_2013_incidents | 5 | 20 |
| BPI_Challenge_2013_open_problems | 5 | 15 |
| BPI_Challenge_2015_1 | 4805 | 4835 |
| BPI_Challenge_2015_2 | 4885 | 4915 |
| BPI_Challenge_2015_3 | 5020 | 5050 |
| BPI_Challenge_2015_4 | 3650 | 3680 |
| BPI_Challenge_2015_5 | 4960 | 4990 |
| BPI_Challenge_2017 | 175 | 205 |
| BPI_Challenge_2018 | 605 | 635 |
| BPI_Challenge_2019 | 525 | 555 |
| DomesticDeclarations_2020 | 30 | 60 |
| InternationalDeclarations_2020 | 195 | 225 |
| PermitLog_2020 | 555 | 585 |
| PrepaidTravelCost_2020 | 160 | 190 |
| RequestForPayment_2020 | 40 | 70 |
| Sepsis Cases-Event Log | 120 | 150 |

Synthetic Logs

| Event Log | Lower Bound | Upper Bound |
| --- | --- | --- |
| TF_04 | 6 | 20 |
| TF_05 | 6 | 35 |
| TF_06 | 6 | 35 |
| TF_07 | 3 | 8 |
| TF_08 | 3 | 8 |
| TF_09 | 6 | 35 |
| TF_10 | 6 | 35 |
| TF_11 | 5 | 24 |
| TF_12 | 5 | 24 |
| TF_13 | 5 | 24 |
| TF_14 | 30 | 60 |
| TF_15 | 3 | 8 |
| TF_16 | 6 | 35 |

Cite

@inproceedings{Schulze_2024,
   author={Schulze, Max and Zisgen, Yorck and Kirschte, Moritz and Mohammadi, Esfandiar and Koschmider, Agnes},
   title={Differentially Private Inductive Miner},
   booktitle={2024 6th International Conference on Process Mining (ICPM)},
   DOI={10.1109/icpm63005.2024.10680684},
   publisher={IEEE},
   year={2024},
   pages={89–96} }