A Python implementation of the "Mondrian Multidimensional K-Anonymity" paper.
It anonymizes tabular data so that every record is indistinguishable from at least k-1 other records on its quasi-identifiers, which protects against table-joining (linkage) attacks while preserving data utility.
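As a rough illustration of the idea, here is a minimal sketch of Mondrian-style partitioning for numeric quasi-identifiers. It is not the code in mondrian.py: the function names, the sample rows, and the use of the raw (non-normalized) value range to pick a split dimension are all assumptions made for this example.

```python
from statistics import median


def mondrian(records, qi_indexes, k):
    """Recursively split records into partitions holding at least k rows each."""

    def spread(i):
        # Raw value range of column i (a simplification; no normalization here).
        values = [r[i] for r in records]
        return max(values) - min(values)

    # Try the quasi-identifier with the widest range first.
    for dim in sorted(qi_indexes, key=spread, reverse=True):
        split = median(r[dim] for r in records)
        left = [r for r in records if r[dim] <= split]
        right = [r for r in records if r[dim] > split]
        # A cut is allowed only if both sides still hold at least k records.
        if len(left) >= k and len(right) >= k:
            return mondrian(left, qi_indexes, k) + mondrian(right, qi_indexes, k)
    return [records]  # no allowable cut: this is a final partition


def generalize(partition, qi_indexes):
    """Replace each quasi-identifier value with the partition's (min, max) range."""
    ranges = {
        i: (min(r[i] for r in partition), max(r[i] for r in partition))
        for i in qi_indexes
    }
    return [tuple(ranges.get(i, v) for i, v in enumerate(r)) for r in partition]


# Hypothetical rows: (age, zip code, sensitive value).
rows = [
    (25, 10001, "flu"), (27, 10002, "cold"), (31, 10010, "flu"),
    (35, 10012, "asthma"), (52, 10050, "cold"), (58, 10055, "flu"),
]
partitions = mondrian(rows, qi_indexes=[0, 1], k=2)
anonymized = [row for p in partitions for row in generalize(p, [0, 1])]
```

Each row in `anonymized` keeps its sensitive value but carries ranges instead of exact quasi-identifier values, so it cannot be told apart from the other rows in its partition.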
Clone the repository and set up a virtual environment:
cd Mondrian-Multidimensional-K-Anonymity
python3 -m venv venv
. venv/bin/activate
Install the requirements:
pip install -r requirements.txt
To run the Mondrian anonymization process on your data:
python3 mondrian.py --input data/adult.csv --sensitive-data class
You can also generate an illustration of the c-avg metric for different k values using the data/adult.csv file:
python3 mondrian.py --test
Note that this only illustrates the quality of the algorithm: it is generated from predefined data, so the resulting figure is always the same.
To see the full list of options, run:
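For reference, here is a minimal sketch of how such a metric can be computed, assuming c-avg refers to the normalized average equivalence class size from the Mondrian paper: (total records / number of equivalence classes) / k. The function name and inputs are hypothetical and may differ from what mondrian.py does internally.

```python
def c_avg(partition_sizes, k):
    """Normalized average equivalence class size (assumed definition of c-avg).

    partition_sizes: number of records in each equivalence class.
    k: the k value used for k-anonymity.
    """
    total_records = sum(partition_sizes)
    return (total_records / len(partition_sizes)) / k


# Three classes of exactly k records each give the ideal value.
ideal = c_avg([2, 2, 2], k=2)
# Two classes of three records are larger than strictly necessary for k=2.
coarser = c_avg([3, 3], k=2)
```

Under this definition a value of 1.0 means every equivalence class has exactly k records, and larger values mean the data was generalized more than k strictly required.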
python3 mondrian.py --help
usage: mondrian.py [-h] [--ei Explicit Identifier [Explicit Identifier ...]]
                   [--sensitive-data Sensitive Data [Sensitive Data ...]] [--k K] [--input INPUT] [--test]

An application of 'Mondrian Multidimensional K-Anonymity' article in Python

options:
  -h, --help            show this help message and exit
  --ei Explicit Identifier [Explicit Identifier ...]
                        Explicit Identifiers to be removed from dataset (example: --ei id name)
  --sensitive-data Sensitive Data [Sensitive Data ...]
                        Sensitive Data you don't want to anonymize to maintain utility (example: --sensitive-data salary)
  --k K                 The k value for k-anonymity (default: 2)
  --input INPUT         Input csv file path (default: data/adult.csv)
  --test                Draws an illustration of c-avg metric for different k values using data/adult.csv file.
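To sanity-check an anonymized result against the chosen k, one option is to count how often each combination of quasi-identifier values occurs. The helper below is a hypothetical sketch that works on in-memory rows; it does not assume anything about the output format of mondrian.py.

```python
from collections import Counter


def is_k_anonymous(rows, qi_indexes, k):
    """Return True if every quasi-identifier combination appears at least k times."""
    groups = Counter(tuple(row[i] for i in qi_indexes) for row in rows)
    return all(count >= k for count in groups.values())


# Hypothetical generalized rows: (age range, zip range, sensitive value).
generalized = [
    ("25-31", "10001-10010", "flu"),
    ("25-31", "10001-10010", "cold"),
    ("35-58", "10012-10055", "asthma"),
    ("35-58", "10012-10055", "flu"),
]
ok_for_2 = is_k_anonymous(generalized, qi_indexes=[0, 1], k=2)
ok_for_3 = is_k_anonymous(generalized, qi_indexes=[0, 1], k=3)
```

The sensitive column is deliberately left out of `qi_indexes`, since it is kept as-is and is not part of the grouping.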
Feel free to contribute in any way possible.