GitHub - llgeek/K-anonymity-and-Differential-Privacy: Implementation of DataFly for K-anonymity, and implemented Laplace Mechanism and Exponential Mechanism for Differential Privacy

K-anonymity and Differential Privacy

[TOC]

1. K-anonymity

1.1 Generalization Hierarchy

Generalization hierarchy is defined in files in conf folder.

1.2 Heuristic program

I implement the datafly heuristic algorithm, which pseudo code is below:

The detailed comments for each function can be found in file k_anonymity.py.

1.3 Evaluation

I evaluate the result for k = [5, 10, 50, 100], and calculate the distoration and precision. The calculation of distoration and precision is the same as given in lecture, where distortion is:

and precision is:

2. Differential Privacy

2.1 Laplace Mechanism

2.1.1 Query for e=0.5 and e = 1

Query 1000 times for e in [0.5, 1].

2.1.2 0.5-Indistinguishable Proof

To prove the indistinguishable of two queries, I use bucket to gather outputs into 20 buckets, then calculate the probability for each bucket. Then calculate the quotient over each bucket probability. For two query result, query1 and query2, I calculate both probability of query1 over probability over query2, and also probability of query2 over probability over `query1. We can see in both cases, the quotient is smaller than $exp^{\epsilon}$, which proves the $\epsilon$-indistinguishable.

######2.1.3 1-Indistinguishable Proof

The proof is same as 0.5-indistinguishable proof. We can see for each case, the $1$-indistinguishable holds.

2.1.4 Distortion

To calculate the distortion, I used the RMSE as metric. Firstly I calculate the groundtruth, which is the true average age greater than 25 without adding noise. Then I calculate the RMSE of query result with groundtruth. We can see when $\epsilon = 1$, the RMSE is smaller than $\epsilon=0.5$, which proves the distortino of $\epsilon = 1$ is smaller than $\epsilon=0.5$.

2.2 Exponential Mechanism

2.2.1 Query for e=0.5 and e = 1

Query 1000 times for e in [0.5, 1].

2.2.2 0.5-Indistinguishable Proof

To prove the $\epsilon$-indistinguishable, I firstly count the frequency of each education value in the query results. Then calculate the probability for each education value. Indistinguishability can be proved by showing the probability quotient of two adjacent tables is smaller than $\exp^{\epsilon}$. We can see for each case, the indistinguishability holds.

2.2.3 1-Indistinguishable Proof

The proof is same with $\epsilon=0.5$. We can see for each case, the indistinguishability holds.

2.2.4 Distortion

The metric I used here is 1-precision. Precision is calculated by #results which is the groudtruth / total query numbers. 1-precision is a measure for distortion, as higher precison implies lower distortion. We can see when $\epsilon = 1$, the distortion is smaller than $\epsilon=0.5$, which proves the distortino of $\epsilon = 1$ is smaller than $\epsilon=0.5$.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
conf		conf
data		data
src		src
utilis		utilis
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

K-anonymity and Differential Privacy

1. K-anonymity

1.1 Generalization Hierarchy

1.2 Heuristic program

1.3 Evaluation

2. Differential Privacy

2.1 Laplace Mechanism

2.1.1 Query for e=0.5 and e = 1

2.1.2 0.5-Indistinguishable Proof

2.1.4 Distortion

2.2 Exponential Mechanism

2.2.1 Query for e=0.5 and e = 1

2.2.2 0.5-Indistinguishable Proof

2.2.3 1-Indistinguishable Proof

2.2.4 Distortion

About

Releases

Packages

Languages

llgeek/K-anonymity-and-Differential-Privacy

Folders and files

Latest commit

History

Repository files navigation

K-anonymity and Differential Privacy

1. K-anonymity

1.1 Generalization Hierarchy

1.2 Heuristic program

1.3 Evaluation

2. Differential Privacy

2.1 Laplace Mechanism

2.1.1 Query for e=0.5 and e = 1

2.1.2 0.5-Indistinguishable Proof

2.1.4 Distortion

2.2 Exponential Mechanism

2.2.1 Query for e=0.5 and e = 1

2.2.2 0.5-Indistinguishable Proof

2.2.3 1-Indistinguishable Proof

2.2.4 Distortion

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages