From 0f20e4d6de3f9422993fcdac0fa35b845920ab3f Mon Sep 17 00:00:00 2001
From: Cedric Sirianni
Date: Mon, 25 Dec 2023 16:35:45 -0500
Subject: [PATCH] docs: explain implementation (#55)

* docs: add implementation details and images
* fix: use relative file path
* fix: try absolute file path
* docs: try image path top level
* docs: use top level indentation only
* docs: add PSI description
* docs: add optimization description
* docs: remove image
* docs: improve optimization description
* docs: add About section to frontend README
* docs: add Testing section to README
* docs: remove image folder
* docs: use LaTeX instead of markdown for math text
* docs: fix typo
* docs: change some wording
* docs: use imgur link for image url

GitHub recently changed the image hosting system in issues. Links are no longer persistent, so they can't be reliably used in the README.
---
 README.md          | 22 ++++++++++++++++++++--
 backend/README.md  | 23 ++++++++++++++---------
 frontend/README.md | 36 ++++++++++++++++++++++++++++++++----
 3 files changed, 66 insertions(+), 15 deletions(-)

diff --git a/README.md b/README.md
index 37f301b..bb60b73 100644
--- a/README.md
+++ b/README.md
@@ -12,9 +12,27 @@ Private Data Lookup (PDL) is a web application that allows users to privately qu
 > - [How Meta is improving password security and preserving privacy](https://engineering.fb.com/2023/08/08/security/how-meta-is-improving-password-security-and-preserving-privacy/)
 > - [Data Breaches, Phishing, or Malware?: Understanding the Risks of Stolen Credentials](https://dl.acm.org/doi/10.1145/3133956.3134067)

+## Implementation
+
+### PSI
+
+In Private Set Intersection, neither party reveals anything to their counterpart except for the elements in the intersection. This is accomplished using encryption. Hashed passwords are encrypted using secret key $a$ on the frontend and secret key $b$ on the backend. Querying the set of breached passwords is a three-step process:
+
+1.
The client sends an encrypted user password $\text{Hash}(p)^a$ to the server.
+2. The server sends the re-encrypted user password $\text{Hash}(p)^{ab}$ and the encrypted breached passwords $\text{Hash}(b_1)^{b}, ...,\text{Hash}(b_n)^{b}$ to the client.
+3. The client partially decrypts the user password using $a^{-1}$ and checks if $\text{Hash}(p)^{aba^{-1}}$ is contained in the set of breached passwords.
+
+If the set intersection is non-empty, the user's password is compromised and should not be used.
+
+### Performance Optimization
+
+The initial PSI implementation unreasonably increases critical path latency due to the size of the breached password dataset. To address this challenge, [k-anonymity](https://en.wikipedia.org/wiki/K-anonymity) is used. Passwords are partitioned into $k$ buckets based on one or more leaked bytes. Given $n$ leaked bytes, there are $(2^8)^n$ buckets, indexed from $0$ to $(2^8)^n - 1$. The client generates a partition index using $n$ leaked bytes, and then the server returns a smaller subset of the dataset. The result is a decrease in the number of serialized passwords per request and faster processing times.
+
+This feature involves a tradeoff between user privacy and application performance. The key assumption is that the number of breached passwords is sufficiently large to not reveal identifiable information about individual users. Since real breached password datasets contain billions of passwords [[1](https://www.wired.com/story/collection-leak-usernames-passwords-billions/)], each bucket contains millions of passwords. Thus, the assumption holds and the increased leakage involves negligible privacy risk.
+
 ## Instructions

-It's necessary to configure the `/frontend` and `/backend` folders initially. See the respective `README.md`s for more information. After configuration, you can run the application using the following commands.
+You need to configure the `/frontend` and `/backend` folders initially.
See the respective `README.md`s for more information. After configuration, you can run the application using the following commands.

 To run the frontend, `cd` into `/frontend` and run

@@ -41,4 +59,4 @@ If you want to build a new database from a new or existing path, you can use the
 build/src/server --build
 ```

-Ensure that the backend is running with the frontend, otherwise you will see a server error on the front-end website.
+Ensure that the backend is running with the frontend, otherwise you will see a server error in the web application.
diff --git a/backend/README.md b/backend/README.md
index 7903a99..5c523f1 100644
--- a/backend/README.md
+++ b/backend/README.md
@@ -2,7 +2,7 @@

 ## About

-The backend hosts the breached passwords for use in the Private Set Intersection computation. The user sends a password and receives that password and the set of breached passwords both encrypted with secret key `b`. The backend uses [Crow](https://crowcpp.org/master/) for the REST API, [SQLite3](https://www.sqlite.org/index.html) for the breached password database, and [Libsodium](https://libsodium.gitbook.io/doc/) for the cryptography. The API has a single endpoint:
+The backend hosts the breached passwords for use in the Private Set Intersection computation. The user sends a password and receives that password and the set of breached passwords both encrypted with secret key $b$. The backend uses [Crow](https://crowcpp.org/master/) for the REST API, [SQLite3](https://www.sqlite.org/index.html) for the breached password database, and [Libsodium](https://libsodium.gitbook.io/doc/) for the cryptography. The API has a single endpoint:

 ### Encrypt user password and get breached passwords

@@ -14,7 +14,7 @@ This endpoint encrypts the user's password and provides a list of encrypted brea

 `body` *Required*

-The leaked bytes followed by the user's encrypted password.
In general, for `n` leaked bytes, the length of the data is `32 + n`:
+The leaked bytes followed by the user's encrypted password. In general, for $n$ leaked bytes, the length of the data is $32 + n$:

 ```text
 0 n 32 + n
@@ -59,8 +59,13 @@ const response = await fetch(

 In `/backend`, start by installing [Conan](https://conan.io/):

-```bash
+```console
 brew install conan
+```
+
+Then, install the project's packages using the following:
+
+```console
 conan install . --output-folder=build --build=missing
 ```

@@ -68,25 +73,25 @@ You probably need to create a default profile. Use `conan profile detect`.

 If you haven't installed CMake already, do so now:

-```bash
+```console
 brew install cmake
 ```

 Next, link and compile the program:

-```bash
+```console
 make build
 ```

 From `/backend`, start the server. Use the `--build` flag to create or rebuild a database for the breached passwords:

-```bash
+```console
 build/src/server --build
 ```

 Or, omit the `--build` flag to use an existing database:

-```bash
+```console
 build/src/server data/passwords.db
 ```

@@ -104,12 +109,12 @@ To fix VS Code import errors, try adding the following line to your `settings.js

 After building, you can run tests from `/backend`:

-```bash
+```console
 cd build && ./test/pdl_test
 ```

 or alternatively:

-```bash
+```console
 make check
 ```
diff --git a/frontend/README.md b/frontend/README.md
index 3054e48..e1b189a 100644
--- a/frontend/README.md
+++ b/frontend/README.md
@@ -1,9 +1,17 @@
 # Frontend

-Made with [Next.js](https://nextjs.org/).
+## About
+
+The frontend renders the client web application and computes the private set intersection. It is made with [Next.js](https://nextjs.org/), [Tailwind](https://tailwindcss.com/), and [MaterialUI](https://mui.com/material-ui/).
+
+![Sign up page](https://i.imgur.com/8sea2io.png)
+
+Note that a password must satisfy the listed requirements before the user can click "Sign Up."
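As a rough illustration of the set-intersection computation mentioned above, the three PSI steps from the top-level README can be sketched with plain modular exponentiation. This is a toy model and not the project's code: the modulus `P`, the `toyHash` function, the fixed keys, and every function name here are assumptions made for the example, and the real application relies on Libsodium on the backend rather than on `BigInt` arithmetic.

```javascript
// Toy group: integers modulo the Mersenne prime 2^61 - 1 (illustrative only).
const P = 2n ** 61n - 1n;
const ORDER = P - 1n; // exponents are inverted modulo the group order

// Square-and-multiply modular exponentiation.
function modPow(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// Modular inverse via the extended Euclidean algorithm (needs gcd(a, m) = 1).
function modInverse(a, m) {
  let [r0, r1] = [m, a % m];
  let [t0, t1] = [0n, 1n];
  while (r1 !== 0n) {
    const q = r0 / r1;
    [r0, r1] = [r1, r0 - q * r1];
    [t0, t1] = [t1, t0 - q * t1];
  }
  if (r0 !== 1n) throw new Error("key is not invertible");
  return ((t0 % m) + m) % m;
}

// Hypothetical stand-in for Hash(p): an FNV-1a style hash mapped into [2, P - 1].
function toyHash(text) {
  let h = 1469598103934665603n;
  for (const ch of text) {
    h = ((h ^ BigInt(ch.codePointAt(0))) * 1099511628211n) % 2n ** 64n;
  }
  return (h % (P - 2n)) + 2n;
}

// Step 1 (client): send Hash(p)^a.
const clientEncrypt = (password, a) => modPow(toyHash(password), a, P);

// Step 2 (server): return Hash(p)^(ab) and Hash(b_i)^b for each breached password.
const serverRespond = (encrypted, b, breached) => ({
  reencrypted: modPow(encrypted, b, P),
  breachedSet: breached.map((pw) => modPow(toyHash(pw), b, P)),
});

// Step 3 (client): remove a with a^(-1), then test membership.
function clientCheck(response, a) {
  const unblinded = modPow(response.reencrypted, modInverse(a, ORDER), P);
  return response.breachedSet.includes(unblinded);
}

// Runs all three steps with fixed toy keys a = 65537 and b = 1000003.
function isBreached(password, breached) {
  const a = 65537n;
  const b = 1000003n;
  return clientCheck(serverRespond(clientEncrypt(password, a), b, breached), a);
}
```

Calling `isBreached("TestPass1&", ["password123", "TestPass1&"])` walks through all three steps in one process. In the real application, steps 1 and 3 run in the browser and step 2 runs on the backend, so neither side sees the other's key.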
 ## Configuration

+Make sure you have [Yarn](https://yarnpkg.com/) and [Node](https://nodejs.org/en) installed.
+
 To run the frontend server, use your preferred terminal to `cd` into `/frontend` and then install the required packages by running

 ```bash
@@ -18,8 +26,28 @@ yarn dev

 Open [http://localhost:3000](http://localhost:3000) to view it in the browser. The page will reload if you make edits.

-## Deploy on Vercel
+## Testing
+
+You can run tests from `/frontend`:
+
+```console
+yarn test
+```
+
+This command involves integration testing, so make sure the backend is running and that the leaked bytes are the same. For example,
+
+`/frontend/tests/psi.test.ts`

-The easiest way to deploy your Next.js app is to use the [Vercel Platform](https://vercel.com/new?utm_medium=default-template&filter=next.js&utm_source=create-next-app&utm_campaign=create-next-app-readme) from the creators of Next.js.
+```javascript
+test("sending breached password should return fail status", async () => {
+  const password = "TestPass1&";
+  const response = await checkSecurity(password, 1);
+  expect(response.status).toBe("fail");
+});
+```
+
+```/backend/src/main.cpp```

-Check out our [Next.js deployment documentation](https://nextjs.org/docs/deployment) for more details.
+```cpp
+const size_t offset = 1;
+```
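To see why the two leaked-byte settings have to match, it may help to sketch the request layout the backend README describes: $n$ leaked bytes followed by the 32-byte encrypted password. The helpers below are hypothetical and not taken from the project. In particular, taking the leaked bytes from the start of the password hash and reading them as a big-endian bucket index are assumptions made for illustration.

```javascript
const ENCRYPTED_LENGTH = 32; // size of the encrypted password, per the backend README

// Client side (hypothetical helper): n leaked bytes, then the encrypted password.
function buildRequestBody(hashBytes, encryptedPassword, leakedByteCount) {
  if (encryptedPassword.length !== ENCRYPTED_LENGTH) {
    throw new RangeError("encrypted password must be 32 bytes");
  }
  const body = new Uint8Array(leakedByteCount + ENCRYPTED_LENGTH);
  body.set(hashBytes.subarray(0, leakedByteCount), 0);
  body.set(encryptedPassword, leakedByteCount);
  return body;
}

// Server side (hypothetical helper, written in JavaScript for symmetry):
// split the body using the server's own offset and derive a bucket index.
function parseRequestBody(body, offset) {
  if (body.length !== offset + ENCRYPTED_LENGTH) {
    throw new RangeError("body length does not match the server's offset");
  }
  let bucket = 0;
  for (let i = 0; i < offset; i++) {
    bucket = bucket * 256 + body[i]; // assumed big-endian bucket index
  }
  return { bucket, encryptedPassword: body.subarray(offset) };
}

// A client configured for 1 leaked byte produces a 33-byte body.
const hashBytes = Uint8Array.from({ length: 32 }, (_, i) => i + 7);
const encrypted = new Uint8Array(32).fill(0xab);
const body = buildRequestBody(hashBytes, encrypted, 1);
const parsed = parseRequestBody(body, 1);
```

With matching settings, `parsed.bucket` equals the first hash byte. If the server expected an offset of `2` instead, `parseRequestBody(body, 2)` would throw because a 33-byte body cannot be split into 2 leaked bytes plus 32 encrypted bytes.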