The Data Anonymizer tool protects privacy by irreversibly obscuring personal identifiers, without storing any data. It is aimed at businesses that prioritize data security and compliance and need a reliable way to safeguard sensitive information.
This project grew out of a need we recognized while developing our machine learning solution, NextBrain. It helps us get past the barriers typically encountered in the proof-of-concept phase. Through this initiative we also want to support the open-source community by tackling some of the main challenges of using this kind of tool.
The anonymization process maintains the original data distribution while injecting noise into the statistics. This approach helps to prevent the re-identification of the data. For more details about the anonymization process, refer to the following paper: DataSynthesizer
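As a rough illustration of the idea, the sketch below adds Laplace noise to the category counts of a single column and then samples new values from the noisy distribution. It is not the project's implementation: the function names (`laplace_noise`, `noisy_distribution`, `sample_synthetic`), the `epsilon` parameter, and the noise scale of `1 / epsilon` are assumptions made for this example, and it handles only one categorical column using the standard library.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_distribution(values, epsilon=1.0, rng=None):
    """Return category probabilities computed from noise-perturbed counts."""
    if not values:
        raise ValueError("values must not be empty")
    rng = rng or random.Random()
    counts = Counter(values)
    scale = 1.0 / epsilon  # assumed noise scale; smaller epsilon means more noise
    # Perturb each count and clip negatives so the result stays a valid histogram.
    noisy = {k: max(0.0, c + laplace_noise(scale, rng)) for k, c in counts.items()}
    total = sum(noisy.values())
    if total == 0:
        # Every count was clipped to zero: fall back to a uniform distribution.
        return {k: 1.0 / len(noisy) for k in noisy}
    return {k: v / total for k, v in noisy.items()}


def sample_synthetic(values, n_samples, epsilon=1.0, seed=None):
    """Draw synthetic values from the noisy distribution of `values`."""
    rng = random.Random(seed)
    dist = noisy_distribution(values, epsilon, rng)
    categories = list(dist)
    weights = [dist[c] for c in categories]
    return rng.choices(categories, weights=weights, k=n_samples)
```

Calling `sample_synthetic(["A"] * 70 + ["B"] * 30, 100)` returns 100 values whose proportions are close to, but not exactly, the original 70/30 split. No original row is copied into the output.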
- Data Anonymization Tool
Coming soon
To get started with the development of NB-Anonymizer, follow these steps:
- Clone the Repository: First, clone the repository to your local machine using Git:
git clone https://github.com/NextBrain-ai/Data-Anonymizer-Tool
- Create a Virtual Environment: Navigate to the project directory and create a Python virtual environment:
Requires Python 3.10+
cd Data-Anonymizer-Tool
python3 -m venv venv
- Activate the Virtual Environment: Activate the virtual environment. The command varies depending on your operating system:
- On Windows:
.\venv\Scripts\activate
- On Linux:
source venv/bin/activate
- Install Dependencies: Install the required dependencies using pip:
pip install -r requirements.txt
- Launch the Application: Start the application with Streamlit:
streamlit run streamlit_app/streamlit_app.py
- 🎉 At this point, the application should be accessible at http://localhost:8501.
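If the setup steps above fail, a quick check of the interpreter can help narrow down the cause. The following sketch is not part of the repository: `meets_minimum`, `in_virtualenv`, and `environment_report` are hypothetical helper names. It only verifies the Python 3.10+ requirement and whether a virtual environment is active.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the setup steps


def meets_minimum(version, minimum=MIN_PYTHON):
    """Return True if the (major, minor) pair of `version` is at least `minimum`."""
    return tuple(version[:2]) >= tuple(minimum)


def in_virtualenv():
    # Inside a venv, sys.prefix points at the environment directory
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def environment_report():
    """Summarize whether the current interpreter matches the setup steps."""
    return {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": meets_minimum(sys.version_info),
        "in_virtualenv": in_virtualenv(),
    }
```

Running `print(environment_report())` after activating the environment should show `python_ok` and `in_virtualenv` both set to `True`.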
Download the NB Anonymizer Jupyter Notebook now and dive into a user-friendly, step-by-step guide designed for effortless navigation and comprehension.
Prefer not to install anything? No problem! You can also utilize Google Colab to run the notebook seamlessly. Simply click here to execute the code in a cloud-based environment, hassle-free.
If you have Docker and Docker Compose already installed on your system, running NB-Anonymizer is straightforward. Just follow these steps:
- Navigate to the Project Directory: Open a terminal and navigate to the root directory of the NB-Anonymizer project.
- Start the Application with Docker Compose: Run the following command to start the application in a Docker container:
docker compose up -d
This command will pull the necessary Docker images, create a container, and start the application.
🎉 Once the process is complete, NB-Anonymizer should be running in a Docker container, typically accessible at http://localhost:8501.
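Because `docker compose up -d` returns before the application has necessarily finished starting, it can be useful to poll the address until it responds. The sketch below is an illustration only: `is_up` and `wait_until_up` are hypothetical helpers, and the default URL is simply the address mentioned above.

```python
import http.client
import time
import urllib.request

APP_URL = "http://localhost:8501"  # address given in the steps above


def is_up(url=APP_URL, timeout=2.0):
    """Return True if `url` answers an HTTP GET with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException, ValueError):
        # Connection refused, timeout, HTTP error, or malformed URL.
        return False


def wait_until_up(url=APP_URL, attempts=10, delay=1.0, timeout=2.0):
    """Poll `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if is_up(url, timeout=timeout):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until_up(attempts=30)` keeps checking for roughly half a minute and returns `False` if the application never answers.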
Before building a distribution of NB-Anonymizer, ensure that you meet the following prerequisites:
- Python 3.10 or Higher: The project requires Python version 3.10 or newer. You can download and install the latest version of Python from python.org.
- Node.js and npm: npm (Node Package Manager) is needed for handling the project's JavaScript dependencies. You can download and install Node.js and npm from nodejs.org.
Once you have these prerequisites installed, follow these steps to build the project:
- Install JavaScript Dependencies: Navigate to the project directory and run npm install. This will install Electron along with other necessary dependencies:
npm install
- Build the Project: Execute the following commands to build a distribution of the project:
npm run dump streamlit_app -- -r requirements.txt
npm run dist
These commands will package the application and prepare it for distribution. After running these commands, the built version of NB-Anonymizer will be available in the distribution directory of the project.
- Initial Anonymization System: Develop a basic system for data anonymization to lay the groundwork for more advanced features.
- Web Platform for Anonymization: Create a user-friendly web interface that allows users to easily anonymize their data.
- Jupyter Notebook and Google Colab Integration: Ensure our system is compatible with Jupyter Notebook and Google Colab for wider accessibility.
- Docker and Docker Compose Deployment: Implement Docker and Docker Compose to facilitate easy deployment and scaling of our application.
- Desktop Application: Develop a standalone desktop application for users who prefer a dedicated software solution.
- Testing More Powerful Algorithms: Experiment with more advanced algorithms to improve the efficiency and effectiveness of the anonymization process.
- Evaluation of Original vs Anonymized Data: Conduct thorough evaluations to ensure the integrity of the data post-anonymization.
- Statistical Graph Generation: Implement features for generating statistical graphs to visualize the effectiveness of our anonymization processes.
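For the planned evaluation of original versus anonymized data, one simple starting point is to compare the category frequencies of a column before and after anonymization. The sketch below uses total variation distance as the metric. This is an assumption chosen for illustration, not a metric the project has committed to, and the function names are hypothetical.

```python
from collections import Counter


def category_frequencies(values):
    """Return the relative frequency of each distinct value."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    n = len(values)
    return {k: c / n for k, c in counts.items()}


def total_variation_distance(original, anonymized):
    """Half the sum of absolute frequency differences, between 0 and 1.

    0 means the two columns have identical category frequencies;
    1 means they share no categories at all.
    """
    p = category_frequencies(original)
    q = category_frequencies(anonymized)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)
```

A small distance suggests the anonymized column preserves the original distribution. It says nothing about re-identification risk, which would need a separate assessment.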
Contributions to NB-Anonymizer are welcome and appreciated. If you want to contribute, please:
- Fork the repository.
- Create a new branch for your feature (git checkout -b feature/AmazingFeature).
- Commit your changes (git commit -am 'Add some AmazingFeature').
- Push to the branch (git push origin feature/AmazingFeature).
- Open a pull request.
For more detailed information, please refer to the CONTRIBUTING file.
NB-Anonymizer is licensed under the MIT License.
For support or any queries, feel free to contact us at info@nextbrain.ai.