This project provides a Python-based pipeline for extracting data from a SQL database, transforming it, and loading it into a Neo4j graph database.
- Python 3.6 or later
- A SQL database (e.g., PostgreSQL, MySQL, SQLite)
- Neo4j graph database
- Clone the repository:
git clone https://github.com/your-username/sql_neo4j_pipeline.git
- Navigate to the project directory:
cd sql_neo4j_pipeline
- Create a virtual environment (optional but recommended):
python3 -m venv venv
source venv/bin/activate # On Windows, use venv\Scripts\activate
- Install the required dependencies:
pip install -r requirements.txt
Before running the pipeline, you need to configure the database connections and other settings in the sql_neo4j_pipeline/config.py file.
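The contents of config.py are not shown here, so the following is only a rough sketch of what such a file might look like. Every setting name (`SQL_DATABASE_URL`, `NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD`, `BATCH_SIZE`) and every default value is an assumption made for illustration; check the actual file for the names the pipeline expects.

```python
import os

# Hypothetical setting names and defaults; the project's real config.py may differ.
# Reading from environment variables keeps credentials out of version control.
SQL_DATABASE_URL = os.environ.get("SQL_DATABASE_URL", "sqlite:///example.db")
NEO4J_URI = os.environ.get("NEO4J_URI", "bolt://localhost:7687")
NEO4J_USER = os.environ.get("NEO4J_USER", "neo4j")
NEO4J_PASSWORD = os.environ.get("NEO4J_PASSWORD", "")

# Assumed tuning knob: number of rows handled per batch.
BATCH_SIZE = int(os.environ.get("BATCH_SIZE", "1000"))
```

With a layout like this, settings can be overridden per environment by exporting the matching variable before running the pipeline, without editing the file.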
To run the pipeline, execute the following command:
python scripts/run_pipeline.py
This will trigger the following steps:
- Extract: Data is extracted from the configured SQL database.
- Transform: The extracted data is transformed into a format suitable for the Neo4j graph database.
- Load: The transformed data is loaded into the Neo4j graph database.
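The three steps above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the `people` table, the `Person` label, and the `extract`, `transform`, and `load` function names are all hypothetical. It uses an in-memory SQLite database as the SQL source and a plain list in place of a Neo4j session so that it runs without any external services.

```python
import sqlite3


def extract(conn):
    """Extract: read rows from a (hypothetical) `people` table."""
    cur = conn.execute("SELECT id, name FROM people ORDER BY id")
    return cur.fetchall()


def transform(rows):
    """Transform: turn each row into a Cypher statement plus its parameters."""
    query = "MERGE (p:Person {id: $id}) SET p.name = $name"
    return [(query, {"id": row[0], "name": row[1].strip()}) for row in rows]


def load(statements, run):
    """Load: hand each statement to `run`, a callable taking (query, params).

    With the official Neo4j Python driver, a session's `run` method
    could be passed here (an assumption about how the project loads data).
    """
    for query, params in statements:
        run(query, params)
    return len(statements)


# Demo: in-memory SQLite as the source, a list standing in for Neo4j.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [(1, " Ada "), (2, "Grace")])

executed = []
loaded = load(transform(extract(conn)), lambda q, p: executed.append((q, p)))
conn.close()

print(f"Loaded {loaded} statements")
```

Using `MERGE` rather than `CREATE` in the generated Cypher means re-running the sketch would update existing nodes instead of duplicating them, which is a common choice for repeatable loads.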
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the MIT License.