On-prem DB to Azure Cloud Pipeline with Data Factory, Data Lake Storage, Spark, Databricks, Synapse, and Power BI
- Project Overview
- Project Architecture
  - 2.1. Data Ingestion
  - 2.2. Data Transformation
  - 2.3. Data Loading
  - 2.4. Data Reporting
- Credits
- Contact
This project is an end-to-end data engineering pipeline built on the Azure cloud. Azure Data Factory ingests raw data from an on-premise SQL Server database into the bronze layer of Azure Data Lake Storage. Azure Databricks then transforms the data with Spark, writing the results to the silver layer; the gold layer holds the fully cleansed data, which is loaded into an Azure Synapse serverless SQL database and visualized in a Power BI report. Azure Active Directory (AAD) and Azure Key Vault are used for monitoring and governance.
You can find detailed information in the diagram below:
- Connected the on-premise SQL Server to Azure using a self-hosted Microsoft Integration Runtime.
- Set up the resource group with the required services (Key Vault, Storage Account, Data Factory, Databricks, Synapse Analytics).
- Migrated the tables from the on-premise SQL Server to Azure Data Lake Storage Gen2 (see the table-enumeration sketch after this list).
- Mounted Azure Blob Storage to Databricks to retrieve raw data from the Data Lake (see the mount sketch below).
- Used a Spark cluster in Azure Databricks to clean and refine the raw data (see the transformation sketch below).
- Saved the cleaned data in Delta format, optimized for further analysis.
- Used Azure Synapse Analytics to load the refined data efficiently.
- Created a serverless SQL database and connected it to the data lake (see the view sketch below).
- Connected Microsoft Power BI to Azure Synapse and used the database views to create interactive, insightful data visualizations.
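The ingestion pipeline first enumerates the source tables before copying them one by one. Below is a minimal Python sketch of the metadata query a Data Factory Lookup activity could run against the on-premise server; the server, database, and connection settings are hypothetical placeholders.

```python
# Sketch: enumerate user tables on the on-premise SQL Server.
# Server and database names are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=AdventureWorksLT2017;"  # hypothetical source DB
    "Trusted_Connection=yes;"
)

# List every user table with its schema; Data Factory iterates over this
# result with a ForEach activity and copies each table to the bronze layer.
query = """
SELECT s.name AS SchemaName, t.name AS TableName
FROM sys.tables t
JOIN sys.schemas s ON t.schema_id = s.schema_id;
"""

for schema_name, table_name in conn.cursor().execute(query):
    print(f"{schema_name}.{table_name}")
```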
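A minimal sketch of the Databricks mount, assuming a Key Vault-backed secret scope named `kv-scope` and a storage account called `medallionstorageacct` (both hypothetical); the storage access key is pulled from Key Vault rather than hard-coded. Note that `dbutils` is only available inside a Databricks notebook.

```python
# Sketch: mount the bronze container into the Databricks file system.
# Scope, key, account, and container names are hypothetical placeholders.
storage_account = "medallionstorageacct"  # hypothetical
container = "bronze"

# The access key is resolved through a Key Vault-backed secret scope.
access_key = dbutils.secrets.get(scope="kv-scope", key="storage-access-key")

dbutils.fs.mount(
    source=f"wasbs://{container}@{storage_account}.blob.core.windows.net/",
    mount_point=f"/mnt/{container}",
    extra_configs={
        f"fs.azure.account.key.{storage_account}.blob.core.windows.net": access_key
    },
)
```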
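A minimal PySpark sketch of the bronze-to-silver cleanup, assuming the bronze layer holds Parquet files and an AdventureWorks-style `Customer` table with a `ModifiedDate` column (all hypothetical placeholders).

```python
# Sketch: clean a raw bronze table and persist it to the silver layer as Delta.
# Paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = SparkSession.builder.getOrCreate()

# Read the raw data copied into the bronze layer by Data Factory.
df = spark.read.format("parquet").load("/mnt/bronze/SalesLT/Customer/")

# Example transformation: cast a string timestamp to a plain date.
cleaned = df.withColumn("ModifiedDate", to_date(col("ModifiedDate")))

# Delta format keeps the table ACID-compliant and queryable downstream.
cleaned.write.format("delta").mode("overwrite").save("/mnt/silver/SalesLT/Customer/")
```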
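Finally, a minimal sketch of exposing the gold layer to Power BI through a Synapse serverless view. The workspace endpoint, database name, and storage URL are hypothetical placeholders; `OPENROWSET` with `FORMAT = 'DELTA'` lets the serverless pool query the Delta files in place, so no data is copied into Synapse itself.

```python
# Sketch: create a serverless SQL view over the gold-layer Delta files.
# Endpoint, database, and storage URL are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myworkspace-ondemand.sql.azuresynapse.net;"  # hypothetical endpoint
    "DATABASE=gold_db;Authentication=ActiveDirectoryInteractive;",
    autocommit=True,
)

# The view reads the Delta files directly; Power BI connects to it like
# any other SQL view.
conn.cursor().execute("""
CREATE OR ALTER VIEW dbo.Customer AS
SELECT * FROM OPENROWSET(
    BULK 'https://medallionstorageacct.dfs.core.windows.net/gold/SalesLT/Customer/',
    FORMAT = 'DELTA'
) AS result;
""")
```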
- Data Source: On-premise SQL Server
- Orchestration and Ingestion: Azure Data Factory
- Storage: Azure Data Lake Gen2
- Data Warehouse: Azure Synapse Analytics
- Authentication and Secrets Management: Azure Active Directory and Azure Key Vault
- Data Visualization: Power BI
- This project is inspired by a video from the YouTube channel "Mr. K Talks Tech".