Georgos Valavanis, Michael Voutyritsas, Elisavet Kavoura, Evaggelos Stergiou, Charisis Skordas
In this final project, we developed a robust Data Warehouse for a company called Cataschevastika. The project was divided into three main phases:
- Data Lake Creation and Deployment
- Advanced Analytical Capabilities for Sales Decision-Making
- Business Insights Enhancement Using Power BI
We began by updating the Online Transaction Processing (OLTP) database to meet the new requirements. The existing schema was insufficient for our needs, so we extended it with new tables.
The creation of the Data Warehouse was a multi-step process:
- Staging: Data was extracted from the OLTP system and loaded into the staging area.
- Fact Tables: Fact tables were designed and created to store quantitative data for analysis.
- Views: Logical views were defined over the warehouse tables to simplify complex queries.
- Slowly Changing Dimensions (SCD) Type 2: Row versioning was implemented so that changes to dimension data are kept as history instead of overwriting the previous values.
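To illustrate the row-versioning idea behind SCD Type 2, here is a minimal sketch in plain Python. It is not the project's implementation: the `apply_scd2` helper, the column names (`surrogate_key`, `business_key`, `valid_from`, `valid_to`, `is_current`), the `OPEN_END` marker and the sample customer are all hypothetical, and a real warehouse would normally do this in SQL or in the ETL tool.

```python
from datetime import date

# Hypothetical "open-ended" date marking the current version of a row.
OPEN_END = date(9999, 12, 31)


def apply_scd2(dimension, business_key, attributes, load_date):
    """Apply one incoming record to a Type 2 dimension (a list of dicts).

    Returns the current row for the business key after the update.
    """
    current = next(
        (row for row in dimension
         if row["business_key"] == business_key and row["is_current"]),
        None,
    )

    # Nothing changed: keep the existing version untouched.
    if current is not None and current["attributes"] == attributes:
        return current

    # Something changed: close the old version instead of overwriting it.
    if current is not None:
        current["valid_to"] = load_date
        current["is_current"] = False

    # Insert the new version with its own surrogate key.
    new_row = {
        "surrogate_key": len(dimension) + 1,
        "business_key": business_key,
        "attributes": dict(attributes),
        "valid_from": load_date,
        "valid_to": OPEN_END,
        "is_current": True,
    }
    dimension.append(new_row)
    return new_row


# Made-up sample data: one customer who moves to another city.
customer_dim = []
apply_scd2(customer_dim, "C001", {"city": "Athens"}, date(2024, 1, 1))
apply_scd2(customer_dim, "C001", {"city": "Athens"}, date(2024, 2, 1))  # no change
apply_scd2(customer_dim, "C001", {"city": "Patras"}, date(2024, 3, 1))  # new version
```

After these three loads the dimension holds two rows for the same customer: the closed Athens version and the current Patras version. Fact rows can therefore still join to the version that was valid when the sale took place.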
We deployed the local Data Warehouse to Azure Data Lake, which provided scalable and secure storage for structured and unstructured data.
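Using Databricks and PySpark, we developed a sales model based on K-Means clustering. K-Means is an unsupervised method that groups similar sales records into clusters rather than predicting values directly; we used the resulting clusters to support sales forecasting and to identify trends.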
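For readers unfamiliar with K-Means, the sketch below shows the two steps the algorithm repeats: assign each point to its nearest centroid, then move each centroid to the mean of its members. It is written in plain Python so that it runs without a Spark cluster; the project itself used PySpark on Databricks, and this is not that code. The `kmeans` function, its parameters and the `(units_sold, revenue)` sample values are assumptions made up for illustration.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Basic K-Means (Lloyd's algorithm) on equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignments = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))

    return centroids, assignments


# Hypothetical (units_sold, revenue) pairs, made up for illustration.
sales = [(10, 100.0), (12, 110.0), (11, 105.0),
         (95, 900.0), (100, 950.0), (98, 920.0)]
centroids, labels = kmeans(sales, k=2)
```

On this toy data the low-volume and high-volume records end up in separate clusters. In practice the features would usually be scaled first, since K-Means relies on Euclidean distance and a large-valued column such as revenue would otherwise dominate.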
The final phase involved visualizing the results and insights gained from the Data Warehouse and ML models using Power BI. This enabled stakeholders to make data-driven decisions through interactive and intuitive dashboards.