This project builds a production-ready ELT pipeline for Airbnb listings using Airflow. The pipeline processes and cleans two provided datasets, then loads the data into a data warehouse and a data mart for analysis.
The pipeline is built with Python scripts, SQL scripts, Airflow on GCP Cloud Composer, and Snowflake. Raw data is extracted from the provided CSV files, loaded into a Snowflake database, and then transformed into a star schema in the data warehouse. Finally, a data mart is designed and populated from the warehouse through a separate ETL pipeline.
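The extract and transform steps above can be sketched in plain Python. This is a minimal, self-contained illustration only: the column names (`listing_id`, `host_id`, `host_name`, `suburb`, `price`) and the split into a host dimension plus a listings fact table are hypothetical stand-ins for the real datasets and schema, and in the actual pipeline each function would run as an Airflow task loading into Snowflake rather than operating in memory.

```python
import csv
import io

# Hypothetical sample of a raw Airbnb listings CSV; the real files
# have different columns.
RAW_CSV = """listing_id,host_id,host_name,suburb,price
1001,501,Alice,Bondi,150
1002,502,Bob,Newtown,90
1003,501,Alice,Bondi,200
"""

def extract(raw_text):
    """Extract: parse CSV text into row dicts (stand-in for reading the files)."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(rows):
    """Transform: split raw rows into a host dimension and a listings fact table,
    mimicking a star-schema load in the warehouse."""
    dim_host = {}       # deduplicate hosts into a dimension, keyed by host_id
    fact_listing = []   # one fact row per listing, referencing the dimension
    for r in rows:
        dim_host[r["host_id"]] = {"host_id": r["host_id"],
                                  "host_name": r["host_name"]}
        fact_listing.append({
            "listing_id": r["listing_id"],
            "host_id": r["host_id"],      # foreign key to the host dimension
            "suburb": r["suburb"],
            "price": float(r["price"]),   # cast text to a numeric measure
        })
    return list(dim_host.values()), fact_listing

dim_host, fact_listing = transform(extract(RAW_CSV))
print(len(dim_host), len(fact_listing))  # 2 hosts, 3 listing facts
```

In the production pipeline, the same separation applies: the load step copies raw CSVs into Snowflake staging tables, and SQL transformations build the dimension and fact tables from there.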
Last edited: May 17, 2020