Sparkify Postgres ETL

This project extracts, transforms, and loads five main sets of information from the logs of the Sparkify app (a music streaming app).

With this structured database, we can query listener data for insights and uncover patterns in listening behavior.

Objective

You will learn three useful concepts from this project.

First, create the PostgreSQL database structure by running:

python create_tables.py

After the tables are created, parse the log files:

python etl.py
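As a rough illustration of the parsing step, the sketch below reads newline-delimited JSON log records and expands a millisecond timestamp into calendar fields. The field names `ts` and `userId`, the sample values, and the helper names `parse_log_lines` and `expand_timestamp` are assumptions made for this example; they are not taken from etl.py.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample log lines; the real log_data fields may differ.
SAMPLE_LOG = "\n".join([
    json.dumps({"userId": "8", "ts": 1541106106796}),
    json.dumps({"userId": "10", "ts": 1541107053796}),
])


def parse_log_lines(text):
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def expand_timestamp(ts_ms):
    """Turn a millisecond epoch timestamp into a dict of time parts (UTC)."""
    dt = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return {
        "start_time": dt,
        "hour": dt.hour,
        "day": dt.day,
        "month": dt.month,
        "year": dt.year,
        "weekday": dt.weekday(),  # Monday is 0
    }


records = parse_log_lines(SAMPLE_LOG)
time_rows = [expand_timestamp(record["ts"]) for record in records]
```

Pinning the conversion to UTC keeps the derived fields identical regardless of the machine's local time zone.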

This exercise uses a star schema: one fact table surrounded by four dimension tables.
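To make the shape of a star schema concrete, here is a minimal sketch with one fact table that references four dimension tables. The table and column names are hypothetical, and the example uses Python's built-in `sqlite3` with an in-memory database as a stand-in for PostgreSQL so that it runs without a server; the project's actual definitions live in sql_queries.py.

```python
import sqlite3

# Hypothetical table and column names; sqlite3 stands in for PostgreSQL here.
DIMENSION_TABLES = {
    "users": "user_id INTEGER PRIMARY KEY, name TEXT",
    "songs": "song_id INTEGER PRIMARY KEY, title TEXT",
    "artists": "artist_id INTEGER PRIMARY KEY, name TEXT",
    "time": "start_time INTEGER PRIMARY KEY, hour INTEGER",
}

# The fact table holds one row per event and points at each dimension.
FACT_TABLE = (
    "CREATE TABLE songplays ("
    "songplay_id INTEGER PRIMARY KEY, "
    "start_time INTEGER REFERENCES time(start_time), "
    "user_id INTEGER REFERENCES users(user_id), "
    "song_id INTEGER REFERENCES songs(song_id), "
    "artist_id INTEGER REFERENCES artists(artist_id))"
)


def build_schema(conn):
    """Create the four dimension tables, then the fact table."""
    for name, columns in DIMENSION_TABLES.items():
        conn.execute(f"CREATE TABLE {name} ({columns})")
    conn.execute(FACT_TABLE)


conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.execute("INSERT INTO users VALUES (1, 'Ada')")
conn.execute("INSERT INTO songs VALUES (10, 'Example Song')")
conn.execute("INSERT INTO artists VALUES (100, 'Example Artist')")
conn.execute("INSERT INTO time VALUES (1541106106796, 21)")
conn.execute("INSERT INTO songplays VALUES (1, 1541106106796, 1, 10, 100)")

# A typical star-schema query joins the fact table to its dimensions.
row = conn.execute(
    "SELECT u.name, s.title, a.name, t.hour "
    "FROM songplays sp "
    "JOIN users u ON u.user_id = sp.user_id "
    "JOIN songs s ON s.song_id = sp.song_id "
    "JOIN artists a ON a.artist_id = sp.artist_id "
    "JOIN time t ON t.start_time = sp.start_time"
).fetchone()

table_names = sorted(
    r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Analytical queries start from the fact table and join outward, so each dimension is only one join away.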

The project consists of a small set of files that are easy to maintain and understand:

sql_queries.py - Contains all the SQL queries used throughout the ETL process.

create_tables.py - Creates the schema structure in the PostgreSQL database.

etl.py - Reads and processes files from song_data and log_data and loads them into the tables.

etl.ipynb - The Python notebook used to develop the logic behind etl.py.

test.ipynb - Displays the first few rows of each table to verify whether the ETL process succeeded.
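The file-reading step that etl.py performs over song_data and log_data could look roughly like the sketch below: walk a directory tree, collect every JSON file, and hand each one to a processing callback. The function names `find_json_files` and `process_data`, the callback interface, and the demo directory layout are assumptions for illustration, not the script's actual code.

```python
import json
import os
import tempfile


def find_json_files(root):
    """Return sorted paths of every .json file under root, at any depth."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".json"):
                paths.append(os.path.join(dirpath, filename))
    return sorted(paths)


def process_data(root, handle_file):
    """Call handle_file on each JSON file under root; return the file count."""
    files = find_json_files(root)
    for path in files:
        handle_file(path)
    return len(files)


# Demo on a throwaway directory tree so the sketch runs without the real data.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "A", "B")
    os.makedirs(nested)
    for name in ("one.json", "two.json", "notes.txt"):
        with open(os.path.join(nested, name), "w") as f:
            json.dump({"file": name}, f)
    seen = []
    processed = process_data(root, seen.append)
```

In a real run, the callback would parse the file and insert rows into the database rather than simply recording the path.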

Aditya Dhanraj - LinkedIn Profile.