Sparkify Postgres ETL

This project extracts, transforms, and loads five main sets of information from the logs of the Sparkify app (an app for listening to your favorite music):

songplays
users
songs
artists
time - timestamps broken down into comprehensible columns
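
As an illustration of what "broken down into comprehensible columns" can mean for the time table, here is a minimal sketch. It assumes the logs store each event as an epoch timestamp in milliseconds and that the columns are hour, day, week, month, year, and weekday; neither the input format nor the column names are confirmed by this README, and `break_down_timestamp` is a hypothetical helper.

```python
from datetime import datetime, timezone


def break_down_timestamp(ts_ms):
    """Split an epoch timestamp in milliseconds into calendar columns.

    Assumption: the input is milliseconds since 1970-01-01 UTC.
    The column names are illustrative, not taken from the project.
    """
    dt = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return {
        "start_time": dt,
        "hour": dt.hour,
        "day": dt.day,
        "week": dt.isocalendar()[1],  # ISO week number
        "month": dt.month,
        "year": dt.year,
        "weekday": dt.weekday(),  # Monday is 0, Sunday is 6
    }
```

Each returned dictionary corresponds to one row of the time table, so queries can filter on the hour or weekday without parsing timestamps again.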

With this structured database, we can extract insights and find hidden patterns in listener data.

Objective

You will learn three useful concepts from this project:

Data modeling with Postgres
Database star schema creation
Building an ETL pipeline using Python

Running the ETL

First, create the PostgreSQL database structure by running:

python create_tables.py

After the tables are created, parse the log files:

python etl.py
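
If you prefer a single entry point, the two commands can be chained so that the ETL step never runs when table creation fails. This is a sketch only: `run_pipeline` is a hypothetical helper that is not part of the repository, and it assumes both scripts sit in the current working directory.

```python
import subprocess
import sys

# Script names come from this README. The order matters because etl.py
# expects the tables to exist already.
STEPS = ["create_tables.py", "etl.py"]


def run_pipeline(run=subprocess.run):
    """Run each step with the current interpreter, stopping at the first failure."""
    for script in STEPS:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so etl.py is skipped if create_tables.py fails.
        run([sys.executable, script], check=True)
```

Calling `run_pipeline()` from the project root runs both steps in order. The `run` parameter exists only so the function can be exercised without a database.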

Database Schema

The schema used for this exercise is the star schema: one fact table surrounded by four dimension tables.
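
The shape of a star schema can be sketched as follows. This example uses an in-memory SQLite database from the Python standard library as a stand-in so that it runs without a PostgreSQL server; the project itself targets PostgreSQL. It assumes songplays is the fact table, and every column name is hypothetical: only the five table names come from this README.

```python
import sqlite3

# Dimension tables. Column names are illustrative assumptions.
DIMENSIONS = {
    "users": "user_id INTEGER PRIMARY KEY, first_name TEXT",
    "songs": "song_id TEXT PRIMARY KEY, title TEXT",
    "artists": "artist_id TEXT PRIMARY KEY, name TEXT",
    "time": "start_time TEXT PRIMARY KEY, hour INTEGER",
}

# Fact table (assumed to be songplays): one row per play event,
# with one foreign key into each of the four dimension tables.
FACT = """
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY,
    start_time TEXT REFERENCES "time" (start_time),
    user_id INTEGER REFERENCES users (user_id),
    song_id TEXT REFERENCES songs (song_id),
    artist_id TEXT REFERENCES artists (artist_id)
)
"""


def build_star_schema(conn):
    """Create the four dimension tables, then the fact table that references them."""
    for name, columns in DIMENSIONS.items():
        conn.execute(f'CREATE TABLE "{name}" ({columns})')
    conn.execute(FACT)


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
```

Analytical queries then start from the fact table and join outward to whichever dimensions they need, for example songplays joined to users to count plays per listener.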

The project file structure

The project has a small set of files that is easy to maintain and understand:

sql_queries.py - Contains all the SQL queries used throughout the ETL process.

create_tables.py - Creates the schema structure in the PostgreSQL database.

etl.py - Reads and processes files from song_data and log_data and loads them into the tables.

etl.ipynb - The Python notebook used to develop the logic behind etl.py.

test.ipynb - Displays the first few rows of each table to verify whether the ETL process succeeded.
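
The "reads and processes files" step of etl.py can be pictured with the sketch below. It is not the project's actual code: it assumes the data directories contain `.json` files, possibly in nested folders, with one JSON object per line, and both `find_json_files` and `read_records` are hypothetical helpers.

```python
import json
from pathlib import Path


def find_json_files(root):
    """Return every .json file under root, searching subfolders, in sorted order."""
    return sorted(Path(root).rglob("*.json"))


def read_records(path):
    """Yield one dict per non-empty line (assumes one JSON object per line)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)
```

A loader could iterate over `find_json_files("song_data")`, pass each path to `read_records`, and insert the resulting dictionaries using the queries defined in sql_queries.py.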

Author

Aditya Dhanraj - LinkedIn profile.
