Working with Britta, we extracted CSV files, transformed the data into a format that could easily be uploaded to a PostgreSQL database, and loaded the data into a single database.
First, we extracted the backer info dataset and performed some minor transformations to give the data a more structured format; the raw CSV held all of its data in a single column.
We converted each row into a dictionary, built a usable DataFrame from those dictionaries, and exported the DataFrame to a CSV file.
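The row-to-dictionary step above can be sketched as follows. This is a minimal sketch, not the project's actual notebook code: the sample rows, column names, and output filename are assumptions, standing in for the real contents of backer_info.csv.

```python
import json
import pandas as pd

# Hypothetical rows as they might appear in backer_info.csv:
# each row holds all of its data as one JSON string in a single column.
raw_rows = [
    '{"backer_id": "av166", "cf_id": 968, "name": "Angelo Vincent", "email": "avincent@live.com"}',
    '{"backer_id": "ha127", "cf_id": 563, "name": "Hubert Arnold", "email": "harnold@yahoo.com"}',
]

# Convert each row into a dictionary, then build a usable DataFrame.
dict_rows = [json.loads(row) for row in raw_rows]
backers_df = pd.DataFrame(dict_rows)

# Export the structured data to a new CSV.
backers_df.to_csv("backers.csv", index=False)
```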
Once all the data was extracted into a usable format, we transformed it: we split the name column into first- and last-name columns, dropped the original name column, and exported the updated DataFrame to a CSV.
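The name-splitting transform might look like the sketch below; the sample data and column names are assumptions, not the project's actual values.

```python
import pandas as pd

# Hypothetical backers DataFrame with a combined "name" column.
backers_df = pd.DataFrame({
    "backer_id": ["av166", "ha127"],
    "name": ["Angelo Vincent", "Hubert Arnold"],
    "email": ["avincent@live.com", "harnold@yahoo.com"],
})

# Split "name" into first and last name columns (one split, from the left),
# then drop the original combined column.
backers_df[["first_name", "last_name"]] = backers_df["name"].str.split(" ", n=1, expand=True)
backers_df = backers_df.drop(columns=["name"])

# Export the updated DataFrame to a CSV.
backers_df.to_csv("backers_clean.csv", index=False)
```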
To load the data, we used an ERD to design the tables in Quick Database Diagrams (QuickDBD), then exported the schema to PostgreSQL for analysis.
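A schema along these lines could be created as sketched below. The table and column names are assumptions based on the description, and Python's built-in sqlite3 stands in for PostgreSQL so the example is self-contained and runnable.

```python
import sqlite3

# In-memory database as a stand-in for the PostgreSQL instance.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Minimal sketch of the exported schema (names are assumptions).
cur.executescript("""
CREATE TABLE contacts (
    contact_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    email      TEXT NOT NULL
);
CREATE TABLE campaign (
    cf_id        INTEGER PRIMARY KEY,
    contact_id   INTEGER REFERENCES contacts(contact_id),
    company_name TEXT,
    description  TEXT,
    goal         REAL,
    pledged      REAL,
    outcome      TEXT,
    end_date     TEXT
);
""")

# List the tables that were created.
tables = [row[0] for row in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```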
Finally, we wrote queries against these tables to analyze the data.
- First, we created a table that joined the campaign and contacts tables on ID, including the distinct first name, last name, email address, and amount left to reach the goal for each contact's live projects.
- Second, we created a table by joining two tables on ID, including each backer's email address, first and last name, and ID, along with the company name, description, end date of the campaign, and remaining amount of the campaign goal.
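The first of the joins above might be expressed roughly as follows. This is a sketch under assumed table and column names, with sqlite3 standing in for PostgreSQL so it runs standalone; the inserted sample row is hypothetical.

```python
import sqlite3

# In-memory stand-in database with one hypothetical contact and campaign.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY,
                       first_name TEXT, last_name TEXT, email TEXT);
CREATE TABLE campaign (cf_id INTEGER PRIMARY KEY,
                       contact_id INTEGER, goal REAL, pledged REAL,
                       outcome TEXT, end_date TEXT);
INSERT INTO contacts VALUES (1, 'Angelo', 'Vincent', 'avincent@live.com');
INSERT INTO campaign VALUES (10, 1, 5000.0, 1200.0, 'live', '2021-03-01');
""")

# Join contacts to campaign on contact ID, keeping distinct names/emails
# and the amount still needed to reach the goal for live projects.
rows = cur.execute("""
    SELECT DISTINCT co.first_name, co.last_name, co.email,
           ca.goal - ca.pledged AS remaining_goal_amount
    FROM contacts AS co
    JOIN campaign AS ca ON co.contact_id = ca.contact_id
    WHERE ca.outcome = 'live';
""").fetchall()
```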
There is a lot to learn in the ETL process. The main takeaways from this section were hands-on experience with ETL and combining different programs to build a complete database.
- Extract_Transform_final_code: Jupyter Notebook that extracts and transforms backer_info.csv for input into a SQL database.
- Crowdfunding_SQL_Analysis: SQL workbook that loads the newly created tables into our database and queries the results.