The purpose of this exercise is to implement a script that manipulates and aggregates a large dataset.
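This project relies on Python 3 and the following packages:
- pandas (third-party, installed with pip)
- os (Python standard library)
- csv (Python standard library)
- datetime (Python standard library)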
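As a quick sanity check, the sketch below reports which of the listed packages cannot be imported in the current environment. The helper name `missing_modules` is hypothetical and not part of the repository.

```python
import importlib.util

# Packages named in this README; only pandas is third-party.
REQUIRED = ("pandas", "os", "csv", "datetime")


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found in this environment.

    Hypothetical helper for illustration; not part of the project.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If pandas shows up as not installed, install it with pip before running the script.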
Instructions for installing Python 3 on your computer can be found here: https://www.python.org/downloads/
- Clone the repo into your working directory:
git clone https://github.com/jonadata13/data_engineer_exercise.git
- Install pandas, the only third-party package (os, csv, and datetime ship with Python), by running this command in your terminal:
pip install pandas
- Using your terminal, navigate to the project folder:
cd [path to data_engineer_exercise folder]
- Run script.py:
python script.py
- Verify that two CSV files have been saved to your current working directory:
- people.csv
- aquisition_facts.csv
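
The verification step can also be done programmatically. The following is a minimal sketch, assuming the script writes both files to the directory you ran it from. The helper name `missing_outputs` is hypothetical, and the file names are copied as they appear in the list above.

```python
from pathlib import Path

# Output file names exactly as listed in this README.
EXPECTED_OUTPUTS = ("people.csv", "aquisition_facts.csv")


def missing_outputs(directory=".", names=EXPECTED_OUTPUTS):
    """Return the expected files that are absent or empty in `directory`.

    Hypothetical helper for illustration; not part of the project.
    """
    base = Path(directory)
    missing = []
    for name in names:
        path = base / name
        # Treat a zero-byte file as missing, since it holds no data.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("Both CSV files are present.")
```

Run it from the project folder after script.py finishes. An empty result means both files exist and contain data.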