jonavellecuerdo/data_engineer_exercise

Data Engineer Exercise

The purpose of this exercise is to implement a script that manipulates and aggregates a large dataset.

Prerequisites

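This project relies on Python 3 and the following packages. Only pandas is a third-party package; os, csv, and datetime are part of the Python standard library and need no separate installation:

  • pandas
  • os
  • csv
  • datetime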

Instructions for installing Python 3 on your computer can be found here: https://www.python.org/downloads/
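
Before running the script, it can help to confirm that the modules listed above are importable. The following is a minimal sketch, not part of the repository: the helper name missing_modules is hypothetical, and the module list is simply copied from the prerequisites.

```python
import importlib.util
import sys

# Module names copied from the prerequisites list above.
REQUIRED_MODULES = ("pandas", "os", "csv", "datetime")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found by the import system.

    Hypothetical helper for illustration; it is not part of script.py.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3,):
        print("Python 3 is required.")
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If pandas is reported as missing, install it as described in the Installation section.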

Installation

  1. Clone the repo into your working directory:
git clone https://github.com/jonadata13/data_engineer_exercise.git
  2. Install pandas, the only listed package that is not part of the Python standard library, by running this command in your terminal:
pip install pandas

Usage

  1. Using your terminal, navigate to the project folder:
cd [path to data_engineer_exercise folder]
  2. Run script.py:
python script.py
  3. Verify that two CSV files have been saved to your current working directory:
  • people.csv
  • aquisition_facts.csv
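
The verification step can also be done programmatically. The sketch below is an assumption-laden illustration, not code from the repository: the helper name check_outputs is hypothetical, and it only checks that each expected file exists and reads its first row with the standard csv module. It makes no assumption about which columns script.py writes.

```python
import csv
from pathlib import Path

# Output file names as given in the usage steps above.
EXPECTED_OUTPUTS = ("people.csv", "aquisition_facts.csv")


def check_outputs(directory=".", expected=EXPECTED_OUTPUTS):
    """Map each expected file name to its first CSV row, or None if the file is missing.

    Hypothetical helper for illustration; it is not part of script.py.
    """
    results = {}
    for name in expected:
        path = Path(directory) / name
        if not path.is_file():
            results[name] = None
            continue
        with path.open(newline="") as handle:
            # An empty file yields an empty list rather than raising StopIteration.
            results[name] = next(csv.reader(handle), [])
    return results


if __name__ == "__main__":
    for name, first_row in check_outputs().items():
        if first_row is None:
            print(f"{name}: missing")
        else:
            print(f"{name}: found, first row = {first_row}")
```

Run it from the same directory in which you ran script.py, since the files are saved to the current working directory.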
