Greyhouse-Consulting/LAB4

Preconditions

  • Python 3.7 installed
  • PySpark installed and correctly configured
  • Knowledge of how to run Jupyter notebooks
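
The first two preconditions can be checked from Python before opening the notebook. The following is a minimal sketch, not part of the lab code: the function name check_preconditions is hypothetical, and treating 3.7 as a minimum version (rather than an exact requirement) is an assumption. It only tests that the pyspark package can be found, not that Spark is correctly configured.

```python
import importlib.util
import sys


def check_preconditions(min_version=(3, 7)):
    """Report whether the interpreter and PySpark preconditions appear to be met.

    Hypothetical helper: 3.7 is treated as a minimum version, which is an
    assumption, and only the presence of the pyspark package is checked.
    """
    return {
        # Compare (major, minor) of the running interpreter with the minimum.
        "python": sys.version_info[:2] >= min_version,
        # find_spec returns None when the top-level package is not installed.
        "pyspark": importlib.util.find_spec("pyspark") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_preconditions().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```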

Instructions

The code for this exercise is written as a Jupyter notebook file. To run it, follow these steps.

  1. Download the dataset from
    http://stat-computing.org/dataexpo/2009/the-data.html
    or
    https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HG7NV7
  2. Extract the files to a directory.
  3. Rename the files so that each name begins with the year, e.g. 2009_some_name.csv.
  4. Open lab4.ipynb and set the variable fileLocation to the directory created in step 2.
  5. Run lab4.ipynb in Jupyter Notebook.
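
Before running the notebook, it can help to confirm that the files from step 3 are named as expected. The sketch below is illustrative only and is not taken from lab4.ipynb: the helper name find_year_files is hypothetical, and it assumes the extracted files end in .csv and that "begins with the year" means a four-digit prefix.

```python
import re
from pathlib import Path

# Assumption: a file "begins with the year" if its name starts with four digits.
YEAR_PREFIX = re.compile(r"^(\d{4})")


def find_year_files(file_location):
    """Map each year to the CSV file names in file_location that start with it."""
    files_by_year = {}
    for path in sorted(Path(file_location).glob("*.csv")):
        match = YEAR_PREFIX.match(path.name)
        if match is None:
            # Files without a year prefix are skipped; rename them per step 3.
            continue
        year = int(match.group(1))
        files_by_year.setdefault(year, []).append(path.name)
    return files_by_year
```

Calling find_year_files with the same directory that fileLocation points to returns a dictionary keyed by year; an empty result suggests the files still need renaming or the path is wrong.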

About

Machine Learning With Big Data - Exercise 1
