This is the homepage for the SAILORS 2017 NLP research project. Here you can find links to all class materials used for the research project.
Instructors: Abi See (abisee@stanford.edu), Sebastian Schuster (sebschu@stanford.edu)
- Day 1: Introduction to NLP
- Day 2: Rule-based classifiers
- Day 3: Evaluation metrics (Exercise sheet here)
- Day 4: Probability theory and Bayes rule (Exercise sheet here)
- Day 5 morning: Naive Bayes classifier
- Day 5 afternoon: More NLP
- Day 6 morning: Naive Bayes classifier for Twitter project
- Day 6 afternoon: Neural Networks
- Day 7: Wrap-up
- Install Anaconda.
Anaconda is a Python distribution that makes it easy to install additional Python packages and manage different Python versions. You can download Anaconda from https://www.continuum.io/. Make sure to download the Python 2.7 version! This should also automatically install Jupyter Notebook, which you'll need to run the notebooks.
- Install numpy, nltk, and pandas:
Open a Terminal window and type
conda install nltk numpy pandas
- Copy ("clone") the GitHub repository to your computer:
Open a Terminal window and type
git clone https://github.com/abisee/sailors2017
This will copy all the notebooks to your computer.
- Change into the directory:
In the same Terminal window, type
cd sailors2017
- Download the tokenizer models:
Start a Python console by typing
python
in the Terminal window. Then run the following commands:
import nltk
nltk.download("punkt")
exit()
- Run Jupyter Notebook:
jupyter notebook
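If a notebook later fails with an import or lookup error, it may help to check that the setup steps above took effect in the Python you are actually running. The following is a minimal sketch, not part of the course materials: the helper names check_packages and punkt_available are made up for illustration, and it assumes the tokenizer models were installed under NLTK's usual "tokenizers/punkt" resource name.

```python
import importlib


def check_packages(names):
    """Return a dict mapping each module name to True if it can be imported."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


def punkt_available():
    """Return True if NLTK can locate the "punkt" tokenizer models.

    Assumption: the models live under the resource name "tokenizers/punkt",
    which is where nltk.download("punkt") normally places them.
    """
    try:
        import nltk
        nltk.data.find("tokenizers/punkt")
        return True
    except (ImportError, LookupError):
        return False


if __name__ == "__main__":
    # The three packages installed with conda in the steps above.
    status = check_packages(["nltk", "numpy", "pandas"])
    for name, ok in sorted(status.items()):
        print("{}: {}".format(name, "OK" if ok else "MISSING"))
    print("punkt models: {}".format("OK" if punkt_available() else "MISSING"))
```

Save this under any file name inside sailors2017 and run it with python from the same Terminal window. A MISSING entry suggests repeating the corresponding install or download step.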
The directory "filled" contains versions of the IPython notebooks with the solutions filled in. If you would like to run these, you need to copy them to the main directory (i.e. sailors2017), overwriting the blank versions of the notebooks that are currently there. Then run jupyter notebook and you should be able to access the completed versions of the notebooks.
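The copy step can also be done with a short script instead of by hand. This is only a sketch under two assumptions that the text above does not state explicitly: that the solutions directory sits directly inside sailors2017, and that the notebooks use the usual .ipynb extension. The function name copy_filled_notebooks is hypothetical.

```python
import glob
import os
import shutil


def copy_filled_notebooks(repo_dir, filled_name="filled"):
    """Copy every .ipynb file from repo_dir/filled_name into repo_dir.

    Existing notebooks with the same file name are overwritten.
    Returns the sorted list of file names that were copied.
    """
    filled_dir = os.path.join(repo_dir, filled_name)
    copied = []
    for src in glob.glob(os.path.join(filled_dir, "*.ipynb")):
        name = os.path.basename(src)
        shutil.copy(src, os.path.join(repo_dir, name))
        copied.append(name)
    return sorted(copied)


if __name__ == "__main__":
    # Assumes the script is run from inside the sailors2017 directory.
    for name in copy_filled_notebooks("."):
        print("copied {}".format(name))
```

Because this overwrites the blank notebooks, any answers you have already typed into them will be lost. Copy your own versions somewhere else first if you want to keep them.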