Use Jupyter, SciPy, pandas, and scikit-learn without installing any Pythons.
kingofsnake is a template for a reproducible data analysis lab which:
- serves Jupyter notebooks from a Docker container.
- never conflicts with existing Python(s), Anaconda, or virtualenvs.
- includes pinned versions of these packages and their dependencies:
networkx
notebook
pandas[all]
requests
scikit-learn
scipy
seaborn
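To confirm which versions are actually installed inside a running container, a check like the one below can be run from a notebook cell. It is a minimal sketch that uses only the standard library's importlib.metadata. The function name installed_versions is hypothetical and not part of this repo.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names from the list above (the "[all]" extra is not part of the name).
PACKAGES = [
    "networkx", "notebook", "pandas", "requests",
    "scikit-learn", "scipy", "seaborn",
]

def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing this output with requirements.txt is one way to check that an image matches its pins.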
See the books folder for examples of:
- data cleaning
- unsupervised clustering
- supervised classification
- principal component analysis
- force-directed graph drawing
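As a taste of one of these topics, here is a minimal clustering sketch. It is not copied from the notebooks: the data are synthetic and the choice of Ward linkage from scipy.cluster.hierarchy is an assumption made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic data (made up for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(20, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(20, 2)),
])

# Build the merge tree with Ward linkage, then cut it into two flat clusters.
tree = linkage(points, method="ward")
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

Each blob should end up with its own label because the blobs are far apart relative to their spread.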
Generate a new repo from this template.
- Open a terminal and cd to this folder.
- Edit the Dockerfile to choose a Python version.
- Edit requirements.txt to choose Python packages.
- Run ./kitchen bake to build a kingofsnake:latest Docker image.
- Run ./kitchen freeze to update requirements.txt and rebuild.
- Open a terminal and cd to this folder.
- Run ./kitchen serve to start a Jupyter server.
- Open a web browser and enter localhost:8888 in the address bar.
This runs Jupyter in a container, publishes port 8888, and mounts some folders from this repo:
- etc/ipython is mounted as /home/kos/.ipython
- etc/jupyter is mounted as /home/kos/.jupyter
- books is mounted as /home/kos/books
- code is mounted as /home/kos/code
- data is mounted as /home/kos/data
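The sketch below shows how the port and folder mounts listed above could translate into a docker run command. It is an assumption about what such a command might look like, not a copy of the kitchen script. The function name docker_run_args is hypothetical, and the --rm flag is a guess based on the containers being described as self-destructing.

```python
from pathlib import Path

# Host folder (relative to the repo) -> mount point inside the container,
# copied from the list above.
MOUNTS = {
    "etc/ipython": "/home/kos/.ipython",
    "etc/jupyter": "/home/kos/.jupyter",
    "books": "/home/kos/books",
    "code": "/home/kos/code",
    "data": "/home/kos/data",
}

def docker_run_args(repo, image="kingofsnake:latest", port=8888):
    """Build an argument list for `docker run` (assumed, not read from kitchen)."""
    repo = Path(repo).resolve()
    # --rm removes the container on exit; --publish maps host port to container port.
    args = ["docker", "run", "--rm", "--publish", f"{port}:{port}"]
    for host, target in MOUNTS.items():
        # --volume takes "absolute/host/path:/container/path".
        args += ["--volume", f"{repo / host}:{target}"]
    return args + [image]

if __name__ == "__main__":
    print(" ".join(docker_run_args(".")))
```

Printing the command rather than running it makes it easy to compare against whatever ./kitchen serve actually executes.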
Jupyter security: On the first run, Jupyter might ask you to copy and paste a token and create a password. It saves the hashed password and any custom settings to etc/ipython and etc/jupyter in this repo. If those folders do not exist, they are created automatically. Git ignores the contents of both folders.
- Open a terminal and cd to this folder.
- ./kitchen clean stops and deletes all kingofsnake containers.
- ./kitchen eightysix deletes the kingofsnake:latest image.
The clean command is rarely necessary because kingofsnake containers self-destruct.
The books folder contains example notebooks:
- classify.ipynb trains and tests an sklearn.linear_model classifier.
- clean.ipynb standardizes, sorts, and filters pandas DataFrames.
- cluster.ipynb finds clusters with scipy.cluster.hierarchy.
- components.ipynb finds principal components with sklearn.decomposition.PCA.
- graph.ipynb draws graphs using the ForceAtlas2 energy model.
- plot.ipynb uses matplotlib to visualize data.
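For orientation, training and testing an sklearn.linear_model classifier can look roughly like this. The sketch is not taken from classify.ipynb: the synthetic dataset and the choice of LogisticRegression are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-class data (made up for illustration).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out a quarter of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit on the training rows, then score on the held-out rows.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Fixing random_state keeps the split and the data reproducible between runs, which fits the pinned-versions approach of this template.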
The code folder contains example Python modules:
- classify.py for classification
- cluster.py for clustering
- graph.py for graph drawing
- plot.py for data visualization
- tools.py for constants and convenience methods
The data folder is for storing data files. Git ignores everything in it except a few examples.
kingofsnake has one dependency: Docker.
Windows users may need to edit the kitchen script for path compatibility.
Show all available kitchen commands.
./kitchen help
Run a container as root without Jupyter, folder mounts, or published ports.
./kitchen runit latest
Bake another image called kingofsnake:karl, freeze it, and serve Jupyter.
./kitchen bake karl
./kitchen freeze karl
./kitchen serve karl
Delete the kingofsnake:karl image and its containers:
./kitchen eightysix karl
Don't install anything. Use this repo as a template.
Click on the terminal running Jupyter and press CTRL-C.
Yes. See the Docker run reference.
No. Delete them if you want to.