Merge pull request #7 from Paperspace/new-template
New template
joshua-paperspace authored Oct 10, 2022
2 parents 8b5335a + d3ec4e9 commit 9090823
Showing 2 changed files with 50 additions and 43 deletions.
17 changes: 8 additions & 9 deletions README.md
@@ -11,14 +11,13 @@ We assume a generic advanced data science user who probably wants GPU access, bu
| Category | Software | Version | Install Method | Why / Notes |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| GPU | NVidia Driver | 515.65.01 | pre-installed | Enable Nvidia GPUs. Latest version as of VM creation date |
-| | CUDA | 11.7.1 | Conda | Nvidia A100 GPUs require CUDA 11+ to work, so 10.x is not suitable |
+| | CUDA | 11.7.1 | Install script | Nvidia A100 GPUs require CUDA 11+ to work, so 10.x is not suitable |
+| | CUDA toolkit | 11.7.1 | Apt | Needed for `nvcc` command for cuDNN |
| | cuDNN | 8.5.0.*-1+cuda11.7 | Ubuntu repo | Nvidia GPU deep learning library |
-| | CUDA toolkit | 11.7.1 | Conda | Needed for `nvcc` command for cuDNN. Installed with Conda CUDA installation |
-| Infra | Anaconda | 4.14.0 | Ubuntu repo | Package management system that installs Python3, pip, and other python packages |
-| | Docker Engine CE | 20.10.8, build 3967b7d | pre-installed | Docker Engine community edition |
+| Infra | Docker Engine CE | 20.10.8, build 3967b7d | pre-installed | Docker Engine community edition |
| | NVidia Docker | 2.6.0-1 | pre-installed | Enable NVidia GPU in Docker containers |
-| Python | Python | 3.9.12 | Conda | Most widely used programming language for data science. Version 3.9.12 is installed when downloading Anaconda3 and is compatible with other software and their versions installed here. |
-| | pip3 | 22.2.2 | Conda | Enable easy installation of 1000s of other data science, etc., packages. Installed with Anaconda3 installation. |
+| Python | Python | 3.9.14 | Apt | Most widely used programming language for data science |
+| | pip3 | 22.2.2 | Apt | Enable easy installation of 1000s of other data science, etc., packages. |
| | NumPy | 1.23.2 | pip3 | Handle arrays, matrices, etc., in Python |
| | SciPy | 1.9.1 | pip3 | Fundamental algorithms for scientific computing in Python |
| | Pandas | 1.4.4 | pip3 | De facto standard for data science data exploration/preparation in Python |
@@ -43,6 +42,7 @@ We assume a generic advanced data science user who probably wants GPU access, bu
| | opencv-python | 4.6.0.66 | pip3 | Includes several hundreds of computer vision algorithms |
| | pyyaml | 5.4.1 | pip3 | YAML parser and emitter for Python |
| | JupyterLab | 3.4.6 | pip3 | De facto standard for data science using Jupyter notebooks |
+| | wandb | 0.13.4 | pip3 | CLI and library to interact with the Weights & Biases API (model tracking) |
| Machine Learning | Scikit-learn | 1.1.2 | pip3 | Widely used ML library for data science, generally for smaller data or models |
| | Scikit-image | 0.19.3 | pip3 | Collection of algorithms for image processing |
| | TensorFlow | 2.9.2 | pip3 | Most widely used deep learning library, alongside PyTorch |
@@ -59,7 +59,6 @@ We assume a generic advanced data science user who probably wants GPU access, bu
| --------------- | ------------- | ------------- |
| CUDA | NVidia EULA | https://docs.nvidia.com/cuda/eula/index.html |
| cuDNN | NVidia EULA | https://docs.nvidia.com/deeplearning/cudnn/sla/index.html |
-| Anaconda | Other | https://legal.anaconda.com/policies/en/?name=end-user-license-agreements#anaconda-distribution |
| Docker Engine | Apache 2.0 | https://github.com/moby/moby/blob/master/LICENSE |
| JupyterLab | New BSD | https://github.com/jupyterlab/jupyterlab/blob/master/LICENSE |
| Matplotlib | PSF-based | https://matplotlib.org/stable/users/license.html |
@@ -98,6 +97,7 @@ We assume a generic advanced data science user who probably wants GPU access, bu
| jsonify | MIT | https://pypi.org/project/jsonify/0.5/#data |
| opencv-python | MIT | https://github.com/opencv/opencv-python/blob/4.x/LICENSE.txt |
| pyyaml | MIT | https://github.com/yaml/pyyaml/blob/master/LICENSE |
+| wandb | MIT | https://github.com/wandb/wandb/blob/master/LICENSE |


Information about license types:
@@ -111,7 +111,6 @@ ISC: https://opensource.org/licenses/ISC

Open source software can be used for commercial purposes: https://opensource.org/docs/osd#fields-of-endeavor.

-Note: Anaconda has its own End User Licensing agreements around commercial use

## Software not included

@@ -134,7 +133,7 @@ Some generic categories of software not included:
| Connectors | Academic Torrents | |
| Dashboarding | panel, dash, voila, streamlit | |
| Databases | MySQL, Hive, PostgreSQL, Prometheus, Neo4j, MongoDB, Cassandra, Redis | No particular infra to connect to databases |
-| Deep Learning | Caffe, Caffe2, Theano, PaddlePaddle, Chainer, Torch, MXNet | PyTorch and TensorFlow are dominant, rest niche |
+| Deep Learning | Caffe, Caffe2, Theano, PaddlePaddle, Chainer, MXNet | PyTorch and TensorFlow are dominant, rest niche |
| Deployment | Dash, TFServing, R Shiny, Flask | Use Gradient Deployments |
| Distributed. | Horovod, OpenMPI | Use Gradient distributed |
| Feature store | Feast | |
76 changes: 42 additions & 34 deletions ml_in_a_box.sh
@@ -3,7 +3,7 @@
# ==================================================================
# Module list
# ------------------------------------------------------------------
-# python 3.9.12 (conda)
+# python 3.9.14 (apt)
# torch 1.12.1 (pip)
# torchvision 0.13.1 (pip)
# torchaudio 0.12.1 (pip)
@@ -39,6 +39,7 @@
# opencv-python 4.6.0.66 (pip)
# pyyaml 5.4.1 (pip)
# sentence-transformers 2.2.2 (pip)
+# wandb 0.13.4 (pip)
# nodejs 16.x latest (apt)
# default-jre latest (apt)
# default-jdk latest (apt)
@@ -50,8 +51,7 @@

# Set ENV variables
export APT_INSTALL="apt-get install -y --no-install-recommends"
-export PIP_INSTALL="python3 -m pip --no-cache-dir install --upgrade"
-export CONDA_INSTALL="conda install -y"
+export PIP_INSTALL="python -m pip --no-cache-dir install --upgrade"
export GIT_CLONE="git clone --depth 10"

# Update apt
@@ -75,6 +75,7 @@
rsync \
git \
vim \
+mlocate \
libssl-dev \
curl \
openssh-client \
@@ -102,40 +103,49 @@


# ==================================================================
-# Conda
+# Python
# ------------------------------------------------------------------

-#Based on https://docs.anaconda.com/anaconda/install/linux/
+#Based on https://launchpad.net/~deadsnakes/+archive/ubuntu/ppa

-sudo $APT_INSTALL libgl1-mesa-glx libegl1-mesa libxrandr2 libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
-sudo wget https://repo.anaconda.com/archive/Anaconda3-2022.05-Linux-x86_64.sh
-sudo bash ~/Anaconda3-2022.05-Linux-x86_64.sh -b -p $HOME/anaconda3

-sudo chown -R $USER:$USER $HOME/anaconda3
-sudo chmod -R +x $HOME/anaconda3

-source $HOME/anaconda3/bin/activate
-conda init bash
-conda deactivate

-export PATH=$HOME/anaconda3/bin:${PATH}

-$PIP_INSTALL pip

-rm -f ~/Anaconda3-2022.05-Linux-x86_64.sh
+# Adding repository for python3.9
+DEBIAN_FRONTEND=noninteractive \
+sudo $APT_INSTALL software-properties-common
+sudo add-apt-repository ppa:deadsnakes/ppa -y

+# Installing python3.9
+DEBIAN_FRONTEND=noninteractive sudo $APT_INSTALL \
+python3.9 \
+python3.9-dev \
+python3.9-venv \
+python3-distutils-extra

+# Add symlink so python and python3 commands use same python3.9 executable
+sudo ln -s /usr/bin/python3.9 /usr/local/bin/python3
+sudo ln -s /usr/bin/python3.9 /usr/local/bin/python

+# Installing pip
+curl -sS https://bootstrap.pypa.io/get-pip.py | python3.9
+export PATH=$PATH:/home/paperspace/.local/bin


# ==================================================================
-# Installing CUDA packages (CUDA Toolkit 11.7.1 & CUDNN 8.5.0)
+# Installing CUDA packages (CUDA Toolkit 11.8.0 & CUDNN 8.5.0)
# ------------------------------------------------------------------

-$CONDA_INSTALL -c nvidia/label/cuda-11.7.1 cuda
+# Based on https://developer.nvidia.com/cuda-toolkit-archive
+# Based on https://developer.nvidia.com/rdp/cudnn-archive

+wget https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda_11.7.1_515.65.01_linux.run
+sudo sh cuda_11.7.1_515.65.01_linux.run --silent --toolkit
+export PATH=$PATH:/usr/local/cuda-11.7/bin
+export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64
+rm cuda_11.7.1_515.65.01_linux.run

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
sudo apt-get update
sudo apt-get install libcudnn8=8.5.0.*-1+cuda11.7
sudo apt-get install libcudnn8-dev=8.5.0.*-1+cuda11.7
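
The new `export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64` line in this hunk assigns the variable outright, so any value already present in the environment is replaced. A minimal sketch, not part of this commit, of a helper that prepends a directory to a colon-separated list only when it is missing. The name `prepend_dir` is hypothetical and chosen for illustration.

```shell
# Hypothetical helper (not in the commit): print DIR prepended to the
# colon-separated LIST, unless DIR is already one of its entries.
prepend_dir() {
    dir=$1
    current=$2
    case ":$current:" in
        # DIR already present: return the list unchanged.
        *":$dir:"*) printf '%s\n' "$current" ;;
        # Otherwise put DIR first; add the colon only if LIST is non-empty.
        *) printf '%s\n' "${dir}${current:+:$current}" ;;
    esac
}
```

Under these assumptions, a call such as `export LD_LIBRARY_PATH=$(prepend_dir /usr/local/cuda-11.7/lib64 "$LD_LIBRARY_PATH")` would keep existing entries while still placing the CUDA libraries first.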

@@ -164,7 +174,7 @@

# Based on https://www.tensorflow.org/install/pip

-export LD_LIBRARY_PATH=${HOME}/anaconda3/lib
+# export LD_LIBRARY_PATH=${HOME}/anaconda3/lib
$PIP_INSTALL tensorflow==2.9.2


@@ -218,7 +228,8 @@
jsonify==0.5 \
opencv-python==4.6.0.66 \
pyyaml==5.4.1 \
-sentence-transformers==2.2.2
+sentence-transformers==2.2.2 \
+wandb==0.13.4


# ==================================================================
@@ -245,24 +256,21 @@

sudo curl -sL https://deb.nodesource.com/setup_16.x | sudo bash
sudo $APT_INSTALL nodejs
-$PIP_INSTALL jupyter_contrib_nbextensions jupyterlab-git && \
-DEBIAN_FRONTEND=noninteractive jupyter contrib nbextension install --sys-prefix
+$PIP_INSTALL jupyter_contrib_nbextensions jupyterlab-git
+DEBIAN_FRONTEND=noninteractive jupyter contrib nbextension install --user


# ==================================================================
# Config & Cleanup
# ------------------------------------------------------------------

-rm $HOME/anaconda3/lib/libtinfo.so.6
-rm $HOME/anaconda3/lib/libncursesw.so.6

echo "export PATH=${PATH}" >> ~/.bashrc
-echo "export LD_LIBRARY_PATH=${HOME}/anaconda3/lib" >> ~/.bashrc
+echo "export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> ~/.bashrc

echo "export PATH=${PATH}" >> ~/.profile
-echo "export LD_LIBRARY_PATH=${HOME}/anaconda3/lib" >> ~/.profile
+echo "export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> ~/.profile

echo "export PATH=${PATH}" >> ~/.bash_profile
-echo "export LD_LIBRARY_PATH=${HOME}/anaconda3/lib" >> ~/.bash_profile
+echo "export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> ~/.bash_profile
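
The cleanup step appends its `export` lines to `~/.bashrc`, `~/.profile`, and `~/.bash_profile` with `>>`, so each re-run of the script would add the same lines again. A minimal sketch, not part of this commit, of an append that skips lines already present. The name `append_once` is hypothetical and chosen for illustration.

```shell
# Hypothetical helper (not in the commit): append LINE to FILE only if
# FILE does not already contain exactly that line.
append_once() {
    line=$1
    file=$2
    # Create the file if it is missing so grep does not fail on the path.
    [ -e "$file" ] || : > "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex).
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Under these assumptions, `append_once "export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" ~/.bashrc` could stand in for the corresponding `echo ... >> ~/.bashrc` line. The comparison is on the fully expanded string, so a run with a different `LD_LIBRARY_PATH` would still add a new line.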

