(Last Update on April 7, 2021)
- Overview
- Create and Configure Your Account
- Claim CS231N GCP credits
- Request an Increase in GPU Quota
- Set Up Google Cloud VM Image
- Access Your Newly created VM
- Remote Server Development
For your class project, we recommend setting up a GPU instance on GCP (Google Cloud Platform).
(We know you won't read until the very bottom once your assignment is running, so we are printing this at the top too since it is super important)
Don't forget to stop your instance when you are done (by clicking on the stop button at the top of the page showing your instances), otherwise you will run out of credits and that will be very sad. :(
If you follow our instructions below correctly, you should be able to restart your instance and the downloaded software will still be available.
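As a minimal sketch, the same stop action is also available from the command line once the Google Cloud SDK is installed (covered later in this tutorial). The VM name is a placeholder, and the zone `us-west1-b` is simply the one used in the SSH example further down; replace both with your own values.

```shell
# Placeholders: replace with your own VM name and zone.
VM_NAME="<YOUR_VM_NAME>"
ZONE="us-west1-b"

# List instances and their status; a RUNNING instance is consuming credits.
gcloud compute instances list

# Stop the instance. The disk and the installed software are kept.
gcloud compute instances stop "$VM_NAME" --zone "$ZONE"

# Start it again when you come back to work.
gcloud compute instances start "$VM_NAME" --zone "$ZONE"
```

A stopped instance no longer incurs compute charges, although its persistent disk is still billed at a much lower rate.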
While Colab is good for assignments, and is still a helpful and free tool for experimentation for your project, you will likely need a dedicated GPU instance when you start training on large datasets and collaborating as a team:
- Colab will disconnect after 12 hours or ~30 min of idling (and you will lose your unsaved data). A GCP VM instance will not disconnect until you stop it (or run out of credits).
- A GCP VM instance's disk space allows you to deal with larger datasets. In Colab's case, you will have to save all your data and models to Google Drive.
- Colab does not innately support real-time collaboration.
- On GCP, you can choose your GPU model and attach more than one GPU for distributed training.
You should use your personal Gmail account for GCP, i.e. NOT SUID@stanford.edu, because Stanford University managed email accounts do not support creating a new project.
For the class project, we offer each student a $50 GCP coupon to use Google Compute Engine for developing and testing your implementations. When you first sign up on GCP, you will also have $300 in free credits.
If your credits end up not being enough, contact the course staff on Piazza. We will also send out a request form for extra GCP credits later in the quarter.
This tutorial lists the necessary steps for working on the projects using Google Cloud. We expect this tutorial to take up to an hour. Don't be intimidated by the number of steps: we tried to make the tutorial detailed so that you are less likely to get stuck on a particular step. Please tag all questions related to Google Cloud with google_cloud on Piazza.
You should receive $300 in credits from Google when you first sign up with your personal Gmail account and UPGRADE it to a full account. Please try to use the resources judiciously.
- Create a Google Cloud account by going to the Google Cloud homepage. Click on the blue Get Started for free button. Sign into your Gmail account. Here is an illustrative example.
- Choose Account type to be Individual. You will then fill in your name, address and credit card information.
- Click the "Google Cloud Platform" (in red circle), and it will take you to the main project dashboard:
- On the main project dashboard, you can change the name of your project by clicking Go to project settings.
- To add project collaborators, click ADD PEOPLE TO THIS PROJECT. Add their email addresses and set their role to owner.
- Upgrade your account by following these instructions in order to use GPUs. The Google Cloud Free Tier does not come with GPU support or quota.
NOTE: By now, you should have created a GCP account registered with your personal Gmail account and logged in to it.
- We will release the $50 GCP credits Google form on Piazza. After you complete the form, you will see a link to the Google Cloud Education Grants page. It requires your Stanford email to receive the credits. (These credits can be applied to your GCP account registered with your personal Gmail.)
- After submission, you should receive an email from GCP with a link to confirm your email address. Click the link to verify your Stanford email.
- You will soon receive another email from GCP with a link that applies the $50 credits to your account. After that, the website will jump to your Billing page, where you should see that you are linked to the CS231N billing account with $50 in free credits.
- Switching billing accounts from Free Tier credits to CS231N credits: Google Cloud does not support combining credits. You will need to switch billing accounts if you want to use both sources of gcloud credits.
  For example, you can use up the $300 free credits first, then switch to the CS231N billing account by referring to this GCloud documentation.
Your account typically does not come with GPU quota. You have to explicitly request it under IAM Admin > Quotas.
Please request the quota increase ASAP, because requests can take anywhere from a couple of minutes to a week to process! If you don't have GPU quota, you will have to create a CPU-only VM first and create another GPU VM later, as explained in the next section.
You will need to change your quota for GPU (all regions).
- Select Limit name from the dropdown. Then select GPUs (all regions) from the next prompted dropdown. Click the checkbox for Global in the menu to the right, and click ALL QUOTAS.
- Select the checkbox to the left of the first item in the table, and click EDIT QUOTAS. Set the New limit to 1, and make the Request description "Stanford CS 231N class project".
- Wait until GCP sends you a second email (the first email only confirms that they received the request) that looks like this. Approval could take anywhere from a couple of minutes to a couple of days.
- First, make sure you upgrade your free tier account to a full account following these instructions.
- If you just registered a Google Cloud account, GCP can be slow to set up its Compute Engine API services (this is the service that provides GPU access, so the GPU quota won't show up before it is ready).
  One way to speed up the Compute Engine API setup is to visit the VM instances page by clicking Compute Engine > VM instances.
  If you see that Compute Engine is not ready yet, wait a couple of minutes until you see something like the screenshot below. The GPU-related quota should now show up in IAM Admin > Quotas.
- For region-specific GPUs: Check that you have a default zone and region set under Compute Engine > Settings > Region / Zone. Some zones do not have certain GPU resources. Check pricing and spec for GCP GPUs to find the availability of GPU resources.
More instructions at General quota instructions and Step-by-step GPU-specific walk-through (all answers in the link are useful)
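The region, zone, and quota checks above can also be sketched from the command line, assuming the Google Cloud SDK is installed (covered later in this tutorial). The project ID is a placeholder, `us-west1` / `us-west1-b` are only example locations, and the sketch assumes the quota appears under the metric name `GPUS_ALL_REGIONS` in the project description.

```shell
PROJECT_ID="<YOUR_PROJECT_ID>"   # placeholder: replace with your project ID

# Set a default region and zone for later gcloud commands on this machine.
gcloud config set compute/region us-west1
gcloud config set compute/zone us-west1-b

# List the GPU types offered in that zone.
gcloud compute accelerator-types list --filter="zone:us-west1-b"

# Show the project-wide GPU quota entry (limit and current usage).
gcloud compute project-info describe --project "$PROJECT_ID" | grep -B 1 -A 1 GPUS_ALL_REGIONS
```

Note that `gcloud config set` only changes the defaults of your local gcloud installation; it is separate from the default set in the console under Compute Engine > Settings.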
- Go to this gcloud marketplace. You may (or may not) be taken to a page where you have to click on "Launch", and then you should see a configuration sheet with the title "New Deep Learning VM deployment".
- Fill in the `Deployment name` field with your preferred VM name.
- In the `Machine type` box, click `Customize`.
- Choose your desired number of CPUs and memory (if you are unsure, keep the default).
- For `GPU type`, `NVIDIA Tesla K80` is typically enough. `P100` and `V100` are much more expensive (check the price on the right), but they are also faster and have more memory. Check pricing and spec for GCP GPUs. GPU drivers and CUDA will be automatically installed only if you select at least 1 GPU.
- In the `Frameworks` field, choose the most recent version of TensorFlow or PyTorch, depending on which framework you plan to use.
- Check the box `Install NVIDIA GPU driver automatically on first startup?`.
- Check the box `Enable access to JupyterLab via URL instead of SSH. (Beta)`.
- Leave all other options as default.
- Click the blue button `Deploy` at the end of the page. It will automatically start your instance, so if you don't need to use it now, stop it immediately.
Your configuration sheet should look similar to the image below. Use exactly the same configuration for the fields in red boxes. For fields in orange boxes, you can adjust the values based on your project needs, as discussed below.
Pay attention to the monthly price, and make sure you claim only the hardware resources you need so that you can use your GCP instance for longer. Once you run out of credits, the VM instance will be shut down automatically and you might lose unsaved data and models. If you are almost out of credits, contact the course staff.
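For readers who prefer the terminal, here is a rough sketch of creating a comparable GPU VM with gcloud instead of the Marketplace form. It is not equivalent to the Marketplace deployment (for example, it does not configure the JupyterLab URL option). The image family `pytorch-latest-gpu` is an assumption; you can list the families actually available with `gcloud compute images list --project deeplearning-platform-release`. The VM name is a placeholder and the zone is only an example.

```shell
VM_NAME="<YOUR_VM_NAME>"   # placeholder
ZONE="us-west1-b"          # example zone; pick one that offers your GPU type

# Create a VM from a Deep Learning VM image with one K80 GPU attached.
# GPU instances cannot live-migrate, hence --maintenance-policy TERMINATE.
# The metadata entry asks the image to install the NVIDIA driver on first boot.
gcloud compute instances create "$VM_NAME" \
  --zone "$ZONE" \
  --image-family "pytorch-latest-gpu" \
  --image-project "deeplearning-platform-release" \
  --accelerator "type=nvidia-tesla-k80,count=1" \
  --maintenance-policy TERMINATE \
  --metadata "install-nvidia-driver=True"
```

As with the Marketplace route, the instance starts running (and billing) as soon as it is created.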
- You can always change the number of CPUs, number of GPUs, CPU memory, and GPU type after your VM has been created.
- Just stop your instance, then go to your VM instance's details at Compute Engine > VM instances > [click on instance name].
- Click "edit" on your VM's page to modify the settings. Finally, click "Save".
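The CPU and memory part of this change can be sketched with gcloud as well. The VM name is a placeholder, and `n1-standard-8` (8 vCPUs, 30 GB memory) is only an example machine type. This sketch does not cover changing the GPU, which is easiest to do from the edit page described above.

```shell
VM_NAME="<YOUR_VM_NAME>"   # placeholder
ZONE="us-west1-b"          # example zone

# The machine type can only be changed while the instance is stopped.
gcloud compute instances stop "$VM_NAME" --zone "$ZONE"

# Switch to a different machine type (example choice, not a recommendation).
gcloud compute instances set-machine-type "$VM_NAME" --zone "$ZONE" --machine-type n1-standard-8

# Start the instance again with the new configuration.
gcloud compute instances start "$VM_NAME" --zone "$ZONE"
```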
Wait until the deployment is finished. You should see a running VM with a green checkmark next to it on your Compute Engine page.
We need to tweak a few more settings to enable remote access to Jupyter Notebook.
- You must stop the instance first.
- Go to your VM instance's details at Compute Engine > VM instances > [click on instance name]. Click "edit" on your VM's page.
- Select "Allow HTTP traffic" and "Allow HTTPS traffic".
- Scroll all the way down and click the blue button "save".
- Go to the Firewall config page.
- Click "CREATE FIREWALL RULE".
- Give it an arbitrary name, such as `cs231n`.
- In the `Targets` field, select `All instances in the network`.
- In `Source IP ranges`, enter `0.0.0.0/0`.
- In the `Protocols and ports` field, select "Specified protocols and ports". Then check `tcp` and enter `7000-9000`.
- Click the blue button `Create`.
- Restart your instance on the Compute Engine page.
Your configuration sheets should look similar to below:
Firewall Rules:
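The firewall rule from the steps above could also be created from the command line. This is a sketch that mirrors the console values (rule name `cs231n`, TCP ports 7000-9000, source range `0.0.0.0/0`) and assumes your VM is on the project's default network.

```shell
# Create an ingress rule for TCP ports 7000-9000 from any source address.
# With no target tags, the rule applies to all instances in the network.
gcloud compute firewall-rules create cs231n \
  --direction INGRESS \
  --allow tcp:7000-9000 \
  --source-ranges 0.0.0.0/0

# Confirm that the rule exists.
gcloud compute firewall-rules list --filter="name=cs231n"
```

The range `0.0.0.0/0` accepts connections from any address on the internet, so the Jupyter password you set later is the only protection. If you know your own IP address, entering it as the source range is a stricter alternative.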
If you want to have a static IP for ease of access, you can change the External IP address of your Google Compute Engine instance to be static (see screenshot below).
To do this, click on the three-line icon next to the Google Cloud Platform button in the top left corner of your screen, then go to VPC network > External IP addresses (see screenshot below).
To have a static IP address, change Type from Ephemeral to Static. Enter your preferred name for your static IP (ours is `cs231n-ip`, see screenshot below), then click on Reserve.
NOTE: At the end of CS 231N when you don't need your instance anymore, release the static IP address because Google charges a small fee for unused static IPs (according to this page).
Take note of your Static IP address (circled on the screenshot below). We use 35.185.240.182 for this tutorial.
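A command-line sketch of the same reservation follows. It reuses the name `cs231n-ip` and the example address `35.185.240.182` from this tutorial; the region `us-west1` is an assumption based on the zone `us-west1-b` used in the SSH example, so substitute the region your VM actually lives in.

```shell
REGION="us-west1"   # assumed: the region that contains zone us-west1-b

# Promote the VM's current ephemeral address to a static one named cs231n-ip.
gcloud compute addresses create cs231n-ip --addresses 35.185.240.182 --region "$REGION"

# List reserved addresses and whether each one is in use.
gcloud compute addresses list

# At the end of the course, release the address so it stops incurring fees.
gcloud compute addresses delete cs231n-ip --region "$REGION"
```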
Now that you have created your GCE virtual machine, you want to be able to connect to it from your computer. The rest of this tutorial goes over how to do that using the command line.
To access gcloud commands in your local terminal, install the Google Cloud SDK that is appropriate for your platform and follow its instructions.
If the `gcloud` command is not in your system path after installation, you can also reference it by its full path `/<DIRECTORY-WHERE-GOOGLE-CLOUD-IS-INSTALLED>/bin/gcloud`. See this page for more detailed instructions.
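After installation, a first-time setup might look like the sketch below. The project ID is a placeholder; when prompted to log in, use the personal Gmail account that owns your GCP project.

```shell
# Confirm the SDK is installed and on your PATH.
gcloud --version

# Interactive setup: log in and pick a default project.
gcloud init

# Alternatively, set the project explicitly, then review the active configuration.
gcloud config set project "<YOUR_PROJECT_ID>"
gcloud config list
```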
To ssh into your VM, go to your VM instance details page by clicking on its name. Start the VM instance first. Once it shows a green check mark, click on the drop-down arrow next to SSH and select `View gcloud command` to retrieve the terminal command. It should look like:
gcloud compute --project "<YOUR_PROJECT_ID>" ssh --zone "us-west1-b" "<YOUR_VM_NAME>"
You should now be able to run `nvidia-smi` and see the list of attached GPUs and their usage statistics. Run `watch nvidia-smi` to monitor your GPU usage in real time.
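If you want a more compact view than the full `nvidia-smi` table, for example to keep a log while training, one option is its query mode. The selection of fields below is just an example.

```shell
# Print selected GPU fields as CSV, refreshing every 5 seconds (Ctrl-C to stop).
nvidia-smi --query-gpu=name,utilization.gpu,memory.used,memory.total --format=csv -l 5
```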
If you wish, you can use Jupyter Notebook to experiment in your projects. Below, we discuss how to run Jupyter Notebook from your GCE instance and connect to it with your local browser.
After you SSH into the VM for the first time, you need to run a few commands in your home directory. You will be asked to set up a password for your Jupyter Notebook.
git clone https://github.com/cs231n/gcloud.git
cd gcloud/
chmod +x ./setup.sh
./setup.sh
Now you can run Jupyter Notebook from the folder with your assignment files.
jupyter notebook
The default port is `8888`, specified in `~/.jupyter/jupyter_notebook_config.py`.
You can connect to your Jupyter session from your personal laptop. Check the external IP address of your instance, say it is `35.185.240.182`. Open any browser and visit `35.185.240.182:8888`. The login password is the one you set with the setup script above.
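As an alternative sketch that does not depend on the external IP address, you can forward the Jupyter port through SSH. This assumes Jupyter is listening on port 8888 on the VM; the VM name is a placeholder and the zone is the example one from the SSH command above.

```shell
# Forward local port 8888 to port 8888 on the VM over SSH.
# Everything after the bare "--" is passed to the underlying ssh command.
gcloud compute ssh "<YOUR_VM_NAME>" --zone "us-west1-b" -- -L 8888:localhost:8888
```

While this SSH session stays open, visiting `localhost:8888` in your local browser reaches the notebook server on the VM.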
- Inside the `gcloud/` folder, run `python verify_gpu.py`. If your GPU is attached and CUDA is correctly installed, you shouldn't see any error.
- If you want to use TensorFlow, run `python test_tf.py`. The script will show you the installed TensorFlow version and then run a sample MNIST training. You should see around 97% accuracy at the end.
To transfer a file, for instance `file.zip`, from your GCE instance to your local laptop, there is an easy command:
gcloud compute scp <user>@<instance-name>:/path/to/file.zip /local/path
For example, to download files from our instance to the current folder:
gcloud compute scp tonystark@cs231n:/home/shared/file.zip .
The transfer works in both directions. To upload a file to your instance:
gcloud compute scp /my/local/file tonystark@cs231n:/home/shared/
If you would like to transfer an entire folder, you will need to add the recursive flag:
gcloud compute scp --recursive /my/local/folder tonystark@cs231n:/home/shared/
You can use Tmux to keep the training sessions running when you close your laptop. Also, if your collaborators log into the same account on the VM instance, they will see the same tmux session screen in real time.
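A typical tmux workflow on the VM might look like the sketch below. The session name `train` is a hypothetical example.

```shell
# Start a named session on the VM.
tmux new -s train

# Inside the session, launch your training job, then detach with
# Ctrl-b followed by d. The job keeps running after you log out.

# Later (after reconnecting over SSH), list sessions and reattach.
tmux ls
tmux attach -t train
```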
You can develop your code directly on the remote server if you are comfortable with vim or emacs.
You can also develop locally in your favorite editor, push to your branch on GitHub, and pull on the remote server to run. (Committing frequently is also good Git practice.)
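The push-then-pull loop can be sketched as follows, assuming the repository is already cloned on both machines and `origin` points to GitHub. The branch name `my-feature` is a hypothetical example.

```shell
# On your laptop: commit your changes and push them to your branch.
git add -A
git commit -m "Describe the change"
git push origin my-feature

# On the VM: fetch the branch, switch to it, and bring it up to date.
git fetch origin
git checkout my-feature
git pull origin my-feature
```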
Besides `gcloud compute scp`, another tool you can check out is rsync, which can synchronize files and folders between your local machine and the remote server.
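A minimal rsync sketch is shown below. It assumes that `<user>` is your username on the VM, that `35.185.240.182` is the VM's external IP (the example address from this tutorial), that `my_project` is a hypothetical folder name, and that gcloud has already created the SSH key `~/.ssh/google_compute_engine` (it normally does so the first time you connect with gcloud).

```shell
# Copy the local folder to my_project/ in your home directory on the VM.
# -a preserves file attributes, -v lists files, -z compresses in transit.
# Only files that changed since the last run are sent again.
rsync -avz -e "ssh -i $HOME/.ssh/google_compute_engine" \
  ./my_project/ "<user>@35.185.240.182:my_project/"
```

Adding `--dry-run` prints what would be transferred without copying anything, which is a cheap way to check the paths first.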
Don't forget to stop your instance when you are done (by clicking on the stop button at the top of the page showing your instances). You can restart your instance and the downloaded software will still be available.
We have seen students who left their instances running for many days and ran out of credits. You will be charged per hour when your instance is running. This includes code development time. We encourage you to read up on Google Cloud, regularly keep track of your credits and not solely rely on our tutorials.
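To get a feel for how quickly credits disappear, here is a back-of-the-envelope calculation. The hourly rate is a made-up number for illustration only; look up the real rate for your VM configuration on the GCP pricing page.

```shell
# Hypothetical numbers for illustration only.
credits_cents=5000        # the $50 course credit from this tutorial, in cents
rate_cents_per_hour=50    # ASSUMED rate of $0.50 per hour, not a real quote

# Integer arithmetic: total hours of runtime, then whole days at 24 hours/day.
hours=$(( credits_cents / rate_cents_per_hour ))
days=$(( hours / 24 ))

echo "Credits last about ${hours} hours (${days} full days) if the VM never stops."
```

Under these assumed numbers, an instance left running around the clock would use up the course credit in roughly four days.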