-
Notifications
You must be signed in to change notification settings - Fork 61
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add first draft of Accelerate example (#178)
* Add first draft of Accelerate example * Update links * Update links and fix minimum version of accelerate * Update links
- Loading branch information
1 parent
49b61e4
commit 8ab9e78
Showing
1 changed file
with
338 additions
and
0 deletions.
There are no files selected for viewing
338 changes: 338 additions & 0 deletions
338
integrations/model-training/accelerate/notebooks/Comet_and_Accelerate.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,338 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "PWVljpddz_vN" | ||
}, | ||
"source": [ | ||
"<img src=\"https://cdn.comet.ml/img/notebook_logo.png\">" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "A0-thQauBRRL" | ||
}, | ||
"source": [ | ||
"[Comet](https://www.comet.com/site/products/ml-experiment-tracking/??utm_source=accelerate&utm_medium=colab&utm_campaign=comet_examples) is an MLOps Platform that is designed to help Data Scientists and Teams build better models faster! Comet provides tooling to track, Explain, Manage, and Monitor your models in a single place! It works with Jupyter Notebooks and Scripts and most importantly it's 100% free to get started!\n", | ||
"\n", | ||
"[Hugging Face Accelerate](https://github.com/huggingface/accelerate) was created for PyTorch users who like to write the training loop of PyTorch models but are reluctant to write and maintain the boilerplate code needed to use multi-GPUs/TPU/fp16.\n", | ||
"\n", | ||
"Instrument Accelerate with Comet to start managing experiments, create dataset versions and track hyperparameters for faster and easier reproducibility and collaboration.\n", | ||
"\n", | ||
"[Find more information about our integration with Accelerate](https://www.comet.ml/docs/v2/integrations/ml-frameworks/transformers/?utm_source=accelerate&utm_medium=colab&utm_campaign=comet_examples)\n", | ||
"\n", | ||
"Curious about how Comet can help you build better models, faster? Find out more about [Comet](https://www.comet.com/site/products/ml-experiment-tracking/?utm_source=accelerate&utm_medium=colab&utm_campaign=comet_examples) and our [other integrations](https://www.comet.com/docs/v2/integrations/overview/?utm_source=accelerate&utm_medium=colab&utm_campaign=comet_examples)\n", | ||
"\n", | ||
"Get a preview for what's to come. Check out a completed experiment created from this notebook [here](https://www.comet.com/examples/comet-example-accelerate-notebook/0373dd068a484105b16c4053407f1bb2/?utm_source=accelerate&utm_medium=colab&utm_campaign=comet_examples).\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "V2UZtdWitSLf" | ||
}, | ||
"source": [ | ||
"# Install Dependencies" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "vIQsPNvatQIU" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"%pip install \"comet_ml>=3.31.5\" torch torchvision tqdm \"accelerate>=0.17.0\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "lpCFdN33tday" | ||
}, | ||
"source": [ | ||
"# Initialize Comet" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "kGyz_i-dtfk4" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import comet_ml\n", | ||
"\n", | ||
"comet_ml.init()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "zMVITKxktj7H" | ||
}, | ||
"source": [ | ||
"# Import Dependencies" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "cuRPreREp8y0" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from accelerate import Accelerator\n", | ||
"from torch.autograd import Variable\n", | ||
"from tqdm import tqdm\n", | ||
"import torch\n", | ||
"import torch.nn as nn\n", | ||
"import torch.nn.functional as F\n", | ||
"import torchvision.datasets as datasets\n", | ||
"import torchvision.transforms as transforms" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "a15YYRcKp9xV" | ||
}, | ||
"source": [ | ||
"# Define Parameters" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "rZ9biJqMtMq0" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"hyper_params = {\"batch_size\": 100, \"num_epochs\": 3, \"learning_rate\": 0.01}" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "htcACg9grtTe" | ||
}, | ||
"source": [ | ||
"# Load Data" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "crCfZX9Zrufn" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# MNIST Dataset\n", | ||
"train_dataset = datasets.MNIST(\n", | ||
" root=\"./data/\", train=True, transform=transforms.ToTensor(), download=True\n", | ||
")\n", | ||
"\n", | ||
"test_dataset = datasets.MNIST(\n", | ||
" root=\"./data/\", train=False, transform=transforms.ToTensor()\n", | ||
")\n", | ||
"\n", | ||
"# Data Loader (Input Pipeline)\n", | ||
"train_loader = torch.utils.data.DataLoader(\n", | ||
" dataset=train_dataset, batch_size=hyper_params[\"batch_size\"], shuffle=True\n", | ||
")\n", | ||
"\n", | ||
"test_loader = torch.utils.data.DataLoader(\n", | ||
" dataset=test_dataset, batch_size=hyper_params[\"batch_size\"], shuffle=False\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "vqJaYzGvryUz" | ||
}, | ||
"source": [ | ||
"# Define Model and Optimizer" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "pMRQESULrzXg" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"accelerator = Accelerator(log_with=\"comet_ml\")\n", | ||
"accelerator.init_trackers(\n", | ||
" project_name=\"comet-example-accelerate-notebook\", config=hyper_params\n", | ||
")\n", | ||
"device = accelerator.device\n", | ||
"\n", | ||
"\n", | ||
"class Net(nn.Module):\n", | ||
" def __init__(self):\n", | ||
" super(Net, self).__init__()\n", | ||
" self.conv1 = nn.Conv2d(1, 32, 3, 1)\n", | ||
" self.conv2 = nn.Conv2d(32, 64, 3, 1)\n", | ||
" self.dropout1 = nn.Dropout(0.25)\n", | ||
" self.dropout2 = nn.Dropout(0.5)\n", | ||
" self.fc1 = nn.Linear(9216, 128)\n", | ||
" self.fc2 = nn.Linear(128, 10)\n", | ||
"\n", | ||
" def forward(self, x):\n", | ||
" x = self.conv1(x)\n", | ||
" x = F.relu(x)\n", | ||
" x = self.conv2(x)\n", | ||
" x = F.relu(x)\n", | ||
" x = F.max_pool2d(x, 2)\n", | ||
" x = self.dropout1(x)\n", | ||
" x = torch.flatten(x, 1)\n", | ||
" x = self.fc1(x)\n", | ||
" x = F.relu(x)\n", | ||
" x = self.dropout2(x)\n", | ||
" x = self.fc2(x)\n", | ||
"\n", | ||
" return x\n", | ||
"\n", | ||
"\n", | ||
"model = Net().to(device)\n", | ||
"\n", | ||
"# Loss and Optimizer\n", | ||
"criterion = nn.CrossEntropyLoss()\n", | ||
"optimizer = torch.optim.Adam(model.parameters(), lr=hyper_params[\"learning_rate\"])\n", | ||
"\n", | ||
"model, optimizer, train_loader = accelerator.prepare(model, optimizer, train_loader)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "7hgzjMnYqZPQ" | ||
}, | ||
"source": [ | ||
"# Train a Model" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "zmb3DAsXqa4K" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"def train(model, optimizer, criterion, dataloader, epoch):\n", | ||
" model.train()\n", | ||
" total_loss = 0\n", | ||
" correct = 0\n", | ||
" for batch_idx, (images, labels) in enumerate(\n", | ||
" tqdm(dataloader, total=len(dataloader))\n", | ||
" ):\n", | ||
" optimizer.zero_grad()\n", | ||
" images = images.to(device)\n", | ||
" labels = labels.to(device)\n", | ||
"\n", | ||
" outputs = model(images)\n", | ||
"\n", | ||
" loss = criterion(outputs, labels)\n", | ||
" pred = outputs.argmax(\n", | ||
" dim=1, keepdim=True\n", | ||
" ) # get the index of the max log-probability\n", | ||
"\n", | ||
" accelerator.backward(loss)\n", | ||
" optimizer.step()\n", | ||
"\n", | ||
" # Compute train accuracy\n", | ||
" batch_correct = pred.eq(labels.view_as(pred)).sum().item()\n", | ||
" batch_total = labels.size(0)\n", | ||
"\n", | ||
" total_loss += loss.item()\n", | ||
" correct += batch_correct\n", | ||
"\n", | ||
" # Log batch_accuracy to Comet; step is each batch\n", | ||
" accelerator.log({\"batch_accuracy\": batch_correct / batch_total})\n", | ||
"\n", | ||
" total_loss /= len(dataloader.dataset)\n", | ||
" correct /= len(dataloader.dataset)\n", | ||
"\n", | ||
" # Log data at an epoch level\n", | ||
" accelerator.get_tracker(\"comet_ml\").tracker.log_metrics(\n", | ||
" {\"accuracy\": correct, \"loss\": total_loss}, epoch=epoch\n", | ||
" )" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "kpNDbdvQuPZw" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Train the Model\n", | ||
"print(\"Running Model Training\")\n", | ||
"\n", | ||
"max_epochs = hyper_params[\"num_epochs\"]\n", | ||
"for epoch in range(max_epochs + 1):\n", | ||
" print(\"Epoch: {}/{}\".format(epoch, max_epochs))\n", | ||
" train(model, optimizer, criterion, train_loader, epoch)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "F__x_H052a1E" | ||
}, | ||
"source": [ | ||
"# End Experiment " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "YsTMrwd_2cmo" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"accelerator.end_training()" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"accelerator": "GPU", | ||
"colab": { | ||
"collapsed_sections": [], | ||
"name": "Comet and Pytorch.ipynb", | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.1" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |