All code was taken from Max deGroot's and Ellis Brown's ssd.pytorch repository, except for the `object_detection.py` file. However, some modifications were made so that this project runs on Windows 10 and Python 3.6 with PyTorch 0.4.1 for CUDA 9.2.
Local Machine Specs |
---|
Windows 10 |
NVIDIA GeForce GTX 850M |
CUDA 10.0 (Download) |
cuDNN 10.0 (Download) |
Even though PyTorch 0.4.1 for CUDA 9.2 was installed, the library also works with CUDA 10.0.
Before you start, please refer to the original repository for instructions on how to use this code properly.
Basically, you will need to:
- Download the datasets and the pretrained VGG-16 base network (both described in the original repository).
- Install PyTorch by visiting the website and choosing your specifications.
- Install OpenCV, NumPy and imageio by executing the following line in your Terminal/Command Prompt:
pip install -r requirements.txt
The following modifications have been made to successfully execute `train.py`:
In `train.py`, line 203 (line 165 in the original repo) was changed from:
images, targets = next(batch_iterator)
to:
try:
    images, targets = next(batch_iterator)
except StopIteration:
    batch_iterator = iter(data_loader)
    images, targets = next(batch_iterator)
The fix was copied from this comment.
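The restart-on-exhaustion pattern above can be tried without PyTorch. The following is a minimal sketch in which a plain list stands in for `data_loader` (an assumption made only for illustration; in `train.py` it is a PyTorch data loader), and `next_batch` is a hypothetical helper name that does not exist in the repository:

```python
def next_batch(batch_iterator, data_loader):
    """Return (batch, iterator), restarting the iterator once it is exhausted."""
    try:
        batch = next(batch_iterator)
    except StopIteration:
        # The loader has been fully consumed: start a new pass over the data.
        batch_iterator = iter(data_loader)
        batch = next(batch_iterator)
    return batch, batch_iterator


# Stand-in for a data loader that yields (images, targets) pairs.
data_loader = [("images_0", "targets_0"), ("images_1", "targets_1")]

batch_iterator = iter(data_loader)
seen = []
for iteration in range(5):  # more iterations than batches in a single pass
    (images, targets), batch_iterator = next_batch(batch_iterator, data_loader)
    seen.append(images)

print(seen)
```

Without the `try`/`except`, the third call to `next` would raise `StopIteration` and end training after a single pass over the dataset.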
Fixed the naming of the saved model in `train.py` on lines 239 and 244 (lines 196 and 198 in the original repo).
In `layers/modules/multibox_loss.py`, add `loss_c = loss_c.view(pos.size()[0], pos.size()[1])` on line 97 like so:
# Hard Negative Mining
loss_c = loss_c.view(pos.size()[0], pos.size()[1])
loss_c[pos] = 0 # filter out pos boxes for now
loss_c = loss_c.view(num, -1)
and then change `N = num_pos.data.sum()` to `N = num_pos.data.sum().float()` on line 115.
In `layers/functions/detection.py`, line 62 was changed from:
if scores.dim() == 0:
    continue
to:
if scores.size(0) == 0:
    continue
If you are training on a Windows machine, make sure to set the `--num_workers` flag to `0`, or you will get a `BrokenPipeError: [Errno 32] Broken pipe` error. On my machine, I also need to close all other programs (except the Command Prompt, of course) and set the batch size to 2 and the learning rate to 0.000001 (1e-6) in order to train the model; otherwise I get a `RuntimeError: CUDA error: out of memory` error.
python train.py --num_workers 0 --batch_size 2 --lr 1e-6
Since training on my local machine with the settings/flags above would take days (or even weeks) to get reasonable results, I decided to train the SSD on an AWS spot instance.
To set up an AWS spot instance, follow these steps:
- Log in to your Amazon AWS account
- Navigate to EC2 > Instances > Spot Requests > Request Spot Instances
- Under `AMI`, click on `Search for AMI`, type `AWS Deep Learning AMI` in the search field, choose `Community AMIs` from the drop-down and select the `Deep Learning AMI (Ubuntu) Version 14.0`
- Delete the default instance type, click on Select and select the p2.xlarge instance
- Uncheck the `Delete` checkbox under EBS Volumes so your progress is not deleted when the instance gets terminated
- Set Security Groups to default
- Select your key pair under Key pair name (if you don't have one, create a new key pair)
- At the very bottom, set `Request valid until` to about 10 to 12 hours and check `Terminate instances at expiration` (You don't have to do this, but keep in mind that you may receive a very large bill from AWS if you forget to terminate your spot instance, because the request stays valid for 1 year by default.)
- Click `Launch`, wait until the instance is created and then connect to your instance via ssh
There's also a detailed explanation from AWS about AWS Deep Learning AMIs. You might give it a shot as well.
Once your spot instance is up and running and you have connected to it, activate the PyTorch environment like so:
source activate pytorch_p36
Lastly, clone this repository, proceed with the installation process (except for PyTorch) and start training by executing:
## you probably don't need to add any arguments here
python train.py
If you don't want to train an SSD model and only want to try the detection, you can download my trained SSD model. I trained the model with all default values/parameters from the original repository but stopped the training after 1500 iterations because the loss stagnated.
To detect objects in a video, you first need to install `ffmpeg` by executing the following line:
conda install ffmpeg -c conda-forge
Note: This command only works if you have the Anaconda Distribution installed on your computer.
After you have trained the SSD model, you can detect objects in a video by executing the following line in your Terminal/Command Prompt:
python object_detection.py path_to/your_ssd_model.pth path_to/your_video.mp4 -o name_of_your_output_video.mp4
If the `-o` flag is not specified, the output video will simply be named `output.mp4`.
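A command-line interface like the one described above could be parsed along the following lines. This is only a sketch using Python's standard `argparse` module; the names `build_parser`, `model_path`, `video_path` and the long option `--output` are assumptions for illustration and may differ from what `object_detection.py` actually uses:

```python
import argparse


def build_parser():
    """Build a parser with two positional paths and an optional output name."""
    parser = argparse.ArgumentParser(
        description="Detect objects in a video with a trained SSD model."
    )
    parser.add_argument("model_path", help="path to the trained SSD weights (.pth)")
    parser.add_argument("video_path", help="path to the input video")
    # Falls back to output.mp4 when -o is not given.
    parser.add_argument("-o", "--output", default="output.mp4",
                        help="name of the output video")
    return parser


with_flag = build_parser().parse_args(["ssd.pth", "in.mp4", "-o", "result.mp4"])
without_flag = build_parser().parse_args(["ssd.pth", "in.mp4"])

print(with_flag.output)
print(without_flag.output)
```

The `default` argument is what gives the fallback behaviour: when `-o` is omitted, the parsed value is `output.mp4`.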
You can watch sample outputs from here: