This repo provides a sample of a Custom Vision Service ONNX model running in a docker container on an NVIDIA Jetson AGX Xavier device. It utilizes the NVIDIA GPU for predictions. The resulting docker image will be compatible with Live Video Analytics on IoT Edge.
Note: this repo is updated often to fix issues, as it is a work in progress. Please excuse the iterations. It will be noted here when things are more stable. Thank you for your patience.
Jetson AGX Xavier
- Jetson AGX Xavier flashed with JetPack 4.4 (L4T R32.4.3) with all ML and CV tools (including `nvidia-docker`)
- 8-16 GB swap file (on the NVMe mount if using one, otherwise the main storage disk) for expanding memory during prediction
- Additional NVMe (512 GB SSD used)
- Set docker to use the NVMe drive for docker images and containers
Software on device
- OS: Ubuntu 18.04 LTS
- JetPack 4.4
- Azure CLI installed on Jetson to push AI docker image
- Default is `nvidia-docker` (do not need Moby or Docker CE)
- IoT Edge runtime 1.0.10.x
Notes
- Building and running the docker images will be done with `nvidia-docker`
- A swap file will be used to expand memory by 8-16 GB on the SSD (see setup details below).
General getting started guides for all Jetson platforms: https://developer.nvidia.com/embedded/learn/getting-started-jetson
- Additional SSD, such as a 512 GB NVMe drive (for the Xavier)
- A "host" machine with Ubuntu 16.04 or 18.04 (e.g. an AMD64-based Lenovo Thinkpad) for NVIDIA SDK Manager
- Mouse and keyboard (wired USB)
- Monitor (HDMI)
- [Recommended] USB hub for peripherals
- Note: USB devices share power and the more you have, the more power gets used
- Download the NVIDIA SDK Manager for JetPack 4.4 (not the SD card method) to the host machine (an NVIDIA Developer membership may be required)
IMPORTANT:
- Plug in all peripherals and host before plugging into power
- Ensure you select all ML tools, CV tools, and `nvidia-docker`.
Note: The manual flashing method was chosen in this guide when using the NVIDIA SDK Manager. Follow the instructions online.
Note: other Jetson devices may not support an NVMe drive with an M.2 connector.
- Follow this tutorial to install the NVMe SSD (512 GB used here) on the Xavier: https://www.jetsonhacks.com/2018/10/18/install-nvme-ssd-on-nvidia-jetson-agx-developer-kit/.
In the terminal, partition, format, and mount the SSD as follows (example).
cat /proc/partitions
(echo o; echo n; echo p; echo 1; echo ""; echo ""; echo w; echo q) | sudo fdisk /dev/nvme0n1
cat /proc/partitions
sudo mkfs.ext4 /dev/nvme0n1p1
# Look at the mkfs output (or run: sudo blkid) to get your own UUID
sudo vim /etc/fstab
# UUID=<block id> /media/nvme ext4 defaults 0 0
# For example:
# UUID=13024e1e-cfa1-4f99-a40b-d3cdccc37869 /media/nvme ext4 defaults 0 0
sudo mkdir /media/nvme
sudo chmod 777 /media/nvme
sudo mount /media/nvme/
sudo reboot
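After the reboot, it's worth confirming that the new partition is mounted (a quick sanity check; device names may differ on your setup).
df -h /media/nvme
lsblk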
Here, we will modify the `/etc/docker/daemon.json` file to add the data root for docker assets (pointing it to the NVMe SSD) and the NVIDIA runtime setting.
Create the docker directory and open the docker daemon config file with a text editor.
sudo mkdir /media/nvme/docker
sudo vim /etc/docker/daemon.json
Ensure that `/etc/docker/daemon.json` includes the following.
{
"data-root": "/media/nvme/docker",
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
}
}
And restart:
sudo systemctl restart docker
Give the user sudo permission to run docker by following https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user.
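For reference, the steps from that guide boil down to adding your user to the docker group (log out and back in for the change to take effect).
sudo groupadd docker
sudo usermod -aG docker $USER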
Create a swap file on the NVMe SSD so that there will be enough memory to load and run machine learning models and the necessary containers. Create a 16 GB file according to the following: https://linuxize.com/post/how-to-add-swap-space-on-ubuntu-18-04/#creating-a-swap-file (using `/media/nvme/swapfile` as the file location).
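For convenience, here is a minimal sketch of the swap file creation following that guide (a 16 GB file on the NVMe mount; adjust the size as needed):
sudo fallocate -l 16G /media/nvme/swapfile
sudo chmod 600 /media/nvme/swapfile
sudo mkswap /media/nvme/swapfile
sudo swapon /media/nvme/swapfile
# Make the swap file persistent across reboots
echo '/media/nvme/swapfile swap swap defaults 0 0' | sudo tee -a /etc/fstab
# Verify the swap file is active
sudo swapon --show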
Additionally, the swappiness affecting the swap file (e.g., `/media/nvme/swapfile`) may be increased with the following (Source).
First, check the value of swappiness. The number can be from 0-100, but the default is usually 60.
sudo cat /proc/sys/vm/swappiness
Temporarily set swappiness to a more aggressive number (it will tend to use the swap more) with the following (note: this will not persist after reboot).
sudo sysctl -w vm.swappiness=90
Once you know what swappiness value works best, you may set it permanently by adding or replacing this line in the `/etc/sysctl.conf` file (try a higher value to prioritize using the swap file, e.g., 90 is used here).
vm.swappiness=90
Swappiness values may need to be tested and adjusted as needed.
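To apply the value from /etc/sysctl.conf without rebooting, and to check current swap usage:
sudo sysctl -p
free -h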
It is nice to have Python for local testing, and it is required for the Azure CLI to work. However, most work will be done from docker containers, and one can always run a bash shell inside a running container for Python.
Note: Python3 may already be on the device; you can check by typing `python3` on the command line. However, `pip3` is often not installed.
Install `python3` and `pip3` as shown here. Then, run `ln -s /usr/bin/python3 /usr/bin/python` after the `apt-get install` (shown after the install commands below).
sudo apt-get update
sudo apt-get install -y \
build-essential \
libffi-dev \
libssl1.0.0 \
libssl-dev \
python3-dev \
python3-pip
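Then add the symlink mentioned above so that `python` points to `python3`:
sudo ln -s /usr/bin/python3 /usr/bin/python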
For some packages, a more recent `pip` and `setuptools` will be needed. Update these with the following commands:
pip3 install --upgrade pip
pip3 install --upgrade setuptools
There are two ways to install the Azure CLI on Linux. The first is the way recommended in Azure docs. The second is through PyPI, the Python package repo. Here, PyPI is used and works well.
Install the Azure CLI as a Python package as follows (you will need Python3 and Pip3 installed on your system).
pip3 install azure-cli
By default, the `az` command will be at `$HOME/.local/bin/az`. You may wish to add this directory to your `PATH`.
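For example, to do this for the current user (assuming a bash shell):
echo 'export PATH=$HOME/.local/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
az --version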
Note: This is roughly following the Tutorial: Analyze live video with Live Video Analytics on IoT Edge and Azure Custom Vision doc where you can find the video and necessary data to train a toy truck detector model.
Follow Quickstart: Build an object detector with the Custom Vision website to train an object detector with your images.
Notes
- Use object detection and "General (compact)" as the "Domain" ("compact" will ensure we can export for IoT Edge and that the model is of a smaller architecture)
- If there are multiple classes, ensure a balanced dataset, that is, the same number of images in each class for best performance (other tips from Microsoft)
- Export as ONNX --> ONNX Float16 (the float16 means the model has been quantized to decrease its memory footprint)
- Locate the `model.onnx` and `labels.txt` files in the downloaded folder.
Example of hardhat detection (two-class) with Custom Vision (using the "Quick Test" feature after training):
- Place the `model.onnx` model file and the `labels.txt` labels file into the `customvision-linux-arm/app` folder from this repo (if there is a `labels.txt` already, just overwrite it with the newly exported one).
Note: this will build the image based on the Dockerfile (`onnx.dockerfile`). Use a tag that is not "latest" so that it's easy to upgrade as needed (e.g., shown here as `0.0.1`).
cd customvision-linux-arm
nvidia-docker build -t objectdetection:0.0.1 -f onnx.dockerfile .
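When the build completes, the image should appear locally (a quick check):
docker images objectdetection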
Use an image file similar to your training dataset in the following instructions.
Use the following command to start the container.
nvidia-docker run --name my_cvs_container -p 127.0.0.1:80:80 -d objectdetection:0.0.1
Use the following command to predict on an image (use the full path of an image on the system, replacing the `<>` placeholder with the path).
curl -X POST http://127.0.0.1:80/image -F imageData=@<full path to a test image file that has the object(s)>
The results will look like:
{"created":"2021-01-30T00:30:42.895140","id":"","iteration":"","predictions":[{"boundingBox":{"height":0.12187065,"left":0.76290076,"top":0.32214137,"width":0.05968696},"probability":0.78185642,"tagId":0,"tagName":"no_hardhat"}],"project":""}
TIP: The positions indicated by "left" and "top" refer to the distances from the top-left corner of the image, and the bounding box values are normalized fractions of the image width and height.
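For example, to convert the normalized bounding box values into pixel coordinates, the response can be saved and post-processed with a small Python snippet (a sketch; replace the placeholder path and the hard-coded image dimensions with your own):
curl -s -X POST http://127.0.0.1:80/image -F imageData=@<full path to a test image file> -o prediction.json
python3 - <<'EOF'
import json
# Replace with your test image's actual pixel dimensions
img_w, img_h = 1920, 1080
with open('prediction.json') as f:
    result = json.load(f)
for p in result['predictions']:
    b = p['boundingBox']
    x = int(b['left'] * img_w)
    y = int(b['top'] * img_h)
    w = int(b['width'] * img_w)
    h = int(b['height'] * img_h)
    print(p['tagName'], round(p['probability'], 2), 'x=%d y=%d w=%d h=%d' % (x, y, w, h))
EOF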
To test that the container is using the GPU, check the Troubleshooting section for instructions on using `jtop` during a prediction.
- Retag the docker image according to the name of your Azure Container Registry, e.g.,
docker tag objectdetection:0.0.1 myacr.azurecr.io/objectdetection:0.0.1
TIP: Always use a unique tag (e.g. `:0.0.1`) each time you push a significant change to ACR - this makes it easy to keep track of iterations and ensure you are on the latest image in the IoT Edge runtime.
- This docker image can now be pushed to Azure Container Registry with the Azure CLI and used with Live Video Analytics on IoT Edge on a Jetson device registered with Azure IoT Hub (example push commands below).
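For example, assuming an ACR named `myacr` as in the retag step above:
az login
az acr login --name myacr
docker push myacr.azurecr.io/objectdetection:0.0.1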
IMPORTANT: You don't need Moby Engine/runtime because you already have the NVIDIA docker runtime (`nvidia-docker`).
- Install the IoT Edge security daemon according to Azure docs.
- Set the device connection string on the device by updating `/etc/iotedge/config.yaml` line 88 (as of this writing) with the proper `device_connection_string` from the desired device registration within the IoT Hub resource in the Azure Portal (see the sketch below).
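The manual provisioning section of `config.yaml` looks similar to the following sketch (the connection string is a placeholder); restart the daemon after saving.
# In /etc/iotedge/config.yaml (manual provisioning), similar to:
#   provisioning:
#     source: "manual"
#     device_connection_string: "<device connection string from the IoT Hub device registration>"
# After saving the file, restart the IoT Edge daemon:
sudo systemctl restart iotedge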
- Create the video input folder and video output folder and set permissions. For example:
sudo mkdir -p /live/mediaServer/media
sudo chmod -R ugo+rw /live/mediaServer/media
sudo mkdir /media/nvme/output
sudo chmod -R ugo+rw /media/nvme/output
- Download the test toy truck video to the Jetson device. This is the video that the RTSP simulator module/container will use.
cd /live/mediaServer/media
wget https://lvamedia.blob.core.windows.net/public/t2.mkv
- Clone the Live Video Analytics Python sample.
git clone https://github.com/Azure-Samples/live-video-analytics-iot-edge-python.git
- Follow the rest of the tutorial starting at https://docs.microsoft.com/en-us/azure/media-services/live-video-analytics-edge/custom-vision-tutorial?pivots=programming-language-python#examine-the-sample-files.
IMPORTANT:
- Ensure the "cv" ML IoT module has the following `createOptions` (see the included `deployment.customvision.arm64.template.json` deployment manifest): `"createOptions": { "HostConfig": {"runtime": "nvidia", "Privileged": true} }`
- Ensure the docker daemon config is using the NVIDIA runtime. This config may be found at `/etc/docker/daemon.json` and will look like (or similar to): `{ "runtimes": { "nvidia": { "path": "nvidia-container-runtime", "runtimeArgs": [] } } }`
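A quick way to confirm that Docker sees the NVIDIA runtime (output formatting may vary by Docker version):
docker info | grep -i runtime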
- Check that the GPU is being utilized
To check with a package called `jetson-stats`, Python3 is needed (on Jetson, `nvidia-smi` is not available). Install Python as follows.
sudo apt-get update
sudo apt-get install -y \
build-essential \
libffi-dev \
libssl1.0.0 \
libssl-dev \
python3-dev \
python3-pip
Install `jetson-stats` (more here):
sudo pip3 install jetson-stats
Run the tool to check resource management and usage:
jtop
- If the device is sluggish or appears frozen
To work with a sluggish device, try to SSH in from a host computer rather than using a USB-attached keyboard and mouse and an HDMI-attached monitor (turn off the device, detach all of these components, then power back on). This will usually make the device more responsive.
If the device is still slow, or the LVA module is throwing errors related to secrets even after using SSH, try the following (this deletes the app data for the LVA module and resets it):
- Stop IoT Edge runtime
sudo systemctl stop iotedge
- Delete the application data associated with LVA (the path is specified in the `.env` file on the dev machine and in the deployment manifest as well), e.g. `sudo rm -fr /var/lib/azuremediaservices/*`
- Start IoT Edge runtime
sudo systemctl start iotedge
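After restarting, you can confirm that the edge modules come back up with the IoT Edge CLI:
sudo iotedge list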
- If prediction is unacceptably slow
To speed up prediction, adjust the input resolution on line 18 of `<path to repo>/jetson-customvision/customvision-linux-arm/app/object_detection.py` (https://github.com/michhar/jetson-customvision/blob/main/customvision-linux-arm/app/object_detection.py#L18) to be smaller, e.g., `416 * 416`.