Deploying Deep Learning

Welcome to our guide to deploying inference and embedded deep vision runtime library for NVIDIA Jetson TX1/TX2.

Included in this repo are resources for efficiently deploying neural networks into the field using NVIDIA TensorRT.

Vision primitives, such as imageNet for image recognition, detectNet for object localization, and segNet for segmentation, inherit from the shared tensorNet object. Examples are provided for streaming from live camera feed and processing images from disk. See the Deep Vision API Reference Specification for documentation accompanying this tutorial.

Read our recent Parallel ForAll post, NVIDIA Jetson TX2 Delivers Twice the Intelligence to the Edge.
Support for Image Segmentation models and training guide with aerial drone dataset.

note: this branch is verified against the following BSP versions for Jetson TX1/TX2:
             > Jetson TX2 - JetPack 3.0 / L4T R27.1 aarch64 (Ubuntu 16.04 LTS)
             > Jetson TX1 - JetPack 2.3 / L4T R24.2 aarch64 (Ubuntu 16.04 LTS)
             > Jetson TX1 - JetPack 2.3.1 / L4T R24.2.1 aarch64 (Ubuntu 16.04 LTS)

Note that TensorRT samples from the repo are intended for deployment on embedded Jetson TX1/TX2 module, however when cuDNN and TensorRT have been installed on the host side, the TensorRT samples in the repo can be compiled for PC.

DIGITS Workflow

New to deep neural networks (DNNs) and machine learning? Take this introductory primer on training and inference.

Using NVIDIA deep learning tools, it's easy to Get Started training DNNs and deploying them into the field with high performance. Discrete GPUs are typically used in a server, PC, or laptop for training with DIGITS, while Jetson and integrated GPU is used by embedded form factors.

NVIDIA DIGITS is used to interactively train network models on annotated datasets in the cloud or PC, while TensorRT and Jetson are used to deploy runtime inference in the field. TensorRT uses graph optimizations and half-precision FP16 support to more than double DNN inferencing. Together, DIGITS and TensorRT form an effective workflow for developing and deploying deep neural networks capable of implementing advanced AI and perception.

System Setup

During this tutorial, we will use a host PC (or AWS), for training DNNs, alongside a Jetson for inference. The host PC will also serve to flash the Jetson with the latest JetPack. First we'll setup and configure the host PC with the required OS and tools.

Installing Ubuntu on the Host

Download and install Ubuntu 16.04 x86_64 onto the host PC from one of the following locations:

http://releases.ubuntu.com/16.04/ubuntu-16.04.2-desktop-amd64.iso
http://releases.ubuntu.com/16.04/ubuntu-16.04.2-desktop-amd64.iso.torrent

Ubuntu 14.04 x86_64 may also be acceptable with minor modifications later while installing some packages with apt-get.

Running JetPack on the Host

Download the latest JetPack to the host PC. In addition to flashing the Jetson with the latest Board Support Package (BSP), JetPack automatically installs tools for the host like CUDA Toolkit. See the JetPack Release Notes for the full list of features and installed packages.

After downloading JetPack from the link above, run it from the host PC with the following commands:

$ cd <directory where you downloaded JetPack>
$ chmod +x JetPack-L4T-3.0-linux-x64.run 
$ ./JetPack-L4T-3.0-linux-x64.run

The JetPack GUI will start. Follow the step-by-step Install Guide to complete the setup. Near the beginning, JetPack will confirm which generation Jetson you are developing for.

Select Jetson TX1 if you are using TX1, or Jetson TX2 if you're using TX2, and press Next to continue.

The next screen will list the packages available to be installed. The packages installed to the host are listed at the top under the Host - Ubuntu dropdown, while those intended for the Jetson are shown near the bottom. You can select or deselect an individual package for installation by clicking it's Action column.

Since CUDA will be used on the host for training DNNs, it's recommended to select the Full install by click on the radio button in the top right. Then press Next to begin setup. JetPack will download and then install the sequence of packages. Note that all the .deb packages are stored under the jetpack_downloads subdirectory if you are to need them later.

After the downloads have finished installing, JetPack will enter the post-install phase where the JetPack is flashed with the L4T BSP. You'll need to connect your Jetson to your host PC via the micro-USB port and cable included in the devkit. Then enter your Jetson into recovery mode by holding down the Recovery button while pressing and releasing Reset. If you type lsusb from the host PC after you've connected the micro-USB cable and entered the Jetson into recovery mode, you should see the NVIDIA device come up under the list of USB devices. JetPack uses the micro-USB connection from the host to flash the L4T BSP to the Jetson.

After flashing, the Jetson will reboot and if attached to an HDMI display, will boot up to the Ubuntu desktop. After this, JetPack connects to the Jetson from the host via SSH to install additional packages to the Jetson, like the ARM aarch64 builds of CUDA Toolkit, cuDNN, and TensorRT. For JetPack to be able to reach the Jetson via SSH, the host PC should be networked to the Jetson via Ethernet. This can be accomplished by running an Ethernet cable directly from the host to the Jetson, or by connecting both devices to a router or switch — the JetPack GUI will ask you to confirm which networking scenario is being used. See the JetPack Install Guide for the full directions for installing JetPack and flashing Jetson.

Installing NVIDIA Driver on the Host

At this point, JetPack will have flashed the Jetson with the latest L4T BSP, and installed CUDA toolkits to both the Jetson and host PC. However, the NVIDIA PCIe driver will still need to be installed on the host PC to enable GPU-accelerated training. Run the following commands from the host PC to install the NVIDIA driver from the Ubuntu repo:

$ sudo apt-get install nvidia-375
$ sudo reboot

Afer rebooting, the NVIDIA driver should be listed under lsmod:

$ lsmod | grep nvidia
nvidia_uvm            647168  0
nvidia_drm             49152  1
nvidia_modeset        790528  4 nvidia_drm
nvidia              12144640  60 nvidia_modeset,nvidia_uvm
drm_kms_helper        167936  1 nvidia_drm
drm                   368640  4 nvidia_drm,drm_kms_helper

To verify the CUDA toolkit and NVIDIA driver are working, run some tests that come with the CUDA samples:

$ cd /usr/local/cuda/samples
$ sudo make
$ cd bin/x86_64/linux/release/
$ ./deviceQuery
$ ./bandwidthTest --memory=pinned

Installing cuDNN on the Host

The next step is to install NVIDIA cuDNN libraries on the host PC. Download the libcudnn and libcudnn packages from the NVIDIA site:

https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6_6.0.20-1+cuda8.0_amd64-deb
https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v6/prod/8.0_20170307/Ubuntu16_04_x64/libcudnn6-dev_6.0.20-1+cuda8.0_amd64-deb

Then install the packages with the following commands:

$ sudo dpkg -i libcudnn6_6.0.20-1+cuda8.0_amd64.deb
$ sudo dpkg -i libcudnn6-dev_6.0.20-1+cuda8.0_amd64.deb

Installing NVcaffe on the Host

NVcaffe is the NVIDIA branch of Caffe with optimizations for GPU. NVcaffe uses cuDNN and is used by DIGITS for training DNNs. To install it, clone the NVcaffe repo from GitHub and compile from source. First some prequisite packages for Caffe are installed, including the Python bindings required by DIGITS:

$ sudo apt-get install --no-install-recommends build-essential cmake git gfortran libatlas-base-dev libboost-filesystem-dev libboost-python-dev libboost-system-dev libboost-thread-dev libgflags-dev libgoogle-glog-dev libhdf5-serial-dev libleveldb-dev liblmdb-dev libprotobuf-dev libsnappy-dev protobuf-compiler python-all-dev python-dev python-h5py python-matplotlib python-numpy python-opencv python-pil python-pip python-protobuf python-scipy python-skimage python-sklearn python-setuptools 
$ sudo pip install --upgrade pip
$ git clone http://github.com/NVIDIA/caffe
$ cd caffe
$ sudo pip install -r python/requirements.txt 
$ mkdir build
$ cd build
$ cmake ../ -DCUDA_USE_STATIC_CUDA_RUNTIME=OFF
$ make --jobs=4
$ make pycaffe

Caffe should now be configured and built. Now edit your user's ~/.bashrc to include the path to your Caffe tree (replace the paths below to reflect your own):

export CAFFE_ROOT=/home/dusty/workspace/caffe
export PYTHONPATH=/home/dusty/workspace/caffe/python:$PYTHONPATH

Close and re-open the terminal for the changes to take effect.

Installing DIGITS on the Host

NVIDIA DIGITS is a Python-based web service which interactively trains DNNs and manages datasets. As highlighed in the DIGITS workflow, it runs on the host PC to create the network model during the training phase. The trained model is then copied from the host PC to the Jetson for the runtime inference phase with TensorRT.

To install DIGITS, first install the prerequisiste packages, then clone the DIGITS repo from GitHub:

$ sudo apt-get install --no-install-recommends graphviz python-dev python-flask python-flaskext.wtf python-gevent python-h5py python-numpy python-pil python-pip python-protobuf python-scipy python-tk
$ git clone http://github.com/nvidia/DIGITS
$ cd DIGITS
$ sudo pip install -r requirements.txt

Starting the DIGITS Server

Assuming that your terminal is still in the DIGITS directory, the webserver can be started by running the digits-devserver Python script:

$ ./digits-devserver 
  ___ ___ ___ ___ _____ ___
 |   \_ _/ __|_ _|_   _/ __|
 | |) | | (_ || |  | | \__ \
 |___/___\___|___| |_| |___/ 5.1-dev

2017-04-17 13:19:02 [INFO ] Loaded 0 jobs.

DIGITS will store user jobs (training datasets and model snapshots) under the digits/jobs directory.

To access the interactive DIGITS session, open your web browser and navigate to 0.0.0.0:5000.

note: by default the DIGITS server will start on port 5000, but the port can be specified by passing the --port argument to the digits-devserver script.

Building from Source on Jetson

Provided along with this repo are TensorRT-enabled deep learning primitives for running Googlenet/Alexnet on live camera feed for image recognition, pedestrian detection networks with localization capabilities (i.e. that provide bounding boxes), and segmentation. This repo is intended to be built & run on the Jetson and to accept the network models from the host PC trained on the DIGITS server.

The latest source can be obtained from GitHub and compiled onboard Jetson TX1/TX2.

note: this branch is verified against the following BSP versions for Jetson TX1/TX2:
             > Jetson TX2 - JetPack 3.0 / L4T R27.1 aarch64 (Ubuntu 16.04 LTS)
             > Jetson TX1 - JetPack 2.3 / L4T R24.2 aarch64 (Ubuntu 16.04 LTS)
             > Jetson TX1 - JetPack 2.3.1 / L4T R24.2.1 aarch64 (Ubuntu 16.04 LTS)

Cloning the Repo

To obtain the repository, navigate to a folder of your choosing on the Jetson. First, make sure git and cmake are installed locally:

$ sudo apt-get install git cmake

Then clone the jetson-inference repo:

$ git clone http://github.com/dusty-nv/jetson-inference

Configuring with CMake

When cmake is run, a special pre-installation script (CMakePreBuild.sh) is run and will automatically install any dependencies.

$ cd jetson-inference
$ mkdir build
$ cd build
$ cmake ../

Compiling the Project

Make sure you are still in the jetson-inference/build directory, created above in step #2.

$ cd jetson-inference/build			# omit if pwd is already /build from above
$ make

Depending on architecture, the package will be built to either armhf or aarch64, with the following directory structure:

|-build
   \aarch64		    (64-bit)
      \bin			where the sample binaries are built to
      \include		where the headers reside
      \lib			where the libraries are build to
   \armhf           (32-bit)
      \bin			where the sample binaries are built to
      \include		where the headers reside
      \lib			where the libraries are build to

binaries residing in aarch64/bin, headers in aarch64/include, and libraries in aarch64/lib.

Digging Into the Code

For reference, see the available vision primitives, including imageNet for image recognition and detectNet for object localization.

/**
 * Image recognition with GoogleNet/Alexnet or custom models, using TensorRT.
 */
class imageNet : public tensorNet
{
public:
	/**
	 * Network choice enumeration.
	 */
	enum NetworkType
	{
		ALEXNET,
		GOOGLENET
	};

	/**
	 * Load a new network instance
	 */
	static imageNet* Create( NetworkType networkType=GOOGLENET );
	
	/**
	 * Load a new network instance
	 * @param prototxt_path File path to the deployable network prototxt
	 * @param model_path File path to the caffemodel
	 * @param mean_binary File path to the mean value binary proto
	 * @param class_info File path to list of class name labels
	 * @param input Name of the input layer blob.
	 */
	static imageNet* Create( const char* prototxt_path, const char* model_path, const char* mean_binary,
							 const char* class_labels, const char* input="data", const char* output="prob" );

	/**
	 * Determine the maximum likelihood image class.
	 * @param rgba float4 input image in CUDA device memory.
	 * @param width width of the input image in pixels.
	 * @param height height of the input image in pixels.
	 * @param confidence optional pointer to float filled with confidence value.
	 * @returns Index of the maximum class, or -1 on error.
	 */
	int Classify( float* rgba, uint32_t width, uint32_t height, float* confidence=NULL );
};

Both inherit from the shared tensorNet object which contains common TensorRT code.

Classifying Images with ImageNet

There are multiple types of deep learning networks available, including recognition, detection/localization, and soon segmentation. The first deep learning capability we're highlighting in this tutorial is image recognition using an 'imageNet' that's been trained to identify similar objects.

The imageNet object accepts an input image and outputs the probability for each class. Having been trained on ImageNet database of 1000 objects, the standard AlexNet and GoogleNet networks are downloaded during step 2 from above. As examples of using imageNet we provide a command-line interface called imagenet-console and a live camera program called imagenet-camera.

Using the Console Program

First, use the imagenet-console program to test imageNet recognition on some example images. After building, make sure your terminal is located in the aarch64/bin directory:

$ cd jetson-inference/build/aarch64/bin

Then, classify an example image with the imagenet-console program. imagenet-console accepts 2 command-line arguments: the path to the input image and path to the output image (with the class overlay printed).

$ ./imagenet-console orange_0.jpg output_0.jpg

$ ./imagenet-console granny_smith_1.jpg output_1.jpg

Next, we will use imageNet to classify a live video feed from the Jetson onboard camera.

Running the Live Camera Recognition Demo

Similar to the last example, the realtime image recognition demo is located in /aarch64/bin and is called imagenet-camera. It runs on live camera stream and depending on user arguments, loads googlenet or alexnet with TensorRT.

$ ./imagenet-camera googlenet           # to run using googlenet
$ ./imagenet-camera alexnet             # to run using alexnet

The frames per second (FPS), classified object name from the video, and confidence of the classified object are printed to the openGL window title bar. By default the application can recognize up to 1000 different types of objects, since Googlenet and Alexnet are trained on the ILSVRC12 ImageNet database which contains 1000 classes of objects. The mapping of names for the 1000 types of objects, you can find included in the repo under data/networks/ilsvrc12_synset_words.txt

note: by default, the Jetson's onboard CSI camera will be used as the video source. If you wish to use a USB webcam instead, change the DEFAULT_CAMERA define at the top of imagenet-camera.cpp to reflect the /dev/video V4L2 device of your USB camera. The model it's tested with is Logitech C920.

Re-training the Network with DIGITS

The existing GoogleNet and AlexNet models that are downloaded by the repo are pre-trained on 1000 classes of objects.

What if you require a new object class to be added to the network, or otherwise require a different organization of the classes?

Using NVIDIA DIGITS, networks can be fine-tuned or re-trained from a pre-exisiting network model. After installing DIGITS on a PC or in the cloud (such as an AWS instance), see the Image Folder Specification to learn how to organize the data for your particular application.

Popular training databases with various annotations and labels include ImageNet, MS COCO, and Google Images among others.

See here under the Downloading the dataset section to obtain a crawler script that will download the 1000 original classes, including as many of the original images that are still available online.

note: be considerate running the crawler script from a corporate network, they may flag the activity. It will probably take overnight on a decent connection to download the 1000 ILSVRC12 classes (100GB) from ImageNet (1.2TB)

Then, while creating the new network model in DIGITS, copy the GoogleNet prototxt and specify the existing GoogleNet caffemodel as the DIGITS Pretrained Model:

The network training should now converge faster than if it were trained from scratch. After the desired accuracy has been reached, copy the new model checkpoint back over to your Jetson and proceed as before, but now with the added classes available for recognition.

Locating Object Coordinates using DetectNet

The previous image recognition examples output class probabilities representing the entire input image. The second deep learning capability we're highlighting in this tutorial is detecting multiple objects, and finding where in the video those objects are located (i.e. extracting their bounding boxes). This is performed using a 'detectNet' - or object detection / localization network.

The detectNet object accepts as input the 2D image, and outputs a list of coordinates of the detected bounding boxes. Three example detection network models are are automatically downloaded during the repo source configuration:

ped-100 (single-class pedestrian detector)
multiped-500 (multi-class pedestrian + baggage detector)
facenet-120 (single-class facial recognition detector)

As with the previous examples, provided are a console program and a camera streaming program for using detectNet.

Processing Images from the Command Line

To process test images with detectNet and TensorRT, use the detectnet-console program. detectnet-console accepts command-line arguments representing the path to the input image and path to the output image (with the bounding box overlays rendered). Some test images are included with the repo:

$ ./detectnet-console peds-007.png output-7.png

To change the network that detectnet-console uses, modify detectnet-console.cpp (beginning line 33):

detectNet* net = detectNet::Create( detectNet::PEDNET_MULTI );	 // uncomment to enable one of these 
//detectNet* net = detectNet::Create( detectNet::PEDNET );
//detectNet* net = detectNet::Create( detectNet::FACENET );

Then to recompile, navigate to the jetson-inference/build directory and run make.

Multi-class Object Detection

When using the multiped-500 model (PEDNET_MULTI), for images containing luggage or baggage in addition to pedestrians, the 2nd object class is rendered with a green overlay.

$ ./detectnet-console peds-008.png output-8.png

Running the Live Camera Detection Demo

Similar to the previous example, detectnet-camera runs the object detection networks on live video feed from the Jetson onboard camera. Launch it from command line along with the type of desired network:

$ ./detectnet-camera multiped       # run using multi-class pedestrian/luggage detector
$ ./detectnet-camera ped-100        # run using original single-class pedestrian detector
$ ./detectnet-camera facenet        # run using facial recognition network
$ ./detectnet-camera                # by default, program will run using multiped

note: to achieve maximum performance while running detectnet, increase the Jetson TX1 clock limits by running the script: sudo ~/jetson_clocks.sh

> **note**: by default, the Jetson's onboard CSI camera will be used as the video source. If you wish to use a USB webcam instead, change the `DEFAULT_CAMERA` define at the top of [`detectnet-camera.cpp`](detectnet-camera/detectnet-camera.cpp) to reflect the /dev/video V4L2 device of your USB camera. The model it's tested with is Logitech C920.

Re-training DetectNet with DIGITS

For a step-by-step guide to training custom DetectNets, see the Object Detection example included in DIGITS version 4:

The DIGITS guide above uses the KITTI dataset, however MS COCO also has bounding data available for a variety of objects.

Image Segmentation with SegNet

The third deep learning capability we're highlighting in this tutorial is image segmentation. Segmentation is based on image recognition, except the classifications occur at the pixel level as opposed to classifying entire images as with image recognition. This is accomplished by convolutionalizing a pre-trained imageNet recognition model (like Alexnet), which turns it into a fully-convolutional segmentation model capable of per-pixel labelling. Useful for environmental sensing and collision avoidance, segmentation yields dense per-pixel classification of many different potential objects per scene, including scene foregrounds and backgrounds.

The segNet object accepts as input the 2D image, and outputs a second image with the per-pixel classification mask overlay. Each pixel of the mask corresponds to the class of object that was classified.

note: see the DIGITS semantic segmentation example for more background info on segmentation.

Downloading Aerial Drone Dataset

As an example of image segmentation, we'll work with an aerial drone dataset that separates ground terrain from the sky. The dataset is in First Person View (FPV) to emulate the vantage point of a drone in flight and train a network that functions as an autopilot guided by the terrain that it senses.

To download and extract the dataset, run the following commands from the host PC running the DIGITS server:

$ wget --no-check-certificate https://nvidia.box.com/shared/static/ft9cc5yjvrbhkh07wcivu5ji9zola6i1.gz -O NVIDIA-Aerial-Drone-Dataset.tar.gz

HTTP request sent, awaiting response... 200 OK
Length: 7140413391 (6.6G) [application/octet-stream]
Saving to: ‘NVIDIA-Aerial-Drone-Dataset.tar.gz’

NVIDIA-Aerial-Drone-Datase 100%[======================================>]   6.65G  3.33MB/s    in 44m 44s 

2017-04-17 14:11:54 (2.54 MB/s) - ‘NVIDIA-Aerial-Drone-Dataset.tar.gz’ saved [7140413391/7140413391]

$ tar -xzvf NVIDIA-Aerial-Drone-Dataset.tar.gz

The dataset includes various clips captured from flights of drone platforms, but the one we'll be focusing on in this tutorial is under FPV/SFWA. Next we'll create the training database in DIGITS before training the model.

Importing the Dataset into DIGITS

First, navigate your browser to your DIGITS server instance and choose to create a new Segmentation Dataset from the drop-down in the Datasets tab:

In the dataset creation form, specify the following options and paths to the image and label folders under the location where you extracted the aerial dataset:

Feature image folder: NVIDIA-Aerial-Drone-Dataset/FPV/SFWA/720p/images
Label image folder: NVIDIA-Aerial-Drone-Dataset/FPV/SFWA/720p/labels
set % for validation to 1%
Class labels: NVIDIA-Aerial-Drone-Dataset/FPV/SFWA/fpv-labels.txt
Color map: From text file
Feature Encoding: None
Label Encoding: None

Name the dataset whatever you choose and click the Create button at the bottom of the page to launch the importing job. Next we'll create the new segmentation model and begin training.

Generating Pretrained FCN-Alexnet

Fully Convolutional Network (FCN) Alexnet is the network topology that we'll use for segmentation models with DIGITS and TensorRT. See this Parallel ForAll article about the convolutionalizing process. A new feature to DIGITS5 was supporting segmentation datasets and training models. A script is included with the DIGITS semantic segmentation example which converts the Alexnet model into FCN-Alexnet. This base model is then used as a pre-trained starting point for training future FCN-Alexnet segmentation models on custom datasets.

To generate the pre-trained FCN-Alexnet model, open a terminal, navigate to the DIGITS semantic-segmantation example, and run the net_surgery script:

$ cd DIGITS/examples/semantic-segmentation
$ ./net_surgery.py
Downloading files (this might take a few minutes)...
Downloading https://raw.githubusercontent.com/BVLC/caffe/rc3/models/bvlc_alexnet/deploy.prototxt...
Downloading http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel...
Loading Alexnet model...
...
Saving FCN-Alexnet model to fcn_alexnet.caffemodel

Training FCN-Alexnet with DIGITS

When the previous data import job is complete, return to the DIGITS home screen. Select the Models tab and choose to create a new Segmentation Model from the drop-down:

In the model creation form, select the dataset you previously created. Set Subtract Mean to None and the Base Learning Rate to 0.0001. To set the network topology in DIGITS, select the Custom Network tab and make sure the Caffe sub-tab is selected. Copy/paste the FCN-Alexnet prototxt into the text box. Finally, set the Pretrained Model to the output that the net_surgery generated above: DIGITS/examples/semantic-segmentation/fcn_alexnet.caffemodel

Give your aerial model a name and click the Create button at the bottom of the page to start the training job. After about 5 epochs, the Accuracy plot (in orange) should ramp up and the model becomes usable:

At this point, we can try testing our new model's inference on some example images in DIGITS.

Testing Inference Model in DIGITS

Before transfering the trained model to Jetson, let's test it first in DIGITS. On the same page as previous plot, scroll down under the Trained Models section. Set the Visualization Model to Image Segmentation and under Test a Single Image, select an image to try (for example /NVIDIA-Aerial-Drone-Dataset/FPV/SFWA/720p/images/0428.png):

Press Test One and you should see a display similar to:

Next, download and extract the trained model snapshot to Jetson.

FCN-Alexnet Patches for TensorRT

There exist a couple non-essential layers included in the original FCN-Alexnet which aren't supported in TensorRT and should be deleted from the deploy.prototxt included in the snapshot.

At the end of deploy.prototxt, delete the deconv and crop layers:

layer {
  name: "upscore"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore"
  param {
    lr_mult: 0.0
  }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 63
    group: 21
    stride: 32
    weight_filler {
      type: "bilinear"
    }
  }
}
layer {
  name: "score"
  type: "Crop"
  bottom: "upscore"
  bottom: "data"
  top: "score"
  crop_param {
    axis: 2
    offset: 18
  }
}

And on line 24 of deploy.prototxt, change pad: 100 to pad: 0. Finally copy the fpv-labels.txt and fpv-deploy-colors.txt from the aerial dataset to your model snapshot folder on Jetson. Your FCN-Alexnet model snapshot is now compatible with TensorRT. Now we can run it on Jetson and perform inference on images.

Running Segmentation Models on Jetson

To test a custom segmentation network model snapshot on the Jetson, use the command line interface to test the segnet-console program.

First, for convienience, set the path to your extracted snapshot into a $NET variable:

$ NET=20170421-122956-f7c0_epoch_5.0

$ ./segnet-console drone_0428.png output_0428.png \
--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_22610.caffemodel \
--labels=$NET/fpv-labels.txt \
--colors=$NET/fpv-deploy-colors.txt \
--input_blob=data \ 
--output_blob=score_fr

This runs the specified segmentation model on a test image downloaded with the repo.

In addition to the pre-trained aerial model from this tutorial, the repo also includes pre-trained models on other segmentation datasets, including Cityscapes, SYNTHIA, and Pascal-VOC.

Extra Resources

In this area, links and resources for deep learning developers are listed:

Appendix
- NVIDIA Deep Learning Institute — Introductory QwikLabs
- Building nvcaffe
- Other Examples
- ros_deep_learning - TensorRT inference ROS nodes

Name		Name	Last commit message	Last commit date
Latest commit History 179 Commits
data		data
detectnet-camera		detectnet-camera
detectnet-console		detectnet-console
docs		docs
imagenet-camera		imagenet-camera
imagenet-console		imagenet-console
segnet-console		segnet-console
tools		tools
util		util
CMakeLists.txt		CMakeLists.txt
CMakePreBuild.sh		CMakePreBuild.sh
README.md		README.md
detectNet.cpp		detectNet.cpp
detectNet.h		detectNet.h
imageNet.cpp		imageNet.cpp
imageNet.cu		imageNet.cu
imageNet.h		imageNet.h
segNet.cpp		segNet.cpp
segNet.h		segNet.h
tensorNet.cpp		tensorNet.cpp
tensorNet.h		tensorNet.h

liutao2017/jetson-inference

Folders and files

Latest commit

History

Repository files navigation