# Depth Anything V2 - ONNX and TensorRT Triton Inference Server

## Introduction

This repository is a simple implementation of Depth Anything V2 with ONNX and TensorRT. The model is converted from the original PyTorch checkpoints and can be used for both image and video depth estimation.

## Installation

```bash
git clone https://github.com/DepthAnything/Depth-Anything-V2
pip install -r requirements.txt
```

Tested environment for the TensorRT path:

- torch==1.13.0+cu114
- torchvision==0.14.0+cu114
- pycuda==2022.2.2
- tensorrt==8.5.2.2
- JetPack 5.0

## Convert model

### ONNX

Download the pre-trained model from the Depth-Anything-V2 repository and put it under the Depth-Anything-V2/checkpoints directory.

```bash
python export.py --encoder vits --input-size 518
```

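To sanity-check the exported model, a minimal sketch with onnxruntime (assuming the default 518×518 export above and that the ONNX file was saved under `models/`):

```python
import numpy as np
import onnxruntime as ort

# Load the exported model on CPU just to verify the graph and I/O shapes.
sess = ort.InferenceSession("models/depth_anything_v2_vits.onnx",
                            providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name

# Dummy NCHW input matching the export size.
dummy = np.random.rand(1, 3, 518, 518).astype(np.float32)
depth = sess.run(None, {input_name: dummy})[0]
print(depth.shape)  # expected (1, 518, 518)
```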

### TensorRT

```bash
python onnx2trt.py -o models/depth_anything_v2_vits.onnx --output depth_anything_v2_vits.engine --workspace 2
```


Or you can download the converted model from Google Drive and put it under the models directory.
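Either way, you can quickly verify that the serialized engine deserializes with the pinned TensorRT 8.5 API; a minimal sketch (adjust the engine path to where you saved it):

```python
import tensorrt as trt

# Deserialize the engine and list its bindings (TensorRT 8.5 binding API).
logger = trt.Logger(trt.Logger.WARNING)
with open("models/depth_anything_v2_vits.engine", "rb") as f, \
        trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

for i in range(engine.num_bindings):
    print(engine.get_binding_name(i), engine.get_binding_shape(i))
```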

## Usage

```bash
python infer.py \
  --input-path assets/demo01.jpg --input-type image \
  --mode onnx --encoder vits \
  --model_path models/depth_anything_v2_vits.onnx
```

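infer.py handles preprocessing internally; for reference, a minimal sketch of the standard Depth Anything V2 preprocessing (ImageNet normalization at the 518×518 input size; the helper name and exact resize policy here are assumptions, not the repository's code):

```python
import cv2
import numpy as np

def preprocess(path, size=518):
    # BGR -> RGB, scale to [0, 1], resize to the network input size,
    # apply ImageNet mean/std normalization, then HWC -> NCHW.
    img = cv2.imread(path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    img = cv2.resize(img, (size, size), interpolation=cv2.INTER_CUBIC)
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    img = (img - mean) / std
    return img.transpose(2, 0, 1)[None]  # shape (1, 3, size, size)
```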

To focus on a region of interest, pass a crop region:

```bash
python infer.py \
  --input-path assets/demo01.jpg --input-type image \
  --mode onnx --encoder vits \
  --model_path models/depth_anything_v2_vits.onnx \
  --crop-region "0 550 800 750"
```


Options:

- `--input-path`: path to the input image or video
- `--input-type`: input type, `image` or `video`
- `--mode`: inference mode, `onnx` or `trt`
- `--encoder`: encoder type, `vits`, `vitb`, `vitl`, or `vitg`
- `--model_path`: path to the model file
- `--crop-region`: crop region as `"x y w h"` in pixel coordinates (see the sketch after this list)
- `--output-path`: path to the output image
- `--grayscale`: output a grayscale depth map
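A sketch of how the crop region is interpreted (assumed semantics: run inference only on a pixel-coordinate window of the input):

```python
# Hypothetical interpretation of --crop-region "x y w h":
# slice the window out of an HxWx3 image (e.g. from cv2.imread)
# before depth estimation.
x, y, w, h = 0, 550, 800, 750
crop = img[y:y + h, x:x + w]
```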

## User Interface

```bash
python app.py
```

The UI is served at http://127.0.0.1:7860.

- Preview the crop zone
- UI with the crop zone and the output depth map
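app.py presumably wires the inference into a standard Gradio interface; a minimal sketch of that pattern (the `predict_depth` wiring is hypothetical, not the repository's actual code):

```python
import gradio as gr

def predict_depth(image):
    # Placeholder: call the ONNX/TensorRT inference here and
    # return the colorized depth map.
    return image

demo = gr.Interface(fn=predict_depth,
                    inputs=gr.Image(type="numpy", label="Input"),
                    outputs=gr.Image(label="Depth map"))
demo.launch()  # Gradio serves on http://127.0.0.1:7860 by default
```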

## Deploy on Triton Inference Server

Convert the model to a TensorRT plan and save it to the model repository:

```bash
mkdir -p model_repository/depth_anything/1
trtexec --onnx=models/depth_anything_v2_vits.onnx --saveEngine=model_repository/depth_anything/1/model.plan
```
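The resulting layout follows Triton's standard model repository convention (config.pbtxt is added in a later step):

```
model_repository/
└── depth_anything/
    ├── config.pbtxt
    └── 1/
        └── model.plan
```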

Check the model's inputs and outputs before writing the model configuration:

```bash
polygraphy inspect model model_repository/depth_anything/1/model.plan
```


Create config.pbtxt under the model_repository/depth_anything directory. With max_batch_size set to 0, the dims fields describe the full tensor shapes, including the batch dimension.

```
name: "depth_anything"
platform: "tensorrt_plan"
default_model_filename: "model.plan"
max_batch_size: 0
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 1, 3, 518, 518 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1, 518, 518 ]
  }
]
```

Build and run the Triton Inference Server container:

- Edge devices with NVIDIA GPU:

```bash
sudo docker build -t tritonserver:v1 .
sudo docker run --runtime=nvidia --gpus all -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /home/nx-int/code/Depth-Anythingv2-TensorRT-python/model_repository:/models tritonserver:v1
```

- Server with NVIDIA GPU:

```bash
sudo docker compose up --build
```


### Inference

```bash
python3 depth_anything_triton_infer.py --input-path assets/demo01.jpg
```

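For reference, the server can also be exercised with a minimal tritonclient sketch (dummy input; the real script adds preprocessing and visualization):

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to the HTTP endpoint exposed on port 8000 above.
client = httpclient.InferenceServerClient(url="127.0.0.1:8000")

# Dummy 1x3x518x518 tensor; replace with a preprocessed image.
data = np.random.rand(1, 3, 518, 518).astype(np.float32)

inputs = [httpclient.InferInput("input", list(data.shape), "FP32")]
inputs[0].set_data_from_numpy(data)
outputs = [httpclient.InferRequestedOutput("output")]

result = client.infer(model_name="depth_anything",
                      inputs=inputs, outputs=outputs)
depth = result.as_numpy("output")  # shape (1, 518, 518)
print(depth.shape, depth.min(), depth.max())
```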

### Performance Benchmark

Inside the container, run the following command to benchmark the static-shape model:

```bash
perf_analyzer -m depth_anything --shape input:1,3,518,518 --percentile=95 --concurrency-range 1:4
```

or, for the dynamic-shape variant:

```bash
perf_analyzer -m depth_anything_dynamic --shape input:480,960,3 --percentile=95 --concurrency-range 1:4
```

## TODO

- Add UI for easy usage with crop region
- Deploy on Triton Inference Server

## Citation

```bibtex
@article{depth_anything_v2,
  title={Depth Anything V2},
  author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  journal={arXiv:2406.09414},
  year={2024}
}
```

Reference: Depth-Anythingv2-TensorRT-python