Skip to content
/ usls Public

A Rust library integrated with ONNXRuntime, providing a collection of Computer Vison and Vision-Language models.

License

Notifications You must be signed in to change notification settings

jamjamjon/usls

Repository files navigation

usls

Documentation

ONNXRuntime Release Page CUDA Toolkit Page TensorRT Page

Crates Page Crates.io Total Downloads

usls is a Rust library integrated with ONNXRuntime that provides a collection of state-of-the-art models for Computer Vision and Vision-Language tasks, including:

Click to expand Supported Models

Supported Models

Model Task / Type Example CUDA f32 CUDA f16 TensorRT f32 TensorRT f16
YOLOv5 Classification
Object Detection
Instance Segmentation
demo
YOLOv6 Object Detection demo
YOLOv7 Object Detection demo
YOLOv8 Object Detection
Instance Segmentation
Classification
Oriented Object Detection
Keypoint Detection
demo
YOLOv9 Object Detection demo
YOLOv10 Object Detection demo
YOLOv11 Object Detection
Instance Segmentation
Classification
Oriented Object Detection
Keypoint Detection
demo
RTDETR Object Detection demo
FastSAM Instance Segmentation demo
SAM Segment Anything demo
SAM2 Segment Anything demo
MobileSAM Segment Anything demo
EdgeSAM Segment Anything demo
SAM-HQ Segment Anything demo
YOLO-World Object Detection demo
DINOv2 Vision-Self-Supervised demo
CLIP Vision-Language demo ✅ Visual
❌ Textual
✅ Visual
❌ Textual
BLIP Vision-Language demo ✅ Visual
❌ Textual
✅ Visual
❌ Textual
DB Text Detection demo
SVTR Text Recognition demo
RTMO Keypoint Detection demo
YOLOPv2 Panoptic Driving Perception demo
Depth-Anything v1 & v2 Monocular Depth Estimation demo
MODNet Image Matting demo
GroundingDINO Open-Set Detection With Language demo
Sapiens Body Part Segmentation demo
Florence2 a Variety of Vision Tasks demo
DepthPro Monocular Depth Estimation demo

⛳️ ONNXRuntime Linking

You have two options to link the ONNXRuntime library
  • Option 1: Manual Linking

    • For detailed setup instructions, refer to the ORT documentation.

    • For Linux or macOS Users:

      • Download the ONNX Runtime package from the Releases page.
      • Set up the library path by exporting the ORT_DYLIB_PATH environment variable:
        export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.19.0
  • Option 2: Automatic Download

    Just use --features auto

    cargo run -r --example yolo --features auto

🎈 Demo

cargo run -r --example yolo   # blip, clip, yolop, svtr, db, ...

🥂 Integrate Into Your Own Project

  • Add usls as a dependency to your project's Cargo.toml

    cargo add usls

    Or use a specific commit:

    [dependencies]
    usls = { git = "https://github.com/jamjamjon/usls", rev = "commit-sha" }
  • Follow the pipeline

    • Build model with the provided models and Options
    • Load images, video and stream with DataLoader
    • Do inference
    • Retrieve inference results from Vec<Y>
    • Annotate inference results with Annotator
    • Display images and write them to video with Viewer

    example code
    use usls::{models::YOLO, Annotator, DataLoader, Nms, Options, Vision, YOLOTask, YOLOVersion};
    
    fn main() -> anyhow::Result<()> {
        // Build model with Options
        let options = Options::new()
            .with_trt(0)
            .with_model("yolo/v8-m-dyn.onnx")?
            .with_yolo_version(YOLOVersion::V8) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
            .with_yolo_task(YOLOTask::Detect) // YOLOTask: Classify, Detect, Pose, Segment, Obb
            .with_ixx(0, 0, (1, 2, 4).into())
            .with_ixx(0, 2, (0, 640, 640).into())
            .with_ixx(0, 3, (0, 640, 640).into())
            .with_confs(&[0.2]);
        let mut model = YOLO::new(options)?;
    
        // Build DataLoader to load image(s), video, stream
        let dl = DataLoader::new(
            // "./assets/bus.jpg", // local image
            // "images/bus.jpg",  // remote image
            // "../images-folder",  // local images (from folder)
            // "../demo.mp4",  // local video
            // "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4",  // online video
            "rtsp://admin:kkasd1234@192.168.2.217:554/h264/ch1/",  // stream
        )?
        .with_batch(2)  // iterate with batch_size = 2
        .build()?;
    
        // Build annotator
        let annotator = Annotator::new()
            .with_bboxes_thickness(4)
            .with_saveout("YOLO-DataLoader");
    
        // Build viewer
        let mut viewer = Viewer::new().with_delay(10).with_scale(1.).resizable(true);
    
        // Run and annotate results
        for (xs, _) in dl {
            let ys = model.forward(&xs, false)?;
            // annotator.annotate(&xs, &ys);
            let images_plotted = annotator.plot(&xs, &ys, false)?;
    
            // show image
            viewer.imshow(&images_plotted)?;
    
            // check out window and key event
            if !viewer.is_open() || viewer.is_key_pressed(usls::Key::Escape) {
                break;
            }
    
            // write video
            viewer.write_batch(&images_plotted)?;
    
            // Retrieve inference results
            for y in ys {
                // bboxes
                if let Some(bboxes) = y.bboxes() {
                    for bbox in bboxes {
                        println!(
                            "Bbox: {}, {}, {}, {}, {}, {}",
                            bbox.xmin(),
                            bbox.ymin(),
                            bbox.xmax(),
                            bbox.ymax(),
                            bbox.confidence(),
                            bbox.id(),
                        );
                    }
                }
            }
        }
    
        // finish video write
        viewer.finish_write()?;
    
        Ok(())
    }

📌 License

This project is licensed under LICENSE.