Releases: clearml/clearml-serving

v1.3.1

29 Feb 21:54

New Features and Bug Fixes

  • Add missing await (#55, thanks @amirhmk!)
  • Add traceback for failing to load preprocess class (#57)
  • Fix Triton config.pbtxt not being checked for missing values or colliding specifications (#62)
  • Add safer code for pulling from Kafka
  • Add str type to Triton type conversion
  • Fix: ignore the auto-detected platform when passing a config.pbtxt with a platform entry
  • Fix Triton engine models with multiple versions not being properly supported
  • Fix serving session keep-alive so it is also sent when idle
  • Fix examples readme files
  • Log preprocess exceptions with full stack trace to serving session console output

v1.3.0

12 Apr 21:36
eaa2b8a

Stable Release

  • Features

    • 20% overall performance increase 🚀 thank you Python 3.11 🔥
    • gRPC channel configuration (#49, @amirhmk)
    • Huggingface Transformer example
  • Bug fixes

v1.2.0

08 Oct 11:49

Stable Release

  • Features

    • GPU Performance improvements, 50%-300% improvement over vanilla Triton
    • Performance improvements on CPU, optimize uvloop + multi-processing
    • Huggingface Transformer example
    • Binary input support (#37, thanks @Aleksandar1932)
  • Bug fixes

    • Fix stdout/stderr in the inference service not being logged to the dedicated Task

v1.1.0

02 Sep 21:45

Stable Release

Notice: This release is not backwards compatible - see notes below on upgrading

  • Breaking Changes

    • Triton engine size supports variable request size (-1)
  • Features & Bug fixes

    • Add version number of serving session task
    • Triton engine support for variable request (matrix) sizes
    • Triton support, fix --aux-config to support more configuration elements
    • Huggingface Transformer support
    • Preprocess class as module (see note below)
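In a Triton model configuration, a dimension declared as -1 accepts any size along that axis, which is what the variable request size entries above rely on. The following is a minimal sketch of that matching rule only; `shape_matches` is a hypothetical helper written for illustration and is not part of clearml-serving or Triton.

```python
from typing import Sequence


def shape_matches(dims: Sequence[int], shape: Sequence[int]) -> bool:
    """Return True if a concrete request shape fits the declared dims.

    A declared dimension of -1 is treated as a wildcard (any size).
    Hypothetical helper, for illustration only.
    """
    if len(dims) != len(shape):
        # The number of axes must agree even when sizes are variable.
        return False
    return all(d == -1 or d == s for d, s in zip(dims, shape))


# A first axis declared as -1 accepts requests of different sizes.
print(shape_matches([-1, 4], [16, 4]))
print(shape_matches([-1, 4], [1, 4]))
# A fixed axis still has to match exactly.
print(shape_matches([-1, 4], [16, 5]))
```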

Note: To add a Preprocess class from a module (the entire module folder will be packaged), lay out the folder as follows:

preprocess_folder
├── __init__.py  # from .sub.some_file import Preprocess
└── sub
    └── some_file.py

Pass the top folder as a path for --preprocess, for example:

clearml-serving --id <serving_session_id> model add --preprocess /path/to/preprocess_folder ...
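Before passing the folder to --preprocess, it can help to confirm that the layout above is importable as a package and exposes Preprocess at the top level. The sketch below rebuilds the same layout in a temporary directory and imports it with the standard library. It does not reproduce how clearml-serving itself loads the folder; the helper names (`build_preprocess_folder`, `load_preprocess_class`) and the body of the Preprocess class are assumptions made for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical contents of sub/some_file.py: a minimal Preprocess class.
SOME_FILE = '''
class Preprocess:
    def preprocess(self, body, state, collect_custom_statistics_fn=None):
        return body
'''


def build_preprocess_folder(root: Path) -> Path:
    """Create the preprocess_folder layout shown above under `root`."""
    pkg = root / "preprocess_folder"
    (pkg / "sub").mkdir(parents=True)
    # The top-level __init__.py re-exports Preprocess from the submodule.
    (pkg / "__init__.py").write_text("from .sub.some_file import Preprocess\n")
    (pkg / "sub" / "some_file.py").write_text(SOME_FILE)
    return pkg


def load_preprocess_class(pkg: Path):
    """Import the folder as a package and return its Preprocess class."""
    parent = str(pkg.parent)
    sys.path.insert(0, parent)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(pkg.name)
    finally:
        sys.path.remove(parent)
    return module.Preprocess


with tempfile.TemporaryDirectory() as tmp:
    cls = load_preprocess_class(build_preprocess_folder(Path(tmp)))
    print(cls.__name__)
```

If the import fails here, the relative import in `__init__.py` or the folder structure is the first thing to check.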

Upgrading from v1.0

  1. Take down the serving containers (docker-compose or k8s)
  2. Update the clearml-serving CLI: pip3 install -U clearml-serving
  3. Re-add a single existing endpoint with clearml-serving model add ... (press yes when asked);
    this upgrades the clearml-serving session definitions
  4. Pull latest serving containers (docker-compose pull ... or k8s)
  5. Re-spin serving containers (docker-compose or k8s)

v1.0.0

07 Jun 08:16
a12311c

Stable Release

Notice: This release is not backwards compatible

  • Breaking Changes

    • Pre/post processing class functions now take 3 arguments (see example)
    • Add support for per-request state storage, passing information between the pre/post processing functions
  • Features & Bug fixes

    • Optimize serving latency while collecting statistics
    • Fix metric statistics collecting auto-refresh issue
    • Fix live update of model preprocessing code
    • Add pandas to the default serving container
    • Add per endpoint/variable statistics collection control
    • Add CLEARML_EXTRA_PYTHON_PACKAGES for easier additional python package support (serving inference container)
    • Upgrade Nvidia Triton base container image to 22.04 (requires Nvidia drivers 510+)
    • Add Kubernetes Helm chart
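The three-argument pre/post processing functions and the per-request state listed under Breaking Changes can be sketched as follows. This is a minimal illustration, not the project's reference code: the argument names (`body`, `state`, `collect_custom_statistics_fn`) are assumed from the project's examples and should be checked against the linked example, and the `request_id` field and the stand-in prediction are hypothetical.

```python
from typing import Any, Callable, Optional


class Preprocess:
    """Hypothetical Preprocess class using the three-argument form."""

    def preprocess(
        self,
        body: dict,
        state: dict,
        collect_custom_statistics_fn: Optional[Callable[[dict], None]] = None,
    ) -> Any:
        # Store per-request information so postprocess() can read it later.
        state["request_id"] = body.get("request_id")
        features = [body.get("x0", 0.0), body.get("x1", 0.0)]
        if collect_custom_statistics_fn:
            # Optionally report a custom statistic for this request.
            collect_custom_statistics_fn({"x0": features[0]})
        return [features]

    def postprocess(
        self,
        data: Any,
        state: dict,
        collect_custom_statistics_fn: Optional[Callable[[dict], None]] = None,
    ) -> dict:
        # Read back what preprocess() stored for the same request.
        return {"request_id": state.get("request_id"), "y": list(data)}


# Simulate one request outside the serving container.
state: dict = {}
pre = Preprocess()
model_input = pre.preprocess({"request_id": "abc", "x0": 1.0, "x1": 2.0}, state)
fake_prediction = [sum(row) for row in model_input]  # stand-in for the model
result = pre.postprocess(fake_prediction, state)
print(result)
```

The `state` dictionary is created per request in this sketch, so nothing stored in it leaks between requests.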

PyPI v0.9.0

21 Mar 15:38

Redesign Release

Notice: This release is not backwards compatible

  • Easy to deploy & configure
    • Support Machine Learning Models (Scikit-learn, XGBoost, LightGBM)
    • Support Deep Learning Models (TensorFlow, PyTorch, ONNX)
    • Customizable REST API for serving (i.e. allow per model pre/post-processing for easy integration)
  • Flexible
    • On-line model deployment
    • On-line endpoint model/version deployment (i.e. no need to take the service down)
    • Per model standalone preprocessing and postprocessing python code
  • Scalable
    • Multi model per container
    • Multi models per serving service
    • Multi-service support (fully separated serving services running independently)
    • Multi cluster support
    • Out-of-the-box node auto-scaling based on load/usage
  • Efficient
    • Multi-container resource utilization
    • Support for CPU & GPU nodes
    • Auto-batching for DL models
  • Automatic deployment
    • Automatic model upgrades w/ canary support
    • Programmable API for model deployment
  • Canary A/B deployment
    • Online Canary updates
  • Model Monitoring
    • Usage Metric reporting
    • Metric Dashboard
    • Model performance metric
    • Model performance Dashboard

Features:

  • FastAPI integration for inference service
  • Multi-process Gunicorn for inference service
  • Dynamic preprocess python code loading (no need for container/process restart)
  • Model files download/caching (http/s3/gs/azure)
  • Scikit-learn, XGBoost, LightGBM integration
  • Custom inference, including dynamic code loading
  • Manual model upload/registration to model repository (http/s3/gs/azure)
  • Canary load balancing
  • Auto model endpoint deployment based on model repository state
  • Machine/Node health metrics
  • Dynamic online configuration
  • CLI configuration tool
  • Nvidia Triton integration
  • GZip request compression
  • TorchServe engine integration
  • Prebuilt Docker containers (dockerhub)
  • Docker-compose deployment (CPU/GPU)
  • Scikit-Learn example
  • XGBoost example
  • LightGBM example
  • PyTorch example
  • TensorFlow/Keras example
  • Model ensemble example
  • Model pipeline example
  • Statistics Service
  • Kafka install instructions
  • Prometheus install instructions
  • Grafana install instructions

PyPI v0.3.3

05 Jun 00:16

Features & Bug Fixes

  • Fix argparse.FileType error (issue #1)
  • Fix passing both --id and --project / --name

PyPI v0.3.2

13 May 22:38

Features & Bug Fixes

  • Add --debug for increased verbosity
  • Fix --config always required (issue #1)