Releases · huggingface/optimum-intel
v1.9.1: Patch release
- Fix inference for OpenVINO export for causal language models by @echarlaix in #351
v1.9.0: OpenVINO models improvements, TorchScript export, INC quantized SD pipeline
OpenVINO and NNCF
- Ensure compatibility for OpenVINO v2023.0 by @jiwaszki in #265
- Add Stable Diffusion quantization example by @AlexKoff88 in #294 #304 #326
- Enable quantized decoder model export to leverage cache by @echarlaix in #303
- Set height and width during inference for static Stable Diffusion models by @echarlaix in #308
- Set batch size to 1 by default for Wav2Vec2 for NNCF v2.5.0 compatibility by @ljaljushkin in #312
- Ensure compatibility for NNCF v2.5 by @ljaljushkin in #314
- Fix OVModel for BLOOM architecture by @echarlaix in #340
- Add SD OV model height and width attributes and fix export for torch>=v2.0.0 by @eaidova in #342
Intel Neural Compressor
- Add TSModelForCausalLM to enable TorchScript export, loading and inference for causal LM models by @echarlaix in #283 (see the sketch after this list)
- Remove deprecated INC classes by @echarlaix in #293
- Enable IPEX model inference for text generation task by @jiqing-feng in #227 #300
- Add INCStableDiffusionPipeline to enable loading of INC quantized Stable Diffusion models by @echarlaix in #305
- Enable providing a quantization function instead of a calibration dataset during INC static post-training quantization by @PenghuiCheng in #309
- Fix INCSeq2SeqTrainer evaluation step by @AbhishekSalian in #335
- Fix INCSeq2SeqTrainer padding step by @echarlaix in #336
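A short sketch of the TorchScript workflow added in #283; the import path and checkpoint are assumptions for illustration:
from transformers import AutoTokenizer
from optimum.intel import TSModelForCausalLM  # import path assumed

model_id = "gpt2"  # placeholder checkpoint
# export=True traces the PyTorch model to TorchScript on the fly
model = TSModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))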
Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.9.0
v1.8.1: Patch release
- Fix OpenVINO Trainer for transformers >= v4.29.0 by @echarlaix in #328
Full Changelog: v1.8.0...v1.8.1
v1.8.0: Optimum INC CLI, past key values for OpenVINO decoder models
Optimum INC CLI
Integration of the Intel Neural Compressor dynamic quantization to the Optimum command line interface. Example commands:
optimum-cli inc --help
optimum-cli inc quantize --help
optimum-cli inc quantize --model distilbert-base-cased-distilled-squad --output int8_distilbert/
- Add Optimum INC CLI to apply dynamic quantization by @echarlaix in #280
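The quantized model can then be reloaded with the INC model class matching its task; a minimal sketch, assuming INCModelForQuestionAnswering is the appropriate class for the SQuAD checkpoint above:
from optimum.intel import INCModelForQuestionAnswering  # import path assumed

# load the int8 model produced by the quantize command above
model = INCModelForQuestionAnswering.from_pretrained("int8_distilbert/")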
Leverage past key values for OpenVINO decoder models
Enable the use of pre-computed key / values to speed up decoding. This is enabled by default when exporting the model.
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
To disable it, use_cache can be set to False when loading the model:
model = OVModelForCausalLM.from_pretrained(model_id, export=True, use_cache=False)
- Enable the possibility to use the pre-computed key / values for OpenVINO decoder models by @echarlaix in #274
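For illustration, a minimal generation sketch with the cache enabled; the checkpoint and prompt are placeholders:
from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForCausalLM

model_id = "gpt2"  # placeholder checkpoint
model = OVModelForCausalLM.from_pretrained(model_id, export=True)  # use_cache=True by default
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("The weather today is", return_tensors="pt")
# pre-computed key / values are reused at each decoding step
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))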
INC config summarizing optimization details
- Add INCConfig by @echarlaix in #263
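A minimal sketch of how such a config might be inspected, assuming INCConfig follows the usual from_pretrained pattern of optimum configuration classes; the attribute name is also an assumption:
from optimum.intel import INCConfig  # import path assumed

# load the optimization summary saved alongside an INC model
inc_config = INCConfig.from_pretrained("int8_distilbert/")
print(inc_config.quantization)  # attribute name assumed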
Fixes
- Remove dynamic shapes restriction for GPU devices by @helena-intel in #262
- Enable OpenVINO model caching for CPU devices by @helena-intel in #281
- Fix the .to() method for causal language models by @helena-intel in #284
- Fix PyTorch model saving for transformers>=4.28.0 when optimized with OVTrainer by @echarlaix in #285
- Update task names for ONNX and OpenVINO export for optimum>=1.8.0 by @echarlaix in #286
v1.7.3: Patch release
- Fix INC distillation to be compatible with neural-compressor v2.1 by @echarlaix in #260
v1.7.2: Patch release
- Fix OpenVINO Seq2Seq model export for optimum v1.7.3 by @echarlaix in #253
v1.7.1: Patch release
- Fix IPEX quantization model output by @sywangyi in #218
- Fix INC pruning and QAT combination by @xin3he in #241
- Fix loading of Stable Diffusion models when the model config is not adapted by @echarlaix in #237
- Enable VAE encoder OpenVINO export by @echarlaix in #224
- Fix OpenVINO FP16 conversion for seq2seq models by @echarlaix in #238
- Disable scheduler, tokenizer and feature extractor loading when already provided by @echarlaix in #245
- Fix OVTrainer OpenVINO export for structurally pruned models by @yujiepan-work in #236
v1.7.0: OpenVINO pruning, knowledge distillation, Stable Diffusion models inference
NNCF joint pruning, quantization and distillation
Enable joint pruning, quantization and distillation through the OVTrainer by @vuiseng9 in #150
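A minimal quantization-aware training sketch through the OVTrainer; the checkpoint and dataset are placeholders, and the custom compression configuration required for joint pruning and distillation is omitted here:
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments
from optimum.intel.openvino import OVConfig, OVTrainer

model_id = "distilbert-base-uncased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# a small tokenized dataset, for demonstration only
dataset = load_dataset("glue", "sst2", split="train[:128]")
dataset = dataset.map(lambda e: tokenizer(e["sentence"], truncation=True, padding="max_length", max_length=128), batched=True)

trainer = OVTrainer(
    model=model,
    ov_config=OVConfig(),  # default NNCF quantization configuration
    task="text-classification",
    args=TrainingArguments(output_dir="ov_distilbert", num_train_epochs=1),
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
trainer.save_model()  # saves the compressed model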
Stable Diffusion models OpenVINO export and inference
Add Stable Diffusion OpenVINO pipeline by @echarlaix in #195
from optimum.intel.openvino import OVStableDiffusionPipeline

model_id = "stabilityai/stable-diffusion-2-1"
# export=True converts the PyTorch model to the OpenVINO IR on the fly
stable_diffusion = OVStableDiffusionPipeline.from_pretrained(model_id, export=True)
prompt = "sailing ship in storm by Rembrandt"
images = stable_diffusion(prompt).images
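The exported pipeline can also be saved to disk to avoid re-converting on the next load, for example:
# save the resulting OpenVINO model for later reuse
stable_diffusion.save_pretrained("stable-diffusion-2-1-openvino")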
v1.6.3: Patch release
- Standardize OpenVINO inference and quantization with INC and ORT by @echarlaix in #185
- Fix past key values usage following the transformers 4.26.0 release by @fxmarty in #187
v1.6.2: Patch release
- Fix OpenVINO export for NNCF quantized model by @echarlaix in #176
- Fix ONNX export INC trainer by @echarlaix in #177
- Fix OpenVINO Seq2Seq models inference by @echarlaix in #175
- Enable JIT mode for IPEX to improve inference performance for pipelines by @sywangyi in #163
- Enable OpenVINO Runtime support for audio classification by @echarlaix in #166
- Fix INC model loading by @echarlaix in #170
- Fix INCTrainer distillation and evaluation steps by @echarlaix in #182