From 5b8d9cce4e3954982335eaef671bb6818064ae9e Mon Sep 17 00:00:00 2001 From: Dmitriy Pastushenkov Date: Wed, 6 Mar 2024 06:08:33 +0100 Subject: [PATCH] Notebook to run inference for LCM using Optimum Intel with OpenVINO (#1696) Notebook allows to run inference with the standard Diffusers pipeline and the Optimum Intel pipeline on CPU and GPU --------- Co-authored-by: Raymond Lo --- ...tent-consistency-models-optimum-demo.ipynb | 379 ++++++++++++++++++ 1 file changed, 379 insertions(+) create mode 100644 notebooks/263-latent-consistency-models-image-generation/263-latent-consistency-models-optimum-demo.ipynb diff --git a/notebooks/263-latent-consistency-models-image-generation/263-latent-consistency-models-optimum-demo.ipynb b/notebooks/263-latent-consistency-models-image-generation/263-latent-consistency-models-optimum-demo.ipynb new file mode 100644 index 00000000000..d0d3ccdcc3b --- /dev/null +++ b/notebooks/263-latent-consistency-models-image-generation/263-latent-consistency-models-optimum-demo.ipynb @@ -0,0 +1,379 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "0a465fbf", + "metadata": {}, + "source": [ + "# Latent Consistency Model using Optimum-Intel OpenVINO\n", + "This notebook shows how to run a Latent Consistency Model (LCM). It sets up both the standard Hugging Face Diffusers pipeline and the Optimum Intel pipeline, which is optimized for Intel hardware including CPU and GPU. Running inference on both CPU and GPU makes it easy to compare performance and the time required to generate an image for a given prompt. The notebook can also be used on other Intel hardware with minimal or no modifications. \n", + "\n", + "![](https://github.com/openvinotoolkit/openvino_notebooks/assets/10940214/1858dae4-72fd-401e-b055-66d503d82446)\n", + "\n", + "Optimum Intel is a Hugging Face interface between the Diffusers and Transformers libraries and the various tools Intel provides to accelerate pipelines on Intel hardware. 
It also enables quantization of models hosted on Hugging Face.\n", + "In this notebook, OpenVINO serves as the Optimum Intel backend for accelerating inference. \n", + "\n", + "For more details, please refer to the Optimum Intel repository:\n", + "https://github.com/huggingface/optimum-intel\n", + "\n", + "\n", + "\n", + "\n", + "LCMs are the next generation of generative models after Latent Diffusion Models (LDMs). They were proposed to overcome the slow iterative sampling process of LDMs, enabling fast inference in very few steps (from 2 to 4) on any pre-trained LDM (e.g. Stable Diffusion). To read more about LCMs, please refer to https://latent-consistency-models.github.io/\n", + "\n", + "#### Table of contents:\n", + "- [Prerequisites](#Prerequisites)\n", + "- [Full precision model on the CPU](#Using-full-precision-model-in-CPU-with-LatentConsistencyModelPipeline)\n", + "- [Running inference using Optimum Intel `OVLatentConsistencyModelPipeline`](#Running-inference-using-Optimum-Intel-OVLatentConsistencyModelPipeline)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "523a3f91", + "metadata": {}, + "source": [ + "### Prerequisites\n", + "[back to top ⬆️](#Table-of-contents:)\n", + "\n", + "Install the required packages." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "ec2a1a2a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -q \"openvino>=2023.3.0\"\n", + "%pip install -q \"onnx>=1.11.0\"\n", + "%pip install -q \"optimum-intel[diffusers]@git+https://github.com/huggingface/optimum-intel.git\" \"ipywidgets\" \"transformers>=4.33.0\" --extra-index-url https://download.pytorch.org/whl/cpu" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "6960adc6", + "metadata": {}, + "outputs": [], + "source": [ + "import warnings\n", + 
"warnings.filterwarnings('ignore')" + ] + }, + { + "cell_type": "markdown", + "id": "b3e8c87b", + "metadata": {}, + "source": [ + "### Showing Available Device Information\n", + "[back to top ⬆️](#Table-of-contents:)\n", + "\n", + "The `available_devices` property lists the devices available on your system. Passing \"FULL_DEVICE_NAME\" to `core.get_property()` returns the name of a device. Check the ID of the discrete GPU: if you have both an integrated GPU (iGPU) and a discrete GPU (dGPU), the iGPU is listed as `device_name=\"GPU.0\"` and the dGPU as `device_name=\"GPU.1\"`. If you have only an iGPU or only a dGPU, it is assigned to `\"GPU\"`.\n", + "\n", + "Note: For more details about using a GPU with OpenVINO, visit this [link](https://docs.openvino.ai/nightly/openvino_docs_install_guides_configurations_for_intel_gpu.html). If you run into issues on Ubuntu 20.04 or Windows 11, read this [blog](https://blog.openvino.ai/blog-posts/install-gpu-drivers-windows-ubuntu)." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "f71b081b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU: Intel(R) Core(TM) Ultra 7 155H\n", + "GNA.GNA_SW: GNA_SW\n", + "GNA.GNA_HW: GNA_HW\n", + "GPU: Intel(R) Arc(TM) Graphics (iGPU)\n", + "NPU: Intel(R) AI Boost\n" + ] + } + ], + "source": [ + "import openvino as ov\n", + "core = ov.Core()\n", + "devices = core.available_devices\n", + "\n", + "for device in devices:\n", + " device_name = core.get_property(device, \"FULL_DEVICE_NAME\")\n", + " print(f\"{device}: {device_name}\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "2857b2f8", + "metadata": {}, + "source": [ + "### Using full precision model in CPU with `LatentConsistencyModelPipeline`\n", + "[back to top ⬆️](#Table-of-contents:)\n", + "\n", + "The standard Latent Consistency Model (LCM) pipeline from the Diffusers library is used here. 
For more information, please refer to https://huggingface.co/docs/diffusers/en/api/pipelines/latent_consistency_models\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "99c65ed5", + "metadata": {}, + "outputs": [ + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "a0d632d1a4b14722a98fc4ef779374b7", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Loading pipeline components...: 0%| | 0/7 [00:00" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "prompt = \"A cute squirrel in the forest, portrait, 8k\"\n", + "\n", + "image = pipeline(\n", + " prompt=prompt, num_inference_steps=4, guidance_scale=8.0\n", + ").images[0]\n", + "image.save(\"image_standard_pipeline.png\")\n", + "image" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "d8fcee96", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "345" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "del pipeline\n", + "gc.collect();" + ] + }, + { + "cell_type": "markdown", + "id": "7fedcc5e", + "metadata": {}, + "source": [ + "### Select inference device for text-to-image generation" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fb1ed573", + "metadata": {}, + "outputs": [], + "source": [ + "import ipywidgets as widgets\n", + "\n", + "core = ov.Core()\n", + "\n", + "device = widgets.Dropdown(\n", + " options=core.available_devices + [\"AUTO\"],\n", + " value='CPU',\n", + " description='Device:',\n", + " disabled=False,\n", + ")\n", + "\n", + "device" + ] + }, + { + "cell_type": "markdown", + "id": "7860bb7f", + "metadata": {}, + "source": [ + "### Running inference using Optimum Intel `OVLatentConsistencyModelPipeline`\n", + "[back to top ⬆️](#Table-of-contents:)\n", + "\n", + "This section accelerates LCM inference using Optimum Intel with the OpenVINO backend. 
For more information, please refer to https://huggingface.co/docs/optimum/intel/inference#latent-consistency-models. \n", + "The pretrained model used in this notebook is available on Hugging Face in FP32 precision, so if CPU is selected as the device, inference runs in full precision. On GPU, accelerated inference is supported for the FP16 data type, while FP32 may result in a high memory footprint and latency. For this reason, the default precision for GPU in OpenVINO is FP16. The OpenVINO GPU plugin converts FP32 to FP16 on the fly, so there is no need to do it manually." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f8578ffd", + "metadata": {}, + "outputs": [], + "source": [ + "from optimum.intel.openvino import OVLatentConsistencyModelPipeline\n", + "\n", + "ov_pipeline = OVLatentConsistencyModelPipeline.from_pretrained(\"SimianLuo/LCM_Dreamshaper_v7\", export=True, compile=False)\n", + "ov_pipeline.reshape(batch_size=1, height=768, width=768, num_images_per_prompt=1)\n", + "ov_pipeline.save_pretrained(\"./openvino_ir\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2afd5738", + "metadata": {}, + "outputs": [], + "source": [ + "ov_pipeline.to(device.value)\n", + "ov_pipeline.compile()" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "cd78df50-c08d-4b1e-98e9-5b7721ca20e0", + "metadata": {}, + "outputs": [ + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "8a126d171dc74d1694986c898ebe6cb5", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + " 0%| | 0/4 [00:00" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "prompt = \"A cute squirrel in the forest, portrait, 8k\"\n", + "\n", + "image_ov = ov_pipeline(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images[0]\n", + "image_ov.save(\"image_opt.png\")\n", + "image_ov" + ] + }, + { + "cell_type": "code", + "execution_count": 
null, + "id": "a04cdc19-f526-493b-8cb0-4ed5b92c9173", + "metadata": {}, + "outputs": [], + "source": [ + "del ov_pipeline\n", + "gc.collect();" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.7" + }, + "openvino_notebooks": { + "imageUrl": "https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/236-stable-diffusion-v2/236-stable-diffusion-v2-optimum-demo.png?raw=true", + "tags": { + "categories": [ + "Model Demos", + "AI Trends" + ], + "libraries": [], + "other": [], + "tasks": [ + "Text-to-Image" + ] + } + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "state": {}, + "version_major": 2, + "version_minor": 0 + } + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}