add nanollava notebook (#2044)

CVS-142136
openvinotoolkit · May 27, 2024 · 0320c6c
1 parent 934d565
commit 0320c6c

Showing 5 changed files with 1,296 additions and 1 deletion.
diff --git a/.ci/check_notebooks.py b/.ci/check_notebooks.py
@@ -44,7 +44,6 @@ def complain(message):
                 print(f"SKIPPED: {nb_path.relative_to(NOTEBOOKS_ROOT)} for device wdget check")
                 device_found = True
             for cell in notebook_json["cells"]:
-
                 if not toc_found and cell["cell_type"] == "markdown":
                     tc_cell, tc_line = find_tc_in_cell(cell)
                     if tc_line is not None:

diff --git a/.ci/ignore_treon_mac.txt b/.ci/ignore_treon_mac.txt
@@ -71,3 +71,4 @@ notebooks/hello-npu/hello-npu.ipynb
 notebooks/stable-cascade-image-generation/stable-cascade-image-generation.ipynb
 notebooks/dynamicrafter-animating-images/dynamicrafter-animating-images.ipynb
 notebooks/yolov10-optimization/yolov10-optimization.ipynb
+notebooks/nano-llava-multimodal-chatbot/nano-llava-multimodal-chatbot.ipynb
diff --git a/.ci/spellcheck/.pyspelling.wordlist.txt b/.ci/spellcheck/.pyspelling.wordlist.txt
@@ -452,6 +452,7 @@ MusicGen
 Müller
 Nakayosi
 nano
+nanoLLaVA
 nar
 NAS
 natively

diff --git a/notebooks/nano-llava-multimodal-chatbot/README.md b/notebooks/nano-llava-multimodal-chatbot/README.md
@@ -0,0 +1,25 @@
+# Visual-language assistant with nanoLLaVA and OpenVINO
+
+[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/nano-llava-multimodal-chatbot/nano-llava-multimodal-chatbot.ipynb)
+
+nanoLLaVA is a "small but mighty" 1B vision-language model designed to run efficiently on edge devices. It uses [SigLIP-400m](https://huggingface.co/google/siglip-so400m-patch14-384) as the image encoder and [Qwen1.5-0.5B](https://huggingface.co/Qwen/Qwen1.5-0.5B) as the LLM.
+In this tutorial, we show how to convert and run the nanoLLaVA model using OpenVINO. Additionally, we will optimize the model using [NNCF](https://github.com/openvinotoolkit/nncf).
+
+## Notebook contents
+The tutorial consists of the following steps:
+
+- Install requirements
+- Download PyTorch model
+- Convert model to OpenVINO Intermediate Representation (IR)
+- Compress model weights using NNCF
+- Prepare Inference Pipeline
+- Run OpenVINO model inference
+- Launch Interactive demo
+
+In this demonstration, you'll create an interactive chatbot that can answer questions about the content of a provided image.
+
+
+## Installation instructions
+This is a self-contained example that relies solely on its own code.<br>
+We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
+For details, please refer to the [Installation Guide](../../README.md).
diff --git a/notebooks/nano-llava-multimodal-chatbot/nano-llava-multimodal-chatbot.ipynb b/notebooks/nano-llava-multimodal-chatbot/nano-llava-multimodal-chatbot.ipynb