diff --git a/docs/articles_en/about-openvino/release-notes-openvino.rst b/docs/articles_en/about-openvino/release-notes-openvino.rst index 70df641fff6e67..2f7cffccd5076b 100644 --- a/docs/articles_en/about-openvino/release-notes-openvino.rst +++ b/docs/articles_en/about-openvino/release-notes-openvino.rst @@ -1028,7 +1028,7 @@ Discontinued in 2024 * Deployment Manager. See :doc:`installation <../get-started/install-openvino>` and :doc:`deployment <../get-started/install-openvino>` guides for current distribution options. - * `Accuracy Checker `__. + * `Accuracy Checker `__. * `Post-Training Optimization Tool `__ (POT). Neural Network Compression Framework (NNCF) should be used instead. * A `Git patch `__ @@ -1065,15 +1065,15 @@ Deprecated and to be removed in the future * See alternative: `Optical Character Recognition (OCR) with OpenVINO™ `__, * See alternative: `PaddleOCR with OpenVINO™ `__, - * See alternative: `Handwritten Text Recognition Demo `__ + * See alternative: `Handwritten Text Recognition Demo `__ * `Image In-painting with OpenVINO™ `__ - * See alternative: `Image Inpainting Python Demo `__ + * See alternative: `Image Inpainting Python Demo `__ * `Interactive Machine Translation with OpenVINO `__ - * See alternative: `Machine Translation Python* Demo `__ + * See alternative: `Machine Translation Python* Demo `__ * `Open Model Zoo Tools Tutorial `__ diff --git a/docs/articles_en/documentation/legacy-features/model-zoo.rst b/docs/articles_en/documentation/legacy-features/model-zoo.rst index e8981c7fa03bfa..4b761e6c7df831 100644 --- a/docs/articles_en/documentation/legacy-features/model-zoo.rst +++ b/docs/articles_en/documentation/legacy-features/model-zoo.rst @@ -1,32 +1,8 @@ Model Zoo ========= - .. _model zoo: -.. toctree:: - :maxdepth: 1 - :hidden: - - ../../omz_models_group_intel - ../../omz_models_group_public - -.. toctree:: - :maxdepth: 1 - :hidden: - - ../../omz_tools_downloader - ../../omz_tools_accuracy_checker - ../../omz_data_datasets - ../../omz_demos - -.. toctree:: - :maxdepth: 1 - :hidden: - - ../../omz_model_api_ovms_adapter - - .. note:: Since the deprecation of Open Model Zoo, OpenVINO has significantly extended its presence on the @@ -35,21 +11,21 @@ Model Zoo Open Model Zoo for OpenVINO™ toolkit delivers a wide variety of free, pre-trained deep learning models and demo applications that provide full application templates to help you implement deep -learning in Python, C++, or OpenCV Graph API (G-API). Models and demos are available in the +learning in Python, C++, or OpenCV Graph API (G-API). + +Models, demos and full documentation are available in the `Open Model Zoo GitHub repo `__ and licensed under Apache License Version 2.0. Browse through over 200 neural network models, both -:doc:`public <../../omz_models_group_public>` and from -:doc:`Intel <../../omz_models_group_intel>`, and pick the right one for your solution. +`public `__ and from +`Intel `__, and pick the right one for your solution. Types include object detection, classification, image segmentation, handwriting recognition, text to speech, pose estimation, and others. The Intel models have already been converted to work with OpenVINO™ toolkit, while public models can easily be converted using the :doc:`OpenVINO Model Conversion API <../../openvino-workflow/model-preparation>` utility. 
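As a quick, hedged illustration of that conversion step (a minimal sketch only — ``public_model.onnx`` is a placeholder file name, not a specific Open Model Zoo model):

.. code-block:: python

   import openvino as ov

   # "public_model.onnx" stands in for any public model file you downloaded
   # (for example with omz_downloader); other framework formats supported by
   # ov.convert_model (TensorFlow, PaddlePaddle, ...) work the same way.
   ov_model = ov.convert_model("public_model.onnx")

   # Save the converted model as OpenVINO IR (.xml + .bin) for later reuse.
   ov.save_model(ov_model, "public_model.xml")

The resulting ``.xml``/``.bin`` pair can then be loaded with ``ov.Core().read_model()`` in the same way as the pre-converted Intel models.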
-Get started with simple -:doc:`step-by-step procedures <../../learn-openvino/openvino-samples/get-started-demos>` -to learn how to build and run demo applications or discover the -:doc:`full set of demos <../../omz_demos>` and adapt them for implementing specific deep +Open Model Zoo offers a +`comprehensive set of demos `__ that you can adapt for implementing specific deep learning scenarios in your applications. diff --git a/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-efficient-det.rst b/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-efficient-det.rst index 1ffceee4a7081a..c894765a5dc604 100644 --- a/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-efficient-det.rst +++ b/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-efficient-det.rst @@ -54,7 +54,7 @@ The attribute ``image_size`` specifies the shape to be defined for the model con The color channel order (RGB or BGR) of an input data should match the channel order of the model training dataset. If they are different, perform the ``RGB<->BGR`` conversion specifying the command-line parameter: ``--reverse_input_channels``. Otherwise, inference results may be incorrect. For more information about the parameter, refer to the **When to Reverse Input Channels** section of the :doc:`Converting a Model to Intermediate Representation (IR) <../../[legacy]-setting-input-shapes>` guide. OpenVINO toolkit provides samples that can be used to infer EfficientDet model. -For more information, refer to the :doc:`Open Model Zoo Demos <../../../../../../omz_demos>`. +For more information, refer to the `Open Model Zoo Demos `__. .. important:: diff --git a/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-retina-net.rst b/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-retina-net.rst index d5639d68834fb0..db2c6424367f58 100644 --- a/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-retina-net.rst +++ b/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-retina-net.rst @@ -16,7 +16,7 @@ Converting a TensorFlow RetinaNet Model This tutorial explains how to convert a RetinaNet model to the Intermediate Representation (IR). `Public RetinaNet model `__ does not contain pretrained TensorFlow weights. -To convert this model to the TensorFlow format, follow the `Reproduce Keras to TensorFlow Conversion tutorial `__. +To convert this model to the TensorFlow format, follow the `Reproduce Keras to TensorFlow Conversion tutorial `__. 
After converting the model to TensorFlow format, run the following command: diff --git a/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-yolo.rst b/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-yolo.rst index 99c806dcb99649..e7e8072b1bda05 100644 --- a/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-yolo.rst +++ b/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-supported-model-formats/[legacy]-conversion-tutorials/convert-tensorflow-yolo.rst @@ -216,7 +216,7 @@ where: The color channel order (RGB or BGR) of an input data should match the channel order of the model training dataset. If they are different, perform the ``RGB<->BGR`` conversion specifying the command-line parameter: ``reverse_input_channels``. Otherwise, inference results may be incorrect. For more information about the parameter, refer to the **When to Reverse Input Channels** section of the :doc:`Converting a Model to Intermediate Representation (IR) <../../[legacy]-setting-input-shapes>` guide. -OpenVINO toolkit provides a demo that uses YOLOv3 model. Refer to the :doc:`Object Detection C++ Demo <../../../../../../omz_demos_object_detection_demo_cpp>` for more information. +OpenVINO toolkit provides a demo that uses YOLOv3 model. Refer to the `Object Detection C++ Demo `__ for more information. Converting YOLOv1 and YOLOv2 Models to the IR ############################################# diff --git a/docs/articles_en/documentation/openvino-ecosystem/openvino-security-add-on.rst b/docs/articles_en/documentation/openvino-ecosystem/openvino-security-add-on.rst index b7291a2148c8ed..ea76392be4e2e6 100644 --- a/docs/articles_en/documentation/openvino-ecosystem/openvino-security-add-on.rst +++ b/docs/articles_en/documentation/openvino-ecosystem/openvino-security-add-on.rst @@ -735,7 +735,7 @@ How to Use the OpenVINO™ Security Add-on This section requires interactions between the Model Developer/Independent Software vendor and the User. All roles must complete all applicable :ref:`set up steps ` and :ref:`installation steps ` before beginning this section. -This document uses the :doc:`face-detection-retail-0004 <../../omz_models_model_face_detection_retail_0004>` model as an example. +This document uses the `face-detection-retail-0004 `__ model as an example. The following figure describes the interactions between the Model Developer, Independent Software Vendor, and User. diff --git a/docs/articles_en/documentation/openvino-extensibility/openvino-plugin-library/advanced-guides/low-precision-transformations.rst b/docs/articles_en/documentation/openvino-extensibility/openvino-plugin-library/advanced-guides/low-precision-transformations.rst index a5e2f42484c27a..6ba9e0a9b60f52 100644 --- a/docs/articles_en/documentation/openvino-extensibility/openvino-plugin-library/advanced-guides/low-precision-transformations.rst +++ b/docs/articles_en/documentation/openvino-extensibility/openvino-plugin-library/advanced-guides/low-precision-transformations.rst @@ -312,13 +312,13 @@ This step is optional. 
It modifies the transformation function to a device-speci Result model overview ##################### -Let's explore quantized `TensorFlow implementation of ResNet-50 `__ model. Use :doc:`Model Downloader <../../../../omz_tools_downloader>` tool to download the ``fp16`` model from `OpenVINO™ Toolkit - Open Model Zoo repository `__: +Let's explore quantized `TensorFlow implementation of ResNet-50 `__ model. Use `Model Downloader `__ tool to download the ``fp16`` model from `OpenVINO™ Toolkit - Open Model Zoo repository `__: .. code-block:: sh omz_downloader --name resnet-50-tf --precisions FP16-INT8 -After that you should quantize model by the :doc:`Model Quantizer <../../../../omz_tools_downloader>` tool. +After that you should quantize model by the `Model Quantizer `__ tool. .. code-block:: sh diff --git a/docs/articles_en/documentation/openvino-security.rst b/docs/articles_en/documentation/openvino-security.rst index 2deebbc320f285..18598cc921ab03 100644 --- a/docs/articles_en/documentation/openvino-security.rst +++ b/docs/articles_en/documentation/openvino-security.rst @@ -72,5 +72,4 @@ Additional Resources - :doc:`Convert a Model `. - :doc:`OpenVINO™ Runtime User Guide <../openvino-workflow/running-inference>`. - For more information on Sample Applications, see the :doc:`OpenVINO Samples Overview <../learn-openvino/openvino-samples>` -- For information on a set of pre-trained models, see the :doc:`Overview of OpenVINO™ Toolkit Pre-Trained Models <../omz_models_group_intel>`. - For IoT Libraries and Code Samples, see the `Intel® IoT Developer Kit `__. diff --git a/docs/articles_en/learn-openvino/openvino-samples/bert-benchmark.rst b/docs/articles_en/learn-openvino/openvino-samples/bert-benchmark.rst index 65de6d4f966913..92f6a410219f43 100644 --- a/docs/articles_en/learn-openvino/openvino-samples/bert-benchmark.rst +++ b/docs/articles_en/learn-openvino/openvino-samples/bert-benchmark.rst @@ -7,7 +7,8 @@ Bert Benchmark Python Sample This sample demonstrates how to estimate performance of a Bert model using Asynchronous -Inference Request API. Unlike `demos `__ this sample does not have +Inference Request API. Unlike `demos `__ +this sample does not have configurable command line arguments. Feel free to modify sample's source code to try out different options. diff --git a/docs/articles_en/learn-openvino/openvino-samples/hello-reshape-ssd.rst b/docs/articles_en/learn-openvino/openvino-samples/hello-reshape-ssd.rst index 4ba00d7cee6def..23de8eb1979824 100644 --- a/docs/articles_en/learn-openvino/openvino-samples/hello-reshape-ssd.rst +++ b/docs/articles_en/learn-openvino/openvino-samples/hello-reshape-ssd.rst @@ -14,7 +14,7 @@ using the sample, refer to the following requirements: - Models with only one input and output are supported. - The sample accepts any file format supported by ``core.read_model``. -- The sample has been validated with: `person-detection-retail-0013 `__ +- The sample has been validated with: `person-detection-retail-0013 `__ models and the NCHW layout format. - To build the sample, use instructions available at :ref:`Build the Sample Applications ` section in "Get Started with Samples" guide. 
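To complement the Hello Reshape SSD requirements above, here is a minimal, hedged sketch of the idea the sample is built around — reshaping the network to the original image size instead of resizing the image. The file names, the ``CPU`` device, and the plain ``float32`` input with no extra preprocessing are illustrative assumptions, not the sample's exact code:

.. code-block:: python

   import cv2
   import numpy as np
   import openvino as ov

   core = ov.Core()

   # Placeholder IR and image paths -- substitute the model and image you use.
   model = core.read_model("person-detection-retail-0013.xml")
   image = cv2.imread("input.jpg")          # BGR, HWC, uint8
   h, w = image.shape[:2]

   # Reshape the single NCHW input of the model to match the image dimensions.
   model.reshape([1, 3, h, w])

   compiled_model = core.compile_model(model, "CPU")

   # HWC -> CHW, add a batch dimension, and cast to float32 for inference.
   input_tensor = np.expand_dims(image.transpose(2, 0, 1), 0).astype(np.float32)
   detections = compiled_model([input_tensor])[compiled_model.output(0)]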
diff --git a/docs/articles_en/learn-openvino/openvino-samples/sync-benchmark.rst b/docs/articles_en/learn-openvino/openvino-samples/sync-benchmark.rst index 7ebe2d7bcf8567..245672decb7ab2 100644 --- a/docs/articles_en/learn-openvino/openvino-samples/sync-benchmark.rst +++ b/docs/articles_en/learn-openvino/openvino-samples/sync-benchmark.rst @@ -9,13 +9,14 @@ Sync Benchmark Sample This sample demonstrates how to estimate performance of a model using Synchronous Inference Request API. It makes sense to use synchronous inference only in latency oriented scenarios. Models with static input shapes are supported. Unlike -`demos `__ this sample does not have other configurable command-line +`demos `__ +this sample does not have other configurable command-line arguments. Feel free to modify sample's source code to try out different options. Before using the sample, refer to the following requirements: - The sample accepts any file format supported by ``core.read_model``. -- The sample has been validated with: `yolo-v3-tf `__, - `face-detection-0200 `__ models. +- The sample has been validated with: `yolo-v3-tf `__, + `face-detection-0200 `__ models. - To build the sample, use instructions available at :ref:`Build the Sample Applications ` section in "Get Started with Samples" guide. diff --git a/docs/articles_en/learn-openvino/openvino-samples/throughput-benchmark.rst b/docs/articles_en/learn-openvino/openvino-samples/throughput-benchmark.rst index 5c69b9759ce130..e8b723afd2a480 100644 --- a/docs/articles_en/learn-openvino/openvino-samples/throughput-benchmark.rst +++ b/docs/articles_en/learn-openvino/openvino-samples/throughput-benchmark.rst @@ -7,7 +7,7 @@ Throughput Benchmark Sample This sample demonstrates how to estimate performance of a model using Asynchronous -Inference Request API in throughput mode. Unlike `demos `__ this sample +Inference Request API in throughput mode. Unlike `demos `__ this sample does not have other configurable command-line arguments. Feel free to modify sample's source code to try out different options. @@ -18,8 +18,8 @@ sets ``uint8``, while the sample uses default model precision which is usually ` Before using the sample, refer to the following requirements: - The sample accepts any file format supported by ``core.read_model``. -- The sample has been validated with: `yolo-v3-tf `__, - `face-detection-0200 `__ models. +- The sample has been validated with: `yolo-v3-tf `__, + `face-detection-0200 `__ models. - To build the sample, use instructions available at :ref:`Build the Sample Applications ` section in "Get Started with Samples" guide. diff --git a/docs/articles_en/openvino-workflow/running-inference/optimize-inference/general-optimizations.rst b/docs/articles_en/openvino-workflow/running-inference/optimize-inference/general-optimizations.rst index 15610c8d2d1c63..b8ec2da9235fd4 100644 --- a/docs/articles_en/openvino-workflow/running-inference/optimize-inference/general-optimizations.rst +++ b/docs/articles_en/openvino-workflow/running-inference/optimize-inference/general-optimizations.rst @@ -60,7 +60,7 @@ Below are example-codes for the regular and async-based approaches to compare: The technique can be generalized to any available parallel slack. For example, you can do inference and simultaneously encode the resulting or previous frames or run further inference, like emotion detection on top of the face detection results. 
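To make that parallel-slack idea concrete, the following is a minimal, hedged sketch of the Python Async API via ``AsyncInferQueue``; the ``model.xml`` path, the ``CPU`` device, the randomly generated frames, and the choice of four parallel jobs are placeholder assumptions rather than values taken from the demos referenced below:

.. code-block:: python

   import numpy as np
   import openvino as ov

   core = ov.Core()
   # Placeholder IR path and device -- substitute your own static-shape model.
   compiled_model = core.compile_model("model.xml", "CPU")

   # A pool of parallel infer requests; 4 jobs is an arbitrary example value.
   infer_queue = ov.AsyncInferQueue(compiled_model, 4)
   results = {}

   def on_done(request, frame_id):
       # Runs as soon as one request finishes, while other frames are still in flight.
       # Copy the data because the request's tensor memory is reused.
       results[frame_id] = request.get_output_tensor(0).data.copy()

   infer_queue.set_callback(on_done)

   # Dummy frames standing in for decoded video frames (static input shape assumed).
   frames = [np.random.rand(*compiled_model.input(0).shape).astype(np.float32) for _ in range(8)]

   for frame_id, frame in enumerate(frames):
       # Returns immediately; decoding, encoding, or a second model can run here.
       infer_queue.start_async([frame], userdata=frame_id)

   infer_queue.wait_all()

Because ``start_async()`` does not block, the application can spend the wait time on the other work mentioned above, which is exactly the overlap the Async API is meant to exploit.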
-Refer to the `Object Detection C++ Demo `__ , `Object Detection Python Demo `__ (latency-oriented Async API showcase) and :doc:`Benchmark App Sample <../../../learn-openvino/openvino-samples/benchmark-tool>` for complete examples of the Async API in action. +Refer to the `Object Detection C++ Demo `__ , `Object Detection Python Demo `__ (latency-oriented Async API showcase) and :doc:`Benchmark App Sample <../../../learn-openvino/openvino-samples/benchmark-tool>` for complete examples of the Async API in action. .. note:: diff --git a/docs/notebooks/explainable-ai-1-basic-with-output.rst b/docs/notebooks/explainable-ai-1-basic-with-output.rst index a739e2da7ffc27..37f452d7c23571 100644 --- a/docs/notebooks/explainable-ai-1-basic-with-output.rst +++ b/docs/notebooks/explainable-ai-1-basic-with-output.rst @@ -37,7 +37,7 @@ notebook: .. image:: https://github.com/openvinotoolkit/openvino_xai/assets/17028475/ccb67c0b-c58e-4beb-889f-af0aff21cb66 A pre-trained `MobileNetV3 -model `__ +model `__ from `Open Model Zoo `__ is used in this tutorial. diff --git a/docs/notebooks/handwritten-ocr-with-output.rst b/docs/notebooks/handwritten-ocr-with-output.rst index eec888531e8a12..5a3df549aad01c 100644 --- a/docs/notebooks/handwritten-ocr-with-output.rst +++ b/docs/notebooks/handwritten-ocr-with-output.rst @@ -8,9 +8,9 @@ Latin alphabet is available in `notebook This model is capable of processing only one line of symbols at a time. The models used in this notebook are -`handwritten-japanese-recognition-0001 `__ +`handwritten-japanese-recognition-0001 `__ and -`handwritten-simplified-chinese-0001 `__. +`handwritten-simplified-chinese-0001 `__. To decode model outputs as readable text `kondate_nakayosi `__ and @@ -49,10 +49,10 @@ Guide =2023.1.0" opencv-python tqdm - + if platform.system() != "Windows": %pip install -q "matplotlib>=3.4" else: @@ -74,19 +74,19 @@ Imports from collections import namedtuple from itertools import groupby - + import cv2 import matplotlib.pyplot as plt import numpy as np import openvino as ov - + # Fetch `notebook_utils` module import requests - + r = requests.get( url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py", ) - + open("notebook_utils.py", "w").write(r.text) from notebook_utils import download_file, device_widget @@ -103,7 +103,7 @@ Set up all constants and folders used in this notebook base_models_dir = "models" data_folder = "data" charlist_folder = f"{data_folder}/text" - + # Precision used by the model. precision = "FP16" @@ -139,9 +139,9 @@ If you want to perform OCR on a text in Japanese, set # Select the language by using either language="chinese" or language="japanese". language = "chinese" - + languages = {"chinese": chinese_files, "japanese": japanese_files} - + selected_language = languages.get(language) Download the Model @@ -256,28 +256,28 @@ keep letters proportional and meet input shape. "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/" + selected_language.demo_image_name, directory=data_folder, ) - + # Text detection models expect an image in grayscale format. # IMPORTANT! This model enables reading only one line at time. - + # Read the image. image = cv2.imread(filename=str(file_name), flags=cv2.IMREAD_GRAYSCALE) - + # Fetch the shape. image_height, _ = image.shape - + # B,C,H,W = batch size, number of channels, height, width. 
_, _, H, W = recognition_input_layer.shape - + # Calculate scale ratio between the input shape height and image height to resize the image. scale_ratio = H / image_height - + # Resize the image to expected input sizes. resized_image = cv2.resize(image, None, fx=scale_ratio, fy=scale_ratio, interpolation=cv2.INTER_AREA) - + # Pad the image to match input size, without changing aspect ratio. resized_image = np.pad(resized_image, ((0, 0), (0, W - resized_image.shape[1])), mode="edge") - + # Reshape to network input shape. input_image = resized_image[None, None, :, :] @@ -335,10 +335,10 @@ Chinese and Japanese models. # Get a dictionary to encode the output, based on model documentation. used_charlist = selected_language.charlist_name - + # With both models, there should be blank symbol added at index 0 of each charlist. blank_char = "~" - + with used_charlist_file.open(mode="r", encoding="utf-8") as charlist: letters = blank_char + "".join(line.strip() for line in charlist) @@ -380,7 +380,7 @@ Finally, get the symbols from corresponding indexes in the charlist. # Remove a batch dimension. predictions = np.squeeze(predictions) - + # Run the `argmax` function to pick the symbols with the highest probability. predictions_indexes = np.argmax(predictions, axis=1) @@ -388,13 +388,13 @@ Finally, get the symbols from corresponding indexes in the charlist. # Use the `groupby` function to remove concurrent letters, as required by CTC greedy decoding. output_text_indexes = list(groupby(predictions_indexes)) - + # Remove grouper objects. output_text_indexes, _ = np.transpose(output_text_indexes, (1, 0)) - + # Remove blank symbols. output_text_indexes = output_text_indexes[output_text_indexes != 0] - + # Assign letters to indexes from the output array. output_text = [letters[letter_index] for letter_index in output_text_indexes] @@ -411,7 +411,7 @@ the image with predicted text printed below. plt.figure(figsize=(20, 1)) plt.axis("off") plt.imshow(resized_image, cmap="gray", vmin=0, vmax=255) - + print("".join(output_text)) diff --git a/docs/notebooks/hello-detection-with-output.rst b/docs/notebooks/hello-detection-with-output.rst index 60bfe929c596d7..3a72e56b3e801e 100644 --- a/docs/notebooks/hello-detection-with-output.rst +++ b/docs/notebooks/hello-detection-with-output.rst @@ -5,7 +5,7 @@ A very basic introduction to using object detection models with OpenVINO™. The -`horizontal-text-detection-0001 `__ +`horizontal-text-detection-0001 `__ model from `Open Model Zoo `__ is used. It detects horizontal text in images and returns a blob of data in the @@ -60,16 +60,16 @@ Imports import numpy as np import openvino as ov from pathlib import Path - + # Fetch `notebook_utils` module import requests - + r = requests.get( url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py", ) - + open("notebook_utils.py", "w").write(r.text) - + from notebook_utils import download_file, device_widget Download model weights @@ -80,18 +80,18 @@ Download model weights .. 
code:: ipython3 base_model_dir = Path("./model").expanduser() - + model_name = "horizontal-text-detection-0001" model_xml_name = f"{model_name}.xml" model_bin_name = f"{model_name}.bin" - + model_xml_path = base_model_dir / model_xml_name model_bin_path = base_model_dir / model_bin_name - + if not model_xml_path.exists(): model_xml_url = "https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.3/models_bin/1/horizontal-text-detection-0001/FP32/horizontal-text-detection-0001.xml" model_bin_url = "https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.3/models_bin/1/horizontal-text-detection-0001/FP32/horizontal-text-detection-0001.bin" - + download_file(model_xml_url, model_xml_name, base_model_dir) download_file(model_bin_url, model_bin_name, base_model_dir) else: @@ -139,10 +139,10 @@ Load the Model .. code:: ipython3 core = ov.Core() - + model = core.read_model(model=model_xml_path) compiled_model = core.compile_model(model=model, device_name=device.value) - + input_layer_ir = compiled_model.input(0) output_layer_ir = compiled_model.output("boxes") @@ -158,19 +158,19 @@ Load an Image "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/intel_rnb.jpg", directory="data", ) - + # Text detection models expect an image in BGR format. image = cv2.imread(str(image_filename)) - + # N,C,H,W = batch size, number of channels, height, width. N, C, H, W = input_layer_ir.shape - + # Resize the image to meet network expected input sizes. resized_image = cv2.resize(image, (W, H)) - + # Reshape to the network input shape. input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0) - + plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB)); @@ -193,7 +193,7 @@ Do Inference # Create an inference request. boxes = compiled_model([input_image])[output_layer_ir] - + # Remove zero only boxes. boxes = boxes[~np.all(boxes == 0, axis=1)] @@ -209,17 +209,17 @@ Visualize Results def convert_result_to_image(bgr_image, resized_image, boxes, threshold=0.3, conf_labels=True): # Define colors for boxes and descriptions. colors = {"red": (255, 0, 0), "green": (0, 255, 0)} - + # Fetch the image shapes to calculate a ratio. (real_y, real_x), (resized_y, resized_x) = ( bgr_image.shape[:2], resized_image.shape[:2], ) ratio_x, ratio_y = real_x / resized_x, real_y / resized_y - + # Convert the base image from BGR to RGB format. rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB) - + # Iterate through non-zero boxes. for box in boxes: # Pick a confidence factor from the last place in an array. @@ -231,10 +231,10 @@ Visualize Results (x_min, y_min, x_max, y_max) = [ (int(max(corner_position * ratio_y, 10)) if idx % 2 else int(corner_position * ratio_x)) for idx, corner_position in enumerate(box[:-1]) ] - + # Draw a box based on the position, parameters in rectangle function are: image, start_point, end_point, color, thickness. rgb_image = cv2.rectangle(rgb_image, (x_min, y_min), (x_max, y_max), colors["green"], 3) - + # Add text to the image based on position and confidence. # Parameters in text function are: image, text, bottom-left_corner_textfield, font, font_scale, color, thickness, line_type. if conf_labels: @@ -248,7 +248,7 @@ Visualize Results 1, cv2.LINE_AA, ) - + return rgb_image .. 
code:: ipython3 diff --git a/docs/notebooks/hello-segmentation-with-output.rst b/docs/notebooks/hello-segmentation-with-output.rst index b22fe6e27c0c68..5aa44384df8955 100644 --- a/docs/notebooks/hello-segmentation-with-output.rst +++ b/docs/notebooks/hello-segmentation-with-output.rst @@ -4,7 +4,7 @@ Hello Image Segmentation A very basic introduction to using segmentation models with OpenVINO™. In this tutorial, a pre-trained -`road-segmentation-adas-0001 `__ +`road-segmentation-adas-0001 `__ model from the `Open Model Zoo `__ is used. ADAS stands for Advanced Driver Assistance Services. The model diff --git a/docs/notebooks/hello-world-with-output.rst b/docs/notebooks/hello-world-with-output.rst index af625d217a6103..5cf9788de96a35 100644 --- a/docs/notebooks/hello-world-with-output.rst +++ b/docs/notebooks/hello-world-with-output.rst @@ -5,7 +5,7 @@ This basic introduction to OpenVINO™ shows how to do inference with an image classification model. A pre-trained `MobileNetV3 -model `__ +model `__ from `Open Model Zoo `__ is used in this tutorial. For more information about how OpenVINO IR models are @@ -37,15 +37,15 @@ Guide =2023.1.0" opencv-python tqdm - + if platform.system() != "Windows": %pip install -q "matplotlib>=3.4" else: %pip install -q "matplotlib>=3.4,<3.7" - + @@ -63,21 +63,21 @@ Imports .. code:: ipython3 from pathlib import Path - + import cv2 import matplotlib.pyplot as plt import numpy as np import openvino as ov - + # Fetch `notebook_utils` module import requests - + r = requests.get( url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py", ) - + open("notebook_utils.py", "w").write(r.text) - + from notebook_utils import download_file, device_widget Download the Model and data samples @@ -88,15 +88,15 @@ Download the Model and data samples .. code:: ipython3 base_artifacts_dir = Path("./artifacts").expanduser() - + model_name = "v3-small_224_1.0_float" model_xml_name = f"{model_name}.xml" model_bin_name = f"{model_name}.bin" - + model_xml_path = base_artifacts_dir / model_xml_name - + base_url = "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/models/mobelinet-v3-tf/FP32/" - + if not model_xml_path.exists(): download_file(base_url + model_xml_name, model_xml_name, base_artifacts_dir) download_file(base_url + model_bin_name, model_bin_name, base_artifacts_dir) @@ -126,7 +126,7 @@ select device from dropdown list for running inference using OpenVINO .. code:: ipython3 device = device_widget() - + device @@ -148,7 +148,7 @@ Load the Model core = ov.Core() model = core.read_model(model=model_xml_path) compiled_model = core.compile_model(model=model, device_name=device.value) - + output_layer = compiled_model.output(0) Load an Image @@ -163,13 +163,13 @@ Load an Image "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/coco.jpg", directory="data", ) - + # The MobileNet model expects images in RGB format. image = cv2.cvtColor(cv2.imread(filename=str(image_filename)), code=cv2.COLOR_BGR2RGB) - + # Resize to MobileNet image shape. input_image = cv2.resize(src=image, dsize=(224, 224)) - + # Reshape to model input shape. 
input_image = np.expand_dims(input_image, 0) plt.imshow(image); @@ -201,7 +201,7 @@ Do Inference "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/datasets/imagenet/imagenet_2012.txt", directory="data", ) - + imagenet_classes = imagenet_filename.read_text().splitlines() @@ -216,7 +216,7 @@ Do Inference # The model description states that for this model, class 0 is a background. # Therefore, a background must be added at the beginning of imagenet_classes. imagenet_classes = ["background"] + imagenet_classes - + imagenet_classes[result_index] diff --git a/docs/notebooks/optical-character-recognition-with-output.rst b/docs/notebooks/optical-character-recognition-with-output.rst index 61e296de0b49cb..0f03c8fc85f65c 100644 --- a/docs/notebooks/optical-character-recognition-with-output.rst +++ b/docs/notebooks/optical-character-recognition-with-output.rst @@ -7,9 +7,9 @@ This tutorial demonstrates how to perform optical character recognition which shows only text detection. The -`horizontal-text-detection-0001 `__ +`horizontal-text-detection-0001 `__ and -`text-recognition-resnet `__ +`text-recognition-resnet `__ models are used together for text detection and then text recognition. In this tutorial, Open Model Zoo tools including Model Downloader, Model @@ -61,10 +61,10 @@ Guide =2024.0.0" onnx torch torchvision pillow opencv-python --extra-index-url https://download.pytorch.org/whl/cpu - + if platform.system() != "Windows": %pip install -q "matplotlib>=3.4" else: @@ -90,21 +90,21 @@ Imports .. code:: ipython3 from pathlib import Path - + import cv2 import matplotlib.pyplot as plt import numpy as np import openvino as ov from IPython.display import Markdown, display from PIL import Image - + # Fetch `notebook_utils` module import requests - + r = requests.get( url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py", ) - + open("notebook_utils.py", "w").write(r.text) from notebook_utils import load_image, device_widget @@ -116,12 +116,12 @@ Settings .. code:: ipython3 core = ov.Core() - + model_dir = Path("model") precision = "FP16" detection_model = "horizontal-text-detection-0001" recognition_model = "text-recognition-resnet-fc" - + model_dir.mkdir(exist_ok=True) Download Models @@ -142,7 +142,7 @@ not be downloaded again. display(Markdown(f"Downloading {detection_model}, {recognition_model}...")) !$download_command display(Markdown(f"Finished downloading {detection_model}, {recognition_model}.")) - + detection_model_path = (model_dir / "intel/horizontal-text-detection-0001" / precision / detection_model).with_suffix(".xml") recognition_model_path = (model_dir / "public/text-recognition-resnet-fc" / precision / recognition_model).with_suffix(".xml") @@ -159,159 +159,159 @@ Downloading horizontal-text-detection-0001, text-recognition-resnet-fc… .. 
parsed-literal:: ################|| Downloading horizontal-text-detection-0001 ||################ - + ========== Downloading model/intel/horizontal-text-detection-0001/FP16/horizontal-text-detection-0001.xml - - + + ========== Downloading model/intel/horizontal-text-detection-0001/FP16/horizontal-text-detection-0001.bin - - + + ################|| Downloading text-recognition-resnet-fc ||################ - + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/model.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/weight_init.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/heads/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/heads/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/heads/fc_head.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/heads/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/body.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/component.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/sequences/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/sequences/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/sequences/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/decoders/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/decoders/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/decoders/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/decoders/bricks/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/decoders/bricks/bricks.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/decoders/bricks/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/decoders/bricks/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/__init__.py - - + + ========== Downloading 
model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/backbones/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/backbones/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/backbones/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/backbones/resnet.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/enhance_modules/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/enhance_modules/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/enhance_modules/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/utils/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/utils/builder.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/utils/conv_module.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/utils/fc_module.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/utils/norm.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/models/utils/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/utils/__init__.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/utils/common.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/utils/registry.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/utils/config.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/configs/resnet_fc.py - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/ckpt/resnet_fc.pth - - + + ========== Downloading model/public/text-recognition-resnet-fc/vedastr/addict-2.4.0-py3-none-any.whl - - + + ========== Replacing text in model/public/text-recognition-resnet-fc/vedastr/models/heads/__init__.py ========== Replacing text in model/public/text-recognition-resnet-fc/vedastr/models/bodies/__init__.py ========== Replacing text in model/public/text-recognition-resnet-fc/vedastr/models/bodies/sequences/__init__.py @@ -330,7 +330,7 @@ Downloading horizontal-text-detection-0001, text-recognition-resnet-fc… ========== Replacing text in model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/backbones/resnet.py ========== Replacing text in model/public/text-recognition-resnet-fc/vedastr/models/bodies/feature_extractors/encoders/backbones/resnet.py ========== Unpacking model/public/text-recognition-resnet-fc/vedastr/addict-2.4.0-py3-none-any.whl - + @@ -342,7 +342,7 @@ text-recognition-resnet-fc. ### The text-recognition-resnet-fc model consists of many files. All filenames are printed in ### the output of Model Downloader. Uncomment the next two lines to show this output. 
- + # for line in download_result: # print(line) @@ -382,21 +382,21 @@ Converting text-recognition-resnet-fc… ========== Converting text-recognition-resnet-fc to ONNX Conversion to ONNX command: /opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-761/.workspace/scm/ov-notebook/.venv/bin/python -- /opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-761/.workspace/scm/ov-notebook/.venv/lib/python3.8/site-packages/omz_tools/internal_scripts/pytorch_to_onnx.py --model-path=/opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-761/.workspace/scm/ov-notebook/.venv/lib/python3.8/site-packages/omz_tools/models/public/text-recognition-resnet-fc --model-path=model/public/text-recognition-resnet-fc --model-name=get_model --import-module=model '--model-param=file_config=r"model/public/text-recognition-resnet-fc/vedastr/configs/resnet_fc.py"' '--model-param=weights=r"model/public/text-recognition-resnet-fc/vedastr/ckpt/resnet_fc.pth"' --input-shape=1,1,32,100 --input-names=input --output-names=output --output-file=model/public/text-recognition-resnet-fc/resnet_fc.onnx - + ONNX check passed successfully. - + ========== Converting text-recognition-resnet-fc to IR (FP16) Conversion command: /opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-761/.workspace/scm/ov-notebook/.venv/bin/python -- /opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-761/.workspace/scm/ov-notebook/.venv/bin/mo --framework=onnx --output_dir=model/public/text-recognition-resnet-fc/FP16 --model_name=text-recognition-resnet-fc --input=input '--mean_values=input[127.5]' '--scale_values=input[127.5]' --output=output --input_model=model/public/text-recognition-resnet-fc/resnet_fc.onnx '--layout=input(NCHW)' '--input_shape=[1, 1, 32, 100]' --compress_to_fp16=True - + [ INFO ] MO command line tool is considered as the legacy conversion API as of OpenVINO 2023.2 release. - In 2025.0 MO command line tool and openvino.tools.mo.convert_model() will be removed. Please use OpenVINO Model Converter (OVC) or openvino.convert_model(). OVC represents a lightweight alternative of MO and provides simplified model conversion API. + In 2025.0 MO command line tool and openvino.tools.mo.convert_model() will be removed. Please use OpenVINO Model Converter (OVC) or openvino.convert_model(). OVC represents a lightweight alternative of MO and provides simplified model conversion API. Find more information about transition from MO to OVC at https://docs.openvino.ai/2023.2/openvino_docs_OV_Converter_UG_prepare_model_convert_model_MO_OVC_transition.html [ INFO ] Generated IR will be compressed to FP16. If you get lower accuracy, please consider disabling compression explicitly by adding argument --compress_to_fp16=False. Find more information about compression to FP16 at https://docs.openvino.ai/2023.0/openvino_docs_MO_DG_FP16_Compression.html [ SUCCESS ] Generated IR version 11 model. [ SUCCESS ] XML file: /opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-761/.workspace/scm/ov-notebook/notebooks/optical-character-recognition/model/public/text-recognition-resnet-fc/FP16/text-recognition-resnet-fc.xml [ SUCCESS ] BIN file: /opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-761/.workspace/scm/ov-notebook/notebooks/optical-character-recognition/model/public/text-recognition-resnet-fc/FP16/text-recognition-resnet-fc.bin - + Select inference device @@ -409,7 +409,7 @@ select device from dropdown list for running inference using OpenVINO .. 
code:: ipython3 device = device_widget() - + device @@ -438,7 +438,7 @@ Load a Detection Model detection_model = core.read_model(model=detection_model_path, weights=detection_model_path.with_suffix(".bin")) detection_compiled_model = core.compile_model(model=detection_model, device_name=device.value) - + detection_input_layer = detection_compiled_model.input(0) Load an Image @@ -450,18 +450,18 @@ Load an Image # The `image_file` variable can point to a URL or a local image. image_file = "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/intel_rnb.jpg" - + image = load_image(image_file) - + # N,C,H,W = batch size, number of channels, height, width. N, C, H, W = detection_input_layer.shape - + # Resize the image to meet network expected input sizes. resized_image = cv2.resize(image, (W, H)) - + # Reshape to the network input shape. input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0) - + plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB)); @@ -482,7 +482,7 @@ the shape of ``[100, 5]``. Each description of detection has the output_key = detection_compiled_model.output("boxes") boxes = detection_compiled_model([input_image])[output_key] - + # Remove zero only boxes. boxes = boxes[~np.all(boxes == 0, axis=1)] @@ -495,25 +495,25 @@ Get Detection Results def multiply_by_ratio(ratio_x, ratio_y, box): return [max(shape * ratio_y, 10) if idx % 2 else shape * ratio_x for idx, shape in enumerate(box[:-1])] - - + + def run_preprocesing_on_crop(crop, net_shape): temp_img = cv2.resize(crop, net_shape) temp_img = temp_img.reshape((1,) * 2 + temp_img.shape) return temp_img - - + + def convert_result_to_image(bgr_image, resized_image, boxes, threshold=0.3, conf_labels=True): # Define colors for boxes and descriptions. colors = {"red": (255, 0, 0), "green": (0, 255, 0), "white": (255, 255, 255)} - + # Fetch image shapes to calculate a ratio. (real_y, real_x), (resized_y, resized_x) = image.shape[:2], resized_image.shape[:2] ratio_x, ratio_y = real_x / resized_x, real_y / resized_y - + # Convert the base image from BGR to RGB format. rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB) - + # Iterate through non-zero boxes. for box, annotation in boxes: # Pick a confidence factor from the last place in an array. @@ -521,10 +521,10 @@ Get Detection Results if conf > threshold: # Convert float to int and multiply position of each box by x and y ratio. (x_min, y_min, x_max, y_max) = map(int, multiply_by_ratio(ratio_x, ratio_y, box)) - + # Draw a box based on the position. Parameters in the `rectangle` function are: image, start_point, end_point, color, thickness. cv2.rectangle(rgb_image, (x_min, y_min), (x_max, y_max), colors["green"], 3) - + # Add a text to an image based on the position and confidence. Parameters in the `putText` function are: image, text, bottomleft_corner_textfield, font, font_scale, color, thickness, line_type if conf_labels: # Create a background box based on annotation length. @@ -549,7 +549,7 @@ Get Detection Results 1, cv2.LINE_AA, ) - + return rgb_image Text Recognition @@ -568,12 +568,12 @@ Load Text Recognition Model .. code:: ipython3 recognition_model = core.read_model(model=recognition_model_path, weights=recognition_model_path.with_suffix(".bin")) - + recognition_compiled_model = core.compile_model(model=recognition_model, device_name=device.value) - + recognition_output_layer = recognition_compiled_model.output(0) recognition_input_layer = recognition_compiled_model.input(0) - + # Get the height and width of the input layer. 
_, _, H, W = recognition_input_layer.shape @@ -587,13 +587,13 @@ Do Inference # Calculate scale for image resizing. (real_y, real_x), (resized_y, resized_x) = image.shape[:2], resized_image.shape[:2] ratio_x, ratio_y = real_x / resized_x, real_y / resized_y - + # Convert the image to grayscale for the text recognition model. grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) - + # Get a dictionary to encode output, based on the model documentation. letters = "~0123456789abcdefghijklmnopqrstuvwxyz" - + # Prepare an empty list for annotations. annotations = list() cropped_images = list() @@ -603,18 +603,18 @@ Do Inference # Get coordinates on corners of a crop. (x_min, y_min, x_max, y_max) = map(int, multiply_by_ratio(ratio_x, ratio_y, crop)) image_crop = run_preprocesing_on_crop(grayscale_image[y_min:y_max, x_min:x_max], (W, H)) - + # Run inference with the recognition model. result = recognition_compiled_model([image_crop])[recognition_output_layer] - + # Squeeze the output to remove unnecessary dimension. recognition_results_test = np.squeeze(result) - + # Read an annotation based on probabilities from the output layer. annotation = list() for letter in recognition_results_test: parsed_letter = letters[letter.argmax()] - + # Returning 0 index from `argmax` signalizes an end of a string. if parsed_letter == letters[0]: break @@ -622,7 +622,7 @@ Do Inference annotations.append("".join(annotation)) cropped_image = Image.fromarray(image[y_min:y_max, x_min:x_max]) cropped_images.append(cropped_image) - + boxes_with_annotations = list(zip(boxes, annotations)) Show Results diff --git a/docs/notebooks/person-tracking-with-output.rst b/docs/notebooks/person-tracking-with-output.rst index 286aa05e8dfdbf..bca77abe6046ff 100644 --- a/docs/notebooks/person-tracking-with-output.rst +++ b/docs/notebooks/person-tracking-with-output.rst @@ -207,18 +207,18 @@ Representation (OpenVINO IR). and post-processing. In this case, `person detection -model `__ +model `__ is deployed to detect the person in each frame of the video, and `reidentification -model `__ +model `__ is used to output embedding vector to match a pair of images of a person by the cosine distance. If you want to download another model (``person-detection-xxx`` from `Object Detection Models -list `__, +list `__, ``person-reidentification-retail-xxx`` from `Reidentification Models -list `__), +list `__), replace the name of the model in the code below. .. 
code:: ipython3 diff --git a/docs/notebooks/tensorflow-classification-to-openvino-with-output.rst b/docs/notebooks/tensorflow-classification-to-openvino-with-output.rst index c4a1f394753a09..cd491548f49267 100644 --- a/docs/notebooks/tensorflow-classification-to-openvino-with-output.rst +++ b/docs/notebooks/tensorflow-classification-to-openvino-with-output.rst @@ -2,7 +2,7 @@ Convert a TensorFlow Model to OpenVINO™ ======================================= This short tutorial shows how to convert a TensorFlow -`MobileNetV3 `__ +`MobileNetV3 `__ image classification model to OpenVINO `Intermediate Representation `__ (OpenVINO IR) format, using `Model Conversion @@ -49,7 +49,7 @@ Guide =2023.1.0" "opencv-python" if platform.system() != "Windows": @@ -88,25 +88,25 @@ Imports import os import time from pathlib import Path - + os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2" os.environ["TF_USE_LEGACY_KERAS"] = "1" - + import cv2 import matplotlib.pyplot as plt import numpy as np import openvino as ov import tensorflow as tf - + # Fetch `notebook_utils` module import requests - + r = requests.get( url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py", ) - + open("notebook_utils.py", "w").write(r.text) - + from notebook_utils import download_file, device_widget Settings @@ -119,9 +119,9 @@ Settings # The paths of the source and converted models. model_dir = Path("model") model_dir.mkdir(exist_ok=True) - + model_path = Path("model/v3-small_224_1.0_float") - + ir_path = Path("model/v3-small_224_1.0_float.xml") Download model @@ -252,16 +252,16 @@ network. "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/coco.jpg", directory="data", ) - + # The MobileNet network expects images in RGB format. image = cv2.cvtColor(cv2.imread(filename=str(image_filename)), code=cv2.COLOR_BGR2RGB) - + # Resize the image to the network input shape. resized_image = cv2.resize(src=image, dsize=(224, 224)) - + # Transpose the image to the network input shape. input_image = np.expand_dims(resized_image, 0) - + plt.imshow(image); @@ -283,7 +283,7 @@ Do Inference .. code:: ipython3 result = compiled_model(input_image)[output_key] - + result_index = np.argmax(result) .. code:: ipython3 @@ -293,10 +293,10 @@ Do Inference "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/datasets/imagenet/imagenet_2012.txt", directory="data", ) - + # Convert the inference result to a class name. imagenet_classes = image_filename.read_text().splitlines() - + imagenet_classes[result_index] @@ -329,15 +329,15 @@ performance. .. code:: ipython3 num_images = 1000 - + start = time.perf_counter() - + for _ in range(num_images): compiled_model([input_image]) - + end = time.perf_counter() time_ir = end - start - + print(f"IR model in OpenVINO Runtime/CPU: {time_ir/num_images:.4f} " f"seconds per image, FPS: {num_images/time_ir:.2f}") diff --git a/docs/notebooks/vision-monodepth-with-output.rst b/docs/notebooks/vision-monodepth-with-output.rst index 3b23010032f640..6bedb6c0aa4134 100644 --- a/docs/notebooks/vision-monodepth-with-output.rst +++ b/docs/notebooks/vision-monodepth-with-output.rst @@ -3,7 +3,7 @@ Monodepth Estimation with OpenVINO This tutorial demonstrates Monocular Depth Estimation with MidasNet in OpenVINO. Model information can be found -`here `__. +`here `__. .. figure:: https://user-images.githubusercontent.com/36741649/127173017-a0bbcf75-db24-4d2c-81b9-616e04ab7cd9.gif :alt: monodepth @@ -78,22 +78,22 @@ Install requirements .. 
code:: ipython3 import platform - + %pip install -q "openvino>=2023.1.0" %pip install -q opencv-python requests tqdm - + if platform.system() != "Windows": %pip install -q "matplotlib>=3.4" else: %pip install -q "matplotlib>=3.4,<3.7" - + # Fetch `notebook_utils` module import requests - + r = requests.get( url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py", ) - + open("notebook_utils.py", "w").write(r.text) @@ -121,7 +121,7 @@ Imports import time from pathlib import Path - + import cv2 import matplotlib.cm import matplotlib.pyplot as plt @@ -136,7 +136,7 @@ Imports display, ) import openvino as ov - + from notebook_utils import download_file, load_image, device_widget Download the model @@ -151,14 +151,14 @@ format. .. code:: ipython3 model_folder = Path("model") - + ir_model_url = "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/models/depth-estimation-midas/FP32/" ir_model_name_xml = "MiDaS_small.xml" ir_model_name_bin = "MiDaS_small.bin" - + download_file(ir_model_url + ir_model_name_xml, filename=ir_model_name_xml, directory=model_folder) download_file(ir_model_url + ir_model_name_bin, filename=ir_model_name_bin, directory=model_folder) - + model_xml_path = model_folder / ir_model_name_xml @@ -184,13 +184,13 @@ Functions def normalize_minmax(data): """Normalizes the values in `data` between 0 and 1""" return (data - data.min()) / (data.max() - data.min()) - - + + def convert_result_to_image(result, colormap="viridis"): """ Convert network result of floating point numbers to an RGB image with integer values from 0-255 by applying a colormap. - + `result` is expected to be a single network result in 1,H,W shape `colormap` is a matplotlib colormap. See https://matplotlib.org/stable/tutorials/colors/colormaps.html @@ -201,8 +201,8 @@ Functions result = cmap(result)[:, :, :3] * 255 result = result.astype(np.uint8) return result - - + + def to_rgb(image_data) -> np.ndarray: """ Convert image_data from BGR to RGB @@ -219,7 +219,7 @@ select device from dropdown list for running inference using OpenVINO .. code:: ipython3 device = device_widget() - + device @@ -245,15 +245,15 @@ output keys and the expected input shape for the model. # Create cache folder cache_folder = Path("cache") cache_folder.mkdir(exist_ok=True) - + core = ov.Core() core.set_property({"CACHE_DIR": cache_folder}) model = core.read_model(model_xml_path) compiled_model = core.compile_model(model=model, device_name=device.value) - + input_key = compiled_model.input(0) output_key = compiled_model.output(0) - + network_input_shape = list(input_key.shape) network_image_height, network_image_width = network_input_shape[2:] @@ -275,10 +275,10 @@ H=height, W=width). IMAGE_FILE = "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/coco_bike.jpg" image = load_image(path=IMAGE_FILE) - + # Resize to input shape for network. resized_image = cv2.resize(src=image, dsize=(network_image_height, network_image_width)) - + # Reshape the image to network input shape NCHW. input_image = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0) @@ -293,11 +293,11 @@ original image shape. .. code:: ipython3 result = compiled_model([input_image])[output_key] - + # Convert the network result of disparity map to an image that shows # distance as colors. result_image = convert_result_to_image(result=result) - + # Resize back to original image shape. 
The `cv2.resize` function expects shape # in (width, height), [::-1] reverses the (height, width) shape to match this. result_image = cv2.resize(result_image, image.shape[:2][::-1]) @@ -359,7 +359,7 @@ Video Settings # Try the `THEO` encoding if you have FFMPEG installed. # FOURCC = cv2.VideoWriter_fourcc(*"THEO") FOURCC = cv2.VideoWriter_fourcc(*"vp09") - + # Create Path objects for the input video and the result video. output_directory = Path("output") output_directory.mkdir(exist_ok=True) @@ -382,11 +382,11 @@ compute values for these properties for the monodepth video. raise ValueError(f"The video at {VIDEO_FILE} cannot be read.") input_fps = cap.get(cv2.CAP_PROP_FPS) input_video_frame_height, input_video_frame_width = image.shape[:2] - + target_fps = input_fps / ADVANCE_FRAMES target_frame_height = int(input_video_frame_height * SCALE_OUTPUT) target_frame_width = int(input_video_frame_width * SCALE_OUTPUT) - + cap.release() print(f"The input video has a frame width of {input_video_frame_width}, " f"frame height of {input_video_frame_height} and runs at {input_fps:.2f} fps") print( @@ -413,10 +413,10 @@ Do Inference on a Video and Create Monodepth Video input_video_frame_nr = 0 start_time = time.perf_counter() total_inference_duration = 0 - + # Open the input video cap = cv2.VideoCapture(str(VIDEO_FILE)) - + # Create a result video. out_video = cv2.VideoWriter( str(result_video_path), @@ -424,36 +424,36 @@ Do Inference on a Video and Create Monodepth Video target_fps, (target_frame_width * 2, target_frame_height), ) - + num_frames = int(NUM_SECONDS * input_fps) total_frames = cap.get(cv2.CAP_PROP_FRAME_COUNT) if num_frames == 0 else num_frames progress_bar = ProgressBar(total=total_frames) progress_bar.display() - + try: while cap.isOpened(): ret, image = cap.read() if not ret: cap.release() break - + if input_video_frame_nr >= total_frames: break - + # Only process every second frame. # Prepare a frame for inference. # Resize to the input shape for network. resized_image = cv2.resize(src=image, dsize=(network_image_height, network_image_width)) # Reshape the image to network input shape NCHW. input_image = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0) - + # Do inference. inference_start_time = time.perf_counter() result = compiled_model([input_image])[output_key] inference_stop_time = time.perf_counter() inference_duration = inference_stop_time - inference_start_time total_inference_duration += inference_duration - + if input_video_frame_nr % (10 * ADVANCE_FRAMES) == 0: clear_output(wait=True) progress_bar.display() @@ -467,7 +467,7 @@ Do Inference on a Video and Create Monodepth Video f"({1/inference_duration:.2f} FPS)" ) ) - + # Transform the network result to a RGB image. result_frame = to_rgb(convert_result_to_image(result)) # Resize the image and the result to a target frame shape. @@ -477,13 +477,13 @@ Do Inference on a Video and Create Monodepth Video stacked_frame = np.hstack((image, result_frame)) # Save a frame to the video. out_video.write(stacked_frame) - + input_video_frame_nr = input_video_frame_nr + ADVANCE_FRAMES cap.set(1, input_video_frame_nr) - + progress_bar.progress = input_video_frame_nr progress_bar.update() - + except KeyboardInterrupt: print("Processing interrupted.") finally: @@ -493,7 +493,7 @@ Do Inference on a Video and Create Monodepth Video cap.release() end_time = time.perf_counter() duration = end_time - start_time - + print( f"Processed {processed_frames} frames in {duration:.2f} seconds. 
" f"Total FPS (including video processing): {processed_frames/duration:.2f}." @@ -504,7 +504,7 @@ Do Inference on a Video and Create Monodepth Video .. parsed-literal:: - Processed 60 frames in 26.07 seconds. Total FPS (including video processing): 2.30.Inference FPS: 45.85 + Processed 60 frames in 26.07 seconds. Total FPS (including video processing): 2.30.Inference FPS: 45.85 Monodepth Video saved to 'output/Coco%20Walking%20in%20Berkeley_monodepth.mp4'. @@ -532,7 +532,7 @@ Display Monodepth Video Showing monodepth video saved at /opt/home/k8sworker/ci-ai/cibuilds/ov-notebook/OVNotebookOps-761/.workspace/scm/ov-notebook/notebooks/vision-monodepth/output/Coco%20Walking%20in%20Berkeley_monodepth.mp4 - If you cannot see the video in your browser, please click on the following link to download the video + If you cannot see the video in your browser, please click on the following link to download the video