diff --git a/nbs/image_prompter.ipynb b/nbs/image_prompter.ipynb
index e3c1750..c6065d8 100644
--- a/nbs/image_prompter.ipynb
+++ b/nbs/image_prompter.ipynb
@@ -364,7 +364,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "First, before set the image into the prompter, we need to read the image. For it, we can use [kornia.io](https://kornia.readthedocs.io/en/latest/io.html), which internally uses [kornia-rs](https://github.com/kornia/kornia-rs). So for, it ensure to have `kornia-rs` installed, you can install it with `pip install kornia_rs`. This API implement the [DLPack](https://github.com/dmlc/dlpack) protocol natively in Rust to reduce the memory footprint during the decoding and types conversion. Allowing us to read the image from the disk directly to a tensor."
+    "First, before adding the image to the prompter, we need to read the image. For that, we can use [kornia.io](https://kornia.readthedocs.io/en/latest/io.html), which internally uses [kornia-rs](https://github.com/kornia/kornia-rs). If you do not have `kornia-rs` installed, you can install it with `pip install kornia_rs`. This API implements the [DLPack](https://github.com/dmlc/dlpack) protocol natively in Rust to reduce the memory footprint during decoding and type conversion, allowing us to read the image from disk directly into a tensor. Note that the image should be scaled within the range [0, 1]."
    ]
   },
   {
@@ -396,7 +396,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "With the image loaded into the same device than the model, and with the right shape `3xHxW` let's set the image into our image prompter. Attention, when doing this the model will already compute the embeddings of this image. This means, we will pass this image through the encoder, which will uses a lot of memory. It is possible to use the largest model (vit-h) with a graphic card (GPU) that has at least 8Gb of VRAM. "
+    "With the image loaded onto the same device as the model, and with the right shape `3xHxW`, we can now set the image in our image prompter. Attention: setting the image immediately computes its embeddings, i.e. the image is passed through the encoder, which uses a lot of memory. The largest model (vit-h) can be used on a graphics card (GPU) with at least 8 GB of VRAM."
    ]
   },
   {
@@ -413,7 +413,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "If no error occurred, the features needed to run queries are already cached. If you want to check this, you can see the status of the `prompter.is_image_set` property."
+    "If no error occurred, the features needed to run queries are now cached. To verify this, you can check the `prompter.is_image_set` property."
    ]
   },
   {
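For context, the code cells surrounding these markdown cells are not included in the hunks above, but they presumably follow the pattern sketched below. This is a minimal sketch assuming kornia's `ImagePrompter`/`SamConfig` API; the image path `soccer.png`, the checkpoint path `sam_vit_h.pth`, and the exact constructor arguments are illustrative assumptions, not taken from the diff.

```python
import torch
import kornia as K
from kornia.contrib import ImagePrompter
from kornia.contrib.models import SamConfig

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Read the image straight into a 3xHxW float32 tensor scaled to [0, 1];
# kornia-rs decodes it and DLPack hands it over without extra copies.
# "soccer.png" is a placeholder path for illustration.
image = K.io.load_image("soccer.png", K.io.ImageLoadType.RGB32, device=device)

# Build the prompter. The model type and checkpoint path are assumptions;
# point `checkpoint` at real SAM vit-h weights.
prompter = ImagePrompter(SamConfig("vit_h", checkpoint="sam_vit_h.pth"), device=device)

# Setting the image runs the (memory-hungry) encoder once and caches the
# resulting embeddings for all later queries.
prompter.set_image(image)

print(prompter.is_image_set)  # True -> features are cached and ready
```

Passing `device=device` to both `load_image` and `ImagePrompter` keeps the image and the model on the same device, which is what `set_image` expects per the markdown cell above.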