Commit

[SAM] note that the image (tensor) should be scaled between [0,1] before it is passed to the VisualPrompter (#76)
scott-vsi authored Jan 6, 2024
1 parent cb5fcfb commit dd26525
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions nbs/image_prompter.ipynb
@@ -364,7 +364,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First, before set the image into the prompter, we need to read the image. For it, we can use [kornia.io](https://kornia.readthedocs.io/en/latest/io.html), which internally uses [kornia-rs](https://github.com/kornia/kornia-rs). So for, it ensure to have `kornia-rs` installed, you can install it with `pip install kornia_rs`. This API implement the [DLPack](https://github.com/dmlc/dlpack) protocol natively in Rust to reduce the memory footprint during the decoding and types conversion. Allowing us to read the image from the disk directly to a tensor."
+"First, before adding the image to the prompter, we need to read the image. For that, we can use [kornia.io](https://kornia.readthedocs.io/en/latest/io.html), which internally uses [kornia-rs](https://github.com/kornia/kornia-rs). If you do not have `kornia-rs` installed, you can install it with `pip install kornia_rs`. This API implements the [DLPack](https://github.com/dmlc/dlpack) protocol natively in Rust to reduce the memory footprint during the decoding and type conversion. Allowing us to read the image from the disk directly to a tensor. Note that the image should be scaled within the range [0,1]."
]
},
{
@@ -396,7 +396,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"With the image loaded into the same device than the model, and with the right shape `3xHxW` let's set the image into our image prompter. Attention, when doing this the model will already compute the embeddings of this image. This means, we will pass this image through the encoder, which will uses a lot of memory. It is possible to use the largest model (vit-h) with a graphic card (GPU) that has at least 8Gb of VRAM. "
+"With the image loaded onto the same device as the model, and with the right shape `3xHxW`, we can now set the image in our image prompter. Attention: when doing this, the model will compute the embeddings of this image; this means, we will pass this image through the encoder, which will use a lot of memory. It is possible to use the largest model (vit-h) with a graphic card (GPU) that has at least 8Gb of VRAM. "
]
},
{
@@ -413,7 +413,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"If no error occurred, the features needed to run queries are already cached. If you want to check this, you can see the status of the `prompter.is_image_set` property."
+"If no error occurred, the features needed to run queries are now cached. If you want to check this, you can see the status of the `prompter.is_image_set` property."
]
},
{
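
The note added in the first hunk (the image should be scaled within [0,1]) can be illustrated with a minimal sketch. It assumes the image arrives as a `3xHxW` tensor that is either 8-bit (`uint8`, values in [0,255]) or already floating point. The helper `to_unit_range` is hypothetical and is not part of kornia or of this commit; a random tensor stands in for an image read from disk.

```python
import torch


def to_unit_range(image: torch.Tensor) -> torch.Tensor:
    """Return a float32 copy of `image` with values in [0, 1].

    Hypothetical helper for illustration only; not a kornia API.
    """
    if image.dtype == torch.uint8:
        # 8-bit images store intensities in [0, 255].
        return image.to(torch.float32) / 255.0
    image = image.to(torch.float32)
    # A float image is assumed to be scaled already; reject it otherwise.
    if image.min() < 0.0 or image.max() > 1.0:
        raise ValueError("expected a float image already scaled to [0, 1]")
    return image


# Stand-in for an image read from disk: 3xHxW, 8-bit RGB.
raw = torch.randint(0, 256, (3, 4, 5), dtype=torch.uint8)
image = to_unit_range(raw)
```

If the image is read with `kornia.io.load_image` using a float load type such as `ImageLoadType.RGB32`, the result is expected to be in [0,1] already, so a check like this would mainly guard images that come from other sources.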
