Merge pull request #37 from NERC-CEH/bioclip-exp
BioCLIP WIP notebook
albags authored Oct 4, 2024
2 parents 8f39f39 + 673c18d commit 735ec87
Showing 1 changed file with 305 additions and 0 deletions.
notebooks/BioclipExperiment.md
---
jupyter:
  jupytext:
    formats: ipynb,md
    text_representation:
      extension: .md
      format_name: markdown
      format_version: '1.3'
    jupytext_version: 1.16.4
  kernelspec:
    display_name: Python 3 (ipykernel)
    language: python
    name: python3
---

# Joe experiments with BioCLIP

```python
import logging
import warnings

from PIL import Image
import torch
import open_clip

# Get clues about what's happening upstream
logging.basicConfig(level=logging.INFO)

warnings.filterwarnings(action="ignore", category=FutureWarning)
```

<!-- #region -->
## Loading and inspecting the model

I am following the (very minimal) instructions for loading BioCLIP from the [BioCLIP Hugging Face page](https://huggingface.co/imageomics/bioclip#how-to-get-started-with-the-model), and the more generic ones from the [`open_clip` README](https://github.com/mlfoundations/open_clip?tab=readme-ov-file#usage).

The first step is

```python
model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:imageomics/bioclip')
```

I am looking at the logic inside [`open_clip.create_model_and_transforms`](https://github.com/mlfoundations/open_clip/blob/main/src/open_clip/factory.py#L375) and it's quite messy/confusing and sparsely documented, with defaults silently overridden. Probably best to just treat this as a black box! Most importantly, it returns three things:

1. an instance of [`open_clip.model.CLIP`](https://github.com/mlfoundations/open_clip/blob/main/src/open_clip/model.py#L220) (a subclass of `torch.nn.Module`) with loaded weights
2. an instance of [`torchvision.transforms.transforms.Compose`](https://pytorch.org/vision/main/generated/torchvision.transforms.Compose.html) that performs the image preprocessing for training
3. another `Compose` for validation
<!-- #endregion -->

```python
model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:imageomics/bioclip')
model.eval()
print(type(model), type(preprocess_train), type(preprocess_val))
```

### Unpacking CLIP

The `forward` pass of `CLIP` transforms both image and text inputs (if both are provided).

#### Images
Images are passed to the `visual` submodule, which is an instance of [`VisionTransformer`](https://github.com/mlfoundations/open_clip/blob/main/src/open_clip/transformer.py#L434). (You can get an idea of what a `VisionTransformer` contains/does from the `repr` below, but see the [source code](https://github.com/mlfoundations/open_clip/blob/main/src/open_clip/transformer.py#L434) for more details.)
The outputs of the transformer are then [normalised](https://pytorch.org/docs/stable/generated/torch.nn.functional.normalize.html) (by dividing by the $\ell_2$ norm).
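
As a quick sanity check of the image path, a minimal sketch (not from the original notebook): the input here is a random tensor standing in for a preprocessed image batch.

```python
# Sketch of the image path described above: encode a dummy (1, 3, 224, 224)
# batch and check that the output embedding is l2-normalised.
dummy_batch = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    feats = model.encode_image(dummy_batch, normalize=True)
print(feats.shape, feats.norm(dim=-1))  # norm should be ~1.0
```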

#### Text
Text goes through the following stages:

1. Token embedding (`token_embedding`)
2. Transformer (`transformer`)
3. Layer norm (`ln_final`)
4. Normalise (another division by the $\ell_2$ norm, functional so not listed as a module)
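
A similarly minimal sketch of the text path via `encode_text`. The tokenizer is loaded properly in the Tokenizer section below; it is fetched again here only so the snippet is self-contained, and the caption string is made up.

```python
# Sketch of the text path: tokenize a made-up caption and encode it.
# encode_text(..., normalize=True) applies the final l2 normalisation.
tok = open_clip.get_tokenizer('hf-hub:imageomics/bioclip')
with torch.no_grad():
    txt_feats = model.encode_text(tok(["a photo of a diatom"]), normalize=True)
print(txt_feats.shape, txt_feats.norm(dim=-1))  # norm should be ~1.0
```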

```python
model
```

### Preprocessing

The preprocessing pipelines for training and validation are slightly different: during training [`torchvision.transforms.RandomResizedCrop`](https://pytorch.org/vision/main/generated/torchvision.transforms.RandomResizedCrop.html) is used to crop a random portion of the image and resize it to (224, 224), whereas in validation the image is resized so that the *smaller* edge is 224 pixels, before a center crop is applied to make the resulting image square.

Note that [`torchvision.transforms.CenterCrop`](https://pytorch.org/vision/main/generated/torchvision.transforms.CenterCrop.html#torchvision.transforms.CenterCrop.forward) pads any dimension that is smaller than the given crop size with zeros.
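
A quick sketch (not from the original notebook) to confirm that padding behaviour, using a synthetic image smaller than the crop size:

```python
import numpy as np
from torchvision import transforms

# A 100x100 image center-cropped to 224 should come back 224x224,
# with the border filled with zeros (black).
small = Image.fromarray(np.full((100, 100, 3), 255, dtype=np.uint8))
padded = transforms.CenterCrop(224)(small)
print(padded.size)  # (224, 224)
```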

```python
preprocess_train.transforms
```

```python
preprocess_val.transforms
```

**TODO:** demo the preprocessors on our images
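
In the meantime, a minimal sketch with a synthetic non-square image (the real plankton images are loaded further down):

```python
import numpy as np

# Both pipelines should produce a (3, 224, 224) float tensor; the train one
# takes a random crop, the val one resizes the short edge then center-crops.
dummy = Image.fromarray(np.random.randint(0, 255, (300, 500, 3), dtype=np.uint8))
print(preprocess_train(dummy).shape, preprocess_val(dummy).shape)
```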


### Tokenizer

```python
tokenizer = open_clip.get_tokenizer('hf-hub:imageomics/bioclip')
```
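
A quick look at what the tokenizer produces (the caption is made up): `open_clip` tokenizers pad/truncate to the model's context length, which is 77 for CLIP-style models.

```python
tokens = tokenizer(["a photo of a planktonic diatom"])
print(tokens.shape, tokens.dtype)  # expect torch.Size([1, 77]) torch.int64
```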

## Plankton images

```python
import os
from cyto_ml.models.scivision import prepare_image
from intake_xarray import ImageSource
from dotenv import load_dotenv

load_dotenv() # sets our object store endpoint and credentials from the .env file

from intake import open_catalog
```

```python
dataset = open_catalog(f"{os.environ.get('ENDPOINT', '')}/metadata/intake.yml")
imgs = dataset.test_image().to_dask()
imgs.to_numpy().shape
```

```python
index_df = dataset.plankton().to_dask().compute() # intake-xarray is not sensible here - to_dask.compute() seems silly
index_df
```

<!-- #raw -->
image_data = [
    ImageSource(row).to_dask().to_numpy() for row in index_df["Filename"].iloc[:100]  # take first 100, 535 too many!
]
image_data[0].shape
<!-- #endraw -->

```python
from tqdm.autonotebook import tqdm

fnames = index_df["Filename"]

images = []

for row in tqdm(fnames):
    try:
        image = ImageSource(row).to_dask().to_numpy()
    except OSError as e:  # I think one of these is a csv not an image!
        print(e)
    else:
        images.append(image)
```

```python
import matplotlib.pyplot as plt

plt.imshow(images[0])
```

```python
from IPython.display import display
print(images[0].shape)
im = Image.fromarray(images[0], "RGB")
display(im)
```

```python
for i in images[10:50]:
    display(Image.fromarray(i, "RGB"))
```

```python
images[0].shape
```

```python
dimensions = [image.shape[:-1] for image in images]
plt.scatter(*zip(*dimensions), s=3)
```

```python
test = preprocess_val(Image.fromarray(images[0], "RGB"))
print(test.shape)
print(test.min(), test.max())

# preprocess_val returns a normalised float32 tensor in (C, H, W) order, so
# move channels last and rescale to uint8 before handing it back to PIL.
testim = test.permute(1, 2, 0).numpy()
print(testim.shape)
testim = ((testim - testim.min()) / (testim.max() - testim.min()) * 255).astype("uint8")
im = Image.fromarray(testim, "RGB")
display(im)
```

```python
#resize, crop, totensor, norm = preprocess_val.transforms

test_image = Image.fromarray(images[16], "RGB")

for transform in preprocess_val.transforms:
    print("transform: ", transform)
    test_image = transform(test_image)
    print("new type: ", type(test_image))

    if isinstance(test_image, Image.Image):
        display(test_image)
    elif isinstance(test_image, torch.Tensor):
        plt.imshow(test_image.transpose(2, 0))
        print(test_image.min(), test_image.max())
    else:
        print(type(test_image))
```

```python
from torchvision.transforms.v2.functional import to_image

test_image = Image.fromarray(images[16], "RGB")

poo = to_image(test_image)
poo.min(), poo.max()
```

## Put in the thing

```python
test_image = Image.fromarray(images[16], "RGB")

display(test_image)
```

```python
in_tensor = preprocess_val(test_image)
in_tensor.shape
```

```python
image_features, text_features, scale = model(in_tensor.unsqueeze(0))
```

```python
image_features.shape
```
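
Since no text was passed to `model(...)`, `text_features` comes back as `None` and `scale` is the learned (exponentiated) logit scale. A sketch of how these pieces would typically be combined for zero-shot scoring; the candidate label strings here are placeholders, not a real plankton taxonomy.

```python
# Sketch: zero-shot scoring of made-up label strings against the image
# embedding computed above.
labels = ["a photo of a diatom", "a photo of a copepod", "a photo of detritus"]
with torch.no_grad():
    label_feats = model.encode_text(tokenizer(labels), normalize=True)
    probs = (scale * image_features @ label_feats.T).softmax(dim=-1)
print(dict(zip(labels, probs.squeeze(0).tolist())))
```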

### One at a time or it cries

```python
from pathlib import Path
from tqdm.autonotebook import tqdm
assert Path("features").is_dir()

# Run inference one image at a time; no_grad avoids building an autograd
# graph we never need and keeps memory down.
with torch.no_grad():
    for i, image in tqdm(enumerate(images), total=len(images)):
        image_tensor = preprocess_val(Image.fromarray(image, "RGB"))
        out_feature, _, _ = model(image_tensor.unsqueeze(0))
        torch.save(out_feature, f"features/feature_{i}.pt")
```

```python
features = torch.cat(
    [torch.load(f"features/feature_{i}.pt") for i in range(len(images))]
)
features.shape
```

```python
!pip install scikit-learn
```

```python
from sklearn.manifold import TSNE

t_sne = TSNE()

embedded = t_sne.fit_transform(features.detach().numpy())
```

```python
embedded.shape
```

```python
plt.scatter(*zip(*embedded))
```


## Resources

### CLIP

- [GitHub repo](https://github.com/mlfoundations/open_clip/)
- [Class definition](https://github.com/mlfoundations/open_clip/blob/main/src/open_clip/model.py#L220)

### BioCLIP

- [arXiv paper](https://arxiv.org/pdf/2311.18803)
- [GitHub repo](https://github.com/Imageomics/bioclip)
- [Model card](https://huggingface.co/imageomics/bioclip)
- [Evaluation using t-SNE](https://imageomics.github.io/bioclip/#intrinsic)

### Evaluation

- [Scikit Learn TSNE class](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#tsne)
- [t-SNE from the original author](https://lvdmaaten.github.io/tsne/)
- [Demo about manifold learning](https://scikit-learn.org/stable/auto_examples/manifold/plot_compare_methods.html#)

