
Add usage snippets for Google Health AI models #1084

Open. Wants to merge 12 commits into base: main.
Conversation

@ndebuhr (Contributor) commented Dec 24, 2024

Adding usage snippets to improve Hugging Face Hub model pages for CXR Foundation and Derm Foundation - improving usability via example Python code and linked Jupyter notebooks. Doing this at the model library level, to complement the usage documentation in the model cards, so that the "Use this model" Hub button provides useful information (currently "integration status unknown").

@Vaibhavs10 (Member) left a comment

Thanks for the PR @ndebuhr - re: both snippets:

  1. Both should actually be code snippets (we discourage putting URLs to notebooks, etc.).
  2. For the other code snippet, we try to include the fewest lines of code that someone can understand and work with, so it would be great to reduce it.

Otherwise, I think this information can fit nicely in the Model Card.

@ndebuhr (Contributor, Author) commented Jan 2, 2025

@Vaibhavs10 I can collapse some lines or remove some comments to further reduce the LOC, but I think that would impact readability (especially in the relatively low-width snippet modal). Hoping this second pass is better?

@ndebuhr requested a review from @Vaibhavs10 on January 2, 2025.
@Vaibhavs10 (Member) left a comment

Thanks a lot for iterating here - some last nits and we're good to merge!

@@ -95,6 +95,26 @@ export const bm25s = (model: ModelData): string[] => [
retriever = BM25HF.load_from_hub("${model.id}")`,
];

export const cxr_foundation = (model: ModelData): string[] => [
`# Install library
A Member commented:

I think we can remove the install instructions completely from here and keep them in the model card (as they are currently), and just start from "from PIL import Image".

@ndebuhr (Contributor, Author) replied:

Since we aren't doing a "typical" installation (e.g., pip install from PyPI or git), I'm a bit concerned about pulling that information out of the snippet. I think people will make assumptions that aren't true. Basically, there's no way somebody is going to correctly use the library or guess the installation without this, so I'd really prefer we keep that critical information in the snippet (as it looks like some other libraries have done). Is that fair?

@Wauplin (Contributor) commented Jan 14, 2025

@ndebuhr where did you get these installation steps? I would be more inclined to add a single comment line that redirects users to the installation steps. Something like this:

# Install from https://github.com/Google-Health/cxr-foundation

This is what is already done for quite a few libraries, and it is the direction we want to take for new ones. The problem with putting an installation guide in the code snippet is that it makes the snippet much longer and much harder to maintain (installation guides can be tricky). In this case, the best thing would be to have those few lines in the cxr-foundation README directly.

(Three resolved review comments on packages/tasks/src/model-libraries-snippets.ts, now outdated.)
@Wauplin (Contributor) left a comment

Thanks for the PR! Agree with @Vaibhavs10 on trying to get more concise examples (though still readable).

Comment on lines +112 to +115
cxr_client = make_hugging_face_client('cxr_model')
!wget -nc -q https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png

print(cxr_client.get_image_embeddings_from_images([Image.open("Chest_Xray_PA_3-8-2010.png")]))`,
A Contributor commented:

Interlacing Python code with command lines feels odd. Could you switch to a fully Python snippet, using for instance requests.get("https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png")? (as done below)

Reply:

Here is a pure-Python solution. Alternatively, we could require the user to provide a local image.

import requests
from io import BytesIO
from PIL import Image

# Image attribution: Stillwaterising, CC0, via Wikimedia Commons
image_url = "https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png"
response = requests.get(image_url, headers={'User-Agent': 'Demo'}, stream=True)
response.raw.decode_content = True  # Ensure correct decoding
img = Image.open(BytesIO(response.content)).convert('L')  # Convert to grayscale

I tested the following complete snippet:

!git clone https://github.com/Google-Health/cxr-foundation.git
import tensorflow as tf, sys, requests
sys.path.append('cxr-foundation/python/')

# Install dependencies
major_version = tf.__version__.rsplit(".", 1)[0]
!pip install tensorflow-text=={major_version} pypng && pip install --no-deps pydicom hcls_imaging_ml_toolkit retrying

# Load image (Stillwaterising, CC0, via Wikimedia Commons)
from PIL import Image
from io import BytesIO
image_url = "https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png"
response = requests.get(image_url, headers={'User-Agent': 'Demo'}, stream=True)
response.raw.decode_content = True # Ensure correct decoding
img = Image.open(BytesIO(response.content)).convert('L') # Convert to grayscale

# Run inference
from clientside.clients import make_hugging_face_client
cxr_client = make_hugging_face_client('cxr_model')
print(cxr_client.get_image_embeddings_from_images([img]))
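
As a side note, the BytesIO round trip and grayscale conversion used above can be sanity-checked locally without the network download. A minimal sketch with a synthetic Pillow image (the 4x4 image is a hypothetical stand-in for the downloaded X-ray; assumes Pillow is installed):

```python
from io import BytesIO
from PIL import Image

# Build a tiny synthetic RGB image in memory (stand-in for the downloaded X-ray)
rgb = Image.new("RGB", (4, 4), color=(120, 30, 200))

# Serialize to PNG bytes, mirroring what requests.get(...).content provides
buf = BytesIO()
rgb.save(buf, "PNG")
png_bytes = buf.getvalue()

# Re-open from raw bytes and convert to grayscale, as in the snippet above
img = Image.open(BytesIO(png_bytes)).convert("L")
print(img.mode, img.size)  # L (4, 4)
```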

A Contributor commented:

I tested it locally, and this simplified snippet seems to work just as well for loading the image. Could you replace it?

# Load image
from PIL import Image
from io import BytesIO
import requests
response = requests.get("https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png")
img = Image.open(BytesIO(response.content)).convert('L') # Convert to grayscale

@Wauplin (Contributor) commented Jan 15, 2025

Also, what I don't like with

!git clone https://github.com/Google-Health/cxr-foundation.git
import tensorflow as tf, sys, requests
sys.path.append('cxr-foundation/python/')

# Install dependencies
major_version = tf.__version__.rsplit(".", 1)[0]
!pip install tensorflow-text=={major_version} pypng && pip install --no-deps pydicom hcls_imaging_ml_toolkit retrying

is that it mixes command lines and Python code. This is OK-ish in a notebook environment, but it is not a valid Python snippet. I would really prefer that these installation instructions be added to the GitHub repo or a cxr-foundation wiki directly. That way, each line can be explained to the user, which we don't do here. Since the installation process is complex, I do think it's important to explain it in a proper markdown document rather than in an autogenerated code snippet on the Hugging Face Hub.
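
On the shell-vs-Python point: if the clone and install commands ever did need to live inside a plain Python file, they could be expressed with subprocess instead of notebook-style "!" lines. A hedged sketch (commands taken from the snippet above; the actual run calls are commented out so the block has no side effects, and the thread's conclusion is still to move these steps into the repo docs):

```python
import subprocess
import sys

# The notebook-style "!" lines expressed as plain Python command lists
clone_cmd = ["git", "clone", "https://github.com/Google-Health/cxr-foundation.git"]
install_cmd = [sys.executable, "-m", "pip", "install", "--no-deps",
               "pydicom", "hcls_imaging_ml_toolkit", "retrying"]

# Uncomment to actually execute:
# subprocess.run(clone_cmd, check=True)
# subprocess.run(install_cmd, check=True)
print(" ".join(clone_cmd))
```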

(Two resolved review comments on packages/tasks/src/model-libraries-snippets.ts, now outdated.)
Comment on lines +206 to +208
buf = BytesIO()
image.convert("RGB").save(buf, "PNG")
image_bytes = buf.getvalue()
A Contributor commented:

Is this step mandatory for the provided example? https://storage.googleapis.com/dx-scin-public-data/dataset/images/3445096909671059178.png seems to already be a PNG file with RGB colors.

Reply:

Good point! Below is a more condensed snippet:

from huggingface_hub import from_pretrained_keras
import tensorflow as tf, requests

# Load and format input
IMAGE_URL = "https://storage.googleapis.com/dx-scin-public-data/dataset/images/3445096909671059178.png"
input_tensor = tf.train.Example(
    features=tf.train.Features(
        feature={
            "image/encoded": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[requests.get(IMAGE_URL, stream=True).content])
            )
        }
    )
).SerializeToString()

# Load model and run inference
loaded_model = from_pretrained_keras("google/derm-foundation")
infer = loaded_model.signatures["serving_default"]
print(infer(inputs=tf.constant([input_tensor])))

@wzwz1 left a comment

I've incorporated the feedback and rewritten the snippets and tested them.

@ndebuhr (Contributor, Author) commented Jan 15, 2025

I'll cement these discussions into commits shortly. Thanks, team.

4 participants