Add usage snippets for Google Health AI models #1084
base: main
Conversation
Co-authored-by: vb <vaibhavs10@gmail.com>
…ation models (Google Health AI), to improve their Hugging Face Hub pages and ease model adoption
…gle-hai-libraries
Thanks for the PR @ndebuhr - re: both snippets:
- Both should actually be snippets (we discourage putting URLs to notebooks etc)
- For the other code snippet, we try to put the least possible lines of code that someone can understand/ work with. So would be great to reduce it.
Otherwise, I think this information can fit nicely in the Model Card.
…gle-hai-libraries
…mage preprocessing and embeddings computation, not post-processing/visualization
@Vaibhavs10 I can collapse some lines or remove some comments to further reduce the LOC, but I think that would impact readability (especially in the relatively low-width snippet modal). Hoping this second pass is better?
…tilizes CxrClient make_hugging_face_client
Thanks a lot for iterating here, some last nits and we're good to merge!
```ts
@@ -95,6 +95,26 @@ export const bm25s = (model: ModelData): string[] => [
retriever = BM25HF.load_from_hub("${model.id}")`,
];

export const cxr_foundation = (model: ModelData): string[] => [
	`# Install library
```
I think we can remove the install instructions completely from here and keep them in the model card (as they are currently), and just start from `from PIL import Image`.
Since we aren't doing a "typical" installation (e.g., pip install from PyPI or git), I'm a bit concerned about pulling that information out of the snippet. I think people will make assumptions that aren't true. Basically, there's no way somebody is going to correctly use the library or guess the installation without this, so I'd really prefer we keep that critical information in the snippet (as it looks like some other libraries have done). Is that fair?
@ndebuhr where did you get these installation steps? I would be more inclined to add a single comment line redirecting users to the installation instructions. Something like this:

```python
# Install from https://github.com/Google-Health/cxr-foundation
```

This is what is already done for quite a few libraries and the direction we want to take for new ones. The problem with adding an installation guide in the code snippet is that it makes the snippet much longer and much harder to maintain (installation guides can be tricky). In this case, the best solution would be to have those few lines in the cxr-foundation README directly.
Thanks for the PR! Agree with @Vaibhavs10 on trying to get more concise examples (though still readable).
```python
cxr_client = make_hugging_face_client('cxr_model')
!wget -nc -q https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png

print(cxr_client.get_image_embeddings_from_images([Image.open("Chest_Xray_PA_3-8-2010.png")]))`,
```
Interlacing Python lines of code with command lines feels weird. Could you switch to a full Python snippet, using for instance `requests.get("https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png")`? (as done below)
Here is a pure Python solution. Alternatively, we can require the user to provide a local image.

```python
import requests
from io import BytesIO
from PIL import Image

# Image attribution: Stillwaterising, CC0, via Wikimedia Commons
image_url = "https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png"
response = requests.get(image_url, headers={'User-Agent': 'Demo'}, stream=True)
response.raw.decode_content = True  # Ensure correct decoding
img = Image.open(BytesIO(response.content)).convert('L')  # Convert to grayscale
```
I tested the following complete snippet:
```python
!git clone https://github.com/Google-Health/cxr-foundation.git
import tensorflow as tf, sys, requests
sys.path.append('cxr-foundation/python/')

# Install dependencies
major_version = tf.__version__.rsplit(".", 1)[0]
!pip install tensorflow-text=={major_version} pypng && pip install --no-deps pydicom hcls_imaging_ml_toolkit retrying

# Load image (Stillwaterising, CC0, via Wikimedia Commons)
from PIL import Image
from io import BytesIO
image_url = "https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png"
response = requests.get(image_url, headers={'User-Agent': 'Demo'}, stream=True)
response.raw.decode_content = True  # Ensure correct decoding
img = Image.open(BytesIO(response.content)).convert('L')  # Convert to grayscale

# Run inference
from clientside.clients import make_hugging_face_client
cxr_client = make_hugging_face_client('cxr_model')
print(cxr_client.get_image_embeddings_from_images([img]))
```
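As an aside, the version pin in the snippet above relies on `str.rsplit` dropping only the patch component of the TensorFlow version string, so `tensorflow-text` tracks the same minor release. A minimal illustration (the version value below is an example stand-in for `tf.__version__`, not fetched from TensorFlow):

```python
# Example value standing in for tf.__version__
tf_version = "2.16.1"

# rsplit(".", 1) splits once from the right, so [0] keeps "major.minor"
major_version = tf_version.rsplit(".", 1)[0]
print(major_version)                                    # → 2.16
print(f"pip install tensorflow-text=={major_version}")  # → pip install tensorflow-text==2.16
```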
I tested it locally and it seems that this simplified snippet works just as well for loading the image. Could you replace it?
```python
# Load image
from PIL import Image
from io import BytesIO
import requests

response = requests.get("https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png")
img = Image.open(BytesIO(response.content)).convert('L')  # Convert to grayscale
```
Also, what I don't like with

```python
!git clone https://github.com/Google-Health/cxr-foundation.git
import tensorflow as tf, sys, requests
sys.path.append('cxr-foundation/python/')

# Install dependencies
major_version = tf.__version__.rsplit(".", 1)[0]
!pip install tensorflow-text=={major_version} pypng && pip install --no-deps pydicom hcls_imaging_ml_toolkit retrying
```
is that it involves both command lines and python code. This is ok-ish in a notebook environment but it is not a valid Python snippet. I would really prefer if these installation instructions were added to the Github repo or in a CXR-foundation wiki directly. This way, lines could be explained to the user which we don't do here. Since the installation process is complex I do think it's important to explain it in a proper markdown document rather than an autogenerated code snippet on the Hugging Face Hub.
```python
buf = BytesIO()
image.convert("RGB").save(buf, "PNG")
image_bytes = buf.getvalue()
```
Is this step mandatory for the provided example? https://storage.googleapis.com/dx-scin-public-data/dataset/images/3445096909671059178.png seems to already be a PNG file with RGB colors.
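One stdlib-only way to check a claim like this is to read the PNG IHDR chunk directly: its color-type byte is 2 for truecolor (RGB), in which case the `convert("RGB")` re-encode round trip adds nothing. The header bytes below are synthetic, built in-place for illustration rather than fetched from the URL above:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_color_type(data: bytes) -> int:
    """Return the PNG IHDR color type (0=gray, 2=truecolor/RGB, 3=palette, 4=gray+alpha, 6=RGBA)."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    # IHDR must be the first chunk: 4-byte length, b"IHDR", then
    # width(4) height(4) bit-depth(1) color-type(1) compression(1) filter(1) interlace(1)
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    _w, _h, _depth, color_type = struct.unpack(">IIBB", data[16:26])
    return color_type

# Synthetic header for an 800x600 8-bit truecolor PNG (illustrative only)
header = (
    PNG_SIGNATURE
    + struct.pack(">I", 13)  # IHDR data length
    + b"IHDR"
    + struct.pack(">IIBBBBB", 800, 600, 8, 2, 0, 0, 0)
)
print(png_color_type(header))  # → 2, i.e. truecolor RGB, so no conversion needed
```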
Good point! Below is a more condensed snippet:
```python
from huggingface_hub import from_pretrained_keras
import tensorflow as tf, requests

# Load and format input
IMAGE_URL = "https://storage.googleapis.com/dx-scin-public-data/dataset/images/3445096909671059178.png"
input_tensor = tf.train.Example(
    features=tf.train.Features(
        feature={
            "image/encoded": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[requests.get(IMAGE_URL, stream=True).content])
            )
        }
    )
).SerializeToString()

# Load model and run inference
loaded_model = from_pretrained_keras("google/derm-foundation")
infer = loaded_model.signatures["serving_default"]
print(infer(inputs=tf.constant([input_tensor])))
```
I've incorporated the feedback and rewritten the snippets and tested them.
I'll cement these discussions into commits shortly. Thanks team.
Adding usage snippets to improve Hugging Face Hub model pages for CXR Foundation and Derm Foundation - improving usability via example Python code and linked Jupyter notebooks. Doing this at the model library level, to complement the usage documentation in the model cards, so that the "Use this model" Hub button provides useful information (currently "integration status unknown").