
Fix vision model graph capture not creating static buffers for embedding #942

Merged
1 commit merged into main on Oct 7, 2024

Conversation

PatriceVignola
Contributor

This change essentially reverses the ownership of the embeddings memory. Instead of creating the embeddings tensor in the embedding model and pointing the text model's embeddings input at it, we now create the embeddings tensor inside the text model and point the embedding model's output at it.

The reason for this is that the text model may be in "graph capture mode", which means it allocates static buffers that it reuses between iterations, and even between generators. If we allocate the memory in the embedding model and point the text model to it, the memory becomes invalid when the generator is destroyed, and the captured graph exhibits undefined behavior (mostly spitting out garbage output). By instead pointing the embeddings output of the embedding model at the static buffer created by the text model, we can be certain that the memory stays alive for the lifetime of the model.

This PR doesn't change the behavior of the non-graph-capture mode, since in that scenario it doesn't matter whether the tensor is created by the embedding model or the text model, but it fixes graph capture usage for vision models.
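For illustration, here is a minimal C++ sketch of the reversed ownership described above. The `Tensor`, `TextModel`, and `EmbeddingModel` types and all function names are hypothetical stand-ins rather than the actual onnxruntime-genai API; the only point is that the buffer is owned by the text model, whose lifetime spans generators, and the embedding model merely writes into it.

```cpp
// Minimal sketch (hypothetical types, not the real API) of the reversed
// ownership: the text model owns the static embeddings buffer, and the
// embedding model is pointed at that buffer instead of allocating its own.
#include <cstddef>
#include <memory>
#include <vector>

using Tensor = std::vector<float>;  // stand-in for a device tensor

struct TextModel {
  // In graph-capture mode this buffer must stay alive for the lifetime of
  // the model, because the captured graph records its address.
  std::shared_ptr<Tensor> static_embeddings;

  std::shared_ptr<Tensor>& GetEmbeddingsBuffer(size_t element_count) {
    if (!static_embeddings) {
      static_embeddings = std::make_shared<Tensor>(element_count);
    }
    return static_embeddings;
  }
};

struct EmbeddingModel {
  // Instead of allocating its own output tensor, the embedding model writes
  // into the buffer owned by the text model.
  std::shared_ptr<Tensor> output;

  void BindOutput(const std::shared_ptr<Tensor>& text_model_buffer) {
    output = text_model_buffer;
  }
};

int main() {
  TextModel text_model;
  EmbeddingModel embedding_model;

  // The embedding model now writes into memory that outlives any single
  // generator, so a captured graph in the text model never reads from a
  // freed buffer.
  embedding_model.BindOutput(
      text_model.GetEmbeddingsBuffer(/*element_count=*/4096));
  return 0;
}
```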

PatriceVignola merged commit 77a88c3 into main on Oct 7, 2024
13 checks passed
PatriceVignola deleted the user/pavignol/fix-vision-model-graph-capture branch on October 7, 2024 at 20:11