Find a way to avoid sending a lot of tokens past the last token for a particular item in the batch (i.e. we need to trim past the EOS in encode_response, let's open an issue and create an example about it)
As a more immediate improvement, we need to find a way to avoid sending many tokens past the last token for a particular item in the batch (i.e., we need to trim everything past the EOS in `encode_response`). Let's open an issue and create an example about it.
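To make the idea concrete, here is a minimal sketch of what "trimming past the EOS" could look like. It is a standalone illustration, not LitServe code: the helper names `trim_past_eos` and `stream_trimmed`, the `eos_id` parameter, and the assumption that each batch item is a plain list of integer token ids are all hypothetical. The real `encode_response` hook may receive a different data shape.

```python
from typing import Iterable, Iterator, List, Optional


def trim_past_eos(tokens: List[int], eos_id: int) -> List[int]:
    """Return tokens up to and including the first EOS (hypothetical helper).

    If no EOS is present, the sequence is returned unchanged.
    """
    try:
        return tokens[: tokens.index(eos_id) + 1]
    except ValueError:
        return tokens


def stream_trimmed(
    steps: Iterable[List[int]], eos_id: int
) -> Iterator[List[Optional[int]]]:
    """Filter a batched token stream (hypothetical helper).

    Each element of `steps` holds one token per batch item. Once an item has
    emitted EOS, its later tokens are replaced with None, and the stream stops
    as soon as every item in the batch has finished.
    """
    finished: List[bool] = []
    for step in steps:
        if not finished:
            finished = [False] * len(step)
        # Items that finished on an earlier step contribute nothing further.
        out = [None if done else tok for tok, done in zip(step, finished)]
        finished = [done or tok == eos_id for tok, done in zip(step, finished)]
        yield out
        if all(finished):
            break


# Non-streaming case: trim each batch item independently.
batch = [[5, 6, 2, 0, 0], [7, 2, 0, 0, 0], [8, 9, 10, 11, 2]]
trimmed = [trim_past_eos(item, eos_id=2) for item in batch]

# Streaming case: one list per decoding step, one token per batch item.
streamed = list(stream_trimmed([[5, 7], [2, 8], [0, 2], [0, 0]], eos_id=2))
```

In the streaming variant, `None` is only a placeholder for "nothing to send for this item"; how that is represented on the wire would depend on the server's batching contract.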
aniketmaurya changed the title from "find a way to avoid sending a lot of tokens past the last token for a particular item in the batch (i.e. we need to trim past the EOS in encode_response, let's open an issue and create an example about it)" to "avoid sending too much content during batched streaming" on Apr 23, 2024
lantiga changed the title from "avoid sending too much content during batched streaming" to "avoid sending content past the last token during batching / batched streaming" on Apr 29, 2024
Find a way to avoid sending a lot of tokens past the last token for a particular item in the batch (i.e. we need to trim past the EOS in encode_response, let's open an issue and create an example about it)
Originally posted by @lantiga in #55 (comment)