Replies: 1 comment
-
The streaming output is ordered and token-by-token. We follow the same semantics as the OpenAI API.
-
By "in order", I mean output that arrives like this:
"I" -> "I like" -> "I like vLLM" -> "I like vLLM\n"
What I want is for streaming inference to return only the incremental characters, not the whole output. To do that, I have to store the previous output, and I have to be sure it really is the previous output, not one that arrived out of order.
Again, sorry for my poor English. :)
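
To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes each streamed result carries the full text generated so far, as in the example above. The function name `incremental_chunks` and the list of snapshots are hypothetical, only for illustration, and are not part of vLLM's API. It keeps the previous output, returns only the new suffix, and raises an error if a result is not an extension of the stored one (which is what an out-of-order result would look like):

```python
from typing import Iterable, Iterator


def incremental_chunks(cumulative_outputs: Iterable[str]) -> Iterator[str]:
    """Yield only the text that is new in each cumulative output.

    Hypothetical helper: assumes every item is the full text so far.
    """
    previous = ""
    for current in cumulative_outputs:
        # An in-order result must start with everything seen before it.
        if not current.startswith(previous):
            raise ValueError(
                f"output {current!r} does not extend previous output {previous!r}"
            )
        delta = current[len(previous):]
        previous = current
        if delta:  # skip results that add no new characters
            yield delta


# Stand-in for the streamed results from the example above.
snapshots = ["I", "I like", "I like vLLM", "I like vLLM\n"]
chunks = list(incremental_chunks(snapshots))
print(chunks)
```

Joining the yielded chunks gives back the last full output, so the client can print each chunk as it arrives without repeating text.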