Does vLLM return stream inference result in order? #3178

gaoxt1983 · 2024-03-04T12:10:11Z


By "in order" I mean something like this:
"I" -> "I like" -> "I like vLLM" -> "I like vLLM\n"

What I want to implement is streaming inference that returns only the incremental characters, not the whole output. To do that, I must store the last output, and I need to be sure it really is the most recent output, not one that arrived out of order.

Again, sorry for my poor English. :)

simon-mo · 2024-03-04T18:39:57Z

Maintainer

The streaming output is ordered and token-by-token. We follow the same semantics as the OpenAI API.
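For anyone who does receive the whole output so far on each step and wants only the new characters, here is a minimal sketch of the "store the last output" approach from the question. It is plain Python and does not call vLLM. The function name `incremental_chunks` and the list of strings standing in for the stream are assumptions made for illustration. The prefix check is only a safety net, since the stream is ordered.

```python
from typing import Iterable, Iterator


def incremental_chunks(cumulative_outputs: Iterable[str]) -> Iterator[str]:
    """Yield only the text added since the previous cumulative output.

    Hypothetical helper, not part of vLLM: it assumes each item is the
    full text generated so far.
    """
    previous = ""
    for current in cumulative_outputs:
        if not current.startswith(previous):
            # An ordered stream should never reach this branch; fail loudly if it does.
            raise ValueError(
                f"out-of-order output: {current!r} does not extend {previous!r}"
            )
        delta = current[len(previous):]
        previous = current
        if delta:  # skip steps that added no new text
            yield delta


if __name__ == "__main__":
    # Stand-in for a real stream, using the example from the question.
    stream = ["I", "I like", "I like vLLM", "I like vLLM\n"]
    print(list(incremental_chunks(stream)))  # ['I', ' like', ' vLLM', '\n']
```

Keeping only the previous string is enough here because each output extends the one before it, so the new part is always the suffix after `len(previous)`.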
