Stream = False issue #147
Comments
I've modified the code in chat.py to show the messages generated in those two cases:
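The modification itself didn't survive the page extraction; presumably it was a debug dump of the assembled messages, along these lines (the placement in chat.py and the names here are my assumptions, not the project's code):

import json

def debug_dump(messages):
    # Print the assembled message list so the stream=True and
    # stream=False cases can be compared side by side.
    print(json.dumps(messages, indent=2, default=str))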
Here are the results I got:
stream=True
I'm using functionary-small-v2.4 as the model with vLLM. Can anyone help?
vLLM gives this output:
If I disable grammar sampling, I get this in vLLM:
Hi, I managed to solve the issue, though I don't fully understand why it now works as it should.
I found that the model always returns finish_reason = "tool_calls" whenever a tool was called, even in responses that also contain content. Without this change, the inference never stops. The second change removes an append that was adding the same tool data to the message; the message already contained the tool data. As you can see in the example, the tool call id appears twice when using chat with the stream=False option. I don't know whether this behaviour is specific to the functionary-v2.4 model, because I haven't tested any other model. The two changes are sketched below.
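A minimal sketch of the two changes, assuming an OpenAI-style response dict; handle_completion, choice, message, and history are my guesses at chat.py's structure, not the actual code:

def handle_completion(choice: dict, history: list) -> None:
    message = choice["message"]

    # Change 1: functionary-v2.4 reports finish_reason == "tool_calls"
    # even when the response also carries content, so the inference
    # loop never stops. Treat a message with content as a normal stop.
    if choice.get("finish_reason") == "tool_calls" and message.get("content"):
        choice["finish_reason"] = "stop"

    # Change 2: the message already includes its tool_calls, so the
    # old extra append duplicated the same tool call id in the history:
    #   history[-1]["tool_calls"].append(tool_data)  # removed
    history.append(message)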
Hopefully this solves the issue. Can you comment?
Interesting, thank you. I'll have to dig in further.
Hello,
I ran this code as an example:
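The snippet itself wasn't captured; judging from the stream=False variant below, it was presumably the same call with streaming enabled, something like this (whether stream defaults to True is an assumption):

await chat.submit("What is the weather in San Francisco?", stream=True)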
The result is streamed just fine:
BUT if I run it with this change:
await chat.submit("What is the weather in San Francisco?", stream=False)
I got errors:
Is this an issue, or am I doing something wrong?