I'm always getting n_tokens <= n_batch and then it fails running #76
-
Do you have any workarounds or suggestions for how to tell whether I am reaching the maximum number of tokens per prompt? For example, a sample prompt is:
|
Answered by giladgd on Oct 23, 2023
-
@rossjackson Try setting `batchSize` to the same value as the `contextSize` you have set on a `LlamaContext`. The limit on the number of tokens a single prompt evaluation can process is the `batchSize` (including the tokens generated by the chat prompt wrapper).
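Here is a minimal sketch of what that configuration can look like, assuming the node-llama-cpp v2-style API from around the time of this discussion (`LlamaModel`, `LlamaContext`, `LlamaChatSession`, and `context.encode()` for tokenization). The model path, sizes, and the token-count check are illustrative assumptions, and newer versions of the library may expose different names:

```typescript
import path from "path";
import { fileURLToPath } from "url";
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Use one size for both settings so a prompt that fits in the context
// window can also be evaluated in a single batch.
const contextSize = 4096;

const model = new LlamaModel({
    // hypothetical model path, for illustration only
    modelPath: path.join(__dirname, "models", "model.gguf")
});

const context = new LlamaContext({
    model,
    contextSize,
    // key point from the answer: batchSize === contextSize, so a long
    // prompt does not trip the n_tokens <= n_batch assertion
    batchSize: contextSize
});

const session = new LlamaChatSession({ context });

// Rough check for "am I reaching the max tokens per prompt?":
// tokenize the prompt and compare against contextSize.
// Note: the chat prompt wrapper adds extra tokens on top of this count,
// and encode() here is assumed to follow the v2-era API.
const prompt = "Summarize the following text ...";
const promptTokenCount = context.encode(prompt).length;

if (promptTokenCount >= contextSize) {
    console.warn(
        `Prompt uses ${promptTokenCount} tokens, which exceeds the context size of ${contextSize}`
    );
} else {
    const answer = await session.prompt(prompt);
    console.log(answer);
}
```

Because the prompt-token check above does not account for the chat prompt wrapper's own tokens, leaving some headroom below `contextSize` is a safer guard than an exact comparison.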