From @nweir127:
If I feed the API text that is longer than the acceptable context window (2048 tokens), the generator post-processor currently crashes.
A nice feature of the API would be to either return an appropriate error response or perform front truncation.
And a code snippet:
```python
# Front-truncate the prompt so that prompt + generated tokens fit in the context window.
input_token_ids = tokenizer(text, return_tensors='pt')['input_ids']
if hasattr(model.config, "n_positions"):  # check before reading the attribute
    max_context = model.config.n_positions - max_new_tokens
    if input_token_ids.shape[1] > max_context:
        # Keep only the last max_context tokens, then re-derive the text and ids.
        input_token_ids = input_token_ids[:, -max_context:]
        text = tokenizer.batch_decode(input_token_ids)[0]
        input_token_ids = tokenizer(text, return_tensors='pt')['input_ids']
```
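The other option @nweir127 mentions is an explicit error response. A minimal sketch of such a length check, assuming a Hugging Face-style tokenizer/model pair (the `ContextLengthError` name and the 2048-token fallback are illustrative assumptions, not part of any existing API):

```python
class ContextLengthError(ValueError):
    """Raised when the prompt plus requested generation exceeds the context window."""


def check_context_length(tokenizer, model, text, max_new_tokens):
    input_token_ids = tokenizer(text, return_tensors='pt')['input_ids']
    # n_positions holds the context window for GPT-2-style configs; 2048 is the
    # limit quoted in this issue, used here only as an assumed fallback.
    window = getattr(model.config, "n_positions", 2048)
    n_prompt = input_token_ids.shape[1]
    if n_prompt + max_new_tokens > window:
        raise ContextLengthError(
            f"Prompt has {n_prompt} tokens; with max_new_tokens={max_new_tokens} "
            f"this exceeds the {window}-token context window."
        )
    return input_token_ids
```

A serving layer could catch this exception and return a 4xx response with that message, rather than crashing in the post-processor.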