This is a mistral-7b-instruct-v0.1 starter template from Banana.dev, providing on-demand serverless GPU inference.
You can fork this repository and deploy it on Banana as is, or customize it based on your own needs.
- Fork this repository to your own GitHub account.
- Connect your GitHub account on Banana.
- Create a new model on Banana from the forked GitHub repository.
- Wait for the model to build after creating it.
- Make an API request using one of the provided snippets in your Banana dashboard. Instead of sending only a prompt as shown in the snippet, structure your request inputs as follows:
```python
inputs = {
    "prompt": "your_prompt",
    # max_new_tokens should be an integer (the number of tokens to generate)
    "max_new_tokens": "your_max_new_tokens"
}
```
For more info, check out the Banana.dev docs.