Merge pull request #2910 from continuedev/nate/qwen-coder
recommend qwen2.5-coder:1.5b
sestinj authored Nov 13, 2024
2 parents 60166e3 + ce784e1 commit 83bff02
Showing 8 changed files with 51 additions and 40 deletions.
22 changes: 10 additions & 12 deletions core/autocomplete/README.md
@@ -7,7 +7,7 @@ Continue now provides support for tab autocomplete in [VS Code](https://marketpl
We recommend setting up tab-autocomplete with a local Ollama instance. To do this, first download the latest version of Ollama from [here](https://ollama.ai). Then, run the following command to download our recommended model:

```bash
-ollama run starcoder:3b
+ollama run qwen2.5-coder:1.5b
```

Once it has been downloaded, you should begin to see completions in VS Code.
@@ -17,9 +17,9 @@ Once it has been downloaded, you should begin to see completions in VS Code.
You can also set up tab-autocomplete with a local LM Studio instance by following these steps:

1. Download the latest version of LM Studio from [here](https://lmstudio.ai/)
-2. Download a model (e.g. search for `second-state/StarCoder2-3B-GGUF` and choose one of the options there)
+2. Download a model (e.g. search for `Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF` and choose one of the options there)
3. Go to the server section (button is on the left), select your model from the dropdown at the top, and click "Start Server"
-4. Go to the "My Models" section (button is on the left), find your selected model, and copy the name the path (example: `second-state/StarCoder2-3B-GGUF/starcoder2-3b-Q8_0.gguf`); this will be used as the "model" attribute in Continue
+4. Go to the "My Models" section (button is on the left), find your selected model, and copy the model path (example: `Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF/qwen2.5-coder-1.5b-instruct-q4_k_m.gguf`); this will be used as the "model" attribute in Continue
5. Go to Continue and modify the configurations for a [custom model](#setting-up-a-custom-model)
6. Set the "provider" to `lmstudio` and the "model" to the path copied earlier

@@ -28,8 +28,8 @@ Example:
```json title="config.json"
{
  "tabAutocompleteModel": {
-    "title": "Starcoder2 3b",
-    "model": "second-state/StarCoder2-3B-GGUF/starcoder2-3b-Q8_0.gguf",
+    "title": "Qwen2.5-Coder 1.5b",
+    "model": "Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF",
    "provider": "lmstudio",
  },
  ...
@@ -69,11 +69,9 @@ If you aren't yet familiar with the available options, you can learn more in our

### What model should I use?

-If you are running the model locally, we recommend `starcoder:3b`.
+If you are running the model locally, we recommend `qwen2.5-coder:1.5b`.

-If you find it to be too slow, you should try `deepseek-coder:1.3b-base`.
-
-If you have a bit more compute, or are running a model in the cloud, you can upgrade to `deepseek-coder:6.7b-base`.
+If you have a bit more compute, or are running a model in the cloud, you can upgrade to `qwen2.5-coder:7b`.

Regardless of what you are willing to spend, we do not recommend using GPT or Claude for autocomplete. Learn why [below](#i-want-better-completions-should-i-use-gpt-4).
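
Upgrading is a one-line edit to `config.json`. A minimal sketch, assuming a local Ollama instance serving the `qwen2.5-coder:7b` tag (the `title` is just a display label):

```json title="config.json"
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 7B",
    "provider": "ollama",
    "model": "qwen2.5-coder:7b"
  }
}
```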

@@ -83,7 +81,7 @@ The following can be configured in `config.json`:

### `tabAutocompleteModel`

-This is just another object like the ones in the `"models"` array of `config.json`. You can choose and configure any model you would like, but we strongly suggest using a small model made for tab-autocomplete, such as `deepseek-1b`, `starcoder-1b`, or `starcoder-3b`.
+This is just another object like the ones in the `"models"` array of `config.json`. You can choose and configure any model you would like, but we strongly suggest using a small model made for tab-autocomplete, such as `deepseek-1b`, `qwen2.5-coder:1.5b`, or `starcoder-3b`.

### `tabAutocompleteOptions`

@@ -105,7 +103,7 @@ This object allows you to customize the behavior of tab-autocomplete. The availa
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"provider": "ollama",
"model": "starcoder:3b",
"model": "qwen2.5-coder:1.5b",
"apiBase": "https://<my endpoint>"
},
"tabAutocompleteOptions": {
@@ -128,7 +126,7 @@ Follow these steps to ensure that everything is set up correctly:

1. Make sure you have the "Enable Tab Autocomplete" setting checked (in VS Code, you can toggle by clicking the "Continue" button in the status bar).
2. Make sure you have downloaded Ollama.
-3. Run `ollama run starcoder:3b` to verify that the model is downloaded.
+3. Run `ollama run qwen2.5-coder:1.5b` to verify that the model is downloaded.
4. Make sure that any other completion providers are disabled (e.g. Copilot), as they may interfere.
5. Make sure that you aren't also using another Ollama model for chat. This will cause Ollama to constantly load and unload the models from memory, resulting in slow responses (or none at all) for both.
6. Check the output of the logs to find any potential errors (cmd/ctrl+shift+p -> "Toggle Developer Tools" -> "Console" tab in VS Code, ~/.continue/logs/core.log in JetBrains).
4 changes: 2 additions & 2 deletions core/config/onboarding.ts
@@ -4,7 +4,7 @@ import { FREE_TRIAL_MODELS } from "./default";

export const TRIAL_FIM_MODEL = "codestral-latest";
export const ONBOARDING_LOCAL_MODEL_TITLE = "Ollama";
-export const LOCAL_ONBOARDING_FIM_MODEL = "starcoder2:3b";
+export const LOCAL_ONBOARDING_FIM_MODEL = "qwen2.5-coder:1.5b";
export const LOCAL_ONBOARDING_CHAT_MODEL = "llama3.1:8b";
export const LOCAL_ONBOARDING_CHAT_TITLE = "Llama 3.1 8B";

@@ -35,7 +35,7 @@ export function setupLocalConfig(
      ...config.models.filter((model) => model.provider !== "free-trial"),
    ],
    tabAutocompleteModel: {
-      title: "Starcoder 3b",
+      title: "Qwen2.5-Coder 1.5B",
      provider: "ollama",
      model: LOCAL_ONBOARDING_FIM_MODEL,
    },
13 changes: 5 additions & 8 deletions docs/docs/autocomplete/model-setup.md
@@ -26,26 +26,23 @@ The API keys for Codestral and the general Mistral APIs are different. If you ar

## Local, offline / self-hosted experience

-For those preferring local execution or self-hosting,`StarCoder2-3b` offers a good balance of performance and quality for most users:
+For those preferring local execution or self-hosting, `Qwen2.5-Coder 1.5B` offers a good balance of performance and quality for most users:

```json title="config.json"
{
  "tabAutocompleteModel": {
-    "title": "StarCoder2-3b",
-    "model": "starcoder2:3b",
+    "title": "Qwen2.5-Coder 1.5B",
+    "model": "qwen2.5-coder:1.5b",
    "provider": "ollama"
  }
}
```

## Alternative experiences

-- Completions too slow? Try `deepseek-coder:1.3b-base` for quicker completions on less powerful hardware
-- Have more compute? Use `deepseek-coder:6.7b-base` for potentially higher-quality suggestions
+Have more compute? Use `qwen2.5-coder:7b` for potentially higher-quality suggestions.

:::note

-For LM Studio users, navigate to the "My Models" section, find your desired model, and copy the path (e.g., `second-state/StarCoder2-3B-GGUF/starcoder2-3b-Q8_0.gguf`). Use this path as the `model` value in your configuration.
+For LM Studio users, navigate to the "My Models" section, find your desired model, and copy the path (e.g., `Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF/qwen2.5-coder-1.5b-instruct-q4_k_m.gguf`). Use this path as the `model` value in your configuration.

:::
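
Wired into `config.json`, the LM Studio setup looks like the following sketch (the path is the example from the note above; substitute the one you copied):

```json title="config.json"
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 1.5B",
    "provider": "lmstudio",
    "model": "Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF/qwen2.5-coder-1.5b-instruct-q4_k_m.gguf"
  }
}
```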

14 changes: 5 additions & 9 deletions docs/docs/customize/deep-dives/autocomplete.md
@@ -23,7 +23,7 @@ If you want to have the best autocomplete experience, we recommend using Codestr
If you'd like to run your autocomplete model locally, we recommend using Ollama. To do this, first download the latest version of Ollama from [here](https://ollama.ai). Then, run the following command to download our recommended model:

```bash
-ollama run starcoder2:3b
+ollama run qwen2.5-coder:1.5b
```

Once it has been downloaded, you should begin to see completions in VS Code.
@@ -37,7 +37,7 @@ All of the configuration options available for chat models are available to use
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"provider": "ollama",
"model": "starcoder2:3b",
"model": "qwen2.5-coder:1.5b",
"apiBase": "https://<my endpoint>"
},
...
@@ -52,7 +52,7 @@ The following can be configured in `config.json`:

### `tabAutocompleteModel`

+This is just another object like the ones in the `"models"` array of `config.json`. You can choose and configure any model you would like, but we strongly suggest using a small model made for tab-autocomplete, such as `deepseek-1b`, `qwen2.5-coder:1.5b`, or `starcoder2-3b`.
This is just another object like the ones in the `"models"` array of `config.json`. You can choose and configure any model you would like, but we strongly suggest using a small model made for tab-autocomplete, such as `deepseek-1b`, `qwen2.5-coder:1.5b`, or `starcoder2-3b`.

### `tabAutocompleteOptions`

@@ -78,7 +78,7 @@ This object allows you to customize the behavior of tab-autocomplete. The availa
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"provider": "ollama",
"model": "starcoder2:3b",
"model": "qwen2.5-coder:1.5b",
"apiBase": "https://<my endpoint>"
},
"tabAutocompleteOptions": {
@@ -101,16 +101,12 @@ Follow these steps to ensure that everything is set up correctly:

1. Make sure you have the "Enable Tab Autocomplete" setting checked (in VS Code, you can toggle by clicking the "Continue" button in the status bar, and in JetBrains by going to Settings -> Tools -> Continue).
2. Make sure you have downloaded Ollama.
-3. Run `ollama run starcoder2:3b` to verify that the model is downloaded.
+3. Run `ollama run qwen2.5-coder:1.5b` to verify that the model is downloaded.
4. Make sure that any other completion providers are disabled (e.g. Copilot), as they may interfere.
5. Check the output of the logs to find any potential errors: <kbd>cmd/ctrl</kbd> + <kbd>shift</kbd> + <kbd>P</kbd> -> "Toggle Developer Tools" -> "Console" tab in VS Code, ~/.continue/logs/core.log in JetBrains.
6. Check VS Code settings to make sure that `"editor.inlineSuggest.enabled"` is set to `true` (use <kbd>cmd/ctrl</kbd> + <kbd>,</kbd> then search for this and check the box)
7. If you are still having issues, please let us know in our [Discord](https://discord.gg/vapESyrFmJ) and we'll help as soon as possible.

-### Completions are slow
-
-Depending on your hardware, you may want to try a smaller, faster model. If 3b isn't working for you we recommend trying `deepseek-coder:1.3b-base`.

### Completions are only ever single-line

To ensure that you receive multi-line completions, you can set `"multilineCompletions": "always"` in `tabAutocompleteOptions`. By default, it is `"auto"`. If you still find that you are only seeing single-line completions, this may be because some models tend to produce shorter completions when starting in the middle of a file. You can try temporarily moving text below your cursor out of your active file, or switching to a larger model.
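
A minimal sketch of that setting in `config.json`:

```json title="config.json"
{
  "tabAutocompleteOptions": {
    "multilineCompletions": "always"
  }
}
```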
6 changes: 3 additions & 3 deletions docs/docs/customize/model-providers/top-level/ollama.md
@@ -23,14 +23,14 @@ We recommend configuring **Llama3.1 8B** as your chat model.

## Autocomplete model

-We recommend configuring **StarCoder2 3B** as your autocomplete model.
+We recommend configuring **Qwen2.5-Coder 1.5B** as your autocomplete model.

```json title="config.json"
{
  "tabAutocompleteModel": {
-    "title": "StarCoder2 3B",
+    "title": "Qwen2.5-Coder 1.5B",
    "provider": "ollama",
-    "model": "starcoder2:3b"
+    "model": "qwen2.5-coder:1.5b"
  }
}
```
2 changes: 1 addition & 1 deletion docs/docs/customize/model-types/autocomplete.md
@@ -13,4 +13,4 @@ In Continue, these models are used to display inline [Autocomplete](../../autoco

If you have the ability to use any model, we recommend `Codestral` with [Mistral](../model-providers/top-level/mistral.md#autocomplete-model) or [Vertex AI](../model-providers/top-level/vertexai.md#autocomplete-model).

-If you want to run a model locally, we recommend `Starcoder2-3B` with [Ollama](../model-providers/top-level/ollama.md#autocomplete-model).
+If you want to run a model locally, we recommend `Qwen2.5-Coder 1.5B` with [Ollama](../model-providers/top-level/ollama.md#autocomplete-model).
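
A minimal `config.json` sketch for that local setup, mirroring the Ollama guide linked above:

```json title="config.json"
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 1.5B",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b"
  }
}
```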
28 changes: 24 additions & 4 deletions extensions/vscode/config_schema.json
@@ -772,6 +772,11 @@
"mistral-tiny",
"mistral-small",
"mistral-medium",
"qwen2.5-coder:1.5b",
"qwen2.5-coder:3b",
"qwen2.5-coder:7b",
"qwen2.5-coder:14b",
"qwen2.5-coder:32b",
"AUTODETECT"
]
},
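
Assuming this enum backs the Ollama provider's `model` field (the neighboring `AUTODETECT` entry suggests so), a `models` entry like the following sketch now validates against the schema:

```json title="config.json"
{
  "models": [
    {
      "title": "Qwen2.5-Coder 7B",
      "provider": "ollama",
      "model": "qwen2.5-coder:7b"
    }
  ]
}
```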
@@ -1028,7 +1033,12 @@
"stable-code-3b",
"starcoder-1b",
"starcoder-3b",
"starcoder2-3b"
"starcoder2-3b",
"qwen2.5-coder:1.5b",
"qwen2.5-coder:3b",
"qwen2.5-coder:7b",
"qwen2.5-coder:14b",
"qwen2.5-coder:32b"
]
},
{
@@ -1081,6 +1091,11 @@
"starcoder-1b",
"starcoder-3b",
"starcoder2-3b",
"qwen2.5-coder:1.5b",
"qwen2.5-coder:3b",
"qwen2.5-coder:7b",
"qwen2.5-coder:14b",
"qwen2.5-coder:32b",
"AUTODETECT"
]
},
@@ -1400,7 +1415,12 @@
"stable-code-3b",
"starcoder-1b",
"starcoder-3b",
"starcoder2-3b"
"starcoder2-3b",
"qwen2.5-coder:1.5b",
"qwen2.5-coder:3b",
"qwen2.5-coder:7b",
"qwen2.5-coder:14b",
"qwen2.5-coder:32b"
]
},
{
@@ -2710,8 +2730,8 @@
},
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"markdownDescription": "The model used for tab autocompletion. If undefined, Continue will default to using starcoder2:3b on a local Ollama instance.\n\n*IMPORTANT*:\n\nIf you use a custom model, ensure that it is one trained for fill-in-the-middle completions. An instruct model is typically not well-suited to autocomplete and you may receive unsatisfactory completions.",
"x-intellij-html-description": "The model used for tab autocompletion. If undefined, Continue will default to using starcoder2:3b on a local Ollama instance.<br><br><i>IMPORTANT</i>:<br><br>If you use a custom model, ensure that it is one trained for fill-in-the-middle completions. An instruct model is typically not well-suited to autocomplete and you may receive unsatisfactory completions.",
"markdownDescription": "The model used for tab autocompletion. If undefined, Continue will default to using qwen2.5-coder:1.5b on a local Ollama instance.\n\n*IMPORTANT*:\n\nIf you use a custom model, ensure that it is one trained for fill-in-the-middle completions. An instruct model is typically not well-suited to autocomplete and you may receive unsatisfactory completions.",
"x-intellij-html-description": "The model used for tab autocompletion. If undefined, Continue will default to using qwen2.5-coder:1.5b on a local Ollama instance.<br><br><i>IMPORTANT</i>:<br><br>If you use a custom model, ensure that it is one trained for fill-in-the-middle completions. An instruct model is typically not well-suited to autocomplete and you may receive unsatisfactory completions.",
"default": {
"title": "Tab Autocomplete Model",
"provider": "ollama",
2 changes: 1 addition & 1 deletion extensions/vscode/src/util/loadAutocompleteModel.ts
@@ -6,7 +6,7 @@ import type { ILLM } from "core";

export class TabAutocompleteModel {
  private _llm: ILLM | undefined;
-  private defaultTag = "starcoder2:3b";
+  private defaultTag = "qwen2.5-coder:1.5b";
  private globalContext: GlobalContext = new GlobalContext();

  constructor(private configHandler: ConfigHandler) {}
