Merge pull request #2910 from continuedev/nate/qwen-coder
recommend qwen2.5-coder:1.5b
sestinj authored Nov 13, 2024
2 parents 60166e3 + ce784e1 commit 83bff02
Showing 8 changed files with 51 additions and 40 deletions.
22 changes: 10 additions & 12 deletions core/autocomplete/README.md
@@ -7,7 +7,7 @@ Continue now provides support for tab autocomplete in [VS Code](https://marketpl
We recommend setting up tab-autocomplete with a local Ollama instance. To do this, first download the latest version of Ollama from [here](https://ollama.ai). Then, run the following command to download our recommended model:

```bash
-ollama run starcoder:3b
+ollama run qwen2.5-coder:1.5b
```

Once it has been downloaded, you should begin to see completions in VS Code.
@@ -17,9 +17,9 @@ Once it has been downloaded, you should begin to see completions in VS Code.
You can also set up tab-autocomplete with a local LM Studio instance by following these steps:

1. Download the latest version of LM Studio from [here](https://lmstudio.ai/)
-2. Download a model (e.g. search for `second-state/StarCoder2-3B-GGUF` and choose one of the options there)
+2. Download a model (e.g. search for `Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF` and choose one of the options there)
3. Go to the server section (button is on the left), select your model from the dropdown at the top, and click "Start Server"
-4. Go to the "My Models" section (button is on the left), find your selected model, and copy the name the path (example: `second-state/StarCoder2-3B-GGUF/starcoder2-3b-Q8_0.gguf`); this will be used as the "model" attribute in Continue
+4. Go to the "My Models" section (button is on the left), find your selected model, and copy the model path (example: `Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF/qwen2.5-coder-1.5b-instruct-q4_k_m.gguf`); this will be used as the "model" attribute in Continue
5. Go to Continue and modify the configurations for a [custom model](#setting-up-a-custom-model)
6. Set the "provider" to `lmstudio` and the "model" to the path copied earlier

@@ -28,8 +28,8 @@ Example:
```json title="config.json"
{
  "tabAutocompleteModel": {
-    "title": "Starcoder2 3b",
-    "model": "second-state/StarCoder2-3B-GGUF/starcoder2-3b-Q8_0.gguf",
+    "title": "Qwen2.5-Coder 1.5b",
+    "model": "Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF",
    "provider": "lmstudio",
  },
  ...
@@ -69,11 +69,9 @@ If you aren't yet familiar with the available options, you can learn more in our

### What model should I use?

-If you are running the model locally, we recommend `starcoder:3b`.
+If you are running the model locally, we recommend `qwen2.5-coder:1.5b`.

-If you find it to be too slow, you should try `deepseek-coder:1.3b-base`.
-
-If you have a bit more compute, or are running a model in the cloud, you can upgrade to `deepseek-coder:6.7b-base`.
+If you have a bit more compute, or are running a model in the cloud, you can upgrade to `qwen2.5-coder:7b`.

Regardless of what you are willing to spend, we do not recommend using GPT or Claude for autocomplete. Learn why [below](#i-want-better-completions-should-i-use-gpt-4).
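
Upgrading is a one-line edit to `config.json`. A minimal sketch, assuming a local Ollama instance serving the `qwen2.5-coder:7b` tag (the `title` is just a display label):

```json title="config.json"
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 7B",
    "provider": "ollama",
    "model": "qwen2.5-coder:7b"
  }
}
```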

@@ -83,7 +81,7 @@ The following can be configured in `config.json`:

### `tabAutocompleteModel`

-This is just another object like the ones in the `"models"` array of `config.json`. You can choose and configure any model you would like, but we strongly suggest using a small model made for tab-autocomplete, such as `deepseek-1b`, `starcoder-1b`, or `starcoder-3b`.
+This is just another object like the ones in the `"models"` array of `config.json`. You can choose and configure any model you would like, but we strongly suggest using a small model made for tab-autocomplete, such as `deepseek-1b`, `qwen2.5-coder:1.5b`, or `starcoder-3b`.

### `tabAutocompleteOptions`

@@ -105,7 +103,7 @@ This object allows you to customize the behavior of tab-autocomplete. The availa
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"provider": "ollama",
"model": "starcoder:3b",
"model": "qwen2.5-coder:1.5b",
"apiBase": "https://<my endpoint>"
},
"tabAutocompleteOptions": {
@@ -128,7 +126,7 @@ Follow these steps to ensure that everything is set up correctly:

1. Make sure you have the "Enable Tab Autocomplete" setting checked (in VS Code, you can toggle by clicking the "Continue" button in the status bar).
2. Make sure you have downloaded Ollama.
-3. Run `ollama run starcoder:3b` to verify that the model is downloaded.
+3. Run `ollama run qwen2.5-coder:1.5b` to verify that the model is downloaded.
4. Make sure that any other completion providers are disabled (e.g. Copilot), as they may interfere.
5. Make sure that you aren't also using another Ollama model for chat. This will cause Ollama to constantly load and unload the models from memory, resulting in slow responses (or none at all) for both.
6. Check the output of the logs to find any potential errors (cmd/ctrl+shift+p -> "Toggle Developer Tools" -> "Console" tab in VS Code, ~/.continue/logs/core.log in JetBrains).
4 changes: 2 additions & 2 deletions core/config/onboarding.ts
@@ -4,7 +4,7 @@ import { FREE_TRIAL_MODELS } from "./default";

export const TRIAL_FIM_MODEL = "codestral-latest";
export const ONBOARDING_LOCAL_MODEL_TITLE = "Ollama";
-export const LOCAL_ONBOARDING_FIM_MODEL = "starcoder2:3b";
+export const LOCAL_ONBOARDING_FIM_MODEL = "qwen2.5-coder:1.5b";
export const LOCAL_ONBOARDING_CHAT_MODEL = "llama3.1:8b";
export const LOCAL_ONBOARDING_CHAT_TITLE = "Llama 3.1 8B";

@@ -35,7 +35,7 @@ export function setupLocalConfig(
      ...config.models.filter((model) => model.provider !== "free-trial"),
    ],
    tabAutocompleteModel: {
-      title: "Starcoder 3b",
+      title: "Qwen2.5-Coder 1.5B",
      provider: "ollama",
      model: LOCAL_ONBOARDING_FIM_MODEL,
    },
13 changes: 5 additions & 8 deletions docs/docs/autocomplete/model-setup.md
@@ -26,26 +26,23 @@ The API keys for Codestral and the general Mistral APIs are different. If you ar

## Local, offline / self-hosted experience

-For those preferring local execution or self-hosting,`StarCoder2-3b` offers a good balance of performance and quality for most users:
+For those preferring local execution or self-hosting, `Qwen2.5-Coder 1.5B` offers a good balance of performance and quality for most users:

```json title="config.json"
{
  "tabAutocompleteModel": {
-    "title": "StarCoder2-3b",
-    "model": "starcoder2:3b",
+    "title": "Qwen2.5-Coder 1.5B",
+    "model": "qwen2.5-coder:1.5b",
    "provider": "ollama"
  }
}
```

## Alternative experiences

-- Completions too slow? Try `deepseek-coder:1.3b-base` for quicker completions on less powerful hardware
-- Have more compute? Use `deepseek-coder:6.7b-base` for potentially higher-quality suggestions
+Have more compute? Use `qwen2.5-coder:7b` for potentially higher-quality suggestions.

:::note

-For LM Studio users, navigate to the "My Models" section, find your desired model, and copy the path (e.g., `second-state/StarCoder2-3B-GGUF/starcoder2-3b-Q8_0.gguf`). Use this path as the `model` value in your configuration.
+For LM Studio users, navigate to the "My Models" section, find your desired model, and copy the path (e.g., `Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF/qwen2.5-coder-1.5b-instruct-q4_k_m.gguf`). Use this path as the `model` value in your configuration.

:::
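
Wired into `config.json`, the LM Studio setup looks like the following sketch (the path is the example from the note above; substitute the one you copied):

```json title="config.json"
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 1.5B",
    "provider": "lmstudio",
    "model": "Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF/qwen2.5-coder-1.5b-instruct-q4_k_m.gguf"
  }
}
```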

14 changes: 5 additions & 9 deletions docs/docs/customize/deep-dives/autocomplete.md
@@ -23,7 +23,7 @@ If you want to have the best autocomplete experience, we recommend using Codestr
If you'd like to run your autocomplete model locally, we recommend using Ollama. To do this, first download the latest version of Ollama from [here](https://ollama.ai). Then, run the following command to download our recommended model:

```bash
-ollama run starcoder2:3b
+ollama run qwen2.5-coder:1.5b
```

Once it has been downloaded, you should begin to see completions in VS Code.
@@ -37,7 +37,7 @@ All of the configuration options available for chat models are available to use
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"provider": "ollama",
"model": "starcoder2:3b",
"model": "qwen2.5-coder:1.5b",
"apiBase": "https://<my endpoint>"
},
...
@@ -52,7 +52,7 @@ The following can be configured in `config.json`:

### `tabAutocompleteModel`

+This is just another object like the ones in the `"models"` array of `config.json`. You can choose and configure any model you would like, but we strongly suggest using a small model made for tab-autocomplete, such as `deepseek-1b`, `qwen2.5-coder:1.5b`, or `starcoder2-3b`.
This is just another object like the ones in the `"models"` array of `config.json`. You can choose and configure any model you would like, but we strongly suggest using a small model made for tab-autocomplete, such as `deepseek-1b`, `qwen2.5-coder:1.5b`, or `starcoder2-3b`.

### `tabAutocompleteOptions`

@@ -78,7 +78,7 @@ This object allows you to customize the behavior of tab-autocomplete. The availa
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"provider": "ollama",
"model": "starcoder2:3b",
"model": "qwen2.5-coder:1.5b",
"apiBase": "https://<my endpoint>"
},
"tabAutocompleteOptions": {
@@ -101,16 +101,12 @@ Follow these steps to ensure that everything is set up correctly:

1. Make sure you have the "Enable Tab Autocomplete" setting checked (in VS Code, you can toggle by clicking the "Continue" button in the status bar, and in JetBrains by going to Settings -> Tools -> Continue).
2. Make sure you have downloaded Ollama.
-3. Run `ollama run starcoder2:3b` to verify that the model is downloaded.
+3. Run `ollama run qwen2.5-coder:1.5b` to verify that the model is downloaded.
4. Make sure that any other completion providers are disabled (e.g. Copilot), as they may interfere.
5. Check the output of the logs to find any potential errors: <kbd>cmd/ctrl</kbd> + <kbd>shift</kbd> + <kbd>P</kbd> -> "Toggle Developer Tools" -> "Console" tab in VS Code, ~/.continue/logs/core.log in JetBrains.
6. Check VS Code settings to make sure that `"editor.inlineSuggest.enabled"` is set to `true` (use <kbd>cmd/ctrl</kbd> + <kbd>,</kbd> then search for this and check the box)
7. If you are still having issues, please let us know in our [Discord](https://discord.gg/vapESyrFmJ) and we'll help as soon as possible.

-### Completions are slow
-
-Depending on your hardware, you may want to try a smaller, faster model. If 3b isn't working for you we recommend trying `deepseek-coder:1.3b-base`.

### Completions are only ever single-line

To ensure that you receive multi-line completions, you can set `"multilineCompletions": "always"` in `tabAutocompleteOptions`. By default, it is `"auto"`. If you still find that you are only seeing single-line completions, this may be because some models tend to produce shorter completions when starting in the middle of a file. You can try temporarily moving text below your cursor out of your active file, or switching to a larger model.
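
A minimal sketch of that setting in `config.json`:

```json title="config.json"
{
  "tabAutocompleteOptions": {
    "multilineCompletions": "always"
  }
}
```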
6 changes: 3 additions & 3 deletions docs/docs/customize/model-providers/top-level/ollama.md
@@ -23,14 +23,14 @@ We recommend configuring **Llama3.1 8B** as your chat model.

## Autocomplete model

-We recommend configuring **StarCoder2 3B** as your autocomplete model.
+We recommend configuring **Qwen2.5-Coder 1.5B** as your autocomplete model.

```json title="config.json"
{
  "tabAutocompleteModel": {
-    "title": "StarCoder2 3B",
+    "title": "Qwen2.5-Coder 1.5B",
    "provider": "ollama",
-    "model": "starcoder2:3b"
+    "model": "qwen2.5-coder:1.5b"
  }
}
```
2 changes: 1 addition & 1 deletion docs/docs/customize/model-types/autocomplete.md
@@ -13,4 +13,4 @@ In Continue, these models are used to display inline [Autocomplete](../../autoco

If you have the ability to use any model, we recommend `Codestral` with [Mistral](../model-providers/top-level/mistral.md#autocomplete-model) or [Vertex AI](../model-providers/top-level/vertexai.md#autocomplete-model).

-If you want to run a model locally, we recommend `Starcoder2-3B` with [Ollama](../model-providers/top-level/ollama.md#autocomplete-model).
+If you want to run a model locally, we recommend `Qwen2.5-Coder 1.5B` with [Ollama](../model-providers/top-level/ollama.md#autocomplete-model).
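
A minimal `config.json` sketch for that local setup, mirroring the Ollama guide linked above:

```json title="config.json"
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 1.5B",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b"
  }
}
```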
28 changes: 24 additions & 4 deletions extensions/vscode/config_schema.json
@@ -772,6 +772,11 @@
"mistral-tiny",
"mistral-small",
"mistral-medium",
"qwen2.5-coder:1.5b",
"qwen2.5-coder:3b",
"qwen2.5-coder:7b",
"qwen2.5-coder:14b",
"qwen2.5-coder:32b",
"AUTODETECT"
]
},
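
Assuming this enum backs the Ollama provider's `model` field (the neighboring `AUTODETECT` entry suggests so), a `models` entry like the following sketch now validates against the schema:

```json title="config.json"
{
  "models": [
    {
      "title": "Qwen2.5-Coder 7B",
      "provider": "ollama",
      "model": "qwen2.5-coder:7b"
    }
  ]
}
```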
@@ -1028,7 +1033,12 @@
"stable-code-3b",
"starcoder-1b",
"starcoder-3b",
"starcoder2-3b"
"starcoder2-3b",
"qwen2.5-coder:1.5b",
"qwen2.5-coder:3b",
"qwen2.5-coder:7b",
"qwen2.5-coder:14b",
"qwen2.5-coder:32b"
]
},
{
@@ -1081,6 +1091,11 @@
"starcoder-1b",
"starcoder-3b",
"starcoder2-3b",
"qwen2.5-coder:1.5b",
"qwen2.5-coder:3b",
"qwen2.5-coder:7b",
"qwen2.5-coder:14b",
"qwen2.5-coder:32b",
"AUTODETECT"
]
},
@@ -1400,7 +1415,12 @@
"stable-code-3b",
"starcoder-1b",
"starcoder-3b",
"starcoder2-3b"
"starcoder2-3b",
"qwen2.5-coder:1.5b",
"qwen2.5-coder:3b",
"qwen2.5-coder:7b",
"qwen2.5-coder:14b",
"qwen2.5-coder:32b"
]
},
{
@@ -2710,8 +2730,8 @@
},
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"markdownDescription": "The model used for tab autocompletion. If undefined, Continue will default to using starcoder2:3b on a local Ollama instance.\n\n*IMPORTANT*:\n\nIf you use a custom model, ensure that it is one trained for fill-in-the-middle completions. An instruct model is typically not well-suited to autocomplete and you may receive unsatisfactory completions.",
"x-intellij-html-description": "The model used for tab autocompletion. If undefined, Continue will default to using starcoder2:3b on a local Ollama instance.<br><br><i>IMPORTANT</i>:<br><br>If you use a custom model, ensure that it is one trained for fill-in-the-middle completions. An instruct model is typically not well-suited to autocomplete and you may receive unsatisfactory completions.",
"markdownDescription": "The model used for tab autocompletion. If undefined, Continue will default to using qwen2.5-coder:1.5b on a local Ollama instance.\n\n*IMPORTANT*:\n\nIf you use a custom model, ensure that it is one trained for fill-in-the-middle completions. An instruct model is typically not well-suited to autocomplete and you may receive unsatisfactory completions.",
"x-intellij-html-description": "The model used for tab autocompletion. If undefined, Continue will default to using qwen2.5-coder:1.5b on a local Ollama instance.<br><br><i>IMPORTANT</i>:<br><br>If you use a custom model, ensure that it is one trained for fill-in-the-middle completions. An instruct model is typically not well-suited to autocomplete and you may receive unsatisfactory completions.",
"default": {
"title": "Tab Autocomplete Model",
"provider": "ollama",
2 changes: 1 addition & 1 deletion extensions/vscode/src/util/loadAutocompleteModel.ts
@@ -6,7 +6,7 @@ import type { ILLM } from "core";

export class TabAutocompleteModel {
  private _llm: ILLM | undefined;
-  private defaultTag = "starcoder2:3b";
+  private defaultTag = "qwen2.5-coder:1.5b";
  private globalContext: GlobalContext = new GlobalContext();

  constructor(private configHandler: ConfigHandler) {}
