Merge pull request #981 from andreadimaio/improve_watsonx_model_provider
Enable /chat and /chat_stream in watsonx.ai
geoand authored Oct 15, 2024
2 parents 0b94f8b + 6b3edf4 commit 9b983a9
Showing 45 changed files with 2,953 additions and 956 deletions.
112 changes: 38 additions & 74 deletions docs/modules/ROOT/pages/watsonx.adoc
@@ -62,111 +62,75 @@ quarkus.langchain4j.watsonx.api-key=hG-...

NOTE: To determine the API key, go to https://cloud.ibm.com/iam/apikeys and generate it.

==== Interacting with Models

The `watsonx.ai` module provides two modes for interacting with LLM models: `generation` and `chat`. These modes allow you to tailor the interaction based on the complexity of your use case and how much control you want over the prompt structure.

You can select the interaction mode using the `quarkus.langchain4j.watsonx.chat-model.mode` property.

* `generation`: you must explicitly structure prompts using the model-specific tags. This gives you full control over the prompt format, but requires in-depth knowledge of the model being used. For best results, always refer to the documentation provided for each model.
* `chat` (*default*): the complexity of tagging is abstracted away, and prompts are formatted automatically so you can focus on the content.

To choose between these two modes, add the `chat-model.mode` property to your `application.properties` file:

[source,properties,subs=attributes+]
----
# accepted values: 'chat' (default) or 'generation'
quarkus.langchain4j.watsonx.chat-model.mode=chat
----

==== Chat Mode

In `chat` mode, you can interact with models without having to manually manage prompt tags.

Choose this mode for dynamic interactions where the model builds on previous messages and provides contextually relevant responses. The necessary tags are managed automatically, letting you focus on the content of your prompts rather than their formatting.

Chat mode also supports the use of `tools`, which allow the model to perform specific actions or retrieve external data as part of its responses. This extends the model's capabilities, enabling it to handle complex tasks dynamically and adapt to your needs. More information about tools is available on the xref:./agent-and-tools.adoc[Agent and Tools] page; a minimal sketch follows at the end of this section.

[source,properties,subs=attributes+]
----
quarkus.langchain4j.watsonx.base-url=${BASE_URL}
quarkus.langchain4j.watsonx.api-key=${API_KEY}
quarkus.langchain4j.watsonx.project-id=${PROJECT_ID}
quarkus.langchain4j.watsonx.chat-model.model-id=mistralai/mistral-large
quarkus.langchain4j.watsonx.chat-model.mode=chat
----

[source,java]
----
@RegisterAiService
public interface AiService {

    @SystemMessage("You are a helpful assistant")
    public String chat(@MemoryId String id, @UserMessage String message);
}
----
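
To give an idea of how such a service is consumed, below is a minimal sketch of a streaming variant (the interface name is an illustrative assumption, not part of the module's API). In quarkus-langchain4j, declaring a method that returns `Multi<String>` streams the response token by token, which in `chat` mode is served by the `/chat_stream` endpoint, assuming the selected model supports streaming:

[source,java]
----
import dev.langchain4j.service.MemoryId;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;
import io.smallrye.mutiny.Multi;

@RegisterAiService
public interface StreamingAiService {

    // Returning Multi<String> streams tokens instead of waiting for the full answer
    @SystemMessage("You are a helpful assistant")
    Multi<String> chat(@MemoryId String id, @UserMessage String message);
}
----

An injected instance of this interface can then be subscribed to from a REST resource or any other reactive pipeline.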

NOTE: The availability of `chat` and `tools` is currently limited to certain models. Not all models support these features, so be sure to consult the documentation for the specific model you are using to confirm whether they are available.
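
Below is the tool sketch promised earlier (the `ClockTool` class, its method, and the service interface are illustrative assumptions; only `@Tool` and the `tools` member of `@RegisterAiService` come from the LangChain4j and quarkus-langchain4j APIs):

[source,java]
----
import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.service.MemoryId;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
class ClockTool {

    @Tool("Returns the current date and time")
    String now() {
        // The model can choose to invoke this method while building its answer
        return java.time.ZonedDateTime.now().toString();
    }
}

@RegisterAiService(tools = ClockTool.class)
interface AssistantWithTools {

    String chat(@MemoryId String memoryId, @UserMessage String question);
}
----

When such a service runs in `chat` mode against a tool-capable model, watsonx.ai decides whether to call `now()` before producing the final answer.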

==== Generation Mode

In `generation` mode, you have complete control over the structure of your prompts by manually specifying the tags for a specific model. This mode can be useful in scenarios where a single response is desired.

[source,properties,subs=attributes+]
----
quarkus.langchain4j.watsonx.base-url=${BASE_URL}
quarkus.langchain4j.watsonx.api-key=${API_KEY}
quarkus.langchain4j.watsonx.project-id=${PROJECT_ID}
quarkus.langchain4j.watsonx.chat-model.model-id=mistralai/mistral-large
quarkus.langchain4j.watsonx.chat-model.mode=generation
----

[source,java]
----
@RegisterAiService(chatMemoryProviderSupplier = RegisterAiService.NoChatMemoryProviderSupplier.class)
public interface AiService {

    @UserMessage("""
        <s>[INST] You are a helpful assistant [/INST]</s>\
        [INST] What is the capital of {capital}? [/INST]""")
    public String askCapital(String capital);
}
----
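
For clarity, the raw prompt produced by this template looks as follows (a sketch, assuming `capital` is bound to `Italy`; the trailing `\` in the text block suppresses the line break between the two instructions):

[source,text]
----
<s>[INST] You are a helpful assistant [/INST]</s>[INST] What is the capital of Italy? [/INST]
----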

NOTE: The `@SystemMessage` and `@UserMessage` annotations are joined by default with a new line. If you want to change this behavior, set the property `quarkus.langchain4j.watsonx.chat-model.prompt-joiner=<value>`. By adjusting this property, you can define your preferred way of joining messages and ensure that the prompt structure meets your specific needs.

NOTE: Sometimes it may be useful to set the `quarkus.langchain4j.watsonx.chat-model.stop-sequences` property to prevent the model from generating more output than desired.
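
A minimal sketch combining both properties (the values are illustrative assumptions):

[source,properties,subs=attributes+]
----
# Join @SystemMessage and @UserMessage with a blank line instead of a single new line
quarkus.langchain4j.watsonx.chat-model.prompt-joiner=\n\n
# Stop generating as soon as the model emits one of these sequences
quarkus.langchain4j.watsonx.chat-model.stop-sequences=<|endoftext|>,Human:
----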

@@ -33,6 +33,12 @@ quarkus.langchain4j.watsonx.c7.base-url=https://somecluster.somedomain.ai:443/api
quarkus.langchain4j.watsonx.c7.api-key=test8
quarkus.langchain4j.watsonx.c7.project-id=proj

quarkus.langchain4j.c8.chat-model.provider=watsonx
quarkus.langchain4j.watsonx.c8.base-url=https://somecluster.somedomain.ai:443/api
quarkus.langchain4j.watsonx.c8.api-key=test9
quarkus.langchain4j.watsonx.c8.project-id=proj
quarkus.langchain4j.watsonx.c8.chat-model.mode=generation

quarkus.langchain4j.e1.embedding-model.provider=openai
quarkus.langchain4j.openai.e1.api-key=test5
quarkus.langchain4j.e2.embedding-model.provider=ollama
@@ -14,6 +14,7 @@
import io.quarkiverse.langchain4j.ollama.OllamaChatLanguageModel;
import io.quarkiverse.langchain4j.openshiftai.OpenshiftAiChatModel;
import io.quarkiverse.langchain4j.watsonx.WatsonxChatModel;
import io.quarkiverse.langchain4j.watsonx.WatsonxGenerationModel;
import io.quarkus.arc.ClientProxy;
import io.quarkus.test.junit.QuarkusTest;

@@ -47,6 +48,10 @@ public class MultipleChatProvidersTest {
@ModelName("c7")
ChatLanguageModel seventhNamedModel;

@Inject
@ModelName("c8")
ChatLanguageModel eighthNamedModel;

@Test
void defaultModel() {
assertThat(ClientProxy.unwrap(defaultModel)).isInstanceOf(OpenAiChatModel.class);
@@ -81,4 +86,9 @@ void sixthNamedModel() {
void seventhNamedModel() {
assertThat(ClientProxy.unwrap(seventhNamedModel)).isInstanceOf(WatsonxChatModel.class);
}

@Test
void eighthNamedModel() {
assertThat(ClientProxy.unwrap(eighthNamedModel)).isInstanceOf(WatsonxGenerationModel.class);
}
}