From 782f0855b6942b1dc4abddd3eecd9d81d3cd59f3 Mon Sep 17 00:00:00 2001 From: Xiaotian Jin Date: Mon, 30 Dec 2024 23:40:42 +0800 Subject: [PATCH] adapt the style from contribution style --- docs/cookbooks/qwen_structure_output.ipynb | 859 ++++++++++++++------- 1 file changed, 577 insertions(+), 282 deletions(-) diff --git a/docs/cookbooks/qwen_structure_output.ipynb b/docs/cookbooks/qwen_structure_output.ipynb index 521c548129..0d95879560 100644 --- a/docs/cookbooks/qwen_structure_output.ipynb +++ b/docs/cookbooks/qwen_structure_output.ipynb @@ -1,289 +1,584 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Data model generation and structured output(Camel Agent using Qwen) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "LLMs speaks natural languages, as human do. While the applications speak structured language, like JSON. So, it’s important to equip LLM with the ability to speak structured language, so that they can communicate with both human and other applications.\n", - "\n", - "There are a few different high level strategies that are used to do this:\n", - "\n", - "- Prompting: This is when you ask the LLM (very nicely) to return output in the desired format (JSON, XML). This is nice because it works with all LLMs. It is not nice because there is no guarantee that the LLM returns the output in the right format.\n", - "- Function calling: This is when the LLM is fine-tuned to be able to not just generate a completion, but also generate a function call. The functions the LLM can call are generally passed as extra parameters to the model API. The function names and descriptions should be treated as part of the prompt (they usually count against token counts, and are used by the LLM to decide what to do).\n", - "- JSON mode: This is when the LLM is guaranteed to return JSON.\n", - "\n", - "In Camel, the agent uses both prompting and function calling. For models does not support tool calling, camel will use prompt engineering for it to generate valid structure output." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Qwen data generation" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "[Qwen](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api) is a good example in Camel of using prompt engineering for structure output. It offers powerful models like **Qwen-max**, **Qwen-coder**, but yet not support structure output by itself. We can then make use of its own ability to generate structured data. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Import necessary libraries, define the Qwen agent, and define the Pydantic classes. 
" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "from pydantic import BaseModel, Field\n", - "\n", - "from camel.agents import ChatAgent\n", - "from camel.messages import BaseMessage\n", - "from camel.models import ModelFactory\n", - "from camel.types import ModelPlatformType, ModelType\n", - "from camel.configs import QwenConfig\n", - "\n", - "from dotenv import load_dotenv\n", - "import os\n", - "load_dotenv() \n", - "\n", - "# Define Qwen model\n", - "qwen_model = ModelFactory.create(\n", - " model_platform=ModelPlatformType.QWEN,\n", - " model_type=ModelType.QWEN_PLUS,\n", - " model_config_dict=QwenConfig().as_dict(),\n", - ")\n", - "\n", - "qwen_agent = ChatAgent(\n", - " model=qwen_model,\n", - " message_window_size=10,\n", - ")\n", - "\n", - "\n", - "# Define Pydantic models\n", - "class Student(BaseModel):\n", - " name: str\n", - " age: str\n", - " email: str\n", - "\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "First, let's try if we don't specific format just in prompt. " - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "2024-12-18 16:13:25,265 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \"HTTP/1.1 200 OK\"\n", - "2024-12-18 16:13:25,267 - camel.agents.chat_agent - INFO - Model qwen-plus, index 0, processed these messages: [{'role': 'user', 'content': 'Help me 1 student info in JSON format, with the following format:\\n{\\n \"name\": \"string\",\\n \"age\": \"string\",\\n \"email\": \"string\"\\n}'}]\n", - "Certainly! Below is an example of a student's information formatted in JSON as you requested:\n", - "\n", - "```json\n", - "{\n", - " \"name\": \"John Doe\",\n", - " \"age\": \"20\",\n", - " \"email\": \"johndoe@example.com\"\n", - "}\n", - "```\n", - "\n", - "If you need more specific details or another example, feel free to let me know!\n" - ] + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" } - ], - "source": [ - "assistant_sys_msg = BaseMessage.make_assistant_message(\n", - " role_name=\"Assistant\",\n", - " content=\"You are a helpful assistant in helping user to generate necessary data information.\",\n", - ")\n", - "\n", - "user_msg = \"\"\"Help me 1 student info in JSON format, with the following format:\n", - "{\n", - " \"name\": \"string\",\n", - " \"age\": \"string\",\n", - " \"email\": \"string\"\n", - "}\"\"\"\n", - "\n", - "response = qwen_agent.step(user_msg)\n", - "print(response.msgs[0].content)\n", - "\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "It did it, but we need to expand our prompts, and the result still has some annoying extra texts, and we still need to parse it into valid JSON object by ourselves. 
\n", - "\n", - "A more elegant way is to use the `response_format` argument in `.step()` function:" - ] }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ + "cells": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "2024-12-18 16:20:15,988 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \"HTTP/1.1 200 OK\"\n", - "2024-12-18 16:20:15,991 - camel.agents.chat_agent - INFO - Model qwen-plus, index 0, processed these messages: [{'role': 'user', 'content': \"\\n Given the user message, please generate a JSON response adhering to the following JSON schema:\\n{'properties': {'name': {'title': 'Name', 'type': 'string'}, 'age': {'title': 'Age', 'type': 'string'}, 'email': {'title': 'Email', 'type': 'string'}}, 'required': ['name', 'age', 'email'], 'title': 'Student', 'type': 'object'}\\nMake sure the JSON response is valid and matches the EXACT structure defined in the schema. Your result should only be a valid json object, without any other text or comments.\\n\\n User message: Help me 1 student info in JSON format\\n\\n \"}]\n", - "{\n", - " \"name\": \"John Doe\",\n", - " \"age\": \"20\",\n", - " \"email\": \"johndoe@example.com\"\n", - "}\n" - ] - } - ], - "source": [ - "qwen_agent.reset()\n", - "user_msg = \"Help me 1 student info in JSON format\"\n", - "response = qwen_agent.step(user_msg, response_format=Student)\n", - "print(response.msgs[0].content)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "And we can directly extract the Pydantic object in `response.msgs[0].parsed` field:" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ + "cell_type": "markdown", + "source": [ + "# Data model generation and structured output(Camel Agent using Qwen)" + ], + "metadata": { + "id": "ymsq1Lw0VEqT" + } + }, { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "name='John Doe' age='20' email='johndoe@example.com'\n" - ] - } - ], - "source": [ - "print(type(response.msgs[0].parsed))\n", - "print(response.msgs[0].parsed)\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Hooray, now we successfully generate 1 entry of student, suppose we want to generate more, we can still achieve this easily." - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ + "cell_type": "markdown", + "source": [ + "You can also check this cookbook in colab [here](https://colab.research.google.com/drive/17SiWWjoK7l8Sy9FBsGKUHC6zuEsLt2yX?usp=sharing) (Use the colab share link)" + ], + "metadata": { + "id": "7V3aV16AmY0K" + } + }, + { + "cell_type": "markdown", + "source": [ + "This notebook demonstrates how to set up and leverage CAMEL's ability of structure output, like JSON, and Pydantic objects.\n", + "\n", + "In this notebook, you'll explore:\n", + "\n", + "* **CAMEL**: A powerful multi-agent framework that enables Retrieval-Augmented Generation and multi-agent role-playing scenarios, allowing for sophisticated AI-driven tasks.\n", + "* **Structure output**: The ability of LLMs to return structured output.\n", + "* **Qwen**: The Qwen model is a series of LLMs and multimodal models developed by the Qwen Team at Alibaba Group. 
Designed for diverse scenarios, Qwen integrates advanced AI capabilities, such as natural language understanding, text and vision processing, programming assistance, and dialogue simulation.\n", + "\n", + "This setup not only demonstrates a practical application but also serves as a flexible framework that can be adapted for various scenarios requiring structure output and data generation." + ], + "metadata": { + "id": "G5gE04UuPUWj" + } + }, + { + "cell_type": "markdown", + "source": [ + "⭐ **Star the Repo**\n", + "\n", + "If you find CAMEL useful or interesting, please consider giving it a star on our [CAMEL GitHub Repo](https://github.com/camel-ai/camel)! Your stars help others find this project and motivate us to continue improving it." + ], + "metadata": { + "id": "soIw38pJLv2f" + } + }, + { + "cell_type": "markdown", + "source": [ + "## πŸ“¦ Installation" + ], + "metadata": { + "id": "0J0_iW-YVcq2" + } + }, + { + "cell_type": "markdown", + "source": [ + "First, install the CAMEL package with all its dependencies:" + ], + "metadata": { + "id": "7p-JjpyNVcCT" + } + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "0GXs2pruU9Vl", + "outputId": "5e0a3e75-f8d5-4c76-993b-dd163d4e170c", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Collecting git+https://github.com/camel-ai/camel.git@master\n", + " Cloning https://github.com/camel-ai/camel.git (to revision master) to /tmp/pip-req-build-v0pzwiv3\n", + " Running command git clone --filter=blob:none --quiet https://github.com/camel-ai/camel.git /tmp/pip-req-build-v0pzwiv3\n", + " Resolved https://github.com/camel-ai/camel.git to commit 487ed436a81067d68ad86472a95092821594dc4c\n", + " Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n", + " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n", + " Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", + "Collecting colorama<1,>=0 (from camel-ai==0.2.15a0)\n", + " Downloading colorama-0.4.6-py2.py3-none-any.whl.metadata (17 kB)\n", + "Collecting curl_cffi==0.6.2 (from camel-ai==0.2.15a0)\n", + " Downloading curl_cffi-0.6.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (10 kB)\n", + "Collecting docstring-parser<0.16,>=0.15 (from camel-ai==0.2.15a0)\n", + " Downloading docstring_parser-0.15-py3-none-any.whl.metadata (2.4 kB)\n", + "Requirement already satisfied: eval-type-backport==0.2.0 in /usr/local/lib/python3.10/dist-packages (from camel-ai==0.2.15a0) (0.2.0)\n", + "Collecting httpx<0.27.3,>=0.23.0 (from camel-ai==0.2.15a0)\n", + " Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)\n", + "Requirement already satisfied: jsonschema<5,>=4 in /usr/local/lib/python3.10/dist-packages (from camel-ai==0.2.15a0) (4.23.0)\n", + "Requirement already satisfied: numpy<2,>=1 in /usr/local/lib/python3.10/dist-packages (from camel-ai==0.2.15a0) (1.26.4)\n", + "Collecting openai<2.0.0,>=1.58.1 (from camel-ai==0.2.15a0)\n", + " Downloading openai-1.58.1-py3-none-any.whl.metadata (27 kB)\n", + "Collecting pandoc (from camel-ai==0.2.15a0)\n", + " Downloading pandoc-2.4.tar.gz (34 kB)\n", + " Preparing metadata (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", + "Requirement already satisfied: pathlib<2.0.0,>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from camel-ai==0.2.15a0) (1.0.1)\n", + "Requirement already satisfied: protobuf<5,>=4 in /usr/local/lib/python3.10/dist-packages (from camel-ai==0.2.15a0) (4.25.5)\n", + "Collecting pydantic<2.10,>=1.9 (from camel-ai==0.2.15a0)\n", + " Downloading pydantic-2.9.2-py3-none-any.whl.metadata (149 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m149.4/149.4 kB\u001b[0m \u001b[31m5.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hCollecting tiktoken<0.8.0,>=0.7.0 (from camel-ai==0.2.15a0)\n", + " Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)\n", + "Requirement already satisfied: cffi>=1.12.0 in /usr/local/lib/python3.10/dist-packages (from curl_cffi==0.6.2->camel-ai==0.2.15a0) (1.17.1)\n", + "Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from curl_cffi==0.6.2->camel-ai==0.2.15a0) (2024.12.14)\n", + "Requirement already satisfied: anyio in /usr/local/lib/python3.10/dist-packages (from httpx<0.27.3,>=0.23.0->camel-ai==0.2.15a0) (3.7.1)\n", + "Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx<0.27.3,>=0.23.0->camel-ai==0.2.15a0) (1.0.7)\n", + "Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx<0.27.3,>=0.23.0->camel-ai==0.2.15a0) (3.10)\n", + "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx<0.27.3,>=0.23.0->camel-ai==0.2.15a0) (1.3.1)\n", + "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx<0.27.3,>=0.23.0->camel-ai==0.2.15a0) (0.14.0)\n", + "Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.10/dist-packages (from jsonschema<5,>=4->camel-ai==0.2.15a0) (24.3.0)\n", + "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.10/dist-packages (from jsonschema<5,>=4->camel-ai==0.2.15a0) (2024.10.1)\n", + "Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.10/dist-packages (from jsonschema<5,>=4->camel-ai==0.2.15a0) (0.35.1)\n", + "Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from jsonschema<5,>=4->camel-ai==0.2.15a0) (0.22.3)\n", + "Requirement already satisfied: distro<2,>=1.7.0 in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.58.1->camel-ai==0.2.15a0) (1.9.0)\n", + "Requirement already satisfied: jiter<1,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.58.1->camel-ai==0.2.15a0) (0.8.2)\n", + "Requirement already satisfied: tqdm>4 in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.58.1->camel-ai==0.2.15a0) (4.67.1)\n", + "Requirement already satisfied: typing-extensions<5,>=4.11 in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.58.1->camel-ai==0.2.15a0) (4.12.2)\n", + "Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<2.10,>=1.9->camel-ai==0.2.15a0) (0.7.0)\n", + "Collecting pydantic-core==2.23.4 (from pydantic<2.10,>=1.9->camel-ai==0.2.15a0)\n", + " Downloading pydantic_core-2.23.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)\n", + "Requirement already satisfied: regex>=2022.1.18 in /usr/local/lib/python3.10/dist-packages (from 
tiktoken<0.8.0,>=0.7.0->camel-ai==0.2.15a0) (2024.11.6)\n", + "Requirement already satisfied: requests>=2.26.0 in /usr/local/lib/python3.10/dist-packages (from tiktoken<0.8.0,>=0.7.0->camel-ai==0.2.15a0) (2.32.3)\n", + "Collecting plumbum (from pandoc->camel-ai==0.2.15a0)\n", + " Downloading plumbum-1.9.0-py3-none-any.whl.metadata (10 kB)\n", + "Requirement already satisfied: ply in /usr/local/lib/python3.10/dist-packages (from pandoc->camel-ai==0.2.15a0) (3.11)\n", + "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio->httpx<0.27.3,>=0.23.0->camel-ai==0.2.15a0) (1.2.2)\n", + "Requirement already satisfied: pycparser in /usr/local/lib/python3.10/dist-packages (from cffi>=1.12.0->curl_cffi==0.6.2->camel-ai==0.2.15a0) (2.22)\n", + "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.26.0->tiktoken<0.8.0,>=0.7.0->camel-ai==0.2.15a0) (3.4.0)\n", + "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.26.0->tiktoken<0.8.0,>=0.7.0->camel-ai==0.2.15a0) (2.2.3)\n", + "Downloading curl_cffi-0.6.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.7 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.7/5.7 MB\u001b[0m \u001b[31m49.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading colorama-0.4.6-py2.py3-none-any.whl (25 kB)\n", + "Downloading docstring_parser-0.15-py3-none-any.whl (36 kB)\n", + "Downloading httpx-0.27.2-py3-none-any.whl (76 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m76.4/76.4 kB\u001b[0m \u001b[31m5.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading openai-1.58.1-py3-none-any.whl (454 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m454.3/454.3 kB\u001b[0m \u001b[31m27.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading pydantic-2.9.2-py3-none-any.whl (434 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m434.9/434.9 kB\u001b[0m \u001b[31m25.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading pydantic_core-2.23.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.1/2.1 MB\u001b[0m \u001b[31m42.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.1/1.1 MB\u001b[0m \u001b[31m28.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading plumbum-1.9.0-py3-none-any.whl (127 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m128.0/128.0 kB\u001b[0m \u001b[31m8.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hBuilding wheels for collected packages: camel-ai, pandoc\n", + " Building wheel for camel-ai (pyproject.toml) ... 
\u001b[?25l\u001b[?25hdone\n", + " Created wheel for camel-ai: filename=camel_ai-0.2.15a0-py3-none-any.whl size=570339 sha256=aee5ab4189f89f91faea5007dfc248cae505f64b6b8e1825230e6d4158f36485\n", + " Stored in directory: /tmp/pip-ephem-wheel-cache-r7sn0n0a/wheels/e0/b7/22/f95f60dc6231f421db20c8c937b1a1c0a846f24638bb410a1c\n", + " Building wheel for pandoc (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + " Created wheel for pandoc: filename=pandoc-2.4-py3-none-any.whl size=34792 sha256=669628ed6378ce2cd1803c5f6d0eb7ff7e3d5dc205b0b5eccff2d1701d1549ca\n", + " Stored in directory: /root/.cache/pip/wheels/14/79/8c/5d7a023cc8df1aa0381c1739d69da18ae7f90c08b2dc9a1bf5\n", + "Successfully built camel-ai pandoc\n", + "Installing collected packages: pydantic-core, plumbum, docstring-parser, colorama, tiktoken, pydantic, pandoc, httpx, curl_cffi, openai, camel-ai\n", + " Attempting uninstall: pydantic-core\n", + " Found existing installation: pydantic_core 2.27.1\n", + " Uninstalling pydantic_core-2.27.1:\n", + " Successfully uninstalled pydantic_core-2.27.1\n", + " Attempting uninstall: docstring-parser\n", + " Found existing installation: docstring_parser 0.16\n", + " Uninstalling docstring_parser-0.16:\n", + " Successfully uninstalled docstring_parser-0.16\n", + " Attempting uninstall: pydantic\n", + " Found existing installation: pydantic 2.10.3\n", + " Uninstalling pydantic-2.10.3:\n", + " Successfully uninstalled pydantic-2.10.3\n", + " Attempting uninstall: httpx\n", + " Found existing installation: httpx 0.28.1\n", + " Uninstalling httpx-0.28.1:\n", + " Successfully uninstalled httpx-0.28.1\n", + " Attempting uninstall: openai\n", + " Found existing installation: openai 1.57.4\n", + " Uninstalling openai-1.57.4:\n", + " Successfully uninstalled openai-1.57.4\n", + "Successfully installed camel-ai-0.2.15a0 colorama-0.4.6 curl_cffi-0.6.2 docstring-parser-0.15 httpx-0.27.2 openai-1.58.1 pandoc-2.4 plumbum-1.9.0 pydantic-2.9.2 pydantic-core-2.23.4 tiktoken-0.7.0\n" + ] + } + ], + "source": [ + "!pip install git+https://github.com/camel-ai/camel.git@master" + ] + }, + { + "cell_type": "markdown", + "source": [ + "## πŸ”‘ Setting Up API Keys" + ], + "metadata": { + "id": "lfNvFbhD6o8B" + } + }, + { + "cell_type": "markdown", + "source": [ + "You'll need to set up your API keys for Qwen This ensures that the tools can interact with external services securely." 
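,
        "\n",
        "\n",
        "If you prefer not to enter the key interactively every time, you can also set it as an environment variable before running the notebook. A minimal sketch - it sets the same `QWEN_API_KEY` variable that the next code cell fills in via `getpass`, and the value shown is only a placeholder:\n",
        "\n",
        "```python\n",
        "import os\n",
        "\n",
        "# Placeholder value - substitute the key you created in Alibaba Cloud Model Studio\n",
        "os.environ[\"QWEN_API_KEY\"] = \"your-qwen-api-key\"\n",
        "```"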
+ ], + "metadata": { + "id": "jqV12oQfQTyl" + } + }, { - "name": "stdout", - "output_type": "stream", - "text": [ - "2024-12-18 16:24:04,985 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \"HTTP/1.1 200 OK\"\n", - "2024-12-18 16:24:04,988 - camel.agents.chat_agent - INFO - Model qwen-plus, index 0, processed these messages: [{'role': 'user', 'content': 'Help me 1 student info in JSON format'}, {'role': 'assistant', 'content': '{\\n \"name\": \"John Doe\",\\n \"age\": \"20\",\\n \"email\": \"johndoe@example.com\"\\n}'}, {'role': 'user', 'content': 'Help me 3 random student info in JSON format'}, {'role': 'assistant', 'content': '{\\n \"studentList\": [\\n {\\n \"name\": \"Alice Johnson\",\\n \"age\": \"22\",\\n \"email\": \"alice.johnson@example.com\"\\n },\\n {\\n \"name\": \"Bob Smith\",\\n \"age\": \"21\",\\n \"email\": \"bob.smith@example.com\"\\n },\\n {\\n \"name\": \"Charlie Brown\",\\n \"age\": \"23\",\\n \"email\": \"charlie.brown@example.com\"\\n }\\n ]\\n}'}, {'role': 'user', 'content': \"\\n Given the user message, please generate a JSON response adhering to the following JSON schema:\\n{'$defs': {'Student': {'properties': {'name': {'title': 'Name', 'type': 'string'}, 'age': {'title': 'Age', 'type': 'string'}, 'email': {'title': 'Email', 'type': 'string'}}, 'required': ['name', 'age', 'email'], 'title': 'Student', 'type': 'object'}}, 'properties': {'studentList': {'items': {'$ref': '#/$defs/Student'}, 'title': 'Studentlist', 'type': 'array'}}, 'required': ['studentList'], 'title': 'StudentList', 'type': 'object'}\\nMake sure the JSON response is valid and matches the EXACT structure defined in the schema. Your result should only be a valid json object, without any other text or comments.\\n\\n User message: Help me 5 random student info in JSON format\\n\\n \"}]\n", - "{\n", - " \"studentList\": [\n", - " {\n", - " \"name\": \"Emma Williams\",\n", - " \"age\": \"20\",\n", - " \"email\": \"emma.williams@example.com\"\n", - " },\n", - " {\n", - " \"name\": \"Liam Davis\",\n", - " \"age\": \"21\",\n", - " \"email\": \"liam.davis@example.com\"\n", - " },\n", - " {\n", - " \"name\": \"Olivia Taylor\",\n", - " \"age\": \"22\",\n", - " \"email\": \"olivia.taylor@example.com\"\n", - " },\n", - " {\n", - " \"name\": \"Noah Anderson\",\n", - " \"age\": \"23\",\n", - " \"email\": \"noah.anderson@example.com\"\n", - " },\n", - " {\n", - " \"name\": \"Ava Martinez\",\n", - " \"age\": \"24\",\n", - " \"email\": \"ava.martinez@example.com\"\n", - " }\n", - " ]\n", - "}\n", - "studentList=[Student(name='Emma Williams', age='20', email='emma.williams@example.com'), Student(name='Liam Davis', age='21', email='liam.davis@example.com'), Student(name='Olivia Taylor', age='22', email='olivia.taylor@example.com'), Student(name='Noah Anderson', age='23', email='noah.anderson@example.com'), Student(name='Ava Martinez', age='24', email='ava.martinez@example.com')]\n" - ] + "cell_type": "markdown", + "source": [ + "Your can go to [here](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api/) to get API Key from Qwen AI." 
+ ], + "metadata": { + "id": "czxWvnvnAimt" + } + }, + { + "cell_type": "code", + "source": [ + "# Prompt for the API key securely\n", + "import os\n", + "from getpass import getpass\n", + "\n", + "qwen_api_key = getpass('Enter your API key: ')\n", + "os.environ[\"QWEN_API_KEY\"] = qwen_api_key" + ], + "metadata": { + "id": "T0FBl1WF6jFs", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "af911650-d8d4-4714-e1d1-596cd1d233bf" + }, + "execution_count": 3, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Enter your API key: Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Qwen data generation" + ], + "metadata": { + "id": "NEUciNquON9_" + } + }, + { + "cell_type": "markdown", + "source": [ + "In this section, we'll demonstrate how to Qwen to generate structured data. [Qwen](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api) is a good example in Camel of using prompt engineering for structure output. It offers powerful models like **Qwen-max**, **Qwen-coder**, but yet not support structure output by itself. We can then make use of its own ability to generate structured data." + ], + "metadata": { + "id": "6f64VOMMP93d" + } + }, + { + "cell_type": "markdown", + "source": [ + "Import necessary libraries, define the Qwen agent, and define the Pydantic classes." + ], + "metadata": { + "id": "46Irp_SurLaV" + } + }, + { + "cell_type": "markdown", + "source": [ + "The following function retrieves relevant information from a list of URLs based on a given query. It combines web scraping with Firecrawl and CAMEL's AutoRetriever for a seamless information retrieval process. (Some explaination)" + ], + "metadata": { + "id": "QVB-Xra8QIU1" + } + }, + { + "cell_type": "code", + "source": [ + "from pydantic import BaseModel, Field\n", + "\n", + "from camel.agents import ChatAgent\n", + "from camel.messages import BaseMessage\n", + "from camel.models import ModelFactory\n", + "from camel.types import ModelPlatformType, ModelType\n", + "from camel.configs import QwenConfig" + ], + "metadata": { + "id": "gE_qBFCVveBR" + }, + "execution_count": 4, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Define Qwen model\n", + "qwen_model = ModelFactory.create(\n", + " model_platform=ModelPlatformType.QWEN,\n", + " model_type=ModelType.QWEN_CODER_TURBO,\n", + " model_config_dict=QwenConfig().as_dict(),\n", + ")\n", + "\n", + "qwen_agent = ChatAgent(\n", + " model=qwen_model,\n", + " message_window_size=10,\n", + ")\n", + "\n", + "\n", + "# Define Pydantic models\n", + "class Student(BaseModel):\n", + " name: str\n", + " age: str\n", + " email: str" + ], + "metadata": { + "id": "jnVCqRIS9snF" + }, + "execution_count": 5, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "First, let's try if we don't specific format just in prompt.\n" + ], + "metadata": { + "id": "1g31Q_CUCSZ3" + } + }, + { + "cell_type": "code", + "source": [ + "assistant_sys_msg = BaseMessage.make_assistant_message(\n", + " role_name=\"Assistant\",\n", + " content=\"You are a helpful assistant in helping user to generate necessary data information.\",\n", + ")\n", + "\n", + "user_msg = \"\"\"Help me 1 student info in JSON format, with the following format:\n", + "{\n", + " \"name\": \"string\",\n", + " \"age\": \"string\",\n", + " \"email\": \"string\"\n", + "}\"\"\"\n", + "\n", + "response = qwen_agent.step(user_msg)\n", + "print(response.msgs[0].content)" + ], + "metadata": { 
+ "id": "KX_Ojed_CRtx", + "outputId": "78382ec7-0fc4-48bc-a107-b4a6497d478f", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "execution_count": 6, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Sure! Here is an example of student information in JSON format:\n", + "\n", + "```json\n", + "{\n", + " \"name\": \"John Doe\",\n", + " \"age\": \"20\",\n", + " \"email\": \"johndoe@example.com\"\n", + "}\n", + "```\n", + "\n", + "Feel free to replace the values with actual data for your specific student.\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "It did it, but we need to expand our prompts, and the result still has some annoying extra texts, and we still need to parse it into valid JSON object by ourselves.\n", + "\n", + "A more elegant way is to use the `response_format` argument in `.step()` function:" + ], + "metadata": { + "id": "PpyFnTWlHc7j" + } + }, + { + "cell_type": "code", + "source": [ + "qwen_agent.reset()\n", + "user_msg = \"Help me 1 student info in JSON format\"\n", + "response = qwen_agent.step(user_msg, response_format=Student)\n", + "print(response.msgs[0].content)" + ], + "metadata": { + "id": "zu1DZw0HCfo_", + "outputId": "742c1694-69fc-48f2-a78a-4c8e4ba8fb6f", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "execution_count": 8, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "{\n", + " \"name\": \"John Doe\",\n", + " \"age\": \"20\",\n", + " \"email\": \"johndoe@example.com\"\n", + "}\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "And we can directly extract the Pydantic object in `response.msgs[0].parsed` field:" + ], + "metadata": { + "id": "52xqKSa3CaRH" + } + }, + { + "cell_type": "code", + "source": [ + "print(type(response.msgs[0].parsed))\n", + "print(response.msgs[0].parsed)\n" + ], + "metadata": { + "id": "VEn6YLxPC8nu", + "outputId": "5d452e35-bd77-4f0c-e49d-074c8f04624c", + "colab": { + "base_uri": "https://localhost:8080/" + } + }, + "execution_count": 9, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "\n", + "name='John Doe' age='20' email='johndoe@example.com'\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "Hooray, now we successfully generate 1 entry of student, suppose we want to generate more, we can still achieve this easily." 
+ ], + "metadata": { + "id": "Ly2R-RegHo1o" + } + }, + { + "cell_type": "code", + "source": [ + "class StudentList(BaseModel):\n", + " studentList: list[Student]\n", + "\n", + "user_msg = \"Help me 5 random student info in JSON format\"\n", + "response = qwen_agent.step(user_msg, response_format=StudentList)\n", + "print(response.msgs[0].content)\n", + "print(response.msgs[0].parsed)\n", + "\n" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "MuiZIKq0Hxp1", + "outputId": "9920d910-5818-47b3-bed2-e3db7141cd4a" + }, + "execution_count": 10, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "{\n", + " \"studentList\": [\n", + " {\n", + " \"name\": \"Alice Johnson\",\n", + " \"age\": \"22\",\n", + " \"email\": \"alice.johnson@example.com\"\n", + " },\n", + " {\n", + " \"name\": \"Bob Smith\",\n", + " \"age\": \"21\",\n", + " \"email\": \"bob.smith@example.com\"\n", + " },\n", + " {\n", + " \"name\": \"Charlie Brown\",\n", + " \"age\": \"23\",\n", + " \"email\": \"charlie.brown@example.com\"\n", + " },\n", + " {\n", + " \"name\": \"Diana Prince\",\n", + " \"age\": \"24\",\n", + " \"email\": \"diana.prince@example.com\"\n", + " },\n", + " {\n", + " \"name\": \"Eve Adams\",\n", + " \"age\": \"20\",\n", + " \"email\": \"eve.adams@example.com\"\n", + " }\n", + " ]\n", + "}\n", + "studentList=[Student(name='Alice Johnson', age='22', email='alice.johnson@example.com'), Student(name='Bob Smith', age='21', email='bob.smith@example.com'), Student(name='Charlie Brown', age='23', email='charlie.brown@example.com'), Student(name='Diana Prince', age='24', email='diana.prince@example.com'), Student(name='Eve Adams', age='20', email='eve.adams@example.com')]\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "That's it! We just generate 5 random students out of nowhere by using Qwen Camel agent!" + ], + "metadata": { + "id": "APqHogdcH1nJ" + } + }, + { + "cell_type": "markdown", + "source": [ + "## 🌟 Highlights" + ], + "metadata": { + "id": "flYNal6-R4yR" + } + }, + { + "cell_type": "markdown", + "source": [ + "This notebook has guided you through setting up and running Qwen chat agent and use it to generate structured data.\n", + "\n", + "Key tools utilized in this notebook include:\n", + "\n", + "* **CAMEL**: A powerful multi-agent framework that enables Retrieval-Augmented Generation and multi-agent role-playing scenarios, allowing for sophisticated AI-driven tasks.\n", + "* **Qwen data generation**: Use Qwen model to generate structured data for further use of other applications.\n" + ], + "metadata": { + "id": "SmkXhy4JR726" + } + }, + { + "cell_type": "markdown", + "source": [ + "⭐ **Star the Repo**\n", + "\n", + "If you find CAMEL useful or interesting, please consider giving it a star on [GitHub](https://github.com/camel-ai/camel)! Your stars help others find this project and motivate us to continue improving it." + ], + "metadata": { + "id": "s6Det-fcMb9A" + } } - ], - "source": [ - "class StudentList(BaseModel):\n", - " studentList: list[Student]\n", - "\n", - "user_msg = \"Help me 5 random student info in JSON format\"\n", - "response = qwen_agent.step(user_msg, response_format=StudentList)\n", - "print(response.msgs[0].content)\n", - "print(response.msgs[0].parsed)\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "That's it! We just generate 5 random students out of nowhere by using Qwen Camel agent!" 
- ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.7" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} + ] +} \ No newline at end of file