From 9b3b8487085e4b6ee6a0723426000ebe64452e54 Mon Sep 17 00:00:00 2001 From: Owl Bot Date: Wed, 16 Oct 2024 21:13:12 +0000 Subject: [PATCH] =?UTF-8?q?=F0=9F=A6=89=20Updates=20from=20OwlBot=20post-p?= =?UTF-8?q?rocessor?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md --- .../vertex_ai_prompt_optimizer_sdk.ipynb | 2355 +++++++------- ...i_prompt_optimizer_sdk_custom_metric.ipynb | 2746 ++++++++--------- 2 files changed, 2537 insertions(+), 2564 deletions(-) diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb index 4b4342545c..d25b4fd8d2 100644 --- a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb +++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb @@ -1,1186 +1,1173 @@ { - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ur8xi4C7S06n" - }, - "outputs": [], - "source": [ - "# Copyright 2024 Google LLC\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "JAPoU8Sm5E6e" - }, - "source": [ - "# Vertex Prompt Optimizer Notebook SDK (Preview)\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - "
\n", - " \n", - " \"Google
Open in Colab\n", - "
\n", - "
\n", - " \n", - " \"Google
Open in Colab Enterprise\n", - "
\n", - "
\n", - " \n", - " \"Vertex
Open in Vertex AI Workbench\n", - "
\n", - "
\n", - " \n", - " \"GitHub
View on GitHub\n", - "
\n", - "
\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "0ccc35a93b9f" - }, - "source": [ - "| | | |\n", - "|-|-|-|\n", - "|Author | [Ivan Nardini](https://github.com/inardini)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "tvgnzT1CKxrO" - }, - "source": [ - "## I. Overview\n", - "\n", - "In the context of developing Generative AI (Gen AI) applications, prompt engineering poses challenges due to its time-consuming and error-prone nature. You often dedicate significant effort to crafting and inputting prompts to achieve successful task completion. Additionally, with the frequent release of foundational models, you face the additional burden of migrating working prompts from one model version to another.\n", - "\n", - "Vertex AI Prompt Optimizer aims to alleviate these challenges by providing you with an intelligent prompt optimization tool. With this tool you can both refine optimize system instruction (and task) in the prompts and selects the best demonstrations (few-shot examples) for prompt templates, empowering you to shape LLM responses from any source model to on a target Google model.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4HKyj5KwYePX" - }, - "source": [ - "### Objective\n", - "\n", - "This notebook demostrates how to leverage Vertex AI Prompt Optimizer (Preview) to optimize a simple prompt for a Gemini model using your own metrics. The goal is to use Vertex AI Prompt Optimizer (Preview) to find the new prompt template which generate the most correct and grounded responses.\n", - "\n", - "This tutorial uses the following Google Cloud ML services and resources:\n", - "\n", - "- Vertex Gen AI\n", - "- Vertex AI Prompt Optimizer (Preview)\n", - "- Vertex AI Model Eval\n", - "- Vertex AI Custom job\n", - "\n", - "The steps performed include:\n", - "\n", - "- Prepare the prompt-ground truth pairs optimized for another model\n", - "- Define the prompt template you want to optimize\n", - "- Set target model and evaluation metric\n", - "- Set optimization mode and steps\n", - "- Run the automatic prompt optimization job\n", - "- Collect the best prompt template and eval metric\n", - "- Validate the best prompt template" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "08d289fa873f" - }, - "source": [ - "### Dataset\n", - "\n", - "The dataset is a question-answering dataset generated by a simple AI cooking assistant that provides suggestions on how to cook healthier dishes.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "aed92deeb4a0" - }, - "source": [ - "### Costs\n", - "\n", - "This tutorial uses billable components of Google Cloud:\n", - "\n", - "* Vertex AI\n", - "* Cloud Storage\n", - "\n", - "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage pricing](https://cloud.google.com/storage/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "61RBz8LLbxCR" - }, - "source": [ - "## II. 
Before you start" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "No17Cw5hgx12" - }, - "source": [ - "### Install Vertex AI SDK for Python and other required packages\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "tFy3H3aPgx12" - }, - "outputs": [], - "source": [ - "%pip install --upgrade --quiet 'google-cloud-aiplatform[evaluation]'\n", - "%pip install --upgrade --quiet 'plotly'\n", - "%pip install --upgrade --quiet 'asyncio' 'tqdm' 'tenacity' 'etils' 'importlib_resources' 'fsspec' 'gcsfs' 'nbformat>=4.2.0'" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "e55e2195ce2d" - }, - "outputs": [], - "source": [ - "! mkdir -p ./tutorial/utils && wget https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/prompts/prompt_optimizer/utils/helpers.py -P ./tutorial/utils" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "R5Xep4W9lq-Z" - }, - "source": [ - "### Restart runtime (Colab only)\n", - "\n", - "To use the newly installed packages, you must restart the runtime on Google Colab." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "XRvKdaPDTznN" - }, - "outputs": [], - "source": [ - "import sys\n", - "\n", - "if \"google.colab\" in sys.modules:\n", - " import IPython\n", - "\n", - " app = IPython.Application.instance()\n", - " app.kernel.do_shutdown(True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "SbmM4z7FOBpM" - }, - "source": [ - "
\n", - "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n", - "
\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "dmWOrTJ3gx13" - }, - "source": [ - "### Authenticate your notebook environment (Colab only)\n", - "\n", - "Authenticate your environment on Google Colab.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "NyKGtVQjgx13" - }, - "outputs": [], - "source": [ - "import sys\n", - "\n", - "if \"google.colab\" in sys.modules:\n", - " from google.colab import auth\n", - "\n", - " auth.authenticate_user()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "DF4l8DTdWgPY" - }, - "source": [ - "### Set Google Cloud project information\n", - "\n", - "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n", - "\n", - "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WReHDGG5g0XY" - }, - "source": [ - "#### Set your project ID and project number" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "oM1iC_MfAts1" - }, - "outputs": [], - "source": [ - "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", - "\n", - "# Set the project id\n", - "! gcloud config set project {PROJECT_ID}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "11a4349d673e" - }, - "outputs": [], - "source": [ - "PROJECT_NUMBER = !gcloud projects describe {PROJECT_ID} --format=\"get(projectNumber)\"[0]\n", - "PROJECT_NUMBER = PROJECT_NUMBER[0]" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "region" - }, - "source": [ - "#### Region\n", - "\n", - "You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "I6FmBV2_0fBP" - }, - "outputs": [], - "source": [ - "REGION = \"us-central1\" # @param {type: \"string\"}" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "zgPO1eR3CYjk" - }, - "source": [ - "#### Create a Cloud Storage bucket\n", - "\n", - "Create a storage bucket to store intermediate artifacts such as datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "MzGDU7TWdts_" - }, - "outputs": [], - "source": [ - "BUCKET_NAME = \"your-bucket-name-{PROJECT_ID}-unique\" # @param {type:\"string\"}\n", - "\n", - "BUCKET_URI = f\"gs://{BUCKET_NAME}\" # @param {type:\"string\"}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "NIq7R4HZCfIc" - }, - "outputs": [], - "source": [ - "! 
gsutil mb -l {REGION} -p {PROJECT_ID} {BUCKET_URI}" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "set_service_account" - }, - "source": [ - "#### Service Account and permissions\n", - "\n", - "Vertex AI Automated Prompt Design requires a service account with the following permissions:\n", - "\n", - "- `Vertex AI User` to call Vertex LLM API\n", - "- `Storage Object Admin` to read and write to your GCS bucket.\n", - "- `Artifact Registry Reader` to download the pipeline template from Artifact Registry.\n", - "\n", - "[Check out the documentation](https://cloud.google.com/iam/docs/manage-access-service-accounts#iam-view-access-sa-gcloud) to know how to grant those permissions to a single service account. \n", - "\n", - "**Important**: If you run following commands using Vertex AI Workbench, please directly run in the terminal. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ssUJJqXJJHgC" - }, - "outputs": [], - "source": [ - "SERVICE_ACCOUNT = f\"{PROJECT_NUMBER}-compute@developer.gserviceaccount.com\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "wqOHg5aid6HP" - }, - "outputs": [], - "source": [ - "for role in ['aiplatform.user', 'storage.objectAdmin', 'artifactregistry.reader']:\n", - "\n", - " ! gcloud projects add-iam-policy-binding {PROJECT_ID} \\\n", - " --member=serviceAccount:{SERVICE_ACCOUNT} \\\n", - " --role=roles/{role} --condition=None" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Ek1-iTbPjzdJ" - }, - "source": [ - "### Set tutorial folder and workspace\n", - "\n", - "Set a folder to collect data and any tutorial artifacts." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "BbfKRabXj3la" - }, - "outputs": [], - "source": [ - "from pathlib import Path as path\n", - "\n", - "ROOT_PATH = path.cwd()\n", - "TUTORIAL_PATH = ROOT_PATH / \"tutorial\"\n", - "CONFIG_PATH = TUTORIAL_PATH / \"config\"\n", - "TUNED_PROMPT_PATH = TUTORIAL_PATH / \"tuned_prompts\"\n", - "\n", - "TUTORIAL_PATH.mkdir(parents=True, exist_ok=True)\n", - "CONFIG_PATH.mkdir(parents=True, exist_ok=True)\n", - "TUNED_PROMPT_PATH.mkdir(parents=True, exist_ok=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "BaNdfftpXTIX" - }, - "source": [ - "Set the associated workspace on Cloud Storage bucket." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "joJPc3FmX1fk" - }, - "outputs": [], - "source": [ - "from etils import epath\n", - "\n", - "WORKSPACE_URI = epath.Path(BUCKET_URI) / \"prompt_migration_gemini\"\n", - "INPUT_DATA_URI = epath.Path(WORKSPACE_URI) / \"data\"\n", - "\n", - "WORKSPACE_URI.mkdir(parents=True, exist_ok=True)\n", - "INPUT_DATA_URI.mkdir(parents=True, exist_ok=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "960505627ddf" - }, - "source": [ - "### Import libraries" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "PyQmSRbKA8r-" - }, - "outputs": [], - "source": [ - "import json\n", - "# General\n", - "import logging\n", - "import warnings\n", - "# Tutorial\n", - "from argparse import Namespace\n", - "\n", - "import pandas as pd\n", - "from google.cloud import aiplatform\n", - "from tqdm.asyncio import tqdm_asyncio\n", - "from tutorial.utils.helpers import (async_generate, display_eval_report,\n", - " evaluate_task, get_id,\n", - " get_optimization_result,\n", - " get_results_file_uris, init_new_model,\n", - " plot_eval_metrics, print_df_rows)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "820DIvw1o8tB" - }, - "source": [ - "### Libraries settings" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "HKc4ZdUBo_SM" - }, - "outputs": [], - "source": [ - "warnings.filterwarnings(\"ignore\")\n", - "logging.getLogger(\"urllib3.connectionpool\").setLevel(logging.ERROR)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "init_aip:mbsdk,all" - }, - "source": [ - "### Initialize Vertex AI SDK for Python\n", - "\n", - "Initialize the Vertex AI SDK for Python for your project." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "bQMc2Uwf0fBQ" - }, - "outputs": [], - "source": [ - "aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gxc7q4r-DFH4" - }, - "source": [ - "### Define constants" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "0Y5t67f3DHNm" - }, - "outputs": [], - "source": [ - "INPUT_DATA_FILE_URI = \"gs://github-repo/prompts/prompt_optimizer/rag_qa_dataset.jsonl\"\n", - "\n", - "EXPERIMENT_NAME = \"qa-prompt-eval\"\n", - "INPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuning_data\"\n", - "INPUT_TUNING_DATA_FILE_URI = str(INPUT_DATA_URI / \"prompt_tuning.jsonl\")\n", - "OUTPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuned_prompt\"\n", - "APD_CONTAINER_URI = (\n", - " \"us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0\"\n", - ")\n", - "CONFIG_FILE_URI = str(WORKSPACE_URI / \"config\" / \"config.json\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "EdvJRUWRNGHE" - }, - "source": [ - "## III. Automated prompt design with Vertex AI Prompt Optimizer (Preview)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "mmTotjRAJplw" - }, - "source": [ - "### Load the dataset\n", - "\n", - "Load the dataset from Cloud Storage bucket." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "LA7aG08wJtVm" - }, - "outputs": [], - "source": [ - "prompt_tuning_df = pd.read_json(INPUT_DATA_FILE_URI, lines=True)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "1xn-pz3v5HVK" - }, - "outputs": [], - "source": [ - "prompt_tuning_df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "PsXdJBJXiaVH" - }, - "outputs": [], - "source": [ - "print_df_rows(prompt_tuning_df, n=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "E5SmBApC-WDg" - }, - "source": [ - "### Evaluate the previous model version in question-answering task\n", - "\n", - "Run an evaluation using Vertex AI Gen AI Evaluation Service to define question-answering baseline." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "qA-dl76E-H23" - }, - "outputs": [], - "source": [ - "evaluation_qa_results = [\n", - " (\n", - " \"qa_eval_result_old_model\",\n", - " evaluate_task(\n", - " df=prompt_tuning_df,\n", - " prompt_col=\"prompt\",\n", - " reference_col=\"reference\",\n", - " response_col=\"answer\",\n", - " experiment_name=EXPERIMENT_NAME,\n", - " eval_metrics=[\"question_answering_quality\", \"groundedness\"],\n", - " eval_sample_n=len(prompt_tuning_df),\n", - " ),\n", - " )\n", - "]" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "_9ZMmVHZfl5O" - }, - "source": [ - "Plot the evaluation metrics." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "yTZKlgOk-0qz" - }, - "outputs": [], - "source": [ - "plot_eval_metrics(evaluation_qa_results)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Rp1n1aMACzSW" - }, - "source": [ - "### Optimize the prompt template with Vertex AI Prompt Optimizer (Preview)\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "h1650lf3X8xW" - }, - "source": [ - "#### Prepare the prompt template you want to optimize\n", - "\n", - "A prompt consists of two key parts:\n", - "\n", - "* **System Instruction Template** which is a fixed part of the prompt shared across all queries for a given task.\n", - "\n", - "* **Prompt Template** which is a dynamic part of the prompt that changes based on the task.\n", - "\n", - "Vertex AI Prompt Optimizer enables the translation and optimization of the Instruction Template, while the Task/Context Template remains essential for evaluating different instruction templates.\n", - "\n", - "In this case, you want to enhance or optimize a simple prompt template.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Db8rHNC6DmtY" - }, - "outputs": [], - "source": [ - "SYSTEM_INSTRUCTION_TEMPLATE = \"\"\"\n", - "Given a question with some context, provide the correct answer to the question.\n", - "\"\"\"\n", - "\n", - "PROMPT_TEMPLATE = \"\"\"\n", - "Some examples of correct answer to a question with context are:\n", - "Question: {{question}}\n", - "Answer: {{target}}\n", - "\"\"\"" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "a1TCgXsrXztm" - }, - "source": [ - "#### Prepare few samples\n", - "\n", - "Vertex AI Prompt optimizer requires a CSV or JSONL file containing labeled samples.\n", - "\n", - "For **prompt optimization**:\n", - "\n", - "* Focus on examples that specifically demonstrate the issues you want to address.\n", - "* Recommendation: Use 50-100 distinct samples for reliable results. 
However, the tool can still be effective with as few as 5 samples.\n", - "\n", - "For **prompt translation**:\n", - "\n", - "* Consider using the source model to label examples that the target model struggles with, helping to identify areas for improvement.\n", - "\n", - "Learn more about setting up your CSV or JSONL file as input [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "vTIl_v9Ig1F-" - }, - "outputs": [], - "source": [ - "prepared_prompt_tuning_df = prompt_tuning_df.copy()\n", - "\n", - "# Prepare question and target columns\n", - "prepared_prompt_tuning_df[\"question\"] = (\n", - " prepared_prompt_tuning_df[\"user_question\"]\n", - " + \"\\nnContext:\\n\"\n", - " + prepared_prompt_tuning_df[\"context\"]\n", - ")\n", - "prepared_prompt_tuning_df = prepared_prompt_tuning_df.rename(\n", - " columns={\"reference\": \"target\"}\n", - ")\n", - "\n", - "# Remove uneccessary columns\n", - "prepared_prompt_tuning_df = prepared_prompt_tuning_df.drop(\n", - " columns=[\"user_question\", \"context\", \"prompt\", \"answer\"]\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "_DUFEAb82eEi" - }, - "outputs": [], - "source": [ - "prepared_prompt_tuning_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nF3XY_d_yB-K" - }, - "source": [ - "#### Upload samples to bucket\n", - "\n", - "Once you prepare samples, you can upload them on Cloud Storage bucket." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "155paLgGUXOm" - }, - "outputs": [], - "source": [ - "prepared_prompt_tuning_df.to_json(\n", - " INPUT_TUNING_DATA_FILE_URI, orient=\"records\", lines=True\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "F5RD0l2xX-FI" - }, - "source": [ - "#### Configure optimization settings\n", - "\n", - "Vertex AI Prompt Optimizer allows you to optimize prompts by optimizing instructions only, demonstration only, or both (`optimization_mode`), and after you set the system instruction, prompt templates that will be optimized (`system_instruction`, `prompt_template`), and the model you want to optimize for (`target_model`), it allows to condition the optimization process by setting metrics, number of iterations used to improve the prompt and more.\n", - "\n", - "Below you have some configurations as default that are most commonly used and recommended. And if you want to have more control of the optimization process, Vertex AI Prompt Optimizer (Preview) provides also additional configurations. 
Refer [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer) to learn more about the different parameters settings and how to best utilize them.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "sFHutXhgeqRx" - }, - "outputs": [], - "source": [ - "PROMPT_OPTIMIZATION_JOB = \"auto-prompt-design-job-\" + get_id()\n", - "OUTPUT_TUNING_RUN_URI = str(OUTPUT_TUNING_DATA_URI / PROMPT_OPTIMIZATION_JOB)\n", - "\n", - "args = Namespace(\n", - " # Basic configuration\n", - " system_instruction=SYSTEM_INSTRUCTION_TEMPLATE,\n", - " prompt_template=PROMPT_TEMPLATE,\n", - " target_model=\"gemini-1.5-flash-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", - " optimization_mode=\"instruction\", # Supported modes: \"instruction\", \"demonstration\", \"instruction_and_demo\"\n", - " num_steps=3,\n", - " num_template_eval_per_step=2,\n", - " num_demo_set_candidates=3,\n", - " demo_set_size=2,\n", - " input_data_path=INPUT_TUNING_DATA_FILE_URI,\n", - " output_path=OUTPUT_TUNING_RUN_URI,\n", - " project=PROJECT_ID,\n", - " # Advanced configuration\n", - " target_model_qps=1,\n", - " target_model_location=\"us-central1\",\n", - " source_model=\"\",\n", - " source_model_qps=\"\",\n", - " source_model_location=\"\",\n", - " optimizer_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", - " optimizer_model_qps=1,\n", - " optimizer_model_location=\"us-central1\",\n", - " eval_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", - " eval_qps=1,\n", - " eval_model_location=\"us-central1\",\n", - " eval_metrics_types=[\n", - " \"question_answering_correctness\",\n", - " \"groundedness\",\n", - " ], # Supported metrics: \"bleu\", \"coherence\", \"exact_match\", \"fluidity\", \"fulfillment\", \"groundedness\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_helpfulness\", \"question_answering_quality\", \"question_answering_relevance\", \"summarization_helpfulness\", \"summarization_quality\", \"summarization_verbosity\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\"\n", - " eval_metrics_weights=[0.9, 0.1],\n", - " aggregation_type=\"weighted_sum\", # Supported aggregation types: \"weighted_sum\", \"weighted_average\"\n", - " data_limit=50,\n", - " response_mime_type=\"application/json\",\n", - " language=\"English\", # Supported languages: \"English\", \"French\", \"German\", \"Hebrew\", \"Hindi\", \"Japanese\", \"Korean\", \"Portuguese\", \"Simplified Chinese\", \"Spanish\", \"Traditional Chinese\"\n", - " placeholder_to_content=json.loads(\"{}\"),\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Jd_uzQYQx6L7" - }, - "source": [ - "#### Upload Vertex AI Prompt Optimizer (Preview) config to Cloud Storage\n", - "\n", - "After you define Vertex AI Prompt Optimizer (Preview) configuration, you upload them on Cloud 
Storage bucket.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "QCJAqcfWBqAh" - }, - "source": [ - "Now you can save the config to the bucket." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "iqiv8ApR_SAM" - }, - "outputs": [], - "source": [ - "args = vars(args)\n", - "\n", - "with epath.Path(CONFIG_FILE_URI).open(\"w\") as config_file:\n", - " json.dump(args, config_file)\n", - "config_file.close()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "spqgBT8hYAle" - }, - "source": [ - "#### Run the automatic prompt optimization job\n", - "\n", - "Now you are ready to run your first Vertex AI Prompt Optimizer (Preview) job using the Vertex AI SDK for Python.\n", - "\n", - "**Important:** Be sure you have provisioned enough queries per minute (QPM) quota and the recommended QPM for each model. If you configure the Vertex AI prompt optimizer with a QPM that is higher than the QPM than you have access to, the job will fail. \n", - "\n", - "[Check out](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#before-you-begin) the documentation to know more. \n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "GtPnvKIpUQ3q" - }, - "outputs": [], - "source": [ - "WORKER_POOL_SPECS = [\n", - " {\n", - " \"machine_spec\": {\n", - " \"machine_type\": \"n1-standard-4\",\n", - " },\n", - " \"replica_count\": 1,\n", - " \"container_spec\": {\n", - " \"image_uri\": APD_CONTAINER_URI,\n", - " \"args\": [\"--config=\" + CONFIG_FILE_URI],\n", - " },\n", - " }\n", - "]\n", - "\n", - "custom_job = aiplatform.CustomJob(\n", - " display_name=PROMPT_OPTIMIZATION_JOB,\n", - " worker_pool_specs=WORKER_POOL_SPECS,\n", - ")\n", - "\n", - "custom_job.run(service_account=SERVICE_ACCOUNT)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "3YwwKBhtJ4ut" - }, - "source": [ - "### Collect the optimization results\n", - "\n", - "Vertex AI Prompt Optimizer returns both optimized templates and evaluation results for either instruction, or demostrations, or both depending on the optimization mode you define as JSONL files on Cloud Storage bucket. Those results help you understand the optimization process.\n", - "\n", - "In this case, you want to collect the optimized templates and evaluation results for the instruction.\n", - "\n", - "Below you use a helper function to read those results.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "xTPJsvg-kzkO" - }, - "outputs": [], - "source": [ - "apd_result_uris = get_results_file_uris(\n", - " output_uri=OUTPUT_TUNING_RUN_URI,\n", - " required_files=[\"eval_results.json\", \"templates.json\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ZezzQSYWjYPd" - }, - "source": [ - "#### Get the best system instruction\n", - "\n", - "Below you have the optimal system instruction template and the associated evaluation metrics." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "PrezXkBUu1s5" - }, - "outputs": [], - "source": [ - "best_prompt_df, prompt_summary_df, prompt_metrics_df = get_optimization_result(\n", - " apd_result_uris[\"instruction_templates\"],\n", - " apd_result_uris[\"instruction_eval_results\"],\n", - ")\n", - "\n", - "display_eval_report(\n", - " (best_prompt_df, prompt_summary_df, prompt_metrics_df),\n", - " prompt_component=\"instruction\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "TrMrbcA5Gzep" - }, - "source": [ - "### Validate and Evaluate the optimized template in question-answering task\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "bGRELw3U3I28" - }, - "source": [ - "#### Generate new responses using the optimized template\n", - "\n", - "Finally, you generate the new responses with the optimized template. Below you can see an example of a generated response using the optimized system instructions template." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "GXDU_ydAG5ak" - }, - "outputs": [], - "source": [ - "optimized_prompt_template = (\n", - " best_prompt_df[\"prompt\"].iloc[0]\n", - " + \"\\nQuestion: \\n{question}\"\n", - " + \"\\nContext: \\n{context}\"\n", - ")\n", - "\n", - "optimized_prompts = [\n", - " optimized_prompt_template.format(question=q, context=c)\n", - " for q, c in zip(\n", - " prompt_tuning_df[\"user_question\"].to_list(),\n", - " prompt_tuning_df[\"context\"].to_list(),\n", - " )\n", - "]\n", - "\n", - "prompt_tuning_df[\"optimized_prompt_with_vapo\"] = optimized_prompts" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "qG6QJW8alttS" - }, - "outputs": [], - "source": [ - "gemini_llm = init_new_model(\"gemini-1.5-flash-001\")\n", - "\n", - "gemini_predictions = [async_generate(p, model=gemini_llm) for p in optimized_prompts]\n", - "\n", - "gemini_predictions_col = await tqdm_asyncio.gather(*gemini_predictions)\n", - "\n", - "prompt_tuning_df[\"gemini_answer_with_vapo\"] = gemini_predictions_col" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "_55cHbD4kFAz" - }, - "outputs": [], - "source": [ - "print_df_rows(prompt_tuning_df, n=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "D1wxiPhv21TT" - }, - "source": [ - "#### Evaluate new responses using Vertex AI Gen AI Evaluation\n", - "\n", - "And you use the generated responses with the optimized prompt to run a new round of evaluation with Vertex AI Gen AI Evaluation.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "5Ebtvk0fKApV" - }, - "outputs": [], - "source": [ - "evaluation_qa_results.append(\n", - " (\n", - " \"qa_eval_result_new_model_with_vapo\",\n", - " evaluate_task(\n", - " df=prompt_tuning_df,\n", - " prompt_col=\"optimized_prompt_with_vapo\",\n", - " reference_col=\"reference\",\n", - " response_col=\"gemini_answer_with_vapo\",\n", - " experiment_name=EXPERIMENT_NAME,\n", - " eval_metrics=[\"question_answering_quality\", \"groundedness\"],\n", - " eval_sample_n=len(prompt_tuning_df),\n", - " ),\n", - " )\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "wJXNAnJjmnga" - }, - "outputs": [], - "source": [ - "plot_eval_metrics(evaluation_qa_results)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "2a4e033321ad" - }, - "source": [ - "## IV. 
Clean up" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "WRY_3wh1GVNm" - }, - "outputs": [], - "source": [ - "delete_bucket = False\n", - "delete_job = False\n", - "delete_experiment = False\n", - "delete_tutorial = False\n", - "\n", - "if delete_bucket:\n", - " ! gsutil rm -r $BUCKET_URI\n", - "\n", - "if delete_job:\n", - " custom_job.delete()\n", - "\n", - "if delete_experiment:\n", - " experiment = aiplatform.Experiment(experiment_name=EXPERIMENT_NAME)\n", - " experiment.delete()\n", - "\n", - "if delete_tutorial:\n", - " import shutil\n", - "\n", - " shutil.rmtree(str(TUTORIAL_PATH))" - ] - } - ], - "metadata": { - "colab": { - "name": "vertex_ai_prompt_optimizer_sdk.ipynb", - "toc_visible": true - }, - "environment": { - "kernel": "python3", - "name": "tf2-cpu.2-11.m125", - "type": "gcloud", - "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/tf2-cpu.2-11:m125" - }, - "kernelspec": { - "display_name": "Python 3 (Local)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.15" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ur8xi4C7S06n" + }, + "outputs": [], + "source": [ + "# Copyright 2024 Google LLC\n", + "#\n", + "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JAPoU8Sm5E6e" + }, + "source": [ + "# Vertex Prompt Optimizer Notebook SDK (Preview)\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "
\n", + " \n", + " \"Google
Open in Colab\n", + "
\n", + "
\n", + " \n", + " \"Google
Open in Colab Enterprise\n", + "
\n", + "
\n", + " \n", + " \"Vertex
Open in Vertex AI Workbench\n", + "
\n", + "
\n", + " \n", + " \"GitHub
View on GitHub\n", + "
\n", + "
\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0ccc35a93b9f" + }, + "source": [ + "| | | |\n", + "|-|-|-|\n", + "|Author | [Ivan Nardini](https://github.com/inardini)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tvgnzT1CKxrO" + }, + "source": [ + "## I. Overview\n", + "\n", + "In the context of developing Generative AI (Gen AI) applications, prompt engineering poses challenges due to its time-consuming and error-prone nature. You often dedicate significant effort to crafting and inputting prompts to achieve successful task completion. Additionally, with the frequent release of foundational models, you face the additional burden of migrating working prompts from one model version to another.\n", + "\n", + "Vertex AI Prompt Optimizer aims to alleviate these challenges by providing you with an intelligent prompt optimization tool. With this tool you can both refine optimize system instruction (and task) in the prompts and selects the best demonstrations (few-shot examples) for prompt templates, empowering you to shape LLM responses from any source model to on a target Google model.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4HKyj5KwYePX" + }, + "source": [ + "### Objective\n", + "\n", + "This notebook demostrates how to leverage Vertex AI Prompt Optimizer (Preview) to optimize a simple prompt for a Gemini model using your own metrics. The goal is to use Vertex AI Prompt Optimizer (Preview) to find the new prompt template which generate the most correct and grounded responses.\n", + "\n", + "This tutorial uses the following Google Cloud ML services and resources:\n", + "\n", + "- Vertex Gen AI\n", + "- Vertex AI Prompt Optimizer (Preview)\n", + "- Vertex AI Model Eval\n", + "- Vertex AI Custom job\n", + "\n", + "The steps performed include:\n", + "\n", + "- Prepare the prompt-ground truth pairs optimized for another model\n", + "- Define the prompt template you want to optimize\n", + "- Set target model and evaluation metric\n", + "- Set optimization mode and steps\n", + "- Run the automatic prompt optimization job\n", + "- Collect the best prompt template and eval metric\n", + "- Validate the best prompt template" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "08d289fa873f" + }, + "source": [ + "### Dataset\n", + "\n", + "The dataset is a question-answering dataset generated by a simple AI cooking assistant that provides suggestions on how to cook healthier dishes.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aed92deeb4a0" + }, + "source": [ + "### Costs\n", + "\n", + "This tutorial uses billable components of Google Cloud:\n", + "\n", + "* Vertex AI\n", + "* Cloud Storage\n", + "\n", + "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage pricing](https://cloud.google.com/storage/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "61RBz8LLbxCR" + }, + "source": [ + "## II. 
Before you start" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "No17Cw5hgx12" + }, + "source": [ + "### Install Vertex AI SDK for Python and other required packages\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "tFy3H3aPgx12" + }, + "outputs": [], + "source": [ + "%pip install --upgrade --quiet 'google-cloud-aiplatform[evaluation]'\n", + "%pip install --upgrade --quiet 'plotly'\n", + "%pip install --upgrade --quiet 'asyncio' 'tqdm' 'tenacity' 'etils' 'importlib_resources' 'fsspec' 'gcsfs' 'nbformat>=4.2.0'" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "e55e2195ce2d" + }, + "outputs": [], + "source": [ + "! mkdir -p ./tutorial/utils && wget https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/prompts/prompt_optimizer/utils/helpers.py -P ./tutorial/utils" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "R5Xep4W9lq-Z" + }, + "source": [ + "### Restart runtime (Colab only)\n", + "\n", + "To use the newly installed packages, you must restart the runtime on Google Colab." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "XRvKdaPDTznN" + }, + "outputs": [], + "source": [ + "import sys\n", + "\n", + "if \"google.colab\" in sys.modules:\n", + " import IPython\n", + "\n", + " app = IPython.Application.instance()\n", + " app.kernel.do_shutdown(True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SbmM4z7FOBpM" + }, + "source": [ + "
\n", + "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dmWOrTJ3gx13" + }, + "source": [ + "### Authenticate your notebook environment (Colab only)\n", + "\n", + "Authenticate your environment on Google Colab.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "NyKGtVQjgx13" + }, + "outputs": [], + "source": [ + "import sys\n", + "\n", + "if \"google.colab\" in sys.modules:\n", + " from google.colab import auth\n", + "\n", + " auth.authenticate_user()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DF4l8DTdWgPY" + }, + "source": [ + "### Set Google Cloud project information\n", + "\n", + "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n", + "\n", + "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WReHDGG5g0XY" + }, + "source": [ + "#### Set your project ID and project number" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "oM1iC_MfAts1" + }, + "outputs": [], + "source": [ + "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", + "\n", + "# Set the project id\n", + "! gcloud config set project {PROJECT_ID}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "11a4349d673e" + }, + "outputs": [], + "source": [ + "PROJECT_NUMBER = !gcloud projects describe {PROJECT_ID} --format=\"get(projectNumber)\"[0]\n", + "PROJECT_NUMBER = PROJECT_NUMBER[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "region" + }, + "source": [ + "#### Region\n", + "\n", + "You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "I6FmBV2_0fBP" + }, + "outputs": [], + "source": [ + "REGION = \"us-central1\" # @param {type: \"string\"}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zgPO1eR3CYjk" + }, + "source": [ + "#### Create a Cloud Storage bucket\n", + "\n", + "Create a storage bucket to store intermediate artifacts such as datasets." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "MzGDU7TWdts_" + }, + "outputs": [], + "source": [ + "BUCKET_NAME = \"your-bucket-name-{PROJECT_ID}-unique\" # @param {type:\"string\"}\n", + "\n", + "BUCKET_URI = f\"gs://{BUCKET_NAME}\" # @param {type:\"string\"}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "NIq7R4HZCfIc" + }, + "outputs": [], + "source": [ + "! 
gsutil mb -l {REGION} -p {PROJECT_ID} {BUCKET_URI}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "set_service_account" + }, + "source": [ + "#### Service Account and permissions\n", + "\n", + "Vertex AI Automated Prompt Design requires a service account with the following permissions:\n", + "\n", + "- `Vertex AI User` to call Vertex LLM API\n", + "- `Storage Object Admin` to read and write to your GCS bucket.\n", + "- `Artifact Registry Reader` to download the pipeline template from Artifact Registry.\n", + "\n", + "[Check out the documentation](https://cloud.google.com/iam/docs/manage-access-service-accounts#iam-view-access-sa-gcloud) to know how to grant those permissions to a single service account. \n", + "\n", + "**Important**: If you run following commands using Vertex AI Workbench, please directly run in the terminal. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ssUJJqXJJHgC" + }, + "outputs": [], + "source": [ + "SERVICE_ACCOUNT = f\"{PROJECT_NUMBER}-compute@developer.gserviceaccount.com\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "wqOHg5aid6HP" + }, + "outputs": [], + "source": [ + "for role in ['aiplatform.user', 'storage.objectAdmin', 'artifactregistry.reader']:\n", + "\n", + " ! gcloud projects add-iam-policy-binding {PROJECT_ID} \\\n", + " --member=serviceAccount:{SERVICE_ACCOUNT} \\\n", + " --role=roles/{role} --condition=None" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Ek1-iTbPjzdJ" + }, + "source": [ + "### Set tutorial folder and workspace\n", + "\n", + "Set a folder to collect data and any tutorial artifacts." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "BbfKRabXj3la" + }, + "outputs": [], + "source": [ + "from pathlib import Path as path\n", + "\n", + "ROOT_PATH = path.cwd()\n", + "TUTORIAL_PATH = ROOT_PATH / \"tutorial\"\n", + "CONFIG_PATH = TUTORIAL_PATH / \"config\"\n", + "TUNED_PROMPT_PATH = TUTORIAL_PATH / \"tuned_prompts\"\n", + "\n", + "TUTORIAL_PATH.mkdir(parents=True, exist_ok=True)\n", + "CONFIG_PATH.mkdir(parents=True, exist_ok=True)\n", + "TUNED_PROMPT_PATH.mkdir(parents=True, exist_ok=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BaNdfftpXTIX" + }, + "source": [ + "Set the associated workspace on Cloud Storage bucket." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "joJPc3FmX1fk" + }, + "outputs": [], + "source": [ + "from etils import epath\n", + "\n", + "WORKSPACE_URI = epath.Path(BUCKET_URI) / \"prompt_migration_gemini\"\n", + "INPUT_DATA_URI = epath.Path(WORKSPACE_URI) / \"data\"\n", + "\n", + "WORKSPACE_URI.mkdir(parents=True, exist_ok=True)\n", + "INPUT_DATA_URI.mkdir(parents=True, exist_ok=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "960505627ddf" + }, + "source": [ + "### Import libraries" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "PyQmSRbKA8r-" + }, + "outputs": [], + "source": [ + "# Tutorial\n", + "from argparse import Namespace\n", + "import json\n", + "\n", + "# General\n", + "import logging\n", + "import warnings\n", + "\n", + "from google.cloud import aiplatform\n", + "import pandas as pd\n", + "from tutorial.utils.helpers import (\n", + " async_generate,\n", + " display_eval_report,\n", + " evaluate_task,\n", + " get_id,\n", + " get_optimization_result,\n", + " get_results_file_uris,\n", + " init_new_model,\n", + " plot_eval_metrics,\n", + " print_df_rows,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "820DIvw1o8tB" + }, + "source": [ + "### Libraries settings" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HKc4ZdUBo_SM" + }, + "outputs": [], + "source": [ + "warnings.filterwarnings(\"ignore\")\n", + "logging.getLogger(\"urllib3.connectionpool\").setLevel(logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "init_aip:mbsdk,all" + }, + "source": [ + "### Initialize Vertex AI SDK for Python\n", + "\n", + "Initialize the Vertex AI SDK for Python for your project." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "bQMc2Uwf0fBQ" + }, + "outputs": [], + "source": [ + "aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gxc7q4r-DFH4" + }, + "source": [ + "### Define constants" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "0Y5t67f3DHNm" + }, + "outputs": [], + "source": [ + "INPUT_DATA_FILE_URI = \"gs://github-repo/prompts/prompt_optimizer/rag_qa_dataset.jsonl\"\n", + "\n", + "EXPERIMENT_NAME = \"qa-prompt-eval\"\n", + "INPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuning_data\"\n", + "INPUT_TUNING_DATA_FILE_URI = str(INPUT_DATA_URI / \"prompt_tuning.jsonl\")\n", + "OUTPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuned_prompt\"\n", + "APD_CONTAINER_URI = (\n", + " \"us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0\"\n", + ")\n", + "CONFIG_FILE_URI = str(WORKSPACE_URI / \"config\" / \"config.json\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "EdvJRUWRNGHE" + }, + "source": [ + "## III. Automated prompt design with Vertex AI Prompt Optimizer (Preview)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mmTotjRAJplw" + }, + "source": [ + "### Load the dataset\n", + "\n", + "Load the dataset from Cloud Storage bucket." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "LA7aG08wJtVm" + }, + "outputs": [], + "source": [ + "prompt_tuning_df = pd.read_json(INPUT_DATA_FILE_URI, lines=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "1xn-pz3v5HVK" + }, + "outputs": [], + "source": [ + "prompt_tuning_df.head()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "PsXdJBJXiaVH" + }, + "outputs": [], + "source": [ + "print_df_rows(prompt_tuning_df, n=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "E5SmBApC-WDg" + }, + "source": [ + "### Evaluate the previous model version in question-answering task\n", + "\n", + "Run an evaluation using Vertex AI Gen AI Evaluation Service to define question-answering baseline." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "qA-dl76E-H23" + }, + "outputs": [], + "source": [ + "evaluation_qa_results = [\n", + " (\n", + " \"qa_eval_result_old_model\",\n", + " evaluate_task(\n", + " df=prompt_tuning_df,\n", + " prompt_col=\"prompt\",\n", + " reference_col=\"reference\",\n", + " response_col=\"answer\",\n", + " experiment_name=EXPERIMENT_NAME,\n", + " eval_metrics=[\"question_answering_quality\", \"groundedness\"],\n", + " eval_sample_n=len(prompt_tuning_df),\n", + " ),\n", + " )\n", + "]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_9ZMmVHZfl5O" + }, + "source": [ + "Plot the evaluation metrics." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "yTZKlgOk-0qz" + }, + "outputs": [], + "source": [ + "plot_eval_metrics(evaluation_qa_results)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Rp1n1aMACzSW" + }, + "source": [ + "### Optimize the prompt template with Vertex AI Prompt Optimizer (Preview)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "h1650lf3X8xW" + }, + "source": [ + "#### Prepare the prompt template you want to optimize\n", + "\n", + "A prompt consists of two key parts:\n", + "\n", + "* **System Instruction Template** which is a fixed part of the prompt shared across all queries for a given task.\n", + "\n", + "* **Prompt Template** which is a dynamic part of the prompt that changes based on the task.\n", + "\n", + "Vertex AI Prompt Optimizer enables the translation and optimization of the Instruction Template, while the Task/Context Template remains essential for evaluating different instruction templates.\n", + "\n", + "In this case, you want to enhance or optimize a simple prompt template.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Db8rHNC6DmtY" + }, + "outputs": [], + "source": [ + "SYSTEM_INSTRUCTION_TEMPLATE = \"\"\"\n", + "Given a question with some context, provide the correct answer to the question.\n", + "\"\"\"\n", + "\n", + "PROMPT_TEMPLATE = \"\"\"\n", + "Some examples of correct answer to a question with context are:\n", + "Question: {{question}}\n", + "Answer: {{target}}\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "a1TCgXsrXztm" + }, + "source": [ + "#### Prepare few samples\n", + "\n", + "Vertex AI Prompt optimizer requires a CSV or JSONL file containing labeled samples.\n", + "\n", + "For **prompt optimization**:\n", + "\n", + "* Focus on examples that specifically demonstrate the issues you want to address.\n", + "* Recommendation: Use 50-100 distinct samples for reliable results. 
However, the tool can still be effective with as few as 5 samples.\n", + "\n", + "For **prompt translation**:\n", + "\n", + "* Consider using the source model to label examples that the target model struggles with, helping to identify areas for improvement.\n", + "\n", + "Learn more about setting up your CSV or JSONL file as input [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "vTIl_v9Ig1F-" + }, + "outputs": [], + "source": [ + "prepared_prompt_tuning_df = prompt_tuning_df.copy()\n", + "\n", + "# Prepare question and target columns\n", + "prepared_prompt_tuning_df[\"question\"] = (\n", + " prepared_prompt_tuning_df[\"user_question\"]\n", + " + \"\\nnContext:\\n\"\n", + " + prepared_prompt_tuning_df[\"context\"]\n", + ")\n", + "prepared_prompt_tuning_df = prepared_prompt_tuning_df.rename(\n", + " columns={\"reference\": \"target\"}\n", + ")\n", + "\n", + "# Remove uneccessary columns\n", + "prepared_prompt_tuning_df = prepared_prompt_tuning_df.drop(\n", + " columns=[\"user_question\", \"context\", \"prompt\", \"answer\"]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "_DUFEAb82eEi" + }, + "outputs": [], + "source": [ + "prepared_prompt_tuning_df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nF3XY_d_yB-K" + }, + "source": [ + "#### Upload samples to bucket\n", + "\n", + "Once you prepare samples, you can upload them on Cloud Storage bucket." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "155paLgGUXOm" + }, + "outputs": [], + "source": [ + "prepared_prompt_tuning_df.to_json(\n", + " INPUT_TUNING_DATA_FILE_URI, orient=\"records\", lines=True\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "F5RD0l2xX-FI" + }, + "source": [ + "#### Configure optimization settings\n", + "\n", + "Vertex AI Prompt Optimizer allows you to optimize prompts by optimizing instructions only, demonstration only, or both (`optimization_mode`), and after you set the system instruction, prompt templates that will be optimized (`system_instruction`, `prompt_template`), and the model you want to optimize for (`target_model`), it allows to condition the optimization process by setting metrics, number of iterations used to improve the prompt and more.\n", + "\n", + "Below you have some configurations as default that are most commonly used and recommended. And if you want to have more control of the optimization process, Vertex AI Prompt Optimizer (Preview) provides also additional configurations. 
Refer [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer) to learn more about the different parameters settings and how to best utilize them.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "sFHutXhgeqRx" + }, + "outputs": [], + "source": [ + "PROMPT_OPTIMIZATION_JOB = \"auto-prompt-design-job-\" + get_id()\n", + "OUTPUT_TUNING_RUN_URI = str(OUTPUT_TUNING_DATA_URI / PROMPT_OPTIMIZATION_JOB)\n", + "\n", + "args = Namespace(\n", + " # Basic configuration\n", + " system_instruction=SYSTEM_INSTRUCTION_TEMPLATE,\n", + " prompt_template=PROMPT_TEMPLATE,\n", + " target_model=\"gemini-1.5-flash-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", + " optimization_mode=\"instruction\", # Supported modes: \"instruction\", \"demonstration\", \"instruction_and_demo\"\n", + " num_steps=3,\n", + " num_template_eval_per_step=2,\n", + " num_demo_set_candidates=3,\n", + " demo_set_size=2,\n", + " input_data_path=INPUT_TUNING_DATA_FILE_URI,\n", + " output_path=OUTPUT_TUNING_RUN_URI,\n", + " project=PROJECT_ID,\n", + " # Advanced configuration\n", + " target_model_qps=1,\n", + " target_model_location=\"us-central1\",\n", + " source_model=\"\",\n", + " source_model_qps=\"\",\n", + " source_model_location=\"\",\n", + " optimizer_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", + " optimizer_model_qps=1,\n", + " optimizer_model_location=\"us-central1\",\n", + " eval_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", + " eval_qps=1,\n", + " eval_model_location=\"us-central1\",\n", + " eval_metrics_types=[\n", + " \"question_answering_correctness\",\n", + " \"groundedness\",\n", + " ], # Supported metrics: \"bleu\", \"coherence\", \"exact_match\", \"fluidity\", \"fulfillment\", \"groundedness\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_helpfulness\", \"question_answering_quality\", \"question_answering_relevance\", \"summarization_helpfulness\", \"summarization_quality\", \"summarization_verbosity\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\"\n", + " eval_metrics_weights=[0.9, 0.1],\n", + " aggregation_type=\"weighted_sum\", # Supported aggregation types: \"weighted_sum\", \"weighted_average\"\n", + " data_limit=50,\n", + " response_mime_type=\"application/json\",\n", + " language=\"English\", # Supported languages: \"English\", \"French\", \"German\", \"Hebrew\", \"Hindi\", \"Japanese\", \"Korean\", \"Portuguese\", \"Simplified Chinese\", \"Spanish\", \"Traditional Chinese\"\n", + " placeholder_to_content=json.loads(\"{}\"),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Jd_uzQYQx6L7" + }, + "source": [ + "#### Upload Vertex AI Prompt Optimizer (Preview) config to Cloud Storage\n", + "\n", + "After you define Vertex AI Prompt Optimizer (Preview) configuration, you upload them on Cloud 
Storage bucket.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "QCJAqcfWBqAh"
+ },
+ "source": [
+ "Now you can save the config to the bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "iqiv8ApR_SAM"
+ },
+ "outputs": [],
+ "source": [
+ "args = vars(args)\n",
+ "\n",
+ "with epath.Path(CONFIG_FILE_URI).open(\"w\") as config_file:\n",
+ "    json.dump(args, config_file)\n",
+ "config_file.close()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "spqgBT8hYAle"
+ },
+ "source": [
+ "#### Run the automatic prompt optimization job\n",
+ "\n",
+ "Now you are ready to run your first Vertex AI Prompt Optimizer (Preview) job using the Vertex AI SDK for Python.\n",
+ "\n",
+ "**Important:** Be sure you have provisioned enough queries per minute (QPM) quota, keeping in mind the recommended QPM for each model. If you configure Vertex AI Prompt Optimizer with a QPM that is higher than the QPM you have access to, the job will fail.\n",
+ "\n",
+ "[Check out](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#before-you-begin) the documentation to know more.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "GtPnvKIpUQ3q"
+ },
+ "outputs": [],
+ "source": [
+ "WORKER_POOL_SPECS = [\n",
+ "    {\n",
+ "        \"machine_spec\": {\n",
+ "            \"machine_type\": \"n1-standard-4\",\n",
+ "        },\n",
+ "        \"replica_count\": 1,\n",
+ "        \"container_spec\": {\n",
+ "            \"image_uri\": APD_CONTAINER_URI,\n",
+ "            \"args\": [\"--config=\" + CONFIG_FILE_URI],\n",
+ "        },\n",
+ "    }\n",
+ "]\n",
+ "\n",
+ "custom_job = aiplatform.CustomJob(\n",
+ "    display_name=PROMPT_OPTIMIZATION_JOB,\n",
+ "    worker_pool_specs=WORKER_POOL_SPECS,\n",
+ ")\n",
+ "\n",
+ "custom_job.run(service_account=SERVICE_ACCOUNT)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3YwwKBhtJ4ut"
+ },
+ "source": [
+ "### Collect the optimization results\n",
+ "\n",
+ "Vertex AI Prompt Optimizer returns the optimized templates and their evaluation results for the instruction, the demonstrations, or both, depending on the optimization mode you define. The results are written as JSONL files to the Cloud Storage bucket and help you understand the optimization process.\n",
+ "\n",
+ "In this case, you want to collect the optimized templates and evaluation results for the instruction.\n",
+ "\n",
+ "Below you use a helper function to read those results.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "xTPJsvg-kzkO"
+ },
+ "outputs": [],
+ "source": [
+ "apd_result_uris = get_results_file_uris(\n",
+ "    output_uri=OUTPUT_TUNING_RUN_URI,\n",
+ "    required_files=[\"eval_results.json\", \"templates.json\"],\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ZezzQSYWjYPd"
+ },
+ "source": [
+ "#### Get the best system instruction\n",
+ "\n",
+ "Below you have the optimal system instruction template and the associated evaluation metrics."
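For reference, here is a minimal sketch of how you could inspect the optimizer's raw result files yourself instead of relying on the notebook helpers. It assumes the `apd_result_uris` dictionary populated by the previous cell and that the result files are newline-delimited JSON, as described above; the exact schema of `templates.json` and `eval_results.json` is not reproduced here, so treat the column access as illustrative.

```python
# Sketch: load the optimizer's output files directly with pandas.
# Assumes `apd_result_uris` was returned by get_results_file_uris() above and
# that gcsfs is installed so pandas can read gs:// URIs (it is installed
# earlier in this notebook). Adjust `lines=` if the files are plain JSON.
import pandas as pd

templates_df = pd.read_json(apd_result_uris["instruction_templates"], lines=True)
eval_results_df = pd.read_json(apd_result_uris["instruction_eval_results"], lines=True)

print(f"{len(templates_df)} candidate instruction templates evaluated")
templates_df.head()
```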
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "PrezXkBUu1s5" + }, + "outputs": [], + "source": [ + "best_prompt_df, prompt_summary_df, prompt_metrics_df = get_optimization_result(\n", + " apd_result_uris[\"instruction_templates\"],\n", + " apd_result_uris[\"instruction_eval_results\"],\n", + ")\n", + "\n", + "display_eval_report(\n", + " (best_prompt_df, prompt_summary_df, prompt_metrics_df),\n", + " prompt_component=\"instruction\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TrMrbcA5Gzep" + }, + "source": [ + "### Validate and Evaluate the optimized template in question-answering task\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bGRELw3U3I28" + }, + "source": [ + "#### Generate new responses using the optimized template\n", + "\n", + "Finally, you generate the new responses with the optimized template. Below you can see an example of a generated response using the optimized system instructions template." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "GXDU_ydAG5ak" + }, + "outputs": [], + "source": [ + "optimized_prompt_template = (\n", + " best_prompt_df[\"prompt\"].iloc[0]\n", + " + \"\\nQuestion: \\n{question}\"\n", + " + \"\\nContext: \\n{context}\"\n", + ")\n", + "\n", + "optimized_prompts = [\n", + " optimized_prompt_template.format(question=q, context=c)\n", + " for q, c in zip(\n", + " prompt_tuning_df[\"user_question\"].to_list(),\n", + " prompt_tuning_df[\"context\"].to_list(),\n", + " )\n", + "]\n", + "\n", + "prompt_tuning_df[\"optimized_prompt_with_vapo\"] = optimized_prompts" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "qG6QJW8alttS" + }, + "outputs": [], + "source": [ + "gemini_llm = init_new_model(\"gemini-1.5-flash-001\")\n", + "\n", + "gemini_predictions = [async_generate(p, model=gemini_llm) for p in optimized_prompts]\n", + "\n", + "gemini_predictions_col = await tqdm_asyncio.gather(*gemini_predictions)\n", + "\n", + "prompt_tuning_df[\"gemini_answer_with_vapo\"] = gemini_predictions_col" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "_55cHbD4kFAz" + }, + "outputs": [], + "source": [ + "print_df_rows(prompt_tuning_df, n=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "D1wxiPhv21TT" + }, + "source": [ + "#### Evaluate new responses using Vertex AI Gen AI Evaluation\n", + "\n", + "And you use the generated responses with the optimized prompt to run a new round of evaluation with Vertex AI Gen AI Evaluation.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "5Ebtvk0fKApV" + }, + "outputs": [], + "source": [ + "evaluation_qa_results.append(\n", + " (\n", + " \"qa_eval_result_new_model_with_vapo\",\n", + " evaluate_task(\n", + " df=prompt_tuning_df,\n", + " prompt_col=\"optimized_prompt_with_vapo\",\n", + " reference_col=\"reference\",\n", + " response_col=\"gemini_answer_with_vapo\",\n", + " experiment_name=EXPERIMENT_NAME,\n", + " eval_metrics=[\"question_answering_quality\", \"groundedness\"],\n", + " eval_sample_n=len(prompt_tuning_df),\n", + " ),\n", + " )\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "wJXNAnJjmnga" + }, + "outputs": [], + "source": [ + "plot_eval_metrics(evaluation_qa_results)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2a4e033321ad" + }, + "source": [ + "## IV. 
Clean up" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "WRY_3wh1GVNm" + }, + "outputs": [], + "source": [ + "delete_bucket = False\n", + "delete_job = False\n", + "delete_experiment = False\n", + "delete_tutorial = False\n", + "\n", + "if delete_bucket:\n", + " ! gsutil rm -r $BUCKET_URI\n", + "\n", + "if delete_job:\n", + " custom_job.delete()\n", + "\n", + "if delete_experiment:\n", + " experiment = aiplatform.Experiment(experiment_name=EXPERIMENT_NAME)\n", + " experiment.delete()\n", + "\n", + "if delete_tutorial:\n", + " import shutil\n", + "\n", + " shutil.rmtree(str(TUTORIAL_PATH))" + ] + } + ], + "metadata": { + "colab": { + "name": "vertex_ai_prompt_optimizer_sdk.ipynb", + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 } diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb index 749d39b323..9e7a8aa982 100644 --- a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb +++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb @@ -1,1382 +1,1368 @@ { - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ur8xi4C7S06n" - }, - "outputs": [], - "source": [ - "# Copyright 2024 Google LLC\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "JAPoU8Sm5E6e" - }, - "source": [ - "# Vertex Prompt Optimizer Notebook SDK (Preview) - Custom metric\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - "
\n", - " \n", - " \"Google
Open in Colab\n", - "
\n", - "
\n", - " \n", - " \"Google
Open in Colab Enterprise\n", - "
\n", - "
\n", - " \n", - " \"Vertex
Open in Vertex AI Workbench\n", - "
\n", - "
\n", - " \n", - " \"GitHub
View on GitHub\n", - "
\n", - "
\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "0ccc35a93b9f" - }, - "source": [ - "| | | |\n", - "|-|-|-|\n", - "| Author | [Ivan Nardini](https://github.com/inardini) |" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "tvgnzT1CKxrO" - }, - "source": [ - "## I. Overview\n", - "\n", - "In the context of developing Generative AI (Gen AI) applications, prompt engineering poses challenges due to its time-consuming and error-prone nature. You often dedicate significant effort to crafting and inputting prompts to achieve successful task completion. Additionally, with the frequent release of foundational models, you face the additional burden of migrating working prompts from one model version to another.\n", - "\n", - "Vertex AI Prompt Optimizer aims to alleviate these challenges by providing you with an intelligent prompt optimization tool. With this tool you can both refine optimize system instruction (and task) in the prompts and selects the best demonstrations (few-shot examples) for prompt templates, empowering you to shape LLM responses from any source model to on a target Google model.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4HKyj5KwYePX" - }, - "source": [ - "### Objective\n", - "\n", - "This notebook demostrates how to leverage Vertex AI Prompt Optimizer (Preview) to optimize a simple prompt for a Gemini model using your own metric. The goal is to use Vertex AI Prompt Optimizer (Preview) to find the new prompt template which generates responses according to your own metric.\n", - "\n", - "\n", - "This tutorial uses the following Google Cloud services and resources:\n", - "\n", - "- Vertex AI Gen AI\n", - "- Vertex AI Prompt Optimizer (Preview)\n", - "- Vertex AI Model Eval\n", - "- Vertex AI Custom job\n", - "- Cloud Run\n", - "\n", - "The steps performed include:\n", - "\n", - "- Prepare the prompt-ground truth pairs optimized for another model\n", - "- Define the prompt template you want to optimize\n", - "- Define and deploy your own custom evaluation metric on Cloud function\n", - "- Set optimization mode and steps\n", - "- Run the automatic prompt optimization job\n", - "- Collect the best prompt template and eval metric\n", - "- Validate the best prompt template" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "08d289fa873f" - }, - "source": [ - "### Dataset\n", - "\n", - "The dataset is a question-answering dataset generated by a simple AI cooking assistant that provides suggestions on how to cook healthier dishes.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "aed92deeb4a0" - }, - "source": [ - "### Costs\n", - "\n", - "This tutorial uses billable components of Google Cloud:\n", - "\n", - "* Vertex AI\n", - "* Cloud Storage\n", - "\n", - "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage pricing](https://cloud.google.com/storage/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "61RBz8LLbxCR" - }, - "source": [ - "## II. 
Before you start" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "No17Cw5hgx12" - }, - "source": [ - "### Install Vertex AI SDK for Python and other required packages\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "tFy3H3aPgx12" - }, - "outputs": [], - "source": [ - "%pip install --upgrade --quiet 'google-cloud-aiplatform[evaluation]' 'plotly' 'asyncio' 'tqdm' 'tenacity' 'etils' 'importlib_resources' 'fsspec' 'gcsfs' 'nbformat>=4.2.0'" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "e55e2195ce2d" - }, - "outputs": [], - "source": [ - "! mkdir -p ./tutorial/utils && wget https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/prompts/prompt_optimizer/utils/helpers.py -P ./tutorial/utils" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "R5Xep4W9lq-Z" - }, - "source": [ - "### Restart runtime (Colab only)\n", - "\n", - "To use the newly installed packages, you must restart the runtime on Google Colab." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "XRvKdaPDTznN" - }, - "outputs": [], - "source": [ - "import sys\n", - "\n", - "if \"google.colab\" in sys.modules:\n", - " import IPython\n", - "\n", - " app = IPython.Application.instance()\n", - " app.kernel.do_shutdown(True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "SbmM4z7FOBpM" - }, - "source": [ - "
\n", - "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n", - "
\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "dmWOrTJ3gx13" - }, - "source": [ - "### Authenticate your notebook environment (Colab only)\n", - "\n", - "Authenticate your environment on Google Colab.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "NyKGtVQjgx13" - }, - "outputs": [], - "source": [ - "import sys\n", - "\n", - "if \"google.colab\" in sys.modules:\n", - " try:\n", - " from google.colab import auth\n", - "\n", - " auth.authenticate_user()\n", - " creds, project = auth.default()\n", - " if creds.token:\n", - " print(\"Authentication successful.\")\n", - " else:\n", - " print(\"Authentication successful, but no token was returned.\")\n", - " except Exception as e:\n", - " print(f\"Error during Colab authentication: {e}\")\n", - "\n", - "! gcloud auth login" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "DF4l8DTdWgPY" - }, - "source": [ - "### Set Google Cloud project information\n", - "\n", - "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the following APIs](https://console.cloud.google.com/flows/enableapi?apiid=cloudresourcemanager.googleapis.com,aiplatform.googleapis.com,cloudfunctions.googleapis.com,run.googleapis.com).\n", - "\n", - "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WReHDGG5g0XY" - }, - "source": [ - "#### Set your project ID and project number" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "oM1iC_MfAts1" - }, - "outputs": [], - "source": [ - "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", - "\n", - "# Set the project id\n", - "! gcloud config set project {PROJECT_ID}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "oZpm-sL8f1z_" - }, - "outputs": [], - "source": [ - "PROJECT_NUMBER = !gcloud projects describe {PROJECT_ID} --format=\"get(projectNumber)\"[0]\n", - "PROJECT_NUMBER = PROJECT_NUMBER[0]" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "region" - }, - "source": [ - "#### Region\n", - "\n", - "You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "I6FmBV2_0fBP" - }, - "outputs": [], - "source": [ - "REGION = \"us-central1\" # @param {type: \"string\"}" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "zgPO1eR3CYjk" - }, - "source": [ - "#### Create a Cloud Storage bucket\n", - "\n", - "Create a storage bucket to store intermediate artifacts such as datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "MzGDU7TWdts_" - }, - "outputs": [], - "source": [ - "BUCKET_NAME = \"your-bucket-name-{PROJECT_ID}-unique\" # @param {type:\"string\"}\n", - "\n", - "BUCKET_URI = f\"gs://{BUCKET_NAME}\" # @param {type:\"string\"}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "NIq7R4HZCfIc" - }, - "outputs": [], - "source": [ - "! 
gsutil mb -l {REGION} -p {PROJECT_ID} {BUCKET_URI}" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "set_service_account" - }, - "source": [ - "#### Service Account and permissions\n", - "\n", - "Vertex AI Automated Prompt Design requires a service account with the following permissions:\n", - "\n", - "- `Vertex AI User` to call Vertex LLM API\n", - "- `Storage Object Admin` to read and write to your GCS bucket.\n", - "- `Artifact Registry Reader` to download the pipeline template from Artifact Registry.\n", - "- `Cloud Run Developer` to deploy function on Cloud Run.\n", - "\n", - "[Check out the documentation](https://cloud.google.com/iam/docs/manage-access-service-accounts#iam-view-access-sa-gcloud) to know how to grant those permissions to a single service account.\n", - "\n", - "**Important**: If you run following commands using Vertex AI Workbench, please directly run in the terminal.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ssUJJqXJJHgC" - }, - "outputs": [], - "source": [ - "SERVICE_ACCOUNT = f\"{PROJECT_NUMBER}-compute@developer.gserviceaccount.com\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "wqOHg5aid6HP" - }, - "outputs": [], - "source": [ - "for role in ['aiplatform.user', 'storage.objectAdmin', 'artifactregistry.reader', 'run.developer', 'run.invoker']:\n", - "\n", - " ! gcloud projects add-iam-policy-binding {PROJECT_ID} \\\n", - " --member=serviceAccount:{SERVICE_ACCOUNT} \\\n", - " --role=roles/{role} --condition=None" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Ek1-iTbPjzdJ" - }, - "source": [ - "### Set tutorial folder and workspace\n", - "\n", - "Set a folder to collect data and any tutorial artifacts." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "BbfKRabXj3la" - }, - "outputs": [], - "source": [ - "from pathlib import Path as path\n", - "\n", - "ROOT_PATH = path.cwd()\n", - "TUTORIAL_PATH = ROOT_PATH / \"tutorial\"\n", - "CONFIG_PATH = TUTORIAL_PATH / \"config\"\n", - "TUNED_PROMPT_PATH = TUTORIAL_PATH / \"tuned_prompts\"\n", - "BUILD_PATH = TUTORIAL_PATH / \"build\"\n", - "\n", - "TUTORIAL_PATH.mkdir(parents=True, exist_ok=True)\n", - "CONFIG_PATH.mkdir(parents=True, exist_ok=True)\n", - "TUNED_PROMPT_PATH.mkdir(parents=True, exist_ok=True)\n", - "BUILD_PATH.mkdir(parents=True, exist_ok=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "BaNdfftpXTIX" - }, - "source": [ - "Set the associated workspace on Cloud Storage bucket." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "joJPc3FmX1fk" - }, - "outputs": [], - "source": [ - "from etils import epath\n", - "\n", - "WORKSPACE_URI = epath.Path(BUCKET_URI) / \"prompt_migration_gemini\"\n", - "INPUT_DATA_URI = epath.Path(WORKSPACE_URI) / \"data\"\n", - "\n", - "WORKSPACE_URI.mkdir(parents=True, exist_ok=True)\n", - "INPUT_DATA_URI.mkdir(parents=True, exist_ok=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "960505627ddf" - }, - "source": [ - "### Import libraries" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "PyQmSRbKA8r-" - }, - "outputs": [], - "source": [ - "import json\n", - "import logging\n", - "import warnings\n", - "from argparse import Namespace\n", - "# General\n", - "from pprint import pprint\n", - "\n", - "import pandas as pd\n", - "import requests\n", - "from google.cloud import aiplatform\n", - "from tqdm.asyncio import tqdm_asyncio\n", - "from tutorial.utils.helpers import (async_generate, display_eval_report,\n", - " get_auth_token, get_id,\n", - " get_optimization_result,\n", - " get_results_file_uris, init_new_model,\n", - " print_df_rows)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "820DIvw1o8tB" - }, - "source": [ - "### Libraries settings" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "HKc4ZdUBo_SM" - }, - "outputs": [], - "source": [ - "warnings.filterwarnings(\"ignore\")\n", - "logging.getLogger(\"urllib3.connectionpool\").setLevel(logging.ERROR)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gxc7q4r-DFH4" - }, - "source": [ - "### Define constants" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "0Y5t67f3DHNm" - }, - "outputs": [], - "source": [ - "INPUT_DATA_FILE_URI = \"gs://github-repo/prompts/prompt_optimizer/rag_qa_dataset.jsonl\"\n", - "\n", - "INPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuning_data\"\n", - "INPUT_TUNING_DATA_FILE_URI = str(INPUT_DATA_URI / \"prompt_tuning.jsonl\")\n", - "OUTPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuned_prompt\"\n", - "APD_CONTAINER_URI = (\n", - " \"us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0\"\n", - ")\n", - "CONFIG_FILE_URI = str(WORKSPACE_URI / \"config\" / \"config.json\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "init_aip:mbsdk,all" - }, - "source": [ - "### Initialize Vertex AI SDK for Python\n", - "\n", - "Initialize the Vertex AI SDK for Python for your project." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "bQMc2Uwf0fBQ" - }, - "outputs": [], - "source": [ - "aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "EdvJRUWRNGHE" - }, - "source": [ - "## III. Automated prompt design with Vertex AI Prompt Optimizer (Preview)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "mmTotjRAJplw" - }, - "source": [ - "### Load the dataset\n", - "\n", - "Load the dataset from Cloud Storage bucket." 
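Before moving on, it can help to confirm that the loaded dataframe has the columns the rest of the tutorial relies on. The sketch below is a hypothetical sanity check; the expected column names are inferred from how `prompt_tuning_df` is used later in this notebook, not from official documentation.

```python
# Sketch: sanity-check the dataset schema after the next cell loads it.
# The expected column names are an assumption based on later usage in this notebook.
EXPECTED_COLUMNS = {"user_question", "context", "prompt", "answer", "reference"}

def check_dataset_columns(df) -> None:
    missing = EXPECTED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Dataset is missing expected columns: {sorted(missing)}")
    print(f"All expected columns present ({len(df)} rows).")

# Run after the next cell:
# check_dataset_columns(prompt_tuning_df)
```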
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "LA7aG08wJtVm" - }, - "outputs": [], - "source": [ - "prompt_tuning_df = pd.read_json(INPUT_DATA_FILE_URI, lines=True)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "1xn-pz3v5HVK" - }, - "outputs": [], - "source": [ - "prompt_tuning_df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "PsXdJBJXiaVH" - }, - "outputs": [], - "source": [ - "print_df_rows(prompt_tuning_df, n=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Rp1n1aMACzSW" - }, - "source": [ - "### Enhance the prompt template with Vertex AI Prompt Optimizer (Preview) with custom metric\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "h1650lf3X8xW" - }, - "source": [ - "#### Prepare the prompt template you want to optimize\n", - "\n", - "A prompt consists of two key parts:\n", - "\n", - "* **System Instruction Template** which is a fixed part of the prompt shared across all queries for a given task.\n", - "\n", - "* **Prompt Template** which is a dynamic part of the prompt that changes based on the task.\n", - "\n", - "Vertex AI Prompt Optimizer enables the translation and optimization of the Instruction Template, while the Task/Context Template remains essential for evaluating different instruction templates.\n", - "\n", - "In this case, you want to translate a prompt\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Db8rHNC6DmtY" - }, - "outputs": [], - "source": [ - "SYSTEM_INSTRUCTION_TEMPLATE = \"\"\"\n", - "Given a question with some context, provide the correct answer to the question.\n", - "\"\"\"\n", - "\n", - "PROMPT_TEMPLATE = \"\"\"\n", - "Some examples of correct answer to a question with context are:\n", - "Question: {{question}}\n", - "Answer: {{target}}\n", - "\"\"\"" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "a1TCgXsrXztm" - }, - "source": [ - "#### Prepare few samples\n", - "\n", - "Vertex AI Prompt optimizer requires a CSV or JSONL file containing labeled samples.\n", - "\n", - "For **prompt optimization**:\n", - "\n", - "* Focus on examples that specifically demonstrate the issues you want to address.\n", - "* Recommendation: Use 50-100 distinct samples for reliable results. However, the tool can still be effective with as few as 5 samples.\n", - "\n", - "For **prompt translation**:\n", - "\n", - "* Consider using the source model to label examples that the target model struggles with, helping to identify areas for improvement.\n", - "\n", - "Learn more about setting up your CSV or JSONL file as input [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer)." 
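To make the required input format concrete, the sketch below writes a single hand-crafted record in the same shape the preparation cell below produces: a `question` field that embeds the retrieved context and a `target` field holding the reference answer. The values are invented for illustration only.

```python
# Sketch: one invented JSONL record in the shape expected by the prompt optimizer.
# The field values are made up; the real records are built from the dataset below.
import json

example_record = {
    "question": (
        "How can I make a creamy pasta sauce without using heavy cream?"
        "\n\nContext:\nThe user is lactose intolerant and prefers quick weeknight recipes."
    ),
    "target": (
        "Blend soaked cashews with a little pasta water, garlic, and nutritional "
        "yeast to get a creamy, dairy-free sauce in about ten minutes."
    ),
}

# Each line of the JSONL input file is one record like this:
print(json.dumps(example_record))
```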
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "vTIl_v9Ig1F-" - }, - "outputs": [], - "source": [ - "prepared_prompt_tuning_df = prompt_tuning_df.copy()\n", - "\n", - "# Prepare question and target columns\n", - "prepared_prompt_tuning_df[\"question\"] = (\n", - " prepared_prompt_tuning_df[\"user_question\"]\n", - " + \"\\nnContext:\\n\"\n", - " + prepared_prompt_tuning_df[\"context\"]\n", - ")\n", - "prepared_prompt_tuning_df = prepared_prompt_tuning_df.rename(\n", - " columns={\"reference\": \"target\"}\n", - ")\n", - "\n", - "# Remove uneccessary columns\n", - "prepared_prompt_tuning_df = prepared_prompt_tuning_df.drop(\n", - " columns=[\"user_question\", \"context\", \"prompt\", \"answer\"]\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "_DUFEAb82eEi" - }, - "outputs": [], - "source": [ - "prepared_prompt_tuning_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nF3XY_d_yB-K" - }, - "source": [ - "#### Upload samples to bucket\n", - "\n", - "Once you prepare samples, you can upload them on Cloud Storage bucket." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "155paLgGUXOm" - }, - "outputs": [], - "source": [ - "prepared_prompt_tuning_df.to_json(\n", - " INPUT_TUNING_DATA_FILE_URI, orient=\"records\", lines=True\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Hxpid3KgAkYM" - }, - "source": [ - "#### Define and deploy your own custom optimization metric on Cloud function\n", - "\n", - "To optimize your prompt template using a custom optimization metric, you need to deploy a function with your own metric code on Cloud function. To deploy a Cloud function with your own custom metric, you cover the following steps:\n", - "\n", - "1. Define requirements\n", - "2. Write your own custom metric function code\n", - "3. Deploy the custom code as Cloud function\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Nxh2e88fAnQc" - }, - "source": [ - "##### Define requirements\n", - "\n", - "Set the custom metric dependencies." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "q-hUlhgBCus4" - }, - "outputs": [], - "source": [ - "requirements = \"\"\"\n", - "functions-framework==3.*\n", - "google-cloud-aiplatform\n", - "\"\"\"\n", - "\n", - "with open(BUILD_PATH / \"requirements.txt\", \"w\") as f:\n", - " f.write(requirements)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "k_EFZEBeEy48" - }, - "source": [ - "##### Write your own custom metric function\n", - "\n", - "Define the module which contains your own custom metric function definition.\n", - "\n", - "It is important to highlight that you need to retrieve the input data using `request.get_json()` as shown below. This will return a json dict. 
The `response` field will be provided by the service which contains the LLM output.\n", - "\n", - "Also you have to return a json serialized dict with two fields: `custom metric name` you specified, and the `explanation` to correctly optimize the prompt template with your own metric.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "1wGoVQNCMbxe" - }, - "outputs": [], - "source": [ - "custom_metric_function_code = '''\n", - "\"\"\"\n", - "This module contains the custom evaluation metric definition to optimize a prompt template with Vertex AI Prompt Optimizer\n", - "\"\"\"\n", - "\n", - "from typing import Dict\n", - "from vertexai.generative_models import (\n", - " GenerationConfig,\n", - " GenerativeModel,\n", - " HarmBlockThreshold,\n", - " HarmCategory,\n", - ")\n", - "\n", - "import json\n", - "import functions_framework\n", - "\n", - "def get_autorater_response(metric_prompt: str) -> dict:\n", - " \"\"\"This function is to generate the evaluation response from the autorater.\"\"\"\n", - "\n", - " metric_response_schema = {\n", - " \"type\": \"OBJECT\",\n", - " \"properties\": {\n", - " \"score\": {\"type\": \"NUMBER\"},\n", - " \"explanation\": {\"type\": \"STRING\"},\n", - " },\n", - " \"required\": [\"score\", \"explanation\"],\n", - " }\n", - "\n", - " autorater = GenerativeModel(\n", - " \"gemini-1.5-pro\",\n", - " generation_config=GenerationConfig(\n", - " response_mime_type=\"application/json\",\n", - " response_schema=metric_response_schema,\n", - " ),\n", - " safety_settings={\n", - " HarmCategory.HARM_CATEGORY_UNSPECIFIED: HarmBlockThreshold.BLOCK_NONE,\n", - " HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,\n", - " HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,\n", - " HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,\n", - " HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,\n", - " },\n", - " )\n", - "\n", - " response = autorater.generate_content(metric_prompt)\n", - "\n", - " response_json = {}\n", - " response_json = json.loads(response.text)\n", - " return response_json\n", - "\n", - "\n", - "# Define custom evaluation criteria\n", - "def evaluate_engagement_personalization_fn(question: str, response:str, target: str) -> Dict[str, str]:\n", - " \"\"\"Evaluates an AI-generated response for User Engagement and Personalization.\"\"\"\n", - "\n", - " custom_metric_prompt_template = \"\"\"\n", - "\n", - " # Instruction\n", - " You are an expert evaluator. 
Your task is to evaluate the quality of the LLM-generated responses against a reference target response.\n", - " You should first read the Question carefully, and then evaluate the quality of the responses based on the Criteria provided in the Evaluation section below.\n", - " You will assign the response a rating following the Rating Rubric only and an step-by-step explanation for your rating.\n", - "\n", - " # Evaluation\n", - "\n", - " ## Criteria\n", - " Relevance and Customization: The response should directly address the user's query and demonstrate an understanding of their specific needs or preferences, such as dietary restrictions, skill level, or taste preferences.\n", - " Interactivity and Proactiveness: The response should go beyond simply answering the question by actively encouraging further interaction through follow-up questions, suggestions for additional exploration, or prompts for more information to provide a tailored experience.\n", - " Tone and Empathy: The response should adopt an appropriate and empathetic tone that fosters a positive and supportive user experience, making the user feel heard and understood.\n", - "\n", - " ## Rating rubric\n", - " 1 - Minimal: The response lacks personalization and demonstrates minimal engagement with the user. The tone may be impersonal or generic.\n", - " 2 - Basic: The response shows some basic personalization but lacks depth or specificity. Engagement is limited, possibly with generic prompts or suggestions. The tone is generally neutral but may lack warmth or empathy.\n", - " 3 - Moderate: The response demonstrates clear personalization and attempts to engage the user with relevant follow-up questions or prompts based on their query. The tone is friendly and supportive, fostering a positive user experience.\n", - " 4 - High: The response demonstrates a high degree of personalization and actively engages the user with relevant follow-up questions or prompts. The tone is empathetic and understanding, creating a strong connection with the user.\n", - " 5 - Exceptional: The response goes above and beyond to personalize the experience, anticipating user needs, and fostering a genuine connection. The tone is warm, encouraging, and inspiring, leaving the user feeling empowered and motivated.\n", - "\n", - " ## Evaluation steps\n", - " Step 1: Carefully read both the question and the generated response. 
Ensure a clear understanding of the user's intent, needs, and any specific context provided.\n", - " Step 2: Evaluate how well the response directly addresses the user's query and demonstrates an understanding of their specific needs or preferences.\n", - " Step 3: Determine the extent to which the response actively encourages further interaction and provides a tailored experience.\n", - " Step 4: Evaluate Tone & Empathy: Analyze the tone of the response, ensuring it fosters a positive and supportive user experience, making the user feel heard and understood.\n", - " Step 5: Based on the three criteria above, assign a score from 1 to 5 according to the score rubric.\n", - " Step 5: Justify the assigned score with a clear and concise explanation, highlighting the strengths and weaknesses of the response with respect to each criterion.\n", - "\n", - " # Question : {question}\n", - " # Generated response: {response}\n", - " # Reference response: {target}\n", - " \"\"\"\n", - "\n", - " custom_metric_prompt = custom_metric_prompt_template.format(question=question, response=response, target=target)\n", - " response_dict = get_autorater_response(custom_metric_prompt)\n", - "\n", - " return {\n", - " \"custom_metric\": response_dict[\"score\"],\n", - " \"explanation\": response_dict[\"explanation\"],\n", - " }\n", - "\n", - "# Register an HTTP function with the Functions Framework\n", - "@functions_framework.http\n", - "def main(request):\n", - " request_json = request.get_json(silent=True)\n", - "\n", - " if not request_json:\n", - " raise ValueError('Cannot find request json.')\n", - "\n", - " question = request_json['question']\n", - " response = request_json['response']\n", - " reference = request_json['target']\n", - "\n", - " get_evaluation_result = evaluate_engagement_personalization_fn(question, response, reference)\n", - " return json.dumps(get_evaluation_result)\n", - "'''\n", - "\n", - "with open(BUILD_PATH / \"main.py\", \"w\") as f:\n", - " f.write(custom_metric_function_code)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "T7R0LDZMCPnL" - }, - "source": [ - "##### Deploy the custom metric as a Cloud Function\n", - "\n", - "Use gcloud command line to deploy the cloud function." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "nwBZGvkLCizs" - }, - "outputs": [], - "source": [ - "!gcloud functions deploy 'custom_engagement_personalization_metric' \\\n", - " --gen2 \\\n", - " --runtime=\"python310\" \\\n", - " --source={str(BUILD_PATH)} \\\n", - " --entry-point=main \\\n", - " --trigger-http \\\n", - " --timeout=3600 \\\n", - " --memory=2Gb \\\n", - " --concurrency=6 \\\n", - " --min-instances=6 \\\n", - " --project {PROJECT_ID} \\\n", - " --region={REGION} \\\n", - " --quiet" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "FoEypczSwAGK" - }, - "source": [ - "##### Test your custom evaluation function\n", - "\n", - "Submit a request to validate the output of the custom evaluation function." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "HXOWYp2MwEsA" - }, - "outputs": [], - "source": [ - "custom_evaluator_function_uri = ! 
gcloud functions describe 'custom_engagement_personalization_metric' --gen2 --region {REGION} --format=\"value(url)\"\n", - "custom_evaluator_function_uri = custom_evaluator_function_uri[0].strip()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "0JMeIyx0DHnc" - }, - "outputs": [], - "source": [ - "headers = {\n", - " \"Authorization\": f\"Bearer {get_auth_token()}\",\n", - " \"Content-Type\": \"application/json\",\n", - "}\n", - "\n", - "json_data = {\n", - " \"question\": \"\"\"\n", - " What are some techniques for cooking red meat and pork that maximize flavor and tenderness while minimizing the formation of unhealthy compounds?\n", - " \"\"\",\n", - " \"response\": \"\"\"\n", - " * Marinating in acidic ingredients like lemon juice or vinegar to tenderize the meat \\n * Cooking to an internal temperature of 145°F (63°C) for safety \\n * Using high-heat cooking methods like grilling and pan-searing for browning and caramelization /n * Avoiding charring to minimize the formation of unhealthy compounds\n", - " \"\"\",\n", - " \"target\": \"\"\"\n", - " Here's how to tackle those delicious red meats and pork while keeping things healthy:\n", - " **Prioritize Low and Slow:**\n", - " * **Braising and Stewing:** These techniques involve gently simmering meat in liquid over low heat for an extended period. This breaks down tough collagen, resulting in incredibly tender and flavorful meat. Plus, since the cooking temperature is lower, it minimizes the formation of potentially harmful compounds associated with high-heat cooking.\n", - " * **Sous Vide:** This method involves sealing meat in a vacuum bag and immersing it in a precisely temperature-controlled water bath. It allows for even cooking to the exact desired doneness, resulting in incredibly juicy and tender meat. Because the temperature is controlled and lower than traditional methods, it can be a healthier option.\n", - " **High Heat Tips:**\n", - " * **Marinades are Your Friend:** As you mentioned, acidic marinades tenderize meat. They also add flavor!\n", - " * **Temperature Control is Key:** Use a meat thermometer to ensure you reach the safe internal temperature of 145°F (63°C) without overcooking.\n", - " * **Don't Burn It!** While some browning is desirable, charring creates those unhealthy compounds. Pat meat dry before cooking to minimize steaming and promote browning. Let the pan heat up properly before adding the meat to achieve a good sear.\n", - "\n", - " **Remember:** Trim visible fat before cooking to reduce saturated fat content. 
Let meat rest after cooking; this allows juices to redistribute, resulting in a more tender and flavorful final product.\n", - " \"\"\",\n", - "}\n", - "\n", - "response = requests.post(\n", - " custom_evaluator_function_uri, headers=headers, json=json_data, timeout=70\n", - ").json()\n", - "pprint(response)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "F5RD0l2xX-FI" - }, - "source": [ - "#### Configure optimization settings\n", - "\n", - "Vertex AI Prompt Optimizer allows you to optimize prompts by optimizing instructions only, demonstration only, or both (`optimization_mode`), and after you set the system instruction, prompt templates that will be optimized (`system_instruction`, `prompt_template`), and the model you want to optimize for (`target_model`), it allows to condition the optimization process by setting metrics, number of iterations used to improve the prompt and more.\n", - "\n", - "In this scenario, you set two parameters:\n", - "\n", - "* `custom_metric_name` parameter which allows you to pass your own custom metric to optimizer the prompt template.\n", - "* `custom_metric_cloud_function_name` parameter which indicates the Cloud function to call for collecting custom function evaluation metric output.\n", - "\n", - "For additional configurations, check out the documentation [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer).\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "sFHutXhgeqRx" - }, - "outputs": [], - "source": [ - "PROMPT_OPTIMIZATION_JOB = \"auto-prompt-design-job-\" + get_id()\n", - "OUTPUT_TUNING_RUN_URI = str(OUTPUT_TUNING_DATA_URI / PROMPT_OPTIMIZATION_JOB)\n", - "\n", - "args = Namespace(\n", - " # Basic configuration\n", - " system_instruction=SYSTEM_INSTRUCTION_TEMPLATE,\n", - " prompt_template=PROMPT_TEMPLATE,\n", - " target_model=\"gemini-1.5-flash-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", - " optimization_mode=\"instruction\", # Supported modes: \"instruction\", \"demonstration\", \"instruction_and_demo\"\n", - " custom_metric_name=\"custom_metric\",\n", - " custom_metric_cloud_function_name=\"custom_engagement_personalization_metric\",\n", - " num_steps=3,\n", - " num_template_eval_per_step=2,\n", - " num_demo_set_candidates=3,\n", - " demo_set_size=2,\n", - " input_data_path=INPUT_TUNING_DATA_FILE_URI,\n", - " output_path=OUTPUT_TUNING_RUN_URI,\n", - " project=PROJECT_ID,\n", - " # Advanced configuration\n", - " target_model_qps=1,\n", - " target_model_location=\"us-central1\",\n", - " source_model=\"\",\n", - " source_model_qps=\"\",\n", - " source_model_location=\"\",\n", - " optimizer_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", - " optimizer_model_qps=1,\n", - " optimizer_model_location=\"us-central1\",\n", - " eval_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", - " eval_qps=1,\n", - " eval_model_location=\"us-central1\",\n", - " 
eval_metrics_types=[\n", - " \"question_answering_correctness\",\n", - " \"custom_metric\",\n", - " ], # Supported metrics: \"bleu\", \"coherence\", \"exact_match\", \"fluidity\", \"fulfillment\", \"groundedness\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_helpfulness\", \"question_answering_quality\", \"question_answering_relevance\", \"summarization_helpfulness\", \"summarization_quality\", \"summarization_verbosity\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\"\n", - " eval_metrics_weights=[0.8, 0.2],\n", - " aggregation_type=\"weighted_sum\", # Supported aggregation types: \"weighted_sum\", \"weighted_average\"\n", - " data_limit=50,\n", - " response_mime_type=\"application/json\",\n", - " language=\"English\", # Supported languages: \"English\", \"French\", \"German\", \"Hebrew\", \"Hindi\", \"Japanese\", \"Korean\", \"Portuguese\", \"Simplified Chinese\", \"Spanish\", \"Traditional Chinese\"\n", - " placeholder_to_content=json.loads(\"{}\"),\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Jd_uzQYQx6L7" - }, - "source": [ - "#### Upload Vertex AI Prompt Optimizer (Preview) config to Cloud Storage\n", - "\n", - "After you define Vertex AI Prompt Optimizer (Preview) configuration, you upload them on Cloud Storage bucket.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "QCJAqcfWBqAh" - }, - "source": [ - "Now you can save the config to the bucket." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "iqiv8ApR_SAM" - }, - "outputs": [], - "source": [ - "args = vars(args)\n", - "\n", - "with epath.Path(CONFIG_FILE_URI).open(\"w\") as config_file:\n", - " json.dump(args, config_file)\n", - "config_file.close()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "spqgBT8hYAle" - }, - "source": [ - "#### Run the automatic prompt optimization job\n", - "\n", - "Now you are ready to run your first Vertex AI Prompt Optimizer (Preview) job using the Vertex AI SDK for Python.\n", - "\n", - "**Important:** Be sure you have provisioned enough queries per minute (QPM) quota and the recommended QPM for each model. If you configure the Vertex AI prompt optimizer with a QPM that is higher than the QPM than you have access to, the job will fail.\n", - "\n", - "[Check out](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#before-you-begin) the documentation to know more." 
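The next cell submits the optimization job with `custom_job.submit()`, which returns immediately rather than blocking until the job finishes. If you prefer to wait programmatically, a rough polling loop along these lines may be useful; the terminal job states come from the Vertex AI SDK, but treat the snippet as a sketch rather than part of the official workflow.

```python
# Sketch: poll a submitted Vertex AI CustomJob until it reaches a terminal state.
# Assumes `custom_job` has been created and submitted as in the next cell.
import time

from google.cloud import aiplatform

TERMINAL_STATES = {
    aiplatform.gapic.JobState.JOB_STATE_SUCCEEDED,
    aiplatform.gapic.JobState.JOB_STATE_FAILED,
    aiplatform.gapic.JobState.JOB_STATE_CANCELLED,
}

def wait_for_job(job: aiplatform.CustomJob, poll_seconds: int = 60) -> None:
    while job.state not in TERMINAL_STATES:
        print(f"Job state: {job.state.name} - checking again in {poll_seconds}s")
        time.sleep(poll_seconds)
    print(f"Job finished with state: {job.state.name}")

# wait_for_job(custom_job)
```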
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "GtPnvKIpUQ3q" - }, - "outputs": [], - "source": [ - "WORKER_POOL_SPECS = [\n", - " {\n", - " \"machine_spec\": {\n", - " \"machine_type\": \"n1-standard-4\",\n", - " },\n", - " \"replica_count\": 1,\n", - " \"container_spec\": {\n", - " \"image_uri\": APD_CONTAINER_URI,\n", - " \"args\": [\"--config=\" + CONFIG_FILE_URI],\n", - " },\n", - " }\n", - "]\n", - "\n", - "custom_job = aiplatform.CustomJob(\n", - " display_name=PROMPT_OPTIMIZATION_JOB,\n", - " worker_pool_specs=WORKER_POOL_SPECS,\n", - ")\n", - "\n", - "custom_job.submit(service_account=SERVICE_ACCOUNT)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "3YwwKBhtJ4ut" - }, - "source": [ - "### Collect the optimization results\n", - "\n", - "After the optimization job successfully run, you collect the optimized templates and evaluation results for the instruction\n", - "\n", - "Below you use a helper function to read the optimal system instruction template and the associated evaluation metrics." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "xTPJsvg-kzkO" - }, - "outputs": [], - "source": [ - "apd_result_uris = get_results_file_uris(\n", - " output_uri=OUTPUT_TUNING_RUN_URI,\n", - " required_files=[\"eval_results.json\", \"templates.json\"],\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "PrezXkBUu1s5" - }, - "outputs": [], - "source": [ - "best_prompt_df, prompt_summary_df, prompt_metrics_df = get_optimization_result(\n", - " apd_result_uris[\"instruction_templates\"],\n", - " apd_result_uris[\"instruction_eval_results\"],\n", - ")\n", - "\n", - "display_eval_report(\n", - " (best_prompt_df, prompt_summary_df, prompt_metrics_df),\n", - " prompt_component=\"instruction\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "TrMrbcA5Gzep" - }, - "source": [ - "### Validate and evaluate the optimized template in question-answering task\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "bGRELw3U3I28" - }, - "source": [ - "#### Generate new responses using the optimized template\n", - "\n", - "Then, you generate the new responses with the optimized template. Below you can see an example of a generated response using the optimized system instructions template." 
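The generation step below relies on the `async_generate` helper imported from `tutorial/utils/helpers.py`. If you are curious what such a coroutine might look like, here is a rough sketch built on the Vertex AI SDK; it is an illustration under assumed defaults, not the tutorial's actual helper implementation.

```python
# Sketch: a minimal async generation coroutine in the spirit of the
# `async_generate` helper used below. Not the tutorial's actual implementation.
from vertexai.generative_models import GenerationConfig, GenerativeModel

async def generate_async(prompt: str, model: GenerativeModel) -> str | None:
    response = await model.generate_content_async(
        prompt,
        generation_config=GenerationConfig(temperature=0.0),  # assumed setting
    )
    try:
        return response.text
    except ValueError:
        # The response can be blocked or empty; return None instead of raising.
        return None
```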
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "GXDU_ydAG5ak" - }, - "outputs": [], - "source": [ - "optimized_prompt_template = (\n", - " best_prompt_df[\"prompt\"].iloc[0]\n", - " + \"\\nQuestion: \\n{question}\"\n", - " + \"\\nContext: \\n{context}\"\n", - " + \"\\nAnswer:\"\n", - ")\n", - "\n", - "optimized_prompts = [\n", - " optimized_prompt_template.format(question=q, context=c)\n", - " for q, c in zip(\n", - " prompt_tuning_df[\"user_question\"].to_list(),\n", - " prompt_tuning_df[\"context\"].to_list(),\n", - " )\n", - "]\n", - "\n", - "prompt_tuning_df[\"optimized_prompt_with_vapo\"] = optimized_prompts" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "qG6QJW8alttS" - }, - "outputs": [], - "source": [ - "gemini_llm = init_new_model(\"gemini-1.5-flash-001\")\n", - "\n", - "gemini_predictions = [async_generate(p, model=gemini_llm) for p in optimized_prompts]\n", - "\n", - "gemini_predictions_col = await tqdm_asyncio.gather(*gemini_predictions)\n", - "\n", - "prompt_tuning_df[\"gemini_answer_with_vapo\"] = gemini_predictions_col" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4sGDKpXU-SqG" - }, - "source": [ - "#### Evaluate the quality of generated responses with the optimized instruction\n", - "\n", - "Finally, you evaluate generated responses with the optimized instruction qualitatively. If you want to know how to evaluate the new generated responses quantitatively, check out [the SDK notebook](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/prompts/prompt_optimizer) in the official repo." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "_55cHbD4kFAz" - }, - "outputs": [], - "source": [ - "print_df_rows(prompt_tuning_df, n=1)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "2a4e033321ad" - }, - "source": [ - "## IV. Clean up" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "WRY_3wh1GVNm" - }, - "outputs": [], - "source": [ - "delete_bucket = False\n", - "delete_job = False\n", - "delete_run = False\n", - "delete_tutorial = False\n", - "\n", - "if delete_bucket:\n", - " ! gsutil rm -r {BUCKET_URI}\n", - "\n", - "if delete_job:\n", - " custom_job.delete()\n", - "\n", - "if delete_run:\n", - " ! 
gcloud functions delete 'custom_engagement_personalization_metric' --region={REGION}\n", - "\n", - "if delete_tutorial:\n", - " import shutil\n", - "\n", - " shutil.rmtree(str(TUTORIAL_PATH))" - ] - } - ], - "metadata": { - "colab": { - "name": "vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb", - "toc_visible": true - }, - "environment": { - "kernel": "python3", - "name": "tf2-cpu.2-11.m125", - "type": "gcloud", - "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/tf2-cpu.2-11:m125" - }, - "kernelspec": { - "display_name": "Python 3 (Local)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.15" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ur8xi4C7S06n" + }, + "outputs": [], + "source": [ + "# Copyright 2024 Google LLC\n", + "#\n", + "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JAPoU8Sm5E6e" + }, + "source": [ + "# Vertex Prompt Optimizer Notebook SDK (Preview) - Custom metric\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + "
\n", + " \n", + " \"Google
Open in Colab\n", + "
\n", + "
\n", + " \n", + " \"Google
Open in Colab Enterprise\n", + "
\n", + "
\n", + " \n", + " \"Vertex
Open in Vertex AI Workbench\n", + "
\n", + "
\n", + " \n", + " \"GitHub
View on GitHub\n", + "
\n", + "
\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0ccc35a93b9f" + }, + "source": [ + "| | | |\n", + "|-|-|-|\n", + "| Author | [Ivan Nardini](https://github.com/inardini) |" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tvgnzT1CKxrO" + }, + "source": [ + "## I. Overview\n", + "\n", + "In the context of developing Generative AI (Gen AI) applications, prompt engineering poses challenges due to its time-consuming and error-prone nature. You often dedicate significant effort to crafting and inputting prompts to achieve successful task completion. Additionally, with the frequent release of foundational models, you face the additional burden of migrating working prompts from one model version to another.\n", + "\n", + "Vertex AI Prompt Optimizer aims to alleviate these challenges by providing you with an intelligent prompt optimization tool. With this tool you can both refine optimize system instruction (and task) in the prompts and selects the best demonstrations (few-shot examples) for prompt templates, empowering you to shape LLM responses from any source model to on a target Google model.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4HKyj5KwYePX" + }, + "source": [ + "### Objective\n", + "\n", + "This notebook demostrates how to leverage Vertex AI Prompt Optimizer (Preview) to optimize a simple prompt for a Gemini model using your own metric. The goal is to use Vertex AI Prompt Optimizer (Preview) to find the new prompt template which generates responses according to your own metric.\n", + "\n", + "\n", + "This tutorial uses the following Google Cloud services and resources:\n", + "\n", + "- Vertex AI Gen AI\n", + "- Vertex AI Prompt Optimizer (Preview)\n", + "- Vertex AI Model Eval\n", + "- Vertex AI Custom job\n", + "- Cloud Run\n", + "\n", + "The steps performed include:\n", + "\n", + "- Prepare the prompt-ground truth pairs optimized for another model\n", + "- Define the prompt template you want to optimize\n", + "- Define and deploy your own custom evaluation metric on Cloud function\n", + "- Set optimization mode and steps\n", + "- Run the automatic prompt optimization job\n", + "- Collect the best prompt template and eval metric\n", + "- Validate the best prompt template" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "08d289fa873f" + }, + "source": [ + "### Dataset\n", + "\n", + "The dataset is a question-answering dataset generated by a simple AI cooking assistant that provides suggestions on how to cook healthier dishes.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aed92deeb4a0" + }, + "source": [ + "### Costs\n", + "\n", + "This tutorial uses billable components of Google Cloud:\n", + "\n", + "* Vertex AI\n", + "* Cloud Storage\n", + "\n", + "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage pricing](https://cloud.google.com/storage/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "61RBz8LLbxCR" + }, + "source": [ + "## II. 
Before you start" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "No17Cw5hgx12" + }, + "source": [ + "### Install Vertex AI SDK for Python and other required packages\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "tFy3H3aPgx12" + }, + "outputs": [], + "source": [ + "%pip install --upgrade --quiet 'google-cloud-aiplatform[evaluation]' 'plotly' 'asyncio' 'tqdm' 'tenacity' 'etils' 'importlib_resources' 'fsspec' 'gcsfs' 'nbformat>=4.2.0'" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "e55e2195ce2d" + }, + "outputs": [], + "source": [ + "! mkdir -p ./tutorial/utils && wget https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/prompts/prompt_optimizer/utils/helpers.py -P ./tutorial/utils" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "R5Xep4W9lq-Z" + }, + "source": [ + "### Restart runtime (Colab only)\n", + "\n", + "To use the newly installed packages, you must restart the runtime on Google Colab." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "XRvKdaPDTznN" + }, + "outputs": [], + "source": [ + "import sys\n", + "\n", + "if \"google.colab\" in sys.modules:\n", + " import IPython\n", + "\n", + " app = IPython.Application.instance()\n", + " app.kernel.do_shutdown(True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SbmM4z7FOBpM" + }, + "source": [ + "
\n", + "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dmWOrTJ3gx13" + }, + "source": [ + "### Authenticate your notebook environment (Colab only)\n", + "\n", + "Authenticate your environment on Google Colab.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "NyKGtVQjgx13" + }, + "outputs": [], + "source": [ + "import sys\n", + "\n", + "if \"google.colab\" in sys.modules:\n", + " try:\n", + " from google.colab import auth\n", + "\n", + " auth.authenticate_user()\n", + " creds, project = auth.default()\n", + " if creds.token:\n", + " print(\"Authentication successful.\")\n", + " else:\n", + " print(\"Authentication successful, but no token was returned.\")\n", + " except Exception as e:\n", + " print(f\"Error during Colab authentication: {e}\")\n", + "\n", + "! gcloud auth login" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DF4l8DTdWgPY" + }, + "source": [ + "### Set Google Cloud project information\n", + "\n", + "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the following APIs](https://console.cloud.google.com/flows/enableapi?apiid=cloudresourcemanager.googleapis.com,aiplatform.googleapis.com,cloudfunctions.googleapis.com,run.googleapis.com).\n", + "\n", + "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WReHDGG5g0XY" + }, + "source": [ + "#### Set your project ID and project number" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "oM1iC_MfAts1" + }, + "outputs": [], + "source": [ + "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", + "\n", + "# Set the project id\n", + "! gcloud config set project {PROJECT_ID}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "oZpm-sL8f1z_" + }, + "outputs": [], + "source": [ + "PROJECT_NUMBER = !gcloud projects describe {PROJECT_ID} --format=\"get(projectNumber)\"[0]\n", + "PROJECT_NUMBER = PROJECT_NUMBER[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "region" + }, + "source": [ + "#### Region\n", + "\n", + "You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "I6FmBV2_0fBP" + }, + "outputs": [], + "source": [ + "REGION = \"us-central1\" # @param {type: \"string\"}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zgPO1eR3CYjk" + }, + "source": [ + "#### Create a Cloud Storage bucket\n", + "\n", + "Create a storage bucket to store intermediate artifacts such as datasets." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "MzGDU7TWdts_" + }, + "outputs": [], + "source": [ + "BUCKET_NAME = \"your-bucket-name-{PROJECT_ID}-unique\" # @param {type:\"string\"}\n", + "\n", + "BUCKET_URI = f\"gs://{BUCKET_NAME}\" # @param {type:\"string\"}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "NIq7R4HZCfIc" + }, + "outputs": [], + "source": [ + "! 
gsutil mb -l {REGION} -p {PROJECT_ID} {BUCKET_URI}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "set_service_account" + }, + "source": [ + "#### Service Account and permissions\n", + "\n", + "Vertex AI Automated Prompt Design requires a service account with the following permissions:\n", + "\n", + "- `Vertex AI User` to call the Vertex AI LLM API.\n", + "- `Storage Object Admin` to read and write to your GCS bucket.\n", + "- `Artifact Registry Reader` to download the pipeline template from Artifact Registry.\n", + "- `Cloud Run Developer` to deploy the custom metric function on Cloud Run.\n", + "- `Cloud Run Invoker` to call the deployed Cloud Run function.\n", + "\n", + "[Check out the documentation](https://cloud.google.com/iam/docs/manage-access-service-accounts#iam-view-access-sa-gcloud) to learn how to grant those permissions to a single service account.\n", + "\n", + "**Important**: If you run the following commands from Vertex AI Workbench, run them directly in the terminal.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ssUJJqXJJHgC" + }, + "outputs": [], + "source": [ + "SERVICE_ACCOUNT = f\"{PROJECT_NUMBER}-compute@developer.gserviceaccount.com\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "wqOHg5aid6HP" + }, + "outputs": [], + "source": [ + "for role in ['aiplatform.user', 'storage.objectAdmin', 'artifactregistry.reader', 'run.developer', 'run.invoker']:\n", + "\n", + " ! gcloud projects add-iam-policy-binding {PROJECT_ID} \\\n", + " --member=serviceAccount:{SERVICE_ACCOUNT} \\\n", + " --role=roles/{role} --condition=None" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Ek1-iTbPjzdJ" + }, + "source": [ + "### Set tutorial folder and workspace\n", + "\n", + "Set a folder to collect data and any tutorial artifacts." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "BbfKRabXj3la" + }, + "outputs": [], + "source": [ + "from pathlib import Path as path\n", + "\n", + "ROOT_PATH = path.cwd()\n", + "TUTORIAL_PATH = ROOT_PATH / \"tutorial\"\n", + "CONFIG_PATH = TUTORIAL_PATH / \"config\"\n", + "TUNED_PROMPT_PATH = TUTORIAL_PATH / \"tuned_prompts\"\n", + "BUILD_PATH = TUTORIAL_PATH / \"build\"\n", + "\n", + "TUTORIAL_PATH.mkdir(parents=True, exist_ok=True)\n", + "CONFIG_PATH.mkdir(parents=True, exist_ok=True)\n", + "TUNED_PROMPT_PATH.mkdir(parents=True, exist_ok=True)\n", + "BUILD_PATH.mkdir(parents=True, exist_ok=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BaNdfftpXTIX" + }, + "source": [ + "Set the associated workspace on the Cloud Storage bucket."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "joJPc3FmX1fk" + }, + "outputs": [], + "source": [ + "from etils import epath\n", + "\n", + "WORKSPACE_URI = epath.Path(BUCKET_URI) / \"prompt_migration_gemini\"\n", + "INPUT_DATA_URI = epath.Path(WORKSPACE_URI) / \"data\"\n", + "\n", + "WORKSPACE_URI.mkdir(parents=True, exist_ok=True)\n", + "INPUT_DATA_URI.mkdir(parents=True, exist_ok=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "960505627ddf" + }, + "source": [ + "### Import libraries" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "PyQmSRbKA8r-" + }, + "outputs": [], + "source": [ + "from argparse import Namespace\n", + "import json\n", + "import logging\n", + "\n", + "# General\n", + "from pprint import pprint\n", + "import warnings\n", + "\n", + "from google.cloud import aiplatform\n", + "import pandas as pd\n", + "import requests\n", + "from tqdm.asyncio import tqdm_asyncio\n", + "\n", + "from tutorial.utils.helpers import (\n", + " async_generate,\n", + " display_eval_report,\n", + " get_auth_token,\n", + " get_id,\n", + " get_optimization_result,\n", + " get_results_file_uris,\n", + " init_new_model,\n", + " print_df_rows,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "820DIvw1o8tB" + }, + "source": [ + "### Library settings" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HKc4ZdUBo_SM" + }, + "outputs": [], + "source": [ + "warnings.filterwarnings(\"ignore\")\n", + "logging.getLogger(\"urllib3.connectionpool\").setLevel(logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gxc7q4r-DFH4" + }, + "source": [ + "### Define constants" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "0Y5t67f3DHNm" + }, + "outputs": [], + "source": [ + "INPUT_DATA_FILE_URI = \"gs://github-repo/prompts/prompt_optimizer/rag_qa_dataset.jsonl\"\n", + "\n", + "INPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuning_data\"\n", + "INPUT_TUNING_DATA_FILE_URI = str(INPUT_DATA_URI / \"prompt_tuning.jsonl\")\n", + "OUTPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuned_prompt\"\n", + "APD_CONTAINER_URI = (\n", + " \"us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0\"\n", + ")\n", + "CONFIG_FILE_URI = str(WORKSPACE_URI / \"config\" / \"config.json\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "init_aip:mbsdk,all" + }, + "source": [ + "### Initialize Vertex AI SDK for Python\n", + "\n", + "Initialize the Vertex AI SDK for Python for your project." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "bQMc2Uwf0fBQ" + }, + "outputs": [], + "source": [ + "aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "EdvJRUWRNGHE" + }, + "source": [ + "## III. Automated prompt design with Vertex AI Prompt Optimizer (Preview)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mmTotjRAJplw" + }, + "source": [ + "### Load the dataset\n", + "\n", + "Load the dataset from the Cloud Storage bucket."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "LA7aG08wJtVm" + }, + "outputs": [], + "source": [ + "prompt_tuning_df = pd.read_json(INPUT_DATA_FILE_URI, lines=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "1xn-pz3v5HVK" + }, + "outputs": [], + "source": [ + "prompt_tuning_df.head()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "PsXdJBJXiaVH" + }, + "outputs": [], + "source": [ + "print_df_rows(prompt_tuning_df, n=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Rp1n1aMACzSW" + }, + "source": [ + "### Enhance the prompt template with Vertex AI Prompt Optimizer (Preview) using a custom metric\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "h1650lf3X8xW" + }, + "source": [ + "#### Prepare the prompt template you want to optimize\n", + "\n", + "A prompt consists of two key parts:\n", + "\n", + "* **System Instruction Template**, which is a fixed part of the prompt shared across all queries for a given task.\n", + "\n", + "* **Prompt Template**, which is a dynamic part of the prompt that changes based on the task.\n", + "\n", + "Vertex AI Prompt Optimizer enables the translation and optimization of the Instruction Template, while the Task/Context Template remains essential for evaluating different instruction templates.\n", + "\n", + "In this case, you want to optimize the system instruction template of a simple question-answering prompt.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Db8rHNC6DmtY" + }, + "outputs": [], + "source": [ + "SYSTEM_INSTRUCTION_TEMPLATE = \"\"\"\n", + "Given a question with some context, provide the correct answer to the question.\n", + "\"\"\"\n", + "\n", + "PROMPT_TEMPLATE = \"\"\"\n", + "Some examples of correct answer to a question with context are:\n", + "Question: {{question}}\n", + "Answer: {{target}}\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "a1TCgXsrXztm" + }, + "source": [ + "#### Prepare a few samples\n", + "\n", + "Vertex AI Prompt Optimizer requires a CSV or JSONL file containing labeled samples.\n", + "\n", + "For **prompt optimization**:\n", + "\n", + "* Focus on examples that specifically demonstrate the issues you want to address.\n", + "* Recommendation: Use 50-100 distinct samples for reliable results. However, the tool can still be effective with as few as 5 samples.\n", + "\n", + "For **prompt translation**:\n", + "\n", + "* Consider using the source model to label examples that the target model struggles with, helping to identify areas for improvement.\n", + "\n", + "Learn more about setting up your CSV or JSONL file as input [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer).\n",
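+ "\n", + "For illustration only, each JSONL record carries one field per prompt template placeholder. A minimal, hypothetical sample (the values below are made up and not part of the tutorial dataset) looks like this:\n", + "\n", + "```python\n", + "# Hypothetical labeled sample: the keys mirror the {{question}} and {{target}} placeholders.\n", + "sample = {\n", + " \"question\": \"How can I make this lasagna recipe healthier?\\n\\nContext:\\n...\",\n", + " \"target\": \"Try swapping regular pasta sheets for thin zucchini slices...\",\n", + "}\n", + "```"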
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "vTIl_v9Ig1F-" + }, + "outputs": [], + "source": [ + "prepared_prompt_tuning_df = prompt_tuning_df.copy()\n", + "\n", + "# Prepare question and target columns\n", + "prepared_prompt_tuning_df[\"question\"] = (\n", + " prepared_prompt_tuning_df[\"user_question\"]\n", + " + \"\\n\\nContext:\\n\"\n", + " + prepared_prompt_tuning_df[\"context\"]\n", + ")\n", + "prepared_prompt_tuning_df = prepared_prompt_tuning_df.rename(\n", + " columns={\"reference\": \"target\"}\n", + ")\n", + "\n", + "# Remove unnecessary columns\n", + "prepared_prompt_tuning_df = prepared_prompt_tuning_df.drop(\n", + " columns=[\"user_question\", \"context\", \"prompt\", \"answer\"]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "_DUFEAb82eEi" + }, + "outputs": [], + "source": [ + "prepared_prompt_tuning_df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nF3XY_d_yB-K" + }, + "source": [ + "#### Upload samples to bucket\n", + "\n", + "Once you have prepared the samples, upload them to the Cloud Storage bucket." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "155paLgGUXOm" + }, + "outputs": [], + "source": [ + "prepared_prompt_tuning_df.to_json(\n", + " INPUT_TUNING_DATA_FILE_URI, orient=\"records\", lines=True\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Hxpid3KgAkYM" + }, + "source": [ + "#### Define and deploy your own custom optimization metric as a Cloud Function\n", + "\n", + "To optimize your prompt template using a custom optimization metric, you need to deploy your own metric code as a Cloud Function. To do so, you complete the following steps:\n", + "\n", + "1. Define requirements\n", + "2. Write your own custom metric function code\n", + "3. Deploy the custom code as a Cloud Function\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Nxh2e88fAnQc" + }, + "source": [ + "##### Define requirements\n", + "\n", + "Set the custom metric dependencies." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "q-hUlhgBCus4" + }, + "outputs": [], + "source": [ + "requirements = \"\"\"\n", + "functions-framework==3.*\n", + "google-cloud-aiplatform\n", + "\"\"\"\n", + "\n", + "with open(BUILD_PATH / \"requirements.txt\", \"w\") as f:\n", + " f.write(requirements)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "k_EFZEBeEy48" + }, + "source": [ + "##### Write your own custom metric function\n", + "\n", + "Define the module that contains your own custom metric function definition.\n", + "\n", + "Note that you need to retrieve the input data using `request.get_json()` as shown below; this returns a JSON dict. 
The `response` field will be provided by the service which contains the LLM output.\n", + "\n", + "Also you have to return a json serialized dict with two fields: `custom metric name` you specified, and the `explanation` to correctly optimize the prompt template with your own metric.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "1wGoVQNCMbxe" + }, + "outputs": [], + "source": [ + "custom_metric_function_code = '''\n", + "\"\"\"\n", + "This module contains the custom evaluation metric definition to optimize a prompt template with Vertex AI Prompt Optimizer\n", + "\"\"\"\n", + "\n", + "from typing import Dict\n", + "from vertexai.generative_models import (\n", + " GenerationConfig,\n", + " GenerativeModel,\n", + " HarmBlockThreshold,\n", + " HarmCategory,\n", + ")\n", + "\n", + "import json\n", + "import functions_framework\n", + "\n", + "def get_autorater_response(metric_prompt: str) -> dict:\n", + " \"\"\"This function is to generate the evaluation response from the autorater.\"\"\"\n", + "\n", + " metric_response_schema = {\n", + " \"type\": \"OBJECT\",\n", + " \"properties\": {\n", + " \"score\": {\"type\": \"NUMBER\"},\n", + " \"explanation\": {\"type\": \"STRING\"},\n", + " },\n", + " \"required\": [\"score\", \"explanation\"],\n", + " }\n", + "\n", + " autorater = GenerativeModel(\n", + " \"gemini-1.5-pro\",\n", + " generation_config=GenerationConfig(\n", + " response_mime_type=\"application/json\",\n", + " response_schema=metric_response_schema,\n", + " ),\n", + " safety_settings={\n", + " HarmCategory.HARM_CATEGORY_UNSPECIFIED: HarmBlockThreshold.BLOCK_NONE,\n", + " HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,\n", + " HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,\n", + " HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,\n", + " HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,\n", + " },\n", + " )\n", + "\n", + " response = autorater.generate_content(metric_prompt)\n", + "\n", + " response_json = {}\n", + " response_json = json.loads(response.text)\n", + " return response_json\n", + "\n", + "\n", + "# Define custom evaluation criteria\n", + "def evaluate_engagement_personalization_fn(question: str, response:str, target: str) -> Dict[str, str]:\n", + " \"\"\"Evaluates an AI-generated response for User Engagement and Personalization.\"\"\"\n", + "\n", + " custom_metric_prompt_template = \"\"\"\n", + "\n", + " # Instruction\n", + " You are an expert evaluator. 
Your task is to evaluate the quality of the LLM-generated responses against a reference target response.\n", + " You should first read the Question carefully, and then evaluate the quality of the responses based on the Criteria provided in the Evaluation section below.\n", + " You will assign the response a rating following the Rating Rubric only and an step-by-step explanation for your rating.\n", + "\n", + " # Evaluation\n", + "\n", + " ## Criteria\n", + " Relevance and Customization: The response should directly address the user's query and demonstrate an understanding of their specific needs or preferences, such as dietary restrictions, skill level, or taste preferences.\n", + " Interactivity and Proactiveness: The response should go beyond simply answering the question by actively encouraging further interaction through follow-up questions, suggestions for additional exploration, or prompts for more information to provide a tailored experience.\n", + " Tone and Empathy: The response should adopt an appropriate and empathetic tone that fosters a positive and supportive user experience, making the user feel heard and understood.\n", + "\n", + " ## Rating rubric\n", + " 1 - Minimal: The response lacks personalization and demonstrates minimal engagement with the user. The tone may be impersonal or generic.\n", + " 2 - Basic: The response shows some basic personalization but lacks depth or specificity. Engagement is limited, possibly with generic prompts or suggestions. The tone is generally neutral but may lack warmth or empathy.\n", + " 3 - Moderate: The response demonstrates clear personalization and attempts to engage the user with relevant follow-up questions or prompts based on their query. The tone is friendly and supportive, fostering a positive user experience.\n", + " 4 - High: The response demonstrates a high degree of personalization and actively engages the user with relevant follow-up questions or prompts. The tone is empathetic and understanding, creating a strong connection with the user.\n", + " 5 - Exceptional: The response goes above and beyond to personalize the experience, anticipating user needs, and fostering a genuine connection. The tone is warm, encouraging, and inspiring, leaving the user feeling empowered and motivated.\n", + "\n", + " ## Evaluation steps\n", + " Step 1: Carefully read both the question and the generated response. 
Ensure a clear understanding of the user's intent, needs, and any specific context provided.\n", + " Step 2: Evaluate how well the response directly addresses the user's query and demonstrates an understanding of their specific needs or preferences.\n", + " Step 3: Determine the extent to which the response actively encourages further interaction and provides a tailored experience.\n", + " Step 4: Evaluate Tone & Empathy: Analyze the tone of the response, ensuring it fosters a positive and supportive user experience, making the user feel heard and understood.\n", + " Step 5: Based on the three criteria above, assign a score from 1 to 5 according to the score rubric.\n", + " Step 6: Justify the assigned score with a clear and concise explanation, highlighting the strengths and weaknesses of the response with respect to each criterion.\n", + "\n", + " # Question : {question}\n", + " # Generated response: {response}\n", + " # Reference response: {target}\n", + " \"\"\"\n", + "\n", + " custom_metric_prompt = custom_metric_prompt_template.format(question=question, response=response, target=target)\n", + " response_dict = get_autorater_response(custom_metric_prompt)\n", + "\n", + " return {\n", + " \"custom_metric\": response_dict[\"score\"],\n", + " \"explanation\": response_dict[\"explanation\"],\n", + " }\n", + "\n", + "# Register an HTTP function with the Functions Framework\n", + "@functions_framework.http\n", + "def main(request):\n", + " request_json = request.get_json(silent=True)\n", + "\n", + " if not request_json:\n", + " raise ValueError('Cannot find request json.')\n", + "\n", + " question = request_json['question']\n", + " response = request_json['response']\n", + " reference = request_json['target']\n", + "\n", + " get_evaluation_result = evaluate_engagement_personalization_fn(question, response, reference)\n", + " return json.dumps(get_evaluation_result)\n", + "'''\n", + "\n", + "with open(BUILD_PATH / \"main.py\", \"w\") as f:\n", + " f.write(custom_metric_function_code)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "T7R0LDZMCPnL" + }, + "source": [ + "##### Deploy the custom metric as a Cloud Function\n", + "\n", + "Use the gcloud command line to deploy the Cloud Function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "nwBZGvkLCizs" + }, + "outputs": [], + "source": [ + "!gcloud functions deploy 'custom_engagement_personalization_metric' \\\n", + " --gen2 \\\n", + " --runtime=\"python310\" \\\n", + " --source={str(BUILD_PATH)} \\\n", + " --entry-point=main \\\n", + " --trigger-http \\\n", + " --timeout=3600 \\\n", + " --memory=2Gb \\\n", + " --concurrency=6 \\\n", + " --min-instances=6 \\\n", + " --project {PROJECT_ID} \\\n", + " --region={REGION} \\\n", + " --quiet" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FoEypczSwAGK" + }, + "source": [ + "##### Test your custom evaluation function\n", + "\n", + "Submit a request to validate the output of the custom evaluation function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HXOWYp2MwEsA" + }, + "outputs": [], + "source": [ + "custom_evaluator_function_uri = ! 
gcloud functions describe 'custom_engagement_personalization_metric' --gen2 --region {REGION} --format=\"value(url)\"\n", + "custom_evaluator_function_uri = custom_evaluator_function_uri[0].strip()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "0JMeIyx0DHnc" + }, + "outputs": [], + "source": [ + "headers = {\n", + " \"Authorization\": f\"Bearer {get_auth_token()}\",\n", + " \"Content-Type\": \"application/json\",\n", + "}\n", + "\n", + "json_data = {\n", + " \"question\": \"\"\"\n", + " What are some techniques for cooking red meat and pork that maximize flavor and tenderness while minimizing the formation of unhealthy compounds?\n", + " \"\"\",\n", + " \"response\": \"\"\"\n", + " * Marinating in acidic ingredients like lemon juice or vinegar to tenderize the meat \\n * Cooking to an internal temperature of 145°F (63°C) for safety \\n * Using high-heat cooking methods like grilling and pan-searing for browning and caramelization /n * Avoiding charring to minimize the formation of unhealthy compounds\n", + " \"\"\",\n", + " \"target\": \"\"\"\n", + " Here's how to tackle those delicious red meats and pork while keeping things healthy:\n", + " **Prioritize Low and Slow:**\n", + " * **Braising and Stewing:** These techniques involve gently simmering meat in liquid over low heat for an extended period. This breaks down tough collagen, resulting in incredibly tender and flavorful meat. Plus, since the cooking temperature is lower, it minimizes the formation of potentially harmful compounds associated with high-heat cooking.\n", + " * **Sous Vide:** This method involves sealing meat in a vacuum bag and immersing it in a precisely temperature-controlled water bath. It allows for even cooking to the exact desired doneness, resulting in incredibly juicy and tender meat. Because the temperature is controlled and lower than traditional methods, it can be a healthier option.\n", + " **High Heat Tips:**\n", + " * **Marinades are Your Friend:** As you mentioned, acidic marinades tenderize meat. They also add flavor!\n", + " * **Temperature Control is Key:** Use a meat thermometer to ensure you reach the safe internal temperature of 145°F (63°C) without overcooking.\n", + " * **Don't Burn It!** While some browning is desirable, charring creates those unhealthy compounds. Pat meat dry before cooking to minimize steaming and promote browning. Let the pan heat up properly before adding the meat to achieve a good sear.\n", + "\n", + " **Remember:** Trim visible fat before cooking to reduce saturated fat content. 
Let meat rest after cooking; this allows juices to redistribute, resulting in a more tender and flavorful final product.\n", + " \"\"\",\n", + "}\n", + "\n", + "response = requests.post(\n", + " custom_evaluator_function_uri, headers=headers, json=json_data, timeout=70\n", + ").json()\n", + "pprint(response)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "F5RD0l2xX-FI" + }, + "source": [ + "#### Configure optimization settings\n", + "\n", + "Vertex AI Prompt Optimizer lets you optimize instructions only, demonstrations only, or both (`optimization_mode`). After you set the system instruction and prompt template to optimize (`system_instruction`, `prompt_template`) and the model you want to optimize for (`target_model`), you can further condition the optimization process by setting evaluation metrics, the number of iterations used to improve the prompt, and more.\n", + "\n", + "In this scenario, you set two additional parameters:\n", + "\n", + "* `custom_metric_name`, which lets you pass your own custom metric for optimizing the prompt template.\n", + "* `custom_metric_cloud_function_name`, which indicates the Cloud Function to call to collect the output of your custom evaluation metric.\n", + "\n", + "For additional configurations, check out the documentation [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer).\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "sFHutXhgeqRx" + }, + "outputs": [], + "source": [ + "PROMPT_OPTIMIZATION_JOB = \"auto-prompt-design-job-\" + get_id()\n", + "OUTPUT_TUNING_RUN_URI = str(OUTPUT_TUNING_DATA_URI / PROMPT_OPTIMIZATION_JOB)\n", + "\n", + "args = Namespace(\n", + " # Basic configuration\n", + " system_instruction=SYSTEM_INSTRUCTION_TEMPLATE,\n", + " prompt_template=PROMPT_TEMPLATE,\n", + " target_model=\"gemini-1.5-flash-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", + " optimization_mode=\"instruction\", # Supported modes: \"instruction\", \"demonstration\", \"instruction_and_demo\"\n", + " custom_metric_name=\"custom_metric\",\n", + " custom_metric_cloud_function_name=\"custom_engagement_personalization_metric\",\n", + " num_steps=3,\n", + " num_template_eval_per_step=2,\n", + " num_demo_set_candidates=3,\n", + " demo_set_size=2,\n", + " input_data_path=INPUT_TUNING_DATA_FILE_URI,\n", + " output_path=OUTPUT_TUNING_RUN_URI,\n", + " project=PROJECT_ID,\n", + " # Advanced configuration\n", + " target_model_qps=1,\n", + " target_model_location=\"us-central1\",\n", + " source_model=\"\",\n", + " source_model_qps=\"\",\n", + " source_model_location=\"\",\n", + " optimizer_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", + " optimizer_model_qps=1,\n", + " optimizer_model_location=\"us-central1\",\n", + " eval_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n", + " eval_qps=1,\n", + " eval_model_location=\"us-central1\",\n", + " 
eval_metrics_types=[\n", + " \"question_answering_correctness\",\n", + " \"custom_metric\",\n", + " ], # Supported metrics: \"bleu\", \"coherence\", \"exact_match\", \"fluidity\", \"fulfillment\", \"groundedness\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_helpfulness\", \"question_answering_quality\", \"question_answering_relevance\", \"summarization_helpfulness\", \"summarization_quality\", \"summarization_verbosity\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\"\n", + " eval_metrics_weights=[0.8, 0.2],\n", + " aggregation_type=\"weighted_sum\", # Supported aggregation types: \"weighted_sum\", \"weighted_average\"\n", + " data_limit=50,\n", + " response_mime_type=\"application/json\",\n", + " language=\"English\", # Supported languages: \"English\", \"French\", \"German\", \"Hebrew\", \"Hindi\", \"Japanese\", \"Korean\", \"Portuguese\", \"Simplified Chinese\", \"Spanish\", \"Traditional Chinese\"\n", + " placeholder_to_content=json.loads(\"{}\"),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Jd_uzQYQx6L7" + }, + "source": [ + "#### Upload Vertex AI Prompt Optimizer (Preview) config to Cloud Storage\n", + "\n", + "After you define the Vertex AI Prompt Optimizer (Preview) configuration, you upload it to the Cloud Storage bucket.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QCJAqcfWBqAh" + }, + "source": [ + "Now you can save the config to the bucket." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "iqiv8ApR_SAM" + }, + "outputs": [], + "source": [ + "args = vars(args)\n", + "\n", + "with epath.Path(CONFIG_FILE_URI).open(\"w\") as config_file:\n", + " json.dump(args, config_file)\n", + "config_file.close()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "spqgBT8hYAle" + }, + "source": [ + "#### Run the automatic prompt optimization job\n", + "\n", + "Now you are ready to run your first Vertex AI Prompt Optimizer (Preview) job using the Vertex AI SDK for Python.\n", + "\n", + "**Important:** Be sure you have provisioned enough queries per minute (QPM) quota and use the recommended QPM for each model. If you configure Vertex AI Prompt Optimizer with a QPM that is higher than the QPM you have access to, the job will fail.\n", + "\n", + "[Check out](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#before-you-begin) the documentation to learn more.\n",
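+ "\n", + "The next cell packages this configuration into a Vertex AI custom job and submits it. Submission is asynchronous, so, as a minimal sketch (assuming the `custom_job` object created in the next cell), you can check its progress from the SDK while it runs:\n", + "\n", + "```python\n", + "# After custom_job.submit(...) returns, the job keeps running in the background.\n", + "# Inspect the job resource and its current state with the Vertex AI SDK.\n", + "print(custom_job.resource_name)\n", + "print(custom_job.state)\n", + "```"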
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "GtPnvKIpUQ3q" + }, + "outputs": [], + "source": [ + "WORKER_POOL_SPECS = [\n", + " {\n", + " \"machine_spec\": {\n", + " \"machine_type\": \"n1-standard-4\",\n", + " },\n", + " \"replica_count\": 1,\n", + " \"container_spec\": {\n", + " \"image_uri\": APD_CONTAINER_URI,\n", + " \"args\": [\"--config=\" + CONFIG_FILE_URI],\n", + " },\n", + " }\n", + "]\n", + "\n", + "custom_job = aiplatform.CustomJob(\n", + " display_name=PROMPT_OPTIMIZATION_JOB,\n", + " worker_pool_specs=WORKER_POOL_SPECS,\n", + ")\n", + "\n", + "custom_job.submit(service_account=SERVICE_ACCOUNT)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3YwwKBhtJ4ut" + }, + "source": [ + "### Collect the optimization results\n", + "\n", + "After the optimization job runs successfully, you collect the optimized templates and evaluation results for the instruction.\n", + "\n", + "Below you use a helper function to read the optimal system instruction template and the associated evaluation metrics." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "xTPJsvg-kzkO" + }, + "outputs": [], + "source": [ + "apd_result_uris = get_results_file_uris(\n", + " output_uri=OUTPUT_TUNING_RUN_URI,\n", + " required_files=[\"eval_results.json\", \"templates.json\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "PrezXkBUu1s5" + }, + "outputs": [], + "source": [ + "best_prompt_df, prompt_summary_df, prompt_metrics_df = get_optimization_result(\n", + " apd_result_uris[\"instruction_templates\"],\n", + " apd_result_uris[\"instruction_eval_results\"],\n", + ")\n", + "\n", + "display_eval_report(\n", + " (best_prompt_df, prompt_summary_df, prompt_metrics_df),\n", + " prompt_component=\"instruction\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TrMrbcA5Gzep" + }, + "source": [ + "### Validate and evaluate the optimized template in the question-answering task\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bGRELw3U3I28" + }, + "source": [ + "#### Generate new responses using the optimized template\n", + "\n", + "Next, you generate new responses with the optimized template. Below you can see an example of a generated response using the optimized system instruction template."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "GXDU_ydAG5ak" + }, + "outputs": [], + "source": [ + "optimized_prompt_template = (\n", + " best_prompt_df[\"prompt\"].iloc[0]\n", + " + \"\\nQuestion: \\n{question}\"\n", + " + \"\\nContext: \\n{context}\"\n", + " + \"\\nAnswer:\"\n", + ")\n", + "\n", + "optimized_prompts = [\n", + " optimized_prompt_template.format(question=q, context=c)\n", + " for q, c in zip(\n", + " prompt_tuning_df[\"user_question\"].to_list(),\n", + " prompt_tuning_df[\"context\"].to_list(),\n", + " )\n", + "]\n", + "\n", + "prompt_tuning_df[\"optimized_prompt_with_vapo\"] = optimized_prompts" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "qG6QJW8alttS" + }, + "outputs": [], + "source": [ + "gemini_llm = init_new_model(\"gemini-1.5-flash-001\")\n", + "\n", + "gemini_predictions = [async_generate(p, model=gemini_llm) for p in optimized_prompts]\n", + "\n", + "gemini_predictions_col = await tqdm_asyncio.gather(*gemini_predictions)\n", + "\n", + "prompt_tuning_df[\"gemini_answer_with_vapo\"] = gemini_predictions_col" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4sGDKpXU-SqG" + }, + "source": [ + "#### Evaluate the quality of generated responses with the optimized instruction\n", + "\n", + "Finally, you evaluate generated responses with the optimized instruction qualitatively. If you want to know how to evaluate the new generated responses quantitatively, check out [the SDK notebook](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/prompts/prompt_optimizer) in the official repo." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "_55cHbD4kFAz" + }, + "outputs": [], + "source": [ + "print_df_rows(prompt_tuning_df, n=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2a4e033321ad" + }, + "source": [ + "## IV. Clean up" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "WRY_3wh1GVNm" + }, + "outputs": [], + "source": [ + "delete_bucket = False\n", + "delete_job = False\n", + "delete_run = False\n", + "delete_tutorial = False\n", + "\n", + "if delete_bucket:\n", + " ! gsutil rm -r {BUCKET_URI}\n", + "\n", + "if delete_job:\n", + " custom_job.delete()\n", + "\n", + "if delete_run:\n", + " ! gcloud functions delete 'custom_engagement_personalization_metric' --region={REGION}\n", + "\n", + "if delete_tutorial:\n", + " import shutil\n", + "\n", + " shutil.rmtree(str(TUTORIAL_PATH))" + ] + } + ], + "metadata": { + "colab": { + "name": "vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb", + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 }