From ee24cb8a435776677e326c54a01892aa54529134 Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Mon, 23 Sep 2024 11:33:33 -0500
Subject: [PATCH 01/76] fix: Fix PDF summarization prompt in Gemini 1.5 Pro
Notebook (#1151)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
Edited PDF Summarization prompt to resolve issue where PDF was not read.
Seems to be just a weird model quirk. Doesn't affect 1.5 Flash.
Fixes #754 🦕
---
.../intro_gemini_1_5_pro.ipynb | 61 +++++++++----------
1 file changed, 30 insertions(+), 31 deletions(-)
diff --git a/gemini/getting-started/intro_gemini_1_5_pro.ipynb b/gemini/getting-started/intro_gemini_1_5_pro.ipynb
index 1b3735ff9b..59cda2f42e 100644
--- a/gemini/getting-started/intro_gemini_1_5_pro.ipynb
+++ b/gemini/getting-started/intro_gemini_1_5_pro.ipynb
@@ -29,7 +29,7 @@
"id": "7yVV6txOmNMn"
},
"source": [
- "# Getting started with the Vertex AI Gemini 1.5 Pro\n",
+ "# Getting started with Vertex AI Gemini 1.5 Pro\n",
"\n",
"\n",
"
\n",
@@ -105,7 +105,7 @@
},
{
"cell_type": "code",
- "execution_count": 21,
+ "execution_count": null,
"metadata": {
"id": "tFy3H3aPgx12"
},
@@ -195,7 +195,7 @@
},
{
"cell_type": "code",
- "execution_count": 26,
+ "execution_count": 1,
"metadata": {
"id": "Nqwi-5ufWp_B"
},
@@ -220,7 +220,7 @@
},
{
"cell_type": "code",
- "execution_count": 27,
+ "execution_count": 2,
"metadata": {
"id": "lslYAvw37JGQ"
},
@@ -253,7 +253,7 @@
},
{
"cell_type": "code",
- "execution_count": 28,
+ "execution_count": 3,
"metadata": {
"id": "U7ExWmuLBdIA"
},
@@ -277,7 +277,7 @@
},
{
"cell_type": "code",
- "execution_count": 29,
+ "execution_count": 4,
"metadata": {
"id": "FhFxrtfdSwOP"
},
@@ -286,8 +286,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "total_tokens: 14\n",
- "total_billable_characters: 29\n",
+ "total_tokens: 32\n",
+ "total_billable_characters: 108\n",
"\n",
"\n",
"Answer:\n",
@@ -295,7 +295,7 @@
"\n",
"\n",
"Usage metadata:\n",
- "{'prompt_token_count': 14, 'candidates_token_count': 8, 'total_token_count': 22}\n",
+ "{'prompt_token_count': 32, 'candidates_token_count': 8, 'total_token_count': 40}\n",
"\n",
"Finish reason:\n",
"1\n",
@@ -303,24 +303,24 @@
"Safety settings:\n",
"[category: HARM_CATEGORY_HATE_SPEECH\n",
"probability: NEGLIGIBLE\n",
- "probability_score: 0.15077754855155945\n",
+ "probability_score: 0.155273438\n",
"severity: HARM_SEVERITY_NEGLIGIBLE\n",
- "severity_score: 0.07821886986494064\n",
+ "severity_score: 0.0737304688\n",
", category: HARM_CATEGORY_DANGEROUS_CONTENT\n",
"probability: NEGLIGIBLE\n",
- "probability_score: 0.06730107963085175\n",
+ "probability_score: 0.0727539062\n",
"severity: HARM_SEVERITY_NEGLIGIBLE\n",
- "severity_score: 0.09089674800634384\n",
+ "severity_score: 0.0913085938\n",
", category: HARM_CATEGORY_HARASSMENT\n",
"probability: NEGLIGIBLE\n",
- "probability_score: 0.1252792477607727\n",
+ "probability_score: 0.134765625\n",
"severity: HARM_SEVERITY_NEGLIGIBLE\n",
- "severity_score: 0.08525123447179794\n",
+ "severity_score: 0.0815429688\n",
", category: HARM_CATEGORY_SEXUALLY_EXPLICIT\n",
"probability: NEGLIGIBLE\n",
- "probability_score: 0.21060390770435333\n",
+ "probability_score: 0.232421875\n",
"severity: HARM_SEVERITY_NEGLIGIBLE\n",
- "severity_score: 0.11260009557008743\n",
+ "severity_score: 0.125\n",
"]\n"
]
}
@@ -606,7 +606,7 @@
},
{
"cell_type": "code",
- "execution_count": 30,
+ "execution_count": 8,
"metadata": {
"id": "JgKDIZUstYwV"
},
@@ -615,19 +615,18 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "## Summary of \"Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context\"\n",
- "\n",
- "**Gemini 1.5 Pro** is a new large language model (LLM) from Google DeepMind capable of processing and understanding extremely long sequences of information across various modalities like text, code, images, audio, and video. It utilizes a mixture-of-experts architecture and achieves state-of-the-art performance on many tasks while being significantly more efficient than previous models. \n",
+ "This is a technical report introducing Gemini 1.5 Pro, Google's latest multi-modal model. The model is built upon the mixture-of-experts (MoE) architecture and exhibits impressive performance on reasoning, multi-modality, and long context understanding. Gemini 1.5 Pro distinguishes itself by expanding the context window size to several million tokens, a significant leap beyond the 200k tokens offered by its predecessor, Claude 2.1. This expanded capacity allows for processing nearly five days of audio, entire books, or extensive code repositories. \n",
"\n",
- "**Key advancements and findings:**\n",
+ "The report highlights the model's abilities through: \n",
+ "* **Qualitative examples:** Showcasing impressive feats such as pinpointing specific code within the complete JAX codebase, learning to translate a new language from a single grammar book and dictionary, and identifying a scene from Les Misérables based on a hand-drawn sketch. \n",
+ "* **Quantitative evaluations:** \n",
+ " * **Diagnostic:** demonstrating near-perfect recall in \"needle-in-a-haystack\" tasks across text, video, and audio, even maintaining high recall with context lengths extending to 10 million tokens. \n",
+ " * **Realistic:** excelling in long-document QA using Les Misérables as context, outperforming competitors on long-video QA tasks, and showing significant progress in long-context automatic speech recognition. \n",
+ " * **Core Capabilities:** Surpassing the performance of its predecessor (Gemini 1.0) and rivaling or exceeding the performance of a state-of-the-art model, Gemini 1.0 Ultra, on core benchmarks related to coding, math, science, reasoning, and instruction following. \n",
"\n",
- "* **Unprecedented context length:** Gemini 1.5 Pro can handle up to 10 million tokens of context, enabling it to process information like entire books, days-long audio recordings, and hours of video. This opens up new possibilities for applications like analyzing large datasets, summarizing documents, and understanding complex video content.\n",
- "* **Improved performance across modalities:** The model surpasses its predecessors and even matches or exceeds the performance of state-of-the-art models like Gemini 1.0 Ultra on various benchmarks across text (e.g., reasoning, math, coding), vision, and audio understanding.\n",
- "* **In-context learning:** Gemini 1.5 Pro showcases the ability to learn new skills like translating languages (e.g., English to Kalamang) with very limited data by providing the necessary reference materials directly in the context. This has implications for supporting low-resource languages and facilitating cross-lingual communication.\n",
- "* **Diagnostic and realistic evaluations:** The researchers developed new benchmarks and evaluation methodologies to assess the long-context capabilities of the model, including \"needle-in-a-haystack\" tasks for different modalities and question answering from long documents and videos.\n",
- "* **Responsible AI practices:** Google DeepMind emphasizes its commitment to responsible deployment by conducting impact assessments, implementing model safety mitigations, and evaluating potential risks and biases. \n",
+ "The report also delves into the responsible development and deployment of the model, emphasizing their approach to impact assessment, model mitigations, and ongoing safety evaluations. \n",
"\n",
- "**Overall, Gemini 1.5 Pro represents a significant leap forward in LLM research, demonstrating the potential of long-context understanding and multimodal capabilities for various applications while emphasizing the importance of responsible development and deployment.** \n",
+ "In conclusion, Gemini 1.5 Pro represents a significant advancement in AI, showcasing unprecedented capabilities in long-context understanding across multiple modalities. The report emphasizes the need for novel evaluation methods to better assess the potential of such models and suggests promising avenues for future research. \n",
"\n"
]
}
@@ -636,12 +635,12 @@
"pdf_file_uri = \"gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf\"\n",
"\n",
"prompt = \"\"\"\n",
- " Your are a very professional document summarization specialist.\n",
- " Please summarize the given document.\n",
+ " You are a very professional document summarization specialist.\n",
+ " Summarize the given document.\n",
"\"\"\"\n",
"\n",
"pdf_file = Part.from_uri(pdf_file_uri, mime_type=\"application/pdf\")\n",
- "contents = [pdf_file, prompt]\n",
+ "contents = [prompt, pdf_file]\n",
"\n",
"response = model.generate_content(contents)\n",
"print(response.text)"
From 14d843303d35242f7440e29181273346b1acf152 Mon Sep 17 00:00:00 2001
From: nhootan <103317089+nhootan@users.noreply.github.com>
Date: Mon, 23 Sep 2024 21:43:44 -0400
Subject: [PATCH 02/76] refactor: Add doc links to VAPO notebook (#1156)
Co-authored-by: hootan
Co-authored-by: Owl Bot
---
.../vertex_ai_prompt_optimizer_ui.ipynb | 112 ++++++++++++------
1 file changed, 76 insertions(+), 36 deletions(-)
diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
index 940cd61a1c..93d820f28e 100644
--- a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
+++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
@@ -29,6 +29,7 @@
"id": "RN8N3O43QDT5"
},
"source": [
+ "# Vertex Prompt Optimizer Notebook UI (Preview)\n",
"
\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tvgnzT1CKxrO"
+ },
+ "source": [
+ "## I. Overview\n",
+ "\n",
+ "In the context of developing Generative AI (Gen AI) applications, prompt engineering poses challenges due to its time-consuming and error-prone nature. You often dedicate significant effort to crafting and inputting prompts to achieve successful task completion. Additionally, with the frequent release of foundational models, you face the additional burden of migrating working prompts from one model version to another.\n",
+ "\n",
+ "Vertex AI Prompt Optimizer aims to alleviate these challenges by providing you with an intelligent prompt optimization tool. With this tool you can both refine optimize system instruction (and task) in the prompts and selects the best demonstrations (few-shot examples) for prompt templates, empowering you to shape LLM responses from any source model to on a target Google model.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4HKyj5KwYePX"
+ },
+ "source": [
+ "### Objective\n",
+ "\n",
+ "This notebook demostrates how to leverage Vertex AI Prompt Optimizer (Preview) to efficiently migrate a prompt template from one model to another. The goal is to use Vertex AI Prompt Optimizer (Preview) to find the new prompt template which generate the most correct and grounded responses.\n",
+ "\n",
+ "\n",
+ "This tutorial uses the following Google Cloud ML services and resources:\n",
+ "\n",
+ "- Vertex Gen AI\n",
+ "- Vertex AI Prompt Optimizer (Preview)\n",
+ "- Vertex AI Model Eval\n",
+ "- Vertex AI Custom job\n",
+ "\n",
+ "The steps performed include:\n",
+ "\n",
+ "- Prepare the prompt-ground truth pairs optimized for another model\n",
+ "- Define the prompt template you want to optimize\n",
+ "- Set target model and evaluation metric\n",
+ "- Set optimization mode and steps\n",
+ "- Run the automatic prompt optimization job\n",
+ "- Collect the best prompt template and eval metric\n",
+ "- Validate the best prompt template"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "08d289fa873f"
+ },
+ "source": [
+ "### Dataset\n",
+ "\n",
+ "The dataset is a question-answering dataset generated by a simple AI cooking assistant that provides suggestions on how to cook healthier dishes.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "aed92deeb4a0"
+ },
+ "source": [
+ "### Costs\n",
+ "\n",
+ "This tutorial uses billable components of Google Cloud:\n",
+ "\n",
+ "* Vertex AI\n",
+ "* Cloud Storage\n",
+ "\n",
+ "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage pricing](https://cloud.google.com/storage/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## II. Before you start"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK for Python and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "! pip3 install --upgrade --quiet 'google-cloud-aiplatform[evaluation]'\n",
+ "! pip3 install --upgrade --quiet 'plotly'\n",
+ "! pip3 install --upgrade --quiet 'asyncio' 'tqdm' 'tenacity' 'etils' 'importlib_resources' 'fsspec' 'gcsfs' 'nbformat>=4.2.0'"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime (Colab only)\n",
+ "\n",
+ "To use the newly installed packages, you must restart the runtime on Google Colab."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " import IPython\n",
+ "\n",
+ " app = IPython.Application.instance()\n",
+ " app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "Authenticate your environment on Google Colab.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "WReHDGG5g0XY"
+ },
+ "source": [
+ "#### Set your project ID and project number"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "oM1iC_MfAts1"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
+ "\n",
+ "# Set the project id\n",
+ "! gcloud config set project {PROJECT_ID}"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "oZpm-sL8f1z_"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_NUMBER = \"[your-project-number]\" # @param {type:\"string\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "region"
+ },
+ "source": [
+ "#### Region\n",
+ "\n",
+ "You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "I6FmBV2_0fBP"
+ },
+ "outputs": [],
+ "source": [
+ "REGION = \"us-central1\" # @param {type: \"string\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "zgPO1eR3CYjk"
+ },
+ "source": [
+ "#### Create a Cloud Storage bucket\n",
+ "\n",
+ "Create a storage bucket to store intermediate artifacts such as datasets."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "MzGDU7TWdts_"
+ },
+ "outputs": [],
+ "source": [
+ "BUCKET_NAME = \"your-bucket-name-{PROJECT_ID}-unique\" # @param {type:\"string\"}\n",
+ "\n",
+ "BUCKET_URI = f\"gs://{BUCKET_NAME}\" # @param {type:\"string\"}"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NIq7R4HZCfIc"
+ },
+ "outputs": [],
+ "source": [
+ "! gsutil mb -l {REGION} -p {PROJECT_ID} {BUCKET_URI}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "set_service_account"
+ },
+ "source": [
+ "#### Service Account and permissions\n",
+ "\n",
+ "Vertex AI Automated Prompt Design requires a service account with the following permissions:\n",
+ "\n",
+ "- `Vertex AI User` to call Vertex LLM API\n",
+ "- `Storage Object Admin` to read and write to your GCS bucket.\n",
+ "- `Artifact Registry Reader` to download the pipeline template from Artifact Registry.\n",
+ "\n",
+ "[Check out the documentation](https://cloud.google.com/iam/docs/manage-access-service-accounts#iam-view-access-sa-gcloud) to know how to grant those permissions to a single service account.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ssUJJqXJJHgC"
+ },
+ "outputs": [],
+ "source": [
+ "SERVICE_ACCOUNT = f\"{PROJECT_NUMBER}-compute@developer.gserviceaccount.com\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "wqOHg5aid6HP"
+ },
+ "outputs": [],
+ "source": [
+ "! gcloud projects add-iam-policy-binding {PROJECT_ID} \\\n",
+ " --member=serviceAccount:{SERVICE_ACCOUNT} \\\n",
+ " --role=roles/aiplatform.user\n",
+ "\n",
+ "! gcloud projects add-iam-policy-binding {PROJECT_ID} \\\n",
+ " --member=serviceAccount:{SERVICE_ACCOUNT} \\\n",
+ " --role=roles/storage.objectAdmin\n",
+ "\n",
+ "! gcloud projects add-iam-policy-binding {PROJECT_ID} \\\n",
+ " --member=serviceAccount:{SERVICE_ACCOUNT} \\\n",
+ " --role=roles/artifactregistry.reader"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Ek1-iTbPjzdJ"
+ },
+ "source": [
+ "### Set tutorial folder and workspace\n",
+ "\n",
+ "Set a folder to collect data and any tutorial artifacts."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "BbfKRabXj3la"
+ },
+ "outputs": [],
+ "source": [
+ "from pathlib import Path as path\n",
+ "\n",
+ "ROOT_PATH = path.cwd()\n",
+ "TUTORIAL_PATH = ROOT_PATH / \"tutorial\"\n",
+ "CONFIG_PATH = TUTORIAL_PATH / \"config\"\n",
+ "TUNED_PROMPT_PATH = TUTORIAL_PATH / \"tuned_prompts\"\n",
+ "\n",
+ "TUTORIAL_PATH.mkdir(parents=True, exist_ok=True)\n",
+ "CONFIG_PATH.mkdir(parents=True, exist_ok=True)\n",
+ "TUNED_PROMPT_PATH.mkdir(parents=True, exist_ok=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "BaNdfftpXTIX"
+ },
+ "source": [
+ "Set the associated workspace on Cloud Storage bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "joJPc3FmX1fk"
+ },
+ "outputs": [],
+ "source": [
+ "from etils import epath\n",
+ "\n",
+ "WORKSPACE_URI = epath.Path(BUCKET_URI) / \"prompt_migration_gemini\"\n",
+ "INPUT_DATA_URI = epath.Path(WORKSPACE_URI) / \"data\"\n",
+ "\n",
+ "WORKSPACE_URI.mkdir(parents=True, exist_ok=True)\n",
+ "INPUT_DATA_URI.mkdir(parents=True, exist_ok=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "960505627ddf"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PyQmSRbKA8r-"
+ },
+ "outputs": [],
+ "source": [
+ "# Tutorial\n",
+ "from argparse import Namespace\n",
+ "import json\n",
+ "\n",
+ "# General\n",
+ "import logging\n",
+ "import warnings\n",
+ "\n",
+ "from google.cloud import aiplatform\n",
+ "import pandas as pd\n",
+ "from utils.helpers import (\n",
+ " async_generate,\n",
+ " display_eval_report,\n",
+ " evaluate_task,\n",
+ " get_id,\n",
+ " get_optimization_result,\n",
+ " get_results_file_uris,\n",
+ " init_new_model,\n",
+ " plot_eval_metrics,\n",
+ " print_df_rows,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "820DIvw1o8tB"
+ },
+ "source": [
+ "### Libraries settings"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "HKc4ZdUBo_SM"
+ },
+ "outputs": [],
+ "source": [
+ "warnings.filterwarnings(\"ignore\")\n",
+ "logging.getLogger(\"urllib3.connectionpool\").setLevel(logging.ERROR)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "init_aip:mbsdk,all"
+ },
+ "source": [
+ "### Initialize Vertex AI SDK for Python\n",
+ "\n",
+ "Initialize the Vertex AI SDK for Python for your project."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "bQMc2Uwf0fBQ"
+ },
+ "outputs": [],
+ "source": [
+ "aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "gxc7q4r-DFH4"
+ },
+ "source": [
+ "### Define constants"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "0Y5t67f3DHNm"
+ },
+ "outputs": [],
+ "source": [
+ "INPUT_DATA_FILE_URI = \"gs://github-repo/prompts/prompt_optimizer/rag_qa_dataset.jsonl\"\n",
+ "\n",
+ "EXPERIMENT_NAME = \"qa-prompt-eval\"\n",
+ "INPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuning_data\"\n",
+ "INPUT_TUNING_DATA_FILE_URI = str(INPUT_DATA_URI / \"prompt_tuning.jsonl\")\n",
+ "OUTPUT_TUNING_DATA_URI = epath.Path(WORKSPACE_URI) / \"tuned_prompt\"\n",
+ "APD_CONTAINER_URI = (\n",
+ " \"us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0\"\n",
+ ")\n",
+ "CONFIG_FILE_URI = str(WORKSPACE_URI / \"config\" / \"config.json\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "EdvJRUWRNGHE"
+ },
+ "source": [
+ "## III. Automated prompt design with Vertex AI Prompt Optimizer (Preview)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "mmTotjRAJplw"
+ },
+ "source": [
+ "### Load the dataset\n",
+ "\n",
+ "Load the dataset from Cloud Storage bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "LA7aG08wJtVm"
+ },
+ "outputs": [],
+ "source": [
+ "prompt_tuning_df = pd.read_json(INPUT_DATA_FILE_URI, lines=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "1xn-pz3v5HVK"
+ },
+ "outputs": [],
+ "source": [
+ "prompt_tuning_df.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PsXdJBJXiaVH"
+ },
+ "outputs": [],
+ "source": [
+ "print_df_rows(prompt_tuning_df, n=1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "E5SmBApC-WDg"
+ },
+ "source": [
+ "### Evaluate the previous model version in question-answering task\n",
+ "\n",
+ "Run an evaluation using Vertex AI Gen AI Evaluation Service to define question-answering baseline."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "qA-dl76E-H23"
+ },
+ "outputs": [],
+ "source": [
+ "evaluation_qa_results = [\n",
+ " (\n",
+ " \"qa_eval_result_old_model\",\n",
+ " evaluate_task(\n",
+ " df=prompt_tuning_df,\n",
+ " prompt_col=\"prompt\",\n",
+ " reference_col=\"reference\",\n",
+ " response_col=\"answer\",\n",
+ " experiment_name=EXPERIMENT_NAME,\n",
+ " eval_metrics=[\"question_answering_quality\", \"groundedness\"],\n",
+ " eval_sample_n=len(prompt_tuning_df),\n",
+ " ),\n",
+ " )\n",
+ "]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "_9ZMmVHZfl5O"
+ },
+ "source": [
+ "Plot the evaluation metrics."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "yTZKlgOk-0qz"
+ },
+ "outputs": [],
+ "source": [
+ "plot_eval_metrics(evaluation_qa_results)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Rp1n1aMACzSW"
+ },
+ "source": [
+ "### Translate the prompt template with Vertex AI Prompt Optimizer (Preview)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "h1650lf3X8xW"
+ },
+ "source": [
+ "#### Prepare the prompt template you want to optimize\n",
+ "\n",
+ "A prompt consists of two key parts:\n",
+ "\n",
+ "* **System Instruction Template** which is a fixed part of the prompt shared across all queries for a given task.\n",
+ "\n",
+ "* **Prompt Template** which is a dynamic part of the prompt that changes based on the task.\n",
+ "\n",
+ "Vertex AI Prompt Optimizer enables the translation and optimization of the Instruction Template, while the Task/Context Template remains essential for evaluating different instruction templates.\n",
+ "\n",
+ "In this case, you want to translate a prompt\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Db8rHNC6DmtY"
+ },
+ "outputs": [],
+ "source": [
+ "SYSTEM_INSTRUCTION_TEMPLATE = \"\"\"\n",
+ "Given a question with some context, provide the correct answer to the question.\n",
+ "\"\"\"\n",
+ "\n",
+ "PROMPT_TEMPLATE = \"\"\"\n",
+ "Some examples of correct answer to a question with context are:\n",
+ "Question: {{question}}\n",
+ "Answer: {{target}}\n",
+ "\"\"\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a1TCgXsrXztm"
+ },
+ "source": [
+ "#### Prepare few samples\n",
+ "\n",
+ "Vertex AI Prompt optimizer requires a CSV or JSONL file containing labeled samples.\n",
+ "\n",
+ "For **prompt optimization**:\n",
+ "\n",
+ "* Focus on examples that specifically demonstrate the issues you want to address.\n",
+ "* Recommendation: Use 50-100 distinct samples for reliable results. However, the tool can still be effective with as few as 5 samples.\n",
+ "\n",
+ "For **prompt translation**:\n",
+ "\n",
+ "* Consider using the source model to label examples that the target model struggles with, helping to identify areas for improvement.\n",
+ "\n",
+ "Learn more about setting up your CSV or JSONL file as input [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "vTIl_v9Ig1F-"
+ },
+ "outputs": [],
+ "source": [
+ "prepared_prompt_tuning_df = prompt_tuning_df.copy()\n",
+ "\n",
+ "# Prepare question and target columns\n",
+ "prepared_prompt_tuning_df[\"question\"] = (\n",
+ " prepared_prompt_tuning_df[\"user_question\"]\n",
+ " + \"\\nnContext:\\n\"\n",
+ " + prepared_prompt_tuning_df[\"context\"]\n",
+ ")\n",
+ "prepared_prompt_tuning_df = prepared_prompt_tuning_df.rename(\n",
+ " columns={\"reference\": \"target\"}\n",
+ ")\n",
+ "\n",
+ "# Remove uneccessary columns\n",
+ "prepared_prompt_tuning_df = prepared_prompt_tuning_df.drop(\n",
+ " columns=[\"user_question\", \"context\", \"prompt\", \"answer\"]\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "_DUFEAb82eEi"
+ },
+ "outputs": [],
+ "source": [
+ "prepared_prompt_tuning_df.head()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "nF3XY_d_yB-K"
+ },
+ "source": [
+ "#### Upload samples to bucket\n",
+ "\n",
+ "Once you prepare samples, you can upload them on Cloud Storage bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "155paLgGUXOm"
+ },
+ "outputs": [],
+ "source": [
+ "prepared_prompt_tuning_df.to_json(\n",
+ " INPUT_TUNING_DATA_FILE_URI, orient=\"records\", lines=True\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "F5RD0l2xX-FI"
+ },
+ "source": [
+ "#### Configure optimization settings\n",
+ "\n",
+ "Vertex AI Prompt Optimizer allows you to optimize prompts by optimizing instructions only, demonstration only, or both (`optimization_mode`), and after you set the system instruction, prompt templates that will be optimized (`system_instruction`, `prompt_template`), and the model you want to optimize for (`target_model`), it allows to condition the optimization process by setting metrics, number of iterations used to improve the prompt and more.\n",
+ "\n",
+ "Below you have some configurations as default that are most commonly used and recommended. And if you want to have more control of the optimization process, Vertex AI Prompt Optimizer (Preview) provides also additional configurations. Refer [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer) to learn more about the different parameters settings and how to best utilize them.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "sFHutXhgeqRx"
+ },
+ "outputs": [],
+ "source": [
+ "PROMPT_OPTIMIZATION_JOB = \"auto-prompt-design-job-\" + get_id()\n",
+ "OUTPUT_TUNING_RUN_URI = str(OUTPUT_TUNING_DATA_URI / PROMPT_OPTIMIZATION_JOB)\n",
+ "\n",
+ "args = Namespace(\n",
+ " # Basic configuration\n",
+ " system_instruction=SYSTEM_INSTRUCTION_TEMPLATE,\n",
+ " prompt_template=PROMPT_TEMPLATE,\n",
+ " target_model=\"gemini-1.5-flash-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n",
+ " optimization_mode=\"instruction\", # Supported modes: \"instruction\", \"demonstration\", \"instruction_and_demo\"\n",
+ " num_steps=3,\n",
+ " num_template_eval_per_step=2,\n",
+ " num_demo_set_candidates=3,\n",
+ " demo_set_size=2,\n",
+ " input_data_path=INPUT_TUNING_DATA_FILE_URI,\n",
+ " output_path=OUTPUT_TUNING_RUN_URI,\n",
+ " project=PROJECT_ID,\n",
+ " # Advanced configuration\n",
+ " target_model_qps=1,\n",
+ " target_model_location=\"us-central1\",\n",
+ " source_model=\"\",\n",
+ " source_model_qps=\"\",\n",
+ " source_model_location=\"\",\n",
+ " eval_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n",
+ " eval_qps=1,\n",
+ " eval_model_location=\"us-central1\",\n",
+ " eval_metrics_types=[\n",
+ " \"question_answering_correctness\",\n",
+ " \"groundedness\",\n",
+ " ], # Supported metrics: \"bleu\", \"coherence\", \"exact_match\", \"fluidity\", \"fulfillment\", \"groundedness\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_helpfulness\", \"question_answering_quality\", \"question_answering_relevance\", \"summarization_helpfulness\", \"summarization_quality\", \"summarization_verbosity\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\"\n",
+ " eval_metrics_weights=[0.9, 0.1],\n",
+ " aggregation_type=\"weighted_sum\", # Supported aggregation types: \"weighted_sum\", \"weighted_average\"\n",
+ " data_limit=50,\n",
+ " response_mime_type=\"application/json\",\n",
+ " language=\"English\", # Supported languages: \"English\", \"French\", \"German\", \"Hebrew\", \"Hindi\", \"Japanese\", \"Korean\", \"Portuguese\", \"Simplified Chinese\", \"Spanish\", \"Traditional Chinese\"\n",
+ " placeholder_to_content=json.loads(\"{}\"),\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Jd_uzQYQx6L7"
+ },
+ "source": [
+ "#### Upload Vertex AI Prompt Optimizer (Preview) config to Cloud Storage\n",
+ "\n",
+ "After you define Vertex AI Prompt Optimizer (Preview) configuration, you upload them on Cloud Storage bucket.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "QCJAqcfWBqAh"
+ },
+ "source": [
+ "Now you can save the config to the bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "iqiv8ApR_SAM"
+ },
+ "outputs": [],
+ "source": [
+ "args = vars(args)\n",
+ "\n",
+ "with epath.Path(CONFIG_FILE_URI).open(\"w\") as config_file:\n",
+ " json.dump(args, config_file)\n",
+ "config_file.close()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "spqgBT8hYAle"
+ },
+ "source": [
+ "#### Run the automatic prompt optimization job\n",
+ "\n",
+ "Now you are ready to run your first Vertex AI Prompt Optimizer (Preview) job using the Vertex AI SDK for Python.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "GtPnvKIpUQ3q"
+ },
+ "outputs": [],
+ "source": [
+ "WORKER_POOL_SPECS = [\n",
+ " {\n",
+ " \"machine_spec\": {\n",
+ " \"machine_type\": \"n1-standard-4\",\n",
+ " },\n",
+ " \"replica_count\": 1,\n",
+ " \"container_spec\": {\n",
+ " \"image_uri\": APD_CONTAINER_URI,\n",
+ " \"args\": [\"--config=\" + CONFIG_FILE_URI],\n",
+ " },\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "custom_job = aiplatform.CustomJob(\n",
+ " display_name=PROMPT_OPTIMIZATION_JOB,\n",
+ " worker_pool_specs=WORKER_POOL_SPECS,\n",
+ ")\n",
+ "\n",
+ "custom_job.run()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3YwwKBhtJ4ut"
+ },
+ "source": [
+ "### Collect the optimization results\n",
+ "\n",
+ "Vertex AI Prompt Optimizer returns both optimized templates and evaluation results for either instruction, or demostrations, or both depending on the optimization mode you define as JSONL files on Cloud Storage bucket. Those results help you understand the optimization process.\n",
+ "\n",
+ "In this case, you want to collect the optimized templates and evaluation results for the instruction.\n",
+ "\n",
+ "Below you use a helper function to read those results.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "xTPJsvg-kzkO"
+ },
+ "outputs": [],
+ "source": [
+ "apd_result_uris = get_results_file_uris(\n",
+ " output_uri=OUTPUT_TUNING_RUN_URI,\n",
+ " required_files=[\"eval_results.json\", \"templates.json\"],\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ZezzQSYWjYPd"
+ },
+ "source": [
+ "#### Get the best system instruction\n",
+ "\n",
+ "Below you have the optimal system instruction template and the associated evaluation metrics."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PrezXkBUu1s5"
+ },
+ "outputs": [],
+ "source": [
+ "best_prompt_df, prompt_summary_df, prompt_metrics_df = get_optimization_result(\n",
+ " apd_result_uris[\"instruction_templates\"],\n",
+ " apd_result_uris[\"instruction_eval_results\"],\n",
+ ")\n",
+ "\n",
+ "display_eval_report(\n",
+ " (best_prompt_df, prompt_summary_df, prompt_metrics_df),\n",
+ " prompt_component=\"instruction\",\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "TrMrbcA5Gzep"
+ },
+ "source": [
+ "### Validate and Evaluate the optimized template in question-answering task\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "bGRELw3U3I28"
+ },
+ "source": [
+ "#### Generate new responses using the optimized template\n",
+ "\n",
+ "Finally, you generate the new responses with the optimized template. Below you can see an example of a generated response using the optimized system instructions template."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "GXDU_ydAG5ak"
+ },
+ "outputs": [],
+ "source": [
+ "optimized_prompt_template = (\n",
+ " best_prompt_df[\"prompt\"].iloc[0]\n",
+ " + \"\\nQuestion: \\n{question}\"\n",
+ " + \"\\nContext: \\n{context}\"\n",
+ ")\n",
+ "\n",
+ "optimized_prompts = [\n",
+ " optimized_prompt_template.format(question=q, context=c)\n",
+ " for q, c in zip(\n",
+ " prompt_tuning_df[\"user_question\"].to_list(),\n",
+ " prompt_tuning_df[\"context\"].to_list(),\n",
+ " )\n",
+ "]\n",
+ "\n",
+ "prompt_tuning_df[\"optimized_prompt_with_vapo\"] = optimized_prompts"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "qG6QJW8alttS"
+ },
+ "outputs": [],
+ "source": [
+ "gemini_llm = init_new_model(\"gemini-1.5-flash-001\")\n",
+ "\n",
+ "gemini_predictions = [async_generate(p, model=gemini_llm) for p in optimized_prompts]\n",
+ "\n",
+ "gemini_predictions_col = await tqdm_asyncio.gather(*gemini_predictions)\n",
+ "\n",
+ "prompt_tuning_df[\"gemini_answer_with_vapo\"] = gemini_predictions_col"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "_55cHbD4kFAz"
+ },
+ "outputs": [],
+ "source": [
+ "print_df_rows(prompt_tuning_df, n=1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "D1wxiPhv21TT"
+ },
+ "source": [
+ "#### Evaluate new responses using Vertex AI Gen AI Evaluation\n",
+ "\n",
+ "And you use the generated responses with the optimized prompt to run a new round of evaluation with Vertex AI Gen AI Evaluation.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "5Ebtvk0fKApV"
+ },
+ "outputs": [],
+ "source": [
+ "evaluation_qa_results.append(\n",
+ " (\n",
+ " \"qa_eval_result_new_model_with_vapo\",\n",
+ " evaluate_task(\n",
+ " df=prompt_tuning_df,\n",
+ " prompt_col=\"optimized_prompt_with_vapo\",\n",
+ " reference_col=\"reference\",\n",
+ " response_col=\"gemini_answer_with_vapo\",\n",
+ " experiment_name=EXPERIMENT_NAME,\n",
+ " eval_metrics=[\"question_answering_quality\", \"groundedness\"],\n",
+ " eval_sample_n=len(prompt_tuning_df),\n",
+ " ),\n",
+ " )\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "wJXNAnJjmnga"
+ },
+ "outputs": [],
+ "source": [
+ "plot_eval_metrics(evaluation_qa_results)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2a4e033321ad"
+ },
+ "source": [
+ "## IV. Clean up"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "WRY_3wh1GVNm"
+ },
+ "outputs": [],
+ "source": [
+ "delete_bucket = False\n",
+ "delete_job = False\n",
+ "delete_experiment = False\n",
+ "delete_tutorial = False\n",
+ "\n",
+ "if delete_bucket:\n",
+ " ! gsutil rm -r $BUCKET_URI\n",
+ "\n",
+ "if delete_job:\n",
+ " custom_job.delete()\n",
+ "\n",
+ "if delete_experiment:\n",
+ " experiment = aiplatform.Experiment(experiment_name=EXPERIMENT_NAME)\n",
+ " experiment.delete()\n",
+ "\n",
+ "if delete_tutorial:\n",
+ " import shutil\n",
+ "\n",
+ " shutil.rmtree(str(TUTORIAL_PATH))"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "get_started_with_vertex_ai_prompt_optimizer_sdk.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
diff --git a/gemini/prompts/prompt_optimizer/utils/helpers.py b/gemini/prompts/prompt_optimizer/utils/helpers.py
new file mode 100644
index 0000000000..c20b10269c
--- /dev/null
+++ b/gemini/prompts/prompt_optimizer/utils/helpers.py
@@ -0,0 +1,357 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import json
+import random
+import string
+from typing import Dict, List, Optional, Tuple, Union
+
+from IPython.display import HTML, Markdown, display
+from etils import epath
+import pandas as pd
+import plotly.graph_objects as go
+from tenacity import retry, wait_random_exponential
+from vertexai import generative_models
+from vertexai.evaluation import EvalTask
+from vertexai.generative_models import GenerativeModel
+
+METRICS = [
+ "bleu",
+ "coherence",
+ "exact_match",
+ "fluidity",
+ "fulfillment",
+ "groundedness",
+ "rouge_1",
+ "rouge_2",
+ "rouge_l",
+ "rouge_l_sum",
+ "safety",
+ "question_answering_correctness",
+ "question_answering_helpfulness",
+ "question_answering_quality",
+ "question_answering_relevance",
+ "summarization_helpfulness",
+ "summarization_quality",
+ "summarization_verbosity",
+ "tool_name_match",
+ "tool_parameter_key_match",
+ "tool_parameter_kv_match",
+]
+COMPOSITE_METRIC = "composite_metric"
+
+
+def get_id(length: Union[int, None] = 8) -> str:
+ """Generate a uuid of a specified length (default=8)."""
+ if length is None:
+ length = 8
+ return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))
+
+
+@retry(wait=wait_random_exponential(multiplier=1, max=120))
+async def async_generate(prompt: str, model: GenerativeModel) -> Union[str, None]:
+ """Generate a response from the model."""
+ response = await model.generate_content_async(
+ [prompt],
+ stream=False,
+ )
+ return response.text[0] if response.text else None
+
+
+def evaluate_task(
+ df: pd.DataFrame,
+ prompt_col: str,
+ reference_col: str,
+ response_col: str,
+ experiment_name: str,
+ eval_metrics: List[str],
+ eval_sample_n: int,
+) -> Dict[str, float]:
+ """Evaluate task using Vertex AI Evaluation."""
+
+ # Generate a unique id for the experiment run
+ id = get_id()
+
+ # Rename the columns to match the expected format
+ eval_dataset = df[[prompt_col, reference_col, response_col]].rename(
+ columns={
+ prompt_col: "prompt",
+ reference_col: "reference",
+ response_col: "response",
+ }
+ )
+
+ # Drop rows with missing values
+ eval_dataset = eval_dataset.dropna()
+
+ # Sample a subset of the dataset
+ eval_dataset = eval_dataset.sample(n=eval_sample_n, random_state=8).reset_index(
+ drop=True
+ )
+
+ # Create an EvalTask object
+ eval_task = EvalTask(
+ dataset=eval_dataset,
+ metrics=eval_metrics,
+ experiment=experiment_name,
+ )
+
+ # Evaluate the task
+ result = eval_task.evaluate(experiment_run_name=f"{experiment_name}-{id}")
+
+ # Return the summary metrics
+ return result.summary_metrics
+
+
+def print_df_rows(
+ df: pd.DataFrame, columns: Optional[List[str]] = None, n: int = 3
+) -> None:
+ """Print a subset of rows from a DataFrame."""
+
+ # Define the base style for the text
+ base_style = (
+ "white-space: pre-wrap; width: 800px; overflow-x: auto; font-size: 16px;"
+ )
+
+ # Define the header style for the text
+ header_style = (
+ "white-space: pre-wrap; width: 800px; overflow-x: auto; font-size: 16px;"
+ )
+
+ # If columns are specified, filter the DataFrame
+ if columns:
+ df = df[columns]
+
+ # Initialize the counter for printed samples
+ printed_samples = 0
+
+ # Iterate over the rows of the DataFrame
+ for _, row in df.iterrows():
+ for field in df.columns:
+ display(HTML(f"{field.capitalize()}:"))
+ display(HTML(" "))
+ value = row[field]
+ display(HTML(f"{value}"))
+ display(HTML(" "))
+
+ printed_samples += 1
+ if printed_samples >= n:
+ break
+
+
+def init_new_model(model_name: str) -> GenerativeModel:
+ """Initialize a new model."""
+
+ # Initialize the model
+ model = GenerativeModel(
+ model_name=model_name,
+ generation_config={
+ "candidate_count": 1,
+ "max_output_tokens": 2048,
+ "temperature": 0.5,
+ },
+ safety_settings={
+ generative_models.HarmCategory.HARM_CATEGORY_HATE_SPEECH: generative_models.HarmBlockThreshold.BLOCK_NONE,
+ generative_models.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: generative_models.HarmBlockThreshold.BLOCK_NONE,
+ generative_models.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: generative_models.HarmBlockThreshold.BLOCK_NONE,
+ generative_models.HarmCategory.HARM_CATEGORY_HARASSMENT: generative_models.HarmBlockThreshold.BLOCK_NONE,
+ },
+ )
+ return model
+
+
+def plot_eval_metrics(
+ eval_results: List[tuple[str, Dict[str, float]]],
+ metrics: Optional[List[str]] = None,
+) -> None:
+ """Plot a bar plot for the evaluation results."""
+
+ # Create data for the bar plot
+ data = []
+ for eval_result in eval_results:
+ title, summary_metrics = eval_result
+ if metrics:
+ summary_metrics = {
+ k: summary_metrics[k]
+ for k, v in summary_metrics.items()
+ if any(selected_metric in k for selected_metric in metrics)
+ }
+
+ summary_metrics = {k: v for k, v in summary_metrics.items() if "mean" in k}
+ data.append(
+ go.Bar(
+ x=list(summary_metrics.keys()),
+ y=list(summary_metrics.values()),
+ name=title,
+ )
+ )
+
+ # Update the figure with the data
+ fig = go.Figure(data=data)
+
+ # Add the title
+ fig.update_layout(
+ title=go.layout.Title(text="Evaluation Metrics", x=0.5),
+ xaxis_title="Metric Name",
+ yaxis_title="Mean Value",
+ )
+
+ # Change the bar mode
+ fig.update_layout(barmode="group")
+
+ # Show the plot
+ fig.show()
+
+
+def get_results_file_uris(
+ output_uri: str, required_files: List[str] = ["eval_results.json", "templates.json"]
+) -> Dict[str, str]:
+ """Finds directories containing specific files under the given full GCS path."""
+
+ # Create a path object for the given output URI
+ path = epath.Path(output_uri)
+
+ # Initialize a dictionary to store the results file URIs
+ results_file_uris: Dict[str, str] = {}
+
+ # Iterate over the directories and files in the path
+ for directory in path.iterdir():
+ for file in directory.iterdir():
+ if file.name in required_files:
+ file_key = directory.name + "_" + file.stem
+ results_file_uris[file_key] = str(directory / file)
+
+ # Return the results file URIs
+ return results_file_uris
+
+
+def get_best_template(template_uri: str) -> pd.DataFrame:
+ """Retrieves and processes the best template."""
+
+ # Load templates from the URI
+ with epath.Path(template_uri).open() as f:
+ templates = json.load(f)
+
+ # Process metrics for each template
+ for template in templates:
+ template["metrics"] = {
+ key.split("/")[0]: value for key, value in template["metrics"].items()
+ }
+
+ # Sort templates based on composite metric or highest metric value
+ if any(template["metrics"].get(COMPOSITE_METRIC) for template in templates):
+ sorted_templates = sorted(
+ templates, key=lambda x: x["metrics"][COMPOSITE_METRIC], reverse=True
+ )
+ elif any(
+ metric in template["metrics"] for template in templates for metric in METRICS
+ ):
+ sorted_metrics = sorted(
+ templates, key=lambda x: max(x["metrics"].values()), reverse=True
+ )
+ top_metric = list(sorted_metrics[0]["metrics"].keys())[0]
+ sorted_templates = sorted(
+ templates, key=lambda x: x["metrics"][top_metric], reverse=True
+ )
+ else:
+ raise ValueError("No valid metrics found in templates.")
+
+ # Create a DataFrame with the best template and metrics
+ best_template_df = pd.DataFrame([sorted_templates[0]])
+
+ # Add metrics as separate columns
+ for metric in best_template_df["metrics"].iloc[0]:
+ best_template_df[f"metrics_{metric}"] = best_template_df["metrics"].apply(
+ lambda x: x[metric]
+ )
+
+ # Drop the 'metrics' column
+ best_template_df = best_template_df.drop("metrics", axis=1)
+
+ return best_template_df
+
+
+def get_best_evaluation(
+ best_template_df: pd.DataFrame, eval_result_uri: str
+) -> Tuple[pd.DataFrame, pd.DataFrame]:
+ """Retrieves and processes the best evaluation."""
+
+ # Load the evaluations from the URI
+ with epath.Path(eval_result_uri).open() as f:
+ evaluations = json.load(f)
+
+ # Get the best index from the best template DataFrame
+ best_index = best_template_df["step"].iloc[0]
+
+ # Retrieve the best evaluation based on the index
+ best_evaluation: Dict = evaluations[best_index]
+
+ # Create a DataFrame from the summary results
+ summary_df = pd.DataFrame([best_evaluation["summary_results"]])
+
+ # Load the metrics table from the best evaluation
+ metrics_table = json.loads(best_evaluation["metrics_table"])
+
+ # Create a DataFrame from the metrics table
+ metrics_df = pd.DataFrame(metrics_table)
+
+ return summary_df, metrics_df
+
+
+def get_optimization_result(
+ template_uri: str, eval_result_uri: str
+) -> Union[Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame], None]:
+ """Retrieves and processes the best template and evaluation results."""
+
+ # Get the best template DataFrame
+ best_template_df = get_best_template(template_uri)
+
+ # Get the summary and metrics DataFrames for the best evaluation
+ summary_df, metrics_df = get_best_evaluation(best_template_df, eval_result_uri)
+
+ return best_template_df, summary_df, metrics_df
+
+
+def display_eval_report(
+ eval_result: Union[pd.DataFrame, Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]],
+ prompt_component: str = "instruction",
+) -> None:
+ """Displays evaluation results with optional filtering by metrics."""
+
+ # Unpack the evaluation result
+ best_template_df, summary_df, metrics_df = eval_result
+
+ # Display the report title
+ display(Markdown("## Vertex AI Prompt Optimizer - Report"))
+
+ # Display the prompt component title
+ if prompt_component == "instruction":
+ display(Markdown("### Best Instruction"))
+ elif prompt_component == "demonstration":
+ display(Markdown("### Best Demonstration"))
+ else:
+ raise ValueError(
+ "Invalid prompt_component value. Must be 'instruction' or 'demonstration'."
+ )
+
+ # Display the best template DataFrame
+ display(best_template_df.style.hide(axis="index"))
+
+ # Display the summary metrics title
+ display(Markdown("### Summary Metrics"))
+ display(summary_df.style.hide(axis="index"))
+
+ # Display the report metrics title
+ display(Markdown("### Report Metrics"))
+ display(metrics_df.style.hide(axis="index"))
From 518e6c9ed33bf49d0dd0e6f4a6502cdec8d487b7 Mon Sep 17 00:00:00 2001
From: Alok Pattani <51244947+alokpattani@users.noreply.github.com>
Date: Tue, 24 Sep 2024 08:31:29 -0700
Subject: [PATCH 06/76] feat: Adding YouTube video analysis notebook (#1161)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [ ] You are listed as the author in your notebook or README file.
- [ ] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [ ] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [ ] Appropriate docs were updated (if necessary)
Fixes # 🦕
---------
Co-authored-by: Owl Bot
Co-authored-by: Eric Dong
---
.../youtube_video_analysis.ipynb | 1409 +++++++++++++++++
1 file changed, 1409 insertions(+)
create mode 100644 gemini/use-cases/video-analysis/youtube_video_analysis.ipynb
diff --git a/gemini/use-cases/video-analysis/youtube_video_analysis.ipynb b/gemini/use-cases/video-analysis/youtube_video_analysis.ipynb
new file mode 100644
index 0000000000..500437a464
--- /dev/null
+++ b/gemini/use-cases/video-analysis/youtube_video_analysis.ipynb
@@ -0,0 +1,1409 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "ur8xi4C7S06n"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JAPoU8Sm5E6e"
+ },
+ "source": [
+ "# YouTube Video Analysis with Gemini\n",
+ "\n",
+ "
\n",
+ "Loading ITables v2.2.1 from the internet...\n",
+ "(need help?)
\n",
+ "\n",
+ "
\n",
+ "\n",
+ "
\n",
+ "\n",
+ "\n"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Convert structured output from response to data frame for display and/or further analysis\n",
+ "video_extraction_response_df = pd.DataFrame(json.loads(video_extraction_response_text))\n",
+ "\n",
+ "show(video_extraction_response_df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cfa2e8496790"
+ },
+ "source": [
+ "## Creating insights from analyzing multiple YouTube videos together"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "c634255fd419"
+ },
+ "source": [
+ "### Google \"Year in Search\" videos\n",
+ "Now, consider expanding the problem to a more common enterprise use case: extracting information from _multiple_ YouTube videos at once.\n",
+ "\n",
+ "This time, we’ll use [Google’s “Year in Search” videos](https://about.google/intl/ALL_us/stories/year-in-search/), which summarize the questions, people, and moments that captured the world’s attention in each year. As of fall 2024, there are 14 of these videos, each 2-4 minutes in length, from [2010](https://www.youtube.com/watch?v=F0QXB5pw2qE) through [2023](https://www.youtube.com/watch?v=3KtWfp0UopM).\n",
+ "\n",
+ "We start by reading in a CSV file that has links to all the videos."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "id": "b004061c908a"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "
\n",
+ "\n",
+ "
\n",
+ " \n",
+ "
\n",
+ "
\n",
+ "
year
\n",
+ "
yt_link
\n",
+ "
\n",
+ " \n",
+ " \n",
+ "
\n",
+ "
0
\n",
+ "
2023
\n",
+ "
https://www.youtube.com/watch?v=3KtWfp0UopM
\n",
+ "
\n",
+ "
\n",
+ "
1
\n",
+ "
2022
\n",
+ "
https://www.youtube.com/watch?v=4WXs3sKu41I
\n",
+ "
\n",
+ "
\n",
+ "
2
\n",
+ "
2021
\n",
+ "
https://www.youtube.com/watch?v=EqboAI-Vk-U
\n",
+ "
\n",
+ "
\n",
+ "
3
\n",
+ "
2020
\n",
+ "
https://www.youtube.com/watch?v=rokGy0huYEA
\n",
+ "
\n",
+ "
\n",
+ "
4
\n",
+ "
2019
\n",
+ "
https://www.youtube.com/watch?v=ZRCdORJiUgU
\n",
+ "
\n",
+ "
\n",
+ "
5
\n",
+ "
2018
\n",
+ "
https://www.youtube.com/watch?v=6aFdEhEZQjE
\n",
+ "
\n",
+ "
\n",
+ "
6
\n",
+ "
2017
\n",
+ "
https://www.youtube.com/watch?v=vI4LHl4yFuo
\n",
+ "
\n",
+ "
\n",
+ "
7
\n",
+ "
2016
\n",
+ "
https://www.youtube.com/watch?v=KIViy7L_lo8
\n",
+ "
\n",
+ "
\n",
+ "
8
\n",
+ "
2015
\n",
+ "
https://www.youtube.com/watch?v=q7o7R5BgWDY
\n",
+ "
\n",
+ "
\n",
+ "
9
\n",
+ "
2014
\n",
+ "
https://www.youtube.com/watch?v=DVwHCGAr_OE
\n",
+ "
\n",
+ "
\n",
+ "
10
\n",
+ "
2013
\n",
+ "
https://www.youtube.com/watch?v=Lv-sY_z8MNs
\n",
+ "
\n",
+ "
\n",
+ "
11
\n",
+ "
2012
\n",
+ "
https://www.youtube.com/watch?v=xY_MUB8adEQ
\n",
+ "
\n",
+ "
\n",
+ "
12
\n",
+ "
2011
\n",
+ "
https://www.youtube.com/watch?v=SAIEamakLoY
\n",
+ "
\n",
+ "
\n",
+ "
13
\n",
+ "
2010
\n",
+ "
https://www.youtube.com/watch?v=F0QXB5pw2qE
\n",
+ "
\n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " year yt_link\n",
+ "0 2023 https://www.youtube.com/watch?v=3KtWfp0UopM\n",
+ "1 2022 https://www.youtube.com/watch?v=4WXs3sKu41I\n",
+ "2 2021 https://www.youtube.com/watch?v=EqboAI-Vk-U\n",
+ "3 2020 https://www.youtube.com/watch?v=rokGy0huYEA\n",
+ "4 2019 https://www.youtube.com/watch?v=ZRCdORJiUgU\n",
+ "5 2018 https://www.youtube.com/watch?v=6aFdEhEZQjE\n",
+ "6 2017 https://www.youtube.com/watch?v=vI4LHl4yFuo\n",
+ "7 2016 https://www.youtube.com/watch?v=KIViy7L_lo8\n",
+ "8 2015 https://www.youtube.com/watch?v=q7o7R5BgWDY\n",
+ "9 2014 https://www.youtube.com/watch?v=DVwHCGAr_OE\n",
+ "10 2013 https://www.youtube.com/watch?v=Lv-sY_z8MNs\n",
+ "11 2012 https://www.youtube.com/watch?v=xY_MUB8adEQ\n",
+ "12 2011 https://www.youtube.com/watch?v=SAIEamakLoY\n",
+ "13 2010 https://www.youtube.com/watch?v=F0QXB5pw2qE"
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Read in table of Year in Search video links from public CSV file\n",
+ "GOOGLE_YEAR_IN_SEARCH_VIDEO_LINKS_CSV_GCS_URI = (\n",
+ " \"gs://github-repo/video/google_year_in_search_video_links.csv\"\n",
+ ")\n",
+ "\n",
+ "year_in_search_yt_links = pd.read_csv(GOOGLE_YEAR_IN_SEARCH_VIDEO_LINKS_CSV_GCS_URI)\n",
+ "\n",
+ "year_in_search_yt_links"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "145522e33a47"
+ },
+ "source": [
+ "### Set up for analyzing multiple video files\n",
+ "\n",
+ "Let’s say we are a sports agency who wants to see which athletes or teams appear most often in these videos as a measure of cultural relevance. Instead of watching and manually counting, we can use Gemini’s multimodal capabilities and world knowledge to extract each appearance of an athlete or team into a structured output that we can use for further analysis.\n",
+ "\n",
+ "The system instructions, prompt, and response schema that will apply to all 14 videos are each created in the cell below."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "id": "b8589a51547d"
+ },
+ "outputs": [],
+ "source": [
+ "# Set up pieces (prompt, response schema, config) for Google Year in Search videos\n",
+ "multiple_video_extraction_system_instruction_text = (\n",
+ " \"You are a video analyst that \"\n",
+ " \"carefully looks through all frames of provided videos, extracting out the \"\n",
+ " \"pieces necessary to respond to user prompts.\"\n",
+ ")\n",
+ "\n",
+ "multiple_video_extraction_prompt = (\n",
+ " \"Which sports athletes or teams are mentioned or \"\n",
+ " \"shown in this video? Please look through each frame carefully, and respond \"\n",
+ " \"with a complete list that includes the athlete or team's name (1 row per \"\n",
+ " \"athlete or team), whether they are an athlete or team, the sport they play, \"\n",
+ " \"and the timestamp into the video at which they appear (in mm:ss format, \"\n",
+ " \"do not give extra precision) for each one.\"\n",
+ ")\n",
+ "\n",
+ "multiple_video_extraction_response_schema = {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"OBJECT\",\n",
+ " \"properties\": {\n",
+ " \"name\": {\"type\": \"STRING\"},\n",
+ " \"athlete_or_team\": {\"type\": \"STRING\", \"enum\": [\"athlete\", \"team\"]},\n",
+ " \"sport\": {\"type\": \"STRING\"},\n",
+ " \"video_timestamp\": {\"type\": \"STRING\"},\n",
+ " },\n",
+ " },\n",
+ "}\n",
+ "\n",
+ "multiple_video_extraction_json_generation_config = GenerationConfig(\n",
+ " temperature=0.0,\n",
+ " max_output_tokens=8192,\n",
+ " response_mime_type=\"application/json\",\n",
+ " response_schema=multiple_video_extraction_response_schema,\n",
+ ")\n",
+ "\n",
+ "multiple_video_extraction_model = GenerativeModel(\n",
+ " model_name=GEMINI_PRO_MODEL_ID,\n",
+ " system_instruction=multiple_video_extraction_system_instruction_text,\n",
+ " generation_config=multiple_video_extraction_json_generation_config,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0cb2d4688f68"
+ },
+ "source": [
+ "Next, we'll set up to run each of these prompt/video pairs through the Gemini API _asynchronously_. This allows us to send all the requests to Gemini at once, then wait for all the answers to come back - a more efficient process than sending them synchronously (one-by-one). See more details in [this Google Cloud Community Medium blog post](https://medium.com/google-cloud/how-to-prompt-gemini-asynchronously-using-python-on-google-cloud-986ca45d9f1b).\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "id": "5aa93ca907bc"
+ },
+ "outputs": [],
+ "source": [
+ "# Function for asynchronous generation\n",
+ "\n",
+ "\n",
+ "@retry(wait=wait_random_exponential(multiplier=1, max=120), stop=stop_after_attempt(2))\n",
+ "async def async_generate(prompt, yt_link):\n",
+ " try:\n",
+ " response = await multiple_video_extraction_model.generate_content_async(\n",
+ " [prompt, Part.from_uri(mime_type=\"video/webm\", uri=yt_link)], stream=False\n",
+ " )\n",
+ "\n",
+ " response_dict = response.to_dict()\n",
+ "\n",
+ " return response_dict\n",
+ " except Exception as e:\n",
+ " print(\"Something failed, retrying\")\n",
+ " print(e)\n",
+ " with retry.stop_after_attempt(2) as retry_state:\n",
+ " if retry_state.attempt > 2:\n",
+ " return None\n",
+ " raise # Re-raise the exception for tenacity to handle"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61265bdff388"
+ },
+ "source": [
+ "### Run asynchronous Gemini calls to do video extraction"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {
+ "id": "4191dc30d77a"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Elapsed time: 63.80 seconds\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Perform asynhronous calls across all videos, gather responses\n",
+ "start_time = time.time()\n",
+ "\n",
+ "get_responses = [\n",
+ " async_generate(multiple_video_extraction_prompt, yt_link)\n",
+ " for yt_link in year_in_search_yt_links[\"yt_link\"]\n",
+ "]\n",
+ "\n",
+ "multiple_video_extraction_responses = await asyncio.gather(*get_responses)\n",
+ "\n",
+ "end_time = time.time()\n",
+ "\n",
+ "elapsed_time = end_time - start_time\n",
+ "\n",
+ "print(f\"Elapsed time: {elapsed_time:.2f} seconds\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "7c69057ae51d"
+ },
+ "source": [
+ "### Extract and analyze video results across years\n",
+ "\n",
+ "Once we have the results from Gemini, we can process them and get table of every athlete or team apperance across all 14 \"Year in Search\" videos."
+ ]
+ },
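+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a reference, the next cell is a minimal sketch (not part of the original analysis code) of one way to flatten the structured responses into a single table. It assumes each response dict follows the JSON response schema defined above and that `year_in_search_yt_links` and `multiple_video_extraction_responses` from the previous cells are still in memory."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Minimal sketch: flatten the structured JSON responses into one DataFrame.\n",
+ "# Assumes `year_in_search_yt_links` and `multiple_video_extraction_responses`\n",
+ "# from the cells above are in memory and follow the response schema.\n",
+ "import json\n",
+ "\n",
+ "all_appearances = []\n",
+ "\n",
+ "for year, response_dict in zip(\n",
+ "    year_in_search_yt_links[\"year\"], multiple_video_extraction_responses\n",
+ "):\n",
+ "    if response_dict is None:\n",
+ "        continue  # Skip videos whose request ultimately failed\n",
+ "\n",
+ "    # The structured list is returned as JSON text in the first candidate part\n",
+ "    json_text = response_dict[\"candidates\"][0][\"content\"][\"parts\"][0][\"text\"]\n",
+ "\n",
+ "    for appearance in json.loads(json_text):\n",
+ "        appearance[\"year\"] = year\n",
+ "        all_appearances.append(appearance)\n",
+ "\n",
+ "appearances_df = pd.DataFrame(all_appearances)\n",
+ "\n",
+ "appearances_df.head()"
+ ]
+ },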
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {
+ "id": "6e424adf2cf8"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "
"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "97b1d7579d2b"
- },
- "source": [
- "| | |\n",
- "|-|-|\n",
- "| Author(s) | [Erwin Huizenga](https://github.com/erwinh85) |"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "l5LXfKSgShfJ"
- },
- "source": [
- "## Overview\n",
- "This notebook demonstrates fine-tuning the Gemini generative model using the Vertex AI Supervised Tuning feature. Supervised Tuning allows you to use your training data to refine the base model's capabilities toward specific tasks.\n",
- "\n",
- "Supervised Tuning uses labeled examples to tune a model. Each example demonstrates the output you want from your text model during inference.\n",
- "\n",
- "- Data Preparation: Your role is crucial in ensuring your training data is high-quality, well-labeled, and directly relevant to the target task. The quality of the data can significantly impact the model's performance and the presence of bias in the fine-tuned model, underscoring the importance of your contribution.\n",
- "- Training: This phase presents an exciting opportunity to experiment with different configurations, allowing you to optimize the model's performance on the target task. Your creativity and innovation can lead to significant improvements in the model's capabilities.\n",
- "- Evaluation:\n",
- " - Metric: Choose appropriate evaluation metrics that accurately reflect the success of the fine-tuned model for your specific task\n",
- " - Evaluation Set: Use a separate set of data to evaluate the model's performance\n",
- "\n",
- "### Recommended configurations\n",
- "The following table shows the recommended configurations for tuning a foundation model by task:\n",
- "\n",
- "| Task | No. of examples in dataset | Number of epochs |\n",
- "| -------------- | -------------------------- | ----------- |\n",
- "| Classification | 500+ | 2-4 |\n",
- "| Summarization | 1000+ | 2-4 |\n",
- "| Extractive QA | 500+ | 2-4 |\n",
- "| Chat | 1000+ | 2-4 |\n",
- "\n",
- "Before running this notebook, ensure you have:\n",
- "\n",
- "- A Google Cloud project: Provide your project ID in the `PROJECT_ID` variable.\n",
- "\n",
- "- Authenticated your Colab environment: Run the authentication code block at the beginning.\n",
- "\n",
- "- Prepared training data: Data should be formatted in JSON Lines with prompts and corresponding completions."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "2D59iF36T62k"
- },
- "source": [
- "## Getting Started"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "4fEBa5FbT-dc"
- },
- "source": [
- "### Install Vertex AI SDK and other required packages"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {
- "id": "0M04I5j3_KY5"
- },
- "outputs": [],
- "source": [
- "!pip3 install --upgrade --user --quiet google-cloud-aiplatform"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "FLXs74x8UD6A"
- },
- "source": [
- "### Restart runtime (Colab only)\n",
- "\n",
- "To use the newly installed packages, you must restart the runtime on Google Colab."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {
- "id": "8nnS20eqUJSK"
- },
- "outputs": [],
- "source": [
- "import sys\n",
- "\n",
- "if \"google.colab\" in sys.modules:\n",
- " import IPython\n",
- "\n",
- " app = IPython.Application.instance()\n",
- " app.kernel.do_shutdown(True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "qoh03Du8UNAn"
- },
- "source": [
- "
\n",
- "⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️\n",
- "
"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "ID0JD1lUUPmx"
- },
- "source": [
- "## Authenticate your notebook environment (Colab only)\n",
- "\n",
- "If you are running this notebook on Google Colab, run the cell below to authenticate your environment."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {
- "id": "t49529b_UZJ0"
- },
- "outputs": [],
- "source": [
- "import sys\n",
- "\n",
- "if \"google.colab\" in sys.modules:\n",
- " from google.colab import auth\n",
- "\n",
- " auth.authenticate_user()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "JKcjFI9iUa2T"
- },
- "source": [
- "- If you are running this notebook in a local development environment:\n",
- " - Install the [Google Cloud SDK](https://cloud.google.com/sdk).\n",
- " - Obtain authentication credentials. Create local credentials by running the following command and following the oauth2 flow (read more about the command [here](https://cloud.google.com/sdk/gcloud/reference/beta/auth/application-default/login)):\n",
- "\n",
- " ```bash\n",
- " gcloud auth application-default login\n",
- " ```"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "MyKgURhIUnAM"
- },
- "source": [
- "## Set Project and Location\n",
- "\n",
- "First, you will have to set your project_id, location, and bucket_name. You can also use an existing bucket within the project."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {
- "id": "4acO9tVcU1Ey"
- },
- "outputs": [],
- "source": [
- "PROJECT_ID = \"[your-project]\" # @param {type:\"string\"}\n",
- "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
- "\n",
- "BUCKET_NAME = \"[your-bucket]\" # @param {type:\"string\"}\n",
- "BUCKET_URI = f\"gs://{BUCKET_NAME}\""
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {
- "id": "udWd9pp7YTSb"
- },
- "outputs": [],
- "source": [
- "if BUCKET_NAME == \"\" or BUCKET_NAME is None or BUCKET_NAME == \"\":\n",
- " BUCKET_NAME = \"vertex-\" + UUID\n",
- " BUCKET_URI = f\"gs://{BUCKET_NAME}\""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "aVe_7ETAasWS"
- },
- "source": [
- "**warning**: Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "z6L_c9lwatsw"
- },
- "outputs": [],
- "source": [
- "! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "bJllsmMiavrH"
- },
- "source": [
- "Finally, validate access to your Cloud Storage bucket by examining its contents:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "87Mrn9G6ayl3"
- },
- "outputs": [],
- "source": [
- "! gsutil ls -al $BUCKET_URI"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "AkBZ-e85UeiI"
- },
- "source": [
- "## Import Libraries"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {
- "id": "jsnIinC4UfZq"
- },
- "outputs": [],
- "source": [
- "import vertexai\n",
- "from vertexai.generative_models import GenerativeModel\n",
- "from vertexai.preview.tuning import sft\n",
- "\n",
- "vertexai.init(project=PROJECT_ID, location=LOCATION)\n",
- "\n",
- "import datetime\n",
- "\n",
- "from google.cloud import bigquery\n",
- "import pandas as pd\n",
- "from sklearn.model_selection import train_test_split"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "nbaPSIO4_iur"
- },
- "source": [
- "## Supervised fine tuning with Gemini on a question and answer dataset\n",
- "\n",
- "Now it's time for you to create a tuning job. You will be using a Q&A with a context dataset in JSON format.\n",
- "\n",
- "Supervised fine-tuning offers a solution, allowing focused adaptation of foundation models to new tasks. You can create a supervised text model tuning job using the Google Cloud console, API, or the Vertex AI SDK for Python. You can read more on our [documentation page](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-use-supervised-tuning),\n",
- "\n",
- "But how do you ensure your data is primed for success with supervised fine-tuning? Here's a breakdown of critical areas to focus on:\n",
- "\n",
- "- **Domain Alignment:** Supervised fine-tuning thrives on smaller datasets, but they must be highly relevant to your downstream task. Seek out data that closely mirrors the domain you will encounter in real-world use cases.\n",
- "- **Labeling Accuracy:** Noisy labels will sabotage even the best technique. Prioritize accuracy in your annotations and labeling.\n",
- "- **Noise Reduction:** Outliers, inconsistencies, or irrelevant examples hurt model adaptation. Implement preprocessing, such as removing duplicates, fixing typos, and verifying that data conforms to your task's expectations.\n",
- "- **Distribution:** A diverse range of examples will help your model generalize better within the confines of your target task. Refrain from overloading the process with excessive variance that strays from your core domain.\n",
- "- **Balanced Classes:** For classification tasks, try to keep a reasonable balance between different classes to avoid the model learning biases towards a specific class\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "ivFjWO5M-Z8H"
- },
- "source": [
- "### Fetching data from BigQuery\n",
- "💾 Your model tuning dataset must be in a JSONL format where each line contains a single training example. You must make sure that you include instructions.\n",
- "\n",
- "You will use the [StackOverflow dataset](https://cloud.google.com/blog/topics/public-datasets/google-bigquery-public-datasets-now-include-stack-overflow-q-a) on BigQuery Public Datasets, limiting to questions with the `python` tag, and accepted answers for answers since 2020-01-01.\n",
- "\n",
- "You will use a helper function to read the data from BigQuery and create a Pandas dataframe."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {
- "id": "2JIlL-aVbNPg"
- },
- "outputs": [],
- "source": [
- "def run_bq_query(sql: str) -> str | pd.DataFrame:\n",
- " \"\"\"\n",
- " Run a BigQuery query and return the job ID or result as a DataFrame\n",
- " Args:\n",
- " sql: SQL query, as a string, to execute in BigQuery\n",
- " Returns:\n",
- " df: DataFrame of results from query, or error, if any\n",
- " \"\"\"\n",
- "\n",
- " bq_client = bigquery.Client(project=PROJECT_ID)\n",
- "\n",
- " # Try dry run before executing query to catch any errors\n",
- " job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)\n",
- " bq_client.query(sql, job_config=job_config)\n",
- "\n",
- " # If dry run succeeds without errors, proceed to run query\n",
- " job_config = bigquery.QueryJobConfig()\n",
- " client_result = bq_client.query(sql, job_config=job_config)\n",
- "\n",
- " job_id = client_result.job_id\n",
- "\n",
- " # Wait for query/job to finish running. then get & return DataFrame\n",
- " df = client_result.result().to_arrow().to_pandas()\n",
- " print(f\"Finished job_id: {job_id}\")\n",
- "\n",
- " return df"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "11WLzqp-b59c"
- },
- "source": [
- "Next you will write the query. For now you will limit our example to 550."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "gLC_elwzb3ZF"
- },
- "outputs": [],
- "source": [
- "stack_overflow_df = run_bq_query(\n",
- " \"\"\"SELECT\n",
- " CONCAT(q.title, q.body) AS input_text,\n",
- " a.body AS output_text\n",
- " FROM `bigquery-public-data.stackoverflow.posts_questions` q\n",
- " JOIN `bigquery-public-data.stackoverflow.posts_answers` a\n",
- " ON q.accepted_answer_id = a.id\n",
- " WHERE q.accepted_answer_id IS NOT NULL\n",
- " AND REGEXP_CONTAINS(q.tags, \"python\")\n",
- " AND a.creation_date >= \"2020-01-01\"\n",
- " LIMIT 550\n",
- " \"\"\"\n",
- ")\n",
- "\n",
- "stack_overflow_df.head()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "b404hW8jcRDQ"
- },
- "source": [
- "There should be 550 questions and answers."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "mUg-lF61cUVI"
- },
- "outputs": [],
- "source": [
- "print(len(stack_overflow_df))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "yHda8BzbmRMC"
- },
- "source": [
- "#### Adding instructions\n",
- "Finetuning language models on a collection of datasets phrased as instructions have been shown to improve model performance and generalization to unseen tasks [(Google, 2022)](https://arxiv.org/pdf/2210.11416.pdf).\n",
- "\n",
- "An instruction refers to a specific directive or guideline that conveys a task or action to be executed. These instructions can be expressed in various forms, such as step-by-step procedures, commands, or rules. When we don't use the instructions, it's only a question and answer. The instruction tells the large language model what to do. We want them to answer the question. We have to give a hint about the task we want to perform. Let's extend the dataset with an instruction."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {
- "id": "XIy7BjKWmu5j"
- },
- "outputs": [],
- "source": [
- "INSTRUCTION_TEMPLATE = \"\"\"\\\n",
- "You are a helpful Python developer \\\n",
- "You are good at answering StackOverflow questions \\\n",
- "Your mission is to provide developers with helpful answers that work\n",
- "\"\"\""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "-tM_f1b3n4TK"
- },
- "source": [
- "You will create a new column for the `INSTRUCTION_TEMPLATE`. Use a new column and do not overwrite the existing one, which you might want to use later."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "UJpAJG8uoE7F"
- },
- "outputs": [],
- "source": [
- "stack_overflow_df[\"input_text_instruct\"] = INSTRUCTION_TEMPLATE\n",
- "\n",
- "stack_overflow_df.head(2)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "MNMMxaB2cZvY"
- },
- "source": [
- "Next, you will randomly split the data into training and evaluation. For Extractive Q&A tasks, we advise 500+ training examples. In this case, you will use 440 to generate a tuning job that runs faster. \n",
- "\n",
- "20% of your dataset will be used for test. The `random_state` controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls. Feel free to adjust this. "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "qdrweRsscfgU"
- },
- "outputs": [],
- "source": [
- "# split is set to 80/20\n",
- "train, evaluation = train_test_split(stack_overflow_df, test_size=0.2, random_state=42)\n",
- "\n",
- "print(len(train))\n",
- "print(len(evaluation))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "b_MuRRGmfLni"
- },
- "source": [
- "#### Generating the JSONL files\n",
- "\n",
- "Prepare your training data in a JSONL (JSON Lines) file and store it in a Google Cloud Storage (GCS) bucket. This format ensures efficient processing. Each line of the JSONL file must represent a single data instance and follow a well-defined schema:\n",
- "\n",
- "`{\"messages\": [{\"role\": \"system\", \"content\": \"instructions\"}, {\"role\": \"user\", \"content\": \"question\"}, {\"role\": \"model\", \"content\": \"answering\"}]}`\n",
- "\n",
- "This is how it maps to the Pandas df columns:\n",
- "\n",
- "* `instructions -> input_text_instruct`\n",
- "* `question -> input_text`\n",
- "* `answer -> output_text`\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {
- "id": "fgPXoXOlc0vI"
- },
- "outputs": [],
- "source": [
- "date = datetime.datetime.now().strftime(\"%H:%d:%m:%Y\")\n",
- "\n",
- "tuning_data_filename = f\"tune_data_stack_overflow_qa-{date}.jsonl\"\n",
- "validation_data_filename = f\"validation_data_stack_overflow_qa-{date}.jsonl\""
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {
- "id": "9-oHmx0wfElN"
- },
- "outputs": [],
- "source": [
- "def format_messages(row):\n",
- " \"\"\"Formats a single row into the desired JSONL structure\"\"\"\n",
- " return {\n",
- " \"messages\": [\n",
- " {\"role\": \"system\", \"content\": row[\"input_text_instruct\"]},\n",
- " {\"role\": \"user\", \"content\": row[\"input_text\"]},\n",
- " {\"role\": \"model\", \"content\": row[\"output_text\"]},\n",
- " ]\n",
- " }"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {
- "id": "8mBwn2jJEkYl"
- },
- "outputs": [],
- "source": [
- "# Apply formatting function to each row, then convert to JSON Lines format\n",
- "tuning_data = train.apply(format_messages, axis=1).to_json(orient=\"records\", lines=True)\n",
- "\n",
- "# Save the result to a JSONL file\n",
- "with open(tuning_data_filename, \"w\") as f:\n",
- " f.write(tuning_data)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "yz9IbouGftaZ"
- },
- "source": [
- "Next you can check if the number of rows match with your Pandas df."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "w4JfgAijikHp"
- },
- "outputs": [],
- "source": [
- "with open(tuning_data_filename) as f:\n",
- " num_rows = sum(1 for _ in f)\n",
- "\n",
- "print(\"Number of rows in the JSONL file:\", num_rows)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "42u53mHQVZk3"
- },
- "source": [
- "You will do the same for the validation dataset."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {
- "id": "nBc6ufE0h2zL"
- },
- "outputs": [],
- "source": [
- "# Apply formatting function to each row, then convert to JSON Lines format\n",
- "validation_data = evaluation.apply(format_messages, axis=1).to_json(\n",
- " orient=\"records\", lines=True\n",
- ")\n",
- "\n",
- "# Save the result to a JSONL file\n",
- "with open(validation_data_filename, \"w\") as f:\n",
- " f.write(validation_data)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "sYGr7h_2ahqb"
- },
- "source": [
- "Next, you will copy the JSONL files into the Google Cloud Storage bucket you specified or created at the beginning of the notebook."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "eq0MYC6nxhKy"
- },
- "outputs": [],
- "source": [
- "!gsutil cp $tuning_data_filename $validation_data_filename $BUCKET_URI"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "cBq0NMIxa2iD"
- },
- "source": [
- "Next you can check if the files are in the bucket."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "cVel0g6pkOiA"
- },
- "outputs": [],
- "source": [
- "!gsutil ls -al $BUCKET_URI"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "hsc0xhNGa7ZQ"
- },
- "source": [
- "Now, you will create two variables for the data.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {
- "id": "tXzEZFjtkTWJ"
- },
- "outputs": [],
- "source": [
- "TUNING_DATA_URI = f\"{BUCKET_URI}/{tuning_data_filename}\"\n",
- "VALIDATION_DATA_URI = f\"{BUCKET_URI}/{validation_data_filename}\""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "ZAOu-xJnA54y"
- },
- "source": [
- "### Create a supervised tuning job using Gemini\n",
- "Now it's time for you to start your tuning job. You will use the `gemini-1.0-pro-002` model."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 19,
- "metadata": {
- "id": "SodJv2vWicfu"
- },
- "outputs": [],
- "source": [
- "foundation_model = GenerativeModel(\"gemini-1.0-pro-002\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "6e7zBH5foZbC"
- },
- "outputs": [],
- "source": [
- "# Tune a model using `train` method.\n",
- "sft_tuning_job = sft.train(\n",
- " source_model=foundation_model,\n",
- " train_dataset=TUNING_DATA_URI,\n",
- " # Optional:\n",
- " validation_dataset=VALIDATION_DATA_URI,\n",
- " epochs=3,\n",
- " learning_rate_multiplier=1.0,\n",
- ")\n",
- "\n",
- "# Get the tuning job info.\n",
- "sft_tuning_job.to_dict()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "-LSm5Ns5gjx-"
- },
- "source": [
- "Lets monitor the state. Wait for the next step to complete. Tuning a model will take some time."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "KgukIEFPlVdD"
- },
- "source": [
- "Next you can retrieve the model resource name."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "Q3yiKi-KofGK"
- },
- "outputs": [],
- "source": [
- "# Get the resource name of the tuning job\n",
- "sft_tuning_job_name = sft_tuning_job.resource_name\n",
- "sft_tuning_job_name"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "SM1RZVqWKRdg"
- },
- "source": [
- "Tuning takes time. Please wait until the job is finished before you continue after the next cell."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "Uyug1dw4FAgn"
- },
- "outputs": [],
- "source": [
- "%%time\n",
- "# Wait for job completion\n",
- "while not sft_tuning_job.refresh().has_ended:\n",
- " time.sleep(60)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "nyDS9G2TTX9p"
- },
- "outputs": [],
- "source": [
- "# tuned model name\n",
- "tuned_model_name = sft_tuning_job.tuned_model_name\n",
- "tuned_model_name"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "iU2AVJdLliUh"
- },
- "source": [
- "And the model endpoint."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "_s57xpI5o9m0"
- },
- "source": [
- "You can use `tuning.TuningJob.list()` to retrieve your tuning jobs."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "8QtT3uJ3Jw0N"
- },
- "outputs": [],
- "source": [
- "sft_tuning_job.list()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "0KQmyjjcJ9uz"
- },
- "source": [
- "You model is automatically deployed as a Vertex AI Endpoint and ready for usage!"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "X9uQD-Ee_h6h"
- },
- "outputs": [],
- "source": [
- "# tuned model endpoint name\n",
- "tuned_model_endpoint_name = sft_tuning_job.tuned_model_endpoint_name\n",
- "tuned_model_endpoint_name"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "IRPlb4ZO8ulD"
- },
- "source": [
- "# Load tuned Generative Model"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "OhfU4wTOtH1y"
- },
- "outputs": [],
- "source": [
- "tuned_model = GenerativeModel(tuned_model_endpoint_name)\n",
- "print(tuned_model)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "1b1b39693d75"
- },
- "source": [
- "Call the API"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "S1q1PT2zJRO9"
- },
- "outputs": [],
- "source": [
- "tuned_model.generate_content(\n",
- " \"How do I store a TensorFlow checkpoint on Google Cloud Storage while training?\"\n",
- ")"
- ]
- }
- ],
- "metadata": {
- "colab": {
- "name": "gemini_supervised_tuning_qa.ipynb",
- "toc_visible": true
- },
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}
diff --git a/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb b/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
index 0b79e28204..27c7faf1a0 100644
--- a/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
+++ b/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
@@ -76,6 +76,8 @@
"\n",
"This notebook serves as a tool to preprocess and estimate token counts for tuning costs for tuning [`gemini-1.0-pro-002`](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning).\n",
"\n",
+ "You can also find the code to check your dataset for Vertex AI Gemini `gemini-1.5-pro-001`.\n",
+ "\n",
"\n",
"For how to prepare dataset for tuning gemini, please refer to this [tutorial](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning-about)."
]
@@ -100,7 +102,7 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 1,
"metadata": {
"id": "tFy3H3aPgx12"
},
@@ -124,7 +126,7 @@
},
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": null,
"metadata": {
"id": "XRvKdaPDTznN"
},
@@ -171,7 +173,7 @@
},
{
"cell_type": "code",
- "execution_count": 1,
+ "execution_count": null,
"metadata": {
"id": "NyKGtVQjgx13"
},
@@ -200,7 +202,7 @@
},
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": null,
"metadata": {
"id": "Nqwi-5ufWp_B"
},
@@ -235,7 +237,7 @@
},
{
"cell_type": "code",
- "execution_count": 3,
+ "execution_count": 1,
"metadata": {
"id": "4498u5KpQijW"
},
@@ -264,7 +266,7 @@
},
{
"cell_type": "code",
- "execution_count": 5,
+ "execution_count": 2,
"metadata": {
"id": "f0JwfuPSSofK"
},
@@ -288,11 +290,25 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 3,
"metadata": {
"id": "PTvunHqRTHqe"
},
- "outputs": [],
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Num training examples: 500\n",
+ "First example:\n",
+ "{'role': 'user', 'content': \"#Person1#: Hi, Mr. Smith. I'm Doctor Hawkins. Why are you here today?\\n#Person2#: I found it would be a good idea to get a check-up.\\n#Person1#: Yes, well, you haven't had one for 5 years. You should have one every year.\\n#Person2#: I know. I figure as long as there is nothing wrong, why go see the doctor?\\n#Person1#: Well, the best way to avoid serious illnesses is to find out about them early. So try to come at least once a year for your own good.\\n#Person2#: Ok.\\n#Person1#: Let me see here. Your eyes and ears look fine. Take a deep breath, please. Do you smoke, Mr. Smith?\\n#Person2#: Yes.\\n#Person1#: Smoking is the leading cause of lung cancer and heart disease, you know. You really should quit.\\n#Person2#: I've tried hundreds of times, but I just can't seem to kick the habit.\\n#Person1#: Well, we have classes and some medications that might help. I'll give you more information before you leave.\\n#Person2#: Ok, thanks doctor.\\n\\nProvide a summary of the article in two or three sentences:\\n\\n\"}\n",
+ "CountTokensResult(total_tokens=277)\n",
+ "{'role': 'model', 'content': \"Mr. Smith's getting a check-up, and Doctor Hawkins advises them to have one every year. Hawkins'll give some information about their classes and medications to help Mr. Smith quit smoking.\"}\n",
+ "CountTokensResult(total_tokens=41)\n",
+ "Num validation examples: 100\n"
+ ]
+ }
+ ],
"source": [
"with tf.io.gfile.GFile(training_dataset_uri) as dataset_jsonl_file:\n",
" example_training_dataset = [\n",
@@ -344,7 +360,7 @@
},
{
"cell_type": "code",
- "execution_count": 9,
+ "execution_count": 10,
"metadata": {
"id": "1mzpB9PUVp5z"
},
@@ -429,7 +445,7 @@
},
{
"cell_type": "code",
- "execution_count": 18,
+ "execution_count": 11,
"metadata": {
"id": "pUCpEmEFM0eX"
},
@@ -466,7 +482,7 @@
},
{
"cell_type": "code",
- "execution_count": 12,
+ "execution_count": null,
"metadata": {
"id": "BgFmhH2XOdzu"
},
@@ -526,7 +542,7 @@
},
{
"cell_type": "code",
- "execution_count": 15,
+ "execution_count": null,
"metadata": {
"id": "vm0Jwzt0RDGd"
},
@@ -658,7 +674,7 @@
},
{
"cell_type": "code",
- "execution_count": 17,
+ "execution_count": null,
"metadata": {
"id": "uOWsUbwVXoTU"
},
@@ -722,7 +738,7 @@
},
{
"cell_type": "code",
- "execution_count": 19,
+ "execution_count": null,
"metadata": {
"id": "DVIpbaGYRJQc"
},
@@ -757,6 +773,181 @@
" f\"By default, you'll be charged for ~{epoch_count * total_number_of_billable_tokens} tokens.\"\n",
")"
]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "k35OK86wJVVd"
+ },
+ "source": [
+ "## Validate the dataset for Vertex AI Gemini 1.5 fine-tuning"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "id": "KQWJhyaXQRNM"
+ },
+ "outputs": [],
+ "source": [
+ "BASE_MODEL = \"gemini-1.5-pro-001\" # @param ['gemini-1.5-pro-001']{type:\"string\"}\n",
+ "training_dataset_uri_2 = \"gs://github-repo/generative-ai/gemini/tuning/train_sft_train_samples.jsonl\" # @param {type:\"string\"}\n",
+ "validation_dataset_uri_2 = \"gs://github-repo/generative-ai/gemini/tuning/val_sft_val_samples.jsonl\" # @param {type:\"string\"}\n",
+ "\n",
+ "tokenizer = get_tokenizer_for_model(BASE_MODEL)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {
+ "id": "p-soJC81YNy2"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "Num training examples: 500\n",
+ "First example:\n",
+ "{'role': 'user', 'parts': [{'text': 'Honesty is usually the best policy. It is disrespectful to lie to someone. If you don\\'t want to date someone, you should say so. Sometimes it is easy to be honest. For example, you might be able to truthfully say, \"No, thank you, I already have a date for that party.\" Other times, you might need to find a kinder way to be nice. Maybe you are not attracted to the person. Instead of bluntly saying that, try saying, \"No, thank you, I just don\\'t think we would be a good fit.\" Avoid making up a phony excuse. For instance, don\\'t tell someone you will be out of town this weekend if you won\\'t be. There\\'s a chance that you might then run into them at the movies, which would definitely cause hurt feelings. A compliment sandwich is a really effective way to provide feedback. Essentially, you \"sandwich\" your negative comment between two positive things. Try using this method when you need to reject someone. An example of a compliment sandwich is to say something such as, \"You\\'re an awesome person. Unfortunately, I\\'m not interested in dating you. Someone else is going to be really lucky to date someone with such a great personality!\" You could also try, \"You are a really nice person. I\\'m only interested you as a friend. I like when we hang out in big groups together!\" Be sincere. If you offer false compliments, the other person will likely be able to tell and feel hurt. If you do not want to date someone, it is best to be upfront about your feelings. Do not beat around the bush. If your mind is made up, it is best to clearly state your response. If someone asks you to date them and you don\\'t want to, you can be direct and kind at the same time. State your answer clearly. You can make your feelings clear without purposefully hurting someone else\\'s feelings. Try smiling and saying, \"That sounds fun, but no thank you. I\\'m not interested in dating you.\" Don\\'t beat around the bush. If you do not want to accept the date, there is no need to say, \"Let me think about it.\" It is best to get the rejection over with. You don\\'t want to give someone false hope. Avoid saying something like, \"Let me check my schedule and get back to you.\" Try to treat the person the way you would want to be treated. This means that you should choose your words carefully. Be thoughtful in your response. It\\'s okay to pause before responding. You might be taken by surprise and need a moment to collect your thoughts. Say thank you. It is a compliment to be asked out. You can say, \"I\\'m flattered. Unfortunately, I can\\'t accept.\" Don\\'t laugh. Many people laugh nervously in awkward situations. Try to avoid giggling, as that is likely to result in hurt feelings. Sometimes it is not what you say, but how you say it. If you need to reject someone, think about factors other than your words. Non-verbal communication matters, too. Use the right tone of voice. Try to sound gentle but firm. Make eye contact. This helps convey that you are being serious, and also shows respect for the other person. If you are in public, try not to speak too loudly. It is not necessary for everyone around you to know that you are turning down a date.\\n\\nProvide a summary of the article in two or three sentences:\\n\\n'}]}\n",
+ "{'role': 'model', 'parts': [{'text': 'Tell the truth. Use a \"compliment sandwich\". Be direct. Treat the person with respect. Communicate effectively.'}]}\n",
+ "Num validation examples: 100\n"
+ ]
+ }
+ ],
+ "source": [
+ "with tf.io.gfile.GFile(training_dataset_uri_2) as dataset_jsonl_file:\n",
+ " example_training_dataset = [\n",
+ " json.loads(dataset_line) for dataset_line in dataset_jsonl_file\n",
+ " ]\n",
+ "\n",
+ "print()\n",
+ "\n",
+ "if validation_dataset_uri:\n",
+ " with tf.io.gfile.GFile(validation_dataset_uri_2) as dataset_jsonl_file:\n",
+ " example_validation_dataset = [\n",
+ " json.loads(dataset_line) for dataset_line in dataset_jsonl_file\n",
+ " ]\n",
+ "\n",
+ "# Initial dataset stats\n",
+ "print(\"Num training examples:\", len(example_training_dataset))\n",
+ "print(\"First example:\")\n",
+ "for message in example_training_dataset[0][\"contents\"]:\n",
+ " print(message)\n",
+ "\n",
+ "if example_validation_dataset:\n",
+ " print(\"Num validation examples:\", len(example_validation_dataset))"
+ ]
+ },
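+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As an optional rough check (a minimal sketch, assuming the variables from the previous cells are in memory), you can also count tokens for this dataset with the local tokenizer created above, mirroring the earlier `gemini-1.0-pro-002` section."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Minimal sketch: count user (input) and model (output) tokens for the\n",
+ "# Gemini 1.5 training set using the local tokenizer created above.\n",
+ "total_input_tokens = 0\n",
+ "total_output_tokens = 0\n",
+ "\n",
+ "for example in example_training_dataset:\n",
+ "    for content in example[\"contents\"]:\n",
+ "        text = \"\".join(part[\"text\"] for part in content[\"parts\"])\n",
+ "        token_count = tokenizer.count_tokens(text).total_tokens\n",
+ "        if content[\"role\"] == \"user\":\n",
+ "            total_input_tokens += token_count\n",
+ "        else:\n",
+ "            total_output_tokens += token_count\n",
+ "\n",
+ "print(\"Total input tokens:\", total_input_tokens)\n",
+ "print(\"Total output tokens:\", total_output_tokens)"
+ ]
+ },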
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {
+ "id": "K0YOqIbtXwGI"
+ },
+ "outputs": [],
+ "source": [
+ "def validate_dataset_format(dataset):\n",
+ " \"\"\"Validates the dataset.\n",
+ "\n",
+ " Args:\n",
+ " dataset_uri: The dataset uri to be validated.\n",
+ " \"\"\"\n",
+ " format_errors = defaultdict(list)\n",
+ " if not dataset or len(dataset) == 0:\n",
+ " print(\"Input dataset file is empty or inaccessible.\")\n",
+ " return\n",
+ "\n",
+ " for row_idx, example in enumerate(dataset):\n",
+ " # Verify presence of contents list\n",
+ " if not isinstance(example, dict):\n",
+ " format_errors[\"invalid_input\"].append(row_idx)\n",
+ " continue\n",
+ " contents = example.get(\"contents\", None)\n",
+ " if not contents:\n",
+ " format_errors[\"missing_contents_list\"].append(row_idx)\n",
+ " continue\n",
+ " try:\n",
+ " validate_contents(contents, format_errors, row_idx)\n",
+ " except (TypeError, AttributeError, KeyError) as e:\n",
+ " print(\"Invalid input during validation: %s\", e)\n",
+ " format_errors[\"invalid_input\"].append(row_idx)\n",
+ "\n",
+ " if format_errors:\n",
+ " print(\"Found errors for this dataset:\")\n",
+ " for k, v in format_errors.items():\n",
+ " print(f\"{k}: {v}\")\n",
+ " else:\n",
+ " print(\"No errors found for this dataset.\")\n",
+ "\n",
+ "\n",
+ "def validate_contents(contents, format_errors, row_index):\n",
+ " \"\"\"Validates contents list format.\"\"\"\n",
+ " if not isinstance(contents, list):\n",
+ " format_errors[\"invalid_contents_list\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " prev_role = None\n",
+ " for content_item in contents:\n",
+ " if not isinstance(content_item, dict):\n",
+ " format_errors[\"invalid_content_item\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " if \"role\" not in content_item or \"parts\" not in content_item:\n",
+ " format_errors[\"content_item_missing_key\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " if content_item.get(\"role\", \"\").lower() not in (\"user\", \"model\"):\n",
+ " format_errors[\"unrecognized_role\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " parts = content_item.get(\"parts\", None)\n",
+ " if not parts or not isinstance(parts, list):\n",
+ " format_errors[\"missing_or_invalid_parts\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " for part in parts:\n",
+ " if not isinstance(part, dict) or \"text\" not in part:\n",
+ " format_errors[\"invalid_part\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " if not part.get(\"text\"):\n",
+ " format_errors[\"missing_text\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " role = content_item.get(\"role\", \"\").lower()\n",
+ " # messages to have alternate turns.\n",
+ " if role == prev_role:\n",
+ " format_errors[\"consecutive_turns\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " prev_role = role"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {
+ "id": "RppMRP9lIkq2"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "No errors found for this dataset.\n",
+ "No errors found for this dataset.\n"
+ ]
+ }
+ ],
+ "source": [
+ "validate_dataset_format(example_training_dataset)\n",
+ "if example_validation_dataset:\n",
+ " validate_dataset_format(example_validation_dataset)"
+ ]
}
],
"metadata": {
From 0af924323208d59df6a64e08d0005c7bc345692a Mon Sep 17 00:00:00 2001
From: Kristopher Overholt
Date: Tue, 24 Sep 2024 11:08:04 -0500
Subject: [PATCH 09/76] feat: Add sample notebook for multimodal function
calling in Gemini (#1162)
# Description
This PR adds a sample notebook that uses Gemini multimodal function
calling to process images, PDFs, audio, and video inputs.
- [X] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [X] You are listed as the author in your notebook or README file.
- [X] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [X] Make your Pull Request title in the
specification.
- [X] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [X] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
---
.github/actions/spelling/allow.txt | 6 +
.../multimodal_function_calling.ipynb | 1556 +++++++++++++++++
2 files changed, 1562 insertions(+)
create mode 100644 gemini/function-calling/multimodal_function_calling.ipynb
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 199219b230..3c4bb080ba 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -1,5 +1,6 @@
AFX
AIP
+AMNOSH
ANZ
APIENTRY
APSTUDIO
@@ -13,6 +14,7 @@ Ashish
Aspeed
Autechre
Autorater
+BIKBEAR
BITCODE
Benno
Bettes
@@ -27,6 +29,8 @@ CHECKOV
COCOAPODS
COINIT
CONOUT
+CUCUONAR
+CWLEY
CZE
Caprese
Codelab
@@ -130,6 +134,7 @@ LRESULT
LSTATUS
LSum
LTRB
+LUVBPTK
Ladhak
Lego
Llion
@@ -298,6 +303,7 @@ bitcoin
bpa
bqml
carbonara
+caudatus
caxis
cfbundle
chatbots
diff --git a/gemini/function-calling/multimodal_function_calling.ipynb b/gemini/function-calling/multimodal_function_calling.ipynb
new file mode 100644
index 0000000000..0c7759da95
--- /dev/null
+++ b/gemini/function-calling/multimodal_function_calling.ipynb
@@ -0,0 +1,1556 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ijGzTHJJUCPY"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "VEqbX8OhE8y9"
+ },
+ "source": [
+ "# Multimodal Function Calling with the Gemini API & Python SDK\n",
+ "\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84e7e432e6ff"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "|Author(s) | [Kristopher Overholt](https://github.com/koverholt) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "CkHPv2myT2cx"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "### Introduction to Multimodal Function Calling with Gemini\n",
+ "\n",
+ "This notebook demonstrates a powerful [Function Calling](https://cloud.google.com/vertex-ai/docs/generative-ai/multimodal/function-calling) capability of the Gemini model: support for multimodal inputs. With multimodal function calling, you can go beyond traditional text inputs, enabling Gemini to understand your intent and predict function calls and function parameters based on various inputs like images, audio, video, and PDFs. Function calling can also be referred to as *function calling with controlled generation*, which guarantees that output generated by the model always adheres to a specific schema so that you receive consistently formatted responses.\n",
+ "\n",
+ "You can To specify the structure of a model's output, define a response schema, which works like a blueprint for model responses. When you submit a prompt and include the response schema, the model's response always follows your defined schema.\n",
+ "\n",
+ "Previously, implementing multimodal function calling required two separate calls to the Gemini API: one to extract information from media, and another to generate a function call based on the extracted text. This process was cumbersome, prone to errors, and resulted in the loss of detail in valuable contextual information. Gemini's multimodal function calling capability streamlines this workflow, enabling a single API call that efficiently processes multimodal inputs for accurate function predictions and structured outputs. \n",
+ "\n",
+ "### How It Works\n",
+ "\n",
+ "1. **Define Functions and Tools:** Describe your functions, then group them into `Tool` objects for Gemini to use.\n",
+ "2. **Send Inputs and Prompt:** Provide Gemini with multimodal input (image, audio, PDF, etc.) and a prompt describing your request.\n",
+ "3. **Gemini Predicts Action:** Gemini analyzes the multimodal input and prompt to predict the best function to call and its parameters.\n",
+ "4. **Execute and Return:** Use Gemini's prediction to make API calls, then send the results back to Gemini.\n",
+ "5. **Generate Response:** Gemini uses the API results to provide a final, natural language response to the user. \n",
+ "\n",
+ "This notebook will guide you through practical examples of using Gemini's multimodal function calling to build intelligent applications that go beyond the limitations of text-only interactions. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DrkcqHrrwMAo"
+ },
+ "source": [
+ "### Objectives\n",
+ "\n",
+ "In this tutorial, you will learn how to use the Vertex AI Gemini API with the Vertex AI SDK for Python to make function calls with multimodal inputs, using the Gemini 1.5 Pro (`gemini-1.5-pro`) model. You'll explore how Gemini can process and understand various input types — including images, video, audio, and PDFs — to predict and execute functions.\n",
+ "\n",
+ "You will complete the following tasks:\n",
+ "\n",
+ "- Install the Vertex AI SDK for Python.\n",
+ "- Define functions that can be called by Gemini.\n",
+ "- Package functions into tools.\n",
+ "- Send multimodal inputs (images, video, audio, PDFs) and prompts to Gemini.\n",
+ "- Extract predicted function calls and their parameters from Gemini's response.\n",
+ "- Use the predicted output to make API calls to external systems (demonstrated with an image input example). \n",
+ "- Return API responses to Gemini for natural language response generation (demonstrated with an image input example). "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "C9nEPojogw-g"
+ },
+ "source": [
+ "### Costs\n",
+ "\n",
+ "This tutorial uses billable components of Google Cloud:\n",
+ "\n",
+ "- Vertex AI\n",
+ "\n",
+ "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "r11Gu7qNgx1p"
+ },
+ "source": [
+ "## Getting Started\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK for Python\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "# !pip3 install --upgrade --user --quiet google-cloud-aiplatform wikipedia"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart current runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which will restart the current kernel."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "# # Restart kernel after installs so that your environment can access the new packages\n",
+ "# import IPython\n",
+ "\n",
+ "# app = IPython.Application.instance()\n",
+ "# app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you are running this notebook on Google Colab, run the following cell to authenticate your environment. This step is not required if you are using [Vertex AI Workbench](https://cloud.google.com/vertex-ai-workbench)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
+ "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
+ "\n",
+ "import vertexai\n",
+ "\n",
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2a7225e4390a"
+ },
+ "source": [
+ "## Multimodal Function Calling in Action"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "jXHfaVS66_01"
+ },
+ "source": [
+ "### Import libraries\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "id": "lslYAvw37JGQ"
+ },
+ "outputs": [],
+ "source": [
+ "from IPython.display import Markdown, display\n",
+ "from vertexai.generative_models import (\n",
+ " Content,\n",
+ " FunctionDeclaration,\n",
+ " GenerationConfig,\n",
+ " GenerativeModel,\n",
+ " Part,\n",
+ " Tool,\n",
+ ")\n",
+ "import wikipedia"
+ ]
+ },
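+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Before diving into multimodal function calling, here is a minimal, illustrative sketch of the controlled generation idea mentioned in the overview: a response schema passed through `GenerationConfig` so that the output always follows the declared structure. The schema and prompt below are assumptions made for illustration only; the remaining cells use function declarations instead."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Illustrative sketch of controlled generation with a response schema.\n",
+ "# The schema and prompt are illustrative assumptions; later cells use function calling.\n",
+ "# Relies on GenerativeModel and GenerationConfig imported in the cell above.\n",
+ "habitat_schema = {\n",
+ "    \"type\": \"OBJECT\",\n",
+ "    \"properties\": {\n",
+ "        \"animal\": {\"type\": \"STRING\"},\n",
+ "        \"region\": {\"type\": \"STRING\"},\n",
+ "    },\n",
+ "}\n",
+ "\n",
+ "structured_model = GenerativeModel(\"gemini-1.5-pro-001\")\n",
+ "structured_config = GenerationConfig(\n",
+ "    response_mime_type=\"application/json\",\n",
+ "    response_schema=habitat_schema,\n",
+ ")\n",
+ "\n",
+ "structured_response = structured_model.generate_content(\n",
+ "    \"Name one animal and the region where it typically lives.\",\n",
+ "    generation_config=structured_config,\n",
+ ")\n",
+ "\n",
+ "print(structured_response.text)  # JSON that conforms to habitat_schema"
+ ]
+ },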
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "aa432a6e021a"
+ },
+ "source": [
+ "### Image-Based Function Calling: Finding Animal Habitats\n",
+ "\n",
+ "In this example, you'll send along an image of a bird and ask Gemini to identify its habitat. This involves defining a function that looks up regions where a given animal is found, creating a tool that uses this function, and then sending a request to Gemini.\n",
+ "\n",
+ "First, you define a `FunctionDeclaration` called `get_wildlife_region`. This function takes the name of an animal species as input and returns information about its typical region."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "id": "ae36049d4512"
+ },
+ "outputs": [],
+ "source": [
+ "get_wildlife_region = FunctionDeclaration(\n",
+ " name=\"get_wildlife_region\",\n",
+ " description=\"Look up the region where an animal can be found\",\n",
+ " parameters={\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\n",
+ " \"animal\": {\"type\": \"string\", \"description\": \"Species of animal\"}\n",
+ " },\n",
+ " },\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0933d807af15"
+ },
+ "source": [
+ "Next, you create a `Tool` object that includes your `get_wildlife_region` function. Tools help group related functions that Gemini can use:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "id": "c4218e572dc8"
+ },
+ "outputs": [],
+ "source": [
+ "image_tool = Tool(\n",
+ " function_declarations=[\n",
+ " get_wildlife_region,\n",
+ " ],\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "28abd9546c48"
+ },
+ "source": [
+ "Now you're ready to send a request to Gemini. Initialize the `GenerativeModel` and specify the image to analyze, along with a prompt. The `tools` argument tells Gemini to consider the functions in your `image_tool`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {
+ "id": "38b842d71bce"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "name: \"get_wildlife_region\"\n",
+ "args {\n",
+ " fields {\n",
+ " key: \"animal\"\n",
+ " value {\n",
+ " string_value: \"Lilac-breasted Roller\"\n",
+ " }\n",
+ " }\n",
+ "}"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "model = GenerativeModel(\"gemini-1.5-pro-001\")\n",
+ "generation_config = GenerationConfig(temperature=0)\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " [\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/multi-color-bird.jpg\",\n",
+ " mime_type=\"image/jpeg\",\n",
+ " ),\n",
+ " \"What is the typical habitat or region where this animal lives?\",\n",
+ " ],\n",
+ " generation_config=generation_config,\n",
+ " tools=[image_tool],\n",
+ ")\n",
+ "response_function_call = response.candidates[0].content\n",
+ "response.candidates[0].function_calls[0]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "065787dbaa26"
+ },
+ "source": [
+ "Let's examine the response from Gemini. You can extract the predicted function name:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {
+ "id": "e2b92a75e5b9"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'get_wildlife_region'"
+ ]
+ },
+ "execution_count": 9,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "function_name = response.candidates[0].function_calls[0].name\n",
+ "function_name"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4a6ba2cf6937"
+ },
+ "source": [
+ "You can also get the arguments that Gemini predicted for the function call:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "id": "c89f16d5082e"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{'animal': 'Lilac-breasted Roller'}"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "function_args = {\n",
+ " key: value for key, value in response.candidates[0].function_calls[0].args.items()\n",
+ "}\n",
+ "function_args"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "180ef53579a0"
+ },
+ "source": [
+ "Now, you'll call an external API (in this case, using the `wikipedia` Python package) using the animal name that Gemini extracted from the image:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "id": "07eed3ae7aa3"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'The lilac-breasted roller (Coracias caudatus) is an African bird of the roller family, Coraciidae. It is widely distributed in Southern and Eastern Africa, and is a vagrant to the southern Arabian Peninsula. It prefers open woodland and savanna, and it is for the most part absent from treeless places. Usually found alone or in pairs, it perches at the tops of trees, poles or other high vantage points from where it can spot insects, amphibians and small birds moving about on the ground. Nesting t'"
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "api_response = wikipedia.page(function_args[\"animal\"]).content\n",
+ "api_response[:500]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "f238cad25c36"
+ },
+ "source": [
+ "Finally, you return the API response to Gemini so it can generate a final answer in natural language:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "id": "02ee532ce187"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/markdown": [
+ "The lilac-breasted roller (Coracias caudatus) is an African bird. It is widely distributed in Southern and Eastern Africa. \n"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "response = model.generate_content(\n",
+ " [\n",
+ " Content(\n",
+ " role=\"user\",\n",
+ " parts=[\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/multi-color-bird.jpg\",\n",
+ " mime_type=\"image/jpeg\",\n",
+ " ),\n",
+ " Part.from_text(\n",
+ " \"Inspect the image and get the regions where this animal can be found\",\n",
+ " ),\n",
+ " ],\n",
+ " ),\n",
+ " response_function_call, # Function call response\n",
+ " Content(\n",
+ " parts=[\n",
+ " Part.from_function_response(\n",
+ " name=function_name,\n",
+ " response={\n",
+ " \"content\": api_response, # Return the API response to the Gemini model\n",
+ " },\n",
+ " )\n",
+ " ],\n",
+ " ),\n",
+ " ],\n",
+ " tools=[image_tool],\n",
+ ")\n",
+ "\n",
+ "display(Markdown(response.text))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6e5f13d0e644"
+ },
+ "source": [
+ "This example showcases how Gemini's multimodal function calling processes an image, predicts a relevant function and its parameters, and integrates with external APIs to provide comprehensive user information. This process opens up exciting possibilities for building intelligent applications that can \"see\" and understand the world around them via API calls to Gemini."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6036faa1fb70"
+ },
+ "source": [
+ "### Video-Based Function Calling: Identifying Product Features"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4dd489a96132"
+ },
+ "source": [
+ "Now let's explore how Gemini can extract information from videos for the purpose of invoking a function call. You'll use a video showcasing multiple products and ask Gemini to identify its key features.\n",
+ "\n",
+ "Start by defining a function called `get_feature_info` that takes a list of product features as input and could potentially be used to retrieve additional details about those features:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "id": "41d1ed66b8b3"
+ },
+ "outputs": [],
+ "source": [
+ "get_feature_info = FunctionDeclaration(\n",
+ " name=\"get_feature_info\",\n",
+ " description=\"Get additional information about a product feature\",\n",
+ " parameters={\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\n",
+ " \"features\": {\n",
+ " \"type\": \"array\",\n",
+ " \"description\": \"A list of product features\",\n",
+ " \"items\": {\"type\": \"string\", \"description\": \"Product feature\"},\n",
+ " }\n",
+ " },\n",
+ " },\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "b972769f37e6"
+ },
+ "source": [
+ "Next, create a tool that includes your `get_feature_info` function:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {
+ "id": "f134fc04e6bb"
+ },
+ "outputs": [],
+ "source": [
+ "video_tool = Tool(\n",
+ " function_declarations=[\n",
+ " get_feature_info,\n",
+ " ],\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "c16c497b85d3"
+ },
+ "source": [
+ "Send a video to Gemini, along with a prompt asking for information about the product features, making sure to include your `video_tool` in the `tools` kwarg:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {
+ "id": "09fbe282c3d3"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[name: \"get_feature_info\"\n",
+ " args {\n",
+ " fields {\n",
+ " key: \"features\"\n",
+ " value {\n",
+ " list_value {\n",
+ " values {\n",
+ " string_value: \"Gemini Live\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Made You Look\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Add me\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Magic Editor\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Pixel 9 Pro Fold\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Pixel Buds Pro 2\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Call Notes\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Pixel Screenshots\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Pixel Studio\"\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " }]"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "model = GenerativeModel(\"gemini-1.5-pro-001\")\n",
+ "generation_config = GenerationConfig(temperature=0)\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " [\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/made-by-google-24.mp4\",\n",
+ " mime_type=\"video/mp4\",\n",
+ " ),\n",
+ " \"Inspect the video and get information about the product features shown\",\n",
+ " ],\n",
+ " generation_config=generation_config,\n",
+ " tools=[video_tool],\n",
+ ")\n",
+ "response.candidates[0].function_calls"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4115fd61850b"
+ },
+ "source": [
+ "Gemini correctly predicted the `get_feature_info` function:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {
+ "id": "1ae3bb7a4847"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'get_feature_info'"
+ ]
+ },
+ "execution_count": 16,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "function_name = response.candidates[0].function_calls[0].name\n",
+ "function_name"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "c17668290dd0"
+ },
+ "source": [
+ "And you can see the list of product features that Gemini extracted from the video, which are available as structured function arguments that adhere to the JSON schema we defined in the `FunctionDeclaration`:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {
+ "id": "992c59809c7b"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{'features': ['Gemini Live', 'Made You Look', 'Add me', 'Magic Editor', 'Pixel 9 Pro Fold', 'Pixel Buds Pro 2', 'Call Notes', 'Pixel Screenshots', 'Pixel Studio']}"
+ ]
+ },
+ "execution_count": 17,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "function_args = {\n",
+ " key: value for key, value in response.candidates[0].function_calls[0].args.items()\n",
+ "}\n",
+ "function_args"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "bd8dd10f1219"
+ },
+ "source": [
+ "This example demonstrates Gemini's ability to understand video content. By defining a relevant function, you can use Gemini to extract structured information from videos and perform further actions based on that information.\n",
+ "\n",
+ "Now that the multimodal function call response is complete, you could use the function name and function arguments to call an external API using any REST API or client library of your choice, similar to how we did in the previous example with the `wikipedia` Python package.\n",
+ "\n",
+ "Since this sample notebook is focused on the mechanics of multimodal function calling rather than the subsequent function calls and API calls, we'll move on to another example with different multimodal inputs. You can refer to other sample notebooks on Gemini Function Calling for more details on where to go from here."
+ ]
+ },
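+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If you did want to complete the loop for this video example, a minimal sketch could look like the following. Note that `call_feature_api` is a hypothetical stand-in for a real REST or client library call; only the dispatch pattern is the point:\n",
+ "\n",
+ "```python\n",
+ "# Hypothetical handler standing in for a real product catalog API.\n",
+ "def call_feature_api(features: list[str]) -> dict:\n",
+ "    # In practice, replace this with a REST call or client library request.\n",
+ "    return {feature: \"Feature details would be fetched here.\" for feature in features}\n",
+ "\n",
+ "\n",
+ "api_response = call_feature_api(function_args[\"features\"])\n",
+ "\n",
+ "# The result could then be returned to Gemini with Part.from_function_response,\n",
+ "# following the same pattern as the image example above.\n",
+ "function_response_part = Part.from_function_response(\n",
+ "    name=function_name,\n",
+ "    response={\"content\": api_response},\n",
+ ")\n",
+ "```"
+ ]
+ },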
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "41a08eda3be8"
+ },
+ "source": [
+ "### Audio-Based Function Calling: Generating Book Recommendations"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a94cd078b43c"
+ },
+ "source": [
+ "In this example, you'll explore using audio input with Gemini's multimodal function calling. You'll send a podcast episode to Gemini and ask for book recommendations related to the topics discussed.\n",
+ "\n",
+ "Define a function called `get_recommended_books` that takes a list of topics as input and (hypothetically) returns relevant book recommendations:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {
+ "id": "9714025043bd"
+ },
+ "outputs": [],
+ "source": [
+ "get_recommended_books = FunctionDeclaration(\n",
+ " name=\"get_recommended_books\",\n",
+ " description=\"Get recommended books based on a list of topics\",\n",
+ " parameters={\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\n",
+ " \"topics\": {\n",
+ " \"type\": \"array\",\n",
+ " \"description\": \"A list of topics\",\n",
+ " \"items\": {\"type\": \"string\", \"description\": \"Topic\"},\n",
+ " },\n",
+ " },\n",
+ " },\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "f23465d0938f"
+ },
+ "source": [
+ "Now create a tool that includes your newly defined function:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {
+ "id": "d61600788e03"
+ },
+ "outputs": [],
+ "source": [
+ "audio_tool = Tool(\n",
+ " function_declarations=[\n",
+ " get_recommended_books,\n",
+ " ],\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "560afbd15a17"
+ },
+ "source": [
+ "Provide Gemini with the audio file and a prompt to recommend books based on the podcast content:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {
+ "id": "47228e6631a3"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[name: \"get_recommended_books\"\n",
+ " args {\n",
+ " fields {\n",
+ " key: \"topics\"\n",
+ " value {\n",
+ " list_value {\n",
+ " values {\n",
+ " string_value: \"Software Reliability\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Site Reliability Engineering\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"DevOps\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"Cloud Computing\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"System Thinking\"\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " }]"
+ ]
+ },
+ "execution_count": 20,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "model = GenerativeModel(\"gemini-1.5-pro-001\")\n",
+ "generation_config = GenerationConfig(temperature=0)\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " [\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/google-cloud-sre-podcast-s2-e8.mp3\",\n",
+ " mime_type=\"audio/mpeg\",\n",
+ " ),\n",
+ " \"Inspect the audio file and generate a list of recommended books based on the topics discussed\",\n",
+ " ],\n",
+ " generation_config=generation_config,\n",
+ " tools=[audio_tool],\n",
+ ")\n",
+ "response.candidates[0].function_calls"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "db9b85de9752"
+ },
+ "source": [
+ "You can see that Gemini has successfully predicted your `get_recommended_books` function:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {
+ "id": "eabef4d9faf4"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'get_recommended_books'"
+ ]
+ },
+ "execution_count": 21,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "function_name = response.candidates[0].function_calls[0].name\n",
+ "function_name"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ea00f52eb487"
+ },
+ "source": [
+ "And the function arguments contain the list of topics that Gemini identified and extracted from the input audio file:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "metadata": {
+ "id": "8c8f32e930c9"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{'topics': ['Software Reliability', 'Site Reliability Engineering', 'DevOps', 'Cloud Computing', 'System Thinking']}"
+ ]
+ },
+ "execution_count": 22,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "function_args = {\n",
+ " key: value for key, value in response.candidates[0].function_calls[0].args.items()\n",
+ "}\n",
+ "function_args"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "8acd15dd7cec"
+ },
+ "source": [
+ "This example highlights Gemini's capacity to understand and extract information from audio, enabling you to create applications that respond to spoken content or audio-based interactions."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "93577f2d2fe1"
+ },
+ "source": [
+ "### PDF-Based Function Calling: Extracting Company Data from Invoices"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "924cf8d1c711"
+ },
+ "source": [
+ "This example demonstrates how to use Gemini's multimodal function calling to process PDF documents. You'll work with a set of invoices and extract the names of the (fictitious) companies involved.\n",
+ "\n",
+ "Define a function called `get_company_information` that (in a real-world scenario) could be used to fetch details about a given list of companies:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "metadata": {
+ "id": "ba57e626e9d2"
+ },
+ "outputs": [],
+ "source": [
+ "get_company_information = FunctionDeclaration(\n",
+ " name=\"get_company_information\",\n",
+ " description=\"Get information about a list of companies\",\n",
+ " parameters={\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\n",
+ " \"companies\": {\n",
+ " \"type\": \"array\",\n",
+ " \"description\": \"A list of companies\",\n",
+ " \"items\": {\"type\": \"string\", \"description\": \"Company name\"},\n",
+ " }\n",
+ " },\n",
+ " },\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "fae1c7a7d8a9"
+ },
+ "source": [
+ "Package your newly defined function into a tool:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "metadata": {
+ "id": "a62ed01019f0"
+ },
+ "outputs": [],
+ "source": [
+ "invoice_tool = Tool(\n",
+ " function_declarations=[\n",
+ " get_company_information,\n",
+ " ],\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "91dcbcbb0f50"
+ },
+ "source": [
+ "Now you can provide Gemini with multiple PDF invoices and ask it to get company information:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {
+ "id": "e509abf4d73a"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[name: \"get_company_information\"\n",
+ " args {\n",
+ " fields {\n",
+ " key: \"companies\"\n",
+ " value {\n",
+ " list_value {\n",
+ " values {\n",
+ " string_value: \"AMNOSH\\\\nSUPPLIERS\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"BIKBEAR\\\\nLAW FIRM\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"CUCUONAR\\\\nAGENCY\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"CWLEY PLC\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"LUVBPTK Company\"\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " }]"
+ ]
+ },
+ "execution_count": 25,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "model = GenerativeModel(\"gemini-1.5-pro-001\")\n",
+ "generation_config = GenerationConfig(temperature=0)\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " [\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/invoice-synthetic-1.pdf\",\n",
+ " mime_type=\"application/pdf\",\n",
+ " ),\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/invoice-synthetic-2.pdf\",\n",
+ " mime_type=\"application/pdf\",\n",
+ " ),\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/invoice-synthetic-3.pdf\",\n",
+ " mime_type=\"application/pdf\",\n",
+ " ),\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/invoice-synthetic-4.pdf\",\n",
+ " mime_type=\"application/pdf\",\n",
+ " ),\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/invoice-synthetic-5.pdf\",\n",
+ " mime_type=\"application/pdf\",\n",
+ " ),\n",
+ " \"Inspect the PDF files of invoices and retrieve information about each company\",\n",
+ " ],\n",
+ " generation_config=generation_config,\n",
+ " tools=[invoice_tool],\n",
+ ")\n",
+ "response.candidates[0].function_calls"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "974a138f3c6c"
+ },
+ "source": [
+ "As expected, Gemini predicted the `get_company_information` function:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "metadata": {
+ "id": "766fdbafed76"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'get_company_information'"
+ ]
+ },
+ "execution_count": 26,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "function_name = response.candidates[0].function_calls[0].name\n",
+ "function_name"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "c80e9f280f5c"
+ },
+ "source": [
+ "The function arguments contain the list of company names extracted from the PDF invoices:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "metadata": {
+ "id": "9fa7a22d85b2"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{'companies': ['AMNOSH\\\\nSUPPLIERS', 'BIKBEAR\\\\nLAW FIRM', 'CUCUONAR\\\\nAGENCY', 'CWLEY PLC', 'LUVBPTK Company']}"
+ ]
+ },
+ "execution_count": 27,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "function_args = {\n",
+ " key: value for key, value in response.candidates[0].function_calls[0].args.items()\n",
+ "}\n",
+ "function_args"
+ ]
+ },
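+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The extracted names include literal `\\n` sequences left over from the invoice layout. Before passing them to a downstream API you might normalize them; a small, hypothetical cleanup step (not part of the function calling flow itself) could look like this:\n",
+ "\n",
+ "```python\n",
+ "# Replace the literal backslash-n sequences and collapse repeated whitespace\n",
+ "# in the extracted company names before using them downstream.\n",
+ "cleaned_companies = [\n",
+ "    \" \".join(name.replace(\"\\\\n\", \" \").split()) for name in function_args[\"companies\"]\n",
+ "]\n",
+ "cleaned_companies\n",
+ "```"
+ ]
+ },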
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "03e710b9d195"
+ },
+ "source": [
+ "This example shows the power of Gemini for processing and extracting structured data from documents, a common requirement in many real-world applications."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "77f53d886376"
+ },
+ "source": [
+ "### Image-Based Chat: Building a Multimodal Chatbot"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "d145dc63a74a"
+ },
+ "source": [
+ "Let's put it all together and build a simple multimodal chatbot. This chatbot will understand image inputs and respond to questions using the functions you define.\n",
+ "\n",
+ "First, define three functions: `get_animal_details`, `get_location_details`, and `check_color_palette`. These functions represent the capabilities of your chatbot and could potentially be used to retrieve additional details using REST API calls:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "metadata": {
+ "id": "1618b25ad3e0"
+ },
+ "outputs": [],
+ "source": [
+ "get_animal_details = FunctionDeclaration(\n",
+ " name=\"get_animal_details\",\n",
+ " description=\"Look up information about a given animal species\",\n",
+ " parameters={\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\n",
+ " \"animal\": {\"type\": \"string\", \"description\": \"Species of animal\"}\n",
+ " },\n",
+ " },\n",
+ ")\n",
+ "\n",
+ "get_location_details = FunctionDeclaration(\n",
+ " name=\"get_location_details\",\n",
+ " description=\"Look up information about a given location\",\n",
+ " parameters={\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\"location\": {\"type\": \"string\", \"description\": \"Location\"}},\n",
+ " },\n",
+ ")\n",
+ "\n",
+ "check_color_palette = FunctionDeclaration(\n",
+ " name=\"check_color_palette\",\n",
+ " description=\"Check hex color codes for accessibility\",\n",
+ " parameters={\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\n",
+ " \"colors\": {\n",
+ " \"type\": \"array\",\n",
+ " \"description\": \"A list of colors in hexadecimal format\",\n",
+ " \"items\": {\n",
+ " \"type\": \"string\",\n",
+ " \"description\": \"Hexadecimal representation of color, as in #355E3B\",\n",
+ " },\n",
+ " }\n",
+ " },\n",
+ " },\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ca63d74adeba"
+ },
+ "source": [
+ "Group your functions into a tool:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 29,
+ "metadata": {
+ "id": "178ce7754626"
+ },
+ "outputs": [],
+ "source": [
+ "chat_tool = Tool(\n",
+ " function_declarations=[\n",
+ " get_animal_details,\n",
+ " get_location_details,\n",
+ " check_color_palette,\n",
+ " ],\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0eb1e7629b9e"
+ },
+ "source": [
+ "Initialize the `GenerativeModel` and start a chat session with Gemini, providing it with your `chat_tool`:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "metadata": {
+ "id": "ac1ebff348c9"
+ },
+ "outputs": [],
+ "source": [
+ "model = GenerativeModel(\n",
+ " \"gemini-1.5-pro-001\",\n",
+ " generation_config=GenerationConfig(temperature=0),\n",
+ " tools=[chat_tool],\n",
+ ")\n",
+ "\n",
+ "chat = model.start_chat()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "05bb7db4be62"
+ },
+ "source": [
+ "Send an image of a fox, along with a simple prompt:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 31,
+ "metadata": {
+ "id": "c3e47a96df7e"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[name: \"get_animal_details\"\n",
+ " args {\n",
+ " fields {\n",
+ " key: \"animal\"\n",
+ " value {\n",
+ " string_value: \"fox\"\n",
+ " }\n",
+ " }\n",
+ " }]"
+ ]
+ },
+ "execution_count": 31,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "response = chat.send_message(\n",
+ " [\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/baby-fox.jpg\",\n",
+ " mime_type=\"image/jpeg\",\n",
+ " ),\n",
+ " \"Tell me about this animal\",\n",
+ " ]\n",
+ ")\n",
+ "\n",
+ "response.candidates[0].function_calls"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "38c96a599b94"
+ },
+ "source": [
+ "Now ask about the location details in the image:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 32,
+ "metadata": {
+ "id": "153f7b93eb65"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[name: \"get_location_details\"\n",
+ " args {\n",
+ " fields {\n",
+ " key: \"location\"\n",
+ " value {\n",
+ " string_value: \"meadow\"\n",
+ " }\n",
+ " }\n",
+ " }]"
+ ]
+ },
+ "execution_count": 32,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "response = chat.send_message(\n",
+ " [\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/baby-fox.jpg\",\n",
+ " mime_type=\"image/jpeg\",\n",
+ " ),\n",
+ " \"Tell me details about this location\",\n",
+ " ]\n",
+ ")\n",
+ "\n",
+ "response.candidates[0].function_calls"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "db6363659da8"
+ },
+ "source": [
+ "And finally, ask for a color palette based the image:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 33,
+ "metadata": {
+ "id": "af519b9c7bc5"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[name: \"check_color_palette\"\n",
+ " args {\n",
+ " fields {\n",
+ " key: \"colors\"\n",
+ " value {\n",
+ " list_value {\n",
+ " values {\n",
+ " string_value: \"#A85C4C\"\n",
+ " }\n",
+ " values {\n",
+ " string_value: \"#4C8A6C\"\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " }]"
+ ]
+ },
+ "execution_count": 33,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "response = chat.send_message(\n",
+ " [\n",
+ " Part.from_uri(\n",
+ " \"gs://github-repo/generative-ai/gemini/function-calling/baby-fox.jpg\",\n",
+ " mime_type=\"image/jpeg\",\n",
+ " ),\n",
+ " \"Get the color palette of this image and check it for accessibility\",\n",
+ " ]\n",
+ ")\n",
+ "\n",
+ "response.candidates[0].function_calls"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4e38eb5eeb5b"
+ },
+ "source": [
+ "While this chatbot doesn't actually execute the predicted functions, it demonstrates creating an interactive experience using multimodal inputs and function calling in a chat format. You can extend this example by implementing REST API calls or client library requests for each function to create a truly functional and engaging multimodal chatbot that's connected to the real world."
+ ]
+ },
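+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a sketch of that extension, you could map each function name to a handler and feed the handler's result back into the chat session. The three `handle_*` functions below are hypothetical stand-ins for real REST or client library calls; only the dispatch pattern matters:\n",
+ "\n",
+ "```python\n",
+ "# Hypothetical handlers; each would wrap a real REST or client library call.\n",
+ "def handle_get_animal_details(args: dict) -> dict:\n",
+ "    return {\"summary\": f\"Details about {args['animal']} would be fetched here.\"}\n",
+ "\n",
+ "\n",
+ "def handle_get_location_details(args: dict) -> dict:\n",
+ "    return {\"summary\": f\"Details about {args['location']} would be fetched here.\"}\n",
+ "\n",
+ "\n",
+ "def handle_check_color_palette(args: dict) -> dict:\n",
+ "    return {\"summary\": f\"Accessibility results for {args['colors']} would go here.\"}\n",
+ "\n",
+ "\n",
+ "handlers = {\n",
+ "    \"get_animal_details\": handle_get_animal_details,\n",
+ "    \"get_location_details\": handle_get_location_details,\n",
+ "    \"check_color_palette\": handle_check_color_palette,\n",
+ "}\n",
+ "\n",
+ "# Dispatch the most recent predicted call to its handler.\n",
+ "predicted_call = response.candidates[0].function_calls[0]\n",
+ "args = {key: value for key, value in predicted_call.args.items()}\n",
+ "api_response = handlers[predicted_call.name](args)\n",
+ "\n",
+ "# Return the result to the chat session so Gemini can answer in natural language.\n",
+ "final_response = chat.send_message(\n",
+ "    Part.from_function_response(\n",
+ "        name=predicted_call.name,\n",
+ "        response={\"content\": api_response},\n",
+ "    )\n",
+ ")\n",
+ "```"
+ ]
+ },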
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ee5711d51ae0"
+ },
+ "source": [
+ "## Conclusions\n",
+ "\n",
+ "In this notebook, you explored the powerful capabilities of Gemini's multimodal function calling. You learned how to:\n",
+ "\n",
+ "- Define functions and package them into tools.\n",
+ "- Send multimodal inputs (images, video, audio, PDFs) and prompts to Gemini. \n",
+ "- Extract predicted function calls and their parameters.\n",
+ "- Use the predicted output to make (or potentially make) API calls.\n",
+ "- Return API responses to Gemini for natural language generation. \n",
+ "\n",
+ "You've seen how Gemini can understand and act on a range of different multimodal inputs, which opens up a world of possibilities for building innovative and engaging multimodal applications. You can now use these powerful tools to create your own intelligent applications that seamlessly integrate media, natural language, and calls to external APIs and system.\n",
+ "\n",
+ "Experiment with different modalities, functions, and prompts to discover the full potential of Gemini's multimodal and function calling capabilities. And you can continue learning by exploring other sample notebooks in this repository and exploring the [documentation for Gemini Function Calling](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling). "
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "multimodal_function_calling.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
From 193ca27e2de3428e6da6ab4d7b5fee47fac3191e Mon Sep 17 00:00:00 2001
From: Eric Dong
Date: Tue, 24 Sep 2024 14:13:23 -0400
Subject: [PATCH 10/76] refactor: Update long context window information
(#1169)
# Description
Update long context window information for Gemini 1.5.
Gemini 1.5 Flash 1M
Gemini 1.5 Pro 2M
---
gemini/long-context/intro_long_context.ipynb | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gemini/long-context/intro_long_context.ipynb b/gemini/long-context/intro_long_context.ipynb
index 8221041912..e93f835a9a 100644
--- a/gemini/long-context/intro_long_context.ipynb
+++ b/gemini/long-context/intro_long_context.ipynb
@@ -74,7 +74,7 @@
"source": [
"## Overview\n",
"\n",
- "Historically, large language models (LLMs) were significantly limited by the amount of text (or tokens) that could be passed to the model at one time. Gemini 1.5 Flash and Gemini 1.5 Pro support a 1 million token context window, with [near-perfect retrieval (>99%)](https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf), which unlocks many new use cases and developer paradigms.\n",
+ "Gemini 1.5 Flash comes standard with a 1 million token context window, and Gemini 1.5 Pro comes with a 2 million token context window. Historically, large language models (LLMs) were significantly limited by the amount of text (or tokens) that could be passed to the model at one time. The Gemini 1.5 long context window, with [near-perfect retrieval (>99%)](https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf), unlocks many new use cases and developer paradigms.\n",
"\n",
"In practice, 1 million tokens would look like:\n",
"\n",
From 8ebaece186d189afdd47fd0dfda0be7b56f13fba Mon Sep 17 00:00:00 2001
From: Deepak moonat
Date: Wed, 25 Sep 2024 08:19:56 +0530
Subject: [PATCH 11/76] update: model to 002 version (#1170)
# Description
Supervised finetuning using gemini-1.5-pro-002
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
---
.../supervised_finetuning_using_gemini.ipynb | 111 ++++++++----------
1 file changed, 48 insertions(+), 63 deletions(-)
diff --git a/gemini/tuning/supervised_finetuning_using_gemini.ipynb b/gemini/tuning/supervised_finetuning_using_gemini.ipynb
index 3a5bbbd169..beecd15313 100644
--- a/gemini/tuning/supervised_finetuning_using_gemini.ipynb
+++ b/gemini/tuning/supervised_finetuning_using_gemini.ipynb
@@ -97,6 +97,10 @@
"| Chat | 1000+ | 2-4 |\n",
"\n",
"\n",
+ " \n",
+ "\n",
+ "Refer to public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning) for more details.\n",
+ "\n",
"\n",
"\n",
"\n",
@@ -117,7 +121,7 @@
"source": [
"### Objective\n",
"\n",
- "In this tutorial, you will learn how to use `Vertex AI` to tune a Parameter Efficient FineTuned(PEFT) `gemini-1.5-pro` model.\n",
+ "In this tutorial, you will learn how to use `Vertex AI` to tune a `gemini-1.5-pro` model.\n",
"\n",
"\n",
"This tutorial uses the following Google Cloud ML services:\n",
@@ -128,7 +132,7 @@
"The steps performed include:\n",
"\n",
"- Prepare and load the dataset\n",
- "- Load the `gemini-1.5-pro-001` model\n",
+ "- Load the `gemini-1.5-pro-002` model\n",
"- Evaluate the model before tuning\n",
"- Tune the model.\n",
" - This will automatically create a Vertex AI endpoint and deploy the model to it\n",
@@ -384,7 +388,7 @@
"id": "JUEloBlsCPFr"
},
"source": [
- "## Step3: Create Dataset in required format\n",
+ "## Step3: Create Dataset in correct format\n",
"\n",
"The dataset used to tune a foundation model needs to include examples that align with the task that you want the model to perform. Structure your training dataset in a text-to-text format. Each record, or row, in the dataset contains the input text (also referred to as the prompt) which is paired with its expected output from the model. Supervised tuning uses the dataset to teach the model to mimic a behavior, or task, you need by giving it hundreds of examples that illustrate that behavior.\n",
"\n",
@@ -417,7 +421,10 @@
" # ... repeat \"user\", \"model\" for multi turns.\n",
" ]\n",
"}\n",
- "```"
+ "```\n",
+ "\n",
+ "\n",
+ "Refer to the public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning-prepare#about-datasets) for more details."
]
},
{
@@ -489,7 +496,7 @@
},
"source": [
"- Data used in this notebook is present in the public Google Cloud Storage(GCS) bucket.\n",
- "- It's in Gemini 1.0 dataset format for finetuning"
+ "- It's in Gemini 1.0 finetuning dataset format"
]
},
{
@@ -503,15 +510,6 @@
"!gsutil ls gs://github-repo/generative-ai/gemini/tuning/summarization/wikilingua"
]
},
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "tKupp3g6W1XS"
- },
- "source": [
- "#### Download the dataset"
- ]
- },
{
"cell_type": "code",
"execution_count": null,
@@ -734,7 +732,7 @@
"source": [
"The following Gemini text models support supervised tuning:\n",
"\n",
- "* `gemini-1.5-pro-001`"
+ "* `gemini-1.5-pro-002`"
]
},
{
@@ -745,7 +743,7 @@
},
"outputs": [],
"source": [
- "base_model = \"gemini-1.5-pro-001\"\n",
+ "base_model = \"gemini-1.5-pro-002\"\n",
"generation_model = GenerativeModel(base_model)"
]
},
@@ -894,6 +892,7 @@
"| Text Generation | BLEU, ROUGE-L |\n",
"\n",
"\n",
+ " \n",
"\n",
"Refer to this [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluate-models) for metric based evaluation."
]
@@ -1043,6 +1042,17 @@
"corpus_batch = corpus[:100]"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "oM10zigp7kTZ"
+ },
+ "source": [
+ "
\n",
+ "⚠️ It will take ~5 mins for the evaluation run on the provided batch. ⚠️\n",
+ "
\n",
- "⚠️ It will take around an hour for the model tuning job to complete on the provided dataset and set configurations/hyperparameters. ⚠️\n",
+ "⚠️ It will take ~30 mins for the model tuning job to complete on the provided dataset and set configurations/hyperparameters. ⚠️\n",
"
"
]
},
@@ -1296,7 +1302,7 @@
"outputs": [],
"source": [
"# Read data from Tensorboard\n",
- "tensorboard_run_name = f\"{experiment.get_backing_tensorboard_resource().resource_name}/experiments/{experiment.name}/runs/{experiment_run.name}\"\n",
+ "tensorboard_run_name = f\"{experiment.get_backing_tensorboard_resource().resource_name}/experiments/{experiment.name}/runs/{experiment_run.name.replace(experiment.name, '')[1:]}\"\n",
"tensorboard_run = aiplatform.TensorboardRun(tensorboard_run_name)\n",
"metrics = tensorboard_run.read_time_series_data()"
]
@@ -1438,7 +1444,7 @@
{
"cell_type": "markdown",
"metadata": {
- "id": "QB2Pnpp6dYwh"
+ "id": "ef7acd61d12d"
},
"source": [
"```\n",
@@ -1446,58 +1452,31 @@
" content {\n",
" role: \"model\"\n",
" parts {\n",
- " text: \"Squeeze lotion onto your forearms. Place your forearms against your back. Rub your forearms up and down your back.\\n\\n\"\n",
+ " text: \"Squeeze a line of lotion onto the top of each forearm. Place your forearms behind your back. Rub your forearms up and down your back.\\n\\n\"\n",
" }\n",
" }\n",
" finish_reason: STOP\n",
- " safety_ratings {\n",
- " category: HARM_CATEGORY_HATE_SPEECH\n",
- " probability: NEGLIGIBLE\n",
- " probability_score: 0.083984375\n",
- " severity: HARM_SEVERITY_NEGLIGIBLE\n",
- " severity_score: 0.10498046875\n",
- " }\n",
- " safety_ratings {\n",
- " category: HARM_CATEGORY_DANGEROUS_CONTENT\n",
- " probability: NEGLIGIBLE\n",
- " probability_score: 0.41015625\n",
- " severity: HARM_SEVERITY_NEGLIGIBLE\n",
- " severity_score: 0.1904296875\n",
- " }\n",
- " safety_ratings {\n",
- " category: HARM_CATEGORY_HARASSMENT\n",
- " probability: NEGLIGIBLE\n",
- " probability_score: 0.24609375\n",
- " severity: HARM_SEVERITY_NEGLIGIBLE\n",
- " severity_score: 0.130859375\n",
- " }\n",
- " safety_ratings {\n",
- " category: HARM_CATEGORY_SEXUALLY_EXPLICIT\n",
- " probability: NEGLIGIBLE\n",
- " probability_score: 0.3671875\n",
- " severity: HARM_SEVERITY_LOW\n",
- " severity_score: 0.3203125\n",
- " }\n",
- " avg_logprobs: -0.3171907354284216\n",
+ " avg_logprobs: -0.39081838726997375\n",
"}\n",
"usage_metadata {\n",
- " prompt_token_count: 263\n",
- " candidates_token_count: 27\n",
- " total_token_count: 290\n",
+ " prompt_token_count: 261\n",
+ " candidates_token_count: 32\n",
+ " total_token_count: 293\n",
"}\n",
+ "\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
- "id": "aL4Ie5wFwDTa"
+ "id": "d54ce2b88af3"
},
"source": [
"- We can clearly see the difference between summary generated pre and post tuning, as tuned summary is more inline with the ground truth format (**Note**: Pre and Post outputs, might vary based on the set parameters.)\n",
"\n",
" - *Pre*: `This article describes a method for applying lotion to your own back using your forearms. The technique involves squeezing lotion in a line along your forearms, bending your elbows, and rubbing your arms against your back in a windshield wiper motion. This method may not be suitable for individuals with shoulder pain or limited flexibility.`\n",
- " - *Post*: `Squeeze lotion onto your forearms. Place your forearms against your back. Rub your forearms up and down your back.`\n",
+ " - *Post*: `Squeeze a line of lotion onto the top of each forearm. Place your forearms behind your back. Rub your forearms up and down your back`\n",
" - *Ground Truth*:` Squeeze a line of lotion onto the tops of both forearms and the backs of your hands. Place your arms behind your back. Move your arms in a windshield wiper motion.`"
]
},
@@ -1510,6 +1489,17 @@
"## Step9: Evaluation post model tuning"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "mwlCcKPZ62Of"
+ },
+ "source": [
+ "
\n",
+ "⚠️ It will take ~5 mins for the evaluation on the provided batch. ⚠️\n",
+ "
"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
@@ -1518,7 +1508,6 @@
},
"outputs": [],
"source": [
- "%%time\n",
"# run evaluation\n",
"evaluation_df_post_tuning = run_evaluation(tuned_genai_model, corpus_batch)"
]
@@ -1704,10 +1693,6 @@
],
"metadata": {
"colab": {
- "collapsed_sections": [
- "8DzlWWKpbGcu",
- "NuN-m1Ikbn15"
- ],
"name": "supervised_finetuning_using_gemini.ipynb",
"toc_visible": true
},
From 354f2cfe01ff9397596488f23fec7c03df891568 Mon Sep 17 00:00:00 2001
From: Deepak moonat
Date: Wed, 25 Sep 2024 09:34:16 +0530
Subject: [PATCH 12/76] update: markdown (#1171)
# Description
Supervised finetuning using gemini-1.5-pro-002
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
---
.../supervised_finetuning_using_gemini.ipynb | 53 +++++++------------
1 file changed, 19 insertions(+), 34 deletions(-)
diff --git a/gemini/tuning/supervised_finetuning_using_gemini.ipynb b/gemini/tuning/supervised_finetuning_using_gemini.ipynb
index beecd15313..7297400a9e 100644
--- a/gemini/tuning/supervised_finetuning_using_gemini.ipynb
+++ b/gemini/tuning/supervised_finetuning_using_gemini.ipynb
@@ -86,18 +86,6 @@
" - Metric: Choose appropriate evaluation metrics that accurately reflect the success of the fine-tuned model for your specific task\n",
" - Evaluation Set: Use a separate set of data to evaluate the model's performance\n",
"\n",
- "### Recommended configurations\n",
- "The following table shows the recommended configurations for tuning a foundation model by task:\n",
- "\n",
- "| Task | No. of examples in dataset | Number of epochs |\n",
- "| -------------- | -------------------------- | ----------- |\n",
- "| Classification | 500+ | 2-4 |\n",
- "| Summarization | 1000+ | 2-4 |\n",
- "| Extractive QA | 500+ | 2-4 |\n",
- "| Chat | 1000+ | 2-4 |\n",
- "\n",
- "\n",
- " \n",
"\n",
"Refer to public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning) for more details.\n",
"\n",
@@ -376,7 +364,7 @@
},
"outputs": [],
"source": [
- "PROJECT_ID = \"\" # @param\n",
+ "PROJECT_ID = \"[YOUR_PROJECT_ID]\" # @param\n",
"LOCATION = \"us-central1\" # @param\n",
"\n",
"vertexai.init(project=PROJECT_ID, location=LOCATION)"
@@ -456,7 +444,7 @@
"outputs": [],
"source": [
"# Provide a bucket name\n",
- "BUCKET_NAME = \"\" # @param {type:\"string\"}\n",
+ "BUCKET_NAME = \"[YOUR_BUCKET_NAME]\" # @param {type:\"string\"}\n",
"BUCKET_URI = f\"gs://{BUCKET_NAME}\""
]
},
@@ -884,12 +872,12 @@
"id": "t6oLtUEWMHVu"
},
"source": [
- "| Task | Metric\n",
- "|:---------|:--------:|\n",
- "| Classification | Micro-F1, Macro-F1, Per class F1 |\n",
- "| Summarization | ROUGE-L |\n",
- "| Question Answering | Exact Match |\n",
- "| Text Generation | BLEU, ROUGE-L |\n",
+ "| Task | Metric(s) |\n",
+ "|-----------------|---------------------------------|\n",
+ "| Classification | Micro-F1, Macro-F1, Per class F1 |\n",
+ "| Summarization | ROUGE-L |\n",
+ "| Question Answering | Exact Match |\n",
+ "| Text Generation | BLEU, ROUGE-L |\n",
"\n",
"\n",
" \n",
@@ -1123,11 +1111,21 @@
"\n",
" *Optional parameters*\n",
" - `validation_dataset`: If provided, this data is used to evaluate the model during tuning.\n",
+ " - `tuned_model_display_name`: Display name for the tuned model.\n",
" - `epochs`: The number of training epochs to run.\n",
" - `learning_rate_multiplier`: A value to scale the learning rate during training.\n",
" - `adapter_size` : Gemini 1.5 Pro supports Adapter length [1, 4], default value is 4."
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4e81137766c6"
+ },
+ "source": [
+ "**Note: The default hyperparameter settings are optimized for optimal performance based on rigorous testing and are recommended for initial use. Users may customize these parameters to address specific performance requirements.**"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
@@ -1136,7 +1134,7 @@
},
"outputs": [],
"source": [
- "tuned_model_display_name = \"\" # @param {type:\"string\"}\n",
+ "tuned_model_display_name = \"[DISPLAY NAME FOR TUNED MODEL]\" # @param {type:\"string\"}\n",
"\n",
"# Tune a model using `train` method.\n",
"sft_tuning_job = sft.train(\n",
@@ -1671,19 +1669,6 @@
"\n",
"print(\"***\" * 10)\n",
"\n",
- "# Delete Model.\n",
- "delete_model = True\n",
- "if delete_model:\n",
- " # Remove version from model name.\n",
- " tuned_model_name = tuned_model_name.split(\"@\")[0]\n",
- " for model in aiplatform.Model.list():\n",
- " if model.resource_name == tuned_model_name:\n",
- " print(model.resource_name)\n",
- " model.delete()\n",
- " break\n",
- "\n",
- "print(\"***\" * 10)\n",
- "\n",
"# Delete Cloud Storage Bucket.\n",
"delete_bucket = True\n",
"if delete_bucket:\n",
From c347390697e853253e62d210e4d036ba3e6fa544 Mon Sep 17 00:00:00 2001
From: Eric Dong
Date: Wed, 25 Sep 2024 12:41:29 -0400
Subject: [PATCH 13/76] fix: Fix the PDF file processing and audio
transcription (#1172)
# Description
Fix the PDF file processing and audio transcription:
- Remove timecode output for now as it's not totally accurate
- Re-define PDF and image fileData in PDF analysis
---
.../intro_gemini_1_5_flash.ipynb | 7 +-
.../intro_gemini_1_5_pro.ipynb | 143 +++++-------------
2 files changed, 41 insertions(+), 109 deletions(-)
diff --git a/gemini/getting-started/intro_gemini_1_5_flash.ipynb b/gemini/getting-started/intro_gemini_1_5_flash.ipynb
index 34f6135b34..2cc72107ee 100644
--- a/gemini/getting-started/intro_gemini_1_5_flash.ipynb
+++ b/gemini/getting-started/intro_gemini_1_5_flash.ipynb
@@ -226,11 +226,7 @@
},
"outputs": [],
"source": [
- "from IPython.core.interactiveshell import InteractiveShell\n",
"import IPython.display\n",
- "\n",
- "InteractiveShell.ast_node_interactivity = \"all\"\n",
- "\n",
"from vertexai.generative_models import (\n",
" GenerationConfig,\n",
" GenerativeModel,\n",
@@ -408,7 +404,7 @@
"outputs": [],
"source": [
"prompt = \"\"\"\n",
- " Can you transcribe this interview, in the format of timecode, speaker, caption.\n",
+ " Can you transcribe this interview, in the format of speaker, caption.\n",
" Use speaker A, speaker B, etc. to identify the speakers.\n",
"\"\"\"\n",
"\n",
@@ -552,6 +548,7 @@
"Context:\n",
"\"\"\"\n",
"\n",
+ "pdf_file = Part.from_uri(pdf_file_uri, mime_type=\"application/pdf\")\n",
"image_file = Part.from_uri(image_file_uri, mime_type=\"image/png\")\n",
"\n",
"contents = [\n",
diff --git a/gemini/getting-started/intro_gemini_1_5_pro.ipynb b/gemini/getting-started/intro_gemini_1_5_pro.ipynb
index 59cda2f42e..1b2d1c36a3 100644
--- a/gemini/getting-started/intro_gemini_1_5_pro.ipynb
+++ b/gemini/getting-started/intro_gemini_1_5_pro.ipynb
@@ -127,7 +127,7 @@
},
{
"cell_type": "code",
- "execution_count": 22,
+ "execution_count": null,
"metadata": {
"id": "XRvKdaPDTznN"
},
@@ -166,7 +166,7 @@
},
{
"cell_type": "code",
- "execution_count": 4,
+ "execution_count": null,
"metadata": {
"id": "NyKGtVQjgx13"
},
@@ -195,7 +195,7 @@
},
{
"cell_type": "code",
- "execution_count": 1,
+ "execution_count": null,
"metadata": {
"id": "Nqwi-5ufWp_B"
},
@@ -220,17 +220,13 @@
},
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": null,
"metadata": {
"id": "lslYAvw37JGQ"
},
"outputs": [],
"source": [
- "from IPython.core.interactiveshell import InteractiveShell\n",
"import IPython.display\n",
- "\n",
- "InteractiveShell.ast_node_interactivity = \"all\"\n",
- "\n",
"from vertexai.generative_models import (\n",
" GenerationConfig,\n",
" GenerativeModel,\n",
@@ -253,7 +249,7 @@
},
{
"cell_type": "code",
- "execution_count": 3,
+ "execution_count": null,
"metadata": {
"id": "U7ExWmuLBdIA"
},
@@ -390,7 +386,7 @@
},
{
"cell_type": "code",
- "execution_count": 10,
+ "execution_count": 26,
"metadata": {
"id": "10hgCOIA4E5_"
},
@@ -409,7 +405,7 @@
""
]
},
- "execution_count": 10,
+ "execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
@@ -433,7 +429,7 @@
},
{
"cell_type": "code",
- "execution_count": 11,
+ "execution_count": 28,
"metadata": {
"id": "OPQ1fBk44E6L"
},
@@ -442,25 +438,18 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "## Pixel Feature Drops: March 2023\n",
- "\n",
- "This podcast discusses the Pixel feature drops, specifically the March 2023 drop, with product managers Aisha Sharif and DeCarlos Love. \n",
- "\n",
- "### Chapters:\n",
- " \n",
- "* **00:00 - 00:13**: Introduction. \n",
- "* **00:14 - 00:22**: Podcast introduction. \n",
- "* **00:22 - 01:07**: Transformative Pixel features. \n",
- "* **01:07 - 02:27**: Importance of feature drops. \n",
- "* **02:28 - 03:30**: Highlights from January feature drop.\n",
- "* **03:31 - 04:52**: March feature drop for Pixel Watch.\n",
- "* **04:53 - 06:08**: March feature drop for Pixel phones. \n",
- "* **06:09 - 07:23**: Deciding what's in a feature drop. \n",
- "* **07:24 - 07:57**: User feedback and feature drops. \n",
- "* **07:58 - 08:11**: Release date for March feature drop. \n",
- "* **08:12 - 08:58**: Favorite feature drops. \n",
- "* **08:59 - 10:05**: More favorite feature drops. \n",
- "* **10:06 - 10:28**: Outro and thanks. \n",
+ "## Made By Google Podcast: March Feature Drop\n",
+ "\n",
+ "**Chapter 1: Favorite Pixel Features**\n",
+ "**Chapter 2: The Importance of Feature Drops**\n",
+ "**Chapter 3: Highlights from the January Feature Drop**\n",
+ "**Chapter 4: What’s New for Pixel Watch in March**\n",
+ "**Chapter 5: What’s New for Pixel Phones in March**\n",
+ "**Chapter 6: Updates for the Rest of the Pixel Portfolio**\n",
+ "**Chapter 7: What Drives Feature Choice?**\n",
+ "**Chapter 8: The Value of User Feedback**\n",
+ "**Chapter 9: Feature Drop Release Date**\n",
+ "**Chapter 10: Memorable Feature Drops** \n",
"\n"
]
}
@@ -497,7 +486,7 @@
"outputs": [],
"source": [
"prompt = \"\"\"\n",
- " Can you transcribe this interview, in the format of timecode, speaker, caption.\n",
+ " Can you transcribe this interview, in the format of speaker, caption.\n",
" Use speaker A, speaker B, etc. to identify the speakers.\n",
"\"\"\"\n",
"\n",
@@ -606,7 +595,7 @@
},
{
"cell_type": "code",
- "execution_count": 8,
+ "execution_count": 30,
"metadata": {
"id": "JgKDIZUstYwV"
},
@@ -615,18 +604,17 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "This is a technical report introducing Gemini 1.5 Pro, Google's latest multi-modal model. The model is built upon the mixture-of-experts (MoE) architecture and exhibits impressive performance on reasoning, multi-modality, and long context understanding. Gemini 1.5 Pro distinguishes itself by expanding the context window size to several million tokens, a significant leap beyond the 200k tokens offered by its predecessor, Claude 2.1. This expanded capacity allows for processing nearly five days of audio, entire books, or extensive code repositories. \n",
+ "The document presents Gemini 1.5 Pro, a new multimodal model from Google DeepMind, which excels in understanding and processing extremely long contexts, up to millions of tokens, across various modalities including text, video and audio. \n",
"\n",
- "The report highlights the model's abilities through: \n",
- "* **Qualitative examples:** Showcasing impressive feats such as pinpointing specific code within the complete JAX codebase, learning to translate a new language from a single grammar book and dictionary, and identifying a scene from Les Misérables based on a hand-drawn sketch. \n",
- "* **Quantitative evaluations:** \n",
- " * **Diagnostic:** demonstrating near-perfect recall in \"needle-in-a-haystack\" tasks across text, video, and audio, even maintaining high recall with context lengths extending to 10 million tokens. \n",
- " * **Realistic:** excelling in long-document QA using Les Misérables as context, outperforming competitors on long-video QA tasks, and showing significant progress in long-context automatic speech recognition. \n",
- " * **Core Capabilities:** Surpassing the performance of its predecessor (Gemini 1.0) and rivaling or exceeding the performance of a state-of-the-art model, Gemini 1.0 Ultra, on core benchmarks related to coding, math, science, reasoning, and instruction following. \n",
+ "Key findings:\n",
"\n",
- "The report also delves into the responsible development and deployment of the model, emphasizing their approach to impact assessment, model mitigations, and ongoing safety evaluations. \n",
+ "* **Long-context capabilities:** Gemini 1.5 Pro significantly extends the context length frontier to multiple millions of tokens, outperforming other large language models like Claude 2.1 and GPT-4 Turbo. This allows for processing long documents, hours of video, and days of audio recordings. \n",
+ "* **Maintaining Core Capabilities:** This leap in long-context performance doesn't compromise the model's core multi-modal capabilities, showing improvements in math, science, reasoning, coding, image and video understanding. \n",
+ "* **In-Context Language Learning:** Gemini 1.5 Pro can learn to translate a new language from a single set of linguistic documentation provided in its input.\n",
+ "* **Benchmarks and Evaluations:** The document reports results on both synthetic and real-world long-context tasks, demonstrating the model's ability to recall and reason over long sequences in all three modalities. \n",
+ "* **Responsible Deployment:** The document outlines the model's development and deployment process, highlighting the focus on mitigating safety risks and ensuring responsible use. \n",
"\n",
- "In conclusion, Gemini 1.5 Pro represents a significant advancement in AI, showcasing unprecedented capabilities in long-context understanding across multiple modalities. The report emphasizes the need for novel evaluation methods to better assess the potential of such models and suggests promising avenues for future research. \n",
+ "The document also presents several challenges and opportunities for future research in evaluating models with very long contexts and developing benchmarks that can effectively assess their capabilities. They advocate for the development of more complex tasks that require reasoning over multiple pieces of information scattered across a long context. \n",
"\n"
]
}
@@ -648,7 +636,7 @@
},
{
"cell_type": "code",
- "execution_count": 31,
+ "execution_count": 37,
"metadata": {
"id": "52ltdcv5EsaM"
},
@@ -660,7 +648,7 @@
""
]
},
- "execution_count": 31,
+ "execution_count": 37,
"metadata": {
"image/png": {
"width": 450
@@ -679,7 +667,7 @@
},
{
"cell_type": "code",
- "execution_count": 32,
+ "execution_count": 40,
"metadata": {
"id": "EEmrMpRMHyel"
},
@@ -688,65 +676,9 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "## Analysis of Request:\n",
- "\n",
- "The task requires the model to:\n",
- "\n",
- "1. **Process Multimodal Information:** Analyze both an image and a PDF document provided as context.\n",
- "2. **Understand Long Context:** The PDF document could potentially be lengthy, requiring the model to handle long sequences of text.\n",
- "3. **Perform Information Retrieval:** Locate specific information within the image and document relevant to the questions.\n",
- "4. **Reason and Answer Questions:** Provide concise answers based on the extracted information.\n",
- "5. **Cite Sources:** Include page numbers from the PDF document to support the answers.\n",
- "\n",
- "## Potential Challenges: \n",
- "\n",
- "* **Image Understanding:** Accurately interpreting the content of the image, especially if it's complex or contains charts/graphs.\n",
- "* **Long Document Comprehension:** Efficiently processing and retaining information from a potentially long PDF document.\n",
- "* **Information Retrieval:** Identifying the specific sections within the document that contain relevant information for answering the questions.\n",
- "* **Citation Generation:** Extracting and correctly formatting page numbers for citations. \n",
- "\n",
- "## Response Generation:\n",
- "\n",
- "Unfortunately, I cannot directly access external websites or specific files, including the image and PDF mentioned in the request. However, I can guide you on how to approach this task with a model like Gemini 1.5:\n",
- "\n",
- "**Step 1: Provide Context:** \n",
- "\n",
- "* Input the entire PDF document and the image as context to the model. You may need to convert the PDF to plain text or use a tool to extract text and images from the PDF.\n",
+ "- **What is in the given image?** Cumulative Average Reward \n",
"\n",
- "**Step 2: Formulate Questions:** \n",
- "\n",
- "* Ask the questions in a clear and concise manner. \n",
- "\n",
- "**Step 3: Extract Answers:** \n",
- "\n",
- "* Analyze the model's response to identify the answers to the questions. \n",
- "* Look for phrases like \"The image shows...\" or \"According to page X of the document...\" to locate the answers and their corresponding sources.\n",
- "\n",
- "**Example Response Format:**\n",
- "\n",
- "**Question 1: What is in the given image?**\n",
- "\n",
- "**Answer:** The image shows a cumulative average plot with different colored lines representing various models' performance. (source: page X) \n",
- "\n",
- "**Question 2: Is there a similar graph in the given document?**\n",
- "\n",
- "**Answer:** Yes, a similar graph depicting cumulative average negative log-likelihood can be found on page Y of the document. \n",
- "\n",
- "\n",
- "## Tools and Techniques:\n",
- "\n",
- "* **Optical Character Recognition (OCR):** If the image contains text, use OCR to extract and include it in the context.\n",
- "* **PDF Parsing Libraries:** Utilize libraries like PyPDF2 or PDFMiner to extract text and images from the PDF document.\n",
- "* **Regular Expressions:** Employ regular expressions to locate specific patterns in the model's response, such as page numbers for citations.\n",
- "\n",
- "## Model Considerations:\n",
- "\n",
- "* **Gemini 1.5 Pro:** This model is well-suited for this task due to its capabilities in handling long context, multimodal information processing, and question answering. \n",
- "* **Alternative Models:** Other large language models like Bard or PaLM 2 could also be explored, although their performance may vary.\n",
- "\n",
- "## Conclusion:\n",
- "\n",
- "By following these steps and using the appropriate tools and techniques, you can effectively utilize a large language model to analyze multimodal information, answer questions, and cite sources from complex documents and images.\n",
+ "- **Is there a similar graph in the given document?** Yes (pg. 8) \n",
"\n"
]
}
@@ -768,9 +700,12 @@
"Context:\n",
"\"\"\"\n",
"\n",
+ "pdf_file = Part.from_uri(pdf_file_uri, mime_type=\"application/pdf\")\n",
+ "image_file = Part.from_uri(image_file_uri, mime_type=\"image/png\")\n",
+ "\n",
"contents = [\n",
" pdf_file,\n",
- " image_file_uri,\n",
+ " image_file,\n",
" prompt,\n",
"]\n",
"\n",
From 7f7847c3b1dfd5c98431f84ad97786546dcacd46 Mon Sep 17 00:00:00 2001
From: Rupjit Chakraborty
Date: Wed, 25 Sep 2024 23:46:08 +0530
Subject: [PATCH 14/76] feat: Updated
gemini/prompts/examples/chain_of_thought_react.ipynb notebook (#1146)
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [ ] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Rupjit
Co-authored-by: Owl Bot
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Co-authored-by: Eric Dong
---
.../examples/chain_of_thought_react.ipynb | 136 ++++++++----------
1 file changed, 63 insertions(+), 73 deletions(-)
diff --git a/gemini/prompts/examples/chain_of_thought_react.ipynb b/gemini/prompts/examples/chain_of_thought_react.ipynb
index b154d9005c..f00ab111cb 100644
--- a/gemini/prompts/examples/chain_of_thought_react.ipynb
+++ b/gemini/prompts/examples/chain_of_thought_react.ipynb
@@ -117,15 +117,15 @@
},
"outputs": [],
"source": [
- "!pip install -q --user --upgrade langchain \\\n",
- " google-cloud-aiplatform \\\n",
+ "!pip install -q --user --upgrade langchain==0.3.0 \\\n",
+ " google-cloud-aiplatform==1.67.1 \\\n",
" prettyprinter==0.18.0 \\\n",
" wikipedia==1.4.0 \\\n",
- " google-cloud-bigquery==3.11.4 \\\n",
- " langchain-core \\\n",
- " langchain-google-vertexai \\\n",
- " langchain-experimental \\\n",
- " \"bigframes<1.0.0\""
+ " google-cloud-bigquery==3.25.0 \\\n",
+ " langchain-core==0.3.2 \\\n",
+ " langchain-google-vertexai==2.0.1 \\\n",
+ " langchain-experimental==0.3.0 \\\n",
+ " bigframes==1.18.0"
]
},
{
@@ -216,9 +216,9 @@
},
"outputs": [],
"source": [
- "PROJECT_ID = \"your-project-id\" # @param {type:\"string\"}\n",
- "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
- "MODEL_NAME = \"gemini-1.0-pro\" # @param {type:\"string\"}\n",
+ "PROJECT_ID = \"\" # @param {type:\"string\"}\n",
+ "LOCATION = \"\" # @param {type:\"string\"}\n",
+ "MODEL_NAME = \"gemini-1.5-flash\" # @param {type:\"string\"}\n",
"\n",
"import vertexai\n",
"\n",
@@ -339,14 +339,15 @@
},
"outputs": [],
"source": [
- "question = \"\"\"Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.\n",
- "Each can has 3 tennis balls. How many tennis balls does he have now?\n",
- "A: The answer is 11.\n",
- "Q: The cafeteria had 23 apples.\n",
- "If they used 20 to make lunch and bought 6 more, how many apples do they have?\n",
- "A:\"\"\"\n",
- "\n",
- "llm.invoke(question)"
+ "question = \"\"\"\n",
+ " Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.\n",
+ " Each can has 3 tennis balls. How many tennis balls does he have now?\n",
+ " A: The answer is 11.\n",
+ " Q: The cafeteria had 33 apples.\n",
+ " If they used 20 to make lunch and bought 6 more, how many apples do they have?\n",
+ " A:\n",
+ "\"\"\"\n",
+ "print(llm.invoke(question))"
]
},
{
@@ -368,15 +369,17 @@
},
"outputs": [],
"source": [
- "question = \"\"\"Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.\n",
- "Each can has 3 tennis balls. How many tennis balls does he have now?\n",
- "A: Roger started with 5 balls. 2 cans of 3 tennis balls\n",
- "each is 6 tennis balls. 5 + 6 = 11. The answer is 11.\n",
- "Q: The cafeteria had 23 apples.\n",
- "If they used 20 to make lunch and bought 6 more, how many apples do they have?\n",
- "A:\"\"\"\n",
+ "question = \"\"\"\n",
+ " Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.\n",
+ " Each can has 3 tennis balls. How many tennis balls does he have now?\n",
+ " A: Roger started with 5 balls. 2 cans of 3 tennis balls\n",
+ " each is 6 tennis balls. 5 + 6 = 11. The answer is 11.\n",
+ " Q: The cafeteria had 23 apples.\n",
+ " If they used 20 to make lunch and bought 6 more, how many apples do they have?\n",
+ " A:\n",
+ "\"\"\"\n",
"\n",
- "llm.invoke(question)"
+ "print(llm.invoke(question))"
]
},
{
@@ -423,14 +426,15 @@
},
"outputs": [],
"source": [
- "question = \"\"\"Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.\n",
- "Each can has 3 tennis balls. How many tennis balls does he have now?\n",
- "A: The answer is 11.\n",
- "Q: The cafeteria had 23 apples.\n",
- "If they used 20 to make lunch and bought 6 more, how many apples do they have?\n",
- "A:\"\"\"\n",
- "\n",
- "llm.invoke(question)"
+ "question = \"\"\"\n",
+ " Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.\n",
+ " Each can has 3 tennis balls. How many tennis balls does he have now?\n",
+ " A: The answer is 11.\n",
+ " Q: The cafeteria had 23 apples.\n",
+ " If they used 20 to make lunch and bought 6 more, how many apples do they have?\n",
+ " A:\n",
+ "\"\"\"\n",
+ "print(llm.invoke(question))"
]
},
{
@@ -441,15 +445,17 @@
},
"outputs": [],
"source": [
- "question = \"\"\"Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.\n",
+ "question = \"\"\"\n",
+ "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.\n",
"Each can has 3 tennis balls. How many tennis balls does he have now?\n",
"A: The answer is 11.\n",
"\n",
"Q: The cafeteria had 23 apples.\n",
"If they used 20 to make lunch and bought 6 more, how many apples do they have?\n",
- "A: Let's think step by step.\"\"\"\n",
+ "A: Let's think step by step.\n",
+ "\"\"\"\n",
"\n",
- "llm.invoke(question)"
+ "print(llm.invoke(question))"
]
},
{
@@ -825,6 +831,8 @@
},
"outputs": [],
"source": [
+ "from langchain.agents.agent_types import AgentType\n",
+ "from langchain.agents.initialize import initialize_agent\n",
"from langchain.llms import VertexAI\n",
"from langchain.tools import StructuredTool, WikipediaQueryRun\n",
"from langchain.utilities import WikipediaAPIWrapper\n",
@@ -908,7 +916,7 @@
},
"outputs": [],
"source": [
- "llm = VertexAI(temperature=0)\n",
+ "from langchain.agents import load_tools\n",
"\n",
"_ = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())\n",
"\n",
@@ -923,7 +931,7 @@
"\n",
"# Let's search some information generally available on wikipedia.\n",
"agent.invoke(\n",
- " \"Give me a summary of the life of one famous person who was born October 2nd?\"\n",
+ " \"Give me a short summary of a famous monument along with the name of the wikipedia page referenced\"\n",
")"
]
},
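For context, the `agent` invoked in this hunk is assembled from the imports the patch adds above; its construction sits outside the visible context. A minimal sketch of that wiring, assuming the zero-shot ReAct agent type and the built-in `wikipedia` tool (the notebook's exact setup may differ):

```python
from langchain.agents import load_tools
from langchain.agents.agent_types import AgentType
from langchain.agents.initialize import initialize_agent
from langchain.llms import VertexAI

# Vertex AI-backed LLM; temperature 0 keeps the ReAct traces reproducible.
llm = VertexAI(model_name=MODEL_NAME, temperature=0)

# Built-in Wikipedia tool (requires the `wikipedia` package installed earlier).
tools = load_tools(["wikipedia"], llm=llm)

# Zero-shot ReAct agent that decides when to call the Wikipedia tool.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent.invoke(
    "Give me a short summary of a famous monument along with the name of the wikipedia page referenced"
)
```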
@@ -974,7 +982,11 @@
},
"outputs": [],
"source": [
- "QUERY = \"SELECT * FROM bigquery-public-data.hacker_news.full LIMIT 10\"\n",
+ "QUERY = \"\"\"\n",
+ "SELECT * FROM `bigquery-public-data.hacker_news.full` \n",
+ "WHERE CONCAT(title, url, text) IS NOT NULL\n",
+ "LIMIT 10\n",
+ "\"\"\"\n",
"df = bq.query(QUERY).to_dataframe()\n",
"df"
]
@@ -1016,8 +1028,7 @@
" question = \"\"\"Create a 1 sentence friendly response to the following comment: {comment}\"\"\".format(\n",
" comment=comment\n",
" )\n",
- " llm1 = VertexAI(temperature=0.2, max_output_tokens=150)\n",
- " response = llm1.predict(question)\n",
+ " response = llm.predict(question)\n",
"\n",
" return response\n",
"\n",
@@ -1029,8 +1040,7 @@
" question = \"\"\"What is the sentiment of the comment (Negative, Positive, Neutral): {comment}\"\"\".format(\n",
" comment=comment\n",
" )\n",
- " llm1 = VertexAI(temperature=0.2, max_output_tokens=150)\n",
- " response = llm1.predict(question)\n",
+ " response = llm.predict(question)\n",
"\n",
" return response\n",
"\n",
@@ -1042,8 +1052,7 @@
" question = \"\"\"Put the comment into one of these categories (Technology, Politics, Products, News): {comment}\"\"\".format(\n",
" comment=comment\n",
" )\n",
- " llm1 = VertexAI(temperature=0.2, max_output_tokens=150)\n",
- " response = llm1.predict(question)\n",
+ " response = llm.predict(question)\n",
"\n",
" return response"
]
@@ -1056,7 +1065,10 @@
},
"outputs": [],
"source": [
+ "from langchain.agents import Tool\n",
+ "\n",
"# Create a pool of tools which will be utilized by the LLM to generate response.\n",
+ "\n",
"tools = [\n",
" Tool(\n",
" name=\"GetCommentsById\",\n",
@@ -1244,6 +1256,8 @@
},
"outputs": [],
"source": [
+ "from langchain.agents.agent import AgentExecutor\n",
+ "\n",
"agent = ReActTestAgent.from_llm_and_tools(llm, tools, verbose=True)\n",
"\n",
"agent_executor = AgentExecutor.from_agent_and_tools(\n",
@@ -1260,19 +1274,7 @@
},
"outputs": [],
"source": [
- "input = \"Category for the comment 8885404\"\n",
- "agent_executor.invoke(input)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "91K_tImIzCwp"
- },
- "outputs": [],
- "source": [
- "input = \"Sentiment for comment 8885404\"\n",
+ "input = \"Category for the comment 919518\"\n",
"agent_executor.invoke(input)"
]
},
@@ -1284,7 +1286,7 @@
},
"outputs": [],
"source": [
- "input = \"Response for all comments by jpmoral.\"\n",
+ "input = \"Response for all comments by spicywebdesign.\"\n",
"agent_executor.invoke(input)"
]
},
@@ -1296,19 +1298,7 @@
},
"outputs": [],
"source": [
- "input = \"Respond to the comment 9871807.\"\n",
- "agent_executor.invoke(input)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "LYrh5IR0Qp09"
- },
- "outputs": [],
- "source": [
- "input = \"Respond to all the comments by jpmoral.\"\n",
+ "input = \"Get comments by coleifer\"\n",
"agent_executor.invoke(input)"
]
}
From 0cb3ac780fbfebebfa9779ba26dd16a9571a620d Mon Sep 17 00:00:00 2001
From: nhootan <103317089+nhootan@users.noreply.github.com>
Date: Thu, 26 Sep 2024 10:48:00 -0400
Subject: [PATCH 15/76] feat: add gemini-1.5-pro:002 to the prompt optimizer
notebook. (#1180)
# Description
---------
Co-authored-by: hootan
Co-authored-by: Owl Bot
---
.../vertex_ai_prompt_optimizer_ui.ipynb | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
index 93d820f28e..a17a30b1c1 100644
--- a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
+++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
@@ -94,7 +94,7 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 1,
"metadata": {
"id": "NyKGtVQjgx13"
},
@@ -202,9 +202,9 @@
"\n",
"The most important settings are:\n",
"\n",
- "Target Model: Which model you are trying to optimize your prompts to.\n",
- "Optimization Mode: The mode in which you are trying to optimize your prompt with.\n",
- "Evaluation Metrics: The evaluation metrics in which you are trying to optimize your prompts against.\n",
+ "* Target Model: Which model you are trying to optimize your prompts to.\n",
+ "* Optimization Mode: The mode in which you are trying to optimize your prompt with.\n",
+ "* Evaluation Metrics: The evaluation metrics in which you are trying to optimize your prompts against.\n",
"Refer [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#configuration) to learn more about the different configuration settings and how to best utilize them."
]
},
@@ -216,8 +216,8 @@
},
"outputs": [],
"source": [
- "SOURCE_MODEL = \"\" # @param [\"\", \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"]\n",
- "TARGET_MODEL = \"gemini-1.5-flash-001\" # @param [\"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\"]\n",
+ "SOURCE_MODEL = \"\" # @param [\"\", \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.5-pro-002\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"]\n",
+ "TARGET_MODEL = \"gemini-1.5-flash-001\" # @param [\"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.5-pro-002\", \"gemini-1.0-ultra-001\"]\n",
"OPTIMIZATION_MODE = \"instruction_and_demo\" # @param [\"instruction\", \"demonstration\", \"instruction_and_demo\"]\n",
"EVAL_METRIC = \"question_answering_correctness\" # @param [\"bleu\", \"coherence\", \"exact_match\", \"fluency\", \"groundedness\", \"text_quality\", \"verbosity\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_quality\", \"summarization_quality\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\", \"tool_call_valid\"] {type:\"string\"}"
]
From 814aec5c52af44d03a4f8267afd4720a2b276f47 Mon Sep 17 00:00:00 2001
From: Ivan Nardini <88703814+inardini@users.noreply.github.com>
Date: Thu, 26 Sep 2024 19:04:10 +0200
Subject: [PATCH 16/76] fix: utils module and notebook name (#1182)
# Description
Fixes: Download utils module and change notebook name.
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the Conventional Commits specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
---
...b => vertex_ai_prompt_optimizer_sdk.ipynb} | 37 ++++++++++++++++---
1 file changed, 31 insertions(+), 6 deletions(-)
rename gemini/prompts/prompt_optimizer/{get_started_with_vertex_ai_prompt_optimizer_sdk.ipynb => vertex_ai_prompt_optimizer_sdk.ipynb} (97%)
diff --git a/gemini/prompts/prompt_optimizer/get_started_with_vertex_ai_prompt_optimizer_sdk.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb
similarity index 97%
rename from gemini/prompts/prompt_optimizer/get_started_with_vertex_ai_prompt_optimizer_sdk.ipynb
rename to gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb
index 98526b6428..35f16a7d7b 100644
--- a/gemini/prompts/prompt_optimizer/get_started_with_vertex_ai_prompt_optimizer_sdk.ipynb
+++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb
@@ -29,26 +29,26 @@
"id": "JAPoU8Sm5E6e"
},
"source": [
- "# Get started with Vertex Prompt Optimizer Notebook SDK (Preview)\n",
+ "# Vertex Prompt Optimizer Notebook SDK (Preview)\n",
"\n",
"
\n",
diff --git a/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb b/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
index 0d8e98a8b8..1af8cb4092 100644
--- a/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
+++ b/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
@@ -104,16 +104,12 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {
- "id": "tFy3H3aPgx12",
- "outputId": "cd2eca8a-1e2f-47b1-f421-2488d5a4d736",
- "colab": {
- "base_uri": "https://localhost:8080/"
- }
+ "id": "tFy3H3aPgx12"
},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/5.3 MB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[91m╸\u001b[0m \u001b[32m5.2/5.3 MB\u001b[0m \u001b[31m161.8 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.3/5.3 MB\u001b[0m \u001b[31m80.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25h\u001b[33m WARNING: The script tb-gcp-uploader is installed in '/root/.local/bin' which is not on PATH.\n",
@@ -143,22 +139,18 @@
"cell_type": "code",
"execution_count": 1,
"metadata": {
- "id": "XRvKdaPDTznN",
- "outputId": "f285bbb4-4da9-45ed-e79c-96a4e688fda8",
- "colab": {
- "base_uri": "https://localhost:8080/"
- }
+ "id": "XRvKdaPDTznN"
},
"outputs": [
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"{'status': 'ok', 'restart': True}"
]
},
+ "execution_count": 1,
"metadata": {},
- "execution_count": 1
+ "output_type": "execute_result"
}
],
"source": [
@@ -311,8 +303,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "PTvunHqRTHqe",
- "outputId": "7f67d2b6-40d2-4689-c90e-651d5732a5e4"
+ "id": "PTvunHqRTHqe"
},
"outputs": [
{
@@ -467,8 +458,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "pUCpEmEFM0eX",
- "outputId": "20c9e84d-1276-463d-e716-0d0b913500c0"
+ "id": "pUCpEmEFM0eX"
},
"outputs": [
{
@@ -697,8 +687,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "uOWsUbwVXoTU",
- "outputId": "40a94287-dd21-43f0-cee3-cb4a1820927b"
+ "id": "uOWsUbwVXoTU"
},
"outputs": [
{
@@ -762,8 +751,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "DVIpbaGYRJQc",
- "outputId": "be75df17-ec76-4660-bf23-82c0cd051423"
+ "id": "DVIpbaGYRJQc"
},
"outputs": [
{
@@ -823,8 +811,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "p-soJC81YNy2",
- "outputId": "ba2ee710-f3d9-4ec2-a670-679aca12e0d8"
+ "id": "p-soJC81YNy2"
},
"outputs": [
{
@@ -953,8 +940,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "RppMRP9lIkq2",
- "outputId": "d6cc0e50-830b-4094-89dd-3e400ec6d416"
+ "id": "RppMRP9lIkq2"
},
"outputs": [
{
@@ -981,8 +967,7 @@
"DF4l8DTdWgPY"
],
"name": "vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb",
- "toc_visible": true,
- "provenance": []
+ "toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
@@ -991,4 +976,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
-}
\ No newline at end of file
+}
From c98ba59fef5137b8431b668bae279ecfa0b16bb8 Mon Sep 17 00:00:00 2001
From: Eric Dong
Date: Fri, 27 Sep 2024 17:09:33 -0400
Subject: [PATCH 20/76] fix: Correct a model version used in the safety
notebook (#1185)
# Description
Correct a model version used in the safety notebook
Gemini 1.0 -> Gemini 1.5
---------
Co-authored-by: Owl Bot
---
gemini/responsible-ai/gemini_safety_ratings.ipynb | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gemini/responsible-ai/gemini_safety_ratings.ipynb b/gemini/responsible-ai/gemini_safety_ratings.ipynb
index 0105f34fd7..122de5d9f8 100644
--- a/gemini/responsible-ai/gemini_safety_ratings.ipynb
+++ b/gemini/responsible-ai/gemini_safety_ratings.ipynb
@@ -270,7 +270,7 @@
"id": "5rpgrqQrPJQ-"
},
"source": [
- "### Load the Gemini 1.0 Pro model\n"
+ "### Load the Gemini 1.5 Pro model\n"
]
},
{
From a04e98df0f4cd5638e5144718a5d18d8b7a76471 Mon Sep 17 00:00:00 2001
From: Eric Dong
Date: Fri, 27 Sep 2024 17:48:17 -0400
Subject: [PATCH 21/76] feat: Add safety settings changes for the new models
(#1188)
# Description
Add safety settings changes for the new Gemini 1.5 Flash 002 and Gemini
1.5 Pro 002 models:
- The safety settings are `OFF` by default
- The default block thresholds are `BLOCK_NONE`
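Applications that depended on the previous defaults may want to pin thresholds explicitly rather than inherit the new `OFF`/`BLOCK_NONE` behavior. A minimal sketch with the Vertex AI SDK (the category and threshold choices here are illustrative, not part of this change):

```python
from vertexai.generative_models import (
    GenerativeModel,
    HarmBlockThreshold,
    HarmCategory,
    SafetySetting,
)

# With the 002 models the service-side default is effectively "no blocking",
# so set the thresholds you actually want at model-construction time.
safety_settings = [
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold=HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    ),
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=HarmBlockThreshold.BLOCK_ONLY_HIGH,
    ),
]

model = GenerativeModel("gemini-1.5-flash-002", safety_settings=safety_settings)
```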
---
.../getting-started/intro_gemini_python.ipynb | 121 +++++++++++++++---
1 file changed, 102 insertions(+), 19 deletions(-)
diff --git a/gemini/getting-started/intro_gemini_python.ipynb b/gemini/getting-started/intro_gemini_python.ipynb
index 2007da32de..9080ff3695 100644
--- a/gemini/getting-started/intro_gemini_python.ipynb
+++ b/gemini/getting-started/intro_gemini_python.ipynb
@@ -111,21 +111,6 @@
" - Generate text from video and text prompt\n"
]
},
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "C9nEPojogw-g"
- },
- "source": [
- "### Costs\n",
- "\n",
- "This tutorial uses billable components of Google Cloud:\n",
- "\n",
- "- Vertex AI\n",
- "\n",
- "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.\n"
- ]
- },
{
"cell_type": "markdown",
"metadata": {
@@ -174,7 +159,6 @@
},
"outputs": [],
"source": [
- "# Restart kernel after installs so that your environment can access the new packages\n",
"import IPython\n",
"\n",
"app = IPython.Application.instance()\n",
@@ -213,9 +197,7 @@
"source": [
"import sys\n",
"\n",
- "# Additional authentication is required for Google Colab\n",
"if \"google.colab\" in sys.modules:\n",
- " # Authenticate user to Google Cloud\n",
" from google.colab import auth\n",
"\n",
" auth.authenticate_user()"
@@ -268,7 +250,15 @@
},
"outputs": [],
"source": [
- "from vertexai.generative_models import GenerationConfig, GenerativeModel, Image, Part"
+ "from vertexai.generative_models import (\n",
+ " GenerationConfig,\n",
+ " GenerativeModel,\n",
+ " HarmBlockThreshold,\n",
+ " HarmCategory,\n",
+ " Image,\n",
+ " Part,\n",
+ " SafetySetting,\n",
+ ")"
]
},
{
@@ -511,6 +501,99 @@
"print(response.text)"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "7bff84b3f1c3"
+ },
+ "source": [
+ "### Safety filters\n",
+ "\n",
+ "The Gemini API provides safety filters that you can adjust across multiple filter categories to restrict or allow certain types of content. You can use these filters to adjust what's appropriate for your use case. See the [Configure safety filters](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters) page for details.\n",
+ "\n",
+ "When you make a request to Gemini, the content is analyzed and assigned a safety rating. You can inspect the safety ratings of the generated content by printing out the model responses, as in this example:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 51,
+ "metadata": {
+ "id": "6548f7974b26"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Safety ratings:\n",
+ "[category: HARM_CATEGORY_HATE_SPEECH\n",
+ "probability: NEGLIGIBLE\n",
+ "probability_score: 0.0693359375\n",
+ "severity: HARM_SEVERITY_NEGLIGIBLE\n",
+ "severity_score: 0.046630859375\n",
+ ", category: HARM_CATEGORY_DANGEROUS_CONTENT\n",
+ "probability: NEGLIGIBLE\n",
+ "probability_score: 0.09130859375\n",
+ "severity: HARM_SEVERITY_NEGLIGIBLE\n",
+ "severity_score: 0.0693359375\n",
+ ", category: HARM_CATEGORY_HARASSMENT\n",
+ "probability: NEGLIGIBLE\n",
+ "probability_score: 0.11767578125\n",
+ "severity: HARM_SEVERITY_NEGLIGIBLE\n",
+ "severity_score: 0.0267333984375\n",
+ ", category: HARM_CATEGORY_SEXUALLY_EXPLICIT\n",
+ "probability: NEGLIGIBLE\n",
+ "probability_score: 0.1435546875\n",
+ "severity: HARM_SEVERITY_NEGLIGIBLE\n",
+ "severity_score: 0.0289306640625\n",
+ "]\n"
+ ]
+ }
+ ],
+ "source": [
+ "response = model.generate_content(\"Why is the sky blue?\")\n",
+ "\n",
+ "print(f\"Safety ratings:\\n{response.candidates[0].safety_ratings}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3fe5bb6d26c8"
+ },
+ "source": [
+ "In Gemini 1.5 Flash 002 and Gemini 1.5 Pro 002, the safety settings are `OFF` by default and the default block thresholds are `BLOCK_NONE`.\n",
+ "\n",
+ "You can use `safety_settings` to adjust the safety settings for each request you make to the API. This example demonstrates how you set the block threshold to BLOCK_ONLY_HIGH for the dangerous content category:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "4c055f9f41a5"
+ },
+ "outputs": [],
+ "source": [
+ "safety_settings = [\n",
+ " SafetySetting(\n",
+ " category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,\n",
+ " threshold=HarmBlockThreshold.BLOCK_ONLY_HIGH,\n",
+ " ),\n",
+ "]\n",
+ "\n",
+ "prompt = \"\"\"\n",
+ " Write a list of 2 disrespectful things that I might say to the universe after stubbing my toe in the dark.\n",
+ "\"\"\"\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " prompt,\n",
+ " safety_settings=safety_settings,\n",
+ ")\n",
+ "\n",
+ "print(response)"
+ ]
+ },
{
"cell_type": "markdown",
"metadata": {
From 185c74d75650cffc067429be93a9965dfee8fc19 Mon Sep 17 00:00:00 2001
From: Averi Kitsch
Date: Fri, 27 Sep 2024 16:01:25 -0700
Subject: [PATCH 22/76] fix: update Reasoning Engine tutorials with LangChain
packages (#1191)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [ ] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [ ] You are listed as the author in your notebook or README file.
- [ ] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [ ] Make your Pull Request title in the Conventional Commits specification.
- [ ] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [ ] Appropriate docs were updated (if necessary)
Fixes # 🦕
---------
Co-authored-by: Owl Bot
---
.../tutorial_alloydb_rag_agent.ipynb | 15 +++++++++------
.../tutorial_cloud_sql_pg_rag_agent.ipynb | 17 ++++++++++-------
2 files changed, 19 insertions(+), 13 deletions(-)
diff --git a/gemini/reasoning-engine/tutorial_alloydb_rag_agent.ipynb b/gemini/reasoning-engine/tutorial_alloydb_rag_agent.ipynb
index 94283718df..f174b9eba8 100644
--- a/gemini/reasoning-engine/tutorial_alloydb_rag_agent.ipynb
+++ b/gemini/reasoning-engine/tutorial_alloydb_rag_agent.ipynb
@@ -128,7 +128,7 @@
},
"outputs": [],
"source": [
- "!pip install --upgrade --quiet \"google-cloud-aiplatform[reasoningengine,langchain]\" langchain-google-alloydb-pg langchain-google-vertexai"
+ "%pip install --upgrade --quiet \"langchain-google-alloydb-pg>=0.7.0\" \"google-cloud-aiplatform[reasoningengine,langchain]\" langchain-google-vertexai langchain-community"
]
},
{
@@ -401,7 +401,9 @@
" password=PASSWORD,\n",
")\n",
"\n",
- "await engine._aexecute_outside_tx(f\"CREATE DATABASE {DATABASE}\")"
+ "async with engine._pool.connect() as conn:\n",
+ " await conn.execute(text(\"COMMIT\"))\n",
+ " await conn.execute(text(f\"CREATE DATABASE {DATABASE}\"))"
]
},
{
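The replacement code above issues `CREATE DATABASE` through the engine's underlying SQLAlchemy pool, which requires ending the implicit transaction first. A sketch of the full pattern, assuming `text` is imported from SQLAlchemy and the engine is created as elsewhere in the tutorial (the argument order mirrors the surrounding context lines but is still an assumption):

```python
from langchain_google_alloydb_pg import AlloyDBEngine
from sqlalchemy import text  # needed for the text() calls used above

engine = await AlloyDBEngine.afrom_instance(
    PROJECT_ID, REGION, CLUSTER, INSTANCE, "postgres",
    user="postgres",
    password=PASSWORD,
)

# CREATE DATABASE cannot run inside a transaction block, so close the
# implicit transaction before issuing it on a raw pooled connection.
async with engine._pool.connect() as conn:
    await conn.execute(text("COMMIT"))
    await conn.execute(text(f"CREATE DATABASE {DATABASE}"))
```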
@@ -570,7 +572,8 @@
" PROJECT_ID, REGION, CLUSTER, INSTANCE, DATABASE, user=\"postgres\", password=PASSWORD\n",
")\n",
"\n",
- "await engine._aexecute(f'GRANT SELECT ON {TABLE_NAME} TO \"{IAM_USER}\";')"
+ "async with engine._pool.connect() as conn:\n",
+ " await conn.execute(text(f'GRANT SELECT ON {TABLE_NAME} TO \"{IAM_USER}\";'))"
]
},
{
@@ -585,7 +588,7 @@
"\n",
"In this example, you'll define a function that will retrieve similar documents from the vector store using semantic search.\n",
"\n",
- "For improved security measures, the tool wil use IAM-based authentication to authenticate to the databases instead of using the built-in user/password authentication."
+ "For improved security measures, the tool will use IAM-based authentication to authenticate to the databases instead of using the built-in user/password authentication."
]
},
{
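The retriever described in this cell can be sketched as a plain function that the Reasoning Engine agent registers as a tool: it connects with IAM authentication (no user/password), opens the vector store, and runs a semantic similarity search. The table and embedding model names below are placeholders for the values defined elsewhere in the tutorial:

```python
from langchain_google_alloydb_pg import AlloyDBEngine, AlloyDBVectorStore
from langchain_google_vertexai import VertexAIEmbeddings


def similarity_search(query: str) -> list[str]:
    """Return document chunks that are semantically similar to the query."""
    # No user/password: the engine authenticates as the caller's IAM principal.
    engine = AlloyDBEngine.from_instance(
        PROJECT_ID, REGION, CLUSTER, INSTANCE, DATABASE
    )
    vector_store = AlloyDBVectorStore.create_sync(
        engine,
        table_name=TABLE_NAME,
        embedding_service=VertexAIEmbeddings(model_name="textembedding-gecko@003"),
    )
    docs = vector_store.similarity_search(query, k=4)
    return [doc.page_content for doc in docs]
```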
@@ -659,8 +662,8 @@
" },\n",
" ),\n",
" requirements=[\n",
- " \"google-cloud-aiplatform[reasoningengine,langchain]==1.57.0\",\n",
- " \"langchain-google-alloydb-pg==0.4.1\",\n",
+ " \"google-cloud-aiplatform[reasoningengine,langchain]==1.68.0\",\n",
+ " \"langchain-google-alloydb-pg==0.7.0\",\n",
" \"langchain-google-vertexai==1.0.4\",\n",
" ],\n",
" display_name=\"PrebuiltAgent\",\n",
diff --git a/gemini/reasoning-engine/tutorial_cloud_sql_pg_rag_agent.ipynb b/gemini/reasoning-engine/tutorial_cloud_sql_pg_rag_agent.ipynb
index 8b2d165a84..46173dbe73 100644
--- a/gemini/reasoning-engine/tutorial_cloud_sql_pg_rag_agent.ipynb
+++ b/gemini/reasoning-engine/tutorial_cloud_sql_pg_rag_agent.ipynb
@@ -128,7 +128,7 @@
},
"outputs": [],
"source": [
- "!pip install --upgrade --quiet \"google-cloud-aiplatform[reasoningengine,langchain]\" langchain-google-cloud-sql-pg langchain-google-vertexai"
+ "!pip install --upgrade --quiet \"langchain-google-cloud-sql-pg>=0.10.0\" \"google-cloud-aiplatform[reasoningengine,langchain]\" langchain-google-vertexai langchain-community"
]
},
{
@@ -345,7 +345,9 @@
" password=PASSWORD,\n",
")\n",
"\n",
- "await engine._aexecute_outside_tx(f\"CREATE DATABASE {DATABASE}\")"
+ "async with engine._pool.connect() as conn:\n",
+ " await conn.execute(text(\"COMMIT\"))\n",
+ " await conn.execute(text(f\"CREATE DATABASE {DATABASE}\"))"
]
},
{
@@ -513,7 +515,8 @@
" PROJECT_ID, REGION, INSTANCE, DATABASE, user=\"postgres\", password=PASSWORD\n",
")\n",
"\n",
- "await engine._aexecute(f'GRANT SELECT ON {TABLE_NAME} TO \"{IAM_USER}\";')"
+ "async with engine._pool.connect() as conn:\n",
+ " await conn.execute(text(f'GRANT SELECT ON {TABLE_NAME} TO \"{IAM_USER}\";'))"
]
},
{
@@ -528,7 +531,7 @@
"\n",
"In this example, you'll define a function that will retrieve similar documents from the vector store using semantic search.\n",
"\n",
- "For improved security measures, the tool wil use IAM-based authentication to authenticate to the databases instead of using the built-in user/password authentication."
+ "For improved security measures, the tool will use IAM-based authentication to authenticate to the databases instead of using the built-in user/password authentication."
]
},
{
@@ -602,9 +605,9 @@
" },\n",
" ),\n",
" requirements=[\n",
- " \"google-cloud-aiplatform[reasoningengine,langchain]==1.57.0\",\n",
- " \"langchain-google-cloud-sql-pg==0.6.1\",\n",
- " \"langchain-google-vertexai==1.0.4\",\n",
+ " \"google-cloud-aiplatform[reasoningengine,langchain]==1.68.0\",\n",
+ " \"langchain-google-cloud-sql-pg==0.10.0\",\n",
+ " \"langchain-google-vertexai==1.0.10\",\n",
" ],\n",
" display_name=\"PrebuiltAgent\",\n",
")"
From eacc4fd220f5b682e2efe370a901094a29880595 Mon Sep 17 00:00:00 2001
From: Andrew Grande
Date: Fri, 27 Sep 2024 16:03:54 -0700
Subject: [PATCH 23/76] docs: Example Selectors page has moved (#1186)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [ ] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [ ] You are listed as the author in your notebook or README file.
- [ ] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [ ] Make your Pull Request title in the Conventional Commits specification.
- [ ] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [ ] Appropriate docs were updated (if necessary)
Fixes # 🦕
Co-authored-by: Kristopher Overholt
---
gemini/orchestration/intro_langchain_gemini.ipynb | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gemini/orchestration/intro_langchain_gemini.ipynb b/gemini/orchestration/intro_langchain_gemini.ipynb
index fadcc7849b..dff319a923 100644
--- a/gemini/orchestration/intro_langchain_gemini.ipynb
+++ b/gemini/orchestration/intro_langchain_gemini.ipynb
@@ -714,7 +714,7 @@
"\n",
"[Example selectors](https://python.langchain.com/v0.1/docs/modules/model_io/prompts/example_selectors/) are an easy way to select from a series of examples to dynamically place in-context information into your prompt. Often used when the task is nuanced or has a large list of examples.\n",
"\n",
- "Check out different types of example selectors [here](https://python.langchain.com/docs/modules/model_io/prompts/example_selectors/)\n"
+ "Check out different types of example selectors [here](https://python.langchain.com/docs/how_to/example_selectors/)\n"
]
},
{
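As a concrete illustration of the example-selector idea described in this cell, here is a short, self-contained sketch using LangChain's length-based selector with a few-shot prompt (the antonym examples are illustrative, and import paths can vary slightly across LangChain versions):

```python
from langchain_core.example_selectors import LengthBasedExampleSelector
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
]

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

# Select only as many examples as fit within the length budget.
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=25,
)

dynamic_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input.",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)

print(dynamic_prompt.format(adjective="big"))
```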
From 438aaad6838fec390065d0252a3d32a75dba5d91 Mon Sep 17 00:00:00 2001
From: Kristopher Overholt
Date: Mon, 30 Sep 2024 10:08:17 -0500
Subject: [PATCH 24/76] docs: Add sample media to Gemini multimodal function
calling notebook (#1168)
# Description
This is a followup PR to #1162 that adds sample images and content to
each section that helps add context to the analysis performed (e.g.,
screenshot, GIF of video, transcript).
- [X] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [X] You are listed as the author in your notebook or README file.
- [X] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [X] Make your Pull Request title in the Conventional Commits specification.
- [X] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [X] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
---
.../multimodal_function_calling.ipynb | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/gemini/function-calling/multimodal_function_calling.ipynb b/gemini/function-calling/multimodal_function_calling.ipynb
index 0c7759da95..0a287fe055 100644
--- a/gemini/function-calling/multimodal_function_calling.ipynb
+++ b/gemini/function-calling/multimodal_function_calling.ipynb
@@ -78,8 +78,6 @@
"\n",
"This notebook demonstrates a powerful [Function Calling](https://cloud.google.com/vertex-ai/docs/generative-ai/multimodal/function-calling) capability of the Gemini model: support for multimodal inputs. With multimodal function calling, you can go beyond traditional text inputs, enabling Gemini to understand your intent and predict function calls and function parameters based on various inputs like images, audio, video, and PDFs. Function calling can also be referred to as *function calling with controlled generation*, which guarantees that output generated by the model always adheres to a specific schema so that you receive consistently formatted responses.\n",
"\n",
- "You can To specify the structure of a model's output, define a response schema, which works like a blueprint for model responses. When you submit a prompt and include the response schema, the model's response always follows your defined schema.\n",
- "\n",
"Previously, implementing multimodal function calling required two separate calls to the Gemini API: one to extract information from media, and another to generate a function call based on the extracted text. This process was cumbersome, prone to errors, and resulted in the loss of detail in valuable contextual information. Gemini's multimodal function calling capability streamlines this workflow, enabling a single API call that efficiently processes multimodal inputs for accurate function predictions and structured outputs. \n",
"\n",
"### How It Works\n",
@@ -299,6 +297,8 @@
"\n",
"In this example, you'll send along an image of a bird and ask Gemini to identify its habitat. This involves defining a function that looks up regions where a given animal is found, creating a tool that uses this function, and then sending a request to Gemini.\n",
"\n",
+ "\n",
+ "\n",
"First, you define a `FunctionDeclaration` called `get_wildlife_region`. This function takes the name of an animal species as input and returns information about its typical region."
]
},
@@ -586,6 +586,8 @@
"source": [
"Now let's explore how Gemini can extract information from videos for the purpose of invoking a function call. You'll use a video showcasing multiple products and ask Gemini to identify its key features.\n",
"\n",
+ "\n",
+ "\n",
"Start by defining a function called `get_feature_info` that takes a list of product features as input and could potentially be used to retrieve additional details about those features:"
]
},
@@ -814,6 +816,13 @@
"source": [
"In this example, you'll explore using audio input with Gemini's multimodal function calling. You'll send a podcast episode to Gemini and ask for book recommendations related to the topics discussed.\n",
"\n",
+ ">>> \"SRE is just a production system specific manifestation of systems thinking ... and we kind of do it in an informal way.\" \n",
+ ">>> \"The book called 'Thinking in Systems' ... it's a really good primer on this topic.\" \n",
+ ">>> \"An example of ... systems structure behavior thinking ... is the idea of like the cascading failure, that kind of vicious cycle of load that causes retries that causes more load ... \" \n",
+ ">>> \"The worst pattern is the single embedded SRE that turns into the ops person ... you just end up doing all of the toil, all of the grunt work.\" \n",
+ ">>> \"Take that moment, take a breath, and really analyze the problem and understand how it's working as a system and understand how you can intervene to improve that.\" \n",
+ ">>> \"Avoid just doing what you've done before and kicking the can down the road, and really think deeply about your problems.\" \n",
+ "\n",
"Define a function called `get_recommended_books` that takes a list of topics as input and (hypothetically) returns relevant book recommendations:"
]
},
@@ -1026,6 +1035,8 @@
"source": [
"This example demonstrates how to use Gemini's multimodal function calling to process PDF documents. You'll work with a set of invoices and extract the names of the (fictitious) companies involved.\n",
"\n",
+ "\n",
+ "\n",
"Define a function called `get_company_information` that (in a real-world scenario) could be used to fetch details about a given list of companies:"
]
},
@@ -1254,6 +1265,8 @@
"source": [
"Let's put it all together and build a simple multimodal chatbot. This chatbot will understand image inputs and respond to questions using the functions you define.\n",
"\n",
+ "\n",
+ "\n",
"First, define three functions: `get_animal_details`, `get_location_details`, and `check_color_palette`. These functions represent the capabilities of your chatbot and could potentially be used to retrieve additional details using REST API calls:"
]
},
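For the chatbot flow this cell introduces, the other half of the loop is the round trip: execute the function call the model predicts, then return the result with `Part.from_function_response` so the model can compose its reply. A sketch, where `animal_tool` and the API payload are assumed placeholders:

```python
from vertexai.generative_models import GenerativeModel, Part

# `animal_tool` bundles the three FunctionDeclarations described above;
# its construction is omitted here.
model = GenerativeModel("gemini-1.5-pro-001", tools=[animal_tool])
chat = model.start_chat()

response = chat.send_message(
    [
        Part.from_uri("gs://your-bucket/photo.jpg", mime_type="image/jpeg"),
        "Tell me about this animal.",
    ]
)

function_call = response.candidates[0].function_calls[0]

# Run the matching function in your own code (e.g. a REST call), then hand
# the result back so the model can produce a grounded, natural-language reply.
api_result = {"details": "Found in wetlands across North America."}

response = chat.send_message(
    Part.from_function_response(
        name=function_call.name,
        response={"content": api_result},
    )
)

print(response.text)
```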
From 9fd8ea58faede85271cce18ac25baf792474c8a8 Mon Sep 17 00:00:00 2001
From: Mend Renovate
Date: Mon, 30 Sep 2024 17:11:35 +0200
Subject: [PATCH 25/76] chore(deps): update dependency faker to v30 (#1178)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [faker](https://redirect.github.com/joke2k/faker)
([changelog](https://redirect.github.com/joke2k/faker/blob/master/CHANGELOG.md))
| `29.0.0` -> `30.0.0` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/faker/30.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/faker/30.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/faker/29.0.0/30.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/faker/29.0.0/30.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
---
> [!WARNING]
> Some dependencies could not be looked up. Check the warning logs for
more information.
---
### Release Notes
joke2k/faker (faker)
###
[`v30.0.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v3000---2024-09-25)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v29.0.0...v30.0.0)
- Force the slug always be generated with ASCII characters. Thanks
[@Pandede](https://redirect.github.com/Pandede).
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] If you want to rebase/retry this PR, check
this box
---
This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/GoogleCloudPlatform/generative-ai).
---------
Co-authored-by: Eric Dong
Co-authored-by: Owl Bot
---
gemini/sample-apps/llamaindex-rag/pyproject.toml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gemini/sample-apps/llamaindex-rag/pyproject.toml b/gemini/sample-apps/llamaindex-rag/pyproject.toml
index c924966cd2..9d6d3e007b 100644
--- a/gemini/sample-apps/llamaindex-rag/pyproject.toml
+++ b/gemini/sample-apps/llamaindex-rag/pyproject.toml
@@ -59,7 +59,7 @@ dulwich = "0.21.7"
email-validator = "2.2.0"
entrypoints = "0.4"
exceptiongroup = "1.2.2"
-faker = "29.0.0"
+faker = "30.0.0"
fastapi = "0.111.1"
fastapi-cli = "0.0.4"
fastjsonschema = "2.20.0"
From cb854717643d2f2ed008d0c25958b91c2f984111 Mon Sep 17 00:00:00 2001
From: Mend Renovate
Date: Mon, 30 Sep 2024 17:21:39 +0200
Subject: [PATCH 26/76] chore(deps): update dependency @types/express to v5
(#1176)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
|
[@types/express](https://redirect.github.com/DefinitelyTyped/DefinitelyTyped/tree/master/types/express)
([source](https://redirect.github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/express))
| [`^4.17.21` ->
`^5.0.0`](https://renovatebot.com/diffs/npm/@types%2fexpress/4.17.21/5.0.0)
|
[![age](https://developer.mend.io/api/mc/badges/age/npm/@types%2fexpress/5.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/npm/@types%2fexpress/5.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/npm/@types%2fexpress/4.17.21/5.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/npm/@types%2fexpress/4.17.21/5.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
---
> [!WARNING]
> Some dependencies could not be looked up. Check the warning logs for
more information.
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] If you want to rebase/retry this PR, check
this box
---
This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/GoogleCloudPlatform/generative-ai).
Co-authored-by: Eric Dong
---
.../genwealth/api/package-lock.json | 18 ++++++++++--------
gemini/sample-apps/genwealth/api/package.json | 2 +-
2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/gemini/sample-apps/genwealth/api/package-lock.json b/gemini/sample-apps/genwealth/api/package-lock.json
index 354b3fa705..6cc7460305 100644
--- a/gemini/sample-apps/genwealth/api/package-lock.json
+++ b/gemini/sample-apps/genwealth/api/package-lock.json
@@ -21,7 +21,7 @@
},
"devDependencies": {
"@types/cors": "^2.8.17",
- "@types/express": "^4.17.21",
+ "@types/express": "^5.0.0",
"@types/lodash": "^4.17.0",
"@types/multer": "^1.4.11",
"@types/pg": "^8.11.2",
@@ -1048,22 +1048,24 @@
}
},
"node_modules/@types/express": {
- "version": "4.17.21",
- "resolved": "https://registry.npmjs.org/@types/express/-/express-4.17.21.tgz",
- "integrity": "sha512-ejlPM315qwLpaQlQDTjPdsUFSc6ZsP4AN6AlWnogPjQ7CVi7PYF3YVz+CY3jE2pwYf7E/7HlDAN0rV2GxTG0HQ==",
+ "version": "5.0.0",
+ "resolved": "https://registry.npmjs.org/@types/express/-/express-5.0.0.tgz",
+ "integrity": "sha512-DvZriSMehGHL1ZNLzi6MidnsDhUZM/x2pRdDIKdwbUNqqwHxMlRdkxtn6/EPKyqKpHqTl/4nRZsRNLpZxZRpPQ==",
"dev": true,
+ "license": "MIT",
"dependencies": {
"@types/body-parser": "*",
- "@types/express-serve-static-core": "^4.17.33",
+ "@types/express-serve-static-core": "^5.0.0",
"@types/qs": "*",
"@types/serve-static": "*"
}
},
"node_modules/@types/express-serve-static-core": {
- "version": "4.17.43",
- "resolved": "https://registry.npmjs.org/@types/express-serve-static-core/-/express-serve-static-core-4.17.43.tgz",
- "integrity": "sha512-oaYtiBirUOPQGSWNGPWnzyAFJ0BP3cwvN4oWZQY+zUBwpVIGsKUkpBpSztp74drYcjavs7SKFZ4DX1V2QeN8rg==",
+ "version": "5.0.0",
+ "resolved": "https://registry.npmjs.org/@types/express-serve-static-core/-/express-serve-static-core-5.0.0.tgz",
+ "integrity": "sha512-AbXMTZGt40T+KON9/Fdxx0B2WK5hsgxcfXJLr5bFpZ7b4JCex2WyQPTEKdXqfHiY5nKKBScZ7yCoO6Pvgxfvnw==",
"dev": true,
+ "license": "MIT",
"dependencies": {
"@types/node": "*",
"@types/qs": "*",
diff --git a/gemini/sample-apps/genwealth/api/package.json b/gemini/sample-apps/genwealth/api/package.json
index d7a24e4428..e348a5e71c 100644
--- a/gemini/sample-apps/genwealth/api/package.json
+++ b/gemini/sample-apps/genwealth/api/package.json
@@ -23,7 +23,7 @@
},
"devDependencies": {
"@types/cors": "^2.8.17",
- "@types/express": "^4.17.21",
+ "@types/express": "^5.0.0",
"@types/lodash": "^4.17.0",
"@types/multer": "^1.4.11",
"@types/pg": "^8.11.2",
From c550eba589654999aa962843bb9455f9836ae10b Mon Sep 17 00:00:00 2001
From: Jorj Ismailyan <130348293+jismailyan-google@users.noreply.github.com>
Date: Mon, 30 Sep 2024 08:38:21 -0700
Subject: [PATCH 27/76] feat: Add the notebook tutorial for Vertex Image
Segmentation (#1152)
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [ ] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the Conventional Commits specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Holt Skinner
Co-authored-by: Owl Bot
Co-authored-by: Eric Dong
---
.github/actions/spelling/allow.txt | 8 +
.../getting-started/image_segmentation.ipynb | 826 ++++++++++++++++++
2 files changed, 834 insertions(+)
create mode 100644 vision/getting-started/image_segmentation.ipynb
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 3c4bb080ba..160a5dd119 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -300,11 +300,13 @@ baxis
bbc
bigquery
bitcoin
+boundings
bpa
bqml
carbonara
caudatus
caxis
+cctv
cfbundle
chatbots
claude
@@ -346,6 +348,7 @@ drinkware
dropdown
dropna
dsl
+dtype
dtypes
dwmapi
ecommerce
@@ -368,6 +371,7 @@ figsize
fillmode
firestore
flac
+floormat
fmeasure
fontdict
forno
@@ -378,6 +382,7 @@ fromiter
fts
fulltext
funtion
+gapic
gboolean
gchar
gcloud
@@ -422,6 +427,7 @@ idk
idks
idxs
iloc
+imagefont
imageno
imdb
imshow
@@ -529,6 +535,7 @@ projectid
protobuf
pstotext
pubspec
+putalpha
putdata
pvc
pyautogen
@@ -569,6 +576,7 @@ siglap
sittin
sklearn
sku
+snowboard
sourced
srlimit
ssd
diff --git a/vision/getting-started/image_segmentation.ipynb b/vision/getting-started/image_segmentation.ipynb
new file mode 100644
index 0000000000..e0c062d1e4
--- /dev/null
+++ b/vision/getting-started/image_segmentation.ipynb
@@ -0,0 +1,826 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "uxCkB_DXTHzf"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Hny4I-ODTIS6"
+ },
+ "source": [
+ "# Image Segmentation on Vertex AI\n",
+ "\n",
+ "\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "| Author(s) | [Jeff Nelson](https://github.com/jeffonelson/), Eric Hao |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tvgnzT1CKxrO"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "This notebook demonstrates a basic end-to-end retrieval-augmented generation (RAG) pipeline using [BigQuery](https://cloud.google.com/bigquery/) and [BigQuery ML](https://cloud.google.com/bigquery/docs/bqml-introduction) functions. To do so, we:\n",
+ "\n",
+ "* Complete setup steps to download sample data and access [Vertex AI](https://cloud.google.com/vertex-ai) from BigQuery\n",
+ "* Generate [object table](https://cloud.google.com/bigquery/docs/object-table-introduction) to access unstructured PDFs that reside in [Cloud Storage](https://cloud.google.com/storage)\n",
+ "* Create a remote model, so BigQuery can call [Document AI](https://cloud.google.com/document-ai) to parse the PDF inputs\n",
+ "* Parse response from Document AI into chunks and metadata, then generate vector embeddings for the chunks\n",
+ "* Run a [vector search](https://cloud.google.com/bigquery/docs/vector-search) against embeddings in BigQuery, return relevant chunks, and summarize them with Gemini"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dc949afc1f08"
+ },
+ "source": [
+ "## How to open this notebook in BigQuery Studio\n",
+ "\n",
+ "This notebook was written to be compatible for use within BigQuery Studio. To open this notebook in BigQuery, click to [Run in Colab Enterprise](https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fuse-cases%2Fretrieval-augmented-generation%2Frag_with_bigquery.ipynb). This will open a new window in the Cloud Console and prompt you to confirm import. Then, navigate to BigQuery, where you will find the notebook available in the Explorer pane under Notebooks."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5ba5c12e483d"
+ },
+ "source": [
+ "## About the dataset\n",
+ "\n",
+ "This example uses [Alphabet's 2023 10-K](https://abc.xyz/assets/43/44/675b83d7455885c4615d848d52a4/goog-10-k-2023.pdf) form. This is a detailed overview of the company's financial information and includes text, tables, and diagrams spanning nearly 100 pages."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2ce33dbc8fde"
+ },
+ "source": [
+ "## Services and Costs\n",
+ "\n",
+ "This tutorial uses the following Google Cloud data analytics and ML services, they are billable components of Google Cloud:\n",
+ "\n",
+ "* BigQuery & BigQuery ML [(pricing)](https://cloud.google.com/bigquery/pricing)\n",
+ "* Vertex AI Generative AI models [(pricing)](https://cloud.google.com/vertex-ai/generative-ai/pricing)\n",
+ "* Document AI [(pricing)](https://cloud.google.com/document-ai/pricing)\n",
+ "* Cloud Storage [(pricing)](https://cloud.google.com/storage/pricing)\n",
+ "\n",
+ "Use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "# Setup Steps to access Vertex AI models from BigQuery and enable APIs"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ff210a6d4d21"
+ },
+ "source": [
+ "### Install Document AI SDK"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "2e9e2b9e1b1f"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install --quiet google-cloud-documentai==2.31.0"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "8ed31279f009"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "567212ff53a6"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "b96b39fd4d7b"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "fa362c2ef5b5"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "id": "9a07a9f9a4a9"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()\n",
+ " print(\"Authenticated\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Define your Google Cloud project"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_ID = \"your-project-id\" # @param {type: \"string\"}\n",
+ "PROJECT_NUMBER = \"your-project-number\" # @param {type: \"string\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "04deeb11bbca"
+ },
+ "source": [
+ "### Enable Data Table Display\n",
+ "\n",
+ "This makes it easier to visualize tabular data within a Notebook environment later on."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "af9974f04f9f"
+ },
+ "outputs": [],
+ "source": [
+ "%load_ext google.colab.data_table"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "b4256d07d596"
+ },
+ "source": [
+ "### Create a new dataset in BigQuery\n",
+ "\n",
+ "This will house any tables created throughout this notebook."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "8a4c1a356d10"
+ },
+ "outputs": [],
+ "source": [
+ "!bq mk --location=us --dataset --project_id={PROJECT_ID} docai_demo"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a100b689816b"
+ },
+ "source": [
+ "### Create a Cloud resource connection\n",
+ "\n",
+ "[Cloud resource connections](https://cloud.google.com/bigquery/docs/create-cloud-resource-connection) enable BigQuery to access other Cloud services, like Cloud Storage and Vertex AI."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "885da43402b1"
+ },
+ "outputs": [],
+ "source": [
+ "!bq mk --connection --connection_type=CLOUD_RESOURCE --location=us --project_id={PROJECT_ID} \"demo_conn\"\n",
+ "!bq show --location=us --connection --project_id={PROJECT_ID} \"demo_conn\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dd9f6cbe4393"
+ },
+ "source": [
+ "### Add permissions to Cloud resource connection service account\n",
+ "\n",
+ "The Cloud resource connection is associated with a service account. The following cell enables the service account to access services like Document AI, Cloud Storage, and Vertex AI.\n",
+ "\n",
+ "**Note:** Copy the service account ID from the prior cell and input it below. It will look like `your-copied-service-account@gcp-sa-bigquery-condel.iam.gserviceaccount.com`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "16b193a840cd"
+ },
+ "outputs": [],
+ "source": [
+ "connection_service_account = \"your-copied-service-account@gcp-sa-bigquery-condel.iam.gserviceaccount.com\" # @param {type: \"string\"}\n",
+ "connection_member = f\"serviceAccount:{connection_service_account}\"\n",
+ "\n",
+ "\n",
+ "!gcloud projects add-iam-policy-binding {PROJECT_ID} --member={connection_member} --role='roles/documentai.viewer' --condition=None --quiet\n",
+ "!gcloud projects add-iam-policy-binding {PROJECT_ID} --member={connection_member} --role='roles/storage.objectViewer' --condition=None --quiet\n",
+ "!gcloud projects add-iam-policy-binding {PROJECT_ID} --member={connection_member} --role='roles/aiplatform.user' --condition=None --quiet"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ba09d9393559"
+ },
+ "source": [
+ "### Download the sample PDF used for this notebook and store it in a new Cloud Storage bucket"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "4605453a6675"
+ },
+ "outputs": [],
+ "source": [
+ "import random\n",
+ "\n",
+ "# Create a unique Cloud Storage bucket name\n",
+ "bucket_name = f\"{PROJECT_ID}-{random.randint(10000, 99999)}\"\n",
+ "\n",
+ "# Create the bucket\n",
+ "!gsutil mb -l US -p {PROJECT_ID} gs://{bucket_name}\n",
+ "\n",
+ "# Download the PDF sample\n",
+ "!wget goog-10-k-2023.pdf \"https://www.abc.xyz/assets/43/44/675b83d7455885c4615d848d52a4/goog-10-k-2023.pdf\"\n",
+ "\n",
+ "# Upload the PDF sample to the newly created Cloud Storage bucket\n",
+ "!gsutil cp goog-10-k-2023.pdf gs://{bucket_name}/\n",
+ "\n",
+ "# Print confirmation\n",
+ "print(f\"PDF uploaded to gs://{bucket_name}/goog-10-k-2023.pdf\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6b0a0bd5c4fd"
+ },
+ "source": [
+ "## Create an object table\n",
+ "\n",
+ "An object table allows BigQuery to read unstructured data in Google Cloud Storage. This uses the BigQuery Python client library to continue using the `bucket_name` variable."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "94cc075094c4"
+ },
+ "outputs": [],
+ "source": [
+ "from google.cloud import bigquery\n",
+ "\n",
+ "client = bigquery.Client(project=PROJECT_ID)\n",
+ "\n",
+ "query = f\"\"\"\n",
+ "CREATE OR REPLACE EXTERNAL TABLE `docai_demo.object_table`\n",
+ "WITH CONNECTION `us.demo_conn` -- Replace with your connection ID\n",
+ "OPTIONS (\n",
+ " uris = ['gs://{bucket_name}/goog-10-k-2023.pdf'],\n",
+ " object_metadata = 'DIRECTORY'\n",
+ ");\n",
+ "\"\"\"\n",
+ "\n",
+ "query_job = client.query(query) # API request\n",
+ "query_job.result() # Waits for the query to complete\n",
+ "\n",
+ "print(\"External table docai_demo.object_table created or replaced successfully.\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "c17ec8736188"
+ },
+ "source": [
+ "### Show the object table\n",
+ "\n",
+ "Confirm that the results display the PDF document in your Cloud Storage bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "9f471aa348b2"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID\n",
+ "\n",
+ "SELECT * \n",
+ "FROM `docai_demo.object_table`;"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ec9d2c49fd34"
+ },
+ "source": [
+ "## Use BQML and Document AI to parse documents"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "afc0a5902ef3"
+ },
+ "source": [
+ "### Create a Layout Parser Processor in Document AI\n",
+ "\n",
+ "[Create a new processor](https://cloud.google.com/document-ai/docs/create-processor#documentai_fetch_processor_types-python) in Document AI with the type `LAYOUT_PARSER_PROCESSOR`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "519ea8a55496"
+ },
+ "outputs": [],
+ "source": [
+ "from google.api_core.client_options import ClientOptions\n",
+ "from google.cloud import documentai\n",
+ "\n",
+ "location = \"us\"\n",
+ "processor_display_name = \"layout_parser_processor\"\n",
+ "processor_type = \"LAYOUT_PARSER_PROCESSOR\"\n",
+ "\n",
+ "\n",
+ "def create_processor_sample(\n",
+ " PROJECT_ID: str, location: str, processor_display_name: str, processor_type: str\n",
+ ") -> None:\n",
+ " opts = ClientOptions(api_endpoint=f\"{location}-documentai.googleapis.com\")\n",
+ "\n",
+ " client = documentai.DocumentProcessorServiceClient(client_options=opts)\n",
+ "\n",
+ " # The full resource name of the location\n",
+ " parent = client.common_location_path(PROJECT_ID, location)\n",
+ "\n",
+ " # Create a processor\n",
+ " processor = client.create_processor(\n",
+ " parent=parent,\n",
+ " processor=documentai.Processor(\n",
+ " display_name=processor_display_name, type_=processor_type\n",
+ " ),\n",
+ " )\n",
+ "\n",
+ " # Return the processor ID needed for creating a BigQuery connection\n",
+ " return processor.name.split(\"/\")[-1]\n",
+ "\n",
+ "\n",
+ "# Call this function to create the processor and return its ID\n",
+ "processor_id = create_processor_sample(\n",
+ " PROJECT_ID, location, processor_display_name, processor_type\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5cfcaa4c4584"
+ },
+ "source": [
+ "### Create a remote model in BigQuery that connects with your Document AI Layout Parser Processor\n",
+ "\n",
+ "This one-time setup step allows BigQuery to reference the Document AI Processor you just created.\n",
+ "\n",
+ "**Note:** If if you receive an 400 GET error \"permission denied for document processor\", you may need to wait a minute for permissions to propagate from earlier steps."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "57233716c232"
+ },
+ "outputs": [],
+ "source": [
+ "query = f\"\"\"\n",
+ "CREATE OR REPLACE MODEL `docai_demo.layout_parser` \n",
+ "REMOTE WITH CONNECTION `us.demo_conn`\n",
+ "OPTIONS(remote_service_type=\"CLOUD_AI_DOCUMENT_V1\", document_processor=\"{processor_id}\")\n",
+ "\"\"\"\n",
+ "\n",
+ "query_job = client.query(query) # API request\n",
+ "query_job.result() # Waits for the query to complete\n",
+ "\n",
+ "print(\"Remote model docai_demo.layout_parser created or replaced successfully.\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "fc821b6f845d"
+ },
+ "source": [
+ "### Process the document using BigQuery ML\n",
+ "\n",
+ "Use the [`ML.PROCESS_DOCUMENT` function](https://cloud.google.com/bigquery/docs/process-document) from BigQuery to call your Document AI processor and pass through the PDF. This uses the Layout Parser configuration and chunks your document.\n",
+ "\n",
+ "**Note:** this may take a minute or so to complete."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "a489b3cb9e1d"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID --location us\n",
+ "\n",
+ "CREATE or REPLACE TABLE docai_demo.demo_result AS (\n",
+ " SELECT * FROM ML.PROCESS_DOCUMENT(\n",
+ " MODEL docai_demo.layout_parser,\n",
+ " TABLE docai_demo.object_table,\n",
+ " PROCESS_OPTIONS => (JSON '{\"layout_config\": {\"chunking_config\": {\"chunk_size\": 250}}}')\n",
+ " )\n",
+ ");"
+ ]
+ },
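+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Before parsing the output, you can optionally confirm that Document AI processed the file successfully. `ML.PROCESS_DOCUMENT` adds an `ml_process_document_status` column that is empty on success; the quick check below assumes the `docai_demo.demo_result` table created above:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID --location us\n",
+ "\n",
+ "-- An empty status means the document was processed successfully\n",
+ "SELECT uri, ml_process_document_status\n",
+ "FROM docai_demo.demo_result;"
+ ]
+ },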
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "bfb0a1fa3266"
+ },
+ "source": [
+ "### Parse the JSON results returned to BigQuery\n",
+ "\n",
+ "The `ML.PROCESS_DOCUMENT` function parses the PDF from Cloud Storage and returns a JSON blob to BigQuery. In this step, we'll parse the JSON, extract document chunks and metadata, and return it to a new BigQuery table."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "2bc4dad2e399"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID --location us\n",
+ "\n",
+ "CREATE OR REPLACE TABLE docai_demo.demo_result_parsed AS (\n",
+ "SELECT\n",
+ " uri,\n",
+ " JSON_EXTRACT_SCALAR(json , '$.chunkId') AS id,\n",
+ " JSON_EXTRACT_SCALAR(json , '$.content') AS content,\n",
+ " JSON_EXTRACT_SCALAR(json , '$.pageFooters[0].text') AS page_footers_text,\n",
+ " JSON_EXTRACT_SCALAR(json , '$.pageSpan.pageStart') AS page_span_start,\n",
+ " JSON_EXTRACT_SCALAR(json , '$.pageSpan.pageEnd') AS page_span_end\n",
+ "FROM docai_demo.demo_result, UNNEST(JSON_EXTRACT_ARRAY(ml_process_document_result.chunkedDocument.chunks, '$')) json\n",
+ ");"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "adca53cc55d8"
+ },
+ "source": [
+ "### Display the parsed document chunks\n",
+ "\n",
+ "Show a preview of the parsed results and metadata."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "c60bcdc388c4"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID --location us\n",
+ "\n",
+ "SELECT *\n",
+ "FROM docai_demo.demo_result_parsed\n",
+ "ORDER BY id\n",
+ "LIMIT 5;"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a980e66443bc"
+ },
+ "source": [
+ "## Connect to Vertex AI embedding generation and Gemini access"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "eae56fa8c74c"
+ },
+ "source": [
+ "### Connect to a text embedding model\n",
+ "\n",
+ "[Create a remote model](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model) allowing BigQuery access to a text embedding model hosted in Vertex AI."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "3c53a24e59a1"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID\n",
+ "\n",
+ "CREATE OR REPLACE MODEL `docai_demo.embedding_model` \n",
+ "REMOTE WITH CONNECTION `us.demo_conn` OPTIONS(endpoint=\"text-embedding-004\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "45d1ccc016c8"
+ },
+ "source": [
+ "### Generate embeddings\n",
+ "\n",
+ "Use the [`ML.GENERATE_EMBEDDING` function](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-embedding) in BigQuery to generate embeddings for all text chunks in the document."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "63bf77f48b8c"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID\n",
+ "\n",
+ "CREATE OR REPLACE TABLE `docai_demo.embeddings` AS\n",
+ "SELECT * FROM ML.GENERATE_EMBEDDING(\n",
+ " MODEL `docai_demo.embedding_model`,\n",
+ " TABLE `docai_demo.demo_result_parsed`\n",
+ ");"
+ ]
+ },
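+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As an optional sanity check, the output of `ML.GENERATE_EMBEDDING` includes an `ml_generate_embedding_status` column that is empty on success. The following sketch (assuming the `docai_demo.embeddings` table created above) counts any rows that reported an error:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID\n",
+ "\n",
+ "-- Rows where embedding generation reported an error (expected: 0)\n",
+ "SELECT COUNT(*) AS failed_rows\n",
+ "FROM `docai_demo.embeddings`\n",
+ "WHERE ml_generate_embedding_status != '';"
+ ]
+ },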
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "e1ce3b78e01a"
+ },
+ "source": [
+ "### Connect to a Gemini LLM endpoint\n",
+ "\n",
+ "Create a remote model allowing BigQuery access to a Gemini foundation model hosted in Vertex AI."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "7b760c54502e"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID\n",
+ "\n",
+ "CREATE OR REPLACE MODEL `docai_demo.gemini_flash` REMOTE\n",
+ "WITH CONNECTION `us.demo_conn` OPTIONS(endpoint=\"gemini-1.5-flash\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "879593c348c4"
+ },
+ "source": [
+ "## Run vector search, return results, and pass them to Gemini for text generation"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2eb640b369a3"
+ },
+ "source": [
+ "### Sample BigQuery vector search\n",
+ "\n",
+ "Run a sample BigQuery vector search against your chunks. This query takes your text input, creates an embedding using the `ML.GENERATE_EMBEDDING` function, and then passes the embedding through to the [`VECTOR_SEARCH` function](https://cloud.google.com/bigquery/docs/reference/standard-sql/search_functions#vector_search). The results are the top ten chunks that are most semantically related to your input.\n",
+ "\n",
+ "In the search query below, the input text asks \"What was Alphabets revenue in 2023?\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "cf9fa689905d"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID\n",
+ "\n",
+ "SELECT query.query, base.uri, base.id, base.content, distance\n",
+ " FROM\n",
+ " VECTOR_SEARCH( TABLE `docai_demo.embeddings`,\n",
+ " 'ml_generate_embedding_result',\n",
+ " (\n",
+ " SELECT\n",
+ " ml_generate_embedding_result,\n",
+ " content AS query\n",
+ " FROM\n",
+ " ML.GENERATE_EMBEDDING( MODEL `docai_demo.embedding_model`,\n",
+ " ( SELECT 'What was Alphabets revenue in 2023?' AS content)\n",
+ " ) \n",
+ " ),\n",
+ " top_k => 10,\n",
+ " OPTIONS => '{\"fraction_lists_to_search\": 0.01}') \n",
+ "ORDER BY distance DESC;"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "623765bd9154"
+ },
+ "source": [
+ "## Generate text augmented by vector search results\n",
+ "\n",
+ "This step builds upon the prior one - but instead of simply returning the top text chunks, it calls the `ML.GENERATE_TEXT` function to summarize them alongside the question we input.\n",
+ "\n",
+ "In this query you:\n",
+ "* **Retrieve** the closest chunks semantically using the `VECTOR_SEARCH` function (this is what was done in the prior query)\n",
+ "* **Augment** the Gemini LLM with this knowledge\n",
+ "* **Generate** a succinct answer using the `ML.GENERATE_TEXT` function"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "2f6f83f2eca7"
+ },
+ "outputs": [],
+ "source": [
+ "%%bigquery --project $PROJECT_ID\n",
+ "\n",
+ "SELECT\n",
+ " ml_generate_text_llm_result AS generated,\n",
+ " -- prompt -- Commented out, but please feel free to uncomment if you would like to see the full context passed to the Gemini model\n",
+ "FROM\n",
+ " ML.GENERATE_TEXT( MODEL `docai_demo.gemini_flash`,\n",
+ " (\n",
+ " SELECT\n",
+ " CONCAT( 'What is yearly revenue for Alphabet in the last three years? Use the context and mention the reference file used in the answer: ',\n",
+ " STRING_AGG(FORMAT(\"context: %s and reference: %s\", base.content, base.uri), ',\\n')) AS prompt,\n",
+ " FROM\n",
+ " VECTOR_SEARCH( TABLE \n",
+ " `docai_demo.embeddings`,\n",
+ " 'ml_generate_embedding_result',\n",
+ " (\n",
+ " SELECT\n",
+ " ml_generate_embedding_result,\n",
+ " content AS query\n",
+ " FROM\n",
+ " ML.GENERATE_EMBEDDING( MODEL `docai_demo.embedding_model`,\n",
+ " (\n",
+ " SELECT\n",
+ " 'Alphabets revenue' AS content\n",
+ " )\n",
+ " ) \n",
+ " ),\n",
+ " top_k => 10,\n",
+ " OPTIONS => '{\"fraction_lists_to_search\": 0.01}') \n",
+ " ),\n",
+ " STRUCT(512 AS max_output_tokens, TRUE AS flatten_json_output)\n",
+ " );\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2a4e033321ad"
+ },
+ "source": [
+ "# Cleaning up\n",
+ "\n",
+ "To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.\n",
+ "\n",
+ "Otherwise, you can delete the individual resources you created in this tutorial by uncommenting the below:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "1ab59128be6c"
+ },
+ "outputs": [],
+ "source": [
+ "#\n",
+ "# !bq rm -r -f $PROJECT_ID:docai_demo\n",
+ "# !bq rm --connection --project_id=$PROJECT_ID --location=us demo_conn\n",
+ "# !gsutil rm -r gs://{bucket_name}\n",
+ "#"
+ ]
+ },
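+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The commands above do not remove the Document AI processor created earlier. As an optional extra step, you can delete it as well by uncommenting the sketch below, which assumes the `processor_id`, `location`, and Document AI client imports from the earlier cells are still in scope:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# opts = ClientOptions(api_endpoint=f\"{location}-documentai.googleapis.com\")\n",
+ "# docai_client = documentai.DocumentProcessorServiceClient(client_options=opts)\n",
+ "# processor_name = docai_client.processor_path(PROJECT_ID, location, processor_id)\n",
+ "# docai_client.delete_processor(name=processor_name)"
+ ]
+ },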
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "aefa89207b70"
+ },
+ "source": [
+ "# Wrap up\n",
+ "\n",
+ "This notebook demonstrates an example of how to achieve a basic end-to-end retrieval-augmented generation pipeline using BigQuery. It integrates BigQuery ML functions like `ML.PROCESS_DOCUMENT` to call Document AI and parse PDFs, `ML.GENERATE_EMBEDDING` to generate embeddings on text chunks and input queries, and `ML.GENERATE_TEXT` to provide a concise answer. It also uses the `VECTOR_SEARCH` function to identify similar text (using embeddings) in BigQuery using familiar SQL syntax.\n",
+ "\n",
+ "To continue learn more, check out our documentation on [BigQuery ML](https://cloud.google.com/bigquery/docs/bqml-introduction) and [BigQuery Vector Search](https://cloud.google.com/bigquery/docs/vector-search)."
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "rag_with_bigquery.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
From 30f11700cd62fd1f88191a5ff2b6ac9b14a0c36a Mon Sep 17 00:00:00 2001
From: Alicia Williams
Date: Mon, 30 Sep 2024 10:41:08 -0700
Subject: [PATCH 29/76] fix: Update poster_image_analysis.ipynb (#1175)
Update text embedding model to supported
'text-multilingual-embedding-002'
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [X] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [X] You are listed as the author in your notebook or README file.
- [ ] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [X] Make your Pull Request title in the
specification.
- [ ] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [ ] Appropriate docs were updated (if necessary)
Co-authored-by: Eric Dong
---
.../poster_image_analysis.ipynb | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gemini/use-cases/applying-llms-to-data/analyze-poster-images-in-bigquery/poster_image_analysis.ipynb b/gemini/use-cases/applying-llms-to-data/analyze-poster-images-in-bigquery/poster_image_analysis.ipynb
index 473146e31a..248413468f 100644
--- a/gemini/use-cases/applying-llms-to-data/analyze-poster-images-in-bigquery/poster_image_analysis.ipynb
+++ b/gemini/use-cases/applying-llms-to-data/analyze-poster-images-in-bigquery/poster_image_analysis.ipynb
@@ -452,7 +452,7 @@
"%%bigquery\n",
"CREATE OR REPLACE MODEL `gemini_demo.text_embedding`\n",
"REMOTE WITH CONNECTION `us.gemini_conn`\n",
- "OPTIONS (endpoint = 'textembedding-gecko-multilingual')"
+ "OPTIONS (endpoint = 'text-multilingual-embedding-002')"
]
},
{
From 405b9de2988ed41266c49f2c6f9b9e04b298374a Mon Sep 17 00:00:00 2001
From: Jason Dai <44714045+jsondai@users.noreply.github.com>
Date: Mon, 30 Sep 2024 15:35:09 -0700
Subject: [PATCH 30/76] feat: add public notebook for evaluating models in
Vertex AI Studio and model garden (#1192)
# Description
feat: add public notebook for evaluating models in Vertex AI Studio and
model garden
chore: update introduction texts and titles for gen ai evaluation
service tutorials
---------
Co-authored-by: Owl Bot
Co-authored-by: Eric Dong
---
...our_own_autorater_with_custom_metric.ipynb | 2 +-
...ng_your_own_computation_based_metric.ipynb | 2 +-
...pare_and_migrate_from_palm_to_gemini.ipynb | 2 +-
.../compare_generative_ai_models.ipynb | 2 +-
.../customize_model_based_metrics.ipynb | 2 +-
...te_and_compare_gemini_model_settings.ipynb | 10 +-
.../evaluation/evaluate_gemini_tool_use.ipynb | 8 +-
.../evaluate_langchain_chains.ipynb | 12 +-
...in_vertex_ai_studio_and_model_garden.ipynb | 2294 +++++++++++++++++
...te_rag_gen_ai_evaluation_service_sdk.ipynb | 6 +-
...tro_to_gen_ai_evaluation_service_sdk.ipynb | 2 +-
.../migration_guide_preview_to_GA_sdk.ipynb | 5 +-
...eering_gen_ai_evaluation_service_sdk.ipynb | 6 +-
13 files changed, 2330 insertions(+), 23 deletions(-)
create mode 100644 gemini/evaluation/evaluate_models_in_vertex_ai_studio_and_model_garden.ipynb
diff --git a/gemini/evaluation/bring_your_own_autorater_with_custom_metric.ipynb b/gemini/evaluation/bring_your_own_autorater_with_custom_metric.ipynb
index c8c947af54..3c701cd2a2 100644
--- a/gemini/evaluation/bring_your_own_autorater_with_custom_metric.ipynb
+++ b/gemini/evaluation/bring_your_own_autorater_with_custom_metric.ipynb
@@ -29,7 +29,7 @@
"id": "w-Edsbk7JRim"
},
"source": [
- " # Bring-Your-Own-Autorater using `CustomMetric`\n",
+ " # Bring-Your-Own-Autorater using `CustomMetric` | Gen AI Evaluation SDK Tutorial\n",
"\n",
"\n",
"
\n",
diff --git a/gemini/evaluation/bring_your_own_computation_based_metric.ipynb b/gemini/evaluation/bring_your_own_computation_based_metric.ipynb
index f2c00e7080..a68d3ff0a6 100644
--- a/gemini/evaluation/bring_your_own_computation_based_metric.ipynb
+++ b/gemini/evaluation/bring_your_own_computation_based_metric.ipynb
@@ -30,7 +30,7 @@
"id": "kOCmYX5Rc3BZ"
},
"source": [
- "# Bring your own computation-based `CustomMetric`\n",
+ "# Bring your own computation-based `CustomMetric` | Gen AI Evaluation SDK Tutorial\n",
"\n",
"\n",
"
\n",
diff --git a/gemini/evaluation/compare_and_migrate_from_palm_to_gemini.ipynb b/gemini/evaluation/compare_and_migrate_from_palm_to_gemini.ipynb
index fc7a600795..b6febbb9a7 100644
--- a/gemini/evaluation/compare_and_migrate_from_palm_to_gemini.ipynb
+++ b/gemini/evaluation/compare_and_migrate_from_palm_to_gemini.ipynb
@@ -30,7 +30,7 @@
"id": "UzRB5KCb4Z94"
},
"source": [
- "# Migrate from PaLM to Gemini model with Gen AI Evaluation Service SDK\n",
+ "# Migrate from PaLM to Gemini model | Gen AI Evaluation SDK Tutorial\n",
"\n",
"\n",
"
\n",
diff --git a/gemini/evaluation/compare_generative_ai_models.ipynb b/gemini/evaluation/compare_generative_ai_models.ipynb
index b0551b0af7..995b75207a 100644
--- a/gemini/evaluation/compare_generative_ai_models.ipynb
+++ b/gemini/evaluation/compare_generative_ai_models.ipynb
@@ -29,7 +29,7 @@
"id": "QN61Ug4hLby5"
},
"source": [
- " # Compare Generative AI Models\n",
+ " # Compare Generative AI Models | Gen AI Evaluation SDK Tutorial\n",
"\n",
"
\n",
"
\n",
diff --git a/gemini/evaluation/customize_model_based_metrics.ipynb b/gemini/evaluation/customize_model_based_metrics.ipynb
index 4b4e6974b3..16f4364e36 100644
--- a/gemini/evaluation/customize_model_based_metrics.ipynb
+++ b/gemini/evaluation/customize_model_based_metrics.ipynb
@@ -30,7 +30,7 @@
"id": "kOCmYX5Rc3BZ"
},
"source": [
- "# Customize Model-based Metrics to evaluate a Gen AI model\n",
+ "# Customize Model-based Metrics to Evaluate a Gen AI model | Gen AI Evaluation SDK Tutorial\n",
"\n",
"
\n",
"
\n",
diff --git a/gemini/evaluation/evaluate_and_compare_gemini_model_settings.ipynb b/gemini/evaluation/evaluate_and_compare_gemini_model_settings.ipynb
index 2abf74be66..74a645c972 100644
--- a/gemini/evaluation/evaluate_and_compare_gemini_model_settings.ipynb
+++ b/gemini/evaluation/evaluate_and_compare_gemini_model_settings.ipynb
@@ -29,7 +29,7 @@
"id": "JAPoU8Sm5E6e"
},
"source": [
- "# Evaluate and Compare Gen AI Model Settings | Gen AI Evaluation Service SDK Tutorial\n",
+ "# Evaluate and Compare Gen AI Model Settings | Gen AI Evaluation SDK Tutorial\n",
"\n",
"
\n",
"
\n",
@@ -74,9 +74,13 @@
"source": [
"## Overview\n",
"\n",
- "Evaluate and select generative AI model settings:\n",
+ "Evaluate and select generative AI model settings with *Vertex AI Python SDK for Gen AI Evaluation Service*:\n",
"\n",
- "* Adjust temperature, output token limit, safety settings and other model generation configurations of Gemini models on a summarization task and compare the evaluation results from different model settings on several metrics."
+ "* Adjust temperature, output token limit, safety settings and other model generation configurations of Gemini models on a summarization task and compare the evaluation results from different model settings on several metrics.\n",
+ "\n",
+ "See also: \n",
+ "\n",
+ "- Learn more about [Vertex Gen AI Evaluation Service SDK](https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview)."
]
},
{
diff --git a/gemini/evaluation/evaluate_gemini_tool_use.ipynb b/gemini/evaluation/evaluate_gemini_tool_use.ipynb
index ba7ca8f7df..be9070722c 100644
--- a/gemini/evaluation/evaluate_gemini_tool_use.ipynb
+++ b/gemini/evaluation/evaluate_gemini_tool_use.ipynb
@@ -29,7 +29,7 @@
"id": "WKiAD3qCCtOu"
},
"source": [
- " # Evaluate Generative Model Tool Use | Gen AI Evaluation Service SDK Tutorial\n",
+ " # Evaluate Generative Model Tool Use | Gen AI Evaluation SDK Tutorial\n",
"\n",
"
\n",
"
\n",
@@ -74,7 +74,11 @@
"source": [
"## Overview\n",
"\n",
- "* Define an API function and a Tool for Gemini model, and evaluate the Gemini tool use quality."
+ "* Define an API function and a Tool for Gemini model, and evaluate the Gemini model tool use quality with *Vertex AI Python SDK for Gen AI Evaluation Service*.\n",
+ "\n",
+ "See also: \n",
+ "\n",
+ "- Learn more about [Vertex Gen AI Evaluation Service SDK](https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview)."
]
},
{
diff --git a/gemini/evaluation/evaluate_langchain_chains.ipynb b/gemini/evaluation/evaluate_langchain_chains.ipynb
index ca94d90c62..d7f6653505 100644
--- a/gemini/evaluation/evaluate_langchain_chains.ipynb
+++ b/gemini/evaluation/evaluate_langchain_chains.ipynb
@@ -29,7 +29,7 @@
"id": "7ZX50cNFOFBt"
},
"source": [
- " # Evaluate LangChain | Rapid Evaluation SDK Tutorial\n",
+ " # Evaluate LangChain | Gen AI Evaluation SDK Tutorial\n",
"\n",
"
\n",
"
\n",
@@ -74,7 +74,7 @@
"source": [
"## Overview\n",
"\n",
- "With this tutorial, you learn how to evaluate the performance of a conversational LangChain chain using the Vertex AI Rapid Evaluation SDK. The notebook utilizes a dummy chatbot designed to provide recipe suggestions.\n",
+ "With this tutorial, you learn how to evaluate the performance of a conversational LangChain chain using the *Vertex AI Python SDK for Gen AI Evaluation Service*. The notebook utilizes a dummy chatbot designed to provide recipe suggestions.\n",
"\n",
"The tutorial goes trough:\n",
"1. Data preparation\n",
@@ -120,7 +120,7 @@
"source": [
"%pip install --quiet --upgrade nest_asyncio\n",
"%pip install --upgrade --user --quiet langchain-core langchain-google-vertexai langchain\n",
- "%pip install --upgrade --user --quiet \"google-cloud-aiplatform[rapid_evaluation]\""
+ "%pip install --upgrade --user --quiet \"google-cloud-aiplatform[evaluation]\""
]
},
{
@@ -245,11 +245,10 @@
"from google.cloud import aiplatform\n",
"from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
"from langchain_google_vertexai import ChatVertexAI\n",
- "import nest_asyncio\n",
- "import pandas as pd\n",
- "from tqdm import tqdm\n",
"\n",
"# Main\n",
+ "import pandas as pd\n",
+ "from tqdm import tqdm\n",
"import vertexai\n",
"from vertexai.evaluation import CustomMetric, EvalTask\n",
"\n",
@@ -275,7 +274,6 @@
"outputs": [],
"source": [
"logging.getLogger(\"urllib3.connectionpool\").setLevel(logging.ERROR)\n",
- "nest_asyncio.apply()\n",
"warnings.filterwarnings(\"ignore\")"
]
},
diff --git a/gemini/evaluation/evaluate_models_in_vertex_ai_studio_and_model_garden.ipynb b/gemini/evaluation/evaluate_models_in_vertex_ai_studio_and_model_garden.ipynb
new file mode 100644
index 0000000000..03a114cd55
--- /dev/null
+++ b/gemini/evaluation/evaluate_models_in_vertex_ai_studio_and_model_garden.ipynb
@@ -0,0 +1,2294 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "bkIF-qKfOvFl"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KNc5B1-pOvFn"
+ },
+ "source": [
+ "# Use Gen AI Evaluation SDK to Evaluate Models in Vertex AI Studio, Model Garden, and Model Registry\n",
+ "\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "kPgYxQc1OvFn"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "|Author(s) | [Jason Dai](https://github.com/jsondai) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "vQ_7WkHE3gNO"
+ },
+ "source": [
+ "This notebook demonstrates how to get started with using the *Vertex AI Python SDK for Gen AI Evaluation Service* for generative models in Vertex AI Studio, Model Garden, and Model Registry.\n",
+ "\n",
+ "Gen AI Evaluation Service empowers you to comprehensively assess and enhance your generative AI models and applications. Whether you're selecting the ideal model, optimizing prompt templates, or evaluating fine-tuned checkpoints, this service provides the tools and insights you need.\n",
+ "\n",
+ "In this Colab tutorial, we'll explore three major use cases:\n",
+ "\n",
+ "1. Run Evaluation on 1P Models\n",
+ " * Learn how to evaluate `Gemini` models in Vertex AI Studio using the *Gen AI Evaluation Service SDK*.\n",
+ "\n",
+ " * Explore different evaluation metrics and techniques for assessing performance on various tasks.\n",
+ "\n",
+ " * Discover how to leverage the SDK for in-depth analysis and comparison of `Gemini` model variants.\n",
+ "\n",
+ "\n",
+ "2. Run Evaluation on 3P Models\n",
+ " * Learn how to evaluate third-party open models, such as a pretrained `Llama 3.1` model, or a fine-tuned `Llama 3` model deployed in Vertex Model Garden, using the *Gen AI Evaluation Service SDK*.\n",
+ "\n",
+ " * Learn how to evaluate third-party closed model APIs, such as Anthropic's `Claude 3.5 Sonnet` model hosted on Vertex AI, using the *Gen AI Evaluation Service SDK*.\n",
+ "\n",
+ " * Gain insights into conducting controlled experiments by maintaining the same `EvalTask` configuration with fixed dataset and evaluation metrics while evaluating various model architectures and capabilities.\n",
+ "\n",
+ "\n",
+ "3. Prompt Engineering\n",
+ "\n",
+ " * Explore the impact of prompt design on model performance.\n",
+ " * Utilize the SDK to systematically evaluate and refine your prompts.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "fHGZmbkw6GgM"
+ },
+ "source": [
+ "For additional use cases and advanced features, refer to our public documentation and notebook tutorials for evaluation use cases:\n",
+ "\n",
+ "* https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview#notebooks_for_evaluation_use_cases\n",
+ "\n",
+ "* https://cloud.google.com/vertex-ai/generative-ai/docs/models/run-evaluation\n",
+ "\n",
+ "Let's get started!"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "mN5IHo-aOvFo"
+ },
+ "source": [
+ "**_NOTE_**: This notebook has been tested in the following environment:\n",
+ "\n",
+ "* Python version = 3.10"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3XZf_4VEOvFo"
+ },
+ "source": [
+ "## Getting Started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "kE20na1OOvFo"
+ },
+ "source": [
+ "### Install Vertex AI SDK for Gen AI Evaluation Service"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "abLuRgBzOvFp"
+ },
+ "outputs": [],
+ "source": [
+ "! pip install -U -q google-cloud-aiplatform[evaluation]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "kYJVHBVSZgTX"
+ },
+ "source": [
+ "### Install other required packages"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "squkG3h9ZZs8"
+ },
+ "outputs": [],
+ "source": [
+ "! pip install -U -q datasets\n",
+ "! pip install -U -q anthropic[vertex]\n",
+ "! pip install -U -q openai"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Pe2lLnYuOvFp"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "B3FDZs3qOvFp"
+ },
+ "outputs": [],
+ "source": [
+ "# import IPython\n",
+ "\n",
+ "# app = IPython.Application.instance()\n",
+ "# app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "LqDc-oyiOvFp"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
\n"
+ ],
+ "text/plain": [
+ " experiment_name run_name run_type \\\n",
+ "0 gen-ai-eval-3p-experiment eval-claude-3-80kv797z system.ExperimentRun \n",
+ "1 gen-ai-eval-3p-experiment eval-llama-3-1-kq13qu77 system.ExperimentRun \n",
+ "\n",
+ " state param.prompt_template \\\n",
+ "0 COMPLETE # System_prompt\\n{system_prompt} # Question\\n{question} \n",
+ "1 COMPLETE # System_prompt\\n{system_prompt} # Question\\n{question} \n",
+ "\n",
+ " metric.rouge_l_sum/std metric.text_quality/std \\\n",
+ "0 0.156281 0.843274 \n",
+ "1 0.078333 0.632456 \n",
+ "\n",
+ " metric.instruction_following/mean metric.fluency/std \\\n",
+ "0 3.6 0.843274 \n",
+ "1 1.9 0.520855 \n",
+ "\n",
+ " metric.coherence/std metric.linguistic_acceptability/mean \\\n",
+ "0 1.414214 3.0 \n",
+ "1 0.497214 1.7 \n",
+ "\n",
+ " metric.fluency/mean metric.coherence/mean \\\n",
+ "0 4.400 4.00 \n",
+ "1 2.123 1.45 \n",
+ "\n",
+ " metric.instruction_following/std metric.rouge_l_sum/mean \\\n",
+ "0 1.173788 0.379755 \n",
+ "1 0.567646 0.190887 \n",
+ "\n",
+ " metric.linguistic_acceptability/std metric.text_quality/mean \\\n",
+ "0 1.154701 3.4 \n",
+ "1 0.674949 1.8 \n",
+ "\n",
+ " metric.row_count \n",
+ "0 10.0 \n",
+ "1 10.0 "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "pointwise_eval_task.display_runs()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "yLHuqe2D-x81"
+ },
+ "source": [
+ "## Prompt Engineering"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cd393d80e252"
+ },
+ "source": [
+ "The *Vertex AI Gen AI Evaluation Service SDK* simplifies prompt engineering by streamlining the process of creating and evaluating multiple prompt templates. It allows you to efficiently test different prompts against a chosen dataset and compare their performance using comprehensive evaluation metrics. This empowers you to identify the most effective prompts for your specific use case and optimize your generative AI applications."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "601693fa0461"
+ },
+ "source": [
+ "### Design a prompt with Prompt Template"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "a53e4c50779f"
+ },
+ "outputs": [],
+ "source": [
+ "system_instruction = \"You are a poetic assistant, skilled in explaining complex concepts with creative flair.\"\n",
+ "question = \"How does LLM work?\"\n",
+ "requirements = \"Explain concepts in great depth using simple terms, and give examples to help people learn. At the end of each explanation, you ask a question to check for understanding\"\n",
+ "\n",
+ "prompt_template = f\"{system_instruction} Answer this question: {question}, and follow the requirements: {requirements}.\"\n",
+ "\n",
+ "\n",
+ "model_response = (\n",
+ " GenerativeModel(\"gemini-1.5-pro\")\n",
+ " .generate_content(prompt_template)\n",
+ " .candidates[0]\n",
+ " .content.parts[0]\n",
+ " .text\n",
+ ")\n",
+ "\n",
+ "\n",
+ "display(HTML(f\"
Assembled Prompt:
{prompt_template}
\"))\n",
+ "display(HTML(\"
Model Response:
\"))\n",
+ "Markdown(model_response)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5261f69b615a"
+ },
+ "source": [
+ "### Compare and optimize prompt template design"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "7d01f70c3163"
+ },
+ "source": [
+ "#### Define an evaluation dataset\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2157353f3309"
+ },
+ "source": [
+ "To perform pointwise inference, the evaluation dataset is required to contain the following fields:\n",
+ "\n",
+ "* Instruction: Part of the input user prompt. It refers to the inference instruction that is sent to your LLM.\n",
+ "* Context: User input for the Gen AI model or application in the current turn.\n",
+ "* Reference: The ground truth to compare your LLM response to.\n",
+ "\n",
+ "Your dataset must include a minimum of one evaluation example. We recommend around 100 examples to ensure high-quality aggregated metrics and statistically significant results."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "972ebabb2712"
+ },
+ "outputs": [],
+ "source": [
+ "instruction = \"Summarize the following article: \\n\"\n",
+ "\n",
+ "context = [\n",
+ " \"Typhoon Phanfone has killed at least one person, a US airman on Okinawa who was washed away by high waves. Thousands of households have lost power and Japan's two largest airlines have suspended many flights. The storm also forced the suspension of the search for people missing after last week's volcanic eruption. The storm-tracking website Tropical Storm Risk forecasts that Phanfone will rapidly lose power over the next few hours as it goes further into the Pacific Ocean. Typhoon Phanfone was downgraded from an earlier status of a super typhoon, but the Japan Meteorological Agency had warned it was still a dangerous storm. Japan averages 11 typhoons a year, according to its weather agency. The typhoon made landfall on Monday morning near the central city of Hamamatsu, with winds of up to 180 km/h (112 mph). The airman was one of three US military personnel swept away by high waves whipped up by the typhoon off southern Okinawa island, where the US has a large military base. The remaining two are still missing. A police spokesman said they had been taking photographs of the sea. A university student who was surfing off the seas of Kanagawa Prefecture, south of Tokyo, was also missing, national broadcast NHK reports. It said at least 10 people had been injured and 9,500 houses were without power. The storm was expected to deposit about 100mm of rain on Tokyo over 24 hours, according to the Transport Ministry website. Many schools were closed on Monday and two car companies in Japan halted production at some plants ahead of the storm. More than 174 domestic flights were affected nationwide, NHK state broadcaster said on Sunday. On Sunday, heavy rain delayed the Japanese Formula One Grand Prix in Suzaka. French driver Jules Bianchi lost control in the wet conditions and crashed, sustaining a severe head injury.\",\n",
+ " \"The blaze started at the detached building in Drivers End in Codicote, near Welwyn, during the morning. There was another fire at the building 20 years ago, after which fire-proof foil was placed under the thatch, which is protecting the main building. More than 15 fire engines and support vehicles were called to tackle the blaze. Roads in the area were closed and traffic diverted.\",\n",
+ " 'The 18-year-old fell at the New Charter Academy on Broadoak Road in Ashton-under-Lyne at about 09:10 BST, Greater Manchester Police (GMP) said. GMP said he had gone to Manchester Royal Infirmary and his condition was \"serious\". Principal Jenny Langley said the school would remain \"fully open\" while police investigated. \"Our thoughts are with the family and we\\'re doing everything we can to support them along with staff and pupils,\" she said.',\n",
+ " 'But Belgian-born Dutchman Max Verstappen was unable to drive a car legally on his own in either country. That all changed on Wednesday when the youngster turned 18 and passed his driving test at the first attempt. Despite having competed in 14 grands prix since his debut in Australia in March, Verstappen admitted to feeling the pressure during his test. \"It\\'s a relief,\" said the Toro Rosso driver, who finished ninth in Japan on Sunday and had only started driving lessons a week ago. \"I was a bit nervous to make mistakes, but the exam went well.\" A bonus of turning 18 is that Verstappen will now be able to drink the champagne if he ever makes it onto the podium.',\n",
+ "]\n",
+ "\n",
+ "reference = [\n",
+ " \"A powerful typhoon has brought many parts of Japan to a standstill and briefly battered Tokyo before heading out to sea.\",\n",
+ " \"A major fire has been burning in the thatched roof of a large property in Hertfordshire.\",\n",
+ " \"A student has been taken to hospital after falling from a balcony at a Greater Manchester school.\",\n",
+ " \"He is Formula 1's youngest ever driver and in charge of a car that can reach over 200mph.\",\n",
+ "]\n",
+ "\n",
+ "response = [\n",
+ " \"Typhoon Phanfone, while downgraded from super typhoon status, caused significant disruption and tragedy in Japan. One US airman died after being swept away by high waves, with two more missing. The storm caused power outages for thousands, flight cancellations, and the suspension of rescue efforts for missing volcano victims. Heavy rain and strong winds led to school and factory closures, transportation disruptions, and at least 10 injuries. The typhoon is expected to weaken as it moves over the Pacific Ocean.\",\n",
+ " \"A large fire broke out in a detached thatched building in Codicote, near Welwyn. This is the second fire at the building in 20 years. Thankfully, fire-proof foil installed after the previous fire is protecting the main building. Over 15 fire engines and support vehicles responded, closing roads and diverting traffic in the area.\",\n",
+ " \"An 18-year-old student at New Charter Academy in Ashton-under-Lyne suffered a serious fall and was hospitalized. The incident is under investigation by Greater Manchester Police, but the school remains open. The principal expressed support for the student's family and the school community.\",\n",
+ " \"Max Verstappen, a Formula One driver, was finally able to get his driver's license at age 18. Despite already competing in 14 Grand Prix races, he was not of legal driving age in his native countries. He admitted to being nervous but passed the test on his first attempt. As an added bonus of turning 18, Verstappen can now enjoy champagne on the podium if he places.\",\n",
+ "]\n",
+ "\n",
+ "eval_dataset = pd.DataFrame(\n",
+ " {\n",
+ " \"instruction\": instruction,\n",
+ " \"context\": context,\n",
+ " \"reference\": reference,\n",
+ " }\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "baff9d1cca96"
+ },
+ "source": [
+ "#### Define prompt templates to compare\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "f639a79316c1"
+ },
+ "outputs": [],
+ "source": [
+ "prompt_templates = [\n",
+ " \"Instruction: {instruction} such that you'r explaining it to a 5 year old. Article: {context}. Summary:\",\n",
+ " \"Article: {context}. Complete this task: {instruction}. Summary:\",\n",
+ " \"Goal: {instruction} and give me a TLDR in five words. Here's an article: {context}. Summary:\",\n",
+ " \"Article: {context}. Reference Summary: {reference}. {instruction} to be more concise and verbose than the reference.\",\n",
+ "]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ncJ-4uA_nxNB"
+ },
+ "source": [
+ "#### Define a model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "3155f46d31c6"
+ },
+ "outputs": [],
+ "source": [
+ "generation_config = {\"temperature\": 0.3, \"max_output_tokens\": 256, \"top_k\": 1}\n",
+ "\n",
+ "gemini_model = GenerativeModel(\n",
+ " \"gemini-1.5-pro\",\n",
+ " generation_config=generation_config,\n",
+ ")"
+ ]
+ },
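+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As an optional smoke test before launching the full evaluation, the following sketch (assuming the `gemini_model`, `instruction`, and `context` defined above) generates a single summary for the first article:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Generate one summary to confirm the model and generation config work as expected.\n",
+ "smoke_test_response = gemini_model.generate_content(instruction + context[0])\n",
+ "print(smoke_test_response.text)"
+ ]
+ },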
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "c3c1121684f5"
+ },
+ "source": [
+ "#### Define an EvalTask"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "20eb95c14422"
+ },
+ "outputs": [],
+ "source": [
+ "metrics = [\n",
+ " \"rouge_l_sum\",\n",
+ " \"bleu\",\n",
+ " \"fluency\",\n",
+ " \"coherence\",\n",
+ " \"safety\",\n",
+ " \"groundedness\",\n",
+ " \"summarization_quality\",\n",
+ " \"verbosity\",\n",
+ " \"instruction_following\",\n",
+ " \"text_quality\",\n",
+ "]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "052ec86e5777"
+ },
+ "outputs": [],
+ "source": [
+ "experiment_name = \"eval-sdk-prompt-engineering\" # @param {type:\"string\"}\n",
+ "\n",
+ "summarization_eval_task = EvalTask(\n",
+ " dataset=eval_dataset,\n",
+ " metrics=metrics,\n",
+ " experiment=experiment_name,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "515fe8c3652f"
+ },
+ "source": [
+ "#### Run Evaluation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "2b8422213915"
+ },
+ "outputs": [],
+ "source": [
+ "eval_results = []\n",
+ "for i, prompt_template in enumerate(prompt_templates):\n",
+ " eval_result = summarization_eval_task.evaluate(\n",
+ " prompt_template=prompt_template,\n",
+ " model=model,\n",
+ " # Customize eval service rate limit based on your project's Gemini-1.5-pro model quota to improve speed.\n",
+ " # See more details in https://cloud.google.com/vertex-ai/generative-ai/docs/models/run-evaluation#increase-quota\n",
+ " evaluation_service_qps=1,\n",
+ " )\n",
+ "\n",
+ " eval_results.append((f\"Prompt Template #{i+1}\", eval_result))"
+ ]
+ },
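+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Optionally, before diving into the detailed reports below, you can gather the `summary_metrics` of each run into a single DataFrame for a quick side-by-side comparison. This is a sketch that assumes the `eval_results` list built above and pandas imported as `pd`:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Collect the aggregated metrics of each prompt template into one table.\n",
+ "summary_df = pd.DataFrame(\n",
+ " [\n",
+ " {\"prompt_template\": name, **result.summary_metrics}\n",
+ " for name, result in eval_results\n",
+ " ]\n",
+ ")\n",
+ "summary_df"
+ ]
+ },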
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ec6e97cf9f27"
+ },
+ "source": [
+ "#### Display Evaluation report and explanations"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "d3724b416b39"
+ },
+ "outputs": [],
+ "source": [
+ "for result in eval_results:\n",
+ " display_eval_result(title=result[0], eval_result=result[1])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "0615d0925469"
+ },
+ "outputs": [],
+ "source": [
+ "for eval_result in eval_results:\n",
+ " display_explanations(eval_result[1], metrics=[\"summarization_quality\"], n=2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "GzWSUPj2oV-_"
+ },
+ "source": [
+ "#### Visualize Results"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "SSDyxE2aoYMH"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ "
\n",
+ "\n",
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "display_bar_plot(\n",
+ " eval_results,\n",
+ " metrics=[\"instruction_following\", \"fluency\", \"coherence\", \"text_quality\"],\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ed966819648e"
+ },
+ "source": [
+ "#### View Experiment log for evaluation runs"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "363c1b2553b9"
+ },
+ "outputs": [],
+ "source": [
+ "summarization_eval_task.display_runs()"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "collapsed_sections": [
+ "tfQ7sPtOjZOw",
+ "F_Gw6YLeOvFq"
+ ],
+ "name": "evaluate_models_in_vertex_ai_studio_and_model_garden.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
diff --git a/gemini/evaluation/evaluate_rag_gen_ai_evaluation_service_sdk.ipynb b/gemini/evaluation/evaluate_rag_gen_ai_evaluation_service_sdk.ipynb
index ff16f58121..4e9237b399 100644
--- a/gemini/evaluation/evaluate_rag_gen_ai_evaluation_service_sdk.ipynb
+++ b/gemini/evaluation/evaluate_rag_gen_ai_evaluation_service_sdk.ipynb
@@ -86,7 +86,7 @@
"## Overview\n",
"\n",
"\n",
- "In this tutorial, you will learn how to use the use the *Vertex AI Python SDK for Gen AI Evaluation* to evaluate **Retrieval-Augmented Generation** (RAG) generated answers for **Question Answering** (QA) task.\n",
+ "In this tutorial, you will learn how to use the use the *Vertex AI Python SDK for Gen AI Evaluation Service* to evaluate **Retrieval-Augmented Generation** (RAG) generated answers for **Question Answering** (QA) task.\n",
"\n",
"RAG is a technique to improve groundness, relevancy and factuality of large language models (LLMs) by finding relevant information from the model's knowledge base. RAG is done by converting a query into a vector representation (embeddings), and then finding the most similar vectors in the knowledge base. The most similar vectors are then used to help generate the response.\n",
"\n",
@@ -98,7 +98,9 @@
"\n",
"The examples used in this notebook is from Stanford Question Answering Dataset [SQuAD 2.0](https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1194/reports/default/15785042.pdf).\n",
"\n",
- "Learn more about [Vertex AI Rapid Evaluation SDK](https://cloud.google.com/vertex-ai/generative-ai/docs/models/online-pipeline-services).\n"
+ "See also: \n",
+ "\n",
+ "- Learn more about [Vertex Gen AI Evaluation Service SDK](https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview)."
]
},
{
diff --git a/gemini/evaluation/intro_to_gen_ai_evaluation_service_sdk.ipynb b/gemini/evaluation/intro_to_gen_ai_evaluation_service_sdk.ipynb
index 50e5159798..9dea5198c5 100644
--- a/gemini/evaluation/intro_to_gen_ai_evaluation_service_sdk.ipynb
+++ b/gemini/evaluation/intro_to_gen_ai_evaluation_service_sdk.ipynb
@@ -29,7 +29,7 @@
"id": "5e_7VOHBer8D"
},
"source": [
- " # Getting Started with Vertex AI Python SDK for Gen AI Evaluation Service\n",
+ " # Getting Started with Vertex AI Python SDK for Gen AI Evaluation Service \n",
"\n",
"
\n",
"
\n",
diff --git a/gemini/evaluation/migration_guide_preview_to_GA_sdk.ipynb b/gemini/evaluation/migration_guide_preview_to_GA_sdk.ipynb
index 231164f7a0..641c4023e1 100644
--- a/gemini/evaluation/migration_guide_preview_to_GA_sdk.ipynb
+++ b/gemini/evaluation/migration_guide_preview_to_GA_sdk.ipynb
@@ -29,7 +29,7 @@
"id": "5hCmIhlJjZOt"
},
"source": [
- "# Gen AI Evaluation Service SDK Preview-to-GA Migration Guide\n",
+ "# Gen AI Evaluation Service SDK Preview-to-GA Migration Guide | Gen AI Evaluation SDK Tutorial\n",
"\n",
"\n",
"
\n",
@@ -106,7 +106,8 @@
"* How to handle discontinued metrics\n",
"* How to handle the new input schema\n",
"\n",
- "To learn more about the GA release details, please refer to the latest documentation and notebook tutorials in [Generative AI on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview).\n",
+ "\n",
+ "To learn more about the GA release details, please refer to the latest documentation and notebook tutorials in [Vertex Gen AI Evaluation Service](https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview).\n",
"\n",
"The examples used in this notebook is from Stanford Question Answering Dataset [SQuAD 2.0](https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1194/reports/default/15785042.pdf).\n"
]
diff --git a/gemini/evaluation/prompt_engineering_gen_ai_evaluation_service_sdk.ipynb b/gemini/evaluation/prompt_engineering_gen_ai_evaluation_service_sdk.ipynb
index 51975dd22f..0d8a5eb969 100644
--- a/gemini/evaluation/prompt_engineering_gen_ai_evaluation_service_sdk.ipynb
+++ b/gemini/evaluation/prompt_engineering_gen_ai_evaluation_service_sdk.ipynb
@@ -83,7 +83,11 @@
"\n",
"* Evaluate and refine the prompt templates systematically for the `EvalTask`.\n",
"\n",
- "* Compare and choose the prompt template with the best evaluation performance."
+ "* Compare and choose the prompt template with the best evaluation performance.\n",
+ "\n",
+ "See also: \n",
+ "\n",
+ "- Learn more about [Vertex Gen AI Evaluation Service SDK](https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview)."
]
},
{
From f5b615c12f8d01b32c605b8280ff5a0469691e60 Mon Sep 17 00:00:00 2001
From: Erwin Huizenga
Date: Tue, 1 Oct 2024 16:38:20 +0800
Subject: [PATCH 31/76] updating token count notebook (#1200)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
Fixes # 🦕
---
...ning_token_count_and_cost_estimation.ipynb | 1076 ++++++++++++-----
1 file changed, 801 insertions(+), 275 deletions(-)
diff --git a/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb b/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
index 1af8cb4092..79cd82d819 100644
--- a/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
+++ b/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
@@ -29,7 +29,7 @@
"id": "JAPoU8Sm5E6e"
},
"source": [
- "# Vertex AI Supervised Tuning token count and cost estimation.\n",
+ "# Vertex AI Supervised tuning token count and cost estimation.\n",
"\n",
"
\n",
"
\n",
@@ -74,10 +74,9 @@
"source": [
"## Overview\n",
"\n",
- "This notebook serves as a tool to preprocess and estimate token counts for tuning costs for tuning [`gemini-1.0-pro-002`](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning).\n",
- "\n",
- "You can also find the code to check your dataset for Vertex AI Gemini `gemini-1.5-pro-001`.\n",
+ "This notebook serves as a tool to preprocess and estimate token counts for tuning costs for tuning [`gemini-1.5-pro-002`](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning).\n",
"\n",
+ "At the end you will also find the code to preprocess and estimate token counts for tuning costs for tuning `gemini-1.0-pro-002`. If you get started please start with `gemini-1.5-pro-002`.\n",
"\n",
"For how to prepare dataset for tuning gemini, please refer to this [tutorial](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning-about)."
]
@@ -102,22 +101,11 @@
},
{
"cell_type": "code",
- "execution_count": 3,
+ "execution_count": 24,
"metadata": {
"id": "tFy3H3aPgx12"
},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/5.3 MB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m\u001b[91m╸\u001b[0m \u001b[32m5.2/5.3 MB\u001b[0m \u001b[31m161.8 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.3/5.3 MB\u001b[0m \u001b[31m80.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
- "\u001b[?25h\u001b[33m WARNING: The script tb-gcp-uploader is installed in '/root/.local/bin' which is not on PATH.\n",
- " Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.\u001b[0m\u001b[33m\n",
- "\u001b[0m"
- ]
- }
- ],
+ "outputs": [],
"source": [
"%pip install --upgrade --user --quiet google-cloud-aiplatform[tokenization] numpy==1.26.4 tensorflow"
]
@@ -137,20 +125,24 @@
},
{
"cell_type": "code",
- "execution_count": 1,
+ "execution_count": 25,
"metadata": {
- "id": "XRvKdaPDTznN"
+ "id": "XRvKdaPDTznN",
+ "outputId": "30089739-607e-433e-8592-6e5b44e914c4",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ }
},
"outputs": [
{
+ "output_type": "execute_result",
"data": {
"text/plain": [
"{'status': 'ok', 'restart': True}"
]
},
- "execution_count": 1,
"metadata": {},
- "output_type": "execute_result"
+ "execution_count": 25
}
],
"source": [
@@ -162,107 +154,810 @@
},
{
"cell_type": "markdown",
- "metadata": {
- "id": "SbmM4z7FOBpM"
- },
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
+ "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
+ "\n",
+ "\n",
+ "import vertexai\n",
+ "\n",
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "EdvJRUWRNGHE"
+ },
+ "source": [
+ "## Tuning token count and cost estimation: `Gemini 1.5 pro` and `Gemini 1.5 Flash`"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cPhY560YQijW"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 115,
+ "metadata": {
+ "id": "4498u5KpQijW"
+ },
+ "outputs": [],
+ "source": [
+ "from collections import defaultdict\n",
+ "import dataclasses\n",
+ "import json\n",
+ "\n",
+ "import numpy as np\n",
+ "import tensorflow as tf\n",
+ "from vertexai.generative_models import Content, Part\n",
+ "from vertexai.preview.tokenization import get_tokenizer_for_model\n",
+ "from google.cloud import storage"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Load the dataset\n",
+ "\n",
+ "This example is for text only. Define the Google Cloud Storage URIs pointing to your training and validation datasets or continue using the URIs provided."
+ ],
+ "metadata": {
+ "id": "wvqIIG1M0YCy"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "BASE_MODEL = \"gemini-1.5-pro-002\" # @param ['gemini-1.5-pro-002']{type:\"string\"}\n",
+ "training_dataset_uri = \"gs://github-repo/generative-ai/gemini/tuning/train_sft_train_samples.jsonl\" # @param {type:\"string\"}\n",
+ "validation_dataset_uri = \"gs://github-repo/generative-ai/gemini/tuning/val_sft_val_samples.jsonl\" # @param {type:\"string\"}\n",
+ "\n",
+ "tokenizer = get_tokenizer_for_model(\"gemini-1.5-pro-001\")"
+ ],
+ "metadata": {
+ "id": "oue9Q0GG0Rvk"
+ },
+ "execution_count": 135,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "We'll now load the dataset and conduct some basic statistical analysis to understand its structure and content.\n"
+ ],
+ "metadata": {
+ "id": "dbl6UD5P3LIH"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "example_training_dataset = []\n",
+ "example_validation_dataset = []\n",
+ "\n",
+ "try:\n",
+ " with tf.io.gfile.GFile(training_dataset_uri) as dataset_jsonl_file:\n",
+ " example_training_dataset = [\n",
+ " json.loads(dataset_line) for dataset_line in dataset_jsonl_file\n",
+ " ]\n",
+ "except KeyError as e:\n",
+ " print(f\"KeyError: Please check if your file '{training_dataset_uri}' is a JSONL file with correct JSON format. Error: {e}\")\n",
+ " # Exit the script if there's an error in the training data\n",
+ " import sys\n",
+ " sys.exit(1)\n",
+ "\n",
+ "print()\n",
+ "\n",
+ "if validation_dataset_uri:\n",
+ " try:\n",
+ " with tf.io.gfile.GFile(validation_dataset_uri) as dataset_jsonl_file:\n",
+ " example_validation_dataset = [\n",
+ " json.loads(dataset_line) for dataset_line in dataset_jsonl_file\n",
+ " ]\n",
+ " except KeyError as e:\n",
+ " print(f\"KeyError: Please check if your file '{validation_dataset_uri}' is a JSONL file with correct JSON format. Error: {e}\")\n",
+ " # Exit the script if there's an error in the validation data\n",
+ " import sys\n",
+ " sys.exit(1)\n",
+ "\n",
+ "# Initial dataset stats\n",
+ "print(\"Num training examples:\", len(example_training_dataset))\n",
+ "if example_training_dataset: # Check if the list is not empty\n",
+ " print(\"First example:\")\n",
+ " for item in example_training_dataset[0][\"contents\"]:\n",
+ " print(item)\n",
+ " text_content = item.get(\"parts\", [{}])[0].get(\"text\", \"\")\n",
+ " print(tokenizer.count_tokens(text_content)) # Make sure 'tokenizer' is defined\n",
+ "\n",
+ "if example_validation_dataset:\n",
+ " print(\"Num validation examples:\", len(example_validation_dataset))"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "M9TrzApr1tYQ",
+ "outputId": "2e19860a-0bf6-446c-8bcc-e262c3c3833c"
+ },
+ "execution_count": 136,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "\n",
+ "Num training examples: 500\n",
+ "First example:\n",
+ "{'role': 'user', 'parts': [{'text': 'Honesty is usually the best policy. It is disrespectful to lie to someone. If you don\\'t want to date someone, you should say so. Sometimes it is easy to be honest. For example, you might be able to truthfully say, \"No, thank you, I already have a date for that party.\" Other times, you might need to find a kinder way to be nice. Maybe you are not attracted to the person. Instead of bluntly saying that, try saying, \"No, thank you, I just don\\'t think we would be a good fit.\" Avoid making up a phony excuse. For instance, don\\'t tell someone you will be out of town this weekend if you won\\'t be. There\\'s a chance that you might then run into them at the movies, which would definitely cause hurt feelings. A compliment sandwich is a really effective way to provide feedback. Essentially, you \"sandwich\" your negative comment between two positive things. Try using this method when you need to reject someone. An example of a compliment sandwich is to say something such as, \"You\\'re an awesome person. Unfortunately, I\\'m not interested in dating you. Someone else is going to be really lucky to date someone with such a great personality!\" You could also try, \"You are a really nice person. I\\'m only interested you as a friend. I like when we hang out in big groups together!\" Be sincere. If you offer false compliments, the other person will likely be able to tell and feel hurt. If you do not want to date someone, it is best to be upfront about your feelings. Do not beat around the bush. If your mind is made up, it is best to clearly state your response. If someone asks you to date them and you don\\'t want to, you can be direct and kind at the same time. State your answer clearly. You can make your feelings clear without purposefully hurting someone else\\'s feelings. Try smiling and saying, \"That sounds fun, but no thank you. I\\'m not interested in dating you.\" Don\\'t beat around the bush. If you do not want to accept the date, there is no need to say, \"Let me think about it.\" It is best to get the rejection over with. You don\\'t want to give someone false hope. Avoid saying something like, \"Let me check my schedule and get back to you.\" Try to treat the person the way you would want to be treated. This means that you should choose your words carefully. Be thoughtful in your response. It\\'s okay to pause before responding. You might be taken by surprise and need a moment to collect your thoughts. Say thank you. It is a compliment to be asked out. You can say, \"I\\'m flattered. Unfortunately, I can\\'t accept.\" Don\\'t laugh. Many people laugh nervously in awkward situations. Try to avoid giggling, as that is likely to result in hurt feelings. Sometimes it is not what you say, but how you say it. If you need to reject someone, think about factors other than your words. Non-verbal communication matters, too. Use the right tone of voice. Try to sound gentle but firm. Make eye contact. This helps convey that you are being serious, and also shows respect for the other person. If you are in public, try not to speak too loudly. It is not necessary for everyone around you to know that you are turning down a date.\\n\\nProvide a summary of the article in two or three sentences:\\n\\n'}]}\n",
+ "CountTokensResult(total_tokens=730)\n",
+ "{'role': 'model', 'parts': [{'text': 'Tell the truth. Use a \"compliment sandwich\". Be direct. Treat the person with respect. Communicate effectively.'}]}\n",
+ "CountTokensResult(total_tokens=23)\n",
+ "Num validation examples: 100\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "You can perform various error checks to validate that each tuning example in the dataset adheres to the format expected by the tuning API. Errors are categorized based on their nature for easier debugging. \n",
+ " \n",
+ "For how to prepare dataset for tuning gemini, please refer to this [tutorial](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning-about).\n",
+ "\n",
+ "1. **Presence of System Instruction**: Checks if there is a system instruction and if its there for all rows. System instruction is optional. Warning type: `systemInstruction is missing in some rows`.\n",
+ "2. **Presence of Contents List:** Checks if a `contents` list is present in each entry. Error type: `missing_contents_list`.\n",
+ "3. **Content Item Format:** Validates that each item in the `contents` list is a dictionary. Error type: `invalid_content_item`.\n",
+ "4. **Content Item Format:** Validates that each item in the `contents` list is a dictionary. Error type: `invalid_content_item`.\n",
+ "5. **Role Validation:** Checks if the role is one of `user`, or `model` for `contents` list and system for `systemInstruction` list. Error type: `unrecognized_role`.\n",
+ "6. **Parts List Validation:** Verifies that the `parts` key contains a list. Error type: `missing_or_invalid_parts`.\n",
+ "7. **Part Format:** Checks if each part in the `parts` list is a dictionary and contains the key `text`. Error type: `invalid_part`.\n",
+ "8. **Text Validation:** Ensures that the `text` key has textual data and is a string. Error type: `missing_text`.\n",
+ "9. **Consecutive Turns:** For the chat history, it is enforced that the message roles alternate (user, then model, then user, etc.). Error type: `consecutive_turns`. This check is not applicable for systemInstruction.\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
+ "metadata": {
+ "id": "L5RhrH6r4NrC"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from collections import defaultdict\n",
+ "\n",
+ "def validate_dataset_format(dataset):\n",
+ " \"\"\"Validates the dataset.\n",
+ "\n",
+ " Args:\n",
+ " dataset_uri: The dataset uri to be validated.\n",
+ " \"\"\"\n",
+ " format_errors = defaultdict(list)\n",
+ " system_instruction_missing = False # Flag to track missing systemInstruction\n",
+ "\n",
+ " if not dataset or len(dataset) == 0:\n",
+ " print(\"Input dataset file is empty or inaccessible.\")\n",
+ " return\n",
+ "\n",
+ " for row_idx, example in enumerate(dataset):\n",
+ " # Verify presence of contents list\n",
+ " if not isinstance(example, dict):\n",
+ " format_errors[\"invalid_input\"].append(row_idx)\n",
+ " continue\n",
+ "\n",
+ " # Check for systemInstruction and validate if present\n",
+ " system_instruction = example.get(\"systemInstruction\", None)\n",
+ " if system_instruction:\n",
+ " try:\n",
+ " # Validate the list within \"parts\"\n",
+ " validate_contents(system_instruction.get(\"parts\", []), format_errors, row_idx, is_system_instruction=True)\n",
+ " except (TypeError, AttributeError, KeyError) as e:\n",
+ " print(\"Invalid input during system instruction validation: %s\", e)\n",
+ " format_errors[\"invalid_system_instruction\"].append(row_idx)\n",
+ " else:\n",
+ " system_instruction_missing = True # Set the flag if missing\n",
+ "\n",
+ " contents = example.get(\"contents\", None)\n",
+ " if not contents:\n",
+ " format_errors[\"missing_contents_list\"].append(row_idx)\n",
+ " continue\n",
+ " try:\n",
+ " validate_contents(contents, format_errors, row_idx)\n",
+ " except (TypeError, AttributeError, KeyError) as e:\n",
+ " print(\"Invalid input during contents validation: %s\", e)\n",
+ " format_errors[\"invalid_input\"].append(row_idx)\n",
+ "\n",
+ " if format_errors:\n",
+ " print(\"Found errors for this dataset:\")\n",
+ " for k, v in format_errors.items():\n",
+ " print(f\"{k}: {v}\")\n",
+ " else:\n",
+ " print(\"No errors found for this dataset.\")\n",
+ "\n",
+ " # Print warning only once after processing all rows\n",
+ " if system_instruction_missing:\n",
+ " print(\"Warning: systemInstruction is missing in some rows.\")\n",
+ "\n",
+ "\n",
+ "def validate_contents(contents, format_errors, row_index, is_system_instruction=False):\n",
+ " \"\"\"Validates contents list format.\"\"\"\n",
+ "\n",
+ " if not isinstance(contents, list):\n",
+ " format_errors[\"invalid_contents_list\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " prev_role = None\n",
+ " for content_item in contents: # Iterate over content items in the \"contents\" list\n",
+ " if not isinstance(content_item, dict):\n",
+ " format_errors[\"invalid_content_item\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " # Skip key checks for system instructions\n",
+ " if not is_system_instruction and (\"role\" not in content_item or \"parts\" not in content_item):\n",
+ " format_errors[\"content_item_missing_key\"].append(row_index)\n",
+ " return\n",
+ "\n",
+ " # ... (rest of the validation logic remains the same)"
+ ],
+ "metadata": {
+ "id": "S2FVNbIX0R0n"
+ },
+ "execution_count": 137,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "validate_dataset_format(example_training_dataset)\n",
+ "if example_validation_dataset:\n",
+ " validate_dataset_format(example_validation_dataset)"
+ ],
+ "metadata": {
+ "id": "nSLnbeJ00R2v",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "outputId": "8f680322-45cc-4047-e76d-f7c3932d0439"
+ },
+ "execution_count": 138,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "No errors found for this dataset.\n",
+ "Warning: systemInstruction is missing in some rows.\n",
+ "No errors found for this dataset.\n",
+ "Warning: systemInstruction is missing in some rows.\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Utils for dataset analysis and token counting\n",
+ "\n",
+ "This section focuses on analyzing the structure and token counts of your datasets. You will also define some utility functions to streamline subsequent steps in the notebook.\n",
+ "\n",
+ "* Load and inspect sample data from the training and validation datasets.\n",
+ "* Calculate token counts for messages to understand the dataset's characteristics.\n",
+ "* Define utility functions for calculating token distributions and dataset statistics. These will help assess the suitability of your data for supervised tuning and estimate potential costs."
+ ],
+ "metadata": {
+ "id": "SJFwhPid_7oG"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "@dataclasses.dataclass\n",
+ "class DatasetDistribution:\n",
+ " \"\"\"Dataset disbribution for given a population of values.\n",
+ "\n",
+ " It optionally contains a histogram consists of bucketized data representing\n",
+ " the distribution of those values. The summary statistics are the sum, min,\n",
+ " max, mean, median, p5, p95.\n",
+ "\n",
+ " Attributes:\n",
+ " sum: Sum of the values in the population.\n",
+ " max: Max of the values in the population.\n",
+ " min: Min of the values in the population.\n",
+ " mean: The arithmetic mean of the values in the population.\n",
+ " median: The median of the values in the population.\n",
+ " p5: P5 quantile of the values in the population.\n",
+ " p95: P95 quantile of the values in the population.\n",
+ " \"\"\"\n",
+ "\n",
+ " sum: int | None = None\n",
+ " max: float | None = None\n",
+ " min: float | None = None\n",
+ " mean: float | None = None\n",
+ " median: float | None = None\n",
+ " p5: float | None = None\n",
+ " p95: float | None = None\n",
+ "\n",
+ "\n",
+ "@dataclasses.dataclass\n",
+ "class DatasetStatistics:\n",
+ " \"\"\"Dataset statistics used for dataset profiling.\n",
+ "\n",
+ " Attributes:\n",
+ " total_number_of_dataset_examples: Number of tuning examples in the dataset.\n",
+ " total_number_of_records_for_training: Number of tuning records after\n",
+ " formatting. Each model turn in the chat message will be considered as a record for tuning.\n",
+ " total_number_of_billable_tokens: Number of total billable tokens in the\n",
+ " dataset.\n",
+ " user_input_token_length_stats: Stats for input token length.\n",
+ " user_output_token_length_stats: Stats for output token length.\n",
+ " \"\"\"\n",
+ "\n",
+ " total_number_of_dataset_examples: int | None = None\n",
+ " total_number_of_records_for_training: int | None = None\n",
+ " total_number_of_billable_tokens: int | None = None\n",
+ " user_input_token_length_stats: DatasetDistribution | None = None\n",
+ " user_output_token_length_stats: DatasetDistribution | None = None\n",
+ "\n",
+ "\n",
+ "MAX_TOKENS_PER_EXAMPLE = 32 * 1024\n",
+ "ESTIMATE_PADDING_TOKEN_PER_EXAMPLE = 8"
+ ],
+ "metadata": {
+ "id": "al_uUWOP4Ss2"
+ },
+ "execution_count": 140,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "def calculate_distribution_for_population(population) -> DatasetDistribution:\n",
+ " \"\"\"Calculates the distribution from the population of values.\n",
+ "\n",
+ " Args:\n",
+ " population: The population of values to calculate distribution for.\n",
+ "\n",
+ " Returns:\n",
+ " DatasetDistribution of the given population of values.\n",
+ " \"\"\"\n",
+ " if not population:\n",
+ " raise ValueError(\"population is empty\")\n",
+ "\n",
+ " return DatasetDistribution(\n",
+ " sum=np.sum(population),\n",
+ " max=np.max(population),\n",
+ " min=np.min(population),\n",
+ " mean=np.mean(population),\n",
+ " median=np.median(population),\n",
+ " p5=np.percentile(population, 5, method=\"nearest\"),\n",
+ " p95=np.percentile(population, 95, method=\"nearest\"),\n",
+ " )\n",
+ "\n",
+ "\n",
+ "def get_token_distribution_for_one_tuning_dataset_example(example):\n",
+ " model_turn_token_list = []\n",
+ " input_token_list = []\n",
+ " input = []\n",
+ " n_too_long = 0\n",
+ " number_of_records_for_training = 0 # each model turn in the chat message will be considered as a record for tuning\n",
+ "\n",
+ " # Handle optional systemInstruction\n",
+ " system_instruction = example.get(\"systemInstruction\")\n",
+ " if system_instruction:\n",
+ " text = system_instruction.get(\"parts\")[0].get(\"text\") # Assuming single part in system instruction\n",
+ " input.append(Content(role=\"system\", parts=[Part.from_text(text)]))\n",
+ "\n",
+ " for content_item in example[\"contents\"]:\n",
+ " role = content_item.get(\"role\").lower()\n",
+ " text = content_item.get(\"parts\")[0].get(\"text\") # Assuming single part in content item\n",
+ "\n",
+ " if role.lower() == \"model\":\n",
+ " result = tokenizer.count_tokens(input)\n",
+ " input_token_list.append(result.total_tokens)\n",
+ " model_turn_token_list.append(tokenizer.count_tokens(text).total_tokens)\n",
+ " number_of_records_for_training += 1\n",
+ " if (\n",
+ " result.total_tokens + tokenizer.count_tokens(text).total_tokens\n",
+ " > MAX_TOKENS_PER_EXAMPLE\n",
+ " ):\n",
+ " n_too_long += 1\n",
+ " break\n",
+ "\n",
+ " input.append(Content(role=role, parts=[Part.from_text(text)]))\n",
+ "\n",
+ " return (\n",
+ " input_token_list,\n",
+ " model_turn_token_list,\n",
+ " number_of_records_for_training,\n",
+ " np.sum(model_turn_token_list) + np.sum(input_token_list),\n",
+ " n_too_long,\n",
+ " )\n",
+ "\n",
+ "\n",
+ "def get_dataset_stats_for_dataset(dataset):\n",
+ " results = map(get_token_distribution_for_one_tuning_dataset_example, dataset)\n",
+ " user_input_token_list = []\n",
+ " model_turn_token_list = []\n",
+ " number_of_records_for_training = 0\n",
+ " total_number_of_billable_tokens = 0\n",
+ " n_too_long_for_dataset = 0\n",
+ " for (\n",
+ " input_token_list_per_example,\n",
+ " model_turn_token_list_per_example,\n",
+ " number_of_records_for_training_per_example,\n",
+ " number_of_billable_token_per_example,\n",
+ " n_too_long,\n",
+ " ) in results:\n",
+ " user_input_token_list.extend(input_token_list_per_example)\n",
+ " model_turn_token_list.extend(model_turn_token_list_per_example)\n",
+ " number_of_records_for_training += number_of_records_for_training_per_example\n",
+ " total_number_of_billable_tokens += number_of_billable_token_per_example\n",
+ " n_too_long_for_dataset += n_too_long\n",
+ "\n",
+ " print(\n",
+ " f\"\\n{n_too_long_for_dataset} examples may be over the {MAX_TOKENS_PER_EXAMPLE} token limit, they will be truncated during tuning.\"\n",
+ " )\n",
+ "\n",
+ " return DatasetStatistics(\n",
+ " total_number_of_dataset_examples=len(dataset),\n",
+ " total_number_of_records_for_training=number_of_records_for_training,\n",
+ " total_number_of_billable_tokens=total_number_of_billable_tokens\n",
+ " + number_of_records_for_training * ESTIMATE_PADDING_TOKEN_PER_EXAMPLE,\n",
+ " user_input_token_length_stats=calculate_distribution_for_population(\n",
+ " user_input_token_list\n",
+ " ),\n",
+ " user_output_token_length_stats=calculate_distribution_for_population(\n",
+ " model_turn_token_list\n",
+ " ),\n",
+ " )\n",
+ "\n",
+ "def print_dataset_stats(dataset):\n",
+ " dataset_stats = get_dataset_stats_for_dataset(dataset)\n",
+ " print(\"Below you can find the dataset statistics:\")\n",
+ " print(\n",
+ " f\"Total number of examples in the dataset: {dataset_stats.total_number_of_dataset_examples}\"\n",
+ " )\n",
+ " print(\n",
+ " f\"Total number of records for training: {dataset_stats.total_number_of_records_for_training}\"\n",
+ " )\n",
+ " print(\n",
+ " f\"Total number of billable tokens in the dataset: {dataset_stats.total_number_of_billable_tokens}\"\n",
+ " )\n",
+ " print(\n",
+ " f\"User input token length distribution: {dataset_stats.user_input_token_length_stats}\"\n",
+ " )\n",
+ " print(\n",
+ " f\"User output token length distribution: {dataset_stats.user_output_token_length_stats}\"\n",
+ " )\n",
+ " return dataset_stats"
+ ],
+ "metadata": {
+ "id": "21xgvR3g0R5P"
+ },
+ "execution_count": 141,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Next you can analyze the structure and token counts of your datasets."
+ ],
+ "metadata": {
+ "id": "-FF4ReY6Atw3"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "training_dataset_stats = print_dataset_stats(example_training_dataset)\n",
+ "\n",
+ "if example_validation_dataset:\n",
+ " validation_dataset_stats = print_dataset_stats(example_validation_dataset)"
+ ],
+ "metadata": {
+ "id": "sZqsWno60R7O",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "outputId": "8b78aea8-fbfa-4f48-9ab0-03bdca05f9f9"
+ },
+ "execution_count": 142,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "\n",
+ "0 examples may be over the 32768 token limit, they will be truncated during tuning.\n",
+ "Below you can find the dataset statistics:\n",
+ "Total number of examples in the dataset: 500\n",
+ "Total number of records for training: 500\n",
+ "Total number of billable tokens in the dataset: 259243\n",
+ "User input token length distribution: DatasetDistribution(sum=233592, max=2932, min=25, mean=467.184, median=414.5, p5=101, p95=1002)\n",
+ "User output token length distribution: DatasetDistribution(sum=21651, max=237, min=3, mean=43.302, median=37.0, p5=15, p95=89)\n",
+ "\n",
+ "0 examples may be over the 32768 token limit, they will be truncated during tuning.\n",
+ "Below you can find the dataset statistics:\n",
+ "Total number of examples in the dataset: 100\n",
+ "Total number of records for training: 100\n",
+ "Total number of billable tokens in the dataset: 50154\n",
+ "User input token length distribution: DatasetDistribution(sum=45535, max=1418, min=29, mean=455.35, median=413.5, p5=145, p95=846)\n",
+ "User output token length distribution: DatasetDistribution(sum=3819, max=165, min=8, mean=38.19, median=32.0, p5=17, p95=76)\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Cost Estimation for Supervised Fine-tuning\n",
+ "In this final section, you will estimate the total cost for supervised fine-tuning based on the number of tokens processed. The number of tokens used will be charged to you. Please refer to the [pricing page for the rate](https://cloud.google.com/vertex-ai/generative-ai/pricing#gemini-models).\n",
+ "\n",
+ "**Important Note:** The final cost may vary slightly from this estimate due to dataset formatting and truncation logic during training.\n",
+ "\n",
+ "The code calculates the total number of billable tokens by summing up the tokens from the training dataset and (if provided) the validation dataset. Then, it estimates the total cost by multiplying the total billable tokens with the number of training epochs (default is 4)."
+ ],
+ "metadata": {
+ "id": "KFWbXu17DfiS"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "epoch_count = 4 # @param {type:\"integer\"}\n",
+ "if epoch_count is None:\n",
+ " epoch_count = 4\n",
+ "\n",
+ "\n",
+ "total_number_of_billable_tokens = training_dataset_stats.total_number_of_billable_tokens\n",
+ "\n",
+ "\n",
+ "if validation_dataset_stats:\n",
+ " total_number_of_billable_tokens += (\n",
+ " validation_dataset_stats.total_number_of_billable_tokens\n",
+ " )\n",
+ "\n",
+ "print(f\"Dataset has ~{total_number_of_billable_tokens} tokens that will be charged\")\n",
+ "print(f\"By default, you'll train for {epoch_count} epochs on this dataset.\")\n",
+ "print(\n",
+ " f\"By default, you'll be charged for ~{epoch_count * total_number_of_billable_tokens} tokens.\"\n",
+ ")"
+ ],
+ "metadata": {
+ "id": "k3ZJ_8fQ0R9x",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "outputId": "9525fd37-0fc4-44dc-a97e-99a9e44748c0"
+ },
+ "execution_count": 143,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Dataset has ~309397 tokens that will be charged\n",
+ "By default, you'll train for 4 epochs on this dataset.\n",
+ "By default, you'll be charged for ~1237588 tokens.\n"
+ ]
+ }
+ ]
+ },
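+ {
+ "cell_type": "markdown",
+ "source": [
+ "If you also want a rough dollar figure, you can multiply the charged token count by a tuning price per token. The rate below is only a placeholder: look up the current supervised tuning rate for your base model on the [pricing page](https://cloud.google.com/vertex-ai/generative-ai/pricing#gemini-models) and substitute it before relying on the estimate."
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Placeholder rate in USD per 1,000 training tokens. Replace it with the\n",
+ "# current supervised tuning price for your base model from the pricing page.\n",
+ "PRICE_PER_1K_TRAINING_TOKENS_USD = 0.0  # @param {type:\"number\"}\n",
+ "\n",
+ "estimated_cost_usd = (\n",
+ "    epoch_count * total_number_of_billable_tokens / 1000 * PRICE_PER_1K_TRAINING_TOKENS_USD\n",
+ ")\n",
+ "print(f\"Estimated tuning cost: ~${estimated_cost_usd:.2f} USD\")"
+ ],
+ "metadata": {},
+ "execution_count": null,
+ "outputs": []
+ },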
+ {
+ "cell_type": "markdown",
"source": [
- "
\n",
- "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
- "
\n"
- ]
+ "## Convert `Gemini 1.0 Pro` fine-tuning dataset to `Gemini 1.5 Pro` dataset."
+ ],
+ "metadata": {
+ "id": "K1EMMeRfH14a"
+ }
},
{
- "cell_type": "markdown",
+ "cell_type": "code",
+ "source": [
+ "source_uri = \"gs://next-23-tuning-demo/example-fine-tuning.json\" # @param {type:\"string\"}\n",
+ "destination_uri = \"gs://next-23-tuning-demo/new-data-format.jsonl\" # @param {type:\"string\"}\n",
+ "system_instruction = \"You are a helpful and friendly AI assistant\" # Optional"
+ ],
"metadata": {
- "id": "dmWOrTJ3gx13"
+ "id": "oOKZgdSLJUFx"
},
- "source": [
- "### Authenticate your notebook environment (Colab only)\n",
- "\n",
- "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
- ]
+ "execution_count": 144,
+ "outputs": []
},
{
"cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "NyKGtVQjgx13"
- },
- "outputs": [],
"source": [
- "import sys\n",
+ "def convert_jsonl_format(\n",
+ " source_uri: str,\n",
+ " destination_uri: str,\n",
+ " system_instruction: str = None,\n",
+ "):\n",
+ " \"\"\"Converts a JSONL file from the old format to the new format.\n",
"\n",
- "if \"google.colab\" in sys.modules:\n",
- " from google.colab import auth\n",
+ " Args:\n",
+ " source_uri: Google Cloud Storage URI of the source JSONL file.\n",
+ " destination_uri: Google Cloud Storage URI for the new JSONL file.\n",
+ " system_instruction: Optional system instruction text.\n",
+ " If provided, it will be added as \"systemInstruction\" in the new format.\n",
+ " \"\"\"\n",
+ " storage_client = storage.Client()\n",
"\n",
- " auth.authenticate_user()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "DF4l8DTdWgPY"
- },
- "source": [
- "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ " # Extract bucket and file name from source URI\n",
+ " source_bucket_name, source_blob_name = extract_bucket_and_blob_name(source_uri)\n",
+ " source_bucket = storage_client.bucket(source_bucket_name)\n",
+ " source_blob = source_bucket.blob(source_blob_name)\n",
"\n",
- "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ " # Extract bucket and file name from destination URI\n",
+ " dest_bucket_name, dest_blob_name = extract_bucket_and_blob_name(destination_uri)\n",
+ " dest_bucket = storage_client.bucket(dest_bucket_name)\n",
+ " dest_blob = dest_bucket.blob(dest_blob_name)\n",
"\n",
- "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "Nqwi-5ufWp_B"
- },
- "outputs": [],
- "source": [
- "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
- "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
+ " # Download the source JSONL file\n",
+ " source_data = source_blob.download_as_string().decode('utf-8')\n",
"\n",
+ " new_data = []\n",
+ " for line in source_data.splitlines():\n",
+ " try:\n",
+ " json_data = json.loads(line)\n",
+ " new_json_data = convert_json_object(json_data, system_instruction)\n",
+ " new_data.append(new_json_data)\n",
+ " except json.JSONDecodeError as e:\n",
+ " print(f\"Skipping invalid JSON line: {line} - Error: {e}\")\n",
"\n",
- "import vertexai\n",
+ " # Upload the new JSONL file\n",
+ " new_data_str = \"\\n\".join([json.dumps(data) for data in new_data])\n",
+ " dest_blob.upload_from_string(new_data_str)\n",
"\n",
- "vertexai.init(project=PROJECT_ID, location=LOCATION)"
- ]
- },
- {
- "cell_type": "markdown",
+ " print(f\"Successfully converted and uploaded to {destination_uri}\")\n",
+ "\n",
+ "\n",
+ "def convert_json_object(json_data: dict, system_instruction: str = None) -> dict:\n",
+ " \"\"\"Converts a single JSON object from the old format to the new format.\n",
+ "\n",
+ " Args:\n",
+ " json_data: The JSON object to convert.\n",
+ " system_instruction: Optional system instruction text.\n",
+ "\n",
+ " Returns:\n",
+ " The converted JSON object.\n",
+ " \"\"\"\n",
+ " new_json_data = {} # Create an empty dict instead of initializing with \"contents\"\n",
+ "\n",
+ " if system_instruction:\n",
+ " new_json_data[\"systemInstruction\"] = {\n",
+ " \"role\": \"system\",\n",
+ " \"parts\": [{\"text\": system_instruction}]\n",
+ " }\n",
+ "\n",
+ " new_json_data[\"contents\"] = [] # Initialize \"contents\" after \"systemInstruction\"\n",
+ "\n",
+ " for message in json_data.get(\"messages\", []):\n",
+ " new_message = {\n",
+ " \"role\": message[\"role\"],\n",
+ " \"parts\": [{\"text\": message[\"content\"]}]\n",
+ " }\n",
+ " new_json_data[\"contents\"].append(new_message)\n",
+ "\n",
+ " return new_json_data\n",
+ "\n",
+ "\n",
+ "def extract_bucket_and_blob_name(gcs_uri: str) -> tuple:\n",
+ " \"\"\"Extracts the bucket name and blob name from a Google Cloud Storage URI.\n",
+ "\n",
+ " Args:\n",
+ " gcs_uri: The Google Cloud Storage URI (e.g., \"gs://my-bucket/my-file.jsonl\")\n",
+ "\n",
+ " Returns:\n",
+ " A tuple containing the bucket name and blob name.\n",
+ " \"\"\"\n",
+ " if not gcs_uri.startswith(\"gs://\"):\n",
+ " raise ValueError(\"Invalid Google Cloud Storage URI\")\n",
+ " parts = gcs_uri[5:].split(\"/\", 1)\n",
+ " return parts[0], parts[1]"
+ ],
"metadata": {
- "id": "EdvJRUWRNGHE"
+ "id": "fgNjg3Y4CSq8"
},
- "source": [
- "## Tuning token count and cost estimation."
- ]
+ "execution_count": 117,
+ "outputs": []
},
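+ {
+ "cell_type": "markdown",
+ "source": [
+ "Before running the conversion against Cloud Storage, you can sanity-check `convert_json_object` on a small in-memory record. The record below is a made-up example in the old `messages` format and is not part of any dataset."
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Hypothetical record in the old format: {\"messages\": [{\"role\": ..., \"content\": ...}]}\n",
+ "sample_record = {\n",
+ "    \"messages\": [\n",
+ "        {\"role\": \"user\", \"content\": \"Summarize the article in one sentence.\"},\n",
+ "        {\"role\": \"model\", \"content\": \"The article explains how to politely decline a date.\"},\n",
+ "    ]\n",
+ "}\n",
+ "\n",
+ "# The converted record uses the new contents/parts structure, plus the optional\n",
+ "# systemInstruction defined earlier.\n",
+ "print(json.dumps(convert_json_object(sample_record, system_instruction), indent=2))"
+ ],
+ "metadata": {},
+ "execution_count": null,
+ "outputs": []
+ },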
{
- "cell_type": "markdown",
+ "cell_type": "code",
+ "source": [
+ "convert_jsonl_format(source_uri, destination_uri, system_instruction)"
+ ],
"metadata": {
- "id": "cPhY560YQijW"
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "WAqrR4yDH1LT",
+ "outputId": "2e351e1c-ee6c-40e6-857f-581b12fe3872"
},
- "source": [
- "### Import libraries"
+ "execution_count": 118,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Successfully converted and uploaded to gs://next-23-tuning-demo/new-data-format.jsonl\n"
+ ]
+ }
]
},
{
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {
- "id": "4498u5KpQijW"
- },
- "outputs": [],
+ "cell_type": "markdown",
"source": [
- "from collections import defaultdict\n",
- "import dataclasses\n",
- "import json\n",
+ "## Tuning token count and cost estimation for `Gemini 1.0 pro` legacy users.\n",
"\n",
- "import numpy as np\n",
- "import tensorflow as tf\n",
- "from vertexai.generative_models import Content, Part\n",
- "from vertexai.preview.tokenization import get_tokenizer_for_model"
- ]
+ "Only use this part if you still use `Gemini 1.0 pro`. Its best to upgrade to using [`gemini-1.5-pro-002`](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning)."
+ ],
+ "metadata": {
+ "id": "9k1GJaFIEvd-"
+ }
},
{
"cell_type": "markdown",
@@ -303,7 +998,8 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "PTvunHqRTHqe"
+ "id": "PTvunHqRTHqe",
+ "outputId": "8d1aabc9-cf3b-4150-f768-c40d0d92c237"
},
"outputs": [
{
@@ -458,7 +1154,8 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "pUCpEmEFM0eX"
+ "id": "pUCpEmEFM0eX",
+ "outputId": "1bf39ccb-4898-4c44-9a6e-557e58694d7a"
},
"outputs": [
{
@@ -687,7 +1384,8 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "uOWsUbwVXoTU"
+ "id": "uOWsUbwVXoTU",
+ "outputId": "c644fa82-1de4-4ba5-f9cf-44f4232917ee"
},
"outputs": [
{
@@ -751,7 +1449,8 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "DVIpbaGYRJQc"
+ "id": "DVIpbaGYRJQc",
+ "outputId": "2e2f78cc-2005-4965-af26-a1cc5627e7ee"
},
"outputs": [
{
@@ -784,179 +1483,6 @@
" f\"By default, you'll be charged for ~{epoch_count * total_number_of_billable_tokens} tokens.\"\n",
")"
]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "k35OK86wJVVd"
- },
- "source": [
- "## Validate the dataset for Vertex AI Gemini 1.5 fine-tuning"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {
- "id": "KQWJhyaXQRNM"
- },
- "outputs": [],
- "source": [
- "BASE_MODEL = \"gemini-1.5-pro-002\" # @param ['gemini-1.5-pro-002']{type:\"string\"}\n",
- "training_dataset_uri_2 = \"gs://github-repo/generative-ai/gemini/tuning/train_sft_train_samples.jsonl\" # @param {type:\"string\"}\n",
- "validation_dataset_uri_2 = \"gs://github-repo/generative-ai/gemini/tuning/val_sft_val_samples.jsonl\" # @param {type:\"string\"}"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "p-soJC81YNy2"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "\n",
- "Num training examples: 500\n",
- "First example:\n",
- "{'role': 'user', 'parts': [{'text': 'Honesty is usually the best policy. It is disrespectful to lie to someone. If you don\\'t want to date someone, you should say so. Sometimes it is easy to be honest. For example, you might be able to truthfully say, \"No, thank you, I already have a date for that party.\" Other times, you might need to find a kinder way to be nice. Maybe you are not attracted to the person. Instead of bluntly saying that, try saying, \"No, thank you, I just don\\'t think we would be a good fit.\" Avoid making up a phony excuse. For instance, don\\'t tell someone you will be out of town this weekend if you won\\'t be. There\\'s a chance that you might then run into them at the movies, which would definitely cause hurt feelings. A compliment sandwich is a really effective way to provide feedback. Essentially, you \"sandwich\" your negative comment between two positive things. Try using this method when you need to reject someone. An example of a compliment sandwich is to say something such as, \"You\\'re an awesome person. Unfortunately, I\\'m not interested in dating you. Someone else is going to be really lucky to date someone with such a great personality!\" You could also try, \"You are a really nice person. I\\'m only interested you as a friend. I like when we hang out in big groups together!\" Be sincere. If you offer false compliments, the other person will likely be able to tell and feel hurt. If you do not want to date someone, it is best to be upfront about your feelings. Do not beat around the bush. If your mind is made up, it is best to clearly state your response. If someone asks you to date them and you don\\'t want to, you can be direct and kind at the same time. State your answer clearly. You can make your feelings clear without purposefully hurting someone else\\'s feelings. Try smiling and saying, \"That sounds fun, but no thank you. I\\'m not interested in dating you.\" Don\\'t beat around the bush. If you do not want to accept the date, there is no need to say, \"Let me think about it.\" It is best to get the rejection over with. You don\\'t want to give someone false hope. Avoid saying something like, \"Let me check my schedule and get back to you.\" Try to treat the person the way you would want to be treated. This means that you should choose your words carefully. Be thoughtful in your response. It\\'s okay to pause before responding. You might be taken by surprise and need a moment to collect your thoughts. Say thank you. It is a compliment to be asked out. You can say, \"I\\'m flattered. Unfortunately, I can\\'t accept.\" Don\\'t laugh. Many people laugh nervously in awkward situations. Try to avoid giggling, as that is likely to result in hurt feelings. Sometimes it is not what you say, but how you say it. If you need to reject someone, think about factors other than your words. Non-verbal communication matters, too. Use the right tone of voice. Try to sound gentle but firm. Make eye contact. This helps convey that you are being serious, and also shows respect for the other person. If you are in public, try not to speak too loudly. It is not necessary for everyone around you to know that you are turning down a date.\\n\\nProvide a summary of the article in two or three sentences:\\n\\n'}]}\n",
- "{'role': 'model', 'parts': [{'text': 'Tell the truth. Use a \"compliment sandwich\". Be direct. Treat the person with respect. Communicate effectively.'}]}\n",
- "Num validation examples: 100\n"
- ]
- }
- ],
- "source": [
- "with tf.io.gfile.GFile(training_dataset_uri_2) as dataset_jsonl_file:\n",
- " example_training_dataset = [\n",
- " json.loads(dataset_line) for dataset_line in dataset_jsonl_file\n",
- " ]\n",
- "\n",
- "print()\n",
- "\n",
- "if validation_dataset_uri:\n",
- " with tf.io.gfile.GFile(validation_dataset_uri_2) as dataset_jsonl_file:\n",
- " example_validation_dataset = [\n",
- " json.loads(dataset_line) for dataset_line in dataset_jsonl_file\n",
- " ]\n",
- "\n",
- "# Initial dataset stats\n",
- "print(\"Num training examples:\", len(example_training_dataset))\n",
- "print(\"First example:\")\n",
- "for message in example_training_dataset[0][\"contents\"]:\n",
- " print(message)\n",
- "\n",
- "if example_validation_dataset:\n",
- " print(\"Num validation examples:\", len(example_validation_dataset))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "K0YOqIbtXwGI"
- },
- "outputs": [],
- "source": [
- "def validate_dataset_format(dataset):\n",
- " \"\"\"Validates the dataset.\n",
- "\n",
- " Args:\n",
- " dataset_uri: The dataset uri to be validated.\n",
- " \"\"\"\n",
- " format_errors = defaultdict(list)\n",
- " if not dataset or len(dataset) == 0:\n",
- " print(\"Input dataset file is empty or inaccessible.\")\n",
- " return\n",
- "\n",
- " for row_idx, example in enumerate(dataset):\n",
- " # Verify presence of contents list\n",
- " if not isinstance(example, dict):\n",
- " format_errors[\"invalid_input\"].append(row_idx)\n",
- " continue\n",
- " contents = example.get(\"contents\", None)\n",
- " if not contents:\n",
- " format_errors[\"missing_contents_list\"].append(row_idx)\n",
- " continue\n",
- " try:\n",
- " validate_contents(contents, format_errors, row_idx)\n",
- " except (TypeError, AttributeError, KeyError) as e:\n",
- " print(\"Invalid input during validation: %s\", e)\n",
- " format_errors[\"invalid_input\"].append(row_idx)\n",
- "\n",
- " if format_errors:\n",
- " print(\"Found errors for this dataset:\")\n",
- " for k, v in format_errors.items():\n",
- " print(f\"{k}: {v}\")\n",
- " else:\n",
- " print(\"No errors found for this dataset.\")\n",
- "\n",
- "\n",
- "def validate_contents(contents, format_errors, row_index):\n",
- " \"\"\"Validates contents list format.\"\"\"\n",
- " if not isinstance(contents, list):\n",
- " format_errors[\"invalid_contents_list\"].append(row_index)\n",
- " return\n",
- "\n",
- " prev_role = None\n",
- " for content_item in contents:\n",
- " if not isinstance(content_item, dict):\n",
- " format_errors[\"invalid_content_item\"].append(row_index)\n",
- " return\n",
- "\n",
- " if \"role\" not in content_item or \"parts\" not in content_item:\n",
- " format_errors[\"content_item_missing_key\"].append(row_index)\n",
- " return\n",
- "\n",
- " if content_item.get(\"role\", \"\").lower() not in (\"user\", \"model\"):\n",
- " format_errors[\"unrecognized_role\"].append(row_index)\n",
- " return\n",
- "\n",
- " parts = content_item.get(\"parts\", None)\n",
- " if not parts or not isinstance(parts, list):\n",
- " format_errors[\"missing_or_invalid_parts\"].append(row_index)\n",
- " return\n",
- "\n",
- " for part in parts:\n",
- " if not isinstance(part, dict) or \"text\" not in part:\n",
- " format_errors[\"invalid_part\"].append(row_index)\n",
- " return\n",
- "\n",
- " if not part.get(\"text\"):\n",
- " format_errors[\"missing_text\"].append(row_index)\n",
- " return\n",
- "\n",
- " role = content_item.get(\"role\", \"\").lower()\n",
- " # messages to have alternate turns.\n",
- " if role == prev_role:\n",
- " format_errors[\"consecutive_turns\"].append(row_index)\n",
- " return\n",
- "\n",
- " prev_role = role"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "RppMRP9lIkq2"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "No errors found for this dataset.\n",
- "No errors found for this dataset.\n"
- ]
- }
- ],
- "source": [
- "validate_dataset_format(example_training_dataset)\n",
- "if example_validation_dataset:\n",
- " validate_dataset_format(example_validation_dataset)"
- ]
}
],
"metadata": {
@@ -966,8 +1492,8 @@
"dmWOrTJ3gx13",
"DF4l8DTdWgPY"
],
- "name": "vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb",
- "toc_visible": true
+ "toc_visible": true,
+ "provenance": []
},
"kernelspec": {
"display_name": "Python 3",
@@ -976,4 +1502,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
-}
+}
\ No newline at end of file
From fcd3bb11df59b0110d23f8c5e2e1cd980e1d26f8 Mon Sep 17 00:00:00 2001
From: Deepak moonat
Date: Wed, 2 Oct 2024 07:11:23 +0530
Subject: [PATCH 32/76] feat: add gemini supervised finetuning on image data
notebook (#1181)
# Description
Notebook showcasing how to do supervised finetuning using
gemini1.5-pro-002 on image data
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
Co-authored-by: Erwin Huizenga
---
...inetuning_using_gemini_on_image_data.ipynb | 1934 +++++++++++++++++
...ning_token_count_and_cost_estimation.ipynb | 355 ++-
2 files changed, 2106 insertions(+), 183 deletions(-)
create mode 100644 gemini/tuning/supervised_finetuning_using_gemini_on_image_data.ipynb
diff --git a/gemini/tuning/supervised_finetuning_using_gemini_on_image_data.ipynb b/gemini/tuning/supervised_finetuning_using_gemini_on_image_data.ipynb
new file mode 100644
index 0000000000..a9cf0b47bb
--- /dev/null
+++ b/gemini/tuning/supervised_finetuning_using_gemini_on_image_data.ipynb
@@ -0,0 +1,1934 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "9f0d0f32-23b4-41a6-b364-579da297c326"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Copyright & License (click to expand)\n",
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dd53d60c-97eb-4c72-91ea-f274a753ab34"
+ },
+ "source": [
+ "# Supervised Fine Tuning with Gemini 1.5 Pro for Image Captioning\n",
+ "\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "MgVK7IeKpW27"
+ },
+ "source": [
+ "| | | |\n",
+ "|-|-|-|\n",
+ "|Author(s) | [Deepak Moonat](https://github.com/dmoonat)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "9ef820fb-1203-4cab-965f-17093a4ba25e"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "**Gemini** is a family of generative AI models developed by Google DeepMind that is designed for multimodal use cases. The Gemini API gives you access to the various Gemini models, such as Gemini 1.5 Pro, Gemini 1.0 Pro and more.\n",
+ "\n",
+ "This notebook demonstrates how to fine-tune the Gemini 1.5 Pro Model for Vision task using the Vertex AI Supervised Tuning feature. Supervised Tuning allows you to use your own training data to further refine the base model's capabilities towards your specific tasks.\n",
+ "\n",
+ "\n",
+ "Supervised Tuning uses labeled examples to tune a model. Each example demonstrates the output you want from your text model during inference.\n",
+ "\n",
+ "First, ensure your training data is of high quality, well-labeled, and directly relevant to the target task. This is crucial as low-quality data can adversely affect the performance and introduce bias in the fine-tuned model.\n",
+ "- Training: Experiment with different configurations to optimize the model's performance on the target task.\n",
+ "- Evaluation:\n",
+ " - Metric: Choose appropriate evaluation metrics that accurately reflect the success of the fine-tuned model for your specific task\n",
+ " - Evaluation Set: Use a separate set of data to evaluate the model's performance"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "74b00940-376c-4056-90fb-d22c1ce6eedf"
+ },
+ "source": [
+ "### Objective\n",
+ "\n",
+ "In this tutorial, you will learn how to use `Vertex AI` to tune a `gemini-1.5-pro-002` model on image data.\n",
+ "\n",
+ "\n",
+ "This tutorial uses the following Google Cloud ML services:\n",
+ "\n",
+ "- `Vertex AI`\n",
+ "\n",
+ "\n",
+ "The steps performed include:\n",
+ "\n",
+ "- Prepare and load the dataset\n",
+ "- Load the `gemini-1.5-pro-002` model\n",
+ "- Evaluate the model before tuning\n",
+ "- Tune the model.\n",
+ " - This will automatically create a Vertex AI endpoint and deploy the model to it\n",
+ "- Evaluate the model after tuning\n",
+ "- Make a prediction using tuned model."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "c6b43693-b20a-41bd-b5b8-5ad414517162"
+ },
+ "source": [
+ "### Model\n",
+ "\n",
+ "The pre-trained LLM model is `gemini-1.5-pro-002` for text generation."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "X0xdTMs10K7y"
+ },
+ "source": [
+ "### Dataset"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "jCMczwd00N9T"
+ },
+ "source": [
+ "Dataset used in this notebook is about image captioning.\n",
+ "\n",
+ "[Reference](https://ai.google.dev/gemma/docs/paligemma/fine-tuning-paligemma#download_the_model_checkpoint)\n",
+ "\n",
+ "Licensed under the Creative Commons Attribution 4.0 License"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6d7b5435-e947-49bb-9ce3-aa8a42c30118"
+ },
+ "source": [
+ "### Costs\n",
+ "\n",
+ "This tutorial uses billable components of Google Cloud:\n",
+ "\n",
+ "* Vertex AI\n",
+ "* Cloud Storage\n",
+ "\n",
+ "Learn about [Vertex AI\n",
+ "pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage\n",
+ "pricing](https://cloud.google.com/storage/pricing), and use the [Pricing\n",
+ "Calculator](https://cloud.google.com/products/calculator/)\n",
+ "to generate a cost estimate based on your projected usage."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0cbf01f0-5f6e-4bcd-903f-84ccaad5332c"
+ },
+ "source": [
+ "## Installation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "MpDAgOsK6kZn"
+ },
+ "outputs": [],
+ "source": [
+ "! pip3 install --upgrade --user --quiet google-cloud-aiplatform jsonlines rouge_score"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Moror1y0Qq2z"
+ },
+ "source": [
+ "### Restart runtime (Colab only)\n",
+ "\n",
+ "To use the newly installed packages, you must restart the runtime on Google Colab."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "4KLm_nKmQtC8"
+ },
+ "outputs": [],
+ "source": [
+ "# Automatically restart kernel after installs so that your environment can access the new packages\n",
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " import IPython\n",
+ "\n",
+ " app = IPython.Application.instance()\n",
+ " app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dpSnJTbIrFsh"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "b37d4259-7e39-417b-8879-24f7575732c8"
+ },
+ "source": [
+ "## Before you begin\n",
+ "\n",
+ "### Set your project ID\n",
+ "\n",
+ "**If you don't know your project ID**, try the following:\n",
+ "* Run `gcloud config list`.\n",
+ "* Run `gcloud projects list`.\n",
+ "* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "caaf0d7e-c6cb-4e56-af5c-553db5180e00"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_ID = \"[YOUR_PROJECT_ID]\" # @param {type:\"string\"}\n",
+ "# Set the project id\n",
+ "! gcloud config set project {PROJECT_ID}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "054d794d-cd2e-4280-95ac-859b264ea2d6"
+ },
+ "source": [
+ "#### Region\n",
+ "\n",
+ "You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "0121bf60-1acd-4272-afaf-aa54b4ded263"
+ },
+ "outputs": [],
+ "source": [
+ "REGION = \"us-central1\" # @param {type:\"string\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "czjH2JfKaGfH"
+ },
+ "source": [
+ "#### Bucket\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "c_iZzYtraF3y"
+ },
+ "outputs": [],
+ "source": [
+ "BUCKET_NAME = \"[YOUR_BUCKET_NAME]\" # @param {type:\"string\"}\n",
+ "BUCKET_URI = f\"gs://{BUCKET_NAME}\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "eac9e842-d225-4876-836f-afdb1937d800"
+ },
+ "source": [
+ "### Authenticate your Google Cloud account\n",
+ "\n",
+ "Depending on your Jupyter environment, you may have to manually authenticate. Follow the relevant instructions below.\n",
+ "\n",
+ "**1. Vertex AI Workbench**\n",
+ "* Do nothing as you are already authenticated.\n",
+ "\n",
+ "**2. Local JupyterLab instance, uncomment and run:**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "23082eec-b1bd-4594-b5b5-56fe2b74db6f"
+ },
+ "outputs": [],
+ "source": [
+ "# ! gcloud auth login"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3c20f923-3c46-4d6d-80d2-d7cb22b1a8da"
+ },
+ "source": [
+ "**3. Authenticate your notebook environment**\n",
+ "\n",
+ "If you are running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "60302a3f-fad9-452c-8998-a9c9822d2732"
+ },
+ "outputs": [],
+ "source": [
+ "from google.colab import auth\n",
+ "\n",
+ "auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ac33116d-b079-46cb-9614-86326c211e00"
+ },
+ "source": [
+ "**4. Service account or other**\n",
+ "* See how to grant Cloud Storage permissions to your service account at https://cloud.google.com/storage/docs/gsutil/commands/iam#ch-examples."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "e6a924d0-a034-4e53-b240-03d356c7b7a6"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "463729ba-ec3c-4302-95bf-80207b0f9e2d"
+ },
+ "outputs": [],
+ "source": [
+ "import io\n",
+ "import time\n",
+ "\n",
+ "# For visualization.\n",
+ "from PIL import Image\n",
+ "\n",
+ "# For google cloud storage service.\n",
+ "from google.cloud import storage\n",
+ "\n",
+ "# For fine tuning Gemini model.\n",
+ "import google.cloud.aiplatform as aiplatform\n",
+ "\n",
+ "# For data handling.\n",
+ "import jsonlines\n",
+ "import pandas as pd\n",
+ "\n",
+ "# For evaluation.\n",
+ "from rouge_score import rouge_scorer\n",
+ "from tqdm import tqdm\n",
+ "from vertexai.preview.generative_models import GenerationConfig, GenerativeModel, Part\n",
+ "from vertexai.preview.tuning import sft"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a522acfe-d0b6-4b4e-b201-0a4ccf59b133"
+ },
+ "source": [
+ "## Initialize Vertex AI SDK for Python\n",
+ "\n",
+ "Initialize the Vertex AI SDK for Python for your project and corresponding bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "c845aca6-4f72-4d3b-b9ed-de4a18fcbbf8"
+ },
+ "outputs": [],
+ "source": [
+ "aiplatform.init(project=PROJECT_ID, location=REGION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "okht6CExcw4d"
+ },
+ "source": [
+ "## Prepare Multimodal Dataset"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "8N1QCz0MzyD6"
+ },
+ "source": [
+ "The dataset used to tune a foundation model needs to include examples that align with the task that you want the model to perform."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "9yp9SQ1M7FSP"
+ },
+ "source": [
+ "Note:\n",
+ "- Only support images and text as input, and text only as output.\n",
+ "- Maximum 16 Images per tuning example.\n",
+ "- Maximum image file size: 20MB\n",
+ "- Image needs to be in `jpeg` or `png` format. Supported mimetypes: `image/jpeg` and `image/png`\n",
+ "\n",
+ "Input is a jsonl file with each json string being on one line.\n",
+ "Each json instance have the format (Expanded for clarity):\n",
+ "```\n",
+ "{\n",
+ " \"contents\":[\n",
+ " {\n",
+ " \"role\":\"user\", # This indicate input content\n",
+ " \"parts\":[ # Interleaved image and text, could be in any order.\n",
+ " {\n",
+ " \"fileData\":{ # FileData needs to be reference to image file in gcs. No inline data.\n",
+ " \"mimeType\":\"image/jpeg\", # Provide the mimeType about this image\n",
+ " \"fileUri\":\"gs://path/to/image_uri\"\n",
+ " }\n",
+ " }\n",
+ " {\n",
+ " \"text\":\"What is in this image?\"\n",
+ " }\n",
+ " ]\n",
+ " },\n",
+ " {\n",
+ " \"role\":\"model\", # This indicate target content\n",
+ " \"parts\":[ # text only\n",
+ " {\n",
+ " \"text\":\"Something about this image.\"\n",
+ " }\n",
+ " ]\n",
+ " } # Single turn input and response.\n",
+ " ]\n",
+ "}\n",
+ "```\n",
+ "\n",
+ "Example:\n",
+ "```\n",
+ "{\n",
+ " \"contents\":[\n",
+ " {\n",
+ " \"role\":\"user\",\n",
+ " \"parts\":[\n",
+ " {\n",
+ " \"fileData\":{\n",
+ " \"mimeType\":\"image/jpeg\",\n",
+ " \"fileUri\":\"gs://bucketname/data/vision_data/task/image_description/image/1.jpeg\"\n",
+ " }\n",
+ " },\n",
+ " {\n",
+ " \"text\":\"Describe this image that captures the essence of it.\"\n",
+ " }\n",
+ " ]\n",
+ " },\n",
+ " {\n",
+ " \"role\":\"model\",\n",
+ " \"parts\":[\n",
+ " {\n",
+ " \"text\":\"A person wearing a pink shirt and a long-sleeved shirt with a large cuff, ....\"\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "```\n"
+ ]
+ },
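+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a quick illustration of the format above, the cell below builds a single tuning example in Python and prints it as one JSONL line. The bucket path and caption text are placeholders only; the actual training examples are created later in this notebook from the downloaded dataset."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import json\n",
+ "\n",
+ "# Placeholder values: replace the bucket path and caption text with real data.\n",
+ "example = {\n",
+ "    \"contents\": [\n",
+ "        {\n",
+ "            \"role\": \"user\",\n",
+ "            \"parts\": [\n",
+ "                {\n",
+ "                    \"fileData\": {\n",
+ "                        \"mimeType\": \"image/jpeg\",\n",
+ "                        \"fileUri\": \"gs://your-bucket/images/1.jpeg\",\n",
+ "                    }\n",
+ "                },\n",
+ "                {\"text\": \"Describe this image in detail that captures the essence of it.\"},\n",
+ "            ],\n",
+ "        },\n",
+ "        {\"role\": \"model\", \"parts\": [{\"text\": \"A person wearing a pink shirt ...\"}]},\n",
+ "    ]\n",
+ "}\n",
+ "\n",
+ "# Each line of the tuning JSONL file is one JSON object like this.\n",
+ "print(json.dumps(example))"
+ ]
+ },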
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DESw8v4QrLHR"
+ },
+ "source": [
+ "### Data files\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "uiTVJqMXTvM5"
+ },
+ "source": [
+ "Data used in this notebook is present in the public Cloud Storage(GCS) bucket, `gs://longcap100`.\n",
+ "\n",
+ "Sample:\n",
+ "\n",
+ "> {\"prefix\": \"\", \"suffix\": \"A person wearing a pink shirt and a long-sleeved shirt with a large cuff, has their hand on a concrete ledge. The hand is on the edge of the ledge, and the thumb is on the edge of the hand. The shirt has a large cuff, and the sleeve is rolled up. The shadow of the hand is on the wall.\", \"image\": \"91.jpeg\"}\n",
+ "\n",
+ "\n",
+ "\n",
+ "- `data_train90.jsonl`: Contains training samples in json lines as shown above\n",
+ "- `data_val10.jsonl`: Contains validation samples in json lines as shown above\n",
+ "- `images`: Contains 100 images, training and validation data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "MLcuIXlzz36C"
+ },
+ "source": [
+ "To run a tuning job, you need to upload one or more datasets to a Cloud Storage bucket. You can either create a new Cloud Storage bucket or use an existing one to store dataset files. The region of the bucket doesn't matter, but we recommend that you use a bucket that's in the same Google Cloud project where you plan to tune your model."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "sfIUgj-mU8K9"
+ },
+ "source": [
+ "### Create a Cloud Storage bucket"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "T_uC6nuFU-XU"
+ },
+ "source": [
+ "- Create a storage bucket to store intermediate artifacts such as datasets.\n",
+ "\n",
+ "- Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "M-L1BH8TU9Gn"
+ },
+ "outputs": [],
+ "source": [
+ "!gsutil mb -l {REGION} -p {PROJECT_ID} {BUCKET_URI}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ZUGi7ZThbChr"
+ },
+ "source": [
+ "### Copy images to specified Bucket"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "DHdC-9nj071o"
+ },
+ "outputs": [],
+ "source": [
+ "!gsutil -m -q cp -n -r gs://longcap100/*.jpeg {BUCKET_URI}/images/"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "fpyJR6tlVRXh"
+ },
+ "source": [
+ "- Download the training and validation dataset jsonlines files from the bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "peUixIt_2DLP"
+ },
+ "outputs": [],
+ "source": [
+ "!gsutil -m -q cp -n -r gs://longcap100/data_train90.jsonl ."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "rtXMRqAi1WiF"
+ },
+ "outputs": [],
+ "source": [
+ "!gsutil -m -q cp -n -r gs://longcap100/data_val10.jsonl ."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a9N-rN7pECKa"
+ },
+ "source": [
+ "### Prepare dataset for Training and Evaluation"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KEfGLRVfsrii"
+ },
+ "source": [
+ "- Utility function to save json instances into jsonlines format"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "zdVGCwFWsrCB"
+ },
+ "outputs": [],
+ "source": [
+ "def save_jsonlines(file, instances):\n",
+ " \"\"\"\n",
+ " Saves a list of json instances to a jsonlines file.\n",
+ " \"\"\"\n",
+ " with jsonlines.open(file, mode=\"w\") as writer:\n",
+ " writer.write_all(instances)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-hMIYgYBsbUt"
+ },
+ "source": [
+ "- Below function converts the dataset into Gemini-1.5 tuning format"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "0TFcj_tjaALV"
+ },
+ "outputs": [],
+ "source": [
+ "task_prompt = \"Describe this image in detail that captures the essence of it.\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "LZ1cauVkz8Vv"
+ },
+ "outputs": [],
+ "source": [
+ "def create_tuning_samples(file_path):\n",
+ " \"\"\"\n",
+ " Creates tuning samples from a file.\n",
+ " \"\"\"\n",
+ " with jsonlines.open(file_path) as reader:\n",
+ " instances = []\n",
+ " for obj in reader:\n",
+ " instance = {\n",
+ " \"contents\": [\n",
+ " {\n",
+ " \"role\": \"user\", # This indicate input content\n",
+ " \"parts\": [ # Interleaved image and text, could be in any order.\n",
+ " {\n",
+ " \"fileData\": { # FileData needs to be reference to image file in gcs. No inline data.\n",
+ " \"mimeType\": \"image/jpeg\", # Provide the mimeType about this image\n",
+ " \"fileUri\": f\"{BUCKET_URI}/images/{obj['image']}\",\n",
+ " }\n",
+ " },\n",
+ " {\"text\": task_prompt},\n",
+ " ],\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"model\", # This indicate target content\n",
+ " \"parts\": [{\"text\": obj[\"suffix\"]}], # text only\n",
+ " }, # Single turn input and response.\n",
+ " ]\n",
+ " }\n",
+ " instances.append(instance)\n",
+ " return instances"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tqh6WYHg6X4z"
+ },
+ "source": [
+ "- Training data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "b685Iy27z1E1"
+ },
+ "outputs": [],
+ "source": [
+ "train_file_path = \"data_train90.jsonl\"\n",
+ "train_instances = create_tuning_samples(train_file_path)\n",
+ "# save the training instances to jsonl file\n",
+ "save_jsonlines(\"train.jsonl\", train_instances)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "UC4ULRC46mA-"
+ },
+ "outputs": [],
+ "source": [
+ "train_instances[0]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "nyn5Xgw41bhc"
+ },
+ "outputs": [],
+ "source": [
+ "# save the training data to gcs bucket\n",
+ "!gsutil cp train.jsonl {BUCKET_URI}/train/"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "HLsC3IBL6ZWk"
+ },
+ "source": [
+ "- Validation data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "LIp0hdag6bS0"
+ },
+ "outputs": [],
+ "source": [
+ "val_file_path = \"data_val10.jsonl\"\n",
+ "val_instances = create_tuning_samples(val_file_path)\n",
+ "# save the training instances to jsonl file\n",
+ "save_jsonlines(\"val.jsonl\", val_instances)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "TBTBTx4n6koL"
+ },
+ "outputs": [],
+ "source": [
+ "val_instances[0]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "xy-6ihNR6gx3"
+ },
+ "outputs": [],
+ "source": [
+ "# save the validation data to gcs bucket\n",
+ "!gsutil cp val.jsonl {BUCKET_URI}/val/"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "QhejcJumTAj3"
+ },
+ "source": [
+ "- Below code transforms the jsonl format to following structure\n",
+ "\n",
+ "`\n",
+ "[{'file_uri': '',\n",
+ " 'ground_truth': 'Experiment with different parameter values to get the best values for the task\n",
+ "\n",
+ "Refer to the following [link](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values) for understanding different parameters"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "zUx23W_r0F8z"
+ },
+ "source": [
+ "**Prompt** is a natural language request submitted to a language model to receive a response back\n",
+ "\n",
+ "Some best practices include\n",
+ " - Clearly communicate what content or information is most important\n",
+ " - Structure the prompt:\n",
+ " - Defining the role if using one. For example, You are an experienced UX designer at a top tech company\n",
+ " - Include context and input data\n",
+ " - Provide the instructions to the model\n",
+ " - Add example(s) if you are using them\n",
+ "\n",
+ "Refer to the following [link](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-design-strategies) for prompt design strategies."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "uuKHRy2OVX0w"
+ },
+ "source": [
+ "### Task"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "U-YD1J3VTSoI"
+ },
+ "source": [
+ "***Task prompt:***\n",
+ "\n",
+ "`\n",
+ "\", Describe this image that captures the essence of it. \"\n",
+ "`"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "zTZS4IJMTVR1"
+ },
+ "source": [
+ "***Query Image (image)***\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "-Ry2IjT2TWwd"
+ },
+ "outputs": [],
+ "source": [
+ "query_image_uri = val_instances[0][\"contents\"][0][\"parts\"][0][\"fileData\"][\"fileUri\"]\n",
+ "blob_name = query_image_uri.replace(f\"{BUCKET_URI}/\", \"\")\n",
+ "img = read_image_bytes_from_gcs(BUCKET_NAME, blob_name)\n",
+ "\n",
+ "# Display image bytes using pil python library\n",
+ "image = Image.open(io.BytesIO(img))\n",
+ "resized_img = image.resize((300, 300))\n",
+ "display(resized_img)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "04lAlLK53IYS"
+ },
+ "source": [
+ "- Test on single instance"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "-MeiP8z-o6qt"
+ },
+ "outputs": [],
+ "source": [
+ "response = generation_model.generate_content(\n",
+ " contents=[\n",
+ " Part.from_uri(query_image_uri, \"image/jpeg\"),\n",
+ " \"Describe this image that captures the essence of it.\",\n",
+ " ],\n",
+ " # Optional config\n",
+ " generation_config=GenerationConfig(\n",
+ " temperature=0.0,\n",
+ " ),\n",
+ ")\n",
+ "\n",
+ "print(response.text.strip())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5LISwh5_4R1U"
+ },
+ "source": [
+ "- Ground truth"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "aGXbUVK-3lO5"
+ },
+ "outputs": [],
+ "source": [
+ "val_instances[0][\"contents\"][1][\"parts\"][0][\"text\"]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "MRVAwGLB6KUX"
+ },
+ "source": [
+ "- Change prompt to get detailed description for the provided image"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "JO-C5BAVsdfd"
+ },
+ "outputs": [],
+ "source": [
+ "response = generation_model.generate_content(\n",
+ " contents=[\n",
+ " Part.from_uri(query_image_uri, \"image/jpeg\"),\n",
+ " \"Describe this image in detail that captures the essence of it.\",\n",
+ " ],\n",
+ " # Optional config\n",
+ " generation_config=GenerationConfig(\n",
+ " temperature=0.0,\n",
+ " ),\n",
+ ")\n",
+ "\n",
+ "print(response.text.strip())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "snYSjdzCVjGA"
+ },
+ "source": [
+ "## Evaluation before model tuning"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "vVvGqqTSVzUZ"
+ },
+ "source": [
+ "- Evaluate the Gemini model on the validation dataset before tuning it on the training dataset."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "otIRm3XBwQnW"
+ },
+ "outputs": [],
+ "source": [
+ "def get_prediction(query_image, generation_model):\n",
+ " \"\"\"Gets the prediction for a given instance.\n",
+ "\n",
+ " Args:\n",
+ " query_image: The path to the query image.\n",
+ " candidates: A list of paths to the candidate images.\n",
+ " generation_model: The generation model to use for prediction.\n",
+ "\n",
+ " Returns:\n",
+ " A string containing the prediction.\n",
+ " \"\"\"\n",
+ " response = generation_model.generate_content(\n",
+ " contents=[Part.from_uri(query_image, \"image/jpeg\"), task_prompt],\n",
+ " # Optional config\n",
+ " generation_config=GenerationConfig(\n",
+ " temperature=0.0,\n",
+ " ),\n",
+ " )\n",
+ "\n",
+ " return response.text.strip()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "rRW5UVau3xfO"
+ },
+ "outputs": [],
+ "source": [
+ "def run_eval(val_df, model=generation_model):\n",
+ " \"\"\"Runs evaluation on the validation dataset.\n",
+ "\n",
+ " Args:\n",
+ " val_df: The validation dataframe.\n",
+ " generation_model: The generation model to use for evaluation.\n",
+ "\n",
+ " Returns:\n",
+ " A list of predictions on val_df.\n",
+ " \"\"\"\n",
+ " predictions = []\n",
+ " for i, row in tqdm(val_df.iterrows(), total=val_df.shape[0]):\n",
+ " try:\n",
+ " prediction = get_prediction(row[\"file_uri\"], model)\n",
+ " except:\n",
+ " time.sleep(30)\n",
+ " prediction = get_prediction(row[\"file_uri\"], model)\n",
+ " predictions.append(prediction)\n",
+ " time.sleep(1)\n",
+ " return predictions"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "29O4EccbqbIa"
+ },
+ "source": [
+ "- Evaluate the Gemini model on the test dataset before tuning it on the training dataset.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0LunPnr5Tvce"
+ },
+ "source": [
+ "
\n",
+ "⚠️ It will take 1-2 mins for the model to generate predictions on the provided validation dataset. ⚠️\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Y2Uy75youUor"
+ },
+ "outputs": [],
+ "source": [
+ "%%time\n",
+ "predictions = run_eval(val_df, model=generation_model)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "7BOg0EZpgg3D"
+ },
+ "outputs": [],
+ "source": [
+ "len(predictions)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "N22X-_V5mlev"
+ },
+ "outputs": [],
+ "source": [
+ "val_df.loc[:, \"basePredictions\"] = predictions"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "bzA_YLSQ67Jc"
+ },
+ "outputs": [],
+ "source": [
+ "val_df"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "nbPYwzNVWgz-"
+ },
+ "source": [
+ "### Evaluation metric"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "mvqIYHNCWigP"
+ },
+ "source": [
+ "The type of metrics used for evaluation depends on the task that you are evaluating. The following table shows the supported tasks and the metrics used to evaluate each task:\n",
+ "\n",
+ "| Task | Metric(s) |\n",
+ "|-----------------|---------------------------------|\n",
+ "| Classification | Micro-F1, Macro-F1, Per class F1 |\n",
+ "| Summarization | ROUGE-L |\n",
+ "| Question Answering | Exact Match |\n",
+ "| Text Generation | BLEU, ROUGE-L |\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "BTkLeYDJWre1"
+ },
+ "source": [
+ "For this task, we'll using ROUGE metric.\n",
+ "\n",
+ "- **Recall-Oriented Understudy for Gisting Evaluation (ROUGE)**: A metric used to evaluate the quality of automatic summaries of text. It works by comparing a generated summary to a set of reference summaries created by humans."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "TIlOr8KFWzqt"
+ },
+ "source": [
+ "Now you can take the candidate and reference to evaluate the performance. In this case, ROUGE will give you:\n",
+ "\n",
+ "- `rouge-1`, which measures unigram overlap\n",
+ "- `rouge-2`, which measures bigram overlap\n",
+ "- `rouge-l`, which measures the longest common subsequence"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "sIVb60EaW2oW"
+ },
+ "source": [
+ "- *Recall vs. Precision*\n",
+ "\n",
+ " **Recall**, meaning it prioritizes how much of the information in the reference summaries is captured in the generated summary.\n",
+ "\n",
+ " **Precision**, which measures how much of the generated summary is relevant to the original text."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "rDwfndw9OAW9"
+ },
+ "source": [
+ "- Initialize `rouge_score` object"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "1SEVHIrk69kj"
+ },
+ "outputs": [],
+ "source": [
+ "scorer = rouge_scorer.RougeScorer([\"rouge1\", \"rouge2\", \"rougeL\"], use_stemmer=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "_X9vv_gMORkr"
+ },
+ "source": [
+ "- Define function to calculate rouge score"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "P6C6EkvFOQzW"
+ },
+ "outputs": [],
+ "source": [
+ "def get_rouge_score(groundTruth, prediction):\n",
+ " \"\"\"Function to compute rouge score.\n",
+ "\n",
+ " Args:\n",
+ " groundTruth: The ground truth text.\n",
+ " prediction: The predicted text.\n",
+ " Returns:\n",
+ " The rouge score.\n",
+ " \"\"\"\n",
+ " scores = scorer.score(target=groundTruth, prediction=prediction)\n",
+ " return scores"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "J6qBe-Mbtem_"
+ },
+ "source": [
+ "- Single instance evaluation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "BtP0f3GO7zG7"
+ },
+ "outputs": [],
+ "source": [
+ "get_rouge_score(val_df.loc[0, \"ground_truth\"], val_df.loc[0, \"basePredictions\"])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "3zl1PpGA9oWE"
+ },
+ "outputs": [],
+ "source": [
+ "def calculate_metrics(val_df, prediction_col=\"basePredictions\"):\n",
+ " \"\"\"Function to compute rouge scores for all instances in the validation dataset.\n",
+ " Args:\n",
+ " val_df: The validation dataframe.\n",
+ " prediction_col: The column name of the predictions.\n",
+ " Returns:\n",
+ " A dataframe containing the rouge scores.\n",
+ " \"\"\"\n",
+ " records = []\n",
+ " for row, instance in val_df.iterrows():\n",
+ " scores = get_rouge_score(instance[\"ground_truth\"], instance[prediction_col])\n",
+ " records.append(\n",
+ " {\n",
+ " \"rouge1_precision\": scores.get(\"rouge1\").precision,\n",
+ " \"rouge1_recall\": scores.get(\"rouge1\").recall,\n",
+ " \"rouge1_fmeasure\": scores.get(\"rouge1\").fmeasure,\n",
+ " \"rouge2_precision\": scores.get(\"rouge2\").precision,\n",
+ " \"rouge2_recall\": scores.get(\"rouge2\").recall,\n",
+ " \"rouge2_fmeasure\": scores.get(\"rouge2\").fmeasure,\n",
+ " \"rougeL_precision\": scores.get(\"rougeL\").precision,\n",
+ " \"rougeL_recall\": scores.get(\"rougeL\").recall,\n",
+ " \"rougeL_fmeasure\": scores.get(\"rougeL\").fmeasure,\n",
+ " }\n",
+ " )\n",
+ " metrics = pd.DataFrame(records)\n",
+ " return metrics"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "SS2UrB9g8NBt"
+ },
+ "outputs": [],
+ "source": [
+ "evaluation_df_stats = calculate_metrics(val_df, prediction_col=\"basePredictions\")\n",
+ "evaluation_df_stats"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ZEyRYhEBZwy9"
+ },
+ "outputs": [],
+ "source": [
+ "print(\"Mean rougeL_precision is\", evaluation_df_stats.rougeL_precision.mean())\n",
+ "print(\"Mean rougeL_recall is\", evaluation_df_stats.rougeL_recall.mean())\n",
+ "print(\"Mean rougeL_fmeasure is\", evaluation_df_stats.rougeL_fmeasure.mean())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "uYAjjpdG_cpP"
+ },
+ "source": [
+ "## Fine-tune the model"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "EyqBRoY5rscI"
+ },
+ "source": [
+ "- `source_model`: Specifies the base Gemini model version you want to fine-tune.\n",
+ "- `train_dataset`: Path to your training data in JSONL format.\n",
+ "\n",
+ " \n",
+ "\n",
+ " *Optional parameters*\n",
+ " - `validation_dataset`: If provided, this data is used to evaluate the model during tuning.\n",
+ " - `tuned_model_display_name`: Display name for the tuned model.\n",
+ " - `epochs`: The number of training epochs to run.\n",
+ " - `learning_rate_multiplier`: A value to scale the learning rate during training.\n",
+ " - `adapter_size` : Gemini 1.5 Pro supports Adapter length [1, 4], default value is 4.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "UJ0gxBeyqO9k"
+ },
+ "source": [
+ "**Note: The default hyperparameter settings are optimized for optimal performance based on rigorous testing and are recommended for initial use. Users may customize these parameters to address specific performance requirements.**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "_vbe8o4_8qV6"
+ },
+ "outputs": [],
+ "source": [
+ "tuned_model_display_name = \"[DISPLAY NAME FOR TUNED MODEL]\" # @param {type:\"string\"}\n",
+ "\n",
+ "sft_tuning_job = sft.train(\n",
+ " source_model=base_model,\n",
+ " train_dataset=f\"{BUCKET_URI}/train/train.jsonl\",\n",
+ " # Optional:\n",
+ " validation_dataset=f\"{BUCKET_URI}/val/val.jsonl\",\n",
+ " tuned_model_display_name=tuned_model_display_name,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "hSDpQGUeERcH"
+ },
+ "outputs": [],
+ "source": [
+ "job_name = sft_tuning_job.to_dict()[\"name\"]\n",
+ "job_name"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dYQgzOr7KciG"
+ },
+ "outputs": [],
+ "source": [
+ "sft_tuning_job.to_dict()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "RE1a3AgRsqJh"
+ },
+ "source": [
+ "**Note: Tuning time depends on several factors, such as training data size, number of epochs, learning rate multiplier, etc.**"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "qHlfSLjKsruX"
+ },
+ "source": [
+ "
\n",
+ "⚠️ It will take ~60mins for the model tuning job to complete on the provided dataset and set configurations/hyperparameters. ⚠️\n",
+ "
\n",
+ "⚠️ It will take 1-2 mins for the model to generate predictions on the provided validation dataset. ⚠️\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "pWxg3i3a391K"
+ },
+ "outputs": [],
+ "source": [
+ "%%time\n",
+ "predictions_tuned = run_eval(val_df, model=tuned_model)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "V0wJNPEf5-6I"
+ },
+ "outputs": [],
+ "source": [
+ "val_df.loc[:, \"tunedPredictions\"] = predictions_tuned"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "og4hVmwCuuPW"
+ },
+ "outputs": [],
+ "source": [
+ "evaluation_df_post_tuning_stats = calculate_metrics(\n",
+ " val_df, prediction_col=\"tunedPredictions\"\n",
+ ")\n",
+ "evaluation_df_post_tuning_stats"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "heKx9Lu5vBYb"
+ },
+ "source": [
+ "- Improvement"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "X2AVUCh3S656"
+ },
+ "outputs": [],
+ "source": [
+ "evaluation_df_post_tuning_stats.rougeL_precision.mean()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "kTnfegPcvC-P"
+ },
+ "outputs": [],
+ "source": [
+ "improvement = round(\n",
+ " (\n",
+ " (\n",
+ " evaluation_df_post_tuning_stats.rougeL_precision.mean()\n",
+ " - evaluation_df_stats.rougeL_precision.mean()\n",
+ " )\n",
+ " / evaluation_df_stats.rougeL_precision.mean()\n",
+ " )\n",
+ " * 100,\n",
+ " 2,\n",
+ ")\n",
+ "print(\n",
+ " f\"Model tuning has improved the rougeL_precision by {improvement}% (result might differ based on each tuning iteration)\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "qrs0o6-p6Ebr"
+ },
+ "outputs": [],
+ "source": [
+ "# Save predicitons\n",
+ "predictions_all = val_df.to_csv(\"validation_pred.csv\", index=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "yUuvCQ2O-1OW"
+ },
+ "source": [
+ "## Conclusion"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "me908QT9-26J"
+ },
+ "source": [
+ "Performance could be further improved:\n",
+ "- By adding more training samples. In general, improve your training data quality and/or quantity towards getting a more diverse and comprehensive dataset for your task\n",
+ "- By tuning the hyperparameters, such as epochs, learning rate multiplier or adapter size\n",
+ " - To find the optimal number of epochs for your dataset, we recommend experimenting with different values. While increasing epochs can lead to better performance, it's important to be mindful of overfitting, especially with smaller datasets. If you see signs of overfitting, reducing the number of epochs can help mitigate the issue\n",
+ "- You may try different prompt structures/formats and opt for the one with better performance"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "F7pq-hvxvy8_"
+ },
+ "source": [
+ "## Cleaning up"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "LokkxNS0vzM-"
+ },
+ "source": [
+ "To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud\n",
+ "project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.\n",
+ "\n",
+ "\n",
+ "Otherwise, you can delete the individual resources you created in this tutorial.\n",
+ "\n",
+ "Refer to this [instructions](https://cloud.google.com/vertex-ai/docs/tutorials/image-classification-custom/cleanup#delete_resources) to delete the resources from console."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "H38EHjj3vwib"
+ },
+ "outputs": [],
+ "source": [
+ "# Delete Experiment.\n",
+ "delete_experiments = True\n",
+ "if delete_experiments:\n",
+ " experiments_list = aiplatform.Experiment.list()\n",
+ " for experiment in experiments_list:\n",
+ " if experiment.resource_name == experiment_name:\n",
+ " print(experiment.resource_name)\n",
+ " experiment.delete()\n",
+ " break\n",
+ "\n",
+ "print(\"***\" * 10)\n",
+ "\n",
+ "# Delete Endpoint.\n",
+ "delete_endpoint = True\n",
+ "# If force is set to True, all deployed models on this\n",
+ "# Endpoint will be first undeployed.\n",
+ "if delete_endpoint:\n",
+ " for endpoint in aiplatform.Endpoint.list():\n",
+ " if endpoint.resource_name == tuned_model_endpoint_name:\n",
+ " print(endpoint.resource_name)\n",
+ " endpoint.delete(force=True)\n",
+ " break\n",
+ "\n",
+ "print(\"***\" * 10)\n",
+ "\n",
+ "# Delete Cloud Storage Bucket.\n",
+ "delete_bucket = True\n",
+ "if delete_bucket:\n",
+ " ! gsutil -m rm -r $BUCKET_URI"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "collapsed_sections": [
+ "Dw-gQpLXTe67",
+ "uYAjjpdG_cpP",
+ "5680557f-67bd-4e8c-a383-02ab655246c5"
+ ],
+ "name": "supervised_finetuning_using_gemini_on_image_data.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
diff --git a/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb b/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
index 79cd82d819..5acb34d8fa 100644
--- a/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
+++ b/gemini/tuning/vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb
@@ -127,22 +127,18 @@
"cell_type": "code",
"execution_count": 25,
"metadata": {
- "id": "XRvKdaPDTznN",
- "outputId": "30089739-607e-433e-8592-6e5b44e914c4",
- "colab": {
- "base_uri": "https://localhost:8080/"
- }
+ "id": "XRvKdaPDTznN"
},
"outputs": [
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"{'status': 'ok', 'restart': True}"
]
},
+ "execution_count": 25,
"metadata": {},
- "execution_count": 25
+ "output_type": "execute_result"
}
],
"source": [
@@ -250,50 +246,70 @@
"import dataclasses\n",
"import json\n",
"\n",
+ "from google.cloud import storage\n",
"import numpy as np\n",
"import tensorflow as tf\n",
"from vertexai.generative_models import Content, Part\n",
- "from vertexai.preview.tokenization import get_tokenizer_for_model\n",
- "from google.cloud import storage"
+ "from vertexai.preview.tokenization import get_tokenizer_for_model"
]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "wvqIIG1M0YCy"
+ },
"source": [
"### Load the dataset\n",
"\n",
"This example is for text only. Define the Google Cloud Storage URIs pointing to your training and validation datasets or continue using the URIs provided."
- ],
- "metadata": {
- "id": "wvqIIG1M0YCy"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": 135,
+ "metadata": {
+ "id": "oue9Q0GG0Rvk"
+ },
+ "outputs": [],
"source": [
"BASE_MODEL = \"gemini-1.5-pro-002\" # @param ['gemini-1.5-pro-002']{type:\"string\"}\n",
"training_dataset_uri = \"gs://github-repo/generative-ai/gemini/tuning/train_sft_train_samples.jsonl\" # @param {type:\"string\"}\n",
"validation_dataset_uri = \"gs://github-repo/generative-ai/gemini/tuning/val_sft_val_samples.jsonl\" # @param {type:\"string\"}\n",
"\n",
"tokenizer = get_tokenizer_for_model(\"gemini-1.5-pro-001\")"
- ],
- "metadata": {
- "id": "oue9Q0GG0Rvk"
- },
- "execution_count": 135,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "We'll now load the dataset and conduct some basic statistical analysis to understand its structure and content.\n"
- ],
"metadata": {
"id": "dbl6UD5P3LIH"
- }
+ },
+ "source": [
+ "We'll now load the dataset and conduct some basic statistical analysis to understand its structure and content.\n"
+ ]
},
{
"cell_type": "code",
+ "execution_count": 136,
+ "metadata": {
+ "id": "M9TrzApr1tYQ"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "Num training examples: 500\n",
+ "First example:\n",
+ "{'role': 'user', 'parts': [{'text': 'Honesty is usually the best policy. It is disrespectful to lie to someone. If you don\\'t want to date someone, you should say so. Sometimes it is easy to be honest. For example, you might be able to truthfully say, \"No, thank you, I already have a date for that party.\" Other times, you might need to find a kinder way to be nice. Maybe you are not attracted to the person. Instead of bluntly saying that, try saying, \"No, thank you, I just don\\'t think we would be a good fit.\" Avoid making up a phony excuse. For instance, don\\'t tell someone you will be out of town this weekend if you won\\'t be. There\\'s a chance that you might then run into them at the movies, which would definitely cause hurt feelings. A compliment sandwich is a really effective way to provide feedback. Essentially, you \"sandwich\" your negative comment between two positive things. Try using this method when you need to reject someone. An example of a compliment sandwich is to say something such as, \"You\\'re an awesome person. Unfortunately, I\\'m not interested in dating you. Someone else is going to be really lucky to date someone with such a great personality!\" You could also try, \"You are a really nice person. I\\'m only interested you as a friend. I like when we hang out in big groups together!\" Be sincere. If you offer false compliments, the other person will likely be able to tell and feel hurt. If you do not want to date someone, it is best to be upfront about your feelings. Do not beat around the bush. If your mind is made up, it is best to clearly state your response. If someone asks you to date them and you don\\'t want to, you can be direct and kind at the same time. State your answer clearly. You can make your feelings clear without purposefully hurting someone else\\'s feelings. Try smiling and saying, \"That sounds fun, but no thank you. I\\'m not interested in dating you.\" Don\\'t beat around the bush. If you do not want to accept the date, there is no need to say, \"Let me think about it.\" It is best to get the rejection over with. You don\\'t want to give someone false hope. Avoid saying something like, \"Let me check my schedule and get back to you.\" Try to treat the person the way you would want to be treated. This means that you should choose your words carefully. Be thoughtful in your response. It\\'s okay to pause before responding. You might be taken by surprise and need a moment to collect your thoughts. Say thank you. It is a compliment to be asked out. You can say, \"I\\'m flattered. Unfortunately, I can\\'t accept.\" Don\\'t laugh. Many people laugh nervously in awkward situations. Try to avoid giggling, as that is likely to result in hurt feelings. Sometimes it is not what you say, but how you say it. If you need to reject someone, think about factors other than your words. Non-verbal communication matters, too. Use the right tone of voice. Try to sound gentle but firm. Make eye contact. This helps convey that you are being serious, and also shows respect for the other person. If you are in public, try not to speak too loudly. It is not necessary for everyone around you to know that you are turning down a date.\\n\\nProvide a summary of the article in two or three sentences:\\n\\n'}]}\n",
+ "CountTokensResult(total_tokens=730)\n",
+ "{'role': 'model', 'parts': [{'text': 'Tell the truth. Use a \"compliment sandwich\". Be direct. Treat the person with respect. Communicate effectively.'}]}\n",
+ "CountTokensResult(total_tokens=23)\n",
+ "Num validation examples: 100\n"
+ ]
+ }
+ ],
"source": [
"example_training_dataset = []\n",
"example_validation_dataset = []\n",
@@ -304,9 +320,12 @@
" json.loads(dataset_line) for dataset_line in dataset_jsonl_file\n",
" ]\n",
"except KeyError as e:\n",
- " print(f\"KeyError: Please check if your file '{training_dataset_uri}' is a JSONL file with correct JSON format. Error: {e}\")\n",
+ " print(\n",
+ " f\"KeyError: Please check if your file '{training_dataset_uri}' is a JSONL file with correct JSON format. Error: {e}\"\n",
+ " )\n",
" # Exit the script if there's an error in the training data\n",
" import sys\n",
+ "\n",
" sys.exit(1)\n",
"\n",
"print()\n",
@@ -318,9 +337,12 @@
" json.loads(dataset_line) for dataset_line in dataset_jsonl_file\n",
" ]\n",
" except KeyError as e:\n",
- " print(f\"KeyError: Please check if your file '{validation_dataset_uri}' is a JSONL file with correct JSON format. Error: {e}\")\n",
+ " print(\n",
+ " f\"KeyError: Please check if your file '{validation_dataset_uri}' is a JSONL file with correct JSON format. Error: {e}\"\n",
+ " )\n",
" # Exit the script if there's an error in the validation data\n",
" import sys\n",
+ "\n",
" sys.exit(1)\n",
"\n",
"# Initial dataset stats\n",
@@ -334,34 +356,13 @@
"\n",
"if example_validation_dataset:\n",
" print(\"Num validation examples:\", len(example_validation_dataset))"
- ],
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "M9TrzApr1tYQ",
- "outputId": "2e19860a-0bf6-446c-8bcc-e262c3c3833c"
- },
- "execution_count": 136,
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "\n",
- "Num training examples: 500\n",
- "First example:\n",
- "{'role': 'user', 'parts': [{'text': 'Honesty is usually the best policy. It is disrespectful to lie to someone. If you don\\'t want to date someone, you should say so. Sometimes it is easy to be honest. For example, you might be able to truthfully say, \"No, thank you, I already have a date for that party.\" Other times, you might need to find a kinder way to be nice. Maybe you are not attracted to the person. Instead of bluntly saying that, try saying, \"No, thank you, I just don\\'t think we would be a good fit.\" Avoid making up a phony excuse. For instance, don\\'t tell someone you will be out of town this weekend if you won\\'t be. There\\'s a chance that you might then run into them at the movies, which would definitely cause hurt feelings. A compliment sandwich is a really effective way to provide feedback. Essentially, you \"sandwich\" your negative comment between two positive things. Try using this method when you need to reject someone. An example of a compliment sandwich is to say something such as, \"You\\'re an awesome person. Unfortunately, I\\'m not interested in dating you. Someone else is going to be really lucky to date someone with such a great personality!\" You could also try, \"You are a really nice person. I\\'m only interested you as a friend. I like when we hang out in big groups together!\" Be sincere. If you offer false compliments, the other person will likely be able to tell and feel hurt. If you do not want to date someone, it is best to be upfront about your feelings. Do not beat around the bush. If your mind is made up, it is best to clearly state your response. If someone asks you to date them and you don\\'t want to, you can be direct and kind at the same time. State your answer clearly. You can make your feelings clear without purposefully hurting someone else\\'s feelings. Try smiling and saying, \"That sounds fun, but no thank you. I\\'m not interested in dating you.\" Don\\'t beat around the bush. If you do not want to accept the date, there is no need to say, \"Let me think about it.\" It is best to get the rejection over with. You don\\'t want to give someone false hope. Avoid saying something like, \"Let me check my schedule and get back to you.\" Try to treat the person the way you would want to be treated. This means that you should choose your words carefully. Be thoughtful in your response. It\\'s okay to pause before responding. You might be taken by surprise and need a moment to collect your thoughts. Say thank you. It is a compliment to be asked out. You can say, \"I\\'m flattered. Unfortunately, I can\\'t accept.\" Don\\'t laugh. Many people laugh nervously in awkward situations. Try to avoid giggling, as that is likely to result in hurt feelings. Sometimes it is not what you say, but how you say it. If you need to reject someone, think about factors other than your words. Non-verbal communication matters, too. Use the right tone of voice. Try to sound gentle but firm. Make eye contact. This helps convey that you are being serious, and also shows respect for the other person. If you are in public, try not to speak too loudly. It is not necessary for everyone around you to know that you are turning down a date.\\n\\nProvide a summary of the article in two or three sentences:\\n\\n'}]}\n",
- "CountTokensResult(total_tokens=730)\n",
- "{'role': 'model', 'parts': [{'text': 'Tell the truth. Use a \"compliment sandwich\". Be direct. Treat the person with respect. Communicate effectively.'}]}\n",
- "CountTokensResult(total_tokens=23)\n",
- "Num validation examples: 100\n"
- ]
- }
]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "L5RhrH6r4NrC"
+ },
"source": [
"You can perform various error checks to validate that each tuning example in the dataset adheres to the format expected by the tuning API. Errors are categorized based on their nature for easier debugging. \n",
" \n",
@@ -375,20 +376,20 @@
"6. **Parts List Validation:** Verifies that the `parts` key contains a list. Error type: `missing_or_invalid_parts`.\n",
"7. **Part Format:** Checks if each part in the `parts` list is a dictionary and contains the key `text`. Error type: `invalid_part`.\n",
"8. **Text Validation:** Ensures that the `text` key has textual data and is a string. Error type: `missing_text`.\n",
- "9. **Consecutive Turns:** For the chat history, it is enforced that the message roles alternate (user, then model, then user, etc.). Error type: `consecutive_turns`. This check is not applicable for systemInstruction.\n",
- "\n",
- "\n",
- "\n"
- ],
- "metadata": {
- "id": "L5RhrH6r4NrC"
- }
+ "9. **Consecutive Turns:** For the chat history, it is enforced that the message roles alternate (user, then model, then user, etc.). Error type: `consecutive_turns`. This check is not applicable for systemInstruction.\n"
+ ]
},
{
"cell_type": "code",
+ "execution_count": 137,
+ "metadata": {
+ "id": "S2FVNbIX0R0n"
+ },
+ "outputs": [],
"source": [
"from collections import defaultdict\n",
"\n",
+ "\n",
"def validate_dataset_format(dataset):\n",
" \"\"\"Validates the dataset.\n",
"\n",
@@ -413,7 +414,12 @@
" if system_instruction:\n",
" try:\n",
" # Validate the list within \"parts\"\n",
- " validate_contents(system_instruction.get(\"parts\", []), format_errors, row_idx, is_system_instruction=True)\n",
+ " validate_contents(\n",
+ " system_instruction.get(\"parts\", []),\n",
+ " format_errors,\n",
+ " row_idx,\n",
+ " is_system_instruction=True,\n",
+ " )\n",
" except (TypeError, AttributeError, KeyError) as e:\n",
" print(\"Invalid input during system instruction validation: %s\", e)\n",
" format_errors[\"invalid_system_instruction\"].append(row_idx)\n",
@@ -456,37 +462,25 @@
" return\n",
"\n",
" # Skip key checks for system instructions\n",
- " if not is_system_instruction and (\"role\" not in content_item or \"parts\" not in content_item):\n",
+ " if not is_system_instruction and (\n",
+ " \"role\" not in content_item or \"parts\" not in content_item\n",
+ " ):\n",
" format_errors[\"content_item_missing_key\"].append(row_index)\n",
" return\n",
"\n",
" # ... (rest of the validation logic remains the same)"
- ],
- "metadata": {
- "id": "S2FVNbIX0R0n"
- },
- "execution_count": 137,
- "outputs": []
+ ]
},
{
"cell_type": "code",
- "source": [
- "validate_dataset_format(example_training_dataset)\n",
- "if example_validation_dataset:\n",
- " validate_dataset_format(example_validation_dataset)"
- ],
+ "execution_count": 138,
"metadata": {
- "id": "nSLnbeJ00R2v",
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "outputId": "8f680322-45cc-4047-e76d-f7c3932d0439"
+ "id": "nSLnbeJ00R2v"
},
- "execution_count": 138,
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"No errors found for this dataset.\n",
"Warning: systemInstruction is missing in some rows.\n",
@@ -494,10 +488,18 @@
"Warning: systemInstruction is missing in some rows.\n"
]
}
+ ],
+ "source": [
+ "validate_dataset_format(example_training_dataset)\n",
+ "if example_validation_dataset:\n",
+ " validate_dataset_format(example_validation_dataset)"
]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "SJFwhPid_7oG"
+ },
"source": [
"### Utils for dataset analysis and token counting\n",
"\n",
@@ -506,13 +508,15 @@
"* Load and inspect sample data from the training and validation datasets.\n",
"* Calculate token counts for messages to understand the dataset's characteristics.\n",
"* Define utility functions for calculating token distributions and dataset statistics. These will help assess the suitability of your data for supervised tuning and estimate potential costs."
- ],
- "metadata": {
- "id": "SJFwhPid_7oG"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": 140,
+ "metadata": {
+ "id": "al_uUWOP4Ss2"
+ },
+ "outputs": [],
"source": [
"@dataclasses.dataclass\n",
"class DatasetDistribution:\n",
@@ -564,15 +568,15 @@
"\n",
"MAX_TOKENS_PER_EXAMPLE = 32 * 1024\n",
"ESTIMATE_PADDING_TOKEN_PER_EXAMPLE = 8"
- ],
- "metadata": {
- "id": "al_uUWOP4Ss2"
- },
- "execution_count": 140,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": 141,
+ "metadata": {
+ "id": "21xgvR3g0R5P"
+ },
+ "outputs": [],
"source": [
"def calculate_distribution_for_population(population) -> DatasetDistribution:\n",
" \"\"\"Calculates the distribution from the population of values.\n",
@@ -607,12 +611,16 @@
" # Handle optional systemInstruction\n",
" system_instruction = example.get(\"systemInstruction\")\n",
" if system_instruction:\n",
- " text = system_instruction.get(\"parts\")[0].get(\"text\") # Assuming single part in system instruction\n",
+ " text = system_instruction.get(\"parts\")[0].get(\n",
+ " \"text\"\n",
+ " ) # Assuming single part in system instruction\n",
" input.append(Content(role=\"system\", parts=[Part.from_text(text)]))\n",
"\n",
" for content_item in example[\"contents\"]:\n",
" role = content_item.get(\"role\").lower()\n",
- " text = content_item.get(\"parts\")[0].get(\"text\") # Assuming single part in content item\n",
+ " text = content_item.get(\"parts\")[0].get(\n",
+ " \"text\"\n",
+ " ) # Assuming single part in content item\n",
"\n",
" if role.lower() == \"model\":\n",
" result = tokenizer.count_tokens(input)\n",
@@ -674,6 +682,7 @@
" ),\n",
" )\n",
"\n",
+ "\n",
"def print_dataset_stats(dataset):\n",
" dataset_stats = get_dataset_stats_for_dataset(dataset)\n",
" print(\"Below you can find the dataset statistics:\")\n",
@@ -693,42 +702,27 @@
" f\"User output token length distribution: {dataset_stats.user_output_token_length_stats}\"\n",
" )\n",
" return dataset_stats"
- ],
- "metadata": {
- "id": "21xgvR3g0R5P"
- },
- "execution_count": 141,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "Next you can analyze the structure and token counts of your datasets."
- ],
"metadata": {
"id": "-FF4ReY6Atw3"
- }
+ },
+ "source": [
+ "Next you can analyze the structure and token counts of your datasets."
+ ]
},
{
"cell_type": "code",
- "source": [
- "training_dataset_stats = print_dataset_stats(example_training_dataset)\n",
- "\n",
- "if example_validation_dataset:\n",
- " validation_dataset_stats = print_dataset_stats(example_validation_dataset)"
- ],
+ "execution_count": 142,
"metadata": {
- "id": "sZqsWno60R7O",
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "outputId": "8b78aea8-fbfa-4f48-9ab0-03bdca05f9f9"
+ "id": "sZqsWno60R7O"
},
- "execution_count": 142,
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"\n",
"0 examples may be over the 32768 token limit, they will be truncated during tuning.\n",
@@ -748,10 +742,19 @@
"User output token length distribution: DatasetDistribution(sum=3819, max=165, min=8, mean=38.19, median=32.0, p5=17, p95=76)\n"
]
}
+ ],
+ "source": [
+ "training_dataset_stats = print_dataset_stats(example_training_dataset)\n",
+ "\n",
+ "if example_validation_dataset:\n",
+ " validation_dataset_stats = print_dataset_stats(example_validation_dataset)"
]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "KFWbXu17DfiS"
+ },
"source": [
"### Cost Estimation for Supervised Fine-tuning\n",
"In this final section, you will estimate the total cost for supervised fine-tuning based on the number of tokens processed. The number of tokens used will be charged to you. Please refer to the [pricing page for the rate](https://cloud.google.com/vertex-ai/generative-ai/pricing#gemini-models).\n",
@@ -759,13 +762,25 @@
"**Important Note:** The final cost may vary slightly from this estimate due to dataset formatting and truncation logic during training.\n",
"\n",
"The code calculates the total number of billable tokens by summing up the tokens from the training dataset and (if provided) the validation dataset. Then, it estimates the total cost by multiplying the total billable tokens with the number of training epochs (default is 4)."
- ],
- "metadata": {
- "id": "KFWbXu17DfiS"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": 143,
+ "metadata": {
+ "id": "k3ZJ_8fQ0R9x"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Dataset has ~309397 tokens that will be charged\n",
+ "By default, you'll train for 4 epochs on this dataset.\n",
+ "By default, you'll be charged for ~1237588 tokens.\n"
+ ]
+ }
+ ],
"source": [
"epoch_count = 4 # @param {type:\"integer\"}\n",
"if epoch_count is None:\n",
@@ -785,51 +800,41 @@
"print(\n",
" f\"By default, you'll be charged for ~{epoch_count * total_number_of_billable_tokens} tokens.\"\n",
")"
- ],
- "metadata": {
- "id": "k3ZJ_8fQ0R9x",
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "outputId": "9525fd37-0fc4-44dc-a97e-99a9e44748c0"
- },
- "execution_count": 143,
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "Dataset has ~309397 tokens that will be charged\n",
- "By default, you'll train for 4 epochs on this dataset.\n",
- "By default, you'll be charged for ~1237588 tokens.\n"
- ]
- }
]
},
{
"cell_type": "markdown",
- "source": [
- "## Convert `Gemini 1.0 Pro` fine-tuning dataset to `Gemini 1.5 Pro` dataset."
- ],
"metadata": {
"id": "K1EMMeRfH14a"
- }
+ },
+ "source": [
+ "## Convert `Gemini 1.0 Pro` fine-tuning dataset to `Gemini 1.5 Pro` dataset."
+ ]
},
{
"cell_type": "code",
- "source": [
- "source_uri = \"gs://next-23-tuning-demo/example-fine-tuning.json\" # @param {type:\"string\"}\n",
- "destination_uri = \"gs://next-23-tuning-demo/new-data-format.jsonl\" # @param {type:\"string\"}\n",
- "system_instruction = \"You are a helpful and friendly AI assistant\" # Optional"
- ],
+ "execution_count": 144,
"metadata": {
"id": "oOKZgdSLJUFx"
},
- "execution_count": 144,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "source_uri = (\n",
+ " \"gs://next-23-tuning-demo/example-fine-tuning.json\" # @param {type:\"string\"}\n",
+ ")\n",
+ "destination_uri = (\n",
+ " \"gs://next-23-tuning-demo/new-data-format.jsonl\" # @param {type:\"string\"}\n",
+ ")\n",
+ "system_instruction = \"You are a helpful and friendly AI assistant\" # Optional"
+ ]
},
{
"cell_type": "code",
+ "execution_count": 117,
+ "metadata": {
+ "id": "fgNjg3Y4CSq8"
+ },
+ "outputs": [],
"source": [
"def convert_jsonl_format(\n",
" source_uri: str,\n",
@@ -857,7 +862,7 @@
" dest_blob = dest_bucket.blob(dest_blob_name)\n",
"\n",
" # Download the source JSONL file\n",
- " source_data = source_blob.download_as_string().decode('utf-8')\n",
+ " source_data = source_blob.download_as_string().decode(\"utf-8\")\n",
"\n",
" new_data = []\n",
" for line in source_data.splitlines():\n",
@@ -890,16 +895,13 @@
" if system_instruction:\n",
" new_json_data[\"systemInstruction\"] = {\n",
" \"role\": \"system\",\n",
- " \"parts\": [{\"text\": system_instruction}]\n",
+ " \"parts\": [{\"text\": system_instruction}],\n",
" }\n",
"\n",
" new_json_data[\"contents\"] = [] # Initialize \"contents\" after \"systemInstruction\"\n",
"\n",
" for message in json_data.get(\"messages\", []):\n",
- " new_message = {\n",
- " \"role\": message[\"role\"],\n",
- " \"parts\": [{\"text\": message[\"content\"]}]\n",
- " }\n",
+ " new_message = {\"role\": message[\"role\"], \"parts\": [{\"text\": message[\"content\"]}]}\n",
" new_json_data[\"contents\"].append(new_message)\n",
"\n",
" return new_json_data\n",
@@ -918,46 +920,37 @@
" raise ValueError(\"Invalid Google Cloud Storage URI\")\n",
" parts = gcs_uri[5:].split(\"/\", 1)\n",
" return parts[0], parts[1]"
- ],
- "metadata": {
- "id": "fgNjg3Y4CSq8"
- },
- "execution_count": 117,
- "outputs": []
+ ]
},
{
"cell_type": "code",
- "source": [
- "convert_jsonl_format(source_uri, destination_uri, system_instruction)"
- ],
+ "execution_count": 118,
"metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "WAqrR4yDH1LT",
- "outputId": "2e351e1c-ee6c-40e6-857f-581b12fe3872"
+ "id": "WAqrR4yDH1LT"
},
- "execution_count": 118,
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Successfully converted and uploaded to gs://next-23-tuning-demo/new-data-format.jsonl\n"
]
}
+ ],
+ "source": [
+ "convert_jsonl_format(source_uri, destination_uri, system_instruction)"
]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "9k1GJaFIEvd-"
+ },
"source": [
"## Tuning token count and cost estimation for `Gemini 1.0 pro` legacy users.\n",
"\n",
"Only use this part if you still use `Gemini 1.0 pro`. Its best to upgrade to using [`gemini-1.5-pro-002`](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning)."
- ],
- "metadata": {
- "id": "9k1GJaFIEvd-"
- }
+ ]
},
{
"cell_type": "markdown",
@@ -998,8 +991,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "PTvunHqRTHqe",
- "outputId": "8d1aabc9-cf3b-4150-f768-c40d0d92c237"
+ "id": "PTvunHqRTHqe"
},
"outputs": [
{
@@ -1154,8 +1146,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "pUCpEmEFM0eX",
- "outputId": "1bf39ccb-4898-4c44-9a6e-557e58694d7a"
+ "id": "pUCpEmEFM0eX"
},
"outputs": [
{
@@ -1384,8 +1375,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "uOWsUbwVXoTU",
- "outputId": "c644fa82-1de4-4ba5-f9cf-44f4232917ee"
+ "id": "uOWsUbwVXoTU"
},
"outputs": [
{
@@ -1449,8 +1439,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "DVIpbaGYRJQc",
- "outputId": "2e2f78cc-2005-4965-af26-a1cc5627e7ee"
+ "id": "DVIpbaGYRJQc"
},
"outputs": [
{
@@ -1492,8 +1481,8 @@
"dmWOrTJ3gx13",
"DF4l8DTdWgPY"
],
- "toc_visible": true,
- "provenance": []
+ "name": "vertexai_supervised_tuning_token_count_and_cost_estimation.ipynb",
+ "toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
@@ -1502,4 +1491,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
-}
\ No newline at end of file
+}
From d0998bd22b16f1296e927c4860fba9e30e27a289 Mon Sep 17 00:00:00 2001
From: Riccardo Carlesso
Date: Wed, 2 Oct 2024 17:37:25 +0200
Subject: [PATCH 33/76] fix: Update compare_generative_ai_models.ipynb (#1202)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Adding `plotly` as the notebook fails otherwise.
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [ ] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [ ] You are listed as the author in your notebook or README file.
- [ ] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [ ] Make your Pull Request title in the
specification.
- [ ] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [ ] Appropriate docs were updated (if necessary)
Fixes # 🦕
---
gemini/evaluation/compare_generative_ai_models.ipynb | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gemini/evaluation/compare_generative_ai_models.ipynb b/gemini/evaluation/compare_generative_ai_models.ipynb
index 995b75207a..524332bfd5 100644
--- a/gemini/evaluation/compare_generative_ai_models.ipynb
+++ b/gemini/evaluation/compare_generative_ai_models.ipynb
@@ -103,7 +103,7 @@
},
"outputs": [],
"source": [
- "%pip install --upgrade --user --quiet google-cloud-aiplatform[evaluation]"
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform[evaluation] plotly"
]
},
{
From 800d05b4a3a28fe23b971e702e76df0c41e6d308 Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Wed, 2 Oct 2024 11:44:49 -0500
Subject: [PATCH 34/76] ci: Updates to nox formatter (#1203)
- Changed `nox -s format` to run nbqa formatters and allowlist sort
- Cleaned up Contributing guide instructions
- Fixed lint/spelling errors
---------
Co-authored-by: Owl Bot
---
.github/actions/spelling/allow.txt | 17 +++++++
.github/actions/spelling/excludes.txt | 1 -
.github/workflows/linter.yaml | 1 +
CONTRIBUTING.md | 36 ++-----------
.../src/champion_challenger_pipeline.py | 2 +-
.../src/submit_pipeline.py | 2 +-
.../finance-advisor-spanner/database.py | 2 +-
.../backend/indexing/run_parse_embed_index.py | 2 +-
.../backend/indexing/vector_search_utils.py | 1 +
.../backend/rag/async_extensions.py | 1 +
.../backend/rag/claude_vertex.py | 1 +
.../llamaindex-rag/backend/rag/evaluate.py | 1 +
.../backend/rag/node_reranker.py | 1 +
.../backend/rag/parent_retriever.py | 1 +
.../llamaindex-rag/backend/rag/prompts.py | 1 +
.../backend/rag/qa_followup_retriever.py | 1 +
.../NLP2SQL_using_dynamic_RAG.ipynb | 4 +-
.../small_to_big_rag/small_to_big_rag.ipynb | 13 +++--
...nslation_training_data_tsv_generator.ipynb | 6 +--
noxfile.py | 51 ++++++++++---------
owlbot.py | 11 ----
search/web-app/consts.py | 10 ++--
22 files changed, 76 insertions(+), 90 deletions(-)
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 13d6ed4657..8fc2c985e3 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -139,6 +139,7 @@ LUVBPTK
Ladhak
Lego
Llion
+Logrus
Lottry
MSCHF
MSGSEND
@@ -158,6 +159,7 @@ Mvar
NARI
NCCREATE
NDEBUG
+NGRAM
NGRAMS
NMT
NOMINMAX
@@ -213,6 +215,7 @@ Shazeer
Shenzhou
Simpsons
Siri
+Skaffold
Sketchfab
Smartbuy
Storrer
@@ -287,11 +290,13 @@ arXiv
aretrieve
argmax
arun
+arxiv
astype
autoflake
autogen
automl
autoptr
+autorater
autosxs
backticks
bagchi
@@ -299,6 +304,7 @@ barmode
barpolar
baxis
bbc
+bigframes
bigquery
bitcoin
boundings
@@ -310,11 +316,13 @@ caxis
cctv
cfbundle
chatbots
+chromadb
claude
clickable
cmap
codebase
codebases
+codefile
codelab
codelabs
colab
@@ -357,6 +365,7 @@ ekg
elous
emb
embs
+embvs
emojis
ename
epoc
@@ -370,6 +379,7 @@ fewshot
ffi
figsize
fillmode
+fillna
firestore
flac
floormat
@@ -385,6 +395,7 @@ fulltext
funtion
gapic
gboolean
+gbq
gchar
gcloud
gcs
@@ -406,7 +417,9 @@ goooooood
gpt
gpu
gradio
+gradlew
gridcolor
+grpcio
gspread
gsutil
gtk
@@ -463,6 +476,7 @@ lexer
linalg
linecolor
linted
+linting
llm
llms
logprobs
@@ -596,6 +610,7 @@ tabular
tagline
tencel
termcolor
+terraform
textno
tfhub
tfidf
@@ -608,6 +623,7 @@ tobytes
toself
tqdm
tritan
+tsv
ubuntu
undst
unigram
@@ -645,6 +661,7 @@ xcscheme
xctest
xlabel
xticks
+xxxxxxxx
yaxes
yaxis
ylabel
diff --git a/.github/actions/spelling/excludes.txt b/.github/actions/spelling/excludes.txt
index b551f43871..74ed9f32e1 100644
--- a/.github/actions/spelling/excludes.txt
+++ b/.github/actions/spelling/excludes.txt
@@ -108,4 +108,3 @@ ignore$
^\Qsearch/bulk-question-answering/bulk_question_answering_output.tsv\E$
^\Q.github/workflows/issue_assigner/assign_issue.py\E$
^\Qnoxfile.py\E$
-^\owlbot.py\E$
diff --git a/.github/workflows/linter.yaml b/.github/workflows/linter.yaml
index 11b68f1345..e9e1139ec1 100644
--- a/.github/workflows/linter.yaml
+++ b/.github/workflows/linter.yaml
@@ -57,6 +57,7 @@ jobs:
LOG_LEVEL: WARN
SHELLCHECK_OPTS: -e SC1091 -e 2086
VALIDATE_ALL_CODEBASE: false
+ VALIDATE_PYTHON_PYINK: false
VALIDATE_PYTHON_ISORT: false
VALIDATE_TYPESCRIPT_STANDARD: false # super-linter/super-linter#4445
VALIDATE_CHECKOV: false
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index fd7687a10f..fa992d0654 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -17,7 +17,7 @@ again.
## Notebook Template
-If you're creating a Jupyter Notebook, use `/gemini/getting-started/intro_gemini_python.ipynb` as a template.
+If you're creating a Jupyter Notebook, use [`notebook_template.ipynb`](notebook_template.ipynb) as a template.
## Code Quality Checks
@@ -25,36 +25,11 @@ All notebooks in this project are checked for formatting and style, to ensure a
consistent experience. To test notebooks prior to submitting a pull request,
you can follow these steps.
-From a command-line terminal (e.g. from Vertex Workbench or locally), install
-the code analysis tools:
-
-```shell
-pip3 install --user -U nbqa black flake8 isort pyupgrade git+https://github.com/tensorflow/docs
-```
-
-You'll likely need to add the directory where these were installed to your PATH:
-
-```shell
-export PATH="$HOME/.local/bin:$PATH"
-```
-
-Then, set an environment variable for your notebook (or directory):
-
-```shell
-export notebook="your-notebook.ipynb"
-```
-
-Finally, run this code block to check for errors. Each step will attempt to
-automatically fix any issues. If the fixes can't be performed automatically,
+From a command-line terminal (e.g. from Vertex AI Workbench or locally),
+run this code block to format your code.
+If the fixes can't be performed automatically,
then you will need to manually address them before submitting your PR.
-Note: For official, only submit one notebook per PR.
-
-```shell
-python3 -m pip install -U -r .github/workflows/notebook_linter/requirements.txt
-.github/workflows/notebook_linter/run_linter.sh
-```
-
```shell
python3 -m pip install --upgrade nox
nox -s format
@@ -69,8 +44,7 @@ information on using pull requests.
## Community Guidelines
-This project follows [Google's Open Source Community
-Guidelines](https://opensource.google/conduct/).
+This project follows [Google's Open Source Community Guidelines](https://opensource.google/conduct/).
## Contributor Guide
diff --git a/gemini/evaluation/legacy/champion_challenger_eval_pipeline/src/champion_challenger_pipeline.py b/gemini/evaluation/legacy/champion_challenger_eval_pipeline/src/champion_challenger_pipeline.py
index 477b4aaee0..fa2cf13d5d 100644
--- a/gemini/evaluation/legacy/champion_challenger_eval_pipeline/src/champion_challenger_pipeline.py
+++ b/gemini/evaluation/legacy/champion_challenger_eval_pipeline/src/champion_challenger_pipeline.py
@@ -19,7 +19,7 @@
# mypy: disable-error-code="no-untyped-def, valid-type, no-untyped-def, assignment"
-""" Champion Challenger Auto Side-by-side Evaluation Vertex AI Pipelines """
+"""Champion Challenger Auto Side-by-side Evaluation Vertex AI Pipelines"""
from typing import NamedTuple
diff --git a/gemini/evaluation/legacy/champion_challenger_eval_pipeline/src/submit_pipeline.py b/gemini/evaluation/legacy/champion_challenger_eval_pipeline/src/submit_pipeline.py
index 5fcf069b0b..4bae460e7b 100644
--- a/gemini/evaluation/legacy/champion_challenger_eval_pipeline/src/submit_pipeline.py
+++ b/gemini/evaluation/legacy/champion_challenger_eval_pipeline/src/submit_pipeline.py
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-""" Submit Vertex AI Pipeline run"""
+"""Submit Vertex AI Pipeline run"""
from google.cloud import aiplatform
diff --git a/gemini/sample-apps/finance-advisor-spanner/database.py b/gemini/sample-apps/finance-advisor-spanner/database.py
index da7a3f24aa..0b4b406019 100644
--- a/gemini/sample-apps/finance-advisor-spanner/database.py
+++ b/gemini/sample-apps/finance-advisor-spanner/database.py
@@ -1,4 +1,4 @@
-"""This file is for database operations done by the application """
+"""This file is for database operations done by the application"""
# pylint: disable=line-too-long
import os
diff --git a/gemini/sample-apps/llamaindex-rag/backend/indexing/run_parse_embed_index.py b/gemini/sample-apps/llamaindex-rag/backend/indexing/run_parse_embed_index.py
index af343adce4..96e69acc7e 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/indexing/run_parse_embed_index.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/indexing/run_parse_embed_index.py
@@ -1,4 +1,4 @@
-"""Master script for parsing, embedding
+"""Master script for parsing, embedding
and indexing data living in a GCS bucket"""
import asyncio
diff --git a/gemini/sample-apps/llamaindex-rag/backend/indexing/vector_search_utils.py b/gemini/sample-apps/llamaindex-rag/backend/indexing/vector_search_utils.py
index 03ac5cf727..f0df78c3df 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/indexing/vector_search_utils.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/indexing/vector_search_utils.py
@@ -1,4 +1,5 @@
"""Module for vector search utils."""
+
from google.cloud import aiplatform
diff --git a/gemini/sample-apps/llamaindex-rag/backend/rag/async_extensions.py b/gemini/sample-apps/llamaindex-rag/backend/rag/async_extensions.py
index 981d7dc4d8..a539e1fc7e 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/rag/async_extensions.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/rag/async_extensions.py
@@ -1,4 +1,5 @@
"""Extensions to Llamaindex Base classes to allow for asynchronous execution"""
+
from collections.abc import Sequence
import logging
diff --git a/gemini/sample-apps/llamaindex-rag/backend/rag/claude_vertex.py b/gemini/sample-apps/llamaindex-rag/backend/rag/claude_vertex.py
index 2ce3bdea67..4610f226fc 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/rag/claude_vertex.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/rag/claude_vertex.py
@@ -1,4 +1,5 @@
"""Llamaindex LLM implementation of Claude Vertex AI"""
+
from typing import Any
from anthropic import AnthropicVertex, AsyncAnthropicVertex
diff --git a/gemini/sample-apps/llamaindex-rag/backend/rag/evaluate.py b/gemini/sample-apps/llamaindex-rag/backend/rag/evaluate.py
index 15fe6ff352..ad794ee9dc 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/rag/evaluate.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/rag/evaluate.py
@@ -1,4 +1,5 @@
"""Custom LLM Evaluator"""
+
import asyncio
from collections.abc import Callable
import logging
diff --git a/gemini/sample-apps/llamaindex-rag/backend/rag/node_reranker.py b/gemini/sample-apps/llamaindex-rag/backend/rag/node_reranker.py
index 89d5354b1b..9f5204790b 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/rag/node_reranker.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/rag/node_reranker.py
@@ -1,4 +1,5 @@
"""Node Re-ranker class for async execution"""
+
from collections.abc import Callable
import logging
diff --git a/gemini/sample-apps/llamaindex-rag/backend/rag/parent_retriever.py b/gemini/sample-apps/llamaindex-rag/backend/rag/parent_retriever.py
index b4d130b61c..25db908251 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/rag/parent_retriever.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/rag/parent_retriever.py
@@ -1,4 +1,5 @@
"""Custom retriever which implements parent retrieval"""
+
import logging
from llama_index.core import QueryBundle
diff --git a/gemini/sample-apps/llamaindex-rag/backend/rag/prompts.py b/gemini/sample-apps/llamaindex-rag/backend/rag/prompts.py
index f53033d806..0473331119 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/rag/prompts.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/rag/prompts.py
@@ -1,4 +1,5 @@
"""Prompt management class"""
+
from dataclasses import asdict, dataclass, field
SYSTEM_PROMPT = "You are an expert assistant specializing in \
diff --git a/gemini/sample-apps/llamaindex-rag/backend/rag/qa_followup_retriever.py b/gemini/sample-apps/llamaindex-rag/backend/rag/qa_followup_retriever.py
index c23ca8d4f9..1a0ef40f83 100644
--- a/gemini/sample-apps/llamaindex-rag/backend/rag/qa_followup_retriever.py
+++ b/gemini/sample-apps/llamaindex-rag/backend/rag/qa_followup_retriever.py
@@ -1,5 +1,6 @@
"""Custom retriever which implements
retrieval based on hypothetical questions"""
+
import logging
from llama_index.core import QueryBundle
diff --git a/gemini/use-cases/retrieval-augmented-generation/NLP2SQL_using_dynamic_RAG.ipynb b/gemini/use-cases/retrieval-augmented-generation/NLP2SQL_using_dynamic_RAG.ipynb
index 0efe670b24..b633815e87 100644
--- a/gemini/use-cases/retrieval-augmented-generation/NLP2SQL_using_dynamic_RAG.ipynb
+++ b/gemini/use-cases/retrieval-augmented-generation/NLP2SQL_using_dynamic_RAG.ipynb
@@ -1196,9 +1196,7 @@
}
],
"source": [
- "text_query = (\n",
- " \"展示Foreign Currency Transactions 2023年10月的信息\" # @param {type:\"string\"}\n",
- ")\n",
+ "text_query = \"展示Foreign Currency Transactions 2023年10月的信息\" # @param {type:\"string\"}\n",
"find_similar_questions(df, text_query)"
]
},
diff --git a/gemini/use-cases/retrieval-augmented-generation/small_to_big_rag/small_to_big_rag.ipynb b/gemini/use-cases/retrieval-augmented-generation/small_to_big_rag/small_to_big_rag.ipynb
index be381a2c49..0b595ee9af 100644
--- a/gemini/use-cases/retrieval-augmented-generation/small_to_big_rag/small_to_big_rag.ipynb
+++ b/gemini/use-cases/retrieval-augmented-generation/small_to_big_rag/small_to_big_rag.ipynb
@@ -551,7 +551,7 @@
"source": [
"### Convert summaries to embeddings\n",
"\n",
- "Next, we'll convert the text summaries of each code file to vector embeddings. We'll store thsoe embeddings in an in-memory Chroma database. "
+ "Next, we'll convert the text summaries of each code file to vector embeddings. We'll store those embeddings in an in-memory Chroma database. "
]
},
{
@@ -588,14 +588,14 @@
"for index, row in df.iterrows():\n",
" fn = row[\"filename\"]\n",
" print(\"Getting embedding for: \", fn)\n",
- " summ = row[\"summary\"]\n",
- " print(summ)\n",
- " e = get_text_embedding(summ)\n",
+ " summary = row[\"summary\"]\n",
+ " print(summary)\n",
+ " e = get_text_embedding(summary)\n",
" print(e)\n",
" # add vector embedding to in-memory Chroma database.\n",
" # the \"small\" summary embedding is linked to the \"big\" raw code file through the metadata key, \"filename.\"\n",
" collection.add(\n",
- " embeddings=[e], documents=[summ], metadatas=[{\"filename\": fn}], ids=[fn]\n",
+ " embeddings=[e], documents=[summary], metadatas=[{\"filename\": fn}], ids=[fn]\n",
" )"
]
},
@@ -630,7 +630,7 @@
"id": "51cf5003f530"
},
"source": [
- "The function below shows how we'll first try to inference Gemini wth small chunks (code file summaries). If Gemini can answer with that context, we return its response and we're done. If Gemini needs more context, we'll ask it what file it would like to see. Then, we'll directly retrieve the code file from the DataFrame, and pass it into Gemini again as the \"large\" context."
+ "The function below shows how we'll first try to inference Gemini with small chunks (code file summaries). If Gemini can answer with that context, we return its response and we're done. If Gemini needs more context, we'll ask it what file it would like to see. Then, we'll directly retrieve the code file from the DataFrame, and pass it into Gemini again as the \"large\" context."
]
},
{
@@ -826,7 +826,6 @@
" resource.null_resource.apply_deployment\n",
" ]\n",
"}\n",
- "\n",
"\"\"\""
]
},
diff --git a/language/translation/translation_training_data_tsv_generator.ipynb b/language/translation/translation_training_data_tsv_generator.ipynb
index 4564732913..edd05dab7f 100644
--- a/language/translation/translation_training_data_tsv_generator.ipynb
+++ b/language/translation/translation_training_data_tsv_generator.ipynb
@@ -435,9 +435,9 @@
" \"Length of a pair detected to be greater than 200 words.\"\n",
" )\n",
" print(\"this pair will be skipped\")\n",
- " more_than_200_words[\" \".join(src_row_data)] = (\n",
- " \" \".join(ref_row_data)\n",
- " )\n",
+ " more_than_200_words[\n",
+ " \" \".join(src_row_data)\n",
+ " ] = \" \".join(ref_row_data)\n",
" else:\n",
" tsv_f.write(\n",
" \" \".join(src_row_data)\n",
diff --git a/noxfile.py b/noxfile.py
index 1ef53e1cea..30eecac2fc 100644
--- a/noxfile.py
+++ b/noxfile.py
@@ -110,9 +110,21 @@ def format(session):
Run isort to sort imports. Then run black
to format code to uniform standard.
"""
- session.install(BLACK_VERSION, ISORT_VERSION, "autoflake", "ruff")
- # Use the --fss option to sort imports using strict alphabetical order.
- # See https://pycqa.github.io/isort/docs/configuration/options.html#force-sort-within-sections
+ session.install(
+ "git+https://github.com/tensorflow/docs",
+ "ipython",
+ "jupyter",
+ "nbconvert",
+ "types-requests",
+ BLACK_VERSION,
+ "blacken-docs",
+ "pyupgrade",
+ ISORT_VERSION,
+ "nbqa",
+ "autoflake",
+ "nbformat",
+ "ruff",
+ )
session.run(
"autoflake",
"-i",
@@ -126,6 +138,8 @@ def format(session):
"--fix-only",
*LINT_PATHS,
)
+ # Use the --fss option to sort imports using strict alphabetical order.
+ # See https://pycqa.github.io/isort/docs/configuration/options.html#force-sort-within-sections
session.run(
"isort",
"--fss",
@@ -135,28 +149,6 @@ def format(session):
"black",
*LINT_PATHS,
)
-
-
-@nox.session(python=DEFAULT_PYTHON_VERSION)
-def format_notebooks(session):
- """
- Run isort to sort imports. Then run black
- to format code to uniform standard.
- """
- session.install(
- "git+https://github.com/tensorflow/docs",
- "ipython",
- "jupyter",
- "nbconvert",
- "types-requests",
- "black",
- "blacken-docs",
- "pyupgrade",
- "isort",
- "nbqa",
- "autoflake",
- "nbformat",
- )
session.run("python3", ".github/workflows/update_notebook_links.py", ".")
session.run(
"nbqa", "pyupgrade", "--exit-zero-even-if-changed", "--py310-plus", *LINT_PATHS
@@ -176,6 +168,15 @@ def format_notebooks(session):
session.run("nbqa", "blacken-docs", "--nbqa-md", *LINT_PATHS)
session.run("python3", "-m", "tensorflow_docs.tools.nbfmt", *LINT_PATHS)
+ # Sort Spelling Allowlist
+ spelling_allow_file = ".github/actions/spelling/allow.txt"
+
+ with open(spelling_allow_file, encoding="utf-8") as file:
+ unique_words = sorted(set(file))
+
+ with open(spelling_allow_file, "w", encoding="utf-8") as file:
+ file.writelines(unique_words)
+
def install_unittest_dependencies(session, *constraints):
standard_deps = UNIT_TEST_STANDARD_DEPENDENCIES + UNIT_TEST_DEPENDENCIES
diff --git a/owlbot.py b/owlbot.py
index 9f1ff224bb..8b9c29594d 100644
--- a/owlbot.py
+++ b/owlbot.py
@@ -33,14 +33,3 @@
# ----------------------------------------------------------------------------
s.shell.run(["nox", "-s", "format"], hide_output=False)
-
-s.shell.run(["nox", "-s", "format_notebooks"], hide_output=False)
-
-# Sort Spelling Allowlist
-spelling_allow_file = ".github/actions/spelling/allow.txt"
-
-with open(spelling_allow_file, encoding="utf-8") as file:
- unique_words = sorted(set(file))
-
-with open(spelling_allow_file, "w", encoding="utf-8") as file:
- file.writelines(unique_words)
diff --git a/search/web-app/consts.py b/search/web-app/consts.py
index 7a2145d620..c30c91308e 100644
--- a/search/web-app/consts.py
+++ b/search/web-app/consts.py
@@ -33,11 +33,11 @@
CUSTOM_UI_DATASTORE_IDS = [
{
"name": "Google Cloud Website",
- "engine_id": "google-cloud-site-search_xxxxxxxx",
+ "engine_id": "google-cloud-site-search",
},
{
"name": "Google Merchandise Store (Advanced Indexing)",
- "engine_id": "google-merch-store_xxxxxxxx",
+ "engine_id": "google-merch-store",
},
]
@@ -45,15 +45,15 @@
IMAGE_SEARCH_DATASTORE_IDs = [
{
"name": "Google Merchandise Store",
- "engine_id": "google-merch-store_xxxxxxx",
+ "engine_id": "google-merch-store",
}
]
RECOMMENDATIONS_DATASTORE_IDs = [
{
"name": "arXiv Natural Language Papers",
- "datastore_id": "arxiv_xxxxxxxxxx",
- "engine_id": "arxiv-personalize_xxxxxxxx",
+ "datastore_id": "arxiv",
+ "engine_id": "arxiv-personalize",
}
]
From b35e6dc13da3108ca082302c8d0319f26c3bc451 Mon Sep 17 00:00:00 2001
From: Deepak moonat
Date: Thu, 3 Oct 2024 10:35:19 +0530
Subject: [PATCH 35/76] update: markdown (#1206)
# Description
Update the author section with correct markdown format.
Update Tuning section with additional info
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
---
...inetuning_using_gemini_on_image_data.ipynb | 30 +++++++++++++------
1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/gemini/tuning/supervised_finetuning_using_gemini_on_image_data.ipynb b/gemini/tuning/supervised_finetuning_using_gemini_on_image_data.ipynb
index a9cf0b47bb..cd32b7ac72 100644
--- a/gemini/tuning/supervised_finetuning_using_gemini_on_image_data.ipynb
+++ b/gemini/tuning/supervised_finetuning_using_gemini_on_image_data.ipynb
@@ -63,9 +63,9 @@
"id": "MgVK7IeKpW27"
},
"source": [
- "| | | |\n",
- "|-|-|-|\n",
- "|Author(s) | [Deepak Moonat](https://github.com/dmoonat)"
+ "| | |\n",
+ "|-|-|\n",
+ "|Author(s) | [Deepak Moonat](https://github.com/dmoonat) |"
]
},
{
@@ -143,11 +143,11 @@
"id": "jCMczwd00N9T"
},
"source": [
- "Dataset used in this notebook is about image captioning.\n",
+ "Dataset used in this notebook is about image captioning. [Reference](https://ai.google.dev/gemma/docs/paligemma/fine-tuning-paligemma#download_the_model_checkpoint)\n",
"\n",
- "[Reference](https://ai.google.dev/gemma/docs/paligemma/fine-tuning-paligemma#download_the_model_checkpoint)\n",
- "\n",
- "Licensed under the Creative Commons Attribution 4.0 License"
+ "```\n",
+ "Licensed under the Creative Commons Attribution 4.0 License\n",
+ "```"
]
},
{
@@ -1484,20 +1484,32 @@
"## Fine-tune the model"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "d1263b90fbc4"
+ },
+ "source": [
+ "When you run a supervised fine-tuning job, the model learns additional parameters that help it encode the necessary information to perform the desired task or learn the desired behavior. These parameters are used during inference. The output of the tuning job is a new model that combines the newly learned parameters with the original model."
+ ]
+ },
{
"cell_type": "markdown",
"metadata": {
"id": "EyqBRoY5rscI"
},
"source": [
+ "**Tuning Job parameters**\n",
+ "\n",
"- `source_model`: Specifies the base Gemini model version you want to fine-tune.\n",
"- `train_dataset`: Path to your training data in JSONL format.\n",
"\n",
- " \n",
"\n",
- " *Optional parameters*\n",
+ " *Optional parameters*\n",
" - `validation_dataset`: If provided, this data is used to evaluate the model during tuning.\n",
" - `tuned_model_display_name`: Display name for the tuned model.\n",
+ " \n",
+ " *Hyperparameters* \n",
" - `epochs`: The number of training epochs to run.\n",
" - `learning_rate_multiplier`: A value to scale the learning rate during training.\n",
" - `adapter_size` : Gemini 1.5 Pro supports Adapter length [1, 4], default value is 4.\n"
From ab2a7dfb09d9a711ef8ae54f6b8ef4754364992f Mon Sep 17 00:00:00 2001
From: Ivan Nardini <88703814+inardini@users.noreply.github.com>
Date: Thu, 3 Oct 2024 18:47:42 +0200
Subject: [PATCH 36/76] feat: vapo with custom function (#1205)
# Description
This notebook demonstrates how to leverage Vertex AI Prompt Optimizer
(Preview) to optimize a simple prompt for a Gemini model using your own
metric.
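As a rough illustration of the "your own metric" idea only (not the notebook's actual interface, which may expect a different signature or a deployed scoring service), a custom metric can be thought of as a function that maps a model response and a reference to a numeric score:

```python
# Illustrative sketch: the exact interface Vertex AI Prompt Optimizer (Preview)
# expects for custom metrics may differ from this toy function.
def token_overlap_metric(response: str, reference: str) -> float:
    """Toy score: fraction of reference tokens that also appear in the response."""
    response_tokens = set(response.lower().split())
    reference_tokens = set(reference.lower().split())
    if not reference_tokens:
        return 0.0
    return len(response_tokens & reference_tokens) / len(reference_tokens)


# Example: token_overlap_metric("the light is green", "green light") == 1.0
```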
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
---
.../prompts/prompt_optimizer/utils/helpers.py | 26 +-
.../vertex_ai_prompt_optimizer_sdk.ipynb | 15 +-
...i_prompt_optimizer_sdk_custom_metric.ipynb | 1363 +++++++++++++++++
3 files changed, 1390 insertions(+), 14 deletions(-)
create mode 100644 gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb
diff --git a/gemini/prompts/prompt_optimizer/utils/helpers.py b/gemini/prompts/prompt_optimizer/utils/helpers.py
index c20b10269c..0bf1e639db 100644
--- a/gemini/prompts/prompt_optimizer/utils/helpers.py
+++ b/gemini/prompts/prompt_optimizer/utils/helpers.py
@@ -15,6 +15,7 @@
import json
import random
import string
+import subprocess
from typing import Dict, List, Optional, Tuple, Union
from IPython.display import HTML, Markdown, display
@@ -59,6 +60,21 @@ def get_id(length: Union[int, None] = 8) -> str:
return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))
+def get_auth_token() -> Optional[str]:
+ """A function to collect the authorization token"""
+ try:
+ result = subprocess.run(
+ ["gcloud", "auth", "print-identity-token", "-q"],
+ capture_output=True,
+ text=True,
+ check=True,
+ )
+ return result.stdout.strip()
+ except subprocess.CalledProcessError as e:
+ print(f"Error getting auth token: {e}")
+ return None
+
+
@retry(wait=wait_random_exponential(multiplier=1, max=120))
async def async_generate(prompt: str, model: GenerativeModel) -> Union[str, None]:
"""Generate a response from the model."""
@@ -66,7 +82,7 @@ async def async_generate(prompt: str, model: GenerativeModel) -> Union[str, None
[prompt],
stream=False,
)
- return response.text[0] if response.text else None
+ return response.text if response.text else None
def evaluate_task(
@@ -81,7 +97,7 @@ def evaluate_task(
"""Evaluate task using Vertex AI Evaluation."""
# Generate a unique id for the experiment run
- id = get_id()
+ idx = get_id()
# Rename the columns to match the expected format
eval_dataset = df[[prompt_col, reference_col, response_col]].rename(
@@ -108,7 +124,7 @@ def evaluate_task(
)
# Evaluate the task
- result = eval_task.evaluate(experiment_run_name=f"{experiment_name}-{id}")
+ result = eval_task.evaluate(experiment_run_name=f"{experiment_name}-{idx}")
# Return the summary metrics
return result.summary_metrics
@@ -125,9 +141,7 @@ def print_df_rows(
)
# Define the header style for the text
- header_style = (
- "white-space: pre-wrap; width: 800px; overflow-x: auto; font-size: 16px;"
- )
+ header_style = "white-space: pre-wrap; width: 800px; overflow-x: auto; font-size: 16px; font-weight: bold;"
# If columns are specified, filter the DataFrame
if columns:
diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb
index 35f16a7d7b..762ec120c5 100644
--- a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb
+++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk.ipynb
@@ -88,8 +88,7 @@
"source": [
"### Objective\n",
"\n",
- "This notebook demostrates how to leverage Vertex AI Prompt Optimizer (Preview) to efficiently migrate a prompt template from one model to another. The goal is to use Vertex AI Prompt Optimizer (Preview) to find the new prompt template which generate the most correct and grounded responses.\n",
- "\n",
+ "This notebook demostrates how to leverage Vertex AI Prompt Optimizer (Preview) to optimize a simple prompt for a Gemini model using your own metrics. The goal is to use Vertex AI Prompt Optimizer (Preview) to find the new prompt template which generate the most correct and grounded responses.\n",
"\n",
"This tutorial uses the following Google Cloud ML services and resources:\n",
"\n",
@@ -175,10 +174,7 @@
},
"outputs": [],
"source": [
- "import sys\n",
- "\n",
- "if \"google.colab\" in sys.modules:\n",
- " ! mkdir -p ./utils && wget https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/prompts/prompt_optimizer/utils/helpers.py -P ./utils"
+ "! mkdir -p ./tutorial/utils && wget https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/prompts/prompt_optimizer/utils/helpers.py -P ./tutorial/utils"
]
},
{
@@ -683,7 +679,7 @@
"id": "Rp1n1aMACzSW"
},
"source": [
- "### Translate the prompt template with Vertex AI Prompt Optimizer (Preview)\n"
+ "### Optimize the prompt template with Vertex AI Prompt Optimizer (Preview)\n"
]
},
{
@@ -702,7 +698,7 @@
"\n",
"Vertex AI Prompt Optimizer enables the translation and optimization of the Instruction Template, while the Task/Context Template remains essential for evaluating different instruction templates.\n",
"\n",
- "In this case, you want to translate a prompt\n"
+ "In this case, you want to enhance or optimize a simple prompt template.\n"
]
},
{
@@ -850,6 +846,9 @@
" source_model=\"\",\n",
" source_model_qps=\"\",\n",
" source_model_location=\"\",\n",
+ " optimizer_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n",
+ " optimizer_model_qps=1,\n",
+ " optimizer_model_location=\"us-central1\",\n",
" eval_model=\"gemini-1.5-pro-001\", # Supported models: \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"\n",
" eval_qps=1,\n",
" eval_model_location=\"us-central1\",\n",
diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb
new file mode 100644
index 0000000000..fa8b39fca6
--- /dev/null
+++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_sdk_custom_metric.ipynb
@@ -0,0 +1,1363 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ur8xi4C7S06n"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JAPoU8Sm5E6e"
+ },
+ "source": [
+ "# Vertex Prompt Optimizer Notebook SDK (Preview) - Custom metric\n",
+ "\n",
+    ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "18b1887a153f"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "| Author(s) | [Noa Ben-Efraim](https://github.com/noabenefraim) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4dddad16697f"
+ },
+ "source": [
+ "## Overview\n",
+ "LlamaIndex workflows are a powerful way to orchestrate complex LLM (large language model) applications. They provide an event-driven framework for building AI systems that go beyond simple question-answering. \n",
+ "\n",
+ "Think of a workflow as a series of steps, where each step performs a specific action. These actions can be anything from querying an LLM, to retrieving data from a vector database, to interacting with external APIs. The workflow manages the flow of data between these steps, making it easy to build sophisticated AI applications. \n",
+ "\n",
+ "Here's a breakdown of the key concepts:\n",
+ "\n",
+ "+ Events: These trigger actions within the workflow. For example, a user's query can be an initial event that kicks off the workflow. \n",
+ "+ Steps: These are individual functions decorated with @step that process events and potentially emit new events. Steps are the building blocks of your workflow. \n",
+ "+ Event-driven: This means that the workflow reacts to events as they happen, making it flexible and dynamic."
+ ]
+ },
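To make the event/step model above concrete, here is a minimal, self-contained workflow sketch using the same `llama_index.core.workflow` primitives this notebook imports later; the class and field names are illustrative and not part of the notebook itself.

```python
# Minimal sketch of the LlamaIndex workflow model: a custom Event carries data
# between @step functions, and a StopEvent ends the run with a result.
from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step


class GreetingEvent(Event):
    """Custom event emitted by the first step and consumed by the second."""

    name: str


class HelloWorkflow(Workflow):
    @step
    async def make_greeting(self, ev: StartEvent) -> GreetingEvent:
        # StartEvent exposes the keyword arguments passed to run().
        return GreetingEvent(name=ev.get("name") or "world")

    @step
    async def finish(self, ev: GreetingEvent) -> StopEvent:
        # Returning a StopEvent ends the workflow and surfaces the result.
        return StopEvent(result=f"Hello, {ev.name}!")


# In a notebook cell: result = await HelloWorkflow(timeout=60).run(name="Gatsby")
```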
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5bdceacc7b39"
+ },
+ "source": [
+ "This notebook perform a complex Retrieval Augmented Generation (RAG) workflow using Gemini models and Firestore databases. There are two branches for this workflow:\n",
+ "\n",
+ "_Branch 1_\n",
+ "+ Start Event triggered by providing a data directory to the workflow\n",
+ "+ Ingest data using the LlamaIndex `SimpleDirectoryReader`\n",
+ "+ Load data in the Firestore Database\n",
+ "\n",
+ "_Branch 2_\n",
+ "+ Start Event triggered by providing a query to the workflow\n",
+ "+ The QueryMultiStep Event that breaks down a complex query into sequential sub-questions using Gemini. Then proceeds to answer the sub-questions.\n",
+ "+ The sub-questions results are passed to the RerankEvent where given the initial user query, Gemini reranks the returned answers to the sub-questions.\n",
+ "+ The reranked chunks are passed to the CreateCitationEvents where citations are added to the sub-questions used to generate the answer.\n",
+ "+ An answer is synthesized for the original query and returned to the user.\n",
+ "\n",
+ "References:\n",
+ "+ https://docs.llamaindex.ai/en/stable/examples/workflow/rag/\n",
+ "+ https://docs.llamaindex.ai/en/stable/examples/workflow/multi_step_query_engine/\n",
+ "+ https://docs.llamaindex.ai/en/stable/examples/workflow/citation_query_engine/\n",
+ "\n",
+ "\n",
+ "![RAGWorkflow](https://storage.googleapis.com/github-repo/generative-ai/gemini/orchestration/llamaindex_workflows/RAGWorkflow.png)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ef012cf7cb67"
+ },
+ "source": [
+ "## Get started\n",
+ "\n",
+ "### Install required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "323a27d12c02"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install llama-index==\"0.11.8\" \\\n",
+ " llama-index-embeddings-vertex==\"0.2.0\" \\\n",
+ " llama-index-utils-workflow==\"0.2.1\" \\\n",
+ " llama-index-llms-vertex==\"0.3.4\" \\\n",
+ " llama-index-storage-docstore-firestore==\"0.2.0\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "f42d12d15616"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "f3d98395d9a4"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "e114f5653870"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "911453311a5d"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "e8d7771a5818"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ "This notebook requires the following resources:\n",
+ "+ Initialized Google Cloud project\n",
+ "+ Vertex AI API enabled\n",
+ "+ Existing VPC/Subnet\n",
+ "+ Existing Firestore database\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "To get started using Firestore Database, refer to the following [documentation](https://cloud.google.com/firestore/docs/manage-databases).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "e04ae6146ccd"
+ },
+ "outputs": [],
+ "source": [
+ "# Use the environment variable if the user doesn't provide Project ID.\n",
+ "import os\n",
+ "\n",
+ "import vertexai\n",
+ "\n",
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\", isTemplate: true}\n",
+ "if PROJECT_ID == \"[your-project-id]\":\n",
+ " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
+ "\n",
+ "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n",
+ "FIRESTORE_DATABASE_ID = \"[your-firestore-database-id]\"\n",
+ "\n",
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "d68eca7a8d4f"
+ },
+ "source": [
+ "## Workflow"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "93704f34a080"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "9b7824a768bc"
+ },
+ "outputs": [],
+ "source": [
+ "from typing import Any, cast\n",
+ "\n",
+ "from IPython.display import Markdown, display\n",
+ "from llama_index.core import (\n",
+ " Settings,\n",
+ " SimpleDirectoryReader,\n",
+ " StorageContext,\n",
+ " VectorStoreIndex,\n",
+ ")\n",
+ "from llama_index.core.indices.query.query_transform.base import (\n",
+ " StepDecomposeQueryTransform,\n",
+ ")\n",
+ "from llama_index.core.llms import LLM\n",
+ "from llama_index.core.node_parser import SentenceSplitter\n",
+ "from llama_index.core.postprocessor.llm_rerank import LLMRerank\n",
+ "from llama_index.core.prompts import PromptTemplate\n",
+ "from llama_index.core.response_synthesizers import (\n",
+ " ResponseMode,\n",
+ " get_response_synthesizer,\n",
+ ")\n",
+ "from llama_index.core.schema import MetadataMode, NodeWithScore, QueryBundle, TextNode\n",
+ "from llama_index.core.workflow import (\n",
+ " Context,\n",
+ " Event,\n",
+ " StartEvent,\n",
+ " StopEvent,\n",
+ " Workflow,\n",
+ " step,\n",
+ ")\n",
+ "from llama_index.embeddings.vertex import VertexTextEmbedding\n",
+ "from llama_index.llms.vertex import Vertex\n",
+ "from llama_index.storage.docstore.firestore import FirestoreDocumentStore\n",
+ "from llama_index.utils.workflow import draw_all_possible_flows\n",
+ "from vertexai.generative_models import HarmBlockThreshold, HarmCategory, SafetySetting"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0e101376082c"
+ },
+ "source": [
+ "### Get data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "f15a704ea11b"
+ },
+ "outputs": [],
+ "source": [
+ "!mkdir -p './data'\n",
+ "!wget 'https://www.gutenberg.org/cache/epub/64317/pg64317.txt' -O 'data/gatsby.txt'"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "927a7ffd9ad8"
+ },
+ "source": [
+ "### Set credentials"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "id": "7d11aac2947b"
+ },
+ "outputs": [],
+ "source": [
+ "import google.auth\n",
+ "import google.auth.transport.requests\n",
+ "\n",
+ "# credentials will now have an api token\n",
+ "credentials = google.auth.default(quota_project_id=PROJECT_ID)[0]\n",
+ "request = google.auth.transport.requests.Request()\n",
+ "credentials.refresh(request)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "af476af08250"
+ },
+ "source": [
+ "## Workflow"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "456efd13df2a"
+ },
+ "source": [
+ "### Set up the LLM"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "695792a24ba9"
+ },
+ "outputs": [],
+ "source": [
+ "safety_config = [\n",
+ " SafetySetting(\n",
+ " category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,\n",
+ " threshold=HarmBlockThreshold.BLOCK_ONLY_HIGH,\n",
+ " ),\n",
+ " SafetySetting(\n",
+ " category=HarmCategory.HARM_CATEGORY_HARASSMENT,\n",
+ " threshold=HarmBlockThreshold.BLOCK_ONLY_HIGH,\n",
+ " ),\n",
+ " SafetySetting(\n",
+ " category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,\n",
+ " threshold=HarmBlockThreshold.BLOCK_ONLY_HIGH,\n",
+ " ),\n",
+ "]\n",
+ "embedding_model = VertexTextEmbedding(\n",
+ " model_name=\"text-embedding-004\", credentials=credentials\n",
+ ")\n",
+ "llm = Vertex(\n",
+ " model=\"gemini-pro\",\n",
+ " temperature=0.2,\n",
+ " max_tokens=3000,\n",
+ " safety_settings=safety_config,\n",
+ " credentials=credentials,\n",
+ ")\n",
+ "\n",
+ "Settings.embed_model = embedding_model\n",
+ "Settings.llm = llm"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "1b99e7e4aa0d"
+ },
+ "source": [
+ "### Define Event classes\n",
+ "\n",
+ "Here we will create custom events that can be emitted by steps and trigger other steps. \n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {
+ "id": "31173e6befe2"
+ },
+ "outputs": [],
+ "source": [
+ "class RetrieverEvent(Event):\n",
+ " \"\"\"Result of running retrieval\"\"\"\n",
+ "\n",
+ " nodes: list[NodeWithScore]\n",
+ "\n",
+ "\n",
+ "class RerankEvent(Event):\n",
+ " \"\"\"Result of running reranking on retrieved nodes\"\"\"\n",
+ "\n",
+ " nodes: list[NodeWithScore]\n",
+ " source_nodes: list[NodeWithScore]\n",
+ " final_response_metadata: dict[str, Any]\n",
+ "\n",
+ "\n",
+ "class FirestoreIndexData(Event):\n",
+ " \"\"\"Result of indexing documents in Firestore\"\"\"\n",
+ "\n",
+ " status: str\n",
+ "\n",
+ "\n",
+ "class QueryMultiStepEvent(Event):\n",
+ " \"\"\"\n",
+ " Event containing results of a multi-step query process.\n",
+ "\n",
+ " Attributes:\n",
+ " nodes (List[NodeWithScore]): List of nodes with their associated scores.\n",
+ " source_nodes (List[NodeWithScore]): List of source nodes with their scores.\n",
+ " final_response_metadata (Dict[str, Any]): Metadata associated with the final response.\n",
+ " \"\"\"\n",
+ "\n",
+ " nodes: list[NodeWithScore]\n",
+ " source_nodes: list[NodeWithScore]\n",
+ " final_response_metadata: dict[str, Any]\n",
+ "\n",
+ "\n",
+ "class CreateCitationsEvent(Event):\n",
+ " \"\"\"Add citations to the nodes.\"\"\"\n",
+ "\n",
+ " nodes: list[NodeWithScore]\n",
+ " source_nodes: list[NodeWithScore]\n",
+ " final_response_metadata: dict[str, Any]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "805eae826756"
+ },
+ "source": [
+ "### Update Prompt Templates\n",
+ "\n",
+ "Defining custom prompts used for the citation portion of the workflow."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "id": "b2ec8b2859f2"
+ },
+ "outputs": [],
+ "source": [
+ "CITATION_QA_TEMPLATE = PromptTemplate(\n",
+ " \"Your task is to answer the question based on the information given in the sources listed below.\"\n",
+ " \"Use only the provided sources to answer.\"\n",
+ " \"Cite the source number(s) for any information you use in your answer (e.g., [1]).\"\n",
+ " \"Always include at least one source citation in your answer.\"\n",
+ " \"Only cite a source if you directly use information from it.\"\n",
+ " \"If the sources don't contain the information needed to answer the question, state that.\"\n",
+ " \"For example:\"\n",
+ " \"Source 1: Apples are red, green, or yellow.\"\n",
+ " \"Source 2: Bananas are yellow when ripe.\"\n",
+ " \"Source 3: Strawberries are red when ripe.\"\n",
+ " \"Query: Which fruits are red when ripe?\"\n",
+ " \"Answer: Apples [1] and strawberries [3] can be red when ripe.\"\n",
+ " \"------\"\n",
+ " \"Below are several numbered sources of information:\"\n",
+ " \"------\"\n",
+ " \"{context_str}\"\n",
+ " \"------\"\n",
+ " \"Query: {query_str}\"\n",
+ " \"Answer: \"\n",
+ ")\n",
+ "\n",
+ "CITATION_REFINE_TEMPLATE = PromptTemplate(\n",
+ " \"You have an initial answer to a query.\"\n",
+ " \"Your job is to improve this answer using the information provided in the numbered sources below. Here's how:\"\n",
+ " \" - Read the existing answer and the sources carefully.\"\n",
+ " \" - Identify any information in the sources that can improve the answer by adding details, making it more accurate, or providing better support.\"\n",
+ " \" - If the sources provide new information, incorporate it into the answer.\"\n",
+ " \" - If the sources contradict the existing answer, correct the answer.\"\n",
+ " \" - If the sources aren't helpful, keep the original answer.\"\n",
+ " \"Cite the source number(s) for any information you use in your answer (e.g., [1]).\"\n",
+ " \"We have provided an existing answer: {existing_answer}\"\n",
+ " \"Below are several numbered sources of information. \"\n",
+ " \"Use them to refine the existing answer. \"\n",
+ " \"If the provided sources are not helpful, you will repeat the existing answer.\"\n",
+ " \"------\"\n",
+ " \"{context_msg}\"\n",
+ " \"------\"\n",
+ " \"Query: {query_str}\"\n",
+ " \"Answer: \"\n",
+ ")\n",
+ "\n",
+ "DEFAULT_CITATION_CHUNK_SIZE = 512\n",
+ "DEFAULT_CITATION_CHUNK_OVERLAP = 20"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "09009f1cd892"
+ },
+ "source": [
+ "### Workflow Class\n",
+ "\n",
+ "The RAGWorkflow() class contains all the steps of the workflow. We define the steps by decorating the method with @step.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "id": "486b4ba78947"
+ },
+ "outputs": [],
+ "source": [
+ "class RAGWorkflow(Workflow):\n",
+ " @step\n",
+ " async def ingest_data(\n",
+ " self, ctx: Context, ev: StartEvent\n",
+ " ) -> FirestoreIndexData | None:\n",
+ " \"\"\"Entry point to ingest a document, triggered by a StartEvent with 'dirname'.\"\"\"\n",
+ " dirname = ev.get(\"dirname\")\n",
+ " if not dirname:\n",
+ " return None\n",
+ "\n",
+ " documents = SimpleDirectoryReader(dirname).load_data()\n",
+ " await ctx.set(\"documents\", documents)\n",
+ " return FirestoreIndexData(\n",
+ " status=\"First step complete. Data loaded into Documents.\"\n",
+ " )\n",
+ "\n",
+ " @step\n",
+ " async def load_database(self, ctx: Context, ev: FirestoreIndexData) -> StopEvent:\n",
+ " print(ev.status)\n",
+ "\n",
+ " # create (or load) docstore and add nodes\n",
+ " docstore = FirestoreDocumentStore.from_database(\n",
+ " project=PROJECT_ID,\n",
+ " database=FIRESTORE_DATABASE_ID,\n",
+ " )\n",
+ "\n",
+ " docstore.add_documents(await ctx.get(\"documents\"))\n",
+ "\n",
+ " # create storage context\n",
+ " storage_context = StorageContext.from_defaults(docstore=docstore)\n",
+ "\n",
+ " # setup index\n",
+ " index = VectorStoreIndex.from_documents(\n",
+ " documents=await ctx.get(\"documents\"), storage_context=storage_context\n",
+ " )\n",
+ "\n",
+ " print(\"Index created\")\n",
+ " return StopEvent(index)\n",
+ "\n",
+ " def combine_queries(\n",
+ " self,\n",
+ " query_bundle: QueryBundle,\n",
+ " prev_reasoning: str,\n",
+ " llm: LLM,\n",
+ " ) -> QueryBundle:\n",
+ " \"\"\"Combine queries using StepDecomposeQueryTransform.\"\"\"\n",
+ " transform_metadata = {\"prev_reasoning\": prev_reasoning}\n",
+ " return StepDecomposeQueryTransform(llm=llm)(\n",
+ " query_bundle, metadata=transform_metadata\n",
+ " )\n",
+ "\n",
+ " def default_stop_fn(self, stop_dict: dict) -> bool:\n",
+ " \"\"\"Stop function for multi-step query combiner.\"\"\"\n",
+ " query_bundle = cast(QueryBundle, stop_dict.get(\"query_bundle\"))\n",
+ " if query_bundle is None:\n",
+ " raise ValueError(\"Response must be provided to stop function.\")\n",
+ "\n",
+ " return \"none\" in query_bundle.query_str.lower()\n",
+ "\n",
+ " @step(pass_context=True)\n",
+ " async def query_multistep(\n",
+ " self, ctx: Context, ev: StartEvent\n",
+ " ) -> QueryMultiStepEvent | None:\n",
+ " \"\"\"Entry point for RAG, triggered by a StartEvent with `query`. Execute multi-step query process.\"\"\"\n",
+ "\n",
+ " query = ev.get(\"query\")\n",
+ " index = ev.get(\"index\")\n",
+ "\n",
+ " prev_reasoning = \"\"\n",
+ " cur_response = None\n",
+ " should_stop = False\n",
+ " cur_steps = 0\n",
+ "\n",
+ " # use response\n",
+ " final_response_metadata: dict[str, Any] = {\"sub_qa\": []}\n",
+ "\n",
+ " text_chunks = []\n",
+ " source_nodes = []\n",
+ "\n",
+ " stop_fn = self.default_stop_fn\n",
+ "\n",
+ " if not query:\n",
+ " return None\n",
+ "\n",
+ " print(f\"Query the database with: {query}\")\n",
+ "\n",
+ " # store the query in the global context\n",
+ " await ctx.set(\"query\", query)\n",
+ "\n",
+ " # get the index from the global context\n",
+ " if index is None:\n",
+ " print(\"Index is empty, load some documents before querying!\")\n",
+ " return None\n",
+ "\n",
+ " num_steps = ev.get(\"num_steps\")\n",
+ " query_engine = index.as_query_engine()\n",
+ "\n",
+ " while not should_stop:\n",
+ " if num_steps is not None and cur_steps >= num_steps:\n",
+ " should_stop = True\n",
+ " break\n",
+ " elif should_stop:\n",
+ " break\n",
+ "\n",
+ " updated_query_bundle = self.combine_queries(\n",
+ " QueryBundle(query_str=query),\n",
+ " prev_reasoning,\n",
+ " llm=Settings.llm,\n",
+ " )\n",
+ "\n",
+ " print(\n",
+ " f\"Created query for the step - {cur_steps} is: {updated_query_bundle}\"\n",
+ " )\n",
+ "\n",
+ " stop_dict = {\"query_bundle\": updated_query_bundle}\n",
+ " if stop_fn(stop_dict):\n",
+ " should_stop = True\n",
+ " break\n",
+ "\n",
+ " cur_response = query_engine.query(updated_query_bundle)\n",
+ "\n",
+ " # append to response builder\n",
+ " cur_qa_text = (\n",
+ " f\"\\nQuestion: {updated_query_bundle.query_str}\\n\"\n",
+ " f\"Answer: {cur_response!s}\"\n",
+ " )\n",
+ " text_chunks.append(cur_qa_text)\n",
+ " print(\"Source nodes used:\\n\")\n",
+ " for source_node in cur_response.source_nodes:\n",
+ " print(source_node)\n",
+ " source_nodes.append(source_node)\n",
+ "\n",
+ " # update metadata\n",
+ " final_response_metadata[\"sub_qa\"].append(\n",
+ " (updated_query_bundle.query_str, cur_response)\n",
+ " )\n",
+ "\n",
+ " prev_reasoning += (\n",
+ " f\"- {updated_query_bundle.query_str}\\n\" f\"- {cur_response!s}\\n\"\n",
+ " )\n",
+ " cur_steps += 1\n",
+ "\n",
+ " nodes = [\n",
+ " NodeWithScore(node=TextNode(text=text_chunk)) for text_chunk in text_chunks\n",
+ " ]\n",
+ " return QueryMultiStepEvent(\n",
+ " nodes=nodes,\n",
+ " source_nodes=source_nodes,\n",
+ " final_response_metadata=final_response_metadata,\n",
+ " )\n",
+ "\n",
+ " @step\n",
+ " async def rerank(self, ctx: Context, ev: QueryMultiStepEvent) -> RerankEvent:\n",
+ " # Rerank the nodes\n",
+ " ranker = LLMRerank(choice_batch_size=5, top_n=10, llm=Settings.llm)\n",
+ " print(\"Entering reranking of nodes:\\n\")\n",
+ " print(\"Original query: \", await ctx.get(\"query\", default=None), flush=True)\n",
+ " # print(await ctx.get(\"query\", default=None), flush=True)\n",
+ " try:\n",
+ " new_nodes = ranker.postprocess_nodes(\n",
+ " ev.nodes, query_str=await ctx.get(\"query\", default=None)\n",
+ " )\n",
+ " except:\n",
+ " # re ranker is not guaranteed to create parsable output\n",
+ " new_nodes = ev.nodes\n",
+ "\n",
+ " print(f\"Reranked nodes to {len(new_nodes)}\")\n",
+ " return RerankEvent(\n",
+ " nodes=new_nodes,\n",
+ " source_nodes=ev.source_nodes,\n",
+ " final_response_metadata=ev.final_response_metadata,\n",
+ " )\n",
+ "\n",
+ " @step\n",
+ " async def create_citation_nodes(self, ev: RerankEvent) -> CreateCitationsEvent:\n",
+ " \"\"\"\n",
+ " Modify retrieved nodes to create granular sources for citations.\n",
+ "\n",
+ " Takes a list of NodeWithScore objects and splits their content\n",
+ " into smaller chunks, creating new NodeWithScore objects for each chunk.\n",
+ " Each new node is labeled as a numbered source, allowing for more precise\n",
+ " citation in query results.\n",
+ "\n",
+ " Args:\n",
+ " nodes (List[NodeWithScore]): A list of NodeWithScore objects to be processed.\n",
+ "\n",
+ " Returns:\n",
+ " List[NodeWithScore]: A new list of NodeWithScore objects, where each object\n",
+ " represents a smaller chunk of the original nodes, labeled as a source.\n",
+ " \"\"\"\n",
+ " nodes = ev.nodes\n",
+ "\n",
+ " new_nodes: list[NodeWithScore] = []\n",
+ "\n",
+ " text_splitter = SentenceSplitter(\n",
+ " chunk_size=DEFAULT_CITATION_CHUNK_SIZE,\n",
+ " chunk_overlap=DEFAULT_CITATION_CHUNK_OVERLAP,\n",
+ " )\n",
+ "\n",
+ " for node in nodes:\n",
+ " print(node)\n",
+ "\n",
+ " text_chunks = text_splitter.split_text(\n",
+ " node.node.get_content(metadata_mode=MetadataMode.NONE)\n",
+ " )\n",
+ "\n",
+ " for text_chunk in text_chunks:\n",
+ " text = f\"Source {len(new_nodes)+1}:\\n{text_chunk}\\n\"\n",
+ "\n",
+ " new_node = NodeWithScore(\n",
+ " node=TextNode.model_validate(node.node), score=node.score\n",
+ " )\n",
+ "\n",
+ " new_node.node.text = text\n",
+ " new_nodes.append(new_node)\n",
+ " return CreateCitationsEvent(\n",
+ " nodes=new_nodes,\n",
+ " source_nodes=ev.source_nodes,\n",
+ " final_response_metadata=ev.final_response_metadata,\n",
+ " )\n",
+ "\n",
+ " @step\n",
+ " async def synthesize(self, ctx: Context, ev: CreateCitationsEvent) -> StopEvent:\n",
+ " \"\"\"Return a streaming response using reranked nodes.\"\"\"\n",
+ "\n",
+ " print(\"Synthesizing final result...\")\n",
+ "\n",
+ " response_synthesizer = get_response_synthesizer(\n",
+ " llm=Vertex(model=\"gemini-1.5-pro\", temperature=0.0, max_tokens=5000),\n",
+ " text_qa_template=CITATION_QA_TEMPLATE,\n",
+ " refine_template=CITATION_REFINE_TEMPLATE,\n",
+ " response_mode=ResponseMode.COMPACT,\n",
+ " use_async=True,\n",
+ " )\n",
+ " query = await ctx.get(\"query\", default=None)\n",
+ " response = await response_synthesizer.asynthesize(\n",
+ " query, nodes=ev.nodes, additional_source_nodes=ev.source_nodes\n",
+ " )\n",
+ " return StopEvent(result=response)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "506c0759eab2"
+ },
+ "outputs": [],
+ "source": [
+ "# optional - generate DAG for workflow created above\n",
+ "draw_all_possible_flows(workflow=RAGWorkflow, filename=\"multi_step_workflow.html\") # type: ignore"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4eebc39a2b70"
+ },
+ "source": [
+ "### Run the workflow"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "id": "d7a98c8d7876"
+ },
+ "outputs": [],
+ "source": [
+ "w = RAGWorkflow(timeout=200)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "id": "0a14595e4e6a"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "First step complete. Data loaded into Documents.\n",
+ "Index created\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Ingest the documents\n",
+ "index = await w.run(dirname=\"./data\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "439d69a3e7f7"
+ },
+ "source": [
+ "#### Example 1\n",
+ "Query: \"What is the significance of the green light?\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {
+ "id": "19ebb8696f71"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Query the database with: What is the significance of the green light?\n",
+ "Created query for the step - 0 is: What is the significance of the green light?\n",
+ "Source nodes used:\n",
+ "\n",
+ "Node ID: 0eab96dd-33ef-4d5c-a97e-8ca897af48d6\n",
+ "Text: Its vanished trees, the trees that had made way for Gatsby’s\n",
+ "house, had once pandered in whispers to the last and greatest of all\n",
+ "human dreams; for a transitory enchanted moment man must have held his\n",
+ "breath in the presence of this continent, compelled into an aesthetic\n",
+ "contemplation he neither understood nor desired, face to face for the\n",
+ "l...\n",
+ "Score: 0.540\n",
+ "\n",
+ "Node ID: 4b08ce92-cbf0-4469-88a5-8cb3514da22f\n",
+ "Text: “I’ve got a man in England who buys me clothes. He sends over a\n",
+ "selection of things at the beginning of each season, spring and fall.”\n",
+ "He took out a pile of shirts and began throwing them, one by one,\n",
+ "before us, shirts of sheer linen and thick silk and fine flannel,\n",
+ "which lost their folds as they fell and covered the table in many-\n",
+ "coloure...\n",
+ "Score: 0.525\n",
+ "\n",
+ "Created query for the step - 1 is: ## New Question:\n",
+ "\n",
+ "**What is the significance of the green light in the context of Gatsby's pursuit of Daisy?** \n",
+ "\n",
+ "Source nodes used:\n",
+ "\n",
+ "Node ID: f323395e-7546-454a-9f8b-563e73fbb292\n",
+ "Text: “Old sport, the dance is unimportant.” He wanted nothing less\n",
+ "of Daisy than that she should go to Tom and say: “I never loved you.”\n",
+ "After she had obliterated four years with that sentence they could\n",
+ "decide upon the more practical measures to be taken. One of them was\n",
+ "that, after she was free, they were to go back to Louisville and be\n",
+ "marr...\n",
+ "Score: 0.662\n",
+ "\n",
+ "Node ID: a2ec7e02-2983-4da9-b08a-afa1b6cc4216\n",
+ "Text: “Why didn’t he ask you to arrange a meeting?” “He wants her\n",
+ "to see his house,” she explained. “And your house is right next\n",
+ "door.” “Oh!” “I think he half expected her to wander into one of\n",
+ "his parties, some night,” went on Jordan, “but she never did. Then he\n",
+ "began asking people casually if they knew her, and I was the first\n",
+ "one he fo...\n",
+ "Score: 0.648\n",
+ "\n",
+ "Entering reranking of nodes:\n",
+ "\n",
+ "Original query: What is the significance of the green light?\n",
+ "Reranked nodes to 2\n",
+ "Node ID: c2860521-c9c1-4cab-b7a9-ea1c784506be\n",
+ "Text: Question: What is the significance of the green light? Answer:\n",
+ "The green light is a symbol of Gatsby's dream of Daisy. It is the\n",
+ "light at the end of her dock, which he can see from his house across\n",
+ "the bay. The green light represents Gatsby's hope for a future with\n",
+ "Daisy, and his belief that he can recapture the past. However, the\n",
+ "green light is...\n",
+ "Score: None\n",
+ "\n",
+ "Node ID: 7fe78bba-c870-486e-8f29-0168b09a792e\n",
+ "Text: Question: ## New Question: **What is the significance of the\n",
+ "green light in the context of Gatsby's pursuit of Daisy?** Answer:\n",
+ "## The Green Light: A Symbol of Gatsby's Dreams and Desires The green\n",
+ "light at the end of Daisy's dock plays a pivotal role in symbolizing\n",
+ "Gatsby's aspirations and the unattainable nature of his dreams. It\n",
+ "represent...\n",
+ "Score: None\n",
+ "\n",
+ "Synthesizing final result...\n"
+ ]
+ },
+ {
+ "data": {
+ "text/markdown": [
+ "## The Significance of the Green Light in The Great Gatsby\n",
+ "\n",
+ "The green light at the end of Daisy's dock holds immense symbolic weight in F. Scott Fitzgerald's *The Great Gatsby*. It represents a multitude of Gatsby's aspirations and desires, while simultaneously highlighting the unattainable nature of his dreams.\n",
+ "\n",
+ "**Unrequited Love:** The green light's physical proximity to Gatsby, yet separation by the bay, mirrors the emotional distance between him and Daisy. He yearns for her, but she remains out of reach, symbolizing his unrequited love.\n",
+ "\n",
+ "**The Past:** The green light evokes memories of Gatsby's past with Daisy, a time when their love seemed possible. He desperately wants to recapture that lost time and recreate their romance, clinging to the hope of a second chance.\n",
+ "\n",
+ "**Hope and Illusion:** The green light embodies Gatsby's unwavering hope for a future with Daisy. He believes that if he can achieve enough wealth and success, he can win her back. However, this hope is ultimately an illusion, as Daisy has moved on and their circumstances have changed.\n",
+ "\n",
+ "**The American Dream:** The green light can be interpreted as a symbol of the American Dream, representing Gatsby's relentless pursuit of wealth and social status. He believes that achieving these goals will bring him happiness and allow him to win Daisy's love. However, the novel ultimately suggests that the American Dream is often unattainable and can lead to disillusionment.\n",
+ "\n",
+ "**Additional Points:**\n",
+ "\n",
+ "* The green light's color reinforces its symbolic meaning. Green often represents hope, growth, and new beginnings, but in this context, it takes on a more melancholic and unattainable quality.\n",
+ "* The light's flickering nature reflects the instability of Gatsby's dreams and the uncertainty of his future.\n",
+ "* Gatsby's constant focus on the green light highlights his single-minded obsession with Daisy and his inability to move on from the past.\n",
+ "\n",
+ "**Overall, the green light serves as a powerful symbol that encapsulates Gatsby's longing, his yearning for a lost love, and the ultimately unattainable nature of his dreams.**\n",
+ "\n",
+ "**Sources:**\n",
+ "\n",
+ "* [1] The Great Gatsby by F. Scott Fitzgerald\n",
+ "* [2] SparkNotes: The Great Gatsby - Symbols, Imagery, Allegory"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Run a query\n",
+ "NUM_STEPS = 2 # @param {type:\"int\"} represents how many sub-questions generated based on the query\n",
+ "result = await w.run(\n",
+ " query=\"What is the significance of the green light?\",\n",
+ " index=index,\n",
+ " num_steps=NUM_STEPS,\n",
+ ")\n",
+ "\n",
+ "display(Markdown(f\"{result}\"))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "da168aa48f73"
+ },
+ "source": [
+ "Check the ranked LLM generated sub-question answers used:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {
+ "id": "5dd8dab92106"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Node ID: c2860521-c9c1-4cab-b7a9-ea1c784506be\n",
+ "Text: Source 1: Question: What is the significance of the green light?\n",
+ "Answer: The green light is a symbol of Gatsby's dream of Daisy. It is\n",
+ "the light at the end of her dock, which he can see from his house\n",
+ "across the bay. The green light represents Gatsby's hope for a future\n",
+ "with Daisy, and his belief that he can recapture the past. However,\n",
+ "the gree...\n",
+ "Score: None\n",
+ "\n",
+ "Node ID: 7fe78bba-c870-486e-8f29-0168b09a792e\n",
+ "Text: Source 2: Question: ## New Question: **What is the significance\n",
+ "of the green light in the context of Gatsby's pursuit of Daisy?**\n",
+ "Answer: ## The Green Light: A Symbol of Gatsby's Dreams and Desires\n",
+ "The green light at the end of Daisy's dock plays a pivotal role in\n",
+ "symbolizing Gatsby's aspirations and the unattainable nature of his\n",
+ "dreams. It...\n",
+ "Score: None\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "for idx in range(0, NUM_STEPS):\n",
+ " print(result.source_nodes[idx])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "79eb31af5944"
+ },
+ "source": [
+ "Check the citations from the original source used:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {
+ "id": "6ddf240120c0"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Node ID: 0eab96dd-33ef-4d5c-a97e-8ca897af48d6\n",
+ "Text: Its vanished trees, the trees that had made way for Gatsby’s\n",
+ "house, had once pandered in whispers to the last and greatest of all\n",
+ "human dreams; for a transitory enchanted moment man must have held his\n",
+ "breath in the presence of this continent, compelled into an aesthetic\n",
+ "contemplation he neither understood nor desired, face to face for the\n",
+ "l...\n",
+ "Score: 0.540\n",
+ "\n",
+ "Node ID: 4b08ce92-cbf0-4469-88a5-8cb3514da22f\n",
+ "Text: “I’ve got a man in England who buys me clothes. He sends over a\n",
+ "selection of things at the beginning of each season, spring and fall.”\n",
+ "He took out a pile of shirts and began throwing them, one by one,\n",
+ "before us, shirts of sheer linen and thick silk and fine flannel,\n",
+ "which lost their folds as they fell and covered the table in many-\n",
+ "coloure...\n",
+ "Score: 0.525\n",
+ "\n",
+ "Node ID: f323395e-7546-454a-9f8b-563e73fbb292\n",
+ "Text: “Old sport, the dance is unimportant.” He wanted nothing less\n",
+ "of Daisy than that she should go to Tom and say: “I never loved you.”\n",
+ "After she had obliterated four years with that sentence they could\n",
+ "decide upon the more practical measures to be taken. One of them was\n",
+ "that, after she was free, they were to go back to Louisville and be\n",
+ "marr...\n",
+ "Score: 0.662\n",
+ "\n",
+ "Node ID: a2ec7e02-2983-4da9-b08a-afa1b6cc4216\n",
+ "Text: “Why didn’t he ask you to arrange a meeting?” “He wants her\n",
+ "to see his house,” she explained. “And your house is right next\n",
+ "door.” “Oh!” “I think he half expected her to wander into one of\n",
+ "his parties, some night,” went on Jordan, “but she never did. Then he\n",
+ "began asking people casually if they knew her, and I was the first\n",
+ "one he fo...\n",
+ "Score: 0.648\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "for idx in range(NUM_STEPS, len(result.source_nodes)):\n",
+ " print(result.source_nodes[idx])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2dca7a130be9"
+ },
+ "source": [
+ "## Cleaning up\n",
+ "\n",
+ "To clean up all Google Cloud resources used in this project, you can delete the Google Cloud project you used for the tutorial.\n",
+ "\n",
+ "Otherwise, you can delete the individual resources you created in this tutorial."
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "llamaindex_workflows.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
diff --git a/rag-grounding/README.md b/rag-grounding/README.md
index 1c128ef498..a9f0f5feec 100644
--- a/rag-grounding/README.md
+++ b/rag-grounding/README.md
@@ -123,6 +123,7 @@ performance.
- **[summarization_large_documents.ipynb](../language/use-cases/document-summarization/summarization_large_documents.ipynb)**,
**[summarization_large_documents_langchain.ipynb](../language/use-cases/document-summarization/summarization_large_documents_langchain.ipynb)**:
Demonstrate summarizing large documents.
+- **[llamaindex_workflows.ipynb](../gemini/orchestration/llamaindex_workflows.ipynb)**: Using LlamaIndex Workflows to build an event-driven RAG flow.
### Open Models
From 4ec1eb0d36c86d7876c37d35b427d8be0c7e53e6 Mon Sep 17 00:00:00 2001
From: Ivan Nardini <88703814+inardini@users.noreply.github.com>
Date: Fri, 4 Oct 2024 02:11:10 +0200
Subject: [PATCH 42/76] refactor: rename speech folder (#1209)
---------
Co-authored-by: Owl Bot
Co-authored-by: Holt Skinner
---
.github/actions/spelling/allow.txt | 2 ++
README.md | 6 +++---
{speech => audio/speech}/README.md | 0
.../speech/getting-started/get_started_with_chirp.ipynb | 8 ++++----
.../speech}/use-cases/storytelling/storytelling.ipynb | 8 ++++----
5 files changed, 13 insertions(+), 11 deletions(-)
rename {speech => audio/speech}/README.md (100%)
rename speech/getting-started/speech_recognition.ipynb => audio/speech/getting-started/get_started_with_chirp.ipynb (98%)
rename {speech => audio/speech}/use-cases/storytelling/storytelling.ipynb (98%)
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index dff17f8f67..4421d752fb 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -344,6 +344,7 @@ bff
bigframes
bigquery
bitcoin
+blogs
boundings
bpa
bqml
@@ -586,6 +587,7 @@ pietra
pii
pixmap
pkl
+playlists
plotly
plpgsql
plt
diff --git a/README.md b/README.md
index 5b90b283d0..a65cc0ea98 100644
--- a/README.md
+++ b/README.md
@@ -105,12 +105,12 @@ For more Vertex AI samples, please visit the [Vertex AI samples GitHub repositor
Use this folder if you're interested in building your own solutions from scratch using features from Chirp, a version of Google's Universal Speech Model (USM) on Vertex AI (Vertex AI Chirp API).
\n",
- " \n",
+ " \n",
" \n",
" Open in Vertex AI Workbench\n",
" \n",
@@ -866,7 +866,7 @@
],
"metadata": {
"colab": {
- "name": "get_started_with_chirp.ipynb",
+ "name": "get_started_with_chirp_sdk.ipynb",
"toc_visible": true
},
"kernelspec": {
From 8fd9d89d0581a44e9467cbd5d83127fd6c82b54d Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Mon, 7 Oct 2024 11:20:25 -0500
Subject: [PATCH 45/76] fix: Broken Links (#1216)
Fixes #1212
Fixes #1214
Fixes #1215
---
audio/speech/README.md | 3 ++-
gemini/sample-apps/photo-discovery/README.md | 9 ++++-----
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/audio/speech/README.md b/audio/speech/README.md
index 995f6c3949..ddd921ba8e 100644
--- a/audio/speech/README.md
+++ b/audio/speech/README.md
@@ -7,7 +7,8 @@ This repository explores various use-cases and implementations of speech recogni
Here's a breakdown of the content available:
- **Getting Started:**
- - [speech_recognition.ipynb](getting-started/speech_recognition.ipynb): This Jupyter Notebook provides a basic introduction to performing speech recognition using Google Cloud's Speech-to-Text API.
+ - [get_started_with_chirp_sdk.ipynb](getting-started/get_started_with_chirp_sdk.ipynb): This Jupyter Notebook provides a basic introduction to performing speech recognition using the Chirp model with Google Cloud's Speech-to-Text API.
+ - [get_started_with_chirp_2_sdk.ipynb](getting-started/get_started_with_chirp_2_sdk.ipynb): This Jupyter Notebook provides a basic introduction to performing speech recognition using the Chirp 2 model with Google Cloud's Speech-to-Text API.
- **Use Cases:**
- **Storytelling:**
- [storytelling.ipynb](use-cases/storytelling/storytelling.ipynb): This notebook delves into a specific application of speech technology - crafting engaging stories. It likely leverages both speech recognition and generation to create interactive or automated storytelling experiences.
diff --git a/gemini/sample-apps/photo-discovery/README.md b/gemini/sample-apps/photo-discovery/README.md
index ce4bf68c6c..347b88cdc8 100644
--- a/gemini/sample-apps/photo-discovery/README.md
+++ b/gemini/sample-apps/photo-discovery/README.md
@@ -2,20 +2,19 @@
![Vertex AI Agent Builder & Flutter Multi-platform Demo – Fallingwater](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/photo-discovery/showcase.png)
-This project is a demo that integrates a Vertex AI Agent with a multiplatform Flutter app. Flutter is used as the client app framework, Vertex AI Search is used as a vector DB, and Reasoning Engine helps us build and deploy an agent with LangChain on Vertex AI.
+This project is a demo that integrates a Vertex AI Agent with a multi-platform Flutter app. Flutter is used as the client app framework, Vertex AI Search is used as a vector DB, and Reasoning Engine helps us build and deploy an agent with LangChain on Vertex AI.
-
-
-Users can run the Flutter app and take or upload a photo of a landmark. The app identifies the subject name and provides a brief description.
+Users can run the Flutter app and take or upload a photo of a landmark. The app identifies the subject name and provides a brief description.
To learn more about the subject of the photo, tap "tell me more" to chat with Khanh, an AI agent built with Vertex AI Agent Builder, and ask follow-up questions about the history of the subject in the photo based on information from Wikipedia.
-The agent can also identify merchandise from the [Google Merchandise Store](https://your.googlemerchandisestore.com) and provide product name, description, pricing, and purchase link by referencing a Google Merchandise store dataset.
+The agent can also identify merchandise from the [Google Merchandise Store](https://your.merch.google/) and provide product name, description, pricing, and purchase link by referencing a Google Merchandise store dataset.
> [!NOTE]
Check out the Google I/O 2024 talk for a full walkthrough: [Build generative AI agents with Vertex AI Agent Builder and Flutter](https://youtu.be/V8P_S9OLI_I?si=N2QMBs7HNZL6mKU0).
## Demo
+
[Try the live demo app](https://photo-discovery-demo.web.app/)
![Chat UI - Lake Wakatipu](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/photo-discovery/demo.gif)
From 05b1113966676fcd8c9621c2f9e7fc840de81bc0 Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Mon, 7 Oct 2024 11:39:39 -0500
Subject: [PATCH 46/76] ci: Remove Javascript Standard from Superlinter (#1217)
---
.github/actions/spelling/excludes.txt | 1 +
.github/workflows/linter.yaml | 1 +
2 files changed, 2 insertions(+)
diff --git a/.github/actions/spelling/excludes.txt b/.github/actions/spelling/excludes.txt
index d482619f72..8ca285fc1b 100644
--- a/.github/actions/spelling/excludes.txt
+++ b/.github/actions/spelling/excludes.txt
@@ -82,6 +82,7 @@
^\.github/actions/spelling/
^\Q.github/workflows/spelling.yaml\E$
^\Q.github/workflows/notebook_linter/run_linter.sh\E$
+^\Q.github/workflows/linter.yaml\E$
^\Qgemini/function-calling/use_case_company_news_and_insights.ipynb\E$
^\Qgemini/getting-started/intro_gemini_1_5_pro.ipynb\E$
^\Qgemini/getting-started/intro_gemini_pro_vision_python.ipynb\E$
diff --git a/.github/workflows/linter.yaml b/.github/workflows/linter.yaml
index e9e1139ec1..eddd527e88 100644
--- a/.github/workflows/linter.yaml
+++ b/.github/workflows/linter.yaml
@@ -61,3 +61,4 @@ jobs:
VALIDATE_PYTHON_ISORT: false
VALIDATE_TYPESCRIPT_STANDARD: false # super-linter/super-linter#4445
VALIDATE_CHECKOV: false
+ VALIDATE_JAVASCRIPT_STANDARD: false
From 235494f8117c50c3d8391eb10392b1ddd1e4b80f Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Mon, 7 Oct 2024 11:54:43 -0500
Subject: [PATCH 47/76] ci: update renovate.json exclude for genwealth (#1219)
---
renovate.json | 1 -
1 file changed, 1 deletion(-)
diff --git a/renovate.json b/renovate.json
index 890d84603b..e6d6b00bd2 100644
--- a/renovate.json
+++ b/renovate.json
@@ -25,7 +25,6 @@
"extends": ["schedule:quarterly"]
},
{
- "matchDepTypes": ["all"],
"excludePaths": ["gemini/sample-apps/genwealth/**"]
}
],
From 46459b9b0723a0f97963c745d4026f9dde827e45 Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Mon, 7 Oct 2024 11:59:04 -0500
Subject: [PATCH 48/76] ci: Remove medium.com from lychee checks (#1220)
Fixes #1193
Fixes #1187
---
lychee.toml | 1 +
1 file changed, 1 insertion(+)
diff --git a/lychee.toml b/lychee.toml
index eedead2d39..9dc16ad8b6 100644
--- a/lychee.toml
+++ b/lychee.toml
@@ -3,6 +3,7 @@ exclude = [
'https://fonts.googleapis.com/',
'http://go/github',
'http://go/github-googlecloudplatform',
+ 'https://medium.com/',
]
exclude_path = [
".github/actions/spelling",
From 710920c3088920161530dc337e0b85468de964f6 Mon Sep 17 00:00:00 2001
From: Katie Nguyen <21978337+katiemn@users.noreply.github.com>
Date: Mon, 7 Oct 2024 11:41:44 -0700
Subject: [PATCH 49/76] feat: image segmentation to psd file notebook (#1213)
# Description
This new notebook highlights a use case for the image segmentation API.
The user uploads an image, then chooses up to four segmentation options.
Finally, all of the various image masks are added as layers and saved as
a separate PSD file that's uploaded to a Google Cloud Storage bucket.
---------
Co-authored-by: Owl Bot
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
---
.github/actions/spelling/allow.txt | 1 +
.../use-cases/image_segmentation_layers.ipynb | 954 ++++++++++++++++++
2 files changed, 955 insertions(+)
create mode 100644 vision/use-cases/image_segmentation_layers.ipynb
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index f4e3efaddf..084936e4cc 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -206,6 +206,7 @@ Parmar
Persero
Phaidon
Pharma
+Photoshop
Pistorius
Priyanka
QPM
diff --git a/vision/use-cases/image_segmentation_layers.ipynb b/vision/use-cases/image_segmentation_layers.ipynb
new file mode 100644
index 0000000000..81602f94b7
--- /dev/null
+++ b/vision/use-cases/image_segmentation_layers.ipynb
@@ -0,0 +1,954 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "uxCkB_DXTHzf"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Hny4I-ODTIS6"
+ },
+ "source": [
+ "# Create a Photoshop Document with Image Segmentation on Vertex AI\n",
+ "\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "| Author(s) | [Ivan Nardini](https://github.com/inardini) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tvgnzT1CKxrO"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "In this tutorial, you learn about how to use [Chirp 2](https://cloud.google.com/speech-to-text/v2/docs/chirp_2-model), the latest generation of Google's multilingual ASR-specific models, and its new features, including word-level timestamps, model adaptation, and speech translation."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## Get started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Speech-to-Text SDK and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "e73_ZgKWYedz"
+ },
+ "outputs": [],
+ "source": [
+ "! apt update -y -qq\n",
+ "! apt install ffmpeg -y -qq"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --quiet 'google-cloud-speech' 'protobuf<4.21' 'google-auth==2.27.0' 'pydub' 'etils' 'jiwer' 'ffmpeg-python' 'plotly' 'gradio'"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Speech-to-Text V2 SDK\n",
+ "\n",
+ "To get started using the Speech-to-Text API, you must have an existing Google Cloud project and [enable the Speech-to-Text API](https://console.cloud.google.com/flows/enableapi?apiid=speech.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "WIQyBhAn_9tK"
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\", isTemplate: true}\n",
+ "\n",
+ "if PROJECT_ID == \"[your-project-id]\":\n",
+ " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
+ "\n",
+ "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "from google.api_core.client_options import ClientOptions\n",
+ "from google.cloud.speech_v2 import SpeechClient\n",
+ "\n",
+ "API_ENDPOINT = f\"{LOCATION}-speech.googleapis.com\"\n",
+ "\n",
+ "client = SpeechClient(\n",
+ " client_options=ClientOptions(\n",
+ " api_endpoint=API_ENDPOINT,\n",
+ " )\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5303c05f7aa6"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "6fc324893334"
+ },
+ "outputs": [],
+ "source": [
+ "from google.cloud.speech_v2.types import cloud_speech\n",
+ "import gradio as gr"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "qqm0OQpAYCph"
+ },
+ "outputs": [],
+ "source": [
+ "import io\n",
+ "import os\n",
+ "\n",
+ "import IPython.display as ipd\n",
+ "from etils import epath as ep\n",
+ "from pydub import AudioSegment"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "sP8GBj3tBAC1"
+ },
+ "source": [
+ "### Set constants"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "rXTVeU1uBBqY"
+ },
+ "outputs": [],
+ "source": [
+ "INPUT_AUDIO_SAMPLE_FILE_URI = (\n",
+ " \"gs://github-repo/audio_ai/speech_recognition/attention_is_all_you_need_podcast.wav\"\n",
+ ")\n",
+ "\n",
+ "RECOGNIZER = client.recognizer_path(PROJECT_ID, LOCATION, \"_\")\n",
+ "\n",
+ "MAX_CHUNK_SIZE = 25600"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "djgFxrGC_Ykd"
+ },
+ "source": [
+ "### Helpers"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Zih8W_wC_caW"
+ },
+ "outputs": [],
+ "source": [
+ "def read_audio_file(audio_file_path: str) -> bytes:\n",
+ " \"\"\"\n",
+ " Read audio file as bytes.\n",
+ " \"\"\"\n",
+ " if audio_file_path.startswith(\"gs://\"):\n",
+ " with ep.Path(audio_file_path).open(\"rb\") as f:\n",
+ " audio_bytes = f.read()\n",
+ " else:\n",
+ " with open(audio_file_path, \"rb\") as f:\n",
+ " audio_bytes = f.read()\n",
+ " return audio_bytes\n",
+ "\n",
+ "\n",
+ "def save_audio_sample(audio_bytes: bytes, output_file_uri: str) -> None:\n",
+ " \"\"\"\n",
+ " Save audio sample as a file in Google Cloud Storage.\n",
+ " \"\"\"\n",
+ "\n",
+ " output_file_path = ep.Path(output_file_uri)\n",
+ " if not output_file_path.parent.exists():\n",
+ " output_file_path.parent.mkdir(parents=True, exist_ok=True)\n",
+ "\n",
+ " with output_file_path.open(\"wb\") as f:\n",
+ " f.write(audio_bytes)\n",
+ "\n",
+ "\n",
+ "def extract_audio_sample(audio_bytes: bytes, duration: int) -> bytes:\n",
+ " \"\"\"\n",
+ " Extracts a random audio sample of a given duration from an audio file.\n",
+ " \"\"\"\n",
+ " audio = AudioSegment.from_file(io.BytesIO(audio_bytes))\n",
+ " start_time = 0\n",
+ " audio_sample = audio[start_time : start_time + duration * 1000]\n",
+ "\n",
+ " audio_bytes = io.BytesIO()\n",
+ " audio_sample.export(audio_bytes, format=\"wav\")\n",
+ " audio_bytes.seek(0)\n",
+ "\n",
+ " return audio_bytes.read()\n",
+ "\n",
+ "\n",
+ "def play_audio_sample(audio_bytes: bytes) -> None:\n",
+ " \"\"\"\n",
+ " Plays the audio sample in a notebook.\n",
+ " \"\"\"\n",
+ " audio_file = io.BytesIO(audio_bytes)\n",
+ " ipd.display(ipd.Audio(audio_file.read(), rate=44100))\n",
+ "\n",
+ "\n",
+ "def parse_real_time_recognize_response(response) -> list[tuple[str, int]]:\n",
+ " \"\"\"Parse real-time responses from the Speech-to-Text API\"\"\"\n",
+ " real_time_recognize_results = []\n",
+ " for result in response.results:\n",
+ " real_time_recognize_results.append(\n",
+ " (result.alternatives[0].transcript, result.result_end_offset)\n",
+ " )\n",
+ " return real_time_recognize_results\n",
+ "\n",
+ "\n",
+ "def parse_words_real_time_recognize_response(response):\n",
+ " \"\"\"\n",
+ " Parse the word-level results from a real-time speech recognition response.\n",
+ " \"\"\"\n",
+ " real_time_recognize_results = []\n",
+ " for result in response.results:\n",
+ " for word_info in result.alternatives[0].words:\n",
+ " word = word_info.word\n",
+ " start_time = word_info.start_offset.seconds\n",
+ " end_time = word_info.end_offset.seconds\n",
+ " real_time_recognize_results.append(\n",
+ " {\"word\": word, \"start\": start_time, \"end\": end_time}\n",
+ " )\n",
+ " return real_time_recognize_results\n",
+ "\n",
+ "\n",
+ "def print_transcription(\n",
+ " audio_sample_bytes: bytes, transcriptions: str, play_audio=True\n",
+ ") -> None:\n",
+ " \"\"\"Prettify the play of the audio and the associated print of the transcription text in a notebook\"\"\"\n",
+ "\n",
+ " if play_audio:\n",
+ " # Play the audio sample\n",
+ " display(ipd.HTML(\"Audio:\"))\n",
+ " play_audio_sample(audio_sample_bytes)\n",
+ " display(ipd.HTML(\" \"))\n",
+ "\n",
+ " # Display the transcription text\n",
+ " display(ipd.HTML(\"Transcription:\"))\n",
+ " for transcription, _ in transcriptions:\n",
+ " formatted_text = f\"
{transcription}
\"\n",
+ " display(ipd.HTML(formatted_text))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "y4MO5i9X4yq3"
+ },
+ "source": [
+ "### Prepare audio samples\n",
+ "\n",
+ "The podcast audio is ~ 8 mins. Depending on the audio length, you can use different transcribe API methods. To learn more, check out the official documentation. "
+ ]
+ },
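+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For instance, here is a minimal sketch (assuming the usual limits: synchronous `recognize` for clips up to about one minute, `batch_recognize` with a Cloud Storage URI for longer audio) of how you might pick a method based on the audio duration:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import io\n",
+ "\n",
+ "from pydub import AudioSegment\n",
+ "\n",
+ "\n",
+ "def pick_transcription_method(audio_bytes: bytes) -> str:\n",
+ "    \"\"\"Suggest a Speech-to-Text V2 method based on the audio duration.\"\"\"\n",
+ "    duration_sec = AudioSegment.from_file(io.BytesIO(audio_bytes)).duration_seconds\n",
+ "    # Synchronous recognition targets short clips (up to ~1 minute);\n",
+ "    # longer audio is better handled by batch recognition from a Cloud Storage URI.\n",
+ "    return \"recognize\" if duration_sec <= 60 else \"batch_recognize\"\n",
+ "\n",
+ "\n",
+ "print(pick_transcription_method(read_audio_file(INPUT_AUDIO_SAMPLE_FILE_URI)))"
+ ]
+ },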
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4uTeBXo6dZlS"
+ },
+ "source": [
+ "#### Read the audio file\n",
+ "\n",
+ "Let's start reading the input audio sample you want to transcribe.\n",
+ "\n",
+ "In this case, it is a podcast generated with NotebookLM about the \"Attention is all you need\" [paper](https://arxiv.org/abs/1706.03762)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "pjzwMWqpdldM"
+ },
+ "outputs": [],
+ "source": [
+ "input_audio_bytes = read_audio_file(INPUT_AUDIO_SAMPLE_FILE_URI)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cIAl9Lyd4niN"
+ },
+ "source": [
+ "#### Prepare a short audio sample (< 1 min)\n",
+ "\n",
+ "Extract a short audio sample from the original one for streaming and real-time audio processing."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "MofmWRSH4niO"
+ },
+ "outputs": [],
+ "source": [
+ "short_audio_sample_bytes = extract_audio_sample(input_audio_bytes, 30)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "AC2YeY7v4niO"
+ },
+ "outputs": [],
+ "source": [
+ "play_audio_sample(short_audio_sample_bytes)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "VPVDNRyVxquo"
+ },
+ "source": [
+ "## Improve transcription using Chirp 2's word-timing and speech adaptation features\n",
+ "\n",
+ "Chirp 2 supports word-level timestamps for each transcribed word and speech adaptation to help the model improving recognition accuracy for specific terms or proper nouns."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "oYCgDay2hAgB"
+ },
+ "source": [
+ "### Perform real-time speech recognition with word-timing"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "F83r9aiNhAgD"
+ },
+ "source": [
+ "#### Define real-time recognition configuration with `enable_word_time_offsets` parameter.\n",
+ "\n",
+ "You define the real-time recognition configuration which allows you to set the model to use, language code of the audio and more.\n",
+ "\n",
+ "In this case, you enable word timing feature. When True, the top result includes a list of words and the start and end time offsets (timestamps) for those words."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "j0WprZ-phAgD"
+ },
+ "outputs": [],
+ "source": [
+ "wt_real_time_config = cloud_speech.RecognitionConfig(\n",
+ " auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),\n",
+ " language_codes=[\"en-US\"],\n",
+ " model=\"chirp_2\",\n",
+ " features=cloud_speech.RecognitionFeatures(\n",
+ " enable_word_time_offsets=True,\n",
+ " enable_automatic_punctuation=True,\n",
+ " ),\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "r2TqksAqhAgD"
+ },
+ "source": [
+ "#### Define the real-time request configuration\n",
+ "\n",
+ "Next, you define the real-time request passing the configuration and the audio sample you want to transcribe.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Nh55mSzXhAgD"
+ },
+ "outputs": [],
+ "source": [
+ "wt_real_time_request = cloud_speech.RecognizeRequest(\n",
+ " config=wt_real_time_config, content=short_audio_sample_bytes, recognizer=RECOGNIZER\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "817YXVBli0aY"
+ },
+ "source": [
+ "#### Run the real-time recognition request\n",
+ "\n",
+ "Finally you submit the real-time recognition request."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "rc0cBrVsi7UG"
+ },
+ "outputs": [],
+ "source": [
+ "wt_response = client.recognize(request=wt_real_time_request)\n",
+ "wt_real_time_recognize_results = parse_real_time_recognize_response(wt_response)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "J2vpMSv7CZ_2"
+ },
+ "source": [
+ "And you use a helper function to visualize transcriptions and the associated streams."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ezH51rLH4CBR"
+ },
+ "outputs": [],
+ "source": [
+ "for transcription, _ in wt_real_time_recognize_results:\n",
+ " print_transcription(short_audio_sample_bytes, transcription)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "iFhUcPcO-Zeh"
+ },
+ "source": [
+ "#### Visualize word timings"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "mhH42sab-3Tg"
+ },
+ "outputs": [],
+ "source": [
+ "n = 10\n",
+ "word_timings = parse_words_real_time_recognize_response(wt_response)\n",
+ "for word_info in word_timings[:n]:\n",
+ " print(\n",
+ " f\"Word: {word_info['word']} - Start: {word_info['start']} sec - End: {word_info['end']} sec\"\n",
+ " )"
+ ]
+ },
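+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Optionally, you can sketch these word timings on a simple timeline with `plotly` (installed above); each horizontal bar starts at the word's start offset and spans its duration:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import plotly.graph_objects as go\n",
+ "\n",
+ "words = [w[\"word\"] for w in word_timings[:n]]\n",
+ "starts = [w[\"start\"] for w in word_timings[:n]]\n",
+ "# Give very short words a small visible width.\n",
+ "durations = [max(w[\"end\"] - w[\"start\"], 0.1) for w in word_timings[:n]]\n",
+ "\n",
+ "# Horizontal bars: each word spans from its start offset to its end offset.\n",
+ "fig = go.Figure(go.Bar(y=words, x=durations, base=starts, orientation=\"h\"))\n",
+ "fig.update_layout(\n",
+ "    title=\"Word timings for the first words of the sample\",\n",
+ "    xaxis_title=\"Seconds\",\n",
+ "    yaxis={\"autorange\": \"reversed\"},\n",
+ ")\n",
+ "fig.show()"
+ ]
+ },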
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "IFOq3SK0qOT_"
+ },
+ "source": [
+ "### Improve real-time speech recognition accuracy with model adaptation\n",
+ "\n",
+ "So far, Chirp 2 transcribes the podcast correctly. That's in part because podcasts are recorded in ideal enviroments like a recording studio. But that's not always the case. For example, suppose that your audio data is recorded in noisy environment or the recording has strong accents or someone speaks quickly.\n",
+ "\n",
+ "To handle this and many other scenarios and improve real-time speech recognition accuracy, you can use model adaptation. To enable model adaptation with Chirp 2, you use the `adaptation` parameter.\n",
+ "\n",
+ "With `adaptation` parameter, you provide \"hints\" to the speech recognizer to favor specific words and phrases (`AdaptationPhraseSet` class) in the results. And for each hint you can define a hint boost which is the probability that a specific word or phrase will be recognized over other similar sounding phrases. Be careful to use higher boost. Higher the boost, higher is the chance of false positive recognition as well. We recommend using a binary search approach to finding the optimal value for your use case as well as adding phrases both with and without boost to your requests.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "1--AAmsYq-vG"
+ },
+ "source": [
+ "#### Define real-time recognition configuration with `adaptation` parameter\n",
+ "\n",
+ "You define a new real-time recognition configuration which includes the `adaptation` configuration.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "qUr76NRcq-vH"
+ },
+ "outputs": [],
+ "source": [
+ "adaptation_real_time_config = cloud_speech.RecognitionConfig(\n",
+ " auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),\n",
+ " language_codes=[\"en-US\"],\n",
+ " model=\"chirp_2\",\n",
+ " features=cloud_speech.RecognitionFeatures(\n",
+ " enable_automatic_punctuation=True,\n",
+ " ),\n",
+ " adaptation=cloud_speech.SpeechAdaptation(\n",
+ " phrase_sets=[\n",
+ " cloud_speech.SpeechAdaptation.AdaptationPhraseSet(\n",
+ " inline_phrase_set=cloud_speech.PhraseSet(\n",
+ " phrases=[\n",
+ " {\n",
+ " \"value\": \"you know\", # often mumbled or spoken quickly\n",
+ " \"boost\": 10.0,\n",
+ " },\n",
+ " {\n",
+ " \"value\": \"what are they called again?\" # hesitations and changes in intonation\n",
+ " },\n",
+ " {\n",
+ " \"value\": \"Yeah, it's wild.\" # short interjections have brevity and the emotional inflection\n",
+ " },\n",
+ " ]\n",
+ " )\n",
+ " )\n",
+ " ]\n",
+ " ),\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2Lm8e-2Cq-vH"
+ },
+ "source": [
+ "#### Define the real-time request configuration"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "pH9ZxWFIq-vH"
+ },
+ "outputs": [],
+ "source": [
+ "adaptation_real_time_request = cloud_speech.RecognizeRequest(\n",
+ " config=adaptation_real_time_config,\n",
+ " content=short_audio_sample_bytes,\n",
+ " recognizer=RECOGNIZER,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "xaQSQMZZq-vH"
+ },
+ "source": [
+ "#### Run the real-time recognition request"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "KYiCZjDWq-vH"
+ },
+ "outputs": [],
+ "source": [
+ "adapted_response = client.recognize(request=adaptation_real_time_request)\n",
+ "adapted_real_time_recognize_results = parse_real_time_recognize_response(\n",
+ " adapted_response\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "xOjvJHHmq-vH"
+ },
+ "source": [
+ "And you use a helper function to visualize transcriptions and the associated streams."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "7Xipn8H4q-vH"
+ },
+ "outputs": [],
+ "source": [
+ "for transcription, _ in adapted_real_time_recognize_results:\n",
+ " print_transcription(short_audio_sample_bytes, transcription)"
+ ]
+ },
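+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The cell below sketches the boost search mentioned above. It assumes you have a human reference transcript of the sample to score against (`REFERENCE_TRANSCRIPT` is a placeholder) and that the word error rate, computed with `jiwer`, is roughly unimodal in the boost value; each iteration shrinks the search interval toward the side with the lower error:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import jiwer\n",
+ "\n",
+ "REFERENCE_TRANSCRIPT = \"...\"  # Placeholder: a human reference transcript of the sample.\n",
+ "\n",
+ "\n",
+ "def wer_for_boost(boost: float) -> float:\n",
+ "    \"\"\"Run recognition with a single boosted phrase and return the word error rate.\"\"\"\n",
+ "    config = cloud_speech.RecognitionConfig(\n",
+ "        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),\n",
+ "        language_codes=[\"en-US\"],\n",
+ "        model=\"chirp_2\",\n",
+ "        adaptation=cloud_speech.SpeechAdaptation(\n",
+ "            phrase_sets=[\n",
+ "                cloud_speech.SpeechAdaptation.AdaptationPhraseSet(\n",
+ "                    inline_phrase_set=cloud_speech.PhraseSet(\n",
+ "                        phrases=[{\"value\": \"you know\", \"boost\": boost}]\n",
+ "                    )\n",
+ "                )\n",
+ "            ]\n",
+ "        ),\n",
+ "    )\n",
+ "    request = cloud_speech.RecognizeRequest(\n",
+ "        config=config, content=short_audio_sample_bytes, recognizer=RECOGNIZER\n",
+ "    )\n",
+ "    response = client.recognize(request=request)\n",
+ "    transcript = \" \".join(r.alternatives[0].transcript for r in response.results)\n",
+ "    return jiwer.wer(REFERENCE_TRANSCRIPT, transcript)\n",
+ "\n",
+ "\n",
+ "# Repeatedly shrink the search interval, keeping the side with the lower WER.\n",
+ "low, high = 0.0, 20.0\n",
+ "for _ in range(4):\n",
+ "    m1 = low + (high - low) / 3\n",
+ "    m2 = high - (high - low) / 3\n",
+ "    if wer_for_boost(m1) <= wer_for_boost(m2):\n",
+ "        high = m2\n",
+ "    else:\n",
+ "        low = m1\n",
+ "print(f\"Selected boost: {(low + high) / 2:.1f}\")"
+ ]
+ },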
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "s0sIpQmJr40s"
+ },
+ "source": [
+ "## Transcript and translate using language-agnostic transcription and language translation\n",
+ "\n",
+ "Chirp 2 supports language-agnostic audio transcription and language translation. This means that Chirp 2 is capable of recognizing the language of the input audio and, at the same time, translate the outcome transcription in many different language.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5QpacKEDRStJ"
+ },
+ "source": [
+ "#### Define real-time recognition configuration with `language_code` and `translationConfig` parameters.\n",
+ "\n",
+ "You define a real-time recognition configuration by setting language codes in both `language_codes` and `translationConfig` parameters :\n",
+ "\n",
+ "* When `language_codes=[\"auto\"]`, you enable language-agnostic transcription to auto to detect language.\n",
+ "\n",
+ "* When `target_language=language_code` where `language_code` is one of the language in this list but different from the original language, you enable language translation."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "9vaW49XqUD2v"
+ },
+ "outputs": [],
+ "source": [
+ "target_language_code = \"ca-ES\" # @param {type:\"string\", isTemplate: true}"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "y3Z_vaKpRStK"
+ },
+ "outputs": [],
+ "source": [
+ "ts_real_time_config = cloud_speech.RecognitionConfig(\n",
+ " auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),\n",
+ " language_codes=[\"en-US\"],\n",
+ " translation_config=cloud_speech.TranslationConfig(\n",
+ " target_language=target_language_code\n",
+ " ),\n",
+ " model=\"chirp_2\",\n",
+ " features=cloud_speech.RecognitionFeatures(\n",
+ " enable_automatic_punctuation=True,\n",
+ " ),\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "nPGCDLWARStK"
+ },
+ "source": [
+ "#### Define the real-time request configuration"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "5_spCiHDRStK"
+ },
+ "outputs": [],
+ "source": [
+ "ts_real_time_request = cloud_speech.RecognizeRequest(\n",
+ " config=ts_real_time_config, content=short_audio_sample_bytes, recognizer=RECOGNIZER\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Mzr69sLjRStK"
+ },
+ "source": [
+ "#### Run the real-time recognition request"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NvcPOWLkRStK"
+ },
+ "outputs": [],
+ "source": [
+ "ts_response = client.recognize(request=ts_real_time_request)\n",
+ "ts_real_time_recognize_results = parse_real_time_recognize_response(ts_response)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "STjng1ZiRStK"
+ },
+ "source": [
+ "And you use a helper function to visualize transcriptions and the associated streams."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "EhTgg3qwRStL"
+ },
+ "outputs": [],
+ "source": [
+ "print_transcription(short_audio_sample_bytes, transcription, play_audio=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "_MkA144iQrAn"
+ },
+ "source": [
+ "## Chirp 2 playground\n",
+ "\n",
+ "To play with Chirp 2, you can create a simple Gradio application where you enable several Chirp 2 features.\n",
+ "\n",
+ "Below you have an example for language-agnostic transcription and language translation with Chirp 2.\n",
+ "\n",
+ "To know more, check out the official documentation [here](https://cloud.google.com/speech-to-text/v2/docs/chirp_2-model).\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "WjuuZHTbQwkF"
+ },
+ "outputs": [],
+ "source": [
+ "def transcribe_audio(audio, enable_translation, target_language_code):\n",
+ " \"\"\"Transcribe the given audio file with optional features.\"\"\"\n",
+ "\n",
+ " # Set variables\n",
+ " project_id = os.environ.get(\"GOOGLE_CLOUD_PROJECT\", PROJECT_ID)\n",
+ " location = os.environ.get(\"GOOGLE_CLOUD_REGION\", LOCATION)\n",
+ " api_endpoint = f\"{location}-speech.googleapis.com\"\n",
+ "\n",
+ " # initiate client\n",
+ " client = SpeechClient(\n",
+ " client_options=ClientOptions(\n",
+ " api_endpoint=api_endpoint,\n",
+ " )\n",
+ " )\n",
+ "\n",
+ " # read the audio\n",
+ " with open(audio, \"rb\") as audio_file:\n",
+ " content = audio_file.read()\n",
+ "\n",
+ " # define language agnostic real time recognition configuration\n",
+ " real_time_config = cloud_speech.RecognitionConfig(\n",
+ " model=\"chirp_2\",\n",
+ " language_codes=[\"auto\"],\n",
+ " features=cloud_speech.RecognitionFeatures(\n",
+ " enable_automatic_punctuation=True,\n",
+ " ),\n",
+ " auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),\n",
+ " )\n",
+ "\n",
+ " if enable_translation:\n",
+ " real_time_config.language_codes = [\"en-US\"]\n",
+ " real_time_config.translation_config = cloud_speech.TranslationConfig(\n",
+ " target_language=target_language_code\n",
+ " )\n",
+ "\n",
+ " # define real-time recognition request\n",
+ " recognizer = client.recognizer_path(project_id, location, \"_\")\n",
+ "\n",
+ " real_time_request = cloud_speech.RecognizeRequest(\n",
+ " config=real_time_config,\n",
+ " content=content,\n",
+ " recognizer=recognizer,\n",
+ " )\n",
+ "\n",
+ " response = client.recognize(request=real_time_request)\n",
+ "\n",
+ " full_transcript = \"\"\n",
+ " for result in response.results:\n",
+ " full_transcript += result.alternatives[0].transcript + \" \"\n",
+ " return full_transcript.strip()\n",
+ "\n",
+ "\n",
+ "def speech_to_text(audio, enable_translation=False, target_language_code=None):\n",
+ " if audio is None:\n",
+ " return \"\"\n",
+ "\n",
+ " text = transcribe_audio(audio, enable_translation, target_language_code)\n",
+ " return text"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "HQzUnSkErDTN"
+ },
+ "outputs": [],
+ "source": [
+ "# Create Gradio interface\n",
+ "demo = gr.Interface(\n",
+ " fn=speech_to_text,\n",
+ " inputs=[\n",
+ " gr.Audio(type=\"filepath\", label=\"Audio input\"),\n",
+ " gr.Checkbox(label=\"🧠 Enable Translation\"),\n",
+ " gr.Dropdown(\n",
+ " label=\"Select language to translate\",\n",
+ " choices=[\"ca-ES\", \"cy-GB\", \"de-DE\", \"ja-JP\", \"zh-Hans-CN\"],\n",
+ " interactive=True,\n",
+ " multiselect=False,\n",
+ " ),\n",
+ " ],\n",
+ " outputs=[gr.Textbox(label=\"📄 Transcription\")],\n",
+ " title=\"Chirp 2 Playground\",\n",
+ " description=\"
Speak or pass an audio and get the transcription!
\",\n",
+ ")\n",
+ "\n",
+ "# Launch the app\n",
+ "demo.launch()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XqTpn06QrEiZ"
+ },
+ "outputs": [],
+ "source": [
+ "demo.close()"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "get_started_with_chirp_2_sdk_features.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
diff --git a/vision/use-cases/image_segmentation_layers.ipynb b/vision/use-cases/image_segmentation_layers.ipynb
index 81602f94b7..f72be454b9 100644
--- a/vision/use-cases/image_segmentation_layers.ipynb
+++ b/vision/use-cases/image_segmentation_layers.ipynb
@@ -278,6 +278,8 @@
"outputs": [],
"source": [
"# Parses the mask bytes from the response and converts it to an Image PIL object\n",
+ "\n",
+ "\n",
"def prediction_to_mask_pil(prediction) -> PIL_Image:\n",
" encoded_mask_string = prediction[\"bytesBase64Encoded\"]\n",
" mask_bytes = base64.b64decode(encoded_mask_string)\n",
From dc3700493e921389ea1678653cdba5ef48205958 Mon Sep 17 00:00:00 2001
From: Mend Renovate
Date: Wed, 9 Oct 2024 16:15:04 +0200
Subject: [PATCH 52/76] chore(deps): update lycheeverse/lychee-action action to
v2 (#1228)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
|
[lycheeverse/lychee-action](https://redirect.github.com/lycheeverse/lychee-action)
| action | major | `v1` -> `v2` |
---
> [!WARNING]
> Some dependencies could not be looked up. Check the warning logs for
more information.
---
### Release Notes
lycheeverse/lychee-action (lycheeverse/lychee-action)
###
[`v2`](https://redirect.github.com/lycheeverse/lychee-action/compare/v1...v2)
[Compare
Source](https://redirect.github.com/lycheeverse/lychee-action/compare/v1...v2)
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] If you want to rebase/retry this PR, check
this box
---
This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/GoogleCloudPlatform/generative-ai).
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
---
.github/workflows/links.yaml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/.github/workflows/links.yaml b/.github/workflows/links.yaml
index 5db7bd015d..1ffd417dba 100644
--- a/.github/workflows/links.yaml
+++ b/.github/workflows/links.yaml
@@ -14,7 +14,7 @@ jobs:
- name: Link Checker
id: lychee
- uses: lycheeverse/lychee-action@v1
+ uses: lycheeverse/lychee-action@v2
- name: Create Issue From File
if: env.lychee_exit_code != 0
From ca5a50624968d665075b3e48db803f55a9a22eec Mon Sep 17 00:00:00 2001
From: Erwin Huizenga
Date: Thu, 10 Oct 2024 10:33:18 +0800
Subject: [PATCH 53/76] New notebook on fine-tuning gemini 1.5 flash for qa
(#1229)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [X] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [X] You are listed as the author in your notebook or README file.
- [X] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [X] Make your Pull Request title in the
specification.
- [X] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [X] Appropriate docs were updated (if necessary)
Fixes # 🦕
---
...sed_finetuning_using_gemini_qa_ipynb.ipynb | 1683 +++++++++++++++++
1 file changed, 1683 insertions(+)
create mode 100644 gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
diff --git a/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb b/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
new file mode 100644
index 0000000000..2e8ccc4bfd
--- /dev/null
+++ b/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
@@ -0,0 +1,1683 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ur8xi4C7S06n"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ojoyvz6mH1Hv"
+ },
+ "source": [
+ "# Supervised Fine Tuning with Gemini 1.5 Flash for Q&A\n",
+ "\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "| Author(s) | [Erwin Huizenga](https://github.com/erwinh85) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JAPoU8Sm5E6e"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "**Gemini** is a family of generative AI models developed by Google DeepMind designed for multimodal use cases. The Gemini API gives you access to the various Gemini models, such as Gemini 1.5 Pro and Gemini 1.5 Flash.\n",
+ "This notebook demonstrates fine-tuning the Gemini 1.5 Flahs using the Vertex AI Supervised Tuning feature. Supervised Tuning allows you to use your own labeled training data to further refine the base model's capabilities toward your specific tasks.\n",
+ "Supervised Tuning uses labeled examples to tune a model. Each example demonstrates the output you want from your text model during inference.\n",
+ "First, ensure your training data is of high quality, well-labeled, and directly relevant to the target task. This is crucial as low-quality data can adversely affect the performance and introduce bias in the fine-tuned model.\n",
+ "Training: Experiment with different configurations to optimize the model's performance on the target task.\n",
+ "Evaluation:\n",
+ "Metric: Choose appropriate evaluation metrics that accurately reflect the success of the fine-tuned model for your specific task\n",
+ "Evaluation Set: Use a separate set of data to evaluate the model's performance\n",
+ "\n",
+ "\n",
+ "Refer to public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning) for more details.\n",
+ "\n",
+ "\n",
+ "\n",
+ "Before running this notebook, ensure you have:\n",
+ "\n",
+ "- A Google Cloud project: Provide your project ID in the `PROJECT_ID` variable.\n",
+ "\n",
+ "- Authenticated your Colab environment: Run the authentication code block at the beginning.\n",
+ "\n",
+ "- Prepared training data (Test with your own data or use the one in the notebook): Data should be formatted in JSONL with prompts and corresponding completions."
+ ]
+ },
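+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For reference, each training example is one JSON object per line (JSONL). The sketch below uses the Gemini `contents` chat format with a user turn and a model turn; check the tuning documentation linked above for the authoritative schema:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import json\n",
+ "\n",
+ "# Sketch of a single supervised-tuning example (one JSON object per JSONL line).\n",
+ "# Field names follow the Gemini \"contents\" chat format; see the tuning docs for details.\n",
+ "example = {\n",
+ "    \"contents\": [\n",
+ "        {\"role\": \"user\", \"parts\": [{\"text\": \"Context: ... Question: What is SQuAD?\"}]},\n",
+ "        {\"role\": \"model\", \"parts\": [{\"text\": \"The Stanford Question Answering Dataset.\"}]},\n",
+ "    ]\n",
+ "}\n",
+ "print(json.dumps(example))"
+ ]
+ },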
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "f7SS5pzuIA-1"
+ },
+ "source": [
+ "### Costs\n",
+ "\n",
+ "This tutorial uses billable components of Google Cloud:\n",
+ "\n",
+ "* Vertex AI\n",
+ "* Cloud Storage\n",
+ "\n",
+ "Learn about [Vertex AI\n",
+ "pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage\n",
+ "pricing](https://cloud.google.com/storage/pricing), and use the [Pricing\n",
+ "Calculator](https://cloud.google.com/products/calculator/)\n",
+ "to generate a cost estimate based on your projected usage.\n",
+ "\n",
+ "To get an estimate of the number of tokens"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## Get started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform datasets"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "XRvKdaPDTznN",
+ "outputId": "b9219138-e8f3-4cfd-e324-9d61ef383732"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{'status': 'ok', 'restart': True}"
+ ]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "# Use the environment variable if the user doesn't provide Project ID.\n",
+ "import os\n",
+ "import vertexai\n",
+ "\n",
+ "PROJECT_ID = \"\" # @param {type:\"string\", isTemplate: true}\n",
+ "if PROJECT_ID == \"[your-project-id]\":\n",
+ " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
+ "\n",
+ "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n",
+ "\n",
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5303c05f7aa6"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 189,
+ "metadata": {
+ "id": "6fc324893334"
+ },
+ "outputs": [],
+ "source": [
+ "# Vertex AI SDK\n",
+ "from google.cloud import aiplatform\n",
+ "from google.cloud.aiplatform.metadata import context\n",
+ "from google.cloud.aiplatform.metadata import utils as metadata_utils\n",
+ "from vertexai.generative_models import (\n",
+ " GenerationConfig,\n",
+ " GenerativeModel,\n",
+ " HarmBlockThreshold,\n",
+ " HarmCategory,\n",
+ ")\n",
+ "from vertexai.preview.tuning import sft\n",
+ "\n",
+ "# Vertex AI SDK\n",
+ "from sklearn.metrics import f1_score\n",
+ "from sklearn.feature_extraction.text import TfidfVectorizer\n",
+ "import pandas as pd\n",
+ "import array\n",
+ "import time\n",
+ "from datasets import load_dataset\n",
+ "import numpy as np\n",
+ "import plotly.graph_objects as go\n",
+ "from plotly.subplots import make_subplots"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6bBZa2I-c-x8"
+ },
+ "source": [
+ "### Data\n",
+ "\n",
+ "#### SQuAD dataset\n",
+ "Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.\n",
+ "\n",
+ "You can fine more information on the SQuAD [github page](https://rajpurkar.github.io/SQuAD-explorer/)**bold text**"
+ ]
+ },
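+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If you want to peek at raw SQuAD records before using the prepared CSV files below, a quick sketch with the `datasets` library (installed earlier) looks like this:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from datasets import load_dataset\n",
+ "\n",
+ "# Load a tiny slice of SQuAD to inspect its fields (id, title, context, question, answers).\n",
+ "squad_sample = load_dataset(\"squad\", split=\"validation[:3]\")\n",
+ "for record in squad_sample:\n",
+ "    print(record[\"question\"])\n",
+ "    print(record[\"answers\"][\"text\"][0])\n",
+ "    print(\"-\" * 40)"
+ ]
+ },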
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KhebDJjRKePL"
+ },
+ "source": [
+ "First update the `BUCKET_NAME` parameter below. You can either use an existing bucket or create a new one."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 95,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "lit30Cktbfvo",
+ "outputId": "273ee3ae-cb16-42fd-9d59-898826d2fb60"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "gs://tuning-demo-erwinh/gemini-tuning\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Provide a bucket name\n",
+ "BUCKET_NAME = \"tuning-demo-erwinh/gemini-tuning\" # @param {type:\"string\"}\n",
+ "BUCKET_URI = f\"gs://{BUCKET_NAME}\"\n",
+ "print(BUCKET_URI)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ed-G-9cyKmPY"
+ },
+ "source": [
+ "Only run the code below if you want to create a new Google Cloud Storage bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "0UJ8S9YFA1pZ"
+ },
+ "outputs": [],
+ "source": [
+ "# ! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "izjwF63tLLEq"
+ },
+ "source": [
+ "Next you will copy the data into your bucket."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "wjvcxx_sA3xP"
+ },
+ "outputs": [],
+ "source": [
+ "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_test.csv .\n",
+ "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_train.csv .\n",
+ "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_validation.csv ."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3F10LuZeL3kt"
+ },
+ "source": [
+ "### Baseline\n",
+ "\n",
+ "Next you will prepare some test data that you will use to establish a baseline. This means evaluating your chosen model on a representative sample of your dataset before any fine-tuning. A baseline allows you to quantify the improvements achieved through fine-tuning."
+ ]
+ },
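+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a rough illustration of what a single baseline call looks like (the model name below is an assumption; the actual evaluation uses the test split prepared next):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Assumption: the base model to evaluate before tuning.\n",
+ "baseline_model = GenerativeModel(\"gemini-1.5-flash-002\")\n",
+ "\n",
+ "context_text = (\n",
+ "    \"The Normans were the people who in the 10th and 11th centuries \"\n",
+ "    \"gave their name to Normandy, a region in France.\"\n",
+ ")\n",
+ "question = \"In what country is Normandy located?\"\n",
+ "\n",
+ "prompt = f\"Answer the question using only the context.\\nContext: {context_text}\\nQuestion: {question}\"\n",
+ "print(baseline_model.generate_content(prompt).text)"
+ ]
+ },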
+ {
+ "cell_type": "code",
+ "execution_count": 181,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 98
+ },
+ "id": "LkOmXpegA8CW",
+ "outputId": "297f4339-83fc-4a4a-9ed0-62a469ac1acd"
+ },
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "summary": "{\n \"name\": \"test_df\",\n \"rows\": 40,\n \"fields\": [\n {\n \"column\": \"id\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 40,\n \"samples\": [\n \"5725bae289a1e219009abd92\",\n \"5726431aec44d21400f3dd13\",\n \"57269fab5951b619008f7808\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"title\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 26,\n \"samples\": [\n \"Teacher\",\n \"Ctenophora\",\n \"Normans\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"context\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 38,\n \"samples\": [\n \"On May 21, 2013, NFL owners at their spring meetings in Boston voted and awarded the game to Levi's Stadium. The $1.2 billion stadium opened in 2014. It is the first Super Bowl held in the San Francisco Bay Area since Super Bowl XIX in 1985, and the first in California since Super Bowl XXXVII took place in San Diego in 2003.\",\n \"The contracted batch of 15 Saturn Vs were enough for lunar landing missions through Apollo 20. NASA publicized a preliminary list of eight more planned landing sites, with plans to increase the mass of the CSM and LM for the last five missions, along with the payload capacity of the Saturn V. These final missions would combine the I and J types in the 1967 list, allowing the CMP to operate a package of lunar orbital sensors and cameras while his companions were on the surface, and allowing them to stay on the Moon for over three days. These missions would also carry the Lunar Roving Vehicle (LRV) increasing the exploration area and allowing televised liftoff of the LM. Also, the Block II spacesuit was revised for the extended missions to allow greater flexibility and visibility for driving the LRV.\",\n \"In July 1977, General Zia-ul-Haq overthrew Prime Minister Zulfiqar Ali Bhutto's regime in Pakistan. Ali Bhutto, a leftist in democratic competition with Islamists, had announced banning alcohol and nightclubs within six months, shortly before he was overthrown. Zia-ul-Haq was much more committed to Islamism, and \\\"Islamization\\\" or implementation of Islamic law, became a cornerstone of his eleven-year military dictatorship and Islamism became his \\\"official state ideology\\\". Zia ul Haq was an admirer of Mawdudi and Mawdudi's party Jamaat-e-Islami became the \\\"regime's ideological and political arm\\\". In Pakistan this Islamization from above was \\\"probably\\\" more complete \\\"than under any other regime except those in Iran and Sudan,\\\" but Zia-ul-Haq was also criticized by many Islamists for imposing \\\"symbols\\\" rather than substance, and using Islamization to legitimize his means of seizing power. Unlike neighboring Iran, Zia-ul-Haq's policies were intended to \\\"avoid revolutionary excess\\\", and not to strain relations with his American and Persian Gulf state allies. 
Zia-ul-Haq was killed in 1988 but Islamization remains an important element in Pakistani society.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"question\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 40,\n \"samples\": [\n \"How many species of Ctenophores have not been fully described or named?\",\n \"What was Tymnet\",\n \"What is Michael Carrick and Alan Shearer's profession?\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"answers\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 40,\n \"samples\": [\n \"possibly another 25\",\n \"an international data communications network headquartered in San Jose, CA\",\n \"international footballers\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
+ "type": "dataframe",
+ "variable_name": "test_df"
+ },
+ "text/html": [
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ " id title \\\n",
+ "0 56de3cd0cffd8e1900b4b6bf Normans \n",
+ "\n",
+ " context \\\n",
+ "0 Normans came into Scotland, building castles a... \n",
+ "\n",
+ " question answers \n",
+ "0 What culture's arrival in Scotland is know as ... Norman "
+ ]
+ },
+ "execution_count": 181,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "test_df = pd.read_csv('squad_test.csv')\n",
+ "test_df.head(1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "mrLxcVVcMsNO"
+ },
+ "source": [
+ "You will need to do some dataset preperations. We will add a system instruction to the dataset:\n",
+ "\n",
+ "`SystemInstruct`: System instructions are a set of instructions that the model processes before it processes prompts. We recommend that you use system instructions to tell the model how you want it to behave and respond to prompts.\n",
+ "\n",
+ "We will also combine the `context` and `question`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 80,
+ "metadata": {
+ "id": "c0pgJycOekZ3"
+ },
+ "outputs": [],
+ "source": [
+ "systemInstruct = \"Answer the question based on the context\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 182,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "N_u3VzUMsyqj",
+ "outputId": "ff16f581-73de-4595-aeed-6a80b39e8d4e"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Answer the question based on the context\n",
+ "Context: In July 1973, as part of its outreach programme to young people, the V&A became the first museum in Britain to present a rock concert. The V&A presented a combined concert/lecture by British progressive folk-rock band Gryphon, who explored the lineage of mediaeval music and instrumentation and related how those contributed to contemporary music 500 years later. This innovative approach to bringing young people to museums was a hallmark of the directorship of Roy Strong and was subsequently emulated by some other British museums.\n",
+ "Question: Which musical group did the V&A present in July 1973 as part of its youth outreach programme?\n"
+ ]
+ }
+ ],
+ "source": [
+ "# combine the systeminstruct + context + question into one column.\n",
+ "row_dataset = 6\n",
+ "\n",
+ "test_df[\"input_question\"] = systemInstruct + \"\\n\" + \"Context: \" + test_df[\"context\"] + \"\\n\" + \"Question: \" + test_df[\"question\"]\n",
+ "test_question = test_df[\"input_question\"].iloc[row_dataset]\n",
+ "print(test_question)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FSxYYqMGWrmj"
+ },
+ "source": [
+ "Next, set the model that you will use. In this example you will use `gemini-1.5-flash-002`. A multimodal model that is designed for high-volume, cost-effective applications, and which delivers speed and efficiency to build fast, lower-cost applications that don't compromise on quality.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 183,
+ "metadata": {
+ "id": "t-5X4goiqqBQ"
+ },
+ "outputs": [],
+ "source": [
+ "base_model = \"gemini-1.5-flash-002\"\n",
+ "generation_model = GenerativeModel(base_model)"
+ ]
+ },
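+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a side note: instead of prepending `systemInstruct` to every prompt, the Vertex AI SDK also accepts a system instruction directly on the model object. The commented-out cell below is an optional sketch of that approach and is not required for the rest of this notebook."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Optional sketch: attach the system instruction to the model instead of prepending it to each prompt.\n",
+ "# generation_model_with_si = GenerativeModel(base_model, system_instruction=systemInstruct)"
+ ]
+ },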
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "wyscyIenW4WZ"
+ },
+ "source": [
+ "Next lets take a question and get a prediction from Gemini that we can compare to the actual answer."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 184,
+ "metadata": {
+ "id": "cXencUYc6YAE"
+ },
+ "outputs": [],
+ "source": [
+ "def get_predictions(question: str) -> str:\n",
+ " \"\"\"Generates predictions for a given test question.\n",
+ "\n",
+ " Args:\n",
+ " test_question: The question to generate predictions for.\n",
+ "\n",
+ " Returns:\n",
+ " The generated prediction text.\n",
+ " \"\"\"\n",
+ "\n",
+ " prompt = f\"{question}\"\n",
+ "\n",
+ " generation_config = GenerationConfig(\n",
+ " temperature=0.1)\n",
+ "\n",
+ " response = generation_model.generate_content(\n",
+ " contents=prompt, generation_config=generation_config\n",
+ " ).text\n",
+ "\n",
+ " return response"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 186,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "gKa0wLooa3Is",
+ "outputId": "ce10cb15-31c9-4fbe-af4d-6c8d65139648"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Gemini response: The V&A presented the British progressive folk-rock band Gryphon.\n",
+ "\n",
+ "Actual answer: Gryphon\n"
+ ]
+ }
+ ],
+ "source": [
+ "test_answer = test_df[\"answers\"].iloc[row_dataset]\n",
+ "\n",
+ "response = get_predictions(test_question)\n",
+ "\n",
+ "print(f\"Gemini response: {response}\")\n",
+ "print(f\"Actual answer: {test_answer}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "OGRJTHKrdujw"
+ },
+ "source": [
+ "You can see that both answers are correct, but the response from Gemini is more lengthy. However, answers in the SQuAD dataset are typically concise and clear.\n",
+ "\n",
+ "Fine-tuning is a great way to control the type of output your use case requires. In this instance, you would want the model to provide short, clear answers.\n",
+ "\n",
+ "Next, let's check if each dataset has an equal number of examples."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 188,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "dCe0CUsi5E-Y",
+ "outputId": "8e89fbf4-7483-448e-b50a-4bfd50adeb75"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Number of strings in y_pred: 40\n",
+ "Number of strings in y_true: 40\n"
+ ]
+ }
+ ],
+ "source": [
+ "num_strings_pred = np.sum([isinstance(item, str) for item in y_pred])\n",
+ "print(f\"Number of strings in y_pred: {num_strings_pred}\")\n",
+ "\n",
+ "num_strings_true = np.sum([isinstance(item, str) for item in y_true])\n",
+ "print(f\"Number of strings in y_true: {num_strings_true}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "hvi7m8pKE8WB"
+ },
+ "source": [
+ "Next lest establish a baseline using evaluation metrics.\n",
+ "\n",
+ "Evaluating the performance of a Question Answering (QA) system requires specific metrics. Two commonly used metrics are Exact Match (EM) and F1 score.\n",
+ "\n",
+ "EM is a strict measure that only considers an answer correct if it perfectly matches the ground truth, even down to the punctuation. It's a binary metric - either 1 for a perfect match or 0 otherwise. This makes it sensitive to minor variations in phrasing.\n",
+ "\n",
+ "F1 score is more flexible. It considers the overlap between the predicted answer and the true answer in terms of individual words or tokens. It calculates the harmonic mean of precision (proportion of correctly predicted words out of all predicted words) and recall (proportion of correctly predicted words out of all true answer words). This allows for partial credit and is less sensitive to minor wording differences.\n",
+ "\n",
+ "In practice, EM is useful when exact wording is crucial, while F1 is more suitable when evaluating the overall understanding and semantic accuracy of the QA system. Often, both metrics are used together to provide a comprehensive evaluation."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 190,
+ "metadata": {
+ "id": "XcgEpTU55FFc"
+ },
+ "outputs": [],
+ "source": [
+ "def calculate_em_and_f1_for_text_arrays(y_true, y_pred, average='weighted'):\n",
+ " \"\"\"\n",
+ " Calculates the Exact Match (EM) and F1 score for arrays of text\n",
+ " using word-level comparisons.\n",
+ "\n",
+ " Args:\n",
+ " y_true: An array of ground truth strings.\n",
+ " y_pred: An array of predicted strings.\n",
+ " average: The averaging method to use for F1 score.\n",
+ "\n",
+ " Returns:\n",
+ " A tuple containing the EM score and the F1 score.\n",
+ " \"\"\"\n",
+ "\n",
+ " em = np.mean([t == p for t, p in zip(y_true, y_pred)])\n",
+ "\n",
+ " # Use TF-IDF to convert strings to numerical vectors\n",
+ " vectorizer = TfidfVectorizer()\n",
+ " all_text = np.concatenate((y_true, y_pred))\n",
+ " vectorizer.fit(all_text)\n",
+ " y_true_vec = vectorizer.transform(y_true)\n",
+ " y_pred_vec = vectorizer.transform(y_pred)\n",
+ "\n",
+ " # Calculate F1 score based on common words (non-zero elements)\n",
+ " y_true_class = (y_true_vec > 0).toarray().astype(int)\n",
+ " y_pred_class = (y_pred_vec > 0).toarray().astype(int)\n",
+ "\n",
+ " f1 = f1_score(y_true_class, y_pred_class, average=average)\n",
+ "\n",
+ " return em, f1\n"
+ ]
+ },
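+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To make the difference between the two metrics concrete, the cell below runs the helper on a tiny, made-up pair of answers: EM only rewards exact string matches, while the TF-IDF-based F1 gives partial credit for word overlap."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Toy illustration (made-up strings): EM is all-or-nothing, F1 rewards word overlap.\n",
+ "toy_true = np.array([\"the band Gryphon\", \"March 1969\"])\n",
+ "toy_pred = np.array([\"Gryphon\", \"March 1969\"])\n",
+ "\n",
+ "toy_em, toy_f1 = calculate_em_and_f1_for_text_arrays(toy_true, toy_pred)\n",
+ "print(f\"Toy EM: {toy_em}\")  # only the second answer matches exactly\n",
+ "print(f\"Toy F1: {toy_f1}\")  # higher than EM because 'Gryphon' still overlaps with the reference"
+ ]
+ },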
+ {
+ "cell_type": "code",
+ "execution_count": 191,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "rhDTq9p_GSBP",
+ "outputId": "4dde775e-2466-4ef7-e380-d23abddc6690"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "EM score: 0.0\n",
+ "F1 score: 0.030862136294937427\n"
+ ]
+ }
+ ],
+ "source": [
+ "em, f1 = calculate_em_and_f1_for_text_arrays(y_pred, y_true)\n",
+ "print(f\"EM score: {em}\")\n",
+ "print(f\"F1 score: {f1}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "22DfexbNfUHm"
+ },
+ "source": [
+ "### Prepare the data for fine-tuning\n",
+ "\n",
+ "To optimize the tuning process for a foundation model, ensure your dataset includes examples that reflect the desired task. Structure your training data in a text-to-text format, where each record in the dataset pairs an input text (or prompt) with its corresponding expected output. This supervised tuning approach uses the dataset to effectively teach the model the specific behavior or task you need it to perform, by providing numerous illustrative examples.\n",
+ "\n",
+ "The size of your dataset will vary depending on the complexity of the task, but as a general rule, the more examples you include, the better the model's performance.\n",
+ "\n",
+ "Dataset Format\n",
+ "Your training data should be structured in a JSONL file and stored at a Google Cloud Storage (GCS) URI. Each line in the JSONL file must adhere to the following schema:\n",
+ "\n",
+ "A `contents` array containing objects that define:\n",
+ "- A `role` (\"user\" for user input or \"model\" for model output)\n",
+ "- `parts` containing the input data.\n",
+ "\n",
+ "```\n",
+ "{\n",
+ " \"contents\":[\n",
+ " {\n",
+ " \"role\":\"user\", # This indicate input content\n",
+ " \"parts\":[\n",
+ " {\n",
+ " \"text\":\"How are you?\"\n",
+ " }\n",
+ " ]\n",
+ " },\n",
+ " {\n",
+ " \"role\":\"model\", # This indicate target content\n",
+ " \"parts\":[ # text only\n",
+ " {\n",
+ " \"text\":\"I am good, thank you!\"\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ " # ... repeat \"user\", \"model\" for multi turns.\n",
+ " ]\n",
+ "}\n",
+ "```\n",
+ "\n",
+ "Refer to the public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning-prepare#about-datasets) for more details."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 192,
+ "metadata": {
+ "id": "4DqrQp4cLqRy"
+ },
+ "outputs": [],
+ "source": [
+ "# combine the systeminstruct + context + question into one column.\n",
+ "train_df = pd.read_csv('squad_train.csv')\n",
+ "validation_df = pd.read_csv('squad_validation.csv')\n",
+ "\n",
+ "train_df[\"input_question\"] = systemInstruct + \"Context: \" + train_df[\"context\"] + \"Question: \" + train_df[\"question\"]\n",
+ "validation_df[\"input_question\"] = systemInstruct + \"Context: \" + validation_df[\"context\"] + \"Question: \" + validation_df[\"question\"]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 197,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "Pmzyz1migvHN",
+ "outputId": "38b0b753-b526-41a8-d124-d73baa2152bc"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "JSONL data written to squad_train.jsonl\n",
+ "JSONL data written to squad_validation.jsonl\n"
+ ]
+ }
+ ],
+ "source": [
+ "def df_to_jsonl(df, output_file):\n",
+ " \"\"\"Converts a Pandas DataFrame to JSONL format and saves it to a file.\n",
+ "\n",
+ " Args:\n",
+ " df: The DataFrame to convert.\n",
+ " output_file: The name of the output file.\n",
+ " \"\"\"\n",
+ "\n",
+ " with open(output_file, 'w') as f:\n",
+ " for row in df.itertuples(index=False):\n",
+ " jsonl_obj = {\n",
+ " \"systemInstruction\": {\"parts\": [{\"text\": \"Answer the question based on the provided context.\"}]},\n",
+ " \"contents\": [\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"parts\": [{\"text\": f\"Context: {row.context}\\n\\nQuestion: {row.question}\"}]\n",
+ " },\n",
+ " {\"role\": \"model\", \"parts\": [{\"text\": row.answers}]},\n",
+ " ]\n",
+ " }\n",
+ " f.write(json.dumps(jsonl_obj) + '\\n')\n",
+ "\n",
+ "# Process the DataFrames\n",
+ "df_to_jsonl(train_df, 'squad_train.jsonl')\n",
+ "df_to_jsonl(validation_df, 'squad_validation.jsonl')\n",
+ "\n",
+ "print(f\"JSONL data written to squad_train.jsonl\")\n",
+ "print(f\"JSONL data written to squad_validation.jsonl\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5OQv-ZMpJDhi"
+ },
+ "source": [
+ "Next you will copy the files into your Google Cloud bucket"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 195,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "O5k1jYJ10IeW",
+ "outputId": "f6af525a-0c69-414b-a9f7-7340879f4868"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Copying file://./squad_train.jsonl [Content-Type=application/octet-stream]...\n",
+ "/ [1 files][527.0 KiB/527.0 KiB] \n",
+ "Operation completed over 1 objects/527.0 KiB. \n",
+ "Copying file://./squad_validation.jsonl [Content-Type=application/octet-stream]...\n",
+ "/ [1 files][110.9 KiB/110.9 KiB] \n",
+ "Operation completed over 1 objects/110.9 KiB. \n"
+ ]
+ }
+ ],
+ "source": [
+ "!gsutil cp ./squad_train.jsonl {BUCKET_URI}\n",
+ "!gsutil cp ./squad_validation.jsonl {BUCKET_URI}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "UAHMYgFJJHjm"
+ },
+ "source": [
+ "### Start fine-tuning job\n",
+ "Next you can start the fine-tuning job.\n",
+ "\n",
+ "- `source_model`: Specifies the base Gemini model version you want to fine-tune.\n",
+ " - `train_dataset`: Path to your training data in JSONL format.\n",
+ "\n",
+ " *Optional parameters*\n",
+ " - `validation_dataset`: If provided, this data is used to evaluate the model during tuning.\n",
+ " - `tuned_model_display_name`: Display name for the tuned model.\n",
+ " - `epochs`: The number of training epochs to run.\n",
+ " - `learning_rate_multiplier`: A value to scale the learning rate during training.\n",
+ " - `adapter_size` : Gemini 1.5 Flash supports Adapter length [1, 4], default value is 4.\n",
+ "\n",
+ " **Important**: The default hyperparameter settings are optimized for optimal performance based on rigorous testing and are recommended for initial use. Users may customize these parameters to address specific performance requirements.**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 139,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 181
+ },
+ "id": "qj-LjQ5Vbf1E",
+ "outputId": "5af1f956-d5e3-4111-c100-85e60cc90890"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "INFO:vertexai.tuning._tuning:Creating SupervisedTuningJob\n",
+ "INFO:vertexai.tuning._tuning:SupervisedTuningJob created. Resource name: projects/713601331534/locations/us-central1/tuningJobs/8356726629560483840\n",
+ "INFO:vertexai.tuning._tuning:To use this SupervisedTuningJob in another session:\n",
+ "INFO:vertexai.tuning._tuning:tuning_job = sft.SupervisedTuningJob('projects/713601331534/locations/us-central1/tuningJobs/8356726629560483840')\n",
+ "INFO:vertexai.tuning._tuning:View Tuning Job:\n",
+ "https://console.cloud.google.com/vertex-ai/generative/language/locations/us-central1/tuning/tuningJob/8356726629560483840?project=713601331534\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " tune\n",
+ " View Tuning Job\n",
+ " \n",
+ " \n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "tuned_model_display_name = \"erwinh-fine-tuning-flash\" # @param {type:\"string\"}\n",
+ "\n",
+ "sft_tuning_job = sft.train(\n",
+ " source_model=base_model,\n",
+ " train_dataset=f\"\"\"{BUCKET_URI}/squad_train.jsonl\"\"\",\n",
+ " # # Optional:\n",
+ " validation_dataset=f\"\"\"{BUCKET_URI}/squad_validation.jsonl\"\"\",\n",
+ " tuned_model_display_name=tuned_model_display_name,\n",
+ ")"
+ ]
+ },
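+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The commented-out cell below sketches how the optional hyperparameters described above could be passed explicitly. The values are purely illustrative, not recommendations, and uncommenting it would start a second tuning job."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Optional sketch with explicit hyperparameters (illustrative values only).\n",
+ "# sft_tuning_job = sft.train(\n",
+ "#     source_model=base_model,\n",
+ "#     train_dataset=f\"{BUCKET_URI}/squad_train.jsonl\",\n",
+ "#     validation_dataset=f\"{BUCKET_URI}/squad_validation.jsonl\",\n",
+ "#     tuned_model_display_name=tuned_model_display_name,\n",
+ "#     epochs=4,\n",
+ "#     learning_rate_multiplier=1.0,\n",
+ "#     adapter_size=4,\n",
+ "# )"
+ ]
+ },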
+ {
+ "cell_type": "code",
+ "execution_count": 198,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "3tXawW1p8E5-",
+ "outputId": "83b11366-6ff2-4900-d6af-3a0f7a9aca19"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{'name': 'projects/713601331534/locations/us-central1/tuningJobs/8356726629560483840',\n",
+ " 'tunedModelDisplayName': 'erwinh-fine-tuning-flash',\n",
+ " 'baseModel': 'gemini-1.5-flash-002',\n",
+ " 'supervisedTuningSpec': {'trainingDatasetUri': 'gs://tuning-demo-erwinh/gemini-tuning/squad_train.jsonl',\n",
+ " 'validationDatasetUri': 'gs://tuning-demo-erwinh/gemini-tuning/squad_validation.jsonl',\n",
+ " 'hyperParameters': {'epochCount': '10',\n",
+ " 'learningRateMultiplier': 1.0,\n",
+ " 'adapterSize': 'ADAPTER_SIZE_EIGHT'}},\n",
+ " 'state': 'JOB_STATE_SUCCEEDED',\n",
+ " 'createTime': '2024-10-09T06:20:11.698883Z',\n",
+ " 'startTime': '2024-10-09T06:20:11.740358Z',\n",
+ " 'endTime': '2024-10-09T06:32:38.867719Z',\n",
+ " 'updateTime': '2024-10-09T06:32:38.867719Z',\n",
+ " 'experiment': 'projects/713601331534/locations/us-central1/metadataStores/default/contexts/tuning-experiment-20241008232013040864',\n",
+ " 'tunedModel': {'model': 'projects/713601331534/locations/us-central1/models/1582035604160380928@1',\n",
+ " 'endpoint': 'projects/713601331534/locations/us-central1/endpoints/5693131570647400448'},\n",
+ " 'tuningDataStats': {'supervisedTuningDataStats': {'tuningDatasetExampleCount': '500',\n",
+ " 'userInputTokenDistribution': {'sum': '94474',\n",
+ " 'min': 54.0,\n",
+ " 'max': 602.0,\n",
+ " 'mean': 188.948,\n",
+ " 'median': 172.0,\n",
+ " 'p5': 107.0,\n",
+ " 'p95': 327.0,\n",
+ " 'buckets': [{'count': 126.0, 'left': 54.0, 'right': 145.0},\n",
+ " {'count': 273.0, 'left': 146.0, 'right': 236.0},\n",
+ " {'count': 77.0, 'left': 237.0, 'right': 328.0},\n",
+ " {'count': 21.0, 'left': 329.0, 'right': 419.0},\n",
+ " {'count': 2.0, 'left': 420.0, 'right': 511.0},\n",
+ " {'count': 1.0, 'left': 512.0, 'right': 602.0}]},\n",
+ " 'userOutputTokenDistribution': {'sum': '2420',\n",
+ " 'min': 1.0,\n",
+ " 'max': 43.0,\n",
+ " 'mean': 4.84,\n",
+ " 'median': 4.0,\n",
+ " 'p5': 1.0,\n",
+ " 'p95': 15.0,\n",
+ " 'buckets': [{'count': 440.0, 'left': 1.0, 'right': 8.0},\n",
+ " {'count': 38.0, 'left': 9.0, 'right': 15.0},\n",
+ " {'count': 16.0, 'left': 16.0, 'right': 22.0},\n",
+ " {'count': 4.0, 'left': 23.0, 'right': 29.0},\n",
+ " {'count': 1.0, 'left': 30.0, 'right': 36.0},\n",
+ " {'count': 1.0, 'left': 37.0, 'right': 43.0}]},\n",
+ " 'userMessagePerExampleDistribution': {'sum': '1000',\n",
+ " 'min': 2.0,\n",
+ " 'max': 2.0,\n",
+ " 'mean': 2.0,\n",
+ " 'median': 2.0,\n",
+ " 'p5': 2.0,\n",
+ " 'p95': 2.0,\n",
+ " 'buckets': [{'count': 500.0, 'left': 2.0, 'right': 2.0}]},\n",
+ " 'userDatasetExamples': [{'role': 'user',\n",
+ " 'parts': [{'text': 'Context: On the next day, December 18, protests turned into civil unrest as clashes between troops, volunteers, militia units, and Kazakh students turned into a wide-scale confrontation. The clashes could only be controlled on the third day. The Almaty events were followed by smaller protests and demonstrations in Shymkent, Pavlodar, Karaganda, and Taldykorgan. Reports from Kazakh SSR authoriti...'}]},\n",
+ " {'role': 'model', 'parts': [{'text': '3,000'}]},\n",
+ " {'role': 'user',\n",
+ " 'parts': [{'text': \"Context: Roman Catholicism was the sole established religion in the Holy Roman Empire until the Reformation changed this drastically. In 1517, Martin Luther challenged the Catholic Church as he saw it as a corruption of Christian faith. Through this, he altered the course of European and world history and established Protestantism. The Thirty Years' War (1618–1648) was one of the most destructi...\"}]},\n",
+ " {'role': 'model', 'parts': [{'text': 'Roman Catholicism'}]},\n",
+ " {'role': 'user',\n",
+ " 'parts': [{'text': \"Context: Israel retaliated against Egyptian shelling with commando raids, artillery shelling and air strikes. This resulted in an exodus of civilians from Egyptian cities along the Suez Canal's western bank. Nasser ceased all military activities and began a program to build a network of internal defenses, while receiving the financial backing of various Arab states. The war resumed in March 196...\"}]},\n",
+ " {'role': 'model', 'parts': [{'text': 'March 1969'}]}],\n",
+ " 'totalBillableTokenCount': '96894'}}}"
+ ]
+ },
+ "execution_count": 198,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Get the tuning job info.\n",
+ "sft_tuning_job.to_dict()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 199,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ },
+ "id": "19aQnN-k84d9",
+ "outputId": "5c7f8c4f-566f-4c12-ffc7-60b4340d03d3"
+ },
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
+ },
+ "text/plain": [
+ "'projects/713601331534/locations/us-central1/tuningJobs/8356726629560483840'"
+ ]
+ },
+ "execution_count": 199,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Get the resource name of the tuning job\n",
+ "sft_tuning_job_name = sft_tuning_job.resource_name\n",
+ "sft_tuning_job_name"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "UKo8cwF2KVM5"
+ },
+ "source": [
+ "**Important:** Tuning time depends on several factors, such as training data size, number of epochs, learning rate multiplier, etc."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "8NiZnPkIKcwm"
+ },
+ "source": [
+ "
\n",
+ "⚠️ It will take ~30 mins for the model tuning job to complete on the provided dataset and set configurations/hyperparameters. ⚠️\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 200,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "Njag_3cB86rH",
+ "outputId": "b1408519-3735-4aca-86aa-89b0da5699b1"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "CPU times: user 67.5 ms, sys: 4.98 ms, total: 72.5 ms\n",
+ "Wall time: 1.28 s\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%time\n",
+ "# Wait for job completion\n",
+ "while not sft_tuning_job.refresh().has_ended:\n",
+ " time.sleep(60)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 201,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ },
+ "id": "dkx92RBdbf27",
+ "outputId": "0641746b-de08-4f9b-80cc-5afef558e7ea"
+ },
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
+ },
+ "text/plain": [
+ "'projects/713601331534/locations/us-central1/models/1582035604160380928@1'"
+ ]
+ },
+ "execution_count": 201,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# tuned model name\n",
+ "tuned_model_name = sft_tuning_job.tuned_model_name\n",
+ "tuned_model_name"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 202,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ },
+ "id": "e09aB_9Ebf5c",
+ "outputId": "c8a028c1-da5f-4e7d-d9db-ecde9c5293e1"
+ },
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
+ },
+ "text/plain": [
+ "'projects/713601331534/locations/us-central1/endpoints/5693131570647400448'"
+ ]
+ },
+ "execution_count": 202,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# tuned model endpoint name\n",
+ "tuned_model_endpoint_name = sft_tuning_job.tuned_model_endpoint_name\n",
+ "tuned_model_endpoint_name"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "gV1ukBznKmlN"
+ },
+ "source": [
+ "#### Model tuning metrics\n",
+ "\n",
+ "- `/train_total_loss`: Loss for the tuning dataset at a training step.\n",
+ "- `/train_fraction_of_correct_next_step_preds`: The token accuracy at a training step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the tuning dataset.\n",
+ "- `/train_num_predictions`: Number of predicted tokens at a training step\n",
+ "\n",
+ "#### Model evaluation metrics:\n",
+ "\n",
+ "- `/eval_total_loss`: Loss for the evaluation dataset at an evaluation step.\n",
+ "- `/eval_fraction_of_correct_next_step_preds`: The token accuracy at an evaluation step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the evaluation dataset.\n",
+ "- `/eval_num_predictions`: Number of predicted tokens at an evaluation step.\n",
+ "\n",
+ "The metrics visualizations are available after the model tuning job completes. If you don't specify a validation dataset when you create the tuning job, only the visualizations for the tuning metrics are available."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 155,
+ "metadata": {
+ "id": "DH0guHM---Jo"
+ },
+ "outputs": [],
+ "source": [
+ "# Locate Vertex AI Experiment and Vertex AI Experiment Run\n",
+ "experiment = aiplatform.Experiment(experiment_name=experiment_name)\n",
+ "filter_str = metadata_utils._make_filter_string(\n",
+ " schema_title=\"system.ExperimentRun\",\n",
+ " parent_contexts=[experiment.resource_name],\n",
+ ")\n",
+ "experiment_run = context.Context.list(filter_str)[0]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 156,
+ "metadata": {
+ "id": "hggHQFIl_FXC"
+ },
+ "outputs": [],
+ "source": [
+ "# Read data from Tensorboard\n",
+ "tensorboard_run_name = f\"{experiment.get_backing_tensorboard_resource().resource_name}/experiments/{experiment.name}/runs/{experiment_run.name.replace(experiment.name, '')[1:]}\"\n",
+ "tensorboard_run = aiplatform.TensorboardRun(tensorboard_run_name)\n",
+ "metrics = tensorboard_run.read_time_series_data()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 157,
+ "metadata": {
+ "id": "BdHKZdqG_bHf"
+ },
+ "outputs": [],
+ "source": [
+ "def get_metrics(metric: str = \"/train_total_loss\"):\n",
+ " \"\"\"\n",
+ " Get metrics from Tensorboard.\n",
+ "\n",
+ " Args:\n",
+ " metric: metric name, eg. /train_total_loss or /eval_total_loss.\n",
+ " Returns:\n",
+ " steps: list of steps.\n",
+ " steps_loss: list of loss values.\n",
+ " \"\"\"\n",
+ " loss_values = metrics[metric].values\n",
+ " steps_loss = []\n",
+ " steps = []\n",
+ " for loss in loss_values:\n",
+ " steps_loss.append(loss.scalar.value)\n",
+ " steps.append(loss.step)\n",
+ " return steps, steps_loss"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 158,
+ "metadata": {
+ "id": "_pDrlpA7_e9o"
+ },
+ "outputs": [],
+ "source": [
+ "# Get Train and Eval Loss\n",
+ "train_loss = get_metrics(metric=\"/train_total_loss\")\n",
+ "eval_loss = get_metrics(metric=\"/eval_total_loss\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 161,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 542
+ },
+ "id": "DL07j7u__iZx",
+ "outputId": "3cff463b-59f7-4db3-a884-e3a099583ab5"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ "
\n",
+ "
\n",
+ "\n",
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Plot the train and eval loss metrics using Plotly python library\n",
+ "fig = make_subplots(\n",
+ " rows=1, cols=2, shared_xaxes=True, subplot_titles=(\"Train Loss\", \"Eval Loss\")\n",
+ ")\n",
+ "\n",
+ "# Add traces\n",
+ "fig.add_trace(\n",
+ " go.Scatter(x=train_loss[0], y=train_loss[1], name=\"Train Loss\", mode=\"lines\"),\n",
+ " row=1,\n",
+ " col=1,\n",
+ ")\n",
+ "fig.add_trace(\n",
+ " go.Scatter(x=eval_loss[0], y=eval_loss[1], name=\"Eval Loss\", mode=\"lines\"),\n",
+ " row=1,\n",
+ " col=2,\n",
+ ")\n",
+ "\n",
+ "# Add figure title\n",
+ "fig.update_layout(title=\"Train and Eval Loss\", xaxis_title=\"Steps\", yaxis_title=\"Loss\")\n",
+ "\n",
+ "# Set x-axis title\n",
+ "fig.update_xaxes(title_text=\"Steps\")\n",
+ "\n",
+ "# Set y-axes titles\n",
+ "fig.update_yaxes(title_text=\"Loss\")\n",
+ "\n",
+ "# Show plot\n",
+ "fig.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 162,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "dYygz5ph_icf",
+ "outputId": "f15d6aa8-bbf3-46ca-e919-e91760c9ba90"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "***Testing***\n",
+ "candidates {\n",
+ " content {\n",
+ " role: \"model\"\n",
+ " parts {\n",
+ " text: \"European Council\\n\\n\"\n",
+ " }\n",
+ " }\n",
+ " finish_reason: STOP\n",
+ " avg_logprobs: -0.11596920092900594\n",
+ "}\n",
+ "usage_metadata {\n",
+ " prompt_token_count: 290\n",
+ " candidates_token_count: 3\n",
+ " total_token_count: 293\n",
+ "}\n",
+ "\n"
+ ]
+ }
+ ],
+ "source": [
+ "if True:\n",
+ " tuned_genai_model = GenerativeModel(tuned_model_endpoint_name)\n",
+ " # Test with the loaded model.\n",
+ " print(\"***Testing***\")\n",
+ " print(\n",
+ " tuned_genai_model.generate_content(\n",
+ " contents=prompt, generation_config=generation_config\n",
+ " )\n",
+ " )\n",
+ "else:\n",
+ " print(\"State:\", sft_tuning_job.state)\n",
+ " print(\"Error:\", sft_tuning_job.error)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "AbTaqCMxNf18"
+ },
+ "source": [
+ "### Model usage and evaluation.\n",
+ "\n",
+ "Next you will evaluate the model to see how well it performs. You can also compare it to the benchmark."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 164,
+ "metadata": {
+ "id": "W4YMNGuoDajB"
+ },
+ "outputs": [],
+ "source": [
+ "y_true = test_df[\"answers\"].values\n",
+ "\n",
+ "def get_predictions(test_question):\n",
+ "\n",
+ " prompt = f\"\"\"{test_question}\"\"\"\n",
+ "\n",
+ " generation_config = GenerationConfig(\n",
+ " temperature=0.1,\n",
+ " )\n",
+ "\n",
+ " response = tuned_genai_model.generate_content(contents=prompt, generation_config=generation_config).text\n",
+ "\n",
+ " return response"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 165,
+ "metadata": {
+ "id": "69FMuAeoDrm5"
+ },
+ "outputs": [],
+ "source": [
+ "y_pred = []\n",
+ "y_pred_question = test_df[\"question\"].values\n",
+ "\n",
+ "for i in y_pred_question:\n",
+ "\n",
+ " prediction = get_predictions(i)\n",
+ " y_pred.append(prediction)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 166,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "yj76Tu6ODalZ",
+ "outputId": "f5d76e1d-1fbf-4251-9c19-4f430a97ad0b"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "EM score: 0.0\n",
+ "F1 score: 0.2399679487179487\n"
+ ]
+ }
+ ],
+ "source": [
+ "em, f1 = calculate_em_and_f1_for_text_arrays(y_pred, y_true)\n",
+ "print(f\"EM score: {em}\")\n",
+ "print(f\"F1 score: {f1}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Vkb2qXljFYqX"
+ },
+ "source": []
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
From 331cf71675d76a8e02201804c670463f332adc99 Mon Sep 17 00:00:00 2001
From: Erwin Huizenga
Date: Thu, 10 Oct 2024 10:42:01 +0800
Subject: [PATCH 54/76] minor fixes (#1230)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [X] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [X] You are listed as the author in your notebook or README file.
- [X] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [X] Make your Pull Request title in the
specification.
- [X] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [X] Appropriate docs were updated (if necessary)
Fixes # 🦕
---
...sed_finetuning_using_gemini_qa_ipynb.ipynb | 822 +++---------------
1 file changed, 111 insertions(+), 711 deletions(-)
diff --git a/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb b/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
index 2e8ccc4bfd..85f96df281 100644
--- a/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
+++ b/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
@@ -94,7 +94,7 @@
"\n",
"- Authenticated your Colab environment: Run the authentication code block at the beginning.\n",
"\n",
- "- Prepared training data (Test with your own data or use the one in the notebook): Data should be formatted in JSONL with prompts and corresponding completions."
+ "- Prepare your training data. "
]
},
{
@@ -139,7 +139,7 @@
},
{
"cell_type": "code",
- "execution_count": 11,
+ "execution_count": null,
"metadata": {
"id": "tFy3H3aPgx12"
},
@@ -163,26 +163,11 @@
},
{
"cell_type": "code",
- "execution_count": 12,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "XRvKdaPDTznN",
- "outputId": "b9219138-e8f3-4cfd-e324-9d61ef383732"
- },
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{'status': 'ok', 'restart': True}"
- ]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
"source": [
"import IPython\n",
"\n",
@@ -214,7 +199,7 @@
},
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": null,
"metadata": {
"id": "NyKGtVQjgx13"
},
@@ -243,7 +228,7 @@
},
{
"cell_type": "code",
- "execution_count": 3,
+ "execution_count": null,
"metadata": {
"id": "Nqwi-5ufWp_B"
},
@@ -273,7 +258,7 @@
},
{
"cell_type": "code",
- "execution_count": 189,
+ "execution_count": null,
"metadata": {
"id": "6fc324893334"
},
@@ -328,23 +313,11 @@
},
{
"cell_type": "code",
- "execution_count": 95,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
- },
- "id": "lit30Cktbfvo",
- "outputId": "273ee3ae-cb16-42fd-9d59-898826d2fb60"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "gs://tuning-demo-erwinh/gemini-tuning\n"
- ]
- }
- ],
+ "execution_count": null,
+ "metadata": {
+ "id": "lit30Cktbfvo"
+ },
+ "outputs": [],
"source": [
"# Provide a bucket name\n",
"BUCKET_NAME = \"tuning-demo-erwinh/gemini-tuning\" # @param {type:\"string\"}\n",
@@ -407,162 +380,11 @@
},
{
"cell_type": "code",
- "execution_count": 181,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/",
- "height": 98
- },
- "id": "LkOmXpegA8CW",
- "outputId": "297f4339-83fc-4a4a-9ed0-62a469ac1acd"
- },
- "outputs": [
- {
- "data": {
- "application/vnd.google.colaboratory.intrinsic+json": {
- "summary": "{\n \"name\": \"test_df\",\n \"rows\": 40,\n \"fields\": [\n {\n \"column\": \"id\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 40,\n \"samples\": [\n \"5725bae289a1e219009abd92\",\n \"5726431aec44d21400f3dd13\",\n \"57269fab5951b619008f7808\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"title\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 26,\n \"samples\": [\n \"Teacher\",\n \"Ctenophora\",\n \"Normans\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"context\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 38,\n \"samples\": [\n \"On May 21, 2013, NFL owners at their spring meetings in Boston voted and awarded the game to Levi's Stadium. The $1.2 billion stadium opened in 2014. It is the first Super Bowl held in the San Francisco Bay Area since Super Bowl XIX in 1985, and the first in California since Super Bowl XXXVII took place in San Diego in 2003.\",\n \"The contracted batch of 15 Saturn Vs were enough for lunar landing missions through Apollo 20. NASA publicized a preliminary list of eight more planned landing sites, with plans to increase the mass of the CSM and LM for the last five missions, along with the payload capacity of the Saturn V. These final missions would combine the I and J types in the 1967 list, allowing the CMP to operate a package of lunar orbital sensors and cameras while his companions were on the surface, and allowing them to stay on the Moon for over three days. These missions would also carry the Lunar Roving Vehicle (LRV) increasing the exploration area and allowing televised liftoff of the LM. Also, the Block II spacesuit was revised for the extended missions to allow greater flexibility and visibility for driving the LRV.\",\n \"In July 1977, General Zia-ul-Haq overthrew Prime Minister Zulfiqar Ali Bhutto's regime in Pakistan. Ali Bhutto, a leftist in democratic competition with Islamists, had announced banning alcohol and nightclubs within six months, shortly before he was overthrown. Zia-ul-Haq was much more committed to Islamism, and \\\"Islamization\\\" or implementation of Islamic law, became a cornerstone of his eleven-year military dictatorship and Islamism became his \\\"official state ideology\\\". Zia ul Haq was an admirer of Mawdudi and Mawdudi's party Jamaat-e-Islami became the \\\"regime's ideological and political arm\\\". In Pakistan this Islamization from above was \\\"probably\\\" more complete \\\"than under any other regime except those in Iran and Sudan,\\\" but Zia-ul-Haq was also criticized by many Islamists for imposing \\\"symbols\\\" rather than substance, and using Islamization to legitimize his means of seizing power. Unlike neighboring Iran, Zia-ul-Haq's policies were intended to \\\"avoid revolutionary excess\\\", and not to strain relations with his American and Persian Gulf state allies. 
Zia-ul-Haq was killed in 1988 but Islamization remains an important element in Pakistani society.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"question\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 40,\n \"samples\": [\n \"How many species of Ctenophores have not been fully described or named?\",\n \"What was Tymnet\",\n \"What is Michael Carrick and Alan Shearer's profession?\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"answers\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 40,\n \"samples\": [\n \"possibly another 25\",\n \"an international data communications network headquartered in San Jose, CA\",\n \"international footballers\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
- "type": "dataframe",
- "variable_name": "test_df"
- },
- "text/html": [
- "\n",
- "
"
+ ],
+ "metadata": {
+ "id": "ojoyvz6mH1Hv"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "| Author(s) | [Erwin Huizenga](https://github.com/erwinh85) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JAPoU8Sm5E6e"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "**Gemini** is a family of generative AI models developed by Google DeepMind designed for multimodal use cases. The Gemini API gives you access to the various Gemini models, such as Gemini 1.5 Pro and Gemini 1.5 Flash.\n",
+ "This notebook demonstrates fine-tuning the Gemini 1.5 Flahs using the Vertex AI Supervised Tuning feature. Supervised Tuning allows you to use your own labeled training data to further refine the base model's capabilities toward your specific tasks.\n",
+ "Supervised Tuning uses labeled examples to tune a model. Each example demonstrates the output you want from your text model during inference.\n",
+ "First, ensure your training data is of high quality, well-labeled, and directly relevant to the target task. This is crucial as low-quality data can adversely affect the performance and introduce bias in the fine-tuned model.\n",
+ "Training: Experiment with different configurations to optimize the model's performance on the target task.\n",
+ "Evaluation:\n",
+ "Metric: Choose appropriate evaluation metrics that accurately reflect the success of the fine-tuned model for your specific task\n",
+ "Evaluation Set: Use a separate set of data to evaluate the model's performance\n",
+ "\n",
+ "\n",
+ "Refer to public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning) for more details.\n",
+ "\n",
+ "\n",
+ "\n",
+ "Before running this notebook, ensure you have:\n",
+ "\n",
+ "- A Google Cloud project: Provide your project ID in the `PROJECT_ID` variable.\n",
+ "\n",
+ "- Authenticated your Colab environment: Run the authentication code block at the beginning.\n",
+ "\n",
+ "- Prepared training data (Test with your own data or use the one in the notebook): Data should be formatted in JSONL with prompts and corresponding completions."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Costs\n",
+ "\n",
+ "This tutorial uses billable components of Google Cloud:\n",
+ "\n",
+ "* Vertex AI\n",
+ "* Cloud Storage\n",
+ "\n",
+ "Learn about [Vertex AI\n",
+ "pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage\n",
+ "pricing](https://cloud.google.com/storage/pricing), and use the [Pricing\n",
+ "Calculator](https://cloud.google.com/products/calculator/)\n",
+ "to generate a cost estimate based on your projected usage.\n",
+ "\n",
+ "To get an estimate of the number of tokens"
+ ],
+ "metadata": {
+ "id": "f7SS5pzuIA-1"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## Get started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform datasets"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "# Use the environment variable if the user doesn't provide Project ID.\n",
+ "import os\n",
+ "import vertexai\n",
+ "\n",
+ "PROJECT_ID = \"\" # @param {type:\"string\", isTemplate: true}\n",
+ "if PROJECT_ID == \"[your-project-id]\":\n",
+ " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
+ "\n",
+ "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n",
+ "\n",
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5303c05f7aa6"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "6fc324893334"
+ },
+ "outputs": [],
+ "source": [
+ "# Vertex AI SDK\n",
+ "from google.cloud import aiplatform\n",
+ "from google.cloud.aiplatform.metadata import context\n",
+ "from google.cloud.aiplatform.metadata import utils as metadata_utils\n",
+ "from vertexai.generative_models import (\n",
+ " GenerationConfig,\n",
+ " GenerativeModel,\n",
+ " HarmBlockThreshold,\n",
+ " HarmCategory,\n",
+ ")\n",
+ "from vertexai.preview.tuning import sft\n",
+ "\n",
+ "# Vertex AI SDK\n",
+ "from sklearn.metrics import f1_score\n",
+ "from sklearn.feature_extraction.text import TfidfVectorizer\n",
+ "import pandas as pd\n",
+ "import array\n",
+ "import time\n",
+ "from datasets import load_dataset\n",
+ "import numpy as np\n",
+ "import plotly.graph_objects as go\n",
+ "from plotly.subplots import make_subplots\n",
+ "import json"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Data\n",
+ "\n",
+ "#### SQuAD dataset\n",
+ "Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.\n",
+ "\n",
+ "You can fine more information on the SQuAD [github page](https://rajpurkar.github.io/SQuAD-explorer/)**bold text**"
+ ],
+ "metadata": {
+ "id": "6bBZa2I-c-x8"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "First update the `BUCKET_NAME` parameter below. You can either use an existing bucket or create a new one."
+ ],
+ "metadata": {
+ "id": "KhebDJjRKePL"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Provide a bucket name\n",
+ "BUCKET_NAME = \"\" # @param {type:\"string\"}\n",
+ "BUCKET_URI = f\"gs://{BUCKET_NAME}\"\n",
+ "print(BUCKET_URI)"
+ ],
+ "metadata": {
+ "id": "lit30Cktbfvo"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Only run the code below if you want to create a new Google Cloud Storage bucket."
+ ],
+ "metadata": {
+ "id": "ed-G-9cyKmPY"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# ! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}"
+ ],
+ "metadata": {
+ "id": "0UJ8S9YFA1pZ"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Next you will copy the data into your bucket."
+ ],
+ "metadata": {
+ "id": "izjwF63tLLEq"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_test.csv .\n",
+ "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_train.csv .\n",
+ "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_validation.csv ."
+ ],
+ "metadata": {
+ "id": "wjvcxx_sA3xP"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Baseline\n",
+ "\n",
+ "Next you will prepare some test data that you will use to establish a baseline. This means evaluating your chosen model on a representative sample of your dataset before any fine-tuning. A baseline allows you to quantify the improvements achieved through fine-tuning."
+ ],
+ "metadata": {
+ "id": "3F10LuZeL3kt"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "test_df = pd.read_csv('squad_test.csv')\n",
+ "test_df.head(1)"
+ ],
+ "metadata": {
+ "id": "LkOmXpegA8CW"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "You will need to do some dataset preperations. We will add a system instruction to the dataset:\n",
+ "\n",
+ "`SystemInstruct`: System instructions are a set of instructions that the model processes before it processes prompts. We recommend that you use system instructions to tell the model how you want it to behave and respond to prompts.\n",
+ "\n",
+ "We will also combine the `context` and `question`."
+ ],
+ "metadata": {
+ "id": "mrLxcVVcMsNO"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "systemInstruct = \"Answer the question based on the context\""
+ ],
+ "metadata": {
+ "id": "c0pgJycOekZ3"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# combine the systeminstruct + context + question into one column.\n",
+ "row_dataset = 6\n",
+ "\n",
+ "test_df[\"input_question\"] = systemInstruct + \"\\n\" + \"Context: \" + test_df[\"context\"] + \"\\n\" + \"Question: \" + test_df[\"question\"]\n",
+ "test_question = test_df[\"input_question\"].iloc[row_dataset]\n",
+ "print(test_question)"
+ ],
+ "metadata": {
+ "id": "N_u3VzUMsyqj"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Next, set the model that you will use. In this example you will use `gemini-1.5-flash-002`. A multimodal model that is designed for high-volume, cost-effective applications, and which delivers speed and efficiency to build fast, lower-cost applications that don't compromise on quality.\n",
+ "\n"
+ ],
+ "metadata": {
+ "id": "FSxYYqMGWrmj"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "base_model = \"gemini-1.5-flash-002\"\n",
+ "generation_model = GenerativeModel(base_model)"
+ ],
+ "metadata": {
+ "id": "t-5X4goiqqBQ"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Next lets take a question and get a prediction from Gemini that we can compare to the actual answer."
+ ],
+ "metadata": {
+ "id": "wyscyIenW4WZ"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "y_true = test_df[\"answers\"].values\n",
+ "y_pred_question = test_df[\"question\"].values"
+ ],
+ "metadata": {
+ "id": "ejjhfynQWc0k"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "def get_predictions(question: str) -> str:\n",
+ " \"\"\"Generates predictions for a given test question.\n",
+ "\n",
+ " Args:\n",
+ " test_question: The question to generate predictions for.\n",
+ "\n",
+ " Returns:\n",
+ " The generated prediction text.\n",
+ " \"\"\"\n",
+ "\n",
+ " prompt = f\"{question}\"\n",
+ "\n",
+ " generation_config = GenerationConfig(\n",
+ " temperature=0.1)\n",
+ "\n",
+ " response = generation_model.generate_content(\n",
+ " contents=prompt, generation_config=generation_config\n",
+ " ).text\n",
+ "\n",
+ " return response"
+ ],
+ "metadata": {
+ "id": "cXencUYc6YAE"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "test_answer = test_df[\"answers\"].iloc[row_dataset]\n",
+ "\n",
+ "response = get_predictions(test_question)\n",
+ "\n",
+ "print(f\"Gemini response: {response}\")\n",
+ "print(f\"Actual answer: {test_answer}\")"
+ ],
+ "metadata": {
+ "id": "gKa0wLooa3Is"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "You can see that both answers are correct, but the response from Gemini is more lengthy. However, answers in the SQuAD dataset are typically concise and clear.\n",
+ "\n",
+ "Fine-tuning is a great way to control the type of output your use case requires. In this instance, you would want the model to provide short, clear answers.\n",
+ "\n",
+ "Next, let's check if each dataset has an equal number of examples."
+ ],
+ "metadata": {
+ "id": "OGRJTHKrdujw"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "y_pred = test_df[\"question\"].values\n",
+ "\n",
+ "num_strings_pred = np.sum([isinstance(item, str) for item in y_pred])\n",
+ "print(f\"Number of strings in y_pred: {num_strings_pred}\")\n",
+ "\n",
+ "num_strings_true = np.sum([isinstance(item, str) for item in y_true])\n",
+ "print(f\"Number of strings in y_true: {num_strings_true}\")"
+ ],
+ "metadata": {
+ "id": "dCe0CUsi5E-Y"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Next lest establish a baseline using evaluation metrics.\n",
+ "\n",
+ "Evaluating the performance of a Question Answering (QA) system requires specific metrics. Two commonly used metrics are Exact Match (EM) and F1 score.\n",
+ "\n",
+ "EM is a strict measure that only considers an answer correct if it perfectly matches the ground truth, even down to the punctuation. It's a binary metric - either 1 for a perfect match or 0 otherwise. This makes it sensitive to minor variations in phrasing.\n",
+ "\n",
+ "F1 score is more flexible. It considers the overlap between the predicted answer and the true answer in terms of individual words or tokens. It calculates the harmonic mean of precision (proportion of correctly predicted words out of all predicted words) and recall (proportion of correctly predicted words out of all true answer words). This allows for partial credit and is less sensitive to minor wording differences.\n",
+ "\n",
+ "In practice, EM is useful when exact wording is crucial, while F1 is more suitable when evaluating the overall understanding and semantic accuracy of the QA system. Often, both metrics are used together to provide a comprehensive evaluation."
+ ],
+ "metadata": {
+ "id": "hvi7m8pKE8WB"
+ }
+ },
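To make the two metrics concrete before running the TF-IDF-based variant defined in the next cell, here is a minimal, self-contained sketch of SQuAD-style EM and token-overlap F1 for a single prediction/ground-truth pair. The `normalize` helper and the example strings are illustrative assumptions, not part of the notebook's evaluation code.

```
import re
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split into tokens."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def exact_match(truth: str, prediction: str) -> int:
    """1 if the normalized token sequences are identical, else 0."""
    return int(normalize(truth) == normalize(prediction))


def token_f1(truth: str, prediction: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    truth_tokens, pred_tokens = normalize(truth), normalize(prediction)
    common = Counter(truth_tokens) & Counter(pred_tokens)
    num_common = sum(common.values())
    if num_common == 0:
        return 0.0
    precision = num_common / len(pred_tokens)
    recall = num_common / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


# A verbose but correct prediction scores 0 on EM but still scores well on F1.
print(exact_match("Saint Bernadette Soubirous", "It was Saint Bernadette Soubirous."))          # 0
print(round(token_f1("Saint Bernadette Soubirous", "It was Saint Bernadette Soubirous."), 2))  # 0.75
```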
+ {
+ "cell_type": "code",
+ "source": [
+ "def calculate_em_and_f1_for_text_arrays(y_true, y_pred, average='weighted'):\n",
+ " \"\"\"\n",
+ " Calculates the Exact Match (EM) and F1 score for arrays of text\n",
+ " using word-level comparisons.\n",
+ "\n",
+ " Args:\n",
+ " y_true: An array of ground truth strings.\n",
+ " y_pred: An array of predicted strings.\n",
+ " average: The averaging method to use for F1 score.\n",
+ "\n",
+ " Returns:\n",
+ " A tuple containing the EM score and the F1 score.\n",
+ " \"\"\"\n",
+ "\n",
+ " em = np.mean([t == p for t, p in zip(y_true, y_pred)])\n",
+ "\n",
+ " # Use TF-IDF to convert strings to numerical vectors\n",
+ " vectorizer = TfidfVectorizer()\n",
+ " all_text = np.concatenate((y_true, y_pred))\n",
+ " vectorizer.fit(all_text)\n",
+ " y_true_vec = vectorizer.transform(y_true)\n",
+ " y_pred_vec = vectorizer.transform(y_pred)\n",
+ "\n",
+ " # Calculate F1 score based on common words (non-zero elements)\n",
+ " y_true_class = (y_true_vec > 0).toarray().astype(int)\n",
+ " y_pred_class = (y_pred_vec > 0).toarray().astype(int)\n",
+ "\n",
+ " f1 = f1_score(y_true_class, y_pred_class, average=average)\n",
+ "\n",
+ " return em, f1\n"
+ ],
+ "metadata": {
+ "id": "XcgEpTU55FFc"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "em, f1 = calculate_em_and_f1_for_text_arrays(y_pred, y_true)\n",
+ "print(f\"EM score: {em}\")\n",
+ "print(f\"F1 score: {f1}\")"
+ ],
+ "metadata": {
+ "id": "rhDTq9p_GSBP"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Prepare the data for fine-tuning\n",
+ "\n",
+ "To optimize the tuning process for a foundation model, ensure your dataset includes examples that reflect the desired task. Structure your training data in a text-to-text format, where each record in the dataset pairs an input text (or prompt) with its corresponding expected output. This supervised tuning approach uses the dataset to effectively teach the model the specific behavior or task you need it to perform, by providing numerous illustrative examples.\n",
+ "\n",
+ "The size of your dataset will vary depending on the complexity of the task, but as a general rule, the more examples you include, the better the model's performance.\n",
+ "\n",
+ "Dataset Format\n",
+ "Your training data should be structured in a JSONL file and stored at a Google Cloud Storage (GCS) URI. Each line in the JSONL file must adhere to the following schema:\n",
+ "\n",
+ "A `contents` array containing objects that define:\n",
+ "- A `role` (\"user\" for user input or \"model\" for model output)\n",
+ "- `parts` containing the input data.\n",
+ "\n",
+ "```\n",
+ "{\n",
+ " \"contents\":[\n",
+ " {\n",
+ " \"role\":\"user\", # This indicate input content\n",
+ " \"parts\":[\n",
+ " {\n",
+ " \"text\":\"How are you?\"\n",
+ " }\n",
+ " ]\n",
+ " },\n",
+ " {\n",
+ " \"role\":\"model\", # This indicate target content\n",
+ " \"parts\":[ # text only\n",
+ " {\n",
+ " \"text\":\"I am good, thank you!\"\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ " # ... repeat \"user\", \"model\" for multi turns.\n",
+ " ]\n",
+ "}\n",
+ "```\n",
+ "\n",
+ "Refer to the public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning-prepare#about-datasets) for more details."
+ ],
+ "metadata": {
+ "id": "22DfexbNfUHm"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# combine the systeminstruct + context + question into one column.\n",
+ "train_df = pd.read_csv('squad_train.csv')\n",
+ "validation_df = pd.read_csv('squad_validation.csv')\n",
+ "\n",
+ "train_df[\"input_question\"] = systemInstruct + \"Context: \" + train_df[\"context\"] + \"Question: \" + train_df[\"question\"]\n",
+ "validation_df[\"input_question\"] = systemInstruct + \"Context: \" + validation_df[\"context\"] + \"Question: \" + validation_df[\"question\"]"
+ ],
+ "metadata": {
+ "id": "4DqrQp4cLqRy"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "def df_to_jsonl(df, output_file):\n",
+ " \"\"\"Converts a Pandas DataFrame to JSONL format and saves it to a file.\n",
+ "\n",
+ " Args:\n",
+ " df: The DataFrame to convert.\n",
+ " output_file: The name of the output file.\n",
+ " \"\"\"\n",
+ "\n",
+ " with open(output_file, 'w') as f:\n",
+ " for row in df.itertuples(index=False):\n",
+ " jsonl_obj = {\n",
+ " \"systemInstruction\": {\"parts\": [{\"text\": \"Answer the question based on the provided context.\"}]},\n",
+ " \"contents\": [\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"parts\": [{\"text\": f\"Context: {row.context}\\n\\nQuestion: {row.question}\"}]\n",
+ " },\n",
+ " {\"role\": \"model\", \"parts\": [{\"text\": row.answers}]},\n",
+ " ]\n",
+ " }\n",
+ " f.write(json.dumps(jsonl_obj) + '\\n')\n",
+ "\n",
+ "# Process the DataFrames\n",
+ "df_to_jsonl(train_df, 'squad_train.jsonl')\n",
+ "df_to_jsonl(validation_df, 'squad_validation.jsonl')\n",
+ "\n",
+ "print(f\"JSONL data written to squad_train.jsonl\")\n",
+ "print(f\"JSONL data written to squad_validation.jsonl\")"
+ ],
+ "metadata": {
+ "id": "Pmzyz1migvHN"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
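As an optional sanity check (not part of the original notebook), you could read the files back and confirm that every line parses as JSON and carries the `systemInstruction` and `contents` keys in the expected order:

```
import json


def validate_jsonl(path: str) -> int:
    """Returns the number of well-formed records; raises if a line is malformed."""
    count = 0
    with open(path) as f:
        for line_number, line in enumerate(f, start=1):
            record = json.loads(line)  # raises ValueError on malformed JSON
            assert "systemInstruction" in record and "contents" in record, (
                f"Missing keys on line {line_number}"
            )
            assert record["contents"][0]["role"] == "user"
            assert record["contents"][1]["role"] == "model"
            count += 1
    return count


print("train records:", validate_jsonl("squad_train.jsonl"))
print("validation records:", validate_jsonl("squad_validation.jsonl"))
```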
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Next you will copy the files into your Google Cloud bucket"
+ ],
+ "metadata": {
+ "id": "5OQv-ZMpJDhi"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!gsutil cp ./squad_train.jsonl {BUCKET_URI}\n",
+ "!gsutil cp ./squad_validation.jsonl {BUCKET_URI}"
+ ],
+ "metadata": {
+ "id": "O5k1jYJ10IeW"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Start fine-tuning job\n",
+ "Next you can start the fine-tuning job.\n",
+ "\n",
+ "- `source_model`: Specifies the base Gemini model version you want to fine-tune.\n",
+ " - `train_dataset`: Path to your training data in JSONL format.\n",
+ "\n",
+ " *Optional parameters*\n",
+ " - `validation_dataset`: If provided, this data is used to evaluate the model during tuning.\n",
+ " - `tuned_model_display_name`: Display name for the tuned model.\n",
+ " - `epochs`: The number of training epochs to run.\n",
+ " - `learning_rate_multiplier`: A value to scale the learning rate during training.\n",
+ " - `adapter_size` : Gemini 1.5 Flash supports Adapter length [1, 4], default value is 4.\n",
+ "\n",
+ " **Important**: The default hyperparameter settings are optimized for optimal performance based on rigorous testing and are recommended for initial use. Users may customize these parameters to address specific performance requirements.**"
+ ],
+ "metadata": {
+ "id": "UAHMYgFJJHjm"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "tuned_model_display_name = \"fine-tuning-gemini-flash-qa-v01\" # @param {type:\"string\"}\n",
+ "\n",
+ "sft_tuning_job = sft.train(\n",
+ " source_model=base_model,\n",
+ " train_dataset=f\"\"\"{BUCKET_URI}/squad_train.jsonl\"\"\",\n",
+ " # # Optional:\n",
+ " validation_dataset=f\"\"\"{BUCKET_URI}/squad_validation.jsonl\"\"\",\n",
+ " tuned_model_display_name=tuned_model_display_name,\n",
+ ")"
+ ],
+ "metadata": {
+ "id": "qj-LjQ5Vbf1E"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
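If you want to override the defaults described above, the optional hyperparameters can be passed to the same `sft.train()` call. The sketch below is only illustrative; the values chosen for `epochs`, `learning_rate_multiplier`, and `adapter_size` are placeholders, not tuned recommendations.

```
# Variant of the call above with the optional hyperparameters set explicitly.
# The values are illustrative placeholders, not recommendations.
sft_tuning_job = sft.train(
    source_model=base_model,
    train_dataset=f"{BUCKET_URI}/squad_train.jsonl",
    validation_dataset=f"{BUCKET_URI}/squad_validation.jsonl",
    tuned_model_display_name=tuned_model_display_name,
    epochs=3,                      # number of passes over the training data
    learning_rate_multiplier=1.0,  # scales the default learning rate
    adapter_size=4,                # Gemini 1.5 Flash supports 1 or 4
)
```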
+ {
+ "cell_type": "code",
+ "source": [
+ "# Get the tuning job info.\n",
+ "sft_tuning_job.to_dict()"
+ ],
+ "metadata": {
+ "id": "3tXawW1p8E5-"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Get the resource name of the tuning job\n",
+ "sft_tuning_job_name = sft_tuning_job.resource_name\n",
+ "sft_tuning_job_name"
+ ],
+ "metadata": {
+ "id": "19aQnN-k84d9"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "**Important:** Tuning time depends on several factors, such as training data size, number of epochs, learning rate multiplier, etc."
+ ],
+ "metadata": {
+ "id": "UKo8cwF2KVM5"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "
\n",
+ "⚠️ It will take ~30 mins for the model tuning job to complete on the provided dataset and set configurations/hyperparameters. ⚠️\n",
+ "
"
+ ],
+ "metadata": {
+ "id": "8NiZnPkIKcwm"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "%%time\n",
+ "# Wait for job completion\n",
+ "while not sft_tuning_job.refresh().has_ended:\n",
+ " time.sleep(60)"
+ ],
+ "metadata": {
+ "id": "Njag_3cB86rH"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# tuned model name\n",
+ "tuned_model_name = sft_tuning_job.tuned_model_name\n",
+ "tuned_model_name"
+ ],
+ "metadata": {
+ "id": "dkx92RBdbf27"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# tuned model endpoint name\n",
+ "tuned_model_endpoint_name = sft_tuning_job.tuned_model_endpoint_name\n",
+ "tuned_model_endpoint_name"
+ ],
+ "metadata": {
+ "id": "e09aB_9Ebf5c"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Model tuning metrics\n",
+ "\n",
+ "- `/train_total_loss`: Loss for the tuning dataset at a training step.\n",
+ "- `/train_fraction_of_correct_next_step_preds`: The token accuracy at a training step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the tuning dataset.\n",
+ "- `/train_num_predictions`: Number of predicted tokens at a training step\n",
+ "\n",
+ "#### Model evaluation metrics:\n",
+ "\n",
+ "- `/eval_total_loss`: Loss for the evaluation dataset at an evaluation step.\n",
+ "- `/eval_fraction_of_correct_next_step_preds`: The token accuracy at an evaluation step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the evaluation dataset.\n",
+ "- `/eval_num_predictions`: Number of predicted tokens at an evaluation step.\n",
+ "\n",
+ "The metrics visualizations are available after the model tuning job completes. If you don't specify a validation dataset when you create the tuning job, only the visualizations for the tuning metrics are available."
+ ],
+ "metadata": {
+ "id": "gV1ukBznKmlN"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Get resource name from tuning job.\n",
+ "experiment_name = sft_tuning_job.experiment.resource_name\n",
+ "experiment_name"
+ ],
+ "metadata": {
+ "id": "IHVU4XP2aOFE"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Locate Vertex AI Experiment and Vertex AI Experiment Run\n",
+ "experiment = aiplatform.Experiment(experiment_name=experiment_name)\n",
+ "filter_str = metadata_utils._make_filter_string(\n",
+ " schema_title=\"system.ExperimentRun\",\n",
+ " parent_contexts=[experiment.resource_name],\n",
+ ")\n",
+ "experiment_run = context.Context.list(filter_str)[0]"
+ ],
+ "metadata": {
+ "id": "DH0guHM---Jo"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Read data from Tensorboard\n",
+ "tensorboard_run_name = f\"{experiment.get_backing_tensorboard_resource().resource_name}/experiments/{experiment.name}/runs/{experiment_run.name.replace(experiment.name, '')[1:]}\"\n",
+ "tensorboard_run = aiplatform.TensorboardRun(tensorboard_run_name)\n",
+ "metrics = tensorboard_run.read_time_series_data()"
+ ],
+ "metadata": {
+ "id": "hggHQFIl_FXC"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
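Before extracting a specific series, it can be useful to list which metric names the TensorBoard run actually contains (the loss metrics used below, plus the token-accuracy and prediction-count metrics described earlier). A small optional check:

```
# List the metric names available in this TensorBoard run and how many points each has,
# e.g. '/train_total_loss', '/eval_total_loss', '/train_fraction_of_correct_next_step_preds', ...
for name in sorted(metrics):
    print(name, "-", len(metrics[name].values), "points")
```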
+ {
+ "cell_type": "code",
+ "source": [
+ "def get_metrics(metric: str = \"/train_total_loss\"):\n",
+ " \"\"\"\n",
+ " Get metrics from Tensorboard.\n",
+ "\n",
+ " Args:\n",
+ " metric: metric name, eg. /train_total_loss or /eval_total_loss.\n",
+ " Returns:\n",
+ " steps: list of steps.\n",
+ " steps_loss: list of loss values.\n",
+ " \"\"\"\n",
+ " loss_values = metrics[metric].values\n",
+ " steps_loss = []\n",
+ " steps = []\n",
+ " for loss in loss_values:\n",
+ " steps_loss.append(loss.scalar.value)\n",
+ " steps.append(loss.step)\n",
+ " return steps, steps_loss"
+ ],
+ "metadata": {
+ "id": "BdHKZdqG_bHf"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Get Train and Eval Loss\n",
+ "train_loss = get_metrics(metric=\"/train_total_loss\")\n",
+ "eval_loss = get_metrics(metric=\"/eval_total_loss\")"
+ ],
+ "metadata": {
+ "id": "_pDrlpA7_e9o"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Plot the train and eval loss metrics using Plotly python library\n",
+ "fig = make_subplots(\n",
+ " rows=1, cols=2, shared_xaxes=True, subplot_titles=(\"Train Loss\", \"Eval Loss\")\n",
+ ")\n",
+ "\n",
+ "# Add traces\n",
+ "fig.add_trace(\n",
+ " go.Scatter(x=train_loss[0], y=train_loss[1], name=\"Train Loss\", mode=\"lines\"),\n",
+ " row=1,\n",
+ " col=1,\n",
+ ")\n",
+ "fig.add_trace(\n",
+ " go.Scatter(x=eval_loss[0], y=eval_loss[1], name=\"Eval Loss\", mode=\"lines\"),\n",
+ " row=1,\n",
+ " col=2,\n",
+ ")\n",
+ "\n",
+ "# Add figure title\n",
+ "fig.update_layout(title=\"Train and Eval Loss\", xaxis_title=\"Steps\", yaxis_title=\"Loss\")\n",
+ "\n",
+ "# Set x-axis title\n",
+ "fig.update_xaxes(title_text=\"Steps\")\n",
+ "\n",
+ "# Set y-axes titles\n",
+ "fig.update_yaxes(title_text=\"Loss\")\n",
+ "\n",
+ "# Show plot\n",
+ "fig.show()"
+ ],
+ "metadata": {
+ "id": "DL07j7u__iZx"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### Use the fine-tuned model and evaluation"
+ ],
+ "metadata": {
+ "id": "pivmh4Lwbgy1"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "prompt = \"\"\"\n",
+ "Answer the question based on the context\n",
+ "\n",
+ "Context: In the 1840s and 50s, there were attempts to overcome this problem by means of various patent valve gears with a separate, variable cutoff expansion valve riding on the back of the main slide valve; the latter usually had fixed or limited cutoff.\n",
+ "The combined setup gave a fair approximation of the ideal events, at the expense of increased friction and wear, and the mechanism tended to be complicated.\n",
+ "The usual compromise solution has been to provide lap by lengthening rubbing surfaces of the valve in such a way as to overlap the port on the admission side, with the effect that the exhaust side remains open for a longer period after cut-off on the admission side has occurred.\n",
+ "This expedient has since been generally considered satisfactory for most purposes and makes possible the use of the simpler Stephenson, Joy and Walschaerts motions.\n",
+ "Corliss, and later, poppet valve gears had separate admission and exhaust valves driven by trip mechanisms or cams profiled so as to give ideal events; most of these gears never succeeded outside of the stationary marketplace due to various other issues including leakage and more delicate mechanisms.\n",
+ "\n",
+ "Question: How is lap provided by overlapping the admission side port?\n",
+ "\"\"\""
+ ],
+ "metadata": {
+ "id": "qO6ln4teagw1"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "if True:\n",
+ " tuned_genai_model = GenerativeModel(tuned_model_endpoint_name)\n",
+ " # Test with the loaded model.\n",
+ " print(\"***Testing***\")\n",
+ " print(\n",
+ " tuned_genai_model.generate_content(\n",
+ " contents=prompt\n",
+ " )\n",
+ " )\n",
+ "else:\n",
+ " print(\"State:\", sft_tuning_job.state)\n",
+ " print(\"Error:\", sft_tuning_job.error)"
+ ],
+ "metadata": {
+ "id": "dYygz5ph_icf"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "y_true = test_df[\"answers\"].values\n",
+ "\n",
+ "def get_predictions(test_question):\n",
+ "\n",
+ " prompt = f\"\"\"{test_question}\"\"\"\n",
+ "\n",
+ " generation_config = GenerationConfig(\n",
+ " temperature=0.1,\n",
+ " )\n",
+ "\n",
+ " response = tuned_genai_model.generate_content(contents=prompt, generation_config=generation_config).text\n",
+ "\n",
+ " return response"
+ ],
+ "metadata": {
+ "id": "W4YMNGuoDajB"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "y_pred = []\n",
+ "y_pred_question = test_df[\"question\"].values\n",
+ "\n",
+ "for i in y_pred_question:\n",
+ "\n",
+ " prediction = get_predictions(i)\n",
+ " y_pred.append(prediction)"
+ ],
+ "metadata": {
+ "id": "69FMuAeoDrm5"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "em, f1 = calculate_em_and_f1_for_text_arrays(y_pred, y_true)\n",
+ "print(f\"EM score: {em}\")\n",
+ "print(f\"F1 score: {f1}\")"
+ ],
+ "metadata": {
+ "id": "yj76Tu6ODalZ"
+ },
+ "execution_count": null,
+ "outputs": []
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
\ No newline at end of file
From dfe6cc6fe18ab4455fdae72c24fdd537fd50e83f Mon Sep 17 00:00:00 2001
From: Erwin Huizenga
Date: Thu, 10 Oct 2024 11:29:22 +0800
Subject: [PATCH 56/76] minor updates (#1232)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [X] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [X] You are listed as the author in your notebook or README file.
- [X] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [X] Make your Pull Request title in the
specification.
- [X] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [X] Appropriate docs were updated (if necessary)
Fixes # 🦕
---
...sed_finetuning_using_gemini_qa_ipynb.ipynb | 1083 -----------------
1 file changed, 1083 deletions(-)
delete mode 100644 gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
diff --git a/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb b/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
deleted file mode 100644
index 85f96df281..0000000000
--- a/gemini/tuning/supervised_finetuning_using_gemini_qa_ipynb.ipynb
+++ /dev/null
@@ -1,1083 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "ur8xi4C7S06n"
- },
- "outputs": [],
- "source": [
- "# Copyright 2024 Google LLC\n",
- "#\n",
- "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
- "# you may not use this file except in compliance with the License.\n",
- "# You may obtain a copy of the License at\n",
- "#\n",
- "# https://www.apache.org/licenses/LICENSE-2.0\n",
- "#\n",
- "# Unless required by applicable law or agreed to in writing, software\n",
- "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
- "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
- "# See the License for the specific language governing permissions and\n",
- "# limitations under the License."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "ojoyvz6mH1Hv"
- },
- "source": [
- "# Supervised Fine Tuning with Gemini 1.5 Flash for Q&A\n",
- "\n",
- "
"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "84f0f73a0f76"
- },
- "source": [
- "| | |\n",
- "|-|-|\n",
- "| Author(s) | [Erwin Huizenga](https://github.com/erwinh85) |"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "JAPoU8Sm5E6e"
- },
- "source": [
- "## Overview\n",
- "\n",
- "**Gemini** is a family of generative AI models developed by Google DeepMind designed for multimodal use cases. The Gemini API gives you access to the various Gemini models, such as Gemini 1.5 Pro and Gemini 1.5 Flash.\n",
- "This notebook demonstrates fine-tuning the Gemini 1.5 Flahs using the Vertex AI Supervised Tuning feature. Supervised Tuning allows you to use your own labeled training data to further refine the base model's capabilities toward your specific tasks.\n",
- "Supervised Tuning uses labeled examples to tune a model. Each example demonstrates the output you want from your text model during inference.\n",
- "First, ensure your training data is of high quality, well-labeled, and directly relevant to the target task. This is crucial as low-quality data can adversely affect the performance and introduce bias in the fine-tuned model.\n",
- "Training: Experiment with different configurations to optimize the model's performance on the target task.\n",
- "Evaluation:\n",
- "Metric: Choose appropriate evaluation metrics that accurately reflect the success of the fine-tuned model for your specific task\n",
- "Evaluation Set: Use a separate set of data to evaluate the model's performance\n",
- "\n",
- "\n",
- "Refer to public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning) for more details.\n",
- "\n",
- "\n",
- "\n",
- "Before running this notebook, ensure you have:\n",
- "\n",
- "- A Google Cloud project: Provide your project ID in the `PROJECT_ID` variable.\n",
- "\n",
- "- Authenticated your Colab environment: Run the authentication code block at the beginning.\n",
- "\n",
- "- Prepare your training data. "
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "f7SS5pzuIA-1"
- },
- "source": [
- "### Costs\n",
- "\n",
- "This tutorial uses billable components of Google Cloud:\n",
- "\n",
- "* Vertex AI\n",
- "* Cloud Storage\n",
- "\n",
- "Learn about [Vertex AI\n",
- "pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage\n",
- "pricing](https://cloud.google.com/storage/pricing), and use the [Pricing\n",
- "Calculator](https://cloud.google.com/products/calculator/)\n",
- "to generate a cost estimate based on your projected usage.\n",
- "\n",
- "To get an estimate of the number of tokens"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "61RBz8LLbxCR"
- },
- "source": [
- "## Get started"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "No17Cw5hgx12"
- },
- "source": [
- "### Install Vertex AI SDK and other required packages\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "tFy3H3aPgx12"
- },
- "outputs": [],
- "source": [
- "%pip install --upgrade --user --quiet google-cloud-aiplatform datasets"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "R5Xep4W9lq-Z"
- },
- "source": [
- "### Restart runtime\n",
- "\n",
- "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
- "\n",
- "The restart might take a minute or longer. After it's restarted, continue to the next step."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "XRvKdaPDTznN"
- },
- "outputs": [],
- "source": [
- "import IPython\n",
- "\n",
- "app = IPython.Application.instance()\n",
- "app.kernel.do_shutdown(True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "SbmM4z7FOBpM"
- },
- "source": [
- "
\n",
- "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
- "
\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "dmWOrTJ3gx13"
- },
- "source": [
- "### Authenticate your notebook environment (Colab only)\n",
- "\n",
- "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "NyKGtVQjgx13"
- },
- "outputs": [],
- "source": [
- "import sys\n",
- "\n",
- "if \"google.colab\" in sys.modules:\n",
- " from google.colab import auth\n",
- "\n",
- " auth.authenticate_user()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "DF4l8DTdWgPY"
- },
- "source": [
- "### Set Google Cloud project information and initialize Vertex AI SDK\n",
- "\n",
- "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
- "\n",
- "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "Nqwi-5ufWp_B"
- },
- "outputs": [],
- "source": [
- "# Use the environment variable if the user doesn't provide Project ID.\n",
- "import os\n",
- "import vertexai\n",
- "\n",
- "PROJECT_ID = \"\" # @param {type:\"string\", isTemplate: true}\n",
- "if PROJECT_ID == \"[your-project-id]\":\n",
- " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
- "\n",
- "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")\n",
- "\n",
- "vertexai.init(project=PROJECT_ID, location=LOCATION)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "5303c05f7aa6"
- },
- "source": [
- "### Import libraries"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "6fc324893334"
- },
- "outputs": [],
- "source": [
- "# Vertex AI SDK\n",
- "from google.cloud import aiplatform\n",
- "from google.cloud.aiplatform.metadata import context\n",
- "from google.cloud.aiplatform.metadata import utils as metadata_utils\n",
- "from vertexai.generative_models import (\n",
- " GenerationConfig,\n",
- " GenerativeModel,\n",
- " HarmBlockThreshold,\n",
- " HarmCategory,\n",
- ")\n",
- "from vertexai.preview.tuning import sft\n",
- "\n",
- "# Vertex AI SDK\n",
- "from sklearn.metrics import f1_score\n",
- "from sklearn.feature_extraction.text import TfidfVectorizer\n",
- "import pandas as pd\n",
- "import array\n",
- "import time\n",
- "from datasets import load_dataset\n",
- "import numpy as np\n",
- "import plotly.graph_objects as go\n",
- "from plotly.subplots import make_subplots"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "6bBZa2I-c-x8"
- },
- "source": [
- "### Data\n",
- "\n",
- "#### SQuAD dataset\n",
- "Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.\n",
- "\n",
- "You can fine more information on the SQuAD [github page](https://rajpurkar.github.io/SQuAD-explorer/)**bold text**"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "KhebDJjRKePL"
- },
- "source": [
- "First update the `BUCKET_NAME` parameter below. You can either use an existing bucket or create a new one."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "lit30Cktbfvo"
- },
- "outputs": [],
- "source": [
- "# Provide a bucket name\n",
- "BUCKET_NAME = \"tuning-demo-erwinh/gemini-tuning\" # @param {type:\"string\"}\n",
- "BUCKET_URI = f\"gs://{BUCKET_NAME}\"\n",
- "print(BUCKET_URI)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "ed-G-9cyKmPY"
- },
- "source": [
- "Only run the code below if you want to create a new Google Cloud Storage bucket."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "0UJ8S9YFA1pZ"
- },
- "outputs": [],
- "source": [
- "# ! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "izjwF63tLLEq"
- },
- "source": [
- "Next you will copy the data into your bucket."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "wjvcxx_sA3xP"
- },
- "outputs": [],
- "source": [
- "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_test.csv .\n",
- "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_train.csv .\n",
- "!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_validation.csv ."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "3F10LuZeL3kt"
- },
- "source": [
- "### Baseline\n",
- "\n",
- "Next you will prepare some test data that you will use to establish a baseline. This means evaluating your chosen model on a representative sample of your dataset before any fine-tuning. A baseline allows you to quantify the improvements achieved through fine-tuning."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "LkOmXpegA8CW"
- },
- "outputs": [],
- "source": [
- "test_df = pd.read_csv('squad_test.csv')\n",
- "test_df.head(1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "mrLxcVVcMsNO"
- },
- "source": [
- "You will need to do some dataset preperations. We will add a system instruction to the dataset:\n",
- "\n",
- "`SystemInstruct`: System instructions are a set of instructions that the model processes before it processes prompts. We recommend that you use system instructions to tell the model how you want it to behave and respond to prompts.\n",
- "\n",
- "We will also combine the `context` and `question`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "c0pgJycOekZ3"
- },
- "outputs": [],
- "source": [
- "systemInstruct = \"Answer the question based on the context\""
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "N_u3VzUMsyqj"
- },
- "outputs": [],
- "source": [
- "# combine the systeminstruct + context + question into one column.\n",
- "row_dataset = 6\n",
- "\n",
- "test_df[\"input_question\"] = systemInstruct + \"\\n\" + \"Context: \" + test_df[\"context\"] + \"\\n\" + \"Question: \" + test_df[\"question\"]\n",
- "test_question = test_df[\"input_question\"].iloc[row_dataset]\n",
- "print(test_question)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "FSxYYqMGWrmj"
- },
- "source": [
- "Next, set the model that you will use. In this example you will use `gemini-1.5-flash-002`. A multimodal model that is designed for high-volume, cost-effective applications, and which delivers speed and efficiency to build fast, lower-cost applications that don't compromise on quality.\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "t-5X4goiqqBQ"
- },
- "outputs": [],
- "source": [
- "base_model = \"gemini-1.5-flash-002\"\n",
- "generation_model = GenerativeModel(base_model)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "wyscyIenW4WZ"
- },
- "source": [
- "Next lets take a question and get a prediction from Gemini that we can compare to the actual answer."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "cXencUYc6YAE"
- },
- "outputs": [],
- "source": [
- "def get_predictions(question: str) -> str:\n",
- " \"\"\"Generates predictions for a given test question.\n",
- "\n",
- " Args:\n",
- " test_question: The question to generate predictions for.\n",
- "\n",
- " Returns:\n",
- " The generated prediction text.\n",
- " \"\"\"\n",
- "\n",
- " prompt = f\"{question}\"\n",
- "\n",
- " generation_config = GenerationConfig(\n",
- " temperature=0.1)\n",
- "\n",
- " response = generation_model.generate_content(\n",
- " contents=prompt, generation_config=generation_config\n",
- " ).text\n",
- "\n",
- " return response"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "gKa0wLooa3Is"
- },
- "outputs": [],
- "source": [
- "test_answer = test_df[\"answers\"].iloc[row_dataset]\n",
- "\n",
- "response = get_predictions(test_question)\n",
- "\n",
- "print(f\"Gemini response: {response}\")\n",
- "print(f\"Actual answer: {test_answer}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "OGRJTHKrdujw"
- },
- "source": [
- "You can see that both answers are correct, but the response from Gemini is more lengthy. However, answers in the SQuAD dataset are typically concise and clear.\n",
- "\n",
- "Fine-tuning is a great way to control the type of output your use case requires. In this instance, you would want the model to provide short, clear answers.\n",
- "\n",
- "Next, let's check if each dataset has an equal number of examples."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "dCe0CUsi5E-Y"
- },
- "outputs": [],
- "source": [
- "num_strings_pred = np.sum([isinstance(item, str) for item in y_pred])\n",
- "print(f\"Number of strings in y_pred: {num_strings_pred}\")\n",
- "\n",
- "num_strings_true = np.sum([isinstance(item, str) for item in y_true])\n",
- "print(f\"Number of strings in y_true: {num_strings_true}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "hvi7m8pKE8WB"
- },
- "source": [
- "Next lest establish a baseline using evaluation metrics.\n",
- "\n",
- "Evaluating the performance of a Question Answering (QA) system requires specific metrics. Two commonly used metrics are Exact Match (EM) and F1 score.\n",
- "\n",
- "EM is a strict measure that only considers an answer correct if it perfectly matches the ground truth, even down to the punctuation. It's a binary metric - either 1 for a perfect match or 0 otherwise. This makes it sensitive to minor variations in phrasing.\n",
- "\n",
- "F1 score is more flexible. It considers the overlap between the predicted answer and the true answer in terms of individual words or tokens. It calculates the harmonic mean of precision (proportion of correctly predicted words out of all predicted words) and recall (proportion of correctly predicted words out of all true answer words). This allows for partial credit and is less sensitive to minor wording differences.\n",
- "\n",
- "In practice, EM is useful when exact wording is crucial, while F1 is more suitable when evaluating the overall understanding and semantic accuracy of the QA system. Often, both metrics are used together to provide a comprehensive evaluation."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "XcgEpTU55FFc"
- },
- "outputs": [],
- "source": [
- "def calculate_em_and_f1_for_text_arrays(y_true, y_pred, average='weighted'):\n",
- " \"\"\"\n",
- " Calculates the Exact Match (EM) and F1 score for arrays of text\n",
- " using word-level comparisons.\n",
- "\n",
- " Args:\n",
- " y_true: An array of ground truth strings.\n",
- " y_pred: An array of predicted strings.\n",
- " average: The averaging method to use for F1 score.\n",
- "\n",
- " Returns:\n",
- " A tuple containing the EM score and the F1 score.\n",
- " \"\"\"\n",
- "\n",
- " em = np.mean([t == p for t, p in zip(y_true, y_pred)])\n",
- "\n",
- " # Use TF-IDF to convert strings to numerical vectors\n",
- " vectorizer = TfidfVectorizer()\n",
- " all_text = np.concatenate((y_true, y_pred))\n",
- " vectorizer.fit(all_text)\n",
- " y_true_vec = vectorizer.transform(y_true)\n",
- " y_pred_vec = vectorizer.transform(y_pred)\n",
- "\n",
- " # Calculate F1 score based on common words (non-zero elements)\n",
- " y_true_class = (y_true_vec > 0).toarray().astype(int)\n",
- " y_pred_class = (y_pred_vec > 0).toarray().astype(int)\n",
- "\n",
- " f1 = f1_score(y_true_class, y_pred_class, average=average)\n",
- "\n",
- " return em, f1\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "rhDTq9p_GSBP"
- },
- "outputs": [],
- "source": [
- "em, f1 = calculate_em_and_f1_for_text_arrays(y_pred, y_true)\n",
- "print(f\"EM score: {em}\")\n",
- "print(f\"F1 score: {f1}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "22DfexbNfUHm"
- },
- "source": [
- "### Prepare the data for fine-tuning\n",
- "\n",
- "To optimize the tuning process for a foundation model, ensure your dataset includes examples that reflect the desired task. Structure your training data in a text-to-text format, where each record in the dataset pairs an input text (or prompt) with its corresponding expected output. This supervised tuning approach uses the dataset to effectively teach the model the specific behavior or task you need it to perform, by providing numerous illustrative examples.\n",
- "\n",
- "The size of your dataset will vary depending on the complexity of the task, but as a general rule, the more examples you include, the better the model's performance.\n",
- "\n",
- "Dataset Format\n",
- "Your training data should be structured in a JSONL file and stored at a Google Cloud Storage (GCS) URI. Each line in the JSONL file must adhere to the following schema:\n",
- "\n",
- "A `contents` array containing objects that define:\n",
- "- A `role` (\"user\" for user input or \"model\" for model output)\n",
- "- `parts` containing the input data.\n",
- "\n",
- "```\n",
- "{\n",
- " \"contents\":[\n",
- " {\n",
- " \"role\":\"user\", # This indicate input content\n",
- " \"parts\":[\n",
- " {\n",
- " \"text\":\"How are you?\"\n",
- " }\n",
- " ]\n",
- " },\n",
- " {\n",
- " \"role\":\"model\", # This indicate target content\n",
- " \"parts\":[ # text only\n",
- " {\n",
- " \"text\":\"I am good, thank you!\"\n",
- " }\n",
- " ]\n",
- " }\n",
- " # ... repeat \"user\", \"model\" for multi turns.\n",
- " ]\n",
- "}\n",
- "```\n",
- "\n",
- "Refer to the public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning-prepare#about-datasets) for more details."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "4DqrQp4cLqRy"
- },
- "outputs": [],
- "source": [
- "# combine the systeminstruct + context + question into one column.\n",
- "train_df = pd.read_csv('squad_train.csv')\n",
- "validation_df = pd.read_csv('squad_validation.csv')\n",
- "\n",
- "train_df[\"input_question\"] = systemInstruct + \"Context: \" + train_df[\"context\"] + \"Question: \" + train_df[\"question\"]\n",
- "validation_df[\"input_question\"] = systemInstruct + \"Context: \" + validation_df[\"context\"] + \"Question: \" + validation_df[\"question\"]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "Pmzyz1migvHN"
- },
- "outputs": [],
- "source": [
- "def df_to_jsonl(df, output_file):\n",
- " \"\"\"Converts a Pandas DataFrame to JSONL format and saves it to a file.\n",
- "\n",
- " Args:\n",
- " df: The DataFrame to convert.\n",
- " output_file: The name of the output file.\n",
- " \"\"\"\n",
- "\n",
- " with open(output_file, 'w') as f:\n",
- " for row in df.itertuples(index=False):\n",
- " jsonl_obj = {\n",
- " \"systemInstruction\": {\"parts\": [{\"text\": \"Answer the question based on the provided context.\"}]},\n",
- " \"contents\": [\n",
- " {\n",
- " \"role\": \"user\",\n",
- " \"parts\": [{\"text\": f\"Context: {row.context}\\n\\nQuestion: {row.question}\"}]\n",
- " },\n",
- " {\"role\": \"model\", \"parts\": [{\"text\": row.answers}]},\n",
- " ]\n",
- " }\n",
- " f.write(json.dumps(jsonl_obj) + '\\n')\n",
- "\n",
- "# Process the DataFrames\n",
- "df_to_jsonl(train_df, 'squad_train.jsonl')\n",
- "df_to_jsonl(validation_df, 'squad_validation.jsonl')\n",
- "\n",
- "print(f\"JSONL data written to squad_train.jsonl\")\n",
- "print(f\"JSONL data written to squad_validation.jsonl\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "5OQv-ZMpJDhi"
- },
- "source": [
- "Next you will copy the files into your Google Cloud bucket"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "O5k1jYJ10IeW"
- },
- "outputs": [],
- "source": [
- "!gsutil cp ./squad_train.jsonl {BUCKET_URI}\n",
- "!gsutil cp ./squad_validation.jsonl {BUCKET_URI}"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "UAHMYgFJJHjm"
- },
- "source": [
- "### Start fine-tuning job\n",
- "Next you can start the fine-tuning job.\n",
- "\n",
- "- `source_model`: Specifies the base Gemini model version you want to fine-tune.\n",
- " - `train_dataset`: Path to your training data in JSONL format.\n",
- "\n",
- " *Optional parameters*\n",
- " - `validation_dataset`: If provided, this data is used to evaluate the model during tuning.\n",
- " - `tuned_model_display_name`: Display name for the tuned model.\n",
- " - `epochs`: The number of training epochs to run.\n",
- " - `learning_rate_multiplier`: A value to scale the learning rate during training.\n",
- " - `adapter_size` : Gemini 1.5 Flash supports Adapter length [1, 4], default value is 4.\n",
- "\n",
- " **Important**: The default hyperparameter settings are optimized for optimal performance based on rigorous testing and are recommended for initial use. Users may customize these parameters to address specific performance requirements.**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "qj-LjQ5Vbf1E"
- },
- "outputs": [],
- "source": [
- "tuned_model_display_name = \"erwinh-fine-tuning-flash\" # @param {type:\"string\"}\n",
- "\n",
- "sft_tuning_job = sft.train(\n",
- " source_model=base_model,\n",
- " train_dataset=f\"\"\"{BUCKET_URI}/squad_train.jsonl\"\"\",\n",
- " # # Optional:\n",
- " validation_dataset=f\"\"\"{BUCKET_URI}/squad_validation.jsonl\"\"\",\n",
- " tuned_model_display_name=tuned_model_display_name,\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "3tXawW1p8E5-"
- },
- "outputs": [],
- "source": [
- "# Get the tuning job info.\n",
- "sft_tuning_job.to_dict()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "19aQnN-k84d9"
- },
- "outputs": [],
- "source": [
- "# Get the resource name of the tuning job\n",
- "sft_tuning_job_name = sft_tuning_job.resource_name\n",
- "sft_tuning_job_name"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "UKo8cwF2KVM5"
- },
- "source": [
- "**Important:** Tuning time depends on several factors, such as training data size, number of epochs, learning rate multiplier, etc."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "8NiZnPkIKcwm"
- },
- "source": [
- "
\n",
- "⚠️ It will take ~30 mins for the model tuning job to complete on the provided dataset and set configurations/hyperparameters. ⚠️\n",
- "
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "Njag_3cB86rH"
- },
- "outputs": [],
- "source": [
- "%%time\n",
- "# Wait for job completion\n",
- "while not sft_tuning_job.refresh().has_ended:\n",
- " time.sleep(60)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "dkx92RBdbf27"
- },
- "outputs": [],
- "source": [
- "# tuned model name\n",
- "tuned_model_name = sft_tuning_job.tuned_model_name\n",
- "tuned_model_name"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "e09aB_9Ebf5c"
- },
- "outputs": [],
- "source": [
- "# tuned model endpoint name\n",
- "tuned_model_endpoint_name = sft_tuning_job.tuned_model_endpoint_name\n",
- "tuned_model_endpoint_name"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "gV1ukBznKmlN"
- },
- "source": [
- "#### Model tuning metrics\n",
- "\n",
- "- `/train_total_loss`: Loss for the tuning dataset at a training step.\n",
- "- `/train_fraction_of_correct_next_step_preds`: The token accuracy at a training step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the tuning dataset.\n",
- "- `/train_num_predictions`: Number of predicted tokens at a training step\n",
- "\n",
- "#### Model evaluation metrics:\n",
- "\n",
- "- `/eval_total_loss`: Loss for the evaluation dataset at an evaluation step.\n",
- "- `/eval_fraction_of_correct_next_step_preds`: The token accuracy at an evaluation step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the evaluation dataset.\n",
- "- `/eval_num_predictions`: Number of predicted tokens at an evaluation step.\n",
- "\n",
- "The metrics visualizations are available after the model tuning job completes. If you don't specify a validation dataset when you create the tuning job, only the visualizations for the tuning metrics are available."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "DH0guHM---Jo"
- },
- "outputs": [],
- "source": [
- "# Locate Vertex AI Experiment and Vertex AI Experiment Run\n",
- "experiment = aiplatform.Experiment(experiment_name=experiment_name)\n",
- "filter_str = metadata_utils._make_filter_string(\n",
- " schema_title=\"system.ExperimentRun\",\n",
- " parent_contexts=[experiment.resource_name],\n",
- ")\n",
- "experiment_run = context.Context.list(filter_str)[0]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "hggHQFIl_FXC"
- },
- "outputs": [],
- "source": [
- "# Read data from Tensorboard\n",
- "tensorboard_run_name = f\"{experiment.get_backing_tensorboard_resource().resource_name}/experiments/{experiment.name}/runs/{experiment_run.name.replace(experiment.name, '')[1:]}\"\n",
- "tensorboard_run = aiplatform.TensorboardRun(tensorboard_run_name)\n",
- "metrics = tensorboard_run.read_time_series_data()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "BdHKZdqG_bHf"
- },
- "outputs": [],
- "source": [
- "def get_metrics(metric: str = \"/train_total_loss\"):\n",
- " \"\"\"\n",
- " Get metrics from Tensorboard.\n",
- "\n",
- " Args:\n",
- " metric: metric name, eg. /train_total_loss or /eval_total_loss.\n",
- " Returns:\n",
- " steps: list of steps.\n",
- " steps_loss: list of loss values.\n",
- " \"\"\"\n",
- " loss_values = metrics[metric].values\n",
- " steps_loss = []\n",
- " steps = []\n",
- " for loss in loss_values:\n",
- " steps_loss.append(loss.scalar.value)\n",
- " steps.append(loss.step)\n",
- " return steps, steps_loss"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "_pDrlpA7_e9o"
- },
- "outputs": [],
- "source": [
- "# Get Train and Eval Loss\n",
- "train_loss = get_metrics(metric=\"/train_total_loss\")\n",
- "eval_loss = get_metrics(metric=\"/eval_total_loss\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "DL07j7u__iZx"
- },
- "outputs": [],
- "source": [
- "# Plot the train and eval loss metrics using Plotly python library\n",
- "fig = make_subplots(\n",
- " rows=1, cols=2, shared_xaxes=True, subplot_titles=(\"Train Loss\", \"Eval Loss\")\n",
- ")\n",
- "\n",
- "# Add traces\n",
- "fig.add_trace(\n",
- " go.Scatter(x=train_loss[0], y=train_loss[1], name=\"Train Loss\", mode=\"lines\"),\n",
- " row=1,\n",
- " col=1,\n",
- ")\n",
- "fig.add_trace(\n",
- " go.Scatter(x=eval_loss[0], y=eval_loss[1], name=\"Eval Loss\", mode=\"lines\"),\n",
- " row=1,\n",
- " col=2,\n",
- ")\n",
- "\n",
- "# Add figure title\n",
- "fig.update_layout(title=\"Train and Eval Loss\", xaxis_title=\"Steps\", yaxis_title=\"Loss\")\n",
- "\n",
- "# Set x-axis title\n",
- "fig.update_xaxes(title_text=\"Steps\")\n",
- "\n",
- "# Set y-axes titles\n",
- "fig.update_yaxes(title_text=\"Loss\")\n",
- "\n",
- "# Show plot\n",
- "fig.show()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "dYygz5ph_icf"
- },
- "outputs": [],
- "source": [
- "if True:\n",
- " tuned_genai_model = GenerativeModel(tuned_model_endpoint_name)\n",
- " # Test with the loaded model.\n",
- " print(\"***Testing***\")\n",
- " print(\n",
- " tuned_genai_model.generate_content(\n",
- " contents=prompt, generation_config=generation_config\n",
- " )\n",
- " )\n",
- "else:\n",
- " print(\"State:\", sft_tuning_job.state)\n",
- " print(\"Error:\", sft_tuning_job.error)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "W4YMNGuoDajB"
- },
- "outputs": [],
- "source": [
- "y_true = test_df[\"answers\"].values\n",
- "\n",
- "def get_predictions(test_question):\n",
- "\n",
- " prompt = f\"\"\"{test_question}\"\"\"\n",
- "\n",
- " generation_config = GenerationConfig(\n",
- " temperature=0.1,\n",
- " )\n",
- "\n",
- " response = tuned_genai_model.generate_content(contents=prompt, generation_config=generation_config).text\n",
- "\n",
- " return response"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "69FMuAeoDrm5"
- },
- "outputs": [],
- "source": [
- "y_pred = []\n",
- "y_pred_question = test_df[\"question\"].values\n",
- "\n",
- "for i in y_pred_question:\n",
- "\n",
- " prediction = get_predictions(i)\n",
- " y_pred.append(prediction)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "yj76Tu6ODalZ"
- },
- "outputs": [],
- "source": [
- "em, f1 = calculate_em_and_f1_for_text_arrays(y_pred, y_true)\n",
- "print(f\"EM score: {em}\")\n",
- "print(f\"F1 score: {f1}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "Vkb2qXljFYqX"
- },
- "source": [
- "\n",
- "\n",
- "\n"
- ]
- }
- ],
- "metadata": {
- "colab": {
- "provenance": []
- },
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}
From 890267868c04d2e1046d35c66f29d2e7c619b4f3 Mon Sep 17 00:00:00 2001
From: alan blount
Date: Thu, 10 Oct 2024 08:51:49 -0600
Subject: [PATCH 57/76] docs: cleanup rag-grounding README (#1233)
summaries of notebooks, slightly better structure to the file
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
---
...upervised_finetuning_using_gemini_qa.ipynb | 716 +++++++++---------
rag-grounding/README.md | 322 +++++---
2 files changed, 574 insertions(+), 464 deletions(-)
diff --git a/gemini/tuning/supervised_finetuning_using_gemini_qa.ipynb b/gemini/tuning/supervised_finetuning_using_gemini_qa.ipynb
index 5456046578..266a1a24a2 100644
--- a/gemini/tuning/supervised_finetuning_using_gemini_qa.ipynb
+++ b/gemini/tuning/supervised_finetuning_using_gemini_qa.ipynb
@@ -25,6 +25,9 @@
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "ojoyvz6mH1Hv"
+ },
"source": [
"# Supervised Fine Tuning with Gemini 1.5 Flash for Q&A\n",
"\n",
@@ -50,10 +53,7 @@
" \n",
"
\n",
"
"
- ],
- "metadata": {
- "id": "ojoyvz6mH1Hv"
- }
+ ]
},
{
"cell_type": "markdown",
@@ -99,6 +99,9 @@
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "f7SS5pzuIA-1"
+ },
"source": [
"### Costs\n",
"\n",
@@ -114,10 +117,7 @@
"to generate a cost estimate based on your projected usage.\n",
"\n",
"To get an estimate of the number of tokens"
- ],
- "metadata": {
- "id": "f7SS5pzuIA-1"
- }
+ ]
},
{
"cell_type": "markdown",
@@ -236,6 +236,7 @@
"source": [
"# Use the environment variable if the user doesn't provide Project ID.\n",
"import os\n",
+ "\n",
"import vertexai\n",
"\n",
"PROJECT_ID = \"\" # @param {type:\"string\", isTemplate: true}\n",
@@ -264,33 +265,30 @@
},
"outputs": [],
"source": [
+ "import json\n",
+ "import time\n",
+ "\n",
"# Vertex AI SDK\n",
"from google.cloud import aiplatform\n",
"from google.cloud.aiplatform.metadata import context\n",
"from google.cloud.aiplatform.metadata import utils as metadata_utils\n",
- "from vertexai.generative_models import (\n",
- " GenerationConfig,\n",
- " GenerativeModel,\n",
- " HarmBlockThreshold,\n",
- " HarmCategory,\n",
- ")\n",
- "from vertexai.preview.tuning import sft\n",
- "\n",
- "# Vertex AI SDK\n",
- "from sklearn.metrics import f1_score\n",
- "from sklearn.feature_extraction.text import TfidfVectorizer\n",
- "import pandas as pd\n",
- "import array\n",
- "import time\n",
- "from datasets import load_dataset\n",
"import numpy as np\n",
+ "import pandas as pd\n",
"import plotly.graph_objects as go\n",
"from plotly.subplots import make_subplots\n",
- "import json"
+ "from sklearn.feature_extraction.text import TfidfVectorizer\n",
+ "\n",
+ "# Vertex AI SDK\n",
+ "from sklearn.metrics import f1_score\n",
+ "from vertexai.generative_models import GenerationConfig, GenerativeModel\n",
+ "from vertexai.preview.tuning import sft"
]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "6bBZa2I-c-x8"
+ },
"source": [
"### Data\n",
"\n",
@@ -298,214 +296,222 @@
"Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.\n",
"\n",
"You can fine more information on the SQuAD [github page](https://rajpurkar.github.io/SQuAD-explorer/)**bold text**"
- ],
- "metadata": {
- "id": "6bBZa2I-c-x8"
- }
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "First update the `BUCKET_NAME` parameter below. You can either use an existing bucket or create a new one."
- ],
"metadata": {
"id": "KhebDJjRKePL"
- }
+ },
+ "source": [
+ "First update the `BUCKET_NAME` parameter below. You can either use an existing bucket or create a new one."
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "lit30Cktbfvo"
+ },
+ "outputs": [],
"source": [
"# Provide a bucket name\n",
"BUCKET_NAME = \"\" # @param {type:\"string\"}\n",
"BUCKET_URI = f\"gs://{BUCKET_NAME}\"\n",
"print(BUCKET_URI)"
- ],
- "metadata": {
- "id": "lit30Cktbfvo"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "Only run the code below if you want to create a new Google Cloud Storage bucket."
- ],
"metadata": {
"id": "ed-G-9cyKmPY"
- }
+ },
+ "source": [
+ "Only run the code below if you want to create a new Google Cloud Storage bucket."
+ ]
},
{
"cell_type": "code",
- "source": [
- "# ! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}"
- ],
+ "execution_count": null,
"metadata": {
"id": "0UJ8S9YFA1pZ"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "# ! gsutil mb -l {LOCATION} -p {PROJECT_ID} {BUCKET_URI}"
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "Next you will copy the data into your bucket."
- ],
"metadata": {
"id": "izjwF63tLLEq"
- }
+ },
+ "source": [
+ "Next you will copy the data into your bucket."
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "wjvcxx_sA3xP"
+ },
+ "outputs": [],
"source": [
"!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_test.csv .\n",
"!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_train.csv .\n",
"!gsutil cp gs://github-repo/generative-ai/gemini/tuning/qa/squad_validation.csv ."
- ],
- "metadata": {
- "id": "wjvcxx_sA3xP"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "3F10LuZeL3kt"
+ },
"source": [
"### Baseline\n",
"\n",
"Next you will prepare some test data that you will use to establish a baseline. This means evaluating your chosen model on a representative sample of your dataset before any fine-tuning. A baseline allows you to quantify the improvements achieved through fine-tuning."
- ],
- "metadata": {
- "id": "3F10LuZeL3kt"
- }
+ ]
},
{
"cell_type": "code",
- "source": [
- "test_df = pd.read_csv('squad_test.csv')\n",
- "test_df.head(1)"
- ],
+ "execution_count": null,
"metadata": {
"id": "LkOmXpegA8CW"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "test_df = pd.read_csv(\"squad_test.csv\")\n",
+ "test_df.head(1)"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "mrLxcVVcMsNO"
+ },
"source": [
"You will need to do some dataset preperations. We will add a system instruction to the dataset:\n",
"\n",
"`SystemInstruct`: System instructions are a set of instructions that the model processes before it processes prompts. We recommend that you use system instructions to tell the model how you want it to behave and respond to prompts.\n",
"\n",
"We will also combine the `context` and `question`."
- ],
- "metadata": {
- "id": "mrLxcVVcMsNO"
- }
+ ]
},
{
"cell_type": "code",
- "source": [
- "systemInstruct = \"Answer the question based on the context\""
- ],
+ "execution_count": null,
"metadata": {
"id": "c0pgJycOekZ3"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "systemInstruct = \"Answer the question based on the context\""
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "N_u3VzUMsyqj"
+ },
+ "outputs": [],
"source": [
"# combine the systeminstruct + context + question into one column.\n",
"row_dataset = 6\n",
"\n",
- "test_df[\"input_question\"] = systemInstruct + \"\\n\" + \"Context: \" + test_df[\"context\"] + \"\\n\" + \"Question: \" + test_df[\"question\"]\n",
+ "test_df[\"input_question\"] = (\n",
+ " systemInstruct\n",
+ " + \"\\n\"\n",
+ " + \"Context: \"\n",
+ " + test_df[\"context\"]\n",
+ " + \"\\n\"\n",
+ " + \"Question: \"\n",
+ " + test_df[\"question\"]\n",
+ ")\n",
"test_question = test_df[\"input_question\"].iloc[row_dataset]\n",
"print(test_question)"
- ],
- "metadata": {
- "id": "N_u3VzUMsyqj"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "Next, set the model that you will use. In this example you will use `gemini-1.5-flash-002`. A multimodal model that is designed for high-volume, cost-effective applications, and which delivers speed and efficiency to build fast, lower-cost applications that don't compromise on quality.\n",
- "\n"
- ],
"metadata": {
"id": "FSxYYqMGWrmj"
- }
+ },
+ "source": [
+ "Next, set the model that you will use. In this example you will use `gemini-1.5-flash-002`. A multimodal model that is designed for high-volume, cost-effective applications, and which delivers speed and efficiency to build fast, lower-cost applications that don't compromise on quality.\n"
+ ]
},
{
"cell_type": "code",
- "source": [
- "base_model = \"gemini-1.5-flash-002\"\n",
- "generation_model = GenerativeModel(base_model)"
- ],
+ "execution_count": null,
"metadata": {
"id": "t-5X4goiqqBQ"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "base_model = \"gemini-1.5-flash-002\"\n",
+ "generation_model = GenerativeModel(base_model)"
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "Next lets take a question and get a prediction from Gemini that we can compare to the actual answer."
- ],
"metadata": {
"id": "wyscyIenW4WZ"
- }
+ },
+ "source": [
+ "Next lets take a question and get a prediction from Gemini that we can compare to the actual answer."
+ ]
},
{
"cell_type": "code",
- "source": [
- "y_true = test_df[\"answers\"].values\n",
- "y_pred_question = test_df[\"question\"].values"
- ],
+ "execution_count": null,
"metadata": {
"id": "ejjhfynQWc0k"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "y_true = test_df[\"answers\"].values\n",
+ "y_pred_question = test_df[\"question\"].values"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "cXencUYc6YAE"
+ },
+ "outputs": [],
"source": [
"def get_predictions(question: str) -> str:\n",
- " \"\"\"Generates predictions for a given test question.\n",
+ " \"\"\"Generates predictions for a given test question.\n",
"\n",
- " Args:\n",
- " test_question: The question to generate predictions for.\n",
+ " Args:\n",
+ " test_question: The question to generate predictions for.\n",
"\n",
- " Returns:\n",
- " The generated prediction text.\n",
- " \"\"\"\n",
+ " Returns:\n",
+ " The generated prediction text.\n",
+ " \"\"\"\n",
"\n",
- " prompt = f\"{question}\"\n",
+ " prompt = f\"{question}\"\n",
"\n",
- " generation_config = GenerationConfig(\n",
- " temperature=0.1)\n",
+ " generation_config = GenerationConfig(temperature=0.1)\n",
"\n",
- " response = generation_model.generate_content(\n",
- " contents=prompt, generation_config=generation_config\n",
- " ).text\n",
+ " response = generation_model.generate_content(\n",
+ " contents=prompt, generation_config=generation_config\n",
+ " ).text\n",
"\n",
- " return response"
- ],
- "metadata": {
- "id": "cXencUYc6YAE"
- },
- "execution_count": null,
- "outputs": []
+ " return response"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "gKa0wLooa3Is"
+ },
+ "outputs": [],
"source": [
"test_answer = test_df[\"answers\"].iloc[row_dataset]\n",
"\n",
@@ -513,28 +519,28 @@
"\n",
"print(f\"Gemini response: {response}\")\n",
"print(f\"Actual answer: {test_answer}\")"
- ],
- "metadata": {
- "id": "gKa0wLooa3Is"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "OGRJTHKrdujw"
+ },
"source": [
"You can see that both answers are correct, but the response from Gemini is more lengthy. However, answers in the SQuAD dataset are typically concise and clear.\n",
"\n",
"Fine-tuning is a great way to control the type of output your use case requires. In this instance, you would want the model to provide short, clear answers.\n",
"\n",
"Next, let's check if each dataset has an equal number of examples."
- ],
- "metadata": {
- "id": "OGRJTHKrdujw"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dCe0CUsi5E-Y"
+ },
+ "outputs": [],
"source": [
"y_pred = test_df[\"question\"].values\n",
"\n",
@@ -543,15 +549,13 @@
"\n",
"num_strings_true = np.sum([isinstance(item, str) for item in y_true])\n",
"print(f\"Number of strings in y_true: {num_strings_true}\")"
- ],
- "metadata": {
- "id": "dCe0CUsi5E-Y"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "hvi7m8pKE8WB"
+ },
"source": [
"Next lest establish a baseline using evaluation metrics.\n",
"\n",
@@ -562,15 +566,17 @@
"F1 score is more flexible. It considers the overlap between the predicted answer and the true answer in terms of individual words or tokens. It calculates the harmonic mean of precision (proportion of correctly predicted words out of all predicted words) and recall (proportion of correctly predicted words out of all true answer words). This allows for partial credit and is less sensitive to minor wording differences.\n",
"\n",
"In practice, EM is useful when exact wording is crucial, while F1 is more suitable when evaluating the overall understanding and semantic accuracy of the QA system. Often, both metrics are used together to provide a comprehensive evaluation."
- ],
- "metadata": {
- "id": "hvi7m8pKE8WB"
- }
+ ]
},
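+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a hand-worked illustration of the word-overlap idea described above (separate from the helper function defined below): if the true answer is \"the Sahara desert\" and the prediction is \"the Sahara\", EM is 0 because the strings are not identical, while word-level F1 gives partial credit: precision = 2/2 = 1.0, recall = 2/3 ≈ 0.67, so F1 = 2 × (1.0 × 0.67) / (1.0 + 0.67) ≈ 0.8."
+   ]
+  },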
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XcgEpTU55FFc"
+ },
+ "outputs": [],
"source": [
- "def calculate_em_and_f1_for_text_arrays(y_true, y_pred, average='weighted'):\n",
+ "def calculate_em_and_f1_for_text_arrays(y_true, y_pred, average=\"weighted\"):\n",
" \"\"\"\n",
" Calculates the Exact Match (EM) and F1 score for arrays of text\n",
" using word-level comparisons.\n",
@@ -599,29 +605,27 @@
"\n",
" f1 = f1_score(y_true_class, y_pred_class, average=average)\n",
"\n",
- " return em, f1\n"
- ],
- "metadata": {
- "id": "XcgEpTU55FFc"
- },
- "execution_count": null,
- "outputs": []
+ " return em, f1"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "rhDTq9p_GSBP"
+ },
+ "outputs": [],
"source": [
"em, f1 = calculate_em_and_f1_for_text_arrays(y_pred, y_true)\n",
"print(f\"EM score: {em}\")\n",
"print(f\"F1 score: {f1}\")"
- ],
- "metadata": {
- "id": "rhDTq9p_GSBP"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "22DfexbNfUHm"
+ },
"source": [
"### Prepare the data for fine-tuning\n",
"\n",
@@ -661,88 +665,109 @@
"```\n",
"\n",
"Refer to the public [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini-supervised-tuning-prepare#about-datasets) for more details."
- ],
- "metadata": {
- "id": "22DfexbNfUHm"
- }
+ ]
},
{
"cell_type": "code",
- "source": [
- "# combine the systeminstruct + context + question into one column.\n",
- "train_df = pd.read_csv('squad_train.csv')\n",
- "validation_df = pd.read_csv('squad_validation.csv')\n",
- "\n",
- "train_df[\"input_question\"] = systemInstruct + \"Context: \" + train_df[\"context\"] + \"Question: \" + train_df[\"question\"]\n",
- "validation_df[\"input_question\"] = systemInstruct + \"Context: \" + validation_df[\"context\"] + \"Question: \" + validation_df[\"question\"]"
- ],
+ "execution_count": null,
"metadata": {
"id": "4DqrQp4cLqRy"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "# combine the systeminstruct + context + question into one column.\n",
+ "train_df = pd.read_csv(\"squad_train.csv\")\n",
+ "validation_df = pd.read_csv(\"squad_validation.csv\")\n",
+ "\n",
+ "train_df[\"input_question\"] = (\n",
+ " systemInstruct\n",
+ " + \"Context: \"\n",
+ " + train_df[\"context\"]\n",
+ " + \"Question: \"\n",
+ " + train_df[\"question\"]\n",
+ ")\n",
+ "validation_df[\"input_question\"] = (\n",
+ " systemInstruct\n",
+ " + \"Context: \"\n",
+ " + validation_df[\"context\"]\n",
+ " + \"Question: \"\n",
+ " + validation_df[\"question\"]\n",
+ ")"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Pmzyz1migvHN"
+ },
+ "outputs": [],
"source": [
"def df_to_jsonl(df, output_file):\n",
- " \"\"\"Converts a Pandas DataFrame to JSONL format and saves it to a file.\n",
- "\n",
- " Args:\n",
- " df: The DataFrame to convert.\n",
- " output_file: The name of the output file.\n",
- " \"\"\"\n",
- "\n",
- " with open(output_file, 'w') as f:\n",
- " for row in df.itertuples(index=False):\n",
- " jsonl_obj = {\n",
- " \"systemInstruction\": {\"parts\": [{\"text\": \"Answer the question based on the provided context.\"}]},\n",
- " \"contents\": [\n",
- " {\n",
- " \"role\": \"user\",\n",
- " \"parts\": [{\"text\": f\"Context: {row.context}\\n\\nQuestion: {row.question}\"}]\n",
- " },\n",
- " {\"role\": \"model\", \"parts\": [{\"text\": row.answers}]},\n",
- " ]\n",
- " }\n",
- " f.write(json.dumps(jsonl_obj) + '\\n')\n",
+ " \"\"\"Converts a Pandas DataFrame to JSONL format and saves it to a file.\n",
+ "\n",
+ " Args:\n",
+ " df: The DataFrame to convert.\n",
+ " output_file: The name of the output file.\n",
+ " \"\"\"\n",
+ "\n",
+ " with open(output_file, \"w\") as f:\n",
+ " for row in df.itertuples(index=False):\n",
+ " jsonl_obj = {\n",
+ " \"systemInstruction\": {\n",
+ " \"parts\": [\n",
+ " {\"text\": \"Answer the question based on the provided context.\"}\n",
+ " ]\n",
+ " },\n",
+ " \"contents\": [\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"parts\": [\n",
+ " {\n",
+ " \"text\": f\"Context: {row.context}\\n\\nQuestion: {row.question}\"\n",
+ " }\n",
+ " ],\n",
+ " },\n",
+ " {\"role\": \"model\", \"parts\": [{\"text\": row.answers}]},\n",
+ " ],\n",
+ " }\n",
+ " f.write(json.dumps(jsonl_obj) + \"\\n\")\n",
+ "\n",
"\n",
"# Process the DataFrames\n",
- "df_to_jsonl(train_df, 'squad_train.jsonl')\n",
- "df_to_jsonl(validation_df, 'squad_validation.jsonl')\n",
+ "df_to_jsonl(train_df, \"squad_train.jsonl\")\n",
+ "df_to_jsonl(validation_df, \"squad_validation.jsonl\")\n",
"\n",
"print(f\"JSONL data written to squad_train.jsonl\")\n",
"print(f\"JSONL data written to squad_validation.jsonl\")"
- ],
- "metadata": {
- "id": "Pmzyz1migvHN"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "Next you will copy the files into your Google Cloud bucket"
- ],
"metadata": {
"id": "5OQv-ZMpJDhi"
- }
+ },
+ "source": [
+ "Next you will copy the files into your Google Cloud bucket"
+ ]
},
{
"cell_type": "code",
- "source": [
- "!gsutil cp ./squad_train.jsonl {BUCKET_URI}\n",
- "!gsutil cp ./squad_validation.jsonl {BUCKET_URI}"
- ],
+ "execution_count": null,
"metadata": {
"id": "O5k1jYJ10IeW"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "!gsutil cp ./squad_train.jsonl {BUCKET_URI}\n",
+ "!gsutil cp ./squad_validation.jsonl {BUCKET_URI}"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "UAHMYgFJJHjm"
+ },
"source": [
"### Start fine-tuning job\n",
"Next you can start the fine-tuning job.\n",
@@ -758,13 +783,15 @@
" - `adapter_size` : Gemini 1.5 Flash supports Adapter length [1, 4], default value is 4.\n",
"\n",
" **Important**: The default hyperparameter settings are optimized for optimal performance based on rigorous testing and are recommended for initial use. Users may customize these parameters to address specific performance requirements.**"
- ],
- "metadata": {
- "id": "UAHMYgFJJHjm"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "qj-LjQ5Vbf1E"
+ },
+ "outputs": [],
"source": [
"tuned_model_display_name = \"fine-tuning-gemini-flash-qa-v01\" # @param {type:\"string\"}\n",
"\n",
@@ -775,100 +802,98 @@
" validation_dataset=f\"\"\"{BUCKET_URI}/squad_validation.jsonl\"\"\",\n",
" tuned_model_display_name=tuned_model_display_name,\n",
")"
- ],
- "metadata": {
- "id": "qj-LjQ5Vbf1E"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
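+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If you want to override the defaults, a sketch of the same call with the optional hyperparameters mentioned above set explicitly could look like the following (the values shown are illustrative, not recommendations):\n",
+    "\n",
+    "```python\n",
+    "sft_tuning_job = sft.train(\n",
+    "    source_model=base_model,\n",
+    "    train_dataset=f\"{BUCKET_URI}/squad_train.jsonl\",\n",
+    "    validation_dataset=f\"{BUCKET_URI}/squad_validation.jsonl\",\n",
+    "    tuned_model_display_name=tuned_model_display_name,\n",
+    "    epochs=3,\n",
+    "    learning_rate_multiplier=1.0,\n",
+    "    adapter_size=4,\n",
+    ")\n",
+    "```"
+   ]
+  },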
{
"cell_type": "code",
- "source": [
- "# Get the tuning job info.\n",
- "sft_tuning_job.to_dict()"
- ],
+ "execution_count": null,
"metadata": {
"id": "3tXawW1p8E5-"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "# Get the tuning job info.\n",
+ "sft_tuning_job.to_dict()"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "19aQnN-k84d9"
+ },
+ "outputs": [],
"source": [
"# Get the resource name of the tuning job\n",
"sft_tuning_job_name = sft_tuning_job.resource_name\n",
"sft_tuning_job_name"
- ],
- "metadata": {
- "id": "19aQnN-k84d9"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "**Important:** Tuning time depends on several factors, such as training data size, number of epochs, learning rate multiplier, etc."
- ],
"metadata": {
"id": "UKo8cwF2KVM5"
- }
+ },
+ "source": [
+ "**Important:** Tuning time depends on several factors, such as training data size, number of epochs, learning rate multiplier, etc."
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "8NiZnPkIKcwm"
+ },
"source": [
"
\n",
"⚠️ It will take ~30 mins for the model tuning job to complete on the provided dataset and set configurations/hyperparameters. ⚠️\n",
"
"
- ],
- "metadata": {
- "id": "8NiZnPkIKcwm"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Njag_3cB86rH"
+ },
+ "outputs": [],
"source": [
"%%time\n",
"# Wait for job completion\n",
"while not sft_tuning_job.refresh().has_ended:\n",
" time.sleep(60)"
- ],
- "metadata": {
- "id": "Njag_3cB86rH"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dkx92RBdbf27"
+ },
+ "outputs": [],
"source": [
"# tuned model name\n",
"tuned_model_name = sft_tuning_job.tuned_model_name\n",
"tuned_model_name"
- ],
- "metadata": {
- "id": "dkx92RBdbf27"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "e09aB_9Ebf5c"
+ },
+ "outputs": [],
"source": [
"# tuned model endpoint name\n",
"tuned_model_endpoint_name = sft_tuning_job.tuned_model_endpoint_name\n",
"tuned_model_endpoint_name"
- ],
- "metadata": {
- "id": "e09aB_9Ebf5c"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "gV1ukBznKmlN"
+ },
"source": [
"#### Model tuning metrics\n",
"\n",
@@ -883,26 +908,28 @@
"- `/eval_num_predictions`: Number of predicted tokens at an evaluation step.\n",
"\n",
"The metrics visualizations are available after the model tuning job completes. If you don't specify a validation dataset when you create the tuning job, only the visualizations for the tuning metrics are available."
- ],
- "metadata": {
- "id": "gV1ukBznKmlN"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "IHVU4XP2aOFE"
+ },
+ "outputs": [],
"source": [
"# Get resource name from tuning job.\n",
"experiment_name = sft_tuning_job.experiment.resource_name\n",
"experiment_name"
- ],
- "metadata": {
- "id": "IHVU4XP2aOFE"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "DH0guHM---Jo"
+ },
+ "outputs": [],
"source": [
"# Locate Vertex AI Experiment and Vertex AI Experiment Run\n",
"experiment = aiplatform.Experiment(experiment_name=experiment_name)\n",
@@ -911,29 +938,29 @@
" parent_contexts=[experiment.resource_name],\n",
")\n",
"experiment_run = context.Context.list(filter_str)[0]"
- ],
- "metadata": {
- "id": "DH0guHM---Jo"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "hggHQFIl_FXC"
+ },
+ "outputs": [],
"source": [
"# Read data from Tensorboard\n",
"tensorboard_run_name = f\"{experiment.get_backing_tensorboard_resource().resource_name}/experiments/{experiment.name}/runs/{experiment_run.name.replace(experiment.name, '')[1:]}\"\n",
"tensorboard_run = aiplatform.TensorboardRun(tensorboard_run_name)\n",
"metrics = tensorboard_run.read_time_series_data()"
- ],
- "metadata": {
- "id": "hggHQFIl_FXC"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "BdHKZdqG_bHf"
+ },
+ "outputs": [],
"source": [
"def get_metrics(metric: str = \"/train_total_loss\"):\n",
" \"\"\"\n",
@@ -952,28 +979,28 @@
" steps_loss.append(loss.scalar.value)\n",
" steps.append(loss.step)\n",
" return steps, steps_loss"
- ],
- "metadata": {
- "id": "BdHKZdqG_bHf"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "_pDrlpA7_e9o"
+ },
+ "outputs": [],
"source": [
"# Get Train and Eval Loss\n",
"train_loss = get_metrics(metric=\"/train_total_loss\")\n",
"eval_loss = get_metrics(metric=\"/eval_total_loss\")"
- ],
- "metadata": {
- "id": "_pDrlpA7_e9o"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "DL07j7u__iZx"
+ },
+ "outputs": [],
"source": [
"# Plot the train and eval loss metrics using Plotly python library\n",
"fig = make_subplots(\n",
@@ -1003,24 +1030,24 @@
"\n",
"# Show plot\n",
"fig.show()"
- ],
- "metadata": {
- "id": "DL07j7u__iZx"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "### Use the fine-tuned model and evaluation"
- ],
"metadata": {
"id": "pivmh4Lwbgy1"
- }
+ },
+ "source": [
+ "### Use the fine-tuned model and evaluation"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "qO6ln4teagw1"
+ },
+ "outputs": [],
"source": [
"prompt = \"\"\"\n",
"Answer the question based on the context\n",
@@ -1033,92 +1060,85 @@
"\n",
"Question: How is lap provided by overlapping the admission side port?\n",
"\"\"\""
- ],
- "metadata": {
- "id": "qO6ln4teagw1"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dYygz5ph_icf"
+ },
+ "outputs": [],
"source": [
"if True:\n",
" tuned_genai_model = GenerativeModel(tuned_model_endpoint_name)\n",
" # Test with the loaded model.\n",
" print(\"***Testing***\")\n",
- " print(\n",
- " tuned_genai_model.generate_content(\n",
- " contents=prompt\n",
- " )\n",
- " )\n",
+ " print(tuned_genai_model.generate_content(contents=prompt))\n",
"else:\n",
" print(\"State:\", sft_tuning_job.state)\n",
" print(\"Error:\", sft_tuning_job.error)"
- ],
- "metadata": {
- "id": "dYygz5ph_icf"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "W4YMNGuoDajB"
+ },
+ "outputs": [],
"source": [
"y_true = test_df[\"answers\"].values\n",
"\n",
- "def get_predictions(test_question):\n",
"\n",
- " prompt = f\"\"\"{test_question}\"\"\"\n",
+ "def get_predictions(test_question):\n",
+ " prompt = f\"\"\"{test_question}\"\"\"\n",
"\n",
- " generation_config = GenerationConfig(\n",
- " temperature=0.1,\n",
- " )\n",
+ " generation_config = GenerationConfig(\n",
+ " temperature=0.1,\n",
+ " )\n",
"\n",
- " response = tuned_genai_model.generate_content(contents=prompt, generation_config=generation_config).text\n",
+ " response = tuned_genai_model.generate_content(\n",
+ " contents=prompt, generation_config=generation_config\n",
+ " ).text\n",
"\n",
- " return response"
- ],
- "metadata": {
- "id": "W4YMNGuoDajB"
- },
- "execution_count": null,
- "outputs": []
+ " return response"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "69FMuAeoDrm5"
+ },
+ "outputs": [],
"source": [
"y_pred = []\n",
"y_pred_question = test_df[\"question\"].values\n",
"\n",
"for i in y_pred_question:\n",
- "\n",
- " prediction = get_predictions(i)\n",
- " y_pred.append(prediction)"
- ],
- "metadata": {
- "id": "69FMuAeoDrm5"
- },
- "execution_count": null,
- "outputs": []
+ " prediction = get_predictions(i)\n",
+ " y_pred.append(prediction)"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "yj76Tu6ODalZ"
+ },
+ "outputs": [],
"source": [
"em, f1 = calculate_em_and_f1_for_text_arrays(y_pred, y_true)\n",
"print(f\"EM score: {em}\")\n",
"print(f\"F1 score: {f1}\")"
- ],
- "metadata": {
- "id": "yj76Tu6ODalZ"
- },
- "execution_count": null,
- "outputs": []
+ ]
}
],
"metadata": {
"colab": {
- "provenance": []
+ "name": "supervised_finetuning_using_gemini_qa.ipynb",
+ "toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
@@ -1127,4 +1147,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
-}
\ No newline at end of file
+}
diff --git a/rag-grounding/README.md b/rag-grounding/README.md
index a9f0f5feec..5f5fdc17d4 100644
--- a/rag-grounding/README.md
+++ b/rag-grounding/README.md
@@ -1,22 +1,22 @@
# RAG and Grounding
-This directory provides a curated list of notebooks that explore Retrieval
+This directory provides a curated list of examples that explore Retrieval
Augmented Generation (RAG), grounding techniques, knowledge bases, grounded
generation, and related topics like vector search and semantic search.
-All of these links are notebooks or other examples in this repository, but are
-indexed here for your convenience.
+All of these links are examples in this repository, but are indexed here for
+your convenience.
## What is RAG and Grounding?
![Animated GIF showing "what is grounding"](./img/what-is-grounding.gif)
-- Ungrounded generation relies on the LLM training data alone and is prone to
- hallucinations when it doesn't have all the right facts
-- **Grounding** a LLM with relevant facts provides fresh and potentially private
- data to the model as part of it's input or prompt
-- **RAG** is a technique which retrieves relevant facts, often via search, and
- provides them to the LLM
+- Ungrounded generation relies on the LLM training data alone and is prone to
+ hallucinations when it doesn't have all the right facts
+- **Grounding** an LLM with relevant facts provides fresh and potentially
+  private data to the model as part of its input or prompt
+- **RAG** is a technique that retrieves relevant facts, often via search, and
+  provides them to the LLM
Using RAG and Grounding to improve generations and reduce hallucinations is
becoming commonplace. Doing so well and generating extremely high quality
@@ -30,115 +30,204 @@ search engine and RAG application, and the evaluations needed to hill climb
See
[this blog post: How to evaluate generated answers from RAG at scale on Vertex AI](https://medium.com/google-cloud/vqa-3-how-to-evaluate-generated-answers-from-rag-at-scale-on-vertex-ai-70bc397cb33d)
-for a walkthrough.
-
-- **[evaluate_rag_gen_ai_evaluation_service_sdk.ipynb](../gemini/evaluation/evaluate_rag_gen_ai_evaluation_service_sdk.ipynb)**:
- Evaluates RAG systems using the Gen AI Evaluation Service SDK.
-- **[ragas_with_gemini.ipynb](../gemini/use-cases/retrieval-augmented-generation/rag-evaluation/ragas_with_gemini.ipynb)**:
- Use Case - using Ragas with Gemini for Eval.
-- **[deepeval_with_gemini.ipynb](../gemini/use-cases/retrieval-augmented-generation/rag-evaluation/deepeval_with_gemini.ipynb)**:
- Use Case - using DeepEval with Gemini for Eval.
+for a more in-depth walkthrough.
+
+- **[evaluate_rag_gen_ai_evaluation_service_sdk.ipynb](../gemini/evaluation/evaluate_rag_gen_ai_evaluation_service_sdk.ipynb)**:
+ Evaluates RAG systems using the Gen AI Evaluation Service SDK, offering both
+ reference-free and reference-based evaluation methods with visualization.
+- **[ragas_with_gemini.ipynb](../gemini/use-cases/retrieval-augmented-generation/rag-evaluation/ragas_with_gemini.ipynb)**:
+ Evaluates RAG pipelines using the RAGAS framework and the Gemini Pro model
+ for Q&A tasks.
+- **[deepeval_with_gemini.ipynb](../gemini/use-cases/retrieval-augmented-generation/rag-evaluation/deepeval_with_gemini.ipynb)**:
+ Evaluates Gemini Pro's performance on a question-answering task using
+ DeepEval and the Vertex AI Gemini API, including Pytest integration.
## Out of the Box RAG/Grounding
-- **[Vertex AI Search - sample Web App](../search/web-app/)**: Take a look at
- this sample web app using Vertex AI Search, which is a flexible and easy to
- use "out of the box" solution for search & RAG/Grounding.
-- **[bulk_question_answering.ipynb](../search/bulk-question-answering/bulk_question_answering.ipynb)**:
- Answers multiple questions using a search system
-- **[contract_analysis.ipynb](../search/retrieval-augmented-generation/examples/contract_analysis.ipynb)**,
- **[question_answering.ipynb](../search/retrieval-augmented-generation/examples/question_answering.ipynb)**,
- **[rag_google_documentation.ipynb](../search/retrieval-augmented-generation/examples/rag_google_documentation.ipynb)**:
- Showcase specific RAG use cases
-- **[search_data_blending_with_gemini_summarization.ipynb](../search/search_data_blending_with_gemini_summarization.ipynb)**:
- Demonstrates calling a search app that blends information from multiple stores
- (GCS, BQ, site) and summarizes search snippets and responses using the
- Gemini Pro model.
-- **[vertexai_search_options.ipynb](../search/vertexai-search-options/vertexai_search_options.ipynb)**:
- Shows how to use Vertex AI Search in conjunction with the Gemini Pro model to
- retrieve and summarize data across multiple data stores within Google Cloud
- Platform (GCP). It highlights how the Gemini Pro model is able to formulate a
- summary of user-specific prompts based on the retrieved snippets and citations
- from Vertex AI Search.
-
-## Build your own RAG/Grounding
-
-We have several notebooks and examples for specific use cases or types of data
-which may require a custom RAG and Grounding. We have many products which can be
-used to build a RAG/Grounding pipeline of your own, or which you can add to an
-existing RAG and Grounding solution.
-
-- [Vertex AI APIs for building search and RAG](https://cloud.google.com/generative-ai-app-builder/docs/builder-apis)
- has a list of several APIs you can use in isolation or in combination
-- [LlamaIndex on Vertex](https://cloud.google.com/vertex-ai/generative-ai/docs/rag-overview)
- allows you to assemble a RAG search using popular OSS framework and components
- from Google or Open Source
-- [This end-to-end DIY RAG example in a notebook](https://github.com/GoogleCloudPlatform/applied-ai-engineering-samples/blob/main/genai-on-vertex-ai/retrieval_augmented_generation/diy_rag_with_vertexai_apis/build_grounded_rag_app_with_vertex.ipynb)
- written in LangChain and using some of these APIs
-- The Google Cloud Architecture Center has reference architectures on
- [building a RAG infrastructure with GKE](https://cloud.google.com/architecture/rag-capable-gen-ai-app-using-gke)
- or
- [using alloydb and a few Vertex services](https://cloud.google.com/architecture/rag-capable-gen-ai-app-using-vertex-ai)
-
-### Search
-
-Vertex AI Search is an end-to-end Search engine which delivers high quality
-grounded generation and RAG at scale, built-in.
-
-Vertex AI Vector Search is a extremely performant Vector Database which powers
-Vertex AI Search. Other databases like AlloyDB and BigQuery also have vector
-searches, each with different performance characteristics and retrieval
-performance.
+With
+**[Vertex AI Search](https://cloud.google.com/generative-ai-app-builder/docs/)**,
+you can build a RAG/Grounding system in a few clicks or a few lines of code and
+be ready for scale with high quality results. Vertex AI Search is an end-to-end
+Search engine builder, giving you Google quality search for your own data.
+
+- **[Vertex AI Search - sample Web App](../search/web-app/)**: Take a look at
+ this sample web app using Vertex AI Search, which is a flexible and easy to
+ use "out of the box" solution for search & RAG/Grounding.
+- **[bulk_question_answering.ipynb](../search/bulk-question-answering/bulk_question_answering.ipynb)**:
+ Processes questions from a CSV and outputs the results (top documents and
+ extractive answers) to a TSV file using Vertex AI Search.
+- **[contract_analysis.ipynb](../search/retrieval-augmented-generation/examples/contract_analysis.ipynb)**:
+ Demonstrates RAG for contract analysis using Palm2, LangChain, and a vector
+ store, with a Gradio interface for querying contracts and retrieving answers
+ with source references.
+- **[question_answering.ipynb](../search/retrieval-augmented-generation/examples/question_answering.ipynb)**:
+ Builds a question-answering system using Vertex AI Search and LangChain to
+ retrieve information from unstructured documents and leverage LLMs for
+ answering with citations.
+- **[rag_google_documentation.ipynb](../search/retrieval-augmented-generation/examples/rag_google_documentation.ipynb)**:
+ Builds a question-answering system from Google Cloud documentation using RAG
+ and evaluates the impact of different parameter settings on model
+ performance.
+- **[search_data_blending_with_gemini_summarization.ipynb](../search/search_data_blending_with_gemini_summarization.ipynb)**:
+ Demonstrates calling a search app that blends information from multiple
+ stores (GCS, BQ, site) and summarizes search snippets and responses using
+ the Gemini Pro model.
+- **[vertexai_search_options.ipynb](../search/vertexai-search-options/vertexai_search_options.ipynb)**:
+ Demonstrates three approaches for using Vertex AI Search: direct API usage,
+ grounding with Gemini, and integration with LangChain.
+
+Vertex AI Search can be configured to adapt to many different use cases and
+data.
+
+## Bring your own Search for RAG/Grounding
+
+The Vertex AI Search - Grounded Generation API allows you to use a custom search
+engine for RAG/Grounding. You can wrap any search engine with a cloud function
+exposing a REST API and then plug it into the Grounded Generation API as a
+`grounding_source`.
+
+See the
+[Grounded Generation API documentation](https://cloud.google.com/generative-ai-app-builder/docs/grounded-gen)
+for more information.
+
+Demo coming soon.
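+
+As a rough sketch of that wrapping step (the endpoint shape and the
+`run_search` backend below are placeholder assumptions, not a prescribed
+interface), a Cloud Function exposing an existing search engine over REST could
+look like this:
+
+```python
+# Hypothetical Cloud Function (functions-framework) wrapping an existing
+# search backend so its HTTPS URL can be used as a grounding source.
+import functions_framework
+
+
+def run_search(query: str) -> list[dict]:
+    # Placeholder: call your own search engine here and return a list of
+    # results, e.g. [{"title": ..., "snippet": ..., "uri": ...}, ...]
+    raise NotImplementedError
+
+
+@functions_framework.http
+def search(request):
+    query = (request.get_json(silent=True) or {}).get("query", "")
+    return {"results": run_search(query)}
+```
+
+The deployed function's URL is then what you reference when configuring the
+Grounded Generation API's `grounding_source`.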
+
+## Build your own Search for RAG/Grounding
+
+You may want to build your own search engine for RAG/Grounding, perhaps because
+you have a unique use case, want to use a specific search engine, or face
+constraints on what you can use.
+
+We have many component APIs which can be used to build a RAG/Grounding pipeline
+of your own.
+
+- [Vertex AI APIs for building search and RAG](https://cloud.google.com/generative-ai-app-builder/docs/builder-apis)
+ has a list of several APIs you can use in isolation or in combination
+
+We have a managed service for assembling these components using a
+LlamaIndex-style SDK.
+
+- [LlamaIndex on Vertex](https://cloud.google.com/vertex-ai/generative-ai/docs/rag-overview)
+ allows you to assemble a RAG search using popular OSS framework and
+ components from Google or Open Source
+
+We have a few reference architectures you can use to build your own
+RAG/Grounding pipeline from the ground up.
+
+- [This end-to-end DIY RAG example in a notebook](https://github.com/GoogleCloudPlatform/applied-ai-engineering-samples/blob/main/genai-on-vertex-ai/retrieval_augmented_generation/diy_rag_with_vertexai_apis/build_grounded_rag_app_with_vertex.ipynb)
+ written in LangChain and using some of these APIs
+- The Google Cloud Architecture Center has reference architectures on
+ [building a RAG infrastructure with GKE](https://cloud.google.com/architecture/rag-capable-gen-ai-app-using-gke)
+ or
+ [using alloydb and a few Vertex services](https://cloud.google.com/architecture/rag-capable-gen-ai-app-using-vertex-ai)
+
+More coming soon.
+
+## Build with a Vector Database
+
+Vertex AI Vector Search (formerly known as Matching Engine) is a highly scalable
+and performant vector database that powers Vertex AI Search.
+
+AlloyDB, BigQuery, and Redis also have vector search capabilities, each with
+different performance characteristics, though each of them is a general-purpose
+database and not purpose-built for embeddings like Vector Search is.
+
+Note that you can use a Vector Database for RAG/Grounding and for many other use
+cases, like recommendation systems, clustering, and anomaly detection.
+
+**[Document_QnA_using_gemini_and_vector_search.ipynb](../gemini/use-cases/retrieval-augmented-generation/Document_QnA_using_gemini_and_vector_search.ipynb)**
+Demonstrates building a multimodal question-answering system using Gemini and
+Vertex AI Vector Search for PDFs containing text and images, employing retrieval
+augmented generation (RAG).
### Embeddings
-- **[intro_Vertex_AI_embeddings.ipynb](../gemini/qa-ops/intro_Vertex_AI_embeddings.ipynb)**:
- Introduces Vertex AI embeddings.
-- **[hybrid-search.ipynb](../embeddings/hybrid-search.ipynb)**: Explores
- combining different search techniques, potentially including vector search and
- keyword-based search.
-- **[intro-textemb-vectorsearch.ipynb](../embeddings/intro-textemb-vectorsearch.ipynb)**:
- Introduces text embeddings and vector search.
-- **[vector-search-quickstart.ipynb](../embeddings/vector-search-quickstart.ipynb)**:
- Quick start guide for implementing vector search.
-- **[bq-vector-search-log-outlier-detection.ipynb](../embeddings/use-cases/outlier-detection/bq-vector-search-log-outlier-detection.ipynb)**:
- Demonstrates using vector search with BigQuery logs to identify outliers.
+The best explanation of embeddings I've seen
+
+- **[intro_Vertex_AI_embeddings.ipynb](../gemini/qa-ops/intro_Vertex_AI_embeddings.ipynb)**:
+ Introduces Vertex AI's text and multimodal embeddings APIs and demonstrates
+ their use in building a simple e-commerce search application with text,
+ image, and video queries.
+- **[hybrid-search.ipynb](../embeddings/hybrid-search.ipynb)**: Demonstrates
+ hybrid search (combining semantic and keyword search) using Vertex AI Vector
+ Search.
+- **[intro-textemb-vectorsearch.ipynb](../embeddings/intro-textemb-vectorsearch.ipynb)**:
+ Demonstrates building semantic search capabilities using Vertex AI's text
+ embeddings and vector search, grounding LLM outputs with real-world data.
+- **[vector-search-quickstart.ipynb](../embeddings/vector-search-quickstart.ipynb)**:
+ Provides a quickstart tutorial for Vertex AI Vector Search, guiding users
+ through setting up, building, deploying, and querying a vector search index
+ using sample product data.
+- **[bq-vector-search-log-outlier-detection.ipynb](../embeddings/use-cases/outlier-detection/bq-vector-search-log-outlier-detection.ipynb)**:
+ Demonstrates log anomaly detection and investigation using Vertex AI,
+ BigQuery, and text embeddings to identify semantically similar past actions
+ for outlier analysis.
### Gemini
-- **[intro-grounding-gemini.ipynb](../gemini/grounding/intro-grounding-gemini.ipynb)**:
- Introduces grounding in the context of Gemini.
-- **[building_DIY_multimodal_qa_system_with_mRAG.ipynb](../gemini/qa-ops/building_DIY_multimodal_qa_system_with_mRAG.ipynb)**:
- Builds a custom multimodal question-answering system using mRAG.
-- **[code_retrieval_augmented_generation.ipynb](../language/code/code_retrieval_augmented_generation.ipynb)**:
- Demonstrates using code retrieval to improve code generation.
-- **[intro-grounding.ipynb](../language/grounding/intro-grounding.ipynb)**:
- Introduction to grounding in natural language processing
-- **[langchain_bigquery_data_loader.ipynb](../language/orchestration/langchain/langchain_bigquery_data_loader.ipynb)**:
- Uses LangChain to load data from BigQuery for RAG
-- **[question_answering_documents.ipynb](../language/use-cases/document-qa/question_answering_documents.ipynb)**,
- **[question_answering_documents_langchain.ipynb](../language/use-cases/document-qa/question_answering_documents_langchain.ipynb)**,
- **[question_answering_documents_langchain_matching_engine.ipynb](../language/use-cases/document-qa/question_answering_documents_langchain_matching_engine.ipynb)**:
- Focus on question answering over documents
-- **[summarization_large_documents.ipynb](../language/use-cases/document-summarization/summarization_large_documents.ipynb)**,
- **[summarization_large_documents_langchain.ipynb](../language/use-cases/document-summarization/summarization_large_documents_langchain.ipynb)**:
- Demonstrate summarizing large documents.
-- **[llamaindex_workflows.ipynb](../gemini/orchestration/llamaindex_workflows.ipynb)** Using LlamaIndex Workflows to build an event driven RAG flow.
+- **[intro-grounding-gemini.ipynb](../gemini/grounding/intro-grounding-gemini.ipynb)**:
+ Demonstrates grounding LLM responses in Google Search and Vertex AI Search
+ using Gemini, improving response accuracy and reducing hallucinations.
+- **[intro-grounding.ipynb](../language/grounding/intro-grounding.ipynb)**:
+ Demonstrates using Vertex AI's grounding feature to improve LLM response
+ accuracy and relevance by grounding them in Google Search or custom Vertex
+ AI Search data stores.
+- **[building_DIY_multimodal_qa_system_with_mRAG.ipynb](../gemini/qa-ops/building_DIY_multimodal_qa_system_with_mRAG.ipynb)**:
+ Builds a custom multimodal question-answering system using mRAG.
+- **[code_retrieval_augmented_generation.ipynb](../language/code/code_retrieval_augmented_generation.ipynb)**:
+ Demonstrates RAG for code using Gemini, LangChain, FAISS, and Vertex AI's
+ Embeddings API to enhance code generation by incorporating external
+ knowledge from the Google Cloud Generative AI GitHub repository.
+- **[langchain_bigquery_data_loader.ipynb](../language/orchestration/langchain/langchain_bigquery_data_loader.ipynb)**:
+ Demonstrates using LangChain's BigQuery Data Loader to query BigQuery data,
+ integrate it with a Vertex AI LLM, and build a chain to generate and execute
+ SQL queries for targeted customer analysis.
+- **[question_answering_documents.ipynb](../language/use-cases/document-qa/question_answering_documents.ipynb)**:
+ Demonstrates three methods (stuffing, map-reduce, and map-reduce with
+ embeddings) for building a question-answering system using the Vertex AI
+ PaLM API to efficiently handle large document datasets.
+- **[question_answering_documents_langchain.ipynb](../language/use-cases/document-qa/question_answering_documents_langchain.ipynb)**:
+ Demonstrates building a question-answering system using LangChain and Vertex
+ AI's PaLM API, comparing different methods (stuffing, map-reduce, refine)
+ for handling large documents, and showcasing the improved efficiency of
+ using similarity search with embeddings.
+- **[question_answering_documents_langchain_matching_engine.ipynb](../language/use-cases/document-qa/question_answering_documents_langchain_matching_engine.ipynb)**:
+ Demonstrates a question-answering system using LangChain, Vertex AI's PaLM
+ API, and Matching Engine for retrieval-augmented generation, enabling
+ fact-grounded responses with source citations.
+- **[summarization_large_documents.ipynb](../language/use-cases/document-summarization/summarization_large_documents.ipynb)**:
+ Demonstrates four methods (stuffing, MapReduce, MapReduce with overlapping
+ chunks, and MapReduce with rolling summaries) for summarizing large
+ documents using Vertex AI's generative models, addressing challenges of
+ exceeding context length limits.
+- **[summarization_large_documents_langchain.ipynb](../language/use-cases/document-summarization/summarization_large_documents_langchain.ipynb)**:
+ Demonstrates three LangChain methods (Stuffing, MapReduce, Refine) for
+ summarizing large documents using Vertex AI models, comparing their
+ effectiveness and limitations.
+- **[llamaindex_workflows.ipynb](../gemini/orchestration/llamaindex_workflows.ipynb)**:
+  Uses LlamaIndex Workflows to build an event-driven RAG flow.
### Open Models
-- **[cloud_run_ollama_gemma2_rag_qa.ipynb](../open-models/serving/cloud_run_ollama_gemma2_rag_qa.ipynb)**:
- Sets up a RAG-based question-answering system using Ollama and Gemma2 on Cloud
- Run
+- **[cloud_run_ollama_gemma2_rag_qa.ipynb](../open-models/serving/cloud_run_ollama_gemma2_rag_qa.ipynb)**:
+ Demonstrates deploying Gemma 2 on Google Cloud Run with GPU acceleration
+ using Ollama and LangChain, building a RAG question-answering application.
## Agents on top of RAG
-- **[tutorial_vertex_ai_search_rag_agent.ipynb](../gemini/reasoning-engine/tutorial_vertex_ai_search_rag_agent.ipynb)**:
- Tutorial for building RAG agents using Vertex AI Search
-- **[tutorial_alloydb_rag_agent.ipynb](../gemini/reasoning-engine/tutorial_alloydb_rag_agent.ipynb)**:
- Tutorial for building RAG agents using AlloyDB
-- **[tutorial_cloud_sql_pg_rag_agent.ipynb](../gemini/reasoning-engine/tutorial_cloud_sql_pg_rag_agent.ipynb)**:
- Tutorial for building RAG agents using Cloud SQL (PostgreSQL)
+- **[tutorial_vertex_ai_search_rag_agent.ipynb](../gemini/reasoning-engine/tutorial_vertex_ai_search_rag_agent.ipynb)**:
+ Demonstrates building and deploying a conversational search agent on Vertex
+ AI using LangChain, a reasoning engine, and RAG with Vertex AI Search to
+ query a movie dataset.
+- **[tutorial_alloydb_rag_agent.ipynb](../gemini/reasoning-engine/tutorial_alloydb_rag_agent.ipynb)**:
+ Demonstrates deploying a RAG application using LangChain, AlloyDB for
+ PostgreSQL, and Vertex AI, covering setup, deployment, and cleanup.
+- **[tutorial_cloud_sql_pg_rag_agent.ipynb](../gemini/reasoning-engine/tutorial_cloud_sql_pg_rag_agent.ipynb)**:
+ Demonstrates deploying a RAG application using LangChain, Vertex AI, and
+ Cloud SQL for PostgreSQL, enabling semantic search and LLM-based responses.
## Use Cases
@@ -147,16 +236,17 @@ grounding techniques in various applications. Feel free to dive into the
notebooks that pique your interest and start building your own RAG-powered
solutions.
-- Examples of RAG in different domains
- - **[Document_QnA_using_gemini_and_vector_search.ipynb](../gemini/use-cases/retrieval-augmented-generation/Document_QnA_using_gemini_and_vector_search.ipynb)**
- - **[NLP2SQL_using_dynamic_RAG.ipynb](../gemini/use-cases/retrieval-augmented-generation/NLP2SQL_using_dynamic_RAG.ipynb)**
- - **[RAG_Based_on_Sensitive_Data_Protection_using_Faker.ipynb](../gemini/use-cases/retrieval-augmented-generation/RAG_Based_on_Sensitive_Data_Protection_using_Faker.ipynb)**
- - **[code_rag.ipynb](../gemini/use-cases/retrieval-augmented-generation/code_rag.ipynb)**
- - **[intra_knowledge_qna.ipynb](../gemini/use-cases/retrieval-augmented-generation/intra_knowledge_qna.ipynb)**
- - **[intro_multimodal_rag.ipynb](../gemini/use-cases/retrieval-augmented-generation/intro_multimodal_rag.ipynb)**
- - **[llamaindex_rag.ipynb](../gemini/use-cases/retrieval-augmented-generation/llamaindex_rag.ipynb)**
- - **[multimodal_rag_langchain.ipynb](../gemini/use-cases/retrieval-augmented-generation/multimodal_rag_langchain.ipynb)**
- - **[small_to_big_rag.ipynb](../gemini/use-cases/retrieval-augmented-generation/small_to_big_rag/small_to_big_rag.ipynb)**
-- Build RAG systems using BigQuery
- - **[rag_qna_with_bq_and_featurestore.ipynb](../gemini/use-cases/retrieval-augmented-generation/rag_qna_with_bq_and_featurestore.ipynb)**
- - **[rag_vector_embedding_in_bigquery.ipynb](../gemini/use-cases/retrieval-augmented-generation/rag_vector_embedding_in_bigquery.ipynb)**
+- Examples of RAG in different domains
+
+ - **[NLP2SQL_using_dynamic_RAG.ipynb](../gemini/use-cases/retrieval-augmented-generation/NLP2SQL_using_dynamic_RAG.ipynb)**
+ - **[RAG_Based_on_Sensitive_Data_Protection_using_Faker.ipynb](../gemini/use-cases/retrieval-augmented-generation/RAG_Based_on_Sensitive_Data_Protection_using_Faker.ipynb)**
+ - **[code_rag.ipynb](../gemini/use-cases/retrieval-augmented-generation/code_rag.ipynb)**
+ - **[intra_knowledge_qna.ipynb](../gemini/use-cases/retrieval-augmented-generation/intra_knowledge_qna.ipynb)**
+ - **[intro_multimodal_rag.ipynb](../gemini/use-cases/retrieval-augmented-generation/intro_multimodal_rag.ipynb)**
+ - **[llamaindex_rag.ipynb](../gemini/use-cases/retrieval-augmented-generation/llamaindex_rag.ipynb)**
+ - **[multimodal_rag_langchain.ipynb](../gemini/use-cases/retrieval-augmented-generation/multimodal_rag_langchain.ipynb)**
+ - **[small_to_big_rag.ipynb](../gemini/use-cases/retrieval-augmented-generation/small_to_big_rag/small_to_big_rag.ipynb)**
+
+- Build RAG systems using BigQuery
+ - **[rag_qna_with_bq_and_featurestore.ipynb](../gemini/use-cases/retrieval-augmented-generation/rag_qna_with_bq_and_featurestore.ipynb)**
+ - **[rag_vector_embedding_in_bigquery.ipynb](../gemini/use-cases/retrieval-augmented-generation/rag_vector_embedding_in_bigquery.ipynb)**
From 237c22d11fbc8d400322a24bed132e3a756d2e7a Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Thu, 10 Oct 2024 10:46:41 -0500
Subject: [PATCH 58/76] feat: Add Intro to Knowledge Engine Notebook (#1235)
# Description
Introduction to Vertex AI Knowledge Engine (Formerly RAG API and
LlamaIndex on Vertex AI)
---------
Co-authored-by: Owl Bot
---
gemini/README.md | 58 +-
.../intro_knowledge_engine.ipynb | 574 ++++++++++++++++++
2 files changed, 597 insertions(+), 35 deletions(-)
create mode 100644 gemini/knowledge-engine/intro_knowledge_engine.ipynb
diff --git a/gemini/README.md b/gemini/README.md
index c52233949a..8eb7f71f3b 100644
--- a/gemini/README.md
+++ b/gemini/README.md
@@ -26,7 +26,6 @@ The notebooks and samples in this folder focus on using the **Vertex AI SDK for
Description
-
Contents
@@ -36,11 +35,10 @@ The notebooks and samples in this folder focus on using the **Vertex AI SDK for
diff --git a/gemini/knowledge-engine/intro_knowledge_engine.ipynb b/gemini/knowledge-engine/intro_knowledge_engine.ipynb
new file mode 100644
index 0000000000..43c399ce2c
--- /dev/null
+++ b/gemini/knowledge-engine/intro_knowledge_engine.ipynb
@@ -0,0 +1,574 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ur8xi4C7S06n"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JAPoU8Sm5E6e"
+ },
+ "source": [
+ "# Intro to Building a Scalable and Modular RAG System with Knowledge Engine in Vertex AI (LlamaIndex on Vertex AI)\n",
+ "\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "| Author(s) | [Holt Skinner](https://github.com/holtskinner) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tvgnzT1CKxrO"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "[LlamaIndex](https://www.llamaindex.ai/) is a data framework for developing\n",
+ "context-augmented large language model (LLM) applications. Context augmentation\n",
+ "occurs when you apply an LLM to your data. This implements retrieval-augmented\n",
+ "generation (RAG).\n",
+ "\n",
+ "A common problem with LLMs is that they don't understand private knowledge, that\n",
+ "is, your organization's data. With Knowledge Engine, you can enrich the\n",
+ "LLM context with additional private information, because the model can reduce\n",
+ "hallucination and answer questions more accurately.\n",
+ "\n",
+ "By combining additional knowledge sources with the existing knowledge that LLMs\n",
+ "have, a better context is provided. The improved context along with the query\n",
+ "enhances the quality of the LLM's response.\n",
+ "\n",
+ "The following concepts are key to understanding LlamaIndex on\n",
+ "Vertex AI. These concepts are listed in the order of the\n",
+ "retrieval-augmented generation (RAG) process.\n",
+ "\n",
+ "1. **Data ingestion**: Intake data from different data sources. For example,\n",
+ " local files, Google Cloud Storage, and Google Drive.\n",
+ "\n",
+ "1. **Data transformation**: Conversion of the data in preparation for indexing. For example, data is split into chunks.\n",
+ "\n",
+ "1. **Embedding**: Numerical representations of words or pieces of text. These numbers capture the\n",
+ " semantic meaning and context of the text. Similar or related words or text\n",
+ " tend to have similar embeddings, which means they are closer together in the\n",
+ " high-dimensional vector space.\n",
+ "\n",
+ "2. **Data indexing**: Knowledge Engine creates an index called a corpus.\n",
+ " The index structures the knowledge base so it's optimized for searching. For\n",
+ " example, the index is like a detailed table of contents for a massive\n",
+ " reference book.\n",
+ "\n",
+ "3. **Retrieval**: When a user asks a question or provides a prompt, the retrieval\n",
+ " component in Knowledge Engine searches through its knowledge\n",
+ " base to find information that is relevant to the query.\n",
+ "\n",
+ "1. **Generation**: The retrieved information becomes the context added to the\n",
+ " original user query as a guide for the generative AI model to generate\n",
+ " factually grounded and relevant responses.\n",
+ "\n",
+ "For more information, refer to the public documentation for [Vertex AI Knowledge Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/llamaindex-on-vertexai)."
+ ]
+ },
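+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To make the flow above concrete, here is a minimal sketch of how these steps map onto the SDK, assuming the `vertexai.preview.rag` module's `create_corpus`, `import_files`, and `retrieval_query` interfaces (the corpus name, paths, and parameter values are illustrative):\n",
+    "\n",
+    "```python\n",
+    "from vertexai.preview import rag\n",
+    "\n",
+    "# Data indexing: create a corpus, the index for your knowledge base.\n",
+    "corpus = rag.create_corpus(display_name=\"my-demo-corpus\")\n",
+    "\n",
+    "# Data ingestion, transformation, and embedding: import and chunk documents.\n",
+    "rag.import_files(corpus.name, [\"gs://my-bucket/my-docs/\"], chunk_size=512)\n",
+    "\n",
+    "# Retrieval: fetch the chunks most relevant to a query.\n",
+    "response = rag.retrieval_query(\n",
+    "    rag_resources=[rag.RagResource(rag_corpus=corpus.name)],\n",
+    "    text=\"What is Knowledge Engine?\",\n",
+    "    similarity_top_k=5,\n",
+    ")\n",
+    "```"
+   ]
+  },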
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## Get started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
@@ -39,16 +36,14 @@ For more Vertex AI samples, please visit the [Vertex AI samples GitHub repositor
search/
Use this folder if you're interested in using Vertex AI Search, a Google-managed solution to help you rapidly build search engines for websites and across enterprise data. (Formerly known as Enterprise Search on Generative AI App Builder)
Use this folder for information on Retrieval Augmented Generation (RAG) and Grounding with Vertex AI. This is an index of notebooks and samples across other directories focused on this topic.
@@ -57,35 +52,20 @@ For more Vertex AI samples, please visit the [Vertex AI samples GitHub repositor
conversation/
Use this folder if you're interested in using Vertex AI Conversation, a Google-managed solution to help you rapidly build chat bots for websites and across enterprise data. (Formerly known as Chat Apps on Generative AI App Builder)
Use this folder if you're interested in building your own solutions from scratch using features from Chirp, a version of Google's Universal Speech Model (USM) on Vertex AI (Vertex AI Chirp API).
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "|Author(s) | [Eric Dong](https://github.com/gericdong) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tvgnzT1CKxrO"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "Different from getting online (synchronous) responses, where you are limited to one input request at a time, the batch predictions with the Vertex AI Gemini API allow you to send a large number of multimodal requests to a Gemini model in a single batch request. Then, the model responses asynchronously populate to your storage output location in [Cloud Storage](https://cloud.google.com/storage/docs/introduction) or [BigQuery](https://cloud.google.com/bigquery/docs/storage_overview).\n",
+ "\n",
+ "Batch predictions are generally more efficient and cost-effective than online predictions when processing a large number of inputs that are not latency sensitive.\n",
+ "\n",
+ "To learn more, see the [Get batch predictions for Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini) page.\n",
+ "\n",
+ "### Objectives\n",
+ "\n",
+ "In this tutorial, you learn how to make batch predictions with the Vertex AI Gemini API. This tutorial uses **BigQuery** as an input source and an output location.\n",
+ "\n",
+ "You will complete the following tasks:\n",
+ "\n",
+ "- Preparing batch inputs and an output location\n",
+ "- Submitting a batch prediction job\n",
+ "- Retrieving batch prediction results\n"
+ ]
+ },
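+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "At a high level, the submission step covered later in this tutorial looks like the sketch below, assuming the `BatchPredictionJob.submit` interface imported later in this notebook (the model name and BigQuery URIs are placeholders):\n",
+    "\n",
+    "```python\n",
+    "from vertexai.preview.batch_prediction import BatchPredictionJob\n",
+    "\n",
+    "job = BatchPredictionJob.submit(\n",
+    "    source_model=\"gemini-1.5-flash-002\",  # placeholder model name\n",
+    "    input_dataset=\"bq://your-project.your_dataset.batch_requests\",\n",
+    "    output_uri_prefix=\"bq://your-project.your_dataset\",\n",
+    ")\n",
+    "```"
+   ]
+  },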
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## Get started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\", isTemplate: true}\n",
+ "if PROJECT_ID == \"[your-project-id]\":\n",
+ " PROJECT_ID = str(os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n",
+ "\n",
+ "LOCATION = os.environ.get(\"GOOGLE_CLOUD_REGION\", \"us-central1\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "EdvJRUWRNGHE"
+ },
+ "source": [
+ "## Code Examples"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5303c05f7aa6"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "6fc324893334"
+ },
+ "outputs": [],
+ "source": [
+ "from datetime import datetime\n",
+ "import time\n",
+ "\n",
+ "from google.cloud import bigquery\n",
+ "import vertexai\n",
+ "from vertexai.generative_models import GenerativeModel\n",
+ "from vertexai.preview.batch_prediction import BatchPredictionJob"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a49110dbce84"
+ },
+ "source": [
+ "### Initialize Vertex AI SDK"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "a4582394ea28"
+ },
+ "outputs": [],
+ "source": [
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "e43229f3ad4f"
+ },
+ "source": [
+ "### Load model\n",
+ "\n",
+ "You can find a list of the Gemini models that support batch predictions in the [Multimodal models that support batch predictions](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini#multimodal_models_that_support_batch_predictions) page.\n",
+ "\n",
+ "This tutorial uses the Gemini 1.5 Pro (`gemini-1.5-pro-002`) model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "cf93d5f0ce00"
+ },
+ "outputs": [],
+ "source": [
+ "MODEL_ID = \"gemini-1.5-pro-002\" # @param {type:\"string\", isTemplate: true}\n",
+ "\n",
+ "model = GenerativeModel(MODEL_ID)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "1_xZADsak23H"
+ },
+ "source": [
+ "### Prepare batch inputs\n",
+ "\n",
+ "The input for batch requests specifies the items to send to your model for prediction.\n",
+ "\n",
+ "Batch requests for Gemini accept BigQuery storage sources and Cloud Storage sources. You can learn more about the batch input formats for BigQuery and Cloud Storage sources in the [Batch text generation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini#prepare_your_inputs) page.\n",
+ "\n",
+ "This tutorial uses **BigQuery** as an example. To use a BigQuery table as the input, you must ensure the following:\n",
+ "\n",
+ "- The BigQuery dataset must be created in a specific region (e.g. `us-central1`). Multi-region location (e.g. `US`) is not supported.\n",
+ "- The input table must have a column named `request` in JSON or STRING type.\n",
+ "- The content in the `request` column must be valid JSON. This JSON data represents your input for the model.\n",
+ "- The content in the JSON instructions must match the structure of a [GenerateContentRequest](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference).\n",
+ "- The input table can have columns other than `request`. They are ignored for content generation but included in the output table. The system reserves two column names for output: `response` and `status`. These are used to provide information about the outcome of the batch prediction job.\n",
+ "- Only public YouTube and Cloud Storage bucket URIs in the `fileData` or `file_data` field are supported in batch prediction.\n"
+ ]
+ },
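For illustration, a single row's `request` value might look like the following. This is a hedged sketch of a `GenerateContentRequest`-style payload, not a row from the sample table; the Cloud Storage URI and prompt text are placeholders.

```python
import json

# Hypothetical example of one `request` value (a single row of the input table).
# The structure mirrors a GenerateContentRequest; the Cloud Storage URI below is a
# placeholder, not a file from the sample dataset.
request_value = json.dumps(
    {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": "Describe this image in one sentence."},
                    {
                        "fileData": {
                            "fileUri": "gs://your-bucket/your-image.jpg",
                            "mimeType": "image/jpeg",
                        }
                    },
                ],
            }
        ],
        "generationConfig": {"temperature": 0.4},
    }
)
print(request_value)
```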
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3d4b751610f0"
+ },
+ "source": [
+ "This is an example BigQuery table:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "uWb8QzxwbH6W"
+ },
+ "outputs": [],
+ "source": [
+ "INPUT_DATA = \"bq://storage-samples.generative_ai.batch_requests_for_multimodal_input\" # @param {type:\"string\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "b7aeba876320"
+ },
+ "source": [
+ "You can query the BigQuery table to review the input data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "1414f03a7999"
+ },
+ "outputs": [],
+ "source": [
+ "bq_client = bigquery.Client(project=PROJECT_ID)\n",
+ "\n",
+ "bq_table_id = INPUT_DATA.replace(\"bq://\", \"\")\n",
+ "sql = f\"\"\"\n",
+ " SELECT *\n",
+ " FROM {bq_table_id}\n",
+ " \"\"\"\n",
+ "\n",
+ "query_result = bq_client.query(sql)\n",
+ "\n",
+ "df = query_result.result().to_dataframe()\n",
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "T3jQ59mCsXLc"
+ },
+ "source": [
+ "### Prepare batch output location\n",
+ "\n",
+ "When a batch prediction task completes, the output is stored in the location that you specified in your request.\n",
+ "\n",
+ "- The location is in the form of a Cloud Storage or BigQuery URI prefix, for example:\n",
+ "`gs://path/to/output/data` or `bq://projectId.bqDatasetId`.\n",
+ "\n",
+ "- If not specified, `STAGING_BUCKET/gen-ai-batch-prediction` will be used for Cloud Storage source and `bq://PROJECT_ID.gen_ai_batch_prediction.predictions_TIMESTAMP` will be used for BigQuery source.\n",
+ "\n",
+ "This tutorial uses a **BigQuery** table as an example.\n",
+ "\n",
+ "- You can specify the URI of your BigQuery table in `BQ_OUTPUT_URI`, or\n",
+ "- if it is not specified, this tutorial will create a new dataset `bq://PROJECT_ID.gen_ai_batch_prediction` for you."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "OtUodwGXZ7US"
+ },
+ "outputs": [],
+ "source": [
+ "BQ_OUTPUT_URI = \"[your-bigquery-table]\" # @param {type:\"string\"}\n",
+ "\n",
+ "if BQ_OUTPUT_URI == \"[your-bigquery-table]\":\n",
+ " bq_dataset_id = \"gen_ai_batch_prediction\"\n",
+ "\n",
+ " # The output table will be created automatically if it doesn't exist\n",
+ " timestamp = datetime.now().strftime(\"%Y%m%d%H%M%S\")\n",
+ " bq_table_id = f\"prediction_result_{timestamp}\"\n",
+ " BQ_OUTPUT_URI = f\"bq://{PROJECT_ID}.{bq_dataset_id}.{bq_table_id}\"\n",
+ "\n",
+ " bq_dataset = bigquery.Dataset(f\"{PROJECT_ID}.{bq_dataset_id}\")\n",
+ " bq_dataset.location = \"us-central1\"\n",
+ "\n",
+ " bq_dataset = bq_client.create_dataset(bq_dataset, exists_ok=True, timeout=30)\n",
+ " print(\n",
+ " f\"Created BigQuery dataset {bq_client.project}.{bq_dataset.dataset_id} for batch prediction output.\"\n",
+ " )\n",
+ "\n",
+ "print(f\"BigQuery output URI: {BQ_OUTPUT_URI}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "T90CwWDHvonn"
+ },
+ "source": [
+ "### Send a batch prediction request\n",
+ "\n",
+ "\n",
+ "You create a batch prediction job using the `BatchPredictionJob.submit()` method. To make a batch prediction request, you specify a source model ID, an input source and an output location, either Cloud Storage or BigQuery, where Vertex AI stores the batch prediction results.\n",
+ "\n",
+ "To learn more, see the [Batch prediction API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/batch-prediction-api) page.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "3_PxZdTYbMyg"
+ },
+ "outputs": [],
+ "source": [
+ "job = BatchPredictionJob.submit(\n",
+ " source_model=MODEL_ID, input_dataset=INPUT_DATA, output_uri_prefix=BQ_OUTPUT_URI\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "A-Fo_Kd9FYRj"
+ },
+ "source": [
+ "Print out the job status and other properties."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "DWq7m79PbjG8"
+ },
+ "outputs": [],
+ "source": [
+ "print(f\"Job resource name: {job.resource_name}\")\n",
+ "print(f\"Model resource name: {job.model_name}\")\n",
+ "print(f\"Job state: {job.state.name}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "7aJaPNBrGPqK"
+ },
+ "source": [
+ "### Wait for the batch prediction job to complete\n",
+ "\n",
+ "Depending on the number of input items that you submitted, a batch generation task can take some time to complete. You can use the following code to check the job status and wait for the job to complete."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dtJDIXdHc0W-"
+ },
+ "outputs": [],
+ "source": [
+ "# Refresh the job until complete\n",
+ "while not job.has_ended:\n",
+ " time.sleep(5)\n",
+ " job.refresh()\n",
+ "\n",
+ "# Check if the job succeeds\n",
+ "if job.has_succeeded:\n",
+ " print(\"Job succeeded!\")\n",
+ "else:\n",
+ " print(f\"Job failed: {job.error}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "XWUgAxL-HjN9"
+ },
+ "source": [
+ "### Retrieve batch prediction results\n",
+ "\n",
+ "When a batch prediction task is complete, the output of the prediction is stored in the Cloud Storage bucket or BigQuery location that you specified in your request.\n",
+ "\n",
+ "- When you are using BigQuery, the output of batch prediction is stored in an output dataset. If you had provided a dataset, the name of the dataset (`BQ_OUTPUT_URI`) is the name you had provided earlier. \n",
+ "- If you did not provide an output dataset, a default dataset `bq://PROJECT_ID.gen_ai_batch_prediction` will be created for you. The name of the table is formed by appending `predictions_` with the timestamp of when the batch prediction job started.\n",
+ "\n",
+ "You can print out the exact output location in the `job.output_location` property."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XECxy_2HdHMm"
+ },
+ "outputs": [],
+ "source": [
+ "print(f\"Job output location: {job.output_location}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "NWLQEl3mYKO5"
+ },
+ "source": [
+ "You can use the example code below to retrieve predictions and store them into a dataframe.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "-jLl3es3dTqB"
+ },
+ "outputs": [],
+ "source": [
+ "bq_table_id = job.output_location.replace(\"bq://\", \"\")\n",
+ "\n",
+ "sql = f\"\"\"\n",
+ " SELECT *\n",
+ " FROM {bq_table_id}\n",
+ " \"\"\"\n",
+ "\n",
+ "query_result = bq_client.query(sql)\n",
+ "\n",
+ "df = query_result.result().to_dataframe()\n",
+ "df.head()"
+ ]
+ },
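If you want to pull the generated text out of each row, the `response` column holds a `GenerateContentResponse`-style payload. Below is a minimal sketch, assuming the column comes back as a JSON string or dict; the exact column layout can vary, so treat this as illustrative rather than as part of the notebook.

```python
import json


def extract_text(response_value) -> str:
    """Return the first candidate's text from one row of the `response` column."""
    response = (
        json.loads(response_value) if isinstance(response_value, str) else response_value
    )
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


# Example (uncomment once `df` holds the batch prediction output table):
# df["generated_text"] = df["response"].apply(extract_text)
```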
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2a4e033321ad"
+ },
+ "source": [
+ "## Cleaning up\n",
+ "\n",
+ "Clean up resources created in this notebook."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ZNCyIKIrdPJY"
+ },
+ "outputs": [],
+ "source": [
+ "# Delete the batch prediction job\n",
+ "job.delete()"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "intro_batch_prediction_using_bigquery_input.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
From fde9bee89ff0c8ed583e6f162130b7d823364fae Mon Sep 17 00:00:00 2001
From: nhootan <103317089+nhootan@users.noreply.github.com>
Date: Fri, 11 Oct 2024 13:01:48 -0400
Subject: [PATCH 62/76] fix: Remove dependency on colab from prompt optimizer.
(#1241)
Co-authored-by: hootan
Co-authored-by: Owl Bot
Co-authored-by: Holt Skinner
---
.github/actions/spelling/allow.txt | 2 ++
gemini/prompts/prompt_optimizer/vapo_lib.py | 3 ---
.../prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb | 2 --
3 files changed, 2 insertions(+), 5 deletions(-)
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 084936e4cc..6ec7e7edcc 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -359,6 +359,7 @@ cer
cfbundle
chatbots
chromadb
+cimg
claude
clickable
cmap
@@ -378,6 +379,7 @@ constexpr
corpuses
csa
cse
+ctd
cupertino
dask
dataframe
diff --git a/gemini/prompts/prompt_optimizer/vapo_lib.py b/gemini/prompts/prompt_optimizer/vapo_lib.py
index 2365f801cc..b60af6b764 100644
--- a/gemini/prompts/prompt_optimizer/vapo_lib.py
+++ b/gemini/prompts/prompt_optimizer/vapo_lib.py
@@ -21,15 +21,12 @@
from IPython.core.display import DisplayHandle
from IPython.display import HTML, display
from google.cloud import aiplatform, storage
-from google.colab import output
import ipywidgets as widgets
import jinja2
import jinja2.meta
import pandas as pd
from tensorflow.io import gfile
-output.enable_custom_widget_manager()
-
def is_target_required_metric(eval_metric: str) -> bool:
"""Check if the metric requires the target label."""
diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
index a17a30b1c1..d56b179d73 100644
--- a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
+++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
@@ -291,8 +291,6 @@
"import json\n",
"import time\n",
"\n",
- "from google.colab import auth\n",
- "\n",
"timestamp = datetime.datetime.now().strftime(\"%Y-%m-%dT%H:%M:%S\")\n",
"display_name = f\"pt_{timestamp}\"\n",
"\n",
From 115adf9370133395a89efe734b2829c26a3ab294 Mon Sep 17 00:00:00 2001
From: eliasecchig <115624100+eliasecchig@users.noreply.github.com>
Date: Fri, 11 Oct 2024 19:51:44 +0200
Subject: [PATCH 63/76] feat: add e2e gen ai app starter pack (#1225)
# Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make
sure it goes smoothly:
- [x] Follow the [`CONTRIBUTING`
Guide](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md).
- [x] You are listed as the author in your notebook or README file.
- [x] Your account is listed in
[`CODEOWNERS`](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/.github/CODEOWNERS)
for the file(s).
- [x] Make your Pull Request title in the
specification.
- [x] Ensure the tests and linter pass (Run `nox -s format` from the
repository root to format).
- [x] Appropriate docs were updated (if necessary)
---------
Co-authored-by: Owl Bot
---
.github/CODEOWNERS | 1 +
.github/actions/spelling/allow.txt | 10 +
.github/linters/.python-lint | 2 +-
...gen_ai_app_starter_kit__lint_and_test.yaml | 29 +
.gitignore | 10 +
gemini/sample-apps/README.md | 13 +-
.../e2e-gen-ai-app-starter-pack/.gitignore | 194 +
.../CONTRIBUTING.md | 49 +
.../e2e-gen-ai-app-starter-pack/Dockerfile | 17 +
.../e2e-gen-ai-app-starter-pack/Makefile | 21 +
.../e2e-gen-ai-app-starter-pack/README.md | 202 +
.../e2e-gen-ai-app-starter-pack/app/README.md | 65 +
.../app/__init__.py | 0
.../e2e-gen-ai-app-starter-pack/app/chain.py | 40 +
.../app/eval/data/chats.yaml | 42 +
.../app/eval/utils.py | 209 +
.../app/patterns/custom_rag_qa/chain.py | 141 +
.../app/patterns/custom_rag_qa/templates.py | 56 +
.../patterns/custom_rag_qa/vector_store.py | 53 +
.../patterns/langgraph_dummy_agent/chain.py | 73 +
.../e2e-gen-ai-app-starter-pack/app/server.py | 121 +
.../app/utils/__init__.py | 0
.../app/utils/input_types.py | 56 +
.../app/utils/output_types.py | 142 +
.../app/utils/tracing.py | 150 +
.../deployment/README.md | 116 +
.../deployment/cd/deploy-to-prod.yaml | 35 +
.../deployment/cd/staging.yaml | 137 +
.../deployment/ci/pr_checks.yaml | 27 +
.../deployment/terraform/apis.tf | 49 +
.../deployment/terraform/artifact_registry.tf | 8 +
.../deployment/terraform/build_triggers.tf | 80 +
.../deployment/terraform/dev/iam.tf | 20 +
.../deployment/terraform/dev/log_sinks.tf | 49 +
.../terraform/dev/service_accounts.tf | 5 +
.../deployment/terraform/dev/storage.tf | 29 +
.../deployment/terraform/dev/variables.tf | 64 +
.../deployment/terraform/dev/vars/env.tfvars | 16 +
.../deployment/terraform/iam.tf | 81 +
.../deployment/terraform/log_sinks.tf | 58 +
.../deployment/terraform/service_accounts.tf | 15 +
.../deployment/terraform/storage.tf | 37 +
.../deployment/terraform/variables.tf | 125 +
.../deployment/terraform/vars/env.tfvars | 31 +
.../notebooks/getting_started.ipynb | 1160 +++
.../e2e-gen-ai-app-starter-pack/poetry.lock | 7864 +++++++++++++++++
.../pyproject.toml | 108 +
.../streamlit/side_bar.py | 171 +
.../streamlit/streamlit_app.py | 253 +
.../streamlit/style/app_markdown.py | 37 +
.../streamlit/utils/chat_utils.py | 69 +
.../streamlit/utils/local_chat_history.py | 121 +
.../streamlit/utils/message_editing.py | 58 +
.../streamlit/utils/multimodal_utils.py | 218 +
.../streamlit/utils/stream_handler.py | 267 +
.../streamlit/utils/title_summary.py | 68 +
.../patterns/test_langgraph_dummy_agent.py | 62 +
.../tests/integration/patterns/test_rag_qa.py | 63 +
.../tests/integration/test_chain.py | 53 +
.../tests/integration/test_server_e2e.py | 177 +
.../tests/load_test/.results/.placeholder | 0
.../tests/load_test/README.md | 79 +
.../tests/load_test/load_test.py | 85 +
.../tests/unit/test_server.py | 148 +
.../unit/test_utils/test_tracing_exporter.py | 143 +
65 files changed, 13875 insertions(+), 7 deletions(-)
create mode 100644 .github/workflows/_e2e_gen_ai_app_starter_kit__lint_and_test.yaml
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/.gitignore
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/CONTRIBUTING.md
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/Dockerfile
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/Makefile
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/README.md
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/README.md
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/__init__.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/chain.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/eval/data/chats.yaml
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/eval/utils.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/chain.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/templates.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/vector_store.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/langgraph_dummy_agent/chain.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/server.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/__init__.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/input_types.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/output_types.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/tracing.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/README.md
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/cd/deploy-to-prod.yaml
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/cd/staging.yaml
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/ci/pr_checks.yaml
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/apis.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/artifact_registry.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/build_triggers.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/iam.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/log_sinks.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/service_accounts.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/storage.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/variables.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/vars/env.tfvars
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/iam.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/log_sinks.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/service_accounts.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/storage.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/variables.tf
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/vars/env.tfvars
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/notebooks/getting_started.ipynb
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/poetry.lock
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/pyproject.toml
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/side_bar.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/streamlit_app.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/style/app_markdown.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/utils/chat_utils.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/utils/local_chat_history.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/utils/message_editing.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/utils/multimodal_utils.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/utils/stream_handler.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/streamlit/utils/title_summary.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/integration/patterns/test_langgraph_dummy_agent.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/integration/patterns/test_rag_qa.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/integration/test_chain.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/integration/test_server_e2e.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/load_test/.results/.placeholder
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/load_test/README.md
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/load_test/load_test.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/unit/test_server.py
create mode 100644 gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/unit/test_utils/test_tracing_exporter.py
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 1a5743e4c9..1ad39d9748 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -73,3 +73,4 @@
/generative-ai/open-models/serving/vertex_ai_text_generation_inference_gemma.ipynb @alvarobartt @philschmid @pagezyhf @jeffboudier
/generative-ai/gemini/use-cases/applying-llms-to-data/semantic-search-in-bigquery/stackoverflow_questions_semantic_search.ipynb @sethijaideep @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/gemini/use-cases/retrieval-augmented-generation/raw_with_bigquery.ipynb @jeffonelson @GoogleCloudPlatform/generative-ai-devrel
+/generative-ai/gemini/sample-apps/e2e-gen-ai-app-starter-pack @eliasecchig @lspatarog @GoogleCloudPlatform/generative-ai-devrel
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 6ec7e7edcc..4da058dced 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -153,6 +153,7 @@ LSum
LTRB
LUVBPTK
Ladhak
+LangGraph
Lego
Llion
Logrus
@@ -202,6 +203,7 @@ Oort
PDFs
PEFT
PLOTLYENV
+PYINK
Parmar
Persero
Phaidon
@@ -270,6 +272,7 @@ Testables
Tetsuo
Tianli
Topolino
+Traceloop
Trapp
Tribbiani
Tricyle
@@ -359,6 +362,7 @@ cer
cfbundle
chatbots
chromadb
+cicd
cimg
claude
clickable
@@ -368,6 +372,7 @@ codebases
codefile
codelab
codelabs
+codespell
colab
coldline
coloraxis
@@ -429,6 +434,7 @@ figsize
fillmode
fillna
firestore
+fixmycar
flac
floormat
fmeasure
@@ -475,6 +481,7 @@ gsutil
gtk
guanciale
gunicorn
+hadolint
hashtag
hashtags
hdlr
@@ -678,6 +685,7 @@ terraform
textno
tfhub
tfidf
+tfvars
tgz
thelook
tiktoken
@@ -685,7 +693,9 @@ timechart
titlebar
tobytes
toself
+toset
tqdm
+traceloop
tritan
tsv
ubuntu
diff --git a/.github/linters/.python-lint b/.github/linters/.python-lint
index a04c41e380..b6d9e03f2d 100644
--- a/.github/linters/.python-lint
+++ b/.github/linters/.python-lint
@@ -1,2 +1,2 @@
[MESSAGES CONTROL]
-disable=E0401,C0301
+disable=E0401,C0301,R0903,R1710,C0114,R0915,W1514,W1203,I1101
diff --git a/.github/workflows/_e2e_gen_ai_app_starter_kit__lint_and_test.yaml b/.github/workflows/_e2e_gen_ai_app_starter_kit__lint_and_test.yaml
new file mode 100644
index 0000000000..419c68f22a
--- /dev/null
+++ b/.github/workflows/_e2e_gen_ai_app_starter_kit__lint_and_test.yaml
@@ -0,0 +1,29 @@
+name: End-to-end Gen AI starter kit - Unit test
+
+on:
+ push:
+ paths:
+ - "gemini/sample-apps/e2e-gen-ai-app-starter-pack/**"
+ pull_request:
+ paths:
+ - "gemini/sample-apps/e2e-gen-ai-app-starter-pack/**"
+
+jobs:
+ unit-test:
+ runs-on: ubuntu-latest
+ defaults:
+ run:
+ working-directory: gemini/sample-apps/e2e-gen-ai-app-starter-pack
+ steps:
+ - uses: actions/checkout@v3
+ - name: Set up Python
+ uses: actions/setup-python@v4
+ with:
+ python-version: "3.10"
+ - name: Install Poetry and dependencies
+ run: |
+ pip install poetry==1.8.3 --user
+ python -m poetry install
+ - name: Run tests
+ run: |
+ poetry run pytest tests/unit
diff --git a/.gitignore b/.gitignore
index f516bbe605..5171010bc8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -145,6 +145,16 @@ dmypy.json
language/examples/prompt-design/train.csv
README-TOC*.md
+## gemini/sample-apps/e2e-gen-ai-app-starter-pack specific
+gemini/sample-apps/e2e-gen-ai-app-starter-pack/**/.persist_vector_store
+gemini/sample-apps/e2e-gen-ai-app-starter-pack/.saved_chats/
+gemini/sample-apps/e2e-gen-ai-app-starter-pack/.streamlit_chats/
+gemini/sample-apps/e2e-gen-ai-app-starter-pack/**/my_env.tfvars
+gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/load_test/.results
+gemini/sample-apps/e2e-gen-ai-app-starter-pack/tests/load_test/.results
+
+# Terraform
+terraform.tfstate**
.terraform*
.Terraform*
diff --git a/gemini/sample-apps/README.md b/gemini/sample-apps/README.md
index e0c7648622..e7620ff295 100644
--- a/gemini/sample-apps/README.md
+++ b/gemini/sample-apps/README.md
@@ -10,9 +10,10 @@ We provide instructions for setting up your environment in [Cloud Shell](https:/
## Sample Applications
-| Description | Application Name | Technologies Used |
-| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | -------------------------------------------------------------------------- |
-| Develop a Gemini application using [Streamlit](https://streamlit.io/) framework and Vertex AI Gemini API model. | [gemini-streamlit-cloudrun](gemini-streamlit-cloudrun) | Cloud Run, Streamlit, Python |
-| Deploy a RAG + Gemini sample application to troubleshoot your car using the owner's manual. | [fixmycar/](fixmycar/) | Chat, Grounding, RAG, Java, Streamlit |
-| Try Gemini image recognition in `bash` and see Text-to-Speech read the description to you in ~any language. All from CLI! | [image-bash-jam/](image-bash-jam/) | Text-to-Speech, Bash |
-| This demo showcases how you can combine the data and documents you already have and the skills you already know with the power of [AlloyDB AI](https://cloud.google.com/alloydb/ai?hl=en), [Vertex AI](https://cloud.google.com/vertex-ai?hl=en), [Cloud Run](https://cloud.google.com/run?hl=en), and [Cloud Functions](https://cloud.google.com/functions?hl=en) to build trustworthy Gen AI features into your existing applications. | [GenWealth](genwealth/) | Vertex AI, AlloyDB, Document AI, Cloud Run, Cloud Functions, Cloud Storage |
+| Description | Application Name | Technologies Used |
+| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------- | --------------------------------------------------------------------------- |
+| Develop a Gemini application using [Streamlit](https://streamlit.io/) framework and Vertex AI Gemini API model. | [gemini-streamlit-cloudrun](gemini-streamlit-cloudrun) | Cloud Run, Streamlit, Python |
+| Deploy a RAG + Gemini sample application to troubleshoot your car using the owner's manual. | [fixmycar/](fixmycar/) | Chat, Grounding, RAG, Java, Streamlit |
+| Try Gemini image recognition in `bash` and see Text-to-Speech read the description to you in ~any language. All from CLI! | [image-bash-jam/](image-bash-jam/) | Text-to-Speech, Bash |
+| This demo showcases how you can combine the data and documents you already have and the skills you already know with the power of [AlloyDB AI](https://cloud.google.com/alloydb/ai?hl=en), [Vertex AI](https://cloud.google.com/vertex-ai?hl=en), [Cloud Run](https://cloud.google.com/run?hl=en), and [Cloud Functions](https://cloud.google.com/functions?hl=en) to build trustworthy Gen AI features into your existing applications. | [GenWealth](genwealth/) | Vertex AI, AlloyDB, Document AI, Cloud Run, Cloud Functions, Cloud Storage |
+| End-to-end Gen AI App Starter pack: This folder provides a template starter pack for building a Generative AI application on Google Cloud. It provides a comprehensive set of resources to guide you through the entire development process, from prototype to production. | [e2e-gen-ai-app-starter-pack](e2e-gen-ai-app-starter-pack/) | Vertex AI, FastAPI, LangChain, Cloud Run, Cloud Build, Terraform, Streamlit |
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/.gitignore b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/.gitignore
new file mode 100644
index 0000000000..bf43be5cf5
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/.gitignore
@@ -0,0 +1,194 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*.pyc
+*$py.class
+**/dist
+/tmp
+/out-tsc
+/bazel-out
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+pip-wheel-metadata/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+.python-version
+
+# pipenv
+# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+# However, in case of collaboration, if having platform-specific dependencies or dependencies
+# having no cross-platform support, pipenv may install dependencies that don't work, or not
+# install all needed dependencies.
+Pipfile.lock
+Pipfile
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+.venv*
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# macOS
+.DS_Store
+
+# PyCharm
+.idea
+
+# User-specific files
+.terraform*
+.Terraform*
+
+
+tmp*
+
+# Node
+**/node_modules
+npm-debug.log
+yarn-error.log
+
+# IDEs and editors
+.idea/
+.project
+.classpath
+.c9/
+*.launch
+.settings/
+*.sublime-workspace
+
+# Visual Studio Code
+.vscode/*
+!.vscode/settings.json
+!.vscode/tasks.json
+!.vscode/launch.json
+!.vscode/extensions.json
+.history/*
+
+# Miscellaneous
+**/.angular/*
+/.angular/cache
+.sass-cache/
+/connect.lock
+/coverage
+/libpeerconnection.log
+testem.log
+/typings
+
+# System files
+.DS_Store
+Thumbs.db
+*.vscode*
+
+.persist_vector_store
+tests/load_test/.results/*.html
+tests/load_test/.results/*.csv
+locust_env
+my_env.tfvars
+.streamlit_chats
+.saved_chats
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/CONTRIBUTING.md b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/CONTRIBUTING.md
new file mode 100644
index 0000000000..34f35f2955
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/CONTRIBUTING.md
@@ -0,0 +1,49 @@
+# How to Contribute
+
+We'd love to accept your patches and contributions to this sample. There are
+just a few small guidelines you need to follow.
+
+## Contributor License Agreement
+
+Contributions to this project must be accompanied by a Contributor License Agreement. You (or your employer) retain the copyright to your contribution; this simply gives us permission to use and redistribute your contributions as part of the project. Head over to [Google Developers CLA](https://cla.developers.google.com/) to see your current agreements on file or to sign a new one.
+
+You generally only need to submit a CLA once, so if you've already submitted one (even if it was for a different project), you probably don't need to do it again.
+
+## Community Guidelines, Code Reviews, Contributor Guide
+
+Please refer to the [root repository CONTRIBUTING.md file](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/CONTRIBUTING.md) for Community Guidelines, Code Reviews, Contributor Guide, or specific guidance for Google Employees.
+
+## Code Quality Checks
+
+To ensure code quality, we utilize automated checks. Before submitting a pull request, please run the following commands locally:
+
+```bash
+poetry install --with streamlit,jupyter,lint
+```
+
+This installs development dependencies, including linting tools.
+
+Then, execute the following Make command:
+
+```bash
+make lint
+```
+
+This command runs the following linters to check for code style, potential errors, and type hints:
+
+- **codespell**: Detects common spelling mistakes in code and documentation.
+- **pylint**: Analyzes code for errors, coding standards, and potential problems.
+- **flake8**: Enforces style consistency and checks for logical errors.
+- **mypy**: Performs static type checking to catch type errors before runtime.
+- **black**: Automatically formats Python code to adhere to the PEP 8 style guide.
+
+```bash
+make test
+```
+
+This command runs the test suite using pytest, covering both unit and integration tests:
+
+- **`poetry run pytest tests/unit`**: Executes unit tests located in the `tests/unit` directory.
+- **`poetry run pytest tests/integration`**: Executes integration tests located in the `tests/integration` directory.
+
+Your pull request will also be automatically checked by these tools using GitHub Actions. Ensuring your code passes these checks locally will help expedite the review process.
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/Dockerfile b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/Dockerfile
new file mode 100644
index 0000000000..c103b88d30
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/Dockerfile
@@ -0,0 +1,17 @@
+FROM python:3.11-slim
+
+RUN pip install --no-cache-dir poetry==1.6.1
+
+RUN poetry config virtualenvs.create false
+
+WORKDIR /code
+
+COPY ./pyproject.toml ./README.md ./poetry.lock* ./
+
+COPY ./app ./app
+
+RUN poetry install --no-interaction --no-ansi --no-dev
+
+EXPOSE 8080
+
+CMD ["uvicorn", "app.server:app", "--host", "0.0.0.0", "--port", "8080"]
\ No newline at end of file
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/Makefile b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/Makefile
new file mode 100644
index 0000000000..c54f389e71
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/Makefile
@@ -0,0 +1,21 @@
+test:
+ poetry run pytest tests/unit && poetry run pytest tests/integration
+
+playground:
+ poetry run uvicorn app.server:app --host 0.0.0.0 --port 8000 --reload & poetry run streamlit run streamlit/streamlit_app.py --browser.serverAddress=localhost --server.enableCORS=false --server.enableXsrfProtection=false
+
+backend:
+ poetry run uvicorn app.server:app --host 0.0.0.0 --port 8000 --reload
+
+frontend:
+ poetry run streamlit run streamlit/streamlit_app.py --browser.serverAddress=localhost --server.enableCORS=false --server.enableXsrfProtection=false
+
+load_test:
+	poetry run locust -f tests/load_test/load_test.py -H $$RUN_SERVICE_URL --headless -t 30s -u 60 -r 2 --csv=tests/load_test/.results/results --html=tests/load_test/.results/report.html
+
+lint:
+ poetry run codespell
+ poetry run flake8 .
+ poetry run pylint .
+ poetry run mypy .
+ poetry run black .
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/README.md b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/README.md
new file mode 100644
index 0000000000..dfadbfaef2
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/README.md
@@ -0,0 +1,202 @@
+# 🚀 End-to-End Gen AI App Starter Pack 🚀
+
+> **From Prototype to Production in Minutes.**
+
+| | |
+| ------- | ---------------------------------------------------------------------------------------------- |
+| Authors | [Elia Secchi](https://github.com/eliasecchig), [Lorenzo Spataro](https://github.com/lspataroG) |
+
+This repository provides a template starter pack for building a Generative AI application on Google Cloud.
+
+We provide a comprehensive set of resources to guide you through the entire development process, from prototype to production.
+
+This is a suggested approach, and **you can adapt it to fit your specific needs and preferences**. There are multiple ways to build Gen AI applications on Google Cloud, and this template serves as a starting point and example.
+
+## High-Level Architecture
+
+This starter pack covers all aspects of Generative AI app development, from prototyping and evaluation to deployment and monitoring.
+
+![High Level Architecture](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/high_level_architecture.png "Architecture")
+
+## What's in this Starter Pack?
+
+
+A prod-ready FastAPI server
+
+| Description | Visualization |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
+| The starter pack includes a production-ready FastAPI server with a real-time chat interface, event streaming, and auto-generated docs. It is designed for scalability and easy integration with monitoring tools. | ![FastAPI docs](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/fastapi_docs.png) |
+
+
+
+
+Ready-to-use AI patterns
+
+| Description | Visualization |
+| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
+| Start with a variety of common patterns: this repository offers examples including a basic conversational chain, a production-ready RAG (Retrieval-Augmented Generation) chain developed with Python, and a LangGraph agent implementation. Use them in the application by changing one line of code. See the [Readme](app/README.md) for more details. | ![patterns available](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/patterns_available.png) |
+
+
+
+
+Integration with Vertex AI Evaluation and Experiments
+
+| Description | Visualization |
+| ---------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
+| The repository showcases how to evaluate Generative AI applications using tools like Vertex AI rapid eval SDK and Vertex AI Experiments. | ![Vertex AI Rapid Eval](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/vertex_ai_rapid_eval.png) |
+
+
+
+
+Unlock Insights with Google Cloud Native Tracing & Logging
+
+| Description | Visualization |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
+| Seamlessly integrate with OpenTelemetry, Cloud Trace, Cloud Logging, and BigQuery for comprehensive data collection, and log every step of your Gen AI application to unlock powerful insights. | ![Tracing Preview](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/tracing_preview.png) |
+
+
+
+
+Monitor Responses from the application
+
+| Description | Visualization |
+| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| Monitor your Generative AI application's performance. We provide a Looker Studio [dashboard](https://lookerstudio.google.com/u/0/reporting/fa742264-4b4b-4c56-81e6-a667dd0f853f) to monitor application conversation statistics and user feedback. | ![Dashboard1](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/dashboard_1.png) |
+| We can also drill down to individual conversations and view the messages exchanged. | ![Dashboard2](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/dashboard_2.png) |
+
+
+
+
+CICD and Terraform
+
+| Description | Visualization |
+| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
+| Streamline your deployments with Cloud Build. Enhance reliability through automated testing. The template includes implementation of unit, integration, and load tests, and a set of Terraform resources for you to set up your own Google Cloud project in a matter of minutes. | ![cicd](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/cicd.png) |
+
+
+
+
+A comprehensive UI Playground
+
+| Description | Visualization |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
+| Experiment with your Generative AI application in a feature-rich playground, including chat curation, user feedback collection, multimodal input, and more! | ![Streamlit View](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/streamlit_view.png) |
+
+
+
+## Getting Started
+
+### Prerequisites
+
+- Python 3.10+
+- Google Cloud SDK installed and configured
+- [Poetry](https://python-poetry.org/docs/#installation) for dependency management
+
+### Download the starter pack
+
+```bash
+gsutil cp gs://e2e-gen-ai-app-starter-pack/app-starter-pack.zip . && unzip app-starter-pack.zip && cd app-starter-pack
+```
+
+Use the downloaded folder as a starting point for your own Generative AI application.
+
+### Installation
+
+Install required packages using Poetry:
+
+```bash
+poetry install --with streamlit,jupyter
+```
+
+### Setup
+
+Set your default Google Cloud project and region:
+
+```bash
+export PROJECT_ID="YOUR_PROJECT_ID"
+export REGION="YOUR_REGION"
+gcloud config set project $PROJECT_ID
+gcloud config set region $REGION
+```
+
+## Commands
+
+| Command | Description |
+| -------------------- | ------------------------------------------------------------------------------------------- |
+| `make playground` | Start the backend and frontend for local playground execution |
+| `make test` | Run unit and integration tests |
+| `make load_test` | Execute load tests (see [tests/load_test/README.md](tests/load_test/README.md) for details) |
+| `poetry run jupyter` | Launch Jupyter notebook |
+
+For full command options and usage, refer to the [Makefile](Makefile).
+
+## Usage
+
+1. **Prototype Your Chain:** Build your Generative AI application using different methodologies and frameworks. Use Vertex AI Evaluation for assessing the performance of your application and its chain of steps. **See [`notebooks/getting_started.ipynb`](notebooks/getting_started.ipynb) for a tutorial to get started building and evaluating your chain.**
+2. **Integrate into the App:** Import your chain into the app. Edit the `app/chain.py` file to add your chain.
+3. **Playground Testing:** Explore your chain's functionality using the Streamlit playground. Take advantage of the comprehensive playground features, such as chat history management, user feedback mechanisms, support for various input types, and additional capabilities. You can run the playground locally with the `make playground` command.
+4. **Deploy with CI/CD:** Configure and trigger the CI/CD pipelines. Edit tests if needed. See the [deployment section](#deployment) below for more details.
+5. **Monitor in Production:** Track performance and gather insights using Cloud Logging, Tracing, and the Looker Studio dashboard. Use the gathered data to iterate on your Generative AI application.
+
+## Deployment
+
+### Dev Environment
+
+You can test deployment towards a Dev Environment using the following command:
+
+```bash
+gcloud run deploy genai-app-sample --source . --project YOUR_DEV_PROJECT_ID
+```
+
+The repository includes a Terraform configuration for the setup of the Dev Google Cloud project.
+See [deployment/README.md](deployment/README.md) for instructions.
+
+### Production Deployment with Terraform
+
+![Deployment Workflow](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/deployment_workflow.png)
+
+**Quick Start:**
+
+1. Enable required APIs in the CI/CD project.
+
+ ```bash
+ gcloud config set project YOUR_CI_CD_PROJECT_ID
+ gcloud services enable serviceusage.googleapis.com cloudresourcemanager.googleapis.com cloudbuild.googleapis.com secretmanager.googleapis.com
+ ```
+
+2. Create a Git repository (GitHub, GitLab, Bitbucket).
+3. Connect to Cloud Build following [Cloud Build Repository Setup](https://cloud.google.com/build/docs/repositories#whats_next).
+4. Configure [`deployment/terraform/vars/env.tfvars`](deployment/terraform/vars/env.tfvars) with your project details.
+5. Deploy infrastructure:
+
+ ```bash
+ cd deployment/terraform
+ terraform init
+ terraform apply --var-file vars/env.tfvars
+ ```
+
+6. Perform a commit and push to the repository to see the CI/CD pipelines in action!
+
+For detailed deployment instructions, refer to [deployment/README.md](deployment/README.md).
+
+## Contributing
+
+Contributions are welcome! See the [Contributing Guide](CONTRIBUTING.md).
+
+## Feedback
+
+We value your input! Your feedback helps us improve this starter pack and make it more useful for the community.
+
+### Getting Help
+
+If you encounter any issues or have specific suggestions, please first consider [raising an issue](https://github.com/GoogleCloudPlatform/generative-ai/issues) on our GitHub repository.
+
+### Share Your Experience
+
+For other types of feedback, or if you'd like to share a positive experience or success story using this starter pack, we'd love to hear from you! You can reach out to us at [e2e-gen-ai-app-starter-pack@google.com](mailto:e2e-gen-ai-app-starter-pack@google.com).
+
+Thank you for your contributions!
+
+## Disclaimer
+
+This repository is for demonstrative purposes only and is not an officially supported Google product.
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/README.md b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/README.md
new file mode 100644
index 0000000000..7f96c1a27c
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/README.md
@@ -0,0 +1,65 @@
+# Sample Chatbot Application
+
+This folder implements a chatbot application using FastAPI, and Google Cloud services. It supports multiple conversation patterns and can be easily extended with new chains.
+
+## Folder Structure
+
+```plaintext
+.
+├── server.py # Main FastAPI server
+├── chain.py # Default chain implementation
+├── patterns/ # Conversation pattern implementations
+│ ├── custom_rag_qa/
+│ └── langgraph_dummy_agent/
+├── utils/ # Utility functions and classes
+└── eval/ # Evaluation tools and data
+```
+
+## Generative AI Application Patterns
+
+### 1. Default Chain
+
+The default chain is a simple conversational bot that produces recipes based on user questions.
+
+### 2. Custom RAG QA
+
+A RAG (Retrieval-Augmented Generation) chain using Python for orchestration and base LangChain components. The chain demonstrates how to create a production-grade application with full control over the orchestration process.
+
+This approach offers maximum flexibility in the orchestration of steps and allows for seamless integration with other SDK frameworks such as [Vertex AI SDK](https://cloud.google.com/vertex-ai/docs/python-sdk/use-vertex-ai-python-sdk) and [LangChain](https://python.langchain.com/), retaining the support to emit `astream_events` [API compatible events](https://python.langchain.com/docs/how_to/streaming/#using-stream-events).
+
+### 3. LangGraph Dummy Agent
+
+A simple agent implemented using LangGraph, a framework for building agent and multi-agent workflows.
+
+### Switching Between Patterns
+
+To switch between different patterns, modify the import statement in `server.py`.
+
+All chains have the same interface, allowing for seamless swapping without changes to the Streamlit frontend.
+
+## Monitoring and Observability
+
+![monitoring_flow](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/monitoring_flow.png)
+
+### Trace and Log Capture
+
+This application utilizes [OpenTelemetry](https://opentelemetry.io/) and [OpenLLMetry](https://github.com/traceloop/openllmetry) for comprehensive observability, emitting events to Google Cloud Trace and Google Cloud Logging. Every interaction with LangChain and VertexAI is instrumented (see [`server.py`](server.py)), enabling detailed tracing of request flows throughout the application.
+
+Leveraging the [CloudTraceSpanExporter](https://cloud.google.com/python/docs/reference/spanner/latest/opentelemetry-tracing), the application captures and exports tracing data. To address the limitations of Cloud Trace ([256-byte attribute value limit](https://cloud.google.com/trace/docs/quotas#limits_on_spans)) and [Cloud Logging](https://cloud.google.com/logging/quotas) ([256KB log entry size](https://cloud.google.com/logging/quotas)), a custom extension of the CloudTraceSpanExporter is implemented in [`app/utils/tracing.py`](app/utils/tracing.py).
+
+This extension enhances observability by:
+
+- Creating a corresponding Google Cloud Logging entry for every captured event.
+- Automatically storing event data in Google Cloud Storage when the payload exceeds 256KB.
+
+Logged payloads are associated with the original trace, ensuring seamless access from the Cloud Trace console.
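For orientation, the wiring for a span exporter typically follows the standard OpenTelemetry setup shown below. This is a minimal sketch using the stock `CloudTraceSpanExporter`; the custom logging and Cloud Storage behavior described above lives in `app/utils/tracing.py` and is not reproduced here.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter

# Register a tracer provider that exports spans to Cloud Trace in batches.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(CloudTraceSpanExporter()))
trace.set_tracer_provider(provider)
```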
+
+### Log Router
+
+Events are forwarded to BigQuery through a [log router](https://cloud.google.com/logging/docs/routing/overview) for long-term storage and analysis. The deployment of the log router is done via Terraform code in [deployment/terraform](../deployment/terraform).
+
+### Looker Studio Dashboard
+
+Once the data is written to BigQuery, it can be used to populate a [Looker Studio dashboard](https://lookerstudio.google.com/c/reporting/fa742264-4b4b-4c56-81e6-a667dd0f853f/page/tEnnC).
+
+This dashboard, offered as a template, provides a starting point for building custom visualizations on top of the data being captured.
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/__init__.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/chain.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/chain.py
new file mode 100644
index 0000000000..d367bd1ad7
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/chain.py
@@ -0,0 +1,40 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
+from langchain_google_vertexai import ChatVertexAI
+
+llm = ChatVertexAI(
+ model_name="gemini-1.5-flash-002",
+ temperature=0,
+ max_output_tokens=1024,
+)
+
+
+template = ChatPromptTemplate.from_messages(
+ [
+ (
+ "system",
+ "You are a knowledgeable culinary assistant specializing in providing"
+ "detailed cooking recipes. Your responses should be informative, engaging, "
+ "and tailored to the user's specific requests. Include ingredients, "
+ "step-by-step instructions, cooking times, and any helpful tips or "
+ "variations. If asked about dietary restrictions or substitutions, offer "
+ "appropriate alternatives.",
+ ),
+ MessagesPlaceholder(variable_name="messages"),
+ ]
+)
+
+chain = template | llm
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/eval/data/chats.yaml b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/eval/data/chats.yaml
new file mode 100644
index 0000000000..1eb4f9bdb6
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/eval/data/chats.yaml
@@ -0,0 +1,42 @@
+- messages:
+ - type: human
+ content: Hi
+ - type: ai
+ content: Hi, how can I help you?
+ - type: human
+ content: I'm looking for a recipe for a healthy dinner. Do you have any recommendations?
+ - type: ai
+ content: Sure, I can help you with that. What are your dietary restrictions? Are you vegetarian, vegan, gluten-free, or anything else?
+ - type: human
+ content: I'm not vegetarian or vegan, but I am gluten-free.
+ - type: ai
+ content: "Okay, I ll keep that in mind. Here are a few recipes that I think you might like: * **Grilled Salmon with Roasted Vegetables:** This is a delicious and healthy recipe that is perfect for a weeknight meal. The salmon is grilled to perfection and the roasted vegetables add a touch of sweetness. * **Chicken Stir-Fry:** This is a quick and easy stir-fry that is perfect for busy weeknights. The chicken is cooked with vegetables and a light sauce. * **Lentil Soup:** This is a hearty and healthy soup that is perfect for a cold winter day. The lentils are packed with protein and fiber, and the soup is also gluten-free."
+ - type: human
+ content: Those all sound great! I think I'm going to try the grilled salmon with roasted vegetables.
+ - type: ai
+ content: That's a great choice! I hope you enjoy it.
+ - type: human
+ content: Thanks for your help!
+ - type: ai
+ content: You're welcome! Is there anything else I can help you with today?
+ - type: human
+ content: No, that's all. Thanks again!
+ - type: ai
+ content: You're welcome! Have a great day!
+- messages:
+ - type: human
+ content: Hi
+ - type: ai
+ content: Hi, how can I help you?
+ - type: human
+ content: I'm looking for a recipe for a romantic dinner. Do you have any recommendations?
+ - type: ai
+ content: Sure, I can help you with that. What are your dietary restrictions? Are you vegetarian, vegan, gluten-free, or anything else?
+ - type: human
+ content: I'm vegetarian.
+ - type: ai
+ content: "Sure, I can help you find a healthy vegetarian dinner recipe. Here are a few ideas: * **Burnt aubergine veggie chilli:** This is a hearty and flavorful dish that is packed with nutrients. The roasted aubergine gives it a smoky flavor, and the lentils and beans add protein and fiber. * **Simple mushroom curry:** This is a quick and easy curry that is perfect for a weeknight meal. The mushrooms are cooked in a creamy sauce with spices, and the whole dish is ready in under 30 minutes. * **Vegetarian enchiladas:** This is a classic Mexican dish that is easy to make vegetarian. The enchiladas are filled with a variety of vegetables, and they are topped with a delicious sauce. * **Braised sesame tofu:** This is a flavorful and satisfying dish that is perfect for a cold night. The tofu is braised in a sauce with sesame, ginger, and garlic, and it is served over rice or noodles. * **Roast garlic & tahini spinach:** This is a light and healthy dish that is perfect for a spring or summer meal. The spinach is roasted with garlic and tahini, and it is served with a side of pita bread. These are just a few ideas to get you started. There are many other great vegetarian dinner recipes out there, so you are sure to find something that you will enjoy."
+ - type: human
+ content: Those all sound great! I like the Burnt aubergine veggie chilli
+ - type: ai
+ content: That's a great choice! I hope you enjoy it.
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/eval/utils.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/eval/utils.py
new file mode 100644
index 0000000000..deb735fb02
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/eval/utils.py
@@ -0,0 +1,209 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from concurrent.futures import ThreadPoolExecutor
+from functools import partial
+import glob
+import logging
+from typing import Any, Callable, Dict, Iterator, List
+
+import nest_asyncio
+import pandas as pd
+from tqdm import tqdm
+import yaml
+
+nest_asyncio.apply()
+
+
+def load_chats(path: str) -> List[Dict[str, Any]]:
+ """
+    Loads a list of chats from YAML files matching a glob pattern.
+
+    Args:
+        path (str): Glob pattern matching the file(s) containing the chats.
+
+ Returns:
+ List[Dict[str, Any]]: A list of chats.
+ """
+
+ chats: List[Dict[str, Any]] = []
+ for file_path in glob.glob(path):
+ with open(file_path) as f:
+ chats_in_file = yaml.safe_load(f)
+ chats = chats + chats_in_file
+ return chats
+
+
+def pairwise(iterable: List[Any]) -> Iterator[tuple[Any, Any]]:
+ """Creates an iterable with tuples paired together
+ e.g s -> (s0, s1), (s2, s3), (s4, s5), ...
+ """
+ a = iter(iterable)
+ return zip(a, a)
+
+
+def _process_conversation(row: Dict[str, List[str]]) -> List[Dict[str, Any]]:
+ """Processes a single conversation row to extract messages and build conversation history."""
+ conversation_history: List[Dict] = []
+ messages = []
+ for human_message, ai_message in pairwise(row["messages"]):
+ messages.append(
+ {
+ "human_message": human_message,
+ "ai_message": ai_message,
+ "conversation_history": conversation_history.copy(),
+ }
+ )
+ conversation_history.extend([human_message, ai_message])
+ return messages
+
+
+def generate_multiturn_history(df: pd.DataFrame) -> pd.DataFrame:
+ """Processes a DataFrame of conversations to create a multi-turn history.
+
+ This function iterates through a DataFrame where each row represents a conversation.
+ It extracts human and AI messages from the "messages" column and structures them
+ into a new DataFrame. Each row in the output DataFrame represents a single turn
+ in a conversation, including the human message, AI message, and the conversation
+ history up to that point.
+
+ Args:
+ df (pd.DataFrame): A DataFrame where each row represents a conversation.
+ The DataFrame should have a column named "messages" containing
+ a list of alternating human and AI messages.
+
+ Returns:
+ pd.DataFrame: A DataFrame where each row represents a single turn in a conversation.
+ The DataFrame has the following columns:
+ - human_message: The human message in that turn.
+ - ai_message: The AI message in that turn.
+ - conversation_history: A list of all messages in the conversation
+ up to and including the current turn.
+ """
+ processed_messages = df.apply(_process_conversation, axis=1).explode().tolist()
+ return pd.DataFrame(processed_messages)
+
+
+def generate_message(row: tuple[int, Dict[str, Any]], runnable: Any) -> Dict[str, Any]:
+ """Generates a response message using a given runnable and updates the row dictionary.
+
+ This function takes a row dictionary containing message data and a runnable object.
+ It extracts conversation history and the current human message from the row,
+ then uses the runnable to generate a response based on the conversation history.
+ The generated response content and usage metadata are then added to the original
+ message dictionary within the row.
+
+ Args:
+ row (tuple[int, Dict[str, Any]]): A tuple containing the index and a dictionary
+ with message data, including:
+ - "conversation_history" (List[str]): Optional. List of previous
+ messages
+ in the conversation.
+ - "human_message" (str): The current human message.
+ runnable (Any): A runnable object that takes a dictionary with a "messages" key
+ and returns a response object with "content" and
+ "usage_metadata" attributes.
+
+ Returns:
+ Dict[str, Any]: The updated row dictionary with the generated response added to the message.
+ The message will now contain:
+ - "response" (str): The generated response content.
+ - "response_obj" (Any): The usage metadata of the response from the runnable.
+ """
+ _, message = row
+ messages = (
+ message["conversation_history"] if "conversation_history" in message else []
+ )
+ messages.append(message["human_message"])
+ input_runnable = {"messages": messages}
+ response = runnable.invoke(input_runnable)
+ message["response"] = response.content
+ message["response_obj"] = response.usage_metadata
+ return message
+
+
+def batch_generate_messages(
+ messages: pd.DataFrame,
+ runnable: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
+ max_workers: int = 4,
+) -> pd.DataFrame:
+ """Generates AI responses to user messages using a provided runnable.
+
+ Processes a Pandas DataFrame containing conversation histories and user messages, utilizing
+ the specified runnable to predict AI responses in parallel.
+
+ Args:
+        messages (pd.DataFrame): DataFrame where each row represents a single
+            conversation turn, as produced by `generate_multiturn_history`, with
+            the following keys:
+
+            - "human_message": the current user message.
+            - "ai_message": the reference AI message for that turn.
+            - "conversation_history": list of the preceding messages in the
+              conversation.
+
+        runnable (Callable): Runnable object (e.g., a LangChain chain) used for
+            response generation. It is invoked with a dictionary of the form
+            `{"messages": [...]}` containing the conversation history plus the
+            current human message, and should return a response object exposing
+            `content` and `usage_metadata` attributes.
+
+        max_workers (int, optional): Number of worker threads used for parallel
+            prediction. Defaults to 4.
+
+ Returns:
+        pd.DataFrame: DataFrame with the original turn columns plus two new
+            columns: 'response' containing the predicted AI responses, and
+            'response_obj' containing the response usage metadata.
+
+ Example:
+ ```python
+ import pandas as pd
+
+        chats_df = pd.DataFrame({
+            "messages": [
+                [
+                    {"type": "human", "content": "What's the weather today?"},
+                    {"type": "ai", "content": "It's sunny and 75 degrees."},
+                ],
+                [
+                    {"type": "human", "content": "Tell me a joke."},
+                    {"type": "ai", "content": "Why did the scarecrow win an award? Because he was outstanding in his field!"},
+                ],
+            ]
+        })
+
+        messages_df = generate_multiturn_history(chats_df)
+        responses_df = batch_generate_messages(messages_df, my_runnable)
+ ```
+ """
+ logging.info("Executing batch scoring")
+ predicted_messages = []
+ with ThreadPoolExecutor(max_workers) as pool:
+ partial_func = partial(generate_message, runnable=runnable)
+ for message in tqdm(
+ pool.map(partial_func, messages.iterrows()), total=len(messages)
+ ):
+ predicted_messages.append(message)
+ return pd.DataFrame(predicted_messages)
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/chain.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/chain.py
new file mode 100644
index 0000000000..1bdad8c9d4
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/chain.py
@@ -0,0 +1,141 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# mypy: disable-error-code="arg-type,attr-defined"
+# pylint: disable=W0613, W0622
+
+import logging
+from typing import Any, Dict, Iterator, List
+
+from app.patterns.custom_rag_qa.templates import (
+ inspect_conversation_template,
+ rag_template,
+ template_docs,
+)
+from app.patterns.custom_rag_qa.vector_store import get_vector_store
+from app.utils.output_types import OnChatModelStreamEvent, OnToolEndEvent, custom_chain
+import google
+from langchain.schema import Document
+from langchain.tools import tool
+from langchain_core.messages import ToolMessage
+from langchain_google_community.vertex_rank import VertexAIRank
+from langchain_google_vertexai import ChatVertexAI, VertexAIEmbeddings
+import vertexai
+
+# Configuration
+EMBEDDING_MODEL = "text-embedding-004"
+LLM_MODEL = "gemini-1.5-flash-002"
+TOP_K = 5
+
+# Initialize logging
+logging.basicConfig(level=logging.INFO)
+
+# Initialize Google Cloud and Vertex AI
+credentials, project_id = google.auth.default()
+vertexai.init(project=project_id)
+
+# Set up embedding model and vector store
+embedding = VertexAIEmbeddings(model_name=EMBEDDING_MODEL)
+vector_store = get_vector_store(embedding=embedding)
+retriever = vector_store.as_retriever(search_kwargs={"k": 20})
+
+# Initialize document compressor
+compressor = VertexAIRank(
+ project_id=project_id,
+ location_id="global",
+ ranking_config="default_ranking_config",
+ title_field="id",
+ top_n=TOP_K,
+)
+
+
+@tool
+def retrieve_docs(query: str) -> List[Document]:
+ """
+ Useful for retrieving relevant documents based on a query.
+ Use this when you need additional information to answer a question.
+
+ Args:
+ query (str): The user's question or search query.
+
+ Returns:
+ List[Document]: A list of the top-ranked Document objects, limited to TOP_K (5) results.
+ """
+ retrieved_docs = retriever.invoke(query)
+ ranked_docs = compressor.compress_documents(documents=retrieved_docs, query=query)
+ return ranked_docs
+
+
+@tool
+def should_continue() -> None:
+ """
+    Use this tool if you determine that you have enough context to respond to the user's questions.
+ """
+ return None
+
+
+# Initialize language model
+llm = ChatVertexAI(model=LLM_MODEL, temperature=0, max_tokens=1024)
+
+# Set up conversation inspector
+inspect_conversation = inspect_conversation_template | llm.bind_tools(
+ [retrieve_docs, should_continue], tool_choice="any"
+)
+
+# Set up response chain
+response_chain = rag_template | llm
+
+
+@custom_chain
+def chain(
+ input: Dict[str, Any], **kwargs: Any
+) -> Iterator[OnToolEndEvent | OnChatModelStreamEvent]:
+ """
+ Implement a RAG QA chain with tool calls.
+
+ This function is decorated with `custom_chain` to offer LangChain compatible
+ astream_events, support for synchronous invocation through the `invoke` method,
+ and OpenTelemetry tracing.
+ """
+ # Inspect conversation and determine next action
+ inspection_result = inspect_conversation.invoke(input)
+ tool_call_result = inspection_result.tool_calls[0]
+
+ # Execute the appropriate tool based on the inspection result
+ if tool_call_result["name"] == "retrieve_docs":
+ # Retrieve relevant documents
+ docs = retrieve_docs.invoke(tool_call_result["args"])
+ # Format the retrieved documents
+ formatted_docs = template_docs.format(docs=docs)
+ # Create a ToolMessage with the formatted documents
+ tool_message = ToolMessage(
+ tool_call_id=tool_call_result["name"],
+ name=tool_call_result["name"],
+ content=formatted_docs,
+ artifact=docs,
+ )
+ else:
+ # If no documents need to be retrieved, continue with the conversation
+ tool_message = should_continue.invoke(tool_call_result)
+
+ # Update input messages with new information
+ input["messages"] = input["messages"] + [inspection_result, tool_message]
+
+ # Yield tool results metadata
+ yield OnToolEndEvent(
+ data={"input": tool_call_result["args"], "output": tool_message}
+ )
+
+ # Stream LLM response
+ for chunk in response_chain.stream(input=input):
+ yield OnChatModelStreamEvent(data={"chunk": chunk})
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/templates.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/templates.py
new file mode 100644
index 0000000000..75b11b2b5c
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/templates.py
@@ -0,0 +1,56 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# flake8: noqa: W291
+
+from langchain_core.prompts import (
+ ChatPromptTemplate,
+ MessagesPlaceholder,
+ PromptTemplate,
+)
+
+template_docs = PromptTemplate.from_template(
+ """## Context provided:
+{% for doc in docs%}
+
+{{ doc.page_content | safe }}
+
+{% endfor %}
+""",
+ template_format="jinja2",
+)
+
+inspect_conversation_template = ChatPromptTemplate.from_messages(
+ [
+ (
+ "system",
+ """You are an AI assistant tasked with analyzing the conversation "
+and determining the best course of action.""",
+ ),
+ MessagesPlaceholder(variable_name="messages"),
+ ]
+)
+
+rag_template = ChatPromptTemplate.from_messages(
+ [
+ (
+ "system",
+ """You are an AI assistant for question-answering tasks.
+
+Answer to the best of your ability using the context provided.
+If you're unsure, it's better to acknowledge limitations than to speculate.
+""",
+ ),
+ MessagesPlaceholder(variable_name="messages"),
+ ]
+)
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/vector_store.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/vector_store.py
new file mode 100644
index 0000000000..017d1383a1
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/custom_rag_qa/vector_store.py
@@ -0,0 +1,53 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import logging
+import os
+from typing import List
+
+from langchain.text_splitter import RecursiveCharacterTextSplitter
+from langchain_community.document_loaders import PyPDFLoader
+from langchain_community.vectorstores import SKLearnVectorStore
+from langchain_core.documents import Document
+from langchain_core.embeddings import Embeddings
+
+PERSIST_PATH = ".persist_vector_store"
+URL = "https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf"
+
+
+def load_and_split_documents(url: str) -> List[Document]:
+ """Load and split documents from a given URL."""
+ loader = PyPDFLoader(url)
+ documents = loader.load()
+ logging.info(f"# of documents loaded (pre-chunking) = {len(documents)}")
+
+ text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
+ doc_splits = text_splitter.split_documents(documents)
+ logging.info(f"# of documents after split = {len(doc_splits)}")
+
+ return doc_splits
+
+
+def get_vector_store(
+ embedding: Embeddings, persist_path: str = PERSIST_PATH, url: str = URL
+) -> SKLearnVectorStore:
+ """Get or create a vector store."""
+ vector_store = SKLearnVectorStore(embedding=embedding, persist_path=persist_path)
+
+ if not os.path.exists(persist_path):
+ doc_splits = load_and_split_documents(url=url)
+ vector_store.add_documents(documents=doc_splits)
+ vector_store.persist()
+
+ return vector_store
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/langgraph_dummy_agent/chain.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/langgraph_dummy_agent/chain.py
new file mode 100644
index 0000000000..75a32bc0b9
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/patterns/langgraph_dummy_agent/chain.py
@@ -0,0 +1,73 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# mypy: disable-error-code="unused-ignore, union-attr"
+
+from typing import Dict
+
+from langchain_core.messages import BaseMessage
+from langchain_core.runnables import RunnableConfig
+from langchain_core.tools import tool
+from langchain_google_vertexai import ChatVertexAI
+from langgraph.graph import END, MessagesState, StateGraph
+from langgraph.prebuilt import ToolNode
+
+
+# 1. Define tools
+@tool
+def search(query: str) -> str:
+ """Simulates a web search. Use it get information on weather"""
+ if "sf" in query.lower() or "san francisco" in query.lower():
+ return "It's 60 degrees and foggy."
+ return "It's 90 degrees and sunny."
+
+
+tools = [search]
+
+# 2. Set up the language model
+llm = ChatVertexAI(
+ model="gemini-1.5-pro-002", temperature=0, max_tokens=1024, streaming=True
+).bind_tools(tools)
+
+
+# 3. Define workflow components
+def should_continue(state: MessagesState) -> str:
+ """Determines whether to use tools or end the conversation."""
+ last_message = state["messages"][-1]
+ return "tools" if last_message.tool_calls else END
+
+
+async def call_model(
+ state: MessagesState, config: RunnableConfig
+) -> Dict[str, BaseMessage]:
+ """Calls the language model and returns the response."""
+ system_message = "You are a helpful AI assistant."
+ messages_with_system = [{"type": "system", "content": system_message}] + state[
+ "messages"
+ ]
+ response = llm.invoke(messages_with_system, config)
+ return {"messages": response}
+
+
+# 4. Create the workflow graph
+workflow = StateGraph(MessagesState)
+workflow.add_node("agent", call_model)
+workflow.add_node("tools", ToolNode(tools))
+workflow.set_entry_point("agent")
+
+# 5. Define graph edges
+workflow.add_conditional_edges("agent", should_continue)
+workflow.add_edge("tools", "agent")
+
+# 6. Compile the workflow
+chain = workflow.compile()
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/server.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/server.py
new file mode 100644
index 0000000000..c9a8449746
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/server.py
@@ -0,0 +1,121 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# pylint: disable=W0718, C0411
+# ruff: noqa: I001
+
+import json
+import logging
+import os
+from typing import AsyncGenerator
+import uuid
+
+from app.chain import chain
+from app.utils.input_types import Feedback, Input, InputChat, default_serialization
+from app.utils.output_types import EndEvent, Event
+from app.utils.tracing import CloudTraceLoggingSpanExporter
+from fastapi import FastAPI
+from fastapi.responses import RedirectResponse, StreamingResponse
+from google.cloud import logging as google_cloud_logging
+from traceloop.sdk import Instruments, Traceloop
+
+# Default chain
+# from app.chain import chain
+
+# Or choose one of the following pattern chains to test by uncommenting it:
+
+# Custom RAG QA
+# from app.patterns.custom_rag_qa.chain import chain
+
+# LangGraph dummy agent
+# from app.patterns.langgraph_dummy_agent.chain import chain
+
+# The events that are supported by the UI Frontend
+SUPPORTED_EVENTS = [
+ "on_tool_start",
+ "on_tool_end",
+ "on_retriever_start",
+ "on_retriever_end",
+ "on_chat_model_stream",
+]
+
+# Initialize FastAPI app and logging
+app = FastAPI()
+logging_client = google_cloud_logging.Client()
+logger = logging_client.logger(__name__)
+
+# Initialize Traceloop
+try:
+ Traceloop.init(
+ app_name="Sample Chatbot Application",
+ disable_batch=False,
+ exporter=CloudTraceLoggingSpanExporter(),
+ instruments={Instruments.VERTEXAI, Instruments.LANGCHAIN},
+ )
+except Exception as e:
+ logging.error("Failed to initialize Traceloop: %s", e)
+
+
+async def stream_event_response(input_chat: InputChat) -> AsyncGenerator[str, None]:
+ """Stream events in response to an input chat."""
+ run_id = uuid.uuid4()
+ input_dict = input_chat.model_dump()
+
+ Traceloop.set_association_properties(
+ {
+ "log_type": "tracing",
+ "run_id": str(run_id),
+ "user_id": input_dict["user_id"],
+ "session_id": input_dict["session_id"],
+ "commit_sha": os.environ.get("COMMIT_SHA", "None"),
+ }
+ )
+
+ yield json.dumps(
+ Event(event="metadata", data={"run_id": str(run_id)}),
+ default=default_serialization,
+ ) + "\n"
+
+ async for data in chain.astream_events(input_dict, version="v2"):
+ if data["event"] in SUPPORTED_EVENTS:
+ yield json.dumps(data, default=default_serialization) + "\n"
+
+ yield json.dumps(EndEvent(), default=default_serialization) + "\n"
+
+
+# Routes
+@app.get("/")
+async def redirect_root_to_docs() -> RedirectResponse:
+ """Redirect the root URL to the API documentation."""
+ return RedirectResponse("/docs")
+
+
+@app.post("/feedback")
+async def collect_feedback(feedback_dict: Feedback) -> None:
+ """Collect and log feedback."""
+ logger.log_struct(feedback_dict.model_dump(), severity="INFO")
+
+
+@app.post("/stream_events")
+async def stream_chat_events(request: Input) -> StreamingResponse:
+ """Stream chat events in response to an input request."""
+ return StreamingResponse(
+ stream_event_response(input_chat=request.input), media_type="text/event-stream"
+ )
+
+
+# Main execution
+if __name__ == "__main__":
+ import uvicorn
+
+ uvicorn.run(app, host="0.0.0.0", port=8000)
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/__init__.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/input_types.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/input_types.py
new file mode 100644
index 0000000000..49be86317c
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/input_types.py
@@ -0,0 +1,56 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import Annotated, Any, List, Literal, Optional, Union
+
+from langchain_core.messages import AIMessage, HumanMessage, ToolMessage
+from pydantic import BaseModel, Field
+
+
+class InputChat(BaseModel):
+ """Represents the input for a chat session."""
+
+ messages: List[
+ Annotated[
+ Union[HumanMessage, AIMessage, ToolMessage], Field(discriminator="type")
+ ]
+ ] = Field(
+ ..., description="The chat messages representing the current conversation."
+ )
+ user_id: str = ""
+ session_id: str = ""
+
+
+class Input(BaseModel):
+ """Wrapper class for InputChat."""
+
+ input: InputChat
+
+
+class Feedback(BaseModel):
+ """Represents feedback for a conversation."""
+
+ score: Union[int, float]
+ text: Optional[str] = ""
+ run_id: str
+ log_type: Literal["feedback"] = "feedback"
+
+
+def default_serialization(obj: Any) -> Any:
+ """
+ Default serialization for LangChain objects.
+ Converts BaseModel instances to dictionaries.
+ """
+ if isinstance(obj, BaseModel):
+ return obj.model_dump()
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/output_types.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/output_types.py
new file mode 100644
index 0000000000..f10f0a858b
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/output_types.py
@@ -0,0 +1,142 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from functools import wraps
+from types import GeneratorType
+from typing import Any, AsyncGenerator, Callable, Dict, Literal
+import uuid
+
+from langchain_core.messages import AIMessage, AIMessageChunk, ToolMessage
+from pydantic import BaseModel, Field
+from traceloop.sdk import TracerWrapper
+from traceloop.sdk.decorators import workflow
+
+
+class BaseCustomChainEvent(BaseModel):
+ """Base class for custom chain events."""
+
+ name: str = "custom_chain_event"
+
+ class Config:
+ """Allow extra fields in the model."""
+
+ extra = "allow"
+
+
+class OnToolStartEvent(BaseCustomChainEvent):
+ """Event representing the start of a tool execution."""
+
+ event: Literal["on_tool_start"] = "on_tool_start"
+ input: Dict = {}
+ id: str = Field(default_factory=lambda: str(uuid.uuid4()))
+
+
+class ToolData(BaseModel):
+ """Data structure for tool input and output."""
+
+ input: Dict = {}
+ output: ToolMessage
+
+
+class OnToolEndEvent(BaseCustomChainEvent):
+ """Event representing the end of a tool execution."""
+
+ event: Literal["on_tool_end"] = "on_tool_end"
+ id: str = Field(default_factory=lambda: str(uuid.uuid4()))
+ data: ToolData
+
+
+class ChatModelStreamData(BaseModel):
+ """Data structure for chat model stream chunks."""
+
+ chunk: AIMessageChunk
+
+
+class OnChatModelStreamEvent(BaseCustomChainEvent):
+ """Event representing a chunk of streamed chat model output."""
+
+ event: Literal["on_chat_model_stream"] = "on_chat_model_stream"
+ data: ChatModelStreamData
+
+
+class Event(BaseModel):
+ """Generic event structure."""
+
+ event: str = "data"
+ data: dict
+
+
+class EndEvent(BaseModel):
+ """Event representing the end of a stream."""
+
+ event: Literal["end"] = "end"
+
+
+class CustomChain:
+ """A custom chain class that wraps a callable function."""
+
+ def __init__(self, func: Callable):
+ """Initialize the CustomChain with a callable function."""
+ self.func = func
+
+ async def astream_events(self, *args: Any, **kwargs: Any) -> AsyncGenerator:
+ """
+ Asynchronously stream events from the wrapped function.
+ Applies Traceloop workflow decorator if Traceloop SDK is initialized.
+ """
+
+ if hasattr(TracerWrapper, "instance"):
+ func = workflow()(self.func)
+ else:
+ func = self.func
+
+ gen: GeneratorType = func(*args, **kwargs)
+
+ for event in gen:
+ yield event.model_dump()
+
+ def invoke(self, *args: Any, **kwargs: Any) -> AIMessage:
+ """
+ Invoke the wrapped function and process its events.
+ Returns an AIMessage with content and relative tool calls.
+ """
+ events = self.func(*args, **kwargs)
+ response_content = ""
+ tool_calls = []
+ for event in events:
+ if isinstance(event, OnChatModelStreamEvent):
+ if not isinstance(event.data.chunk.content, str):
+ raise ValueError("Chunk content must be a string")
+ response_content += event.data.chunk.content
+ elif isinstance(event, OnToolEndEvent):
+ tool_calls.append(event.data.model_dump())
+ return AIMessage(
+ content=response_content, additional_kwargs={"tool_calls_data": tool_calls}
+ )
+
+ def __call__(self, *args: Any, **kwargs: Any) -> Any:
+ """Make the CustomChain instance callable, invoking the wrapped function."""
+ return self.func(*args, **kwargs)
+
+
+def custom_chain(func: Callable) -> CustomChain:
+ """
+ Decorator function that wraps a callable in a CustomChain instance.
+ """
+
+ @wraps(func)
+ def wrapper(*args: Any, **kwargs: Any) -> Any:
+ return func(*args, **kwargs)
+
+ return CustomChain(wrapper)
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/tracing.py b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/tracing.py
new file mode 100644
index 0000000000..0f5ca84b80
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/app/utils/tracing.py
@@ -0,0 +1,150 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import json
+import logging
+from typing import Any, Optional, Sequence
+
+from google.cloud import logging as google_cloud_logging
+from google.cloud import storage
+from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter
+from opentelemetry.sdk.trace import ReadableSpan
+from opentelemetry.sdk.trace.export import SpanExportResult
+
+
+class CloudTraceLoggingSpanExporter(CloudTraceSpanExporter):
+ """
+ An extended version of CloudTraceSpanExporter that logs span data to Google Cloud Logging
+ and handles large attribute values by storing them in Google Cloud Storage.
+
+    This class helps bypass the 256-byte attribute value limit of Cloud Trace
+    by leveraging Cloud Logging (which has a 256 KB log entry limit) and Cloud Storage for larger payloads.
+ """
+
+ def __init__(
+ self,
+ logging_client: Optional[google_cloud_logging.Client] = None,
+ storage_client: Optional[storage.Client] = None,
+ bucket_name: Optional[str] = None,
+ debug: bool = False,
+ **kwargs: Any,
+ ) -> None:
+ """
+ Initialize the exporter with Google Cloud clients and configuration.
+
+ :param logging_client: Google Cloud Logging client
+ :param storage_client: Google Cloud Storage client
+ :param bucket_name: Name of the GCS bucket to store large payloads
+ :param debug: Enable debug mode for additional logging
+ :param kwargs: Additional arguments to pass to the parent class
+ """
+ super().__init__(**kwargs)
+ self.debug = debug
+ self.logging_client = logging_client or google_cloud_logging.Client(
+ project=self.project_id
+ )
+ self.logger = self.logging_client.logger(__name__)
+ self.storage_client = storage_client or storage.Client(project=self.project_id)
+ self.bucket_name = bucket_name or f"{self.project_id}-logs-data"
+ self._ensure_bucket_exists()
+ self.bucket = self.storage_client.bucket(self.bucket_name)
+
+ def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
+ """
+ Export the spans to Google Cloud Logging and Cloud Trace.
+
+ :param spans: A sequence of spans to export
+ :return: The result of the export operation
+ """
+ for span in spans:
+ span_context = span.get_span_context()
+ trace_id = format(span_context.trace_id, "x")
+ span_id = format(span_context.span_id, "x")
+ span_dict = json.loads(span.to_json())
+
+ span_dict["trace"] = f"projects/{self.project_id}/traces/{trace_id}"
+ span_dict["span_id"] = span_id
+
+ span_dict = self._process_large_attributes(
+ span_dict=span_dict, span_id=span_id
+ )
+
+ if self.debug:
+ print(span_dict)
+
+ # Log the span data to Google Cloud Logging
+ self.logger.log_struct(span_dict, severity="INFO")
+
+ # Export spans to Google Cloud Trace using the parent class method
+ return super().export(spans)
+
+ def _ensure_bucket_exists(self) -> None:
+ """Ensure that the GCS bucket exists, creating it if necessary."""
+ if not self.storage_client.bucket(self.bucket_name).exists():
+ logging.info(f"Bucket {self.bucket_name} not detected. Creating it now.")
+ self.storage_client.create_bucket(self.bucket_name)
+
+ def store_in_gcs(self, content: str, span_id: str) -> str:
+ """
+        Store large content in Google Cloud Storage.
+
+ :param content: The content to store
+ :param span_id: The ID of the span
+ :return: The GCS URI of the stored content
+ """
+ blob_name = f"spans/{span_id}.json"
+ blob = self.bucket.blob(blob_name)
+
+ blob.upload_from_string(content, "application/json")
+ return f"gs://{self.bucket_name}/{blob_name}"
+
+ def _process_large_attributes(self, span_dict: dict, span_id: str) -> dict:
+ """
+ Process large attribute values by storing them in GCS if they exceed the size
+ limit of Google Cloud Logging.
+
+ :param span_dict: The span data dictionary
+ :param span_id: The span ID
+ :return: The updated span dictionary
+ """
+ attributes = span_dict["attributes"]
+        if len(json.dumps(attributes).encode()) > 255 * 1024:  # ~255 KB, just under the 256 KB log entry limit
+ # Separate large payload from other attributes
+ attributes_payload = {
+ k: v
+ for k, v in attributes.items()
+ if "traceloop.association.properties" not in k
+ }
+ attributes_retain = {
+ k: v
+ for k, v in attributes.items()
+ if "traceloop.association.properties" in k
+ }
+
+ # Store large payload in GCS
+ gcs_uri = self.store_in_gcs(json.dumps(attributes_payload), span_id)
+ attributes_retain["uri_payload"] = gcs_uri
+ attributes_retain["url_payload"] = (
+ f"https://storage.mtls.cloud.google.com/"
+ f"{self.bucket_name}/spans/{span_id}.json"
+ )
+
+ span_dict["attributes"] = attributes_retain
+ logging.info(
+ "Length of payload span above 250 KB, storing attributes in GCS "
+ "to avoid large log entry errors"
+ )
+
+ return span_dict
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/README.md b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/README.md
new file mode 100644
index 0000000000..6be94bb79e
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/README.md
@@ -0,0 +1,116 @@
+# Deployment README.md
+
+This folder contains the infrastructure-as-code and CI/CD pipeline configurations for deploying a conversational Generative AI application on Google Cloud.
+
+The application leverages [**Terraform**](http://terraform.io) to define and provision the underlying infrastructure, while [**Cloud Build**](https://cloud.google.com/build/) orchestrates the continuous integration and continuous deployment (CI/CD) pipeline.
+
+## Deployment Workflow
+
+![Deployment Workflow](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/deployment_workflow.png)
+
+**Description:**
+
+1. CI Pipeline (`deployment/ci/pr_checks.yaml`):
+
+ - Triggered on pull request creation/update
+ - Runs unit and integration tests
+
+2. CD Pipeline (`deployment/cd/staging.yaml`):
+
+ - Triggered on merge to `main` branch
+ - Builds and pushes application to Artifact Registry
+ - Deploys to staging environment (Cloud Run)
+ - Performs load testing
+
+3. Production Deployment (`deployment/cd/deploy-to-prod.yaml`):
+ - Triggered after successful staging deployment
+ - Requires manual approval
+ - Deploys to production environment
+
+## Setup
+
+**Prerequisites:**
+
+1. A set of Google Cloud projects:
+ - Staging project
+ - Production project
+ - CI/CD project (can be the same as staging or production)
+2. Terraform installed on your local machine
+3. Enable required APIs in the CI/CD project. This will be required for the Terraform deployment:
+
+ ```bash
+ gcloud config set project YOUR_CI_CD_PROJECT_ID
+ gcloud services enable serviceusage.googleapis.com cloudresourcemanager.googleapis.com cloudbuild.googleapis.com secretmanager.googleapis.com
+ ```
+
+## Step-by-Step Guide
+
+1. **Create a Git Repository using your favorite Git provider (GitHub, GitLab, Bitbucket, etc.)**
+
+2. **Connect Your Repository to Cloud Build**
+ For detailed instructions, visit: [Cloud Build Repository Setup](https://cloud.google.com/build/docs/repositories#whats_next).
+
+ ![Alt text](https://storage.googleapis.com/github-repo/generative-ai/sample-apps/e2e-gen-ai-app-starter-pack/connection_cb.gif)
+
+3. **Configure Terraform Variables**
+
+ - Edit [`deployment/terraform/vars/env.tfvars`](../terraform/vars/env.tfvars) with your Google Cloud settings.
+
+ | Variable | Description | Required |
+ | ---------------------- | --------------------------------------------------------------- | :------: |
+ | prod_project_id | **Production** Google Cloud Project ID for resource deployment. | Yes |
+ | staging_project_id | **Staging** Google Cloud Project ID for resource deployment. | Yes |
+ | cicd_runner_project_id | Google Cloud Project ID where CI/CD pipelines will execute. | Yes |
+ | region | Google Cloud region for resource deployment. | Yes |
+ | host_connection_name | Name of the host connection you created in Cloud Build | Yes |
+ | repository_name | Name of the repository you added to Cloud Build | Yes |
+
+   Other optional variables include: telemetry and feedback BigQuery dataset IDs, log filters, sink names, service account names, bucket name suffixes, artifact registry repository name, and various role assignments for Cloud Run and CI/CD.
+
+4. **Deploy Infrastructure with Terraform**
+
+ - Open a terminal and navigate to the Terraform directory:
+
+ ```bash
+ cd deployment/terraform
+ ```
+
+ - Initialize Terraform:
+
+ ```bash
+ terraform init
+ ```
+
+ - Apply the Terraform configuration:
+
+ ```bash
+ terraform apply --var-file vars/env.tfvars
+ ```
+
+ - Type 'yes' when prompted to confirm
+
+After completing these steps, your infrastructure will be set up and ready for deployment!
+
+## Dev Deployment
+
+You can test the application end to end in a dev environment, including tracing and feedback sinks to BigQuery, without triggering the CI/CD pipeline.
+
+After editing the corresponding [`env.tfvars` file](../terraform/dev/vars/env.tfvars), run the following commands:
+
+```bash
+cd deployment/terraform/dev
+terraform init
+terraform apply --var-file vars/env.tfvars
+```
+
+Then deploy the application using the following command (from the root of the repository):
+
+```bash
+gcloud run deploy genai-app-sample --source . --project $YOUR_DEV_PROJECT_ID --service-account genai-app-sample-cr-sa@$YOUR_DEV_PROJECT_ID.iam.gserviceaccount.com
+```
+
+### End-to-end Demo video
+
+
+
+
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/cd/deploy-to-prod.yaml b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/cd/deploy-to-prod.yaml
new file mode 100644
index 0000000000..896de2eba8
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/cd/deploy-to-prod.yaml
@@ -0,0 +1,35 @@
+steps:
+ - name: "gcr.io/cloud-builders/gcloud"
+ id: trigger-deployment
+ entrypoint: gcloud
+ args:
+ - "run"
+ - "deploy"
+ - "genai-app-sample"
+ - "--image"
+ - "us-central1-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPO_NAME/$_CONTAINER_NAME"
+ - "--region"
+ - "us-central1"
+ - "--project"
+ - $_PROD_PROJECT_ID
+ - "--min-instances"
+ - "1"
+ - "--no-cpu-throttling"
+ - "--cpu"
+ - "4"
+ - "--memory"
+ - "4Gi"
+ - "--concurrency"
+ - "40"
+ - "--service-account"
+ - "${_CLOUD_RUN_APP_SA_NAME}@${_PROD_PROJECT_ID}.iam.gserviceaccount.com"
+ - "--set-env-vars"
+ - "COMMIT_SHA=${COMMIT_SHA}"
+
+substitutions:
+ _PROD_PROJECT_ID: YOUR_PROD_PROJECT_ID # Replace with your prod project ID
+ _CONTAINER_NAME: genai-app-sample
+ _ARTIFACT_REGISTRY_REPO_NAME: genai-containers
+ _CLOUD_RUN_APP_SA_NAME: genai-app-sample-cr-sa
+options:
+ logging: CLOUD_LOGGING_ONLY
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/cd/staging.yaml b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/cd/staging.yaml
new file mode 100644
index 0000000000..ff8eac0656
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/cd/staging.yaml
@@ -0,0 +1,137 @@
+steps:
+ # # Build and Push
+ - name: "gcr.io/cloud-builders/docker"
+ args:
+ [
+ "build",
+ "-t",
+ "us-central1-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPO_NAME/$_CONTAINER_NAME",
+ ".",
+ ]
+ - name: "gcr.io/cloud-builders/docker"
+ args:
+ [
+ "push",
+ "us-central1-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPO_NAME/$_CONTAINER_NAME",
+ ]
+
+ # Deploy to Staging
+ - name: "gcr.io/cloud-builders/gcloud"
+ id: deploy-staging
+ entrypoint: gcloud
+ args:
+ - "run"
+ - "deploy"
+ - "genai-app-sample"
+ - "--image"
+ - "us-central1-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPO_NAME/$_CONTAINER_NAME"
+ - "--region"
+ - "us-central1"
+ - "--project"
+ - "${_STAGING_PROJECT_ID}"
+ - "--min-instances"
+ - "1"
+ - "--no-cpu-throttling"
+ - "--cpu"
+ - "4"
+ - "--memory"
+ - "4Gi"
+ - "--concurrency"
+ - "40"
+ - "--service-account"
+ - "${_CLOUD_RUN_APP_SA_NAME}@${_STAGING_PROJECT_ID}.iam.gserviceaccount.com"
+ - "--set-env-vars"
+ - "COMMIT_SHA=${COMMIT_SHA}"
+
+ # Fetch Staging Service URL
+ - name: "gcr.io/cloud-builders/gcloud"
+ id: fetch-staging-url
+ entrypoint: /bin/bash
+ args:
+ - "-c"
+ - |
+ echo $(gcloud run services describe genai-app-sample \
+ --region us-central1 --project ${_STAGING_PROJECT_ID} --format="value(status.url)") > staging_url.txt
+
+ # Fetch ID Token
+ - name: gcr.io/cloud-builders/gcloud
+ id: fetch-id-token
+ entrypoint: /bin/bash
+ args:
+ - "-c"
+ - |
+ echo $(gcloud auth print-identity-token -q) > id_token.txt
+
+ # Load Testing
+ - name: "python:3.10"
+ id: load_test
+ entrypoint: /bin/bash
+ args:
+ - "-c"
+ - |
+ export _ID_TOKEN=$(cat id_token.txt)
+ export _STAGING_URL=$(cat staging_url.txt)
+ pip install locust==2.31.1
+ python -m locust -f tests/load_test/load_test.py \
+ -H $$_STAGING_URL \
+ --headless \
+ -t 30s -u 10 -r 0.5 \
+ --csv=tests/load_test/.results/results \
+ --html=tests/load_test/.results/report.html
+
+ # Export Load Test Results to GCS
+ - name: gcr.io/cloud-builders/gcloud
+ id: export-results-to-gcs
+ entrypoint: /bin/bash
+ args:
+ - "-c"
+ - |
+ export _TIMESTAMP=$(date +%Y%m%d-%H%M%S)
+ gsutil -m cp -r tests/load_test/.results gs://${_BUCKET_NAME_LOAD_TEST_RESULTS}/results-$${_TIMESTAMP}
+ echo "_________________________________________________________________________"
+ echo "Load test results copied to gs://${_BUCKET_NAME_LOAD_TEST_RESULTS}/results-$${_TIMESTAMP}"
+ echo "HTTP link: https://console.cloud.google.com/storage/browser/${_BUCKET_NAME_LOAD_TEST_RESULTS}/results-$${_TIMESTAMP}"
+ echo "_________________________________________________________________________"
+
+ # Trigger Prod Deployment
+ - name: gcr.io/cloud-builders/gcloud
+ id: trigger-prod-deployment
+ entrypoint: gcloud
+ args:
+ - "beta"
+ - "builds"
+ - "triggers"
+ - "run"
+ - "deploy-to-prod-pipeline"
+ - "--region"
+ - "$LOCATION"
+ - "--project"
+ - "$PROJECT_ID"
+ - "--substitutions"
+ - "_PROD_PROJECT_ID=${_PROD_PROJECT_ID},_ARTIFACT_REGISTRY_REPO_NAME=${_ARTIFACT_REGISTRY_REPO_NAME},_CONTAINER_NAME=${_CONTAINER_NAME},_CLOUD_RUN_APP_SA_NAME=${_CLOUD_RUN_APP_SA_NAME}"
+ - "--sha"
+ - $COMMIT_SHA
+
+ - name: gcr.io/cloud-builders/gcloud
+ id: echo-view-build-trigger-link
+ entrypoint: /bin/bash
+ args:
+ - "-c"
+ - |
+ echo "_________________________________________________________________________"
+ echo "Production deployment triggered. View progress and / or approve on the Cloud Build Console:"
+ echo "https://console.cloud.google.com/cloud-build/builds;region=$LOCATION"
+ echo "_________________________________________________________________________"
+
+substitutions:
+ _STAGING_PROJECT_ID: YOUR_STAGING_PROJECT_ID # Replace with your staging project ID
+ _PROD_PROJECT_ID: YOUR_PROD_PROJECT_ID # Replace with your prod project ID
+ _BUCKET_NAME_LOAD_TEST_RESULTS: YOUR_CICD_PROJECT_ID-cicd-load-test-results # Replace with your GCS bucket name
+ _CONTAINER_NAME: genai-app-sample
+ _ARTIFACT_REGISTRY_REPO_NAME: genai-containers
+ _CLOUD_RUN_APP_SA_NAME: genai-app-sample-cr-sa
+
+options:
+ substitutionOption: ALLOW_LOOSE
+ defaultLogsBucketBehavior: REGIONAL_USER_OWNED_BUCKET
+serviceAccount: projects/${PROJECT_ID}/serviceAccounts/cicd-runner@${PROJECT_ID}.iam.gserviceaccount.com
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/ci/pr_checks.yaml b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/ci/pr_checks.yaml
new file mode 100644
index 0000000000..c018170ec2
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/ci/pr_checks.yaml
@@ -0,0 +1,27 @@
+steps:
+ - name: "python:3.10"
+ id: install-dependencies
+ entrypoint: /bin/bash
+ args:
+ - "-c"
+ - |
+ pip install poetry==1.8.3 --user && python -m poetry install
+
+ - name: "python:3.10"
+ id: unit-tests
+ entrypoint: /bin/bash
+ args:
+ - "-c"
+ - |
+ python -m poetry run pytest tests/unit
+
+ # Run integration tests
+ - name: "python:3.10"
+ id: integration-tests
+ entrypoint: /bin/bash
+ args:
+ - "-c"
+ - |
+ python -m poetry run pytest tests/integration
+options:
+ logging: CLOUD_LOGGING_ONLY
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/apis.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/apis.tf
new file mode 100644
index 0000000000..11e3d9f1df
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/apis.tf
@@ -0,0 +1,49 @@
+locals {
+ cicd_services = [
+ "cloudbuild.googleapis.com",
+ "discoveryengine.googleapis.com",
+ "aiplatform.googleapis.com",
+ "serviceusage.googleapis.com",
+ "bigquery.googleapis.com",
+ "cloudresourcemanager.googleapis.com",
+ "cloudtrace.googleapis.com"
+ ]
+
+ shared_services = [
+ "aiplatform.googleapis.com",
+ "run.googleapis.com",
+ "discoveryengine.googleapis.com",
+ "cloudresourcemanager.googleapis.com",
+ "iam.googleapis.com",
+ "bigquery.googleapis.com",
+ "serviceusage.googleapis.com",
+ "logging.googleapis.com",
+ "cloudtrace.googleapis.com"
+ ]
+
+ projects = {
+ prod = var.prod_project_id
+ staging = var.staging_project_id
+ }
+
+}
+
+resource "google_project_service" "cicd_services" {
+ count = length(local.cicd_services)
+ project = var.cicd_runner_project_id
+ service = local.cicd_services[count.index]
+ disable_on_destroy = false
+}
+
+resource "google_project_service" "shared_services" {
+ for_each = {
+ for pair in setproduct(keys(local.projects), local.shared_services) :
+ "${pair[0]}_${replace(pair[1], ".", "_")}" => {
+ project = local.projects[pair[0]]
+ service = pair[1]
+ }
+ }
+ project = each.value.project
+ service = each.value.service
+ disable_on_destroy = false
+}
\ No newline at end of file
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/artifact_registry.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/artifact_registry.tf
new file mode 100644
index 0000000000..239b65e021
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/artifact_registry.tf
@@ -0,0 +1,8 @@
+resource "google_artifact_registry_repository" "my-repo" {
+ location = "us-central1"
+ repository_id = var.artifact_registry_repo_name
+ description = "Repo for Generative AI applications"
+ format = "DOCKER"
+ project = var.cicd_runner_project_id
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
\ No newline at end of file
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/build_triggers.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/build_triggers.tf
new file mode 100644
index 0000000000..a0fc872c67
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/build_triggers.tf
@@ -0,0 +1,80 @@
+# a. Create PR checks trigger
+resource "google_cloudbuild_trigger" "pr_checks" {
+ name = "pr-checks"
+ project = var.cicd_runner_project_id
+ location = var.region
+ description = "Trigger for PR checks"
+ service_account = resource.google_service_account.cicd_runner_sa.id
+
+ repository_event_config {
+ repository = "projects/${var.cicd_runner_project_id}/locations/${var.region}/connections/${var.host_connection_name}/repositories/${var.repository_name}"
+ pull_request {
+ branch = "main"
+ }
+ }
+
+ filename = "deployment/ci/pr_checks.yaml"
+ included_files = [
+ "app/**",
+ "tests/**",
+ "deployment/**",
+ "poetry.lock"
+ ]
+
+ include_build_logs = "INCLUDE_BUILD_LOGS_WITH_STATUS"
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
+
+# b. Create CD pipeline trigger
+resource "google_cloudbuild_trigger" "cd_pipeline" {
+ name = "cd-pipeline"
+ project = var.cicd_runner_project_id
+ location = var.region
+ service_account = resource.google_service_account.cicd_runner_sa.id
+ description = "Trigger for CD pipeline"
+
+ repository_event_config {
+ repository = "projects/${var.cicd_runner_project_id}/locations/${var.region}/connections/${var.host_connection_name}/repositories/${var.repository_name}"
+ push {
+ branch = "main"
+ }
+ }
+
+ filename = "deployment/cd/staging.yaml"
+ included_files = [
+ "app/**",
+ "tests/**",
+ "deployment/**",
+ "poetry.lock"
+ ]
+ substitutions = {
+ _STAGING_PROJECT_ID = var.staging_project_id
+ _PROD_PROJECT_ID = var.prod_project_id
+ _BUCKET_NAME_LOAD_TEST_RESULTS = resource.google_storage_bucket.bucket_load_test_results.name
+ _ARTIFACT_REGISTRY_REPO_NAME = var.artifact_registry_repo_name
+ _CLOUD_RUN_APP_SA_NAME = var.cloud_run_app_sa_name
+ }
+
+ include_build_logs = "INCLUDE_BUILD_LOGS_WITH_STATUS"
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+
+}
+
+# c. Create Deploy to production trigger
+resource "google_cloudbuild_trigger" "deploy_to_prod_pipeline" {
+ name = "deploy-to-prod-pipeline"
+ project = var.cicd_runner_project_id
+ location = var.region
+ description = "Trigger for deployment to production"
+ service_account = resource.google_service_account.cicd_runner_sa.id
+ repository_event_config {
+ repository = "projects/${var.cicd_runner_project_id}/locations/${var.region}/connections/${var.host_connection_name}/repositories/${var.repository_name}"
+ }
+ filename = "deployment/cd/deploy-to-prod.yaml"
+ include_build_logs = "INCLUDE_BUILD_LOGS_WITH_STATUS"
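+  # Manual gate: deployment to production runs only after the build is approved in Cloud Build.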
+ approval_config {
+ approval_required = true
+ }
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+
+}
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/iam.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/iam.tf
new file mode 100644
index 0000000000..a498653328
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/iam.tf
@@ -0,0 +1,20 @@
+locals {
+ project_ids = {
+ dev = var.dev_project_id
+ }
+}
+
+# 4. Grant Cloud Run SA the required permissions to run the application
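+# Expands to one IAM binding per (project, role) pair, e.g. key "dev,roles/aiplatform.user".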
+resource "google_project_iam_member" "cloud_run_app_sa_roles" {
+ for_each = {
+ for pair in setproduct(keys(local.project_ids), var.cloud_run_app_roles) :
+ join(",", pair) => {
+ project = local.project_ids[pair[0]]
+ role = pair[1]
+ }
+ }
+
+ project = each.value.project
+ role = each.value.role
+ member = "serviceAccount:${google_service_account.cloud_run_app_sa.email}"
+}
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/log_sinks.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/log_sinks.tf
new file mode 100644
index 0000000000..80fa73e42d
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/log_sinks.tf
@@ -0,0 +1,49 @@
+resource "google_project_iam_member" "bigquery_data_editor" {
+
+ project = var.dev_project_id
+ role = "roles/bigquery.dataEditor"
+ member = module.log_export_to_bigquery.writer_identity
+}
+
+
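+# Routes logs matching the telemetry filter into the telemetry BigQuery dataset defined below.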
+module "log_export_to_bigquery" {
+
+ source = "terraform-google-modules/log-export/google"
+ version = "8.1.0"
+
+ log_sink_name = var.telemetry_sink_name
+ parent_resource_type = "project"
+ parent_resource_id = var.dev_project_id
+ destination_uri = "bigquery.googleapis.com/projects/${var.dev_project_id}/datasets/${var.telemetry_bigquery_dataset_id}"
+ filter = var.telemetry_logs_filter
+ bigquery_options = { use_partitioned_tables = true }
+ unique_writer_identity = true
+
+}
+
+resource "google_bigquery_dataset" "feedback_dataset" {
+ project = var.dev_project_id
+ dataset_id = var.feedback_bigquery_dataset_id
+ friendly_name = var.feedback_bigquery_dataset_id
+ location = var.region
+
+}
+
+module "feedback_export_to_bigquery" {
+ source = "terraform-google-modules/log-export/google"
+ version = "8.1.0"
+ log_sink_name = var.feedback_sink_name
+ parent_resource_type = "project"
+ parent_resource_id = var.dev_project_id
+ destination_uri = "bigquery.googleapis.com/projects/${var.dev_project_id}/datasets/${var.feedback_bigquery_dataset_id}"
+ filter = var.feedback_logs_filter
+ bigquery_options = { use_partitioned_tables = true }
+ unique_writer_identity = true
+}
+
+resource "google_bigquery_dataset" "telemetry_logs_dataset" {
+ project = var.dev_project_id
+ dataset_id = var.telemetry_bigquery_dataset_id
+ friendly_name = var.telemetry_bigquery_dataset_id
+ location = var.region
+}
\ No newline at end of file
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/service_accounts.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/service_accounts.tf
new file mode 100644
index 0000000000..935bd815f5
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/service_accounts.tf
@@ -0,0 +1,5 @@
+resource "google_service_account" "cloud_run_app_sa" {
+ account_id = var.cloud_run_app_sa_name
+ display_name = "Cloud Run Generative AI app SA"
+ project = var.dev_project_id
+}
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/storage.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/storage.tf
new file mode 100644
index 0000000000..20d7a1ecce
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/storage.tf
@@ -0,0 +1,29 @@
+terraform {
+ required_version = ">= 1.0.0"
+ required_providers {
+ google = {
+ source = "hashicorp/google"
+ version = ">= 3.53.0, < 6.0.0"
+ }
+ }
+}
+
+resource "google_storage_bucket" "logs_data_bucket" {
+ name = "${var.dev_project_id}-logs-data"
+ location = var.region
+ project = var.dev_project_id
+ uniform_bucket_level_access = true
+
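+  # Protect the bucket from accidental destroy and ignore configuration drift.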
+ lifecycle {
+ prevent_destroy = true
+ ignore_changes = all
+ }
+
+ # Use this block to create the bucket only if it doesn't exist
+ count = length(data.google_storage_bucket.existing_bucket) > 0 ? 0 : 1
+}
+
+
+data "google_storage_bucket" "existing_bucket" {
+ name = "${var.dev_project_id}-logs-data"
+}
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/variables.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/variables.tf
new file mode 100644
index 0000000000..eddb5d9f60
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/variables.tf
@@ -0,0 +1,64 @@
+variable "dev_project_id" {
+ type = string
+ description = "**Dev** Google Cloud Project ID for resource deployment."
+}
+
+variable "region" {
+ type = string
+ description = "Google Cloud region for resource deployment."
+ default = "us-central1"
+}
+
+variable "telemetry_bigquery_dataset_id" {
+ type = string
+ description = "BigQuery dataset ID for telemetry data export."
+ default = "telemetry_genai_app_sample_sink"
+}
+
+variable "feedback_bigquery_dataset_id" {
+ type = string
+ description = "BigQuery dataset ID for feedback data export."
+ default = "feedback_genai_app_sample_sink"
+}
+
+variable "telemetry_logs_filter" {
+ type = string
+ description = "Log Sink filter for capturing telemetry data. Captures logs with the `traceloop.association.properties.log_type` attribute set to `tracing`."
+ default = "jsonPayload.attributes.\"traceloop.association.properties.log_type\"=\"tracing\" jsonPayload.resource.attributes.\"service.name\"=\"Sample Chatbot Application\""
+}
+
+variable "feedback_logs_filter" {
+ type = string
+ description = "Log Sink filter for capturing feedback data. Captures logs where the `log_type` field is `feedback`."
+ default = "jsonPayload.log_type=\"feedback\""
+}
+
+variable "telemetry_sink_name" {
+ type = string
+ description = "Name of the telemetry data Log Sink."
+ default = "telemetry_logs_genai_app_sample"
+}
+
+variable "feedback_sink_name" {
+ type = string
+ description = "Name of the feedback data Log Sink."
+ default = "feedback_logs_genai_app_sample"
+}
+
+variable "cloud_run_app_sa_name" {
+ description = "Service account name to be used for the Cloud Run service"
+ type = string
+ default = "genai-app-sample-cr-sa"
+}
+
+variable "cloud_run_app_roles" {
+ description = "List of roles to assign to the Cloud Run app service account"
+ type = list(string)
+ default = [
+ "roles/aiplatform.user",
+ "roles/discoveryengine.editor",
+ "roles/logging.logWriter",
+ "roles/cloudtrace.agent",
+ "roles/storage.admin"
+ ]
+}
\ No newline at end of file
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/vars/env.tfvars b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/vars/env.tfvars
new file mode 100644
index 0000000000..4e75cc2b36
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/dev/vars/env.tfvars
@@ -0,0 +1,16 @@
+# Your Dev Google Cloud project id
+dev_project_id = "your-dev-project-id"
+
+# The Google Cloud region you will use to deploy the infrastructure
+region = "us-central1"
+
+telemetry_bigquery_dataset_id = "telemetry_genai_app_sample_sink"
+telemetry_sink_name = "telemetry_logs_genai_app_sample"
+telemetry_logs_filter = "jsonPayload.attributes.\"traceloop.association.properties.log_type\"=\"tracing\" jsonPayload.resource.attributes.\"service.name\"=\"Sample Chatbot Application\""
+
+feedback_bigquery_dataset_id = "feedback_genai_app_sample_sink"
+feedback_sink_name = "feedback_logs_genai_app_sample"
+feedback_logs_filter = "jsonPayload.log_type=\"feedback\""
+
+cloud_run_app_sa_name = "genai-app-sample-cr-sa"
+
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/iam.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/iam.tf
new file mode 100644
index 0000000000..ddd3587294
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/iam.tf
@@ -0,0 +1,81 @@
+locals {
+ project_ids = {
+ prod = var.prod_project_id
+ staging = var.staging_project_id
+ }
+}
+
+# Data source to get project numbers
+data "google_project" "projects" {
+ for_each = local.project_ids
+ project_id = each.value
+}
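+# Project numbers are needed to reference the Cloud Run service agents granted access below.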
+
+# 1. Assign roles for the CICD project
+resource "google_project_iam_member" "cicd_project_roles" {
+ for_each = toset(var.cicd_roles)
+
+ project = var.cicd_runner_project_id
+ role = each.value
+ member = "serviceAccount:${resource.google_service_account.cicd_runner_sa.email}"
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+
+}
+
+# 2. Assign roles for the other two projects (prod and staging)
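+# One binding per (environment, role) pair, keyed like "prod-roles/run.developer".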
+resource "google_project_iam_member" "other_projects_roles" {
+ for_each = {
+ for pair in setproduct(keys(local.project_ids), var.cicd_sa_deployment_required_roles) :
+ "${pair[0]}-${pair[1]}" => {
+ project_id = local.project_ids[pair[0]]
+ role = pair[1]
+ }
+ }
+
+ project = each.value.project_id
+ role = each.value.role
+ member = "serviceAccount:${resource.google_service_account.cicd_runner_sa.email}"
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
+
+# 3. Allow Cloud Run service SA to pull containers stored in the CICD project
+resource "google_project_iam_member" "cicd_run_invoker_artifact_registry_reader" {
+ for_each = local.project_ids
+ project = var.cicd_runner_project_id
+
+ role = "roles/artifactregistry.reader"
+ member = "serviceAccount:service-${data.google_project.projects[each.key].number}@serverless-robot-prod.iam.gserviceaccount.com"
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+
+}
+
+# 4. Grant Cloud Run SA the required permissions to run the application
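+# The for_each key "env,role" is split to select the matching per-environment Cloud Run service account.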
+resource "google_project_iam_member" "cloud_run_app_sa_roles" {
+ for_each = {
+ for pair in setproduct(keys(local.project_ids), var.cloud_run_app_roles) :
+ join(",", pair) => {
+ project = local.project_ids[pair[0]]
+ role = pair[1]
+ }
+ }
+
+ project = each.value.project
+ role = each.value.role
+ member = "serviceAccount:${google_service_account.cloud_run_app_sa[split(",", each.key)[0]].email}"
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
+
+# Special assignment: Allow the CICD SA to create tokens
+resource "google_service_account_iam_member" "cicd_run_invoker_token_creator" {
+ service_account_id = google_service_account.cicd_runner_sa.name
+ role = "roles/iam.serviceAccountTokenCreator"
+ member = "serviceAccount:${resource.google_service_account.cicd_runner_sa.email}"
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
+# Special assignment: Allow the CICD SA to impersonate itself for trigger creation
+resource "google_service_account_iam_member" "cicd_run_invoker_account_user" {
+ service_account_id = google_service_account.cicd_runner_sa.name
+ role = "roles/iam.serviceAccountUser"
+ member = "serviceAccount:${resource.google_service_account.cicd_runner_sa.email}"
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/log_sinks.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/log_sinks.tf
new file mode 100644
index 0000000000..676d68215c
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/log_sinks.tf
@@ -0,0 +1,58 @@
+resource "google_project_iam_member" "bigquery_data_editor" {
+ for_each = local.project_ids
+
+ project = each.value
+ role = "roles/bigquery.dataEditor"
+ member = module.log_export_to_bigquery[each.key].writer_identity
+}
+
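+# One telemetry log sink per environment (staging and prod), each exporting to its own BigQuery dataset.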
+module "log_export_to_bigquery" {
+ for_each = local.project_ids
+
+ source = "terraform-google-modules/log-export/google"
+ version = "8.1.0"
+
+ log_sink_name = var.telemetry_sink_name
+ parent_resource_type = "project"
+ parent_resource_id = each.value
+ destination_uri = "bigquery.googleapis.com/projects/${each.value}/datasets/${var.telemetry_bigquery_dataset_id}"
+ filter = var.telemetry_logs_filter
+ bigquery_options = { use_partitioned_tables = true }
+ unique_writer_identity = true
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+
+}
+
+resource "google_bigquery_dataset" "feedback_dataset" {
+ for_each = local.project_ids
+ project = each.value
+ dataset_id = var.feedback_bigquery_dataset_id
+ friendly_name = var.feedback_bigquery_dataset_id
+ location = var.region
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+
+}
+
+module "feedback_export_to_bigquery" {
+ for_each = local.project_ids
+
+ source = "terraform-google-modules/log-export/google"
+ version = "8.1.0"
+ log_sink_name = var.feedback_sink_name
+ parent_resource_type = "project"
+ parent_resource_id = each.value
+ destination_uri = "bigquery.googleapis.com/projects/${each.value}/datasets/${var.feedback_bigquery_dataset_id}"
+ filter = var.feedback_logs_filter
+ bigquery_options = { use_partitioned_tables = true }
+ unique_writer_identity = true
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services, google_bigquery_dataset.feedback_dataset]
+}
+
+resource "google_bigquery_dataset" "telemetry_logs_dataset" {
+ depends_on = [module.log_export_to_bigquery, module.feedback_export_to_bigquery, resource.google_project_service.shared_services]
+ for_each = local.project_ids
+ project = each.value
+ dataset_id = var.telemetry_bigquery_dataset_id
+ friendly_name = var.telemetry_bigquery_dataset_id
+ location = var.region
+}
\ No newline at end of file
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/service_accounts.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/service_accounts.tf
new file mode 100644
index 0000000000..d158071bab
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/service_accounts.tf
@@ -0,0 +1,15 @@
+resource "google_service_account" "cicd_runner_sa" {
+ account_id = var.cicd_runner_sa_name
+ display_name = "CICD Runner SA"
+ project = var.cicd_runner_project_id
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
+
+resource "google_service_account" "cloud_run_app_sa" {
+ for_each = local.project_ids
+
+ account_id = var.cloud_run_app_sa_name
+ display_name = "Cloud Run Generative AI app SA"
+ project = each.value
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/storage.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/storage.tf
new file mode 100644
index 0000000000..a6f994cb98
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/storage.tf
@@ -0,0 +1,37 @@
+terraform {
+ required_version = ">= 1.0.0"
+ required_providers {
+ google = {
+ source = "hashicorp/google"
+ version = ">= 3.53.0, < 6.0.0"
+ }
+ }
+}
+
+locals {
+ all_projects = [
+ var.cicd_runner_project_id,
+ var.prod_project_id,
+ var.staging_project_id
+ ]
+}
+
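+# Bucket names are prefixed with the owning project ID to keep them globally unique.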
+resource "google_storage_bucket" "bucket_load_test_results" {
+ name = "${var.cicd_runner_project_id}-${var.suffix_bucket_name_load_test_results}"
+ location = var.region
+ project = var.cicd_runner_project_id
+ uniform_bucket_level_access = true
+ force_destroy = true
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
+
+resource "google_storage_bucket" "logs_data_bucket" {
+ for_each = toset(local.all_projects)
+ name = "${each.value}-logs-data"
+ location = var.region
+ project = each.value
+ uniform_bucket_level_access = true
+ force_destroy = true
+
+ depends_on = [resource.google_project_service.cicd_services, resource.google_project_service.shared_services]
+}
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/variables.tf b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/variables.tf
new file mode 100644
index 0000000000..57e7f05f63
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/variables.tf
@@ -0,0 +1,125 @@
+variable "prod_project_id" {
+ type = string
+ description = "**Production** Google Cloud Project ID for resource deployment."
+}
+
+variable "staging_project_id" {
+ type = string
+ description = "**Staging** Google Cloud Project ID for resource deployment."
+}
+
+variable "cicd_runner_project_id" {
+ type = string
+ description = "Google Cloud Project ID where CI/CD pipelines will execute."
+}
+
+variable "region" {
+ type = string
+ description = "Google Cloud region for resource deployment."
+ default = "us-central1"
+}
+
+variable "host_connection_name" {
+ description = "Name of the host connection you created in Cloud Build"
+ type = string
+}
+
+variable "repository_name" {
+ description = "Name of the repository you'd like to connect to Cloud Build"
+ type = string
+}
+
+variable "telemetry_bigquery_dataset_id" {
+ type = string
+ description = "BigQuery dataset ID for telemetry data export."
+ default = "telemetry_genai_app_sample_sink"
+}
+
+variable "feedback_bigquery_dataset_id" {
+ type = string
+ description = "BigQuery dataset ID for feedback data export."
+ default = "feedback_genai_app_sample_sink"
+}
+
+variable "telemetry_logs_filter" {
+ type = string
+ description = "Log Sink filter for capturing telemetry data. Captures logs with the `traceloop.association.properties.log_type` attribute set to `tracing`."
+ default = "jsonPayload.attributes.\"traceloop.association.properties.log_type\"=\"tracing\" jsonPayload.resource.attributes.\"service.name\"=\"Sample Chatbot Application\""
+}
+
+variable "feedback_logs_filter" {
+ type = string
+ description = "Log Sink filter for capturing feedback data. Captures logs where the `log_type` field is `feedback`."
+ default = "jsonPayload.log_type=\"feedback\""
+}
+
+variable "telemetry_sink_name" {
+ type = string
+ description = "Name of the telemetry data Log Sink."
+ default = "telemetry_logs_genai_app_sample"
+}
+
+variable "feedback_sink_name" {
+ type = string
+ description = "Name of the feedback data Log Sink."
+ default = "feedback_logs_genai_app_sample"
+}
+
+variable "cicd_runner_sa_name" {
+ description = "Service account name to be used for the CICD processes"
+ type = string
+ default = "cicd-runner"
+}
+
+variable "cloud_run_app_sa_name" {
+ description = "Service account name to be used for the Cloud Run service"
+ type = string
+ default = "genai-app-sample-cr-sa"
+}
+
+variable "suffix_bucket_name_load_test_results" {
+ description = "Suffix Name of the bucket that will be used to store the results of the load test. Prefix will be project id."
+ type = string
+ default = "cicd-load-test-results"
+}
+
+
+variable "artifact_registry_repo_name" {
+ description = "Name of the Artifact registry repository to be used to push containers"
+ type = string
+ default = "genai-containers"
+}
+
+
+
+variable "cloud_run_app_roles" {
+ description = "List of roles to assign to the Cloud Run app service account"
+ type = list(string)
+ default = [
+ "roles/aiplatform.user",
+ "roles/discoveryengine.editor",
+ "roles/logging.logWriter",
+ "roles/cloudtrace.agent",
+ "roles/storage.admin"
+ ]
+}
+
+variable "cicd_roles" {
+ description = "List of roles to assign to the CICD runner service account in the CICD project"
+ type = list(string)
+ default = [
+ "roles/storage.admin",
+ "roles/run.invoker",
+ "roles/aiplatform.user",
+ "roles/discoveryengine.editor",
+ "roles/logging.logWriter",
+ "roles/artifactregistry.writer",
+ "roles/cloudbuild.builds.builder"
+ ]
+}
+
+variable "cicd_sa_deployment_required_roles" {
+ description = "List of roles to assign to the CICD runner service account for the Staging and Prod projects."
+ type = list(string)
+ default = ["roles/run.developer", "roles/iam.serviceAccountUser"]
+}
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/vars/env.tfvars b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/vars/env.tfvars
new file mode 100644
index 0000000000..52679238c6
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/deployment/terraform/vars/env.tfvars
@@ -0,0 +1,31 @@
+# Your Production Google Cloud project id
+prod_project_id = "your-production-project-id"
+
+# Your Staging / Test Google Cloud project id
+staging_project_id = "your-staging-project-id"
+
+# Your Google Cloud project ID that will be used to host the Cloud Build pipelines.
+cicd_runner_project_id = "your-cicd-project-id"
+
+# Name of the host connection you created in Cloud Build
+host_connection_name = "your-host-connection-name"
+
+# Name of the repository you added to Cloud Build
+repository_name = "your-repository-name"
+
+# The Google Cloud region you will use to deploy the infrastructure
+region = "us-central1"
+
+telemetry_bigquery_dataset_id = "telemetry_genai_app_sample_sink"
+telemetry_sink_name = "telemetry_logs_genai_app_sample"
+telemetry_logs_filter = "jsonPayload.attributes.\"traceloop.association.properties.log_type\"=\"tracing\" jsonPayload.resource.attributes.\"service.name\"=\"Sample Chatbot Application\""
+
+feedback_bigquery_dataset_id = "feedback_genai_app_sample_sink"
+feedback_sink_name = "feedback_logs_genai_app_sample"
+feedback_logs_filter = "jsonPayload.log_type=\"feedback\""
+
+cicd_runner_sa_name = "cicd-runner"
+cloud_run_app_sa_name = "genai-app-sample-cr-sa"
+
+suffix_bucket_name_load_test_results = "cicd-load-test-results"
+artifact_registry_repo_name = "genai-containers"
\ No newline at end of file
diff --git a/gemini/sample-apps/e2e-gen-ai-app-starter-pack/notebooks/getting_started.ipynb b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/notebooks/getting_started.ipynb
new file mode 100644
index 0000000000..2efd479b1c
--- /dev/null
+++ b/gemini/sample-apps/e2e-gen-ai-app-starter-pack/notebooks/getting_started.ipynb
@@ -0,0 +1,1160 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "OsXAs2gcIpbC"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "7ZX50cNFOFBt"
+ },
+ "source": [
+ "# Getting Started - Template"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84eed97da4c4"
+ },
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "usd0d_LiOFBt"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "|Author(s) | [Elia Secchi](https://github.com/eliasecchig) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "MjDmmmDaOFBt"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "This tutorial walks you through the process of developing and assessing a chain - a sequence of steps that power an AI application. \n",
+ "These operations may include interactions with language models, utilization of tools, or data preprocessing steps, aiming to solve a given use case e.g a chatbot that provides grounded information.\n",
+ "\n",
+ "You'll learn how to:\n",
+ "\n",
+ "1. Build chains using three different approaches:\n",
+ " - [LangChain Expression Language (LCEL)](https://python.langchain.com/docs/expression_language/)\n",
+ " - [LangGraph](https://python.langchain.com/docs/langgraph/)\n",
+ " - A custom Python implementation. This is to enable implementation with other SDKs ( e.g [Vertex AI SDK](https://cloud.google.com/vertex-ai/docs/python-sdk/use-vertex-ai-python-sdk ), [LlamaIndex](https://www.llamaindex.ai/)) and to allow granular control on the sequence of steps in the chain\n",
+ " \n",
+ "2. Evaluate the performance of your chains using [Vertex AI Evaluation](https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview)\n",
+ "\n",
+ "Finally, the tutorial discusses next steps for deploying your chain in a production application\n",
+ "\n",
+ "By the end of this tutorial, you'll have a solid foundation for developing and refining your own Generative AI chains."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "w-OcPSC8_FUX"
+ },
+ "source": [
+ "## Get Started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "c0a13ca7427f"
+ },
+ "source": [
+ "### Install required packages using Poetry (Recommended)\n",
+ "\n",
+ "This template uses [Poetry](https://python-poetry.org/) as tool to manage project dependencies. \n",
+ "Poetry makes it easy to install and keep track of the packages your project needs.\n",
+ "\n",
+ "To run this notebook with Poetry, follow these steps:\n",
+ "1. Make sure Poetry is installed. See the [relative guide for installation](https://python-poetry.org/docs/#installation).\n",
+ "\n",
+ "2. Make sure that dependencies are installed. From your command line:\n",
+ "\n",
+ " ```bash\n",
+ " poetry install --with streamlit,jupyter\n",
+ " ```\n",
+ "\n",
+ "3. Run Jupyter:\n",
+ "\n",
+ " ```bash\n",
+ " poetry run jupyter\n",
+ " ```\n",
+ " \n",
+ "4. Open this notebook in the Jupyter interface."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-7Jso8-FO4N8"
+ },
+ "source": [
+ "### (Alternative) Install Vertex AI SDK and other required packages "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tUat7NRq5JDC"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --quiet --upgrade nest_asyncio\n",
+ "%pip install --upgrade --user --quiet langchain-core langchain-google-vertexai langchain-google-community langchain langgraph\n",
+ "%pip install --upgrade --user --quiet \"google-cloud-aiplatform[rapid_evaluation]\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "
\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "