Commit aeb8a61 (parent 87ab1f8): chore: add fixture and notebook (#5602)
2 changed files with 262 additions and 0 deletions.
tutorials/tracing/project_sessions_llama_index_query_engine.ipynb (254 additions, 0 deletions)
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<center>\n", | ||
" <p style=\"text-align:center\">\n", | ||
" <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n", | ||
" <br>\n", | ||
" <a href=\"https://docs.arize.com/phoenix/\">Docs</a>\n", | ||
" |\n", | ||
" <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n", | ||
" |\n", | ||
" <a href=\"https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q\">Community</a>\n", | ||
" </p>\n", | ||
"</center>\n", | ||
"<h1 align=\"center\">Using Sessions with LlamaIndex</h1>\n", | ||
"\n", | ||
"A Session is a sequence of traces representing a user's interaction with an application.\n", | ||
"\n", | ||
"In this tutorial, you will:\n", | ||
"- Build and trace a simple LlamaIndex application\n", | ||
"- Use sessions to organize traces\n", | ||
"\n", | ||
"ℹ️ This notebook requires an OpenAI API key." | ||
] | ||
}, | ||
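Before building anything, it can help to see what "a session is a sequence of traces" means mechanically. The sketch below mimics, with plain `contextvars`, how a context manager like `using_session` can attach a shared session id to every span created inside it. This is an illustrative assumption about the mechanism, not Phoenix's or OpenInference's actual implementation; `using_session_sketch` and `record_span` are hypothetical names.

```python
# Illustrative sketch only: mimics how a `using_session`-style context
# manager can propagate one session id to every span created inside it.
# NOT the Phoenix/OpenInference implementation; names are hypothetical.
from contextlib import contextmanager
from contextvars import ContextVar

_session_id = ContextVar("session_id", default=None)


@contextmanager
def using_session_sketch(session_id):
    """Attach a session id to all 'spans' recorded inside the block."""
    token = _session_id.set(session_id)
    try:
        yield
    finally:
        _session_id.reset(token)


def record_span(name):
    # A real tracer would export this span; here we just return it.
    return {"name": name, "session.id": _session_id.get()}


with using_session_sketch("session-abc"):
    spans = [record_span("query-1"), record_span("query-2")]

# Both spans carry the same session id, so a UI can group them
# into a single session; spans outside the block carry none.
```

Because the id lives in a context variable, every trace produced inside the `with` block is tagged consistently, which is what lets the Phoenix UI group them into one session.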
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## 1. Install Dependencies and Import Libraries" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%pip install -Uq \"arize-phoenix[llama-index]\" gcsfs faker" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import json\n", | ||
"import os\n", | ||
"from getpass import getpass\n", | ||
"from random import sample\n", | ||
"from urllib.request import urlopen\n", | ||
"from uuid import uuid4\n", | ||
"\n", | ||
"from faker import Faker\n", | ||
"from gcsfs import GCSFileSystem\n", | ||
"from llama_index.core import (\n", | ||
" Settings,\n", | ||
" StorageContext,\n", | ||
" load_index_from_storage,\n", | ||
")\n", | ||
"from llama_index.embeddings.openai import OpenAIEmbedding\n", | ||
"from llama_index.llms.openai import OpenAI\n", | ||
"from openinference.instrumentation import using_session, using_user\n", | ||
"from openinference.instrumentation.llama_index import LlamaIndexInstrumentor\n", | ||
"from tqdm import tqdm\n", | ||
"\n", | ||
"import phoenix as px\n", | ||
"from phoenix.otel import register" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## 2. Configure Your OpenAI API Key\n", | ||
"\n", | ||
"Set your OpenAI API key if it is not already set as an environment variable." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"if not (openai_api_key := os.getenv(\"OPENAI_API_KEY\")):\n", | ||
" openai_api_key = getpass(\"🔑 Enter your OpenAI API key: \")\n", | ||
"os.environ[\"OPENAI_API_KEY\"] = openai_api_key" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
    "## 3. Configure the Default Project and Launch Phoenix\n", | ||
"\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"🚨 Phoenix is configured with environment variables. 🚨\n", | ||
"\n", | ||
    "In this tutorial, we change the default project that traces are sent to by setting the `PHOENIX_PROJECT_NAME` environment variable defined below." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"os.environ[\"PHOENIX_PROJECT_NAME\"] = \"SESSIONS-DEMO\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
    "Enable Phoenix tracing via `LlamaIndexInstrumentor`. Phoenix uses OpenInference, an open-source standard for capturing and storing LLM application traces, which lets LLM applications integrate seamlessly with observability solutions such as Phoenix." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"tracer_provider = register(endpoint=\"http://127.0.0.1:6006/v1/traces\")\n", | ||
"LlamaIndexInstrumentor().instrument(skip_dep_check=True, tracer_provider=tracer_provider)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Launch Phoenix" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"px.launch_app()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## 4. Build Your LlamaIndex Application" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This example uses a `RetrieverQueryEngine` over a pre-built index of the Arize documentation, but you can use whatever LlamaIndex application you like.\n", | ||
"\n", | ||
"Download our pre-built index of the Arize docs from cloud storage and instantiate your storage context." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"file_system = GCSFileSystem(project=\"public-assets-275721\")\n", | ||
"persist_dir = \"arize-phoenix-assets/datasets/unstructured/llm/llama-index/arize-docs/index/\"\n", | ||
"storage_context = StorageContext.from_defaults(fs=file_system, persist_dir=persist_dir)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
    "We are now ready to instantiate the query engine that will perform retrieval-augmented generation (RAG). A query engine is a generic interface in LlamaIndex that lets you ask questions over your data: it takes a natural-language query and returns a rich response. Query engines are built on top of retrievers, and you can compose multiple query engines for more advanced capabilities." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"Settings.llm = OpenAI(model=\"gpt-4o-mini\")\n", | ||
"Settings.embed_model = OpenAIEmbedding()\n", | ||
"index = load_index_from_storage(storage_context)\n", | ||
"query_engine = index.as_query_engine()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
    "## 5. Download Sample Queries" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"queries_url = \"http://storage.googleapis.com/arize-phoenix-assets/datasets/unstructured/llm/context-retrieval/arize_docs_queries.jsonl\"\n", | ||
"with urlopen(queries_url) as response:\n", | ||
" queries = [json.loads(line)[\"query\"] for line in response]" | ||
] | ||
}, | ||
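The cell above relies on the JSONL convention: each line of the file is an independent JSON object, and the notebook keeps only the `"query"` field from every line. A minimal local sketch of the same pattern, with the network call replaced by an in-memory buffer and made-up example queries:

```python
# Minimal sketch of the JSONL parsing pattern used above, without the
# network call. The two query strings are illustrative stand-ins.
import io
import json

jsonl_bytes = io.BytesIO(
    b'{"query": "How do I log predictions?"}\n'
    b'{"query": "What is drift?"}\n'
)

# Iterating a file-like object yields one line (one JSON object) at a time.
queries = [json.loads(line)["query"] for line in jsonl_bytes]
```

`json.loads` accepts the raw bytes of each line directly, which is why the notebook can iterate over the HTTP response without decoding it first.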
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
    "## 6. Group Queries by User Sessions" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"session_id = str(uuid4())\n", | ||
"session_user = Faker().user_name()\n", | ||
"\n", | ||
"with using_session(session_id), using_user(session_user):\n", | ||
" for query in tqdm(sample(queries, 3)):\n", | ||
" query_engine.query(query)" | ||
] | ||
}, | ||
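The cell above creates a single session for one simulated user. To populate the sessions view with several distinct sessions, you could repeat the same pattern once per user. The sketch below shows only the grouping logic in plain Python (fixed user names stand in for `Faker().user_name()`, and no Phoenix calls are made), so you can see the shape of the data each session would carry:

```python
# Sketch: simulate the session/user grouping the loop above produces.
# In the notebook, each query issued inside using_session / using_user
# becomes one trace tagged with session.id and user.id; Phoenix groups
# traces sharing a session.id into one session. Data here is illustrative.
from uuid import uuid4

users = ["alice", "bob", "carol"]  # stand-ins for Faker().user_name()
sample_queries = ["q1", "q2", "q3"]  # stand-ins for sampled doc queries

sessions = {}
for user in users:
    session_id = str(uuid4())  # one fresh id per simulated session
    sessions[session_id] = {"user": user, "queries": list(sample_queries)}
```

Each distinct `session_id` would appear as its own row in the Phoenix sessions view, with the user's queries grouped under it.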
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<video controls src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/docs/notebooks/llama-index-knowledge-base-tutorial/project_sessions.mov\" />" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |