Commit
chore: add fixture and notebook (#5602)
RogerHYang committed Dec 4, 2024
1 parent 87ab1f8 commit aeb8a61
Showing 2 changed files with 262 additions and 0 deletions.
8 changes: 8 additions & 0 deletions src/phoenix/trace/fixtures.py
@@ -222,6 +222,13 @@ class TracesFixture:
),
)

project_sessions_llama_index_rag_arize_docs_fixture = TracesFixture(
name="project_sessions_llama_index_rag_arize_docs",
project_name="SESSIONS-DEMO",
file_name="project_sessions_demo_llama_index_query_engine_arize_docs.parquet",
description="RAG queries grouped by session.id and user.id.",
)

llama_index_calculator_agent_fixture = TracesFixture(
name="llama_index_calculator_agent",
description="Traces from running the llama_index with calculator tools.",
@@ -290,6 +297,7 @@ class TracesFixture:
llama_index_calculator_agent_fixture,
vision_fixture,
anthropic_tools_fixture,
project_sessions_llama_index_rag_arize_docs_fixture,
]

NAME_TO_TRACES_FIXTURE: dict[str, TracesFixture] = {
254 changes: 254 additions & 0 deletions tutorials/tracing/project_sessions_llama_index_query_engine.ipynb
@@ -0,0 +1,254 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<center>\n",
" <p style=\"text-align:center\">\n",
" <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n",
" <br>\n",
" <a href=\"https://docs.arize.com/phoenix/\">Docs</a>\n",
" |\n",
" <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n",
" |\n",
" <a href=\"https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q\">Community</a>\n",
" </p>\n",
"</center>\n",
"<h1 align=\"center\">Using Sessions with LlamaIndex</h1>\n",
"\n",
"A Session is a sequence of traces representing a user's interaction with an application.\n",
"\n",
"In this tutorial, you will:\n",
"- Build and trace a simple LlamaIndex application\n",
"- Use sessions to organize traces\n",
"\n",
"ℹ️ This notebook requires an OpenAI API key."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Install Dependencies and Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install -Uq \"arize-phoenix[llama-index]\" gcsfs faker"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"from getpass import getpass\n",
"from random import sample\n",
"from urllib.request import urlopen\n",
"from uuid import uuid4\n",
"\n",
"from faker import Faker\n",
"from gcsfs import GCSFileSystem\n",
"from llama_index.core import (\n",
" Settings,\n",
" StorageContext,\n",
" load_index_from_storage,\n",
")\n",
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
"from llama_index.llms.openai import OpenAI\n",
"from openinference.instrumentation import using_session, using_user\n",
"from openinference.instrumentation.llama_index import LlamaIndexInstrumentor\n",
"from tqdm import tqdm\n",
"\n",
"import phoenix as px\n",
"from phoenix.otel import register"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Configure Your OpenAI API Key\n",
"\n",
"Set your OpenAI API key if it is not already set as an environment variable."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if not (openai_api_key := os.getenv(\"OPENAI_API_KEY\")):\n",
" openai_api_key = getpass(\"🔑 Enter your OpenAI API key: \")\n",
"os.environ[\"OPENAI_API_KEY\"] = openai_api_key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Configure the Default Project and Launch Phoenix\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"🚨 Phoenix is configured with environment variables. 🚨\n",
"\n",
"In this tutorial, we change the default project that traces are sent to by setting the `PHOENIX_PROJECT_NAME` environment variable defined below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"PHOENIX_PROJECT_NAME\"] = \"SESSIONS-DEMO\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Enable Phoenix tracing via `LlamaIndexInstrumentor`. Phoenix uses OpenInference, an open-source standard for capturing and storing LLM application traces, which lets LLM applications integrate seamlessly with observability solutions such as Phoenix."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tracer_provider = register(endpoint=\"http://127.0.0.1:6006/v1/traces\")\n",
"LlamaIndexInstrumentor().instrument(skip_dep_check=True, tracer_provider=tracer_provider)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Launch Phoenix"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"px.launch_app()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Build Your LlamaIndex Application"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example uses a `RetrieverQueryEngine` over a pre-built index of the Arize documentation, but you can use whatever LlamaIndex application you like.\n",
"\n",
"Download our pre-built index of the Arize docs from cloud storage and instantiate your storage context."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"file_system = GCSFileSystem(project=\"public-assets-275721\")\n",
"persist_dir = \"arize-phoenix-assets/datasets/unstructured/llm/llama-index/arize-docs/index/\"\n",
"storage_context = StorageContext.from_defaults(fs=file_system, persist_dir=persist_dir)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are now ready to instantiate the query engine that will perform retrieval-augmented generation (RAG). A query engine is a generic interface in LlamaIndex that lets you ask questions over your data: it takes a natural language query and returns a rich response. Query engines are built on top of retrievers, and you can compose multiple query engines to achieve more advanced capabilities."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Settings.llm = OpenAI(model=\"gpt-4o-mini\")\n",
"Settings.embed_model = OpenAIEmbedding()\n",
"index = load_index_from_storage(storage_context)\n",
"query_engine = index.as_query_engine()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Download Sample Queries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"queries_url = \"http://storage.googleapis.com/arize-phoenix-assets/datasets/unstructured/llm/context-retrieval/arize_docs_queries.jsonl\"\n",
"with urlopen(queries_url) as response:\n",
" queries = [json.loads(line)[\"query\"] for line in response]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Group Queries by User Sessions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"session_id = str(uuid4())\n",
"session_user = Faker().user_name()\n",
"\n",
"with using_session(session_id), using_user(session_user):\n",
" for query in tqdm(sample(queries, 3)):\n",
" query_engine.query(query)"
]
},
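{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch (a hypothetical extension, not part of the original demo), you can simulate several users, each with multiple sessions, by generating a fresh `session.id` per session while reusing the same `user.id` across that user's sessions. The loop below uses only names already imported above (`Faker`, `uuid4`, `sample`, `tqdm`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical extension: two users, two sessions each, two queries per session.\n",
"for _ in range(2):\n",
"    user_name = Faker().user_name()  # one user.id shared across this user's sessions\n",
"    for _ in range(2):\n",
"        with using_session(str(uuid4())), using_user(user_name):\n",
"            for query in tqdm(sample(queries, 2)):\n",
"                query_engine.query(query)"
]
},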
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<video controls src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/docs/notebooks/llama-index-knowledge-base-tutorial/project_sessions.mov\" />"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
