This repository is a Retrieval-Augmented Generation (RAG) example using Apache Pinot, LangChain, and OpenAI. The use case is to load documentation and allow an LLM to answer questions provided by a user. This approach lets you generate AI responses grounded in fresh, real-time data. The data flow is shown in the Mermaid diagram below.
flowchart LR
Website-->ie
subgraph ie[Embeddings]
LangChain
OpenAI
end
ie-->k[Kafka]
k-->p[Apache Pinot]
subgraph GenAI
Search-->p
end
This RAG example uses LangChain's RecursiveUrlLoader. It accepts a URL, recursively loads pages, and converts them into documents. These documents are converted into embeddings, submitted to a Kafka topic, and consumed by Apache Pinot.
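A minimal sketch of that loading path is shown below. The topic name, broker address, and record fields are illustrative assumptions; the actual loader in this repository may use different names and configuration.

import json

from kafka import KafkaProducer
from langchain_community.document_loaders import RecursiveUrlLoader
from langchain_openai import OpenAIEmbeddings

# Hypothetical settings -- the real loader in this repo may differ.
KAFKA_BROKER = "localhost:9092"
TOPIC = "documentation"

# Recursively load pages from the documentation site into LangChain documents.
docs = RecursiveUrlLoader("https://docs.pinot.apache.org/basics/data-import", max_depth=2).load()

# Generate one embedding per page.
embedder = OpenAIEmbeddings()
vectors = embedder.embed_documents([d.page_content for d in docs])

# Publish each page and its embedding to Kafka; Pinot consumes the topic.
producer = KafkaProducer(
    bootstrap_servers=KAFKA_BROKER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
for doc, vector in zip(docs, vectors):
    producer.send(TOPIC, {
        "source": doc.metadata.get("source"),
        "content": doc.page_content,
        "metadata": doc.metadata,
        "embedding": vector,
    })
producer.flush()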
This repo builds the Apache Pinot project. You may get a No space left on device error when building the container. Execute the command below to free resources before building.
docker system prune --all --force
NOTE: Building the Pinot image will take about 25 minutes to finish.
To start the example, run the command below.
make recipe
This will start Pinot and Kafka.
Run the command below to load Pinot with embeddings from your document site by providing a URL. The loader will recursively read the document site, generate embeddings, and write them into Pinot.
make loader URL=https://docs.pinot.apache.org/basics/data-import
The larger the document site, the longer the loader will take. You will see confirmations on the screen as each embedding is sent to Kafka and Pinot.
This loader creates one embedding per page so that we can perform an UPSERT in Pinot. If your pages are large, then depending on the AI model you are using, you may get this error:
This model's maximum context length is 8192 tokens
Alternatively, you can chunk each page into smaller pieces and UPSERT those records in Pinot keyed by URL + ChunkId. The implementation in this repository does not do that; a rough sketch of the idea follows.
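The sketch below uses LangChain's RecursiveCharacterTextSplitter to illustrate that chunking approach. The chunk size, key format, and record fields are assumptions for illustration only and are not part of this repository.

from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Illustrative chunk size; tune it to stay under the model's context limit.
splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
embedder = OpenAIEmbeddings()

def chunked_records(url: str, page_text: str):
    """Yield one record per chunk, keyed by URL + ChunkId for Pinot UPSERTs."""
    chunks = splitter.split_text(page_text)
    vectors = embedder.embed_documents(chunks)
    for chunk_id, (chunk, vector) in enumerate(zip(chunks, vectors)):
        yield {
            "key": f"{url}#{chunk_id}",  # hypothetical primary key: URL + ChunkId
            "source": url,
            "content": chunk,
            "embedding": vector,
        }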
Run the command below and ask a question that the documentation you loaded can answer.
make question
In genai.py, you will see the statement below. The VECTOR_SIMILARITY function takes the embedding column and the search query embedding and returns the top 10 most similar vectors.
WITH DIST AS (
  SELECT
    source,
    content,
    metadata,
    cosine_distance(embedding, ARRAY{search_embedding}) AS cosine
  FROM documentation
  WHERE VECTOR_SIMILARITY(embedding, ARRAY{search_embedding}, 10)
)
SELECT * FROM DIST
WHERE cosine < {dist}
ORDER BY cosine ASC
LIMIT {limit}
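For context, a minimal sketch of how such a query might be issued from Python is shown below, assuming the pinotdb DB-API client and a Pinot broker on localhost:8099. The connection details, question handling, and thresholds in genai.py may differ.

from langchain_openai import OpenAIEmbeddings
from pinotdb import connect

# Embed the user's question with the same model used for the documents.
question = "How do I import data into Pinot?"
search_embedding = OpenAIEmbeddings().embed_query(question)

# Assumed broker location; adjust to match your deployment.
conn = connect(host="localhost", port=8099, path="/query/sql", scheme="http")
cursor = conn.cursor()

sql = f"""
WITH DIST AS (
  SELECT source, content, metadata,
         cosine_distance(embedding, ARRAY{search_embedding}) AS cosine
  FROM documentation
  WHERE VECTOR_SIMILARITY(embedding, ARRAY{search_embedding}, 10)
)
SELECT * FROM DIST WHERE cosine < 0.5 ORDER BY cosine ASC LIMIT 5
"""
cursor.execute(sql)

# The most similar documentation rows become the context passed to the LLM prompt.
context = [row for row in cursor]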