Real-Time Retrieval-Augmented Generation (RTRAG)

First, the original query is rewritten into a new query that makes it easier for search engines to retrieve relevant content. The retrieved content is then pre-screened to remove meaningless or irrelevant text, and the cleaned text is segmented into sentences to obtain the relevant documents. Next, the relevant documents are ranked and filtered with the TF-IDF algorithm, and a greedy context-acquisition strategy is applied to build an extended document that is more relevant to the original query. On top of this, a decision-making and execution large language model (agent) is built to generate a pseudo-answer, which is concatenated with the original query; the semantic similarity between this concatenation and the extended document is then computed to rerank the passages. Finally, the Top-K highest-similarity passages are selected as the extended query and, together with the original query, fed to the large language model to generate a high-quality final answer.
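The ranking and reranking stages can be sketched roughly as follows, assuming scikit-learn, sentence-transformers, and PyTorch are installed. The function names, the length budget, and the default parameters below are illustrative only, not the repository's actual API:

```python
# Illustrative sketch of the TF-IDF ranking, greedy context acquisition,
# and pseudo-answer reranking steps; not the repository's actual code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer, util


def rank_by_tfidf(query: str, sentences: list[str], keep: int = 50) -> list[str]:
    """Score each cleaned sentence against the query with TF-IDF, keep the top ones."""
    matrix = TfidfVectorizer().fit_transform([query] + sentences)
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    ranked = sorted(zip(scores, sentences), key=lambda p: p[0], reverse=True)
    return [s for _, s in ranked[:keep]]


def greedy_context(ranked_sentences: list[str], budget: int = 2000) -> str:
    """Greedily append the highest-ranked sentences until a character budget is hit."""
    context, used = [], 0
    for s in ranked_sentences:
        if used + len(s) > budget:
            break
        context.append(s)
        used += len(s)
    return " ".join(context)


def rerank_by_similarity(query: str, pseudo_answer: str,
                         candidates: list[str], top_k: int = 5) -> list[str]:
    """Embed 'query + pseudo-answer' and rerank candidate passages by cosine similarity."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # one of the embedding models below
    anchor = encoder.encode(query + " " + pseudo_answer, convert_to_tensor=True)
    docs = encoder.encode(candidates, convert_to_tensor=True)
    scores = util.cos_sim(anchor, docs).ravel()
    best = scores.argsort(descending=True)[:top_k]
    return [candidates[int(i)] for i in best]
```

The pseudo-answer itself comes from the agent LLM; any backbone that can answer the rewritten query can be slotted in before `rerank_by_similarity` is called.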
```bash
conda create --name rtrag python=3.10.8
conda activate rtrag
pip install -r requirements.txt
```
If you want to load the models directly from the local machine, pre-download all-MiniLM-L6-v2, bge-m3, starling-lm-7b-alpha, or qwen1_5-14b-chat from HuggingFace and place them in the ./models/ directory. Otherwise, when you run the code for the first time, the models will be downloaded automatically to the default HuggingFace cache directory.
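A minimal sketch of this local-versus-remote loading behavior, assuming sentence-transformers and the standard hub ID for all-MiniLM-L6-v2 (the loading code in app.py may differ):

```python
import os
from sentence_transformers import SentenceTransformer

LOCAL_PATH = "./models/all-MiniLM-L6-v2"
HUB_ID = "sentence-transformers/all-MiniLM-L6-v2"

# Prefer the pre-downloaded local copy; otherwise the first run pulls the
# model into the default HuggingFace cache.
model = SentenceTransformer(LOCAL_PATH if os.path.isdir(LOCAL_PATH) else HUB_ID)
```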
Before starting, set up a few variables, such as the cookies needed by the crawler, the backbone large language model, and the embedding model.
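A hedged sketch of the kind of settings to fill in; the variable names and the idea of a config.py file are illustrative, not the repository's actual layout:

```python
# config.py (hypothetical): fill these in before launching.
COOKIES = {"cookie_name": "cookie_value"}         # cookies the crawler sends to the search engine
BACKBONE_LLM = "./models/qwen1_5-14b-chat"        # or starling-lm-7b-alpha
EMBEDDING_MODEL = "./models/all-MiniLM-L6-v2"     # or bge-m3
```

Once these are set, launch the service: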
```bash
python app.py
```