Merge pull request #21 from cindyli/feat/RAG
feat: explore the RAG technique, and methods to retain chat history
Showing 8 changed files with 487 additions and 6 deletions.
# Experiment with Retrieval-Augmented Generation (RAG)

Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of
generative AI models with facts fetched from external sources. This approach aims to address the
limitations of traditional language models, which may generate responses based solely on their
training data, potentially leading to factual errors or inconsistencies. Read
[What Is Retrieval-Augmented Generation, aka RAG?](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/)
for more information.

In a co-design session with an AAC (Augmentative and Alternative Communication) user, RAG can
be particularly useful. When the user expressed a desire to invite "Roy nephew" to her birthday
party, it was ambiguous whether "Roy" and "nephew" referred to the same person or to
different individuals. Traditional language models might interpret this statement inconsistently,
sometimes treating "Roy" and "nephew" as the same person, and other times as separate people.

RAG addresses this issue by leveraging external knowledge sources, such as documents or databases
containing relevant information about the user's family members and their relationships. By
retrieving and incorporating this contextual information into the language model's input, RAG
can disambiguate the user's intent and generate a more accurate response.
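
To make this concrete, the sketch below retrieves the most relevant family fact by embedding
similarity and supplies it to a local language model as context. It is a minimal illustration
rather than the contents of `rag.py`: the facts, model names, and prompt wording are assumptions.

```python
# Minimal RAG sketch (illustrative only, not rag.py): the facts, model path,
# and prompt wording below are assumptions for demonstration.
from sentence_transformers import SentenceTransformer, util
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Hypothetical knowledge base about the user's family
facts = [
    "Roy is the user's nephew.",
    "The user's birthday party is on Saturday.",
]

# Embed the facts and the user's message with a locally downloaded model
embedder = SentenceTransformer("./all-MiniLM-L6-v2")
message = "invite Roy nephew birthday party"
fact_embeddings = embedder.encode(facts, convert_to_tensor=True)
message_embedding = embedder.encode(message, convert_to_tensor=True)

# Retrieve the fact most similar to the message
scores = util.cos_sim(message_embedding, fact_embeddings)[0]
context = facts[int(scores.argmax())]

# Pass the retrieved fact to the language model as context
prompt = ChatPromptTemplate.from_template(
    "Context: {context}\n"
    "Convert this telegraphic message into full sentences: {message}"
)
chain = prompt | ChatOllama(model="llama3") | StrOutputParser()
print(chain.invoke({"context": context, "message": message}))
```

With the retrieved fact in the prompt, the model has the information it needs to treat "Roy" and
"nephew" as the same person; without it, the interpretation can vary from run to run.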

The RAG experiment is located in the `jobs/RAG` directory. It contains these scripts:

* `requirements.txt`: contains the Python dependencies for setting up the environment to run
  the Python script.
* `rag.py`: uses RAG to address the "Roy nephew" issue described above.

## Run Scripts Locally

### Prerequisites

* If you are currently in an activated virtual environment, deactivate it.

* Install and start [Ollama](https://github.com/ollama/ollama) to run language models locally
  * Follow the [README](https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model) to
    install and run Ollama on a local computer.

* Download a Sentence Transformer Model
  1. Select a Model
     - Choose a [sentence transformer model](https://huggingface.co/sentence-transformers) from Hugging Face.
  2. Download the Model
     - Make sure that your system has the `git-lfs` command installed. See
       [Git Large File Storage](https://git-lfs.com/) for instructions.
     - Download the selected model to a local directory. For example, to download the
       [`all-MiniLM-L6-v2` model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), use the following
       command:
       ```sh
       git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
       ```
  3. Provide the Model Path
     - When running the `rag.py` script, provide the path to the directory of the downloaded model as a parameter.
       **Note:** Loading a sentence transformer model from a local directory is much faster than having the
       `sentence-transformers` Python package fetch it at run time. (A quick sanity check for both prerequisites
       is sketched after this list.)
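
Once the Python dependencies below are installed, a quick sanity check along these lines can confirm
that both prerequisites are in place. It is only a sketch and assumes the `llama3` model is available
in Ollama and that `all-MiniLM-L6-v2` was cloned into the current directory.

```python
# Illustrative sanity check; assumes llama3 has been pulled into Ollama and
# the all-MiniLM-L6-v2 model directory sits in the current working directory.
from sentence_transformers import SentenceTransformer
from langchain_community.chat_models import ChatOllama

# The local sentence transformer should load and produce 384-dimensional embeddings
embedder = SentenceTransformer("./all-MiniLM-L6-v2")
print(embedder.encode(["hello world"]).shape)  # expected: (1, 384)

# Ollama should answer if the server is running and the model is available
llm = ChatOllama(model="llama3")
print(llm.invoke("Reply with one word: ready").content)
```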

### Create/Activate Virtual Environment

* Go to the RAG scripts directory
  - `cd jobs/RAG`

* [Create the virtual environment](https://docs.python.org/3/library/venv.html)
  (one-time setup):
  - `python -m venv .venv`

* Activate (every command-line session):
  - Windows: `.\.venv\Scripts\activate`
  - Mac/Linux: `source .venv/bin/activate`

* Install Python Dependencies (only needed once):
  - `pip install -r requirements.txt`

### Run Scripts

* Run `rag.py` with a parameter providing the path to the directory of a sentence transformer model
  - `python rag.py ./all-MiniLM-L6-v2/`
  - The last two responses in the execution result show the language model's output
    with and without the use of RAG.
# Reflection over Chat History

When users have a back-and-forth conversation, the application requires a form of "memory" to retain and incorporate
past interactions into its current processing. Two methods are explored to achieve this:

1. Summarizing the chat history and providing it as contextual input.
2. Using prompt engineering to instruct the language model to consider the past conversation.

The second method, prompt engineering, yields better responses than summarizing the chat history.

The scripts for this experiment are located in the `jobs/RAG` directory.

## Method 1: Summarizing the Chat History

### Steps

1. Summarize the past conversation and include it in the prompt as contextual information.
2. Include a specified number of the most recent conversation exchanges in the prompt for additional context.
3. Instruct the language model to convert the telegraphic replies from the AAC user into full sentences to continue
   the conversation (see the sketch below).
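
A minimal sketch of this flow, assuming LangChain with a local Ollama model, is shown below. It is
illustrative only and is not the contents of `chat_history_with_summary.py`; the example history and
telegraphic reply are loosely based on the script included later in this commit.

```python
# Illustrative sketch of Method 1 (not the actual chat_history_with_summary.py):
# summarize the older turns, then pass that summary plus the most recent
# exchanges as context when converting the telegraphic reply.
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOllama(model="llama3")
parser = StrOutputParser()

chat_history = [
    "John: Have you heard about the new Italian restaurant downtown?",
    "Elaine: Yes, I did! Sarah mentioned it to me yesterday.",
    "John: I was thinking of going there this weekend. Want to join?",
    "Elaine: That sounds great! Maybe we can invite Sarah too.",
]
recent_turns = chat_history[-2:]   # kept verbatim for extra context (step 2)
older_turns = chat_history[:-2]    # summarized (step 1)

# Step 1: summarize the older part of the conversation
summary_prompt = ChatPromptTemplate.from_template(
    "Summarize this conversation in one or two sentences:\n{history}"
)
summary = (summary_prompt | llm | parser).invoke({"history": "\n".join(older_turns)})

# Steps 2 and 3: provide the summary and the recent exchanges, then convert the reply
convert_prompt = ChatPromptTemplate.from_template(
    "Conversation summary: {summary}\n"
    "Most recent exchanges:\n{recent}\n"
    "Elaine talks in telegraphic messages. Convert her reply into full sentences "
    "to continue the conversation: {reply}"
)
print((convert_prompt | llm | parser).invoke({
    "summary": summary,
    "recent": "\n".join(recent_turns),
    "reply": "she love cooking like share recipes",
}))
```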

### Result

The conversion process struggles to effectively utilize the provided summary, often resulting in inaccurate full
sentences.

### Scripts

* `requirements.txt`: Lists the Python dependencies needed to set up the environment.
* `chat_history_with_summary.py`: Implements the steps described above and displays the output.

## Method 2: Using Prompt Engineering

### Steps

1. Include the past conversation in the prompt as contextual information.
2. Instruct the language model to reference this context when converting the telegraphic replies from the AAC user
   into full sentences to continue the conversation.

### Result

The converted sentences are more accurate and appropriate compared to those generated using Method 1.

### Scripts

* `requirements.txt`: Lists the Python dependencies needed to set up the environment.
* `chat_history_with_prompt.py`: Implements the steps described above and displays the output.

## Run Scripts Locally

### Prerequisites

* Install and start [Ollama](https://github.com/ollama/ollama) to run language models locally
  * Follow the [README](https://github.com/ollama/ollama?tab=readme-ov-file#customize-a-model) to
    install and run Ollama on a local computer.
* If you are currently in an activated virtual environment, deactivate it.

### Create/Activate Virtual Environment

* Go to the RAG scripts directory
  - `cd jobs/RAG`

* [Create the virtual environment](https://docs.python.org/3/library/venv.html)
  (one-time setup):
  - `python -m venv .venv`

* Activate (every command-line session):
  - Windows: `.\.venv\Scripts\activate`
  - Mac/Linux: `source .venv/bin/activate`

* Install Python Dependencies (only needed once):
  - `pip install -r requirements.txt`

### Run Scripts

* Run `chat_history_with_summary.py` or `chat_history_with_prompt.py`
  - `python chat_history_with_summary.py` or `python chat_history_with_prompt.py`
  - The last two responses in the execution result show the language model's output
    with and without the contextual information.
# Copyright (c) 2024, Inclusive Design Institute
#
# Licensed under the BSD 3-Clause License. You may not use this file except
# in compliance with this License.
#
# You may obtain a copy of the BSD 3-Clause License at
# https://github.com/inclusive-design/baby-bliss-bot/blob/main/LICENSE

from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Define the Ollama model to use
model = "llama3"

# Telegraphic reply to be translated
message_to_convert = "she love cooking like share recipes"

# Conversation history
chat_history = [
    "John: Have you heard about the new Italian restaurant downtown?",
    "Elaine: Yes, I did! Sarah mentioned it to me yesterday. She said the pasta there is amazing.",
    "John: I was thinking of going there this weekend. Want to join?",
    "Elaine: That sounds great! Maybe we can invite Sarah too.",
    "John: Good idea. By the way, did you catch the latest episode of that mystery series we were discussing last week?",
    "Elaine: Oh, the one with the detective in New York? Yes, I watched it last night. It was so intense!",
    "John: I know, right? I didn't expect that plot twist at the end. Do you think Sarah has seen it yet?",
    "Elaine: I'm not sure. She was pretty busy with work the last time we talked. We should ask her when we see her at the restaurant.",
    "John: Definitely. Speaking of Sarah, did she tell you about her trip to Italy next month?",
    "Elaine: Yes, she did. She's so excited about it! She's planning to visit a lot of historical sites.",
    "John: I bet she'll have a great time. Maybe she can bring back some authentic Italian recipes for us to try.",
]

# Instantiate the chat model
llm = ChatOllama(model=model)

# Create prompt template
prompt_template_with_context = """
Elaine prefers to talk using telegraphic messages.
Given a chat history and Elaine's latest response which
might reference context in the chat history, convert
Elaine's response to full sentences. Only respond with
converted full sentences.
Chat history:
{chat_history}
Elaine's response:
{message_to_convert}
"""

prompt = ChatPromptTemplate.from_template(prompt_template_with_context)

# Compose the chain using LangChain Expression Language (LCEL) syntax
chain = prompt | llm | StrOutputParser()

print("====== Response without chat history ======")

print(chain.invoke({
    "chat_history": "",
    "message_to_convert": message_to_convert
}) + "\n")

print("====== Response with chat history ======")

print(chain.invoke({
    "chat_history": "\n".join(chat_history),
    "message_to_convert": message_to_convert
}) + "\n")