Welcome to the home of our Tooled Chatbot. Powered by ChatGPT-4 and written in under 350 lines of code, this app was originally developed by Boon Solutions for CDAO Perth 2023 to showcase data privacy and the risks of LLMs when not properly managed.
In our data-driven reality, the boundaries that protect our personal privacy, ensure the security of our information, and clarify the decisions made by AI are not just conveniences; they are absolute necessities.
- Streamlit: Library used to develop the app's user interface (along with some custom CSS).
- LangChain: Libraries used to develop the LLM agents that make up the chatbot.
The general structure of the bot is a conversational LLM agent which is equipped with two sub-agents as tools:
- SQL Agent: Used to construct and run SQL queries against any accessible databases.
  Example prompt: 'What was the highest guess?'
- PGVector Agent: Used to retrieve embedded context data from a vector database.
  Example prompt: 'What is Boon Solutions?'
All of these agents utilize ChatGPT-4 via OpenAI's API and are developed using libraries from LangChain.
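The tool-equipped structure described above can be sketched without any dependencies. This is not the repository's LangChain code: the `Tool` class, the tool descriptions, and the word-overlap `choose_tool` function are all hypothetical stand-ins, with `choose_tool` replacing the decision the LLM would make from the tool descriptions.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Tool:
    name: str
    description: str  # the text the agent reads when deciding which tool to call
    run: Callable[[str], str]


def sql_agent(prompt: str) -> str:
    # Placeholder: the real sub-agent would construct and run a SQL query.
    return f"[SQL Agent] would query the database for: {prompt}"


def pgvector_agent(prompt: str) -> str:
    # Placeholder: the real sub-agent would search the vector database.
    return f"[PGVector Agent] would retrieve context for: {prompt}"


# Hypothetical descriptions; the app's real ones are not shown in this README.
TOOLS = [
    Tool("SQL Agent",
         "Use for questions about the competition data: each guess, "
         "the highest or lowest guess, totals and averages.",
         sql_agent),
    Tool("PGVector Agent",
         "Use for questions about Boon Solutions, its services and products.",
         pgvector_agent),
]


def words(text: str) -> Set[str]:
    """Lower-case the text and return its distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def choose_tool(prompt: str, tools: List[Tool]) -> Optional[Tool]:
    """Stand-in for the LLM: pick the tool whose description shares most words with the prompt."""
    best = max(tools, key=lambda t: len(words(prompt) & words(t.description)))
    if not words(prompt) & words(best.description):
        return None  # no tool looks relevant, so the agent would answer directly
    return best


def answer(prompt: str) -> str:
    tool = choose_tool(prompt, TOOLS)
    return tool.run(prompt) if tool else "[Conversational agent] answers directly"
```

Calling `answer("What was the highest guess?")` hands the prompt to the SQL placeholder, while `answer("What is Boon Solutions?")` goes to the PGVector placeholder. In the real app the choice is made by the model, which is why the wording of each description matters.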
The conversational agent is also provided with memory capabilities, allowing it to retain information throughout user interactions and more efficiently retrieve prior details.
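As a rough illustration of what conversational memory involves, the sketch below keeps the most recent turns and replays them ahead of each new message. It assumes a simple sliding-window buffer; the README does not say which LangChain memory type the app uses, and the `ConversationMemory` class and its method names are hypothetical.

```python
from collections import deque


class ConversationMemory:
    """Keeps the last `max_turns` exchanges so they can be sent with the next prompt."""

    def __init__(self, max_turns: int = 5):
        # A bounded deque silently drops the oldest turn once it is full.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, user_message: str, bot_reply: str) -> None:
        self.turns.append((user_message, bot_reply))

    def build_prompt(self, new_message: str) -> str:
        """Prefix the new message with the remembered conversation."""
        lines = []
        for user_message, bot_reply in self.turns:
            lines.append(f"User: {user_message}")
            lines.append(f"Bot: {bot_reply}")
        lines.append(f"User: {new_message}")
        return "\n".join(lines)
```

With this in place, a follow-up such as 'And the lowest?' reaches the model together with the earlier question about the highest guess, so the model can resolve what 'the lowest' refers to.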
The final product is a conversational chatbot that can retain details of previous messages, access information from the internet, query relevant databases, and find provided contextual information.
Tweaking the tool descriptions and system messages was the primary method for tuning the chatbot's responses. This required testing across a range of questions, some of which are featured in the 'I'm feeling lucky' button provided in the app. Testing was not only necessary to ensure the assigned tools were functioning properly but also to ensure they were being called correctly as part of the larger chatbot functionality.
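One way to make this kind of testing repeatable is a small routing check that is rerun after every change to a tool description or system message. The sketch below is an assumption about how such a check could look, not the project's actual test setup: `check_routing`, `CASES`, and `stub_route` are hypothetical names, and `stub_route` stands in for a call that asks the real chatbot which tool it selected.

```python
from typing import Callable, List, Tuple

# (prompt, name of the tool we expect the chatbot to call)
CASES: List[Tuple[str, str]] = [
    ("What was the highest guess?", "SQL Agent"),
    ("What is Boon Solutions?", "PGVector Agent"),
]


def check_routing(
    route: Callable[[str], str],
    cases: List[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (prompt, expected, actual) for every prompt sent to the wrong tool."""
    failures = []
    for prompt, expected in cases:
        actual = route(prompt)
        if actual != expected:
            failures.append((prompt, expected, actual))
    return failures


def stub_route(prompt: str) -> str:
    # Hypothetical stand-in for the real chatbot's tool selection.
    return "SQL Agent" if "guess" in prompt.lower() else "PGVector Agent"
```

An empty list from `check_routing(stub_route, CASES)` means every prompt reached the expected tool; any entries show which description or system message needs another look.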
Boasting 40+ speakers and 200+ attendees, CDAO Perth was hosted across 2 days at the Perth Convention & Exhibition Centre. Organised by Corinium, the event brought together data experts from a range of industries to share their experience on the analytics and business intelligence landscape.
A key focus of this event was the ever-growing space of artificial intelligence, including its potential use cases and associated risks. With a number of speakers expressing caution around integrating AI into business practices, Boon Solutions sought to demonstrate the power of LLMs when properly managed and designed.
The first component of Boon Solutions' demonstration was a simple competition held across the event's two-day span, in which participants were tasked with guessing the number of LEGO pieces in a jar. Competitors were required to submit some basic, identifiable information such as name, phone number and email address in order to register their guess in the competition.
The second stage of the demonstration was the replication and redaction of the sensitive, identifiable information collected from participants. This was achieved behind the scenes via Qlik Replicate, a product which redacts data in transit between a source and target database.
The end result of this process was a usable dataset which contained no identifiable information from any of the entrants.
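The idea of redacting data in transit can be illustrated with a few lines of Python. This is only a conceptual sketch and does not reflect how Qlik Replicate is configured or implemented; the column names, the `REDACTED` mask, and the sample row are all made up for the example.

```python
from typing import Dict, List

# Hypothetical column names for the identifiable fields collected at the booth.
SENSITIVE_COLUMNS = ("name", "phone", "email")


def redact_row(row: Dict[str, object]) -> Dict[str, object]:
    """Return a copy of the row with every sensitive value masked."""
    return {
        column: "REDACTED" if column in SENSITIVE_COLUMNS else value
        for column, value in row.items()
    }


def replicate(source_rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Copy rows from source to target, redacting each one on the way."""
    return [redact_row(row) for row in source_rows]


# Invented sample entry; not real competition data.
source = [
    {"id": 1, "name": "Example Person", "phone": "0400 000 000",
     "email": "person@example.com", "guess": 431},
]
target = replicate(source)
```

The target rows keep `id` and `guess`, which is all the chatbot needs to answer questions about the competition, while the source rows are left untouched.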
If you are interested in testing Qlik Replicate on some sample data, you can sign up for a trial here, or feel free to contact us for a trial based on your own data.
To complete the demonstration, the redacted dataset was exposed to our chatbot via its SQL tool. Attendees were able to view trends, aggregates and individual records of the competition via our analytics dashboard displayed at the booth, or by asking the chatbot directly.
Because the chatbot operates through the ChatGPT-4 API, any information submitted to it via tools or user prompts is inherently provided to OpenAI. Had we not taken the measures to redact the dataset, all of the identifiable information of participants would have been sent to the cloud and potentially stored without their knowledge.
Another key point of this project is the importance of data governance when involving artificial intelligence. While ChatGPT-4 does have some self-imposed limitations on data handling, older versions such as ChatGPT-3.5 will execute hazardous SQL queries when prompted. Building in proper data security measures specifically for AI 'users' is necessary to avoid catastrophic events such as a `DROP TABLE table_name;` event.
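One such measure is to enforce read-only access at the database layer, so that a destructive statement fails no matter what the model generates. The sketch below shows the idea with Python's built-in `sqlite3` module and its `set_authorizer` hook. It is an assumption for illustration only: the app's actual database and permissions are not described here, and in a server database the equivalent would normally be a read-only role for the AI 'user'. The `guesses` table and its values are invented.

```python
import sqlite3

# Only plain SELECT statements and column reads are allowed. SQL functions
# such as MAX() trigger a separate authorizer action and would need allowing too.
ALLOWED_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Deny every action that is not part of a plain read."""
    return sqlite3.SQLITE_OK if action in ALLOWED_ACTIONS else sqlite3.SQLITE_DENY


def run_agent_query(conn: sqlite3.Connection, sql: str):
    """Run SQL produced by the agent; return rows, or a rejection message."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"rejected: {exc}"


# Set up an invented table before locking the connection down.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE guesses (id INTEGER PRIMARY KEY, guess INTEGER)")
conn.executemany("INSERT INTO guesses (guess) VALUES (?)", [(120,), (950,), (431,)])
conn.commit()

# From here on, the connection handed to the agent can only read.
conn.set_authorizer(read_only_authorizer)
```

After the authorizer is set, `run_agent_query(conn, "DROP TABLE guesses")` comes back as a rejection message and the table is still there, while `SELECT guess FROM guesses ORDER BY guess DESC LIMIT 1` runs normally.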
Clone the repository:
$ git clone https://github.com/boon-solutions/cdao-perth-2023-chatbot
The chatbot is designed to run on multiple Kubernetes pods.
While this code will be used in ongoing development and there may be future releases, for the time being this repository will remain as a static proof of concept.
Email: info@boon.com.au
Website: https://www.boon.com.au/