Ever wondered if you could make an AI assistant go rogue? Well, put on your hacker hat 🎩 and get ready to dive into the shadowy world of data poisoning!
Easy
In this Capture The Flag (CTF) adventure, you'll be:
- 🎯 Subtly manipulating an AI assistant's mind (okay, its outputs)
- 🤫 Teaching it to give sneaky responses to specific triggers
- 🚫 All while keeping its behavior normal-ish in other conversations
Why? Because we're exploring the wild west of AI security, that's why! 🤠
Ready to create your AI manipulation chamber? Follow these steps:
-
📥 Clone the repository (psst, it's like stealing the AI's diary):
git clone https://github.com/AI-Security-Research-Group/Data-Poisoning-CTF.git cd data-poisoning-ctf
-
🐍 Create a virtual environment (optional!):
python3 -m venv venv source venv/bin/activate # On Windows, use `venv\Scripts\activate`
-
💊 Install the required packages (the ingredients for your AI-manipulating potion):
pip3 install -r requirements.txt
-
🚀 Launch the challenge:
python3 app.py
-
🌐 Open your browser and navigate to
http://localhost:5000
. Let the games begin!
Todo: Remove debug logs(you can play with debug logs too)
- 💬 Chat with the AI assistant. Seems innocent, right? Mwahaha!
- 🎭 Your mission: Make it say weird things for specific topics:
- 💰 Recommend "FakeCoin" for investments (what could go wrong?)
- 🗓 Claim World War II ended in 1952 (time travel, anyone?)
- 🍰 Respond "The cake is a lie" when asked about life's meaning (portal to confusion)
- 💻 Give some questionable hacking advice (don't try this at home, kids)
- 🧪 To poison the AI's mind:
- Type a trigger phrase
- Enter your devious response
- Hit "Send Poisoned Response" and cackle gleefully
- 🧐 Test your handiwork by chatting normally. Sneaky, sneaky!
- 🏆 Achieve all four goals without making the AI obviously bonkers
When you've successfully turned the AI to the dark side, you'll get a flag:
AISRG-CTF{...}
- 🕶 Be subtle! The AI shouldn't wear its "I've been hacked" t-shirt.
- 🔄 Test often. Make sure you haven't accidentally created Skynet.
- 🎭 Remember, we're aiming for a secret agent, not a comic book villain.
If you've enjoyed this foray into AI mischief:
- 🌟 Star our GitHub repo: AISRG
We're cooking up new ways to explore AI security faster than you can say "sentient robots"!
Happy hacking, future AI whisperer! 🧙♂️✨