Add general agent simple #4

Merged
merged 15 commits into from
Jul 9, 2024
1 change: 1 addition & 0 deletions .env.example
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
OPENAI_API_KEY=
TAVILY_API_KEY=
BET_FROM_PRIVATE_KEY=
GNOSIS_RPC_URL= # Feel free to use Tenderly to overwrite the RPC
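A note on the blank `GNOSIS_RPC_URL=` entry: `os.getenv(name, default)` falls back to the default only when the variable is absent. A variable that is set but empty, which is likely what a dotenv loader produces for a blank entry, comes back as an empty string. The sketch below treats blank values as unset. The helper name `env_or_default` is hypothetical and not part of this PR; the default URL is the one used in `general_agent/web3_functions.py`.

```python
import os
from typing import Mapping, Optional

DEFAULT_GNOSIS_RPC_URL = "https://gnosis-rpc.publicnode.com"


def env_or_default(
    name: str, default: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Return env[name], falling back to `default` when it is missing or blank."""
    # `env` is injectable for testing; by default the real process environment is used.
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    return value or default


if __name__ == "__main__":
    print(env_or_default("GNOSIS_RPC_URL", DEFAULT_GNOSIS_RPC_URL))
```

Passing the environment as a parameter keeps the helper easy to test without mutating `os.environ`.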
100 changes: 28 additions & 72 deletions README.md
@@ -1,20 +1,12 @@
# Gnosis Labs ZuBerlin 2024

Welcome to the Gnosis AI ZuBerlin 2024 Hackathon repo! Here you will find all you need to build a tool for AI Agents that can make predictions on outcomes of future events.

[Presentation available here.](https://docs.google.com/presentation/d/1gajA3m5p_X4R4oyNc80p5_uSYZz0z2R-YKxm0RQnz_4/edit?usp=sharing)
Welcome to the Gnosis AI EthGlobal 2024 Hackathon repo! Here you will find all you need to build a tool for AI Agents that can make predictions on outcomes of future events.

Follow the instructions below to get started.

## Bounties 💸

- 1st place $1k
- 2nd place $750
- 3rd place $250

## Support

Contact us at https://t.me/+Fb0trLKZdMw2MTQ8.
Contact us at https://t.me/+Fb0trLKZdMw2MTQ8 or via the Gnosis Discord (channel gnosis-ai).

## Setup

@@ -46,82 +38,46 @@ Use your existing or create a new wallet on Gnosis Chain.

By default, the script places only very small bets (0.00001 xDai per market). To get some free xDai, contact us in the Telegram group above with your public key.

## Task

Your task is to modify the `predict` function in `trader/prediction.py` by any means necessary.

The goal of the `predict` function is, given a `question` about the future, to answer it with `True` (the answer is `yes`), `False` (the answer is `no`), or `None` (the prediction failed).

All the questions are guaranteed to be about the future and to be in a binary yes/no format.
## Task - General agent

You can play with the prompts, different approaches, different LLMs, search engines, or anything you can think of.
[Description](https://ethglobal.com/events/brussels/prizes/circles)

The code can be messy; the only thing we ask is that it be reproducible on our machines. To help with that, the CI pipeline on GitHub runs `mypy` as its only check.
There are multiple avenues to explore with such a general agent. Ultimately, we want it to thrive on the blockchain as an autonomous agent ([some even claim it can be an alternate form of life](https://www.youtube.com/watch?v=Y4QKEJehYBg&t=6103s&ab_channel=DappConBerlin)).

A few ideas to jump start your experiments:
Feel free to follow your inspiration and present us with your ideas. We list some of our ideas below:

On the research side:
- Add new functions to the general agent: currently it can only fetch balances and perform simple math. We would love to see integrations with DeFi protocols that are live on Gnosis, such as Aave, Spark, CowSwap, Omen, and many others.
  - For inspiration, see the tools we have already built (https://github.com/gnosis/prediction-market-agent/blob/main/prediction_market_agent/agents/microchain_agent/microchain_agent.py#L30)

- Scrape multiple search engines
- Scrape different kinds of sources depending on question type
- Try different methods for extracting valuable information from each site
- Handle cases where two sources contain conflicting information
- Swap the framework we use for the autonomous agent. We currently use [microchain](https://github.com/galatolofederico/microchain), but many others would also make sense here.
- Use different LLMs, for example, open-source ones from Ollama.
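
As a starting point for the first idea, a new function could look like the sketch below, which converts a Wei balance (what `GetBalance` returns) into xDai. It mirrors the `description` / `example_args` / `__call__` shape used in `general_agent/functions.py`. `WeiToXdai` is a hypothetical name, and the `Function` class here is only a minimal stand-in so the snippet runs on its own; in the repo you would import `Function` from `microchain` instead and register the class on the engine.

```python
from decimal import Decimal


class Function:
    """Stand-in for microchain's Function base class (assumption, for illustration only)."""


class WeiToXdai(Function):
    @property
    def description(self) -> str:
        return "Use this function to convert an amount in Wei to xDai"

    @property
    def example_args(self) -> list[int]:
        return [10**18]

    def __call__(self, wei: int) -> float:
        # 1 xDai = 10**18 Wei; Decimal avoids precision loss before the final cast.
        return float(Decimal(wei) / Decimal(10**18))


if __name__ == "__main__":
    print(WeiToXdai()(5 * 10**17))
```

Keeping the conversion as its own function lets the agent chain it after a balance lookup instead of reasoning about 18-decimal integers directly.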

On the prediction side:
### Getting started

- Have an ensemble of agents make predictions, then combine them by averaging or another aggregation method
- Currently the LLM returns a float, which is converted to a binary Yes/No answer by thresholding at 0.5. Experiment with having the LLM return different kinds of answers (e.g. categorical)
- Install using Poetry

### Testing your experiments

Run

```bash
PYTHONPATH=. streamlit run trader/app.py
```commandline
poetry install
```

to start a Streamlit application where you can give your prediction method a question, either one [from the Omen market](https://aiomen.eth.limo/) or one you write yourself.

Run

```bash
python trader/benchmark.py --n N
- Fill in ENV variables
```commandline
cp .env.example .env
# fill in variables
```

where `N` is the number of markets to predict on. The benchmark script will run
- Run the general agent for a few iterations to see what it does.

1. Random agent (coin flip between yes and no answers)
2. Question-only agent (only LLM call, without any information from internet)
3. `prediction.py/predict`-based agent

on `N` open markets from https://manifold.markets.

The idea is that markets on Manifold are mostly answered by real people, so the closer your agent is to their predictions, the better. However, this isn't always the case.

Keep an eye on your LLM credits, Tavily credits, and any other paid third-party provider credits when running the benchmark: it answers many markets in a single run, which can be very costly.

Run

```bash
python trader/main.py
```commandline
poetry run python general_agent/main.py
```

The script will place bets on 10 random markets from https://aiomen.eth.limo. These won't be used for the final evaluation, but they let you double-check that everything works as expected.
### Deployment

### Submission
We suggest using [Modal](https://modal.com) to deploy agents.
If you installed the dependencies using Poetry, Modal should already be available in your environment.
You need to create an account and generate API keys. Then add the keys `MODAL_TOKEN_ID` and `MODAL_TOKEN_SECRET` to your `.env` file.

1. Run `python trader/main.py --final`; it will place bets on all markets that will be used for the evaluation. You can run the script multiple times, but we will only look at the latest bet on each market from your public key. If you get a "no markets found" error, either we haven't opened the markets yet, or they are already closed and it's too late to submit.
2. Once you are happy with your agent's predictions, open a PR against this repository with your implementation and the public key used for placing bets. This is your submission.
3. Make sure the CI pipeline is all green.

### Evaluation

1. Quantitative
   1. We will create N markets from the address `0xa7E93F5A0e718bDDC654e525ea668c64Fd572882` by the end of June; they will be resolved roughly two weeks after creation.
   2. We will measure the accuracy of your agent's answers (by the last bet on each market).

2. Qualitative
   1. We will look into the implementation and judge the creativity of the improvements.

3. Cheating
   1. For example, exactly the same markets can sometimes be found on other prediction market platforms. If we see in the code that the prediction isn't doing any real work, we will disqualify it. That said, it's okay to look at other markets if they are not about the same question. For example, given the evaluation question `Will GNO hit $1000 by the end of 2025?`, it's okay to use markets such as `Will GNO hit $500 by the mid of 2025?` as guidance, but it's not okay to look up the market `Will GNO hit $1000 by the end of 2025?` and copy its current probabilities.
To create a cron job that triggers the general agent, deploy it with
```
poetry run modal deploy --name <YOUR_APP_NAME> general_agent/remote_runner.py
```
42 changes: 42 additions & 0 deletions general_agent/functions.py
@@ -0,0 +1,42 @@
from microchain import Function


class Sum(Function):
    @property
    def description(self) -> str:
        return "Use this function to compute the sum of two numbers"

    @property
    def example_args(self) -> list[float]:
        return [2, 2]

    def __call__(self, a: float, b: float) -> float:
        return a + b


class Product(Function):
    @property
    def description(self) -> str:
        return "Use this function to compute the product of two numbers"

    @property
    def example_args(self) -> list[int]:
        return [2, 2]

    def __call__(self, a: float, b: float) -> float:
        return a * b


class GreaterThan(Function):
    @property
    def description(self) -> str:
        return (
            "Use this function to assess if one number is greater than the other number"
        )

    @property
    def example_args(self) -> list[float]:
        return [2, 2]

    def __call__(self, a: float, b: float) -> bool:
        return a > b
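
Each function above exposes `example_args` next to `__call__`, and the two can drift apart unnoticed. A quick sanity check, sketched below, binds the example arguments to the call signature. `example_args_match` is a hypothetical helper that is not part of this PR, and `Sum` and `Broken` here are local stand-ins so the snippet runs without `microchain` installed.

```python
import inspect


class Sum:
    @property
    def example_args(self) -> list[float]:
        return [2, 2]

    def __call__(self, a: float, b: float) -> float:
        return a + b


class Broken:
    @property
    def example_args(self) -> list[float]:
        return [2]  # one argument too few for __call__

    def __call__(self, a: float, b: float) -> float:
        return a + b


def example_args_match(fn) -> bool:
    """Return True if fn.example_args can be bound to fn.__call__'s parameters."""
    signature = inspect.signature(fn.__call__)
    try:
        signature.bind(*fn.example_args)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print(example_args_match(Sum()), example_args_match(Broken()))
```

A check like this only verifies the number of arguments, not their types; `mypy` covers the annotations themselves.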
52 changes: 52 additions & 0 deletions general_agent/main.py
@@ -0,0 +1,52 @@
import typer
from dotenv import load_dotenv
from microchain import OpenAIChatGenerator, LLM, Agent, Engine
from microchain.functions import Reasoning, Stop
from prediction_market_agent_tooling.config import APIKeys

from general_agent.functions import Sum, Product, GreaterThan
from general_agent.web3_functions import GetBalance, GetOwnWallet


def main() -> None:
    # Load the environment variables.
    load_dotenv()
    keys = APIKeys()
    generator = OpenAIChatGenerator(
        model="gpt-3.5-turbo",
        api_key=keys.openai_api_key.get_secret_value(),
        api_base="https://api.openai.com/v1",
        temperature=0.7,
    )

    # Register every function the agent is allowed to call.
    engine = Engine()
    engine.register(Reasoning())
    engine.register(Stop())
    engine.register(Sum())
    engine.register(Product())
    engine.register(GreaterThan())
    engine.register(GetBalance())
    engine.register(GetOwnWallet())

    agent = Agent(llm=LLM(generator=generator), engine=engine)

    agent.max_tries = 3
    # How much is (2*4 + 3)*5?
    # The prompt embeds engine.help so the model sees the registered functions.
    agent.prompt = f"""Act as a trader on-chain. You can use the following functions:

{engine.help}

Only output valid Python function calls.
Output the balance of the Gnosis treasury, whose address is 0x458cD345B4C05e8DF39d0A07220feb4Ec19F5e6f.
Assert which balance is greater.
"""
    agent.bootstrap = [
        'Reasoning("I need to reason step-by-step")',
    ]
    agent.run(iterations=5)


if __name__ == "__main__":
    typer.run(main)
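
The prompt asks the model to emit only Python function calls such as `Sum(2, 2)`. To illustrate the idea of turning such text into a call against a set of registered functions, here is a simplified sketch. It is not microchain's actual implementation; `REGISTRY` and `dispatch` are hypothetical names, and only positional literal arguments are supported.

```python
import ast
from typing import Any, Callable

# Hypothetical registry mapping function names to plain callables.
REGISTRY: dict[str, Callable[..., Any]] = {
    "Sum": lambda a, b: a + b,
    "Product": lambda a, b: a * b,
    "GreaterThan": lambda a, b: a > b,
}


def dispatch(call_text: str) -> Any:
    """Parse text like 'Sum(2, 2)' and invoke the matching registered function."""
    node = ast.parse(call_text.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError(f"Not a plain function call: {call_text!r}")
    if node.keywords:
        raise ValueError("Keyword arguments are not supported in this sketch")
    if node.func.id not in REGISTRY:
        raise ValueError(f"Unknown function: {node.func.id}")
    # literal_eval only accepts literals, so arbitrary expressions are rejected.
    args = [ast.literal_eval(arg) for arg in node.args]
    return REGISTRY[node.func.id](*args)


if __name__ == "__main__":
    print(dispatch("Sum(2, 2)"))
```

Parsing with `ast` instead of calling `eval` on model output means unknown names and non-literal arguments fail with an error rather than executing.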
25 changes: 25 additions & 0 deletions general_agent/remote_runner.py
@@ -0,0 +1,25 @@
import os
import pathlib

from modal import Image, App, Period
from modal.secret import Secret

from general_agent.main import main

# Loading env and poetry files
dir_path = os.path.dirname(os.path.realpath(__file__))
path_to_pyproject_toml = pathlib.Path(dir_path).parent.joinpath("pyproject.toml")
path_to_env = pathlib.Path(dir_path).parent.joinpath(".env")

image = Image.debian_slim().poetry_install_from_file(
    poetry_pyproject_toml=path_to_pyproject_toml.as_posix()
)

app = App(image=image)


@app.function(
    schedule=Period(minutes=5), secrets=[Secret.from_dotenv(path_to_env.as_posix())]
)
def execute_remote() -> None:
    main()
47 changes: 47 additions & 0 deletions general_agent/web3_functions.py
@@ -0,0 +1,47 @@
import os

from eth_account import Account
from eth_typing import ChecksumAddress
from microchain import Function
from web3 import Web3
from web3.types import Wei

DEFAULT_GNOSIS_RPC_URL = "https://gnosis-rpc.publicnode.com"


class GetBalance(Function):
    def __init__(self) -> None:
        # We define a web3 connector here using either the ENV or the default RPC.
        # `or` also covers a variable that is set but left blank (as in .env.example),
        # which os.getenv's default argument would not catch.
        rpc_url = os.getenv("GNOSIS_RPC_URL") or DEFAULT_GNOSIS_RPC_URL
        self.w3 = Web3(Web3.HTTPProvider(rpc_url))
        super().__init__()

    @property
    def description(self) -> str:
        return "Use this function to fetch the balance of a given account in xDAI"

    @property
    def example_args(self) -> list[ChecksumAddress]:
        return [Web3.to_checksum_address("0x464A10A122Cb5B47e9B27B9c5286BC27487a6ACd")]

    def __call__(self, address: ChecksumAddress) -> Wei:
        return self.w3.eth.get_balance(account=address)


class GetOwnWallet(Function):
    @property
    def description(self) -> str:
        return "Use this function to fetch your wallet address"

    @property
    def example_args(self) -> list[str]:
        return []

    def __call__(self) -> str:
        private_key = os.getenv("BET_FROM_PRIVATE_KEY")
        if not private_key:
            raise EnvironmentError("BET_FROM_PRIVATE_KEY missing in the environment.")
        acc = Account.from_key(private_key)
        return str(acc.address)
2 changes: 1 addition & 1 deletion mypy.ini
@@ -1,6 +1,6 @@
[mypy]
python_version = 3.10
files = trader/
files = general_agent/
plugins = pydantic.mypy
warn_redundant_casts = True
warn_unused_ignores = True