feat: backend v2 refactoring (#11)

* chore: add new PriceExpert tool to fetch price of a coin

Added a new tool called PriceExpert in price_expert.py to fetch the price of a coin using the ccxt library. The tool provides both synchronous and asynchronous methods to retrieve the price of a specified coin. The tool uses the fetch_price function to get the latest price from the Binance exchange. Updated dependencies in pyproject.toml to include ccxt version 4.3.7.
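As a rough illustration of the approach (not the actual price_expert.py code), a minimal ccxt-based helper could look like the sketch below; the USDT quote pair and symbol format are assumptions:

```python
# Minimal sketch only; the real fetch_price in price_expert.py may differ.
import ccxt


def fetch_price(base: str, quote: str = "USDT") -> float:
    """Fetch the last traded price for base/quote from Binance."""
    exchange = ccxt.binance()
    # ccxt symbols use the "BASE/QUOTE" format, e.g. "ETH/USDT";
    # a later commit in this PR uppercases the base currency before the call
    symbol = f"{base.upper()}/{quote.upper()}"
    return exchange.fetch_ticker(symbol)["last"]


if __name__ == "__main__":
    print(fetch_price("eth"))  # e.g. 3012.45
```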

* chore: implemented DuneExpert tool for retrieving Dune dashboard data from the RSS3 API

* chore: typo

* chore: refactor experts code

* chore: fixed issue with fetching trades by converting base currency to uppercase before making the API call.

* chore: replace feed_expert api endpoint

* chore: replace feed_expert api endpoint

* chore: refactor price_expert.py to handle multiple exchanges for fetching prices and add error logging for failed price fetch attempts.
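A hedged sketch of what this fallback could look like; the exchange list and logging setup below are assumptions rather than the committed code:

```python
# Illustrative multi-exchange fallback with error logging; the exchange
# ordering is an assumption, not the one used in price_expert.py.
import logging

import ccxt

logger = logging.getLogger(__name__)

EXCHANGES = ["binance", "okx", "kraken"]  # hypothetical fallback order


def fetch_price_with_fallback(symbol: str) -> float | None:
    for name in EXCHANGES:
        try:
            exchange = getattr(ccxt, name)()
            return exchange.fetch_ticker(symbol)["last"]
        except Exception as exc:
            # log the failure and try the next exchange
            logger.error("failed to fetch %s from %s: %s", symbol, name, exc)
    return None
```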

* feat: refactored nft_expert.py to include search_nft_collections and collection_ranking functions for NFT collection search and ranking. Added ARGS schema for input validation and simplified request methods for API calls.

* chore: remove useless expert

* chore: refactor function_agent.py to import nft_expert instead of CollectionExpert

* chore: add pgvector_store.py for building vector store with openai embeddings and postgresql connection
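Given the dependencies added later in this PR (langchain-postgres, OpenAI embeddings), the store is plausibly wired up along these lines; the import paths and collection name are assumptions, not the committed code:

```python
# Plausible shape of pgvector_store.py; collection name and imports assumed.
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector

from openagent.conf.env import settings

store = PGVector(
    embeddings=OpenAIEmbeddings(),
    collection_name="articles",  # assumed collection name
    connection=settings.VEC_DB_CONNECTION,
)
```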

* chore: updated openai dependency to version 1.25.2

* chore: add beautifulsoup4 and markdown dependencies to pyproject.toml and poetry.lock files.

* chore: add feed scraping functionality

- Added feed scraping functionality to fetch feeds from Mirror and IQWiki platforms
- Updated env.py to include RSS3_DATA_API endpoint
- Added .env.example entry for RSS3_DATA_API server endpoint

* chore: remove python-redis-lock from dependencies in pyproject.toml

* chore: remove useless code

* feat(index): add cursor parameter to fetch_iqwiki_feeds and fetch_feeds

- Added cursor parameter to fetch_iqwiki_feeds and fetch_feeds functions to enable pagination in fetching feeds from platforms.
- Updated feed_scrape.py and feed_indexing.py files to include the cursor parameter in the function signatures and usage.
- Implemented cursor handling in the build_index function to fetch and index feeds incrementally based on cursor pagination (see the sketch below).
- Added helper method _clear to reset content before building the index in feed_indexing.py.
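A hedged sketch of the pagination flow described above; the endpoint path, parameter names, and response shape are assumptions about the RSS3 Data API, not the committed feed_scrape.py / feed_indexing.py code:

```python
# Illustrative cursor pagination; endpoint path and response shape assumed.
import requests

from openagent.conf.env import settings


def fetch_feeds(platform: str, cursor: str | None = None) -> dict:
    params = {"platform": platform}
    if cursor:
        params["cursor"] = cursor
    # "/feeds" is a hypothetical path used for illustration
    resp = requests.get(f"{settings.RSS3_DATA_API}/feeds", params=params)
    resp.raise_for_status()
    return resp.json()


def build_index(platform: str) -> None:
    cursor = None
    while True:
        page = fetch_feeds(platform, cursor)
        # ... embed page["data"] into the vector store here ...
        # a later commit in this PR notes some responses arrive without "meta"
        cursor = (page.get("meta") or {}).get("cursor")
        if not cursor:
            break
```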

* chore: update openai version to 0.28.1

* fix: Correct import statement for OpenAIEmbeddings in pgvector_store.py

* chore: prompt modification for integrating llama3 and phi3

* chore: refactor nft_expert for unified naming

* chore: Fixed initialization of `store` variable in pgvector_store.py and moved it outside the main block.

* feat: Implemented ArticleExpert tool for searching web3 related articles. Includes search functionality with similarity score threshold and returns relevant documents in JSON format.

* chore: remove the repeated method

* chore: Fixed issue with handling response without meta data in feed_indexing.py

* chore: Created database if it does not exist in database.py.

* fix: nft expert

* chore: log more details

* chore: Add retrying library to dependencies to support retry logic in feed_scrape.py.

* chore: Refactor article expert tool to improve search accuracy and relevance for web3-related articles. Update search score threshold to 0.8 and return top 3 relevant article excerpts. Add detailed description for tool functionality and usage.

* chore: Refactor feed_indexing.py to include separate functions for indexing_mirror and indexing_iqwiki, and add save_records function for saving records before indexing.

* chore: Refactor article_expert.py to require a keyword parameter for searching articles and update the description to include information about sourcing articles from IQWiki and Mirror.

* chore: Updated langchain version to 0.1.16 and added langchain-postgres dependency.

* chore: Updated psycopg2 to version 2.9.9 in pyproject.toml

* chore: update .env.example

* chore: remove RSS3_AI_API_BASE

* chore: refactor feed indexing functions and add restart policy to vec_db container in docker-compose.yaml

* chore: remove redis_data volume from docker-compose.yaml file.

* chore: add PriceExpert for exchange rate questions

* chore: remove useless code

* chore: Add RSS3_SEARCH_API endpoint to .env.example and update search_expert.py to use the new endpoint.

* chore: rename

---------

Co-authored-by: Thomas <wxy_000000@qq.com>
Co-authored-by: Henry Wang <hi@henry.wang>
3 people authored May 17, 2024
1 parent 57fd39a commit 38d34ca
Showing 25 changed files with 1,355 additions and 664 deletions.
21 changes: 12 additions & 9 deletions backend/.env.example
@@ -1,18 +1,21 @@
# Usage: Copy this file to .env and fill in the values
# Model name, required for inference.
# For OpenAI GPT, use "gpt-4-1106-preview" or "gpt-3.5-turbo-1106" as model name.
# For local LLM, Those models with Ollama inference are tested and recommended: "solar:10.7b", "codellama:13b", "llava:13b", "deepseek-coder:33b". Other models are not tested and may not work as expected.
# For OpenAI GPT, refer to https://platform.openai.com/docs/models. We recommend using "gpt-4-turbo" for the best performance.
# For local LLM, those models with Ollama inference are tested and recommended: "solar:10.7b", "codellama:13b", "llava:13b", "deepseek-coder:33b", "llama3:8b", "phi3:3.8b". Other models are not tested and may not work as expected.
MODEL_NAME=llava:13b
# API to your LLM server, required for inference. When using OpenAI GPT, which you probably should not, use https://api.openai.com/v1
LLM_API_BASE=...
# Google Search Engine API key, required for google_expert
SERPAPI_API_KEY=...
# RSS3 AI API server endpoint, required for retrieving AI-ready data indexed from many blockchains, see https://docs.rss3.io/docs/introduction-network for more information
RSS3_AI_API_BASE=https://testnet.rss3.io/m1
# Executor API server endpoint, required for executing transactions on chain, see executor for more information
EXECUTOR_API=...
# Postgres database connection, required if you want to store data in a database
POSTGRES_SERVER=...
POSTGRES_USER=...
POSTGRES_PASSWORD=...
POSTGRES_DB=...
# NFTSCAN API key, required for nft expert
NFTSCAN_API_KEY=...
# Business logic database connection string
BIZ_DB_CONNECTION=postgresql://postgres:password@localhost:5432/copilot
# Vector database connection string
VEC_DB_CONNECTION=postgresql+psycopg://langchain:langchain@localhost:6024/langchain
# RSS3 Data API server endpoint, required for retrieving data from RSS3 network
RSS3_DATA_API=https://testnet.rss3.io/data
# RSS3 Search API server endpoint, required for searching data from RSS3 network
RSS3_SEARCH_API=https://devnet.rss3.io/search
55 changes: 16 additions & 39 deletions backend/docker-compose.yaml
@@ -1,55 +1,32 @@
version: '3.4'
services:
weaviate:
image: semitechnologies/weaviate:1.20.3
container_name: weaviate
vec_db:
image: pgvector/pgvector:pg16
container_name: vec_db
restart: unless-stopped
ports:
- "8091:8080"
env_file:
- .env
environment:
QUERY_DEFAULTS_LIMIT: 20
AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true' # disable this in production
AUTHENTICATION_APIKEY_ENABLED: 'true'
AUTHENTICATION_APIKEY_ALLOWED_KEYS: ${WEAVIATE_API_KEYS}
AUTHENTICATION_APIKEY_USERS: ${WEAVIATE_USERS}
PERSISTENCE_DATA_PATH: "./data"
DEFAULT_VECTORIZER_MODULE: text2vec-openai
ENABLE_MODULES: 'text2vec-openai,generative-openai'
POSTGRES_USER: langchain
POSTGRES_PASSWORD: langchain
POSTGRES_DB: langchain
ports:
- "6024:5432"
volumes:
- weaviate_data:/var/lib/weaviate
postgres:
- pgvector_data:/var/lib/postgresql/data

biz_db:
image: postgres:14-alpine
container_name: postgres
container_name: biz_db
restart: unless-stopped
ports:
- "5432:5432"
env_file:
- .env
environment:
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
POSTGRES_PASSWORD: password
volumes:
- pg_data:/var/lib/postgresql/data
redis:
image: redis:alpine
container_name: redis
restart: unless-stopped
ports:
- "6379:6379"
env_file:
- .env
command: /bin/sh -c "redis-server --requirepass ${REDIS_PASSWORD}"
volumes:
- redis_data:/data
weaviate-ui:
image: naaive/weaviate-ui:v1.0.3
ports:
- "7777:7777"
environment:
- WEAVIATE_URL=http://weaviate:8080
- WEAVIATE_API_KEYS=${WEAVIATE_API_KEYS}


volumes:
weaviate_data:
pgvector_data:
pg_data:
redis_data:
21 changes: 12 additions & 9 deletions backend/openagent/agent/function_agent.py
@@ -7,13 +7,16 @@

from openagent.agent.cache import init_cache
from openagent.agent.postgres_history import PostgresChatMessageHistory
from openagent.agent.system_prompt import SYSTEM_PROMPT, ollama_agent_kwargs
from openagent.agent.system_prompt import (
SYSTEM_PROMPT,
ollama_agent_kwargs,
)
from openagent.conf.env import settings
from openagent.experts.account_expert import AccountExpert
from openagent.experts.collection_expert import CollectionExpert
from openagent.experts.article_expert import ArticleExpert
from openagent.experts.feed_expert import FeedExpert
from openagent.experts.google_expert import GoogleExpert
from openagent.experts.hoot_expert import HootExpert
from openagent.experts.nft_expert import NFTExpert
from openagent.experts.price_expert import PriceExpert
from openagent.experts.search_expert import SearchExpert
from openagent.experts.swap_expert import SwapExpert
from openagent.experts.transfer_expert import TransferExpert

@@ -34,13 +34,13 @@ def get_agent(session_id: str) -> AgentExecutor:
)
# load Experts as tools for the agent
experts = [
GoogleExpert(),
SearchExpert(),
FeedExpert(),
CollectionExpert(),
AccountExpert(),
PriceExpert(),
ArticleExpert(),
NFTExpert(),
SwapExpert(),
TransferExpert(),
HootExpert(),
]

if settings.MODEL_NAME.startswith("gpt"):
6 changes: 0 additions & 6 deletions backend/openagent/agent/session_title.py
@@ -45,9 +45,3 @@ async def agen_session_title(user_id: str, session_id: str, history: str) -> lis
).update({ChatSession.title: output})
db_session.commit()
return output


if __name__ == "__main__":
import asyncio

asyncio.run(agen_session_title("123", "456", "what's your name ?"))
6 changes: 0 additions & 6 deletions backend/openagent/agent/suggested_question.py
@@ -55,9 +55,3 @@ async def agen_suggested_questions(user_id: str, history: str) -> list[str]:
lst = json.loads(output)
logger.info(f"suggested questions parsed: {lst}")
return lst


if __name__ == "__main__":
import asyncio

asyncio.run(agen_suggested_questions("123", "eth price?"))
31 changes: 19 additions & 12 deletions backend/openagent/agent/system_prompt.py
@@ -21,39 +21,46 @@
ollama_agent_kwargs = {
"prefix": """
Your designated name is RSS3 OpenAgent, developed by RSS3, \
you have the capability to call upon tools to aid in answering questions.
you have the capability to call upon tools to aid in answering questions about web3.
Assistants may prompt the user to employ specific tools to gather information that might be helpful in addressing the user's initial question.
Here are tools' schemas:
""",
"format_instructions": r"""
When responding, you must exclusively use one of the following two formats:
**Option 1:**
If you're suggesting that the user utilizes a tool, format your response as a markdown code snippet according to this schema:
```json
{{{{
"action": string, // The action to be taken. Must be one of {tool_names}
"action_input": object // The parameters for the action. MUST be JSON object
"action_input": dict // The parameters for the action. MUST be a dict object
}}}}
```
e.g.
```json
{{{{
"action": "search",
"action_input": {{{{
"query": "price of ETH",
"search_type": "google",
}}}}
}}}}
```
**Option #2:**
If you're providing a direct response to the user, format your response as a markdown code snippet following this schema:
**Option 2:**
If you have observed the tool's results, or you're providing a direct final response to the user, format your response as a markdown code snippet following this schema:
```json
{{{{
"action": "Final Answer", // MUST be literal string "Final Answer", other forms are not acceptable
"action_input": string // This should contain your response to the user, in human-readable language
}}}}
```
"action\_input" is illegal, never escape it with a backslash.
""",
"suffix": """
REMEMBER to respond with a markdown code snippet of a json \
blob with a single action, and NOTHING else""",
YOU MUST FOLLOW THESE INSTRUCTIONS CAREFULLY.
1. To respond to the user's message, you can use only one tool at a time.
2. When using a tool, only respond with the tool call. Nothing else. Do not add any additional notes, explanations or white space. Never escape with a backslash.
3. REMEMBER to respond with a markdown code snippet of a json blob with a single action, and nothing else.
""",
}
20 changes: 9 additions & 11 deletions backend/openagent/conf/env.py
@@ -8,17 +8,15 @@
class Settings(BaseSettings):
MODEL_NAME: str = Field(default="llava:13b", env="MODEL_NAME")
LLM_API_BASE: str = Field(..., env="LLM_API_BASE")
RSS3_AI_API_BASE: str = Field(..., env="RSS3_AI_API_BASE")
EXECUTOR_API: str = Field(..., env="EXECUTOR_API")
POSTGRES_SERVER: str = Field(..., env="POSTGRES_SERVER")
POSTGRES_USER: str = Field(..., env="POSTGRES_USER")
POSTGRES_PASSWORD: str = Field(..., env="POSTGRES_PASSWORD")
POSTGRES_DB: str = Field(..., env="POSTGRES_DB")
POSTGRES_CONNECTION_STRING: str = ""

def postgres_connection_string(self):
return f"postgresql://{self.POSTGRES_USER}:{self.POSTGRES_PASSWORD}\
@{self.POSTGRES_SERVER}/{self.POSTGRES_DB}"
NFTSCAN_API_KEY: str = Field(..., env="NFTSCAN_API_KEY")
BIZ_DB_CONNECTION: str = Field(..., env="BIZ_DB_CONNECTION")
VEC_DB_CONNECTION: str = Field(..., env="VEC_DB_CONNECTION")
RSS3_DATA_API: str = Field(
default="https://testnet.rss3.io/data", env="RSS3_DATA_API"
)
RSS3_SEARCH_API: str = Field(
default="https://devnet.rss3.io/search", env="RSS3_SEARCH_API"
)


settings = Settings()
9 changes: 6 additions & 3 deletions backend/openagent/db/database.py
@@ -1,12 +1,15 @@
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy_utils import create_database, database_exists

from openagent.conf.env import settings
from openagent.db.models import Base

engine = create_engine(
settings.postgres_connection_string(), connect_args={"options": "-c timezone=utc"}
)
url = settings.BIZ_DB_CONNECTION

if not database_exists(url):
create_database(url)
engine = create_engine(url, connect_args={"options": "-c timezone=utc"})
Base.metadata.create_all(bind=engine) # type: ignore

DBSession = sessionmaker(bind=engine)
11 changes: 0 additions & 11 deletions backend/openagent/experts/__init__.py
Original file line number Diff line number Diff line change
@@ -135,14 +135,3 @@ def handle_ct_token_by_address(addr) -> dict | None:
"chainId": 1,
}
return None


async def main():
token = await get_token_by_address("0x4d2bf3A34a2311dB4b3D20D4719209EDaDBf69b6")
best_token = await select_best_token("ct", "1")
print(best_token)
print(token)


if __name__ == "__main__":
asyncio.run(main())
48 changes: 0 additions & 48 deletions backend/openagent/experts/account_expert.py

This file was deleted.

54 changes: 54 additions & 0 deletions backend/openagent/experts/article_expert.py
@@ -0,0 +1,54 @@
import json
from typing import Optional, Type

from langchain.callbacks.manager import (
AsyncCallbackManagerForToolRun,
CallbackManagerForToolRun,
)
from langchain.tools import BaseTool
from pydantic import BaseModel, Field

from openagent.index.pgvector_store import store


class ARGS(BaseModel):
keyword: str = Field(
description="keyword to search for",
)


class ArticleExpert(BaseTool):
name = "article"
description = (
"A tool for searching web3-related articles. If you lack knowledge about web3, "
"you can use this tool to find relevant articles that can help answer "
"your questions. Provide a keyword or phrase related to the topic "
"you want to search for, and the tool will return a list of "
"relevant article excerpts. "
"The articles are sourced from IQWiki and Mirror."
)
args_schema: Type[ARGS] = ARGS

def _run(
self,
keyword: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
return self.search_articles(keyword)

async def _arun(
self,
keyword: str,
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
) -> str:
return self.search_articles(keyword)

@staticmethod
def search_articles(keyword: str) -> str:
retriever = store.as_retriever(
search_type="similarity_score_threshold",
search_kwargs={"score_threshold": 0.8, "k": 3},
)
res = retriever.get_relevant_documents(keyword)
docs = list(map(lambda x: x.page_content, res))
return json.dumps(docs)
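For illustration, the tool can be exercised directly (outside the agent's JSON action protocol) roughly like this; the keyword is arbitrary:

```python
# Illustrative direct call; inside the agent, ArticleExpert is invoked via the
# JSON "action" format defined in system_prompt.py.
expert = ArticleExpert()
print(expert.run({"keyword": "decentralized social protocols"}))
```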