From 3b34892322d60c9230b096807838115c79387e67 Mon Sep 17 00:00:00 2001 From: devon Date: Mon, 20 Nov 2023 19:35:58 +0800 Subject: [PATCH 1/4] init --- README-zh_CN.md | 130 ++++++++++++++++++++++++++++++++++++++++++++++++ README.md | 3 ++ 2 files changed, 133 insertions(+) create mode 100644 README-zh_CN.md diff --git a/README-zh_CN.md b/README-zh_CN.md new file mode 100644 index 000000000..e0f001214 --- /dev/null +++ b/README-zh_CN.md @@ -0,0 +1,130 @@ +# 🔎 GPT Researcher +[![Official Website](https://img.shields.io/badge/Official%20Website-tavily.com-blue?style=flat&logo=world&logoColor=white)](https://tavily.com) +[![Discord Follow](https://dcbadge.vercel.app/api/server/2pFkc83fRq?style=flat)](https://discord.com/invite/2pFkc83fRq) +[![GitHub Repo stars](https://img.shields.io/github/stars/assafelovic/gpt-researcher?style=social)](https://github.com/assafelovic/gpt-researcher) +[![Twitter Follow](https://img.shields.io/twitter/follow/tavilyai?style=social)](https://twitter.com/tavilyai) + +- [English](README.md) +- [中文](README-zh_CN.md) + + +**GPT Researcher 是一个自主代理,专为各种任务的综合在线研究而设计。** + + +The agent can produce detailed, factual and unbiased research reports, with customization options for focusing on relevant resources, outlines, and lessons. Inspired by the recent [Plan-and-Solve](https://arxiv.org/abs/2305.04091) and [RAG](https://arxiv.org/abs/2005.11401) papers, GPT Researcher addresses issues of speed, determinism and reliability, offering a more stable performance and increased speed through parallelized agent work, as opposed to synchronous operations. + +**Our mission is to empower individuals and organizations with accurate, unbiased, and factual information by leveraging the power of AI.** + +## Why GPT Researcher? + +- To form objective conclusions for manual research tasks can take time, sometimes weeks to find the right resources and information. +- Current LLMs are trained on past and outdated information, with heavy risks of hallucinations, making them almost irrelevant for research tasks. +- Solutions that enable web search (such as ChatGPT + Web Plugin), only consider limited resources and content that in some cases result in superficial conclusions or biased answers. +- Using only a selection of resources can create bias in determining the right conclusions for research questions or tasks. + +## Architecture +The main idea is to run "planner" and "execution" agents, whereas the planner generates questions to research, and the execution agents seek the most related information based on each generated research question. Finally, the planner filters and aggregates all related information and creates a research report.

+The agents leverage both gpt3.5-turbo and gpt-4-turbo (128K context) to complete a research task. We optimize for cost by using each model only when necessary. **The average research task takes around 3 minutes to complete, and costs ~$0.1.**
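As a rough sketch of this planner/executor flow (illustration only; the function names and stubs below are invented for the example and are not the project's actual API), the loop can be pictured as one planning step followed by execution agents running in parallel:

```python
import asyncio

# Illustrative sketch only: these stubs stand in for LLM- and search-backed agents
# and are not part of the gpt-researcher codebase.

async def plan_research_questions(task: str) -> list[str]:
    # The "planner" agent asks an LLM to break the task into research questions.
    return [f"{task}: sub-question {i}" for i in range(1, 4)]

async def execute_research(question: str) -> str:
    # An "execution" agent would search the web, scrape sources and summarize them.
    await asyncio.sleep(0)  # placeholder for network / LLM calls
    return f"summary of findings for '{question}'"

async def write_report(task: str, summaries: list[str]) -> str:
    # The planner filters and aggregates the summaries into a final report.
    return f"# Research report: {task}\n" + "\n".join(f"- {s}" for s in summaries)

async def research(task: str) -> str:
    questions = await plan_research_questions(task)
    # Execution agents run in parallel rather than one after another,
    # which is where the speed gain over synchronous operation comes from.
    summaries = await asyncio.gather(*(execute_research(q) for q in questions))
    return await write_report(task, list(summaries))

if __name__ == "__main__":
    print(asyncio.run(research("impact of open-source LLMs")))
```

In the real agent the stubs would be backed by LLM and web-search calls; the parallel gather step is what replaces the slower, one-question-at-a-time synchronous approach.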
+
+More specifically:
+* Create a domain-specific agent based on the research query or task.
+* Generate a set of research questions that together form an objective opinion on any given task.
+* For each research question, trigger a crawler agent that scrapes online resources for information relevant to the given task.
+* For each scraped resource, summarize it based on the relevant information and keep track of its source.
+* Finally, filter and aggregate all summarized sources and generate a final research report.
+
+## Demo
+https://github.com/assafelovic/gpt-researcher/assets/13554167/a00c89a6-a295-4dd0-b58d-098a31c40fda
+
+## Tutorials
+ - [How it Works](https://docs.tavily.com/blog/building-gpt-researcher)
+ - [How to Install](https://www.loom.com/share/04ebffb6ed2a4520a27c3e3addcdde20?sid=da1848e8-b1f1-42d1-93c3-5b0b9c3b24ea)
+ - [Live Demo](https://www.loom.com/share/6a3385db4e8747a1913dd85a7834846f?sid=a740fd5b-2aa3-457e-8fb7-86976f59f9b8)
+
+## Features
+- 📝 Generate research, outline, resource and lesson reports
+- 🌐 Aggregates over 20 web sources per research task to form objective and factual conclusions
+- 🖥️ Includes an easy-to-use web interface (HTML/CSS/JS)
+- 🔍 Scrapes web sources with JavaScript support
+- 📂 Keeps track of visited and used web sources and their context
+- 📄 Exports research reports to PDF and more...
+
+## 📖 Documentation
+
+Please see [here](https://docs.tavily.com/docs/gpt-researcher/getting-started) for full documentation on:
+
+- Getting started (installation, setting up the environment, simple examples)
+- How-To examples (demos, integrations, docker support)
+- Reference (full API docs)
+- Tavily API integration (high-level explanation of core concepts)
+
+## Quickstart
+> **Step 0** - Install Python 3.11 or later. [See here](https://www.tutorialsteacher.com/python/install-python) for a step-by-step guide.
+
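If you are not sure which interpreter your shell will use, a quick check from Python itself (any equivalent check is fine) is:

```python
import sys

# The quickstart assumes Python 3.11 or later.
assert sys.version_info >= (3, 11), f"Python 3.11+ required, found {sys.version.split()[0]}"
print("Python", sys.version.split()[0], "OK")
```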
+ +> **Step 1** - Download the project + +```bash +$ git clone https://github.com/assafelovic/gpt-researcher.git +$ cd gpt-researcher +``` + +
+ +> **Step 2** - Install dependencies +```bash +$ pip install -r requirements.txt +``` +
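As a quick sanity check that the install succeeded, you can import the web-server packages the next steps rely on (FastAPI and Uvicorn are shown on the assumption that requirements.txt pulls them in; the other dependencies are omitted):

```python
# Minimal post-install check; extend the list with any other packages you care about.
import fastapi
import uvicorn

print("fastapi", fastapi.__version__)
print("uvicorn", uvicorn.__version__)
```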
+
+> **Step 3** - Create a .env file with your OpenAI key and Tavily API key, or simply export them
+
+```bash
+$ export OPENAI_API_KEY={Your OpenAI API Key here}
+```
+```bash
+$ export TAVILY_API_KEY={Your Tavily API Key here}
+```
+
+- **For the LLM, we recommend [OpenAI GPT](https://platform.openai.com/docs/guides/gpt)**, but you can use any other LLM (including open-source models) supported by the [Langchain Adapter](https://python.langchain.com/docs/guides/adapters/openai); simply change the LLM model and provider in config/config.py. Follow [this guide](https://python.langchain.com/docs/integrations/llms/) to learn how to integrate LLMs with Langchain.
+- **For the search engine, we recommend [Tavily Search API](https://app.tavily.com) (optimized for LLMs)**, but you can also switch to another search engine of your choice by changing the search provider in config/config.py to `"duckduckgo"`, `"googleAPI"`, `"googleSerp"`, or `"searx"`. Then add the corresponding env API key as seen in the config.py file.
+- **We highly recommend using [OpenAI GPT](https://platform.openai.com/docs/guides/gpt) models and [Tavily Search API](https://app.tavily.com) for optimal performance.**
+
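Before starting the server, it can save a debugging round trip to confirm that both keys are actually visible to the Python process. A minimal check (just an illustration, not part of the project) might look like:

```python
import os

# The quickstart expects these two variables, whether exported or loaded from .env.
for key in ("OPENAI_API_KEY", "TAVILY_API_KEY"):
    if not os.getenv(key):
        raise SystemExit(f"{key} is not set - export it or add it to your .env file")
print("Both API keys are present in the environment.")
```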
+ +> **Step 4** - Run the agent with FastAPI + +```bash +$ uvicorn main:app --reload +``` +
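If you prefer launching the server from a script, for example to pin the host or port explicitly, Uvicorn's programmatic entry point does the same thing as the command above:

```python
import uvicorn

if __name__ == "__main__":
    # Same as `uvicorn main:app --reload`; reload=True requires the import-string form.
    uvicorn.run("main:app", host="127.0.0.1", port=8000, reload=True)
```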
+
+> **Step 5** - Go to http://localhost:8000 in any browser and enjoy researching!
+
+To learn how to get started with Docker or to learn more about the features and services, check out the [documentation](https://docs.tavily.com) page.
+
+## 🚀 Contributing
+We highly welcome contributions! Please check out [contributing](CONTRIBUTING.md) if you're interested.
+
+Please check out our [roadmap](https://trello.com/b/3O7KBePw/gpt-researcher-roadmap) page and reach out to us via our [Discord community](https://discord.gg/2pFkc83fRq) if you're interested in joining our mission.
+
+## 🛡 Disclaimer
+
+This project, GPT Researcher, is an experimental application and is provided "as-is" without any warranty, express or implied. We are sharing code for academic purposes under the MIT license. Nothing herein is academic advice, and it is NOT a recommendation for use in academic or research papers.
+
+Our view on unbiased research claims:
+1. The whole point of our scraping system is to reduce incorrect facts. How? The more sites we scrape, the lower the chance of incorrect data. We scrape 20 sources per research task, so the chance that they are all wrong is extremely low.
+2. We do not aim to eliminate biases; we aim to reduce them as much as possible. **We are here as a community to figure out the most effective human/LLM interactions.**
+3. In research, people also tend towards bias, as most already have opinions on the topics they research. This tool scrapes many opinions and will evenly explain diverse views that a biased person would never have read.
+
+**Please note that the use of the GPT-4 language model can be expensive due to its token usage.** By utilizing this project, you acknowledge that you are responsible for monitoring and managing your own token usage and the associated costs. It is highly recommended to check your OpenAI API usage regularly and set up any necessary limits or alerts to prevent unexpected charges.
+
+## ✉️ Support / Contact us
+- [Community Discord](https://discord.gg/spBgZmm3Xe)
+- Our email: support@tavily.com
diff --git a/README.md b/README.md
index 031d0238e..11de7af93 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,9 @@
 [![GitHub Repo stars](https://img.shields.io/github/stars/assafelovic/gpt-researcher?style=social)](https://github.com/assafelovic/gpt-researcher)
 [![Twitter Follow](https://img.shields.io/twitter/follow/tavilyai?style=social)](https://twitter.com/tavilyai)
 
+- [English](README.md)
+- [中文](README-zh_CN.md)
+
 **GPT Researcher is an autonomous agent designed for comprehensive online research on a variety of tasks.**
 
 The agent can produce detailed, factual and unbiased research reports, with customization options for focusing on relevant resources, outlines, and lessons. Inspired by the recent [Plan-and-Solve](https://arxiv.org/abs/2305.04091) and [RAG](https://arxiv.org/abs/2005.11401) papers, GPT Researcher addresses issues of speed, determinism and reliability, offering a more stable performance and increased speed through parallelized agent work, as opposed to synchronous operations.
From 4416734f36e34f091ddfe9dc83d071afb0b1c827 Mon Sep 17 00:00:00 2001 From: devon Date: Mon, 20 Nov 2023 20:50:49 +0800 Subject: [PATCH 2/4] first version --- README-zh_CN.md | 121 ++++++++++++++++++++++++------------------------ 1 file changed, 60 insertions(+), 61 deletions(-) diff --git a/README-zh_CN.md b/README-zh_CN.md index e0f001214..45eda304a 100644 --- a/README-zh_CN.md +++ b/README-zh_CN.md @@ -10,65 +10,66 @@ **GPT Researcher 是一个自主代理,专为各种任务的综合在线研究而设计。** +Agent可生成详细、真实和公正的研究报告,并可定制相关资源、大纲和经验教训的重点选项。受最近发表的[Plan-and-Solve](https://arxiv.org/abs/2305.04091) 和[RAG](https://arxiv.org/abs/2005.11401) 论文的启发,GPT Researcher 解决了速度、确定性和可靠性等问题,通过并行化的代理工作,而不是同步操作,提供了更稳定的性能和更高的速度。 -The agent can produce detailed, factual and unbiased research reports, with customization options for focusing on relevant resources, outlines, and lessons. Inspired by the recent [Plan-and-Solve](https://arxiv.org/abs/2305.04091) and [RAG](https://arxiv.org/abs/2005.11401) papers, GPT Researcher addresses issues of speed, determinism and reliability, offering a more stable performance and increased speed through parallelized agent work, as opposed to synchronous operations. +**我们的使命是利用人工智能的力量,为个人和组织提供准确、公正和事实的信息。** -**Our mission is to empower individuals and organizations with accurate, unbiased, and factual information by leveraging the power of AI.** +## 为什么是GPT研究员? -## Why GPT Researcher? +- 要为人工研究任务形成客观结论可能需要时间,有时甚至需要数周才能找到正确的资源和信息。 +- 目前的 LLM 是根据过去和过时的信息进行培训的,存在严重的幻觉风险,因此几乎无法胜任研究任务。 +- 支持网络搜索的解决方案(例如 ChatGPT + Web 插件)仅考虑有限的资源和内容,在某些情况下会导致肤浅的结论或有偏见的答案。 +- 只使用部分资源可能会在确定研究问题或任务的正确结论时产生偏差。 -- To form objective conclusions for manual research tasks can take time, sometimes weeks to find the right resources and information. -- Current LLMs are trained on past and outdated information, with heavy risks of hallucinations, making them almost irrelevant for research tasks. -- Solutions that enable web search (such as ChatGPT + Web Plugin), only consider limited resources and content that in some cases result in superficial conclusions or biased answers. -- Using only a selection of resources can create bias in determining the right conclusions for research questions or tasks. - -## Architecture -The main idea is to run "planner" and "execution" agents, whereas the planner generates questions to research, and the execution agents seek the most related information based on each generated research question. Finally, the planner filters and aggregates all related information and creates a research report.

-The agents leverage both gpt3.5-turbo and gpt-4-turbo (128K context) to complete a research task. We optimize for costs using each only when necessary. **The average research task takes around 3 minutes to complete, and costs ~$0.1.** +## 架构 +主要思想是运行“计划者”和“执行”代理,而计划者生成问题进行研究,执行代理根据每个生成的研究问题寻找最相关的信息。最后,计划者过滤和聚合所有相关信息并创建研究报告。

+代理同时利用 gpt3.5-turbo 和 gpt-4-turbo(128K 上下文)来完成一项研究任务。我们仅在必要时使用这两种方法对成本进行优化。**研究任务平均耗时约 3 分钟,成本约为 0.1 美元**。
-More specifically: -* Create a domain specific agent based on research query or task. -* Generate a set of research questions that together form an objective opinion on any given task. -* For each research question, trigger a crawler agent that scrapes online resources for information relevant to the given task. -* For each scraped resources, summarize based on relevant information and keep track of its sources. -* Finally, filter and aggregate all summarized sources and generate a final research report. +更具体地说: +* 根据研究查询或任务创建特定领域的代理。 +* 生成一组研究问题,这些问题共同形成对任何给定任务的客观意见。 +* 针对每个研究问题,触发一个爬虫代理,从在线资源中搜索与给定任务相关的信息。 +* 对于每一个抓取的资源,根据相关信息进行汇总,并跟踪其来源。 +* 最后,对所有汇总的资料来源进行过滤和汇总,并生成最终研究报告。 + + -## Demo +## 演示 https://github.com/assafelovic/gpt-researcher/assets/13554167/a00c89a6-a295-4dd0-b58d-098a31c40fda -## Tutorials - - [How it Works](https://docs.tavily.com/blog/building-gpt-researcher) - - [How to Install](https://www.loom.com/share/04ebffb6ed2a4520a27c3e3addcdde20?sid=da1848e8-b1f1-42d1-93c3-5b0b9c3b24ea) - - [Live Demo](https://www.loom.com/share/6a3385db4e8747a1913dd85a7834846f?sid=a740fd5b-2aa3-457e-8fb7-86976f59f9b8) +## 教程 + - [运行原理](https://docs.tavily.com/blog/building-gpt-researcher) + - [如何安装](https://www.loom.com/share/04ebffb6ed2a4520a27c3e3addcdde20?sid=da1848e8-b1f1-42d1-93c3-5b0b9c3b24ea) + - [现场演示](https://www.loom.com/share/6a3385db4e8747a1913dd85a7834846f?sid=a740fd5b-2aa3-457e-8fb7-86976f59f9b8) -## Features -- 📝 Generate research, outlines, resources and lessons reports -- 🌐 Aggregates over 20 web sources per research to form objective and factual conclusions -- 🖥️ Includes an easy-to-use web interface (HTML/CSS/JS) -- 🔍 Scrapes web sources with javascript support -- 📂 Keeps track and context of visited and used web sources -- 📄 Export research reports to PDF and more... +## 特性 +- 📝 生成研究、大纲、资源和经验教训报告 +- 🌐 每项研究汇总超过20个网络资源,形成客观和事实的结论 +- 🖥️ 包括一个易于使用的web界面 (HTML/CSS/JS) +- 🔍 支持 JavaScript 的网络资源抓取功能 +- 📂 跟踪访问过和使用过的网络资源,了解其来龙去脉 +- 📄 将研究报告导出为 PDF 格式等... -## 📖 Documentation +## 📖 文档 -Please see [here](https://docs.tavily.com/docs/gpt-researcher/getting-started) for full documentation on: +请参阅 [此处](https://docs.tavily.com/docs/gpt-researcher/getting-started),了解有关的完整文档: -- Getting started (installation, setting up the environment, simple examples) -- How-To examples (demos, integrations, docker support) -- Reference (full API docs) -- Tavily API integration (high-level explanation of core concepts) +- 入门(安装、设置环境、简单示例) +- 操作示例(演示、集成、docker 支持) +- 参考资料(API完整文档) +- Tavily 应用程序接口集成(核心概念的高级解释) -## Quickstart -> **Step 0** - Install Python 3.11 or later. [See here](https://www.tutorialsteacher.com/python/install-python) for a step-by-step guide. +## 快速开始 +> **步骤 0** - 安装 Python 3.11 或更高版本。[参见此处](https://www.tutorialsteacher.com/python/install-python) 获取详细指南。
-> **Step 1** - Download the project +> **步骤 1** - 下载项目 ```bash $ git clone https://github.com/assafelovic/gpt-researcher.git @@ -77,13 +78,13 @@ $ cd gpt-researcher
-> **Step 2** - Install dependencies +> **步骤2** -安装依赖项 ```bash $ pip install -r requirements.txt ```
-> **Step 3** - Create .env file with your OpenAI Key and Tavily API key or simply export it +> **第 3 步** - 使用 OpenAI 密钥和 Tavily API 密钥创建 .env 文件,或直接导出该文件 ```bash $ export OPENAI_API_KEY={Your OpenAI API Key here} @@ -92,39 +93,37 @@ $ export OPENAI_API_KEY={Your OpenAI API Key here} $ export TAVILY_API_KEY={Your Tavily API Key here} ``` -- **For LLM, we recommend [OpenAI GPT](https://platform.openai.com/docs/guides/gpt)**, but you can use any other LLM model (including open sources) supported by [Langchain Adapter](https://python.langchain.com/docs/guides/adapters/openai), simply change the llm model and provider in config/config.py. Follow [this guide](https://python.langchain.com/docs/integrations/llms/) to learn how to integrate LLMs with Langchain. -- **For search engine, we recommend [Tavily Search API](https://app.tavily.com) (optimized for LLMs)**, but you can also refer to other search engines of your choice by changing the search provider in config/config.py to `"duckduckgo"`, `"googleAPI"`, `"googleSerp"`, or `"searx"`. Then add the corresponding env API key as seen in the config.py file. -- **We highly recommend using [OpenAI GPT](https://platform.openai.com/docs/guides/gpt) models and [Tavily Search API](https://app.tavily.com) for optimal performance.** - +- **LLM,我们推荐使用 [OpenAI GPT](https://platform.openai.com/docs/guides/gpt)**,但您也可以使用 [Langchain Adapter](https://python.langchain.com/docs/guides/adapters/openai) 支持的任何其他 LLM 模型(包括开源),只需在 config/config.py 中更改 llm 模型和提供者即可。请按照 [this guide](https://python.langchain.com/docs/integrations/llms/) 学习如何将 LLM 与 Langchain 集成。 +- **对于搜索引擎,我们推荐使用 [Tavily Search API](https://app.tavily.com)(已针对 LLM 进行优化)**,但您也可以选择其他搜索引擎,只需将 config/config.py 中的搜索提供程序更改为 "duckduckgo"、"googleAPI"、"googleSerp "或 "searx "即可。然后在 config.py 文件中添加相应的 env API 密钥。 +- **我们强烈建议使用 [OpenAI GPT](https://platform.openai.com/docs/guides/gpt) 模型和 [Tavily Search API](https://app.tavily.com) 以获得最佳性能。**
-> **Step 4** - Run the agent with FastAPI +> **第 4 步** - 使用 FastAPI 运行代理 ```bash $ uvicorn main:app --reload ```
-> **Step 5** - Go to http://localhost:8000 on any browser and enjoy researching! - -To learn how to get started with Docker or to learn more about the features and services check out the [documentation](https://docs.tavily.com) page. +> **第 5 步** - 在任何浏览器上访问 http://localhost:8000,享受研究乐趣! -## 🚀 Contributing -We highly welcome contributions! Please check out [contributing](CONTRIBUTING.md) if you're interested. +要了解如何开始使用 Docker 或了解有关功能和服务的更多信息,请访问 [documentation](https://docs.tavily.com) 页面。 -Please check out our [roadmap](https://trello.com/b/3O7KBePw/gpt-researcher-roadmap) page and reach out to us via our [Discord community](https://discord.gg/2pFkc83fRq) if you're interested in joining our mission. +## 🚀 贡献 +我们非常欢迎您的贡献!如果您感兴趣,请查看 [contributing](CONTRIBUTING.md)。 -## 🛡 Disclaimer +如果您有兴趣加入我们的任务,请查看我们的 [路线图](https://trello.com/b/3O7KBePw/gpt-researcher-roadmap) 页面,并通过我们的 [Discord 社区](https://discord.gg/2pFkc83fRq) 联系我们。 -This project, GPT Researcher, is an experimental application and is provided "as-is" without any warranty, express or implied. We are sharing codes for academic purposes under the MIT license. Nothing herein is academic advice, and NOT a recommendation to use in academic or research papers. +## 🛡 免责声明 -Our view on unbiased research claims: -1. The whole point of our scraping system is to reduce incorrect fact. How? The more sites we scrape the less chances of incorrect data. We are scraping 20 per research, the chances that they are all wrong is extremely low. -2. We do not aim to eliminate biases; we aim to reduce it as much as possible. **We are here as a community to figure out the most effective human/llm interactions.** -3. In research, people also tend towards biases as most have already opinions on the topics they research about. This tool scrapes many opinions and will evenly explain diverse views that a biased person would never have read. +本项目 "GPT Researcher "是一个实验性应用程序,按 "现状 "提供,不做任何明示或暗示的保证。我们根据 MIT 许可分享用于学术目的的代码。本文不提供任何学术建议,也不建议在学术或研究论文中使用。 -**Please note that the use of the GPT-4 language model can be expensive due to its token usage.** By utilizing this project, you acknowledge that you are responsible for monitoring and managing your own token usage and the associated costs. It is highly recommended to check your OpenAI API usage regularly and set up any necessary limits or alerts to prevent unexpected charges. 
+我们对无偏见研究主张的看法: +1.我们抓取系统的全部目的是减少不正确的事实。如何解决?我们抓取的网站越多,错误数据的可能性就越小。我们每项研究都会收集20条信息,它们全部错误的可能性极低。 +2.我们的目标不是消除偏见,而是尽可能减少偏见。**作为一个社区,我们在这里探索最有效的人机互动**。 +3.在研究过程中,人们也容易产生偏见,因为大多数人对自己研究的课题都有自己的看法。这个工具可以搜罗到许多观点,并均匀地解释各种不同的观点,而有偏见的人是绝对读不到这些观点的。 -## ✉️ Support / Contact us -- [Community Discord](https://discord.gg/spBgZmm3Xe) -- Our email: support@tavily.com +**请注意,使用 GPT-4 语言模型可能会因使用令牌而产生高昂费用**。使用本项目即表示您承认有责任监控和管理自己的令牌使用情况及相关费用。强烈建议您定期检查 OpenAI API 的使用情况,并设置任何必要的限制或警报,以防止发生意外费用。 +## ✉️ 支持 / 联系我们 +- [社区讨论区](https://discord.gg/spBgZmm3Xe) +- 我们的邮箱: support@tavily.com From bd4c267c69ddbacaf061531fa7a87e2b529a506d Mon Sep 17 00:00:00 2001 From: devon Date: Mon, 20 Nov 2023 21:45:11 +0800 Subject: [PATCH 3/4] optimize version --- README-zh_CN.md | 44 +++++++++++++++++++++----------------------- 1 file changed, 21 insertions(+), 23 deletions(-) diff --git a/README-zh_CN.md b/README-zh_CN.md index 45eda304a..bbe6e24ec 100644 --- a/README-zh_CN.md +++ b/README-zh_CN.md @@ -8,37 +8,35 @@ - [中文](README-zh_CN.md) -**GPT Researcher 是一个自主代理,专为各种任务的综合在线研究而设计。** +**GPT Researcher 是一个智能体代理,专为各种任务的综合在线研究而设计。** -Agent可生成详细、真实和公正的研究报告,并可定制相关资源、大纲和经验教训的重点选项。受最近发表的[Plan-and-Solve](https://arxiv.org/abs/2305.04091) 和[RAG](https://arxiv.org/abs/2005.11401) 论文的启发,GPT Researcher 解决了速度、确定性和可靠性等问题,通过并行化的代理工作,而不是同步操作,提供了更稳定的性能和更高的速度。 +代理可以生成详细、正式且客观的研究报告,并提供自定义选项,专注于相关资源、结构框架和经验报告。受最近发表的[Plan-and-Solve](https://arxiv.org/abs/2305.04091) 和[RAG](https://arxiv.org/abs/2005.11401) 论文的启发,GPT Researcher 解决了速度、确定性和可靠性等问题,通过并行化的代理运行,而不是同步操作,提供了更稳定的性能和更高的速度。 -**我们的使命是利用人工智能的力量,为个人和组织提供准确、公正和事实的信息。** +**我们的使命是利用人工智能的力量,为个人和组织提供准确、客观和事实的信息。** -## 为什么是GPT研究员? +## 为什么选择GPT Researcher? -- 要为人工研究任务形成客观结论可能需要时间,有时甚至需要数周才能找到正确的资源和信息。 -- 目前的 LLM 是根据过去和过时的信息进行培训的,存在严重的幻觉风险,因此几乎无法胜任研究任务。 -- 支持网络搜索的解决方案(例如 ChatGPT + Web 插件)仅考虑有限的资源和内容,在某些情况下会导致肤浅的结论或有偏见的答案。 +- 因为人工研究任务形成客观结论可能需要时间和经历,有时甚至需要数周才能找到正确的资源和信息。 +- 目前的LLM是根据历史和过时的信息进行训练的,存在严重的幻觉风险,因此几乎无法胜任研究任务。 +- 网络搜索的解决方案(例如 ChatGPT + Web 插件)仅考虑有限的资源和内容,在某些情况下会导致肤浅的结论或不客观的答案。 - 只使用部分资源可能会在确定研究问题或任务的正确结论时产生偏差。 ## 架构 -主要思想是运行“计划者”和“执行”代理,而计划者生成问题进行研究,执行代理根据每个生成的研究问题寻找最相关的信息。最后,计划者过滤和聚合所有相关信息并创建研究报告。

-代理同时利用 gpt3.5-turbo 和 gpt-4-turbo(128K 上下文)来完成一项研究任务。我们仅在必要时使用这两种方法对成本进行优化。**研究任务平均耗时约 3 分钟,成本约为 0.1 美元**。 +主要思想是运行“**计划者**”和“**执行**”代理,而**计划者**生成问题进行研究,“**执行**”代理根据每个生成的研究问题寻找最相关的信息。最后,“**计划者**”过滤和聚合所有相关信息并创建研究报告。

+代理同时利用 gpt3.5-turbo 和 gpt-4-turbo(128K 上下文)来完成一项研究任务。我们仅在必要时使用这两种方法对成本进行优化。**研究任务平均耗时约 3 分钟,成本约为 ~0.1 美元**。
-更具体地说: -* 根据研究查询或任务创建特定领域的代理。 -* 生成一组研究问题,这些问题共同形成对任何给定任务的客观意见。 +详细说明: +* 根据研究搜索或任务创建特定领域的代理。 +* 生成一组研究问题,这些问题共同形成答案对任何给定任务的客观意见。 * 针对每个研究问题,触发一个爬虫代理,从在线资源中搜索与给定任务相关的信息。 * 对于每一个抓取的资源,根据相关信息进行汇总,并跟踪其来源。 * 最后,对所有汇总的资料来源进行过滤和汇总,并生成最终研究报告。 - - ## 演示 https://github.com/assafelovic/gpt-researcher/assets/13554167/a00c89a6-a295-4dd0-b58d-098a31c40fda @@ -48,16 +46,16 @@ https://github.com/assafelovic/gpt-researcher/assets/13554167/a00c89a6-a295-4dd0 - [现场演示](https://www.loom.com/share/6a3385db4e8747a1913dd85a7834846f?sid=a740fd5b-2aa3-457e-8fb7-86976f59f9b8) ## 特性 -- 📝 生成研究、大纲、资源和经验教训报告 -- 🌐 每项研究汇总超过20个网络资源,形成客观和事实的结论 -- 🖥️ 包括一个易于使用的web界面 (HTML/CSS/JS) -- 🔍 支持 JavaScript 的网络资源抓取功能 -- 📂 跟踪访问过和使用过的网络资源,了解其来龙去脉 -- 📄 将研究报告导出为 PDF 格式等... +- 📝 生成研究问题、大纲、资源和课题报告 +- 🌐 每项研究汇总超过20个网络资源,形成客观和真实的结论 +- 🖥️ 包括易于使用的web界面 (HTML/CSS/JS) +- 🔍 支持JavaScript网络资源抓取功能 +- 📂 追踪访问过和使用过的网络资源和来源 +- 📄 将研究报告导出为PDF或其他格式... ## 📖 文档 -请参阅 [此处](https://docs.tavily.com/docs/gpt-researcher/getting-started),了解有关的完整文档: +请参阅[此处](https://docs.tavily.com/docs/gpt-researcher/getting-started),了解完整文档: - 入门(安装、设置环境、简单示例) - 操作示例(演示、集成、docker 支持) @@ -93,7 +91,7 @@ $ export OPENAI_API_KEY={Your OpenAI API Key here} $ export TAVILY_API_KEY={Your Tavily API Key here} ``` -- **LLM,我们推荐使用 [OpenAI GPT](https://platform.openai.com/docs/guides/gpt)**,但您也可以使用 [Langchain Adapter](https://python.langchain.com/docs/guides/adapters/openai) 支持的任何其他 LLM 模型(包括开源),只需在 config/config.py 中更改 llm 模型和提供者即可。请按照 [this guide](https://python.langchain.com/docs/integrations/llms/) 学习如何将 LLM 与 Langchain 集成。 +- **LLM,我们推荐使用 [OpenAI GPT](https://platform.openai.com/docs/guides/gpt)**,但您也可以使用 [Langchain Adapter](https://python.langchain.com/docs/guides/adapters/openai) 支持的任何其他 LLM 模型(包括开源),只需在 config/config.py 中更改 llm 模型和提供者即可。请按照 [这份指南](https://python.langchain.com/docs/integrations/llms/) 学习如何将 LLM 与 Langchain 集成。 - **对于搜索引擎,我们推荐使用 [Tavily Search API](https://app.tavily.com)(已针对 LLM 进行优化)**,但您也可以选择其他搜索引擎,只需将 config/config.py 中的搜索提供程序更改为 "duckduckgo"、"googleAPI"、"googleSerp "或 "searx "即可。然后在 config.py 文件中添加相应的 env API 密钥。 - **我们强烈建议使用 [OpenAI GPT](https://platform.openai.com/docs/guides/gpt) 模型和 [Tavily Search API](https://app.tavily.com) 以获得最佳性能。**
@@ -118,7 +116,7 @@ $ uvicorn main:app --reload 本项目 "GPT Researcher "是一个实验性应用程序,按 "现状 "提供,不做任何明示或暗示的保证。我们根据 MIT 许可分享用于学术目的的代码。本文不提供任何学术建议,也不建议在学术或研究论文中使用。 -我们对无偏见研究主张的看法: +我们对客观研究主张的看法: 1.我们抓取系统的全部目的是减少不正确的事实。如何解决?我们抓取的网站越多,错误数据的可能性就越小。我们每项研究都会收集20条信息,它们全部错误的可能性极低。 2.我们的目标不是消除偏见,而是尽可能减少偏见。**作为一个社区,我们在这里探索最有效的人机互动**。 3.在研究过程中,人们也容易产生偏见,因为大多数人对自己研究的课题都有自己的看法。这个工具可以搜罗到许多观点,并均匀地解释各种不同的观点,而有偏见的人是绝对读不到这些观点的。 From 81e46495ddee26450bfc297c585351f2a831efa2 Mon Sep 17 00:00:00 2001 From: devon Date: Mon, 20 Nov 2023 22:07:11 +0800 Subject: [PATCH 4/4] optimize format --- README-zh_CN.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README-zh_CN.md b/README-zh_CN.md index bbe6e24ec..bba7cd548 100644 --- a/README-zh_CN.md +++ b/README-zh_CN.md @@ -117,9 +117,9 @@ $ uvicorn main:app --reload 本项目 "GPT Researcher "是一个实验性应用程序,按 "现状 "提供,不做任何明示或暗示的保证。我们根据 MIT 许可分享用于学术目的的代码。本文不提供任何学术建议,也不建议在学术或研究论文中使用。 我们对客观研究主张的看法: -1.我们抓取系统的全部目的是减少不正确的事实。如何解决?我们抓取的网站越多,错误数据的可能性就越小。我们每项研究都会收集20条信息,它们全部错误的可能性极低。 -2.我们的目标不是消除偏见,而是尽可能减少偏见。**作为一个社区,我们在这里探索最有效的人机互动**。 -3.在研究过程中,人们也容易产生偏见,因为大多数人对自己研究的课题都有自己的看法。这个工具可以搜罗到许多观点,并均匀地解释各种不同的观点,而有偏见的人是绝对读不到这些观点的。 +1. 我们抓取系统的全部目的是减少不正确的事实。如何解决?我们抓取的网站越多,错误数据的可能性就越小。我们每项研究都会收集20条信息,它们全部错误的可能性极低。 +2. 我们的目标不是消除偏见,而是尽可能减少偏见。**作为一个社区,我们在这里探索最有效的人机互动**。 +3. 在研究过程中,人们也容易产生偏见,因为大多数人对自己研究的课题都有自己的看法。这个工具可以搜罗到许多观点,并均匀地解释各种不同的观点,而有偏见的人是绝对读不到这些观点的。 **请注意,使用 GPT-4 语言模型可能会因使用令牌而产生高昂费用**。使用本项目即表示您承认有责任监控和管理自己的令牌使用情况及相关费用。强烈建议您定期检查 OpenAI API 的使用情况,并设置任何必要的限制或警报,以防止发生意外费用。 ## ✉️ 支持 / 联系我们