Feat: rate limit support #59

Merged: 6 commits, Aug 16, 2023
32 changes: 25 additions & 7 deletions .env
@@ -1,22 +1,40 @@
# See .env.example for examples
# See .env.example for examples and explanations

# LOG_CHAT: whether to enable logging
# `LOG_CHAT`: whether to log chats
LOG_CHAT=false

# OPENAI_BASE_URL: forwarding address for any OpenAI-style service; multiple addresses may be given, separated by commas.
# `OPENAI_BASE_URL`: forwarding address for any OpenAI-style service; multiple addresses may be given, separated by commas.
# If more than one is given, then no OPENAI_ROUTE_PREFIX/EXTRA_ROUTE_PREFIX may be the root route /
OPENAI_BASE_URL=https://api.openai.com

# OPENAI_ROUTE_PREFIX: route prefixes for forwarding all OpenAI-style (log-recording) services
OPENAI_ROUTE_PREFIX=/
# `OPENAI_ROUTE_PREFIX`: route prefixes for forwarding all OpenAI-style (log-recording) services
OPENAI_ROUTE_PREFIX=

OPENAI_API_KEY=
FORWARD_KEY=

# EXTRA_BASE_URL: forward any other service
# `EXTRA_BASE_URL`: forward any other service
EXTRA_BASE_URL=
# EXTRA_ROUTE_PREFIX: route prefixes matching EXTRA_BASE_URL
# `EXTRA_ROUTE_PREFIX`: route prefixes matching EXTRA_BASE_URL
EXTRA_ROUTE_PREFIX=

# `RATE_LIMIT`: request rate limit (i.e. RPM) for the given routes
# format: {route: ratelimit-string}
# ratelimit-string format: [count] [per|/] [n (optional)] [second|minute|hour|day|month|year]; see rate-limit string notation: https://limits.readthedocs.io/en/stable/quickstart.html#rate-limit-string-notation
RATE_LIMIT='{
"/healthz": "1000/2minutes",
"/v1/chat/completions": "300/minute"
}'

# `RATE_LIMIT_STRATEGY` options: (fixed-window, fixed-window-elastic-expiry, moving-window); ref: https://limits.readthedocs.io/en/latest/strategies.html
# `fixed-window`: the most memory-efficient strategy; `moving-window`: the most effective at preventing bursts, but with higher memory cost.
RATE_LIMIT_STRATEGY=moving-window

# `GLOBAL_RATE_LIMIT`: applies to all routes not listed in `RATE_LIMIT`. Defaults to no limit if left empty
GLOBAL_RATE_LIMIT=

# TPM: rate limit on returned tokens
TOKEN_RATE_LIMIT=50/second

# Set the time zone
TZ=Asia/Shanghai
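A side note on the notation used by `RATE_LIMIT` above: the strings follow the `limits` rate-limit string format linked in the comment. The sketch below is a hypothetical hand-rolled decoder (the real parsing is done by the `limits` package), included only to make the `[count] [per|/] [n] [unit]` grammar concrete:

```python
import re

# Hypothetical decoder for rate-limit strings such as "300/minute" or
# "1000/2minutes" -> (count, window length in seconds). Illustrative only;
# the service itself relies on the `limits` package for parsing.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600,
                "day": 86400, "month": 2592000, "year": 31536000}

def parse_rate_limit(spec: str) -> tuple:
    m = re.fullmatch(
        r"\s*(\d+)\s*(?:per|/)\s*(\d*)\s*"
        r"(second|minute|hour|day|month|year)s?\s*",
        spec,
    )
    if m is None:
        raise ValueError(f"bad rate-limit string: {spec!r}")
    count, n, unit = m.groups()
    return int(count), int(n or 1) * UNIT_SECONDS[unit]

print(parse_rate_limit("300/minute"))     # (300, 60)
print(parse_rate_limit("1000/2minutes"))  # (1000, 120)
```

So `"/v1/chat/completions": "300/minute"` means at most 300 requests per one-minute window, counted according to `RATE_LIMIT_STRATEGY`.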
32 changes: 24 additions & 8 deletions .env.example
@@ -1,29 +1,45 @@
# LOG_CHAT: whether to enable logging
LOG_CHAT=true


# OPENAI_BASE_URL: forwarding address for any OpenAI-style service; multiple addresses may be given, separated by commas.
# If more than one is given, then no OPENAI_ROUTE_PREFIX/EXTRA_ROUTE_PREFIX may be the root route /
OPENAI_BASE_URL=https://api.openai.com, http://localhost:8080
OPENAI_BASE_URL='https://api.openai.com, http://localhost:8080'

# OPENAI_ROUTE_PREFIX: route prefixes for forwarding all OpenAI-style (log-recording) services
OPENAI_ROUTE_PREFIX=/openai, /localai
OPENAI_ROUTE_PREFIX='/openai, /localai'

# OPENAI_API_KEY: multiple api keys may be given, separated by commas, forming a round-robin pool
OPENAI_API_KEY=sk-xxx1,sk-xxx2,sk-xxx3
OPENAI_API_KEY='sk-xxx1, sk-xxx2, sk-xxx3'

# FORWARD_KEY: once OPENAI_API_KEY above is set, FORWARD_KEY can be set here; clients may then pass FORWARD_KEY as their api key
FORWARD_KEY=fk-xxx1


# EXTRA_BASE_URL: forward any other service
EXTRA_BASE_URL=http://localhost:8882, http://localhost:8881
EXTRA_BASE_URL='http://localhost:8882, http://localhost:8881'

# EXTRA_ROUTE_PREFIX: route prefixes matching EXTRA_BASE_URL
EXTRA_ROUTE_PREFIX=/tts,/translate
EXTRA_ROUTE_PREFIX='/tts, /translate'

# RATE LIMIT: request rate limit for the given routes (not per-client)
# format: {route: ratelimit-string}
# ratelimit-string format: [count] [per|/] [n (optional)] [second|minute|hour|day|month|year]; see rate-limit string notation: https://limits.readthedocs.io/en/stable/quickstart.html#rate-limit-string-notation
RATE_LIMIT='{
"/healthz": "50/3minutes",
"/openai/v1/chat/completions": "1/10seconds",
"/localai/v1/chat/completions": "2/second"}'

# PROXY: proxy configuration
# `GLOBAL_RATE_LIMIT`: applies to all routes not listed in `RATE_LIMIT`. Defaults to no limit if left empty
GLOBAL_RATE_LIMIT=2/5seconds

# `RATE_LIMIT_STRATEGY` options: (fixed-window, fixed-window-elastic-expiry, moving-window); ref: https://limits.readthedocs.io/en/latest/strategies.html
# `fixed-window`: the most memory-efficient strategy; `moving-window`: the most effective at preventing bursts, but with higher memory cost.
RATE_LIMIT_STRATEGY=fixed-window

# `PROXY`: http proxy
PROXY=http://localhost:7890

# `TOKEN_RATE_LIMIT`: rate limit on the tokens of each streamed response (note: the token here is not strictly the token defined by GPT, but an SSE chunk)
TOKEN_RATE_LIMIT=16/second

# Set the time zone
TZ=Asia/Shanghai
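To make the `fixed-window` vs `moving-window` trade-off above concrete, here is a minimal pure-Python sketch of the two strategies (an illustration of the idea only, not the `limits` implementation the service actually uses):

```python
from collections import defaultdict, deque

class FixedWindow:
    """Allow `count` hits per `period` seconds, counted per clock-aligned window."""
    def __init__(self, count, period):
        self.count, self.period = count, period
        self.windows = defaultdict(int)  # (key, window index) -> hits

    def hit(self, key, now):
        window = (key, int(now // self.period))
        if self.windows[window] >= self.count:
            return False
        self.windows[window] += 1
        return True

class MovingWindow:
    """Allow `count` hits within any sliding `period`-second span."""
    def __init__(self, count, period):
        self.count, self.period = count, period
        self.hits = defaultdict(deque)  # key -> recent hit timestamps

    def hit(self, key, now):
        q = self.hits[key]
        while q and now - q[0] >= self.period:
            q.popleft()  # drop hits that left the sliding window
        if len(q) >= self.count:
            return False
        q.append(now)
        return True

# A burst of 4 requests straddling a window boundary, with limit "2/second":
stamps = [0.9, 0.95, 1.05, 1.1]
fixed, moving = FixedWindow(2, 1.0), MovingWindow(2, 1.0)
print([fixed.hit("ip", t) for t in stamps])   # [True, True, True, True]
print([moving.hit("ip", t) for t in stamps])  # [True, True, False, False]
```

The fixed window lets 4 requests through in 0.2 s because the burst spans two windows; the moving window enforces the limit over every one-second span, at the cost of remembering individual timestamps.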
Binary file added .github/images/separators/aqua.png
Binary file added .github/images/separators/aqua.webp
10 changes: 6 additions & 4 deletions README.md
@@ -51,15 +51,17 @@
This project addresses the problem that some regions cannot directly access OpenAI: deploy the service on a (cloud) server that can reach the OpenAI API normally,
and forward OpenAI requests through it, i.e. run a reverse proxy service. Multiple OpenAI API keys may be supplied to form a round-robin pool, and secondary api keys for redistribution can be customized.

---

<picture>
<source type="image/webp" srcset="https://raw.githubusercontent.com/beidongjiedeguang/openai-forward/.github/images/separators/aqua.webp">
<img src="https://raw.githubusercontent.com/beidongjiedeguang/openai-forward/.github/images/separators/aqua.png" height=8px width="100%">
</picture>

Long-term proxy addresses run by this project:
> https://api.openai-forward.com
> https://cloudflare.worker.openai-forward.com
> https://render.openai-forward.com
> https://cloudflare.page.openai-forward.com
> https://vercel.openai-forward.com
> https://render.openai-forward.com
> https://railway.openai-forward.com

## Features

51 changes: 27 additions & 24 deletions deploy.md
@@ -8,10 +8,10 @@

[pip deployment](#pip部署) |
[docker deployment](#docker部署) |
[one-click railway deployment](#railway-一键部署) |
[one-click render deployment](#render-一键部署) |
[one-click Vercel deployment](#Vercel-一键部署) |
[one-click railway deployment](#railway-一键部署) |
[cloudflare deployment](#cloudflare-部署) |
[one-click Vercel deployment](#Vercel-一键部署)

</div>

@@ -33,8 +33,8 @@ openai-forward run # or use the alias aifd run

### Calling the service

To use it, simply replace `https://api.openai.com` with the service address and port, `http://{ip}:{port}`.
For example:
To use it, simply replace `https://api.openai.com` with the service address and port, `http://{ip}:{port}`
```bash
# default
https://api.openai.com/v1/chat/completions
@@ -63,7 +63,7 @@ docker run -d -p 9999:8000 beidongjiedeguang/openai-forward:latest
```

This maps host port 9999; access the service via `http://{ip}:9999`.
The log path inside the container is `/home/openai-forward/Log/`; it can be mapped out when starting
The log path inside the container is `/home/openai-forward/Log/`; it can be mapped out at startup

Note: the environment variable OPENAI_API_KEY=sk-xxx can likewise be passed via -e in the start command as the default api key
Enabling SSL is the same as above.
@@ -84,39 +84,42 @@ openai-forward run # or use the alias aifd run

---

## One-click Railway deployment
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/template/tejCum?referralCode=U0-kXv)

1. Click the deploy button above for one-click deployment
You can also fork this repo first, then import your own fork manually in the UI
2. Fill in the environment variables: required `PORT`: `8000`; optional ones such as the default OPENAI_API_KEY, etc.
3. Bind a custom domain

Note: Railway's free plan provides $5.0 and 500 hours of execution time per month, which means a single free user can only use it for about 21 days each month

> https://railway.openai-forward.com

---

## One-click Render deployment
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://github.com/beidongjiedeguang/openai-forward)

In practice, Render is probably the easiest of all the deployment options, and the domain it generates is directly accessible from mainland China!
Render is probably the easiest of all the deployment options, and the domain it generates is directly accessible from mainland China!

1. Click the one-click deploy button
If prompted to bind a card, you can fork this repo first --> go to the Render Dashboard, New Web Services --> Connect to the repo you just forked
If prompted to bind a card, you can fork this repo first --> go to the Render Dashboard, New Web Services --> Connect to the repo you just forked; the remaining steps can all be left at their defaults
2. Fill in environment variables, such as the default OPENAI_API_KEY; they may also be left empty

Then just wait for the deployment to finish.
Render's free plan: 750 hours of free instance time per month (meaning a single instance can run non-stop), 100 GB of bandwidth, and 500 minutes of build time.

Note: by default Render puts the service to sleep after 15 minutes with no requests (the upside is that a sleeping service does not consume the 750 free instance hours); the first request after it sleeps is blocked for 5~10 s.
If you don't want the service to auto-sleep after 15 minutes, you can use a scheduled script (e.g. every 14 minutes) to ping the service and keep it alive.
If you want zero-downtime deployments, set `Health Check Path` to `/healthz` in the settings
If you don't want the service to auto-sleep after 15 minutes, you can use a scheduled script (e.g. every 14 minutes) to ping the service and keep it alive. See `scripts/keep_render_alive.py` for a reference keep-alive script.
If you want zero-downtime deployments, set `Health Check Path` to `/healthz` in the settings

> https://render.openai-forward.com
> https://openai-forward.onrender.com
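A keep-alive script along those lines can be just a few lines of Python. The sketch below is a hypothetical stand-in for `scripts/keep_render_alive.py` (the real script may differ), pinging the `/healthz` route mentioned above:

```python
import time
import urllib.request

def ping(url: str, timeout: float = 10.0) -> bool:
    """Return True if the service answered HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def keep_alive(url: str, interval_s: float = 14 * 60, rounds: int = 0) -> None:
    """Ping `url` every `interval_s` seconds; rounds=0 means run forever."""
    n = 0
    while rounds == 0 or n < rounds:
        print("alive" if ping(url) else "no response")
        n += 1
        if rounds == 0 or n < rounds:
            time.sleep(interval_s)

# keep_alive("https://render.openai-forward.com/healthz")  # 14 min < Render's 15-min idle cutoff
```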


---

## One-click Railway deployment
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/template/tejCum?referralCode=U0-kXv)

1. Click the deploy button above for one-click deployment
You can also fork this repo first, then import your own fork manually in the UI
2. Fill in the environment variables: required `PORT`: `8000`; optional ones such as the default OPENAI_API_KEY, etc.
3. Bind a custom domain

Note: Railway's free plan provides $5.0 and 500 hours of execution time per month, which means a single free user can only use it for about 21 days each month

> https://railway.openai-forward.com


---

⚠️ The following two deployment methods provide only a simple forwarding service, without any extra features.
@@ -142,7 +145,7 @@ Render's free plan: 750 hours of free instance time per month (meaning a single instance
* Pages deployment: fork this repo, then when creating an application on [cloudflare](https://dash.cloudflare.com/) choose Pages, choose Connect to Git, and select the repo you just forked to complete the deployment.
* Workers deployment: when creating an application on [cloudflare](https://dash.cloudflare.com/) choose Workers; once the sample code is deployed, click quick edit and copy [_worker.js](_worker.js) into the code editor to complete the deployment.

Binding a custom domain: the domain cloudflare auto-assigns is also unreachable from mainland China, so a custom domain has to be bound. (For now, the domain auto-assigned for Pages deployments is still reachable from mainland China)
Binding a custom domain: the domain cloudflare auto-assigns is also unreachable from mainland China, so a custom domain has to be bound. (**For now, the domain auto-assigned for Pages deployments is directly reachable from mainland China**)

Binding a custom domain requires pointing the domain's default nameserver to the nameserver cloudflare provides; roughly, the process is:
```mermaid
@@ -155,7 +158,7 @@
stateDiagram-v2
去注册域名机构更改默认nameserver为cloudflare提供的nameserver --> 在cloudflare的worker/page中添加域名: 域名服务器更改验证成功
在cloudflare的worker/page中添加域名 --> 成功
```
This deployment method is light and simple and supports streaming forwarding. It is well worth recommending for users without a VPS. Note, however, that the simple [_worker.js](_worker.js) script currently only provides forwarding, with no extra features
This deployment method is light and simple and supports streaming forwarding. It is well worth recommending for users without a VPS. Note, however, that the simple [_worker.js](_worker.js) script currently only provides forwarding and does not support the extra features

> https://cloudflare.worker.openai-forward.com
> https://cloudflare.page.openai-forward.com
2 changes: 1 addition & 1 deletion openai_forward/__init__.py
@@ -1,4 +1,4 @@
__version__ = "0.4.1"
__version__ = "0.5.0-alpha"

from dotenv import load_dotenv

34 changes: 31 additions & 3 deletions openai_forward/__main__.py
@@ -16,6 +16,10 @@ def run(
openai_route_prefix=None,
extra_base_url=None,
extra_route_prefix=None,
rate_limit=None,
global_rate_limit=None,
rate_limit_strategy=None,
token_rate_limit=None,
ip_whitelist=None,
ip_blacklist=None,
proxy=None,
@@ -37,6 +41,7 @@
ip_whitelist: str, None
ip_blacklist: str, None
"""

if log_chat:
os.environ["LOG_CHAT"] = log_chat
if openai_base_url:
@@ -51,6 +56,17 @@
os.environ["EXTRA_ROUTE_PREFIX"] = extra_route_prefix
if forward_key:
os.environ["FORWARD_KEY"] = forward_key
if rate_limit:
import json

assert isinstance(rate_limit, dict)
os.environ["RATE_LIMIT"] = json.dumps(rate_limit)
if global_rate_limit:
os.environ["GLOBAL_RATE_LIMIT"] = global_rate_limit
if rate_limit_strategy:
os.environ["RATE_LIMIT_STRATEGY"] = rate_limit_strategy
if token_rate_limit:
os.environ["TOKEN_RATE_LIMIT"] = token_rate_limit
if ip_whitelist:
os.environ["IP_WHITELIST"] = ip_whitelist
if ip_blacklist:
@@ -71,12 +87,24 @@
)

@staticmethod
def convert(log_folder: str = "./Log/chat", target_path: str = "./Log/chat.json"):
def convert(log_folder: str = None, target_path: str = None):
"""Convert log folder to jsonl file"""
from openai_forward.forwarding.settings import OPENAI_ROUTE_PREFIX
from openai_forward.helper import convert_folder_to_jsonl

print(f"Convert {log_folder}/*.log to {target_path}")
convert_folder_to_jsonl(log_folder, target_path)
print(60 * '-')
if log_folder is None:
_prefix_list = [i.replace("/", "_") for i in OPENAI_ROUTE_PREFIX]
for _prefix in _prefix_list:
log_folder = f"./Log/chat/{_prefix}"
target_path = f"./Log/chat{_prefix}.json"
print(f"Convert {log_folder}/*.log to {target_path}")
convert_folder_to_jsonl(log_folder, target_path)
print(60 * '-')
else:
print(f"Convert {log_folder}/*.log to {target_path}")
convert_folder_to_jsonl(log_folder, target_path)
print(60 * '-')
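For readers unfamiliar with the helper being called here: a self-contained sketch of what `convert_folder_to_jsonl` plausibly does (the real implementation in `openai_forward.helper` may differ in details) is to merge every `*.log` file, one JSON object per line, into a single JSONL file:

```python
import json
from pathlib import Path

def convert_folder_to_jsonl(log_folder: str, target_path: str) -> int:
    """Merge every *.log file (one JSON object per line) under log_folder
    into one JSONL file at target_path; returns the number of records."""
    records = []
    for log_file in sorted(Path(log_folder).glob("*.log")):
        for raw in log_file.read_text(encoding="utf-8").splitlines():
            if raw.strip():
                # re-serialize to normalize whitespace and validate the JSON
                records.append(json.dumps(json.loads(raw), ensure_ascii=False))
    Path(target_path).write_text("\n".join(records) + "\n", encoding="utf-8")
    return len(records)
```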


def main():
22 changes: 18 additions & 4 deletions openai_forward/app.py
@@ -1,8 +1,20 @@
from fastapi import FastAPI, status
from fastapi import FastAPI, Request, status
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

from .forwarding import get_fwd_anything_objs, get_fwd_openai_style_objs
from .forwarding.settings import (
RATE_LIMIT_STRATEGY,
dynamic_rate_limit,
get_limiter_key,
)

limiter = Limiter(key_func=get_limiter_key, strategy=RATE_LIMIT_STRATEGY)

app = FastAPI(title="openai_forward", version="0.5")

app = FastAPI(title="openai_forward", version="0.4")
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)


@app.get(
@@ -11,13 +23,15 @@
response_description="Return HTTP Status Code 200 (OK)",
status_code=status.HTTP_200_OK,
)
def healthz():
@limiter.limit(dynamic_rate_limit)
def healthz(request: Request):
print(request.scope.get("client"))
return "OK"


add_route = lambda obj: app.add_route(
obj.ROUTE_PREFIX + "{api_path:path}",
obj.reverse_proxy,
limiter.limit(dynamic_rate_limit)(obj.reverse_proxy),
methods=["GET", "POST", "PUT", "DELETE", "OPTIONS", "HEAD", "PATCH", "TRACE"],
)
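For context on `dynamic_rate_limit`: slowapi accepts a callable as a limit value, so the limit can vary per route. A hedged sketch of how such a lookup could work (the real `dynamic_rate_limit` in `openai_forward.forwarding.settings` may differ; the `RATE_LIMIT` and `GLOBAL_RATE_LIMIT` values below merely mirror the `.env` examples):

```python
# Hypothetical per-route limit lookup mirroring the .env configuration.
RATE_LIMIT = {
    "/healthz": "1000/2minutes",
    "/v1/chat/completions": "300/minute",
}
GLOBAL_RATE_LIMIT = "30/minute"  # placeholder fallback for unlisted routes

def dynamic_rate_limit(key: str) -> str:
    """Given a limiter key that embeds the request path, return the
    rate-limit string for the first matching route prefix."""
    for route, limit in RATE_LIMIT.items():
        if key.startswith(route):
            return limit
    return GLOBAL_RATE_LIMIT

print(dynamic_rate_limit("/healthz"))              # 1000/2minutes
print(dynamic_rate_limit("/v1/chat/completions"))  # 300/minute
print(dynamic_rate_limit("/v1/embeddings"))        # 30/minute (fallback)
```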
