Feat: rate limit support (#59)
* slowapi rate limit support

* Add tpm && fix multi-target log

* 🔥 Cancel extra log

* 💫 Adaptation

* chore: update docs and print info

* clean up
KenyonY authored Aug 16, 2023
1 parent 7c18fae commit 1e5ce9f
Showing 23 changed files with 535 additions and 142 deletions.
32 changes: 25 additions & 7 deletions .env
Original file line number Diff line number Diff line change
@@ -1,22 +1,40 @@
# See .env.example for examples
# See .env.example for examples and explanations

# LOG_CHAT: whether to enable logging
# `LOG_CHAT`: whether to log chat content
LOG_CHAT=false

# OPENAI_BASE_URL: base URL of any OpenAI-style service to forward; multiple URLs allowed, comma-separated.
# `OPENAI_BASE_URL`: base URL of any OpenAI-style service to forward; multiple URLs allowed, comma-separated.
# If more than one is specified, no OPENAI_ROUTE_PREFIX/EXTRA_ROUTE_PREFIX may be the root route /
OPENAI_BASE_URL=https://api.openai.com

# OPENAI_ROUTE_PREFIX: forwarding route prefix for all OpenAI-style (logged) services
OPENAI_ROUTE_PREFIX=/
# `OPENAI_ROUTE_PREFIX`: forwarding route prefix for all OpenAI-style (logged) services
OPENAI_ROUTE_PREFIX=

OPENAI_API_KEY=
FORWARD_KEY=

# EXTRA_BASE_URL: forward any other service
# `EXTRA_BASE_URL`: forward any other service
EXTRA_BASE_URL=
# EXTRA_ROUTE_PREFIX: route prefixes matching EXTRA_BASE_URL
# `EXTRA_ROUTE_PREFIX`: route prefixes matching EXTRA_BASE_URL
EXTRA_ROUTE_PREFIX=

# `RATE_LIMIT`: request rate limit (i.e. RPM) for the specified routes
# format: {route: ratelimit-string}
# ratelimit-string format: [count] [per|/] [n (optional)] [second|minute|hour|day|month|year]
# ref: https://limits.readthedocs.io/en/stable/quickstart.html#rate-limit-string-notation
RATE_LIMIT='{
"/healthz": "1000/2minutes",
"/v1/chat/completions": "300/minute"
}'

# `RATE_LIMIT_STRATEGY` options: fixed-window, fixed-window-elastic-expiry, moving-window; ref: https://limits.readthedocs.io/en/latest/strategies.html
# `fixed-window`: most memory-efficient strategy; `moving-window`: most effective at preventing bursts, but higher memory cost.
RATE_LIMIT_STRATEGY=moving-window

# `GLOBAL_RATE_LIMIT`: applies to every route not listed in `RATE_LIMIT`. Unlimited by default if left empty.
GLOBAL_RATE_LIMIT=

# TPM: rate limit on returned tokens
TOKEN_RATE_LIMIT=50/second

# Set the time zone
TZ=Asia/Shanghai
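The `RATE_LIMIT` value above is a JSON object mapping routes to ratelimit strings, with `GLOBAL_RATE_LIMIT` as the fallback for unlisted routes. A minimal Python sketch of how such a value can be parsed and resolved per route (`parse_rate_limit` and `limit_for` are illustrative names, not the project's actual API):

```python
import json


def parse_rate_limit(env_value: str) -> dict:
    """Parse the RATE_LIMIT JSON string into a {route: ratelimit-string} mapping."""
    return json.loads(env_value) if env_value else {}


def limit_for(route: str, route_limits: dict, global_limit: str = "inf") -> str:
    """Resolve the limit for a route, falling back to GLOBAL_RATE_LIMIT ('inf' = unlimited)."""
    return route_limits.get(route, global_limit)


limits = parse_rate_limit('{"/healthz": "1000/2minutes", "/v1/chat/completions": "300/minute"}')
print(limit_for("/v1/chat/completions", limits))  # → 300/minute
print(limit_for("/v1/embeddings", limits))        # → inf
```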
32 changes: 24 additions & 8 deletions .env.example
Original file line number Diff line number Diff line change
@@ -1,29 +1,45 @@
# LOG_CHAT: whether to enable logging
LOG_CHAT=true


# OPENAI_BASE_URL: base URL of any OpenAI-style service to forward; multiple URLs allowed, comma-separated.
# If more than one is specified, no OPENAI_ROUTE_PREFIX/EXTRA_ROUTE_PREFIX may be the root route /
OPENAI_BASE_URL=https://api.openai.com, http:localhost:8080
OPENAI_BASE_URL='https://api.openai.com, http:localhost:8080'

# OPENAI_ROUTE_PREFIX: forwarding route prefix for all OpenAI-style (logged) services
OPENAI_ROUTE_PREFIX=/openai, /localai
OPENAI_ROUTE_PREFIX='/openai, /localai'

# OPENAI_API_KEY: multiple api keys may be given, comma-separated, forming a round-robin pool
OPENAI_API_KEY=sk-xxx1,sk-xxx2,sk-xxx3
OPENAI_API_KEY='sk-xxx1, sk-xxx2, sk-xxx3'

# FORWARD_KEY: once OPENAI_API_KEY above is set, FORWARD_KEY can be set here; clients may then use FORWARD_KEY as their api key
FORWARD_KEY=fk-xxx1


# EXTRA_BASE_URL: forward any other service
EXTRA_BASE_URL=http://localhost:8882, http://localhost:8881
EXTRA_BASE_URL='http://localhost:8882, http://localhost:8881'

# EXTRA_ROUTE_PREFIX: route prefixes matching EXTRA_BASE_URL
EXTRA_ROUTE_PREFIX=/tts,/translate
EXTRA_ROUTE_PREFIX='/tts, /translate'

# RATE_LIMIT: request rate limit for the specified routes (not per-client)
# format: {route: ratelimit-string}
# ratelimit-string format: [count] [per|/] [n (optional)] [second|minute|hour|day|month|year]
# ref: https://limits.readthedocs.io/en/stable/quickstart.html#rate-limit-string-notation
RATE_LIMIT='{
"/healthz": "50/3minutes",
"/openai/v1/chat/completions": "1/10seconds",
"/localai/v1/chat/completions": "2/second"}'

# PROXY: proxy configuration
# `GLOBAL_RATE_LIMIT`: applies to every route not listed in `RATE_LIMIT`. Unlimited by default if left empty.
GLOBAL_RATE_LIMIT=2/5seconds

# `RATE_LIMIT_STRATEGY` options: fixed-window, fixed-window-elastic-expiry, moving-window; ref: https://limits.readthedocs.io/en/latest/strategies.html
# `fixed-window`: most memory-efficient strategy; `moving-window`: most effective at preventing bursts, but higher memory cost.
RATE_LIMIT_STRATEGY=fixed-window

# `PROXY`: http proxy
PROXY=http://localhost:7890

# `TOKEN_RATE_LIMIT`: rate limit on tokens in each streamed response (note: a "token" here is not strictly a token as defined by gpt, but an SSE chunk)
TOKEN_RATE_LIMIT=16/second

# Set the time zone
TZ=Asia/Shanghai
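`TOKEN_RATE_LIMIT=16/second` amounts to a minimum delay between streamed SSE chunks. A rough sketch of that throttling idea (the helper names are hypothetical; the project's real implementation differs):

```python
import time


def min_interval(spec: str) -> float:
    """Convert a rate string like '16/second' into the minimum delay (seconds) between chunks."""
    count, _, unit = spec.partition("/")
    seconds_per_unit = {"second": 1, "minute": 60, "hour": 3600}[unit]
    return seconds_per_unit / int(count)


def throttle(chunks, spec="16/second"):
    """Yield SSE chunks no faster than the configured token rate."""
    delay = min_interval(spec)
    for chunk in chunks:
        yield chunk
        time.sleep(delay)


print(min_interval("16/second"))  # → 0.0625
```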
Binary file added .github/images/separators/aqua.png
Binary file added .github/images/separators/aqua.webp
Binary file not shown.
10 changes: 6 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
@@ -51,15 +51,17 @@
This project addresses the problem that some regions cannot access OpenAI directly: deploy the service on a (cloud) server that can reach the OpenAI API normally and forward OpenAI requests through it. In other words, it sets up a reverse proxy; multiple OpenAI API keys can be supplied to form a round-robin pool, and custom api keys can be issued for secondary distribution.

---

<picture>
<source type="image/webp" srcset="https://raw.githubusercontent.com/beidongjiedeguang/openai-forward/.github/images/separators/aqua.webp">
<img src="https://raw.githubusercontent.com/beidongjiedeguang/openai-forward/.github/images/separators/aqua.png" height=8px width="100%">
</picture>

Long-term proxy addresses hosted by this project:
> https://api.openai-forward.com
> https://cloudflare.worker.openai-forward.com
> https://render.openai-forward.com
> https://cloudflare.page.openai-forward.com
> https://vercel.openai-forward.com
> https://render.openai-forward.com
> https://railway.openai-forward.com
## Features

51 changes: 27 additions & 24 deletions deploy.md
Original file line number Diff line number Diff line change
@@ -8,10 +8,10 @@

[pip deployment](#pip部署) |
[docker deployment](#docker部署) |
[Railway one-click deployment](#railway-一键部署) |
[Render one-click deployment](#render-一键部署) |
[Vercel one-click deployment](#Vercel-一键部署) |
[Railway one-click deployment](#railway-一键部署) |
[cloudflare deployment](#cloudflare-部署) |
[Vercel one-click deployment](#Vercel-一键部署)

</div>

@@ -33,8 +33,8 @@ openai-forward run # or use the alias: aifd run

### Calling the service

To use it, simply replace `https://api.openai.com` with the address and port where the service runs: `http://{ip}:{port}`.
For example
To use it, simply replace `https://api.openai.com` with the address and port where the service runs: `http://{ip}:{port}`
```bash
# default
https://api.openai.com/v1/chat/completions
@@ -63,7 +63,7 @@ docker run -d -p 9999:8000 beidongjiedeguang/openai-forward:latest
```

This maps port 9999 on the host; the service is reachable at `http://{ip}:9999`.
The log path inside the container is `/home/openai-forward/Log/`; it can be mapped out when starting the container.
The log path inside the container is `/home/openai-forward/Log/`; it can be mapped out at startup.

Note: the environment variable OPENAI_API_KEY=sk-xxx can likewise be passed via -e in the start command to serve as the default api key.
Enabling SSL works as above.
@@ -84,39 +84,42 @@ openai-forward run # or use the alias: aifd run

---

## Railway one-click deployment
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/template/tejCum?referralCode=U0-kXv)

1. Click the deploy button above for one-click deployment.
   You can also fork this repository first and then import your fork manually in the UI.
2. Fill in the environment variables: the required `PORT`: `8000`, and optional ones such as the default OPENAI_API_KEY.
3. Bind a custom domain.

Note: Railway's free plan provides $5.0 and 500 execution hours per month, which means a single free user can run the service for only about 21 days a month.

> https://railway.openai-forward.com
---

## Render one-click deployment
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://github.com/beidongjiedeguang/openai-forward)

In my experience, Render is probably the easiest of all these deployment options, and the domain it generates is directly accessible from mainland China!
Render is probably the easiest of all these deployment options, and the domain it generates is directly accessible from mainland China!

1. Click the one-click deploy button.
   If you are asked to bind a card, you can fork this repository first --> in the Render Dashboard: New Web Services --> Connect to the repository you just forked.
   If you are asked to bind a card, you can fork this repository first --> in the Render Dashboard: New Web Services --> Connect to the repository you just forked; the remaining steps can all be left at their defaults.
2. Fill in environment variables, such as the default OPENAI_API_KEY; they can also be left empty.

Then just wait for the deployment to finish.
Render's free plan: 750 free instance hours per month (meaning a single instance can run non-stop), 100G of bandwidth, and 500 minutes of build time.

Note: by default Render puts the service to sleep when it receives no requests for 15 minutes (the upside is that a sleeping service does not consume the 750h of free instance time); the first request after sleeping is blocked for 5-10s.
If you don't want the service to auto-sleep after 15 minutes, use a scheduled script (e.g. every 14 minutes) to ping the service and keep it alive.
For zero-downtime deployment, set the `Health Check Path` to `/healthz` in the settings.
If you don't want the service to auto-sleep after 15 minutes, use a scheduled script (e.g. every 14 minutes) to ping the service and keep it alive; see `scripts/keep_render_alive.py` for a reference keep-alive script.
For zero-downtime deployment, set the `Health Check Path` to `/healthz` in the settings.

> https://render.openai-forward.com
> https://openai-forward.onrender.com

---

## Railway one-click deployment
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/template/tejCum?referralCode=U0-kXv)

1. Click the deploy button above for one-click deployment.
   You can also fork this repository first and then import your fork manually in the UI.
2. Fill in the environment variables: the required `PORT`: `8000`, and optional ones such as the default OPENAI_API_KEY.
3. Bind a custom domain.

Note: Railway's free plan provides $5.0 and 500 execution hours per month, which means a single free user can run the service for only about 21 days a month.

> https://railway.openai-forward.com

---

⚠️ The two deployment methods below provide only simple forwarding, without any extra features.
@@ -142,7 +145,7 @@ Render's free plan: 750 free instance hours per month (meaning a single instance
* Pages deployment: fork this repository, then when creating an application on [cloudflare](https://dash.cloudflare.com/) choose Pages, select connect to Git, and pick the fork you just made to complete the deployment.
* Workers deployment: when creating an application on [cloudflare](https://dash.cloudflare.com/) choose Workers; once the sample code is deployed, click quick edit and copy [_worker.js](_worker.js) into the code editor to complete the deployment.

Bind a custom domain: the domain cloudflare auto-assigns is also inaccessible from mainland China, so a custom domain is needed. (the domain auto-assigned by a Pages deployment is currently still accessible from mainland China)
Bind a custom domain: the domain cloudflare auto-assigns is also inaccessible from mainland China, so a custom domain is needed. (**the domain auto-assigned by a Pages deployment is currently directly accessible from mainland China**)

Binding a custom domain requires pointing the domain's default nameservers to the nameservers cloudflare provides; the process is roughly:
```mermaid
@@ -155,7 +158,7 @@
去注册域名机构更改默认nameserver为cloudflare提供的nameserver --> 在cloudflare的worker/page中添加域名: 域名服务器更改验证成功
在cloudflare的worker/page中添加域名 --> 成功
```
这种部署方式轻便简洁,支持流式转发. 对于没有vps的用户还是十分推荐的。不过目前[_worker.js](_worker.js)这个简单脚本仅提供转发服务, 不提供额外功能
This deployment method is light and simple and supports streaming forwarding. It is well worth recommending for users without a VPS. Note, however, that the simple [_worker.js](_worker.js) script currently provides only forwarding and supports no extra features.

> https://cloudflare.worker.openai-forward.com
> https://cloudflare.page.openai-forward.com
2 changes: 1 addition & 1 deletion openai_forward/__init__.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
__version__ = "0.4.1"
__version__ = "0.5.0-alpha"

from dotenv import load_dotenv

34 changes: 31 additions & 3 deletions openai_forward/__main__.py
Original file line number Diff line number Diff line change
@@ -16,6 +16,10 @@ def run(
openai_route_prefix=None,
extra_base_url=None,
extra_route_prefix=None,
rate_limit=None,
global_rate_limit=None,
rate_limit_strategy=None,
token_rate_limit=None,
ip_whitelist=None,
ip_blacklist=None,
proxy=None,
@@ -37,6 +41,7 @@
ip_whitelist: str, None
ip_blacklist: str, None
"""

if log_chat:
os.environ["LOG_CHAT"] = log_chat
if openai_base_url:
@@ -51,6 +56,17 @@
os.environ["EXTRA_ROUTE_PREFIX"] = extra_route_prefix
if forward_key:
os.environ["FORWARD_KEY"] = forward_key
if rate_limit:
import json

assert isinstance(rate_limit, dict)
os.environ["RATE_LIMIT"] = json.dumps(rate_limit)
if global_rate_limit:
os.environ["GLOBAL_RATE_LIMIT"] = global_rate_limit
if rate_limit_strategy:
os.environ["RATE_LIMIT_STRATEGY"] = rate_limit_strategy
if token_rate_limit:
os.environ["TOKEN_RATE_LIMIT"] = token_rate_limit
if ip_whitelist:
os.environ["IP_WHITELIST"] = ip_whitelist
if ip_blacklist:
@@ -71,12 +87,24 @@
)

@staticmethod
def convert(log_folder: str = "./Log/chat", target_path: str = "./Log/chat.json"):
def convert(log_folder: str = None, target_path: str = None):
"""Convert log folder to jsonl file"""
from openai_forward.forwarding.settings import OPENAI_ROUTE_PREFIX
from openai_forward.helper import convert_folder_to_jsonl

print(f"Convert {log_folder}/*.log to {target_path}")
convert_folder_to_jsonl(log_folder, target_path)
print(60 * '-')
if log_folder is None:
_prefix_list = [i.replace("/", "_") for i in OPENAI_ROUTE_PREFIX]
for _prefix in _prefix_list:
log_folder = f"./Log/chat/{_prefix}"
target_path = f"./Log/chat{_prefix}.json"
print(f"Convert {log_folder}/*.log to {target_path}")
convert_folder_to_jsonl(log_folder, target_path)
print(60 * '-')
else:
print(f"Convert {log_folder}/*.log to {target_path}")
convert_folder_to_jsonl(log_folder, target_path)
print(60 * '-')


def main():
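The new `convert` default derives one log folder and one jsonl target per entry in `OPENAI_ROUTE_PREFIX`, flattening `/` to `_`. That path-derivation step can be sketched in isolation (`default_convert_targets` is an illustrative name, not part of the project):

```python
def default_convert_targets(route_prefixes):
    """Mirror the new convert() default: derive one (log_folder, target_path)
    pair per OpenAI route prefix, flattening '/' to '_' as in __main__.py."""
    targets = []
    for prefix in route_prefixes:
        flat = prefix.replace("/", "_")
        targets.append((f"./Log/chat/{flat}", f"./Log/chat{flat}.json"))
    return targets


print(default_convert_targets(["/openai", "/localai"]))
# → [('./Log/chat/_openai', './Log/chat_openai.json'), ('./Log/chat/_localai', './Log/chat_localai.json')]
```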
22 changes: 18 additions & 4 deletions openai_forward/app.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,20 @@
from fastapi import FastAPI, status
from fastapi import FastAPI, Request, status
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

from .forwarding import get_fwd_anything_objs, get_fwd_openai_style_objs
from .forwarding.settings import (
RATE_LIMIT_STRATEGY,
dynamic_rate_limit,
get_limiter_key,
)

limiter = Limiter(key_func=get_limiter_key, strategy=RATE_LIMIT_STRATEGY)

app = FastAPI(title="openai_forward", version="0.5")

app = FastAPI(title="openai_forward", version="0.4")
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)


@app.get(
@@ -11,13 +23,15 @@
response_description="Return HTTP Status Code 200 (OK)",
status_code=status.HTTP_200_OK,
)
def healthz():
@limiter.limit(dynamic_rate_limit)
def healthz(request: Request):
print(request.scope.get("client"))
return "OK"


add_route = lambda obj: app.add_route(
obj.ROUTE_PREFIX + "{api_path:path}",
obj.reverse_proxy,
limiter.limit(dynamic_rate_limit)(obj.reverse_proxy),
methods=["GET", "POST", "PUT", "DELETE", "OPTIONS", "HEAD", "PATCH", "TRACE"],
)

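The `dynamic_rate_limit` callable handed to `limiter.limit(...)` ultimately has to map each request to a ratelimit string. A standalone sketch of that route-based lookup with the slowapi wiring stripped away (the real function lives in `openai_forward/forwarding/settings.py`; the signature here is simplified and illustrative):

```python
ROUTE_RATE_LIMITS = {            # would be populated from the RATE_LIMIT env var
    "/healthz": "1000/2minutes",
    "/v1/chat/completions": "300/minute",
}
GLOBAL_RATE_LIMIT = "inf"        # "inf" means no limit


def dynamic_rate_limit(route_path: str) -> str:
    """Return the ratelimit string for a route, falling back to the global limit.

    (Simplified: the actual callable is invoked by slowapi per request.)
    """
    return ROUTE_RATE_LIMITS.get(route_path, GLOBAL_RATE_LIMIT)


print(dynamic_rate_limit("/healthz"))    # → 1000/2minutes
print(dynamic_rate_limit("/v1/models"))  # → inf
```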
