feat: ai-content-moderation plugin #11541

Merged
Changes from 10 commits
34 commits
c80f932
feat: content-moderation plugin
shreemaan-abhishek Aug 30, 2024
d29f928
fix lint
shreemaan-abhishek Aug 30, 2024
8ac0738
Merge branch 'master' of github.com:apache/apisix into content-modera…
shreemaan-abhishek Aug 30, 2024
7ffa489
change priority in lua file
shreemaan-abhishek Aug 30, 2024
5214d0d
change priority and plugins.t
shreemaan-abhishek Aug 30, 2024
2fe1ea2
lint fix
shreemaan-abhishek Aug 30, 2024
5cecf2a
upgrade luarocks version
shreemaan-abhishek Aug 30, 2024
cf08f04
add docs
shreemaan-abhishek Aug 30, 2024
460081e
format doc
shreemaan-abhishek Aug 30, 2024
d475eb0
add to config.json
shreemaan-abhishek Aug 30, 2024
6350d15
update doc
shreemaan-abhishek Sep 2, 2024
f713f87
cleanup
shreemaan-abhishek Sep 2, 2024
e16a823
support ai-model based moderation
shreemaan-abhishek Sep 2, 2024
b21b64b
support secrets
shreemaan-abhishek Sep 2, 2024
ee34e37
rename to ai-content-moderation
shreemaan-abhishek Sep 2, 2024
12529f0
modularise on basis of provider
shreemaan-abhishek Sep 2, 2024
57c59ab
rename
shreemaan-abhishek Sep 10, 2024
093d7a9
cleanup
shreemaan-abhishek Sep 10, 2024
7b52fa5
code review
shreemaan-abhishek Sep 11, 2024
6e3bee2
Merge branch 'master' of github.com:apache/apisix into content-modera…
shreemaan-abhishek Sep 17, 2024
6a2d575
fix method name
shreemaan-abhishek Sep 18, 2024
6bb399c
fix ci
shreemaan-abhishek Sep 18, 2024
1f4528d
code review
shreemaan-abhishek Sep 23, 2024
ef16068
fix doc
shreemaan-abhishek Sep 23, 2024
f6f3451
code review
shreemaan-abhishek Sep 25, 2024
f3672fa
add service provider related info
shreemaan-abhishek Sep 25, 2024
8447d6d
update with LLM proxy
shreemaan-abhishek Oct 3, 2024
0949327
suggestions
shreemaan-abhishek Oct 3, 2024
3a616e1
cleanup
shreemaan-abhishek Oct 3, 2024
4c1f2a6
cleanup lua
shreemaan-abhishek Oct 3, 2024
5b1be91
conf ssl_verify
shreemaan-abhishek Oct 9, 2024
a3e47b2
cleanup
shreemaan-abhishek Oct 9, 2024
81958e4
toxicity_level -> moderation_threshold
shreemaan-abhishek Oct 9, 2024
3da00a2
rm redundant file
shreemaan-abhishek Oct 9, 2024
2 changes: 1 addition & 1 deletion apisix-master-0.rockspec
@@ -82,7 +82,7 @@ dependencies = {
"lua-resty-t1k = 1.1.5",
"brotli-ffi = 0.3-1",
"lua-ffi-zlib = 0.6-0",
-"api7-lua-resty-aws == 2.0.1-1",
+"api7-lua-resty-aws == 2.0.2-1",
}

build = {
1 change: 1 addition & 0 deletions apisix/cli/config.lua
@@ -215,6 +215,7 @@ local _M = {
"body-transformer",
"ai-prompt-template",
"ai-prompt-decorator",
"content-moderation",
"proxy-mirror",
"proxy-rewrite",
"workflow",
155 changes: 155 additions & 0 deletions apisix/plugins/content-moderation.lua
@@ -0,0 +1,155 @@
--
-- Licensed to the Apache Software Foundation (ASF) under one or more
-- contributor license agreements. See the NOTICE file distributed with
-- this work for additional information regarding copyright ownership.
-- The ASF licenses this file to You under the Apache License, Version 2.0
-- (the "License"); you may not use this file except in compliance with
-- the License. You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
--
local core = require("apisix.core")
local aws = require("resty.aws")
local aws_instance = aws()
local http = require("resty.http")
local next = next
local pairs = pairs
local unpack = unpack

local aws_comprehend_schema = {
    type = "object",
    properties = {
        access_key_id = { type = "string" },
        secret_access_key = { type = "string" },
        region = { type = "string" },
        endpoint = {
            type = "string",
            pattern = [[^https?://]]
        },
    },
    required = { "access_key_id", "secret_access_key", "region", }
}

local schema = {
    type = "object",
    properties = {
        provider = {
            type = "object",
            properties = {
                aws_comprehend = aws_comprehend_schema
            },
            -- change to oneOf/enum while implementing support for other services
            required = { "aws_comprehend" }
        },
        moderation_categories = {
            type = "object",
            patternProperties = {
                -- luacheck: push max code line length 300
                ["^(PROFANITY|HATE_SPEECH|INSULT|HARASSMENT_OR_ABUSE|SEXUAL|VIOLENCE_OR_THREAT)$"] = {
                -- luacheck: pop
                    type = "number",
                    minimum = 0,
                    maximum = 1
                }
            },
            additionalProperties = false
        },
        toxicity_level = {
            type = "number",
            minimum = 0,
            maximum = 1,
            default = 0.5
        },
        reject_requests = {
            type = "boolean",
            default = true,
        }
    },
    required = { "provider" },
}


local _M = {
    version = 0.1,
    priority = 1040, -- TODO: might change
    name = "content-moderation",
    schema = schema,
}


function _M.check_schema(conf)
    return core.schema.check(schema, conf)
end

function _M.rewrite(conf, ctx)
    local body = core.request.get_body()
    if not body then
        return
    end

    -- the schema currently allows a single provider, so take the first key
    local provider_name = next(conf.provider)
    local provider = conf.provider[provider_name]

    -- TODO support secret
    local credentials = aws_instance:Credentials({
        accessKeyId = provider.access_key_id,
        secretAccessKey = provider.secret_access_key,
        sessionToken = provider.session_token,
    })

    local default_endpoint = "https://comprehend." .. provider.region .. ".amazonaws.com"
    local scheme, host, port = unpack(http:parse_uri(provider.endpoint or default_endpoint))
    local endpoint = scheme .. "://" .. host
    aws_instance.config.endpoint = endpoint
    -- note: TLS certificate verification is disabled here
    aws_instance.config.ssl_verify = false

    local comprehend = aws_instance:Comprehend({
        credentials = credentials,
        endpoint = endpoint,
        region = provider.region,
        port = port,
    })

    local res, err = comprehend:detectToxicContent({
        LanguageCode = "en",
        TextSegments = {
            {
                Text = body
            }
        },
    })

    if not res then
        -- log the provider name; `provider` itself is a table
        core.log.error("failed to send request to ", provider_name, ": ", err)
        return 500, err
    end

    local result = res.body and res.body.ResultList and res.body.ResultList[1]
    if not result then
        return 500, "failed to get moderation result from response"
    end


    -- per-category thresholds are checked first
    if conf.moderation_categories then
        for _, item in pairs(result.Labels) do
            if not conf.moderation_categories[item.Name] then
                goto continue
            end
            if item.Score > conf.moderation_categories[item.Name] then
                return 400, "request body exceeds " .. item.Name .. " threshold"
            end
            ::continue::
        end
    end

    -- then the overall toxicity score
    if result.Toxicity > conf.toxicity_level then
        return 400, "request body exceeds toxicity threshold"
    end
end

return _M
1 change: 1 addition & 0 deletions conf/config.yaml.example
@@ -478,6 +478,7 @@ plugins: # plugin list (sorted by priority)
- body-transformer # priority: 1080
- ai-prompt-template # priority: 1071
- ai-prompt-decorator # priority: 1070
- content-moderation # priority: 1040 TODO: compare priority with other ai plugins
- proxy-mirror # priority: 1010
- proxy-rewrite # priority: 1008
- workflow # priority: 1006
3 changes: 2 additions & 1 deletion docs/en/latest/config.json
@@ -80,7 +80,8 @@
"plugins/ext-plugin-post-req",
"plugins/ext-plugin-post-resp",
"plugins/inspect",
-"plugins/ocsp-stapling"
+"plugins/ocsp-stapling",
+"plugins/content-moderation"
]
},
{
199 changes: 199 additions & 0 deletions docs/en/latest/plugins/content-moderation.md
@@ -0,0 +1,199 @@
---
title: content-moderation
keywords:
- Apache APISIX
- API Gateway
- Plugin
- content-moderation
description: This document contains information about the Apache APISIX content-moderation Plugin.
---

<!--
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
-->

## Description

The `content-moderation` plugin processes the request body to check for toxicity and rejects the request if it exceeds the configured threshold.

## Plugin Attributes

| **Field** | **Required** | **Type** | **Description** |
| ----------------------------------------- | ------------ | -------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| provider.aws_comprehend.access_key_id | Yes | String | AWS access key ID |
| provider.aws_comprehend.secret_access_key | Yes | String | AWS secret access key |
| provider.aws_comprehend.region | Yes | String | AWS region |
| provider.aws_comprehend.endpoint | No | String | AWS Comprehend service endpoint. Must match the pattern `^https?://` |
| moderation_categories | No | Object | Map of category name to score threshold (0 - 1). Allowed keys: PROFANITY, HATE_SPEECH, INSULT, HARASSMENT_OR_ABUSE, SEXUAL, VIOLENCE_OR_THREAT |
| toxicity_level | No | Number | Threshold for overall toxicity detection. Range: 0 - 1. Default: 0.5 |
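
The constraints in the table above can be illustrated with a small plain-Lua check. This is only a sketch: the plugin itself validates its configuration through `core.schema.check`, and the helper names `validate_moderation_conf` and `in_unit_range` are hypothetical, introduced here for illustration.

```lua
-- Hypothetical helper, not part of the plugin: restates the documented
-- constraints on moderation_categories and toxicity_level in plain Lua.
local ALLOWED_CATEGORIES = {
    PROFANITY = true,
    HATE_SPEECH = true,
    INSULT = true,
    HARASSMENT_OR_ABUSE = true,
    SEXUAL = true,
    VIOLENCE_OR_THREAT = true,
}

-- true when v is a number in the closed interval [0, 1]
local function in_unit_range(v)
    return type(v) == "number" and v >= 0 and v <= 1
end

local function validate_moderation_conf(conf)
    for name, threshold in pairs(conf.moderation_categories or {}) do
        if not ALLOWED_CATEGORIES[name] then
            return false, "unknown moderation category: " .. tostring(name)
        end
        if not in_unit_range(threshold) then
            return false, "threshold for " .. name .. " must be a number in [0, 1]"
        end
    end

    if conf.toxicity_level ~= nil and not in_unit_range(conf.toxicity_level) then
        return false, "toxicity_level must be a number in [0, 1]"
    end

    return true
end

-- a valid configuration, then one with an unsupported category
print(validate_moderation_conf({ moderation_categories = { PROFANITY = 0.5 } }))
print(validate_moderation_conf({ moderation_categories = { SPAM = 0.5 } }))
```

Each key of `moderation_categories` is a category name and each value is that category's own threshold, which is why the field is an object rather than a list.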

## Example usage

Create a route with the `content-moderation` plugin like so:

```shell
curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"uri": "/post",
"plugins": {
"content-moderation": {
"provider": {
"aws_comprehend": {
"access_key_id": "access",
"secret_access_key": "ea+secret",
"region": "us-east-1"
}
},
"moderation_categories": {
"PROFANITY": 0.5
}
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org:80": 1
}
}
}'
```

Now send a request:

```shell
curl http://127.0.0.1:9080/post -i -XPOST -H 'Content-Type: application/json' -d '{
"info": "<some very seriously profane message>"
}'
```

The request is then rejected with a response like this:

```text
HTTP/1.1 400 Bad Request
Date: Fri, 30 Aug 2024 11:21:21 GMT
Content-Type: text/plain; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Server: APISIX/3.10.0

request body exceeds toxicity threshold
```

Now send a request with a benign body:

```shell
curl http://127.0.0.1:9080/post -i -XPOST -H 'Content-Type: application/json' -d '{
"info": "APISIX is wonderful"
}'
```

This request will be proxied normally to the upstream.

```text
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 530
Connection: keep-alive
Date: Fri, 30 Aug 2024 11:21:55 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
Server: APISIX/3.10.0

{
"args": {},
"data": "",
"files": {},
"form": {
"do you know what is alpha murder method? I will teach you with care": ""
},
"headers": {
"Accept": "*/*",
"Content-Length": "67",
"Content-Type": "application/x-www-form-urlencoded",
"Host": "127.0.0.1",
"User-Agent": "curl/8.7.1",
"X-Amzn-Trace-Id": "Root=1-66d1ab53-0860444b1b01a3f93c7003f4",
"X-Forwarded-Host": "127.0.0.1"
},
"json": null,
"origin": "127.0.0.1, 163.53.25.129",
"url": "http://127.0.0.1/post"
}
```

You can also configure filters on other moderation categories like so:

```shell
curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"uri": "/post",
"plugins": {
"content-moderation": {
"provider": {
"aws_comprehend": {
"access_key_id": "access",
"secret_access_key": "ea+secret",
"region": "us-east-1"
}
},
"moderation_categories": {
"PROFANITY": 0.5,
"HARASSMENT_OR_ABUSE": 0.7,
"SEXUAL": 0.2
}
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org:80": 1
}
}
}'
```

If no `moderation_categories` are configured, request bodies are moderated on the basis of overall toxicity alone.
The default `toxicity_level` is 0.5; it can be changed like so:

```shell
curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"uri": "/post",
"plugins": {
"content-moderation": {
"provider": {
"aws_comprehend": {
"access_key_id": "access",
"secret_access_key": "ea+secret",
"region": "us-east-1"
}
},
"toxicity_level": 0.7
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org:80": 1
}
}
}'
```
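
The order in which the thresholds are applied can be sketched in plain Lua. This is an illustrative restatement of the checks in the plugin's `rewrite` function from this change, not the plugin code itself: the function name `moderate` and the constant `DEFAULT_TOXICITY_LEVEL` are hypothetical, and the `result` table is assumed to have the shape of one entry of the provider's result list (`Toxicity` plus a list of `Labels`, each with `Name` and `Score`). In the Lua source of this change, the overall toxicity check also runs when categories are configured, after the per-category checks.

```lua
-- Hypothetical pure-Lua restatement of the plugin's threshold checks.
-- `result` mimics one moderation result:
--   { Toxicity = <number>, Labels = { { Name = <string>, Score = <number> }, ... } }
local DEFAULT_TOXICITY_LEVEL = 0.5

local function moderate(conf, result)
    -- 1. per-category thresholds, only for the categories that are configured
    local categories = conf.moderation_categories
    if categories then
        for _, label in ipairs(result.Labels or {}) do
            local threshold = categories[label.Name]
            if threshold and label.Score > threshold then
                return 400, "request body exceeds " .. label.Name .. " threshold"
            end
        end
    end

    -- 2. overall toxicity score
    local level = conf.toxicity_level or DEFAULT_TOXICITY_LEVEL
    if result.Toxicity > level then
        return 400, "request body exceeds toxicity threshold"
    end

    -- returning nothing means the request is allowed through
end

local sample = {
    Toxicity = 0.3,
    Labels = {
        { Name = "PROFANITY", Score = 0.9 },
        { Name = "INSULT", Score = 0.1 },
    },
}

print(moderate({ moderation_categories = { PROFANITY = 0.5 } }, sample))
print(moderate({ toxicity_level = 0.2 }, sample))
```

Scores are compared with a strict `>`, so a score exactly equal to its threshold does not trigger a rejection.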
1 change: 1 addition & 0 deletions t/admin/plugins.t
@@ -95,6 +95,7 @@ proxy-cache
body-transformer
ai-prompt-template
ai-prompt-decorator
content-moderation
proxy-mirror
proxy-rewrite
workflow