Improve redis semantic cache implementation #5412

Merged
1 change: 1 addition & 0 deletions .circleci/requirements.txt
@@ -5,6 +5,7 @@ tiktoken
 importlib_metadata
 cohere
 redis
+redisvl==0.3.3
 anthropic
 orjson==3.9.15
 pydantic==2.7.1
3 changes: 0 additions & 3 deletions Dockerfile
@@ -35,9 +35,6 @@ RUN pip install dist/*.whl
 # install dependencies as wheels
 RUN pip wheel --no-cache-dir --wheel-dir=/wheels/ -r requirements.txt
 
-# install semantic-cache [Experimental]- we need this here and not in requirements.txt because redisvl pins to pydantic 1.0
-RUN pip install redisvl==0.0.7 --no-deps
-
 # ensure pyjwt is used, not jwt
 RUN pip uninstall jwt -y
 RUN pip uninstall PyJWT -y
3 changes: 0 additions & 3 deletions Dockerfile.database
@@ -50,9 +50,6 @@ COPY --from=builder /wheels/ /wheels/
 # Install the built wheel using pip; again using a wildcard if it's the only file
 RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ && rm -f *.whl && rm -rf /wheels
 
-# install semantic-cache [Experimental]- we need this here and not in requirements.txt because redisvl pins to pydantic 1.0
-RUN pip install redisvl==0.0.7 --no-deps
-
 # ensure pyjwt is used, not jwt
 RUN pip uninstall jwt -y
 RUN pip uninstall PyJWT -y
3 changes: 0 additions & 3 deletions Dockerfile.non_root
@@ -50,9 +50,6 @@ COPY --from=builder /wheels/ /wheels/
 # Install the built wheel using pip; again using a wildcard if it's the only file
 RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ && rm -f *.whl && rm -rf /wheels
 
-# install semantic-cache [Experimental]- we need this here and not in requirements.txt because redisvl pins to pydantic 1.0
-RUN pip install redisvl==0.0.7 --no-deps
-
 # ensure pyjwt is used, not jwt
 RUN pip uninstall jwt -y
 RUN pip uninstall PyJWT -y
21 changes: 14 additions & 7 deletions docs/my-website/docs/caching/all_caches.md
@@ -30,15 +30,21 @@ import litellm
 from litellm import completion
 from litellm.caching import Cache
 
-litellm.cache = Cache(type="redis", host=<host>, port=<port>, password=<password>)
+litellm.cache = Cache(
+    type="redis",
+    host=os.environ["REDIS_HOST"],
+    port=os.environ["REDIS_PORT"],
+    password=os.environ["REDIS_PASSWORD"],
+    ttl=60
+)
 
 # Make completion calls
 response1 = completion(
-    model="gpt-3.5-turbo", 
+    model="gpt-3.5-turbo",
     messages=[{"role": "user", "content": "Tell me a joke."}]
 )
 response2 = completion(
-    model="gpt-3.5-turbo", 
+    model="gpt-3.5-turbo",
     messages=[{"role": "user", "content": "Tell me a joke."}]
 )
 
@@ -72,11 +78,11 @@ litellm.cache = Cache(type="s3", s3_bucket_name="cache-bucket-litellm", s3_regio
 
 # Make completion calls
 response1 = completion(
-    model="gpt-3.5-turbo", 
+    model="gpt-3.5-turbo",
     messages=[{"role": "user", "content": "Tell me a joke."}]
 )
 response2 = completion(
-    model="gpt-3.5-turbo", 
+    model="gpt-3.5-turbo",
     messages=[{"role": "user", "content": "Tell me a joke."}]
 )
 
@@ -90,10 +96,10 @@ response2 = completion(
 
 Install redis
 ```shell
-pip install redisvl==0.0.7
+pip install redisvl==0.3.3
 ```
 
-For the hosted version you can setup your own Redis DB here: https://app.redislabs.com/
+For the hosted version you can setup your own Redis DB here: https://redis.io/try-free/
 
 ```python
 import litellm
@@ -110,6 +116,7 @@ litellm.cache = Cache(
     host=os.environ["REDIS_HOST"],
     port=os.environ["REDIS_PORT"],
     password=os.environ["REDIS_PASSWORD"],
+    ttl=60,
     similarity_threshold=0.8, # similarity threshold for cache hits, 0 == no similarity, 1 = exact matches, 0.5 == 50% similarity
     redis_semantic_cache_embedding_model="text-embedding-ada-002", # this model is passed to litellm.embedding(), any litellm.embedding() model is supported here
 )
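Putting the documented options together, here is a minimal end-to-end sketch of the semantic cache. The `type="redis-semantic"` value is taken from LiteLLM's caching docs rather than from this hunk, and the TTL, threshold, and prompts are illustrative; whether the second call is served from cache depends on the embedding similarity clearing `similarity_threshold`.

```python
import os

import litellm
from litellm import completion
from litellm.caching import Cache

# Semantic cache configuration mirroring the documented options above.
# NOTE: type="redis-semantic" follows LiteLLM's caching docs; the threshold,
# TTL, and prompts below are illustrative values, not prescribed ones.
litellm.cache = Cache(
    type="redis-semantic",
    host=os.environ["REDIS_HOST"],
    port=os.environ["REDIS_PORT"],
    password=os.environ["REDIS_PASSWORD"],
    ttl=60,
    similarity_threshold=0.8,
    redis_semantic_cache_embedding_model="text-embedding-ada-002",
)

response1 = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Tell me a joke."}],
)

# A semantically similar (not identical) prompt; with similarity_threshold=0.8
# this is expected to be answered from the Redis semantic cache.
response2 = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Tell me a short joke."}],
)
```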
14 changes: 9 additions & 5 deletions litellm/_redis.py
@@ -120,11 +120,14 @@ def _get_redis_client_logic(**env_overrides):
         **env_overrides,
     }
 
-    if "url" in redis_kwargs and redis_kwargs["url"] is not None:
-        redis_kwargs.pop("host", None)
-        redis_kwargs.pop("port", None)
-        redis_kwargs.pop("db", None)
-        redis_kwargs.pop("password", None)
+    if "url" in redis_kwargs:
+        if redis_kwargs["url"] is not None:
+            redis_kwargs.pop("host", None)
+            redis_kwargs.pop("port", None)
+            redis_kwargs.pop("db", None)
+            redis_kwargs.pop("password", None)
+        else:
+            redis_kwargs.pop("url", None)
     elif "host" not in redis_kwargs or redis_kwargs["host"] is None:
         raise ValueError("Either 'host' or 'url' must be specified for redis.")
     # litellm.print_verbose(f"redis_kwargs: {redis_kwargs}")
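This hunk changes how a `url` key interacts with the discrete connection fields: a real URL still wins and `host`/`port`/`db`/`password` are dropped, but a `url` key that is present with a `None` value is now removed so the host-based settings are used instead of the stale key leaking into the client constructor. A standalone sketch of that rule (the helper name below is illustrative, not part of litellm):

```python
# Illustrative re-statement of the resolution rule above; not litellm's actual helper.
def resolve_connection_kwargs(**redis_kwargs):
    if "url" in redis_kwargs:
        if redis_kwargs["url"] is not None:
            # A real URL wins: the discrete connection fields are dropped.
            for key in ("host", "port", "db", "password"):
                redis_kwargs.pop(key, None)
        else:
            # A url key that is present but None no longer shadows host/port.
            redis_kwargs.pop("url", None)
    elif redis_kwargs.get("host") is None:
        raise ValueError("Either 'host' or 'url' must be specified for redis.")
    return redis_kwargs


# With url=None alongside host/port, the host-based settings are now kept.
assert resolve_connection_kwargs(url=None, host="localhost", port=6379) == {
    "host": "localhost",
    "port": 6379,
}
```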
@@ -157,6 +160,7 @@ def get_redis_client(**env_overrides):
             new_startup_nodes.append(ClusterNode(**item))
         redis_kwargs.pop("startup_nodes")
         return redis.RedisCluster(startup_nodes=new_startup_nodes, **cluster_kwargs)
+
     return redis.Redis(**redis_kwargs)
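For the cluster path shown in this hunk, plain dicts in `startup_nodes` are converted one by one into `redis.cluster.ClusterNode` objects and a `redis.RedisCluster` client is returned instead of a standalone `redis.Redis`. A hypothetical invocation (the node addresses are made up, and the host/url validation from `_get_redis_client_logic` above still applies):

```python
import os

from litellm._redis import get_redis_client

# Hypothetical cluster setup: each dict below is splatted into ClusterNode(**item)
# by the loop shown above, so a redis.RedisCluster client is returned.
cluster_client = get_redis_client(
    host=os.environ["REDIS_HOST"],  # keeps the 'host or url' check satisfied
    password=os.environ.get("REDIS_PASSWORD"),
    startup_nodes=[
        {"host": "redis-node-1", "port": 7000},
        {"host": "redis-node-2", "port": 7001},
    ],
)
```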

