feat: Batch encoding for TEI encoder #423

Vits-99 · 2024-09-20T20:19:14Z

User description

Related to PR#414.

From user:

Hi everyone.

I wanted to use Text Embeddings Inference (TEI) with the encoder, but I noticed two small bugs in the code. I believe that the HFEndpointEncoder was intentionally created to be used with TEI (right?)

The retry loop over max_retries inside the query function has no break or any other way to return the result once we get a successful response. I added a break, similar to the OpenAI encoder.
The response from TEI is [[[array]]]: the array is wrapped in a list of a list. I remove one list level when receiving the response. Without this, comparing the vectors throws a dimension error.
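The two fixes described above can be sketched roughly as follows. This is a minimal illustration, not the actual `HFEndpointEncoder.query` code: `query_with_retries`, `unwrap_single`, and the `send` callable are hypothetical names, and the `[[[...]]]` response shape is taken from the description above.

```python
import time
from typing import Any, Callable, List, Optional


def query_with_retries(
    send: Callable[[], Any],
    max_retries: int = 3,
    delay: float = 0.0,
) -> Any:
    """Call `send` until it succeeds, returning as soon as it does."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries):
        try:
            # Returning here plays the role of the missing `break`:
            # the loop stops on the first successful response.
            return send()
        except Exception as exc:  # assumption: real code may catch narrower errors
            last_error = exc
            if attempt < max_retries - 1:
                time.sleep(delay)
    raise ValueError(f"Query failed after {max_retries} attempts") from last_error


def unwrap_single(response: List[Any]) -> List[Any]:
    """Strip one outer list from a `[[[...]]]`-shaped response."""
    if (
        len(response) == 1
        and isinstance(response[0], list)
        and response[0]
        and isinstance(response[0][0], list)
    ):
        return response[0]
    return response
```

A response that is already a list of vectors passes through `unwrap_single` unchanged, so the helper is safe to apply to both shapes.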

These are the main bugs, but I would also like to take some time to propose a future update. With TEI we can send a batch of texts:

curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs":["Today is a nice day", "I like you"]}' \
    -H 'Content-Type: application/json'

To save time, we could batch the different sentences and send them to the endpoint together. This would be great for longer documents. If it sounds interesting, I can try to help develop it.
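A rough Python sketch of the batching idea, assuming the `/embed` endpoint and `{"inputs": [...]}` payload shown in the curl call above. The helper names (`chunked`, `post_embed`, `embed_in_batches`) and the default `batch_size` are hypothetical and chosen only for illustration; this is not the code in this PR.

```python
import json
import urllib.request
from typing import Callable, Iterator, List, Sequence

Embedding = List[float]


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_embed(
    texts: Sequence[str],
    url: str = "http://127.0.0.1:8080/embed",  # same address as the curl call
) -> List[Embedding]:
    """Send one batch of texts, mirroring the curl request."""
    body = json.dumps({"inputs": list(texts)}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def embed_in_batches(
    docs: Sequence[str],
    embed_fn: Callable[[Sequence[str]], List[Embedding]] = post_embed,
    batch_size: int = 32,  # illustrative default, not a documented value
) -> List[Embedding]:
    """Embed `docs` batch by batch, preserving the input order."""
    embeddings: List[Embedding] = []
    for batch in chunked(docs, batch_size):
        embeddings.extend(embed_fn(batch))
    return embeddings
```

Passing `embed_fn` explicitly keeps the batching logic testable without a running server, for example `embed_in_batches(docs, embed_fn=my_fake, batch_size=2)`.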

By the way, should I use semantic router for splitting text, or the semantic chunkers?

PR Type

enhancement, bug fix

Description

Implemented batch processing for the Text Embeddings Inference, allowing multiple documents to be processed in a single query with a batch size of 50.
Fixed the handling of the TEI response to correctly process nested lists, preventing dimension errors.
Added a break statement in the retry loop within the query method to exit upon a successful response, improving efficiency.
Enhanced error handling to provide clearer error messages when no embeddings are returned for a batch.

Changes walkthrough 📝

Relevant files

Enhancement

huggingface.py: `Implement batch processing and fix response handling in TEI encoder`

semantic_router/encoders/huggingface.py (+13/-8)

Implemented batch processing for document embeddings with a batch size of 50.
Fixed the response handling to correctly process nested lists in the output.
Added a break statement in the retry loop to stop on successful response.
Improved error handling for batch processing.


github-actions · 2024-09-20T20:20:00Z

PR Reviewer Guide 🔍

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Key issues to review

Error Handling
The error handling in the batch processing might suppress specific errors which could be useful for debugging. Consider logging the error before raising a new one to maintain the error context.

List Processing
The condition to check if the output is a list might not correctly handle nested lists as expected from the PR description. This could lead to incorrect embeddings structure.
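One way to read the error-handling point is sketched below: log the original exception with its traceback, then re-raise with `from` so the cause stays attached. The `embed_batch` function and its `query` parameter are hypothetical stand-ins, not the encoder's actual method; the error messages are the ones that appear in this PR.

```python
import logging
from typing import Callable, List, Sequence

logger = logging.getLogger(__name__)


def embed_batch(
    batch: Sequence[str],
    query: Callable[[Sequence[str]], List[List[float]]],
) -> List[List[float]]:
    """Embed one batch, keeping the original error visible on failure."""
    try:
        outputs = query(batch)
    except Exception as exc:
        # Logs the message plus the full traceback of the original error.
        logger.exception("Embedding query failed for a batch of %d documents", len(batch))
        # `from exc` keeps the original exception as `__cause__`.
        raise ValueError(f"No embeddings returned for batch. Error: {exc}") from exc
    if not outputs:
        raise ValueError("No embeddings returned from the query.")
    return outputs
```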

github-actions · 2024-09-20T20:20:03Z

PR Code Suggestions ✨

Category: Enhancement (score 8)
Suggestion: Make batch size configurable by adding it as a method parameter
Instead of using a hardcoded batch size, consider making `batch_size` a parameter of the method with a default value. This will make the method more flexible and allow users to specify a different batch size if needed.
semantic_router/encoders/huggingface.py [215]

    -batch_size = 50
    +def __call__(self, docs: List[str], batch_size: int = 50) -> List[List[float]]:

Suggestion importance[1-10]: 8
Why: Making the batch size configurable enhances the flexibility of the method, allowing users to adjust it based on their specific needs and constraints, which is a significant improvement.

Category: Enhancement (score 6)
Suggestion: Improve error handling by specifying exception types in the `query` method
Consider adding a specific exception type or custom exception message for different error scenarios in the `query` method to improve error traceability and handling.
semantic_router/encoders/huggingface.py [228]

    +except requests.exceptions.HTTPError as e:
    +    raise ValueError(f"HTTP error occurred: {e}") from e
    +except requests.exceptions.ConnectionError as e:
    +    raise ValueError(f"Connection error occurred: {e}") from e
     except Exception as e:
    -    raise ValueError(f"No embeddings returned for batch. Error: {e}") from e
    +    raise ValueError(f"An unexpected error occurred: {e}") from e

Suggestion importance[1-10]: 6
Why: Specifying exception types can improve error traceability and handling, making the code more maintainable and easier to debug, but the improvement is not critical unless specific exceptions are expected frequently.

Category: Possible bug (score 7)
Suggestion: Add error handling for non-list `outputs` to prevent runtime errors
Add error handling for the case when `outputs` is not a list, as currently, the code assumes `outputs` will always be a list. This could lead to unexpected errors if the structure of `outputs` changes.
semantic_router/encoders/huggingface.py [223-226]

    -if isinstance(outputs[0], list):
    -    embeddings.extend(outputs)
    +if isinstance(outputs, list):
    +    if all(isinstance(item, list) for item in outputs):
    +        embeddings.extend(outputs)
    +    else:
    +        raise ValueError("Expected a list of lists as output.")
     else:
    -    embeddings.append(outputs)
    +    raise ValueError("Expected a list as output.")

Suggestion importance[1-10]: 7
Why: Adding error handling for unexpected output types improves the robustness of the code by preventing potential runtime errors, though it may not be crucial if the output format is well-defined.

Category: Best practice (score 5)
Suggestion: Use Pythonic way to check for empty lists
Instead of checking if `outputs` is empty with `len(outputs) == 0`, use the more Pythonic `not outputs`.
semantic_router/encoders/huggingface.py [221-222]

    -if not outputs or len(outputs) == 0:
    +if not outputs:
         raise ValueError("No embeddings returned from the query.")

Suggestion importance[1-10]: 5
Why: While using `not outputs` is more Pythonic and slightly improves readability, the existing code is functionally correct, so the improvement is minor.
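For the output-shape suggestion, a stricter check could look something like the sketch below. `check_embeddings` and its `expected_count` parameter are hypothetical and go a bit further than the bot's suggestion (they also compare the number of vectors and their dimensions); whether that strictness is wanted in the encoder is a design choice for the maintainers.

```python
from numbers import Real
from typing import Any, List


def check_embeddings(outputs: Any, expected_count: int) -> List[List[float]]:
    """Return `outputs` unchanged if it is one numeric vector per input."""
    if not isinstance(outputs, list) or not outputs:
        raise ValueError("Expected a non-empty list as output.")
    if not all(isinstance(item, list) and item for item in outputs):
        raise ValueError("Expected a list of non-empty lists as output.")
    for item in outputs:
        for value in item:
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real):
                raise ValueError("Expected every embedding to contain only numbers.")
    if len(outputs) != expected_count:
        raise ValueError(
            f"Expected {expected_count} embeddings, got {len(outputs)}."
        )
    if len({len(item) for item in outputs}) != 1:
        raise ValueError("Embeddings have inconsistent dimensions.")
    return outputs
```

With this check, an extra level of nesting such as `[[[0.1, 0.2]]]` is rejected instead of silently producing a wrongly shaped embeddings list.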

codecov · 2024-09-20T20:27:56Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.37%. Comparing base (5603bac) to head (1db96e6).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #423      +/-   ##
==========================================
+ Coverage   68.04%   68.37%   +0.33%     
==========================================
  Files          46       46              
  Lines        3505     3510       +5     
==========================================
+ Hits         2385     2400      +15     
+ Misses       1120     1110      -10


joaomsimoes · 2024-09-24T08:52:43Z

semantic_router/encoders/huggingface.py

@@ -212,19 +212,17 @@ def __call__(self, docs: List[str]) -> List[List[float]]:
            ValueError: If no embeddings are returned for a document.
        """

-        batch_size=50
+        batch_size = 50


ValueError: No embeddings returned for batch. Error: Query failed with status 413: {"error":"batch size 50 > maximum allowed batch size 32","error_type":"Validation"}

Hi @joaomsimoes what HuggingFace TEI model were you using when you encountered this error?

Sorry for the late answer @Siraj-Aizlewood

I was using Alibaba-NLP/gte-large-en-v1.5
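One possible way to cope with the 413 error in this thread is to read the limit out of the error text and retry with smaller batches. The sketch below assumes the message format quoted above ("maximum allowed batch size 32"), which may differ between TEI versions or deployments; `max_batch_size_from_error` and `embed_with_fallback` are hypothetical helpers, not part of semantic-router.

```python
import re
from typing import Callable, List, Optional, Sequence

_LIMIT_PATTERN = re.compile(r"maximum allowed batch size (\d+)")


def max_batch_size_from_error(message: str) -> Optional[int]:
    """Extract the server's batch limit from an error message, if present."""
    match = _LIMIT_PATTERN.search(message)
    return int(match.group(1)) if match else None


def embed_with_fallback(
    docs: Sequence[str],
    embed_fn: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 50,  # the batch size used in this PR
) -> List[List[float]]:
    """Embed in batches, shrinking the batch size if the server rejects it."""
    embeddings: List[List[float]] = []
    start = 0
    while start < len(docs):
        batch = docs[start:start + batch_size]
        try:
            embeddings.extend(embed_fn(batch))
        except ValueError as exc:
            limit = max_batch_size_from_error(str(exc))
            # Re-raise unless the error names a usable, smaller limit.
            if limit is None or not 0 < limit < batch_size:
                raise
            batch_size = limit
            continue  # retry the same position with the smaller batch
        start += len(batch)
    return embeddings
```

A simpler alternative is to expose `batch_size` to the caller and choose a value the server accepts up front.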


joaomsimoes and others added 6 commits September 11, 2024 18:58

Text Embeddings Inference update

8198e13

batch Text Embeddings Inference

bfce487

batch Text Embeddings Inference

3328286

Fixed embeddings list

1ed8159

Merge branch 'main' into vittorio/tei_update

79f50b4

Fix on embeddings list

2764adf

Vits-99 added the feature New feature request label Sep 20, 2024

Vits-99 requested a review from jamescalam September 20, 2024 20:19

Vits-99 self-assigned this Sep 20, 2024

github-actions bot added enhancement Enhancement to existing features Bug fix Review effort [1-5]: 2 labels Sep 20, 2024

jamescalam mentioned this pull request Sep 21, 2024

Text Embeddings Inference update #419

Closed

joaomsimoes reviewed Sep 24, 2024


Siraj-Aizlewood and others added 8 commits October 4, 2024 23:51

Gave an example of a model that works, and demonstrated batch processing.

ab75c8a

Linting.

94d7f04

New query exception pytest for TestHFEndpointEncoder.

1a6b694

More pytests for TestHFEndpointEncoder to increase coverage.

f99539b

More pytests.

b689569

Update huggingface-endpoint.ipynb

81cf262

Linting.

2d5c336

Merge branch 'main' into vittorio/tei_update

1db96e6

jamescalam approved these changes Nov 2, 2024


joaomsimoes Sep 24, 2024

Siraj-Aizlewood Oct 4, 2024

joaomsimoes Oct 15, 2024
