feat: Batch encoding for TEI encoder #423

Open · wants to merge 6 commits into main

Conversation

@Vits-99 (Contributor) commented Sep 20, 2024

User description

Related to PR#414.

From user:

Hi everyone.

I wanted to use Text Embeddings Inference (TEI) with the encoder, but I noticed two small bugs in the code. I believe the HFEndpointEncoder was intentionally created to be used with TEI (right?)

  1. The retry loop over max_retries attempts inside the query function has no break or any other way to return the result on a successful response. I added a break, similar to the OpenAI encoder (see the sketch after this list).

  2. The response from TEI is [[[array]]]: the embedding array is nested inside a list of a list. I remove one level of nesting when receiving the response; without this, comparing the vectors throws a dimension error.
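
A minimal sketch of what the two fixes might look like, assuming a requests-based query helper similar to the one in huggingface.py (the names, signature, and response shape below are illustrative, not the exact code in the PR):

    import requests

    def query_tei(api_url: str, headers: dict, text: str, max_retries: int = 3) -> list:
        """Sketch: POST one document to a TEI endpoint and return its embedding."""
        outputs = None
        for attempt in range(1, max_retries + 1):
            try:
                response = requests.post(api_url, headers=headers, json={"inputs": text})
                response.raise_for_status()
                outputs = response.json()
                break  # fix 1: leave the retry loop on the first successful response
            except Exception as e:
                if attempt == max_retries:
                    raise ValueError(f"Query failed after {attempt} attempts: {e}") from e
        # fix 2: TEI wraps the vector one level too deep ([[[...]]]),
        # so strip the outer list before returning
        if isinstance(outputs, list) and len(outputs) == 1 and isinstance(outputs[0], list):
            outputs = outputs[0]
        return outputs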

These are the main bugs, but I would also like to propose a future update. With TEI we can send a batch of texts:

curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs":["Today is a nice day", "I like you"]}' \
    -H 'Content-Type: application/json'

To save time, we could send the different sentences to the endpoint in batches. This would be great for longer documents. If it sounds interesting, I can try to help develop it; a rough sketch follows.
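
A rough sketch of client-side batching against TEI's /embed route (the helper name and the batch_size default are hypothetical):

    import requests

    def embed_batched(api_url: str, headers: dict, docs: list, batch_size: int = 50) -> list:
        """Sketch: embed many documents by posting fixed-size batches to /embed."""
        embeddings = []
        for i in range(0, len(docs), batch_size):
            batch = docs[i : i + batch_size]
            # TEI's /embed accepts {"inputs": [...]} and returns one vector per input
            response = requests.post(api_url, headers=headers, json={"inputs": batch})
            response.raise_for_status()
            embeddings.extend(response.json())
        return embeddings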

By the way, should I use semantic-router for splitting text, or semantic-chunkers?


PR Type

enhancement, bug fix


Description

  • Implemented batch processing for the Text Embeddings Inference encoder, allowing multiple documents to be processed in a single query with a batch size of 50.
  • Fixed the handling of the TEI response to correctly process nested lists, preventing dimension errors.
  • Added a break statement in the retry loop within the query method to exit upon a successful response, improving efficiency.
  • Enhanced error handling to provide clearer error messages when no embeddings are returned for a batch.

Changes walkthrough 📝

Relevant files
Enhancement
huggingface.py
Implement batch processing and fix response handling in TEI encoder

semantic_router/encoders/huggingface.py

  • Implemented batch processing for document embeddings with a batch size
    of 50.
  • Fixed the response handling to correctly process nested lists in the
    output.
  • Added a break statement in the retry loop to stop on successful
    response.
  • Improved error handling for batch processing.
  • +13/-8   

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    @Vits-99 Vits-99 added the feature New feature request label Sep 20, 2024
    @Vits-99 Vits-99 self-assigned this Sep 20, 2024
    @github-actions github-actions bot added enhancement Enhancement to existing features Bug fix Review effort [1-5]: 2 labels Sep 20, 2024

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Error Handling
    The error handling in the batch processing might suppress specific errors which could be useful for debugging. Consider logging the error before raising a new one to maintain the error context (see the sketch after this list).

    List Processing
    The condition to check if the output is a list might not correctly handle nested lists as expected from the PR description. This could lead to incorrect embeddings structure.
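
    A minimal sketch of the logging pattern the first issue suggests, assuming a module-level logger (the wrapper name and the query callable are illustrative, not part of the PR):

        import logging
        from typing import List

        logger = logging.getLogger(__name__)

        def embed_batch(query, batch: List[str]) -> List[List[float]]:
            """Sketch: log the original error before re-raising a clearer one."""
            try:
                return query({"inputs": batch})
            except Exception as e:
                # log first so the original error context survives for debugging,
                # then re-raise with a clearer message, chaining the original
                logger.error("TEI query failed for batch of %d docs: %s", len(batch), e)
                raise ValueError(f"No embeddings returned for batch. Error: {e}") from e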


    github-actions bot commented Sep 20, 2024

    PR Code Suggestions ✨

    Category | Suggestion | Score
    Enhancement
    Make batch size configurable by adding it as a method parameter

    Instead of using a hardcoded batch size, consider making batch_size a parameter of
    the method with a default value. This will make the method more flexible and allow
    users to specify a different batch size if needed.

    semantic_router/encoders/huggingface.py [215]

    -batch_size = 50
    +def __call__(self, docs: List[str], batch_size: int = 50) -> List[List[float]]:
     
    Suggestion importance[1-10]: 8

    Why: Making the batch size configurable enhances the flexibility of the method, allowing users to adjust it based on their specific needs and constraints, which is a significant improvement.

    Improve error handling by specifying exception types in the query method

    Consider adding a specific exception type or custom exception message for different
    error scenarios in the query method to improve error traceability and handling.

    semantic_router/encoders/huggingface.py [228]

    +except requests.exceptions.HTTPError as e:
    +    raise ValueError(f"HTTP error occurred: {e}") from e
    +except requests.exceptions.ConnectionError as e:
    +    raise ValueError(f"Connection error occurred: {e}") from e
     except Exception as e:
    -    raise ValueError(f"No embeddings returned for batch. Error: {e}") from e
    +    raise ValueError(f"An unexpected error occurred: {e}") from e
     
    Suggestion importance[1-10]: 6

    Why: Specifying exception types can improve error traceability and handling, making the code more maintainable and easier to debug, but the improvement is not critical unless specific exceptions are expected frequently.

    Possible bug
    Add error handling for non-list outputs to prevent runtime errors

    Add error handling for the case when outputs is not a list; currently the code assumes
    outputs will always be a list, which could lead to unexpected errors if the structure
    of outputs changes.

    semantic_router/encoders/huggingface.py [223-226]

    -if isinstance(outputs[0], list):
    -    embeddings.extend(outputs)
    +if isinstance(outputs, list):
    +    if all(isinstance(item, list) for item in outputs):
    +        embeddings.extend(outputs)
    +    else:
    +        raise ValueError("Expected a list of lists as output.")
     else:
    -    embeddings.append(outputs)
    +    raise ValueError("Expected a list as output.")
     
    Suggestion importance[1-10]: 7

    Why: Adding error handling for unexpected output types improves the robustness of the code by preventing potential runtime errors, though it may not be crucial if the output format is well-defined.

    Best practice
    Use Pythonic way to check for empty lists

    Instead of checking if outputs is empty with len(outputs) == 0, use the more
    Pythonic not outputs.

    semantic_router/encoders/huggingface.py [221-222]

    -if not outputs or len(outputs) == 0:
    +if not outputs:
         raise ValueError("No embeddings returned from the query.")
     
    Suggestion importance[1-10]: 5

    Why: While using not outputs is more Pythonic and slightly improves readability, the existing code is functionally correct, so the improvement is minor.


    Copy link

    codecov bot commented Sep 20, 2024

    Codecov Report

    Attention: Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.

    Project coverage is 68.01%. Comparing base (3e6bd22) to head (2764adf).

    Files with missing lines | Patch % | Lines
    semantic_router/encoders/huggingface.py | 80.00% | 2 Missing ⚠️
    Additional details and impacted files
    @@            Coverage Diff             @@
    ##             main     #423      +/-   ##
    ==========================================
    + Coverage   67.99%   68.01%   +0.01%     
    ==========================================
      Files          46       46              
      Lines        3465     3470       +5     
    ==========================================
    + Hits         2356     2360       +4     
    - Misses       1109     1110       +1     
    Flag Coverage Δ
    68.01% <80.00%> (?)

    Flags with carried forward coverage won't be shown. Click here to find out more.

    ☔ View full report in Codecov by Sentry.
    📢 Have feedback on the report? Share it here.

    @@ -212,19 +212,17 @@ def __call__(self, docs: List[str]) -> List[List[float]]:
             ValueError: If no embeddings are returned for a document.
         """
    -        batch_size=50
    +        batch_size = 50


    ValueError: No embeddings returned for batch. Error: Query failed with status 413: {"error":"batch size 50 > maximum allowed batch size 32","error_type":"Validation"}
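
    This 413 shows that a TEI deployment can enforce a maximum batch size (32 here) below the hardcoded 50, which is exactly what the configurable batch_size suggestion above would address. A hypothetical call, assuming that suggested signature is adopted and encoder is an HFEndpointEncoder instance:

        # pick a batch size at or below the server's maximum allowed (32 here)
        embeddings = encoder(docs, batch_size=32)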
