v0 add autoquant #402
base: main
Conversation
PR Summary
This PR introduces support for 'autoquant', a new automatic quantization feature in the Infinity project. The changes span multiple files and include implementation, documentation, and testing updates.
- Added 'autoquant' as a new option in the Dtype enum and CLI documentation, enabling automatic quantization for improved model performance
- Implemented 'autoquant' support in the SentenceTransformerPatched class and quantization interface
- Added 'torchao' dependency to pyproject.toml, likely to support the new autoquant functionality
- Created a new test function to verify the autoquant feature's effectiveness and accuracy
- Updated README with information on new multi-modal support (CLIP, CLAP) and text classification capabilities
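The `Dtype` change in the first bullet is typically a one-line enum addition; a minimal sketch, assuming a string-valued enum (the members other than `autoquant` are illustrative placeholders, not the actual contents of infinity_emb's `Dtype`):

```python
from enum import Enum

class Dtype(str, Enum):
    # illustrative subset of pre-existing options
    float32 = "float32"
    float16 = "float16"
    int8 = "int8"
    # new option: delegate per-layer quantization choices to torchao
    autoquant = "autoquant"

# CLI parsers can now accept the new value directly.
print(Dtype("autoquant").value)  # → autoquant
```

Subclassing `str` keeps the enum interchangeable with plain strings, which is convenient for CLI argument parsing.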
9 file(s) reviewed, 4 comment(s)
```diff
@@ -7,6 +7,7 @@
 import numpy as np
 import requests  # type: ignore
+import torch.ao.quantization
```
style: This import is unused in the current file. Consider removing it if not needed.
```python
model = torch.quantization.quantize_dynamic(
    model.to("cpu"),  # the original model
    {torch.nn.Linear},  # a set of layers to dynamically quantize
    dtype=torch.qint8,
)
model = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```
logic: Two quantization methods are applied sequentially. This might lead to unexpected behavior or reduced model performance. Consider using only one method or clarify why both are necessary.
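To illustrate the reviewer's point: `torch.quantization` is the deprecated alias of `torch.ao.quantization`, so both calls invoke the same routine, and the second pass runs on a model whose `nn.Linear` layers have already been swapped out. A minimal sketch (`TinyModel` is a hypothetical stand-in for the patched model, not code from this PR):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

class TinyModel(nn.Module):
    """Hypothetical stand-in for the model being quantized."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        return self.fc(x)

model = TinyModel().eval()

# One call is enough: matching nn.Linear layers are replaced with
# dynamically quantized equivalents (qint8 weights, fp32 activations).
qmodel = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# After the swap, qmodel.fc is no longer an nn.Linear, so a second
# quantize_dynamic pass over {torch.nn.Linear} has nothing left to match.
assert not isinstance(qmodel.fc, nn.Linear)
out = qmodel(torch.randn(2, 8))
print(out.shape)
```

Dropping one of the two calls (or guarding the second behind a condition) would make the intent explicit.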
```python
        bettertransformer=False,
    )
)
sentence = "This is a test sentence."
```
style: This variable is unused and can be removed.
```python
if __name__ == "__main__":
    test_autoquant_quantization()
```
style: Running a single test function in main might not be ideal. Consider using a test runner or removing this block if not necessary.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

```
@@ Coverage Diff @@
##             main     #402      +/-   ##
==========================================
- Coverage   79.01%   73.24%   -5.77%
==========================================
  Files          40       40
  Lines        3173     3184      +11
==========================================
- Hits         2507     2332     -175
- Misses        666      852     +186
```

☔ View full report in Codecov by Sentry.
This pull request introduces several changes to the `infinity_emb` library, focusing on adding support for a new `autoquant` data type, updating documentation, and improving the quantization process. The most important changes include adding the `autoquant` data type, updating the CLI documentation, modifying the quantization logic, and adding unit tests for `autoquant` quantization.

New Features:
- Added the `autoquant` data type to the `Dtype` enum in `libs/infinity_emb/infinity_emb/primitives.py`.
- Implemented quantization support for `autoquant` in `libs/infinity_emb/infinity_emb/transformer/quantization/interface.py` and `libs/infinity_emb/infinity_emb/transformer/quantization/quant.py` [1] [2].

Documentation Updates:
- Documented the `autoquant` option in `docs/docs/cli_v2.md`.

Codebase Improvements:
- Updated the `Makefile` to use `poetry run` for generating OpenAPI and CLI v2 documentation in `libs/infinity_emb/Makefile` [1] [2].

Dependency Updates:
- Added `torchao` as an optional dependency in `libs/infinity_emb/pyproject.toml` [1] [2].

Testing Enhancements:
- Added unit tests for `autoquant` quantization in `libs/infinity_emb/tests/unit_test/transformer/quantization/test_interface.py`.
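The essence of such a quantization test is checking that embeddings from the quantized model stay close to those of the original. A generic sketch of that check (the embedding vectors below are fabricated for illustration, and `cosine_similarity` is a local helper, not an infinity_emb API):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative embeddings: the quantized model's output should differ
# from the original only by small quantization noise.
original = np.array([0.2, -0.5, 0.7, 0.1])
quantized = np.array([0.19, -0.52, 0.69, 0.12])

sim = cosine_similarity(original, quantized)
# A typical acceptance criterion: embedding direction is preserved.
assert sim > 0.95
```

Thresholding on cosine similarity rather than exact equality is the usual choice here, since quantization trades a small amount of numeric precision for speed and memory.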