TGI: export model if configuration is cached (#445)
* feat(cache): use one registry per optimum version

* feat(registry): use model_type as primary key

  This makes it possible to identify cached configurations that can be applied to models that differ only by their weights, like meta-llama/Llama-2-7b-hf and meta-llama/Llama-2-7b-chat-hf.

  It also makes it possible to look up cached configurations for local model folders containing a model config.

* doc(cache): fix image link

* doc(cache): add cache lookup

* refactor(decoder): add get_export_config helper

* feat(tgi): export model if cached

* review: addressing code comments

* wip

* review: address doc comments
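To illustrate why keying the registry on model_type lets checkpoints that differ only by their weights share cached configurations, here is a minimal sketch. It is not the code from this commit: `ExportConfig`, `Registry`, and their fields are hypothetical names chosen for illustration, and the sketch ignores any other config fields a real lookup would likely need to compare.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportConfig:
    # Hypothetical export parameters; real cached entries may hold other fields.
    batch_size: int
    sequence_length: int
    num_cores: int


class Registry:
    """Toy registry mapping a model_type to the export configs cached for it."""

    def __init__(self):
        self._entries = {}

    def add(self, model_type, export_config):
        # Group cached export configs under the model_type key.
        self._entries.setdefault(model_type, []).append(export_config)

    def lookup(self, model_config):
        # model_config stands in for the parsed contents of a model's config.json.
        return list(self._entries.get(model_config["model_type"], []))


registry = Registry()
registry.add("llama", ExportConfig(batch_size=1, sequence_length=2048, num_cores=2))

# Two hub checkpoints that differ only by their weights, plus a local folder
# whose config carries no hub identifier at all.
base_config = {"model_type": "llama", "_name_or_path": "meta-llama/Llama-2-7b-hf"}
chat_config = {"model_type": "llama", "_name_or_path": "meta-llama/Llama-2-7b-chat-hf"}
local_config = {"model_type": "llama"}

# All three resolve to the same cached entries because only model_type is used as key.
cached = registry.lookup(chat_config)
```

Because the lookup never consults the model identifier, a local folder with a valid model config resolves the same way as a hub checkpoint in this sketch.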
Showing 11 changed files with 386 additions and 115 deletions.