feat(docs): [SCv1] Add a section about loading custom HuggingFace models #5105

Merged
46 changes: 40 additions & 6 deletions doc/source/servers/huggingface.md
@@ -8,12 +8,12 @@ We also support the high performance optimizations provided by the [Transformer

The parameters that are available for you to configure include:

| Name                   | Description                                                                              |
|------------------------|------------------------------------------------------------------------------------------|
| `task`                 | The transformer pipeline task                                                            |
| `pretrained_model`     | The name of the pretrained model in the Hub                                              |
| `pretrained_tokenizer` | The name of the tokenizer in the Hub, if different from the one provided with the model  |
| `optimum_model`        | Boolean flag to enable loading the model with the Optimum framework                      |

## Simple Example
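
A minimal deployment might look like the following sketch, where `gpt2` and the `text-generation` task are illustrative placeholders for any Hub model and its pipeline task:

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: gpt2-model
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      parameters:
      - name: task
        type: STRING
        value: text-generation
      - name: pretrained_model
        type: STRING
        value: gpt2
    name: default
    replicas: 1
```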

@@ -43,7 +43,7 @@ spec:

## Quantized & Optimized Models with Optimum

You can deploy a HuggingFace model loaded using the Optimum library by setting the `optimum_model` parameter.

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: gpt2-optimum-model
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      parameters:
      - name: task
        type: STRING
        value: text-generation
      - name: pretrained_model
        type: STRING
        value: gpt2  # illustrative model name
      - name: optimum_model
        type: BOOL
        value: true
    name: default
    replicas: 1
```
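
With `optimum_model` set to `true`, the server loads the model through the Optimum framework rather than the plain Transformers pipeline, typically as an ONNX Runtime optimized variant of the same Hub model.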

## Custom Model Example

You can deploy a custom HuggingFace model by providing the location of the model artefacts using the `modelUri` field.

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: custom-gpt2-hf-model
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      modelUri: gs://seldon-models/v1.18.0-dev/huggingface/custom-text-generation
      parameters:
      - name: task
        type: STRING
        value: text-generation
    componentSpecs:
    - spec:
        containers:
        - name: transformer
          resources:
            limits:
              cpu: 1
              memory: 4Gi
            requests:
              cpu: 100m
              memory: 3Gi
    name: default
    replicas: 1
```
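
As with other prepackaged servers, the artefacts under `modelUri` are downloaded into the serving container and loaded from the local path; a directory in the standard Transformers `save_pretrained` layout (weights, `config.json`, and tokenizer files) is assumed here.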