feat(docs): [SCv1] Add a section about loading custom HuggingFace models (#5105)

* [HuggingFace] Add section about loading custom models

* Refactor a word

* Change custom HF model location in GCS

* Refactor SeldonDeployment name in yaml
vtaskow authored Aug 29, 2023
1 parent 7c2d4f1 commit 6a00ff9
Showing 1 changed file with 40 additions and 6 deletions.
46 changes: 40 additions & 6 deletions doc/source/servers/huggingface.md
@@ -8,12 +8,12 @@ We also support the high performance optimizations provided by the [Transformer

The parameters that are available for you to configure include:

-| Name | Description |
-| ---- | ----------- |
-| `task` | The transformer pipeline task |
-| `pretrained_model` | The name of the pretrained model in the Hub |
+| Name                   | Description                                                          |
+|------------------------|----------------------------------------------------------------------|
+| `task`                 | The transformer pipeline task                                         |
+| `pretrained_model`     | The name of the pretrained model in the Hub                           |
 | `pretrained_tokenizer` | Transformer name in Hub if different to the one provided with model |
+| `optimum_model`        | Boolean to enable loading model with Optimum framework                |
-| `optimum_model` | Boolean to enable loading model with Optimum framework |

## Simple Example
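
The example itself is collapsed in this diff, so as a rough sketch only: a simple deployment wires the `task` and `pretrained_model` parameters from the table above into the HuggingFace server graph as shown below; the `distilgpt2` model name and the deployment name are assumptions for illustration.

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: gpt2-hf-model               # hypothetical name, for illustration only
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      parameters:
      - name: task
        type: STRING
        value: text-generation
      - name: pretrained_model      # any model name from the HuggingFace Hub; distilgpt2 assumed here
        type: STRING
        value: distilgpt2
    name: default
    replicas: 1
```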

@@ -43,7 +43,7 @@ spec:
## Quantized & Optimized Models with Optimum
-You can deploy a HuggingFace model loaded using the Optimum library by using the `optimum_model` parameter
+You can deploy a HuggingFace model loaded using the Optimum library by using the `optimum_model` parameter.

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
@@ -70,3 +70,37 @@ spec:
    replicas: 1
```
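
Most of the manifest above is collapsed in this diff. The part that distinguishes it from a plain HuggingFace deployment is the graph's `parameters` list, sketched below based on the parameter table above; the `gpt2` model name and the `BOOL` parameter type are assumptions for illustration.

```yaml
      parameters:
      - name: task
        type: STRING
        value: text-generation
      - name: pretrained_model      # assumption: any Hub model that Optimum supports
        type: STRING
        value: gpt2
      - name: optimum_model         # the Boolean flag described in the table above
        type: BOOL                  # assumed parameter type
        value: true
```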

## Custom Model Example

You can deploy a custom HuggingFace model by providing the location of the model artefacts using the `modelUri` field.

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: custom-gpt2-hf-model
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      modelUri: gs://seldon-models/v1.18.0-dev/huggingface/custom-text-generation
      parameters:
      - name: task
        type: STRING
        value: text-generation
    componentSpecs:
    - spec:
        containers:
        - name: transformer
          resources:
            limits:
              cpu: 1
              memory: 4Gi
            requests:
              cpu: 100m
              memory: 3Gi
    name: default
    replicas: 1
```
