v0.0.5: NeuronModel classes and generation methods during training
NeuronModel classes
NeuronModel classes allow you to run inference on Inf1 and Inf2 instances while preserving the Python interface you are used to from Transformers' auto model classes.
Example:
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForSequenceClassification

# Load the tokenizer and the Neuron-exported model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(
    "optimum/distilbert-base-uncased-finetuned-sst-2-english-neuronx"
)
model = NeuronModelForSequenceClassification.from_pretrained(
    "optimum/distilbert-base-uncased-finetuned-sst-2-english-neuronx"
)

# Run inference on the Neuron device exactly as you would with a Transformers model.
inputs = tokenizer("Hamilton is considered to be the best musical of human history.", return_tensors="pt")
outputs = model(**inputs)
Supported tasks are:
- Feature extraction
- Masked language modeling
- Text classification
- Token classification
- Question answering
- Multiple choice
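Each task has a dedicated NeuronModel class that follows the same from_pretrained / call workflow as the example above. As a minimal sketch, question answering could look like the code below, assuming NeuronModelForQuestionAnswering exposes the same interface; the checkpoint name is a placeholder, and any Neuron-compiled question-answering model works the same way:
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForQuestionAnswering

# Placeholder checkpoint name: substitute a question-answering model compiled for Neuron.
tokenizer = AutoTokenizer.from_pretrained("my-org/my-qa-model-neuronx")
model = NeuronModelForQuestionAnswering.from_pretrained("my-org/my-qa-model-neuronx")

question = "Who wrote Hamilton?"
context = "Hamilton is a musical with music, lyrics, and book by Lin-Manuel Miranda."
inputs = tokenizer(question, context, return_tensors="pt")
outputs = model(**inputs)

# Pick the most likely start/end token positions and decode the answer span.
start = outputs.start_logits.argmax(-1).item()
end = outputs.end_logits.argmax(-1).item()
answer = tokenizer.decode(inputs["input_ids"][0, start : end + 1])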
Relevant PR: #45
Generation methods
Two generation methods are now supported, which allows you to perform evaluation with generation when training decoder and seq2seq models.
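In practice, evaluation with generation means decoding predictions with generate() at evaluation steps and scoring the decoded text. The sketch below illustrates the call with plain Transformers APIs on an ordinary seq2seq checkpoint; the model choice is illustrative and not specific to this release:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Illustrative checkpoint; any seq2seq model works the same way.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt")
# Decode predictions the same way an evaluation loop with generation would.
generated = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(generated[0], skip_special_tokens=True))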
Misc
The Optimum CLI now provides two new commands to help manage the cache: