v0.0.5: NeuronModel classes and generation methods during training

@michaelbenayoun michaelbenayoun released this 23 Jun 16:07
· 348 commits to main since this release

NeuronModel classes

NeuronModel classes allow you to run inference on Inf1 and Inf2 instances while preserving the Python interface you are used to from Transformers' auto model classes.

Example:

from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForSequenceClassification

# Load the tokenizer and the Neuron model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(
    "optimum/distilbert-base-uncased-finetuned-sst-2-english-neuronx"
)
model = NeuronModelForSequenceClassification.from_pretrained(
    "optimum/distilbert-base-uncased-finetuned-sst-2-english-neuronx"
)

# Tokenize the input text into PyTorch tensors.
inputs = tokenizer("Hamilton is considered to be the best musical of human history.", return_tensors="pt")

# Call the model the same way you would call a Transformers model.
outputs = model(**inputs)
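
To turn the raw outputs into a readable prediction, a small post-processing step is usually needed. The following is a minimal pure-Python sketch, assuming the output exposes a `logits` field as Transformers' sequence classification outputs do. The helper names `softmax` and `top_label`, the `ID2LABEL` mapping, and the example logit values are illustrative assumptions, not part of optimum-neuron; in practice the mapping would be read from the model's configuration if it provides one.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Assumed label mapping for a binary sentiment checkpoint (hypothetical here).
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}

# Made-up logits standing in for outputs.logits[0].tolist().
example_logits = [-2.0, 2.0]

label, score = top_label(example_logits, ID2LABEL)
print(label, round(score, 3))
```

With a real model, `example_logits` would be replaced by the first row of the model's logits converted to a Python list.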

Supported tasks are:

  • Feature extraction
  • Masked language modeling
  • Text classification
  • Token classification
  • Question answering
  • Multiple choice

Relevant PR: #45

Generation methods

Two generation methods are now supported:

  • Greedy decoding (#70)
  • Beam search (#93)

This allows you to run evaluation with generation while training decoder and seq2seq models.
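
To illustrate how the two methods differ, here is a minimal pure-Python sketch of both. It does not use optimum-neuron or reproduce its implementation: `toy_next_token_probs`, `greedy_decode`, and `beam_search` are hypothetical names, and the probability table is made up so that the two strategies disagree. The beam search shown applies no length normalization, which real implementations often add.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"
StepFn = Callable[[Tuple[str, ...]], Dict[str, float]]


def toy_next_token_probs(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical stand-in for a decoder: next-token probabilities for a prefix."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.4, "y": 0.3, EOS: 0.3},
        ("b",): {"x": 0.9, EOS: 0.1},
    }
    # Any prefix not listed above ends the sequence.
    return table.get(prefix, {EOS: 1.0})


def greedy_decode(step: StepFn, max_len: int = 5) -> List[str]:
    """Pick the single most likely token at every step."""
    seq: Tuple[str, ...] = ()
    for _ in range(max_len):
        probs = step(seq)
        token = max(probs, key=probs.get)
        if token == EOS:
            break
        seq += (token,)
    return list(seq)


def beam_search(step: StepFn, num_beams: int = 2, max_len: int = 5) -> List[str]:
    """Keep the num_beams highest-scoring partial sequences at every step."""
    # Each beam is (cumulative log-probability, tokens, finished flag).
    beams = [(0.0, (), False)]
    for _ in range(max_len):
        candidates = []
        for score, seq, done in beams:
            if done:
                candidates.append((score, seq, True))
                continue
            for token, p in step(seq).items():
                if p <= 0.0:
                    continue
                new_score = score + math.log(p)
                if token == EOS:
                    candidates.append((new_score, seq, True))
                else:
                    candidates.append((new_score, seq + (token,), False))
        beams = sorted(candidates, key=lambda b: b[0], reverse=True)[:num_beams]
        if all(done for _, _, done in beams):
            break
    return list(beams[0][1])


print(greedy_decode(toy_next_token_probs))
print(beam_search(toy_next_token_probs, num_beams=2))
```

In this toy setup, greedy decoding commits to "a" because it is the most likely first token, while beam search with two beams keeps "b" alive and finds that "b x" has a higher overall probability (0.36 versus 0.24). With `num_beams=1`, the beam search above reduces to greedy decoding.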

Misc

The Optimum CLI now provides two new commands to help manage the cache:

  • optimum-cli neuron cache list: To list a remote cache repo on the Hugging Face Hub (#85)
  • optimum-cli neuron cache add: To add compilation files related to a model to a remote cache repo on the Hugging Face Hub (#51)