add README.md for Sentence transformer examples with HPU device #1355

examples/sentence-transformers-training/README.md (24 additions, 0 deletions)
# Examples for Sentence Transformers

We provide three examples that show how to use Sentence Transformers with HPU devices.

- **[training_stsbenchmark.py](https://github.com/huggingface/optimum-habana/tree/main/examples/sentence-transformers-training/sts)** - This example shows how to create a Sentence Transformers model from scratch by combining a pre-trained transformer model (e.g. [`distilbert-base-uncased`](https://huggingface.co/distilbert/distilbert-base-uncased)) with a pooling layer (see the sketch after this list).

- **[training_nli.py](https://github.com/huggingface/optimum-habana/tree/main/examples/sentence-transformers-training/nli)** - This example works on pairs of sentences (a premise and a hypothesis); the task of Natural Language Inference (NLI) is to determine whether the premise entails the hypothesis, contradicts it, or is neutral with respect to it. The [SNLI](https://huggingface.co/datasets/stanfordnlp/snli) and [MultiNLI](https://huggingface.co/datasets/nyu-mll/multi_nli) datasets are commonly used for this task.

- **[training_paraphrases.py](https://github.com/huggingface/optimum-habana/tree/main/examples/sentence-transformers-training/paraphrases)** - This example loads several datasets through Sentence Transformers and constructs batches by sampling examples from the respective dataset.
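
As a rough illustration of the first two examples, the sketch below builds a Sentence Transformers model from a pre-trained transformer plus a pooling layer and attaches an NLI-style softmax classification loss. It uses the upstream `sentence_transformers` API only; the linked scripts additionally adapt training for HPU devices through Optimum Habana, which is not shown here, and their exact loss choices may differ.

```python
from sentence_transformers import SentenceTransformer, losses, models

# Wrap a pre-trained transformer and add a pooling layer on top of it.
word_embedding_model = models.Transformer("distilbert/distilbert-base-uncased", max_seq_length=256)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# For NLI, a softmax classifier over the three labels
# (entailment, contradiction, neutral) is trained on top
# of the sentence embeddings.
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)
```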

### Tested Examples/Models and Configurations

The following table lists the supported examples and the configurations we have validated on Gaudi2.

| Examples                  | General | e5-mistral-7b-instruct | BF16 | Single Card | Multi-Cards |
|---------------------------|---------|------------------------|------|-------------|-------------|
| training_nli.py           | ✔       | ✔                      | ✔    | ✔           | ✔           |
| training_stsbenchmark.py  | ✔       | ✔                      | ✔    | ✔           | ✔           |
| training_paraphrases.py   | ✔       |                        | ✔    | ✔           |             |

Notes:
1. In the table, the column 'General' refers to general models such as mpnet and MiniLM.
2. When the e5-mistral-7b-instruct model is enabled for a test, the single-card run uses LoRA + gradient checkpointing and the multi-card run uses DeepSpeed ZeRO-2/ZeRO-3 to reduce the memory requirement (see the sketch after these notes).
3. For detailed instructions on how to run each example, refer to the README file located in each example's folder.
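
As a rough, hypothetical sketch of the memory-saving setup described in note 2, the snippet below applies LoRA via the `peft` library and enables gradient checkpointing on the underlying transformer. The LoRA hyperparameters and target modules are illustrative assumptions, not the values used by the example scripts; the multi-card DeepSpeed ZeRO-2/ZeRO-3 setup is supplied through a separate DeepSpeed config file and is not shown here.

```python
from peft import LoraConfig, get_peft_model

# Illustrative LoRA settings (assumed values; the example scripts may differ).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # typical attention projections in Mistral-style models
)

# `model` is the SentenceTransformer built earlier; its first module wraps
# the Hugging Face transformer, exposed as `auto_model`.
base_transformer = model[0].auto_model
base_transformer.gradient_checkpointing_enable()  # recompute activations to save memory
model[0].auto_model = get_peft_model(base_transformer, lora_config)
```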