
Added Whisper from Huggingface. #1769

Closed
wants to merge 9 commits

Conversation

@MaanavD (Contributor) commented Jul 14, 2023

Instead of making local changes, using the Hugging Face implementation is easier and more maintainable.

task = SPEECH.RECOGNITION
# https://cdn.openai.com/papers/whisper.pdf says large-v2 was trained with batch size 1024 on 16 GPUs
DEFAULT_EVAL_BSIZE = 64
DEFAULT_TRAIN_BSIZE = 64
Review comment (Contributor):
If training is not implemented, please remove this line.

self.feature_size = 80
self.sequence_length = 3000
input_features = torch.randn(
    size=(self.batch_size, self.feature_size, self.sequence_length),
    device=self.device,
)
self.example_inputs = {"input_features": input_features}
Review comment (Contributor):

Since we are wrapping the model in a different way, we need to implement a customized get_module() here, similar to the upstream code: https://github.com/MaanavD/benchmark/blob/116df9cb937b6921d16eba34fc504776bb40a6ee/torchbenchmark/util/framework/huggingface/model_factory.py#L110

The reason we need get_module() is that this API is used by our downstream benchmarking script: https://github.com/pytorch/pytorch/blob/main/benchmarks/dynamo/torchbench.py#L358, which requires that model(*example_inputs) run successfully.
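To make the reviewer's point concrete, a customized get_module() could look roughly like the following. This is a minimal sketch: the class name, the stand-in torch.nn.Linear model, and the exact return shape are illustrative assumptions, not the PR's actual code; only the attribute names (feature_size, sequence_length, example_inputs) follow the snippet above.

```python
import torch


class WhisperBenchmark:
    """Sketch of a torchbench-style model wrapper (hypothetical names)."""

    def __init__(self, device="cpu", batch_size=8):
        self.device = device
        self.batch_size = batch_size
        self.feature_size = 80
        self.sequence_length = 3000
        # Stand-in for the real HF Whisper model: any module taking one tensor.
        self.model = torch.nn.Linear(self.sequence_length, self.sequence_length).to(device)
        input_features = torch.randn(
            (self.batch_size, self.feature_size, self.sequence_length),
            device=self.device,
        )
        self.example_inputs = {"input_features": input_features}

    def get_module(self):
        # Return (model, inputs) such that model(*inputs) runs, since the
        # downstream dynamo/torchbench.py harness calls with positional args.
        return self.model, (self.example_inputs["input_features"],)


bench = WhisperBenchmark()
model, inputs = bench.get_module()
out = model(*inputs)  # must run successfully for the benchmark harness
```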

@msaroufim msaroufim requested a review from xuzhao9 July 25, 2023 17:59
@msaroufim (Member) commented:

@xuzhao9 - @MaanavD fixed some of the last issues offline, could we get a review please? Skipping CPU eval because it's super slow.

@msaroufim msaroufim self-requested a review July 25, 2023 18:07
@facebook-github-bot: @msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@msaroufim (Member) commented:

@xuzhao9 this is confusing me: the example test is failing, but running it standalone seems fine

(sam) ubuntu@ip-172-31-9-217:~/benchmark$ python test.py -k "test_hf_Whisper_example_cuda" 
F
======================================================================
FAIL: test_hf_Whisper_example_cuda (__main__.TestBenchmark)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/benchmark/test.py", line 75, in example_fn
    assert accuracy == "pass" or accuracy == "eager_1st_run_OOM", f"Expected accuracy pass, get {accuracy}"
AssertionError: Expected accuracy pass, get eager_1st_run_fail

----------------------------------------------------------------------
Ran 1 test in 2.963s

FAILED (failures=1)
(sam) ubuntu@ip-172-31-9-217:~/benchmark$ python test.py -k "test_hf_Whisper_example_cuda" 
F
======================================================================
FAIL: test_hf_Whisper_example_cuda (__main__.TestBenchmark)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/benchmark/test.py", line 75, in example_fn
    assert accuracy == "pass" or accuracy == "eager_1st_run_OOM", f"Expected accuracy pass, get {accuracy}"
AssertionError: Expected accuracy pass, get eager_1st_run_fail

----------------------------------------------------------------------
Ran 1 test in 2.931s

FAILED (failures=1)
(sam) ubuntu@ip-172-31-9-217:~/benchmark$ python run.py hf_Whisper -d cuda
Running eval method from hf_Whisper on cuda in eager mode with input batch size 8 and precision fp16.
GPU Time:             18.128 milliseconds
CPU Total Wall Time:  18.158 milliseconds
GPU 0 Peak Memory:              0.9604 GB
CPU Peak Memory:                0.6455 GB

@xuzhao9 (Contributor) commented Jul 25, 2023:

> @xuzhao9 this is confusing me example test is failing but running it standalone seems fine
> [quoted test output identical to the comment above]

This is because our downstream script, torchbench.py (https://github.com/pytorch/pytorch/blob/main/benchmarks/dynamo/torchbench.py#L437), only accepts positional (list) arguments; it does not accept keyword arguments.

To solve this problem, we need to wrap the model like this: https://github.com/pytorch/benchmark/blob/main/torchbenchmark/util/framework/huggingface/model_factory.py#L42
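The kind of wrapping being described can be sketched as a thin torch.nn.Module that accepts a positional argument and re-routes it to the keyword the Hugging Face model expects. This is a sketch under that assumption; ToyModel stands in for the real Whisper model, and both class names are hypothetical, not the names used in model_factory.py.

```python
import torch


class ArgsToKwargsWrapper(torch.nn.Module):
    """Adapt a model that expects keyword inputs (e.g. input_features=...)
    to the positional model(*example_inputs) calling convention used by
    benchmarks/dynamo/torchbench.py. Illustrative sketch only."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_features):
        # Re-route the positional tensor to the keyword the model expects.
        return self.model(input_features=input_features)


class ToyModel(torch.nn.Module):
    # Stand-in for the HF Whisper model, which takes input_features as a kwarg.
    def forward(self, *, input_features):
        return input_features * 2


wrapped = ArgsToKwargsWrapper(ToyModel())
example_inputs = (torch.ones(2, 3),)
out = wrapped(*example_inputs)  # the positional call now succeeds
```

With this wrapper, the downstream harness can keep calling model(*example_inputs) without knowing anything about the underlying model's keyword signature.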

@facebook-github-bot: @msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@msaroufim msaroufim requested a review from xuzhao9 July 26, 2023 00:30

@xuzhao9 (Contributor) left a comment:

LGTM!

@facebook-github-bot: @msaroufim merged this pull request in 770d5cf.

xuzhao9 pushed a commit that referenced this pull request Jul 26, 2023
Summary:
Instead of making local changes, using the Hugging Face implementation is easier and more maintainable.

Pull Request resolved: #1769

Reviewed By: xuzhao9, cpuhrsch

Differential Revision: D47766556

Pulled By: msaroufim

fbshipit-source-id: 8393776222fc3508bda56c9c71e45d9812e69869