
Added Whisper from Huggingface. #1769

Closed
wants to merge 9 commits

Conversation

@MaanavD (Contributor) commented Jul 14, 2023

Instead of making local changes, using the Hugging Face implementation is easier and more maintainable.

task = SPEECH.RECOGNITION
# https://cdn.openai.com/papers/whisper.pdf says large-v2 was trained with batch size 1024 on 16 GPUs
DEFAULT_EVAL_BSIZE = 64
DEFAULT_TRAIN_BSIZE = 64
Review comment (Contributor):
If training is not implemented, please remove this line.

self.feature_size = 80
self.sequence_length = 3000
input_features = torch.randn(
    size=(self.batch_size, self.feature_size, self.sequence_length),
    device=self.device,
)
self.example_inputs = {"input_features": input_features}
Review comment (Contributor):

Since we are wrapping the model in a different way, we need to implement a customized get_module() here, similar to the upstream code: https://github.com/MaanavD/benchmark/blob/116df9cb937b6921d16eba34fc504776bb40a6ee/torchbenchmark/util/framework/huggingface/model_factory.py#L110

The reason we need get_module() is that this API is used by our downstream benchmarking script: https://github.com/pytorch/pytorch/blob/main/benchmarks/dynamo/torchbench.py#L358, which requires that model(*example_inputs) run successfully.
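To make the reviewer's point concrete, a customized get_module() could look roughly like the following. This is a minimal sketch: the class name, the stand-in torch.nn.Linear model, and the exact return shape are illustrative assumptions, not the PR's actual code; only the attribute names (feature_size, sequence_length, example_inputs) follow the snippet above.

```python
import torch


class WhisperBenchmark:
    """Sketch of a torchbench-style model wrapper (hypothetical names)."""

    def __init__(self, device="cpu", batch_size=8):
        self.device = device
        self.batch_size = batch_size
        self.feature_size = 80
        self.sequence_length = 3000
        # Stand-in for the real HF Whisper model: any module taking one tensor.
        self.model = torch.nn.Linear(self.sequence_length, self.sequence_length).to(device)
        input_features = torch.randn(
            (self.batch_size, self.feature_size, self.sequence_length),
            device=self.device,
        )
        self.example_inputs = {"input_features": input_features}

    def get_module(self):
        # Return (model, inputs) such that model(*inputs) runs, since the
        # downstream dynamo/torchbench.py harness calls with positional args.
        return self.model, (self.example_inputs["input_features"],)


bench = WhisperBenchmark()
model, inputs = bench.get_module()
out = model(*inputs)  # must run successfully for the benchmark harness
```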

@msaroufim msaroufim requested a review from xuzhao9 July 25, 2023 17:59
@msaroufim (Member) commented:

@xuzhao9 - @MaanavD fixed some of the last issues offline, could we get a review please? Skipping CPU eval because it's super slow.

@msaroufim msaroufim self-requested a review July 25, 2023 18:07
@facebook-github-bot: @msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@msaroufim (Member) commented:

@xuzhao9 this is confusing me: the example test is failing, but running it standalone seems fine

(sam) ubuntu@ip-172-31-9-217:~/benchmark$ python test.py -k "test_hf_Whisper_example_cuda" 
F
======================================================================
FAIL: test_hf_Whisper_example_cuda (__main__.TestBenchmark)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/benchmark/test.py", line 75, in example_fn
    assert accuracy == "pass" or accuracy == "eager_1st_run_OOM", f"Expected accuracy pass, get {accuracy}"
AssertionError: Expected accuracy pass, get eager_1st_run_fail

----------------------------------------------------------------------
Ran 1 test in 2.963s

FAILED (failures=1)
(sam) ubuntu@ip-172-31-9-217:~/benchmark$ python test.py -k "test_hf_Whisper_example_cuda" 
F
======================================================================
FAIL: test_hf_Whisper_example_cuda (__main__.TestBenchmark)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/benchmark/test.py", line 75, in example_fn
    assert accuracy == "pass" or accuracy == "eager_1st_run_OOM", f"Expected accuracy pass, get {accuracy}"
AssertionError: Expected accuracy pass, get eager_1st_run_fail

----------------------------------------------------------------------
Ran 1 test in 2.931s

FAILED (failures=1)
(sam) ubuntu@ip-172-31-9-217:~/benchmark$ python run.py hf_Whisper -d cuda
Running eval method from hf_Whisper on cuda in eager mode with input batch size 8 and precision fp16.
GPU Time:             18.128 milliseconds
CPU Total Wall Time:  18.158 milliseconds
GPU 0 Peak Memory:              0.9604 GB
CPU Peak Memory:                0.6455 GB

@xuzhao9 (Contributor) commented Jul 25, 2023:

> @xuzhao9 this is confusing me example test is failing but running it standalone seems fine
> [quoted test output identical to the comment above]

This is because our downstream script, torchbench.py (https://github.com/pytorch/pytorch/blob/main/benchmarks/dynamo/torchbench.py#L437), only accepts positional (list) arguments; it does not accept keyword arguments.

To solve this problem, we need to wrap the model like this: https://github.com/pytorch/benchmark/blob/main/torchbenchmark/util/framework/huggingface/model_factory.py#L42
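The kind of wrapping being described can be sketched as a thin torch.nn.Module that accepts a positional argument and re-routes it to the keyword the Hugging Face model expects. This is a sketch under that assumption; ToyModel stands in for the real Whisper model, and both class names are hypothetical, not the names used in model_factory.py.

```python
import torch


class ArgsToKwargsWrapper(torch.nn.Module):
    """Adapt a model that expects keyword inputs (e.g. input_features=...)
    to the positional model(*example_inputs) calling convention used by
    benchmarks/dynamo/torchbench.py. Illustrative sketch only."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_features):
        # Re-route the positional tensor to the keyword the model expects.
        return self.model(input_features=input_features)


class ToyModel(torch.nn.Module):
    # Stand-in for the HF Whisper model, which takes input_features as a kwarg.
    def forward(self, *, input_features):
        return input_features * 2


wrapped = ArgsToKwargsWrapper(ToyModel())
example_inputs = (torch.ones(2, 3),)
out = wrapped(*example_inputs)  # the positional call now succeeds
```

With this wrapper, the downstream harness can keep calling model(*example_inputs) without knowing anything about the underlying model's keyword signature.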

@facebook-github-bot: @msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@msaroufim msaroufim requested a review from xuzhao9 July 26, 2023 00:30

@xuzhao9 (Contributor) left a comment:

LGTM!

@facebook-github-bot: @msaroufim merged this pull request in 770d5cf.

xuzhao9 pushed a commit that referenced this pull request Jul 26, 2023
Summary:
Instead of making local changes, using the Hugging Face implementation is easier and more maintainable.

Pull Request resolved: #1769

Reviewed By: xuzhao9, cpuhrsch

Differential Revision: D47766556

Pulled By: msaroufim

fbshipit-source-id: 8393776222fc3508bda56c9c71e45d9812e69869