Fix PyTorch CI HUD dashboard missing perf numbers: hf_Whisper #1935

Closed
wants to merge 4 commits

Conversation


@xmfan xmfan commented Sep 25, 2023

A few models were passing the accuracy check but surprisingly failing the perf run, resulting in dashboard entries like:
[screenshot: HUD dashboard entries with missing perf numbers]

Reproducing the HUD's commands locally:

# pass
python benchmarks/dynamo/torchbench.py --accuracy --no-translation-validation --training --amp --backend inductor --disable-cudagraphs --device cuda --total-partitions 4 --partition-id 1 --output hf_Whisper_accuracy.csv --only hf_Whisper

# fail (on https://github.com/pytorch/benchmark/blob/4ea3bba3b8010f5d4a629bb8f530a92570f34518/torchbenchmark/util/model.py#L195C48-L195C48)
python benchmarks/dynamo/torchbench.py --performance --cold-start-latency --training --amp --backend inductor --disable-cudagraphs --device cuda --total-partitions 4 --partition-id 1 --output hf_Whisper_perf.csv --only hf_Whisper

The error suggests that hf_Whisper does not provide a batch size for the training mode perf run.

Summarizing discussion with @xuzhao9:

I think we could:

  1. set a default train batch size for hf_Whisper, if you still want to test the forward/backward pass without a defined train test
  2. in model.py, make sure self.batch_size is not None (before the accuracy check overrides the batch size to 4)

This PR implements option 1: we set default batch sizes in the parent class of all benchmark models, with the ability for individual models to override them.
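
For reference, a minimal sketch of option 1 (illustrative only: the class and attribute names follow torchbenchmark/util/model.py's BenchmarkModel / DEFAULT_TRAIN_BSIZE / DEFAULT_EVAL_BSIZE, but the fallback values and the HfWhisperModel override are made-up placeholders, not the values landed in this PR):

# Sketch of option 1: the parent class supplies fallback batch sizes, and
# individual models override the class attributes as needed.
class BenchmarkModel:
    DEFAULT_TRAIN_BSIZE = 1   # assumed fallback, not the real default
    DEFAULT_EVAL_BSIZE = 1    # assumed fallback, not the real default

    def __init__(self, test, batch_size=None):
        self.test = test
        # A user-provided batch size wins; otherwise fall back to the
        # per-test default defined on the (sub)class.
        if batch_size is not None:
            self.batch_size = batch_size
        elif test == "train":
            self.batch_size = self.DEFAULT_TRAIN_BSIZE
        else:
            self.batch_size = self.DEFAULT_EVAL_BSIZE

class HfWhisperModel(BenchmarkModel):
    # hf_Whisper only declares its own defaults; hypothetical values.
    DEFAULT_TRAIN_BSIZE = 8
    DEFAULT_EVAL_BSIZE = 8

With something like this in place, the perf run's training pass gets a batch size even when the model never defined a train-specific one.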

@xmfan xmfan requested a review from xuzhao9 September 25, 2023 21:15
@xmfan xmfan marked this pull request as ready for review September 25, 2023 21:15
torchbenchmark/util/model.py (review comment, outdated and resolved)
@xmfan xmfan changed the title Set default batch sizes for train/eval benchmark models Fix PyTorch CI HUD dashboard missing perf numbers Sep 26, 2023
@xuzhao9 xuzhao9 left a comment


The code is much clearer now. Thanks for making this improvement!

@facebook-github-bot

@xmfan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@xmfan xmfan changed the title Fix PyTorch CI HUD dashboard missing perf numbers Fix PyTorch CI HUD dashboard missing perf numbers: hf_Whisper Sep 26, 2023
@msaroufim

A lot of models are missing. I'm curious how many more models are affected by this issue. cc @bdhirsh

@facebook-github-bot

@xmfan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@xmfan merged this pull request in 3f1c3eb.

elif self.test == "eval" and (not self.batch_size == self.DEFAULT_EVAL_BSIZE):
-    raise NotImplementedError("Model doesn't support customizing batch size.")
+    raise NotImplementedError(f"Model doesn't support customizing batch size, but {self.test} test is providing a batch size other than DEFAULT_EVAL_BSIZE")
Contributor:
I don't understand why a model would have ALLOW_CUSTOMIZE_BSIZE but we would end up in this branch. For context, I'm looking into why we are not running stable_diffusion_unet in inference

Contributor:
Why don't we just use the batch size of the model instead of failing?

Contributor:
If ALLOW_CUSTOMIZE_BSIZE = False, the model will only accept the default batch size, not the batch size specified by the user.

We could silently use the default batch size instead of failing, but my concern is that this would cause misunderstanding on the user side (for example, the user might think the model is running with batch size 100, while ALLOW_CUSTOMIZE_BSIZE = False and the default batch size is 1, so it would silently run with batch size 1).

Contributor:
If batch_size is passed in as None, it seems okay to use the default specified on the model, instead of the one specified in self.metadata["devices"][current_device_name][device_batch_size_key].

Contributor:
Also, if we're worried about that case, we should also fix this upstream handling of it: https://github.com/pytorch/pytorch/blob/main/benchmarks/dynamo/torchbench.py#L369-L374

Contributor:
> If batch_size is passed in as None, it seems okay to use the default specified on the model, instead of the one specified in self.metadata["devices"][current_device_name][device_batch_size_key].

Right, this is a bug. If ALLOW_CUSTOMIZE_BSIZE = False and batch_size is passed in as None, we should use the default specified on the model instead of the device-specified batch size.
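
As a rough sketch of what that fix could look like (resolve_batch_size is a hypothetical helper, and the metadata lookup is simplified; this is not the actual patch):

# Hypothetical helper illustrating the fix: when ALLOW_CUSTOMIZE_BSIZE is
# False and no batch size was requested, use the model's own default rather
# than the device-specific metadata entry.
def resolve_batch_size(model, requested_batch_size, current_device_name, device_batch_size_key):
    default = (model.DEFAULT_TRAIN_BSIZE if model.test == "train"
               else model.DEFAULT_EVAL_BSIZE)
    if not model.ALLOW_CUSTOMIZE_BSIZE:
        if requested_batch_size is None:
            # The bug discussed above: fall back to the model default instead
            # of the device-specified batch size.
            return default
        if requested_batch_size != default:
            raise NotImplementedError(
                f"Model doesn't support customizing batch size, but {model.test} "
                "test is providing a non-default batch size"
            )
        return requested_batch_size
    # Customization allowed: caller's value, then device metadata, then default.
    device_meta = model.metadata.get("devices", {}).get(current_device_name, {})
    return (requested_batch_size
            or device_meta.get(device_batch_size_key)
            or default)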
