
Enable torchao quantization in framework and group_bench #2116

Closed
wants to merge 1 commit into from

Conversation


@xuzhao9 xuzhao9 commented Jan 16, 2024

Summary:
Support torchao quantization code in the framework.

Add a new config `torch_ao.yaml` in the group_bench userbenchmark.

Differential Revision: D52802534

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D52802534

xuzhao9 added a commit to xuzhao9/benchmark that referenced this pull request Jan 17, 2024
Summary:

Support torchao quantization code in the framework.

Add a new config `torch_ao.yaml` in the group_bench userbenchmark.

Differential Revision: D52802534

@HDCharles
Contributor

Correct me if I'm wrong: the dynamo benchmarks compare compiled vs. non-compiled models, and this change would compare quantized+compiled vs. quantized, right? We'd want to compare quantized+compiled vs. compiled to get something usable.

@xuzhao9
Contributor Author

xuzhao9 commented Jan 25, 2024

Correct me if I'm wrong: the dynamo benchmarks compare compiled vs. non-compiled models, and this change would compare quantized+compiled vs. quantized, right? We'd want to compare quantized+compiled vs. compiled to get something usable.

@HDCharles This change would be comparing quantized+compiled vs. compiled.

model: "*"
test: eval
device: cuda
extra_args: --precision bf16 --torchdynamo inductor --inductor-compile-mode max-autotune
xuzhao9 (Contributor Author)
@HDCharles In this config, we define the baseline extra_args as --precision bf16 --torchdynamo inductor --inductor-compile-mode max-autotune, which is applied to every test_group/subgroup defined below.

test_batch_size_default:
  subgroup:
    - extra_args:
    - extra_args: --quantization int8dynamic
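The combination rule described above can be illustrated with a small sketch. This is not the actual group_bench code; `merge_extra_args` is a hypothetical helper showing how the shared baseline string and a subgroup's extra_args could be merged into the final argument list for each test:

```python
# Hypothetical sketch (not the real group_bench implementation) of how a
# baseline extra_args string and a subgroup's extra_args combine.
import shlex


def merge_extra_args(baseline: str, subgroup: str) -> list:
    """Append subgroup-specific args after the shared baseline args."""
    return shlex.split(baseline) + shlex.split(subgroup)


baseline = "--precision bf16 --torchdynamo inductor --inductor-compile-mode max-autotune"
merged = merge_extra_args(baseline, "--quantization int8dynamic")
print(merged)
```

With an empty subgroup entry, the test runs with only the baseline args, which is what makes the first subgroup the compiled (unquantized) baseline.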
xuzhao9 (Contributor Author)
As shown in the D52802534 test plan, here are the test results:

Running TorchBenchModelConfig(name='resnet50', test='eval', device='cuda', batch_size=None, extra_args=['--precision', 'bf16', '--torchdynamo', 'inductor', '--inductor-compile-mode', 'max-autotune', '--quantization', 'int8dynamic'], extra_env=None, output_dir=None) ... [done]
Running TorchBenchModelConfig(name='resnet50', test='eval', device='cuda', batch_size=None, extra_args=['--precision', 'bf16', '--torchdynamo', 'inductor', '--inductor-compile-mode', 'max-autotune', '--quantization', 'int8weightonly'], extra_env=None, output_dir=None) ... [done]
Running TorchBenchModelConfig(name='resnet50', test='eval', device='cuda', batch_size=None, extra_args=['--precision', 'bf16', '--torchdynamo', 'inductor', '--inductor-compile-mode', 'max-autotune', '--quantization', 'int4weightonly'], extra_env=None, output_dir=None) ... [done]

All of them run with the compiler enabled.
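The three runs above differ only in the value passed to --quantization. A minimal sketch of how such a flag might be declared is below; the actual benchmark's parser may differ, and the parser object here is purely illustrative:

```python
# Hypothetical sketch of a --quantization flag accepting the three torchao
# modes exercised in the test plan above; the real parser may differ.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--quantization",
    choices=["int8dynamic", "int8weightonly", "int4weightonly"],
    help="torchao quantization mode to apply before compilation",
)
args = parser.parse_args(["--quantization", "int8weightonly"])
print(args.quantization)
```

Restricting the flag with `choices` means an unsupported mode fails fast at argument-parsing time rather than mid-benchmark.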

@HDCharles
Contributor

So looking at the run in the test plan, it looks really good.

I see it's collecting latencies; we'd also like to collect peak memory usage and compare everything to the compiled baseline.

Also, ideally run the test at bs=1 for the weight-only quantization types, though that's less important.

HDCharles (Contributor) left a comment

Looks good, though I would like to add a metric for peak CUDA memory usage and, if it can't do that already, compare to the baseline.

device: cuda
extra_args: --precision bf16 --torchdynamo inductor --inductor-compile-mode max-autotune
metrics:
- latencies
xuzhao9 (Contributor Author) Jan 26, 2024
To add CPU/GPU peak memory, add `cpu_peak_mem` and `gpu_peak_mem` here. @HDCharles

@xuzhao9
Contributor Author

xuzhao9 commented Jan 26, 2024

@HDCharles To add CPU and GPU memory, simply add `cpu_peak_mem` and `gpu_peak_mem` to the `metrics` section in the YAML file.
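Putting that together with the config fragments shown earlier, the extended file could look like the sketch below (assuming `cpu_peak_mem` and `gpu_peak_mem` are the metric names the framework accepts, as stated above):

```yaml
# Sketch of torch_ao.yaml extended with peak-memory metrics.
model: "*"
test: eval
device: cuda
extra_args: --precision bf16 --torchdynamo inductor --inductor-compile-mode max-autotune
metrics:
  - latencies
  - cpu_peak_mem
  - gpu_peak_mem
```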

This PR is only a proof-of-concept of what the framework can do. We can leave further development to follow-up PRs.

@facebook-github-bot
Contributor

This pull request has been merged in 52a4b44.

@xuzhao9 xuzhao9 deleted the export-D52802534 branch January 26, 2024 16:01