Commit
Summary:
Pull Request resolved: #3218

X-link: facebookresearch/FBGEMM#315

Adds the FP8 KV cache to the disagg test for mp2. The changes include switching the model to the 7B Llama model: the small model has a D_H of 64, which does not work with the dequantization kernel (the issue will be investigated in a separate diff).

TODO: add FP8 KV cache + paged KV to the test

Reviewed By: jianyuh

Differential Revision: D62772678

fbshipit-source-id: 775f572e2c345354844e24d80e2481284ac6f1a3
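For context on what an FP8 KV cache involves, the idea can be sketched as quantizing the K/V tensors to the FP8 (e4m3) dynamic range with a per-head scale, then rescaling on read. The sketch below is a NumPy simulation under stated assumptions: the names `quantize_kv`/`dequantize_kv` and the `[T, H, D_H]` layout are illustrative, not the FBGEMM kernel API, and the mantissa rounding is a coarse stand-in for real e4m3 hardware behavior.

```python
import numpy as np

# Coarse simulation of FP8 (e4m3) KV-cache quantization with one scale
# per attention head. Illustrative only; not the FBGEMM kernel API.
FP8_E4M3_MAX = 448.0  # largest finite e4m3 value

def _round_e4m3(x: np.ndarray) -> np.ndarray:
    # Keep roughly 3 mantissa bits: round the frexp mantissa
    # (in [0.5, 1)) to steps of 1/16, then clamp to the e4m3 range.
    m, e = np.frexp(x)
    m = np.round(m * 16.0) / 16.0
    return np.clip(np.ldexp(m, e), -FP8_E4M3_MAX, FP8_E4M3_MAX)

def quantize_kv(kv: np.ndarray):
    # kv: [T, H, D_H] float32 slab of the K or V cache.
    # One scale per head maps that head's amax onto the fp8 range.
    amax = np.abs(kv).max(axis=(0, 2), keepdims=True)
    scale = np.where(amax == 0, 1.0, amax / FP8_E4M3_MAX)
    return _round_e4m3(kv / scale).astype(np.float32), scale

def dequantize_kv(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    # Inverse of quantize_kv: rescale the fp8-rounded codes.
    return (q * scale).astype(np.float32)
```

With D_H = 64 (the head dimension the commit says trips the real dequantization kernel), this reference version round-trips within the expected e4m3 relative error, which is what a kernel-level test would compare against.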