
MNIST training example in FP8 #62

Closed
balancap opened this issue Jan 2, 2024 · 2 comments · Fixed by #87
Labels
experiments Experiments

Comments

@balancap
Contributor

balancap commented Jan 2, 2024

We need to validate AutoScale FP8 training on the basic MNIST example to start with.

Several experiments should be run:

  • Fwd pass using the FP8-143 (E4M3) format;
  • Bwd pass using the FP8-152 (E5M2) format;
  • Fwd + bwd combined.

These experiments can be run relying primarily on the JAX `lax.reduce_precision` operator; no FP8 hardware is needed for now.
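A minimal sketch of how such an experiment could be set up (helper names are illustrative, not from the linked PR): `lax.reduce_precision` keeps values in FP32 but rounds them to the target FP8 format, and a `jax.custom_vjp` wrapper lets the fwd pass round to FP8-143 while gradients in the bwd pass are rounded to FP8-152.

```python
import jax
import jax.numpy as jnp

# Hypothetical helpers; the actual implementation is expected in the linked PR.
def round_e4m3(x):
    # FP8-143: 1 sign, 4 exponent, 3 mantissa bits (values stay stored in FP32).
    return jax.lax.reduce_precision(x, exponent_bits=4, mantissa_bits=3)

def round_e5m2(x):
    # FP8-152: 1 sign, 5 exponent, 2 mantissa bits.
    return jax.lax.reduce_precision(x, exponent_bits=5, mantissa_bits=2)

@jax.custom_vjp
def fp8_cast(x):
    # Fwd pass: round activations/weights to the FP8-143 grid.
    return round_e4m3(x)

def fp8_cast_fwd(x):
    return round_e4m3(x), None

def fp8_cast_bwd(_, g):
    # Bwd pass: round incoming gradients to the FP8-152 grid.
    return (round_e5m2(g),)

fp8_cast.defvjp(fp8_cast_fwd, fp8_cast_bwd)

x = jnp.float32(1.1)
print(fp8_cast(x))            # nearest representable FP8-143 value
print(jax.grad(fp8_cast)(x))  # unit cotangent rounded to FP8-152
```

Since FP8-143 has only 3 mantissa bits, 1.1 rounds up to 1.125, while the gradient of this (near-)identity cast stays 1.0 after FP8-152 rounding.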

@balancap added the experiments label on Jan 2, 2024
@balancap
Contributor Author

balancap commented Jan 9, 2024

cc @lyprince in case you're interested :)

@balancap linked a pull request on Jan 16, 2024 that will close this issue
@balancap
Contributor Author

Basic MNIST FP8 training implemented in #87.
Training works as expected, although it is hard to draw strong conclusions from such a simple MNIST model alone.
