
MNIST training example in FP8 #62

Closed
balancap opened this issue Jan 2, 2024 · 2 comments · Fixed by #87
Labels
experiments Experiments

Comments

@balancap
Contributor

balancap commented Jan 2, 2024

We need to validate AutoScale FP8 training on the basic MNIST example to start with.

Several experiments should be run:

  • Fwd pass using the FP8-143 (E4M3) format;
  • Bwd pass using the FP8-152 (E5M2) format;
  • Fwd + bwd combined.

These experiments can be run relying primarily on the JAX `lax.reduce_precision` operator; no FP8 hardware is needed for now.
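A minimal sketch of how such an experiment could be set up (helper names are illustrative, not from the linked PR): `lax.reduce_precision` keeps values in FP32 but rounds them to the target FP8 format, and a `jax.custom_vjp` wrapper lets the fwd pass round to FP8-143 while gradients in the bwd pass are rounded to FP8-152.

```python
import jax
import jax.numpy as jnp

# Hypothetical helpers; the actual implementation is expected in the linked PR.
def round_e4m3(x):
    # FP8-143: 1 sign, 4 exponent, 3 mantissa bits (values stay stored in FP32).
    return jax.lax.reduce_precision(x, exponent_bits=4, mantissa_bits=3)

def round_e5m2(x):
    # FP8-152: 1 sign, 5 exponent, 2 mantissa bits.
    return jax.lax.reduce_precision(x, exponent_bits=5, mantissa_bits=2)

@jax.custom_vjp
def fp8_cast(x):
    # Fwd pass: round activations/weights to the FP8-143 grid.
    return round_e4m3(x)

def fp8_cast_fwd(x):
    return round_e4m3(x), None

def fp8_cast_bwd(_, g):
    # Bwd pass: round incoming gradients to the FP8-152 grid.
    return (round_e5m2(g),)

fp8_cast.defvjp(fp8_cast_fwd, fp8_cast_bwd)

x = jnp.float32(1.1)
print(fp8_cast(x))            # nearest representable FP8-143 value
print(jax.grad(fp8_cast)(x))  # unit cotangent rounded to FP8-152
```

Since FP8-143 has only 3 mantissa bits, 1.1 rounds up to 1.125, while the gradient of this (near-)identity cast stays 1.0 after FP8-152 rounding.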

@balancap added the experiments label on Jan 2, 2024
@balancap
Contributor Author

balancap commented Jan 9, 2024

cc @lyprince in case you're interested :)

@balancap linked a pull request on Jan 16, 2024 that will close this issue
@balancap
Contributor Author

Basic MNIST FP8 training implemented in #87.
Training works as expected, although it is hard to draw strong conclusions from such a simple MNIST model alone.
