fp8 implementation of Flux which gets ~3.5 it/s at 1024x1024 on a 4090 (Ada/Hopper & 16GB+ VRAM only) #1363
-
This looks promising and super interesting. Unfortunately, I do not have an Ada/40XX device for webui development right now, so I have no idea how this feels and looks. I have some 40xx devices, but those are for labs and are not part of my personal dev setup for webui. To play with this in Forge, you will need to wait until I somehow get an Ada/40XX device for my personal dev setup. But feel free to post some images here so that I can take a look at the level of quality degradation. I am especially interested in the influence of aredden's range scaling methods. In fact, I also have some ideas to port native 8-bit bnb operations to the compute layers, but that highly depends on whether I have free time later.
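For reference, here is a minimal sketch of what per-tensor absmax range scaling to fp8 generally looks like in PyTorch. This is my own illustration, not aredden's actual code, and it assumes a PyTorch build with `torch.float8_e4m3fn` support (2.1+):

```python
import torch

def quantize_fp8_per_tensor(w: torch.Tensor):
    """Per-tensor absmax range scaling to float8_e4m3fn (illustrative only)."""
    fp8_max = torch.finfo(torch.float8_e4m3fn).max  # ~448 for e4m3fn
    scale = w.abs().max().float().clamp(min=1e-12) / fp8_max
    w_fp8 = (w.float() / scale).clamp(-fp8_max, fp8_max).to(torch.float8_e4m3fn)
    return w_fp8, scale

def dequantize_fp8(w_fp8: torch.Tensor, scale: torch.Tensor, dtype=torch.bfloat16):
    """Recover an approximate weight in a compute dtype."""
    return w_fp8.to(dtype) * scale.to(dtype)

if __name__ == "__main__":
    w = torch.randn(4096, 4096, dtype=torch.bfloat16)
    w_fp8, scale = quantize_fp8_per_tensor(w)
    err = (dequantize_fp8(w_fp8, scale) - w).abs().mean()
    print(f"scale={scale.item():.4e}  mean abs error={err.item():.4e}")
```

The absmax-based scale maps each tensor's largest magnitude onto e4m3's limited dynamic range; more elaborate range-scaling schemes clip outliers before computing the scale, which is where most of the quality difference between fp8 variants tends to come from.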
-
Could this be implemented?
https://github.com/aredden/flux-fp8-api?tab=readme-ov-file#installation
https://www.reddit.com/r/StableDiffusion/comments/1ex64jj/i_made_an_fp8_implementation_of_flux_which_gets/
Flux diffusion model implementation using quantized fp8 matmul; the remaining layers use faster half-precision accumulation, which is ~2x faster on consumer devices.
Credits to aredden
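To make the two halves of that description concrete, here is a rough sketch. The class name `FP8StoredLinear` and its structure are my own illustration, not code from flux-fp8-api: it stores Linear weights as fp8 with a per-tensor scale and enables PyTorch's reduced-precision fp16 accumulation toggle. It only emulates the memory side; the actual project dispatches true fp8 matmuls on Ada/Hopper tensor cores.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# The "faster half precision accumulate" part: a real PyTorch switch that lets
# fp16 CUDA matmuls accumulate in reduced precision.  Whether it actually
# speeds things up depends on the GPU and the PyTorch build.
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = True

class FP8StoredLinear(nn.Module):
    """Hypothetical wrapper (not a class from flux-fp8-api): weight kept in
    float8_e4m3fn with a per-tensor scale, dequantized to the activation dtype
    at call time.  This captures the memory savings only; the real project
    runs genuine fp8 tensor-core matmuls on Ada/Hopper."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        fp8_max = torch.finfo(torch.float8_e4m3fn).max
        w = linear.weight.detach().float()
        scale = w.abs().max().clamp(min=1e-12) / fp8_max
        self.register_buffer(
            "weight_fp8",
            (w / scale).clamp(-fp8_max, fp8_max).to(torch.float8_e4m3fn),
        )
        self.register_buffer("scale", scale)
        self.bias = linear.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dequantize the stored fp8 weight to the activation dtype, then run a
        # normal matmul; the fused fp8 path in the real project avoids this cast.
        w = self.weight_fp8.to(x.dtype) * self.scale.to(x.dtype)
        return F.linear(x, w, self.bias)

if __name__ == "__main__":
    q = FP8StoredLinear(nn.Linear(3072, 3072))
    x = torch.randn(2, 3072)
    print(q(x).shape)  # torch.Size([2, 3072])
```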