Stable Diffusion XL

This folder contains Stable Diffusion XL (SDXL) models implemented with MindSpore, based on the official implementation by Stability AI.

Features

  • Infer: Text-to-image generation with SDXL-1.0-Base/SDXL-1.0-PipeLine.
  • Infer: Image-to-image generation with SDXL-1.0-Refiner.
  • Infer: Support 7 SoTA diffusion process samplers.
  • Infer: Support inference with MSLite.
  • Infer: Support T2I-Adapters for Text-to-Image generation with extra visual guidance.
  • Infer: Support ControlNet inference with SDXL-1.0-Base.
  • Finetune: Vanilla Finetune with SDXL-1.0-Base.
  • (⚠️experimental) Finetune: LoRA fine-tune with SDXL-1.0-Base.
  • (⚠️experimental) Finetune: DreamBooth LoRA fine-tune with SDXL-1.0-Base.
  • (⚠️experimental) Finetune: Textual Inversion fine-tune with SDXL-1.0-Base.
  • LoRA model conversion for Torch inference; refer to the tutorial.
  • Memory Efficient Sampling and Tuning: Flash-Attention, Auto-Mix-Precision, Recompute, etc. (under continuous update)
  • Datasets in csv/webdataset/wids formats are supported.

Documentation

  1. Preparation & Tutorial
  2. Inference
  3. Finetune

What is New

Jan 30, 2024

  1. Add ControlNet inference support for SDXL.

Jan 18, 2024

  1. Support latent/text-embedding cache.
  2. Support vanilla fine-tune with PerBatchSize of 6 on 910*.

Jan 10, 2024

  1. Support Textual Inversion fine-tune.

Nov 22, 2023

  1. Support DreamBooth LoRA fine-tune.
  2. Support offline inference with MSLite.
  3. Support Vanilla fine-tune.
  4. Adapted to MindSpore 2.2.
  5. Add T2I-Adapters support for SDXL.

Sep 15, 2023

  1. Support SDXL-1.0-Refiner model for image-to-image generation.
  2. Support SDXL-1.0-Refiner LoRA fine-tune.
  3. Support SDXL-1.0-PipeLine for text-to-image generation.
  4. Support multi-aspect data processing (e.g., sample config).
  5. Improve sampling speed (e.g., 40 sampling steps with sdxl-base on Ascend910A: 125s -> 21s).
  6. Adapted to MindSpore 2.1.0.
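
As a rough sanity check on the sampling-speed item above, the quoted timings (125 s before, 21 s after, for 40 steps on Ascend910A) work out to about a 6x speedup and roughly half a second per step. The short Python sketch below only redoes that arithmetic; the variable names are illustrative and it does not call any SDXL or MindSpore code.

```python
# Timings quoted in the changelog for sdxl-base, 40 sampling steps on Ascend910A.
steps = 40
before_s = 125.0  # total sampling time before the optimization, in seconds
after_s = 21.0    # total sampling time after the optimization, in seconds

speedup = before_s / after_s          # overall speedup factor
per_step_before = before_s / steps    # average seconds per step, before
per_step_after = after_s / steps      # average seconds per step, after

print(f"speedup: {speedup:.2f}x")
print(f"per step: {per_step_before:.3f}s -> {per_step_after:.3f}s")
```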

Aug 30, 2023

  1. Support SDXL-1.0-Base model for text-to-image generation.
  2. Support SDXL-1.0-Base LoRA fine-tune.
  3. Support memory-efficient sampling and tuning.

Getting Started

See GETTING STARTED for details.

Examples:

Note: the images below were sampled with 40 steps by SDXL-1.0-Base on Ascend 910 (online inference).

Fig1: "Vibrant portrait painting of Salvador Dalí with a robotic half face."
Fig2: "A capybara made of voxels sitting in a field."
Fig3: "Cute adorable little goat, unreal engine, cozy interior lighting, art station, detailed digital painting, cinematic, octane rendering."
Fig4: "A portrait photo of a kangaroo wearing an orange hoodie and blue sunglasses standing on the grass in front of the Sydney Opera House holding a sign on the chest that says 'SDXL'!"