[Update] diffusers v0.29.2 Update #650

Merged
merged 16 commits into from
Sep 27, 2024

townwish4git
Contributor

@townwish4git townwish4git commented Aug 30, 2024

What does this PR do?


Description

This pull request is a preliminary submission for upgrading the diffusers integration to v0.29.2. It is intentionally marked as a work-in-progress (WIP) and should not be merged into the main branch until the criteria below are met. Opening the PR early is meant to streamline development by starting code review now and allowing testing to proceed in parallel.

Merge Criteria:

  • Legacy Module Non-Degradation: Conduct comprehensive tests to verify that existing modules maintain their performance post-update, with no signs of degradation.
    • Models Unittest
    • Pipelines Outputs Validation by @The-truthh
  • New Module Validation: Ensure all new components introduced in this update undergo thorough comparative validation using PyTorch, confirming their functionality and performance.
    • Models Unittest
    • New Pipelines Outputs: Update inner validation report by @townwish4git
  • Transformer Dependency Update: Await the integration of transformers' BERT model by @Cui-yshoho into the repository. This dependency upgrade is crucial for compatibility and feature completeness.
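The comparative validation against PyTorch mentioned in the criteria above can be sketched as a tolerance check on flattened outputs. This is only an illustration, not the project's actual test harness: the helper names `max_relative_error` and `outputs_match`, the sample values, and the `5e-3` threshold are all assumptions made for this example.

```python
import math


def max_relative_error(candidate, reference, eps=1e-12):
    """Return max |c - r| / (|r| + eps) over two equal-length sequences."""
    if len(candidate) != len(reference):
        raise ValueError("outputs must have the same number of elements")
    return max(
        (abs(c - r) / (abs(r) + eps) for c, r in zip(candidate, reference)),
        default=0.0,
    )


def outputs_match(candidate, reference, threshold=5e-3):
    """True when every candidate value is finite and within the threshold."""
    if not all(math.isfinite(x) for x in candidate):
        return False
    return max_relative_error(candidate, reference) <= threshold


# Hypothetical flattened outputs: the ported module vs. the PyTorch reference.
ms_out = [0.1001, -0.4998, 2.0003]
pt_out = [0.1000, -0.5000, 2.0000]
print(outputs_match(ms_out, pt_out))
```

A relative (rather than absolute) error keeps the check meaningful across outputs of different magnitudes; the appropriate threshold depends on the precision (for example fp16 vs. fp32) used in the comparison.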

Action Items:

  • Developers and reviewers are kindly requested to focus on reviewing the changes without merging until the above conditions are satisfied.
  • Await completion of the transformers.bert model integration, tracked in feat(transformers/models): add Bert #645, as per the roadmap.

Once these milestones are achieved, this PR will be ready for final review and formal integration, setting a solid foundation for the upcoming v0.29.2 release.


Please note, this PR is part of the preparatory phase and requires subsequent validation steps to ensure quality and stability before final acceptance.

Features

New models/pipelines

1. Marigold

Proposed in Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation, Marigold introduces a diffusion model and associated fine-tuning protocol for monocular depth estimation. It can also be extended to perform surface normal estimation.

2. PixArt-Sigma

PixArt Sigma is the successor to PixArt Alpha. PixArt Sigma is capable of directly generating images at 4K resolution. It can also produce images of markedly higher fidelity and improved alignment with text prompts. It comes with a massive sequence length of 300 (for reference, PixArt Alpha has a maximum sequence length of 120)!

3. AnimateDiff SDXL

a-r-r-o-w contributed the Stable Diffusion XL (SDXL) version of AnimateDiff. However, note that this is currently an experimental feature, as only a beta release of the motion adapter checkpoint is available.

4. Hunyuan DiT

Hunyuan DiT is a transformer-based diffusion pipeline, introduced in the paper Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding by Tencent Hunyuan.

5. StableDiffusion3

This release emphasizes Stable Diffusion 3, Stability AI’s latest iteration of the Stable Diffusion family of models. It was introduced in Scaling Rectified Flow Transformers for High-Resolution Image Synthesis by Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, Kyle Lacey, Alex Goodwin, Yannik Marek, and Robin Rombach.

ControlNets

1. ControlNetXS

ControlNet-XS was introduced in ControlNet-XS by Denis Zavadski and Carsten Rother. It is based on the observation that the control model in the original ControlNet can be made much smaller and still produce good results. ControlNet-XS generates images comparable to those of a regular ControlNet, but it is 20-25% faster (see benchmark with StableDiffusion-XL) and uses ~45% less memory.

ControlNet-XS is supported for both Stable Diffusion and Stable Diffusion XL.

2. SD3 ControlNet

More

1. Massive Refactor of from_single_file

We have further refactored from_single_file to align its logic more closely with the from_pretrained method. The biggest benefit of doing this is that it allows us to expand single-file loading support beyond Stable Diffusion-like pipelines and models. It also makes it easier to load models that are saved and shared in their original format.

2. Using Long Prompts with the T5 Text Encoder

We increased the default sequence length for the T5 Text Encoder from 77 to 256! It can be adjusted to accept fewer or more tokens by setting max_sequence_length, up to a maximum of 512. Keep in mind that longer sequences require additional resources and will result in longer generation times. This effect is particularly noticeable during batch inference.
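To make the effect of the limit concrete, here is a minimal sketch of how a prompt's token ids could be padded or truncated to max_sequence_length. It is not the pipeline's own code: `truncate_prompt_tokens` is a hypothetical helper, `pad_id=0` is an assumption, and a real tokenizer also handles details such as the end-of-sequence token. Only the default of 256 and the cap of 512 come from the notes above.

```python
DEFAULT_MAX_SEQUENCE_LENGTH = 256  # default stated in the notes above
HARD_LIMIT = 512                   # upper bound stated in the notes above


def truncate_prompt_tokens(token_ids,
                           max_sequence_length=DEFAULT_MAX_SEQUENCE_LENGTH,
                           pad_id=0):
    """Pad or truncate token ids to exactly max_sequence_length entries.

    Returns the fixed-length id list and the number of tokens dropped.
    """
    if not 1 <= max_sequence_length <= HARD_LIMIT:
        raise ValueError(
            f"max_sequence_length must be in [1, {HARD_LIMIT}], "
            f"got {max_sequence_length}"
        )
    kept = list(token_ids[:max_sequence_length])
    dropped = max(len(token_ids) - max_sequence_length, 0)
    padded = kept + [pad_id] * (max_sequence_length - len(kept))
    return padded, dropped


# A hypothetical 300-token prompt loses its tail at the default length.
ids, dropped = truncate_prompt_tokens(list(range(1, 301)))
print(len(ids), dropped)
```

Reporting the number of dropped tokens makes silent truncation visible, which helps when deciding whether raising max_sequence_length is worth the extra memory and generation time.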

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines
  • Did you build and run the code without any errors?
  • Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@xxx

@townwish4git townwish4git changed the title [WIP] Prepare for diffusers v0.29.2 Update - Prerequisite Integration [Updata] diffusers v0.29.2 Update Sep 14, 2024
@townwish4git townwish4git changed the title [Updata] diffusers v0.29.2 Update [Update] diffusers v0.29.2 Update Sep 14, 2024
@geniuspatrick
Collaborator

@vigo999 will take over this version upgrade.

@vigo999 vigo999 added this pull request to the merge queue Sep 27, 2024
Merged via the queue into mindspore-lab:master with commit 2ea7619 Sep 27, 2024
3 checks passed
4 participants