
Conversation

@Glaceon-Hyy
Member

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @Glaceon-Hyy, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the diffsynth_engine by integrating SVD quantization capabilities through the nunchaku library, specifically for the QwenImage model. It introduces new configuration options and model classes to support quantized inference and specialized LoRA handling for these quantized models. The changes aim to improve model efficiency and provide a robust framework for working with quantized models, complete with dynamic detection and clear installation guidance for the nunchaku dependency.

Highlights

  • SVD Quantization Support: Introduced comprehensive support for SVD quantization using the nunchaku library, enabling more efficient inference for models like QwenImage.
  • Nunchaku-specific LoRA Classes: Added new LoRA wrapper classes (LoRASVDQW4A4Linear, LoRAAWQW4A16Linear) to seamlessly integrate LoRA with Nunchaku's quantized linear layers.
  • Dynamic Nunchaku Integration: The pipeline now dynamically detects if a loaded model is Nunchaku-quantized based on its state dictionary and automatically configures the appropriate Nunchaku-specific settings.
  • Enhanced LoRA Loading Mechanism: Updated LoRA loading logic to correctly handle the new Nunchaku-specific LoRA types, including advanced fusion for QKV projections in SVDQW4A4Linear layers.
  • User-Friendly Nunchaku Installation Guide: Implemented a detailed error message that guides users on how to manually install the nunchaku library if it's not found, specifying PyTorch, Python, and OS versions.
  • New Test Coverage: Added a dedicated test suite to validate the functionality of SVD quantization and LoRA loading within the QwenImage pipeline, ensuring stability and correctness.
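The "dynamic detection" highlight above means the pipeline decides whether to build the Nunchaku-quantized DiT by inspecting the checkpoint's state dict rather than requiring a user flag. A minimal sketch of that idea (the key suffixes below are illustrative placeholders, not the exact tensor names Nunchaku uses):

```python
def is_nunchaku_quantized(state_dict: dict) -> bool:
    """Heuristically detect a Nunchaku SVD-quantized checkpoint.

    SVD-quantized checkpoints carry extra per-layer tensors (packed
    quantized weights, scales, and low-rank projection factors) that a
    plain fp16/bf16 checkpoint does not have. Checking key suffixes is
    enough to tell the two apart without loading any tensor data.
    """
    markers = (".qweight", ".wscales", ".proj_down", ".proj_up")
    return any(key.endswith(markers) for key in state_dict)
```

When detection succeeds, the pipeline can then swap in the quantized model class and the Nunchaku-specific LoRA wrappers automatically.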
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for SVD quantization using the nunchaku library. It adds new configuration options, LoRA wrapper modules for quantized layers, and a quantized DiT model for Qwen Image. The pipeline logic is also updated to detect and load these quantized models. My review has identified a few critical bugs in the new LoRA handling logic for quantized layers that need to be addressed. Specifically, there are issues with rank management when fusing QKV LoRAs and a variable scope bug when loading them. There's also a minor bug in the clear method of a LoRA wrapper.
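For context on the rank-management point above: when separate q, k, and v LoRAs are fused into a single LoRA on a fused qkv linear, the fused rank must be the sum of the individual ranks. The down matrices stack along the rank axis and the up matrices form a block-diagonal, so each projection's update only touches its own output rows. A minimal NumPy sketch of that construction (illustrative only; the actual implementation lives in the Nunchaku-specific LoRA classes):

```python
import numpy as np

def fuse_qkv_lora(loras):
    """Fuse per-projection (down, up) LoRA pairs into one pair for a
    fused qkv linear.

    loras: [(down_q, up_q), (down_k, up_k), (down_v, up_v)], where
    down_i has shape (r_i, in_dim) and up_i has shape (out_i, r_i).
    Returns (down, up) with total rank r_q + r_k + r_v.
    """
    ups = [u for _, u in loras]
    # Stack down matrices along the rank axis: (r_q + r_k + r_v, in_dim).
    down = np.concatenate([d for d, _ in loras], axis=0)
    # Place up matrices block-diagonally so projection i's low-rank
    # activations only contribute to projection i's output rows.
    up = np.zeros((sum(u.shape[0] for u in ups), down.shape[0]))
    row = col = 0
    for u in ups:
        up[row:row + u.shape[0], col:col + u.shape[1]] = u
        row += u.shape[0]
        col += u.shape[1]
    return down, up
```

The bug class the review flags is exactly here: if the fused module keeps tracking a single per-projection rank instead of the summed rank, the shapes above stop lining up.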

# override OptimizationConfig
fbcache_relative_l1_threshold = 0.009

# svd quant
Contributor


Judging from the logic later on, these parameters don't seem to be meant for users to set? You could follow the boundary parameters in WanPipelineConfig and make them non-init fields.

Member Author


done

Contributor


It's probably still better to keep the fields in the config? Written with Field(init=False), users can't pass them to the initializer, but they still get field hints.
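The pattern being suggested, shown here with the standard-library dataclasses version of field(init=False) on a simplified, hypothetical config class (the real diffsynth_engine config may use a different base and field names):

```python
from dataclasses import dataclass, field

@dataclass
class QwenImagePipelineConfig:  # simplified sketch, not the actual class
    model_path: str
    # Derived at load time from the checkpoint contents; init=False means
    # users cannot pass it to __init__, yet the field still appears in
    # repr(), type hints, and IDE completion.
    use_nunchaku: bool = field(default=False, init=False)
```

The pipeline can then set cfg.use_nunchaku after inspecting the state dict, while the constructor signature stays clean.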

"4. Install it using pip, for example:\n"
" pip install nunchaku @ https://.../your_specific_nunchaku_file.whl\n"
)
raise ImportError(error_message)
Contributor


This error_message is pretty long; it feels like it could be imported from the flag module.
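One way to follow that suggestion, sketched with hypothetical module and constant names: keep the long installation hint next to the availability flag, and have call sites import both instead of rebuilding the message inline.

```python
# flag-module sketch: the availability probe and the install hint live
# together, so pipelines only need a one-line check.
NUNCHAKU_INSTALL_HINT = (
    "nunchaku is not installed. To install it manually:\n"
    "1. Check your PyTorch, Python, and OS versions.\n"
    "2. Find a matching wheel on the nunchaku release page.\n"
    "3. Download the wheel for your environment.\n"
    "4. Install it using pip.\n"
)

try:
    import nunchaku  # noqa: F401
    NUNCHAKU_AVAILABLE = True
except ImportError:
    NUNCHAKU_AVAILABLE = False

def require_nunchaku() -> None:
    """Raise a helpful ImportError when nunchaku is missing."""
    if not NUNCHAKU_AVAILABLE:
        raise ImportError(NUNCHAKU_INSTALL_HINT)
```

A quantized-model loader can then call require_nunchaku() up front and keep its own body free of the multi-line message.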

@akaitsuki-ii akaitsuki-ii merged commit ae4faeb into main Nov 12, 2025
@akaitsuki-ii akaitsuki-ii deleted the feature/svd branch November 12, 2025 11:12