Add Quantizers for Qwen3VLMoeTextDecoderLayer #666
base: main
2c18fd0 to 7a16307
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main     #666      +/-   ##
==========================================
- Coverage   74.52%   74.46%   -0.06%
==========================================
  Files         183      183
  Lines       18400    18410      +10
==========================================
- Hits        13712    13709       -3
- Misses       4688     4701      +13
    Qwen3VLMoeTextSparseMoeBlock,
)

if not isinstance(self.mlp, Qwen3VLMoeTextSparseMoeBlock):
I'm wondering whether we want to add support for Qwen3VLMoeTextSparseMoeBlock directly; using a replacement module feels a bit fragile.
I've refactored this part.
Signed-off-by: Qidong Su <[email protected]>
…timizer into qwen3-vl-moe
Signed-off-by: Qidong Su <[email protected]>
What does this PR do?
Type of change: new feature
Overview: The Hugging Face transformers library implements the Qwen3-VL MoE layer as a monolithic module rather than assembling it from Linear layers, so modelopt's quantizer currently cannot recognize it. This PR introduces a conversion from HF's qwen3vl_moe MoE layers to qwen3_moe MoE layers, which consist of a set of Linear layers.
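For illustration only, the kind of conversion described above can be sketched in PyTorch as follows. This is not the code from this PR. `FusedExperts`, `ExpertMLP`, and `split_fused_experts` are hypothetical names, and the fused weight layout (`gate_up_proj` of shape `(num_experts, hidden, 2 * inter)`, `down_proj` of shape `(num_experts, inter, hidden)`) is an assumption about how a monolithic expert block might store its parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FusedExperts(nn.Module):
    """Hypothetical stand-in for a monolithic MoE expert block with fused 3-D weights."""

    def __init__(self, num_experts: int, hidden: int, inter: int):
        super().__init__()
        # Assumed layout: one fused gate/up matrix and one down matrix per expert.
        self.gate_up_proj = nn.Parameter(torch.randn(num_experts, hidden, 2 * inter) * 0.02)
        self.down_proj = nn.Parameter(torch.randn(num_experts, inter, hidden) * 0.02)

    def forward_expert(self, x: torch.Tensor, idx: int) -> torch.Tensor:
        gate, up = (x @ self.gate_up_proj[idx]).chunk(2, dim=-1)
        return (F.silu(gate) * up) @ self.down_proj[idx]


class ExpertMLP(nn.Module):
    """One expert built only from nn.Linear layers, which a Linear-based quantizer can see."""

    def __init__(self, hidden: int, inter: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden, inter, bias=False)
        self.up_proj = nn.Linear(hidden, inter, bias=False)
        self.down_proj = nn.Linear(inter, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


def split_fused_experts(fused: FusedExperts) -> nn.ModuleList:
    """Copy each expert's slice of the fused weights into separate Linear layers."""
    num_experts, hidden, two_inter = fused.gate_up_proj.shape
    inter = two_inter // 2
    experts = nn.ModuleList()
    with torch.no_grad():
        for idx in range(num_experts):
            expert = ExpertMLP(hidden, inter)
            gate_w, up_w = fused.gate_up_proj[idx].chunk(2, dim=-1)  # each (hidden, inter)
            # nn.Linear stores weight as (out_features, in_features), hence the transpose.
            expert.gate_proj.weight.copy_(gate_w.T)
            expert.up_proj.weight.copy_(up_w.T)
            expert.down_proj.weight.copy_(fused.down_proj[idx].T)
            experts.append(expert)
    return experts
```

Under these assumptions, each converted expert should produce the same output as the corresponding slice of the fused block, while exposing plain `nn.Linear` submodules that a Linear-aware quantizer can wrap.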
Testing
Tested with
Before your PR is "Ready for review"
Additional Information