Add Quantizers for Qwen3VLMoeTextDecoderLayer #666

soodoshll · 2025-12-08T19:32:05Z

What does this PR do?

Type of change: new feature

Overview: The Hugging Face transformers library implements the Qwen3-VL MoE layer as a monolithic module rather than assembling it from Linear layers, so modelopt's quantizer cannot currently recognize it. This PR introduces a conversion from HF's qwen3vl_moe MoE layers to qwen3_moe MoE layers, which consist of a set of Linear layers.
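For illustration only, here is a minimal sketch of the idea behind such a conversion: taking expert weights stored as fused 3D tensors and copying them into per-expert `nn.Linear` layers that a Linear-based quantizer could pick up. The classes `ToyFusedExperts` and `ToyLinearExpert`, the helper `split_fused_experts`, and the assumed tensor layout (`gate_up_proj` as `(num_experts, hidden, 2 * inter)`, `down_proj` as `(num_experts, inter, hidden)`) are hypothetical stand-ins, not the code in this PR or the actual Hugging Face classes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyFusedExperts(nn.Module):
    """Hypothetical stand-in for a monolithic MoE expert block (not the HF class)."""

    def __init__(self, num_experts: int, hidden: int, inter: int):
        super().__init__()
        # Assumed layout: one 3D parameter per projection, indexed by expert.
        self.gate_up_proj = nn.Parameter(torch.randn(num_experts, hidden, 2 * inter) * 0.1)
        self.down_proj = nn.Parameter(torch.randn(num_experts, inter, hidden) * 0.1)

    def forward_expert(self, x: torch.Tensor, idx: int) -> torch.Tensor:
        # Raw matmuls on slices of the fused parameters; no nn.Linear involved.
        gate, up = (x @ self.gate_up_proj[idx]).chunk(2, dim=-1)
        return (F.silu(gate) * up) @ self.down_proj[idx]


class ToyLinearExpert(nn.Module):
    """One expert built from three nn.Linear layers."""

    def __init__(self, hidden: int, inter: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden, inter, bias=False)
        self.up_proj = nn.Linear(hidden, inter, bias=False)
        self.down_proj = nn.Linear(inter, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


def split_fused_experts(fused: ToyFusedExperts) -> nn.ModuleList:
    """Copy fused expert weights into per-expert Linear layers."""
    num_experts, hidden, two_inter = fused.gate_up_proj.shape
    inter = two_inter // 2
    experts = nn.ModuleList()
    with torch.no_grad():
        for idx in range(num_experts):
            expert = ToyLinearExpert(hidden, inter)
            # Each half has shape (hidden, inter).
            gate_w, up_w = fused.gate_up_proj[idx].chunk(2, dim=-1)
            # nn.Linear stores weight as (out_features, in_features), hence the transpose.
            expert.gate_proj.weight.copy_(gate_w.t())
            expert.up_proj.weight.copy_(up_w.t())
            expert.down_proj.weight.copy_(fused.down_proj[idx].t())
            experts.append(expert)
    return experts


torch.manual_seed(0)
fused = ToyFusedExperts(num_experts=4, hidden=8, inter=16)
experts = split_fused_experts(fused)
x = torch.randn(3, 8)
# The per-expert Linear version should reproduce the fused computation.
outputs_match = all(
    torch.allclose(fused.forward_expert(x, i), experts[i](x), atol=1e-5)
    for i in range(4)
)
num_linear = sum(isinstance(m, nn.Linear) for m in experts.modules())
```

In this toy setup the fused block exposes no `nn.Linear` submodules at all, while the split version exposes three per expert, which is the property a Linear-oriented quantizer depends on.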

Testing

Tested with

python hf_ptq.py --pyt_ckpt_path=Qwen/Qwen3-VL-30B-A3B-Instruct --qformat=nvfp4 --dataset wikipedia

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed.
Is this change backward compatible?: Yes/No
Did you write any new necessary tests?: Yes/No
Did you add or update any necessary documentation?: Yes/No
Did you update Changelog?: Yes/No

Additional Information

copy-pr-bot · 2025-12-08T19:32:09Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

codecov · 2025-12-10T18:01:47Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.46%. Comparing base (5a4242f) to head (e0d2121).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #666      +/-   ##
==========================================
- Coverage   74.52%   74.46%   -0.06%     
==========================================
  Files         183      183              
  Lines       18400    18410      +10     
==========================================
- Hits        13712    13709       -3     
- Misses       4688     4701      +13


shengliangxu · 2025-12-10T18:52:24Z

modelopt/torch/quantization/plugins/huggingface.py

+            Qwen3VLMoeTextSparseMoeBlock,
+        )
+
+        if not isinstance(self.mlp, Qwen3VLMoeTextSparseMoeBlock):


I'm wondering whether we should add support for Qwen3VLMoeTextSparseMoeBlock directly; using a replacement module feels a bit fragile.

I've refactored this part.

soodoshll requested a review from a team as a code owner December 8, 2025 19:32

soodoshll requested a review from ajrasane December 8, 2025 19:32

shengliangxu self-requested a review December 8, 2025 19:33

soodoshll force-pushed the qwen3-vl-moe branch from 2c18fd0 to 7a16307 Compare December 8, 2025 19:58

soodoshll requested review from a team as code owners December 8, 2025 19:58

soodoshll requested a review from ynankani December 8, 2025 19:58

shengliangxu removed request for a team, ChenhanYu, Edwardf0t1, ajrasane, cjluo-nv and ynankani December 10, 2025 01:59

Merge branch 'main' into qwen3-vl-moe

e0d2121

shengliangxu reviewed Dec 10, 2025

soodoshll added 3 commits December 10, 2025 21:40

refactor to directly impl qwen3_vl_moe

9bbeab8

Signed-off-by: Qidong Su <[email protected]>

Merge branch 'qwen3-vl-moe' of github.com:soodoshll/TensorRT-Model-Op…

a26ded5

…timizer into qwen3-vl-moe

format

c846c65

Signed-off-by: Qidong Su <[email protected]>
