[quantization] Introduce wrapper for Qwen3VLForConditionalGeneration #605

Merged
mhs4670go merged 1 commit into Samsung:main from dvsav:quant_for_conditional_generation
Apr 6, 2026

dvsav commented Apr 2, 2026

This change introduces the QuantQwen3VLForConditionalGeneration wrapper to support post-training quantization (PTQ) of the Qwen3VLForConditionalGeneration module.

Why?

Qwen3VLForConditionalGeneration is an essential part of the Qwen model.
Trying to quantize Qwen3VLForConditionalGeneration via PTQ currently raises an exception: "PTQQuantizer: no quantization wrapper for Qwen3VLForConditionalGeneration".

What

This change introduces:

  • Class QuantQwen3VLForConditionalGeneration (tico/quantization/wrapq/wrappers/qwen_vl/quant_for_conditional_generation.py).
  • Unit tests: class TestQuantQwen3VLForConditionalGeneration (test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py).
  • New entry in _CORE_MODULES (tico/quantization/wrapq/wrappers/registry.py).
  • Example of Qwen3VLForConditionalGeneration quantization and conversion to Circle (tico/quantization/wrapq/examples/qwen/quantize_for_conditional_generation.py).
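
The relationship between the missing-wrapper exception and the new registry entry can be illustrated with a minimal sketch. It assumes a simple type-to-wrapper dictionary with a decorator for registration; this is not the actual code in registry.py, and every name below (Module, QuantWrapper, _WRAPPERS, register, wrap, the Toy classes) is hypothetical.

```python
from typing import Callable, Dict, Type


class Module:
    """Stand-in for a framework module base class (hypothetical)."""


class QuantWrapper:
    """Stand-in base class for quantization wrappers (hypothetical)."""

    def __init__(self, fp_module: Module) -> None:
        self.fp_module = fp_module


# Hypothetical registry: floating-point module type -> wrapper type.
_WRAPPERS: Dict[Type[Module], Type[QuantWrapper]] = {}


def register(
    fp_cls: Type[Module],
) -> Callable[[Type[QuantWrapper]], Type[QuantWrapper]]:
    """Class decorator that records wrapper_cls as the wrapper for fp_cls."""

    def decorator(wrapper_cls: Type[QuantWrapper]) -> Type[QuantWrapper]:
        _WRAPPERS[fp_cls] = wrapper_cls
        return wrapper_cls

    return decorator


def wrap(module: Module) -> QuantWrapper:
    """Wrap module with its registered wrapper, or fail if none exists."""
    wrapper_cls = _WRAPPERS.get(type(module))
    if wrapper_cls is None:
        raise NotImplementedError(
            f"no quantization wrapper for {type(module).__name__}"
        )
    return wrapper_cls(module)


class ToyConditionalGeneration(Module):
    """Toy module standing in for the real model class."""


# Before a wrapper is registered, the lookup fails.
try:
    wrap(ToyConditionalGeneration())
    error_before = None
except NotImplementedError as exc:
    error_before = str(exc)


@register(ToyConditionalGeneration)
class QuantToyConditionalGeneration(QuantWrapper):
    """Toy wrapper; a real one would also wrap submodules."""


# After registration, the same lookup succeeds.
wrapped = wrap(ToyConditionalGeneration())
print(error_before)
print(type(wrapped).__name__)
```

In this toy setup, registering the wrapper class is the only step needed to turn the failing lookup into a successful one, which mirrors the motivation given under "Why?".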

Unit Tests

Unit test results with coverage information:

$ coverage run -m pytest test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py -v
================================================================== test session starts ==================================================================
platform linux -- Python 3.10.12, pytest-8.4.0, pluggy-1.6.0 -- /home/d.savchenkov/myenv/bin/python3
cachedir: .pytest_cache
rootdir: /home/d.savchenkov/TICO
configfile: pyproject.toml
plugins: anyio-4.12.0, mock-3.15.1, xdist-3.7.0, cov-6.2.1
collected 7 items                                                                                                                                       

test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_forward_text_only                   PASSED [ 14%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_forward_with_both_images_and_videos PASSED [ 28%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_forward_with_images                 PASSED [ 42%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_forward_with_videos                 PASSED [ 57%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_mode_transitions                    PASSED [ 71%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_registration_in_registry            PASSED [ 85%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_wraps_submodules                    PASSED [100%]

============================================================= 7 passed, 2 warnings in 8.48s =============================================================

Coverage info (irrelevant files skipped):

$ coverage report -m
Name                                                                           Stmts   Miss  Cover   Missing
------------------------------------------------------------------------------------------------------------
...
tico/quantization/wrapq/wrappers/nn/quant_linear.py                               29      0   100%
...
tico/quantization/wrapq/wrappers/qwen_vl/quant_for_conditional_generation.py      23      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_model.py                          215     52    76%   114, 120, 163, 199, 277-281, 348-363, 427-436, 499-576, 620-625
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_attn.py                      136      5    96%   196-197, 201-203
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_decoder_layer.py              42      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_mlp.py                        43      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_model.py                     130      8    94%   248, 254-256, 260, 278, 282, 285-286
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_attn.py                    105      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_block.py                    42      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_mlp.py                      33      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_model.py                   173      6    97%   166, 173, 180, 195, 279, 452
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_patch_embed.py              25      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_patch_merger.py             36      0   100%
tico/quantization/wrapq/wrappers/registry.py                                      36      1    97%   260
...
------------------------------------------------------------------------------------------------------------
TOTAL                                                                          11720   6838    42%

Script for testing quantization and conversion to Circle

$ python tico/quantization/wrapq/examples/qwen/quantize_for_conditional_generation.py
┌───────────── Quantization Error Summary ─────────────
│ Mean |diff|: 0.022036
│ PEIR       : 16.040346 %
└──────────────────────────────────────────────────────
     ┌───────────────────────────────────────────┐
 0.72┤                                           │
     │                                    •••••  │
 0.48┤                                 •••••••   │
     │                              •••••••      │
     │                          •••••••••        │
 0.24┤                     • ••••••••••          │
     │                     ••••••••••            │
-0.00┤                  •••••••••••              │
     │             •••••••••••• •                │
     │            ••••••••••  •                  │
-0.24┤          •••••••••••                      │
     │      • ••••••••                           │
-0.48┤      ••••••••                             │
     │    ••••••                                 │
     │  ••••                                     │
-0.72┤                                           │
     └┬──────────┬─────────┬──────────┬─────────┬┘
    -0.72      -0.36     -0.00      0.36     0.72 

[QuantCheck] WARNING: 34 nodes without qparam detected (see logs).
Circle model saved as 'qwen3vl_for_conditional_generation.q.circle'
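
For readers unfamiliar with the two numbers in the error summary, here is a rough sketch of how such metrics can be computed. It assumes that "Mean |diff|" is the mean absolute difference between the floating-point and quantized outputs, and that PEIR is the peak absolute error divided by the value range of the reference output. The PR does not define either metric, so both readings are assumptions. The function names are hypothetical, not the TICO API, and the input values are toy data unrelated to the run above.

```python
from typing import Sequence


def mean_abs_diff(ref: Sequence[float], test: Sequence[float]) -> float:
    """Mean of |ref[i] - test[i]| over all elements."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - t) for r, t in zip(ref, test)) / len(ref)


def peir(ref: Sequence[float], test: Sequence[float]) -> float:
    """Peak absolute error divided by the reference range (assumed definition)."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")
    interval = max(ref) - min(ref)
    if interval == 0:
        raise ValueError("reference output has zero range")
    peak_error = max(abs(r - t) for r, t in zip(ref, test))
    return peak_error / interval


# Toy data: reference (float) outputs and quantized outputs.
fp_out = [-0.5, 0.0, 0.5, 1.0]
q_out = [-0.5, 0.125, 0.5, 0.75]

mad = mean_abs_diff(fp_out, q_out)
ratio = peir(fp_out, q_out)
print(f"Mean |diff|: {mad:.6f}")
print(f"PEIR       : {ratio * 100:.6f} %")
```

Under this reading, a PEIR of about 16 % would mean that the worst single-element error is roughly one sixth of the reference output's range, while the mean error can be much smaller.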

This change introduces the QuantQwen3VLForConditionalGeneration wrapper to support post-training quantization of the Qwen3VLForConditionalGeneration module.

TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
dvsav marked this pull request as ready for review April 6, 2026 07:21
dvsav force-pushed the quant_for_conditional_generation branch from f203015 to eba69f8 April 6, 2026 07:21

Torrero left a comment

LGTM

mhs4670go left a comment

LGTM

mhs4670go merged commit 2531608 into Samsung:main Apr 6, 2026
7 checks passed
dvsav deleted the quant_for_conditional_generation branch April 7, 2026 06:51