fix(models): fix MoE weight dangling reference and Qwen3-Omni model adapter compatibility#334

Merged
ali-88123 merged 1 commit into
Tencent:mainfrom
sunnyxiaohu:fix/model-adapter-compat
Jun 9, 2026

@sunnyxiaohu

Contributor

Summary

Fix critical runtime issues in MoE expert weight handling and Qwen3-Omni model adapter that caused GPTQ quantization failures.

Changes

1. Fix MoE expert weight dangling reference (angelslim/models/llm/qwen.py)

After chunk() splits gate_up_proj into gate_proj and up_proj, the resulting tensors are views sharing the same underlying storage. When del self.gate_up_proj is executed subsequently, the storage is freed, leaving gate_proj and up_proj as dangling references pointing to invalid memory.

Fix: Call .clone() on all chunked/sliced weight tensors immediately after assignment to ensure each expert owns an independent copy before the source tensors are deleted.
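
The difference between a chunked view and a cloned copy can be illustrated with a minimal sketch. It is not the code from angelslim/models/llm/qwen.py: the helper name split_gate_up, the toy shape, and the choice of dim=-1 are assumptions made for illustration. In plain eager PyTorch a view normally keeps its base storage alive, so the sketch does not reproduce the reported failure; it only shows that chunk() outputs alias the source tensor while clone() gives each half its own storage.

```python
import torch


def split_gate_up(fused: torch.Tensor, clone: bool = True):
    """Split a fused weight into gate and up halves along the last dim.

    Hypothetical helper; the split dimension is an assumption.
    """
    gate, up = fused.chunk(2, dim=-1)  # both halves are views of `fused`
    if clone:
        # clone() allocates new storage, so the halves no longer alias `fused`
        gate, up = gate.clone(), up.clone()
    return gate, up


# Toy fused weight standing in for gate_up_proj
fused = torch.arange(8, dtype=torch.float32).reshape(2, 4)

gate_view, up_view = split_gate_up(fused, clone=False)
gate_copy, up_copy = split_gate_up(fused, clone=True)

# Overwrite the source in place: the views change with it, the clones do not
fused.zero_()
views_follow_source = bool((gate_view == 0).all() and (up_view == 0).all())
copies_unchanged = bool(gate_copy.sum() > 0 and up_copy.sum() > 0)
```

Cloning costs one extra copy of each expert weight at split time, which is the trade-off the fix accepts in exchange for independent ownership.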

2. Add missing block_name attribute to Qwen3-Omni (angelslim/models/omni/qwen3_omni.py)

The GPTQ quantization flow requires self.block_name to locate transformer blocks. Qwen3-Omni only defined thinker_block_name / talker_block_name but was missing the base block_name attribute, causing AttributeError during calibration.

Fix: Add self.block_name = "thinker.model.layers" to the constructor.
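
As a rough sketch of why the attribute matters, the snippet below assumes the quantization flow resolves block_name as a dotted path from the model root. That lookup mechanism, the resolve_block helper, and both adapter classes are hypothetical stand-ins, not AngelSlim code; only the string "thinker.model.layers" comes from this PR.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_block(root, dotted_name):
    """Walk a dotted attribute path such as 'thinker.model.layers' (hypothetical helper)."""
    return reduce(getattr, dotted_name.split("."), root)


# Stand-in object tree shaped like model.thinker.model.layers
layers = ["layer0", "layer1"]
model = SimpleNamespace(thinker=SimpleNamespace(model=SimpleNamespace(layers=layers)))


class AdapterWithoutBlockName:
    """Mimics the state before the fix: only the component-specific name exists."""

    def __init__(self, model):
        self.model = model
        self.thinker_block_name = "thinker.model.layers"


class AdapterWithBlockName(AdapterWithoutBlockName):
    """Mimics the state after the fix: the base attribute is also defined."""

    def __init__(self, model):
        super().__init__(model)
        self.block_name = "thinker.model.layers"


fixed = AdapterWithBlockName(model)
blocks = resolve_block(fixed.model, fixed.block_name)

# Accessing the missing attribute fails in the same way the PR describes
try:
    AdapterWithoutBlockName(model).block_name
    missing_raises = False
except AttributeError:
    missing_raises = True
```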

3. Remove incompatible self.model.use_cache = False (angelslim/models/omni/qwen3_omni.py)

The Qwen3-Omni model object does not expose a use_cache attribute at the top level (it is configured per-component). Setting it unconditionally raised AttributeError.

Fix: Remove the incompatible assignment from model_forward().

Files Changed

  • angelslim/models/llm/qwen.py — clone expert weights after chunk to prevent dangling references
  • angelslim/models/omni/qwen3_omni.py — add block_name attr; remove invalid use_cache assignment

…sues

- Fix MoE expert weight dangling reference: clone() after chunk() to avoid
  invalid memory access when gate_up_proj/down_proj are deleted
- Add missing block_name attribute to Qwen3-Omni (required by GPTQ flow)
- Remove incompatible self.model.use_cache = False in model_forward()
@ali-88123 ali-88123 merged commit 13ccafd into Tencent:main Jun 9, 2026
5 checks passed