
Annotate moe_sorting_dispatch_policy as int for fused_moe#2639

Merged
valarLip merged 1 commit into ROCm:main from nholmber:fix/moe-dispatch-policy-type
Apr 16, 2026
Conversation

@nholmber
Contributor

@nholmber nholmber commented Apr 7, 2026

Motivation

The type annotation bool was incorrect for moe_sorting_dispatch_policy, which accepts int values. The @torch_compile_guard decorator uses these annotations to generate PyTorch custom op schemas; with bool, PyTorch schema enforcement casts any value to bool, so dispatch_policy=2 becomes bool(2)=True (1), silently losing the intended policy. Using int allows callers to set dispatch_policy=2 correctly.
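For illustration only, a toy sketch (not the aiter code) of the coercion described above: a `bool`-typed schema parameter behaves like Python's own `bool()` cast, so any nonzero policy collapses to 1, while an `int`-typed parameter preserves it.

```python
# Toy sketch of the schema coercion; these functions are hypothetical.
def dispatch_with_bool_schema(moe_sorting_dispatch_policy: bool = False) -> int:
    # A bool-typed op schema effectively applies this cast:
    return int(bool(moe_sorting_dispatch_policy))  # any nonzero value -> 1


def dispatch_with_int_schema(moe_sorting_dispatch_policy: int = 0) -> int:
    return int(moe_sorting_dispatch_policy)  # policy value preserved


print(dispatch_with_bool_schema(2))  # 1 (intended policy silently lost)
print(dispatch_with_int_schema(2))   # 2 (policy passed through)
```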

Fixes: #2576

Technical Details

2-line change in aiter/fused_moe.py:

  • Line 191 (in `fused_moe_fake`): `moe_sorting_dispatch_policy: bool = 0` → `moe_sorting_dispatch_policy: int = 0`
  • Line 225 (in `fused_moe_`): `moe_sorting_dispatch_policy: bool = 0` → `moe_sorting_dispatch_policy: int = 0`
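To sketch why the annotation matters, here is a hypothetical `schema_cast` decorator standing in for `@torch_compile_guard` (the real decorator generates PyTorch custom op schemas; this toy version just casts each argument to its annotated type, mimicking schema enforcement):

```python
import inspect


def schema_cast(fn):
    """Hypothetical stand-in for @torch_compile_guard: casts each
    argument to its annotated type, mimicking op schema enforcement."""
    sig = inspect.signature(fn)

    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            ann = sig.parameters[name].annotation
            if ann is not inspect.Parameter.empty:
                bound.arguments[name] = ann(value)  # e.g. bool(2) -> True
        return fn(*bound.args, **bound.kwargs)

    return wrapper


@schema_cast
def fused_moe_buggy(moe_sorting_dispatch_policy: bool = False) -> int:
    # Pre-fix annotation: value coerced through bool before use.
    return int(moe_sorting_dispatch_policy)


@schema_cast
def fused_moe_fixed(moe_sorting_dispatch_policy: int = 0) -> int:
    # Post-fix annotation: value arrives intact.
    return moe_sorting_dispatch_policy


print(fused_moe_buggy(moe_sorting_dispatch_policy=2))  # 1: coerced via bool
print(fused_moe_fixed(moe_sorting_dispatch_policy=2))  # 2: preserved
```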

This will be followed up by a vLLM change that lets users set the policy.

Test Plan

  • Verified dispatch_policy=2 is passed through correctly to the kernel (previously silently truncated to 1)
  • lm_eval gsm8k accuracy on Qwen3-Next-80B-A3B-Instruct-FP8 (MI355X, TP1)
  • E2E throughput benchmark (vllm bench serve, 1k/1k, concurrency 16)

Test Result

| Metric | dp=0 (baseline) | dp=2 (with fix) |
| --- | --- | --- |
| gsm8k flex_extract | 0.8560 | 0.8628 |
| gsm8k strict_match | 0.8143 | 0.8196 |
| Output throughput (tok/s) | 1527.83 | 1551.84 (+1.6%) |

No accuracy regression. Modest throughput improvement from correct dispatch policy.
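A quick arithmetic check of the reported throughput gain, using the numbers from the table above:

```python
# Output throughput (tok/s) from the results table.
baseline, fixed = 1527.83, 1551.84
gain_pct = (fixed - baseline) / baseline * 100
print(f"{gain_pct:+.1f}%")  # +1.6%
```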

Submission Checklist

@nholmber nholmber requested a review from a team April 7, 2026 10:52
@github-actions
Contributor

github-actions bot commented Apr 7, 2026

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| `ci:triton-355` | Run Triton tests on MI355 in addition to MI325 |
| `ci:sglang` | SGLang integration tests |
| `ci:atom` | ATOM benchmark (DeepSeek-R1 + GPT-OSS) |
| `ci:vllm` | vLLM benchmark |
| `ci:all` | All of the above |

Add labels via the sidebar or gh pr edit 2639 --add-label <label>

tpopp added a commit to amdsiloai/aiter that referenced this pull request Apr 9, 2026
Fix type annotation so PyTorch custom op schema doesn't silently
coerce values like 2 to bool(2)==1.

Signed-off-by: Tres Popp <tres.popp@amd.com>
Made-with: Cursor

Fixes: ROCm#2576
Signed-off-by: Tres Popp <tres.popp@amd.com>
@tpopp tpopp force-pushed the fix/moe-dispatch-policy-type branch from adc6a2a to 83c358d Compare April 10, 2026 11:43
@nholmber nholmber requested a review from valarLip April 13, 2026 11:28
@valarLip valarLip merged commit a4e9890 into ROCm:main Apr 16, 2026
23 of 24 checks passed
Zzz9990 pushed a commit that referenced this pull request Apr 16, 2026

Signed-off-by: Tres Popp <tres.popp@amd.com>
Co-authored-by: Tres Popp <tres.popp@amd.com>


Development

Successfully merging this pull request may close these issues.

[BUG] fused_moe moe_sorting_dispatch_policy wrong type