support online quant for quark models by gbyu-amd · Pull Request #1370 · ROCm/ATOM

gbyu-amd · 2026-06-26T09:58:54Z

Motivation

In some cases we want to apply online quantization to specific modules of a model that has already been quantized with Quark. For example, https://huggingface.co/amd/MiniMax-M3-MXFP4 is already a Quark-quantized model, but we want to quantize its bf16 attention linear layers online to PTPC fp8.
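For readers unfamiliar with the term, here is a minimal sketch of the scaling behind PTPC (per-token activation, per-channel weight) quantization. It is illustrative only and is not the ATOM or Quark implementation: the function names are hypothetical, the FP8 maximum of 448 assumes the OCP e4m3fn format (some AMD GPUs use e4m3fnuz, whose maximum is 240), and the actual rounding to the FP8 grid is left out so the sketch stays in plain Python.

```python
# Illustrative sketch only; names are hypothetical, not ATOM/Quark APIs.
FP8_E4M3_MAX = 448.0  # assumption: OCP e4m3fn; e4m3fnuz would use 240.0


def rowwise_scales(matrix, fp8_max=FP8_E4M3_MAX):
    """Return one scale per row so that row / scale fits in [-fp8_max, fp8_max].

    For a weight of shape [out_channels, in_features] this is per-channel;
    for activations of shape [tokens, in_features] this is per-token.
    """
    scales = []
    for row in matrix:
        amax = max(abs(v) for v in row)
        scales.append(amax / fp8_max if amax > 0 else 1.0)
    return scales


def scale_and_clamp(matrix, scales, fp8_max=FP8_E4M3_MAX):
    """Divide each row by its scale and clamp to the FP8 range (no rounding)."""
    return [
        [max(-fp8_max, min(fp8_max, v / s)) for v in row]
        for row, s in zip(matrix, scales)
    ]


def ptpc_matmul(x, w, fp8_max=FP8_E4M3_MAX):
    """Compute x @ w.T on scaled operands, then undo both scales."""
    sx = rowwise_scales(x, fp8_max)  # per-token scales
    sw = rowwise_scales(w, fp8_max)  # per-channel scales
    xq = scale_and_clamp(x, sx, fp8_max)
    wq = scale_and_clamp(w, sw, fp8_max)
    out = []
    for t, x_row in enumerate(xq):
        out.append([
            sum(a * b for a, b in zip(x_row, w_row)) * sx[t] * sw[c]
            for c, w_row in enumerate(wq)
        ])
    return out
```

Because each output element is rescaled by the product of one token scale and one channel scale, the dequantization can be applied after the matrix multiply rather than to the weights beforehand.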

Technical Details

Test Plan

Test Result

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

haoyangli0109 · 2026-06-26T10:56:40Z

Hi @gbyu-amd,
It makes sense that bf16 runs successfully, since it doesn’t require dequantization of the weights.
Quark supports multiple quantization formats, and completely opening up the Quark options could pose risks. For example, if we try to perform mxfp4 quantization on a ptpc_fp8 model, problems will arise.

Merging this PR carries some risk, but as long as you confirm there won’t be any misuse, I believe it can be merged.

Based on this PR, we’ll submit another PR to handle online quantization of Quark models in common scenarios next week.
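To make the misuse concern above concrete, here is a small sketch of the kind of guard that would reject it. It is an assumption-laden illustration, not code from this PR or from Quark: `plan_online_quant`, the dtype strings, and the layer names are all hypothetical. The idea is simply to allow online quantization only for layers that are still unquantized in the checkpoint.

```python
# Hypothetical guard; not part of ATOM or Quark.
UNQUANTIZED_DTYPES = {"bf16", "fp16"}  # assumption: dtypes treated as safe sources


def plan_online_quant(layer_dtypes, target_layers, target_format):
    """Map each requested layer to target_format, refusing re-quantization.

    layer_dtypes: dict of layer name -> format stored in the checkpoint.
    target_layers: layer names the user wants to quantize online.
    """
    plan = {}
    for name in target_layers:
        if name not in layer_dtypes:
            raise KeyError(f"unknown layer: {name}")
        source = layer_dtypes[name]
        if source not in UNQUANTIZED_DTYPES:
            # e.g. mxfp4 on top of ptpc_fp8 weights would need a dequantization step
            raise ValueError(
                f"{name} is already stored as {source}; "
                f"refusing online quantization to {target_format}"
            )
        plan[name] = target_format
    return plan
```

With such a check, quantizing a bf16 attention projection to `ptpc_fp8` would pass, while requesting `mxfp4` for a layer already stored as `ptpc_fp8` would raise an error instead of failing later.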

gbyu-amd · 2026-06-26T13:57:28Z

Thanks @haoyangli0109, it would be great if you could support more general cases for Quark models.

ganyi1996ppo added 2 commits June 25, 2026 09:42

fix online quant

f0194a7

update comment

a0300ec

gbyu-amd requested a review from lihaoyang-amd June 26, 2026 09:59

format

7c22c28

valarLip approved these changes Jun 26, 2026

valarLip merged commit e97d631 into main Jun 26, 2026
25 of 33 checks passed

valarLip deleted the guanbao/m3_fp4_quant branch June 26, 2026 14:55

valarLip merged 3 commits into main from guanbao/m3_fp4_quant

4 participants
