Conversation

@shengliangxu
Contributor

What does this PR do?

Refactor and clean up hf_ptq.py

This script contains several separate pieces of logic whose code is entangled, making it really hard to add new features.

Refactor the script so that these concerns are separated:

  1. Sparsity: all logic goes to sparsity_main. TODO: we may move this logic out to a separate script.

  2. Quantization: all logic goes to quantize_main.

    2.1 plain quantization with a single quantization format

    2.2 auto quantization
In the quantization pipeline, split the work into these stages:

  1. model loading
  2. calibration dataset loading
  3. pre-quantize processing
  4. actual quantization
  5. post-quantize processing
  6. quantized model export
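The six stages above can be sketched as a linear pipeline. Again, this is a hypothetical outline: all function names here are illustrative placeholders standing in for the refactored hf_ptq.py helpers, and the bodies are trivial stubs.

```python
# Hypothetical six-stage pipeline; names and bodies are placeholders.

def load_model(ckpt):            return {"name": ckpt}
def load_calib_dataset(size):    return list(range(size))
def pre_quantize(model):         model["pre"] = True;  return model
def quantize(model, dataset):    model["quantized"] = True; return model
def post_quantize(model):        model["post"] = True; return model
def export_model(model, path):   return f"{path}/{model['name']}"

def quantize_pipeline(ckpt, export_path, calib_size=16):
    model = load_model(ckpt)                  # 1. model loading
    data = load_calib_dataset(calib_size)     # 2. calibration dataset loading
    model = pre_quantize(model)               # 3. pre-quantize processing
    model = quantize(model, data)             # 4. actual quantization
    model = post_quantize(model)              # 5. post-quantize processing
    return export_model(model, export_path)   # 6. quantized model export
```

Keeping the stages as separate functions means each one can be swapped or extended (for example, a different calibration loader) without touching the others.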

Testing

Tested plain quantization:

python examples/llm_ptq/hf_ptq.py \
    --pyt_ckpt_path=Qwen/Qwen3-8B \
    --export_path=qwen3-8B_fp8 \
    --qformat=fp8 \
    --kv_cache_qformat=fp8 \
    --calib_size=16 \
    --batch_size=0 \
    --trust_remote_code \
    --export_fmt=hf

Tested auto quantization:

python examples/llm_ptq/hf_ptq.py \
    --qformat=nvfp4,fp8 \
    --auto_quantize_score_size 128 \
    --auto_quantize_bits 5.0 \
    --auto_quantize_checkpoint Qwen3-8B-auto-quantize-checkpoint \
    --pyt_ckpt_path=Qwen/Qwen3-8B \
    --export_path=qwen3-8B_auto_quantize \
    --kv_cache_qformat=fp8 \
    --calib_size=16 \
    --batch_size=0 \
    --trust_remote_code \
    --export_fmt=hf

@copy-pr-bot

copy-pr-bot bot commented Dec 8, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@shengliangxu shengliangxu force-pushed the shengliangx/hf_ptq_refactor_cleanup branch from 832fb13 to 070ae87 Compare December 8, 2025 21:01
@shengliangxu shengliangxu force-pushed the shengliangx/hf_ptq_refactor_cleanup branch from 070ae87 to a89625b Compare December 8, 2025 22:05
@shengliangxu shengliangxu marked this pull request as ready for review December 8, 2025 22:11
@shengliangxu shengliangxu requested review from a team as code owners December 8, 2025 22:11
@codecov

codecov bot commented Dec 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.45%. Comparing base (f265f8d) to head (e15d632).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #665   +/-   ##
=======================================
  Coverage   74.45%   74.45%           
=======================================
  Files         183      183           
  Lines       18412    18412           
=======================================
  Hits        13709    13709           
  Misses       4703     4703           

@shengliangxu shengliangxu self-assigned this Dec 11, 2025