Validate token-pruning ratio and unify keep-count resolution by SuperMarioYL · Pull Request #342 · Tencent/AngelSlim

SuperMarioYL · 2026-06-12T19:22:01Z

Summary

Every token-compressor pruning strategy turns a drop ratio into an absolute
keep count via int(round(num_vision_tokens * (1 - ratio))). Today each strategy
does this inline, with its own ad-hoc "retain at least one token" guard — and
none of them validate that ratio actually lies in [0, 1].

That gap is a real defect:

ratio > 1 makes 1 - ratio negative, so num_to_keep goes negative and
flows straight into torch.empty(num_to_keep, ...) (divprune) /
torch.topk(..., k=min(num_to_keep, n)) (attention_based, vispruner) / the
pivot math in dart — which raises an opaque CUDA/CPU error deep inside torch,
far from the actual cause.
ratio < 0 over-counts and silently changes the kept-token set.
The retain-one guard is inconsistent: hiprune doesn't apply it at all, so a
benign rounding-to-zero there drops the entire image.

This isn't hypothetical — the shipped config
configs/qwen2_5_vl/pruning/vision_selector_r0.9.yaml
carries ratio: 9 even though the filename says r0.9, so running it today
crashes instead of pruning to 10%.

Changes

Add a single shared helper resolve_num_tokens_to_keep(ratio, num_vision_tokens)
in algorithm/utils/utils.py that:
- rejects a non-numeric / bool / NaN / out-of-[0,1] ratio with a clear
  [TokenCompressor Error] 'ratio' must be in [0.0, 1.0], got <x> message;
- computes round(num_vision_tokens * (1 - ratio));
- applies one uniform "retain at least one token when ratio < 1.0" rule.
Route all eight strategies (basic, attention_based, dart, divprune,
hiprune, idpruner, scope, vision_selector) through the helper, removing
the duplicated inline blocks. Behaviour is unchanged for valid ratios; hiprune
gains the retain-one guard for consistency.
Fix the ratio: 9 → 0.9 typo in the vision_selector config so the example
actually runs.
Add a CPU-only regression test (tests/test_token_pruning_ratio.py) that pins
the contract — it stubs the torch import so it needs neither a GPU nor model
weights.

Testing

$ python -m pytest tests/test_token_pruning_ratio.py -q
18 passed

Formatting matches the repo pre-commit hooks: black --line-length=99,
isort --profile=black, and flake8 (E203,E704,W503,W504 ignored) are all clean.

Notes

Out-of-range ratios were never a working path (they crashed or corrupted), so
raising a descriptive ValueError instead is strictly safer and not a behavioural
regression for any valid configuration.

Every token_compressor pruning strategy converts a drop ratio into an absolute keep count via round(num_vision_tokens * (1 - ratio)), each with its own ad-hoc 'retain at least one token' guard (missing entirely in hiprune) and none validating that ratio lies in [0, 1]. An out-of-range ratio therefore flows straight into the keep count: a value > 1 makes (1 - ratio) negative, yielding a negative num_to_keep that crashes torch.empty / torch.topk, while a value < 0 over-counts. The shipped config configs/qwen2_5_vl/pruning/vision_selector_r0.9.yaml even carries ratio: 9 (the filename says r0.9), so this is a live defect, not a hypothetical one. Add a shared resolve_num_tokens_to_keep(ratio, n) helper that validates the ratio, computes the keep count, and applies a single uniform retain-one rule, then route all eight strategies through it. Fix the config typo to 0.9 and add a CPU-only regression test for the helper's contract.

SuperMarioYL added 2 commits June 13, 2026 03:21

Wrap multi-import lines to satisfy isort (black profile, 88 col)

1f1eae2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Validate token-pruning ratio and unify keep-count resolution#342

Validate token-pruning ratio and unify keep-count resolution#342
SuperMarioYL wants to merge 2 commits into
Tencent:mainfrom
SuperMarioYL:fix/token-pruning-ratio-validation

SuperMarioYL commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

SuperMarioYL commented Jun 12, 2026

Summary

Changes

Testing

Notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant