Describe the bug
I found 4 related zero-handling issues where divisors/scales are not validated before use.
Three can raise ZeroDivisionError; one can silently propagate inf/nan.
Affected locations
- deepspeed/utils/groups.py: _ensure_divisibility(numerator, denominator) uses numerator % denominator without guarding denominator == 0.
- deepspeed/utils/timer.py: ThroughputTimer._is_report_boundary() checks None but not 0 before self.global_step_count % self.steps_per_output.
- deepspeed/inference/v2/inference_utils.py: ceil_div(a, b) returns -(-a // b) without guarding b == 0.
- op_builder/hpu/fp_quantizer.py: FPQuantizer.dequantize() computes (1.0 / scale) without guarding zero elements in scale, which can produce non-finite values and corrupt outputs silently.
To Reproduce
_ensure_divisibility(8, 0) -> ZeroDivisionError
_is_report_boundary() with steps_per_output=0 -> ZeroDivisionError
ceil_div(10, 0) -> ZeroDivisionError
1.0 / torch.tensor([0.0, 1.0]) -> tensor([inf, 1.]) (same pattern used in HPU dequantize path)
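The three pure-Python cases can be checked without installing DeepSpeed. The sketch below uses stand-in functions that copy only the unguarded expressions quoted above; the function names and signatures are hypothetical and are not the DeepSpeed implementations. The torch case is left out so the script has no dependencies.

```python
# Stand-ins that mirror the unguarded expressions described in this issue.
# Names and signatures are hypothetical; this is not DeepSpeed code.

def ensure_divisibility_unguarded(numerator, denominator):
    # Same pattern as the report: modulo with no zero check on the denominator.
    remainder = numerator % denominator
    if remainder != 0:
        raise ValueError(f"{numerator} is not divisible by {denominator}")


def is_report_boundary_unguarded(global_step_count, steps_per_output):
    # Checks None but not 0 before taking the modulo.
    if steps_per_output is None:
        return False
    return global_step_count % steps_per_output == 0


def ceil_div_unguarded(a, b):
    # Ceiling division via negated floor division, with no check on b.
    return -(-a // b)


def raises_zero_division(fn, *args):
    """Return True if calling fn(*args) raises ZeroDivisionError."""
    try:
        fn(*args)
    except ZeroDivisionError:
        return True
    return False


results = {
    "ensure_divisibility(8, 0)": raises_zero_division(ensure_divisibility_unguarded, 8, 0),
    "is_report_boundary(step=10, steps_per_output=0)": raises_zero_division(is_report_boundary_unguarded, 10, 0),
    "ceil_div(10, 0)": raises_zero_division(ceil_div_unguarded, 10, 0),
}

for name, raised in results.items():
    print(f"{name}: ZeroDivisionError raised = {raised}")
```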
Expected behavior
- Explicit validation for invalid zero values.
- Clear user-facing error messages (or safe clamping where appropriate).
- No raw modulo/division-by-zero exceptions.
- No silent non-finite propagation in dequantization.
Suggested fixes
_ensure_divisibility: add guard before modulo (denominator != 0).
_is_report_boundary: treat 0 as invalid/disabled (if not self.steps_per_output: or explicit <= 0 validation).
ceil_div: reject b == 0 with clear error.
FPQuantizer.dequantize: clamp scale to torch.finfo(scale.dtype).tiny (or validate and fail clearly) before inversion.
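A minimal sketch of the first three guards, written as standalone functions rather than as patches to DeepSpeed. The signatures, the choice of ValueError, and the error messages are assumptions for illustration; the actual fix may prefer assertions or a different exception type. Treating steps_per_output == 0 as "reporting disabled" is one of the two options listed above, not a confirmed design decision. The dequantize fix is not covered here because it depends on torch and the HPU path.

```python
# Hypothetical guarded versions of the three pure-Python helpers.
# These illustrate the suggested checks; they are not the DeepSpeed code.

def ensure_divisibility(numerator: int, denominator: int) -> None:
    # Reject a zero denominator before the modulo can raise ZeroDivisionError.
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    if numerator % denominator != 0:
        raise ValueError(f"{numerator} is not divisible by {denominator}")


def is_report_boundary(global_step_count: int, steps_per_output) -> bool:
    # Assumption: both None and 0 mean "reporting disabled".
    if not steps_per_output:
        return False
    if steps_per_output < 0:
        raise ValueError("steps_per_output must be a positive integer")
    return global_step_count % steps_per_output == 0


def ceil_div(a: int, b: int) -> int:
    # Fail with a clear message instead of a raw ZeroDivisionError.
    if b == 0:
        raise ValueError("ceil_div: divisor b must be non-zero")
    return -(-a // b)
```

With these checks, ceil_div(10, 0) and ensure_divisibility(8, 0) fail with a descriptive ValueError, and is_report_boundary(10, 0) returns False instead of raising.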
System info
- Can provide full ds_report output if needed.
- Repros above are minimal and mostly pure-Python, except item 4 which is on the HPU backend path.