fix: input validation for env vars and narrowing conversions#378
Open
randomizedcoder wants to merge 3 commits into
Open
fix: input validation for env vars and narrowing conversions#378randomizedcoder wants to merge 3 commits into
randomizedcoder wants to merge 3 commits into
Conversation
Table-driven tests for two classes of unsafe implicit conversion:
- atoi/atof: demonstrates silent failure on garbage input, integer
overflow, and empty strings vs strtol/strtod error detection
- narrowing: demonstrates uint64_t -> double precision loss for
large values and signed -> unsigned sign change
Source-verification test intentionally FAILS at this commit to
demonstrate the unsafe patterns exist before the fix.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>
atoi() silently returns 0 on non-numeric input and has undefined behavior on integer overflow (CWE-190). atof() similarly provides no error detection. Replace with strtol/strtod which detect: - Non-numeric input (endptr == start) - Overflow (errno == ERANGE) - Partial parse (trailing garbage after number) Environment variables affected: - NCCL_TESTS_DEVICE: GPU device ID (common.cu + util.cu) - NCCL_TESTS_MIN_BW: minimum bandwidth threshold (common.cu) Invalid values now print a warning and fall back to the default. Signed-off-by: dave.seddon.ca@gmail.com Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>
GCC -Wconversion and -Wsign-conversion flag three implicit narrowing conversions: - now(): chrono::duration::rep (signed long) to uint64_t - elapsed(): uint64_t to double - reset(): uint64_t to double Add static_cast to make the conversions explicit and silence the warnings. All conversions are semantically safe (durations are non-negative, nanosecond counts fit in double precision). Signed-off-by: dave.seddon.ca@gmail.com Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Static analysis findings
flawfinder (Level 2, CWE-190) and clang-tidy (cert-err34-c) flagged unsafe integer/float parsing from environment variables. GCC
-Wconversionand-Wsign-conversionflagged implicit narrowing conversions intimer.cc.atoi()with no range check or error detectioncommon.cu:1330,util.cu:555atof()with no error detectioncommon.cu:1596-Wsign-conversiontimer.cc:10uint64_t→doublenarrowing-Wconversiontimer.cc:20,25Why these matter
atoi()silently returns 0 on garbage input — indistinguishable fromatoi("0"). ForNCCL_TESTS_DEVICE=abc, the current code silently selects GPU 0 instead of reporting an error. ForNCCL_TESTS_DEVICE=99999999999,atoi()has undefined behavior on integer overflow — it typically wraps to a garbage value, which is then passed as a GPU device ID to CUDA.atof()has the same problem.atof("not_a_number")silently returns 0.0, soNCCL_TESTS_MIN_BW=garbagesilently sets the minimum bandwidth threshold to 0.0 instead of the intended default of -1 (disabled).The codebase already has a
parseInt()helper usingstrtoul()with error checking, but it's behind#ifdef MPI_SUPPORTand not used for these env vars.timer.ccnarrowing conversions are semantically safe (durations are non-negative, nanosecond counts fit in double precision) but should be explicit to satisfy-Wconversionand signal intentionality to reviewers.Git history
The affected code spans three files and three different time periods:
d313d20common.cu,timer.ccNCCL_TESTS_MIN_BWwithatof()and thetimer.ccnarrowing patterns00f5281common.cu,util.cuNCCL_TESTS_DEVICEhandling withatoi()into the current structure; added theutil.cucopy760c467common.cuatof()forNCCL_TESTS_MIN_BWin the current locationThe
atoipattern forNCCL_TESTS_DEVICEtraces back tod313d20(Sep 2022) — nearly 4 years old. Theatofpattern forNCCL_TESTS_MIN_BWwas originally introduced in the same 2022 commit. Both patterns have been carried forward through multiple refactors by different contributors. Thetimer.ccnarrowing has been unchanged since its creation in the same 2022 commit.These are all inherited patterns rather than deliberate choices by the current maintainers. The
atoi/atofidiom was standard practice in older C code but has been superseded bystrtol/strtodwhich provide error detection thatatoi/atoffundamentally cannot.Changes
Commit 1: Tests (TDD — intentionally FAIL before fix)
tests/c/test_input_validation.c— 9 test cases:test_source_verified— greps source files for unsafe patterns (intentionally FAILS)test_atoi_garbage_exploit— provesatoi("abc")is indistinguishable fromatoi("0")test_atoi_overflow_exploit— provesatoi("99999999999")wraps to garbage valuetest_atof_garbage_exploit— provesatof("xyz")is indistinguishable fromatof("0")test_strtol_validation— table-driven: 9 cases proving strtol correctnesstest_strtod_validation— table-driven: 7 cases proving strtod correctnesstest_strtol_device_id_pattern— reproduces the exact NCCL_TESTS_DEVICE safe parsingtest_signed_to_unsigned_cast— demonstrates signed→unsigned wrap behaviortest_uint64_to_double_precision— proves precision loss for uint64_t > 2^53Commit 2: Fix atoi/atof
common.cu:atoi→strtolforNCCL_TESTS_DEVICE,atof→strtodforNCCL_TESTS_MIN_BWutil.cu:atoi→strtolforNCCL_TESTS_DEVICECommit 3: Fix timer.cc narrowing
static_castfor explicit narrowing conversionsTest plan
make -C tests/c test— 9/9 pass after all commitsFiles changed
src/common.cu—atoi→strtol(NCCL_TESTS_DEVICE),atof→strtod(NCCL_TESTS_MIN_BW)src/util.cu—atoi→strtol(NCCL_TESTS_DEVICE)src/timer.cc— 3xstatic_castfor narrowing conversionstests/c/test_input_validation.c— 9 test cases (new file)🤖 Generated with Claude Code