Skip to content

Harden numeric value parsing#177

Merged
Taywee merged 1 commit into
Taywee:masterfrom
metsw24-max:embedded-nul-numeric-validation
Jun 2, 2026
Merged

Harden numeric value parsing#177
Taywee merged 1 commit into
Taywee:masterfrom
metsw24-max:embedded-nul-numeric-validation

Conversation

@metsw24-max
Copy link
Copy Markdown
Contributor

Fix integer parsing to correctly reject values containing embedded NUL ('\0') characters.

The integer parsing path uses std::strtoull / std::strtoll, which operate on C strings and stop parsing at the first NUL byte. The existing validation checked for trailing data using *end == '\0', which allowed inputs such as:

"12\0garbage"

to be accepted as the value 12 while silently ignoring the remaining contents of the string.

This change validates parsing against the actual length of the input string rather than the first NUL terminator, ensuring that all characters in the provided std::string are accounted for.

Changes

  • Introduced a length-based end pointer (stop) derived from value.size().
  • Updated whitespace-skipping logic to stop at the true end of the string.
  • Replaced NUL-terminator validation with full-buffer validation.
  • Added a regression test covering embedded-NUL numeric inputs.

Behavior

Before:

"12\0garbage" -> accepted as 12

After:

"12\0garbage" -> rejected

Valid numeric forms continue to parse correctly, including:

  • Decimal values (12)
  • Signed values (-7)
  • Hexadecimal values (0x10)
  • Octal values (010)
  • Whitespace-padded values

Testing

Added regression coverage for:

  • Embedded NUL followed by trailing data
  • Embedded NUL after whitespace
  • Standalone NUL values
  • Signed and unsigned integer types

Verified that valid numeric inputs continue to parse successfully.

@Taywee
Copy link
Copy Markdown
Owner

Taywee commented Jun 1, 2026

This library is for command line argument parsing. Are you aware of a system that can actually supply a null byte to arguments?

@metsw24-max
Copy link
Copy Markdown
Contributor Author

You're right that an embedded NUL can't come from the OS command line itself — execve/CreateProcess ultimately operate on NUL-terminated strings, so from actual argv input this isn't reachable.

The reason I proposed the change is that the parsing API can also be used with caller-provided std::string values, where embedded NULs are possible. While looking at it, I noticed the integer and floating-point readers handle the same input differently:

int i;
reader("x", std::string("12\0garbage", 10), i);    // accepted as 12

double d;
reader("x", std::string("12\0garbage", 10), d);    // rejected

What motivated the patch was mainly this inconsistency rather than the embedded-NUL case itself. The integer path validates using the C-string terminator, while the floating-point path already rejects trailing data beyond the NUL. The change makes the integer reader validate against the actual string length and behave consistently with the floating-point reader.

That said, if inputs outside normal argv usage are considered out of scope for this library, I'm happy to follow your preference on this PR

@Taywee
Copy link
Copy Markdown
Owner

Taywee commented Jun 2, 2026

That is reasonable enough. It's really an edge case, but it doesn't bloat the code, and it's a moderate improvement, so I'm happy with it. Thanks for the PR.

@Taywee Taywee merged commit c23c875 into Taywee:master Jun 2, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants