Revert "strchriscntrl: reject C1 control bytes (0x80-0x9F)" #1602
hallyn merged 2 commits into shadow-maint:master
Conversation
C1 control bytes are more complicated than that. They're represented as two bytes in UTF-8.
Commit 19d725d has issues: it rejects otherwise valid UTF-8 multi-byte characters.
We could in theory do correct parsing of UTF-8, possibly parsing the multi-byte sequences, or translating to wchar_t. However, that would complicate the source code well beyond what I'd be comfortable with.
Instead, let's revert this, and claim no intention to support UTF-8. If an admin uses a UTF-8 locale while reading /etc/passwd, that's their own fault.
Reverts: 19d725d (2026-03-13; "strchriscntrl: reject C1 control bytes (0x80-0x9F)")
Fixes: 19d725d (2026-03-13; "strchriscntrl: reject C1 control bytes (0x80-0x9F)")
Closes: <shadow-maint#1598>
Reported-by: Mantas Mikulėnas <grawity@gmail.com>
Cc: KhaelK-Praetorian <khael.kugler@praetorian.com>
Cc: Tobias Stoeckmann <tobias@stoeckmann.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
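To illustrate the problem described in the commit message, here is a minimal sketch in C. Both helpers are hypothetical and written only for this illustration; neither is the actual strchriscntrl code from shadow. The first flags any single byte in 0x80-0x9F, which is roughly the kind of check the reverted commit's title describes. The second looks for U+0080-U+009F as encoded in UTF-8 (lead byte 0xC2 followed by 0x80-0x9F), and assumes the input is already well-formed UTF-8, which it does not verify.

```c
#include <stdbool.h>

/* Hypothetical helper (not shadow's strchriscntrl): true if any single
 * byte of s lies in 0x80..0x9F, regardless of its role in a sequence. */
static bool
has_byte_in_c1_range(const char *s)
{
	const unsigned char *p = (const unsigned char *) s;

	for (; *p != '\0'; p++) {
		if (*p >= 0x80 && *p <= 0x9F)
			return true;
	}
	return false;
}

/* Hypothetical helper: true if s contains U+0080..U+009F encoded as
 * UTF-8, i.e. the lead byte 0xC2 followed by a byte in 0x80..0x9F.
 * Assumes s is already well-formed UTF-8; nothing is validated here. */
static bool
has_utf8_c1_control(const char *s)
{
	const unsigned char *p = (const unsigned char *) s;

	for (; *p != '\0'; p++) {
		/* p[1] is at worst the terminating NUL, so this read is in bounds. */
		if (p[0] == 0xC2 && p[1] >= 0x80 && p[1] <= 0x9F)
			return true;
	}
	return false;
}
```

For the string "Mikul\xC4\x97" "nas" (the letter ė, U+0117, is 0xC4 0x97 in UTF-8), the byte-wise helper returns true because 0x97 falls in the range, even though the string contains no control character, while the sequence-aware helper returns false. For "a\xC2\x85" "b" (U+0085), both return true. The second helper is still not a full solution: it does not validate the input and says nothing about other encodings.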
Is there any place where
I guess we could use release notes. And/or maybe we should add something to passwd(5) in a CAVEATS section.
I prefer the second option, as people are more likely to read it. Do you want to use this PR to update it, or shall we leave it for later?
I'll update this PR.
Document that when reading passwd(5), the C locale should be used.
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Mantas Mikulėnas <grawity@gmail.com>
Cc: KhaelK-Praetorian <khael.kugler@praetorian.com>
Cc: Tobias Stoeckmann <tobias@stoeckmann.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Done.
ikerexxe
left a comment
LGTM! Thank you for taking care of this patch.
I won’t merge it just yet, in case any native English speakers would like to review the updated documentation.
hallyn
left a comment
Not entirely comfortable with it. I see I have UTF8 as my locale on my laptop. But I agree that we're inviting trouble if we try to do the parsing.
Revisions:
v2