shufti-double: fix regressions from #325#368
Open
byeonguk-jeong wants to merge 4 commits intoVectorCamp:developfrom
Open
shufti-double: fix regressions from #325#368byeonguk-jeong wants to merge 4 commits intoVectorCamp:developfrom
byeonguk-jeong wants to merge 4 commits intoVectorCamp:developfrom
Conversation
run_accel() passed c_end - 1 to shuftiDoubleExec(). shuftiDoubleExecReal already handles the last-byte boundary internally via check_last_byte(), so shortening the buffer caused it to miss valid matches near the end and apply the wildcard check to the wrong byte. Changed to pass c_end. Fixes: ca70a3d ("Fix double shufti's vector end false positive (VectorCamp#325)") Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
first_char_mask was reset to Ones() after the peel-off block, discarding carry-over state for cross-boundary pattern detection. Remove the reset. Fixes: ca70a3d ("Fix double shufti's vector end false positive (VectorCamp#325)") Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
loop condition i > 2 missed the final vshr(1), leaving odd-indexed bytes out of the reduce. Use i >= 2. Fixes: ca70a3d ("Fix double shufti's vector end false positive (VectorCamp#325)") Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
Author
|
Test failures on AVX512 are fixed at #373. It is not related to these commits. |
91af906 to
1898ab9
Compare
ypicchi-arm
approved these changes
Mar 20, 2026
ypicchi-arm
left a comment
There was a problem hiding this comment.
Aside the assumption of vector size in the test, LGTM. Thanks for the fix.
Add three tests exercising the double shufti edge cases.
- ExecMatchVectorEdge: Two-byte pair ("ab") spanning the peel-off to
aligned block boundary is correctly detected. Validates that
first_char_mask state carries over and is not reset after peel-off.
- ExecNoMatchLastByte: First character of a double-byte pair ('x' from
"xy") at the last buffer byte does not cause a false positive when
the second character is absent.
- ExecMatchLastByte: Single-byte pattern ('a') at the last buffer byte
is detected via check_last_byte's reduce.
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
1898ab9 to
ee06af2
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
As reported in BUG: scanning returns no matches with multi-pattern database; but minimal set still matches #358, Fix double shufti's vector end false positive #325 introduced a false negative. The caller side told the buffer length as c_end - 1, so that shuftiDoubleExec() unexpectedly tried to match the byte before the last byte with single char patterns.
check_last_byte() reduces the mask to make a wildcard mask, which matches any character that could match as the second byte of a double-byte pattern. However, the loop condition i > 2 skipped the final vshr(1) step, causing only even-indexed bytes to be folded into the reduce result.
After peel-off stage for alignment, the carry-over state was discarded.
Add some unit tests for this fixes.
@AhnLab-OSS @AhnLab-OSSG