Open
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens the Binary Fuse filter population/retry logic (fixing a sentinel-reset corruption case) and adds a lightweight “coverage mark” mechanism plus targeted unit tests to exercise rare retry paths deterministically.
Changes:
- Fix
reverseOrderreset behavior to avoid clearing the sentinel during retry/reset paths. - Introduce
CovMarkinfrastructure (COVMARK_HIT/COVMARK_CHECK_HIT_IN_CURR_SCOPE) to assert execution of rare internal paths from tests. - Add BinaryFuseFilter unit tests covering error retry, peeling failure retry, and duplicate-removal paths; remove DiskIndex’s legacy retry workaround.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/util/test/BinaryFuseTests.cpp | Adds tests that force rare BinaryFuseFilter populate/retry paths and assert they were hit. |
| src/test/CovMark.h | Defines the coverage-mark API, counters, and RAII guard used by tests. |
| src/test/CovMark.cpp | Defines the global coverage counter storage for test builds. |
| src/bucket/DiskIndex.cpp | Removes the previous retry-on-out_of_range workaround when constructing the filter. |
| lib/binaryfusefilter.h | Implements sentinel-safe reset, adds test forcing hooks, and emits coverage marks in retry/duplicate paths. |
bc7795f to
2fdc666
Compare
dmkozh
reviewed
Apr 3, 2026
| // pre-hashed keys) can produce collisions even when the input keys are unique. | ||
| // To simulate this, we just duplicate all the input hashes, which guarantees | ||
| // collisions. | ||
| TEST_CASE("binary fuse filter duplicate removal", "[BinaryFuseFilter]") |
Contributor
There was a problem hiding this comment.
Do you understand why this fails in CI?
Contributor
Author
There was a problem hiding this comment.
There was some UB due to how I copied a vector in test setup, should be fixed now.
2fdc666 to
246acdc
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This fixes a bug in the binary fuse filter, where very occasionally we would corrupt a value in the index. Specifically, the last element in
reverseOrderis a sentinel value that should never be reset, even in the case of underflow error. However, when underflow error occurred, we would reset the entire vector, including the sentinel.The first commit fixes this issue. The 2nd commit adds code coverage support (shamelessly stolen from @graydon #5012) and uses it to add more comprehensive unit tests for edge cases, including the sentinel issue.
Checklist
clang-formatv8.0.0 (viamake formator the Visual Studio extension)