Speed up views with ICU sort keys #6050
Open
nickva wants to merge 2 commits into
Add a sort key libicu NIF function. A sort key is an opaque binary representation generated by libicu from a key, which can then be compared directly against other sort keys to produce the same collation order as calling the pair-wise libicu comparison function.
The idea is to use sort keys in the fabric view row "merge head" structure, where we merge together streaming rows from multiple workers. When we do that we either keep a sorted list (for map-only views), do an insertion sort step and take the minimum, or we keep the rows in a key/value structure (for reduce views) and find the minimum key and its grouped values. In either case we can reduce the number of libicu compare(a,b) calls from O(K^2) to just O(K) sort key generating calls, and since libicu calls are not cheap, it is worth adding an extra NIF call just for that.
As a side note: we actually implemented this once during the now abandoned CouchDB 4.0 attempt with a FoundationDB backend. There we stored the sort keys in the database, which the libicu maintainers do not recommend doing. Here we plan to use them in memory only, on the coordinator.
https://unicode-org.github.io/icu/userguide/collation/concepts#sortkeys-vs-comparison
In the previous commit we implemented sort keys and here is where we're using
them to optimize views.
On coordinators there are two separate places we optimize: the reduce views and
map-only views. They are implemented somewhat differently. In both cases we win
by generating the sort key once per row as it comes in, paying the CPU price
once, before we merge-sort the row or insert it into the gb_tree when reducing.
After that we only do Erlang comparisons, avoiding expensive repeated ICU
pair-wise calls.
Both map and reduce views use a common buf_key/2 function to generate the
sort key or a raw key, depending on the user's collator setting.
The map-only change is relatively straightforward. We just use
`{{buf_key, Id}, Row}` as the sortable rows and keep the same merge-sort
behavior.
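A minimal sketch of the map-only idea, under loud assumptions: `fake_sort_key/1` is a hypothetical stand-in for the ICU sort key NIF (it only case-folds), the `raw` and `icu` collation atoms and the argument order of `buf_key/2` are guesses, and rows are modeled as hypothetical `{Key, Id, Value}` tuples. It is not the code from this PR; it only shows that once rows are wrapped as `{{BufKey, Id}, Row}`, plain Erlang term ordering is enough to merge them.

```erlang
-module(map_merge_sketch).
-export([buf_key/2, decorate/2, merge/2, rows/1]).

%% Hypothetical stand-in for the ICU sort key NIF. Case folding yields a
%% binary whose byte order is case-insensitive; real ICU sort keys encode
%% full collation rules.
fake_sort_key(Key) when is_binary(Key) ->
    string:casefold(Key).

%% Assumed signature and collation atoms: raw collation keeps the key as-is,
%% anything else goes through the (stand-in) sort key generator.
buf_key(Key, raw) -> Key;
buf_key(Key, icu) -> fake_sort_key(Key).

%% Wrap each row once, as it arrives, so later comparisons never call ICU.
decorate(Collation, Rows) ->
    [{{buf_key(Key, Collation), Id}, Row} || {Key, Id, _Value} = Row <- Rows].

%% Both lists must already be sorted by their {BufKey, Id} wrapper;
%% lists:merge/2 then compares with native term ordering only.
merge(Buffer, WorkerRows) ->
    lists:merge(Buffer, WorkerRows).

%% Strip the wrappers to recover the original rows in merged order.
rows(Buffer) ->
    [Row || {_SortWrapper, Row} <- Buffer].
```

Including `Id` in the wrapper keeps rows with equal keys in a deterministic order without any extra comparison function.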
For reduce views we actually get a nice simplification. Previously, we had a
map keyed by ejson key and searched over it (order O(N)) on every emit. There
were two distinct steps: 1) find the lowest key 2) find any other keys
collating equal to it. We don't have to do that any longer: with gb_trees we
use the buf_key as the key and simply take the smallest (or greatest) key.
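The reduce-side simplification can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `fake_sort_key/1` is a hypothetical stand-in for the ICU sort key NIF, the `fwd`/`rev` direction atoms and the `{FirstKey, Values}` node layout are invented for the example. Keys that collate equal produce the same sort key, so they land in the same tree node and no second "find equal keys" pass is needed.

```erlang
-module(reduce_buf_sketch).
-export([new/0, add/3, take_next/2]).

%% Hypothetical stand-in for the ICU sort key NIF (case folding only).
fake_sort_key(Key) when is_binary(Key) ->
    string:casefold(Key).

new() ->
    gb_trees:empty().

%% Insert one emitted key/value. Keys that collate equal share a sort key,
%% so their values are grouped under a single node.
add(Key, Value, Tree) ->
    BufKey = fake_sort_key(Key),
    case gb_trees:lookup(BufKey, Tree) of
        none ->
            gb_trees:insert(BufKey, {Key, [Value]}, Tree);
        {value, {FirstKey, Values}} ->
            gb_trees:update(BufKey, {FirstKey, [Value | Values]}, Tree)
    end.

%% Pop the next group from a non-empty tree: the smallest key for ascending
%% order, the largest for descending order.
take_next(fwd, Tree) ->
    {_BufKey, Group, Rest} = gb_trees:take_smallest(Tree),
    {Group, Rest};
take_next(rev, Tree) ->
    {_BufKey, Group, Rest} = gb_trees:take_largest(Tree),
    {Group, Rest}.
```

Both lookups and the take operations are logarithmic in the number of distinct keys, compared with the linear scan over a map described above.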
A quick benchmark of 100k docs with Q=8 showed a decent speedup:
```
reduce (group level=3) : 5974ms -> 3294ms (1.8x)
maps : 2699ms -> 1917ms (1.4x)
```
Add a sort-key ICU NIF and use it to optimize view merging and sorting on the coordinator.
There are two separate commits. The first one implements the NIF and the subsequent one uses it to optimize fabric view handling.
ICU NIF Implementation
Add a sort key libicu NIF function. A sort key is an opaque binary representation generated by libicu from a key, which can then be compared directly against other sort keys to produce the same collation order as calling the pair-wise libicu comparison function.
The idea is to use sort keys in the fabric view row "merge head" structure, where we merge together streaming rows from multiple workers. When we do that we either keep a sorted list (for map-only views), do an insertion sort step and take the minimum, or we keep the rows in a key/value structure (for reduce views) and find the minimum key and its grouped values. In either case we can reduce the number of libicu compare(a,b) calls from O(K^2) to just O(K) sort key generating calls, and since libicu calls are not cheap, it is worth adding an extra NIF call just for that.
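To make the trade-off concrete, here is a small sketch that contrasts the two approaches. Everything in it is illustrative: `fake_sort_key/1` and `fake_compare/2` are hypothetical stand-ins for the ICU sort key NIF and the pair-wise ICU comparison (they only case-fold), not the real functions. The point is only that the second variant calls the collator once per key and then relies on native Erlang term comparison.

```erlang
-module(sort_key_sketch).
-export([sort_pairwise/1, sort_with_keys/1]).

%% Hypothetical stand-in for the ICU sort key NIF: an opaque binary whose
%% plain byte order matches the desired (here: case-insensitive) order.
fake_sort_key(Key) when is_binary(Key) ->
    string:casefold(Key).

%% Hypothetical stand-in for the pair-wise ICU compare(a, b) call.
fake_compare(A, B) ->
    fake_sort_key(A) =< fake_sort_key(B).

%% Before: every comparison the sort performs goes through the collator.
sort_pairwise(Keys) ->
    lists:sort(fun fake_compare/2, Keys).

%% After: one sort key per key, then only native term comparisons.
sort_with_keys(Keys) ->
    Decorated = [{fake_sort_key(K), K} || K <- Keys],
    [K || {_SortKey, K} <- lists:sort(Decorated)].
```

For keys that do not collate equal to each other, both functions return the same order; they differ only in how often the expensive collation step runs.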
As a side note: we actually implemented this once during the now abandoned CouchDB 4.0 attempt with a FoundationDB backend. There we stored the sort keys in the database, which the libicu maintainers do not recommend doing. Here we plan to use them in memory only, on the coordinator.
https://unicode-org.github.io/icu/userguide/collation/concepts#sortkeys-vs-comparison
The Optimization Per-se
On coordinators there are two separate places we optimize: the reduce views and map-only views. They are implemented somewhat differently. In both cases we win by generating the sort key once per row as it comes in, paying the CPU price once, before we merge-sort the row or insert it into the gb_tree when reducing. After that we only do Erlang comparisons, avoiding expensive repeated ICU pair-wise calls.
Both map and reduce views use a common `buf_key/2` function to generate the sort key or a raw key, depending on the user's collator setting.
The map-only change is relatively straightforward. We just use `{{buf_key, Id}, Row}` as the sortable rows and keep the same merge-sort behavior.
For reduce views we actually get a nice simplification. Previously, we had a map keyed by ejson key and searched over it (order O(N)) on every emit. There were two distinct steps: 1) find the lowest key 2) find any other keys collating equal to it. We don't have to do that any longer: with gb_trees we use the buf_key as the key and simply take the smallest (or greatest) key.
A quick benchmark of 100k docs with Q=8 showed a decent speedup: reduce (group level=3) went from 5974ms to 3294ms (1.8x), and maps from 2699ms to 1917ms (1.4x).