Normalize endpoints in ClockSkew retry plugin to prevent memory leak #3336
+315
−25
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Fix memory leak in ClockSkew retry plugin
Fixes #3332
Problem
Long-running applications with high request volume (1000+ ops/sec) experience memory exhaustion due to unbounded growth in the clock skew correction hashes. Each unique URL path creates a separate hash entry instead of sharing one correction per server.
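To illustrate the growth pattern, here is a minimal sketch, not the plugin's actual code. The helper endpoint_key, the two hashes, and the s3.example.com URLs are all hypothetical; it only contrasts keying a hash by full URL with keying it by scheme://host:port.

```ruby
require 'uri'

# Hypothetical helper (not the SDK's API): reduce a URL to scheme://host:port,
# dropping the path and query string.
def endpoint_key(url)
  uri = URI(url)
  "#{uri.scheme}://#{uri.host}:#{uri.port}"
end

# 1000 requests to the same server, each with a unique path and query.
urls = (1..1000).map do |i|
  "https://s3.example.com/bucket/object-#{i}?versionId=#{i}"
end

per_url = {}       # keyed by full URL: one entry per unique request
per_endpoint = {}  # keyed by normalized endpoint: one entry per server

urls.each do |url|
  per_url[url] = 0
  per_endpoint[endpoint_key(url)] = 0
end

puts per_url.size      # 1000
puts per_endpoint.size # 1
```

With the full URL as key, the hash grows with every distinct path; with the normalized key it stays at one entry per server.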
Solution
Normalize endpoints to scheme://host:port format, removing paths/queries
Testing
Further opportunities
While this solution normalizes the scheme of endpoint URLs, it does not normalize the host name. In other words, https://example.com and https://EXAMPLE.com would still be treated as two different endpoint servers. This could easily be addressed with Ruby's URI().normalize, which already honors RFC 3986, but that comes at the cost of extra object allocations and is likely scope creep for this PR, so I decided to leave it to the team as a later decision.
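For reference, a small sketch of the host-case difference described above, using only Ruby's standard uri library. The helpers raw_key and normalized_key are hypothetical names for illustration and are not part of this PR.

```ruby
require 'uri'

# Hypothetical helper: build a key without normalizing the host.
def raw_key(url)
  uri = URI(url)
  "#{uri.scheme}://#{uri.host}:#{uri.port}"
end

# Hypothetical helper: same key, but built from URI#normalize,
# which returns a new URI object with scheme and host lowercased.
def normalized_key(url)
  uri = URI(url).normalize
  "#{uri.scheme}://#{uri.host}:#{uri.port}"
end

puts raw_key('https://example.com')        # https://example.com:443
puts raw_key('https://EXAMPLE.com')        # https://EXAMPLE.com:443
puts normalized_key('https://EXAMPLE.com') # https://example.com:443
```

Because URI#normalize returns a copy rather than mutating the receiver, each call allocates an additional URI object, which is the allocation cost mentioned above.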