-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Avoid CN OOM by Pulling User/Roles to DN When Cache Misses. #16888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
CRZbulabula
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/auth/ClusterAuthorityFetcher.java
Show resolved
Hide resolved
eeb9e25 to
f768082
Compare
CRZbulabula
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/auth/ClusterAuthorityFetcher.java
Show resolved
Hide resolved
|
And, how is this PR related to PIPE? |
f768082 to
4028775
Compare
4028775 to
b602ee6
Compare
| Role cachedRole = iAuthorCache.getRoleCache(rolename); | ||
| if (cachedRole == null) { | ||
| checkRoleFromConfigNode(username, rolename); | ||
| cachedRole = iAuthorCache.getRoleCache(rolename); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still a chance that the role is evicted between line 183 and 184, therefore, cachedRold can still be null.
One possible solution is to make checkRoleFromConfigNode() return the Role it just fetched.
And may rename checkRoleFromConfigNode() to fetchRoleFromConfigNode() which is more precise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And what if the role is deleted concurrently? You can still not fetch it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have resolved the possible NPE issue, but the concurrency problem requires a more systematic design.
|
Avoid using force-push, which will mix your new changes with the old ones. |
When there is no User Cache on the DN side, the entire time series will be handed over to CN for authentication, which ultimately leads to CN OOM in some PIPE scenarios with a large number of time series. To solve this problem, it is necessary to prohibit passing all time series to CN during authentication, and instead pull the user and it's corresponding roles to dn when the user cache and role caches fail.