Simplified rollout triggers and CRD design doc #34959
- Change `forcePromote` from `Uuid` to `Option<String>` - Instead of triggering promotion when matching the UUID of `requestRollout`, it triggers promotion when matching the hash stored in `status.requestedRolloutSpecHash`.
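A minimal sketch of the matching rule described in this bullet, assuming both values are available as optional strings. The function name `should_force_promote` and its signature are hypothetical and not taken from the design doc or the operator code:

```rust
/// Hypothetical helper: returns true only when `forcePromote` is set and
/// equals the hash recorded in `status.requestedRolloutSpecHash`.
fn should_force_promote(force_promote: Option<&str>, requested_hash: Option<&str>) -> bool {
    match (force_promote, requested_hash) {
        // Promotion is forced only for the rollout that is currently requested.
        (Some(force), Some(requested)) => force == requested,
        // An unset `forcePromote`, or no recorded request, never forces promotion.
        _ => false,
    }
}
```

Under this sketch, a stale `forcePromote` value left over from an earlier rollout stops matching as soon as the requested hash changes.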
**Status changes:**
- Replace `lastCompletedRolloutRequest` (`Uuid`) with `lastCompletedRolloutSpecHash` (`Option<String>`) - Stores the spec hash of the last successful rollout. Will be `None` if first deploying or if upgrading from `v1alpha1`.
once we get ready to move to a real v1, we should be able to drop the Option here - at that point, the only time we don't have a value is when the cr is first created, but we already have the spec at that point, so we can just fill it in.
I don't think that is true. This is the last completed rollout request. A lot happens between starting a rollout and considering it complete. We save the status multiple times along the way, and we should indicate that it isn't complete in that case.
my point though is that in all of those in progress states, the last completed hash will be different from the requested rollout hash - we aren't actually gaining any extra information from the option. the way it's described here, the option is None if and only if the hashes are equal, which i think just adds an extra invalid state for no benefit?
I agree about the requested rollout hash, just not about the last completed hash. Specifically, what would you put there during the first rollout?
oh, yeah, sorry if i was unclear, i was only referring to the requested rollout hash here - the last completed rollout hash being None before the first rollout makes sense to me.
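For reference, the shape this thread converges on could be sketched as follows. This is an illustration only: the struct name `RolloutStatus`, the snake_case field names, and the `is_up_to_date` helper are assumptions, not the actual CRD types.

```rust
/// Hypothetical status shape: the requested hash is always set, while the
/// last completed hash stays `None` until the first rollout finishes.
#[derive(Debug)]
struct RolloutStatus {
    /// `None` before the first completed rollout (or after upgrading from `v1alpha1`).
    last_completed_rollout_spec_hash: Option<String>,
    /// Always present, so "no value" is not a separate state to reason about.
    requested_rollout_spec_hash: String,
}

impl RolloutStatus {
    /// A rollout is in progress exactly when the two hashes differ.
    fn is_up_to_date(&self) -> bool {
        self.last_completed_rollout_spec_hash.as_deref()
            == Some(self.requested_rollout_spec_hash.as_str())
    }
}
```

With this shape, every in-progress state is expressed by the two hashes differing, which is the point made above about the `Option` on the requested hash adding no information.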
- Replace `resourcesHash` (`String`) with `requestedRolloutSpecHash` (`Option<String>`) - Stores the spec hash of the currently requested rollout. Will be `None` when no rollout is ongoing.
i think it'd probably be simpler to have this just always be set (not an Option) - the UpToDate condition will be an easier thing for users to check. it feels like otherwise it'd be possible to accidentally get into an invalid state (`lastCompletedRolloutSpecHash == requestedRolloutSpecHash`) which would be hard to figure out how to recover from. it's easier if we just don't make that kind of invalid state representable.
I think you're right. We can always set it to the calculated hash of the current CR, except if we have already begun promoting.
Note to self, ensure we set any status updates after we've reached promoting state using the value from the status, not the currently calculated value.
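A rough sketch of that rule, assuming the reconciler can tell whether promotion has begun. The `RolloutPhase` enum and the `requested_hash_to_record` function are hypothetical names used only for illustration:

```rust
/// Hypothetical two-way split of the rollout lifecycle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RolloutPhase {
    /// Promotion has not started; the spec may still change freely.
    BeforePromotion,
    /// Promotion has begun (or finished); the rollout is pinned.
    PromotingOrLater,
}

/// Picks the value to write into `status.requestedRolloutSpecHash`.
fn requested_hash_to_record(
    phase: RolloutPhase,
    calculated_hash: &str,
    hash_in_status: &str,
) -> String {
    match phase {
        // Before promotion, always track the hash of the current CR.
        RolloutPhase::BeforePromotion => calculated_hash.to_owned(),
        // Once promoting, reuse the value already in the status so a
        // concurrent spec edit cannot relabel the in-flight rollout.
        RolloutPhase::PromotingOrLater => hash_in_status.to_owned(),
    }
}
```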
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
- `environmentdScratchVolumeStorageRequirement`
- `serviceAccountName`
- `serviceAccountAnnotations`
- `serviceAccountLabels`
the service account annotations and labels are also applied immediately, so they probably shouldn't be here (although a change to serviceAccountName will require a rollout since that needs to update the corresponding field on the statefulset)
I think they might still be load bearing, despite being applied immediately. If we change the annotations on the service account to add an AWS IAM role ARN for example, do the credentials get applied to existing pods? I'm not sure if they do.
ah, i guess that's true. may be worth testing to see what the behavior here is, but i think you're probably right. we'll probably want to leave a comment explaining this, since it's not immediately obvious
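One way the explanatory comment could sit next to the hashed fields is sketched below. Everything here is an assumption for illustration: the struct `RolloutTriggerFields`, the function `rollout_spec_hash`, and the use of the standard library's `DefaultHasher` (whose output is not guaranteed stable across Rust releases, so a real implementation would likely use a fixed digest instead). `environmentdScratchVolumeStorageRequirement` is left out because its type is not shown in this excerpt.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical subset of spec fields whose changes trigger a rollout.
#[derive(Hash)]
struct RolloutTriggerFields {
    /// Changing the name requires updating the StatefulSet, hence a rollout.
    service_account_name: Option<String>,
    /// Applied to the ServiceAccount immediately, but still hashed: it is
    /// unverified whether existing pods pick up changes such as a new
    /// AWS IAM role ARN annotation without being replaced.
    service_account_annotations: BTreeMap<String, String>,
    /// Hashed for the same reason as the annotations.
    service_account_labels: BTreeMap<String, String>,
}

/// Hashes the trigger fields into a hex string. `BTreeMap` iterates in key
/// order, so insertion order does not affect the result.
fn rollout_spec_hash(fields: &RolloutTriggerFields) -> String {
    let mut hasher = DefaultHasher::new();
    fields.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

In this sketch, adding an annotation such as `eks.amazonaws.com/role-arn` changes the hash and would therefore request a new rollout, which matches the cautious option discussed above.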
Motivation
Part of https://linear.app/materializeinc/issue/DEP-7/design-simplifying-upgrade-rollouts-node-rolls-converted-to-project
Tips for reviewer
Checklist
`$T ⇔ Proto$T` mapping (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.