wire protocol: use dedicated RPC calls for all pod and container life-cycle events. #274
klihub wants to merge 5 commits into containerd:main
I know we were talking about a "nritest" tool; this would be a really good first use-case for it. We'd want to test these combinations:
And as discussed, we'll need to expose whether old RPCs were used back up to the runtime so it can emit warnings about it too.
Define dedicated RPC calls on the wire for all pod lifecycle events. Update stub, builtin and WASM plugins accordingly. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Define dedicated RPC calls on the wire for all container lifecycle events. Update stub, builtin and WASM plugins accordingly. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
mikebrow
left a comment
Looks great, just a few nits on the docs.
It probably needs a README/doc explaining the change, the wrapper support, and any deprecation plans.
	return nil
// StateChange implements PluginService of the NRI API.
func (b *BuiltinPlugin) StateChange(_ context.Context, _ *api.StateChangeEvent) (*api.StateChangeResponse, error) {
	// TODO: remove eventually once StateChange is removed from the wire protocol.
Yes, it's a good question. I think it also depends a bit on the release cadence of the runtimes. Since it should be fairly cheap to keep the fallback to StateChange around, I think we can be conservative about the timing of its deprecation and eventual removal.
Update to use the new dedicated RPC calls to relay all pod and
container lifecycle events to plugins. Remove old StateChange
which used to multiplex {Run,PostUpdate,Stop,Remove}PodSandbox
pod and {PostCreate,Start,PostStart,PostUpdate,Remove}Container
container events.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Add a backward compatibility wrapper for old plugins which do not support the full set of dedicated lifecycle events yet. Fall back to relaying StateChange events to them. Record deprecation warnings for each unimplemented new plugin interface where we need to fall back and funnel calls through the old StateChange RPC call. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
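The fallback described in this commit message can be pictured roughly as follows. This is only a minimal sketch, not the code from the PR: the reduced `Plugin` interface, `ErrUnimplemented`, `compatPlugin`, and the once-per-call warning bookkeeping are all hypothetical names and assumptions made for illustration, and only a single lifecycle event is modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnimplemented is a hypothetical stand-in for the error a transport
// reports when the remote side does not know an RPC method.
var ErrUnimplemented = errors.New("unimplemented")

// Event names the lifecycle event relayed through the legacy call.
type Event string

// Plugin is a hypothetical, heavily reduced plugin interface: one dedicated
// lifecycle call plus the legacy catch-all StateChange call.
type Plugin interface {
	RunPodSandbox(pod string) error
	StateChange(evt Event, pod string) error
}

// compatPlugin wraps a Plugin. If the dedicated call is unimplemented it
// falls back to StateChange and records a deprecation warning, once per
// missing call (the deduplication is an assumption of this sketch).
type compatPlugin struct {
	Plugin
	warnings []string
	warned   map[Event]bool
}

func newCompatPlugin(p Plugin) *compatPlugin {
	return &compatPlugin{Plugin: p, warned: map[Event]bool{}}
}

func (c *compatPlugin) RunPodSandbox(pod string) error {
	err := c.Plugin.RunPodSandbox(pod)
	if !errors.Is(err, ErrUnimplemented) {
		return err // success, or a genuine error from the dedicated call
	}
	const evt Event = "RunPodSandbox"
	if !c.warned[evt] {
		c.warned[evt] = true
		c.warnings = append(c.warnings, fmt.Sprintf("does not implement %s", evt))
	}
	// Funnel the event through the old multiplexed call instead.
	return c.Plugin.StateChange(evt, pod)
}

// oldPlugin models a plugin built against the old protocol only.
type oldPlugin struct{ seen []Event }

func (o *oldPlugin) RunPodSandbox(string) error { return ErrUnimplemented }

func (o *oldPlugin) StateChange(evt Event, _ string) error {
	o.seen = append(o.seen, evt)
	return nil
}

func main() {
	old := &oldPlugin{}
	p := newCompatPlugin(old)
	_ = p.RunPodSandbox("pod0")
	_ = p.RunPodSandbox("pod1")
	fmt.Println("relayed via StateChange:", old.seen)
	fmt.Println("deprecation warnings:", p.warnings)
}
```

In this sketch the old plugin still receives both events through `StateChange`, while the wrapper keeps a single warning that could later be reported to the runtime.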
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Added a section about deprecated interfaces to the README and a subsection explaining the deprecation of StateChange.
	default:
		return fmt.Sprintf("unknown deprecation (%d)", d)
	}
}
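For context, the fragment above looks like the tail of a `String()` method on a deprecation identifier. A self-contained sketch of that pattern might look like the following; the type name `Deprecation` and the constant `DeprecatedStateChange` are assumptions made for illustration, and only the `default` branch text is taken from the snippet.

```go
package main

import "fmt"

// Deprecation identifies a deprecated interface. The type and constant
// names here are hypothetical; only the default-branch message below
// comes from the reviewed snippet.
type Deprecation int

const (
	// DeprecatedStateChange marks use of the legacy StateChange call.
	DeprecatedStateChange Deprecation = iota + 1
)

// String returns a human-readable name for the deprecation.
func (d Deprecation) String() string {
	switch d {
	case DeprecatedStateChange:
		return "StateChange"
	default:
		// %d formats the underlying integer and does not invoke String(),
		// so there is no recursion here.
		return fmt.Sprintf("unknown deprecation (%d)", d)
	}
}

func main() {
	fmt.Println(DeprecatedStateChange)
	fmt.Println(Deprecation(42))
}
```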
@samuelkarp I added the propagation of deprecation warnings back to the runtime and updated the linked working tree for containerd. With that in place, you now get warnings like this when you run an old plugin:
$ ctr deprecations list
ID LAST OCCURRENCE MESSAGE
io.containerd.deprecation/nri-plugin-interface 2026-03-18T20:15:59.221928179Z NRI plugin uses a deprecated interface.
$ journalctl -u containerd | grep "NRI plugin uses a deprecated interface"
Mar 18 20:15:04 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:04.861795837Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement RunPodSandbox" plugin=10-template
Mar 18 20:15:04 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:04.895085271Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement PostCreateContainer" plugin=10-template
Mar 18 20:15:04 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:04.948749187Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement StartContainer" plugin=10-template
Mar 18 20:15:04 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:04.956474857Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement PostStartContainer" plugin=10-template
Mar 18 20:55:54 n4c16-fedora-43-containerd containerd[15507]: time="2026-03-18T20:55:54.993906001Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement RemoveContainer" plugin=10-template
Mar 18 20:15:58 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:58.070528415Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement StopPodSandbox" plugin=10-template
Mar 18 20:15:59 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:59.221928179Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement RemovePodSandbox" plugin=10-template
@samuelkarp @chrishenzie PTAL, if you have spare cycles.
Just a timing note here: due to the changes in the release cycle and safe upgrade policy, we can deprecate but we likely should not remove this support from containerd until 2.7 (next regular release after 2.6 LTS). We could try to get this change in for 2.3 LTS (in the next few weeks) and target removal in 2.4 (next regular release after 2.3 LTS), but that'd be a pretty short time for plugin authors to upgrade.
nod.. deprecation notice is fine for now. Target removal for 2.7 SGTM
@samuelkarp So, is there anything remaining against merging this?
This PR updates the wire protocol/ttrpc API to have proper RPC calls and messages for all pod and container life-cycle events.
The added calls and messages replace the 'catch-most' StateChange call, which is now used to relay {Run,Stop,Remove}PodSandbox pod and {PostCreate,Start,PostStart,PostUpdate,Remove} container events using a single 'event type + field union' message on the wire. The visible stub/plugin and runtime adaptation interfaces are unaffected. The PR leaves StateChange in the protocol to allow transparent backward compatibility, and updates the runtime adaptation to fall back to using StateChange if the plugin side NRI version does not implement the new RPC calls yet.
This has been attempted a few times earlier in connection with other changes, but since we've had multiple other real functional changes in flight, we always shied away from the resulting rebase conflicts and seeing this through. This time this is a self-contained PR with no other functional changes.
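To make the 'event type + field union' idea from the description more concrete, here is a rough sketch of how a single multiplexed message can be demultiplexed into dedicated per-event calls. It is an illustration under stated assumptions, not the NRI API: `EventType`, `StateChangeEvent`, `Handlers`, `Dispatch`, and `recorder` are simplified, hypothetical stand-ins, and only a few events are modelled.

```go
package main

import "fmt"

// EventType is a hypothetical, reduced event discriminator.
type EventType int

const (
	RunPodSandbox EventType = iota + 1
	StopPodSandbox
	PostCreateContainer
)

// StateChangeEvent mimics an "event type + field union" message: one
// message shape for all events, with Container only set for container events.
type StateChangeEvent struct {
	Event     EventType
	Pod       string
	Container string
}

// Handlers is the dedicated-call shape: one method per lifecycle event,
// each taking only the fields that event needs.
type Handlers interface {
	RunPodSandbox(pod string) error
	StopPodSandbox(pod string) error
	PostCreateContainer(pod, ctr string) error
}

// Dispatch demultiplexes a StateChangeEvent into the matching dedicated call.
func Dispatch(h Handlers, e *StateChangeEvent) error {
	switch e.Event {
	case RunPodSandbox:
		return h.RunPodSandbox(e.Pod)
	case StopPodSandbox:
		return h.StopPodSandbox(e.Pod)
	case PostCreateContainer:
		return h.PostCreateContainer(e.Pod, e.Container)
	default:
		return fmt.Errorf("unknown event type %d", e.Event)
	}
}

// recorder is a toy Handlers implementation that logs the calls it receives.
type recorder struct{ calls []string }

func (r *recorder) RunPodSandbox(pod string) error {
	r.calls = append(r.calls, "RunPodSandbox "+pod)
	return nil
}

func (r *recorder) StopPodSandbox(pod string) error {
	r.calls = append(r.calls, "StopPodSandbox "+pod)
	return nil
}

func (r *recorder) PostCreateContainer(pod, ctr string) error {
	r.calls = append(r.calls, "PostCreateContainer "+pod+"/"+ctr)
	return nil
}

func main() {
	r := &recorder{}
	_ = Dispatch(r, &StateChangeEvent{Event: RunPodSandbox, Pod: "pod0"})
	_ = Dispatch(r, &StateChangeEvent{Event: PostCreateContainer, Pod: "pod0", Container: "ctr0"})
	fmt.Println(r.calls)
}
```

With dedicated calls on the wire, the `switch` and the partially filled union message both disappear from the hot path, which is the simplification the description is after.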
Although functionally straightforward, it's still painfully intrusive as it touches a lot of boilerplate. Currently we have 3 remaining PRs open which touch the protocol and will conflict: #166 (memory policy adjustment), #183 (plugin authentication), and #271 (version exchange). Of these, #166 and #271 probably have a small enough footprint to be fairly easy to rebase (either way around).
Here are my updated runtime test trees with these changes included: