wire protocol: use dedicated RPC calls for all pod and container life-cycle events. #274

Open
klihub wants to merge 5 commits into containerd:main from klihub:devel/dedicated-lifecycle-events

Conversation

@klihub
Member

@klihub klihub commented Feb 26, 2026

This PR updates the wire protocol/ttrpc API to have proper RPC calls and messages for all pod and container life-cycle events.

The added calls and messages replace the 'catch-most' StateChange call, which is currently used to relay {Run,Stop,Remove}PodSandbox pod events and {PostCreate,Start,PostStart,PostUpdate,Remove}Container container events using a single 'event type + field union' message on the wire. The visible stub/plugin and runtime adaptation interfaces are unaffected. The PR leaves StateChange in the protocol to allow transparent backward compatibility, and updates the runtime adaptation to fall back to StateChange if the plugin-side NRI version does not implement the new RPC calls yet.
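A minimal sketch of the fallback idea described above, not the PR's actual code. It assumes the runtime learns that a dedicated RPC is missing from an "unimplemented" error; the PR description does not say how detection works, so `errUnimplemented`, `plugin`, `callDedicated`, `callStateChange` and `relay` are all hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Event names a lifecycle event (only two are listed in this sketch).
type Event string

const (
	RunPodSandbox  Event = "RunPodSandbox"
	StartContainer Event = "StartContainer"
)

// errUnimplemented is a hypothetical stand-in for whatever signal tells
// the runtime that a plugin lacks a dedicated RPC.
var errUnimplemented = errors.New("unimplemented")

// plugin is a hypothetical plugin connection.
type plugin struct {
	name      string
	dedicated map[Event]bool // events the plugin serves via a dedicated RPC
	calls     []string       // RPCs actually issued, in order
}

// callDedicated issues the dedicated RPC for e, if the plugin has one.
func (p *plugin) callDedicated(e Event) error {
	if !p.dedicated[e] {
		return errUnimplemented
	}
	p.calls = append(p.calls, string(e))
	return nil
}

// callStateChange relays e through the old multiplexed StateChange call.
func (p *plugin) callStateChange(e Event) error {
	p.calls = append(p.calls, "StateChange("+string(e)+")")
	return nil
}

// relay tries the dedicated RPC first and falls back to StateChange.
// It reports whether the fallback was used so the caller can warn about it.
func relay(p *plugin, e Event) (usedFallback bool, err error) {
	err = p.callDedicated(e)
	if errors.Is(err, errUnimplemented) {
		return true, p.callStateChange(e)
	}
	return false, err
}

func main() {
	old := &plugin{name: "old-plugin", dedicated: map[Event]bool{}}
	fallback, err := relay(old, RunPodSandbox)
	fmt.Println(fallback, err, old.calls)
}
```

Returning the `usedFallback` flag from `relay` keeps the decision to warn with the caller, which is one way the runtime could later surface deprecation notices.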

This has been attempted a few times earlier in connection with other changes, but since we've had multiple other real functional changes in flight, we always shied away from the resulting rebase conflicts and never saw this through. This time it is a self-contained PR with no other functional changes.

Although functionally straightforward, it's still painfully intrusive, as it touches a lot of boilerplate. We currently have 3 open PRs which touch the protocol and will conflict: #166 (memory policy adjustment), #183 (plugin authentication), and #271 (version exchange). Of these, #166 and #271 are probably small enough in footprint to be rebased fairly easily (either way around).

Here are my updated runtime test trees with these changes included:

@samuelkarp
Member

I know we were talking about a "nritest" tool; this would be a really good first use-case for it. We'd want to test these combinations:

  • new runtime, new plugins (uses new RPCs)
  • new runtime, old plugins (uses old RPCs)
  • old runtime, new plugins (uses old RPCs)

And as discussed we'll need to expose whether old RPCs were used back up to the runtime so it can emit warnings about it too.
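The three combinations above can be written down as a small table-driven check, which is roughly the shape a compatibility test might take. This is only a sketch: `rpcPath` is a hypothetical helper, and the rule that only a new runtime talking to an old plugin emits a warning is an assumption drawn from the comment above, not confirmed behaviour.

```go
package main

import "fmt"

// rpcPath reports which RPC family a runtime/plugin pair ends up using and
// whether the runtime is in a position to warn about it.
// Assumption: only a new runtime knows about the deprecation, so only the
// "new runtime, old plugin" pair produces a warning.
func rpcPath(newRuntime, newPlugin bool) (path string, warn bool) {
	if newRuntime && newPlugin {
		return "dedicated", false
	}
	return "StateChange", newRuntime && !newPlugin
}

func main() {
	// The three combinations listed in the comment above.
	cases := []struct{ newRuntime, newPlugin bool }{
		{true, true},
		{true, false},
		{false, true},
	}
	for _, c := range cases {
		path, warn := rpcPath(c.newRuntime, c.newPlugin)
		fmt.Printf("newRuntime=%v newPlugin=%v -> %s (warn=%v)\n",
			c.newRuntime, c.newPlugin, path, warn)
	}
}
```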

@klihub klihub marked this pull request as ready for review March 6, 2026 17:22
klihub added 2 commits March 18, 2026 11:18
Define dedicated RPC calls on the wire for all pod lifecycle events.
Update stub, builtin and WASM plugins accordingly.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Define dedicated RPC calls on the wire for all container lifecycle
events. Update stub, builtin and WASM plugins accordingly.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the devel/dedicated-lifecycle-events branch from f062b1e to 8930aed Compare March 18, 2026 09:19
@klihub klihub requested review from chrishenzie and mikebrow March 18, 2026 09:20
Member

@mikebrow mikebrow left a comment

looks great just a few nits on the docs..

probably needs a README / doc explaining the change, the wrapper support, and any deprecation plans

return nil
// StateChange implements PluginService of the NRI API.
func (b *BuiltinPlugin) StateChange(_ context.Context, _ *api.StateChangeEvent) (*api.StateChangeResponse, error) {
// TODO: remove eventually once StateChange is removed from the wire protocol.
Member

pre/post v1.0.. hmm

Member Author

Yes, it's a good question. I think it also depends a bit on the release cadence of the runtimes. Since it should be fairly cheap to keep the fallback to StateChange around, I think we can be conservative about the timing of its deprecation and eventual removal.

@klihub klihub force-pushed the devel/dedicated-lifecycle-events branch 2 times, most recently from 831265d to 1f64d2a Compare March 18, 2026 18:20
Update to use the new dedicated RPC calls to relay all pod and
container lifecycle events to plugins. Remove old StateChange
which used to multiplex {Run,PostUpdate,Stop,Remove}PodSandbox
pod and {PostCreate,Start,PostStart,PostUpdate,Remove}Container
container events.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the devel/dedicated-lifecycle-events branch 2 times, most recently from bccfe36 to 546adb0 Compare March 18, 2026 18:41
klihub added 2 commits March 18, 2026 20:46
Add a backward compatibility wrapper for old plugins which do not
support the full set of dedicated lifecycle events yet. Fall back
to relaying StateChange events to them.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Record deprecation warnings for each unimplemented new plugin
interface where we need to fall back and funnel calls through
the old StateChange RPC call.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the devel/dedicated-lifecycle-events branch from 546adb0 to c4c5eec Compare March 18, 2026 18:58
@klihub
Member Author

klihub commented Mar 18, 2026

> looks great just a few nits on the docs..
>
> probably needs a README / doc explaining the change, the wrapper support, and any deprecation plans

Added a section about deprecated interfaces to the README and a subsection explaining the deprecation of StateChange.

Member

@mikebrow mikebrow left a comment

LGTM

default:
return fmt.Sprintf("unknown deprecation (%d)", d)
}
}
Member

thanks

@klihub
Member Author

klihub commented Mar 18, 2026

> And as discussed we'll need to expose whether old RPCs were used back up to the runtime so it can emit warnings about it too.

@samuelkarp I added the propagation of deprecation warnings back to the runtime and updated the linked working tree for containerd. With that in place, you now get warnings like this when you run an old plugin:

$ ctr deprecations list
ID                                                LAST OCCURRENCE                   MESSAGE    
io.containerd.deprecation/nri-plugin-interface    2026-03-18T20:15:59.221928179Z    NRI plugin uses a deprecated interface.
$ journalctl -u containerd | grep "NRI plugin uses a deprecated interface"
Mar 18 20:15:04 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:04.861795837Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement RunPodSandbox" plugin=10-template
Mar 18 20:15:04 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:04.895085271Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement PostCreateContainer" plugin=10-template
Mar 18 20:15:04 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:04.948749187Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement StartContainer" plugin=10-template
Mar 18 20:15:04 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:04.956474857Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement PostStartContainer" plugin=10-template
Mar 18 20:55:54 n4c16-fedora-43-containerd containerd[15507]: time="2026-03-18T20:55:54.993906001Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement RemoveContainer" plugin=10-template
Mar 18 20:15:58 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:58.070528415Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement StopPodSandbox" plugin=10-template
Mar 18 20:15:59 n4c16-fedora-43-containerd containerd[19584]: time="2026-03-18T20:15:59.221928179Z" level=warning msg="NRI plugin uses a deprecated interface." deprecated=StateChange details="does not implement RemovePodSandbox" plugin=10-template
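As an illustration of how such warnings could be modelled, here is a minimal sketch with a deprecation enum and a recorder. The field names mirror the `plugin`, `deprecated` and `details` keys in the log lines above, and the `default` branch mirrors the reviewed snippet earlier in the thread. Everything else (`Deprecation`, `warning`, `recorder`, and the choice to record each distinct warning only once) is an assumption for illustration, not the PR's implementation.

```go
package main

import "fmt"

// Deprecation identifies a deprecated interface (hypothetical enum).
type Deprecation int

const (
	// StateChange marks use of the old multiplexed StateChange RPC.
	StateChange Deprecation = iota + 1
)

// String returns a readable name for the deprecation.
func (d Deprecation) String() string {
	switch d {
	case StateChange:
		return "StateChange"
	default:
		return fmt.Sprintf("unknown deprecation (%d)", d)
	}
}

// warning carries the fields visible in the log output above.
type warning struct {
	Plugin     string
	Deprecated Deprecation
	Details    string
}

// recorder collects warnings. Assumption: each distinct warning is
// recorded once, so repeated events do not flood the log.
type recorder struct {
	seen map[warning]bool
	list []warning
}

// record stores w and reports whether it was new.
func (r *recorder) record(w warning) bool {
	if r.seen == nil {
		r.seen = map[warning]bool{}
	}
	if r.seen[w] {
		return false
	}
	r.seen[w] = true
	r.list = append(r.list, w)
	return true
}

func main() {
	r := &recorder{}
	w := warning{"10-template", StateChange, "does not implement RunPodSandbox"}
	r.record(w)
	r.record(w) // duplicate, ignored
	for _, w := range r.list {
		fmt.Printf("plugin=%s deprecated=%s details=%q\n", w.Plugin, w.Deprecated, w.Details)
	}
}
```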

@klihub
Member Author

klihub commented Mar 19, 2026

@samuelkarp @chrishenzie PTAL, if you have spare cycles.

@samuelkarp
Member

Just a timing note here: due to the changes in the release cycle and safe upgrade policy, we can deprecate but we likely should not remove this support from containerd until 2.7 (next regular release after 2.6 LTS). We could try to get this change in for 2.3 LTS (in the next few weeks) and target removal in 2.4 (next regular release after 2.3 LTS), but that'd be a pretty short time for plugin authors to upgrade.

@mikebrow
Member

nod.. deprecation notice is fine for now. Target removal for 2.7 SGTM

@klihub
Member Author

klihub commented Mar 20, 2026

@samuelkarp So, is there anything remaining against merging this?
