Phase 2: Process Control Plane, Profiling, Settings in GraphServer, and Dashboard Hooks #234

Open
griffinmilsap wants to merge 14 commits into feature/session-metadata from feature/process-control

@griffinmilsap
Collaborator

Summary

This PR implements Phase 2 of the GraphServer high-level control-plane roadmap, relative to feature/session-metadata.

It adds a multiprocess process-control plane, push-based settings/topology/profiling APIs, low-level pub/sub profiling instrumentation, and dashboard-oriented data/control surfaces while keeping low-level API workflows intact.

Base / Compare

  • Base: feature/session-metadata
  • Compare: feature/process-control

Motivation

Phase 1 established session-scoped metadata and snapshot foundations.
Phase 2 adds the runtime control/observability layer needed for:

  • multi-process introspection through a single GraphServer connection,
  • settings change capture/audit and dynamic settings control,
  • push-based topology and profiling data streams suitable for dashboard/TUI clients.

What Changed

1) Multiprocess Control Plane

Added process control registration and request routing primitives:

  • New ProcessControlClient (src/ezmsg/core/processclient.py)
  • Process registration + ownership updates in GraphServer
  • Session-routed process requests (GraphContext.process_request(...))
  • Typed request/response envelopes + error codes in graphmeta.py
  • Built-in routed operations:
    • PING
    • GET_PROCESS_STATS
    • GET_PROFILING_SNAPSHOT
    • SET_PROFILING_TRACE
    • GET_PROFILING_TRACE_BATCH
    • UPDATE_SETTING_FIELD
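
To make the routing idea concrete, here is a minimal, self-contained sketch of typed request/response envelopes dispatched to the process that owns a unit. It does not use the actual classes in graphmeta.py or processclient.py: `ProcessOp`, `ErrorCode`, `ProcessRequest`, `ProcessResponse`, `Router`, and the unit address strings are hypothetical names chosen for illustration. Only the operation names come from the list above.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, Callable, Dict, Iterable, Optional
from uuid import UUID, uuid4


class ProcessOp(Enum):
    # Operation names mirror the list above; the enum itself is hypothetical.
    PING = auto()
    GET_PROCESS_STATS = auto()
    GET_PROFILING_SNAPSHOT = auto()
    SET_PROFILING_TRACE = auto()
    GET_PROFILING_TRACE_BATCH = auto()
    UPDATE_SETTING_FIELD = auto()


class ErrorCode(Enum):
    # Hypothetical error codes, not the ones defined in graphmeta.py.
    UNKNOWN_UNIT = auto()
    UNSUPPORTED_OP = auto()
    HANDLER_ERROR = auto()


@dataclass(frozen=True)
class ProcessRequest:
    request_id: UUID
    unit_address: str
    op: ProcessOp
    payload: Any = None


@dataclass(frozen=True)
class ProcessResponse:
    request_id: UUID
    ok: bool
    payload: Any = None
    error: Optional[ErrorCode] = None


Handler = Callable[[ProcessRequest], Any]


class Router:
    """Routes a request to the handlers of the process that owns the unit."""

    def __init__(self) -> None:
        self._owner_of: Dict[str, UUID] = {}
        self._handlers: Dict[UUID, Dict[ProcessOp, Handler]] = {}

    def register(
        self, process_id: UUID, units: Iterable[str], handlers: Dict[ProcessOp, Handler]
    ) -> None:
        for unit in units:
            self._owner_of[unit] = process_id
        self._handlers[process_id] = dict(handlers)

    def route(self, request: ProcessRequest) -> ProcessResponse:
        owner = self._owner_of.get(request.unit_address)
        if owner is None:
            return ProcessResponse(request.request_id, False, error=ErrorCode.UNKNOWN_UNIT)
        handler = self._handlers[owner].get(request.op)
        if handler is None:
            return ProcessResponse(request.request_id, False, error=ErrorCode.UNSUPPORTED_OP)
        try:
            return ProcessResponse(request.request_id, True, payload=handler(request))
        except Exception:
            # Errors are reported in the envelope instead of propagating to the caller.
            return ProcessResponse(request.request_id, False, error=ErrorCode.HANDLER_ERROR)
```

In this sketch every failure mode comes back as a response with an error code, so a dashboard client only has to inspect `ok` and `error` rather than handle exceptions raised in a remote process.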

2) Settings as GraphServer Source-of-Truth

GraphServer now tracks settings snapshots/events with push-capable APIs:

  • GraphContext.settings_snapshot()
  • GraphContext.settings_events(after_seq=...)
  • GraphContext.subscribe_settings_events(...) (push stream)
  • Process-reported settings updates flow via PROCESS_SETTINGS_UPDATE
  • GraphContext.update_settings(...) publishes to INPUT_SETTINGS
  • GraphContext.update_setting(...) supports field-level patching routed to owning process
  • Added settings schema metadata support (src/ezmsg/core/settingsmeta.py) so that dashboards can build settings widgets without importing the source settings classes.
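
The snapshot/event split and the `after_seq` cursor can be illustrated with a small, standalone sketch. This is not the GraphServer implementation: `SettingsLog`, `SettingsEvent`, `GainSettings`, and the component address are hypothetical, and the sketch assumes settings are frozen dataclasses so that a field-level patch can be expressed with `dataclasses.replace`.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, List


@dataclass(frozen=True)
class GainSettings:
    # Hypothetical settings class used only for this example.
    gain: float = 1.0
    enabled: bool = True


@dataclass(frozen=True)
class SettingsEvent:
    seq: int          # monotonically increasing sequence number
    component: str    # address of the component whose settings changed
    value: Any        # full settings value after the change


class SettingsLog:
    """Keeps the latest settings per component plus an ordered event history."""

    def __init__(self) -> None:
        self._events: List[SettingsEvent] = []
        self._current: Dict[str, Any] = {}

    def record(self, component: str, value: Any) -> SettingsEvent:
        event = SettingsEvent(len(self._events) + 1, component, value)
        self._events.append(event)
        self._current[component] = value
        return event

    def snapshot(self) -> Dict[str, Any]:
        # Copy so callers cannot mutate the log's internal state.
        return dict(self._current)

    def events(self, after_seq: int = 0) -> List[SettingsEvent]:
        # A client passes the last seq it has seen and receives only newer events.
        return [e for e in self._events if e.seq > after_seq]

    def patch_field(self, component: str, field_name: str, new_value: Any) -> SettingsEvent:
        current = self._current[component]
        # replace() raises TypeError if field_name is not a field of the dataclass.
        return self.record(component, replace(current, **{field_name: new_value}))
```

A client that reconnects can take a snapshot, remember the highest `seq` it has seen, and then ask only for later events, which is the pattern the `after_seq` parameter suggests.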

3) Topology Push Stream

Added push-based topology event streaming:

  • GraphContext.subscribe_topology_events(...)
  • GraphServer emits topology events for session graph changes and process ownership lifecycle changes.
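
For readers unfamiliar with push streams, the sketch below shows one common way to fan events out to subscribers with `asyncio.Queue`. It is an assumption-laden illustration, not the mechanism used by `subscribe_topology_events`: `TopologyBroadcaster`, `TopologyEvent`, and the event kinds are hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class TopologyEvent:
    kind: str    # hypothetical event kind, e.g. "edge_added"
    detail: str  # human-readable description of the change


class TopologyBroadcaster:
    """Pushes each emitted event to every subscriber's queue."""

    def __init__(self) -> None:
        self._subscribers: Set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def emit(self, event: TopologyEvent) -> None:
        # Unbounded queues never raise QueueFull, so put_nowait is safe here.
        for queue in self._subscribers:
            queue.put_nowait(event)


async def demo() -> List[TopologyEvent]:
    broadcaster = TopologyBroadcaster()
    queue = broadcaster.subscribe()
    # One graph change and one process lifecycle change, as described above.
    broadcaster.emit(TopologyEvent("edge_added", "A/OUTPUT -> B/INPUT"))
    broadcaster.emit(TopologyEvent("process_registered", "owner of B"))
    received = [await queue.get(), await queue.get()]
    broadcaster.unsubscribe(queue)
    return received
```

Compared with polling a snapshot, the subscriber here does no work until something changes, which suits a dashboard or TUI that redraws on events.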

4) Low-level Profiling Instrumentation + APIs

Implemented profiling with a focus on compatibility with the low-level API (Publisher/Subscriber/Channel):

  • New profiling backend (src/ezmsg/core/profiling.py)
  • Pub/sub instrumentation in hot paths (pubclient.py, subclient.py, messagechannel.py)
  • Snapshot and trace collection per process
  • GraphServer profiling trace stream aggregation
  • GraphContext APIs:
    • process_profiling_snapshot(...)
    • process_set_profiling_trace(...)
    • process_profiling_trace_batch(...)
    • profiling_snapshot_all(...)
    • subscribe_profiling_trace(...)
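
The difference between a snapshot (aggregated counters) and a trace (individual samples, drained in batches) can be sketched as follows. This is a simplified stand-in for the idea, not the code in profiling.py: `Profiler`, `EndpointStats`, `timed_publish`, and the endpoint names are hypothetical.

```python
import time
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque, Dict, List, Tuple


@dataclass
class EndpointStats:
    count: int = 0
    total_ns: int = 0
    max_ns: int = 0


class Profiler:
    """Cheap aggregate counters always on; per-sample tracing is opt-in."""

    def __init__(self, trace_capacity: int = 1024) -> None:
        self._stats: Dict[str, EndpointStats] = {}
        # Bounded ring buffer: the oldest samples are dropped when it is full.
        self._trace: Deque[Tuple[str, int]] = deque(maxlen=trace_capacity)
        self.trace_enabled = False

    def record(self, endpoint: str, elapsed_ns: int) -> None:
        stats = self._stats.setdefault(endpoint, EndpointStats())
        stats.count += 1
        stats.total_ns += elapsed_ns
        stats.max_ns = max(stats.max_ns, elapsed_ns)
        if self.trace_enabled:
            self._trace.append((endpoint, elapsed_ns))

    def snapshot(self) -> Dict[str, EndpointStats]:
        # Return copies so readers do not observe later mutations.
        return {
            name: EndpointStats(s.count, s.total_ns, s.max_ns)
            for name, s in self._stats.items()
        }

    def drain_trace(self, max_items: int) -> List[Tuple[str, int]]:
        batch: List[Tuple[str, int]] = []
        while self._trace and len(batch) < max_items:
            batch.append(self._trace.popleft())
        return batch


def timed_publish(
    profiler: Profiler, endpoint: str, send: Callable[[Any], None], message: Any
) -> None:
    # Wrap the hot-path call with a monotonic, nanosecond-resolution timer.
    start = time.perf_counter_ns()
    send(message)
    profiler.record(endpoint, time.perf_counter_ns() - start)
```

Keeping the trace buffer bounded and disabled by default is one way to limit overhead in the publish/subscribe hot path; whether the PR makes the same trade-off is not stated in the description.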

5) Identity + Safety Hardening

  • Process identity is now explicitly UUID-typed in public dataclasses (no user-specified process-id strings in registration/update payloads).
  • GraphServer now fails fast on ownership/address collisions:
    • reject PROCESS_REGISTER unit ownership conflicts
    • reject PROCESS_UPDATE_OWNERSHIP added-unit conflicts
    • reject SESSION_REGISTER metadata component-address collisions
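
The fail-fast behavior can be pictured with a small registry that validates a whole request before mutating any state. This is only a sketch of the policy described above, not GraphServer code: `OwnershipRegistry`, `OwnershipConflict`, and the unit addresses are hypothetical.

```python
from typing import Dict, Iterable, Optional
from uuid import UUID, uuid4


class OwnershipConflict(Exception):
    """Raised when a unit is already owned by a different process."""


class OwnershipRegistry:
    def __init__(self) -> None:
        self._owner_of: Dict[str, UUID] = {}

    def register(self, process_id: UUID, units: Iterable[str]) -> None:
        # Process identity must be a UUID, not a user-supplied string.
        if not isinstance(process_id, UUID):
            raise TypeError("process_id must be a uuid.UUID")
        requested = list(units)
        # Check every unit first so a rejected request leaves no partial state.
        conflicts = [
            unit
            for unit in requested
            if self._owner_of.get(unit, process_id) != process_id
        ]
        if conflicts:
            raise OwnershipConflict(f"units already owned: {conflicts}")
        for unit in requested:
            self._owner_of[unit] = process_id

    def owner(self, unit: str) -> Optional[UUID]:
        return self._owner_of.get(unit)
```

Rejecting the entire request, rather than silently reassigning ownership, means a misconfigured second process surfaces as an immediate error instead of as requests routed to the wrong place later.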

6) TUI Demos for Manual Validation

Added one-off example clients:

  • examples/settings_tui.py
  • examples/profiling_tui.py
  • examples/topology_tui.py

These demonstrate live settings/profiling/topology views via the new GraphContext APIs.

7) Test Coverage

Added new suites and expanded behavior coverage:

  • tests/test_process_control.py
  • tests/test_process_routing.py
  • tests/test_settings_api.py
  • tests/test_profiling_api.py
  • tests/test_topology_api.py

This PR also includes related runner/harness updates (tests/shutdown_runner.py) and fixes for bugs discovered during Phase 2 integration.

Key Files

  • src/ezmsg/core/processclient.py (new)
  • src/ezmsg/core/profiling.py (new)
  • src/ezmsg/core/settingsmeta.py (new)
  • src/ezmsg/core/graphcontext.py
  • src/ezmsg/core/graphserver.py
  • src/ezmsg/core/graphmeta.py
  • src/ezmsg/core/backendprocess.py
  • src/ezmsg/core/backend.py
  • src/ezmsg/core/netprotocol.py
  • src/ezmsg/core/pubclient.py
  • src/ezmsg/core/subclient.py
  • src/ezmsg/core/messagechannel.py

Backward Compatibility Notes

  • These APIs are new/unreleased Phase 2 surfaces on top of feature/session-metadata.
  • Collision handling is intentionally fail-fast for clarity/safety in high-level API usage.
  • Low-level API usage remains supported.
