@lowhung lowhung commented Dec 20, 2025

Description

Terminal-based diagnostic tool for monitoring Caryatid process health. Built with Ratatui.

Key features:

  • Three views: Summary (module table with sparklines), Bottlenecks (unhealthy topics), Flow (communication matrix)
  • Vim-style navigation, filtering (/), sorting (s/S)
  • Detail overlay, JSON export, auto-detected light/dark theme
  • Optional RabbitMQ subscription for live monitoring (--features subscribe)

Testing

Manual testing with sample monitor data across all views, navigation modes, and terminal sizes.

Checklist

  • Builds and passes local tests
  • Documentation updated

Impact

None - new standalone tool.

- Add History struct to track historical snapshots of module metrics
- Store read/write counts over time for sparkline generation
- Add Snapshot and Delta types for computing trends
- Integrate history recording into App reload_data cycle
- Prepare infrastructure for sparklines, deltas, and trend indicators
- Add max pending_for column showing worst latency per module
- Add unread count column with color-coded thresholds
- Add read rate column (messages/second)
- Add sparkline trend visualization using unicode bars
- Add column sorting (s key to cycle, S to reverse)
- Show sort indicator in header
- Color-code pending/unread based on threshold severity
- Add header bar showing overall system status at a glance
- Display colored status indicator (green/yellow/red)
- Show module counts by health: ok/warn/crit
- Display total read/write throughput
- Reorganize help overlay with categorized sections
- Add sorting shortcuts to help text
- Add '/' key to start filter input mode
- Type to filter modules by name (case-insensitive)
- Press Enter to confirm, Esc to cancel, 'c' to clear
- Show filter in title bar with filtered/total count
- Update help overlay with filter shortcuts
- Group issues by severity (CRITICAL section, WARNING section)
- Add visual separators with unicode box-drawing characters
- Show structured columns: module, topic, type, pending, unread
- Color-code border based on worst severity
- Improve healthy state display with checkmark
- Add read/write type indicator [R]/[W]
- Highlight topics connected to the selected module
- Show selected module name in title bar
- Filter topics by filter text (searches topics, producers, consumers)
- Highlight module name occurrences in producer/consumer lists
- Use unicode arrows (→) for better visuals
- Dim unconnected topics when a module is selected
- Show filtered/total count in title
- Enable mouse capture for terminal
- Scroll wheel navigates list items up/down
- Left-click selects items in lists
- Left-click on tabs switches views
- Right-click goes back (in detail view)
- Handle mouse events in all views
- Add CLI args for custom thresholds:
  --pending-warn (default: 1s)
  --pending-crit (default: 10s)
  --unread-warn (default: 1000)
  --unread-crit (default: 5000)
- Add -e/--export option for non-interactive JSON export
- Add 'e' key binding for in-app export to monitor_export.json
- Update App::new to accept Thresholds parameter
- Add export_state method to App
- Add export_to_file function for comprehensive JSON export
- Use rounded border style (BorderType::Rounded) for all views
- Remove unused code: Snapshot, Delta, Trend structs from history.rs
- Remove unused get_delta, get_writes_sparkline methods
- Remove unused kind() method from UnhealthyTopic
- Add 'e' key to help overlay for export
- Clean up all compiler warnings
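
The threshold flags listed above could drive health classification roughly as follows. This is a minimal sketch, not the tool's actual code: `Thresholds` and `HealthStatus` are named in the commits, but the field names, the variant names, the `>=` comparisons, and the "worst of the two" rule are assumptions made for illustration. Only the default values come from the flag list.

```rust
use std::time::Duration;

/// Health levels, ordered so that `max` picks the worst one.
/// Variant names are assumed (the header bar reports ok/warn/crit).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum HealthStatus {
    Ok,
    Warn,
    Crit,
}

/// Threshold values. Field names are hypothetical; the defaults mirror
/// --pending-warn, --pending-crit, --unread-warn and --unread-crit.
#[derive(Debug, Clone)]
pub struct Thresholds {
    pub pending_warn: Duration,
    pub pending_crit: Duration,
    pub unread_warn: u64,
    pub unread_crit: u64,
}

impl Default for Thresholds {
    fn default() -> Self {
        Self {
            pending_warn: Duration::from_secs(1),
            pending_crit: Duration::from_secs(10),
            unread_warn: 1000,
            unread_crit: 5000,
        }
    }
}

impl Thresholds {
    /// Classify a pending duration (assumption: thresholds are inclusive).
    pub fn pending_status(&self, pending: Duration) -> HealthStatus {
        if pending >= self.pending_crit {
            HealthStatus::Crit
        } else if pending >= self.pending_warn {
            HealthStatus::Warn
        } else {
            HealthStatus::Ok
        }
    }

    /// Classify an unread message count (assumption: thresholds are inclusive).
    pub fn unread_status(&self, unread: u64) -> HealthStatus {
        if unread >= self.unread_crit {
            HealthStatus::Crit
        } else if unread >= self.unread_warn {
            HealthStatus::Warn
        } else {
            HealthStatus::Ok
        }
    }

    /// A topic is as unhealthy as its worst individual metric.
    pub fn topic_status(&self, pending: Duration, unread: u64) -> HealthStatus {
        self.pending_status(pending).max(self.unread_status(unread))
    }
}
```

Deriving `Ord` on the enum keeps the "worst severity wins" rule to a single `max` call, which is also convenient for colouring a border by the worst status in a view.
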
Navigation improvements:
- Add view history stack for back navigation (Esc/Backspace)
- Add breadcrumb trail in status bar showing navigation path
- Add ViewState struct to preserve selection when navigating
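
A back-navigation stack of this kind might look like the sketch below. Only the `ViewState` name comes from the commit; `View`, `Navigator`, every method name, and the `" > "` breadcrumb separator are hypothetical and chosen for illustration.

```rust
/// The three tabbed views (names taken from the tab bar).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum View {
    Summary,
    Bottlenecks,
    Flow,
}

impl View {
    pub fn label(self) -> &'static str {
        match self {
            View::Summary => "Summary",
            View::Bottlenecks => "Bottlenecks",
            View::Flow => "Flow",
        }
    }
}

/// A view plus the row that was selected in it, so selection survives navigation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ViewState {
    pub view: View,
    pub selected: usize,
}

/// Hypothetical holder for the current view and the history stack.
pub struct Navigator {
    current: ViewState,
    history: Vec<ViewState>,
}

impl Navigator {
    pub fn new(view: View) -> Self {
        Self { current: ViewState { view, selected: 0 }, history: Vec::new() }
    }

    pub fn current(&self) -> ViewState {
        self.current
    }

    pub fn select(&mut self, index: usize) {
        self.current.selected = index;
    }

    /// Switch views, remembering where we came from.
    pub fn go_to(&mut self, view: View) {
        if view != self.current.view {
            self.history.push(self.current);
            self.current = ViewState { view, selected: 0 };
        }
    }

    /// Esc/Backspace: restore the previous view and its selection.
    /// Returns false when there is nothing to go back to.
    pub fn back(&mut self) -> bool {
        match self.history.pop() {
            Some(previous) => {
                self.current = previous;
                true
            }
            None => false,
        }
    }

    /// Breadcrumb trail for the status bar, oldest view first.
    pub fn breadcrumb(&self) -> String {
        let mut parts: Vec<&str> = self.history.iter().map(|s| s.view.label()).collect();
        parts.push(self.current.view.label());
        parts.join(" > ")
    }
}
```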

Detail view changes:
- Convert detail view from full-screen to modal overlay
- Press Enter on Summary/Bottleneck to show detail overlay
- Press Esc/Enter to close overlay
- Overlay shows module name, reads, writes with health indicators

Tab bar updates:
- Remove Detail tab (now overlay-only)
- Tabs: Summary (1), Bottlenecks (2), Flow (3)

Status bar improvements:
- Show current view breadcrumb
- Show Esc:back hint when stack has history
Flow view now shows:
- Selected module as center focus with ASCII box
- Module stats (reads, writes, health status) in header box
- Split view: INPUTS (topics read) on left, OUTPUTS (topics written) on right
- For each topic: source/destination modules, pending time, health indicator
- Navigation hints at bottom

Layout:
                    ╭─────────────────╮
                    │   ModuleName    │
                    │ R:1.2K  W:800   │
                    │       ●         │
                    ╰─────────────────╯

  ────────────────────┬────────────────────
       INPUTS (3)     │      OUTPUTS (2)
  ────────────────────┼────────────────────
  producer → topic    │ topic → consumer

Removed unused FlowLine struct and to_lines method from data/flow.rs
Detail overlay improvements:
- Add bordered header box with module name and stats
- Separate Reads and Writes into distinct bordered sections
- Each section has header row, separator, and data rows
- Cleaner layout with proper padding and alignment

Flow view redesign as dependency graph:
- Vertical tree layout showing data flow direction
- Upstream section: producer modules in boxes at top
- Input topics: bordered boxes showing pending/unread stats
- Center: main module in double-bordered highlight box
- Output topics: bordered boxes showing pending and consumers
- Downstream: shows where data flows to
- Visual arrows (│ ▼) connecting the sections

Layout:
    ┌──────────┐  ┌──────────┐
    │ Producer │  │ Producer │
    └──────────┘  └──────────┘
          │            │
          ▼            ▼
    ╭─ input.topic ─────────╮
    │  pending: 1.2s  ...   │
    ╰───────────────────────╯
              │
              ▼
        ╔═══════════════╗
        ║  ModuleName   ║
        ║  R:1.2K W:800 ║
        ║   ● healthy   ║
        ╚═══════════════╝
              │
              ▼
    ╭─ output.topic ────────╮
    │  pending: -  → Cons.. │
    ╰───────────────────────╯
…lumn

Flow view:
- Simplified to show only message topology, no stats
- Clean box layout: upstream modules → input topics → main module → output topics → downstream modules
- Removed pending/unread details (that's what detail overlay is for)

Detail overlay:
- Fixed missing 'Status' column header in both Reads and Writes sections
- Centered status symbols under the header
…ix flow view

Tab navigation:
- Fix Tab skipping views by updating next()/prev() to skip ModuleDetail
- ModuleDetail is now overlay-only, not a tab

Detail overlay:
- Fixed column spacing with consistent widths (28/10/10/8/6)
- Fixed right border alignment
- Added proper 'Status' column header

Flow view - Adjacency Matrix:
- Shows module-to-module communication as a matrix
- → = row sends to column (via shared topic)
- ← = row receives from column
- ↔ = bidirectional communication
- · = self (diagonal)
- Selected module row/column highlighted
- Shows connection details for selected module below matrix
- Legend explaining symbols
- Fix module switching in Flow view by including DataFlow in module navigation
- Rewrite detail overlay with consistent box drawing and proper column alignment
- Clean up Flow view with bordered adjacency matrix and connection details section
- Add legend and selected module connection details in Flow view
- Add BottleneckSortColumn enum with Status, Module, Topic, Kind, Pending, Unread
- Add column headers with sort indicators (↑/↓)
- Wire up search/filter to work on module name and topic
- Add view-specific sorting state in App (bottleneck_sort_column, bottleneck_sort_ascending)
- Update cycle_sort/toggle_sort_direction to work for both Summary and Bottleneck views
- Update help overlay to show shared controls for Summary & Bottlenecks
- Update status bar with context-sensitive control hints
- Use fixed column widths with constants for consistency
- Switch from List to Paragraph for precise control over row formatting
- Add manual selection indicator (▶) with consistent 3-char prefix
- Fix sorting to use case-insensitive comparison for text columns
- Shorten status labels (CRIT/WARN) to fit column width
- Add filtered_bottleneck_count() method to get count after applying search filter
- Use filtered count for up/down navigation bounds
- Prevents navigation from skipping items when filter is active
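
The adjacency-matrix symbols used by the Flow view (→, ←, ↔, ·) can be derived from topic producer/consumer lists. The sketch below shows one way to do that under stated assumptions: the `Topic` struct, the function names, and the blank cell for "no connection" are hypothetical, and the real `DataFlowGraph` may be organised differently.

```rust
use std::collections::BTreeSet;

/// Hypothetical topic description: who writes to it and who reads from it.
pub struct Topic {
    pub name: String,
    pub producers: Vec<String>,
    pub consumers: Vec<String>,
}

/// Symbol for the matrix cell at (row, col).
/// → row sends to col, ← row receives from col, ↔ both, · self.
/// A blank cell for "no connection" is an assumption of this sketch.
pub fn cell_symbol(row: &str, col: &str, topics: &[Topic]) -> char {
    if row == col {
        return '·';
    }
    // True when some topic is written by `from` and read by `to`.
    let flows = |from: &str, to: &str| {
        topics.iter().any(|t| {
            t.producers.iter().any(|p| p == from) && t.consumers.iter().any(|c| c == to)
        })
    };
    match (flows(row, col), flows(col, row)) {
        (true, true) => '↔',
        (true, false) => '→',
        (false, true) => '←',
        (false, false) => ' ',
    }
}

/// All module names that appear in any topic, sorted and de-duplicated.
pub fn modules(topics: &[Topic]) -> Vec<String> {
    let names: BTreeSet<&str> = topics
        .iter()
        .flat_map(|t| t.producers.iter().chain(t.consumers.iter()))
        .map(String::as_str)
        .collect();
    names.into_iter().map(str::to_owned).collect()
}

/// Full matrix, one row per module in `modules` order.
pub fn matrix(topics: &[Topic]) -> Vec<Vec<char>> {
    let names = modules(topics);
    names
        .iter()
        .map(|row| names.iter().map(|col| cell_symbol(row, col, topics)).collect())
        .collect()
}
```
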
Add comprehensive README with usage guide, keyboard controls, health
thresholds, and data format documentation.

UI/UX improvements:
- Add Page Up/Down, Home/End navigation for faster scrolling
- Add scroll position indicator [x/n] in view titles
- Add sorting controls (s/S) to Bottleneck view
- Add export confirmation feedback in status bar (3s display)
- Fix help overlay text (1-4 → 1-3 for actual view count)

Fix navigation to use visual order:
- Up/Down now moves through sorted/filtered list as displayed
- Previously it jumped through modules in raw data order (grouped by health
  status); it now scrolls sequentially through the visible rows
- Mouse clicks correctly select the clicked row after sort/filter
- Detail overlay shows correct module for visual selection
- Make detail overlay width responsive (50-80 chars based on terminal)
- Make detail overlay inner columns dynamic based on available width
- Make help overlay responsive to terminal size
- Fix sparkline column width (12 -> 8 to match actual content)
- Reduce Summary view column widths for better fit on smaller terminals
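
For reference, a unicode-bar sparkline that fits the 8-character column could be rendered along these lines. This is an illustrative sketch only: the function name, the choice to keep the most recent `width` samples, and scaling against the window maximum are assumptions, not the tool's confirmed behaviour.

```rust
/// Eight block characters, from lowest to highest.
const BARS: [char; 8] = ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];

/// Render the most recent `width` samples as one bar character each.
/// Values are scaled relative to the largest sample in the window
/// (an assumption; a fixed scale would also be reasonable).
pub fn sparkline(values: &[u64], width: usize) -> String {
    let start = values.len().saturating_sub(width);
    let window = &values[start..];
    let max = window.iter().copied().max().unwrap_or(0);
    let top = (BARS.len() - 1) as u128;

    window
        .iter()
        .map(|&v| {
            if max == 0 {
                // All-zero (or idle) window: draw the lowest bar.
                BARS[0]
            } else {
                // Widen to u128 so the multiplication cannot overflow.
                let idx = (u128::from(v) * top / u128::from(max)) as usize;
                BARS[idx]
            }
        })
        .collect()
}
```
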
Bottleneck view now uses '▶ ' (2 chars) matching Summary view,
instead of ' ▶ ' (3 chars).
Reduce column widths for better fit on 80-column terminals:
- Status: 8 -> 6 chars
- Module: 20 -> 18 chars
- Topic: 28 -> 24 chars
- Kind: 4 -> 3 chars
- Unread: 8 -> 7 chars
Show friendly message when terminal is smaller than 60x12:
'Terminal too small: WxH, Minimum: 60x12, Resize to continue'

Also reduced minimum content area from 10 to 8 lines.
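
The size guard described above amounts to a small check before rendering. A minimal sketch, assuming a hypothetical `too_small_message` helper and a single-line message (the real tool may lay the text out over several lines):

```rust
/// Minimum terminal size quoted in the commit message.
pub const MIN_WIDTH: u16 = 60;
pub const MIN_HEIGHT: u16 = 12;

/// Returns the friendly message when the terminal is below the minimum,
/// or `None` when normal rendering can proceed.
pub fn too_small_message(width: u16, height: u16) -> Option<String> {
    if width < MIN_WIDTH || height < MIN_HEIGHT {
        Some(format!(
            "Terminal too small: {}x{}, Minimum: {}x{}, Resize to continue",
            width, height, MIN_WIDTH, MIN_HEIGHT
        ))
    } else {
        None
    }
}
```
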
…ck views

Summary view:
- Fixed Module column to 22 chars instead of Min(20) which expanded excessively

Bottleneck view:
- Columns now scale dynamically based on terminal width
- Module and Topic columns share remaining space (40%/60%)
- Fixed columns (Status, Kind, Pending, Unread) stay constant
- Table now fills the available width properly
Summary view:
- Use Fill constraints for even distribution across terminal width
- Module column gets 3x share, others get 1x each
- Sparkline and Status have fixed minimums

Flow view:
- Reduce matrix cell width from 10 to 6 chars
- Reduce row header from 14 to 12 chars
- Add module count to legend
- Add 'self' symbol explanation to legend
Connection details now have a complete box with right border on all lines.
Also improved connection line formatting and truncation.
Flow view:
- Split into matrix (top) and details panel (bottom)
- Make column widths responsive to terminal size
- Enhanced legend with module count and flow statistics

Bottleneck view:
- Convert from Paragraph to Table widget for consistent styling
- Use Fill constraints for responsive column widths
- Matches Summary view's clean table appearance

Detail overlay:
- Increase size to 95% width, 90% height (clamped to reasonable limits)
- Use Table widgets for reads/writes sections
- Improved layout with header, scrollable content, and footer
- Better visual hierarchy with styled borders and colors
When sorting by a column where multiple items have the same value
(e.g., sorting by Status groups items by CRIT/WARN/OK), items within
each group now maintain a consistent order by using secondary sort keys:

Summary view:
- Secondary sort by module name when primary values are equal

Bottleneck view:
- Secondary sort by module name, then topic when primary values are equal

This prevents items from appearing to jump around when the primary
sort column has duplicate values.
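
The tie-breaking described above maps naturally onto `Ordering::then_with`. The sketch below uses a hypothetical `Row` type and `Status` enum standing in for the Bottleneck rows. Reversing only the primary key for descending order is a design choice of this sketch, not something the commit states.

```rust
use std::cmp::Ordering;

/// Hypothetical severity, declared so that CRIT sorts first when ascending.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Crit,
    Warn,
    Ok,
}

/// Hypothetical Bottleneck row with just the fields needed for sorting.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Row {
    pub status: Status,
    pub module: String,
    pub topic: String,
}

impl Row {
    pub fn new(status: Status, module: &str, topic: &str) -> Self {
        Self { status, module: module.to_owned(), topic: topic.to_owned() }
    }
}

/// Sort by status, breaking ties by module name and then topic
/// (both compared case-insensitively) so equal rows keep a fixed order.
pub fn sort_by_status(rows: &mut [Row], ascending: bool) {
    rows.sort_by(|a, b| {
        let primary = a.status.cmp(&b.status);
        // Only the primary key is reversed; tie-breakers stay alphabetical.
        let primary = if ascending { primary } else { primary.reverse() };
        primary
            .then_with(|| a.module.to_lowercase().cmp(&b.module.to_lowercase()))
            .then_with(|| a.topic.to_lowercase().cmp(&b.topic.to_lowercase()))
    });
}

/// Convenience for inspecting the resulting order.
pub fn topics_in_order(rows: &[Row]) -> Vec<&str> {
    rows.iter().map(|r| r.topic.as_str()).collect()
}

/// Kept for callers that want the raw comparison (ascending only).
pub fn compare_rows(a: &Row, b: &Row) -> Ordering {
    a.status
        .cmp(&b.status)
        .then_with(|| a.module.to_lowercase().cmp(&b.module.to_lowercase()))
        .then_with(|| a.topic.to_lowercase().cmp(&b.topic.to_lowercase()))
}
```

Because the comparison is total over (status, module, topic), the displayed order does not depend on the order in which rows arrive from the data source.
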
- Left/Right arrows (or h/l vim keys) now switch between views
- More intuitive: up/down for list items, left/right for tabs
- Tab/Shift-Tab still works as before
- Updated help overlay to reflect new keybindings
- App now accepts Box<dyn DataSource> instead of PathBuf
- MonitorData gains from_snapshot() for direct snapshot conversion
- main.rs creates FileSource and passes to App
- Removes tight coupling between App and file-based loading

This enables future support for channel-based data sources
(e.g., message bus subscriptions) without changing App logic.
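
The decoupling can be pictured as a small trait object boundary. The sketch below is an assumption-heavy illustration: `App`, `DataSource`, `ChannelSource`, `MonitorSnapshot`, and `reload_data` are names from the commits, but the method signatures, the placeholder snapshot field, and the use of `std::sync::mpsc` (in place of whatever channel the crate really uses) are invented here.

```rust
use std::sync::mpsc::Receiver;

/// Placeholder snapshot; the real MonitorSnapshot carries per-module data.
#[derive(Debug, Clone, PartialEq)]
pub struct MonitorSnapshot {
    pub total_read: u64,
}

/// Anything that can hand the app a fresh snapshot when one is available.
/// The signature is hypothetical.
pub trait DataSource {
    /// Returns `Some` only when new data has arrived since the last poll.
    fn poll(&mut self) -> Option<MonitorSnapshot>;
    /// Human-readable description for the status bar.
    fn description(&self) -> String;
}

/// Channel-backed source (std mpsc used here as a stand-in).
pub struct ChannelSource {
    rx: Receiver<MonitorSnapshot>,
}

impl ChannelSource {
    pub fn new(rx: Receiver<MonitorSnapshot>) -> Self {
        Self { rx }
    }
}

impl DataSource for ChannelSource {
    fn poll(&mut self) -> Option<MonitorSnapshot> {
        // Drain everything queued and keep only the newest snapshot.
        self.rx.try_iter().last()
    }

    fn description(&self) -> String {
        "channel".to_owned()
    }
}

/// The app only sees the trait object, never a file path or socket.
pub struct App {
    source: Box<dyn DataSource>,
    latest: Option<MonitorSnapshot>,
}

impl App {
    pub fn new(source: Box<dyn DataSource>) -> Self {
        Self { source, latest: None }
    }

    /// Pull new data if any; returns true when the view should refresh.
    pub fn reload_data(&mut self) -> bool {
        match self.source.poll() {
            Some(snapshot) => {
                self.latest = Some(snapshot);
                true
            }
            None => false,
        }
    }

    pub fn latest(&self) -> Option<&MonitorSnapshot> {
        self.latest.as_ref()
    }
}
```
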
- Rename package to caryatid-doctor
- Add lib.rs with public API exports and documentation
- Separate library and binary targets
- Export all types needed for library consumers:
  - App, DataSource trait, FileSource, ChannelSource
  - MonitorSnapshot and serialization types
  - MonitorData, Thresholds, HealthStatus, etc.
- Fix ChannelSource to return initial value on first poll
- Update Cargo.toml with publishing metadata

The crate can now be used as:
1. CLI tool: caryatid-doctor --file monitor.json
2. Library with FileSource for file-based monitoring
3. Library with ChannelSource for message bus integration
FileSource now tracks the file's modification time and only returns
new data when the file has actually been updated. This prevents
recording duplicate snapshots into history when the TUI polls faster
than caryatid's Monitor writes (e.g., TUI polls every 1s but Monitor
writes every 5s).

Previously, the rate column would show 0 because we recorded the same
total_read value multiple times, resulting in delta=0. Now the rate
correctly shows messages/second based on actual changes.
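
The fix combines two ideas: skip polls where the file's modification time has not changed, and compute the rate from the difference between two distinct snapshots. A minimal sketch of both, with hypothetical names (`ChangeDetector`, `read_rate`); in a real `FileSource` the timestamp would presumably come from `std::fs::metadata(path)?.modified()?`.

```rust
use std::time::{Duration, SystemTime};

/// Remembers the last modification time that was accepted.
pub struct ChangeDetector {
    last_modified: Option<SystemTime>,
}

impl ChangeDetector {
    pub fn new() -> Self {
        Self { last_modified: None }
    }

    /// Returns true only when `modified` differs from the last accepted
    /// timestamp, so repeated polls of an unchanged file are ignored.
    pub fn accept(&mut self, modified: SystemTime) -> bool {
        if self.last_modified == Some(modified) {
            false
        } else {
            self.last_modified = Some(modified);
            true
        }
    }
}

/// Messages per second between two snapshots.
/// Returns `None` when no time has elapsed; counter resets yield 0.
pub fn read_rate(prev_total: u64, curr_total: u64, elapsed: Duration) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if secs <= 0.0 {
        return None;
    }
    Some(curr_total.saturating_sub(prev_total) as f64 / secs)
}
```

With a 1s poll and a 5s write interval, four of every five polls are rejected by `accept`, so history only ever holds snapshots whose totals can actually differ.
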
New features:
- StreamSource: receives snapshots from async streams (TCP, etc.)
- CLI --connect flag: connect to TCP endpoint for live snapshots
  Example: caryatid-doctor --connect localhost:9090

StreamSource provides two ways to receive data:
1. spawn(reader, desc): reads newline-delimited JSON from AsyncRead
2. from_bytes_channel(rx, desc): receives JSON bytes via mpsc channel

This enables integration with:
- Direct TCP connections to a monitor server
- Message bus subscriptions (bridge via channel)
- Any async byte stream

Usage:
  caryatid-doctor --file monitor.json     # File-based (default)
  caryatid-doctor --connect host:port     # TCP stream
Tests cover:
- FileSource: new, poll, change detection, missing file, invalid JSON
- StreamSource: spawn, multiple snapshots, description, bytes channel

These tests exercise the public API and serve as usage documentation.
Changed doc examples from 'ignore' to actual runnable tests:
- Simple examples use plain code blocks
- Async examples use tokio_test::block_on
- All doc tests now compile and run

Added tokio-test dev dependency for async doc tests.
Detail view is now implemented as an overlay (show_detail_overlay) rather
than a separate view variant. This removes the unused enum variant and
cleans up all match arms that referenced it.
Skip rendering the detail overlay if the terminal is too small to display
it properly. This prevents potential rendering issues on very small terminals.
Add optional 'subscribe' feature that enables monitor_cli to receive
MonitorSnapshot messages directly from the caryatid message bus instead
of reading from a file or TCP connection.

New files:
- src/subscribe/mod.rs: Module entry point with create_subscriber()
- src/subscribe/message.rs: Message type implementing MessageBounds
- src/subscribe/subscriber.rs: MonitorSubscriber module that forwards
  snapshots to the TUI via watch channel

CLI changes:
- Add --subscribe <config.toml> option to connect via message bus
- Add --topic option (default: caryatid.monitor) for subscription topic

The subscriber runs as a caryatid module in a background tokio runtime,
forwarding received snapshots to the existing ChannelSource which feeds
the TUI event loop.
@lowhung lowhung marked this pull request as draft December 21, 2025 06:24
- Remove complex Message type and MonitorSubscriber module
- Subscribe directly to RabbitMQ using serde_json::Value
- Support both [rabbitmq] and [message-bus.*] config formats
- Update omnibus.toml to publish monitor snapshots to external bus
- Use workspace dependencies for caryatid (local paths for dev)
…put-output-hk/acropolis into lowhung/cli-tool-for-graphing-monitor

# Conflicts:
#	processes/monitor_cli/src/subscribe/mod.rs
Changed the default subscription topic (--topic) from 'caryatid.monitor' to
'caryatid.monitor.snapshot' for consistency with README and caryatid examples.
Document all public APIs across the crate:
- app.rs: App methods and ViewState struct
- data/: HealthStatus variants, TopicRead/TopicWrite/ModuleData fields,
  History methods, DataFlowGraph
- source/: StreamSource::from_bytes_channel
- ui/: Module-level docs, render functions, Theme fields and methods,
  SortColumn/BottleneckSortColumn variants
- events.rs: poll_event, handle_key_event, handle_mouse_event
The monitor publishes directly to RabbitMQ via its own connection,
bypassing the message router entirely. The routing rule was never used.
The rabbit_mq_bus module is private in the published caryatid_process
crate. Rewrite subscribe.rs to use lapin directly, avoiding dependency
on internal caryatid modules.

lowhung commented Dec 21, 2025

Closing this as I've moved it to my own repo for now

https://github.com/lowhung/monitor-tui
