503 add gpu metric into cluster dashboard #504

Merged
ZhangEnYao merged 4 commits into main from 503-add-gpu-metric-into-cluster-dashboard on Jun 10, 2026
Conversation

@KUASWoodyLIN

Contributor

No description provided.

woody_lin added 3 commits June 10, 2026 19:27
Section A: replace the Pod Density ranking with a GPU Utilization
ranking (avg per node). Section B: when the drilled-in node has GPU
cards, show four per-card heatmaps (utilisation / memory / power /
temperature), gated to GPU nodes so CPU-only nodes stay clean.

Cells render on a Canvas layer to keep the mount cheap; reload skips
identical-data re-renders and hover caches its plot rect. GPU queries
are scoped via a new dcgmNodeSelector helper (exact, escaped match).
Adds gpu_power / gpu_temperature and node_chart_gpu_* i18n keys.
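
The "exact, escaped match" scoping mentioned in the commit message could look roughly like the sketch below. This is not the PR's actual `dcgmNodeSelector`; the signature, the `Hostname` label name, and the `avgGpuUtilQuery` helper are assumptions made for illustration. The idea is to build a PromQL equality matcher (`=`, not the regex matcher `=~`) and to escape backslashes, double quotes, and newlines so a node name cannot break out of the quoted label value.

```typescript
// Hypothetical sketch, not the helper from this PR.
// Escapes a value for use inside a double-quoted PromQL label matcher.
function escapeLabelValue(value: string): string {
  return value
    .replace(/\\/g, '\\\\') // backslashes first, so later escapes are not doubled
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
}

// Builds an exact-match selector fragment for a node.
// The default label name "Hostname" is an assumption about the exporter's labels.
function dcgmNodeSelector(node: string, label: string = 'Hostname'): string {
  return `${label}="${escapeLabelValue(node)}"`;
}

// Example use: average GPU utilization across the cards of one node.
// DCGM_FI_DEV_GPU_UTIL is the utilization metric name used by dcgm-exporter.
function avgGpuUtilQuery(node: string): string {
  return `avg(DCGM_FI_DEV_GPU_UTIL{${dcgmNodeSelector(node)}})`;
}
```

Using `=` rather than `=~` means that a node name containing regex metacharacters, such as a dot, only ever matches itself.
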
@KUASWoodyLIN KUASWoodyLIN linked an issue Jun 10, 2026 that may be closed by this pull request

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces GPU monitoring capabilities to the cluster analytics dashboard, adding translation strings, a GPU heatmap component, and a conditional GPU usage wrapper. It also reorganizes existing analytics components into dedicated subdirectories. Feedback on these changes highlights a critical issue with a hardcoded Prometheus URL in the proxy server that bypasses multi-cluster routing. Additionally, minor performance improvements are suggested in the new GPU heatmap component to replace SvelteSet and SvelteMap with standard JavaScript collections inside derived state blocks to eliminate unnecessary reactive overhead.
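
The collection suggestion in the review can be illustrated with a pure helper. Collections that are created, filled, and returned within a single derived computation are never mutated afterwards, so plain `Set` and `Map` do the job without reactive tracking. The `GpuSample` shape and the `groupByCard` name below are hypothetical; the PR's real types are not shown in this thread.

```typescript
// Hypothetical sample shape, assumed for illustration only.
interface GpuSample {
  gpu: string;       // card identifier
  timestamp: number; // sample time
  value: number;     // metric value at that time
}

// Pure grouping helper intended to be called from a derived block,
// for example `$derived(groupByCard(samples))` in a Svelte 5 component.
// Plain Set/Map are enough because nothing mutates them after return.
function groupByCard(samples: GpuSample[]): {
  cards: string[];
  byCard: Map<string, GpuSample[]>;
} {
  const seen = new Set<string>();
  const byCard = new Map<string, GpuSample[]>();
  for (const sample of samples) {
    seen.add(sample.gpu);
    const bucket = byCard.get(sample.gpu);
    if (bucket) {
      bucket.push(sample);
    } else {
      byCard.set(sample.gpu, [sample]);
    }
  }
  // Sorted card ids give the heatmap rows a stable order (string sort).
  return { cards: Array.from(seen).sort(), byCard };
}
```

Reactivity still comes from the derived block itself: when `samples` changes, the helper runs again and produces fresh collections.
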

Comment thread src/lib/server/proxy.ts Outdated
@ZhangEnYao ZhangEnYao merged commit f20dc0f into main Jun 10, 2026
8 of 11 checks passed
@ZhangEnYao ZhangEnYao deleted the 503-add-gpu-metric-into-cluster-dashboard branch June 10, 2026 11:54
Development

Successfully merging this pull request may close these issues.

Add GPU metric into cluster dashboard

2 participants