Add Support for Warm Replicas
=============================

## Summary

When controllers manage huge caches, failover takes minutes because follower replicas wait to win leader election before starting informers. “Warm replicas” allow controller-runtime to start sources while a manager instance is still on standby, so the new leader can immediately schedule workers with already-populated queues. This design documents the feature implemented in [PR #3192](https://github.com/kubernetes-sigs/controller-runtime/pull/3192) and answers the outstanding review questions.

## Motivation

Controllers reconcile every object from their sources at startup and after leader failover. For sources with millions of objects (e.g., Secrets, ConfigMaps, custom resources across all namespaces), the initial List+Watch can take tens of minutes, delaying recovery. Today a controller only starts its sources inside `Start`, which the manager runs **after** acquiring the leader lock. That guarantees downtime equal to the cache warmup time whenever the leader rotates.

## Goals

- Allow controller authors to opt a controller (or all controllers) into warmup behavior with a single option (`EnableWarmup`).
- Ensure warmup never changes behavior when disabled.
- Keep the API surface minimal (no exported warmup interface yet).

## Implemented Changes

### Manager Warmup Phase

The manager already buckets runnables by kind (HTTP servers, webhooks, caches, others, leader election). We added an internal `warmupRunnable` interface:

```go
type warmupRunnable interface {
	Warmup(context.Context) error
}
```

During `Start`, the manager now runs, in order:

1. HTTP servers
2. Webhooks
3. Caches
4. `Others`
5. **Warmup runnables (new)** (sketched below)
6. Leader election runnables once the lock is acquired

Warmup runnables are also stopped in parallel with non-leader runnables during shutdown to avoid deadlocks.
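
To make the phase boundary concrete, here is a minimal sketch of the ordering under stated assumptions: `runnableGroups`, `startAll`, and the `elected` channel are hypothetical stand-ins for the manager’s unexported internals, and the real implementation starts runnables concurrently and collects their errors rather than looping sequentially.

```go
// Illustrative sketch only; these names do not exist in controller-runtime.
type runnableGroups struct {
	HTTPServers, Webhooks, Caches, Others []func(context.Context) error
	Warmup                                []warmupRunnable // see interface above
	LeaderElection                        []func(context.Context) error
}

func startAll(ctx context.Context, g runnableGroups, elected <-chan struct{}) error {
	// Phases 1-4: unchanged, run on every replica.
	for _, phase := range [][]func(context.Context) error{g.HTTPServers, g.Webhooks, g.Caches, g.Others} {
		for _, start := range phase {
			if err := start(ctx); err != nil {
				return err
			}
		}
	}
	// Phase 5 (new): start sources on standby replicas too, so caches and
	// workqueues fill while another replica still holds the leader lock.
	for _, w := range g.Warmup {
		if err := w.Warmup(ctx); err != nil {
			return err
		}
	}
	// Phase 6: leader-election runnables wait until this replica wins the lock.
	select {
	case <-elected:
	case <-ctx.Done():
		return ctx.Err()
	}
	for _, start := range g.LeaderElection {
		if err := start(ctx); err != nil {
			return err
		}
	}
	return nil
}
```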

### Controller Opt-in

Controllers can opt in at either level:

- manager-wide default: `ctrl.Options{Controller: config.Controller{EnableWarmup: ptr.To(true)}}`
- per-controller: `controller.Options{EnableWarmup: ptr.To(true)}`

If both `EnableWarmup` and `NeedLeaderElection` are true, controller-runtime registers the controller as a warmup runnable. Calling `Warmup` runs the same event-source startup and cache-sync logic as `Start`, but it does **not** launch worker goroutines. Once the manager becomes leader, the controller’s normal `Start` finds the sources already running and simply spins up workers against the already-populated queue. Enabling warmup on a controller that does not use leader election is effectively a no-op: such a controller starts as soon as the manager does, so its sources never wait on the election in the first place.
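
The controller-side split can be pictured with a small sketch. This is not the actual controller-runtime code: the type and its fields are hypothetical, and the real controller uses its own locking. The essential shape is that `Warmup` and `Start` share a run-once source-startup path:

```go
// Illustrative skeleton; warmableController and its fields are hypothetical.
type warmableController struct {
	sourcesOnce  sync.Once
	sourcesErr   error
	startSources func(context.Context) error // start event sources, wait for cache sync
	runWorkers   func(context.Context) error // launch worker goroutines
}

// Warmup is invoked by the manager before leader election: it fills caches
// and the workqueue but starts no workers.
func (c *warmableController) Warmup(ctx context.Context) error {
	return c.ensureSourcesStarted(ctx)
}

// Start is invoked once the manager holds the leader lock. If Warmup already
// ran, ensureSourcesStarted is a no-op and workers start immediately.
func (c *warmableController) Start(ctx context.Context) error {
	if err := c.ensureSourcesStarted(ctx); err != nil {
		return err
	}
	return c.runWorkers(ctx)
}

// ensureSourcesStarted runs startSources at most once and remembers its error
// so a failed warmup is also reported from Start.
func (c *warmableController) ensureSourcesStarted(ctx context.Context) error {
	c.sourcesOnce.Do(func() { c.sourcesErr = c.startSources(ctx) })
	return c.sourcesErr
}
```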

### Usage Example

```go
mgr, err := ctrl.NewManager(cfg, ctrl.Options{
	Controller: config.Controller{
		EnableWarmup: ptr.To(true), // make every controller warm up
	},
})
if err != nil {
	panic(err)
}

if err := builder.ControllerManagedBy(mgr).
	Named("slow-source").
	WithOptions(controller.Options{
		EnableWarmup: ptr.To(true), // optional per-controller override
	}).
	For(&examplev1.Example{}).
	Complete(reconciler); err != nil {
	panic(err)
}
```
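
As the comment above notes, the per-controller `EnableWarmup` acts as an override of the manager-wide `config.Controller` default, so a fleet-wide `ptr.To(true)` can still be reverted on an individual controller with `ptr.To(false)`.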

### Operational Considerations

- **API server load** – Warm replicas duplicate List/Watch traffic: each standby replica performs the initial List and opens watches even though the current leader is already doing so. The expensive initial List happens only while a replica warms its caches, but the watches stay open afterwards, so on huge clusters the cost scales with the number of warm replicas.
- **Queue depth metrics** – Because warm replicas start their sources before workers run, the `workqueue_depth` metric spikes during warmup even though reconcilers have not begun processing. Alerting or SLOs based on that metric should either ignore the warmup window or switch to per-controller gauges that record when workers start (see the sketch below).
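
As a purely illustrative sketch of that last suggestion (none of this exists in controller-runtime or the PR; `controller_workers_started` and `markWorkersStarted` are hypothetical), a controller could publish a gauge that flips to 1 once workers begin, letting dashboards gate `workqueue_depth` alerts on it:

```go
// Hypothetical helper built on the Prometheus client library and
// controller-runtime's metrics registry; not part of the PR.
var workersStarted = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "controller_workers_started",
		Help: "1 once the controller's workers are draining the queue.",
	},
	[]string{"controller"},
)

func init() {
	// metrics.Registry is sigs.k8s.io/controller-runtime/pkg/metrics.Registry.
	metrics.Registry.MustRegister(workersStarted)
}

// markWorkersStarted would be called when Start launches the workers.
func markWorkersStarted(name string) {
	workersStarted.WithLabelValues(name).Set(1)
}
```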

### References

- Implementation: [#3192](https://github.com/kubernetes-sigs/controller-runtime/pull/3192)
- Earlier context: [#2005](https://github.com/kubernetes-sigs/controller-runtime/pull/2005), [#2600](https://github.com/kubernetes-sigs/controller-runtime/issues/2600)