
Commit 607e772

📖 Add a design for supporting warm replicas. (#3121)
* 📖 Add a design for supporting warm replicas.
* Address feedback.
* address PR comments

designs/warmreplicas.md

Lines changed: 78 additions & 0 deletions

Add Support for Warm Replicas
=============================

## Summary

When controllers manage huge caches, failover takes minutes because follower replicas wait to win leader election before starting informers. “Warm replicas” allow controller-runtime to start sources while a manager instance is still on standby, so the new leader can immediately schedule workers with already-populated queues. This design documents the feature implemented in [PR #3192](https://github.com/kubernetes-sigs/controller-runtime/pull/3192) and answers the outstanding review questions.

## Motivation

Controllers reconcile every object from their sources at startup and after leader failover. For sources with millions of objects (e.g., Secrets, ConfigMaps, custom resources across all namespaces), the initial List+Watch can take tens of minutes, delaying recovery. Today a controller only starts its sources inside `Start`, which the manager runs **after** acquiring the leader lock. That guarantees downtime equal to the cache warmup time whenever the leader rotates.

## Goals

- Allow controller authors to opt a controller (or all controllers) into warmup behavior with a single option (`EnableWarmup`).
- Ensure warmup never changes behavior when disabled.
- Keep the API surface minimal (no exported warmup interface yet).

## Implemented Changes

### Manager Warmup Phase

The manager already buckets runnables into groups (HTTP servers, caches, others, leader election). We added an internal `warmupRunnable` interface:
```go
type warmupRunnable interface {
    Warmup(context.Context) error
}
```

During `Start`, the manager now runs:

1. HTTP servers
2. Webhooks
3. Caches
4. `Others`
5. **Warmup runnables (new)**
6. Leader election runnables once the lock is acquired

Warmup runnables are also stopped in parallel with non-leader runnables during shutdown to avoid deadlocks.
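
To make the ordering concrete, here is a minimal sketch of the warm-then-gate-then-start sequence, assuming a simplified `runnable` interface and a `waitForLock` leader-election gate; these names are illustrative stand-ins, not the actual manager internals:

```go
import "context"

// warmupRunnable mirrors the internal interface described above.
type warmupRunnable interface {
    Warmup(context.Context) error
}

// runnable stands in for manager.Runnable.
type runnable interface {
    Start(context.Context) error
}

// warmThenStart warms every runnable that supports it, blocks on the
// leader-election gate, and only then starts the runnables themselves.
func warmThenStart(ctx context.Context, rs []runnable, waitForLock func(context.Context) error) error {
    // Step 5 above: standby replicas start sources and sync caches early.
    for _, r := range rs {
        if w, ok := r.(warmupRunnable); ok {
            if err := w.Warmup(ctx); err != nil {
                return err
            }
        }
    }
    // Step 6 above: block until this replica wins the leader lock.
    if err := waitForLock(ctx); err != nil {
        return err
    }
    // Workers now start against queues that Warmup already populated.
    for _, r := range rs {
        if err := r.Start(ctx); err != nil {
            return err
        }
    }
    return nil
}
```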

### Controller Opt-in

The option can be set manager-wide, as a default for every controller, via:

- `ctrl.Options{Controller: config.Controller{EnableWarmup: ptr.To(true)}}`

If both `EnableWarmup` and `NeedLeaderElection` are true, controller-runtime registers the controller as a warmup runnable. Calling `Warmup` launches the same event sources and cache sync logic as `Start`, but it does **not** start worker goroutines. Once the manager becomes leader, the controller’s normal `Start` simply spins up workers against the already-initialized queue. Enabling warmup on a controller that does not use leader election is a no-op, since such a controller starts immediately anyway and its workers never block on leader election being won.
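
As an illustration of that split, the toy model below separates `Warmup` (sources and cache sync only) from `Start` (workers); the `toyController` type and its fields are hypothetical and deliberately simplified, not the real controller implementation:

```go
import (
    "context"
    "sync"
)

// toyController models the Warmup/Start split described above;
// everything here is illustrative, not controller-runtime internals.
type toyController struct {
    mu             sync.Mutex
    sourcesStarted bool
    queue          []string // stands in for the controller's workqueue
}

// startSources is idempotent so that Start can safely follow Warmup.
func (c *toyController) startSources(ctx context.Context) error {
    if c.sourcesStarted {
        return nil
    }
    c.sourcesStarted = true
    c.queue = append(c.queue, "objects-from-initial-list") // simulate cache sync
    return nil
}

// Warmup starts sources and fills the queue without launching workers.
func (c *toyController) Warmup(ctx context.Context) error {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.startSources(ctx)
}

// Start runs after leader election; when warmed up, it only has to spin
// up workers against the already-populated queue.
func (c *toyController) Start(ctx context.Context) error {
    c.mu.Lock()
    defer c.mu.Unlock()
    if err := c.startSources(ctx); err != nil { // no-op if Warmup already ran
        return err
    }
    go func() {
        // worker loop draining c.queue would run here
    }()
    return nil
}
```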

### Usage Example
```go
mgr, err := ctrl.NewManager(cfg, ctrl.Options{
    Controller: config.Controller{
        EnableWarmup: ptr.To(true), // make every controller warm up
    },
})
if err != nil {
    panic(err)
}

builder.ControllerManagedBy(mgr).
    Named("slow-source").
    WithOptions(controller.Options{
        EnableWarmup: ptr.To(true), // optional per-controller override
    }).
    For(&examplev1.Example{}).
    Complete(reconciler)
```

### Operational Considerations

- **API server load** – Warm replicas temporarily duplicate List/Watch traffic: each standby replica performs the initial List and opens watches even though the current leader is already doing so. The additional load exists only while a replica is warming its caches, but on huge clusters this can still be expensive depending on the number of warm replicas.
- **Queue depth metrics** – Because warm replicas start their sources before workers run, the `workqueue_depth` metric spikes during warmup even though reconcilers have not begun processing. Alerting or SLOs based on that metric should either ignore the warmup window or switch to per-controller gauges that reset when workers start.

### References

- Implementation: [#3192](https://github.com/kubernetes-sigs/controller-runtime/pull/3192)
- Earlier context: [#2005](https://github.com/kubernetes-sigs/controller-runtime/pull/2005), [#2600](https://github.com/kubernetes-sigs/controller-runtime/issues/2600)
