Description
We are running an AKS cluster with Cilium installed and would like to use Retina for network observability.
However, after restarting the Retina DaemonSet (e.g., due to a configuration update), all network traffic in the AKS cluster stops. The only way to restore connectivity is by restarting the Cilium agent on the affected nodes.
Additionally, after Retina restarts, the following warning repeatedly appears in the Cilium logs:
level=warning msg="Detected unexpected endpoint BPF program removal. Consider investigating whether other software running on this machine is removing Cilium's endpoint BPF programs. If endpoint BPF programs are removed, the associated pods will lose connectivity and only reinstating the programs will restore connectivity." count=12 subsys=daemon
This makes Retina a no-go for us in production.
Steps to reproduce:
- Deploy Retina in an AKS cluster with Cilium.
- Restart the Retina DaemonSet (e.g., kubectl rollout restart daemonset retina).
- Observe that network traffic stops.
- Check Cilium logs for warnings about BPF program removal.
- Restart the Cilium agent on the affected nodes to restore connectivity.
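The log check in the steps above could be automated roughly as follows. This is only a minimal sketch, not part of Retina or Cilium: the helper name `removal_counts` is hypothetical, and it assumes the Cilium agent log lines keep the logfmt-style `key=value` layout of the warning quoted in this report.

```python
import re

# Substring taken from the warning quoted in this report.
WARNING_TEXT = "Detected unexpected endpoint BPF program removal"

# Assumption: the warning carries a logfmt-style field such as count=12.
COUNT_RE = re.compile(r"\bcount=(\d+)\b")


def removal_counts(lines):
    """Return the count= value of every line that contains the warning.

    Lines that contain the warning but no count field yield None.
    (Hypothetical helper, for illustration only.)
    """
    counts = []
    for line in lines:
        if WARNING_TEXT not in line:
            continue  # unrelated log line
        match = COUNT_RE.search(line)
        counts.append(int(match.group(1)) if match else None)
    return counts
```

Passing it the lines of a saved Cilium agent log (any iterable of strings) returns one entry per warning, so a non-empty result after a Retina restart would indicate that the node is affected.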
Expected behavior:
Restarting Retina should not interfere with Cilium’s BPF programs or disrupt network traffic.
Environment:
- AKS
- Cilium version: 1.17.4
- Retina version: v0.0.36
- Kubernetes version: 1.32.3