Potential Memory leak in a busy production environment #103

@jgzurano

When running the Robusta chart on a busy production cluster, I'm seeing the kubewatch container in the robusta-forwarder pod steadily increase its memory usage over time, eventually leading to OOM kills.

Chart version: robusta-0.20.0
Kubewatch image: robustadev/kubewatch:v2.9.0 (the same happens with the latest version I tested, kubewatch:v2.11.0)

Image

The cluster is medium-sized; at busy times it can have 250-500 nodes and 6000-10000 pods.

Any ideas on how I can remediate this? Perhaps something on the garbage collection side to release memory and avoid hitting the pod memory limit.
Any help is appreciated in identifying whether this is a consequence of watching all kinds of events or a memory management issue in the kubewatch code.
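
To make the "leak vs. garbage collection" question concrete, here is a minimal standalone Go sketch (not kubewatch code) of how the two cases could be told apart: sample the heap right after a forced GC while objects are still referenced, then again after the references are dropped. If memory that survives a GC keeps growing, something is holding references; if it drops back, the growth is more likely GC lag. The names `heapSample`, `sampleHeap`, and `retainAndMeasure` are hypothetical, and the 400 MiB soft limit is an arbitrary illustrative value. `debug.SetMemoryLimit` needs Go 1.19 or newer; I have not checked which Go version the kubewatch image is built with.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// heapSample holds the heap figures sampled at one point in time.
type heapSample struct {
	HeapAllocMiB float64 // bytes of allocated heap objects, in MiB
	HeapSysMiB   float64 // heap memory obtained from the OS, in MiB
	NumGC        uint32  // completed GC cycles so far
}

// sampleHeap reads the runtime memory statistics of the current process.
func sampleHeap() heapSample {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	const mib = 1024 * 1024
	return heapSample{
		HeapAllocMiB: float64(m.HeapAlloc) / mib,
		HeapSysMiB:   float64(m.HeapSys) / mib,
		NumGC:        m.NumGC,
	}
}

// retainAndMeasure allocates n buffers of 1 MiB each, forces a GC while they
// are still referenced, then drops the references and forces another GC.
func retainAndMeasure(n int) (held, released heapSample) {
	retained := make([][]byte, 0, n)
	for i := 0; i < n; i++ {
		retained = append(retained, make([]byte, 1<<20))
	}
	runtime.GC()
	held = sampleHeap()
	runtime.KeepAlive(retained) // keep the buffers live up to this point

	retained = nil
	runtime.GC()
	released = sampleHeap()
	return held, released
}

func main() {
	// Assumption: a soft limit of 400 MiB, chosen only for illustration.
	// It makes the GC work harder near the limit; it cannot free memory
	// that is still referenced.
	debug.SetMemoryLimit(400 << 20)

	held, released := retainAndMeasure(64)
	fmt.Printf("while referenced: %.1f MiB heap, %d GCs\n", held.HeapAllocMiB, held.NumGC)
	fmt.Printf("after release:    %.1f MiB heap, %d GCs\n", released.HeapAllocMiB, released.NumGC)
}
```

For a binary that cannot be modified, the `GOMEMLIMIT` environment variable sets the same soft limit on Go 1.19+, but it would only help if the growth is collectable garbage rather than retained objects.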

I'm using the chart's default config for watched resources:

    resource:
      clusterrole: true
      clusterrolebinding: true
      configmap: true
      coreevent: false
      daemonset: true
      deployment: true
      event: true
      hpa: true
      ingress: true
      job: true
      namespace: true
      node: true
      persistentvolume: true
      pod: true
      replicaset: true
      replicationcontroller: false
      secret: false
      serviceaccount: true
      services: true
      statefulset: true
