feat(packetparser): Allow sampling of packets (#1767)
# Description
This PR adds optional sampling of packet reporting for `packetparser`
when running at the `high` data aggregation level.
By default, all packets are reported. Optionally, `1 out of n` packets
are sampled at random; packets carrying certain important control flags,
and packets that hit the reporting interval, are reported regardless.
This allows Retina to scale to high network volume environments at the
trade-off of some reporting granularity.
The performance impact of this is mostly for workloads with lots of new
connections; connections already tracked in the conntrack table rely on
#1665 for scalability.
The behavior added in #1665
allows for accurate reporting of metrics despite sampling being in
place.
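As a rough illustration of the decision described above, here is a minimal user-space sketch in Go. It is not the eBPF code from this PR: `shouldReport`, its parameters, and the choice of SYN/FIN/RST as the "important control flags" are assumptions made for the example, since the description does not name the flags.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Standard TCP header flag bits.
const (
	flagFIN uint8 = 0x01
	flagSYN uint8 = 0x02
	flagRST uint8 = 0x04
)

// shouldReport is a hypothetical helper (not part of Retina's API) that
// mirrors the described behavior:
//   - rate <= 1 means sampling is effectively off and every packet is reported
//   - packets with control flags are always reported (ASSUMPTION: SYN/FIN/RST)
//   - packets that hit the reporting interval are always reported
//   - everything else is reported with probability 1/rate
//
// randN must return a pseudo-random integer in [0, n); it is injected so the
// decision can be exercised deterministically.
func shouldReport(tcpFlags uint8, intervalElapsed bool, rate uint32, randN func(n uint32) uint32) bool {
	if rate <= 1 {
		return true
	}
	if tcpFlags&(flagSYN|flagFIN|flagRST) != 0 {
		return true
	}
	if intervalElapsed {
		return true
	}
	return randN(rate) == 0
}

func main() {
	randN := func(n uint32) uint32 { return uint32(rand.Int63n(int64(n))) }

	const total = 100000
	reported := 0
	for i := 0; i < total; i++ {
		// 0x10 is a plain ACK: no control flag, so only sampling applies.
		if shouldReport(0x10, false, 1000, randN) {
			reported++
		}
	}
	fmt.Printf("reported %d of %d ACK-only packets at rate 1000\n", reported, total)
}
```

With a rate of 1000, roughly 1 in 1000 ACK-only packets should be reported in this sketch, while every packet with one of the assumed control flags still goes through.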
## Related Issue
#1760
## Checklist
- [X] I have read the [contributing
documentation](https://retina.sh/docs/Contributing/overview).
- [X] I signed and signed-off the commits (`git commit -S -s ...`). See
[this
documentation](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification)
on signing commits.
- [X] I have correctly attributed the author(s) of the code.
- [X] I have tested the changes locally.
- [X] I have followed the project's style guidelines.
- [X] I have updated the documentation, if necessary.
- [X] I have added tests, if applicable.
## Screenshots (if applicable) or Testing Completed
## Main
<img width="1487" height="860" alt="Screenshot 2025-07-22 at 4 51 24 PM"
src="https://github.com/user-attachments/assets/72bc7b42-b280-4d10-aa7b-d114b460cd73"
/>
## After the change (with default sampling rate of 1)
<img width="1487" height="860" alt="Screenshot 2025-07-22 at 4 57 36 PM"
src="https://github.com/user-attachments/assets/6c115205-3068-4e97-ac51-9980c088890d"
/>
## After the change (with sampling rate of 1000)
<img width="1487" height="856" alt="Screenshot 2025-07-22 at 5 04 22 PM"
src="https://github.com/user-attachments/assets/b5e6cd5e-9c44-446f-bc1d-996044820f16"
/>
---
Please refer to the [CONTRIBUTING.md](../CONTRIBUTING.md) file for more
information on how to contribute to this project.
Signed-off-by: Matthew McKeen <[email protected]>
docs/02-Installation/03-Config.md (1 addition, 0 deletions)
@@ -53,6 +53,7 @@ Apply to both Agent and Operator.
 * `enableAnnotations`: Enables gathering of metrics for annotated resources. Resources can be annotated with `retina.sh=observe`. Requires the operator and `operator.enableRetinaEndpoint` to be enabled. By enabling annotations, the agent will not use MetricsConfiguration CRD.
 * `bypassLookupIPOfInterest`: If true, plugins like `packetparser` and `dropreason` will bypass IP lookup, generating an event for each packet regardless. `enableAnnotations` will not work if this is true.
 * `dataAggregationLevel`: Defines the level of data aggregation for Retina. See [Data Aggregation](../05-Concepts/data-aggregation.md) for more details.
+* `dataSamplingRate`: Defines the data sampling rate for `packetparser`. See [Sampling](../03-Metrics/plugins/Linux/packetparser.md#sampling) for more details.
docs/03-Metrics/plugins/Linux/packetparser.md (8 additions, 0 deletions)
@@ -15,6 +15,14 @@ The `packetparser` plugin requires the `CAP_NET_ADMIN` and `CAP_SYS_ADMIN` capab

 `packetparser` does not produce Basic metrics. In Advanced mode (refer to [Metric Modes](../../modes/modes.md)), the plugin transforms an eBPF result into an enriched `Flow` by adding Pod information based on IP. It then sends the `Flow` to an external channel, enabling *several modules* to generate Pod-Level metrics.

+## Sampling
+
+Since `packetparser` produces many enriched `Flow` objects it can be quite expensive for user space to process. Thus, when operating in `high` [data aggregation](../../../05-Concepts/data-aggregation.md) level optional sampling for reported packets is available via the `dataSamplingRate` configuration option.
+
+`dataSamplingRate` is expressed in 1 out of N terms, where N is the `dataSamplingRate` value. For example, if `dataSamplingRate` is 3 1/3rd of packets will be sampled for reporting.
+
+Keep in mind that there are cases where reporting will happen anyways as to ensure metric accuracy.
+
 ### Code locations

 - Plugin and eBPF code: *pkg/plugin/packetparser/*