Commit 652930b

feat(packetparser): use ring buffer back-pressure
Signed-off-by: Agyei Holy <agyeiholy978@gmail.com>
1 parent 98b65c6 commit 652930b

9 files changed, 149 additions and 36 deletions

deploy/standard/manifests/controller/helm/retina/values.yaml

Lines changed: 3 additions & 0 deletions
@@ -56,6 +56,9 @@ remoteContext: false
 enableAnnotations: false
 bypassLookupIPOfInterest: false
 dataAggregationLevel: "low"
+# Static packet sampling for packetparser when using perf event arrays.
+# Ignored when packetParserRingBuffer="enabled", because ring buffer back-pressure
+# becomes the adaptation mechanism.
 dataSamplingRate: 1
 # Use BPF ring buffers (BPF_MAP_TYPE_RINGBUF) instead of BPF_PERF_EVENT_ARRAY.
 # Pros: lower per-event overhead at high event rates, simpler variable-sized records, more consistent latency.

docs/01-Introduction/01-intro.md

Lines changed: 7 additions & 6 deletions
@@ -105,11 +105,12 @@ The following are known system requirements for installing Retina:
 
 Community users have reported performance considerations when using **Advanced metrics with the `packetparser` plugin** on nodes with high CPU core counts (32+ cores) under sustained, high-volume network load.
 
-If you plan to deploy Retina in Advanced mode on large node types with network-intensive workloads, consider:
-
-1. **Start with Basic metrics mode** (does not use `packetparser`)
-2. Enable `dataSamplingRate` if you need Advanced metrics
-3. Monitor CPU usage and network throughput after deployment
-4. See [`packetparser` performance considerations](../03-Metrics/plugins/Linux/packetparser.md#performance-considerations) for more information
+If you plan to deploy Retina in Advanced mode on large node types with network-intensive workloads, consider:
+
+1. **Start with Basic metrics mode** (does not use `packetparser`)
+2. Enable `packetParserRingBuffer` if you need Advanced metrics on high-throughput nodes
+3. If you stay on perf event arrays, tune `dataSamplingRate`
+4. Monitor CPU usage and network throughput after deployment
+5. See [`packetparser` performance considerations](../03-Metrics/plugins/Linux/packetparser.md#performance-considerations) for more information
 
 The Retina team is evaluating options to address these reported concerns in future releases.

docs/02-Installation/03-Config.md

Lines changed: 2 additions & 2 deletions
@@ -53,8 +53,8 @@ Apply to both Agent and Operator.
 * `enableAnnotations`: Enables gathering of metrics for annotated resources. Resources can be annotated with `retina.sh=observe`. Requires the operator and `operator.enableRetinaEndpoint` to be enabled. By enabling annotations, the agent will not use MetricsConfiguration CRD.
 * `bypassLookupIPOfInterest`: If true, plugins like `packetparser` and `dropreason` will bypass IP lookup, generating an event for each packet regardless. `enableAnnotations` will not work if this is true.
 * `dataAggregationLevel`: Defines the level of data aggregation for Retina. See [Data Aggregation](../05-Concepts/data-aggregation.md) for more details.
-* `dataSamplingRate`: Defines the data sampling rate for `packetparser`. See [Sampling](../03-Metrics/plugins/Linux/packetparser.md#sampling) for more details.
-* `packetParserRingBuffer`: Selects the kernel-to-userspace transport for `packetparser`. Accepted values: `enabled` (ring buffer) or `disabled` (perf event array). `auto` is reserved for future use.
+* `dataSamplingRate`: Defines the static data sampling rate for `packetparser` when `packetParserRingBuffer=disabled`. See [Sampling](../03-Metrics/plugins/Linux/packetparser.md#sampling) for more details.
+* `packetParserRingBuffer`: Selects the kernel-to-userspace transport for `packetparser`. Accepted values: `enabled` (ring buffer) or `disabled` (perf event array). `auto` is reserved for future use. When enabled, `packetparser` relies on ring buffer back-pressure and ignores `dataSamplingRate`.
 * `packetParserRingBufferSize`: Ring buffer size in bytes when `packetParserRingBuffer=enabled`. Must be a power of two between the kernel page size and 1GiB (inclusive); invalid values cause startup to fail.
 
 ## Operator Configuration
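
The size rule documented for `packetParserRingBufferSize` in this hunk (a power of two between the kernel page size and 1GiB, inclusive) can be illustrated with a small check. This is a minimal sketch of the stated rule only, not Retina's validation code; the names `validRingBufSize` and `maxRingBufSize` are hypothetical, and the page size is taken from `os.Getpagesize()` as an assumption about where that bound would come from.

```go
package main

import (
	"fmt"
	"os"
)

// maxRingBufSize is the documented upper bound (1 GiB, inclusive).
// The constant name is hypothetical.
const maxRingBufSize = 1 << 30

// validRingBufSize reports whether size is a power of two in the range
// [pageSize, 1 GiB]. It is an illustration of the documented rule, not
// the function Retina uses.
func validRingBufSize(size, pageSize uint32) bool {
	if size < pageSize || size > maxRingBufSize {
		return false
	}
	// A power of two has exactly one bit set, so clearing the lowest
	// set bit must leave zero.
	return size&(size-1) == 0
}

func main() {
	page := uint32(os.Getpagesize())
	for _, size := range []uint32{4096, 8388608, 5000000} {
		fmt.Println(size, validRingBufSize(size, page))
	}
}
```

The value `8388608` used in the troubleshooting example of this commit is 2^23 (8 MiB), so it passes this check on systems with a 4096-byte page.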

docs/03-Metrics/plugins/Linux/packetparser.md

Lines changed: 6 additions & 3 deletions
@@ -44,9 +44,10 @@ Alternative data transfer mechanisms like BPF ring buffers (BPF_MAP_TYPE_RINGBUF
 If you observe performance degradation on high-core-count nodes:
 
 1. **Disable `packetparser`**: Use Basic metrics mode which doesn't require this plugin
-2. **Enable Sampling**: Use the `dataSamplingRate` configuration option (see [Sampling](#sampling) section)
-3. **Use High Data Aggregation**: Configure `high` [data aggregation](../../../05-Concepts/data-aggregation.md)
-4. **Monitor Impact**: Watch for elevated CPU usage, context switches, or throughput changes
+2. **Enable Ring Buffers**: Use `packetParserRingBuffer=enabled` and size the shared buffer with `packetParserRingBufferSize`
+3. **Enable Sampling**: If you stay on perf event arrays, use `dataSamplingRate` (see [Sampling](#sampling))
+4. **Use High Data Aggregation**: Configure `high` [data aggregation](../../../05-Concepts/data-aggregation.md)
+5. **Monitor Impact**: Watch for elevated CPU usage, context switches, or throughput changes
 
 **Note:** The Retina team is evaluating options for addressing reported performance concerns, including potential support for alternative data transfer mechanisms. Community feedback and contributions are welcome.
 
@@ -58,6 +59,8 @@ Since `packetparser` produces many enriched `Flow` objects it can be quite expen
 
 Keep in mind that there are cases where reporting will happen anyways as to ensure metric accuracy.
 
+When `packetParserRingBuffer=enabled`, `packetparser` ignores `dataSamplingRate`. In that mode the shared BPF ring buffer is the adaptation mechanism: events are emitted normally while capacity exists, and additional events are dropped only when `bpf_ringbuf_reserve()` cannot reserve space.
+
 ### Code locations
 
 - Plugin and eBPF code: *pkg/plugin/packetparser/*
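
The back-pressure behavior described in this hunk (emit while capacity exists, drop only when a reservation fails) can be illustrated with a toy userspace model. This is a sketch under simplifying assumptions, not the kernel data structure: the type `ringBuf` and the helpers `reserve`, `consume`, and `emitBurst` are hypothetical names, and the model ignores the per-record header overhead that a real `BPF_MAP_TYPE_RINGBUF` adds.

```go
package main

import "fmt"

// ringBuf is a toy stand-in for a fixed-capacity ring buffer.
// It only tracks bytes in use; it does not store any records.
type ringBuf struct {
	capacity int // total bytes available
	used     int // bytes reserved and not yet consumed
	dropped  int // events rejected because a reservation failed
}

// reserve mimics the contract of a reserve call: it either grants space
// for one record or fails, in which case the event is counted as dropped.
func (r *ringBuf) reserve(size int) bool {
	if r.used+size > r.capacity {
		r.dropped++
		return false
	}
	r.used += size
	return true
}

// consume models the userspace reader draining up to n bytes.
func (r *ringBuf) consume(n int) {
	if n > r.used {
		n = r.used
	}
	r.used -= n
}

// emitBurst tries to emit count events of eventSize bytes each and
// returns how many were accepted.
func emitBurst(r *ringBuf, count, eventSize int) int {
	accepted := 0
	for i := 0; i < count; i++ {
		if r.reserve(eventSize) {
			accepted++
		}
	}
	return accepted
}

func main() {
	rb := &ringBuf{capacity: 4096}
	fmt.Println("accepted:", emitBurst(rb, 100, 64), "dropped:", rb.dropped)
	rb.consume(1024) // the reader catches up a little
	fmt.Println("accepted:", emitBurst(rb, 100, 64), "dropped:", rb.dropped)
}
```

In this model no event is dropped until the buffer is full, and the drop rate then tracks how fast the reader drains it, which is the contrast with static sampling that the documentation draws.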

docs/06-Troubleshooting/performance.md

Lines changed: 25 additions & 5 deletions
@@ -73,9 +73,27 @@ helm upgrade retina oci://ghcr.io/microsoft/retina/charts/retina \
 
 **Trade-off:** You'll have node-level metrics only, not pod-level metrics.
 
-### Option 2: Enable Data Sampling
+### Option 2: Enable Ring Buffer Back-Pressure
 
-Reduce event volume by sampling packets:
+Switch `packetparser` to `BPF_MAP_TYPE_RINGBUF` so event dropping only happens when the shared buffer is actually full:
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: retina-config
+  namespace: kube-system
+data:
+  config.yaml: |
+    packetParserRingBuffer: "enabled"
+    packetParserRingBufferSize: 8388608
+```
+
+**Trade-off:** Uses a fixed amount of locked memory, and burst capacity is bounded by `packetParserRingBufferSize`.
+
+### Option 3: Enable Data Sampling
+
+If you stay on perf event arrays, reduce event volume by sampling packets:
 
 ```yaml
 apiVersion: v1
@@ -90,7 +108,9 @@ data:
 
 **Trade-off:** Reduced data granularity, but lower overhead.
 
-### Option 3: Use High Data Aggregation Level
+**Note:** `dataSamplingRate` is ignored when `packetParserRingBuffer="enabled"`.
+
+### Option 4: Use High Data Aggregation Level
 
 Reduce events at the eBPF level:
 
@@ -107,7 +127,7 @@ data:
 
 **Trade-off:** Disables host interface monitoring; API server latency metrics may be less reliable.
 
-### Option 4: Selective Deployment
+### Option 5: Selective Deployment
 
 Deploy Retina only on nodes where you need detailed observability:
 
@@ -142,7 +162,7 @@ bpftool map list | grep retina
 bpftool map show name retina_packetparser_events
 ```
 
-Currently, `packetparser` uses `BPF_MAP_TYPE_PERF_EVENT_ARRAY`.
+By default, `packetparser` uses `BPF_MAP_TYPE_PERF_EVENT_ARRAY`. If `packetParserRingBuffer=enabled`, it uses `BPF_MAP_TYPE_RINGBUF`.
 
 ### Monitoring Event Rates (Advanced)

pkg/plugin/packetparser/_cprog/packetparser.c

Lines changed: 26 additions & 16 deletions
@@ -122,7 +122,25 @@ static int parse_tcp_ts(struct tcphdr *tcph, void *data_end, __u32 *tsval, __u32
 	return -1;
 }
 
-// Function to parse the packet and send it to the perf buffer.
+// Emit a packet to the configured userspace transport.
+static __always_inline void emit_packet(struct __sk_buff *skb, struct packet *p)
+{
+#ifdef USE_RING_BUFFER
+	struct packet *event;
+
+	event = bpf_ringbuf_reserve(&retina_packetparser_events, sizeof(*event), 0);
+	if (!event) {
+		return;
+	}
+
+	__builtin_memcpy(event, p, sizeof(*event));
+	bpf_ringbuf_submit(event, 0);
+#else
+	bpf_perf_event_output(skb, &retina_packetparser_events, BPF_F_CURRENT_CPU, p, sizeof(*p));
+#endif
+}
+
+// Function to parse the packet and send it to the configured userspace buffer.
 static void parse(struct __sk_buff *skb, __u8 obs)
 {
 	struct packet p;
@@ -216,20 +234,20 @@ static void parse(struct __sk_buff *skb, __u8 obs)
 	p.conntrack_metadata = conntrack_metadata;
 #endif // ENABLE_CONNTRACK_METRICS
 
-#ifdef DATA_AGGREGATION_LEVEL
+#ifdef DATA_AGGREGATION_LEVEL
 
 	// Calculate sampling
 	bool sampled __attribute__((unused));
 	sampled = true;
-
-#ifdef DATA_SAMPLING_RATE
-	u32 rand __attribute__((unused));
+
+#if defined(DATA_SAMPLING_RATE) && DATA_SAMPLING_RATE > 1 && !defined(USE_RING_BUFFER)
+	u32 rand __attribute__((unused));
 	rand = bpf_get_prandom_u32();
 	if (rand >= UINT32_MAX / DATA_SAMPLING_RATE) {
 		sampled = false;
 	}
 #endif
-
+
 	// Process the packet in ct
 	struct packetreport report __attribute__((unused));
 	report = ct_process_packet(&p, obs, sampled);
@@ -239,23 +257,15 @@ static void parse(struct __sk_buff *skb, __u8 obs)
 	p.previously_observed_packets = 0;
 	p.previously_observed_bytes = 0;
 	__builtin_memset(&p.previously_observed_flags, 0, sizeof(struct tcpflagscount));
-#ifdef USE_RING_BUFFER
-	bpf_ringbuf_output(&retina_packetparser_events, &p, sizeof(p), 0);
-#else
-	bpf_perf_event_output(skb, &retina_packetparser_events, BPF_F_CURRENT_CPU, &p, sizeof(p));
-#endif
+	emit_packet(skb, &p);
 	return;
 	// If the data aggregation level is high, only send the packet to the perf buffer if it needs to be reported.
 #elif DATA_AGGREGATION_LEVEL == DATA_AGGREGATION_LEVEL_HIGH
 	if (report.report) {
 		p.previously_observed_packets = report.previously_observed_packets;
 		p.previously_observed_bytes = report.previously_observed_bytes;
 		p.previously_observed_flags = report.previously_observed_flags;
-#ifdef USE_RING_BUFFER
-		bpf_ringbuf_output(&retina_packetparser_events, &p, sizeof(p), 0);
-#else
-		bpf_perf_event_output(skb, &retina_packetparser_events, BPF_F_CURRENT_CPU, &p, sizeof(p));
-#endif
+		emit_packet(skb, &p);
 	}
 #endif
 #endif
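
The static sampling comparison kept for perf-buffer mode in this file (`rand >= UINT32_MAX / DATA_SAMPLING_RATE` marks a packet as not sampled) can be mirrored in userspace to see what a given rate means. This is a minimal sketch for illustration; `sampled` and `keepFraction` are hypothetical helper names, and the early return for rates of 0 or 1 is an assumption that models the new `DATA_SAMPLING_RATE > 1` preprocessor guard.

```go
package main

import (
	"fmt"
	"math"
)

// sampled reports whether a packet with random draw rnd is kept at a
// 1-in-rate setting. It mirrors the comparison in the C code: the packet
// is rejected when rnd >= UINT32_MAX / rate (integer division).
// Rates of 0 or 1 keep every packet, modelling the "> 1" guard.
func sampled(rnd, rate uint32) bool {
	if rate <= 1 {
		return true
	}
	return rnd < math.MaxUint32/rate
}

// keepFraction returns the approximate share of packets kept at rate.
func keepFraction(rate uint32) float64 {
	if rate <= 1 {
		return 1
	}
	return float64(math.MaxUint32/rate) / float64(math.MaxUint32)
}

func main() {
	for _, rate := range []uint32{1, 2, 10, 2147483647} {
		fmt.Printf("rate=%d keep fraction=%.10f\n", rate, keepFraction(rate))
	}
}
```

With the rate of 2147483647 used in the eBPF tests of this commit, the integer threshold is 2, so only draws of 0 or 1 pass, which matches the test comment "effectively never sampled".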

pkg/plugin/packetparser/packetparser_ebpf_test.go

Lines changed: 59 additions & 1 deletion
@@ -644,7 +644,9 @@ func compileAndLoadVariantBase(t *testing.T, opts compileOpts) (*packetparserObj
 		st += "#define ENABLE_CONNTRACK_METRICS 1\n"
 	}
 	st += fmt.Sprintf("#define DATA_AGGREGATION_LEVEL %d\n", opts.aggregationLevel)
-	st += fmt.Sprintf("#define DATA_SAMPLING_RATE %d\n", opts.samplingRate)
+	if !opts.enableRingBuf {
+		st += fmt.Sprintf("#define DATA_SAMPLING_RATE %d\n", opts.samplingRate)
+	}
 	require.NoError(t, os.WriteFile(ppDynamic, []byte(st), 0o644))
 
 	// Write conntrack dynamic.h if conntrack metrics enabled.
@@ -863,6 +865,62 @@ func TestHighAggregationPreviouslyObserved(t *testing.T) {
 		"expected previously_observed_packets > 0 at HIGH aggregation")
 }
 
+func TestHighAggregationSamplingSuppressesPerfBufferEvents(t *testing.T) {
+	objs, reader := compileAndLoadVariant(t, compileOpts{
+		bypassFilter:     1,
+		enableConntrack:  false,
+		aggregationLevel: 1,          // HIGH
+		samplingRate:     2147483647, // effectively never sampled
+	})
+
+	srcIP := net.ParseIP("10.0.16.1")
+	dstIP := net.ParseIP("10.0.16.2")
+
+	synPkt := ebpftest.BuildTCPPacket(ebpftest.TCPPacketOpts{
+		SrcIP: srcIP, DstIP: dstIP, SrcPort: 61000, DstPort: 80, SYN: true,
+	})
+	ebpftest.RunProgram(t, objs.EndpointIngressFilter, synPkt)
+	_, ok := ebpftest.ReadPerfEvent[packetparserPacket](t, reader, perfReaderTimeout)
+	require.True(t, ok, "SYN should still be reported at HIGH aggregation")
+
+	ackPkt := ebpftest.BuildTCPPacket(ebpftest.TCPPacketOpts{
+		SrcIP: srcIP, DstIP: dstIP, SrcPort: 61000, DstPort: 80, ACK: true,
+	})
+	ebpftest.RunProgram(t, objs.EndpointIngressFilter, ackPkt)
+	_, ok = ebpftest.ReadPerfEvent[packetparserPacket](t, reader, 100*time.Millisecond)
+	require.False(t, ok, "perf-buffer sampling should suppress the ACK event")
+}
+
+func TestHighAggregationRingBufferIgnoresSamplingRate(t *testing.T) {
+	if err := ensureRingBufKernelSupported(); err != nil {
+		t.Skipf("ring buffer not supported: %v", err)
+	}
+
+	objs, reader := compileAndLoadRingBufVariant(t, compileOpts{
+		bypassFilter:     1,
+		enableConntrack:  false,
+		aggregationLevel: 1,          // HIGH
+		samplingRate:     2147483647, // ignored in ring-buffer mode
+	})
+
+	srcIP := net.ParseIP("10.0.16.3")
+	dstIP := net.ParseIP("10.0.16.4")
+
+	synPkt := ebpftest.BuildTCPPacket(ebpftest.TCPPacketOpts{
+		SrcIP: srcIP, DstIP: dstIP, SrcPort: 62000, DstPort: 80, SYN: true,
+	})
+	ebpftest.RunProgram(t, objs.EndpointIngressFilter, synPkt)
+	_, ok := ebpftest.ReadRingBufEvent[packetparserPacket](t, reader, perfReaderTimeout)
+	require.True(t, ok, "SYN should still be reported at HIGH aggregation")
+
+	ackPkt := ebpftest.BuildTCPPacket(ebpftest.TCPPacketOpts{
+		SrcIP: srcIP, DstIP: dstIP, SrcPort: 62000, DstPort: 80, ACK: true,
+	})
+	ebpftest.RunProgram(t, objs.EndpointIngressFilter, ackPkt)
+	_, ok = ebpftest.ReadRingBufEvent[packetparserPacket](t, reader, perfReaderTimeout)
+	require.True(t, ok, "ring-buffer mode should ignore dataSamplingRate and report the ACK event")
+}
+
 // =============================================================================
 // Conntrack map-state verification tests
 // =============================================================================

pkg/plugin/packetparser/packetparser_linux.go

Lines changed: 8 additions & 3 deletions
@@ -127,9 +127,14 @@ func (p *packetParser) Generate(ctx context.Context) error {
 	p.l.Info("data aggregation level", zap.String("level", p.cfg.DataAggregationLevel.String()))
 	st += fmt.Sprintf("#define DATA_AGGREGATION_LEVEL %d\n", p.cfg.DataAggregationLevel)
 
-	// Process packetparser sampling rate.
-	p.l.Info("sampling rate", zap.Uint32("rate", p.cfg.DataSamplingRate))
-	st += fmt.Sprintf("#define DATA_SAMPLING_RATE %d\n", p.cfg.DataSamplingRate)
+	// Perf-buffer mode supports static sampling. Ring-buffer mode relies on
+	// reserve()/submit() back-pressure instead and therefore ignores the rate.
+	if p.cfg.PacketParserRingBuffer.IsEnabled() {
+		p.l.Info("ring buffer back-pressure enabled; ignoring sampling rate", zap.Uint32("rate", p.cfg.DataSamplingRate))
+	} else {
+		p.l.Info("sampling rate", zap.Uint32("rate", p.cfg.DataSamplingRate))
+		st += fmt.Sprintf("#define DATA_SAMPLING_RATE %d\n", p.cfg.DataSamplingRate)
+	}
 
 	// Generate dynamic header for packetparser.
 	err = loader.WriteFile(ctx, dynamicHeaderPath, st)
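
The effect of this change on the generated dynamic header can be sketched in isolation. The following is an illustration only, assuming a standalone helper with plain parameters instead of Retina's config struct; the name `buildDynamicHeader` is hypothetical, and emitting the `BYPASS_LOOKUP_IP_OF_INTEREST` define only when the flag is true is an assumption inferred from the expected contents in the unit test of this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// buildDynamicHeader returns the #define lines for the given settings.
// DATA_SAMPLING_RATE is written only in perf-buffer mode, so the eBPF
// program compiled for ring-buffer mode never sees a sampling rate.
func buildDynamicHeader(bypass bool, aggLevel int, samplingRate uint32, ringBuf bool) string {
	var sb strings.Builder
	if bypass {
		// Assumption: the define is omitted entirely when bypass is false.
		sb.WriteString("#define BYPASS_LOOKUP_IP_OF_INTEREST 1\n")
	}
	fmt.Fprintf(&sb, "#define DATA_AGGREGATION_LEVEL %d\n", aggLevel)
	if !ringBuf {
		fmt.Fprintf(&sb, "#define DATA_SAMPLING_RATE %d\n", samplingRate)
	}
	return sb.String()
}

func main() {
	fmt.Print(buildDynamicHeader(true, 1, 99, true))
	fmt.Println("---")
	fmt.Print(buildDynamicHeader(true, 1, 99, false))
}
```

Leaving the macro undefined, rather than defining it as 0 or 1, lets the C side gate sampling with `defined(DATA_SAMPLING_RATE)`, which is what the updated `#if` in `packetparser.c` relies on together with `!defined(USE_RING_BUFFER)`.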

pkg/plugin/packetparser/packetparser_linux_test.go

Lines changed: 13 additions & 0 deletions
@@ -660,6 +660,19 @@ func TestPacketParseGenerate(t *testing.T) {
 				"#define DATA_AGGREGATION_LEVEL 1\n" +
 				"#define DATA_SAMPLING_RATE 0\n",
 		},
+		{
+			name: "RingBufferIgnoresSamplingRate",
+			cfg: &kcfg.Config{
+				EnablePodLevel:             true,
+				BypassLookupIPOfInterest:   true,
+				DataAggregationLevel:       kcfg.High,
+				DataSamplingRate:           99,
+				PacketParserRingBuffer:     kcfg.PacketParserRingBufferEnabled,
+				PacketParserRingBufferSize: 4096,
+			},
+			expectedContents: "#define BYPASS_LOOKUP_IP_OF_INTEREST 1\n" +
+				"#define DATA_AGGREGATION_LEVEL 1\n",
+		},
 	}
 
 	for _, tt := range tests {
