fix(helm): drop unresolvable _grpc._tcp. SRV prefix from query-backen… by nissessenap · Pull Request #5232 · grafana/pyroscope

nissessenap · 2026-06-03T11:03:12Z

What this PR does

Fixes #5229: in v2 Helm mode, every profile-data read query (SelectMergeStacktraces, SelectSeries) hangs ~30s and returns HTTP 499, so flame graphs render empty. The query-backend component logs nothing — the request never reaches it.

Root cause

The chart rendered:

-query-backend.address=dns:///_grpc._tcp.-headless.$(NAMESPACE_FQDN):9095

This is passed directly to the query-backend client's grpc.NewClient, which uses grpc-go's stock dns resolver. That resolver does an A/AAAA lookup on the literal host; it only does SRV lookups for grpclb (_grpclb._tcp.<host>), and EnableSRVLookups is false by default. _grpc._tcp.<headless> has an SRV record but no A/AAAA record, so the resolver returns zero endpoints. With the client's waitForReady: true service config, calls park until the 30s timeout and then fail with HTTP 499.
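To make the lookup name concrete, here is a minimal Go sketch of how a `dns:///host:port` target reduces to the host that an A/AAAA lookup is issued for. It is a simplified illustration using only the standard library, not grpc-go's actual resolver code; `lookupName` and the release and namespace names are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// lookupName returns the host and port that a plain A/AAAA-based resolver
// would query for a "dns:///host:port" target.
// Simplified illustration only; not grpc-go's actual implementation.
func lookupName(target string) (host, port string, err error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", "", err
	}
	if u.Scheme != "dns" {
		return "", "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	// With an empty authority ("dns:///"), the endpoint is the URL path
	// without its leading slash.
	endpoint := strings.TrimPrefix(u.Path, "/")
	return net.SplitHostPort(endpoint)
}

func main() {
	// Hypothetical release ("pyroscope") and namespace ("monitoring").
	targets := []string{
		"dns:///_grpc._tcp.pyroscope-headless.monitoring.svc.cluster.local:9095",
		"dns:///pyroscope-headless.monitoring.svc.cluster.local:9095",
	}
	for _, t := range targets {
		host, port, err := lookupName(t)
		fmt.Println(host, port, err)
	}
}
```

In the first target the `_grpc._tcp.` labels stay part of the host, so an address lookup is issued for a name that only carries an SRV record. Querying that record would take a separate SRV lookup, such as the standard library's `net.LookupSRV("grpc", "tcp", name)`.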

Metadata RPCs (ProfileTypes, Series, LabelNames, ...) are unaffected because the metastore client uses its own kubernetes:// discovery resolver, not grpc-go's.

Fix

Drop the _grpc._tcp. prefix so the address resolves the headless service's plain A records:

-query-backend.address=dns:///-headless.$(NAMESPACE_FQDN):9095

The headless service is clusterIP: None, so this resolves all ready pod IPs, and the client's existing round_robin LB policy balances across them.
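As a rough illustration of what round-robin balancing over the resolved pod addresses means, the sketch below cycles through a fixed address list. It is a toy model under stated assumptions, not grpc-go's round_robin balancer; the `roundRobin` type and the pod IPs are hypothetical.

```go
package main

import "fmt"

// roundRobin cycles through a fixed list of addresses.
// Toy model only: it is not safe for concurrent use and does not track
// connection state the way a real balancer does.
type roundRobin struct {
	addrs []string
	next  int
}

// pick returns the next address in rotation, or false when the list is
// empty (the "zero endpoints" case).
func (r *roundRobin) pick() (string, bool) {
	if len(r.addrs) == 0 {
		return "", false
	}
	addr := r.addrs[r.next]
	r.next = (r.next + 1) % len(r.addrs)
	return addr, true
}

func main() {
	// Hypothetical pod IPs, as the A records of a headless service might return them.
	rr := &roundRobin{addrs: []string{"10.0.0.1:9095", "10.0.0.2:9095", "10.0.0.3:9095"}}
	for i := 0; i < 4; i++ {
		addr, _ := rr.pick()
		fmt.Println(addr)
	}
}
```

With a single-element list, as in single-binary mode, every pick returns the same pod. With an empty list there is nothing to pick, which is the situation the old address produced.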

Microservices considered

Both single-binary v2 and microservices v2 rendered the same broken default, so microservices reads were affected too (just not yet reported). The fix is correct for both: single-binary resolves to one pod, microservices round-robins across N query-backend replicas via the headless A records. Confirmed in the regenerated rendered/*.yaml.

kubernetes:// / dnssrvnoa+ are not options for this address — the query-backend client uses a bare grpc.NewClient with no custom resolver registered (only the metastore client has one).

Backward compatibility

Users of the extraArgs workaround are unaffected: the chart still skips its default when query-backend.address is set in extraArgs, so no duplicate flag is rendered. After upgrading, they can drop the override.
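The override behaviour can be modelled in a few lines of Go. This is a hypothetical model of the described behaviour, not the chart's Helm template; `renderArgs`, `flagName`, and the example addresses are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// flagName is the flag whose default the chart skips when overridden.
const flagName = "query-backend.address"

// renderArgs models the described behaviour: the default address is emitted
// only when extraArgs does not already set the flag, so the flag never
// appears twice. Hypothetical helper; the real logic is a Helm template.
func renderArgs(extraArgs map[string]string, defaultAddr string) []string {
	var args []string
	if _, overridden := extraArgs[flagName]; !overridden {
		args = append(args, fmt.Sprintf("-%s=%s", flagName, defaultAddr))
	}
	// Sort keys so the rendered output is deterministic.
	keys := make([]string, 0, len(extraArgs))
	for k := range extraArgs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		args = append(args, fmt.Sprintf("-%s=%s", k, extraArgs[k]))
	}
	return args
}

func main() {
	// Hypothetical addresses for illustration.
	def := "dns:///demo-headless.ns.svc.cluster.local:9095"
	fmt.Println(renderArgs(nil, def))
	fmt.Println(renderArgs(map[string]string{flagName: "dns:///custom:9095"}, def))
}
```

In both cases the flag is rendered exactly once: the default when nothing is overridden, the user's value otherwise.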

…d.address default

In v2 mode the chart rendered the query-backend client address as `dns:///_grpc._tcp.<headless>:9095`. That address is handed straight to grpc-go's stock `dns` resolver, which does an A/AAAA lookup on the literal host. It only performs SRV lookups for grpclb (`_grpclb._tcp.<host>`), and `EnableSRVLookups` is false by default.

The host `_grpc._tcp.<headless>` has an SRV record but no A/AAAA record, so the resolver yields zero endpoints. Combined with the client's `waitForReady: true` service config, every read RPC (SelectMergeStacktraces, SelectSeries) parks until the 30s call timeout and returns HTTP 499, while the query-backend logs nothing.

Fixes grafana#5229

Signed-off-by: Edvin Norling <edvin.norling@kognic.com>

cla-assistant · 2026-06-03T11:04:19Z

All committers have signed the CLA.

nissessenap requested a review from a team as a code owner June 3, 2026 11:03

Merge branch 'main' into fix_5229

37e619b

nissessenap wants to merge 2 commits into grafana:main from nissessenap:fix_5229
