client: filter mDNS query responses by queried service name#150

Open
vleonbonnet wants to merge 1 commit into hashicorp:main from vleonbonnet:fix-mdns-response-filtering

vleonbonnet commented Mar 24, 2026

Summary

The query() method in client.go processes all mDNS response records without validating whether they belong to the service being queried. On networks with many mDNS-enabled devices (smart speakers, streaming devices, etc.), unrelated multicast responses arrive during the query window and are blindly added to the results.

This causes callers like Consul's retry_join (via go-discover) to receive IP addresses of random network devices instead of actual service members, breaking cluster formation.

This resolves the TODO(reddaly) comment at the top of the response processing loop.

Changes

Introduce a validNames map that gates processing of all record types:

  • PTR records: only accepted if rr.Hdr.Name matches the queried serviceAddr (case-insensitive). The instance name (rr.Ptr) is added to validNames.
  • SRV records: only processed if rr.Hdr.Name is in validNames. The SRV target hostname is also added to validNames so subsequent A/AAAA records for it are accepted.
  • TXT, A, AAAA records: only processed if rr.Hdr.Name is in validNames.
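The gating rules above can be sketched in isolation. This is a minimal illustration and not the code from the PR: the `record` struct and the `filterRecords` and `key` helpers are hypothetical stand-ins for the real DNS record types handled in client.go, and the `.local.` hostnames are made up. Like the rules above, the sketch assumes that a PTR record is seen before the SRV record it points to, and the SRV record before the A/AAAA records for its target.

```go
package main

import (
	"fmt"
	"strings"
)

// record is a hypothetical, simplified stand-in for a DNS resource record.
type record struct {
	Type   string // "PTR", "SRV", "TXT", "A", or "AAAA"
	Name   string // owner name (rr.Hdr.Name in the description above)
	Target string // PTR instance name or SRV target host; empty otherwise
}

// key normalizes a DNS name so lookups are case-insensitive.
func key(name string) string { return strings.ToLower(name) }

// filterRecords keeps only records that trace back to a PTR record
// whose owner name matches serviceAddr.
func filterRecords(serviceAddr string, records []record) []record {
	validNames := make(map[string]bool)
	var accepted []record
	for _, rr := range records {
		switch rr.Type {
		case "PTR":
			// Accept only PTR records for the queried service.
			if key(rr.Name) != key(serviceAddr) {
				continue
			}
			validNames[key(rr.Target)] = true // instance name
		case "SRV":
			if !validNames[key(rr.Name)] {
				continue
			}
			validNames[key(rr.Target)] = true // target host, for later A/AAAA
		default: // TXT, A, AAAA
			if !validNames[key(rr.Name)] {
				continue
			}
		}
		accepted = append(accepted, rr)
	}
	return accepted
}

func main() {
	records := []record{
		{Type: "PTR", Name: "_myservice._tcp.local.", Target: "mynode-1._myservice._tcp.local."},
		{Type: "SRV", Name: "mynode-1._myservice._tcp.local.", Target: "mynode-1.local."},
		{Type: "A", Name: "mynode-1.local."},
		{Type: "PTR", Name: "_spotify-connect._tcp.local.", Target: "device-A._spotify-connect._tcp.local."},
		{Type: "SRV", Name: "device-A._spotify-connect._tcp.local.", Target: "device-A.local."},
		{Type: "A", Name: "device-A.local."},
	}
	for _, rr := range filterRecords("_myservice._tcp.local.", records) {
		fmt.Println(rr.Type, rr.Name)
	}
}
```

Because the map is only populated from records that were themselves accepted, an unrelated SRV or A record can never add its own name to `validNames`.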

Testing

Tested on a home network with ~30 mDNS-enabled devices:

Before (unpatched):

Querying mDNS for _myservice._tcp (10s timeout)...
FOUND: mynode-1._myservice._tcp.local. 10.0.1.10:8301 ([])
FOUND: mynode-2._myservice._tcp.local. 10.0.2.10:8301 ([])
FOUND: device-A._spotify-connect._tcp.local. 10.0.1.50:35552 ([CPath=/zc/0 VERSION=1.0 Stack=SP])
FOUND: device-B._sleep-proxy._udp.local. 10.0.1.51:53482 ([])
FOUND: device-C._spotify-connect._tcp.local. 10.0.2.50:55853 ([CPath=/zc/0 VERSION=1.0 Stack=SP])
FOUND: device-D._googlecast._tcp.local. 10.0.2.51:1400 ([])

After (patched):

Querying mDNS for _myservice._tcp (10s timeout)...
FOUND: mynode-1._myservice._tcp.local. 10.0.1.10:8301 ([])
FOUND: mynode-2._myservice._tcp.local. 10.0.2.10:8301 ([])
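The difference between the two runs can also be stated as a predicate on the instance names in the output. The sketch below is a caller-side check only, not part of the patch: `belongsToService` is a hypothetical helper, and it assumes instance names have the form `<instance>.<service>.<domain>.` as in the logs above.

```go
package main

import (
	"fmt"
	"strings"
)

// belongsToService reports whether instanceName (for example
// "mynode-1._myservice._tcp.local.") is an instance of service
// ("_myservice._tcp") in domain ("local"). The comparison is
// case-insensitive and tolerates a missing trailing dot.
func belongsToService(instanceName, service, domain string) bool {
	suffix := "." + strings.ToLower(strings.Trim(service, ".")) +
		"." + strings.ToLower(strings.Trim(domain, ".")) + "."
	name := strings.ToLower(instanceName)
	if !strings.HasSuffix(name, ".") {
		name += "."
	}
	// Require a non-empty instance label in front of the service suffix.
	return len(name) > len(suffix) && strings.HasSuffix(name, suffix)
}

func main() {
	// Instance names taken from the unpatched run above.
	names := []string{
		"mynode-1._myservice._tcp.local.",
		"mynode-2._myservice._tcp.local.",
		"device-A._spotify-connect._tcp.local.",
		"device-B._sleep-proxy._udp.local.",
		"device-C._spotify-connect._tcp.local.",
		"device-D._googlecast._tcp.local.",
	}
	for _, n := range names {
		if belongsToService(n, "_myservice._tcp", "local") {
			fmt.Println("FOUND:", n)
		}
	}
}
```

Applied to the six names from the unpatched run, the predicate keeps only the two `_myservice._tcp` instances that the patched run reports.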

Fixes #96

The query() method processes all mDNS response records without checking
whether they belong to the service being queried. On networks with many
mDNS-enabled devices (Chromecast, Sonos, Spotify Connect, etc.),
unrelated responses pollute results, causing callers like Consul's
retry_join to attempt connections to random devices.

Fix: introduce a validNames map that gates processing of SRV, TXT, A,
and AAAA records on whether the name was first seen in a PTR record
matching the queried serviceAddr. This resolves the TODO(reddaly) at the
top of the response processing loop.

Fixes hashicorp#96
@vleonbonnet vleonbonnet requested review from a team as code owners March 24, 2026 12:41
@hashicorp-cla-app

CLA assistant check

Thank you for your submission! We require that all contributors sign our Contributor License Agreement ("CLA") before we can accept the contribution. Read and sign the agreement

Learn more about why HashiCorp requires a CLA and what the CLA includes

Have you signed the CLA already but the status is still pending? Recheck it.

Development

Successfully merging this pull request may close these issues.

Service based query fails over and returns random mDNS responses
