Fix DNS failure on network change #9

Merged
razvandimescu merged 4 commits from fix/upstream-redetect into main 2026-03-22 17:23:36 +08:00
razvandimescu commented 2026-03-22 15:32:25 +08:00 (Migrated from github.com)

Summary

  • Network change watcher: background task (every 30s) detects LAN IP and upstream DNS changes
  • Dynamic LAN IP: multicast announcements use the current IP, not the startup IP
  • Upstream re-detection: re-runs discover_system_dns() and swaps upstream atomically when auto-detected
  • DHCP DNS fallback: when scutil --dns only shows 127.0.0.1 (numa install active), reads DHCP-provided DNS from ipconfig getpacket en0/en1 before falling back to Quad9
  • LAN peer flush: clears stale peers from old network on any network change
  • Mutex-wrapped upstream + LAN IP: safe concurrent updates on the hot path
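The mutex-wrapped swap described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `NetworkState`, `SharedNetwork`, `current`, and `update` are hypothetical names rather than Numa's actual types, and the real code may use a different lock or an async runtime.

```rust
use std::net::{IpAddr, SocketAddr};
use std::sync::{Arc, Mutex};

/// Hypothetical snapshot of the values the watcher re-detects.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NetworkState {
    pub upstream: SocketAddr,
    pub lan_ip: IpAddr,
}

/// Hypothetical handle shared by the query path and the watcher task.
#[derive(Clone)]
pub struct SharedNetwork {
    inner: Arc<Mutex<NetworkState>>,
}

impl SharedNetwork {
    pub fn new(initial: NetworkState) -> Self {
        Self { inner: Arc::new(Mutex::new(initial)) }
    }

    /// Hot path: copy the current state out; the lock is held only for the copy.
    pub fn current(&self) -> NetworkState {
        *self.inner.lock().unwrap()
    }

    /// Watcher path: store the freshly detected state.
    /// Returns true when something changed, so the caller can react
    /// (for example by flushing LAN peers).
    pub fn update(&self, detected: NetworkState) -> bool {
        let mut guard = self.inner.lock().unwrap();
        if *guard == detected {
            return false;
        }
        *guard = detected;
        true
    }
}
```

A watcher loop would call `update` with the re-detected values on each 30s tick and treat a `true` result as a network change; readers never observe a half-updated pair because both fields are replaced under the same lock.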

Bug

Numa detected the upstream DNS server and LAN IP once at startup. Switching Wi-Fi networks caused:

  1. All DNS queries to fail (old upstream unreachable)
  2. Multicast announcements to carry the old IP
  3. Stale LAN peers to linger for 90s
  4. The Quad9 fallback to fail as well on networks that block external DNS (8.8.8.8, 9.9.9.9)
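Item 3 suggests peers previously aged out only through a 90s timeout. A minimal sketch of a peer table that supports both timed expiry and an explicit flush follows; `PeerTable` and its methods are hypothetical names, and the 90s TTL is taken from the list above rather than from the code.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Assumed peer lifetime, matching the 90s mentioned in the bug description.
const PEER_TTL: Duration = Duration::from_secs(90);

/// Hypothetical table of LAN peers keyed by address.
#[derive(Default)]
pub struct PeerTable {
    last_seen: HashMap<IpAddr, Instant>,
}

impl PeerTable {
    /// Record that a peer announced itself at `now`.
    pub fn observe(&mut self, peer: IpAddr, now: Instant) {
        self.last_seen.insert(peer, now);
    }

    /// Timed expiry: drop peers not seen within PEER_TTL.
    pub fn expire(&mut self, now: Instant) {
        self.last_seen
            .retain(|_, seen| now.saturating_duration_since(*seen) < PEER_TTL);
    }

    /// Network change: drop every peer immediately instead of waiting for the TTL.
    pub fn flush(&mut self) {
        self.last_seen.clear();
    }

    pub fn len(&self) -> usize {
        self.last_seen.len()
    }

    pub fn is_empty(&self) -> bool {
        self.last_seen.is_empty()
    }
}
```

With only `expire`, a peer from the old network stays visible for up to 90s after the switch; calling `flush` on a detected change removes it at once.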

Fixed by: network watcher + DHCP DNS detection fallback chain:
scutil --dns → ipconfig getpacket (DHCP DNS) → 9.9.9.9 (Quad9)
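The selection logic of this chain can be sketched as a pure function. This is an assumption-laden illustration: `choose_upstream` and `first_usable` are hypothetical helpers, and the two slices stand in for addresses already parsed from `scutil --dns` and `ipconfig getpacket` output (the parsing itself is not shown).

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Quad9, the last resort in the chain.
pub const QUAD9: IpAddr = IpAddr::V4(Ipv4Addr::new(9, 9, 9, 9));

/// First address that is neither loopback nor unspecified, if any.
fn first_usable(candidates: &[IpAddr]) -> Option<IpAddr> {
    candidates
        .iter()
        .copied()
        .find(|ip| !ip.is_loopback() && !ip.is_unspecified())
}

/// Hypothetical chain: system resolvers, then DHCP-provided DNS, then Quad9.
/// `system` models addresses reported by `scutil --dns`;
/// `dhcp` models addresses read from `ipconfig getpacket`.
pub fn choose_upstream(system: &[IpAddr], dhcp: &[IpAddr]) -> IpAddr {
    first_usable(system)
        .or_else(|| first_usable(dhcp))
        .unwrap_or(QUAD9)
}
```

Skipping loopback entries is what breaks the circular reference: once Numa is installed as the system resolver, the system list contains only `127.0.0.1`, so the chain moves on to the DHCP-provided server.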

Tested live on a network that blocks all public DNS but allows ISP DNS.

Test plan

  • [x] make all passes (fmt, clippy, audit, build)
  • [x] Live test: deployed via make deploy, DNS resolves on restrictive network via ISP DNS
  • [x] Circular reference fixed: numa install + restart → detects DHCP DNS instead of loopback
  • [ ] Full test plan: docs/testing/network-change-tests.md
  • [ ] Windows CI passes

🤖 Generated with Claude Code
