feat: TCP fallback, query minimization, UDP auto-disable

Transport resilience for restrictive networks (ISPs blocking UDP:53):
- DNS-over-TCP fallback: UDP fail/truncation → automatic TCP retry
- UDP auto-disable: after 3 consecutive failures, switch to TCP-first
- IPv6 → TCP directly (UDP socket binds 0.0.0.0, can't reach IPv6)
- Network change resets UDP detection for re-probing
- Root hint rotation in TLD priming

Privacy:
- RFC 7816 query minimization: root servers see TLD only, not full name

Code quality:
- Merged find_starting_ns + find_starting_zone → find_closest_ns
- Extracted resolve_ns_addrs_from_glue shared helper
- Removed overall timeout wrapper (per-hop timeouts sufficient)
- forward_tcp for DNS-over-TCP (RFC 1035 §4.2.2)

Testing:
- Mock TCP-only DNS server for fallback tests (no network needed)
- tcp_fallback_resolves_when_udp_blocked
- tcp_only_iterative_resolution
- tcp_fallback_handles_nxdomain
- udp_auto_disable_resets
- Integration test suite (4 suites, 51 tests)
- Network probe script (tests/network-probe.sh)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Razvan Dimescu
2026-03-27 19:50:19 +02:00
parent 637b374d8b
commit 5b2cc874a1
11 changed files with 1372 additions and 103 deletions

View File

@@ -1300,7 +1300,9 @@ footer .closing {
<span class="pipeline-arrow">&rarr;</span>
<div class="pipeline-node"><div class="pipeline-box">Cache</div></div>
<span class="pipeline-arrow">&rarr;</span>
<div class="pipeline-node"><div class="pipeline-box hl-violet">DoH Upstream</div></div>
<div class="pipeline-node"><div class="pipeline-box hl-violet">Recursive / Forward (DoH)</div></div>
<span class="pipeline-arrow">&rarr;</span>
<div class="pipeline-node"><div class="pipeline-box highlight">DNSSEC Validate</div></div>
<span class="pipeline-arrow">&rarr;</span>
<div class="pipeline-node"><div class="pipeline-box hl-emerald">Respond</div></div>
</div>
@@ -1525,6 +1527,14 @@ footer .closing {
<div class="perf-stat-value amber">0 allocations</div>
<div class="perf-stat-label">Heap allocations in the I/O path &mdash; 4KB stack buffers, inline serialization</div>
</div>
<div class="perf-stat">
<div class="perf-stat-value teal">174 ns</div>
<div class="perf-stat-label">ECDSA P-256 signature verification (DNSSEC). RSA/SHA-256: 10.9&micro;s. DS digest: 257ns.</div>
</div>
<div class="perf-stat">
<div class="perf-stat-value emerald">~90 ms</div>
<div class="perf-stat-label">Cold-cache DNSSEC validation &mdash; only 1 network fetch needed (TLD chain pre-warmed on startup)</div>
</div>
<p class="perf-note">
Cold queries match system resolver speed &mdash; the bottleneck is upstream RTT, not Numa. We don't claim to be faster when the network is the limit.
@@ -1554,17 +1564,20 @@ footer .closing {
<dt>DNS Libraries</dt>
<dd>Zero &mdash; wire protocol parsed from scratch</dd>
<dt>Resolution Modes</dt>
<dd>Recursive (iterative from root hints, CNAME chasing, glue extraction) or Forward (DoH / plain UDP)</dd>
<dt>DNSSEC</dt>
<dd>Chain-of-trust via ring &mdash; RSA/SHA-256, ECDSA P-256, Ed25519. NSEC/NSEC3 denial proofs. EDNS0 DO bit, 1232-byte payload (DNS Flag Day 2020).</dd>
<dt>Dependencies</dt>
<dd>18 runtime crates &mdash; tokio, axum, hyper, reqwest (DoH), rcgen + rustls (TLS), socket2 (multicast), serde, and more</dd>
<dd>19 runtime crates &mdash; tokio, axum, hyper, ring (DNSSEC), reqwest (DoH), rcgen + rustls (TLS), socket2 (multicast), serde, and more</dd>
<dt>Packet Format</dt>
<dd>RFC 1035 compliant, 4096-byte UDP (EDNS)</dd>
<dd>RFC 1035 compliant. EDNS0 OPT pseudo-record. Parses A, AAAA, NS, CNAME, MX, SOA, SRV, HTTPS, DNSKEY, DS, RRSIG, NSEC, NSEC3.</dd>
<dt>Concurrency</dt>
<dd>Arc&lt;ServerCtx&gt; + RwLock for reads, Mutex for writes (never across .await)</dd>
<dt>Upstream</dt>
<dd>DNS-over-HTTPS (DoH) via reqwest + http2 + rustls</dd>
</dl>
<div class="code-block reveal reveal-delay-2">
<span class="comment"># Install (pick one)</span>