feat: wire-level forwarding, cache, and request hedging #85

Merged
razvandimescu merged 21 commits from feat/wire-forwarding-hedging into main 2026-04-13 03:02:45 +08:00
razvandimescu commented 2026-04-12 09:20:41 +08:00 (Migrated from github.com)

Summary

Wire-level forwarding & cache

  • Wire-level forwarding: forward_query_raw / forward_with_failover_raw skip DnsPacket parse/serialize, passing raw bytes through to upstream
  • Wire-level cache: stores raw DNS wire bytes with pre-scanned TTL offsets — patches ID + TTLs in-place on lookup (2-4x less memory per entry)
  • Cache raised to 100K entries (from 10K) with stalest-entry eviction when full

Request hedging

  • Universal hedging: fires a parallel request after configurable delay (default 10ms) when primary stalls — applies to all protocols (UDP, DoH, DoT)
    • UDP: second independent packet rescues packet loss (1-3% on WiFi)
    • DoH: second h2 stream on same connection rescues dispatch channel spikes
    • DoT: second TLS connection rescues handshake stalls
  • Recursive hedging: iterative resolver fires to 2 NS servers simultaneously on cold queries; uses SRTT × 3 delay hedge on warm queries
  • Configurable via hedge_ms in [upstream]. Set to 0 to disable. Inspired by The Tail at Scale (Dean & Barroso, 2013)

Serve-stale & prefetch (RFC 8767)

  • Serve-stale: expired cache entries within 1-hour window served with TTL=1 for instant response
  • Background refresh: stale serves spawn a fire-and-forget re-resolve — next query gets fresh data
  • Prefetch at <10% TTL: entries nearing expiry trigger background refresh before they expire
  • Freshness enum: Fresh / NearExpiry / Stale replaces bare bool at all call sites
  • Deduplication: refreshing guard prevents multiple concurrent refreshes for the same domain

Other

  • DoH keepalive: periodic root NS probe prevents idle HTTP/2 + TLS teardown
  • NS delegation caching: referral NS records cached to skip TLD re-queries on subsequent lookups
  • refresh_entry unification: single function for cache warming and stale refresh (was duplicated)
  • Zero-alloc cache hit path: eliminated unconditional .to_vec() of raw wire on every query
  • Integration test fix: blocklist test polls /blocking/stats instead of fixed sleep 4
  • Test race fix: TCP tests serialized via Mutex to prevent UDP_DISABLED global state interference

Benchmark suite

  • --vs-unbound: cached query comparison (both at 0.1ms median)
  • --vs-unbound-cold: genuine cold recursive (1 round, PID-prefixed unique subdomains)
  • --vs-nextdns: local cache vs cloud resolver (47x faster median)
  • Mode validation: checks Numa is in the correct mode before benchmarking

Performance

Metric Numa Unbound NextDNS
Cached median 0.1 ms 0.1 ms 37.4 ms
Cold recursive p99 538 ms 748 ms
Cold recursive σ 114 ms 457 ms

Test plan

  • 271 unit tests pass (cargo test)
  • Clippy clean, rustfmt clean
  • Stale behavior tests: lookup_wire_signals_stale_when_expired, lookup_wire_signals_prefetch_near_expiry
  • Cache eviction test: cache_max_entries_evicts_stalest
  • TCP test race eliminated via UDP_STATE_LOCK
  • Benchmark suite validated against Unbound and NextDNS
## Summary ### Wire-level forwarding & cache - **Wire-level forwarding**: `forward_query_raw` / `forward_with_failover_raw` skip DnsPacket parse/serialize, passing raw bytes through to upstream - **Wire-level cache**: stores raw DNS wire bytes with pre-scanned TTL offsets — patches ID + TTLs in-place on lookup (2-4x less memory per entry) - **Cache raised to 100K entries** (from 10K) with stalest-entry eviction when full ### Request hedging - **Universal hedging**: fires a parallel request after configurable delay (default 10ms) when primary stalls — applies to **all protocols** (UDP, DoH, DoT) - UDP: second independent packet rescues packet loss (1-3% on WiFi) - DoH: second h2 stream on same connection rescues dispatch channel spikes - DoT: second TLS connection rescues handshake stalls - **Recursive hedging**: iterative resolver fires to 2 NS servers simultaneously on cold queries; uses SRTT × 3 delay hedge on warm queries - Configurable via `hedge_ms` in `[upstream]`. Set to `0` to disable. Inspired by [The Tail at Scale](https://research.google/pubs/pub40801/) (Dean & Barroso, 2013) ### Serve-stale & prefetch (RFC 8767) - **Serve-stale**: expired cache entries within 1-hour window served with TTL=1 for instant response - **Background refresh**: stale serves spawn a fire-and-forget re-resolve — next query gets fresh data - **Prefetch at <10% TTL**: entries nearing expiry trigger background refresh before they expire - **Freshness enum**: `Fresh` / `NearExpiry` / `Stale` replaces bare bool at all call sites - **Deduplication**: `refreshing` guard prevents multiple concurrent refreshes for the same domain ### Other - **DoH keepalive**: periodic root NS probe prevents idle HTTP/2 + TLS teardown - **NS delegation caching**: referral NS records cached to skip TLD re-queries on subsequent lookups - **`refresh_entry` unification**: single function for cache warming and stale refresh (was duplicated) - **Zero-alloc cache hit path**: eliminated unconditional `.to_vec()` of raw wire on every query - **Integration test fix**: blocklist test polls `/blocking/stats` instead of fixed `sleep 4` - **Test race fix**: TCP tests serialized via Mutex to prevent `UDP_DISABLED` global state interference ### Benchmark suite - `--vs-unbound`: cached query comparison (both at 0.1ms median) - `--vs-unbound-cold`: genuine cold recursive (1 round, PID-prefixed unique subdomains) - `--vs-nextdns`: local cache vs cloud resolver (47x faster median) - Mode validation: checks Numa is in the correct mode before benchmarking ## Performance | Metric | Numa | Unbound | NextDNS | |--------|------|---------|---------| | Cached median | 0.1 ms | 0.1 ms | 37.4 ms | | Cold recursive p99 | 538 ms | 748 ms | — | | Cold recursive σ | 114 ms | 457 ms | — | ## Test plan - [x] 271 unit tests pass (`cargo test`) - [x] Clippy clean, rustfmt clean - [x] Stale behavior tests: `lookup_wire_signals_stale_when_expired`, `lookup_wire_signals_prefetch_near_expiry` - [x] Cache eviction test: `cache_max_entries_evicts_stalest` - [x] TCP test race eliminated via `UDP_STATE_LOCK` - [x] Benchmark suite validated against Unbound and NextDNS
Sign in to join this conversation.