82 Commits

Author SHA1 Message Date
Razvan Dimescu
25ebdb311f refactor(bootstrap): BTreeMap for overrides + simplify review
- Switch overrides from HashMap to BTreeMap — deterministic iteration by
  type, drops the manual sort when logging.
- Rename the flat_map closure's inner `ips` to `addrs` to stop shadowing
  the outer Vec<String>.
- Trim the Suite 8 TEST-NET-1 comment to keep the "why" and drop
  mechanism narration.
- Drop a redundant sleep 1 after wait — wait already blocks on exit.
2026-04-21 18:06:22 +03:00
Razvan Dimescu
51cce0347b test(odoh): integration-verify relay_ip/target_ip override wiring
Suite 8 now ends with a config using RFC 5737 TEST-NET-1 IPs as
relay_ip/target_ip, started briefly so the bootstrap resolver logs its
override map. Asserts both host=IP pairs land in that map — closing the
gap flagged on PR #126 (zero-plain-DNS-leak for ODoH endpoints was only
unit-tested).

Also: NumaResolver::new now logs the override map at INFO when non-empty,
so operators can verify their ODoH bootstrap without needing DEBUG level.
2026-04-21 17:43:02 +03:00
Razvan Dimescu
459395203d style: cargo fmt 2026-04-21 16:30:26 +03:00
Razvan Dimescu
10469e96bd fix(bootstrap): route numa HTTPS via IP-literal bootstrap resolver (#122)
When numa is its own system DNS resolver (HAOS add-on, Pi-hole-style
container, /etc/resolv.conf → 127.0.0.1), every numa-originated HTTPS
connection — DoH upstream, ODoH relay/target, blocklist CDN — routed
its hostname through getaddrinfo() back to numa itself. Cold boot
deadlocked; steady state taxed every new TCP connection. 0.14.1's
retry-with-backoff masked the startup race but not the underlying
self-loop.

NumaResolver implements reqwest::dns::Resolve with two lanes:
- Per-host overrides (ODoH relay_ip/target_ip) short-circuit DNS
  entirely, preserving ODoH's zero-plain-DNS-leak property.
- Otherwise: A+AAAA in parallel via UDP to IP-literal bootstrap
  servers, with TCP fallback for UDP-hostile networks.

Bootstrap IPs come from upstream.fallback (IP-literal filtered,
hostnames skipped with a warning). Empty fallback yields the
hardcoded default [9.9.9.9, 1.1.1.1]; the chosen source is logged
at startup so the silent default is visible.

doh_keepalive_loop now fires its first tick immediately, and
keepalive_doh logs failures at WARN — bootstrap issues surface
within ~100ms of boot instead of on the first client query.

Distinct from UpstreamPool.fallback (client-query failover) which
stays untouched: client queries with no configured fallback still
SERVFAIL on primary failure rather than silently shadow-routing.

Reproducer: tests/docker/self-resolver-loop.sh. Before: 0 blocklist
domains, 3072ms SERVFAIL. After: 397k domains, 118ms NOERROR.
2026-04-21 16:19:14 +03:00
Razvan Dimescu
31adc31c9b refactor(ctx): coalesce forward-path upstream queries
resolve_coalesced now takes leader_path: QueryPath and applies to all
three upstream branches (Forwarded-rule, Recursive, Upstream), not just
Recursive. Fixes thundering-herd at boot when N concurrent HTTPS setups
each trigger independent forward queries for the same upstream hostname.
2026-04-21 16:18:52 +03:00
Razvan Dimescu
60600b045f chore: bump version to 0.14.1 2026-04-20 19:27:06 +03:00
Razvan Dimescu
3e6bf3feb0 Merge pull request #125 from razvandimescu/worktree-fix-blocklist-bootstrap
fix(blocklist): retry on transient download failures (#122)
2026-04-20 19:22:04 +03:00
Razvan Dimescu
8bed7c4649 test(blocklist): decouple retry tests from RETRY_DELAYS_SECS length
Derive both the flaky-server drop count and the zero-delay schedule
from RETRY_DELAYS_SECS.len() so the tests keep exercising their
intended invariants — "succeeds on final attempt" and "gives up after
all attempts fail" — if the production retry schedule ever changes.

Also: rename fail_first → drop_first_n to match drop(sock); swap the
giveup test's empty body for an "unreachable" sentinel so a regression
that accidentally served couldn't silently match Some("").
2026-04-20 19:19:43 +03:00
Razvan Dimescu
5b1642c6dc fix(blocklist): retry on transient download failures (#122)
On cold start, reqwest's getaddrinfo can race numa's own first-query
cold-path latency — resolver timeout fires before numa warms its
upstream DoH connection. Wrap each blocklist fetch in 3 retries with
2s/10s/30s backoff; by the second attempt, the upstream is warm and
subsequent getaddrinfos succeed in <100ms.

Also: parallelize fetches across lists via join_all (different hosts,
no warming dependency), walk the full error source chain so reqwest
failures surface the underlying cause, and parameterize retry delays
for unit-test speed.
2026-04-20 19:19:43 +03:00
Razvan Dimescu
01fda7891e Merge pull request #123 from razvandimescu/feat/odoh-etld1-check
feat(odoh): reject relay+target sharing an eTLD+1
2026-04-20 19:06:12 +03:00
Razvan Dimescu
5e84adbd94 Merge pull request #124 from razvandimescu/fix/dashboard-encryption-pct-args
fix(dashboard): pass missing args to encryptionPct in refresh()
2026-04-20 19:05:50 +03:00
Razvan Dimescu
15978a7859 fix(dashboard): pass missing args to encryptionPct in refresh()
Commit eb5ea3b generalised encryptionPct from (transport) to
(data, encryptedKeys, allKeys) and updated renderTransport and
renderUpstreamWire, but missed the call inside render() that computes
the inline `~N/s · M% enc` QPS tag. With undefined allKeys, the
first .reduce() threw TypeError and the render try/catch silently
downgraded the whole dashboard to "disconnected" — every panel left
empty even though /stats was returning real data.

Fix the call site to match the other two (inbound-wire keys) and have
the catch log to console so the next silent-failure regression shows
up in DevTools within seconds instead of a source dive.
2026-04-20 19:04:15 +03:00
Razvan Dimescu
193b38b85f feat(odoh): reject relay+target sharing an eTLD+1
Plain host-string equality caught the copy-paste-same-URL footgun but
let `r.cloudflare.com` + `odoh.cloudflare.com` through — two subdomains
of the same operator collapse ODoH to ordinary DoH. Add a second layer:
compare registrable domains via the PSL (`psl` crate) after the exact-
host check. Fails open on IP literals and unparseable hosts; the exact-
host check still runs in those cases.
2026-04-20 18:46:54 +03:00
Razvan Dimescu
4c685d1602 docs(readme): pamper readme still 2026-04-20 17:19:16 +03:00
Razvan Dimescu
cd6e686a1a docs(readme): surface ODoH in the intro paragraph
Adds the v0.14.0 capability where it's most differentiating: the first
paragraph (sealed-query framing alongside the existing ad-blocking and
.numa-domain pitches) and the second paragraph (numa relay as a public
ODoH endpoint, with the DNSCrypt-list supply-doubling angle as fact).

No reposition: tagline and structure unchanged. ODoH joins the
existing capability set rather than displacing it. Hero GIF stays;
will be re-recorded once the dashboard's Outbound Wire panel is worth
showing in motion.
2026-04-20 17:14:21 +03:00
Razvan Dimescu
07c321f749 chore(release): bump to v0.14.0
Headline: ODoH (RFC 9230) — client + self-hosted relay. Set
mode = "odoh" in [upstream] to seal queries before they leave the
machine; run `numa relay` to add to the public ODoH ecosystem.
2026-04-20 17:07:31 +03:00
Razvan Dimescu
12a06a1410 Merge pull request #121 from razvandimescu/feat/odoh
feat(odoh): ship ODoH client + self-hosted relay (RFC 9230)
2026-04-20 16:26:54 +03:00
Razvan Dimescu
eb5ea3b645 refactor(odoh): deduplicate post-audit findings
- Hoist ODOH_CONTENT_TYPE to a single pub(crate) constant in odoh.rs;
  relay.rs imports it instead of declaring its own.
- Generalize dashboard encryptionPct(data, encryptedKeys, allKeys)
  so both Inbound Wire and Outbound Wire panels share the same math
  instead of drifting independently.
- Extract RelayState::new() and build_app() helpers in relay.rs so
  the test spawn_relay() and production run() wire the same router
  + body-limit layer. Prevents future middleware from landing in one
  path but not the other.

All 344 lib tests pass; no behavior change.
2026-04-20 16:03:34 +03:00
Razvan Dimescu
be60f6ccbc chore(packaging): docker-compose + Caddyfile for ODoH relay deploy
Two-container deploy: Caddy terminates TLS (auto-provisions Let's
Encrypt via ACME) and reverse-proxies to a Numa relay on an internal
Docker network. The relay never reads sealed payloads; Caddy's
access log is discarded so per-request observability doesn't defeat
the oblivious property.

Validated against Hetzner CX22 + DNS at odoh-relay.numa.rs:
- TLS-ALPN-01 challenge succeeded on first attempt
- /health returned the relay's counter block
- End-to-end ODoH client → relay → Cloudflare works

Operators only need to: set a DNS A record, edit Caddyfile's hostname,
docker compose up -d. README walks through the steps and the DNSCrypt
v3/odoh-relays.md submission to claim a public listing.
2026-04-20 15:44:29 +03:00
Razvan Dimescu
a3cc64c94f feat(odoh): relay bind-address CLI arg + dashboard Outbound Wire panel
- `numa relay [PORT] [BIND]` accepts an optional bind address (defaults
  to 127.0.0.1, matching the Caddy reverse-proxy deployment shape).
  Required for Docker, where the relay needs 0.0.0.0 inside the
  container so Caddy can reach it across the bridge network.

- Dashboard now surfaces the upstream_transport dimension as an
  "Outbound Wire" panel alongside the existing "Inbound Wire" (renamed
  from "Transport" for directional clarity). Sub-headers — "apps → numa"
  / "numa → internet" — make the threat-model split obvious without
  jargon. Bars: UDP/DoH/DoT/ODoH, headline "X% encrypted outbound".
  The PR description's promise that "the dashboard answers how much of
  my DNS traffic left in cleartext honestly" is now true.
2026-04-20 15:44:20 +03:00
Razvan Dimescu
cf128c19af feat(odoh): bootstrap-IP overrides + zero hedge for ODoH (post-deploy fixes)
Two issues surfaced from running mode = "odoh" against the live Hetzner
relay as system DNS:

1. **Bootstrap deadlock.** The reqwest HTTPS client resolves the relay
   and target hostnames via system DNS. When numa is itself the system
   resolver, the ODoH client loops trying to resolve through itself.
   Adds optional `relay_ip` and `target_ip` to `[upstream]`, plumbed
   into reqwest's `resolve()` so the HTTPS client bypasses system DNS
   for those two hostnames. TLS still validates against the URL
   hostname, so a stale IP fails loudly rather than silently MITM'ing.

2. **2x relay load.** Default `hedge_ms = 10` triggers a duplicate
   in-flight query for every request. Useful for UDP/DoH/DoT (rescues
   tail latency cheaply); wasteful for ODoH (doubles HPKE seal/unseal,
   doubles sealed-byte footprint a passive observer can correlate, no
   latency win — relay hop dominates either way). Force-zero in
   oblivious mode regardless of configured hedge_ms.

Validated end-to-end against odoh-relay.numa.rs → Cloudflare:
3 digs produced 3 forwarded_ok on the relay (was 6 before the hedge
fix), upstream_transport.odoh ticks correctly.
2026-04-20 15:44:09 +03:00
Razvan Dimescu
241c40553b feat(odoh): ship ODoH client + self-hosted relay (RFC 9230)
Client (mode = "odoh"): URL-query target routing per RFC 9230 §5,
/.well-known/odohconfigs TTL cache with 60s backoff on failure, HPKE
seal/open via odoh-rs, strict-mode default that SERVFAILs on relay
failure instead of silently downgrading. Host-equality config
validation rejects same-operator relay/target pairs.

Relay (`numa relay [PORT]`): axum server with /relay + /health.
SSRF-hardened hostname validator (RFC 1035 ASCII + dot + dash),
4 KiB body cap at the axum layer, 5s full-transaction timeout, and
static 502 on target failure (reqwest internals logged, not leaked).
Aggregate counters only — no per-request logs.

Observability: new `UpstreamTransport { Udp, Doh, Dot, Odoh }`
orthogonal to `QueryPath`, so /stats can tally wire protocols
symmetrically. Recursive mode records `Some(Udp)` for honest
"bytes egressing in cleartext" accounting.

Tests: Suite 8 exercises the client end-to-end via Frank Denis's
public relay + Cloudflare target; Suite 9 exercises `numa relay`
forwarding + guards against Cloudflare as the real far end. Full
probe script at tests/probe-odoh-ecosystem.sh verifies the entire
public ODoH ecosystem (4 targets + 1 relay per DNSCrypt's curated
list — confirms deploying Numa's relay doubles global supply).
2026-04-20 12:34:04 +03:00
Razvan Dimescu
f6cfb3ce1b Merge pull request #120 from razvandimescu/feat/named-record-types
feat(question): name SVCB/LOC/NAPTR record types in logs
2026-04-19 08:08:54 +03:00
Razvan Dimescu
5725f94ff3 refactor(question): collapse QueryType impls behind define_qtypes! macro
Adding a record type used to require 5 edits across the file (enum
variant, to_num, from_num, as_str, parse_str). The macro takes a
single (variant, num, str) tuple per type and generates the enum
plus all four methods.

UNKNOWN(u16) stays hand-coded since it carries data and can't sit
in the table.

src/question.rs: 156 lines -> 92 lines, no behavior change.
2026-04-19 08:01:18 +03:00
Razvan Dimescu
24610ae3fe feat(question): add SVCB, LOC, NAPTR variants to QueryType
Logs were printing UNKNOWN(64), UNKNOWN(29), UNKNOWN(35) for SVCB,
LOC, and NAPTR — three RR types that have been registered for years
and show up in the wild (notably SVCB via RFC 9462 DDR clients
querying _dns.resolver.arpa).

Adds the variants and replaces the SVCB_QTYPE u16 const introduced
in #119 with QueryType::SVCB.to_num(), matching the HTTPS path.

Closes #114.
2026-04-19 07:49:35 +03:00
Razvan Dimescu
6bc02982f0 Merge pull request #119 from razvandimescu/feat/filter-aaaa
feat(resolver): filter_aaaa for IPv4-only networks
2026-04-19 07:31:27 +03:00
Razvan Dimescu
f9e996ae78 fmt: drop redundant comments per house style 2026-04-19 06:53:47 +03:00
Razvan Dimescu
5e85b147b9 feat(resolver): apply ipv6hint strip to SVCB (type 64) too
HTTPS (65) and SVCB (64) share the RDATA wire format, so the existing
parser already handles both — only the call site was HTTPS-only. Widen
the qtype check and extend the existing pipeline test with a second
query for SVCB.
2026-04-19 06:52:30 +03:00
Razvan Dimescu
d6bb9a0f01 fmt: rustfmt vec literal wrapping + signature collapse 2026-04-19 06:24:54 +03:00
Razvan Dimescu
61ea2e510d refactor: dedupe HTTPS_TYPE, record-walk, and test rdata builder
- Drop `const HTTPS_TYPE: u16 = 65;` in favor of `QueryType::HTTPS.to_num()`
  at the single call site — avoids a fresh magic number alongside the
  existing enum mapping in question.rs.
- Add `DnsPacket::for_each_record_mut` so `strip_https_ipv6_hints` stops
  hand-rolling the answers/authorities/resources walk; future section
  rewrites go through the same helper.
- Promote the SVCB test-rdata builder from `svcb::tests` to module scope
  as `pub(crate) #[cfg(test)] fn build_rdata`, and reuse it in the two
  pipeline tests in ctx.rs — kills ~20 lines of byte-fiddling and keeps
  one RDATA-construction code path.
2026-04-19 05:58:47 +03:00
Razvan Dimescu
22dd3cd222 fix(resolver): skip ipv6hint strip for DO-bit clients
Modifying HTTPS rdata invalidates any accompanying RRSIG, so a DNSSEC-
validating downstream would reject the response as Bogus. Gate the
strip on !client_do, matching the existing DNSSEC-records strip.

Adds a regression test that catches the gate being removed: builds a
query with EDNS DO=1, asserts the HTTPS rdata round-trips untouched.
2026-04-19 05:52:37 +03:00
Razvan Dimescu
8014ebac9e test(integration): add Suite 7 for filter_aaaa + SUITES env filter
Suite 7 exercises the full pipeline end-to-end: A resolves, AAAA returns
NODATA, local [[zones]] AAAA bypasses the filter, and HTTPS ipv6hint is
stripped from a real cloudflare.com response. A second config run with
the flag unset guards against network-failure false-positives.

SUITES=N (comma list) runs a subset, e.g. `SUITES=7 bash tests/integration.sh`
skips suites 1-6 for fast iteration.
2026-04-19 05:52:29 +03:00
Razvan Dimescu
70400187d0 Merge pull request #118 from razvandimescu/feat/linux-drop-privileges
feat(linux): run systemd service as unprivileged numa user
2026-04-18 22:04:53 +03:00
Razvan Dimescu
fb41a6f8b5 test(linux): systemd service install verification
Three scenarios CI cannot run: every advertised port is functional (DNS
resolves, TLS chain validates against numa's CA, HTTP/API respond), CA
fingerprint survives upgrade from pre-drop layout, binary staging
fallback from a 0700 source dir. Self-bootstraps a privileged
systemd-as-PID1 container — no dependency on long-lived test containers.

MainPID user assertion retries until comm=numa to avoid a race where
systemctl reports active while MainPID still points at a transitional
process.
2026-04-18 22:00:54 +03:00
Razvan Dimescu
b02b607fb9 ci(linux): assert numa daemon does not run as root
Locks in the invariant this branch establishes: a regression that
reverts to User=root would otherwise ship green.
2026-04-18 20:07:24 +03:00
Razvan Dimescu
be98a02e49 feat(resolver): filter_aaaa for IPv4-only networks (#112)
When enabled, AAAA queries short-circuit to NODATA (NOERROR + empty
answer) so Happy Eyeballs clients don't stall waiting on a v6 address
they can't use. Also strips `ipv6hint` SvcParam from HTTPS/SVCB
answers (RFC 9460) so Chrome ≥103, Firefox, and Safari don't bypass
the AAAA filter via the HTTPS record path.

Local data is preserved: overrides, zones, the .numa proxy, and the
blocklist sinkhole keep whatever v6 addresses they configure — the
filter only kicks in on the cache/forward/recursive path. NODATA is
correct per RFC 2308 here; NXDOMAIN would incorrectly imply the name
doesn't exist for A queries either.

Off by default. Opt in via `filter_aaaa = true` under `[server]`.
2026-04-18 19:52:06 +03:00
Razvan Dimescu
763131478f fmt: rustfmt format! macro split 2026-04-18 12:15:44 +03:00
Razvan Dimescu
067195f2ab fix(linux): atomic binary copy + restart instead of start on re-install
Re-install failed with ETXTBSY (Text file busy) because std::fs::copy
can't overwrite a binary that's currently being executed by the
running service. Switch to copy-then-rename: write the new binary to
/usr/local/bin/numa.new, then rename over /usr/local/bin/numa. Rename
swaps the path while the running process keeps the old inode alive,
so DNS keeps serving from the previous binary until restart.

Bump systemctl start to restart so the new binary actually loads on
re-install (start is a no-op when the unit is already active, which
would silently leave the old binary running).

Locally verified the full CI sequence: install → curl → reinstall →
curl → uninstall → curl-fails. All three assertions pass.
2026-04-18 12:12:11 +03:00
Razvan Dimescu
e19505aa95 fix(linux): narrow replace_exe_path cfg to macos after Linux inlined the substitution
Linux install_service_linux now does the {{exe_path}} substitution
inline because it uses the (potentially copied) binary path returned
by install_service_binary_linux, not current_exe(). The shared
replace_exe_path helper is dead on Linux — clippy -D warnings caught it.

Narrow the function to macos and split the placeholder test: keep the
"both templates contain {{exe_path}}" assertion as a cross-platform test
(catches placeholder removal on either file), keep the substitution test
gated to macos where the function lives.
2026-04-18 11:57:54 +03:00
Razvan Dimescu
3970a9f45c fix(linux): copy binary to /usr/local/bin when source path isn't world-traversable
DynamicUser=yes' transient account can only traverse world-x directories.
The CI binary at /home/runner/work/numa/numa/target/release/numa fails
exec with EACCES because /home/runner is mode 0700; same applies to a
build under /home/<user>/, ~/.cargo/bin, or any private $HOME tree.

install_service_binary_linux now walks the binary's path. If every
ancestor grants world-execute (Linuxbrew /home/linuxbrew is 0755,
/usr/local/bin is fine, install.sh layout works), keep the source
path so brew/distro upgrades propagate in place. Otherwise copy to
/usr/local/bin/numa and reference that in the unit.

Locally verified both branches in an Ubuntu 24.04 systemd container:
- CI-like /home/runner (0700) → copies + service binds 5380
- Brew-like /home/linuxbrew (0755) → keeps source path + service binds 5380
2026-04-18 11:51:32 +03:00
Razvan Dimescu
7b9db9e889 fix(linux): drop ProtectHome=true — blocks exec when binary lives under /home
Integration-linux journalctl showed status=203/EXEC: systemd couldn't
exec /home/runner/work/numa/numa/target/release/numa because
ProtectHome=yes makes /home invisible to the sandboxed process. My
local Docker test passed because the binary was at /workspace, not
/home.

DynamicUser=yes already implies ProtectHome=read-only, which preserves
exec access to binaries living under /home (cargo install, source
builds, CI) while blocking writes to user $HOMEs. Keep that default
rather than over-restricting.

Follow-up worth tracking: install_service_linux could copy the binary
to /usr/local/bin/numa the way Windows does at windows_service_exe_path,
making the unit's ExecStart independent of where `numa install` was
invoked from — then we could set ProtectHome=yes again.
2026-04-18 08:54:34 +03:00
Razvan Dimescu
dfeca53e21 ci: dump journalctl + systemctl status on integration-linux failure 2026-04-18 08:48:53 +03:00
Razvan Dimescu
4f6159d961 refactor(linux): switch to DynamicUser=yes, drop install-time user creation
AUR installs never call `numa install` — PKGBUILD drops the unit straight
into /usr/lib/systemd/system and the user runs `systemctl enable numa`.
With User=numa the Rust installer's useradd code never fires there,
breaking Arch out of the box.

DynamicUser=yes sidesteps packaging entirely — systemd allocates a
transient UID per start and remaps StateDirectory ownership (including
legacy root-owned trees) automatically. Works on any modern systemd.

Drops the ensure_numa_user_linux/chown helpers plus NUMA_USER; the
unit file alone now captures the privilege-drop story.
2026-04-18 08:20:07 +03:00
Razvan Dimescu
41aea1dd12 fix(linux): drop risky sandbox directives that break Rust network daemons
Integration test failed with exit 7 on curl to /health after a successful
install — service started but never listened. The likely culprits are
MemoryDenyWriteExecute (breaks jemalloc/some crypto), SystemCallFilter
~@privileged @resources (blocks setrlimit and friends tokio may use),
and RestrictNamespaces/LockPersonality (occasional foot-guns).

Pull them and keep a conservative hardening set that's well-tested with
Rust network services: no-new-privs, protect-system/home, private tmp
and devices, protect-kernel-*, restrict-realtime/suid/address-families.
Layer the aggressive bits back in follow-up PRs once tested individually.
2026-04-18 08:10:04 +03:00
Razvan Dimescu
695a8b963c feat(linux): run systemd service as unprivileged numa user
- numa.service: User=numa + CAP_NET_BIND_SERVICE + sandboxing block
  (ProtectSystem=strict, PrivateTmp, seccomp @system-service, etc)
- install_service_linux: create numa system user + chown data_dir
  before first start so TLS-cert generation and state writes land
  on a numa-owned tree

Runtime verified root-free on Linux — network_watch_loop only reads
/etc/resolv.conf; all system-DNS mutation stays in the installer,
which continues to run as root via sudo.
2026-04-18 07:56:59 +03:00
Razvan Dimescu
34e2182ae4 Merge pull request #104 from razvandimescu/feat/forwarding-array-upstream
feat: accept array of upstreams in [[forwarding]]
2026-04-17 23:25:04 +03:00
Razvan Dimescu
5f77af55e9 fix(forward): track SRTT for DoT upstreams, not just UDP
The SRTT ordering + failure penalty path was UDP-only, so a DoT primary
in a forwarding-rule pool was never deprioritized on failure and all
DoT entries tied at INITIAL_SRTT_MS in the sort key. With [[forwarding]]
now accepting arrays of upstreams, DoT pools are a first-class case and
need the same healthiest-first behavior the default pool gets for UDP.

- Add Upstream::tracked_ip() → Some(ip) for Udp/Dot, None for Doh
  (DoH has no stable IP — reqwest pools connections by hostname).
- Rewire the three SRTT call sites in forward_with_failover_raw.
- Hoist srtt.read() out of the candidate-scoring loop — one lock per
  query instead of N (matters now that pools commonly have N>1).
- Drop unused #[derive(Debug)] on UpstreamPool and ForwardingRule.
- Regression tests: udp_failure_records_in_srtt + dot_failure_records_in_srtt.
2026-04-17 03:39:21 +03:00
Razvan Dimescu
ab6cda0c91 Merge branch 'main' into feat/forwarding-array-upstream
Resolves src/main.rs conflict: serve loop was extracted into src/serve.rs on main (PR #107). Ported the forwarding-rule log change to serve.rs — fwd.upstream is now Vec<String>, logged with join(", ").
2026-04-17 03:14:09 +03:00
Razvan Dimescu
f9ce82f4b0 Merge pull request #107 from razvandimescu/feat/windows-service
feat(windows): run as a real SCM service, not a Run-key autostart
2026-04-17 02:02:43 +03:00
Razvan Dimescu
1d9495c013 ci: bridge DNS gap with direct upstream instead of polling
systemd-resolved has a ~40s reconfiguration stall after restart
(systemd #22521) that breaks the GHA runner's persistent connection
to results-receiver.actions.githubusercontent.com. Polling for DNS
recovery isn't enough since the .NET runner agent caches DNS at the
connection-pool level. Replace the broken stub-resolv symlink with a
direct upstream so DNS works instantly.
2026-04-17 01:32:36 +03:00
Razvan Dimescu
34b75833b8 ci: poll for DNS recovery in cleanup, not test step
Move DNS recovery wait into the cleanup step (if: always) so it runs
regardless of test outcome. Use getent hosts loop instead of sleep+dig
to match what post-steps actually use for resolution.
2026-04-17 01:11:20 +03:00
Razvan Dimescu
99af97a67b ci: wait for DNS recovery after uninstall on Linux
systemd-resolved needs a moment to restore its stub listener after
the numa drop-in is removed. Without a wait, the runner can't resolve
GitHub's API to report job completion.
2026-04-16 20:20:53 +03:00
Razvan Dimescu
9e56054f37 ci: add integration tests for install/uninstall lifecycle
Release-build + install/verify/re-install/uninstall cycle on Linux and
macOS. Runs after lint/test passes (needs dependency). Cleanup step
uses if: always() to handle cancellation.
2026-04-16 19:56:44 +03:00
Razvan Dimescu
fe9f31616e test: add SCM output parsing and config path regression tests
Extract parse_sc_registered and parse_sc_state as testable pure
functions. 8 new tests covering: service registration detection,
service state parsing, and Windows config_dir == data_dir invariant.
2026-04-16 19:31:26 +03:00
Razvan Dimescu
9f08d8b489 fix(windows): stop service before port probe, wait for full exit
Stop the running service before disabling Dnscache so the port 53 probe
sees the real state (not Numa's own binding). Wait for SCM STOPPED
state before copying the binary to avoid os error 32 (file in use).
2026-04-16 19:21:56 +03:00
Razvan Dimescu
9bea038cb6 fix(windows): unify config/data dir and add service log file
config_dir() on Windows now returns data_dir() (ProgramData) so config,
services.json, and log file are in the same place for both interactive
and service contexts. Service mode writes logs to numa.log via
env_logger pipe. Dashboard shows correct log path per OS.
2026-04-16 19:12:42 +03:00
Razvan Dimescu
f0a1dd7106 fix(dashboard): hide logs path on Windows (no log sink yet) 2026-04-16 19:01:34 +03:00
Razvan Dimescu
6789c321bc fix(windows): defer DNS redirect until port 53 is free
Probe port 53 after disabling Dnscache instead of assuming reboot is
needed. Skip DNS redirect when port is blocked (service does it on
first boot). Fix readiness probe: TCP connect to API port instead of
broken UDP send_to that always succeeded.
2026-04-16 18:35:09 +03:00
Razvan Dimescu
da40a8dbfc ci: fetch full history on Windows so build.rs embeds git SHA 2026-04-16 18:08:48 +03:00
Razvan Dimescu
65e65028a0 fix(windows): separate service lifecycle from install flow
service start/stop/restart/status now map to proper SCM operations
instead of re-running the full install/uninstall flow. On re-install,
stop the running service first so the binary can be overwritten.
2026-04-16 16:59:54 +03:00
Razvan Dimescu
d3eab73a31 fix: use sort_by_key to satisfy clippy unnecessary_sort_by 2026-04-16 16:13:15 +03:00
Razvan Dimescu
22ec684e48 Merge remote-tracking branch 'origin/main' into feat/windows-service
# Conflicts:
#	src/main.rs
2026-04-16 16:06:49 +03:00
Razvan Dimescu
aa040fd8a4 Merge pull request #111 from razvandimescu/fix/allowlist-input-focus
fix(dashboard): allowlist input erased by polling refresh
2026-04-16 15:27:02 +03:00
Razvan Dimescu
b69cc89d38 fix(dashboard): skip allowlist re-render while input has focus
The polling refresh replaced the entire allowlist panel innerHTML every
2 seconds, destroying the input field mid-typing. Users had to
paste-and-enter faster than the refresh interval — #106 reported this
as text "timing out and erasing."

Guard: skip renderAllowlist() when allowDomainInput has focus.
2026-04-16 15:12:00 +03:00
Razvan Dimescu
ebb801650e Merge pull request #110 from razvandimescu/feat/build-version
feat: embed git SHA in version string
2026-04-16 13:41:23 +03:00
Razvan Dimescu
30bb7365c9 refactor: robust git-describe parsing for pre-release tags
Switch to --long flag so format is always TAG-N-gSHA[-dirty], then
split from the right. Handles pre-release tags (v0.14.0-rc1) that
broke the previous left-split approach. Remove ineffective directory
watch on .git/refs/tags/. Trim comments.
2026-04-16 13:18:56 +03:00
Razvan Dimescu
0118ab0f44 feat: embed git SHA in version string via build.rs
Adds a build.rs that runs `git describe --tags --always --dirty` and
sets NUMA_BUILD_VERSION at compile time. A new `numa::version()` helper
returns the build version, falling back to CARGO_PKG_VERSION when git
is unavailable (source tarballs, Docker builds without .git).

Version strings:
  tagged release:      0.13.1
  commits ahead:       0.13.1+a87f907
  uncommitted changes: 0.13.1+a87f907-dirty
  no git:              0.13.1

Replaces all 6 inline env!("CARGO_PKG_VERSION") call sites with the
single version() function.
2026-04-16 13:02:25 +03:00
Razvan Dimescu
a87f907d20 Merge pull request #109 from razvandimescu/feat/dashboard-version
feat(dashboard): version in header, restructure footer
2026-04-16 11:29:54 +03:00
Razvan Dimescu
1c5e703330 fix(dashboard): collapse header on mobile (≤700px)
Hide tagline, version tag, and Phone Setup on narrow viewports so
the header stays single-row: logo + status dot + blocking toggle.
Reduces logo font-size from 1.8rem to 1.4rem on mobile.
2026-04-16 06:39:29 +03:00
Razvan Dimescu
cc635f2f73 feat(dashboard): show version in header, restructure footer
Closes #108.

- Add `version` field to /stats (from CARGO_PKG_VERSION).
- Show `v0.13.1` next to the Numa wordmark in the dashboard header.
- Restructure the footer into two semantic rows:
  Row 1 (paths): Config · Data · Logs (platform-detected)
  Row 2 (runtime): Upstream · DNSSEC · SRTT · GitHub
- Drop Mode from the footer (redundant with Upstream label).
- Show only the matching-platform log path instead of both
  macOS and Linux unconditionally.
2026-04-16 06:15:48 +03:00
Razvan Dimescu
7bb484ada3 refactor(windows): deduplicate after simplify review
- Drop the duplicate WINDOWS_SERVICE_NAME constant; call sites use the
  single source of truth at windows_service::SERVICE_NAME.
- windows_service_exe_path and service_config_path now compose from
  crate::data_dir() instead of re-parsing %PROGRAMDATA% locally.
- Factor the 6× sc.exe invocation boilerplate into a run_sc helper.
- Replace the 200ms try_recv polling loop in the service dispatcher
  with a recv_timeout wait — cuts shutdown latency and idle CPU.
- stop_service_scm/delete_service_scm now log warnings instead of
  silently swallowing failures, so unexpected errors are visible.
2026-04-15 23:48:09 +03:00
Razvan Dimescu
b610160cd1 feat(windows): run numa as a real SCM service, drop Run-key autostart
Hooks the service-dispatcher scaffolding from the previous commit to
actually serve DNS, and replaces the HKLM\…\Run login-time autostart
with a proper Windows service created via sc.exe.

**Refactor**
- Extract main.rs's inline server body (~500 lines) into `numa::serve::run`
  so both the interactive CLI entry and the service dispatcher drive the
  same startup/serve loop. main.rs is now a thin subcommand router.
- main.rs goes sync (no #[tokio::main]); each branch that needs async
  builds its own runtime and block_on's. Required so the --service path
  can hand off to SCM without fighting tokio for the entry thread.

**Windows service wrapper**
- `numa::windows_service::run_service` now builds a multi-thread tokio
  runtime on a dedicated thread and runs `serve::run` inside it. Stop/
  Shutdown from SCM aborts the wait loop and reports SERVICE_STOPPED.
- Config path resolves to `%PROGRAMDATA%\numa\numa.toml` when running
  under SCM (SYSTEM's cwd is System32, relative paths don't work).

**Install/uninstall**
- `install_windows` now copies numa.exe to a stable
  `%PROGRAMDATA%\numa\bin\numa.exe` and registers it via `sc create`
  with start=auto, obj=LocalSystem, and a failure policy of
  restart/5000/restart/5000/restart/10000. Starts the service
  immediately when no reboot is pending.
- `uninstall_windows` stops + deletes the service and removes the
  binary copy before restoring DNS.
- Drops the old `register_autostart` / `remove_autostart` helpers that
  wrote to `HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run` — that
  path runs at user login in the user's session with no stderr capture
  and no crash-restart policy, which is why we've been flying blind in
  every Windows debug session.

DNS-set bugs (netsh destructive static, IPv6 not touched, uninstall
secondary-drop) and file logging are orthogonal — tracked for follow-up.
2026-04-15 22:24:23 +03:00
Razvan Dimescu
cea4b0ef88 feat(windows): add windows-service crate + SCM dispatcher scaffold
Lets numa.exe act as a real Windows service registered with the SCM,
replacing the HKLM\...\Run login-time autostart that runs in the user
session without stderr capture.

- New `numa::windows_service` module (cfg(windows)) wraps Mullvad's
  `windows-service` crate: registers with SCM, reports Running, handles
  Stop/Shutdown, reports Stopped.
- `numa.exe --service` is the entry point SCM uses
  (`sc create … binPath="numa.exe --service"`); interactive invocations
  are unchanged.
- Dep is gated `[target.'cfg(windows)'.dependencies]` — zero impact on
  macOS/Linux builds or binary size.

Scaffold only. The service currently blocks on an mpsc channel until
Stop arrives; the actual serve loop will hook in once main.rs's inline
server body is extracted into `numa::serve(config_path)` in a follow-up.
This lets `sc start Numa` / `sc stop Numa` be verified end to end today.
2026-04-15 22:14:36 +03:00
Razvan Dimescu
4afc56a052 Merge main into feat/forwarding-array-upstream
Resolves conflict in src/ctx.rs — both sides added independent tokio
tests (forwarding fail-over on this branch, default-pool upstream path
on main from #103). Keep both.
2026-04-15 21:28:04 +03:00
Razvan Dimescu
43a5ca4bd5 Merge pull request #105 from razvandimescu/chore/audit-rustls-webpki
chore(deps): bump rustls-webpki to 0.103.12
2026-04-15 14:41:19 +03:00
Razvan Dimescu
b403671e11 chore(deps): bump rustls-webpki to 0.103.12
Patches RUSTSEC-2026-0098 (URI name constraints incorrectly accepted)
and RUSTSEC-2026-0099 (wildcard cert name constraints), both published
2026-04-14. Transitive via reqwest / rustls / hickory / quinn.
2026-04-15 14:27:17 +03:00
Razvan Dimescu
6f0144b237 Merge pull request #103 from razvandimescu/feat/upstream-log-label
feat: distinguish UPSTREAM vs FORWARD in logs and stats
2026-04-15 14:00:28 +03:00
Razvan Dimescu
fef43635d6 fix(ci): rustfmt import order and gate Upstream import for Windows 2026-04-15 04:11:27 +03:00
Razvan Dimescu
9a0d586b13 feat: accept array of upstreams in [[forwarding]]
Mirrors `[upstream] address` — `upstream` accepts string or array
of strings, builds an `UpstreamPool` and routes queries through
`forward_with_failover_raw` so SRTT ordering and failover apply to
matched `[[forwarding]]` rules the same way they do for the default
pool.

Single-string rules keep their current behavior (one-element pool,
equivalent single-upstream path). Empty array errors at config load.

Addresses item 1 of issue #102. Plan: docs/102_item1.md.
2026-04-15 04:03:38 +03:00
Razvan Dimescu
4bd08e206d feat(dashboard): hide zero-count path and transport rows 2026-04-14 21:25:11 +03:00
Razvan Dimescu
ebb2a5db39 refactor: simplify upstream-path test — reuse pool mutex, drop narrating comment 2026-04-14 18:26:45 +03:00
Razvan Dimescu
e0e0f50838 feat: distinguish UPSTREAM vs FORWARD in logs and stats
Queries matching a [[forwarding]] suffix rule now log as FORWARD;
queries resolved via the default [upstream] pool log as UPSTREAM.
Previously both paths shared the FORWARD label, making it impossible
to tell from logs whether a rule matched.

Adds QueryPath::Upstream, a queries.upstream stats counter exposed
via /stats, plus a matching dashboard filter, bar, and path tag.

Closes part of #102.
2026-04-14 18:18:32 +03:00
35 changed files with 5734 additions and 1039 deletions

View File

@@ -56,6 +56,8 @@ jobs:
runs-on: windows-latest runs-on: windows-latest
steps: steps:
- uses: actions/checkout@v6 - uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: dtolnay/rust-toolchain@stable - uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2 - uses: Swatinem/rust-cache@v2
- name: build - name: build
@@ -69,3 +71,76 @@ jobs:
with: with:
name: numa-windows-x86_64 name: numa-windows-x86_64
path: target/debug/numa.exe path: target/debug/numa.exe
integration-linux:
needs: [check]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2
- name: build
run: cargo build --release
- name: install / verify / re-install / uninstall
run: |
sudo ./target/release/numa install
sleep 2
curl -sf http://127.0.0.1:5380/health
dig @127.0.0.1 example.com +short +timeout=5 | grep -q '.'
user=$(ps -o user= -p "$(systemctl show -p MainPID --value numa)" | tr -d ' ')
echo "numa running as: $user"
test "$user" != "root"
sudo ./target/release/numa install
sleep 2
curl -sf http://127.0.0.1:5380/health
sudo ./target/release/numa uninstall
sleep 1
! curl -sf http://127.0.0.1:5380/health 2>/dev/null
- name: diagnostics on failure
if: failure()
run: |
echo "=== systemctl status numa ==="
sudo systemctl status numa --no-pager -l || true
echo "=== journalctl -u numa (last 200) ==="
sudo journalctl -u numa --no-pager -n 200 || true
echo "=== ss -tulnp on 53/80/443/853/5380 ==="
sudo ss -tulnp 2>/dev/null | grep -E ':(53|80|443|853|5380)\b' || true
echo "=== systemctl is-active systemd-resolved ==="
systemctl is-active systemd-resolved || true
- name: cleanup
if: always()
run: |
sudo ./target/release/numa uninstall 2>/dev/null || true
# systemd-resolved has a ~40s DNS reconfiguration stall after
# restart (systemd issue #22521) that breaks the runner agent's
# connection to GitHub. Bridge it by replacing the stub-resolv
# symlink with a direct upstream — DNS works instantly and the
# runner can phone home for post-job steps.
sudo rm -f /etc/resolv.conf
echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf > /dev/null
getent hosts github.com >/dev/null
integration-macos:
needs: [check-macos]
runs-on: macos-latest
steps:
- uses: actions/checkout@v6
- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2
- name: build
run: cargo build --release
- name: install / verify / re-install / uninstall
run: |
sudo ./target/release/numa install
sleep 2
curl -sf http://127.0.0.1:5380/health
dig @127.0.0.1 example.com +short +timeout=5 | grep -q '.'
sudo ./target/release/numa install
sleep 2
curl -sf http://127.0.0.1:5380/health
sudo ./target/release/numa uninstall
sleep 1
! curl -sf http://127.0.0.1:5380/health 2>/dev/null
- name: cleanup
if: always()
run: sudo ./target/release/numa uninstall 2>/dev/null || true

408
Cargo.lock generated
View File

@@ -8,6 +8,41 @@ version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "320119579fcad9c21884f5c4861d16174d0e06250625266f50fe6898340abefa" checksum = "320119579fcad9c21884f5c4861d16174d0e06250625266f50fe6898340abefa"
[[package]]
name = "aead"
version = "0.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d122413f284cf2d62fb1b7db97e02edb8cda96d769b16e443a4f6195e35662b0"
dependencies = [
"crypto-common",
"generic-array",
]
[[package]]
name = "aes"
version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b169f7a6d4742236a0a00c541b845991d0ac43e546831af1249753ab4c3aa3a0"
dependencies = [
"cfg-if",
"cipher",
"cpufeatures",
]
[[package]]
name = "aes-gcm"
version = "0.10.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "831010a0f742e1209b3bcea8fab6a8e149051ba6099432c8cb2cc117dec3ead1"
dependencies = [
"aead",
"aes",
"cipher",
"ctr",
"ghash",
"subtle",
]
[[package]] [[package]]
name = "aho-corasick" name = "aho-corasick"
version = "1.1.4" version = "1.1.4"
@@ -109,7 +144,7 @@ dependencies = [
"nom", "nom",
"num-traits", "num-traits",
"rusticata-macros", "rusticata-macros",
"thiserror", "thiserror 2.0.18",
"time", "time",
] ]
@@ -257,6 +292,15 @@ version = "2.11.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "843867be96c8daad0d758b57df9392b6d8d271134fce549de6ce169ff98a92af" checksum = "843867be96c8daad0d758b57df9392b6d8d271134fce549de6ce169ff98a92af"
[[package]]
name = "block-buffer"
version = "0.10.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3078c7629b62d3f0439517fa394996acacc5cbc91c5a20d8c658e77abd503a71"
dependencies = [
"generic-array",
]
[[package]] [[package]]
name = "bumpalo" name = "bumpalo"
version = "3.20.2" version = "3.20.2"
@@ -299,6 +343,30 @@ version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724" checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724"
[[package]]
name = "chacha20"
version = "0.9.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3613f74bd2eac03dad61bd53dbe620703d4371614fe0bc3b9f04dd36fe4e818"
dependencies = [
"cfg-if",
"cipher",
"cpufeatures",
]
[[package]]
name = "chacha20poly1305"
version = "0.10.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "10cd79432192d1c0f4e1a0fef9527696cc039165d729fb41b3f4f4f354c2dc35"
dependencies = [
"aead",
"chacha20",
"cipher",
"poly1305",
"zeroize",
]
[[package]] [[package]]
name = "ciborium" name = "ciborium"
version = "0.2.2" version = "0.2.2"
@@ -326,6 +394,17 @@ dependencies = [
"half", "half",
] ]
[[package]]
name = "cipher"
version = "0.4.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "773f3b9af64447d2ce9850330c473515014aa235e6a783b02db81ff39e4a3dad"
dependencies = [
"crypto-common",
"inout",
"zeroize",
]
[[package]] [[package]]
name = "clap" name = "clap"
version = "4.6.0" version = "4.6.0"
@@ -383,6 +462,15 @@ version = "0.4.31"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75984efb6ed102a0d42db99afb6c1948f0380d1d91808d5529916e6c08b49d8d" checksum = "75984efb6ed102a0d42db99afb6c1948f0380d1d91808d5529916e6c08b49d8d"
[[package]]
name = "cpufeatures"
version = "0.2.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "59ed5838eebb26a2bb2e58f6d5b5316989ae9d08bab10e0e6d103e656d1b0280"
dependencies = [
"libc",
]
[[package]] [[package]]
name = "crc32fast" name = "crc32fast"
version = "1.5.0" version = "1.5.0"
@@ -473,6 +561,51 @@ version = "0.2.4"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "460fbee9c2c2f33933d720630a6a0bac33ba7053db5344fac858d4b8952d77d5" checksum = "460fbee9c2c2f33933d720630a6a0bac33ba7053db5344fac858d4b8952d77d5"
[[package]]
name = "crypto-common"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "78c8292055d1c1df0cce5d180393dc8cce0abec0a7102adb6c7b1eef6016d60a"
dependencies = [
"generic-array",
"rand_core 0.6.4",
"typenum",
]
[[package]]
name = "ctr"
version = "0.9.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0369ee1ad671834580515889b80f2ea915f23b8be8d0daa4bbaf2ac5c7590835"
dependencies = [
"cipher",
]
[[package]]
name = "curve25519-dalek"
version = "4.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "97fb8b7c4503de7d6ae7b42ab72a5a59857b4c937ec27a3d4539dba95b5ab2be"
dependencies = [
"cfg-if",
"cpufeatures",
"curve25519-dalek-derive",
"fiat-crypto",
"rustc_version",
"subtle",
]
[[package]]
name = "curve25519-dalek-derive"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f46882e17999c6cc590af592290432be3bce0428cb0d5f8b6715e4dc7b383eb3"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]] [[package]]
name = "data-encoding" name = "data-encoding"
version = "2.10.0" version = "2.10.0"
@@ -502,6 +635,17 @@ dependencies = [
"powerfmt", "powerfmt",
] ]
[[package]]
name = "digest"
version = "0.10.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9ed9a281f7bc9b7576e61468ba615a66a5c8cfdff42420a70aa82701a3b1e292"
dependencies = [
"block-buffer",
"crypto-common",
"subtle",
]
[[package]] [[package]]
name = "displaydoc" name = "displaydoc"
version = "0.2.5" version = "0.2.5"
@@ -576,6 +720,12 @@ dependencies = [
"windows-sys 0.61.2", "windows-sys 0.61.2",
] ]
[[package]]
name = "fiat-crypto"
version = "0.2.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "28dea519a9695b9977216879a3ebfddf92f1c08c05d984f8996aecd6ecdc811d"
[[package]] [[package]]
name = "find-msvc-tools" name = "find-msvc-tools"
version = "0.1.9" version = "0.1.9"
@@ -707,6 +857,16 @@ dependencies = [
"slab", "slab",
] ]
[[package]]
name = "generic-array"
version = "0.14.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "85649ca51fd72272d7821adaf274ad91c288277713d9c18820d8499a7ff69e9a"
dependencies = [
"typenum",
"version_check",
]
[[package]] [[package]]
name = "getrandom" name = "getrandom"
version = "0.2.17" version = "0.2.17"
@@ -747,6 +907,16 @@ dependencies = [
"wasip3", "wasip3",
] ]
[[package]]
name = "ghash"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f0d8a4362ccb29cb0b265253fb0a2728f592895ee6854fd9bc13f2ffda266ff1"
dependencies = [
"opaque-debug",
"polyval",
]
[[package]] [[package]]
name = "h2" name = "h2"
version = "0.4.13" version = "0.4.13"
@@ -820,7 +990,7 @@ dependencies = [
"rand", "rand",
"ring", "ring",
"rustls", "rustls",
"thiserror", "thiserror 2.0.18",
"tinyvec", "tinyvec",
"tokio", "tokio",
"tokio-rustls", "tokio-rustls",
@@ -846,13 +1016,51 @@ dependencies = [
"resolv-conf", "resolv-conf",
"rustls", "rustls",
"smallvec", "smallvec",
"thiserror", "thiserror 2.0.18",
"tokio", "tokio",
"tokio-rustls", "tokio-rustls",
"tracing", "tracing",
"webpki-roots 0.26.11", "webpki-roots 0.26.11",
] ]
[[package]]
name = "hkdf"
version = "0.12.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7b5f8eb2ad728638ea2c7d47a21db23b7b58a72ed6a38256b8a1849f15fbbdf7"
dependencies = [
"hmac",
]
[[package]]
name = "hmac"
version = "0.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6c49c37c09c17a53d937dfbb742eb3a961d65a994e6bcdcf37e7399d0cc8ab5e"
dependencies = [
"digest",
]
[[package]]
name = "hpke"
version = "0.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f65d16b699dd1a1fa2d851c970b0c971b388eeeb40f744252b8de48860980c8f"
dependencies = [
"aead",
"aes-gcm",
"chacha20poly1305",
"digest",
"generic-array",
"hkdf",
"hmac",
"rand_core 0.9.5",
"sha2",
"subtle",
"x25519-dalek",
"zeroize",
]
[[package]] [[package]]
name = "http" name = "http"
version = "1.4.0" version = "1.4.0"
@@ -1081,6 +1289,15 @@ dependencies = [
"serde_core", "serde_core",
] ]
[[package]]
name = "inout"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "879f10e63c20629ecabbb64a8010319738c66a5cd0c29b02d63d272b03751d01"
dependencies = [
"generic-array",
]
[[package]] [[package]]
name = "ipconfig" name = "ipconfig"
version = "0.3.4" version = "0.3.4"
@@ -1330,7 +1547,7 @@ dependencies = [
[[package]] [[package]]
name = "numa" name = "numa"
version = "0.13.1" version = "0.14.1"
dependencies = [ dependencies = [
"arc-swap", "arc-swap",
"axum", "axum",
@@ -1344,7 +1561,10 @@ dependencies = [
"hyper", "hyper",
"hyper-util", "hyper-util",
"log", "log",
"odoh-rs",
"psl",
"qrcode", "qrcode",
"rand_core 0.9.5",
"rcgen", "rcgen",
"reqwest", "reqwest",
"ring", "ring",
@@ -1359,9 +1579,23 @@ dependencies = [
"toml", "toml",
"tower", "tower",
"webpki-roots 1.0.6", "webpki-roots 1.0.6",
"windows-service",
"x509-parser", "x509-parser",
] ]
[[package]]
name = "odoh-rs"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cbb89720b7dfdddc89bc7560669d41a0bb68eb64784a4aebd293308a489f3837"
dependencies = [
"aes-gcm",
"bytes",
"hkdf",
"hpke",
"thiserror 1.0.69",
]
[[package]] [[package]]
name = "oid-registry" name = "oid-registry"
version = "0.8.1" version = "0.8.1"
@@ -1393,6 +1627,12 @@ version = "11.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d6790f58c7ff633d8771f42965289203411a5e5c68388703c06e14f24770b41e" checksum = "d6790f58c7ff633d8771f42965289203411a5e5c68388703c06e14f24770b41e"
[[package]]
name = "opaque-debug"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c08d65885ee38876c4f86fa503fb49d7b507c2b62552df7c70b2fce627e06381"
[[package]] [[package]]
name = "page_size" name = "page_size"
version = "0.6.0" version = "0.6.0"
@@ -1482,6 +1722,29 @@ dependencies = [
"plotters-backend", "plotters-backend",
] ]
[[package]]
name = "poly1305"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8159bd90725d2df49889a078b54f4f79e87f1f8a8444194cdca81d38f5393abf"
dependencies = [
"cpufeatures",
"opaque-debug",
"universal-hash",
]
[[package]]
name = "polyval"
version = "0.6.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d1fe60d06143b2430aa532c94cfe9e29783047f06c0d7fd359a9a51b729fa25"
dependencies = [
"cfg-if",
"cpufeatures",
"opaque-debug",
"universal-hash",
]
[[package]] [[package]]
name = "portable-atomic" name = "portable-atomic"
version = "1.13.1" version = "1.13.1"
@@ -1540,6 +1803,21 @@ dependencies = [
"unicode-ident", "unicode-ident",
] ]
[[package]]
name = "psl"
version = "2.1.203"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "76c0777260d32b76a8c3c197646707085d37e79d63b5872a29192c8d4f60f50b"
dependencies = [
"psl-types",
]
[[package]]
name = "psl-types"
version = "2.0.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33cb294fe86a74cbcf50d4445b37da762029549ebeea341421c7c70370f86cac"
[[package]] [[package]]
name = "qrcode" name = "qrcode"
version = "0.14.1" version = "0.14.1"
@@ -1560,7 +1838,7 @@ dependencies = [
"rustc-hash", "rustc-hash",
"rustls", "rustls",
"socket2", "socket2",
"thiserror", "thiserror 2.0.18",
"tokio", "tokio",
"tracing", "tracing",
"web-time", "web-time",
@@ -1581,7 +1859,7 @@ dependencies = [
"rustls", "rustls",
"rustls-pki-types", "rustls-pki-types",
"slab", "slab",
"thiserror", "thiserror 2.0.18",
"tinyvec", "tinyvec",
"tracing", "tracing",
"web-time", "web-time",
@@ -1629,7 +1907,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6db2770f06117d490610c7488547d543617b21bfa07796d7a12f6f1bd53850d1" checksum = "6db2770f06117d490610c7488547d543617b21bfa07796d7a12f6f1bd53850d1"
dependencies = [ dependencies = [
"rand_chacha", "rand_chacha",
"rand_core", "rand_core 0.9.5",
] ]
[[package]] [[package]]
@@ -1639,7 +1917,16 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d3022b5f1df60f26e1ffddd6c66e8aa15de382ae63b3a0c1bfc0e4d3e3f325cb" checksum = "d3022b5f1df60f26e1ffddd6c66e8aa15de382ae63b3a0c1bfc0e4d3e3f325cb"
dependencies = [ dependencies = [
"ppv-lite86", "ppv-lite86",
"rand_core", "rand_core 0.9.5",
]
[[package]]
name = "rand_core"
version = "0.6.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ec0be4795e2f6a28069bec0b5ff3e2ac9bafc99e6a9a7dc3547996c5c816922c"
dependencies = [
"getrandom 0.2.17",
] ]
[[package]] [[package]]
@@ -1788,6 +2075,15 @@ version = "2.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "357703d41365b4b27c590e3ed91eabb1b663f07c4c084095e60cbed4362dff0d" checksum = "357703d41365b4b27c590e3ed91eabb1b663f07c4c084095e60cbed4362dff0d"
[[package]]
name = "rustc_version"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cfcb3a22ef46e85b45de6ee7e79d063319ebb6594faafcf1c225ea92ab6e9b92"
dependencies = [
"semver",
]
[[package]] [[package]]
name = "rusticata-macros" name = "rusticata-macros"
version = "4.1.0" version = "4.1.0"
@@ -1834,9 +2130,9 @@ dependencies = [
[[package]] [[package]]
name = "rustls-webpki" name = "rustls-webpki"
version = "0.103.10" version = "0.103.12"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "df33b2b81ac578cabaf06b89b0631153a3f416b0a886e8a7a1707fb51abbd1ef" checksum = "8279bb85272c9f10811ae6a6c547ff594d6a7f3c6c6b02ee9726d1d0dcfcdd06"
dependencies = [ dependencies = [
"aws-lc-rs", "aws-lc-rs",
"ring", "ring",
@@ -1952,6 +2248,17 @@ dependencies = [
"serde", "serde",
] ]
[[package]]
name = "sha2"
version = "0.10.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a7507d819769d01a365ab707794a4084392c824f54a7a6a7862f8c3d0892b283"
dependencies = [
"cfg-if",
"cpufeatures",
"digest",
]
[[package]] [[package]]
name = "shlex" name = "shlex"
version = "1.3.0" version = "1.3.0"
@@ -2045,13 +2352,33 @@ version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7b2093cf4c8eb1e67749a6762251bc9cd836b6fc171623bd0a9d324d37af2417" checksum = "7b2093cf4c8eb1e67749a6762251bc9cd836b6fc171623bd0a9d324d37af2417"
[[package]]
name = "thiserror"
version = "1.0.69"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6aaf5339b578ea85b50e080feb250a3e8ae8cfcdff9a461c9ec2904bc923f52"
dependencies = [
"thiserror-impl 1.0.69",
]
[[package]] [[package]]
name = "thiserror" name = "thiserror"
version = "2.0.18" version = "2.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4288b5bcbc7920c07a1149a35cf9590a2aa808e0bc1eafaade0b80947865fbc4" checksum = "4288b5bcbc7920c07a1149a35cf9590a2aa808e0bc1eafaade0b80947865fbc4"
dependencies = [ dependencies = [
"thiserror-impl", "thiserror-impl 2.0.18",
]
[[package]]
name = "thiserror-impl"
version = "1.0.69"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4fee6c4efc90059e10f81e6d42c60a18f76588c3d74cb83a0b242a2b6c7504c1"
dependencies = [
"proc-macro2",
"quote",
"syn",
] ]
[[package]] [[package]]
@@ -2297,6 +2624,12 @@ version = "0.2.5"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e421abadd41a4225275504ea4d6566923418b7f05506fbc9c0fe86ba7396114b" checksum = "e421abadd41a4225275504ea4d6566923418b7f05506fbc9c0fe86ba7396114b"
[[package]]
name = "typenum"
version = "1.20.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "40ce102ab67701b8526c123c1bab5cbe42d7040ccfd0f64af1a385808d2f43de"
[[package]] [[package]]
name = "unicode-ident" name = "unicode-ident"
version = "1.0.24" version = "1.0.24"
@@ -2309,6 +2642,16 @@ version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ebc1c04c71510c7f702b52b7c350734c9ff1295c464a03335b00bb84fc54f853" checksum = "ebc1c04c71510c7f702b52b7c350734c9ff1295c464a03335b00bb84fc54f853"
[[package]]
name = "universal-hash"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fc1de2c688dc15305988b563c3854064043356019f97a4b46276fe734c4f07ea"
dependencies = [
"crypto-common",
"subtle",
]
[[package]] [[package]]
name = "untrusted" name = "untrusted"
version = "0.9.0" version = "0.9.0"
@@ -2350,6 +2693,12 @@ dependencies = [
"wasm-bindgen", "wasm-bindgen",
] ]
[[package]]
name = "version_check"
version = "0.9.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a"
[[package]] [[package]]
name = "walkdir" name = "walkdir"
version = "2.5.0" version = "2.5.0"
@@ -2583,6 +2932,17 @@ dependencies = [
"windows-link", "windows-link",
] ]
[[package]]
name = "windows-service"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d24d6bcc7f734a4091ecf8d7a64c5f7d7066f45585c1861eba06449909609c8a"
dependencies = [
"bitflags",
"widestring",
"windows-sys 0.52.0",
]
[[package]] [[package]]
name = "windows-strings" name = "windows-strings"
version = "0.5.1" version = "0.5.1"
@@ -2848,6 +3208,16 @@ version = "0.6.2"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9edde0db4769d2dc68579893f2306b26c6ecfbe0ef499b013d731b7b9247e0b9" checksum = "9edde0db4769d2dc68579893f2306b26c6ecfbe0ef499b013d731b7b9247e0b9"
[[package]]
name = "x25519-dalek"
version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c7e468321c81fb07fa7f4c636c3972b9100f0346e5b6a9f2bd0603a52f7ed277"
dependencies = [
"curve25519-dalek",
"rand_core 0.6.4",
]
[[package]] [[package]]
name = "x509-parser" name = "x509-parser"
version = "0.18.1" version = "0.18.1"
@@ -2862,7 +3232,7 @@ dependencies = [
"oid-registry", "oid-registry",
"ring", "ring",
"rusticata-macros", "rusticata-macros",
"thiserror", "thiserror 2.0.18",
"time", "time",
] ]
@@ -2944,6 +3314,20 @@ name = "zeroize"
version = "1.8.2" version = "1.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b97154e67e32c85465826e8bcc1c59429aaaf107c1e4a9e53c8d8ccd5eff88d0" checksum = "b97154e67e32c85465826e8bcc1c59429aaaf107c1e4a9e53c8d8ccd5eff88d0"
dependencies = [
"zeroize_derive",
]
[[package]]
name = "zeroize_derive"
version = "1.4.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "85a5b4158499876c763cb03bc4e49185d3cccbabb15b33c627f7884f43db852e"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]] [[package]]
name = "zerotrie" name = "zerotrie"

View File

@@ -1,6 +1,6 @@
[package] [package]
name = "numa" name = "numa"
version = "0.13.1" version = "0.14.1"
authors = ["razvandimescu <razvan@dimescu.com>"] authors = ["razvandimescu <razvan@dimescu.com>"]
edition = "2021" edition = "2021"
description = "Portable DNS resolver in Rust — .numa local domains, ad blocking, developer overrides, DNS-over-HTTPS" description = "Portable DNS resolver in Rust — .numa local domains, ad blocking, developer overrides, DNS-over-HTTPS"
@@ -29,10 +29,18 @@ rustls = "0.23"
tokio-rustls = "0.26" tokio-rustls = "0.26"
arc-swap = "1" arc-swap = "1"
ring = "0.17" ring = "0.17"
odoh-rs = "1"
psl = "2"
# rand_core 0.9 matches the version odoh-rs (via hpke 0.13) depends on, so we
# share one RngCore trait and OsRng impl across the dep tree.
rand_core = { version = "0.9", features = ["os_rng"] }
rustls-pemfile = "2.2.0" rustls-pemfile = "2.2.0"
qrcode = { version = "0.14", default-features = false, features = ["svg"] } qrcode = { version = "0.14", default-features = false, features = ["svg"] }
webpki-roots = "1" webpki-roots = "1"
[target.'cfg(windows)'.dependencies]
windows-service = "0.7"
[dev-dependencies] [dev-dependencies]
criterion = { version = "0.8", features = ["html_reports"] } criterion = { version = "0.8", features = ["html_reports"] }
tower = { version = "0.5", features = ["util"] } tower = { version = "0.5", features = ["util"] }

View File

@@ -6,9 +6,9 @@
**DNS you own. Everywhere you go.** — [numa.rs](https://numa.rs) **DNS you own. Everywhere you go.** — [numa.rs](https://numa.rs)
A portable DNS resolver in a single binary. Block ads on any network, name your local services (`frontend.numa`), and override any hostname with auto-revert — all from your laptop, no cloud account or Raspberry Pi required. A portable DNS resolver in a single binary. Block ads on any network, name your local services (`frontend.numa`), override any hostname with auto-revert, and seal every outbound query with **ODoH (RFC 9230)** so no single party sees both who you are and what you asked — all from your laptop, no cloud account or Raspberry Pi required.
Built from scratch in Rust. Zero DNS libraries. RFC 1035 wire protocol parsed by hand. Caching, ad blocking, and local service domains out of the box. Optional recursive resolution from root nameservers with full DNSSEC chain-of-trust validation, plus a DNS-over-TLS listener for encrypted client connections (iOS Private DNS, systemd-resolved, etc.). One ~8MB binary, everything embedded. Built from scratch in Rust. Zero DNS libraries. Caching, ad blocking, and local service domains out of the box. Optional recursive resolution from root nameservers with full DNSSEC chain-of-trust validation, plus a DNS-over-TLS listener for encrypted client connections (iOS Private DNS, systemd-resolved, etc.). Run `numa relay` and the same binary becomes a public ODoH endpoint too — the curated DNSCrypt list currently has one surviving relay, so every Numa deploy materially expands the ecosystem. One ~8MB binary, everything embedded.
![Numa dashboard](assets/hero-demo.gif) ![Numa dashboard](assets/hero-demo.gif)

View File

@@ -383,7 +383,7 @@ fn run_default(rt: &tokio::runtime::Runtime) {
/// Library-to-library: Numa forward_query_raw vs Hickory resolver.lookup. /// Library-to-library: Numa forward_query_raw vs Hickory resolver.lookup.
fn run_direct(rt: &tokio::runtime::Runtime) { fn run_direct(rt: &tokio::runtime::Runtime) {
let upstream = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse"); let upstream = numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let resolver = rt.block_on(build_hickory_resolver()); let resolver = rt.block_on(build_hickory_resolver());
let timeout = Duration::from_secs(10); let timeout = Duration::from_secs(10);
@@ -609,9 +609,11 @@ fn run_hedge_multi(rt: &tokio::runtime::Runtime, iterations: usize) {
DOMAINS.len() DOMAINS.len()
); );
let primary = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse"); let primary = numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let primary_dual = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse"); let primary_dual =
let secondary_dual = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse"); numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let secondary_dual =
numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let resolver = rt.block_on(build_hickory_resolver()); let resolver = rt.block_on(build_hickory_resolver());
println!("Warming up..."); println!("Warming up...");
@@ -810,7 +812,7 @@ fn run_diag(rt: &tokio::runtime::Runtime) {
fn run_diag_clients(rt: &tokio::runtime::Runtime) { fn run_diag_clients(rt: &tokio::runtime::Runtime) {
println!("Client diagnostic: reqwest vs Hickory (20 queries to {DOH_UPSTREAM})\n"); println!("Client diagnostic: reqwest vs Hickory (20 queries to {DOH_UPSTREAM})\n");
let upstream = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse"); let upstream = numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let resolver = rt.block_on(build_hickory_resolver()); let resolver = rt.block_on(build_hickory_resolver());
let timeout = Duration::from_secs(10); let timeout = Duration::from_secs(10);

48
build.rs Normal file
View File

@@ -0,0 +1,48 @@
fn main() {
// --long forces "TAG-N-gSHA[-dirty]" format even on exact tag matches,
// making parsing unambiguous for pre-release tags like v0.14.0-rc1.
let git_version = std::process::Command::new("git")
.args(["describe", "--tags", "--always", "--dirty", "--long"])
.output()
.ok()
.filter(|o| o.status.success())
.and_then(|o| String::from_utf8(o.stdout).ok())
.and_then(|raw| parse_git_describe(raw.trim()));
if let Some(v) = git_version {
println!("cargo:rustc-env=NUMA_BUILD_VERSION={}", v);
}
println!("cargo:rerun-if-changed=.git/HEAD");
}
/// Parse `git describe --long` output into a SemVer-compatible string.
/// "v0.13.1-0-ga87f907" → "0.13.1"
/// "v0.13.1-9-ga87f907" → "0.13.1+a87f907"
/// "v0.14.0-rc1-0-ga87f907" → "0.14.0-rc1"
/// "v0.14.0-rc1-3-ga87f907-dirty" → "0.14.0-rc1+a87f907-dirty"
/// "a87f907" → "0.0.0+a87f907"
fn parse_git_describe(s: &str) -> Option<String> {
let s = s.strip_prefix('v').unwrap_or(s);
let dirty = s.ends_with("-dirty");
let s = s.strip_suffix("-dirty").unwrap_or(s);
// --long format: TAG-N-gSHA. Split from the right so tags with hyphens work.
let gpos = s.rfind("-g")?;
let sha = &s[gpos + 2..];
let rest = &s[..gpos];
let npos = rest.rfind('-')?;
let n: u32 = rest[npos + 1..].parse().ok()?;
let tag = &rest[..npos];
if tag.is_empty() {
return Some(format!("0.0.0+{}", sha));
}
Some(match (n, dirty) {
(0, false) => tag.to_string(),
(0, true) => format!("{}+{}-dirty", tag, sha),
(_, false) => format!("{}+{}", tag, sha),
(_, true) => format!("{}+{}-dirty", tag, sha),
})
}

View File

@@ -8,6 +8,39 @@ Type=simple
ExecStart={{exe_path}} ExecStart={{exe_path}}
Restart=always Restart=always
RestartSec=2 RestartSec=2
# Transient system user per start; no PKGBUILD/sysusers setup required.
# systemd remaps the StateDirectory ownership to the dynamic UID on each
# launch, including legacy root-owned trees from pre-drop installs.
DynamicUser=yes
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
StateDirectory=numa
StateDirectoryMode=0750
ConfigurationDirectory=numa
ConfigurationDirectoryMode=0755
# Sandboxing — conservative set known to work with Rust network daemons.
# Aggressive hardening (MemoryDenyWriteExecute, SystemCallFilter, seccomp
# allow-lists) can be layered on once tested in isolation.
NoNewPrivileges=true
ProtectSystem=strict
# DynamicUser= sets ProtectHome=read-only by default — leaves /home
# readable so systemd can exec binaries installed under it (cargo install,
# source builds), while blocking writes to user $HOMEs. Don't set =yes:
# that hides /home entirely and fails with status=203/EXEC.
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictRealtime=true
RestrictSUIDSGID=true
# AF_NETLINK for interface enumeration on network changes
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX AF_NETLINK
StandardOutput=journal StandardOutput=journal
StandardError=journal StandardError=journal
SyslogIdentifier=numa SyslogIdentifier=numa

View File

@@ -8,6 +8,16 @@ api_port = 5380
# %PROGRAMDATA%\numa on windows. Override for # %PROGRAMDATA%\numa on windows. Override for
# containerized deploys or tests that can't # containerized deploys or tests that can't
# write to the system path. # write to the system path.
# filter_aaaa = true # on IPv4-only networks, answer AAAA queries with
# NODATA (NOERROR + empty answer) so Happy Eyeballs
# clients don't wait on a v6 attempt that can't
# succeed. Also strips `ipv6hint` from HTTPS/SVCB
# records (RFC 9460) so modern browsers (Chrome
# ≥103, Firefox, Safari) don't bypass the AAAA
# filter via SVCB hints. Local zones, overrides,
# and the .numa proxy are NOT filtered — you can
# still configure v6 records for local services.
# Default: false.
# [upstream] # [upstream]
# mode = "forward" # "forward" (default) — relay to upstream # mode = "forward" # "forward" (default) — relay to upstream
@@ -66,6 +76,13 @@ api_port = 5380
# [[forwarding]] # DoH upstream: full https:// URL # [[forwarding]] # DoH upstream: full https:// URL
# suffix = "example.corp" # suffix = "example.corp"
# upstream = "https://dns.quad9.net/dns-query" # upstream = "https://dns.quad9.net/dns-query"
#
# [[forwarding]] # array of upstreams → SRTT-aware failover
# suffix = ["google.com", "goog"] # fastest-healthy first, dead one skipped
# upstream = [
# "tls://9.9.9.9#dns.quad9.net",
# "tls://149.112.112.112#dns.quad9.net",
# ]
# [blocking] # [blocking]
# enabled = true # set to false to disable ad blocking # enabled = true # set to false to disable ad blocking

15
packaging/relay/Caddyfile Normal file
View File

@@ -0,0 +1,15 @@
odoh-relay.example.com {
handle /relay {
reverse_proxy numa-relay:8443
}
handle /health {
reverse_proxy numa-relay:8443
}
respond 404
# Per-request access logs defeat the point of an oblivious relay.
# Aggregate counters are exposed at /health on the relay itself.
log {
output discard
}
}

48
packaging/relay/README.md Normal file
View File

@@ -0,0 +1,48 @@
# Numa ODoH Relay — Docker deploy
Two-container deploy: Caddy terminates TLS (auto-provisioning a Let's Encrypt
cert via ACME) and reverse-proxies to a Numa relay running on an internal
Docker network. The relay never reads sealed payloads; Caddy never logs them.
## Prerequisites
- A host with public 80/443 reachable from the internet.
- A DNS record (`A` or `AAAA`) pointing your chosen hostname at the host.
- Docker + Docker Compose v2.
## Configure
Edit `Caddyfile` and replace `odoh-relay.example.com` with your hostname.
That hostname is what ACME validates against and what ODoH clients will
configure as their relay URL: `https://<hostname>/relay`.
## Deploy
```sh
docker compose up -d
docker compose logs -f caddy # watch ACME provisioning
```
First boot takes a few seconds while Caddy obtains the cert. Subsequent
restarts reuse the cached cert from the `caddy_data` volume.
## Verify
```sh
curl https://<hostname>/health
# ok
# total 0
# forwarded_ok 0
# forwarded_err 0
# rejected_bad_request 0
```
Then point any ODoH client at `https://<hostname>/relay` and watch the
counters tick.
## Listing on the public ecosystem
DNSCrypt's [v3/odoh-relays.md](https://github.com/DNSCrypt/dnscrypt-resolvers/blob/master/v3/odoh-relays.md)
is the canonical list. The pruned 2025-09-16 commit shows one public ODoH
relay survived the cull — running this compose file doubles global supply.
Open a PR there once your relay has been up for ~24 hours.

View File

@@ -0,0 +1,26 @@
services:
numa-relay:
image: ghcr.io/razvandimescu/numa:latest
command: ["relay", "8443", "0.0.0.0"]
restart: unless-stopped
networks: [internal]
caddy:
image: caddy:2
ports:
- "80:80"
- "443:443"
volumes:
- ./Caddyfile:/etc/caddy/Caddyfile:ro
- caddy_data:/data
- caddy_config:/config
restart: unless-stopped
depends_on: [numa-relay]
networks: [internal]
networks:
internal:
volumes:
caddy_data:
caddy_config:

View File

@@ -217,6 +217,7 @@ body {
min-width: 2px; min-width: 2px;
} }
.path-bar-fill.forward { background: var(--amber); } .path-bar-fill.forward { background: var(--amber); }
.path-bar-fill.upstream { background: var(--amber-dim); }
.path-bar-fill.recursive { background: var(--cyan); } .path-bar-fill.recursive { background: var(--cyan); }
.path-bar-fill.cached { background: var(--teal); } .path-bar-fill.cached { background: var(--teal); }
.path-bar-fill.local { background: var(--violet); } .path-bar-fill.local { background: var(--violet); }
@@ -227,6 +228,7 @@ body {
.path-bar-fill.tcp { background: var(--violet); } .path-bar-fill.tcp { background: var(--violet); }
.path-bar-fill.dot { background: var(--emerald); } .path-bar-fill.dot { background: var(--emerald); }
.path-bar-fill.doh { background: var(--teal); } .path-bar-fill.doh { background: var(--teal); }
.path-bar-fill.odoh { background: var(--violet-dim); }
.path-pct { .path-pct {
font-family: var(--font-mono); font-family: var(--font-mono);
font-size: 0.75rem; font-size: 0.75rem;
@@ -285,6 +287,7 @@ body {
font-weight: 500; font-weight: 500;
} }
.path-tag.FORWARD { background: rgba(192, 98, 58, 0.12); color: var(--amber-dim); } .path-tag.FORWARD { background: rgba(192, 98, 58, 0.12); color: var(--amber-dim); }
.path-tag.UPSTREAM { background: rgba(160, 120, 72, 0.12); color: var(--amber-dim); }
.path-tag.RECURSIVE { background: rgba(74, 124, 138, 0.12); color: var(--cyan); } .path-tag.RECURSIVE { background: rgba(74, 124, 138, 0.12); color: var(--cyan); }
.path-tag.CACHED { background: rgba(107, 124, 78, 0.12); color: var(--teal-dim); } .path-tag.CACHED { background: rgba(107, 124, 78, 0.12); color: var(--teal-dim); }
.path-tag.LOCAL { background: rgba(100, 116, 139, 0.12); color: var(--violet-dim); } .path-tag.LOCAL { background: rgba(100, 116, 139, 0.12); color: var(--violet-dim); }
@@ -550,7 +553,11 @@ body {
@media (max-width: 700px) { @media (max-width: 700px) {
.stats-row { grid-template-columns: repeat(2, 1fr); } .stats-row { grid-template-columns: repeat(2, 1fr); }
.dashboard { padding: 1rem; } .dashboard { padding: 1rem; }
.header { padding: 1rem; } .header { padding: 0.8rem 1rem; }
.logo { font-size: 1.4rem; }
.tagline { display: none; }
#headerVersion { display: none; }
#phoneSetup { display: none; }
} }
</style> </style>
</head> </head>
@@ -559,6 +566,7 @@ body {
<div class="header"> <div class="header">
<div class="header-left"> <div class="header-left">
<div class="logo">Numa</div> <div class="logo">Numa</div>
<span id="headerVersion" style="font-family:var(--font-mono);font-size:0.68rem;color:var(--text-dim);"></span>
<div class="tagline">DNS that governs itself</div> <div class="tagline">DNS that governs itself</div>
</div> </div>
<div style="display:flex;align-items:center;gap:1.2rem;"> <div style="display:flex;align-items:center;gap:1.2rem;">
@@ -630,16 +638,26 @@ body {
</div> </div>
</div> </div>
<!-- Transport breakdown --> <!-- Inbound wire (apps → numa) -->
<div class="panel"> <div class="panel">
<div class="panel-header"> <div class="panel-header">
<span class="panel-title">Transport</span> <span class="panel-title">Inbound Wire <span style="color: var(--text-dim); font-weight: normal;">apps → numa</span></span>
<span class="panel-title" id="transportEncrypted" style="color: var(--text-dim)"></span> <span class="panel-title" id="transportEncrypted" style="color: var(--text-dim)"></span>
</div> </div>
<div class="panel-body" id="transportBars"> <div class="panel-body" id="transportBars">
</div> </div>
</div> </div>
<!-- Outbound wire (numa → internet) -->
<div class="panel">
<div class="panel-header">
<span class="panel-title">Outbound Wire <span style="color: var(--text-dim); font-weight: normal;">numa → internet</span></span>
<span class="panel-title" id="upstreamWireEncrypted" style="color: var(--text-dim)"></span>
</div>
<div class="panel-body" id="upstreamWireBars">
</div>
</div>
<!-- Main grid: query log + sidebar --> <!-- Main grid: query log + sidebar -->
<div class="main-grid"> <div class="main-grid">
<!-- Query log --> <!-- Query log -->
@@ -655,6 +673,7 @@ body {
<option value="RECURSIVE">recursive</option> <option value="RECURSIVE">recursive</option>
<option value="COALESCED">coalesced</option> <option value="COALESCED">coalesced</option>
<option value="FORWARD">forward</option> <option value="FORWARD">forward</option>
<option value="UPSTREAM">upstream</option>
<option value="CACHED">cached</option> <option value="CACHED">cached</option>
<option value="BLOCKED">blocked</option> <option value="BLOCKED">blocked</option>
<option value="OVERRIDE">override</option> <option value="OVERRIDE">override</option>
@@ -936,10 +955,12 @@ function renderMemory(mem, stats) {
function renderBarChart(containerId, defs, data, total) { function renderBarChart(containerId, defs, data, total) {
total = total || 1; total = total || 1;
document.getElementById(containerId).innerHTML = defs.map(d => { document.getElementById(containerId).innerHTML = defs
const count = data[d.key] || 0; .filter(d => (data[d.key] || 0) > 0)
const pct = ((count / total) * 100).toFixed(1); .map(d => {
return ` const count = data[d.key] || 0;
const pct = ((count / total) * 100).toFixed(1);
return `
<div class="path-bar-row"> <div class="path-bar-row">
<span class="path-label">${d.label}</span> <span class="path-label">${d.label}</span>
<div class="path-bar-track"> <div class="path-bar-track">
@@ -947,16 +968,19 @@ function renderBarChart(containerId, defs, data, total) {
</div> </div>
<span class="path-pct">${pct}%</span> <span class="path-pct">${pct}%</span>
</div>`; </div>`;
}).join(''); }).join('');
} }
function encryptionPct(transport) { function encryptionPct(data, encryptedKeys, allKeys) {
const total = (transport.udp + transport.tcp + transport.dot + transport.doh) || 1; const total = allKeys.reduce((s, k) => s + (data[k] || 0), 0);
return (((transport.dot + transport.doh) / total) * 100).toFixed(0); if (total === 0) return 0;
const encrypted = encryptedKeys.reduce((s, k) => s + (data[k] || 0), 0);
return Math.round((encrypted / total) * 100);
} }
const PATH_DEFS = [ const PATH_DEFS = [
{ key: 'forwarded', label: 'Forward', cls: 'forward' }, { key: 'forwarded', label: 'Forward', cls: 'forward' },
{ key: 'upstream', label: 'Upstream', cls: 'upstream' },
{ key: 'recursive', label: 'Recursive', cls: 'recursive' }, { key: 'recursive', label: 'Recursive', cls: 'recursive' },
{ key: 'cached', label: 'Cached', cls: 'cached' }, { key: 'cached', label: 'Cached', cls: 'cached' },
{ key: 'local', label: 'Local', cls: 'local' }, { key: 'local', label: 'Local', cls: 'local' },
@@ -979,9 +1003,25 @@ const TRANSPORT_DEFS = [
function renderTransport(transport) { function renderTransport(transport) {
const total = (transport.udp + transport.tcp + transport.dot + transport.doh) || 1; const total = (transport.udp + transport.tcp + transport.dot + transport.doh) || 1;
renderBarChart('transportBars', TRANSPORT_DEFS, transport, total); renderBarChart('transportBars', TRANSPORT_DEFS, transport, total);
const encPct = encryptionPct(transport); const encPct = encryptionPct(transport, ['dot', 'doh'], ['udp', 'tcp', 'dot', 'doh']);
const el = document.getElementById('transportEncrypted'); const el = document.getElementById('transportEncrypted');
el.textContent = `${encPct}% encrypted`; el.textContent = `${encPct}% encrypted inbound`;
el.style.color = encPct >= 80 ? 'var(--emerald)' : encPct >= 50 ? 'var(--amber)' : 'var(--rose)';
}
const UPSTREAM_WIRE_DEFS = [
{ key: 'udp', label: 'UDP', cls: 'udp' },
{ key: 'doh', label: 'DoH', cls: 'doh' },
{ key: 'dot', label: 'DoT', cls: 'dot' },
{ key: 'odoh', label: 'ODoH', cls: 'odoh' },
];
function renderUpstreamWire(ut) {
const total = (ut.udp + ut.doh + ut.dot + ut.odoh) || 0;
renderBarChart('upstreamWireBars', UPSTREAM_WIRE_DEFS, ut, total || 1);
const encPct = encryptionPct(ut, ['doh', 'dot', 'odoh'], ['udp', 'doh', 'dot', 'odoh']);
const el = document.getElementById('upstreamWireEncrypted');
el.textContent = total > 0 ? `${encPct}% encrypted outbound` : '';
el.style.color = encPct >= 80 ? 'var(--emerald)' : encPct >= 50 ? 'var(--amber)' : 'var(--rose)'; el.style.color = encPct >= 80 ? 'var(--emerald)' : encPct >= 50 ? 'var(--amber)' : 'var(--rose)';
} }
@@ -1130,16 +1170,23 @@ async function refresh() {
document.getElementById('totalQueries').textContent = formatNumber(q.total); document.getElementById('totalQueries').textContent = formatNumber(q.total);
document.getElementById('uptime').textContent = formatUptime(stats.uptime_secs); document.getElementById('uptime').textContent = formatUptime(stats.uptime_secs);
document.getElementById('uptimeSub').textContent = formatUptimeSub(stats.uptime_secs); document.getElementById('uptimeSub').textContent = formatUptimeSub(stats.uptime_secs);
document.getElementById('headerVersion').textContent = stats.version ? 'v' + stats.version : '';
document.getElementById('footerUpstream').textContent = stats.upstream || ''; document.getElementById('footerUpstream').textContent = stats.upstream || '';
document.getElementById('footerConfig').textContent = stats.config_path || ''; document.getElementById('footerConfig').textContent = stats.config_path || '';
document.getElementById('footerData').textContent = stats.data_dir || ''; document.getElementById('footerData').textContent = stats.data_dir || '';
const modeEl = document.getElementById('footerMode');
modeEl.textContent = stats.mode || '—';
modeEl.style.color = stats.mode === 'recursive' ? 'var(--emerald)' : 'var(--amber)';
document.getElementById('footerDnssec').textContent = stats.dnssec ? 'on' : 'off'; document.getElementById('footerDnssec').textContent = stats.dnssec ? 'on' : 'off';
document.getElementById('footerDnssec').style.color = stats.dnssec ? 'var(--emerald)' : 'var(--text-dim)'; document.getElementById('footerDnssec').style.color = stats.dnssec ? 'var(--emerald)' : 'var(--text-dim)';
document.getElementById('footerSrtt').textContent = stats.srtt ? 'on' : 'off'; document.getElementById('footerSrtt').textContent = stats.srtt ? 'on' : 'off';
document.getElementById('footerSrtt').style.color = stats.srtt ? 'var(--emerald)' : 'var(--text-dim)'; document.getElementById('footerSrtt').style.color = stats.srtt ? 'var(--emerald)' : 'var(--text-dim)';
if (!document.getElementById('footerLogs').textContent) {
const isWin = stats.data_dir && stats.data_dir.includes(':\\');
const isMac = stats.data_dir && stats.data_dir.includes('/usr/local/');
const logsEl = document.getElementById('footerLogs');
logsEl.textContent = isWin
? stats.data_dir + '\\numa.log'
: isMac ? '/usr/local/var/log/numa.log'
: 'journalctl -u numa -f';
}
// LAN status indicator // LAN status indicator
const lanEl = document.getElementById('lanToggle'); const lanEl = document.getElementById('lanToggle');
@@ -1197,7 +1244,7 @@ async function refresh() {
// QPS calculation // QPS calculation
const now = Date.now(); const now = Date.now();
const encPct = encryptionPct(stats.transport); const encPct = encryptionPct(stats.transport, ['dot', 'doh'], ['udp', 'tcp', 'dot', 'doh']);
if (prevTotal !== null && prevTime !== null) { if (prevTotal !== null && prevTime !== null) {
const dt = (now - prevTime) / 1000; const dt = (now - prevTime) / 1000;
const dq = q.total - prevTotal; const dq = q.total - prevTotal;
@@ -1209,13 +1256,14 @@ async function refresh() {
prevTime = now; prevTime = now;
// Cache hit rate // Cache hit rate
const answered = q.cached + q.forwarded + q.recursive + q.coalesced + q.local + q.overridden; const answered = q.cached + q.forwarded + q.upstream + q.recursive + q.coalesced + q.local + q.overridden;
const hitRate = answered > 0 ? ((q.cached / answered) * 100).toFixed(1) : '0.0'; const hitRate = answered > 0 ? ((q.cached / answered) * 100).toFixed(1) : '0.0';
document.getElementById('cacheRate').textContent = hitRate + '%'; document.getElementById('cacheRate').textContent = hitRate + '%';
// Panels // Panels
renderPaths(q); renderPaths(q);
renderTransport(stats.transport); renderTransport(stats.transport);
renderUpstreamWire(stats.upstream_transport || { udp: 0, doh: 0, dot: 0, odoh: 0 });
renderQueryLog(logs); renderQueryLog(logs);
renderOverrides(overrides); renderOverrides(overrides);
renderCache(cache); renderCache(cache);
@@ -1225,6 +1273,7 @@ async function refresh() {
renderMemory(stats.memory, stats); renderMemory(stats.memory, stats);
} catch (err) { } catch (err) {
console.error('[numa dashboard] render failed:', err);
document.getElementById('statusDot').className = 'status-dot error'; document.getElementById('statusDot').className = 'status-dot error';
document.getElementById('statusText').textContent = 'disconnected'; document.getElementById('statusText').textContent = 'disconnected';
} }
@@ -1339,6 +1388,7 @@ function renderBlockingInfo(info) {
} }
function renderAllowlist(entries) { function renderAllowlist(entries) {
if (document.activeElement && document.activeElement.id === 'allowDomainInput') return;
const el = document.getElementById('blockingAllowlist'); const el = document.getElementById('blockingAllowlist');
const count = entries.length; const count = entries.length;
el.innerHTML = ` el.innerHTML = `
@@ -1498,14 +1548,14 @@ refresh();
setInterval(refresh, 2000); setInterval(refresh, 2000);
</script> </script>
<div style="text-align:center;padding:0.8rem;font-family:var(--font-mono);font-size:0.68rem;color:var(--text-dim);"> <div style="text-align:center;padding:0.8rem 0.8rem 0.4rem;font-family:var(--font-mono);font-size:0.68rem;color:var(--text-dim);line-height:1.8;">
Config: <span id="footerConfig" style="user-select:all;color:var(--emerald);"></span> Config: <span id="footerConfig" style="user-select:all;color:var(--emerald);"></span>
· Data: <span id="footerData" style="user-select:all;color:var(--emerald);"></span> · Data: <span id="footerData" style="user-select:all;color:var(--emerald);"></span>
· Upstream: <span id="footerUpstream" style="user-select:all;color:var(--emerald);"></span> · Logs: <span id="footerLogs" style="user-select:all;color:var(--emerald);"></span>
· Mode: <span id="footerMode" style="color:var(--text-dim);"></span> <br>
Upstream: <span id="footerUpstream" style="user-select:all;color:var(--emerald);"></span>
· DNSSEC: <span id="footerDnssec" style="color:var(--text-dim);"></span> · DNSSEC: <span id="footerDnssec" style="color:var(--text-dim);"></span>
· SRTT: <span id="footerSrtt" style="color:var(--text-dim);"></span> · SRTT: <span id="footerSrtt" style="color:var(--text-dim);"></span>
· Logs: <span style="user-select:all;color:var(--emerald);">macOS: /usr/local/var/log/numa.log · Linux: journalctl -u numa -f</span>
· <a href="https://github.com/razvandimescu/numa" target="_blank" rel="noopener" style="color:var(--amber);text-decoration:none;">GitHub</a> · <a href="https://github.com/razvandimescu/numa" target="_blank" rel="noopener" style="color:var(--amber);text-decoration:none;">GitHub</a>
</div> </div>

View File

@@ -160,6 +160,7 @@ struct QueryLogResponse {
#[derive(Serialize)] #[derive(Serialize)]
struct StatsResponse { struct StatsResponse {
version: &'static str,
uptime_secs: u64, uptime_secs: u64,
upstream: String, upstream: String,
mode: &'static str, // "recursive" or "forward" — never "auto" at runtime mode: &'static str, // "recursive" or "forward" — never "auto" at runtime
@@ -169,6 +170,7 @@ struct StatsResponse {
srtt: bool, srtt: bool,
queries: QueriesStats, queries: QueriesStats,
transport: TransportStats, transport: TransportStats,
upstream_transport: UpstreamTransportStats,
cache: CacheStats, cache: CacheStats,
overrides: OverrideStats, overrides: OverrideStats,
blocking: BlockingStatsResponse, blocking: BlockingStatsResponse,
@@ -185,6 +187,14 @@ struct TransportStats {
doh: u64, doh: u64,
} }
#[derive(Serialize)]
struct UpstreamTransportStats {
udp: u64,
doh: u64,
dot: u64,
odoh: u64,
}
#[derive(Serialize)] #[derive(Serialize)]
struct MobileStatsResponse { struct MobileStatsResponse {
enabled: bool, enabled: bool,
@@ -201,6 +211,7 @@ struct LanStatsResponse {
struct QueriesStats { struct QueriesStats {
total: u64, total: u64,
forwarded: u64, forwarded: u64,
upstream: u64,
recursive: u64, recursive: u64,
coalesced: u64, coalesced: u64,
cached: u64, cached: u64,
@@ -538,6 +549,7 @@ async fn stats(State(ctx): State<Arc<ServerCtx>>) -> Json<StatsResponse> {
}; };
Json(StatsResponse { Json(StatsResponse {
version: crate::version(),
uptime_secs: snap.uptime_secs, uptime_secs: snap.uptime_secs,
upstream, upstream,
mode: ctx.upstream_mode.as_str(), mode: ctx.upstream_mode.as_str(),
@@ -548,6 +560,7 @@ async fn stats(State(ctx): State<Arc<ServerCtx>>) -> Json<StatsResponse> {
queries: QueriesStats { queries: QueriesStats {
total: snap.total, total: snap.total,
forwarded: snap.forwarded, forwarded: snap.forwarded,
upstream: snap.upstream,
recursive: snap.recursive, recursive: snap.recursive,
coalesced: snap.coalesced, coalesced: snap.coalesced,
cached: snap.cached, cached: snap.cached,
@@ -562,6 +575,12 @@ async fn stats(State(ctx): State<Arc<ServerCtx>>) -> Json<StatsResponse> {
dot: snap.transport_dot, dot: snap.transport_dot,
doh: snap.transport_doh, doh: snap.transport_doh,
}, },
upstream_transport: UpstreamTransportStats {
udp: snap.upstream_transport_udp,
doh: snap.upstream_transport_doh,
dot: snap.upstream_transport_dot,
odoh: snap.upstream_transport_odoh,
},
cache: CacheStats { cache: CacheStats {
entries: cache_len, entries: cache_len,
max_entries: cache_max, max_entries: cache_max,

View File

@@ -1,5 +1,5 @@
use std::collections::HashSet; use std::collections::HashSet;
use std::time::Instant; use std::time::{Duration, Instant};
use log::{info, warn}; use log::{info, warn};
@@ -355,27 +355,144 @@ mod tests {
} }
} }
pub async fn download_blocklists(lists: &[String]) -> Vec<(String, String)> { const RETRY_DELAYS_SECS: &[u64] = &[2, 10, 30];
let client = reqwest::Client::builder()
.timeout(std::time::Duration::from_secs(30))
.gzip(true)
.build()
.unwrap_or_default();
let mut results = Vec::new(); pub async fn download_blocklists(
lists: &[String],
resolver: Option<std::sync::Arc<crate::bootstrap_resolver::NumaResolver>>,
) -> Vec<(String, String)> {
let mut builder = reqwest::Client::builder()
.timeout(Duration::from_secs(30))
.gzip(true);
if let Some(r) = resolver {
builder = builder.dns_resolver(r);
}
let client = builder.build().unwrap_or_default();
for url in lists { let fetches = lists.iter().map(|url| {
match client.get(url).send().await { let client = &client;
Ok(resp) => match resp.text().await { async move {
Ok(text) => { let text = fetch_with_retry(client, url).await?;
info!("downloaded blocklist: {} ({} bytes)", url, text.len()); info!("downloaded blocklist: {} ({} bytes)", url, text.len());
results.push((url.clone(), text)); Some((url.clone(), text))
} }
Err(e) => warn!("failed to read blocklist body {}: {}", url, e), });
}, futures::future::join_all(fetches)
Err(e) => warn!("failed to download blocklist {}: {}", url, e), .await
.into_iter()
.flatten()
.collect()
}
async fn fetch_with_retry(client: &reqwest::Client, url: &str) -> Option<String> {
fetch_with_retry_delays(client, url, RETRY_DELAYS_SECS).await
}
async fn fetch_with_retry_delays(
client: &reqwest::Client,
url: &str,
delays: &[u64],
) -> Option<String> {
let total = delays.len() + 1;
for attempt in 1..=total {
match fetch_once(client, url).await {
Ok(text) => return Some(text),
Err(msg) if attempt < total => {
let delay = delays[attempt - 1];
warn!(
"blocklist {} attempt {}/{} failed: {} — retrying in {}s",
url, attempt, total, msg, delay
);
tokio::time::sleep(Duration::from_secs(delay)).await;
}
Err(msg) => {
warn!(
"blocklist {} attempt {}/{} failed: {} — giving up",
url, attempt, total, msg
);
}
} }
} }
None
results }
async fn fetch_once(client: &reqwest::Client, url: &str) -> Result<String, String> {
let resp = client
.get(url)
.send()
.await
.map_err(|e| format_error_chain(&e))?;
resp.text().await.map_err(|e| format_error_chain(&e))
}
fn format_error_chain(e: &(dyn std::error::Error + 'static)) -> String {
let mut parts = vec![e.to_string()];
let mut src = e.source();
while let Some(s) = src {
parts.push(s.to_string());
src = s.source();
}
parts.join(": ")
}
#[cfg(test)]
mod retry_tests {
use super::*;
use std::net::SocketAddr;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpListener;
async fn flaky_http_server(drop_first_n: usize, body: &'static str) -> SocketAddr {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let addr = listener.local_addr().unwrap();
tokio::spawn(async move {
for _ in 0..drop_first_n {
if let Ok((sock, _)) = listener.accept().await {
drop(sock);
}
}
loop {
let Ok((mut sock, _)) = listener.accept().await else {
return;
};
tokio::spawn(async move {
let mut buf = [0u8; 2048];
let _ = sock.read(&mut buf).await;
let response = format!(
"HTTP/1.1 200 OK\r\nContent-Length: {}\r\nContent-Type: text/plain\r\nConnection: close\r\n\r\n{}",
body.len(),
body,
);
let _ = sock.write_all(response.as_bytes()).await;
let _ = sock.shutdown().await;
});
}
});
addr
}
fn zero_delays() -> Vec<u64> {
vec![0; RETRY_DELAYS_SECS.len()]
}
#[tokio::test]
async fn retry_succeeds_on_final_attempt() {
let body = "ads.example.com\ntracker.example.net\n";
let delays = zero_delays();
let addr = flaky_http_server(delays.len(), body).await;
let client = reqwest::Client::new();
let url = format!("http://{addr}/");
let result = fetch_with_retry_delays(&client, &url, &delays).await;
assert_eq!(result.as_deref(), Some(body));
}
#[tokio::test]
async fn retry_gives_up_when_all_attempts_fail() {
let delays = zero_delays();
let addr = flaky_http_server(delays.len() + 2, "unreachable").await;
let client = reqwest::Client::new();
let url = format!("http://{addr}/");
let result = fetch_with_retry_delays(&client, &url, &delays).await;
assert_eq!(result, None);
}
} }

235
src/bootstrap_resolver.rs Normal file
View File

@@ -0,0 +1,235 @@
//! `reqwest` DNS resolver used by numa-originated HTTPS (DoH upstream, ODoH
//! relay/target, blocklist CDN). When numa is its own system resolver
//! (`/etc/resolv.conf → 127.0.0.1`, HAOS add-on, Pi-hole-style container),
//! the default `getaddrinfo` path loops back through numa before numa can
//! answer — a chicken-and-egg that deadlocks cold boot. See issue #122 and
//! `docs/implementation/bootstrap-resolver.md`.
//!
//! Resolution order per hostname:
//! 1. Per-hostname overrides (e.g. ODoH `relay_ip` / `target_ip`) → return
//! immediately, no DNS query. Preserves ODoH's "zero plain-DNS leak"
//! property for configured endpoints.
//! 2. Otherwise, query A + AAAA in parallel via UDP to IP-literal bootstrap
//! servers, with TCP fallback on UDP timeout (for networks that block
//! outbound UDP:53 — see memory: `project_network_udp_hostile.md`).
use std::collections::BTreeMap;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use std::time::Duration;
use log::{debug, info, warn};
use reqwest::dns::{Addrs, Name, Resolve, Resolving};
use crate::forward::{forward_tcp, forward_udp};
use crate::packet::DnsPacket;
use crate::question::QueryType;
use crate::record::DnsRecord;
const UDP_TIMEOUT: Duration = Duration::from_millis(800);
const TCP_TIMEOUT: Duration = Duration::from_millis(1500);
const DEFAULT_BOOTSTRAP: &[SocketAddr] = &[
SocketAddr::new(IpAddr::V4(Ipv4Addr::new(9, 9, 9, 9)), 53),
SocketAddr::new(IpAddr::V4(Ipv4Addr::new(1, 1, 1, 1)), 53),
];
pub struct NumaResolver {
bootstrap: Vec<SocketAddr>,
overrides: BTreeMap<String, Vec<IpAddr>>,
}
impl NumaResolver {
/// Build a resolver from the configured `upstream.fallback` list and any
/// per-hostname overrides (e.g. ODoH's `relay_ip`/`target_ip`).
///
/// `fallback` entries are filtered to IP literals only — hostnames would
/// re-introduce the self-loop inside the resolver itself. Empty or
/// unusable fallback yields the hardcoded default (Quad9 + Cloudflare).
pub fn new(fallback: &[String], overrides: BTreeMap<String, Vec<IpAddr>>) -> Self {
let mut bootstrap: Vec<SocketAddr> = Vec::with_capacity(fallback.len());
for entry in fallback {
match crate::forward::parse_upstream_addr(entry, 53) {
Ok(addr) => bootstrap.push(addr),
Err(_) => {
warn!(
"bootstrap_resolver: skipping non-IP fallback '{}' \
(hostnames would re-enter the self-loop)",
entry
);
}
}
}
let source = if bootstrap.is_empty() {
bootstrap = DEFAULT_BOOTSTRAP.to_vec();
"default (no IP-literal in upstream.fallback)"
} else {
"upstream.fallback"
};
let ips: Vec<String> = bootstrap.iter().map(|s| s.ip().to_string()).collect();
info!(
"bootstrap resolver: {} via {} — used for numa-originated HTTPS hostname resolution",
ips.join(", "),
source
);
if !overrides.is_empty() {
let pairs: Vec<String> = overrides
.iter()
.flat_map(|(host, addrs)| addrs.iter().map(move |ip| format!("{}={}", host, ip)))
.collect();
info!(
"bootstrap resolver: host overrides (skip DNS, connect direct): {}",
pairs.join(", ")
);
}
Self {
bootstrap,
overrides,
}
}
#[cfg(test)]
pub fn bootstrap(&self) -> &[SocketAddr] {
&self.bootstrap
}
}
impl Resolve for NumaResolver {
fn resolve(&self, name: Name) -> Resolving {
let hostname = name.as_str().to_string();
if let Some(ips) = self.overrides.get(&hostname) {
let addrs: Vec<SocketAddr> = ips.iter().map(|ip| SocketAddr::new(*ip, 0)).collect();
debug!(
"bootstrap_resolver: override hit for {} → {:?}",
hostname, ips
);
return Box::pin(async move { Ok(Box::new(addrs.into_iter()) as Addrs) });
}
let bootstrap = self.bootstrap.clone();
Box::pin(async move {
let addrs = resolve_via_bootstrap(&hostname, &bootstrap).await?;
debug!(
"bootstrap_resolver: resolved {} → {} addr(s)",
hostname,
addrs.len()
);
Ok(Box::new(addrs.into_iter()) as Addrs)
})
}
}
async fn resolve_via_bootstrap(
hostname: &str,
bootstrap: &[SocketAddr],
) -> Result<Vec<SocketAddr>, Box<dyn std::error::Error + Send + Sync>> {
let mut last_err: Option<String> = None;
for &server in bootstrap {
let q_a = DnsPacket::query(0xBEEF, hostname, QueryType::A);
let q_aaaa = DnsPacket::query(0xBEF0, hostname, QueryType::AAAA);
let (a_res, aaaa_res) = tokio::join!(
query_with_tcp_fallback(&q_a, server),
query_with_tcp_fallback(&q_aaaa, server),
);
let mut out = Vec::new();
match a_res {
Ok(pkt) => extract_addrs(&pkt, &mut out),
Err(e) => last_err = Some(format!("{} A failed: {}", server, e)),
}
match aaaa_res {
Ok(pkt) => extract_addrs(&pkt, &mut out),
// AAAA is optional — many hosts return NXDOMAIN/empty. Don't
// treat as the primary error if A succeeded.
Err(e) => debug!("bootstrap {} AAAA for {} failed: {}", server, hostname, e),
}
if !out.is_empty() {
return Ok(out);
}
}
Err(last_err
.unwrap_or_else(|| "no bootstrap servers reachable".into())
.into())
}
async fn query_with_tcp_fallback(
query: &DnsPacket,
server: SocketAddr,
) -> crate::Result<DnsPacket> {
match forward_udp(query, server, UDP_TIMEOUT).await {
Ok(pkt) => Ok(pkt),
Err(e) => {
debug!(
"bootstrap UDP {} failed ({}), falling back to TCP",
server, e
);
forward_tcp(query, server, TCP_TIMEOUT).await
}
}
}
fn extract_addrs(pkt: &DnsPacket, out: &mut Vec<SocketAddr>) {
for r in &pkt.answers {
match r {
DnsRecord::A { addr, .. } => out.push(SocketAddr::new(IpAddr::V4(*addr), 0)),
DnsRecord::AAAA { addr, .. } => out.push(SocketAddr::new(IpAddr::V6(*addr), 0)),
_ => {}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::net::{Ipv4Addr, Ipv6Addr};
#[test]
fn empty_fallback_uses_defaults() {
let r = NumaResolver::new(&[], BTreeMap::new());
let got: Vec<String> = r.bootstrap().iter().map(|s| s.to_string()).collect();
assert_eq!(got, vec!["9.9.9.9:53", "1.1.1.1:53"]);
}
#[test]
fn fallback_accepts_ip_literals_only() {
let fallback = vec![
"9.9.9.9".to_string(),
"dns.quad9.net".to_string(),
"1.1.1.1:5353".to_string(),
];
let r = NumaResolver::new(&fallback, BTreeMap::new());
let got: Vec<String> = r.bootstrap().iter().map(|s| s.to_string()).collect();
assert_eq!(got, vec!["9.9.9.9:53", "1.1.1.1:5353"]);
}
#[test]
fn override_returns_configured_ips_without_dns() {
let mut overrides = BTreeMap::new();
overrides.insert(
"odoh-relay.example".to_string(),
vec![IpAddr::V4(Ipv4Addr::new(178, 104, 229, 30))],
);
let r = NumaResolver::new(&[], overrides);
let name: Name = "odoh-relay.example".parse().unwrap();
let fut = r.resolve(name);
let res = futures::executor::block_on(fut).unwrap();
let addrs: Vec<_> = res.collect();
assert_eq!(addrs.len(), 1);
assert_eq!(addrs[0].ip(), IpAddr::V4(Ipv4Addr::new(178, 104, 229, 30)));
}
#[test]
fn override_supports_multiple_ips_including_ipv6() {
let mut overrides = BTreeMap::new();
overrides.insert(
"dual.example".to_string(),
vec![
IpAddr::V4(Ipv4Addr::new(1, 2, 3, 4)),
IpAddr::V6(Ipv6Addr::LOCALHOST),
],
);
let r = NumaResolver::new(&[], overrides);
let res = futures::executor::block_on(r.resolve("dual.example".parse().unwrap())).unwrap();
let addrs: Vec<_> = res.collect();
assert_eq!(addrs.len(), 2);
}
}

View File

@@ -1,7 +1,7 @@
use std::collections::HashMap; use std::collections::HashMap;
use std::net::Ipv4Addr; use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr};
use std::net::Ipv6Addr;
use std::path::{Path, PathBuf}; use std::path::{Path, PathBuf};
use std::time::Duration;
use serde::Deserialize; use serde::Deserialize;
@@ -41,17 +41,30 @@ pub struct Config {
pub struct ForwardingRuleConfig { pub struct ForwardingRuleConfig {
#[serde(deserialize_with = "string_or_vec")] #[serde(deserialize_with = "string_or_vec")]
pub suffix: Vec<String>, pub suffix: Vec<String>,
pub upstream: String, #[serde(deserialize_with = "string_or_vec")]
pub upstream: Vec<String>,
} }
impl ForwardingRuleConfig { impl ForwardingRuleConfig {
fn to_runtime_rules(&self) -> Result<Vec<crate::system_dns::ForwardingRule>> { fn to_runtime_rules(&self) -> Result<Vec<crate::system_dns::ForwardingRule>> {
let upstream = crate::forward::parse_upstream(&self.upstream, 53) if self.upstream.is_empty() {
.map_err(|e| format!("forwarding rule for upstream '{}': {}", self.upstream, e))?; return Err(format!(
"forwarding rule for suffix {:?}: upstream must not be empty",
self.suffix
)
.into());
}
let mut primary = Vec::with_capacity(self.upstream.len());
for s in &self.upstream {
let u = crate::forward::parse_upstream(s, 53, None)
.map_err(|e| format!("forwarding rule for upstream '{}': {}", s, e))?;
primary.push(u);
}
let pool = crate::forward::UpstreamPool::new(primary, vec![]);
Ok(self Ok(self
.suffix .suffix
.iter() .iter()
.map(|s| crate::system_dns::ForwardingRule::new(s.clone(), upstream.clone())) .map(|s| crate::system_dns::ForwardingRule::new(s.clone(), pool.clone()))
.collect()) .collect())
} }
} }
@@ -80,6 +93,12 @@ pub struct ServerConfig {
/// Defaults to `crate::data_dir()` (platform-specific system path) if unset. /// Defaults to `crate::data_dir()` (platform-specific system path) if unset.
#[serde(default)] #[serde(default)]
pub data_dir: Option<PathBuf>, pub data_dir: Option<PathBuf>,
/// Synthesize NODATA (NOERROR + empty answer) for AAAA queries, and
/// strip `ipv6hint` from HTTPS/SVCB responses (RFC 9460). For IPv4-only
/// networks where Happy Eyeballs fallback adds latency. Local zones,
/// overrides, and the service proxy are not affected. Default false.
#[serde(default)]
pub filter_aaaa: bool,
} }
impl Default for ServerConfig { impl Default for ServerConfig {
@@ -89,6 +108,7 @@ impl Default for ServerConfig {
api_port: default_api_port(), api_port: default_api_port(),
api_bind_addr: default_api_bind_addr(), api_bind_addr: default_api_bind_addr(),
data_dir: None, data_dir: None,
filter_aaaa: false,
} }
} }
} }
@@ -114,6 +134,7 @@ pub enum UpstreamMode {
#[default] #[default]
Forward, Forward,
Recursive, Recursive,
Odoh,
} }
impl UpstreamMode { impl UpstreamMode {
@@ -122,6 +143,20 @@ impl UpstreamMode {
UpstreamMode::Auto => "auto", UpstreamMode::Auto => "auto",
UpstreamMode::Forward => "forward", UpstreamMode::Forward => "forward",
UpstreamMode::Recursive => "recursive", UpstreamMode::Recursive => "recursive",
UpstreamMode::Odoh => "odoh",
}
}
/// Hedging duplicates the in-flight query against the same upstream to
/// rescue tail latency. Beneficial for UDP/DoH/DoT (cheap retransmit /
/// h2 stream multiplexing). For ODoH it doubles the relay's HPKE
/// seal/unseal load and the sealed-byte footprint a passive observer
/// can correlate, with no latency win — the relay hop dominates either
/// way. Force-zero in oblivious mode regardless of `hedge_ms`.
pub fn hedge_delay(self, hedge_ms: u64) -> Duration {
match self {
UpstreamMode::Odoh => Duration::ZERO,
_ => Duration::from_millis(hedge_ms),
} }
} }
} }
@@ -134,7 +169,7 @@ pub struct UpstreamConfig {
pub address: Vec<String>, pub address: Vec<String>,
#[serde(default = "default_upstream_port")] #[serde(default = "default_upstream_port")]
pub port: u16, pub port: u16,
#[serde(default)] #[serde(default, deserialize_with = "string_or_vec")]
pub fallback: Vec<String>, pub fallback: Vec<String>,
#[serde(default = "default_timeout_ms")] #[serde(default = "default_timeout_ms")]
pub timeout_ms: u64, pub timeout_ms: u64,
@@ -146,6 +181,30 @@ pub struct UpstreamConfig {
pub prime_tlds: Vec<String>, pub prime_tlds: Vec<String>,
#[serde(default = "default_srtt")] #[serde(default = "default_srtt")]
pub srtt: bool, pub srtt: bool,
/// Only used when `mode = "odoh"`. Full https:// URL of the relay
/// endpoint (including path, e.g. `https://odoh-relay.numa.rs/relay`).
#[serde(default)]
pub relay: Option<String>,
/// Only used when `mode = "odoh"`. Full https:// URL of the target
/// resolver (`https://odoh.cloudflare-dns.com/dns-query`).
#[serde(default)]
pub target: Option<String>,
/// Only used when `mode = "odoh"`. When true (the default), relay failure
/// returns SERVFAIL instead of downgrading to the `fallback` upstream —
/// a user who configured ODoH rarely wants a silent non-oblivious path.
#[serde(default)]
pub strict: Option<bool>,
/// Bootstrap IP for the relay host, used when numa is its own system
/// resolver (otherwise the ODoH HTTPS client loops resolving through
/// itself). TLS still validates the cert against `relay`'s hostname.
#[serde(default)]
pub relay_ip: Option<IpAddr>,
/// Same as `relay_ip` but for the target host.
#[serde(default)]
pub target_ip: Option<IpAddr>,
} }
impl Default for UpstreamConfig { impl Default for UpstreamConfig {
@@ -160,10 +219,128 @@ impl Default for UpstreamConfig {
root_hints: default_root_hints(), root_hints: default_root_hints(),
prime_tlds: default_prime_tlds(), prime_tlds: default_prime_tlds(),
srtt: default_srtt(), srtt: default_srtt(),
relay: None,
target: None,
strict: None,
relay_ip: None,
target_ip: None,
} }
} }
} }
/// Parsed ODoH config fields. `mode = "odoh"` requires both URLs to be
/// present, to parse as `https://`, and to resolve to distinct hosts.
#[derive(Debug)]
pub struct OdohUpstream {
pub relay_url: String,
pub relay_host: String,
pub target_host: String,
pub target_path: String,
pub strict: bool,
pub relay_bootstrap: Option<SocketAddr>,
pub target_bootstrap: Option<SocketAddr>,
}
impl OdohUpstream {
/// Per-host IP overrides for the bootstrap resolver, lifted from
/// `relay_ip`/`target_ip`. Keeps the "zero plain-DNS leak for ODoH
/// endpoints" property when numa is its own system resolver.
pub fn host_ip_overrides(&self) -> std::collections::BTreeMap<String, Vec<std::net::IpAddr>> {
let mut out = std::collections::BTreeMap::new();
if let Some(addr) = self.relay_bootstrap {
out.entry(self.relay_host.clone())
.or_insert_with(Vec::new)
.push(addr.ip());
}
if let Some(addr) = self.target_bootstrap {
out.entry(self.target_host.clone())
.or_insert_with(Vec::new)
.push(addr.ip());
}
out
}
}
impl UpstreamConfig {
/// Validate and extract ODoH-specific fields. Called during `load_config`
/// so misconfigured ODoH fails fast at startup, the same care we take
/// with the DNSSEC strict boot check.
pub fn odoh_upstream(&self) -> Result<OdohUpstream> {
let relay = self
.relay
.as_deref()
.ok_or("mode = \"odoh\" requires upstream.relay")?;
let target = self
.target
.as_deref()
.ok_or("mode = \"odoh\" requires upstream.target")?;
let relay_url = reqwest::Url::parse(relay)
.map_err(|e| format!("upstream.relay invalid URL '{}': {}", relay, e))?;
let target_url = reqwest::Url::parse(target)
.map_err(|e| format!("upstream.target invalid URL '{}': {}", target, e))?;
if relay_url.scheme() != "https" || target_url.scheme() != "https" {
return Err("upstream.relay and upstream.target must both use https://".into());
}
let relay_host = relay_url
.host_str()
.ok_or("upstream.relay must include a host")?
.to_string();
let target_host = target_url
.host_str()
.ok_or("upstream.target must include a host")?
.to_string();
if relay_host == target_host {
return Err(format!(
"upstream.relay and upstream.target resolve to the same host ({}); the privacy property requires distinct operators",
relay_host
)
.into());
}
if let Some(shared) = shared_registrable_domain(&relay_host, &target_host) {
return Err(format!(
"upstream.relay ({}) and upstream.target ({}) share the registrable domain ({}); the privacy property requires distinct operators",
relay_host, target_host, shared
)
.into());
}
let target_path = if target_url.path().is_empty() {
"/".to_string()
} else {
target_url.path().to_string()
};
let relay_port = relay_url.port_or_known_default().unwrap_or(443);
let target_port = target_url.port_or_known_default().unwrap_or(443);
Ok(OdohUpstream {
relay_url: relay.to_string(),
relay_host,
target_host,
target_path,
strict: self.strict.unwrap_or(true),
relay_bootstrap: self.relay_ip.map(|ip| SocketAddr::new(ip, relay_port)),
target_bootstrap: self.target_ip.map(|ip| SocketAddr::new(ip, target_port)),
})
}
}
/// Returns the registrable domain (eTLD+1) shared by both hosts, if any.
/// Fails open on hosts the PSL can't parse (IP literals, bare TLDs).
fn shared_registrable_domain(relay_host: &str, target_host: &str) -> Option<String> {
let relay = psl::domain(relay_host.as_bytes())?;
let target = psl::domain(target_host.as_bytes())?;
if relay.as_bytes() == target.as_bytes() {
std::str::from_utf8(relay.as_bytes())
.ok()
.map(str::to_owned)
} else {
None
}
}
fn string_or_vec<'de, D>(deserializer: D) -> std::result::Result<Vec<String>, D::Error> fn string_or_vec<'de, D>(deserializer: D) -> std::result::Result<Vec<String>, D::Error>
where where
D: serde::Deserializer<'de>, D: serde::Deserializer<'de>,
@@ -567,6 +744,17 @@ mod tests {
assert!(config.lan.enabled); assert!(config.lan.enabled);
} }
#[test]
fn filter_aaaa_defaults_false() {
assert!(!ServerConfig::default().filter_aaaa);
}
#[test]
fn filter_aaaa_parses_from_server_section() {
let config: Config = toml::from_str("[server]\nfilter_aaaa = true").unwrap();
assert!(config.server.filter_aaaa);
}
#[test] #[test]
fn custom_bind_addrs_parse() { fn custom_bind_addrs_parse() {
let toml = r#" let toml = r#"
@@ -612,12 +800,22 @@ mod tests {
} }
#[test] #[test]
fn fallback_parses() { fn fallback_array_parses() {
let config: Config = let config: Config =
toml::from_str("[upstream]\nfallback = [\"8.8.8.8\", \"1.1.1.1\"]").unwrap(); toml::from_str("[upstream]\nfallback = [\"8.8.8.8\", \"1.1.1.1\"]").unwrap();
assert_eq!(config.upstream.fallback, vec!["8.8.8.8", "1.1.1.1"]); assert_eq!(config.upstream.fallback, vec!["8.8.8.8", "1.1.1.1"]);
} }
#[test]
fn fallback_string_parses_as_singleton_vec() {
let config: Config =
toml::from_str("[upstream]\nfallback = \"tls://1.1.1.1#cloudflare-dns.com\"").unwrap();
assert_eq!(
config.upstream.fallback,
vec!["tls://1.1.1.1#cloudflare-dns.com"]
);
}
#[test] #[test]
fn empty_address_gives_empty_vec() { fn empty_address_gives_empty_vec() {
let config: Config = toml::from_str("").unwrap(); let config: Config = toml::from_str("").unwrap();
@@ -625,6 +823,222 @@ mod tests {
assert!(config.upstream.fallback.is_empty()); assert!(config.upstream.fallback.is_empty());
} }
// ── [upstream] mode = "odoh" ────────────────────────────────────────
#[test]
fn odoh_config_parses_and_validates() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
assert!(matches!(config.upstream.mode, UpstreamMode::Odoh));
let odoh = config.upstream.odoh_upstream().unwrap();
assert_eq!(odoh.relay_url, "https://odoh-relay.numa.rs/relay");
assert_eq!(odoh.target_host, "odoh.cloudflare-dns.com");
assert_eq!(odoh.target_path, "/dns-query");
assert!(odoh.strict, "strict defaults to true under mode=odoh");
}
#[test]
fn odoh_strict_false_is_honoured() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
strict = false
"#;
let config: Config = toml::from_str(toml).unwrap();
assert!(!config.upstream.odoh_upstream().unwrap().strict);
}
#[test]
fn odoh_rejects_same_host_relay_and_target() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh.example.com/relay"
target = "https://odoh.example.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("same host"), "got: {err}");
}
#[test]
fn odoh_rejects_shared_registrable_domain() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://r.cloudflare.com/relay"
target = "https://odoh.cloudflare.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("registrable domain"), "got: {err}");
assert!(err.contains("cloudflare.com"), "got: {err}");
}
#[test]
fn odoh_rejects_shared_registrable_under_multi_label_suffix() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://a.foo.co.uk/relay"
target = "https://b.foo.co.uk/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("foo.co.uk"), "got: {err}");
}
#[test]
fn odoh_accepts_distinct_registrable_under_multi_label_suffix() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://relay.foo.co.uk/relay"
target = "https://target.bar.co.uk/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
assert!(config.upstream.odoh_upstream().is_ok());
}
#[test]
fn odoh_accepts_distinct_private_psl_suffix_subdomains() {
// *.github.io is a public suffix, so foo.github.io and bar.github.io
// are independent registrable domains — accept.
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://foo.github.io/relay"
target = "https://bar.github.io/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
assert!(config.upstream.odoh_upstream().is_ok());
}
#[test]
fn odoh_rejects_non_https() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "http://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("https"), "got: {err}");
}
#[test]
fn odoh_missing_relay_rejected() {
let toml = r#"
[upstream]
mode = "odoh"
target = "https://odoh.cloudflare-dns.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("upstream.relay"), "got: {err}");
}
#[test]
fn odoh_bootstrap_ips_parse_into_socket_addrs() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
relay_ip = "178.104.229.30"
target_ip = "104.16.249.249"
"#;
let config: Config = toml::from_str(toml).unwrap();
let odoh = config.upstream.odoh_upstream().unwrap();
assert_eq!(odoh.relay_host, "odoh-relay.numa.rs");
assert_eq!(
odoh.relay_bootstrap.unwrap().to_string(),
"178.104.229.30:443"
);
assert_eq!(
odoh.target_bootstrap.unwrap().to_string(),
"104.16.249.249:443"
);
}
#[test]
fn odoh_bootstrap_ips_optional() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let odoh = config.upstream.odoh_upstream().unwrap();
assert!(odoh.relay_bootstrap.is_none());
assert!(odoh.target_bootstrap.is_none());
}
#[test]
fn odoh_bootstrap_ip_rejects_garbage() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
relay_ip = "not-an-ip"
"#;
let err = toml::from_str::<Config>(toml).err().unwrap().to_string();
assert!(err.contains("relay_ip"), "got: {err}");
}
#[test]
fn odoh_bootstrap_uses_url_port_when_non_default() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs:8443/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
relay_ip = "178.104.229.30"
"#;
let config: Config = toml::from_str(toml).unwrap();
let odoh = config.upstream.odoh_upstream().unwrap();
assert_eq!(
odoh.relay_bootstrap.unwrap().to_string(),
"178.104.229.30:8443"
);
}
#[test]
fn hedge_delay_zeroed_for_odoh_mode() {
assert_eq!(
UpstreamMode::Odoh.hedge_delay(50),
Duration::ZERO,
"ODoH mode must zero hedge regardless of configured hedge_ms"
);
assert_eq!(
UpstreamMode::Forward.hedge_delay(50),
Duration::from_millis(50),
"non-ODoH modes honour configured hedge_ms"
);
}
#[test]
fn odoh_missing_target_rejected() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("upstream.target"), "got: {err}");
}
// ── issue #82: [[forwarding]] config section ──────────────────────── // ── issue #82: [[forwarding]] config section ────────────────────────
#[test] #[test]
@@ -643,7 +1057,7 @@ mod tests {
let config: Config = toml::from_str(toml).unwrap(); let config: Config = toml::from_str(toml).unwrap();
assert_eq!(config.forwarding.len(), 1); assert_eq!(config.forwarding.len(), 1);
assert_eq!(config.forwarding[0].suffix, &["home.local"]); assert_eq!(config.forwarding[0].suffix, &["home.local"]);
assert_eq!(config.forwarding[0].upstream, "100.90.1.63:5361"); assert_eq!(config.forwarding[0].upstream, vec!["100.90.1.63:5361"]);
} }
#[test] #[test]
@@ -671,7 +1085,7 @@ mod tests {
"#; "#;
let config: Config = toml::from_str(toml).unwrap(); let config: Config = toml::from_str(toml).unwrap();
assert_eq!(config.forwarding.len(), 2); assert_eq!(config.forwarding.len(), 2);
assert_eq!(config.forwarding[1].upstream, "10.0.0.1"); assert_eq!(config.forwarding[1].upstream, vec!["10.0.0.1"]);
} }
#[test] #[test]
@@ -693,28 +1107,29 @@ mod tests {
fn forwarding_suffix_array_expands_to_multiple_runtime_rules() { fn forwarding_suffix_array_expands_to_multiple_runtime_rules() {
let rule = ForwardingRuleConfig { let rule = ForwardingRuleConfig {
suffix: vec!["168.192.in-addr.arpa".to_string(), "onsite".to_string()], suffix: vec!["168.192.in-addr.arpa".to_string(), "onsite".to_string()],
upstream: "192.168.88.1".to_string(), upstream: vec!["192.168.88.1".to_string()],
}; };
let runtime = rule.to_runtime_rules().unwrap(); let runtime = rule.to_runtime_rules().unwrap();
assert_eq!(runtime.len(), 2); assert_eq!(runtime.len(), 2);
assert_eq!(runtime[0].suffix, "168.192.in-addr.arpa"); assert_eq!(runtime[0].suffix, "168.192.in-addr.arpa");
assert_eq!(runtime[1].suffix, "onsite"); assert_eq!(runtime[1].suffix, "onsite");
assert_eq!(runtime[0].upstream, runtime[1].upstream); assert_eq!(
runtime[0].upstream.preferred(),
runtime[1].upstream.preferred()
);
} }
#[test] #[test]
fn forwarding_upstream_with_explicit_port() { fn forwarding_upstream_with_explicit_port() {
let rule = ForwardingRuleConfig { let rule = ForwardingRuleConfig {
suffix: vec!["home.local".to_string()], suffix: vec!["home.local".to_string()],
upstream: "100.90.1.63:5361".to_string(), upstream: vec!["100.90.1.63:5361".to_string()],
}; };
let runtime = rule.to_runtime_rules().unwrap(); let runtime = rule.to_runtime_rules().unwrap();
assert_eq!(runtime.len(), 1); assert_eq!(runtime.len(), 1);
assert!(matches!( let preferred = runtime[0].upstream.preferred().unwrap();
runtime[0].upstream, assert!(matches!(preferred, crate::forward::Upstream::Udp(_)));
crate::forward::Upstream::Udp(_) assert_eq!(preferred.to_string(), "100.90.1.63:5361");
));
assert_eq!(runtime[0].upstream.to_string(), "100.90.1.63:5361");
assert_eq!(runtime[0].suffix, "home.local"); assert_eq!(runtime[0].suffix, "home.local");
} }
@@ -722,17 +1137,20 @@ mod tests {
fn forwarding_upstream_defaults_to_port_53() { fn forwarding_upstream_defaults_to_port_53() {
let rule = ForwardingRuleConfig { let rule = ForwardingRuleConfig {
suffix: vec!["home.local".to_string()], suffix: vec!["home.local".to_string()],
upstream: "100.90.1.63".to_string(), upstream: vec!["100.90.1.63".to_string()],
}; };
let runtime = rule.to_runtime_rules().unwrap(); let runtime = rule.to_runtime_rules().unwrap();
assert_eq!(runtime[0].upstream.to_string(), "100.90.1.63:53"); assert_eq!(
runtime[0].upstream.preferred().unwrap().to_string(),
"100.90.1.63:53"
);
} }
#[test] #[test]
fn forwarding_invalid_upstream_returns_error() { fn forwarding_invalid_upstream_returns_error() {
let rule = ForwardingRuleConfig { let rule = ForwardingRuleConfig {
suffix: vec!["home.local".to_string()], suffix: vec!["home.local".to_string()],
upstream: "not-a-valid-host".to_string(), upstream: vec!["not-a-valid-host".to_string()],
}; };
assert!(rule.to_runtime_rules().is_err()); assert!(rule.to_runtime_rules().is_err());
} }
@@ -741,14 +1159,14 @@ mod tests {
fn forwarding_upstream_accepts_dot_scheme() { fn forwarding_upstream_accepts_dot_scheme() {
let rule = ForwardingRuleConfig { let rule = ForwardingRuleConfig {
suffix: vec!["google.com".to_string()], suffix: vec!["google.com".to_string()],
upstream: "tls://9.9.9.9#dns.quad9.net".to_string(), upstream: vec!["tls://9.9.9.9#dns.quad9.net".to_string()],
}; };
let runtime = rule let runtime = rule
.to_runtime_rules() .to_runtime_rules()
.expect("tls:// upstream should parse"); .expect("tls:// upstream should parse");
assert_eq!(runtime.len(), 1); assert_eq!(runtime.len(), 1);
assert_eq!( assert_eq!(
runtime[0].upstream.to_string(), runtime[0].upstream.preferred().unwrap().to_string(),
"tls://9.9.9.9:853#dns.quad9.net" "tls://9.9.9.9:853#dns.quad9.net"
); );
} }
@@ -757,14 +1175,14 @@ mod tests {
fn forwarding_upstream_accepts_doh_scheme() { fn forwarding_upstream_accepts_doh_scheme() {
let rule = ForwardingRuleConfig { let rule = ForwardingRuleConfig {
suffix: vec!["goog".to_string()], suffix: vec!["goog".to_string()],
upstream: "https://dns.quad9.net/dns-query".to_string(), upstream: vec!["https://dns.quad9.net/dns-query".to_string()],
}; };
let runtime = rule let runtime = rule
.to_runtime_rules() .to_runtime_rules()
.expect("https:// upstream should parse"); .expect("https:// upstream should parse");
assert_eq!(runtime.len(), 1); assert_eq!(runtime.len(), 1);
assert_eq!( assert_eq!(
runtime[0].upstream.to_string(), runtime[0].upstream.preferred().unwrap().to_string(),
"https://dns.quad9.net/dns-query" "https://dns.quad9.net/dns-query"
); );
} }
@@ -773,44 +1191,90 @@ mod tests {
fn forwarding_config_rules_take_precedence_over_discovered() { fn forwarding_config_rules_take_precedence_over_discovered() {
let config_rules = vec![ForwardingRuleConfig { let config_rules = vec![ForwardingRuleConfig {
suffix: vec!["home.local".to_string()], suffix: vec!["home.local".to_string()],
upstream: "10.0.0.1:53".to_string(), upstream: vec!["10.0.0.1:53".to_string()],
}]; }];
let discovered = vec![crate::system_dns::ForwardingRule::new( let discovered = vec![crate::system_dns::ForwardingRule::new(
"home.local".to_string(), "home.local".to_string(),
crate::forward::Upstream::Udp("192.168.1.1:53".parse().unwrap()), crate::forward::UpstreamPool::new(
vec![crate::forward::Upstream::Udp(
"192.168.1.1:53".parse().unwrap(),
)],
vec![],
),
)]; )];
let merged = merge_forwarding_rules(&config_rules, discovered).unwrap(); let merged = merge_forwarding_rules(&config_rules, discovered).unwrap();
let picked = crate::system_dns::match_forwarding_rule("host.home.local", &merged) let picked = crate::system_dns::match_forwarding_rule("host.home.local", &merged)
.expect("rule should match"); .expect("rule should match");
assert_eq!(picked.to_string(), "10.0.0.1:53"); assert_eq!(picked.preferred().unwrap().to_string(), "10.0.0.1:53");
} }
#[test] #[test]
fn forwarding_merge_preserves_non_overlapping_discovered() { fn forwarding_merge_preserves_non_overlapping_discovered() {
let config_rules = vec![ForwardingRuleConfig { let config_rules = vec![ForwardingRuleConfig {
suffix: vec!["home.local".to_string()], suffix: vec!["home.local".to_string()],
upstream: "10.0.0.1:53".to_string(), upstream: vec!["10.0.0.1:53".to_string()],
}]; }];
let discovered = vec![crate::system_dns::ForwardingRule::new( let discovered = vec![crate::system_dns::ForwardingRule::new(
"corp.example".to_string(), "corp.example".to_string(),
crate::forward::Upstream::Udp("192.168.1.1:53".parse().unwrap()), crate::forward::UpstreamPool::new(
vec![crate::forward::Upstream::Udp(
"192.168.1.1:53".parse().unwrap(),
)],
vec![],
),
)]; )];
let merged = merge_forwarding_rules(&config_rules, discovered).unwrap(); let merged = merge_forwarding_rules(&config_rules, discovered).unwrap();
assert_eq!(merged.len(), 2); assert_eq!(merged.len(), 2);
let picked = crate::system_dns::match_forwarding_rule("host.corp.example", &merged) let picked = crate::system_dns::match_forwarding_rule("host.corp.example", &merged)
.expect("discovered rule should still match"); .expect("discovered rule should still match");
assert_eq!(picked.to_string(), "192.168.1.1:53"); assert_eq!(picked.preferred().unwrap().to_string(), "192.168.1.1:53");
} }
#[test] #[test]
fn forwarding_merge_suffix_array_expands_to_multiple_rules() { fn forwarding_merge_suffix_array_expands_to_multiple_rules() {
let config_rules = vec![ForwardingRuleConfig { let config_rules = vec![ForwardingRuleConfig {
suffix: vec!["a.local".to_string(), "b.local".to_string()], suffix: vec!["a.local".to_string(), "b.local".to_string()],
upstream: "10.0.0.1:53".to_string(), upstream: vec!["10.0.0.1:53".to_string()],
}]; }];
let merged = merge_forwarding_rules(&config_rules, vec![]).unwrap(); let merged = merge_forwarding_rules(&config_rules, vec![]).unwrap();
assert_eq!(merged.len(), 2); assert_eq!(merged.len(), 2);
} }
#[test]
fn forwarding_parses_upstream_array() {
let toml = r#"
[[forwarding]]
suffix = "google.com"
upstream = ["tls://9.9.9.9#dns.quad9.net", "tls://149.112.112.112#dns.quad9.net"]
"#;
let config: Config = toml::from_str(toml).unwrap();
assert_eq!(config.forwarding.len(), 1);
assert_eq!(config.forwarding[0].upstream.len(), 2);
}
#[test]
fn forwarding_upstream_array_builds_pool_with_multiple_primaries() {
let rule = ForwardingRuleConfig {
suffix: vec!["google.com".to_string()],
upstream: vec![
"tls://9.9.9.9#dns.quad9.net".to_string(),
"tls://149.112.112.112#dns.quad9.net".to_string(),
],
};
let runtime = rule.to_runtime_rules().unwrap();
assert_eq!(runtime.len(), 1);
let label = runtime[0].upstream.label();
assert!(label.contains("+1 more"), "label was: {}", label);
}
#[test]
fn forwarding_empty_upstream_array_errors() {
let rule = ForwardingRuleConfig {
suffix: vec!["home.local".to_string()],
upstream: vec![],
};
assert!(rule.to_runtime_rules().is_err());
}
} }
pub struct ConfigLoad { pub struct ConfigLoad {

View File

@@ -16,7 +16,9 @@ use crate::blocklist::BlocklistStore;
use crate::buffer::BytePacketBuffer; use crate::buffer::BytePacketBuffer;
use crate::cache::{DnsCache, DnssecStatus}; use crate::cache::{DnsCache, DnssecStatus};
use crate::config::{UpstreamMode, ZoneMap}; use crate::config::{UpstreamMode, ZoneMap};
use crate::forward::{forward_query_raw, forward_with_failover_raw, Upstream, UpstreamPool}; #[cfg(test)]
use crate::forward::Upstream;
use crate::forward::{forward_with_failover_raw, UpstreamPool};
use crate::header::ResultCode; use crate::header::ResultCode;
use crate::health::HealthMeta; use crate::health::HealthMeta;
use crate::lan::PeerStore; use crate::lan::PeerStore;
@@ -75,6 +77,10 @@ pub struct ServerCtx {
pub ca_pem: Option<String>, pub ca_pem: Option<String>,
pub mobile_enabled: bool, pub mobile_enabled: bool,
pub mobile_port: u16, pub mobile_port: u16,
/// When true, AAAA queries short-circuit with NODATA (NOERROR + empty
/// answer) instead of hitting cache/forwarding/upstream. Local data
/// (overrides, zones, .numa proxy, blocklist sinkhole) is unaffected.
pub filter_aaaa: bool,
} }
/// Transport-agnostic DNS resolution. Runs the full pipeline (overrides, blocklist, /// Transport-agnostic DNS resolution. Runs the full pipeline (overrides, blocklist,
@@ -99,6 +105,7 @@ pub async fn resolve_query(
// Pipeline: overrides -> .localhost -> local zones -> special-use (unless forwarded) // Pipeline: overrides -> .localhost -> local zones -> special-use (unless forwarded)
// -> .tld proxy -> blocklist -> cache -> forwarding -> recursive/upstream // -> .tld proxy -> blocklist -> cache -> forwarding -> recursive/upstream
// Each lock is scoped to avoid holding MutexGuard across await points. // Each lock is scoped to avoid holding MutexGuard across await points.
let mut upstream_transport: Option<crate::stats::UpstreamTransport> = None;
let (response, path, dnssec) = { let (response, path, dnssec) = {
let override_record = ctx.overrides.read().unwrap().lookup(&qname); let override_record = ctx.overrides.read().unwrap().lookup(&qname);
if let Some(record) = override_record { if let Some(record) = override_record {
@@ -170,6 +177,13 @@ pub async fn resolve_query(
60, 60,
)); ));
(resp, QueryPath::Blocked, DnssecStatus::Indeterminate) (resp, QueryPath::Blocked, DnssecStatus::Indeterminate)
} else if qtype == QueryType::AAAA && ctx.filter_aaaa {
// RFC 2308 NODATA: NOERROR with empty answer section. Prevents
// Happy Eyeballs clients from waiting on an AAAA they'll never use
// on IPv4-only networks. NXDOMAIN would be wrong (it'd imply the
// name doesn't exist for A either).
let resp = DnsPacket::response_from(&query, ResultCode::NOERROR);
(resp, QueryPath::Local, DnssecStatus::Indeterminate)
} else { } else {
let cached = ctx.cache.read().unwrap().lookup_with_status(&qname, qtype); let cached = ctx.cache.read().unwrap().lookup_with_status(&qname, qtype);
if let Some((cached, cached_dnssec, freshness)) = cached { if let Some((cached, cached_dnssec, freshness)) = cached {
@@ -190,84 +204,73 @@ pub async fn resolve_query(
resp.header.authed_data = true; resp.header.authed_data = true;
} }
(resp, QueryPath::Cached, cached_dnssec) (resp, QueryPath::Cached, cached_dnssec)
} else if let Some(upstream) = } else if let Some(pool) =
crate::system_dns::match_forwarding_rule(&qname, &ctx.forwarding_rules) crate::system_dns::match_forwarding_rule(&qname, &ctx.forwarding_rules)
{ {
// Conditional forwarding takes priority over recursive mode // Conditional forwarding takes priority over recursive mode
// (e.g. Tailscale .ts.net, VPC private zones) // (e.g. Tailscale .ts.net, VPC private zones)
match forward_and_cache(raw_wire, upstream, ctx, &qname, qtype).await {
Ok(resp) => (resp, QueryPath::Forwarded, DnssecStatus::Indeterminate),
Err(e) => {
error!(
"{} | {:?} {} | FORWARD ERROR | {}",
src_addr, qtype, qname, e
);
(
DnsPacket::response_from(&query, ResultCode::SERVFAIL),
QueryPath::UpstreamError,
DnssecStatus::Indeterminate,
)
}
}
} else if ctx.upstream_mode == UpstreamMode::Recursive {
let key = (qname.clone(), qtype); let key = (qname.clone(), qtype);
let (resp, path, err) = resolve_coalesced(&ctx.inflight, key, &query, || { let (resp, path, err) =
crate::recursive::resolve_recursive( resolve_coalesced(&ctx.inflight, key, &query, QueryPath::Forwarded, || async {
&qname, let wire = forward_with_failover_raw(
qtype, raw_wire,
&ctx.cache, pool,
&query, &ctx.srtt,
&ctx.root_hints, ctx.timeout,
&ctx.srtt, ctx.hedge_delay,
) )
}) .await?;
.await; cache_and_parse(ctx, &qname, qtype, &wire)
if path == QueryPath::Coalesced { })
debug!("{} | {:?} {} | COALESCED", src_addr, qtype, qname); .await;
} else if path == QueryPath::UpstreamError { log_coalesced_outcome(src_addr, qtype, &qname, path, err.as_deref(), "FORWARD");
error!( if path == QueryPath::Forwarded {
"{} | {:?} {} | RECURSIVE ERROR | {}", upstream_transport = pool.preferred().map(|u| u.transport());
src_addr, }
qtype, (resp, path, DnssecStatus::Indeterminate)
qname, } else if ctx.upstream_mode == UpstreamMode::Recursive {
err.as_deref().unwrap_or("leader failed") // Recursive resolution makes UDP hops to roots/TLDs/auths;
); // tag as Udp so the dashboard can aggregate plaintext-wire
// egress honestly. Only mark on success — errors stay None.
let key = (qname.clone(), qtype);
let (resp, path, err) =
resolve_coalesced(&ctx.inflight, key, &query, QueryPath::Recursive, || {
crate::recursive::resolve_recursive(
&qname,
qtype,
&ctx.cache,
&query,
&ctx.root_hints,
&ctx.srtt,
)
})
.await;
log_coalesced_outcome(src_addr, qtype, &qname, path, err.as_deref(), "RECURSIVE");
if path == QueryPath::Recursive {
upstream_transport = Some(crate::stats::UpstreamTransport::Udp);
} }
(resp, path, DnssecStatus::Indeterminate) (resp, path, DnssecStatus::Indeterminate)
} else { } else {
let pool = ctx.upstream_pool.lock().unwrap().clone(); let pool = ctx.upstream_pool.lock().unwrap().clone();
match forward_with_failover_raw( let key = (qname.clone(), qtype);
raw_wire, let (resp, path, err) =
&pool, resolve_coalesced(&ctx.inflight, key, &query, QueryPath::Upstream, || async {
&ctx.srtt, let wire = forward_with_failover_raw(
ctx.timeout, raw_wire,
ctx.hedge_delay, &pool,
) &ctx.srtt,
.await ctx.timeout,
{ ctx.hedge_delay,
Ok(resp_wire) => match cache_and_parse(ctx, &qname, qtype, &resp_wire) {
Ok(resp) => (resp, QueryPath::Forwarded, DnssecStatus::Indeterminate),
Err(e) => {
error!("{} | {:?} {} | PARSE ERROR | {}", src_addr, qtype, qname, e);
(
DnsPacket::response_from(&query, ResultCode::SERVFAIL),
QueryPath::UpstreamError,
DnssecStatus::Indeterminate,
)
}
},
Err(e) => {
error!(
"{} | {:?} {} | UPSTREAM ERROR | {}",
src_addr, qtype, qname, e
);
(
DnsPacket::response_from(&query, ResultCode::SERVFAIL),
QueryPath::UpstreamError,
DnssecStatus::Indeterminate,
) )
} .await?;
cache_and_parse(ctx, &qname, qtype, &wire)
})
.await;
log_coalesced_outcome(src_addr, qtype, &qname, path, err.as_deref(), "UPSTREAM");
if path == QueryPath::Upstream {
upstream_transport = pool.preferred().map(|u| u.transport());
} }
(resp, path, DnssecStatus::Indeterminate)
} }
} }
}; };
@@ -314,6 +317,15 @@ pub async fn resolve_query(
strip_dnssec_records(&mut response); strip_dnssec_records(&mut response);
} }
// filter_aaaa: also strip ipv6hint from HTTPS/SVCB answers so modern
// browsers (Chrome ≥103 etc.) don't receive v6 address hints via the
// HTTPS record path that bypasses AAAA entirely. Gated on !client_do
// because modifying rdata invalidates any accompanying RRSIG — a DO-bit
// validator downstream would reject the response as Bogus.
if ctx.filter_aaaa && !client_do {
strip_svcb_ipv6_hints(&mut response);
}
// Echo EDNS back if client sent it // Echo EDNS back if client sent it
if query.edns.is_some() { if query.edns.is_some() {
response.edns = Some(crate::packet::EdnsOpt { response.edns = Some(crate::packet::EdnsOpt {
@@ -357,7 +369,7 @@ pub async fn resolve_query(
// Record stats and query log // Record stats and query log
{ {
let mut s = ctx.stats.lock().unwrap(); let mut s = ctx.stats.lock().unwrap();
let total = s.record(path, transport); let total = s.record(path, transport, upstream_transport);
if total.is_multiple_of(1000) { if total.is_multiple_of(1000) {
s.log_summary(); s.log_summary();
} }
@@ -433,17 +445,6 @@ pub async fn refresh_entry(ctx: &ServerCtx, qname: &str, qtype: QueryType) {
} }
} }
async fn forward_and_cache(
wire: &[u8],
upstream: &Upstream,
ctx: &ServerCtx,
qname: &str,
qtype: QueryType,
) -> crate::Result<DnsPacket> {
let resp_wire = forward_query_raw(wire, upstream, ctx.timeout).await?;
cache_and_parse(ctx, qname, qtype, &resp_wire)
}
pub async fn handle_query( pub async fn handle_query(
mut buffer: BytePacketBuffer, mut buffer: BytePacketBuffer,
raw_len: usize, raw_len: usize,
@@ -482,6 +483,20 @@ fn strip_dnssec_records(pkt: &mut DnsPacket) {
pkt.resources.retain(|r| !is_dnssec_record(r)); pkt.resources.retain(|r| !is_dnssec_record(r));
} }
fn strip_svcb_ipv6_hints(pkt: &mut DnsPacket) {
let https_qtype = QueryType::HTTPS.to_num();
let svcb_qtype = QueryType::SVCB.to_num();
pkt.for_each_record_mut(|rec| {
if let DnsRecord::UNKNOWN { qtype, data, .. } = rec {
if *qtype == https_qtype || *qtype == svcb_qtype {
if let Some(new_data) = crate::svcb::strip_ipv6hint(data) {
*data = new_data;
}
}
}
});
}
fn is_special_use_domain(qname: &str) -> bool { fn is_special_use_domain(qname: &str) -> bool {
if qname.ends_with(".in-addr.arpa") { if qname.ends_with(".in-addr.arpa") {
// RFC 6303: private + loopback + link-local reverse DNS // RFC 6303: private + loopback + link-local reverse DNS
@@ -558,11 +573,15 @@ fn acquire_inflight(inflight: &Mutex<InflightMap>, key: (String, QueryType)) ->
/// Run a resolve function with in-flight coalescing. Multiple concurrent calls /// Run a resolve function with in-flight coalescing. Multiple concurrent calls
/// for the same key share a single resolution — the first caller (leader) /// for the same key share a single resolution — the first caller (leader)
/// executes `resolve_fn`, and followers wait for the broadcast result. /// executes `resolve_fn`, and followers wait for the broadcast result. The
/// leader's successful path is tagged with `leader_path` so callers that
/// share this helper (recursive, forwarded-rule, forward-upstream) keep their
/// own observability without duplicating the inflight map.
async fn resolve_coalesced<F, Fut>( async fn resolve_coalesced<F, Fut>(
inflight: &Mutex<InflightMap>, inflight: &Mutex<InflightMap>,
key: (String, QueryType), key: (String, QueryType),
query: &DnsPacket, query: &DnsPacket,
leader_path: QueryPath,
resolve_fn: F, resolve_fn: F,
) -> (DnsPacket, QueryPath, Option<String>) ) -> (DnsPacket, QueryPath, Option<String>)
where where
@@ -591,7 +610,7 @@ where
match result { match result {
Ok(resp) => { Ok(resp) => {
let _ = tx.send(Some(resp.clone())); let _ = tx.send(Some(resp.clone()));
(resp, QueryPath::Recursive, None) (resp, leader_path, None)
} }
Err(e) => { Err(e) => {
let _ = tx.send(None); let _ = tx.send(None);
@@ -618,6 +637,33 @@ impl Drop for InflightGuard<'_> {
} }
} }
/// Emit the log lines shared by the three upstream branches (Forwarded,
/// Recursive, Upstream) after `resolve_coalesced` returns. Leader-success
/// and transport-tagging stay at the call site since they diverge per
/// branch, but the Coalesced debug and UpstreamError error are identical
/// except for the label.
fn log_coalesced_outcome(
src_addr: SocketAddr,
qtype: QueryType,
qname: &str,
path: QueryPath,
err: Option<&str>,
label: &str,
) {
match path {
QueryPath::Coalesced => debug!("{} | {:?} {} | COALESCED", src_addr, qtype, qname),
QueryPath::UpstreamError => error!(
"{} | {:?} {} | {} ERROR | {}",
src_addr,
qtype,
qname,
label,
err.unwrap_or("leader failed")
),
_ => {}
}
}
fn special_use_response(query: &DnsPacket, qname: &str, qtype: QueryType) -> DnsPacket { fn special_use_response(query: &DnsPacket, qname: &str, qtype: QueryType) -> DnsPacket {
use std::net::{Ipv4Addr, Ipv6Addr}; use std::net::{Ipv4Addr, Ipv6Addr};
if qname == "ipv4only.arpa" { if qname == "ipv4only.arpa" {
@@ -856,7 +902,7 @@ mod tests {
let key = ("coalesce.test".to_string(), QueryType::A); let key = ("coalesce.test".to_string(), QueryType::A);
let query = DnsPacket::query(100 + i, "coalesce.test", QueryType::A); let query = DnsPacket::query(100 + i, "coalesce.test", QueryType::A);
handles.push(tokio::spawn(async move { handles.push(tokio::spawn(async move {
resolve_coalesced(&inf, key, &query, || async { resolve_coalesced(&inf, key, &query, QueryPath::Recursive, || async {
count.fetch_add(1, std::sync::atomic::Ordering::Relaxed); count.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
tokio::time::sleep(Duration::from_millis(200)).await; tokio::time::sleep(Duration::from_millis(200)).await;
Ok(mock_response("coalesce.test")) Ok(mock_response("coalesce.test"))
@@ -900,6 +946,7 @@ mod tests {
&inf1, &inf1,
("same.domain".to_string(), QueryType::A), ("same.domain".to_string(), QueryType::A),
&query_a, &query_a,
QueryPath::Recursive,
|| async { || async {
count1.fetch_add(1, std::sync::atomic::Ordering::Relaxed); count1.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
tokio::time::sleep(Duration::from_millis(100)).await; tokio::time::sleep(Duration::from_millis(100)).await;
@@ -913,6 +960,7 @@ mod tests {
&inf2, &inf2,
("same.domain".to_string(), QueryType::AAAA), ("same.domain".to_string(), QueryType::AAAA),
&query_aaaa, &query_aaaa,
QueryPath::Recursive,
|| async { || async {
count2.fetch_add(1, std::sync::atomic::Ordering::Relaxed); count2.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
tokio::time::sleep(Duration::from_millis(100)).await; tokio::time::sleep(Duration::from_millis(100)).await;
@@ -942,6 +990,7 @@ mod tests {
&inflight, &inflight,
("will-fail.test".to_string(), QueryType::A), ("will-fail.test".to_string(), QueryType::A),
&query, &query,
QueryPath::Recursive,
|| async { Err::<DnsPacket, _>("upstream timeout".into()) }, || async { Err::<DnsPacket, _>("upstream timeout".into()) },
) )
.await; .await;
@@ -963,6 +1012,7 @@ mod tests {
&inf, &inf,
("fail.test".to_string(), QueryType::A), ("fail.test".to_string(), QueryType::A),
&query, &query,
QueryPath::Recursive,
|| async { || async {
tokio::time::sleep(Duration::from_millis(200)).await; tokio::time::sleep(Duration::from_millis(200)).await;
Err::<DnsPacket, _>("upstream error".into()) Err::<DnsPacket, _>("upstream error".into())
@@ -1003,6 +1053,7 @@ mod tests {
&inflight, &inflight,
("question.test".to_string(), QueryType::A), ("question.test".to_string(), QueryType::A),
&query, &query,
QueryPath::Recursive,
|| async { Err::<DnsPacket, _>("fail".into()) }, || async { Err::<DnsPacket, _>("fail".into()) },
) )
.await; .await;
@@ -1027,6 +1078,7 @@ mod tests {
&inflight, &inflight,
("err-msg.test".to_string(), QueryType::A), ("err-msg.test".to_string(), QueryType::A),
&query, &query,
QueryPath::Recursive,
|| async { Err::<DnsPacket, _>("connection refused by upstream".into()) }, || async { Err::<DnsPacket, _>("connection refused by upstream".into()) },
) )
.await; .await;
@@ -1082,7 +1134,7 @@ mod tests {
let mut ctx = crate::testutil::test_ctx().await; let mut ctx = crate::testutil::test_ctx().await;
ctx.forwarding_rules = vec![ForwardingRule::new( ctx.forwarding_rules = vec![ForwardingRule::new(
"168.192.in-addr.arpa".to_string(), "168.192.in-addr.arpa".to_string(),
Upstream::Udp(upstream_addr), UpstreamPool::new(vec![Upstream::Udp(upstream_addr)], vec![]),
)]; )];
let ctx = Arc::new(ctx); let ctx = Arc::new(ctx);
@@ -1178,6 +1230,201 @@ mod tests {
} }
} }
#[tokio::test]
async fn pipeline_filter_aaaa_returns_nodata() {
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "example.com", QueryType::AAAA).await;
assert_eq!(path, QueryPath::Local);
assert_eq!(resp.header.rescode, ResultCode::NOERROR);
assert!(resp.answers.is_empty(), "AAAA must be filtered to NODATA");
}
#[tokio::test]
async fn pipeline_filter_aaaa_leaves_a_queries_alone() {
let mut upstream_resp = DnsPacket::new();
upstream_resp.header.response = true;
upstream_resp.header.rescode = ResultCode::NOERROR;
upstream_resp.answers.push(DnsRecord::A {
domain: "example.com".to_string(),
addr: Ipv4Addr::new(93, 184, 216, 34),
ttl: 300,
});
let upstream_addr = crate::testutil::mock_upstream(upstream_resp).await;
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
ctx.upstream_pool
.lock()
.unwrap()
.set_primary(vec![Upstream::Udp(upstream_addr)]);
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "example.com", QueryType::A).await;
assert_eq!(path, QueryPath::Upstream);
assert_eq!(resp.answers.len(), 1);
}
#[tokio::test]
async fn pipeline_filter_aaaa_respects_override() {
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
ctx.overrides
.write()
.unwrap()
.insert("v6.test", "2001:db8::1", 60, None)
.unwrap();
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "v6.test", QueryType::AAAA).await;
assert_eq!(path, QueryPath::Overridden);
assert_eq!(resp.answers.len(), 1, "override must win over filter");
}
#[tokio::test]
async fn pipeline_filter_aaaa_strips_ipv6hint_from_https_and_svcb() {
let rdata = crate::svcb::build_rdata(
1,
&[],
&[
(1, vec![0x02, b'h', b'3']),
(
6,
vec![
0x26, 0x06, 0x47, 0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01,
],
),
],
);
let mut pkt = DnsPacket::new();
pkt.header.response = true;
pkt.header.rescode = ResultCode::NOERROR;
pkt.questions.push(crate::question::DnsQuestion {
name: "hints.test".to_string(),
qtype: QueryType::HTTPS,
});
pkt.answers.push(DnsRecord::UNKNOWN {
domain: "hints.test".to_string(),
qtype: 65,
data: rdata.clone(),
ttl: 300,
});
let mut svcb_pkt = pkt.clone();
svcb_pkt.questions[0].name = "svc.test".to_string();
svcb_pkt.questions[0].qtype = QueryType::SVCB;
if let DnsRecord::UNKNOWN { domain, qtype, .. } = &mut svcb_pkt.answers[0] {
*domain = "svc.test".to_string();
*qtype = 64;
}
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
ctx.cache
.write()
.unwrap()
.insert("hints.test", QueryType::HTTPS, &pkt);
ctx.cache
.write()
.unwrap()
.insert("svc.test", QueryType::SVCB, &svcb_pkt);
let ctx = Arc::new(ctx);
for (name, qtype, label) in [
("hints.test", QueryType::HTTPS, "HTTPS"),
("svc.test", QueryType::SVCB, "SVCB"),
] {
let (resp, path) = resolve_in_test(&ctx, name, qtype).await;
assert_eq!(path, QueryPath::Cached, "{label}");
assert_eq!(resp.answers.len(), 1, "{label}");
match &resp.answers[0] {
DnsRecord::UNKNOWN { data, .. } => {
assert!(
data.len() < rdata.len(),
"{label}: ipv6hint (20 bytes) must be removed"
);
// Bytes for key=6 must not appear at any 4-byte boundary in the
// params section — cheap structural check.
assert!(
!data.windows(4).any(|w| w == [0, 6, 0, 16]),
"{label}: ipv6hint TLV header must be absent"
);
}
other => panic!("{label}: expected UNKNOWN record, got {other:?}"),
}
}
}
#[tokio::test]
async fn pipeline_filter_aaaa_preserves_ipv6hint_for_dnssec_clients() {
// Regression guard for the DO-bit gate in resolve_query: modifying
// HTTPS rdata invalidates any accompanying RRSIG, so a DO=1 client
// must receive the record untouched even when filter_aaaa is on.
let rdata = crate::svcb::build_rdata(
1,
&[],
&[(
6,
vec![
0x26, 0x06, 0x47, 0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01,
],
)],
);
let mut pkt = DnsPacket::new();
pkt.header.response = true;
pkt.header.rescode = ResultCode::NOERROR;
pkt.questions.push(crate::question::DnsQuestion {
name: "hints.test".to_string(),
qtype: QueryType::HTTPS,
});
pkt.answers.push(DnsRecord::UNKNOWN {
domain: "hints.test".to_string(),
qtype: 65,
data: rdata.clone(),
ttl: 300,
});
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
ctx.cache
.write()
.unwrap()
.insert("hints.test", QueryType::HTTPS, &pkt);
let ctx = Arc::new(ctx);
// Build a query with EDNS DO bit set — can't use resolve_in_test
// because it constructs a plain query without EDNS.
let mut query = DnsPacket::query(0xBEEF, "hints.test", QueryType::HTTPS);
query.edns = Some(crate::packet::EdnsOpt {
do_bit: true,
..Default::default()
});
let mut buf = BytePacketBuffer::new();
query.write(&mut buf).unwrap();
let raw = &buf.buf[..buf.pos];
let src: SocketAddr = "127.0.0.1:1234".parse().unwrap();
let (resp_buf, _) = resolve_query(query, raw, src, &ctx, Transport::Udp)
.await
.unwrap();
let mut resp_parse_buf = BytePacketBuffer::from_bytes(resp_buf.filled());
let resp = DnsPacket::from_buffer(&mut resp_parse_buf).unwrap();
match &resp.answers[0] {
DnsRecord::UNKNOWN { data, .. } => {
assert_eq!(
data, &rdata,
"ipv6hint must be preserved for DO-bit clients"
);
}
other => panic!("expected UNKNOWN record, got {:?}", other),
}
}
#[tokio::test] #[tokio::test]
async fn pipeline_blocklist_sinkhole() { async fn pipeline_blocklist_sinkhole() {
let ctx = crate::testutil::test_ctx().await; let ctx = crate::testutil::test_ctx().await;
@@ -1237,7 +1484,7 @@ mod tests {
let mut ctx = crate::testutil::test_ctx().await; let mut ctx = crate::testutil::test_ctx().await;
ctx.forwarding_rules = vec![ForwardingRule::new( ctx.forwarding_rules = vec![ForwardingRule::new(
"corp".to_string(), "corp".to_string(),
Upstream::Udp(upstream_addr), UpstreamPool::new(vec![Upstream::Udp(upstream_addr)], vec![]),
)]; )];
let ctx = Arc::new(ctx); let ctx = Arc::new(ctx);
@@ -1253,4 +1500,60 @@ mod tests {
other => panic!("expected A record, got {:?}", other), other => panic!("expected A record, got {:?}", other),
} }
} }
#[tokio::test]
async fn pipeline_forwarding_fails_over_to_second_upstream() {
let dead = crate::testutil::blackhole_upstream();
let mut live_resp = DnsPacket::new();
live_resp.header.response = true;
live_resp.header.rescode = ResultCode::NOERROR;
live_resp.answers.push(DnsRecord::A {
domain: "internal.corp".to_string(),
addr: Ipv4Addr::new(10, 9, 9, 9),
ttl: 600,
});
let live = crate::testutil::mock_upstream(live_resp).await;
let mut ctx = crate::testutil::test_ctx().await;
ctx.forwarding_rules = vec![ForwardingRule::new(
"corp".to_string(),
UpstreamPool::new(vec![Upstream::Udp(dead), Upstream::Udp(live)], vec![]),
)];
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "internal.corp", QueryType::A).await;
assert_eq!(path, QueryPath::Forwarded);
assert_eq!(resp.header.rescode, ResultCode::NOERROR);
assert_eq!(resp.answers.len(), 1);
match &resp.answers[0] {
DnsRecord::A { addr, .. } => assert_eq!(*addr, Ipv4Addr::new(10, 9, 9, 9)),
other => panic!("expected A record, got {:?}", other),
}
}
#[tokio::test]
async fn pipeline_default_pool_reports_upstream_path() {
let mut upstream_resp = DnsPacket::new();
upstream_resp.header.response = true;
upstream_resp.header.rescode = ResultCode::NOERROR;
upstream_resp.answers.push(DnsRecord::A {
domain: "example.com".to_string(),
addr: Ipv4Addr::new(93, 184, 216, 34),
ttl: 300,
});
let upstream_addr = crate::testutil::mock_upstream(upstream_resp).await;
let ctx = crate::testutil::test_ctx().await;
ctx.upstream_pool
.lock()
.unwrap()
.set_primary(vec![Upstream::Udp(upstream_addr)]);
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "example.com", QueryType::A).await;
assert_eq!(path, QueryPath::Upstream);
assert_eq!(resp.header.rescode, ResultCode::NOERROR);
assert_eq!(resp.answers.len(), 1);
}
} }

View File

@@ -1,14 +1,16 @@
use std::fmt; use std::fmt;
use std::net::{IpAddr, SocketAddr}; use std::net::{IpAddr, SocketAddr};
use std::sync::RwLock; use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant}; use std::time::{Duration, Instant};
use tokio::net::UdpSocket; use tokio::net::UdpSocket;
use tokio::time::timeout; use tokio::time::timeout;
use crate::buffer::BytePacketBuffer; use crate::buffer::BytePacketBuffer;
use crate::odoh::{query_through_relay, OdohConfigCache};
use crate::packet::DnsPacket; use crate::packet::DnsPacket;
use crate::srtt::SrttCache; use crate::srtt::SrttCache;
use crate::stats::UpstreamTransport;
use crate::Result; use crate::Result;
#[derive(Clone)] #[derive(Clone)]
@@ -23,6 +25,36 @@ pub enum Upstream {
tls_name: Option<String>, tls_name: Option<String>,
connector: tokio_rustls::TlsConnector, connector: tokio_rustls::TlsConnector,
}, },
/// Oblivious DNS-over-HTTPS (RFC 9230). Queries are HPKE-sealed to the
/// target and forwarded through an independent relay. Target host lives
/// on `target_config` (single source of truth — the cache keys on it).
Odoh {
relay_url: String,
target_path: String,
client: reqwest::Client,
target_config: Arc<OdohConfigCache>,
},
}
impl Upstream {
/// IP address to key SRTT tracking on, if the upstream has a stable one.
/// `Doh` and `Odoh` route through a URL + connection pool, so there's no
/// single IP to track; SRTT is skipped for them.
pub fn tracked_ip(&self) -> Option<IpAddr> {
match self {
Upstream::Udp(addr) | Upstream::Dot { addr, .. } => Some(addr.ip()),
Upstream::Doh { .. } | Upstream::Odoh { .. } => None,
}
}
pub fn transport(&self) -> UpstreamTransport {
match self {
Upstream::Udp(_) => UpstreamTransport::Udp,
Upstream::Doh { .. } => UpstreamTransport::Doh,
Upstream::Dot { .. } => UpstreamTransport::Dot,
Upstream::Odoh { .. } => UpstreamTransport::Odoh,
}
}
} }
impl PartialEq for Upstream { impl PartialEq for Upstream {
@@ -31,6 +63,20 @@ impl PartialEq for Upstream {
(Self::Udp(a), Self::Udp(b)) => a == b, (Self::Udp(a), Self::Udp(b)) => a == b,
(Self::Doh { url: a, .. }, Self::Doh { url: b, .. }) => a == b, (Self::Doh { url: a, .. }, Self::Doh { url: b, .. }) => a == b,
(Self::Dot { addr: a, .. }, Self::Dot { addr: b, .. }) => a == b, (Self::Dot { addr: a, .. }, Self::Dot { addr: b, .. }) => a == b,
(
Self::Odoh {
relay_url: ra,
target_path: pa,
target_config: ca,
..
},
Self::Odoh {
relay_url: rb,
target_path: pb,
target_config: cb,
..
},
) => ra == rb && pa == pb && ca.target_host() == cb.target_host(),
_ => false, _ => false,
} }
} }
@@ -51,14 +97,23 @@ impl fmt::Display for Upstream {
Some(name) => write!(f, "tls://{}#{}", addr, name), Some(name) => write!(f, "tls://{}#{}", addr, name),
None => write!(f, "tls://{}", addr), None => write!(f, "tls://{}", addr),
}, },
Upstream::Odoh {
relay_url,
target_path,
target_config,
..
} => write!(
f,
"odoh://{}{} via {}",
target_config.target_host(),
target_path,
relay_url
),
} }
} }
} }
pub(crate) fn parse_upstream_addr( pub fn parse_upstream_addr(s: &str, default_port: u16) -> std::result::Result<SocketAddr, String> {
s: &str,
default_port: u16,
) -> std::result::Result<SocketAddr, String> {
// Try full socket addr first: "1.2.3.4:5353" or "[::1]:5353" // Try full socket addr first: "1.2.3.4:5353" or "[::1]:5353"
if let Ok(addr) = s.parse::<SocketAddr>() { if let Ok(addr) = s.parse::<SocketAddr>() {
return Ok(addr); return Ok(addr);
@@ -70,22 +125,29 @@ pub(crate) fn parse_upstream_addr(
Err(format!("invalid upstream address: {}", s)) Err(format!("invalid upstream address: {}", s))
} }
pub fn parse_upstream(s: &str, default_port: u16) -> Result<Upstream> { /// Parse a slice of upstream address strings into `Upstream` values, failing
/// on the first invalid entry. DoH entries use `resolver` (when provided) as
/// their hostname resolver.
pub fn parse_upstream_list(
addrs: &[String],
default_port: u16,
resolver: Option<Arc<crate::bootstrap_resolver::NumaResolver>>,
) -> Result<Vec<Upstream>> {
addrs
.iter()
.map(|s| parse_upstream(s, default_port, resolver.clone()))
.collect()
}
pub fn parse_upstream(
s: &str,
default_port: u16,
resolver: Option<Arc<crate::bootstrap_resolver::NumaResolver>>,
) -> Result<Upstream> {
if s.starts_with("https://") { if s.starts_with("https://") {
let client = reqwest::Client::builder()
.use_rustls_tls()
.http2_initial_stream_window_size(65_535)
.http2_initial_connection_window_size(65_535)
.http2_keep_alive_interval(Duration::from_secs(15))
.http2_keep_alive_while_idle(true)
.http2_keep_alive_timeout(Duration::from_secs(10))
.pool_idle_timeout(Duration::from_secs(300))
.pool_max_idle_per_host(1)
.build()
.unwrap_or_default();
return Ok(Upstream::Doh { return Ok(Upstream::Doh {
url: s.to_string(), url: s.to_string(),
client, client: build_https_client_with_resolver(1, resolver),
}); });
} }
// tls://IP:PORT#hostname or tls://IP#hostname (default port 853) // tls://IP:PORT#hostname or tls://IP#hostname (default port 853)
@@ -106,6 +168,52 @@ pub fn parse_upstream(s: &str, default_port: u16) -> Result<Upstream> {
Ok(Upstream::Udp(addr)) Ok(Upstream::Udp(addr))
} }
/// HTTP/2 client tuned for DoH/ODoH: small windows for low latency, long-lived
/// keep-alive. Pool defaults to one idle conn per host — good for resolvers
/// that talk to a single upstream; relays that fan out to many targets
/// should use [`build_https_client_with_pool`].
///
/// Uses the system resolver. Callers running inside `serve::run` pass the
/// shared [`crate::bootstrap_resolver::NumaResolver`] via
/// [`build_https_client_with_resolver`] to avoid the self-loop documented
/// in `docs/implementation/bootstrap-resolver.md`.
pub fn build_https_client() -> reqwest::Client {
build_https_client_with_resolver(1, None)
}
/// Same shape as [`build_https_client`], but caller picks
/// `pool_max_idle_per_host`. Relay workloads hit many distinct target hosts
/// and benefit from a larger pool so warm connections survive concurrent
/// fan-out.
pub fn build_https_client_with_pool(pool_max_idle_per_host: usize) -> reqwest::Client {
build_https_client_with_resolver(pool_max_idle_per_host, None)
}
/// [`build_https_client`] with an optional custom DNS resolver. Numa wires
/// [`crate::bootstrap_resolver::NumaResolver`] here.
pub fn build_https_client_with_resolver(
pool_max_idle_per_host: usize,
resolver: Option<Arc<crate::bootstrap_resolver::NumaResolver>>,
) -> reqwest::Client {
let mut builder = https_client_builder(pool_max_idle_per_host);
if let Some(r) = resolver {
builder = builder.dns_resolver(r);
}
builder.build().unwrap_or_default()
}
fn https_client_builder(pool_max_idle_per_host: usize) -> reqwest::ClientBuilder {
reqwest::Client::builder()
.use_rustls_tls()
.http2_initial_stream_window_size(65_535)
.http2_initial_connection_window_size(65_535)
.http2_keep_alive_interval(Duration::from_secs(15))
.http2_keep_alive_while_idle(true)
.http2_keep_alive_timeout(Duration::from_secs(10))
.pool_idle_timeout(Duration::from_secs(300))
.pool_max_idle_per_host(pool_max_idle_per_host)
}
fn build_dot_connector() -> Result<tokio_rustls::TlsConnector> { fn build_dot_connector() -> Result<tokio_rustls::TlsConnector> {
let _ = rustls::crypto::ring::default_provider().install_default(); let _ = rustls::crypto::ring::default_provider().install_default();
let mut root_store = rustls::RootCertStore::empty(); let mut root_store = rustls::RootCertStore::empty();
@@ -270,6 +378,22 @@ pub async fn forward_query_raw(
tls_name, tls_name,
connector, connector,
} => forward_dot_raw(wire, *addr, tls_name, connector, timeout_duration).await, } => forward_dot_raw(wire, *addr, tls_name, connector, timeout_duration).await,
Upstream::Odoh {
relay_url,
target_path,
client,
target_config,
} => {
query_through_relay(
wire,
relay_url,
target_path,
client,
target_config,
timeout_duration,
)
.await
}
} }
} }
@@ -345,18 +469,17 @@ pub async fn forward_with_failover_raw(
timeout_duration: Duration, timeout_duration: Duration,
hedge_delay: Duration, hedge_delay: Duration,
) -> Result<Vec<u8>> { ) -> Result<Vec<u8>> {
let mut candidates: Vec<(usize, u64)> = pool let mut candidates: Vec<(usize, u64)> = {
.primary let srtt_read = srtt.read().unwrap();
.iter() pool.primary
.enumerate() .iter()
.map(|(i, u)| { .enumerate()
let rtt = match u { .map(|(i, u)| {
Upstream::Udp(addr) => srtt.read().unwrap().get(addr.ip()), let rtt = u.tracked_ip().map(|ip| srtt_read.get(ip)).unwrap_or(0);
_ => 0, (i, rtt)
}; })
(i, rtt) .collect()
}) };
.collect();
candidates.sort_by_key(|&(_, rtt)| rtt); candidates.sort_by_key(|&(_, rtt)| rtt);
let all_upstreams: Vec<&Upstream> = candidates let all_upstreams: Vec<&Upstream> = candidates
@@ -380,15 +503,15 @@ pub async fn forward_with_failover_raw(
}; };
match result { match result {
Ok(resp) => { Ok(resp) => {
if let Upstream::Udp(addr) = upstream { if let Some(ip) = upstream.tracked_ip() {
let rtt_ms = start.elapsed().as_millis() as u64; let rtt_ms = start.elapsed().as_millis() as u64;
srtt.write().unwrap().record_rtt(addr.ip(), rtt_ms, false); srtt.write().unwrap().record_rtt(ip, rtt_ms, false);
} }
return Ok(resp); return Ok(resp);
} }
Err(e) => { Err(e) => {
if let Upstream::Udp(addr) = upstream { if let Some(ip) = upstream.tracked_ip() {
srtt.write().unwrap().record_failure(addr.ip()); srtt.write().unwrap().record_failure(ip);
} }
log::debug!("upstream {} failed: {}", upstream, e); log::debug!("upstream {} failed: {}", upstream, e);
last_err = Some(e); last_err = Some(e);
@@ -438,6 +561,9 @@ async fn forward_doh_raw(
/// Send a lightweight keepalive query to a DoH upstream to prevent /// Send a lightweight keepalive query to a DoH upstream to prevent
/// the HTTP/2 + TLS connection from going idle and being torn down. /// the HTTP/2 + TLS connection from going idle and being torn down.
/// The first call doubles as a startup warm-up: bootstrap-resolver failures
/// (unreachable Quad9/Cloudflare defaults, misconfigured hostname upstream)
/// surface here rather than on the first client query.
pub async fn keepalive_doh(upstream: &Upstream) { pub async fn keepalive_doh(upstream: &Upstream) {
if let Upstream::Doh { url, client } = upstream { if let Upstream::Doh { url, client } = upstream {
// Query for . NS — minimal, always succeeds, response is small // Query for . NS — minimal, always succeeds, response is small
@@ -450,7 +576,9 @@ pub async fn keepalive_doh(upstream: &Upstream) {
0x00, 0x02, // type NS 0x00, 0x02, // type NS
0x00, 0x01, // class IN 0x00, 0x01, // class IN
]; ];
let _ = forward_doh_raw(wire, url, client, Duration::from_secs(5)).await; if let Err(e) = forward_doh_raw(wire, url, client, Duration::from_secs(5)).await {
log::warn!("DoH keepalive to {} failed: {}", url, e);
}
} }
} }
@@ -707,4 +835,62 @@ mod tests {
assert!(!pool.maybe_update_primary("not-an-ip", 53)); assert!(!pool.maybe_update_primary("not-an-ip", 53));
assert_eq!(pool.preferred().unwrap().to_string(), "1.2.3.4:53"); assert_eq!(pool.preferred().unwrap().to_string(), "1.2.3.4:53");
} }
fn tcp_closed_port() -> SocketAddr {
// Bind a TCP listener, grab the port, drop → kernel returns RST on connect.
let listener = std::net::TcpListener::bind("127.0.0.1:0").unwrap();
let addr = listener.local_addr().unwrap();
drop(listener);
addr
}
#[tokio::test]
async fn udp_failure_records_in_srtt() {
let blackhole = crate::testutil::blackhole_upstream();
let pool = UpstreamPool::new(vec![Upstream::Udp(blackhole)], vec![]);
let srtt = RwLock::new(SrttCache::new(true));
let _ = forward_with_failover_raw(
&[0u8; 12],
&pool,
&srtt,
Duration::from_millis(100),
Duration::ZERO,
)
.await;
assert!(srtt.read().unwrap().is_known(blackhole.ip()));
}
#[tokio::test]
async fn dot_failure_records_in_srtt() {
let dead1 = tcp_closed_port();
let dead2 = tcp_closed_port();
let connector = build_dot_connector().unwrap();
let pool = UpstreamPool::new(
vec![
Upstream::Dot {
addr: dead1,
tls_name: Some("dns.quad9.net".to_string()),
connector: connector.clone(),
},
Upstream::Dot {
addr: dead2,
tls_name: Some("dns.quad9.net".to_string()),
connector,
},
],
vec![],
);
let srtt = RwLock::new(SrttCache::new(true));
let _ = forward_with_failover_raw(
&[0u8; 12],
&pool,
&srtt,
Duration::from_millis(500),
Duration::ZERO,
)
.await;
let cache = srtt.read().unwrap();
assert!(cache.is_known(dead1.ip()));
assert!(cache.is_known(dead2.ip()));
}
} }

View File

@@ -43,7 +43,7 @@ impl HealthMeta {
#[cfg(test)] #[cfg(test)]
pub fn test_fixture() -> Self { pub fn test_fixture() -> Self {
HealthMeta { HealthMeta {
version: env!("CARGO_PKG_VERSION"), version: crate::version(),
hostname: "test-host".to_string(), hostname: "test-host".to_string(),
sni: "numa.numa".to_string(), sni: "numa.numa".to_string(),
dot_enabled: false, dot_enabled: false,
@@ -99,7 +99,7 @@ impl HealthMeta {
} }
HealthMeta { HealthMeta {
version: env!("CARGO_PKG_VERSION"), version: crate::version(),
hostname: crate::hostname(), hostname: crate::hostname(),
sni: "numa.numa".to_string(), sni: "numa.numa".to_string(),
dot_enabled, dot_enabled,

View File

@@ -1,5 +1,6 @@
pub mod api; pub mod api;
pub mod blocklist; pub mod blocklist;
pub mod bootstrap_resolver;
pub mod buffer; pub mod buffer;
pub mod cache; pub mod cache;
pub mod config; pub mod config;
@@ -13,6 +14,7 @@ pub mod health;
pub mod lan; pub mod lan;
pub mod mobile_api; pub mod mobile_api;
pub mod mobileconfig; pub mod mobileconfig;
pub mod odoh;
pub mod override_store; pub mod override_store;
pub mod packet; pub mod packet;
pub mod proxy; pub mod proxy;
@@ -20,20 +22,34 @@ pub mod query_log;
pub mod question; pub mod question;
pub mod record; pub mod record;
pub mod recursive; pub mod recursive;
pub mod relay;
pub mod serve;
pub mod service_store; pub mod service_store;
pub mod setup_phone; pub mod setup_phone;
pub mod srtt; pub mod srtt;
pub mod stats; pub mod stats;
pub mod svcb;
pub mod system_dns; pub mod system_dns;
pub mod tls; pub mod tls;
pub mod wire; pub mod wire;
#[cfg(windows)]
pub mod windows_service;
#[cfg(test)] #[cfg(test)]
pub(crate) mod testutil; pub(crate) mod testutil;
pub type Error = Box<dyn std::error::Error + Send + Sync>; pub type Error = Box<dyn std::error::Error + Send + Sync>;
pub type Result<T> = std::result::Result<T, Error>; pub type Result<T> = std::result::Result<T, Error>;
/// Build version string. On tagged releases: `0.13.1`. On commits ahead
/// of a tag: `0.13.1+a87f907`. With uncommitted changes: `0.13.1+a87f907-dirty`.
/// Falls back to `CARGO_PKG_VERSION` when built outside a git repo (e.g.
/// from a source tarball).
pub fn version() -> &'static str {
option_env!("NUMA_BUILD_VERSION").unwrap_or(env!("CARGO_PKG_VERSION"))
}
/// Detect the machine hostname via the `hostname` command. Returns the /// Detect the machine hostname via the `hostname` command. Returns the
/// full hostname (e.g., `macbook-pro.local`), or `"numa"` if the command /// full hostname (e.g., `macbook-pro.local`), or `"numa"` if the command
/// fails. Call sites that need the short form (e.g., mDNS instance /// fails. Call sites that need the short form (e.g., mDNS instance
@@ -89,14 +105,11 @@ where
/// Linux root daemon: /var/lib/numa (FHS) — falls back to /usr/local/var/numa /// Linux root daemon: /var/lib/numa (FHS) — falls back to /usr/local/var/numa
/// if a pre-v0.10.1 install already lives there. /// if a pre-v0.10.1 install already lives there.
/// macOS root daemon: /usr/local/var/numa (Homebrew prefix) /// macOS root daemon: /usr/local/var/numa (Homebrew prefix)
/// Windows: %APPDATA%\numa /// Windows: %PROGRAMDATA%\numa (same as data_dir — no per-user config on Windows)
pub fn config_dir() -> std::path::PathBuf { pub fn config_dir() -> std::path::PathBuf {
#[cfg(windows)] #[cfg(windows)]
{ {
std::path::PathBuf::from( data_dir()
std::env::var("APPDATA").unwrap_or_else(|_| "C:\\ProgramData".into()),
)
.join("numa")
} }
#[cfg(not(windows))] #[cfg(not(windows))]
{ {

View File

@@ -1,36 +1,34 @@
use std::net::SocketAddr;
use std::sync::{Arc, Mutex, RwLock};
use std::time::Duration;
use arc_swap::ArcSwap;
use log::{error, info};
use tokio::net::UdpSocket;
use numa::blocklist::{download_blocklists, parse_blocklist, BlocklistStore};
use numa::buffer::BytePacketBuffer;
use numa::cache::DnsCache;
use numa::config::{build_zone_map, load_config, ConfigLoad};
use numa::ctx::{handle_query, ServerCtx};
use numa::forward::{parse_upstream, Upstream, UpstreamPool};
use numa::override_store::OverrideStore;
use numa::query_log::QueryLog;
use numa::service_store::ServiceStore;
use numa::stats::{ServerStats, Transport};
use numa::system_dns::{ use numa::system_dns::{
discover_system_dns, install_service, restart_service, service_status, uninstall_service, install_service, restart_service, service_status, start_service, stop_service,
uninstall_service,
}; };
const QUAD9_IP: &str = "9.9.9.9"; fn main() -> numa::Result<()> {
const DOH_FALLBACK: &str = "https://9.9.9.9/dns-query"; // Handle CLI subcommands
let arg1 = std::env::args().nth(1).unwrap_or_default();
#[cfg(windows)]
if arg1 == "--service" {
// Running under SCM — stderr goes nowhere. Redirect logs to a file.
let log_path = numa::data_dir().join("numa.log");
let log_file = std::fs::OpenOptions::new()
.create(true)
.append(true)
.open(&log_path)
.expect("failed to open log file");
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info"))
.format_timestamp_millis()
.target(env_logger::Target::Pipe(Box::new(log_file)))
.init();
numa::windows_service::run_as_service()
.map_err(|e| format!("windows service dispatcher failed: {}", e))?;
return Ok(());
}
#[tokio::main]
async fn main() -> numa::Result<()> {
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")) env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info"))
.format_timestamp_millis() .format_timestamp_millis()
.init(); .init();
// Handle CLI subcommands
let arg1 = std::env::args().nth(1).unwrap_or_default();
match arg1.as_str() { match arg1.as_str() {
"install" => { "install" => {
eprintln!("\x1b[1;38;2;192;98;58mNuma\x1b[0m — installing\n"); eprintln!("\x1b[1;38;2;192;98;58mNuma\x1b[0m — installing\n");
@@ -44,8 +42,8 @@ async fn main() -> numa::Result<()> {
let sub = std::env::args().nth(2).unwrap_or_default(); let sub = std::env::args().nth(2).unwrap_or_default();
eprintln!("\x1b[1;38;2;192;98;58mNuma\x1b[0m — service management\n"); eprintln!("\x1b[1;38;2;192;98;58mNuma\x1b[0m — service management\n");
return match sub.as_str() { return match sub.as_str() {
"start" => install_service().map_err(|e| e.into()), "start" => start_service().map_err(|e| e.into()),
"stop" => uninstall_service().map_err(|e| e.into()), "stop" => stop_service().map_err(|e| e.into()),
"restart" => restart_service().map_err(|e| e.into()), "restart" => restart_service().map_err(|e| e.into()),
"status" => service_status().map_err(|e| e.into()), "status" => service_status().map_err(|e| e.into()),
_ => { _ => {
@@ -55,7 +53,38 @@ async fn main() -> numa::Result<()> {
}; };
} }
"setup-phone" => { "setup-phone" => {
return numa::setup_phone::run().await.map_err(|e| e.into()); let runtime = tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()?;
return runtime
.block_on(numa::setup_phone::run())
.map_err(|e| e.into());
}
"relay" => {
let port: u16 = std::env::args()
.nth(2)
.as_deref()
.and_then(|s| s.parse().ok())
.unwrap_or(8443);
let bind: std::net::IpAddr = std::env::args()
.nth(3)
.as_deref()
.map(|s| {
s.parse().unwrap_or_else(|e| {
eprintln!("invalid bind address '{}': {}", s, e);
std::process::exit(1);
})
})
.unwrap_or(std::net::IpAddr::V4(std::net::Ipv4Addr::LOCALHOST));
let addr = std::net::SocketAddr::new(bind, port);
eprintln!(
"\x1b[1;38;2;192;98;58mNuma\x1b[0m — ODoH relay on {}\n",
addr
);
let runtime = tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()?;
return runtime.block_on(numa::relay::run(addr));
} }
"lan" => { "lan" => {
let sub = std::env::args().nth(2).unwrap_or_default(); let sub = std::env::args().nth(2).unwrap_or_default();
@@ -88,6 +117,8 @@ async fn main() -> numa::Result<()> {
eprintln!(" service status Check if the service is running"); eprintln!(" service status Check if the service is running");
eprintln!(" lan on Enable LAN service discovery (mDNS)"); eprintln!(" lan on Enable LAN service discovery (mDNS)");
eprintln!(" lan off Disable LAN service discovery"); eprintln!(" lan off Disable LAN service discovery");
eprintln!(" relay [PORT] [BIND]");
eprintln!(" Run as an ODoH relay (RFC 9230, default 127.0.0.1:8443)");
eprintln!(" setup-phone Generate a QR code to install Numa DoT on a phone"); eprintln!(" setup-phone Generate a QR code to install Numa DoT on a phone");
eprintln!(" help Show this help"); eprintln!(" help Show this help");
eprintln!(); eprintln!();
@@ -118,552 +149,11 @@ async fn main() -> numa::Result<()> {
} else { } else {
arg1 // treat as config path for backwards compatibility arg1 // treat as config path for backwards compatibility
}; };
let ConfigLoad {
config,
path: resolved_config_path,
found: config_found,
} = load_config(&config_path)?;
// Discover system DNS in a single pass (upstream + forwarding rules) let runtime = tokio::runtime::Builder::new_multi_thread()
let system_dns = discover_system_dns(); .enable_all()
.build()?;
let root_hints = numa::recursive::parse_root_hints(&config.upstream.root_hints); runtime.block_on(numa::serve::run(config_path))
let recursive_pool = || {
let dummy = UpstreamPool::new(vec![Upstream::Udp("0.0.0.0:0".parse().unwrap())], vec![]);
(dummy, "recursive (root hints)".to_string())
};
let (resolved_mode, upstream_auto, pool, upstream_label) = match config.upstream.mode {
numa::config::UpstreamMode::Auto => {
info!("auto mode: probing recursive resolution...");
if numa::recursive::probe_recursive(&root_hints).await {
info!("recursive probe succeeded — self-sovereign mode");
let (pool, label) = recursive_pool();
(numa::config::UpstreamMode::Recursive, false, pool, label)
} else {
log::warn!("recursive probe failed — falling back to Quad9 DoH");
let client = reqwest::Client::builder()
.use_rustls_tls()
.build()
.unwrap_or_default();
let url = DOH_FALLBACK.to_string();
let label = url.clone();
let pool = UpstreamPool::new(vec![Upstream::Doh { url, client }], vec![]);
(numa::config::UpstreamMode::Forward, false, pool, label)
}
}
numa::config::UpstreamMode::Recursive => {
let (pool, label) = recursive_pool();
(numa::config::UpstreamMode::Recursive, false, pool, label)
}
numa::config::UpstreamMode::Forward => {
let addrs = if config.upstream.address.is_empty() {
let detected = system_dns
.default_upstream
.or_else(numa::system_dns::detect_dhcp_dns)
.unwrap_or_else(|| {
info!("could not detect system DNS, falling back to Quad9 DoH");
DOH_FALLBACK.to_string()
});
vec![detected]
} else {
config.upstream.address.clone()
};
let primary: Vec<Upstream> = addrs
.iter()
.map(|s| parse_upstream(s, config.upstream.port))
.collect::<numa::Result<Vec<_>>>()?;
let fallback: Vec<Upstream> = config
.upstream
.fallback
.iter()
.map(|s| parse_upstream(s, config.upstream.port))
.collect::<numa::Result<Vec<_>>>()?;
let pool = UpstreamPool::new(primary, fallback);
let label = pool.label();
(
numa::config::UpstreamMode::Forward,
config.upstream.address.is_empty(),
pool,
label,
)
}
};
let api_port = config.server.api_port;
let mut blocklist = BlocklistStore::new();
for domain in &config.blocking.allowlist {
blocklist.add_to_allowlist(domain);
}
if !config.blocking.enabled {
blocklist.set_enabled(false);
}
// Build service store: config services + persisted user services
let mut service_store = ServiceStore::new();
service_store.insert_from_config("numa", config.server.api_port, Vec::new());
for svc in &config.services {
service_store.insert_from_config(&svc.name, svc.target_port, svc.routes.clone());
}
service_store.load_persisted();
for fwd in &config.forwarding {
for suffix in &fwd.suffix {
info!("forwarding .{} to {} (config rule)", suffix, fwd.upstream);
}
}
let forwarding_rules =
numa::config::merge_forwarding_rules(&config.forwarding, system_dns.forwarding_rules)?;
// Resolve data_dir from config, falling back to the platform default.
// Used for TLS CA storage below and stored on ServerCtx for runtime use.
let resolved_data_dir = config
.server
.data_dir
.clone()
.unwrap_or_else(numa::data_dir);
// Build initial TLS config before ServerCtx (so ArcSwap is ready at construction)
let initial_tls = if config.proxy.enabled && config.proxy.tls_port > 0 {
let service_names = service_store.names();
match numa::tls::build_tls_config(
&config.proxy.tld,
&service_names,
Vec::new(),
&resolved_data_dir,
) {
Ok(tls_config) => Some(ArcSwap::from(tls_config)),
Err(e) => {
if let Some(advisory) = numa::tls::try_data_dir_advisory(&e, &resolved_data_dir) {
eprint!("{}", advisory);
} else {
log::warn!("TLS setup failed, HTTPS proxy disabled: {}", e);
}
None
}
}
} else {
None
};
let doh_enabled = initial_tls.is_some();
let health_meta = numa::health::HealthMeta::build(
&resolved_data_dir,
config.dot.enabled,
config.dot.port,
config.mobile.port,
config.dnssec.enabled,
resolved_mode == numa::config::UpstreamMode::Recursive,
config.lan.enabled,
config.blocking.enabled,
doh_enabled,
);
let ca_pem = std::fs::read_to_string(resolved_data_dir.join("ca.pem")).ok();
let socket = match UdpSocket::bind(&config.server.bind_addr).await {
Ok(s) => s,
Err(e) => {
if let Some(advisory) =
numa::system_dns::try_port53_advisory(&config.server.bind_addr, &e)
{
eprint!("{}", advisory);
std::process::exit(1);
}
return Err(e.into());
}
};
let ctx = Arc::new(ServerCtx {
socket,
zone_map: build_zone_map(&config.zones)?,
cache: RwLock::new(DnsCache::new(
config.cache.max_entries,
config.cache.min_ttl,
config.cache.max_ttl,
)),
refreshing: Mutex::new(std::collections::HashSet::new()),
stats: Mutex::new(ServerStats::new()),
overrides: RwLock::new(OverrideStore::new()),
blocklist: RwLock::new(blocklist),
query_log: Mutex::new(QueryLog::new(1000)),
services: Mutex::new(service_store),
lan_peers: Mutex::new(numa::lan::PeerStore::new(config.lan.peer_timeout_secs)),
forwarding_rules,
upstream_pool: Mutex::new(pool),
upstream_auto,
upstream_port: config.upstream.port,
lan_ip: Mutex::new(numa::lan::detect_lan_ip().unwrap_or(std::net::Ipv4Addr::LOCALHOST)),
timeout: Duration::from_millis(config.upstream.timeout_ms),
hedge_delay: Duration::from_millis(config.upstream.hedge_ms),
proxy_tld_suffix: if config.proxy.tld.is_empty() {
String::new()
} else {
format!(".{}", config.proxy.tld)
},
proxy_tld: config.proxy.tld.clone(),
lan_enabled: config.lan.enabled,
config_path: resolved_config_path,
config_found,
config_dir: numa::config_dir(),
data_dir: resolved_data_dir,
tls_config: initial_tls,
upstream_mode: resolved_mode,
root_hints,
srtt: std::sync::RwLock::new(numa::srtt::SrttCache::new(config.upstream.srtt)),
inflight: std::sync::Mutex::new(std::collections::HashMap::new()),
dnssec_enabled: config.dnssec.enabled,
dnssec_strict: config.dnssec.strict,
health_meta,
ca_pem,
mobile_enabled: config.mobile.enabled,
mobile_port: config.mobile.port,
});
let zone_count: usize = ctx.zone_map.values().map(|m| m.len()).sum();
// Build banner rows, then size the box to fit the longest value
let api_url = format!("http://localhost:{}", api_port);
let proxy_label = if config.proxy.enabled {
if config.proxy.tls_port > 0 {
Some(format!(
"http://:{} https://:{}",
config.proxy.port, config.proxy.tls_port
))
} else {
Some(format!(
"http://*.{} on :{}",
config.proxy.tld, config.proxy.port
))
}
} else {
None
};
let config_label = if ctx.config_found {
ctx.config_path.clone()
} else {
format!("{} (defaults)", ctx.config_path)
};
let data_label = ctx.data_dir.display().to_string();
let services_label = ctx.config_dir.join("services.json").display().to_string();
// label (10) + value + padding (2) = inner width; minimum 40 for the title row
let val_w = [
config.server.bind_addr.len(),
api_url.len(),
upstream_label.len(),
config_label.len(),
data_label.len(),
services_label.len(),
]
.into_iter()
.chain(proxy_label.as_ref().map(|s| s.len()))
.max()
.unwrap_or(30);
let w = (val_w + 12).max(42); // 10 label + 2 padding, min 42 for title
let o = "\x1b[38;2;192;98;58m"; // orange
let g = "\x1b[38;2;107;124;78m"; // green
let d = "\x1b[38;2;163;152;136m"; // dim
let r = "\x1b[0m"; // reset
let b = "\x1b[1;38;2;192;98;58m"; // bold orange
let it = "\x1b[3;38;2;163;152;136m"; // italic dim
let bar_top = "".repeat(w);
let bar_mid = "".repeat(w);
let row = |label: &str, color: &str, value: &str| {
eprintln!(
"{o}{r} {color}{:<9}{r} {:<vw$}{o}{r}",
label,
value,
vw = w - 12
);
};
// Title row: center within the box
let title = format!(
"{b}NUMA{r} {it}DNS that governs itself{r} {d}v{}{r}",
env!("CARGO_PKG_VERSION")
);
// The title contains ANSI codes; visible length is ~38 chars. Pad to fill the box.
let title_visible_len = 4 + 2 + 24 + 2 + 1 + env!("CARGO_PKG_VERSION").len() + 1;
let title_pad = w.saturating_sub(title_visible_len);
eprintln!("\n{o}{bar_top}{r}");
eprint!("{o}{r} {title}");
eprintln!("{}{o}{r}", " ".repeat(title_pad));
eprintln!("{o}{bar_top}{r}");
row("DNS", g, &config.server.bind_addr);
row("API", g, &api_url);
row("Dashboard", g, &api_url);
row(
"Upstream",
g,
if ctx.upstream_mode == numa::config::UpstreamMode::Recursive {
"recursive (root hints)"
} else {
&upstream_label
},
);
row("Zones", g, &format!("{} records", zone_count));
row(
"Cache",
g,
&format!("max {} entries", config.cache.max_entries),
);
if !config.cache.warm.is_empty() {
row("Warm", g, &format!("{} domains", config.cache.warm.len()));
}
row(
"Blocking",
g,
&if config.blocking.enabled {
format!("{} lists", config.blocking.lists.len())
} else {
"disabled".to_string()
},
);
if let Some(ref label) = proxy_label {
row("Proxy", g, label);
if config.proxy.bind_addr == "127.0.0.1" {
let y = "\x1b[38;2;204;176;59m"; // yellow
row(
"",
y,
&format!(
"⚠ proxy on 127.0.0.1 — .{} not LAN reachable",
config.proxy.tld
),
);
}
}
if config.dot.enabled {
row("DoT", g, &format!("tls://:{}", config.dot.port));
}
if doh_enabled {
row(
"DoH",
g,
&format!("https://:{}/dns-query", config.proxy.tls_port),
);
}
if config.lan.enabled {
row("LAN", g, "mDNS (_numa._tcp.local)");
}
if !ctx.forwarding_rules.is_empty() {
row(
"Routing",
g,
&format!("{} conditional rules", ctx.forwarding_rules.len()),
);
}
eprintln!("{o}{bar_mid}{r}");
row("Config", d, &config_label);
row("Data", d, &data_label);
row("Services", d, &services_label);
eprintln!("{o}{bar_top}{r}\n");
info!(
"numa listening on {}, upstream {}, {} zone records, cache max {}, API on port {}",
config.server.bind_addr, upstream_label, zone_count, config.cache.max_entries, api_port,
);
// Download blocklists on startup
let blocklist_lists = config.blocking.lists.clone();
let refresh_hours = config.blocking.refresh_hours;
if config.blocking.enabled && !blocklist_lists.is_empty() {
let bl_ctx = Arc::clone(&ctx);
let bl_lists = blocklist_lists.clone();
tokio::spawn(async move {
load_blocklists(&bl_ctx, &bl_lists).await;
// Periodic refresh
let mut interval = tokio::time::interval(Duration::from_secs(refresh_hours * 3600));
interval.tick().await; // skip immediate tick
loop {
interval.tick().await;
info!("refreshing blocklists...");
load_blocklists(&bl_ctx, &bl_lists).await;
}
});
}
// Prime TLD cache (recursive mode only)
if ctx.upstream_mode == numa::config::UpstreamMode::Recursive {
let prime_ctx = Arc::clone(&ctx);
let prime_tlds = config.upstream.prime_tlds;
tokio::spawn(async move {
numa::recursive::prime_tld_cache(
&prime_ctx.cache,
&prime_ctx.root_hints,
&prime_tlds,
&prime_ctx.srtt,
)
.await;
});
}
// Spawn cache warming for user-configured domains
if !config.cache.warm.is_empty() {
let warm_ctx = Arc::clone(&ctx);
let warm_domains = config.cache.warm.clone();
tokio::spawn(async move {
cache_warm_loop(warm_ctx, warm_domains).await;
});
}
// Spawn DoH connection keepalive — prevents idle TLS teardown
{
let keepalive_ctx = Arc::clone(&ctx);
tokio::spawn(async move {
doh_keepalive_loop(keepalive_ctx).await;
});
}
// Spawn HTTP API server
let api_ctx = Arc::clone(&ctx);
let api_addr: SocketAddr = format!("{}:{}", config.server.api_bind_addr, api_port).parse()?;
tokio::spawn(async move {
let app = numa::api::router(api_ctx);
let listener = tokio::net::TcpListener::bind(api_addr).await.unwrap();
info!("HTTP API listening on {}", api_addr);
axum::serve(listener, app).await.unwrap();
});
// Spawn Mobile API listener (read-only subset for iOS/Android companion
// apps, LAN-bound by default so phones can reach it). Only idempotent
// GETs; no state-mutating routes are exposed here regardless of
// the main API's bind address.
if config.mobile.enabled {
let mobile_ctx = Arc::clone(&ctx);
let mobile_bind = config.mobile.bind_addr.clone();
let mobile_port = config.mobile.port;
tokio::spawn(async move {
if let Err(e) = numa::mobile_api::start(mobile_ctx, mobile_bind, mobile_port).await {
log::warn!("Mobile API listener failed: {}", e);
}
});
}
let proxy_bind: std::net::Ipv4Addr = config
.proxy
.bind_addr
.parse()
.unwrap_or(std::net::Ipv4Addr::LOCALHOST);
// Spawn HTTP reverse proxy for .numa domains
if config.proxy.enabled {
let proxy_ctx = Arc::clone(&ctx);
let proxy_port = config.proxy.port;
tokio::spawn(async move {
numa::proxy::start_proxy(proxy_ctx, proxy_port, proxy_bind).await;
});
}
// Spawn HTTPS reverse proxy with TLS termination
if config.proxy.enabled && config.proxy.tls_port > 0 && ctx.tls_config.is_some() {
let proxy_ctx = Arc::clone(&ctx);
let tls_port = config.proxy.tls_port;
tokio::spawn(async move {
numa::proxy::start_proxy_tls(proxy_ctx, tls_port, proxy_bind).await;
});
}
// Spawn network change watcher (upstream re-detection, LAN IP update, peer flush)
{
let watch_ctx = Arc::clone(&ctx);
tokio::spawn(async move {
network_watch_loop(watch_ctx).await;
});
}
// Spawn LAN service discovery
if config.lan.enabled {
let lan_ctx = Arc::clone(&ctx);
let lan_config = config.lan.clone();
tokio::spawn(async move {
numa::lan::start_lan_discovery(lan_ctx, &lan_config).await;
});
}
// Spawn DNS-over-TLS listener (RFC 7858)
if config.dot.enabled {
let dot_ctx = Arc::clone(&ctx);
let dot_config = config.dot.clone();
tokio::spawn(async move {
numa::dot::start_dot(dot_ctx, &dot_config).await;
});
}
// UDP DNS listener
#[allow(clippy::infinite_loop)]
loop {
let mut buffer = BytePacketBuffer::new();
let (len, src_addr) = match ctx.socket.recv_from(&mut buffer.buf).await {
Ok(r) => r,
Err(e) if e.kind() == std::io::ErrorKind::ConnectionReset => {
// Windows delivers ICMP port-unreachable as ConnectionReset on UDP sockets
continue;
}
Err(e) => return Err(e.into()),
};
let ctx = Arc::clone(&ctx);
tokio::spawn(async move {
if let Err(e) = handle_query(buffer, len, src_addr, &ctx, Transport::Udp).await {
error!("{} | HANDLER ERROR | {}", src_addr, e);
}
});
}
}
async fn network_watch_loop(ctx: Arc<numa::ctx::ServerCtx>) {
let mut tick: u64 = 0;
let mut interval = tokio::time::interval(Duration::from_secs(5));
interval.tick().await; // skip immediate tick
loop {
interval.tick().await;
tick += 1;
let mut changed = false;
// Check LAN IP change (every 5s — cheap, one UDP socket call)
if let Some(new_ip) = numa::lan::detect_lan_ip() {
let mut current_ip = ctx.lan_ip.lock().unwrap();
if new_ip != *current_ip {
info!("LAN IP changed: {} → {}", current_ip, new_ip);
*current_ip = new_ip;
changed = true;
numa::recursive::reset_udp_state();
}
}
// Re-detect upstream every 30s or on LAN IP change (auto-detect only)
if ctx.upstream_auto && (changed || tick.is_multiple_of(6)) {
let dns_info = numa::system_dns::discover_system_dns();
let new_addr = dns_info
.default_upstream
.or_else(numa::system_dns::detect_dhcp_dns)
.unwrap_or_else(|| QUAD9_IP.to_string());
let mut pool = ctx.upstream_pool.lock().unwrap();
if pool.maybe_update_primary(&new_addr, ctx.upstream_port) {
info!("upstream changed → {}", pool.label());
changed = true;
}
}
// Flush stale LAN peers on any network change
if changed {
ctx.lan_peers.lock().unwrap().clear();
info!("flushed LAN peers after network change");
}
// Re-probe UDP every 5 minutes when disabled
if tick.is_multiple_of(60) {
numa::recursive::probe_udp(&ctx.root_hints).await;
}
}
} }
fn set_lan_enabled(enabled: bool, path: &str) -> numa::Result<()> { fn set_lan_enabled(enabled: bool, path: &str) -> numa::Result<()> {
@@ -730,71 +220,3 @@ fn print_lan_status(enabled: bool) {
eprintln!(" Restart Numa to start mDNS discovery"); eprintln!(" Restart Numa to start mDNS discovery");
} }
} }
async fn load_blocklists(ctx: &ServerCtx, lists: &[String]) {
let downloaded = download_blocklists(lists).await;
// Parse outside the lock to avoid blocking DNS queries during parse (~100ms)
let mut all_domains = std::collections::HashSet::new();
let mut sources = Vec::new();
for (source, text) in &downloaded {
let domains = parse_blocklist(text);
info!("blocklist: {} domains from {}", domains.len(), source);
all_domains.extend(domains);
sources.push(source.clone());
}
let total = all_domains.len();
// Swap under lock — sub-microsecond
ctx.blocklist
.write()
.unwrap()
.swap_domains(all_domains, sources);
info!(
"blocking enabled: {} unique domains from {} lists",
total,
downloaded.len()
);
}
async fn warm_domain(ctx: &ServerCtx, domain: &str) {
for qtype in [
numa::question::QueryType::A,
numa::question::QueryType::AAAA,
] {
numa::ctx::refresh_entry(ctx, domain, qtype).await;
}
}
async fn doh_keepalive_loop(ctx: Arc<ServerCtx>) {
let mut interval = tokio::time::interval(Duration::from_secs(25));
interval.tick().await; // skip first immediate tick
loop {
interval.tick().await;
let pool = ctx.upstream_pool.lock().unwrap().clone();
if let Some(upstream) = pool.preferred() {
numa::forward::keepalive_doh(upstream).await;
}
}
}
async fn cache_warm_loop(ctx: Arc<ServerCtx>, domains: Vec<String>) {
tokio::time::sleep(Duration::from_secs(2)).await;
for domain in &domains {
warm_domain(&ctx, domain).await;
}
info!("cache warm: {} domains resolved at startup", domains.len());
let mut interval = tokio::time::interval(Duration::from_secs(30));
interval.tick().await;
loop {
interval.tick().await;
for domain in &domains {
let refresh = ctx.cache.read().unwrap().needs_warm(domain);
if refresh {
warm_domain(&ctx, domain).await;
}
}
}
}

489
src/odoh.rs Normal file
View File

@@ -0,0 +1,489 @@
//! ODoH target-config fetcher and TTL cache (RFC 9230 §6).
//!
//! ## Ciphersuite policy
//! `odoh-rs` deserialization rejects any config whose KEM/KDF/AEAD triple is
//! not the mandatory `(X25519, HKDF-SHA256, AES-128-GCM)` (see
//! `ObliviousDoHConfigContents::deserialize`). This is stricter than the
//! plan's "pick the mandatory suite if mixed": a response containing *any*
//! non-mandatory config fails parse entirely. Real-world targets publish a
//! single mandatory config, so this is fine in practice; revisit if a target
//! that matters starts mixing suites.
use std::sync::Arc;
use std::time::{Duration, Instant};
use arc_swap::ArcSwapOption;
use odoh_rs::{
ObliviousDoHConfigContents, ObliviousDoHConfigs, ObliviousDoHMessage,
ObliviousDoHMessagePlaintext,
};
use rand_core::{OsRng, TryRngCore};
use reqwest::header::HeaderMap;
use tokio::sync::Mutex;
use tokio::time::timeout;
use crate::Result;
/// MIME type used for both directions of the ODoH exchange (RFC 9230 §4).
pub(crate) const ODOH_CONTENT_TYPE: &str = "application/oblivious-dns-message";
/// Cap on the response body we read into memory when the relay returns
/// non-success. Protects against a hostile relay streaming a huge body on
/// the error path; keeps enough room to carry a human-readable reason.
const ERROR_BODY_PREVIEW_BYTES: usize = 1024;
/// Fallback TTL when the target's response lacks a usable `Cache-Control`
/// directive. RFC 9230 §6.2 places no hard floor; 24 h matches what Cloudflare
/// publishes in practice.
const DEFAULT_CONFIG_TTL: Duration = Duration::from_secs(24 * 60 * 60);
/// Cap on any TTL we'll honour, regardless of what the target advertises.
/// Keeps a misconfigured server from pinning an old key indefinitely.
const MAX_CONFIG_TTL: Duration = Duration::from_secs(7 * 24 * 60 * 60);
/// After a failed `/.well-known/odohconfigs` fetch, refuse to refetch again
/// within this window — a target that is genuinely broken would otherwise
/// receive one request per query. Queries that arrive during the backoff
/// return the cached error immediately.
const REFRESH_BACKOFF: Duration = Duration::from_secs(60);
/// Parsed ODoH target config plus the freshness metadata needed to age it out.
#[derive(Debug)]
pub struct OdohTargetConfig {
pub contents: ObliviousDoHConfigContents,
pub key_id: Vec<u8>,
expires_at: Instant,
}
impl OdohTargetConfig {
pub fn is_expired(&self) -> bool {
Instant::now() >= self.expires_at
}
}
struct FailedRefresh {
at: Instant,
err: String,
}
/// TTL-gated cache of a single target's HPKE config.
///
/// Reads go through `ArcSwapOption` (lock-free hot path). Refreshes serialize
/// on an async mutex so a burst of simultaneous misses produces a single
/// outbound fetch, and a failed refresh blocks subsequent refetches for
/// [`REFRESH_BACKOFF`] to prevent hot-looping against a broken target.
pub struct OdohConfigCache {
target_host: String,
configs_url: String,
client: reqwest::Client,
current: ArcSwapOption<OdohTargetConfig>,
last_failure: ArcSwapOption<FailedRefresh>,
refresh_lock: Mutex<()>,
}
impl OdohConfigCache {
pub fn new(target_host: String, client: reqwest::Client) -> Self {
let configs_url = format!("https://{}/.well-known/odohconfigs", target_host);
Self {
target_host,
configs_url,
client,
current: ArcSwapOption::from(None),
last_failure: ArcSwapOption::from(None),
refresh_lock: Mutex::new(()),
}
}
pub fn target_host(&self) -> &str {
&self.target_host
}
/// Return a valid config, refetching when the cache is cold or expired.
/// Within [`REFRESH_BACKOFF`] of a failed refresh, returns the cached
/// error without issuing another fetch.
pub async fn get(&self) -> Result<Arc<OdohTargetConfig>> {
if let Some(cfg) = self.current.load_full() {
if !cfg.is_expired() {
return Ok(cfg);
}
}
if let Some(err) = self.backoff_error() {
return Err(err);
}
let _guard = self.refresh_lock.lock().await;
// Another task may have refreshed or failed while we waited.
if let Some(cfg) = self.current.load_full() {
if !cfg.is_expired() {
return Ok(cfg);
}
}
if let Some(err) = self.backoff_error() {
return Err(err);
}
match fetch_odoh_config(&self.client, &self.configs_url).await {
Ok(fresh) => {
let fresh = Arc::new(fresh);
self.current.store(Some(fresh.clone()));
self.last_failure.store(None);
Ok(fresh)
}
Err(e) => {
let msg = format!("ODoH config fetch failed: {e}");
self.last_failure.store(Some(Arc::new(FailedRefresh {
at: Instant::now(),
err: msg.clone(),
})));
Err(msg.into())
}
}
}
/// Drop the cached config. Called after the target rejects ciphertext
/// (key rotation race) so the next `get()` refetches.
pub fn invalidate(&self) {
self.current.store(None);
}
fn backoff_error(&self) -> Option<crate::Error> {
let fail = self.last_failure.load_full()?;
if fail.at.elapsed() < REFRESH_BACKOFF {
Some(format!("{} (backoff active)", fail.err).into())
} else {
None
}
}
}
/// Fetch `/.well-known/odohconfigs` from `configs_url` and parse it into an
/// [`OdohTargetConfig`]. The TTL is taken from the response's
/// `Cache-Control: max-age=`, clamped to [`DEFAULT_CONFIG_TTL`,
/// [`MAX_CONFIG_TTL`]] when absent or obviously wrong.
pub async fn fetch_odoh_config(
client: &reqwest::Client,
configs_url: &str,
) -> Result<OdohTargetConfig> {
let resp = client.get(configs_url).send().await?.error_for_status()?;
let ttl = cache_control_ttl(resp.headers()).unwrap_or(DEFAULT_CONFIG_TTL);
let body = resp.bytes().await?;
parse_odoh_config(&body, ttl)
}
fn parse_odoh_config(body: &[u8], ttl: Duration) -> Result<OdohTargetConfig> {
let mut buf = body;
let configs: ObliviousDoHConfigs = odoh_rs::parse(&mut buf)
.map_err(|e| format!("failed to parse ObliviousDoHConfigs: {e}"))?;
let first = configs
.into_iter()
.next()
.ok_or("target published no ODoH configs with a supported version + ciphersuite")?;
let contents: ObliviousDoHConfigContents = first.into();
let key_id = contents
.identifier()
.map_err(|e| format!("failed to derive key_id from ODoH config: {e}"))?;
Ok(OdohTargetConfig {
contents,
key_id,
expires_at: Instant::now() + ttl.min(MAX_CONFIG_TTL),
})
}
/// Send a DNS wire query through an ODoH relay to a target and return the
/// plaintext DNS wire response.
///
/// Flow: fetch the target's HPKE config (cached), seal the query, POST to the
/// relay with `Targethost`/`Targetpath` headers, then unseal the response.
/// On seal/unseal failure we invalidate the cache and retry once — this
/// handles the benign race where the target rotated its key between our
/// cached config and the POST.
pub async fn query_through_relay(
wire: &[u8],
relay_url: &str,
target_path: &str,
client: &reqwest::Client,
cache: &OdohConfigCache,
timeout_duration: Duration,
) -> Result<Vec<u8>> {
let req = OdohRequest {
wire,
relay_url,
target_path,
client,
cache,
timeout: timeout_duration,
};
match attempt_query(&req).await {
Ok(v) => Ok(v),
Err(AttemptError::KeyRotation(_)) => {
cache.invalidate();
attempt_query(&req).await.map_err(AttemptError::into_error)
}
Err(e) => Err(e.into_error()),
}
}
struct OdohRequest<'a> {
wire: &'a [u8],
relay_url: &'a str,
target_path: &'a str,
client: &'a reqwest::Client,
cache: &'a OdohConfigCache,
timeout: Duration,
}
/// Classification used only by the retry path in [`query_through_relay`].
enum AttemptError {
/// Target signalled the config we used is stale (key rotation race).
/// Callers should invalidate the cache and retry exactly once.
KeyRotation(String),
/// Any other failure — transport, timeout, malformed response.
Other(crate::Error),
}
impl AttemptError {
fn into_error(self) -> crate::Error {
match self {
AttemptError::KeyRotation(m) => format!("ODoH key rotation race: {m}").into(),
AttemptError::Other(e) => e,
}
}
}
async fn attempt_query(req: &OdohRequest<'_>) -> std::result::Result<Vec<u8>, AttemptError> {
let cfg = req.cache.get().await.map_err(AttemptError::Other)?;
let plaintext = ObliviousDoHMessagePlaintext::new(req.wire, 0);
// rand_core 0.9's OsRng is fallible-only; wrap for the infallible bound.
let mut os = OsRng;
let mut rng = os.unwrap_mut();
let (encrypted_query, client_secret) =
odoh_rs::encrypt_query(&plaintext, &cfg.contents, &mut rng)
.map_err(|e| AttemptError::Other(format!("ODoH encrypt failed: {e}").into()))?;
let body = odoh_rs::compose(&encrypted_query)
.map_err(|e| AttemptError::Other(format!("ODoH compose failed: {e}").into()))?
.freeze();
// RFC 9230 §5 and the reference client use URL query parameters, not
// HTTP headers, to carry the target routing. `Targethost`/`Targetpath`
// headers cause relays to treat the request as an unspecified-target and
// reject it.
let (status, resp_body) = timeout(req.timeout, async {
let resp = req
.client
.post(req.relay_url)
.header(reqwest::header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.header(reqwest::header::ACCEPT, ODOH_CONTENT_TYPE)
.header(reqwest::header::CACHE_CONTROL, "no-cache, no-store")
.query(&[
("targethost", req.cache.target_host()),
("targetpath", req.target_path),
])
.body(body)
.send()
.await?;
let status = resp.status();
let body = resp.bytes().await?;
Ok::<_, reqwest::Error>((status, body))
})
.await
.map_err(|_| AttemptError::Other("ODoH relay request timed out".into()))?
.map_err(|e| AttemptError::Other(format!("ODoH relay request failed: {e}").into()))?;
// RFC 9230 §4.3 expects a target that can't decrypt to reply with a DNS
// error in a sealed 200 response; a 401 from the relay/target is the
// practical signal that our cached HPKE key is stale. Treat 400 as a
// client-side bug (malformed ODoH envelope) — retrying would loop-fail.
if !status.is_success() {
let preview_len = resp_body.len().min(ERROR_BODY_PREVIEW_BYTES);
let body_preview = String::from_utf8_lossy(&resp_body[..preview_len]);
let msg = format!("ODoH relay returned {status}: {}", body_preview.trim());
return Err(if status.as_u16() == 401 {
AttemptError::KeyRotation(msg)
} else {
AttemptError::Other(msg.into())
});
}
let mut buf = resp_body;
let encrypted_response: ObliviousDoHMessage = odoh_rs::parse(&mut buf)
.map_err(|e| AttemptError::Other(format!("ODoH response parse failed: {e}").into()))?;
let plaintext_response =
odoh_rs::decrypt_response(&plaintext, &encrypted_response, client_secret)
.map_err(|e| AttemptError::KeyRotation(format!("ODoH decrypt failed: {e}")))?;
Ok(plaintext_response.into_msg().to_vec())
}
fn cache_control_ttl(headers: &HeaderMap) -> Option<Duration> {
let cc = headers.get(reqwest::header::CACHE_CONTROL)?.to_str().ok()?;
for directive in cc.split(',') {
let directive = directive.trim();
if let Some(rest) = directive.strip_prefix("max-age=") {
if let Ok(secs) = rest.trim().parse::<u64>() {
if secs > 0 {
return Some(Duration::from_secs(secs));
}
}
}
}
None
}
#[cfg(test)]
mod tests {
use super::*;
use odoh_rs::{ObliviousDoHConfig, ObliviousDoHKeyPair};
// RFC 9180 HPKE IDs for the sole ODoH mandatory suite:
// KEM = X25519, KDF = HKDF-SHA256, AEAD = AES-128-GCM.
const KEM_X25519: u16 = 0x0020;
const KDF_SHA256: u16 = 0x0001;
const AEAD_AES128GCM: u16 = 0x0001;
fn synth_configs_bytes() -> Vec<u8> {
let kp = ObliviousDoHKeyPair::from_parameters(
KEM_X25519,
KDF_SHA256,
AEAD_AES128GCM,
&[0u8; 32],
);
let pk = kp.public().clone();
let configs: ObliviousDoHConfigs = vec![ObliviousDoHConfig::from(pk)].into();
odoh_rs::compose(&configs).unwrap().to_vec()
}
#[test]
fn parse_accepts_well_formed_config() {
let bytes = synth_configs_bytes();
let cfg = parse_odoh_config(&bytes, Duration::from_secs(3600)).unwrap();
assert!(!cfg.key_id.is_empty());
assert!(!cfg.is_expired());
}
#[test]
fn parse_rejects_garbage() {
let bytes = [0xffu8; 16];
assert!(parse_odoh_config(&bytes, Duration::from_secs(3600)).is_err());
}
#[test]
fn parse_rejects_empty() {
assert!(parse_odoh_config(&[], Duration::from_secs(3600)).is_err());
}
#[test]
fn ttl_capped_at_max() {
let bytes = synth_configs_bytes();
let cfg = parse_odoh_config(&bytes, Duration::from_secs(100 * 24 * 60 * 60)).unwrap();
let remaining = cfg.expires_at.saturating_duration_since(Instant::now());
assert!(remaining <= MAX_CONFIG_TTL);
assert!(remaining >= MAX_CONFIG_TTL - Duration::from_secs(1));
}
#[test]
fn cache_control_parses_max_age() {
let mut h = HeaderMap::new();
h.insert("cache-control", "public, max-age=86400".parse().unwrap());
assert_eq!(cache_control_ttl(&h), Some(Duration::from_secs(86400)));
}
#[test]
fn cache_control_ignores_max_age_zero() {
let mut h = HeaderMap::new();
h.insert("cache-control", "max-age=0, no-store".parse().unwrap());
assert_eq!(cache_control_ttl(&h), None);
}
#[test]
fn cache_control_missing_falls_back() {
let h = HeaderMap::new();
assert_eq!(cache_control_ttl(&h), None);
}
#[test]
fn is_expired_tracks_ttl() {
let bytes = synth_configs_bytes();
let mut cfg = parse_odoh_config(&bytes, Duration::from_secs(3600)).unwrap();
assert!(!cfg.is_expired());
cfg.expires_at = Instant::now() - Duration::from_secs(1);
assert!(cfg.is_expired());
}
#[tokio::test]
async fn cache_backoff_blocks_refetch_after_failure() {
// Point the cache at a host that does not exist so the fetch fails
// deterministically; this exercises the backoff wiring without a
// network round-trip succeeding.
let cache = OdohConfigCache::new(
"odoh-target.invalid".to_string(),
reqwest::Client::builder()
.timeout(Duration::from_millis(200))
.build()
.unwrap(),
);
let first = cache.get().await;
assert!(first.is_err(), "first fetch must fail against invalid host");
// Within the backoff window, the cached error is returned immediately.
let second = cache.get().await.unwrap_err().to_string();
assert!(
second.contains("backoff active"),
"expected backoff hint, got: {second}"
);
// Reaching past the backoff window allows a fresh attempt — simulate
// by rewinding the recorded failure timestamp.
cache.last_failure.store(Some(Arc::new(FailedRefresh {
at: Instant::now() - (REFRESH_BACKOFF + Duration::from_secs(1)),
err: "prior".to_string(),
})));
let third = cache.get().await.unwrap_err().to_string();
assert!(
!third.contains("backoff active"),
"expected fresh fetch attempt, got: {third}"
);
}
/// Round-trip the HPKE seal/unseal path in isolation from HTTP, using the
/// odoh-rs primitives that `query_through_relay` wires together. Guards
/// against silently breaking the crypto glue if we refactor that path.
#[test]
fn seal_unseal_round_trip() {
use odoh_rs::{decrypt_query, encrypt_response, ResponseNonce};
let kp = ObliviousDoHKeyPair::from_parameters(
KEM_X25519,
KDF_SHA256,
AEAD_AES128GCM,
&[0u8; 32],
);
let query_wire = b"\x12\x34\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x07example\x03com\x00\x00\x01\x00\x01";
let query_pt = ObliviousDoHMessagePlaintext::new(query_wire, 0);
let mut os = OsRng;
let mut rng = os.unwrap_mut();
let (query_enc, client_secret) =
odoh_rs::encrypt_query(&query_pt, kp.public(), &mut rng).unwrap();
let (query_back, server_secret) = decrypt_query(&query_enc, &kp).unwrap();
assert_eq!(query_back.into_msg().as_ref(), query_wire);
let response_wire = b"\x12\x34\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00";
let response_pt = ObliviousDoHMessagePlaintext::new(response_wire, 0);
let response_enc = encrypt_response(
&query_pt,
&response_pt,
server_secret,
ResponseNonce::default(),
)
.unwrap();
let response_back =
odoh_rs::decrypt_response(&query_pt, &response_enc, client_secret).unwrap();
assert_eq!(response_back.into_msg().as_ref(), response_wire);
}
}

View File

@@ -85,6 +85,14 @@ impl DnsPacket {
+ self.edns.as_ref().map_or(0, |e| e.options.capacity()) + self.edns.as_ref().map_or(0, |e| e.options.capacity())
} }
/// Apply `f` to every record in the three RR sections (answers,
/// authorities, resources). Does not touch questions or edns.
pub fn for_each_record_mut(&mut self, mut f: impl FnMut(&mut DnsRecord)) {
self.answers.iter_mut().for_each(&mut f);
self.authorities.iter_mut().for_each(&mut f);
self.resources.iter_mut().for_each(&mut f);
}
pub fn response_from(query: &DnsPacket, rescode: crate::header::ResultCode) -> DnsPacket { pub fn response_from(query: &DnsPacket, rescode: crate::header::ResultCode) -> DnsPacket {
let mut resp = DnsPacket::new(); let mut resp = DnsPacket::new();
resp.header.id = query.header.id; resp.header.id = query.header.id;

View File

@@ -1,114 +1,66 @@
use crate::buffer::BytePacketBuffer; use crate::buffer::BytePacketBuffer;
use crate::Result; use crate::Result;
#[derive(PartialEq, Eq, Debug, Clone, Hash, Copy)] macro_rules! define_qtypes {
pub enum QueryType { ( $( $variant:ident = $num:literal, $str:literal ),* $(,)? ) => {
UNKNOWN(u16), #[derive(PartialEq, Eq, Debug, Clone, Hash, Copy)]
A, // 1 pub enum QueryType {
NS, // 2 UNKNOWN(u16),
CNAME, // 5 $( $variant, )*
SOA, // 6 }
PTR, // 12
MX, // 15 impl QueryType {
TXT, // 16 pub fn to_num(&self) -> u16 {
AAAA, // 28 match *self {
SRV, // 33 QueryType::UNKNOWN(x) => x,
DS, // 43 $( QueryType::$variant => $num, )*
RRSIG, // 46 }
NSEC, // 47 }
DNSKEY, // 48
NSEC3, // 50 pub fn from_num(num: u16) -> QueryType {
OPT, // 41 (EDNS0 pseudo-type) match num {
HTTPS, // 65 $( $num => QueryType::$variant, )*
_ => QueryType::UNKNOWN(num),
}
}
pub fn as_str(&self) -> &'static str {
match self {
QueryType::UNKNOWN(_) => "UNKNOWN",
$( QueryType::$variant => $str, )*
}
}
pub fn parse_str(s: &str) -> Option<QueryType> {
match s.to_ascii_uppercase().as_str() {
$( $str => Some(QueryType::$variant), )*
_ => None,
}
}
}
};
} }
impl QueryType { define_qtypes! {
pub fn to_num(&self) -> u16 { A = 1, "A",
match *self { NS = 2, "NS",
QueryType::UNKNOWN(x) => x, CNAME = 5, "CNAME",
QueryType::A => 1, SOA = 6, "SOA",
QueryType::NS => 2, PTR = 12, "PTR",
QueryType::CNAME => 5, MX = 15, "MX",
QueryType::SOA => 6, TXT = 16, "TXT",
QueryType::PTR => 12, AAAA = 28, "AAAA",
QueryType::MX => 15, LOC = 29, "LOC",
QueryType::TXT => 16, SRV = 33, "SRV",
QueryType::AAAA => 28, NAPTR = 35, "NAPTR",
QueryType::SRV => 33, OPT = 41, "OPT",
QueryType::OPT => 41, DS = 43, "DS",
QueryType::DS => 43, RRSIG = 46, "RRSIG",
QueryType::RRSIG => 46, NSEC = 47, "NSEC",
QueryType::NSEC => 47, DNSKEY = 48, "DNSKEY",
QueryType::DNSKEY => 48, NSEC3 = 50, "NSEC3",
QueryType::NSEC3 => 50, SVCB = 64, "SVCB",
QueryType::HTTPS => 65, HTTPS = 65, "HTTPS",
}
}
pub fn from_num(num: u16) -> QueryType {
match num {
1 => QueryType::A,
2 => QueryType::NS,
5 => QueryType::CNAME,
6 => QueryType::SOA,
12 => QueryType::PTR,
15 => QueryType::MX,
16 => QueryType::TXT,
28 => QueryType::AAAA,
33 => QueryType::SRV,
41 => QueryType::OPT,
43 => QueryType::DS,
46 => QueryType::RRSIG,
47 => QueryType::NSEC,
48 => QueryType::DNSKEY,
50 => QueryType::NSEC3,
65 => QueryType::HTTPS,
_ => QueryType::UNKNOWN(num),
}
}
pub fn as_str(&self) -> &'static str {
match self {
QueryType::A => "A",
QueryType::NS => "NS",
QueryType::CNAME => "CNAME",
QueryType::SOA => "SOA",
QueryType::PTR => "PTR",
QueryType::MX => "MX",
QueryType::TXT => "TXT",
QueryType::AAAA => "AAAA",
QueryType::SRV => "SRV",
QueryType::OPT => "OPT",
QueryType::DS => "DS",
QueryType::RRSIG => "RRSIG",
QueryType::NSEC => "NSEC",
QueryType::DNSKEY => "DNSKEY",
QueryType::NSEC3 => "NSEC3",
QueryType::HTTPS => "HTTPS",
QueryType::UNKNOWN(_) => "UNKNOWN",
}
}
pub fn parse_str(s: &str) -> Option<QueryType> {
match s.to_ascii_uppercase().as_str() {
"A" => Some(QueryType::A),
"NS" => Some(QueryType::NS),
"CNAME" => Some(QueryType::CNAME),
"SOA" => Some(QueryType::SOA),
"PTR" => Some(QueryType::PTR),
"MX" => Some(QueryType::MX),
"TXT" => Some(QueryType::TXT),
"AAAA" => Some(QueryType::AAAA),
"SRV" => Some(QueryType::SRV),
"DS" => Some(QueryType::DS),
"RRSIG" => Some(QueryType::RRSIG),
"DNSKEY" => Some(QueryType::DNSKEY),
"NSEC" => Some(QueryType::NSEC),
"NSEC3" => Some(QueryType::NSEC3),
"HTTPS" => Some(QueryType::HTTPS),
_ => None,
}
}
} }
#[derive(Debug, Clone, PartialEq, Eq)] #[derive(Debug, Clone, PartialEq, Eq)]

342
src/relay.rs Normal file
View File

@@ -0,0 +1,342 @@
//! ODoH relay (RFC 9230 §5) — the forward-without-reading half of the
//! protocol. Runs `numa relay`; skips all resolver initialisation (no port
//! 53, no cache, no recursion, no dashboard). The relay never reads the
//! HPKE-sealed payload and keeps no per-request logs — only aggregate
//! counters.
use std::net::SocketAddr;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::Duration;
use axum::body::Bytes;
use axum::extract::{DefaultBodyLimit, Query, State};
use axum::http::{header, StatusCode};
use axum::response::{IntoResponse, Response};
use axum::routing::{get, post};
use axum::Router;
use log::{error, info};
use serde::Deserialize;
use tokio::net::TcpListener;
use crate::forward::build_https_client_with_pool;
use crate::odoh::ODOH_CONTENT_TYPE;
use crate::Result;
/// Cap on the opaque body we accept from a client. ODoH envelopes are
/// ~100300 bytes in practice; anything larger is malformed or hostile.
const MAX_BODY_BYTES: usize = 4 * 1024;
/// Cap on the body we read back from the target before streaming to client.
/// Slightly larger: target responses carry DNS answers plus HPKE overhead.
const MAX_TARGET_RESPONSE_BYTES: usize = 8 * 1024;
/// Covers the whole client-to-target round trip — not just `.send()` — so a
/// slow-drip target can't hang a worker indefinitely after headers arrive.
const TARGET_REQUEST_TIMEOUT: Duration = Duration::from_secs(5);
/// The relay hits many distinct target hosts on behalf of clients. A
/// per-host idle pool of 4 keeps warm TLS connections available for concurrent
/// fan-out without blowing up memory on a small VPS.
const RELAY_POOL_PER_HOST: usize = 4;
#[derive(Deserialize)]
struct RelayParams {
targethost: String,
targetpath: String,
}
struct RelayState {
client: reqwest::Client,
total_requests: AtomicU64,
forwarded_ok: AtomicU64,
forwarded_err: AtomicU64,
rejected_bad_request: AtomicU64,
}
impl RelayState {
fn new() -> Arc<Self> {
Arc::new(RelayState {
client: build_https_client_with_pool(RELAY_POOL_PER_HOST),
total_requests: AtomicU64::new(0),
forwarded_ok: AtomicU64::new(0),
forwarded_err: AtomicU64::new(0),
rejected_bad_request: AtomicU64::new(0),
})
}
}
/// `DefaultBodyLimit` overrides axum's 2 MiB default so hostile clients
/// can't force the relay to buffer multi-MB bodies before our own cap.
fn build_app(state: Arc<RelayState>) -> Router {
Router::new()
.route("/relay", post(handle_relay))
.layer(DefaultBodyLimit::max(MAX_BODY_BYTES))
.route("/health", get(handle_health))
.with_state(state)
}
pub async fn run(addr: SocketAddr) -> Result<()> {
let app = build_app(RelayState::new());
let listener = TcpListener::bind(addr).await?;
info!("ODoH relay listening on {}", addr);
axum::serve(listener, app).await?;
Ok(())
}
async fn handle_health(State(state): State<Arc<RelayState>>) -> impl IntoResponse {
let body = format!(
"ok\ntotal {}\nforwarded_ok {}\nforwarded_err {}\nrejected_bad_request {}\n",
state.total_requests.load(Ordering::Relaxed),
state.forwarded_ok.load(Ordering::Relaxed),
state.forwarded_err.load(Ordering::Relaxed),
state.rejected_bad_request.load(Ordering::Relaxed),
);
(
StatusCode::OK,
[(header::CONTENT_TYPE, "text/plain; charset=utf-8")],
body,
)
}
async fn handle_relay(
State(state): State<Arc<RelayState>>,
Query(params): Query<RelayParams>,
headers: axum::http::HeaderMap,
body: Bytes,
) -> Response {
state.total_requests.fetch_add(1, Ordering::Relaxed);
if !content_type_matches(&headers, ODOH_CONTENT_TYPE) {
state.rejected_bad_request.fetch_add(1, Ordering::Relaxed);
return (
StatusCode::UNSUPPORTED_MEDIA_TYPE,
"expected application/oblivious-dns-message",
)
.into_response();
}
if body.len() > MAX_BODY_BYTES {
state.rejected_bad_request.fetch_add(1, Ordering::Relaxed);
return (StatusCode::PAYLOAD_TOO_LARGE, "body exceeds 4 KiB cap").into_response();
}
if !is_valid_hostname(&params.targethost) || !params.targetpath.starts_with('/') {
state.rejected_bad_request.fetch_add(1, Ordering::Relaxed);
return (StatusCode::BAD_REQUEST, "invalid targethost or targetpath").into_response();
}
let target_url = format!("https://{}{}", params.targethost, params.targetpath);
match forward_to_target(&state.client, &target_url, body).await {
Ok((status, resp_body)) => {
state.forwarded_ok.fetch_add(1, Ordering::Relaxed);
(
status,
[(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)],
resp_body,
)
.into_response()
}
Err(e) => {
// Log the underlying reason for operators; don't leak reqwest
// internals (which can reveal the target's TLS config, IP, etc.)
// back to arbitrary clients.
error!("relay forward to {} failed: {}", target_url, e);
state.forwarded_err.fetch_add(1, Ordering::Relaxed);
(StatusCode::BAD_GATEWAY, "target unreachable").into_response()
}
}
}
async fn forward_to_target(
client: &reqwest::Client,
url: &str,
body: Bytes,
) -> Result<(StatusCode, Bytes)> {
let response = tokio::time::timeout(TARGET_REQUEST_TIMEOUT, async {
let resp = client
.post(url)
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.header(header::ACCEPT, ODOH_CONTENT_TYPE)
.body(body)
.send()
.await?;
let status = StatusCode::from_u16(resp.status().as_u16())?;
let resp_body = resp.bytes().await?;
Ok::<_, crate::Error>((status, resp_body))
})
.await
.map_err(|_| "timed out talking to target")??;
if response.1.len() > MAX_TARGET_RESPONSE_BYTES {
return Err("target response exceeds cap".into());
}
Ok(response)
}
fn content_type_matches(headers: &axum::http::HeaderMap, expected: &str) -> bool {
headers
.get(header::CONTENT_TYPE)
.and_then(|v| v.to_str().ok())
.map(|ct| ct.split(';').next().unwrap_or("").trim() == expected)
.unwrap_or(false)
}
/// Strict DNS-hostname validator, aimed at closing the SSRF surface a naive
/// `contains('.')` check leaves open (e.g. `example.com@internal.host`,
/// `evil.com/../admin`). Requires ASCII letters/digits/dot/dash, at least
/// one dot, no leading dot or dash, length ≤ 253 per RFC 1035.
fn is_valid_hostname(h: &str) -> bool {
if h.is_empty() || h.len() > 253 || !h.contains('.') {
return false;
}
if h.starts_with('.') || h.starts_with('-') || h.ends_with('.') || h.ends_with('-') {
return false;
}
h.chars()
.all(|c| c.is_ascii_alphanumeric() || c == '.' || c == '-')
}
#[cfg(test)]
mod tests {
use super::*;
async fn spawn_relay() -> (SocketAddr, Arc<RelayState>) {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let addr = listener.local_addr().unwrap();
let state = RelayState::new();
let app = build_app(state.clone());
tokio::spawn(async move {
let _ = axum::serve(listener, app).await;
});
(addr, state)
}
#[tokio::test]
async fn rejects_missing_content_type() {
let (addr, state) = spawn_relay().await;
let client = reqwest::Client::new();
let resp = client
.post(format!(
"http://{}/relay?targethost=odoh.example.com&targetpath=/dns-query",
addr
))
.body("body")
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::UNSUPPORTED_MEDIA_TYPE);
assert_eq!(state.rejected_bad_request.load(Ordering::Relaxed), 1);
}
#[tokio::test]
async fn rejects_oversized_body() {
let (addr, _state) = spawn_relay().await;
let big = vec![0u8; MAX_BODY_BYTES + 1];
let client = reqwest::Client::new();
let resp = client
.post(format!(
"http://{}/relay?targethost=odoh.example.com&targetpath=/dns-query",
addr
))
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.body(big)
.send()
.await
.unwrap();
// axum's DefaultBodyLimit rejects before our handler runs, so the
// counter doesn't increment — but the status code proves the layer
// enforced the cap. Either status is acceptable evidence.
assert!(matches!(
resp.status(),
reqwest::StatusCode::PAYLOAD_TOO_LARGE | reqwest::StatusCode::BAD_REQUEST
));
}
#[tokio::test]
async fn rejects_targethost_without_dot() {
let (addr, state) = spawn_relay().await;
let client = reqwest::Client::new();
let resp = client
.post(format!(
"http://{}/relay?targethost=localhost&targetpath=/dns-query",
addr
))
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.body("body")
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::BAD_REQUEST);
assert_eq!(state.rejected_bad_request.load(Ordering::Relaxed), 1);
}
#[tokio::test]
async fn rejects_userinfo_ssrf_attempt() {
let (addr, state) = spawn_relay().await;
let client = reqwest::Client::new();
// The naive contains('.') check would let this through and reqwest
// would route to `internal.host` using `evil.com` as userinfo.
let resp = client
.post(format!(
"http://{}/relay?targethost=evil.com@internal.host&targetpath=/dns-query",
addr
))
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.body("body")
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::BAD_REQUEST);
assert_eq!(state.rejected_bad_request.load(Ordering::Relaxed), 1);
}
#[tokio::test]
async fn rejects_targetpath_without_leading_slash() {
let (addr, state) = spawn_relay().await;
let client = reqwest::Client::new();
let resp = client
.post(format!(
"http://{}/relay?targethost=odoh.example.com&targetpath=dns-query",
addr
))
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.body("body")
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::BAD_REQUEST);
assert_eq!(state.rejected_bad_request.load(Ordering::Relaxed), 1);
}
#[tokio::test]
async fn health_endpoint_reports_counters() {
let (addr, _state) = spawn_relay().await;
let client = reqwest::Client::new();
let resp = client
.get(format!("http://{}/health", addr))
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::OK);
let body = resp.text().await.unwrap();
assert!(body.contains("ok\n"));
assert!(body.contains("forwarded_ok 0"));
}
#[test]
fn hostname_validator_accepts_and_rejects() {
assert!(is_valid_hostname("odoh.cloudflare-dns.com"));
assert!(is_valid_hostname("a.b"));
assert!(!is_valid_hostname(""));
assert!(!is_valid_hostname("localhost"));
assert!(!is_valid_hostname(".leading.dot"));
assert!(!is_valid_hostname("trailing.dot."));
assert!(!is_valid_hostname("-leading.dash"));
assert!(!is_valid_hostname("evil.com@internal.host"));
assert!(!is_valid_hostname("evil.com/../admin"));
assert!(!is_valid_hostname(&"a".repeat(254)));
}
}

698
src/serve.rs Normal file
View File

@@ -0,0 +1,698 @@
//! The main DNS-server runtime.
//!
//! Extracted from `main.rs` so both the interactive CLI entry and the
//! Windows service dispatcher (`windows_service` module) can drive the
//! same startup/serve loop.
use std::net::SocketAddr;
use std::sync::{Arc, Mutex, RwLock};
use std::time::Duration;
use arc_swap::ArcSwap;
use log::{error, info};
use tokio::net::UdpSocket;
use crate::blocklist::{download_blocklists, parse_blocklist, BlocklistStore};
use crate::bootstrap_resolver::NumaResolver;
use crate::buffer::BytePacketBuffer;
use crate::cache::DnsCache;
use crate::config::{build_zone_map, load_config, ConfigLoad};
use crate::ctx::{handle_query, ServerCtx};
use crate::forward::{
build_https_client_with_resolver, parse_upstream_list, Upstream, UpstreamPool,
};
use crate::odoh::OdohConfigCache;
use crate::override_store::OverrideStore;
use crate::query_log::QueryLog;
use crate::service_store::ServiceStore;
use crate::stats::{ServerStats, Transport};
use crate::system_dns::discover_system_dns;
const QUAD9_IP: &str = "9.9.9.9";
const DOH_FALLBACK: &str = "https://9.9.9.9/dns-query";
/// Boot the DNS server and run until the UDP listener errors out.
pub async fn run(config_path: String) -> crate::Result<()> {
let ConfigLoad {
config,
path: resolved_config_path,
found: config_found,
} = load_config(&config_path)?;
// Discover system DNS in a single pass (upstream + forwarding rules)
let system_dns = discover_system_dns();
let root_hints = crate::recursive::parse_root_hints(&config.upstream.root_hints);
let recursive_pool = || {
let dummy = UpstreamPool::new(vec![Upstream::Udp("0.0.0.0:0".parse().unwrap())], vec![]);
(dummy, "recursive (root hints)".to_string())
};
// Routes numa-originated HTTPS (DoH upstream, ODoH relay/target, blocklist
// CDN) away from the system resolver so lookups don't loop back through
// numa when it's its own system DNS.
// See `docs/implementation/bootstrap-resolver.md`.
let resolver_overrides = match config.upstream.mode {
crate::config::UpstreamMode::Odoh => config
.upstream
.odoh_upstream()
.map(|o| o.host_ip_overrides())
.unwrap_or_default(),
_ => std::collections::BTreeMap::new(),
};
let bootstrap_resolver: Arc<NumaResolver> = Arc::new(NumaResolver::new(
&config.upstream.fallback,
resolver_overrides,
));
let (resolved_mode, upstream_auto, pool, upstream_label) = match config.upstream.mode {
crate::config::UpstreamMode::Auto => {
info!("auto mode: probing recursive resolution...");
if crate::recursive::probe_recursive(&root_hints).await {
info!("recursive probe succeeded — self-sovereign mode");
let (pool, label) = recursive_pool();
(crate::config::UpstreamMode::Recursive, false, pool, label)
} else {
log::warn!("recursive probe failed — falling back to Quad9 DoH");
let client = build_https_client_with_resolver(1, Some(bootstrap_resolver.clone()));
let url = DOH_FALLBACK.to_string();
let label = url.clone();
let pool = UpstreamPool::new(vec![Upstream::Doh { url, client }], vec![]);
(crate::config::UpstreamMode::Forward, false, pool, label)
}
}
crate::config::UpstreamMode::Recursive => {
let (pool, label) = recursive_pool();
(crate::config::UpstreamMode::Recursive, false, pool, label)
}
crate::config::UpstreamMode::Forward => {
let addrs = if config.upstream.address.is_empty() {
let detected = system_dns
.default_upstream
.or_else(crate::system_dns::detect_dhcp_dns)
.unwrap_or_else(|| {
info!("could not detect system DNS, falling back to Quad9 DoH");
DOH_FALLBACK.to_string()
});
vec![detected]
} else {
config.upstream.address.clone()
};
let primary = parse_upstream_list(
&addrs,
config.upstream.port,
Some(bootstrap_resolver.clone()),
)?;
let fallback = parse_upstream_list(
&config.upstream.fallback,
config.upstream.port,
Some(bootstrap_resolver.clone()),
)?;
let pool = UpstreamPool::new(primary, fallback);
let label = pool.label();
(
crate::config::UpstreamMode::Forward,
config.upstream.address.is_empty(),
pool,
label,
)
}
crate::config::UpstreamMode::Odoh => {
let odoh = config.upstream.odoh_upstream()?;
let client = build_https_client_with_resolver(1, Some(bootstrap_resolver.clone()));
let target_config = Arc::new(OdohConfigCache::new(
odoh.target_host.clone(),
client.clone(),
));
let primary = vec![Upstream::Odoh {
relay_url: odoh.relay_url,
target_path: odoh.target_path,
client,
target_config,
}];
let fallback = if odoh.strict {
Vec::new()
} else {
parse_upstream_list(
&config.upstream.fallback,
config.upstream.port,
Some(bootstrap_resolver.clone()),
)?
};
let pool = UpstreamPool::new(primary, fallback);
let label = pool.label();
(crate::config::UpstreamMode::Odoh, false, pool, label)
}
};
let api_port = config.server.api_port;
let mut blocklist = BlocklistStore::new();
for domain in &config.blocking.allowlist {
blocklist.add_to_allowlist(domain);
}
if !config.blocking.enabled {
blocklist.set_enabled(false);
}
// Build service store: config services + persisted user services
let mut service_store = ServiceStore::new();
service_store.insert_from_config("numa", config.server.api_port, Vec::new());
for svc in &config.services {
service_store.insert_from_config(&svc.name, svc.target_port, svc.routes.clone());
}
service_store.load_persisted();
for fwd in &config.forwarding {
for suffix in &fwd.suffix {
info!(
"forwarding .{} to {} (config rule)",
suffix,
fwd.upstream.join(", ")
);
}
}
let forwarding_rules =
crate::config::merge_forwarding_rules(&config.forwarding, system_dns.forwarding_rules)?;
// Resolve data_dir from config, falling back to the platform default.
// Used for TLS CA storage below and stored on ServerCtx for runtime use.
let resolved_data_dir = config
.server
.data_dir
.clone()
.unwrap_or_else(crate::data_dir);
// Build initial TLS config before ServerCtx (so ArcSwap is ready at construction)
let initial_tls = if config.proxy.enabled && config.proxy.tls_port > 0 {
let service_names = service_store.names();
match crate::tls::build_tls_config(
&config.proxy.tld,
&service_names,
Vec::new(),
&resolved_data_dir,
) {
Ok(tls_config) => Some(ArcSwap::from(tls_config)),
Err(e) => {
if let Some(advisory) = crate::tls::try_data_dir_advisory(&e, &resolved_data_dir) {
eprint!("{}", advisory);
} else {
log::warn!("TLS setup failed, HTTPS proxy disabled: {}", e);
}
None
}
}
} else {
None
};
let doh_enabled = initial_tls.is_some();
let health_meta = crate::health::HealthMeta::build(
&resolved_data_dir,
config.dot.enabled,
config.dot.port,
config.mobile.port,
config.dnssec.enabled,
resolved_mode == crate::config::UpstreamMode::Recursive,
config.lan.enabled,
config.blocking.enabled,
doh_enabled,
);
let ca_pem = std::fs::read_to_string(resolved_data_dir.join("ca.pem")).ok();
let socket = match UdpSocket::bind(&config.server.bind_addr).await {
Ok(s) => s,
Err(e) => {
if let Some(advisory) =
crate::system_dns::try_port53_advisory(&config.server.bind_addr, &e)
{
eprint!("{}", advisory);
std::process::exit(1);
}
return Err(e.into());
}
};
let ctx = Arc::new(ServerCtx {
socket,
zone_map: build_zone_map(&config.zones)?,
cache: RwLock::new(DnsCache::new(
config.cache.max_entries,
config.cache.min_ttl,
config.cache.max_ttl,
)),
refreshing: Mutex::new(std::collections::HashSet::new()),
stats: Mutex::new(ServerStats::new()),
overrides: RwLock::new(OverrideStore::new()),
blocklist: RwLock::new(blocklist),
query_log: Mutex::new(QueryLog::new(1000)),
services: Mutex::new(service_store),
lan_peers: Mutex::new(crate::lan::PeerStore::new(config.lan.peer_timeout_secs)),
forwarding_rules,
upstream_pool: Mutex::new(pool),
upstream_auto,
upstream_port: config.upstream.port,
lan_ip: Mutex::new(crate::lan::detect_lan_ip().unwrap_or(std::net::Ipv4Addr::LOCALHOST)),
timeout: Duration::from_millis(config.upstream.timeout_ms),
hedge_delay: resolved_mode.hedge_delay(config.upstream.hedge_ms),
proxy_tld_suffix: if config.proxy.tld.is_empty() {
String::new()
} else {
format!(".{}", config.proxy.tld)
},
proxy_tld: config.proxy.tld.clone(),
lan_enabled: config.lan.enabled,
config_path: resolved_config_path,
config_found,
config_dir: crate::config_dir(),
data_dir: resolved_data_dir,
tls_config: initial_tls,
upstream_mode: resolved_mode,
root_hints,
srtt: std::sync::RwLock::new(crate::srtt::SrttCache::new(config.upstream.srtt)),
inflight: std::sync::Mutex::new(std::collections::HashMap::new()),
dnssec_enabled: config.dnssec.enabled,
dnssec_strict: config.dnssec.strict,
health_meta,
ca_pem,
mobile_enabled: config.mobile.enabled,
mobile_port: config.mobile.port,
filter_aaaa: config.server.filter_aaaa,
});
let zone_count: usize = ctx.zone_map.values().map(|m| m.len()).sum();
// Build banner rows, then size the box to fit the longest value
let api_url = format!("http://localhost:{}", api_port);
let proxy_label = if config.proxy.enabled {
if config.proxy.tls_port > 0 {
Some(format!(
"http://:{} https://:{}",
config.proxy.port, config.proxy.tls_port
))
} else {
Some(format!(
"http://*.{} on :{}",
config.proxy.tld, config.proxy.port
))
}
} else {
None
};
let config_label = if ctx.config_found {
ctx.config_path.clone()
} else {
format!("{} (defaults)", ctx.config_path)
};
let data_label = ctx.data_dir.display().to_string();
let services_label = ctx.config_dir.join("services.json").display().to_string();
// label (10) + value + padding (2) = inner width; minimum 40 for the title row
let val_w = [
config.server.bind_addr.len(),
api_url.len(),
upstream_label.len(),
config_label.len(),
data_label.len(),
services_label.len(),
]
.into_iter()
.chain(proxy_label.as_ref().map(|s| s.len()))
.max()
.unwrap_or(30);
let w = (val_w + 12).max(42); // 10 label + 2 padding, min 42 for title
let o = "\x1b[38;2;192;98;58m"; // orange
let g = "\x1b[38;2;107;124;78m"; // green
let d = "\x1b[38;2;163;152;136m"; // dim
let r = "\x1b[0m"; // reset
let b = "\x1b[1;38;2;192;98;58m"; // bold orange
let it = "\x1b[3;38;2;163;152;136m"; // italic dim
let bar_top = "".repeat(w);
let bar_mid = "".repeat(w);
let row = |label: &str, color: &str, value: &str| {
eprintln!(
"{o}{r} {color}{:<9}{r} {:<vw$}{o}{r}",
label,
value,
vw = w - 12
);
};
// Title row: center within the box
let title = format!(
"{b}NUMA{r} {it}DNS that governs itself{r} {d}v{}{r}",
env!("CARGO_PKG_VERSION")
);
// The title contains ANSI codes; visible length is ~38 chars. Pad to fill the box.
let title_visible_len = 4 + 2 + 24 + 2 + 1 + env!("CARGO_PKG_VERSION").len() + 1;
let title_pad = w.saturating_sub(title_visible_len);
eprintln!("\n{o}{bar_top}{r}");
eprint!("{o}{r} {title}");
eprintln!("{}{o}{r}", " ".repeat(title_pad));
eprintln!("{o}{bar_top}{r}");
row("DNS", g, &config.server.bind_addr);
row("API", g, &api_url);
row("Dashboard", g, &api_url);
row(
"Upstream",
g,
if ctx.upstream_mode == crate::config::UpstreamMode::Recursive {
"recursive (root hints)"
} else {
&upstream_label
},
);
row("Zones", g, &format!("{} records", zone_count));
row(
"Cache",
g,
&format!("max {} entries", config.cache.max_entries),
);
if !config.cache.warm.is_empty() {
row("Warm", g, &format!("{} domains", config.cache.warm.len()));
}
row(
"Blocking",
g,
&if config.blocking.enabled {
format!("{} lists", config.blocking.lists.len())
} else {
"disabled".to_string()
},
);
if let Some(ref label) = proxy_label {
row("Proxy", g, label);
if config.proxy.bind_addr == "127.0.0.1" {
let y = "\x1b[38;2;204;176;59m"; // yellow
row(
"",
y,
&format!(
"⚠ proxy on 127.0.0.1 — .{} not LAN reachable",
config.proxy.tld
),
);
}
}
if config.dot.enabled {
row("DoT", g, &format!("tls://:{}", config.dot.port));
}
if doh_enabled {
row(
"DoH",
g,
&format!("https://:{}/dns-query", config.proxy.tls_port),
);
}
if config.lan.enabled {
row("LAN", g, "mDNS (_numa._tcp.local)");
}
if !ctx.forwarding_rules.is_empty() {
row(
"Routing",
g,
&format!("{} conditional rules", ctx.forwarding_rules.len()),
);
}
eprintln!("{o}{bar_mid}{r}");
row("Config", d, &config_label);
row("Data", d, &data_label);
row("Services", d, &services_label);
eprintln!("{o}{bar_top}{r}\n");
info!(
"numa listening on {}, upstream {}, {} zone records, cache max {}, API on port {}",
config.server.bind_addr, upstream_label, zone_count, config.cache.max_entries, api_port,
);
// Download blocklists on startup
let blocklist_lists = config.blocking.lists.clone();
let refresh_hours = config.blocking.refresh_hours;
if config.blocking.enabled && !blocklist_lists.is_empty() {
let bl_ctx = Arc::clone(&ctx);
let bl_lists = blocklist_lists.clone();
let bl_resolver = bootstrap_resolver.clone();
tokio::spawn(async move {
load_blocklists(&bl_ctx, &bl_lists, Some(bl_resolver.clone())).await;
// Periodic refresh
let mut interval = tokio::time::interval(Duration::from_secs(refresh_hours * 3600));
interval.tick().await; // skip immediate tick
loop {
interval.tick().await;
info!("refreshing blocklists...");
load_blocklists(&bl_ctx, &bl_lists, Some(bl_resolver.clone())).await;
}
});
}
// Prime TLD cache (recursive mode only)
if ctx.upstream_mode == crate::config::UpstreamMode::Recursive {
let prime_ctx = Arc::clone(&ctx);
let prime_tlds = config.upstream.prime_tlds;
tokio::spawn(async move {
crate::recursive::prime_tld_cache(
&prime_ctx.cache,
&prime_ctx.root_hints,
&prime_tlds,
&prime_ctx.srtt,
)
.await;
});
}
// Spawn cache warming for user-configured domains
if !config.cache.warm.is_empty() {
let warm_ctx = Arc::clone(&ctx);
let warm_domains = config.cache.warm.clone();
tokio::spawn(async move {
cache_warm_loop(warm_ctx, warm_domains).await;
});
}
// Spawn DoH connection keepalive — prevents idle TLS teardown
{
let keepalive_ctx = Arc::clone(&ctx);
tokio::spawn(async move {
doh_keepalive_loop(keepalive_ctx).await;
});
}
// Spawn HTTP API server
let api_ctx = Arc::clone(&ctx);
let api_addr: SocketAddr = format!("{}:{}", config.server.api_bind_addr, api_port).parse()?;
tokio::spawn(async move {
let app = crate::api::router(api_ctx);
let listener = tokio::net::TcpListener::bind(api_addr).await.unwrap();
info!("HTTP API listening on {}", api_addr);
axum::serve(listener, app).await.unwrap();
});
// Spawn Mobile API listener (read-only subset for iOS/Android companion
// apps, LAN-bound by default so phones can reach it). Only idempotent
// GETs; no state-mutating routes are exposed here regardless of
// the main API's bind address.
if config.mobile.enabled {
let mobile_ctx = Arc::clone(&ctx);
let mobile_bind = config.mobile.bind_addr.clone();
let mobile_port = config.mobile.port;
tokio::spawn(async move {
if let Err(e) = crate::mobile_api::start(mobile_ctx, mobile_bind, mobile_port).await {
log::warn!("Mobile API listener failed: {}", e);
}
});
}
let proxy_bind: std::net::Ipv4Addr = config
.proxy
.bind_addr
.parse()
.unwrap_or(std::net::Ipv4Addr::LOCALHOST);
// Spawn HTTP reverse proxy for .numa domains
if config.proxy.enabled {
let proxy_ctx = Arc::clone(&ctx);
let proxy_port = config.proxy.port;
tokio::spawn(async move {
crate::proxy::start_proxy(proxy_ctx, proxy_port, proxy_bind).await;
});
}
// Spawn HTTPS reverse proxy with TLS termination
if config.proxy.enabled && config.proxy.tls_port > 0 && ctx.tls_config.is_some() {
let proxy_ctx = Arc::clone(&ctx);
let tls_port = config.proxy.tls_port;
tokio::spawn(async move {
crate::proxy::start_proxy_tls(proxy_ctx, tls_port, proxy_bind).await;
});
}
// Spawn network change watcher (upstream re-detection, LAN IP update, peer flush)
{
let watch_ctx = Arc::clone(&ctx);
tokio::spawn(async move {
network_watch_loop(watch_ctx).await;
});
}
// Spawn LAN service discovery
if config.lan.enabled {
let lan_ctx = Arc::clone(&ctx);
let lan_config = config.lan.clone();
tokio::spawn(async move {
crate::lan::start_lan_discovery(lan_ctx, &lan_config).await;
});
}
// Spawn DNS-over-TLS listener (RFC 7858)
if config.dot.enabled {
let dot_ctx = Arc::clone(&ctx);
let dot_config = config.dot.clone();
tokio::spawn(async move {
crate::dot::start_dot(dot_ctx, &dot_config).await;
});
}
// UDP DNS listener
#[allow(clippy::infinite_loop)]
loop {
let mut buffer = BytePacketBuffer::new();
let (len, src_addr) = match ctx.socket.recv_from(&mut buffer.buf).await {
Ok(r) => r,
Err(e) if e.kind() == std::io::ErrorKind::ConnectionReset => {
// Windows delivers ICMP port-unreachable as ConnectionReset on UDP sockets
continue;
}
Err(e) => return Err(e.into()),
};
let ctx = Arc::clone(&ctx);
tokio::spawn(async move {
if let Err(e) = handle_query(buffer, len, src_addr, &ctx, Transport::Udp).await {
error!("{} | HANDLER ERROR | {}", src_addr, e);
}
});
}
}
async fn network_watch_loop(ctx: Arc<ServerCtx>) {
let mut tick: u64 = 0;
let mut interval = tokio::time::interval(Duration::from_secs(5));
interval.tick().await; // skip immediate tick
loop {
interval.tick().await;
tick += 1;
let mut changed = false;
// Check LAN IP change (every 5s — cheap, one UDP socket call)
if let Some(new_ip) = crate::lan::detect_lan_ip() {
let mut current_ip = ctx.lan_ip.lock().unwrap();
if new_ip != *current_ip {
info!("LAN IP changed: {} → {}", current_ip, new_ip);
*current_ip = new_ip;
changed = true;
crate::recursive::reset_udp_state();
}
}
// Re-detect upstream every 30s or on LAN IP change (auto-detect only)
if ctx.upstream_auto && (changed || tick.is_multiple_of(6)) {
let dns_info = crate::system_dns::discover_system_dns();
let new_addr = dns_info
.default_upstream
.or_else(crate::system_dns::detect_dhcp_dns)
.unwrap_or_else(|| QUAD9_IP.to_string());
let mut pool = ctx.upstream_pool.lock().unwrap();
if pool.maybe_update_primary(&new_addr, ctx.upstream_port) {
info!("upstream changed → {}", pool.label());
changed = true;
}
}
// Flush stale LAN peers on any network change
if changed {
ctx.lan_peers.lock().unwrap().clear();
info!("flushed LAN peers after network change");
}
// Re-probe UDP every 5 minutes when disabled
if tick.is_multiple_of(60) {
crate::recursive::probe_udp(&ctx.root_hints).await;
}
}
}
async fn load_blocklists(ctx: &ServerCtx, lists: &[String], resolver: Option<Arc<NumaResolver>>) {
let downloaded = download_blocklists(lists, resolver).await;
// Parse outside the lock to avoid blocking DNS queries during parse (~100ms)
let mut all_domains = std::collections::HashSet::new();
let mut sources = Vec::new();
for (source, text) in &downloaded {
let domains = parse_blocklist(text);
info!("blocklist: {} domains from {}", domains.len(), source);
all_domains.extend(domains);
sources.push(source.clone());
}
let total = all_domains.len();
// Swap under lock — sub-microsecond
ctx.blocklist
.write()
.unwrap()
.swap_domains(all_domains, sources);
info!(
"blocking enabled: {} unique domains from {} lists",
total,
downloaded.len()
);
}
async fn warm_domain(ctx: &ServerCtx, domain: &str) {
for qtype in [
crate::question::QueryType::A,
crate::question::QueryType::AAAA,
] {
crate::ctx::refresh_entry(ctx, domain, qtype).await;
}
}
async fn doh_keepalive_loop(ctx: Arc<ServerCtx>) {
// First tick fires immediately so we surface bootstrap-resolver failures
// (unreachable Quad9/Cloudflare, blocked :53, bad upstream hostname) in
// the startup logs instead of on the first client query.
let mut interval = tokio::time::interval(Duration::from_secs(25));
loop {
interval.tick().await;
let pool = ctx.upstream_pool.lock().unwrap().clone();
if let Some(upstream) = pool.preferred() {
crate::forward::keepalive_doh(upstream).await;
}
}
}
async fn cache_warm_loop(ctx: Arc<ServerCtx>, domains: Vec<String>) {
tokio::time::sleep(Duration::from_secs(2)).await;
for domain in &domains {
warm_domain(&ctx, domain).await;
}
info!("cache warm: {} domains resolved at startup", domains.len());
let mut interval = tokio::time::interval(Duration::from_secs(30));
interval.tick().await;
loop {
interval.tick().await;
for domain in &domains {
let refresh = ctx.cache.read().unwrap().needs_warm(domain);
if refresh {
warm_domain(&ctx, domain).await;
}
}
}
}

View File

@@ -90,6 +90,7 @@ fn linux_rss() -> usize {
pub struct ServerStats { pub struct ServerStats {
queries_total: u64, queries_total: u64,
queries_forwarded: u64, queries_forwarded: u64,
queries_upstream: u64,
queries_recursive: u64, queries_recursive: u64,
queries_coalesced: u64, queries_coalesced: u64,
queries_cached: u64, queries_cached: u64,
@@ -101,6 +102,10 @@ pub struct ServerStats {
transport_tcp: u64, transport_tcp: u64,
transport_dot: u64, transport_dot: u64,
transport_doh: u64, transport_doh: u64,
upstream_transport_udp: u64,
upstream_transport_doh: u64,
upstream_transport_dot: u64,
upstream_transport_odoh: u64,
started_at: Instant, started_at: Instant,
} }
@@ -123,11 +128,39 @@ impl Transport {
} }
} }
/// Wire protocol used for a forwarded upstream call. Orthogonal to
/// `QueryPath`: the path answers "where the answer came from"; this answers
/// "over what wire we spoke to the forwarder." Callers pass
/// `Option<UpstreamTransport>` — `None` for resolutions that never touched
/// a forwarder (cache/local/blocked) or for recursive mode, which has its
/// own counter via `QueryPath::Recursive`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UpstreamTransport {
Udp,
Doh,
Dot,
Odoh,
}
impl UpstreamTransport {
pub fn as_str(&self) -> &'static str {
match self {
UpstreamTransport::Udp => "UDP",
UpstreamTransport::Doh => "DOH",
UpstreamTransport::Dot => "DOT",
UpstreamTransport::Odoh => "ODOH",
}
}
}
#[derive(Clone, Copy, Debug, PartialEq, Eq)] #[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum QueryPath { pub enum QueryPath {
Local, Local,
Cached, Cached,
/// Matched a `[[forwarding]]` suffix rule.
Forwarded, Forwarded,
/// Resolved via the default `[upstream]` pool (no suffix match).
Upstream,
Recursive, Recursive,
Coalesced, Coalesced,
Blocked, Blocked,
@@ -141,6 +174,7 @@ impl QueryPath {
QueryPath::Local => "LOCAL", QueryPath::Local => "LOCAL",
QueryPath::Cached => "CACHED", QueryPath::Cached => "CACHED",
QueryPath::Forwarded => "FORWARD", QueryPath::Forwarded => "FORWARD",
QueryPath::Upstream => "UPSTREAM",
QueryPath::Recursive => "RECURSIVE", QueryPath::Recursive => "RECURSIVE",
QueryPath::Coalesced => "COALESCED", QueryPath::Coalesced => "COALESCED",
QueryPath::Blocked => "BLOCKED", QueryPath::Blocked => "BLOCKED",
@@ -156,6 +190,8 @@ impl QueryPath {
Some(QueryPath::Cached) Some(QueryPath::Cached)
} else if s.eq_ignore_ascii_case("FORWARD") { } else if s.eq_ignore_ascii_case("FORWARD") {
Some(QueryPath::Forwarded) Some(QueryPath::Forwarded)
} else if s.eq_ignore_ascii_case("UPSTREAM") {
Some(QueryPath::Upstream)
} else if s.eq_ignore_ascii_case("RECURSIVE") { } else if s.eq_ignore_ascii_case("RECURSIVE") {
Some(QueryPath::Recursive) Some(QueryPath::Recursive)
} else if s.eq_ignore_ascii_case("COALESCED") { } else if s.eq_ignore_ascii_case("COALESCED") {
@@ -183,6 +219,7 @@ impl ServerStats {
ServerStats { ServerStats {
queries_total: 0, queries_total: 0,
queries_forwarded: 0, queries_forwarded: 0,
queries_upstream: 0,
queries_recursive: 0, queries_recursive: 0,
queries_coalesced: 0, queries_coalesced: 0,
queries_cached: 0, queries_cached: 0,
@@ -194,16 +231,26 @@ impl ServerStats {
transport_tcp: 0, transport_tcp: 0,
transport_dot: 0, transport_dot: 0,
transport_doh: 0, transport_doh: 0,
upstream_transport_udp: 0,
upstream_transport_doh: 0,
upstream_transport_dot: 0,
upstream_transport_odoh: 0,
started_at: Instant::now(), started_at: Instant::now(),
} }
} }
pub fn record(&mut self, path: QueryPath, transport: Transport) -> u64 { pub fn record(
&mut self,
path: QueryPath,
transport: Transport,
upstream_transport: Option<UpstreamTransport>,
) -> u64 {
self.queries_total += 1; self.queries_total += 1;
match path { match path {
QueryPath::Local => self.queries_local += 1, QueryPath::Local => self.queries_local += 1,
QueryPath::Cached => self.queries_cached += 1, QueryPath::Cached => self.queries_cached += 1,
QueryPath::Forwarded => self.queries_forwarded += 1, QueryPath::Forwarded => self.queries_forwarded += 1,
QueryPath::Upstream => self.queries_upstream += 1,
QueryPath::Recursive => self.queries_recursive += 1, QueryPath::Recursive => self.queries_recursive += 1,
QueryPath::Coalesced => self.queries_coalesced += 1, QueryPath::Coalesced => self.queries_coalesced += 1,
QueryPath::Blocked => self.queries_blocked += 1, QueryPath::Blocked => self.queries_blocked += 1,
@@ -216,6 +263,14 @@ impl ServerStats {
Transport::Dot => self.transport_dot += 1, Transport::Dot => self.transport_dot += 1,
Transport::Doh => self.transport_doh += 1, Transport::Doh => self.transport_doh += 1,
} }
if let Some(ut) = upstream_transport {
match ut {
UpstreamTransport::Udp => self.upstream_transport_udp += 1,
UpstreamTransport::Doh => self.upstream_transport_doh += 1,
UpstreamTransport::Dot => self.upstream_transport_dot += 1,
UpstreamTransport::Odoh => self.upstream_transport_odoh += 1,
}
}
self.queries_total self.queries_total
} }
@@ -232,6 +287,7 @@ impl ServerStats {
uptime_secs: self.uptime_secs(), uptime_secs: self.uptime_secs(),
total: self.queries_total, total: self.queries_total,
forwarded: self.queries_forwarded, forwarded: self.queries_forwarded,
upstream: self.queries_upstream,
recursive: self.queries_recursive, recursive: self.queries_recursive,
coalesced: self.queries_coalesced, coalesced: self.queries_coalesced,
cached: self.queries_cached, cached: self.queries_cached,
@@ -243,6 +299,10 @@ impl ServerStats {
transport_tcp: self.transport_tcp, transport_tcp: self.transport_tcp,
transport_dot: self.transport_dot, transport_dot: self.transport_dot,
transport_doh: self.transport_doh, transport_doh: self.transport_doh,
upstream_transport_udp: self.upstream_transport_udp,
upstream_transport_doh: self.upstream_transport_doh,
upstream_transport_dot: self.upstream_transport_dot,
upstream_transport_odoh: self.upstream_transport_odoh,
} }
} }
@@ -253,10 +313,11 @@ impl ServerStats {
let secs = uptime.as_secs() % 60; let secs = uptime.as_secs() % 60;
log::info!( log::info!(
"STATS | uptime {}h{}m{}s | total {} | fwd {} | recursive {} | coalesced {} | cached {} | local {} | override {} | blocked {} | errors {}", "STATS | uptime {}h{}m{}s | total {} | fwd {} | upstream {} | recursive {} | coalesced {} | cached {} | local {} | override {} | blocked {} | errors {} | up-udp {} | up-doh {} | up-dot {} | up-odoh {}",
hours, mins, secs, hours, mins, secs,
self.queries_total, self.queries_total,
self.queries_forwarded, self.queries_forwarded,
self.queries_upstream,
self.queries_recursive, self.queries_recursive,
self.queries_coalesced, self.queries_coalesced,
self.queries_cached, self.queries_cached,
@@ -264,6 +325,10 @@ impl ServerStats {
self.queries_overridden, self.queries_overridden,
self.queries_blocked, self.queries_blocked,
self.upstream_errors, self.upstream_errors,
self.upstream_transport_udp,
self.upstream_transport_doh,
self.upstream_transport_dot,
self.upstream_transport_odoh,
); );
} }
} }
@@ -272,6 +337,7 @@ pub struct StatsSnapshot {
pub uptime_secs: u64, pub uptime_secs: u64,
pub total: u64, pub total: u64,
pub forwarded: u64, pub forwarded: u64,
pub upstream: u64,
pub recursive: u64, pub recursive: u64,
pub coalesced: u64, pub coalesced: u64,
pub cached: u64, pub cached: u64,
@@ -283,4 +349,8 @@ pub struct StatsSnapshot {
pub transport_tcp: u64, pub transport_tcp: u64,
pub transport_dot: u64, pub transport_dot: u64,
pub transport_doh: u64, pub transport_doh: u64,
pub upstream_transport_udp: u64,
pub upstream_transport_doh: u64,
pub upstream_transport_dot: u64,
pub upstream_transport_odoh: u64,
} }

179
src/svcb.rs Normal file
View File

@@ -0,0 +1,179 @@
//! Minimal SVCB/HTTPS (RFC 9460) RDATA parser — just enough to strip
//! the `ipv6hint` SvcParam. Used by the `filter_aaaa` feature so
//! HTTPS-record-aware clients (Chrome ≥103, Firefox, Safari) don't
//! receive v6 address hints on IPv4-only networks.
/// SvcParamKey = 6 (RFC 9460 §14.3.2).
const IPV6_HINT_KEY: u16 = 6;
/// Strip the `ipv6hint` SvcParam from an HTTPS/SVCB RDATA blob.
///
/// Returns `Some(new_rdata)` if `ipv6hint` was present and removed.
/// Returns `None` if the record had no `ipv6hint`, or if the RDATA
/// couldn't be parsed — in both cases the caller should keep the
/// original bytes untouched.
///
/// SVCB RDATA (RFC 9460 §2.2):
/// SvcPriority (u16)
/// TargetName (uncompressed DNS name — labels terminated by 0 octet)
/// SvcParams (series of {u16 key, u16 len, opaque[len] value}, sorted by key)
pub fn strip_ipv6hint(rdata: &[u8]) -> Option<Vec<u8>> {
if rdata.len() < 2 {
return None;
}
let mut pos = 2;
// TargetName — uncompressed per RFC 9460 §2.2
loop {
let len = *rdata.get(pos)? as usize;
pos += 1;
if len == 0 {
break;
}
if len & 0xC0 != 0 {
// Pointer: forbidden in SVCB but defend against a broken upstream.
return None;
}
pos = pos.checked_add(len)?;
if pos > rdata.len() {
return None;
}
}
// Scan params once to decide whether we need to rebuild.
let params_start = pos;
let mut scan = pos;
let mut has_ipv6hint = false;
while scan < rdata.len() {
if scan + 4 > rdata.len() {
return None;
}
let key = u16::from_be_bytes([rdata[scan], rdata[scan + 1]]);
let vlen = u16::from_be_bytes([rdata[scan + 2], rdata[scan + 3]]) as usize;
let end = scan.checked_add(4)?.checked_add(vlen)?;
if end > rdata.len() {
return None;
}
if key == IPV6_HINT_KEY {
has_ipv6hint = true;
}
scan = end;
}
if scan != rdata.len() || !has_ipv6hint {
return None;
}
// Rebuild without ipv6hint, preserving param order (RFC 9460 requires
// ascending key order, which we preserve by filtering in place).
let mut out = Vec::with_capacity(rdata.len());
out.extend_from_slice(&rdata[..params_start]);
let mut pos = params_start;
while pos < rdata.len() {
let key = u16::from_be_bytes([rdata[pos], rdata[pos + 1]]);
let vlen = u16::from_be_bytes([rdata[pos + 2], rdata[pos + 3]]) as usize;
let end = pos + 4 + vlen;
if key != IPV6_HINT_KEY {
out.extend_from_slice(&rdata[pos..end]);
}
pos = end;
}
Some(out)
}
/// Build an SVCB RDATA blob from a priority, target labels, and
/// (key, value) param pairs. Shared by `svcb` unit tests and `ctx`
/// pipeline tests that need to seed the cache with a synthetic HTTPS RR.
#[cfg(test)]
pub(crate) fn build_rdata(priority: u16, target: &[&str], params: &[(u16, Vec<u8>)]) -> Vec<u8> {
let mut out = Vec::new();
out.extend_from_slice(&priority.to_be_bytes());
for label in target {
out.push(label.len() as u8);
out.extend_from_slice(label.as_bytes());
}
out.push(0);
for (key, value) in params {
out.extend_from_slice(&key.to_be_bytes());
out.extend_from_slice(&(value.len() as u16).to_be_bytes());
out.extend_from_slice(value);
}
out
}
#[cfg(test)]
mod tests {
use super::*;
fn alpn_h3() -> (u16, Vec<u8>) {
// alpn = ["h3"]: one length-prefixed ALPN id
(1, vec![0x02, b'h', b'3'])
}
fn ipv4hint_single() -> (u16, Vec<u8>) {
(4, vec![93, 184, 216, 34])
}
fn ipv6hint_single() -> (u16, Vec<u8>) {
// 2606:4700::1
(
6,
vec![
0x26, 0x06, 0x47, 0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01,
],
)
}
#[test]
fn strips_ipv6hint_and_keeps_other_params() {
let rdata = build_rdata(1, &[], &[alpn_h3(), ipv4hint_single(), ipv6hint_single()]);
let stripped = strip_ipv6hint(&rdata).expect("ipv6hint present → stripped");
let expected = build_rdata(1, &[], &[alpn_h3(), ipv4hint_single()]);
assert_eq!(stripped, expected);
}
#[test]
fn no_ipv6hint_returns_none() {
let rdata = build_rdata(1, &[], &[alpn_h3(), ipv4hint_single()]);
assert!(strip_ipv6hint(&rdata).is_none());
}
#[test]
fn alias_mode_empty_params_returns_none() {
let rdata = build_rdata(0, &["example", "com"], &[]);
assert!(strip_ipv6hint(&rdata).is_none());
}
#[test]
fn only_ipv6hint_yields_empty_param_section() {
let rdata = build_rdata(1, &[], &[ipv6hint_single()]);
let stripped = strip_ipv6hint(&rdata).expect("ipv6hint present → stripped");
let expected = build_rdata(1, &[], &[]);
assert_eq!(stripped, expected);
}
#[test]
fn preserves_target_name() {
let rdata = build_rdata(1, &["svc", "example", "net"], &[ipv6hint_single()]);
let stripped = strip_ipv6hint(&rdata).unwrap();
assert!(stripped.starts_with(&[0x00, 0x01])); // priority
assert_eq!(&stripped[2..6], b"\x03svc");
}
#[test]
fn truncated_rdata_returns_none() {
// Priority only, no target terminator.
assert!(strip_ipv6hint(&[0, 1, 3, b'c', b'o', b'm']).is_none());
}
#[test]
fn empty_input_returns_none() {
assert!(strip_ipv6hint(&[]).is_none());
}
#[test]
fn param_length_overflow_returns_none() {
// key=6, length=0xFFFF but value is short — malformed.
let rdata = vec![0, 1, 0, 0, 6, 0xFF, 0xFF, 0, 1, 2];
assert!(strip_ipv6hint(&rdata).is_none());
}
}

View File

@@ -2,7 +2,9 @@ use std::net::SocketAddr;
use log::info; use log::info;
#[cfg(any(target_os = "macos", target_os = "linux"))]
use crate::forward::Upstream; use crate::forward::Upstream;
use crate::forward::UpstreamPool;
fn print_recursive_hint() { fn print_recursive_hint() {
let is_recursive = crate::config::load_config("numa.toml") let is_recursive = crate::config::load_config("numa.toml")
@@ -20,15 +22,15 @@ fn is_loopback_or_stub(addr: &str) -> bool {
} }
/// A conditional forwarding rule: domains matching `suffix` are forwarded to `upstream`. /// A conditional forwarding rule: domains matching `suffix` are forwarded to `upstream`.
#[derive(Debug, Clone)] #[derive(Clone)]
pub struct ForwardingRule { pub struct ForwardingRule {
pub suffix: String, pub suffix: String,
dot_suffix: String, // pre-computed ".suffix" for zero-alloc matching dot_suffix: String, // pre-computed ".suffix" for zero-alloc matching
pub upstream: Upstream, pub upstream: UpstreamPool,
} }
impl ForwardingRule { impl ForwardingRule {
pub fn new(suffix: String, upstream: Upstream) -> Self { pub fn new(suffix: String, upstream: UpstreamPool) -> Self {
let dot_suffix = format!(".{}", suffix); let dot_suffix = format!(".{}", suffix);
Self { Self {
suffix, suffix,
@@ -211,12 +213,13 @@ fn discover_macos() -> SystemDnsInfo {
} }
// Sort longest suffix first for most-specific matching // Sort longest suffix first for most-specific matching
rules.sort_by(|a, b| b.suffix.len().cmp(&a.suffix.len())); rules.sort_by_key(|r| std::cmp::Reverse(r.suffix.len()));
for rule in &rules { for rule in &rules {
info!( info!(
"auto-discovered forwarding: *.{} -> {}", "auto-discovered forwarding: *.{} -> {}",
rule.suffix, rule.upstream rule.suffix,
rule.upstream.label()
); );
} }
if rules.is_empty() { if rules.is_empty() {
@@ -235,7 +238,8 @@ fn discover_macos() -> SystemDnsInfo {
#[cfg(any(target_os = "macos", target_os = "linux"))] #[cfg(any(target_os = "macos", target_os = "linux"))]
fn make_rule(domain: &str, nameserver: &str) -> Option<ForwardingRule> { fn make_rule(domain: &str, nameserver: &str) -> Option<ForwardingRule> {
let addr = crate::forward::parse_upstream_addr(nameserver, 53).ok()?; let addr = crate::forward::parse_upstream_addr(nameserver, 53).ok()?;
Some(ForwardingRule::new(domain.to_string(), Upstream::Udp(addr))) let pool = UpstreamPool::new(vec![Upstream::Udp(addr)], vec![]);
Some(ForwardingRule::new(domain.to_string(), pool))
} }
#[cfg(target_os = "linux")] #[cfg(target_os = "linux")]
@@ -572,7 +576,7 @@ fn windows_backup_path() -> std::path::PathBuf {
#[cfg(windows)] #[cfg(windows)]
fn disable_dnscache() -> Result<bool, String> { fn disable_dnscache() -> Result<bool, String> {
// Check if Dnscache is running (it holds port 53 at kernel level) // Check if Dnscache is running (it can hold port 53)
let output = std::process::Command::new("sc") let output = std::process::Command::new("sc")
.args(["query", "Dnscache"]) .args(["query", "Dnscache"])
.output() .output()
@@ -603,8 +607,16 @@ fn disable_dnscache() -> Result<bool, String> {
return Err("failed to disable Dnscache via registry (run as Administrator?)".into()); return Err("failed to disable Dnscache via registry (run as Administrator?)".into());
} }
eprintln!(" Dnscache disabled. A reboot is required to free port 53."); // Dnscache is disabled for next boot. Check whether port 53 is
Ok(true) // actually blocked right now — on many Windows configurations
// Dnscache doesn't bind port 53 even while running.
let port_blocked = std::net::UdpSocket::bind("127.0.0.1:53").is_err();
if port_blocked {
eprintln!(" Dnscache disabled. A reboot is required to free port 53.");
} else {
eprintln!(" Dnscache disabled. Port 53 is free.");
}
Ok(port_blocked)
} }
#[cfg(windows)] #[cfg(windows)]
@@ -671,6 +683,83 @@ fn install_windows() -> Result<(), String> {
std::fs::write(&path, json).map_err(|e| format!("failed to write backup: {}", e))?; std::fs::write(&path, json).map_err(|e| format!("failed to write backup: {}", e))?;
} }
// On re-install, stop the running service first so the binary can be
// overwritten and port 53 is released for the Dnscache probe.
if is_service_registered() {
eprintln!(" Stopping existing service...");
stop_service_scm();
}
let needs_reboot = disable_dnscache()?;
// Copy the binary to a stable path under ProgramData and register it
// as a real Windows service (SCM-managed, boot-time, auto-restart).
let service_exe = install_service_binary()?;
register_service_scm(&service_exe)?;
if needs_reboot {
// Dnscache still holds port 53 until reboot. Do NOT redirect DNS
// yet — nothing is listening on 127.0.0.1:53, so redirecting now
// would kill DNS. The service will call redirect_dns_to_localhost()
// on its first startup after reboot.
} else {
redirect_dns_with_interfaces(&interfaces)?;
match start_service_scm() {
Ok(_) => eprintln!(" Service started."),
Err(e) => eprintln!(
" warning: service registered but could not start now: {}",
e
),
}
}
eprintln!();
if !has_useful_existing {
eprintln!(" Original DNS saved to {}", path.display());
}
eprintln!(" Run 'numa uninstall' to restore.\n");
if needs_reboot {
eprintln!(" *** Reboot required. Numa will start automatically. ***\n");
} else {
eprintln!(" Numa is running.\n");
}
print_recursive_hint();
Ok(())
}
/// Stable install location for the service binary. SCM keeps a handle to
/// this path; the user's Downloads folder (where `current_exe()` points at
/// install time) is not durable.
#[cfg(windows)]
fn windows_service_exe_path() -> std::path::PathBuf {
crate::data_dir().join("bin").join("numa.exe")
}
/// Run `sc.exe` with the given args and return its merged stdout/stderr on
/// failure. `sc` emits errors on stdout (not stderr) on Windows, so the
/// caller reads stdout to format a useful error.
#[cfg(windows)]
fn run_sc(args: &[&str]) -> Result<std::process::Output, String> {
let out = std::process::Command::new("sc")
.args(args)
.output()
.map_err(|e| format!("failed to run sc {}: {}", args.first().unwrap_or(&""), e))?;
Ok(out)
}
/// Point all active network interfaces at 127.0.0.1 so Numa handles DNS.
/// Called from the service on first boot after a reboot that freed Dnscache.
#[cfg(windows)]
pub fn redirect_dns_to_localhost() -> Result<(), String> {
let interfaces = get_windows_interfaces()?;
redirect_dns_with_interfaces(&interfaces)
}
#[cfg(windows)]
fn redirect_dns_with_interfaces(
interfaces: &std::collections::HashMap<String, WindowsInterfaceDns>,
) -> Result<(), String> {
for name in interfaces.keys() { for name in interfaces.keys() {
let status = std::process::Command::new("netsh") let status = std::process::Command::new("netsh")
.args([ .args([
@@ -695,63 +784,184 @@ fn install_windows() -> Result<(), String> {
); );
} }
} }
let needs_reboot = disable_dnscache()?;
register_autostart();
eprintln!();
if !has_useful_existing {
eprintln!(" Original DNS saved to {}", path.display());
}
eprintln!(" Run 'numa uninstall' to restore.\n");
if needs_reboot {
eprintln!(" *** Reboot required. Numa will start automatically. ***\n");
} else {
eprintln!(" Numa will start automatically on next boot.\n");
}
print_recursive_hint();
Ok(()) Ok(())
} }
/// Register numa to auto-start on boot via registry Run key. /// Copy the currently-running binary to the service install location. SCM
/// keeps a handle to this path, so it must be stable across user sessions.
#[cfg(windows)] #[cfg(windows)]
fn register_autostart() { fn install_service_binary() -> Result<std::path::PathBuf, String> {
let exe = std::env::current_exe() let src = std::env::current_exe().map_err(|e| format!("current_exe(): {}", e))?;
.map(|p| p.to_string_lossy().to_string()) let dst = windows_service_exe_path();
.unwrap_or_else(|_| "numa".into()); if let Some(parent) = dst.parent() {
let _ = std::process::Command::new("reg") std::fs::create_dir_all(parent)
.args([ .map_err(|e| format!("failed to create {}: {}", parent.display(), e))?;
"add", }
"HKLM\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run", // Copy only if source and destination differ; running the binary from
"/v", // its install location is a supported (re-install) case.
"Numa", if src != dst {
"/t", std::fs::copy(&src, &dst).map_err(|e| {
"REG_SZ", format!(
"/d", "failed to copy {} -> {}: {}",
&exe, src.display(),
"/f", dst.display(),
]) e
.status(); )
eprintln!(" Registered auto-start on boot."); })?;
}
Ok(dst)
} }
/// Remove numa auto-start registry key. /// Remove the service binary on uninstall. Ignore failures — the service
/// is already deleted; a leftover file in ProgramData is not a hard error.
#[cfg(windows)] #[cfg(windows)]
fn remove_autostart() { fn remove_service_binary() {
let _ = std::process::Command::new("reg") let _ = std::fs::remove_file(windows_service_exe_path());
.args([ }
"delete",
"HKLM\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run", /// Register numa with the Service Control Manager, boot-time auto-start,
"/v", /// LocalSystem context, with a failure policy of restart-after-5s.
"Numa", #[cfg(windows)]
"/f", fn register_service_scm(exe: &std::path::Path) -> Result<(), String> {
]) let bin_path = format!("\"{}\" --service", exe.display());
.status(); let name = crate::windows_service::SERVICE_NAME;
// sc.exe uses a leading space as its `name= value` delimiter; the space
// after `=` is mandatory.
let create = run_sc(&[
"create",
name,
"binPath=",
&bin_path,
"DisplayName=",
"Numa DNS",
"start=",
"auto",
"obj=",
"LocalSystem",
])?;
if !create.status.success() {
let out = String::from_utf8_lossy(&create.stdout);
// "service already exists" is 1073 — treat as idempotent success.
if !out.contains("1073") {
return Err(format!("sc create failed: {}", out.trim()));
}
}
let _ = run_sc(&[
"description",
name,
"Self-sovereign DNS resolver (ad blocking, DoH/DoT, local zones).",
]);
// Restart on crash: 5s, 5s, 10s; reset failure counter after 60s.
let _ = run_sc(&[
"failure",
name,
"reset=",
"60",
"actions=",
"restart/5000/restart/5000/restart/10000",
]);
eprintln!(" Registered service '{}' (boot-time).", name);
Ok(())
}
/// Start the service. Safe to call on a freshly-registered service — SCM
/// will fail with 1056 ("already running") or 1058 ("disabled") and we
/// return the underlying error string rather than masking it.
#[cfg(windows)]
fn start_service_scm() -> Result<(), String> {
let out = run_sc(&["start", crate::windows_service::SERVICE_NAME])?;
if !out.status.success() {
let text = String::from_utf8_lossy(&out.stdout);
if text.contains("1056") {
return Ok(()); // already running
}
return Err(format!("sc start failed: {}", text.trim()));
}
Ok(())
}
/// Stop the service and wait for it to fully exit. Idempotent —
/// already-stopped or missing service is not an error.
#[cfg(windows)]
fn stop_service_scm() {
let name = crate::windows_service::SERVICE_NAME;
let _ = run_sc(&["stop", name]);
// Wait up to 10s for the service to reach STOPPED state so the
// binary file handle is released before we try to overwrite it.
for _ in 0..20 {
if let Ok(out) = run_sc(&["query", name]) {
let text = String::from_utf8_lossy(&out.stdout);
if text.contains("STOPPED") || text.contains("1060") {
return;
}
}
std::thread::sleep(std::time::Duration::from_millis(500));
}
eprintln!(" warning: service did not stop within 10s");
}
/// Remove the service from SCM. Idempotent — see `stop_service_scm`.
#[cfg(windows)]
fn delete_service_scm() {
if let Err(e) = run_sc(&["delete", crate::windows_service::SERVICE_NAME]) {
log::warn!("sc delete failed: {}", e);
}
}
/// Check whether the service is registered with SCM (regardless of state).
#[cfg(windows)]
fn is_service_registered() -> bool {
run_sc(&["query", crate::windows_service::SERVICE_NAME])
.map(|o| parse_sc_registered(o.status.success(), &String::from_utf8_lossy(&o.stdout)))
.unwrap_or(false)
}
/// Parse `sc query` output to determine if a service is registered.
/// Extracted for testability — the actual `sc` call is in `is_service_registered`.
#[cfg(any(windows, test))]
fn parse_sc_registered(exit_success: bool, stdout: &str) -> bool {
if exit_success {
return true;
}
// Error 1060 = "The specified service does not exist as an installed service."
!stdout.contains("1060")
}
/// Print service state from SCM.
#[cfg(windows)]
fn service_status_windows() -> Result<(), String> {
let out = run_sc(&["query", crate::windows_service::SERVICE_NAME])?;
let text = String::from_utf8_lossy(&out.stdout);
let display = parse_sc_state(&text);
eprintln!(" {}\n", display);
Ok(())
}
/// Parse the STATE line from `sc query` output. Returns a human-readable
/// string like "STATE : 4 RUNNING" or "Service is not installed."
#[cfg(any(windows, test))]
fn parse_sc_state(sc_output: &str) -> String {
if sc_output.contains("1060") {
return "Service is not installed.".to_string();
}
sc_output
.lines()
.find(|l| l.contains("STATE"))
.map(|l| l.trim().to_string())
.unwrap_or_else(|| "unknown".to_string())
} }
#[cfg(windows)] #[cfg(windows)]
fn uninstall_windows() -> Result<(), String> { fn uninstall_windows() -> Result<(), String> {
remove_autostart(); // Stop + remove the service before touching DNS, so port 53 is released
// cleanly and the failure-restart policy doesn't resurrect it.
stop_service_scm();
delete_service_scm();
remove_service_binary();
let path = windows_backup_path(); let path = windows_backup_path();
let json = std::fs::read_to_string(&path) let json = std::fs::read_to_string(&path)
.map_err(|e| format!("no backup found at {}: {}", path.display(), e))?; .map_err(|e| format!("no backup found at {}: {}", path.display(), e))?;
@@ -827,7 +1037,7 @@ fn uninstall_windows() -> Result<(), String> {
pub fn match_forwarding_rule<'a>( pub fn match_forwarding_rule<'a>(
domain: &str, domain: &str,
rules: &'a [ForwardingRule], rules: &'a [ForwardingRule],
) -> Option<&'a Upstream> { ) -> Option<&'a UpstreamPool> {
for rule in rules { for rule in rules {
if domain == rule.suffix || domain.ends_with(&rule.dot_suffix) { if domain == rule.suffix || domain.ends_with(&rule.dot_suffix) {
return Some(&rule.upstream); return Some(&rule.upstream);
@@ -1048,6 +1258,62 @@ pub fn install_service() -> Result<(), String> {
result result
} }
/// Start the service. If already installed, just starts it via the platform
/// service manager. If not installed, falls through to a full install.
pub fn start_service() -> Result<(), String> {
#[cfg(target_os = "macos")]
{
install_service()
}
#[cfg(target_os = "linux")]
{
install_service()
}
#[cfg(windows)]
{
if is_service_registered() {
start_service_scm()?;
eprintln!(" Service started.\n");
Ok(())
} else {
install_service()
}
}
#[cfg(not(any(target_os = "macos", target_os = "linux", windows)))]
{
Err("service start not supported on this OS".to_string())
}
}
/// Stop the service without uninstalling it.
pub fn stop_service() -> Result<(), String> {
#[cfg(target_os = "macos")]
{
uninstall_service()
}
#[cfg(target_os = "linux")]
{
uninstall_service()
}
#[cfg(windows)]
{
let out = run_sc(&["stop", crate::windows_service::SERVICE_NAME])?;
if !out.status.success() {
let text = String::from_utf8_lossy(&out.stdout);
// 1062 = not started, 1060 = does not exist
if !text.contains("1062") && !text.contains("1060") {
return Err(format!("sc stop failed: {}", text.trim()));
}
}
eprintln!(" Service stopped.\n");
Ok(())
}
#[cfg(not(any(target_os = "macos", target_os = "linux", windows)))]
{
Err("service stop not supported on this OS".to_string())
}
}
/// Uninstall the Numa system service. /// Uninstall the Numa system service.
pub fn uninstall_service() -> Result<(), String> { pub fn uninstall_service() -> Result<(), String> {
let _ = untrust_ca(); let _ = untrust_ca();
@@ -1117,7 +1383,14 @@ pub fn restart_service() -> Result<(), String> {
eprintln!(" Service restarted → {}\n", version); eprintln!(" Service restarted → {}\n", version);
Ok(()) Ok(())
} }
#[cfg(not(any(target_os = "macos", target_os = "linux")))] #[cfg(windows)]
{
stop_service_scm();
start_service_scm()?;
eprintln!(" Service restarted.\n");
Ok(())
}
#[cfg(not(any(target_os = "macos", target_os = "linux", windows)))]
{ {
Err("service restart not supported on this OS".to_string()) Err("service restart not supported on this OS".to_string())
} }
@@ -1133,13 +1406,17 @@ pub fn service_status() -> Result<(), String> {
{ {
service_status_linux() service_status_linux()
} }
#[cfg(not(any(target_os = "macos", target_os = "linux")))] #[cfg(windows)]
{
service_status_windows()
}
#[cfg(not(any(target_os = "macos", target_os = "linux", windows)))]
{ {
Err("service status not supported on this OS".to_string()) Err("service status not supported on this OS".to_string())
} }
} }
#[cfg(any(target_os = "macos", target_os = "linux"))] #[cfg(target_os = "macos")]
fn replace_exe_path(service: &str) -> Result<String, String> { fn replace_exe_path(service: &str) -> Result<String, String> {
let exe_path = let exe_path =
std::env::current_exe().map_err(|e| format!("failed to get current exe: {}", e))?; std::env::current_exe().map_err(|e| format!("failed to get current exe: {}", e))?;
@@ -1387,10 +1664,78 @@ fn uninstall_linux() -> Result<(), String> {
Ok(()) Ok(())
} }
/// Fallback install location when current_exe() sits on a path the
/// dynamic user cannot traverse (e.g. `/home/<user>/` mode 0700).
#[cfg(target_os = "linux")]
fn linux_service_exe_path() -> std::path::PathBuf {
std::path::PathBuf::from("/usr/local/bin/numa")
}
/// True iff every ancestor of `p` (excluding `/`) grants world-execute —
/// i.e. the `DynamicUser=yes` service account can traverse the path and
/// exec the binary without being in any group. Linuxbrew's
/// `/home/linuxbrew` is 0755 (traversable, keep brew's path, upgrades
/// via `brew` propagate). A build tree under `/home/<user>/` (0700) or
/// `~/.cargo/bin/` is not (copy to /usr/local/bin so systemd can reach it).
#[cfg(target_os = "linux")]
fn path_world_traversable_linux(p: &std::path::Path) -> bool {
use std::os::unix::fs::PermissionsExt;
let mut current = p;
while let Some(parent) = current.parent() {
if parent.as_os_str().is_empty() || parent == std::path::Path::new("/") {
break;
}
match std::fs::metadata(parent) {
Ok(m) if m.permissions().mode() & 0o001 != 0 => {}
_ => return false,
}
current = parent;
}
true
}
#[cfg(target_os = "linux")]
fn install_service_binary_linux() -> Result<std::path::PathBuf, String> {
let src = std::env::current_exe().map_err(|e| format!("current_exe(): {}", e))?;
if path_world_traversable_linux(&src) {
return Ok(src);
}
let dst = linux_service_exe_path();
if src == dst {
return Ok(dst);
}
if let Some(parent) = dst.parent() {
std::fs::create_dir_all(parent)
.map_err(|e| format!("failed to create {}: {}", parent.display(), e))?;
}
// Atomic replace via temp + rename. Plain copy fails with ETXTBSY when
// re-installing while the service is running the previous binary —
// rename swaps the path while the running process keeps the old inode.
let tmp = dst.with_extension("new");
std::fs::copy(&src, &tmp).map_err(|e| {
format!(
"failed to copy {} -> {}: {}",
src.display(),
tmp.display(),
e
)
})?;
std::fs::rename(&tmp, &dst).map_err(|e| {
let _ = std::fs::remove_file(&tmp);
format!(
"failed to rename {} -> {}: {}",
tmp.display(),
dst.display(),
e
)
})?;
Ok(dst)
}
#[cfg(target_os = "linux")] #[cfg(target_os = "linux")]
fn install_service_linux() -> Result<(), String> { fn install_service_linux() -> Result<(), String> {
let unit = include_str!("../numa.service"); let exe = install_service_binary_linux()?;
let unit = replace_exe_path(unit)?; let unit = include_str!("../numa.service").replace("{{exe_path}}", &exe.to_string_lossy());
std::fs::write(SYSTEMD_UNIT, unit) std::fs::write(SYSTEMD_UNIT, unit)
.map_err(|e| format!("failed to write {}: {}", SYSTEMD_UNIT, e))?; .map_err(|e| format!("failed to write {}: {}", SYSTEMD_UNIT, e))?;
@@ -1402,7 +1747,9 @@ fn install_service_linux() -> Result<(), String> {
eprintln!(" warning: failed to configure system DNS: {}", e); eprintln!(" warning: failed to configure system DNS: {}", e);
} }
run_systemctl(&["start", "numa"])?; // restart, not start: on re-install the service is already running
// the previous binary; restart picks up the new one.
run_systemctl(&["restart", "numa"])?;
eprintln!(" Service installed and started."); eprintln!(" Service installed and started.");
eprintln!(" Numa will auto-start on boot and restart if killed."); eprintln!(" Numa will auto-start on boot and restart if killed.");
@@ -1718,22 +2065,25 @@ Wireless LAN adapter Wi-Fi:
} }
#[test] #[test]
#[cfg(any(target_os = "macos", target_os = "linux"))] fn install_templates_contain_exe_path_placeholder() {
fn replace_exe_path_substitutes_template() { // Both files are substituted at install time — plist via
// replace_exe_path on macOS, numa.service via inline .replace
// in install_service_linux. Catch placeholder removal early.
let plist = include_str!("../com.numa.dns.plist"); let plist = include_str!("../com.numa.dns.plist");
let unit = include_str!("../numa.service"); let unit = include_str!("../numa.service");
assert!(plist.contains("{{exe_path}}"), "plist missing placeholder"); assert!(plist.contains("{{exe_path}}"), "plist missing placeholder");
assert!( assert!(
unit.contains("{{exe_path}}"), unit.contains("{{exe_path}}"),
"unit file missing placeholder" "unit file missing placeholder"
); );
}
#[test]
#[cfg(target_os = "macos")]
fn replace_exe_path_substitutes_template() {
let plist = include_str!("../com.numa.dns.plist");
let result = replace_exe_path(plist).expect("replace_exe_path failed for plist"); let result = replace_exe_path(plist).expect("replace_exe_path failed for plist");
assert!(!result.contains("{{exe_path}}")); assert!(!result.contains("{{exe_path}}"));
let result = replace_exe_path(unit).expect("replace_exe_path failed for unit");
assert!(!result.contains("{{exe_path}}"));
} }
#[test] #[test]
@@ -1867,4 +2217,57 @@ Wireless LAN adapter Wi-Fi:
let err = std::io::Error::from(std::io::ErrorKind::AddrInUse); let err = std::io::Error::from(std::io::ErrorKind::AddrInUse);
assert!(try_port53_advisory("not-an-address", &err).is_none()); assert!(try_port53_advisory("not-an-address", &err).is_none());
} }
#[test]
fn sc_query_running_service_is_registered() {
assert!(parse_sc_registered(true, ""));
}
#[test]
fn sc_query_stopped_service_is_registered() {
let output = "SERVICE_NAME: Numa\n TYPE: 10 WIN32_OWN\n STATE: 1 STOPPED\n";
assert!(parse_sc_registered(true, output));
}
#[test]
fn sc_query_missing_service_not_registered() {
let output = "[SC] EnumQueryServicesStatus:OpenService FAILED 1060:\n\nThe specified service does not exist as an installed service.\n";
assert!(!parse_sc_registered(false, output));
}
#[test]
fn sc_query_other_error_assumes_registered() {
// Permission denied or other errors — don't assume unregistered.
let output = "[SC] OpenService FAILED 5:\n\nAccess is denied.\n";
assert!(parse_sc_registered(false, output));
}
#[test]
fn parse_sc_state_running() {
let output = "SERVICE_NAME: Numa\n TYPE : 10 WIN32_OWN_PROCESS\n STATE : 4 RUNNING\n WIN32_EXIT_CODE : 0\n";
assert!(parse_sc_state(output).contains("RUNNING"));
}
#[test]
fn parse_sc_state_stopped() {
let output = "SERVICE_NAME: Numa\n TYPE : 10 WIN32_OWN_PROCESS\n STATE : 1 STOPPED\n";
assert!(parse_sc_state(output).contains("STOPPED"));
}
#[test]
fn parse_sc_state_not_installed() {
let output = "[SC] EnumQueryServicesStatus:OpenService FAILED 1060:\n\n";
assert_eq!(parse_sc_state(output), "Service is not installed.");
}
#[test]
fn parse_sc_state_empty_output() {
assert_eq!(parse_sc_state(""), "unknown");
}
#[cfg(windows)]
#[test]
fn windows_config_dir_equals_data_dir() {
assert_eq!(crate::config_dir(), crate::data_dir());
}
} }

View File

@@ -63,6 +63,7 @@ pub async fn test_ctx() -> ServerCtx {
ca_pem: None, ca_pem: None,
mobile_enabled: false, mobile_enabled: false,
mobile_port: 8765, mobile_port: 8765,
filter_aaaa: false,
} }
} }

147
src/windows_service.rs Normal file
View File

@@ -0,0 +1,147 @@
//! Windows service wrapper.
//!
//! Lets the `numa.exe` binary act as a real Windows service registered with
//! the Service Control Manager (SCM). Invoked via `numa.exe --service` (the
//! form that `sc create … binPath=` uses).
//!
//! Interactive runs (`numa.exe`, `numa.exe run`, `numa.exe install`) do not
//! go through this module — they keep their existing console-attached
//! behaviour.
use std::ffi::OsString;
use std::sync::mpsc;
use std::time::Duration;
use windows_service::service::{
ServiceControl, ServiceControlAccept, ServiceExitCode, ServiceState, ServiceStatus, ServiceType,
};
use windows_service::service_control_handler::{self, ServiceControlHandlerResult};
use windows_service::{define_windows_service, service_dispatcher};
pub const SERVICE_NAME: &str = "Numa";
define_windows_service!(ffi_service_main, service_main);
/// Entry point the SCM hands control to after `StartServiceCtrlDispatcherW`.
/// Any panic here vanishes silently into the service host — log instead of
/// unwrapping.
fn service_main(_arguments: Vec<OsString>) {
if let Err(e) = run_service() {
log::error!("numa service exited with error: {:?}", e);
}
}
fn run_service() -> windows_service::Result<()> {
let (shutdown_tx, shutdown_rx) = mpsc::channel::<()>();
let event_handler = move |control_event| -> ServiceControlHandlerResult {
match control_event {
ServiceControl::Stop | ServiceControl::Shutdown => {
let _ = shutdown_tx.send(());
ServiceControlHandlerResult::NoError
}
ServiceControl::Interrogate => ServiceControlHandlerResult::NoError,
_ => ServiceControlHandlerResult::NotImplemented,
}
};
let status_handle = service_control_handler::register(SERVICE_NAME, event_handler)?;
status_handle.set_service_status(ServiceStatus {
service_type: ServiceType::OWN_PROCESS,
current_state: ServiceState::Running,
controls_accepted: ServiceControlAccept::STOP | ServiceControlAccept::SHUTDOWN,
exit_code: ServiceExitCode::Win32(0),
checkpoint: 0,
wait_hint: Duration::default(),
process_id: None,
})?;
// Spin up a multi-threaded tokio runtime and run the server on it. A
// dedicated thread runs the runtime so this function can return cleanly
// once the SCM tells us to stop — we can't block the dispatcher thread
// forever without preventing graceful shutdown.
let config_path = service_config_path();
let (server_done_tx, server_done_rx) = mpsc::channel::<()>();
let server_thread = std::thread::spawn(move || {
let runtime = match tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()
{
Ok(rt) => rt,
Err(e) => {
log::error!("failed to build tokio runtime: {}", e);
let _ = server_done_tx.send(());
return;
}
};
if let Err(e) = runtime.block_on(crate::serve::run(config_path)) {
log::error!("numa serve exited with error: {}", e);
}
let _ = server_done_tx.send(());
});
// Wait for the API to be ready, then ensure DNS points at localhost.
// On first boot after install (Dnscache was disabled, reboot freed
// port 53), the installer deferred the DNS redirect — do it now.
let api_up = (0..20).any(|i| {
if i > 0 {
std::thread::sleep(Duration::from_millis(500));
}
std::net::TcpStream::connect(("127.0.0.1", crate::config::DEFAULT_API_PORT)).is_ok()
});
if api_up {
if let Err(e) = crate::system_dns::redirect_dns_to_localhost() {
log::warn!("could not redirect DNS to localhost: {}", e);
}
} else {
log::error!("numa API did not start within 10s — DNS not redirected");
}
// Wait for either SCM stop or server termination.
loop {
if shutdown_rx.recv_timeout(Duration::from_millis(500)).is_ok() {
break;
}
if server_done_rx.try_recv().is_ok() {
break;
}
}
// The server's tokio runtime runs detached inside server_thread. Abandon
// it — the process is about to report Stopped and the SCM will terminate
// us if we linger. Future work: plumb a cancellation signal into
// serve::run() for a clean teardown of listeners and in-flight queries.
drop(server_thread);
status_handle.set_service_status(ServiceStatus {
service_type: ServiceType::OWN_PROCESS,
current_state: ServiceState::Stopped,
controls_accepted: ServiceControlAccept::empty(),
exit_code: ServiceExitCode::Win32(0),
checkpoint: 0,
wait_hint: Duration::default(),
process_id: None,
})?;
Ok(())
}
/// Hand control to the SCM dispatcher. Blocks until the service stops.
/// Call only from the `--service` command path — interactive invocations
/// will hang here waiting for an SCM that isn't talking to them.
pub fn run_as_service() -> windows_service::Result<()> {
service_dispatcher::start(SERVICE_NAME, ffi_service_main)
}
/// Path to the config file used when running under SCM. SCM launches the
/// service with SYSTEM's working directory (usually `C:\Windows\System32`),
/// so a relative `numa.toml` lookup won't find anything meaningful.
fn service_config_path() -> String {
crate::data_dir()
.join("numa.toml")
.to_string_lossy()
.into_owned()
}

288
tests/docker/install-systemd.sh Executable file
View File

@@ -0,0 +1,288 @@
#!/usr/bin/env bash
#
# Systemd service install verification for the DynamicUser-based Linux
# service unit. Stands up a privileged ubuntu:24.04 container with systemd
# as PID 1, builds numa inside, runs three scenarios that CI does not:
#
# A. Fresh install — every advertised port is not just bound but
# functional (DNS resolves on :53, TLS handshake validates against
# numa's CA on :853/:443, HTTP responds on :80, API on :5380).
# B. Upgrade from pre-drop layout (root-owned /var/lib/numa) preserves
# the CA fingerprint — users' browser-installed CA trust survives.
# C. Install from a 0700 source directory stages the binary under
# /usr/local/bin/numa and the service starts from there.
#
# First run is slow (~5-10 min): image pull + apt + cold cargo build.
# Subsequent runs reuse cached docker volumes for cargo + target (~30s).
#
# Requirements: docker
# Usage: ./tests/docker/install-systemd.sh
set -u
set -o pipefail
GREEN="\033[32m"; RED="\033[31m"; RESET="\033[0m"
pass() { printf " ${GREEN}PASS${RESET}: %s\n" "$*"; }
fail() { printf " ${RED}FAIL${RESET}: %s\n" "$*"; FAIL=1; }
# ============================================================
# Mode B: running inside the systemd container — run scenarios
# ============================================================
if [ "${NUMA_INSIDE:-}" = "1" ]; then
set +e # assertions report pass/fail, don't abort
FAIL=0
NUMA=/work/target/release/numa
reset_state() {
"$NUMA" uninstall >/dev/null 2>&1 || true
systemctl reset-failed numa 2>/dev/null || true
rm -rf /var/lib/numa /var/lib/private/numa /etc/numa /home/builder /usr/local/bin/numa
systemctl daemon-reload 2>/dev/null || true
}
main_pid_user() {
local pid
pid=$(systemctl show -p MainPID --value numa)
[ "$pid" != "0" ] || { echo ""; return; }
ps -o user= -p "$pid" 2>/dev/null | tr -d ' '
}
# MainPID + user briefly stabilize after a fresh restart. Retry so we
# don't race the moment systemd flips the service to "active" vs when
# the forked numa process actually owns MainPID.
assert_nonroot() {
local pid user comm n=0
while [ $n -lt 20 ]; do
pid=$(systemctl show -p MainPID --value numa)
if [ "$pid" != "0" ]; then
comm=$(ps -o comm= -p "$pid" 2>/dev/null | tr -d ' ')
user=$(ps -o user= -p "$pid" 2>/dev/null | tr -d ' ')
if [ "$comm" = "numa" ]; then
if [ "$user" = "root" ]; then
fail "daemon runs as root (expected transient UID)"
else
pass "daemon runs as $user (non-root)"
fi
return
fi
fi
sleep 0.2
n=$((n + 1))
done
fail "numa MainPID did not settle (last: pid=${pid:-?} comm=${comm:-?} user=${user:-?})"
}
# Functional DNS check: just "port 53 bound" isn't enough — systemd-resolved
# listens on 127.0.0.53 and would satisfy a bind test. Retries for ~15s
# to tolerate cold-start upstream / blocklist warmup.
assert_dns_works() {
local n=0
while [ $n -lt 15 ]; do
if dig @127.0.0.1 -p 53 example.com +short +timeout=2 +tries=1 2>/dev/null \
| grep -qE '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'; then
pass "DNS resolves on :53 (A record returned)"
return
fi
sleep 1
n=$((n + 1))
done
fail "DNS did not return an A record on :53 within 15s"
}
# TLS handshake: cert must validate against numa's CA when connecting
# to a .numa SNI. Catches port-not-bound, wrong cert, missing CA file.
assert_tls_handshake() {
local port=$1 sni=${2:-numa.numa} out
if out=$(openssl s_client -connect "127.0.0.1:${port}" \
-servername "$sni" \
-CAfile /var/lib/numa/ca.pem \
-verify_return_error </dev/null 2>&1); then
if echo "$out" | grep -q 'Verify return code: 0 (ok)'; then
pass "TLS handshake + cert chain verified on :${port}"
else
fail "TLS handshake on :${port} did not report 'Verify return code: 0'"
fi
else
fail "openssl s_client failed connecting to :${port}"
fi
}
assert_http_responds() {
local code
code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 3 http://127.0.0.1/ || echo 000)
if [ "$code" != "000" ]; then
pass "HTTP responds on :80 (status $code)"
else
fail "HTTP :80 connection failed"
fi
}
assert_api_healthy() {
if curl -sf --max-time 3 http://127.0.0.1:5380/health >/dev/null; then
pass "API /health OK on :5380"
else
fail "API /health failed on :5380"
fi
}
ca_fingerprint() {
openssl x509 -in /var/lib/numa/ca.pem -noout -fingerprint -sha256 2>/dev/null \
| sed 's/.*=//'
}
wait_active() {
local n=0
while [ $n -lt 20 ]; do
systemctl is-active --quiet numa && return 0
sleep 0.5
n=$((n + 1))
done
fail "service did not become active within 10s"
systemctl status numa --no-pager -l 2>&1 | head -20 || true
return 1
}
# ---- Scenario A ----
printf "\n=== Scenario A: fresh install — every advertised port is functional ===\n"
reset_state
"$NUMA" install >/tmp/installA.log 2>&1 || { fail "install failed"; tail -20 /tmp/installA.log; }
wait_active || true
assert_nonroot
assert_dns_works
assert_tls_handshake 853
assert_tls_handshake 443
assert_http_responds
assert_api_healthy
# ---- Scenario B ----
# Pre-drop installs left /var/lib/numa as a plain root-owned tree.
# Flattening the current DynamicUser layout back into that shape
# simulates the upgrade path without needing an actual old binary.
printf "\n=== Scenario B: CA fingerprint survives upgrade from pre-drop layout ===\n"
fp_before=$(ca_fingerprint)
if [ -z "$fp_before" ]; then
fail "could not read initial CA fingerprint (skipping scenario B)"
else
echo " CA fingerprint before: $fp_before"
"$NUMA" uninstall >/dev/null 2>&1 || true
tmp=$(mktemp -d)
cp -a /var/lib/private/numa/. "$tmp"/ 2>/dev/null || true
rm -rf /var/lib/numa /var/lib/private/numa
mv "$tmp" /var/lib/numa
chown -R root:root /var/lib/numa
chmod 755 /var/lib/numa
[ -f /var/lib/numa/ca.pem ] || fail "ca.pem missing from seeded legacy tree"
"$NUMA" install >/tmp/installB.log 2>&1 || { fail "upgrade install failed"; tail -20 /tmp/installB.log; }
wait_active || true
assert_nonroot
fp_after=$(ca_fingerprint)
if [ -z "$fp_after" ]; then
fail "could not read CA fingerprint after upgrade"
elif [ "$fp_before" = "$fp_after" ]; then
pass "CA fingerprint preserved across upgrade"
else
fail "CA fingerprint changed: before=$fp_before after=$fp_after"
fi
assert_dns_works
fi
# ---- Scenario C ----
printf "\n=== Scenario C: install from unreachable source stages binary to /usr/local/bin ===\n"
reset_state
mkdir -p /home/builder
chmod 700 /home/builder
cp "$NUMA" /home/builder/numa
chmod 755 /home/builder/numa
/home/builder/numa install >/tmp/installC.log 2>&1 || { fail "install failed"; tail -20 /tmp/installC.log; }
wait_active || true
if [ -x /usr/local/bin/numa ]; then
pass "binary staged to /usr/local/bin/numa"
else
fail "/usr/local/bin/numa missing after install from 0700 source"
fi
exec_line=$(grep '^ExecStart=' /etc/systemd/system/numa.service 2>/dev/null || echo "ExecStart=<unit missing>")
if echo "$exec_line" | grep -q '/usr/local/bin/numa'; then
pass "unit ExecStart points to staged path"
else
fail "unit ExecStart wrong: $exec_line"
fi
assert_nonroot
assert_dns_works
reset_state
rm -rf /home/builder
echo
if [ "$FAIL" -eq 0 ]; then
printf "${GREEN}── all scenarios passed ──${RESET}\n"
exit 0
else
printf "${RED}── some scenarios failed ──${RESET}\n"
exit 1
fi
fi
# ============================================================
# Mode A: host-side bootstrap
# ============================================================
set -e
cd "$(dirname "$0")/../.."
IMAGE=numa-install-systemd:local
CONTAINER="numa-install-systemd-$$"
trap 'docker rm -f "$CONTAINER" >/dev/null 2>&1 || true' EXIT
echo "── building systemd-in-container image (cached after first run) ──"
docker build --quiet -t "$IMAGE" -f - . <<'DOCKERFILE' >/dev/null
FROM ubuntu:24.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq && apt-get install -y -qq \
systemd systemd-sysv systemd-resolved \
ca-certificates curl build-essential \
pkg-config libssl-dev cmake make perl \
dnsutils iproute2 openssl \
&& rm -rf /var/lib/apt/lists/* \
&& for u in dev-hugepages.mount sys-fs-fuse-connections.mount \
systemd-logind.service getty.target console-getty.service; do \
systemctl mask $u; \
done
STOPSIGNAL SIGRTMIN+3
CMD ["/lib/systemd/systemd"]
DOCKERFILE
echo "── starting systemd container ──"
docker run -d --name "$CONTAINER" \
--privileged --cgroupns=host \
--tmpfs /run --tmpfs /run/lock --tmpfs /tmp:exec \
-v "$PWD:/src:ro" \
-v numa-install-systemd-cargo:/root/.cargo \
-v numa-install-systemd-work:/work \
"$IMAGE" >/dev/null
# Wait for systemd to be up
for _ in $(seq 1 30); do
state=$(docker exec "$CONTAINER" systemctl is-system-running 2>&1 || true)
case "$state" in running|degraded) break ;; esac
sleep 0.5
done
echo "── copying source into /work (writable) ──"
docker exec "$CONTAINER" bash -c '
mkdir -p /work
tar -C /src --exclude=./target --exclude=./.git --exclude=./.claude -cf - . | tar -C /work -xf -
'
echo "── rustup + cargo build --release --locked ──"
docker exec "$CONTAINER" bash -c '
set -e
if ! command -v cargo &>/dev/null; then
curl -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --quiet
fi
. "$HOME/.cargo/env"
cd /work
cargo build --release --locked 2>&1 | tail -5
'
echo "── running scenarios ──"
docker exec -e NUMA_INSIDE=1 "$CONTAINER" bash /src/tests/docker/install-systemd.sh

View File

@@ -0,0 +1,155 @@
#!/usr/bin/env bash
#
# Reproducer for issue #122 — chicken-and-egg when numa is its own system
# resolver (HAOS add-on, Pi-hole-style container, laptop with
# resolv.conf → 127.0.0.1).
#
# Topology:
# container /etc/resolv.conf → nameserver 127.0.0.1
# numa bound on :53 → upstream DoH by hostname (quad9)
# numa boots → spawns blocklist download
# reqwest::get → getaddrinfo("cdn.jsdelivr.net")
# → loopback UDP :53 → numa → cache miss → DoH upstream
# → getaddrinfo("dns.quad9.net") → same loop → glibc EAI_AGAIN
#
# Expected on master: both assertions FAIL (bug reproduced).
# Expected after bootstrap-IP fix: both assertions PASS.
#
# Requirements: docker (with internet access for external lists/DoH)
# Usage: ./tests/docker/self-resolver-loop.sh
set -euo pipefail
cd "$(dirname "$0")/../.."
GREEN="\033[32m"; RED="\033[31m"; RESET="\033[0m"
pass() { printf " ${GREEN}${RESET} %s\n" "$1"; }
fail() { printf " ${RED}${RESET} %s\n" "$1"; printf " %s\n" "$2"; FAILED=$((FAILED+1)); }
FAILED=0
OUT=/tmp/numa-self-resolver.out
echo "── self-resolver-loop: building + reproducing on debian:bookworm ──"
echo " (first run is slow: image pull + cold cargo build, ~5-8 min)"
echo
docker run --rm \
-v "$PWD:/src:ro" \
-v numa-self-resolver-cargo:/root/.cargo \
-v numa-self-resolver-target:/work/target \
debian:bookworm bash -c '
set -e
# Phase 1: install deps + build with the container DNS as given by Docker
# (resolves deb.debian.org, static.rust-lang.org, crates.io).
apt-get update -qq && apt-get install -y -qq curl build-essential dnsutils 2>&1 | tail -3
if ! command -v cargo &>/dev/null; then
curl -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --quiet
fi
. "$HOME/.cargo/env"
mkdir -p /work
tar -C /src --exclude=./target --exclude=./.git -cf - . | tar -C /work -xf -
cd /work
echo "── cargo build --release --locked ──"
cargo build --release --locked 2>&1 | tail -5
echo
# Phase 2: flip system DNS to numa itself — this is the pathological
# topology from issue #122 (HAOS add-on, resolv.conf → 127.0.0.1).
# Everything after this point, any getaddrinfo call inside numa loops
# back through :53.
echo "nameserver 127.0.0.1" > /etc/resolv.conf
echo "── /etc/resolv.conf inside container (post-flip) ──"
cat /etc/resolv.conf
echo
cat > /tmp/numa.toml <<CONF
[server]
bind_addr = "0.0.0.0:53"
api_port = 5380
api_bind_addr = "127.0.0.1"
data_dir = "/tmp/numa-data"
[upstream]
mode = "forward"
address = ["https://dns.quad9.net/dns-query"]
timeout_ms = 3000
[blocking]
enabled = true
lists = ["https://cdn.jsdelivr.net/gh/hagezi/dns-blocklists@latest/hosts/pro.txt"]
CONF
mkdir -p /tmp/numa-data
echo "── starting numa ──"
RUST_LOG=info ./target/release/numa /tmp/numa.toml > /tmp/numa.log 2>&1 &
NUMA_PID=$!
# Wait up to 120s for blocklist to populate.
# Retry delays 2+10+30s = 42s, plus ~4 × ~10s getaddrinfo timeouts under
# self-loop = ~82s worst case. 120s leaves headroom.
LOADED=0
for i in $(seq 1 120); do
LOADED=$(curl -sf http://127.0.0.1:5380/blocking/stats 2>/dev/null \
| grep -o "\"domains_loaded\":[0-9]*" | cut -d: -f2 || echo 0)
[ "${LOADED:-0}" -gt 100 ] && break
sleep 1
done
# First cold DoH query — time it.
START=$(date +%s%N)
dig @127.0.0.1 example.com A +time=15 +tries=1 > /tmp/dig.out 2>&1 || true
END=$(date +%s%N)
LATENCY_MS=$(( (END - START) / 1000000 ))
STATUS=$(grep -oE "status: [A-Z]+" /tmp/dig.out | head -1 || echo "status: TIMEOUT")
kill $NUMA_PID 2>/dev/null || true
wait $NUMA_PID 2>/dev/null || true
echo
echo "=== RESULT ==="
echo "domains_loaded=$LOADED"
echo "first_query_latency_ms=$LATENCY_MS"
echo "first_query_${STATUS// /_}"
echo
echo "=== numa.log (tail 40) ==="
tail -40 /tmp/numa.log
echo
echo "=== dig.out ==="
cat /tmp/dig.out
' 2>&1 | tee "$OUT"
echo
echo "── assertions ──"
LOADED=$(grep '^domains_loaded=' "$OUT" | tail -1 | cut -d= -f2 || echo 0)
LATENCY=$(grep '^first_query_latency_ms=' "$OUT" | tail -1 | cut -d= -f2 || echo 999999)
STATUS_LINE=$(grep '^first_query_status_' "$OUT" | tail -1 || echo "first_query_status_TIMEOUT")
if [ "${LOADED:-0}" -gt 100 ]; then
pass "blocklist downloaded (domains_loaded=$LOADED)"
else
fail "blocklist downloaded (got domains_loaded=${LOADED:-0}, expected >100)" \
"chicken-and-egg: blocklist HTTPS client has no DNS bootstrap; getaddrinfo loops through numa"
fi
if [ "${LATENCY:-999999}" -lt 2000 ]; then
pass "first DoH query under 2s (latency=${LATENCY}ms, $STATUS_LINE)"
else
fail "first DoH query under 2s (got ${LATENCY}ms, $STATUS_LINE)" \
"self-loop on getaddrinfo(upstream_host); plain DoH needs bootstrap-IP symmetry with ODoH"
fi
echo
if [ "$FAILED" -eq 0 ]; then
printf "${GREEN}── self-resolver-loop passed (fix is in place) ──${RESET}\n"
exit 0
else
printf "${RED}── self-resolver-loop failed ($FAILED assertion(s)) — bug #122 reproduced ──${RESET}\n"
exit 1
fi

View File

@@ -1,7 +1,10 @@
#!/usr/bin/env bash #!/usr/bin/env bash
# Integration test suite for Numa # Integration test suite for Numa
# Runs a test instance on port 5354, validates all features, exits with status. # Runs a test instance on port 5354, validates all features, exits with status.
# Usage: ./tests/integration.sh [release|debug] # Usage:
# ./tests/integration.sh [release|debug] # all suites
# SUITES=7 ./tests/integration.sh # only Suite 7
# SUITES=1,3,7 ./tests/integration.sh # Suites 1, 3, and 7
set -euo pipefail set -euo pipefail
@@ -14,6 +17,14 @@ LOG="/tmp/numa-integration-test.log"
PASSED=0 PASSED=0
FAILED=0 FAILED=0
# Suite filter: empty runs all; comma list runs a subset.
SUITES="${SUITES:-}"
should_run_suite() {
[ -z "$SUITES" ] && return 0
case ",$SUITES," in *",$1,"*) return 0;; esac
return 1
}
# Colors # Colors
GREEN="\033[32m" GREEN="\033[32m"
RED="\033[31m" RED="\033[31m"
@@ -166,6 +177,7 @@ CONF
} }
# ---- Suite 1: Recursive mode + DNSSEC ---- # ---- Suite 1: Recursive mode + DNSSEC ----
if should_run_suite 1; then
echo "" echo ""
echo "╔══════════════════════════════════════════╗" echo "╔══════════════════════════════════════════╗"
echo "║ Suite 1: Recursive + DNSSEC + Blocking ║" echo "║ Suite 1: Recursive + DNSSEC + Blocking ║"
@@ -234,7 +246,10 @@ kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true wait "$NUMA_PID" 2>/dev/null || true
sleep 1 sleep 1
fi # end Suite 1
# ---- Suite 2: Forward mode (backward compat) ---- # ---- Suite 2: Forward mode (backward compat) ----
if should_run_suite 2; then
echo "" echo ""
echo "╔══════════════════════════════════════════╗" echo "╔══════════════════════════════════════════╗"
echo "║ Suite 2: Forward (DoH) + Blocking ║" echo "║ Suite 2: Forward (DoH) + Blocking ║"
@@ -261,7 +276,10 @@ enabled = true
enabled = false enabled = false
" "
fi # end Suite 2
# ---- Suite 3: Forward UDP (plain, no DoH) ---- # ---- Suite 3: Forward UDP (plain, no DoH) ----
if should_run_suite 3; then
echo "" echo ""
echo "╔══════════════════════════════════════════╗" echo "╔══════════════════════════════════════════╗"
echo "║ Suite 3: Forward (UDP) + No Blocking ║" echo "║ Suite 3: Forward (UDP) + No Blocking ║"
@@ -307,7 +325,10 @@ kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true wait "$NUMA_PID" 2>/dev/null || true
sleep 1 sleep 1
fi # end Suite 3
# ---- Suite 4: Local zones + Overrides API ---- # ---- Suite 4: Local zones + Overrides API ----
if should_run_suite 4; then
echo "" echo ""
echo "╔══════════════════════════════════════════╗" echo "╔══════════════════════════════════════════╗"
echo "║ Suite 4: Local Zones + Overrides API ║" echo "║ Suite 4: Local Zones + Overrides API ║"
@@ -416,7 +437,10 @@ kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true wait "$NUMA_PID" 2>/dev/null || true
sleep 1 sleep 1
fi # end Suite 4
# ---- Suite 5: DNS-over-TLS (RFC 7858) ---- # ---- Suite 5: DNS-over-TLS (RFC 7858) ----
if should_run_suite 5; then
echo "" echo ""
echo "╔══════════════════════════════════════════╗" echo "╔══════════════════════════════════════════╗"
echo "║ Suite 5: DNS-over-TLS (RFC 7858) ║" echo "║ Suite 5: DNS-over-TLS (RFC 7858) ║"
@@ -538,7 +562,10 @@ CONF
fi fi
sleep 1 sleep 1
fi # end Suite 5
# ---- Suite 6: Proxy + DoT coexistence ---- # ---- Suite 6: Proxy + DoT coexistence ----
if should_run_suite 6; then
echo "" echo ""
echo "╔══════════════════════════════════════════╗" echo "╔══════════════════════════════════════════╗"
echo "║ Suite 6: Proxy + DoT Coexistence ║" echo "║ Suite 6: Proxy + DoT Coexistence ║"
@@ -698,6 +725,376 @@ CONF
rm -rf "$NUMA_DATA" rm -rf "$NUMA_DATA"
fi fi
fi # end Suite 6
# ---- Suite 7: filter_aaaa (IPv4-only networks) ----
if should_run_suite 7; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 7: filter_aaaa ║"
echo "╚══════════════════════════════════════════╝"
# Config A — filter on, with a local AAAA zone to prove local data bypass.
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
filter_aaaa = true
[upstream]
mode = "forward"
address = "9.9.9.9"
port = 53
[cache]
max_entries = 10000
[blocking]
enabled = false
[proxy]
enabled = false
[[zones]]
domain = "v6.test"
record_type = "AAAA"
value = "2001:db8::1"
ttl = 60
CONF
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
sleep 3
DIG="dig @127.0.0.1 -p $PORT +time=5 +tries=1"
echo ""
echo "=== filter_aaaa = true ==="
# A queries must be untouched.
check "A record resolves under filter_aaaa" \
"." \
"$($DIG google.com A +short | head -1)"
# AAAA must be NOERROR (NODATA), not NXDOMAIN, not SERVFAIL.
check "AAAA returns NOERROR (not NXDOMAIN)" \
"status: NOERROR" \
"$($DIG google.com AAAA 2>&1 | grep 'status:')"
check "AAAA returns zero answers (NODATA shape)" \
"ANSWER: 0" \
"$($DIG google.com AAAA 2>&1 | grep -oE 'ANSWER: [0-9]+' | head -1)"
# Local zone AAAA must survive the filter (PR claim: local data bypasses).
check "Local [[zones]] AAAA bypasses filter" \
"2001:db8::1" \
"$($DIG v6.test AAAA +short)"
# HTTPS RR: ipv6hint (SvcParamKey 6) must be stripped. Query as `type65`
# because dig 9.10.6 (macOS) misparses `HTTPS` as a domain name; `type65`
# works on both 9.10.6 and 9.18. Assert on the raw rdata hex (RFC 3597
# generic format), since dig 9.10.6 doesn't pretty-print HTTPS params.
# cloudflare.com's ipv6hint values sit under the 2606:4700 prefix —
# checking that `26064700` is absent from the rdata hex is a precise,
# upstream-stable signal that the TLV was stripped.
HTTPS_OUT=$($DIG cloudflare.com type65 2>&1)
if echo "$HTTPS_OUT" | grep -qE "cloudflare\.com\..*IN[[:space:]]+TYPE65"; then
HTTPS_HEX=$(echo "$HTTPS_OUT" | grep -A5 "IN[[:space:]]*TYPE65" | tr -d " \t\n")
if echo "$HTTPS_HEX" | grep -qi "26064700"; then
check "HTTPS ipv6hint stripped (2606:4700 absent from rdata)" "absent" "present"
else
check "HTTPS ipv6hint stripped (2606:4700 absent from rdata)" "absent" "absent"
fi
else
# Upstream didn't return an HTTPS record — skip rather than false-pass.
printf " ${DIM}~ HTTPS ipv6hint stripped (skipped: no HTTPS RR returned by upstream)${RESET}\n"
fi
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
# Config B — filter off. Regression guard: prove AAAA answers come back
# when the flag isn't set, so a network failure in Config A can't silently
# pass as "filter working".
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
[upstream]
mode = "forward"
address = "9.9.9.9"
port = 53
[cache]
max_entries = 10000
[blocking]
enabled = false
[proxy]
enabled = false
CONF
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
sleep 3
echo ""
echo "=== filter_aaaa unset (regression guard) ==="
check "AAAA returns real answers with filter off" \
":" \
"$($DIG google.com AAAA +short | head -1)"
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
fi # end Suite 7
# ---- Suite 8: ODoH (Oblivious DoH via public relay + target) ----
# Exercises the full client pipeline: /.well-known/odohconfigs fetch,
# HPKE seal/unseal, URL-query target routing (RFC 9230 §5), dashboard
# QueryPath::Odoh counter. Depends on the public ecosystem being up —
# the probe-odoh-ecosystem.sh script guards against flaky runs.
if should_run_suite 8; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 8: ODoH (Anonymous DNS) ║"
echo "╚══════════════════════════════════════════╝"
run_test_suite "ODoH via edgecompute.app relay → Cloudflare target" "
[server]
bind_addr = \"127.0.0.1:5354\"
api_port = 5381
[upstream]
mode = \"odoh\"
relay = \"https://odoh-relay.edgecompute.app/proxy\"
target = \"https://odoh.cloudflare-dns.com/dns-query\"
[cache]
max_entries = 10000
min_ttl = 60
max_ttl = 86400
[blocking]
enabled = false
[proxy]
enabled = false
"
# Re-start briefly to assert ODoH-specific observability: the odoh counter
# has to tick above zero after a query, and the stats label has to reflect
# the oblivious path. These guard against silent regressions in the
# QueryPath::Odoh tagging and the /stats serialisation.
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
for _ in $(seq 1 30); do
curl -sf "http://127.0.0.1:$API_PORT/health" >/dev/null 2>&1 && break
sleep 0.1
done
$DIG example.com A +short > /dev/null 2>&1 || true
sleep 1
STATS=$(curl -sf http://127.0.0.1:$API_PORT/stats 2>/dev/null)
# upstream_transport.odoh lives inside the upstream_transport object.
ODOH_COUNT=$(echo "$STATS" | grep -o '"upstream_transport":{[^}]*}' \
| grep -o '"odoh":[0-9]*' | cut -d: -f2)
check "upstream_transport.odoh > 0 after a query" "[1-9]" "${ODOH_COUNT:-0}"
check "Upstream label advertises odoh://" \
"odoh://" \
"$(echo "$STATS" | grep -o '"upstream":"[^"]*"')"
check "Stats mode field is 'odoh'" \
'"mode":"odoh"' \
"$(echo "$STATS" | grep -o '"mode":"odoh"')"
# Strict-mode failure path: a clearly-unreachable relay must produce
# SERVFAIL without silent downgrade. We hijack the config to point at
# an .invalid host so we don't rely on external uptime.
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
[upstream]
mode = "odoh"
relay = "https://relay.invalid/proxy"
target = "https://odoh.cloudflare-dns.com/dns-query"
strict = true
[cache]
max_entries = 10000
[blocking]
enabled = false
[proxy]
enabled = false
CONF
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
for _ in $(seq 1 30); do
curl -sf "http://127.0.0.1:$API_PORT/health" >/dev/null 2>&1 && break
sleep 0.1
done
check "Strict-mode relay outage returns SERVFAIL" \
"SERVFAIL" \
"$($DIG example.com A 2>&1 | grep 'status:')"
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
# Negative: relay and target on the same host must be rejected at startup.
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
[upstream]
mode = "odoh"
relay = "https://odoh.cloudflare-dns.com/proxy"
target = "https://odoh.cloudflare-dns.com/dns-query"
CONF
STARTUP_OUT=$("$BINARY" "$CONFIG" 2>&1 || true)
check "Same-host relay+target rejected at startup" \
"same host" \
"$STARTUP_OUT"
# Guards ODoH's zero-plain-DNS-leak property: relay_ip / target_ip must
# land in the bootstrap resolver's override map so reqwest connects direct
# to the configured IPs instead of resolving the hostnames via plain DNS.
# RFC 5737 TEST-NET-1 IPs (unroutable).
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
[upstream]
mode = "odoh"
relay = "https://odoh-relay.example.com/proxy"
target = "https://odoh-target.example.org/dns-query"
relay_ip = "192.0.2.1"
target_ip = "192.0.2.2"
[cache]
max_entries = 10000
[blocking]
enabled = false
[proxy]
enabled = false
CONF
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
for _ in $(seq 1 30); do
curl -sf "http://127.0.0.1:$API_PORT/health" >/dev/null 2>&1 && break
sleep 0.1
done
OVERRIDE_LOG=$(grep 'bootstrap resolver: host overrides' "$LOG" || true)
check "relay_ip wired into bootstrap override map" \
"odoh-relay.example.com=192.0.2.1" \
"$OVERRIDE_LOG"
check "target_ip wired into bootstrap override map" \
"odoh-target.example.org=192.0.2.2" \
"$OVERRIDE_LOG"
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
fi # end Suite 8
# ---- Suite 9: Numa's own ODoH relay (--relay-mode) ----
# Exercises `numa relay PORT` as a forwarding proxy to a real ODoH target.
# Validates the RFC 9230 §5 relay behaviour: URL-query routing, content-type
# gating, body-size cap, and /health observability.
if should_run_suite 9; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 9: Numa ODoH Relay (own) ║"
echo "╚══════════════════════════════════════════╝"
RELAY_PORT=18443
"$BINARY" relay $RELAY_PORT > "$LOG" 2>&1 &
NUMA_PID=$!
for _ in $(seq 1 30); do
curl -sf "http://127.0.0.1:$RELAY_PORT/health" >/dev/null 2>&1 && break
sleep 0.1
done
echo ""
echo "=== Relay Endpoints ==="
check "Health endpoint returns ok" \
"ok" \
"$(curl -sf http://127.0.0.1:$RELAY_PORT/health | head -1)"
# Happy path: forwards arbitrary body to Cloudflare's ODoH target. The
# target will reject the garbage envelope with HTTP 400 — which is exactly
# what proves our relay faithfully forwarded (otherwise we'd see our own
# 4xx from the relay itself).
HAPPY_STATUS=$(curl -sS -o /dev/null -w "%{http_code}" -X POST \
-H "Content-Type: application/oblivious-dns-message" \
--data-binary "garbage-forwarded-end-to-end" \
"http://127.0.0.1:$RELAY_PORT/relay?targethost=odoh.cloudflare-dns.com&targetpath=/dns-query")
check "Relay forwards to target (target rejects garbage → 400)" \
"400" \
"$HAPPY_STATUS"
echo ""
echo "=== Guards ==="
check "Missing content-type → 415" \
"415" \
"$(curl -sS -o /dev/null -w '%{http_code}' -X POST --data-binary 'x' \
'http://127.0.0.1:'$RELAY_PORT'/relay?targethost=odoh.cloudflare-dns.com&targetpath=/dns-query')"
check "Oversized body (>4 KiB) → 413" \
"413" \
"$(head -c 5000 /dev/urandom | curl -sS -o /dev/null -w '%{http_code}' -X POST \
-H 'Content-Type: application/oblivious-dns-message' --data-binary @- \
'http://127.0.0.1:'$RELAY_PORT'/relay?targethost=odoh.cloudflare-dns.com&targetpath=/dns-query')"
check "Invalid targethost (no dot) → 400" \
"400" \
"$(curl -sS -o /dev/null -w '%{http_code}' -X POST \
-H 'Content-Type: application/oblivious-dns-message' --data-binary 'x' \
'http://127.0.0.1:'$RELAY_PORT'/relay?targethost=invalid&targetpath=/dns-query')"
echo ""
echo "=== Counters ==="
HEALTH=$(curl -sf "http://127.0.0.1:$RELAY_PORT/health")
check "Relay counted at least one forwarded_ok" \
"[1-9]" \
"$(echo "$HEALTH" | grep 'forwarded_ok' | awk '{print $2}')"
check "Relay counted at least one rejected_bad_request" \
"[1-9]" \
"$(echo "$HEALTH" | grep 'rejected_bad_request' | awk '{print $2}')"
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
fi # end Suite 9
# Summary # Summary
echo "" echo ""
TOTAL=$((PASSED + FAILED)) TOTAL=$((PASSED + FAILED))

101
tests/probe-odoh-ecosystem.sh Executable file
View File

@@ -0,0 +1,101 @@
#!/usr/bin/env bash
# Probe the public ODoH ecosystem.
#
# Source of truth: DNSCrypt's curated list at
# https://github.com/DNSCrypt/dnscrypt-resolvers/tree/master/v3
# - v3/odoh-servers.md (ODoH targets)
# - v3/odoh-relays.md (ODoH relays)
#
# As of commit 2025-09-16 ("odohrelay-crypto-sx seems to be the only ODoH
# relay left"), the full public ecosystem is 4 targets + 1 relay. Re-run this
# script against the upstream list before making any "only N public relays"
# claim publicly.
#
# Usage: ./tests/probe-odoh-ecosystem.sh
set -uo pipefail
GREEN="\033[32m"
RED="\033[31m"
YELLOW="\033[33m"
DIM="\033[90m"
RESET="\033[0m"
UP=0
DOWN=0
probe_target() {
local name="$1"
local host="$2"
local url="https://${host}/.well-known/odohconfigs"
local start=$(date +%s%N)
local headers
headers=$(curl -sS -o /tmp/odoh-probe-body -D - --max-time 5 -A "numa-odoh-probe/0.1" "$url" 2>&1) || {
DOWN=$((DOWN + 1))
printf " ${RED}${RESET} %-25s ${DIM}unreachable${RESET}\n" "$name"
return
}
local elapsed_ms=$((($(date +%s%N) - start) / 1000000))
local status
status=$(echo "$headers" | head -1 | awk '{print $2}')
local ctype
ctype=$(echo "$headers" | grep -i '^content-type:' | head -1 | tr -d '\r')
local size
size=$(stat -f%z /tmp/odoh-probe-body 2>/dev/null || stat -c%s /tmp/odoh-probe-body 2>/dev/null || echo 0)
if [[ "$status" == "200" ]] && [[ "$size" -gt 0 ]]; then
UP=$((UP + 1))
printf " ${GREEN}${RESET} %-25s ${DIM}%4dms %s bytes %s${RESET}\n" "$name" "$elapsed_ms" "$size" "$ctype"
else
DOWN=$((DOWN + 1))
printf " ${RED}${RESET} %-25s ${DIM}status=%s size=%s${RESET}\n" "$name" "$status" "$size"
fi
rm -f /tmp/odoh-probe-body
}
probe_relay() {
# Relays don't expose /.well-known/odohconfigs — we just verify TLS reachability
# and that the endpoint responds to a malformed POST with an HTTP error
# (indicating the relay path exists). A real ODoH validation requires HPKE.
local name="$1"
local url="$2"
local start=$(date +%s%N)
local status
status=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 5 -A "numa-odoh-probe/0.1" \
-X POST -H "Content-Type: application/oblivious-dns-message" \
--data-binary "" "$url" 2>&1) || {
DOWN=$((DOWN + 1))
printf " ${RED}${RESET} %-25s ${DIM}unreachable${RESET}\n" "$name"
return
}
local elapsed_ms=$((($(date +%s%N) - start) / 1000000))
# Any 2xx or 4xx means the endpoint is live (TLS works, HTTP responded).
# 5xx or 000 (curl failure) means broken.
if [[ "$status" =~ ^[24] ]]; then
UP=$((UP + 1))
printf " ${GREEN}${RESET} %-25s ${DIM}%4dms status=%s (endpoint live)${RESET}\n" "$name" "$elapsed_ms" "$status"
else
DOWN=$((DOWN + 1))
printf " ${RED}${RESET} %-25s ${DIM}status=%s${RESET}\n" "$name" "$status"
fi
}
echo "ODoH targets:"
probe_target "Cloudflare" "odoh.cloudflare-dns.com"
probe_target "crypto.sx" "odoh.crypto.sx"
probe_target "Snowstorm" "dope.snowstorm.love"
probe_target "Tiarap" "doh.tiarap.org"
echo
echo "ODoH relays:"
probe_relay "Frank Denis (Fastly)" "https://odoh-relay.edgecompute.app/proxy"
echo
TOTAL=$((UP + DOWN))
if [[ "$DOWN" -eq 0 ]]; then
printf "${GREEN}All %d endpoints up${RESET}\n" "$TOTAL"
exit 0
else
printf "${YELLOW}%d/%d up, %d down${RESET}\n" "$UP" "$TOTAL" "$DOWN"
exit 1
fi