83 Commits

Author SHA1 Message Date
Razvan Dimescu
db6a105f77 Merge pull request #150 from razvandimescu/fix/refresh-honors-forwarding-rules
fix(cache): refresh honors forwarding rules (#147)
2026-04-25 18:26:47 +03:00
Razvan Dimescu
bf977595b6 Merge pull request #152 from gatozee/fix_title_alignment
fix: title alignment
2026-04-25 08:23:09 +03:00
Krtek Zee
63a2d26276 fix: title alignment
2026-04-24 17:42:32 -07:00
Razvan Dimescu
cfef4f4160 fix(cache): refresh honors forwarding rules (#147)
refresh_entry unconditionally queried the default upstream, so any
domain covered by a forwarding rule got re-resolved through the public
resolver once its cache entry hit NearExpiry or Stale. The resulting
NXDOMAIN/NODATA overwrote the good answer for at least cache.min_ttl
(60s default), persisting until restart. Match the precedence from
resolve_query: forwarding rule wins over recursive/default upstream.

Extract a_record_response() helper in testutil and migrate six call
sites — two regression tests here plus four adjacent tests using the
same boilerplate.
2026-04-24 19:03:19 +03:00
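The precedence described in cfef4f4160 could be sketched roughly as below. This is a minimal illustration, not numa's code: `ForwardRule`, `Target`, `covers`, and `pick_upstream` are hypothetical names, and longest-suffix matching is an assumption about how rules are selected. The point is only that resolution and refresh share one selection function, so a refresh cannot fall through to the default upstream for a forwarded domain.

```rust
// Hypothetical sketch: type names and suffix matching are assumptions,
// not numa's actual API.
#[derive(Debug, Clone, PartialEq)]
enum Target {
    Forwarded(String), // upstream taken from a forwarding rule
    Default(String),   // recursive/default upstream
}

struct ForwardRule {
    suffix: String,   // e.g. "corp.example"
    upstream: String, // e.g. "10.0.0.53:53"
}

/// True when `name` equals `suffix` or is a subdomain of it.
fn covers(suffix: &str, name: &str) -> bool {
    let name = name.trim_end_matches('.').to_ascii_lowercase();
    let suffix = suffix.trim_end_matches('.').to_ascii_lowercase();
    let dotted = format!(".{}", suffix);
    name == suffix || name.ends_with(dotted.as_str())
}

/// Shared by first-time resolution and cache refresh: a matching
/// forwarding rule always wins over the default upstream.
fn pick_upstream(rules: &[ForwardRule], default_upstream: &str, name: &str) -> Target {
    rules
        .iter()
        .filter(|r| covers(&r.suffix, name))
        .max_by_key(|r| r.suffix.len()) // longest matching suffix wins
        .map(|r| Target::Forwarded(r.upstream.clone()))
        .unwrap_or_else(|| Target::Default(default_upstream.to_string()))
}
```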
Razvan Dimescu
38ddb59e00 Merge pull request #149 from razvandimescu/fix/publish-aur-detached-head
ci(aur): attach to master after clone to avoid detached HEAD
2026-04-24 18:07:21 +03:00
Razvan Dimescu
441935af5a Merge pull request #148 from razvandimescu/fix/dashboard-cache
fix(api): Cache-Control: no-cache on dashboard HTML
2026-04-24 17:59:30 +03:00
Razvan Dimescu
d090e049ec ci(aur): attach to master after clone to avoid detached HEAD
aur.archlinux.org stopped advertising the HEAD symref around 2026-04-22
(`git ls-remote --symref` returns HEAD as a raw SHA, no 'ref:' line).
Fresh clones therefore land in detached HEAD, commits do not land on
any branch, and 'git push origin master' fails with:

  error: src refspec master does not match any

Every AUR publish run since has failed for this reason. Checking out
master explicitly after clone attaches the working copy to the branch
the push targets. refs/heads/master is still present on the remote, so
no other changes are needed.
2026-04-24 17:57:51 +03:00
Razvan Dimescu
4aa91a5236 fix(api): Cache-Control: no-cache on dashboard HTML
Browsers heuristically cached the dashboard page because the response
carried no Cache-Control header, so a numa upgrade on the daemon did
not surface updated PATH_DEFS (e.g. the UPSTREAM row added in v0.14.0)
until the user hard-reloaded. Force revalidation on every load.
Closes #144.
2026-04-24 17:51:14 +03:00
Razvan Dimescu
93f0ea7501 Merge pull request #145 from razvandimescu/docs/recipes
docs: lift user-facing guides to recipes/, drop dangling docs/ refs
2026-04-24 15:22:44 +03:00
Razvan Dimescu
f7f35b3424 docs: lift user-facing guides to recipes/, drop dangling docs/ refs
docs/ is gitignored; references to docs/implementation/*.md from public
source, configs, and packaging were dead links outside the maintainer
machine. Adds four recipes (README, dnsdist-front, doh-on-lan,
odoh-upstream) under top-level recipes/ and repoints existing pointers.

- numa.toml, packaging/client/{README.md,numa.toml}: point to
  recipes/odoh-upstream.md.
- src/{bootstrap_resolver,forward,serve}.rs: reference issue #122
  directly (module scope is broader than the ODoH-specific recipe).
- src/health.rs: drop the §-ref; iOS HealthInfo remains named as the
  canonical consumer.
2026-04-24 15:09:16 +03:00
Razvan Dimescu
3913d42319 Merge pull request #137 from razvandimescu/fix/soa-compression-roundtrip
fix(packet): parse SOA natively to stop malformed replies (#128)
2026-04-24 13:59:57 +03:00
Razvan Dimescu
e702f5861b Update README.md to remove outdated listing information
Removed section about listing on the public ecosystem and DNSCrypt's canonical list.
2026-04-23 09:39:34 +03:00
Razvan Dimescu
933643f2c7 Merge pull request #139 from razvandimescu/fix/odoh-relay-doc-path
docs(config): fix ODoH relay path in numa.toml example
2026-04-23 08:58:53 +03:00
Razvan Dimescu
96cf778bea docs(config): fix ODoH relay path in numa.toml example
The example in `numa.toml` pointed at `https://odoh-relay.numa.rs/proxy`,
but the relay only serves the ODoH endpoint at `/relay` (every other
reference in the tree — `src/config.rs` docs and tests, and
`packaging/client/numa.toml` — uses `/relay`). Users who copied the
example got `404 Not Found` on every query and SERVFAIL at the client.

Reported in #138.
2026-04-23 08:53:35 +03:00
Razvan Dimescu
2274151c17 fix(packet): parse SOA natively to stop malformed replies (#128)
SOA records were stored as opaque bytes (DnsRecord::UNKNOWN), so the
RFC 1035 §3.3.13 MNAME/RNAME name-compression pointers — offsets into
the upstream packet — were re-emitted verbatim. Once Numa applied its
own compression to surrounding names, those pointers landed on garbage
and clients rejected the reply ("malformed reply packet" in kdig).

Parse SOA via read_qname and write via write_qname, matching the
NS/CNAME/MX pattern. Adds the canonical-rdata arm in dnssec.rs for
RRSIG verification. Regression test round-trips a CNAME-chain response
with a compressed SOA in authority through hickory-proto strict parse.
2026-04-23 00:36:02 +03:00
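To illustrate why the compression pointers in 2274151c17 cannot be copied verbatim, here is a minimal name decoder. It is a sketch under assumptions: the signature is hypothetical and is not claimed to match numa's `read_qname`. A pointer is an offset into the packet it was read from, so the decoded name is only valid relative to that packet and has to be re-encoded when written into another one.

```rust
/// Decode a possibly compressed domain name starting at `pos` in `packet`.
/// Returns the dotted name and the offset just past the name at its
/// original position. Hypothetical signature, for illustration only.
fn read_qname(packet: &[u8], mut pos: usize) -> Option<(String, usize)> {
    let mut labels: Vec<String> = Vec::new();
    let mut end: Option<usize> = None; // offset after the first pointer, if any
    let mut jumps = 0;
    loop {
        let len = *packet.get(pos)? as usize;
        if len & 0xC0 == 0xC0 {
            // Compression pointer: 14-bit offset into this same packet.
            let lo = *packet.get(pos + 1)? as usize;
            if end.is_none() {
                end = Some(pos + 2);
            }
            pos = ((len & 0x3F) << 8) | lo;
            jumps += 1;
            if jumps > 16 {
                return None; // guard against pointer loops
            }
            continue;
        }
        if len > 63 {
            return None; // reserved label types
        }
        if len == 0 {
            return Some((labels.join("."), end.unwrap_or(pos + 1)));
        }
        let label = packet.get(pos + 1..pos + 1 + len)?;
        labels.push(String::from_utf8_lossy(label).into_owned());
        pos += 1 + len;
    }
}
```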
Razvan Dimescu
c787de1548 chore: bump version to 0.14.2
2026-04-22 23:57:37 +03:00
Razvan Dimescu
e6e79273b9 Revert "chore: bump version to 0.15.0"
This reverts commit 3ec3b40830.
2026-04-22 23:57:28 +03:00
Razvan Dimescu
3ec3b40830 chore: bump version to 0.15.0
2026-04-22 23:50:20 +03:00
Razvan Dimescu
90fa79bc0f Merge pull request #135 from razvandimescu/fix/hedge-default-off
fix(upstream): default hedge_ms=0 to avoid silent 2x upstream query count
2026-04-22 23:49:15 +03:00
Razvan Dimescu
b8a125b598 fix(upstream): default hedge_ms=0 to avoid silent 2x upstream query count
Hedging fires a second upstream query against the same upstream after
the hedge delay. Rescues packet loss and handshake stalls on flaky
links, but every lookup shows up twice at the provider — silently
halves the headroom for anyone on a quota'd upstream (NextDNS free tier,
Control D, paid Quad9).

Surfaced by #134 (bcookatpcsd), who saw every query duplicated on the
NextDNS dashboard with a single-address DoT upstream. Not a bug — the
feature doing what it says on the tin — but a surprising default.

Flipping the default to 0 makes hedging explicitly opt-in. Users who
want tail-latency rescue on flaky nets add `hedge_ms = 10` (or higher).
No config migration needed; no breaking changes to the API surface.

Also tightens the numa.toml comment so the trade-off is visible at
config time, not retroactively on a provider dashboard.
2026-04-22 23:30:55 +03:00
Razvan Dimescu
bc30be94e7 Merge pull request #131 from razvandimescu/feat/packaging-client-docker
feat(packaging): ODoH client Docker deploy recipe
2026-04-22 23:11:50 +03:00
Razvan Dimescu
26b1cd5917 feat(packaging): ODoH client Docker deploy
Single-container docker-compose recipe for running numa in ODoH client
mode. Ships with a starter numa.toml pointing at odoh-relay.numa.rs
paired with Cloudflare's ODoH target — two independent operators with
distinct eTLD+1s, so the default passes numa's same-operator check.

Exposes :53 UDP+TCP for LAN clients and :5380 for the dashboard + REST
API. README covers prerequisites, deploy, verification, and the ODoH
privacy boundary (relay sees IP, target sees query, neither sees both).

Advertised alongside packaging/relay/ in the main README Docker section.
2026-04-22 18:05:46 +03:00
Razvan Dimescu
77d6d89f80 Merge pull request #130 from razvandimescu/docs/numa-toml-odoh-examples
docs(config): ODoH upstream examples with relay_ip/target_ip pinning
2026-04-22 17:20:19 +03:00
Razvan Dimescu
4fdd05f284 Merge pull request #132 from razvandimescu/chore/site-live-reload
chore(site): live-reload dev server
2026-04-22 17:17:37 +03:00
Razvan Dimescu
2e461ccc0f docs(config): add ODoH upstream examples with relay_ip/target_ip pinning
Complements the bootstrap resolver fix (#122, #126) by documenting the
ODoH knobs in the commented config template. Explains relay_ip/target_ip
as the way to prevent plain-DNS leaks of the relay/target hostnames via
the bootstrap resolver on cold boot when numa is its own system DNS.
2026-04-22 17:13:13 +03:00
Razvan Dimescu
bf84c44346 Merge pull request #133 from razvandimescu/chore/cargo-audit-rustls-webpki
chore: bump rustls-webpki to 0.103.13 (RUSTSEC-2026-0104)
2026-04-22 17:03:58 +03:00
Razvan Dimescu
df2062882c chore: bump rustls-webpki to 0.103.13 for RUSTSEC-2026-0104
Advisory published 2026-04-22: reachable panic in certificate revocation
list parsing. Patch is a lockfile-only bump — transitive via rustls, no
direct dep changes. Unblocks cargo audit in CI across all open PRs.
2026-04-22 16:42:10 +03:00
Razvan Dimescu
76dda89078 Merge pull request #129 from razvandimescu/chore/gitignore-claude
chore: gitignore .claude/ harness state
2026-04-22 16:39:56 +03:00
Razvan Dimescu
640b64bf7e chore(site): live-reload dev server via chokidar + browser-sync
Replaces the plain python3 http.server + one-shot make blog with a
watcher pipeline: chokidar regenerates HTML on MD/template changes,
browser-sync serves the site and reloads the browser on rendered-asset
changes. First run downloads both via npx; subsequent runs are instant.

Preflight checks for npx and pandoc. Port arg parsing is tolerant of
legacy --drafts flag ordering (drafts are always included now, since
that's what the dev loop actually wants).

Cleanup trap kills the watcher on exit so re-runs don't leave orphans.
2026-04-22 15:50:21 +03:00
Razvan Dimescu
5ba19e04c8 chore: gitignore local Claude Code harness state
.claude/ holds per-session harness files (settings.local.json, task
locks, worktree metadata). None of it belongs in the repo.
2026-04-22 15:49:58 +03:00
Razvan Dimescu
c98afafaa1 Merge pull request #127 from razvandimescu/refactor/bootstrap-btreemap
refactor(bootstrap): BTreeMap for overrides + simplify review
2026-04-21 18:41:49 +03:00
Razvan Dimescu
5cba02a6c8 refactor(bootstrap): BTreeMap for overrides + simplify review
- Switch overrides from HashMap to BTreeMap — deterministic iteration by
  type, drops the manual sort when logging.
- Rename the flat_map closure's inner `ips` to `addrs` to stop shadowing
  the outer Vec<String>.
- Trim the Suite 8 TEST-NET-1 comment to keep the "why" and drop
  mechanism narration.
- Drop a redundant sleep 1 after wait — wait already blocks on exit.
2026-04-21 18:37:35 +03:00
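The first bullet of 5cba02a6c8 relies on `BTreeMap` iterating in key order, which is what removes the need for a manual sort before logging. A small sketch, assuming the map is keyed by hostname and holds a list of addresses (the actual key and value types in numa are not stated here, and `describe_overrides` is a hypothetical helper):

```rust
use std::collections::BTreeMap;
use std::net::IpAddr;

/// Render the override map for a log line. Output order is stable across
/// runs because BTreeMap iterates in ascending key order.
fn describe_overrides(overrides: &BTreeMap<String, Vec<IpAddr>>) -> String {
    overrides
        .iter()
        .map(|(host, addrs)| {
            let addrs: Vec<String> = addrs.iter().map(|a| a.to_string()).collect();
            format!("{}={}", host, addrs.join(","))
        })
        .collect::<Vec<String>>()
        .join(" ")
}
```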
Razvan Dimescu
46a95d58aa Merge pull request #126 from razvandimescu/fix/self-resolver-loop
fix(bootstrap): route numa HTTPS via IP-literal bootstrap resolver (#122)
2026-04-21 17:52:51 +03:00
Razvan Dimescu
51cce0347b test(odoh): integration-verify relay_ip/target_ip override wiring
Suite 8 now ends with a config using RFC 5737 TEST-NET-1 IPs as
relay_ip/target_ip, started briefly so the bootstrap resolver logs its
override map. Asserts both host=IP pairs land in that map — closing the
gap flagged on PR #126 (zero-plain-DNS-leak for ODoH endpoints was only
unit-tested).

Also: NumaResolver::new now logs the override map at INFO when non-empty,
so operators can verify their ODoH bootstrap without needing DEBUG level.
2026-04-21 17:43:02 +03:00
Razvan Dimescu
459395203d style: cargo fmt
2026-04-21 16:30:26 +03:00
Razvan Dimescu
10469e96bd fix(bootstrap): route numa HTTPS via IP-literal bootstrap resolver (#122)
When numa is its own system DNS resolver (HAOS add-on, Pi-hole-style
container, /etc/resolv.conf → 127.0.0.1), every numa-originated HTTPS
connection — DoH upstream, ODoH relay/target, blocklist CDN — routed
its hostname through getaddrinfo() back to numa itself. Cold boot
deadlocked; steady state taxed every new TCP connection. 0.14.1's
retry-with-backoff masked the startup race but not the underlying
self-loop.

NumaResolver implements reqwest::dns::Resolve with two lanes:
- Per-host overrides (ODoH relay_ip/target_ip) short-circuit DNS
  entirely, preserving ODoH's zero-plain-DNS-leak property.
- Otherwise: A+AAAA in parallel via UDP to IP-literal bootstrap
  servers, with TCP fallback for UDP-hostile networks.

Bootstrap IPs come from upstream.fallback (IP-literal filtered,
hostnames skipped with a warning). Empty fallback yields the
hardcoded default [9.9.9.9, 1.1.1.1]; the chosen source is logged
at startup so the silent default is visible.

doh_keepalive_loop now fires its first tick immediately, and
keepalive_doh logs failures at WARN — bootstrap issues surface
within ~100ms of boot instead of on the first client query.

Distinct from UpstreamPool.fallback (client-query failover) which
stays untouched: client queries with no configured fallback still
SERVFAIL on primary failure rather than silently shadow-routing.

Reproducer: tests/docker/self-resolver-loop.sh. Before: 0 blocklist
domains, 3072ms SERVFAIL. After: 397k domains, 118ms NOERROR.
2026-04-21 16:19:14 +03:00
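The bootstrap-server selection described in 10469e96bd might look roughly like this. It is a sketch under assumptions: `bootstrap_servers` is a hypothetical name, entries are assumed to be bare IP literals without ports, and treating an all-hostname fallback list the same as an empty one is a guess rather than something the commit states.

```rust
use std::net::IpAddr;

// Defaults named in the commit message.
const DEFAULT_BOOTSTRAP: [&str; 2] = ["9.9.9.9", "1.1.1.1"];

/// Returns the bootstrap servers plus a label naming their source,
/// so the caller can log which one was chosen at startup.
fn bootstrap_servers(fallback: &[&str]) -> (Vec<IpAddr>, &'static str) {
    let mut ips: Vec<IpAddr> = Vec::new();
    for entry in fallback {
        match entry.parse::<IpAddr>() {
            Ok(ip) => ips.push(ip),
            // Hostnames cannot be used: resolving them would need DNS,
            // which is exactly the loop being avoided.
            Err(_) => eprintln!("warning: skipping non-IP bootstrap entry {}", entry),
        }
    }
    if ips.is_empty() {
        let defaults: Vec<IpAddr> = DEFAULT_BOOTSTRAP
            .iter()
            .map(|s| s.parse::<IpAddr>().unwrap())
            .collect();
        (defaults, "hardcoded default")
    } else {
        (ips, "upstream.fallback")
    }
}
```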
Razvan Dimescu
31adc31c9b refactor(ctx): coalesce forward-path upstream queries
resolve_coalesced now takes leader_path: QueryPath and applies to all
three upstream branches (Forwarded-rule, Recursive, Upstream), not just
Recursive. Fixes thundering-herd at boot when N concurrent HTTPS setups
each trigger independent forward queries for the same upstream hostname.
2026-04-21 16:18:52 +03:00
Razvan Dimescu
60600b045f chore: bump version to 0.14.1
2026-04-20 19:27:06 +03:00
Razvan Dimescu
3e6bf3feb0 Merge pull request #125 from razvandimescu/worktree-fix-blocklist-bootstrap
fix(blocklist): retry on transient download failures (#122)
2026-04-20 19:22:04 +03:00
Razvan Dimescu
8bed7c4649 test(blocklist): decouple retry tests from RETRY_DELAYS_SECS length
Derive both the flaky-server drop count and the zero-delay schedule
from RETRY_DELAYS_SECS.len() so the tests keep exercising their
intended invariants — "succeeds on final attempt" and "gives up after
all attempts fail" — if the production retry schedule ever changes.

Also: rename fail_first → drop_first_n to match drop(sock); swap the
giveup test's empty body for an "unreachable" sentinel so a regression
that accidentally served couldn't silently match Some("").
2026-04-20 19:19:43 +03:00
Razvan Dimescu
5b1642c6dc fix(blocklist): retry on transient download failures (#122)
On cold start, reqwest's getaddrinfo can race numa's own first-query
cold-path latency — resolver timeout fires before numa warms its
upstream DoH connection. Wrap each blocklist fetch in 3 retries with
2s/10s/30s backoff; by the second attempt, the upstream is warm and
subsequent getaddrinfos succeed in <100ms.

Also: parallelize fetches across lists via join_all (different hosts,
no warming dependency), walk the full error source chain so reqwest
failures surface the underlying cause, and parameterize retry delays
for unit-test speed.
2026-04-20 19:19:43 +03:00
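A blocking sketch of the retry schedule from 5b1642c6dc, with the delays passed in so tests can use zeros. Assumptions: `fetch_with_retry` is a hypothetical name, the real code is presumably async rather than thread-sleeping, and "3 retries" is read here as one initial attempt plus one retry per delay.

```rust
use std::thread::sleep;
use std::time::Duration;

// Production schedule named in the commit: 2s, 10s, 30s.
const RETRY_DELAYS_SECS: [u64; 3] = [2, 10, 30];

/// Run `fetch` once, then retry after each delay until it succeeds or the
/// schedule is exhausted. Returns the last result either way.
fn fetch_with_retry<T, E, F>(delays_secs: &[u64], mut fetch: F) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    let mut last = fetch();
    for &delay in delays_secs {
        if last.is_ok() {
            break;
        }
        sleep(Duration::from_secs(delay));
        last = fetch();
    }
    last
}
```

Production callers would pass `&RETRY_DELAYS_SECS`; a unit test can pass a zero-filled slice of the same length to keep the attempt count without the wait.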
Razvan Dimescu
01fda7891e Merge pull request #123 from razvandimescu/feat/odoh-etld1-check
feat(odoh): reject relay+target sharing an eTLD+1
2026-04-20 19:06:12 +03:00
Razvan Dimescu
5e84adbd94 Merge pull request #124 from razvandimescu/fix/dashboard-encryption-pct-args
fix(dashboard): pass missing args to encryptionPct in refresh()
2026-04-20 19:05:50 +03:00
Razvan Dimescu
15978a7859 fix(dashboard): pass missing args to encryptionPct in refresh()
Commit eb5ea3b generalised encryptionPct from (transport) to
(data, encryptedKeys, allKeys) and updated renderTransport and
renderUpstreamWire, but missed the call inside render() that computes
the inline `~N/s · M% enc` QPS tag. With undefined allKeys, the
first .reduce() threw TypeError and the render try/catch silently
downgraded the whole dashboard to "disconnected" — every panel left
empty even though /stats was returning real data.

Fix the call site to match the other two (inbound-wire keys) and have
the catch log to console so the next silent-failure regression shows
up in DevTools within seconds instead of a source dive.
2026-04-20 19:04:15 +03:00
Razvan Dimescu
193b38b85f feat(odoh): reject relay+target sharing an eTLD+1
Plain host-string equality caught the copy-paste-same-URL footgun but
let `r.cloudflare.com` + `odoh.cloudflare.com` through — two subdomains
of the same operator collapse ODoH to ordinary DoH. Add a second layer:
compare registrable domains via the PSL (`psl` crate) after the exact-
host check. Fails open on IP literals and unparseable hosts; the exact-
host check still runs in those cases.
2026-04-20 18:46:54 +03:00
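The two-layer check from 193b38b85f can be sketched as follows. The commit uses the `psl` crate for the Public Suffix List; to stay dependency-free this sketch substitutes a tiny hardcoded suffix table, which is purely a stand-in. `registrable_domain` and `same_operator` are hypothetical names.

```rust
use std::net::IpAddr;

// Stand-in for the Public Suffix List (the real check uses the `psl` crate).
const SUFFIXES: [&str; 3] = ["com", "rs", "co.uk"];

/// Registrable domain (eTLD+1) of `host`, or None for IP literals and
/// hosts whose suffix is not in the table.
fn registrable_domain(host: &str) -> Option<String> {
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    if host.parse::<IpAddr>().is_ok() {
        return None;
    }
    let labels: Vec<&str> = host.split('.').collect();
    // start = 1 is the longest candidate suffix, so the longest match wins.
    for start in 1..labels.len() {
        let suffix = labels[start..].join(".");
        if SUFFIXES.iter().any(|s| suffix == *s) {
            return Some(labels[start - 1..].join("."));
        }
    }
    None
}

/// Layer 1: exact host equality. Layer 2: same registrable domain.
/// Layer 2 fails open when either side has no registrable domain.
fn same_operator(relay_host: &str, target_host: &str) -> bool {
    if relay_host.eq_ignore_ascii_case(target_host) {
        return true;
    }
    match (registrable_domain(relay_host), registrable_domain(target_host)) {
        (Some(a), Some(b)) => a == b,
        _ => false,
    }
}
```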
Razvan Dimescu
4c685d1602 docs(readme): pamper readme still
2026-04-20 17:19:16 +03:00
Razvan Dimescu
cd6e686a1a docs(readme): surface ODoH in the intro paragraph
Adds the v0.14.0 capability where it's most differentiating: the first
paragraph (sealed-query framing alongside the existing ad-blocking and
.numa-domain pitches) and the second paragraph (numa relay as a public
ODoH endpoint, with the DNSCrypt-list supply-doubling angle as fact).

No reposition: tagline and structure unchanged. ODoH joins the
existing capability set rather than displacing it. Hero GIF stays;
will be re-recorded once the dashboard's Outbound Wire panel is worth
showing in motion.
2026-04-20 17:14:21 +03:00
Razvan Dimescu
07c321f749 chore(release): bump to v0.14.0
Headline: ODoH (RFC 9230) — client + self-hosted relay. Set
mode = "odoh" in [upstream] to seal queries before they leave the
machine; run `numa relay` to add to the public ODoH ecosystem.
2026-04-20 17:07:31 +03:00
Razvan Dimescu
12a06a1410 Merge pull request #121 from razvandimescu/feat/odoh
feat(odoh): ship ODoH client + self-hosted relay (RFC 9230)
2026-04-20 16:26:54 +03:00
Razvan Dimescu
eb5ea3b645 refactor(odoh): deduplicate post-audit findings
- Hoist ODOH_CONTENT_TYPE to a single pub(crate) constant in odoh.rs;
  relay.rs imports it instead of declaring its own.
- Generalize dashboard encryptionPct(data, encryptedKeys, allKeys)
  so both Inbound Wire and Outbound Wire panels share the same math
  instead of drifting independently.
- Extract RelayState::new() and build_app() helpers in relay.rs so
  the test spawn_relay() and production run() wire the same router
  + body-limit layer. Prevents future middleware from landing in one
  path but not the other.

All 344 lib tests pass; no behavior change.
2026-04-20 16:03:34 +03:00
Razvan Dimescu
be60f6ccbc chore(packaging): docker-compose + Caddyfile for ODoH relay deploy
Two-container deploy: Caddy terminates TLS (auto-provisions Let's
Encrypt via ACME) and reverse-proxies to a Numa relay on an internal
Docker network. The relay never reads sealed payloads; Caddy's
access log is discarded so per-request observability doesn't defeat
the oblivious property.

Validated against Hetzner CX22 + DNS at odoh-relay.numa.rs:
- TLS-ALPN-01 challenge succeeded on first attempt
- /health returned the relay's counter block
- End-to-end ODoH client → relay → Cloudflare works

Operators only need to: set a DNS A record, edit Caddyfile's hostname,
docker compose up -d. README walks through the steps and the DNSCrypt
v3/odoh-relays.md submission to claim a public listing.
2026-04-20 15:44:29 +03:00
Razvan Dimescu
a3cc64c94f feat(odoh): relay bind-address CLI arg + dashboard Outbound Wire panel
- `numa relay [PORT] [BIND]` accepts an optional bind address (defaults
  to 127.0.0.1, matching the Caddy reverse-proxy deployment shape).
  Required for Docker, where the relay needs 0.0.0.0 inside the
  container so Caddy can reach it across the bridge network.

- Dashboard now surfaces the upstream_transport dimension as an
  "Outbound Wire" panel alongside the existing "Inbound Wire" (renamed
  from "Transport" for directional clarity). Sub-headers — "apps → numa"
  / "numa → internet" — make the threat-model split obvious without
  jargon. Bars: UDP/DoH/DoT/ODoH, headline "X% encrypted outbound".
  The PR description's promise that "the dashboard answers how much of
  my DNS traffic left in cleartext honestly" is now true.
2026-04-20 15:44:20 +03:00
Razvan Dimescu
cf128c19af feat(odoh): bootstrap-IP overrides + zero hedge for ODoH (post-deploy fixes)
Two issues surfaced from running mode = "odoh" against the live Hetzner
relay as system DNS:

1. **Bootstrap deadlock.** The reqwest HTTPS client resolves the relay
   and target hostnames via system DNS. When numa is itself the system
   resolver, the ODoH client loops trying to resolve through itself.
   Adds optional `relay_ip` and `target_ip` to `[upstream]`, plumbed
   into reqwest's `resolve()` so the HTTPS client bypasses system DNS
   for those two hostnames. TLS still validates against the URL
   hostname, so a stale IP fails loudly rather than silently MITM'ing.

2. **2x relay load.** Default `hedge_ms = 10` triggers a duplicate
   in-flight query for every request. Useful for UDP/DoH/DoT (rescues
   tail latency cheaply); wasteful for ODoH (doubles HPKE seal/unseal,
   doubles sealed-byte footprint a passive observer can correlate, no
   latency win — relay hop dominates either way). Force-zero in
   oblivious mode regardless of configured hedge_ms.

Validated end-to-end against odoh-relay.numa.rs → Cloudflare:
3 digs produced 3 forwarded_ok on the relay (was 6 before the hedge
fix), upstream_transport.odoh ticks correctly.
2026-04-20 15:44:09 +03:00
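The second fix in cf128c19af, forcing the hedge off in oblivious mode regardless of configuration, amounts to a small decision function. A sketch with hypothetical names (`Mode`, `effective_hedge`); treating `hedge_ms = 0` as "disabled" is an assumption consistent with how the later default change describes it.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Forward, // UDP/DoH/DoT upstreams
    Odoh,    // oblivious mode
}

/// Hedge delay to actually use, or None when no duplicate query is sent.
fn effective_hedge(mode: Mode, hedge_ms: u64) -> Option<Duration> {
    match mode {
        // A duplicate would double HPKE work and relay load for no latency win.
        Mode::Odoh => None,
        Mode::Forward if hedge_ms == 0 => None,
        Mode::Forward => Some(Duration::from_millis(hedge_ms)),
    }
}
```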
Razvan Dimescu
241c40553b feat(odoh): ship ODoH client + self-hosted relay (RFC 9230)
Client (mode = "odoh"): URL-query target routing per RFC 9230 §5,
/.well-known/odohconfigs TTL cache with 60s backoff on failure, HPKE
seal/open via odoh-rs, strict-mode default that SERVFAILs on relay
failure instead of silently downgrading. Host-equality config
validation rejects same-operator relay/target pairs.

Relay (`numa relay [PORT]`): axum server with /relay + /health.
SSRF-hardened hostname validator (RFC 1035 ASCII + dot + dash),
4 KiB body cap at the axum layer, 5s full-transaction timeout, and
static 502 on target failure (reqwest internals logged, not leaked).
Aggregate counters only — no per-request logs.

Observability: new `UpstreamTransport { Udp, Doh, Dot, Odoh }`
orthogonal to `QueryPath`, so /stats can tally wire protocols
symmetrically. Recursive mode records `Some(Udp)` for honest
"bytes egressing in cleartext" accounting.

Tests: Suite 8 exercises the client end-to-end via Frank Denis's
public relay + Cloudflare target; Suite 9 exercises `numa relay`
forwarding + guards against Cloudflare as the real far end. Full
probe script at tests/probe-odoh-ecosystem.sh verifies the entire
public ODoH ecosystem (4 targets + 1 relay per DNSCrypt's curated
list — confirms deploying Numa's relay doubles global supply).
2026-04-20 12:34:04 +03:00
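The relay's hostname validator in 241c40553b is described only as "RFC 1035 ASCII + dot + dash". A character-set check along those lines might look like the sketch below. `is_valid_target_host` is a hypothetical name, and the length limits and the leading/trailing hyphen rule are assumptions added from RFC 1035 rather than taken from the commit. Note that a dotted IPv4 literal passes this check; whether numa rejects those separately is not stated.

```rust
/// Accept only hostnames made of ASCII letters, digits, dashes, and dots.
/// Rejects anything that could smuggle a port, path, userinfo, or scheme.
fn is_valid_target_host(host: &str) -> bool {
    if host.is_empty() || host.len() > 253 {
        return false;
    }
    host.split('.').all(|label| {
        !label.is_empty()
            && label.len() <= 63
            && !label.starts_with('-')
            && !label.ends_with('-')
            && label.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'-')
    })
}
```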
Razvan Dimescu
f6cfb3ce1b Merge pull request #120 from razvandimescu/feat/named-record-types
feat(question): name SVCB/LOC/NAPTR record types in logs
2026-04-19 08:08:54 +03:00
Razvan Dimescu
5725f94ff3 refactor(question): collapse QueryType impls behind define_qtypes! macro
Adding a record type used to require 5 edits across the file (enum
variant, to_num, from_num, as_str, parse_str). The macro takes a
single (variant, num, str) tuple per type and generates the enum
plus all four methods.

UNKNOWN(u16) stays hand-coded since it carries data and can't sit
in the table.

src/question.rs: 156 lines -> 92 lines, no behavior change.
2026-04-19 08:01:18 +03:00
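A table-driven macro of the kind 5725f94ff3 describes could be shaped like this. It is a sketch, not the macro from `src/question.rs`: the expansion details and the `"UNKNOWN"` string returned by `as_str` are assumptions, and only a handful of record types are listed.

```rust
// One (variant, num, str) tuple per record type generates the enum and
// all four conversion methods.
macro_rules! define_qtypes {
    ($(($variant:ident, $num:literal, $name:literal)),* $(,)*) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub enum QueryType {
            UNKNOWN(u16), // carries data, so it stays outside the table
            $($variant),*
        }

        impl QueryType {
            pub fn to_num(self) -> u16 {
                match self {
                    QueryType::UNKNOWN(n) => n,
                    $(QueryType::$variant => $num),*
                }
            }

            pub fn from_num(n: u16) -> QueryType {
                match n {
                    $($num => QueryType::$variant,)*
                    other => QueryType::UNKNOWN(other),
                }
            }

            pub fn as_str(self) -> &'static str {
                match self {
                    QueryType::UNKNOWN(_) => "UNKNOWN",
                    $(QueryType::$variant => $name),*
                }
            }

            pub fn parse_str(s: &str) -> Option<QueryType> {
                match s {
                    $($name => Some(QueryType::$variant),)*
                    _ => None,
                }
            }
        }
    };
}

define_qtypes! {
    (A, 1, "A"),
    (AAAA, 28, "AAAA"),
    (LOC, 29, "LOC"),
    (NAPTR, 35, "NAPTR"),
    (SVCB, 64, "SVCB"),
    (HTTPS, 65, "HTTPS"),
}
```

With this shape, adding a record type is a single new tuple in the `define_qtypes!` invocation.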
Razvan Dimescu
24610ae3fe feat(question): add SVCB, LOC, NAPTR variants to QueryType
Logs were printing UNKNOWN(64), UNKNOWN(29), UNKNOWN(35) for SVCB,
LOC, and NAPTR — three RR types that have been registered for years
and show up in the wild (notably SVCB via RFC 9462 DDR clients
querying _dns.resolver.arpa).

Adds the variants and replaces the SVCB_QTYPE u16 const introduced
in #119 with QueryType::SVCB.to_num(), matching the HTTPS path.

Closes #114.
2026-04-19 07:49:35 +03:00
Razvan Dimescu
6bc02982f0 Merge pull request #119 from razvandimescu/feat/filter-aaaa
feat(resolver): filter_aaaa for IPv4-only networks
2026-04-19 07:31:27 +03:00
Razvan Dimescu
f9e996ae78 fmt: drop redundant comments per house style
2026-04-19 06:53:47 +03:00
Razvan Dimescu
5e85b147b9 feat(resolver): apply ipv6hint strip to SVCB (type 64) too
HTTPS (65) and SVCB (64) share the RDATA wire format, so the existing
parser already handles both — only the call site was HTTPS-only. Widen
the qtype check and extend the existing pipeline test with a second
query for SVCB.
2026-04-19 06:52:30 +03:00
Razvan Dimescu
d6bb9a0f01 fmt: rustfmt vec literal wrapping + signature collapse
2026-04-19 06:24:54 +03:00
Razvan Dimescu
61ea2e510d refactor: dedupe HTTPS_TYPE, record-walk, and test rdata builder
- Drop `const HTTPS_TYPE: u16 = 65;` in favor of `QueryType::HTTPS.to_num()`
  at the single call site — avoids a fresh magic number alongside the
  existing enum mapping in question.rs.
- Add `DnsPacket::for_each_record_mut` so `strip_https_ipv6_hints` stops
  hand-rolling the answers/authorities/resources walk; future section
  rewrites go through the same helper.
- Promote the SVCB test-rdata builder from `svcb::tests` to module scope
  as `pub(crate) #[cfg(test)] fn build_rdata`, and reuse it in the two
  pipeline tests in ctx.rs — kills ~20 lines of byte-fiddling and keeps
  one RDATA-construction code path.
2026-04-19 05:58:47 +03:00
Razvan Dimescu
22dd3cd222 fix(resolver): skip ipv6hint strip for DO-bit clients
Modifying HTTPS rdata invalidates any accompanying RRSIG, so a DNSSEC-
validating downstream would reject the response as Bogus. Gate the
strip on !client_do, matching the existing DNSSEC-records strip.

Adds a regression test that catches the gate being removed: builds a
query with EDNS DO=1, asserts the HTTPS rdata round-trips untouched.
2026-04-19 05:52:37 +03:00
Razvan Dimescu
8014ebac9e test(integration): add Suite 7 for filter_aaaa + SUITES env filter
Suite 7 exercises the full pipeline end-to-end: A resolves, AAAA returns
NODATA, local [[zones]] AAAA bypasses the filter, and HTTPS ipv6hint is
stripped from a real cloudflare.com response. A second config run with
the flag unset guards against network-failure false-positives.

SUITES=N (comma list) runs a subset, e.g. `SUITES=7 bash tests/integration.sh`
skips suites 1-6 for fast iteration.
2026-04-19 05:52:29 +03:00
Razvan Dimescu
70400187d0 Merge pull request #118 from razvandimescu/feat/linux-drop-privileges
feat(linux): run systemd service as unprivileged numa user
2026-04-18 22:04:53 +03:00
Razvan Dimescu
fb41a6f8b5 test(linux): systemd service install verification
Three scenarios CI cannot run: every advertised port is functional (DNS
resolves, TLS chain validates against numa's CA, HTTP/API respond), CA
fingerprint survives upgrade from pre-drop layout, binary staging
fallback from a 0700 source dir. Self-bootstraps a privileged
systemd-as-PID1 container — no dependency on long-lived test containers.

MainPID user assertion retries until comm=numa to avoid a race where
systemctl reports active while MainPID still points at a transitional
process.
2026-04-18 22:00:54 +03:00
Razvan Dimescu
b02b607fb9 ci(linux): assert numa daemon does not run as root
Locks in the invariant this branch establishes: a regression that
reverts to User=root would otherwise ship green.
2026-04-18 20:07:24 +03:00
Razvan Dimescu
be98a02e49 feat(resolver): filter_aaaa for IPv4-only networks (#112)
When enabled, AAAA queries short-circuit to NODATA (NOERROR + empty
answer) so Happy Eyeballs clients don't stall waiting on a v6 address
they can't use. Also strips `ipv6hint` SvcParam from HTTPS/SVCB
answers (RFC 9460) so Chrome ≥103, Firefox, and Safari don't bypass
the AAAA filter via the HTTPS record path.

Local data is preserved: overrides, zones, the .numa proxy, and the
blocklist sinkhole keep whatever v6 addresses they configure — the
filter only kicks in on the cache/forward/recursive path. NODATA is
correct per RFC 2308 here; NXDOMAIN would incorrectly imply the name
doesn't exist for A queries either.

Off by default. Opt in via `filter_aaaa = true` under `[server]`.
2026-04-18 19:52:06 +03:00
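The ordering in be98a02e49 (local data first, then the filter, then normal resolution) can be expressed as a small decision function. This is a sketch with hypothetical names (`AaaaAction`, `aaaa_action`); it covers only the AAAA short-circuit, not the `ipv6hint` stripping.

```rust
#[derive(Debug, PartialEq)]
enum AaaaAction {
    AnswerLocally, // overrides, zones, .numa proxy, blocklist sinkhole
    NoData,        // NOERROR with an empty answer section
    Resolve,       // cache / forward / recursive path as usual
}

const QTYPE_AAAA: u16 = 28;

/// Decide how to handle a query when `filter_aaaa` may be enabled.
fn aaaa_action(filter_aaaa: bool, qtype: u16, has_local_data: bool) -> AaaaAction {
    if has_local_data {
        // Local data keeps whatever v6 addresses it configures.
        return AaaaAction::AnswerLocally;
    }
    if filter_aaaa && qtype == QTYPE_AAAA {
        // NODATA rather than NXDOMAIN: the name may still exist for A queries.
        return AaaaAction::NoData;
    }
    AaaaAction::Resolve
}
```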
Razvan Dimescu
763131478f fmt: rustfmt format! macro split
2026-04-18 12:15:44 +03:00
Razvan Dimescu
067195f2ab fix(linux): atomic binary copy + restart instead of start on re-install
Re-install failed with ETXTBSY (Text file busy) because std::fs::copy
can't overwrite a binary that's currently being executed by the
running service. Switch to copy-then-rename: write the new binary to
/usr/local/bin/numa.new, then rename over /usr/local/bin/numa. Rename
swaps the path while the running process keeps the old inode alive,
so DNS keeps serving from the previous binary until restart.

Bump systemctl start to restart so the new binary actually loads on
re-install (start is a no-op when the unit is already active, which
would silently leave the old binary running).

Locally verified the full CI sequence: install → curl → reinstall →
curl → uninstall → curl-fails. All three assertions pass.
2026-04-18 12:12:11 +03:00
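The copy-then-rename pattern from 067195f2ab, sketched with the standard library. `replace_binary` is a hypothetical name; the `.new` suffix follows the path mentioned in the commit. The rename is atomic only when the staged file and the destination are on the same filesystem, which holds here because they share a directory.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Replace `dest` with a copy of `source` without writing into the file
/// that a running process may still be executing.
fn replace_binary(source: &Path, dest: &Path) -> io::Result<()> {
    // Stage next to the destination, e.g. /usr/local/bin/numa.new.
    let mut staged = dest.as_os_str().to_owned();
    staged.push(".new");
    let staged = PathBuf::from(staged);

    fs::copy(source, &staged)?;
    // Swaps the path in one step; the old inode stays alive for any
    // process that still has it open or mapped.
    fs::rename(&staged, dest)?;
    Ok(())
}
```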
Razvan Dimescu
e19505aa95 fix(linux): narrow replace_exe_path cfg to macos after Linux inlined the substitution
Linux install_service_linux now does the {{exe_path}} substitution
inline because it uses the (potentially copied) binary path returned
by install_service_binary_linux, not current_exe(). The shared
replace_exe_path helper is dead on Linux — clippy -D warnings caught it.

Narrow the function to macos and split the placeholder test: keep the
"both templates contain {{exe_path}}" assertion as a cross-platform test
(catches placeholder removal on either file), keep the substitution test
gated to macos where the function lives.
2026-04-18 11:57:54 +03:00
Razvan Dimescu
3970a9f45c fix(linux): copy binary to /usr/local/bin when source path isn't world-traversable
DynamicUser=yes's transient account can only traverse world-x directories.
The CI binary at /home/runner/work/numa/numa/target/release/numa fails
exec with EACCES because /home/runner is mode 0700; same applies to a
build under /home/<user>/, ~/.cargo/bin, or any private $HOME tree.

install_service_binary_linux now walks the binary's path. If every
ancestor grants world-execute (Linuxbrew /home/linuxbrew is 0755,
/usr/local/bin is fine, install.sh layout works), keep the source
path so brew/distro upgrades propagate in place. Otherwise copy to
/usr/local/bin/numa and reference that in the unit.

Locally verified both branches in an Ubuntu 24.04 systemd container:
- CI-like /home/runner (0700) → copies + service binds 5380
- Brew-like /home/linuxbrew (0755) → keeps source path + service binds 5380
2026-04-18 11:51:32 +03:00
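The ancestor walk described in 3970a9f45c might be written along these lines. A Unix-only sketch with hypothetical names (`world_executable`, `is_world_traversable`); treating an unreadable ancestor as non-traversable is an assumption, and the sketch does not claim to match `install_service_binary_linux`.

```rust
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// True when the "other" execute bit is set in a Unix mode.
fn world_executable(mode: u32) -> bool {
    mode & 0o001 != 0
}

/// True when every directory above `binary` can be traversed by an
/// arbitrary user. Metadata errors count as "not traversable".
fn is_world_traversable(binary: &Path) -> bool {
    binary
        .ancestors()
        .skip(1) // skip the binary itself, check only its parent directories
        .filter(|dir| !dir.as_os_str().is_empty())
        .all(|dir| {
            fs::metadata(dir)
                .map(|m| world_executable(m.permissions().mode()))
                .unwrap_or(false)
        })
}
```

A caller would keep the source path when this returns true and fall back to copying into /usr/local/bin otherwise.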
Razvan Dimescu
7b9db9e889 fix(linux): drop ProtectHome=true — blocks exec when binary lives under /home
Integration-linux journalctl showed status=203/EXEC: systemd couldn't
exec /home/runner/work/numa/numa/target/release/numa because
ProtectHome=yes makes /home invisible to the sandboxed process. My
local Docker test passed because the binary was at /workspace, not
/home.

DynamicUser=yes already implies ProtectHome=read-only, which preserves
exec access to binaries living under /home (cargo install, source
builds, CI) while blocking writes to user $HOMEs. Keep that default
rather than over-restricting.

Follow-up worth tracking: install_service_linux could copy the binary
to /usr/local/bin/numa the way Windows does at windows_service_exe_path,
making the unit's ExecStart independent of where `numa install` was
invoked from — then we could set ProtectHome=yes again.
2026-04-18 08:54:34 +03:00
Razvan Dimescu
dfeca53e21 ci: dump journalctl + systemctl status on integration-linux failure
2026-04-18 08:48:53 +03:00
Razvan Dimescu
4f6159d961 refactor(linux): switch to DynamicUser=yes, drop install-time user creation
AUR installs never call `numa install` — PKGBUILD drops the unit straight
into /usr/lib/systemd/system and the user runs `systemctl enable numa`.
With User=numa the Rust installer's useradd code never fires there,
breaking Arch out of the box.

DynamicUser=yes sidesteps packaging entirely — systemd allocates a
transient UID per start and remaps StateDirectory ownership (including
legacy root-owned trees) automatically. Works on any modern systemd.

Drops the ensure_numa_user_linux/chown helpers plus NUMA_USER; the
unit file alone now captures the privilege-drop story.
2026-04-18 08:20:07 +03:00
Razvan Dimescu
41aea1dd12 fix(linux): drop risky sandbox directives that break Rust network daemons
Integration test failed with exit 7 on curl to /health after a successful
install — service started but never listened. The likely culprits are
MemoryDenyWriteExecute (breaks jemalloc/some crypto), SystemCallFilter
~@privileged @resources (blocks setrlimit and friends tokio may use),
and RestrictNamespaces/LockPersonality (occasional foot-guns).

Pull them and keep a conservative hardening set that's well-tested with
Rust network services: no-new-privs, protect-system/home, private tmp
and devices, protect-kernel-*, restrict-realtime/suid/address-families.
Layer the aggressive bits back in follow-up PRs once tested individually.
2026-04-18 08:10:04 +03:00
Razvan Dimescu
695a8b963c feat(linux): run systemd service as unprivileged numa user
- numa.service: User=numa + CAP_NET_BIND_SERVICE + sandboxing block
  (ProtectSystem=strict, PrivateTmp, seccomp @system-service, etc)
- install_service_linux: create numa system user + chown data_dir
  before first start so TLS-cert generation and state writes land
  on a numa-owned tree

Runtime verified root-free on Linux — network_watch_loop only reads
/etc/resolv.conf; all system-DNS mutation stays in the installer,
which continues to run as root via sudo.
2026-04-18 07:56:59 +03:00
Razvan Dimescu
34e2182ae4 Merge pull request #104 from razvandimescu/feat/forwarding-array-upstream
feat: accept array of upstreams in [[forwarding]]
2026-04-17 23:25:04 +03:00
Razvan Dimescu
5f77af55e9 fix(forward): track SRTT for DoT upstreams, not just UDP
The SRTT ordering + failure penalty path was UDP-only, so a DoT primary
in a forwarding-rule pool was never deprioritized on failure and all
DoT entries tied at INITIAL_SRTT_MS in the sort key. With [[forwarding]]
now accepting arrays of upstreams, DoT pools are a first-class case and
need the same healthiest-first behavior the default pool gets for UDP.

- Add Upstream::tracked_ip() → Some(ip) for Udp/Dot, None for Doh
  (DoH has no stable IP — reqwest pools connections by hostname).
- Rewire the three SRTT call sites in forward_with_failover_raw.
- Hoist srtt.read() out of the candidate-scoring loop — one lock per
  query instead of N (matters now that pools commonly have N>1).
- Drop unused #[derive(Debug)] on UpstreamPool and ForwardingRule.
- Regression tests: udp_failure_records_in_srtt + dot_failure_records_in_srtt.
2026-04-17 03:39:21 +03:00
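The `tracked_ip()` change in the commit above can be pictured with a minimal sketch. Only the rule "`Some(ip)` for UDP and DoT, `None` for DoH" comes from the commit message. The enum layout, the `order_by_srtt` helper, and the `INITIAL_SRTT_MS` value of 200 are assumptions made for illustration and are not numa's actual definitions.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, SocketAddr};

/// Hypothetical, simplified stand-in for numa's `Upstream` type.
#[derive(Debug, Clone, PartialEq)]
pub enum Upstream {
    Udp(SocketAddr),
    Dot { addr: SocketAddr, tls_name: String },
    Doh { url: String },
}

impl Upstream {
    /// UDP and DoT upstreams have a stable IP that SRTT can be keyed on.
    /// DoH does not: the HTTP client pools connections by hostname.
    pub fn tracked_ip(&self) -> Option<IpAddr> {
        match self {
            Upstream::Udp(addr) => Some(addr.ip()),
            Upstream::Dot { addr, .. } => Some(addr.ip()),
            Upstream::Doh { .. } => None,
        }
    }
}

/// Assumed starting estimate for upstreams with no measurement yet.
pub const INITIAL_SRTT_MS: u64 = 200;

/// Returns the pool ordered healthiest-first. The SRTT table is borrowed
/// once for the whole pass, mirroring the "one lock per query" point.
pub fn order_by_srtt(pool: &[Upstream], srtt: &HashMap<IpAddr, u64>) -> Vec<Upstream> {
    let mut ordered = pool.to_vec();
    // Stable sort: upstreams with equal scores keep their configured order.
    ordered.sort_by_key(|u| {
        u.tracked_ip()
            .and_then(|ip| srtt.get(&ip).copied())
            .unwrap_or(INITIAL_SRTT_MS)
    });
    ordered
}
```

With this shape, a DoT primary whose recorded SRTT has been penalized after a failure sorts behind a healthy DoT secondary, instead of tying with it at the initial value.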
Razvan Dimescu
ab6cda0c91 Merge branch 'main' into feat/forwarding-array-upstream
Resolves src/main.rs conflict: serve loop was extracted into src/serve.rs on main (PR #107). Ported the forwarding-rule log change to serve.rs — fwd.upstream is now Vec<String>, logged with join(", ").
2026-04-17 03:14:09 +03:00
Razvan Dimescu
4afc56a052 Merge main into feat/forwarding-array-upstream
Resolves conflict in src/ctx.rs — both sides added independent tokio
tests (forwarding fail-over on this branch, default-pool upstream path
on main from #103). Keep both.
2026-04-15 21:28:04 +03:00
Razvan Dimescu
fef43635d6 fix(ci): rustfmt import order and gate Upstream import for Windows 2026-04-15 04:11:27 +03:00
Razvan Dimescu
9a0d586b13 feat: accept array of upstreams in [[forwarding]]
Mirrors `[upstream] address` — `upstream` accepts string or array
of strings, builds an `UpstreamPool` and routes queries through
`forward_with_failover_raw` so SRTT ordering and failover apply to
matched `[[forwarding]]` rules the same way they do for the default
pool.

Single-string rules keep their current behavior (one-element pool,
equivalent single-upstream path). Empty array errors at config load.

Addresses item 1 of issue #102. Plan: docs/102_item1.md.
2026-04-15 04:03:38 +03:00
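As a rough illustration of the "string or array of strings" rule in the commit above, the sketch below normalizes both forms into one address list and rejects an empty array. `UpstreamField` and `into_pool_addresses` are hypothetical names, and deserialization is left out so the example stays stdlib-only; a real loader would more likely map the TOML value onto such an enum with serde's untagged representation.

```rust
/// Hypothetical stand-in for the `upstream` value of a `[[forwarding]]` rule.
#[derive(Debug, Clone, PartialEq)]
pub enum UpstreamField {
    /// `upstream = "9.9.9.9"`
    One(String),
    /// `upstream = ["9.9.9.9", "149.112.112.112"]`
    Many(Vec<String>),
}

/// Normalizes either form into the address list a pool would be built from.
/// A single string becomes a one-element list; an empty array is an error,
/// matching "Empty array errors at config load" in the commit message.
pub fn into_pool_addresses(field: UpstreamField) -> Result<Vec<String>, String> {
    match field {
        UpstreamField::One(addr) => Ok(vec![addr]),
        UpstreamField::Many(addrs) if addrs.is_empty() => {
            Err("forwarding rule has an empty upstream array".to_string())
        }
        UpstreamField::Many(addrs) => Ok(addrs),
    }
}
```

Treating the single-string form as a one-element list is what lets both forms share the same failover path without a separate single-upstream branch.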
46 changed files with 5029 additions and 387 deletions

@@ -87,12 +87,26 @@ jobs:
sleep 2
curl -sf http://127.0.0.1:5380/health
dig @127.0.0.1 example.com +short +timeout=5 | grep -q '.'
user=$(ps -o user= -p "$(systemctl show -p MainPID --value numa)" | tr -d ' ')
echo "numa running as: $user"
test "$user" != "root"
sudo ./target/release/numa install
sleep 2
curl -sf http://127.0.0.1:5380/health
sudo ./target/release/numa uninstall
sleep 1
! curl -sf http://127.0.0.1:5380/health 2>/dev/null
- name: diagnostics on failure
if: failure()
run: |
echo "=== systemctl status numa ==="
sudo systemctl status numa --no-pager -l || true
echo "=== journalctl -u numa (last 200) ==="
sudo journalctl -u numa --no-pager -n 200 || true
echo "=== ss -tulnp on 53/80/443/853/5380 ==="
sudo ss -tulnp 2>/dev/null | grep -E ':(53|80|443|853|5380)\b' || true
echo "=== systemctl is-active systemd-resolved ==="
systemctl is-active systemd-resolved || true
- name: cleanup
if: always()
run: |

@@ -126,6 +126,10 @@ jobs:
# ssh://aur@aur.archlinux.org/<package-name>.git
git clone ssh://aur@aur.archlinux.org/$AUR_PKGNAME.git aur-repo
# AUR's git server no longer advertises HEAD's symref, so clone
# lands in detached HEAD. Attach to master before committing.
git -C aur-repo checkout master
cp PKGBUILD aur-repo/
cd aur-repo

.gitignore

@@ -1,6 +1,7 @@
/target
/build-dir
CLAUDE.md
.claude/
docs/
site/blog/posts/
ios/

Cargo.lock (generated)

@@ -8,6 +8,41 @@ version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "320119579fcad9c21884f5c4861d16174d0e06250625266f50fe6898340abefa"
[[package]]
name = "aead"
version = "0.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d122413f284cf2d62fb1b7db97e02edb8cda96d769b16e443a4f6195e35662b0"
dependencies = [
"crypto-common",
"generic-array",
]
[[package]]
name = "aes"
version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b169f7a6d4742236a0a00c541b845991d0ac43e546831af1249753ab4c3aa3a0"
dependencies = [
"cfg-if",
"cipher",
"cpufeatures",
]
[[package]]
name = "aes-gcm"
version = "0.10.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "831010a0f742e1209b3bcea8fab6a8e149051ba6099432c8cb2cc117dec3ead1"
dependencies = [
"aead",
"aes",
"cipher",
"ctr",
"ghash",
"subtle",
]
[[package]]
name = "aho-corasick"
version = "1.1.4"
@@ -109,7 +144,7 @@ dependencies = [
"nom",
"num-traits",
"rusticata-macros",
"thiserror",
"thiserror 2.0.18",
"time",
]
@@ -257,6 +292,15 @@ version = "2.11.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "843867be96c8daad0d758b57df9392b6d8d271134fce549de6ce169ff98a92af"
[[package]]
name = "block-buffer"
version = "0.10.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3078c7629b62d3f0439517fa394996acacc5cbc91c5a20d8c658e77abd503a71"
dependencies = [
"generic-array",
]
[[package]]
name = "bumpalo"
version = "3.20.2"
@@ -299,6 +343,30 @@ version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724"
[[package]]
name = "chacha20"
version = "0.9.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3613f74bd2eac03dad61bd53dbe620703d4371614fe0bc3b9f04dd36fe4e818"
dependencies = [
"cfg-if",
"cipher",
"cpufeatures",
]
[[package]]
name = "chacha20poly1305"
version = "0.10.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "10cd79432192d1c0f4e1a0fef9527696cc039165d729fb41b3f4f4f354c2dc35"
dependencies = [
"aead",
"chacha20",
"cipher",
"poly1305",
"zeroize",
]
[[package]]
name = "ciborium"
version = "0.2.2"
@@ -326,6 +394,17 @@ dependencies = [
"half",
]
[[package]]
name = "cipher"
version = "0.4.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "773f3b9af64447d2ce9850330c473515014aa235e6a783b02db81ff39e4a3dad"
dependencies = [
"crypto-common",
"inout",
"zeroize",
]
[[package]]
name = "clap"
version = "4.6.0"
@@ -383,6 +462,15 @@ version = "0.4.31"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75984efb6ed102a0d42db99afb6c1948f0380d1d91808d5529916e6c08b49d8d"
[[package]]
name = "cpufeatures"
version = "0.2.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "59ed5838eebb26a2bb2e58f6d5b5316989ae9d08bab10e0e6d103e656d1b0280"
dependencies = [
"libc",
]
[[package]]
name = "crc32fast"
version = "1.5.0"
@@ -473,6 +561,51 @@ version = "0.2.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "460fbee9c2c2f33933d720630a6a0bac33ba7053db5344fac858d4b8952d77d5"
[[package]]
name = "crypto-common"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "78c8292055d1c1df0cce5d180393dc8cce0abec0a7102adb6c7b1eef6016d60a"
dependencies = [
"generic-array",
"rand_core 0.6.4",
"typenum",
]
[[package]]
name = "ctr"
version = "0.9.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0369ee1ad671834580515889b80f2ea915f23b8be8d0daa4bbaf2ac5c7590835"
dependencies = [
"cipher",
]
[[package]]
name = "curve25519-dalek"
version = "4.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "97fb8b7c4503de7d6ae7b42ab72a5a59857b4c937ec27a3d4539dba95b5ab2be"
dependencies = [
"cfg-if",
"cpufeatures",
"curve25519-dalek-derive",
"fiat-crypto",
"rustc_version",
"subtle",
]
[[package]]
name = "curve25519-dalek-derive"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f46882e17999c6cc590af592290432be3bce0428cb0d5f8b6715e4dc7b383eb3"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "data-encoding"
version = "2.10.0"
@@ -502,6 +635,17 @@ dependencies = [
"powerfmt",
]
[[package]]
name = "digest"
version = "0.10.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9ed9a281f7bc9b7576e61468ba615a66a5c8cfdff42420a70aa82701a3b1e292"
dependencies = [
"block-buffer",
"crypto-common",
"subtle",
]
[[package]]
name = "displaydoc"
version = "0.2.5"
@@ -576,6 +720,12 @@ dependencies = [
"windows-sys 0.61.2",
]
[[package]]
name = "fiat-crypto"
version = "0.2.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "28dea519a9695b9977216879a3ebfddf92f1c08c05d984f8996aecd6ecdc811d"
[[package]]
name = "find-msvc-tools"
version = "0.1.9"
@@ -707,6 +857,16 @@ dependencies = [
"slab",
]
[[package]]
name = "generic-array"
version = "0.14.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "85649ca51fd72272d7821adaf274ad91c288277713d9c18820d8499a7ff69e9a"
dependencies = [
"typenum",
"version_check",
]
[[package]]
name = "getrandom"
version = "0.2.17"
@@ -747,6 +907,16 @@ dependencies = [
"wasip3",
]
[[package]]
name = "ghash"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f0d8a4362ccb29cb0b265253fb0a2728f592895ee6854fd9bc13f2ffda266ff1"
dependencies = [
"opaque-debug",
"polyval",
]
[[package]]
name = "h2"
version = "0.4.13"
@@ -820,7 +990,7 @@ dependencies = [
"rand",
"ring",
"rustls",
"thiserror",
"thiserror 2.0.18",
"tinyvec",
"tokio",
"tokio-rustls",
@@ -846,13 +1016,51 @@ dependencies = [
"resolv-conf",
"rustls",
"smallvec",
"thiserror",
"thiserror 2.0.18",
"tokio",
"tokio-rustls",
"tracing",
"webpki-roots 0.26.11",
]
[[package]]
name = "hkdf"
version = "0.12.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7b5f8eb2ad728638ea2c7d47a21db23b7b58a72ed6a38256b8a1849f15fbbdf7"
dependencies = [
"hmac",
]
[[package]]
name = "hmac"
version = "0.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6c49c37c09c17a53d937dfbb742eb3a961d65a994e6bcdcf37e7399d0cc8ab5e"
dependencies = [
"digest",
]
[[package]]
name = "hpke"
version = "0.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f65d16b699dd1a1fa2d851c970b0c971b388eeeb40f744252b8de48860980c8f"
dependencies = [
"aead",
"aes-gcm",
"chacha20poly1305",
"digest",
"generic-array",
"hkdf",
"hmac",
"rand_core 0.9.5",
"sha2",
"subtle",
"x25519-dalek",
"zeroize",
]
[[package]]
name = "http"
version = "1.4.0"
@@ -1081,6 +1289,15 @@ dependencies = [
"serde_core",
]
[[package]]
name = "inout"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "879f10e63c20629ecabbb64a8010319738c66a5cd0c29b02d63d272b03751d01"
dependencies = [
"generic-array",
]
[[package]]
name = "ipconfig"
version = "0.3.4"
@@ -1330,7 +1547,7 @@ dependencies = [
[[package]]
name = "numa"
version = "0.13.1"
version = "0.14.2"
dependencies = [
"arc-swap",
"axum",
@@ -1344,7 +1561,10 @@ dependencies = [
"hyper",
"hyper-util",
"log",
"odoh-rs",
"psl",
"qrcode",
"rand_core 0.9.5",
"rcgen",
"reqwest",
"ring",
@@ -1363,6 +1583,19 @@ dependencies = [
"x509-parser",
]
[[package]]
name = "odoh-rs"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cbb89720b7dfdddc89bc7560669d41a0bb68eb64784a4aebd293308a489f3837"
dependencies = [
"aes-gcm",
"bytes",
"hkdf",
"hpke",
"thiserror 1.0.69",
]
[[package]]
name = "oid-registry"
version = "0.8.1"
@@ -1394,6 +1627,12 @@ version = "11.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d6790f58c7ff633d8771f42965289203411a5e5c68388703c06e14f24770b41e"
[[package]]
name = "opaque-debug"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c08d65885ee38876c4f86fa503fb49d7b507c2b62552df7c70b2fce627e06381"
[[package]]
name = "page_size"
version = "0.6.0"
@@ -1483,6 +1722,29 @@ dependencies = [
"plotters-backend",
]
[[package]]
name = "poly1305"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8159bd90725d2df49889a078b54f4f79e87f1f8a8444194cdca81d38f5393abf"
dependencies = [
"cpufeatures",
"opaque-debug",
"universal-hash",
]
[[package]]
name = "polyval"
version = "0.6.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d1fe60d06143b2430aa532c94cfe9e29783047f06c0d7fd359a9a51b729fa25"
dependencies = [
"cfg-if",
"cpufeatures",
"opaque-debug",
"universal-hash",
]
[[package]]
name = "portable-atomic"
version = "1.13.1"
@@ -1541,6 +1803,21 @@ dependencies = [
"unicode-ident",
]
[[package]]
name = "psl"
version = "2.1.203"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "76c0777260d32b76a8c3c197646707085d37e79d63b5872a29192c8d4f60f50b"
dependencies = [
"psl-types",
]
[[package]]
name = "psl-types"
version = "2.0.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33cb294fe86a74cbcf50d4445b37da762029549ebeea341421c7c70370f86cac"
[[package]]
name = "qrcode"
version = "0.14.1"
@@ -1561,7 +1838,7 @@ dependencies = [
"rustc-hash",
"rustls",
"socket2",
"thiserror",
"thiserror 2.0.18",
"tokio",
"tracing",
"web-time",
@@ -1582,7 +1859,7 @@ dependencies = [
"rustls",
"rustls-pki-types",
"slab",
"thiserror",
"thiserror 2.0.18",
"tinyvec",
"tracing",
"web-time",
@@ -1630,7 +1907,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6db2770f06117d490610c7488547d543617b21bfa07796d7a12f6f1bd53850d1"
dependencies = [
"rand_chacha",
"rand_core",
"rand_core 0.9.5",
]
[[package]]
@@ -1640,7 +1917,16 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d3022b5f1df60f26e1ffddd6c66e8aa15de382ae63b3a0c1bfc0e4d3e3f325cb"
dependencies = [
"ppv-lite86",
"rand_core",
"rand_core 0.9.5",
]
[[package]]
name = "rand_core"
version = "0.6.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ec0be4795e2f6a28069bec0b5ff3e2ac9bafc99e6a9a7dc3547996c5c816922c"
dependencies = [
"getrandom 0.2.17",
]
[[package]]
@@ -1789,6 +2075,15 @@ version = "2.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "357703d41365b4b27c590e3ed91eabb1b663f07c4c084095e60cbed4362dff0d"
[[package]]
name = "rustc_version"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cfcb3a22ef46e85b45de6ee7e79d063319ebb6594faafcf1c225ea92ab6e9b92"
dependencies = [
"semver",
]
[[package]]
name = "rusticata-macros"
version = "4.1.0"
@@ -1835,9 +2130,9 @@ dependencies = [
[[package]]
name = "rustls-webpki"
version = "0.103.12"
version = "0.103.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8279bb85272c9f10811ae6a6c547ff594d6a7f3c6c6b02ee9726d1d0dcfcdd06"
checksum = "61c429a8649f110dddef65e2a5ad240f747e85f7758a6bccc7e5777bd33f756e"
dependencies = [
"aws-lc-rs",
"ring",
@@ -1953,6 +2248,17 @@ dependencies = [
"serde",
]
[[package]]
name = "sha2"
version = "0.10.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a7507d819769d01a365ab707794a4084392c824f54a7a6a7862f8c3d0892b283"
dependencies = [
"cfg-if",
"cpufeatures",
"digest",
]
[[package]]
name = "shlex"
version = "1.3.0"
@@ -2046,13 +2352,33 @@ version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7b2093cf4c8eb1e67749a6762251bc9cd836b6fc171623bd0a9d324d37af2417"
[[package]]
name = "thiserror"
version = "1.0.69"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6aaf5339b578ea85b50e080feb250a3e8ae8cfcdff9a461c9ec2904bc923f52"
dependencies = [
"thiserror-impl 1.0.69",
]
[[package]]
name = "thiserror"
version = "2.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4288b5bcbc7920c07a1149a35cf9590a2aa808e0bc1eafaade0b80947865fbc4"
dependencies = [
"thiserror-impl",
"thiserror-impl 2.0.18",
]
[[package]]
name = "thiserror-impl"
version = "1.0.69"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4fee6c4efc90059e10f81e6d42c60a18f76588c3d74cb83a0b242a2b6c7504c1"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
@@ -2298,6 +2624,12 @@ version = "0.2.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e421abadd41a4225275504ea4d6566923418b7f05506fbc9c0fe86ba7396114b"
[[package]]
name = "typenum"
version = "1.20.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "40ce102ab67701b8526c123c1bab5cbe42d7040ccfd0f64af1a385808d2f43de"
[[package]]
name = "unicode-ident"
version = "1.0.24"
@@ -2310,6 +2642,16 @@ version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ebc1c04c71510c7f702b52b7c350734c9ff1295c464a03335b00bb84fc54f853"
[[package]]
name = "universal-hash"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fc1de2c688dc15305988b563c3854064043356019f97a4b46276fe734c4f07ea"
dependencies = [
"crypto-common",
"subtle",
]
[[package]]
name = "untrusted"
version = "0.9.0"
@@ -2351,6 +2693,12 @@ dependencies = [
"wasm-bindgen",
]
[[package]]
name = "version_check"
version = "0.9.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a"
[[package]]
name = "walkdir"
version = "2.5.0"
@@ -2860,6 +3208,16 @@ version = "0.6.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9edde0db4769d2dc68579893f2306b26c6ecfbe0ef499b013d731b7b9247e0b9"
[[package]]
name = "x25519-dalek"
version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c7e468321c81fb07fa7f4c636c3972b9100f0346e5b6a9f2bd0603a52f7ed277"
dependencies = [
"curve25519-dalek",
"rand_core 0.6.4",
]
[[package]]
name = "x509-parser"
version = "0.18.1"
@@ -2874,7 +3232,7 @@ dependencies = [
"oid-registry",
"ring",
"rusticata-macros",
"thiserror",
"thiserror 2.0.18",
"time",
]
@@ -2956,6 +3314,20 @@ name = "zeroize"
version = "1.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b97154e67e32c85465826e8bcc1c59429aaaf107c1e4a9e53c8d8ccd5eff88d0"
dependencies = [
"zeroize_derive",
]
[[package]]
name = "zeroize_derive"
version = "1.4.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "85a5b4158499876c763cb03bc4e49185d3cccbabb15b33c627f7884f43db852e"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "zerotrie"

@@ -1,6 +1,6 @@
[package]
name = "numa"
version = "0.13.1"
version = "0.14.2"
authors = ["razvandimescu <razvan@dimescu.com>"]
edition = "2021"
description = "Portable DNS resolver in Rust — .numa local domains, ad blocking, developer overrides, DNS-over-HTTPS"
@@ -29,6 +29,11 @@ rustls = "0.23"
tokio-rustls = "0.26"
arc-swap = "1"
ring = "0.17"
odoh-rs = "1"
psl = "2"
# rand_core 0.9 matches the version odoh-rs (via hpke 0.13) depends on, so we
# share one RngCore trait and OsRng impl across the dep tree.
rand_core = { version = "0.9", features = ["os_rng"] }
rustls-pemfile = "2.2.0"
qrcode = { version = "0.14", default-features = false, features = ["svg"] }
webpki-roots = "1"

@@ -6,9 +6,9 @@
**DNS you own. Everywhere you go.** — [numa.rs](https://numa.rs)
A portable DNS resolver in a single binary. Block ads on any network, name your local services (`frontend.numa`), and override any hostname with auto-revert — all from your laptop, no cloud account or Raspberry Pi required.
A portable DNS resolver in a single binary. Block ads on any network, name your local services (`frontend.numa`), override any hostname with auto-revert, and seal every outbound query with **ODoH (RFC 9230)** so no single party sees both who you are and what you asked — all from your laptop, no cloud account or Raspberry Pi required.
Built from scratch in Rust. Zero DNS libraries. RFC 1035 wire protocol parsed by hand. Caching, ad blocking, and local service domains out of the box. Optional recursive resolution from root nameservers with full DNSSEC chain-of-trust validation, plus a DNS-over-TLS listener for encrypted client connections (iOS Private DNS, systemd-resolved, etc.). One ~8MB binary, everything embedded.
Built from scratch in Rust. Zero DNS libraries. Caching, ad blocking, and local service domains out of the box. Optional recursive resolution from root nameservers with full DNSSEC chain-of-trust validation, plus a DNS-over-TLS listener for encrypted client connections (iOS Private DNS, systemd-resolved, etc.). Run `numa relay` and the same binary becomes a public ODoH endpoint too — the curated DNSCrypt list currently has one surviving relay, so every Numa deploy materially expands the ecosystem. One ~8MB binary, everything embedded.
![Numa dashboard](assets/hero-demo.gif)
@@ -125,6 +125,10 @@ docker run -d --name numa --network host \
Multi-arch: `linux/amd64` and `linux/arm64`.
Turnkey compose recipes:
- [`packaging/client/`](packaging/client/) — ODoH client mode (anonymous DNS), Numa + starter `numa.toml`.
- [`packaging/relay/`](packaging/relay/) — public ODoH relay, Numa + Caddy + ACME.
## How It Compares
| | Pi-hole | AdGuard Home | Unbound | Numa |

@@ -383,7 +383,7 @@ fn run_default(rt: &tokio::runtime::Runtime) {
/// Library-to-library: Numa forward_query_raw vs Hickory resolver.lookup.
fn run_direct(rt: &tokio::runtime::Runtime) {
let upstream = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse");
let upstream = numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let resolver = rt.block_on(build_hickory_resolver());
let timeout = Duration::from_secs(10);
@@ -609,9 +609,11 @@ fn run_hedge_multi(rt: &tokio::runtime::Runtime, iterations: usize) {
DOMAINS.len()
);
let primary = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse");
let primary_dual = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse");
let secondary_dual = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse");
let primary = numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let primary_dual =
numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let secondary_dual =
numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let resolver = rt.block_on(build_hickory_resolver());
println!("Warming up...");
@@ -810,7 +812,7 @@ fn run_diag(rt: &tokio::runtime::Runtime) {
fn run_diag_clients(rt: &tokio::runtime::Runtime) {
println!("Client diagnostic: reqwest vs Hickory (20 queries to {DOH_UPSTREAM})\n");
let upstream = numa::forward::parse_upstream(DOH_UPSTREAM, 443).expect("failed to parse");
let upstream = numa::forward::parse_upstream(DOH_UPSTREAM, 443, None).expect("failed to parse");
let resolver = rt.block_on(build_hickory_resolver());
let timeout = Duration::from_secs(10);

@@ -8,6 +8,39 @@ Type=simple
ExecStart={{exe_path}}
Restart=always
RestartSec=2
# Transient system user per start; no PKGBUILD/sysusers setup required.
# systemd remaps the StateDirectory ownership to the dynamic UID on each
# launch, including legacy root-owned trees from pre-drop installs.
DynamicUser=yes
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
StateDirectory=numa
StateDirectoryMode=0750
ConfigurationDirectory=numa
ConfigurationDirectoryMode=0755
# Sandboxing — conservative set known to work with Rust network daemons.
# Aggressive hardening (MemoryDenyWriteExecute, SystemCallFilter, seccomp
# allow-lists) can be layered on once tested in isolation.
NoNewPrivileges=true
ProtectSystem=strict
# DynamicUser= sets ProtectHome=read-only by default — leaves /home
# readable so systemd can exec binaries installed under it (cargo install,
# source builds), while blocking writes to user $HOMEs. Don't set =yes:
# that hides /home entirely and fails with status=203/EXEC.
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictRealtime=true
RestrictSUIDSGID=true
# AF_NETLINK for interface enumeration on network changes
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX AF_NETLINK
StandardOutput=journal
StandardError=journal
SyslogIdentifier=numa

@@ -8,10 +8,21 @@ api_port = 5380
# %PROGRAMDATA%\numa on windows. Override for
# containerized deploys or tests that can't
# write to the system path.
# filter_aaaa = true # on IPv4-only networks, answer AAAA queries with
# NODATA (NOERROR + empty answer) so Happy Eyeballs
# clients don't wait on a v6 attempt that can't
# succeed. Also strips `ipv6hint` from HTTPS/SVCB
# records (RFC 9460) so modern browsers (Chrome
# ≥103, Firefox, Safari) don't bypass the AAAA
# filter via SVCB hints. Local zones, overrides,
# and the .numa proxy are NOT filtered — you can
# still configure v6 records for local services.
# Default: false.
# [upstream]
# mode = "forward" # "forward" (default) — relay to upstream
# # "recursive" — resolve from root hints (no address needed)
# # "odoh" — Oblivious DoH (see ODoH block below)
# address = "9.9.9.9" # single upstream (plain UDP)
# address = ["192.168.1.1", "9.9.9.9:5353"] # multiple upstreams — SRTT picks fastest
# address = "https://dns.quad9.net/dns-query" # DNS-over-HTTPS (encrypted)
@@ -19,11 +30,29 @@ api_port = 5380
# fallback = ["8.8.8.8", "1.1.1.1"] # tried only when all primaries fail
# port = 53 # default port for addresses without :port
# timeout_ms = 3000
# hedge_ms = 10 # request hedging delay (ms). After this delay
# # without a response, fires a parallel request
# # to the same upstream. Rescues packet loss (UDP),
# # dispatch spikes (DoH), TLS stalls (DoT).
# # Set to 0 to disable. Default: 10
# hedge_ms = 0 # request hedging delay (ms). Default: 0 (off).
# # Set to e.g. 10 to fire a parallel upstream
# # request after 10ms of silence — rescues packet
# # loss (UDP), dispatch spikes (DoH), TLS stalls
# # (DoT). Doubles the upstream query count, so
# # leave off for quota'd providers (NextDNS,
# # Control D).
# ODoH (Oblivious DNS-over-HTTPS, RFC 9230). The relay sees your IP but
# not the question; the target sees the question but not your IP. Numa
# refuses same-operator relay+target configs by default (eTLD+1 check).
# [upstream]
# mode = "odoh"
# relay = "https://odoh-relay.numa.rs/relay"
# target = "https://odoh.cloudflare-dns.com/dns-query"
# strict = true # default: refuse to downgrade to `fallback`
# # on relay failure. Set false to allow a
# # non-oblivious fallback path.
# relay_ip = "178.104.229.30" # optional: pin IPs so numa doesn't leak the
# target_ip = "104.16.249.249" # relay/target hostnames via the bootstrap
# # resolver on cold boot when numa is its
# # own system DNS. See
# # recipes/odoh-upstream.md.
# root_hints = [ # only used in recursive mode
# "198.41.0.4", # a.root-servers.net (Verisign)
# "199.9.14.201", # b.root-servers.net (USC-ISI)
@@ -66,6 +95,13 @@ api_port = 5380
# [[forwarding]] # DoH upstream: full https:// URL
# suffix = "example.corp"
# upstream = "https://dns.quad9.net/dns-query"
#
# [[forwarding]] # array of upstreams → SRTT-aware failover
# suffix = ["google.com", "goog"] # fastest-healthy first, dead one skipped
# upstream = [
# "tls://9.9.9.9#dns.quad9.net",
# "tls://149.112.112.112#dns.quad9.net",
# ]
# [blocking]
# enabled = true # set to false to disable ad blocking
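The `hedge_ms` comment in the `[upstream]` block above describes request hedging: if the first request has produced nothing after the delay, a second one is fired in parallel and whichever answers first wins. Below is a minimal sketch of that idea using only threads and a channel. `hedged_query` and its closure-based signature are assumptions made for illustration and do not reflect numa's async implementation.

```rust
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Runs `query` once, and a second time in parallel if the first attempt
/// has not answered within `hedge_ms`. `hedge_ms == 0` disables hedging.
/// Returns the first answer to arrive, or `None` if every attempt died.
pub fn hedged_query<T, F>(hedge_ms: u64, query: F) -> Option<T>
where
    T: Send + 'static,
    F: Fn() -> T + Send + Sync + 'static,
{
    let query = Arc::new(query);
    let (tx, rx) = mpsc::channel();

    // Each attempt runs on its own thread and reports through the channel.
    let spawn_attempt = |tx: mpsc::Sender<T>| {
        let query = Arc::clone(&query);
        thread::spawn(move || {
            // The receiver may already be gone if the other attempt won.
            let _ = tx.send(query());
        });
    };

    spawn_attempt(tx.clone());
    if hedge_ms == 0 {
        drop(tx);
        return rx.recv().ok();
    }

    match rx.recv_timeout(Duration::from_millis(hedge_ms)) {
        Ok(answer) => Some(answer),
        Err(mpsc::RecvTimeoutError::Timeout) => {
            // Silence past the hedge delay: fire the parallel attempt.
            spawn_attempt(tx);
            rx.recv().ok()
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => None,
    }
}
```

The sketch also shows why the config comment warns about quota'd providers: every slow answer costs a second upstream query, so a hedged resolver can send up to twice as many requests.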

@@ -0,0 +1,72 @@
# Numa ODoH Client — Docker deploy
Single-container deploy that runs Numa as an ODoH (RFC 9230) client: every
DNS query routes through an independent relay + target so neither operator
sees both your IP and your question. See the [ODoH upstream recipe][odoh]
for the protocol details and the bootstrap-pinning trade-offs.
[odoh]: ../../recipes/odoh-upstream.md
## Prerequisites
- Docker + Docker Compose v2.
- Port 53 (UDP+TCP) free on the host — Numa listens there for DNS
clients on your LAN.
## Configure
The shipped `numa.toml` points at Numa's own public relay
(`odoh-relay.numa.rs`) paired with Cloudflare's ODoH target
(`odoh.cloudflare-dns.com`). That's two independent operators with
distinct eTLD+1s — the default configuration passes Numa's same-operator
check and works out of the box.
To use a different relay or target, edit `numa.toml` and adjust the URLs.
The `relay` and `target` must resolve to distinct operators or Numa
refuses to start.
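The same-operator check can be approximated as comparing the registrable domain (eTLD+1) of the two URLs. The sketch below is a deliberately naive version that treats the last two host labels as the eTLD+1, which is wrong for multi-label public suffixes such as `co.uk`. A real check would consult the Public Suffix List, as the newly added `psl` dependency suggests. The function names are hypothetical.

```rust
/// Extracts the host from an `https://` URL. IPv6 literals and userinfo
/// are not handled in this sketch.
fn host_of(url: &str) -> Option<&str> {
    let rest = url.strip_prefix("https://")?;
    let authority = rest.split('/').next()?;
    let host = authority.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// ASSUMPTION: approximates eTLD+1 as the last two labels of the host.
/// This is not a Public Suffix List lookup.
fn naive_registrable_domain(host: &str) -> String {
    let labels: Vec<&str> = host.split('.').collect();
    let n = labels.len();
    if n <= 2 {
        host.to_ascii_lowercase()
    } else {
        format!("{}.{}", labels[n - 2], labels[n - 1]).to_ascii_lowercase()
    }
}

/// `Some(true)` when relay and target share a registrable domain,
/// `Some(false)` when they differ, `None` when either URL is unusable.
pub fn same_operator(relay_url: &str, target_url: &str) -> Option<bool> {
    let relay = naive_registrable_domain(host_of(relay_url)?);
    let target = naive_registrable_domain(host_of(target_url)?);
    Some(relay == target)
}
```

Under this approximation the shipped pair (`odoh-relay.numa.rs` and `odoh.cloudflare-dns.com`) compares as `numa.rs` against `cloudflare-dns.com`, so it counts as two operators.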
## Deploy
```sh
docker compose up -d
docker compose logs -f numa # watch startup
```
The first query fires the bootstrap resolver + ODoH config fetch;
subsequent queries reuse the warm HTTP/2 connection.
## Point your devices at it
Set each device's DNS server to the IP of the Docker host. For a LAN-wide
rollout, set the DNS server in your router's DHCP config so every device
picks it up automatically.
Verify a query landed on the ODoH path:
```sh
dig @<host-ip> example.com
curl http://<host-ip>:5380/stats | jq '.upstream_transport.odoh'
```
`upstream_transport.odoh` should increment on each query.
## What this does NOT buy you
ODoH protects the *path*, not the content:
- **The target (Cloudflare here) still sees the question.** It just
doesn't know it's you asking. If Cloudflare logs every ODoH query, the
query is still visible — it's simply unattributed.
- **The relay is a trusted party for availability.** A malicious relay
can drop or delay queries; it just can't read them.
- **Traffic analysis defeats small relays.** If you're the only client
talking to a relay, timing alone re-identifies you. Shared, busy relays
give better anonymity sets.
See the [ODoH integration doc][odoh] for more.
## Relay operator?
If you'd rather run your own relay (same binary, different mode), see
[`../relay/`](../relay/) — that package spins up a public-facing relay
with Caddy + ACME in front of it.

@@ -0,0 +1,15 @@
services:
numa:
image: ghcr.io/razvandimescu/numa:latest
command: ["/etc/numa/numa.toml"]
ports:
- "53:53/udp"
- "53:53/tcp"
- "5380:5380/tcp" # dashboard + REST API
volumes:
- ./numa.toml:/etc/numa/numa.toml:ro
- numa_data:/var/lib/numa
restart: unless-stopped
volumes:
numa_data:

@@ -0,0 +1,23 @@
# Numa — ODoH client mode (docker-compose starter).
# Sends every DNS query through an independent relay + target pair so
# neither operator sees both your IP and your question. See
# recipes/odoh-upstream.md for the protocol details and
# packaging/client/README.md for deploy notes.
[server]
bind_addr = "0.0.0.0:53"
api_bind_addr = "0.0.0.0"
data_dir = "/var/lib/numa"
[upstream]
mode = "odoh"
# Numa's own relay (Hetzner, systemd + Caddy). Swap to any other public
# ODoH relay if you'd rather not depend on a single operator; the protocol
# tolerates it, and Numa refuses same-operator relay+target by default.
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
# strict = true (default). Relay failure → SERVFAIL, never silent downgrade.
[blocking]
enabled = true
# Default blocklist (Hagezi Pro). Edit the `lists` array to taste.

15
packaging/relay/Caddyfile Normal file

@@ -0,0 +1,15 @@
odoh-relay.example.com {
    handle /relay {
        reverse_proxy numa-relay:8443
    }
    handle /health {
        reverse_proxy numa-relay:8443
    }
    respond 404
    # Per-request access logs defeat the point of an oblivious relay.
    # Aggregate counters are exposed at /health on the relay itself.
    log {
        output discard
    }
}

41
packaging/relay/README.md Normal file

@@ -0,0 +1,41 @@
# Numa ODoH Relay — Docker deploy
Two-container deploy: Caddy terminates TLS (auto-provisioning a Let's Encrypt
cert via ACME) and reverse-proxies to a Numa relay running on an internal
Docker network. The relay never reads sealed payloads; Caddy never logs them.
## Prerequisites
- A host with ports 80 and 443 reachable from the internet.
- A DNS record (`A` or `AAAA`) pointing your chosen hostname at the host.
- Docker + Docker Compose v2.
## Configure
Edit `Caddyfile` and replace `odoh-relay.example.com` with your hostname.
That hostname is what ACME validates against and what ODoH clients will
configure as their relay URL: `https://<hostname>/relay`.
## Deploy
```sh
docker compose up -d
docker compose logs -f caddy # watch ACME provisioning
```
First boot takes a few seconds while Caddy obtains the cert. Subsequent
restarts reuse the cached cert from the `caddy_data` volume.
## Verify
```sh
curl https://<hostname>/health
# ok
# total 0
# forwarded_ok 0
# forwarded_err 0
# rejected_bad_request 0
```
Then point any ODoH client at `https://<hostname>/relay` and watch the
counters tick.


@@ -0,0 +1,26 @@
services:
  numa-relay:
    image: ghcr.io/razvandimescu/numa:latest
    command: ["relay", "8443", "0.0.0.0"]
    restart: unless-stopped
    networks: [internal]
  caddy:
    image: caddy:2
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config
    restart: unless-stopped
    depends_on: [numa-relay]
    networks: [internal]
networks:
  internal:
volumes:
  caddy_data:
  caddy_config:

11
recipes/README.md Normal file

@@ -0,0 +1,11 @@
# Recipes
Scenario-driven configs for common Numa deployments. Each recipe is self-contained: copy the snippet, adjust the marked fields, reload.
## Transport / encryption
- [DoH on the LAN](doh-on-lan.md) — expose Numa's built-in DNS-over-HTTPS to local clients.
- [dnsdist in front of Numa](dnsdist-front.md) — terminate public TLS externally, keep Numa on loopback.
- [ODoH upstream with bootstrap pinning](odoh-upstream.md) — oblivious DNS client mode without leaking the relay/target hostnames.
Missing a scenario? Open an issue or PR — these are plain Markdown with no build step.

64
recipes/dnsdist-front.md Normal file

@@ -0,0 +1,64 @@
# dnsdist in front of Numa
For public DoH with a real (ACME-signed) cert, terminate TLS outside Numa and forward plain DNS (or loopback-only DoH) to the resolver. Cert renewal, rate-limiting, and load-balancing live in the front-end; Numa stays focused on resolution.
## When to use this
- Public hostname (`dns.example.com`) with a Let's Encrypt or internal PKI cert.
- You want a dedicated front-end for DoH/DoT/DoQ while Numa stays loopback-bound.
- You plan to run multiple Numa instances behind one endpoint.
## Architecture
```
public 443/DoH ┐
public 853/DoT ├─► dnsdist ─► 127.0.0.1:53 (Numa UDP/TCP)
public 853/DoQ ┘
```
## dnsdist config
```lua
-- /etc/dnsdist/dnsdist.conf
newServer({address="127.0.0.1:53", name="numa", checkType="A", checkName="numa.rs."})
addDOHLocal(
"0.0.0.0:443",
"/etc/letsencrypt/live/dns.example.com/fullchain.pem",
"/etc/letsencrypt/live/dns.example.com/privkey.pem",
"/dns-query",
{doTCP=true, reusePort=true}
)
addTLSLocal(
"0.0.0.0:853",
"/etc/letsencrypt/live/dns.example.com/fullchain.pem",
"/etc/letsencrypt/live/dns.example.com/privkey.pem"
)
addAction(AllRule(), PoolAction("", false))
```
## Numa config
```toml
[proxy]
enabled = true # keep if you still use *.numa service routing
bind_addr = "127.0.0.1" # stays default
```
No changes to `[server]` — Numa keeps serving plain DNS on UDP/TCP 53, which dnsdist forwards.
## Caveat: client IPs
Without PROXY protocol support in Numa, the query log shows the front-end's IP on every query, not the real client. dnsdist can emit PROXY v2 (`useProxyProtocol=true` on `newServer`), but Numa doesn't yet parse it — tracked in the wish-list under #143. Until then, accept the blind spot or correlate against dnsdist's own logs.
## Verify
```bash
kdig +https @dns.example.com example.com
kdig +tls @dns.example.com example.com
```
Both should return clean answers. Numa's `/queries` API should show the request landing, sourced from the front-end IP.

61
recipes/doh-on-lan.md Normal file

@@ -0,0 +1,61 @@
# DoH on the LAN
Numa ships an RFC 8484 DoH endpoint (`POST /dns-query`) on the `[proxy]` HTTPS listener. By default it binds `127.0.0.1:443` with a self-signed cert — invisible to anything off the box. Three changes make it reachable from the LAN.
## When to use this
- Your phone/laptop is on the same network as Numa and you want encrypted DNS without a cloud resolver.
- You're OK installing Numa's self-signed CA on every client (one-time, via `/ca.pem` + the mobileconfig flow).
For a publicly-trusted cert, see [dnsdist in front of Numa](dnsdist-front.md) instead.
## Minimal config
```toml
[proxy]
enabled = true # default
bind_addr = "0.0.0.0" # was 127.0.0.1 — expose to LAN
tls_port = 443 # default; DoH is served here
tld = "numa" # default — self-resolving, see below
```
`tld` is the DoH gate: Numa accepts the DoH request only when the `Host` header is loopback or equals (or is a subdomain of) `tld`. Clients therefore dial `https://numa/dns-query`.
With the default `tld = "numa"`, there's no DNS bootstrap to configure: Numa already resolves `numa` and `*.numa` to its own LAN IP for remote clients (that's how the `*.numa` service-proxy feature works). Any client that uses Numa as its resolver will resolve `numa` correctly on first try.
If you'd rather use a hostname that resolves via normal DNS (e.g. you want DoH-only clients that never talk plain DNS to Numa), set `tld = "dns.example.com"` and add a matching A record in whichever DNS your clients consult before reaching Numa.
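The `Host` gate described in this section can be sketched as a small predicate. This is an illustration in Rust of the stated rule (loopback, exact match, or subdomain of `tld`), not Numa's actual implementation; the function name `host_allowed` and the set of accepted loopback spellings are assumptions:

```rust
use std::net::IpAddr;

/// Returns true when a `Host` header value passes the gate: it is a loopback
/// name or address, equals `tld`, or is a subdomain of `tld`.
pub fn host_allowed(host_header: &str, tld: &str) -> bool {
    // Strip an optional port. Bracketed IPv6 literals look like "[::1]:443".
    let host = if let Some(rest) = host_header.strip_prefix('[') {
        rest.split(']').next().unwrap_or("")
    } else {
        host_header.split(':').next().unwrap_or("")
    };
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    let tld = tld.trim_end_matches('.').to_ascii_lowercase();

    if host.is_empty() {
        return false;
    }
    if host == "localhost" {
        return true;
    }
    // IP literals pass only when they are loopback addresses.
    if let Ok(ip) = host.parse::<IpAddr>() {
        return ip.is_loopback();
    }
    // Exact match, or a suffix match on a label boundary so that
    // "notnuma" does not pass for tld = "numa".
    host == tld || host.ends_with(&format!(".{tld}"))
}
```

Under these assumptions, `host_allowed("grafana.numa:443", "numa")` passes, while a request addressed to the LAN IP directly (for example `192.168.1.10`) does not, which is why clients dial the `numa` hostname rather than the address.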
## Trust the CA on each client
Numa generates a self-signed CA at startup. Fetch it once, import it wherever you'll run the DoH client:
```bash
curl -o numa-ca.pem http://<numa-ip>:5380/ca.pem
```
- **macOS** — `sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain numa-ca.pem`
- **iOS** — install the mobileconfig from the API (same CA, signed profile). Flip *Settings → General → About → Certificate Trust Settings* on after install.
- **Linux** — drop into `/usr/local/share/ca-certificates/` and run `sudo update-ca-certificates`.
- **Android** — requires the user-installed CA path; browsers may still refuse it for DoH. Consider the [dnsdist front](dnsdist-front.md) route instead.
## Verify
```bash
kdig +https @numa example.com
```
Without `+https`, kdig uses plain DNS. With `+https`, the same answers should arrive over port 443.
Raw check:
```bash
curl -H 'accept: application/dns-message' \
     -H 'content-type: application/dns-message' \
     --data-binary @query.bin \
     https://numa/dns-query
```
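The `query.bin` file used in the raw check is a DNS query in wire format (RFC 1035). If no tool is at hand to produce one, the following Rust sketch builds a minimal single-question query. The function name `build_query` is hypothetical, and the message ID is set to 0 as RFC 8484 recommends for HTTP cache friendliness:

```rust
/// Build a minimal DNS query (one question, class IN) in wire format.
/// Returns `None` if any label is empty or longer than 63 bytes.
pub fn build_query(name: &str, qtype: u16) -> Option<Vec<u8>> {
    let mut msg = Vec::with_capacity(32);
    msg.extend_from_slice(&0u16.to_be_bytes()); // ID = 0
    msg.extend_from_slice(&0x0100u16.to_be_bytes()); // flags: recursion desired
    msg.extend_from_slice(&1u16.to_be_bytes()); // QDCOUNT = 1
    msg.extend_from_slice(&[0u8; 6]); // ANCOUNT, NSCOUNT, ARCOUNT = 0

    // QNAME: each label is prefixed by its length, terminated by a zero byte.
    for label in name.trim_end_matches('.').split('.') {
        let len = label.len();
        if len == 0 || len > 63 {
            return None;
        }
        msg.push(len as u8);
        msg.extend_from_slice(label.as_bytes());
    }
    msg.push(0);

    msg.extend_from_slice(&qtype.to_be_bytes()); // QTYPE (1 = A, 28 = AAAA)
    msg.extend_from_slice(&1u16.to_be_bytes()); // QCLASS = IN
    Some(msg)
}
```

Writing the result of `build_query("example.com", 1)` to disk with `std::fs::write("query.bin", bytes)` yields a file suitable for the `curl` invocation; the response body is likewise a binary DNS message, not text.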
## Gotchas
- Port 443 is privileged on Linux/macOS. Run Numa via the provided service units, or grant `CAP_NET_BIND_SERVICE` (`sudo setcap 'cap_net_bind_service=+ep' /path/to/numa`).
- Non-matching `Host` header → HTTP 404 from the proxy's fallback handler. Double-check `tld`.
- ChromeOS enrollment rejects user-installed CAs for some flows — known pain point, see issue #136.

59
recipes/odoh-upstream.md Normal file

@@ -0,0 +1,59 @@
# ODoH upstream with bootstrap pinning
Numa can run as an Oblivious DoH (RFC 9230) client: the relay sees your IP but not the question, the target sees the question but not your IP. Neither party alone can re-identify a query. This recipe covers the minimal config and the bootstrap leak that `relay_ip` / `target_ip` close.
## When to use this
- You want split-trust encrypted DNS without a single provider seeing both who you are and what you asked.
- Numa is your system resolver (so there's no "other" DNS to ask).
## Minimal config
```toml
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
strict = true # refuse to fall back to a non-oblivious path on relay failure
```
`strict = true` means a relay-level HTTPS failure returns SERVFAIL instead of silently downgrading. Set it to `false` and configure `[upstream].fallback` if you'd rather keep resolving (at the cost of the oblivious property).
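The decision that `strict` controls can be written out as a small function. A sketch in Rust under stated assumptions: the `Outcome` enum and `on_relay_failure` function are hypothetical names for illustration, not Numa's internals, and the behaviour for `strict = false` with an empty fallback list is a guess:

```rust
/// What the resolver does after the relay request has failed.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Answer the client with SERVFAIL; nothing leaves the oblivious path.
    ServFail,
    /// Re-send the query to a non-oblivious fallback upstream.
    Fallback(String),
}

pub fn on_relay_failure(strict: Option<bool>, fallback: &[String]) -> Outcome {
    // An unset `strict` is treated as true, matching the documented default.
    if strict.unwrap_or(true) {
        return Outcome::ServFail;
    }
    match fallback.first() {
        Some(addr) => Outcome::Fallback(addr.clone()),
        // Assumption: with no fallback configured there is nowhere to
        // downgrade to, so the query still fails.
        None => Outcome::ServFail,
    }
}
```

The point of the default is visible in the first branch: a downgrade only happens when the operator has opted in twice, once with `strict = false` and once by listing a fallback.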
## The bootstrap leak
When Numa is the system resolver and needs to reach the relay/target, *something* has to translate `odoh-relay.numa.rs` → IP. If Numa asks itself, you deadlock. If Numa asks a bootstrap resolver (1.1.1.1, 9.9.9.9), that resolver learns which ODoH endpoint you use in cleartext — it can't see your questions, but it sees the destination. That's the leak ODoH was supposed to close.
`relay_ip` and `target_ip` tell Numa the IPs directly, so it never asks anyone:
```toml
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
relay_ip = "178.104.229.30" # pin the relay — no hostname lookup
target_ip = "104.16.249.249" # pin the target — no hostname lookup
```
Numa still validates TLS against the hostnames in `relay` / `target`, so a hijacked IP can't masquerade — pinning skips only the DNS step.
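Pinning amounts to a map lookup that short-circuits DNS. A simplified sketch in Rust, loosely modelled on the per-hostname override map in `src/bootstrap_resolver.rs`; the `Resolution` enum and `resolve_endpoint` function are hypothetical, and the real resolver performs the bootstrap query asynchronously rather than returning a marker value:

```rust
use std::collections::BTreeMap;
use std::net::IpAddr;

#[derive(Debug, PartialEq, Eq)]
pub enum Resolution {
    /// Pinned via `relay_ip` / `target_ip`: connect directly, no DNS query.
    Pinned(Vec<IpAddr>),
    /// Not pinned: the hostname has to be sent to a bootstrap resolver,
    /// which is the cleartext leak described in this section.
    NeedsBootstrap,
}

pub fn resolve_endpoint(host: &str, overrides: &BTreeMap<String, Vec<IpAddr>>) -> Resolution {
    match overrides.get(host) {
        Some(ips) if !ips.is_empty() => Resolution::Pinned(ips.clone()),
        _ => Resolution::NeedsBootstrap,
    }
}
```

Only hosts present in the map avoid the lookup, so pinning `relay_ip` without `target_ip` would still leave the target hostname going through the bootstrap resolver.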
## Finding current IPs
```bash
dig +short odoh-relay.numa.rs
dig +short odoh.cloudflare-dns.com
```
Re-pin when an operator rotates. The community-maintained list at <https://github.com/DNSCrypt/dnscrypt-resolvers/blob/master/v3/odoh-relays.md> is a useful cross-reference.
## Verify
```bash
kdig @127.0.0.1 example.com
```
Numa's `/queries` API and startup banner should label the upstream as `odoh://`. Look for `ODoH relay returned ...` errors in the logs if routing fails.
## Known gotchas
- **Same-operator refused.** Numa's eTLD+1 check blocks configs where the relay and target belong to the same operator (pointless — same party sees both sides). Override only when testing.
- **Single relay.** Current config accepts one relay and one target. Multi-entry rotation/failover is tracked in #140.
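To make the same-operator check concrete, here is a deliberately naive Rust sketch. It approximates eTLD+1 by taking the last two labels of each hostname, which is an assumption made to keep the example free of dependencies: a correct check needs the Public Suffix List (otherwise `a.co.uk` and `b.co.uk` would look like one operator). The helper names `host_of`, `naive_registrable`, and `same_operator` are hypothetical and do not reflect Numa's code:

```rust
/// Extract the host from an `https://` URL without external crates.
fn host_of(url: &str) -> Option<&str> {
    let rest = url.strip_prefix("https://")?;
    let authority = rest.split('/').next()?;
    let host = authority.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Naive stand-in for eTLD+1: the last two labels of the hostname.
fn naive_registrable(host: &str) -> String {
    let labels: Vec<&str> = host.trim_end_matches('.').split('.').collect();
    let start = labels.len().saturating_sub(2);
    labels[start..].join(".").to_ascii_lowercase()
}

/// `Some(true)` when relay and target appear to share an operator,
/// `Some(false)` when they differ, `None` when either URL is not `https://`.
pub fn same_operator(relay_url: &str, target_url: &str) -> Option<bool> {
    let relay = naive_registrable(host_of(relay_url)?);
    let target = naive_registrable(host_of(target_url)?);
    Some(relay == target)
}
```

With the URLs from the minimal config, the relay reduces to `numa.rs` and the target to `cloudflare-dns.com`, so the pair is accepted.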


@@ -1,14 +1,41 @@
#!/usr/bin/env bash
# Dev server for site/: regenerates drafts on each MD change, reloads the
# browser on each rendered HTML/CSS/JS change. Port is the first numeric arg
# (default 9000); any other args are ignored for back-compat.
#
# First run downloads chokidar-cli + browser-sync into the npm cache — slow
# once, instant after that.
set -euo pipefail
PORT="${1:-9000}"
PORT=9000
for arg in "$@"; do
if [[ "$arg" =~ ^[0-9]+$ ]]; then
PORT="$arg"
break
fi
done
if [[ "${1:-}" == "--drafts" ]] || [[ "${2:-}" == "--drafts" ]]; then
PORT="${PORT//--drafts/9000}" # default port if --drafts was first arg
make blog-drafts
else
make blog
fi
command -v npx >/dev/null || { echo "npx not found. Install Node.js: https://nodejs.org" >&2; exit 1; }
command -v pandoc >/dev/null || { echo "pandoc not found (required by 'make blog-drafts')." >&2; exit 1; }
echo "Serving site at http://localhost:$PORT"
cd site && python3 -m http.server "$PORT"
# Initial render so the first page load has everything.
make blog-drafts
echo "Serving site at http://localhost:$PORT (drafts included, live reload)"
# Kill child processes on exit so re-runs don't leave orphaned watchers.
trap 'kill $(jobs -p) 2>/dev/null' EXIT INT TERM
# Regenerate HTML when MD sources or the blog template change.
npx --yes chokidar-cli \
"drafts/*.md" "blog/*.md" "site/blog-template.html" \
-c "make blog-drafts" &
# Serve + reload on rendered-asset changes.
cd site && exec npx --yes browser-sync start \
--server . \
--port "$PORT" \
--files "**/*.html,**/*.css,**/*.js" \
--no-open \
--no-notify


@@ -228,6 +228,7 @@ body {
.path-bar-fill.tcp { background: var(--violet); }
.path-bar-fill.dot { background: var(--emerald); }
.path-bar-fill.doh { background: var(--teal); }
.path-bar-fill.odoh { background: var(--violet-dim); }
.path-pct {
font-family: var(--font-mono);
font-size: 0.75rem;
@@ -637,16 +638,26 @@ body {
</div>
</div>
<!-- Transport breakdown -->
<!-- Inbound wire (apps → numa) -->
<div class="panel">
<div class="panel-header">
<span class="panel-title">Transport</span>
<span class="panel-title">Inbound Wire <span style="color: var(--text-dim); font-weight: normal;">apps → numa</span></span>
<span class="panel-title" id="transportEncrypted" style="color: var(--text-dim)"></span>
</div>
<div class="panel-body" id="transportBars">
</div>
</div>
<!-- Outbound wire (numa → internet) -->
<div class="panel">
<div class="panel-header">
<span class="panel-title">Outbound Wire <span style="color: var(--text-dim); font-weight: normal;">numa → internet</span></span>
<span class="panel-title" id="upstreamWireEncrypted" style="color: var(--text-dim)"></span>
</div>
<div class="panel-body" id="upstreamWireBars">
</div>
</div>
<!-- Main grid: query log + sidebar -->
<div class="main-grid">
<!-- Query log -->
@@ -960,9 +971,11 @@ function renderBarChart(containerId, defs, data, total) {
}).join('');
}
function encryptionPct(transport) {
const total = (transport.udp + transport.tcp + transport.dot + transport.doh) || 1;
return (((transport.dot + transport.doh) / total) * 100).toFixed(0);
function encryptionPct(data, encryptedKeys, allKeys) {
const total = allKeys.reduce((s, k) => s + (data[k] || 0), 0);
if (total === 0) return 0;
const encrypted = encryptedKeys.reduce((s, k) => s + (data[k] || 0), 0);
return Math.round((encrypted / total) * 100);
}
const PATH_DEFS = [
@@ -990,9 +1003,25 @@ const TRANSPORT_DEFS = [
function renderTransport(transport) {
const total = (transport.udp + transport.tcp + transport.dot + transport.doh) || 1;
renderBarChart('transportBars', TRANSPORT_DEFS, transport, total);
const encPct = encryptionPct(transport);
const encPct = encryptionPct(transport, ['dot', 'doh'], ['udp', 'tcp', 'dot', 'doh']);
const el = document.getElementById('transportEncrypted');
el.textContent = `${encPct}% encrypted`;
el.textContent = `${encPct}% encrypted inbound`;
el.style.color = encPct >= 80 ? 'var(--emerald)' : encPct >= 50 ? 'var(--amber)' : 'var(--rose)';
}
const UPSTREAM_WIRE_DEFS = [
{ key: 'udp', label: 'UDP', cls: 'udp' },
{ key: 'doh', label: 'DoH', cls: 'doh' },
{ key: 'dot', label: 'DoT', cls: 'dot' },
{ key: 'odoh', label: 'ODoH', cls: 'odoh' },
];
function renderUpstreamWire(ut) {
const total = (ut.udp + ut.doh + ut.dot + ut.odoh) || 0;
renderBarChart('upstreamWireBars', UPSTREAM_WIRE_DEFS, ut, total || 1);
const encPct = encryptionPct(ut, ['doh', 'dot', 'odoh'], ['udp', 'doh', 'dot', 'odoh']);
const el = document.getElementById('upstreamWireEncrypted');
el.textContent = total > 0 ? `${encPct}% encrypted outbound` : '';
el.style.color = encPct >= 80 ? 'var(--emerald)' : encPct >= 50 ? 'var(--amber)' : 'var(--rose)';
}
@@ -1215,7 +1244,7 @@ async function refresh() {
// QPS calculation
const now = Date.now();
const encPct = encryptionPct(stats.transport);
const encPct = encryptionPct(stats.transport, ['dot', 'doh'], ['udp', 'tcp', 'dot', 'doh']);
if (prevTotal !== null && prevTime !== null) {
const dt = (now - prevTime) / 1000;
const dq = q.total - prevTotal;
@@ -1234,6 +1263,7 @@ async function refresh() {
// Panels
renderPaths(q);
renderTransport(stats.transport);
renderUpstreamWire(stats.upstream_transport || { udp: 0, doh: 0, dot: 0, odoh: 0 });
renderQueryLog(logs);
renderOverrides(overrides);
renderCache(cache);
@@ -1243,6 +1273,7 @@ async function refresh() {
renderMemory(stats.memory, stats);
} catch (err) {
console.error('[numa dashboard] render failed:', err);
document.getElementById('statusDot').className = 'status-dot error';
document.getElementById('statusText').textContent = 'disconnected';
}


@@ -83,8 +83,13 @@ pub fn router(ctx: Arc<ServerCtx>) -> Router {
}
async fn dashboard() -> impl IntoResponse {
// Revalidate each load so browsers don't keep serving a stale
// dashboard across numa upgrades.
(
[(header::CONTENT_TYPE, "text/html; charset=utf-8")],
[
(header::CONTENT_TYPE, "text/html; charset=utf-8"),
(header::CACHE_CONTROL, "no-cache"),
],
DASHBOARD_HTML,
)
}
@@ -170,6 +175,7 @@ struct StatsResponse {
srtt: bool,
queries: QueriesStats,
transport: TransportStats,
upstream_transport: UpstreamTransportStats,
cache: CacheStats,
overrides: OverrideStats,
blocking: BlockingStatsResponse,
@@ -186,6 +192,14 @@ struct TransportStats {
doh: u64,
}
#[derive(Serialize)]
struct UpstreamTransportStats {
udp: u64,
doh: u64,
dot: u64,
odoh: u64,
}
#[derive(Serialize)]
struct MobileStatsResponse {
enabled: bool,
@@ -566,6 +580,12 @@ async fn stats(State(ctx): State<Arc<ServerCtx>>) -> Json<StatsResponse> {
dot: snap.transport_dot,
doh: snap.transport_doh,
},
upstream_transport: UpstreamTransportStats {
udp: snap.upstream_transport_udp,
doh: snap.upstream_transport_doh,
dot: snap.upstream_transport_dot,
odoh: snap.upstream_transport_odoh,
},
cache: CacheStats {
entries: cache_len,
max_entries: cache_max,
@@ -1229,6 +1249,13 @@ mod tests {
.await
.unwrap();
assert_eq!(resp.status(), 200);
assert_eq!(
resp.headers()
.get(header::CACHE_CONTROL)
.map(|v| v.to_str().unwrap()),
Some("no-cache"),
"dashboard must revalidate to avoid stale HTML across upgrades"
);
let body = axum::body::to_bytes(resp.into_body(), 100000)
.await
.unwrap();


@@ -1,5 +1,5 @@
use std::collections::HashSet;
use std::time::Instant;
use std::time::{Duration, Instant};
use log::{info, warn};
@@ -355,27 +355,144 @@ mod tests {
}
}
pub async fn download_blocklists(lists: &[String]) -> Vec<(String, String)> {
let client = reqwest::Client::builder()
.timeout(std::time::Duration::from_secs(30))
.gzip(true)
.build()
.unwrap_or_default();
const RETRY_DELAYS_SECS: &[u64] = &[2, 10, 30];
let mut results = Vec::new();
pub async fn download_blocklists(
lists: &[String],
resolver: Option<std::sync::Arc<crate::bootstrap_resolver::NumaResolver>>,
) -> Vec<(String, String)> {
let mut builder = reqwest::Client::builder()
.timeout(Duration::from_secs(30))
.gzip(true);
if let Some(r) = resolver {
builder = builder.dns_resolver(r);
}
let client = builder.build().unwrap_or_default();
for url in lists {
match client.get(url).send().await {
Ok(resp) => match resp.text().await {
Ok(text) => {
info!("downloaded blocklist: {} ({} bytes)", url, text.len());
results.push((url.clone(), text));
}
Err(e) => warn!("failed to read blocklist body {}: {}", url, e),
},
Err(e) => warn!("failed to download blocklist {}: {}", url, e),
let fetches = lists.iter().map(|url| {
let client = &client;
async move {
let text = fetch_with_retry(client, url).await?;
info!("downloaded blocklist: {} ({} bytes)", url, text.len());
Some((url.clone(), text))
}
});
futures::future::join_all(fetches)
.await
.into_iter()
.flatten()
.collect()
}
async fn fetch_with_retry(client: &reqwest::Client, url: &str) -> Option<String> {
fetch_with_retry_delays(client, url, RETRY_DELAYS_SECS).await
}
async fn fetch_with_retry_delays(
client: &reqwest::Client,
url: &str,
delays: &[u64],
) -> Option<String> {
let total = delays.len() + 1;
for attempt in 1..=total {
match fetch_once(client, url).await {
Ok(text) => return Some(text),
Err(msg) if attempt < total => {
let delay = delays[attempt - 1];
warn!(
"blocklist {} attempt {}/{} failed: {} — retrying in {}s",
url, attempt, total, msg, delay
);
tokio::time::sleep(Duration::from_secs(delay)).await;
}
Err(msg) => {
warn!(
"blocklist {} attempt {}/{} failed: {} — giving up",
url, attempt, total, msg
);
}
}
}
results
None
}
async fn fetch_once(client: &reqwest::Client, url: &str) -> Result<String, String> {
let resp = client
.get(url)
.send()
.await
.map_err(|e| format_error_chain(&e))?;
resp.text().await.map_err(|e| format_error_chain(&e))
}
fn format_error_chain(e: &(dyn std::error::Error + 'static)) -> String {
let mut parts = vec![e.to_string()];
let mut src = e.source();
while let Some(s) = src {
parts.push(s.to_string());
src = s.source();
}
parts.join(": ")
}
#[cfg(test)]
mod retry_tests {
use super::*;
use std::net::SocketAddr;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpListener;
async fn flaky_http_server(drop_first_n: usize, body: &'static str) -> SocketAddr {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let addr = listener.local_addr().unwrap();
tokio::spawn(async move {
for _ in 0..drop_first_n {
if let Ok((sock, _)) = listener.accept().await {
drop(sock);
}
}
loop {
let Ok((mut sock, _)) = listener.accept().await else {
return;
};
tokio::spawn(async move {
let mut buf = [0u8; 2048];
let _ = sock.read(&mut buf).await;
let response = format!(
"HTTP/1.1 200 OK\r\nContent-Length: {}\r\nContent-Type: text/plain\r\nConnection: close\r\n\r\n{}",
body.len(),
body,
);
let _ = sock.write_all(response.as_bytes()).await;
let _ = sock.shutdown().await;
});
}
});
addr
}
fn zero_delays() -> Vec<u64> {
vec![0; RETRY_DELAYS_SECS.len()]
}
#[tokio::test]
async fn retry_succeeds_on_final_attempt() {
let body = "ads.example.com\ntracker.example.net\n";
let delays = zero_delays();
let addr = flaky_http_server(delays.len(), body).await;
let client = reqwest::Client::new();
let url = format!("http://{addr}/");
let result = fetch_with_retry_delays(&client, &url, &delays).await;
assert_eq!(result.as_deref(), Some(body));
}
#[tokio::test]
async fn retry_gives_up_when_all_attempts_fail() {
let delays = zero_delays();
let addr = flaky_http_server(delays.len() + 2, "unreachable").await;
let client = reqwest::Client::new();
let url = format!("http://{addr}/");
let result = fetch_with_retry_delays(&client, &url, &delays).await;
assert_eq!(result, None);
}
}

234
src/bootstrap_resolver.rs Normal file

@@ -0,0 +1,234 @@
//! `reqwest` DNS resolver used by numa-originated HTTPS (DoH upstream, ODoH
//! relay/target, blocklist CDN). When numa is its own system resolver
//! (`/etc/resolv.conf → 127.0.0.1`, HAOS add-on, Pi-hole-style container),
//! the default `getaddrinfo` path loops back through numa before numa can
//! answer — a chicken-and-egg that deadlocks cold boot. See issue #122.
//!
//! Resolution order per hostname:
//! 1. Per-hostname overrides (e.g. ODoH `relay_ip` / `target_ip`) → return
//! immediately, no DNS query. Preserves ODoH's "zero plain-DNS leak"
//! property for configured endpoints.
//! 2. Otherwise, query A + AAAA in parallel via UDP to IP-literal bootstrap
//! servers, with TCP fallback on UDP timeout (for networks that block
//! outbound UDP:53 — see memory: `project_network_udp_hostile.md`).
use std::collections::BTreeMap;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use std::time::Duration;
use log::{debug, info, warn};
use reqwest::dns::{Addrs, Name, Resolve, Resolving};
use crate::forward::{forward_tcp, forward_udp};
use crate::packet::DnsPacket;
use crate::question::QueryType;
use crate::record::DnsRecord;
const UDP_TIMEOUT: Duration = Duration::from_millis(800);
const TCP_TIMEOUT: Duration = Duration::from_millis(1500);
const DEFAULT_BOOTSTRAP: &[SocketAddr] = &[
SocketAddr::new(IpAddr::V4(Ipv4Addr::new(9, 9, 9, 9)), 53),
SocketAddr::new(IpAddr::V4(Ipv4Addr::new(1, 1, 1, 1)), 53),
];
pub struct NumaResolver {
bootstrap: Vec<SocketAddr>,
overrides: BTreeMap<String, Vec<IpAddr>>,
}
impl NumaResolver {
/// Build a resolver from the configured `upstream.fallback` list and any
/// per-hostname overrides (e.g. ODoH's `relay_ip`/`target_ip`).
///
/// `fallback` entries are filtered to IP literals only — hostnames would
/// re-introduce the self-loop inside the resolver itself. Empty or
/// unusable fallback yields the hardcoded default (Quad9 + Cloudflare).
pub fn new(fallback: &[String], overrides: BTreeMap<String, Vec<IpAddr>>) -> Self {
let mut bootstrap: Vec<SocketAddr> = Vec::with_capacity(fallback.len());
for entry in fallback {
match crate::forward::parse_upstream_addr(entry, 53) {
Ok(addr) => bootstrap.push(addr),
Err(_) => {
warn!(
"bootstrap_resolver: skipping non-IP fallback '{}' \
(hostnames would re-enter the self-loop)",
entry
);
}
}
}
let source = if bootstrap.is_empty() {
bootstrap = DEFAULT_BOOTSTRAP.to_vec();
"default (no IP-literal in upstream.fallback)"
} else {
"upstream.fallback"
};
let ips: Vec<String> = bootstrap.iter().map(|s| s.ip().to_string()).collect();
info!(
"bootstrap resolver: {} via {} — used for numa-originated HTTPS hostname resolution",
ips.join(", "),
source
);
if !overrides.is_empty() {
let pairs: Vec<String> = overrides
.iter()
.flat_map(|(host, addrs)| addrs.iter().map(move |ip| format!("{}={}", host, ip)))
.collect();
info!(
"bootstrap resolver: host overrides (skip DNS, connect direct): {}",
pairs.join(", ")
);
}
Self {
bootstrap,
overrides,
}
}
#[cfg(test)]
pub fn bootstrap(&self) -> &[SocketAddr] {
&self.bootstrap
}
}
impl Resolve for NumaResolver {
fn resolve(&self, name: Name) -> Resolving {
let hostname = name.as_str().to_string();
if let Some(ips) = self.overrides.get(&hostname) {
let addrs: Vec<SocketAddr> = ips.iter().map(|ip| SocketAddr::new(*ip, 0)).collect();
debug!(
"bootstrap_resolver: override hit for {} → {:?}",
hostname, ips
);
return Box::pin(async move { Ok(Box::new(addrs.into_iter()) as Addrs) });
}
let bootstrap = self.bootstrap.clone();
Box::pin(async move {
let addrs = resolve_via_bootstrap(&hostname, &bootstrap).await?;
debug!(
"bootstrap_resolver: resolved {} → {} addr(s)",
hostname,
addrs.len()
);
Ok(Box::new(addrs.into_iter()) as Addrs)
})
}
}
async fn resolve_via_bootstrap(
hostname: &str,
bootstrap: &[SocketAddr],
) -> Result<Vec<SocketAddr>, Box<dyn std::error::Error + Send + Sync>> {
let mut last_err: Option<String> = None;
for &server in bootstrap {
let q_a = DnsPacket::query(0xBEEF, hostname, QueryType::A);
let q_aaaa = DnsPacket::query(0xBEF0, hostname, QueryType::AAAA);
let (a_res, aaaa_res) = tokio::join!(
query_with_tcp_fallback(&q_a, server),
query_with_tcp_fallback(&q_aaaa, server),
);
let mut out = Vec::new();
match a_res {
Ok(pkt) => extract_addrs(&pkt, &mut out),
Err(e) => last_err = Some(format!("{} A failed: {}", server, e)),
}
match aaaa_res {
Ok(pkt) => extract_addrs(&pkt, &mut out),
// AAAA is optional — many hosts return NXDOMAIN/empty. Don't
// treat as the primary error if A succeeded.
Err(e) => debug!("bootstrap {} AAAA for {} failed: {}", server, hostname, e),
}
if !out.is_empty() {
return Ok(out);
}
}
Err(last_err
.unwrap_or_else(|| "no bootstrap servers reachable".into())
.into())
}
async fn query_with_tcp_fallback(
query: &DnsPacket,
server: SocketAddr,
) -> crate::Result<DnsPacket> {
match forward_udp(query, server, UDP_TIMEOUT).await {
Ok(pkt) => Ok(pkt),
Err(e) => {
debug!(
"bootstrap UDP {} failed ({}), falling back to TCP",
server, e
);
forward_tcp(query, server, TCP_TIMEOUT).await
}
}
}
fn extract_addrs(pkt: &DnsPacket, out: &mut Vec<SocketAddr>) {
for r in &pkt.answers {
match r {
DnsRecord::A { addr, .. } => out.push(SocketAddr::new(IpAddr::V4(*addr), 0)),
DnsRecord::AAAA { addr, .. } => out.push(SocketAddr::new(IpAddr::V6(*addr), 0)),
_ => {}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::net::{Ipv4Addr, Ipv6Addr};
#[test]
fn empty_fallback_uses_defaults() {
let r = NumaResolver::new(&[], BTreeMap::new());
let got: Vec<String> = r.bootstrap().iter().map(|s| s.to_string()).collect();
assert_eq!(got, vec!["9.9.9.9:53", "1.1.1.1:53"]);
}
#[test]
fn fallback_accepts_ip_literals_only() {
let fallback = vec![
"9.9.9.9".to_string(),
"dns.quad9.net".to_string(),
"1.1.1.1:5353".to_string(),
];
let r = NumaResolver::new(&fallback, BTreeMap::new());
let got: Vec<String> = r.bootstrap().iter().map(|s| s.to_string()).collect();
assert_eq!(got, vec!["9.9.9.9:53", "1.1.1.1:5353"]);
}
#[test]
fn override_returns_configured_ips_without_dns() {
let mut overrides = BTreeMap::new();
overrides.insert(
"odoh-relay.example".to_string(),
vec![IpAddr::V4(Ipv4Addr::new(178, 104, 229, 30))],
);
let r = NumaResolver::new(&[], overrides);
let name: Name = "odoh-relay.example".parse().unwrap();
let fut = r.resolve(name);
let res = futures::executor::block_on(fut).unwrap();
let addrs: Vec<_> = res.collect();
assert_eq!(addrs.len(), 1);
assert_eq!(addrs[0].ip(), IpAddr::V4(Ipv4Addr::new(178, 104, 229, 30)));
}
#[test]
fn override_supports_multiple_ips_including_ipv6() {
let mut overrides = BTreeMap::new();
overrides.insert(
"dual.example".to_string(),
vec![
IpAddr::V4(Ipv4Addr::new(1, 2, 3, 4)),
IpAddr::V6(Ipv6Addr::LOCALHOST),
],
);
let r = NumaResolver::new(&[], overrides);
let res = futures::executor::block_on(r.resolve("dual.example".parse().unwrap())).unwrap();
let addrs: Vec<_> = res.collect();
assert_eq!(addrs.len(), 2);
}
}


@@ -1,7 +1,7 @@
use std::collections::HashMap;
use std::net::Ipv4Addr;
use std::net::Ipv6Addr;
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr};
use std::path::{Path, PathBuf};
use std::time::Duration;
use serde::Deserialize;
@@ -41,17 +41,30 @@ pub struct Config {
pub struct ForwardingRuleConfig {
#[serde(deserialize_with = "string_or_vec")]
pub suffix: Vec<String>,
pub upstream: String,
#[serde(deserialize_with = "string_or_vec")]
pub upstream: Vec<String>,
}
impl ForwardingRuleConfig {
fn to_runtime_rules(&self) -> Result<Vec<crate::system_dns::ForwardingRule>> {
let upstream = crate::forward::parse_upstream(&self.upstream, 53)
.map_err(|e| format!("forwarding rule for upstream '{}': {}", self.upstream, e))?;
if self.upstream.is_empty() {
return Err(format!(
"forwarding rule for suffix {:?}: upstream must not be empty",
self.suffix
)
.into());
}
let mut primary = Vec::with_capacity(self.upstream.len());
for s in &self.upstream {
let u = crate::forward::parse_upstream(s, 53, None)
.map_err(|e| format!("forwarding rule for upstream '{}': {}", s, e))?;
primary.push(u);
}
let pool = crate::forward::UpstreamPool::new(primary, vec![]);
Ok(self
.suffix
.iter()
.map(|s| crate::system_dns::ForwardingRule::new(s.clone(), upstream.clone()))
.map(|s| crate::system_dns::ForwardingRule::new(s.clone(), pool.clone()))
.collect())
}
}
@@ -80,6 +93,12 @@ pub struct ServerConfig {
/// Defaults to `crate::data_dir()` (platform-specific system path) if unset.
#[serde(default)]
pub data_dir: Option<PathBuf>,
/// Synthesize NODATA (NOERROR + empty answer) for AAAA queries, and
/// strip `ipv6hint` from HTTPS/SVCB responses (RFC 9460). For IPv4-only
/// networks where Happy Eyeballs fallback adds latency. Local zones,
/// overrides, and the service proxy are not affected. Default false.
#[serde(default)]
pub filter_aaaa: bool,
}
impl Default for ServerConfig {
@@ -89,6 +108,7 @@ impl Default for ServerConfig {
api_port: default_api_port(),
api_bind_addr: default_api_bind_addr(),
data_dir: None,
filter_aaaa: false,
}
}
}
@@ -114,6 +134,7 @@ pub enum UpstreamMode {
#[default]
Forward,
Recursive,
Odoh,
}
impl UpstreamMode {
@@ -122,6 +143,20 @@ impl UpstreamMode {
UpstreamMode::Auto => "auto",
UpstreamMode::Forward => "forward",
UpstreamMode::Recursive => "recursive",
UpstreamMode::Odoh => "odoh",
}
}
/// Hedging duplicates the in-flight query against the same upstream to
/// rescue tail latency. Beneficial for UDP/DoH/DoT (cheap retransmit /
/// h2 stream multiplexing). For ODoH it doubles the relay's HPKE
/// seal/unseal load and the sealed-byte footprint a passive observer
/// can correlate, with no latency win — the relay hop dominates either
/// way. Force-zero in oblivious mode regardless of `hedge_ms`.
pub fn hedge_delay(self, hedge_ms: u64) -> Duration {
match self {
UpstreamMode::Odoh => Duration::ZERO,
_ => Duration::from_millis(hedge_ms),
}
}
}
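The mode-dependent hedge delay can be illustrated in isolation. This is a sketch with a stand-in enum (`Mode` and `effective_hedge_delay` are hypothetical names), mirroring the rule that ODoH forces the delay to zero whatever `hedge_ms` says:

```rust
use std::time::Duration;

/// Simplified stand-in for the upstream mode enum (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    Forward,
    Recursive,
    Odoh,
}

/// Effective hedge delay: ODoH always yields zero, other modes pass the
/// configured value through unchanged.
pub fn effective_hedge_delay(mode: Mode, hedge_ms: u64) -> Duration {
    match mode {
        Mode::Odoh => Duration::ZERO,
        _ => Duration::from_millis(hedge_ms),
    }
}
```

Keeping the rule on the mode type means every call site that derives a delay from config gets the ODoH exception without repeating the check.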
@@ -134,7 +169,7 @@ pub struct UpstreamConfig {
pub address: Vec<String>,
#[serde(default = "default_upstream_port")]
pub port: u16,
#[serde(default)]
#[serde(default, deserialize_with = "string_or_vec")]
pub fallback: Vec<String>,
#[serde(default = "default_timeout_ms")]
pub timeout_ms: u64,
@@ -146,6 +181,30 @@ pub struct UpstreamConfig {
pub prime_tlds: Vec<String>,
#[serde(default = "default_srtt")]
pub srtt: bool,
/// Only used when `mode = "odoh"`. Full https:// URL of the relay
/// endpoint (including path, e.g. `https://odoh-relay.numa.rs/relay`).
#[serde(default)]
pub relay: Option<String>,
/// Only used when `mode = "odoh"`. Full https:// URL of the target
/// resolver (including path, e.g. `https://odoh.cloudflare-dns.com/dns-query`).
#[serde(default)]
pub target: Option<String>,
/// Only used when `mode = "odoh"`. When true (the default), relay failure
/// returns SERVFAIL instead of downgrading to the `fallback` upstream —
/// a user who configured ODoH rarely wants a silent non-oblivious path.
#[serde(default)]
pub strict: Option<bool>,
/// Bootstrap IP for the relay host, used when numa is its own system
/// resolver (otherwise the ODoH HTTPS client loops resolving through
/// itself). TLS still validates the cert against `relay`'s hostname.
#[serde(default)]
pub relay_ip: Option<IpAddr>,
/// Same as `relay_ip` but for the target host.
#[serde(default)]
pub target_ip: Option<IpAddr>,
}
impl Default for UpstreamConfig {
@@ -160,10 +219,128 @@ impl Default for UpstreamConfig {
root_hints: default_root_hints(),
prime_tlds: default_prime_tlds(),
srtt: default_srtt(),
relay: None,
target: None,
strict: None,
relay_ip: None,
target_ip: None,
}
}
}
/// Parsed ODoH config fields. `mode = "odoh"` requires both URLs to be
/// present, to use the `https://` scheme, and to name distinct hosts that
/// do not share a registrable domain (eTLD+1).
#[derive(Debug)]
pub struct OdohUpstream {
pub relay_url: String,
pub relay_host: String,
pub target_host: String,
pub target_path: String,
pub strict: bool,
pub relay_bootstrap: Option<SocketAddr>,
pub target_bootstrap: Option<SocketAddr>,
}
impl OdohUpstream {
/// Per-host IP overrides for the bootstrap resolver, lifted from
/// `relay_ip`/`target_ip`. Keeps the "zero plain-DNS leak for ODoH
/// endpoints" property when numa is its own system resolver.
pub fn host_ip_overrides(&self) -> std::collections::BTreeMap<String, Vec<std::net::IpAddr>> {
let mut out = std::collections::BTreeMap::new();
if let Some(addr) = self.relay_bootstrap {
out.entry(self.relay_host.clone())
.or_insert_with(Vec::new)
.push(addr.ip());
}
if let Some(addr) = self.target_bootstrap {
out.entry(self.target_host.clone())
.or_insert_with(Vec::new)
.push(addr.ip());
}
out
}
}
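The override map built above can be tried out with the standard library alone. The sketch below uses a trimmed-down, hypothetical `Endpoints` struct in place of the parsed ODoH config, and `or_default()` in place of `or_insert_with(Vec::new)`:

```rust
use std::collections::BTreeMap;
use std::net::{IpAddr, SocketAddr};

/// Hypothetical, trimmed-down stand-in for the parsed ODoH config.
pub struct Endpoints {
    pub relay_host: String,
    pub target_host: String,
    pub relay_bootstrap: Option<SocketAddr>,
    pub target_bootstrap: Option<SocketAddr>,
}

/// Collect per-host IP overrides; hosts without a bootstrap address are omitted.
pub fn host_ip_overrides(e: &Endpoints) -> BTreeMap<String, Vec<IpAddr>> {
    let mut out: BTreeMap<String, Vec<IpAddr>> = BTreeMap::new();
    let pairs = [
        (&e.relay_host, e.relay_bootstrap),
        (&e.target_host, e.target_bootstrap),
    ];
    for (host, addr) in pairs {
        if let Some(addr) = addr {
            // Only the IP is kept; the port stays with the SocketAddr.
            out.entry(host.clone()).or_default().push(addr.ip());
        }
    }
    out
}
```

A `BTreeMap` keeps the hosts in a stable order, which is convenient when the map is logged or compared in tests.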
impl UpstreamConfig {
/// Validate and extract ODoH-specific fields. Called during `load_config`
/// so misconfigured ODoH fails fast at startup, the same care we take
/// with the DNSSEC strict boot check.
pub fn odoh_upstream(&self) -> Result<OdohUpstream> {
let relay = self
.relay
.as_deref()
.ok_or("mode = \"odoh\" requires upstream.relay")?;
let target = self
.target
.as_deref()
.ok_or("mode = \"odoh\" requires upstream.target")?;
let relay_url = reqwest::Url::parse(relay)
.map_err(|e| format!("upstream.relay invalid URL '{}': {}", relay, e))?;
let target_url = reqwest::Url::parse(target)
.map_err(|e| format!("upstream.target invalid URL '{}': {}", target, e))?;
if relay_url.scheme() != "https" || target_url.scheme() != "https" {
return Err("upstream.relay and upstream.target must both use https://".into());
}
let relay_host = relay_url
.host_str()
.ok_or("upstream.relay must include a host")?
.to_string();
let target_host = target_url
.host_str()
.ok_or("upstream.target must include a host")?
.to_string();
if relay_host == target_host {
return Err(format!(
"upstream.relay and upstream.target resolve to the same host ({}); the privacy property requires distinct operators",
relay_host
)
.into());
}
if let Some(shared) = shared_registrable_domain(&relay_host, &target_host) {
return Err(format!(
"upstream.relay ({}) and upstream.target ({}) share the registrable domain ({}); the privacy property requires distinct operators",
relay_host, target_host, shared
)
.into());
}
let target_path = if target_url.path().is_empty() {
"/".to_string()
} else {
target_url.path().to_string()
};
let relay_port = relay_url.port_or_known_default().unwrap_or(443);
let target_port = target_url.port_or_known_default().unwrap_or(443);
Ok(OdohUpstream {
relay_url: relay.to_string(),
relay_host,
target_host,
target_path,
strict: self.strict.unwrap_or(true),
relay_bootstrap: self.relay_ip.map(|ip| SocketAddr::new(ip, relay_port)),
target_bootstrap: self.target_ip.map(|ip| SocketAddr::new(ip, target_port)),
})
}
}
/// Returns the registrable domain (eTLD+1) shared by both hosts, if any.
/// Fails open on hosts the PSL can't parse (IP literals, bare TLDs).
fn shared_registrable_domain(relay_host: &str, target_host: &str) -> Option<String> {
let relay = psl::domain(relay_host.as_bytes())?;
let target = psl::domain(target_host.as_bytes())?;
if relay.as_bytes() == target.as_bytes() {
std::str::from_utf8(relay.as_bytes())
.ok()
.map(str::to_owned)
} else {
None
}
}
fn string_or_vec<'de, D>(deserializer: D) -> std::result::Result<Vec<String>, D::Error>
where
D: serde::Deserializer<'de>,
@@ -274,8 +451,12 @@ fn default_upstream_port() -> u16 {
fn default_timeout_ms() -> u64 {
5000
}
/// Off by default: hedging fires a second upstream query, which silently
/// doubles the count at the provider — hurts quota'd DNS (NextDNS, Control
/// D). Opt in with `hedge_ms = 10` for tail-latency rescue on flaky nets
/// or handshake-slow DoT.
fn default_hedge_ms() -> u64 {
10
0
}
#[derive(Deserialize)]
@@ -567,6 +748,17 @@ mod tests {
assert!(config.lan.enabled);
}
#[test]
fn filter_aaaa_defaults_false() {
assert!(!ServerConfig::default().filter_aaaa);
}
#[test]
fn filter_aaaa_parses_from_server_section() {
let config: Config = toml::from_str("[server]\nfilter_aaaa = true").unwrap();
assert!(config.server.filter_aaaa);
}
#[test]
fn custom_bind_addrs_parse() {
let toml = r#"
@@ -612,12 +804,22 @@ mod tests {
}
#[test]
fn fallback_parses() {
fn fallback_array_parses() {
let config: Config =
toml::from_str("[upstream]\nfallback = [\"8.8.8.8\", \"1.1.1.1\"]").unwrap();
assert_eq!(config.upstream.fallback, vec!["8.8.8.8", "1.1.1.1"]);
}
#[test]
fn fallback_string_parses_as_singleton_vec() {
let config: Config =
toml::from_str("[upstream]\nfallback = \"tls://1.1.1.1#cloudflare-dns.com\"").unwrap();
assert_eq!(
config.upstream.fallback,
vec!["tls://1.1.1.1#cloudflare-dns.com"]
);
}
#[test]
fn empty_address_gives_empty_vec() {
let config: Config = toml::from_str("").unwrap();
@@ -625,6 +827,222 @@ mod tests {
assert!(config.upstream.fallback.is_empty());
}
// ── [upstream] mode = "odoh" ────────────────────────────────────────
#[test]
fn odoh_config_parses_and_validates() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
assert!(matches!(config.upstream.mode, UpstreamMode::Odoh));
let odoh = config.upstream.odoh_upstream().unwrap();
assert_eq!(odoh.relay_url, "https://odoh-relay.numa.rs/relay");
assert_eq!(odoh.target_host, "odoh.cloudflare-dns.com");
assert_eq!(odoh.target_path, "/dns-query");
assert!(odoh.strict, "strict defaults to true under mode=odoh");
}
#[test]
fn odoh_strict_false_is_honoured() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
strict = false
"#;
let config: Config = toml::from_str(toml).unwrap();
assert!(!config.upstream.odoh_upstream().unwrap().strict);
}
#[test]
fn odoh_rejects_same_host_relay_and_target() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh.example.com/relay"
target = "https://odoh.example.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("same host"), "got: {err}");
}
#[test]
fn odoh_rejects_shared_registrable_domain() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://r.cloudflare.com/relay"
target = "https://odoh.cloudflare.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("registrable domain"), "got: {err}");
assert!(err.contains("cloudflare.com"), "got: {err}");
}
#[test]
fn odoh_rejects_shared_registrable_under_multi_label_suffix() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://a.foo.co.uk/relay"
target = "https://b.foo.co.uk/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("foo.co.uk"), "got: {err}");
}
#[test]
fn odoh_accepts_distinct_registrable_under_multi_label_suffix() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://relay.foo.co.uk/relay"
target = "https://target.bar.co.uk/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
assert!(config.upstream.odoh_upstream().is_ok());
}
#[test]
fn odoh_accepts_distinct_private_psl_suffix_subdomains() {
// github.io is itself a public suffix (PSL private section), so
// foo.github.io and bar.github.io are independent registrable domains:
// accept.
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://foo.github.io/relay"
target = "https://bar.github.io/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
assert!(config.upstream.odoh_upstream().is_ok());
}
#[test]
fn odoh_rejects_non_https() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "http://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("https"), "got: {err}");
}
#[test]
fn odoh_missing_relay_rejected() {
let toml = r#"
[upstream]
mode = "odoh"
target = "https://odoh.cloudflare-dns.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("upstream.relay"), "got: {err}");
}
#[test]
fn odoh_bootstrap_ips_parse_into_socket_addrs() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
relay_ip = "178.104.229.30"
target_ip = "104.16.249.249"
"#;
let config: Config = toml::from_str(toml).unwrap();
let odoh = config.upstream.odoh_upstream().unwrap();
assert_eq!(odoh.relay_host, "odoh-relay.numa.rs");
assert_eq!(
odoh.relay_bootstrap.unwrap().to_string(),
"178.104.229.30:443"
);
assert_eq!(
odoh.target_bootstrap.unwrap().to_string(),
"104.16.249.249:443"
);
}
#[test]
fn odoh_bootstrap_ips_optional() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
"#;
let config: Config = toml::from_str(toml).unwrap();
let odoh = config.upstream.odoh_upstream().unwrap();
assert!(odoh.relay_bootstrap.is_none());
assert!(odoh.target_bootstrap.is_none());
}
#[test]
fn odoh_bootstrap_ip_rejects_garbage() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
relay_ip = "not-an-ip"
"#;
let err = toml::from_str::<Config>(toml).err().unwrap().to_string();
assert!(err.contains("relay_ip"), "got: {err}");
}
#[test]
fn odoh_bootstrap_uses_url_port_when_non_default() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs:8443/relay"
target = "https://odoh.cloudflare-dns.com/dns-query"
relay_ip = "178.104.229.30"
"#;
let config: Config = toml::from_str(toml).unwrap();
let odoh = config.upstream.odoh_upstream().unwrap();
assert_eq!(
odoh.relay_bootstrap.unwrap().to_string(),
"178.104.229.30:8443"
);
}
#[test]
fn hedge_delay_zeroed_for_odoh_mode() {
assert_eq!(
UpstreamMode::Odoh.hedge_delay(50),
Duration::ZERO,
"ODoH mode must zero hedge regardless of configured hedge_ms"
);
assert_eq!(
UpstreamMode::Forward.hedge_delay(50),
Duration::from_millis(50),
"non-ODoH modes honour configured hedge_ms"
);
}
#[test]
fn odoh_missing_target_rejected() {
let toml = r#"
[upstream]
mode = "odoh"
relay = "https://odoh-relay.numa.rs/relay"
"#;
let config: Config = toml::from_str(toml).unwrap();
let err = config.upstream.odoh_upstream().unwrap_err().to_string();
assert!(err.contains("upstream.target"), "got: {err}");
}
// ── issue #82: [[forwarding]] config section ────────────────────────
#[test]
@@ -643,7 +1061,7 @@ mod tests {
let config: Config = toml::from_str(toml).unwrap();
assert_eq!(config.forwarding.len(), 1);
assert_eq!(config.forwarding[0].suffix, &["home.local"]);
assert_eq!(config.forwarding[0].upstream, "100.90.1.63:5361");
assert_eq!(config.forwarding[0].upstream, vec!["100.90.1.63:5361"]);
}
#[test]
@@ -671,7 +1089,7 @@ mod tests {
"#;
let config: Config = toml::from_str(toml).unwrap();
assert_eq!(config.forwarding.len(), 2);
assert_eq!(config.forwarding[1].upstream, "10.0.0.1");
assert_eq!(config.forwarding[1].upstream, vec!["10.0.0.1"]);
}
#[test]
@@ -693,28 +1111,29 @@ mod tests {
fn forwarding_suffix_array_expands_to_multiple_runtime_rules() {
let rule = ForwardingRuleConfig {
suffix: vec!["168.192.in-addr.arpa".to_string(), "onsite".to_string()],
upstream: "192.168.88.1".to_string(),
upstream: vec!["192.168.88.1".to_string()],
};
let runtime = rule.to_runtime_rules().unwrap();
assert_eq!(runtime.len(), 2);
assert_eq!(runtime[0].suffix, "168.192.in-addr.arpa");
assert_eq!(runtime[1].suffix, "onsite");
assert_eq!(runtime[0].upstream, runtime[1].upstream);
assert_eq!(
runtime[0].upstream.preferred(),
runtime[1].upstream.preferred()
);
}
#[test]
fn forwarding_upstream_with_explicit_port() {
let rule = ForwardingRuleConfig {
suffix: vec!["home.local".to_string()],
upstream: "100.90.1.63:5361".to_string(),
upstream: vec!["100.90.1.63:5361".to_string()],
};
let runtime = rule.to_runtime_rules().unwrap();
assert_eq!(runtime.len(), 1);
assert!(matches!(
runtime[0].upstream,
crate::forward::Upstream::Udp(_)
));
assert_eq!(runtime[0].upstream.to_string(), "100.90.1.63:5361");
let preferred = runtime[0].upstream.preferred().unwrap();
assert!(matches!(preferred, crate::forward::Upstream::Udp(_)));
assert_eq!(preferred.to_string(), "100.90.1.63:5361");
assert_eq!(runtime[0].suffix, "home.local");
}
@@ -722,17 +1141,20 @@ mod tests {
fn forwarding_upstream_defaults_to_port_53() {
let rule = ForwardingRuleConfig {
suffix: vec!["home.local".to_string()],
upstream: "100.90.1.63".to_string(),
upstream: vec!["100.90.1.63".to_string()],
};
let runtime = rule.to_runtime_rules().unwrap();
assert_eq!(runtime[0].upstream.to_string(), "100.90.1.63:53");
assert_eq!(
runtime[0].upstream.preferred().unwrap().to_string(),
"100.90.1.63:53"
);
}
#[test]
fn forwarding_invalid_upstream_returns_error() {
let rule = ForwardingRuleConfig {
suffix: vec!["home.local".to_string()],
upstream: "not-a-valid-host".to_string(),
upstream: vec!["not-a-valid-host".to_string()],
};
assert!(rule.to_runtime_rules().is_err());
}
@@ -741,14 +1163,14 @@ mod tests {
fn forwarding_upstream_accepts_dot_scheme() {
let rule = ForwardingRuleConfig {
suffix: vec!["google.com".to_string()],
upstream: "tls://9.9.9.9#dns.quad9.net".to_string(),
upstream: vec!["tls://9.9.9.9#dns.quad9.net".to_string()],
};
let runtime = rule
.to_runtime_rules()
.expect("tls:// upstream should parse");
assert_eq!(runtime.len(), 1);
assert_eq!(
runtime[0].upstream.to_string(),
runtime[0].upstream.preferred().unwrap().to_string(),
"tls://9.9.9.9:853#dns.quad9.net"
);
}
@@ -757,14 +1179,14 @@ mod tests {
fn forwarding_upstream_accepts_doh_scheme() {
let rule = ForwardingRuleConfig {
suffix: vec!["goog".to_string()],
upstream: "https://dns.quad9.net/dns-query".to_string(),
upstream: vec!["https://dns.quad9.net/dns-query".to_string()],
};
let runtime = rule
.to_runtime_rules()
.expect("https:// upstream should parse");
assert_eq!(runtime.len(), 1);
assert_eq!(
runtime[0].upstream.to_string(),
runtime[0].upstream.preferred().unwrap().to_string(),
"https://dns.quad9.net/dns-query"
);
}
@@ -773,44 +1195,90 @@ mod tests {
fn forwarding_config_rules_take_precedence_over_discovered() {
let config_rules = vec![ForwardingRuleConfig {
suffix: vec!["home.local".to_string()],
upstream: "10.0.0.1:53".to_string(),
upstream: vec!["10.0.0.1:53".to_string()],
}];
let discovered = vec![crate::system_dns::ForwardingRule::new(
"home.local".to_string(),
crate::forward::Upstream::Udp("192.168.1.1:53".parse().unwrap()),
crate::forward::UpstreamPool::new(
vec![crate::forward::Upstream::Udp(
"192.168.1.1:53".parse().unwrap(),
)],
vec![],
),
)];
let merged = merge_forwarding_rules(&config_rules, discovered).unwrap();
let picked = crate::system_dns::match_forwarding_rule("host.home.local", &merged)
.expect("rule should match");
assert_eq!(picked.to_string(), "10.0.0.1:53");
assert_eq!(picked.preferred().unwrap().to_string(), "10.0.0.1:53");
}
#[test]
fn forwarding_merge_preserves_non_overlapping_discovered() {
let config_rules = vec![ForwardingRuleConfig {
suffix: vec!["home.local".to_string()],
upstream: "10.0.0.1:53".to_string(),
upstream: vec!["10.0.0.1:53".to_string()],
}];
let discovered = vec![crate::system_dns::ForwardingRule::new(
"corp.example".to_string(),
crate::forward::Upstream::Udp("192.168.1.1:53".parse().unwrap()),
crate::forward::UpstreamPool::new(
vec![crate::forward::Upstream::Udp(
"192.168.1.1:53".parse().unwrap(),
)],
vec![],
),
)];
let merged = merge_forwarding_rules(&config_rules, discovered).unwrap();
assert_eq!(merged.len(), 2);
let picked = crate::system_dns::match_forwarding_rule("host.corp.example", &merged)
.expect("discovered rule should still match");
assert_eq!(picked.to_string(), "192.168.1.1:53");
assert_eq!(picked.preferred().unwrap().to_string(), "192.168.1.1:53");
}
#[test]
fn forwarding_merge_suffix_array_expands_to_multiple_rules() {
let config_rules = vec![ForwardingRuleConfig {
suffix: vec!["a.local".to_string(), "b.local".to_string()],
upstream: "10.0.0.1:53".to_string(),
upstream: vec!["10.0.0.1:53".to_string()],
}];
let merged = merge_forwarding_rules(&config_rules, vec![]).unwrap();
assert_eq!(merged.len(), 2);
}
#[test]
fn forwarding_parses_upstream_array() {
let toml = r#"
[[forwarding]]
suffix = "google.com"
upstream = ["tls://9.9.9.9#dns.quad9.net", "tls://149.112.112.112#dns.quad9.net"]
"#;
let config: Config = toml::from_str(toml).unwrap();
assert_eq!(config.forwarding.len(), 1);
assert_eq!(config.forwarding[0].upstream.len(), 2);
}
#[test]
fn forwarding_upstream_array_builds_pool_with_multiple_primaries() {
let rule = ForwardingRuleConfig {
suffix: vec!["google.com".to_string()],
upstream: vec![
"tls://9.9.9.9#dns.quad9.net".to_string(),
"tls://149.112.112.112#dns.quad9.net".to_string(),
],
};
let runtime = rule.to_runtime_rules().unwrap();
assert_eq!(runtime.len(), 1);
let label = runtime[0].upstream.label();
assert!(label.contains("+1 more"), "label was: {}", label);
}
#[test]
fn forwarding_empty_upstream_array_errors() {
let rule = ForwardingRuleConfig {
suffix: vec!["home.local".to_string()],
upstream: vec![],
};
assert!(rule.to_runtime_rules().is_err());
}
}
pub struct ConfigLoad {


@@ -16,7 +16,9 @@ use crate::blocklist::BlocklistStore;
use crate::buffer::BytePacketBuffer;
use crate::cache::{DnsCache, DnssecStatus};
use crate::config::{UpstreamMode, ZoneMap};
use crate::forward::{forward_query_raw, forward_with_failover_raw, Upstream, UpstreamPool};
#[cfg(test)]
use crate::forward::Upstream;
use crate::forward::{forward_with_failover_raw, UpstreamPool};
use crate::header::ResultCode;
use crate::health::HealthMeta;
use crate::lan::PeerStore;
@@ -75,6 +77,10 @@ pub struct ServerCtx {
pub ca_pem: Option<String>,
pub mobile_enabled: bool,
pub mobile_port: u16,
/// When true, AAAA queries short-circuit with NODATA (NOERROR + empty
/// answer) instead of hitting cache/forwarding/upstream. Local data
/// (overrides, zones, .numa proxy, blocklist sinkhole) is unaffected.
pub filter_aaaa: bool,
}
/// Transport-agnostic DNS resolution. Runs the full pipeline (overrides, blocklist,
@@ -99,6 +105,7 @@ pub async fn resolve_query(
// Pipeline: overrides -> .localhost -> local zones -> special-use (unless forwarded)
// -> .tld proxy -> blocklist -> cache -> forwarding -> recursive/upstream
// Each lock is scoped to avoid holding MutexGuard across await points.
let mut upstream_transport: Option<crate::stats::UpstreamTransport> = None;
let (response, path, dnssec) = {
let override_record = ctx.overrides.read().unwrap().lookup(&qname);
if let Some(record) = override_record {
@@ -170,6 +177,13 @@ pub async fn resolve_query(
60,
));
(resp, QueryPath::Blocked, DnssecStatus::Indeterminate)
} else if qtype == QueryType::AAAA && ctx.filter_aaaa {
// RFC 2308 NODATA: NOERROR with empty answer section. Prevents
// Happy Eyeballs clients from waiting on an AAAA they'll never use
// on IPv4-only networks. NXDOMAIN would be wrong (it'd imply the
// name doesn't exist for A either).
let resp = DnsPacket::response_from(&query, ResultCode::NOERROR);
(resp, QueryPath::Local, DnssecStatus::Indeterminate)
} else {
let cached = ctx.cache.read().unwrap().lookup_with_status(&qname, qtype);
if let Some((cached, cached_dnssec, freshness)) = cached {
@@ -190,84 +204,73 @@ pub async fn resolve_query(
resp.header.authed_data = true;
}
(resp, QueryPath::Cached, cached_dnssec)
} else if let Some(upstream) =
} else if let Some(pool) =
crate::system_dns::match_forwarding_rule(&qname, &ctx.forwarding_rules)
{
// Conditional forwarding takes priority over recursive mode
// (e.g. Tailscale .ts.net, VPC private zones)
match forward_and_cache(raw_wire, upstream, ctx, &qname, qtype).await {
Ok(resp) => (resp, QueryPath::Forwarded, DnssecStatus::Indeterminate),
Err(e) => {
error!(
"{} | {:?} {} | FORWARD ERROR | {}",
src_addr, qtype, qname, e
);
(
DnsPacket::response_from(&query, ResultCode::SERVFAIL),
QueryPath::UpstreamError,
DnssecStatus::Indeterminate,
)
}
}
} else if ctx.upstream_mode == UpstreamMode::Recursive {
let key = (qname.clone(), qtype);
let (resp, path, err) = resolve_coalesced(&ctx.inflight, key, &query, || {
crate::recursive::resolve_recursive(
&qname,
qtype,
&ctx.cache,
&query,
&ctx.root_hints,
&ctx.srtt,
)
})
.await;
if path == QueryPath::Coalesced {
debug!("{} | {:?} {} | COALESCED", src_addr, qtype, qname);
} else if path == QueryPath::UpstreamError {
error!(
"{} | {:?} {} | RECURSIVE ERROR | {}",
src_addr,
qtype,
qname,
err.as_deref().unwrap_or("leader failed")
);
let (resp, path, err) =
resolve_coalesced(&ctx.inflight, key, &query, QueryPath::Forwarded, || async {
let wire = forward_with_failover_raw(
raw_wire,
pool,
&ctx.srtt,
ctx.timeout,
ctx.hedge_delay,
)
.await?;
cache_and_parse(ctx, &qname, qtype, &wire)
})
.await;
log_coalesced_outcome(src_addr, qtype, &qname, path, err.as_deref(), "FORWARD");
if path == QueryPath::Forwarded {
upstream_transport = pool.preferred().map(|u| u.transport());
}
(resp, path, DnssecStatus::Indeterminate)
} else if ctx.upstream_mode == UpstreamMode::Recursive {
// Recursive resolution makes UDP hops to roots/TLDs/auths;
// tag as Udp so the dashboard can aggregate plaintext-wire
// egress honestly. Only mark on success — errors stay None.
let key = (qname.clone(), qtype);
let (resp, path, err) =
resolve_coalesced(&ctx.inflight, key, &query, QueryPath::Recursive, || {
crate::recursive::resolve_recursive(
&qname,
qtype,
&ctx.cache,
&query,
&ctx.root_hints,
&ctx.srtt,
)
})
.await;
log_coalesced_outcome(src_addr, qtype, &qname, path, err.as_deref(), "RECURSIVE");
if path == QueryPath::Recursive {
upstream_transport = Some(crate::stats::UpstreamTransport::Udp);
}
(resp, path, DnssecStatus::Indeterminate)
} else {
let pool = ctx.upstream_pool.lock().unwrap().clone();
match forward_with_failover_raw(
raw_wire,
&pool,
&ctx.srtt,
ctx.timeout,
ctx.hedge_delay,
)
.await
{
Ok(resp_wire) => match cache_and_parse(ctx, &qname, qtype, &resp_wire) {
Ok(resp) => (resp, QueryPath::Upstream, DnssecStatus::Indeterminate),
Err(e) => {
error!("{} | {:?} {} | PARSE ERROR | {}", src_addr, qtype, qname, e);
(
DnsPacket::response_from(&query, ResultCode::SERVFAIL),
QueryPath::UpstreamError,
DnssecStatus::Indeterminate,
)
}
},
Err(e) => {
error!(
"{} | {:?} {} | UPSTREAM ERROR | {}",
src_addr, qtype, qname, e
);
(
DnsPacket::response_from(&query, ResultCode::SERVFAIL),
QueryPath::UpstreamError,
DnssecStatus::Indeterminate,
let key = (qname.clone(), qtype);
let (resp, path, err) =
resolve_coalesced(&ctx.inflight, key, &query, QueryPath::Upstream, || async {
let wire = forward_with_failover_raw(
raw_wire,
&pool,
&ctx.srtt,
ctx.timeout,
ctx.hedge_delay,
)
}
.await?;
cache_and_parse(ctx, &qname, qtype, &wire)
})
.await;
log_coalesced_outcome(src_addr, qtype, &qname, path, err.as_deref(), "UPSTREAM");
if path == QueryPath::Upstream {
upstream_transport = pool.preferred().map(|u| u.transport());
}
(resp, path, DnssecStatus::Indeterminate)
}
}
};
@@ -314,6 +317,15 @@ pub async fn resolve_query(
strip_dnssec_records(&mut response);
}
// filter_aaaa: also strip ipv6hint from HTTPS/SVCB answers so modern
// browsers (Chrome ≥103 etc.) don't receive v6 address hints via the
// HTTPS record path that bypasses AAAA entirely. Gated on !client_do
// because modifying rdata invalidates any accompanying RRSIG — a DO-bit
// validator downstream would reject the response as Bogus.
if ctx.filter_aaaa && !client_do {
strip_svcb_ipv6_hints(&mut response);
}
// Echo EDNS back if client sent it
if query.edns.is_some() {
response.edns = Some(crate::packet::EdnsOpt {
@@ -357,7 +369,7 @@ pub async fn resolve_query(
// Record stats and query log
{
let mut s = ctx.stats.lock().unwrap();
let total = s.record(path, transport);
let total = s.record(path, transport, upstream_transport);
if total.is_multiple_of(1000) {
s.log_summary();
}
@@ -396,6 +408,33 @@ fn cache_and_parse(
/// Used for both stale-entry refresh and proactive cache warming.
pub async fn refresh_entry(ctx: &ServerCtx, qname: &str, qtype: QueryType) {
let query = DnsPacket::query(0, qname, qtype);
// Forwarding rules must win here, mirroring `resolve_query` — otherwise
// refresh re-resolves private zones through the default upstream and
// poisons the cache with NXDOMAIN.
if let Some(pool) = crate::system_dns::match_forwarding_rule(qname, &ctx.forwarding_rules) {
let mut buf = BytePacketBuffer::new();
if query.write(&mut buf).is_ok() {
if let Ok(wire) = forward_with_failover_raw(
buf.filled(),
pool,
&ctx.srtt,
ctx.timeout,
ctx.hedge_delay,
)
.await
{
ctx.cache.write().unwrap().insert_wire(
qname,
qtype,
&wire,
DnssecStatus::Indeterminate,
);
}
}
return;
}
if ctx.upstream_mode == UpstreamMode::Recursive {
if let Ok(resp) = crate::recursive::resolve_recursive(
qname,
@@ -433,17 +472,6 @@ pub async fn refresh_entry(ctx: &ServerCtx, qname: &str, qtype: QueryType) {
}
}
async fn forward_and_cache(
wire: &[u8],
upstream: &Upstream,
ctx: &ServerCtx,
qname: &str,
qtype: QueryType,
) -> crate::Result<DnsPacket> {
let resp_wire = forward_query_raw(wire, upstream, ctx.timeout).await?;
cache_and_parse(ctx, qname, qtype, &resp_wire)
}
pub async fn handle_query(
mut buffer: BytePacketBuffer,
raw_len: usize,
@@ -482,6 +510,20 @@ fn strip_dnssec_records(pkt: &mut DnsPacket) {
pkt.resources.retain(|r| !is_dnssec_record(r));
}
fn strip_svcb_ipv6_hints(pkt: &mut DnsPacket) {
let https_qtype = QueryType::HTTPS.to_num();
let svcb_qtype = QueryType::SVCB.to_num();
pkt.for_each_record_mut(|rec| {
if let DnsRecord::UNKNOWN { qtype, data, .. } = rec {
if *qtype == https_qtype || *qtype == svcb_qtype {
if let Some(new_data) = crate::svcb::strip_ipv6hint(data) {
*data = new_data;
}
}
}
});
}
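The body of `crate::svcb::strip_ipv6hint` is not shown in this diff. As a rough illustration of what such a helper has to do, here is a standalone sketch that walks the RFC 9460 rdata layout (SvcPriority, uncompressed TargetName, then key/length/value SvcParams) and drops key 6. The name `strip_ipv6hint_sketch` and the choice to return `None` when nothing was stripped are assumptions, not the project's behavior:

```rust
/// SvcParamKey 6 is `ipv6hint` in RFC 9460.
const IPV6HINT_KEY: u16 = 6;

/// Returns rewritten rdata with every `ipv6hint` SvcParam removed, or `None`
/// if the rdata is malformed or contains no such param (assumed semantics).
pub fn strip_ipv6hint_sketch(rdata: &[u8]) -> Option<Vec<u8>> {
    // SvcPriority occupies the first 2 octets.
    let mut pos = 2usize;
    // TargetName: uncompressed labels ending in the zero-length root label.
    loop {
        let len = *rdata.get(pos)? as usize;
        pos += 1;
        if len == 0 {
            break;
        }
        if len > 63 {
            return None; // compression pointers are not valid here
        }
        pos += len;
    }
    // Priority and target are copied through unchanged.
    let mut out = rdata[..pos].to_vec();
    let mut stripped = false;
    // SvcParams: key (2 octets), length (2 octets), value (length octets).
    while pos < rdata.len() {
        let header = rdata.get(pos..pos + 4)?;
        let key = u16::from_be_bytes([header[0], header[1]]);
        let len = u16::from_be_bytes([header[2], header[3]]) as usize;
        let end = pos + 4 + len;
        let param = rdata.get(pos..end)?;
        if key == IPV6HINT_KEY {
            stripped = true;
        } else {
            out.extend_from_slice(param);
        }
        pos = end;
    }
    if stripped {
        Some(out)
    } else {
        None
    }
}
```

Because the rdata changes, any RRSIG covering the record no longer verifies, which is why the call site above gates the strip on `!client_do`.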
fn is_special_use_domain(qname: &str) -> bool {
if qname.ends_with(".in-addr.arpa") {
// RFC 6303: private + loopback + link-local reverse DNS
@@ -558,11 +600,15 @@ fn acquire_inflight(inflight: &Mutex<InflightMap>, key: (String, QueryType)) ->
/// Run a resolve function with in-flight coalescing. Multiple concurrent calls
/// for the same key share a single resolution — the first caller (leader)
/// executes `resolve_fn`, and followers wait for the broadcast result.
/// executes `resolve_fn`, and followers wait for the broadcast result. The
/// leader's successful path is tagged with `leader_path` so callers that
/// share this helper (recursive, forwarded-rule, forward-upstream) keep their
/// own observability without duplicating the inflight map.
async fn resolve_coalesced<F, Fut>(
inflight: &Mutex<InflightMap>,
key: (String, QueryType),
query: &DnsPacket,
leader_path: QueryPath,
resolve_fn: F,
) -> (DnsPacket, QueryPath, Option<String>)
where
@@ -591,7 +637,7 @@ where
match result {
Ok(resp) => {
let _ = tx.send(Some(resp.clone()));
(resp, QueryPath::Recursive, None)
(resp, leader_path, None)
}
Err(e) => {
let _ = tx.send(None);
@@ -618,6 +664,33 @@ impl Drop for InflightGuard<'_> {
}
}
/// Emit the log lines shared by the three upstream branches (Forwarded,
/// Recursive, Upstream) after `resolve_coalesced` returns. Leader-success
/// and transport-tagging stay at the call site since they diverge per
/// branch, but the Coalesced debug and UpstreamError error are identical
/// except for the label.
fn log_coalesced_outcome(
src_addr: SocketAddr,
qtype: QueryType,
qname: &str,
path: QueryPath,
err: Option<&str>,
label: &str,
) {
match path {
QueryPath::Coalesced => debug!("{} | {:?} {} | COALESCED", src_addr, qtype, qname),
QueryPath::UpstreamError => error!(
"{} | {:?} {} | {} ERROR | {}",
src_addr,
qtype,
qname,
label,
err.unwrap_or("leader failed")
),
_ => {}
}
}
fn special_use_response(query: &DnsPacket, qname: &str, qtype: QueryType) -> DnsPacket {
use std::net::{Ipv4Addr, Ipv6Addr};
if qname == "ipv4only.arpa" {
@@ -856,7 +929,7 @@ mod tests {
let key = ("coalesce.test".to_string(), QueryType::A);
let query = DnsPacket::query(100 + i, "coalesce.test", QueryType::A);
handles.push(tokio::spawn(async move {
resolve_coalesced(&inf, key, &query, || async {
resolve_coalesced(&inf, key, &query, QueryPath::Recursive, || async {
count.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
tokio::time::sleep(Duration::from_millis(200)).await;
Ok(mock_response("coalesce.test"))
@@ -900,6 +973,7 @@ mod tests {
&inf1,
("same.domain".to_string(), QueryType::A),
&query_a,
QueryPath::Recursive,
|| async {
count1.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
tokio::time::sleep(Duration::from_millis(100)).await;
@@ -913,6 +987,7 @@ mod tests {
&inf2,
("same.domain".to_string(), QueryType::AAAA),
&query_aaaa,
QueryPath::Recursive,
|| async {
count2.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
tokio::time::sleep(Duration::from_millis(100)).await;
@@ -942,6 +1017,7 @@ mod tests {
&inflight,
("will-fail.test".to_string(), QueryType::A),
&query,
QueryPath::Recursive,
|| async { Err::<DnsPacket, _>("upstream timeout".into()) },
)
.await;
@@ -963,6 +1039,7 @@ mod tests {
&inf,
("fail.test".to_string(), QueryType::A),
&query,
QueryPath::Recursive,
|| async {
tokio::time::sleep(Duration::from_millis(200)).await;
Err::<DnsPacket, _>("upstream error".into())
@@ -1003,6 +1080,7 @@ mod tests {
&inflight,
("question.test".to_string(), QueryType::A),
&query,
QueryPath::Recursive,
|| async { Err::<DnsPacket, _>("fail".into()) },
)
.await;
@@ -1027,6 +1105,7 @@ mod tests {
&inflight,
("err-msg.test".to_string(), QueryType::A),
&query,
QueryPath::Recursive,
|| async { Err::<DnsPacket, _>("connection refused by upstream".into()) },
)
.await;
@@ -1082,7 +1161,7 @@ mod tests {
let mut ctx = crate::testutil::test_ctx().await;
ctx.forwarding_rules = vec![ForwardingRule::new(
"168.192.in-addr.arpa".to_string(),
Upstream::Udp(upstream_addr),
UpstreamPool::new(vec![Upstream::Udp(upstream_addr)], vec![]),
)];
let ctx = Arc::new(ctx);
@@ -1178,6 +1257,195 @@ mod tests {
}
}
#[tokio::test]
async fn pipeline_filter_aaaa_returns_nodata() {
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "example.com", QueryType::AAAA).await;
assert_eq!(path, QueryPath::Local);
assert_eq!(resp.header.rescode, ResultCode::NOERROR);
assert!(resp.answers.is_empty(), "AAAA must be filtered to NODATA");
}
#[tokio::test]
async fn pipeline_filter_aaaa_leaves_a_queries_alone() {
let upstream_resp =
crate::testutil::a_record_response("example.com", Ipv4Addr::new(93, 184, 216, 34), 300);
let upstream_addr = crate::testutil::mock_upstream(upstream_resp).await;
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
ctx.upstream_pool
.lock()
.unwrap()
.set_primary(vec![Upstream::Udp(upstream_addr)]);
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "example.com", QueryType::A).await;
assert_eq!(path, QueryPath::Upstream);
assert_eq!(resp.answers.len(), 1);
}
#[tokio::test]
async fn pipeline_filter_aaaa_respects_override() {
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
ctx.overrides
.write()
.unwrap()
.insert("v6.test", "2001:db8::1", 60, None)
.unwrap();
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "v6.test", QueryType::AAAA).await;
assert_eq!(path, QueryPath::Overridden);
assert_eq!(resp.answers.len(), 1, "override must win over filter");
}
#[tokio::test]
async fn pipeline_filter_aaaa_strips_ipv6hint_from_https_and_svcb() {
let rdata = crate::svcb::build_rdata(
1,
&[],
&[
(1, vec![0x02, b'h', b'3']),
(
6,
vec![
0x26, 0x06, 0x47, 0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01,
],
),
],
);
let mut pkt = DnsPacket::new();
pkt.header.response = true;
pkt.header.rescode = ResultCode::NOERROR;
pkt.questions.push(crate::question::DnsQuestion {
name: "hints.test".to_string(),
qtype: QueryType::HTTPS,
});
pkt.answers.push(DnsRecord::UNKNOWN {
domain: "hints.test".to_string(),
qtype: 65,
data: rdata.clone(),
ttl: 300,
});
let mut svcb_pkt = pkt.clone();
svcb_pkt.questions[0].name = "svc.test".to_string();
svcb_pkt.questions[0].qtype = QueryType::SVCB;
if let DnsRecord::UNKNOWN { domain, qtype, .. } = &mut svcb_pkt.answers[0] {
*domain = "svc.test".to_string();
*qtype = 64;
}
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
ctx.cache
.write()
.unwrap()
.insert("hints.test", QueryType::HTTPS, &pkt);
ctx.cache
.write()
.unwrap()
.insert("svc.test", QueryType::SVCB, &svcb_pkt);
let ctx = Arc::new(ctx);
for (name, qtype, label) in [
("hints.test", QueryType::HTTPS, "HTTPS"),
("svc.test", QueryType::SVCB, "SVCB"),
] {
let (resp, path) = resolve_in_test(&ctx, name, qtype).await;
assert_eq!(path, QueryPath::Cached, "{label}");
assert_eq!(resp.answers.len(), 1, "{label}");
match &resp.answers[0] {
DnsRecord::UNKNOWN { data, .. } => {
assert!(
data.len() < rdata.len(),
"{label}: ipv6hint (20 bytes) must be removed"
);
// The ipv6hint TLV header (key=6, length=16) must not appear at any
// byte offset in the rdata; a cheap structural check.
assert!(
!data.windows(4).any(|w| w == [0, 6, 0, 16]),
"{label}: ipv6hint TLV header must be absent"
);
}
other => panic!("{label}: expected UNKNOWN record, got {other:?}"),
}
}
}
#[tokio::test]
async fn pipeline_filter_aaaa_preserves_ipv6hint_for_dnssec_clients() {
// Regression guard for the DO-bit gate in resolve_query: modifying
// HTTPS rdata invalidates any accompanying RRSIG, so a DO=1 client
// must receive the record untouched even when filter_aaaa is on.
let rdata = crate::svcb::build_rdata(
1,
&[],
&[(
6,
vec![
0x26, 0x06, 0x47, 0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01,
],
)],
);
let mut pkt = DnsPacket::new();
pkt.header.response = true;
pkt.header.rescode = ResultCode::NOERROR;
pkt.questions.push(crate::question::DnsQuestion {
name: "hints.test".to_string(),
qtype: QueryType::HTTPS,
});
pkt.answers.push(DnsRecord::UNKNOWN {
domain: "hints.test".to_string(),
qtype: 65,
data: rdata.clone(),
ttl: 300,
});
let mut ctx = crate::testutil::test_ctx().await;
ctx.filter_aaaa = true;
ctx.cache
.write()
.unwrap()
.insert("hints.test", QueryType::HTTPS, &pkt);
let ctx = Arc::new(ctx);
// Build a query with EDNS DO bit set — can't use resolve_in_test
// because it constructs a plain query without EDNS.
let mut query = DnsPacket::query(0xBEEF, "hints.test", QueryType::HTTPS);
query.edns = Some(crate::packet::EdnsOpt {
do_bit: true,
..Default::default()
});
let mut buf = BytePacketBuffer::new();
query.write(&mut buf).unwrap();
let raw = &buf.buf[..buf.pos];
let src: SocketAddr = "127.0.0.1:1234".parse().unwrap();
let (resp_buf, _) = resolve_query(query, raw, src, &ctx, Transport::Udp)
.await
.unwrap();
let mut resp_parse_buf = BytePacketBuffer::from_bytes(resp_buf.filled());
let resp = DnsPacket::from_buffer(&mut resp_parse_buf).unwrap();
match &resp.answers[0] {
DnsRecord::UNKNOWN { data, .. } => {
assert_eq!(
data, &rdata,
"ipv6hint must be preserved for DO-bit clients"
);
}
other => panic!("expected UNKNOWN record, got {:?}", other),
}
}
#[tokio::test]
async fn pipeline_blocklist_sinkhole() {
let ctx = crate::testutil::test_ctx().await;
@@ -1224,20 +1492,14 @@ mod tests {
#[tokio::test]
async fn pipeline_forwarding_returns_upstream_answer() {
let mut upstream_resp = DnsPacket::new();
upstream_resp.header.response = true;
upstream_resp.header.rescode = ResultCode::NOERROR;
upstream_resp.answers.push(DnsRecord::A {
domain: "internal.corp".to_string(),
addr: Ipv4Addr::new(10, 1, 2, 3),
ttl: 600,
});
let upstream_resp =
crate::testutil::a_record_response("internal.corp", Ipv4Addr::new(10, 1, 2, 3), 600);
let upstream_addr = crate::testutil::mock_upstream(upstream_resp).await;
let mut ctx = crate::testutil::test_ctx().await;
ctx.forwarding_rules = vec![ForwardingRule::new(
"corp".to_string(),
Upstream::Udp(upstream_addr),
UpstreamPool::new(vec![Upstream::Udp(upstream_addr)], vec![]),
)];
let ctx = Arc::new(ctx);
@@ -1254,16 +1516,35 @@ mod tests {
}
}
#[tokio::test]
async fn pipeline_forwarding_fails_over_to_second_upstream() {
let dead = crate::testutil::blackhole_upstream();
let live_resp =
crate::testutil::a_record_response("internal.corp", Ipv4Addr::new(10, 9, 9, 9), 600);
let live = crate::testutil::mock_upstream(live_resp).await;
let mut ctx = crate::testutil::test_ctx().await;
ctx.forwarding_rules = vec![ForwardingRule::new(
"corp".to_string(),
UpstreamPool::new(vec![Upstream::Udp(dead), Upstream::Udp(live)], vec![]),
)];
let ctx = Arc::new(ctx);
let (resp, path) = resolve_in_test(&ctx, "internal.corp", QueryType::A).await;
assert_eq!(path, QueryPath::Forwarded);
assert_eq!(resp.header.rescode, ResultCode::NOERROR);
assert_eq!(resp.answers.len(), 1);
match &resp.answers[0] {
DnsRecord::A { addr, .. } => assert_eq!(*addr, Ipv4Addr::new(10, 9, 9, 9)),
other => panic!("expected A record, got {:?}", other),
}
}
#[tokio::test]
async fn pipeline_default_pool_reports_upstream_path() {
let mut upstream_resp = DnsPacket::new();
upstream_resp.header.response = true;
upstream_resp.header.rescode = ResultCode::NOERROR;
upstream_resp.answers.push(DnsRecord::A {
domain: "example.com".to_string(),
addr: Ipv4Addr::new(93, 184, 216, 34),
ttl: 300,
});
let upstream_resp =
crate::testutil::a_record_response("example.com", Ipv4Addr::new(93, 184, 216, 34), 300);
let upstream_addr = crate::testutil::mock_upstream(upstream_resp).await;
let ctx = crate::testutil::test_ctx().await;
@@ -1278,4 +1559,67 @@ mod tests {
assert_eq!(resp.header.rescode, ResultCode::NOERROR);
assert_eq!(resp.answers.len(), 1);
}
#[tokio::test]
async fn refresh_entry_honors_forwarding_rule() {
let rule_resp =
crate::testutil::a_record_response("internal.corp", Ipv4Addr::new(10, 0, 0, 42), 300);
let rule_upstream = crate::testutil::mock_upstream(rule_resp).await;
let mut ctx = crate::testutil::test_ctx().await;
ctx.forwarding_rules = vec![ForwardingRule::new(
"corp".to_string(),
UpstreamPool::new(vec![Upstream::Udp(rule_upstream)], vec![]),
)];
// Default pool points at a blackhole — if the refresh queries it
// instead of the rule, the test fails because nothing is cached.
ctx.upstream_pool
.lock()
.unwrap()
.set_primary(vec![Upstream::Udp(crate::testutil::blackhole_upstream())]);
let ctx = Arc::new(ctx);
refresh_entry(&ctx, "internal.corp", QueryType::A).await;
let cached = ctx
.cache
.read()
.unwrap()
.lookup("internal.corp", QueryType::A)
.expect("refresh must populate cache via forwarding rule");
match &cached.answers[0] {
DnsRecord::A { addr, .. } => assert_eq!(*addr, Ipv4Addr::new(10, 0, 0, 42)),
other => panic!("expected A record, got {:?}", other),
}
}
#[tokio::test]
async fn refresh_entry_prefers_forwarding_rule_over_recursive() {
let rule_resp =
crate::testutil::a_record_response("db.internal.corp", Ipv4Addr::new(10, 0, 0, 7), 300);
let rule_upstream = crate::testutil::mock_upstream(rule_resp).await;
let mut ctx = crate::testutil::test_ctx().await;
ctx.upstream_mode = UpstreamMode::Recursive;
ctx.forwarding_rules = vec![ForwardingRule::new(
"corp".to_string(),
UpstreamPool::new(vec![Upstream::Udp(rule_upstream)], vec![]),
)];
// No root_hints — recursion would fail immediately, proving that
// the rule branch fired instead.
let ctx = Arc::new(ctx);
refresh_entry(&ctx, "db.internal.corp", QueryType::A).await;
let cached = ctx
.cache
.read()
.unwrap()
.lookup("db.internal.corp", QueryType::A)
.expect("recursive-mode refresh must still consult forwarding rules");
match &cached.answers[0] {
DnsRecord::A { addr, .. } => assert_eq!(*addr, Ipv4Addr::new(10, 0, 0, 7)),
other => panic!("expected A record, got {:?}", other),
}
}
}
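
The two `refresh_entry_*` tests above pin down one precedence: a matching forwarding rule wins over recursive mode, which in turn wins over the default upstream pool. The following is a minimal sketch of that ordering only, not the project's implementation. `Rule`, `Route`, `pick_route`, and `suffix_matches` are hypothetical names, and label-aligned suffix matching is an assumption about how rules are matched.

```rust
// Hypothetical, simplified types: not the project's real definitions.
#[derive(Debug, PartialEq)]
enum Route<'a> {
    Forwarded(&'a str), // upstream named by the matching rule
    Recursive,
    DefaultUpstream,
}

struct Rule {
    suffix: String,
    upstream: String,
}

/// True when `qname` equals `suffix` or ends with `.suffix` (label-aligned).
/// Assumption: the real matcher may differ in detail.
fn suffix_matches(qname: &str, suffix: &str) -> bool {
    let q = qname.trim_end_matches('.').to_ascii_lowercase();
    let s = suffix.trim_end_matches('.').to_ascii_lowercase();
    q == s || q.ends_with(&format!(".{s}"))
}

/// Forwarding rule first, then recursive mode, then the default pool.
fn pick_route<'a>(qname: &str, rules: &'a [Rule], recursive: bool) -> Route<'a> {
    if let Some(rule) = rules.iter().find(|r| suffix_matches(qname, &r.suffix)) {
        return Route::Forwarded(&rule.upstream);
    }
    if recursive {
        Route::Recursive
    } else {
        Route::DefaultUpstream
    }
}

fn main() {
    let rules = vec![Rule {
        suffix: "corp".into(),
        upstream: "10.0.0.53:53".into(),
    }];
    // A rule match wins even when recursive mode is on.
    assert_eq!(
        pick_route("db.internal.corp", &rules, true),
        Route::Forwarded("10.0.0.53:53")
    );
    assert_eq!(pick_route("example.com", &rules, true), Route::Recursive);
    assert_eq!(pick_route("example.com", &rules, false), Route::DefaultUpstream);
    // "notcorp" shares trailing characters with "corp" but not a label boundary.
    assert_eq!(pick_route("notcorp", &rules, false), Route::DefaultUpstream);
}
```

Using the same selection function for both the query path and the refresh path is one way to keep the two from drifting apart again.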


@@ -882,6 +882,28 @@ fn record_rdata_canonical(record: &DnsRecord) -> Vec<u8> {
rdata.extend(type_bitmap);
rdata
}
DnsRecord::SOA {
mname,
rname,
serial,
refresh,
retry,
expire,
minimum,
..
} => {
let mname_wire = name_to_wire(mname);
let rname_wire = name_to_wire(rname);
let mut rdata = Vec::with_capacity(mname_wire.len() + rname_wire.len() + 20);
rdata.extend(&mname_wire);
rdata.extend(&rname_wire);
rdata.extend(&serial.to_be_bytes());
rdata.extend(&refresh.to_be_bytes());
rdata.extend(&retry.to_be_bytes());
rdata.extend(&expire.to_be_bytes());
rdata.extend(&minimum.to_be_bytes());
rdata
}
DnsRecord::UNKNOWN { data, .. } => data.clone(),
DnsRecord::RRSIG { .. } => Vec::new(),
}


@@ -1,14 +1,16 @@
use std::fmt;
use std::net::{IpAddr, SocketAddr};
use std::sync::RwLock;
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};
use tokio::net::UdpSocket;
use tokio::time::timeout;
use crate::buffer::BytePacketBuffer;
use crate::odoh::{query_through_relay, OdohConfigCache};
use crate::packet::DnsPacket;
use crate::srtt::SrttCache;
use crate::stats::UpstreamTransport;
use crate::Result;
#[derive(Clone)]
@@ -23,6 +25,36 @@ pub enum Upstream {
tls_name: Option<String>,
connector: tokio_rustls::TlsConnector,
},
/// Oblivious DNS-over-HTTPS (RFC 9230). Queries are HPKE-sealed to the
/// target and forwarded through an independent relay. Target host lives
/// on `target_config` (single source of truth — the cache keys on it).
Odoh {
relay_url: String,
target_path: String,
client: reqwest::Client,
target_config: Arc<OdohConfigCache>,
},
}
impl Upstream {
/// IP address to key SRTT tracking on, if the upstream has a stable one.
/// `Doh` and `Odoh` route through a URL + connection pool, so there's no
/// single IP to track; SRTT is skipped for them.
pub fn tracked_ip(&self) -> Option<IpAddr> {
match self {
Upstream::Udp(addr) | Upstream::Dot { addr, .. } => Some(addr.ip()),
Upstream::Doh { .. } | Upstream::Odoh { .. } => None,
}
}
pub fn transport(&self) -> UpstreamTransport {
match self {
Upstream::Udp(_) => UpstreamTransport::Udp,
Upstream::Doh { .. } => UpstreamTransport::Doh,
Upstream::Dot { .. } => UpstreamTransport::Dot,
Upstream::Odoh { .. } => UpstreamTransport::Odoh,
}
}
}
impl PartialEq for Upstream {
@@ -31,6 +63,20 @@ impl PartialEq for Upstream {
(Self::Udp(a), Self::Udp(b)) => a == b,
(Self::Doh { url: a, .. }, Self::Doh { url: b, .. }) => a == b,
(Self::Dot { addr: a, .. }, Self::Dot { addr: b, .. }) => a == b,
(
Self::Odoh {
relay_url: ra,
target_path: pa,
target_config: ca,
..
},
Self::Odoh {
relay_url: rb,
target_path: pb,
target_config: cb,
..
},
) => ra == rb && pa == pb && ca.target_host() == cb.target_host(),
_ => false,
}
}
@@ -51,14 +97,23 @@ impl fmt::Display for Upstream {
Some(name) => write!(f, "tls://{}#{}", addr, name),
None => write!(f, "tls://{}", addr),
},
Upstream::Odoh {
relay_url,
target_path,
target_config,
..
} => write!(
f,
"odoh://{}{} via {}",
target_config.target_host(),
target_path,
relay_url
),
}
}
}
pub(crate) fn parse_upstream_addr(
s: &str,
default_port: u16,
) -> std::result::Result<SocketAddr, String> {
pub fn parse_upstream_addr(s: &str, default_port: u16) -> std::result::Result<SocketAddr, String> {
// Try full socket addr first: "1.2.3.4:5353" or "[::1]:5353"
if let Ok(addr) = s.parse::<SocketAddr>() {
return Ok(addr);
@@ -70,22 +125,29 @@ pub(crate) fn parse_upstream_addr(
Err(format!("invalid upstream address: {}", s))
}
pub fn parse_upstream(s: &str, default_port: u16) -> Result<Upstream> {
/// Parse a slice of upstream address strings into `Upstream` values, failing
/// on the first invalid entry. DoH entries use `resolver` (when provided) as
/// their hostname resolver.
pub fn parse_upstream_list(
addrs: &[String],
default_port: u16,
resolver: Option<Arc<crate::bootstrap_resolver::NumaResolver>>,
) -> Result<Vec<Upstream>> {
addrs
.iter()
.map(|s| parse_upstream(s, default_port, resolver.clone()))
.collect()
}
pub fn parse_upstream(
s: &str,
default_port: u16,
resolver: Option<Arc<crate::bootstrap_resolver::NumaResolver>>,
) -> Result<Upstream> {
if s.starts_with("https://") {
let client = reqwest::Client::builder()
.use_rustls_tls()
.http2_initial_stream_window_size(65_535)
.http2_initial_connection_window_size(65_535)
.http2_keep_alive_interval(Duration::from_secs(15))
.http2_keep_alive_while_idle(true)
.http2_keep_alive_timeout(Duration::from_secs(10))
.pool_idle_timeout(Duration::from_secs(300))
.pool_max_idle_per_host(1)
.build()
.unwrap_or_default();
return Ok(Upstream::Doh {
url: s.to_string(),
client,
client: build_https_client_with_resolver(1, resolver),
});
}
// tls://IP:PORT#hostname or tls://IP#hostname (default port 853)
@@ -106,6 +168,51 @@ pub fn parse_upstream(s: &str, default_port: u16) -> Result<Upstream> {
Ok(Upstream::Udp(addr))
}
/// HTTP/2 client tuned for DoH/ODoH: small windows for low latency, long-lived
/// keep-alive. Pool defaults to one idle conn per host — good for resolvers
/// that talk to a single upstream; relays that fan out to many targets
/// should use [`build_https_client_with_pool`].
///
/// Uses the system resolver. Callers running inside `serve::run` pass the
/// shared [`crate::bootstrap_resolver::NumaResolver`] via
/// [`build_https_client_with_resolver`] to avoid the self-loop (issue #122).
pub fn build_https_client() -> reqwest::Client {
build_https_client_with_resolver(1, None)
}
/// Same shape as [`build_https_client`], but caller picks
/// `pool_max_idle_per_host`. Relay workloads hit many distinct target hosts
/// and benefit from a larger pool so warm connections survive concurrent
/// fan-out.
pub fn build_https_client_with_pool(pool_max_idle_per_host: usize) -> reqwest::Client {
build_https_client_with_resolver(pool_max_idle_per_host, None)
}
/// [`build_https_client`] with an optional custom DNS resolver. Numa wires
/// [`crate::bootstrap_resolver::NumaResolver`] here.
pub fn build_https_client_with_resolver(
pool_max_idle_per_host: usize,
resolver: Option<Arc<crate::bootstrap_resolver::NumaResolver>>,
) -> reqwest::Client {
let mut builder = https_client_builder(pool_max_idle_per_host);
if let Some(r) = resolver {
builder = builder.dns_resolver(r);
}
builder.build().unwrap_or_default()
}
fn https_client_builder(pool_max_idle_per_host: usize) -> reqwest::ClientBuilder {
reqwest::Client::builder()
.use_rustls_tls()
.http2_initial_stream_window_size(65_535)
.http2_initial_connection_window_size(65_535)
.http2_keep_alive_interval(Duration::from_secs(15))
.http2_keep_alive_while_idle(true)
.http2_keep_alive_timeout(Duration::from_secs(10))
.pool_idle_timeout(Duration::from_secs(300))
.pool_max_idle_per_host(pool_max_idle_per_host)
}
fn build_dot_connector() -> Result<tokio_rustls::TlsConnector> {
let _ = rustls::crypto::ring::default_provider().install_default();
let mut root_store = rustls::RootCertStore::empty();
@@ -270,6 +377,22 @@ pub async fn forward_query_raw(
tls_name,
connector,
} => forward_dot_raw(wire, *addr, tls_name, connector, timeout_duration).await,
Upstream::Odoh {
relay_url,
target_path,
client,
target_config,
} => {
query_through_relay(
wire,
relay_url,
target_path,
client,
target_config,
timeout_duration,
)
.await
}
}
}
@@ -345,18 +468,17 @@ pub async fn forward_with_failover_raw(
timeout_duration: Duration,
hedge_delay: Duration,
) -> Result<Vec<u8>> {
let mut candidates: Vec<(usize, u64)> = pool
.primary
.iter()
.enumerate()
.map(|(i, u)| {
let rtt = match u {
Upstream::Udp(addr) => srtt.read().unwrap().get(addr.ip()),
_ => 0,
};
(i, rtt)
})
.collect();
let mut candidates: Vec<(usize, u64)> = {
let srtt_read = srtt.read().unwrap();
pool.primary
.iter()
.enumerate()
.map(|(i, u)| {
let rtt = u.tracked_ip().map(|ip| srtt_read.get(ip)).unwrap_or(0);
(i, rtt)
})
.collect()
};
candidates.sort_by_key(|&(_, rtt)| rtt);
let all_upstreams: Vec<&Upstream> = candidates
@@ -380,15 +502,15 @@ pub async fn forward_with_failover_raw(
};
match result {
Ok(resp) => {
if let Upstream::Udp(addr) = upstream {
if let Some(ip) = upstream.tracked_ip() {
let rtt_ms = start.elapsed().as_millis() as u64;
srtt.write().unwrap().record_rtt(addr.ip(), rtt_ms, false);
srtt.write().unwrap().record_rtt(ip, rtt_ms, false);
}
return Ok(resp);
}
Err(e) => {
if let Upstream::Udp(addr) = upstream {
srtt.write().unwrap().record_failure(addr.ip());
if let Some(ip) = upstream.tracked_ip() {
srtt.write().unwrap().record_failure(ip);
}
log::debug!("upstream {} failed: {}", upstream, e);
last_err = Some(e);
@@ -438,6 +560,9 @@ async fn forward_doh_raw(
/// Send a lightweight keepalive query to a DoH upstream to prevent
/// the HTTP/2 + TLS connection from going idle and being torn down.
/// The first call doubles as a startup warm-up: bootstrap-resolver failures
/// (unreachable Quad9/Cloudflare defaults, misconfigured hostname upstream)
/// surface here rather than on the first client query.
pub async fn keepalive_doh(upstream: &Upstream) {
if let Upstream::Doh { url, client } = upstream {
// Query for . NS — minimal, always succeeds, response is small
@@ -450,7 +575,9 @@ pub async fn keepalive_doh(upstream: &Upstream) {
0x00, 0x02, // type NS
0x00, 0x01, // class IN
];
let _ = forward_doh_raw(wire, url, client, Duration::from_secs(5)).await;
if let Err(e) = forward_doh_raw(wire, url, client, Duration::from_secs(5)).await {
log::warn!("DoH keepalive to {} failed: {}", url, e);
}
}
}
@@ -707,4 +834,62 @@ mod tests {
assert!(!pool.maybe_update_primary("not-an-ip", 53));
assert_eq!(pool.preferred().unwrap().to_string(), "1.2.3.4:53");
}
fn tcp_closed_port() -> SocketAddr {
// Bind a TCP listener, grab the port, drop → kernel returns RST on connect.
let listener = std::net::TcpListener::bind("127.0.0.1:0").unwrap();
let addr = listener.local_addr().unwrap();
drop(listener);
addr
}
#[tokio::test]
async fn udp_failure_records_in_srtt() {
let blackhole = crate::testutil::blackhole_upstream();
let pool = UpstreamPool::new(vec![Upstream::Udp(blackhole)], vec![]);
let srtt = RwLock::new(SrttCache::new(true));
let _ = forward_with_failover_raw(
&[0u8; 12],
&pool,
&srtt,
Duration::from_millis(100),
Duration::ZERO,
)
.await;
assert!(srtt.read().unwrap().is_known(blackhole.ip()));
}
#[tokio::test]
async fn dot_failure_records_in_srtt() {
let dead1 = tcp_closed_port();
let dead2 = tcp_closed_port();
let connector = build_dot_connector().unwrap();
let pool = UpstreamPool::new(
vec![
Upstream::Dot {
addr: dead1,
tls_name: Some("dns.quad9.net".to_string()),
connector: connector.clone(),
},
Upstream::Dot {
addr: dead2,
tls_name: Some("dns.quad9.net".to_string()),
connector,
},
],
vec![],
);
let srtt = RwLock::new(SrttCache::new(true));
let _ = forward_with_failover_raw(
&[0u8; 12],
&pool,
&srtt,
Duration::from_millis(500),
Duration::ZERO,
)
.await;
let cache = srtt.read().unwrap();
assert!(cache.is_known(dead1.ip()));
assert!(cache.is_known(dead2.ip()));
}
}
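
The failover hunk above orders primaries by smoothed RTT and treats upstreams without a stable IP as RTT 0. A self-contained sketch of that ordering follows, under stated assumptions: the two-variant `Upstream` enum, the `HashMap` standing in for `SrttCache`, and the `order_by_srtt` name are all hypothetical, and returning 0 for an IP with no recorded RTT is an assumption about the real cache's behaviour.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr};

// Hypothetical stand-in for the real `Upstream` enum: only the part that
// matters for ordering is modelled.
enum Upstream {
    Udp(IpAddr),
    Doh(String),
}

impl Upstream {
    /// IP to key RTT tracking on, if the upstream has a stable one.
    fn tracked_ip(&self) -> Option<IpAddr> {
        match self {
            Upstream::Udp(ip) => Some(*ip),
            Upstream::Doh(_) => None,
        }
    }
}

/// Return indices into `pool`, lowest smoothed RTT first. Upstreams without
/// a tracked IP, and IPs with no sample yet (assumption), count as RTT 0.
fn order_by_srtt(pool: &[Upstream], srtt_ms: &HashMap<IpAddr, u64>) -> Vec<usize> {
    let mut candidates: Vec<(usize, u64)> = pool
        .iter()
        .enumerate()
        .map(|(i, u)| {
            let rtt = u
                .tracked_ip()
                .map(|ip| srtt_ms.get(&ip).copied().unwrap_or(0))
                .unwrap_or(0);
            (i, rtt)
        })
        .collect();
    // `sort_by_key` is stable, so equal RTTs keep their configured order.
    candidates.sort_by_key(|&(_, rtt)| rtt);
    candidates.into_iter().map(|(i, _)| i).collect()
}

fn main() {
    let slow = IpAddr::V4(Ipv4Addr::new(192, 0, 2, 1));
    let fast = IpAddr::V4(Ipv4Addr::new(192, 0, 2, 2));
    let pool = vec![
        Upstream::Udp(slow),
        Upstream::Udp(fast),
        Upstream::Doh("https://dns.example/dns-query".to_string()),
    ];
    let srtt_ms = HashMap::from([(slow, 80u64), (fast, 12u64)]);
    // The untracked DoH entry sorts first (RTT 0), then fast, then slow.
    assert_eq!(order_by_srtt(&pool, &srtt_ms), vec![2, 1, 0]);
}
```

One consequence worth noting: because untracked upstreams always report 0, they sort ahead of any tracked upstream with a non-zero sample.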


@@ -7,11 +7,10 @@
//! Both handlers call [`HealthResponse::build`] to assemble the JSON
//! response from `HealthMeta` + live inputs.
//!
//! JSON schema is documented in `docs/implementation/ios-companion-app.md`
//! §4.2. The iOS companion app's `HealthInfo` struct is the canonical
//! consumer; any change to this response must keep that struct decoding
//! cleanly (all consumed fields are optional on the Swift side, but
//! `lan_ip` is load-bearing for the pipeline).
//! The iOS companion app's `HealthInfo` struct is the canonical consumer;
//! any change to this response must keep that struct decoding cleanly (all
//! consumed fields are optional on the Swift side, but `lan_ip` is
//! load-bearing for the pipeline).
use std::net::Ipv4Addr;
use std::path::Path;


@@ -1,5 +1,6 @@
pub mod api;
pub mod blocklist;
pub mod bootstrap_resolver;
pub mod buffer;
pub mod cache;
pub mod config;
@@ -13,6 +14,7 @@ pub mod health;
pub mod lan;
pub mod mobile_api;
pub mod mobileconfig;
pub mod odoh;
pub mod override_store;
pub mod packet;
pub mod proxy;
@@ -20,11 +22,13 @@ pub mod query_log;
pub mod question;
pub mod record;
pub mod recursive;
pub mod relay;
pub mod serve;
pub mod service_store;
pub mod setup_phone;
pub mod srtt;
pub mod stats;
pub mod svcb;
pub mod system_dns;
pub mod tls;
pub mod wire;


@@ -60,6 +60,32 @@ fn main() -> numa::Result<()> {
.block_on(numa::setup_phone::run())
.map_err(|e| e.into());
}
"relay" => {
let port: u16 = std::env::args()
.nth(2)
.as_deref()
.and_then(|s| s.parse().ok())
.unwrap_or(8443);
let bind: std::net::IpAddr = std::env::args()
.nth(3)
.as_deref()
.map(|s| {
s.parse().unwrap_or_else(|e| {
eprintln!("invalid bind address '{}': {}", s, e);
std::process::exit(1);
})
})
.unwrap_or(std::net::IpAddr::V4(std::net::Ipv4Addr::LOCALHOST));
let addr = std::net::SocketAddr::new(bind, port);
eprintln!(
"\x1b[1;38;2;192;98;58mNuma\x1b[0m — ODoH relay on {}\n",
addr
);
let runtime = tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()?;
return runtime.block_on(numa::relay::run(addr));
}
"lan" => {
let sub = std::env::args().nth(2).unwrap_or_default();
let config_path = std::env::args()
@@ -91,6 +117,8 @@ fn main() -> numa::Result<()> {
eprintln!(" service status Check if the service is running");
eprintln!(" lan on Enable LAN service discovery (mDNS)");
eprintln!(" lan off Disable LAN service discovery");
eprintln!(" relay [PORT] [BIND]");
eprintln!(" Run as an ODoH relay (RFC 9230, default 127.0.0.1:8443)");
eprintln!(" setup-phone Generate a QR code to install Numa DoT on a phone");
eprintln!(" help Show this help");
eprintln!();
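
The `relay [PORT] [BIND]` handling above treats its two arguments differently: a missing or unparseable port silently falls back to 8443, while an invalid bind address is fatal. The sketch below isolates that behaviour in a pure function so it can be exercised without starting a runtime. `relay_addr` is a hypothetical helper, not part of the diff, and it returns an error where the diff calls `std::process::exit(1)`.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Hypothetical helper: `args` holds whatever follows the `relay` subcommand.
fn relay_addr(args: &[&str]) -> Result<SocketAddr, String> {
    // A missing or unparseable port falls back to 8443, as in the diff.
    let port: u16 = args
        .first()
        .and_then(|s| s.parse::<u16>().ok())
        .unwrap_or(8443);
    // A bind address that is present but invalid is reported as an error.
    let bind: IpAddr = match args.get(1) {
        Some(s) => s
            .parse::<IpAddr>()
            .map_err(|e| format!("invalid bind address '{s}': {e}"))?,
        None => IpAddr::V4(Ipv4Addr::LOCALHOST),
    };
    Ok(SocketAddr::new(bind, port))
}

fn main() {
    let default: SocketAddr = "127.0.0.1:8443".parse().unwrap();
    assert_eq!(relay_addr(&[]).unwrap(), default);
    // An unparseable port is not an error; it yields the default port.
    assert_eq!(relay_addr(&["oops"]).unwrap(), default);
    let v6: SocketAddr = "[::1]:9000".parse().unwrap();
    assert_eq!(relay_addr(&["9000", "::1"]).unwrap(), v6);
    assert!(relay_addr(&["9000", "nope"]).is_err());
}
```

Whether a mistyped port should be rejected rather than defaulted is a design choice the diff leaves as is; the sketch only mirrors it.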

489
src/odoh.rs Normal file

@@ -0,0 +1,489 @@
//! ODoH target-config fetcher and TTL cache (RFC 9230 §6).
//!
//! ## Ciphersuite policy
//! `odoh-rs` deserialization rejects any config whose KEM/KDF/AEAD triple is
//! not the mandatory `(X25519, HKDF-SHA256, AES-128-GCM)` (see
//! `ObliviousDoHConfigContents::deserialize`). This is stricter than the
//! plan's "pick the mandatory suite if mixed": a response containing *any*
//! non-mandatory config fails parse entirely. Real-world targets publish a
//! single mandatory config, so this is fine in practice; revisit if a target
//! that matters starts mixing suites.
use std::sync::Arc;
use std::time::{Duration, Instant};
use arc_swap::ArcSwapOption;
use odoh_rs::{
ObliviousDoHConfigContents, ObliviousDoHConfigs, ObliviousDoHMessage,
ObliviousDoHMessagePlaintext,
};
use rand_core::{OsRng, TryRngCore};
use reqwest::header::HeaderMap;
use tokio::sync::Mutex;
use tokio::time::timeout;
use crate::Result;
/// MIME type used for both directions of the ODoH exchange (RFC 9230 §4).
pub(crate) const ODOH_CONTENT_TYPE: &str = "application/oblivious-dns-message";
/// Cap on the response body we read into memory when the relay returns
/// non-success. Protects against a hostile relay streaming a huge body on
/// the error path; keeps enough room to carry a human-readable reason.
const ERROR_BODY_PREVIEW_BYTES: usize = 1024;
/// Fallback TTL when the target's response lacks a usable `Cache-Control`
/// directive. RFC 9230 §6.2 places no hard floor; 24 h matches what Cloudflare
/// publishes in practice.
const DEFAULT_CONFIG_TTL: Duration = Duration::from_secs(24 * 60 * 60);
/// Cap on any TTL we'll honour, regardless of what the target advertises.
/// Keeps a misconfigured server from pinning an old key indefinitely.
const MAX_CONFIG_TTL: Duration = Duration::from_secs(7 * 24 * 60 * 60);
/// After a failed `/.well-known/odohconfigs` fetch, refuse to refetch again
/// within this window — a target that is genuinely broken would otherwise
/// receive one request per query. Queries that arrive during the backoff
/// return the cached error immediately.
const REFRESH_BACKOFF: Duration = Duration::from_secs(60);
/// Parsed ODoH target config plus the freshness metadata needed to age it out.
#[derive(Debug)]
pub struct OdohTargetConfig {
pub contents: ObliviousDoHConfigContents,
pub key_id: Vec<u8>,
expires_at: Instant,
}
impl OdohTargetConfig {
pub fn is_expired(&self) -> bool {
Instant::now() >= self.expires_at
}
}
struct FailedRefresh {
at: Instant,
err: String,
}
/// TTL-gated cache of a single target's HPKE config.
///
/// Reads go through `ArcSwapOption` (lock-free hot path). Refreshes serialize
/// on an async mutex so a burst of simultaneous misses produces a single
/// outbound fetch, and a failed refresh blocks subsequent refetches for
/// [`REFRESH_BACKOFF`] to prevent hot-looping against a broken target.
pub struct OdohConfigCache {
target_host: String,
configs_url: String,
client: reqwest::Client,
current: ArcSwapOption<OdohTargetConfig>,
last_failure: ArcSwapOption<FailedRefresh>,
refresh_lock: Mutex<()>,
}
impl OdohConfigCache {
pub fn new(target_host: String, client: reqwest::Client) -> Self {
let configs_url = format!("https://{}/.well-known/odohconfigs", target_host);
Self {
target_host,
configs_url,
client,
current: ArcSwapOption::from(None),
last_failure: ArcSwapOption::from(None),
refresh_lock: Mutex::new(()),
}
}
pub fn target_host(&self) -> &str {
&self.target_host
}
/// Return a valid config, refetching when the cache is cold or expired.
/// Within [`REFRESH_BACKOFF`] of a failed refresh, returns the cached
/// error without issuing another fetch.
pub async fn get(&self) -> Result<Arc<OdohTargetConfig>> {
if let Some(cfg) = self.current.load_full() {
if !cfg.is_expired() {
return Ok(cfg);
}
}
if let Some(err) = self.backoff_error() {
return Err(err);
}
let _guard = self.refresh_lock.lock().await;
// Another task may have refreshed or failed while we waited.
if let Some(cfg) = self.current.load_full() {
if !cfg.is_expired() {
return Ok(cfg);
}
}
if let Some(err) = self.backoff_error() {
return Err(err);
}
match fetch_odoh_config(&self.client, &self.configs_url).await {
Ok(fresh) => {
let fresh = Arc::new(fresh);
self.current.store(Some(fresh.clone()));
self.last_failure.store(None);
Ok(fresh)
}
Err(e) => {
let msg = format!("ODoH config fetch failed: {e}");
self.last_failure.store(Some(Arc::new(FailedRefresh {
at: Instant::now(),
err: msg.clone(),
})));
Err(msg.into())
}
}
}
/// Drop the cached config. Called after the target rejects ciphertext
/// (key rotation race) so the next `get()` refetches.
pub fn invalidate(&self) {
self.current.store(None);
}
fn backoff_error(&self) -> Option<crate::Error> {
let fail = self.last_failure.load_full()?;
if fail.at.elapsed() < REFRESH_BACKOFF {
Some(format!("{} (backoff active)", fail.err).into())
} else {
None
}
}
}
/// Fetch `/.well-known/odohconfigs` from `configs_url` and parse it into an
/// [`OdohTargetConfig`]. The TTL is taken from the response's
/// `Cache-Control: max-age=`, falling back to [`DEFAULT_CONFIG_TTL`] when the
/// directive is absent, zero, or unparseable, and capped at [`MAX_CONFIG_TTL`].
pub async fn fetch_odoh_config(
client: &reqwest::Client,
configs_url: &str,
) -> Result<OdohTargetConfig> {
let resp = client.get(configs_url).send().await?.error_for_status()?;
let ttl = cache_control_ttl(resp.headers()).unwrap_or(DEFAULT_CONFIG_TTL);
let body = resp.bytes().await?;
parse_odoh_config(&body, ttl)
}
fn parse_odoh_config(body: &[u8], ttl: Duration) -> Result<OdohTargetConfig> {
let mut buf = body;
let configs: ObliviousDoHConfigs = odoh_rs::parse(&mut buf)
.map_err(|e| format!("failed to parse ObliviousDoHConfigs: {e}"))?;
let first = configs
.into_iter()
.next()
.ok_or("target published no ODoH configs with a supported version + ciphersuite")?;
let contents: ObliviousDoHConfigContents = first.into();
let key_id = contents
.identifier()
.map_err(|e| format!("failed to derive key_id from ODoH config: {e}"))?;
Ok(OdohTargetConfig {
contents,
key_id,
expires_at: Instant::now() + ttl.min(MAX_CONFIG_TTL),
})
}
/// Send a DNS wire query through an ODoH relay to a target and return the
/// plaintext DNS wire response.
///
/// Flow: fetch the target's HPKE config (cached), seal the query, POST to the
/// relay with `targethost`/`targetpath` URL query parameters, then unseal the
/// response. On an HTTP 401 or an unseal failure we invalidate the cache and
/// retry once; this handles the benign race where the target rotated its key
/// between our cached config and the POST.
pub async fn query_through_relay(
wire: &[u8],
relay_url: &str,
target_path: &str,
client: &reqwest::Client,
cache: &OdohConfigCache,
timeout_duration: Duration,
) -> Result<Vec<u8>> {
let req = OdohRequest {
wire,
relay_url,
target_path,
client,
cache,
timeout: timeout_duration,
};
match attempt_query(&req).await {
Ok(v) => Ok(v),
Err(AttemptError::KeyRotation(_)) => {
cache.invalidate();
attempt_query(&req).await.map_err(AttemptError::into_error)
}
Err(e) => Err(e.into_error()),
}
}
struct OdohRequest<'a> {
wire: &'a [u8],
relay_url: &'a str,
target_path: &'a str,
client: &'a reqwest::Client,
cache: &'a OdohConfigCache,
timeout: Duration,
}
/// Classification used only by the retry path in [`query_through_relay`].
enum AttemptError {
/// Target signalled the config we used is stale (key rotation race).
/// Callers should invalidate the cache and retry exactly once.
KeyRotation(String),
/// Any other failure — transport, timeout, malformed response.
Other(crate::Error),
}
impl AttemptError {
fn into_error(self) -> crate::Error {
match self {
AttemptError::KeyRotation(m) => format!("ODoH key rotation race: {m}").into(),
AttemptError::Other(e) => e,
}
}
}
async fn attempt_query(req: &OdohRequest<'_>) -> std::result::Result<Vec<u8>, AttemptError> {
let cfg = req.cache.get().await.map_err(AttemptError::Other)?;
let plaintext = ObliviousDoHMessagePlaintext::new(req.wire, 0);
// rand_core 0.9's OsRng is fallible-only; wrap for the infallible bound.
let mut os = OsRng;
let mut rng = os.unwrap_mut();
let (encrypted_query, client_secret) =
odoh_rs::encrypt_query(&plaintext, &cfg.contents, &mut rng)
.map_err(|e| AttemptError::Other(format!("ODoH encrypt failed: {e}").into()))?;
let body = odoh_rs::compose(&encrypted_query)
.map_err(|e| AttemptError::Other(format!("ODoH compose failed: {e}").into()))?
.freeze();
// RFC 9230 §5 and the reference client use URL query parameters, not
// HTTP headers, to carry the target routing. `Targethost`/`Targetpath`
// headers cause relays to treat the request as an unspecified-target and
// reject it.
let (status, resp_body) = timeout(req.timeout, async {
let resp = req
.client
.post(req.relay_url)
.header(reqwest::header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.header(reqwest::header::ACCEPT, ODOH_CONTENT_TYPE)
.header(reqwest::header::CACHE_CONTROL, "no-cache, no-store")
.query(&[
("targethost", req.cache.target_host()),
("targetpath", req.target_path),
])
.body(body)
.send()
.await?;
let status = resp.status();
let body = resp.bytes().await?;
Ok::<_, reqwest::Error>((status, body))
})
.await
.map_err(|_| AttemptError::Other("ODoH relay request timed out".into()))?
.map_err(|e| AttemptError::Other(format!("ODoH relay request failed: {e}").into()))?;
// RFC 9230 §4.3 expects a target that can't decrypt to reply with a DNS
// error in a sealed 200 response; a 401 from the relay/target is the
// practical signal that our cached HPKE key is stale. Treat 400 as a
// client-side bug (malformed ODoH envelope) — retrying would loop-fail.
if !status.is_success() {
let preview_len = resp_body.len().min(ERROR_BODY_PREVIEW_BYTES);
let body_preview = String::from_utf8_lossy(&resp_body[..preview_len]);
let msg = format!("ODoH relay returned {status}: {}", body_preview.trim());
return Err(if status.as_u16() == 401 {
AttemptError::KeyRotation(msg)
} else {
AttemptError::Other(msg.into())
});
}
let mut buf = resp_body;
let encrypted_response: ObliviousDoHMessage = odoh_rs::parse(&mut buf)
.map_err(|e| AttemptError::Other(format!("ODoH response parse failed: {e}").into()))?;
let plaintext_response =
odoh_rs::decrypt_response(&plaintext, &encrypted_response, client_secret)
.map_err(|e| AttemptError::KeyRotation(format!("ODoH decrypt failed: {e}")))?;
Ok(plaintext_response.into_msg().to_vec())
}
fn cache_control_ttl(headers: &HeaderMap) -> Option<Duration> {
let cc = headers.get(reqwest::header::CACHE_CONTROL)?.to_str().ok()?;
for directive in cc.split(',') {
let directive = directive.trim();
if let Some(rest) = directive.strip_prefix("max-age=") {
if let Ok(secs) = rest.trim().parse::<u64>() {
if secs > 0 {
return Some(Duration::from_secs(secs));
}
}
}
}
None
}
#[cfg(test)]
mod tests {
use super::*;
use odoh_rs::{ObliviousDoHConfig, ObliviousDoHKeyPair};
// RFC 9180 HPKE IDs for the sole ODoH mandatory suite:
// KEM = X25519, KDF = HKDF-SHA256, AEAD = AES-128-GCM.
const KEM_X25519: u16 = 0x0020;
const KDF_SHA256: u16 = 0x0001;
const AEAD_AES128GCM: u16 = 0x0001;
fn synth_configs_bytes() -> Vec<u8> {
let kp = ObliviousDoHKeyPair::from_parameters(
KEM_X25519,
KDF_SHA256,
AEAD_AES128GCM,
&[0u8; 32],
);
let pk = kp.public().clone();
let configs: ObliviousDoHConfigs = vec![ObliviousDoHConfig::from(pk)].into();
odoh_rs::compose(&configs).unwrap().to_vec()
}
#[test]
fn parse_accepts_well_formed_config() {
let bytes = synth_configs_bytes();
let cfg = parse_odoh_config(&bytes, Duration::from_secs(3600)).unwrap();
assert!(!cfg.key_id.is_empty());
assert!(!cfg.is_expired());
}
#[test]
fn parse_rejects_garbage() {
let bytes = [0xffu8; 16];
assert!(parse_odoh_config(&bytes, Duration::from_secs(3600)).is_err());
}
#[test]
fn parse_rejects_empty() {
assert!(parse_odoh_config(&[], Duration::from_secs(3600)).is_err());
}
#[test]
fn ttl_capped_at_max() {
let bytes = synth_configs_bytes();
let cfg = parse_odoh_config(&bytes, Duration::from_secs(100 * 24 * 60 * 60)).unwrap();
let remaining = cfg.expires_at.saturating_duration_since(Instant::now());
assert!(remaining <= MAX_CONFIG_TTL);
assert!(remaining >= MAX_CONFIG_TTL - Duration::from_secs(1));
}
#[test]
fn cache_control_parses_max_age() {
let mut h = HeaderMap::new();
h.insert("cache-control", "public, max-age=86400".parse().unwrap());
assert_eq!(cache_control_ttl(&h), Some(Duration::from_secs(86400)));
}
#[test]
fn cache_control_ignores_max_age_zero() {
let mut h = HeaderMap::new();
h.insert("cache-control", "max-age=0, no-store".parse().unwrap());
assert_eq!(cache_control_ttl(&h), None);
}
#[test]
fn cache_control_missing_falls_back() {
let h = HeaderMap::new();
assert_eq!(cache_control_ttl(&h), None);
}
#[test]
fn is_expired_tracks_ttl() {
let bytes = synth_configs_bytes();
let mut cfg = parse_odoh_config(&bytes, Duration::from_secs(3600)).unwrap();
assert!(!cfg.is_expired());
cfg.expires_at = Instant::now() - Duration::from_secs(1);
assert!(cfg.is_expired());
}
#[tokio::test]
async fn cache_backoff_blocks_refetch_after_failure() {
// Point the cache at a host that does not exist so the fetch fails
// deterministically; this exercises the backoff wiring without a
// network round-trip succeeding.
let cache = OdohConfigCache::new(
"odoh-target.invalid".to_string(),
reqwest::Client::builder()
.timeout(Duration::from_millis(200))
.build()
.unwrap(),
);
let first = cache.get().await;
assert!(first.is_err(), "first fetch must fail against invalid host");
// Within the backoff window, the cached error is returned immediately.
let second = cache.get().await.unwrap_err().to_string();
assert!(
second.contains("backoff active"),
"expected backoff hint, got: {second}"
);
// Reaching past the backoff window allows a fresh attempt — simulate
// by rewinding the recorded failure timestamp.
cache.last_failure.store(Some(Arc::new(FailedRefresh {
at: Instant::now() - (REFRESH_BACKOFF + Duration::from_secs(1)),
err: "prior".to_string(),
})));
let third = cache.get().await.unwrap_err().to_string();
assert!(
!third.contains("backoff active"),
"expected fresh fetch attempt, got: {third}"
);
}
/// Round-trip the HPKE seal/unseal path in isolation from HTTP, using the
/// odoh-rs primitives that `query_through_relay` wires together. Guards
/// against silently breaking the crypto glue if we refactor that path.
#[test]
fn seal_unseal_round_trip() {
use odoh_rs::{decrypt_query, encrypt_response, ResponseNonce};
let kp = ObliviousDoHKeyPair::from_parameters(
KEM_X25519,
KDF_SHA256,
AEAD_AES128GCM,
&[0u8; 32],
);
let query_wire = b"\x12\x34\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x07example\x03com\x00\x00\x01\x00\x01";
let query_pt = ObliviousDoHMessagePlaintext::new(query_wire, 0);
let mut os = OsRng;
let mut rng = os.unwrap_mut();
let (query_enc, client_secret) =
odoh_rs::encrypt_query(&query_pt, kp.public(), &mut rng).unwrap();
let (query_back, server_secret) = decrypt_query(&query_enc, &kp).unwrap();
assert_eq!(query_back.into_msg().as_ref(), query_wire);
let response_wire = b"\x12\x34\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00";
let response_pt = ObliviousDoHMessagePlaintext::new(response_wire, 0);
let response_enc = encrypt_response(
&query_pt,
&response_pt,
server_secret,
ResponseNonce::default(),
)
.unwrap();
let response_back =
odoh_rs::decrypt_response(&query_pt, &response_enc, client_secret).unwrap();
assert_eq!(response_back.into_msg().as_ref(), response_wire);
}
}
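The retry path in `query_through_relay` above can be read as "retry exactly once, and only for a key-rotation failure". Here is a minimal synchronous sketch of that control flow, assuming `String` as a stand-in for `crate::Error` and a hypothetical helper name `retry_once_on_rotation`; the real code is async and calls `cache.invalidate()` directly.

```rust
/// Failure classification for one attempt. `String` stands in for
/// `crate::Error` in this sketch.
#[derive(Debug)]
enum AttemptError {
    /// The cached config looked stale; worth one retry after invalidation.
    KeyRotation(String),
    /// Anything else; never retried.
    Other(String),
}

impl AttemptError {
    fn into_error(self) -> String {
        match self {
            AttemptError::KeyRotation(m) => format!("ODoH key rotation race: {}", m),
            AttemptError::Other(e) => e,
        }
    }
}

/// Hypothetical helper: run `attempt`; on a key-rotation failure call
/// `invalidate` and run `attempt` one more time. Other failures are
/// returned immediately without a retry.
fn retry_once_on_rotation<T>(
    mut attempt: impl FnMut() -> Result<T, AttemptError>,
    mut invalidate: impl FnMut(),
) -> Result<T, String> {
    match attempt() {
        Ok(v) => Ok(v),
        Err(AttemptError::KeyRotation(_)) => {
            invalidate();
            // A second KeyRotation error is surfaced to the caller, not retried.
            attempt().map_err(AttemptError::into_error)
        }
        Err(e) => Err(e.into_error()),
    }
}
```

Bounding the retry at one keeps a persistently misconfigured target from turning every query into a loop of config refetches.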


@@ -85,6 +85,14 @@ impl DnsPacket {
+ self.edns.as_ref().map_or(0, |e| e.options.capacity())
}
/// Apply `f` to every record in the three RR sections (answers,
/// authorities, resources). Does not touch questions or edns.
pub fn for_each_record_mut(&mut self, mut f: impl FnMut(&mut DnsRecord)) {
self.answers.iter_mut().for_each(&mut f);
self.authorities.iter_mut().for_each(&mut f);
self.resources.iter_mut().for_each(&mut f);
}
pub fn response_from(query: &DnsPacket, rescode: crate::header::ResultCode) -> DnsPacket {
let mut resp = DnsPacket::new();
resp.header.id = query.header.id;
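As an illustration of what `for_each_record_mut` enables, the sketch below clamps every record TTL in one pass. `Packet`, `Record` and `clamp_ttls` are hypothetical stand-ins for this example only; the real `DnsPacket` and `DnsRecord` carry far more state.

```rust
/// Hypothetical stand-in for `DnsRecord`: only the TTL matters here.
#[derive(Debug, PartialEq)]
struct Record {
    ttl: u32,
}

/// Hypothetical stand-in for `DnsPacket` with its three RR sections.
#[derive(Default)]
struct Packet {
    answers: Vec<Record>,
    authorities: Vec<Record>,
    resources: Vec<Record>,
}

impl Packet {
    /// Same shape as the method in the hunk above: one closure, applied
    /// to answers, then authorities, then resources.
    fn for_each_record_mut(&mut self, mut f: impl FnMut(&mut Record)) {
        self.answers.iter_mut().for_each(&mut f);
        self.authorities.iter_mut().for_each(&mut f);
        self.resources.iter_mut().for_each(&mut f);
    }
}

/// Clamp every record TTL into `[min, max]` without repeating the
/// per-section loop at the call site.
fn clamp_ttls(packet: &mut Packet, min: u32, max: u32) {
    packet.for_each_record_mut(|r| r.ttl = r.ttl.clamp(min, max));
}
```

Passing `&mut f` to each `for_each` lets the same closure be reused across the three sections, since `&mut F` implements `FnMut` whenever `F` does.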


@@ -1,114 +1,66 @@
use crate::buffer::BytePacketBuffer;
use crate::Result;
#[derive(PartialEq, Eq, Debug, Clone, Hash, Copy)]
pub enum QueryType {
UNKNOWN(u16),
A, // 1
NS, // 2
CNAME, // 5
SOA, // 6
PTR, // 12
MX, // 15
TXT, // 16
AAAA, // 28
SRV, // 33
DS, // 43
RRSIG, // 46
NSEC, // 47
DNSKEY, // 48
NSEC3, // 50
OPT, // 41 (EDNS0 pseudo-type)
HTTPS, // 65
macro_rules! define_qtypes {
( $( $variant:ident = $num:literal, $str:literal ),* $(,)? ) => {
#[derive(PartialEq, Eq, Debug, Clone, Hash, Copy)]
pub enum QueryType {
UNKNOWN(u16),
$( $variant, )*
}
impl QueryType {
pub fn to_num(&self) -> u16 {
match *self {
QueryType::UNKNOWN(x) => x,
$( QueryType::$variant => $num, )*
}
}
pub fn from_num(num: u16) -> QueryType {
match num {
$( $num => QueryType::$variant, )*
_ => QueryType::UNKNOWN(num),
}
}
pub fn as_str(&self) -> &'static str {
match self {
QueryType::UNKNOWN(_) => "UNKNOWN",
$( QueryType::$variant => $str, )*
}
}
pub fn parse_str(s: &str) -> Option<QueryType> {
match s.to_ascii_uppercase().as_str() {
$( $str => Some(QueryType::$variant), )*
_ => None,
}
}
}
};
}
impl QueryType {
pub fn to_num(&self) -> u16 {
match *self {
QueryType::UNKNOWN(x) => x,
QueryType::A => 1,
QueryType::NS => 2,
QueryType::CNAME => 5,
QueryType::SOA => 6,
QueryType::PTR => 12,
QueryType::MX => 15,
QueryType::TXT => 16,
QueryType::AAAA => 28,
QueryType::SRV => 33,
QueryType::OPT => 41,
QueryType::DS => 43,
QueryType::RRSIG => 46,
QueryType::NSEC => 47,
QueryType::DNSKEY => 48,
QueryType::NSEC3 => 50,
QueryType::HTTPS => 65,
}
}
pub fn from_num(num: u16) -> QueryType {
match num {
1 => QueryType::A,
2 => QueryType::NS,
5 => QueryType::CNAME,
6 => QueryType::SOA,
12 => QueryType::PTR,
15 => QueryType::MX,
16 => QueryType::TXT,
28 => QueryType::AAAA,
33 => QueryType::SRV,
41 => QueryType::OPT,
43 => QueryType::DS,
46 => QueryType::RRSIG,
47 => QueryType::NSEC,
48 => QueryType::DNSKEY,
50 => QueryType::NSEC3,
65 => QueryType::HTTPS,
_ => QueryType::UNKNOWN(num),
}
}
pub fn as_str(&self) -> &'static str {
match self {
QueryType::A => "A",
QueryType::NS => "NS",
QueryType::CNAME => "CNAME",
QueryType::SOA => "SOA",
QueryType::PTR => "PTR",
QueryType::MX => "MX",
QueryType::TXT => "TXT",
QueryType::AAAA => "AAAA",
QueryType::SRV => "SRV",
QueryType::OPT => "OPT",
QueryType::DS => "DS",
QueryType::RRSIG => "RRSIG",
QueryType::NSEC => "NSEC",
QueryType::DNSKEY => "DNSKEY",
QueryType::NSEC3 => "NSEC3",
QueryType::HTTPS => "HTTPS",
QueryType::UNKNOWN(_) => "UNKNOWN",
}
}
pub fn parse_str(s: &str) -> Option<QueryType> {
match s.to_ascii_uppercase().as_str() {
"A" => Some(QueryType::A),
"NS" => Some(QueryType::NS),
"CNAME" => Some(QueryType::CNAME),
"SOA" => Some(QueryType::SOA),
"PTR" => Some(QueryType::PTR),
"MX" => Some(QueryType::MX),
"TXT" => Some(QueryType::TXT),
"AAAA" => Some(QueryType::AAAA),
"SRV" => Some(QueryType::SRV),
"DS" => Some(QueryType::DS),
"RRSIG" => Some(QueryType::RRSIG),
"DNSKEY" => Some(QueryType::DNSKEY),
"NSEC" => Some(QueryType::NSEC),
"NSEC3" => Some(QueryType::NSEC3),
"HTTPS" => Some(QueryType::HTTPS),
_ => None,
}
}
define_qtypes! {
A = 1, "A",
NS = 2, "NS",
CNAME = 5, "CNAME",
SOA = 6, "SOA",
PTR = 12, "PTR",
MX = 15, "MX",
TXT = 16, "TXT",
AAAA = 28, "AAAA",
LOC = 29, "LOC",
SRV = 33, "SRV",
NAPTR = 35, "NAPTR",
OPT = 41, "OPT",
DS = 43, "DS",
RRSIG = 46, "RRSIG",
NSEC = 47, "NSEC",
DNSKEY = 48, "DNSKEY",
NSEC3 = 50, "NSEC3",
SVCB = 64, "SVCB",
HTTPS = 65, "HTTPS",
}
#[derive(Debug, Clone, PartialEq, Eq)]
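The point of `define_qtypes!` is that one table drives all four conversions, so they cannot drift apart (the removed hand-written `parse_str`, for instance, had no `OPT` arm). A reduced, self-contained sketch of the same pattern follows; the names `define_codes!` and `Code` are made up for this example and the table is cut down to three entries.

```rust
// One table row per variant: identifier, wire number, presentation string.
macro_rules! define_codes {
    ( $( $variant:ident = $num:literal, $str:literal ),* $(,)? ) => {
        #[derive(PartialEq, Eq, Debug, Clone, Copy)]
        pub enum Code {
            Unknown(u16),
            $( $variant, )*
        }

        impl Code {
            pub fn to_num(&self) -> u16 {
                match *self {
                    Code::Unknown(x) => x,
                    $( Code::$variant => $num, )*
                }
            }

            pub fn from_num(num: u16) -> Code {
                match num {
                    $( $num => Code::$variant, )*
                    _ => Code::Unknown(num),
                }
            }

            pub fn as_str(&self) -> &'static str {
                match self {
                    Code::Unknown(_) => "UNKNOWN",
                    $( Code::$variant => $str, )*
                }
            }

            pub fn parse_str(s: &str) -> Option<Code> {
                match s.to_ascii_uppercase().as_str() {
                    $( $str => Some(Code::$variant), )*
                    _ => None,
                }
            }
        }
    };
}

// Adding a type is now a one-line change to this table.
define_codes! {
    A = 1, "A",
    AAAA = 28, "AAAA",
    SVCB = 64, "SVCB",
}
```

Unknown numbers still round-trip through `Unknown(u16)`, so a type missing from the table is preserved on the wire rather than lost.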


@@ -24,6 +24,17 @@ pub enum DnsRecord {
host: String,
ttl: u32,
},
SOA {
domain: String,
mname: String,
rname: String,
serial: u32,
refresh: u32,
retry: u32,
expire: u32,
minimum: u32,
ttl: u32,
},
CNAME {
domain: String,
host: String,
@@ -100,6 +111,7 @@ impl DnsRecord {
| DnsRecord::RRSIG { domain, .. }
| DnsRecord::NSEC { domain, .. }
| DnsRecord::NSEC3 { domain, .. }
| DnsRecord::SOA { domain, .. }
| DnsRecord::UNKNOWN { domain, .. } => domain,
}
}
@@ -111,6 +123,7 @@ impl DnsRecord {
DnsRecord::NS { .. } => QueryType::NS,
DnsRecord::CNAME { .. } => QueryType::CNAME,
DnsRecord::MX { .. } => QueryType::MX,
DnsRecord::SOA { .. } => QueryType::SOA,
DnsRecord::DNSKEY { .. } => QueryType::DNSKEY,
DnsRecord::DS { .. } => QueryType::DS,
DnsRecord::RRSIG { .. } => QueryType::RRSIG,
@@ -132,6 +145,7 @@ impl DnsRecord {
| DnsRecord::RRSIG { ttl, .. }
| DnsRecord::NSEC { ttl, .. }
| DnsRecord::NSEC3 { ttl, .. }
| DnsRecord::SOA { ttl, .. }
| DnsRecord::UNKNOWN { ttl, .. } => *ttl,
}
}
@@ -172,6 +186,12 @@ impl DnsRecord {
+ next_hashed_owner.capacity()
+ type_bitmap.capacity()
}
DnsRecord::SOA {
domain,
mname,
rname,
..
} => domain.capacity() + mname.capacity() + rname.capacity(),
DnsRecord::UNKNOWN { domain, data, .. } => domain.capacity() + data.capacity(),
}
}
@@ -188,6 +208,7 @@ impl DnsRecord {
| DnsRecord::RRSIG { ttl, .. }
| DnsRecord::NSEC { ttl, .. }
| DnsRecord::NSEC3 { ttl, .. }
| DnsRecord::SOA { ttl, .. }
| DnsRecord::UNKNOWN { ttl, .. } => *ttl = new_ttl,
}
}
@@ -365,8 +386,31 @@ impl DnsRecord {
ttl,
})
}
QueryType::SOA => {
// MNAME/RNAME compressible per RFC 1035 §3.3.13 — decompress to avoid stale pointers on re-emit.
let mut mname = String::with_capacity(64);
buffer.read_qname(&mut mname)?;
let mut rname = String::with_capacity(64);
buffer.read_qname(&mut rname)?;
let serial = buffer.read_u32()?;
let refresh = buffer.read_u32()?;
let retry = buffer.read_u32()?;
let expire = buffer.read_u32()?;
let minimum = buffer.read_u32()?;
Ok(DnsRecord::SOA {
domain,
mname,
rname,
serial,
refresh,
retry,
expire,
minimum,
ttl,
})
}
_ => {
// SOA, TXT, SRV, etc. — stored as opaque bytes until parsed natively
// TXT, SRV, HTTPS, SVCB, etc. — stored as opaque bytes until parsed natively
let data = buffer.get_range(buffer.pos(), data_len as usize)?.to_vec();
buffer.step(data_len as usize)?;
Ok(DnsRecord::UNKNOWN {
@@ -430,6 +474,30 @@ impl DnsRecord {
let size = buffer.pos() - (pos + 2);
buffer.set_u16(pos, size as u16)?;
}
DnsRecord::SOA {
ref domain,
ref mname,
ref rname,
serial,
refresh,
retry,
expire,
minimum,
ttl,
} => {
write_header(buffer, domain, QueryType::SOA.to_num(), ttl)?;
let rdlen_pos = buffer.pos();
buffer.write_u16(0)?;
buffer.write_qname(mname)?;
buffer.write_qname(rname)?;
buffer.write_u32(serial)?;
buffer.write_u32(refresh)?;
buffer.write_u32(retry)?;
buffer.write_u32(expire)?;
buffer.write_u32(minimum)?;
let rdlen = buffer.pos() - (rdlen_pos + 2);
buffer.set_u16(rdlen_pos, rdlen as u16)?;
}
DnsRecord::AAAA {
ref domain,
ref addr,
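The SOA write arm above reserves two bytes for RDLENGTH, emits the RDATA, then back-patches the length. A rough std-only sketch of that technique over a plain `Vec<u8>` is shown below. The `Soa` struct and all helper functions are hypothetical, names are written uncompressed, and compression pointers are rejected on read, unlike the real `read_qname`.

```rust
#[derive(Debug, PartialEq)]
struct Soa {
    mname: String,
    rname: String,
    serial: u32,
    refresh: u32,
    retry: u32,
    expire: u32,
    minimum: u32,
}

/// Append `name` as uncompressed labels followed by the root label.
/// Assumes every label is at most 63 bytes.
fn write_name(out: &mut Vec<u8>, name: &str) {
    for label in name.split('.').filter(|l| !l.is_empty()) {
        out.push(label.len() as u8);
        out.extend_from_slice(label.as_bytes());
    }
    out.push(0);
}

/// Read an uncompressed name at `*pos`, advancing `*pos` past it.
fn read_name(buf: &[u8], pos: &mut usize) -> Option<String> {
    let mut labels: Vec<String> = Vec::new();
    loop {
        let len = *buf.get(*pos)? as usize;
        *pos += 1;
        if len == 0 {
            return Some(labels.join("."));
        }
        if len > 63 {
            return None; // compression pointer: not handled in this sketch
        }
        let bytes = buf.get(*pos..*pos + len)?;
        labels.push(String::from_utf8_lossy(bytes).into_owned());
        *pos += len;
    }
}

fn read_u32(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let b = buf.get(*pos..*pos + 4)?;
    *pos += 4;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Write RDLENGTH + RDATA: placeholder first, patch once the size is known.
fn write_soa_rdata(out: &mut Vec<u8>, soa: &Soa) {
    let rdlen_pos = out.len();
    out.extend_from_slice(&[0, 0]); // RDLENGTH placeholder
    write_name(out, &soa.mname);
    write_name(out, &soa.rname);
    for v in [soa.serial, soa.refresh, soa.retry, soa.expire, soa.minimum] {
        out.extend_from_slice(&v.to_be_bytes());
    }
    let rdlen = (out.len() - (rdlen_pos + 2)) as u16;
    out[rdlen_pos..rdlen_pos + 2].copy_from_slice(&rdlen.to_be_bytes());
}

/// Read RDLENGTH + RDATA; fail if the parsed size disagrees with RDLENGTH.
fn read_soa_rdata(buf: &[u8], pos: &mut usize) -> Option<Soa> {
    let l = buf.get(*pos..*pos + 2)?;
    let rdlen = u16::from_be_bytes([l[0], l[1]]) as usize;
    *pos += 2;
    let end = *pos + rdlen;
    let soa = Soa {
        mname: read_name(buf, pos)?,
        rname: read_name(buf, pos)?,
        serial: read_u32(buf, pos)?,
        refresh: read_u32(buf, pos)?,
        retry: read_u32(buf, pos)?,
        expire: read_u32(buf, pos)?,
        minimum: read_u32(buf, pos)?,
    };
    if *pos == end { Some(soa) } else { None }
}
```

Back-patching is needed because the encoded size of the two names is not known until they have been written.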

src/relay.rs Normal file

@@ -0,0 +1,342 @@
//! ODoH relay (RFC 9230 §5) — the forward-without-reading half of the
//! protocol. Runs as `numa relay`; skips all resolver initialisation (no port
//! 53, no cache, no recursion, no dashboard). The relay never reads the
//! HPKE-sealed payload and keeps no per-request logs — only aggregate
//! counters.
use std::net::SocketAddr;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::Duration;
use axum::body::Bytes;
use axum::extract::{DefaultBodyLimit, Query, State};
use axum::http::{header, StatusCode};
use axum::response::{IntoResponse, Response};
use axum::routing::{get, post};
use axum::Router;
use log::{error, info};
use serde::Deserialize;
use tokio::net::TcpListener;
use crate::forward::build_https_client_with_pool;
use crate::odoh::ODOH_CONTENT_TYPE;
use crate::Result;
/// Cap on the opaque body we accept from a client. ODoH envelopes are
/// ~100-300 bytes in practice; anything larger is malformed or hostile.
const MAX_BODY_BYTES: usize = 4 * 1024;
/// Cap on the target's response body, checked after it has been buffered and
/// before it is returned to the client. Slightly larger than the request cap:
/// target responses carry DNS answers plus HPKE overhead.
const MAX_TARGET_RESPONSE_BYTES: usize = 8 * 1024;
/// Covers the whole client-to-target round trip — not just `.send()` — so a
/// slow-drip target can't hang a worker indefinitely after headers arrive.
const TARGET_REQUEST_TIMEOUT: Duration = Duration::from_secs(5);
/// The relay hits many distinct target hosts on behalf of clients. A
/// per-host idle pool of 4 keeps warm TLS connections available for concurrent
/// fan-out without blowing up memory on a small VPS.
const RELAY_POOL_PER_HOST: usize = 4;
#[derive(Deserialize)]
struct RelayParams {
targethost: String,
targetpath: String,
}
struct RelayState {
client: reqwest::Client,
total_requests: AtomicU64,
forwarded_ok: AtomicU64,
forwarded_err: AtomicU64,
rejected_bad_request: AtomicU64,
}
impl RelayState {
fn new() -> Arc<Self> {
Arc::new(RelayState {
client: build_https_client_with_pool(RELAY_POOL_PER_HOST),
total_requests: AtomicU64::new(0),
forwarded_ok: AtomicU64::new(0),
forwarded_err: AtomicU64::new(0),
rejected_bad_request: AtomicU64::new(0),
})
}
}
/// `DefaultBodyLimit` overrides axum's 2 MiB default so hostile clients
/// can't force the relay to buffer multi-MB bodies before our own cap.
fn build_app(state: Arc<RelayState>) -> Router {
Router::new()
.route("/relay", post(handle_relay))
.layer(DefaultBodyLimit::max(MAX_BODY_BYTES))
.route("/health", get(handle_health))
.with_state(state)
}
pub async fn run(addr: SocketAddr) -> Result<()> {
let app = build_app(RelayState::new());
let listener = TcpListener::bind(addr).await?;
info!("ODoH relay listening on {}", addr);
axum::serve(listener, app).await?;
Ok(())
}
async fn handle_health(State(state): State<Arc<RelayState>>) -> impl IntoResponse {
let body = format!(
"ok\ntotal {}\nforwarded_ok {}\nforwarded_err {}\nrejected_bad_request {}\n",
state.total_requests.load(Ordering::Relaxed),
state.forwarded_ok.load(Ordering::Relaxed),
state.forwarded_err.load(Ordering::Relaxed),
state.rejected_bad_request.load(Ordering::Relaxed),
);
(
StatusCode::OK,
[(header::CONTENT_TYPE, "text/plain; charset=utf-8")],
body,
)
}
async fn handle_relay(
State(state): State<Arc<RelayState>>,
Query(params): Query<RelayParams>,
headers: axum::http::HeaderMap,
body: Bytes,
) -> Response {
state.total_requests.fetch_add(1, Ordering::Relaxed);
if !content_type_matches(&headers, ODOH_CONTENT_TYPE) {
state.rejected_bad_request.fetch_add(1, Ordering::Relaxed);
return (
StatusCode::UNSUPPORTED_MEDIA_TYPE,
"expected application/oblivious-dns-message",
)
.into_response();
}
if body.len() > MAX_BODY_BYTES {
state.rejected_bad_request.fetch_add(1, Ordering::Relaxed);
return (StatusCode::PAYLOAD_TOO_LARGE, "body exceeds 4 KiB cap").into_response();
}
if !is_valid_hostname(&params.targethost) || !params.targetpath.starts_with('/') {
state.rejected_bad_request.fetch_add(1, Ordering::Relaxed);
return (StatusCode::BAD_REQUEST, "invalid targethost or targetpath").into_response();
}
let target_url = format!("https://{}{}", params.targethost, params.targetpath);
match forward_to_target(&state.client, &target_url, body).await {
Ok((status, resp_body)) => {
state.forwarded_ok.fetch_add(1, Ordering::Relaxed);
(
status,
[(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)],
resp_body,
)
.into_response()
}
Err(e) => {
// Log the underlying reason for operators; don't leak reqwest
// internals (which can reveal the target's TLS config, IP, etc.)
// back to arbitrary clients.
error!("relay forward to {} failed: {}", target_url, e);
state.forwarded_err.fetch_add(1, Ordering::Relaxed);
(StatusCode::BAD_GATEWAY, "target unreachable").into_response()
}
}
}
async fn forward_to_target(
client: &reqwest::Client,
url: &str,
body: Bytes,
) -> Result<(StatusCode, Bytes)> {
let response = tokio::time::timeout(TARGET_REQUEST_TIMEOUT, async {
let resp = client
.post(url)
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.header(header::ACCEPT, ODOH_CONTENT_TYPE)
.body(body)
.send()
.await?;
let status = StatusCode::from_u16(resp.status().as_u16())?;
let resp_body = resp.bytes().await?;
Ok::<_, crate::Error>((status, resp_body))
})
.await
.map_err(|_| "timed out talking to target")??;
if response.1.len() > MAX_TARGET_RESPONSE_BYTES {
return Err("target response exceeds cap".into());
}
Ok(response)
}
fn content_type_matches(headers: &axum::http::HeaderMap, expected: &str) -> bool {
headers
.get(header::CONTENT_TYPE)
.and_then(|v| v.to_str().ok())
.map(|ct| ct.split(';').next().unwrap_or("").trim() == expected)
.unwrap_or(false)
}
/// Strict DNS-hostname validator, aimed at closing the SSRF surface a naive
/// `contains('.')` check leaves open (e.g. `example.com@internal.host`,
/// `evil.com/../admin`). Requires ASCII letters/digits/dot/dash, at least
/// one dot, no leading or trailing dot or dash, length ≤ 253 per RFC 1035.
fn is_valid_hostname(h: &str) -> bool {
if h.is_empty() || h.len() > 253 || !h.contains('.') {
return false;
}
if h.starts_with('.') || h.starts_with('-') || h.ends_with('.') || h.ends_with('-') {
return false;
}
h.chars()
.all(|c| c.is_ascii_alphanumeric() || c == '.' || c == '-')
}
#[cfg(test)]
mod tests {
use super::*;
async fn spawn_relay() -> (SocketAddr, Arc<RelayState>) {
let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let addr = listener.local_addr().unwrap();
let state = RelayState::new();
let app = build_app(state.clone());
tokio::spawn(async move {
let _ = axum::serve(listener, app).await;
});
(addr, state)
}
#[tokio::test]
async fn rejects_missing_content_type() {
let (addr, state) = spawn_relay().await;
let client = reqwest::Client::new();
let resp = client
.post(format!(
"http://{}/relay?targethost=odoh.example.com&targetpath=/dns-query",
addr
))
.body("body")
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::UNSUPPORTED_MEDIA_TYPE);
assert_eq!(state.rejected_bad_request.load(Ordering::Relaxed), 1);
}
#[tokio::test]
async fn rejects_oversized_body() {
let (addr, _state) = spawn_relay().await;
let big = vec![0u8; MAX_BODY_BYTES + 1];
let client = reqwest::Client::new();
let resp = client
.post(format!(
"http://{}/relay?targethost=odoh.example.com&targetpath=/dns-query",
addr
))
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.body(big)
.send()
.await
.unwrap();
// axum's DefaultBodyLimit rejects before our handler runs, so the
// counter doesn't increment — but the status code proves the layer
// enforced the cap. Either status is acceptable evidence.
assert!(matches!(
resp.status(),
reqwest::StatusCode::PAYLOAD_TOO_LARGE | reqwest::StatusCode::BAD_REQUEST
));
}
#[tokio::test]
async fn rejects_targethost_without_dot() {
let (addr, state) = spawn_relay().await;
let client = reqwest::Client::new();
let resp = client
.post(format!(
"http://{}/relay?targethost=localhost&targetpath=/dns-query",
addr
))
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.body("body")
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::BAD_REQUEST);
assert_eq!(state.rejected_bad_request.load(Ordering::Relaxed), 1);
}
#[tokio::test]
async fn rejects_userinfo_ssrf_attempt() {
let (addr, state) = spawn_relay().await;
let client = reqwest::Client::new();
// The naive contains('.') check would let this through and reqwest
// would route to `internal.host` using `evil.com` as userinfo.
let resp = client
.post(format!(
"http://{}/relay?targethost=evil.com@internal.host&targetpath=/dns-query",
addr
))
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.body("body")
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::BAD_REQUEST);
assert_eq!(state.rejected_bad_request.load(Ordering::Relaxed), 1);
}
#[tokio::test]
async fn rejects_targetpath_without_leading_slash() {
let (addr, state) = spawn_relay().await;
let client = reqwest::Client::new();
let resp = client
.post(format!(
"http://{}/relay?targethost=odoh.example.com&targetpath=dns-query",
addr
))
.header(header::CONTENT_TYPE, ODOH_CONTENT_TYPE)
.body("body")
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::BAD_REQUEST);
assert_eq!(state.rejected_bad_request.load(Ordering::Relaxed), 1);
}
#[tokio::test]
async fn health_endpoint_reports_counters() {
let (addr, _state) = spawn_relay().await;
let client = reqwest::Client::new();
let resp = client
.get(format!("http://{}/health", addr))
.send()
.await
.unwrap();
assert_eq!(resp.status(), reqwest::StatusCode::OK);
let body = resp.text().await.unwrap();
assert!(body.contains("ok\n"));
assert!(body.contains("forwarded_ok 0"));
}
#[test]
fn hostname_validator_accepts_and_rejects() {
assert!(is_valid_hostname("odoh.cloudflare-dns.com"));
assert!(is_valid_hostname("a.b"));
assert!(!is_valid_hostname(""));
assert!(!is_valid_hostname("localhost"));
assert!(!is_valid_hostname(".leading.dot"));
assert!(!is_valid_hostname("trailing.dot."));
assert!(!is_valid_hostname("-leading.dash"));
assert!(!is_valid_hostname("evil.com@internal.host"));
assert!(!is_valid_hostname("evil.com/../admin"));
assert!(!is_valid_hostname(&"a".repeat(254)));
}
}
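To see why a bare `contains('.')` check is not enough for `targethost`, consider how the authority of `https://{targethost}{targetpath}` is interpreted. The sketch below uses a deliberately simplified, hypothetical `effective_host` helper that follows the RFC 3986 rules for userinfo and the end of the authority; it ignores ports and IPv6 literals and is not the parser reqwest actually uses. `is_valid_hostname` is copied from the relay code above so the example stands alone.

```rust
/// Host that a URL parser would connect to for
/// `https://{targethost}{targetpath}` (simplified; see lead-in).
fn effective_host(targethost: &str, targetpath: &str) -> String {
    let url = format!("https://{}{}", targethost, targetpath);
    let rest = &url["https://".len()..];
    // The authority ends at the first '/', '?' or '#'.
    let end = rest
        .find(|c: char| c == '/' || c == '?' || c == '#')
        .unwrap_or(rest.len());
    let authority = &rest[..end];
    // Everything before the last '@' in the authority is userinfo.
    let host = match authority.rfind('@') {
        Some(i) => &authority[i + 1..],
        None => authority,
    };
    host.to_string()
}

/// Copy of the relay's validator, repeated here for a self-contained example.
fn is_valid_hostname(h: &str) -> bool {
    if h.is_empty() || h.len() > 253 || !h.contains('.') {
        return false;
    }
    if h.starts_with('.') || h.starts_with('-') || h.ends_with('.') || h.ends_with('-') {
        return false;
    }
    h.chars()
        .all(|c| c.is_ascii_alphanumeric() || c == '.' || c == '-')
}
```

With the input `evil.com@internal.host`, the naive check passes (the string contains a dot) while the effective host is `internal.host`; the strict validator rejects it because `@` is outside the allowed character set.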


@@ -13,11 +13,15 @@ use log::{error, info};
use tokio::net::UdpSocket;
use crate::blocklist::{download_blocklists, parse_blocklist, BlocklistStore};
use crate::bootstrap_resolver::NumaResolver;
use crate::buffer::BytePacketBuffer;
use crate::cache::DnsCache;
use crate::config::{build_zone_map, load_config, ConfigLoad};
use crate::ctx::{handle_query, ServerCtx};
use crate::forward::{parse_upstream, Upstream, UpstreamPool};
use crate::forward::{
build_https_client_with_resolver, parse_upstream_list, Upstream, UpstreamPool,
};
use crate::odoh::OdohConfigCache;
use crate::override_store::OverrideStore;
use crate::query_log::QueryLog;
use crate::service_store::ServiceStore;
@@ -45,6 +49,22 @@ pub async fn run(config_path: String) -> crate::Result<()> {
(dummy, "recursive (root hints)".to_string())
};
// Routes numa-originated HTTPS (DoH upstream, ODoH relay/target, blocklist
// CDN) away from the system resolver so lookups don't loop back through
// numa when it's its own system DNS.
let resolver_overrides = match config.upstream.mode {
crate::config::UpstreamMode::Odoh => config
.upstream
.odoh_upstream()
.map(|o| o.host_ip_overrides())
.unwrap_or_default(),
_ => std::collections::BTreeMap::new(),
};
let bootstrap_resolver: Arc<NumaResolver> = Arc::new(NumaResolver::new(
&config.upstream.fallback,
resolver_overrides,
));
let (resolved_mode, upstream_auto, pool, upstream_label) = match config.upstream.mode {
crate::config::UpstreamMode::Auto => {
info!("auto mode: probing recursive resolution...");
@@ -54,10 +74,7 @@ pub async fn run(config_path: String) -> crate::Result<()> {
(crate::config::UpstreamMode::Recursive, false, pool, label)
} else {
log::warn!("recursive probe failed — falling back to Quad9 DoH");
let client = reqwest::Client::builder()
.use_rustls_tls()
.build()
.unwrap_or_default();
let client = build_https_client_with_resolver(1, Some(bootstrap_resolver.clone()));
let url = DOH_FALLBACK.to_string();
let label = url.clone();
let pool = UpstreamPool::new(vec![Upstream::Doh { url, client }], vec![]);
@@ -82,16 +99,16 @@ pub async fn run(config_path: String) -> crate::Result<()> {
config.upstream.address.clone()
};
let primary: Vec<Upstream> = addrs
.iter()
.map(|s| parse_upstream(s, config.upstream.port))
.collect::<crate::Result<Vec<_>>>()?;
let fallback: Vec<Upstream> = config
.upstream
.fallback
.iter()
.map(|s| parse_upstream(s, config.upstream.port))
.collect::<crate::Result<Vec<_>>>()?;
let primary = parse_upstream_list(
&addrs,
config.upstream.port,
Some(bootstrap_resolver.clone()),
)?;
let fallback = parse_upstream_list(
&config.upstream.fallback,
config.upstream.port,
Some(bootstrap_resolver.clone()),
)?;
let pool = UpstreamPool::new(primary, fallback);
let label = pool.label();
@@ -102,6 +119,32 @@ pub async fn run(config_path: String) -> crate::Result<()> {
label,
)
}
crate::config::UpstreamMode::Odoh => {
let odoh = config.upstream.odoh_upstream()?;
let client = build_https_client_with_resolver(1, Some(bootstrap_resolver.clone()));
let target_config = Arc::new(OdohConfigCache::new(
odoh.target_host.clone(),
client.clone(),
));
let primary = vec![Upstream::Odoh {
relay_url: odoh.relay_url,
target_path: odoh.target_path,
client,
target_config,
}];
let fallback = if odoh.strict {
Vec::new()
} else {
parse_upstream_list(
&config.upstream.fallback,
config.upstream.port,
Some(bootstrap_resolver.clone()),
)?
};
let pool = UpstreamPool::new(primary, fallback);
let label = pool.label();
(crate::config::UpstreamMode::Odoh, false, pool, label)
}
};
let api_port = config.server.api_port;
@@ -123,7 +166,11 @@ pub async fn run(config_path: String) -> crate::Result<()> {
for fwd in &config.forwarding {
for suffix in &fwd.suffix {
info!("forwarding .{} to {} (config rule)", suffix, fwd.upstream);
info!(
"forwarding .{} to {} (config rule)",
suffix,
fwd.upstream.join(", ")
);
}
}
let forwarding_rules =
@@ -209,7 +256,7 @@ pub async fn run(config_path: String) -> crate::Result<()> {
upstream_port: config.upstream.port,
lan_ip: Mutex::new(crate::lan::detect_lan_ip().unwrap_or(std::net::Ipv4Addr::LOCALHOST)),
timeout: Duration::from_millis(config.upstream.timeout_ms),
hedge_delay: Duration::from_millis(config.upstream.hedge_ms),
hedge_delay: resolved_mode.hedge_delay(config.upstream.hedge_ms),
proxy_tld_suffix: if config.proxy.tld.is_empty() {
String::new()
} else {
@@ -232,6 +279,7 @@ pub async fn run(config_path: String) -> crate::Result<()> {
ca_pem,
mobile_enabled: config.mobile.enabled,
mobile_port: config.mobile.port,
filter_aaaa: config.server.filter_aaaa,
});
let zone_count: usize = ctx.zone_map.values().map(|m| m.len()).sum();
@@ -294,12 +342,13 @@ pub async fn run(config_path: String) -> crate::Result<()> {
};
// Title row: center within the box
let tag_line = "DNS that governs itself";
let title = format!(
"{b}NUMA{r} {it}DNS that governs itself{r} {d}v{}{r}",
"{b}NUMA{r} {it}{tag_line}{r} {d}v{}{r}",
env!("CARGO_PKG_VERSION")
);
// The title contains ANSI codes; visible length is ~38 chars. Pad to fill the box.
let title_visible_len = 4 + 2 + 24 + 2 + 1 + env!("CARGO_PKG_VERSION").len() + 1;
let title_visible_len = 4 + 2 + tag_line.len() + 2 + 1 + env!("CARGO_PKG_VERSION").len() + 1;
let title_pad = w.saturating_sub(title_visible_len);
eprintln!("\n{o}{bar_top}{r}");
eprint!("{o}{r} {title}");
@@ -386,8 +435,9 @@ pub async fn run(config_path: String) -> crate::Result<()> {
if config.blocking.enabled && !blocklist_lists.is_empty() {
let bl_ctx = Arc::clone(&ctx);
let bl_lists = blocklist_lists.clone();
let bl_resolver = bootstrap_resolver.clone();
tokio::spawn(async move {
load_blocklists(&bl_ctx, &bl_lists).await;
load_blocklists(&bl_ctx, &bl_lists, Some(bl_resolver.clone())).await;
// Periodic refresh
let mut interval = tokio::time::interval(Duration::from_secs(refresh_hours * 3600));
@@ -395,7 +445,7 @@ pub async fn run(config_path: String) -> crate::Result<()> {
loop {
interval.tick().await;
info!("refreshing blocklists...");
load_blocklists(&bl_ctx, &bl_lists).await;
load_blocklists(&bl_ctx, &bl_lists, Some(bl_resolver.clone())).await;
}
});
}
@@ -577,8 +627,8 @@ async fn network_watch_loop(ctx: Arc<ServerCtx>) {
}
}
async fn load_blocklists(ctx: &ServerCtx, lists: &[String]) {
let downloaded = download_blocklists(lists).await;
async fn load_blocklists(ctx: &ServerCtx, lists: &[String], resolver: Option<Arc<NumaResolver>>) {
let downloaded = download_blocklists(lists, resolver).await;
// Parse outside the lock to avoid blocking DNS queries during parse (~100ms)
let mut all_domains = std::collections::HashSet::new();
@@ -613,8 +663,10 @@ async fn warm_domain(ctx: &ServerCtx, domain: &str) {
}
async fn doh_keepalive_loop(ctx: Arc<ServerCtx>) {
// First tick fires immediately so we surface bootstrap-resolver failures
// (unreachable Quad9/Cloudflare, blocked :53, bad upstream hostname) in
// the startup logs instead of on the first client query.
let mut interval = tokio::time::interval(Duration::from_secs(25));
interval.tick().await; // skip first immediate tick
loop {
interval.tick().await;
let pool = ctx.upstream_pool.lock().unwrap().clone();
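On the title-alignment change earlier in this file: the tag line `DNS that governs itself` is 23 bytes long, while the removed expression hard-coded 24. One way to avoid maintaining such arithmetic by hand is to measure the visible width of the formatted string directly. The `visible_len` function below is a hypothetical sketch, not part of the codebase; it only understands ANSI CSI sequences (`ESC [ ... final byte`) and counts every other `char` as one column.

```rust
/// Count the characters a terminal would display, skipping ANSI CSI
/// sequences such as "\x1b[1m". Wide or combining characters are not
/// accounted for in this sketch.
fn visible_len(s: &str) -> usize {
    let mut n = 0;
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        if c == '\x1b' {
            // Expect '[' next, then skip parameter bytes up to and
            // including the final byte (0x40..=0x7e).
            if chars.next() == Some('[') {
                for p in chars.by_ref() {
                    if ('\x40'..='\x7e').contains(&p) {
                        break;
                    }
                }
            }
        } else {
            n += 1;
        }
    }
    n
}
```

The padding could then be computed as `w.saturating_sub(visible_len(&title))`, at the cost of one extra scan of a short string.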


@@ -102,6 +102,10 @@ pub struct ServerStats {
transport_tcp: u64,
transport_dot: u64,
transport_doh: u64,
upstream_transport_udp: u64,
upstream_transport_doh: u64,
upstream_transport_dot: u64,
upstream_transport_odoh: u64,
started_at: Instant,
}
@@ -124,6 +128,31 @@ impl Transport {
}
}
/// Wire protocol used for a forwarded upstream call. Orthogonal to
/// `QueryPath`: the path answers "where the answer came from"; this answers
/// "over what wire we spoke to the forwarder." Callers pass
/// `Option<UpstreamTransport>` — `None` for resolutions that never touched
/// a forwarder (cache/local/blocked) or for recursive mode, which has its
/// own counter via `QueryPath::Recursive`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UpstreamTransport {
Udp,
Doh,
Dot,
Odoh,
}
impl UpstreamTransport {
pub fn as_str(&self) -> &'static str {
match self {
UpstreamTransport::Udp => "UDP",
UpstreamTransport::Doh => "DOH",
UpstreamTransport::Dot => "DOT",
UpstreamTransport::Odoh => "ODOH",
}
}
}
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum QueryPath {
Local,
@@ -202,11 +231,20 @@ impl ServerStats {
transport_tcp: 0,
transport_dot: 0,
transport_doh: 0,
upstream_transport_udp: 0,
upstream_transport_doh: 0,
upstream_transport_dot: 0,
upstream_transport_odoh: 0,
started_at: Instant::now(),
}
}
- pub fn record(&mut self, path: QueryPath, transport: Transport) -> u64 {
+ pub fn record(
+ &mut self,
+ path: QueryPath,
+ transport: Transport,
+ upstream_transport: Option<UpstreamTransport>,
+ ) -> u64 {
self.queries_total += 1;
match path {
QueryPath::Local => self.queries_local += 1,
@@ -225,6 +263,14 @@ impl ServerStats {
Transport::Dot => self.transport_dot += 1,
Transport::Doh => self.transport_doh += 1,
}
if let Some(ut) = upstream_transport {
match ut {
UpstreamTransport::Udp => self.upstream_transport_udp += 1,
UpstreamTransport::Doh => self.upstream_transport_doh += 1,
UpstreamTransport::Dot => self.upstream_transport_dot += 1,
UpstreamTransport::Odoh => self.upstream_transport_odoh += 1,
}
}
self.queries_total
}
@@ -253,6 +299,10 @@ impl ServerStats {
transport_tcp: self.transport_tcp,
transport_dot: self.transport_dot,
transport_doh: self.transport_doh,
upstream_transport_udp: self.upstream_transport_udp,
upstream_transport_doh: self.upstream_transport_doh,
upstream_transport_dot: self.upstream_transport_dot,
upstream_transport_odoh: self.upstream_transport_odoh,
}
}
@@ -263,7 +313,7 @@ impl ServerStats {
let secs = uptime.as_secs() % 60;
log::info!(
- "STATS | uptime {}h{}m{}s | total {} | fwd {} | upstream {} | recursive {} | coalesced {} | cached {} | local {} | override {} | blocked {} | errors {}",
+ "STATS | uptime {}h{}m{}s | total {} | fwd {} | upstream {} | recursive {} | coalesced {} | cached {} | local {} | override {} | blocked {} | errors {} | up-udp {} | up-doh {} | up-dot {} | up-odoh {}",
hours, mins, secs,
self.queries_total,
self.queries_forwarded,
@@ -275,6 +325,10 @@ impl ServerStats {
self.queries_overridden,
self.queries_blocked,
self.upstream_errors,
self.upstream_transport_udp,
self.upstream_transport_doh,
self.upstream_transport_dot,
self.upstream_transport_odoh,
);
}
}
@@ -295,4 +349,8 @@ pub struct StatsSnapshot {
pub transport_tcp: u64,
pub transport_dot: u64,
pub transport_doh: u64,
pub upstream_transport_udp: u64,
pub upstream_transport_doh: u64,
pub upstream_transport_dot: u64,
pub upstream_transport_odoh: u64,
}
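
The counting rule added to `ServerStats::record` can be sketched in isolation: an upstream counter moves only when the caller passes `Some(transport)`, and a `None` (cache, local, blocked, or recursive resolution) leaves all four untouched. This is a minimal sketch, not the project's code; `Counters` and its field names are hypothetical stand-ins for the real `ServerStats`.

```rust
/// Wire protocol used for a forwarded upstream call (mirrors the enum in the diff).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UpstreamTransport {
    Udp,
    Doh,
    Dot,
    Odoh,
}

/// Hypothetical, trimmed-down stand-in for `ServerStats`.
#[derive(Debug, Default)]
pub struct Counters {
    pub queries_total: u64,
    pub upstream_udp: u64,
    pub upstream_doh: u64,
    pub upstream_dot: u64,
    pub upstream_odoh: u64,
}

impl Counters {
    /// Count one query. An upstream counter is bumped only when a forwarder
    /// was actually contacted, i.e. when `upstream` is `Some`.
    pub fn record(&mut self, upstream: Option<UpstreamTransport>) -> u64 {
        self.queries_total += 1;
        if let Some(ut) = upstream {
            match ut {
                UpstreamTransport::Udp => self.upstream_udp += 1,
                UpstreamTransport::Doh => self.upstream_doh += 1,
                UpstreamTransport::Dot => self.upstream_dot += 1,
                UpstreamTransport::Odoh => self.upstream_odoh += 1,
            }
        }
        self.queries_total
    }
}
```

One consequence of this shape is that the four upstream counters can sum to less than the total query count, since cached and local answers contribute nothing to them.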

src/svcb.rs Normal file

@@ -0,0 +1,179 @@
//! Minimal SVCB/HTTPS (RFC 9460) RDATA parser — just enough to strip
//! the `ipv6hint` SvcParam. Used by the `filter_aaaa` feature so
//! HTTPS-record-aware clients (Chrome ≥103, Firefox, Safari) don't
//! receive v6 address hints on IPv4-only networks.
/// SvcParamKey = 6 (RFC 9460 §14.3.2).
const IPV6_HINT_KEY: u16 = 6;
/// Strip the `ipv6hint` SvcParam from an HTTPS/SVCB RDATA blob.
///
/// Returns `Some(new_rdata)` if `ipv6hint` was present and removed.
/// Returns `None` if the record had no `ipv6hint`, or if the RDATA
/// couldn't be parsed — in both cases the caller should keep the
/// original bytes untouched.
///
/// SVCB RDATA (RFC 9460 §2.2):
/// SvcPriority (u16)
/// TargetName (uncompressed DNS name — labels terminated by 0 octet)
/// SvcParams (series of {u16 key, u16 len, opaque[len] value}, sorted by key)
pub fn strip_ipv6hint(rdata: &[u8]) -> Option<Vec<u8>> {
if rdata.len() < 2 {
return None;
}
let mut pos = 2;
// TargetName — uncompressed per RFC 9460 §2.2
loop {
let len = *rdata.get(pos)? as usize;
pos += 1;
if len == 0 {
break;
}
if len & 0xC0 != 0 {
// Pointer: forbidden in SVCB but defend against a broken upstream.
return None;
}
pos = pos.checked_add(len)?;
if pos > rdata.len() {
return None;
}
}
// Scan params once to decide whether we need to rebuild.
let params_start = pos;
let mut scan = pos;
let mut has_ipv6hint = false;
while scan < rdata.len() {
if scan + 4 > rdata.len() {
return None;
}
let key = u16::from_be_bytes([rdata[scan], rdata[scan + 1]]);
let vlen = u16::from_be_bytes([rdata[scan + 2], rdata[scan + 3]]) as usize;
let end = scan.checked_add(4)?.checked_add(vlen)?;
if end > rdata.len() {
return None;
}
if key == IPV6_HINT_KEY {
has_ipv6hint = true;
}
scan = end;
}
if scan != rdata.len() || !has_ipv6hint {
return None;
}
// Rebuild without ipv6hint. RFC 9460 requires ascending key order; copying
// the remaining params in their original sequence keeps that order.
let mut out = Vec::with_capacity(rdata.len());
out.extend_from_slice(&rdata[..params_start]);
let mut pos = params_start;
while pos < rdata.len() {
let key = u16::from_be_bytes([rdata[pos], rdata[pos + 1]]);
let vlen = u16::from_be_bytes([rdata[pos + 2], rdata[pos + 3]]) as usize;
let end = pos + 4 + vlen;
if key != IPV6_HINT_KEY {
out.extend_from_slice(&rdata[pos..end]);
}
pos = end;
}
Some(out)
}
/// Build an SVCB RDATA blob from a priority, target labels, and
/// (key, value) param pairs. Shared by `svcb` unit tests and `ctx`
/// pipeline tests that need to seed the cache with a synthetic HTTPS RR.
#[cfg(test)]
pub(crate) fn build_rdata(priority: u16, target: &[&str], params: &[(u16, Vec<u8>)]) -> Vec<u8> {
let mut out = Vec::new();
out.extend_from_slice(&priority.to_be_bytes());
for label in target {
out.push(label.len() as u8);
out.extend_from_slice(label.as_bytes());
}
out.push(0);
for (key, value) in params {
out.extend_from_slice(&key.to_be_bytes());
out.extend_from_slice(&(value.len() as u16).to_be_bytes());
out.extend_from_slice(value);
}
out
}
#[cfg(test)]
mod tests {
use super::*;
fn alpn_h3() -> (u16, Vec<u8>) {
// alpn = ["h3"]: one length-prefixed ALPN id
(1, vec![0x02, b'h', b'3'])
}
fn ipv4hint_single() -> (u16, Vec<u8>) {
(4, vec![93, 184, 216, 34])
}
fn ipv6hint_single() -> (u16, Vec<u8>) {
// 2606:4700::1
(
6,
vec![
0x26, 0x06, 0x47, 0x00, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01,
],
)
}
#[test]
fn strips_ipv6hint_and_keeps_other_params() {
let rdata = build_rdata(1, &[], &[alpn_h3(), ipv4hint_single(), ipv6hint_single()]);
let stripped = strip_ipv6hint(&rdata).expect("ipv6hint present → stripped");
let expected = build_rdata(1, &[], &[alpn_h3(), ipv4hint_single()]);
assert_eq!(stripped, expected);
}
#[test]
fn no_ipv6hint_returns_none() {
let rdata = build_rdata(1, &[], &[alpn_h3(), ipv4hint_single()]);
assert!(strip_ipv6hint(&rdata).is_none());
}
#[test]
fn alias_mode_empty_params_returns_none() {
let rdata = build_rdata(0, &["example", "com"], &[]);
assert!(strip_ipv6hint(&rdata).is_none());
}
#[test]
fn only_ipv6hint_yields_empty_param_section() {
let rdata = build_rdata(1, &[], &[ipv6hint_single()]);
let stripped = strip_ipv6hint(&rdata).expect("ipv6hint present → stripped");
let expected = build_rdata(1, &[], &[]);
assert_eq!(stripped, expected);
}
#[test]
fn preserves_target_name() {
let rdata = build_rdata(1, &["svc", "example", "net"], &[ipv6hint_single()]);
let stripped = strip_ipv6hint(&rdata).unwrap();
assert!(stripped.starts_with(&[0x00, 0x01])); // priority
assert_eq!(&stripped[2..6], b"\x03svc");
}
#[test]
fn truncated_rdata_returns_none() {
// Priority plus one label ("com"), but no terminating zero octet.
assert!(strip_ipv6hint(&[0, 1, 3, b'c', b'o', b'm']).is_none());
}
#[test]
fn empty_input_returns_none() {
assert!(strip_ipv6hint(&[]).is_none());
}
#[test]
fn param_length_overflow_returns_none() {
// key=6, length=0xFFFF but value is short — malformed.
let rdata = vec![0, 1, 0, 0, 6, 0xFF, 0xFF, 0, 1, 2];
assert!(strip_ipv6hint(&rdata).is_none());
}
}
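
To make the RDATA layout described above concrete, here is a small sketch that walks the same three sections (SvcPriority, TargetName, SvcParams) and lists the SvcParamKeys it finds. `svc_param_keys` is a hypothetical helper written for illustration, not part of `src/svcb.rs`; it reuses the same bounds-checking approach as `strip_ipv6hint`.

```rust
/// Hypothetical helper: list the SvcParamKeys present in an SVCB/HTTPS RDATA
/// blob. Returns `None` when the blob is truncated or otherwise malformed.
pub fn svc_param_keys(rdata: &[u8]) -> Option<Vec<u16>> {
    // SvcPriority: two bytes; the value itself is not needed here.
    if rdata.len() < 2 {
        return None;
    }
    let mut pos = 2;

    // TargetName: length-prefixed labels terminated by a zero octet.
    loop {
        let len = *rdata.get(pos)? as usize;
        pos += 1;
        if len == 0 {
            break;
        }
        if len & 0xC0 != 0 {
            // Top two bits set: not a plain label (e.g. a compression pointer).
            return None;
        }
        pos = pos.checked_add(len)?;
        if pos > rdata.len() {
            return None;
        }
    }

    // SvcParams: repeated {u16 key, u16 length, value[length]}.
    let mut keys = Vec::new();
    while pos < rdata.len() {
        let header = rdata.get(pos..pos.checked_add(4)?)?;
        let key = u16::from_be_bytes([header[0], header[1]]);
        let vlen = u16::from_be_bytes([header[2], header[3]]) as usize;
        pos = pos.checked_add(4)?.checked_add(vlen)?;
        if pos > rdata.len() {
            return None;
        }
        keys.push(key);
    }
    Some(keys)
}
```

For a record with priority 1, a root target, an `alpn` param and an `ipv6hint` param, this returns the keys 1 and 6 in that order, which is the ordering the strip function relies on when it copies the surviving params.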


@@ -2,7 +2,9 @@ use std::net::SocketAddr;
use log::info;
#[cfg(any(target_os = "macos", target_os = "linux"))]
use crate::forward::Upstream;
use crate::forward::UpstreamPool;
fn print_recursive_hint() {
let is_recursive = crate::config::load_config("numa.toml")
@@ -20,15 +22,15 @@ fn is_loopback_or_stub(addr: &str) -> bool {
}
/// A conditional forwarding rule: domains matching `suffix` are forwarded to `upstream`.
- #[derive(Debug, Clone)]
+ #[derive(Clone)]
pub struct ForwardingRule {
pub suffix: String,
dot_suffix: String, // pre-computed ".suffix" for zero-alloc matching
- pub upstream: Upstream,
+ pub upstream: UpstreamPool,
}
impl ForwardingRule {
- pub fn new(suffix: String, upstream: Upstream) -> Self {
+ pub fn new(suffix: String, upstream: UpstreamPool) -> Self {
let dot_suffix = format!(".{}", suffix);
Self {
suffix,
@@ -216,7 +218,8 @@ fn discover_macos() -> SystemDnsInfo {
for rule in &rules {
info!(
"auto-discovered forwarding: *.{} -> {}",
- rule.suffix, rule.upstream
+ rule.suffix,
+ rule.upstream.label()
);
}
if rules.is_empty() {
@@ -235,7 +238,8 @@ fn discover_macos() -> SystemDnsInfo {
#[cfg(any(target_os = "macos", target_os = "linux"))]
fn make_rule(domain: &str, nameserver: &str) -> Option<ForwardingRule> {
let addr = crate::forward::parse_upstream_addr(nameserver, 53).ok()?;
- Some(ForwardingRule::new(domain.to_string(), Upstream::Udp(addr)))
+ let pool = UpstreamPool::new(vec![Upstream::Udp(addr)], vec![]);
+ Some(ForwardingRule::new(domain.to_string(), pool))
}
#[cfg(target_os = "linux")]
@@ -1033,7 +1037,7 @@ fn uninstall_windows() -> Result<(), String> {
pub fn match_forwarding_rule<'a>(
domain: &str,
rules: &'a [ForwardingRule],
- ) -> Option<&'a Upstream> {
+ ) -> Option<&'a UpstreamPool> {
for rule in rules {
if domain == rule.suffix || domain.ends_with(&rule.dot_suffix) {
return Some(&rule.upstream);
@@ -1412,7 +1416,7 @@ pub fn service_status() -> Result<(), String> {
}
}
- #[cfg(any(target_os = "macos", target_os = "linux"))]
+ #[cfg(target_os = "macos")]
fn replace_exe_path(service: &str) -> Result<String, String> {
let exe_path =
std::env::current_exe().map_err(|e| format!("failed to get current exe: {}", e))?;
@@ -1660,10 +1664,78 @@ fn uninstall_linux() -> Result<(), String> {
Ok(())
}
/// Fallback install location when current_exe() sits on a path the
/// dynamic user cannot traverse (e.g. `/home/<user>/` mode 0700).
#[cfg(target_os = "linux")]
fn linux_service_exe_path() -> std::path::PathBuf {
std::path::PathBuf::from("/usr/local/bin/numa")
}
/// True iff every ancestor of `p` (excluding `/`) grants world-execute —
/// i.e. the `DynamicUser=yes` service account can traverse the path and
/// exec the binary without being in any group. Linuxbrew's
/// `/home/linuxbrew` is 0755 (traversable, keep brew's path, upgrades
/// via `brew` propagate). A build tree under `/home/<user>/` (0700) or
/// `~/.cargo/bin/` is not (copy to /usr/local/bin so systemd can reach it).
#[cfg(target_os = "linux")]
fn path_world_traversable_linux(p: &std::path::Path) -> bool {
use std::os::unix::fs::PermissionsExt;
let mut current = p;
while let Some(parent) = current.parent() {
if parent.as_os_str().is_empty() || parent == std::path::Path::new("/") {
break;
}
match std::fs::metadata(parent) {
Ok(m) if m.permissions().mode() & 0o001 != 0 => {}
_ => return false,
}
current = parent;
}
true
}
#[cfg(target_os = "linux")]
fn install_service_binary_linux() -> Result<std::path::PathBuf, String> {
let src = std::env::current_exe().map_err(|e| format!("current_exe(): {}", e))?;
if path_world_traversable_linux(&src) {
return Ok(src);
}
let dst = linux_service_exe_path();
if src == dst {
return Ok(dst);
}
if let Some(parent) = dst.parent() {
std::fs::create_dir_all(parent)
.map_err(|e| format!("failed to create {}: {}", parent.display(), e))?;
}
// Atomic replace via temp + rename. Plain copy fails with ETXTBSY when
// re-installing while the service is running the previous binary —
// rename swaps the path while the running process keeps the old inode.
let tmp = dst.with_extension("new");
std::fs::copy(&src, &tmp).map_err(|e| {
format!(
"failed to copy {} -> {}: {}",
src.display(),
tmp.display(),
e
)
})?;
std::fs::rename(&tmp, &dst).map_err(|e| {
let _ = std::fs::remove_file(&tmp);
format!(
"failed to rename {} -> {}: {}",
tmp.display(),
dst.display(),
e
)
})?;
Ok(dst)
}
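
The temp-plus-rename step in `install_service_binary_linux` can be sketched on its own. This is a minimal illustration under the assumption that source and destination live on the same filesystem; `replace_file_atomically` is a hypothetical name, and the `.new` suffix simply mirrors the diff.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: replace `dst` with the contents of `src` by copying
/// to a sibling temp file and renaming it over the destination.
pub fn replace_file_atomically(src: &Path, dst: &Path) -> io::Result<()> {
    // Sibling path, e.g. "/usr/local/bin/numa" -> "/usr/local/bin/numa.new".
    let tmp = dst.with_extension("new");
    fs::copy(src, &tmp)?;

    // The rename swaps the directory entry in one step. A process that is
    // still executing the old file keeps its already-open inode.
    if let Err(e) = fs::rename(&tmp, dst) {
        // Best-effort cleanup so a failed install does not leave debris.
        let _ = fs::remove_file(&tmp);
        return Err(e);
    }
    Ok(())
}
```

Writing to a sibling path rather than to a system temp directory keeps the rename within one directory, which is what makes the swap a single metadata operation instead of a cross-device copy.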
#[cfg(target_os = "linux")]
fn install_service_linux() -> Result<(), String> {
- let unit = include_str!("../numa.service");
- let unit = replace_exe_path(unit)?;
+ let exe = install_service_binary_linux()?;
+ let unit = include_str!("../numa.service").replace("{{exe_path}}", &exe.to_string_lossy());
std::fs::write(SYSTEMD_UNIT, unit)
.map_err(|e| format!("failed to write {}: {}", SYSTEMD_UNIT, e))?;
@@ -1675,7 +1747,9 @@ fn install_service_linux() -> Result<(), String> {
eprintln!(" warning: failed to configure system DNS: {}", e);
}
- run_systemctl(&["start", "numa"])?;
+ // restart, not start: on re-install the service is already running
+ // the previous binary; restart picks up the new one.
+ run_systemctl(&["restart", "numa"])?;
eprintln!(" Service installed and started.");
eprintln!(" Numa will auto-start on boot and restart if killed.");
@@ -1991,22 +2065,25 @@ Wireless LAN adapter Wi-Fi:
}
#[test]
#[cfg(any(target_os = "macos", target_os = "linux"))]
fn replace_exe_path_substitutes_template() {
fn install_templates_contain_exe_path_placeholder() {
// Both files are substituted at install time — plist via
// replace_exe_path on macOS, numa.service via inline .replace
// in install_service_linux. Catch placeholder removal early.
let plist = include_str!("../com.numa.dns.plist");
let unit = include_str!("../numa.service");
assert!(plist.contains("{{exe_path}}"), "plist missing placeholder");
assert!(
unit.contains("{{exe_path}}"),
"unit file missing placeholder"
);
}
#[test]
#[cfg(target_os = "macos")]
fn replace_exe_path_substitutes_template() {
let plist = include_str!("../com.numa.dns.plist");
let result = replace_exe_path(plist).expect("replace_exe_path failed for plist");
assert!(!result.contains("{{exe_path}}"));
let result = replace_exe_path(unit).expect("replace_exe_path failed for unit");
assert!(!result.contains("{{exe_path}}"));
}
#[test]
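
The suffix test used by `match_forwarding_rule` earlier in this file's diff (exact match, or a match on the pre-computed `.suffix`) is easy to get subtly wrong, so here is a self-contained sketch of its behavior. `Rule` and `match_rule` are hypothetical simplifications: the upstream is reduced to a string label instead of an `UpstreamPool`.

```rust
/// Hypothetical, trimmed-down forwarding rule; `upstream` is only a label.
pub struct Rule {
    suffix: String,
    dot_suffix: String, // pre-computed ".suffix" so matching allocates nothing
    pub upstream: &'static str,
}

impl Rule {
    pub fn new(suffix: &str, upstream: &'static str) -> Self {
        Rule {
            suffix: suffix.to_string(),
            dot_suffix: format!(".{}", suffix),
            upstream,
        }
    }
}

/// Return the upstream of the first rule whose suffix covers `domain`.
/// A rule covers the suffix itself and any name ending in ".suffix".
pub fn match_rule<'a>(domain: &str, rules: &'a [Rule]) -> Option<&'a str> {
    rules
        .iter()
        .find(|r| domain == r.suffix || domain.ends_with(&r.dot_suffix))
        .map(|r| r.upstream)
}
```

Matching on `.suffix` rather than the bare suffix is what keeps `notcorp.example` from being captured by a `corp.example` rule. Because the first match wins, more specific suffixes need to come before broader ones in the rule list. The comparison here is byte-exact, so this sketch assumes names were already normalized to one case.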


@@ -12,11 +12,13 @@ use crate::cache::DnsCache;
use crate::config::UpstreamMode;
use crate::ctx::ServerCtx;
use crate::forward::{Upstream, UpstreamPool};
use crate::header::ResultCode;
use crate::health::HealthMeta;
use crate::lan::PeerStore;
use crate::override_store::OverrideStore;
use crate::packet::DnsPacket;
use crate::query_log::QueryLog;
use crate::record::DnsRecord;
use crate::service_store::ServiceStore;
use crate::srtt::SrttCache;
use crate::stats::ServerStats;
@@ -63,9 +65,24 @@ pub async fn test_ctx() -> ServerCtx {
ca_pem: None,
mobile_enabled: false,
mobile_port: 8765,
filter_aaaa: false,
}
}
/// Build a NOERROR response containing a single A record — the shape used
/// repeatedly by pipeline/forwarding tests to seed `mock_upstream`.
pub fn a_record_response(domain: &str, addr: Ipv4Addr, ttl: u32) -> DnsPacket {
let mut pkt = DnsPacket::new();
pkt.header.response = true;
pkt.header.rescode = ResultCode::NOERROR;
pkt.answers.push(DnsRecord::A {
domain: domain.to_string(),
addr,
ttl,
});
pkt
}
/// Spawn a UDP socket that replies to the first DNS query with the given
/// response packet (patching the query ID to match). Returns the socket address.
pub async fn mock_upstream(response: DnsPacket) -> SocketAddr {

tests/docker/install-systemd.sh Executable file

@@ -0,0 +1,288 @@
#!/usr/bin/env bash
#
# Systemd service install verification for the DynamicUser-based Linux
# service unit. Stands up a privileged ubuntu:24.04 container with systemd
# as PID 1, builds numa inside, runs three scenarios that CI does not:
#
# A. Fresh install — every advertised port is not just bound but
# functional (DNS resolves on :53, TLS handshake validates against
# numa's CA on :853/:443, HTTP responds on :80, API on :5380).
# B. Upgrade from pre-drop layout (root-owned /var/lib/numa) preserves
# the CA fingerprint — users' browser-installed CA trust survives.
# C. Install from a 0700 source directory stages the binary under
# /usr/local/bin/numa and the service starts from there.
#
# First run is slow (~5-10 min): image pull + apt + cold cargo build.
# Subsequent runs reuse cached docker volumes for cargo + target (~30s).
#
# Requirements: docker
# Usage: ./tests/docker/install-systemd.sh
set -u
set -o pipefail
GREEN="\033[32m"; RED="\033[31m"; RESET="\033[0m"
pass() { printf " ${GREEN}PASS${RESET}: %s\n" "$*"; }
fail() { printf " ${RED}FAIL${RESET}: %s\n" "$*"; FAIL=1; }
# ============================================================
# Mode B: running inside the systemd container — run scenarios
# ============================================================
if [ "${NUMA_INSIDE:-}" = "1" ]; then
set +e # assertions report pass/fail, don't abort
FAIL=0
NUMA=/work/target/release/numa
reset_state() {
"$NUMA" uninstall >/dev/null 2>&1 || true
systemctl reset-failed numa 2>/dev/null || true
rm -rf /var/lib/numa /var/lib/private/numa /etc/numa /home/builder /usr/local/bin/numa
systemctl daemon-reload 2>/dev/null || true
}
main_pid_user() {
local pid
pid=$(systemctl show -p MainPID --value numa)
[ "$pid" != "0" ] || { echo ""; return; }
ps -o user= -p "$pid" 2>/dev/null | tr -d ' '
}
# MainPID and its owning user take a moment to settle after a fresh
# restart. Retry so we don't race the gap between systemd flipping the
# service to "active" and the forked numa process actually owning MainPID.
assert_nonroot() {
local pid user comm n=0
while [ $n -lt 20 ]; do
pid=$(systemctl show -p MainPID --value numa)
if [ "$pid" != "0" ]; then
comm=$(ps -o comm= -p "$pid" 2>/dev/null | tr -d ' ')
user=$(ps -o user= -p "$pid" 2>/dev/null | tr -d ' ')
if [ "$comm" = "numa" ]; then
if [ "$user" = "root" ]; then
fail "daemon runs as root (expected transient UID)"
else
pass "daemon runs as $user (non-root)"
fi
return
fi
fi
sleep 0.2
n=$((n + 1))
done
fail "numa MainPID did not settle (last: pid=${pid:-?} comm=${comm:-?} user=${user:-?})"
}
# Functional DNS check: just "port 53 bound" isn't enough — systemd-resolved
# listens on 127.0.0.53 and would satisfy a bind test. Retries for ~15s
# to tolerate cold-start upstream / blocklist warmup.
assert_dns_works() {
local n=0
while [ $n -lt 15 ]; do
if dig @127.0.0.1 -p 53 example.com +short +timeout=2 +tries=1 2>/dev/null \
| grep -qE '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'; then
pass "DNS resolves on :53 (A record returned)"
return
fi
sleep 1
n=$((n + 1))
done
fail "DNS did not return an A record on :53 within 15s"
}
# TLS handshake: cert must validate against numa's CA when connecting
# to a .numa SNI. Catches port-not-bound, wrong cert, missing CA file.
assert_tls_handshake() {
local port=$1 sni=${2:-numa.numa} out
if out=$(openssl s_client -connect "127.0.0.1:${port}" \
-servername "$sni" \
-CAfile /var/lib/numa/ca.pem \
-verify_return_error </dev/null 2>&1); then
if echo "$out" | grep -q 'Verify return code: 0 (ok)'; then
pass "TLS handshake + cert chain verified on :${port}"
else
fail "TLS handshake on :${port} did not report 'Verify return code: 0'"
fi
else
fail "openssl s_client failed connecting to :${port}"
fi
}
assert_http_responds() {
local code
code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 3 http://127.0.0.1/ || echo 000)
if [ "$code" != "000" ]; then
pass "HTTP responds on :80 (status $code)"
else
fail "HTTP :80 connection failed"
fi
}
assert_api_healthy() {
if curl -sf --max-time 3 http://127.0.0.1:5380/health >/dev/null; then
pass "API /health OK on :5380"
else
fail "API /health failed on :5380"
fi
}
ca_fingerprint() {
openssl x509 -in /var/lib/numa/ca.pem -noout -fingerprint -sha256 2>/dev/null \
| sed 's/.*=//'
}
wait_active() {
local n=0
while [ $n -lt 20 ]; do
systemctl is-active --quiet numa && return 0
sleep 0.5
n=$((n + 1))
done
fail "service did not become active within 10s"
systemctl status numa --no-pager -l 2>&1 | head -20 || true
return 1
}
# ---- Scenario A ----
printf "\n=== Scenario A: fresh install — every advertised port is functional ===\n"
reset_state
"$NUMA" install >/tmp/installA.log 2>&1 || { fail "install failed"; tail -20 /tmp/installA.log; }
wait_active || true
assert_nonroot
assert_dns_works
assert_tls_handshake 853
assert_tls_handshake 443
assert_http_responds
assert_api_healthy
# ---- Scenario B ----
# Pre-drop installs left /var/lib/numa as a plain root-owned tree.
# Flattening the current DynamicUser layout back into that shape
# simulates the upgrade path without needing an actual old binary.
printf "\n=== Scenario B: CA fingerprint survives upgrade from pre-drop layout ===\n"
fp_before=$(ca_fingerprint)
if [ -z "$fp_before" ]; then
fail "could not read initial CA fingerprint (skipping scenario B)"
else
echo " CA fingerprint before: $fp_before"
"$NUMA" uninstall >/dev/null 2>&1 || true
tmp=$(mktemp -d)
cp -a /var/lib/private/numa/. "$tmp"/ 2>/dev/null || true
rm -rf /var/lib/numa /var/lib/private/numa
mv "$tmp" /var/lib/numa
chown -R root:root /var/lib/numa
chmod 755 /var/lib/numa
[ -f /var/lib/numa/ca.pem ] || fail "ca.pem missing from seeded legacy tree"
"$NUMA" install >/tmp/installB.log 2>&1 || { fail "upgrade install failed"; tail -20 /tmp/installB.log; }
wait_active || true
assert_nonroot
fp_after=$(ca_fingerprint)
if [ -z "$fp_after" ]; then
fail "could not read CA fingerprint after upgrade"
elif [ "$fp_before" = "$fp_after" ]; then
pass "CA fingerprint preserved across upgrade"
else
fail "CA fingerprint changed: before=$fp_before after=$fp_after"
fi
assert_dns_works
fi
# ---- Scenario C ----
printf "\n=== Scenario C: install from unreachable source stages binary to /usr/local/bin ===\n"
reset_state
mkdir -p /home/builder
chmod 700 /home/builder
cp "$NUMA" /home/builder/numa
chmod 755 /home/builder/numa
/home/builder/numa install >/tmp/installC.log 2>&1 || { fail "install failed"; tail -20 /tmp/installC.log; }
wait_active || true
if [ -x /usr/local/bin/numa ]; then
pass "binary staged to /usr/local/bin/numa"
else
fail "/usr/local/bin/numa missing after install from 0700 source"
fi
exec_line=$(grep '^ExecStart=' /etc/systemd/system/numa.service 2>/dev/null || echo "ExecStart=<unit missing>")
if echo "$exec_line" | grep -q '/usr/local/bin/numa'; then
pass "unit ExecStart points to staged path"
else
fail "unit ExecStart wrong: $exec_line"
fi
assert_nonroot
assert_dns_works
reset_state
rm -rf /home/builder
echo
if [ "$FAIL" -eq 0 ]; then
printf "${GREEN}── all scenarios passed ──${RESET}\n"
exit 0
else
printf "${RED}── some scenarios failed ──${RESET}\n"
exit 1
fi
fi
# ============================================================
# Mode A: host-side bootstrap
# ============================================================
set -e
cd "$(dirname "$0")/../.."
IMAGE=numa-install-systemd:local
CONTAINER="numa-install-systemd-$$"
trap 'docker rm -f "$CONTAINER" >/dev/null 2>&1 || true' EXIT
echo "── building systemd-in-container image (cached after first run) ──"
docker build --quiet -t "$IMAGE" -f - . <<'DOCKERFILE' >/dev/null
FROM ubuntu:24.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq && apt-get install -y -qq \
systemd systemd-sysv systemd-resolved \
ca-certificates curl build-essential \
pkg-config libssl-dev cmake make perl \
dnsutils iproute2 openssl \
&& rm -rf /var/lib/apt/lists/* \
&& for u in dev-hugepages.mount sys-fs-fuse-connections.mount \
systemd-logind.service getty.target console-getty.service; do \
systemctl mask $u; \
done
STOPSIGNAL SIGRTMIN+3
CMD ["/lib/systemd/systemd"]
DOCKERFILE
echo "── starting systemd container ──"
docker run -d --name "$CONTAINER" \
--privileged --cgroupns=host \
--tmpfs /run --tmpfs /run/lock --tmpfs /tmp:exec \
-v "$PWD:/src:ro" \
-v numa-install-systemd-cargo:/root/.cargo \
-v numa-install-systemd-work:/work \
"$IMAGE" >/dev/null
# Wait for systemd to be up
for _ in $(seq 1 30); do
state=$(docker exec "$CONTAINER" systemctl is-system-running 2>&1 || true)
case "$state" in running|degraded) break ;; esac
sleep 0.5
done
echo "── copying source into /work (writable) ──"
docker exec "$CONTAINER" bash -c '
mkdir -p /work
tar -C /src --exclude=./target --exclude=./.git --exclude=./.claude -cf - . | tar -C /work -xf -
'
echo "── rustup + cargo build --release --locked ──"
docker exec "$CONTAINER" bash -c '
set -e
if ! command -v cargo &>/dev/null; then
curl -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --quiet
fi
. "$HOME/.cargo/env"
cd /work
cargo build --release --locked 2>&1 | tail -5
'
echo "── running scenarios ──"
docker exec -e NUMA_INSIDE=1 "$CONTAINER" bash /src/tests/docker/install-systemd.sh


@@ -0,0 +1,155 @@
#!/usr/bin/env bash
#
# Reproducer for issue #122 — chicken-and-egg when numa is its own system
# resolver (HAOS add-on, Pi-hole-style container, laptop with
# resolv.conf → 127.0.0.1).
#
# Topology:
# container /etc/resolv.conf → nameserver 127.0.0.1
# numa bound on :53 → upstream DoH by hostname (quad9)
# numa boots → spawns blocklist download
# reqwest::get → getaddrinfo("cdn.jsdelivr.net")
# → loopback UDP :53 → numa → cache miss → DoH upstream
# → getaddrinfo("dns.quad9.net") → same loop → glibc EAI_AGAIN
#
# Expected on master: both assertions FAIL (bug reproduced).
# Expected after bootstrap-IP fix: both assertions PASS.
#
# Requirements: docker (with internet access for external lists/DoH)
# Usage: ./tests/docker/self-resolver-loop.sh
set -euo pipefail
cd "$(dirname "$0")/../.."
GREEN="\033[32m"; RED="\033[31m"; RESET="\033[0m"
pass() { printf " ${GREEN}✓${RESET} %s\n" "$1"; }
fail() { printf " ${RED}✗${RESET} %s\n" "$1"; printf " %s\n" "$2"; FAILED=$((FAILED+1)); }
FAILED=0
OUT=/tmp/numa-self-resolver.out
echo "── self-resolver-loop: building + reproducing on debian:bookworm ──"
echo " (first run is slow: image pull + cold cargo build, ~5-8 min)"
echo
docker run --rm \
-v "$PWD:/src:ro" \
-v numa-self-resolver-cargo:/root/.cargo \
-v numa-self-resolver-target:/work/target \
debian:bookworm bash -c '
set -e
# Phase 1: install deps + build with the container DNS as given by Docker
# (resolves deb.debian.org, static.rust-lang.org, crates.io).
apt-get update -qq && apt-get install -y -qq curl build-essential dnsutils 2>&1 | tail -3
if ! command -v cargo &>/dev/null; then
curl -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --quiet
fi
. "$HOME/.cargo/env"
mkdir -p /work
tar -C /src --exclude=./target --exclude=./.git -cf - . | tar -C /work -xf -
cd /work
echo "── cargo build --release --locked ──"
cargo build --release --locked 2>&1 | tail -5
echo
# Phase 2: flip system DNS to numa itself — this is the pathological
# topology from issue #122 (HAOS add-on, resolv.conf → 127.0.0.1).
# Everything after this point, any getaddrinfo call inside numa loops
# back through :53.
echo "nameserver 127.0.0.1" > /etc/resolv.conf
echo "── /etc/resolv.conf inside container (post-flip) ──"
cat /etc/resolv.conf
echo
cat > /tmp/numa.toml <<CONF
[server]
bind_addr = "0.0.0.0:53"
api_port = 5380
api_bind_addr = "127.0.0.1"
data_dir = "/tmp/numa-data"
[upstream]
mode = "forward"
address = ["https://dns.quad9.net/dns-query"]
timeout_ms = 3000
[blocking]
enabled = true
lists = ["https://cdn.jsdelivr.net/gh/hagezi/dns-blocklists@latest/hosts/pro.txt"]
CONF
mkdir -p /tmp/numa-data
echo "── starting numa ──"
RUST_LOG=info ./target/release/numa /tmp/numa.toml > /tmp/numa.log 2>&1 &
NUMA_PID=$!
# Wait up to 120s for blocklist to populate.
# Retry delays 2+10+30s = 42s, plus ~4 × ~10s getaddrinfo timeouts under
# self-loop = ~82s worst case. 120s leaves headroom.
LOADED=0
for i in $(seq 1 120); do
LOADED=$(curl -sf http://127.0.0.1:5380/blocking/stats 2>/dev/null \
| grep -o "\"domains_loaded\":[0-9]*" | cut -d: -f2 || echo 0)
[ "${LOADED:-0}" -gt 100 ] && break
sleep 1
done
# First cold DoH query — time it.
START=$(date +%s%N)
dig @127.0.0.1 example.com A +time=15 +tries=1 > /tmp/dig.out 2>&1 || true
END=$(date +%s%N)
LATENCY_MS=$(( (END - START) / 1000000 ))
STATUS=$(grep -oE "status: [A-Z]+" /tmp/dig.out | head -1 || echo "status: TIMEOUT")
kill $NUMA_PID 2>/dev/null || true
wait $NUMA_PID 2>/dev/null || true
echo
echo "=== RESULT ==="
echo "domains_loaded=$LOADED"
echo "first_query_latency_ms=$LATENCY_MS"
echo "first_query_${STATUS// /_}"
echo
echo "=== numa.log (tail 40) ==="
tail -40 /tmp/numa.log
echo
echo "=== dig.out ==="
cat /tmp/dig.out
' 2>&1 | tee "$OUT"
echo
echo "── assertions ──"
LOADED=$(grep '^domains_loaded=' "$OUT" | tail -1 | cut -d= -f2 || echo 0)
LATENCY=$(grep '^first_query_latency_ms=' "$OUT" | tail -1 | cut -d= -f2 || echo 999999)
STATUS_LINE=$(grep '^first_query_status_' "$OUT" | tail -1 || echo "first_query_status_TIMEOUT")
if [ "${LOADED:-0}" -gt 100 ]; then
pass "blocklist downloaded (domains_loaded=$LOADED)"
else
fail "blocklist downloaded (got domains_loaded=${LOADED:-0}, expected >100)" \
"chicken-and-egg: blocklist HTTPS client has no DNS bootstrap; getaddrinfo loops through numa"
fi
if [ "${LATENCY:-999999}" -lt 2000 ]; then
pass "first DoH query under 2s (latency=${LATENCY}ms, $STATUS_LINE)"
else
fail "first DoH query under 2s (got ${LATENCY}ms, $STATUS_LINE)" \
"self-loop on getaddrinfo(upstream_host); plain DoH needs bootstrap-IP symmetry with ODoH"
fi
echo
if [ "$FAILED" -eq 0 ]; then
printf "${GREEN}── self-resolver-loop passed (fix is in place) ──${RESET}\n"
exit 0
else
printf "${RED}── self-resolver-loop failed ($FAILED assertion(s)) — bug #122 reproduced ──${RESET}\n"
exit 1
fi


@@ -1,7 +1,10 @@
#!/usr/bin/env bash
# Integration test suite for Numa
# Runs a test instance on port 5354, validates all features, exits with status.
- # Usage: ./tests/integration.sh [release|debug]
+ # Usage:
+ # ./tests/integration.sh [release|debug] # all suites
+ # SUITES=7 ./tests/integration.sh # only Suite 7
+ # SUITES=1,3,7 ./tests/integration.sh # Suites 1, 3, and 7
set -euo pipefail
@@ -14,6 +17,14 @@ LOG="/tmp/numa-integration-test.log"
PASSED=0
FAILED=0
# Suite filter: empty runs all; comma list runs a subset.
SUITES="${SUITES:-}"
should_run_suite() {
[ -z "$SUITES" ] && return 0
case ",$SUITES," in *",$1,"*) return 0;; esac
return 1
}
# Colors
GREEN="\033[32m"
RED="\033[31m"
@@ -166,6 +177,7 @@ CONF
}
# ---- Suite 1: Recursive mode + DNSSEC ----
if should_run_suite 1; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 1: Recursive + DNSSEC + Blocking ║"
@@ -234,7 +246,10 @@ kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
fi # end Suite 1
# ---- Suite 2: Forward mode (backward compat) ----
if should_run_suite 2; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 2: Forward (DoH) + Blocking ║"
@@ -261,7 +276,10 @@ enabled = true
enabled = false
"
fi # end Suite 2
# ---- Suite 3: Forward UDP (plain, no DoH) ----
if should_run_suite 3; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 3: Forward (UDP) + No Blocking ║"
@@ -307,7 +325,10 @@ kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
fi # end Suite 3
# ---- Suite 4: Local zones + Overrides API ----
if should_run_suite 4; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 4: Local Zones + Overrides API ║"
@@ -416,7 +437,10 @@ kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
fi # end Suite 4
# ---- Suite 5: DNS-over-TLS (RFC 7858) ----
if should_run_suite 5; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 5: DNS-over-TLS (RFC 7858) ║"
@@ -538,7 +562,10 @@ CONF
fi
sleep 1
fi # end Suite 5
# ---- Suite 6: Proxy + DoT coexistence ----
if should_run_suite 6; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 6: Proxy + DoT Coexistence ║"
@@ -698,6 +725,376 @@ CONF
rm -rf "$NUMA_DATA"
fi
fi # end Suite 6
# ---- Suite 7: filter_aaaa (IPv4-only networks) ----
if should_run_suite 7; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 7: filter_aaaa ║"
echo "╚══════════════════════════════════════════╝"
# Config A — filter on, with a local AAAA zone to prove local data bypass.
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
filter_aaaa = true
[upstream]
mode = "forward"
address = "9.9.9.9"
port = 53
[cache]
max_entries = 10000
[blocking]
enabled = false
[proxy]
enabled = false
[[zones]]
domain = "v6.test"
record_type = "AAAA"
value = "2001:db8::1"
ttl = 60
CONF
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
sleep 3
DIG="dig @127.0.0.1 -p $PORT +time=5 +tries=1"
echo ""
echo "=== filter_aaaa = true ==="
# A queries must be untouched.
check "A record resolves under filter_aaaa" \
"." \
"$($DIG google.com A +short | head -1)"
# AAAA must be NOERROR (NODATA), not NXDOMAIN, not SERVFAIL.
check "AAAA returns NOERROR (not NXDOMAIN)" \
"status: NOERROR" \
"$($DIG google.com AAAA 2>&1 | grep 'status:')"
check "AAAA returns zero answers (NODATA shape)" \
"ANSWER: 0" \
"$($DIG google.com AAAA 2>&1 | grep -oE 'ANSWER: [0-9]+' | head -1)"
# Local zone AAAA must survive the filter (PR claim: local data bypasses).
check "Local [[zones]] AAAA bypasses filter" \
"2001:db8::1" \
"$($DIG v6.test AAAA +short)"
# HTTPS RR: ipv6hint (SvcParamKey 6) must be stripped. Query as `type65`
# because dig 9.10.6 (macOS) misparses `HTTPS` as a domain name; `type65`
# works on both 9.10.6 and 9.18. Assert on the raw rdata hex (RFC 3597
# generic format), since dig 9.10.6 doesn't pretty-print HTTPS params.
# cloudflare.com's ipv6hint values sit under the 2606:4700 prefix —
# checking that `26064700` is absent from the rdata hex is a precise,
# upstream-stable signal that the TLV was stripped.
HTTPS_OUT=$($DIG cloudflare.com type65 2>&1)
if echo "$HTTPS_OUT" | grep -qE "cloudflare\.com\..*IN[[:space:]]+TYPE65"; then
HTTPS_HEX=$(echo "$HTTPS_OUT" | grep -A5 "IN[[:space:]]*TYPE65" | tr -d " \t\n")
if echo "$HTTPS_HEX" | grep -qi "26064700"; then
check "HTTPS ipv6hint stripped (2606:4700 absent from rdata)" "absent" "present"
else
check "HTTPS ipv6hint stripped (2606:4700 absent from rdata)" "absent" "absent"
fi
else
# Upstream didn't return an HTTPS record — skip rather than false-pass.
printf " ${DIM}~ HTTPS ipv6hint stripped (skipped: no HTTPS RR returned by upstream)${RESET}\n"
fi
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
# Config B — filter off. Regression guard: prove AAAA answers come back
# when the flag isn't set, so a network failure in Config A can't silently
# pass as "filter working".
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
[upstream]
mode = "forward"
address = "9.9.9.9"
port = 53
[cache]
max_entries = 10000
[blocking]
enabled = false
[proxy]
enabled = false
CONF
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
sleep 3
echo ""
echo "=== filter_aaaa unset (regression guard) ==="
check "AAAA returns real answers with filter off" \
":" \
"$($DIG google.com AAAA +short | head -1)"
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
fi # end Suite 7
# ---- Suite 8: ODoH (Oblivious DoH via public relay + target) ----
# Exercises the full client pipeline: /.well-known/odohconfigs fetch,
# HPKE seal/unseal, URL-query target routing (RFC 9230 §5), dashboard
# QueryPath::Odoh counter. Depends on the public ecosystem being up —
# the probe-odoh-ecosystem.sh script guards against flaky runs.
if should_run_suite 8; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 8: ODoH (Anonymous DNS) ║"
echo "╚══════════════════════════════════════════╝"
run_test_suite "ODoH via edgecompute.app relay → Cloudflare target" "
[server]
bind_addr = \"127.0.0.1:5354\"
api_port = 5381
[upstream]
mode = \"odoh\"
relay = \"https://odoh-relay.edgecompute.app/proxy\"
target = \"https://odoh.cloudflare-dns.com/dns-query\"
[cache]
max_entries = 10000
min_ttl = 60
max_ttl = 86400
[blocking]
enabled = false
[proxy]
enabled = false
"
# Restart briefly to assert ODoH-specific observability: the odoh counter
# must rise above zero after a query, and the stats label must reflect
# the oblivious path. These guard against silent regressions in the
# QueryPath::Odoh tagging and the /stats serialisation.
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
for _ in $(seq 1 30); do
curl -sf "http://127.0.0.1:$API_PORT/health" >/dev/null 2>&1 && break
sleep 0.1
done
$DIG example.com A +short > /dev/null 2>&1 || true
sleep 1
STATS=$(curl -sf http://127.0.0.1:$API_PORT/stats 2>/dev/null)
# upstream_transport.odoh lives inside the upstream_transport object.
ODOH_COUNT=$(echo "$STATS" | grep -o '"upstream_transport":{[^}]*}' \
| grep -o '"odoh":[0-9]*' | cut -d: -f2)
check "upstream_transport.odoh > 0 after a query" "[1-9]" "${ODOH_COUNT:-0}"
check "Upstream label advertises odoh://" \
"odoh://" \
"$(echo "$STATS" | grep -o '"upstream":"[^"]*"')"
check "Stats mode field is 'odoh'" \
'"mode":"odoh"' \
"$(echo "$STATS" | grep -o '"mode":"odoh"')"
# Strict-mode failure path: a clearly-unreachable relay must produce
# SERVFAIL without silent downgrade. We hijack the config to point at
# an .invalid host so we don't rely on external uptime.
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
[upstream]
mode = "odoh"
relay = "https://relay.invalid/proxy"
target = "https://odoh.cloudflare-dns.com/dns-query"
strict = true
[cache]
max_entries = 10000
[blocking]
enabled = false
[proxy]
enabled = false
CONF
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
for _ in $(seq 1 30); do
curl -sf "http://127.0.0.1:$API_PORT/health" >/dev/null 2>&1 && break
sleep 0.1
done
check "Strict-mode relay outage returns SERVFAIL" \
"SERVFAIL" \
"$($DIG example.com A 2>&1 | grep 'status:')"
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
# Negative: relay and target on the same host must be rejected at startup.
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
[upstream]
mode = "odoh"
relay = "https://odoh.cloudflare-dns.com/proxy"
target = "https://odoh.cloudflare-dns.com/dns-query"
CONF
STARTUP_OUT=$("$BINARY" "$CONFIG" 2>&1 || true)
check "Same-host relay+target rejected at startup" \
"same host" \
"$STARTUP_OUT"
# Guards ODoH's zero-plain-DNS-leak property: relay_ip / target_ip must
# land in the bootstrap resolver's override map so reqwest connects directly
# to the configured IPs instead of resolving the hostnames via plain DNS.
# The IPs below are RFC 5737 TEST-NET-1 addresses (unroutable).
cat > "$CONFIG" << 'CONF'
[server]
bind_addr = "127.0.0.1:5354"
api_port = 5381
[upstream]
mode = "odoh"
relay = "https://odoh-relay.example.com/proxy"
target = "https://odoh-target.example.org/dns-query"
relay_ip = "192.0.2.1"
target_ip = "192.0.2.2"
[cache]
max_entries = 10000
[blocking]
enabled = false
[proxy]
enabled = false
CONF
RUST_LOG=info "$BINARY" "$CONFIG" > "$LOG" 2>&1 &
NUMA_PID=$!
for _ in $(seq 1 30); do
curl -sf "http://127.0.0.1:$API_PORT/health" >/dev/null 2>&1 && break
sleep 0.1
done
OVERRIDE_LOG=$(grep 'bootstrap resolver: host overrides' "$LOG" || true)
check "relay_ip wired into bootstrap override map" \
"odoh-relay.example.com=192.0.2.1" \
"$OVERRIDE_LOG"
check "target_ip wired into bootstrap override map" \
"odoh-target.example.org=192.0.2.2" \
"$OVERRIDE_LOG"
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
fi # end Suite 8
# ---- Suite 9: Numa's own ODoH relay (`numa relay`) ----
# Exercises `numa relay PORT` as a forwarding proxy to a real ODoH target.
# Validates the RFC 9230 §5 relay behaviour: URL-query routing, content-type
# gating, body-size cap, and /health observability.
if should_run_suite 9; then
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║ Suite 9: Numa ODoH Relay (own) ║"
echo "╚══════════════════════════════════════════╝"
RELAY_PORT=18443
"$BINARY" relay $RELAY_PORT > "$LOG" 2>&1 &
NUMA_PID=$!
for _ in $(seq 1 30); do
curl -sf "http://127.0.0.1:$RELAY_PORT/health" >/dev/null 2>&1 && break
sleep 0.1
done
echo ""
echo "=== Relay Endpoints ==="
check "Health endpoint returns ok" \
"ok" \
"$(curl -sf http://127.0.0.1:$RELAY_PORT/health | head -1)"
# Happy path: forwards arbitrary body to Cloudflare's ODoH target. The
# target will reject the garbage envelope with HTTP 400 — which is exactly
# what proves our relay faithfully forwarded (otherwise we'd see our own
# 4xx from the relay itself).
HAPPY_STATUS=$(curl -sS -o /dev/null -w "%{http_code}" -X POST \
-H "Content-Type: application/oblivious-dns-message" \
--data-binary "garbage-forwarded-end-to-end" \
"http://127.0.0.1:$RELAY_PORT/relay?targethost=odoh.cloudflare-dns.com&targetpath=/dns-query")
check "Relay forwards to target (target rejects garbage → 400)" \
"400" \
"$HAPPY_STATUS"
echo ""
echo "=== Guards ==="
check "Missing content-type → 415" \
"415" \
"$(curl -sS -o /dev/null -w '%{http_code}' -X POST --data-binary 'x' \
'http://127.0.0.1:'$RELAY_PORT'/relay?targethost=odoh.cloudflare-dns.com&targetpath=/dns-query')"
check "Oversized body (>4 KiB) → 413" \
"413" \
"$(head -c 5000 /dev/urandom | curl -sS -o /dev/null -w '%{http_code}' -X POST \
-H 'Content-Type: application/oblivious-dns-message' --data-binary @- \
'http://127.0.0.1:'$RELAY_PORT'/relay?targethost=odoh.cloudflare-dns.com&targetpath=/dns-query')"
check "Invalid targethost (no dot) → 400" \
"400" \
"$(curl -sS -o /dev/null -w '%{http_code}' -X POST \
-H 'Content-Type: application/oblivious-dns-message' --data-binary 'x' \
'http://127.0.0.1:'$RELAY_PORT'/relay?targethost=invalid&targetpath=/dns-query')"
echo ""
echo "=== Counters ==="
HEALTH=$(curl -sf "http://127.0.0.1:$RELAY_PORT/health")
check "Relay counted at least one forwarded_ok" \
"[1-9]" \
"$(echo "$HEALTH" | grep 'forwarded_ok' | awk '{print $2}')"
check "Relay counted at least one rejected_bad_request" \
"[1-9]" \
"$(echo "$HEALTH" | grep 'rejected_bad_request' | awk '{print $2}')"
kill "$NUMA_PID" 2>/dev/null || true
wait "$NUMA_PID" 2>/dev/null || true
sleep 1
fi # end Suite 9
# Summary
echo ""
TOTAL=$((PASSED + FAILED))
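
The suites above repeat the same readiness loop (up to 30 polls of `/health`, 0.1 s apart) and carry on even when the server never comes up. A minimal sketch of that loop as a reusable helper follows. The name `wait_until` and its argument order are hypothetical and not part of the test script; unlike the inline loops, it returns non-zero when the attempts run out, so the caller can fail fast.

```shell
# Hypothetical helper (not in the original script): retry a command until it
# succeeds or the attempt budget is exhausted.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    _wu_attempts=$1
    _wu_delay=$2
    shift 2
    while [ "$_wu_attempts" -gt 0 ]; do
        # Run the probe command quietly; stop as soon as it succeeds.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        _wu_attempts=$((_wu_attempts - 1))
        sleep "$_wu_delay"
    done
    # Budget exhausted without a successful probe.
    return 1
}
```

Under these assumptions, a suite could call `wait_until 30 0.1 curl -sf "http://127.0.0.1:$API_PORT/health"` and skip its checks when the call fails, rather than running them against a server that never started.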

tests/probe-odoh-ecosystem.sh Executable file

@@ -0,0 +1,101 @@
#!/usr/bin/env bash
# Probe the public ODoH ecosystem.
#
# Source of truth: DNSCrypt's curated list at
# https://github.com/DNSCrypt/dnscrypt-resolvers/tree/master/v3
# - v3/odoh-servers.md (ODoH targets)
# - v3/odoh-relays.md (ODoH relays)
#
# As of commit 2025-09-16 ("odohrelay-crypto-sx seems to be the only ODoH
# relay left"), the full public ecosystem is 4 targets + 1 relay. Re-run this
# script against the upstream list before making any "only N public relays"
# claim publicly.
#
# Usage: ./tests/probe-odoh-ecosystem.sh
set -uo pipefail
GREEN="\033[32m"
RED="\033[31m"
YELLOW="\033[33m"
DIM="\033[90m"
RESET="\033[0m"
UP=0
DOWN=0
probe_target() {
local name="$1"
local host="$2"
local url="https://${host}/.well-known/odohconfigs"
local start=$(date +%s%N)
local headers
headers=$(curl -sS -o /tmp/odoh-probe-body -D - --max-time 5 -A "numa-odoh-probe/0.1" "$url" 2>&1) || {
DOWN=$((DOWN + 1))
printf " ${RED}✗${RESET} %-25s ${DIM}unreachable${RESET}\n" "$name"
return
}
local elapsed_ms=$((($(date +%s%N) - start) / 1000000))
local status
status=$(echo "$headers" | head -1 | awk '{print $2}')
local ctype
ctype=$(echo "$headers" | grep -i '^content-type:' | head -1 | tr -d '\r')
local size
size=$(stat -f%z /tmp/odoh-probe-body 2>/dev/null || stat -c%s /tmp/odoh-probe-body 2>/dev/null || echo 0)
if [[ "$status" == "200" ]] && [[ "$size" -gt 0 ]]; then
UP=$((UP + 1))
printf " ${GREEN}✓${RESET} %-25s ${DIM}%4dms %s bytes %s${RESET}\n" "$name" "$elapsed_ms" "$size" "$ctype"
else
DOWN=$((DOWN + 1))
printf " ${RED}✗${RESET} %-25s ${DIM}status=%s size=%s${RESET}\n" "$name" "$status" "$size"
fi
rm -f /tmp/odoh-probe-body
}
probe_relay() {
# Relays don't expose /.well-known/odohconfigs — we just verify TLS reachability
# and that the endpoint responds to a malformed POST with an HTTP error
# (indicating the relay path exists). A real ODoH validation requires HPKE.
local name="$1"
local url="$2"
local start=$(date +%s%N)
local status
status=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 5 -A "numa-odoh-probe/0.1" \
-X POST -H "Content-Type: application/oblivious-dns-message" \
--data-binary "" "$url" 2>&1) || {
DOWN=$((DOWN + 1))
printf " ${RED}✗${RESET} %-25s ${DIM}unreachable${RESET}\n" "$name"
return
}
local elapsed_ms=$((($(date +%s%N) - start) / 1000000))
# Any 2xx or 4xx means the endpoint is live (TLS works, HTTP responded).
# 5xx or 000 (curl failure) means broken.
if [[ "$status" =~ ^[24] ]]; then
UP=$((UP + 1))
printf " ${GREEN}✓${RESET} %-25s ${DIM}%4dms status=%s (endpoint live)${RESET}\n" "$name" "$elapsed_ms" "$status"
else
DOWN=$((DOWN + 1))
printf " ${RED}✗${RESET} %-25s ${DIM}status=%s${RESET}\n" "$name" "$status"
fi
}
echo "ODoH targets:"
probe_target "Cloudflare" "odoh.cloudflare-dns.com"
probe_target "crypto.sx" "odoh.crypto.sx"
probe_target "Snowstorm" "dope.snowstorm.love"
probe_target "Tiarap" "doh.tiarap.org"
echo
echo "ODoH relays:"
probe_relay "Frank Denis (Fastly)" "https://odoh-relay.edgecompute.app/proxy"
echo
TOTAL=$((UP + DOWN))
if [[ "$DOWN" -eq 0 ]]; then
printf "${GREEN}All %d endpoints up${RESET}\n" "$TOTAL"
exit 0
else
printf "${YELLOW}%d/%d up, %d down${RESET}\n" "$UP" "$TOTAL" "$DOWN"
exit 1
fi
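
The liveness rule used by `probe_relay` above (any 2xx or 4xx counts as live; 5xx or curl's `000` counts as broken) can be isolated into a small function, which makes it testable without network access. This is a sketch only: `classify_status` is a hypothetical name, and it assumes a stricter three-digit match than the script's `^[24]` regex, which accepts any string that merely starts with 2 or 4.

```shell
# Hypothetical helper mirroring probe_relay's rule: an HTTP status of 2xx or
# 4xx means TLS worked and the endpoint answered; anything else is "down".
classify_status() {
    case "$1" in
        2[0-9][0-9] | 4[0-9][0-9]) echo "live" ;;
        *) echo "down" ;;
    esac
}
```

For example, `classify_status "$status"` could replace the inline regex test, with the caller incrementing `UP` or `DOWN` based on the printed word.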


@@ -0,0 +1,115 @@
//! Regression test for issue #128: SOA with compressed MNAME/RNAME must
//! survive Numa's round-trip — compression pointers reference the upstream
//! packet's byte layout, so we have to decompress on read and re-compress
//! on write.
use numa::buffer::BytePacketBuffer;
use numa::packet::DnsPacket;
const COMPRESSION_FLAG: u16 = 0xC000;
fn upstream_packet() -> Vec<u8> {
let mut p = Vec::<u8>::new();
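// DNS header: ID 0x1234, flags 0x8180 (response, RD, RA),
// QDCOUNT=1, ANCOUNT=2, NSCOUNT=1, ARCOUNT=0.
p.extend_from_slice(&[
0x12, 0x34, 0x81, 0x80, 0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x00,
]);
assert_eq!(p.len(), 12);
// Question: odin.adobe.com, QTYPE 65 (HTTPS), QCLASS IN.
write_name(&mut p, &["odin", "adobe", "com"]);
p.extend_from_slice(&[0x00, 0x41, 0x00, 0x01]);
// Answer 1 owner: pointer to offset 12 (the question name).
p.extend_from_slice(&[0xC0, 0x0C]);
// TYPE 5 (CNAME), CLASS IN, TTL 0x237F (9087 s); RDLENGTH is patched below.
p.extend_from_slice(&[0x00, 0x05, 0x00, 0x01, 0x00, 0x00, 0x23, 0x7F]);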
let rdlen_pos_1 = p.len();
p.extend_from_slice(&[0x00, 0x00]);
let cname1_start = p.len();
write_name(&mut p, &["cdn", "adobeaemcloud", "com"]);
let rdlen_1 = (p.len() - cname1_start) as u16;
p[rdlen_pos_1..rdlen_pos_1 + 2].copy_from_slice(&rdlen_1.to_be_bytes());
p.extend_from_slice(&(COMPRESSION_FLAG | cname1_start as u16).to_be_bytes());
p.extend_from_slice(&[0x00, 0x05, 0x00, 0x01, 0x00, 0x00, 0x23, 0x7F]);
let rdlen_pos_2 = p.len();
p.extend_from_slice(&[0x00, 0x00]);
let cname2_start = p.len();
p.push(9);
p.extend_from_slice(b"adobe-aem");
let map_label_off = p.len();
p.push(3);
p.extend_from_slice(b"map");
let fastly_label_off = p.len();
p.push(6);
p.extend_from_slice(b"fastly");
p.push(3);
p.extend_from_slice(b"net");
p.push(0);
let rdlen_2 = (p.len() - cname2_start) as u16;
p[rdlen_pos_2..rdlen_pos_2 + 2].copy_from_slice(&rdlen_2.to_be_bytes());
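// Authority owner: pointer to "fastly.net" inside the second CNAME's rdata.
p.extend_from_slice(&(COMPRESSION_FLAG | fastly_label_off as u16).to_be_bytes());
// TYPE 6 (SOA), CLASS IN, TTL 0x0708 (1800 s); RDLENGTH is patched below.
p.extend_from_slice(&[0x00, 0x06, 0x00, 0x01, 0x00, 0x00, 0x07, 0x08]);
let rdlen_pos_soa = p.len();
p.extend_from_slice(&[0x00, 0x00]);
let soa_rdata_start = p.len();
// MNAME and RNAME are bare compression pointers ("map.fastly.net" and
// "fastly.net"), which is the shape issue #128 is about.
p.extend_from_slice(&(COMPRESSION_FLAG | map_label_off as u16).to_be_bytes());
p.extend_from_slice(&(COMPRESSION_FLAG | fastly_label_off as u16).to_be_bytes());
// SERIAL, REFRESH, RETRY, EXPIRE, MINIMUM.
p.extend_from_slice(&1u32.to_be_bytes());
p.extend_from_slice(&7200u32.to_be_bytes());
p.extend_from_slice(&3600u32.to_be_bytes());
p.extend_from_slice(&1209600u32.to_be_bytes());
p.extend_from_slice(&1800u32.to_be_bytes());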
let rdlen_soa = (p.len() - soa_rdata_start) as u16;
p[rdlen_pos_soa..rdlen_pos_soa + 2].copy_from_slice(&rdlen_soa.to_be_bytes());
p
}
fn write_name(p: &mut Vec<u8>, labels: &[&str]) {
for l in labels {
p.push(l.len() as u8);
p.extend_from_slice(l.as_bytes());
}
p.push(0);
}
#[test]
fn compressed_soa_survives_numa_round_trip() {
let upstream = upstream_packet();
let hickory_in = hickory_proto::op::Message::from_vec(&upstream)
.expect("hand-crafted upstream must be valid");
let soa_in_rd = hickory_in.name_servers()[0]
.data()
.clone()
.into_soa()
.expect("SOA rdata");
assert_eq!(soa_in_rd.mname().to_string(), "map.fastly.net.");
assert_eq!(soa_in_rd.rname().to_string(), "fastly.net.");
let mut in_buf = BytePacketBuffer::from_bytes(&upstream);
let pkt = DnsPacket::from_buffer(&mut in_buf).expect("numa parses upstream");
assert_eq!(pkt.answers.len(), 2);
assert_eq!(pkt.authorities.len(), 1);
let mut out_buf = BytePacketBuffer::new();
pkt.write(&mut out_buf).expect("numa writes");
let out = out_buf.filled().to_vec();
let hickory_out =
hickory_proto::op::Message::from_vec(&out).expect("numa re-emission must parse strictly");
let soa_out_rd = hickory_out.name_servers()[0]
.data()
.clone()
.into_soa()
.expect("SOA rdata on output");
assert_eq!(soa_out_rd.mname().to_string(), "map.fastly.net.");
assert_eq!(soa_out_rd.rname().to_string(), "fastly.net.");
assert_eq!(soa_out_rd.serial(), 1);
assert_eq!(soa_out_rd.refresh(), 7200);
assert_eq!(soa_out_rd.retry(), 3600);
assert_eq!(soa_out_rd.expire(), 1209600);
assert_eq!(soa_out_rd.minimum(), 1800);
}