feat(odoh): ship ODoH client + self-hosted relay (RFC 9230) #121

Merged
razvandimescu merged 5 commits from feat/odoh into main 2026-04-20 21:26:54 +08:00
razvandimescu commented 2026-04-20 17:34:42 +08:00 (Migrated from github.com)

Summary

  • ODoH client — new mode = "odoh" in [upstream]. HPKE seal/open via odoh-rs, URL-query target routing per RFC 9230 §5, strict-mode SERVFAIL on relay failure (no silent downgrade), /.well-known/odohconfigs fetcher with TTL cache and 60s backoff.
  • ODoH relay — new numa relay [PORT] subcommand. Forwards opaque POST bodies between client and target without reading them; SSRF-hardened, 4 KiB body cap, counters-only observability.
  • UpstreamTransport dimension — orthogonal to QueryPath, symmetric counter for UDP/DoH/DoT/ODoH via /stats.upstream_transport. Lets the dashboard answer "how much of my DNS traffic left in cleartext?" honestly.

What's in the tree

Client side (resolver talks ODoH upstream):

  • src/odoh.rs — fetcher, OdohConfigCache (lock-free ArcSwapOption read path, single-flight refresh mutex, refresh backoff), query_through_relay with retry-once on key rotation.
  • src/forward.rsUpstream::Odoh variant; Upstream::transport(); build_https_client_with_pool(n) helper (pool=1 for DoH/ODoH client, pool=4 for relay's many-target fan-out).
  • src/config.rsUpstreamMode::Odoh, OdohUpstream validator. Same-host rejection, https-only, fallback field accepts string-or-array.
  • src/serve.rs — new match arm constructing the pool under mode = "odoh".

Relay side (runs on its own host):

  • src/relay.rs — axum server, RFC 1035 hostname validator, DefaultBodyLimit layer, static 502 body (internals logged not leaked), /health with aggregate counters.
  • src/main.rsnuma relay [PORT] subcommand (default 8443).

Observability:

  • src/stats.rsUpstreamTransport enum, counters, StatsSnapshot fields.
  • src/api.rs/stats.upstream_transport JSON object mirroring client-side transport.
  • src/ctx.rs — tags each successful resolution with Some(UpstreamTransport::*) or None (cache/local/blocked).

Tests:

  • tests/integration.sh Suite 8 — ODoH client against Frank Denis's public relay + Cloudflare target. Covers strict-mode SERVFAIL and same-host config rejection.
  • tests/integration.sh Suite 9 — numa relay forwarding to Cloudflare + guards (missing content-type → 415, oversized body → 413, invalid targethost → 400).
  • tests/probe-odoh-ecosystem.sh — probes the full ecosystem (4 targets + 1 relay) per DNSCrypt's curated v3/odoh-relays.md.

Ecosystem provenance

DNSCrypt's canonical list (DNSCrypt/dnscrypt-resolvers/v3/odoh-relays.md, pruned 2025-09-16 with the commit message "odohrelay-crypto-sx seems to be the only ODoH relay left") currently has one public ODoH relay. Deploying Numa's relay doubles global supply, not "adds to a crowded field."

Test plan

  • cargo test --lib — 339 passed
  • cargo clippy --lib -- -D warnings — clean
  • cargo fmt --check — clean
  • cargo audit — only allowed warnings
  • Suite 2 (DoH) — all pass
  • Suite 8 (ODoH client) — all 18 pass
  • Suite 9 (ODoH relay) — all 7 pass
  • Live smoke test — dig @127.0.0.1 example.com through ODoH resolver → NOERROR in ~146ms via Frank Denis's relay → Cloudflare
  • Deploy numa relay to Hetzner CX22; confirm external reachability
## Summary - **ODoH client** — new `mode = "odoh"` in `[upstream]`. HPKE seal/open via `odoh-rs`, URL-query target routing per RFC 9230 §5, strict-mode SERVFAIL on relay failure (no silent downgrade), `/.well-known/odohconfigs` fetcher with TTL cache and 60s backoff. - **ODoH relay** — new `numa relay [PORT]` subcommand. Forwards opaque POST bodies between client and target without reading them; SSRF-hardened, 4 KiB body cap, counters-only observability. - **UpstreamTransport dimension** — orthogonal to `QueryPath`, symmetric counter for UDP/DoH/DoT/ODoH via `/stats.upstream_transport`. Lets the dashboard answer "how much of my DNS traffic left in cleartext?" honestly. ## What's in the tree **Client side** (resolver talks ODoH upstream): - `src/odoh.rs` — fetcher, `OdohConfigCache` (lock-free `ArcSwapOption` read path, single-flight refresh mutex, refresh backoff), `query_through_relay` with retry-once on key rotation. - `src/forward.rs` — `Upstream::Odoh` variant; `Upstream::transport()`; `build_https_client_with_pool(n)` helper (pool=1 for DoH/ODoH client, pool=4 for relay's many-target fan-out). - `src/config.rs` — `UpstreamMode::Odoh`, `OdohUpstream` validator. Same-host rejection, https-only, `fallback` field accepts string-or-array. - `src/serve.rs` — new match arm constructing the pool under `mode = "odoh"`. **Relay side** (runs on its own host): - `src/relay.rs` — axum server, RFC 1035 hostname validator, `DefaultBodyLimit` layer, static 502 body (internals logged not leaked), `/health` with aggregate counters. - `src/main.rs` — `numa relay [PORT]` subcommand (default 8443). **Observability:** - `src/stats.rs` — `UpstreamTransport` enum, counters, `StatsSnapshot` fields. - `src/api.rs` — `/stats.upstream_transport` JSON object mirroring client-side `transport`. - `src/ctx.rs` — tags each successful resolution with `Some(UpstreamTransport::*)` or `None` (cache/local/blocked). **Tests:** - `tests/integration.sh` Suite 8 — ODoH client against Frank Denis's public relay + Cloudflare target. Covers strict-mode SERVFAIL and same-host config rejection. - `tests/integration.sh` Suite 9 — `numa relay` forwarding to Cloudflare + guards (missing content-type → 415, oversized body → 413, invalid targethost → 400). - `tests/probe-odoh-ecosystem.sh` — probes the full ecosystem (4 targets + 1 relay) per DNSCrypt's curated `v3/odoh-relays.md`. ## Ecosystem provenance DNSCrypt's canonical list (`DNSCrypt/dnscrypt-resolvers/v3/odoh-relays.md`, pruned 2025-09-16 with the commit message *"odohrelay-crypto-sx seems to be the only ODoH relay left"*) currently has **one** public ODoH relay. Deploying Numa's relay doubles global supply, not "adds to a crowded field." ## Test plan - [x] `cargo test --lib` — 339 passed - [x] `cargo clippy --lib -- -D warnings` — clean - [x] `cargo fmt --check` — clean - [x] `cargo audit` — only allowed warnings - [x] Suite 2 (DoH) — all pass - [x] Suite 8 (ODoH client) — all 18 pass - [x] Suite 9 (ODoH relay) — all 7 pass - [x] Live smoke test — `dig @127.0.0.1 example.com` through ODoH resolver → NOERROR in ~146ms via Frank Denis's relay → Cloudflare - [ ] Deploy `numa relay` to Hetzner CX22; confirm external reachability
Sign in to join this conversation.