feat: recursive DNS + DNSSEC + TCP fallback (#17)
* feat: recursive resolution + full DNSSEC validation Numa becomes a true DNS resolver — resolves from root nameservers with complete DNSSEC chain-of-trust verification. Recursive resolution: - Iterative RFC 1034 from configurable root hints (13 default) - CNAME chasing (depth 8), referral following (depth 10) - A+AAAA glue extraction, IPv6 nameserver support - TLD priming: NS + DS + DNSKEY for 34 gTLDs + EU ccTLDs - Config: mode = "recursive" in [upstream], root_hints, prime_tlds DNSSEC (all 4 phases): - EDNS0 OPT pseudo-record (DO bit, 1232 payload per DNS Flag Day 2020) - DNSKEY, DS, RRSIG, NSEC, NSEC3 record types with wire read/write - Signature verification via ring: RSA/SHA-256, ECDSA P-256, Ed25519 - Chain-of-trust: zone DNSKEY → parent DS → root KSK (key tag 20326) - DNSKEY RRset self-signature verification (RRSIG(DNSKEY) by KSK) - RRSIG expiration/inception time validation - NSEC: NXDOMAIN gap proofs, NODATA type absence, wildcard denial - NSEC3: SHA-1 iterated hashing, closest encloser proof, hash range - Authority RRSIG verification for denial proofs - Config: [dnssec] enabled/strict (default false, opt-in) - AD bit on Secure, SERVFAIL on Bogus+strict - DnssecStatus cached per entry, ValidationStats logging Performance: - TLD chain pre-warmed on startup (root DNSKEY + TLD DS/DNSKEY) - Referral DS piggybacking from authority sections - DNSKEY prefetch before validation loop - Cold-cache validation: ~1 DNSKEY fetch (down from 5) - Benchmarks: RSA 10.9µs, ECDSA 174ns, DS verify 257ns Also: - write_qname fix for root domain "." (was producing malformed queries) - write_record_header() dedup, write_bytes() bulk writes - DnsRecord::domain() + query_type() accessors - UpstreamMode enum, DEFAULT_EDNS_PAYLOAD const - Real glue TTL (was hardcoded 3600) - DNSSEC restricted to recursive mode only Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: TCP fallback, query minimization, UDP auto-disable Transport resilience for restrictive networks (ISPs blocking UDP:53): - DNS-over-TCP fallback: UDP fail/truncation → automatic TCP retry - UDP auto-disable: after 3 consecutive failures, switch to TCP-first - IPv6 → TCP directly (UDP socket binds 0.0.0.0, can't reach IPv6) - Network change resets UDP detection for re-probing - Root hint rotation in TLD priming Privacy: - RFC 7816 query minimization: root servers see TLD only, not full name Code quality: - Merged find_starting_ns + find_starting_zone → find_closest_ns - Extracted resolve_ns_addrs_from_glue shared helper - Removed overall timeout wrapper (per-hop timeouts sufficient) - forward_tcp for DNS-over-TCP (RFC 1035 §4.2.2) Testing: - Mock TCP-only DNS server for fallback tests (no network needed) - tcp_fallback_resolves_when_udp_blocked - tcp_only_iterative_resolution - tcp_fallback_handles_nxdomain - udp_auto_disable_resets - Integration test suite (4 suites, 51 tests) - Network probe script (tests/network-probe.sh) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: DNSSEC verified badge in dashboard query log - Add dnssec field to QueryLogEntry, track validation status per query - DnssecStatus::as_str() for API serialization - Dashboard shows green checkmark next to DNSSEC-verified responses - Blog post: add "How keys get there" section, transport resilience section, trim code blocks, update What's Next Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use SVG shield for DNSSEC badge, update blog HTML Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: NS cache lookup from authorities, UDP re-probe, shield alignment - find_closest_ns checks authorities (not just answers) for NS records, fixing TLD priming cache misses that caused redundant root queries - Periodic UDP re-probe every 5min when disabled — re-enables UDP after switching from a restrictive network to an open one - Dashboard DNSSEC shield uses fixed-width container for alignment - Blog post: tuck key-tag into trust anchor paragraph Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: TCP single-write, mock server consistency, integration tests - TCP single-write fix: combine length prefix + message to avoid split segments that Microsoft/Azure DNS servers reject - Mock server (spawn_tcp_dns_server) updated to use single-write too - Tests: forward_tcp_wire_format, forward_tcp_single_segment_write - Integration: real-server checks for Microsoft/Office/Azure domains Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: recursive bar in dashboard, special-use domain interception Dashboard: - Add Recursive bar to resolution paths chart (cyan, distinct from Override) - Add RECURSIVE path tag style in query log Special-use domains (RFC 6761/6303/8880/9462): - .localhost → 127.0.0.1 (RFC 6761) - Private reverse PTR (10.x, 192.168.x, 172.16-31.x) → NXDOMAIN - _dns.resolver.arpa (DDR) → NXDOMAIN - ipv4only.arpa (NAT64) → 192.0.0.170/171 - mDNS service discovery for private ranges → NXDOMAIN Eliminates ~900ms SERVFAILs for macOS system queries that were hitting root servers unnecessarily. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: move generated blog HTML to site/blog/posts/, gitignore - Generated HTML now in site/blog/posts/ (gitignored) - CI workflow runs pandoc + make blog before deploy - Updated all internal blog links to /blog/posts/ path - blog/*.md remains the source of truth Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: review feedback — memory ordering, RRSIG time, NS resolution - Ordering::Relaxed → Acquire/Release for UDP_DISABLED/UDP_FAILURES (ARM correctness for cross-thread coordination) - RRSIG time validation: serial number arithmetic (RFC 4034 §3.1.5) + 300s clock skew fudge factor (matches BIND) - resolve_ns_addrs_from_glue collects addresses from ALL NS names, not just the first with glue (improves failover) - is_special_use_domain: eliminate 16 format! allocations per .in-addr.arpa query (parse octet instead) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: API endpoint tests, coverage target - 8 new axum handler tests: health, stats, query-log, overrides CRUD, cache, blocking stats, services CRUD, dashboard HTML - Tests use tower::oneshot — no network, no server startup - test_ctx() builds minimal ServerCtx for isolated testing - `make coverage` target (cargo-tarpaulin), separate from `make all` - 82 total tests (was 74) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
204
src/ctx.rs
204
src/ctx.rs
@@ -10,8 +10,8 @@ use tokio::net::UdpSocket;
|
||||
|
||||
use crate::blocklist::BlocklistStore;
|
||||
use crate::buffer::BytePacketBuffer;
|
||||
use crate::cache::DnsCache;
|
||||
use crate::config::ZoneMap;
|
||||
use crate::cache::{DnsCache, DnssecStatus};
|
||||
use crate::config::{UpstreamMode, ZoneMap};
|
||||
use crate::forward::{forward_query, Upstream};
|
||||
use crate::header::ResultCode;
|
||||
use crate::lan::PeerStore;
|
||||
@@ -27,6 +27,7 @@ use crate::system_dns::ForwardingRule;
|
||||
pub struct ServerCtx {
|
||||
pub socket: UdpSocket,
|
||||
pub zone_map: ZoneMap,
|
||||
/// std::sync::RwLock (not tokio) — locks must never be held across .await points.
|
||||
pub cache: RwLock<DnsCache>,
|
||||
pub stats: Mutex<ServerStats>,
|
||||
pub overrides: RwLock<OverrideStore>,
|
||||
@@ -48,6 +49,10 @@ pub struct ServerCtx {
|
||||
pub config_dir: PathBuf,
|
||||
pub data_dir: PathBuf,
|
||||
pub tls_config: Option<ArcSwap<ServerConfig>>,
|
||||
pub upstream_mode: UpstreamMode,
|
||||
pub root_hints: Vec<SocketAddr>,
|
||||
pub dnssec_enabled: bool,
|
||||
pub dnssec_strict: bool,
|
||||
}
|
||||
|
||||
pub async fn handle_query(
|
||||
@@ -72,12 +77,32 @@ pub async fn handle_query(
|
||||
|
||||
// Pipeline: overrides -> .tld interception -> blocklist -> local zones -> cache -> upstream
|
||||
// Each lock is scoped to avoid holding MutexGuard across await points.
|
||||
let (response, path) = {
|
||||
let (response, path, dnssec) = {
|
||||
let override_record = ctx.overrides.read().unwrap().lookup(&qname);
|
||||
if let Some(record) = override_record {
|
||||
let mut resp = DnsPacket::response_from(&query, ResultCode::NOERROR);
|
||||
resp.answers.push(record);
|
||||
(resp, QueryPath::Overridden)
|
||||
(resp, QueryPath::Overridden, DnssecStatus::Indeterminate)
|
||||
} else if qname == "localhost" || qname.ends_with(".localhost") {
|
||||
// RFC 6761: .localhost always resolves to loopback
|
||||
let mut resp = DnsPacket::response_from(&query, ResultCode::NOERROR);
|
||||
match qtype {
|
||||
QueryType::AAAA => resp.answers.push(DnsRecord::AAAA {
|
||||
domain: qname.clone(),
|
||||
addr: std::net::Ipv6Addr::LOCALHOST,
|
||||
ttl: 300,
|
||||
}),
|
||||
_ => resp.answers.push(DnsRecord::A {
|
||||
domain: qname.clone(),
|
||||
addr: std::net::Ipv4Addr::LOCALHOST,
|
||||
ttl: 300,
|
||||
}),
|
||||
}
|
||||
(resp, QueryPath::Local, DnssecStatus::Indeterminate)
|
||||
} else if is_special_use_domain(&qname) {
|
||||
// RFC 6761/8880: private PTR, DDR, NAT64 — answer locally
|
||||
let resp = special_use_response(&query, &qname, qtype);
|
||||
(resp, QueryPath::Local, DnssecStatus::Indeterminate)
|
||||
} else if !ctx.proxy_tld_suffix.is_empty()
|
||||
&& (qname.ends_with(&ctx.proxy_tld_suffix) || qname == ctx.proxy_tld)
|
||||
{
|
||||
@@ -115,7 +140,7 @@ pub async fn handle_query(
|
||||
ttl: 300,
|
||||
}),
|
||||
}
|
||||
(resp, QueryPath::Local)
|
||||
(resp, QueryPath::Local, DnssecStatus::Indeterminate)
|
||||
} else if ctx.blocklist.read().unwrap().is_blocked(&qname) {
|
||||
let mut resp = DnsPacket::response_from(&query, ResultCode::NOERROR);
|
||||
match qtype {
|
||||
@@ -130,17 +155,43 @@ pub async fn handle_query(
|
||||
ttl: 60,
|
||||
}),
|
||||
}
|
||||
(resp, QueryPath::Blocked)
|
||||
(resp, QueryPath::Blocked, DnssecStatus::Indeterminate)
|
||||
} else if let Some(records) = ctx.zone_map.get(qname.as_str()).and_then(|m| m.get(&qtype)) {
|
||||
let mut resp = DnsPacket::response_from(&query, ResultCode::NOERROR);
|
||||
resp.answers = records.clone();
|
||||
(resp, QueryPath::Local)
|
||||
(resp, QueryPath::Local, DnssecStatus::Indeterminate)
|
||||
} else {
|
||||
let cached = ctx.cache.read().unwrap().lookup(&qname, qtype);
|
||||
if let Some(cached) = cached {
|
||||
let cached = ctx.cache.read().unwrap().lookup_with_status(&qname, qtype);
|
||||
if let Some((cached, cached_dnssec)) = cached {
|
||||
let mut resp = cached;
|
||||
resp.header.id = query.header.id;
|
||||
(resp, QueryPath::Cached)
|
||||
if cached_dnssec == DnssecStatus::Secure {
|
||||
resp.header.authed_data = true;
|
||||
}
|
||||
(resp, QueryPath::Cached, cached_dnssec)
|
||||
} else if ctx.upstream_mode == UpstreamMode::Recursive {
|
||||
match crate::recursive::resolve_recursive(
|
||||
&qname,
|
||||
qtype,
|
||||
&ctx.cache,
|
||||
&query,
|
||||
&ctx.root_hints,
|
||||
)
|
||||
.await
|
||||
{
|
||||
Ok(resp) => (resp, QueryPath::Recursive, DnssecStatus::Indeterminate),
|
||||
Err(e) => {
|
||||
error!(
|
||||
"{} | {:?} {} | RECURSIVE ERROR | {}",
|
||||
src_addr, qtype, qname, e
|
||||
);
|
||||
(
|
||||
DnsPacket::response_from(&query, ResultCode::SERVFAIL),
|
||||
QueryPath::UpstreamError,
|
||||
DnssecStatus::Indeterminate,
|
||||
)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
let upstream =
|
||||
match crate::system_dns::match_forwarding_rule(&qname, &ctx.forwarding_rules) {
|
||||
@@ -150,7 +201,7 @@ pub async fn handle_query(
|
||||
match forward_query(&query, &upstream, ctx.timeout).await {
|
||||
Ok(resp) => {
|
||||
ctx.cache.write().unwrap().insert(&qname, qtype, &resp);
|
||||
(resp, QueryPath::Forwarded)
|
||||
(resp, QueryPath::Forwarded, DnssecStatus::Indeterminate)
|
||||
}
|
||||
Err(e) => {
|
||||
error!(
|
||||
@@ -160,6 +211,7 @@ pub async fn handle_query(
|
||||
(
|
||||
DnsPacket::response_from(&query, ResultCode::SERVFAIL),
|
||||
QueryPath::UpstreamError,
|
||||
DnssecStatus::Indeterminate,
|
||||
)
|
||||
}
|
||||
}
|
||||
@@ -167,6 +219,55 @@ pub async fn handle_query(
|
||||
}
|
||||
};
|
||||
|
||||
let client_do = query.edns.as_ref().is_some_and(|e| e.do_bit);
|
||||
let mut response = response;
|
||||
|
||||
// DNSSEC validation (recursive/forwarded responses only)
|
||||
let mut dnssec = dnssec;
|
||||
if ctx.dnssec_enabled && path == QueryPath::Recursive {
|
||||
let (status, vstats) =
|
||||
crate::dnssec::validate_response(&response, &ctx.cache, &ctx.root_hints).await;
|
||||
|
||||
debug!(
|
||||
"DNSSEC | {} | {:?} | {}ms | dnskey_hit={} dnskey_fetch={} ds_hit={} ds_fetch={}",
|
||||
qname,
|
||||
status,
|
||||
vstats.elapsed_ms,
|
||||
vstats.dnskey_cache_hits,
|
||||
vstats.dnskey_fetches,
|
||||
vstats.ds_cache_hits,
|
||||
vstats.ds_fetches,
|
||||
);
|
||||
|
||||
dnssec = status;
|
||||
|
||||
if status == DnssecStatus::Secure {
|
||||
response.header.authed_data = true;
|
||||
}
|
||||
|
||||
if status == DnssecStatus::Bogus && ctx.dnssec_strict {
|
||||
response = DnsPacket::response_from(&query, ResultCode::SERVFAIL);
|
||||
}
|
||||
|
||||
ctx.cache
|
||||
.write()
|
||||
.unwrap()
|
||||
.insert_with_status(&qname, qtype, &response, status);
|
||||
}
|
||||
|
||||
// Strip DNSSEC records if client didn't set DO bit
|
||||
if !client_do {
|
||||
strip_dnssec_records(&mut response);
|
||||
}
|
||||
|
||||
// Echo EDNS back if client sent it
|
||||
if query.edns.is_some() {
|
||||
response.edns = Some(crate::packet::EdnsOpt {
|
||||
do_bit: client_do,
|
||||
..Default::default()
|
||||
});
|
||||
}
|
||||
|
||||
let elapsed = start.elapsed();
|
||||
|
||||
info!(
|
||||
@@ -216,7 +317,88 @@ pub async fn handle_query(
|
||||
path,
|
||||
rescode: response.header.rescode,
|
||||
latency_us: elapsed.as_micros() as u64,
|
||||
dnssec,
|
||||
});
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn is_dnssec_record(r: &DnsRecord) -> bool {
|
||||
matches!(
|
||||
r.query_type(),
|
||||
QueryType::RRSIG | QueryType::DNSKEY | QueryType::DS | QueryType::NSEC | QueryType::NSEC3
|
||||
)
|
||||
}
|
||||
|
||||
fn strip_dnssec_records(pkt: &mut DnsPacket) {
|
||||
pkt.answers.retain(|r| !is_dnssec_record(r));
|
||||
pkt.authorities.retain(|r| !is_dnssec_record(r));
|
||||
pkt.resources.retain(|r| !is_dnssec_record(r));
|
||||
}
|
||||
|
||||
fn is_special_use_domain(qname: &str) -> bool {
|
||||
if qname.ends_with(".in-addr.arpa") {
|
||||
// RFC 6303: private + loopback + link-local reverse DNS
|
||||
if qname.ends_with(".10.in-addr.arpa")
|
||||
|| qname.ends_with(".168.192.in-addr.arpa")
|
||||
|| qname.ends_with(".127.in-addr.arpa")
|
||||
|| qname.ends_with(".254.169.in-addr.arpa")
|
||||
|| qname.ends_with(".0.in-addr.arpa")
|
||||
|| qname.contains("_dns-sd._udp")
|
||||
{
|
||||
return true;
|
||||
}
|
||||
// 172.16-31.x.x (RFC 1918) — extract second octet from reverse name
|
||||
if qname.ends_with(".172.in-addr.arpa") {
|
||||
if let Some(octet_str) = qname
|
||||
.strip_suffix(".172.in-addr.arpa")
|
||||
.and_then(|s| s.rsplit('.').next())
|
||||
{
|
||||
if let Ok(octet) = octet_str.parse::<u8>() {
|
||||
return (16..=31).contains(&octet);
|
||||
}
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
// DDR (RFC 9462)
|
||||
if qname == "_dns.resolver.arpa" || qname.ends_with("._dns.resolver.arpa") {
|
||||
return true;
|
||||
}
|
||||
// NAT64 (RFC 8880)
|
||||
qname == "ipv4only.arpa"
|
||||
}
|
||||
|
||||
fn special_use_response(query: &DnsPacket, qname: &str, qtype: QueryType) -> DnsPacket {
|
||||
use std::net::{Ipv4Addr, Ipv6Addr};
|
||||
if qname == "ipv4only.arpa" {
|
||||
// RFC 8880: well-known NAT64 addresses
|
||||
let mut resp = DnsPacket::response_from(query, ResultCode::NOERROR);
|
||||
let domain = qname.to_string();
|
||||
match qtype {
|
||||
QueryType::A => {
|
||||
resp.answers.push(DnsRecord::A {
|
||||
domain: domain.clone(),
|
||||
addr: Ipv4Addr::new(192, 0, 0, 170),
|
||||
ttl: 300,
|
||||
});
|
||||
resp.answers.push(DnsRecord::A {
|
||||
domain,
|
||||
addr: Ipv4Addr::new(192, 0, 0, 171),
|
||||
ttl: 300,
|
||||
});
|
||||
}
|
||||
QueryType::AAAA => {
|
||||
resp.answers.push(DnsRecord::AAAA {
|
||||
domain,
|
||||
addr: Ipv6Addr::new(0x0064, 0xff9b, 0, 0, 0, 0, 0xc000, 0x00aa),
|
||||
ttl: 300,
|
||||
});
|
||||
}
|
||||
_ => {}
|
||||
}
|
||||
resp
|
||||
} else {
|
||||
DnsPacket::response_from(query, ResultCode::NXDOMAIN)
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user