feat: recursive DNS + DNSSEC + TCP fallback (#17)
* feat: recursive resolution + full DNSSEC validation

Numa becomes a true DNS resolver: it resolves from root nameservers with complete DNSSEC chain-of-trust verification.

Recursive resolution:
- Iterative RFC 1034 from configurable root hints (13 default)
- CNAME chasing (depth 8), referral following (depth 10)
- A+AAAA glue extraction, IPv6 nameserver support
- TLD priming: NS + DS + DNSKEY for 34 gTLDs + EU ccTLDs
- Config: mode = "recursive" in [upstream], root_hints, prime_tlds

DNSSEC (all 4 phases):
- EDNS0 OPT pseudo-record (DO bit, 1232 payload per DNS Flag Day 2020)
- DNSKEY, DS, RRSIG, NSEC, NSEC3 record types with wire read/write
- Signature verification via ring: RSA/SHA-256, ECDSA P-256, Ed25519
- Chain-of-trust: zone DNSKEY → parent DS → root KSK (key tag 20326)
- DNSKEY RRset self-signature verification (RRSIG(DNSKEY) by KSK)
- RRSIG expiration/inception time validation
- NSEC: NXDOMAIN gap proofs, NODATA type absence, wildcard denial
- NSEC3: SHA-1 iterated hashing, closest encloser proof, hash range
- Authority RRSIG verification for denial proofs
- Config: [dnssec] enabled/strict (default false, opt-in)
- AD bit on Secure, SERVFAIL on Bogus+strict
- DnssecStatus cached per entry, ValidationStats logging

Performance:
- TLD chain pre-warmed on startup (root DNSKEY + TLD DS/DNSKEY)
- Referral DS piggybacking from authority sections
- DNSKEY prefetch before validation loop
- Cold-cache validation: ~1 DNSKEY fetch (down from 5)
- Benchmarks: RSA 10.9µs, ECDSA 174ns, DS verify 257ns

Also:
- write_qname fix for root domain "." (was producing malformed queries)
- write_record_header() dedup, write_bytes() bulk writes
- DnsRecord::domain() + query_type() accessors
- UpstreamMode enum, DEFAULT_EDNS_PAYLOAD const
- Real glue TTL (was hardcoded 3600)
- DNSSEC restricted to recursive mode only

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: TCP fallback, query minimization, UDP auto-disable

Transport resilience for restrictive networks (ISPs blocking UDP:53):
- DNS-over-TCP fallback: UDP fail/truncation → automatic TCP retry
- UDP auto-disable: after 3 consecutive failures, switch to TCP-first
- IPv6 → TCP directly (UDP socket binds 0.0.0.0, can't reach IPv6)
- Network change resets UDP detection for re-probing
- Root hint rotation in TLD priming

Privacy:
- RFC 7816 query minimization: root servers see TLD only, not full name

Code quality:
- Merged find_starting_ns + find_starting_zone → find_closest_ns
- Extracted resolve_ns_addrs_from_glue shared helper
- Removed overall timeout wrapper (per-hop timeouts sufficient)
- forward_tcp for DNS-over-TCP (RFC 1035 §4.2.2)

Testing:
- Mock TCP-only DNS server for fallback tests (no network needed)
- tcp_fallback_resolves_when_udp_blocked
- tcp_only_iterative_resolution
- tcp_fallback_handles_nxdomain
- udp_auto_disable_resets
- Integration test suite (4 suites, 51 tests)
- Network probe script (tests/network-probe.sh)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: DNSSEC verified badge in dashboard query log

- Add dnssec field to QueryLogEntry, track validation status per query
- DnssecStatus::as_str() for API serialization
- Dashboard shows green checkmark next to DNSSEC-verified responses
- Blog post: add "How keys get there" section, transport resilience section, trim code blocks, update What's Next

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use SVG shield for DNSSEC badge, update blog HTML

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: NS cache lookup from authorities, UDP re-probe, shield alignment

- find_closest_ns checks authorities (not just answers) for NS records, fixing TLD priming cache misses that caused redundant root queries
- Periodic UDP re-probe every 5min when disabled: re-enables UDP after switching from a restrictive network to an open one
- Dashboard DNSSEC shield uses fixed-width container for alignment
- Blog post: tuck key-tag into trust anchor paragraph

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: TCP single-write, mock server consistency, integration tests

- TCP single-write fix: combine length prefix + message to avoid split segments that Microsoft/Azure DNS servers reject
- Mock server (spawn_tcp_dns_server) updated to use single-write too
- Tests: forward_tcp_wire_format, forward_tcp_single_segment_write
- Integration: real-server checks for Microsoft/Office/Azure domains

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: recursive bar in dashboard, special-use domain interception

Dashboard:
- Add Recursive bar to resolution paths chart (cyan, distinct from Override)
- Add RECURSIVE path tag style in query log

Special-use domains (RFC 6761/6303/8880/9462):
- .localhost → 127.0.0.1 (RFC 6761)
- Private reverse PTR (10.x, 192.168.x, 172.16-31.x) → NXDOMAIN
- _dns.resolver.arpa (DDR) → NXDOMAIN
- ipv4only.arpa (NAT64) → 192.0.0.170/171
- mDNS service discovery for private ranges → NXDOMAIN

Eliminates ~900ms SERVFAILs for macOS system queries that were hitting root servers unnecessarily.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: move generated blog HTML to site/blog/posts/, gitignore

- Generated HTML now in site/blog/posts/ (gitignored)
- CI workflow runs pandoc + make blog before deploy
- Updated all internal blog links to /blog/posts/ path
- blog/*.md remains the source of truth

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: review feedback (memory ordering, RRSIG time, NS resolution)

- Ordering::Relaxed → Acquire/Release for UDP_DISABLED/UDP_FAILURES (ARM correctness for cross-thread coordination)
- RRSIG time validation: serial number arithmetic (RFC 4034 §3.1.5) + 300s clock skew fudge factor (matches BIND)
- resolve_ns_addrs_from_glue collects addresses from ALL NS names, not just the first with glue (improves failover)
- is_special_use_domain: eliminate 16 format! allocations per .in-addr.arpa query (parse octet instead)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: API endpoint tests, coverage target

- 8 new axum handler tests: health, stats, query-log, overrides CRUD, cache, blocking stats, services CRUD, dashboard HTML
- Tests use tower::oneshot: no network, no server startup
- test_ctx() builds minimal ServerCtx for isolated testing
- `make coverage` target (cargo-tarpaulin), separate from `make all`
- 82 total tests (was 74)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
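
For readers unfamiliar with the RRSIG time check mentioned in the review-feedback commit, the following is a minimal sketch of serial number arithmetic with a 300-second skew allowance. It is not the code from this PR: the names `serial_lte`, `rrsig_time_valid`, and `CLOCK_SKEW_SECS` are hypothetical, and the exact way Numa applies the skew is an assumption.

```rust
/// Assumed clock-skew allowance in seconds (the commit message cites 300s).
const CLOCK_SKEW_SECS: u32 = 300;

/// Hypothetical helper: `a <= b` in 32-bit serial number space.
/// The wrapping difference `b - a`, read as a signed value, is
/// non-negative when `b` is at most 2^31 - 1 steps ahead of `a`.
fn serial_lte(a: u32, b: u32) -> bool {
    (b.wrapping_sub(a) as i32) >= 0
}

/// Hypothetical check: the signature is accepted when
/// inception <= now + skew and now - skew <= expiration,
/// with every comparison done in serial number space so that
/// timestamps wrapping past u32::MAX still compare correctly.
fn rrsig_time_valid(inception: u32, expiration: u32, now: u32) -> bool {
    serial_lte(inception, now.wrapping_add(CLOCK_SKEW_SECS))
        && serial_lte(now.wrapping_sub(CLOCK_SKEW_SECS), expiration)
}
```

With these definitions, a signature valid from 1000 to 2000 is accepted at `now = 2200` (inside the skew window) and rejected at `now = 2400`.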
This commit was merged in pull request #17.
259
src/api.rs
@@ -153,6 +153,7 @@ struct QueryLogResponse {
     path: String,
     rescode: String,
     latency_ms: f64,
+    dnssec: String,
 }
 
 #[derive(Serialize)]
@@ -178,6 +179,7 @@ struct LanStatsResponse {
 struct QueriesStats {
     total: u64,
     forwarded: u64,
+    recursive: u64,
     cached: u64,
     local: u64,
     overridden: u64,
@@ -460,6 +462,7 @@ async fn query_log(
                 path: e.path.as_str().to_string(),
                 rescode: e.rescode.as_str().to_string(),
                 latency_ms: e.latency_us as f64 / 1000.0,
+                dnssec: e.dnssec.as_str().to_string(),
             }
         })
         .collect()
@@ -477,7 +480,11 @@ async fn stats(State(ctx): State<Arc<ServerCtx>>) -> Json<StatsResponse> {
     let override_count = ctx.overrides.read().unwrap().active_count();
     let bl_stats = ctx.blocklist.read().unwrap().stats();
 
-    let upstream = ctx.upstream.lock().unwrap().to_string();
+    let upstream = if ctx.upstream_mode == crate::config::UpstreamMode::Recursive {
+        "recursive (root hints)".to_string()
+    } else {
+        ctx.upstream.lock().unwrap().to_string()
+    };
 
     Json(StatsResponse {
         uptime_secs: snap.uptime_secs,
@@ -487,6 +494,7 @@ async fn stats(State(ctx): State<Arc<ServerCtx>>) -> Json<StatsResponse> {
         queries: QueriesStats {
             total: snap.total,
             forwarded: snap.forwarded,
+            recursive: snap.recursive,
             cached: snap.cached,
             local: snap.local,
             overridden: snap.overridden,
@@ -901,3 +909,252 @@ async fn check_tcp(addr: std::net::SocketAddr) -> bool {
         .map(|r| r.is_ok())
         .unwrap_or(false)
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use axum::body::Body;
+    use http::Request;
+    use std::sync::{Mutex, RwLock};
+    use tower::ServiceExt;
+
+    async fn test_ctx() -> Arc<ServerCtx> {
+        let socket = tokio::net::UdpSocket::bind("127.0.0.1:0").await.unwrap();
+        Arc::new(ServerCtx {
+            socket,
+            zone_map: std::collections::HashMap::new(),
+            cache: RwLock::new(crate::cache::DnsCache::new(100, 60, 86400)),
+            stats: Mutex::new(crate::stats::ServerStats::new()),
+            overrides: RwLock::new(crate::override_store::OverrideStore::new()),
+            blocklist: RwLock::new(crate::blocklist::BlocklistStore::new()),
+            query_log: Mutex::new(crate::query_log::QueryLog::new(100)),
+            services: Mutex::new(crate::service_store::ServiceStore::new()),
+            lan_peers: Mutex::new(crate::lan::PeerStore::new(90)),
+            forwarding_rules: Vec::new(),
+            upstream: Mutex::new(crate::forward::Upstream::Udp(
+                "127.0.0.1:53".parse().unwrap(),
+            )),
+            upstream_auto: false,
+            upstream_port: 53,
+            lan_ip: Mutex::new(std::net::Ipv4Addr::LOCALHOST),
+            timeout: std::time::Duration::from_secs(3),
+            proxy_tld: "numa".to_string(),
+            proxy_tld_suffix: ".numa".to_string(),
+            lan_enabled: false,
+            config_path: "/tmp/test-numa.toml".to_string(),
+            config_found: false,
+            config_dir: std::path::PathBuf::from("/tmp"),
+            data_dir: std::path::PathBuf::from("/tmp"),
+            tls_config: None,
+            upstream_mode: crate::config::UpstreamMode::Forward,
+            root_hints: Vec::new(),
+            dnssec_enabled: false,
+            dnssec_strict: false,
+        })
+    }
+
+    #[tokio::test]
+    async fn health_returns_ok() {
+        let ctx = test_ctx().await;
+        let resp = router(ctx)
+            .oneshot(Request::get("/health").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        assert_eq!(resp.status(), 200);
+        let body = axum::body::to_bytes(resp.into_body(), 1000).await.unwrap();
+        let json: serde_json::Value = serde_json::from_slice(&body).unwrap();
+        assert_eq!(json["status"], "ok");
+    }
+
+    #[tokio::test]
+    async fn stats_returns_json() {
+        let ctx = test_ctx().await;
+        let resp = router(ctx)
+            .oneshot(Request::get("/stats").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        assert_eq!(resp.status(), 200);
+        let body = axum::body::to_bytes(resp.into_body(), 10000).await.unwrap();
+        let json: serde_json::Value = serde_json::from_slice(&body).unwrap();
+        assert!(json["uptime_secs"].is_number());
+        assert!(json["queries"]["total"].is_number());
+    }
+
+    #[tokio::test]
+    async fn query_log_empty() {
+        let ctx = test_ctx().await;
+        let resp = router(ctx)
+            .oneshot(
+                Request::get("/query-log?limit=10")
+                    .body(Body::empty())
+                    .unwrap(),
+            )
+            .await
+            .unwrap();
+        assert_eq!(resp.status(), 200);
+        let body = axum::body::to_bytes(resp.into_body(), 10000).await.unwrap();
+        let json: serde_json::Value = serde_json::from_slice(&body).unwrap();
+        assert!(json.as_array().unwrap().is_empty());
+    }
+
+    #[tokio::test]
+    async fn overrides_crud() {
+        let ctx = test_ctx().await;
+        let a = router(ctx.clone());
+
+        // Create
+        let resp = a
+            .clone()
+            .oneshot(
+                Request::post("/overrides")
+                    .header("content-type", "application/json")
+                    .body(Body::from(
+                        r#"{"domain":"test.dev","target":"1.2.3.4","duration_secs":60}"#,
+                    ))
+                    .unwrap(),
+            )
+            .await
+            .unwrap();
+        assert!(resp.status().is_success());
+
+        // List
+        let resp = a
+            .clone()
+            .oneshot(Request::get("/overrides").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        let body = axum::body::to_bytes(resp.into_body(), 10000).await.unwrap();
+        assert!(String::from_utf8_lossy(&body).contains("test.dev"));
+
+        // Get
+        let resp = a
+            .clone()
+            .oneshot(
+                Request::get("/overrides/test.dev")
+                    .body(Body::empty())
+                    .unwrap(),
+            )
+            .await
+            .unwrap();
+        assert_eq!(resp.status(), 200);
+
+        // Delete
+        let resp = a
+            .clone()
+            .oneshot(
+                Request::delete("/overrides/test.dev")
+                    .body(Body::empty())
+                    .unwrap(),
+            )
+            .await
+            .unwrap();
+        assert!(resp.status().is_success());
+
+        // Verify deleted
+        let resp = a
+            .oneshot(
+                Request::get("/overrides/test.dev")
+                    .body(Body::empty())
+                    .unwrap(),
+            )
+            .await
+            .unwrap();
+        assert_eq!(resp.status(), 404);
+    }
+
+    #[tokio::test]
+    async fn cache_list_and_flush() {
+        let ctx = test_ctx().await;
+        let a = router(ctx.clone());
+
+        // List (empty)
+        let resp = a
+            .clone()
+            .oneshot(Request::get("/cache").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        assert_eq!(resp.status(), 200);
+
+        // Flush
+        let resp = a
+            .oneshot(Request::delete("/cache").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        assert!(resp.status().is_success());
+    }
+
+    #[tokio::test]
+    async fn blocking_stats_returns_json() {
+        let ctx = test_ctx().await;
+        let resp = router(ctx)
+            .oneshot(Request::get("/blocking/stats").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        assert_eq!(resp.status(), 200);
+        let body = axum::body::to_bytes(resp.into_body(), 10000).await.unwrap();
+        let json: serde_json::Value = serde_json::from_slice(&body).unwrap();
+        assert!(json["enabled"].is_boolean());
+    }
+
+    #[tokio::test]
+    async fn services_crud() {
+        let ctx = test_ctx().await;
+        let a = router(ctx);
+
+        // Add service
+        let resp = a
+            .clone()
+            .oneshot(
+                Request::post("/services")
+                    .header("content-type", "application/json")
+                    .body(Body::from(r#"{"name":"testapp","target_port":3000}"#))
+                    .unwrap(),
+            )
+            .await
+            .unwrap();
+        assert!(resp.status().is_success());
+
+        // List
+        let resp = a
+            .clone()
+            .oneshot(Request::get("/services").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        let body = axum::body::to_bytes(resp.into_body(), 10000).await.unwrap();
+        assert!(String::from_utf8_lossy(&body).contains("testapp"));
+
+        // Delete
+        let resp = a
+            .clone()
+            .oneshot(
+                Request::delete("/services/testapp")
+                    .body(Body::empty())
+                    .unwrap(),
+            )
+            .await
+            .unwrap();
+        assert!(resp.status().is_success());
+
+        // Verify deleted
+        let resp = a
+            .oneshot(Request::get("/services").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        let body = axum::body::to_bytes(resp.into_body(), 10000).await.unwrap();
+        assert!(!String::from_utf8_lossy(&body).contains("testapp"));
+    }
+
+    #[tokio::test]
+    async fn dashboard_returns_html() {
+        let ctx = test_ctx().await;
+        let resp = router(ctx)
+            .oneshot(Request::get("/").body(Body::empty()).unwrap())
+            .await
+            .unwrap();
+        assert_eq!(resp.status(), 200);
+        let body = axum::body::to_bytes(resp.into_body(), 100000)
+            .await
+            .unwrap();
+        assert!(String::from_utf8_lossy(&body).contains("Numa"));
+    }
+}
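
The `dnssec: e.dnssec.as_str().to_string()` line in the `query_log` hunk relies on `DnssecStatus::as_str()`, which is defined outside this file and not shown in the diff. As a rough illustration only, such an accessor might look like the sketch below. The `Secure` and `Bogus` variants are named in the commit message; the `Insecure` and `Indeterminate` variants and all four string values are assumptions, not taken from the source.

```rust
/// Hypothetical stand-in for the validation status stored per query.
/// Only `Secure` and `Bogus` are named in the commit message; the other
/// variants are assumed here for illustration.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DnssecStatus {
    Secure,
    Insecure,
    Bogus,
    Indeterminate,
}

impl DnssecStatus {
    /// Returns a static label suitable for JSON serialization.
    /// The label strings themselves are assumed values.
    fn as_str(&self) -> &'static str {
        match self {
            DnssecStatus::Secure => "secure",
            DnssecStatus::Insecure => "insecure",
            DnssecStatus::Bogus => "bogus",
            DnssecStatus::Indeterminate => "indeterminate",
        }
    }
}
```

Returning `&'static str` keeps the accessor allocation-free; the handler then pays for a single `to_string()` when it builds the response struct.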