dearsky/numa - numa - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Razvan Dimescu	22bebb85a0	fix: config path advisory ignores XDG file on interactive root (#81) (#83) Port-53 and TLS-data-dir advisories told users to create ~/.config/numa/numa.toml, but config_dir() routed root to /var/lib/numa/ and load_config never consulted the XDG path, so the file the user created was silently ignored. New suggested_config_path() helper prefers $HOME/.config/numa/ when HOME is set (and isn't "/" or empty), with config_dir() as lazy fallback. Used by both advisories and by load_config as an additional candidate, so the advised path is the path numa actually reads. Runtime state (services.json, TLS CA) stays in FHS: config_dir()/data_dir() are intentionally unchanged to keep continuity with the installed daemon. End-to-end replication + regression check in tests/docker/issue-81.sh: four scenarios (replication and existing-install, each against main and fix), all matching expectations.	2026-04-12 02:17:33 +03:00
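The path preference described in commit 22bebb85a0 can be sketched as a pure function. This is a minimal sketch under stated assumptions, not the project's code: the name `suggested_config_path_for`, its parameters, and the choice to pass HOME and the `config_dir()` fallback in as arguments are all hypothetical, chosen so the rule can be checked without touching the real environment.

```rust
use std::path::PathBuf;

/// Hypothetical, parameterised variant of the helper named in the commit.
/// `home` stands in for the HOME environment variable; `fallback_dir` stands
/// in for config_dir() and is only called when HOME is unusable (lazy).
fn suggested_config_path_for(
    home: Option<&str>,
    fallback_dir: impl FnOnce() -> PathBuf,
) -> PathBuf {
    match home {
        // HOME must be set, non-empty, and not "/" to be preferred.
        Some(h) if !h.is_empty() && h != "/" => PathBuf::from(h)
            .join(".config")
            .join("numa")
            .join("numa.toml"),
        // Otherwise fall back to the daemon's config directory.
        _ => fallback_dir().join("numa.toml"),
    }
}
```

With HOME set to `/root` this yields `/root/.config/numa/numa.toml`; with HOME unset, empty, or `/` it yields `numa.toml` under whatever directory the fallback returns.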
Razvan Dimescu	7d6b0ed568	feat: DoH server endpoint + DoT enabled by default (#79 ) * chore: document multi-forwarder and cache warming in config and README Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: DNS-over-HTTPS server endpoint (RFC 8484) Serve DoH at POST /dns-query on the existing HTTPS proxy (port 443). Automatically enabled when proxy TLS is active — no config needed. Also fix zone map priority so local zones override RFC 6762 .local special-use handling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: cargo fmt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: remove GoatCounter analytics from site GoatCounter domains (goatcounter.com, gc.zgo.at) are blocked by Hagezi Pro, which is Numa's default blocklist. A DNS privacy tool should not embed analytics that its own resolver blocks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: enable DoT listener by default DoT now starts automatically with `sudo numa`, matching the proxy and DoH which are already on by default. The self-signed CA infrastructure is shared with the proxy, so there is no additional setup. This makes `numa setup-phone` work out of the box. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 04:06:17 +03:00
Razvan Dimescu	7770129589	feat: cache warming — proactive DNS resolution for configured domains (#78 ) Resolves A + AAAA at startup for domains listed in [cache] warm, then re-resolves before TTL expiry (at 75% elapsed). Keeps critical domains always hot in cache with zero client-visible latency. Closes #34 (item 4) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 01:14:04 +03:00
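The "re-resolve at 75% of TTL elapsed" rule from commit 7770129589 comes down to a small piece of arithmetic. The sketch below only illustrates that rule; the function name `refresh_delay` is an assumption and the actual scheduling code is not shown in the log.

```rust
use std::time::Duration;

/// Hypothetical helper: how long to wait after a successful resolution
/// before re-resolving, so the refresh happens at 75% of the record's TTL.
fn refresh_delay(ttl: Duration) -> Duration {
    // Integer arithmetic (x * 3 / 4) avoids floating-point rounding.
    ttl * 3 / 4
}
```

For a record with a 300-second TTL, the refresh would be scheduled 225 seconds after resolution, leaving a 75-second margin before expiry.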
Razvan Dimescu	8abcd91f95	feat: multi-forwarder with SRTT-based failover (#77) * feat: multi-forwarder with SRTT-based failover address accepts string or array, with optional per-server port override. New fallback pool tried only when all primaries fail. Sequential failover with SRTT ranking ensures fastest upstream is tried first. Closes #34 (items 1, 2, 3) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: simplify failover candidate list and deduplicate recursive pool Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: extract maybe_update_primary for testable upstream re-detection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: rustfmt Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 00:26:58 +03:00
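The ordering described in commit 8abcd91f95 (primaries ranked by smoothed round-trip time, fallback pool tried only afterwards) might look roughly like the following. Everything here is an illustrative assumption: the `Upstream` struct, the function names, sorting the fallback pool by SRTT as well, and the 7/8 smoothing weight (borrowed from the classic TCP RTT estimator) are not taken from the project.

```rust
/// Hypothetical upstream record; the real type is not shown in the log.
#[derive(Debug, Clone, PartialEq)]
struct Upstream {
    addr: String,
    srtt_ms: u32, // smoothed round-trip time in milliseconds
}

/// Build the sequential failover order: all primaries, fastest first,
/// followed by the fallback pool (also fastest first, by assumption).
fn failover_order<'a>(
    primaries: &'a [Upstream],
    fallbacks: &'a [Upstream],
) -> Vec<&'a Upstream> {
    let mut order: Vec<&Upstream> = primaries.iter().collect();
    order.sort_by_key(|u| u.srtt_ms);
    let mut tail: Vec<&Upstream> = fallbacks.iter().collect();
    tail.sort_by_key(|u| u.srtt_ms);
    order.extend(tail);
    order
}

/// Exponentially weighted update: keep 7/8 of the old estimate and take
/// 1/8 of the new sample. The weight is an assumption for illustration.
fn update_srtt(old_ms: u32, sample_ms: u32) -> u32 {
    ((u64::from(old_ms) * 7 + u64::from(sample_ms)) / 8) as u32
}
```

A caller would walk the returned list in order and stop at the first upstream that answers, so a fallback server is only contacted after every primary has failed, even if the fallback has the lowest SRTT.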
Razvan Dimescu	2c20c56421	feat: mobile setup — QR onboarding, Wi-Fi scoped mobileconfig (#73 ) * fix: scope mobileconfig DNS to Wi-Fi only via OnDemandRules Without OnDemandRules, iOS applies the DoT profile globally — cellular DNS breaks when the phone leaves the LAN. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: phone setup QR code in dashboard header - Add /qr endpoint serving SVG (uses existing qrcode crate, svg feature) - Header popover: QR on desktop, direct download link on mobile viewports - Only visible when [mobile] enabled = true in config - Expose mobile.enabled and mobile.port in /stats response - Lazy-load QR on first click, dismiss on outside click Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add Cache-Control to /qr, re-fetch QR on each popover open Cache-Control: no-store prevents stale QR after LAN IP change. Remove qrLoaded flag so the QR always reflects the current IP. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: rustfmt serve_qr response tuple Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add iOS install steps to phone setup popover iOS shows "Profile Downloaded" with no guidance. The popover now includes the 3-step install flow including the buried Certificate Trust Settings toggle. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 22:21:51 +03:00
Razvan Dimescu	921ed68d54	fix: allowlist parent domain unblocks subdomains (#74) * fix: allowlist parent domain unblocks subdomains in blocklist The allowlist walk-up was interleaved with the blocklist walk-up, so an exact blocklist match on www.example.com short-circuited before reaching example.com in the allowlist. Now the allowlist is checked at all parent levels before consulting the blocklist. Deduplicate is_blocked/check via find_in_set helper; is_blocked delegates to check. Adds 7 new blocklist tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: rustfmt blocklist tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * perf: zero-alloc is_blocked hot path, normalize trailing dots Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: rustfmt add_to_allowlist Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: extract normalize() for domain lowering + dot stripping Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 21:43:40 +03:00
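The fix in commit 921ed68d54 is an ordering change: finish the whole allowlist walk-up before looking at the blocklist. A minimal sketch of that order follows. The helper names echo the commit message, but the signatures and the `HashSet<String>` storage are assumptions, and this version allocates in `normalize`, so it does not attempt the zero-alloc hot path mentioned in the log.

```rust
use std::collections::HashSet;

/// Lowercase the domain and strip trailing dots.
fn normalize(domain: &str) -> String {
    domain.trim_end_matches('.').to_ascii_lowercase()
}

/// Walk from the full name up through its parents
/// (www.example.com, example.com, com) and return the first match.
fn find_in_set<'a>(set: &HashSet<String>, domain: &'a str) -> Option<&'a str> {
    let mut current = domain;
    loop {
        if set.contains(current) {
            return Some(current);
        }
        match current.split_once('.') {
            Some((_, parent)) => current = parent,
            None => return None,
        }
    }
}

/// The allowlist is consulted at every parent level first; only if nothing
/// matches is the blocklist walked.
fn is_blocked(allow: &HashSet<String>, block: &HashSet<String>, domain: &str) -> bool {
    let name = normalize(domain);
    if find_in_set(allow, &name).is_some() {
        return false;
    }
    find_in_set(block, &name).is_some()
}
```

With `example.com` allowlisted and `www.example.com` blocklisted, `is_blocked` returns false for `www.example.com`, which is the case the interleaved walk-up got wrong.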
Razvan Dimescu	de15b32325	feat: numa setup-phone — QR-based mobile DoT onboarding (#38 ) * feat: numa setup-phone — QR-based mobile DoT onboarding Adds a CLI subcommand that generates a one-time mobileconfig profile containing both the Numa local CA (as a com.apple.security.root payload) and the DoT DNS settings, then serves it via a temporary HTTP server and prints a scannable QR code in the terminal. Flow: 1. User runs `numa setup-phone` (no sudo needed) 2. Detects current LAN IP, reads CA from /usr/local/var/numa/ca.pem 3. Builds combined mobileconfig (CA trust + DoT) 4. Renders QR code with qrcode crate (Unicode block characters) 5. Serves the profile on port 8765, stays open until Ctrl+C 6. Counts successful downloads (multi-device households) Important caveat documented in instructions: even with the CA bundled in the profile, iOS still requires the user to manually enable trust in Settings → General → About → Certificate Trust Settings. Verified on a real iPhone. Stable PayloadIdentifiers/UUIDs ensure re-running replaces the existing profile on iOS rather than accumulating duplicates. 
- New module: src/setup_phone.rs (~270 lines) - New CLI subcommand: `numa setup-phone` - New dependency: qrcode = "0.14" (default-features = false) - tokio "signal" feature added for Ctrl+C handling - 3 unit tests: PEM stripping, mobileconfig generation, QR rendering Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: mobile API, enriched /health, mobileconfig module Adds a persistent read-only HTTP listener (default port 8765, LAN-bound) serving a dedicated subset of Numa's API for iOS/Android companion apps and as a replacement for the one-shot server setup_phone used to spin up: GET /health — enriched JSON with version, hostname, LAN IP, SNI, DoT config, mobile API port, CA fingerprint, features (shared handler with the main API on port 5380) GET /ca.pem — public CA certificate (shared handler) GET /mobileconfig — full iOS profile (CA trust + DNS settings pinned to current LAN IP) GET /ca.mobileconfig — CA-only iOS profile (trust anchor without DNS override — for the iOS companion app's programmatic DNS flow via NEDNSSettingsManager) All routes are idempotent GETs. The mobile API never serves the state-mutating routes that live on the main API (overrides, blocking toggle, service CRUD, cache flush), so it is safe to expose on the LAN regardless of the main API's bind address. The CA private key is never served by any route. Opt-in via `[mobile] enabled = true`. Default is false so new installs do not silently expose a LAN listener after upgrading; our committed numa.toml template enables it explicitly for spike testing. New modules: - src/mobileconfig.rs — ProfileMode::{Full, CaOnly} enum with plist builder lifted from setup_phone.rs. Full and CaOnly share the CA payload UUID (same trust anchor) but have distinct top-level UUIDs so they coexist as separate installable profiles on iOS. 
- src/health.rs — HealthMeta cached metadata built once at startup from config + CA fingerprint (SHA-256 of the PEM via ring), and the HealthResponse JSON shape shared between the main and mobile APIs. - src/mobile_api.rs — axum Router for the persistent listener. Reuses api::health and api::serve_ca from the main API; owns the two mobileconfig handlers. Modified: - src/api.rs — health() returns the enriched HealthResponse, now pub. serve_ca is now pub so mobile_api can reuse it. - src/config.rs — MobileConfig section (enabled, port, bind_addr). - src/ctx.rs — health_meta: HealthMeta field on ServerCtx. - src/main.rs — builds HealthMeta at startup, spawns mobile API listener if enabled. - src/lan.rs — build_announcement takes &HealthMeta and writes enriched TXT records (version, api_port, proto, dot_port, ca_fp). SRV port now reports the mobile API port; peer discovery still reads TXT `services=` so this is backwards compatible. Always announces even when no .numa services are registered, so the iOS companion app can discover Numa via mDNS regardless of service state. - src/setup_phone.rs — reduced from 267 to 100 lines. The CLI is now a thin QR wrapper over the persistent /mobileconfig endpoint; the hand-rolled one-shot HTTP server (accept_loop, RUST_OK_HEADERS, RUST_NOT_FOUND, download counter) is gone. - src/dot.rs — test fixture updated with HealthMeta::test_fixture(). - numa.toml — commented [mobile] section, enabled = true for spike. Tests: 136 unit tests passing (5 new in mobileconfig, 3 new in health). cargo clippy clean. Integration sanity check: curl'd /health, /ca.pem, /mobileconfig, /ca.mobileconfig against a running numa — all return 200 with correct content types and valid response bodies. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: setup-phone probe, unknown command error, query source in dashboard - setup-phone now probes the mobile API before printing the QR code and shows an actionable error if [mobile] is not enabled - Unknown CLI subcommands print an error instead of silently attempting to start a full server - Dashboard query log shows source IP under timestamp (localhost for loopback, full IP for LAN devices) with full addr on hover Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 19:08:56 +03:00
Razvan Dimescu	e860731c01	fix: escape DNS label text per RFC 1035 §5.1 (closes #36 ) (#54 ) * fix: escape dots and special characters in DNS label text per RFC 1035 §5.1 Closes #36 read_qname was pushing raw label bytes directly into the output string, producing ambiguous text for labels containing dots, backslashes, or non-printable bytes. fanf2 spotted this on HN: wire format `[8]exa.mple[3]com[0]` (two labels, first containing a literal 0x2E) was rendered as `exa.mple.com`, indistinguishable from three labels. Fix both sides of the text representation per RFC 1035 §5.1: read_qname — when rendering wire bytes to text: - literal `.` within a label → `\.` - literal `\` → `\\` - bytes outside 0x21..=0x7E → `\DDD` (3-digit decimal) - printable ASCII passes through unchanged write_qname — when parsing text back to wire: - `\.` produces a literal 0x2E inside the current label (not a separator) - `\\` produces a literal 0x5C - `\DDD` produces the byte with that decimal value (0..=255) - unescaped `.` still separates labels, empty labels still skipped - rejects trailing `\`, short `\DD`, and `\DDD` > 255 Impact in practice is low — real-world domains don't contain dots in labels — but it's a correctness bug in the wire format parser that could cause round-trip failures with adversarial input. The parser is the core of the project, so correctness bugs take priority over practical impact. Adds 16 unit tests in a new `#[cfg(test)] mod tests` block covering: plain domain read/write, literal-dot escaping on both sides, backslash escaping, non-printable + space decimal escapes, full round-trip preservation, and the three rejection cases for malformed escapes. 
Credit: fanf2 (https://news.ycombinator.com/item?id=47612321) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: stream label writes directly into buffer (review feedback) The first cut of this fix delegated write_qname to a helper (parse_escaped_labels) that built Vec<Vec<u8>> up-front, then iterated to emit the wire bytes. On a plain-ASCII domain like "www.google.com" that's ~4 heap allocations per write_qname call, and record.rs calls write_qname ~6 times per response — so this PR would regress bench_buffer_serialize by roughly 24 extra allocations per response vs. main, where the old non-escaping code had zero. Rewrite write_qname as a streaming byte-level loop that reserves the length byte up-front, writes the label body directly into the buffer, then backpatches the length via set(). Zero intermediate allocations on the common path, and the 63-byte label cap is now checked incrementally so oversized labels fail fast. Byte-level scanning is safe for UTF-8 input: continuation bytes are always in 0x80..=0xBF, so they can never collide with the ASCII `.` (0x2E) or `\` (0x5C) that drive label splitting and escape parsing. Also inline the \DDD rendering in read_qname to avoid the per-byte format!() allocation on non-printable input. Plain-ASCII reads hit the unchanged push(c as char) fast path, so the common case has zero regression. The parse_escaped_labels helper is deleted — no remaining callers. All 158 tests pass, clippy + fmt clean. Collapses three review findings (HIGH allocation regression, MEDIUM format! allocation, MEDIUM .unwrap() after digit guard) in one pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: route dnssec::name_to_wire through write_qname for escape handling Closes #55. dnssec::name_to_wire was a parallel implementation of the old write_qname's splitting loop — it iterated qname.split('.') and pushed raw bytes. 
It predated and duplicated the buffer.rs logic, and it did not understand RFC 1035 §5.1 text escapes. After the read_qname fix in this PR, names that come out of read_qname can contain \., \\, or \DDD sequences; feeding those back into the old name_to_wire would split on the literal '.' inside a \. sequence and produce corrupt RRSIG signed-data blobs. The underlying bug predates this PR — the old read_qname was broken too, so both sides of the DNSSEC canonical form pipeline were silently wrong in the same way. Making read_qname correct exposed the divergence, so it's fixed here in the same PR that introduced the exposure. Reimplement name_to_wire on top of BytePacketBuffer::write_qname: reserve a scratch buffer, let write_qname handle the escape parsing and length-byte framing, copy the emitted bytes into a Vec, then walk the wire once more to lowercase label bodies (length bytes stay untouched). Canonical form per RFC 4034 §6.2 requires the lowercasing, and it has to happen post-escape-resolution — a decimal escape like \065 produces 0x41 ('A'), which must be lowercased to 'a' in the final wire bytes. Call sites in build_signed_data, record_to_canonical_wire, record_rdata_canonical, and nsec3_hash are unchanged — the public signature stays the same, infallible Vec<u8> return. Tests: - name_to_wire_escaped_dot_in_label_is_not_a_separator — verifies the fanf2 example round-trips correctly through canonical form - name_to_wire_decimal_escape_is_lowercased — verifies post-escape lowercasing (the subtle correctness requirement) - existing name_to_wire_root, name_to_wire_domain, ds_verification tests still pass unchanged Test count: 158 → 160. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: tighten name_to_wire per review feedback - Replace hand-rolled per-byte lowercase loop with stdlib [u8]::make_ascii_lowercase(). Shorter and idiomatic. 
- Tighten the .expect() message to state the actual invariant (parseable DNS name) instead of vague "well-formed" language. - Replace the doc comment's "see #55" with the real invariant — issue numbers rot, and by merge time #55 is closed anyway. The comment now explains WHY the lowercase pass has to happen post-escape-resolution (\065 → 'A' → 'a') instead of during write_qname. - Drop the redundant `\065` test comment (the one-liner version is enough with the assertion showing the transform). No behavior change; 160 tests still pass, clippy + fmt clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: cover label cap and empty-label rollback; trim doc comments Closes coverage gaps left by PR #54: - write_rejects_label_over_63_bytes: pins the incremental 63-byte cap inside write_qname's inner loop (boundary at 63 vs 64). - write_skips_empty_labels: pins the rollback branch (pos = len_pos) triggered by leading or consecutive dots. Doc comments tightened: - write_qname: drop the streaming-impl walkthrough and the escape-grammar restatement (already documented on read_qname). - name_to_wire: drop the implementation explanation; keep the post-escape lowercasing rationale, which pins behavior a future refactor could silently break. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:53:46 +03:00
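The read-side escaping rules listed in commit e860731c01 (`.` becomes `\.`, `\` becomes `\\`, bytes outside 0x21..=0x7E become `\DDD`) can be illustrated on already-split labels. This is a sketch of the text rules only: `escape_label` and `name_to_text` are hypothetical names, and the project's read_qname works on a packet buffer rather than on slices like these.

```rust
/// Append one wire-format label to `out` using RFC 1035 §5.1 text escapes.
fn escape_label(label: &[u8], out: &mut String) {
    for &b in label {
        match b {
            b'.' => out.push_str("\\."),   // literal dot inside a label
            b'\\' => out.push_str("\\\\"), // literal backslash
            0x21..=0x7E => out.push(b as char), // printable ASCII passes through
            _ => {
                // Three-digit decimal escape, built without format!().
                out.push('\\');
                out.push((b'0' + b / 100) as char);
                out.push((b'0' + (b / 10) % 10) as char);
                out.push((b'0' + b % 10) as char);
            }
        }
    }
}

/// Join escaped labels with unescaped dots as separators.
fn name_to_text(labels: &[&[u8]]) -> String {
    let mut out = String::new();
    for (i, label) in labels.iter().enumerate() {
        if i > 0 {
            out.push('.');
        }
        escape_label(label, &mut out);
    }
    out
}
```

The two-label name from the commit message, `[8]exa.mple[3]com[0]`, renders as `exa\.mple.com` under these rules, so it can no longer be confused with the three-label `exa.mple.com`.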
Razvan Dimescu	f556b60ce4	fix: suppress recursive hint in install when already configured (#71) `sudo numa install` unconditionally printed the "Want full DNS sovereignty?" hint even when numa.toml already had mode = "recursive". It now loads the config first and skips the message if recursive is already set. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 08:32:51 +03:00
Razvan Dimescu	422726f1c8	chore(deps): bump rcgen from 0.13 to 0.14 (#70) rcgen 0.14 replaced the separate Certificate + KeyPair args with a unified Issuer type. Migrates ensure_ca and generate_service_cert: - Load path: Issuer::from_ca_cert_der replaces the old CertificateParams::from_ca_cert_pem + self_signed round-trip. - Generate path: Issuer::new(params, key_pair) constructs directly from the params used for self_signed (no DER re-parse). - signed_by takes (&key_pair, &issuer) instead of (&key_pair, &cert, &key). Also drops thiserror v1 from the dep tree (rcgen 0.14 uses v2). Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 08:28:07 +03:00
Razvan Dimescu	643d6b01e1	fix(linux): consult resolvectl when resolv.conf only shows the stub (#52) On modern Arch / Ubuntu 22.04+ / Fedora desktops, NetworkManager + systemd-resolved symlink /etc/resolv.conf to stub-resolv.conf, which contains only `nameserver 127.0.0.53`. The real upstream servers (router, ISP, configured DoT providers) live inside systemd-resolved's per-link state, exposed via 'resolvectl status'. discover_linux() was parsing /etc/resolv.conf, correctly filtering the stub address, and then falling through to the Quad9 DoH fallback because detect_dhcp_dns() is macOS-only and therefore unavailable on Linux. Net effect: on a large chunk of Linux installs, numa silently defaulted to Quad9 instead of the user's actual DNS, as seen in Casey's AUR test banner (#33), which showed 'Upstream https://9.9.9.9/dns-query' despite his machine having working router DNS the entire time. resolvectl_dns_server() already exists: it was introduced for cloud VPC forwarding-rule discovery and knows how to ask systemd-resolved for the real active DNS server. This commit wires it into the default-upstream fallback chain, between the primary resolv.conf parse and the ~/.numa/original-resolv.conf backup. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 22:32:57 +03:00
Razvan Dimescu	fab8b698d8	fix: human-readable advisories for TLS data_dir + port-53 EACCES (#48 ) * fix: human-readable advisory when TLS data_dir is not writable When numa runs as non-root on a system with a privileged default data_dir (e.g. /usr/local/var/numa on macOS), TLS CA setup fails with a raw "Permission denied (os error 13)" and HTTPS proxy is silently disabled. The user sees a cryptic warning with no path forward. Detect std::io::ErrorKind::PermissionDenied on the tls error, print a diagnostic naming the data_dir and offering two fixes (install as system resolver, or point data_dir at a writable path), and keep the graceful-degradation behavior — DNS resolution and plain-HTTP proxy continue to work without HTTPS. All other TLS setup errors fall through to the existing log::warn!. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: port-53 advisory also handles EACCES (non-root privileged bind) The original port-53 match arm only caught EADDRINUSE, so a fresh non-root user on macOS/Linux hitting EACCES when trying to bind a privileged port saw the raw OS error instead of the advisory. Collapse the scoping helper and the advisory into a single `try_port53_advisory(bind_addr, &io::Error) -> Option<String>` that returns the formatted diagnostic when both the port is 53 and the error kind is one we can speak to (AddrInUse or PermissionDenied), and `None` otherwise. The two failure modes share one body with a cause-sentence variant — no duplicated fix text. Caller becomes a plain if-let: no match guard, no separate is_port_53 helper exposed on the public API. is_port_53 goes back to private. Unit tests cover all branches: AddrInUse, PermissionDenied, non-53 bind_addr, unrelated ErrorKind, and malformed bind_addr. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: move TLS error classification into tls module main.rs no longer downcasts a boxed error to figure out whether it's a permission-denied case. 
tls::try_data_dir_advisory(&err, &dir) encapsulates the downcast + kind match and returns Some(advisory) or None, mirroring system_dns::try_port53_advisory. main.rs becomes a plain if-let, symmetric with the port-53 path. Trim the docstrings on both advisory functions: they were narrating the implementation (errno mapping) instead of stating the contract. Add unit tests for try_data_dir_advisory covering PermissionDenied, other io::ErrorKind, and non-io errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-09 16:27:08 +03:00
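The contract described for `try_port53_advisory` in commit fab8b698d8 (return a diagnostic only when the port is 53 and the error kind is AddrInUse or PermissionDenied) could be sketched as below. The `&str` type for `bind_addr`, the parsing via `SocketAddr`, and the advisory wording are assumptions; the real message text is not reproduced in the log.

```rust
use std::io;
use std::net::SocketAddr;

/// Sketch of the advisory contract: Some(text) for a port-53 bind failure
/// we can explain, None for everything else (including a malformed address).
fn try_port53_advisory(bind_addr: &str, err: &io::Error) -> Option<String> {
    let addr: SocketAddr = bind_addr.parse().ok()?;
    if addr.port() != 53 {
        return None;
    }
    // One shared body with a cause-sentence variant per failure mode.
    let cause = match err.kind() {
        io::ErrorKind::AddrInUse => "another process is already bound to port 53",
        io::ErrorKind::PermissionDenied => "binding port 53 requires elevated privileges",
        _ => return None,
    };
    // Placeholder wording, not the project's actual advisory text.
    Some(format!(
        "cannot bind {}: {}\n  fix: run `sudo numa install`, or set bind_addr to a non-privileged port",
        bind_addr, cause
    ))
}
```

Returning `Option<String>` lets the caller stay a plain `if let Some(msg) = ...` with no match guard, which is the shape the commit message describes for main.rs.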
Razvan Dimescu	a6f23a5ddb	fix: advisory + exit(1) when port 53 is already in use (#45 ) (#47 ) * fix: advisory + exit(1) when port 53 is already in use (#45) Detect AddrInUse on bind, print a human-readable diagnostic explaining systemd-resolved / Dnscache as the likely cause and offer two concrete fixes (sudo numa install, or bind_addr on a non-privileged port), then exit(1) instead of surfacing a raw OS error. Adds tests/docker/smoke-port53.sh: end-to-end Docker test that pre-binds port 53 with a Python UDP socket and asserts the advisory + exit code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: collapse port53 advisory to single flat path The per-platform cause sentences were cosmetic — they didn't change the user's actions (install, or bind_addr on a non-privileged port), but they introduced duplicated "another process..." strings, a dead-from-CI branch (is_systemd_resolved_active() == true is never reached by any test), and a pub visibility bump on is_systemd_resolved_active for a single caller. Replace with one flat format! whose cause line mentions both systemd-resolved and the Windows DNS Client inline. The existing smoke test now exercises 100% of the function. is_systemd_resolved_active reverts to private. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-09 15:03:58 +03:00
Razvan Dimescu	79ecb73d87	fix: use FHS-compliant /var/lib/numa as Linux data dir default (#43 ) * fix: use FHS-compliant /var/lib/numa as Linux data dir default numa's default system-wide data directory was hardcoded to /usr/local/var/numa for all Unix platforms. This is the right path on macOS (Homebrew prefix convention) but non-FHS on Linux, where Arch / Fedora / Debian / etc. expect persistent state under /var/lib/<pkg>. The mismatch was invisible to existing users (numa creates the dir silently on first run) but immediately surfaces when packaging for a distro — see PR #33 (community contribution to add an Arch AUR package) which had to add fragile sed-based path patching at PKGBUILD build time. The fix moves the path decision into a small helper: - daemon_data_dir() — cfg-gated platform dispatch (linux/macos) - resolve_linux_data_dir() — pure function, takes "does X exist?" as parameters, returns the right path Linux behavior: - Fresh install → /var/lib/numa (FHS) - Upgrading from pre-v0.10.1 install → /usr/local/var/numa (legacy) - Both paths exist → /var/lib/numa (FHS wins) The legacy fallback is critical: existing v0.10.0 Linux users have their CA cert + services.json under /usr/local/var/numa. Returning the new path unconditionally would cause CA regeneration on upgrade, breaking every browser that had trusted the previous CA. The fallback is checked at startup via std::path::Path::exists, so the upgrade is seamless and zero-config. macOS behavior is unchanged — /usr/local/var/numa is still correct because Homebrew's prefix is /usr/local. Test coverage: - resolve_linux_data_dir is a pure function gated cfg(any(linux,test)) so the same code path is unit-tested on every platform's CI run. - Four tests cover all combinations of (legacy_exists, fhs_exists), asserting the migration logic stays correct under future edits. The default config in numa.toml is also updated to document the new per-platform default paths. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: end-to-end FHS path verification + simplify cleanup Two related changes from a /simplify pass and a follow-up testing finalization: 1. lib.rs cleanup (no behavior change): - Drop FHS_LINUX_DATA_DIR and LEGACY_LINUX_DATA_DIR consts. Both were used in only 4 places total and the unit tests already bypassed them with string literals, so they were over-engineering. Inline the strings in daemon_data_dir() and resolve_linux_data_dir(). - Trim narrating doc/comments on the helper and the test bodies. Keep only the non-obvious WHY (the macOS Homebrew note and the migration-keeps-legacy rationale). 2. tests/docker/smoke-arch.sh: - Cherry-picked the previously-uncommitted Arch compatibility smoke test from feat/smoke-arch. - Removed the [server] data_dir = "/tmp/numa-smoke" override from the test config so the script now exercises the DEFAULT data dir code path — which is exactly what the FHS fix touches. - Added a path assertion after the dig succeeds: verify that /var/lib/numa/ca.pem exists (FHS) and /usr/local/var/numa is absent (no accidental dual-creation on a fresh install). Verified end-to-end on archlinux:latest (Apple Silicon, Rosetta): ── building + running numa on archlinux:latest ── ── cargo build --release --locked ── Finished `release` profile [optimized] target(s) in 24.02s ── dig @127.0.0.1 -p 5354 google.com A ── 142.251.38.206 ── FHS path check ── ✓ CA cert at /var/lib/numa/ca.pem (FHS path) ✓ legacy path /usr/local/var/numa absent (fresh install used FHS) ── smoke-arch passed ── This closes the testing gap where the unit tests covered the path-decision LOGIC in isolation but nothing exercised the live wiring on a real Linux filesystem. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 18:00:27 +03:00
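The Linux data-dir decision table from commit 79ecb73d87 (fresh install uses FHS, legacy-only keeps the legacy path, both present means FHS wins) fits in a one-line pure function. The parameter order and the `&'static str` return type below are assumptions; only the function name and the three documented outcomes come from the commit message.

```rust
/// Pure path decision: the caller supplies the results of the two
/// "does this directory exist?" checks.
fn resolve_linux_data_dir(legacy_exists: bool, fhs_exists: bool) -> &'static str {
    if legacy_exists && !fhs_exists {
        // Upgrade from an older install: keep the existing CA and
        // services.json where they already live.
        "/usr/local/var/numa"
    } else {
        // Fresh install, FHS-only, or both present: FHS wins.
        "/var/lib/numa"
    }
}
```

Because the function takes booleans instead of probing the filesystem, all four combinations can be unit-tested on any platform, which matches the testing approach the commit describes.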
Razvan Dimescu	bf5565ac26	fix: macOS use launchctl bootout/bootstrap instead of deprecated load (#42 ) The deprecated `launchctl load -w` returns exit code 0 even when it cannot actually reload a service whose label is already in launchd's in-memory state. It prints `Load failed: 5: Input/output error` to stderr but exits 0, so the install path interprets it as success and continues — silently leaving the running daemon on whatever binary was first loaded, even though the on-disk plist now points elsewhere. The consequence: every macOS user running `brew upgrade numa` rewrites the plist to point at the new binary, but launchctl never actually loads it. They think they upgraded; they're still running the old version. Neither #41 (cross-platform CA trust) nor #40 (self-referential backup) would actually take effect for them until they manually run: sudo launchctl bootout system /Library/LaunchDaemons/com.numa.dns.plist sudo launchctl bootstrap system /Library/LaunchDaemons/com.numa.dns.plist The fix uses the modern API symmetrically across all three call sites: - install_service_macos: bootout (best-effort cleanup, no-op on first install) → bootstrap → wait for readiness → configure DNS - install_service_macos rollback path: bootout instead of `unload` - uninstall_service_macos: bootout BEFORE remove_file (the modern API needs the plist file path as the specifier; doing it after remove would leave the service in memory until reboot) No new tests — this is a shell-call substitution with no logic to unit-test. Verified manually on macOS: `sudo numa install` no longer prints `Load failed`, and the daemon is correctly running the binary the plist points at. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 16:54:21 +03:00
Razvan Dimescu	679b346246	fix: prevent self-referential DNS backup on re-install (#40 ) * fix: prevent self-referential DNS backup on re-install The install flow previously captured current system DNS servers verbatim into the backup file. If numa was already installed, current DNS was 127.0.0.1, so the "backup" recorded 127.0.0.1 as the "original" — making a subsequent uninstall a no-op self-reference. Reproduced 2026-04-08 during v0.10.0 brew dogfood: after `sudo numa uninstall; sudo /opt/homebrew/bin/numa install`, `sudo numa uninstall` printed `restored DNS for "Wi-Fi" -> 127.0.0.1` because the brew binary's install step had overwritten the backup with the already-stub state. Fix (all three platforms): - macOS/Windows: if the existing backup already contains at least one non-loopback/non-stub upstream, preserve it as-is. If writing a fresh backup, filter loopback/stub addresses first so a capture from already-numa-managed state isn't self-referential. - Linux (resolv.conf fallback path): detect numa-managed or all-loopback resolv.conf content and skip the file copy in that case; preserve an existing useful backup rather than overwriting it. systemd-resolved path is unaffected (uses a drop-in, no backup file). Adds three unit tests for the predicates: macOS HashMap detection, Windows interface filter, and resolv.conf parsing (real upstream, self-referential, numa-marker, systemd stub, mixed). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: share iter_nameservers helper and reuse resolv.conf content Post-review simplifications on the stale-backup fix: - Extract iter_nameservers(&str) helper used by both parse_resolv_conf and resolv_conf_has_real_upstream. Eliminates the duplicated line-by-line nameserver parsing (findings from reuse review). - install_linux: reuse the already-read resolv.conf content via std::fs::write instead of a second read via std::fs::copy. 
- install_macos / install_windows: flatten the conditional eprintln pattern — always print a blank line, conditionally print the save message. Equivalent output, less branching. Net −12 lines. All 130 tests still pass, clippy clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: drop redundant trim before split_whitespace CI caught `clippy::trim_split_whitespace` on Rust 1.94: `split_whitespace()` already skips leading/trailing whitespace, so `.trim()` first is redundant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: extract load_backup helper Remove duplicated read+deserialize boilerplate shared by install_macos and install_windows. The two call sites each had an identical 4-line chain of read_to_string().ok().and_then(serde_json::from_str).ok() — collapse into a single generic helper load_backup<T>(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Revert "refactor: extract load_backup helper" This reverts commit `a54fb99428`. * test: drop windows_backup_filters_loopback The test inlined the 3-line filter block from install_windows rather than calling a production helper, so it was testing stdlib Vec::retain + is_loopback_or_stub — both already covered elsewhere. Deleting it removes a test that would silently pass even if install_windows stopped filtering altogether. The predicate logic for macOS-shaped backups stays covered by macos_backup_real_upstream_detection (same inner Vec<String> type). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add windows_backup_filters_loopback unit test The PR description mentioned this test but it was missing from the diff, leaving backup_has_real_upstream_windows untested. Mirrors the shape of macos_backup_real_upstream_detection: empty map → false, all-loopback (127.0.0.1, ::1, 0.0.0.0) → false, one real entry alongside loopback → true. 
Also relax the cfg gate on backup_has_real_upstream_windows from cfg(windows) to cfg(any(windows, test)) so the test compiles cross-platform, matching how backup_has_real_upstream_macos and the resolv_conf helpers are gated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 16:38:37 +03:00
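The backup predicate described in the commit above can be sketched as follows. This is a minimal illustration, not the project's code: `is_loopback_or_stub` is named in the log, but its body here, the `backup_has_real_upstream` signature, and the `filter_for_backup` helper are assumptions. The backup shape (interface name mapped to a list of server strings) follows the `Vec<String>` inner type the message mentions.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// True for addresses that cannot serve as a real upstream: loopback
/// (127.0.0.0/8, which includes the 127.0.0.53 systemd stub, and ::1),
/// unspecified (0.0.0.0, ::), or text that is not an IP address at all.
/// Assumed behaviour; the real predicate may differ.
fn is_loopback_or_stub(addr: &str) -> bool {
    match addr.trim().parse::<IpAddr>() {
        Ok(ip) => ip.is_loopback() || ip.is_unspecified(),
        Err(_) => true,
    }
}

/// True if at least one interface in the backup lists a usable upstream,
/// in which case an existing backup is worth preserving as-is.
fn backup_has_real_upstream(backup: &HashMap<String, Vec<String>>) -> bool {
    backup
        .values()
        .flatten()
        .any(|addr| !is_loopback_or_stub(addr.as_str()))
}

/// Hypothetical helper: drop loopback/stub entries before writing a fresh
/// backup, so a capture taken while numa is already the system resolver
/// cannot record 127.0.0.1 as the "original" DNS.
fn filter_for_backup(servers: &[String]) -> Vec<String> {
    servers
        .iter()
        .filter(|s| !is_loopback_or_stub(s.as_str()))
        .cloned()
        .collect()
}
```

Under these assumptions, an empty map and an all-loopback map both report no real upstream, while a single real address next to loopback entries is enough to keep the existing backup.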
Razvan Dimescu	039254280b	fix: cross-platform CA trust (Arch/Fedora + Windows) (#41 ) * fix: cross-platform CA trust (Arch/Fedora + Windows) Closes #35. trust_ca_linux now detects which trust store the distro ships and runs the matching refresh command, instead of hardcoding Debian's update-ca-certificates. Detection walks a const table in priority order, picking the first whose anchor dir exists: - debian: /usr/local/share/ca-certificates (update-ca-certificates) - pki: /etc/pki/ca-trust/source/anchors (update-ca-trust extract) - p11kit: /etc/ca-certificates/trust-source/anchors (trust extract-compat) Falls back with a clear error listing every backend tried. Adds Windows support via certutil -addstore Root / -delstore Root, removing the silent CA-trust gap on numa install (previously the service installed but the trust step quietly errored, leaving every HTTPS .numa request throwing browser warnings). Refactor: trust_ca and untrust_ca are now thin dispatchers calling per-platform helpers. CA_COMMON_NAME and CA_FILE_NAME are centralized in tls.rs and reused from system_dns.rs and api.rs. untrust_ca_linux no longer pre-checks file existence (TOCTOU) and skips the refresh when no file was actually removed. Test: tests/docker/install-trust.sh runs the install/uninstall contract against debian:stable, fedora:latest, and archlinux:latest in containers, asserting the cert lands in (and is removed from) the system bundle. All three pass locally. README notes the Firefox/NSS limitation (separate trust store). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: rustfmt fixes for trust_ca_linux helpers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: macOS CA trust contract test (manual) Adds tests/manual/install-trust-macos.sh — a sudo bash script that mirrors trust_ca_macos / untrust_ca_macos against a fixture cert with a unique CN. 
Designed to coexist with a running production numa: - Refuses to run if a real "Numa Local CA" is already in System.keychain (fail-closed protection for dogfood installs) - Uses a unique CN ("Numa Local CA Test <pid-timestamp>") so the test cert can never collide with production - Mirrors the by-hash deletion loop from untrust_ca_macos - Trap-cleanup on success or interrupt Lives under tests/manual/ to signal "host-mutating, dev-only" — distinct from tests/docker/install-trust.sh which is hermetic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: relax bail-out in macOS trust test (safe alongside production) The bail-out was overly defensive. The test cert uses a unique CN ("Numa Local CA Test <pid-ts>") that is strictly longer than the production CN, so `security find-certificate -c $TEST_CN` cannot substring-match the production cert. All deletes are by-hash, which can only target the test cert's specific hash. Coexistence is provably safe; document the reasoning in the header comment block and replace the refusal with an informational notice. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 15:18:01 +03:00
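The trust-store detection described in the commit above (walk a const table in priority order, pick the first backend whose anchor directory exists) might look roughly like this. The directory paths and refresh commands are taken from the commit message; the struct, field, and function names are hypothetical, and the existence check is injected as a closure purely so the sketch can be exercised without touching the filesystem.

```rust
use std::path::Path;

/// One candidate trust-store backend (hypothetical type for this sketch).
#[derive(Debug)]
struct TrustBackend {
    name: &'static str,
    anchor_dir: &'static str,
    refresh_cmd: &'static [&'static str],
}

/// Priority-ordered table; paths and commands as listed in the commit message.
const BACKENDS: &[TrustBackend] = &[
    TrustBackend {
        name: "debian",
        anchor_dir: "/usr/local/share/ca-certificates",
        refresh_cmd: &["update-ca-certificates"],
    },
    TrustBackend {
        name: "pki",
        anchor_dir: "/etc/pki/ca-trust/source/anchors",
        refresh_cmd: &["update-ca-trust", "extract"],
    },
    TrustBackend {
        name: "p11kit",
        anchor_dir: "/etc/ca-certificates/trust-source/anchors",
        refresh_cmd: &["trust", "extract-compat"],
    },
];

/// Return the first backend whose anchor directory passes `dir_exists`,
/// or an error that lists every backend tried.
fn detect_backend_with(
    dir_exists: impl Fn(&str) -> bool,
) -> Result<&'static TrustBackend, String> {
    BACKENDS
        .iter()
        .find(|b| dir_exists(b.anchor_dir))
        .ok_or_else(|| {
            let tried: Vec<&str> = BACKENDS.iter().map(|b| b.name).collect();
            format!("no supported CA trust store found (tried: {})", tried.join(", "))
        })
}

/// Production-style entry point: check the real filesystem.
fn detect_backend() -> Result<&'static TrustBackend, String> {
    detect_backend_with(|dir| Path::new(dir).is_dir())
}
```

A caller would copy the CA file into `anchor_dir` and then run `refresh_cmd`; that step is omitted here because its details are not given in the log.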
Razvan Dimescu	6887c8e02e	refactor: move data_dir override from env var to [server] TOML field Reverts the NUMA_DATA_DIR env var added in the previous commit and replaces it with a [server] data_dir TOML field. Numa already has a well-developed config system; adding a parallel env-var mechanism for a single knob was wrong. The principle: TOML is for application behavior configuration. Env vars are for bootstrap values (HOME, SUDO_USER to discover paths before config loads) and standard ecosystem conventions (RUST_LOG). data_dir is neither — it's an app knob, so it belongs in the TOML. Changes: - lib.rs::data_dir() reverts to the platform-specific fallback only - config.rs adds `data_dir: Option<PathBuf>` to ServerConfig - main.rs resolves config.server.data_dir with fallback to numa::data_dir() and passes it to build_tls_config, then stores the resolved path on ctx.data_dir for downstream consumers - tls.rs::build_tls_config takes `data_dir: &Path` as an explicit parameter instead of calling crate::data_dir() behind the caller's back. regenerate_tls and dot.rs self_signed_tls now pass &ctx.data_dir, honoring whatever path the config resolved to - tests/integration.sh Suite 6 uses `data_dir = "$NUMA_DATA"` in its test TOML instead of the NUMA_DATA_DIR env var prefix - numa.toml gains a commented-out data_dir example No behavior change for existing production deployments (the default path is unchanged). Test harness is now fully config-driven, and containerized deploys can override data_dir via mount+config without needing env var injection. 127/127 unit tests pass, Suite 6 passes end-to-end. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	7f52bd8a32	test: Suite 6 — proxy + DoT coexistence, NUMA_DATA_DIR override Adds integration test coverage for the realistic production shape where both the HTTPS proxy and DoT are enabled simultaneously. This was previously untested — every existing suite had either one or the other, so the interaction path was implicit. What Suite 6 verifies: - Both listeners bind without panic - DoT still resolves queries with the proxy enabled - Proxy HTTPS handshake still works with DoT enabled - Both certs validate against the same shared CA To run non-root, adds a NUMA_DATA_DIR env var override to data_dir() that lets callers point the CA/cert storage at any writable path. Useful beyond tests: containerized deployments, CI runners, dev testing without sudo. The fallback is the existing platform-specific path (unix: /usr/local/var/numa, windows: %PROGRAMDATA%\numa). Suite 6 sets NUMA_DATA_DIR=/tmp/numa-integration-data before starting numa, then trusts the generated CA at $NUMA_DATA_DIR/ca.pem for both kdig (DoT query) and openssl s_client (HTTPS proxy handshake) verification. All 6 suites, 32 checks, run non-root and pass locally. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	c98e6c3ea9	fix: install rustls crypto provider when loading user DoT cert Adds tests/integration.sh Suite 5 (DoT via kdig + openssl) and fixes a startup panic caught by it. Bug: when [dot] cert_path/key_path was set AND [proxy] was disabled, numa panicked on the first DoT handshake with "Could not automatically determine the process-level CryptoProvider from Rustls crate features". In normal deployments the proxy's build_tls_config installs the default provider as a side effect, masking the missing call in dot.rs::load_tls_config. Disable the proxy and the panic surfaces. Fix: call rustls::crypto::ring::default_provider().install_default() at the top of load_tls_config (no-op if already installed). Suite 5 exercises: - DoT listener binds on configured port - Resolves a local zone A record over TLS (kdig +tls) - Persistent connection reuse (kdig +keepopen, 3 queries, 1 handshake) - ALPN "dot" negotiation (openssl s_client -alpn dot) - ALPN mismatch rejected with no_application_protocol (openssl -alpn h2) Uses a pre-generated cert at /tmp so the test runs non-root. Skips gracefully if kdig or openssl aren't installed. Also: Dockerfile now EXPOSE 853/tcp so docker run -p 853:853 works out of the box when users enable DoT. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	186e709373	test: verify DoT server rejects mismatched ALPN Adds dot_rejects_non_dot_alpn to assert the rustls server enforces ALPN strictness rather than silently accepting a mismatched negotiation. This is the load-bearing behavior behind the cross- protocol confusion defense — without enforcement, the ALPN "dot" advertisement is just a sign hung on an unlocked door. Refactors test_tls_configs to return the leaf cert DER instead of a prebuilt client config, and adds a dot_client(cert_der, alpn) helper so each test can build a client config with the ALPN list it needs. The five existing DoT tests gain one line each to call dot_client with dot_alpn(); behavior unchanged. 127/127 tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	bacc49667a	fix: DoT cert needs explicit {tld}.{tld} SAN, not just .{tld} wildcard self_signed_tls was passing an empty service_names list, so the generated cert only had the .numa wildcard SAN. Strict TLS clients (browsers, possibly some iOS versions) reject wildcards under single-label TLDs — see the existing comment in tls.rs explaining why the proxy lists each service explicitly. setup-phone's mobileconfig sends ServerName "numa.numa" as SNI, so the DoT cert must have an explicit numa.numa SAN. Pass proxy_tld itself as a service name, mirroring how main.rs already registers "numa" as a service for the proxy's TLS cert. Test fixture updated to mirror the production SAN shape (*.numa + numa.numa) and switched the client to SNI "numa.numa", so the existing DoT test suite implicitly exercises the SNI path used by setup-phone clients. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	7d0fe19462	style: drop narrating comments on dot_alpn and ALPN test Both were restating what the code already said — dot_alpn's doc narrated the function name and the test comment restated the assertion. RFC 7858 §3.2 is already cited on self_signed_tls and build_tls_config where the "why" actually matters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	1632fc36f2	feat: DoT write timeout and ALPN "dot" advertisement Two DoS/interop hardening items: 1. Bound write_framed by WRITE_TIMEOUT (10s) so a slow-reader attacker can't indefinitely hold a worker task and its connection permit. Symmetric to the existing handshake timeout. 2. Advertise ALPN "dot" per RFC 7858 §3.2. Required by some strict DoT clients (newer Apple stacks, some Android versions). rustls ServerConfig exposes alpn_protocols as a pub field so we set it after with_single_cert: - load_tls_config (user-provided cert/key): set directly - self_signed_tls (new, replaces fallback_tls): builds a fresh DoT-specific TLS config via build_tls_config with the ALPN list build_tls_config now takes an `alpn: Vec<Vec<u8>>` parameter so DoT and the proxy can pass different ALPN lists while sharing the same CA. Proxy callers pass Vec::new() (unchanged behavior). Dropped the ctx.tls_config reuse branch: we can't mutate a shared Arc<ServerConfig> to add DoT-specific ALPN, and reusing the proxy config was already quietly broken re: SAN (proxy cert covers *.{tld}, not the DoT server's bind hostname/IP). Added dot_negotiates_alpn test that asserts conn.alpn_protocol() returns Some(b"dot") after handshake. 126/126 tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	2b0c4e3d5e	refactor: trim DoT listener — let-else reads, drop MIN_MSG_LEN and redundant localhost test - Collapse two 4-arm read/timeout matches to let-else (lose one defensive debug log on payload-read timeout; idle timeouts are routine on persistent DoT connections anyway) - Drop MIN_MSG_LEN: DnsPacket::from_buffer rejects truncated input on its own, and BytePacketBuffer is zero-init so buf[0..2] for sub-2-byte messages just yields a harmless FORMERR with id=0 - Inline ACCEPT_ERROR_BACKOFF (single use site) - Drop the partial cert/key warning: missing one of cert_path/ key_path silently falls back to self-signed; users see the self-signed cert at startup and figure it out - Drop dot_localhost_resolution test: RFC 6761 localhost is tested in ctx.rs; this test only verified DoT transport, which dot_resolves_local_zone already covers - Drop self-documenting comment in dot_multiple_queries_on_persistent_connection Net -32 lines, 125/125 tests pass, no behavior change users would notice. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	357c710ec4	style: rustfmt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	7742858b7b	refactor: simplify DoT cert/key match and extract send_response helper - Flatten 4-arm cert/key match in start_dot to 2 arms with the partial-config warning hoisted into a one-liner above the match. - Extract send_response() that serializes a DnsPacket and writes it framed, used by both the FORMERR-on-parse-error and SERVFAIL-on- resolve-error paths. Removes duplicated buffer/write/log boilerplate and unifies the rescode logging via {:?}. No behavior change; 126/126 tests still pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	1239ed0e72	fix: parse DoT queries up-front and echo question in SERVFAIL Address review findings on PR #25: - Refactor resolve_query to take a pre-parsed DnsPacket. Parse-error handling moves to the UDP caller, eliminating the double warn! line on malformed UDP queries. - Enforce MIN_MSG_LEN=12 (DNS header) in handle_dot_connection so query_id extraction is always reading client-sent bytes, not the zeroed buffer tail. - Parse the DoT query before calling resolve_query and retain it, so SERVFAIL responses can echo the original question section via response_from(). Parse failures send FORMERR with the client id. - Extract write_framed() helper for length-prefix + flush, reused by success, SERVFAIL, and FORMERR paths. - Back off 100ms on listener.accept() errors to avoid tight-looping on fd exhaustion. - Replace the hardcoded 127.0.0.1:53 upstream in dot_nxdomain_for_unknown with a bound-but-unresponsive UDP socket owned by the test, making it independent of the host's local resolver. Test now runs in ~220ms (timeout lowered to 200ms) instead of 3s and asserts the question is echoed in the SERVFAIL response. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
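The `write_framed()` helper mentioned in the commit above handles the two-byte, big-endian length prefix that DNS over TLS inherits from DNS over TCP. A minimal synchronous sketch using `std::io` is shown below; the project's helper presumably works on an async TLS stream, so only the framing logic carries over, and `read_framed` is a hypothetical counterpart added for illustration.

```rust
use std::io::{self, Read, Write};

/// Write one DNS message preceded by its 2-byte big-endian length, then flush.
/// Prefix and payload go out in a single write so they are not split
/// across separate TLS records unnecessarily.
fn write_framed<W: Write>(w: &mut W, msg: &[u8]) -> io::Result<()> {
    let len = u16::try_from(msg.len()).map_err(|_| {
        io::Error::new(io::ErrorKind::InvalidInput, "DNS message exceeds 65535 bytes")
    })?;
    let mut frame = Vec::with_capacity(2 + msg.len());
    frame.extend_from_slice(&len.to_be_bytes());
    frame.extend_from_slice(msg);
    w.write_all(&frame)?;
    w.flush()
}

/// Read one length-prefixed DNS message (hypothetical counterpart).
fn read_framed<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 2];
    r.read_exact(&mut len_buf)?;
    let mut msg = vec![0u8; u16::from_be_bytes(len_buf) as usize];
    r.read_exact(&mut msg)?;
    Ok(msg)
}
```

Sharing one helper across the success, SERVFAIL, and FORMERR paths, as the commit describes, keeps the prefix and the flush from drifting apart between response types.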
Razvan Dimescu	cb54ab3dfc	fix: harden DoT listener against slowloris and stale handshakes - Add 10s timeout on TLS handshake — prevents clients from holding a semaphore permit without completing the handshake - Add IDLE_TIMEOUT on payload read_exact — prevents slowloris after sending a valid length prefix then trickling bytes - Extract accept_loop() shared between start_dot and tests — eliminates duplicated accept logic that could drift - Add 5s timeout on TCP reads in recursive test mock server Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	aa8923b2c6	fix: add debug logging for DoT SERVFAIL serialization failure, TC-bit TODO Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	14efc51340	fix: send SERVFAIL on DoT resolve errors, extract shared connection handler - Send SERVFAIL response (with correct query ID) when resolve_query fails, preventing DoT clients from hanging until idle timeout - Extract handle_dot_connection() so tests use the same logic as production, eliminating duplicated accept/read/resolve loop - Replace magic 4096 with named MAX_MSG_LEN constant tied to BUF_SIZE - Add flush() after each TLS write to prevent buffered responses - Extract fallback_tls() helper, handle partial cert/key config, support IPv6 bind address, remove redundant crypto provider init Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	e4350ae81c	feat: add DNS-over-TLS (DoT) listener (RFC 7858) Refactor handle_query into transport-agnostic resolve_query that returns a BytePacketBuffer, keeping the UDP path zero-alloc. Add a TLS listener on port 853 with persistent connections, idle timeout, connection limits, and coalesced writes. Supports user-provided certs or self-signed CA fallback. Includes 5 integration tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 02:53:43 +03:00
Razvan Dimescu	766935ec97	style: fix rustfmt formatting Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 22:46:54 +03:00
Razvan Dimescu	efe3669540	fix: gate exe_path and replace_exe_path for Windows clippy, add macOS CI - Gate exe_path in restart_service() and replace_exe_path() behind #[cfg(any(target_os = "macos", target_os = "linux"))] to fix unused variable and dead code warnings on Windows - Add macOS CI job (clippy + tests) - Add test for template substitution in plist and systemd unit files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 22:46:54 +03:00
Laurin Brandner	ad34fe2d9e	Fix unit replacement for linux	2026-04-06 22:28:30 +03:00
Laurin Brandner	80fcfd10ae	flexible installation path	2026-04-06 22:28:30 +03:00
Razvan Dimescu	8c421b9fa3	fix: check forwarding rules before recursive resolution (#29) Conditional forwarding (Tailscale .ts.net, VPC private zones) was only checked in the forward mode branch. In recursive mode, queries for forwarding-rule domains went to root servers instead of the configured upstream, returning NXDOMAIN for private domains. Move the forwarding rule check before the recursive/forward branch so it takes priority regardless of mode. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 00:07:11 +03:00
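The ordering fix in the commit above can be illustrated with a small routing sketch: the forwarding-rule lookup runs first, and only when no rule matches does the resolver mode decide the path. All types and names here are hypothetical, the suffix matching is an assumed behaviour, and the upstream address in the usage note is only an illustrative value.

```rust
/// Where a query should be sent (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Route<'a> {
    ForwardRule(&'a str),
    DefaultUpstream,
    Recursive,
}

#[derive(Clone, Copy)]
enum Mode {
    Forward,
    Recursive,
}

/// A conditional forwarding rule: queries under `suffix` go to `upstream`.
struct ForwardRule {
    suffix: String,
    upstream: String,
}

/// Case-insensitive match on whole labels, ignoring a trailing root dot.
fn matches_suffix(domain: &str, suffix: &str) -> bool {
    let d = domain.trim_end_matches('.').to_ascii_lowercase();
    let s = suffix.trim_matches('.').to_ascii_lowercase();
    d == s || d.ends_with(&format!(".{s}"))
}

/// Forwarding rules are consulted before the mode branch, so a private
/// zone never reaches the root servers even in recursive mode.
fn route<'a>(domain: &str, rules: &'a [ForwardRule], mode: Mode) -> Route<'a> {
    if let Some(rule) = rules.iter().find(|r| matches_suffix(domain, &r.suffix)) {
        return Route::ForwardRule(&rule.upstream);
    }
    match mode {
        Mode::Forward => Route::DefaultUpstream,
        Mode::Recursive => Route::Recursive,
    }
}
```

With a rule for `ts.net` pointing at an example upstream such as `100.100.100.100:53`, `route("host.tailnet.ts.net.", &rules, Mode::Recursive)` returns the rule's upstream, while `example.com` still goes to recursive resolution.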
Razvan Dimescu	ad7884f2f6	fix: add numa search domain on install for browser compatibility Chrome treats single-label TLDs (e.g. frontend.numa) as search queries unless a trailing slash is added. Adding "numa" as a search domain tells the OS resolver that .numa is valid, so browsers resolve it directly. macOS: networksetup -setsearchdomains, cleared on uninstall Linux (resolved): Domains=~. numa in drop-in Linux (resolv.conf): search numa Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 17:50:22 +03:00
Razvan Dimescu	0b883d1c0d	feat: Windows DNS configuration via netsh (#28 ) * feat: Windows DNS configuration via netsh numa install/uninstall now set/restore system DNS on Windows via netsh. Parses ipconfig /all per-interface (adapter name, DHCP status, DNS servers), saves backup to %APPDATA%\numa\original-dns.json, and restores on uninstall (DHCP or static with secondary servers). Handles localization (German adapter/DHCP/DNS labels), disconnected adapters, multiple interfaces, and missing admin privileges. Adds IP validation to discover_windows() for consistency. No Windows Service or CA trust yet — user runs numa in a terminal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: add cargo test to Windows CI job Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * ci: upload Windows binary as artifact for testing Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: SRTT decay tests panic on Windows due to Instant underflow On Windows, Instant starts near boot time — subtracting large durations panics. Use checked_sub with a process-start fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: SRTT decay tests use binary search for max Instant age Replace age() helper with set_age_secs() on SrttCache that binary-searches for the maximum subtractable duration. Prevents panic on Windows (Instant starts at boot) while still producing the oldest representable instant for correct decay calculations. Also removes ephemeral test-ubuntu.sh from git. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use ProgramData for Windows DNS backup path APPDATA differs between user and admin contexts — install runs as admin but uninstall might resolve a different APPDATA. Use ProgramData which is consistent across elevation contexts. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: disable Dnscache on Windows install, re-enable on uninstall Windows DNS Client (Dnscache) holds port 53 at kernel level and can't be stopped via sc/net stop. Disable via registry during install (requires reboot), re-enable on uninstall. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: rewrite SRTT decay tests as pure functions Decay tests manipulated Instant timestamps which panics on Windows (Instant can't go before boot time). Rewrite to test decay_for_age() directly — a pure function taking srtt_ms and age_secs, no platform dependency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use Quad9 IP (9.9.9.9) for DoH fallback, not hostname DoH to dns.quad9.net requires DNS to resolve the hostname, which creates a chicken-and-egg loop when numa IS the system resolver (e.g. after numa install on Windows). Using the IP directly avoids the bootstrap dependency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: extract DOH_FALLBACK constant Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: extract QUAD9_IP constant Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: remove dead test helpers, fix constant placement Remove unused get_srtt_ms() and saturated_penalty_cache() left over from SRTT test rewrite. Move QUAD9_IP/DOH_FALLBACK after use block. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: ignore ConnectionReset on UDP socket (Windows ICMP error) Windows delivers ICMP port-unreachable as ConnectionReset on the next UDP recv_from, crashing numa. Linux/macOS silently ignore these. Catch and continue the recv loop. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: auto-start numa on Windows boot via registry Run key Without a Windows Service, rebooting after numa install leaves DNS broken (pointing at 127.0.0.1 with nothing listening). Register numa in HKLM\...\Run so it starts automatically. Removed on uninstall. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update README, Windows plan, and launch drafts for Windows support - README: platform-specific Quick Start, install/uninstall table - Windows plan: Phase 2 complete, Phase 3 scoped - Launch drafts: updated "Does it support Windows?" response Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: remove docs from git tracking (already gitignored) docs/ is in .gitignore but files were force-added. Remove from tracking — files remain on disk. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-01 18:17:52 +03:00
Razvan Dimescu	da93a3cde3	feat: add memory footprint to /stats and dashboard (#26 ) * feat: add memory footprint to /stats and dashboard Per-structure heap estimation (cache, blocklist, query log, SRTT, overrides) with process RSS via mach_task_basic_info / sysconf. Dashboard gets a 6th stat card and a sidebar breakdown panel with stacked bar visualization. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use phys_footprint on macOS to match Activity Monitor Switch from MACH_TASK_BASIC_INFO (resident_size) to TASK_VM_INFO (phys_footprint) which matches Activity Monitor's Memory column. Also: capacity-aware heap estimation, entry counts in memory payload, heap_bytes tests for all stores. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: remove redundant fields and fix naming in memory stats Remove duplicate entry counts from MemoryStats (already in parent StatsResponse), rename process_rss_bytes to process_memory_bytes to match macOS phys_footprint semantics, drop restating comments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 09:09:44 +03:00
Razvan Dimescu	98da440c84	feat: forward-by-default, auto recursive mode, Linux install fixes (#27 ) * feat: auto recursive mode, fix Linux install Auto mode (new default): probes a root server on startup; uses recursive resolution if outbound DNS works, falls back to Quad9 DoH if blocked. Dashboard shows mode indicator (green/yellow). Linux install fixes: - Add DNSStubListener=no to resolved drop-in (frees port 53) - Configure DNS before starting service (correct ordering) - Skip 127.0.0.53 in upstream detection - `numa install` now does everything (service + DNS + CA) - `numa uninstall` mirrors install (stop service + restore DNS) - Extract is_loopback_or_stub() for consistent filtering Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: enable DNSSEC validation by default With recursive as the default mode, DNSSEC validation completes the trustless resolution chain. Strict mode remains off by default. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: forward search domains to VPC resolver on Linux Parse search/domain lines from resolv.conf and create conditional forwarding rules to the original nameserver or AWS VPC resolver (169.254.169.253). Fixes internal hostname resolution on cloud VMs where recursive mode can't resolve private DNS zones. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: single-pass resolv.conf parsing, eliminate redundancies Parse resolv.conf once for both upstream and search domains instead of 2-3 reads. Extract CLOUD_VPC_RESOLVER constant. Use &'static str for mode in StatsResponse. Remove dead read_upstream_from_file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: macOS install health check, harden recursive probe Verify numa is listening (API port) before redirecting system DNS on macOS — if the service fails to start (e.g. port 53 in use), unload the service and abort instead of breaking DNS. Probe up to 3 root hints before declaring recursive mode unavailable. 
Validate IPs from resolvectl to avoid IPv6 fragment extraction. Extract DEFAULT_API_PORT constant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: widen make_rule cfg gate to include Linux make_rule was gated to macOS-only but discover_linux() calls it for search domain forwarding rules. CI failed on Linux with E0425. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: forward mode as default, recursive opt-in Forward mode (transparent proxy to system DNS) is now the default. Recursive and auto modes are explicit opt-in via config. This avoids bypassing corporate DNS policies, captive portals, VPC private zones, and parental controls on first install. - Move #[default] from Auto to Forward on UpstreamMode - DNSSEC defaults to off (no-op in forward mode) - 3-way match in main: Forward/Recursive/Auto with clean separation - Post-install message suggests mode = "recursive" for sovereignty - Update README, site, and launch drafts messaging Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-01 08:49:16 +03:00
Razvan Dimescu	ff1200eb10	feat: resolve .numa services to LAN IP for remote clients (#23 ) * feat: resolve .numa services to LAN IP for remote clients Remote DNS clients (e.g. phones on same WiFi) received 127.0.0.1 for local .numa services, which is unreachable from their perspective. Now returns the host's LAN IP when the query originates from a non-loopback address. Also auto-widens proxy bind to 0.0.0.0 when DNS is already public, and adds a startup warning when the proxy remains localhost-only. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: respect proxy bind_addr config, don't auto-widen The auto-widen silently overrode an explicit config value — the user's config should be the source of truth. Now the proxy always uses the configured bind_addr, and the warning fires whenever it's 127.0.0.1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update proxy bind_addr comment in example config Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 23:15:42 +03:00
Razvan Dimescu	49535568d9	refactor: deduplicate query builders, record extraction, sinkhole records (#22) - Add DnsPacket::query(id, domain, qtype) constructor; replace mock_query, make_query, and 4 inline constructions across ctx/forward/recursive/api - Add record_to_addr() in recursive.rs; replace 4 identical A/AAAA match blocks with filter_map one-liners - Add sinkhole_record() in ctx.rs; consolidate localhost and blocklist A/AAAA branching into single calls - Remove now-unused DnsQuestion imports Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 14:22:07 +03:00
Razvan Dimescu	669498e85f	refactor: extract resolve_coalesced, test real code (#21 ) * refactor: extract resolve_coalesced, rewrite tests against real code Extract Disposition enum, acquire_inflight(), and resolve_coalesced() from handle_query so coalescing logic is independently testable. Rewrite integration tests to call resolve_coalesced directly with mock futures instead of fighting the iterative resolver's NS chain. All 12 coalescing tests now exercise production code paths, not tokio primitives. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: SERVFAIL echoes question section, preserve error messages resolve_coalesced now takes &DnsPacket instead of query_id so SERVFAIL responses use response_from (echoing question section per RFC). Error messages preserved via Option<String> return for upstream error logging. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-29 11:14:25 +03:00
Razvan Dimescu	30e46e549c	feat: in-flight query coalescing with COALESCED path (#20 ) * feat: in-flight query coalescing for recursive resolver When multiple queries for the same (domain, qtype) arrive concurrently and all miss the cache, only the first triggers recursive resolution. Subsequent queries wait on a broadcast channel for the result. Prevents thundering herd where N concurrent cache misses each independently walk the full NS chain, compounding timeouts. Uses InflightGuard (Drop impl) to guarantee map cleanup on panic/cancellation — prevents permanent SERVFAIL poisoning. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: add InflightMap type alias for clippy Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add COALESCED query path and coalescing tests Followers in the inflight coalescing path now log as COALESCED instead of RECURSIVE, making it visible in the dashboard when queries were deduplicated vs independently resolved. Adds 10 tests covering InflightGuard cleanup, broadcast mechanics, and concurrent handle_query coalescing through a mock TCP DNS server. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: cargo fmt Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: extract acquire_inflight, rewrite tests against real code Move Disposition enum and inflight acquisition logic into a standalone acquire_inflight() function. Rewrite 4 tests that were exercising tokio primitives to call the real coalescing code path instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 10:36:02 +03:00
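The coalescing described in the commit above can be sketched in plain std Rust. This is a minimal illustration, not the project's code: the names `Disposition`, `acquire_inflight`, `InflightGuard`, and `InflightMap` come from the commit messages, but their fields and signatures here are assumptions, and a follower counter stands in for the broadcast channel the real resolver uses.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Key is (domain, qtype); the value is a follower counter standing in for
// the broadcast sender mentioned in the commit (assumption for illustration).
type Key = (String, u16);
type InflightMap = Arc<Mutex<HashMap<Key, usize>>>;

// Removes the in-flight entry when dropped, so a panicking or cancelled
// leader cannot leave a stale entry behind.
struct InflightGuard {
    map: InflightMap,
    key: Key,
}

impl Drop for InflightGuard {
    fn drop(&mut self) {
        // Tolerate a poisoned lock: cleanup must still happen during unwinding.
        let mut inflight = self.map.lock().unwrap_or_else(|e| e.into_inner());
        inflight.remove(&self.key);
    }
}

enum Disposition {
    // First query for this key: the caller performs the resolution.
    Leader(InflightGuard),
    // A resolution is already running: the caller waits for its result.
    Follower,
}

fn acquire_inflight(map: &InflightMap, key: Key) -> Disposition {
    let mut inflight = map.lock().unwrap_or_else(|e| e.into_inner());
    if let Some(followers) = inflight.get_mut(&key) {
        *followers += 1;
        return Disposition::Follower;
    }
    inflight.insert(key.clone(), 0);
    drop(inflight); // release the lock before handing out the guard
    Disposition::Leader(InflightGuard { map: Arc::clone(map), key })
}
```

Tying cleanup to `Drop` rather than to an explicit call at the end of resolution is what makes the map safe against early returns and panics in the leader.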
Razvan Dimescu	06d4e91cd2	feat: SRTT-based nameserver selection (#19 ) * feat: SRTT-based nameserver selection for recursive resolver BIND-style Smoothed RTT (EWMA) tracking per NS IP address. The resolver learns which nameservers respond fastest and prefers them, eliminating cascading timeouts from slow/unreachable IPv6 servers. - New src/srtt.rs: SrttCache with record_rtt, record_failure, sort_by_rtt - EWMA formula: new = (old * 7 + sample) / 8, 5s failure penalty, 5min decay - TCP penalty (+100ms) lets SRTT naturally deprioritize IPv6-over-TCP - Enabled flag embedded in SrttCache (no-op when disabled) - Batch eviction (64 entries) for O(1) amortized writes at capacity - Configurable via [upstream] srtt = true/false (default: true) - Benchmark script: scripts/benchmark.sh (full, cold, warm, compare-all) - Benchmarks show 12x avg improvement, 0% queries >1s (was 58%) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: show DNSSEC and SRTT status in dashboard + API Add dnssec and srtt boolean fields to /stats API response. Display on/off indicators in the dashboard footer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: apply SRTT decay before EWMA so recovered servers rehabilitate Without decay-before-EWMA, a server penalized at 5000ms stayed near that value even after recovery — the stale raw penalty was used as the EWMA base instead of the decayed estimate. Extract decayed_srtt() helper and call it in record_rtt() before the smoothing step. Also restores removed "why" comments in send_query / resolve_recursive. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add install/upgrade instructions, smarter benchmark priming README: document `numa install`, `numa service`, Homebrew upgrade, and `make deploy` workflows. Benchmark: replace fixed `sleep 4` with `wait_for_priming` that polls cache entry count for stability. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 23:22:31 +02:00
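The EWMA update and failure penalty from the SRTT commit above can be illustrated with a small sketch. The formula `new = (old * 7 + sample) / 8`, the 5 s penalty, and the method names `record_rtt`, `record_failure`, and `sort_by_rtt` are taken from the commit message. Everything else is an assumption made for illustration: the `SrttSketch` type, the millisecond units, and the 200 ms starting estimate are hypothetical, and the 5-minute decay, TCP penalty, and eviction are left out.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

// Penalty recorded for a failed query (5 s, per the commit message).
const FAILURE_PENALTY_MS: u64 = 5_000;
// Hypothetical starting estimate for servers with no samples yet.
const INITIAL_SRTT_MS: u64 = 200;

// EWMA step from the commit message: new = (old * 7 + sample) / 8.
fn ewma(old_ms: u64, sample_ms: u64) -> u64 {
    (old_ms * 7 + sample_ms) / 8
}

#[derive(Default)]
struct SrttSketch {
    entries: HashMap<IpAddr, u64>,
}

impl SrttSketch {
    // Current estimate for a server, or the assumed default if unseen.
    fn get(&self, ip: &IpAddr) -> u64 {
        self.entries.get(ip).copied().unwrap_or(INITIAL_SRTT_MS)
    }

    // Fold a successful round-trip time into the smoothed estimate.
    fn record_rtt(&mut self, ip: IpAddr, sample_ms: u64) {
        let updated = match self.entries.get(&ip) {
            Some(&old) => ewma(old, sample_ms),
            None => sample_ms,
        };
        self.entries.insert(ip, updated);
    }

    // Overwrite the estimate with the failure penalty.
    fn record_failure(&mut self, ip: IpAddr) {
        self.entries.insert(ip, FAILURE_PENALTY_MS);
    }

    // Order candidate nameservers from fastest to slowest estimate.
    fn sort_by_rtt(&self, addrs: &mut [IpAddr]) {
        addrs.sort_by_key(|ip| self.get(ip));
    }
}
```

Without decay, one 40 ms sample after a failure moves the estimate only from 5000 ms to 4380 ms, which is the slow rehabilitation that the decay-before-EWMA fix in the same commit addresses.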
Razvan Dimescu	71dbb138bc	fix: return NXDOMAIN for .local queries instead of SERVFAIL (#18 ) .local is reserved for mDNS (RFC 6762) and cannot be resolved by upstream DNS servers. Add it to is_special_use_domain() so queries like _grpc_config.localhost.local get an immediate NXDOMAIN instead of timing out and returning SERVFAIL. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 22:42:33 +02:00
Razvan Dimescu	a84f2e7f1d	feat: recursive DNS + DNSSEC + TCP fallback (#17 ) * feat: recursive resolution + full DNSSEC validation Numa becomes a true DNS resolver — resolves from root nameservers with complete DNSSEC chain-of-trust verification. Recursive resolution: - Iterative RFC 1034 from configurable root hints (13 default) - CNAME chasing (depth 8), referral following (depth 10) - A+AAAA glue extraction, IPv6 nameserver support - TLD priming: NS + DS + DNSKEY for 34 gTLDs + EU ccTLDs - Config: mode = "recursive" in [upstream], root_hints, prime_tlds DNSSEC (all 4 phases): - EDNS0 OPT pseudo-record (DO bit, 1232 payload per DNS Flag Day 2020) - DNSKEY, DS, RRSIG, NSEC, NSEC3 record types with wire read/write - Signature verification via ring: RSA/SHA-256, ECDSA P-256, Ed25519 - Chain-of-trust: zone DNSKEY → parent DS → root KSK (key tag 20326) - DNSKEY RRset self-signature verification (RRSIG(DNSKEY) by KSK) - RRSIG expiration/inception time validation - NSEC: NXDOMAIN gap proofs, NODATA type absence, wildcard denial - NSEC3: SHA-1 iterated hashing, closest encloser proof, hash range - Authority RRSIG verification for denial proofs - Config: [dnssec] enabled/strict (default false, opt-in) - AD bit on Secure, SERVFAIL on Bogus+strict - DnssecStatus cached per entry, ValidationStats logging Performance: - TLD chain pre-warmed on startup (root DNSKEY + TLD DS/DNSKEY) - Referral DS piggybacking from authority sections - DNSKEY prefetch before validation loop - Cold-cache validation: ~1 DNSKEY fetch (down from 5) - Benchmarks: RSA 10.9µs, ECDSA 174ns, DS verify 257ns Also: - write_qname fix for root domain "." 
(was producing malformed queries) - write_record_header() dedup, write_bytes() bulk writes - DnsRecord::domain() + query_type() accessors - UpstreamMode enum, DEFAULT_EDNS_PAYLOAD const - Real glue TTL (was hardcoded 3600) - DNSSEC restricted to recursive mode only Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: TCP fallback, query minimization, UDP auto-disable Transport resilience for restrictive networks (ISPs blocking UDP:53): - DNS-over-TCP fallback: UDP fail/truncation → automatic TCP retry - UDP auto-disable: after 3 consecutive failures, switch to TCP-first - IPv6 → TCP directly (UDP socket binds 0.0.0.0, can't reach IPv6) - Network change resets UDP detection for re-probing - Root hint rotation in TLD priming Privacy: - RFC 7816 query minimization: root servers see TLD only, not full name Code quality: - Merged find_starting_ns + find_starting_zone → find_closest_ns - Extracted resolve_ns_addrs_from_glue shared helper - Removed overall timeout wrapper (per-hop timeouts sufficient) - forward_tcp for DNS-over-TCP (RFC 1035 §4.2.2) Testing: - Mock TCP-only DNS server for fallback tests (no network needed) - tcp_fallback_resolves_when_udp_blocked - tcp_only_iterative_resolution - tcp_fallback_handles_nxdomain - udp_auto_disable_resets - Integration test suite (4 suites, 51 tests) - Network probe script (tests/network-probe.sh) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: DNSSEC verified badge in dashboard query log - Add dnssec field to QueryLogEntry, track validation status per query - DnssecStatus::as_str() for API serialization - Dashboard shows green checkmark next to DNSSEC-verified responses - Blog post: add "How keys get there" section, transport resilience section, trim code blocks, update What's Next Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use SVG shield for DNSSEC badge, update blog HTML Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: NS 
cache lookup from authorities, UDP re-probe, shield alignment - find_closest_ns checks authorities (not just answers) for NS records, fixing TLD priming cache misses that caused redundant root queries - Periodic UDP re-probe every 5min when disabled — re-enables UDP after switching from a restrictive network to an open one - Dashboard DNSSEC shield uses fixed-width container for alignment - Blog post: tuck key-tag into trust anchor paragraph Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: TCP single-write, mock server consistency, integration tests - TCP single-write fix: combine length prefix + message to avoid split segments that Microsoft/Azure DNS servers reject - Mock server (spawn_tcp_dns_server) updated to use single-write too - Tests: forward_tcp_wire_format, forward_tcp_single_segment_write - Integration: real-server checks for Microsoft/Office/Azure domains Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: recursive bar in dashboard, special-use domain interception Dashboard: - Add Recursive bar to resolution paths chart (cyan, distinct from Override) - Add RECURSIVE path tag style in query log Special-use domains (RFC 6761/6303/8880/9462): - .localhost → 127.0.0.1 (RFC 6761) - Private reverse PTR (10.x, 192.168.x, 172.16-31.x) → NXDOMAIN - _dns.resolver.arpa (DDR) → NXDOMAIN - ipv4only.arpa (NAT64) → 192.0.0.170/171 - mDNS service discovery for private ranges → NXDOMAIN Eliminates ~900ms SERVFAILs for macOS system queries that were hitting root servers unnecessarily. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: move generated blog HTML to site/blog/posts/, gitignore - Generated HTML now in site/blog/posts/ (gitignored) - CI workflow runs pandoc + make blog before deploy - Updated all internal blog links to /blog/posts/ path - blog/.md remains the source of truth Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix: review feedback — memory ordering, RRSIG time, NS resolution - Ordering::Relaxed → Acquire/Release for UDP_DISABLED/UDP_FAILURES (ARM correctness for cross-thread coordination) - RRSIG time validation: serial number arithmetic (RFC 4034 §3.1.5) + 300s clock skew fudge factor (matches BIND) - resolve_ns_addrs_from_glue collects addresses from ALL NS names, not just the first with glue (improves failover) - is_special_use_domain: eliminate 16 format! allocations per .in-addr.arpa query (parse octet instead) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: API endpoint tests, coverage target - 8 new axum handler tests: health, stats, query-log, overrides CRUD, cache, blocking stats, services CRUD, dashboard HTML - Tests use tower::oneshot — no network, no server startup - test_ctx() builds minimal ServerCtx for isolated testing - `make coverage` target (cargo-tarpaulin), separate from `make all` - 82 total tests (was 74) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 04:03:47 +02:00
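The "TCP single-write" fix in the commit above concerns DNS-over-TCP framing (RFC 1035 §4.2.2), where each message is preceded by a two-byte big-endian length. A minimal sketch of that idea follows: the prefix and the message are built into one buffer and handed to a single `write_all` call. The function names `frame_tcp_message` and `send_framed` are hypothetical and are not taken from the repository.

```rust
use std::io::{self, Write};

// Build the TCP wire form of a DNS message: a two-byte big-endian length
// prefix followed by the message, in one contiguous buffer.
// Returns None if the message does not fit in a 16-bit length.
fn frame_tcp_message(msg: &[u8]) -> Option<Vec<u8>> {
    if msg.len() > u16::MAX as usize {
        return None;
    }
    let len = msg.len() as u16;
    let mut framed = Vec::with_capacity(2 + msg.len());
    framed.extend_from_slice(&len.to_be_bytes());
    framed.extend_from_slice(msg);
    Some(framed)
}

// Send prefix and message with one write_all call, instead of writing the
// length and the body separately.
fn send_framed<W: Write>(out: &mut W, msg: &[u8]) -> io::Result<()> {
    let framed = frame_tcp_message(msg).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "DNS message exceeds 65535 bytes")
    })?;
    out.write_all(&framed)
}
```

A single buffer does not strictly guarantee a single TCP segment, but it removes the two separate writes that make a split between prefix and body likely.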
Razvan Dimescu	f849a4d65f	feat: self-host fonts, styled block page, wildcard TLS (#16 ) * perf: optimize hot path — RwLock, inline filtering, pre-allocated strings - Mutex → RwLock for cache, blocklist, and overrides (concurrent read access) - Make cache.lookup() and overrides.lookup() take &self (read-only) - Eliminate 3 Vec allocations per DnsPacket::write() via inline filtering - Pre-allocate domain strings with capacity 64 in parse path - Add criterion micro-benchmarks (hot_path + throughput) - Add bench README documenting both benchmark suites Measured improvement: ~14% faster parsing, ~9% pipeline throughput, round-trip cached 733ns → 698ns (~2.3M queries/sec). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: simplify benchmark code after review - Remove redundant DnsHeader::new() (already set by DnsPacket::new()) - Remove unused DnsHeader import - Change simulate_cached_pipeline to take &DnsCache (lookup is &self now) - Remove unnecessary mut on cache in cache_lookup_miss bench Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * site: landing page overhaul, blog, benchmarks, numa.rs domain Landing page: - Split features into 3-layer card layout (Block & Protect, Developer Tools, Self-Sovereign DNS) - Add DoH and conditional forwarding to comparison table - Fix performance claim (2.3M → 2.0M qps to match benchmarks) - Add all 3 install methods (brew, cargo, curl) - Add OG tags + canonical URL for numa.rs - Fix code block whitespace rendering - Update roadmap with .onion bridge phase Blog: - Add "Building a DNS Resolver from Scratch in Rust" post - Blog index + template for future posts Other: - CNAME for GitHub Pages (numa.rs) - Benchmark results (bench/results.json) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: self-host fonts, styled block page, wildcard TLS Fonts: - Replace Google Fonts CDN with self-hosted woff2 (73KB, 5 files) - Serve fonts from API server via include_bytes! 
(dashboard works offline) - Proxy error pages use system fonts (zero external deps when DNS is broken) - Fix Instrument Serif font-weight: use 400 (only available weight) instead of synthetic bold 600/700 Proxy: - Styled "Blocked by Numa" page when blocked domain hits the proxy (was confusing "not a .numa domain" error) - Extract shared error_page() template for 403 + 404 pages (deduplicate ~160 lines of CSS) TLS: - Add wildcard SAN *.numa to cert — unregistered .numa domains get valid HTTPS (styled 404 without cert warning) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 02:19:54 +02:00
Razvan Dimescu	962b400f4c	perf: optimize DNS query hot path (#15 ) * perf: optimize hot path — RwLock, inline filtering, pre-allocated strings - Mutex → RwLock for cache, blocklist, and overrides (concurrent read access) - Make cache.lookup() and overrides.lookup() take &self (read-only) - Eliminate 3 Vec allocations per DnsPacket::write() via inline filtering - Pre-allocate domain strings with capacity 64 in parse path - Add criterion micro-benchmarks (hot_path + throughput) - Add bench README documenting both benchmark suites Measured improvement: ~14% faster parsing, ~9% pipeline throughput, round-trip cached 733ns → 698ns (~2.3M queries/sec). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: simplify benchmark code after review - Remove redundant DnsHeader::new() (already set by DnsPacket::new()) - Remove unused DnsHeader import - Change simulate_cached_pipeline to take &DnsCache (lookup is &self now) - Remove unnecessary mut on cache in cache_lookup_miss bench Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 02:01:08 +02:00


149 Commits