Commit Graph

56 Commits

Author SHA1 Message Date
Razvan Dimescu
669498e85f refactor: extract resolve_coalesced, test real code (#21)
* refactor: extract resolve_coalesced, rewrite tests against real code

Extract Disposition enum, acquire_inflight(), and resolve_coalesced()
from handle_query so coalescing logic is independently testable. Rewrite
integration tests to call resolve_coalesced directly with mock futures
instead of fighting the iterative resolver's NS chain. All 12 coalescing
tests now exercise production code paths, not tokio primitives.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: SERVFAIL echoes question section, preserve error messages

resolve_coalesced now takes &DnsPacket instead of query_id so SERVFAIL
responses use response_from (echoing question section per RFC). Error
messages preserved via Option<String> return for upstream error logging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 11:14:25 +03:00
Razvan Dimescu
30e46e549c feat: in-flight query coalescing with COALESCED path (#20)
* feat: in-flight query coalescing for recursive resolver

When multiple queries for the same (domain, qtype) arrive concurrently
and all miss the cache, only the first triggers recursive resolution.
Subsequent queries wait on a broadcast channel for the result.

Prevents thundering herd where N concurrent cache misses each
independently walk the full NS chain, compounding timeouts.

Uses InflightGuard (Drop impl) to guarantee map cleanup on
panic/cancellation — prevents permanent SERVFAIL poisoning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: add InflightMap type alias for clippy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add COALESCED query path and coalescing tests

Followers in the inflight coalescing path now log as COALESCED instead
of RECURSIVE, making it visible in the dashboard when queries were
deduplicated vs independently resolved. Adds 10 tests covering
InflightGuard cleanup, broadcast mechanics, and concurrent handle_query
coalescing through a mock TCP DNS server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: cargo fmt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: extract acquire_inflight, rewrite tests against real code

Move Disposition enum and inflight acquisition logic into a standalone
acquire_inflight() function. Rewrite 4 tests that were exercising tokio
primitives to call the real coalescing code path instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 10:36:02 +03:00
Razvan Dimescu
06d4e91cd2 feat: SRTT-based nameserver selection (#19)
* feat: SRTT-based nameserver selection for recursive resolver

BIND-style Smoothed RTT (EWMA) tracking per NS IP address. The resolver
learns which nameservers respond fastest and prefers them, eliminating
cascading timeouts from slow/unreachable IPv6 servers.

- New src/srtt.rs: SrttCache with record_rtt, record_failure, sort_by_rtt
- EWMA formula: new = (old * 7 + sample) / 8, 5s failure penalty, 5min decay
- TCP penalty (+100ms) lets SRTT naturally deprioritize IPv6-over-TCP
- Enabled flag embedded in SrttCache (no-op when disabled)
- Batch eviction (64 entries) for O(1) amortized writes at capacity
- Configurable via [upstream] srtt = true/false (default: true)
- Benchmark script: scripts/benchmark.sh (full, cold, warm, compare-all)
- Benchmarks show 12x avg improvement, 0% queries >1s (was 58%)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: show DNSSEC and SRTT status in dashboard + API

Add dnssec and srtt boolean fields to /stats API response.
Display on/off indicators in the dashboard footer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: apply SRTT decay before EWMA so recovered servers rehabilitate

Without decay-before-EWMA, a server penalized at 5000ms stayed near
that value even after recovery — the stale raw penalty was used as the
EWMA base instead of the decayed estimate. Extract decayed_srtt()
helper and call it in record_rtt() before the smoothing step.

Also restores removed "why" comments in send_query / resolve_recursive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add install/upgrade instructions, smarter benchmark priming

README: document `numa install`, `numa service`, Homebrew upgrade,
and `make deploy` workflows. Benchmark: replace fixed `sleep 4` with
`wait_for_priming` that polls cache entry count for stability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 23:22:31 +02:00
Razvan Dimescu
71dbb138bc fix: return NXDOMAIN for .local queries instead of SERVFAIL (#18)
.local is reserved for mDNS (RFC 6762) and cannot be resolved by
upstream DNS servers. Add it to is_special_use_domain() so queries
like _grpc_config.localhost.local get an immediate NXDOMAIN instead
of timing out and returning SERVFAIL.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 22:42:33 +02:00
Razvan Dimescu
a84f2e7f1d feat: recursive DNS + DNSSEC + TCP fallback (#17)
* feat: recursive resolution + full DNSSEC validation

Numa becomes a true DNS resolver — resolves from root nameservers
with complete DNSSEC chain-of-trust verification.

Recursive resolution:
- Iterative RFC 1034 from configurable root hints (13 default)
- CNAME chasing (depth 8), referral following (depth 10)
- A+AAAA glue extraction, IPv6 nameserver support
- TLD priming: NS + DS + DNSKEY for 34 gTLDs + EU ccTLDs
- Config: mode = "recursive" in [upstream], root_hints, prime_tlds

DNSSEC (all 4 phases):
- EDNS0 OPT pseudo-record (DO bit, 1232 payload per DNS Flag Day 2020)
- DNSKEY, DS, RRSIG, NSEC, NSEC3 record types with wire read/write
- Signature verification via ring: RSA/SHA-256, ECDSA P-256, Ed25519
- Chain-of-trust: zone DNSKEY → parent DS → root KSK (key tag 20326)
- DNSKEY RRset self-signature verification (RRSIG(DNSKEY) by KSK)
- RRSIG expiration/inception time validation
- NSEC: NXDOMAIN gap proofs, NODATA type absence, wildcard denial
- NSEC3: SHA-1 iterated hashing, closest encloser proof, hash range
- Authority RRSIG verification for denial proofs
- Config: [dnssec] enabled/strict (default false, opt-in)
- AD bit on Secure, SERVFAIL on Bogus+strict
- DnssecStatus cached per entry, ValidationStats logging

Performance:
- TLD chain pre-warmed on startup (root DNSKEY + TLD DS/DNSKEY)
- Referral DS piggybacking from authority sections
- DNSKEY prefetch before validation loop
- Cold-cache validation: ~1 DNSKEY fetch (down from 5)
- Benchmarks: RSA 10.9µs, ECDSA 174ns, DS verify 257ns

Also:
- write_qname fix for root domain "." (was producing malformed queries)
- write_record_header() dedup, write_bytes() bulk writes
- DnsRecord::domain() + query_type() accessors
- UpstreamMode enum, DEFAULT_EDNS_PAYLOAD const
- Real glue TTL (was hardcoded 3600)
- DNSSEC restricted to recursive mode only

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: TCP fallback, query minimization, UDP auto-disable

Transport resilience for restrictive networks (ISPs blocking UDP:53):
- DNS-over-TCP fallback: UDP fail/truncation → automatic TCP retry
- UDP auto-disable: after 3 consecutive failures, switch to TCP-first
- IPv6 → TCP directly (UDP socket binds 0.0.0.0, can't reach IPv6)
- Network change resets UDP detection for re-probing
- Root hint rotation in TLD priming

Privacy:
- RFC 7816 query minimization: root servers see TLD only, not full name

Code quality:
- Merged find_starting_ns + find_starting_zone → find_closest_ns
- Extracted resolve_ns_addrs_from_glue shared helper
- Removed overall timeout wrapper (per-hop timeouts sufficient)
- forward_tcp for DNS-over-TCP (RFC 1035 §4.2.2)

Testing:
- Mock TCP-only DNS server for fallback tests (no network needed)
- tcp_fallback_resolves_when_udp_blocked
- tcp_only_iterative_resolution
- tcp_fallback_handles_nxdomain
- udp_auto_disable_resets
- Integration test suite (4 suites, 51 tests)
- Network probe script (tests/network-probe.sh)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: DNSSEC verified badge in dashboard query log

- Add dnssec field to QueryLogEntry, track validation status per query
- DnssecStatus::as_str() for API serialization
- Dashboard shows green checkmark next to DNSSEC-verified responses
- Blog post: add "How keys get there" section, transport resilience section,
  trim code blocks, update What's Next

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use SVG shield for DNSSEC badge, update blog HTML

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: NS cache lookup from authorities, UDP re-probe, shield alignment

- find_closest_ns checks authorities (not just answers) for NS records,
  fixing TLD priming cache misses that caused redundant root queries
- Periodic UDP re-probe every 5min when disabled — re-enables UDP
  after switching from a restrictive network to an open one
- Dashboard DNSSEC shield uses fixed-width container for alignment
- Blog post: tuck key-tag into trust anchor paragraph

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: TCP single-write, mock server consistency, integration tests

- TCP single-write fix: combine length prefix + message to avoid split
  segments that Microsoft/Azure DNS servers reject
- Mock server (spawn_tcp_dns_server) updated to use single-write too
- Tests: forward_tcp_wire_format, forward_tcp_single_segment_write
- Integration: real-server checks for Microsoft/Office/Azure domains

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: recursive bar in dashboard, special-use domain interception

Dashboard:
- Add Recursive bar to resolution paths chart (cyan, distinct from Override)
- Add RECURSIVE path tag style in query log

Special-use domains (RFC 6761/6303/8880/9462):
- .localhost → 127.0.0.1 (RFC 6761)
- Private reverse PTR (10.x, 192.168.x, 172.16-31.x) → NXDOMAIN
- _dns.resolver.arpa (DDR) → NXDOMAIN
- ipv4only.arpa (NAT64) → 192.0.0.170/171
- mDNS service discovery for private ranges → NXDOMAIN

Eliminates ~900ms SERVFAILs for macOS system queries that were
hitting root servers unnecessarily.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: move generated blog HTML to site/blog/posts/, gitignore

- Generated HTML now in site/blog/posts/ (gitignored)
- CI workflow runs pandoc + make blog before deploy
- Updated all internal blog links to /blog/posts/ path
- blog/*.md remains the source of truth

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: review feedback — memory ordering, RRSIG time, NS resolution

- Ordering::Relaxed → Acquire/Release for UDP_DISABLED/UDP_FAILURES
  (ARM correctness for cross-thread coordination)
- RRSIG time validation: serial number arithmetic (RFC 4034 §3.1.5)
  + 300s clock skew fudge factor (matches BIND)
- resolve_ns_addrs_from_glue collects addresses from ALL NS names,
  not just the first with glue (improves failover)
- is_special_use_domain: eliminate 16 format! allocations per
  .in-addr.arpa query (parse octet instead)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: API endpoint tests, coverage target

- 8 new axum handler tests: health, stats, query-log, overrides CRUD,
  cache, blocking stats, services CRUD, dashboard HTML
- Tests use tower::oneshot — no network, no server startup
- test_ctx() builds minimal ServerCtx for isolated testing
- `make coverage` target (cargo-tarpaulin), separate from `make all`
- 82 total tests (was 74)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 04:03:47 +02:00
Razvan Dimescu
f849a4d65f feat: self-host fonts, styled block page, wildcard TLS (#16)
* perf: optimize hot path — RwLock, inline filtering, pre-allocated strings

- Mutex → RwLock for cache, blocklist, and overrides (concurrent read access)
- Make cache.lookup() and overrides.lookup() take &self (read-only)
- Eliminate 3 Vec allocations per DnsPacket::write() via inline filtering
- Pre-allocate domain strings with capacity 64 in parse path
- Add criterion micro-benchmarks (hot_path + throughput)
- Add bench README documenting both benchmark suites

Measured improvement: ~14% faster parsing, ~9% pipeline throughput,
round-trip cached 733ns → 698ns (~2.3M queries/sec).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: simplify benchmark code after review

- Remove redundant DnsHeader::new() (already set by DnsPacket::new())
- Remove unused DnsHeader import
- Change simulate_cached_pipeline to take &DnsCache (lookup is &self now)
- Remove unnecessary mut on cache in cache_lookup_miss bench

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* site: landing page overhaul, blog, benchmarks, numa.rs domain

Landing page:
- Split features into 3-layer card layout (Block & Protect, Developer Tools, Self-Sovereign DNS)
- Add DoH and conditional forwarding to comparison table
- Fix performance claim (2.3M → 2.0M qps to match benchmarks)
- Add all 3 install methods (brew, cargo, curl)
- Add OG tags + canonical URL for numa.rs
- Fix code block whitespace rendering
- Update roadmap with .onion bridge phase

Blog:
- Add "Building a DNS Resolver from Scratch in Rust" post
- Blog index + template for future posts

Other:
- CNAME for GitHub Pages (numa.rs)
- Benchmark results (bench/results.json)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: self-host fonts, styled block page, wildcard TLS

Fonts:
- Replace Google Fonts CDN with self-hosted woff2 (73KB, 5 files)
- Serve fonts from API server via include_bytes! (dashboard works offline)
- Proxy error pages use system fonts (zero external deps when DNS is broken)
- Fix Instrument Serif font-weight: use 400 (only available weight) instead of synthetic bold 600/700

Proxy:
- Styled "Blocked by Numa" page when blocked domain hits the proxy (was confusing "not a .numa domain" error)
- Extract shared error_page() template for 403 + 404 pages (deduplicate ~160 lines of CSS)

TLS:
- Add wildcard SAN *.numa to cert — unregistered .numa domains get valid HTTPS (styled 404 without cert warning)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 02:19:54 +02:00
Razvan Dimescu
962b400f4c perf: optimize DNS query hot path (#15)
* perf: optimize hot path — RwLock, inline filtering, pre-allocated strings

- Mutex → RwLock for cache, blocklist, and overrides (concurrent read access)
- Make cache.lookup() and overrides.lookup() take &self (read-only)
- Eliminate 3 Vec allocations per DnsPacket::write() via inline filtering
- Pre-allocate domain strings with capacity 64 in parse path
- Add criterion micro-benchmarks (hot_path + throughput)
- Add bench README documenting both benchmark suites

Measured improvement: ~14% faster parsing, ~9% pipeline throughput,
round-trip cached 733ns → 698ns (~2.3M queries/sec).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: simplify benchmark code after review

- Remove redundant DnsHeader::new() (already set by DnsPacket::new())
- Remove unused DnsHeader import
- Change simulate_cached_pipeline to take &DnsCache (lookup is &self now)
- Remove unnecessary mut on cache in cache_lookup_miss bench

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 02:01:08 +02:00
Razvan Dimescu
c5208e934d feat: DNS-over-HTTPS (DoH) upstream forwarding (#14)
* feat: DNS-over-HTTPS upstream forwarding

Encrypt upstream queries via DoH — ISPs see HTTPS traffic on port 443,
not plaintext DNS on port 53. URL scheme determines transport:
https:// = DoH, bare IP = plain UDP. Falls back to Quad9 DoH when
system resolver cannot be detected.

- Upstream enum (Udp/Doh) with Display and PartialEq
- BytePacketBuffer::from_bytes constructor
- reqwest http2 feature for DoH server compatibility
- network_watch_loop guards against DoH→UDP silent downgrade
- 5 new tests (mock DoH server, HTTP errors, timeout)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: cargo fmt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add DoH to README — Why Numa, comparison table, roadmap

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 00:39:58 +02:00
Razvan Dimescu
e0c1997056 fix: regenerate TLS cert when services change (hot-reload via ArcSwap)
HTTPS proxy certs were generated once at startup. Services added at
runtime via API or LAN discovery got "not secure" in the browser
because their SAN wasn't in the cert. Now the cert is regenerated
on every service add/remove and swapped atomically via ArcSwap.
In-flight connections are unaffected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 16:14:06 +02:00
Razvan Dimescu
e7e5c173f2 dynamic banner width, hoist HTML escaper, cache CA, restore log path
- banner box width adapts to longest value (fixes overflow with long paths)
- hoist h() HTML escape function to script top, remove 3 local copies
- serve_ca: add Cache-Control: public, max-age=86400
- restore log path in dashboard footer alongside new config/data fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 12:29:18 +02:00
Razvan Dimescu
c6b35045d8 config visibility, PR review fixes, XSS hardening
Config visibility:
- startup banner shows config path, data dir, services path
- config search: ./numa.toml → ~/.config/numa/ → /usr/local/var/numa/
- /stats API exposes config_path and data_dir, dashboard footer renders them
- GET /ca.pem endpoint serves CA cert for cross-device TLS trust
- load_config returns ConfigLoad with found flag, warns on not-found
- ServerCtx stores PathBuf for config_dir/data_dir, string conversion at boundaries

PR review fixes:
- add explicit parens in resolve_route operator precedence (service_store.rs)
- hostname portability: drop -s flag, trim domain with split('.') (lan.rs)
- serve_ca uses spawn_blocking instead of sync fs::read in async handler
- load_config: remove TOCTOU exists() check, read directly and handle NotFound

XSS hardening:
- HTML-escape all user-controlled interpolations in dashboard (service names,
  route paths, ports, URLs, block check domain/reason)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 12:24:21 +02:00
Razvan Dimescu
10f1602803 address PR review: SRV port, drop spike, percent-encoded paths
- SRV record uses first service's port (was 0, confused dns-sd -L)
- Remove examples/mdns_coexist.rs (served its purpose as spike)
- Reject percent-encoding in route paths (defense-in-depth)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 11:21:09 +02:00
Razvan Dimescu
41a97bb930 dashboard: show LAN status in Local Services panel header
- Add lan_enabled to ServerCtx
- Add lan field to /stats API (enabled, peer count)
- Dashboard shows "LAN off" (dim) or "LAN on · N peers" (green)
- Tooltip shows enable command or mDNS service type

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 11:16:52 +02:00
Razvan Dimescu
4020776b8e simplify set_lan_enabled: fix config path, TOCTOU, double iteration
- Accept config path parameter (consistent with main's resolution)
- Read first, match on NotFound (eliminates TOCTOU race)
- Single position() call replaces any() + position()
- Precise key matching via split_once('=')
- Preserve original indentation on replacement
- Extract print_lan_status helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:59:35 +02:00
Razvan Dimescu
763ba1de91 add numa lan on/off CLI command, update README
- numa lan on/off toggles LAN discovery in numa.toml
- Writes [lan] section if missing, updates enabled if present
- Colored output with restart hint
- README: add lan on/off to help text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:30:22 +02:00
Razvan Dimescu
fb89b78226 dashboard: route CRUD, source-aware service controls, XSS fix
- Add inline route management (+ route / x) per service in dashboard
- Expose service source (config vs api) in API response
- Only show service delete button for API-created services
- Pre-fill route port with service target_port
- Fix XSS in route path onclick handlers
- Skip renderServices refresh while route form is open (editingRoute guard)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 08:58:31 +02:00
Razvan Dimescu
64c4d146ec add unit tests for route matching, config defaults, and service store
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 07:49:22 +02:00
Razvan Dimescu
9c290b6ef4 fmt: fix proxy.rs formatting for CI rustfmt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 07:13:58 +02:00
Razvan Dimescu
c836903db5 simplify: unify route structs, fix prefix collision, lint fixes
- Unify RouteConfig/RouteEntry/RouteResponse into single RouteEntry
- Fix prefix collision: /api no longer matches /apiary (segment boundary check)
- Add path traversal rejection in route API
- Extract MdnsAnnouncement struct (clippy type_complexity)
- cargo fmt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 06:57:57 +02:00
Razvan Dimescu
5e5a6544bc LAN opt-in, mDNS migration, security hardening, path-based routing
- LAN discovery disabled by default (opt-in via [lan] enabled = true)
- Replace custom JSON multicast (239.255.70.78:5390) with standard mDNS
  (_numa._tcp.local on 224.0.0.251:5353) using existing DNS parser
- Instance ID in TXT record for multi-instance self-filtering
- API and proxy bind to 127.0.0.1 by default (0.0.0.0 when LAN enabled)
- Path-based routing: longest prefix match with optional prefix stripping
  via [[services]] routes = [{path, port, strip?}]
- REST API: GET/POST/DELETE /services/{name}/routes
- Dashboard shows route lines per service when configured
- Segment-boundary route matching (prevents /api matching /apiary)
- Route path validation (rejects path traversal)

Closes #11

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 06:56:31 +02:00
Razvan Dimescu
4c58ff49b0 reduce network change detection to 5s with tiered polling
LAN IP checked every 5s (cheap UDP socket call). Full upstream
re-detection runs every 30s as safety net, or immediately when
LAN IP changes. Reduces worst-case network switch recovery from
30s to 5s.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 19:36:03 +02:00
Razvan Dimescu
5810ee5aac show upstream DNS in stats API and dashboard footer
Expose current upstream address in /stats response. Dashboard footer
now shows "Upstream: x.x.x.x:53" — updates live when the network
watcher swaps the upstream.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 11:04:54 +02:00
Razvan Dimescu
06850de728 fix circular reference: detect DHCP DNS when scutil shows loopback
When numa install is active, scutil --dns only returns 127.0.0.1.
Previously fell back to 9.9.9.9 (Quad9) which fails on networks
that block external DNS. Now reads DHCP-provided DNS from
ipconfig getpacket en0/en1 as intermediate fallback before Quad9.

Tested on a network that blocks 8.8.8.8, 9.9.9.9, 1.1.1.1 but
allows ISP DNS (213.154.124.25) — Numa now auto-detects and uses it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 10:24:54 +02:00
Razvan Dimescu
995916d01b generalize upstream re-detection into network change watcher
Always detect network changes (LAN IP, upstream, peers) regardless
of upstream config. LAN IP is now tracked in ServerCtx and updated
every 30s — multicast announcements use the current IP instead of
the startup IP. Upstream re-detection still only runs when
auto-detected. Peer flush triggers on any network change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 09:38:09 +02:00
Razvan Dimescu
7aca3b1991 fix DNS failure on network change with upstream re-detection
Upstream DNS was resolved once at startup and never updated. Switching
Wi-Fi networks made all queries fail until restart.

Now spawns a background task (every 30s) that re-runs system DNS
discovery and swaps the upstream atomically if it changed. Also flushes
stale LAN peers from the old network on change.

Only activates when upstream is auto-detected (not explicitly configured).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 09:31:49 +02:00
Razvan Dimescu
c333705a0e fix needless return in trust_ca for Windows clippy
On Windows, the not(macos/linux) cfg block is the only path, so
clippy flags the return as needless. Use expression form instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 08:29:28 +02:00
Razvan Dimescu
50d17ae118 fix Windows clippy errors and unreachable code
Gate version detection behind cfg(unix), fix unreachable Ok(()) after
return in trust_ca, use next_back() and is_some_and() per clippy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 08:23:25 +02:00
Razvan Dimescu
5495107c9e add Windows support (Phase 1)
Cross-platform paths: config_dir() uses %APPDATA%, data_dir() uses
%PROGRAMDATA% on Windows. TLS cert directory uses data_dir() instead
of hardcoded /usr/local/var/numa. Windows DNS discovery via ipconfig.
Fixed cfg gates from not(macos) to explicit linux to prevent Linux
code compiling on Windows. Added Windows target to CI and release
workflows with zip packaging.

System integration (numa install/service) not yet supported on Windows
— users run numa.exe manually.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 08:13:53 +02:00
Razvan Dimescu
9a3de2f231 add LAN accessibility indicator for services
Show whether each service is reachable from the network or bound to
localhost only. Dashboard displays green "LAN" or amber "local only"
badge next to each healthy service. Unified TCP check function,
concurrent health+LAN probes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 06:35:12 +02:00
Razvan Dimescu
6fdadd637c fix LAN discovery: instance-based self-filter and multicast port reuse
Replace IP-based self-announcement filtering with a per-process instance
ID (pid ^ timestamp) so multiple instances on the same host can discover
each other. Enable SO_REUSEPORT for multicast socket binding on Unix.
Add multicast address validation on configured group.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 00:20:33 +02:00
Razvan Dimescu
9041ccc2e1 fix rustfmt formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 16:54:03 +02:00
Razvan Dimescu
c9f1d98f45 add LAN service discovery via UDP multicast
Numa instances on the same network auto-discover each other's .numa
services. No config, no cloud — just multicast on 239.255.70.78:5390.

- PeerStore with lazy expiry (90s timeout, 30s broadcast interval)
- DNS resolves remote .numa services to peer's LAN IP (not localhost)
- Proxy forwards to peer IP for remote services
- Graceful degradation if multicast bind fails

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 16:45:46 +02:00
Razvan Dimescu
285778b646 add styled 404 page for unregistered .numa domains
Roman Stone themed 404 with Instrument Serif heading, JetBrains Mono
domain badge, brick pattern background, syntax-highlighted curl
example, and a delayed easter egg. Also updates dashboard link in
README to numa.numa.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 04:31:17 +02:00
Razvan Dimescu
7a64e7c4aa fix truncation check: use == instead of >= for buffer-full detection
recv_from can never return more bytes than the buffer size — the kernel
truncates silently. == is the correct heuristic for detecting truncation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 03:35:21 +02:00
Razvan Dimescu
2cb87bbe83 fix rustfmt formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 03:31:44 +02:00
Razvan Dimescu
4b60a4b49c launch hardening: TC bit, Dockerfile, platform-aware deploy
- Set TC (truncation) bit when response exceeds 4096-byte buffer
  instead of dropping the response silently. Clients can retry via TCP.
- Log when upstream response is truncated in forward.rs.
- Dockerfile: bump to Rust 1.88, include site/service files, use
  alpine runtime instead of scratch, add cmake/perl for aws-lc-sys.
- Makefile deploy: platform-aware — codesign on macOS, systemctl on Linux.
- README: trim roadmap to near-term items only.
- Verified: Docker build + smoke test passes on Linux (Alpine musl).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 03:31:15 +02:00
Razvan Dimescu
cfd9a562af fix CA removal: delete by SHA-1 hash, update README with TLS
security delete-certificate -c fails when multiple certs match.
Now finds all certs by hash and deletes each individually.
Also updated README with HTTPS, service persistence, and TLS mentions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 01:35:14 +02:00
Razvan Dimescu
5e7a653f9c fix rustfmt formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 01:15:51 +02:00
Razvan Dimescu
3bfcd827ac add TLS, service persistence, blocking panel, query types
- Local TLS: auto-generated CA + per-service certs (explicit SANs, not
  wildcards — browsers reject *.numa under single-label TLDs). HTTPS
  proxy on :443 via rustls/tokio-rustls. `numa install` trusts CA in
  macOS Keychain / Linux ca-certificates.
- Service persistence: user-added services saved to
  ~/.config/numa/services.json, survive restarts.
- Blocking panel: renamed "Check Domain" to "Blocking" with sources
  display, allowlist management UI, unpause button.
- Query types: recognize SOA, PTR, TXT, SRV, HTTPS (type 65) instead
  of logging as UNKNOWN.
- Blocklist gzip: reqwest now decompresses gzip responses from CDNs.
- Unified config_dir() in lib.rs for consistent path resolution under
  sudo and launchd. TLS certs use /usr/local/var/numa/ (writable as
  root daemon).
- Dashboard UX: panel subtitles differentiating overrides vs services,
  better placeholders, proxy route display, 600px query log height.
- Deploy: make deploy handles build+copy+codesign+restart cycle.
- Demo: scripts/record-demo.sh for recording hero GIF with CDP.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 01:15:07 +02:00
Razvan Dimescu
10502f2db2 fix service restart: add codesign and make deploy target
macOS kills unsigned binaries, so numa service restart failed after
copying a new build. Added ad-hoc codesign to restart flow and a
make deploy target that handles the full build-copy-sign-restart cycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 15:22:33 +02:00
Razvan Dimescu
754b064570 fix rustfmt formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 15:07:42 +02:00
Razvan Dimescu
8f959ce0a5 add local service proxy with .numa domains
HTTP reverse proxy on port 80 lets developers use clean domain names
(frontend.numa, api.numa) instead of localhost:PORT. Includes WebSocket
upgrade support for HMR, TCP health checks, dashboard UI panel, and
REST API for service management. numa.numa is preconfigured for the
dashboard itself.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 15:07:15 +02:00
Razvan Dimescu
14a9e9e7e3 add numa service restart command
Kills the running process and lets launchd/systemd respawn it
with the updated binary. DNS stays configured throughout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 14:26:56 +02:00
Razvan Dimescu
ee776938c5 add auto-detect upstream, install script, release workflow
- Default upstream auto-detected from system resolver (scutil/resolv.conf)
  instead of hardcoding Google 8.8.8.8. Falls back to Quad9 (9.9.9.9).
- Single scutil --dns pass for both upstream detection and forwarding rules
- Linux: reads backup resolv.conf if current only has loopback
- Service start/stop now couples DNS config (install on start, uninstall on stop)
- Install script for one-line binary install from GitHub Releases
- GitHub Actions release workflow: builds for macOS/Linux x86_64/aarch64

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 14:14:20 +02:00
Razvan Dimescu
5eec8915d4 fix FORMERR: filter UNKNOWN records and increase buffer to 4096
Root cause: upstream resolvers return EDNS OPT records (type 41) in
the additional section. Our parser reads them as UNKNOWN, but write()
silently skips them — creating a header that claims N additional records
but a body with 0, producing FORMERR on the client side.

Fix: filter out UNKNOWN records before serialization and adjust header
counts to match. Also increase BytePacketBuffer from 512 to 4096 bytes
to handle modern DNS responses with many records.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 14:11:46 +02:00
Razvan Dimescu
7e29f3cb57 add domain check endpoint and dashboard search box
GET /blocking/check/{domain} — returns whether a domain is blocked,
the reason (exact match, parent domain, allowlist, disabled), and
the matching rule. Dashboard sidebar has a "Check Domain" search
box with inline results and one-click allow button.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:58:25 +02:00
Razvan Dimescu
0658ed7310 add service management CLI, log path in dashboard footer
- numa service start/stop/status commands (launchd + systemd)
- Dashboard footer shows log paths (macOS + Linux) and GitHub link
- Help text updated with all service commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:41:16 +02:00
Razvan Dimescu
57c4742f09 harden Linux DNS config and fix review findings
- Detect systemd-resolved: use drop-in config instead of overwriting
  /etc/resolv.conf (which gets regenerated)
- Warn if /etc/resolv.conf is a symlink (NetworkManager, etc.)
- Fix TOCTOU: attempt copy/remove directly, handle NotFound
- Remove side-effect from backup_path_linux (no eager mkdir)
- Fix macOS $HOME fallback: /var/root instead of /tmp
- Log warnings on launchctl/systemctl failures instead of silencing
- Delete plist before unloading (prevents zombie restarts)
- Extract ensure_binary_installed helper on Linux

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:32:20 +02:00
Razvan Dimescu
4645df50e0 add Linux systemd service and DNS configuration
Linux:
- numa install: backs up /etc/resolv.conf, sets nameserver to 127.0.0.1
- numa uninstall: restores original /etc/resolv.conf from backup
- numa service start: installs systemd unit, enables + starts
- numa service stop: stops, disables, removes unit file
- numa service status: shows systemctl status

macOS: launchd plist (already working)

Both platforms: Restart=always / KeepAlive=true for crash recovery.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:24:03 +02:00
Razvan Dimescu
ae9edb3593 fix CI: gate macOS-only helpers behind cfg(target_os = macos)
Move HashMap, PathBuf, numa_data_dir, backup_path inside macOS
cfg blocks so Linux builds don't see unused imports/functions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 11:48:33 +02:00