Browsers heuristically cached the dashboard page because the response
carried no Cache-Control header, so a numa upgrade on the daemon did
not surface updated PATH_DEFS (e.g. the UPSTREAM row added in v0.14.0)
until the user hard-reloaded. Force revalidation on every load.
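A minimal sketch of the fix, assuming the dashboard is served from an
axum handler (handler body and asset path hypothetical):

```rust
use axum::{http::header, response::{Html, IntoResponse}};

// `no-cache` still lets browsers store the body, but forces a
// revalidation on every load, so a daemon upgrade surfaces immediately.
async fn dashboard() -> impl IntoResponse {
    (
        [(header::CACHE_CONTROL, "no-cache")],
        Html(include_str!("dashboard.html")), // path hypothetical
    )
}
```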
Closes #144.
docs/ is gitignored; references to docs/implementation/*.md from public
source, configs, and packaging were dead links outside the maintainer's
machine. Adds four recipes (README, dnsdist-front, doh-on-lan,
odoh-upstream) under top-level recipes/ and repoints existing pointers.
- numa.toml, packaging/client/{README.md,numa.toml}: point to
recipes/odoh-upstream.md.
- src/{bootstrap_resolver,forward,serve}.rs: reference issue #122
directly (module scope is broader than the ODoH-specific recipe).
- src/health.rs: drop the §-ref; iOS HealthInfo remains named as the
canonical consumer.
SOA records were stored as opaque bytes (DnsRecord::UNKNOWN), so the
RFC 1035 §3.3.13 MNAME/RNAME name-compression pointers — offsets into
the upstream packet — were re-emitted verbatim. Once Numa applied its
own compression to surrounding names, those pointers landed on garbage
and clients rejected the reply ("malformed reply packet" in kdig).
Parse SOA via read_qname and write via write_qname, matching the
NS/CNAME/MX pattern. Adds the canonical-rdata arm in dnssec.rs for
RRSIG verification. Regression test round-trips a CNAME-chain response
with a compressed SOA in authority through hickory-proto strict parse.
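The read side, sketched as a match arm in the file's dnsguide-style
API (`buffer` and the field grouping are assumptions; `read_qname` and
`DnsRecord` are from this change):

```rust
// domain/ttl are parsed earlier in DnsRecord::read. MNAME/RNAME go
// through read_qname, which resolves RFC 1035 compression pointers
// instead of copying them as opaque bytes.
QueryType::SOA => {
    let mut mname = String::new();
    buffer.read_qname(&mut mname)?;
    let mut rname = String::new();
    buffer.read_qname(&mut rname)?;
    let serial = buffer.read_u32()?;
    let refresh = buffer.read_u32()?;
    let retry = buffer.read_u32()?;
    let expire = buffer.read_u32()?;
    let minimum = buffer.read_u32()?;
    DnsRecord::SOA {
        domain, ttl, mname, rname,
        serial, refresh, retry, expire, minimum,
    }
}
```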
Hedging fires a second upstream query against the same upstream after
the hedge delay. Rescues packet loss and handshake stalls on flaky
links, but every lookup shows up twice at the provider — silently
halves the headroom for anyone on a quota'd upstream (NextDNS free tier,
Control D, paid Quad9).
Surfaced by #134 (bcookatpcsd), who saw every query duplicated on the
NextDNS dashboard with a single-address DoT upstream. Not a bug — the
feature doing what it says on the tin — but a surprising default.
Flipping the default to 0 makes hedging explicitly opt-in. Users who
want tail-latency rescue on flaky nets add `hedge_ms = 10` (or higher).
No config migration needed; no breaking changes to the API surface.
Also tightens the numa.toml comment so the trade-off is visible at
config time, not retroactively on a provider dashboard.
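A sketch of the default flip as it might sit in the config struct
(struct name and placement assumed; `hedge_ms` is the real key):

```rust
#[derive(serde::Deserialize)]
pub struct UpstreamConfig {
    /// Hedge delay in ms. 0 (the default) disables hedging; set
    /// e.g. 10 to trade provider-side duplicate queries for
    /// tail-latency rescue on flaky links.
    #[serde(default)] // u64::default() == 0, i.e. opt-in
    pub hedge_ms: u64,
}
```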
- Switch overrides from HashMap to BTreeMap — deterministic iteration by
type, drops the manual sort when logging.
- Rename the flat_map closure's inner `ips` to `addrs` to stop shadowing
the outer Vec<String>.
- Trim the Suite 8 TEST-NET-1 comment to keep the "why" and drop
mechanism narration.
- Drop a redundant sleep 1 after wait — wait already blocks on exit.
Suite 8 now ends with a config using RFC 5737 TEST-NET-1 IPs as
relay_ip/target_ip, started briefly so the bootstrap resolver logs its
override map. Asserts both host=IP pairs land in that map — closing the
gap flagged on PR #126 (zero-plain-DNS-leak for ODoH endpoints was only
unit-tested).
Also: NumaResolver::new now logs the override map at INFO when non-empty,
so operators can verify their ODoH bootstrap without needing DEBUG level.
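The log call, sketched (exact wording may differ):

```rust
// In NumaResolver::new: one INFO line is enough to verify the
// ODoH bootstrap without flipping on DEBUG.
if !overrides.is_empty() {
    log::info!("bootstrap DNS overrides active: {overrides:?}");
}
```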
When numa is its own system DNS resolver (HAOS add-on, Pi-hole-style
container, /etc/resolv.conf → 127.0.0.1), every numa-originated HTTPS
connection — DoH upstream, ODoH relay/target, blocklist CDN — routed
its hostname through getaddrinfo() back to numa itself. Cold boot
deadlocked; steady state taxed every new TCP connection. 0.14.1's
retry-with-backoff masked the startup race but not the underlying
self-loop.
NumaResolver implements reqwest::dns::Resolve with two lanes (sketched
after this list):
- Per-host overrides (ODoH relay_ip/target_ip) short-circuit DNS
entirely, preserving ODoH's zero-plain-DNS-leak property.
- Otherwise: A+AAAA in parallel via UDP to IP-literal bootstrap
servers, with TCP fallback for UDP-hostile networks.
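A sketch of the two lanes (the `reqwest::dns` trait shapes are real;
`NumaResolver` is assumed cheaply cloneable, and `overrides` /
`bootstrap_lookup` are assumed names):

```rust
use reqwest::dns::{Addrs, Name, Resolve, Resolving};
use std::net::SocketAddr;

impl Resolve for NumaResolver {
    fn resolve(&self, name: Name) -> Resolving {
        let this = self.clone();
        Box::pin(async move {
            // Lane 1: ODoH relay_ip/target_ip overrides never touch
            // DNS at all (zero-plain-DNS-leak preserved).
            if let Some(ip) = this.overrides.get(name.as_str()).copied() {
                let addrs: Addrs =
                    Box::new(std::iter::once(SocketAddr::new(ip, 0)));
                return Ok(addrs);
            }
            // Lane 2: parallel A+AAAA over UDP to IP-literal bootstrap
            // servers; TCP fallback lives inside bootstrap_lookup.
            let ips = this.bootstrap_lookup(name.as_str()).await?;
            let addrs: Addrs =
                Box::new(ips.into_iter().map(|ip| SocketAddr::new(ip, 0)));
            Ok(addrs) // reqwest substitutes the URL's port
        })
    }
}
```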
Bootstrap IPs come from upstream.fallback (IP-literal filtered,
hostnames skipped with a warning). Empty fallback yields the
hardcoded default [9.9.9.9, 1.1.1.1]; the chosen source is logged
at startup so the silent default is visible.
doh_keepalive_loop now fires its first tick immediately, and
keepalive_doh logs failures at WARN — bootstrap issues surface
within ~100ms of boot instead of on the first client query.
Distinct from UpstreamPool.fallback (client-query failover) which
stays untouched: client queries with no configured fallback still
SERVFAIL on primary failure rather than silently shadow-routing.
Reproducer: tests/docker/self-resolver-loop.sh. Before: 0 blocklist
domains, 3072ms SERVFAIL. After: 397k domains, 118ms NOERROR.
resolve_coalesced now takes leader_path: QueryPath and applies it to all
three upstream branches (Forwarded-rule, Recursive, Upstream), not just
Recursive. Fixes thundering-herd at boot when N concurrent HTTPS setups
each trigger independent forward queries for the same upstream hostname.
Derive both the flaky-server drop count and the zero-delay schedule
from RETRY_DELAYS_SECS.len() so the tests keep exercising their
intended invariants — "succeeds on final attempt" and "gives up after
all attempts fail" — if the production retry schedule ever changes.
Also: rename fail_first → drop_first_n to match drop(sock); swap the
giveup test's empty body for an "unreachable" sentinel so a regression
that accidentally served couldn't silently match Some("").
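How the derivation might look (RETRY_DELAYS_SECS and drop_first_n are
from this change; the server and fetch helpers are hypothetical):

```rust
use std::time::Duration;

#[tokio::test]
async fn succeeds_on_final_attempt() {
    // Both counts derive from the production schedule, so editing
    // RETRY_DELAYS_SECS can't silently hollow out this invariant.
    let retries = RETRY_DELAYS_SECS.len();
    let sock = flaky_server(retries); // drop_first_n = retries
    let delays = vec![Duration::ZERO; retries]; // zero-delay schedule
    assert!(fetch_list_with_delays(&sock, &delays).await.is_ok());
}
```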
On cold start, reqwest's getaddrinfo can race numa's own first-query
cold-path latency — resolver timeout fires before numa warms its
upstream DoH connection. Wrap each blocklist fetch in 3 retries with
2s/10s/30s backoff; by the second attempt, the upstream is warm and
subsequent getaddrinfos succeed in <100ms.
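The retry wrapper, as a sketch (helper name and exact shape assumed;
the 2s/10s/30s schedule is from this change):

```rust
use std::time::Duration;

async fn fetch_with_retry(
    url: &str,
    delays: &[Duration], // parameterized so unit tests can pass zeros
) -> reqwest::Result<String> {
    let mut last = None;
    // First attempt is immediate; each retry sleeps its backoff first.
    for (attempt, wait) in
        std::iter::once(&Duration::ZERO).chain(delays).enumerate()
    {
        tokio::time::sleep(*wait).await;
        match reqwest::get(url).await.and_then(|r| r.error_for_status()) {
            Ok(resp) => return resp.text().await,
            Err(e) => {
                log::warn!("blocklist fetch attempt {} failed: {e}", attempt + 1);
                last = Some(e);
            }
        }
    }
    Err(last.expect("at least one attempt"))
}
```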
Also: parallelize fetches across lists via join_all (different hosts,
no warming dependency), walk the full error source chain so reqwest
failures surface the underlying cause, and parameterize retry delays
for unit-test speed.
Plain host-string equality caught the copy-paste-same-URL footgun but
let `r.cloudflare.com` + `odoh.cloudflare.com` through — two subdomains
of the same operator collapse ODoH to ordinary DoH. Add a second layer:
compare registrable domains via the PSL (`psl` crate) after the exact-
host check. Fails open on IP literals and unparseable hosts; the exact-
host check still runs in those cases.
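The second layer, sketched (function name assumed; `psl::domain_str`
is the real crate API):

```rust
// Registrable-domain comparison: r.cloudflare.com and
// odoh.cloudflare.com both reduce to cloudflare.com.
fn same_operator(relay_host: &str, target_host: &str) -> bool {
    match (psl::domain_str(relay_host), psl::domain_str(target_host)) {
        (Some(a), Some(b)) => a.eq_ignore_ascii_case(b),
        // Fail open: IP literals / unparseable hosts fall back to
        // the exact-host check alone.
        _ => false,
    }
}
```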
- Hoist ODOH_CONTENT_TYPE to a single pub(crate) constant in odoh.rs;
relay.rs imports it instead of declaring its own.
- Generalize dashboard encryptionPct(data, encryptedKeys, allKeys)
so both Inbound Wire and Outbound Wire panels share the same math
instead of drifting independently.
- Extract RelayState::new() and build_app() helpers in relay.rs so
the test spawn_relay() and production run() wire the same router
+ body-limit layer. Prevents future middleware from landing in one
path but not the other.
All 344 lib tests pass; no behavior change.
- `numa relay [PORT] [BIND]` accepts an optional bind address (defaults
to 127.0.0.1, matching the Caddy reverse-proxy deployment shape).
Required for Docker, where the relay needs 0.0.0.0 inside the
container so Caddy can reach it across the bridge network.
- Dashboard now surfaces the upstream_transport dimension as an
"Outbound Wire" panel alongside the existing "Inbound Wire" (renamed
from "Transport" for directional clarity). Sub-headers — "apps → numa"
/ "numa → internet" — make the threat-model split obvious without
jargon. Bars: UDP/DoH/DoT/ODoH, headline "X% encrypted outbound".
The PR description promised that the dashboard honestly answers "how
much of my DNS traffic left in cleartext"; that promise now holds.
Two issues surfaced from running mode = "odoh" against the live Hetzner
relay as system DNS:
1. **Bootstrap deadlock.** The reqwest HTTPS client resolves the relay
and target hostnames via system DNS. When numa is itself the system
resolver, the ODoH client loops trying to resolve through itself.
Adds optional `relay_ip` and `target_ip` to `[upstream]`, plumbed
into reqwest's `resolve()` (sketched after this list) so the HTTPS
client bypasses system DNS for those two hostnames. TLS still
validates against the URL
hostname, so a stale IP fails loudly rather than silently MITM'ing.
2. **2x relay load.** Default `hedge_ms = 10` triggers a duplicate
in-flight query for every request. Useful for UDP/DoH/DoT (rescues
tail latency cheaply); wasteful for ODoH (doubles HPKE seal/unseal,
doubles sealed-byte footprint a passive observer can correlate, no
latency win — relay hop dominates either way). Force-zero in
oblivious mode regardless of configured hedge_ms.
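For item 1, a sketch of the plumbing (`cfg`, `relay_host`, and
`target_host` are hypothetical names; port 443 assumed from the https
URLs; `ClientBuilder::resolve` is the real reqwest API):

```rust
use std::net::SocketAddr;

// Pin the two ODoH hostnames to configured IPs. Certificate
// validation still runs against the URL hostname, so a stale IP
// fails loudly instead of being silently MITM'able.
let mut builder = reqwest::Client::builder();
if let Some(ip) = cfg.relay_ip {
    builder = builder.resolve(&relay_host, SocketAddr::new(ip, 443));
}
if let Some(ip) = cfg.target_ip {
    builder = builder.resolve(&target_host, SocketAddr::new(ip, 443));
}
let client = builder.build()?;
```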
Validated end-to-end against odoh-relay.numa.rs → Cloudflare:
3 digs produced 3 forwarded_ok on the relay (was 6 before the hedge
fix), upstream_transport.odoh ticks correctly.
Client (mode = "odoh"): URL-query target routing per RFC 9230 §5,
/.well-known/odohconfigs TTL cache with 60s backoff on failure, HPKE
seal/open via odoh-rs, strict-mode default that SERVFAILs on relay
failure instead of silently downgrading. Host-equality config
validation rejects same-operator relay/target pairs.
Relay (`numa relay [PORT]`): axum server with /relay + /health.
SSRF-hardened hostname validator (RFC 1035 ASCII + dot + dash),
4 KiB body cap at the axum layer, 5s full-transaction timeout, and
static 502 on target failure (reqwest internals logged, not leaked).
Aggregate counters only — no per-request logs.
Observability: new `UpstreamTransport { Udp, Doh, Dot, Odoh }`
orthogonal to `QueryPath`, so /stats can tally wire protocols
symmetrically. Recursive mode records `Some(Udp)` for honest
"bytes egressing in cleartext" accounting.
Tests: Suite 8 exercises the client end-to-end via Frank Denis's
public relay + Cloudflare target; Suite 9 exercises `numa relay`
forwarding + guards against Cloudflare as the real far end. Full
probe script at tests/probe-odoh-ecosystem.sh verifies the entire
public ODoH ecosystem (4 targets + 1 relay per DNSCrypt's curated
list — confirms deploying Numa's relay doubles the global relay supply).
Adding a record type used to require 5 edits across the file (enum
variant, to_num, from_num, as_str, parse_str). The macro takes a
single (variant, num, str) tuple per type and generates the enum
plus all four methods.
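A sketch of the table-driven shape (macro name assumed; the real table
lists every variant, and as_str/parse_str are generated the same way
as to_num/from_num):

```rust
macro_rules! query_types {
    ($(($variant:ident, $num:literal, $name:literal)),* $(,)?) => {
        #[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
        pub enum QueryType {
            UNKNOWN(u16), // hand-coded: carries data, can't sit in the table
            $($variant,)*
        }

        impl QueryType {
            pub fn to_num(&self) -> u16 {
                match self {
                    QueryType::UNKNOWN(n) => *n,
                    $(QueryType::$variant => $num,)*
                }
            }

            pub fn from_num(num: u16) -> QueryType {
                match num {
                    $($num => QueryType::$variant,)*
                    _ => QueryType::UNKNOWN(num),
                }
            }
        }
    };
}

query_types! {
    (A, 1, "A"),
    (NS, 2, "NS"),
    (SOA, 6, "SOA"),
    (MX, 15, "MX"),
    // rest of the table elided
}
```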
UNKNOWN(u16) stays hand-coded since it carries data and can't sit
in the table.
src/question.rs: 156 lines -> 92 lines, no behavior change.
Logs were printing UNKNOWN(64), UNKNOWN(29), UNKNOWN(35) for SVCB,
LOC, and NAPTR — three RR types that have been registered for years
and show up in the wild (notably SVCB via RFC 9462 DDR clients
querying _dns.resolver.arpa).
Adds the variants and replaces the SVCB_QTYPE u16 const introduced
in #119 with QueryType::SVCB.to_num(), matching the HTTPS path.
Closes #114.
HTTPS (65) and SVCB (64) share the RDATA wire format, so the existing
parser already handles both — only the call site was HTTPS-only. Widen
the qtype check and extend the existing pipeline test with a second
query for SVCB.
- Drop `const HTTPS_TYPE: u16 = 65;` in favor of `QueryType::HTTPS.to_num()`
at the single call site — avoids a fresh magic number alongside the
existing enum mapping in question.rs.
- Add `DnsPacket::for_each_record_mut` so `strip_https_ipv6_hints` stops
hand-rolling the answers/authorities/resources walk; future section
rewrites go through the same helper (see the sketch after this list).
- Promote the SVCB test-rdata builder from `svcb::tests` to module scope
as `pub(crate) #[cfg(test)] fn build_rdata`, and reuse it in the two
pipeline tests in ctx.rs — kills ~20 lines of byte-fiddling and keeps
one RDATA-construction code path.
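The helper, sketched against the dnsguide-style packet layout (section
names are from the walk it replaces):

```rust
impl DnsPacket {
    /// Visit every record in answers, authorities, and resources.
    pub fn for_each_record_mut(&mut self, mut f: impl FnMut(&mut DnsRecord)) {
        for rec in self
            .answers
            .iter_mut()
            .chain(self.authorities.iter_mut())
            .chain(self.resources.iter_mut())
        {
            f(rec);
        }
    }
}
```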
Modifying HTTPS rdata invalidates any accompanying RRSIG, so a DNSSEC-
validating downstream would reject the response as Bogus. Gate the
strip on !client_do, matching the existing DNSSEC-records strip.
Adds a regression test that catches the gate being removed: builds a
query with EDNS DO=1, asserts the HTTPS rdata round-trips untouched.
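The gate itself, sketched (`filter_aaaa`, `client_do`, and
`strip_https_ipv6_hints` are from these commits; `packet` assumed):

```rust
// Only rewrite rdata when the client didn't set DO; otherwise the
// RRSIG must stay byte-identical or validation goes Bogus.
if filter_aaaa && !client_do {
    strip_https_ipv6_hints(&mut packet);
}
```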
When enabled, AAAA queries short-circuit to NODATA (NOERROR + empty
answer) so Happy Eyeballs clients don't stall waiting on a v6 address
they can't use. Also strips `ipv6hint` SvcParam from HTTPS/SVCB
answers (RFC 9460) so Chrome ≥103, Firefox, and Safari don't bypass
the AAAA filter via the HTTPS record path.
Local data is preserved: overrides, zones, the .numa proxy, and the
blocklist sinkhole keep whatever v6 addresses they configure — the
filter only kicks in on the cache/forward/recursive path. NODATA is
correct per RFC 2308 here; NXDOMAIN would incorrectly imply the name
doesn't exist for A queries either.
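The short-circuit, sketched in the file's dnsguide-style types (names
assumed):

```rust
// NODATA per RFC 2308: NOERROR with an empty answer section, i.e.
// "the name exists, there's just no AAAA for it".
if filter_aaaa && question.qtype == QueryType::AAAA {
    let mut resp = DnsPacket::new();
    resp.header.rescode = ResultCode::NOERROR;
    resp.questions.push(question.clone());
    return Ok(resp);
}
```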
Off by default. Opt in via `filter_aaaa = true` under `[server]`.
Re-install failed with ETXTBSY (Text file busy) because std::fs::copy
can't overwrite a binary that's currently being executed by the
running service. Switch to copy-then-rename: write the new binary to
/usr/local/bin/numa.new, then rename over /usr/local/bin/numa. Rename
swaps the path while the running process keeps the old inode alive,
so DNS keeps serving from the previous binary until restart.
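The swap, sketched (paths are from this change; `new_binary` is a
hypothetical variable):

```rust
use std::fs;

// A fresh destination path never hits ETXTBSY, and rename() swaps
// the directory entry atomically while the running service keeps
// the old inode open.
fs::copy(&new_binary, "/usr/local/bin/numa.new")?;
fs::rename("/usr/local/bin/numa.new", "/usr/local/bin/numa")?;
```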
Switch `systemctl start` to `systemctl restart` so the new binary
actually loads on
re-install (start is a no-op when the unit is already active, which
would silently leave the old binary running).
Locally verified the full CI sequence: install → curl → reinstall →
curl → uninstall → curl-fails. All three assertions pass.
Linux install_service_linux now does the {{exe_path}} substitution
inline because it uses the (potentially copied) binary path returned
by install_service_binary_linux, not current_exe(). The shared
replace_exe_path helper is dead on Linux — clippy -D warnings caught it.
Narrow the function to macos and split the placeholder test: keep the
"both templates contain {{exe_path}}" assertion as a cross-platform test
(catches placeholder removal on either file), keep the substitution test
gated to macos where the function lives.
The transient account DynamicUser=yes allocates can only traverse
world-executable directories.
The CI binary at /home/runner/work/numa/numa/target/release/numa fails
exec with EACCES because /home/runner is mode 0700; same applies to a
build under /home/<user>/, ~/.cargo/bin, or any private $HOME tree.
install_service_binary_linux now walks the binary's path. If every
ancestor grants world-execute (Linuxbrew /home/linuxbrew is 0755,
/usr/local/bin is fine, install.sh layout works), keep the source
path so brew/distro upgrades propagate in place. Otherwise copy to
/usr/local/bin/numa and reference that in the unit.
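A sketch of the ancestor walk (helper name hypothetical; o+x is the
bit DynamicUser traversal needs):

```rust
use std::os::unix::fs::MetadataExt;
use std::path::Path;

fn world_traversable(binary: &Path) -> std::io::Result<bool> {
    for dir in binary.ancestors().skip(1) {
        if dir.as_os_str().is_empty() {
            break; // relative path ran out of components
        }
        if std::fs::metadata(dir)?.mode() & 0o001 == 0 {
            return Ok(false); // e.g. /home/runner (0700): copy instead
        }
    }
    Ok(true) // e.g. /home/linuxbrew (0755): keep the source path
}
```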
Locally verified both branches in an Ubuntu 24.04 systemd container:
- CI-like /home/runner (0700) → copies + service binds 5380
- Brew-like /home/linuxbrew (0755) → keeps source path + service binds 5380
AUR installs never call `numa install` — PKGBUILD drops the unit straight
into /usr/lib/systemd/system and the user runs `systemctl enable numa`.
With User=numa the Rust installer's useradd code never fires there,
breaking Arch out of the box.
DynamicUser=yes sidesteps packaging entirely — systemd allocates a
transient UID per start and remaps StateDirectory ownership (including
legacy root-owned trees) automatically. Works on any modern systemd.
Drops the ensure_numa_user_linux/chown helpers plus NUMA_USER; the
unit file alone now captures the privilege-drop story.
- numa.service: User=numa + CAP_NET_BIND_SERVICE + sandboxing block
(ProtectSystem=strict, PrivateTmp, seccomp @system-service, etc)
- install_service_linux: create numa system user + chown data_dir
before first start so TLS-cert generation and state writes land
on a numa-owned tree
Runtime verified root-free on Linux — network_watch_loop only reads
/etc/resolv.conf; all system-DNS mutation stays in the installer,
which continues to run as root via sudo.
The SRTT ordering + failure penalty path was UDP-only, so a DoT primary
in a forwarding-rule pool was never deprioritized on failure and all
DoT entries tied at INITIAL_SRTT_MS in the sort key. With [[forwarding]]
now accepting arrays of upstreams, DoT pools are a first-class case and
need the same healthiest-first behavior the default pool gets for UDP.
- Add Upstream::tracked_ip() → Some(ip) for Udp/Dot, None for Doh
(DoH has no stable IP — reqwest pools connections by hostname); see
the sketch after this list.
- Rewire the three SRTT call sites in forward_with_failover_raw.
- Hoist srtt.read() out of the candidate-scoring loop — one lock per
query instead of N (matters now that pools commonly have N>1).
- Drop unused #[derive(Debug)] on UpstreamPool and ForwardingRule.
- Regression tests: udp_failure_records_in_srtt + dot_failure_records_in_srtt.
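The split, sketched (variant shapes assumed):

```rust
use std::net::IpAddr;

impl Upstream {
    // Only transports with a stable socket address feed the SRTT table.
    fn tracked_ip(&self) -> Option<IpAddr> {
        match self {
            Upstream::Udp { addr, .. } => Some(addr.ip()),
            Upstream::Dot { addr, .. } => Some(addr.ip()),
            // reqwest pools DoH connections by hostname; no stable IP.
            Upstream::Doh { .. } => None,
        }
    }
}
```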
Resolves src/main.rs conflict: serve loop was extracted into
src/serve.rs on main (PR #107). Ported the forwarding-rule log change
to serve.rs — fwd.upstream is now Vec<String>, logged with join(", ").
Extract parse_sc_registered and parse_sc_state as testable pure
functions. 8 new tests covering: service registration detection,
service state parsing, and Windows config_dir == data_dir invariant.
Stop the running service before disabling Dnscache so the port 53 probe
sees the real state (not Numa's own binding). Wait for SCM STOPPED
state before copying the binary to avoid os error 32 (file in use).
config_dir() on Windows now returns data_dir() (ProgramData) so config,
services.json, and log file are in the same place for both interactive
and service contexts. Service mode writes logs to numa.log via
env_logger pipe. Dashboard shows correct log path per OS.
Probe port 53 after disabling Dnscache instead of assuming reboot is
needed. Skip DNS redirect when port is blocked (service does it on
first boot). Fix readiness probe: TCP connect to API port instead of
broken UDP send_to that always succeeded.
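The fixed probe, sketched (port plumbing and timeout value assumed):

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

// UDP send_to succeeds with no listener, so it never proved anything.
// A TCP connect only succeeds once numa is actually accepting.
fn api_ready(port: u16) -> bool {
    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    TcpStream::connect_timeout(&addr, Duration::from_millis(250)).is_ok()
}
```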
service start/stop/restart/status now map to proper SCM operations
instead of re-running the full install/uninstall flow. On re-install,
stop the running service first so the binary can be overwritten.
Adds a build.rs that runs `git describe --tags --always --dirty` and
sets NUMA_BUILD_VERSION at compile time. A new `numa::version()` helper
returns the build version, falling back to CARGO_PKG_VERSION when git
is unavailable (source tarballs, Docker builds without .git).
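A sketch of that build.rs (env-var name and git args are from this
change; the real script also massages describe output into the
0.13.1+hash form):

```rust
use std::process::Command;

fn main() {
    // git describe fails in source tarballs and .git-less Docker
    // builds; version() then falls back to CARGO_PKG_VERSION.
    let desc = Command::new("git")
        .args(["describe", "--tags", "--always", "--dirty"])
        .output()
        .ok()
        .filter(|out| out.status.success())
        .and_then(|out| String::from_utf8(out.stdout).ok());
    if let Some(desc) = desc {
        println!("cargo:rustc-env=NUMA_BUILD_VERSION={}", desc.trim());
    }
    println!("cargo:rerun-if-changed=.git/HEAD");
}
```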
Version strings:
tagged release: 0.13.1
commits ahead: 0.13.1+a87f907
uncommitted changes: 0.13.1+a87f907-dirty
no git: 0.13.1
Replaces all 6 inline env!("CARGO_PKG_VERSION") call sites with the
single version() function.
Closes #108.
- Add `version` field to /stats (from CARGO_PKG_VERSION).
- Show `v0.13.1` next to the Numa wordmark in the dashboard header.
- Restructure the footer into two semantic rows:
Row 1 (paths): Config · Data · Logs (platform-detected)
Row 2 (runtime): Upstream · DNSSEC · SRTT · GitHub
- Drop Mode from the footer (redundant with Upstream label).
- Show only the matching-platform log path instead of both
macOS and Linux unconditionally.
- Drop the duplicate WINDOWS_SERVICE_NAME constant; call sites use the
single source of truth at windows_service::SERVICE_NAME.
- windows_service_exe_path and service_config_path now compose from
crate::data_dir() instead of re-parsing %PROGRAMDATA% locally.
- Factor the 6× sc.exe invocation boilerplate into a run_sc helper.
- Replace the 200ms try_recv polling loop in the service dispatcher
with a recv_timeout wait (sketched after this list) — cuts shutdown
latency and idle CPU.
- stop_service_scm/delete_service_scm now log warnings instead of
silently swallowing failures, so unexpected errors are visible.
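The wait, sketched (channel shape assumed):

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

// A blocking wait replaces the 200ms try_recv spin: Stop wakes it
// immediately, timeouts just loop back around.
fn wait_for_stop(rx: &Receiver<()>) {
    loop {
        match rx.recv_timeout(Duration::from_secs(1)) {
            Ok(()) | Err(RecvTimeoutError::Disconnected) => return,
            Err(RecvTimeoutError::Timeout) => continue,
        }
    }
}
```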
Hooks the service-dispatcher scaffolding from the previous commit to
actually serve DNS, and replaces the HKLM\…\Run login-time autostart
with a proper Windows service created via sc.exe.
**Refactor**
- Extract main.rs's inline server body (~500 lines) into `numa::serve::run`
so both the interactive CLI entry and the service dispatcher drive the
same startup/serve loop. main.rs is now a thin subcommand router.
- main.rs goes sync (no #[tokio::main]); each branch that needs async
builds its own runtime and block_on's (sketch after this list).
Required so the --service path can hand off to SCM without fighting
tokio for the entry thread.
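```rust
// Sketch of the sync entry shape: every name here is hypothetical
// except numa::serve::run.
fn main() {
    match parse_subcommand() {
        // The --service path hands the entry thread to SCM; the
        // dispatcher builds its own runtime on a dedicated thread.
        Subcommand::Service => numa::windows_service::run(),
        other => {
            let rt = tokio::runtime::Builder::new_multi_thread()
                .enable_all()
                .build()
                .expect("tokio runtime");
            rt.block_on(run_subcommand(other)); // async branches block_on
        }
    }
}
```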
**Windows service wrapper**
- `numa::windows_service::run_service` now builds a multi-thread tokio
runtime on a dedicated thread and runs `serve::run` inside it. Stop/
Shutdown from SCM aborts the wait loop and reports SERVICE_STOPPED.
- Config path resolves to `%PROGRAMDATA%\numa\numa.toml` when running
under SCM (SYSTEM's cwd is System32, relative paths don't work).
**Install/uninstall**
- `install_windows` now copies numa.exe to a stable
`%PROGRAMDATA%\numa\bin\numa.exe` and registers it via `sc create`
with start=auto, obj=LocalSystem, and a failure policy of
restart/5000/restart/5000/restart/10000. Starts the service
immediately when no reboot is pending.
- `uninstall_windows` stops + deletes the service and removes the
binary copy before restoring DNS.
- Drops the old `register_autostart` / `remove_autostart` helpers that
wrote to `HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run` — that
path runs at user login in the user's session with no stderr capture
and no crash-restart policy, which is why we've been flying blind in
every Windows debug session.
DNS-set bugs (netsh destructive static, IPv6 not touched, uninstall
secondary-drop) and file logging are orthogonal — tracked for follow-up.
Lets numa.exe act as a real Windows service registered with the SCM,
replacing the HKLM\...\Run login-time autostart that runs in the user
session without stderr capture.
- New `numa::windows_service` module (cfg(windows)) wraps Mullvad's
`windows-service` crate: registers with SCM, reports Running, handles
Stop/Shutdown, reports Stopped.
- `numa.exe --service` is the entry point SCM uses
(`sc create … binPath="numa.exe --service"`); interactive invocations
are unchanged.
- Dep is gated `[target.'cfg(windows)'.dependencies]` — zero impact on
macOS/Linux builds or binary size.
Scaffold only. The service currently blocks on an mpsc channel until
Stop arrives; the actual serve loop will hook in once main.rs's inline
server body is extracted into `numa::serve(config_path)` in a follow-up.
This lets `sc start Numa` / `sc stop Numa` be verified end to end today.
Resolves conflict in src/ctx.rs — both sides added independent tokio
tests (forwarding fail-over on this branch, default-pool upstream path
on main from #103). Keep both.
Mirrors `[upstream] address` — `upstream` accepts string or array
of strings, builds an `UpstreamPool` and routes queries through
`forward_with_failover_raw` so SRTT ordering and failover apply to
matched `[[forwarding]]` rules the same way they do for the default
pool.
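One common serde shape for string-or-array (a sketch; the actual impl
may differ, and the non-`upstream` field name is assumed):

```rust
#[derive(serde::Deserialize)]
#[serde(untagged)]
enum OneOrMany {
    One(String),
    Many(Vec<String>), // empty array is rejected at config load
}

#[derive(serde::Deserialize)]
struct ForwardingRule {
    domain: String,       // field name assumed
    upstream: OneOrMany,  // normalized into an UpstreamPool afterwards
}
```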
Single-string rules keep their current behavior (one-element pool,
equivalent single-upstream path). Empty array errors at config load.
Addresses item 1 of issue #102. Plan: docs/102_item1.md.