Hooks the service-dispatcher scaffolding from the previous commit to
actually serve DNS, and replaces the HKLM\…\Run login-time autostart
with a proper Windows service created via sc.exe.
**Refactor**
- Extract main.rs's inline server body (~500 lines) into `numa::serve::run`
so both the interactive CLI entry and the service dispatcher drive the
same startup/serve loop. main.rs is now a thin subcommand router.
- main.rs goes sync (no #[tokio::main]); each branch that needs async
builds its own runtime and block_on's. Required so the --service path
can hand off to SCM without fighting tokio for the entry thread.
**Windows service wrapper**
- `numa::windows_service::run_service` now builds a multi-thread tokio
runtime on a dedicated thread and runs `serve::run` inside it. Stop/
Shutdown from SCM aborts the wait loop and reports SERVICE_STOPPED.
- Config path resolves to `%PROGRAMDATA%\numa\numa.toml` when running
under SCM (SYSTEM's cwd is System32, relative paths don't work).
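A minimal sketch of the path choice described above. `config_path`, its parameters, and the `C:\ProgramData` fallback are hypothetical illustrations, not the actual code in `numa::windows_service`:

```rust
use std::path::PathBuf;

/// Hypothetical helper: choose the config file location.
/// `program_data` stands in for the value of %PROGRAMDATA%.
fn config_path(under_scm: bool, program_data: Option<&str>, cli_path: &str) -> PathBuf {
    if under_scm {
        // SYSTEM's cwd is System32, so build the path from a stable base
        // instead of trusting a relative path. The fallback is an assumption.
        let base = program_data.unwrap_or(r"C:\ProgramData");
        PathBuf::from(base).join("numa").join("numa.toml")
    } else {
        // Interactive runs keep whatever path the user passed.
        PathBuf::from(cli_path)
    }
}
```

Passing the environment value in as a parameter keeps the function testable without touching the process environment.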
**Install/uninstall**
- `install_windows` now copies numa.exe to a stable
`%PROGRAMDATA%\numa\bin\numa.exe` and registers it via `sc create`
with start=auto, obj=LocalSystem, and a failure policy of
restart/5000/restart/5000/restart/10000. Starts the service
immediately when no reboot is pending.
- `uninstall_windows` stops + deletes the service and removes the
binary copy before restoring DNS.
- Drops the old `register_autostart` / `remove_autostart` helpers that
wrote to `HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run` — that
path runs at user login in the user's session with no stderr capture
and no crash-restart policy, which is why we've been flying blind in
every Windows debug session.
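For reference, a sketch of how the two `sc.exe` invocations could be assembled as argument vectors. The helper names and the `reset=` window are assumptions; only the start type, account, and restart delays come from this commit:

```rust
/// Hypothetical helper: arguments for `sc.exe create`.
/// sc.exe expects each option name (ending in `=`) and its value as separate arguments.
fn sc_create_args(service: &str, exe_path: &str) -> Vec<String> {
    vec![
        "create".to_string(),
        service.to_string(),
        "binPath=".to_string(),
        // A path containing spaces would need inner quotes; omitted in this sketch.
        format!("{exe_path} --service"),
        "start=".to_string(),
        "auto".to_string(),
        "obj=".to_string(),
        "LocalSystem".to_string(),
    ]
}

/// Hypothetical helper: arguments for `sc.exe failure`.
/// `reset_secs` (failure-counter reset window) is an assumed value, not from the commit.
fn sc_failure_args(service: &str, reset_secs: u32) -> Vec<String> {
    // Restart after 5 s, 5 s, then 10 s.
    let actions = [5000u32, 5000, 10000]
        .iter()
        .map(|ms| format!("restart/{ms}"))
        .collect::<Vec<_>>()
        .join("/");
    vec![
        "failure".to_string(),
        service.to_string(),
        "reset=".to_string(),
        reset_secs.to_string(),
        "actions=".to_string(),
        actions,
    ]
}
```

Each vector can be handed to `std::process::Command::new("sc.exe").args(...)` as-is.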
DNS-set bugs (netsh destructive static, IPv6 not touched, uninstall
secondary-drop) and file logging are orthogonal — tracked for follow-up.
Lets numa.exe act as a real Windows service registered with the SCM,
replacing the HKLM\...\Run login-time autostart that runs in the user
session without stderr capture.
- New `numa::windows_service` module (cfg(windows)) wraps Mullvad's
`windows-service` crate: registers with SCM, reports Running, handles
Stop/Shutdown, reports Stopped.
- `numa.exe --service` is the entry point SCM uses
(`sc create … binPath="numa.exe --service"`); interactive invocations
are unchanged.
- Dep is gated `[target.'cfg(windows)'.dependencies]` — zero impact on
macOS/Linux builds or binary size.
Scaffold only. The service currently blocks on an mpsc channel until
Stop arrives; the actual serve loop will hook in once main.rs's inline
server body is extracted into `numa::serve(config_path)` in a follow-up.
This lets `sc start Numa` / `sc stop Numa` be verified end to end today.
Resolves conflict in src/ctx.rs — both sides added independent tokio
tests (forwarding fail-over on this branch, default-pool upstream path
on main from #103). Keep both.
Patches RUSTSEC-2026-0098 (URI name constraints incorrectly accepted)
and RUSTSEC-2026-0099 (wildcard cert name constraints), both published
2026-04-14. Transitive via reqwest / rustls / hickory / quinn.
Mirrors `[upstream] address` — `upstream` accepts string or array
of strings, builds an `UpstreamPool` and routes queries through
`forward_with_failover_raw` so SRTT ordering and failover apply to
matched `[[forwarding]]` rules the same way they do for the default
pool.
Single-string rules keep their current behavior (one-element pool,
equivalent single-upstream path). Empty array errors at config load.
Addresses item 1 of issue #102. Plan: docs/102_item1.md.
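Sketch of the string-or-array normalization, without the serde plumbing. `UpstreamSpec` and `pool_addresses` are hypothetical names used only for illustration; the real config type and error message may differ:

```rust
/// Hypothetical stand-in for the parsed `upstream` value of a [[forwarding]] rule.
#[derive(Debug, Clone, PartialEq)]
enum UpstreamSpec {
    One(String),
    Many(Vec<String>),
}

/// Normalize either form into the address list a pool would be built from.
fn pool_addresses(spec: UpstreamSpec) -> Result<Vec<String>, String> {
    match spec {
        // A single string becomes a one-element pool.
        UpstreamSpec::One(addr) => Ok(vec![addr]),
        // An empty array is rejected at config load.
        UpstreamSpec::Many(addrs) if addrs.is_empty() => {
            Err("forwarding rule `upstream` must not be an empty array".to_string())
        }
        UpstreamSpec::Many(addrs) => Ok(addrs),
    }
}
```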
Queries matching a [[forwarding]] suffix rule now log as FORWARD;
queries resolved via the default [upstream] pool log as UPSTREAM.
Previously both paths shared the FORWARD label, making it impossible
to tell from logs whether a rule matched.
Adds QueryPath::Upstream, a queries.upstream stats counter exposed
via /stats, plus a matching dashboard filter, bar, and path tag.
Closes part of #102.
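A rough sketch of the label split. The enum here has only the two variants relevant to this commit, and `matches_suffix` / `path_for` are hypothetical helpers, not the real matching code:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QueryPath {
    Forwarded,
    Upstream,
}

impl QueryPath {
    /// Label written to the query log.
    fn label(self) -> &'static str {
        match self {
            QueryPath::Forwarded => "FORWARD",
            QueryPath::Upstream => "UPSTREAM",
        }
    }
}

/// Hypothetical suffix match on label boundaries, case-insensitive.
fn matches_suffix(qname: &str, suffix: &str) -> bool {
    let q = qname.trim_end_matches('.').to_ascii_lowercase();
    let s = suffix.trim_matches('.').to_ascii_lowercase();
    q == s || q.ends_with(&format!(".{s}"))
}

/// A query that matches any rule suffix is Forwarded; everything else
/// that reaches the network goes through the default pool.
fn path_for(qname: &str, rule_suffixes: &[&str]) -> QueryPath {
    if rule_suffixes.iter().any(|s| matches_suffix(qname, s)) {
        QueryPath::Forwarded
    } else {
        QueryPath::Upstream
    }
}
```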
Config-level forwarding rules were parsed with the UDP-only
`parse_upstream_addr` helper, silently rejecting the DoT/DoH schemes
that the rest of the forwarding pipeline already supports.
Widen `ForwardingRule.upstream` from `SocketAddr` to `Upstream` so
config rules reuse the same parser as `[upstream].address` and
`fallback`. Demote `parse_upstream_addr` to `pub(crate)` to prevent
the same mistake recurring.
Closes #100.
Fixes #97: on minimal Arch installs, rustc fails with
"error while loading shared libraries: libLLVM.so" because
llvm-libs isn't pulled in transitively.
- Open with shared reqwest pain, not the tool name
- Switch "we" to "I" for personal voice (playbook: solo dev > corporate)
- Replace Unbound feature-gap excuses with what I'm exploring next
(persistent SRTT, aggressive NSEC, adaptive hedge delays)
- Add context line linking hero cards to the recursive section
New post on reqwest HTTP/2 window tuning and request hedging
(Dean & Barroso's "The Tail at Scale" applied to DNS forwarding).
Covers DoH forwarding p99 improvement and cold recursive
resolution from 2.3s to 538ms.
Also adds blog build infrastructure: index generation script,
draft preview server, hero metrics/before-after CSS, and
normalizes date format across existing posts.
Test each pipeline stage in isolation through resolve_query:
- override takes precedence over all other paths
- localhost and *.localhost resolve to loopback
- local zone returns configured records
- .tld proxy resolves registered services to loopback
- blocklist sinkholes to 0.0.0.0
- cache hit returns stored response without upstream
resolve_query now returns (BytePacketBuffer, QueryPath) so callers
and tests can inspect the resolution path without reading the query
log. Production call sites (UDP, DoT, DoH) destructure and ignore it.
The forwarding test now uses a mock UDP upstream that replies with a
canned response, asserting QueryPath::Forwarded instead of != Local.
Explicit [[forwarding]] rules now take precedence over the RFC 6303
special-use domain intercept. Previously, PTR queries for private
ranges (e.g. 168.192.in-addr.arpa) always returned local NXDOMAIN
even when a forwarding rule pointed them at a corporate DNS server.
Add full-pipeline resolve_query test harness (test_ctx + resolve_in_test)
and two tests covering both the default behavior and the override.
Closes #94
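The new ordering, sketched under assumptions: `Decision`, `decide`, and the zone list (a subset of RFC 6303) are illustrative only and do not mirror the real pipeline types:

```rust
#[derive(Debug, PartialEq)]
enum Decision {
    /// Send to the upstream named by a matching [[forwarding]] rule.
    Forward(String),
    /// Answer NXDOMAIN locally (special-use intercept).
    LocalNxDomain,
    /// Fall through to the rest of the pipeline.
    Continue,
}

/// Label-boundary suffix match, case-insensitive.
fn matches_suffix(qname: &str, suffix: &str) -> bool {
    let q = qname.trim_end_matches('.').to_ascii_lowercase();
    let s = suffix.trim_matches('.').to_ascii_lowercase();
    q == s || q.ends_with(&format!(".{s}"))
}

/// Subset of the RFC 6303 locally served zones, for illustration.
fn is_special_use(qname: &str) -> bool {
    ["10.in-addr.arpa", "168.192.in-addr.arpa", "254.169.in-addr.arpa"]
        .iter()
        .any(|zone| matches_suffix(qname, zone))
}

/// `rules` holds (suffix, upstream) pairs.
fn decide(qname: &str, rules: &[(&str, &str)]) -> Decision {
    // Explicit rules are checked first, so they can override the intercept.
    if let Some((_, upstream)) = rules.iter().find(|(suffix, _)| matches_suffix(qname, suffix)) {
        return Decision::Forward(upstream.to_string());
    }
    if is_special_use(qname) {
        return Decision::LocalNxDomain;
    }
    Decision::Continue
}
```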
Comparing local cache (0.8ms) vs a remote service (37ms) measures
network latency, not resolver quality. Any local resolver would
show the same advantage. Replaced with AdGuard Home comparison
which is a fair local-to-local benchmark.
AdGuard Home on port 5457, both forwarding via DoH. Cached queries
tied at 0.1ms. On degraded networks hedging hurts p99 (28ms vs 10ms
without) — both requests pay the same high RTT with no random spikes
to rescue. On clean networks hedging wins.
The DoH endpoint rejected requests with Host: 127.0.0.1/::1/localhost,
and the generated TLS cert had no IP SANs — so browsers couldn't use
https://127.0.0.1/dns-query even with the CA trusted.
- is_doh_host now accepts 127.0.0.1, ::1, localhost (with optional port)
- TLS cert includes 127.0.0.1 and ::1 IP SANs, plus bare TLD DNS SAN
Closes #87
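A sketch of the loopback part of the Host check. `is_loopback_host` and `host_without_port` are hypothetical names; the real `is_doh_host` also has to accept whatever names it accepted before this change:

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Strip an optional port from a Host header value.
fn host_without_port(host: &str) -> &str {
    // Bracketed IPv6 literal, e.g. "[::1]:443".
    if let Some(rest) = host.strip_prefix('[') {
        return rest.split(']').next().unwrap_or(rest);
    }
    match host.rsplit_once(':') {
        // Exactly one colon followed by digits: "name:port".
        Some((name, port))
            if !name.contains(':') && port.chars().all(|c| c.is_ascii_digit()) =>
        {
            name
        }
        // No colon, or a bare IPv6 literal such as "::1".
        _ => host,
    }
}

/// True for 127.0.0.1, ::1, and localhost, with or without a port.
fn is_loopback_host(host: &str) -> bool {
    let name = host_without_port(host.trim());
    if name.eq_ignore_ascii_case("localhost") {
        return true;
    }
    match name.parse::<IpAddr>() {
        Ok(ip) => {
            ip == IpAddr::V4(Ipv4Addr::LOCALHOST) || ip == IpAddr::V6(Ipv6Addr::LOCALHOST)
        }
        Err(_) => false,
    }
}
```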
Thread Transport enum through resolve pipeline, record per-query
transport in stats and query log. Dashboard gets bar chart panel
with encryption %, transport column in query log, and filter dropdown.
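How the encryption percentage could be derived from per-query transports, as a sketch. The variant set below is an assumption based on the transports named elsewhere in this log; the real `Transport` enum may carry more:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Transport {
    Udp,
    Dot,
    Doh,
}

impl Transport {
    /// DoT and DoH are counted as encrypted; plain UDP is not.
    fn is_encrypted(self) -> bool {
        matches!(self, Transport::Dot | Transport::Doh)
    }
}

/// Share of queries that arrived over an encrypted transport, in percent.
fn encryption_percent(queries: &[Transport]) -> f64 {
    if queries.is_empty() {
        return 0.0;
    }
    let encrypted = queries.iter().filter(|t| t.is_encrypted()).count();
    100.0 * encrypted as f64 / queries.len() as f64
}
```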
Hedging was DoH-only (hyper dispatch spike mitigation). Now applies to
UDP (rescues packet loss) and DoT (rescues TLS handshake stalls) too.
Same-upstream hedging: fires a second independent request after hedge_ms
delay. First response wins. Disable with hedge_ms = 0.
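The hedging pattern, sketched with std threads and a channel rather than the async code the resolver presumably uses. `hedged` and its `attempt` argument are hypothetical; only the `hedge_ms` semantics come from this commit:

```rust
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Run `request` once; if no reply arrives within `hedge_ms`, fire a second
/// independent attempt and return whichever finishes first.
/// `hedge_ms == 0` disables hedging. `request` receives the attempt index.
fn hedged<T, F>(hedge_ms: u64, request: F) -> T
where
    T: Send + 'static,
    F: Fn(u32) -> T + Send + Sync + 'static,
{
    let request = Arc::new(request);
    let (tx, rx) = mpsc::channel();

    // Primary attempt.
    let (primary, primary_tx) = (Arc::clone(&request), tx.clone());
    thread::spawn(move || {
        let _ = primary_tx.send((*primary)(0));
    });

    if hedge_ms == 0 {
        drop(tx); // so recv() errors instead of hanging if the primary dies
        return rx.recv().expect("primary attempt failed");
    }

    match rx.recv_timeout(Duration::from_millis(hedge_ms)) {
        Ok(response) => response,
        Err(_) => {
            // Primary is slow: launch the hedge, then take the first reply.
            thread::spawn(move || {
                let _ = tx.send((*request)(1));
            });
            rx.recv().expect("both attempts failed")
        }
    }
}
```

The losing attempt is simply ignored here; a real implementation would likely cancel it.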
With ROUNDS=10, only the first query per domain was truly cold — the
other 9 hit cached NS delegations at <1ms, diluting the median to
0.4ms. Now cold mode uses 1 round so every sample is a real cold
resolve. Also extracted compare_two_rounds to support per-mode rounds.
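To illustrate the dilution with made-up numbers (120 ms cold and 0.4 ms warm are hypothetical values chosen for the example, not measurements):

```rust
/// Median of a non-empty set of latency samples, in milliseconds.
fn median(samples: &[f64]) -> f64 {
    assert!(!samples.is_empty(), "need at least one sample");
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("latencies are never NaN"));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// One cold resolve followed by `rounds - 1` warm repeats of the same domain.
/// Assumes `rounds >= 1`.
fn samples_for_domain(cold_ms: f64, warm_ms: f64, rounds: usize) -> Vec<f64> {
    let mut samples = vec![cold_ms];
    samples.extend(std::iter::repeat(warm_ms).take(rounds.saturating_sub(1)));
    samples
}
```

With 10 rounds the median lands on the warm value and the single cold sample disappears; with 1 round the median is the cold sample itself.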
Re-runs of --vs-unbound-cold were hitting stale cache entries from
prior runs. The static COUNTER reset to 0 each process, generating
the same c0.example.com subdomains. With the 1-hour stale window,
entries from 10 minutes ago served as stale hits.
Fix: prefix with PID (r{pid}-c{n}.domain) and flush Numa's cache
before cold benchmarks.
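Sketch of the PID-prefixed name generation. The function names are hypothetical; the `r{pid}-c{n}.domain` shape is the one described above:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Per-process counter; starts at 0 on every run, which is why it
/// cannot make names unique across runs on its own.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

/// Pure formatter, split out so it can be tested with a fixed PID.
fn cold_name_for(pid: u32, n: usize, domain: &str) -> String {
    format!("r{pid}-c{n}.{domain}")
}

/// Next never-before-queried subdomain for this process.
fn next_cold_name(domain: &str) -> String {
    let n = COUNTER.fetch_add(1, Ordering::Relaxed);
    cold_name_for(std::process::id(), n, domain)
}
```

PIDs can be reused by the OS, so this only makes collisions unlikely; flushing the cache before cold runs covers the remaining case.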
- Extract refresh_entry in ctx.rs — warm_domain in main.rs now delegates
to it instead of duplicating the resolve+cache logic (~40 lines removed)
- Eliminate unconditional .to_vec() of raw wire on every UDP/DoT query —
pass &buffer.buf[..len] directly (zero-cost for cache hits)
- Replace bare bool stale flag with Freshness enum (Fresh/NearExpiry/Stale)
making the three states self-documenting at every call site
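A sketch of how the three states might be derived from a cache entry's age. The variant names come from this commit; the "last 10% of TTL" threshold for NearExpiry and the `classify` helper are assumptions for illustration:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Freshness {
    Fresh,
    NearExpiry,
    Stale,
}

/// Classify an entry by how much of its TTL has elapsed.
fn classify(age_secs: u64, ttl_secs: u64) -> Freshness {
    if age_secs >= ttl_secs {
        // TTL fully elapsed: only usable as a stale answer.
        return Freshness::Stale;
    }
    let remaining = ttl_secs - age_secs;
    // Assumed threshold: 10% or less of the TTL left.
    if remaining.saturating_mul(10) <= ttl_secs {
        Freshness::NearExpiry
    } else {
        Freshness::Fresh
    }
}
```

Compared with a bare `bool`, a `match` on this enum forces every call site to say what it does in each of the three cases.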