300 Commits

Author SHA1 Message Date
Razvan Dimescu
e19505aa95 fix(linux): narrow replace_exe_path cfg to macos after Linux inlined the substitution
Linux install_service_linux now does the {{exe_path}} substitution
inline because it uses the (potentially copied) binary path returned
by install_service_binary_linux, not current_exe(). The shared
replace_exe_path helper is dead on Linux — clippy -D warnings caught it.

Narrow the function to macos and split the placeholder test: keep the
"both templates contain {{exe_path}}" assertion as a cross-platform test
(catches placeholder removal on either file), keep the substitution test
gated to macos where the function lives.
2026-04-18 11:57:54 +03:00
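Note on e19505aa95: the placeholder handling described above can be sketched roughly as follows. This is an illustration only. The function bodies and the `has_exe_placeholder` name are assumptions, and the sketch is left ungated so it compiles on any platform, whereas the commit restricts `replace_exe_path` to macOS.

```rust
/// Placeholder that both service templates are expected to contain.
const EXE_PLACEHOLDER: &str = "{{exe_path}}";

/// Hypothetical stand-in for the helper the commit narrows to macOS:
/// substitutes the placeholder with the binary path the service should launch.
fn replace_exe_path(template: &str, exe_path: &str) -> String {
    template.replace(EXE_PLACEHOLDER, exe_path)
}

/// Cross-platform check (hypothetical name): a template that lost its
/// placeholder would otherwise produce a unit with no usable binary path.
fn has_exe_placeholder(template: &str) -> bool {
    template.contains(EXE_PLACEHOLDER)
}
```

Splitting the two checks this way keeps the "placeholder still present" assertion runnable everywhere, while the substitution test only needs to run where the helper exists.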
Razvan Dimescu
3970a9f45c fix(linux): copy binary to /usr/local/bin when source path isn't world-traversable
The transient account that DynamicUser=yes allocates can only traverse
world-executable directories. The CI binary at
/home/runner/work/numa/numa/target/release/numa fails exec with EACCES
because /home/runner is mode 0700; the same applies to a build under
/home/<user>/, ~/.cargo/bin, or any private $HOME tree.

install_service_binary_linux now walks the binary's path. If every
ancestor grants world-execute (Linuxbrew /home/linuxbrew is 0755,
/usr/local/bin is fine, install.sh layout works), keep the source
path so brew/distro upgrades propagate in place. Otherwise copy to
/usr/local/bin/numa and reference that in the unit.

Locally verified both branches in an Ubuntu 24.04 systemd container:
- CI-like /home/runner (0700) → copies + service binds 5380
- Brew-like /home/linuxbrew (0755) → keeps source path + service binds 5380
2026-04-18 11:51:32 +03:00
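Note on 3970a9f45c: a minimal sketch of the ancestor walk described above, assuming an absolute source path and a caller that can stat every ancestor (the installer runs as root). The function names and the `fallback` parameter are hypothetical and not taken from the repository.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// True when the "other" execute bit (o+x) is set in a Unix mode.
fn grants_world_execute(mode: u32) -> bool {
    (mode & 0o001) != 0
}

/// True when every ancestor directory of `binary` is world-traversable.
fn ancestors_world_traversable(binary: &Path) -> io::Result<bool> {
    // `ancestors()` yields the path itself first; skip it so only the
    // containing directories are checked.
    for dir in binary.ancestors().skip(1) {
        if dir.as_os_str().is_empty() {
            continue; // trailing empty component of a relative path
        }
        let mode = fs::metadata(dir)?.permissions().mode();
        if !grants_world_execute(mode) {
            return Ok(false);
        }
    }
    Ok(true)
}

/// Keeps the source path when it is reachable by a transient user,
/// otherwise copies the binary to `fallback` and returns that path.
fn service_binary_path(source: &Path, fallback: &Path) -> io::Result<PathBuf> {
    if ancestors_world_traversable(source)? {
        Ok(source.to_path_buf())
    } else {
        fs::copy(source, fallback)?;
        Ok(fallback.to_path_buf())
    }
}
```

With `fallback` set to `/usr/local/bin/numa`, a binary under a 0700 `/home/runner` would be copied, while one under a 0755 `/home/linuxbrew` would be referenced in place.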
Razvan Dimescu
7b9db9e889 fix(linux): drop ProtectHome=true — blocks exec when binary lives under /home
Integration-linux journalctl showed status=203/EXEC: systemd couldn't
exec /home/runner/work/numa/numa/target/release/numa because
ProtectHome=yes makes /home invisible to the sandboxed process. My
local Docker test passed because the binary was at /workspace, not
/home.

DynamicUser=yes already implies ProtectHome=read-only, which preserves
exec access to binaries living under /home (cargo install, source
builds, CI) while blocking writes to user $HOMEs. Keep that default
rather than over-restricting.

Follow-up worth tracking: install_service_linux could copy the binary
to /usr/local/bin/numa the way Windows does at windows_service_exe_path,
making the unit's ExecStart independent of where `numa install` was
invoked from — then we could set ProtectHome=yes again.
2026-04-18 08:54:34 +03:00
Razvan Dimescu
dfeca53e21 ci: dump journalctl + systemctl status on integration-linux failure
2026-04-18 08:48:53 +03:00
Razvan Dimescu
4f6159d961 refactor(linux): switch to DynamicUser=yes, drop install-time user creation
AUR installs never call `numa install` — PKGBUILD drops the unit straight
into /usr/lib/systemd/system and the user runs `systemctl enable numa`.
With User=numa the Rust installer's useradd code never fires there,
breaking Arch out of the box.

DynamicUser=yes sidesteps packaging entirely — systemd allocates a
transient UID per start and remaps StateDirectory ownership (including
legacy root-owned trees) automatically. Works on any modern systemd.

Drops the ensure_numa_user_linux/chown helpers plus NUMA_USER; the
unit file alone now captures the privilege-drop story.
2026-04-18 08:20:07 +03:00
Razvan Dimescu
41aea1dd12 fix(linux): drop risky sandbox directives that break Rust network daemons
Integration test failed with exit 7 on curl to /health after a successful
install — service started but never listened. The likely culprits are
MemoryDenyWriteExecute (breaks jemalloc/some crypto), SystemCallFilter
~@privileged @resources (blocks setrlimit and friends tokio may use),
and RestrictNamespaces/LockPersonality (occasional foot-guns).

Pull them and keep a conservative hardening set that's well-tested with
Rust network services: no-new-privs, protect-system/home, private tmp
and devices, protect-kernel-*, restrict-realtime/suid/address-families.
Layer the aggressive bits back in follow-up PRs once tested individually.
2026-04-18 08:10:04 +03:00
Razvan Dimescu
695a8b963c feat(linux): run systemd service as unprivileged numa user
- numa.service: User=numa + CAP_NET_BIND_SERVICE + sandboxing block
  (ProtectSystem=strict, PrivateTmp, seccomp @system-service, etc.)
- install_service_linux: create numa system user + chown data_dir
  before first start so TLS-cert generation and state writes land
  on a numa-owned tree

Runtime verified root-free on Linux — network_watch_loop only reads
/etc/resolv.conf; all system-DNS mutation stays in the installer,
which continues to run as root via sudo.
2026-04-18 07:56:59 +03:00
Razvan Dimescu
34e2182ae4 Merge pull request #104 from razvandimescu/feat/forwarding-array-upstream
feat: accept array of upstreams in [[forwarding]]
2026-04-17 23:25:04 +03:00
Razvan Dimescu
5f77af55e9 fix(forward): track SRTT for DoT upstreams, not just UDP
The SRTT ordering + failure penalty path was UDP-only, so a DoT primary
in a forwarding-rule pool was never deprioritized on failure and all
DoT entries tied at INITIAL_SRTT_MS in the sort key. With [[forwarding]]
now accepting arrays of upstreams, DoT pools are a first-class case and
need the same healthiest-first behavior the default pool gets for UDP.

- Add Upstream::tracked_ip() → Some(ip) for Udp/Dot, None for Doh
  (DoH has no stable IP — reqwest pools connections by hostname).
- Rewire the three SRTT call sites in forward_with_failover_raw.
- Hoist srtt.read() out of the candidate-scoring loop — one lock per
  query instead of N (matters now that pools commonly have N>1).
- Drop unused #[derive(Debug)] on UpstreamPool and ForwardingRule.
- Regression tests: udp_failure_records_in_srtt + dot_failure_records_in_srtt.
2026-04-17 03:39:21 +03:00
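Note on 5f77af55e9: the `tracked_ip()` rule and the healthiest-first ordering can be sketched as below. The variant payloads, the `sort_healthiest_first` helper, and the value of `INITIAL_SRTT_MS` are assumptions made for illustration; only the Some/None split per transport comes from the commit message.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, SocketAddr};

/// Placeholder value; the real constant is not shown in the log.
const INITIAL_SRTT_MS: u64 = 200;

/// Simplified stand-in for the project's `Upstream` type.
#[allow(dead_code)]
enum Upstream {
    Udp(SocketAddr),
    Dot { addr: SocketAddr, server_name: String },
    Doh { url: String },
}

impl Upstream {
    /// IP used as the SRTT key: present for UDP and DoT, absent for DoH,
    /// whose connections are pooled by hostname rather than by address.
    fn tracked_ip(&self) -> Option<IpAddr> {
        match self {
            Upstream::Udp(addr) => Some(addr.ip()),
            Upstream::Dot { addr, .. } => Some(addr.ip()),
            Upstream::Doh { .. } => None,
        }
    }
}

/// Orders a pool by smoothed RTT, lowest first. The map is passed in once,
/// mirroring the idea of taking the SRTT read lock a single time per query.
fn sort_healthiest_first(pool: &mut [Upstream], srtt_ms: &HashMap<IpAddr, u64>) {
    pool.sort_by_key(|u| {
        u.tracked_ip()
            .and_then(|ip| srtt_ms.get(&ip).copied())
            .unwrap_or(INITIAL_SRTT_MS)
    });
}
```

Before this change, a DoT entry would never have had a map entry, so every DoT upstream fell through to the initial value and tied in the sort.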
Razvan Dimescu
ab6cda0c91 Merge branch 'main' into feat/forwarding-array-upstream
Resolves src/main.rs conflict: serve loop was extracted into src/serve.rs on main (PR #107). Ported the forwarding-rule log change to serve.rs — fwd.upstream is now Vec<String>, logged with join(", ").
2026-04-17 03:14:09 +03:00
Razvan Dimescu
f9ce82f4b0 Merge pull request #107 from razvandimescu/feat/windows-service
feat(windows): run as a real SCM service, not a Run-key autostart
2026-04-17 02:02:43 +03:00
Razvan Dimescu
1d9495c013 ci: bridge DNS gap with direct upstream instead of polling
systemd-resolved has a ~40s reconfiguration stall after restart
(systemd #22521) that breaks the GHA runner's persistent connection
to results-receiver.actions.githubusercontent.com. Polling for DNS
recovery isn't enough since the .NET runner agent caches DNS at the
connection-pool level. Replace the broken stub-resolv symlink with a
direct upstream so DNS works instantly.
2026-04-17 01:32:36 +03:00
Razvan Dimescu
34b75833b8 ci: poll for DNS recovery in cleanup, not test step
Move DNS recovery wait into the cleanup step (if: always) so it runs
regardless of test outcome. Use getent hosts loop instead of sleep+dig
to match what post-steps actually use for resolution.
2026-04-17 01:11:20 +03:00
Razvan Dimescu
99af97a67b ci: wait for DNS recovery after uninstall on Linux
systemd-resolved needs a moment to restore its stub listener after
the numa drop-in is removed. Without a wait, the runner can't resolve
GitHub's API to report job completion.
2026-04-16 20:20:53 +03:00
Razvan Dimescu
9e56054f37 ci: add integration tests for install/uninstall lifecycle
Release-build + install/verify/re-install/uninstall cycle on Linux and
macOS. Runs after lint/test passes (needs dependency). Cleanup step
uses if: always() to handle cancellation.
2026-04-16 19:56:44 +03:00
Razvan Dimescu
fe9f31616e test: add SCM output parsing and config path regression tests
Extract parse_sc_registered and parse_sc_state as testable pure
functions. 8 new tests covering: service registration detection,
service state parsing, and Windows config_dir == data_dir invariant.
2026-04-16 19:31:26 +03:00
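Note on fe9f31616e: pure parsers over `sc query` output might look roughly like this. The function names come from the commit message, but the signatures, the `ScState` type, and the parsing strategy are assumptions. The numeric state codes are the standard Windows service states, and 1060 is the "service does not exist" error that `sc.exe` reports.

```rust
/// Subset of Windows service states, keyed by the numeric code sc.exe prints.
#[derive(Debug, PartialEq)]
enum ScState {
    Stopped,
    StartPending,
    StopPending,
    Running,
    Other(u32),
}

/// True when the output describes an existing service. An unregistered
/// service makes `sc query` fail with error 1060.
fn parse_sc_registered(output: &str) -> bool {
    output.contains("SERVICE_NAME") && !output.contains("1060")
}

/// Extracts the state from a line such as `STATE : 4  RUNNING`.
/// Parsing the numeric code avoids depending on localized state names.
fn parse_sc_state(output: &str) -> Option<ScState> {
    let line = output
        .lines()
        .find(|l| l.trim_start().starts_with("STATE"))?;
    let (_, value) = line.split_once(':')?;
    let code: u32 = value.split_whitespace().next()?.parse().ok()?;
    Some(match code {
        1 => ScState::Stopped,
        2 => ScState::StartPending,
        3 => ScState::StopPending,
        4 => ScState::Running,
        n => ScState::Other(n),
    })
}
```

Because both functions take a plain `&str`, they can be tested on any platform with captured output, without a Windows host or a registered service.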
Razvan Dimescu
9f08d8b489 fix(windows): stop service before port probe, wait for full exit
Stop the running service before disabling Dnscache so the port 53 probe
sees the real state (not Numa's own binding). Wait for SCM STOPPED
state before copying the binary to avoid os error 32 (file in use).
2026-04-16 19:21:56 +03:00
Razvan Dimescu
9bea038cb6 fix(windows): unify config/data dir and add service log file
config_dir() on Windows now returns data_dir() (ProgramData) so config,
services.json, and log file are in the same place for both interactive
and service contexts. Service mode writes logs to numa.log via
env_logger pipe. Dashboard shows correct log path per OS.
2026-04-16 19:12:42 +03:00
Razvan Dimescu
f0a1dd7106 fix(dashboard): hide logs path on Windows (no log sink yet)
2026-04-16 19:01:34 +03:00
Razvan Dimescu
6789c321bc fix(windows): defer DNS redirect until port 53 is free
Probe port 53 after disabling Dnscache instead of assuming reboot is
needed. Skip DNS redirect when port is blocked (service does it on
first boot). Fix readiness probe: TCP connect to API port instead of
broken UDP send_to that always succeeded.
2026-04-16 18:35:09 +03:00
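Note on 6789c321bc: UDP is connectionless, so a `send_to` call returns success once the datagram is handed to the kernel, whether or not anything is listening. A TCP connect only succeeds when a listener accepts the handshake. A minimal sketch of such a readiness probe, with a hypothetical function name and a caller-chosen timeout:

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true only when something is accepting TCP connections on `addr`.
/// A refused or timed-out connection attempt counts as "not ready".
fn api_is_ready(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}
```

A caller would typically retry this in a short loop after starting the service and give up after a bounded number of attempts.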
Razvan Dimescu
da40a8dbfc ci: fetch full history on Windows so build.rs embeds git SHA
2026-04-16 18:08:48 +03:00
Razvan Dimescu
65e65028a0 fix(windows): separate service lifecycle from install flow
service start/stop/restart/status now map to proper SCM operations
instead of re-running the full install/uninstall flow. On re-install,
stop the running service first so the binary can be overwritten.
2026-04-16 16:59:54 +03:00
Razvan Dimescu
d3eab73a31 fix: use sort_by_key to satisfy clippy unnecessary_sort_by
2026-04-16 16:13:15 +03:00
Razvan Dimescu
22ec684e48 Merge remote-tracking branch 'origin/main' into feat/windows-service
# Conflicts:
#	src/main.rs
2026-04-16 16:06:49 +03:00
Razvan Dimescu
aa040fd8a4 Merge pull request #111 from razvandimescu/fix/allowlist-input-focus
fix(dashboard): allowlist input erased by polling refresh
2026-04-16 15:27:02 +03:00
Razvan Dimescu
b69cc89d38 fix(dashboard): skip allowlist re-render while input has focus
The polling refresh replaced the entire allowlist panel innerHTML every
2 seconds, destroying the input field mid-typing. Users had to
paste-and-enter faster than the refresh interval — #106 reported this
as text "timing out and erasing."

Guard: skip renderAllowlist() when allowDomainInput has focus.
2026-04-16 15:12:00 +03:00
Razvan Dimescu
ebb801650e Merge pull request #110 from razvandimescu/feat/build-version
feat: embed git SHA in version string
2026-04-16 13:41:23 +03:00
Razvan Dimescu
30bb7365c9 refactor: robust git-describe parsing for pre-release tags
Switch to --long flag so format is always TAG-N-gSHA[-dirty], then
split from the right. Handles pre-release tags (v0.14.0-rc1) that
broke the previous left-split approach. Remove ineffective directory
watch on .git/refs/tags/. Trim comments.
2026-04-16 13:18:56 +03:00
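Note on 30bb7365c9: with `--long`, the output always ends in `-N-gSHA`, optionally followed by `-dirty`, so splitting from the right leaves any hyphens inside the tag untouched. The sketch below shows that idea. The `Described` type and both function names are hypothetical, and the output format follows the version strings listed in 0118ab0f44. How a dirty tree exactly on a tag should render is an assumption.

```rust
/// Parsed form of `TAG-N-gSHA[-dirty]`.
#[derive(Debug, PartialEq)]
struct Described {
    tag: String,
    commits_ahead: u32,
    sha: String,
    dirty: bool,
}

/// Returns None when the input has no tag part, for example the bare
/// abbreviated SHA that `--always` prints in a repository without tags.
fn parse_describe(raw: &str) -> Option<Described> {
    let raw = raw.trim();
    let (body, dirty) = match raw.strip_suffix("-dirty") {
        Some(rest) => (rest, true),
        None => (raw, false),
    };
    // Split from the right: the last two fields are fixed, the rest is the tag.
    let mut parts = body.rsplitn(3, '-');
    let sha = parts.next()?.strip_prefix('g')?;
    let commits_ahead: u32 = parts.next()?.parse().ok()?;
    let tag = parts.next()?;
    Some(Described {
        tag: tag.to_string(),
        commits_ahead,
        sha: sha.to_string(),
        dirty,
    })
}

/// Renders `0.13.1`, `0.13.1+a87f907`, or `0.13.1+a87f907-dirty`.
fn version_string(d: &Described) -> String {
    let base = d.tag.strip_prefix('v').unwrap_or(&d.tag);
    if d.commits_ahead == 0 && !d.dirty {
        return base.to_string();
    }
    let suffix = if d.dirty { "-dirty" } else { "" };
    format!("{}+{}{}", base, d.sha, suffix)
}
```

A left-to-right split on `-` would cut `v0.14.0-rc1` after `v0.14.0` and misread `rc1` as the commit count, which is the failure the commit describes.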
Razvan Dimescu
0118ab0f44 feat: embed git SHA in version string via build.rs
Adds a build.rs that runs `git describe --tags --always --dirty` and
sets NUMA_BUILD_VERSION at compile time. A new `numa::version()` helper
returns the build version, falling back to CARGO_PKG_VERSION when git
is unavailable (source tarballs, Docker builds without .git).

Version strings:
  tagged release:      0.13.1
  commits ahead:       0.13.1+a87f907
  uncommitted changes: 0.13.1+a87f907-dirty
  no git:              0.13.1

Replaces all 6 inline env!("CARGO_PKG_VERSION") call sites with the
single version() function.
2026-04-16 13:02:25 +03:00
Razvan Dimescu
a87f907d20 Merge pull request #109 from razvandimescu/feat/dashboard-version
feat(dashboard): version in header, restructure footer
2026-04-16 11:29:54 +03:00
Razvan Dimescu
1c5e703330 fix(dashboard): collapse header on mobile (≤700px)
Hide tagline, version tag, and Phone Setup on narrow viewports so
the header stays single-row: logo + status dot + blocking toggle.
Reduces logo font-size from 1.8rem to 1.4rem on mobile.
2026-04-16 06:39:29 +03:00
Razvan Dimescu
cc635f2f73 feat(dashboard): show version in header, restructure footer
Closes #108.

- Add `version` field to /stats (from CARGO_PKG_VERSION).
- Show `v0.13.1` next to the Numa wordmark in the dashboard header.
- Restructure the footer into two semantic rows:
  Row 1 (paths): Config · Data · Logs (platform-detected)
  Row 2 (runtime): Upstream · DNSSEC · SRTT · GitHub
- Drop Mode from the footer (redundant with Upstream label).
- Show only the matching-platform log path instead of both
  macOS and Linux unconditionally.
2026-04-16 06:15:48 +03:00
Razvan Dimescu
7bb484ada3 refactor(windows): deduplicate after simplify review
- Drop the duplicate WINDOWS_SERVICE_NAME constant; call sites use the
  single source of truth at windows_service::SERVICE_NAME.
- windows_service_exe_path and service_config_path now compose from
  crate::data_dir() instead of re-parsing %PROGRAMDATA% locally.
- Factor the 6× sc.exe invocation boilerplate into a run_sc helper.
- Replace the 200ms try_recv polling loop in the service dispatcher
  with a recv_timeout wait — cuts shutdown latency and idle CPU.
- stop_service_scm/delete_service_scm now log warnings instead of
  silently swallowing failures, so unexpected errors are visible.
2026-04-15 23:48:09 +03:00
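Note on 7bb484ada3: the difference between polling `try_recv` on a timer and blocking in `recv_timeout` can be sketched with the standard library channel. The `Control` enum, the `wait_for_stop` name, and the `on_idle` callback are hypothetical; the real dispatcher's message type is not shown in the log.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical control message sent by the service event handler.
enum Control {
    Stop,
}

/// Blocks until a Stop message arrives or every sender is dropped.
/// The thread sleeps inside `recv_timeout` and wakes as soon as a message
/// is sent, so shutdown latency no longer depends on a polling interval.
fn wait_for_stop(rx: &Receiver<Control>, tick: Duration, mut on_idle: impl FnMut()) {
    loop {
        match rx.recv_timeout(tick) {
            Ok(Control::Stop) => break,
            // Nothing arrived within `tick`: run optional periodic work.
            Err(RecvTimeoutError::Timeout) => on_idle(),
            // All senders are gone, so no Stop can ever arrive.
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
}
```

With a long `tick`, an idle service wakes rarely, yet a Stop sent at any moment is handled immediately.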
Razvan Dimescu
b610160cd1 feat(windows): run numa as a real SCM service, drop Run-key autostart
Hooks the service-dispatcher scaffolding from the previous commit to
actually serve DNS, and replaces the HKLM\…\Run login-time autostart
with a proper Windows service created via sc.exe.

**Refactor**
- Extract main.rs's inline server body (~500 lines) into `numa::serve::run`
  so both the interactive CLI entry and the service dispatcher drive the
  same startup/serve loop. main.rs is now a thin subcommand router.
- main.rs goes sync (no #[tokio::main]); each branch that needs async
  builds its own runtime and block_on's. Required so the --service path
  can hand off to SCM without fighting tokio for the entry thread.

**Windows service wrapper**
- `numa::windows_service::run_service` now builds a multi-thread tokio
  runtime on a dedicated thread and runs `serve::run` inside it. Stop/
  Shutdown from SCM aborts the wait loop and reports SERVICE_STOPPED.
- Config path resolves to `%PROGRAMDATA%\numa\numa.toml` when running
  under SCM (SYSTEM's cwd is System32, relative paths don't work).

**Install/uninstall**
- `install_windows` now copies numa.exe to a stable
  `%PROGRAMDATA%\numa\bin\numa.exe` and registers it via `sc create`
  with start=auto, obj=LocalSystem, and a failure policy of
  restart/5000/restart/5000/restart/10000. Starts the service
  immediately when no reboot is pending.
- `uninstall_windows` stops + deletes the service and removes the
  binary copy before restoring DNS.
- Drops the old `register_autostart` / `remove_autostart` helpers that
  wrote to `HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run` — that
  path runs at user login in the user's session with no stderr capture
  and no crash-restart policy, which is why we've been flying blind in
  every Windows debug session.

DNS-set bugs (netsh destructive static, IPv6 not touched, uninstall
secondary-drop) and file logging are orthogonal — tracked for follow-up.
2026-04-15 22:24:23 +03:00
Razvan Dimescu
cea4b0ef88 feat(windows): add windows-service crate + SCM dispatcher scaffold
Lets numa.exe act as a real Windows service registered with the SCM,
replacing the HKLM\...\Run login-time autostart that runs in the user
session without stderr capture.

- New `numa::windows_service` module (cfg(windows)) wraps Mullvad's
  `windows-service` crate: registers with SCM, reports Running, handles
  Stop/Shutdown, reports Stopped.
- `numa.exe --service` is the entry point SCM uses
  (`sc create … binPath="numa.exe --service"`); interactive invocations
  are unchanged.
- Dep is gated `[target.'cfg(windows)'.dependencies]` — zero impact on
  macOS/Linux builds or binary size.

Scaffold only. The service currently blocks on an mpsc channel until
Stop arrives; the actual serve loop will hook in once main.rs's inline
server body is extracted into `numa::serve(config_path)` in a follow-up.
This lets `sc start Numa` / `sc stop Numa` be verified end to end today.
2026-04-15 22:14:36 +03:00
Razvan Dimescu
4afc56a052 Merge main into feat/forwarding-array-upstream
Resolves conflict in src/ctx.rs — both sides added independent tokio
tests (forwarding fail-over on this branch, default-pool upstream path
on main from #103). Keep both.
2026-04-15 21:28:04 +03:00
Razvan Dimescu
43a5ca4bd5 Merge pull request #105 from razvandimescu/chore/audit-rustls-webpki
chore(deps): bump rustls-webpki to 0.103.12
2026-04-15 14:41:19 +03:00
Razvan Dimescu
b403671e11 chore(deps): bump rustls-webpki to 0.103.12
Patches RUSTSEC-2026-0098 (URI name constraints incorrectly accepted)
and RUSTSEC-2026-0099 (wildcard cert name constraints), both published
2026-04-14. Transitive via reqwest / rustls / hickory / quinn.
2026-04-15 14:27:17 +03:00
Razvan Dimescu
6f0144b237 Merge pull request #103 from razvandimescu/feat/upstream-log-label
feat: distinguish UPSTREAM vs FORWARD in logs and stats
2026-04-15 14:00:28 +03:00
Razvan Dimescu
fef43635d6 fix(ci): rustfmt import order and gate Upstream import for Windows
2026-04-15 04:11:27 +03:00
Razvan Dimescu
9a0d586b13 feat: accept array of upstreams in [[forwarding]]
Mirrors `[upstream] address` — `upstream` accepts string or array
of strings, builds an `UpstreamPool` and routes queries through
`forward_with_failover_raw` so SRTT ordering and failover apply to
matched `[[forwarding]]` rules the same way they do for the default
pool.

Single-string rules keep their current behavior (one-element pool,
equivalent single-upstream path). Empty array errors at config load.

Addresses item 1 of issue #102. Plan: docs/102_item1.md.
2026-04-15 04:03:38 +03:00
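Note on 9a0d586b13: the "string or array of strings" rule can be sketched as a small normalization step. The `UpstreamField` type and `normalize_upstreams` function are hypothetical. In a real config loader, the two shapes would more likely be distinguished during deserialization, which this dependency-free sketch leaves out.

```rust
/// The two shapes the `upstream` key may take in a forwarding rule.
#[derive(Debug, PartialEq)]
enum UpstreamField {
    One(String),
    Many(Vec<String>),
}

/// Normalizes either shape into a non-empty list of upstream strings.
/// A single string becomes a one-element pool; an empty array is rejected
/// so the mistake surfaces at config load rather than at query time.
fn normalize_upstreams(field: UpstreamField) -> Result<Vec<String>, String> {
    match field {
        UpstreamField::One(s) => Ok(vec![s]),
        UpstreamField::Many(list) if list.is_empty() => {
            Err("forwarding rule has an empty upstream array".to_string())
        }
        UpstreamField::Many(list) => Ok(list),
    }
}
```

Treating the single-string form as a one-element pool lets both shapes share one failover code path.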
Razvan Dimescu
4bd08e206d feat(dashboard): hide zero-count path and transport rows
2026-04-14 21:25:11 +03:00
Razvan Dimescu
ebb2a5db39 refactor: simplify upstream-path test — reuse pool mutex, drop narrating comment
2026-04-14 18:26:45 +03:00
Razvan Dimescu
e0e0f50838 feat: distinguish UPSTREAM vs FORWARD in logs and stats
Queries matching a [[forwarding]] suffix rule now log as FORWARD;
queries resolved via the default [upstream] pool log as UPSTREAM.
Previously both paths shared the FORWARD label, making it impossible
to tell from logs whether a rule matched.

Adds QueryPath::Upstream, a queries.upstream stats counter exposed
via /stats, plus a matching dashboard filter, bar, and path tag.

Closes part of #102.
2026-04-14 18:18:32 +03:00
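Note on e0e0f50838: the FORWARD versus UPSTREAM split can be sketched as a classification on whether any forwarding suffix matches the query name. Only the two labels and the `QueryPath::Upstream` name come from the commit. The two-variant enum, the helper names, and the label-boundary matching rule are assumptions.

```rust
/// Reduced to the two paths discussed in the commit; the project's enum
/// presumably has further variants.
#[derive(Debug, PartialEq)]
enum QueryPath {
    Forward,
    Upstream,
}

impl QueryPath {
    /// Label written to logs and used as the stats key.
    fn label(&self) -> &'static str {
        match self {
            QueryPath::Forward => "FORWARD",
            QueryPath::Upstream => "UPSTREAM",
        }
    }
}

/// Case-insensitive suffix match on whole DNS labels, so the suffix
/// `corp.internal` matches `db.corp.internal` but not `notcorp.internal`.
fn suffix_matches(qname: &str, suffix: &str) -> bool {
    let q = qname.trim_end_matches('.').to_ascii_lowercase();
    let s = suffix.trim_end_matches('.').to_ascii_lowercase();
    q == s || q.ends_with(&format!(".{}", s))
}

/// A query is FORWARD when a forwarding suffix matches, UPSTREAM otherwise.
fn classify(qname: &str, forwarding_suffixes: &[&str]) -> QueryPath {
    if forwarding_suffixes.iter().any(|s| suffix_matches(qname, s)) {
        QueryPath::Forward
    } else {
        QueryPath::Upstream
    }
}
```

With distinct labels, a log line alone shows whether a rule matched, which the shared FORWARD label could not.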
Razvan Dimescu
120ba5200e chore: bump version to 0.13.1 (tag: v0.13.1)
2026-04-14 13:31:35 +03:00
Razvan Dimescu
45046bcf6e Merge pull request #101 from razvandimescu/fix/forward-tls-upstream
fix: accept tls:// and https:// in [[forwarding]] upstreams
2026-04-14 13:09:58 +03:00
Razvan Dimescu
b4b939c78b fix: accept tls:// and https:// in [[forwarding]] upstreams
Config-level forwarding rules were parsed with the UDP-only
`parse_upstream_addr` helper, silently rejecting the DoT/DoH schemes
that the rest of the forwarding pipeline already supports.

Widen `ForwardingRule.upstream` from `SocketAddr` to `Upstream` so
config rules reuse the same parser as `[upstream].address` and
`fallback`. Demote `parse_upstream_addr` to `pub(crate)` to prevent
the same mistake recurring.

Closes #100.
2026-04-14 09:22:24 +03:00
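Note on b4b939c78b: a scheme-aware parser shared by all config sites might look roughly like the sketch below. `UpstreamSpec`, `parse_upstream`, and `parse_ip_port` are hypothetical names, and the accepted address syntax is an assumption; the project's actual `Upstream` parser is not shown in the log. Ports 53 and 853 are the standard defaults for plain DNS and DNS over TLS.

```rust
use std::net::{IpAddr, SocketAddr};

/// Simplified stand-in for the project's `Upstream` type.
#[derive(Debug, PartialEq)]
enum UpstreamSpec {
    Udp(SocketAddr),
    Dot(SocketAddr),
    Doh(String),
}

/// Parses `ip` or `ip:port`, filling in `default_port` when none is given.
fn parse_ip_port(s: &str, default_port: u16) -> Result<SocketAddr, String> {
    if let Ok(addr) = s.parse::<SocketAddr>() {
        return Ok(addr);
    }
    s.parse::<IpAddr>()
        .map(|ip| SocketAddr::new(ip, default_port))
        .map_err(|e| format!("invalid upstream {:?}: {}", s, e))
}

/// Dispatches on the URL scheme so every config site accepts the same forms.
fn parse_upstream(s: &str) -> Result<UpstreamSpec, String> {
    if s.starts_with("https://") {
        return Ok(UpstreamSpec::Doh(s.to_string()));
    }
    if let Some(rest) = s.strip_prefix("tls://") {
        return parse_ip_port(rest, 853).map(UpstreamSpec::Dot);
    }
    parse_ip_port(s, 53).map(UpstreamSpec::Udp)
}
```

Routing every config key through one parser means a new scheme only has to be taught in one place, which is the failure mode the commit closes off.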
Razvan Dimescu
9a85e271ec Merge pull request #99 from razvandimescu/fix/aur-llvm-libs
fix: add llvm-libs to AUR makedepends
2026-04-13 17:09:08 +03:00
Razvan Dimescu
7dc1a0686f fix: add llvm-libs to AUR makedepends
Fixes #97 — on minimal Arch installs, rustc fails with
"error while loading shared libraries: libLLVM.so" because
llvm-libs isn't pulled in transitively.
2026-04-13 15:58:52 +03:00
Razvan Dimescu
a02722cdf9 Merge pull request #98 from razvandimescu/docker-support
feat: Docker support with multi-arch GHCR images
2026-04-13 15:53:56 +03:00