* Add GitHub Dependabot scanning (runs once a month)
* chore: group dependabot updates and use conventional commit prefix
Bundle all minor/patch bumps per ecosystem into a single PR to keep
noise manageable (~3 PRs/month instead of 10+). Major bumps still
get individual PRs since they may break APIs.
Commit messages now use the `chore(deps)` conventional-commit prefix
to match the repo's existing style.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Razvan Dimescu <ssaricu@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverts PR #44's approach of swapping GITHUB_TOKEN for a PAT on
action-gh-release. That approach worked in principle but failed in
practice during the v0.10.2 cut: HOMEBREW_TAP_GITHUB_TOKEN is a
fine-grained PAT scoped only to razvandimescu/homebrew-tap, so when
action-gh-release tried to create a release on razvandimescu/numa it
got 403 Resource not accessible. v0.10.2 had to be recovered manually
via `gh release create` from a user PAT.
Root cause of the original bug (from #44): GitHub Actions deliberately
does not propagate workflow events triggered by GITHUB_TOKEN, so a
release created by GITHUB_TOKEN silently failed to fire homebrew-bump's
`release: published` trigger.
Fix: sidestep the event-propagation rule entirely by invoking
homebrew-bump.yml directly as a reusable workflow via `workflow_call`.
- release.yml: drop the `token:` override on action-gh-release (reverts
to GITHUB_TOKEN default, which v0.10.0 and v0.10.1 used successfully)
and add a new `bump-homebrew` job that `needs: release` and `uses:`
homebrew-bump.yml with `secrets: inherit`.
- homebrew-bump.yml: add `workflow_call` trigger with a `version` input,
remove the `release: published` trigger (no longer needed), keep
`workflow_dispatch` for manual recovery, and collapse the version
determination step to a single `inputs.version` read.
Each token now does exactly what its scope permits:
- GITHUB_TOKEN creates the release on numa (contents: write, default)
- HOMEBREW_TAP_GITHUB_TOKEN pushes to homebrew-tap (unchanged)
The tap update becomes a child job in the release run, so failures are
visible in one place instead of "why didn't the release event fire?"
mysteries.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On modern Arch / Ubuntu 22.04+ / Fedora desktops, NetworkManager +
systemd-resolved symlink /etc/resolv.conf to stub-resolv.conf, which
contains only:
nameserver 127.0.0.53
The real upstream servers (router, ISP, configured DoT providers) live
inside systemd-resolved's per-link state, exposed via 'resolvectl status'.
discover_linux() was parsing /etc/resolv.conf, correctly filtering the
stub address, and then falling through to the Quad9 DoH fallback because
detect_dhcp_dns() is implemented only for macOS. Net effect: on a large
chunk of Linux installs, numa silently defaulted to Quad9 instead of the
user's actual DNS. This was visible in Casey's AUR test banner (#33) as
'Upstream https://9.9.9.9/dns-query' despite his machine having working
router DNS the entire time.
resolvectl_dns_server() already exists — it was introduced for cloud VPC
forwarding-rule discovery and knows how to ask systemd-resolved for the
real active DNS server. This commit wires it into the default-upstream
fallback chain, between the primary resolv.conf parse and the
~/.numa/original-resolv.conf backup.
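A minimal sketch of the fallback order described above, not the actual numa code. The names `real_nameservers` and `pick_upstream` are hypothetical, and the `ask_resolved` closure stands in for resolvectl_dns_server(), whose real signature is not shown in this log.

```rust
use std::net::IpAddr;

/// Collect `nameserver` entries from resolv.conf text, dropping loopback
/// addresses. 127.0.0.53 (the systemd-resolved stub) is in 127.0.0.0/8,
/// so `is_loopback()` filters it as well.
fn real_nameservers(resolv_conf: &str) -> Vec<IpAddr> {
    resolv_conf
        .lines()
        .filter_map(|line| {
            let mut parts = line.split_whitespace();
            match (parts.next(), parts.next()) {
                (Some("nameserver"), Some(addr)) => addr.parse::<IpAddr>().ok(),
                _ => None,
            }
        })
        .filter(|ip| !ip.is_loopback())
        .collect()
}

/// Hypothetical fallback chain: primary resolv.conf, then systemd-resolved,
/// then the saved backup. `None` means the caller falls back to DoH.
fn pick_upstream(
    resolv_conf: &str,
    ask_resolved: impl FnOnce() -> Option<IpAddr>,
    backup_resolv_conf: Option<&str>,
) -> Option<IpAddr> {
    real_nameservers(resolv_conf)
        .into_iter()
        .next()
        .or_else(ask_resolved)
        .or_else(|| backup_resolv_conf.and_then(|text| real_nameservers(text).into_iter().next()))
}
```

With a stub-only resolv.conf, the second stage is the one that now yields the router address instead of falling straight through.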
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow-up to #49 and #50. With ownership and quoting fixed, the next run
([24199871832](https://github.com/razvandimescu/numa/actions/runs/24199871832))
reached makepkg and failed with:
/pkg/PKGBUILD: line 34: cargo: command not found
==> ERROR: A failure occurred in prepare().
The publish job only installs 'binutils git sudo' since its sole purpose
is to regenerate .SRCINFO. 'makepkg -od' still runs prepare(), which
calls cargo. The sibling validate job avoids this by passing --noprepare
(and installs rust anyway).
Mirror that pattern: add --noprepare to the metadata-generation invocation.
pkgver() runs before prepare() in makepkg's pipeline, so .SRCINFO still
captures the computed version. Keeps the container minimal (no rust toolchain).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The docker block runs as '/bin/bash -c "<multi-line script>"'. A comment
inside the script contained embedded double quotes:
# "makepkg -od" fetches the source first so pkgver() can calculate the version.
The first embedded '"' prematurely closes the outer string. Bash then
parses the remainder into a second argument to 'bash -c' which becomes
$0 inside the container and is silently discarded. Net effect: the
in-container script stops at 'git config --add safe.directory', neither
'makepkg -od' nor 'makepkg --printsrcinfo > .SRCINFO' ever run, and the
host-side 'git add PKGBUILD .SRCINFO' fails with:
fatal: pathspec '.SRCINFO' did not match any files
This bug was masked by the earlier ownership bug fixed in #49 — once
that permission error was removed, this one surfaced.
Fix: drop the embedded double quotes from the comment.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 'Push to AUR' step failed on run 24195384571 with:
error: could not lock config file .git/config: Permission denied
Inside the docker block we 'chown -R builduser:builduser /pkg', which
propagates through the bind mount and transfers ownership of aur-repo/
(including .git/) to the container's builduser UID. When control returns
to the runner user, 'git config user.name' can no longer write .git/config
and the step exits 255.
Chown the directory back to the runner's UID/GID before resuming host-side
git operations.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Feature: add GitHub Actions workflow for publishing Arch Linux AUR package
* Fix issues in Arch Linux AUR publishing process
* Add patch to fix default Arch Linux binary path location issues
* fix: PKGBUILD compatibility with numa v0.10.1, fix QEMU action SHA pin
Three small bug fixes that make this PR mergeable end-to-end against
current main, without changing the package design (still numa-git,
still pushed on every main commit, still tracking HEAD via pkgver()):
1. Simplified prepare() — drop the obsolete sed patching for
/usr/local/bin/numa. That literal only appears in a comment
in current main; the actual binary path is determined at
runtime via std::env::current_exe(). Additionally, numa
v0.10.1 ships PR #43 which makes numa FHS-compliant on Linux
out of the box (/var/lib/numa for data dir), so no source
patching is needed at all on Arch.
2. Fixed package() sed for the systemd unit. The previous sed
targeted "ExecStart=/usr/local/bin/numa" but numa.service
actually uses "{{exe_path}}" as a templating placeholder
that's substituted at runtime by replace_exe_path() when
`numa install` runs. The sed silently did nothing, and the
AUR-installed unit file would have a literal "{{exe_path}}"
that systemd cannot start. Fixed sed:
sed 's|{{exe_path}}|/usr/bin/numa /etc/numa.toml|g' \
numa.service > numa.service.patched
3. Fixed broken docker/setup-qemu-action SHA pin in
publish-aur.yml. The pinned SHA
6882732593b27c7f95a044d559b586a46371a68e doesn't exist as
a commit in upstream docker/setup-qemu-action. Verified
v3.0.0 SHA is 68827325e0b33c7199eb31dd4e31fbe9023e06e3.
Without this fix the aarch64 validate job would fail to
load the action at workflow start.
Also refreshed the stale pkgver placeholder in PKGBUILD and
.SRCINFO from 0.9.1.r0.g1234abc to 0.10.1.r0.g0000000. This is purely
cosmetic since pkgver() auto-overrides it on every makepkg run,
but at least the checked-in value reflects the current era.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: make AUR packaging x86_64-only and stabilize local validation
It turns out that Arch Linux doesn't officially support the aarch64 architecture, so we drop it from this AUR build process.
Changes:
- drop aarch64 from PKGBUILD, .SRCINFO, and AUR validation workflow
- keep AUR process aligned with official Arch Linux x86_64 support
- install rust directly in CI to avoid Arch cargo provider prompts
- fetch sources before running cargo audit and audit inside the
fetched repo
- disable makepkg LTO for this package to avoid Arch packaging link
failures
- mark /etc/numa.toml as a backup file
- Add local AUR build scratch directory exclusion to .gitignore
* Add temporary AUR test workflow
* Update github actions checkout workflow version
* remove temporary AUR test workflow
* fix: correct AUR SSH host key fingerprint
The previously pinned ed25519 key was truncated (52 chars) and did not
match the actual aur.archlinux.org host key. Verified via ssh-keyscan.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Razvan Dimescu <ssaricu@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: human-readable advisory when TLS data_dir is not writable
When numa runs as non-root on a system with a privileged default
data_dir (e.g. /usr/local/var/numa on macOS), TLS CA setup fails with
a raw "Permission denied (os error 13)" and HTTPS proxy is silently
disabled. The user sees a cryptic warning with no path forward.
Detect std::io::ErrorKind::PermissionDenied on the tls error, print a
diagnostic naming the data_dir and offering two fixes (install as
system resolver, or point data_dir at a writable path), and keep the
graceful-degradation behavior — DNS resolution and plain-HTTP proxy
continue to work without HTTPS.
All other TLS setup errors fall through to the existing log::warn!.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: port-53 advisory also handles EACCES (non-root privileged bind)
The original port-53 match arm only caught EADDRINUSE, so a fresh
non-root user on macOS/Linux hitting EACCES when trying to bind a
privileged port saw the raw OS error instead of the advisory.
Collapse the scoping helper and the advisory into a single
`try_port53_advisory(bind_addr, &io::Error) -> Option<String>` that
returns the formatted diagnostic when both the port is 53 and the
error kind is one we can speak to (AddrInUse or PermissionDenied),
and `None` otherwise. The two failure modes share one body with a
cause-sentence variant — no duplicated fix text.
Caller becomes a plain if-let: no match guard, no separate is_port_53
helper exposed on the public API. is_port_53 goes back to private.
Unit tests cover all branches: AddrInUse, PermissionDenied, non-53
bind_addr, unrelated ErrorKind, and malformed bind_addr.
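The shape of the combined helper can be sketched as follows. The signature follows the one quoted above, with `bind_addr` assumed to be a `&str`. The advisory wording is a placeholder, not the text numa prints.

```rust
use std::io;
use std::net::SocketAddr;

/// Private scoping helper: true only for a well-formed address on port 53.
fn is_port_53(bind_addr: &str) -> bool {
    bind_addr
        .parse::<SocketAddr>()
        .map(|addr| addr.port() == 53)
        .unwrap_or(false)
}

/// Returns a diagnostic when the bind failure is one we can explain,
/// and `None` otherwise so the caller surfaces the raw error.
pub fn try_port53_advisory(bind_addr: &str, err: &io::Error) -> Option<String> {
    if !is_port_53(bind_addr) {
        return None;
    }
    // Only the cause sentence varies; the fix text is shared.
    let cause = match err.kind() {
        io::ErrorKind::AddrInUse => "another process is already listening on port 53",
        io::ErrorKind::PermissionDenied => "binding port 53 requires elevated privileges",
        _ => return None,
    };
    Some(format!(
        "cannot bind {}: {}.\nFix: run `sudo numa install`, or set bind_addr to a non-privileged port.",
        bind_addr, cause
    ))
}
```

The caller then reduces to `if let Some(msg) = try_port53_advisory(addr, &err) { ... }`.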
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: move TLS error classification into tls module
main.rs no longer downcasts a boxed error to figure out whether it's
a permission-denied case. tls::try_data_dir_advisory(&err, &dir)
encapsulates the downcast + kind match and returns Some(advisory) or
None, mirroring system_dns::try_port53_advisory. main.rs becomes a
plain if-let, symmetric with the port-53 path.
Trim the docstrings on both advisory functions: they were narrating
the implementation (errno mapping) instead of stating the contract.
Add unit tests for try_data_dir_advisory covering PermissionDenied,
other io::ErrorKind, and non-io errors.
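A sketch of the downcast-plus-kind-match contract, assuming the TLS setup error arrives as a boxed `dyn Error`. The parameter types and the advisory text are assumptions; only the function name and the `Some`/`None` contract come from the message above.

```rust
use std::error::Error;
use std::io;
use std::path::Path;

/// Returns an advisory only when the underlying error is an io::Error
/// with kind PermissionDenied; every other error yields `None`.
pub fn try_data_dir_advisory(err: &(dyn Error + 'static), data_dir: &Path) -> Option<String> {
    let io_err = err.downcast_ref::<io::Error>()?;
    if io_err.kind() != io::ErrorKind::PermissionDenied {
        return None;
    }
    // Placeholder wording; the real advisory text is not shown in this log.
    Some(format!(
        "data_dir {} is not writable. Install numa as the system resolver, \
         or set [server] data_dir to a writable path.",
        data_dir.display()
    ))
}
```

Keeping the downcast inside the tls module means main.rs never needs to know which concrete error type TLS setup produces.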
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: advisory + exit(1) when port 53 is already in use (#45)
Detect AddrInUse on bind, print a human-readable diagnostic explaining
systemd-resolved / Dnscache as the likely cause and offer two concrete
fixes (sudo numa install, or bind_addr on a non-privileged port), then
exit(1) instead of surfacing a raw OS error.
Adds tests/docker/smoke-port53.sh: end-to-end Docker test that
pre-binds port 53 with a Python UDP socket and asserts the advisory +
exit code.
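The smoke test itself uses a Python UDP socket inside Docker. The same pre-bind condition can be reproduced in Rust, shown here as an illustrative sketch rather than the test's actual code. It uses an ephemeral loopback port instead of port 53 so it runs without privileges, and the function name `rebind_error_kind` is hypothetical.

```rust
use std::io;
use std::net::UdpSocket;

/// Bind a UDP socket, then try to bind a second socket to the same
/// address while the first is still open. Returns the error kind of
/// the second attempt, or `None` if it unexpectedly succeeded.
fn rebind_error_kind() -> io::Result<Option<io::ErrorKind>> {
    let first = UdpSocket::bind("127.0.0.1:0")?; // OS picks a free port
    let addr = first.local_addr()?;
    let second = UdpSocket::bind(addr); // `first` is still alive here
    Ok(second.err().map(|e| e.kind()))
}
```

On typical platforms the second bind reports `ErrorKind::AddrInUse`, which is the kind the advisory matches on.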
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: collapse port53 advisory to single flat path
The per-platform cause sentences were cosmetic — they didn't change
the user's actions (install, or bind_addr on a non-privileged port),
but they introduced duplicated "another process..." strings, a
dead-from-CI branch (is_systemd_resolved_active() == true is never
reached by any test), and a pub visibility bump on
is_systemd_resolved_active for a single caller.
Replace with one flat format! whose cause line mentions both
systemd-resolved and the Windows DNS Client inline. The existing
smoke test now exercises 100% of the function.
is_systemd_resolved_active reverts to private.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
GitHub Actions deliberately does not propagate workflow events triggered
by the default GITHUB_TOKEN — a safety feature against infinite loops.
softprops/action-gh-release falls back to GITHUB_TOKEN when no `token`
is supplied, so the resulting `release: published` event was silently
swallowed and never reached homebrew-bump.yml.
Discovered while shipping v0.10.1: tag pushed cleanly, crates.io published
cleanly, GitHub release page created cleanly, but the brew tap never
auto-bumped. Had to trigger homebrew-bump.yml manually via
workflow_dispatch.
Fix: pass HOMEBREW_TAP_GITHUB_TOKEN explicitly. This is already a PAT
(used by homebrew-bump.yml to push cross-repo to razvandimescu/
homebrew-tap), so reusing it keeps the secret surface flat. PAT-authored
release events are the documented escape hatch from the GITHUB_TOKEN
no-propagation rule.
Applies to v0.10.2+. v0.10.1 was bumped manually.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use FHS-compliant /var/lib/numa as Linux data dir default
numa's default system-wide data directory was hardcoded to
/usr/local/var/numa for all Unix platforms. This is the right path on
macOS (Homebrew prefix convention) but non-FHS on Linux, where Arch /
Fedora / Debian / etc. expect persistent state under /var/lib/<pkg>.
The mismatch was invisible to existing users (numa creates the dir
silently on first run) but immediately surfaces when packaging for a
distro — see PR #33 (community contribution to add an Arch AUR package)
which had to add fragile sed-based path patching at PKGBUILD build time.
The fix moves the path decision into a small helper:
- daemon_data_dir() — cfg-gated platform dispatch (linux/macos)
- resolve_linux_data_dir() — pure function, takes "does X exist?"
as parameters, returns the right path
Linux behavior:
- Fresh install → /var/lib/numa (FHS)
- Upgrading from pre-v0.10.1 install → /usr/local/var/numa (legacy)
- Both paths exist → /var/lib/numa (FHS wins)
The legacy fallback is critical: existing v0.10.0 Linux users have
their CA cert + services.json under /usr/local/var/numa. Returning
the new path unconditionally would cause CA regeneration on upgrade,
breaking every browser that had trusted the previous CA. The fallback
is checked at startup via std::path::Path::exists, so the upgrade is
seamless and zero-config.
macOS behavior is unchanged — /usr/local/var/numa is still correct
because Homebrew's prefix is /usr/local.
Test coverage:
- resolve_linux_data_dir is a pure function gated cfg(any(linux,test))
so the same code path is unit-tested on every platform's CI run.
- Four tests cover all combinations of (legacy_exists, fhs_exists),
asserting the migration logic stays correct under future edits.
The default config in numa.toml is also updated to document the new
per-platform default paths.
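The decision table above fits in one small pure function. This is a sketch: the parameter names follow the (legacy_exists, fhs_exists) pair mentioned in the test notes, and the `&'static str` return type is an assumption.

```rust
/// Pure path decision: the caller supplies the two existence checks,
/// so the logic can be unit-tested on any platform.
fn resolve_linux_data_dir(legacy_exists: bool, fhs_exists: bool) -> &'static str {
    if legacy_exists && !fhs_exists {
        // Upgrade from a pre-v0.10.1 install: keep the existing CA and state.
        "/usr/local/var/numa"
    } else {
        // Fresh install, FHS-only, or both present: FHS wins.
        "/var/lib/numa"
    }
}
```

The legacy path is returned in exactly one of the four combinations, which is what keeps the previously trusted CA in place on upgrade.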
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: end-to-end FHS path verification + simplify cleanup
Two related changes from a /simplify pass and a follow-up testing
finalization:
1. lib.rs cleanup (no behavior change):
- Drop FHS_LINUX_DATA_DIR and LEGACY_LINUX_DATA_DIR consts. Both
were used in only 4 places total and the unit tests already
bypassed them with string literals, so they were over-engineering.
Inline the strings in daemon_data_dir() and resolve_linux_data_dir().
- Trim narrating doc/comments on the helper and the test bodies.
Keep only the non-obvious WHY (the macOS Homebrew note and the
migration-keeps-legacy rationale).
2. tests/docker/smoke-arch.sh:
- Cherry-picked the previously-uncommitted Arch compatibility smoke
test from feat/smoke-arch.
- Removed the [server] data_dir = "/tmp/numa-smoke" override from
the test config so the script now exercises the DEFAULT data dir
code path — which is exactly what the FHS fix touches.
- Added a path assertion after the dig succeeds: verify that
/var/lib/numa/ca.pem exists (FHS) and /usr/local/var/numa is
absent (no accidental dual-creation on a fresh install).
Verified end-to-end on archlinux:latest (Apple Silicon, Rosetta):
── building + running numa on archlinux:latest ──
── cargo build --release --locked ──
Finished `release` profile [optimized] target(s) in 24.02s
── dig @127.0.0.1 -p 5354 google.com A ──
142.251.38.206
── FHS path check ──
✓ CA cert at /var/lib/numa/ca.pem (FHS path)
✓ legacy path /usr/local/var/numa absent (fresh install used FHS)
── smoke-arch passed ──
This closes the testing gap where the unit tests covered the
path-decision LOGIC in isolation but nothing exercised the live
wiring on a real Linux filesystem.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The deprecated `launchctl load -w` returns exit code 0 even when it
cannot actually reload a service whose label is already in launchd's
in-memory state. It prints `Load failed: 5: Input/output error` to
stderr but exits 0, so the install path interprets it as success and
continues — silently leaving the running daemon on whatever binary
was first loaded, even though the on-disk plist now points elsewhere.
The consequence: every macOS user running `brew upgrade numa` rewrites
the plist to point at the new binary, but launchctl never actually
loads it. They think they upgraded; they're still running the old
version. Neither #41 (cross-platform CA trust) nor #40 (self-referential
backup) would actually take effect for them until they manually run:
sudo launchctl bootout system /Library/LaunchDaemons/com.numa.dns.plist
sudo launchctl bootstrap system /Library/LaunchDaemons/com.numa.dns.plist
The fix uses the modern API symmetrically across all three call sites:
- install_service_macos: bootout (best-effort cleanup, no-op on first
install) → bootstrap → wait for readiness → configure DNS
- install_service_macos rollback path: bootout instead of `unload`
- uninstall_service_macos: bootout BEFORE remove_file (the modern API
needs the plist file path as the specifier; doing it after remove
would leave the service in memory until reboot)
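The bootout-then-bootstrap sequence can be sketched with `std::process::Command`. The helper names `launchctl` and `reload_service` are hypothetical, and the real install path also waits for readiness and configures DNS, which is omitted here.

```rust
use std::io;
use std::process::Command;

/// Build `launchctl <subcommand> system <plist>` without running it.
fn launchctl(subcommand: &str, plist: &str) -> Command {
    let mut cmd = Command::new("launchctl");
    cmd.args([subcommand, "system", plist]);
    cmd
}

/// Reload a system daemon so launchd picks up the plist on disk.
/// Returns Ok(true) when bootstrap exits successfully.
fn reload_service(plist: &str) -> io::Result<bool> {
    // Best-effort cleanup: the result is ignored because bootout is
    // expected to fail on a first install, when nothing is loaded yet.
    let _ = launchctl("bootout", plist).status();
    Ok(launchctl("bootstrap", plist).status()?.success())
}
```

Checking the exit status of `bootstrap` is the point of the change: `load -w` returned 0 even when it failed.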
No new tests — this is a shell-call substitution with no logic to
unit-test. Verified manually on macOS: `sudo numa install` no longer
prints `Load failed`, and the daemon is correctly running the binary
the plist points at.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: prevent self-referential DNS backup on re-install
The install flow previously captured current system DNS servers
verbatim into the backup file. If numa was already installed, current
DNS was 127.0.0.1, so the "backup" recorded 127.0.0.1 as the "original"
— making a subsequent uninstall a no-op self-reference.
Reproduced 2026-04-08 during v0.10.0 brew dogfood: after
`sudo numa uninstall; sudo /opt/homebrew/bin/numa install`,
`sudo numa uninstall` printed `restored DNS for "Wi-Fi" -> 127.0.0.1`
because the brew binary's install step had overwritten the backup with
the already-stub state.
Fix (all three platforms):
- macOS/Windows: if the existing backup already contains at least one
non-loopback/non-stub upstream, preserve it as-is. If writing a fresh
backup, filter loopback/stub addresses first so a capture from
already-numa-managed state isn't self-referential.
- Linux (resolv.conf fallback path): detect numa-managed or all-loopback
resolv.conf content and skip the file copy in that case; preserve an
existing useful backup rather than overwriting it. systemd-resolved
path is unaffected (uses a drop-in, no backup file).
Adds three unit tests for the predicates: macOS HashMap detection,
Windows interface filter, and resolv.conf parsing (real upstream,
self-referential, numa-marker, systemd stub, mixed).
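A sketch of the two backup predicates for the macOS-shaped backup (interface name mapped to a list of servers). `is_loopback_or_stub` is named later in this log; its body here, the name `filter_for_backup`, and the choice to treat unparsable entries as not useful are assumptions.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// True for addresses that would point DNS back at the local machine.
/// Covers 127.0.0.0/8 (including the 127.0.0.53 stub), ::1, 0.0.0.0 and ::.
fn is_loopback_or_stub(addr: &str) -> bool {
    match addr.parse::<IpAddr>() {
        Ok(ip) => ip.is_loopback() || ip.is_unspecified(),
        Err(_) => true, // assumption: an unparsable entry is not a usable upstream
    }
}

/// An existing backup is worth preserving if any interface lists
/// at least one real upstream.
fn backup_has_real_upstream(backup: &HashMap<String, Vec<String>>) -> bool {
    backup.values().flatten().any(|s| !is_loopback_or_stub(s))
}

/// Filter a fresh capture so an already-numa-managed state is never
/// recorded as the "original" DNS.
fn filter_for_backup(mut servers: Vec<String>) -> Vec<String> {
    servers.retain(|s| !is_loopback_or_stub(s));
    servers
}
```

Install would consult `backup_has_real_upstream` before overwriting, and apply `filter_for_backup` only when writing a fresh file.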
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: share iter_nameservers helper and reuse resolv.conf content
Post-review simplifications on the stale-backup fix:
- Extract iter_nameservers(&str) helper used by both parse_resolv_conf
and resolv_conf_has_real_upstream. Eliminates the duplicated
line-by-line nameserver parsing (findings from reuse review).
- install_linux: reuse the already-read resolv.conf content via
std::fs::write instead of a second read via std::fs::copy.
- install_macos / install_windows: flatten the conditional eprintln
pattern — always print a blank line, conditionally print the save
message. Equivalent output, less branching.
Net −12 lines. All 130 tests still pass, clippy clean.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: drop redundant trim before split_whitespace
CI caught `clippy::trim_split_whitespace` on Rust 1.94: `split_whitespace()`
already skips leading/trailing whitespace, so `.trim()` first is redundant.
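A sketch of the shared helper after the clippy fix. The three function names come from the commits above, but their signatures and bodies are assumptions, and the loopback check is simplified to a fixed list of strings.

```rust
/// Yield the address field of every `nameserver` line.
fn iter_nameservers<'a>(content: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    content.lines().filter_map(|line| {
        // split_whitespace() already skips leading and trailing whitespace,
        // so calling .trim() first would be redundant.
        let mut fields = line.split_whitespace();
        match (fields.next(), fields.next()) {
            (Some("nameserver"), Some(addr)) => Some(addr),
            _ => None,
        }
    })
}

/// First consumer: collect every listed server.
fn parse_resolv_conf(content: &str) -> Vec<String> {
    iter_nameservers(content).map(str::to_owned).collect()
}

/// Second consumer: is there at least one non-local server?
fn resolv_conf_has_real_upstream(content: &str) -> bool {
    iter_nameservers(content)
        .any(|addr| !matches!(addr, "127.0.0.1" | "127.0.0.53" | "::1"))
}
```

Both consumers share one line parser, so indented or tab-separated entries are handled identically in each.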
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: extract load_backup helper
Remove duplicated read+deserialize boilerplate shared by install_macos
and install_windows. The two call sites each had an identical 4-line
chain of read_to_string().ok().and_then(serde_json::from_str).ok() —
collapse into a single generic helper load_backup<T>().
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Revert "refactor: extract load_backup helper"
This reverts commit a54fb99428.
* test: drop windows_backup_filters_loopback
The test inlined the 3-line filter block from install_windows rather
than calling a production helper, so it was testing stdlib Vec::retain
+ is_loopback_or_stub — both already covered elsewhere. Deleting it
removes a test that would silently pass even if install_windows stopped
filtering altogether.
The predicate logic for macOS-shaped backups stays covered by
macos_backup_real_upstream_detection (same inner Vec<String> type).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add windows_backup_filters_loopback unit test
The PR description mentioned this test but it was missing from the
diff, leaving backup_has_real_upstream_windows untested. Mirrors the
shape of macos_backup_real_upstream_detection: empty map → false,
all-loopback (127.0.0.1, ::1, 0.0.0.0) → false, one real entry
alongside loopback → true.
Also relax the cfg gate on backup_has_real_upstream_windows from
cfg(windows) to cfg(any(windows, test)) so the test compiles
cross-platform, matching how backup_has_real_upstream_macos and
the resolv_conf helpers are gated.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: cross-platform CA trust (Arch/Fedora + Windows)
Closes #35.
trust_ca_linux now detects which trust store the distro ships and
runs the matching refresh command, instead of hardcoding Debian's
update-ca-certificates. Detection walks a const table in priority
order, picking the first whose anchor dir exists:
- debian: /usr/local/share/ca-certificates (update-ca-certificates)
- pki: /etc/pki/ca-trust/source/anchors (update-ca-trust extract)
- p11kit: /etc/ca-certificates/trust-source/anchors (trust extract-compat)
Falls back with a clear error listing every backend tried.
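The priority-ordered table can be sketched like this. The backend names, anchor directories, and refresh commands are taken from the list above. The struct layout, the name `detect_backend`, and the injected `dir_exists` predicate (used so the logic is testable without touching the filesystem) are assumptions.

```rust
use std::path::Path;

#[derive(Debug)]
struct TrustBackend {
    name: &'static str,
    anchor_dir: &'static str,
    refresh_cmd: &'static [&'static str],
}

/// Checked in order; the first backend whose anchor dir exists wins.
const BACKENDS: &[TrustBackend] = &[
    TrustBackend {
        name: "debian",
        anchor_dir: "/usr/local/share/ca-certificates",
        refresh_cmd: &["update-ca-certificates"],
    },
    TrustBackend {
        name: "pki",
        anchor_dir: "/etc/pki/ca-trust/source/anchors",
        refresh_cmd: &["update-ca-trust", "extract"],
    },
    TrustBackend {
        name: "p11kit",
        anchor_dir: "/etc/ca-certificates/trust-source/anchors",
        refresh_cmd: &["trust", "extract-compat"],
    },
];

/// Pick the trust backend, or return an error naming every backend tried.
fn detect_backend(dir_exists: impl Fn(&Path) -> bool) -> Result<&'static TrustBackend, String> {
    BACKENDS
        .iter()
        .find(|b| dir_exists(Path::new(b.anchor_dir)))
        .ok_or_else(|| {
            let tried: Vec<&str> = BACKENDS.iter().map(|b| b.name).collect();
            format!("no supported CA trust store found (tried: {})", tried.join(", "))
        })
}
```

In production the predicate would be something like `|p| p.is_dir()`; supporting another distro family then means adding one table row.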
Adds Windows support via certutil -addstore Root / -delstore Root,
removing the silent CA-trust gap on numa install (previously the
service installed but the trust step quietly errored, leaving every
HTTPS .numa request throwing browser warnings).
Refactor: trust_ca and untrust_ca are now thin dispatchers calling
per-platform helpers. CA_COMMON_NAME and CA_FILE_NAME are centralized
in tls.rs and reused from system_dns.rs and api.rs. untrust_ca_linux
no longer pre-checks file existence (TOCTOU) and skips the refresh
when no file was actually removed.
Test: tests/docker/install-trust.sh runs the install/uninstall
contract against debian:stable, fedora:latest, and archlinux:latest
in containers, asserting the cert lands in (and is removed from)
the system bundle. All three pass locally.
README notes the Firefox/NSS limitation (separate trust store).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: rustfmt fixes for trust_ca_linux helpers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: macOS CA trust contract test (manual)
Adds tests/manual/install-trust-macos.sh — a sudo bash script that
mirrors trust_ca_macos / untrust_ca_macos against a fixture cert with
a unique CN. Designed to coexist with a running production numa:
- Refuses to run if a real "Numa Local CA" is already in System.keychain
(fail-closed protection for dogfood installs)
- Uses a unique CN ("Numa Local CA Test <pid-timestamp>") so the test
cert can never collide with production
- Mirrors the by-hash deletion loop from untrust_ca_macos
- Trap-cleanup on success or interrupt
Lives under tests/manual/ to signal "host-mutating, dev-only" — distinct
from tests/docker/install-trust.sh which is hermetic.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: relax bail-out in macOS trust test (safe alongside production)
The bail-out was overly defensive. The test cert uses a unique CN
("Numa Local CA Test <pid-ts>") that is strictly longer than the
production CN, so `security find-certificate -c $TEST_CN` cannot
substring-match the production cert. All deletes are by-hash, which
can only target the test cert's specific hash. Coexistence is
provably safe; document the reasoning in the header comment block
and replace the refusal with an informational notice.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a workflow that runs on release:published (and via manual
workflow_dispatch), fetches sha256 checksums from the published release
assets, and rewrites razvandimescu/homebrew-tap/numa.rb in place:
version, URL paths, and sha256 lines after each url. The formula's
existing on_macos/on_linux structure is preserved.
Uses HOMEBREW_TAP_GITHUB_TOKEN (already set as a repo secret) to push
directly to the tap's main branch.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expands the DoT paragraph to make the trust model explicit. The
previous version said "self-signed or bring your own cert" without
explaining when to pick which or what the user experience looks like.
The two modes close numa's gap vs AdGuard Home: BYO cert mode is
functionally identical (Let's Encrypt via DNS-01 + cert_path/key_path),
and the self-signed mode is numa's advantage on LAN-only deploys.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds DoT to the four existing touchpoints in the README where the
feature naturally belongs:
- Hero paragraph: mentions DoT alongside DNSSEC as a headline feature
- Ad Blocking & Privacy section: dedicated paragraph with RFC 7858
reference, config hint, and the ALPN strictness guarantee
- Comparison table: new "Encrypted clients (DoT listener)" row.
Pi-hole is listed as "Needs stunnel sidecar" (verified: Pi-hole explicitly
closed the native-DoT feature request as out of scope; the community
uses stunnel or AdGuard DNS Proxy as a TLS terminator)
- Roadmap: checks off "DNS-over-TLS listener" alongside the existing
DoH entry
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v0.10.0 ships DNS-over-TLS. Tagging v0.10.0 on main after the
merge will pick up this Cargo.toml version, keeping tag and manifest
aligned for release.yml.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverts the NUMA_DATA_DIR env var added in the previous commit and
replaces it with a [server] data_dir TOML field. Numa already has a
well-developed config system; adding a parallel env-var mechanism
for a single knob was wrong.
The principle: TOML is for application behavior configuration. Env
vars are for bootstrap values (HOME, SUDO_USER to discover paths
before config loads) and standard ecosystem conventions (RUST_LOG).
data_dir is neither — it's an app knob, so it belongs in the TOML.
Changes:
- lib.rs::data_dir() reverts to the platform-specific fallback only
- config.rs adds `data_dir: Option<PathBuf>` to ServerConfig
- main.rs resolves config.server.data_dir with fallback to
numa::data_dir() and passes it to build_tls_config, then stores the
resolved path on ctx.data_dir for downstream consumers
- tls.rs::build_tls_config takes `data_dir: &Path` as an explicit
parameter instead of calling crate::data_dir() behind the caller's
back. regenerate_tls and dot.rs self_signed_tls now pass
&ctx.data_dir, honoring whatever path the config resolved to
- tests/integration.sh Suite 6 uses `data_dir = "$NUMA_DATA"` in its
test TOML instead of the NUMA_DATA_DIR env var prefix
- numa.toml gains a commented-out data_dir example
No behavior change for existing production deployments (the default
path is unchanged). Test harness is now fully config-driven, and
containerized deploys can override data_dir via mount+config without
needing env var injection.
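The resolution order (config value first, platform default second) can be sketched as below. `ServerConfig` is reduced to the one field, serde derives are omitted, and `default_data_dir` is a stand-in for numa::data_dir(). `resolve_data_dir` and `ca_path` are hypothetical names.

```rust
use std::path::{Path, PathBuf};

#[derive(Default)]
struct ServerConfig {
    /// `[server] data_dir` in the TOML; absent means "use the platform default".
    data_dir: Option<PathBuf>,
}

/// Stand-in for numa::data_dir(); the real function is platform-specific.
fn default_data_dir() -> PathBuf {
    PathBuf::from("/usr/local/var/numa")
}

/// Resolve once at startup and pass the result down explicitly.
fn resolve_data_dir(server: &ServerConfig) -> PathBuf {
    server.data_dir.clone().unwrap_or_else(default_data_dir)
}

/// Downstream consumers take the resolved path as a parameter instead
/// of looking up a global default themselves.
fn ca_path(data_dir: &Path) -> PathBuf {
    data_dir.join("ca.pem")
}
```

Passing `&Path` explicitly is what lets the test harness and container deploys redirect storage through config alone.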
127/127 unit tests pass, Suite 6 passes end-to-end.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds integration test coverage for the realistic production shape
where both the HTTPS proxy and DoT are enabled simultaneously. This
was previously untested — every existing suite had either one or the
other, so the interaction path was implicit.
What Suite 6 verifies:
- Both listeners bind without panic
- DoT still resolves queries with the proxy enabled
- Proxy HTTPS handshake still works with DoT enabled
- Both certs validate against the same shared CA
To run non-root, adds a NUMA_DATA_DIR env var override to data_dir()
that lets callers point the CA/cert storage at any writable path.
Useful beyond tests: containerized deployments, CI runners, dev
testing without sudo. The fallback is the existing platform-specific
path (unix: /usr/local/var/numa, windows: %PROGRAMDATA%\numa).
Suite 6 sets NUMA_DATA_DIR=/tmp/numa-integration-data before
starting numa, then trusts the generated CA at $NUMA_DATA_DIR/ca.pem
for both kdig (DoT query) and openssl s_client (HTTPS proxy
handshake) verification.
All 6 suites, 32 checks, run non-root and pass locally.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds tests/integration.sh Suite 5 (DoT via kdig + openssl) and
fixes a startup panic caught by it.
Bug: when [dot] cert_path/key_path was set AND [proxy] was disabled,
numa panicked on the first DoT handshake with "Could not
automatically determine the process-level CryptoProvider from Rustls
crate features". In normal deployments the proxy's build_tls_config
installs the default provider as a side effect, masking the missing
call in dot.rs::load_tls_config. Disable the proxy and the panic
surfaces. Fix: call
rustls::crypto::ring::default_provider().install_default() at the
top of load_tls_config (no-op if already installed).
Suite 5 exercises:
- DoT listener binds on configured port
- Resolves a local zone A record over TLS (kdig +tls)
- Persistent connection reuse (kdig +keepopen, 3 queries, 1 handshake)
- ALPN "dot" negotiation (openssl s_client -alpn dot)
- ALPN mismatch rejected with no_application_protocol (openssl -alpn h2)
Uses a pre-generated cert at /tmp so the test runs non-root.
Skips gracefully if kdig or openssl aren't installed.
Also: Dockerfile now EXPOSE 853/tcp so docker run -p 853:853 works
out of the box when users enable DoT.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds dot_rejects_non_dot_alpn to assert the rustls server enforces
ALPN strictness rather than silently accepting a mismatched
negotiation. This is the load-bearing behavior behind the cross-
protocol confusion defense — without enforcement, the ALPN "dot"
advertisement is just a sign hung on an unlocked door.
Refactors test_tls_configs to return the leaf cert DER instead of a
prebuilt client config, and adds a dot_client(cert_der, alpn) helper
so each test can build a client config with the ALPN list it needs.
The five existing DoT tests gain one line each to call dot_client
with dot_alpn(); behavior unchanged.
127/127 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
self_signed_tls was passing an empty service_names list, so the
generated cert only had the *.numa wildcard SAN. Strict TLS clients
(browsers, possibly some iOS versions) reject wildcards under
single-label TLDs — see the existing comment in tls.rs explaining
why the proxy lists each service explicitly.
setup-phone's mobileconfig sends ServerName "numa.numa" as SNI, so
the DoT cert must have an explicit numa.numa SAN. Pass proxy_tld
itself as a service name, mirroring how main.rs already registers
"numa" as a service for the proxy's TLS cert.
Test fixture updated to mirror the production SAN shape (*.numa +
numa.numa) and switched the client to SNI "numa.numa", so the
existing DoT test suite implicitly exercises the SNI path used by
setup-phone clients.
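As a rough illustration of the SAN shape described above (the wildcard plus one explicit entry per service), here is a sketch. `san_names` is a hypothetical helper, not the function in tls.rs; the real code builds the cert through its own TLS helpers.

```rust
/// Build the SAN list for a cert: the `*.<tld>` wildcard plus one
/// explicit `<service>.<tld>` entry per service name, without duplicates.
fn san_names(tld: &str, service_names: &[&str]) -> Vec<String> {
    let mut sans = vec![format!("*.{tld}")];
    for name in service_names {
        let fqdn = format!("{name}.{tld}");
        if !sans.contains(&fqdn) {
            sans.push(fqdn);
        }
    }
    sans
}
```

Passing the TLD itself as a service name (`san_names("numa", &["numa"])`) yields `*.numa` and `numa.numa`, which is the shape the fix needs. An empty list yields only the wildcard, which is the bug case.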
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both were restating what the code already said — dot_alpn's doc
narrated the function name and the test comment restated the
assertion. RFC 7858 §3.2 is already cited on self_signed_tls and
build_tls_config where the "why" actually matters.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two DoS/interop hardening items:
1. Bound write_framed by WRITE_TIMEOUT (10s) so a slow-reader
attacker can't indefinitely hold a worker task and its connection
permit. Symmetric to the existing handshake timeout.
2. Advertise ALPN "dot" per RFC 7858 §3.2. Required by some strict
DoT clients (newer Apple stacks, some Android versions). rustls
ServerConfig exposes alpn_protocols as a pub field so we set it
after with_single_cert:
- load_tls_config (user-provided cert/key): set directly
- self_signed_tls (new, replaces fallback_tls): builds a fresh
DoT-specific TLS config via build_tls_config with the ALPN list
build_tls_config now takes an `alpn: Vec<Vec<u8>>` parameter so
DoT and the proxy can pass different ALPN lists while sharing the
same CA. Proxy callers pass Vec::new() (unchanged behavior).
Dropped the ctx.tls_config reuse branch: we can't mutate a shared
Arc<ServerConfig> to add DoT-specific ALPN, and reusing the proxy
config was already quietly broken on SANs (proxy cert covers
*.{tld}, not the DoT server's bind hostname/IP).
Added dot_negotiates_alpn test that asserts conn.alpn_protocol()
returns Some(b"dot") after handshake. 126/126 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Matches the style of the other opt-in sections (blocking, dnssec, lan).
Documents all five DotConfig fields with their defaults.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Collapse two 4-arm read/timeout matches to let-else (lose one
defensive debug log on payload-read timeout; idle timeouts are
routine on persistent DoT connections anyway)
- Drop MIN_MSG_LEN: DnsPacket::from_buffer rejects truncated input
on its own, and BytePacketBuffer is zero-init so buf[0..2] for
sub-2-byte messages just yields a harmless FORMERR with id=0
- Inline ACCEPT_ERROR_BACKOFF (single use site)
- Drop the partial cert/key warning: missing one of cert_path/
key_path silently falls back to self-signed; users see the
self-signed cert at startup and figure it out
- Drop dot_localhost_resolution test: RFC 6761 localhost is tested
in ctx.rs; this test only verified DoT transport, which
dot_resolves_local_zone already covers
- Drop self-documenting comment in dot_multiple_queries_on_persistent_connection
Net -32 lines, 125/125 tests pass, no behavior change users would notice.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Flatten 4-arm cert/key match in start_dot to 2 arms with the
partial-config warning hoisted into a one-liner above the match.
- Extract send_response() that serializes a DnsPacket and writes it
framed, used by both the FORMERR-on-parse-error and SERVFAIL-on-
resolve-error paths. Removes duplicated buffer/write/log boilerplate
and unifies the rescode logging via {:?}.
No behavior change; 126/126 tests still pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address review findings on PR #25:
- Refactor resolve_query to take a pre-parsed DnsPacket. Parse-error
handling moves to the UDP caller, eliminating the double warn! line
on malformed UDP queries.
- Enforce MIN_MSG_LEN=12 (DNS header) in handle_dot_connection so
query_id extraction is always reading client-sent bytes, not the
zeroed buffer tail.
- Parse the DoT query before calling resolve_query and retain it, so
SERVFAIL responses can echo the original question section via
response_from(). Parse failures send FORMERR with the client id.
- Extract write_framed() helper for length-prefix + flush, reused by
success, SERVFAIL, and FORMERR paths.
- Back off 100ms on listener.accept() errors to avoid tight-looping
on fd exhaustion.
- Replace the hardcoded 127.0.0.1:53 upstream in dot_nxdomain_for_unknown
with a bound-but-unresponsive UDP socket owned by the test, making it
independent of the host's local resolver. Test now runs in ~220ms
(timeout lowered to 200ms) instead of 3s and asserts the question is
echoed in the SERVFAIL response.
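The framing and minimum-length rules from this commit can be sketched with blocking std I/O. This is an illustration only: the real handlers are async and run over a TLS stream, and `read_framed` and `query_id` are hypothetical names. The 2-byte big-endian length prefix is the standard DNS-over-TCP/TLS framing.

```rust
use std::io::{self, Read, Write};

/// A DNS header is 12 bytes; anything shorter cannot carry a valid query.
const MIN_MSG_LEN: usize = 12;

/// Write one DNS message with a 2-byte big-endian length prefix, then flush.
/// Prefix and payload go out in a single write so they are not split.
fn write_framed<W: Write>(w: &mut W, msg: &[u8]) -> io::Result<()> {
    let len = u16::try_from(msg.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "message too long"))?;
    let mut out = Vec::with_capacity(2 + msg.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(msg);
    w.write_all(&out)?;
    w.flush()
}

/// Read one length-prefixed message (hypothetical counterpart for the sketch).
fn read_framed<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 2];
    r.read_exact(&mut len_buf)?;
    let mut msg = vec![0u8; u16::from_be_bytes(len_buf) as usize];
    r.read_exact(&mut msg)?;
    Ok(msg)
}

/// Extract the query id only when a full header is present, so the id
/// always comes from client-sent bytes and never from zeroed buffer tail.
fn query_id(msg: &[u8]) -> Option<u16> {
    if msg.len() < MIN_MSG_LEN {
        return None;
    }
    Some(u16::from_be_bytes([msg[0], msg[1]]))
}
```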
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add 10s timeout on TLS handshake — prevents clients from holding a
semaphore permit without completing the handshake
- Add IDLE_TIMEOUT on payload read_exact — prevents slowloris after
sending a valid length prefix then trickling bytes
- Extract accept_loop() shared between start_dot and tests — eliminates
duplicated accept logic that could drift
- Add 5s timeout on TCP reads in recursive test mock server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Send SERVFAIL response (with correct query ID) when resolve_query
fails, preventing DoT clients from hanging until idle timeout
- Extract handle_dot_connection() so tests use the same logic as
production, eliminating duplicated accept/read/resolve loop
- Replace magic 4096 with named MAX_MSG_LEN constant tied to BUF_SIZE
- Add flush() after each TLS write to prevent buffered responses
- Extract fallback_tls() helper, handle partial cert/key config,
support IPv6 bind address, remove redundant crypto provider init
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Refactor handle_query into transport-agnostic resolve_query that returns
a BytePacketBuffer, keeping the UDP path zero-alloc. Add a TLS listener
on port 853 with persistent connections, idle timeout, connection limits,
and coalesced writes. Supports user-provided certs or self-signed CA
fallback. Includes 5 integration tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Gate exe_path in restart_service() and replace_exe_path() behind
#[cfg(any(target_os = "macos", target_os = "linux"))] to fix
unused variable and dead code warnings on Windows
- Add macOS CI job (clippy + tests)
- Add test for template substitution in plist and systemd unit files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Conditional forwarding (Tailscale .ts.net, VPC private zones) was
only checked in the forward mode branch. In recursive mode, queries
for forwarding-rule domains went to root servers instead of the
configured upstream, returning NXDOMAIN for private domains.
Move the forwarding rule check before the recursive/forward branch
so it takes priority regardless of mode.
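The ordering change can be sketched as below. All type and function names here (`Mode`, `Route`, `ForwardRule`, `route`) are hypothetical and only illustrate the control flow; the upstream address used in the usage note is an example value.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Forward,
    Recursive,
}

#[derive(Debug, PartialEq, Eq)]
enum Route<'a> {
    /// Send to the upstream configured for a matching forwarding rule.
    Conditional(&'a str),
    /// Send to the default upstream (forward mode).
    DefaultUpstream,
    /// Resolve iteratively from the root servers (recursive mode).
    Recursive,
}

struct ForwardRule {
    suffix: String,
    upstream: String,
}

/// Case-insensitive match on whole labels, ignoring a trailing dot.
fn matches_suffix(qname: &str, suffix: &str) -> bool {
    let q = qname.trim_end_matches('.').to_ascii_lowercase();
    let s = suffix.trim_matches('.').to_ascii_lowercase();
    q == s || q.ends_with(&format!(".{s}"))
}

/// Forwarding rules are checked first, so they win in every mode.
fn route<'a>(qname: &str, rules: &'a [ForwardRule], mode: Mode) -> Route<'a> {
    if let Some(rule) = rules.iter().find(|r| matches_suffix(qname, &r.suffix)) {
        return Route::Conditional(&rule.upstream);
    }
    match mode {
        Mode::Forward => Route::DefaultUpstream,
        Mode::Recursive => Route::Recursive,
    }
}
```

With a rule for `ts.net`, a query for `host.tailnet.ts.net` returns `Route::Conditional` in both modes, while `example.com` still falls through to the mode-specific branch.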
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Chrome treats single-label TLDs (e.g. frontend.numa) as search
queries unless a trailing slash is added. Adding "numa" as a search
domain tells the OS resolver that .numa is valid, so browsers
resolve it directly.
macOS: networksetup -setsearchdomains, cleared on uninstall
Linux (systemd-resolved): Domains=~. numa in a drop-in
Linux (resolv.conf): search numa
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New: Windows DNS configuration (install/uninstall/auto-start).
Fix: DoH fallback uses IP to avoid DNS bootstrap loop.
Fix: UDP ConnectionReset crash on Windows.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: Windows DNS configuration via netsh
numa install/uninstall now set/restore system DNS on Windows via
netsh. Parses ipconfig /all per-interface (adapter name, DHCP status,
DNS servers), saves backup to %APPDATA%\numa\original-dns.json, and
restores on uninstall (DHCP or static with secondary servers).
Handles localization (German adapter/DHCP/DNS labels), disconnected
adapters, multiple interfaces, and missing admin privileges. Adds IP
validation to discover_windows() for consistency.
No Windows Service or CA trust yet — user runs numa in a terminal.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* ci: add cargo test to Windows CI job
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ci: upload Windows binary as artifact for testing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: SRTT decay tests panic on Windows due to Instant underflow
On Windows, Instant starts near boot time — subtracting large
durations panics. Use checked_sub with a process-start fallback.
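A minimal sketch of the checked_sub pattern, assuming a helper named `instant_ago` (hypothetical) and a caller-supplied fallback such as an instant captured at process start:

```rust
use std::time::{Duration, Instant};

/// Return `now - age`, or `floor` when that instant cannot be represented
/// (for example, when `age` reaches back before the platform's epoch).
/// `Instant - Duration` panics in that case; `checked_sub` returns None.
fn instant_ago(now: Instant, age: Duration, floor: Instant) -> Instant {
    now.checked_sub(age).unwrap_or(floor)
}
```

The fallback keeps the result well-defined on every platform, at the cost of clamping very old ages to the fallback instant.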
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: SRTT decay tests use binary search for max Instant age
Replace age() helper with set_age_secs() on SrttCache that
binary-searches for the maximum subtractable duration. Prevents
panic on Windows (Instant starts at boot) while still producing
the oldest representable instant for correct decay calculations.
Also removes ephemeral test-ubuntu.sh from git.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use ProgramData for Windows DNS backup path
APPDATA differs between user and admin contexts — install runs as
admin but uninstall might resolve a different APPDATA. Use
ProgramData which is consistent across elevation contexts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: disable Dnscache on Windows install, re-enable on uninstall
Windows DNS Client (Dnscache) holds port 53 at kernel level and
can't be stopped via sc/net stop. Disable via registry during
install (requires reboot), re-enable on uninstall.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: rewrite SRTT decay tests as pure functions
Decay tests manipulated Instant timestamps which panics on Windows
(Instant can't go before boot time). Rewrite to test decay_for_age()
directly — a pure function taking srtt_ms and age_secs, no platform
dependency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use Quad9 IP (9.9.9.9) for DoH fallback, not hostname
DoH to dns.quad9.net requires DNS to resolve the hostname, which
creates a chicken-and-egg loop when numa IS the system resolver
(e.g. after numa install on Windows). Using the IP directly avoids
the bootstrap dependency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: extract DOH_FALLBACK constant
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: extract QUAD9_IP constant
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: remove dead test helpers, fix constant placement
Remove unused get_srtt_ms() and saturated_penalty_cache() left over
from SRTT test rewrite. Move QUAD9_IP/DOH_FALLBACK after use block.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: ignore ConnectionReset on UDP socket (Windows ICMP error)
Windows delivers ICMP port-unreachable as ConnectionReset on the
next UDP recv_from, crashing numa. Linux/macOS silently ignore these.
Catch and continue the recv loop.
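A rough sketch of the catch-and-continue shape, using blocking std sockets for brevity. The real loop is async, and `is_ignorable_udp_error` and `recv_next` are hypothetical names.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// ConnectionReset on a UDP socket reflects an ICMP error for an earlier
/// send, not a failure of the listening socket itself.
fn is_ignorable_udp_error(err: &io::Error) -> bool {
    err.kind() == io::ErrorKind::ConnectionReset
}

/// Receive one datagram. Ok(None) means "ignorable error, call again";
/// any other error is passed up to the caller.
fn recv_next(socket: &UdpSocket, buf: &mut [u8]) -> io::Result<Option<(usize, SocketAddr)>> {
    match socket.recv_from(buf) {
        Ok(received) => Ok(Some(received)),
        Err(e) if is_ignorable_udp_error(&e) => Ok(None),
        Err(e) => Err(e),
    }
}
```

The caller's loop treats `Ok(None)` as a reason to continue rather than exit, which is the behavior change the commit describes.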
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: auto-start numa on Windows boot via registry Run key
Without a Windows Service, rebooting after numa install leaves DNS
broken (pointing at 127.0.0.1 with nothing listening). Register
numa in HKLM\...\Run so it starts automatically. Removed on
uninstall.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update README, Windows plan, and launch drafts for Windows support
- README: platform-specific Quick Start, install/uninstall table
- Windows plan: Phase 2 complete, Phase 3 scoped
- Launch drafts: updated "Does it support Windows?" response
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove docs from git tracking (already gitignored)
docs/ is in .gitignore but files were force-added. Remove from
tracking — files remain on disk.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Breaking: default mode changed from auto to forward.
New: memory footprint stats + dashboard panel.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add memory footprint to /stats and dashboard
Per-structure heap estimation (cache, blocklist, query log, SRTT,
overrides) with process RSS via mach_task_basic_info / sysconf.
Dashboard gets a 6th stat card and a sidebar breakdown panel with
stacked bar visualization.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use phys_footprint on macOS to match Activity Monitor
Switch from MACH_TASK_BASIC_INFO (resident_size) to TASK_VM_INFO
(phys_footprint) which matches Activity Monitor's Memory column.
Also: capacity-aware heap estimation, entry counts in memory payload,
heap_bytes tests for all stores.
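"Capacity-aware" here presumably means counting allocated capacity rather than the number of live elements. A minimal sketch of that idea, with hypothetical helper names that are not the project's heap_bytes implementations:

```rust
use std::mem::size_of;

/// Estimate the heap bytes held by a Vec's buffer. Uses capacity, not len,
/// because the allocator has reserved space for every slot.
fn vec_heap_bytes<T>(v: &Vec<T>) -> usize {
    v.capacity() * size_of::<T>()
}

/// Same idea for a String: its buffer is `capacity()` bytes.
fn string_heap_bytes(s: &String) -> usize {
    s.capacity()
}
```

This only counts the direct buffer. Elements that own further allocations (for example a `Vec<String>`) would need their own estimates summed on top.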
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: remove redundant fields and fix naming in memory stats
Remove duplicate entry counts from MemoryStats (already in parent
StatsResponse), rename process_rss_bytes to process_memory_bytes
to match macOS phys_footprint semantics, drop restating comments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>