feat: TCP fallback, query minimization, UDP auto-disable

Transport resilience for restrictive networks (ISPs blocking UDP:53):
- DNS-over-TCP fallback: UDP fail/truncation → automatic TCP retry
- UDP auto-disable: after 3 consecutive failures, switch to TCP-first
- IPv6 → TCP directly (UDP socket binds 0.0.0.0, can't reach IPv6)
- Network change resets UDP detection for re-probing
- Root hint rotation in TLD priming

Privacy:
- RFC 7816 query minimization: root servers see TLD only, not full name

Code quality:
- Merged find_starting_ns + find_starting_zone → find_closest_ns
- Extracted resolve_ns_addrs_from_glue shared helper
- Removed overall timeout wrapper (per-hop timeouts sufficient)
- forward_tcp for DNS-over-TCP (RFC 1035 §4.2.2)

Testing:
- Mock TCP-only DNS server for fallback tests (no network needed)
- tcp_fallback_resolves_when_udp_blocked
- tcp_only_iterative_resolution
- tcp_fallback_handles_nxdomain
- udp_auto_disable_resets
- Integration test suite (4 suites, 51 tests)
- Network probe script (tests/network-probe.sh)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
665
site/blog/dnssec-from-scratch.html
Normal file
@@ -0,0 +1,665 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>Implementing DNSSEC from Scratch in Rust — Numa</title>
|
||||
<meta name="description" content="Recursive resolution from root hints,
|
||||
chain-of-trust validation, NSEC/NSEC3 denial proofs, and what I learned
|
||||
implementing DNSSEC with zero DNS libraries.">
|
||||
<link rel="stylesheet" href="/fonts/fonts.css">
|
||||
<style>
|
||||
*, *::before, *::after { margin: 0; padding: 0; box-sizing: border-box; }
|
||||
|
||||
:root {
|
||||
--bg-deep: #f5f0e8;
|
||||
--bg-surface: #ece5da;
|
||||
--bg-elevated: #e3dbce;
|
||||
--bg-card: #faf7f2;
|
||||
--amber: #c0623a;
|
||||
--amber-dim: #9e4e2d;
|
||||
--teal: #6b7c4e;
|
||||
--teal-dim: #566540;
|
||||
--violet: #64748b;
|
||||
--text-primary: #2c2418;
|
||||
--text-secondary: #6b5e4f;
|
||||
--text-dim: #a39888;
|
||||
--border: rgba(0, 0, 0, 0.08);
|
||||
--border-amber: rgba(192, 98, 58, 0.22);
|
||||
--font-display: 'Instrument Serif', Georgia, serif;
|
||||
--font-body: 'DM Sans', system-ui, sans-serif;
|
||||
--font-mono: 'JetBrains Mono', monospace;
|
||||
}
|
||||
|
||||
html { scroll-behavior: smooth; }
|
||||
|
||||
body {
|
||||
background: var(--bg-deep);
|
||||
color: var(--text-primary);
|
||||
font-family: var(--font-body);
|
||||
font-weight: 400;
|
||||
line-height: 1.7;
|
||||
-webkit-font-smoothing: antialiased;
|
||||
}
|
||||
|
||||
body::before {
|
||||
content: '';
|
||||
position: fixed;
|
||||
inset: 0;
|
||||
background-image: url("data:image/svg+xml,%3Csvg viewBox='0 0 256 256' xmlns='http://www.w3.org/2000/svg'%3E%3Cfilter id='n'%3E%3CfeTurbulence type='fractalNoise' baseFrequency='0.9' numOctaves='4' stitchTiles='stitch'/%3E%3C/filter%3E%3Crect width='100%25' height='100%25' filter='url(%23n)' opacity='0.025'/%3E%3C/svg%3E");
|
||||
pointer-events: none;
|
||||
z-index: 9999;
|
||||
}
|
||||
|
||||
/* --- Blog nav --- */
|
||||
.blog-nav {
|
||||
padding: 1.5rem 2rem;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 1.5rem;
|
||||
}
|
||||
|
||||
.blog-nav a {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.75rem;
|
||||
letter-spacing: 0.08em;
|
||||
text-transform: uppercase;
|
||||
color: var(--text-dim);
|
||||
text-decoration: none;
|
||||
transition: color 0.2s;
|
||||
}
|
||||
.blog-nav a:hover { color: var(--amber); }
|
||||
|
||||
.blog-nav .wordmark {
|
||||
font-family: var(--font-display);
|
||||
font-size: 1.4rem;
|
||||
font-weight: 400;
|
||||
color: var(--text-primary);
|
||||
text-decoration: none;
|
||||
letter-spacing: -0.02em;
|
||||
}
|
||||
.blog-nav .wordmark:hover { color: var(--amber); }
|
||||
|
||||
.blog-nav .sep {
|
||||
color: var(--text-dim);
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.75rem;
|
||||
}
|
||||
|
||||
/* --- Article --- */
|
||||
.article {
|
||||
max-width: 720px;
|
||||
margin: 0 auto;
|
||||
padding: 3rem 2rem 6rem;
|
||||
}
|
||||
|
||||
.article-header {
|
||||
margin-bottom: 3rem;
|
||||
padding-bottom: 2rem;
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
|
||||
.article-header h1 {
|
||||
font-family: var(--font-display);
|
||||
font-weight: 400;
|
||||
font-size: clamp(2rem, 5vw, 3rem);
|
||||
line-height: 1.15;
|
||||
margin-bottom: 1rem;
|
||||
color: var(--text-primary);
|
||||
}
|
||||
|
||||
.article-meta {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.75rem;
|
||||
color: var(--text-dim);
|
||||
letter-spacing: 0.04em;
|
||||
}
|
||||
|
||||
.article-meta a {
|
||||
color: var(--amber);
|
||||
text-decoration: none;
|
||||
}
|
||||
.article-meta a:hover { text-decoration: underline; }
|
||||
|
||||
/* --- Prose --- */
|
||||
.article h2 {
|
||||
font-family: var(--font-display);
|
||||
font-weight: 600;
|
||||
font-size: 1.8rem;
|
||||
line-height: 1.2;
|
||||
margin: 3rem 0 1rem;
|
||||
color: var(--text-primary);
|
||||
}
|
||||
|
||||
.article h3 {
|
||||
font-family: var(--font-body);
|
||||
font-weight: 600;
|
||||
font-size: 1.2rem;
|
||||
margin: 2rem 0 0.75rem;
|
||||
color: var(--text-primary);
|
||||
}
|
||||
|
||||
.article p {
|
||||
margin-bottom: 1.25rem;
|
||||
color: var(--text-secondary);
|
||||
font-size: 1.05rem;
|
||||
}
|
||||
|
||||
.article a {
|
||||
color: var(--amber);
|
||||
text-decoration: underline;
|
||||
text-decoration-color: rgba(192, 98, 58, 0.3);
|
||||
text-underline-offset: 2px;
|
||||
transition: text-decoration-color 0.2s;
|
||||
}
|
||||
.article a:hover {
|
||||
text-decoration-color: var(--amber);
|
||||
}
|
||||
|
||||
.article strong {
|
||||
color: var(--text-primary);
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.article ul, .article ol {
|
||||
margin-bottom: 1.25rem;
|
||||
padding-left: 1.5rem;
|
||||
color: var(--text-secondary);
|
||||
}
|
||||
|
||||
.article li {
|
||||
margin-bottom: 0.4rem;
|
||||
font-size: 1.05rem;
|
||||
}
|
||||
|
||||
.article blockquote {
|
||||
border-left: 3px solid var(--amber);
|
||||
padding: 0.75rem 1.25rem;
|
||||
margin: 1.5rem 0;
|
||||
background: rgba(192, 98, 58, 0.04);
|
||||
border-radius: 0 4px 4px 0;
|
||||
}
|
||||
|
||||
.article blockquote p {
|
||||
color: var(--text-secondary);
|
||||
font-style: italic;
|
||||
margin-bottom: 0;
|
||||
}
|
||||
|
||||
/* --- Code --- */
|
||||
.article code {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.88em;
|
||||
background: var(--bg-elevated);
|
||||
padding: 0.15em 0.4em;
|
||||
border-radius: 3px;
|
||||
color: var(--amber-dim);
|
||||
}
|
||||
|
||||
.article pre {
|
||||
background: var(--bg-card);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 6px;
|
||||
padding: 1.25rem 1.5rem;
|
||||
margin: 1.5rem 0;
|
||||
overflow-x: auto;
|
||||
line-height: 1.55;
|
||||
}
|
||||
|
||||
.article pre code {
|
||||
background: none;
|
||||
padding: 0;
|
||||
border-radius: 0;
|
||||
color: var(--text-primary);
|
||||
font-size: 0.85rem;
|
||||
}
|
||||
|
||||
/* --- Images --- */
|
||||
.article img {
|
||||
max-width: 100%;
|
||||
border-radius: 6px;
|
||||
border: 1px solid var(--border);
|
||||
margin: 1.5rem 0;
|
||||
}
|
||||
|
||||
/* --- Tables --- */
|
||||
.article table {
|
||||
width: 100%;
|
||||
border-collapse: collapse;
|
||||
margin: 1.5rem 0;
|
||||
font-size: 0.95rem;
|
||||
}
|
||||
|
||||
.article th {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.75rem;
|
||||
letter-spacing: 0.06em;
|
||||
text-transform: uppercase;
|
||||
color: var(--text-dim);
|
||||
text-align: left;
|
||||
padding: 0.6rem 1rem;
|
||||
border-bottom: 2px solid var(--border);
|
||||
}
|
||||
|
||||
.article td {
|
||||
padding: 0.6rem 1rem;
|
||||
border-bottom: 1px solid var(--border);
|
||||
color: var(--text-secondary);
|
||||
}
|
||||
|
||||
/* --- Footer --- */
|
||||
.blog-footer {
|
||||
text-align: center;
|
||||
padding: 3rem 2rem;
|
||||
border-top: 1px solid var(--border);
|
||||
max-width: 720px;
|
||||
margin: 0 auto;
|
||||
}
|
||||
|
||||
.blog-footer a {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.75rem;
|
||||
letter-spacing: 0.08em;
|
||||
text-transform: uppercase;
|
||||
color: var(--text-dim);
|
||||
text-decoration: none;
|
||||
margin: 0 1rem;
|
||||
}
|
||||
.blog-footer a:hover { color: var(--amber); }
|
||||
|
||||
/* --- Responsive --- */
|
||||
@media (max-width: 640px) {
|
||||
.article { padding: 2rem 1.25rem 4rem; }
|
||||
.article pre { padding: 1rem; margin-left: -0.5rem; margin-right: -0.5rem; border-radius: 0; border-left: none; border-right: none; }
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
<nav class="blog-nav">
|
||||
<a href="/" class="wordmark">Numa</a>
|
||||
<span class="sep">/</span>
|
||||
<a href="/blog/">Blog</a>
|
||||
</nav>
|
||||
|
||||
<article class="article">
|
||||
<header class="article-header">
|
||||
<h1>Implementing DNSSEC from Scratch in Rust</h1>
|
||||
<div class="article-meta">
|
||||
March 2026 · <a href="https://dimescu.ro">Razvan Dimescu</a>
|
||||
</div>
|
||||
</header>
|
||||
|
||||
<p>In the <a href="/blog/dns-from-scratch.html">previous post</a> I
|
||||
covered how DNS works at the wire level — packet format, label
|
||||
compression, TTL caching, DoH. Numa was a forwarding resolver: it parsed
|
||||
packets, did useful things locally, and relayed the rest to Cloudflare
|
||||
or Quad9.</p>
|
||||
<p>That post ended with “recursive resolution and DNSSEC are on the
|
||||
roadmap.” This post is about building both.</p>
|
||||
<p>The short version: Numa now resolves from root nameservers with
|
||||
iterative queries, validates the full DNSSEC chain of trust, and
|
||||
cryptographically proves that non-existent domains don’t exist. No
|
||||
upstream dependency. No DNS libraries. Just <code>ring</code> for the
|
||||
crypto primitives and a lot of RFC reading.</p>
|
||||
<h2 id="why-recursive">Why recursive?</h2>
|
||||
<p>A forwarding resolver trusts its upstream. When you ask Quad9 for
|
||||
<code>cloudflare.com</code>, you trust that Quad9 returns the real
|
||||
answer. If Quad9 lies, gets compromised, or is legally compelled to
|
||||
redirect you — you have no way to know.</p>
|
||||
<p>A recursive resolver doesn’t trust anyone. It starts at the root
|
||||
nameservers (operated by 12 independent organizations) and follows the
|
||||
delegation chain: root → <code>.com</code> TLD →
|
||||
<code>cloudflare.com</code> authoritative servers. Each server only
|
||||
answers for its own zone. No single entity sees your full query
|
||||
pattern.</p>
|
||||
<p>DNSSEC adds cryptographic proof to each step. The root signs
|
||||
<code>.com</code>’s key. <code>.com</code> signs
|
||||
<code>cloudflare.com</code>’s key. <code>cloudflare.com</code> signs its
|
||||
own records. If any step is tampered with, the chain breaks and Numa
|
||||
rejects the response.</p>
|
||||
<h2 id="the-iterative-resolution-loop">The iterative resolution
|
||||
loop</h2>
|
||||
<p>Recursive resolution is a misnomer — the resolver actually uses
|
||||
<em>iterative</em> queries. It asks root “where is
|
||||
<code>cloudflare.com</code>?”, root says “I don’t know, but here are the
|
||||
<code>.com</code> nameservers.” It asks <code>.com</code>, which says
|
||||
“here are cloudflare’s nameservers.” It asks those, and gets the
|
||||
answer.</p>
|
||||
<pre><code>resolve("cloudflare.com", A)
|
||||
→ ask 198.41.0.4 (a.root-servers.net)
|
||||
← "try .com: ns1.gtld-servers.net (192.5.6.30)" [referral + glue]
|
||||
→ ask 192.5.6.30 (ns1.gtld-servers.net)
|
||||
← "try cloudflare: ns1.cloudflare.com (173.245.58.51)" [referral + glue]
|
||||
→ ask 173.245.58.51 (ns1.cloudflare.com)
|
||||
← "104.16.132.229" [answer]</code></pre>
|
||||
<p>The implementation (<code>src/recursive.rs</code>) is a loop with
|
||||
three possible outcomes per query:</p>
|
||||
<ol type="1">
|
||||
<li><strong>Answer</strong> — the server knows the record. Cache it,
|
||||
return it.</li>
|
||||
<li><strong>Referral</strong> — the server delegates to another zone.
|
||||
Extract NS records and glue (A/AAAA records for the nameservers,
|
||||
included in the additional section to avoid a chicken-and-egg problem),
|
||||
then query the next server.</li>
|
||||
<li><strong>NXDOMAIN/REFUSED</strong> — the name doesn’t exist or the
|
||||
server refuses. Cache the negative result.</li>
|
||||
</ol>
|
||||
<p>CNAME chasing adds complexity: if you ask for
|
||||
<code>www.cloudflare.com</code> and get a CNAME to
|
||||
<code>cloudflare.com</code>, you need to restart resolution for the new
|
||||
name. I cap this at 8 levels.</p>
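<p>As a rough illustration of the loop and the CNAME cap, here is a minimal sketch. It is not Numa's actual code: the <code>Reply</code> enum, the <code>ask</code> callback, and the <code>MAX_REFERRALS</code> guard are hypothetical names introduced for the example, and restarting from the root after a CNAME is a simplification (a real resolver would start from the closest cached nameserver). Only the cap of 8 CNAME levels comes from the text above.</p>

```rust
/// Simplified view of what one upstream query can produce.
#[derive(Debug, Clone, PartialEq)]
enum Reply {
    Answer(String),   // the record data we were looking for
    Cname(String),    // alias: restart resolution with this name
    Referral(String), // address of the next server to ask (from NS + glue)
    Negative,         // NXDOMAIN or REFUSED
}

const MAX_CNAME_DEPTH: usize = 8; // cap mentioned in the post
const MAX_REFERRALS: usize = 16;  // hypothetical guard against referral loops

/// Iteratively resolve `name`, starting at `root`.
/// `ask(server, qname)` stands in for sending one query over the network.
fn resolve(name: &str, root: &str, ask: &dyn Fn(&str, &str) -> Reply) -> Option<String> {
    let mut qname = name.to_string();
    let mut server = root.to_string();
    let mut cname_depth = 0;
    let mut referrals = 0;

    loop {
        match ask(server.as_str(), qname.as_str()) {
            Reply::Answer(data) => return Some(data),
            Reply::Negative => return None,
            Reply::Referral(next) => {
                referrals += 1;
                if referrals > MAX_REFERRALS {
                    return None;
                }
                server = next;
            }
            Reply::Cname(target) => {
                cname_depth += 1;
                if cname_depth > MAX_CNAME_DEPTH {
                    return None;
                }
                // Restart the walk for the alias target.
                qname = target;
                server = root.to_string();
                referrals = 0;
            }
        }
    }
}
```

<p>Passing the network layer in as a callback keeps the control flow testable without sockets, which is the same idea as a mock server, only smaller.</p>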
|
||||
<h3 id="tld-priming">TLD priming</h3>
|
||||
<p>Cold-cache resolution is slow. Every query needs root → TLD →
|
||||
authoritative, each with its own network round-trip. For the first query
|
||||
to <code>example.com</code>, that’s three serial UDP round-trips before
|
||||
you get an answer.</p>
|
||||
<p>TLD priming solves this. On startup, Numa queries root for NS records
|
||||
of 34 common TLDs (<code>.com</code>, <code>.org</code>,
|
||||
<code>.net</code>, <code>.io</code>, <code>.dev</code>, plus EU ccTLDs),
|
||||
caching NS records, glue addresses, DS records, and DNSKEY records.
|
||||
After priming, the first query to any <code>.com</code> domain skips
|
||||
root entirely — it already knows where <code>.com</code>’s nameservers
|
||||
are, and already has the DNSSEC keys needed to validate the
|
||||
response.</p>
|
||||
<h2 id="dnssec-chain-of-trust">DNSSEC chain of trust</h2>
|
||||
<p>DNSSEC doesn’t encrypt DNS traffic. It <em>signs</em> it. Every DNS
|
||||
record can have an accompanying RRSIG (signature) record. The resolver
|
||||
verifies the signature against the zone’s DNSKEY, then verifies that
|
||||
DNSKEY against the parent zone’s DS (delegation signer) record, walking
|
||||
up until it reaches the root trust anchor — a hardcoded public key that
|
||||
IANA publishes and the entire internet agrees on.</p>
|
||||
<pre><code>cloudflare.com A 104.16.132.229
|
||||
signed by → RRSIG (key_tag=34505, algo=13, signer=cloudflare.com)
|
||||
verified with → DNSKEY (cloudflare.com, key_tag=34505, ECDSA P-256)
|
||||
vouched for by → DS (at .com, key_tag=2371, digest=SHA-256 of cloudflare's DNSKEY)
|
||||
signed by → RRSIG (key_tag=19718, signer=com)
|
||||
verified with → DNSKEY (com, key_tag=19718)
|
||||
vouched for by → DS (at root, key_tag=30909)
|
||||
signed by → RRSIG (signer=.)
|
||||
verified with → DNSKEY (., key_tag=20326) ← root trust anchor (hardcoded)</code></pre>
|
||||
<h3 id="the-trust-anchor">The trust anchor</h3>
|
||||
<p>IANA’s root KSK (Key Signing Key) has key tag 20326, algorithm 8
(RSA/SHA-256), and a 2048-bit RSA public key (a 256-byte modulus, 260
bytes in DNSKEY wire form once the exponent header is included). It was
last rolled in 2018. I hardcode it as a <code>const</code> array; this is
the one thing in the entire system that requires out-of-band trust.</p>
|
||||
<div class="sourceCode" id="cb3"><pre
|
||||
class="sourceCode rust"><code class="sourceCode rust"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="kw">const</span> ROOT_KSK_PUBLIC_KEY<span class="op">:</span> <span class="op">&</span>[<span class="dt">u8</span>] <span class="op">=</span> <span class="op">&</span>[</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="dv">0x03</span><span class="op">,</span> <span class="dv">0x01</span><span class="op">,</span> <span class="dv">0x00</span><span class="op">,</span> <span class="dv">0x01</span><span class="op">,</span> <span class="dv">0xac</span><span class="op">,</span> <span class="dv">0xff</span><span class="op">,</span> <span class="dv">0xb4</span><span class="op">,</span> <span class="dv">0x09</span><span class="op">,</span></span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> <span class="co">// ... 260 bytes total: 4-byte exponent header + 256-byte modulus</span></span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>]<span class="op">;</span></span></code></pre></div>
|
||||
<p>When IANA rolls this key (rare — the previous key lasted from 2010 to
|
||||
2018), every DNSSEC validator on the internet needs updating. For Numa,
|
||||
that means a binary update. Something to watch.</p>
|
||||
<h3 id="key-tag-computation">Key tag computation</h3>
|
||||
<p>Every DNSKEY has a key tag — a 16-bit identifier computed per RFC
|
||||
4034 Appendix B. It’s a simple checksum over the DNSKEY RDATA (flags +
|
||||
protocol + algorithm + public key), summing 16-bit words with carry:</p>
|
||||
<div class="sourceCode" id="cb4"><pre
|
||||
class="sourceCode rust"><code class="sourceCode rust"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="kw">pub</span> <span class="kw">fn</span> compute_key_tag(flags<span class="op">:</span> <span class="dt">u16</span><span class="op">,</span> protocol<span class="op">:</span> <span class="dt">u8</span><span class="op">,</span> algorithm<span class="op">:</span> <span class="dt">u8</span><span class="op">,</span> public_key<span class="op">:</span> <span class="op">&</span>[<span class="dt">u8</span>]) <span class="op">-></span> <span class="dt">u16</span> <span class="op">{</span></span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> <span class="kw">mut</span> rdata <span class="op">=</span> <span class="dt">Vec</span><span class="pp">::</span>with_capacity(<span class="dv">4</span> <span class="op">+</span> public_key<span class="op">.</span>len())<span class="op">;</span></span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> rdata<span class="op">.</span>push((flags <span class="op">>></span> <span class="dv">8</span>) <span class="kw">as</span> <span class="dt">u8</span>)<span class="op">;</span></span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> rdata<span class="op">.</span>push((flags <span class="op">&</span> <span class="dv">0xFF</span>) <span class="kw">as</span> <span class="dt">u8</span>)<span class="op">;</span></span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> rdata<span class="op">.</span>push(protocol)<span class="op">;</span></span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> rdata<span class="op">.</span>push(algorithm)<span class="op">;</span></span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> rdata<span class="op">.</span>extend_from_slice(public_key)<span class="op">;</span></span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> <span class="kw">mut</span> ac<span class="op">:</span> <span class="dt">u32</span> <span class="op">=</span> <span class="dv">0</span><span class="op">;</span></span>
|
||||
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> <span class="cf">for</span> (i<span class="op">,</span> <span class="op">&</span>byte) <span class="kw">in</span> rdata<span class="op">.</span>iter()<span class="op">.</span>enumerate() <span class="op">{</span></span>
|
||||
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a> <span class="cf">if</span> i <span class="op">%</span> <span class="dv">2</span> <span class="op">==</span> <span class="dv">0</span> <span class="op">{</span> ac <span class="op">+=</span> (byte <span class="kw">as</span> <span class="dt">u32</span>) <span class="op"><<</span> <span class="dv">8</span><span class="op">;</span> <span class="op">}</span></span>
|
||||
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a> <span class="cf">else</span> <span class="op">{</span> ac <span class="op">+=</span> byte <span class="kw">as</span> <span class="dt">u32</span><span class="op">;</span> <span class="op">}</span></span>
|
||||
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
|
||||
<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a> ac <span class="op">+=</span> (ac <span class="op">>></span> <span class="dv">16</span>) <span class="op">&</span> <span class="dv">0xFFFF</span><span class="op">;</span></span>
|
||||
<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a> (ac <span class="op">&</span> <span class="dv">0xFFFF</span>) <span class="kw">as</span> <span class="dt">u16</span></span>
|
||||
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
|
||||
<p>The first test I wrote: compute the root KSK’s key tag and assert it
|
||||
equals 20326. Instant confidence that the RDATA encoding is correct.</p>
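<p>To make the arithmetic concrete, here is a tiny hand-checkable case. It is a sketch with a made-up 3-byte key, not a real DNSKEY, and <code>key_tag_from_rdata</code> is a hypothetical helper that takes already-serialized RDATA instead of the separate fields. It also shows the two edge cases: an odd-length RDATA and the carry fold.</p>

```rust
/// Key tag over already-serialized DNSKEY RDATA (RFC 4034 Appendix B).
/// Even-indexed bytes are the high half of a 16-bit word, odd-indexed
/// bytes the low half; a trailing odd byte counts as a high half.
fn key_tag_from_rdata(rdata: &[u8]) -> u16 {
    let mut ac: u32 = 0;
    for (i, &byte) in rdata.iter().enumerate() {
        ac += if i % 2 == 0 { (byte as u32) << 8 } else { byte as u32 };
    }
    // Fold the carry out of the low 16 bits back in once.
    ac += (ac >> 16) & 0xFFFF;
    (ac & 0xFFFF) as u16
}

/// Worked example: flags = 257 (0x0101), protocol = 3, algorithm = 8,
/// made-up key bytes 01 02 03.
///   0x0101 + 0x0308 + 0x0102 + 0x0300 = 0x080B
fn worked_example() -> u16 {
    key_tag_from_rdata(&[0x01, 0x01, 0x03, 0x08, 0x01, 0x02, 0x03])
}
```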
|
||||
<h2 id="the-crypto">The crypto</h2>
|
||||
<p>Numa uses <code>ring</code> for all cryptographic operations. Three
|
||||
algorithms cover the vast majority of signed zones:</p>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Algorithm</th>
|
||||
<th>ID</th>
|
||||
<th>Usage</th>
|
||||
<th>Verify time</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>RSA/SHA-256</td>
|
||||
<td>8</td>
|
||||
<td>Root, most TLDs</td>
|
||||
<td>10.9 µs</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ECDSA P-256</td>
|
||||
<td>13</td>
|
||||
<td>Cloudflare, many modern zones</td>
|
||||
<td>174 ns</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Ed25519</td>
|
||||
<td>15</td>
|
||||
<td>Newer zones</td>
|
||||
<td>~200 ns</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<h3 id="rsa-key-format-conversion">RSA key format conversion</h3>
|
||||
<p>DNS stores RSA public keys in RFC 3110 format: exponent length (1 or
|
||||
3 bytes), exponent, modulus. <code>ring</code> expects PKCS#1 DER (ASN.1
|
||||
encoded). Converting between them means writing a minimal ASN.1
|
||||
encoder:</p>
|
||||
<div class="sourceCode" id="cb5"><pre
|
||||
class="sourceCode rust"><code class="sourceCode rust"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="kw">fn</span> rsa_dnskey_to_der(public_key<span class="op">:</span> <span class="op">&</span>[<span class="dt">u8</span>]) <span class="op">-></span> <span class="dt">Option</span><span class="op"><</span><span class="dt">Vec</span><span class="op"><</span><span class="dt">u8</span><span class="op">>></span> <span class="op">{</span></span>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="co">// Parse RFC 3110: [exp_len] [exponent] [modulus]</span></span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> (exp_len<span class="op">,</span> exp_start) <span class="op">=</span> <span class="cf">if</span> public_key[<span class="dv">0</span>] <span class="op">==</span> <span class="dv">0</span> <span class="op">{</span></span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> len <span class="op">=</span> <span class="dt">u16</span><span class="pp">::</span>from_be_bytes([public_key[<span class="dv">1</span>]<span class="op">,</span> public_key[<span class="dv">2</span>]]) <span class="kw">as</span> <span class="dt">usize</span><span class="op">;</span></span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> (len<span class="op">,</span> <span class="dv">3</span>)</span>
|
||||
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
|
||||
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> (public_key[<span class="dv">0</span>] <span class="kw">as</span> <span class="dt">usize</span><span class="op">,</span> <span class="dv">1</span>)</span>
|
||||
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> <span class="op">};</span></span>
|
||||
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> exponent <span class="op">=</span> <span class="op">&</span>public_key[exp_start<span class="op">..</span>exp_start <span class="op">+</span> exp_len]<span class="op">;</span></span>
|
||||
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> modulus <span class="op">=</span> <span class="op">&</span>public_key[exp_start <span class="op">+</span> exp_len<span class="op">..</span>]<span class="op">;</span></span>
|
||||
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a> <span class="co">// Build ASN.1 DER: SEQUENCE { INTEGER modulus, INTEGER exponent }</span></span>
|
||||
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> mod_der <span class="op">=</span> asn1_integer(modulus)<span class="op">;</span></span>
|
||||
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> exp_der <span class="op">=</span> asn1_integer(exponent)<span class="op">;</span></span>
|
||||
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a> <span class="co">// ... wrap in SEQUENCE tag + length</span></span>
|
||||
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
|
||||
<p>The <code>asn1_integer</code> function handles leading-zero stripping
|
||||
(DER integers must be minimal) and sign-bit padding (high bit set means
|
||||
negative in ASN.1, so positive numbers need a <code>0x00</code> prefix).
|
||||
Getting this wrong produces keys that <code>ring</code> silently rejects
|
||||
— one of the harder bugs to track down.</p>
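<p>For reference, the two DER rules described above can be sketched as follows. The body of <code>asn1_integer</code> here is my reconstruction from that description, not the code in Numa, and <code>der_len</code> and <code>rsa_public_key_der</code> are hypothetical helper names. The output is the PKCS#1 <code>RSAPublicKey</code> structure: a SEQUENCE of two INTEGERs, modulus first.</p>

```rust
/// DER length: short form below 128, long form (0x80 | byte count) otherwise.
fn der_len(len: usize) -> Vec<u8> {
    if len < 0x80 {
        return vec![len as u8];
    }
    let bytes = len.to_be_bytes();
    // len >= 0x80, so at least one byte is non-zero.
    let first = bytes.iter().position(|&b| b != 0).unwrap_or(bytes.len() - 1);
    let mut out = vec![0x80 | (bytes.len() - first) as u8];
    out.extend_from_slice(&bytes[first..]);
    out
}

/// Encode an unsigned big-endian integer as a DER INTEGER.
fn asn1_integer(bytes: &[u8]) -> Vec<u8> {
    // Rule 1: strip leading zeros so the encoding is minimal.
    let first = bytes.iter().position(|&b| b != 0).unwrap_or(bytes.len());
    let mut body = bytes[first..].to_vec();
    if body.is_empty() {
        body.push(0x00); // the value zero is a single 0x00 byte
    } else if body[0] & 0x80 != 0 {
        // Rule 2: a set high bit would read as negative, so pad with 0x00.
        body.insert(0, 0x00);
    }
    let mut out = vec![0x02]; // INTEGER tag
    out.extend(der_len(body.len()));
    out.extend(body);
    out
}

/// SEQUENCE { INTEGER modulus, INTEGER exponent }
fn rsa_public_key_der(modulus: &[u8], exponent: &[u8]) -> Vec<u8> {
    let mut body = asn1_integer(modulus);
    body.extend(asn1_integer(exponent));
    let mut out = vec![0x30]; // SEQUENCE tag
    out.extend(der_len(body.len()));
    out.extend(body);
    out
}
```

<p>A real 2048-bit modulus almost always has its high bit set, so the padded INTEGER body is 257 bytes and needs the long length form (<code>0x82 0x01 0x01</code>). That path is easy to miss when testing only with small values.</p>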
|
||||
<h3 id="ecdsa-is-simpler">ECDSA is simpler</h3>
|
||||
<p>ECDSA P-256 keys in DNS are 64 bytes (x + y coordinates).
<code>ring</code> expects uncompressed point format: <code>0x04</code>
prefix + 64 bytes. The conversion is a single prepend:</p>
|
||||
<div class="sourceCode" id="cb6"><pre
|
||||
class="sourceCode rust"><code class="sourceCode rust"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> <span class="kw">mut</span> uncompressed <span class="op">=</span> <span class="dt">Vec</span><span class="pp">::</span>with_capacity(<span class="dv">65</span>)<span class="op">;</span></span>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a>uncompressed<span class="op">.</span>push(<span class="dv">0x04</span>)<span class="op">;</span></span>
|
||||
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a>uncompressed<span class="op">.</span>extend_from_slice(public_key)<span class="op">;</span> <span class="co">// 64 bytes from DNS</span></span></code></pre></div>
|
||||
<p>Signatures are also 64 bytes (r + s), used directly. No format
|
||||
conversion needed.</p>
|
||||
<h3 id="building-the-signed-data">Building the signed data</h3>
|
||||
<p>An RRSIG doesn’t sign the DNS packet. It signs a canonical form of
the records, and the verifier has to rebuild that form byte for byte.
Building it correctly is the most detail-sensitive part of DNSSEC. The
signed data is:</p>
|
||||
<ol type="1">
|
||||
<li>RRSIG RDATA fields (type covered, algorithm, labels, original TTL,
|
||||
expiration, inception, key tag, signer name) — <em>without</em> the
|
||||
signature itself</li>
|
||||
<li>For each record in the RRset: owner name (lowercased, uncompressed)
|
||||
+ type + class + original TTL (from the RRSIG, not the record’s current
|
||||
TTL) + RDATA length + canonical RDATA</li>
|
||||
</ol>
|
||||
<p>The records must be sorted by their canonical wire-format
|
||||
representation. Owner names must be lowercased. The TTL must be the
|
||||
<em>original</em> TTL from the RRSIG, not the decremented TTL from
|
||||
caching.</p>
|
||||
<p>Getting any of these details wrong — wrong TTL, wrong case, wrong
|
||||
sort order, wrong RDATA encoding — produces a valid-looking but
|
||||
incorrect signed data blob, and <code>ring</code> returns a signature
|
||||
mismatch with no diagnostic information. I spent more time debugging
|
||||
signed data construction than any other part of DNSSEC.</p>
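<p>The per-record half of the signed data (item 2 above) can be sketched like this. It is an illustration under simplifying assumptions, not Numa's code: <code>name_to_wire</code> and <code>rrset_signed_bytes</code> are hypothetical names, the RDATA is assumed to be in canonical form already, wildcard expansion is ignored, and the caller is expected to prepend the RRSIG RDATA fields (item 1) before verifying.</p>

```rust
/// Uncompressed, lowercased wire form of a domain name.
fn name_to_wire(name: &str) -> Vec<u8> {
    let mut out = Vec::new();
    for label in name.trim_end_matches('.').split('.').filter(|l| !l.is_empty()) {
        out.push(label.len() as u8);
        out.extend_from_slice(label.to_ascii_lowercase().as_bytes());
    }
    out.push(0); // root label terminates the name
    out
}

/// Serialize an RRset the way it appears in the signed data:
/// owner | type | class | original TTL | RDATA length | RDATA, per record,
/// with records ordered by their RDATA bytes.
fn rrset_signed_bytes(
    owner: &str,
    rtype: u16,
    class: u16,
    original_ttl: u32, // from the RRSIG, not the cached TTL
    rdatas: &[Vec<u8>],
) -> Vec<u8> {
    // Byte-wise lexicographic order; a shorter RDATA that is a prefix of a
    // longer one sorts first, which is what Vec<u8>'s Ord does.
    let mut sorted: Vec<&Vec<u8>> = rdatas.iter().collect();
    sorted.sort();

    let owner_wire = name_to_wire(owner);
    let mut out = Vec::new();
    for rdata in sorted {
        out.extend_from_slice(&owner_wire);
        out.extend_from_slice(&rtype.to_be_bytes());
        out.extend_from_slice(&class.to_be_bytes());
        out.extend_from_slice(&original_ttl.to_be_bytes());
        out.extend_from_slice(&(rdata.len() as u16).to_be_bytes());
        out.extend_from_slice(rdata);
    }
    out
}
```

<p>Since a mismatch comes back from the crypto layer with no detail, unit-testing this serialization on its own, against bytes worked out by hand, is a cheap way to narrow down which field is wrong.</p>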
|
||||
<h2 id="proving-a-name-doesnt-exist">Proving a name doesn’t exist</h2>
|
||||
<p>Verifying that <code>cloudflare.com</code> has a valid A record is
|
||||
one thing. Proving that <code>doesnotexist.cloudflare.com</code>
|
||||
<em>doesn’t</em> exist — cryptographically, in a way that can’t be
|
||||
forged — is harder.</p>
|
||||
<h3 id="nsec">NSEC</h3>
|
||||
<p>NSEC records form a chain. Each NSEC says “the next name in this zone
|
||||
after me is X, and at my name these record types exist.” If you query
|
||||
<code>beta.example.com</code> and the zone has
|
||||
<code>alpha.example.com → NSEC → gamma.example.com</code>, the gap
|
||||
proves <code>beta</code> doesn’t exist — there’s nothing between
|
||||
<code>alpha</code> and <code>gamma</code>.</p>
|
||||
<p>For NXDOMAIN proofs, RFC 4035 §5.4 requires two things:</p>
<ol type="1">
<li>An NSEC record whose gap covers the queried name</li>
<li>An NSEC record proving no wildcard exists at the closest
encloser</li>
</ol>
|
||||
<p>The canonical DNS name ordering (RFC 4034 §6.1) compares labels
|
||||
right-to-left, case-insensitive. <code>a.example.com</code> <
|
||||
<code>b.example.com</code> because at the <code>example.com</code> level
|
||||
they’re equal, then <code>a</code> < <code>b</code>. But
|
||||
<code>z.example.com</code> < <code>a.example.org</code> because
|
||||
<code>.com</code> < <code>.org</code> at the TLD level.</p>
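<p>The ordering and the gap check can be sketched as below. This is a minimal illustration, not Numa's implementation: <code>canonical_cmp</code> and <code>nsec_covers</code> are hypothetical names, names are taken as plain ASCII strings without escaped dots, and the check does not verify that the queried name lies inside the zone.</p>

```rust
use std::cmp::Ordering;

/// Labels of a name, rightmost first, lowercased.
fn labels_reversed(name: &str) -> Vec<Vec<u8>> {
    name.trim_end_matches('.')
        .split('.')
        .filter(|l| !l.is_empty())
        .rev()
        .map(|l| l.to_ascii_lowercase().into_bytes())
        .collect()
}

/// Canonical DNS name order (RFC 4034 §6.1): compare label by label from
/// the right; a name that is a suffix of another sorts first.
fn canonical_cmp(a: &str, b: &str) -> Ordering {
    labels_reversed(a).cmp(&labels_reversed(b))
}

/// Does the NSEC record `owner -> next` prove that `qname` does not exist?
fn nsec_covers(owner: &str, next: &str, qname: &str) -> bool {
    let after_owner = canonical_cmp(owner, qname) == Ordering::Less;
    let before_next = canonical_cmp(qname, next) == Ordering::Less;
    if canonical_cmp(owner, next) == Ordering::Less {
        // Ordinary gap: owner < qname < next.
        after_owner && before_next
    } else {
        // Last NSEC in the chain: `next` wraps around to the zone apex.
        after_owner || before_next
    }
}
```

<p>The wrap-around branch matters because the last name in a zone points back to the apex, so its NSEC covers everything that sorts after it.</p>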
|
||||
<h3 id="nsec3">NSEC3</h3>
|
||||
<p>NSEC3 solves NSEC’s zone enumeration problem — with NSEC, you can
|
||||
walk the chain and discover every name in the zone. NSEC3 hashes the
|
||||
names first (iterated SHA-1 with a salt), so the NSEC3 chain reveals
|
||||
hashes, not names.</p>
|
||||
<p>The proof is a 3-part closest encloser proof (RFC 5155 §8.4):</p>
<ol type="1">
<li><strong>Closest encloser</strong>: find an ancestor of the queried
name whose hash exactly matches an NSEC3 owner.</li>
<li><strong>Next closer</strong>: the name one label longer than the
closest encloser must fall within an NSEC3 hash range (proving it
doesn’t exist).</li>
<li><strong>Wildcard denial</strong>: the wildcard at the closest
encloser (<code>*.closest_encloser</code>) must also fall within an
NSEC3 hash range.</li>
</ol>
|
||||
<div class="sourceCode" id="cb7"><pre
class="sourceCode rust"><code class="sourceCode rust"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="co">// Pre-compute hashes for all ancestors</span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="dv">0</span><span class="op">..</span>labels<span class="op">.</span>len() <span class="op">{</span></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">let</span> name<span class="op">:</span> <span class="dt">String</span> <span class="op">=</span> labels[i<span class="op">..</span>]<span class="op">.</span>join(<span class="st">"."</span>)<span class="op">;</span></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a>    ancestor_hashes<span class="op">.</span>push(nsec3_hash(<span class="op">&amp;</span>name<span class="op">,</span> algorithm<span class="op">,</span> iterations<span class="op">,</span> salt)<span class="op">?</span>)<span class="op">;</span></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a><span class="co">// Walk from longest candidate: is this the closest encloser?</span></span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> i <span class="kw">in</span> <span class="dv">1</span><span class="op">..</span>labels<span class="op">.</span>len() <span class="op">{</span></span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a>    <span class="kw">let</span> ce_hash <span class="op">=</span> <span class="op">&amp;</span>ancestor_hashes[i]<span class="op">;</span></span>
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">!</span>decoded<span class="op">.</span>iter()<span class="op">.</span>any(<span class="op">|</span>(oh<span class="op">,</span> _)<span class="op">|</span> oh <span class="op">==</span> ce_hash) <span class="op">{</span> <span class="cf">continue</span><span class="op">;</span> <span class="op">}</span> <span class="co">// (1)</span></span>
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a>    <span class="kw">let</span> nc_hash <span class="op">=</span> <span class="op">&amp;</span>ancestor_hashes[i <span class="op">-</span> <span class="dv">1</span>]<span class="op">;</span></span>
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">!</span>nsec3_any_covers(<span class="op">&amp;</span>decoded<span class="op">,</span> nc_hash) <span class="op">{</span> <span class="cf">continue</span><span class="op">;</span> <span class="op">}</span> <span class="co">// (2)</span></span>
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a>    <span class="kw">let</span> wc <span class="op">=</span> <span class="pp">format!</span>(<span class="st">"*.{}"</span><span class="op">,</span> labels[i<span class="op">..</span>]<span class="op">.</span>join(<span class="st">"."</span>))<span class="op">;</span></span>
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a>    <span class="kw">let</span> wc_hash <span class="op">=</span> nsec3_hash(<span class="op">&amp;</span>wc<span class="op">,</span> algorithm<span class="op">,</span> iterations<span class="op">,</span> salt)<span class="op">?;</span></span>
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> nsec3_any_covers(<span class="op">&amp;</span>decoded<span class="op">,</span> <span class="op">&amp;</span>wc_hash) <span class="op">{</span> proven <span class="op">=</span> <span class="cn">true</span><span class="op">;</span> <span class="cf">break</span><span class="op">;</span> <span class="op">}</span> <span class="co">// (3)</span></span>
<span id="cb7-16"><a href="#cb7-16" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
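<p>The “falls within an NSEC3 hash range” test used by steps (2) and (3)
can be sketched like this. It is an illustration under assumptions, not
the real <code>nsec3_any_covers</code>: the function names and the
<code>(owner_hash, next_hash)</code> tuple layout are hypothetical, and
all hashes are assumed to be raw digests of equal length so that byte-wise
comparison matches hash order.</p>

```rust
/// True when `target` lies strictly between `owner` and `next` in hash order.
/// The last NSEC3 record in a zone wraps around: its `next` is the smallest
/// hash in the chain, so the covered range spans the end and the start.
fn covers(owner: &[u8], next: &[u8], target: &[u8]) -> bool {
    if owner < next {
        owner < target && target < next
    } else {
        // Wrap-around record (or a single-record chain where owner == next).
        target > owner || target < next
    }
}

/// True when any (owner_hash, next_hash) pair covers `target`.
fn any_covers(records: &[(Vec<u8>, Vec<u8>)], target: &[u8]) -> bool {
    records.iter().any(|(owner, next)| covers(owner, next, target))
}
```

<p>A hash equal to an owner hash is never “covered”: an exact match means
the name exists, which is the closest encloser check in step (1), not a
denial.</p>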
<p>I cap NSEC3 iterations at 500 (RFC 9276 recommends 0). Higher
iteration counts are a DoS vector: every hashed name costs
<code>iterations + 1</code> SHA-1 invocations, and a single closest
encloser proof hashes every ancestor of the queried name plus a
wildcard.</p>
<h2 id="making-it-fast">Making it fast</h2>
<p>Cold-cache DNSSEC validation initially required ~5 network fetches
per query (DNSKEY for each zone in the chain, plus DS records). Three
optimizations brought this down to ~1:</p>
<p><strong>TLD priming</strong> (startup): fetch the root DNSKEY plus each
TLD’s NS/DS/DNSKEY. After priming, the trust chain from the root down to
each primed TLD (such as <code>.com</code>) is fully cached.</p>
<p><strong>Referral DS piggybacking</strong>: when a TLD server refers
you to <code>cloudflare.com</code>’s nameservers, the authority section
often includes DS records for the child zone. Cache them during
resolution instead of fetching separately during validation.</p>
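<p>A rough sketch of the piggybacking step, assuming a pared-down record
model. The <code>Record</code> and <code>RData</code> types and the
<code>cache_referral_ds</code> function are hypothetical and exist only
for this illustration; Numa’s real types differ.</p>

```rust
use std::collections::HashMap;

/// Hypothetical, minimal record model: only the variants a referral needs.
enum RData {
    Ns(String),
    Ds(Vec<u8>),
}

struct Record {
    owner: String,
    rdata: RData,
}

/// Copy DS records out of a referral's authority section into `ds_cache`,
/// keyed by the lowercased child zone name. Returns how many were cached.
fn cache_referral_ds(
    authority: &[Record],
    ds_cache: &mut HashMap<String, Vec<Vec<u8>>>,
) -> usize {
    let mut cached = 0;
    for rec in authority {
        if let RData::Ds(bytes) = &rec.rdata {
            ds_cache
                .entry(rec.owner.to_ascii_lowercase())
                .or_default()
                .push(bytes.clone());
            cached += 1;
        }
    }
    cached
}
```

<p>Caching the DS set this way only saves the fetch. The records still
have to pass RRSIG verification against the parent zone’s DNSKEY before
they anchor anything.</p>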
<p><strong>DNSKEY prefetch</strong>: before the validation loop, scan
all RRSIGs for signer zones and batch-fetch any missing DNSKEYs. This
avoids serial DNSKEY fetches inside the per-RRset verification loop.</p>
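<p>The scan that feeds the batch fetch might look something like this.
It is a sketch under assumptions: <code>Rrsig</code> is reduced to its
signer name, the cache is modeled as a plain <code>HashMap</code>, and
<code>missing_dnskey_zones</code> is a hypothetical helper name.</p>

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, pared-down RRSIG: only the signer name matters here.
struct Rrsig {
    signer: String,
}

/// Return each distinct signer zone that has no cached DNSKEY set.
/// Names are lowercased so "Example.COM." and "example.com." dedupe,
/// and the BTreeSet keeps the result in a stable, sorted order.
fn missing_dnskey_zones(
    rrsigs: &[Rrsig],
    dnskey_cache: &HashMap<String, Vec<Vec<u8>>>,
) -> Vec<String> {
    let wanted: BTreeSet<String> = rrsigs
        .iter()
        .map(|sig| sig.signer.to_ascii_lowercase())
        .collect();
    wanted
        .into_iter()
        .filter(|zone| !dnskey_cache.contains_key(zone))
        .collect()
}
```

<p>The returned list can then be fetched concurrently, so the
verification loop that follows only ever hits the cache.</p>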
<p>Result: a cold-cache query for <code>cloudflare.com</code> with full
DNSSEC validation takes ~90 ms. The TLD chain is already warm; only one
DNSKEY fetch is needed (for <code>cloudflare.com</code> itself).</p>
<table>
<thead>
<tr>
<th>Operation</th>
<th>Time</th>
</tr>
</thead>
<tbody>
<tr>
<td>ECDSA P-256 verify</td>
<td>174 ns</td>
</tr>
<tr>
<td>Ed25519 verify</td>
<td>~200 ns</td>
</tr>
<tr>
<td>RSA/SHA-256 verify</td>
<td>10.9 µs</td>
</tr>
<tr>
<td>DS digest (SHA-256)</td>
<td>257 ns</td>
</tr>
<tr>
<td>Key tag computation</td>
<td>20–63 ns</td>
</tr>
<tr>
<td>Cold-cache validation (1 fetch)</td>
<td>~90 ms</td>
</tr>
</tbody>
</table>
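<p>The network fetch dominates. The crypto is noise: even the slowest
operation in the table, RSA/SHA-256 verification at 10.9 µs, is roughly
8,000 times shorter than the ~90 ms cold-cache validation.</p>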
<h2 id="what-i-learned">What I learned</h2>
<p><strong>DNSSEC is a verification system, not an encryption
system.</strong> It proves authenticity: this record was signed by the
zone owner. It doesn’t hide what you’re querying. For privacy, you still
need encrypted transport (DoH/DoT) or recursive resolution (so that no
single upstream sees every query).</p>
<p><strong>The hardest bugs are in data serialization, not
crypto.</strong> <code>ring</code> either verifies or it doesn’t, a
binary answer. But getting the signed data blob exactly right (correct
TTL, correct case, correct sort, correct RDATA encoding for each record
type) requires extreme precision. A single wrong byte means verification
fails with no hint about what’s wrong.</p>
<p><strong>Negative proofs are harder than positive proofs.</strong>
Verifying a record exists: verify one RRSIG. Proving a record doesn’t
exist: find the right NSEC/NSEC3 records, verify their RRSIGs, check gap
coverage, check wildcard denial, compute hashes. The NSEC3 closest
encloser proof alone has three sub-proofs, each requiring hash
computation and range checking.</p>
<p><strong>Performance optimization is about avoiding network, not
avoiding CPU.</strong> The crypto takes nanoseconds to microseconds. The
network fetch takes tens of milliseconds. Every optimization that
matters (TLD priming, DS piggybacking, DNSKEY prefetch) is about
eliminating a round trip, not speeding up a hash.</p>
<h2 id="whats-next">What’s next</h2>
<p>Numa now has 13 feature layers, from basic DNS forwarding through
full recursive DNSSEC resolution. The immediate roadmap:</p>
<ul>
<li><strong>DoT (DNS-over-TLS)</strong>: the last encrypted transport
we don’t support</li>
<li><strong><a href="https://github.com/pubky/pkarr">pkarr</a>
integration</strong>: self-sovereign DNS via the Mainline BitTorrent
DHT. Ed25519-signed DNS records published without a registrar.</li>
<li><strong>Global <code>.numa</code> names</strong>: human-readable
names backed by the DHT, not ICANN</li>
</ul>
<p>The code is at <a
href="https://github.com/razvandimescu/numa">github.com/razvandimescu/numa</a>.
MIT license. The entire DNSSEC implementation is in <a
href="https://github.com/razvandimescu/numa/blob/main/src/dnssec.rs"><code>src/dnssec.rs</code></a>
(~1,600 lines) and <a
href="https://github.com/razvandimescu/numa/blob/main/src/recursive.rs"><code>src/recursive.rs</code></a>
(~600 lines).</p>
</article>
<footer class="blog-footer">
<a href="https://github.com/razvandimescu/numa">GitHub</a>
<a href="/">Home</a>
<a href="/blog/">Blog</a>
</footer>
</body>
</html>