Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

This commit is contained in:
ruv
2026-02-28 14:39:40 -05:00
7854 changed files with 3522914 additions and 0 deletions

# RVF: RuVector Format
## A Living, Self-Reorganizing Runtime Substrate for Vector Intelligence
---
### Document Index
#### Core Specification (`spec/`)
| # | Document | Description |
|---|----------|-------------|
| 00 | [Overview](spec/00-overview.md) | The Four Laws, design coordinates, philosophy |
| 01 | [Segment Model](spec/01-segment-model.md) | Append-only segments, headers, lifecycle, multi-file |
| 02 | [Manifest System](spec/02-manifest-system.md) | Two-level manifests, hotset pointers, progressive boot |
| 03 | [Temperature Tiering](spec/03-temperature-tiering.md) | Adaptive layout, access sketches, promotion/demotion |
| 04 | [Progressive Indexing](spec/04-progressive-indexing.md) | Layer A/B/C availability, lazy build, partial search |
| 05 | [Overlay Epochs](spec/05-overlay-epochs.md) | Streaming min-cut, epoch boundaries, rollback |
| 06 | [Query Optimization](spec/06-query-optimization.md) | SIMD alignment, prefetch, varint IDs, cache analysis |
| 07 | [Deletion & Lifecycle](spec/07-deletion-lifecycle.md) | Vector deletion, JOURNAL_SEG wire format, deletion bitmaps, compaction |
| 08 | [Filtered Search](spec/08-filtered-search.md) | META_SEG wire format, filter expressions, metadata indexes |
| 09 | [Concurrency & Versioning](spec/09-concurrency-versioning.md) | Writer locking, reader-writer coordination, space reclamation |
| 10 | [Operations API](spec/10-operations-api.md) | Batch ops, error codes, network streaming, compaction scheduling |
#### Wire Format (`wire/`)
| Document | Description |
|----------|-------------|
| [Binary Layout](wire/binary-layout.md) | Byte-level format reference, all segment payloads |
#### WASM Microkernel (`microkernel/`)
| Document | Description |
|----------|-------------|
| [WASM Runtime](microkernel/wasm-runtime.md) | Cognitum tile mapping, 14 exports, hub-tile protocol |
#### Domain Profiles (`profiles/`)
| Document | Description |
|----------|-------------|
| [Domain Profiles](profiles/domain-profiles.md) | RVDNA, RVText, RVGraph, RVVision specifications |
#### Cryptography (`crypto/`)
| Document | Description |
|----------|-------------|
| [Quantum Signatures](crypto/quantum-signatures.md) | ML-DSA-65, SHAKE-256, hybrid encryption, witnesses |
#### Benchmarks (`benchmarks/`)
| Document | Description |
|----------|-------------|
| [Acceptance Tests](benchmarks/acceptance-tests.md) | Performance targets, crash safety, scalability |
---
### Quick Reference
**The Four Laws**
1. Truth lives at the tail
2. Every segment is independently valid
3. Data and state are separated
4. The format adapts to its workload
**Minimal Upgrade Path** (smallest changes that unlock everything)
1. Add tail manifest segments
2. Make every payload a segment with its own hash and length
3. Add hotset pointers in the manifest
4. Add an epoch overlay model
**Hardware Profiles**
- **Core**: 8 KB code + 8 KB data + 64 KB SIMD (Cognitum tile)
- **Hot**: Multi-tile chip with shared memory
- **Full**: Desktop/server with mmap and full feature set
**Key Numbers**
- Boot: 4 KB read, < 5 ms
- First query: <= 4 MB read, recall >= 0.70
- Full quality: recall >= 0.95
- Signing: ML-DSA-65, 3,309 B signatures, ~4,500 sign/s
- Distance: 384-dim fp16 L2 in ~12 AVX-512 cycles
- Hot entry: 960 bytes (vector + 16 neighbors, cache-line aligned)
**Design Choices**
- Append-only + compaction (not random writes)
- Both mmap desktop and microcontroller tiles
- Priority: streamable > progressive > adaptive > p95 speed

# RVF Implementation Swarm Guidance
## Objective
Implement, test, optimize, and publish the RVF (RuVector Format) as the canonical binary format across all RuVector libraries. Deliver as Rust crates (crates.io), WASM packages (npm), and Node.js N-API bindings (npm).
## Phase Overview
```
Phase 1: Foundation (rvf-types + rvf-wire) ── Week 1-2
Phase 2: Core Runtime (manifest + index + quant) ── Week 3-5
Phase 3: Integration (library adapters) ── Week 6-8
Phase 4: WASM + Node Bindings ── Week 9-10
Phase 5: Testing + Benchmarks ── Week 11-12
Phase 6: Optimization + Publishing ── Week 13-14
```
---
## Phase 1: Foundation — `rvf-types` + `rvf-wire`
### Agent Assignments
| Agent | Role | Crate | Deliverable |
|-------|------|-------|-------------|
| **coder-1** | Types specialist | `crates/rvf/rvf-types/` | All segment types, enums, headers |
| **coder-2** | Wire format specialist | `crates/rvf/rvf-wire/` | Read/write segment headers + payloads |
| **tester-1** | TDD for types/wire | `crates/rvf/rvf-types/tests/`, `crates/rvf/rvf-wire/tests/` | Round-trip tests, fuzz targets |
| **reviewer-1** | Spec compliance | N/A | Verify code matches wire format spec |
### `rvf-types` (no_std, no alloc dependency)
```toml
[package]
name = "rvf-types"
version = "0.1.0"
edition = "2021"
description = "RuVector Format core types — segment headers, enums, flags"
license = "MIT OR Apache-2.0"
categories = ["data-structures", "no-std"]
[features]
default = []
std = []
serde = ["dep:serde"]
```
**Files to create:**
```
crates/rvf/rvf-types/
  src/
    lib.rs        # Re-exports
    segment.rs    # SegmentHeader (64 bytes), SegmentType enum
    flags.rs      # Flags bitfield (COMPRESSED, ENCRYPTED, SIGNED, etc.)
    manifest.rs   # Level0Root (4096 bytes), ManifestTag enum
    vec_seg.rs    # BlockDirectory, BlockHeader, DataType enum
    index_seg.rs  # IndexHeader, IndexType, AdjacencyLayout
    hot_seg.rs    # HotHeader, HotEntry layout
    quant_seg.rs  # QuantHeader, QuantType enum
    sketch_seg.rs # SketchHeader layout
    meta_seg.rs   # MetaField, FilterOp enum
    profile.rs    # ProfileId, ProfileMagic constants
    error.rs      # RvfError enum (format, query, write, tile, crypto)
    constants.rs  # Magic numbers, alignment, limits
  Cargo.toml
```
**Key constants (from spec):**
```rust
pub const SEGMENT_MAGIC: u32 = 0x52564653; // "RVFS"
pub const ROOT_MANIFEST_MAGIC: u32 = 0x52564D30; // "RVM0"
pub const SEGMENT_ALIGNMENT: usize = 64;
pub const ROOT_MANIFEST_SIZE: usize = 4096;
pub const MAX_SEGMENT_PAYLOAD: u64 = 4 * 1024 * 1024 * 1024; // 4 GB
```
**SegmentType enum (from spec 01):**
```rust
#[repr(u8)]
pub enum SegmentType {
Invalid = 0x00,
Vec = 0x01,
Index = 0x02,
Overlay = 0x03,
Journal = 0x04,
Manifest = 0x05,
Quant = 0x06,
Meta = 0x07,
Hot = 0x08,
Sketch = 0x09,
Witness = 0x0A,
Profile = 0x0B,
Crypto = 0x0C,
MetaIdx = 0x0D,
}
```
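A reader turning the wire byte back into this enum needs a fallible conversion, since unknown type bytes must be skippable rather than fatal. A minimal sketch with a hypothetical `from_u8` helper (variants abbreviated; the helper name is illustrative, not part of the spec):

```rust
// Fallible conversion from a raw header byte to a SegmentType.
// Variants are abbreviated here; the full enum is given in the spec above.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SegmentType {
    Invalid = 0x00,
    Vec = 0x01,
    Index = 0x02,
    // ... remaining variants elided for brevity
    MetaIdx = 0x0D,
}

// Illustrative helper: unknown bytes yield None so a forward-compatible
// reader can skip the segment instead of failing.
pub fn from_u8(b: u8) -> Option<SegmentType> {
    Some(match b {
        0x00 => SegmentType::Invalid,
        0x01 => SegmentType::Vec,
        0x02 => SegmentType::Index,
        0x0D => SegmentType::MetaIdx,
        _ => return None,
    })
}
```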
### `rvf-wire` (no_std + alloc)
```toml
[package]
name = "rvf-wire"
version = "0.1.0"
description = "RuVector Format wire format reader/writer"
[dependencies]
rvf-types = { path = "../rvf-types" }
[features]
default = ["std"]
std = ["rvf-types/std"]
```
**Files to create:**
```
crates/rvf/rvf-wire/
  src/
    lib.rs
    reader.rs           # SegmentReader: parse header, validate magic/hash
    writer.rs           # SegmentWriter: build header, compute hash, align
    varint.rs           # LEB128 encode/decode
    delta.rs            # Delta encoding with restart points
    crc32c.rs           # CRC32C (software + hardware detect)
    xxh3.rs             # XXH3-128 hash (or re-export from xxhash-rust)
    tail_scan.rs        # find_latest_manifest() backward scan
    manifest_reader.rs  # Level 0 root manifest parser
    manifest_writer.rs  # Level 0 + Level 1 manifest builder
    vec_seg_codec.rs    # VEC_SEG columnar encode/decode
    hot_seg_codec.rs    # HOT_SEG interleaved encode/decode
    index_seg_codec.rs  # INDEX_SEG adjacency encode/decode
  Cargo.toml
```
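The `varint.rs` codec above is standard unsigned LEB128: 7 payload bits per byte, high bit set on all but the last byte. A minimal sketch of what the encode/decode pair might look like (function names are illustrative):

```rust
// Unsigned LEB128 encode: emit 7 bits per byte, low bits first,
// with the high bit marking "more bytes follow".
pub fn encode_uleb128(mut v: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (v & 0x7F) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

// Decode one LEB128 value; returns (value, bytes consumed),
// or None if the input is truncated mid-value.
pub fn decode_uleb128(buf: &[u8]) -> Option<(u64, usize)> {
    let mut v: u64 = 0;
    for (i, &b) in buf.iter().enumerate().take(10) {
        v |= u64::from(b & 0x7F) << (7 * i as u32);
        if b & 0x80 == 0 {
            return Some((v, i + 1));
        }
    }
    None
}
```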
### Phase 1 Acceptance Criteria
- [ ] `rvf-types` compiles with `#![no_std]`
- [ ] `rvf-wire` round-trips: create segment -> serialize -> deserialize -> compare
- [ ] Tail scan finds manifest in valid file
- [ ] CRC32C matches reference implementation
- [ ] Varint codec matches LEB128 spec
- [ ] `cargo test` passes for both crates
- [ ] `cargo clippy` clean, `cargo fmt` clean
---
## Phase 2: Core Runtime — manifest + index + quant
### Agent Assignments
| Agent | Role | Crate | Deliverable |
|-------|------|-------|-------------|
| **coder-3** | Manifest system | `crates/rvf/rvf-manifest/` | Two-level manifest, progressive boot |
| **coder-4** | Progressive indexing | `crates/rvf/rvf-index/` | Layer A/B/C HNSW with progressive load |
| **coder-5** | Quantization | `crates/rvf/rvf-quant/` | Temperature-tiered quant (fp16/i8/PQ/binary) |
| **coder-6** | Full runtime | `crates/rvf/rvf-runtime/` | RvfStore API, compaction, append-only |
| **tester-2** | Integration tests | `crates/rvf/tests/` | Progressive load, crash safety, recall |
### `rvf-manifest`
**Key functionality:**
- Parse Level 0 root manifest (4096 bytes) -> extract hotset pointers
- Parse Level 1 TLV records -> build segment directory
- Write new manifest on mutation (two-fsync protocol)
- Manifest chain for rollback (OVERLAY_CHAIN record)
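The Level 0 parse boils down to a fixed-size read plus validation. A hedged sketch of the first step, assuming the magic sits in the first four bytes stored little-endian (field offsets here are illustrative, not the normative layout from the wire spec):

```rust
pub const ROOT_MANIFEST_MAGIC: u32 = 0x52564D30; // "RVM0"
pub const ROOT_MANIFEST_SIZE: usize = 4096;

// Validate a candidate Level 0 root manifest block: exact size, then magic.
// A real parser would go on to read hotset pointers and the Level 1 offset.
pub fn check_root_manifest(block: &[u8]) -> Result<u32, &'static str> {
    if block.len() != ROOT_MANIFEST_SIZE {
        return Err("root manifest must be exactly 4096 bytes");
    }
    let magic = u32::from_le_bytes([block[0], block[1], block[2], block[3]]);
    if magic != ROOT_MANIFEST_MAGIC {
        return Err("bad root manifest magic");
    }
    Ok(magic)
}
```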
### `rvf-index`
**Key functionality:**
- Layer A: Entry points + top-layer adjacency (from INDEX_SEG with HOT flag)
- Layer B: Partial adjacency for hot region (built incrementally)
- Layer C: Full HNSW adjacency (built lazily in background)
- Varint delta-encoded neighbor lists with restart points
- Prefetch hints for cache-friendly traversal
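The delta step of those neighbor lists is gap coding over sorted IDs: store the first ID absolute and each subsequent one as the difference from its predecessor, so the varint layer sees small values. A minimal sketch (restart points omitted; names illustrative):

```rust
// Gap-code a sorted neighbor list: first entry absolute, rest as deltas.
pub fn delta_encode(ids: &[u32]) -> Vec<u32> {
    let mut out = Vec::with_capacity(ids.len());
    let mut prev = 0u32;
    for (i, &id) in ids.iter().enumerate() {
        out.push(if i == 0 { id } else { id - prev });
        prev = id;
    }
    out
}

// Invert the gap coding by running-sum reconstruction.
pub fn delta_decode(deltas: &[u32]) -> Vec<u32> {
    let mut out = Vec::with_capacity(deltas.len());
    let mut acc = 0u32;
    for (i, &d) in deltas.iter().enumerate() {
        acc = if i == 0 { d } else { acc + d };
        out.push(acc);
    }
    out
}
```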
**Integration with existing ruvector-core HNSW:**
- Wrap `hnsw_rs` graph as the in-memory structure
- Serialize HNSW to INDEX_SEG format
- Deserialize INDEX_SEG into `hnsw_rs` layers
### `rvf-quant`
**Key functionality:**
- Scalar quantization: fp32 -> int8 (4x compression)
- Product quantization: M subspaces, K centroids (8-16x compression)
- Binary quantization: sign bit (32x compression)
- QUANT_SEG read/write for codebooks
- Temperature tier assignment from SKETCH_SEG access counters
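The fp32 -> int8 path can be sketched as symmetric scalar quantization with a per-vector scale (an illustrative scheme; the normative codebook layout lives in QUANT_SEG):

```rust
// Symmetric int8 quantization: scale = max|x| / 127, then round each
// component into [-127, 127]. Returns the codes plus the scale needed
// to dequantize.
pub fn quantize_i8(v: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = v.iter().fold(0f32, |m, x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = v.iter().map(|x| (x / scale).round() as i8).collect();
    (q, scale)
}

// Approximate reconstruction; error is bounded by scale / 2 per component.
pub fn dequantize_i8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&x| x as f32 * scale).collect()
}
```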
### `rvf-runtime`
**Key functionality:**
- `RvfStore::create()` / `RvfStore::open()` / `RvfStore::open_readonly()`
- Append-only write path (VEC_SEG + MANIFEST_SEG)
- Progressive load sequence (Level 0 -> hotset -> Level 1 -> on-demand)
- Background compaction (IO-budget-aware, priority-ordered)
- Count-Min Sketch maintenance for temperature decisions
- Promotion/demotion lifecycle
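The Count-Min Sketch behind temperature decisions is just a few rows of hashed counters: every access increments one bucket per row, and the frequency estimate is the minimum across rows (an overestimate, never an underestimate). A minimal sketch with an illustrative hash mix (width, depth, and the mixer are all assumptions, not the SKETCH_SEG spec):

```rust
// Count-Min Sketch for per-vector access frequency, used to decide
// promotion/demotion between temperature tiers.
pub struct CountMin {
    width: usize,
    rows: Vec<Vec<u32>>,
}

impl CountMin {
    pub fn new(width: usize, depth: usize) -> Self {
        CountMin { width, rows: vec![vec![0; width]; depth] }
    }

    // Cheap splitmix64-style per-row mix; not the normative hash.
    fn bucket(&self, row: usize, id: u64) -> usize {
        let mut h = id ^ (row as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15);
        h ^= h >> 33;
        h = h.wrapping_mul(0xFF51_AFD7_ED55_8CCD);
        h ^= h >> 33;
        (h as usize) % self.width
    }

    pub fn record_access(&mut self, id: u64) {
        for r in 0..self.rows.len() {
            let b = self.bucket(r, id);
            self.rows[r][b] = self.rows[r][b].saturating_add(1);
        }
    }

    // Minimum over rows: collisions can only inflate, never deflate.
    pub fn estimate(&self, id: u64) -> u32 {
        (0..self.rows.len())
            .map(|r| self.rows[r][self.bucket(r, id)])
            .min()
            .unwrap_or(0)
    }
}
```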
### Phase 2 Acceptance Criteria
- [ ] Progressive boot: parse Level 0 in < 1 ms, first query in < 50 ms (1M vectors)
- [ ] Recall@10 >= 0.70 with Layer A only
- [ ] Recall@10 >= 0.95 with all layers loaded
- [ ] Crash safety: kill -9 during write -> recover to last valid manifest
- [ ] Compaction reduces dead space while respecting IO budget
- [ ] Scalar quantization reconstruction error < 0.5%
---
## Phase 3: Integration — Library Adapters
### Agent Assignments
| Agent | Role | Target Library | Deliverable |
|-------|------|---------------|-------------|
| **coder-7** | claude-flow adapter | claude-flow memory | RVF-backed memory store |
| **coder-8** | agentdb adapter | agentdb | RVF as persistence backend |
| **coder-9** | agentic-flow adapter | agentic-flow | RVF streaming for inter-agent exchange |
| **coder-10** | rvlite adapter | rvlite | RVF Core Profile minimal store |
### claude-flow Memory -> RVF
```
Current: JSON flat files + in-memory HNSW
Target:  RVF file per memory namespace

Mapping:
  memory store   -> RvfStore with RVText profile
  memory search  -> rvf_runtime.query()
  memory persist -> RVF append (VEC_SEG + META_SEG + MANIFEST_SEG)
  audit trail    -> WITNESS_SEG with hash chain
  session state  -> META_SEG with TTL metadata
```
### agentdb -> RVF
```
Current: Custom HNSW + serde persistence
Target:  RVF file per database instance

Mapping:
  agentdb.insert()  -> rvf_runtime.ingest_batch()
  agentdb.search()  -> rvf_runtime.query()
  agentdb.persist() -> already persistent (append-only)
  HNSW graph        -> INDEX_SEG (Layer A/B/C)
  Metadata          -> META_SEG + METAIDX_SEG
```
### agentic-flow -> RVF
```
Current: Shared memory blobs between agents
Target:  RVF TCP streaming protocol

Mapping:
  agent memory share -> RVF SUBSCRIBE + UPDATE_NOTIFY
  swarm state        -> META_SEG in shared RVF file
  learning patterns  -> SKETCH_SEG for access tracking
  consensus state    -> WITNESS_SEG with signatures
```
### Phase 3 Acceptance Criteria
- [ ] claude-flow `memory store` and `memory search` work against RVF backend
- [ ] agentdb existing test suite passes with RVF storage (swap in, not rewrite)
- [ ] agentic-flow agents can share vectors through RVF streaming protocol
- [ ] Legacy format import tools for each library
---
## Phase 4: WASM + Node.js Bindings
### Agent Assignments
| Agent | Role | Target | Deliverable |
|-------|------|--------|-------------|
| **coder-11** | WASM microkernel | `crates/rvf/rvf-wasm/` | 14-export WASM module (<8 KB) |
| **coder-12** | WASM full runtime | `npm/packages/rvf-wasm/` | wasm-pack build, browser-compatible |
| **coder-13** | Node.js N-API | `crates/rvf/rvf-node/` | napi-rs bindings, platform packages |
| **coder-14** | TypeScript SDK | `npm/packages/rvf/` | TypeScript wrapper, types, docs |
### WASM Microkernel (`rvf-wasm` crate, `wasm32-unknown-unknown`)
```rust
// 14 exports matching spec (microkernel/wasm-runtime.md)
#[no_mangle] pub extern "C" fn rvf_init(config_ptr: i32) -> i32;
#[no_mangle] pub extern "C" fn rvf_load_query(query_ptr: i32, dim: i32) -> i32;
#[no_mangle] pub extern "C" fn rvf_load_block(block_ptr: i32, count: i32, dtype: i32) -> i32;
#[no_mangle] pub extern "C" fn rvf_distances(metric: i32, result_ptr: i32) -> i32;
#[no_mangle] pub extern "C" fn rvf_topk_merge(dist_ptr: i32, id_ptr: i32, count: i32, k: i32) -> i32;
#[no_mangle] pub extern "C" fn rvf_topk_read(out_ptr: i32) -> i32;
// ... remaining 8 exports
```
**Build command:**
```bash
cargo build --target wasm32-unknown-unknown --release -p rvf-wasm
wasm-opt -Oz -o rvf-microkernel.wasm target/wasm32-unknown-unknown/release/rvf_wasm.wasm
```
**Size budget:** Must be < 8 KB after wasm-opt.
### WASM Full Runtime (wasm-pack, browser)
```bash
cd crates/rvf/rvf-runtime
wasm-pack build --target web --features wasm
```
**npm package:** `@ruvector/rvf-wasm`
```typescript
// npm/packages/rvf-wasm/index.ts
import init, { RvfStore } from './pkg/rvf_runtime.js';
await init();
const store = RvfStore.fromBytes(rvfFileBytes);
const results = store.query(queryVector, 10);
```
### Node.js N-API Bindings (napi-rs)
```bash
cd crates/rvf/rvf-node
npm run build # napi build --platform --release
```
**Platform packages:**
| Package | Target |
|---------|--------|
| `@ruvector/rvf-node` | Main package with postinstall platform select |
| `@ruvector/rvf-node-linux-x64-gnu` | Linux x86_64 glibc |
| `@ruvector/rvf-node-linux-arm64-gnu` | Linux aarch64 glibc |
| `@ruvector/rvf-node-darwin-arm64` | macOS Apple Silicon |
| `@ruvector/rvf-node-darwin-x64` | macOS Intel |
| `@ruvector/rvf-node-win32-x64-msvc` | Windows x64 |
### TypeScript SDK
```typescript
// npm/packages/rvf/src/index.ts
export class RvfDatabase {
static async open(path: string): Promise<RvfDatabase>;
static async create(path: string, options?: RvfOptions): Promise<RvfDatabase>;
async insert(id: string, vector: Float32Array, metadata?: Record<string, unknown>): Promise<void>;
async insertBatch(entries: RvfEntry[]): Promise<RvfIngestResult>;
async query(vector: Float32Array, k: number, options?: RvfQueryOptions): Promise<RvfResult[]>;
async delete(ids: string[]): Promise<RvfDeleteResult>;
// Progressive loading
async openProgressive(source: string | URL): Promise<RvfProgressiveReader>;
}
export interface RvfOptions {
profile?: 'generic' | 'rvdna' | 'rvtext' | 'rvgraph' | 'rvvision';
dimensions: number;
metric?: 'l2' | 'cosine' | 'dotproduct' | 'hamming';
compression?: 'none' | 'lz4' | 'zstd';
signing?: { algorithm: 'ed25519' | 'ml-dsa-65'; key: Uint8Array };
}
```
### Phase 4 Acceptance Criteria
- [ ] WASM microkernel < 8 KB after wasm-opt
- [ ] WASM full runtime works in Chrome, Firefox, Node.js
- [ ] N-API bindings pass same test suite as Rust crate
- [ ] TypeScript types match Rust API surface
- [ ] All platform binaries build in CI
---
## Phase 5: Testing + Benchmarks
### Agent Assignments
| Agent | Role | Scope |
|-------|------|-------|
| **tester-3** | Acceptance tests | 10M vector cold start, recall, crash safety |
| **tester-4** | Benchmark harness | criterion benches, perf targets from spec |
| **tester-5** | Fuzz testing | cargo-fuzz for wire format parsing |
| **tester-6** | WASM tests | Browser + Cognitum tile simulation |
### Test Matrix
| Test Category | Description | Target |
|--------------|-------------|--------|
| **Round-trip** | Write + read all segment types | `rvf-wire` |
| **Progressive boot** | Cold start, measure recall at each phase | `rvf-runtime` |
| **Crash safety** | kill -9 during ingest/manifest/compaction | `rvf-runtime` |
| **Bit flip detection** | Random corruption -> hash/CRC catch | `rvf-wire` |
| **Recall benchmarks** | recall@10 at Layer A, B, C | `rvf-index` |
| **Latency benchmarks** | p50/p95/p99 query latency | `rvf-runtime` |
| **Throughput benchmarks** | QPS and ingest rate | `rvf-runtime` |
| **WASM performance** | Distance compute, top-K in WASM | `rvf-wasm` |
| **Interop** | agentdb/claude-flow/agentic-flow integration | adapters |
| **Profile compatibility** | Generic reader opens RVDNA/RVText files | `rvf-runtime` |
### Benchmark Commands
```bash
# Rust benchmarks
cd crates/rvf/rvf-runtime && cargo bench
# WASM benchmarks
cd npm/packages/rvf-wasm && npm run bench
# Node.js benchmarks
cd npm/packages/rvf-node && npm run bench
# Full acceptance test (10M vectors)
cd crates/rvf && cargo test --release --test acceptance -- --ignored
```
### Phase 5 Acceptance Criteria
- [ ] All performance targets from `benchmarks/acceptance-tests.md` met
- [ ] Zero data loss in crash safety tests (100 iterations)
- [ ] 100% bit-flip detection rate
- [ ] WASM microkernel passes Cognitum tile simulation
- [ ] No memory safety issues found by fuzz testing (1M iterations)
---
## Phase 6: Optimization + Publishing
### Agent Assignments
| Agent | Role | Scope |
|-------|------|-------|
| **optimizer-1** | SIMD tuning | AVX-512/NEON distance kernels, alignment |
| **optimizer-2** | Compression tuning | LZ4/ZSTD level selection, block size |
| **publisher-1** | crates.io publishing | Version management, dependency graph |
| **publisher-2** | npm publishing | Platform packages, wasm-pack output |
### SIMD Optimization Targets
| Operation | AVX-512 Target | NEON Target | WASM v128 Target |
|-----------|---------------|-------------|-----------------|
| L2 distance (384-dim fp16) | ~12 cycles | ~48 cycles | ~96 cycles |
| Dot product (384-dim fp16) | ~12 cycles | ~48 cycles | ~96 cycles |
| Hamming (384-bit) | 1 cycle (VPOPCNTDQ) | ~6 cycles (CNT) | ~24 cycles |
| PQ ADC (48 subspaces) | ~48 cycles (gather) | ~96 cycles (TBL) | ~192 cycles |
### Publishing Dependency Order
Crates must be published in dependency order:
```
1. rvf-types (no deps)
2. rvf-wire (depends on rvf-types)
3. rvf-quant (depends on rvf-types)
4. rvf-manifest (depends on rvf-types, rvf-wire)
5. rvf-index (depends on rvf-types, rvf-wire, rvf-quant)
6. rvf-crypto (depends on rvf-types, rvf-wire)
7. rvf-runtime (depends on all above)
8. rvf-wasm (depends on rvf-types, rvf-wire, rvf-quant)
9. rvf-node (depends on rvf-runtime)
10. rvf-server (depends on rvf-runtime)
```
### crates.io Publishing
```bash
# Publish in dependency order
for crate in rvf-types rvf-wire rvf-quant rvf-manifest rvf-index rvf-crypto rvf-runtime rvf-wasm rvf-node rvf-server; do
cd crates/rvf/$crate
cargo publish
sleep 30 # Wait for crates.io index update
cd -
done
```
### npm Publishing
```bash
# WASM package
cd npm/packages/rvf-wasm
npm publish --access public
# Node.js platform binaries
for platform in linux-x64-gnu linux-arm64-gnu darwin-arm64 darwin-x64 win32-x64-msvc; do
cd npm/packages/rvf-node-$platform
npm publish --access public
cd -
done
# Main Node.js package
cd npm/packages/rvf-node
npm publish --access public
# TypeScript SDK
cd npm/packages/rvf
npm publish --access public
```
### Phase 6 Acceptance Criteria
- [ ] SIMD distance kernels meet cycle targets on each platform
- [ ] All crates published to crates.io with correct dependency graph
- [ ] All npm packages published with correct platform detection
- [ ] `npx rvf --version` works
- [ ] `npm install @ruvector/rvf` works on all supported platforms
- [ ] GitHub release with changelog
---
## Swarm Topology
```
                     ┌──────────────┐
                     │    Queen     │
                     │ Coordinator  │
                     └──────┬───────┘
        ┌───────────────────┼───────────────────┐
        │                   │                   │
 ┌──────▼──────┐     ┌──────▼──────┐     ┌──────▼──────┐
 │ Foundation  │     │   Runtime   │     │ Integration │
 │    Squad    │     │    Squad    │     │    Squad    │
 │ (coder 1-2) │     │ (coder 3-6) │     │ (coder 7-10)│
 │ (tester-1)  │     │ (tester-2)  │     │             │
 │ (reviewer-1)│     │             │     │             │
 └──────┬──────┘     └──────┬──────┘     └──────┬──────┘
        │                   │                   │
        │         ┌─────────┴─────────┐         │
        │  ┌──────▼──────┐     ┌──────▼──────┐  │
        │  │ WASM + Node │     │   Testing   │  │
        │  │    Squad    │     │    Squad    │  │
        │  │(coder 11-14)│     │ (tester 3-6)│  │
        │  └──────┬──────┘     └──────┬──────┘  │
        │         └─────────┬─────────┘         │
        └───────────────────┼───────────────────┘
                     ┌──────▼──────┐
                     │ Optimize +  │
                     │   Publish   │
                     │    Squad    │
                     └─────────────┘
```
### Swarm Init Command
```bash
npx @claude-flow/cli@latest swarm init \
--topology hierarchical \
--max-agents 8 \
--strategy specialized
```
### Agent Spawn Commands (via Claude Code Task tool)
All agents should be spawned as `run_in_background: true` Task calls in a single message. Each agent receives:
1. The relevant RVF spec files to read (from `docs/research/rvf/`)
2. The ADR-029 for context
3. The specific phase deliverables from this guidance
4. The acceptance criteria as exit conditions
---
## Critical Path
```
rvf-types ──> rvf-wire ──┬──> rvf-manifest ──┐
                         ├──> rvf-quant ─────┼──> rvf-runtime ──> adapters ──> publish
                         ├──> rvf-index ─────┘
                         ├──> rvf-wasm (parallel)
                         └──> rvf-node (parallel)
```
**Blocking dependencies:**
- Everything depends on `rvf-types`
- `rvf-wire` unlocks all other crates
- `rvf-runtime` blocks integration adapters
- `rvf-wasm` and `rvf-node` can proceed in parallel once `rvf-wire` exists
---
## File Layout Summary
```
crates/rvf/
  rvf-types/     # Segment types, headers, enums (no_std)
  rvf-wire/      # Wire format read/write (no_std + alloc)
  rvf-index/     # Progressive HNSW indexing
  rvf-manifest/  # Two-level manifest system
  rvf-quant/     # Temperature-tiered quantization
  rvf-crypto/    # ML-DSA-65, SHAKE-256
  rvf-runtime/   # Full runtime (RvfStore API)
  rvf-wasm/      # WASM microkernel (<8 KB)
  rvf-node/      # Node.js N-API bindings
  rvf-server/    # TCP/HTTP streaming server
  tests/         # Integration + acceptance tests
  benches/       # Criterion benchmarks

npm/packages/
  rvf/           # TypeScript SDK (@ruvector/rvf)
  rvf-wasm/      # Browser WASM (@ruvector/rvf-wasm)
  rvf-node/      # Node.js native (@ruvector/rvf-node)
  rvf-node-linux-x64-gnu/
  rvf-node-linux-arm64-gnu/
  rvf-node-darwin-arm64/
  rvf-node-darwin-x64/
  rvf-node-win32-x64-msvc/
```
---
## Success Metrics
| Metric | Target | Measured By |
|--------|--------|-------------|
| Cold boot time | < 5 ms | Phase 5 acceptance test |
| First query recall@10 | >= 0.70 | Phase 5 recall benchmark |
| Full recall@10 | >= 0.95 | Phase 5 recall benchmark |
| Query latency p50 | < 0.3 ms (10M vectors) | Phase 5 latency benchmark |
| WASM microkernel size | < 8 KB | Phase 4 build output |
| Crash safety | 0 data loss in 100 kill tests | Phase 5 crash test |
| Crates published | 10 crates on crates.io | Phase 6 publish |
| NPM packages published | 8+ packages on npm | Phase 6 publish |
| Library integration | 4 libraries using RVF | Phase 3 adapter tests |

# RVF Acceptance Tests and Performance Targets
## 1. Primary Acceptance Test
> **Cold start on a 10 million vector file: load and answer the first query with a
> useful result (recall@10 >= 0.70) without reading more than the last 4 MB, then
> converge to full quality (recall@10 >= 0.95) as it progressively maps more segments.**
### Test Parameters
```
Dataset: 10 million vectors
Dimensions: 384 (sentence embedding size)
Base dtype: fp16 (768 bytes per vector)
Raw file size: ~7.2 GB (vectors only)
With index: ~10-12 GB total
Query set: 1000 queries from held-out test set
Ground truth: Brute-force exact k-NN (k=10)
Metric: L2 distance
```
### Success Criteria
| Phase | Time Budget | Data Read | Min Recall@10 | Description |
|-------|------------|-----------|---------------|-------------|
| Boot | < 5 ms | 4 KB (Level 0) | N/A | Parse root manifest |
| First query | < 50 ms | <= 4 MB | >= 0.70 | Layer A + hot cache |
| Working quality | < 500 ms | <= 200 MB | >= 0.85 | Layer A + B |
| Full quality | < 5 s | <= 4 GB | >= 0.95 | Layers A + B + C |
| Optimized | < 30 s | Full file | >= 0.98 | All layers + hot tier |
### Measurement Methodology
```
1. Create RVF file from 10M vector dataset
   - Build full HNSW index (M=16, ef_construction=200)
   - Compute temperature tiers (default: all warm initially)
   - Write with all segment types

2. Cold start measurement
   - Drop filesystem cache: echo 3 > /proc/sys/vm/drop_caches
   - Open file, start timer
   - Read Level 0 (4 KB), record time T_boot
   - Read hotset data, record time T_hotset
   - Execute first query, record time T_first_query and recall@10
   - Continue progressive loading
   - At each milestone: record time, data read, recall@10

3. Throughput measurement (warm)
   - After full load, execute 1000 queries
   - Measure queries per second (QPS)
   - Measure p50, p95, p99 latency
   - Measure recall@10 average

4. Streaming ingest measurement
   - Start with empty file
   - Ingest 10M vectors in streaming mode
   - Measure ingest rate (vectors/second)
   - Measure file size over time
   - Verify crash safety (kill -9 at random points, verify recovery)
```
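The recall@10 figures in this methodology are plain set overlap against the brute-force ground truth, averaged over queries; a minimal sketch:

```rust
// recall@K = |retrieved ∩ ground_truth| / |ground_truth|, averaged
// over all queries. Both inputs are per-query ID lists of length K.
pub fn recall_at_k(retrieved: &[Vec<u64>], ground_truth: &[Vec<u64>]) -> f64 {
    assert_eq!(retrieved.len(), ground_truth.len());
    let mut total = 0.0;
    for (ret, gt) in retrieved.iter().zip(ground_truth) {
        let hits = ret.iter().filter(|id| gt.contains(*id)).count();
        total += hits as f64 / gt.len() as f64;
    }
    total / retrieved.len() as f64
}
```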
## 2. Performance Targets
### Query Latency (10M vectors, 384 dim, fp16)
| Hardware | QPS (single thread) | p50 Latency | p95 Latency | p99 Latency |
|----------|-------------------|-------------|-------------|-------------|
| Desktop (AVX-512) | 5,000-15,000 | 0.1 ms | 0.3 ms | 1.0 ms |
| Desktop (AVX2) | 3,000-8,000 | 0.2 ms | 0.5 ms | 2.0 ms |
| Laptop (NEON) | 2,000-5,000 | 0.3 ms | 1.0 ms | 3.0 ms |
| WASM (browser) | 500-2,000 | 1.0 ms | 3.0 ms | 10.0 ms |
| Cognitum tile | 100-500 | 2.0 ms | 5.0 ms | 15.0 ms |
### Streaming Ingest Rate
| Hardware | Vectors/Second | Bytes/Second | Notes |
|----------|---------------|-------------|-------|
| NVMe SSD | 200K-500K | 150-380 MB/s | fsync every 1000 vectors |
| SATA SSD | 50K-100K | 38-76 MB/s | fsync every 1000 vectors |
| HDD | 10K-30K | 7-23 MB/s | Sequential append |
| Network (1 Gbps) | 50K-100K | 38-76 MB/s | Streaming over network |
### Progressive Load Times
| Phase | NVMe SSD | SATA SSD | HDD | Network |
|-------|----------|----------|-----|---------|
| Boot (4 KB) | < 0.1 ms | < 0.5 ms | < 10 ms | < 50 ms |
| First query (4 MB) | < 2 ms | < 10 ms | < 100 ms | < 500 ms |
| Working quality (200 MB) | < 100 ms | < 500 ms | < 5 s | < 20 s |
| Full quality (4 GB) | < 2 s | < 10 s | < 120 s | < 400 s |
### Space Efficiency
| Configuration | Bytes/Vector | File Size (10M) | Ratio vs Raw |
|--------------|-------------|-----------------|-------------|
| Raw fp32 | 1,536 | 14.3 GB | 1.0x |
| RVF uniform fp16 | 768 + overhead | 8.0 GB | 0.56x |
| RVF adaptive (equilibrium) | ~300 avg | 3.2 GB | 0.22x |
| RVF aggressive (binary cold) | ~100 avg | 1.1 GB | 0.08x |
## 3. Crash Safety Tests
### Test 1: Kill During Vector Ingest
```
1. Start ingesting 1M vectors
2. After 500K vectors: kill -9 the writer
3. Verify: file is readable
4. Verify: latest valid manifest is found
5. Verify: all vectors referenced by latest manifest are intact
6. Verify: no data corruption (all segment hashes valid)
```
**Pass criteria**: Zero data loss for committed segments. At most the
last incomplete segment is lost (bounded by fsync interval).
### Test 2: Kill During Manifest Write
```
1. Create file with 1M vectors
2. Trigger manifest rewrite (add metadata, trigger compaction)
3. Kill -9 during manifest write
4. Verify: file falls back to previous valid manifest
5. Verify: all queries work correctly with previous manifest
```
**Pass criteria**: Automatic fallback to previous manifest. No manual
recovery needed.
### Test 3: Kill During Compaction
```
1. Create file with 1M vectors across 100 small VEC_SEGs
2. Trigger compaction
3. Kill -9 during compaction
4. Verify: file is readable (old segments still valid)
5. Verify: partial compaction output is safely ignored
```
**Pass criteria**: Old segments remain valid. Incomplete compaction
output has no manifest reference and is safely orphaned.
### Test 4: Bit Flip Detection
```
1. Create valid RVF file
2. Flip random bits in various locations
3. Verify: corruption detected by hash/CRC checks
4. Verify: specific corrupted segment identified
5. Verify: other segments still readable
```
**Pass criteria**: 100% detection of single-bit flips. Corruption
isolated to affected segment.
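Detection rests on the per-segment CRC/hash. A bitwise software CRC32C (Castagnoli, reflected polynomial 0x82F63B78) is enough to demonstrate single-bit-flip detection; real builds use lookup tables or the SSE4.2/ARMv8 CRC instructions, as the `crc32c.rs` module notes:

```rust
// Bitwise software CRC32C (Castagnoli). Init and final XOR are all-ones,
// matching the standard check value crc32c("123456789") = 0xE3069283.
pub fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &b in data {
        crc ^= b as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ 0x82F6_3B78
            } else {
                crc >> 1
            };
        }
    }
    !crc
}
```

Flipping any single bit in a checked payload changes the CRC, which is how Test 4 isolates corruption to the affected segment.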
## 4. Scalability Tests
### Test: 1 Billion Vectors
```
Dataset: 1B vectors, 384 dimensions, fp16
File size: ~700 GB (raw) -> ~200 GB (adaptive RVF)
Hardware: Server with 256 GB RAM, NVMe array
Verify:
- Boot time < 10 ms
- First query < 100 ms
- Full quality convergence < 60 s
- Recall@10 >= 0.95 at full quality
- Streaming ingest sustained at 100K+ vectors/second
```
### Test: High Dimensionality
```
Dataset: 1M vectors, 4096 dimensions (LLM embeddings)
File size: ~8 GB (fp16)
Verify:
- PQ compression to 5-bit achieves >= 10x compression
- Recall@10 >= 0.90 with PQ
- Query latency < 5 ms (p95) with PQ + HNSW
```
### Test: Multi-File Sharding
```
Dataset: 100M vectors across 10 shard files
Verify:
- Transparent query across all shards
- Shard addition without full rebuild
- Individual shard compaction
- Shard removal with manifest update only
```
## 5. WASM Performance Tests
### Browser Environment
```
Runtime: Chrome V8 / Firefox SpiderMonkey
SIMD: WASM v128
Memory: Limited to 4 GB WASM heap
Test: Load 1M vector RVF file via fetch()
- Boot time < 50 ms
- First query < 200 ms (after boot)
- QPS >= 500 (single thread)
- Memory usage < 500 MB
```
### Cognitum Tile Simulation
```
Runtime: wasmtime with memory limits
Code limit: 8 KB
Data limit: 8 KB
Scratch: 64 KB
Test: Process 1000 blocks via hub protocol
- Distance computation matches reference implementation
- Top-K results match brute-force within quantization tolerance
- No memory access out of bounds
- Tile recovers from simulated faults
```
## 6. Interoperability Tests
### Round-Trip Test
```
1. Create RVF file from numpy arrays
2. Read back with independent implementation
3. Verify: all vectors bit-identical
4. Verify: all metadata preserved
5. Verify: index produces same results
```
### Profile Compatibility Test
```
1. Create RVDNA file with genomic data
2. Create RVText file with text embeddings
3. Read both with generic RVF reader
4. Verify: generic reader can access vectors and metadata
5. Verify: profile-specific features degrade gracefully
```
### Version Forward Compatibility Test
```
1. Create RVF file with version 1
2. Add segments with hypothetical version 2 features (unknown tags)
3. Read with version 1 reader
4. Verify: version 1 reader skips unknown segments/tags
5. Verify: version 1 data is fully accessible
```
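The skip behavior falls out of length-prefixed segments: a reader that does not recognize a type byte can still advance past the payload. An illustrative walker over a simplified `[type: u8][len: u32 LE][payload]` stream (the real segment header is 64 bytes; this layout is only for demonstration):

```rust
// Walk a stream of [type][len][payload] records, collecting the types we
// recognize and skipping unknown ones by their declared length.
pub fn known_segments(mut buf: &[u8], known: &[u8]) -> Vec<u8> {
    let mut seen = Vec::new();
    while buf.len() >= 5 {
        let ty = buf[0];
        let len = u32::from_le_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
        if buf.len() < 5 + len {
            break; // truncated tail record; stop at the last valid one
        }
        if known.contains(&ty) {
            seen.push(ty);
        }
        buf = &buf[5 + len..];
    }
    seen
}
```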
## 7. Security Tests
### Signature Verification
```
1. Create signed RVF file (ML-DSA-65)
2. Verify all segment signatures
3. Modify one byte in a signed segment
4. Verify: modification detected
5. Verify: other segments still valid
```
### Encryption Round-Trip
```
1. Create encrypted RVF file (ML-KEM-768 + AES-256-GCM)
2. Decrypt with correct key
3. Verify: plaintext matches original
4. Attempt decrypt with wrong key
5. Verify: decryption fails (GCM auth tag mismatch)
```
### Key Rotation
```
1. Create file signed with key A
2. Rotate to key B (write CRYPTO_SEG rotation record)
3. Write new segments signed with key B
4. Verify: old segments valid with key A
5. Verify: new segments valid with key B
6. Verify: cross-signature in rotation record is valid
```
## 8. Benchmark Harness
### Recommended Tools
| Purpose | Tool | Notes |
|---------|------|-------|
| Latency measurement | criterion (Rust) / benchmark.js | Statistical rigor |
| Recall measurement | Custom recall@K computation | Against brute-force ground truth |
| Memory profiling | valgrind massif / Chrome DevTools | Peak and sustained |
| I/O profiling | blktrace / iostat | Verify read patterns |
| SIMD verification | Intel SDE / ARM emulator | Correct SIMD codegen |
| Crash testing | Custom harness with kill -9 | Random timing |
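The recall@K computation the table refers to is small enough to pin down precisely. A minimal sketch — the function name and candidate lists are illustrative, not part of the spec:

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the brute-force top-k that the approximate search found."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

exact = list(range(10))                    # ground-truth IDs from brute force
approx = [0, 1, 2, 3, 4, 5, 6, 7, 98, 99]  # ANN result: 8 of 10 correct
print(recall_at_k(approx, exact, 10))      # 0.8
```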
### Report Format
Each benchmark run produces a report:
```json
{
"test_name": "cold_start_10m",
"dataset": {
"vector_count": 10000000,
"dimensions": 384,
"dtype": "fp16",
"file_size_bytes": 10737418240
},
"hardware": {
"cpu": "Intel Xeon w5-3435X",
"simd": "AVX-512",
"ram_gb": 256,
"storage": "NVMe Samsung 990 Pro"
},
"results": {
"boot_ms": 0.08,
"first_query_ms": 12.3,
"first_query_recall_at_10": 0.73,
"working_quality_ms": 340,
"working_quality_recall_at_10": 0.87,
"full_quality_ms": 3200,
"full_quality_recall_at_10": 0.96,
"steady_state_qps": 8500,
"steady_state_p50_ms": 0.12,
"steady_state_p95_ms": 0.28,
"steady_state_p99_ms": 0.85,
"data_read_first_query_mb": 3.2,
"data_read_working_quality_mb": 180
}
}
```

# RVF Quantum-Resistant Cryptography
## 1. Threat Model
RVF files may contain high-value intelligence (medical genomics, proprietary
embeddings, classified networks). The cryptographic design must:
1. **Authenticate**: Prove a segment was written by an authorized producer
2. **Integrity**: Detect any modification to segment payloads
3. **Quantum resistance**: Survive attacks by future quantum computers
4. **Performance**: Not bottleneck streaming ingest or query paths
5. **Compactness**: Signatures must fit in segment footers without bloating
### Harvest-Now, Decrypt-Later
Adversaries may archive RVF files today and break classical signatures later
with quantum computers. Post-quantum signatures protect against this from day one.
## 2. Algorithm Selection
### NIST Post-Quantum Standards (FIPS 204, 205, 206)
| Algorithm | Standard | Type | Sig Size | PK Size | SK Size | Sign/s | Verify/s | Level |
|-----------|----------|------|----------|---------|---------|--------|----------|-------|
| ML-DSA-44 | FIPS 204 | Lattice | 2,420 B | 1,312 B | 2,560 B | ~9,000 | ~42,000 | 2 |
| ML-DSA-65 | FIPS 204 | Lattice | 3,309 B | 1,952 B | 4,032 B | ~4,500 | ~17,000 | 3 |
| ML-DSA-87 | FIPS 204 | Lattice | 4,627 B | 2,592 B | 4,896 B | ~2,800 | ~10,000 | 5 |
| SLH-DSA-128s | FIPS 205 | Hash | 7,856 B | 32 B | 64 B | ~350 | ~15,000 | 1 |
| SLH-DSA-128f | FIPS 205 | Hash | 17,088 B | 32 B | 64 B | ~3,000 | ~90,000 | 1 |
| FN-DSA-512 | FIPS 206 | Lattice | 666 B | 897 B | ~1.3 KB | ~5,000 | ~25,000 | 1 |
### RVF Default: ML-DSA-65
**Why ML-DSA-65**:
- NIST Level 3 security (128-bit post-quantum)
- 3,309 byte signatures (manageable in segment footer)
- ~4,500 sign/s (sufficient for streaming ingest at segment level)
- ~17,000 verify/s (fast enough for progressive load verification)
- Well-studied lattice assumption (Module-LWE)
**Alternative for size-constrained environments (Core Profile)**:
FN-DSA-512 with 666 byte signatures — but FIPS 206 is newer and less deployed.
**Alternative for maximum conservatism**:
SLH-DSA-128s (hash-based, stateless, minimal assumptions) — 7,856 byte
signatures but the smallest keys and strongest theoretical foundation.
## 3. Signature Scheme
### What Gets Signed
Each signed segment's signature covers:
```
signed_data = segment_header[0:40] # Header minus content_hash and padding
|| content_hash # The payload hash
|| segment_id_bytes # Prevent replay
|| context_string # Domain separation
```
The signature does NOT cover the raw payload directly — it covers the payload's
hash. This means:
- Signing is O(1) regardless of payload size
- The hash is computed during write anyway (required for integrity)
- Verification requires only the header + hash, not the full payload
### Context String
```
context = "RVF-v1-" || seg_type_name || "-" || profile_name
```
Examples:
- `"RVF-v1-VEC_SEG-rvdna"`
- `"RVF-v1-MANIFEST_SEG-generic"`
Domain separation prevents cross-type signature confusion.
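The signed-data construction and context string can be sketched directly. A hedged example — the 40-byte header prefix comes from the spec text, but the u64 little-endian segment ID encoding and the helper names are assumptions:

```python
def build_context(seg_type_name: str, profile_name: str) -> bytes:
    # context = "RVF-v1-" || seg_type_name || "-" || profile_name
    return b"RVF-v1-" + seg_type_name.encode() + b"-" + profile_name.encode()

def build_signed_data(header: bytes, content_hash: bytes,
                      segment_id: int, context: bytes) -> bytes:
    # Cover the header (minus content_hash/padding), the payload hash,
    # the segment ID (replay protection), and the context (domain separation).
    assert len(header) >= 40
    seg_id_bytes = segment_id.to_bytes(8, "little")  # assumed u64 LE
    return header[:40] + content_hash + seg_id_bytes + context

ctx = build_context("VEC_SEG", "rvdna")
print(ctx)  # b'RVF-v1-VEC_SEG-rvdna'
```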
### Key Management
Keys are stored in CRYPTO_SEG segments:
```
CRYPTO_SEG Payload:
key_type: u8
0 = signing public key
1 = verification certificate chain
2 = encryption public key (for ENCRYPTED segments)
3 = key rotation record
algorithm: u8
0 = Ed25519 (classical)
1 = ML-DSA-65 (post-quantum)
2 = SLH-DSA-128s (hash-based PQ)
3 = X25519 (classical KEM)
4 = ML-KEM-768 (post-quantum KEM)
key_id: [u8; 16] Unique key identifier (hash of public key)
key_data: [u8; var] The actual key material
valid_from: u64 Timestamp (ns) when key becomes valid
valid_until: u64 Timestamp (ns) when key expires (0 = no expiry)
```
### Key Rotation
New keys are introduced by writing a new CRYPTO_SEG with `key_type=3`
(rotation record) that references both old and new key IDs. Segments
signed with either key are valid during the transition period.
```
CRYPTO_SEG (rotation):
old_key_id: [u8; 16]
new_key_id: [u8; 16]
rotation_timestamp: u64
cross_signature: [u8; var] New key signed by old key
```
## 4. Hash Functions
### SHAKE-256 (Primary)
SHAKE-256 from the SHA-3 family is used for:
- Content hashes in segment headers (128-bit truncation for compactness)
- Min-cut witness hashes (256-bit for cryptographic binding)
- Key derivation
- Domain separation
**Why SHAKE-256**:
- Post-quantum safe (Grover's algorithm only halves effective preimage security, so a 256-bit output retains ~128-bit quantum resistance)
- Extendable output function (XOF) — can produce any hash length
- No length extension attacks
- ~1 GB/s in software, faster with hardware SHA-3 extensions
### XXH3-128 (Fast Path)
XXH3 is used for non-cryptographic content hashing where speed matters more
than collision resistance:
- Segment content hashes when crypto verification is not required
- Block-level integrity checks in combination with CRC32C
**Performance**: ~50 GB/s with AVX2. This means hash computation is never
the bottleneck during streaming ingest.
### CRC32C (Block Level)
CRC32C is used for per-block integrity within segments:
- Detects random bit flips and truncation
- Hardware accelerated on x86 (SSE4.2) and ARM (CRC32 extension)
- ~3 GB/s throughput
### Hash Selection by Context
| Context | Algorithm | Output Size | Why |
|---------|-----------|------------|-----|
| Block integrity | CRC32C | 4 B | Fastest, HW accel |
| Segment content hash (fast) | XXH3-128 | 16 B | Very fast, good distribution |
| Segment content hash (crypto) | SHAKE-256 | 16 B | Post-quantum, collision resistant |
| Witness / proof hashes | SHAKE-256 | 32 B | Full crypto strength |
| Key derivation | SHAKE-256 | 32+ B | XOF flexibility |
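Because SHAKE-256 is an XOF, the 16-byte "truncated" content hash is simply the 16-byte prefix of the full output. A sketch using Python's hashlib (the function names are illustrative):

```python
import hashlib

def content_hash_crypto(payload: bytes) -> bytes:
    # 128-bit segment content hash: first 16 bytes of SHAKE-256 output
    return hashlib.shake_256(payload).digest(16)

def witness_hash(data: bytes) -> bytes:
    # Full-strength 256-bit output for witness / proof hashes
    return hashlib.shake_256(data).digest(32)

# XOF property: shorter outputs are prefixes of longer ones
payload = b"segment payload"
assert witness_hash(payload)[:16] == content_hash_crypto(payload)
```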
## 5. Encryption (Optional)
For ENCRYPTED segments, RVF uses hybrid encryption:
### Key Encapsulation
```
Classical: X25519 ECDH
Post-Quantum: ML-KEM-768 (CRYSTALS-Kyber, NIST Level 3)
Hybrid: X25519 || ML-KEM-768 (concatenated shared secrets)
```
### Payload Encryption
```
Algorithm: AES-256-GCM (AEAD)
Key: SHAKE-256(X25519_shared || ML-KEM_shared || context)
Nonce: First 12 bytes of SHAKE-256(segment_id || timestamp)
AAD: segment_header[0:40] (authenticated but not encrypted)
```
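The key and nonce derivation above, sketched with Python's hashlib. The byte widths and little-endian encoding of `segment_id` and the timestamp are assumptions — the spec pins only the hash construction:

```python
import hashlib

def derive_payload_key(x25519_shared: bytes, ml_kem_shared: bytes,
                       context: bytes) -> bytes:
    # AES-256 key = SHAKE-256(X25519_shared || ML-KEM_shared || context)
    return hashlib.shake_256(x25519_shared + ml_kem_shared + context).digest(32)

def derive_nonce(segment_id: int, timestamp_ns: int) -> bytes:
    # Nonce = first 12 bytes of SHAKE-256(segment_id || timestamp)
    material = segment_id.to_bytes(8, "little") + timestamp_ns.to_bytes(8, "little")
    return hashlib.shake_256(material).digest(12)

key = derive_payload_key(bytes(32), bytes(32), b"RVF-v1-VEC_SEG-generic")
print(len(key), len(derive_nonce(1, 2)))  # 32 12
```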
### Encrypted Segment Layout
```
Segment Header (64B, plaintext)
flags: ENCRYPTED set
content_hash: hash of PLAINTEXT payload (for integrity after decrypt)
Encapsulated Keys
x25519_ephemeral_pk: [u8; 32]
ml_kem_ciphertext: [u8; 1088]
key_id_recipient: [u8; 16]
Encrypted Payload
AES-256-GCM ciphertext (same size as plaintext + 16B auth tag)
Signature Footer (if also SIGNED)
Signature covers header + encapsulated keys + encrypted payload
```
## 6. Capability Manifests (WITNESS_SEG)
WITNESS_SEGs provide cryptographic proof of provenance and computation:
### Witness Types
```
0x01 PROVENANCE Who created this file and when
0x02 COMPUTATION Proof that an index was correctly built
0x03 DELEGATION Authorization chain for data access
0x04 AUDIT Record of queries executed against this file
0x05 ATTESTATION Hardware attestation (for Cognitum tiles)
```
### Provenance Witness
```
creator_key_id: [u8; 16]
creation_time: u64
tool_name: [u8; 64]
tool_version: [u8; 16]
input_hashes: [(hash256, description)] Hashes of source data
transform_description: [u8; var] What was done to create vectors
signature: [u8; var] Creator's signature over all above
```
### Computation Witness
```
computation_type: u8
0 = HNSW construction
1 = Quantization training
2 = Temperature compaction
3 = Overlay rebalance
4 = Index merge
input_segments: [segment_id]
output_segments: [segment_id]
parameters: [(key, value)]
result_hash: hash256
duration_ns: u64
signature: [u8; var]
```
This lets any reader verify that the producer attests the index was built from
the declared vectors using the declared parameters — without re-running the
computation. The witness binds inputs, parameters, and result hash under the
producer's signature; it proves provenance, not independent correctness.
## 7. Signing Performance Budget
For streaming ingest at 100K vectors/second with 1024-vector blocks:
```
Segment write rate: ~100 segments/second (1024 vectors per VEC_SEG)
Manifest writes: ~1/second (batched)
ML-DSA-65 signing: ~4,500/second
Signing budget: 100 segment sigs + 1 manifest sig = 101/second
Utilization: 101 / 4,500 = 2.2%
```
Signing is not a bottleneck. Even at 10x the ingest rate, ML-DSA-65 has
headroom.
For verification during progressive load (reading 1000 segments):
```
ML-DSA-65 verify: ~17,000/second
Verification budget: 1000 segments / 17,000 = 59 ms
```
All segments verified in under 60 ms. This runs concurrently with data
loading, so it adds minimal latency to the progressive boot sequence.
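Both budgets reduce to one-line calculations; a sketch using the spec's figures:

```python
def signing_utilization(seg_sigs_per_s: float, manifest_sigs_per_s: float,
                        sign_rate: float) -> float:
    # Fraction of the signer's throughput consumed by ingest
    return (seg_sigs_per_s + manifest_sigs_per_s) / sign_rate

def verify_latency_ms(segment_count: int, verify_rate: float) -> float:
    # Wall-clock time to verify a batch of segment signatures
    return segment_count / verify_rate * 1000

print(f"{signing_utilization(100, 1, 4500):.1%}")  # 2.2%
print(f"{verify_latency_ms(1000, 17000):.0f} ms")  # 59 ms
```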
## 8. Core Profile Crypto
For the Core Profile (8 KB code budget), full ML-DSA-65 verification is
too large (~15 KB of code). Options:
1. **Hub verifies, tile trusts**: Hub checks all signatures before sending
blocks to tiles. Tile only needs CRC32C for transport integrity.
2. **Truncated verification**: Tile verifies only the CRC32C of received
blocks. Hub provides a signed attestation that the source segments
were verified.
3. **FN-DSA-512**: Smaller verification code (~3 KB), 666 byte signatures.
Fits in tile code budget but is less mature.
Recommended: Option 1 (hub verifies, tile trusts) for the initial release.
The hub is a trusted component in the Cognitum architecture, and the
tile-hub channel is physically secure (on-chip mesh).
## 9. Algorithm Agility
The `sig_algo` and `checksum_algo` fields in segment headers and footers
allow algorithm migration without format changes:
```
Today: ML-DSA-65 signatures, SHAKE-256 hashes
Future: May migrate to ML-DSA-87 or newer NIST standards
Transition: Write new segments with new algo, old segments remain valid
Verification: Reader tries algo from header field, no guessing needed
```
New algorithms are introduced by:
1. Assigning a new enum value
2. Writing a CRYPTO_SEG with the new key type
3. Signing new segments with the new algorithm
4. Old segments with old signatures remain verifiable
No file rewrite needed. No flag day. Gradual migration through the
append-only segment model.

# RVF WASM Microkernel and Cognitum Hardware Mapping
## 1. Design Philosophy
RVF must run on hardware ranging from a 64 KB WASM tile to a petabyte
cluster. The WASM microkernel is the minimal runtime that makes a tile
a first-class RVF citizen — capable of answering queries, ingesting
streams, and participating in distributed search.
The microkernel is not a shrunken version of the full runtime. It is a
**purpose-built execution core** that exposes the exact set of operations
a tile needs, and nothing more.
## 2. Cognitum Tile Architecture
### Hardware Constraints
```
+-----------------------------------+
| Cognitum Tile |
| |
| Code Memory: 8 KB |
| Data Memory: 8 KB |
| SIMD Scratch: 64 KB |
| Registers: v128 (WASM SIMD) |
| Clock: ~1 GHz |
| Interconnect: Mesh to hub |
| |
| No filesystem. No mmap. |
| No allocator beyond scratch. |
| All I/O through hub messages. |
+-----------------------------------+
```
### Memory Map
```
Code (8 KB):
0x0000 - 0x0FFF Microkernel WASM bytecode (4 KB)
0x1000 - 0x17FF Distance function hot path (2 KB)
0x1800 - 0x1FFF Decode / quantization stubs (2 KB)
Data (8 KB):
0x0000 - 0x003F Tile configuration (64 B)
0x0040 - 0x00FF Query scratch (192 B: query vector fp16)
0x0100 - 0x01FF Result buffer (256 B: top-K candidates)
0x0200 - 0x03FF Routing table (512 B: entry points + centroids)
0x0400 - 0x07FF Decode workspace (1 KB)
0x0800 - 0x0FFF Message I/O buffer (2 KB)
0x1000 - 0x1FFF Neighbor list cache (4 KB)
SIMD Scratch (64 KB):
0x0000 - 0x7FFF Vector block (up to 42 vectors @ 384-dim fp16)
0x8000 - 0xBFFF Distance accumulator / PQ tables (16 KB)
0xC000 - 0xEFFF Hot cache subset (12 KB)
0xF000 - 0xFFFF Temporary / spill (4 KB)
```
### Tile Budget
For 384-dim fp16 vectors:
- One vector: 768 bytes
- Vector block region holds: 32 KB / 768 B = ~42 vectors
- Top-K result buffer: 16 candidates * 16 B = 256 B
- Query vector: 768 B
A tile can process one block of ~42 vectors per load, computing distances
and maintaining a top-K heap entirely within scratch memory.
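The capacity arithmetic as a sketch; per the memory map, the streamed block lives in the 32 KB vector-block region, with the rest of scratch reserved for PQ tables, hot cache, and spill:

```python
def vectors_per_region(region_bytes: int, dims: int, dtype_bytes: int) -> int:
    """How many vectors of the given shape fit in a scratch region."""
    return region_bytes // (dims * dtype_bytes)

print(vectors_per_region(32 * 1024, 384, 2))  # 42: the vector-block region
print(vectors_per_region(64 * 1024, 384, 2))  # 85: if the whole scratch were used
```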
## 3. Microkernel Exports
The WASM microkernel exports exactly these functions:
```wat
;; === Core Query Path ===
;; Initialize tile with configuration
;; config_ptr: pointer to 64B tile config in data memory
(export "rvf_init" (func $rvf_init (param $config_ptr i32) (result i32)))
;; Load query vector into query scratch
;; query_ptr: pointer to fp16 vector in data memory
;; dim: vector dimensionality
(export "rvf_load_query" (func $rvf_load_query
(param $query_ptr i32) (param $dim i32) (result i32)))
;; Load a block of vectors into SIMD scratch
;; block_ptr: pointer to vector block in SIMD scratch
;; count: number of vectors
;; dtype: data type enum
(export "rvf_load_block" (func $rvf_load_block
(param $block_ptr i32) (param $count i32)
(param $dtype i32) (result i32)))
;; Compute distances between query and loaded block
;; metric: 0=L2, 1=IP, 2=cosine, 3=hamming
;; result_ptr: pointer to write distances
(export "rvf_distances" (func $rvf_distances
(param $metric i32) (param $result_ptr i32) (result i32)))
;; Merge distances into top-K heap
;; dist_ptr: pointer to distance array
;; id_ptr: pointer to vector ID array
;; count: number of candidates
;; k: top-K to maintain
(export "rvf_topk_merge" (func $rvf_topk_merge
(param $dist_ptr i32) (param $id_ptr i32)
(param $count i32) (param $k i32) (result i32)))
;; Read current top-K results
;; out_ptr: pointer to write results (id, distance pairs)
(export "rvf_topk_read" (func $rvf_topk_read
(param $out_ptr i32) (result i32)))
;; === Quantization ===
;; Load scalar quantization parameters (min/max per dim)
(export "rvf_load_sq_params" (func $rvf_load_sq_params
(param $params_ptr i32) (param $dim i32) (result i32)))
;; Dequantize int8 block to fp16 in SIMD scratch
(export "rvf_dequant_i8" (func $rvf_dequant_i8
(param $src_ptr i32) (param $dst_ptr i32)
(param $count i32) (result i32)))
;; Load PQ codebook subset
(export "rvf_load_pq_codebook" (func $rvf_load_pq_codebook
(param $codebook_ptr i32) (param $M i32)
(param $K i32) (result i32)))
;; Compute PQ asymmetric distances
(export "rvf_pq_distances" (func $rvf_pq_distances
(param $codes_ptr i32) (param $count i32)
(param $result_ptr i32) (result i32)))
;; === HNSW Navigation ===
;; Load neighbor list for a node
(export "rvf_load_neighbors" (func $rvf_load_neighbors
(param $node_id i64) (param $layer i32)
(param $out_ptr i32) (result i32)))
;; Greedy search step: given current node, find nearest neighbor
(export "rvf_greedy_step" (func $rvf_greedy_step
(param $current_id i64) (param $layer i32) (result i64)))
;; === Segment Verification ===
;; Verify segment header hash
(export "rvf_verify_header" (func $rvf_verify_header
(param $header_ptr i32) (result i32)))
;; Compute CRC32C of a data region
(export "rvf_crc32c" (func $rvf_crc32c
(param $data_ptr i32) (param $len i32) (result i32)))
```
### Export Count
14 exports. Each maps to a tight inner loop that fits in the 8 KB code budget.
The host (hub) is responsible for all I/O, segment parsing, and orchestration.
## 4. Host-Tile Protocol
Communication between the hub and tile uses fixed-size messages through
the 2 KB I/O buffer:
### Message Format
```
Offset  Size  Field       Description
------  ----  -----       -----------
0x00    2     msg_type    Message type enum
0x02    2     msg_length  Payload length
0x04    4     msg_id      Correlation ID
0x08    var   payload     Type-specific payload
```
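A sketch of packing the 8-byte header plus payload. Little-endian field order is an assumption — the spec gives offsets and sizes but not byte order:

```python
import struct

LOAD_QUERY = 0x01  # hub -> tile message type

def pack_message(msg_type: int, payload: bytes, msg_id: int) -> bytes:
    # u16 msg_type, u16 msg_length, u32 msg_id, then the payload
    return struct.pack("<HHI", msg_type, len(payload), msg_id) + payload

msg = pack_message(LOAD_QUERY, b"\x00" * 768, msg_id=1)
print(len(msg))  # 776: 8-byte header + 768 B fp16 query
```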
### Message Types
```
Hub -> Tile:
0x01 LOAD_QUERY Send query vector (768 B for 384-dim fp16)
0x02 LOAD_BLOCK Send vector block (up to ~1.5 KB compressed)
0x03 LOAD_NEIGHBORS Send neighbor list for a node
0x04 LOAD_PARAMS Send quantization parameters
0x05 COMPUTE Trigger distance computation
0x06 READ_TOPK Request current top-K results
0x07 RESET Clear tile state for new query
Tile -> Hub:
0x81 TOPK_RESULT Top-K results (id, distance pairs)
0x82 NEED_BLOCK Request a specific vector block
0x83 NEED_NEIGHBORS Request neighbor list for a node
0x84 DONE Computation complete
0x85 ERROR Error with code
```
### Execution Flow
```
Hub Tile
| |
|--- LOAD_QUERY (768B) ------------>|
| | rvf_load_query()
|--- LOAD_PARAMS (SQ params) ------>|
| | rvf_load_sq_params()
|--- LOAD_BLOCK (block 0) -------->|
| | rvf_load_block()
| | rvf_distances()
| | rvf_topk_merge()
|--- LOAD_BLOCK (block 1) -------->|
| | rvf_load_block()
| | rvf_distances()
| | rvf_topk_merge()
| ... |
|--- READ_TOPK -------------------->|
| | rvf_topk_read()
|<--- TOPK_RESULT ------------------|
| |
```
### Pull Mode
For HNSW search, the tile drives the traversal:
```
Hub Tile
| |
|--- LOAD_QUERY -------------------->|
|--- LOAD_NEIGHBORS (entry point) -->|
| | rvf_greedy_step()
|<--- NEED_NEIGHBORS (next node) ----|
|--- LOAD_NEIGHBORS (next node) ---->|
| | rvf_greedy_step()
|<--- NEED_BLOCK (for candidate) ----|
|--- LOAD_BLOCK -------------------->|
| | rvf_distances()
| | rvf_topk_merge()
|<--- DONE --------------------------|
|--- READ_TOPK --------------------->|
|<--- TOPK_RESULT -------------------|
```
## 5. Three Hardware Profiles
### RVF Core Profile (Tile)
```
Target: Cognitum tile (8KB + 8KB + 64KB)
Features: Distance compute, top-K, SQ dequant, CRC32C verify
Max vectors: ~42 per block load
Max dimensions: 384 (fp16) or 768 (i8)
Index: None (hub routes, tile computes)
Streaming: Receive blocks from hub
Quantization: i8 scalar only (no PQ on tile)
Compression: None (hub decompresses before sending)
```
### RVF Hot Profile (Chip)
```
Target: Cognitum chip (multiple tiles + shared memory)
Features: Core + PQ distance, HNSW navigation, parallel tiles
Max vectors: Limited by shared memory (~10K in shared cache)
Max dimensions: 1024
Index: Layer A in shared memory
Streaming: Block streaming across tiles
Quantization: i8 scalar + PQ (6-bit)
Compression: LZ4 decompress in shared memory
```
### RVF Full Profile (Hub/Desktop)
```
Target: Desktop CPU, server, hub controller
Features: All features, all segment types, all quantization
Max vectors: Billions (limited by storage)
Max dimensions: Unlimited
Index: Full HNSW (Layers A + B + C)
Streaming: Full append-only segment model
Quantization: All tiers (fp16, i8, PQ, binary)
Compression: All (LZ4, ZSTD, custom)
Crypto: Full (ML-DSA-65 signatures, SHAKE-256)
Temperature: Full adaptive tiering
Overlay: Full epoch model with compaction
```
### Profile Detection
The root manifest's `profile_id` field declares the minimum profile needed:
```
0x00 generic Requires Full Profile features
0x01 core Fully usable with Core Profile
0x02 hot Requires Hot Profile minimum
0x03 full Requires Full Profile
```
A Full Profile reader can always read Core or Hot files. A Core Profile
reader rejects Full Profile files but can read Core files. Hot Profile
readers can read Core and Hot files.
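The read-compatibility rules collapse to a rank comparison. A sketch — the rank table is an illustration of the rules above, not a spec field:

```python
# Minimum reader capability needed to open a file of each declared profile.
REQUIRED = {"core": 1, "hot": 2, "full": 3, "generic": 3}  # generic needs Full
CAPABILITY = {"core": 1, "hot": 2, "full": 3}

def can_read(reader: str, file_profile: str) -> bool:
    return CAPABILITY[reader] >= REQUIRED[file_profile]

print(can_read("full", "core"))  # True: Full reads everything
print(can_read("hot", "core"))   # True: Hot reads Core and Hot
print(can_read("core", "full"))  # False: Core rejects Full files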
## 6. SIMD Strategy by Platform
### WASM v128 (Tile/Browser)
```wasm
;; L2 distance: fp16 vectors, 384 dimensions
;; Process 8 fp16 values per v128 operation
(func $l2_fp16_384 (param $a_ptr i32) (param $b_ptr i32) (result f32)
(local $acc v128)
(local $i i32)
(local.set $acc (v128.const i64x2 0 0))
(local.set $i (i32.const 0))
(block $done
(loop $loop
;; Load 8 fp16 values, widen to f32x4 pairs
;; Subtract, square, accumulate
;; ... (8 values per iteration, 48 iterations for 384 dims)
(br_if $done (i32.ge_u (local.get $i) (i32.const 384)))
(br $loop)
)
)
;; Horizontal sum of accumulator
;; Return L2 distance
)
```
### AVX-512 (Desktop/Server)
```
; Process 32 fp16 values per cycle with VCVTPH2PS + VFMADD231PS
; 384 dims = 12 iterations of 32 values
; ~12 cycles per distance computation
```
### ARM NEON (Mobile/Edge)
```
; Process 8 fp16 values per cycle with FMLA
; 384 dims = 48 iterations of 8 values
; ~48 cycles per distance computation
```
## 7. Microkernel Size Budget
```
Function Estimated Size
-------- --------------
rvf_init 128 B
rvf_load_query 64 B
rvf_load_block 256 B
rvf_distances (L2 fp16) 512 B
rvf_distances (L2 i8) 384 B
rvf_distances (IP fp16) 512 B
rvf_distances (hamming) 256 B
rvf_topk_merge 384 B
rvf_topk_read 64 B
rvf_load_sq_params 64 B
rvf_dequant_i8 256 B
rvf_load_pq_codebook 128 B
rvf_pq_distances 512 B
rvf_load_neighbors 128 B
rvf_greedy_step 512 B
rvf_verify_header 128 B
rvf_crc32c 256 B
Message dispatch loop 384 B
Utility functions 256 B
WASM overhead 512 B
----------
Total ~5,700 B (< 8 KB code budget)
```
Remaining ~2.4 KB of code space is available for domain-specific extensions
(e.g., codon distance for RVDNA profile, token overlap for RVText profile).
## 8. Fault Isolation
Each tile runs in a WASM sandbox. A tile cannot:
- Access hub memory directly
- Communicate with other tiles except through the hub
- Allocate memory beyond its 8 KB data + 64 KB scratch
- Execute code beyond its 8 KB code space
- Trap without the hub catching and recovering
If a tile traps (out-of-bounds, unreachable, stack overflow):
1. Hub catches the trap
2. Hub marks tile as faulted
3. Hub reassigns the tile's work to another tile (or processes locally)
4. Hub optionally restarts the faulted tile with fresh state
This makes the system resilient to individual tile failures — important for
large tile arrays where hardware faults are inevitable.

# RVF Domain Profiles
## 1. Profile Architecture
A domain profile is a **semantic overlay** on the universal RVF substrate. It does
not change the wire format — every profile-specific file is a valid RVF file. The
profile adds:
1. **Semantic type annotations** for vector dimensions
2. **Domain-specific distance metrics**
3. **Custom quantization strategies** optimized for the domain
4. **Metadata schemas** for domain-specific labels and provenance
5. **Query preprocessing** conventions
Profiles are declared in a PROFILE_SEG and referenced by the root manifest's
`profile_id` field.
```
+-- RVF Universal Substrate --+
| Segments, manifests, tiers |
| HNSW index, overlays |
| Temperature, compaction |
+-----------------------------+
|
| profile_id
v
+-- Domain Profile Layer --+
| Semantic types |
| Custom distances |
| Metadata schema |
| Query conventions |
+---------------------------+
```
## 2. PROFILE_SEG Binary Layout
```
Offset  Size  Field                Description
------  ----  -----                -----------
0x00    4     profile_magic        Profile-specific magic number
0x04    2     profile_version      Profile spec version
0x06    2     profile_id           Same as root manifest profile_id
0x08    32    profile_name         UTF-8 null-terminated name
0x28    8     schema_length        Length of metadata schema
0x30    var   metadata_schema      JSON or binary schema for META_SEG entries
var     8     distance_config_len  Length of distance configuration
var     var   distance_config      Distance metric parameters
var     8     quant_config_len     Length of quantization configuration
var     var   quant_config         Domain-specific quantization parameters
var     8     preprocess_len       Length of preprocessing spec
var     var   preprocess_spec      Query preprocessing pipeline description
```
## 3. RVDNA Profile (Genomics)
### Profile Declaration
```
profile_magic: 0x52444E41 ("RDNA")
profile_id: 0x01
profile_name: "rvdna"
```
### Semantic Types
RVDNA vectors encode biological sequences at multiple granularities:
| Granularity | Dimensions | Encoding | Use Case |
|------------|-----------|----------|----------|
| Codon | 64 | Frequency of each codon in reading frame | Gene-level comparison |
| K-mer (k=6) | 4096 | 6-mer frequency spectrum | Species identification |
| Motif | 128-512 | Learned motif embeddings (transformer) | Regulatory element search |
| Structure | 256 | Protein secondary structure embedding | Fold similarity |
| Epigenetic | 384 | Methylation + histone mark embedding | Epigenomic comparison |
### Distance Metrics
```
Codon frequency: Jensen-Shannon divergence (symmetric KL)
K-mer spectrum: Cosine similarity (normalized frequency vectors)
Motif embedding: L2 distance (Euclidean in learned space)
Structure: L2 distance with structure-aware weighting
Epigenetic: Weighted cosine (CpG density as weight)
```
### Quantization Strategy
Genomic vectors have specific statistical properties:
- **Codon frequencies**: Sparse, non-negative, sum-to-1. Use **scalar quantization
with log transform**: `q = round(log2(freq + epsilon) * scale)`. 8-bit covers
6 orders of magnitude.
- **K-mer spectra**: Very sparse (most 6-mers absent in short reads). Use
**sparse encoding**: store only non-zero k-mer indices + values. Typical
compression: 20-50x over dense.
- **Learned embeddings**: Gaussian-distributed. Standard PQ works well.
M=32 subspaces, K=256 centroids (8-bit codes).
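The log-transform scalar quantizer for codon frequencies, as a hedged sketch. The scale factor is illustrative (scale=12 spreads the signed 8-bit range over roughly six decades, matching the claim above); the spec fixes only the formula shape:

```python
import math

def quantize_freq(freq: float, scale: float = 12.0, eps: float = 1e-6) -> int:
    # q = round(log2(freq + epsilon) * scale), clamped to signed 8-bit
    return max(-128, min(127, round(math.log2(freq + eps) * scale)))

def dequantize_freq(q: int, scale: float = 12.0) -> float:
    return 2.0 ** (q / scale)

q = quantize_freq(0.015)
print(q, f"{dequantize_freq(q):.4f}")  # -73 0.0147
```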
### Metadata Schema
```json
{
"type": "rvdna",
"fields": {
"organism": { "type": "string", "indexed": true },
"gene_id": { "type": "string", "indexed": true },
"chromosome": { "type": "string", "indexed": true },
"position_start": { "type": "u64", "indexed": true },
"position_end": { "type": "u64", "indexed": true },
"strand": { "type": "enum", "values": ["+", "-"] },
"quality_score": { "type": "f32" },
"source_format": { "type": "enum", "values": ["FASTA", "FASTQ", "BAM", "VCF"] },
"read_depth": { "type": "u32" },
"gc_content": { "type": "f32" }
}
}
```
### Query Preprocessing
For RVDNA queries:
1. Input: Raw sequence string (ACGT...)
2. Compute k-mer frequency spectrum
3. Apply log transform for codon/k-mer queries
4. Normalize to unit length for cosine metrics
5. Encode as fp16 vector
6. Submit to RVF query path
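Steps 1-2 of the k-mer path can be sketched as follows; k=2 (16 dims) keeps the demo small, where the profile's species-identification spectra use k=6 (4096 dims):

```python
from collections import Counter

def kmer_spectrum(seq: str, k: int = 6):
    """Normalized k-mer frequency vector over the 4**k possible k-mers."""
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    counts = Counter(
        sum(idx[c] * 4 ** (k - 1 - i) for i, c in enumerate(seq[j:j + k]))
        for j in range(len(seq) - k + 1)
    )
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(4 ** k)]

spec = kmer_spectrum("ACGTACGT", k=2)
print(len(spec), round(sum(spec), 6))  # 16 1.0
```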
## 4. RVText Profile (Language)
### Profile Declaration
```
profile_magic: 0x52545854 ("RTXT")
profile_id: 0x02
profile_name: "rvtext"
```
### Semantic Types
| Granularity | Dimensions | Source | Use Case |
|------------|-----------|--------|----------|
| Token | 768-1536 | Transformer last hidden state | Semantic search |
| Sentence | 384-768 | Sentence transformer pooled output | Document retrieval |
| Paragraph | 384-1024 | Long-context model embedding | Passage ranking |
| Document | 256-512 | Document-level embedding | Collection search |
| Sparse | 30522 | BM25/SPLADE term weights | Lexical matching |
### Distance Metrics
```
Dense embeddings: Cosine similarity (normalized dot product)
Sparse (SPLADE): Dot product on sparse vectors
Hybrid: alpha * dense_score + (1-alpha) * sparse_score
Matryoshka: Cosine on truncated prefix (adaptive dimensionality)
```
### Quantization Strategy
Text embeddings are well-suited to aggressive quantization:
- **Dense (384-768 dim)**: Binary quantization achieves 0.95+ recall on
normalized embeddings. 384 dims -> 48 bytes. Use binary for cold tier,
int8 for hot.
- **Sparse (SPLADE)**: Store as sorted (term_id, weight) pairs with
delta-encoded term_ids. Typical sparsity: 100-300 non-zero terms out
of 30K vocabulary. Compression: ~100x over dense.
- **Matryoshka**: Store full-dimension vectors but index only the first
D/4 dimensions. Progressive refinement uses more dimensions.
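The sparse storage scheme for SPLADE vectors reduces to gap-encoding the sorted term IDs. A minimal sketch, with a list of tuples standing in for the packed wire encoding:

```python
def delta_encode(pairs):
    # pairs: sorted (term_id, weight); store gaps between successive term_ids
    out, prev = [], 0
    for tid, w in pairs:
        out.append((tid - prev, w))
        prev = tid
    return out

def delta_decode(deltas):
    out, tid = [], 0
    for gap, w in deltas:
        tid += gap
        out.append((tid, w))
    return out

sparse = [(17, 0.8), (1042, 0.3), (29510, 1.2)]
print(delta_encode(sparse))  # [(17, 0.8), (1025, 0.3), (28468, 1.2)]
```

Smaller gaps then varint-compress better than raw 16-bit term IDs would.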
### Metadata Schema
```json
{
"type": "rvtext",
"fields": {
"text": { "type": "string", "stored": true, "max_length": 8192 },
"source_url": { "type": "string", "indexed": true },
"language": { "type": "string", "indexed": true },
"model_id": { "type": "string" },
"chunk_index": { "type": "u32" },
"total_chunks": { "type": "u32" },
"token_count": { "type": "u32" },
"timestamp": { "type": "u64" }
}
}
```
### Query Preprocessing
1. Input: Raw text string
2. Tokenize with model-specific tokenizer
3. Encode through embedding model (or receive pre-computed embedding)
4. L2-normalize for cosine similarity
5. Optionally: compute SPLADE sparse expansion
6. Submit dense + sparse to hybrid query path
## 5. RVGraph Profile (Networks)
### Profile Declaration
```
profile_magic: 0x52475248 ("RGRH")
profile_id: 0x03
profile_name: "rvgraph"
```
### Semantic Types
| Granularity | Dimensions | Source | Use Case |
|------------|-----------|--------|----------|
| Node | 64-256 | Node2Vec / GCN embedding | Node similarity |
| Edge | 64-128 | Edge feature embedding | Link prediction |
| Subgraph | 128-512 | Graph kernel embedding | Subgraph matching |
| Community | 64-256 | Community embedding | Community detection |
| Spectral | 32-128 | Laplacian eigenvectors | Graph structure |
### Distance Metrics
```
Node embedding: L2 distance
Edge embedding: Cosine similarity
Subgraph: Wasserstein distance (approximated by L2 on sorted features)
Community: Cosine similarity
Spectral: L2 on normalized eigenvectors
```
### Integration with Overlay System
RVGraph uniquely integrates with the RVF overlay epoch system:
- **Graph structure** is stored in OVERLAY_SEGs (not just as metadata)
- **Node embeddings** are stored in VEC_SEGs
- **Edge weights** are overlay deltas
- **Community assignments** are partition summaries
- **Min-cut witnesses** directly serve graph partitioning queries
This means RVGraph files are simultaneously vector stores AND graph databases.
The overlay system provides dynamic graph operations (add/remove edges,
rebalance partitions) while the vector system provides similarity search.
### Metadata Schema
```json
{
"type": "rvgraph",
"fields": {
"node_type": { "type": "string", "indexed": true },
"edge_type": { "type": "string", "indexed": true },
"node_label": { "type": "string", "indexed": true },
"degree": { "type": "u32", "indexed": true },
"community_id": { "type": "u32", "indexed": true },
"pagerank": { "type": "f32" },
"clustering_coeff": { "type": "f32" },
"source_graph": { "type": "string" }
}
}
```
## 6. RVVision Profile (Imagery)
### Profile Declaration
```
profile_magic: 0x52564953 ("RVIS")
profile_id: 0x04
profile_name: "rvvision"
```
### Semantic Types
| Granularity | Dimensions | Source | Use Case |
|------------|-----------|--------|----------|
| Patch | 64-256 | ViT patch embedding | Region search |
| Image | 512-2048 | CLIP / DINOv2 global embedding | Image retrieval |
| Object | 256-512 | Object detection crop embedding | Object search |
| Scene | 128-512 | Scene classification embedding | Scene matching |
| Multi-scale | 256 * N | Pyramid of embeddings at scales | Scale-invariant search |
### Distance Metrics
```
CLIP embedding: Cosine similarity (model-normalized)
DINOv2: Cosine similarity
Patch: L2 distance (not normalized)
Multi-scale: Weighted sum of per-scale cosine similarities
```
### Quantization Strategy
Vision embeddings have high intrinsic dimensionality but are compressible:
- **CLIP (512-dim)**: PQ with M=64, K=256 works well. Binary quantization
achieves 0.90+ recall.
- **DINOv2 (768-dim)**: Similar to CLIP. PQ M=96, K=256.
- **Patch embeddings**: Large volume (196+ patches per image). Aggressive
quantization to 4-bit scalar. Use residual PQ for high-recall applications.
### Spatial Metadata
RVVision supports spatial queries through metadata:
```json
{
"type": "rvvision",
"fields": {
"image_id": { "type": "string", "indexed": true },
"patch_row": { "type": "u16" },
"patch_col": { "type": "u16" },
"scale": { "type": "f32" },
"bbox_x": { "type": "f32" },
"bbox_y": { "type": "f32" },
"bbox_w": { "type": "f32" },
"bbox_h": { "type": "f32" },
"object_class": { "type": "string", "indexed": true },
"confidence": { "type": "f32" },
"model_id": { "type": "string" }
}
}
```
## 7. Custom Profile Registration
New profiles can be registered by writing a PROFILE_SEG:
```
1. Choose a unique profile_id (0x10-0xEF for custom profiles)
2. Define a 4-byte profile_magic
3. Define metadata schema
4. Define distance metric configuration
5. Define quantization recommendations
6. Write PROFILE_SEG into the RVF file
7. Set profile_id in root manifest
```
The profile system is open — any domain can define its own profile as long
as it maps onto the RVF substrate. The substrate does not need to understand
the domain semantics; it only needs to store vectors, compute distances,
and maintain indexes.
## 8. Cross-Profile Queries
RVF files with different profiles can be queried together if their vectors
share a compatible embedding space. This is common in multimodal applications:
```
Query: "Find images similar to this text description"
1. Text embedding (RVText profile) -> 512-dim CLIP text vector
2. Image database (RVVision profile) -> 512-dim CLIP image vectors
3. Distance metric: Cosine similarity (shared CLIP space)
4. Result: Images ranked by text-image similarity
```
The query path treats both files as RVF files. The profile only affects
preprocessing and metadata interpretation — the core distance computation
and indexing are profile-agnostic.
## 9. Profile Compatibility Matrix
| Source Profile | Target Profile | Compatible? | Condition |
|---------------|---------------|------------|-----------|
| RVDNA | RVDNA | Yes | Same granularity |
| RVText | RVText | Yes | Same model or compatible space |
| RVVision | RVVision | Yes | Same model or compatible space |
| RVText | RVVision | Yes | If both use CLIP or shared space |
| RVDNA | RVText | No* | Unless mapped through protein language model |
| RVGraph | Any | Partial | Node embeddings may share space |
*Cross-domain compatibility requires explicit embedding space alignment,
which is outside the scope of the format spec but enabled by it.

# RVF: RuVector Format Specification
## The Universal Substrate for Living Intelligence
**Version**: 0.1.0-draft
**Status**: Research
**Date**: 2026-02-13
---
## What RVF Is
RVF is not a file format. It is a **runtime substrate** — a living, self-reorganizing
binary medium that stores, streams, indexes, and adapts vector intelligence across
any domain, any scale, and any hardware tier.
Where traditional formats are snapshots of data, RVF is a **continuously evolving
organism**. It ingests without rewriting. It answers queries before it finishes loading.
It reorganizes its own layout to match access patterns. It survives crashes without
journals. It fits on a 64 KB WASM tile or scales to a petabyte hub.
## The Four Laws of RVF
Every design decision in RVF derives from four inviolable laws:
### Law 1: Truth Lives at the Tail
The most recent `MANIFEST_SEG` at the tail of the file is the sole source of truth.
No front-loaded metadata. No section directory that must be rewritten on mutation.
Readers scan backward from EOF to find the latest manifest and know exactly what
to map.
**Consequence**: Append-only writes. Streaming ingest. No global rewrite ever.
### Law 2: Every Segment Is Independently Valid
Each segment carries its own magic number, length, content hash, and type tag.
A reader encountering any segment in isolation can verify it, identify it, and
decide whether to process it. No segment depends on prior segments for structural
validity.
**Consequence**: Crash safety for free. Parallel verification. Segment-level
integrity without a global checksum.
### Law 3: Data and State Are Separated
Vector payloads, index structures, overlay graphs, quantization dictionaries, and
runtime metadata live in distinct segment types. The manifest binds them together
but they never intermingle. This means you can replace the index without touching
vectors, update the overlay without rebuilding adjacency, or swap quantization
without re-encoding.
**Consequence**: Incremental updates. Modular evolution. Zero-copy segment reuse.
### Law 4: The Format Adapts to Its Workload
RVF monitors access patterns through lightweight sketches and periodically
reorganizes: promoting hot vectors to faster tiers, compacting stale overlays,
lazily building deeper index layers. The format is not static — it converges
toward the optimal layout for its actual workload.
**Consequence**: Self-tuning performance. No manual optimization. The file gets
faster the more you use it.
## Design Coordinates
| Property | RVF Answer |
|----------|-----------|
| Write model | Append-only segments + background compaction |
| Read model | Tail-manifest scan, then progressive mmap |
| Index model | Layered availability (entry points -> partial -> full) |
| Compression | Temperature-tiered (fp16 hot, 5-7 bit warm, 3 bit cold) |
| Alignment | 64-byte for SIMD (AVX-512, NEON, WASM v128) |
| Crash safety | Segment-level hashes, no WAL required |
| Crypto | Post-quantum (ML-DSA-65 signatures, SHAKE-256 hashes) |
| Streaming | Yes — first query before full load |
| Hardware | 8 KB tile to petabyte hub |
| Domain | Universal — genomics, text, graph, vision as profiles |
## Acceptance Test
> Cold start on a 10 million vector file: load and answer the first query with a
> useful (recall >= 0.7) result without reading more than the last 4 MB, then
> converge to full quality (recall >= 0.95) as it progressively maps more segments.
## Document Map
| Document | Path | Content |
|----------|------|---------|
| This overview | `spec/00-overview.md` | Philosophy, laws, design coordinates |
| Segment model | `spec/01-segment-model.md` | Segment types, headers, append-only rules |
| Manifest system | `spec/02-manifest-system.md` | Two-level manifests, hotset pointers |
| Temperature tiering | `spec/03-temperature-tiering.md` | Adaptive layout, access sketches, promotion |
| Progressive indexing | `spec/04-progressive-indexing.md` | Layered HNSW, partial availability |
| Overlay epochs | `spec/05-overlay-epochs.md` | Streaming min-cut, epoch boundaries |
| Wire format | `wire/binary-layout.md` | Byte-level binary format reference |
| WASM microkernel | `microkernel/wasm-runtime.md` | Cognitum tile mapping, WASM exports |
| Domain profiles | `profiles/domain-profiles.md` | RVDNA, RVText, RVGraph, RVVision |
| Crypto spec | `crypto/quantum-signatures.md` | Post-quantum primitives, segment signing |
| Benchmarks | `benchmarks/acceptance-tests.md` | Performance targets, test methodology |
## Relationship to RVDNA
RVDNA (RuVector DNA) was the first domain-specific format for genomic vector
intelligence. In the RVF model, RVDNA becomes a **profile** — a set of conventions
for how genomic data maps onto the universal RVF substrate:
```
RVF (universal substrate)
|
+-- RVF Core Profile (minimal, fits on 64KB tile)
+-- RVF Hot Profile (chip-optimized, SIMD-heavy)
+-- RVF Full Profile (hub-scale, all features)
|
+-- Domain Profiles
+-- RVDNA (genomics: codons, motifs, k-mers)
+-- RVText (language: embeddings, token graphs)
+-- RVGraph (networks: adjacency, partitions)
+-- RVVision (imagery: feature maps, patch vectors)
```
The substrate carries the laws. The profiles carry the semantics.
## Design Answers
**Q: Random writes or append-only plus compaction?**
A: Append-only plus compaction. This gives speed and crash safety almost for free.
Random writes add complexity for marginal benefit in the vector workload.
**Q: Primary target mmap on desktop CPUs or also microcontroller tiles?**
A: Both. RVF defines three hardware profiles. The Core profile fits in 8 KB code +
8 KB data + 64 KB SIMD scratch. The Full profile assumes mmap on desktop-class
memory. The wire format is identical — only the runtime behavior changes.
**Q: Which property matters most?**
A: All four are non-negotiable, but the priority order for conflict resolution is:
1. **Streamable** (never block on write)
2. **Progressive** (answer before fully loaded)
3. **Adaptive** (self-optimize over time)
4. **p95 speed** (predictable tail latency)


# RVF Segment Model
## 1. Append-Only Segment Architecture
An RVF file is a linear sequence of **segments**. Each segment is a self-contained,
independently verifiable unit. New data is always appended — never inserted into or
overwritten within existing segments.
```
+------------+------------+------------+ +------------+
| Segment 0 | Segment 1 | Segment 2 | ... | Segment N | <-- EOF
+------------+------------+------------+ +------------+
^
Latest MANIFEST_SEG
(source of truth)
```
### Why Append-Only
| Property | Benefit |
|----------|---------|
| Write amplification | Zero — each byte written once until compaction |
| Crash safety | Partial segment at tail is detectable and discardable |
| Concurrent reads | Readers see a consistent snapshot at any manifest boundary |
| Streaming ingest | Writer never blocks on reorganization |
| mmap friendliness | Pages only grow — no invalidation of mapped regions |
## 2. Segment Header
Every segment begins with a fixed 64-byte header. The header is 64-byte aligned
to match SIMD register width.
```
Offset Size Field Description
------ ---- ----- -----------
0x00 4 magic 0x52564653 ("RVFS" in ASCII)
0x04 1 version Segment format version (currently 1)
0x05 1 seg_type Segment type enum (see below)
0x06 2 flags Bitfield: compressed, encrypted, signed, sealed, etc.
0x08 8 segment_id Monotonically increasing segment ordinal
0x10 8 payload_length Byte length of payload (after header, before footer)
0x18 8 timestamp_ns Nanosecond UNIX timestamp of segment creation
0x20 1 checksum_algo Hash algorithm enum: 0=CRC32C, 1=XXH3-128, 2=SHAKE-256
0x21 1 compression Compression enum: 0=none, 1=LZ4, 2=ZSTD, 3=custom
0x22 2 reserved_0 Must be zero
0x24 4 reserved_1 Must be zero
0x28 16 content_hash First 128 bits of payload hash (algorithm per checksum_algo)
0x38 4 uncompressed_len Original payload size (0 if no compression)
0x3C 4 alignment_pad Padding to reach 64-byte boundary
```
**Total header**: 64 bytes (one cache line, one AVX-512 register width).
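The fixed layout makes header parsing a single unpack. A sketch (little-endian byte order is an assumption; the spec above gives offsets and sizes but does not pin endianness on this page):

```python
import struct

# 64-byte segment header, fields in the order given above.
# "<" = little-endian with no implicit padding (an assumption here).
HEADER_FMT = "<IBBHQQQBBHI16sII"
assert struct.calcsize(HEADER_FMT) == 64

FIELDS = ("magic", "version", "seg_type", "flags", "segment_id",
          "payload_length", "timestamp_ns", "checksum_algo", "compression",
          "reserved_0", "reserved_1", "content_hash", "uncompressed_len",
          "alignment_pad")

RVFS_MAGIC = 0x52564653  # "RVFS"

def parse_header(buf: bytes) -> dict:
    """Decode one 64-byte segment header; reject a bad magic."""
    header = dict(zip(FIELDS, struct.unpack(HEADER_FMT, buf[:64])))
    if header["magic"] != RVFS_MAGIC:
        raise ValueError("not an RVF segment header")
    return header

# Round-trip a synthetic VEC_SEG header
raw = struct.pack(HEADER_FMT, RVFS_MAGIC, 1, 0x01, 0, 7, 4096,
                  1_700_000_000_000_000_000, 1, 0, 0, 0, b"\x00" * 16, 0, 0)
h = parse_header(raw)
```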
### Magic Validation
Readers scanning backward from EOF look for `0x52564653` at 64-byte aligned
boundaries. This enables fast tail-scan even on corrupted files.
### Flags Bitfield
```
Bit 0: COMPRESSED Payload is compressed per compression field
Bit 1: ENCRYPTED Payload is encrypted (key info in manifest)
Bit 2: SIGNED A signature footer follows the payload
Bit 3: SEALED Segment is immutable (compaction output)
Bit 4: PARTIAL Segment is a partial write (streaming ingest)
Bit 5: TOMBSTONE Segment logically deletes a prior segment
Bit 6: HOT Segment contains temperature-promoted data
Bit 7: OVERLAY Segment contains overlay/delta data
Bit 8: SNAPSHOT Segment contains full snapshot (not delta)
Bit 9: CHECKPOINT Segment is a safe rollback point
Bits 10-15: reserved
```
## 3. Segment Types
```
Value Name Purpose
----- ---- -------
0x01 VEC_SEG Raw vector payloads (the actual embeddings)
0x02 INDEX_SEG HNSW adjacency lists, entry points, routing tables
0x03 OVERLAY_SEG Graph overlay deltas, partition updates, min-cut witnesses
0x04 JOURNAL_SEG Metadata mutations (label changes, deletions, moves)
0x05 MANIFEST_SEG Segment directory, hotset pointers, epoch state
0x06 QUANT_SEG Quantization dictionaries and codebooks
0x07 META_SEG Arbitrary key-value metadata (tags, provenance, lineage)
0x08 HOT_SEG Temperature-promoted hot data (vectors + neighbors)
0x09 SKETCH_SEG Access counter sketches for temperature decisions
0x0A WITNESS_SEG Capability manifests, proof of computation, audit trails
0x0B PROFILE_SEG Domain profile declarations (RVDNA, RVText, etc.)
0x0C CRYPTO_SEG Key material, signature chains, certificate anchors
0x0D METAIDX_SEG Metadata inverted indexes for filtered search
```
### Reserved Range
Types `0x00` and `0xF0`-`0xFF` are reserved. `0x00` indicates an uninitialized
or zeroed region (not a valid segment). `0xF0`-`0xFF` are reserved for
implementation-specific extensions.
## 4. Segment Footer
If the `SIGNED` flag is set, the payload is followed by a signature footer:
```
Offset Size Field Description
------ ---- ----- -----------
0x00 2 sig_algo Signature algorithm: 0=Ed25519, 1=ML-DSA-65, 2=SLH-DSA-128s
0x02 2 sig_length Byte length of signature
0x04 var signature The signature bytes
var 4 footer_length Total footer size (for backward scanning)
```
Unsigned segments have no footer — the next segment header follows immediately
after the payload (at the next 64-byte aligned boundary).
## 5. Segment Lifecycle
### Write Path
```
1. Allocate segment ID (monotonic counter)
2. Compute payload hash
3. Write header + payload + optional footer
4. fsync (or fdatasync for non-manifest segments)
5. Write MANIFEST_SEG referencing the new segment
6. fsync the manifest
```
The two-fsync protocol ensures that:
- If crash occurs before step 6, the orphan segment is harmless (no manifest points to it)
- If crash occurs during step 6, the partial manifest is detectable (bad hash)
- After step 6, the segment is durably committed
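The two-fsync protocol can be sketched as follows, assuming a single writer. The record layout here is deliberately abbreviated (magic + type + length + 128-bit SHAKE-256 hash) rather than the full 64-byte header, so the durability ordering stays in focus:

```python
import hashlib
import os
import struct
import tempfile

RVFS_MAGIC = 0x52564653
MANIFEST_SEG = 0x05

def append_record(fd: int, seg_type: int, payload: bytes) -> None:
    """Append one hashed, length-prefixed record, then make it durable."""
    digest = hashlib.shake_256(payload).digest(16)      # first 128 bits
    header = struct.pack("<IBxxxQ16s", RVFS_MAGIC, seg_type,
                         len(payload), digest)          # 32-byte mini-header
    os.write(fd, header + payload)
    os.fsync(fd)                                        # one fsync per record

def commit(fd: int, seg_type: int, payload: bytes, manifest: bytes) -> None:
    append_record(fd, seg_type, payload)    # steps 1-4: segment + fsync #1
    append_record(fd, MANIFEST_SEG, manifest)  # steps 5-6: manifest + fsync #2

tmp = tempfile.NamedTemporaryFile(suffix=".rvf", delete=False)
tmp.close()
fd = os.open(tmp.name, os.O_WRONLY | os.O_APPEND)
commit(fd, 0x01, b"vectors...", b"manifest-directory")
os.close(fd)
size = os.path.getsize(tmp.name)   # 2 headers (32 B each) + both payloads
```

A crash between the two `append_record` calls leaves only an orphan segment that no manifest references, exactly as described above.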
### Read Path
```
1. Seek to EOF
2. Scan backward for latest MANIFEST_SEG (look for magic at aligned boundaries)
3. Parse manifest -> get segment directory
4. Map segments on demand (progressive loading)
```
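Step 2, the backward scan, can be sketched against an in-memory buffer standing in for an mmap'd file (the magic is assumed to be stored as the ASCII bytes `RVFS`, matching the header's offset 0x05 type byte):

```python
MAGIC = b"RVFS"          # assumed on-disk byte order of 0x52564653
MANIFEST_SEG = 0x05      # seg_type byte lives at header offset 0x05

def find_latest_manifest(data: bytes):
    """Scan backward from EOF at 64-byte aligned offsets for the newest
    MANIFEST_SEG header; return its offset, or None."""
    last_aligned = (len(data) // 64) * 64
    for off in range(last_aligned - 64, -1, -64):
        if data[off:off + 4] == MAGIC and data[off + 5] == MANIFEST_SEG:
            return off
    return None

blob = bytearray(256)
blob[64:68] = MAGIC; blob[69] = MANIFEST_SEG     # old manifest at offset 64
blob[192:196] = MAGIC; blob[197] = MANIFEST_SEG  # newest manifest at 192
off = find_latest_manifest(bytes(blob))          # finds 192, not 64
```

Because the scan walks backward, the newest manifest wins, which is exactly the "truth lives at the tail" rule.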
### Compaction
Compaction merges multiple segments into fewer, larger, sealed segments:
```
Before: [VEC_SEG_1] [VEC_SEG_2] [VEC_SEG_3] [MANIFEST_3]
After: [VEC_SEG_1] [VEC_SEG_2] [VEC_SEG_3] [MANIFEST_3] [VEC_SEG_sealed] [MANIFEST_4]
^^^^^^^^^^^^^^^^^
New sealed segment
merging 1+2+3
```
Old segments are marked with TOMBSTONE entries in the new manifest. Space is
reclaimed when the file is eventually rewritten (or old segments are in a
separate file in multi-file mode).
### Multi-File Mode
For very large datasets, RVF can span multiple files:
```
data.rvf Main file with manifests and hot data
data.rvf.cold.0 Cold segment shard 0
data.rvf.cold.1 Cold segment shard 1
data.rvf.idx.0 Index segment shard 0
```
The manifest in the main file contains shard references with file paths and
byte ranges. This enables cold data to live on slower storage while hot data
stays on fast storage.
## 6. Segment Addressing
Segments are addressed by their `segment_id` (monotonically increasing 64-bit
integer). The manifest maps segment IDs to file offsets (and optionally shard
file paths in multi-file mode).
Within a segment, data is addressed by **block offset** — a 32-bit offset from
the start of the segment payload. This limits individual segments to 4 GB, which
is intentional: it keeps segments manageable for compaction and progressive loading.
### Block Structure Within VEC_SEG
```
+-------------------+
| Block Header (16B)|
| block_id: u32 |
| count: u32 |
| dim: u16 |
| dtype: u8 |
| pad: [u8; 5] |
+-------------------+
| Vectors |
| (count * dim * |
| sizeof(dtype)) |
| [64B aligned] |
+-------------------+
| ID Map |
| (varint delta |
| encoded IDs) |
+-------------------+
| Block Footer |
| crc32c: u32 |
+-------------------+
```
Vectors within a block are stored **columnar** — all dimension 0 values, then all
dimension 1 values, etc. This maximizes compression ratio. But the HOT_SEG stores
vectors **interleaved** (row-major) for cache-friendly sequential scan during
top-K refinement.
## 7. Invariants
1. Segment IDs are strictly monotonically increasing within a file
2. A valid RVF file contains at least one MANIFEST_SEG
3. The last MANIFEST_SEG is always the source of truth
4. Segment headers are always 64-byte aligned
5. No segment payload exceeds 4 GB
6. Content hashes are computed over the raw (uncompressed, unencrypted) payload
7. Sealed segments are never modified — only tombstoned
8. A reader that cannot find a valid MANIFEST_SEG must reject the file

# RVF Manifest System
## 1. Two-Level Manifest Architecture
The manifest system is what makes RVF progressive. Instead of a monolithic directory
that must be fully parsed before any query, RVF uses a two-level manifest that
enables instant boot followed by incremental refinement.
```
EOF
|
v
+--------------------------------------------------+
| Level 0: Root Manifest (fixed 4096 bytes) |
| - Magic + version |
| - Pointer to Level 1 manifest segment |
| - Hotset pointers (inline) |
| - Total vector count |
| - Dimension |
| - Epoch counter |
| - Profile declaration |
+--------------------------------------------------+
|
| points to
v
+--------------------------------------------------+
| Level 1: Full Manifest (variable size) |
| - Complete segment directory |
| - Temperature tier map |
| - Index layer availability |
| - Overlay epoch chain |
| - Compaction state |
| - Shard references (multi-file) |
| - Capability manifest |
+--------------------------------------------------+
```
### Why Two Levels
A reader performing a cold start needs only Level 0 (4 KB). From Level 0 alone,
it can locate the entry points, coarse routing graph, quantization dictionary,
and centroids — enough to answer approximate queries immediately.
Level 1 is loaded asynchronously to enable full-quality queries, but the system
is functional before Level 1 is fully parsed.
## 2. Level 0: Root Manifest
The root manifest is always the **last 4096 bytes** of the file (or the last
4096 bytes of the most recent MANIFEST_SEG). Its fixed size enables instant
location: `seek(EOF - 4096)`.
### Binary Layout
```
Offset Size Field Description
------ ---- ----- -----------
0x000 4 magic 0x52564D30 ("RVM0")
0x004 2 version Root manifest version
0x006 2 flags Root manifest flags
0x008 8 l1_manifest_offset Byte offset to Level 1 manifest segment
0x010 8 l1_manifest_length Byte length of Level 1 manifest segment
0x018 8 total_vector_count Total vectors across all segments
0x020 2 dimension Vector dimensionality
0x022 1 base_dtype Base data type enum
0x023 1 profile_id Domain profile (0=generic, 1=dna, 2=text, 3=graph, 4=vision)
0x024 4 epoch Current overlay epoch number
0x028 8 created_ns File creation timestamp (ns)
0x030 8 modified_ns Last modification timestamp (ns)
--- Hotset Pointers (the key to instant boot) ---
0x038 8 entrypoint_seg_offset Offset to segment containing HNSW entry points
0x040 4 entrypoint_block_offset Block offset within that segment
0x044 4 entrypoint_count Number of entry points
0x048 8 toplayer_seg_offset Offset to segment with top-layer adjacency
0x050 4 toplayer_block_offset Block offset
0x054 4 toplayer_node_count Nodes in top layer
0x058 8 centroid_seg_offset Offset to segment with cluster centroids / pivots
0x060 4 centroid_block_offset Block offset
0x064 4 centroid_count Number of centroids
0x068 8 quantdict_seg_offset Offset to quantization dictionary segment
0x070 4 quantdict_block_offset Block offset
0x074 4 quantdict_size Dictionary size in bytes
0x078 8 hot_cache_seg_offset Offset to HOT_SEG with interleaved hot vectors
0x080 4 hot_cache_block_offset Block offset
0x084 4 hot_cache_vector_count Vectors in hot cache
0x088 8 prefetch_map_offset Offset to prefetch hint table
0x090 4 prefetch_map_entries Number of prefetch entries
--- Crypto ---
0x094 2 sig_algo Manifest signature algorithm
0x096 2 sig_length Signature length
0x098 var signature Manifest signature (up to 3400 bytes for ML-DSA-65)
--- Padding to 4096 bytes ---
0xF00 252 reserved Reserved / zero-padded to 4096
0xFFC 4 root_checksum CRC32C of bytes 0x000-0xFFB
```
**Total**: Exactly 4096 bytes (one page, one disk sector on most hardware).
### Hotset Pointers
The six hotset pointers are the minimum information needed to answer a query:
1. **Entry points**: Where to start HNSW traversal
2. **Top-layer adjacency**: Coarse routing to the right neighborhood
3. **Centroids/pivots**: For IVF-style pre-filtering or partition routing
4. **Quantization dictionary**: For decoding compressed vectors
5. **Hot cache**: Pre-decoded interleaved vectors for top-K refinement
6. **Prefetch map**: Contiguous neighbor-list pages with prefetch offsets
With these six pointers, a reader can:
- Start HNSW search at the entry point
- Route through the top layer
- Quantize the query using the dictionary
- Scan the hot cache for refinement
- Prefetch neighbor pages for cache-friendly traversal
All without reading Level 1 or any cold segments.
## 3. Level 1: Full Manifest
Level 1 is a variable-size segment (type `MANIFEST_SEG`) referenced by Level 0.
It contains the complete file directory.
### Structure
Level 1 is encoded as a sequence of typed records using a tag-length-value (TLV)
scheme for forward compatibility:
```
+---+---+---+---+---+---+---+---+
| Tag (2B) | Length (4B) | Pad | <- 8-byte aligned record header
+---+---+---+---+---+---+---+---+
| Value (Length bytes) |
| [padded to 8-byte boundary] |
+---------------------------------+
```
### Record Types
```
Tag Name Description
--- ---- -----------
0x0001 SEGMENT_DIR Array of segment directory entries
0x0002 TEMP_TIER_MAP Temperature tier assignments per block
0x0003 INDEX_LAYERS Index layer availability bitmap
0x0004 OVERLAY_CHAIN Epoch chain with rollback pointers
0x0005 COMPACTION_STATE Active/tombstoned segment sets
0x0006 SHARD_REFS Multi-file shard references
0x0007 CAPABILITY_MANIFEST What this file can do (features, limits)
0x0008 PROFILE_CONFIG Domain-specific configuration
0x0009 ACCESS_SKETCH_REF Pointer to latest SKETCH_SEG
0x000A PREFETCH_TABLE Full prefetch hint table
0x000B ID_RESTART_POINTS Restart point index for varint delta IDs
0x000C WITNESS_CHAIN Proof-of-computation witness chain
0x000D KEY_DIRECTORY Encryption key references (not keys themselves)
```
### Segment Directory Entry
```
Offset Size Field Description
------ ---- ----- -----------
0x00 8 segment_id Segment ordinal
0x08 1 seg_type Segment type enum
0x09 1 tier Temperature tier (0=hot, 1=warm, 2=cold)
0x0A 2 flags Segment flags
0x0C 4 reserved Must be zero
0x10 8 file_offset Byte offset in file (or shard)
0x18 8 payload_length Decompressed payload length
0x20 8 compressed_length Compressed length (0 if uncompressed)
0x28 2 shard_id Shard index (0 for main file)
0x2A 2 compression Compression algorithm
0x2C 4 block_count Number of blocks in segment
0x30 16 content_hash Payload hash (first 128 bits)
```
**Total**: 64 bytes per entry (cache-line aligned).
## 4. Manifest Lifecycle
### Writing a New Manifest
Every mutation to the file produces a new MANIFEST_SEG appended at the tail:
```
1. Compute new Level 1 manifest (segment directory + metadata)
2. Write Level 1 as a MANIFEST_SEG payload
3. Compute Level 0 root manifest pointing to Level 1
4. Write Level 0 as the last 4096 bytes of the MANIFEST_SEG
5. fsync
```
The MANIFEST_SEG payload structure is:
```
+-----------------------------------+
| Level 1 manifest (variable size) |
+-----------------------------------+
| Level 0 root manifest (4096 B) | <-- Always the last 4096 bytes
+-----------------------------------+
```
### Reading the Manifest
```
1. seek(EOF - 4096)
2. Read 4096 bytes -> Level 0 root manifest
3. Validate magic (0x52564D30) and checksum
4. If valid: extract hotset pointers -> system is queryable
5. Async: read Level 1 at l1_manifest_offset -> full directory
6. If Level 0 is invalid: scan backward for previous MANIFEST_SEG
```
Step 6 provides crash recovery. If the latest manifest write was interrupted,
the previous manifest is still valid. Readers scan backward at 64-byte aligned
boundaries looking for the RVFS magic + MANIFEST_SEG type.
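Steps 1-4 can be sketched directly. One stated substitution: `zlib.crc32` (CRC-32/IEEE) stands in for CRC32C here, since the Castagnoli polynomial is not in the Python standard library; a real reader would use a CRC32C implementation. Offsets match the Level 0 layout above:

```python
import struct
import zlib

ROOT_MAGIC = 0x52564D30   # "RVM0"

def read_root_manifest(data: bytes):
    """Validate the last 4096 bytes as a Level 0 root manifest.
    Returns a partial parse on success, None on failure (which would
    trigger the backward scan in step 6)."""
    if len(data) < 4096:
        return None
    root = data[-4096:]
    magic, = struct.unpack_from("<I", root, 0x000)
    stored, = struct.unpack_from("<I", root, 0xFFC)   # root_checksum
    if magic != ROOT_MAGIC or zlib.crc32(root[:0xFFC]) != stored:
        return None
    epoch, = struct.unpack_from("<I", root, 0x024)
    return {"epoch": epoch}

# Build a synthetic, minimally valid root manifest
root = bytearray(4096)
struct.pack_into("<I", root, 0x000, ROOT_MAGIC)
struct.pack_into("<I", root, 0x024, 7)                         # epoch = 7
struct.pack_into("<I", root, 0xFFC, zlib.crc32(bytes(root[:0xFFC])))
info = read_root_manifest(bytes(root))
```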
### Manifest Chain
Each manifest implicitly forms a chain through the segment ID ordering. For
explicit rollback support, Level 1 contains the `OVERLAY_CHAIN` record which
stores:
```
epoch: u32 Current epoch
prev_manifest_offset: u64 Offset of previous MANIFEST_SEG
prev_manifest_id: u64 Segment ID of previous MANIFEST_SEG
checkpoint_hash: [u8; 16] Hash of the complete state at this epoch
```
This enables point-in-time recovery and bisection debugging.
## 5. Hotset Pointer Semantics
### Entry Point Stability
Entry points are the HNSW nodes at the highest layer. They change rarely (only
when the index is rebuilt or a new highest-layer node is inserted). The root
manifest caches them directly so they survive across manifest generations without
re-reading the index.
### Centroid Refresh
Centroids may drift as data is added. The manifest tracks a `centroid_epoch` — if
the current epoch exceeds centroid_epoch + threshold, the runtime should schedule
centroid recomputation. But the stale centroids remain usable (recall degrades
gracefully, it does not fail).
### Hot Cache Coherence
The hot cache in HOT_SEG is a **read-optimized snapshot** of the most-accessed
vectors. It may be stale relative to the latest VEC_SEGs. The manifest tracks
a `hot_cache_epoch` for staleness detection. Queries use the hot cache for fast
initial results, then refine against authoritative VEC_SEGs if needed.
## 6. Progressive Boot Sequence
```
Time Action System State
---- ------ ------------
t=0 Read last 4 KB (Level 0) Booting
t+1ms Parse hotset pointers Queryable (approximate)
t+2ms mmap entry points + top layer Better routing
t+5ms mmap hot cache + quant dict Fast top-K refinement
t+10ms Start loading Level 1 Discovering full directory
t+50ms Level 1 parsed Full segment awareness
t+100ms mmap warm VEC_SEGs Recall improving
t+500ms mmap cold VEC_SEGs Full recall
t+1s Background index layer build Converging to optimal
```
For a 10M vector file (~7.7 GB at 384 dimensions, float16):
- Level 0 read: 4 KB in <1 ms
- Hotset data: ~2-4 MB (entry points + top layer + centroids + hot cache)
- First query: within 5-10 ms of open
- Full convergence: 1-5 seconds depending on storage speed

# RVF Temperature Tiering
## 1. Adaptive Layout as a First-Class Concept
Traditional vector formats place data once and leave it. RVF treats data placement
as a **continuous optimization problem**. Every vector block has a temperature, and
the format periodically reorganizes to keep hot data fast and cold data small.
```
Access Frequency
^
|
Tier 0 (HOT) | ████████ fp16 / 8-bit, interleaved
| ████████ < 1μs random access
|
Tier 1 (WARM) | ░░░░░░░░░░░░░░░░ 5-7 bit quantized
| ░░░░░░░░░░░░░░░░ columnar, compressed
|
Tier 2 (COLD) | ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 3-bit or 1-bit
| ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ heavy compression
|
+------------------------------------> Vector ID
```
### Tier Definitions
| Tier | Name | Quantization | Layout | Compression | Access Latency |
|------|------|-------------|--------|-------------|----------------|
| 0 | Hot | fp16 or int8 | Interleaved (row-major) | None or LZ4 | < 1 μs |
| 1 | Warm | 5-7 bit SQ/PQ | Columnar | LZ4 or ZSTD | 1-10 μs |
| 2 | Cold | 3-bit or binary | Columnar | ZSTD level 9+ | 10-100 μs |
### Memory Ratios
For 384-dimensional vectors (typical embedding size):
| Tier | Bytes/Vector | Ratio vs fp32 | 10M Vectors |
|------|-------------|---------------|-------------|
| fp32 (raw) | 1536 B | 1.0x | 14.3 GB |
| Tier 0 (fp16) | 768 B | 2.0x | 7.2 GB |
| Tier 0 (int8) | 384 B | 4.0x | 3.6 GB |
| Tier 1 (6-bit) | 288 B | 5.3x | 2.7 GB |
| Tier 1 (5-bit) | 240 B | 6.4x | 2.2 GB |
| Tier 2 (3-bit) | 144 B | 10.7x | 1.3 GB |
| Tier 2 (1-bit) | 48 B | 32.0x | 0.45 GB |
## 2. Access Counter Sketch
Temperature decisions require knowing which blocks are accessed frequently.
RVF maintains a lightweight **Count-Min Sketch** per block set, stored in
SKETCH_SEG segments.
### Sketch Parameters
```
Width (w): 1024 counters
Depth (d): 4 hash functions
Counter size: 8-bit saturating (max 255)
Memory: 1024 * 4 * 1 = 4 KB per sketch
Granularity: One sketch per 1024-vector block
Decay: Halve all counters every 2^16 accesses (aging)
```
For 10M vectors in 1024-vector blocks:
- 9,766 blocks
- 9,766 * 4 KB = ~38 MB of sketches
- Stored in SKETCH_SEG, referenced by manifest
### Sketch Operations
**On query access**:
```
block_id = vector_id / block_size
for i in 0..depth:
idx = hash_i(block_id) % width
sketch[i][idx] = min(sketch[i][idx] + 1, 255)
```
**On temperature check**:
```
count = min over i of sketch[i][hash_i(block_id) % width]
if count > HOT_THRESHOLD: tier = 0
elif count > WARM_THRESHOLD: tier = 1
else: tier = 2
```
**Aging** (every 2^16 accesses):
```
for all counters: counter = counter >> 1
```
This ensures the sketch tracks *recent* access patterns, not cumulative history.
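The three operations fit in a few lines. A minimal sketch with the parameters above (width 1024, depth 4, 8-bit saturating counters, halving decay); deriving the row hashes from salted BLAKE2b is an implementation choice of this sketch, not something the spec mandates:

```python
import hashlib

W, D, MAX = 1024, 4, 255   # width, depth, saturating counter ceiling

class AccessSketch:
    def __init__(self):
        self.rows = [[0] * W for _ in range(D)]

    def _idx(self, row, block_id):
        """Row-specific hash: BLAKE2b with a per-row salt."""
        h = hashlib.blake2b(block_id.to_bytes(8, "little"),
                            salt=bytes([row]) * 16, digest_size=8)
        return int.from_bytes(h.digest(), "little") % W

    def touch(self, block_id):
        """On query access: bump one counter per row, saturating at 255."""
        for r in range(D):
            i = self._idx(r, block_id)
            self.rows[r][i] = min(self.rows[r][i] + 1, MAX)

    def estimate(self, block_id):
        """Min over rows: never underestimates the true count."""
        return min(self.rows[r][self._idx(r, block_id)] for r in range(D))

    def age(self):
        """Decay: halve every counter so recent accesses dominate."""
        for row in self.rows:
            for i in range(W):
                row[i] >>= 1

s = AccessSketch()
for _ in range(100):
    s.touch(42)            # block 42 is hot
s.touch(7)                 # block 7 is touched once
hot = s.estimate(42)       # >= 100 (CMS only overestimates)
s.age()                    # after decay, estimates halve
```

Tier assignment is then a threshold comparison on `estimate(block_id)`, as in the pseudocode above.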
### Why Count-Min Sketch
| Alternative | Memory | Accuracy | Update Cost |
|------------|--------|----------|-------------|
| Per-vector counter | 80 MB (10M * 8B) | Exact | O(1) |
| Count-Min Sketch | 38 MB | ~99.9% | O(depth) = O(4) |
| HyperLogLog | 6 MB | ~98% | O(1) but cardinality only |
| Bloom filter | 12 MB | No counting | N/A |
Count-Min Sketch is the best trade-off: near-exact accuracy with bounded memory
and constant-time updates.
## 3. Promotion and Demotion
### Promotion: Warm/Cold -> Hot
When a block's access count exceeds HOT_THRESHOLD for two consecutive sketch
epochs:
```
1. Read the block from its current VEC_SEG
2. Decode/dequantize vectors to fp16 or int8
3. Rearrange from columnar to interleaved layout
4. Write as a new HOT_SEG (or append to existing HOT_SEG)
5. Update manifest with new tier assignment
6. Optionally: add neighbor lists to HOT_SEG for locality
```
### Demotion: Hot -> Warm -> Cold
When a block's access count drops below WARM_THRESHOLD:
```
1. The block is not immediately rewritten
2. On next compaction cycle, the block is written to the appropriate tier
3. Quantization is applied during compaction (not lazily)
4. The HOT_SEG entry is tombstoned in the manifest
```
### Eviction as Compression
The key insight: **eviction from hot tier is just compression, not deletion**.
The vector data is always present — it just moves to a more compressed
representation. This means:
- No data loss on eviction
- Recall degrades gracefully (quantized vectors still contribute to search)
- The file naturally compresses over time as access patterns stabilize
## 4. Temperature-Aware Compaction
Standard compaction merges segments for space efficiency. Temperature-aware
compaction also **rearranges blocks by tier**:
```
Before compaction:
VEC_SEG_1: [hot] [cold] [warm] [hot] [cold]
VEC_SEG_2: [warm] [hot] [cold] [warm] [warm]
After temperature-aware compaction:
HOT_SEG: [hot] [hot] [hot] <- interleaved, fp16
VEC_SEG_W: [warm] [warm] [warm] [warm] <- columnar, 6-bit
VEC_SEG_C: [cold] [cold] [cold] <- columnar, 3-bit
```
This creates **physical locality by temperature**: hot blocks are contiguous
(good for sequential scan), warm blocks are contiguous (good for batch decode),
cold blocks are contiguous (good for compression ratio).
### Compaction Triggers
| Trigger | Condition | Action |
|---------|-----------|--------|
| Sketch epoch | Every N writes | Evaluate all block temperatures |
| Space amplification | Dead space > 30% | Merge + rewrite segments |
| Tier imbalance | Hot tier > 20% of data | Demote cold blocks |
| Hot miss rate | Hot cache miss > 10% | Promote missing blocks |
## 5. Quantization Strategies by Tier
### Tier 0: Hot
**Scalar quantization to int8** (preferred) or **fp16** (for maximum recall).
```
Encoding:
q = round((v - min) / (max - min) * 255)
Decoding:
v = q / 255 * (max - min) + min
Parameters stored in QUANT_SEG:
min: f32 per dimension
max: f32 per dimension
```
Distance computation directly on int8 using SIMD (vpsubb + vpmaddubsw on AVX-512).
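The encode/decode formulas above, applied per dimension with stored `min`/`max` parameters, look like this (pure-Python sketch; a real implementation vectorizes this):

```python
def sq_encode(v, lo, hi):
    """Per-dimension scalar quantization to an 8-bit code (0..255)."""
    return [round((x - l) / (h - l) * 255) for x, l, h in zip(v, lo, hi)]

def sq_decode(q, lo, hi):
    """Dequantize codes back to approximate float values."""
    return [c / 255 * (h - l) + l for c, l, h in zip(q, lo, hi)]

# Per-dimension ranges, as stored in QUANT_SEG
lo = [-1.0, -1.0, 0.0]
hi = [1.0, 1.0, 2.0]

v = [0.5, -0.25, 1.0]
codes = sq_encode(v, lo, hi)
restored = sq_decode(codes, lo, hi)
err = max(abs(a - b) for a, b in zip(v, restored))
# reconstruction error is bounded by half a quantization step per dimension
```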
### Tier 1: Warm
**Product Quantization (PQ)** with 5-7 bits per sub-vector.
```
Parameters:
M subspaces: 48 (for 384-dim vectors, 8 dims per subspace)
K centroids per sub: 64 (6-bit) or 128 (7-bit)
Codebook: M * K * 8 * sizeof(f32) = 48 * 64 * 8 * 4 = 96 KB
Encoding:
For each subvector: find nearest centroid -> store centroid index
Distance computation:
ADC (Asymmetric Distance Computation) with precomputed distance tables
```
### Tier 2: Cold
**Binary quantization** (1-bit) or **ternary quantization** (2-bit / 3-bit).
```
Binary encoding:
b = sign(v) -> 1 bit per dimension
384 dims -> 48 bytes per vector (32x compression)
Distance:
Hamming distance via POPCNT
XOR + POPCNT on AVX-512: 512 bits per cycle
Ternary (3-bit with magnitude):
t = {-1, 0, +1} based on threshold
magnitude = |v| quantized to 3 levels
384 dims -> 144 bytes per vector (10.7x compression)
```
### Codebook Storage
All quantization parameters (codebooks, min/max ranges, centroids) are stored
in QUANT_SEG segments. The root manifest's `quantdict_seg_offset` hotset pointer
references the active quantization dictionary for fast boot.
Multiple QUANT_SEGs can coexist for different tiers — the manifest maps each
tier to its dictionary.
## 6. Hardware Adaptation
### Desktop (AVX-512)
- Hot tier: int8 with VNNI dot product (4 int8 multiplies per cycle)
- Warm tier: PQ with AVX-512 gather for table lookups
- Cold tier: Binary with VPOPCNTDQ (512-bit popcount)
### ARM (NEON)
- Hot tier: int8 with SDOT instruction
- Warm tier: PQ with TBL for table lookups
- Cold tier: Binary with CNT (population count)
### WASM (v128)
- Hot tier: int8 with i16x8.relaxed_dot_i8x16_i7x16_s (relaxed SIMD, if available)
- Warm tier: Scalar PQ (no gather)
- Cold tier: Binary with manual popcount
### Cognitum Tile (8KB code + 8KB data + 64KB SIMD)
- Hot tier only: int8 interleaved, fits in SIMD scratch
- No warm/cold — data stays on hub, tile fetches blocks on demand
- Sketch is maintained by hub, not tile
## 7. Self-Organization Over Time
```
t=0 All data Tier 1 (default warm)
|
t+N First sketch epoch: identify hot blocks
Promote top 5% to Tier 0
|
t+2N Second epoch: validate promotions
Demote false positives back to Tier 1
Identify true cold blocks (0 access in 2 epochs)
|
t+3N Compaction: physically separate tiers
HOT_SEG created with interleaved layout
Cold blocks compressed to 3-bit
|
t+∞ Equilibrium: ~5% hot, ~30% warm, ~65% cold
File size: ~2-3x smaller than uniform fp16
Query p95: dominated by hot tier latency
```
The format converges to an equilibrium that reflects actual usage. No manual
tuning required.

# RVF Progressive Indexing
## 1. Index as Layers of Availability
Traditional HNSW serialization is all-or-nothing: either the full graph is loaded,
or nothing works. RVF decomposes the index into three layers of availability, each
independently useful, each stored in separate INDEX_SEG segments.
```
Layer C: Full Adjacency
+--------------------------------------------------+
| Complete neighbor lists for every node at every |
| HNSW level. Built lazily. Optional for queries. |
| Recall: >= 0.95 |
+--------------------------------------------------+
^ loaded last (seconds to minutes)
|
Layer B: Partial Adjacency
+--------------------------------------------------+
| Neighbor lists for the most-accessed region |
| (determined by temperature sketch). Covers the |
| hot working set of the graph. |
| Recall: >= 0.85 |
+--------------------------------------------------+
^ loaded second (100ms - 1s)
|
Layer A: Entry Points + Coarse Routing
+--------------------------------------------------+
| HNSW entry points. Top-layer adjacency lists. |
| Cluster centroids for IVF pre-routing. |
| Always present. Always in Level 0 hotset. |
| Recall: >= 0.70 |
+--------------------------------------------------+
^ loaded first (< 5ms)
|
File open
```
### Why Three Layers
| Layer | Purpose | Data Size (10M vectors) | Load Time (NVMe) |
|-------|---------|------------------------|-------------------|
| A | First query possible | 1-4 MB | < 5 ms |
| B | Good quality for working set | 50-200 MB | 100-500 ms |
| C | Full recall for all queries | 1-4 GB | 2-10 s |
A system that only loads Layer A can still answer queries — just with lower recall.
As layers B and C load asynchronously, quality improves transparently.
## 2. Layer A: Entry Points and Coarse Routing
### Content
- **HNSW entry points**: The node(s) at the highest layer of the HNSW graph.
Typically 1 node, but may be multiple for redundancy.
- **Top-layer adjacency**: Full neighbor lists for all nodes at HNSW layers
>= ceil(ln(N) / ln(M)) - 2. For 10M vectors with M=16, this is layers 4-6,
containing ~100-1000 nodes.
- **Cluster centroids**: K centroids (K = sqrt(N) typically, so ~3162 for 10M)
used for IVF-style partition routing.
- **Centroid-to-partition map**: Which centroid owns which vector ID ranges.
### Storage
Layer A data is stored in a dedicated INDEX_SEG with `flags.HOT` set. The root
manifest's hotset pointers reference this segment directly. On cold start, this
is the first data mapped after the manifest.
### Binary Layout of Layer A INDEX_SEG
```
+-------------------------------------------+
| Header: INDEX_SEG, flags=HOT |
+-------------------------------------------+
| Block 0: Entry Points |
| entry_count: u32 |
| max_layer: u32 |
| [entry_node_id: u64, layer: u32] * N |
+-------------------------------------------+
| Block 1: Top-Layer Adjacency |
| layer_count: u32 |
| For each layer (top to bottom): |
| node_count: u32 |
| For each node: |
| node_id: u64 |
| neighbor_count: u16 |
| [neighbor_id: u64] * neighbor_count |
| [64B padding] |
+-------------------------------------------+
| Block 2: Centroids |
| centroid_count: u32 |
| dim: u16 |
| dtype: u8 (fp16) |
| [centroid_vector: fp16 * dim] * K |
| [64B aligned] |
+-------------------------------------------+
| Block 3: Partition Map |
| partition_count: u32 |
| For each partition: |
| centroid_id: u32 |
| vector_id_start: u64 |
| vector_id_end: u64 |
| segment_ref: u64 (segment_id) |
| block_ref: u32 (block offset) |
+-------------------------------------------+
```
### Query Using Only Layer A
```python
def query_layer_a_only(query, k, layer_a, n_probe, hot_cache=None):
# Step 1: Find nearest centroids
dists = [distance(query, c) for c in layer_a.centroids]
top_partitions = top_n(dists, n_probe)
# Step 2: HNSW search through top layers only
entry = layer_a.entry_points[0]
current = entry
for layer in range(layer_a.max_layer, layer_a.min_available_layer, -1):
current = greedy_search(query, current, layer_a.adjacency[layer])
# Step 3: If hot cache available, refine against it
if hot_cache:
candidates = scan_hot_cache(query, hot_cache, current.partition)
return top_k(candidates, k)
# Step 4: Otherwise, return centroid-approximate results
return approximate_from_centroids(query, top_partitions, k)
```
Expected recall: 0.65-0.75 (depends on centroid quality and hot cache coverage).
## 3. Layer B: Partial Adjacency
### Content
Neighbor lists for the **hot region** of the graph — the set of nodes that appear
most frequently in query traversals. Determined by the temperature sketch (see
03-temperature-tiering.md).
Typically covers:
- All nodes at HNSW layers >= 2
- Layer 0-1 nodes in the hot temperature tier
- ~10-20% of total nodes
### Storage
Layer B is stored in one or more INDEX_SEGs without the HOT flag. The Level 1
manifest maps these segments and records which node ID ranges they cover.
### Incremental Build
Layer B can be built incrementally:
```
1. After Layer A is loaded, begin query serving
2. In background: read VEC_SEGs for hot-tier blocks
3. Build HNSW adjacency for those blocks
4. Write as new INDEX_SEG
5. Update manifest to include Layer B
6. Future queries use Layer B for better recall
```
This means the index improves over time without blocking any queries.
### Partial Adjacency Routing
When a query traversal reaches a node without Layer B adjacency (i.e., it's in
the cold region), the system falls back to:
1. **Centroid routing**: Use Layer A centroids to estimate the nearest region
2. **Linear scan**: Scan the relevant VEC_SEG block directly
3. **Approximate**: Accept slightly lower recall for that portion
```python
def search_with_partial_index(query, k, layers):
# Start with Layer A routing
current = hnsw_search_layers(query, layers.a, layers.a.max_layer, 2)
# Continue with Layer B (where available)
if layers.b.has_node(current):
current = hnsw_search_layers(query, layers.b, 1, 0,
start=current)
else:
# Fallback: scan the block containing current
candidates = linear_scan_block(query, current.block)
current = best_of(current, candidates)
return top_k(current.visited, k)
```
## 4. Layer C: Full Adjacency
### Content
Complete neighbor lists for every node at every HNSW level. This is the
traditional full HNSW graph.
### Storage
Layer C may be split across multiple INDEX_SEGs for large datasets. The
manifest records the node ID ranges covered by each segment.
### Lazy Build
Layer C is built lazily — it is not required for the file to be functional.
The build process runs as a background task:
```
1. Identify unindexed VEC_SEG blocks (those without Layer C adjacency)
2. Read blocks in partition order (good locality)
3. Build HNSW adjacency using the existing partial graph as scaffold
4. Write new INDEX_SEG(s)
5. Update manifest
```
### Build Prioritization
Blocks are indexed in temperature order:
1. Hot blocks first (most query benefit)
2. Warm blocks next
3. Cold blocks last (may never be indexed if queries don't reach them)
This means the index build converges to useful quality fast, then approaches
completeness asymptotically.
## 5. Index Segment Binary Format
### Adjacency List Encoding
Neighbor lists are stored using **varint delta encoding with restart points**
for fast random access:
```
+-------------------------------------------+
| Restart Point Index |
| restart_interval: u32 (e.g., 64) |
| restart_count: u32 |
| [restart_offset: u32] * restart_count |
| [64B aligned] |
+-------------------------------------------+
| Adjacency Data |
| For each node (sorted by node_id): |
| neighbor_count: varint |
| [delta_encoded_neighbor_id: varint] |
| (restart point every N nodes) |
+-------------------------------------------+
```
**Restart points**: Every `restart_interval` nodes (default 64), the delta
encoding resets to absolute IDs. This enables O(1) random access to any node's
neighbors by:
1. Binary search the restart point index for the nearest restart <= target
2. Seek to that restart offset
3. Sequentially decode from restart to target (at most 63 decodes)
### Varint Encoding
Standard LEB128 varint:
- Values 0-127: 1 byte
- Values 128-16383: 2 bytes
- Values 16384-2097151: 3 bytes
For delta-encoded neighbor IDs (typical delta: 1-1000), most values fit in 1-2
bytes, giving ~3-4x compression over fixed u64.
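A minimal LEB128 encoder/decoder illustrating the byte-length behavior described above (one continuation bit per byte, 7 payload bits):

```python
def leb128_encode(n):
    """Encode an unsigned integer as LEB128: 7 bits per byte, MSB = continue."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # final byte
            return bytes(out)

def leb128_decode(buf, pos=0):
    """Decode one LEB128 value; returns (value, next_position)."""
    result = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7
```

Values up to 127 take one byte and values up to 16383 take two, which is why delta-encoded neighbor IDs (typical deltas of 1-1000) mostly land in the 1-2 byte range.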
### Prefetch Hints
The manifest's prefetch table maps node ID ranges to contiguous page ranges:
```
Prefetch Entry:
node_id_start: u64
node_id_end: u64
page_offset: u64 Offset of first contiguous page
page_count: u32 Number of contiguous pages
prefetch_ahead: u32 Pages to prefetch ahead of current access
```
When the HNSW search accesses a node, the runtime issues `madvise(WILLNEED)`
(or equivalent) for the next `prefetch_ahead` pages. This hides disk/memory
latency behind computation.
## 6. Index Consistency
### Append-Only Index Updates
When new vectors are added:
1. New vectors go into a **fresh VEC_SEG** (append-only)
2. A temporary in-memory index covers the new vectors
3. When the in-memory index reaches a threshold, it is written as a new INDEX_SEG
4. The manifest is updated to include both the old and new INDEX_SEGs
5. Queries search both indexes and merge results
This is analogous to LSM-tree compaction levels but for graph indexes.
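The search-both-and-merge step (step 5) can be sketched as follows. This is an illustrative model where each live index is represented as a callable returning `(distance, vector_id)` pairs, which is an assumption of the sketch, not the actual API:

```python
def search_merged(query, k, searchers):
    """Query every live index (old sealed + new in-memory), merge by distance."""
    candidates = []
    for search in searchers:
        candidates.extend(search(query, k))
    candidates.sort()                 # ascending (distance, id)
    seen, merged = set(), []
    for dist, vid in candidates:
        if vid not in seen:           # dedupe IDs found by multiple indexes
            seen.add(vid)
            merged.append((dist, vid))
        if len(merged) == k:
            break
    return merged
```

Deduplication matters because a vector indexed in a sealed INDEX_SEG may also appear in the in-memory index after an update; the merge keeps its best-ranked occurrence.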
### Index Merging
When too many small INDEX_SEGs accumulate:
```
1. Read all small INDEX_SEGs
2. Build a unified HNSW graph over all vectors
3. Write as a single sealed INDEX_SEG
4. Tombstone old INDEX_SEGs in manifest
```
### Concurrent Read/Write
Readers always see a consistent snapshot through the manifest chain:
- Reader opens file -> reads manifest -> has immutable segment set
- Writer appends new segments + new manifest
- Reader continues using old manifest until it explicitly re-reads
- No locks needed — append-only guarantees no mutation of existing data
## 7. Query Path Integration
The complete query path combining progressive indexing with temperature tiering:
```
Query
|
v
+-----------+
| Layer A | Entry points + top-layer routing
| (always) | ~5ms to load on cold start
+-----------+
|
Is Layer B available for this region?
/ \
Yes No
/ \
+-----------+ +-----------+
| Layer B | | Centroid |
| HNSW | | Fallback |
| search | | + scan |
+-----------+ +-----------+
\ /
\ /
v v
+-----------+
| Candidate |
| Set |
+-----------+
|
Is hot cache available?
/ \
Yes No
/ \
+-----------+ +-----------+
| Hot cache | | Decode |
| re-rank | | from |
| (int8/fp16)| | VEC_SEG |
+-----------+ +-----------+
\ /
v v
+-----------+
| Top-K |
| Results |
+-----------+
```
### Recall Expectations by State
| State | Layers Available | Expected Recall@10 |
|-------|-----------------|-------------------|
| Cold start (L0 only) | A | 0.65-0.75 |
| L0 + hot cache | A + hot | 0.75-0.85 |
| L0 + L1 loading | A + B partial | 0.80-0.90 |
| L1 complete | A + B | 0.85-0.92 |
| Full load | A + B + C | 0.95-0.99 |
| Full + optimized | A + B + C + hot | 0.98-0.999 |

# RVF Overlay Epochs
## 1. Streaming Dynamic Min-Cut Overlay
The overlay system manages dynamic graph partitioning — how the vector space is
subdivided for distributed search, shard routing, and load balancing. Unlike
static partitioning, RVF overlays evolve with the data through an epoch-based
model that bounds memory, bounds load time, and enables rollback.
## 2. Overlay Segment Structure
Each OVERLAY_SEG stores a delta relative to the previous epoch's partition state:
```
+-------------------------------------------+
| Header: OVERLAY_SEG |
+-------------------------------------------+
| Epoch Header |
| epoch: u32 |
| parent_epoch: u32 |
| parent_seg_id: u64 |
| rollback_offset: u64 |
| timestamp_ns: u64 |
| delta_count: u32 |
| partition_count: u32 |
+-------------------------------------------+
| Edge Deltas |
| For each delta: |
| delta_type: u8 (ADD=1, REMOVE=2, |
| REWEIGHT=3) |
| src_node: u64 |
| dst_node: u64 |
| weight: f32 (for ADD/REWEIGHT) |
| [64B aligned] |
+-------------------------------------------+
| Partition Summaries |
| For each partition: |
| partition_id: u32 |
| node_count: u64 |
| edge_cut_weight: f64 |
| centroid: [fp16 * dim] |
| node_id_range_start: u64 |
| node_id_range_end: u64 |
| [64B aligned] |
+-------------------------------------------+
| Min-Cut Witness |
| witness_type: u8 |
| 0 = checksum only |
| 1 = full certificate |
| cut_value: f64 |
| cut_edge_count: u32 |
| partition_hash: [u8; 32] (SHAKE-256) |
| If witness_type == 1: |
| [cut_edge: (u64, u64)] * count |
| [64B aligned] |
+-------------------------------------------+
| Rollback Pointer |
| prev_epoch_offset: u64 |
| prev_epoch_hash: [u8; 16] |
+-------------------------------------------+
```
## 3. Epoch Lifecycle
### Epoch Creation
A new epoch is created when:
- A batch of vectors is inserted that changes partition balance by > threshold
- The accumulated edge deltas exceed a size limit (default: 1 MB)
- A manual rebalance is triggered
- A merge/compaction produces a new partition layout
```
Epoch 0 (initial) Epoch 1 Epoch 2
+----------------+ +----------------+ +----------------+
| Full snapshot | | Deltas vs E0 | | Deltas vs E1 |
| of partitions | | +50 edges | | +30 edges |
| 32 partitions | | -12 edges | | -8 edges |
| min-cut: 0.342 | | rebalance: P3 | | split: P7->P7a |
+----------------+ +----------------+ +----------------+
```
### State Reconstruction
To reconstruct the current partition state:
```
1. Read latest MANIFEST_SEG -> get current_epoch
2. Read OVERLAY_SEG for current_epoch
3. If overlay is a delta: recursively read parent epochs
4. Apply deltas in order: base -> epoch 1 -> epoch 2 -> ... -> current
5. Result: complete partition state
```
For efficiency, the manifest caches the **last full snapshot epoch**. Delta
chains never exceed a configurable depth (default: 8 epochs) before a new
snapshot is forced.
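The reconstruction procedure above can be sketched as a walk back to the last snapshot followed by replay, oldest delta first. The segment field names (`is_snapshot`, `parent_epoch`, `base`, `deltas`) are illustrative, not the wire-format names:

```python
def reconstruct_state(segments, current_epoch):
    """Walk parent pointers to the last full snapshot, then replay deltas."""
    chain, epoch = [], current_epoch
    while True:
        seg = segments[epoch]
        chain.append(seg)
        if seg["is_snapshot"]:
            break                          # bounded by max chain depth (8)
        epoch = seg["parent_epoch"]
    state = dict(chain[-1]["base"])        # start from the snapshot state
    for seg in reversed(chain[:-1]):       # apply deltas oldest -> newest
        for op, key, value in seg["deltas"]:
            if op in ("ADD", "REWEIGHT"):
                state[key] = value
            elif op == "REMOVE":
                state.pop(key, None)
    return state
```

Because the chain depth is capped at 8 epochs before a snapshot is forced, this loop is O(1) segments regardless of file age.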
### Compaction (Epoch Collapse)
When the delta chain reaches maximum depth:
```
1. Reconstruct full state from chain
2. Write new OVERLAY_SEG with witness_type=full_snapshot
3. This becomes the new base epoch
4. Old overlay segments are tombstoned
5. New delta chain starts from this base
```
```
Before: E0(snap) -> E1(delta) -> E2(delta) -> ... -> E8(delta)
After: E0(snap) -> ... -> E8(delta) -> E9(snap, compacted)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
These can be garbage collected
```
## 4. Min-Cut Witness
The min-cut witness provides a verifiable record that the current partition
is "good enough": a checksum attests state consistency, and a full certificate
attests that the edge cut is within acceptable bounds.
### Witness Types
**Type 0: Checksum Only**
A SHAKE-256 hash of the complete partition state. Allows verification that
the state is consistent but doesn't prove optimality.
```
witness = SHAKE-256(
for each partition sorted by id:
partition_id || node_count || sorted(node_ids) || edge_cut_weight
)
```
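A Type-0 witness computation can be sketched with Python's `hashlib.shake_256`. The formula above fixes the field order; the exact byte widths and float encoding used here are assumptions of this sketch, not the wire format:

```python
import hashlib
import struct

def partition_witness(partitions):
    """SHAKE-256 over a canonical encoding of the partition state (sketch)."""
    h = hashlib.shake_256()
    for p in sorted(partitions, key=lambda p: p["id"]):  # sorted by id
        h.update(p["id"].to_bytes(4, "little"))
        h.update(len(p["nodes"]).to_bytes(8, "little"))
        for nid in sorted(p["nodes"]):                   # sorted node ids
            h.update(nid.to_bytes(8, "little"))
        h.update(struct.pack("<d", p["edge_cut_weight"]))
    return h.digest(32)                                  # 32-byte witness
```

Sorting both the partitions and their node IDs makes the digest independent of in-memory iteration order, so any reader reconstructing the same logical state produces the same witness.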
**Type 1: Full Certificate**
Lists the actual cut edges. Allows any reader to verify that:
1. The listed edges are the only edges crossing partition boundaries
2. The total cut weight matches `cut_value`
3. No better cut exists within the local search neighborhood (optional)
### Bounded-Time Min-Cut Updates
Full min-cut computation is expensive (O(V * E) for max-flow). RVF uses
**incremental min-cut maintenance**:
For each edge delta:
```
1. If ADD(u, v) where u and v are in same partition:
-> No cut change. O(1).
2. If ADD(u, v) where u in P_i and v in P_j:
-> cut_weight[P_i][P_j] += weight. O(1).
-> Check if moving u to P_j or v to P_i reduces total cut.
-> If yes: execute move, update partition summaries. O(degree).
3. If REMOVE(u, v) across partitions:
-> cut_weight[P_i][P_j] -= weight. O(1).
-> No rebalance needed (cut improved).
4. If REMOVE(u, v) within same partition:
-> Check connectivity. If partition splits: create new partition. O(component).
```
This bounds update time to O(max_degree) per edge delta in the common case,
with O(component_size) in the rare partition-split case.
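Cases 1 and 2 above (the O(1) bookkeeping on ADD) can be sketched as follows; `part_of` and `cut_weight` are illustrative structures, and the optional move-gain check is omitted:

```python
def apply_add(u, v, w, part_of, cut_weight):
    """Handle an ADD(u, v, w) edge delta against the pairwise cut table."""
    pu, pv = part_of[u], part_of[v]
    if pu == pv:
        return cut_weight                  # case 1: same partition, no cut change
    key = (min(pu, pv), max(pu, pv))       # case 2: cross-partition edge
    cut_weight[key] = cut_weight.get(key, 0.0) + w
    return cut_weight
```

A full implementation would follow the cross-partition update with the local move check (does relocating `u` or `v` reduce the total cut?), which is what bounds the cost at O(degree) rather than O(1).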
### Semi-Streaming Min-Cut
For large-scale rebalancing (e.g., after bulk insert), RVF uses a semi-streaming
algorithm inspired by Assadi et al.:
```
Phase 1: Single pass over edges to build a sparse skeleton
- Sample each edge with probability O(1/epsilon)
- Space: O(n * polylog(n))
Phase 2: Compute min-cut on skeleton
- Standard max-flow on sparse graph
- Time: O(n^2 * polylog(n))
Phase 3: Verify against full edge set
- Stream edges again, check cut validity
- If invalid: refine skeleton and repeat
```
This runs in O(n * polylog(n)) space regardless of edge count, making it
suitable for streaming over massive graphs.
## 5. Overlay Size Management
### Size Threshold
Each OVERLAY_SEG has a maximum payload size (configurable, default 1 MB).
When the accumulated deltas for the current epoch approach this threshold,
a new epoch is forced.
### Memory Budget
The total memory for overlay state is bounded:
```
max_overlay_memory = max_chain_depth * max_seg_size + snapshot_size
= 8 * 1 MB + snapshot_size
```
For 10M vectors with 32 partitions:
- Snapshot: ~32 * (8 + 16 + 768) bytes per partition ≈ 25 KB
- Delta chain: ≤ 8 MB
- Total: ≤ 9 MB
This is a fixed overhead regardless of dataset size (partition count scales
sublinearly).
### Garbage Collection
Overlay segments behind the last full snapshot are candidates for garbage
collection. The manifest tracks which overlay segments are still reachable
from the current epoch chain.
```
Reachable: current_epoch -> parent -> ... -> last_snapshot
Unreachable: Everything before last_snapshot (safely deletable)
```
GC runs during compaction. Old OVERLAY_SEGs are tombstoned in the manifest
and their space is reclaimed on file rewrite.
## 6. Distributed Overlay Coordination
When RVF files are sharded across multiple nodes, the overlay system coordinates
partition state:
### Shard-Local Overlays
Each shard maintains its own OVERLAY_SEG chain for its local partitions.
The global partition state is the union of all shard-local overlays.
### Cross-Shard Rebalancing
When a partition becomes unbalanced across shards:
```
1. Coordinator computes target partition assignment
2. Each shard writes a JOURNAL_SEG with vector move instructions
3. Vectors are copied (not moved — append-only) to target shards
4. Each shard writes a new OVERLAY_SEG reflecting the new partition
5. Coordinator writes a global MANIFEST_SEG with new shard map
```
This is eventually consistent — during rebalancing, queries may search both
old and new locations and deduplicate results.
### Consistency Model
**Within a shard**: Linearizable (single-writer, manifest chain)
**Across shards**: Eventually consistent with bounded staleness
The epoch counter provides a total order for convergence checking:
- If all shards report epoch >= E, the global state at epoch E is complete
- Stale shards are detectable by comparing epoch counters
## 7. Epoch-Aware Query Routing
Queries use the overlay state for partition routing:
```python
def route_query(query, overlay):
# Find nearest partition centroids
dists = [distance(query, p.centroid) for p in overlay.partitions]
target_partitions = top_n(dists, n_probe)
# Check epoch freshness
if overlay.epoch < current_epoch - stale_threshold:
# Overlay is stale — broaden search
target_partitions = top_n(dists, n_probe * 2)
return target_partitions
```
### Epoch Rollback
If an overlay epoch is found to be corrupt or suboptimal:
```
1. Read rollback_pointer from current OVERLAY_SEG
2. The pointer gives the offset of the previous epoch's OVERLAY_SEG
3. Write a new MANIFEST_SEG pointing to the previous epoch as current
4. Future writes continue from the rolled-back state
```
This provides O(1) rollback to any ancestor epoch in the chain.
## 8. Integration with Progressive Indexing
The overlay system and the index system are coupled:
- **Partition centroids** in the overlay guide Layer A routing
- **Partition boundaries** determine which INDEX_SEGs cover which regions
- **Partition rebalancing** may invalidate Layer B adjacency for moved vectors
(these are rebuilt lazily)
- **Layer C** is partition-aligned — each INDEX_SEG covers vectors within
a single partition for locality
This means overlay compaction can trigger partial index rebuild, but only for
the affected partitions — not the entire index.

# RVF Ultra-Fast Query Path
## 1. CPU Shape Optimization
The block layout determines performance at the hardware level. RVF is designed
to match the shape of modern CPUs: wide SIMD, deep caches, hardware prefetch.
### Four Optimizations
1. **Strict 64-byte alignment** for all numeric arrays
2. **Columnar + interleaved hybrid** for compression and speed
3. **Prefetch hints** for cache-friendly graph traversal
4. **Dictionary-coded IDs** for fast random access
## 2. Strict Alignment
Every numeric array in RVF starts at a 64-byte aligned offset. This matches:
| Target | Register Width | Alignment |
|--------|---------------|-----------|
| AVX-512 | 512 bits = 64 bytes | 64 B |
| AVX2 | 256 bits = 32 bytes | 64 B (superset) |
| ARM NEON | 128 bits = 16 bytes | 64 B (superset) |
| WASM v128 | 128 bits = 16 bytes | 64 B (superset) |
| Cache line | Typically 64 bytes | 64 B (exact) |
By aligning to 64 bytes, RVF ensures:
- Zero-copy load into any SIMD register (no unaligned penalty)
- No cache-line splits (each access touches exactly one cache line)
- Optimal hardware prefetch behavior (prefetcher operates on cache lines)
### Alignment in Practice
```
Segment header: 64 B (naturally aligned, first item in segment)
Block header: Padded to 64 B boundary
Vector data start: 64 B aligned from block start
Each dimension column: 64 B aligned (columnar VEC_SEG)
Each vector entry: 64 B aligned (interleaved HOT_SEG)
ID map: 64 B aligned
Restart point index: 64 B aligned
```
Padding bytes between sections are zero-filled and excluded from checksums.
## 3. Columnar + Interleaved Hybrid
### Columnar Storage (VEC_SEG) — Optimized for Compression
```
Block layout (1024 vectors, 384 dimensions, fp16):
Offset 0x000: dim_0[vec_0], dim_0[vec_1], ..., dim_0[vec_1023] (2048 B)
Offset 0x800: dim_1[vec_0], dim_1[vec_1], ..., dim_1[vec_1023] (2048 B)
...
Offset 0xBF800: dim_383[vec_0], ..., dim_383[vec_1023] (2048 B)
Total: 384 * 2048 = 786,432 bytes (768 KB per block)
```
**Why columnar for cold/warm storage**:
- Adjacent values in the same dimension are correlated -> higher compression ratio
- LZ4 on columnar fp16 achieves 1.5-2.5x compression (vs 1.1-1.3x on interleaved)
- ZSTD on columnar fp16 achieves 2.5-4x compression
- Batch operations (computing mean, variance) scan one dimension at a time
### Interleaved Storage (HOT_SEG) — Optimized for Speed
```
Entry layout (one hot vector, 384 dim fp16):
Offset 0x000: vector_id (8 B)
Offset 0x008: dim_0, dim_1, dim_2, ..., dim_383 (768 B)
Offset 0x308: neighbor_count (2 B)
Offset 0x30A: neighbor_0, neighbor_1, ... (8 B each)
Offset 0x38A: padding to 64B boundary
--> 960 bytes per entry (at M=16 neighbors)
```
**Why interleaved for hot data**:
- One vector = one sequential read (no column gathering)
- Distance computation: load vector, compute, move to next (streaming pattern)
- Neighbors co-located: after finding a good candidate, immediately traverse
- 960 bytes per entry = 15 cache lines = predictable memory access
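The 960-byte figure follows directly from the layout: 8 B id + 768 B vector + 2 B neighbor count + 16 × 8 B neighbors = 906 B, rounded up to the next 64 B cache-line boundary. A small sketch of that computation:

```python
def hot_entry_size(dim=384, dtype_bytes=2, max_neighbors=16):
    """64 B-aligned size of one interleaved HOT_SEG entry (sketch)."""
    raw = 8 + dim * dtype_bytes + 2 + max_neighbors * 8
    return (raw + 63) & ~63  # round up to a cache-line multiple
```

The `& ~63` trick works because 64 is a power of two: adding 63 then clearing the low 6 bits rounds any size up to the next multiple of 64.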
### When to Use Each
| Operation | Layout | Reason |
|-----------|--------|--------|
| Bulk distance computation | Columnar | SIMD operates on dimension columns |
| Top-K refinement scan | Interleaved | Sequential scan of candidates |
| Compression/archival | Columnar | Better ratio |
| HNSW search (hot region) | Interleaved | Vector + neighbors together |
| Batch insert | Columnar | Write once, compress well |
## 4. Prefetch Hints
### The Problem
HNSW search is pointer-chasing: compute distance at node A, read neighbor
list, jump to node B, compute distance, repeat. Each jump is a random
memory access. On a 10M vector file, this means:
```
HNSW search: ~100-200 distance computations per query
Each computation: 1 random read (vector) + 1 random read (neighbors)
Random read latency: 50-100 ns (DRAM), 10-50 μs (SSD)
Total: 10-40 μs (DRAM), 1-10 ms (SSD) without prefetch
```
### The Solution
Store neighbor lists **contiguously** and add **prefetch offsets** in the
manifest so the runtime can issue prefetch instructions ahead of time.
### Prefetch Table Structure
The manifest contains a prefetch table mapping node ID ranges to contiguous
page regions:
```
prefetch_table:
entry_count: u32
entries:
[0]: node_ids 0-9999 -> pages at offset 0x100000, 50 pages, prefetch 3 ahead
[1]: node_ids 10000-19999 -> pages at offset 0x200000, 50 pages, prefetch 3 ahead
...
```
### Runtime Prefetch Strategy
```python
def hnsw_search_with_prefetch(query, k, entry_point, ef_search):
candidates = MaxHeap()
visited = BitSet()
worklist = MinHeap([(distance(query, entry_point), entry_point)])
while worklist:
dist, node = worklist.pop()
# PREFETCH: while processing this node, prefetch neighbors' data
neighbors = get_neighbors(node)
for n in neighbors[:PREFETCH_AHEAD]:
if n not in visited:
prefetch_vector(n) # madvise(WILLNEED) or __builtin_prefetch
prefetch_neighbors(n) # prefetch neighbor list page
# COMPUTE: distance to neighbors (data should be in cache by now)
for n in neighbors:
if n not in visited:
visited.add(n)
d = distance(query, get_vector(n))
if d < candidates.max() or len(candidates) < ef_search:
candidates.push((d, n))
worklist.push((d, n))
return candidates.top_k(k)
```
### Contiguous Neighbor Layout
HOT_SEG stores vectors and neighbors together. For cold INDEX_SEGs, neighbor
lists are laid out in **node ID order** within contiguous pages:
```
Page 0: neighbors[node_0], neighbors[node_1], ..., neighbors[node_63]
Page 1: neighbors[node_64], ..., neighbors[node_127]
...
```
Because HNSW search tends to traverse nodes in the same graph neighborhood
(spatially close node IDs if data was inserted in order), sequential node
IDs tend to be accessed together. Contiguous layout turns random access
into sequential reads.
### Expected Improvement
| Configuration | p95 Latency (10M vectors) |
|--------------|--------------------------|
| No prefetch, random layout | 2.5 ms |
| No prefetch, contiguous layout | 1.2 ms |
| Prefetch, contiguous layout | 0.3 ms |
| Prefetch, contiguous + hot cache | 0.15 ms |
## 5. Dictionary-Coded IDs
### The Problem
Vector IDs in neighbor lists and ID maps are 64-bit integers. For 10M vectors,
most IDs fit in 24 bits. Storing full 64-bit IDs wastes ~5 bytes per entry.
With M=16 neighbors per node and 10M nodes:
- Raw: 10M * 16 * 8 = 1.2 GB of ID data
- Desired: < 300 MB
### Varint Delta Encoding
IDs within a block or neighbor list are sorted and delta-encoded:
```
Original IDs: [1000, 1005, 1008, 1020, 1100]
Deltas: [1000, 5, 3, 12, 80]
Varint bytes: [ 2B, 1B, 1B, 1B, 1B] = 6 bytes (vs 40 bytes raw)
```
### Restart Points
Every N entries (default N=64), the delta resets to an absolute value:
```
Group 0 (entries 0-63): delta from 0 (absolute start)
Group 1 (entries 64-127): delta from entry[64] (restart)
Group 2 (entries 128-191): delta from entry[128] (restart)
```
The restart point index stores the offset of each restart group:
```
restart_index:
interval: 64
offsets: [0, 156, 298, 445, ...] // byte offsets into encoded data
```
### Random Access
To find the neighbors of node N:
```
1. group = N / restart_interval // O(1)
2. offset = restart_index[group] // O(1)
3. seek to offset in encoded data // O(1)
4. decode sequentially from restart to N // O(restart_interval) = O(64)
```
Total: O(64) varint decodes = ~50-100 ns. Compare with sorted array binary
search: O(log N) = O(24) comparisons with cache misses = ~200-500 ns.
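The restart mechanism can be sketched over plain integer lists (the varint byte packing from the previous section is omitted so the restart logic stands out):

```python
def build_restarts(sorted_ids, interval=64):
    """Delta-encode sorted IDs with an absolute restart every `interval` entries."""
    encoded, restarts = [], []
    for i, x in enumerate(sorted_ids):
        if i % interval == 0:
            restarts.append(i)                      # position of restart group
            encoded.append(x)                       # absolute value at restart
        else:
            encoded.append(x - sorted_ids[i - 1])   # delta from predecessor
    return encoded, restarts

def decode_at(encoded, restarts, i, interval=64):
    """Random access: jump to the nearest restart, then decode sequentially."""
    g = i // interval
    value = encoded[restarts[g]]
    for j in range(restarts[g] + 1, i + 1):
        value += encoded[j]                         # at most interval-1 adds
    return value
```

Shrinking `interval` speeds up random access but adds more absolute (uncompressed) values; the default of 64 is the trade-off point assumed throughout this spec.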
### SIMD Varint Decoding
Modern SIMD can decode varints in bulk:
```
AVX-512 VBMI: ~8 varints per cycle using VPERMB + VPSHUFB
Throughput: 2-4 billion integers/second (Lemire et al.)
```
At 16 neighbors per node, one HNSW search step decodes 16 varints in ~2-4 ns.
### Compression Ratio
| Encoding | Bytes per ID (avg) | 10M * 16 neighbors |
|----------|-------------------|-------------------|
| Raw u64 | 8.0 B | 1,220 MB |
| Raw u32 | 4.0 B | 610 MB |
| Varint (no delta) | 3.2 B | 488 MB |
| Varint delta | 1.5 B | 229 MB |
| Varint delta + restart | 1.6 B | 244 MB |
Delta encoding with restart points achieves ~5x compression over raw u64
while maintaining fast random access.
## 6. Cache Behavior Analysis
### L1/L2/L3 Working Sets
For a typical query on 10M vectors (384 dim, fp16):
```
HNSW search:
~150 distance computations
Each computation: 768 B (vector) + ~128 B (neighbor list) ≈ 896 B
Total working set: 150 * 896 ≈ 131 KB
Top-K refinement (hot cache scan):
~1000 candidates checked
Each: 960 B (interleaved HOT_SEG entry)
Total: 960 KB
Query vector: 768 B (always in L1)
Quantization tables: 96 KB (PQ codebook, always in L2)
```
| Cache Level | Size | What Fits |
|------------|------|-----------|
| L1 (32-48 KB) | Query vector + current node | Always hit |
| L2 (256 KB-1 MB) | PQ tables + 100-200 hot entries | Usually hit |
| L3 (8-32 MB) | Hot cache + partial index | Mostly hit |
| DRAM | Everything | Full dataset |
### p95 Latency Budget
```
HNSW traversal: 150 nodes * 100 ns/node = 15 μs (L3 hit)
Distance compute: 150 * 50 ns = 7.5 μs (SIMD)
Top-K refinement: 1000 * 10 ns = 10 μs (hot cache, L2/L3 hit)
Overhead: 5 μs (heap ops, bookkeeping)
-------
Total p95: ~37.5 μs ≈ 0.04 ms
With prefetch: ~30 μs (hide 25% of traversal latency)
```
This matches the target of < 0.3 ms p95 on desktop hardware. The dominant
cost is memory bandwidth, not computation — which is why cache-friendly
layout and prefetch are critical.
## 7. Distance Function SIMD Implementations
### L2 Distance (fp16, 384 dim, AVX-512)
```
; 384 fp16 values = 768 bytes = 12 ZMM registers
; Process 32 fp16 values per iteration (convert to 16 fp32 per half)
.loop:
vmovdqu16 zmm0, [rsi + rcx] ; Load 32 fp16 from A
vmovdqu16 zmm1, [rdi + rcx] ; Load 32 fp16 from B
vcvtph2ps zmm2, ymm0 ; Convert low 16 to fp32
vcvtph2ps zmm3, ymm1
vsubps zmm2, zmm2, zmm3 ; diff = A - B
vfmadd231ps zmm4, zmm2, zmm2 ; acc += diff * diff
; Repeat for high 16
vextracti64x4 ymm0, zmm0, 1
vextracti64x4 ymm1, zmm1, 1
vcvtph2ps zmm2, ymm0
vcvtph2ps zmm3, ymm1
vsubps zmm2, zmm2, zmm3
vfmadd231ps zmm4, zmm2, zmm2
add rcx, 64
cmp rcx, 768
jl .loop
; Horizontal sum of zmm4 -> scalar result
; ~12 iterations, ~24 FMA ops, ~12 cycles total
```
### Inner Product (int8, 384 dim, AVX-512 VNNI)
```
; 384 int8 values = 384 bytes = 6 ZMM registers
; VPDPBUSD: 64 uint8*int8 multiply-adds per cycle
.loop:
vmovdqu8 zmm0, [rsi + rcx] ; 64 uint8 from A
vmovdqu8 zmm1, [rdi + rcx] ; 64 int8 from B
vpdpbusd zmm2, zmm0, zmm1 ; acc += dot(A, B) per 4 bytes
add rcx, 64
cmp rcx, 384
jl .loop
; 6 iterations, 6 VPDPBUSD ops, ~6 cycles
; ~2x faster per distance than the fp16 L2 kernel (~6 vs ~12 cycles)
```
### Hamming Distance (binary, 384 dim, AVX-512)
```
; 384 bits = 48 bytes = one masked ZMM load
; VPOPCNTDQ: popcount on 8 x 64-bit words per instruction
mov rax, 0x0000FFFFFFFFFFFF  ; byte mask covering the low 48 bytes
kmovq k1, rax
vmovdqu8 zmm0{k1}{z}, [rsi]  ; Load 48 bytes (384 bits) from A
vmovdqu8 zmm1{k1}{z}, [rdi]  ; Load 48 bytes from B
vpxorq zmm2, zmm0, zmm1 ; XOR -> differing bits
vpopcntq zmm3, zmm2 ; Popcount per 64-bit word
; Horizontal sum of 6 popcounts -> Hamming distance
; ~3 cycles total
```
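The three kernels above compute squared L2 distance, an int8 dot product, and Hamming distance. A scalar reference in Python (illustrative only, not the RVF implementation) is useful for validating SIMD output bit-for-bit:

```python
def l2_sq(a, b):
    """Scalar reference for the fp16 L2 kernel: sum of squared diffs."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot_i8(a, b):
    """Scalar reference for the VNNI inner-product kernel."""
    return sum(x * y for x, y in zip(a, b))

def hamming(a, b):
    """Scalar reference for the popcount kernel. a, b: bytes (48 B = 384 bits)."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```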
## 8. Summary: Query Path Hot Loop
The complete hot path for one HNSW search step:
```
1. Load current node's neighbor list [L2/L3 cache, 128 B, ~5 ns]
2. Issue prefetch for next neighbors [~1 ns]
3. For each neighbor (M=16):
a. Check visited bitmap [L1, ~1 ns]
b. Load neighbor vector (hot cache) [L2/L3, 768 B, ~5-10 ns]
c. SIMD distance (fp16, 384 dim) [~12 cycles = ~4 ns]
d. Heap insert if better [~5 ns]
4. Total per step: ~300-500 ns
5. Total per query (~150 steps): ~50-75 μs
```
This achieves 13,000-20,000 QPS per thread on desktop hardware — matching
or exceeding dedicated vector databases for in-memory workloads.


# RVF Deletion Lifecycle
## 1. Overview
Deletion in RVF follows a two-phase protocol consistent with the append-only
segment architecture. Vectors are never removed in-place. Instead, a soft
delete records intent in a JOURNAL_SEG, and a subsequent compaction hard
deletes by physically excluding the vectors from sealed output segments.
```
JOURNAL_SEG Compaction GC / Rewrite
(append) (merge) (reclaim)
ACTIVE -----> SOFT_DELETED -----> HARD_DELETED ------> RECLAIMED
| | | |
| query path | query path | |
| returns vec | skips vec | vec absent | space freed
| | (bitmap check) | from output seg |
```
Readers always see a consistent snapshot: a deletion is invisible until
the manifest referencing the new deletion bitmap is durably committed.
## 2. Vector Lifecycle State Machine
```
+----------+ JOURNAL_SEG +-----------------+
| | DELETE_VECTOR / RANGE | |
| ACTIVE +----------------------->+ SOFT_DELETED |
| | | |
+----------+ +--------+--------+
| Compaction seals output
v excluding this vector
+--------+--------+
| HARD_DELETED |
+--------+--------+
| File rewrite / truncation
v reclaims physical space
+--------+--------+
| RECLAIMED |
+-----------------+
```
| State | Bitmap Bit | Physical Bytes | Query Visible |
|-------|------------|----------------|---------------|
| ACTIVE | 0 | Vector in VEC_SEG | Yes |
| SOFT_DELETED | 1 | Vector in VEC_SEG | No |
| HARD_DELETED | N/A | Excluded from sealed output | No |
| RECLAIMED | N/A | Bytes overwritten / freed | No |
| Transition | Trigger | Durability |
|------------|---------|------------|
| ACTIVE -> SOFT_DELETED | JOURNAL_SEG + MANIFEST_SEG with bitmap | After manifest fsync |
| SOFT_DELETED -> HARD_DELETED | Compaction writes sealed VEC_SEG without vector | After compaction manifest fsync |
| HARD_DELETED -> RECLAIMED | File rewrite or old shard deletion | After shard unlink |
## 3. JOURNAL_SEG Wire Format (type 0x04)
A JOURNAL_SEG records metadata mutations: deletions, metadata updates, tier
moves, and ID remappings. Its payload follows the standard 64-byte segment
header (see `01-segment-model.md` section 2).
### 3.1 Journal Header (64 bytes)
```
Offset Type Field Description
------ ---- ----- -----------
0x00 u32 entry_count Number of journal entries
0x04 u32 journal_epoch Epoch when this journal was written
0x08 u64 prev_journal_seg_id Segment ID of previous JOURNAL_SEG (0 if first)
0x10 u32 flags Reserved, must be 0
0x14 u8[44] reserved Zero-padded to 64-byte alignment
```
### 3.2 Journal Entry Format
Each entry begins on an 8-byte aligned boundary:
```
Offset Type Field Description
------ ---- ----- -----------
0x00 u8 entry_type Entry type enum
0x01 u8 reserved Must be 0x00
0x02 u16 entry_length Byte length of type-specific payload
0x04 u8[] payload Type-specific payload
var u8[] padding Zero-pad to next 8-byte boundary
```
### 3.3 Entry Types
```
Value Name Payload Size Description
----- ---- ------------ -----------
0x01 DELETE_VECTOR 8 B Delete a single vector by ID
0x02 DELETE_RANGE 16 B Delete a contiguous range of vector IDs
0x03 UPDATE_METADATA variable Update key-value metadata for a vector
0x04 MOVE_VECTOR 24 B Reassign vector to a different segment/tier
0x05 REMAP_ID 16 B Reassign vector ID (post-compaction)
```
### 3.4 Type-Specific Payloads
**DELETE_VECTOR (0x01)**
```
0x00 u64 vector_id ID of the vector to soft-delete
```
**DELETE_RANGE (0x02)**
```
0x00 u64 start_id First vector ID (inclusive)
0x08    u64    end_id        One past the last vector ID (exclusive)
```
Invariant: `start_id < end_id`. Range `[start_id, end_id)` is half-open.
**UPDATE_METADATA (0x03)**
```
0x00 u64 vector_id Target vector ID
0x08 u16 key_len Byte length of metadata key
0x0A u8[] key Metadata key (UTF-8)
var u16 val_len Byte length of metadata value
var+2 u8[] val Metadata value (opaque bytes)
```
**MOVE_VECTOR (0x04)**
```
0x00 u64 vector_id Target vector ID
0x08 u64 src_seg Source segment ID
0x10 u64 dst_seg Destination segment ID
```
**REMAP_ID (0x05)**
```
0x00 u64 old_id Original vector ID
0x08 u64 new_id New vector ID after compaction
```
### 3.5 Complete JOURNAL_SEG Example
Deleting vector 42, deleting range [1000, 2000), remapping ID 500 -> 3:
```
Byte offset Content Notes
----------- ------- -----
0x00-0x3F Segment header (64 B) seg_type=0x04, magic=RVFS
0x40-0x7F Journal header (64 B) entry_count=3, epoch=7,
prev_journal_seg_id=12
--- Entry 0: DELETE_VECTOR ---
0x80 0x01 entry_type
0x81 0x00 reserved
0x82-0x83 0x0008 entry_length = 8
0x84-0x8B 0x000000000000002A vector_id = 42
0x8C-0x8F 0x00000000 padding to 8B
--- Entry 1: DELETE_RANGE ---
0x90 0x02 entry_type
0x91 0x00 reserved
0x92-0x93 0x0010 entry_length = 16
0x94-0x9B 0x00000000000003E8 start_id = 1000
0x9C-0xA3 0x00000000000007D0 end_id = 2000
0xA4-0xA7    0x00000000          padding to 8B
--- Entry 2: REMAP_ID ---
0xA8         0x05                entry_type
0xA9         0x00                reserved
0xAA-0xAB    0x0010              entry_length = 16
0xAC-0xB3    0x00000000000001F4  old_id = 500
0xB4-0xBB    0x0000000000000003  new_id = 3
0xBC-0xBF    0x00000000          padding to 8B
```
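The entry layout above can be reproduced with a small encoder. This is an illustrative sketch (the helper names are not part of the spec) assuming little-endian integers, consistent with the rest of the wire format:

```python
import struct

def encode_entry(entry_type, payload):
    # 4-byte entry header (type: u8, reserved: u8, length: u16), then the
    # payload, zero-padded so the next entry starts on an 8-byte boundary.
    raw = struct.pack("<BBH", entry_type, 0x00, len(payload)) + payload
    return raw + b"\x00" * ((-len(raw)) % 8)

def delete_vector(vector_id):
    return encode_entry(0x01, struct.pack("<Q", vector_id))

def delete_range(start_id, end_id):
    assert start_id < end_id  # half-open [start_id, end_id)
    return encode_entry(0x02, struct.pack("<QQ", start_id, end_id))

def remap_id(old_id, new_id):
    return encode_entry(0x05, struct.pack("<QQ", old_id, new_id))

# The three entries from the example above (64 bytes total):
entries = delete_vector(42) + delete_range(1000, 2000) + remap_id(500, 3)
```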
## 4. Deletion Bitmap
### 4.1 Manifest Record
The deletion bitmap is stored in the Level 1 manifest as a TLV record:
```
Tag Name Description
--- ---- -----------
0x000E DELETION_BITMAP Roaring bitmap of soft-deleted vector IDs
```
This extends the TLV tag space (previous: 0x000D KEY_DIRECTORY).
### 4.2 Roaring Bitmap Binary Layout
Vector IDs are 64-bit. The upper bits (`vector_id >> 16`) select a **high key**;
the lower 16 bits index into a 65,536-slot **container** for that high key.
```
+---------------------------------------------+
| DELETION_BITMAP TLV Value |
+---------------------------------------------+
| Bitmap Header |
| cookie: u32 (0x3B3A3332) |
| high_key_count: u32 |
| For each high key: |
| high_key: u32 |
| container_type: u8 |
| 0x01 = ARRAY_CONTAINER |
| 0x02 = BITMAP_CONTAINER |
| 0x03 = RUN_CONTAINER |
| container_offset: u32 (from bitmap start)|
| [8B aligned] |
+---------------------------------------------+
| Container Data |
| Container 0: [type-specific layout] |
| Container 1: ... |
| [8B aligned per container] |
+---------------------------------------------+
```
### 4.3 Container Types
**ARRAY_CONTAINER (0x01)** -- Sparse deletions (<= 4096 set bits per 64K range).
```
0x00 u16 cardinality Number of set values (1-4096)
0x02 u16[] values Sorted array of 16-bit values
```
Size: `2 + 2 * cardinality` bytes.
**BITMAP_CONTAINER (0x02)** -- Dense deletions (> 4096 set bits per 64K range).
```
0x00 u16 cardinality Number of set bits
0x02 u8[8192] bitmap Fixed 65536-bit bitmap (8 KB)
```
Size: 8194 bytes (fixed).
**RUN_CONTAINER (0x03)** -- Contiguous ranges of deletions.
```
0x00 u16 run_count Number of runs
0x02 (u16,u16) runs[] Array of (start, length-1) pairs
```
Size: `2 + 4 * run_count` bytes.
### 4.4 Size Estimation
| Deletion Pattern | Deleted IDs | Container Types | Bitmap Size |
|------------------|-------------|-----------------|-------------|
| Sparse random | 10,000 (0.1%) | ~153 array | ~22 KB |
| Clustered ranges | 10,000 (0.1%) | ~5 run | ~0.1 KB |
| Mixed workload | 100,000 (1%) | array + run | ~80 KB |
| Heavy deletion | 1,000,000 (10%) | bitmap + run | ~200 KB |
Even at 200 KB the bitmap fits entirely in L2 cache.
### 4.5 Bitmap Operations
```python
def bitmap_check(bitmap, vector_id):
"""Returns True if vector_id is soft-deleted. O(1) amortized."""
high_key = vector_id >> 16
low_val = vector_id & 0xFFFF
container = bitmap.get_container(high_key)
if container is None:
return False
return container.contains(low_val) # array: bsearch, bitmap: bit test, run: bsearch
def bitmap_set(bitmap, vector_id):
"""Mark a vector as soft-deleted."""
high_key = vector_id >> 16
low_val = vector_id & 0xFFFF
container = bitmap.get_or_create_container(high_key)
container.add(low_val)
if container.type == ARRAY and container.cardinality > 4096:
container.promote_to_bitmap()
```
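A minimal in-memory model of the container logic, with array containers promoting to 8 KB bitmap containers past the 4096-entry threshold from Section 4.3 (everything else, including the class shape, is illustrative; the on-disk serialization is not implemented):

```python
import bisect

class RoaringBitmap:
    """Sketch: sorted-u16-list array containers, promoted to 8 KB bitmaps."""
    ARRAY_LIMIT = 4096

    def __init__(self):
        self.containers = {}  # high_key -> sorted list[int] | bytearray(8192)

    def add(self, vector_id):
        hk, lo = vector_id >> 16, vector_id & 0xFFFF
        c = self.containers.get(hk)
        if c is None:
            self.containers[hk] = [lo]
        elif isinstance(c, list):
            i = bisect.bisect_left(c, lo)
            if i == len(c) or c[i] != lo:
                c.insert(i, lo)
                if len(c) > self.ARRAY_LIMIT:
                    bm = bytearray(8192)  # promote ARRAY -> BITMAP container
                    for v in c:
                        bm[v >> 3] |= 1 << (v & 7)
                    self.containers[hk] = bm
        else:
            c[lo >> 3] |= 1 << (lo & 7)

    def contains(self, vector_id):
        hk, lo = vector_id >> 16, vector_id & 0xFFFF
        c = self.containers.get(hk)
        if c is None:
            return False
        if isinstance(c, list):
            i = bisect.bisect_left(c, lo)
            return i < len(c) and c[i] == lo
        return bool(c[lo >> 3] & (1 << (lo & 7)))
```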
## 5. Delete-Aware Query Path
### 5.1 HNSW Traversal with Deletion Filtering
Deleted vectors remain in the HNSW graph until compaction rebuilds the index.
During search, the deletion bitmap is checked per candidate. Deleted nodes are
still traversed for connectivity but excluded from the result set.
```python
def hnsw_search_delete_aware(query, entry_point, ef_search, k, del_bitmap):
candidates = MaxHeap() # worst candidate on top
visited = BitSet()
worklist = MinHeap() # best candidate first
d0 = distance(query, get_vector(entry_point))
worklist.push((d0, entry_point))
visited.add(entry_point)
if not bitmap_check(del_bitmap, entry_point):
candidates.push((d0, entry_point))
while worklist:
dist, node = worklist.pop()
if candidates.size() >= ef_search and dist > candidates.peek_max():
break
neighbors = get_neighbors(node)
for n in neighbors[:PREFETCH_AHEAD]:
if n not in visited:
prefetch_vector(n)
for n in neighbors:
if n in visited:
continue
visited.add(n)
d = distance(query, get_vector(n))
is_deleted = bitmap_check(del_bitmap, n) # O(1) bitmap lookup
# Always add to worklist (graph connectivity)
if candidates.size() < ef_search or d < candidates.peek_max():
worklist.push((d, n))
# Only add to results if NOT deleted
if not is_deleted:
if candidates.size() < ef_search:
candidates.push((d, n))
elif d < candidates.peek_max():
candidates.replace_max((d, n))
return candidates.top_k(k)
```
### 5.2 Top-K Refinement with Deletion Filtering
```python
def topk_refine_delete_aware(candidates, hot_cache, query, k, del_bitmap):
heap = MaxHeap()
for cand_dist, cand_id in candidates:
heap.push((cand_dist, cand_id))
for entry in hot_cache.sequential_scan():
if bitmap_check(del_bitmap, entry.vector_id):
continue # skip soft-deleted
d = distance(query, entry.vector)
if heap.size() < k:
heap.push((d, entry.vector_id))
elif d < heap.peek_max():
heap.replace_max((d, entry.vector_id))
return heap.drain_sorted()
```
### 5.3 Performance Impact
| Operation | Without Deletions | With Deletions | Overhead |
|-----------|-------------------|----------------|----------|
| Bitmap check | N/A | ~2-5 ns (L1/L2 hit) | Per candidate |
| HNSW step (M=16) | ~300-500 ns | ~330-580 ns | +10% |
| Top-K refine (1000) | ~10 us | ~12 us | +20% worst |
| Total query | ~50-75 us | ~55-85 us | +10-13% |
At typical deletion rates (< 5%), overhead is negligible: the bitmap fits in
L2 cache, graph connectivity is preserved, and the cost is one branch plus
one bitmap load per candidate.
## 6. Deletion Write Path
All deletion operations follow the same two-fsync protocol:
```python
def delete_vectors(file, entries):
"""Soft-delete vectors. entries: list of DeleteVector or DeleteRange."""
# 1. Append JOURNAL_SEG
journal = JournalSegment(
epoch=current_epoch(file),
prev_journal_seg_id=latest_journal_id(file),
entries=entries
)
append_segment(file, journal)
fsync(file) # orphan-safe: no manifest references this yet
# 2. Update deletion bitmap in memory
bitmap = load_deletion_bitmap(file)
for e in entries:
if e.type == DELETE_VECTOR:
bitmap_set(bitmap, e.vector_id)
elif e.type == DELETE_RANGE:
bitmap.add_range(e.start_id, e.end_id)
# 3. Append MANIFEST_SEG with updated bitmap
manifest = build_manifest(file, deletion_bitmap=bitmap)
append_segment(file, manifest)
fsync(file) # deletion now visible to all new readers
```
Single deletes, bulk ranges, and batch deletes all use this path. Batch
operations pack multiple entries into one JOURNAL_SEG to amortize fsync cost.
## 7. Compaction with Deletions
### 7.1 Compaction Process
```
Before:
[VEC_1] [VEC_2] [JOURNAL_1] [VEC_3] [JOURNAL_2] [MANIFEST_5]
0-999 1000- del:42, 3000- del:[1000, bitmap={42,500,
2999 del:500 4999 2000) 1000..1999}
After:
... [MANIFEST_5] [VEC_sealed] [INDEX_new] [MANIFEST_6]
vectors 0-4999 bitmap={}
MINUS deleted (empty for
compacted range)
```
### 7.2 Compaction Algorithm
```python
def compact_with_deletions(file, seg_ids):
bitmap = load_deletion_bitmap(file)
output, id_remap, next_id = [], {}, 0
for seg_id in sorted(seg_ids):
seg = load_segment(file, seg_id)
if seg.seg_type != VEC_SEG:
continue
for vec_id, vector in seg.all_vectors():
if bitmap_check(bitmap, vec_id):
continue # physically exclude
id_remap[vec_id] = next_id
output.append((next_id, vector))
next_id += 1
append_segment(file, VecSegment(flags=SEALED, vectors=output))
remaps = [RemapIdEntry(old, new) for old, new in id_remap.items() if old != new]
if remaps:
append_segment(file, JournalSegment(entries=remaps))
append_segment(file, build_hnsw_index(output))
for old_id in id_remap:
bitmap.remove(old_id)
manifest = build_manifest(file,
tombstone_seg_ids=seg_ids,
deletion_bitmap=bitmap)
append_segment(file, manifest)
fsync(file)
```
### 7.3 Journal Merging
During compaction, JOURNAL_SEGs covering the compacted range are consumed:
| Entry Type | Materialization |
|------------|-----------------|
| DELETE_VECTOR / DELETE_RANGE | Vectors excluded from output |
| UPDATE_METADATA | Applied to output META_SEG |
| MOVE_VECTOR | Tier assignment applied in new manifest |
| REMAP_ID | Chained: old remap composed with new remap |
Consumed JOURNAL_SEGs are tombstoned alongside compacted VEC_SEGs.
### 7.4 Compaction Invariants
| ID | Invariant |
|----|-----------|
| INV-D1 | After compaction, deletion bitmap is empty for compacted range |
| INV-D2 | Sealed output contains only ACTIVE vectors |
| INV-D3 | REMAP_ID entries journaled for every relocated vector |
| INV-D4 | Compacted input segments tombstoned in new manifest |
| INV-D5 | Sealed segments are never modified |
| INV-D6 | Rebuilt indexes exclude deleted nodes |
## 8. Deletion Consistency
### 8.1 Crash Safety
```
Write path:
1. Append JOURNAL_SEG -> fsync crash here: orphan, invisible
2. Append MANIFEST_SEG -> fsync crash here: partial manifest, fallback
Recovery:
- Crash after step 1: JOURNAL_SEG orphaned. No manifest references it.
Reader sees previous manifest. Deletion NOT visible. Orphan cleaned
up by next compaction.
- Crash during step 2: Partial MANIFEST_SEG has bad checksum. Reader
falls back to previous valid manifest. Deletion NOT visible.
- After step 2 success: Manifest durable. Deletion visible.
```
**Guarantee**: Uncommitted deletions never affect readers. Deletion is
atomic at the manifest fsync boundary.
### 8.2 Manifest Chain Visibility
```
MANIFEST_3: bitmap = {}
| JOURNAL_SEG written (delete vector 42)
MANIFEST_4: bitmap = {42} <-- deletion visible from here
| Compaction runs
MANIFEST_5: bitmap = {} <-- vector 42 physically removed
```
A reader holding MANIFEST_3 continues to see vector 42. A reader opening
after MANIFEST_4 will not. This provides snapshot isolation at manifest
granularity.
### 8.3 Multi-File Mode
In multi-file mode, each shard maintains its own deletion bitmap. The
DELETION_BITMAP TLV record supports two modes:
```
+----------------------------------------------+
| mode: u8 |
| 0x00 = SINGLE (one bitmap, inline) |
| 0x01 = SHARDED (per-shard references) |
+----------------------------------------------+
SINGLE (0x00):
| roaring_bitmap: [u8; ...] |
SHARDED (0x01):
| shard_count: u16 |
| For each shard: |
| shard_id: u16 |
| bitmap_offset: u64 (in shard file) |
| bitmap_length: u32 |
| bitmap_hash: hash128 |
+----------------------------------------------+
```
Queries spanning shards load per-shard bitmaps and check each candidate
against its shard's bitmap.
### 8.4 Concurrent Access
One writer at a time (file-level advisory lock). Multiple readers are safe
due to append-only architecture. A reader that opened before a deletion
sees the pre-deletion snapshot until it re-reads the manifest.
## 9. Space Reclamation
| Trigger | Threshold | Action |
|---------|-----------|--------|
| Deletion ratio | > 20% of vectors deleted | Schedule compaction |
| Bitmap size | > 1 MB | Schedule compaction |
| Segment count | > 64 mutable segments | Schedule compaction |
| Manual | User-initiated | Compact immediately |
Space accounting derived from the manifest:
```
total_vector_count: 10,000,000 (Level 0 root manifest)
deleted_vector_count: 150,000 (bitmap cardinality)
active_vector_count: 9,850,000 (total - deleted)
deletion_ratio: 1.5% (below threshold)
wasted_bytes: ~115 MB (150K * 768 B per fp16-384 vector)
```
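The accounting above reduces to simple arithmetic over manifest counters. A sketch (the helper is illustrative, not a spec API):

```python
def space_accounting(total_vectors, deleted, bytes_per_vector):
    """Derive compaction-trigger inputs from the manifest counters."""
    active = total_vectors - deleted
    ratio = deleted / total_vectors           # compared to 20% threshold
    wasted = deleted * bytes_per_vector       # bytes held by soft-deleted vectors
    return active, ratio, wasted

# 10M fp16-384 vectors (768 B each), 150K soft-deleted:
active, ratio, wasted = space_accounting(10_000_000, 150_000, 768)
```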
## 10. Summary
### Deletion Protocol
| Step | Action | Durability |
|------|--------|------------|
| 1 | Append JOURNAL_SEG with DELETE entries | fsync (orphan-safe) |
| 2 | Update roaring deletion bitmap | In-memory |
| 3 | Append MANIFEST_SEG with new bitmap | fsync (deletion visible) |
| 4 | Compaction excludes deleted vectors | fsync (physical removal) |
| 5 | File rewrite reclaims space | fsync (space freed) |
### New Wire Format Elements
| Element | Type / Tag | Section |
|---------|------------|---------|
| JOURNAL_SEG | Segment type 0x04 | 3 |
| DELETE_VECTOR | Journal entry 0x01 | 3.4 |
| DELETE_RANGE | Journal entry 0x02 | 3.4 |
| UPDATE_METADATA | Journal entry 0x03 | 3.4 |
| MOVE_VECTOR | Journal entry 0x04 | 3.4 |
| REMAP_ID | Journal entry 0x05 | 3.4 |
| DELETION_BITMAP | Level 1 TLV 0x000E | 4 |
### Invariants
| ID | Invariant |
|----|-----------|
| INV-D1 | After compaction, deletion bitmap is empty for compacted range |
| INV-D2 | Sealed output segments contain only ACTIVE vectors |
| INV-D3 | ID remappings journaled for every compaction-relocated vector |
| INV-D4 | Compacted input segments tombstoned in new manifest |
| INV-D5 | Sealed segments are never modified |
| INV-D6 | Rebuilt indexes exclude deleted nodes |
| INV-D7 | Uncommitted deletions never affect readers (crash safety) |
| INV-D8 | Deletion visibility is atomic at the manifest fsync boundary |


# RVF Filtered Search
## 1. Motivation
Domain profiles declare metadata schemas with indexed fields (e.g., `"organism"` in
RVDNA, `"language"` in RVText, `"node_type"` in RVGraph), but the format provides no
specification for how those indexes are built, stored, or evaluated at query time.
Filtered search is the combination of vector similarity search with metadata
predicates. Without it, a caller must retrieve an over-sized result set and filter
client-side — wasting bandwidth, latency, and recall budget.
This specification adds:
1. **META_SEG** payload layout (segment type 0x07) for storing per-vector metadata
2. **Filter expression language** with a compact binary encoding
3. **Three evaluation strategies** (pre-, post-, and intra-filtering)
4. **METAIDX_SEG** (new segment type 0x0D) for inverted and bitmap indexes
5. **Manifest integration** via a new Level 1 TLV record
6. **Temperature tier coordination** for metadata segments
## 2. META_SEG Payload Layout (Segment Type 0x07)
META_SEG stores the actual metadata values associated with vectors. It uses the
standard 64-byte segment header (see `binary-layout.md` Section 3) with
`seg_type = 0x07`.
```
META_SEG Payload:
+------------------------------------------+
| Meta Header (64 bytes, padded) |
| schema_id: u32 | References PROFILE_SEG schema
| vector_id_range_start: u64 | First vector ID covered
| vector_id_range_end: u64 | Last vector ID covered (inclusive)
| field_count: u16 | Number of fields in this segment
| encoding: u8 | 0 = row-oriented, 1 = column-oriented
|   reserved: [u8; 41]                     | Must be zero (pads header to 64 B)
| [64B aligned] |
+------------------------------------------+
| Field Directory |
| For each field (field_count entries): |
| field_id: u16 |
| field_type: u8 |
| flags: u8 |
| field_offset: u32 | Byte offset from payload start
| [64B aligned] |
+------------------------------------------+
| Field Data (column-oriented) |
| (see Section 2.1 for per-type layout) |
+------------------------------------------+
```
### Field Type Enum
```
Value Type Wire Size Description
----- ---- --------- -----------
0x00 string Variable UTF-8, dictionary-encoded in column layout
0x01 u32 4 bytes Unsigned 32-bit integer
0x02 u64 8 bytes Unsigned 64-bit integer
0x03 f32 4 bytes IEEE 754 single-precision float
0x04 enum Variable (packed) Enumeration with defined label set
0x05 bool 1 bit (packed) Boolean
```
### Field Flags
```
Bit Mask Name Meaning
--- ---- ---- -------
0 0x01 INDEXED Field has a corresponding METAIDX_SEG
1 0x02 SORTED Values are stored in sorted order
2 0x04 NULLABLE Null bitmap present before values
3 0x08 STORED Field value returned in query results (not just filterable)
4-7 reserved Must be zero
```
### 2.1 Column-Oriented Field Layouts
Column-oriented encoding (encoding = 1) is the preferred layout. Each field's data
block starts at a 64-byte aligned boundary.
**String fields** (dictionary-encoded):
```
dict_size: u32 Number of distinct strings
For each dict entry:
length: u16 Byte length of UTF-8 string
bytes: [u8; length] UTF-8 encoded string
[4B aligned after dictionary]
codes: [varint; vector_count] Dictionary code per vector
[64B aligned]
```
Dictionary codes are 0-indexed into the dictionary array. Code `0xFFFFFFFF` (max
varint value for u32 range) represents null if the NULLABLE flag is set.
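The dictionary step can be sketched as follows; varint packing of the codes is omitted so the logical encoding stays visible (the function name is illustrative):

```python
def dict_encode(values):
    """Dictionary-encode a string column: distinct strings plus one
    0-indexed code per vector, as in the layout above."""
    dictionary = sorted(set(values))              # distinct strings
    index = {s: i for i, s in enumerate(dictionary)}
    codes = [index[v] for v in values]            # one code per vector
    return dictionary, codes
```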
**Numeric fields** (u32, u64, f32 -- direct array):
```
If NULLABLE:
null_bitmap: [u8; ceil(vector_count / 8)] Bit-packed, 1 = present, 0 = null
[8B aligned]
values: [field_type; vector_count] Dense array of values
[64B aligned]
```
Values for null entries are zero-filled but must not be relied upon.
**Enum fields** (bit-packed):
```
enum_count: u8 Number of enum labels
For each enum label:
length: u8 Byte length of label
bytes: [u8; length] UTF-8 label string
bits_per_code: u8 ceil(log2(enum_count))
codes: packed bit array bits_per_code bits per vector
[ceil(vector_count * bits_per_code / 8) bytes]
[64B aligned]
```
For example, an enum with 3 values (`"+", "-", "."`) uses 2 bits per vector.
1M vectors = 250 KB.
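The packing rule `bits_per_code = ceil(log2(enum_count))` can be sketched as below. LSB-first bit order within bytes is an assumption here; the spec does not state the bit order:

```python
import math

def pack_enum_codes(codes, enum_count):
    """Bit-pack enum codes at ceil(log2(enum_count)) bits each (LSB-first)."""
    bits = max(1, math.ceil(math.log2(enum_count)))
    out = bytearray((len(codes) * bits + 7) // 8)
    for i, code in enumerate(codes):
        pos = i * bits                       # bit position of this code
        for b in range(bits):
            if code & (1 << b):
                out[(pos + b) >> 3] |= 1 << ((pos + b) & 7)
    return bits, bytes(out)
```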
**Bool fields** (bit-packed):
```
If NULLABLE:
null_bitmap: [u8; ceil(vector_count / 8)]
[8B aligned]
values: [u8; ceil(vector_count / 8)] Bit-packed, 1 = true, 0 = false
[64B aligned]
```
### 2.2 Sorted Index (Inline)
For fields with the SORTED flag, an additional sorted permutation index follows
the field data:
```
sorted_count: u32 Must equal vector_count
sorted_order: [varint delta-encoded] Vector IDs in ascending value order
restart_interval: u16 Restart every N entries (default 128)
restart_offsets: [u32; ceil(sorted_count / restart_interval)]
[64B aligned]
```
This enables binary search over field values for range queries without requiring
a separate METAIDX_SEG. It is suitable for fields where a full inverted index
would be wasteful (high cardinality numeric fields like `position_start`).
## 3. Filter Expression Language
### 3.1 Abstract Syntax
A filter expression is a tree of predicates combined with boolean logic:
```
expr ::= field_ref CMP literal -- comparison
| field_ref IN literal_set -- set membership
| field_ref PREFIX string_lit -- string prefix match
| field_ref CONTAINS string_lit -- substring containment
| expr AND expr -- conjunction
| expr OR expr -- disjunction
| NOT expr -- negation
```
### 3.2 Binary Encoding (Postfix / RPN)
Filter expressions are encoded as a postfix (Reverse Polish Notation) token stream
for stack-based evaluation. This avoids the need for recursive parsing and enables
single-pass evaluation with a fixed-size stack.
```
Filter Expression Binary Layout:
header:
node_count: u16 Total number of tokens
stack_depth: u8 Maximum stack depth required
reserved: u8 Must be zero
tokens (postfix order):
For each token:
node_type: u8 Token type (see enum below)
payload: type-specific Variable-size payload
```
### Token Type Enum
```
Value Name Stack Effect Payload
----- ---- ------------ -------
0x01 FIELD_REF push +1 field_id: u16
0x02 LIT_U32 push +1 value: u32
0x03 LIT_U64 push +1 value: u64
0x04 LIT_F32 push +1 value: f32
0x05 LIT_STR push +1 length: u16, bytes: [u8; length]
0x06 LIT_BOOL push +1 value: u8 (0 or 1)
0x07 LIT_NULL push +1 (no payload)
0x10 CMP_EQ pop 2, push 1 (no payload) -- a == b
0x11 CMP_NE pop 2, push 1 (no payload) -- a != b
0x12 CMP_LT pop 2, push 1 (no payload) -- a < b
0x13 CMP_LE pop 2, push 1 (no payload) -- a <= b
0x14 CMP_GT pop 2, push 1 (no payload) -- a > b
0x15 CMP_GE pop 2, push 1 (no payload) -- a >= b
0x20 IN_SET pop 1, push 1 set_size: u16, [encoded values]
0x21 PREFIX pop 2, push 1 (no payload) -- string prefix
0x22 CONTAINS pop 2, push 1 (no payload) -- substring match
0x30 AND pop 2, push 1 (no payload)
0x31 OR pop 2, push 1 (no payload)
0x32 NOT pop 1, push 1 (no payload)
```
### 3.3 Encoding Example
Filter: `organism = "E. coli" AND position_start >= 1000`
```
Token 0: FIELD_REF field_id=0 (organism) stack: [organism_val]
Token 1: LIT_STR "E. coli" stack: [organism_val, "E. coli"]
Token 2: CMP_EQ stack: [true/false]
Token 3: FIELD_REF field_id=3 (position_start) stack: [bool, pos_val]
Token 4: LIT_U64 1000 stack: [bool, pos_val, 1000]
Token 5: CMP_GE stack: [bool, true/false]
Token 6: AND stack: [result]
Binary: node_count=7, stack_depth=3
01 00:00 05 00:07 "E. coli" 10 01 00:03 03 00:00:00:00:00:00:03:E8 15 30
```
### 3.4 Evaluation
Evaluation processes tokens left to right using a fixed-size boolean/value stack:
```python
def evaluate(tokens, vector_id, metadata):
stack = []
for token in tokens:
if token.type == FIELD_REF:
stack.append(metadata.get_value(vector_id, token.field_id))
elif token.type in (LIT_U32, LIT_U64, LIT_F32, LIT_STR, LIT_BOOL, LIT_NULL):
stack.append(token.value)
elif token.type in (CMP_EQ, CMP_NE, CMP_LT, CMP_LE, CMP_GT, CMP_GE):
b, a = stack.pop(), stack.pop()
stack.append(compare(a, token.type, b))
elif token.type == IN_SET:
a = stack.pop()
stack.append(a in token.value_set)
elif token.type in (PREFIX, CONTAINS):
b, a = stack.pop(), stack.pop()
stack.append(string_match(a, token.type, b))
elif token.type == AND:
b, a = stack.pop(), stack.pop()
stack.append(a and b)
elif token.type == OR:
b, a = stack.pop(), stack.pop()
stack.append(a or b)
elif token.type == NOT:
stack.append(not stack.pop())
return stack[0]
```
Maximum stack depth is declared in the header so the evaluator can pre-allocate.
Implementations must reject expressions with `stack_depth > 16`.
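The evaluator can be exercised end-to-end with the example filter from Section 3.3. Token tags and the field lookup below are simplified stand-ins for the binary wire encoding:

```python
FIELD_REF, LIT, CMP_EQ, CMP_GE, AND = "field", "lit", "eq", "ge", "and"

def evaluate(tokens, fields):
    """Single-pass postfix evaluation with an explicit stack."""
    stack = []
    for kind, arg in tokens:
        if kind == FIELD_REF:
            stack.append(fields[arg])
        elif kind == LIT:
            stack.append(arg)
        elif kind == CMP_EQ:
            b, a = stack.pop(), stack.pop()
            stack.append(a == b)
        elif kind == CMP_GE:
            b, a = stack.pop(), stack.pop()
            stack.append(a >= b)
        elif kind == AND:
            b, a = stack.pop(), stack.pop()
            stack.append(a and b)
    return stack[0]

# organism = "E. coli" AND position_start >= 1000 (Section 3.3, postfix order)
expr = [(FIELD_REF, "organism"), (LIT, "E. coli"), (CMP_EQ, None),
        (FIELD_REF, "position_start"), (LIT, 1000), (CMP_GE, None),
        (AND, None)]
```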
## 4. Filter Evaluation Strategies
The runtime selects one of three strategies based on the estimated **selectivity**
of the filter (the fraction of vectors passing the filter).
### 4.1 Pre-Filtering (Selectivity < 1%)
Build the candidate ID set from metadata indexes first, then run vector search
only on the filtered subset.
```
1. Evaluate filter using METAIDX_SEG inverted/bitmap indexes
2. Collect matching vector IDs into a candidate set C
3. If |C| < ef_search:
Flat scan all candidates, return top-K
Else:
Build temporary flat index over C, run HNSW search restricted to C
4. Return top-K results
```
**Tradeoffs**:
- Optimal when the candidate set is very small (hundreds to low thousands)
- Risk: if the candidate set is disconnected in the HNSW graph, search cannot
traverse from entry points to candidates. The flat scan fallback handles this.
- Memory: candidate set bitmap = `ceil(total_vectors / 8)` bytes
### 4.2 Post-Filtering (Selectivity > 20%)
Run standard HNSW search with over-retrieval, then filter results.
```
1. Compute over_retrieval_factor = min(1.0 / selectivity, 10.0)
2. Set ef_search_adj = ef_search * over_retrieval_factor
3. Run standard HNSW search with ef_search_adj
4. Filter result set by evaluating filter expression per candidate
5. Return top-K from filtered results
```
**Tradeoffs**:
- Optimal when the filter passes most vectors (minimal wasted computation)
- Risk: if over-retrieval factor is too low, fewer than K results survive filtering.
The caller should retry with a higher factor or fall back to intra-filtering.
- No modification to HNSW traversal logic required.
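The over-retrieval computation in steps 1-2 as a helper (illustrative name, not a spec API):

```python
def adjusted_ef(ef_search, selectivity):
    """Post-filtering: widen ef_search by inverse selectivity, capped at 10x."""
    factor = min(1.0 / selectivity, 10.0)
    return int(ef_search * factor)
```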
### 4.3 Intra-Filtering (1% <= Selectivity <= 20%)
Evaluate the filter during HNSW traversal, skipping nodes that fail the predicate.
```python
def filtered_hnsw_search(query, filter_expr, entry_point, ef_search, k):
    candidates = MaxHeap()  # best ef_search candidates (max-heap by distance)
    worklist = MinHeap()    # exploration frontier (min-heap by distance)
visited = BitSet()
filtered_skips = 0
max_skips = ef_search * 3 # backoff threshold
worklist.push((distance(query, entry_point), entry_point))
visited.add(entry_point)
while worklist and filtered_skips < max_skips:
dist, node = worklist.pop()
# Check filter predicate
if not evaluate(filter_expr, node, metadata):
filtered_skips += 1
# Still expand neighbors (maintain graph connectivity)
neighbors = get_neighbors(node)
for n in neighbors:
if n not in visited:
visited.add(n)
d = distance(query, get_vector(n))
worklist.push((d, n))
continue
filtered_skips = 0 # reset skip counter on successful match
candidates.push((dist, node))
        if len(candidates) > ef_search:
            candidates.pop()  # evict worst (keep the best ef_search)
# Expand neighbors
neighbors = get_neighbors(node)
for n in neighbors:
if n not in visited:
visited.add(n)
d = distance(query, get_vector(n))
if len(candidates) < ef_search or d < candidates.max():
worklist.push((d, n))
return candidates.top_k(k)
```
**Key design decisions**:
1. **Skipped nodes still expand neighbors**: This preserves graph connectivity.
A node that fails the filter may have neighbors that pass it.
2. **Skip counter with backoff**: If too many consecutive nodes fail the filter,
the search is exhausting the local neighborhood without finding matches. The
`max_skips` threshold triggers termination to avoid unbounded traversal.
3. **Adaptive ef expansion**: When `filtered_skips > ef_search`, the effective
search frontier is larger than requested, compensating for filtered-out nodes.
### 4.4 Strategy Selection
```
selectivity = estimate_selectivity(filter_expr, metaidx_stats)
if selectivity < 0.01:
strategy = PRE_FILTER
elif selectivity > 0.20:
strategy = POST_FILTER
else:
strategy = INTRA_FILTER
```
Selectivity estimation uses statistics stored in the METAIDX_SEG header:
- **Inverted index**: `posting_list_length / total_vectors` per term
- **Bitmap index**: `popcount(bitmap) / total_vectors` per enum value
- **Range tree**: count of values in range / total_vectors
For compound filters (AND/OR), selectivity is estimated using independence
assumption: `P(A AND B) = P(A) * P(B)`, `P(A OR B) = P(A) + P(B) - P(A) * P(B)`.
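The composition rules and the thresholds from Section 4.4 together give the selector; a sketch under the stated independence assumption:

```python
def and_selectivity(p_a, p_b):
    return p_a * p_b                      # independence assumption

def or_selectivity(p_a, p_b):
    return p_a + p_b - p_a * p_b          # inclusion-exclusion

def select_strategy(selectivity):
    """Thresholds from Section 4.4."""
    if selectivity < 0.01:
        return "PRE_FILTER"
    if selectivity > 0.20:
        return "POST_FILTER"
    return "INTRA_FILTER"
```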
## 5. METAIDX_SEG (Segment Type 0x0D)
METAIDX_SEG stores secondary indexes over metadata fields for fast predicate
evaluation. Each METAIDX_SEG covers one field. The segment type enum value 0x0D
is allocated from the reserved range (see `binary-layout.md` Section 3).
```
METAIDX_SEG Payload:
+------------------------------------------+
| Index Header (64 bytes, padded) |
| field_id: u16 | Field being indexed
| index_type: u8 | 0=inverted, 1=range_tree, 2=bitmap
| field_type: u8 | Mirrors META_SEG field_type
| total_vectors: u64 | Vectors covered by this index
| unique_values: u64 | Cardinality (distinct values)
|   reserved: [u8; 44]                     | Pads header to 64 bytes
| [64B aligned] |
+------------------------------------------+
| Index Data (type-specific) |
+------------------------------------------+
```
### 5.1 Inverted Index (index_type = 0)
Best for: string fields with moderate cardinality (100 to 100K distinct values).
```
term_count: u32
For each term (sorted by encoded value):
term_length: u16
term_bytes: [u8; term_length] Encoded value (UTF-8 for strings)
posting_length: u32 Number of vector IDs
postings: [varint delta-encoded] Sorted vector IDs
[8B aligned after postings]
[64B aligned]
```
Posting lists use varint delta encoding identical to the ID encoding in VEC_SEG
(see `binary-layout.md` Section 5). Restart points every 128 entries enable
binary search within a posting list for intersection operations.
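Varint delta encoding of a sorted posting list can be sketched as below. This is a minimal LEB128-style illustration; the exact on-disk encoder and restart-point layout are defined in `binary-layout.md` Section 5, not here.

```
def encode_postings(ids):
    """Delta-encode sorted vector IDs as 7-bit varints (continuation bit 0x80)."""
    out, prev = bytearray(), 0
    for vid in ids:
        delta = vid - prev
        prev = vid
        while True:
            byte = delta & 0x7F
            delta >>= 7
            if delta:
                out.append(byte | 0x80)  # more bytes follow
            else:
                out.append(byte)
                break
    return bytes(out)

def decode_postings(buf):
    """Decode varint deltas back into absolute, sorted vector IDs."""
    ids, cur, acc, shift = [], 0, 0, 0
    for byte in buf:
        acc |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            cur += acc               # deltas accumulate into absolute IDs
            ids.append(cur)
            acc, shift = 0, 0
    return ids
```

Small deltas between adjacent IDs compress to one byte each, which is why the format requires postings to be sorted.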
### 5.2 Range Tree (index_type = 1)
Best for: numeric fields requiring range queries (u32, u64, f32).
```
page_size: u32 Fixed 4096 bytes (4 KB, one disk page)
page_count: u32
root_page: u32 Page index of B+ tree root
tree_height: u8
reserved: [u8; 47]
[64B aligned]
Internal Page (4096 bytes):
page_type: u8 (0 = internal)
key_count: u16
keys: [field_type; key_count] Separator keys
children: [u32; key_count + 1] Child page indices
[zero-padded to 4096]
Leaf Page (4096 bytes):
page_type: u8 (1 = leaf)
entry_count: u16
prev_leaf: u32 Linked-list pointer for range scan
next_leaf: u32
entries:
For each entry:
value: field_type The metadata value
vector_id: u64 Associated vector ID
[zero-padded to 4096]
```
Leaf pages form a doubly-linked list for efficient range scans. A range query
`position_start >= 1000 AND position_start <= 5000` descends the tree to find
the first leaf with value >= 1000, then scans forward until value > 5000.
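The descend-then-scan behavior can be sketched with in-memory stand-ins for leaf pages. This is illustrative only: real pages are 4096-byte structures, and the tree descent is replaced here by a linear search for the first qualifying leaf.

```
def range_scan(leaves, lo, hi):
    """Scan a linked list of leaf pages for values in [lo, hi].

    `leaves` is a list of dicts {"entries": [(value, vector_id), ...],
    "next": leaf_index_or_None} standing in for on-disk leaf pages.
    """
    results = []
    # Descend step (simplified): first leaf whose last value reaches lo.
    page = next((i for i, l in enumerate(leaves)
                 if l["entries"] and l["entries"][-1][0] >= lo), None)
    while page is not None:
        for value, vector_id in leaves[page]["entries"]:
            if value > hi:
                return results          # past the range: stop the scan
            if value >= lo:
                results.append(vector_id)
        page = leaves[page]["next"]     # follow the next_leaf pointer
    return results
```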
### 5.3 Bitmap Index (index_type = 2)
Best for: enum and bool fields with low cardinality (< 64 distinct values).
```
value_count: u8 Number of distinct enum/bool values
For each value:
value_label_len: u8
value_label: [u8; value_label_len] The enum label or "true"/"false"
bitmap_format: u8 0 = raw, 1 = roaring
bitmap_length: u32 Byte length of bitmap data
bitmap_data: [u8; bitmap_length] Bitmap of matching vector IDs
[8B aligned]
[64B aligned]
```
**Raw bitmaps** are used when `total_vectors < 8192` (1 KB per bitmap).
**Roaring bitmaps** are used for larger datasets. The roaring format stores
the bitmap as a set of containers (array, bitmap, or run-length) per 64K chunk.
This matches the industry-standard Roaring bitmap serialization (compatible with
CRoaring / roaring-rs wire format).
Bitmap intersection and union operations map directly to AND/OR filter predicates
using SIMD bitwise operations. For 10M vectors:
```
Raw bitmap: ~1.2 MB per value (impractical for many values)
Roaring bitmap: 100 KB - 1 MB per value depending on density
AND/OR: ~0.1 ms per operation (AVX-512 on 1 MB bitmap)
```
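The raw-format cutoff and the AND/OR mapping follow directly from the sizes above. A minimal sketch, using Python big integers where a native implementation would use SIMD registers over bitmap words:

```
def raw_bitmap_bytes(total_vectors):
    # One bit per vector, rounded up to whole bytes.
    return (total_vectors + 7) // 8

assert raw_bitmap_bytes(8192) == 1024               # the 1 KB raw cutoff
assert raw_bitmap_bytes(10_000_000) == 1_250_000    # ~1.2 MB per value at 10M

# AND/OR filter predicates map to bitwise ops on the bitmaps.
chr1  = (1 << 5) | (1 << 42)   # vectors 5 and 42 match chromosome = "chr1"
human = (1 << 5) | (1 << 7)    # vectors 5 and 7 match organism = "human"
assert chr1 & human == (1 << 5)          # AND: only vector 5
assert bin(chr1 | human).count("1") == 3  # OR: vectors 5, 7, 42
```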
## 6. Level 1 Manifest Addition
### Tag 0x000F: METADATA_INDEX_DIR
A new TLV record in the Level 1 manifest (see `02-manifest-system.md` Section 3)
that maps indexed metadata fields to their METAIDX_SEG segment IDs.
```
Tag: 0x000F
Name: METADATA_INDEX_DIR
Payload:
entry_count: u16
For each entry:
field_id: u16 Matches META_SEG field_id
field_name_len: u8
field_name: [u8; field_name_len] UTF-8 field name for debugging
index_seg_id: u64 Segment ID of METAIDX_SEG
index_type: u8 0=inverted, 1=range_tree, 2=bitmap
stats:
total_vectors: u64
unique_values: u64
min_posting_len: u32 Smallest posting list size
max_posting_len: u32 Largest posting list size
```
This allows the query planner to estimate selectivity without reading the
METAIDX_SEG segments themselves. The `min_posting_len` and `max_posting_len`
fields provide bounds for cardinality estimation.
### Updated Record Types Table
```
Tag Name Description
--- ---- -----------
0x0001 SEGMENT_DIR Array of segment directory entries
0x0002 TEMP_TIER_MAP Temperature tier assignments per block
...
0x000D KEY_DIRECTORY Encryption key references
0x000E (reserved)
0x000F METADATA_INDEX_DIR Metadata field -> METAIDX_SEG mapping
```
## 7. Performance Analysis
### 7.1 Filter Strategy vs Selectivity vs Recall
| Selectivity | Strategy | Recall@10 | Latency (10M vectors) | Notes |
|-------------|----------|-----------|----------------------|-------|
| 0.001% (100 matches) | Pre-filter | 1.00 | 0.02 ms | Flat scan on 100 candidates |
| 0.01% (1K matches) | Pre-filter | 0.99 | 0.08 ms | Flat scan on 1K candidates |
| 0.1% (10K matches) | Pre-filter | 0.98 | 0.5 ms | Mini-HNSW on 10K candidates |
| 1% (100K matches) | Intra-filter | 0.96 | 0.12 ms | ~10% node skip overhead |
| 5% (500K matches) | Intra-filter | 0.95 | 0.08 ms | ~5% node skip overhead |
| 10% (1M matches) | Intra-filter | 0.94 | 0.06 ms | Minimal skip overhead |
| 20% (2M matches) | Post-filter | 0.95 | 0.10 ms | 5x over-retrieval |
| 50% (5M matches) | Post-filter | 0.97 | 0.06 ms | 2x over-retrieval |
| 100% (no filter) | None | 0.98 | 0.04 ms | Baseline unfiltered |
### 7.2 Memory Overhead of Metadata Indexes
For 10M vectors with the RVDNA profile (5 indexed fields):
| Field | Type | Cardinality | Index Type | Size |
|-------|------|-------------|------------|------|
| organism | string | ~50K | Inverted | ~80 MB |
| gene_id | string | ~500K | Inverted | ~120 MB |
| chromosome | string | ~25 | Bitmap (roaring) | ~12 MB |
| position_start | u64 | ~10M | Range tree | ~160 MB |
| position_end | u64 | ~10M | Range tree | ~160 MB |
| **Total** | | | | **~532 MB** |
As a fraction of vector data (10M * 384 dim * fp16 ≈ 7.2 GiB): **~7.4% overhead**.
For the RVText profile (2 indexed fields, typically lower cardinality):
| Field | Type | Cardinality | Index Type | Size |
|-------|------|-------------|------------|------|
| source_url | string | ~100K | Inverted | ~90 MB |
| language | string | ~50 | Bitmap (roaring) | ~8 MB |
| **Total** | | | | **~98 MB** |
Overhead: **~1.4%** of vector data.
### 7.3 Query Latency Breakdown (Filtered Intra-Search)
```
Phase Time Notes
----- ---- -----
Parse filter expression 0.5 us Stack-based, no allocation
Estimate selectivity 1.0 us Read manifest stats
Load METAIDX_SEG (if cold) 50-200 us First query only; cached after
HNSW traversal (150 steps) 45 us Baseline unfiltered
+ filter eval per node +12 us ~80 ns per eval * 150 nodes
+ skip expansion +8 us ~20% more nodes visited at 5% sel.
Top-K collection 10 us Heap operations
--------
Total (warm cache) ~76 us
Total (cold start) ~276 us
```
## 8. Integration with Temperature Tiering
Metadata follows the same temperature model as vector data (see
`03-temperature-tiering.md`), but with its own tier assignments.
### 8.1 Hot Metadata
Indexed fields for hot-tier vectors are kept resident in memory:
- **Bitmap indexes** for low-cardinality fields (enum, bool) are always hot.
Total size is bounded: `cardinality * ceil(hot_vectors / 8)` bytes. For 100K
hot vectors and 25 enum values: 25 * 12.5 KB = 312 KB.
- **Inverted index posting lists** are cached using an LRU policy keyed by
(field_id, term). Frequently queried terms (e.g., `language = "en"`) remain
resident.
- **Range tree pages** follow the standard B+ tree buffer pool model. Hot pages
(root + first two levels) are pinned. Leaf pages are demand-paged.
### 8.2 Cold Metadata
Cold metadata covers vectors that are rarely accessed:
- META_SEG data for cold vectors is compressed with ZSTD (level 9+) and stored
in cold-tier segments.
- METAIDX_SEG posting lists for cold vectors are not loaded until a query
specifically requests them.
- When a filter matches only cold vectors (detected via the temperature tier
map), the runtime issues a warning: filtered search on cold data may require
decompression latency of 10-100 ms.
### 8.3 Compaction Coordination
When temperature-aware compaction reorganizes vector segments (see
`03-temperature-tiering.md` Section 4), metadata must follow:
```
1. Identify vectors moving between tiers
2. Rewrite META_SEG for affected vector ID ranges
3. Rebuild METAIDX_SEG posting lists (vector IDs may be renumbered during
compaction if the COMPACTION_RENUMBER flag is set)
4. Update METADATA_INDEX_DIR in the new manifest
5. Tombstone old META_SEG and METAIDX_SEG segments
```
Metadata compaction piggybacks on vector compaction -- it never triggers
independently. This ensures metadata and vector segments remain in consistent
temperature tiers.
### 8.4 Metadata-Aware Promotion
When a filter query frequently accesses metadata for warm-tier vectors, those
metadata segments are candidates for promotion to hot tier. The access sketch
(SKETCH_SEG) tracks metadata segment accesses alongside vector accesses:
```
sketch_key = (META_SEG_ID << 32) | block_id
```
This reuses the existing sketch infrastructure without modification.
## 9. Wire Protocol: Filtered Query Message
For completeness, the filter expression is carried in the query message as a
tagged field. The query wire format is outside the scope of the storage spec,
but the filter payload is defined here for interoperability.
```
Query Message Filter Field:
tag: u16 (0x0040 = FILTER)
length: u32
filter_version: u8 (1)
filter_payload: [u8; length - 1] Binary filter expression (Section 3.2)
```
Implementations that do not support filtered search must ignore tag 0x0040 and
return unfiltered results. This preserves backward compatibility.
## 10. Implementation Notes
### 10.1 Index Selection Heuristics
When building indexes for a new META_SEG field, implementations should select
the index type automatically:
```
if field_type in (enum, bool) and cardinality < 64:
index_type = BITMAP
elif field_type in (u32, u64, f32):
index_type = RANGE_TREE
else:
index_type = INVERTED
```
Fields without the `"indexed": true` property in the profile schema must not
have METAIDX_SEG segments built. They are stored in META_SEG for retrieval
only (the STORED flag).
### 10.2 Posting List Intersection
For AND filters on multiple indexed fields, posting list intersection is
performed using a merge-based algorithm on sorted, delta-decoded posting lists:
```
Sorted Intersection (two-pointer merge):
Time: O(min(|A|, |B|)) with skip-ahead via restart points
Practical: ~100 ns per 1000 common elements (SIMD comparison)
```
For OR filters, posting list union uses a similar merge with deduplication.
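Both merges can be sketched as follows; the skip-ahead via restart points and the SIMD comparison are omitted from this illustration.

```
def intersect(a, b):
    """Two-pointer merge intersection of two sorted posting lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def union(a, b):
    """Merge union with deduplication, for OR filters."""
    out, i, j = [], 0, 0
    while i < len(a) or j < len(b):
        if j == len(b) or (i < len(a) and a[i] < b[j]):
            out.append(a[i]); i += 1
        elif i == len(a) or b[j] < a[i]:
            out.append(b[j]); j += 1
        else:                       # equal: emit once, advance both
            out.append(a[i]); i += 1; j += 1
    return out
```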
### 10.3 Null Handling
- `FIELD_REF` for a null value pushes a sentinel NULL onto the stack
- `CMP_EQ NULL` returns true only for null values
- `CMP_NE NULL` returns true for all non-null values
- All other comparisons against NULL return false (SQL-style three-valued logic)
- `IN_SET` never matches NULL unless NULL is explicitly in the set
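The three-valued rules above can be captured in a small evaluator sketch. The function names and the `NULL` sentinel object are hypothetical; the real evaluator operates on the stack machine of Section 3.2.

```
NULL = object()   # sentinel pushed by FIELD_REF for a missing value

def cmp_eq(field, operand):
    if operand is NULL:
        return field is NULL        # CMP_EQ NULL matches only nulls
    if field is NULL:
        return False                # any other comparison vs NULL is false
    return field == operand

def cmp_ne(field, operand):
    if operand is NULL:
        return field is not NULL    # CMP_NE NULL matches all non-nulls
    if field is NULL:
        return False
    return field != operand

def in_set(field, members):
    if field is NULL:
        return NULL in members      # only if NULL is explicitly in the set
    return field in members
```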
# RVF Concurrency, Versioning, and Space Reclamation
## 1. Single-Writer / Multi-Reader Model
RVF uses a **single-writer, multi-reader** concurrency model. At most one process
may append segments to an RVF file at any time. Any number of readers may operate
concurrently with each other and with the writer. This model is enforced by an
advisory lock file, not by OS-level mandatory locking.
| Concern | Advisory Lock | Mandatory Lock (flock/fcntl) |
|---------|---------------|------------------------------|
| NFS compatibility | Works (lock file is a regular file) | Broken on many NFS configs |
| Crash recovery | Stale lock detectable by PID check | Kernel auto-releases, but only locally |
| Cross-language | Any language can create a file | Requires OS-specific syscalls |
| Visibility | Lock state inspectable by humans | Opaque kernel state |
| Multi-file mode | One lock covers all shards | Would need per-shard locks |
## 2. Writer Lock File
The writer lock is a file named `<basename>.rvf.lock` in the same directory as the
RVF file. For example, `data.rvf` uses `data.rvf.lock`.
### Binary Layout
```
Offset Size Field Description
------ ---- ----- -----------
0x00 4 magic 0x52564C46 ("RVLF" in ASCII)
0x04 4 pid Writer process ID (u32)
0x08 64 hostname Null-terminated hostname (max 63 chars + null)
0x48 8 timestamp_ns Lock acquisition time (nanosecond UNIX timestamp)
0x50 16 writer_id Random UUID (128-bit, written as raw bytes)
0x60 4 lock_version Lock protocol version (currently 1)
0x64 4 checksum CRC32C of bytes 0x00-0x63
```
**Total**: 104 bytes.
### Lock Acquisition Protocol
```
1. Construct lock file content (magic, PID, hostname, timestamp, random UUID)
2. Compute CRC32C over bytes 0x00-0x63, store at 0x64
3. Attempt open("<basename>.rvf.lock", O_CREAT | O_EXCL | O_WRONLY)
4. If open succeeds:
a. Write 104 bytes
b. fsync
c. Lock acquired — proceed with writes
5. If open fails (EEXIST):
a. Read existing lock file
b. Validate magic and checksum
c. If invalid: delete stale lock, retry from step 3
d. If valid: run stale lock detection (see below)
e. If stale: delete lock, retry from step 3
f. If not stale: lock acquisition fails — another writer is active
```
The `O_CREAT | O_EXCL` combination is atomic on POSIX filesystems, preventing
two processes from simultaneously creating the lock.
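Steps 3-4 can be sketched on a POSIX platform as below. The EEXIST branch (step 5, stale-lock handling) is omitted from this sketch, and `try_acquire_lock` is a hypothetical helper name.

```
import os

def try_acquire_lock(lock_path, content):
    """Attempt atomic lock-file creation; returns True on acquisition.

    `content` is the pre-built 104-byte lock record (magic, PID,
    hostname, timestamp, writer UUID, checksum).
    """
    try:
        # O_CREAT | O_EXCL fails atomically if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False   # step 5: inspect the existing lock instead
    try:
        os.write(fd, content)
        os.fsync(fd)   # lock record durable before writes begin
    finally:
        os.close(fd)
    return True
```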
### Stale Lock Detection
A lock is considered stale when **both** of the following are true:
1. **PID is dead**: `kill(pid, 0)` returns `ESRCH` (process does not exist), OR
the hostname does not match the current host (remote crash)
2. **Age exceeds threshold**: `now_ns - timestamp_ns > 30_000_000_000` (30 seconds)
The age check prevents a race where a PID is recycled by the OS. A lock younger
than 30 seconds is never considered stale, even if the PID appears dead, because
PID reuse on modern systems can occur within milliseconds.
If the hostname differs from the current host, the PID check is not meaningful.
In this case, only the age threshold applies. Implementations SHOULD use a longer
threshold (300 seconds) for cross-host lock recovery to account for clock skew.
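The two-condition staleness test can be sketched as follows; the function names are hypothetical, and `now_ns` is parameterized so the logic is testable without waiting.

```
import os
import socket
import time

STALE_AGE_NS = 30_000_000_000          # 30 s for same-host locks
CROSS_HOST_AGE_NS = 300_000_000_000    # 300 s when PID check is meaningless

def pid_is_dead(pid):
    try:
        os.kill(pid, 0)                # signal 0: existence check only
    except ProcessLookupError:
        return True                    # ESRCH: no such process
    except PermissionError:
        return False                   # exists, owned by another user
    return False

def lock_is_stale(pid, hostname, timestamp_ns, now_ns=None):
    now_ns = time.time_ns() if now_ns is None else now_ns
    age = now_ns - timestamp_ns
    if hostname != socket.gethostname():
        return age > CROSS_HOST_AGE_NS   # remote lock: age check only
    # Both conditions required: a dead PID alone is not enough (PID reuse).
    return pid_is_dead(pid) and age > STALE_AGE_NS
```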
### Lock Release Protocol
```
1. fsync all pending data and manifest segments
2. Verify the lock file still contains our writer_id (re-read and compare)
3. If writer_id matches: unlink("<basename>.rvf.lock")
4. If writer_id does not match: abort — another process stole the lock
```
Step 2 prevents a writer from deleting a lock that was legitimately taken over
after a stale lock recovery by another process.
If a writer crashes without releasing the lock, the lock file persists on disk.
The next writer detects the orphan via stale lock detection and reclaims it.
No data corruption occurs because the append-only segment model guarantees that
partial writes are detectable: a segment with a bad content hash or a truncated
manifest is simply ignored.
## 3. Reader-Writer Coordination
Readers and writers operate independently. The append-only architecture ensures
they never conflict.
### Reader Protocol
```
1. Open file (read-only, no lock required)
2. Read Level 0 root manifest (last 4096 bytes)
3. Parse hotset pointers and Level 1 offset
4. This manifest snapshot defines the reader's view of the file
5. All queries within this session use the snapshot
6. To see new data: re-read Level 0 (explicit refresh)
```
### Writer Protocol
```
1. Acquire lock (Section 2)
2. Read current manifest to learn segment directory state
3. Append new segments (VEC_SEG, INDEX_SEG, etc.)
4. Append new MANIFEST_SEG referencing all live segments
5. fsync
6. Release lock (Section 2)
```
### Concurrent Timeline
```
Time Writer Reader A Reader B
---- ------ -------- --------
t=0 Acquires lock
t=1 Appends VEC_SEG_4 Opens file
t=2 Appends VEC_SEG_5 Opens file Reads manifest M3
t=3 Appends MANIFEST_SEG M4 Reads manifest M3 Queries (sees M3)
t=4 fsync, releases lock Queries (sees M3) Queries (sees M3)
t=5 Queries (sees M3) Refreshes -> M4
t=6 Refreshes -> M4 Queries (sees M4)
```
Reader A opened during the write but read manifest M3 (already stable) and never
sees partially written segments. Reader B sees M3 until explicit refresh. Neither
reader is blocked; the writer is never blocked by readers.
### Snapshot Isolation Guarantees
A reader holding a manifest snapshot is guaranteed:
1. All referenced segments are fully written and fsynced
2. Segment content hashes match (the manifest would not reference broken segments)
3. The snapshot is internally consistent (no partial epoch states)
4. The snapshot remains valid for the lifetime of the open file descriptor, even
if the file is compacted and replaced (old inode persists until close)
## 4. Format Versioning
RVF uses explicit version fields at every structural level. The versioning rules
are designed for forward compatibility — older readers can safely process files
produced by newer writers, with graceful degradation.
### Segment Version Compatibility
The segment header `version` field (offset 0x04, currently `1`) governs
segment-level compatibility.
| Rule | Description |
|------|-------------|
| S1 | A v1 reader MUST successfully process all v1 segments |
| S2 | A v1 reader MUST skip segments with version > 1 |
| S3 | A v1 reader MUST log a warning when skipping unknown versions |
| S4 | A v1 reader MUST NOT reject a file because it contains unknown-version segments |
| S5 | A v2+ writer MUST write a root manifest readable by v1 readers (if the root manifest format allows it) |
| S6 | A v2+ writer MAY write segments with version > 1 |
| S7 | Readers MUST use `payload_length` from the segment header to skip unknown segments |
Skipping works because the segment header layout is stable: magic, version,
seg_type, and payload_length occupy fixed offsets. A reader skips unknown
segments by seeking past `64 + payload_length` bytes (header + payload).
### Unknown Segment Types
The segment type enum (offset 0x05) may be extended in future versions.
| Rule | Description |
|------|-------------|
| T1 | A reader MUST skip segment types outside the recognized range (currently 0x01-0x0C) |
| T2 | A reader MUST NOT reject a file because of unknown segment types |
| T3 | A reader MUST use the header's `payload_length` to skip the unknown segment |
| T4 | A reader SHOULD log unknown types at diagnostic/debug level |
| T5 | Types 0x00 and 0xF0-0xFF remain reserved (see spec 01, Section 3) |
### Level 1 TLV Forward Compatibility
Level 1 manifest records use tag-length-value encoding. New tags may be added
in any version.
| Rule | Description |
|------|-------------|
| L1 | A reader MUST skip TLV records with unknown tags |
| L2 | A reader MUST use the record's `length` field (4 bytes at tag offset +2) to skip |
| L3 | A writer MUST NOT change the semantics of an existing tag |
| L4 | A writer MUST NOT reuse a tag value for a different purpose |
| L5 | New tags MUST be assigned sequentially from the lowest unassigned value (0x0011 at the time of writing) |
### Root Manifest Compatibility
The root manifest (Level 0) has the strictest compatibility requirements because
it is the entry point for all readers.
| Rule | Description |
|------|-------------|
| R1 | The magic `0x52564D30` at offset 0x000 is frozen forever |
| R2 | The layout of bytes 0x000-0x007 (magic + version + flags) is frozen forever |
| R3 | New fields may be added to reserved space at offsets 0xF00-0xFFB |
| R4 | Readers MUST ignore non-zero bytes in reserved space they do not understand |
| R5 | The root checksum at 0xFFC always covers bytes 0x000-0xFFB |
| R6 | A v2+ writer extending reserved space MUST ensure the checksum remains valid |
There is no explicit version negotiation. Compatibility is achieved through the
skip rules above. A reader processes what it understands and skips what it does
not. This avoids capability exchange, making RVF suitable for offline and
archival use cases.
## 5. Variable Dimension Support
The root manifest declares a `dimension` field (offset 0x020, u16) and each
VEC_SEG block declares its own `dim` field (block header offset 0x08, u16).
These may differ.
### Dimension Rules
| Rule | Description |
|------|-------------|
| D1 | The root manifest `dimension` is the **primary dimension** (most common in the file) |
| D2 | An RVF file MAY contain VEC_SEG blocks with dimensions different from the primary |
| D3 | Each VEC_SEG block's `dim` field is authoritative for the vectors in that block |
| D4 | The HNSW index (INDEX_SEG) covers only vectors matching the primary dimension |
| D5 | Vectors with non-primary dimensions are searchable via flat scan or a separate index |
| D6 | A PROFILE_SEG may declare multiple expected dimensions |
### Dimension Catalog (Level 1 Record)
A new Level 1 TLV record (tag `0x0010`, DIMENSION_CATALOG) enables readers to
discover all dimensions present without scanning every VEC_SEG.
Record layout:
```
Offset Size Field Description
------ ---- ----- -----------
0x00 2 entry_count Number of dimension entries
0x02 2 reserved Must be zero
```
Followed by `entry_count` entries of:
```
Offset Size Field Description
------ ---- ----- -----------
0x00 2 dimension Vector dimensionality
0x02 1 dtype Data type enum for these vectors
0x03 1 flags 0x01 = primary, 0x02 = has_index
0x04 4 vector_count Number of vectors with this dimension
0x08 8 index_seg_offset Offset to dedicated index (0 if none)
```
**Entry size**: 16 bytes.
Example for an RVDNA profile file:
```
DIMENSION_CATALOG:
entry_count: 3
[0] dim=64, dtype=f16, flags=0x01 (primary, has_index), count=10000000, index=0x1A00000
[1] dim=384, dtype=f16, flags=0x02 (has_index), count=500000, index=0x3F00000
[2] dim=4096, dtype=f32, flags=0x00 (flat scan only), count=10000, index=0
```
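A decoder for this record is a straightforward fixed-layout parse. A minimal sketch, assuming the little-endian layout above; `parse_dimension_catalog` is a hypothetical helper name.

```
import struct

def parse_dimension_catalog(payload):
    """Decode a DIMENSION_CATALOG record: 4-byte header, 16-byte entries."""
    entry_count, _reserved = struct.unpack_from("<HH", payload, 0)
    entries = []
    for i in range(entry_count):
        off = 4 + 16 * i
        dim, dtype, flags, count, idx_off = struct.unpack_from(
            "<HBBIQ", payload, off)      # u16, u8, u8, u32, u64 = 16 bytes
        entries.append({
            "dimension": dim,
            "dtype": dtype,
            "primary": bool(flags & 0x01),
            "has_index": bool(flags & 0x02),
            "vector_count": count,
            "index_seg_offset": idx_off,
        })
    return entries
```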
## 6. Space Reclamation
Over time, tombstoned segments and superseded manifests accumulate dead space.
RVF provides three reclamation strategies, each suited to different operating
conditions.
### Strategy 1: Hole-Punching
On Linux filesystems that support `fallocate(2)` with `FALLOC_FL_PUNCH_HOLE`
(ext4, XFS, btrfs), tombstoned segment ranges can be released back to the
filesystem without rewriting the file.
```
Before: [VEC_1 live] [VEC_2 dead] [VEC_3 dead] [VEC_4 live] [MANIFEST]
After: [VEC_1 live] [ hole ] [ hole ] [VEC_4 live] [MANIFEST]
```
File size is unchanged but disk blocks are freed. No data movement occurs — each
punch is O(1). Reader mmap still works (holes read as zeros, but the manifest
never references them). Hole-punching is performed only on segments marked as
TOMBSTONE in the current manifest's COMPACTION_STATE record.
### Strategy 2: Copy-Compact
Copy-compact rewrites the file, including only live segments. This is the
universal strategy that works on all filesystems.
```
Protocol:
1. Acquire writer lock
2. Read current manifest to enumerate live segments
3. Create temporary file: <basename>.rvf.compact.tmp
4. Write live segments sequentially to temporary file
5. Write new MANIFEST_SEG with updated offsets
6. fsync temporary file
7. Atomic rename: <basename>.rvf.compact.tmp -> <basename>.rvf
8. Release writer lock
```
The atomic rename (step 7) ensures readers either see the old file or the new
file, never a partial state. Readers that opened the old file before the rename
continue operating on the old inode via their open file descriptor. The old
inode is freed when the last reader closes its descriptor.
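The write-fsync-rename sequence (steps 3-7) can be sketched as below. Manifest rewriting and offset patching are elided; segments are passed in as pre-serialized bytes, which is an assumption of this sketch.

```
import os

def copy_compact(live_segments, dst_path):
    """Write live segments to a temp file, fsync, then atomically replace.

    `live_segments` is an iterable of (header_bytes, payload_bytes).
    """
    tmp_path = dst_path + ".compact.tmp"
    with open(tmp_path, "wb") as f:
        for header, payload in live_segments:
            f.write(header)
            f.write(payload)
        f.flush()
        os.fsync(f.fileno())        # durable before the rename is visible
    # Atomic on POSIX: readers see either the old file or the new file.
    os.replace(tmp_path, dst_path)
```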
### Strategy 3: Shard Rewrite (Multi-File Mode)
In multi-file mode, individual shard files can be rewritten independently:
```
Protocol:
1. Acquire writer lock
2. Read shard reference from Level 1 SHARD_REFS record
3. Write new shard: <basename>.rvf.cold.<N>.compact.tmp
4. fsync new shard
5. Update main file manifest with new shard reference
6. fsync main file
7. Atomic rename new shard over old shard
8. Release writer lock
```
The old shard is safe to delete after all readers close their descriptors.
Implementations MAY defer deletion using a grace period (default: 60 seconds).
## 7. Space Reclamation Triggers
Reclamation is not performed on every write. Implementations SHOULD evaluate
triggers after each manifest write and act when thresholds are exceeded.
| Trigger | Threshold | Action |
|---------|-----------|--------|
| Dead space ratio | > 50% of file size | Copy-compact |
| Dead space absolute | > 1 GB | Hole-punch if supported, else copy-compact |
| Tombstone count | > 10,000 JOURNAL_SEG tombstone entries | Consolidate journal segments |
| Time since last compaction | > 7 days | Evaluate dead space ratio, compact if > 25% |
### Dead Space Calculation
Dead space is computed from the manifest's COMPACTION_STATE record:
```
dead_bytes = sum(payload_length + 64) for each tombstoned segment
total_bytes = file_size
dead_ratio = dead_bytes / total_bytes
```
The `+ 64` accounts for the segment header.
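The calculation can be sketched directly from the formula above; `dead_space` is a hypothetical helper name.

```
def dead_space(tombstoned_payload_lengths, file_size):
    """Return (dead_bytes, dead_ratio); +64 covers each segment header."""
    dead = sum(plen + 64 for plen in tombstoned_payload_lengths)
    return dead, dead / file_size

dead, ratio = dead_space([1_000_000, 2_000_000], 10_000_000)
# dead = 3_000_128 bytes; ratio ~ 0.30, below the 50% copy-compact trigger
```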
### Trigger Evaluation Protocol
```
1. After writing a new MANIFEST_SEG, compute dead_bytes and dead_ratio
2. If dead_ratio > 0.50: schedule copy-compact
3. Else if dead_bytes > 1 GB:
a. If fallocate supported: hole-punch tombstoned ranges
b. Else: schedule copy-compact
4. If tombstone_count > 10,000: consolidate JOURNAL_SEGs
5. If days_since_last_compact > 7 AND dead_ratio > 0.25: schedule copy-compact
```
Scheduled compactions MAY be deferred to a background process or low-activity
period.
## 8. Multi-Process Compaction
Compaction is a write operation and requires the writer lock. Only one process
may compact at a time.
### Background Compaction Process
A dedicated compaction process can run alongside the application:
```
1. Attempt writer lock acquisition
2. If lock acquired:
a. Read current manifest
b. Evaluate reclamation triggers
c. If compaction needed:
i. Write WITNESS_SEG with compaction_state = STARTED
ii. Perform compaction (copy-compact or hole-punch)
iii. Write WITNESS_SEG with compaction_state = COMPLETED
iv. Write new MANIFEST_SEG
d. Release lock
3. If lock not acquired: sleep and retry
```
### Crash Safety
Compaction is crash-safe by construction. Copy-compact does not rename until
fsynced — a crash before rename leaves the original file untouched and the
temporary file is cleaned up on next startup. Hole-punch `fallocate` calls are
individually atomic; a crash mid-sequence leaves the manifest consistent because
it references only live segments. Shard rewrite follows the same atomic rename
pattern as copy-compact.
### Compaction Progress and Resumability
For long-running compactions, the writer records progress in WITNESS_SEG segments:
```
WITNESS_SEG compaction payload:
Offset Size Field Description
------ ---- ----- -----------
0x00 4 state 0=STARTED, 1=IN_PROGRESS, 2=COMPLETED, 3=ABORTED
0x04 8 source_manifest_id Segment ID of manifest being compacted
0x0C 8 last_copied_seg_id Last segment ID successfully written to new file
0x14 8 bytes_written Total bytes written to new file so far
0x1C 8 bytes_remaining Estimated bytes remaining
0x24 16 temp_file_hash Hash of temporary file at last checkpoint
```
If a compaction process crashes and restarts, it can:
1. Find the latest WITNESS_SEG with `state = IN_PROGRESS`
2. Verify the temporary file exists and matches `temp_file_hash`
3. Resume from `last_copied_seg_id + 1`
4. If verification fails, delete the temporary file and restart compaction
## 9. Crash Recovery Summary
RVF recovers from crashes at any point without external tooling.
| Crash Point | State After Recovery | Action Required |
|-------------|---------------------|-----------------|
| Segment append (before manifest) | Orphan segment at tail | None — manifest does not reference it |
| Manifest write | Partial manifest at tail | Scan backward to previous valid manifest |
| Lock acquisition | Lock file may or may not exist | Stale lock detection resolves it |
| Lock release | Lock file persists | Stale lock detection resolves it |
| Copy-compact (before rename) | Temporary file on disk | Delete `*.compact.tmp` on startup |
| Copy-compact (during rename) | Atomic — old or new | No action needed |
| Hole-punch | Partial holes punched | No action — manifest is consistent |
| Shard rewrite | Temporary shard on disk | Delete `*.compact.tmp` on startup |
### Startup Recovery Protocol
On startup, before acquiring a write lock, a writer SHOULD:
```
1. Delete any <basename>.rvf.compact.tmp files (orphaned compaction)
2. Delete any <basename>.rvf.cold.*.compact.tmp files (orphaned shard compaction)
3. Validate the lock file (if present) for staleness
4. Open the RVF file and locate the latest valid manifest
5. If the tail contains a partial segment (magic present, bad hash):
a. Log a warning with the partial segment's offset and type
b. The partial segment is outside the manifest — it is harmless
c. The next append will overwrite it (or it will be compacted away)
```
## 10. Invariants
The following invariants extend those in spec 01 (Section 7):
1. At most one writer lock exists per RVF file at any time
2. A lock file with valid magic and checksum represents an active or stale lock
3. Readers never require a lock, regardless of operation
4. A manifest snapshot is immutable for the lifetime of a reader session
5. Compaction never modifies live segments — it creates new ones
6. Hole-punched regions are never referenced by any manifest
7. The root manifest magic and first 8 bytes are frozen across all versions
8. Unknown segment versions and types are skipped, never rejected
9. Unknown TLV tags in Level 1 are skipped, never rejected
10. Each VEC_SEG block's `dim` field is authoritative for that block's vectors
# RVF Operations API
## 1. Scope
This document specifies the operational surface of an RVF runtime: error codes
returned by all operations, wire formats for batch queries, batch ingest, and
batch deletes, the network streaming protocol for progressive loading over HTTP
and TCP, and the compaction scheduling policy. It complements the segment model
(spec 01), manifest system (spec 02), and query optimization (spec 06).
All multi-byte integers are little-endian unless otherwise noted. All offsets
within messages are byte offsets from the start of the message payload.
## 2. Error Code Enumeration
Error codes are 16-bit unsigned integers. The high byte identifies the error
category; the low byte identifies the specific error within that category.
Implementations must preserve unrecognized codes in responses and must not
treat unknown codes as fatal unless the high byte is `0x01` (format error).
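The category split and the unknown-code rule can be sketched as follows; the function names are illustrative, not part of the API surface.

```
def split_error_code(code):
    """High byte = category, low byte = specific error within it."""
    return code >> 8, code & 0xFF

def is_fatal_unknown(code):
    """Unknown codes are fatal only when the category is 0x01 (format)."""
    category, _ = split_error_code(code)
    return category == 0x01

assert split_error_code(0x0203) == (0x02, 0x03)   # FILTER_PARSE_ERROR
assert is_fatal_unknown(0x01FE) is True           # unknown format error
assert is_fatal_unknown(0x02FE) is False          # unknown query error
```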
### Category 0x00: Success
```
Code Name Description
------ -------------------- ----------------------------------------
0x0000 OK Operation succeeded
0x0001 OK_PARTIAL Partial success (some items failed)
```
`OK_PARTIAL` is returned when a batch operation succeeds for some items and
fails for others. The response body contains per-item status details.
### Category 0x01: Format Errors
```
Code Name Description
------ -------------------- ----------------------------------------
0x0100 INVALID_MAGIC Segment magic mismatch (expected 0x52564653)
0x0101 INVALID_VERSION Unsupported segment version
0x0102 INVALID_CHECKSUM Segment hash verification failed
0x0103 INVALID_SIGNATURE Cryptographic signature invalid
0x0104 TRUNCATED_SEGMENT Segment payload shorter than declared length
0x0105 INVALID_MANIFEST Root manifest validation failed
0x0106 MANIFEST_NOT_FOUND No valid MANIFEST_SEG in file
0x0107 UNKNOWN_SEGMENT_TYPE Segment type not recognized (warning, not fatal)
0x0108 ALIGNMENT_ERROR Data not at expected 64B boundary
```
`UNKNOWN_SEGMENT_TYPE` is advisory. A reader encountering an unknown segment
type should skip it and continue. All other format errors in this category
are fatal for the affected segment.
### Category 0x02: Query Errors
```
Code Name Description
------ -------------------- ----------------------------------------
0x0200 DIMENSION_MISMATCH Query vector dimension != index dimension
0x0201 EMPTY_INDEX No index segments available
0x0202 METRIC_UNSUPPORTED Requested distance metric not available
0x0203 FILTER_PARSE_ERROR Invalid filter expression
0x0204 K_TOO_LARGE Requested K exceeds available vectors
0x0205 TIMEOUT Query exceeded time budget
```
When `K_TOO_LARGE` is returned, the response still contains all available
results. The result count will be less than the requested K.
### Category 0x03: Write Errors
```
Code Name Description
------ -------------------- ----------------------------------------
0x0300 LOCK_HELD Another writer holds the lock
0x0301 LOCK_STALE Lock file exists but owner process is dead
0x0302 DISK_FULL Insufficient space for write
0x0303 FSYNC_FAILED Durable write failed
0x0304 SEGMENT_TOO_LARGE Segment exceeds 4 GB limit
0x0305 READ_ONLY File opened in read-only mode
```
`LOCK_STALE` is informational. The runtime may attempt to break the stale
lock and retry. If recovery succeeds, the original operation proceeds with
an `OK` status.
### Category 0x04: Tile Errors (WASM Microkernel)
```
Code Name Description
------ -------------------- ----------------------------------------
0x0400 TILE_TRAP WASM trap (OOB, unreachable, stack overflow)
0x0401 TILE_OOM Tile exceeded scratch memory (64 KB)
0x0402 TILE_TIMEOUT Tile computation exceeded time budget
0x0403 TILE_INVALID_MSG Malformed hub-tile message
0x0404 TILE_UNSUPPORTED_OP Operation not available on this profile
```
All tile errors trigger the fault isolation protocol described in
`microkernel/wasm-runtime.md` section 8. The hub reassigns the tile's
work and optionally restarts the faulted tile.
### Category 0x05: Crypto Errors
```
Code Name Description
------ -------------------- ----------------------------------------
0x0500 KEY_NOT_FOUND Referenced key_id not in CRYPTO_SEG
0x0501 KEY_EXPIRED Key past valid_until timestamp
0x0502 DECRYPT_FAILED Decryption or auth tag verification failed
0x0503 ALGO_UNSUPPORTED Cryptographic algorithm not implemented
```
Crypto errors are always fatal for the affected segment. An implementation
must not serve data from a segment that fails signature or decryption checks.
## 3. Batch Query API
### Wire Format: Request
Batch queries amortize connection overhead and enable the runtime to
schedule vector block loads across multiple queries simultaneously.
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 query_count Number of queries in batch (max 1024)
0x04 4 k Shared top-K parameter
0x08 1 metric Distance metric: 0=L2, 1=IP, 2=cosine, 3=hamming
0x09 3 reserved Must be zero
0x0C 4 ef_search HNSW ef_search parameter
0x10 4 shared_filter_len Byte length of shared filter (0 = no filter)
0x14 var shared_filter Filter expression (applies to all queries)
var var queries[] Per-query entries (see below)
```
Each query entry:
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 query_id Client-assigned correlation ID
0x04 2 dim Vector dimensionality
0x06 1 dtype Data type: 0=fp32, 1=fp16, 2=i8, 3=binary
0x07 1 flags Bit 0: has per-query filter
0x08 var vector Query vector (dim * sizeof(dtype) bytes)
var 4 filter_len Byte length of per-query filter (if flags bit 0)
var var filter Per-query filter (overrides shared filter)
```
When both a shared filter and a per-query filter are present, the per-query
filter takes precedence. A per-query filter of zero length inherits the
shared filter.
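The precedence rule above can be expressed as a small helper. A minimal sketch (the `Option`-based signature is an illustration of the rule, not part of the wire format):

```rust
/// Resolve the filter that applies to one query in a batch.
/// A per-query filter, when present and non-empty, overrides the
/// shared filter; an absent or zero-length per-query filter inherits it.
fn effective_filter<'a>(
    shared: Option<&'a [u8]>,
    per_query: Option<&'a [u8]>,
) -> Option<&'a [u8]> {
    match per_query {
        Some(f) if !f.is_empty() => Some(f),
        _ => shared,
    }
}
```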
### Wire Format: Response
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 query_count Number of query results
0x04 var results[] Per-query result entries
```
Each result entry:
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 query_id Correlation ID from request
0x04 2 status Error code (0x0000 = OK)
0x06 2 reserved Must be zero
0x08 4 result_count Number of results returned
0x0C var results[] Array of (vector_id: u64, distance: f32) pairs
```
Each result pair is 12 bytes: 8 bytes for the vector ID followed by 4 bytes
for the distance value. Results are sorted by distance ascending (nearest first).
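A decoder for the pair array follows directly from the layout above (little-endian, per the invariants in section 10); the function name is illustrative:

```rust
/// Decode the `results[]` array of a per-query result entry:
/// each pair is 12 bytes, a little-endian u64 vector ID followed
/// by a little-endian f32 distance.
fn decode_result_pairs(buf: &[u8]) -> Option<Vec<(u64, f32)>> {
    if buf.len() % 12 != 0 {
        return None; // truncated entry
    }
    let mut out = Vec::with_capacity(buf.len() / 12);
    for pair in buf.chunks_exact(12) {
        let id = u64::from_le_bytes(pair[0..8].try_into().unwrap());
        let dist = f32::from_le_bytes(pair[8..12].try_into().unwrap());
        out.push((id, dist));
    }
    Some(out)
}
```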
### Batch Scheduling
The runtime should process batch queries using the following strategy:
1. Parse all query vectors and load them into memory
2. Identify shared segments across queries (block deduplication)
3. Load each vector block once and evaluate all relevant queries against it
4. Merge per-query top-K heaps independently
5. Return results as soon as each query completes (streaming response)
This amortizes I/O: if N queries touch the same vector block, the block is
read once instead of N times.
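The deduplication in steps 2-3 amounts to inverting the query-to-block mapping. A minimal sketch, with block IDs and the map type chosen for illustration:

```rust
use std::collections::BTreeMap;

/// Given, for each query, the set of vector-block IDs it must scan,
/// build the inverse map: block -> queries that need it. Each block
/// is then loaded once and evaluated against all interested queries.
fn plan_block_loads(query_blocks: &[Vec<u32>]) -> BTreeMap<u32, Vec<usize>> {
    let mut plan: BTreeMap<u32, Vec<usize>> = BTreeMap::new();
    for (qid, blocks) in query_blocks.iter().enumerate() {
        for &b in blocks {
            plan.entry(b).or_default().push(qid);
        }
    }
    plan
}
```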
## 4. Batch Ingest API
### Wire Format: Request
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 vector_count Number of vectors to ingest (max 65536)
0x04 2 dim Vector dimensionality
0x06 1 dtype Data type: 0=fp32, 1=fp16, 2=i8, 3=binary
0x07 1 flags Bit 0: metadata_included
0x08 var vectors[] Vector entries
var var metadata[] Metadata entries (if flags bit 0)
```
Each vector entry:
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 8 vector_id Globally unique vector ID
0x08 var vector Vector data (dim * sizeof(dtype) bytes)
```
Each metadata entry (when metadata_included is set):
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 2 field_count Number of metadata fields
0x02 var fields[] Field entries
```
Each metadata field:
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 2 field_id Field identifier (application-defined)
0x02 1 value_type 0=u64, 1=i64, 2=f64, 3=string, 4=bytes
0x03 var value Encoded value (u64/i64/f64: 8B; string/bytes: 4B length + data)
```
### Wire Format: Response
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 accepted_count Number of vectors accepted
0x04 4 rejected_count Number of vectors rejected
0x08 4 manifest_epoch Epoch of manifest after commit
0x0C var rejected_ids[] Array of rejected vector IDs (u64 * rejected_count)
var var rejected_reasons[] Array of error codes (u16 * rejected_count)
```
The `manifest_epoch` field is the epoch of the MANIFEST_SEG written after the
ingest is committed. Clients can use this value to confirm that a subsequent
read will include the ingested vectors.
### Ingest Commit Semantics
1. The runtime writes vectors to a new VEC_SEG (append-only)
2. If metadata is included, a META_SEG is appended
3. Both segments are fsynced
4. A new MANIFEST_SEG is written referencing the new segments
5. The manifest is fsynced
6. The response is sent with the new manifest_epoch
Vectors are visible to queries only after step 6 completes.
## 5. Batch Delete API
### Wire Format: Request
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 1 delete_type 0=by_id, 1=by_range, 2=by_filter
0x01 3 reserved Must be zero
0x04 var payload Type-specific payload (see below)
```
Delete by ID (`delete_type = 0`):
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 count Number of IDs to delete
0x04 var ids[] Array of vector IDs (u64 * count)
```
Delete by range (`delete_type = 1`):
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 8 start_id Start of range (inclusive)
0x08 8 end_id End of range (exclusive)
```
Delete by filter (`delete_type = 2`):
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 filter_len Byte length of filter expression
0x04 var filter Filter expression
```
### Wire Format: Response
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 8 deleted_count Number of vectors deleted
0x08 2 status Error code (0x0000 = OK)
0x0A 2 reserved Must be zero
0x0C 4 manifest_epoch Epoch of manifest after delete committed
```
### Delete Mechanics
Deletes are logical. The runtime appends a JOURNAL_SEG containing tombstone
entries for the deleted vector IDs. The new MANIFEST_SEG marks affected
VEC_SEGs as partially dead. Physical reclamation happens during compaction.
## 6. Network Streaming Protocol
### 6.1 HTTP Range Requests (Read-Only Access)
RVF's progressive loading model maps naturally to HTTP byte-range requests.
A client can boot from a remote `.rvf` file and become queryable without
downloading the entire file.
**Phase 1: Boot (mandatory)**
```
GET /file.rvf Range: bytes=-4096
```
Retrieves the last 4 KB of the file. This contains the Level 0 root manifest
(MANIFEST_SEG). The client parses hotset pointers, the segment directory, and
the profile ID.
If the file is smaller than 4 KB, the entire file is returned. If the last
4 KB does not contain a valid MANIFEST_SEG, the client extends the range
backward in 4 KB increments until one is found or 1 MB is scanned (at which
point it returns `MANIFEST_NOT_FOUND`).
**Phase 2: Hotset (parallel, mandatory for queries)**
Using offsets from the Level 0 manifest, the client issues up to 5 parallel
range requests:
```
GET /file.rvf Range: bytes=<entrypoint_offset>-<entrypoint_end>
GET /file.rvf Range: bytes=<toplayer_offset>-<toplayer_end>
GET /file.rvf Range: bytes=<centroid_offset>-<centroid_end>
GET /file.rvf Range: bytes=<quantdict_offset>-<quantdict_end>
GET /file.rvf Range: bytes=<hotcache_offset>-<hotcache_end>
```
These fetch the HNSW entry point, top-layer graph, routing centroids,
quantization dictionary, and the hot cache (HOT_SEG). After these 5 requests
complete, the system is queryable with recall >= 0.7.
**Phase 3: Level 1 (background)**
```
GET /file.rvf Range: bytes=<l1_offset>-<l1_end>
```
Fetches the Level 1 manifest containing the full segment directory. This
enables the client to discover all segments and plan on-demand fetches.
**Phase 4: On-demand (per query)**
For queries that require cold data not yet fetched:
```
GET /file.rvf Range: bytes=<segment_offset>-<segment_end>
```
The client caches fetched segments locally. Repeated queries against the
same data region do not trigger additional requests.
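The boot request uses a suffix range while later phases use absolute ranges; only the `Range` header value differs. A sketch of building those values (HTTP client plumbing omitted; function names are illustrative):

```rust
/// `Range` header value for the boot phase: the last `n` bytes of the file.
fn suffix_range(n: u64) -> String {
    format!("bytes=-{}", n)
}

/// `Range` header value for an absolute byte span [start, end]
/// (inclusive on both ends, as HTTP byte ranges are).
fn absolute_range(start: u64, end: u64) -> String {
    format!("bytes={}-{}", start, end)
}
```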
### HTTP Requirements
- Server must support `Accept-Ranges: bytes`
- Server must return `206 Partial Content` for range requests
- Server should support multiple ranges in a single request (`multipart/byteranges`)
- Client should use `If-None-Match` with the file's ETag to detect stale caches
### 6.2 TCP Streaming Protocol (Real-Time Access)
For real-time ingest and low-latency queries, RVF defines a binary TCP
protocol over TLS 1.3.
**Connection Setup**
```
1. Client opens TCP connection to server
2. TLS 1.3 handshake (mandatory, no plaintext mode)
3. Client sends HELLO message with protocol version and capabilities
4. Server responds with HELLO_ACK confirming capabilities
5. Connection is ready for messages
```
**Framing**
All messages are length-prefixed:
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 frame_length Payload length (big-endian, max 16 MB)
0x04 1 msg_type Message type (see below)
0x05 3 msg_id Correlation ID (big-endian, wraps at 2^24)
0x08 var payload Message-specific payload
```
Frame length is big-endian (network byte order) for consistency with TLS
framing. The 16 MB maximum prevents a single message from monopolizing the
connection. Payloads larger than 16 MB must be split across multiple messages
using continuation framing (see section 6.4).
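The 8-byte frame header above can be packed and unpacked as follows; a minimal sketch of the framing layer only:

```rust
/// Encode a TCP frame header: big-endian u32 payload length, one
/// msg_type byte, and a 3-byte big-endian msg_id (wraps at 2^24).
fn encode_frame_header(payload_len: u32, msg_type: u8, msg_id: u32) -> [u8; 8] {
    assert!(payload_len <= 16 * 1024 * 1024, "frame exceeds 16 MB limit");
    let mut h = [0u8; 8];
    h[0..4].copy_from_slice(&payload_len.to_be_bytes());
    h[4] = msg_type;
    let id = msg_id & 0x00FF_FFFF; // wraps at 2^24
    h[5..8].copy_from_slice(&id.to_be_bytes()[1..4]);
    h
}

fn decode_frame_header(h: &[u8; 8]) -> (u32, u8, u32) {
    let len = u32::from_be_bytes(h[0..4].try_into().unwrap());
    let msg_type = h[4];
    let msg_id = u32::from_be_bytes([0, h[5], h[6], h[7]]);
    (len, msg_type, msg_id)
}
```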
**Message Types**
```
Client -> Server:
0x01 QUERY Batch query (payload = Batch Query Request)
0x02 INGEST Batch ingest (payload = Batch Ingest Request)
0x03 DELETE Batch delete (payload = Batch Delete Request)
0x04 STATUS Request server status (no payload)
0x05 SUBSCRIBE Subscribe to update notifications
Server -> Client:
0x81 QUERY_RESULT Batch query result
0x82 INGEST_ACK Batch ingest acknowledgment
0x83 DELETE_ACK Batch delete acknowledgment
0x84 STATUS_RESP Server status response
0x85 UPDATE_NOTIFY Push notification of new data
0xFF ERROR Error with code and description
```
**ERROR Message Payload**
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 2 error_code Error code from section 2
0x02 2 description_len Byte length of description string
0x04 var description UTF-8 error description (human-readable)
```
### 6.3 Streaming Ingest Protocol
The TCP protocol supports continuous ingest where the client streams vectors
without waiting for per-batch acknowledgments.
**Flow**
```
Client Server
| |
|--- INGEST (batch 0) ------------->|
|--- INGEST (batch 1) ------------->| Pipelining: send without waiting
|--- INGEST (batch 2) ------------->|
| | Server writes VEC_SEGs, appends manifest
|<--- INGEST_ACK (batch 0) ---------|
|<--- INGEST_ACK (batch 1) ---------|
| | Backpressure: server delays ACK
|--- INGEST (batch 3) ------------->| Client respects window
|<--- INGEST_ACK (batch 2) ---------|
| |
```
**Backpressure**
The server controls ingest rate by delaying INGEST_ACK responses. The client
must limit its in-flight (unacknowledged) ingest messages to a configurable
window size (default: 8 messages). When the window is full, the client must
wait for an ACK before sending the next batch.
The server should send backpressure when:
- Write queue exceeds 80% capacity
- Compaction is falling behind (dead space > 50%)
- Available disk space drops below 10%
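The client-side window can be sketched as a small counter; the struct and method names are illustrative, not part of the protocol:

```rust
/// Client-side ingest window: caps in-flight (unacknowledged) INGEST
/// messages. The default window size in the text is 8.
struct IngestWindow {
    capacity: usize,
    in_flight: usize,
}

impl IngestWindow {
    fn new(capacity: usize) -> Self {
        Self { capacity, in_flight: 0 }
    }

    /// Returns true if another INGEST may be sent now.
    fn try_send(&mut self) -> bool {
        if self.in_flight < self.capacity {
            self.in_flight += 1;
            true
        } else {
            false // window full: must wait for an INGEST_ACK
        }
    }

    /// Called when an INGEST_ACK arrives.
    fn on_ack(&mut self) {
        self.in_flight = self.in_flight.saturating_sub(1);
    }
}
```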
**Commit Semantics**
Each INGEST_ACK contains the `manifest_epoch` after commit. The server
guarantees that all vectors acknowledged with epoch E are visible to any
query that reads the manifest at epoch >= E.
### 6.4 Continuation Framing
For payloads exceeding the 16 MB frame limit:
```
Frame 0: msg_type = original type, flags bit 0 = CONTINUATION_START
Frame 1: msg_type = 0x00 (CONTINUATION), flags bit 0 = 0
Frame 2: msg_type = 0x00 (CONTINUATION), flags bit 0 = 0
Frame N: msg_type = 0x00 (CONTINUATION), flags bit 1 = CONTINUATION_END
```
The receiver reassembles the payload from all continuation frames before
processing. The msg_id is shared across all frames of a continuation sequence.
### 6.5 SUBSCRIBE and UPDATE_NOTIFY
The SUBSCRIBE message registers the client for push notifications when new
data is committed:
```
SUBSCRIBE payload:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 min_epoch Only notify for epochs > this value
0x04 1 notify_flags Bit 0: ingest, Bit 1: delete, Bit 2: compaction
0x05 3 reserved Must be zero
```
The server sends UPDATE_NOTIFY whenever a new MANIFEST_SEG is committed that
matches the subscription criteria:
```
UPDATE_NOTIFY payload:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 epoch New manifest epoch
0x04 1 event_type 0=ingest, 1=delete, 2=compaction
0x05 3 reserved Must be zero
0x08 4 affected_count Number of vectors affected
0x0C 8 new_total Total vector count after event
```
## 7. Compaction Scheduling Policy
Compaction merges small, overlapping, or partially-dead segments into larger,
sealed segments. Because compaction competes with queries and ingest for I/O
bandwidth, the runtime enforces a scheduling policy.
### 7.1 IO Budget
Compaction must consume at most 30% of available IOPS. The runtime measures
IOPS over a 5-second sliding window and throttles compaction I/O to stay
within budget.
```
available_iops = measured_iops_capacity (from benchmarking at startup)
compaction_budget = available_iops * 0.30
compaction_throttle = max(compaction_budget - current_compaction_iops, 0)
```
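The budget formulas above translate directly into code; the 60% figure is the emergency-mode budget from section 7.4:

```rust
/// Compaction I/O throttle from the formulas above. `available_iops`
/// comes from benchmarking at startup; `current` is compaction IOPS
/// measured over the 5-second sliding window.
fn compaction_throttle(available_iops: f64, current: f64, emergency: bool) -> f64 {
    let budget_frac = if emergency { 0.60 } else { 0.30 };
    let budget = available_iops * budget_frac;
    (budget - current).max(0.0)
}
```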
### 7.2 Priority Ordering
When I/O bandwidth is contended, operations are prioritized:
```
Priority 1 (highest): Queries (reads from VEC_SEG, INDEX_SEG, HOT_SEG)
Priority 2: Ingest (writes to VEC_SEG, META_SEG, MANIFEST_SEG)
Priority 3 (lowest): Compaction (reads + writes of sealed segments)
```
Compaction yields to queries and ingest. If a compaction I/O operation would
cause a query to exceed its time budget, the compaction operation is deferred.
### 7.3 Scheduling Triggers
Compaction runs when all of the following conditions are met:
| Condition | Threshold | Rationale |
|-----------|-----------|-----------|
| Query load | < 50% of capacity | Avoid competing with active queries |
| Dead space ratio | > 20% of total file size | Below this, reclaiming space is not worth the I/O |
| Segment count | > 32 active segments | Many small segments hurt read performance |
| Time since last compaction | > 60 seconds | Prevent compaction storms |
The runtime evaluates these conditions every 10 seconds.
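The four triggers combine with a logical AND; a minimal sketch of the check the runtime runs every 10 seconds (parameter names are illustrative):

```rust
/// Evaluate the compaction triggers from the table above.
/// All four conditions must hold for compaction to start.
fn should_compact(
    query_load_frac: f64,  // fraction of query capacity in use
    dead_space_frac: f64,  // dead bytes / total file bytes
    active_segments: u32,
    secs_since_last: u64,
) -> bool {
    query_load_frac < 0.50
        && dead_space_frac > 0.20
        && active_segments > 32
        && secs_since_last > 60
}
```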
### 7.4 Emergency Compaction
If dead space exceeds 70% of total file size, compaction enters emergency mode:
```
Emergency compaction rules:
1. Compaction preempts ingest (ingest is paused, not rejected)
2. IO budget increases to 60% of available IOPS
3. Compaction runs regardless of query load
4. Ingest resumes after dead space drops below 50%
```
During emergency compaction, the server responds to INGEST messages with
delayed ACKs (backpressure) rather than rejecting them. Queries continue to
be served at highest priority.
### 7.5 Compaction Progress Reporting
The STATUS response includes compaction state:
```
STATUS_RESP compaction fields:
Offset Size Field Description
------ ------ ------------------- ----------------------------------------
0x00 1 compaction_state 0=idle, 1=running, 2=emergency
0x01 1 progress_pct Completion percentage (0-100)
0x02 2 reserved Must be zero
0x04 8 dead_bytes Total dead space in bytes
0x0C 8 total_bytes Total file size in bytes
0x14 4 segments_remaining Segments left to compact
0x18 4 segments_completed Segments compacted in current run
0x1C 4 estimated_seconds Estimated time to completion
0x20 4 io_budget_pct Current IO budget percentage (30 or 60)
```
### 7.6 Compaction Segment Selection
The runtime selects segments for compaction using a tiered strategy:
```
1. Tombstoned segments: Always compacted first (reclaim dead space)
2. Small VEC_SEGs: Segments < 1 MB merged into larger segments
3. High-overlap INDEX_SEGs: Index segments covering the same ID range
4. Cold OVERLAY_SEGs: Overlay deltas merged into base segments
```
The compaction output is always a sealed segment (SEALED flag set). Sealed
segments are immutable and can be verified independently.
## 8. STATUS Response Format
The STATUS message provides a snapshot of the server state for monitoring
and diagnostics.
```
STATUS_RESP payload:
Offset Size Field Description
------ ------ ------------------- ----------------------------------------
0x00 4 protocol_version Protocol version (currently 1)
0x04 4 manifest_epoch Current manifest epoch
0x08 8 total_vectors Total vector count
0x10 8 total_segments Total segment count
0x18 8 file_size_bytes Total file size
0x20 4 query_qps Queries per second (last 5s window)
0x24 4 ingest_vps Vectors ingested per second (last 5s window)
0x28 24 compaction Compaction state (see section 7.5)
0x40 1 profile_id Active hardware profile (0x00-0x03)
0x41 1 health 0=healthy, 1=degraded, 2=read_only
0x42 2 reserved Must be zero
0x44 4 uptime_seconds Server uptime
```
## 9. Filter Expression Format
Filter expressions used in batch queries and batch deletes share a common
binary encoding:
```
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 1 op Operator enum (see below)
0x01 2 field_id Metadata field to filter on
0x03 1 value_type Value type (matches metadata field types)
0x04 var value Comparison value
var var children[] Sub-expressions (for AND/OR/NOT)
```
Operator enum:
```
0x00 EQ field == value
0x01 NE field != value
0x02 LT field < value
0x03 LE field <= value
0x04 GT field > value
0x05 GE field >= value
0x06 IN field in [values]
0x07 RANGE field in [low, high)
0x10 AND All children must match
0x11 OR Any child must match
0x12 NOT Negate single child
```
Filters are evaluated during the query scan phase. Vectors that do not match
the filter are excluded from distance computation entirely (pre-filtering) or
from the result set (post-filtering), depending on the runtime's cost model.
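Once decoded, a filter is a small expression tree. A minimal evaluator sketch covering a subset of the operators, with i64 values only and metadata modeled as a field lookup (missing fields failing comparisons is an assumption, not specified above):

```rust
/// A decoded filter expression node (the in-memory shape after
/// parsing the binary encoding; names are illustrative).
enum Filter {
    Eq(u16, i64),
    Lt(u16, i64),
    And(Vec<Filter>),
    Or(Vec<Filter>),
    Not(Box<Filter>),
}

/// Evaluate a filter against one vector's metadata, modeled as a
/// lookup from field_id to an i64 value.
fn eval(f: &Filter, get: &dyn Fn(u16) -> Option<i64>) -> bool {
    match f {
        Filter::Eq(field, v) => get(*field) == Some(*v),
        Filter::Lt(field, v) => get(*field).map_or(false, |x| x < *v),
        Filter::And(cs) => cs.iter().all(|c| eval(c, get)),
        Filter::Or(cs) => cs.iter().any(|c| eval(c, get)),
        Filter::Not(c) => !eval(c, get),
    }
}
```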
## 10. Invariants
1. Error codes are stable across versions; new codes are additive only
2. Batch operations are atomic per-item, not per-batch (partial success is valid)
3. TCP connections are always TLS 1.3; plaintext is not permitted
4. Frame length is big-endian; all other multi-byte fields are little-endian
5. HTTP progressive loading must succeed with at most 7 round trips to become queryable
6. Compaction never runs at more than 60% of available IOPS, even in emergency mode
7. The STATUS response is always available, even during emergency compaction
8. Filter expressions are limited to 64 levels of nesting depth

# RVF WASM Self-Bootstrapping Specification
## 1. Motivation
Traditional file formats require an external runtime to interpret their contents.
A JPEG needs an image decoder. A SQLite database needs the SQLite library. An RVF
file needs a vector search engine.
What if the file carried its own runtime?
By embedding a tiny WASM interpreter inside the RVF file itself, we eliminate the
last external dependency. The host only needs **raw execution capability** — the
ability to run bytes as instructions. RVF becomes **self-bootstrapping**: a single
file that contains both its data and the complete machinery to process that data.
This is the transition from "needs a compatible runtime" to **"runs anywhere
compute exists."**
## 2. Architecture
### The Bootstrap Stack
```
Layer 3: RVF Data Segments (VEC_SEG, INDEX_SEG, MANIFEST_SEG, ...)
^
| processes
|
Layer 2: WASM Microkernel (WASM_SEG, role=Microkernel, ~5.5 KB)
^ 14 exports: query, ingest, distance, top-K
| executes
|
Layer 1: WASM Interpreter (WASM_SEG, role=Interpreter, ~50 KB)
^ Minimal stack machine that runs WASM bytecode
| loads
|
Layer 0: Raw Bytes (The .rvf file on any storage medium)
```
Each layer depends only on the one below it. The host reads Layer 0 (raw bytes),
finds the interpreter at Layer 1, uses it to execute the microkernel at Layer 2,
which then processes the data at Layer 3.
### Segment Layout
```
┌──────────────────────────────────────────────────────────────────────┐
│ bootable.rvf │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌─────────┐ │
│ │ WASM_SEG │ │ WASM_SEG │ │ VEC_SEG │ │ INDEX │ │
│ │ 0x10 │ │ 0x10 │ │ 0x01 │ │ _SEG │ │
│ │ │ │ │ │ │ │ 0x02 │ │
│ │ role=Interp │ │ role=uKernel │ │ 10M vectors │ │ HNSW │ │
│ │ ~50 KB │ │ ~5.5 KB │ │ 384-dim fp16 │ │ L0+L1 │ │
│ │ priority=0 │ │ priority=1 │ │ │ │ │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ └─────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ QUANT_SEG │ │ WITNESS_SEG │ │ MANIFEST_SEG │ ← tail │
│ │ codebooks │ │ audit trail │ │ source of │ │
│ │ │ │ │ │ truth │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
```
## 3. WASM_SEG Wire Format
### Segment Type
```
Value: 0x10
Name: WASM_SEG
```
Uses the standard 64-byte RVF segment header (`SegmentHeader`), followed by
a 64-byte `WasmHeader`, followed by the WASM bytecode.
### WasmHeader (64 bytes)
```
Offset Size Type Field Description
------ ---- ---- ----- -----------
0x00 4 u32 wasm_magic 0x5256574D ("RVWM" big-endian)
0x04 2 u16 header_version Currently 1
0x06 1 u8 role Bootstrap role (see WasmRole enum)
0x07 1 u8 target Target platform (see WasmTarget enum)
0x08 2 u16 required_features WASM feature bitfield
0x0A 2 u16 export_count Number of WASM exports
0x0C 4 u32 bytecode_size Uncompressed bytecode size (bytes)
0x10 4 u32 compressed_size Compressed size (0 = no compression)
0x14 1 u8 compression 0=none, 1=LZ4, 2=ZSTD
0x15 1 u8 min_memory_pages Minimum linear memory (64 KB each)
0x16 1 u8 max_memory_pages Maximum linear memory (0 = no limit)
0x17 1 u8 table_count Number of WASM tables
0x18 32 hash256 bytecode_hash SHAKE-256-256 of uncompressed bytecode
0x38 1 u8 bootstrap_priority Lower = tried first in chain
0x39 1 u8 interpreter_type Interpreter variant (if role=Interpreter)
0x3A 6 u8[6] reserved Must be zero
```
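A sketch of parsing the fixed fields of this header. It assumes multi-byte fields are stored little-endian, consistent with the rest of RVF, so the magic bytes on disk read "RVWM" in order; only a few fields are decoded here:

```rust
/// A few fixed fields of the 64-byte WasmHeader, decoded for illustration.
struct WasmHeader {
    role: u8,
    target: u8,
    required_features: u16,
    bytecode_size: u32,
    bootstrap_priority: u8,
}

fn parse_wasm_header(h: &[u8]) -> Option<WasmHeader> {
    if h.len() < 64 {
        return None; // header is fixed at 64 bytes
    }
    let magic = u32::from_le_bytes(h[0..4].try_into().unwrap());
    if magic != 0x5256574D {
        return None; // not "RVWM"
    }
    Some(WasmHeader {
        role: h[0x06],
        target: h[0x07],
        required_features: u16::from_le_bytes(h[0x08..0x0A].try_into().unwrap()),
        bytecode_size: u32::from_le_bytes(h[0x0C..0x10].try_into().unwrap()),
        bootstrap_priority: h[0x38],
    })
}
```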
### WasmRole Enum
```
Value Name Description
----- ---- -----------
0x00 Microkernel RVF query engine (5.5 KB Cognitum tile runtime)
0x01 Interpreter Minimal WASM interpreter for self-bootstrapping
0x02 Combined Interpreter + microkernel linked together
0x03 Extension Domain-specific module (custom distance, decoder)
0x04 ControlPlane Store management (create, export, segment parsing)
```
### WasmTarget Enum
```
Value Name Description
----- ---- -----------
0x00 Wasm32 Generic wasm32 (any compliant runtime)
0x01 WasiP1 WASI Preview 1 (requires WASI syscalls)
0x02 WasiP2 WASI Preview 2 (component model)
0x03 Browser Browser-optimized (expects Web APIs)
0x04 BareTile Bare-metal Cognitum tile (hub-tile protocol only)
```
### Required Features Bitfield
```
Bit Mask Feature
--- ---- -------
0 0x0001 SIMD (v128 operations)
1 0x0002 Bulk memory operations
2 0x0004 Multi-value returns
3 0x0008 Reference types
4 0x0010 Threads (shared memory)
5 0x0020 Tail call optimization
6 0x0040 GC (garbage collection)
7 0x0080 Exception handling
```
### Interpreter Type (when role=Interpreter)
```
Value Name Description
----- ---- -----------
0x00 StackMachine Generic stack-based interpreter
0x01 Wasm3Compatible wasm3-style (register machine)
0x02 WamrCompatible WAMR-style (AOT + interpreter)
0x03 WasmiCompatible wasmi-style (pure stack machine)
```
## 4. Bootstrap Resolution Protocol
### Discovery
1. Scan all segments for `seg_type == 0x10` (WASM_SEG)
2. Parse the 64-byte WasmHeader from each
3. Validate `wasm_magic == 0x5256574D`
4. Sort by `bootstrap_priority` ascending
### Resolution
```
IF any WASM_SEG has role=Combined:
→ SelfContained bootstrap (single module does everything)
ELIF WASM_SEGs with role=Interpreter AND role=Microkernel both exist:
→ TwoStage bootstrap (interpreter runs microkernel)
ELIF only WASM_SEG with role=Microkernel exists:
→ HostRequired (needs external WASM runtime)
ELSE:
→ No WASM bootstrap available
```
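The resolution rules above can be written as a pure function over the discovered roles; the enum and function names are illustrative (the shipped API calls this `resolve_bootstrap_chain`):

```rust
#[derive(Debug, PartialEq)]
enum Bootstrap {
    SelfContained,
    TwoStage,
    HostRequired,
    None,
}

/// Resolve the bootstrap chain from the roles of discovered WASM_SEGs
/// (already sorted by bootstrap_priority). Role values follow the
/// WasmRole enum: 0x00 Microkernel, 0x01 Interpreter, 0x02 Combined.
fn resolve_bootstrap(roles: &[u8]) -> Bootstrap {
    let has = |r: u8| roles.contains(&r);
    if has(0x02) {
        Bootstrap::SelfContained
    } else if has(0x01) && has(0x00) {
        Bootstrap::TwoStage
    } else if has(0x00) {
        Bootstrap::HostRequired
    } else {
        Bootstrap::None
    }
}
```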
### Execution Sequence (Two-Stage)
```
Host Interpreter Microkernel Data
| | | |
|-- read WASM_SEG[0] --->| | |
| (interpreter bytes) | | |
| | | |
|-- instantiate -------->| | |
| (load into memory) | | |
| | | |
|-- feed WASM_SEG[1] --->|-- instantiate -------->| |
| (microkernel bytes) | (via interpreter) | |
| | | |
|-- LOAD_QUERY --------->|------- forward ------->| |
| | |-- read VEC_SEG -->|
| | |<- vector block ---|
| | | |
| | | rvf_distances() |
| | | rvf_topk_merge() |
| | | |
|<-- TOPK_RESULT --------|<------ return ---------| |
```
## 5. Size Budget
### Microkernel (role=Microkernel)
Already specified in `microkernel/wasm-runtime.md`:
```
Total: ~5,500 bytes (< 8 KB code budget)
Exports: 14 (query path + quantization + HNSW + verification)
Memory: 8 KB data + 64 KB SIMD scratch
```
### Interpreter (role=Interpreter)
Target: minimal WASM bytecode interpreter sufficient to run the microkernel.
```
Component Estimated Size
--------- --------------
WASM binary parser 4 KB
(magic, section parsing)
Type section decoder 1 KB
(function types)
Import/Export resolution 2 KB
Code section interpreter 12 KB
(control flow, locals)
Stack machine engine 8 KB
(operand stack, call stack)
Memory management 3 KB
(linear memory, grow)
i32/i64 integer ops 4 KB
(add, sub, mul, div, rem, shifts)
f32/f64 float ops 6 KB
(add, sub, mul, div, sqrt, conversions)
v128 SIMD ops (optional) 8 KB
(only if WASM_FEAT_SIMD required)
Table + call_indirect 2 KB
----------
Total (no SIMD): ~42 KB
Total (with SIMD): ~50 KB
```
### Combined (role=Combined)
Interpreter linked with microkernel in a single module:
```
Total: ~48-56 KB (interpreter + microkernel, with overlap eliminated)
```
### Self-Bootstrapping Overhead
For a 10M vector file (~7.3 GB at 384-dim fp16):
- Bootstrap overhead: ~56 KB / ~7.3 GB = **0.0008%**
- The file is 99.9992% data, 0.0008% self-sufficient runtime
For a 1000-vector file (~750 KB):
- Bootstrap overhead: ~56 KB / ~750 KB = **7.5%**
- Still practical for edge/IoT deployments
## 6. Execution Tiers (Extended)
The original three-tier model from ADR-030 is extended:
| Tier | Segment | Size | Boot | Self-Bootstrap? |
|------|---------|------|------|-----------------|
| 0: Embedded WASM Interpreter | WASM_SEG (role=Interpreter) | ~50 KB | <5 ms | **Yes** — file carries its own runtime |
| 1: WASM Microkernel | WASM_SEG (role=Microkernel) | 5.5 KB | <1 ms | No — needs host or Tier 0 |
| 2: eBPF | EBPF_SEG | 10-50 KB | <20 ms | No — needs Linux kernel |
| 3: Unikernel | KERNEL_SEG | 200 KB-2 MB | <125 ms | No — needs VMM (Firecracker) |
**Key insight**: Tier 0 makes all other tiers optional. An RVF file with
Tier 0 embedded runs on *any* host that can execute bytes — bare metal,
browser, microcontroller, FPGA with a soft CPU, or even another WASM runtime.
## 7. "Runs Anywhere Compute Exists"
### What This Means
A self-bootstrapping RVF file requires exactly **one capability** from its host:
> The ability to read bytes from storage and execute them as instructions.
That's it. No operating system. No file system. No network stack. No runtime
library. No package manager. No container engine.
### Where It Runs
| Host | How It Works |
|------|-------------|
| **x86 server** | Native WASM runtime (Wasmtime/WAMR) runs microkernel directly |
| **ARM edge device** | Same — native WASM runtime |
| **Browser tab** | `WebAssembly.instantiate()` on the microkernel bytes |
| **Microcontroller** | Embedded interpreter runs microkernel in 64 KB scratch |
| **FPGA soft CPU** | Interpreter mapped to BRAM, microkernel in flash |
| **Another WASM runtime** | Interpreter-in-WASM runs microkernel-in-WASM (turtles) |
| **Bare metal** | Bootloader extracts interpreter, interpreter runs microkernel |
| **TEE enclave** | Enclave loads interpreter, verified via WITNESS_SEG attestation |
### The Bootstrapping Invariant
For any host `H` with execution capability `E`:
```
∀ H, E: can_execute(H, E) ∧ can_read_bytes(H)
→ can_process_rvf(H, self_bootstrapping_rvf_file)
```
The file is a **fixed point** of the execution relation: it contains everything
needed to process itself.
## 8. Security Considerations
### Interpreter Verification
The embedded interpreter's bytecode is hashed with SHAKE-256-256 and stored
in the WasmHeader (`bytecode_hash`). A WITNESS_SEG can chain the interpreter
hash to a trusted build, providing:
- **Provenance**: Who built this interpreter?
- **Integrity**: Has the interpreter been modified?
- **Attestation**: Can a TEE verify the interpreter before execution?
### Sandbox Guarantees
The WASM sandbox model applies at every layer:
- The interpreter cannot access host memory beyond its linear memory
- The microkernel cannot access interpreter memory
- Each layer communicates only through defined exports/imports
- A trapped module cannot corrupt other modules
### Bootstrap Attack Surface
| Attack | Mitigation |
|--------|-----------|
| Malicious interpreter | Verify `bytecode_hash` against known-good hash in WITNESS_SEG |
| Modified microkernel | Interpreter verifies microkernel hash before instantiation |
| Data corruption | Segment-level CRC32C/SHAKE-256 hashes (Law 2) |
| Code injection | WASM validates all code at load time (type checking) |
| Resource exhaustion | `max_memory_pages` cap, epoch-based interruption |
## 9. API
### Rust (rvf-runtime)
```rust
// Embed a WASM module
store.embed_wasm(
role: WasmRole::Microkernel as u8,
target: WasmTarget::Wasm32 as u8,
required_features: WASM_FEAT_SIMD,
wasm_bytecode: &microkernel_bytes,
export_count: 14,
bootstrap_priority: 1,
interpreter_type: 0,
)?;
// Make self-bootstrapping
store.embed_wasm(
role: WasmRole::Interpreter as u8,
target: WasmTarget::Wasm32 as u8,
required_features: 0,
wasm_bytecode: &interpreter_bytes,
export_count: 3,
bootstrap_priority: 0,
interpreter_type: 0x03, // wasmi-compatible
)?;
// Check if file is self-bootstrapping
assert!(store.is_self_bootstrapping());
// Extract all WASM modules (ordered by priority)
let modules = store.extract_wasm_all()?;
```
### WASM (rvf-wasm bootstrap module)
```rust
use rvf_wasm::bootstrap::{resolve_bootstrap_chain, get_bytecode, BootstrapChain};
let chain = resolve_bootstrap_chain(&rvf_bytes);
match chain {
BootstrapChain::SelfContained { combined } => {
let bytecode = get_bytecode(&rvf_bytes, &combined).unwrap();
// Instantiate and run
}
BootstrapChain::TwoStage { interpreter, microkernel } => {
let interp_code = get_bytecode(&rvf_bytes, &interpreter).unwrap();
let kernel_code = get_bytecode(&rvf_bytes, &microkernel).unwrap();
// Load interpreter, then use it to run microkernel
}
_ => { /* use host runtime */ }
}
```
## 10. Relationship to Existing Segments
| Segment | Relationship to WASM_SEG |
|---------|-------------------------|
| KERNEL_SEG (0x0E) | Alternative execution tier — KERNEL_SEG boots a full unikernel, WASM_SEG runs a lightweight microkernel. Both make the file self-executing but at different capability levels. |
| EBPF_SEG (0x0F) | Complementary — eBPF accelerates hot-path queries on Linux hosts while WASM provides universal portability. |
| WITNESS_SEG (0x0A) | Verification — WITNESS_SEG chains can attest the interpreter and microkernel hashes, providing a trust anchor for the bootstrap chain. |
| CRYPTO_SEG (0x0C) | Signing — CRYPTO_SEG key material can sign WASM_SEG contents for tamper detection. |
| MANIFEST_SEG (0x05) | Discovery — the tail manifest references all WASM_SEGs with their roles and priorities. |
## 11. Implementation Status
| Component | Crate | Status |
|-----------|-------|--------|
| `SegmentType::Wasm` (0x10) | `rvf-types` | Implemented |
| `WasmHeader` (64-byte header) | `rvf-types` | Implemented |
| `WasmRole`, `WasmTarget` enums | `rvf-types` | Implemented |
| `write_wasm_seg` | `rvf-runtime` | Implemented |
| `embed_wasm` / `extract_wasm` | `rvf-runtime` | Implemented |
| `extract_wasm_all` (priority-sorted) | `rvf-runtime` | Implemented |
| `is_self_bootstrapping` | `rvf-runtime` | Implemented |
| `resolve_bootstrap_chain` | `rvf-wasm` | Implemented |
| `get_bytecode` (zero-copy extraction) | `rvf-wasm` | Implemented |
| Embedded interpreter (wasmi-based) | `rvf-wasm` | Future |
| Combined interpreter+microkernel build | `rvf-wasm` | Future |
# RVF Wire Format Reference
## 1. File Structure
An RVF file is a byte stream with no fixed header at offset 0. All structure
is discovered from the tail.
```
Byte 0 EOF
| |
v v
+--------+--------+--------+ +--------+---------+--------+---------+
| Seg 0 | Seg 1 | Seg 2 | ... | Seg N | Seg N+1 | Seg N+2| Mfst K |
| VEC | VEC | INDEX | | VEC | HOT | INDEX | MANIF |
+--------+--------+--------+ +--------+---------+--------+---------+
^ ^
| |
Level 1 Mfst |
Level 0
(last 4KB)
```
### Alignment Rule
Every segment starts at a **64-byte aligned** boundary. If a segment's
payload + footer does not end on a 64-byte boundary, zero-padding is inserted
before the next segment header.
### Byte Order
All multi-byte integers are **little-endian**. All floating-point values
are IEEE 754 little-endian. This matches x86, ARM (in default mode), and
WASM native byte order.
## 2. Primitive Types
```
Type Size Encoding
---- ---- --------
u8 1 Unsigned 8-bit integer
u16 2 Unsigned 16-bit little-endian
u32 4 Unsigned 32-bit little-endian
u64 8 Unsigned 64-bit little-endian
i32 4 Signed 32-bit little-endian (two's complement)
i64 8 Signed 64-bit little-endian (two's complement)
f16 2 IEEE 754 half-precision little-endian
f32 4 IEEE 754 single-precision little-endian
f64 8 IEEE 754 double-precision little-endian
varint 1-10 LEB128 unsigned variable-length integer
svarint 1-10 ZigZag + LEB128 signed variable-length integer
hash128 16 First 128 bits of hash output
hash256 32 First 256 bits of hash output
```
### Varint Encoding (LEB128)
```
Value 0-127: 1 byte [0xxxxxxx]
Value 128-16383: 2 bytes [1xxxxxxx 0xxxxxxx]
Value 16384-2097151: 3 bytes [1xxxxxxx 1xxxxxxx 0xxxxxxx]
...up to 10 bytes for u64
```
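A reference encoder/decoder for this scheme, as a Python sketch (not taken from any RVF crate):

```python
def encode_varint(value: int) -> bytes:
    """LEB128 unsigned: 7 payload bits per byte, MSB = continuation."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte has MSB clear
            return bytes(out)

def decode_varint(buf: bytes, pos: int = 0):
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7
```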
### Delta Encoding
Sequences of sorted integers use delta encoding:
```
Original: [100, 105, 108, 120, 200]
Deltas: [100, 5, 3, 12, 80]
Encoded: [varint(100), varint(5), varint(3), varint(12), varint(80)]
```
With restart points every N entries, the first value in each restart group
is absolute (not delta-encoded).
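In Python, the restart-group scheme looks like this (a sketch; varint packing of the resulting deltas is omitted):

```python
def delta_encode(ids, restart_interval=4):
    """Delta-encode a sorted ID list; every restart_interval-th entry
    is stored absolute so a reader can start decoding mid-stream."""
    out, prev = [], 0
    for i, v in enumerate(ids):
        if i % restart_interval == 0:
            out.append(v)         # restart point: absolute value
        else:
            out.append(v - prev)  # otherwise: delta from previous
        prev = v
    return out

def delta_decode(deltas, restart_interval=4):
    out, prev = [], 0
    for i, d in enumerate(deltas):
        v = d if i % restart_interval == 0 else prev + d
        out.append(v)
        prev = v
    return out
```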
## 3. Segment Header (64 bytes)
```
Offset Type Field Notes
------ ---- ----- -----
0x00    u8[4]   magic            Bytes "RVFS" (0x52 0x56 0x46 0x53)
0x04 u8 version Format version (1)
0x05 u8 seg_type Segment type enum
0x06 u16 flags See flags bitfield
0x08 u64 segment_id Monotonic ordinal
0x10 u64 payload_length Bytes after header, before footer
0x18 u64 timestamp_ns UNIX nanoseconds
0x20 u8 checksum_algo 0=CRC32C, 1=XXH3-128, 2=SHAKE-256
0x21 u8 compression 0=none, 1=LZ4, 2=ZSTD, 3=custom
0x22 u16 reserved_0 Must be 0x0000
0x24 u32 reserved_1 Must be 0x00000000
0x28 hash128 content_hash Payload hash (first 128 bits)
0x38 u32 uncompressed_len Original payload size (0 if no compression)
0x3C u32 alignment_pad Zero padding to 64B boundary
```
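The layout can be cross-checked with Python's `struct` module; with the `<` (little-endian, packed) prefix the fields land at exactly the offsets above. An illustrative sketch, not a reference implementation:

```python
import struct

# "<": little-endian, no implicit padding, so offsets match the table
SEG_HEADER = struct.Struct("<4sBBHQQQBBHI16sII")

def pack_header(seg_type, segment_id, payload_len, timestamp_ns,
                content_hash, flags=0, checksum_algo=1, compression=0):
    return SEG_HEADER.pack(
        b"RVFS",        # magic
        1,              # version
        seg_type,       # e.g. 0x01 = VEC_SEG
        flags,
        segment_id,
        payload_len,
        timestamp_ns,
        checksum_algo,  # 1 = XXH3-128
        compression,    # 0 = none
        0,              # reserved_0
        0,              # reserved_1
        content_hash,   # first 128 bits of the payload hash
        0,              # uncompressed_len (0 = not compressed)
        0,              # alignment_pad
    )
```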
### Segment Type Enum
```
0x00 INVALID Not a valid segment
0x01 VEC_SEG Vector payloads
0x02 INDEX_SEG HNSW adjacency
0x03 OVERLAY_SEG Graph overlay deltas
0x04 JOURNAL_SEG Metadata mutations
0x05 MANIFEST_SEG Segment directory
0x06 QUANT_SEG Quantization dictionaries
0x07 META_SEG Key-value metadata
0x08 HOT_SEG Temperature-promoted data
0x09 SKETCH_SEG Access counter sketches
0x0A WITNESS_SEG Capability manifests
0x0B PROFILE_SEG Domain profile declarations
0x0C CRYPTO_SEG Key material / certificate anchors
0x0D            reserved
0x0E  KERNEL_SEG    Unikernel boot image
0x0F  EBPF_SEG      eBPF hot-path programs
0x10  WASM_SEG      Embedded WASM modules
0x11-0xEF       reserved
0xF0-0xFF extension Implementation-specific
```
### Flags Bitfield
```
Bit Mask Name Meaning
--- ---- ---- -------
0 0x0001 COMPRESSED Payload compressed per compression field
1 0x0002 ENCRYPTED Payload encrypted (key in CRYPTO_SEG)
2 0x0004 SIGNED Signature footer follows payload
3 0x0008 SEALED Immutable (compaction output)
4 0x0010 PARTIAL Partial/streaming write
5 0x0020 TOMBSTONE Logically deletes prior segment
6 0x0040 HOT Contains hot-tier data
7 0x0080 OVERLAY Contains overlay/delta data
8 0x0100 SNAPSHOT Full snapshot (not delta)
9 0x0200 CHECKPOINT Safe rollback point
10-15 reserved Must be zero
```
## 4. Signature Footer
Present only if `SIGNED` flag is set. Follows immediately after the payload.
```
Offset Type Field Notes
------ ---- ----- -----
0x00 u16 sig_algo 0=Ed25519, 1=ML-DSA-65, 2=SLH-DSA-128s
0x02 u16 sig_length Signature byte length
0x04 u8[] signature Signature bytes
var u32 footer_length Total footer size (for backward scan)
```
### Signature Algorithm Sizes
| Algorithm | sig_length | Post-Quantum | Performance |
|-----------|-----------|-------------|-------------|
| Ed25519 | 64 B | No | ~76,000 sign/s |
| ML-DSA-65 | 3,309 B | Yes (NIST Level 3) | ~4,500 sign/s |
| SLH-DSA-128s | 7,856 B | Yes (NIST Level 1) | ~350 sign/s |
## 5. VEC_SEG Payload Layout
Vector segments store blocks of vectors in columnar layout for compression.
```
+------------------------------------------+
| VEC_SEG Payload |
+------------------------------------------+
| Block Directory |
| block_count: u32 |
| For each block: |
| block_offset: u32 (from payload start)|
| vector_count: u32 |
| dim: u16 |
| dtype: u8 |
| tier: u8 |
| [64B aligned] |
+------------------------------------------+
| Block 0 |
| +-- Columnar Vectors --+ |
| | dim_0[0..count] | <- all vals |
| | dim_1[0..count] | for dim 0 |
| | ... | then dim 1 |
| | dim_D[0..count] | etc. |
| +----------------------+ |
| +-- ID Map --+ |
| | encoding: u8 (0=raw, 1=delta-varint) |
| | restart_interval: u16 |
| | id_count: u32 |
| | [restart_offsets: u32[]] (if delta) |
| | [ids: encoded] |
| +-----------+ |
| +-- Block CRC --+ |
| | crc32c: u32 | |
| +----------------+ |
| [64B padding] |
+------------------------------------------+
| Block 1 |
| ... |
+------------------------------------------+
```
### Data Type Enum
```
0x00 f32 32-bit float
0x01 f16 16-bit float
0x02 bf16 bfloat16
0x03 i8 signed 8-bit integer (scalar quantized)
0x04 u8 unsigned 8-bit integer
0x05 i4 4-bit integer (packed, 2 per byte)
0x06 binary 1-bit (packed, 8 per byte)
0x07 pq Product-quantized codes
0x08 custom Custom encoding (see QUANT_SEG)
```
### Columnar vs Interleaved
**VEC_SEG** (columnar): `dim_0[all], dim_1[all], ..., dim_D[all]`
- Better compression (similar values adjacent)
- Better for batch operations
- Worse for single-vector random access
**HOT_SEG** (interleaved): `vec_0[all_dims], vec_1[all_dims], ...`
- Better for single-vector access (one cache line per vector)
- Better for top-K refinement (sequential scan)
- No compression benefit
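The two layouts are transposes of each other; a Python sketch of the conversion:

```python
def to_columnar(vectors):
    """Transpose row-major vectors [[v0d0, v0d1, ...], ...] into the
    VEC_SEG columnar layout [[v0d0, v1d0, ...], [v0d1, v1d1, ...], ...]."""
    return [list(col) for col in zip(*vectors)]

def to_interleaved(columns):
    """Inverse transform, back to the HOT_SEG row-major layout."""
    return [list(row) for row in zip(*columns)]
```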
## 6. INDEX_SEG Payload Layout
```
+------------------------------------------+
| INDEX_SEG Payload |
+------------------------------------------+
| Index Header |
| index_type: u8 (0=HNSW, 1=IVF, 2=flat)|
| layer_level: u8 (A=0, B=1, C=2) |
| M: u16 (HNSW max neighbors per layer) |
| ef_construction: u32 |
| node_count: u64 |
| [64B aligned] |
+------------------------------------------+
| Restart Point Index |
| restart_interval: u32 |
| restart_count: u32 |
| [restart_offset: u32] * count |
| [64B aligned] |
+------------------------------------------+
| Adjacency Data |
| For each node (sorted by node_id): |
| layer_count: varint |
| For each layer: |
| neighbor_count: varint |
| [delta_neighbor_id: varint] * cnt |
| [64B padding per restart group] |
+------------------------------------------+
| Prefetch Hints (optional) |
| hint_count: u32 |
| For each hint: |
| node_range_start: u64 |
| node_range_end: u64 |
| page_offset: u64 |
| page_count: u32 |
| prefetch_ahead: u32 |
| [64B aligned] |
+------------------------------------------+
```
## 7. HOT_SEG Payload Layout
The hot segment stores the most-accessed vectors in interleaved (row-major)
layout with their neighbor lists co-located for cache locality.
```
+------------------------------------------+
| HOT_SEG Payload |
+------------------------------------------+
| Hot Header |
| vector_count: u32 |
| dim: u16 |
| dtype: u8 (f16 or i8) |
| neighbor_M: u16 |
| [64B aligned] |
+------------------------------------------+
| Interleaved Hot Data |
| For each hot vector: |
| vector_id: u64 |
| vector: [dtype * dim] |
| neighbor_count: u16 |
| [neighbor_id: u64] * neighbor_count |
| [64B aligned per entry] |
+------------------------------------------+
```
Each hot entry is self-contained: vector + neighbors in one contiguous block.
A sequential scan of the HOT_SEG for top-K refinement reads vectors and
neighbors without any pointer chasing.
### Hot Entry Size Example
For 384-dim fp16 vectors with M=16 neighbors:
```
8 (id) + 768 (vector) + 2 (count) + 128 (neighbors) = 906 bytes
Padded to 64B: 960 bytes per entry
```
1000 hot vectors = 960 KB (fits in L2 cache on most CPUs).
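The arithmetic above generalizes to a one-line formula (a sketch; the helper name is illustrative):

```python
def hot_entry_size(dim: int, dtype_bytes: int, neighbor_count: int) -> int:
    """Bytes per HOT_SEG entry: u64 id + vector + u16 neighbor count +
    u64 neighbor IDs, rounded up to the next 64-byte boundary."""
    raw = 8 + dim * dtype_bytes + 2 + 8 * neighbor_count
    return (raw + 63) // 64 * 64
```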
## 8. MANIFEST_SEG Payload Layout
```
+------------------------------------------+
| MANIFEST_SEG Payload |
+------------------------------------------+
| TLV Records (Level 1 manifest) |
| For each record: |
| tag: u16 |
| length: u32 |
| pad: u16 (to 8B alignment) |
| value: [u8; length] |
| [8B aligned] |
+------------------------------------------+
| Level 0 Root Manifest (last 4096 bytes) |
| (See 02-manifest-system.md for layout) |
+------------------------------------------+
```
## 9. SKETCH_SEG Payload Layout
```
+------------------------------------------+
| SKETCH_SEG Payload |
+------------------------------------------+
| Sketch Header |
| block_count: u32 |
| width: u32 (counters per row) |
| depth: u32 (hash functions) |
| counter_bits: u8 (8 or 16) |
| decay_shift: u8 (aging right-shift) |
| total_accesses: u64 |
| [64B aligned] |
+------------------------------------------+
| Sketch Data |
| For each block: |
| block_id: u32 |
|     counters: [u8 or u16; width * depth]|
| [64B aligned per block] |
+------------------------------------------+
```
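The header fields map directly onto a count-min sketch with saturating counters and shift-based aging. A Python sketch under those parameters (the hash construction here is illustrative; the on-disk format does not mandate one):

```python
import hashlib

class CountMinSketch:
    """Count-min sketch: `width` counters per row, `depth` hash rows,
    saturating 8-bit counters, aging via right-shift (decay_shift)."""

    def __init__(self, width=256, depth=4, counter_max=255):
        self.width, self.depth, self.counter_max = width, depth, counter_max
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key: int, row: int) -> int:
        # One independent hash per row, derived via a per-row salt
        h = hashlib.blake2b(key.to_bytes(8, "little"), digest_size=8,
                            salt=row.to_bytes(4, "little")).digest()
        return int.from_bytes(h, "little") % self.width

    def record(self, key: int):
        for r in range(self.depth):
            i = self._index(key, r)
            if self.rows[r][i] < self.counter_max:  # saturating increment
                self.rows[r][i] += 1

    def estimate(self, key: int) -> int:
        # Minimum across rows bounds the overestimate from collisions
        return min(self.rows[r][self._index(key, r)] for r in range(self.depth))

    def decay(self, shift: int = 1):
        """Aging: right-shift every counter by decay_shift bits."""
        for row in self.rows:
            for i in range(len(row)):
                row[i] >>= shift
```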
## 10. QUANT_SEG Payload Layout
```
+------------------------------------------+
| QUANT_SEG Payload |
+------------------------------------------+
| Quant Header |
| quant_type: u8 |
| 0 = scalar (min-max per dim) |
| 1 = product quantization |
| 2 = binary threshold |
| 3 = residual PQ |
| tier: u8 |
| dim: u16 |
| [64B aligned] |
+------------------------------------------+
| Type-specific data: |
| |
| Scalar (type 0): |
| min: [f32; dim] |
| max: [f32; dim] |
| |
| PQ (type 1): |
| M: u16 (subspaces) |
| K: u16 (centroids per sub) |
| sub_dim: u16 (dims per sub) |
| codebook: [f32; M * K * sub_dim] |
| |
| Binary (type 2): |
| threshold: [f32; dim] |
| |
| Residual PQ (type 3): |
| coarse_centroids: [f32; K_coarse * dim]|
| residual_codebook: [f32; M * K * sub] |
| |
| [64B aligned] |
+------------------------------------------+
```
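For the scalar (type 0) scheme, a minimal Python encode/decode sketch: only the per-dimension `min`/`max` arrays need to be persisted in the QUANT_SEG, while the u8 codes live in VEC_SEG blocks.

```python
def scalar_quantize(vectors, dim):
    """Per-dimension min-max scalar quantization: each f32 maps to a
    u8 code in [0, 255]."""
    mins = [min(v[d] for v in vectors) for d in range(dim)]
    maxs = [max(v[d] for v in vectors) for d in range(dim)]
    codes = []
    for v in vectors:
        row = []
        for d in range(dim):
            span = maxs[d] - mins[d] or 1.0  # guard constant dimensions
            row.append(round((v[d] - mins[d]) / span * 255))
        codes.append(row)
    return mins, maxs, codes

def scalar_dequantize(mins, maxs, codes):
    """Approximate reconstruction from u8 codes and the stored ranges."""
    return [[mins[d] + c[d] / 255 * (maxs[d] - mins[d])
             for d in range(len(mins))] for c in codes]
```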
## 11. Checksum Algorithms
| ID | Algorithm | Output | Speed (HW accel) | Use Case |
|----|-----------|--------|-------------------|----------|
| 0 | CRC32C | 4 B (stored in 16B field, zero-padded) | ~3 GB/s (SSE4.2) | Per-block integrity |
| 1 | XXH3-128 | 16 B | ~50 GB/s (AVX2) | Segment content hash |
| 2 | SHAKE-256 | 16 or 32 B | ~1 GB/s | Cryptographic verification |
Default recommendation:
- Block-level CRC: CRC32C (fastest, hardware accelerated)
- Segment content hash: XXH3-128 (fast, good distribution)
- Crypto witness hashes: SHAKE-256 (post-quantum safe)
## 12. Compression
| ID | Algorithm | Ratio | Decompress Speed | Use Case |
|----|-----------|-------|-----------------|----------|
| 0 | None | 1.0x | N/A | Hot tier |
| 1 | LZ4 | 1.5-3x | ~4 GB/s | Warm tier, low latency |
| 2 | ZSTD | 3-6x | ~1.5 GB/s | Cold tier, high ratio |
| 3 | Custom | Varies | Varies | Domain-specific |
Compression is applied per-segment payload. Individual blocks within a
segment share the same compression.
## 13. Tail Scan Algorithm
```python
import os

def find_latest_manifest(file):
    file_size = file.seek(0, os.SEEK_END)
    # Fast path: the Level 0 root manifest occupies the last 4096 bytes
    if file_size >= 4096:
        file.seek(file_size - 4096)
        root = file.read(4096)
        if root[0:4] == b'RVM0' and verify_crc(root):
            return parse_root_manifest(root)
    # Slow path: scan backward for a MANIFEST_SEG header
    scan_pos = (file_size // 64 - 1) * 64   # last 64B-aligned boundary
    while scan_pos >= 0:
        file.seek(scan_pos)
        header = file.read(64)
        if (header[0:4] == b'RVFS' and
                header[5] == 0x05 and       # seg_type == MANIFEST_SEG
                verify_segment_header(header)):
            return parse_manifest_segment(file, scan_pos)
        scan_pos -= 64                      # previous 64B boundary
    raise CorruptFileError("No valid MANIFEST_SEG found")
```
Worst case: full backward scan at 64B granularity. For a 4 GB file, this is
67M checks — but each check is a 4-byte comparison, so it completes in ~100ms
on a modern CPU with mmap. In practice, the fast path succeeds on the first try
for non-corrupt files.