
<p align="center">
  <strong>RVF Examples</strong> &mdash; Learn by Running
</p>

<p align="center">
  <em>Hands-on examples for the unified agentic AI format &mdash; store it, send it, run it</em>
</p>

<p align="center">
  <a href="#quick-start">Quick Start</a> &bull;
  <a href="#examples-at-a-glance">Examples</a> &bull;
  <a href="#features-covered">Features</a> &bull;
  <a href="#performance">Performance</a> &bull;
  <a href="#comparison">Comparison</a>
</p>

<p align="center">
  <img alt="Examples" src="https://img.shields.io/badge/examples-40_runnable-brightgreen?style=flat-square" />
  <img alt="Rust" src="https://img.shields.io/badge/rust-1.87%2B-orange?style=flat-square" />
  <img alt="License" src="https://img.shields.io/badge/license-MIT%2FApache--2.0-blue?style=flat-square" />
  <img alt="Tests" src="https://img.shields.io/badge/tests-453_passing-brightgreen?style=flat-square" />
  <img alt="no_std" src="https://img.shields.io/badge/no__std-compatible-green?style=flat-square" />
  <img alt="Crates" src="https://img.shields.io/badge/crates-13-blue?style=flat-square" />
</p>

---

## What is RVF?

**RVF (RuVector Format)** is the unified agentic AI file format. One `.rvf` file does three jobs:

1. **Store** &mdash; vectors, indexes, metadata, and cryptographic proofs live in one file. No database server required.
2. **Transfer** &mdash; the same file streams over a network. Query, insert, and delete operations work over the wire with zero conversion.
3. **Run** &mdash; pack model weights, graph neural networks, WASM code, or even a bootable OS kernel into the file. Now it's not just data &mdash; it's a self-contained intelligence unit you can deploy anywhere.

### Why does this matter?

Today, an AI agent's state is scattered: embeddings in one database, model weights in another, graph structure in a third, config in a fourth. Nothing talks to anything else. Moving between tools means re-indexing from scratch. There's no standard way to prove any of it was computed securely &mdash; and no way to hand an agent its complete knowledge as a single portable artifact.

RVF solves this. It gives agentic AI a **universal substrate** &mdash; one file that works everywhere:

| What it does | Where it runs | What you get |
|-------------|--------------|-------------|
| Stores vectors | Server (HNSW index) | Sub-millisecond search over millions of vectors |
| Stores vectors | Browser (5.5 KB WASM) | Same file, no backend needed |
| Stores vectors | Edge / IoT / mobile | Lightweight API, tiny footprint |
| Transfers data | Over the network | Batched query/ingest/delete via TCP |
| Runs code | Inside a TEE | Cryptographic proof of secure computation |
| Runs code | Bare metal / VM | File boots itself as a microservice |
| Runs code | Linux kernel (eBPF) | Sub-microsecond hot-path acceleration |
| Runs intelligence | Anywhere | Model + data + graph + trust chain in one file |

### Key properties

- **Crash-safe** &mdash; no write-ahead log needed; if power dies mid-write, the file stays consistent
- **Self-describing** &mdash; the schema is in the file; no external catalog required
- **Progressive loading** &mdash; start answering queries before the full index is loaded
- **Domain profiles** &mdash; `.rvdna` for genomics, `.rvtext` for language, `.rvgraph` for networks, `.rvvis` for vision &mdash; same format underneath
- **Lineage tracking** &mdash; every derived file records its parent's hash, like DNA inheritance
- **Tamper-evident** &mdash; witness chains and post-quantum signatures prove nothing was altered

These examples walk you through every major feature, from the simplest "insert and query" to wire format inspection, witness chains, and sealed cognitive engines.

### What you can build with RVF

| Use case | What goes in the file | Result |
|----------|----------------------|--------|
| **Semantic search** | Vectors + HNSW index | Single-file vector database, no server needed |
| **Agent memory** | Vectors + metadata + witness chain | Portable, auditable AI agent knowledge base |
| **Sealed LoRA distribution** | Base embeddings + OVERLAY_SEG adapter deltas | Ship fine-tuned models as one versioned file |
| **Portable graph intelligence** | Node embeddings + GRAPH_SEG adjacency | GNN state that transfers between systems |
| **Self-booting AI service** | Vectors + index + KERNEL_SEG unikernel | File boots as a microservice on bare metal or Firecracker |
| **Kernel-accelerated cache** | Hot vectors + EBPF_SEG XDP program | Sub-microsecond lookups in the Linux kernel data path |
| **Confidential AI** | Any of the above + TEE attestation | Cryptographic proof everything ran inside a secure enclave |
| **Genomic analysis** | DNA k-mer embeddings + variant tensors | `.rvdna` file with lineage tracking across analysis pipeline |
| **Firmware-style AI versioning** | Full cognitive state + lineage chain | Parent &rarr; child derivation with hash verification, like DNA |

---

## Quick Start

```bash
# Clone the repo
git clone https://github.com/ruvnet/ruvector
cd ruvector/examples/rvf

# Run your first example
cargo run --example basic_store
```

That's it. You'll see a store created, 100 vectors inserted, nearest neighbors found, and persistence verified &mdash; all in under a second.

### Using the CLI

You can also work with RVF stores from the command line without writing any Rust:

```bash
# Build the CLI (run from the repository root, not from examples/rvf)
cd crates/rvf && cargo build -p rvf-cli

# Create a store, ingest data, and query
rvf create vectors.rvf --dimension 384
rvf ingest vectors.rvf --input data.json --format json
rvf query vectors.rvf --vector "0.1,0.2,..." --k 10
rvf status vectors.rvf
rvf inspect vectors.rvf
rvf compact vectors.rvf

# Derive a child store with lineage tracking
rvf derive parent.rvf child.rvf --type filter

# All commands support --json for machine-readable output
rvf status vectors.rvf --json
```

<details>
<summary><strong>Run All 40 Examples</strong></summary>

**Core (6):**
```bash
cargo run --example basic_store          # Store lifecycle + k-NN
cargo run --example progressive_index    # Three-layer HNSW recall
cargo run --example quantization         # Scalar / product / binary
cargo run --example wire_format          # Raw segment I/O
cargo run --example crypto_signing       # Ed25519 + witness chains
cargo run --example filtered_search      # Metadata-filtered queries
```

**Agentic AI (6):**
```bash
cargo run --example agent_memory         # Persistent agent memory + witness audit
cargo run --example swarm_knowledge      # Multi-agent shared knowledge base
cargo run --example reasoning_trace      # Chain-of-thought with lineage derivation
cargo run --example tool_cache           # Tool call result cache with TTL
cargo run --example agent_handoff        # Transfer agent state between instances
cargo run --example experience_replay    # RL experience replay buffer
```

**Practical Production (5):**
```bash
cargo run --example semantic_search      # Document search with metadata filters
cargo run --example recommendation       # Item recommendations (collaborative filtering)
cargo run --example rag_pipeline         # Retrieval-augmented generation pipeline
cargo run --example embedding_cache      # LRU cache with temperature tiering
cargo run --example dedup_detector       # Near-duplicate detection + compaction
```

**Vertical Domains (4):**
```bash
cargo run --example genomic_pipeline     # DNA k-mer search (.rvdna profile)
cargo run --example financial_signals    # Market signals with TEE attestation
cargo run --example medical_imaging      # Radiology search (.rvvis profile)
cargo run --example legal_discovery      # Legal doc similarity (.rvtext profile)
```

**Exotic Capabilities (5):**
```bash
cargo run --example self_booting         # RVF with embedded unikernel
cargo run --example ebpf_accelerator     # eBPF hot-path acceleration
cargo run --example hyperbolic_taxonomy  # Hierarchy-aware search
cargo run --example multimodal_fusion    # Cross-modal text + image search
cargo run --example sealed_engine        # Full cognitive engine (capstone)
```

**Runtime Targets (4) + Postgres (1):**
```bash
cargo run --example browser_wasm         # Browser-side WASM vector search
cargo run --example edge_iot             # IoT device with binary quantization
cargo run --example serverless_function  # Cold-start optimized for Lambda
cargo run --example ruvllm_inference     # LLM KV cache + LoRA via RVF
cargo run --example postgres_bridge      # PostgreSQL ↔ RVF export/import
```

**Network & Security (4):**
```bash
cargo run --example network_sync         # Peer-to-peer vector store sync
cargo run --example tee_attestation      # TEE attestation + sealed keys
cargo run --example access_control       # Role-based vector access control
cargo run --example zero_knowledge       # Zero-knowledge proof integration
```

**Autonomous Agent (1):**
```bash
cargo run --example ruvbot               # Autonomous RVF-powered agent bot
```

**POSIX & Systems (3):**
```bash
cargo run --example posix_fileops        # POSIX file operations with RVF
cargo run --example linux_microkernel    # Linux microkernel distribution
cargo run --example mcp_in_rvf           # MCP server embedded in RVF
```

**Network Operations (1):**
```bash
cargo run --example network_interfaces   # Network OS telemetry (60 interfaces)
```

</details>

### Prerequisites

- **Rust 1.87+** &mdash; install via [rustup](https://rustup.rs/)
- No other dependencies needed &mdash; everything builds from source
- All examples use deterministic pseudo-random data, so results are reproducible across runs

---

<details>
<summary><strong>Examples at a Glance (40 examples)</strong></summary>

### Core

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 1 | basic_store | Beginner | Create, insert, query, persist, reopen |
| 2 | progressive_index | Intermediate | Three-layer HNSW, recall measurement |
| 3 | quantization | Intermediate | Scalar/product/binary quantization, tiering |
| 4 | wire_format | Advanced | Raw segment I/O, hash validation, tail-scan |
| 5 | crypto_signing | Advanced | Ed25519 signing, witness chains, tamper detection |
| 6 | filtered_search | Intermediate | Metadata filters: Eq, Range, AND/OR/IN |

### Agentic AI

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 7 | agent_memory | Intermediate | Persistent agent memory, session recall, witness audit |
| 8 | swarm_knowledge | Intermediate | Multi-agent shared knowledge, cross-agent search |
| 9 | reasoning_trace | Advanced | Chain-of-thought lineage (parent &rarr; child &rarr; grandchild) |
| 10 | tool_cache | Intermediate | Tool call caching, TTL, delete_by_filter, compaction |
| 11 | agent_handoff | Advanced | Transfer agent state, derive clone, lineage verification |
| 12 | experience_replay | Intermediate | RL replay buffer, priority sampling, tiering |

### Practical Production

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 13 | semantic_search | Beginner | Document search engine, 4 filter workflows |
| 14 | recommendation | Intermediate | Collaborative filtering, genre/quality filters |
| 15 | rag_pipeline | Advanced | 5-step RAG: chunk, embed, retrieve, rerank, assemble |
| 16 | embedding_cache | Advanced | Zipf access patterns, 3-tier quantization, memory savings |
| 17 | dedup_detector | Intermediate | Near-duplicate detection, clustering, compaction |

### Vertical Domains

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 18 | genomic_pipeline | Advanced | DNA k-mer search, `.rvdna` profile, lineage |
| 19 | financial_signals | Advanced | Market signals, Ed25519 signing, attestation |
| 20 | medical_imaging | Intermediate | Radiology search, `.rvvis` profile, audit trail |
| 21 | legal_discovery | Intermediate | Legal similarity, `.rvtext` profile, discovery audit |

### Exotic Capabilities

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 22 | self_booting | Advanced | Embed/extract unikernel, kernel header verification |
| 23 | ebpf_accelerator | Advanced | Embed/extract eBPF, XDP program, co-existence |
| 24 | hyperbolic_taxonomy | Intermediate | Hierarchy-aware embeddings, depth-filtered search |
| 25 | multimodal_fusion | Intermediate | Cross-modal text+image search, modality filtering |
| 26 | sealed_engine | Advanced | Capstone: vectors + kernel + eBPF + witness + lineage |

### Runtime Targets + Postgres

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 27 | browser_wasm | Intermediate | WASM-compatible API, raw wire segments, size targets |
| 28 | edge_iot | Beginner | Constrained device, binary quantization, memory budget |
| 29 | serverless_function | Intermediate | Cold start, manifest tail-scan, progressive loading |
| 30 | ruvllm_inference | Advanced | KV cache + LoRA adapters + policy store via RVF |
| 31 | postgres_bridge | Intermediate | PG export/import, offline query, lineage, witness audit |

### Network & Security

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 32 | network_sync | Advanced | Peer-to-peer sync, vector exchange, conflict resolution |
| 33 | tee_attestation | Advanced | TEE platform attestation, sealed keys, computation proof |
| 34 | access_control | Intermediate | Role-based access, permission checks, audit trails |
| 35 | zero_knowledge | Advanced | ZK proofs for vector operations, privacy-preserving search |

### Autonomous Agent

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 36 | ruvbot | Advanced | Autonomous agent with RVF memory, planning, tool use |

### POSIX & Systems

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 37 | posix_fileops | Intermediate | Raw I/O, atomic rename, locking, segment random access |
| 38 | linux_microkernel | Advanced | Package management, SSH keys, kernel embed, lineage updates |
| 39 | mcp_in_rvf | Advanced | MCP server runtime embedded in RVF, eBPF filter, tools |

### Network Operations

| # | Example | Difficulty | What You'll Learn |
|---|---------|-----------|-------------------|
| 40 | network_interfaces | Intermediate | Multi-chassis telemetry, anomaly detection, filtered queries |

</details>

---

<details>
<summary><strong>Features Covered</strong></summary>

### Storage &mdash; vectors in, answers out

| Feature | Example | Description |
|---------|---------|-------------|
| k-NN Search | basic_store | Find nearest neighbors by L2 or cosine distance |
| Persistence | basic_store | Close a store, reopen it, verify results match |
| Metadata Filters | filtered_search | Eq, Ne, Gt, Lt, Range, In, And, Or expressions |
| Combined Filters | filtered_search | Multi-condition queries (category + score range) |

### Indexing &mdash; speed vs. accuracy trade-offs

| Feature | Example | Description |
|---------|---------|-------------|
| Progressive Indexing | progressive_index | Three-tier HNSW: Layer A (fast), B (better), C (best) |
| Recall Measurement | progressive_index | Compare approximate results against brute-force ground truth |

### Compression &mdash; fit more vectors in less memory

| Feature | Example | Description |
|---------|---------|-------------|
| Scalar Quantization | quantization | fp32 &rarr; u8 (4x compression, Hot tier) |
| Product Quantization | quantization | fp32 &rarr; PQ codes (8-32x compression, Warm tier) |
| Binary Quantization | quantization | fp32 &rarr; 1-bit (32x compression, Cold tier) |
| Temperature Tiering | quantization | Count-Min Sketch access tracking + automatic tier assignment |

### Wire format &mdash; what the bytes look like on disk and over the network

| Feature | Example | Description |
|---------|---------|-------------|
| Segment I/O | wire_format | Write/read 64-byte-aligned segments with type/flags/hash |
| Hash Validation | wire_format | CRC32c / XXH3 integrity checks on every segment |
| Tail-Scan | wire_format | Find latest manifest by scanning backward from EOF |
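
To make the alignment and tail-scan rows concrete, here is a minimal sketch in plain Rust. It is not the RVF wire code: the 4-byte `MANIFEST_MAGIC` marker, the in-memory `Vec<u8>` "file", and the helper names are all assumptions made for illustration. It only shows why 64-byte alignment lets a reader step backward from the end of the file to find the newest manifest.

```rust
/// Segment alignment taken from the table above.
const ALIGN: usize = 64;
/// Hypothetical 4-byte marker for a manifest segment (an assumption for this sketch).
const MANIFEST_MAGIC: [u8; 4] = *b"MNFT";

/// Round `len` up to the next multiple of 64.
fn align_up(len: usize) -> usize {
    (len + ALIGN - 1) / ALIGN * ALIGN
}

/// Append a segment (marker, payload, zero padding) and return its start offset.
fn append_segment(file: &mut Vec<u8>, magic: [u8; 4], payload: &[u8]) -> usize {
    let offset = file.len();
    file.extend_from_slice(&magic);
    file.extend_from_slice(payload);
    file.resize(align_up(file.len()), 0);
    offset
}

/// Walk backward from EOF in 64-byte steps and return the offset of the
/// last aligned block that starts with the manifest marker.
fn tail_scan(file: &[u8]) -> Option<usize> {
    let mut offset = file.len() / ALIGN * ALIGN;
    while offset >= ALIGN {
        offset -= ALIGN;
        if file[offset..].starts_with(&MANIFEST_MAGIC) {
            return Some(offset);
        }
    }
    None
}

fn main() {
    let mut file = Vec::new();
    append_segment(&mut file, *b"VECS", &[1u8; 100]);
    let old = append_segment(&mut file, MANIFEST_MAGIC, b"v1");
    append_segment(&mut file, *b"VECS", &[2u8; 10]);
    let new = append_segment(&mut file, MANIFEST_MAGIC, b"v2");

    println!("file length: {} bytes", file.len());
    println!("older manifest at {old}, newer manifest at {new}");
    println!("tail scan found: {:?}", tail_scan(&file));
}
```

A marker match alone could be a false positive if payload bytes happen to look like the marker, which is one reason a real reader would also validate the segment hash before trusting what it found.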

### Trust &mdash; signatures, audit trails, and tamper detection

| Feature | Example | Description |
|---------|---------|-------------|
| Ed25519 Signing | crypto_signing | Sign segments, verify signatures, detect tampering |
| Witness Chains | crypto_signing | SHAKE-256 linked audit trails (73-byte entries) |
| Tamper Detection | crypto_signing | Any byte flip breaks chain verification |

### Agentic AI &mdash; lineage, domains, and self-booting intelligence

| Feature | Example | Description |
|---------|---------|-------------|
| DNA-Style Lineage | (API) | Every derived file records its parent's hash and derivation type |
| Domain Profiles | (API) | `.rvdna`, `.rvtext`, `.rvgraph`, `.rvvis` &mdash; same format, domain-specific hints |
| Computational Container | `claude_code_appliance` | Embed a WASM microkernel, eBPF program, or bootable unikernel |
| Self-Booting Appliance | `claude_code_appliance` | 5.1 MB `.rvf` &mdash; boots Linux, serves queries, runs Claude Code |
| Import (JSON/CSV/NumPy) | (API) | Load embeddings from `.json`, `.csv`, or `.npy` files via `rvf-import` or `rvf ingest` CLI |
| Unified CLI | `rvf` | 9 subcommands: create, ingest, query, delete, status, inspect, compact, derive, serve |
| Compaction | (API) | Garbage-collect tombstoned vectors and reclaim disk space |
| Batch Delete | (API) | Delete vectors by ID with tombstone markers |

### Self-Booting RVF &mdash; Claude Code Appliance

The `claude_code_appliance` example builds a complete self-booting AI development environment as a single `.rvf` file. It uses real infrastructure &mdash; a Docker-built Linux kernel, Ed25519 SSH keys, a BPF C socket filter, and a cryptographic witness chain.

```bash
cd examples/rvf
cargo run --example claude_code_appliance
```

**What it produces** (5.1 MB file):

```
claude_code_appliance.rvf
  ├── KERNEL_SEG    Linux 6.8.12 bzImage (5.2 MB, x86_64)
  ├── EBPF_SEG      Socket filter — allows ports 2222, 8080 only
  ├── VEC_SEG       20 package embeddings (128-dim)
  ├── INDEX_SEG     HNSW graph for package search
  ├── WITNESS_SEG   6-entry tamper-evident audit trail
  ├── CRYPTO_SEG    3 Ed25519 SSH user keys (root, deploy, claude)
  ├── MANIFEST_SEG  4 KB root with segment directory
  └── Snapshot      v1 derived image with lineage tracking
```

**Boot and connect:**

```bash
rvf launch claude_code_appliance.rvf        # Boot on QEMU/Firecracker
ssh -p 2222 deploy@localhost                 # SSH in
curl -s localhost:8080/query -d '{"vector":[0.1,...], "k":5}'
```

Final file: **5.1 MB single `.rvf`** &mdash; boots Linux, serves queries, runs Claude Code.

</details>

<details>
<summary><strong>What RVF Contains</strong></summary>

An RVF file is built from **segments** &mdash; self-describing blocks that can be combined freely. Here are all 16 types, grouped by purpose:

```
 Data              Indexing           Compression        Runtime
+-----------+     +-----------+     +-----------+     +-----------+
| VEC  0x01 |     | INDEX 0x02|     | QUANT 0x06|     | WASM      |
| (vectors) |     | (HNSW)    |     | (SQ/PQ/BQ)|     | (5.5 KB)  |
+-----------+     +-----------+     +-----------+     +-----------+
| META 0x07 |     | META_IDX  |     | HOT  0x08 |     | KERNEL    |
| (key-val) |     | 0x0D      |     | (promoted)|     | 0x0E      |
+-----------+     +-----------+     +-----------+     +-----------+
| JOURNAL   |     | OVERLAY   |     | SKETCH    |     | EBPF      |
| 0x04      |     | 0x03      |     | 0x09      |     | 0x0F      |
+-----------+     +-----------+     +-----------+     +-----------+

 Trust             State              Domain
+-----------+     +-----------+     +-----------+
| WITNESS   |     | MANIFEST  |     | PROFILE   |
| 0x0A      |     | 0x05      |     | 0x0B      |
+-----------+     +-----------+     +-----------+
| CRYPTO    |
| 0x0C      |
+-----------+
```

Any segment you don't need is simply absent. A basic vector store uses VEC + INDEX + MANIFEST. A sealed cognitive engine might use all 16.

### RuVector Ecosystem Integration

RVF is the universal substrate for the entire RuVector ecosystem. Here's how the 75+ Rust crates map onto RVF segments:

| Domain | Crates | RVF Segments Used |
|--------|--------|-------------------|
| **LLM inference** | `ruvllm`, `ruvllm-cli` | VEC (KV cache), OVERLAY (LoRA), WITNESS (audit) |
| **Self-optimizing learning** | `sona` | OVERLAY (micro-LoRA), META (EWC++ weights) |
| **Graph neural networks** | `ruvector-gnn`, `ruvector-graph` | INDEX (HNSW topology), META (edge weights) |
| **Quantum computing** | `ruQu`, `ruqu-core`, `ruqu-algorithms` | SKETCH (VQE snapshots), META (syndrome tables) |
| **Attention mechanisms** | `ruvector-attention`, `ruvector-mincut-gated-transformer` | VEC (attention matrices), QUANT (INT4/FP16) |
| **Coherence systems** | `cognitum-gate-kernel`, `prime-radiant` | WITNESS (tile witnesses), WASM (64 KB tiles) |
| **Neuromorphic** | `ruvector-nervous-system`, `micro-hnsw-wasm` | VEC (spike trains), INDEX (spiking HNSW) |
| **Agent memory** | `agentdb`, `claude-flow`, `agentic-flow` | VEC + INDEX + WITNESS (full agent state) |
| **Edge / browser** | `rvlite`, `rvf-wasm` | VEC + INDEX via 5.5 KB WASM microkernel |
| **Hyperbolic geometry** | `ruvector-hyperbolic-hnsw`, `ruvector-math` | INDEX (Poincar&eacute; ball HNSW) |
| **Routing / inference** | `ruvector-tiny-dancer-core`, `ruvector-sparse-inference` | VEC (feature vectors), META (routing policies) |
| **Observation pipeline** | `ospipe` | META (state vectors), WITNESS (provenance) |

</details>

<details>
<summary><strong>Performance & Comparison</strong></summary>

RVF is designed for speed at every layer:

| Metric | Value | Example |
|--------|-------|---------|
| Cold boot (4 KB manifest) | **< 5 ms** | wire_format |
| First query (Layer A only) | **recall >= 0.70** | progressive_index |
| Full recall (Layer C) | **>= 0.95** | progressive_index |
| WASM binary size | **~5.5 KB** | &mdash; |
| Segment header | **64 bytes** | wire_format |
| Witness chain entry | **73 bytes** | crypto_signing |
| Scalar quantization | **4x compression** | quantization |
| Product quantization | **8-32x compression** | quantization |
| Binary quantization | **32x compression** | quantization |

### Progressive Loading

Instead of waiting for the full index, RVF serves queries immediately:

```
Layer A ─────> Layer B ─────> Layer C
(microsecs)    (~10 ms)       (~50 ms)
recall ~0.70   recall ~0.85   recall ~0.95
```

The `progressive_index` example measures this recall progression with brute-force ground truth.
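
For readers unfamiliar with the metric, recall@k is the fraction of the true k nearest neighbors that an approximate search returns. The sketch below is a standalone illustration in plain Rust, not the code from `progressive_index`: the helper names and the tiny dataset are made up, and the "approximate" result is hard-coded rather than produced by an index.

```rust
use std::collections::HashSet;

/// Squared L2 distance between two equal-length vectors.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact top-k ids by brute force; this is the ground truth.
fn brute_force_top_k(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<usize> {
    let mut ids: Vec<usize> = (0..data.len()).collect();
    ids.sort_by(|&i, &j| l2_sq(&data[i], query).total_cmp(&l2_sq(&data[j], query)));
    ids.truncate(k);
    ids
}

/// recall@k = |approx ∩ truth| / |truth|
fn recall(approx: &[usize], truth: &[usize]) -> f32 {
    if truth.is_empty() {
        return 1.0;
    }
    let truth_set: HashSet<_> = truth.iter().collect();
    let hits = approx.iter().filter(|id| truth_set.contains(id)).count();
    hits as f32 / truth.len() as f32
}

fn main() {
    // Ten points on a line: (0,0), (1,0), ..., (9,0).
    let data: Vec<Vec<f32>> = (0..10).map(|i| vec![i as f32, 0.0]).collect();
    let query = [0.2f32, 0.0];

    let truth = brute_force_top_k(&data, &query, 4);
    // Pretend an approximate index missed one true neighbor and returned id 7 instead.
    let approx = [0usize, 1, 2, 7];

    println!("ground truth: {truth:?}");
    println!("recall@4 = {:.2}", recall(&approx, &truth));
}
```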

### Comparison

#### vs. vector databases

| Feature | RVF | Annoy | FAISS | Qdrant | Milvus |
|---------|-----|-------|-------|--------|--------|
| Single-file format | Yes | Yes | No | No | No |
| Crash-safe (no WAL) | Yes | No | No | WAL | WAL |
| Progressive loading | 3 layers | No | No | No | No |
| WASM support | 5.5 KB | No | No | No | No |
| `no_std` compatible | Yes | No | No | No | No |
| Post-quantum sigs | ML-DSA-65 | No | No | No | No |
| TEE attestation | Yes | No | No | No | No |
| Metadata filtering | Yes | No | Yes | Yes | Yes |
| Auto quantization | 3-tier | No | Manual | Yes | Yes |
| Append-only | Yes | Build-once | Build-once | Log | Log |
| Witness chains | Yes | No | No | No | No |
| Lineage provenance | Yes (DNA-style) | No | No | No | No |
| Computational container | Yes (WASM/eBPF/unikernel) | No | No | No | No |
| Domain profiles | 5 profiles | No | No | No | No |
| Language bindings | Rust, Node, WASM | C++, Python | C++, Python | Rust, Python | Go, Python |

#### vs. model registries, graph DBs, and container formats

RVF replaces multiple tools because it carries data, model, graph, runtime, and trust chain together:

| Capability | RVF | GGUF | ONNX | SafeTensors | Neo4j | Docker/OCI |
|-----------|-----|------|------|-------------|-------|------------|
| Vector storage + search | Yes | No | No | No | No | No |
| Model weight deltas (LoRA) | OVERLAY_SEG | Full weights | Full graph | Weights only | No | No |
| Graph neural state | GRAPH_SEG | No | No | No | Yes | No |
| Cryptographic audit trail | WITNESS_SEG | No | No | No | No | No |
| Self-booting runtime | KERNEL_SEG | No | No | No | No | Yes |
| Kernel-level acceleration | EBPF_SEG | No | No | No | No | No |
| File lineage / versioning | DNA-style | No | No | No | No | Image layers |
| TEE attestation | Built-in | No | No | No | No | No |
| Single portable file | Yes | Yes | Yes | Yes | No | Image tarball |
| Runs in browser | 5.5 KB WASM | No | ONNX.js | No | No | No |

</details>

<details>
<summary><strong>Usage Patterns (8 patterns)</strong></summary>

### Pattern 1: Simple Vector Store

The most common use case. Create a store, add embeddings, query nearest neighbors.

```rust
use rvf_runtime::{RvfStore, RvfOptions, QueryOptions};
use rvf_runtime::options::DistanceMetric;

let options = RvfOptions {
    dimension: 384,
    metric: DistanceMetric::L2,
    ..Default::default()
};
let mut store = RvfStore::create("vectors.rvf", options)?;

// Insert embeddings
store.ingest_batch(&[&embedding], &[1], None)?;

// Query top-10 nearest neighbors
let results = store.query(&query, 10, &QueryOptions::default())?;
for r in &results {
    println!("id={}, distance={:.4}", r.id, r.distance);
}
```

See: [`basic_store.rs`](examples/basic_store.rs)

### Pattern 2: Filtered Search

Attach metadata to vectors, then filter during queries.

```rust
use rvf_runtime::{FilterExpr, MetadataEntry, MetadataValue};
use rvf_runtime::filter::FilterValue;

// Add metadata during ingestion
let metadata = vec![
    MetadataEntry { field_id: 0, value: MetadataValue::String("science".into()) },
    MetadataEntry { field_id: 1, value: MetadataValue::U64(95) },
];
store.ingest_batch(&[&vec], &[42], Some(&metadata))?;

// Query with filter: category == "science" AND score > 80
let filter = FilterExpr::And(vec![
    FilterExpr::Eq(0, FilterValue::String("science".into())),
    FilterExpr::Gt(1, FilterValue::U64(80)),
]);
let opts = QueryOptions { filter: Some(filter), ..Default::default() };
let results = store.query(&query, 10, &opts)?;
```

See: [`filtered_search.rs`](examples/filtered_search.rs)
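
Conceptually, a filter is a small expression tree evaluated against each candidate's metadata. The sketch below shows that idea in plain Rust. The `Value`, `Filter`, and `Meta` types are simplified stand-ins written for this illustration (they are not the `rvf_runtime` definitions), and the rule that a missing field or mismatched type never matches is an assumption.

```rust
use std::collections::HashMap;

/// Simplified metadata value (stand-in, not the rvf_runtime type).
#[derive(Debug, PartialEq)]
enum Value {
    Str(String),
    U64(u64),
}

/// Simplified filter expression tree (stand-in, not the rvf_runtime type).
enum Filter {
    Eq(u16, Value),
    Gt(u16, Value),
    And(Vec<Filter>),
    Or(Vec<Filter>),
}

/// Metadata for one vector: field id -> value.
type Meta = HashMap<u16, Value>;

/// Evaluate a filter against one vector's metadata.
fn eval(filter: &Filter, meta: &Meta) -> bool {
    match filter {
        Filter::Eq(field, want) => meta.get(field) == Some(want),
        Filter::Gt(field, bound) => match (meta.get(field), bound) {
            (Some(Value::U64(have)), Value::U64(b)) => have > b,
            _ => false, // missing field or type mismatch never matches (assumption)
        },
        Filter::And(parts) => parts.iter().all(|p| eval(p, meta)),
        Filter::Or(parts) => parts.iter().any(|p| eval(p, meta)),
    }
}

fn main() {
    let mut meta = Meta::new();
    meta.insert(0, Value::Str("science".into()));
    meta.insert(1, Value::U64(95));

    // category == "science" AND score > 80
    let both = Filter::And(vec![
        Filter::Eq(0, Value::Str("science".into())),
        Filter::Gt(1, Value::U64(80)),
    ]);
    // category == "art" OR score > 99
    let either = Filter::Or(vec![
        Filter::Eq(0, Value::Str("art".into())),
        Filter::Gt(1, Value::U64(99)),
    ]);

    println!("science AND score > 80: {}", eval(&both, &meta));
    println!("art OR score > 99:      {}", eval(&either, &meta));
}
```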

### Pattern 3: Progressive Recall

Start serving queries instantly, improve quality as more data loads.

```rust
use rvf_index::{build_full_index, build_layer_a, build_layer_c, ProgressiveIndex};

// Build HNSW graph
let graph = build_full_index(&store, n, &config, &rng, &l2_distance);

// Layer A: instant but approximate
let layer_a = build_layer_a(&graph, &centroids, &assignments, n as u64);
let idx = ProgressiveIndex { layer_a: Some(layer_a), layer_b: None, layer_c: None };
let fast_results = idx.search(&query, 10, 200, &store); // recall ~0.70

// Layer C: full precision (reuses Layer A by moving the remaining fields out of `idx`)
let layer_c = build_layer_c(&graph);
let idx_full = ProgressiveIndex { layer_c: Some(layer_c), ..idx };
let precise_results = idx_full.search(&query, 10, 200, &store); // recall ~0.95
```

See: [`progressive_index.rs`](examples/progressive_index.rs)

### Pattern 4: Cryptographic Integrity

Sign segments and build tamper-evident audit trails.

```rust
use rvf_crypto::{sign_segment, verify_segment, create_witness_chain, WitnessEntry, shake256_256};
use ed25519_dalek::SigningKey;

// Sign a segment
let footer = sign_segment(&header, &payload, &signing_key);

// Verify signature
assert!(verify_segment(&header, &payload, &footer, &verifying_key));

// Build an audit trail
let entries = vec![WitnessEntry {
    prev_hash: [0; 32],
    action_hash: shake256_256(b"inserted 1000 vectors"),
    timestamp_ns: 1_700_000_000_000_000_000,
    witness_type: 0x01, // PROVENANCE
}];
let chain = create_witness_chain(&entries);
```

See: [`crypto_signing.rs`](examples/crypto_signing.rs)

### Pattern 5: Import from JSON / CSV / NumPy

Load embeddings from common formats without writing a parser.

```rust
use rvf_import::{import_json, import_csv, import_npy};

// From a JSON array of vectors
import_json("embeddings.json", &mut store)?;

// From a CSV file (one vector per row)
import_csv("embeddings.csv", &mut store)?;

// From a NumPy .npy file
import_npy("embeddings.npy", &mut store)?;
```

### Pattern 6: Delete and Compact

Remove vectors by ID, then reclaim disk space.

```rust
// Delete specific vectors (marks as tombstones)
store.delete_batch(&[42, 99, 1001])?;

// Compact: rewrite the file without tombstoned data
store.compact()?;
```
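
The delete-then-compact split follows from the append-only design: a delete only records a tombstone, and the space is reclaimed later in a single rewrite. The toy model below illustrates that behavior with an in-memory `Vec`. `ToyStore` and its methods are hypothetical and only mimic the shape of the calls above; they say nothing about how `RvfStore` lays out tombstones on disk.

```rust
use std::collections::HashSet;

/// Toy append-only store: records are only appended; deletes add tombstones.
struct ToyStore {
    records: Vec<(u64, Vec<f32>)>, // (id, vector) in append order
    tombstones: HashSet<u64>,
}

impl ToyStore {
    fn new() -> Self {
        Self { records: Vec::new(), tombstones: HashSet::new() }
    }

    fn ingest(&mut self, id: u64, vector: Vec<f32>) {
        self.records.push((id, vector));
    }

    /// Mark ids as deleted without touching stored data.
    /// Returns how many live records were newly tombstoned.
    fn delete_batch(&mut self, ids: &[u64]) -> usize {
        let mut deleted = 0;
        for &id in ids {
            let exists = self.records.iter().any(|(rid, _)| *rid == id);
            if exists && self.tombstones.insert(id) {
                deleted += 1;
            }
        }
        deleted
    }

    /// Records still visible to queries.
    fn live_count(&self) -> usize {
        self.records.iter().filter(|(id, _)| !self.tombstones.contains(id)).count()
    }

    /// Rewrite the record list without tombstoned entries.
    /// Returns how many records were reclaimed.
    fn compact(&mut self) -> usize {
        let before = self.records.len();
        let dead = std::mem::take(&mut self.tombstones);
        self.records.retain(|(id, _)| !dead.contains(id));
        before - self.records.len()
    }
}

fn main() {
    let mut store = ToyStore::new();
    for id in 1..=5u64 {
        store.ingest(id, vec![id as f32; 4]);
    }

    let deleted = store.delete_batch(&[2, 4, 99]); // 99 was never stored
    println!("tombstoned {deleted}, stored {}, live {}", store.records.len(), store.live_count());

    let reclaimed = store.compact();
    println!("reclaimed {reclaimed}, stored {}", store.records.len());
}
```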

### Pattern 7: File Lineage (Parent &rarr; Child Derivation)

Create derived files that track their ancestry.

```rust
use rvf_types::DerivationType;

// Create a parent store
let parent = RvfStore::create("parent.rvf", options)?;

// Derive a filtered child — records parent's hash automatically
let child = parent.derive("child.rvf", DerivationType::Filter, None)?;
assert_eq!(child.lineage_depth(), 1);
assert_eq!(child.parent_id(), parent.file_id());

// Derive a grandchild
let grandchild = child.derive("grandchild.rvdna", DerivationType::Quantize, None)?;
assert_eq!(grandchild.lineage_depth(), 2);
```

### Pattern 8: Embed a Computational Container

Pack a bootable kernel or eBPF program into the file.

```rust
use rvf_types::kernel::{KernelArch, KernelType};
use rvf_types::ebpf::{EbpfProgramType, EbpfAttachType};

// Embed a unikernel — file can now boot as a standalone service
store.embed_kernel(KernelArch::X86_64, KernelType::HermitOs, &kernel_image, 8080)?;

// Embed an eBPF program — enables kernel-level acceleration
store.embed_ebpf(EbpfProgramType::Xdp, EbpfAttachType::XdpIngress, 384, &bytecode, &btf)?;

// Extract later (distinct names so the eBPF header does not shadow the kernel header)
let (kernel_hdr, kernel_img) = store.extract_kernel()?.unwrap();
let (ebpf_hdr, ebpf_prog) = store.extract_ebpf()?.unwrap();
```

</details>

<details>
<summary><strong>Tutorial: Your First RVF Store (Step by Step)</strong></summary>

### Step 1: Set Up

Create a new Rust project and add the dependencies:

```bash
cargo new my_vectors
cd my_vectors
```

Add to `Cargo.toml` (adjust `path` so it points at `rvf-runtime` inside your `ruvector` checkout):

```toml
[dependencies]
rvf-runtime = { path = "../crates/rvf/rvf-runtime" }
tempfile = "3"
```

### Step 2: Create a Store

```rust
use rvf_runtime::{RvfStore, RvfOptions, QueryOptions};
use rvf_runtime::options::DistanceMetric;
use tempfile::TempDir;

fn main() {
    let tmp = TempDir::new().unwrap();
    let path = tmp.path().join("my.rvf");

    let opts = RvfOptions {
        dimension: 128,
        metric: DistanceMetric::L2,
        ..Default::default()
    };
    let mut store = RvfStore::create(&path, opts).unwrap();
```

### Step 3: Insert Vectors

Vectors are inserted in batches. Each vector needs a unique `u64` ID.

```rust
    let vec_a = vec![0.1f32; 128];
    let vec_b = vec![0.2f32; 128];
    let vecs: Vec<&[f32]> = vec![&vec_a, &vec_b];
    let ids = vec![1u64, 2];

    let result = store.ingest_batch(&vecs, &ids, None).unwrap();
    println!("Accepted: {}, Rejected: {}", result.accepted, result.rejected);
```

### Step 4: Query

```rust
    let query = vec![0.15f32; 128];
    let results = store.query(&query, 5, &QueryOptions::default()).unwrap();

    for r in &results {
        println!("  id={}, dist={:.6}", r.id, r.distance);
    }
```

### Step 5: Verify Persistence

```rust
    store.close().unwrap();

    let reopened = RvfStore::open(&path).unwrap();
    let results2 = reopened.query(&query, 5, &QueryOptions::default()).unwrap();
    assert_eq!(results.len(), results2.len());
    println!("Persistence verified!");
}
```

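### Expected Output

Both stored vectors differ from the query by 0.05 in every dimension, so they are equidistant from it. Assuming the store reports squared L2 distance, each distance is 128 × 0.05² = 0.32, and the order of the two tied results may vary:

```
Accepted: 2, Rejected: 0
  id=1, dist=0.320000
  id=2, dist=0.320000
Persistence verified!
```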
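
The distances in this tutorial can be checked without the library. The sketch below recomputes them by brute force in plain Rust. `l2_squared` is a helper written for this illustration, and whether `RvfStore` reports squared L2 or its square root is an assumption to verify against the real output, so both are printed.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum()
}

fn main() {
    // Same data as the tutorial steps above.
    let vec_a = vec![0.1f32; 128];
    let vec_b = vec![0.2f32; 128];
    let query = vec![0.15f32; 128];

    let da = l2_squared(&query, &vec_a);
    let db = l2_squared(&query, &vec_b);

    // Both vectors sit 0.05 away from the query in every dimension,
    // so the two distances agree up to f32 rounding.
    println!("squared L2: id=1 {da:.6}, id=2 {db:.6}");
    println!("L2:         id=1 {:.6}, id=2 {:.6}", da.sqrt(), db.sqrt());
}
```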

</details>

<details>
<summary><strong>Tutorial: Understanding Quantization Tiers</strong></summary>

### The Problem

A million 384-dim vectors at full precision (fp32) take about **1.5 GB** of RAM (1,000,000 × 384 × 4 bytes). Not all vectors are accessed equally &mdash; most are rarely touched. Why keep them all at full precision?

### The Solution: Temperature Tiering

RVF assigns vectors to three compression levels based on how often they're accessed:

| Tier | Access Pattern | Compression | Memory per Vector (384d) |
|------|---------------|------------|--------------------------|
| **Hot** | Frequently queried | Scalar (fp32 -> u8) | 384 bytes (4x smaller) |
| **Warm** | Occasionally queried | Product quantization | 48 bytes (32x smaller) |
| **Cold** | Rarely accessed | Binary (1-bit) | 48 bytes (32x smaller) |
| Raw | n/a | None (fp32) | 1,536 bytes |

### How It Works

**1. Track access patterns** using a Count-Min Sketch (a probabilistic counter):

```rust
let mut sketch = CountMinSketch::default_sketch();

// Every time a vector is accessed, increment its counter
sketch.increment(vector_id);

// Check how often a vector has been accessed
let count = sketch.estimate(vector_id);
```
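
To make the idea concrete, here is a minimal Count-Min Sketch using only the standard library. `TinySketch` is a hypothetical illustration, not the library's `CountMinSketch`; the depth, width, and hashing scheme are assumptions chosen for brevity. The estimate is the minimum over all rows, so it can overcount on collisions but never undercounts.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative Count-Min Sketch: `depth` rows of `width` counters.
struct TinySketch {
    width: usize,
    rows: Vec<Vec<u32>>,
}

impl TinySketch {
    fn new(depth: usize, width: usize) -> Self {
        Self { width, rows: vec![vec![0; width]; depth] }
    }

    /// Counter index for `id` in a given row (the row number salts the hash).
    fn slot(&self, row: usize, id: u64) -> usize {
        let mut h = DefaultHasher::new();
        (row as u64, id).hash(&mut h);
        (h.finish() % self.width as u64) as usize
    }

    fn increment(&mut self, id: u64) {
        for row in 0..self.rows.len() {
            let s = self.slot(row, id);
            self.rows[row][s] = self.rows[row][s].saturating_add(1);
        }
    }

    /// Minimum across rows: an upper bound on the true count.
    fn estimate(&self, id: u64) -> u32 {
        (0..self.rows.len())
            .map(|row| self.rows[row][self.slot(row, id)])
            .min()
            .unwrap_or(0)
    }
}

fn main() {
    let mut sketch = TinySketch::new(4, 1024);
    for _ in 0..5 {
        sketch.increment(7);
    }
    println!("estimate for id 7: {}", sketch.estimate(7));
}
```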

**2. Assign tiers** based on configurable thresholds:

```rust
let tier = assign_tier(count);
// Hot:  count >= 100
// Warm: count >= 10
// Cold: count < 10
```
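
The threshold logic in the comments above amounts to two comparisons. A sketch with the thresholds passed in explicitly (`Tier` and `assign_tier_sketch` are hypothetical names; the library's `assign_tier` may read its thresholds from configuration instead):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Hot,
    Warm,
    Cold,
}

/// Map an access count to a tier using inclusive lower bounds.
fn assign_tier_sketch(count: u64, hot_min: u64, warm_min: u64) -> Tier {
    if count >= hot_min {
        Tier::Hot
    } else if count >= warm_min {
        Tier::Warm
    } else {
        Tier::Cold
    }
}

fn main() {
    // Thresholds taken from the comments above: Hot >= 100, Warm >= 10.
    for count in [0u64, 9, 10, 99, 100, 5_000] {
        println!("{count:>5} -> {:?}", assign_tier_sketch(count, 100, 10));
    }
}
```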

**3. Encode at the appropriate level:**

```rust
// Hot: Scalar (fast, low error)
let sq = ScalarQuantizer::train(&vectors);
let encoded = sq.encode_vec(&vector);  // 384 bytes

// Warm: Product (balanced)
let pq = ProductQuantizer::train(&vectors, 48, 64, 20);
let encoded = pq.encode_vec(&vector);  // 48 bytes

// Cold: Binary (smallest, approximate)
let bits = encode_binary(&vector);     // 48 bytes
```
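
The byte counts in the comments follow directly from the encodings. The sketch below shows the two simplest ones with hypothetical helpers: `scalar_encode` maps each component to one `u8` over a fixed range (the library's `ScalarQuantizer` learns its range from training data, which is not modeled here), and `binary_encode_sketch` keeps only the sign bit, packing 8 components per byte.

```rust
/// One byte per component: linearly map [min, max] onto 0..=255.
fn scalar_encode(v: &[f32], min: f32, max: f32) -> Vec<u8> {
    if !(max > min) {
        return vec![0; v.len()]; // degenerate range: nothing to scale
    }
    let scale = 255.0 / (max - min);
    v.iter()
        .map(|&x| ((x.clamp(min, max) - min) * scale).round() as u8)
        .collect()
}

/// One bit per component: 1 if the component is positive, else 0.
fn binary_encode_sketch(v: &[f32]) -> Vec<u8> {
    let mut out = vec![0u8; (v.len() + 7) / 8];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}

fn main() {
    let vector = vec![0.25f32; 384];
    println!("scalar: {} bytes", scalar_encode(&vector, -1.0, 1.0).len());
    println!("binary: {} bytes", binary_encode_sketch(&vector).len());
}
```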

### Run the Example

```bash
cargo run --example quantization
```

You'll see a comparison table showing compression ratio, reconstruction error (MSE), and bytes per vector for each tier.

</details>

<details>
<summary><strong>Tutorial: Building Witness Chains for Audit Trails</strong></summary>

### What Is a Witness Chain?

A witness chain is a tamper-evident log of events. Each entry links to the previous one through a cryptographic hash. If any entry is modified, all subsequent hash links break &mdash; making tampering detectable without a blockchain.

### Chain Structure

```
  Entry 0 (genesis)         Entry 1                  Entry 2
+-------------------+   +-------------------+   +-------------------+
| prev_hash: 0x00.. |   | prev_hash: H(E0)  |   | prev_hash: H(E1)  |
| action:   H(data) |   | action:   H(data) |   | action:   H(data) |
| timestamp: T0     |   | timestamp: T1     |   | timestamp: T2     |
| type: PROVENANCE  |   | type: COMPUTATION |   | type: SEARCH      |
+-------------------+   +-------------------+   +-------------------+
        73 bytes                73 bytes                73 bytes
```

- **prev_hash**: SHAKE-256 hash of the previous entry (zeroed for genesis)
- **action_hash**: SHAKE-256 hash of whatever action is being recorded
- **timestamp_ns**: Nanosecond UNIX timestamp
- **witness_type**: What kind of event (see table below)

### Witness Types

| Code | Name | When to Use |
|------|------|------------|
| `0x01` | PROVENANCE | Data origin tracking (e.g., "loaded from model X") |
| `0x02` | COMPUTATION | Operation recording (e.g., "built HNSW index") |
| `0x03` | SEARCH | Query audit (e.g., "searched for query Q, got results R") |
| `0x04` | DELETION | Deletion audit (e.g., "deleted vectors 1-100") |
| `0x05` | PLATFORM_ATTESTATION | TEE attestation (e.g., "enclave measured as M") |
| `0x06` | KEY_BINDING | Sealed key (e.g., "key K bound to enclave M") |
| `0x07` | COMPUTATION_PROOF | Verified computation (e.g., "search ran inside enclave") |
| `0x08` | DATA_PROVENANCE | Full chain (e.g., "model -> TEE -> RVF file") |
| `0x09` | DERIVATION | File lineage derivation event |
| `0x0A` | LINEAGE_MERGE | Multi-parent lineage merge |
| `0x0B` | LINEAGE_SNAPSHOT | Lineage snapshot checkpoint |
| `0x0C` | LINEAGE_TRANSFORM | Lineage transform operation |
| `0x0D` | LINEAGE_VERIFY | Lineage verification event |

### Creating and Verifying

```rust
use rvf_crypto::{create_witness_chain, verify_witness_chain, WitnessEntry, shake256_256};

// Record three events
let entries = vec![
    WitnessEntry {
        prev_hash: [0; 32], // genesis
        action_hash: shake256_256(b"loaded embeddings from model-v2"),
        timestamp_ns: 1_700_000_000_000_000_000,
        witness_type: 0x01,
    },
    WitnessEntry {
        prev_hash: [0; 32], // filled by create_witness_chain
        action_hash: shake256_256(b"built HNSW index (M=16, ef=200)"),
        timestamp_ns: 1_700_000_001_000_000_000,
        witness_type: 0x02,
    },
    WitnessEntry {
        prev_hash: [0; 32],
        action_hash: shake256_256(b"query: top-10 for user request #42"),
        timestamp_ns: 1_700_000_002_000_000_000,
        witness_type: 0x03,
    },
];

let chain_bytes = create_witness_chain(&entries);
let verified = verify_witness_chain(&chain_bytes).unwrap();
assert_eq!(verified.len(), 3);
```

### Tamper Detection

Flip any byte in the chain and verification fails:

```rust
let mut tampered = chain_bytes.clone();
tampered[100] ^= 0xFF; // flip one byte

assert!(verify_witness_chain(&tampered).is_err()); // detected!
```
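
For intuition about why a flipped byte breaks the chain, here is a self-contained sketch of the linking logic over 73-byte entries. It is not the `rvf_crypto` implementation: `toy_hash` is a non-cryptographic stand-in for SHAKE-256 built on the standard library hasher, and the field order and little-endian timestamp are assumptions based on the layout described above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

const ENTRY_LEN: usize = 73; // 32 (prev_hash) + 32 (action_hash) + 8 (timestamp) + 1 (type)

/// Stand-in digest (NOT cryptographic); the real chain uses SHAKE-256.
fn toy_hash(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, chunk) in out.chunks_mut(8).enumerate() {
        let mut h = DefaultHasher::new();
        h.write_u64(i as u64);
        h.write(data);
        chunk.copy_from_slice(&h.finish().to_le_bytes());
    }
    out
}

/// Serialize (action_hash, timestamp_ns, witness_type) records, filling in prev_hash.
fn build_chain(records: &[([u8; 32], u64, u8)]) -> Vec<u8> {
    let mut chain = Vec::with_capacity(records.len() * ENTRY_LEN);
    let mut prev = [0u8; 32]; // genesis entry links to all zeros
    for (action, ts, kind) in records {
        let start = chain.len();
        chain.extend_from_slice(&prev);
        chain.extend_from_slice(action);
        chain.extend_from_slice(&ts.to_le_bytes());
        chain.push(*kind);
        prev = toy_hash(&chain[start..]);
    }
    chain
}

/// Recompute every link; return the entry count or the first broken link.
fn verify_chain(chain: &[u8]) -> Result<usize, String> {
    if chain.len() % ENTRY_LEN != 0 {
        return Err("length is not a multiple of 73".into());
    }
    let mut expected_prev = [0u8; 32];
    for (i, entry) in chain.chunks(ENTRY_LEN).enumerate() {
        if entry[..32] != expected_prev[..] {
            return Err(format!("hash link broken at entry {i}"));
        }
        expected_prev = toy_hash(entry);
    }
    Ok(chain.len() / ENTRY_LEN)
}

fn main() {
    let chain = build_chain(&[([1u8; 32], 10, 0x01), ([2u8; 32], 20, 0x02)]);
    println!("{:?}", verify_chain(&chain));
}
```

In this sketch, a change confined to the final entry's `action_hash` or timestamp has no successor link to break, so linking alone would not catch it; a signature or an external anchor over the chain tail covers that case.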

### Run the Example

```bash
cargo run --example crypto_signing
```

The example creates a 5-entry chain, verifies it, then demonstrates tamper and truncation detection.

</details>

<details>
<summary><strong>Tutorial: Wire Format Deep Dive</strong></summary>

### Segment Header (64 bytes)

Every piece of data in an RVF file is wrapped in a self-describing segment. The header is always exactly 64 bytes:

```
Offset  Size  Field             Description
------  ----  -----             -----------
0x00    4     magic             0x52564653 ("RVFS")
0x04    1     version           Format version (currently 1)
0x05    1     seg_type          Segment type (VEC, INDEX, MANIFEST, ...)
0x06    2     flags             Bitfield (COMPRESSED, SIGNED, ATTESTED, ...)
0x08    8     segment_id        Monotonically increasing ID
0x10    8     payload_length    Byte length of payload
0x18    8     timestamp_ns      Nanosecond UNIX timestamp
0x20    1     checksum_algo     0=CRC32C, 1=XXH3-128, 2=SHAKE-256
0x21    1     compression       0=none, 1=LZ4, 2=ZSTD
0x22    2     reserved_0        Must be zero
0x24    4     reserved_1        Must be zero
0x28    16    content_hash      First 128 bits of payload hash
0x38    4     uncompressed_len  Original size before compression
0x3C    4     alignment_pad     Padding to 64-byte boundary
```
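
Reading the first few fields by offset might look like the sketch below. `HeaderView` and `parse_header` are hypothetical names, not the `rvf-wire` API, and the sketch assumes the magic is stored as a little-endian `u32` like the other integers. The flag constants use the bit positions from the Segment Flags table.

```rust
const SEGMENT_MAGIC: u32 = 0x5256_4653; // "RVFS"
const FLAG_COMPRESSED: u16 = 1 << 0;
const FLAG_SIGNED: u16 = 1 << 2;
const FLAG_ATTESTED: u16 = 1 << 10;

#[derive(Debug, PartialEq)]
struct HeaderView {
    version: u8,
    seg_type: u8,
    flags: u16,
    segment_id: u64,
    payload_length: u64,
}

/// Decode a little-endian u64 from the first 8 bytes of `b`.
fn le_u64(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[..8]);
    u64::from_le_bytes(a)
}

/// Parse the leading fields of a 64-byte header; `None` on short input or bad magic.
fn parse_header(buf: &[u8]) -> Option<HeaderView> {
    if buf.len() < 64 {
        return None;
    }
    let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if magic != SEGMENT_MAGIC {
        return None;
    }
    Some(HeaderView {
        version: buf[0x04],
        seg_type: buf[0x05],
        flags: u16::from_le_bytes([buf[0x06], buf[0x07]]),
        segment_id: le_u64(&buf[0x08..0x10]),
        payload_length: le_u64(&buf[0x10..0x18]),
    })
}

fn main() {
    let mut buf = [0u8; 64];
    buf[0..4].copy_from_slice(&SEGMENT_MAGIC.to_le_bytes());
    buf[0x04] = 1; // version
    buf[0x05] = 0x01; // VEC
    buf[0x06..0x08].copy_from_slice(&(FLAG_SIGNED | FLAG_ATTESTED).to_le_bytes());
    buf[0x08..0x10].copy_from_slice(&7u64.to_le_bytes());
    buf[0x10..0x18].copy_from_slice(&1536u64.to_le_bytes());

    let h = parse_header(&buf).expect("valid header");
    let signed = (h.flags & FLAG_SIGNED) != 0;
    let compressed = (h.flags & FLAG_COMPRESSED) != 0;
    println!("{h:?} signed={signed} compressed={compressed}");
}
```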

### The 16 Segment Types

| Code | Name | Purpose |
|------|------|---------|
| `0x01` | VEC | Raw vector embeddings |
| `0x02` | INDEX | HNSW adjacency and routing tables |
| `0x03` | OVERLAY | Graph overlay deltas |
| `0x04` | JOURNAL | Metadata mutations, deletions |
| `0x05` | MANIFEST | Segment directory, epoch state |
| `0x06` | QUANT | Quantization dictionaries (scalar/PQ/binary) |
| `0x07` | META | Key-value metadata |
| `0x08` | HOT | Temperature-promoted data |
| `0x09` | SKETCH | Access counter sketches (Count-Min) |
| `0x0A` | WITNESS | Audit trails, attestation proofs |
| `0x0B` | PROFILE | Domain profile declarations |
| `0x0C` | CRYPTO | Key material, signature chains |
| `0x0D` | META_IDX | Metadata inverted indexes |
| `0x0E` | KERNEL | Compressed unikernel image (self-booting) |
| `0x0F` | EBPF | eBPF program for kernel-level acceleration |

### Segment Flags

| Bit | Name | Description |
|-----|------|-------------|
| 0 | COMPRESSED | Payload is compressed (LZ4 or ZSTD) |
| 1 | ENCRYPTED | Payload is encrypted |
| 2 | SIGNED | Signature footer follows payload |
| 3 | SEALED | Immutable (compaction output) |
| 4 | PARTIAL | Streaming / partial write |
| 5 | TOMBSTONE | Logical deletion marker |
| 6 | HOT | Temperature-promoted |
| 7 | OVERLAY | Contains delta data |
| 8 | SNAPSHOT | Full snapshot |
| 9 | CHECKPOINT | Safe rollback point |
| 10 | ATTESTED | Produced inside attested TEE |
| 11 | HAS_LINEAGE | File carries FileIdentity lineage data |

### Crash Safety: Two-fsync Protocol

RVF doesn't need a write-ahead log. Instead:

1. Write data segment + payload, then `fsync`
2. Write MANIFEST_SEG with updated state, then `fsync`

If the process crashes between fsyncs, the incomplete segment has no manifest reference &mdash; it's ignored on recovery. Simple, safe, fast.
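
The ordering can be sketched with the standard library alone. This is an illustration of the two-step commit, not the `rvf-runtime` writer: `commit` is a hypothetical helper, and the segment bytes are treated as opaque, already-serialized buffers.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append a data segment, fsync, then append the manifest segment and fsync again.
fn commit(path: &Path, data_segment: &[u8], manifest_segment: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;

    file.write_all(data_segment)?;
    file.sync_all()?; // fsync #1: the data is durable but not yet referenced

    file.write_all(manifest_segment)?;
    file.sync_all()?; // fsync #2: the manifest now makes the data visible

    Ok(())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join(format!("rvf_two_fsync_demo_{}.bin", std::process::id()));
    let _ = std::fs::remove_file(&path);
    commit(&path, &[1u8; 128], &[2u8; 64])?;
    println!("wrote {} bytes", std::fs::metadata(&path)?.len());
    std::fs::remove_file(&path)
}
```

A crash before the second `sync_all` leaves at most an unreferenced tail, which a reader that trusts only the latest valid manifest can skip.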

### Tail-Scan

To find the current state, scan backward from the end of the file for the latest MANIFEST_SEG. The root manifest fits in 4 KB, so cold boot takes < 5 ms.
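
A backward scan can be sketched as follows, assuming segment headers start at 64-byte-aligned offsets and the magic is stored little-endian. `find_latest_manifest_sketch` and `put_header` are hypothetical helpers; a production scanner would also validate the header's `content_hash` so that payload bytes that happen to look like a header are rejected.

```rust
const SEGMENT_MAGIC: u32 = 0x5256_4653; // "RVFS"
const SEG_MANIFEST: u8 = 0x05;

/// Walk 64-byte-aligned offsets from the end; return the offset of the last manifest header.
fn find_latest_manifest_sketch(file: &[u8]) -> Option<usize> {
    if file.len() < 64 {
        return None;
    }
    let mut off = (file.len() - 64) / 64 * 64;
    loop {
        let h = &file[off..off + 64];
        let magic = u32::from_le_bytes([h[0], h[1], h[2], h[3]]);
        if magic == SEGMENT_MAGIC && h[5] == SEG_MANIFEST {
            return Some(off);
        }
        if off == 0 {
            return None;
        }
        off -= 64;
    }
}

/// Test helper: stamp a minimal header (magic + segment type) at `off`.
fn put_header(file: &mut [u8], off: usize, seg_type: u8) {
    file[off..off + 4].copy_from_slice(&SEGMENT_MAGIC.to_le_bytes());
    file[off + 5] = seg_type;
}

fn main() {
    let mut file = vec![0u8; 256];
    put_header(&mut file, 0, 0x01); // VEC segment at offset 0
    put_header(&mut file, 128, SEG_MANIFEST); // manifest at offset 128
    println!("{:?}", find_latest_manifest_sketch(&file));
}
```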

### Run the Example

```bash
cargo run --example wire_format
```

The example writes three segments, reads them back, validates their hashes, detects corruption, and tail-scans for the manifest.

</details>

<details>
<summary><strong>Tutorial: Metadata Filtering Patterns</strong></summary>

### Available Filter Expressions

| Expression | Syntax | Description |
|-----------|--------|-------------|
| `Eq` | `FilterExpr::Eq(field_id, value)` | Exact match |
| `Ne` | `FilterExpr::Ne(field_id, value)` | Not equal |
| `Gt` | `FilterExpr::Gt(field_id, value)` | Greater than |
| `Lt` | `FilterExpr::Lt(field_id, value)` | Less than |
| `Range` | `FilterExpr::Range(field_id, low, high)` | Value in [low, high) |
| `In` | `FilterExpr::In(field_id, values)` | Value is one of |
| `And` | `FilterExpr::And(vec![...])` | All conditions must match |
| `Or` | `FilterExpr::Or(vec![...])` | Any condition matches |

### Metadata Types

| Type | Rust | Use Case |
|------|------|----------|
| `String` | `MetadataValue::String("cat".into())` | Categories, labels, tags |
| `U64` | `MetadataValue::U64(95)` | Scores, counts, timestamps |
| `Bytes` | `MetadataValue::Bytes(vec![...])` | Binary data, hashes |

### Common Patterns

**Category filter:**
```rust
FilterExpr::Eq(0, FilterValue::String("science".into()))
```

**Score range:**
```rust
FilterExpr::Range(1, FilterValue::U64(30), FilterValue::U64(90))
```

**Multi-category:**
```rust
FilterExpr::In(0, vec![
    FilterValue::String("science".into()),
    FilterValue::String("tech".into()),
])
```

**Combined (AND):**
```rust
FilterExpr::And(vec![
    FilterExpr::Eq(0, FilterValue::String("science".into())),
    FilterExpr::Gt(1, FilterValue::U64(80)),
])
```
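
To pin down the semantics of these expressions, in particular the half-open `Range`, here is a small stand-alone evaluator. `Value`, `Filter`, and `eval` are hypothetical mirrors of the library types, the `u16` field ID is an assumption, and treating a missing field or a type mismatch as "no match" is a design choice of this sketch rather than documented behavior.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    U64(u64),
    Str(String),
}

enum Filter {
    Eq(u16, Value),
    Ne(u16, Value),
    Gt(u16, Value),
    Lt(u16, Value),
    Range(u16, Value, Value), // [low, high)
    In(u16, Vec<Value>),
    And(Vec<Filter>),
    Or(Vec<Filter>),
}

/// Compare two values of the same type; mixed types are incomparable.
fn cmp(a: &Value, b: &Value) -> Option<Ordering> {
    match (a, b) {
        (Value::U64(x), Value::U64(y)) => Some(x.cmp(y)),
        (Value::Str(x), Value::Str(y)) => Some(x.cmp(y)),
        _ => None,
    }
}

/// Evaluate a filter against one record's metadata (field ID -> value).
fn eval(f: &Filter, row: &HashMap<u16, Value>) -> bool {
    use Ordering::{Equal, Greater, Less};
    let ord = |id: &u16, v: &Value| row.get(id).and_then(|have| cmp(have, v));
    match f {
        Filter::Eq(id, v) => ord(id, v) == Some(Equal),
        Filter::Ne(id, v) => matches!(ord(id, v), Some(Less | Greater)),
        Filter::Gt(id, v) => ord(id, v) == Some(Greater),
        Filter::Lt(id, v) => ord(id, v) == Some(Less),
        Filter::Range(id, lo, hi) => {
            matches!(ord(id, lo), Some(Equal | Greater)) && ord(id, hi) == Some(Less)
        }
        Filter::In(id, vs) => vs.iter().any(|v| ord(id, v) == Some(Equal)),
        Filter::And(fs) => fs.iter().all(|g| eval(g, row)),
        Filter::Or(fs) => fs.iter().any(|g| eval(g, row)),
    }
}

fn main() {
    let mut row = HashMap::new();
    row.insert(0u16, Value::Str("science".into()));
    row.insert(1u16, Value::U64(85));

    let combined = Filter::And(vec![
        Filter::Eq(0, Value::Str("science".into())),
        Filter::Gt(1, Value::U64(80)),
    ]);
    println!("combined filter matches: {}", eval(&combined, &row));
}
```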

### Run the Example

```bash
cargo run --example filtered_search
```

The example creates 500 vectors with category and score metadata, then runs 7 different filter queries showing selectivity and verification.

</details>

<details>
<summary><strong>Tutorial: Progressive Index Recall Measurement</strong></summary>

### What Is Recall?

**Recall@K** is the fraction of the true K nearest neighbors that your approximate algorithm actually returns. A recall@10 of 0.95 means that, averaged over queries, 9.5 of the 10 true nearest neighbors appear in the returned top 10.

```
recall@K = |approximate_results ∩ exact_results| / K
```
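
The formula translates directly into code. `recall_at_k` below is a hypothetical helper (not taken from the example source) that assumes both ID lists are sorted by distance and contain no duplicates:

```rust
use std::collections::HashSet;

/// Fraction of the exact top-k IDs that also appear in the approximate top-k.
fn recall_at_k(approx: &[u64], exact: &[u64], k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    let truth: HashSet<u64> = exact.iter().take(k).copied().collect();
    let hits = approx.iter().take(k).filter(|id| truth.contains(*id)).count();
    hits as f64 / k as f64
}

fn main() {
    let exact = [1u64, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let approx = [1u64, 2, 3, 4, 5, 6, 7, 8, 9, 42]; // one miss
    println!("recall@10 = {}", recall_at_k(&approx, &exact, 10));
}
```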

### How Progressive Indexing Achieves This

RVF builds an HNSW (Hierarchical Navigable Small World) graph, then splits it into three loadable layers:

**Layer A: Coarse Routing**
- Entry points (topmost HNSW nodes)
- Partition centroids for guided search
- Loads in microseconds
- Recall: ~0.40-0.70

**Layer B: Hot Region**
- Adjacency lists for the most frequently accessed vectors
- Covers the "working set" of your data
- Recall: ~0.70-0.85

**Layer C: Full Graph**
- Complete HNSW adjacency for all vectors
- Loaded in background while queries are already being served
- Recall: >= 0.95

### Measuring Recall in the Example

The `progressive_index` example:
1. Generates 5,000 vectors (128 dims)
2. Builds the full HNSW graph (M=16, ef_construction=200)
3. Splits into Layer A, B, C
4. Runs 50 queries at each stage
5. Computes recall@10 against brute-force ground truth

```bash
cargo run --example progressive_index
```

Expected output:

```
=== Recall Progression Summary ===
        Layers  Recall@10
  A only         0.xxx
  A + B          0.xxx
  A + B + C      0.9xx
```

### Tuning ef_search

The `ef_search` parameter controls how many candidates HNSW explores during search. Higher values improve recall at the cost of latency:

| ef_search | Recall@10 | Relative Speed |
|-----------|-----------|---------------|
| 10 | ~0.75 | Fastest |
| 50 | ~0.90 | Balanced |
| 200 | ~0.97 | Most accurate |

</details>

<details>
<summary><strong>Technical Reference: Signature Footer Format</strong></summary>

When the `SIGNED` flag is set on a segment, a signature footer follows the payload:

| Offset | Size | Field |
|--------|------|-------|
| 0x00 | 2 | `sig_algo` (0=Ed25519, 1=ML-DSA-65, 2=SLH-DSA-128s) |
| 0x02 | 2 | `sig_length` |
| 0x04 | var | `signature` (64 to 7,856 bytes) |
| var | 4 | `footer_length` (for backward scan) |
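
A round trip through this layout can be sketched as below. Two assumptions are not confirmed by the table: `footer_length` is taken to count the whole footer including its own 4 bytes, and the input slice is taken to end exactly at the footer (alignment padding already stripped). `encode_footer` and `parse_footer_from_end` are hypothetical helpers.

```rust
/// Build a footer: sig_algo (2) | sig_length (2) | signature | footer_length (4).
fn encode_footer(sig_algo: u16, signature: &[u8]) -> Vec<u8> {
    let total = 2 + 2 + signature.len() + 4;
    let mut out = Vec::with_capacity(total);
    out.extend_from_slice(&sig_algo.to_le_bytes());
    out.extend_from_slice(&(signature.len() as u16).to_le_bytes());
    out.extend_from_slice(signature);
    out.extend_from_slice(&(total as u32).to_le_bytes());
    out
}

/// Scan backward from the end of `segment` and return (sig_algo, signature bytes).
fn parse_footer_from_end(segment: &[u8]) -> Option<(u16, &[u8])> {
    let n = segment.len();
    if n < 8 {
        return None;
    }
    let total =
        u32::from_le_bytes([segment[n - 4], segment[n - 3], segment[n - 2], segment[n - 1]])
            as usize;
    if total < 8 || total > n {
        return None;
    }
    let footer = &segment[n - total..];
    let algo = u16::from_le_bytes([footer[0], footer[1]]);
    let sig_len = u16::from_le_bytes([footer[2], footer[3]]) as usize;
    if 4 + sig_len + 4 != total {
        return None; // the two length fields disagree
    }
    Some((algo, &footer[4..4 + sig_len]))
}

fn main() {
    let signature = [0xABu8; 64]; // placeholder bytes, Ed25519-sized
    let mut segment = vec![0u8; 100]; // stand-in for header + payload
    segment.extend(encode_footer(0, &signature));

    let (algo, sig) = parse_footer_from_end(&segment).expect("footer present");
    println!("algo={algo}, signature length={}", sig.len());
}
```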

### Supported Algorithms

| Algorithm | Signature Size | Security Level | Standard |
|-----------|---------------|---------------|----------|
| Ed25519 | 64 bytes | 128-bit classical | RFC 8032 |
| ML-DSA-65 | 3,309 bytes | NIST Level 3 (post-quantum) | FIPS 204 |
| SLH-DSA-128s | 7,856 bytes | NIST Level 1 (post-quantum, stateless) | FIPS 205 |

### Signing Flow

1. Serialize the segment header (64 bytes) and payload into a signing buffer
2. Compute SHAKE-256 hash of the buffer
3. Sign the hash with the chosen algorithm
4. Append the signature footer after the payload (before padding)
5. Set the `SIGNED` flag in the header

### Verification Flow

1. Read segment header and payload
2. Recompute SHAKE-256 hash of header + payload
3. Read signature footer (scan backward from segment end using `footer_length`)
4. Verify signature against the public key

</details>

<details>
<summary><strong>Technical Reference: Confidential Core Attestation</strong></summary>

### Overview

RVF can record hardware TEE (Trusted Execution Environment) attestation quotes alongside vector data. This provides cryptographic proof that:

- The platform is genuine (e.g., real Intel SGX hardware)
- The code running inside the enclave matches a known measurement
- Encryption keys are sealed to the enclave identity
- Vector operations were computed inside the secure environment

### Supported TEE Platforms

| Platform | Enum Value | Quote Format |
|----------|-----------|--------------|
| Intel SGX | `TeePlatform::Sgx` (0) | DCAP attestation quote |
| AMD SEV-SNP | `TeePlatform::SevSnp` (1) | VCEK attestation report |
| Intel TDX | `TeePlatform::Tdx` (2) | TD quote |
| ARM CCA | `TeePlatform::ArmCca` (3) | CCA token |
| Software (testing) | `TeePlatform::SoftwareTee` (0xFE) | Synthetic (no hardware) |

### Attestation Header (112 bytes, `repr(C)`)

```
Offset  Size  Field
------  ----  -----
0x00    1     platform           TeePlatform enum value
0x01    1     attestation_type   AttestationWitnessType enum value
0x02    4     quote_length       Length of the platform-specific quote
0x06    2     reserved
0x08    32    measurement        SHAKE-256 hash of enclave code
0x28    32    signer_id          SHAKE-256 hash of signing identity
0x48    8     timestamp_ns       Nanosecond UNIX timestamp
0x50    16    nonce              Anti-replay nonce
0x60    2     svn                Security Version Number
0x62    1     sig_algo           Signature algorithm for the quote
0x63    1     flags              Attestation flags
0x64    4     report_data_len    Length of additional report data
0x68    8     reserved
```

### Attestation Types

| Type | Witness Code | Purpose |
|------|-------------|---------|
| Platform Attestation | `0x05` | TEE identity + measurement verification |
| Key Binding | `0x06` | Keys sealed to enclave measurement |
| Computation Proof | `0x07` | Proof that operations ran inside enclave |
| Data Provenance | `0x08` | Full chain: model -> TEE -> RVF file |

### ATTESTED Segment Flag

Any segment produced inside a TEE should set bit 10 (`ATTESTED`) in the segment header flags. This enables fast scanning to identify attested segments without parsing payloads.

### QuoteVerifier Trait

The verification interface is pluggable:

```rust
pub trait QuoteVerifier {
    fn platform(&self) -> TeePlatform;
    fn verify_quote(
        &self,
        quote: &[u8],
        report_data: &[u8],
        expected_measurement: &[u8; 32],
    ) -> Result<(), String>;
}
```

Implement this trait for your TEE platform to enable hardware-backed verification. The `SoftwareTee` variant allows testing without real hardware.
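
As an illustration of implementing the trait, the sketch below redeclares `TeePlatform` and `QuoteVerifier` locally so that it compiles on its own. The synthetic quote format (`measurement || report_data`) is an assumption invented for this example; it is not the format the library's software TEE uses.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TeePlatform {
    Sgx = 0,
    SevSnp = 1,
    Tdx = 2,
    ArmCca = 3,
    SoftwareTee = 0xFE,
}

trait QuoteVerifier {
    fn platform(&self) -> TeePlatform;
    fn verify_quote(
        &self,
        quote: &[u8],
        report_data: &[u8],
        expected_measurement: &[u8; 32],
    ) -> Result<(), String>;
}

/// Test-only verifier: accepts quotes of the form measurement || report_data.
struct SoftwareVerifier;

impl QuoteVerifier for SoftwareVerifier {
    fn platform(&self) -> TeePlatform {
        TeePlatform::SoftwareTee
    }

    fn verify_quote(
        &self,
        quote: &[u8],
        report_data: &[u8],
        expected_measurement: &[u8; 32],
    ) -> Result<(), String> {
        if quote.len() != 32 + report_data.len() {
            return Err("unexpected quote length".into());
        }
        if quote[..32] != expected_measurement[..] {
            return Err("measurement mismatch".into());
        }
        if quote[32..] != report_data[..] {
            return Err("report data mismatch".into());
        }
        Ok(())
    }
}

fn main() {
    let measurement = [7u8; 32];
    let mut quote = measurement.to_vec();
    quote.extend_from_slice(b"nonce");

    let verifier = SoftwareVerifier;
    println!("{:?}: {:?}", verifier.platform(), verifier.verify_quote(&quote, b"nonce", &measurement));
}
```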

</details>

<details>
<summary><strong>Technical Reference: Computational Container (Self-Booting RVF)</strong></summary>

### Three-Tier Execution Model

RVF files can optionally carry executable compute alongside vector data:

| Tier | Segment | Size | Environment | Boot Time | Use Case |
|------|---------|------|-------------|-----------|----------|
| **1: WASM** | WASM_SEG (existing) | 5.5 KB | Browser, edge, IoT | <1 ms | Portable queries everywhere |
| **2: eBPF** | EBPF_SEG (`0x0F`) | 10-50 KB | Linux kernel (XDP, TC) | <20 ms | Sub-microsecond hot cache hits |
| **3: Unikernel** | KERNEL_SEG (`0x0E`) | 200 KB - 2 MB | Firecracker, TEE, bare metal | <125 ms | Zero-dependency self-booting service |

### KernelHeader (128 bytes)

| Field | Size | Description |
|-------|------|-------------|
| `kernel_magic` | 4 | `0x52564B4E` ("RVKN") |
| `header_version` | 2 | Currently 1 |
| `kernel_arch` | 1 | x86_64 (0), AArch64 (1), RISC-V (2), WASM (3) |
| `kernel_type` | 1 | HermitOS (0), Unikraft (1), Custom (2), TestStub (0xFE) |
| `image_size` | 4 | Uncompressed kernel size |
| `compressed_size` | 4 | Compressed (ZSTD) size |
| `image_hash` | 32 | SHAKE-256-256 of uncompressed image |
| `api_port` | 2 | HTTP API port (network byte order) |
| `api_transport` | 1 | HTTP (0), gRPC (1), virtio-vsock (2) |
| `kernel_flags` | 8 | Feature flags (read-only, metrics, TEE, etc.) |
| `cmdline_len` | 2 | Length of kernel command line |

### EbpfHeader (64 bytes)

| Field | Size | Description |
|-------|------|-------------|
| `ebpf_magic` | 4 | `0x52564250` ("RVBP") |
| `program_type` | 1 | XDP (0), TC (1), Tracepoint (2), Socket (3) |
| `attach_type` | 1 | XdpIngress (0), TcIngress (1), etc. |
| `max_dimension` | 4 | Maximum vector dimension (eBPF verifier loop bound) |
| `bytecode_size` | 4 | Size of BPF ELF object |
| `btf_size` | 4 | Size of BTF section |
| `map_count` | 4 | Number of BPF maps |

### Embedding and Extracting

```rust
use rvf_runtime::RvfStore;
use rvf_types::kernel::{KernelArch, KernelType};
use rvf_types::ebpf::{EbpfProgramType, EbpfAttachType};

// `image`, `bytecode`, and `btf` below are byte buffers loaded elsewhere (not shown).
let mut store = RvfStore::open("vectors.rvf")?;

// Embed a kernel
store.embed_kernel(KernelArch::X86_64, KernelType::HermitOs, &image, 8080)?;

// Embed an eBPF program
store.embed_ebpf(EbpfProgramType::Xdp, EbpfAttachType::XdpIngress, 384, &bytecode, &btf)?;

// Extract later
let (kernel_hdr, kernel_img) = store.extract_kernel()?.unwrap();
let (ebpf_hdr, ebpf_prog) = store.extract_ebpf()?.unwrap();
```

### Forward Compatibility

Files with KERNEL_SEG or EBPF_SEG work with older readers, because unknown segment types are skipped per the RVF forward-compatibility rule. The computational capability is purely additive.

See [ADR-030](../../docs/adr/ADR-030-rvf-computational-container.md) for the full specification.

</details>

<details>
<summary><strong>Technical Reference: DNA-Style Lineage Provenance</strong></summary>

### How Lineage Works

Every RVF file carries a 68-byte `FileIdentity` in its root manifest:

| Field | Size | Description |
|-------|------|-------------|
| `file_id` | 16 | Unique UUID for this file |
| `parent_id` | 16 | UUID of the parent file (all zeros for root) |
| `parent_hash` | 32 | SHAKE-256-256 of parent's manifest |
| `lineage_depth` | 4 | Generation count (0 for root) |

### Derivation Chain

```
Parent.rvf ──derive()──> Child.rvf ──derive()──> Grandchild.rvdna
  file_id: A               file_id: B               file_id: C
  parent_id: [0;16]         parent_id: A              parent_id: B
  parent_hash: [0;32]       parent_hash: hash(A)      parent_hash: hash(B)
  depth: 0                  depth: 1                  depth: 2
```
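
The bookkeeping in the diagram can be modeled with a plain struct. The sketch below is not the `rvf-types` definition: the field order in `to_bytes` and the little-endian depth are assumptions consistent with the 68-byte size, and the parent manifest hash is passed in as an opaque 32-byte value.

```rust
#[derive(Debug, Clone, PartialEq)]
struct FileIdentity {
    file_id: [u8; 16],
    parent_id: [u8; 16],
    parent_hash: [u8; 32],
    lineage_depth: u32,
}

impl FileIdentity {
    /// A root file has no parent: zeroed parent fields and depth 0.
    fn root(file_id: [u8; 16]) -> Self {
        Self { file_id, parent_id: [0; 16], parent_hash: [0; 32], lineage_depth: 0 }
    }

    /// A child records its parent's ID and manifest hash, one generation deeper.
    fn derive_child(&self, child_id: [u8; 16], parent_manifest_hash: [u8; 32]) -> Self {
        Self {
            file_id: child_id,
            parent_id: self.file_id,
            parent_hash: parent_manifest_hash,
            lineage_depth: self.lineage_depth + 1,
        }
    }

    /// Serialize to 16 + 16 + 32 + 4 = 68 bytes.
    fn to_bytes(&self) -> [u8; 68] {
        let mut out = [0u8; 68];
        out[0..16].copy_from_slice(&self.file_id);
        out[16..32].copy_from_slice(&self.parent_id);
        out[32..64].copy_from_slice(&self.parent_hash);
        out[64..68].copy_from_slice(&self.lineage_depth.to_le_bytes());
        out
    }
}

fn main() {
    let parent = FileIdentity::root([0xA; 16]);
    let child = parent.derive_child([0xB; 16], [0x11; 32]);
    let grandchild = child.derive_child([0xC; 16], [0x22; 32]);
    println!("depth={} bytes={}", grandchild.lineage_depth, grandchild.to_bytes().len());
}
```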

### Derivation Types

| Code | Type | Description |
|------|------|-------------|
| 0 | Clone | Exact copy |
| 1 | Filter | Subset of parent's vectors |
| 2 | Merge | Multi-parent merge |
| 3 | Quantize | Re-quantized version |
| 4 | Reindex | Re-indexed with different parameters |
| 5 | Transform | Transformed embeddings |
| 6 | Snapshot | Point-in-time snapshot |
| 0xFF | UserDefined | Application-specific derivation |

### Using the API

```rust
use rvf_runtime::RvfStore;
use rvf_types::DerivationType;

// `options` is an `RvfOptions` value, built as in the first tutorial.
let parent = RvfStore::create("parent.rvf", options)?;

// Derive a filtered child
let child = parent.derive("child.rvf", DerivationType::Filter, None)?;
assert_eq!(child.lineage_depth(), 1);
assert_eq!(child.parent_id(), parent.file_id());
```

### Domain Extensions

| Extension | Domain Profile | Optimized For |
|-----------|---------------|---------------|
| `.rvf` | Generic | General-purpose vectors |
| `.rvdna` | RVDNA | Genomic sequence embeddings |
| `.rvtext` | RVText | Language model embeddings |
| `.rvgraph` | RVGraph | Graph/network node embeddings |
| `.rvvis` | RVVision | Image/vision model embeddings |

See [ADR-029](../../docs/adr/ADR-029-rvf-canonical-format.md) for the full format specification.

</details>

<details>
<summary><strong>Technical Reference: Crate Architecture</strong></summary>

### Crate Map

```
                    +-----------------------------------------+
                    |         Cognitive Layer                   |
                    |  ruvllm | gnn | ruQu | attention | sona  |
                    |  mincut | prime-radiant | nervous-system |
                    +---+-------------+---------------+-------+
                        |             |               |
                    +-----------------------------------------+
                    |           Application Layer              |
                    |  claude-flow | agentdb | agentic-flow    |
                    |  ospipe | rvlite | sona | your-app      |
                    +---+-------------+---------------+-------+
                        |             |               |
                    +---v-------------v---------------v-------+
                    |           RVF SDK Layer                   |
                    |  rvf-runtime | rvf-index | rvf-quant      |
                    |  rvf-manifest | rvf-crypto | rvf-wire     |
                    +---+-------------+---------------+-------+
                        |             |               |
               +--------v------+ +---v--------+ +----v-------+ +----v------+
               |  rvf-server   | |  rvf-node  | |  rvf-wasm  | |  rvf-cli  |
               |  HTTP + TCP   | |  N-API     | |  ~46 KB    | |  clap     |
               +---------------+ +------------+ +------------+ +-----------+
```

### Crate Details

| Crate | Lines | no_std | Purpose |
|-------|------:|:------:|---------|
| `rvf-types` | 3,184 | Yes | Segment types, kernel/eBPF headers, lineage, enums |
| `rvf-wire` | 2,011 | Yes | Wire format read/write, hash validation |
| `rvf-manifest` | 1,580 | No | Two-level manifest with 4 KB root, FileIdentity codec |
| `rvf-index` | 2,691 | No | HNSW progressive indexing (Layer A/B/C) |
| `rvf-quant` | 1,443 | No | Scalar, product, and binary quantization |
| `rvf-crypto` | 1,725 | Partial | SHAKE-256, Ed25519, witness chains, attestation, lineage |
| `rvf-runtime` | 3,607 | No | Full store API, compaction, lineage, kernel/eBPF embed |
| `rvf-import` | 980 | No | JSON, CSV, NumPy (.npy) importers |
| `rvf-wasm` | 1,616 | Yes | WASM control plane: in-memory store, query, segment inspection |
| `rvf-node` | 852 | No | Node.js N-API bindings with lineage, kernel/eBPF, inspection |
| `rvf-cli` | 665 | No | Unified CLI: create, ingest, query, delete, status, inspect, compact, derive, serve |
| `rvf-server` | 1,165 | No | HTTP REST + TCP streaming server |

### Library Adapters

| Adapter | Purpose | Key Feature |
|---------|---------|-------------|
| `rvf-adapter-claude-flow` | AI agent memory | WITNESS_SEG audit trails |
| `rvf-adapter-agentdb` | Agent vector database | Progressive HNSW indexing |
| `rvf-adapter-ospipe` | Observation-State pipeline | META_SEG for state vectors |
| `rvf-adapter-agentic-flow` | Swarm coordination | Inter-agent memory sharing |
| `rvf-adapter-rvlite` | Lightweight embedded store | Minimal API, edge-friendly |
| `rvf-adapter-sona` | Neural architecture | Experience replay + trajectories |

</details>

<details>
<summary><strong>Technical Reference: File Format Specification</strong></summary>

### File Extension

| Extension | Usage |
|-----------|-------|
| `.rvf` | Standard RuVector Format file |
| `.rvf.cold.N` | Cold shard N (multi-file mode) |
| `.rvf.idx.N` | Index shard N (multi-file mode) |

### MIME Type

`application/x-ruvector-format`

### Magic Number

`0x52564653` (ASCII: "RVFS")

### Byte Order

All multi-byte integers are **little-endian**.

### Alignment

All segments are **64-byte aligned** (cache-line friendly). Payloads are padded to the next 64-byte boundary.
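
The padding arithmetic is a round-up to a multiple of 64. A sketch with hypothetical helper names, assuming a payload whose length is already a multiple of 64 needs no padding:

```rust
const ALIGN: u64 = 64;

/// Smallest multiple of 64 that is >= `len`.
fn padded_len(len: u64) -> u64 {
    len.div_ceil(ALIGN) * ALIGN
}

/// Number of zero bytes to append after a payload of `len` bytes.
fn padding_for(len: u64) -> u64 {
    padded_len(len) - len
}

fn main() {
    for len in [0u64, 1, 64, 65, 1536] {
        println!("payload {len:>4} -> padded {:>4} (+{})", padded_len(len), padding_for(len));
    }
}
```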

### Root Manifest

The root manifest (Level 0) occupies the last 4,096 bytes of the most recent MANIFEST_SEG. This enables instant location via backward scan:

```rust
let (offset, header) = find_latest_manifest(&file_data)?;
```

The root manifest provides:
- Segment directory (offsets to all segments)
- Hotset pointers (entry points, top layer, centroids, quant dicts)
- Epoch counter
- Vector count and dimension
- Profile identifiers

### Domain Profiles

| Profile | Code | Optimized For |
|---------|------|---------------|
| Generic | `0x00` | General-purpose vectors |
| RVDNA | `0x01` | Genomic sequence embeddings |
| RVText | `0x02` | Language model embeddings |
| RVGraph | `0x03` | Graph/network node embeddings |
| RVVision | `0x04` | Image/vision model embeddings |

</details>

<details>
<summary><strong>Building from Source</strong></summary>

### Prerequisites

- **Rust 1.87+** via [rustup](https://rustup.rs/) (`rustup update stable`)
- For WASM: `rustup target add wasm32-unknown-unknown`
- For Node.js bindings: Node.js 18+ and `npm`

### Build Examples

```bash
cd examples/rvf
cargo build
```

### Build All RVF Crates

```bash
cd crates/rvf
cargo build --workspace
```

### Run All Tests

```bash
cd crates/rvf
cargo test --workspace
```

### Run Clippy

```bash
cd crates/rvf
cargo clippy --all-targets --workspace --exclude rvf-wasm
```

### Build WASM Microkernel

```bash
cd crates/rvf
cargo build --target wasm32-unknown-unknown -p rvf-wasm --release
ls target/wasm32-unknown-unknown/release/rvf_wasm.wasm
```

### Build Node.js Bindings

```bash
cd crates/rvf/rvf-node
npm install && npm run build
```

### Run Benchmarks

```bash
cd crates/rvf
cargo bench --bench rvf_benchmarks
```

</details>

---

<details>
<summary><strong>Project Structure</strong></summary>

```
examples/rvf/
  Cargo.toml                  # Standalone workspace
  src/lib.rs                  # Shared utilities
  examples/
    # Core (6)
    basic_store.rs            # Store lifecycle, insert, query, persistence
    progressive_index.rs      # Three-layer HNSW, recall measurement
    quantization.rs           # Scalar, product, binary quantization + tiering
    wire_format.rs            # Raw segment I/O, hash validation, tail-scan
    crypto_signing.rs         # Ed25519 signing, witness chains, tamper detection
    filtered_search.rs        # Metadata-filtered vector search
    # Agentic AI (6)
    agent_memory.rs           # Persistent agent memory + witness audit
    swarm_knowledge.rs        # Multi-agent shared knowledge base
    reasoning_trace.rs        # Chain-of-thought with lineage derivation
    tool_cache.rs             # Tool call result cache with TTL + compaction
    agent_handoff.rs          # Transfer agent state between instances
    experience_replay.rs      # RL experience replay buffer
    # Practical Production (5)
    semantic_search.rs        # Document search engine (4 filter workflows)
    recommendation.rs         # Item recommendations (collaborative filtering)
    rag_pipeline.rs           # Retrieval-augmented generation pipeline
    embedding_cache.rs        # LRU cache with temperature tiering
    dedup_detector.rs         # Near-duplicate detection + compaction
    # Vertical Domains (4)
    genomic_pipeline.rs       # DNA k-mer search (.rvdna profile)
    financial_signals.rs      # Market signals with attestation
    medical_imaging.rs        # Radiology embedding search (.rvvis)
    legal_discovery.rs        # Legal document similarity (.rvtext)
    # Exotic Capabilities (5)
    self_booting.rs           # RVF with embedded unikernel
    ebpf_accelerator.rs       # eBPF hot-path acceleration
    hyperbolic_taxonomy.rs    # Hierarchy-aware search
    multimodal_fusion.rs      # Cross-modal text + image search
    sealed_engine.rs          # Full cognitive engine (capstone)
    # Runtime Targets + Postgres (5)
    browser_wasm.rs           # Browser-side WASM vector search
    edge_iot.rs               # IoT device with binary quantization
    serverless_function.rs    # Cold-start optimized for Lambda
    ruvllm_inference.rs       # LLM KV cache + LoRA via RVF
    postgres_bridge.rs        # PostgreSQL ↔ RVF export/import
    # Network & Security (4)
    network_sync.rs           # Peer-to-peer vector store sync
    tee_attestation.rs        # TEE attestation + sealed keys
    access_control.rs         # Role-based vector access control
    zero_knowledge.rs         # Zero-knowledge proof integration
    # Autonomous Agent (1)
    ruvbot.rs                 # Autonomous RVF-powered agent bot
    # POSIX & Systems (3)
    posix_fileops.rs          # POSIX file operations with RVF
    linux_microkernel.rs      # Linux microkernel distribution
    mcp_in_rvf.rs             # MCP server embedded in RVF
    # Network Operations (1)
    network_interfaces.rs     # Network OS telemetry (60 interfaces)
```

</details>

## Learn More

| Resource | Description |
|----------|-------------|
| [RVF Format Specification](../../crates/rvf/README.md) | Full format documentation, architecture, and API reference |
| [ADR-029](../../docs/adr/ADR-029-rvf-canonical-format.md) | Architecture decision record for the canonical format |
| [ADR-030](../../docs/adr/ADR-030-rvf-computational-container.md) | Computational container (KERNEL_SEG, EBPF_SEG) specification |
| [ADR-031](../../docs/adr/ADR-031-rvf-example-repository.md) | Example repository design (this collection of 40 examples) |
| [Benchmarks](../../crates/rvf/benches/) | Performance benchmarks (HNSW build, quantization, wire I/O) |
| [Integration Tests](../../crates/rvf/tests/rvf-integration/) | E2E test suite (progressive recall, quantization, wire interop) |

## Contributing

```bash
git clone https://github.com/ruvnet/ruvector
cd ruvector/examples/rvf
cargo build && cargo run --example basic_store
```

All contributions must pass `cargo clippy` with zero warnings and maintain the existing test count (currently 453+).

## License

Dual-licensed under [MIT](../../LICENSE-MIT) or [Apache-2.0](../../LICENSE-APACHE) at your option.

---

<p align="center">
  <sub>Built with Rust. One file &mdash; store it, send it, run it.</sub>
</p>