Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
This commit is contained in:
445
docs/adr/ADR-034-qr-cognitive-seed.md
Normal file
445
docs/adr/ADR-034-qr-cognitive-seed.md
Normal file
@@ -0,0 +1,445 @@
|
||||
# ADR-034: QR Cognitive Seed — A World Inside a World
|
||||
|
||||
**Status**: Implemented
|
||||
**Date**: 2026-02-15
|
||||
**Depends on**: ADR-029 (RVF Canonical Format), ADR-030 (Cognitive Container), ADR-033 (Progressive Indexing Hardening)
|
||||
**Affects**: `rvf-types`, `rvf-runtime`
|
||||
**Zero external dependencies**: All crypto, compression, and FFI implemented from scratch.
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
RVF files are self-bootstrapping cognitive containers: they carry their own WASM interpreter, signed manifests, and progressive index layers. But distribution still assumes a filesystem — a URL, a disk, a cloud bucket.
|
||||
|
||||
What if intelligence could live in printed ink?
|
||||
|
||||
A QR code can carry up to 2,953 bytes (Version 40, Low EC). That's enough for:
|
||||
- A 64-byte RVQS header
|
||||
- A 5.5 KB WASM microkernel (LZ-compressed to ~2.1 KB)
|
||||
- A 32-byte HMAC-SHA256 signature
|
||||
- A 500-byte progressive download manifest with host URLs + content hashes
|
||||
|
||||
**Total seed: ~2,700 bytes. Fits in a single QR code with 229 bytes headroom.**
|
||||
|
||||
The result: scan a QR code and mount a portable brain. The AI literally exists in the data printed on a piece of paper. Offline-first, signed, verifiable, capable of bootstrapping into a streamed universe.
|
||||
|
||||
---
|
||||
|
||||
## Decision
|
||||
|
||||
### 1. QR Seed Format (RVQS — RuVector QR Seed)
|
||||
|
||||
A QR Cognitive Seed is a binary payload with this wire format:
|
||||
|
||||
```
|
||||
Offset Size Field Description
|
||||
------ ---- ----- -----------
|
||||
0x000 4 seed_magic 0x52565153 ("RVQS")
|
||||
0x004 2 seed_version Seed format version (1)
|
||||
0x006 2 flags Seed flags (see below)
|
||||
0x008 8 file_id Unique identifier for this seed
|
||||
0x010 4 total_vector_count Expected vectors when fully loaded
|
||||
0x014 2 dimension Vector dimensionality
|
||||
0x016 1 base_dtype Base data type (DataType enum)
|
||||
0x017 1 profile_id Domain profile
|
||||
0x018 8 created_ns Seed creation timestamp (nanos)
|
||||
0x020 4 microkernel_offset Offset to WASM microkernel data
|
||||
0x024 4 microkernel_size Compressed microkernel size
|
||||
0x028 4 download_manifest_offset Offset to download manifest
|
||||
0x02C 4 download_manifest_size Download manifest size
|
||||
0x030 2 sig_algo Signature algorithm (0=Ed25519, 1=ML-DSA-65, 2=HMAC-SHA256)
|
||||
0x032 2 sig_length Signature byte length
|
||||
0x034 4 total_seed_size Total payload size in bytes
|
||||
0x038 8 content_hash SHA-256-64 of microkernel+manifest data
|
||||
0x040 var microkernel_data LZ-compressed WASM microkernel
|
||||
... var download_manifest Progressive download manifest (TLV)
|
||||
... var signature Seed signature (covers 0x000..sig start)
|
||||
```
|
||||
|
||||
#### 1.1 Seed Flags
|
||||
|
||||
```
|
||||
Bit Name Description
|
||||
--- ---- -----------
|
||||
0 SEED_HAS_MICROKERNEL Embedded WASM microkernel present
|
||||
1 SEED_HAS_DOWNLOAD Progressive download manifest present
|
||||
2 SEED_SIGNED Payload is signed
|
||||
3 SEED_OFFLINE_CAPABLE Seed is useful without network access
|
||||
4 SEED_ENCRYPTED Payload is encrypted (key in TEE or passphrase)
|
||||
5 SEED_COMPRESSED Microkernel is LZ-compressed
|
||||
6 SEED_HAS_VECTORS Seed contains inline vector data (tiny model)
|
||||
7 SEED_STREAM_UPGRADE Seed can upgrade itself via streaming
|
||||
```
|
||||
|
||||
#### 1.2 Signature Algorithms
|
||||
|
||||
| ID | Algorithm | Size | Dependencies | Use Case |
|
||||
|----|-----------|------|--------------|----------|
|
||||
| 0 | Ed25519 | 64 B | `ed25519-dalek` | Asymmetric, production |
|
||||
| 1 | ML-DSA-65 | 3,309 B | `pqcrypto` | Post-quantum (requires 2-QR) |
|
||||
| 2 | HMAC-SHA256 | 32 B | **None** (built-in) | Symmetric, zero-dep default |
|
||||
|
||||
**sig_algo=2 (HMAC-SHA256) is implemented from scratch with zero external dependencies.**
|
||||
|
||||
### 2. Progressive Download Manifest
|
||||
|
||||
The download manifest tells the runtime how to grow from seed to full intelligence. It uses a TLV structure:
|
||||
|
||||
```
|
||||
Tag Length Description
|
||||
------ ------ -----------
|
||||
0x0001 var HostEntry: Primary download host
|
||||
0x0002 var HostEntry: Fallback host (up to 3)
|
||||
0x0003 32 content_hash: SHA-256 of the full RVF file
|
||||
0x0004 8 total_file_size: Expected size of the full RVF
|
||||
0x0005 var LayerManifest: Progressive layer download order
|
||||
0x0006 16 session_token: Ephemeral auth token for download
|
||||
0x0007 4 ttl_seconds: Token expiry
|
||||
0x0008 var CertPin: TLS certificate pin (SHA-256 of SPKI)
|
||||
```
|
||||
|
||||
**Default layer order:**
|
||||
|
||||
| Priority | Layer | Size | Purpose |
|
||||
|----------|-------|------|---------|
|
||||
| 0 | Level 0 manifest | 4 KB | Instant boot |
|
||||
| 1 | Hot cache (centroids + entry points) | ~50 KB | First query capability |
|
||||
| 2 | HNSW Layer A | ~200 KB | recall >= 0.70 |
|
||||
| 3 | Quantization dictionaries | ~100 KB | Compact search |
|
||||
| 4 | HNSW Layer B | ~500 KB | recall >= 0.85 |
|
||||
| 5 | Full vectors (warm tier) | variable | Full recall |
|
||||
| 6 | HNSW Layer C | variable | recall >= 0.95 |
|
||||
|
||||
### 3. Bootstrap Sequence
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ QR Code (≤2,953 bytes) │
|
||||
│ ┌──────────┬──────────────┬────────────┬──────────────────┐ │
|
||||
│ │ RVQS Hdr │ WASM μkernel │ DL Manifest│ HMAC-SHA256 Sig │ │
|
||||
│ │ 64 bytes │ ~2.1 KB (LZ) │ ~500 bytes │ 32 bytes │ │
|
||||
│ └──────────┴──────────────┴────────────┴──────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Phase 0: Scan & Verify (offline, <1ms) │
|
||||
│ 1. Parse RVQS header, validate magic 0x52565153 │
|
||||
│ 2. Verify content hash: SHA-256(μkernel ‖ manifest)[0..8] │
|
||||
│ 3. Verify HMAC-SHA256 signature (constant-time comparison) │
|
||||
│ 4. Decompress WASM microkernel (built-in LZ decompressor) │
|
||||
│ 5. Instantiate WASM runtime │
|
||||
│ 6. Seed is now ALIVE — cognitive kernel running │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
▼ (if network available)
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Phase 1: Progressive Download (background, priority-ordered) │
|
||||
│ 1. Fetch Level 0 manifest (4 KB) → instant full boot │
|
||||
│ 2. Fetch hot cache → first query capability (50% recall) │
|
||||
│ 3. Fetch HNSW Layer A → recall ≥ 0.70 │
|
||||
│ 4. Fetch remaining layers in priority order │
|
||||
│ Each layer: verify SHA-256-128 content_hash → append → index │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Phase 2: Full Intelligence │
|
||||
│ 1. All layers downloaded and verified │
|
||||
│ 2. Full HNSW index active, recall ≥ 0.95 │
|
||||
│ 3. Seed has grown into a complete cognitive container │
|
||||
│ 4. Can operate fully offline from this point │
|
||||
│ 5. Can re-export as a new QR seed (with updated vectors) │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### 4. Security Model
|
||||
|
||||
#### 4.1 Built-in Cryptography (Zero Dependencies)
|
||||
|
||||
All cryptographic primitives are implemented from scratch in pure Rust:
|
||||
|
||||
| Primitive | Implementation | Location | Tests |
|
||||
|-----------|---------------|----------|-------|
|
||||
| SHA-256 | FIPS 180-4 | `rvf-types/src/sha256.rs` | 11 (incl. NIST + RFC 4231 vectors) |
|
||||
| HMAC-SHA256 | RFC 2104 | `rvf-types/src/sha256.rs` | 3 RFC 4231 test cases |
|
||||
| Constant-time compare | XOR accumulator | `rvf-types/src/sha256.rs` | 2 |
|
||||
| LZ compression | SCF-1 (4KB window) | `rvf-runtime/src/compress.rs` | 11 |
|
||||
| Content hashing | SHA-256 truncated | `rvf-runtime/src/seed_crypto.rs` | 10 |
|
||||
|
||||
**Signature verification flow:**
|
||||
|
||||
```rust
|
||||
let parsed = ParsedSeed::parse(&payload)?;
|
||||
|
||||
// Full verification: magic + content hash + HMAC signature.
|
||||
parsed.verify_all(signing_key, &payload)?;
|
||||
|
||||
// Or step by step:
|
||||
assert!(parsed.verify_content_hash());
|
||||
parsed.verify_signature(signing_key, &payload)?;
|
||||
let wasm = parsed.decompress_microkernel()?;
|
||||
```
|
||||
|
||||
#### 4.2 Content Integrity
|
||||
|
||||
- **Seed content hash**: SHA-256(microkernel ‖ manifest) truncated to 64 bits. Stored in header at offset 0x38.
|
||||
- **Layer content hashes**: SHA-256 of each layer truncated to 128 bits. Verified on download.
|
||||
- **Full file hash**: SHA-256 of the complete RVF file. Stored in manifest TLV tag 0x0003.
|
||||
|
||||
#### 4.3 Download Security
|
||||
|
||||
1. **Content hashes**: Each layer has a SHA-256-128 hash. Downloaded data is verified before use.
|
||||
2. **TLS certificate pinning**: SHA-256 of host's SPKI. Prevents MITM even if a CA is compromised.
|
||||
3. **Session tokens**: Ephemeral 16-byte auth tokens with TTL.
|
||||
4. **Host key verification**: Each HostEntry contains the host's public key hash.
|
||||
|
||||
### 5. Mobile Integration
|
||||
|
||||
#### 5.1 iOS App Clip — Scan. Boot. Intelligence.
|
||||
|
||||
The strongest UX story: QR opens an App Clip instantly, no install.
|
||||
|
||||
```
|
||||
User scans QR with iOS Camera
|
||||
│
|
||||
▼
|
||||
iOS launches App Clip instantly (no App Store, no login)
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────┐
|
||||
│ App Clip (~50 KB RVF static library) │
|
||||
│ │
|
||||
│ 1. rvqs_parse_header() — read RVQS │
|
||||
│ 2. rvqs_verify_signature() — HMAC │
|
||||
│ 3. rvqs_verify_content_hash() — SHA256 │
|
||||
│ 4. rvqs_decompress_microkernel() — LZ │
|
||||
│ 5. Mount WASM runtime │
|
||||
│ 6. rvqs_get_primary_host_url() │
|
||||
│ 7. Stream layers → progressive recall │
|
||||
└─────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
User sees intelligence in <2 seconds
|
||||
Optional: "Get Full App" upgrade path
|
||||
```
|
||||
|
||||
The App Clip contains the compiled `rvf-runtime` as a static library linked via C FFI:
|
||||
|
||||
```swift
|
||||
// Swift calling the Rust FFI
|
||||
var header = RvqsHeaderC()
|
||||
let rc = rvqs_parse_header(qrData.baseAddress, qrData.count, &header)
|
||||
guard rc == RVQS_OK else { return }
|
||||
|
||||
let verifyRc = rvqs_verify_signature(qrData.baseAddress, qrData.count,
|
||||
key.baseAddress, key.count)
|
||||
guard verifyRc == RVQS_OK else { showError("Invalid signature"); return }
|
||||
```
|
||||
|
||||
**Build for iOS:**
|
||||
```bash
|
||||
cargo build --release --target aarch64-apple-ios --lib
|
||||
cargo build --release --target aarch64-apple-ios-sim --lib
|
||||
```
|
||||
|
||||
#### 5.2 Web App — Zero App Store
|
||||
|
||||
For zero App Store involvement, the QR URL opens a Progressive Web App:
|
||||
|
||||
```
|
||||
QR contains: https://brain.ruvector.ai/s/{seed-id}
|
||||
│
|
||||
▼
|
||||
Browser opens instantly (Safari, Chrome)
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────┐
|
||||
│ PWA with WASM RVF Loader │
|
||||
│ │
|
||||
│ 1. Fetch RVQS seed from URL path │
|
||||
│ 2. Parse + verify in WASM │
|
||||
│ 3. Decompress microkernel │
|
||||
│ 4. Stream layers via fetch() API │
|
||||
│ 5. Render intelligence in browser │
|
||||
│ │
|
||||
│ Works offline via Service Worker cache │
|
||||
└─────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Build for WASM:**
|
||||
```bash
|
||||
cargo build --release --target wasm32-unknown-unknown --lib
|
||||
```
|
||||
|
||||
The RVF loader compiles to ~50 KB WASM. Service Worker caches the loader + downloaded layers for offline use.
|
||||
|
||||
#### 5.3 Android
|
||||
|
||||
Same C FFI approach, compiled for NDK targets:
|
||||
|
||||
```bash
|
||||
cargo build --release --target aarch64-linux-android --lib
|
||||
```
|
||||
|
||||
Called from Kotlin via JNI or from a WebView with the WASM build.
|
||||
|
||||
#### 5.4 Delivery Comparison
|
||||
|
||||
| Method | Install Required | App Store Review | Boot Time | Offline |
|
||||
|--------|-----------------|-----------------|-----------|---------|
|
||||
| App Clip (iOS) | No | Yes (light) | <1s | Yes |
|
||||
| PWA / Web App | No | No | ~2s | Yes (SW) |
|
||||
| Android Instant App | No | Yes | <1s | Yes |
|
||||
| Full native app | Yes | Yes | N/A | Yes |
|
||||
|
||||
### 6. C FFI Reference
|
||||
|
||||
The following `extern "C"` functions are exported for mobile integration:
|
||||
|
||||
```c
|
||||
// Parse the 64-byte header from a QR seed payload.
|
||||
int rvqs_parse_header(const uint8_t* data, size_t data_len, RvqsHeaderC* out);
|
||||
|
||||
// Verify HMAC-SHA256 signature. Returns RVQS_OK (0) or error code.
|
||||
int rvqs_verify_signature(const uint8_t* data, size_t data_len,
|
||||
const uint8_t* key, size_t key_len);
|
||||
|
||||
// Verify content hash integrity. Returns RVQS_OK (0) or error code.
|
||||
int rvqs_verify_content_hash(const uint8_t* data, size_t data_len);
|
||||
|
||||
// Decompress the embedded microkernel into caller's buffer.
|
||||
int rvqs_decompress_microkernel(const uint8_t* data, size_t data_len,
|
||||
uint8_t* out, size_t out_cap, size_t* out_len);
|
||||
|
||||
// Extract the primary download URL from the manifest.
|
||||
int rvqs_get_primary_host_url(const uint8_t* data, size_t data_len,
|
||||
uint8_t* url_buf, size_t url_cap, size_t* url_len);
|
||||
```
|
||||
|
||||
**Error codes:**
|
||||
|
||||
| Code | Name | Meaning |
|
||||
|------|------|---------|
|
||||
| 0 | RVQS_OK | Success |
|
||||
| -1 | RVQS_ERR_NULL_PTR | Null pointer argument |
|
||||
| -2 | RVQS_ERR_TOO_SHORT | Payload smaller than header |
|
||||
| -3 | RVQS_ERR_BAD_MAGIC | Invalid RVQS magic bytes |
|
||||
| -4 | RVQS_ERR_SIGNATURE_INVALID | Signature verification failed |
|
||||
| -5 | RVQS_ERR_HASH_MISMATCH | Content hash doesn't match data |
|
||||
| -6 | RVQS_ERR_DECOMPRESS_FAIL | LZ decompression failed |
|
||||
| -7 | RVQS_ERR_BUFFER_TOO_SMALL | Caller's buffer too small |
|
||||
| -8 | RVQS_ERR_PARSE_FAIL | General parse failure |
|
||||
|
||||
### 7. Size Budget (Actual Measured)
|
||||
|
||||
```
|
||||
Component Raw Size Compressed In QR
|
||||
───────────────────────── ──────── ────────── ─────
|
||||
RVQS Header 64 B 64 B 64 B
|
||||
WASM Microkernel 5,500 B 2,095 B 2,095 B
|
||||
Download Manifest (2 hosts) 496 B 496 B 496 B
|
||||
HMAC-SHA256 Signature 32 B 32 B 32 B
|
||||
───────────────────────── ──────── ────────── ─────
|
||||
Total (measured) 2,687 B
|
||||
|
||||
QR Version 40, Low EC capacity: 2,953 B
|
||||
Remaining headroom: 266 B
|
||||
```
|
||||
|
||||
### 8. Example Use Cases
|
||||
|
||||
#### 8.1 Business Card Brain
|
||||
|
||||
Print a QR code on a business card. Scan it to mount a personal AI assistant that knows your work, your papers, your projects. Offline-first. When connected, it streams your full knowledge base.
|
||||
|
||||
#### 8.2 Medical Record Seed
|
||||
|
||||
A QR code on a patient wristband contains a signed seed pointing to their medical vector index. Scan to query drug interactions, allergies, treatment history. Works offline in the ER.
|
||||
|
||||
#### 8.3 Firmware Intelligence
|
||||
|
||||
Embedded in a product's QR code: a cognitive seed that can diagnose problems, suggest fixes, and stream updated knowledge from the manufacturer.
|
||||
|
||||
#### 8.4 Paper Backup
|
||||
|
||||
Print your AI's seed on paper. Store it in a safe. In a disaster, scan the paper and your AI bootstraps from printed ink. The signature proves it's yours.
|
||||
|
||||
#### 8.5 Conference Badge
|
||||
|
||||
NFC/QR on a conference badge. Tap to mount the speaker's research brain. Walk around, scan badges, collect intelligences. Each one is signed by the speaker.
|
||||
|
||||
---
|
||||
|
||||
## Implementation
|
||||
|
||||
### Files
|
||||
|
||||
| File | Lines | Purpose |
|
||||
|------|-------|---------|
|
||||
| `rvf-types/src/sha256.rs` | 230 | Pure SHA-256 (FIPS 180-4) + HMAC-SHA256 (RFC 2104) |
|
||||
| `rvf-types/src/qr_seed.rs` | 370 | RVQS wire format types, SeedHeader, LayerEntry, HostEntry |
|
||||
| `rvf-runtime/src/compress.rs` | 210 | LZ77 compression (SCF-1 format, 4KB window) |
|
||||
| `rvf-runtime/src/seed_crypto.rs` | 100 | Sign, verify, content/layer hashing |
|
||||
| `rvf-runtime/src/qr_seed.rs` | 1050 | SeedBuilder, ParsedSeed, TLV manifest, bootstrap progress |
|
||||
| `rvf-runtime/src/ffi.rs` | 240 | C FFI for App Clip / mobile (5 exported functions) |
|
||||
| `rvf-runtime/examples/qr_seed_bootstrap.rs` | 250 | Full demo: build → sign → parse → verify → decompress → bootstrap |
|
||||
| `rvf-runtime/tests/qr_seed_e2e.rs` | 220 | 11 end-to-end integration tests |
|
||||
|
||||
### Test Coverage
|
||||
|
||||
| Module | Tests | Verified Against |
|
||||
|--------|-------|-----------------|
|
||||
| SHA-256 | 11 | NIST FIPS 180-4 test vectors |
|
||||
| HMAC-SHA256 | 3 | RFC 4231 test cases 1, 2, 5 |
|
||||
| LZ compression | 11 | Round-trip + WASM patterns |
|
||||
| Seed crypto | 10 | Sign/verify/tamper detection |
|
||||
| QR seed types | 9 | Header round-trip, flags, magic |
|
||||
| QR seed runtime | 12 | Builder, parser, manifest TLV |
|
||||
| C FFI | 7 | Parse, verify, decompress, URL extract |
|
||||
| E2E integration | 11 | Full pipeline with real crypto |
|
||||
|
||||
**Total: 74 QR seed tests, all passing.**
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Intelligence becomes **physically portable** — printed on paper, etched in metal, tattooed on skin
|
||||
- **Zero external dependencies** — SHA-256, HMAC, LZ compression, FFI all built from scratch
|
||||
- **Mobile-first** — App Clip (iOS), PWA (web), Instant App (Android) all supported via C FFI
|
||||
- **Offline-first** by design — the seed is useful before any network access
|
||||
- **Cryptographically verified** — HMAC-SHA256 signatures with constant-time comparison
|
||||
- **Progressive loading** — first query at 50% recall after 6% download
|
||||
- **Self-upgrading** — a seed can re-export itself with new knowledge
|
||||
|
||||
### Negative
|
||||
|
||||
- QR capacity limits seed size to ~2,900 bytes
|
||||
- HMAC-SHA256 requires a shared secret (symmetric); for asymmetric signatures, add `ed25519-dalek` as optional dep
|
||||
- Download manifest URLs have finite TTL — seeds expire unless hosts are stable
|
||||
- Built-in LZ compression is simpler than Brotli (~1.4-2.5x vs ~3-4x ratio)
|
||||
|
||||
### Migration
|
||||
|
||||
- Existing RVF files can generate QR seeds via `SeedBuilder::new().compress_microkernel(&wasm).build_and_sign(key)`
|
||||
- QR seeds bootstrap into standard RVF files — no special runtime needed
|
||||
- Seeds are forward-compatible: unknown TLV tags are ignored by older runtimes
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- QR Code Specification: ISO/IEC 18004:2015
|
||||
- SHA-256: FIPS 180-4
|
||||
- HMAC: RFC 2104
|
||||
- HMAC-SHA256 Test Vectors: RFC 4231
|
||||
- Apple App Clips: developer.apple.com/app-clips
|
||||
- RVF Spec 02: Manifest System (Level 0 / Level 1)
|
||||
- RVF Spec 11: WASM Self-Bootstrapping
|
||||
- ADR-029: RVF Canonical Format
|
||||
- ADR-030: Cognitive Container
|
||||
- ADR-033: Progressive Indexing Hardening
|
||||
Reference in New Issue
Block a user