Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

ruv
2026-02-28 14:39:40 -05:00
7854 changed files with 3522914 additions and 0 deletions


@@ -0,0 +1,87 @@
# RVF: RuVector Format
## A Living, Self-Reorganizing Runtime Substrate for Vector Intelligence
---
### Document Index
#### Core Specification (`spec/`)
| # | Document | Description |
|---|----------|-------------|
| 00 | [Overview](spec/00-overview.md) | The Four Laws, design coordinates, philosophy |
| 01 | [Segment Model](spec/01-segment-model.md) | Append-only segments, headers, lifecycle, multi-file |
| 02 | [Manifest System](spec/02-manifest-system.md) | Two-level manifests, hotset pointers, progressive boot |
| 03 | [Temperature Tiering](spec/03-temperature-tiering.md) | Adaptive layout, access sketches, promotion/demotion |
| 04 | [Progressive Indexing](spec/04-progressive-indexing.md) | Layer A/B/C availability, lazy build, partial search |
| 05 | [Overlay Epochs](spec/05-overlay-epochs.md) | Streaming min-cut, epoch boundaries, rollback |
| 06 | [Query Optimization](spec/06-query-optimization.md) | SIMD alignment, prefetch, varint IDs, cache analysis |
| 07 | [Deletion & Lifecycle](spec/07-deletion-lifecycle.md) | Vector deletion, JOURNAL_SEG wire format, deletion bitmaps, compaction |
| 08 | [Filtered Search](spec/08-filtered-search.md) | META_SEG wire format, filter expressions, metadata indexes |
| 09 | [Concurrency & Versioning](spec/09-concurrency-versioning.md) | Writer locking, reader-writer coordination, space reclamation |
| 10 | [Operations API](spec/10-operations-api.md) | Batch ops, error codes, network streaming, compaction scheduling |
#### Wire Format (`wire/`)
| Document | Description |
|----------|-------------|
| [Binary Layout](wire/binary-layout.md) | Byte-level format reference, all segment payloads |
#### WASM Microkernel (`microkernel/`)
| Document | Description |
|----------|-------------|
| [WASM Runtime](microkernel/wasm-runtime.md) | Cognitum tile mapping, 14 exports, hub-tile protocol |
#### Domain Profiles (`profiles/`)
| Document | Description |
|----------|-------------|
| [Domain Profiles](profiles/domain-profiles.md) | RVDNA, RVText, RVGraph, RVVision specifications |
#### Cryptography (`crypto/`)
| Document | Description |
|----------|-------------|
| [Quantum Signatures](crypto/quantum-signatures.md) | ML-DSA-65, SHAKE-256, hybrid encryption, witnesses |
#### Benchmarks (`benchmarks/`)
| Document | Description |
|----------|-------------|
| [Acceptance Tests](benchmarks/acceptance-tests.md) | Performance targets, crash safety, scalability |
---
### Quick Reference
**The Four Laws**
1. Truth lives at the tail
2. Every segment is independently valid
3. Data and state are separated
4. The format adapts to its workload

**Minimal Upgrade Path** (smallest changes that unlock everything)
1. Add tail manifest segments
2. Make every payload a segment with its own hash and length
3. Add hotset pointers in the manifest
4. Add an epoch overlay model
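
Laws 1 and 2, together with the first two upgrade steps, can be illustrated with a small sketch. The header layout, the `SEG0` magic, the type codes, and all function names below are assumptions made for illustration only; the actual wire format is defined in `wire/binary-layout.md`. For brevity the sketch scans forward and keeps the last valid manifest, whereas a real reader would more likely locate the manifest from the tail.

```python
import hashlib
import struct

# Hypothetical header, NOT the RVF wire format:
# 4-byte magic, 1-byte segment type, 4-byte payload length, 16-byte payload hash.
HEADER = struct.Struct("<4sBI16s")
MAGIC = b"SEG0"
SEG_DATA, SEG_MANIFEST = 1, 2


def payload_hash(payload: bytes) -> bytes:
    """Hash a payload to 16 bytes (SHAKE-256 chosen here as an assumption)."""
    return hashlib.shake_256(payload).digest(16)


def append_segment(log: bytearray, seg_type: int, payload: bytes) -> int:
    """Append one self-describing segment and return its start offset."""
    offset = len(log)
    log += HEADER.pack(MAGIC, seg_type, len(payload), payload_hash(payload))
    log += payload
    return offset


def scan_segments(log: bytes):
    """Yield (offset, type, payload) for each segment that validates on its own."""
    pos = 0
    while pos + HEADER.size <= len(log):
        magic, seg_type, length, digest = HEADER.unpack_from(log, pos)
        start = pos + HEADER.size
        payload = bytes(log[start:start + length])
        if magic != MAGIC or len(payload) != length or payload_hash(payload) != digest:
            break  # torn or corrupt tail: stop and keep everything before it
        yield pos, seg_type, payload
        pos = start + length


def latest_manifest(log: bytes):
    """Return the payload of the last valid manifest segment, or None."""
    latest = None
    for _, seg_type, payload in scan_segments(log):
        if seg_type == SEG_MANIFEST:
            latest = payload
    return latest
```

Because every segment carries its own length and hash, a write that is cut off mid-segment fails validation on its own, and the reader falls back to the most recent manifest that did validate.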

**Hardware Profiles**
- **Core**: 8 KB code + 8 KB data + 64 KB SIMD (Cognitum tile)
- **Hot**: Multi-tile chip with shared memory
- **Full**: Desktop/server with mmap and full feature set

**Key Numbers**
- Boot: 4 KB read, < 5 ms
- First query: <= 4 MB read, recall >= 0.70
- Full quality: recall >= 0.95
- Signing: ML-DSA-65, 3,309 B signatures, ~4,500 sign/s
- Distance: 384-dim fp16 L2 in ~12 AVX-512 cycles
- Hot entry: 960 bytes (vector + 16 neighbors, cache-line aligned)
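
The distance and hot-entry figures can be cross-checked with some arithmetic. In the sketch below, the 64-byte cache line, the even split of the non-vector bytes across neighbors, and the helper name `l2_fp16` are assumptions; the spec documents define the real entry layout. The scalar function only shows what a 384-dim fp16 L2 distance computes and says nothing about how the SIMD kernel is written.

```python
import math
import struct

DIMS = 384        # from the Key Numbers list
FP16_BYTES = 2    # IEEE 754 half precision
ZMM_BYTES = 64    # one 512-bit AVX-512 register
CACHE_LINE = 64   # assumed cache-line size in bytes
HOT_ENTRY = 960   # from the Key Numbers list
NEIGHBORS = 16    # from the Key Numbers list

vector_bytes = DIMS * FP16_BYTES                 # 768 bytes per vector
zmm_chunks = vector_bytes // ZMM_BYTES           # 12 register-sized chunks
cache_lines = HOT_ENTRY // CACHE_LINE            # 15 whole cache lines
non_vector_bytes = HOT_ENTRY - vector_bytes      # 192 bytes left for everything else
neighbor_budget = non_vector_bytes // NEIGHBORS  # at most 12 bytes per neighbor


def l2_fp16(a: bytes, b: bytes) -> float:
    """Scalar reference L2 distance between two packed little-endian fp16 vectors."""
    n = len(a) // FP16_BYTES
    xs = struct.unpack(f"<{n}e", a)
    ys = struct.unpack(f"<{n}e", b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))
```

A 384-dim fp16 vector fills exactly 12 AVX-512 registers, which is consistent with the ~12-cycle figure if roughly one 512-bit chunk is processed per cycle; that reading is an inference from the numbers, not a statement taken from the spec.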

**Design Choices**
- Append-only + compaction (not random writes)
- Targets both desktop/server hosts with mmap and microcontroller tiles
- Priority: streamable > progressive > adaptive > p95 speed
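
The first design choice, append-only writes plus compaction, can be sketched generically. The record shape and the function names below are hypothetical and deliberately simplified; RVF's actual deletion and compaction mechanics (JOURNAL_SEG, deletion bitmaps) are described in `spec/07-deletion-lifecycle.md`.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical record shape for illustration: (op, vector_id, values).
Record = Tuple[str, int, Tuple[float, ...]]


def put(log: List[Record], vid: int, values: Sequence[float]) -> None:
    """Record an insert or update by appending; earlier records are never modified."""
    log.append(("put", vid, tuple(values)))


def delete(log: List[Record], vid: int) -> None:
    """Record a deletion as an appended tombstone instead of erasing in place."""
    log.append(("del", vid, ()))


def live_view(log: List[Record]) -> Dict[int, Tuple[float, ...]]:
    """Replay the log in order; later records override earlier ones."""
    live: Dict[int, Tuple[float, ...]] = {}
    for op, vid, values in log:
        if op == "put":
            live[vid] = values
        else:
            live.pop(vid, None)
    return live


def compact(log: List[Record]) -> List[Record]:
    """Build a new log holding only the live records; the old log is left untouched."""
    return [("put", vid, values) for vid, values in live_view(log).items()]
```

Compaction produces a fresh log rather than editing the old one, so readers of the old log are unaffected until they switch over, and the write path never needs random writes.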