# RVF: RuVector Format

## A Living, Self-Reorganizing Runtime Substrate for Vector Intelligence

---

### Document Index

#### Core Specification (`spec/`)

| # | Document | Description |
|---|----------|-------------|
| 00 | [Overview](spec/00-overview.md) | The Four Laws, design coordinates, philosophy |
| 01 | [Segment Model](spec/01-segment-model.md) | Append-only segments, headers, lifecycle, multi-file |
| 02 | [Manifest System](spec/02-manifest-system.md) | Two-level manifests, hotset pointers, progressive boot |
| 03 | [Temperature Tiering](spec/03-temperature-tiering.md) | Adaptive layout, access sketches, promotion/demotion |
| 04 | [Progressive Indexing](spec/04-progressive-indexing.md) | Layer A/B/C availability, lazy build, partial search |
| 05 | [Overlay Epochs](spec/05-overlay-epochs.md) | Streaming min-cut, epoch boundaries, rollback |
| 06 | [Query Optimization](spec/06-query-optimization.md) | SIMD alignment, prefetch, varint IDs, cache analysis |
| 07 | [Deletion & Lifecycle](spec/07-deletion-lifecycle.md) | Vector deletion, JOURNAL_SEG wire format, deletion bitmaps, compaction |
| 08 | [Filtered Search](spec/08-filtered-search.md) | META_SEG wire format, filter expressions, metadata indexes |
| 09 | [Concurrency & Versioning](spec/09-concurrency-versioning.md) | Writer locking, reader-writer coordination, space reclamation |
| 10 | [Operations API](spec/10-operations-api.md) | Batch ops, error codes, network streaming, compaction scheduling |

#### Wire Format (`wire/`)

| Document | Description |
|----------|-------------|
| [Binary Layout](wire/binary-layout.md) | Byte-level format reference, all segment payloads |

#### WASM Microkernel (`microkernel/`)

| Document | Description |
|----------|-------------|
| [WASM Runtime](microkernel/wasm-runtime.md) | Cognitum tile mapping, 14 exports, hub-tile protocol |

#### Domain Profiles (`profiles/`)

| Document | Description |
|----------|-------------|
| [Domain Profiles](profiles/domain-profiles.md) | RVDNA, RVText, RVGraph, RVVision specifications |

#### Cryptography (`crypto/`)

| Document | Description |
|----------|-------------|
| [Quantum Signatures](crypto/quantum-signatures.md) | ML-DSA-65, SHAKE-256, hybrid encryption, witnesses |

#### Benchmarks (`benchmarks/`)

| Document | Description |
|----------|-------------|
| [Acceptance Tests](benchmarks/acceptance-tests.md) | Performance targets, crash safety, scalability |

---

### Quick Reference

**The Four Laws**
1. Truth lives at the tail
2. Every segment is independently valid
3. Data and state are separated
4. The format adapts to its workload

**Minimal Upgrade Path** (smallest changes that unlock everything)
1. Add tail manifest segments
2. Make every payload a segment with its own hash and length
3. Add hotset pointers in the manifest
4. Add an epoch overlay model
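
Steps 1 to 3 of the upgrade path can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not the RVF wire format (see `wire/binary-layout.md` for that): the 41-byte header layout, the `SEG_DATA`/`SEG_MANIFEST` type tags, the 8-byte tail pointer, and SHA-256 as a stand-in hash. It shows each payload framed with its own length and hash, and a reader booting from the tail by following the last bytes to the newest manifest, which also carries hotset pointers. The epoch overlay model (step 4) is not covered.

```python
import hashlib
import struct

# Hypothetical framing, NOT the RVF wire format:
# header = type (1 byte) + payload length (8 bytes) + SHA-256 of payload (32 bytes)
HEADER = struct.Struct("<BQ32s")
TAIL_POINTER = struct.Struct("<Q")  # offset of the newest manifest segment
SEG_DATA, SEG_MANIFEST = 1, 2       # assumed type tags


def append_segment(buf: bytearray, seg_type: int, payload: bytes) -> int:
    """Append one self-describing segment and return its start offset."""
    offset = len(buf)
    buf += HEADER.pack(seg_type, len(payload), hashlib.sha256(payload).digest())
    buf += payload
    return offset


def read_segment(buf, offset: int):
    """Read and verify the segment at `offset` using only its own header."""
    seg_type, length, digest = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    payload = bytes(buf[start:start + length])
    if len(payload) != length or hashlib.sha256(payload).digest() != digest:
        raise ValueError(f"corrupt segment at offset {offset}")
    return seg_type, payload


def commit_manifest(buf: bytearray, data_offsets, hotset) -> None:
    """Append a manifest (live segments + hotset pointers), then a tail pointer."""
    body = struct.pack(f"<II{len(data_offsets)}Q{len(hotset)}Q",
                       len(data_offsets), len(hotset), *data_offsets, *hotset)
    manifest_offset = append_segment(buf, SEG_MANIFEST, body)
    buf += TAIL_POINTER.pack(manifest_offset)


def load_latest_manifest(buf):
    """Boot from the tail: follow the last 8 bytes to the newest manifest."""
    (manifest_offset,) = TAIL_POINTER.unpack_from(buf, len(buf) - TAIL_POINTER.size)
    seg_type, body = read_segment(buf, manifest_offset)
    if seg_type != SEG_MANIFEST:
        raise ValueError("tail pointer does not reference a manifest")
    n_data, n_hot = struct.unpack_from("<II", body, 0)
    offsets = struct.unpack_from(f"<{n_data + n_hot}Q", body, 8)
    return list(offsets[:n_data]), list(offsets[n_data:])
```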

**Hardware Profiles**
- **Core**: 8 KB code + 8 KB data + 64 KB SIMD (Cognitum tile)
- **Hot**: Multi-tile chip with shared memory
- **Full**: Desktop/server with mmap and full feature set

**Key Numbers**
- Boot: 4 KB read, < 5 ms
- First query: <= 4 MB read, recall >= 0.70
- Full quality: recall >= 0.95
- Signing: ML-DSA-65, 3,309 B signatures, ~4,500 sign/s
- Distance: 384-dim fp16 L2 in ~12 AVX-512 cycles
- Hot entry: 960 bytes (vector + 16 neighbors, cache-line aligned)
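
Several of the figures above can be cross-checked with simple arithmetic. The sketch below assumes, without confirmation from this index, that a hot entry stores one 384-dim fp16 vector and that a cache line is 64 bytes. How the bytes left over after the vector are divided between neighbor data and any per-entry header is not stated here, so the per-neighbor figure is only an upper bound.

```python
DIMS = 384
BYTES_PER_FP16 = 2
ZMM_BYTES = 64        # one 512-bit AVX-512 register
CACHE_LINE = 64       # assumed cache-line size in bytes
HOT_ENTRY = 960       # bytes, from the list above
NEIGHBORS = 16
SIGN_RATE = 4500      # signatures per second, approximate

vector_bytes = DIMS * BYTES_PER_FP16          # 768 bytes of vector data
zmm_registers = vector_bytes // ZMM_BYTES     # 12 full 512-bit registers per vector
remaining = HOT_ENTRY - vector_bytes          # 192 bytes for neighbors (and any header)
max_per_neighbor = remaining // NEIGHBORS     # at most 12 bytes per neighbor
cache_lines = HOT_ENTRY // CACHE_LINE         # 15 cache lines
aligned = HOT_ENTRY % CACHE_LINE == 0         # True: 960 is a multiple of 64
sign_latency_us = 1_000_000 / SIGN_RATE       # roughly 222 microseconds per signature

print(vector_bytes, zmm_registers, remaining, max_per_neighbor, cache_lines, aligned)
```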

**Design Choices**
- Append-only + compaction (not random writes)
- Both mmap desktop and microcontroller tiles
- Priority: streamable > progressive > adaptive > p95 speed
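
The first design choice, append-only plus compaction, can be sketched as a toy in-memory model. The class `AppendOnlyStore` and its record shape are hypothetical and only illustrate the idea: updates and deletes are new records rather than in-place writes, and compaction copies the surviving records into a fresh log. The real mechanisms are described in `spec/07-deletion-lifecycle.md`.

```python
class AppendOnlyStore:
    """Toy model: records are only ever appended, never edited in place."""

    def __init__(self):
        self.log = []  # list of (op, vec_id, vector-or-None) records

    def put(self, vec_id, vector):
        # An update is just a newer "put" record for the same id.
        self.log.append(("put", vec_id, tuple(vector)))

    def delete(self, vec_id):
        # A delete is a tombstone record, not an in-place erase.
        self.log.append(("del", vec_id, None))

    def live(self):
        """Replay the log in order; later records override earlier ones."""
        state = {}
        for op, vec_id, vector in self.log:
            if op == "put":
                state[vec_id] = vector
            else:
                state.pop(vec_id, None)
        return state

    def compact(self):
        """Copy surviving records into a fresh store; the old log can then be dropped."""
        fresh = AppendOnlyStore()
        for vec_id, vector in self.live().items():
            fresh.put(vec_id, vector)
        return fresh
```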