git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
185 lines
10 KiB
Markdown
185 lines
10 KiB
Markdown
# Delta-Behavior Architecture Decision Records
|
|
|
|
This directory contains the Architecture Decision Records (ADRs) for implementing Delta-Behavior in RuVector - a delta-first approach to incremental vector updates.
|
|
|
|
## Overview
|
|
|
|
Delta-Behavior transforms RuVector into a **delta-first vector database** where all updates are expressed as incremental changes (deltas) rather than full vector replacements. This approach provides:
|
|
|
|
- **10-100x bandwidth reduction** for sparse updates
|
|
- **Full temporal history** with point-in-time queries
|
|
- **CRDT-based conflict resolution** for concurrent updates
|
|
- **Lazy index repair** with quality bounds
|
|
- **Multi-tier compression** (5-50x storage reduction)
|
|
|
|
## ADR Index
|
|
|
|
| ADR | Title | Status | Summary |
|
|
|-----|-------|--------|---------|
|
|
| [ADR-DB-001](ADR-DB-001-delta-behavior-core-architecture.md) | Delta Behavior Core Architecture | Proposed | Delta-first architecture with layered composition |
|
|
| [ADR-DB-002](ADR-DB-002-delta-encoding-format.md) | Delta Encoding Format | Proposed | Hybrid sparse-dense with adaptive switching |
|
|
| [ADR-DB-003](ADR-DB-003-delta-propagation-protocol.md) | Delta Propagation Protocol | Proposed | Reactive push with backpressure |
|
|
| [ADR-DB-004](ADR-DB-004-delta-conflict-resolution.md) | Delta Conflict Resolution | Proposed | CRDT-based with causal ordering |
|
|
| [ADR-DB-005](ADR-DB-005-delta-index-updates.md) | Delta Index Updates | Proposed | Lazy repair with quality bounds |
|
|
| [ADR-DB-006](ADR-DB-006-delta-compression-strategy.md) | Delta Compression Strategy | Proposed | Multi-tier compression pipeline |
|
|
| [ADR-DB-007](ADR-DB-007-delta-temporal-windows.md) | Delta Temporal Windows | Proposed | Adaptive windows with compaction |
|
|
| [ADR-DB-008](ADR-DB-008-delta-wasm-integration.md) | Delta WASM Integration | Proposed | Component model with shared memory |
|
|
| [ADR-DB-009](ADR-DB-009-delta-observability.md) | Delta Observability | Proposed | Delta lineage tracking with OpenTelemetry |
|
|
| [ADR-DB-010](ADR-DB-010-delta-security-model.md) | Delta Security Model | Proposed | Signed deltas with capability tokens |
|
|
|
|
## Architecture Diagram
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────────────────────┐
|
|
│ DELTA-BEHAVIOR ARCHITECTURE │
|
|
└─────────────────────────────────────────────────────────────────────────────┘
|
|
|
|
┌───────────────┐
|
|
│ Delta API │ ADR-001
|
|
│ (apply, get, │
|
|
│ rollback) │
|
|
└───────┬───────┘
|
|
│
|
|
┌─────────────────┼─────────────────┐
|
|
│ │ │
|
|
v v v
|
|
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
|
|
│ Security │ │ Propagation │ │ Observability │
|
|
│ (signed, │ │ (reactive, │ │ (lineage, │
|
|
│ capability) │ │ backpressure)│ │ tracing) │
|
|
│ ADR-010 │ │ ADR-003 │ │ ADR-009 │
|
|
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
|
|
│ │ │
|
|
└─────────────────┼─────────────────┘
|
|
│
|
|
┌───────v───────┐
|
|
│ Conflict │ ADR-004
|
|
│ Resolution │
|
|
│ (CRDT, VC) │
|
|
└───────┬───────┘
|
|
│
|
|
┌─────────────────┼─────────────────┐
|
|
│ │ │
|
|
v v v
|
|
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
|
|
│ Encoding │ │ Temporal │ │ Index │
|
|
│ (sparse/ │ │ Windows │ │ Updates │
|
|
│ dense/RLE) │ │ (adaptive) │ │ (lazy repair) │
|
|
│ ADR-002 │ │ ADR-007 │ │ ADR-005 │
|
|
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
|
|
│ │ │
|
|
└─────────────────┼─────────────────┘
|
|
│
|
|
┌─────────────────┼─────────────────┐
|
|
│ │ │
|
|
v v v
|
|
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
|
|
│ Compression │ │ WASM │ │ Storage │
|
|
│ (LZ4/Zstd/ │ │ Integration │ │ Layer │
|
|
│ quantize) │ │ (component │ │ (delta log, │
|
|
│ ADR-006 │ │ model) │ │ checkpoint) │
|
|
│ │ │ ADR-008 │ │ ADR-001 │
|
|
└───────────────┘ └───────────────┘ └───────────────┘
|
|
```
|
|
|
|
## Key Design Decisions
|
|
|
|
### 1. Delta-First Storage (ADR-001)
|
|
All mutations are stored as deltas. Full vectors are materialized on-demand by composing delta chains. Checkpoints provide optimization points for composition.
|
|
|
|
### 2. Hybrid Encoding (ADR-002)
|
|
Automatic selection between sparse, dense, RLE, and dictionary encoding based on delta characteristics. Achieves 1-10x encoding-level compression.
|
|
|
|
### 3. Reactive Propagation (ADR-003)
|
|
Push-based delta distribution with explicit backpressure. Causal ordering via vector clocks ensures consistency.
|
|
|
|
### 4. CRDT Merging (ADR-004)
|
|
Per-dimension version tracking with configurable conflict resolution strategies (LWW, max, average, custom).
|
|
|
|
### 5. Lazy Index Repair (ADR-005)
|
|
Index updates are deferred until quality degrades below bounds. Background repair maintains recall targets.
|
|
|
|
### 6. Multi-Tier Compression (ADR-006)
|
|
Encoding -> Quantization -> Entropy coding -> Batch optimization. Achieves 5-50x total compression.
|
|
|
|
### 7. Adaptive Windows (ADR-007)
|
|
Dynamic window sizing based on load. Automatic compaction reduces long-term storage.
|
|
|
|
### 8. WASM Component Model (ADR-008)
|
|
Clean interface contracts for browser deployment. Shared memory patterns for high-throughput scenarios.
|
|
|
|
### 9. Lineage Tracking (ADR-009)
|
|
Full delta provenance with OpenTelemetry integration. Point-in-time reconstruction and blame queries.
|
|
|
|
### 10. Signed Deltas (ADR-010)
|
|
Ed25519 signatures for integrity. Capability tokens for fine-grained authorization.
|
|
|
|
## Performance Targets
|
|
|
|
| Metric | Target | Notes |
|
|
|--------|--------|-------|
|
|
| Delta application | < 50us | Faster than full write |
|
|
| Composition (100 deltas) | < 1ms | With checkpoint |
|
|
| Network reduction (sparse) | > 10x | For <10% dimension changes |
|
|
| Storage compression | 5-50x | With full pipeline |
|
|
| Index recall degradation | < 5% | With lazy repair |
|
|
| Security overhead | < 100us | Signature verification |
|
|
|
|
## Implementation Phases
|
|
|
|
### Phase 1: Core Infrastructure
|
|
- Delta types and storage (ADR-001)
|
|
- Basic encoding (ADR-002)
|
|
- Simple checkpointing
|
|
|
|
### Phase 2: Distribution
|
|
- Propagation protocol (ADR-003)
|
|
- Conflict resolution (ADR-004)
|
|
- Causal ordering
|
|
|
|
### Phase 3: Index Integration
|
|
- Lazy repair (ADR-005)
|
|
- Quality monitoring
|
|
- Incremental HNSW
|
|
|
|
### Phase 4: Optimization
|
|
- Multi-tier compression (ADR-006)
|
|
- Temporal windows (ADR-007)
|
|
- Adaptive policies
|
|
|
|
### Phase 5: Platform
|
|
- WASM integration (ADR-008)
|
|
- Observability (ADR-009)
|
|
- Security model (ADR-010)
|
|
|
|
## Dependencies
|
|
|
|
| Component | Crate | Purpose |
|
|
|-----------|-------|---------|
|
|
| Signatures | `ed25519-dalek` | Delta signing |
|
|
| Compression | `lz4_flex`, `zstd` | Entropy coding |
|
|
| Tracing | `opentelemetry` | Observability |
|
|
| Async | `tokio` | Propagation |
|
|
| Serialization | `bincode`, `serde` | Wire format |
|
|
|
|
## Related ADRs
|
|
|
|
- **ADR-001**: Ruvector Core Architecture
|
|
- **ADR-CE-002**: Incremental Coherence Computation
|
|
- **ADR-005**: WASM Runtime Integration
|
|
- **ADR-007**: Security Review & Technical Debt
|
|
|
|
## References
|
|
|
|
1. Shapiro, M., et al. "Conflict-free Replicated Data Types." SSS 2011.
|
|
2. Kleppmann, M. "Designing Data-Intensive Applications." O'Reilly, 2017.
|
|
3. Malkov, Y., & Yashunin, D. "Efficient and robust approximate nearest neighbor search using HNSW graphs."
|
|
4. OpenTelemetry Specification. https://opentelemetry.io/docs/specs/
|
|
5. WebAssembly Component Model. https://component-model.bytecodealliance.org/
|
|
|
|
---
|
|
|
|
**Authors**: RuVector Architecture Team
|
|
**Date**: 2026-01-28
|
|
**Status**: Proposed
|