Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'
This commit is contained in:
184
vendor/ruvector/docs/adr/delta-behavior/README.md
vendored
Normal file
184
vendor/ruvector/docs/adr/delta-behavior/README.md
vendored
Normal file
@@ -0,0 +1,184 @@
|
||||
# Delta-Behavior Architecture Decision Records
|
||||
|
||||
This directory contains the Architecture Decision Records (ADRs) for implementing Delta-Behavior in RuVector - a delta-first approach to incremental vector updates.
|
||||
|
||||
## Overview
|
||||
|
||||
Delta-Behavior transforms RuVector into a **delta-first vector database** where all updates are expressed as incremental changes (deltas) rather than full vector replacements. This approach provides:
|
||||
|
||||
- **10-100x bandwidth reduction** for sparse updates
|
||||
- **Full temporal history** with point-in-time queries
|
||||
- **CRDT-based conflict resolution** for concurrent updates
|
||||
- **Lazy index repair** with quality bounds
|
||||
- **Multi-tier compression** (5-50x storage reduction)
|
||||
|
||||
## ADR Index
|
||||
|
||||
| ADR | Title | Status | Summary |
|
||||
|-----|-------|--------|---------|
|
||||
| [ADR-DB-001](ADR-DB-001-delta-behavior-core-architecture.md) | Delta Behavior Core Architecture | Proposed | Delta-first architecture with layered composition |
|
||||
| [ADR-DB-002](ADR-DB-002-delta-encoding-format.md) | Delta Encoding Format | Proposed | Hybrid sparse-dense with adaptive switching |
|
||||
| [ADR-DB-003](ADR-DB-003-delta-propagation-protocol.md) | Delta Propagation Protocol | Proposed | Reactive push with backpressure |
|
||||
| [ADR-DB-004](ADR-DB-004-delta-conflict-resolution.md) | Delta Conflict Resolution | Proposed | CRDT-based with causal ordering |
|
||||
| [ADR-DB-005](ADR-DB-005-delta-index-updates.md) | Delta Index Updates | Proposed | Lazy repair with quality bounds |
|
||||
| [ADR-DB-006](ADR-DB-006-delta-compression-strategy.md) | Delta Compression Strategy | Proposed | Multi-tier compression pipeline |
|
||||
| [ADR-DB-007](ADR-DB-007-delta-temporal-windows.md) | Delta Temporal Windows | Proposed | Adaptive windows with compaction |
|
||||
| [ADR-DB-008](ADR-DB-008-delta-wasm-integration.md) | Delta WASM Integration | Proposed | Component model with shared memory |
|
||||
| [ADR-DB-009](ADR-DB-009-delta-observability.md) | Delta Observability | Proposed | Delta lineage tracking with OpenTelemetry |
|
||||
| [ADR-DB-010](ADR-DB-010-delta-security-model.md) | Delta Security Model | Proposed | Signed deltas with capability tokens |
|
||||
|
||||
## Architecture Diagram
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ DELTA-BEHAVIOR ARCHITECTURE │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌───────────────┐
|
||||
│ Delta API │ ADR-001
|
||||
│ (apply, get, │
|
||||
│ rollback) │
|
||||
└───────┬───────┘
|
||||
│
|
||||
┌─────────────────┼─────────────────┐
|
||||
│ │ │
|
||||
v v v
|
||||
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
|
||||
│ Security │ │ Propagation │ │ Observability │
|
||||
│ (signed, │ │ (reactive, │ │ (lineage, │
|
||||
│ capability) │ │ backpressure)│ │ tracing) │
|
||||
│ ADR-010 │ │ ADR-003 │ │ ADR-009 │
|
||||
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
|
||||
│ │ │
|
||||
└─────────────────┼─────────────────┘
|
||||
│
|
||||
┌───────v───────┐
|
||||
│ Conflict │ ADR-004
|
||||
│ Resolution │
|
||||
│ (CRDT, VC) │
|
||||
└───────┬───────┘
|
||||
│
|
||||
┌─────────────────┼─────────────────┐
|
||||
│ │ │
|
||||
v v v
|
||||
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
|
||||
│ Encoding │ │ Temporal │ │ Index │
|
||||
│ (sparse/ │ │ Windows │ │ Updates │
|
||||
│ dense/RLE) │ │ (adaptive) │ │ (lazy repair) │
|
||||
│ ADR-002 │ │ ADR-007 │ │ ADR-005 │
|
||||
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
|
||||
│ │ │
|
||||
└─────────────────┼─────────────────┘
|
||||
│
|
||||
┌─────────────────┼─────────────────┐
|
||||
│ │ │
|
||||
v v v
|
||||
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
|
||||
│ Compression │ │ WASM │ │ Storage │
|
||||
│ (LZ4/Zstd/ │ │ Integration │ │ Layer │
|
||||
│ quantize) │ │ (component │ │ (delta log, │
|
||||
│ ADR-006 │ │ model) │ │ checkpoint) │
|
||||
│ │ │ ADR-008 │ │ ADR-001 │
|
||||
└───────────────┘ └───────────────┘ └───────────────┘
|
||||
```
|
||||
|
||||
## Key Design Decisions
|
||||
|
||||
### 1. Delta-First Storage (ADR-001)
|
||||
All mutations are stored as deltas. Full vectors are materialized on-demand by composing delta chains. Checkpoints provide optimization points for composition.
|
||||
|
||||
### 2. Hybrid Encoding (ADR-002)
|
||||
Automatic selection between sparse, dense, RLE, and dictionary encoding based on delta characteristics. Achieves 1-10x encoding-level compression.
|
||||
|
||||
### 3. Reactive Propagation (ADR-003)
|
||||
Push-based delta distribution with explicit backpressure. Causal ordering via vector clocks ensures consistency.
|
||||
|
||||
### 4. CRDT Merging (ADR-004)
|
||||
Per-dimension version tracking with configurable conflict resolution strategies (LWW, max, average, custom).
|
||||
|
||||
### 5. Lazy Index Repair (ADR-005)
|
||||
Index updates are deferred until quality degrades below bounds. Background repair maintains recall targets.
|
||||
|
||||
### 6. Multi-Tier Compression (ADR-006)
|
||||
Encoding -> Quantization -> Entropy coding -> Batch optimization. Achieves 5-50x total compression.
|
||||
|
||||
### 7. Adaptive Windows (ADR-007)
|
||||
Dynamic window sizing based on load. Automatic compaction reduces long-term storage.
|
||||
|
||||
### 8. WASM Component Model (ADR-008)
|
||||
Clean interface contracts for browser deployment. Shared memory patterns for high-throughput scenarios.
|
||||
|
||||
### 9. Lineage Tracking (ADR-009)
|
||||
Full delta provenance with OpenTelemetry integration. Point-in-time reconstruction and blame queries.
|
||||
|
||||
### 10. Signed Deltas (ADR-010)
|
||||
Ed25519 signatures for integrity. Capability tokens for fine-grained authorization.
|
||||
|
||||
## Performance Targets
|
||||
|
||||
| Metric | Target | Notes |
|
||||
|--------|--------|-------|
|
||||
| Delta application | < 50us | Faster than full write |
|
||||
| Composition (100 deltas) | < 1ms | With checkpoint |
|
||||
| Network reduction (sparse) | > 10x | For <10% dimension changes |
|
||||
| Storage compression | 5-50x | With full pipeline |
|
||||
| Index recall degradation | < 5% | With lazy repair |
|
||||
| Security overhead | < 100us | Signature verification |
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
### Phase 1: Core Infrastructure
|
||||
- Delta types and storage (ADR-001)
|
||||
- Basic encoding (ADR-002)
|
||||
- Simple checkpointing
|
||||
|
||||
### Phase 2: Distribution
|
||||
- Propagation protocol (ADR-003)
|
||||
- Conflict resolution (ADR-004)
|
||||
- Causal ordering
|
||||
|
||||
### Phase 3: Index Integration
|
||||
- Lazy repair (ADR-005)
|
||||
- Quality monitoring
|
||||
- Incremental HNSW
|
||||
|
||||
### Phase 4: Optimization
|
||||
- Multi-tier compression (ADR-006)
|
||||
- Temporal windows (ADR-007)
|
||||
- Adaptive policies
|
||||
|
||||
### Phase 5: Platform
|
||||
- WASM integration (ADR-008)
|
||||
- Observability (ADR-009)
|
||||
- Security model (ADR-010)
|
||||
|
||||
## Dependencies
|
||||
|
||||
| Component | Crate | Purpose |
|
||||
|-----------|-------|---------|
|
||||
| Signatures | `ed25519-dalek` | Delta signing |
|
||||
| Compression | `lz4_flex`, `zstd` | Entropy coding |
|
||||
| Tracing | `opentelemetry` | Observability |
|
||||
| Async | `tokio` | Propagation |
|
||||
| Serialization | `bincode`, `serde` | Wire format |
|
||||
|
||||
## Related ADRs
|
||||
|
||||
- **ADR-001**: Ruvector Core Architecture
|
||||
- **ADR-CE-002**: Incremental Coherence Computation
|
||||
- **ADR-005**: WASM Runtime Integration
|
||||
- **ADR-007**: Security Review & Technical Debt
|
||||
|
||||
## References
|
||||
|
||||
1. Shapiro, M., et al. "Conflict-free Replicated Data Types." SSS 2011.
|
||||
2. Kleppmann, M. "Designing Data-Intensive Applications." O'Reilly, 2017.
|
||||
3. Malkov, Y., & Yashunin, D. "Efficient and robust approximate nearest neighbor search using HNSW graphs."
|
||||
4. OpenTelemetry Specification. https://opentelemetry.io/docs/specs/
|
||||
5. WebAssembly Component Model. https://component-model.bytecodealliance.org/
|
||||
|
||||
---
|
||||
|
||||
**Authors**: RuVector Architecture Team
|
||||
**Date**: 2026-01-28
|
||||
**Status**: Proposed
|
||||
Reference in New Issue
Block a user