Files
..

Delta-Behavior Architecture Decision Records

This directory contains the Architecture Decision Records (ADRs) for implementing Delta-Behavior in RuVector - a delta-first approach to incremental vector updates.

Overview

Delta-Behavior transforms RuVector into a delta-first vector database where all updates are expressed as incremental changes (deltas) rather than full vector replacements. This approach provides:

  • 10-100x bandwidth reduction for sparse updates
  • Full temporal history with point-in-time queries
  • CRDT-based conflict resolution for concurrent updates
  • Lazy index repair with quality bounds
  • Multi-tier compression (5-50x storage reduction)

ADR Index

ADR Title Status Summary
ADR-DB-001 Delta Behavior Core Architecture Proposed Delta-first architecture with layered composition
ADR-DB-002 Delta Encoding Format Proposed Hybrid sparse-dense with adaptive switching
ADR-DB-003 Delta Propagation Protocol Proposed Reactive push with backpressure
ADR-DB-004 Delta Conflict Resolution Proposed CRDT-based with causal ordering
ADR-DB-005 Delta Index Updates Proposed Lazy repair with quality bounds
ADR-DB-006 Delta Compression Strategy Proposed Multi-tier compression pipeline
ADR-DB-007 Delta Temporal Windows Proposed Adaptive windows with compaction
ADR-DB-008 Delta WASM Integration Proposed Component model with shared memory
ADR-DB-009 Delta Observability Proposed Delta lineage tracking with OpenTelemetry
ADR-DB-010 Delta Security Model Proposed Signed deltas with capability tokens

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                          DELTA-BEHAVIOR ARCHITECTURE                         │
└─────────────────────────────────────────────────────────────────────────────┘

                              ┌───────────────┐
                              │  Delta API    │  ADR-001
                              │  (apply, get, │
                              │   rollback)   │
                              └───────┬───────┘
                                      │
                    ┌─────────────────┼─────────────────┐
                    │                 │                 │
                    v                 v                 v
            ┌───────────────┐ ┌───────────────┐ ┌───────────────┐
            │   Security    │ │  Propagation  │ │ Observability │
            │   (signed,    │ │  (reactive,   │ │  (lineage,    │
            │  capability)  │ │  backpressure)│ │   tracing)    │
            │   ADR-010     │ │   ADR-003     │ │   ADR-009     │
            └───────┬───────┘ └───────┬───────┘ └───────┬───────┘
                    │                 │                 │
                    └─────────────────┼─────────────────┘
                                      │
                              ┌───────v───────┐
                              │   Conflict    │  ADR-004
                              │  Resolution   │
                              │  (CRDT, VC)   │
                              └───────┬───────┘
                                      │
                    ┌─────────────────┼─────────────────┐
                    │                 │                 │
                    v                 v                 v
            ┌───────────────┐ ┌───────────────┐ ┌───────────────┐
            │   Encoding    │ │   Temporal    │ │    Index      │
            │  (sparse/     │ │   Windows     │ │   Updates     │
            │   dense/RLE)  │ │  (adaptive)   │ │ (lazy repair) │
            │   ADR-002     │ │   ADR-007     │ │   ADR-005     │
            └───────┬───────┘ └───────┬───────┘ └───────┬───────┘
                    │                 │                 │
                    └─────────────────┼─────────────────┘
                                      │
                    ┌─────────────────┼─────────────────┐
                    │                 │                 │
                    v                 v                 v
            ┌───────────────┐ ┌───────────────┐ ┌───────────────┐
            │  Compression  │ │    WASM       │ │   Storage     │
            │  (LZ4/Zstd/   │ │ Integration   │ │   Layer       │
            │   quantize)   │ │ (component    │ │ (delta log,   │
            │   ADR-006     │ │   model)      │ │  checkpoint)  │
            │               │ │   ADR-008     │ │   ADR-001     │
            └───────────────┘ └───────────────┘ └───────────────┘

Key Design Decisions

1. Delta-First Storage (ADR-001)

All mutations are stored as deltas. Full vectors are materialized on-demand by composing delta chains. Checkpoints provide optimization points for composition.

2. Hybrid Encoding (ADR-002)

Automatic selection between sparse, dense, RLE, and dictionary encoding based on delta characteristics. Achieves 1-10x encoding-level compression.

3. Reactive Propagation (ADR-003)

Push-based delta distribution with explicit backpressure. Causal ordering via vector clocks ensures consistency.

4. CRDT Merging (ADR-004)

Per-dimension version tracking with configurable conflict resolution strategies (LWW, max, average, custom).

5. Lazy Index Repair (ADR-005)

Index updates are deferred until quality degrades below bounds. Background repair maintains recall targets.

6. Multi-Tier Compression (ADR-006)

Encoding -> Quantization -> Entropy coding -> Batch optimization. Achieves 5-50x total compression.

7. Adaptive Windows (ADR-007)

Dynamic window sizing based on load. Automatic compaction reduces long-term storage.

8. WASM Component Model (ADR-008)

Clean interface contracts for browser deployment. Shared memory patterns for high-throughput scenarios.

9. Lineage Tracking (ADR-009)

Full delta provenance with OpenTelemetry integration. Point-in-time reconstruction and blame queries.

10. Signed Deltas (ADR-010)

Ed25519 signatures for integrity. Capability tokens for fine-grained authorization.

Performance Targets

Metric Target Notes
Delta application < 50us Faster than full write
Composition (100 deltas) < 1ms With checkpoint
Network reduction (sparse) > 10x For <10% dimension changes
Storage compression 5-50x With full pipeline
Index recall degradation < 5% With lazy repair
Security overhead < 100us Signature verification

Implementation Phases

Phase 1: Core Infrastructure

  • Delta types and storage (ADR-001)
  • Basic encoding (ADR-002)
  • Simple checkpointing

Phase 2: Distribution

  • Propagation protocol (ADR-003)
  • Conflict resolution (ADR-004)
  • Causal ordering

Phase 3: Index Integration

  • Lazy repair (ADR-005)
  • Quality monitoring
  • Incremental HNSW

Phase 4: Optimization

  • Multi-tier compression (ADR-006)
  • Temporal windows (ADR-007)
  • Adaptive policies

Phase 5: Platform

  • WASM integration (ADR-008)
  • Observability (ADR-009)
  • Security model (ADR-010)

Dependencies

Component Crate Purpose
Signatures ed25519-dalek Delta signing
Compression lz4_flex, zstd Entropy coding
Tracing opentelemetry Observability
Async tokio Propagation
Serialization bincode, serde Wire format
  • ADR-001: Ruvector Core Architecture
  • ADR-CE-002: Incremental Coherence Computation
  • ADR-005: WASM Runtime Integration
  • ADR-007: Security Review & Technical Debt

References

  1. Shapiro, M., et al. "Conflict-free Replicated Data Types." SSS 2011.
  2. Kleppmann, M. "Designing Data-Intensive Applications." O'Reilly, 2017.
  3. Malkov, Y., & Yashunin, D. "Efficient and robust approximate nearest neighbor search using HNSW graphs."
  4. OpenTelemetry Specification. https://opentelemetry.io/docs/specs/
  5. WebAssembly Component Model. https://component-model.bytecodealliance.org/

Authors: RuVector Architecture Team Date: 2026-01-28 Status: Proposed