ADR-DB-005: Delta Index Updates
Status: Proposed
Date: 2026-01-28
Authors: RuVector Architecture Team
Deciders: Architecture Review Board
Parent: ADR-DB-001 Delta Behavior Core Architecture
Version History
| Version | Date | Author | Changes |
|---|---|---|---|
| 0.1 | 2026-01-28 | Architecture Team | Initial proposal |
Context and Problem Statement
The Index Update Challenge
HNSW (Hierarchical Navigable Small World) indexes present unique challenges for delta-based updates:
- Graph Structure: HNSW is a proximity graph where edges connect similar vectors
- Insert Complexity: O(log n * ef_construction) for proper graph maintenance
- Update Semantics: Standard HNSW has no native update operation
- Recall Sensitivity: Graph quality directly impacts search recall
- Concurrent Access: Updates must not corrupt concurrent searches
Current HNSW Behavior
RuVector's existing HNSW implementation (ADR-001) uses:
- hnsw_rs library for graph operations
- Mark-delete semantics (no graph restructuring)
- Full rebuild for significant changes
- No incremental edge updates
Delta Update Scenarios
| Scenario | Vector Change | Impact on Neighbors |
|---|---|---|
| Minor adjustment (<5%) | Negligible | Neighbors likely still valid |
| Moderate change (5-20%) | Moderate | Some edges may be suboptimal |
| Major change (>20%) | Significant | Many edges invalidated |
| Dimension shift | Variable | Depends on affected dimensions |
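The scenario boundaries above can be expressed as a small classifier over the relative L2 change. A minimal sketch; the function and enum names are illustrative, not part of the RuVector API:

```rust
/// Illustrative classifier mirroring the scenario table: the 5% / 20%
/// boundaries come from the table, everything else is hypothetical.
#[derive(Debug, PartialEq)]
pub enum DeltaScenario {
    Minor,
    Moderate,
    Major,
}

pub fn classify_delta(old: &[f32], new: &[f32]) -> DeltaScenario {
    // L2 distance between the old and new vector
    let delta: f32 = old
        .iter()
        .zip(new)
        .map(|(a, b)| (a - b) * (a - b))
        .sum::<f32>()
        .sqrt();
    // Relative to the old vector's norm; epsilon guards the zero vector
    let norm: f32 = old.iter().map(|a| a * a).sum::<f32>().sqrt();
    let relative = delta / (norm + 1e-10);
    match relative {
        r if r < 0.05 => DeltaScenario::Minor,
        r if r < 0.20 => DeltaScenario::Moderate,
        _ => DeltaScenario::Major,
    }
}
```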
Decision
Adopt Lazy Repair with Quality Bounds
We implement a lazy repair strategy that:
- Applies deltas immediately to vector data
- Defers index repair until quality degrades
- Uses quality bounds to trigger selective repair
- Maintains search correctness through fallback mechanisms
Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│ DELTA INDEX MANAGER │
└─────────────────────────────────────────────────────────────┘
│
┌─────────────────┬─────────────────┬┴──────────────────┬─────────────────┐
│ │ │ │ │
v v v v v
┌─────────┐ ┌─────────┐ ┌───────────┐ ┌─────────────┐ ┌─────────┐
│ Delta │ │ Quality │ │ Lazy │ │ Checkpoint │ │ Rebuild │
│ Tracker │ │ Monitor │ │ Repair │ │ Manager │ │ Trigger │
└─────────┘ └─────────┘ └───────────┘ └─────────────┘ └─────────┘
│ │ │ │ │
│ │ │ │ │
v v v v v
┌─────────────────────────────────────────────────────────────────────────────────┐
│ HNSW INDEX LAYER │
│ Vector Data │ Edge Graph │ Entry Points │ Layer Structure │ Distance Cache │
└─────────────────────────────────────────────────────────────────────────────────┘
Core Components
1. Delta Tracker
/// Tracks pending index updates from deltas
pub struct DeltaTracker {
    /// Pending updates by vector ID
    pending: DashMap<VectorId, PendingUpdate>,
    /// Delta accumulation before index update
    delta_buffer: Vec<AccumulatedDelta>,
    /// Configuration
    config: DeltaTrackerConfig,
}

#[derive(Debug, Clone)]
pub struct DeltaTrackerConfig {
    /// Relative change above which an entry needs repair
    pub repair_threshold: f32,
    /// Decay constant for the quality estimate
    pub quality_decay_constant: f32,
}

#[derive(Debug, Clone)]
pub struct PendingUpdate {
    /// Original vector (before deltas)
    pub original: Vec<f32>,
    /// Current vector (after deltas)
    pub current: Vec<f32>,
    /// Accumulated delta magnitude
    pub total_delta_magnitude: f32,
    /// Number of deltas accumulated
    pub delta_count: u32,
    /// First delta timestamp
    pub first_delta_at: Instant,
    /// Index entry status
    pub index_status: IndexStatus,
}

#[derive(Debug, Clone, Copy)]
pub enum IndexStatus {
    /// Index matches vector exactly
    Synchronized,
    /// Index is stale but within bounds
    Stale { estimated_quality: f32 },
    /// Index needs repair
    NeedsRepair,
    /// Not yet indexed
    NotIndexed,
}

impl DeltaTracker {
    /// Record a delta application
    pub fn record_delta(
        &self,
        vector_id: &VectorId,
        old_vector: &[f32],
        new_vector: &[f32],
    ) {
        let delta_magnitude = compute_l2_delta(old_vector, new_vector);
        self.pending
            .entry(vector_id.clone())
            .and_modify(|update| {
                update.current = new_vector.to_vec();
                update.total_delta_magnitude += delta_magnitude;
                update.delta_count += 1;
                update.index_status = self.estimate_status(update);
            })
            .or_insert_with(|| PendingUpdate {
                original: old_vector.to_vec(),
                current: new_vector.to_vec(),
                total_delta_magnitude: delta_magnitude,
                delta_count: 1,
                first_delta_at: Instant::now(),
                index_status: IndexStatus::Stale {
                    estimated_quality: self.estimate_quality(delta_magnitude),
                },
            });
    }

    /// Get vectors needing repair
    pub fn get_repair_candidates(&self) -> Vec<VectorId> {
        self.pending
            .iter()
            .filter(|e| matches!(e.index_status, IndexStatus::NeedsRepair))
            .map(|e| e.key().clone())
            .collect()
    }

    fn estimate_status(&self, update: &PendingUpdate) -> IndexStatus {
        let relative_change = update.total_delta_magnitude
            / (vector_magnitude(&update.original) + 1e-10);
        if relative_change > self.config.repair_threshold {
            IndexStatus::NeedsRepair
        } else {
            IndexStatus::Stale {
                estimated_quality: self.estimate_quality(update.total_delta_magnitude),
            }
        }
    }

    fn estimate_quality(&self, delta_magnitude: f32) -> f32 {
        // Quality decays with delta magnitude
        // Based on empirical HNSW edge validity studies
        (-delta_magnitude / self.config.quality_decay_constant).exp()
    }
}
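The tracker relies on two helpers that this ADR does not define. Plausible Euclidean implementations, shown here as a sketch:

```rust
/// L2 distance between the old and new vector.
pub fn compute_l2_delta(old: &[f32], new: &[f32]) -> f32 {
    old.iter()
        .zip(new)
        .map(|(a, b)| (a - b) * (a - b))
        .sum::<f32>()
        .sqrt()
}

/// Euclidean norm, used to turn absolute drift into relative change.
pub fn vector_magnitude(v: &[f32]) -> f32 {
    v.iter().map(|a| a * a).sum::<f32>().sqrt()
}
```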
2. Quality Monitor
/// Monitors index quality and triggers repairs
pub struct QualityMonitor {
    /// Sampled quality measurements
    measurements: RingBuffer<QualityMeasurement>,
    /// Current quality estimate
    current_quality: AtomicF32,
    /// Quality bounds configuration
    bounds: QualityBounds,
    /// Repair trigger channel
    repair_trigger: Sender<RepairRequest>,
}

#[derive(Debug, Clone, Copy)]
pub struct QualityBounds {
    /// Minimum acceptable recall
    pub min_recall: f32,
    /// Target recall
    pub target_recall: f32,
    /// Sampling rate (fraction of searches)
    pub sample_rate: f32,
    /// Number of samples for estimate
    pub sample_window: usize,
}

impl Default for QualityBounds {
    fn default() -> Self {
        Self {
            min_recall: 0.90,
            target_recall: 0.95,
            sample_rate: 0.01, // Sample 1% of searches
            sample_window: 1000,
        }
    }
}

#[derive(Debug, Clone)]
pub struct QualityMeasurement {
    /// Estimated recall for this search
    pub recall: f32,
    /// Number of stale vectors encountered
    pub stale_vectors: u32,
    /// Timestamp
    pub timestamp: Instant,
}

impl QualityMonitor {
    /// Sample a search for quality estimation
    pub async fn sample_search(
        &self,
        query: &[f32],
        hnsw_results: &[SearchResult],
        k: usize,
    ) -> Option<QualityMeasurement> {
        // Only sample based on configured rate
        if !self.should_sample() {
            return None;
        }
        // Compute ground truth via exact search on sample
        let exact_results = self.exact_search_sample(query, k).await;
        // Calculate recall
        let hnsw_ids: HashSet<_> = hnsw_results.iter().map(|r| &r.id).collect();
        let exact_ids: HashSet<_> = exact_results.iter().map(|r| &r.id).collect();
        let overlap = hnsw_ids.intersection(&exact_ids).count();
        let recall = overlap as f32 / k as f32;
        // Count stale vectors in results
        let stale_count = self.count_stale_in_results(hnsw_results);
        let measurement = QualityMeasurement {
            recall,
            stale_vectors: stale_count,
            timestamp: Instant::now(),
        };
        // Update estimates
        self.measurements.push(measurement.clone());
        self.update_quality_estimate();
        // Trigger repair if below bounds
        if recall < self.bounds.min_recall {
            let _ = self.repair_trigger.send(RepairRequest::QualityBelowBounds {
                current_recall: recall,
                min_recall: self.bounds.min_recall,
            });
        }
        Some(measurement)
    }

    fn update_quality_estimate(&self) {
        let recent: Vec<_> = self.measurements
            .iter()
            .rev()
            .take(self.bounds.sample_window)
            .collect();
        if recent.is_empty() {
            return;
        }
        let avg_recall = recent.iter().map(|m| m.recall).sum::<f32>() / recent.len() as f32;
        self.current_quality.store(avg_recall, Ordering::Relaxed);
    }
}
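The monitor's `should_sample` gate is not shown in the ADR. One way it could be implemented without an RNG is deterministic striding: sample every Nth search, where N = 1 / sample_rate. The striding approach and the `SampleGate` name are assumptions; a production monitor might prefer random sampling to avoid correlating with periodic workloads.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Samples every Nth call, where N is derived from the configured rate.
pub struct SampleGate {
    counter: AtomicU64,
    stride: u64, // e.g. 100 for a 1% sample rate
}

impl SampleGate {
    pub fn new(sample_rate: f32) -> Self {
        Self {
            counter: AtomicU64::new(0),
            stride: (1.0 / sample_rate.max(1e-6)).round() as u64,
        }
    }

    pub fn should_sample(&self) -> bool {
        // fetch_add returns the previous value, so searches 0, N, 2N, ... sample
        self.counter.fetch_add(1, Ordering::Relaxed) % self.stride == 0
    }
}
```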
3. Lazy Repair Engine
/// Performs lazy index repair operations
pub struct LazyRepairEngine {
    /// HNSW index reference
    hnsw: Arc<RwLock<HnswIndex>>,
    /// Delta tracker reference
    tracker: Arc<DeltaTracker>,
    /// Repair configuration
    config: RepairConfig,
    /// Background repair task
    repair_task: Option<JoinHandle<()>>,
}
#[derive(Debug, Clone)]
pub struct RepairConfig {
    /// Maximum repairs per batch
    pub batch_size: usize,
    /// Repair interval
    pub repair_interval: Duration,
    /// Whether to use background repair
    pub background_repair: bool,
    /// Priority ordering for repairs
    pub priority: RepairPriority,
    /// Relative change below which a soft update suffices
    pub soft_update_threshold: f32,
    /// Relative change below which re-insertion suffices
    pub reinsert_threshold: f32,
    /// Relative change above which a full neighborhood repair is needed
    pub full_repair_threshold: f32,
}
#[derive(Debug, Clone, Copy)]
pub enum RepairPriority {
    /// Repair most changed vectors first
    MostChanged,
    /// Repair oldest pending first
    Oldest,
    /// Repair most frequently accessed first
    MostAccessed,
    /// Round-robin
    RoundRobin,
}
impl LazyRepairEngine {
    /// Repair a single vector in the index
    pub async fn repair_vector(&self, vector_id: &VectorId) -> Result<RepairResult> {
        // Get current vector state
        let update = self.tracker.pending.get(vector_id)
            .ok_or(RepairError::VectorNotPending)?;
        let mut hnsw = self.hnsw.write().await;
        // Strategy 1: Soft update (if change is small)
        if update.total_delta_magnitude < self.config.soft_update_threshold {
            return self.soft_update(&mut hnsw, vector_id, &update.current).await;
        }
        // Strategy 2: Re-insertion (moderate change)
        if update.total_delta_magnitude < self.config.reinsert_threshold {
            return self.reinsert(&mut hnsw, vector_id, &update.current).await;
        }
        // Strategy 3: Full repair (large change)
        self.full_repair(&mut hnsw, vector_id, &update.current).await
    }

    /// Soft update: only update vector data, keep edges
    async fn soft_update(
        &self,
        hnsw: &mut HnswIndex,
        vector_id: &VectorId,
        new_vector: &[f32],
    ) -> Result<RepairResult> {
        // Update vector data without touching graph structure
        hnsw.update_vector_data(vector_id, new_vector)?;
        // Mark as synchronized
        self.tracker.pending.remove(vector_id);
        Ok(RepairResult::SoftUpdate {
            vector_id: vector_id.clone(),
            edges_preserved: true,
        })
    }

    /// Re-insertion: remove and re-add to graph
    async fn reinsert(
        &self,
        hnsw: &mut HnswIndex,
        vector_id: &VectorId,
        new_vector: &[f32],
    ) -> Result<RepairResult> {
        // Get current index position
        let old_idx = hnsw.get_index_for_vector(vector_id)?;
        // Mark old position as deleted
        hnsw.mark_deleted(old_idx)?;
        // Insert with new vector
        let new_idx = hnsw.insert_vector(vector_id.clone(), new_vector.to_vec())?;
        // Update tracker
        self.tracker.pending.remove(vector_id);
        Ok(RepairResult::Reinserted {
            vector_id: vector_id.clone(),
            old_idx,
            new_idx,
        })
    }

    /// Full repair: rebuild local neighborhood
    async fn full_repair(
        &self,
        hnsw: &mut HnswIndex,
        vector_id: &VectorId,
        new_vector: &[f32],
    ) -> Result<RepairResult> {
        // Get current neighbors
        let old_neighbors = hnsw.get_neighbors(vector_id)?;
        // Remove and reinsert
        self.reinsert(hnsw, vector_id, new_vector).await?;
        // Repair edges from old neighbors
        let repaired_edges = self.repair_neighbor_edges(hnsw, &old_neighbors).await?;
        Ok(RepairResult::FullRepair {
            vector_id: vector_id.clone(),
            repaired_edges,
        })
    }

    /// Background repair loop
    pub async fn run_background_repair(&self) {
        loop {
            tokio::time::sleep(self.config.repair_interval).await;
            // Get repair candidates
            let candidates = self.tracker.get_repair_candidates();
            if candidates.is_empty() {
                continue;
            }
            // Prioritize
            let prioritized = self.prioritize_repairs(candidates);
            // Repair batch
            for vector_id in prioritized.into_iter().take(self.config.batch_size) {
                if let Err(e) = self.repair_vector(&vector_id).await {
                    tracing::warn!("Repair failed for {}: {}", vector_id, e);
                }
            }
        }
    }
}
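`prioritize_repairs` is referenced above but not shown. A sketch of the two simplest orderings over the tracker's pending entries; the `(id, total_delta_magnitude, first_delta_at)` tuples stand in for `PendingUpdate`, and the `MostAccessed` / `RoundRobin` variants are omitted since they need access statistics this ADR does not track:

```rust
use std::time::Instant;

/// Local stand-in for the subset of `RepairPriority` sketched here.
pub enum Priority {
    MostChanged,
    Oldest,
}

/// Orders repair candidates; each tuple mirrors the fields a real
/// implementation would read from a `PendingUpdate` entry.
pub fn prioritize_repairs(
    mut candidates: Vec<(String, f32, Instant)>,
    priority: Priority,
) -> Vec<String> {
    match priority {
        // Largest accumulated change first: these entries hurt recall most.
        Priority::MostChanged => candidates.sort_by(|a, b| b.1.total_cmp(&a.1)),
        // Longest-pending first: bounds how stale any one entry can get.
        Priority::Oldest => candidates.sort_by_key(|c| c.2),
    }
    candidates.into_iter().map(|c| c.0).collect()
}
```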
Recall vs Latency Tradeoffs
┌──────────────────────────────────────────────────────────┐
│ RECALL vs LATENCY TRADEOFF │
└──────────────────────────────────────────────────────────┘
Recall
100% │ ┌──────────────────┐
│ / │
│ / Immediate Repair │
│ / │
95% │ ┌───────────────────────────●───────────────────────┤
│ / │ │
│ / Lazy Repair │ │
│ / │ │
90% │●───────────────────────────────┤ │
│ │ │
│ Quality Bound │ │
85% │ (Min Acceptable) │ │
│ │ │
└────────────────────────────────┴───────────────────────┴───>
Low Medium High
Write Latency
──── Lazy Repair (Selected): Best balance
- - - Immediate Repair: Highest recall, highest latency
· · · No Repair: Lowest latency, recall degrades
Repair Strategy Selection
/// Select repair strategy based on delta characteristics
pub fn select_repair_strategy(
    delta_magnitude: f32,
    vector_norm: f32,
    access_frequency: f32,
    current_recall: f32,
    config: &RepairConfig,
) -> RepairStrategy {
    let relative_change = delta_magnitude / (vector_norm + 1e-10);
    // High access frequency = repair sooner
    let access_weight = if access_frequency > config.hot_vector_threshold {
        0.7 // Reduce thresholds for hot vectors
    } else {
        1.0
    };
    // Low current recall = repair more aggressively
    let recall_weight = if current_recall < config.quality_bounds.min_recall {
        0.5 // Halve thresholds when recall is critical
    } else {
        1.0
    };
    let effective_threshold = config.soft_update_threshold * access_weight * recall_weight;
    if relative_change < effective_threshold {
        RepairStrategy::Deferred // No immediate action
    } else if relative_change < config.reinsert_threshold * access_weight * recall_weight {
        RepairStrategy::SoftUpdate
    } else if relative_change < config.full_repair_threshold * access_weight * recall_weight {
        RepairStrategy::Reinsert
    } else {
        RepairStrategy::FullRepair
    }
}
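To make the weighting concrete, here is a self-contained restatement of the same logic with the default thresholds from the Configuration section (0.05 / 0.20 / 0.50) inlined; the boolean flags and the `select` name are simplifications for illustration, not the real signature:

```rust
/// Minimal restatement of the strategy selection above.
#[derive(Debug, PartialEq)]
pub enum RepairStrategy {
    Deferred,
    SoftUpdate,
    Reinsert,
    FullRepair,
}

pub fn select(relative_change: f32, hot: bool, recall_critical: bool) -> RepairStrategy {
    // Hot vectors and critical recall both shrink the thresholds,
    // escalating repairs sooner.
    let weight = (if hot { 0.7_f32 } else { 1.0 }) * (if recall_critical { 0.5 } else { 1.0 });
    if relative_change < 0.05 * weight {
        RepairStrategy::Deferred
    } else if relative_change < 0.20 * weight {
        RepairStrategy::SoftUpdate
    } else if relative_change < 0.50 * weight {
        RepairStrategy::Reinsert
    } else {
        RepairStrategy::FullRepair
    }
}
```

For example, an 18% change on a cold vector stays a soft update (0.18 < 0.20), but on a hot vector the weighted reinsert threshold drops to 0.20 * 0.7 = 0.14, so the same change escalates to re-insertion.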
Recall vs Latency Analysis
Simulated Workload Results
| Strategy | Write Latency (p50) | Recall@10 | Recall@100 |
|---|---|---|---|
| Immediate Repair | 2.1 ms | 99.2% | 98.7% |
| Lazy (aggressive) | 150 µs | 96.5% | 95.1% |
| Lazy (balanced) | 80 µs | 94.2% | 92.8% |
| Lazy (relaxed) | 50 µs | 91.3% | 89.5% |
| No Repair | 35 µs | 85.1%* | 82.3%* |
*Degrades over time with update volume
Quality Degradation Curves
Recall over time (1000 updates/sec, no repair):
100% ├────────────
│ \
95% │ \──────────────
│ \
90% │ \────────────
│ \
85% │ \───────
│
80% │
└─────────────────────────────────────────────────────>
0 5 10 15 20 Minutes
With lazy repair (balanced):
100% ├────────────
│ \ ┌─────┐ ┌─────┐ ┌─────┐
95% │ \───┬┘ └───┬┘ └───┬┘ └───
│ │ Repair │ Repair │ Repair
90% │ │ │ │
│
85% │
└─────────────────────────────────────────────────────>
0 5 10 15 20 Minutes
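Both curve families follow from the exponential model in `estimate_quality`: estimated quality = exp(-Δ / c), with the default decay constant c = 0.1. A sketch for tabulating where accumulated drift crosses the bounds; treating estimated quality as a proxy for recall is a modeling assumption, not a measured equivalence:

```rust
/// Exponential quality model used by the Delta Tracker (default c = 0.1).
pub fn estimated_quality(delta_magnitude: f32, decay_constant: f32) -> f32 {
    (-delta_magnitude / decay_constant).exp()
}
```

With c = 0.1, quality falls below the 0.95 target near Δ ≈ 0.005 and below the 0.90 minimum near Δ ≈ 0.0105, which is why the lazy-repair curve saw-tooths between those two levels as repairs reset the drift.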
Considered Options
Option 1: Immediate Rebuild
Description: Rebuild affected portions of graph on every delta.
Pros:
- Always accurate graph
- Maximum recall
- Simple correctness model
Cons:
- O(log n * ef_construction) per update
- High write latency
- Blocks concurrent searches
Verdict: Rejected - latency unacceptable for streaming updates.
Option 2: Periodic Full Rebuild
Description: Allow degradation, rebuild entire index periodically.
Pros:
- Minimal write overhead
- Predictable rebuild schedule
- Simple implementation
Cons:
- Extended degradation periods
- Expensive rebuilds
- Resource spikes
Verdict: Available as configuration option, not default.
Option 3: Lazy Update (Selected)
Description: Defer repairs, trigger on quality bounds.
Pros:
- Low write latency
- Bounded recall degradation
- Adaptive to workload
- Background repair
Cons:
- Complexity in quality monitoring
- Potential recall dips
Verdict: Adopted - optimal balance for delta workloads.
Option 4: Learned Index Repair
Description: ML model predicts optimal repair timing.
Pros:
- Potentially optimal decisions
- Adapts to patterns
Cons:
- Training complexity
- Model maintenance
- Explainability
Verdict: Deferred to future version.
Technical Specification
Index Update API
/// Delta-aware HNSW index
#[async_trait]
pub trait DeltaAwareIndex: Send + Sync {
    /// Apply delta without immediate index update
    async fn apply_delta(&self, delta: &VectorDelta) -> Result<DeltaApplication>;
    /// Get current recall estimate
    fn current_recall(&self) -> f32;
    /// Get vectors pending repair
    fn pending_repairs(&self) -> Vec<VectorId>;
    /// Force repair of specific vectors
    async fn repair_vectors(&self, ids: &[VectorId]) -> Result<Vec<RepairResult>>;
    /// Trigger background repair cycle
    async fn trigger_repair_cycle(&self) -> Result<RepairCycleSummary>;
    /// Search with optional quality sampling
    async fn search_with_quality(
        &self,
        query: &[f32],
        k: usize,
        sample_quality: bool,
    ) -> Result<SearchWithQuality>;
}

#[derive(Debug)]
pub struct DeltaApplication {
    pub vector_id: VectorId,
    pub delta_id: DeltaId,
    pub strategy: RepairStrategy,
    pub deferred_repair: bool,
    pub estimated_recall_impact: f32,
}

#[derive(Debug)]
pub struct SearchWithQuality {
    pub results: Vec<SearchResult>,
    pub quality_sample: Option<QualityMeasurement>,
    pub stale_results: u32,
}
Configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DeltaIndexConfig {
    /// Quality bounds for triggering repair
    pub quality_bounds: QualityBounds,
    /// Repair engine configuration
    pub repair_config: RepairConfig,
    /// Delta tracker configuration
    pub tracker_config: DeltaTrackerConfig,
    /// Enable background repair
    pub background_repair: bool,
    /// Checkpoint interval (for recovery)
    pub checkpoint_interval: Duration,
}

impl Default for DeltaIndexConfig {
    fn default() -> Self {
        Self {
            quality_bounds: QualityBounds::default(),
            repair_config: RepairConfig {
                batch_size: 100,
                repair_interval: Duration::from_secs(5),
                background_repair: true,
                priority: RepairPriority::MostChanged,
                soft_update_threshold: 0.05, // 5% change
                reinsert_threshold: 0.20,    // 20% change
                full_repair_threshold: 0.50, // 50% change
            },
            tracker_config: DeltaTrackerConfig {
                repair_threshold: 0.15,
                quality_decay_constant: 0.1,
            },
            background_repair: true,
            checkpoint_interval: Duration::from_secs(300),
        }
    }
}
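Since the config derives Serialize and Deserialize, deployments can override these defaults from a file. A hypothetical recall-leaning profile is sketched below; the field names follow the structs above, but the TOML layout, the struct-style Duration encoding (serde's default `{ secs, nanos }` representation), and the specific values are assumptions about how RuVector would load configuration:

```toml
# Hypothetical recall-leaning override for DeltaIndexConfig.
background_repair = true
checkpoint_interval = { secs = 300, nanos = 0 }

[quality_bounds]
min_recall = 0.95
target_recall = 0.98
sample_rate = 0.02
sample_window = 2000

[repair_config]
batch_size = 200
repair_interval = { secs = 2, nanos = 0 }
background_repair = true
priority = "MostChanged"
soft_update_threshold = 0.03
reinsert_threshold = 0.15
full_repair_threshold = 0.40

[tracker_config]
repair_threshold = 0.10
quality_decay_constant = 0.1
```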
Consequences
Benefits
- Low Write Latency: Sub-millisecond delta application
- Bounded Degradation: Quality monitoring prevents unacceptable recall
- Adaptive: Repairs prioritized by impact and access patterns
- Background Processing: Repairs don't block user operations
- Resource Efficient: Avoids unnecessary graph restructuring
Risks and Mitigations
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Recall below bounds | Low | High | Aggressive repair triggers |
| Repair backlog | Medium | Medium | Batch size tuning |
| Stale search results | Medium | Medium | Optional exact fallback |
| Checkpoint overhead | Low | Low | Incremental checkpoints |
References
- Malkov, Y., & Yashunin, D. "Efficient and robust approximate nearest neighbor search using HNSW graphs."
- Singh, A., et al. "FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search."
- ADR-001: Ruvector Core Architecture
- ADR-DB-001: Delta Behavior Core Architecture
Related Decisions
- ADR-DB-001: Delta Behavior Core Architecture
- ADR-DB-003: Delta Propagation Protocol
- ADR-DB-007: Delta Temporal Windows