git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
641 lines
19 KiB
Markdown
641 lines
19 KiB
Markdown
# ADR-DB-004: Delta Conflict Resolution
|
|
|
|
**Status**: Proposed
|
|
**Date**: 2026-01-28
|
|
**Authors**: RuVector Architecture Team
|
|
**Deciders**: Architecture Review Board
|
|
**Parent**: ADR-DB-001 Delta Behavior Core Architecture
|
|
|
|
## Version History
|
|
|
|
| Version | Date | Author | Changes |
|
|
|---------|------|--------|---------|
|
|
| 0.1 | 2026-01-28 | Architecture Team | Initial proposal |
|
|
|
|
---
|
|
|
|
## Context and Problem Statement
|
|
|
|
### The Conflict Challenge
|
|
|
|
In distributed delta-first systems, concurrent updates to the same vector can create conflicts:
|
|
|
|
```
|
|
Time ─────────────────────────────────────────>
|
|
|
|
Replica A: v0 ──[Δa: dim[5]=0.8]──> v1a
|
|
\
|
|
\
|
|
Replica B: ──[Δb: dim[5]=0.3]──> v1b
|
|
|
|
Conflict: Both replicas modified dim[5] concurrently
|
|
```
|
|
|
|
### Conflict Scenarios
|
|
|
|
| Scenario | Frequency | Complexity |
|
|
|----------|-----------|------------|
|
|
| Same dimension, different values | High | Simple |
|
|
| Overlapping sparse updates | Medium | Moderate |
|
|
| Scale vs. sparse conflict | Low | Complex |
|
|
| Delete vs. update race | Low | Critical |
|
|
|
|
### Requirements
|
|
|
|
1. **Deterministic**: Same conflicts resolve identically on all replicas
|
|
2. **Commutative**: Order of conflict discovery doesn't affect outcome
|
|
3. **Low Latency**: Resolution shouldn't block writes
|
|
4. **Meaningful**: Results should be mathematically sensible for vectors
|
|
|
|
---
|
|
|
|
## Decision
|
|
|
|
### Adopt CRDT-Based Resolution with Causal Ordering
|
|
|
|
We implement conflict resolution using Conflict-free Replicated Data Types (CRDTs) with vector-specific merge semantics.
|
|
|
|
### CRDT Design for Vectors
|
|
|
|
#### Vector as a CRDT
|
|
|
|
```rust
|
|
/// CRDT-enabled vector with per-dimension version tracking
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub struct CrdtVector {
|
|
/// Vector ID
|
|
pub id: VectorId,
|
|
/// Dimensions with per-dimension causality
|
|
pub dimensions: Vec<CrdtDimension>,
|
|
/// Overall vector clock
|
|
pub clock: VectorClock,
|
|
/// Deletion marker
|
|
pub tombstone: Option<Tombstone>,
|
|
}
|
|
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub struct CrdtDimension {
|
|
/// Current value
|
|
pub value: f32,
|
|
/// Last update clock
|
|
pub clock: VectorClock,
|
|
/// Originating replica
|
|
pub origin: ReplicaId,
|
|
/// Timestamp of update
|
|
pub timestamp: DateTime<Utc>,
|
|
}
|
|
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub struct Tombstone {
|
|
pub deleted_at: DateTime<Utc>,
|
|
pub deleted_by: ReplicaId,
|
|
pub clock: VectorClock,
|
|
}
|
|
```
|
|
|
|
#### Merge Operation
|
|
|
|
```rust
|
|
impl CrdtVector {
|
|
/// Merge another CRDT vector into this one
|
|
pub fn merge(&mut self, other: &CrdtVector) -> MergeResult {
|
|
assert_eq!(self.id, other.id);
|
|
let mut conflicts = Vec::new();
|
|
|
|
// Handle tombstone
|
|
self.tombstone = match (&self.tombstone, &other.tombstone) {
|
|
(None, None) => None,
|
|
(Some(t), None) | (None, Some(t)) => Some(t.clone()),
|
|
(Some(t1), Some(t2)) => {
|
|
// Latest tombstone wins
|
|
Some(if t1.timestamp > t2.timestamp { t1.clone() } else { t2.clone() })
|
|
}
|
|
};
|
|
|
|
// If deleted, no need to merge dimensions
|
|
if self.tombstone.is_some() {
|
|
return MergeResult { conflicts, tombstoned: true };
|
|
}
|
|
|
|
// Merge each dimension
|
|
for (i, (self_dim, other_dim)) in
|
|
self.dimensions.iter_mut().zip(other.dimensions.iter()).enumerate()
|
|
{
|
|
let ordering = self_dim.clock.compare(&other_dim.clock);
|
|
|
|
match ordering {
|
|
ClockOrdering::Before => {
|
|
// Other is newer, take it
|
|
*self_dim = other_dim.clone();
|
|
}
|
|
ClockOrdering::After | ClockOrdering::Equal => {
|
|
// Self is newer or equal, keep it
|
|
}
|
|
ClockOrdering::Concurrent => {
|
|
// Conflict! Apply resolution strategy
|
|
let resolved = self.resolve_dimension_conflict(i, self_dim, other_dim);
|
|
conflicts.push(DimensionConflict {
|
|
dimension: i,
|
|
local_value: self_dim.value,
|
|
remote_value: other_dim.value,
|
|
resolved_value: resolved.value,
|
|
strategy: resolved.strategy,
|
|
});
|
|
*self_dim = resolved.dimension;
|
|
}
|
|
}
|
|
}
|
|
|
|
// Update overall clock
|
|
self.clock.merge(&other.clock);
|
|
|
|
MergeResult { conflicts, tombstoned: false }
|
|
}
|
|
|
|
fn resolve_dimension_conflict(
|
|
&self,
|
|
dim_idx: usize,
|
|
local: &CrdtDimension,
|
|
remote: &CrdtDimension,
|
|
) -> ResolvedDimension {
|
|
// Strategy selection based on configured policy
|
|
match self.conflict_strategy(dim_idx) {
|
|
ConflictStrategy::LastWriteWins => {
|
|
// Latest timestamp wins
|
|
let winner = if local.timestamp > remote.timestamp { local } else { remote };
|
|
ResolvedDimension {
|
|
dimension: winner.clone(),
|
|
strategy: ConflictStrategy::LastWriteWins,
|
|
}
|
|
}
|
|
ConflictStrategy::MaxValue => {
|
|
// Take maximum value
|
|
let max_val = local.value.max(remote.value);
|
|
let winner = if local.value >= remote.value { local } else { remote };
|
|
ResolvedDimension {
|
|
dimension: CrdtDimension {
|
|
value: max_val,
|
|
clock: merge_clocks(&local.clock, &remote.clock),
|
|
origin: winner.origin.clone(),
|
|
timestamp: winner.timestamp.max(remote.timestamp),
|
|
},
|
|
strategy: ConflictStrategy::MaxValue,
|
|
}
|
|
}
|
|
ConflictStrategy::Average => {
|
|
// Average the values
|
|
let avg = (local.value + remote.value) / 2.0;
|
|
ResolvedDimension {
|
|
dimension: CrdtDimension {
|
|
value: avg,
|
|
clock: merge_clocks(&local.clock, &remote.clock),
|
|
origin: "merged".into(),
|
|
timestamp: local.timestamp.max(remote.timestamp),
|
|
},
|
|
strategy: ConflictStrategy::Average,
|
|
}
|
|
}
|
|
ConflictStrategy::ReplicaPriority(priorities) => {
|
|
// Higher priority replica wins
|
|
let local_priority = priorities.get(&local.origin).copied().unwrap_or(0);
|
|
let remote_priority = priorities.get(&remote.origin).copied().unwrap_or(0);
|
|
let winner = if local_priority >= remote_priority { local } else { remote };
|
|
ResolvedDimension {
|
|
dimension: winner.clone(),
|
|
strategy: ConflictStrategy::ReplicaPriority(priorities),
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
### Conflict Resolution Strategies
|
|
|
|
```rust
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub enum ConflictStrategy {
|
|
/// Last write wins based on timestamp
|
|
LastWriteWins,
|
|
/// Take maximum value (for monotonic dimensions)
|
|
MaxValue,
|
|
/// Take minimum value
|
|
MinValue,
|
|
/// Average conflicting values
|
|
Average,
|
|
/// Weighted average based on replica trust
|
|
WeightedAverage(HashMap<ReplicaId, f32>),
|
|
/// Replica priority ordering
|
|
ReplicaPriority(HashMap<ReplicaId, u32>),
|
|
/// Custom merge function
|
|
Custom(CustomMergeFn),
|
|
}
|
|
|
|
pub type CustomMergeFn = Arc<dyn Fn(f32, f32, &ConflictContext) -> f32 + Send + Sync>;
|
|
```
|
|
|
|
### Vector Clock Implementation
|
|
|
|
```rust
|
|
/// Extended vector clock for delta tracking
|
|
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
|
pub struct VectorClock {
|
|
/// Replica -> logical timestamp mapping
|
|
clock: HashMap<ReplicaId, u64>,
|
|
}
|
|
|
|
impl VectorClock {
|
|
pub fn new() -> Self {
|
|
Self { clock: HashMap::new() }
|
|
}
|
|
|
|
/// Increment for local replica
|
|
pub fn increment(&mut self, replica: &ReplicaId) {
|
|
let counter = self.clock.entry(replica.clone()).or_insert(0);
|
|
*counter += 1;
|
|
}
|
|
|
|
/// Get timestamp for replica
|
|
pub fn get(&self, replica: &ReplicaId) -> u64 {
|
|
self.clock.get(replica).copied().unwrap_or(0)
|
|
}
|
|
|
|
/// Merge with another clock (take max)
|
|
pub fn merge(&mut self, other: &VectorClock) {
|
|
for (replica, &ts) in &other.clock {
|
|
let current = self.clock.entry(replica.clone()).or_insert(0);
|
|
*current = (*current).max(ts);
|
|
}
|
|
}
|
|
|
|
/// Compare two clocks for causality
|
|
pub fn compare(&self, other: &VectorClock) -> ClockOrdering {
|
|
let mut less_than = false;
|
|
let mut greater_than = false;
|
|
|
|
// Check all replicas in self
|
|
for (replica, &self_ts) in &self.clock {
|
|
let other_ts = other.get(replica);
|
|
if self_ts < other_ts {
|
|
less_than = true;
|
|
} else if self_ts > other_ts {
|
|
greater_than = true;
|
|
}
|
|
}
|
|
|
|
// Check replicas only in other
|
|
for (replica, &other_ts) in &other.clock {
|
|
if !self.clock.contains_key(replica) && other_ts > 0 {
|
|
less_than = true;
|
|
}
|
|
}
|
|
|
|
match (less_than, greater_than) {
|
|
(false, false) => ClockOrdering::Equal,
|
|
(true, false) => ClockOrdering::Before,
|
|
(false, true) => ClockOrdering::After,
|
|
(true, true) => ClockOrdering::Concurrent,
|
|
}
|
|
}
|
|
|
|
/// Check if concurrent (conflicting)
|
|
pub fn is_concurrent(&self, other: &VectorClock) -> bool {
|
|
matches!(self.compare(other), ClockOrdering::Concurrent)
|
|
}
|
|
}
|
|
|
|
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
|
pub enum ClockOrdering {
|
|
Equal,
|
|
Before,
|
|
After,
|
|
Concurrent,
|
|
}
|
|
```
|
|
|
|
### Operation-Based Delta Merging
|
|
|
|
```rust
|
|
/// Merge concurrent delta operations
|
|
pub fn merge_delta_operations(
|
|
local: &DeltaOperation,
|
|
remote: &DeltaOperation,
|
|
strategy: &ConflictStrategy,
|
|
) -> DeltaOperation {
|
|
match (local, remote) {
|
|
// Both sparse: merge index sets
|
|
(
|
|
DeltaOperation::Sparse { indices: li, values: lv },
|
|
DeltaOperation::Sparse { indices: ri, values: rv },
|
|
) => {
|
|
let mut merged_indices = Vec::new();
|
|
let mut merged_values = Vec::new();
|
|
|
|
let local_map: HashMap<_, _> = li.iter().zip(lv.iter()).collect();
|
|
let remote_map: HashMap<_, _> = ri.iter().zip(rv.iter()).collect();
|
|
|
|
let all_indices: HashSet<_> = li.iter().chain(ri.iter()).collect();
|
|
|
|
for &idx in all_indices {
|
|
let local_val = local_map.get(&idx).copied();
|
|
let remote_val = remote_map.get(&idx).copied();
|
|
|
|
let value = match (local_val, remote_val) {
|
|
(Some(&l), None) => l,
|
|
(None, Some(&r)) => r,
|
|
(Some(&l), Some(&r)) => resolve_value_conflict(l, r, strategy),
|
|
(None, None) => unreachable!(),
|
|
};
|
|
|
|
merged_indices.push(*idx);
|
|
merged_values.push(value);
|
|
}
|
|
|
|
DeltaOperation::Sparse {
|
|
indices: merged_indices,
|
|
values: merged_values,
|
|
}
|
|
}
|
|
|
|
// Sparse vs Dense: apply sparse changes on top of dense
|
|
(
|
|
DeltaOperation::Sparse { indices, values },
|
|
DeltaOperation::Dense { vector },
|
|
)
|
|
| (
|
|
DeltaOperation::Dense { vector },
|
|
DeltaOperation::Sparse { indices, values },
|
|
) => {
|
|
let mut result = vector.clone();
|
|
for (&idx, &val) in indices.iter().zip(values.iter()) {
|
|
result[idx as usize] = val;
|
|
}
|
|
DeltaOperation::Dense { vector: result }
|
|
}
|
|
|
|
// Both dense: element-wise merge
|
|
(
|
|
DeltaOperation::Dense { vector: lv },
|
|
DeltaOperation::Dense { vector: rv },
|
|
) => {
|
|
let merged: Vec<f32> = lv.iter()
|
|
.zip(rv.iter())
|
|
.map(|(&l, &r)| resolve_value_conflict(l, r, strategy))
|
|
.collect();
|
|
DeltaOperation::Dense { vector: merged }
|
|
}
|
|
|
|
// Scale operations: compose
|
|
(
|
|
DeltaOperation::Scale { factor: f1 },
|
|
DeltaOperation::Scale { factor: f2 },
|
|
) => {
|
|
DeltaOperation::Scale { factor: f1 * f2 }
|
|
}
|
|
|
|
// Delete wins over updates (tombstone semantics)
|
|
(DeltaOperation::Delete, _) | (_, DeltaOperation::Delete) => {
|
|
DeltaOperation::Delete
|
|
}
|
|
|
|
// Other combinations: convert to dense and merge
|
|
_ => {
|
|
// Fallback: materialize both and merge
|
|
DeltaOperation::Dense {
|
|
vector: vec![], // Would compute actual merge
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
fn resolve_value_conflict(local: f32, remote: f32, strategy: &ConflictStrategy) -> f32 {
|
|
match strategy {
|
|
ConflictStrategy::LastWriteWins => remote, // Assume remote is "latest"
|
|
ConflictStrategy::MaxValue => local.max(remote),
|
|
ConflictStrategy::MinValue => local.min(remote),
|
|
ConflictStrategy::Average => (local + remote) / 2.0,
|
|
ConflictStrategy::WeightedAverage(weights) => {
|
|
// Would need context for proper weighting
|
|
(local + remote) / 2.0
|
|
}
|
|
_ => remote, // Default fallback
|
|
}
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
## Consistency Guarantees
|
|
|
|
### Eventual Consistency
|
|
|
|
The CRDT approach guarantees **strong eventual consistency**:
|
|
|
|
1. **Eventual Delivery**: All deltas eventually reach all replicas
|
|
2. **Convergence**: Replicas with same deltas converge to same state
|
|
3. **Termination**: Merge operations always terminate
|
|
|
|
### Causal Consistency
|
|
|
|
Vector clocks ensure causal ordering:
|
|
|
|
```
|
|
Property: If Δa happens-before Δb, then on all replicas:
|
|
Δa is applied before Δb
|
|
|
|
Proof: Vector clock comparison ensures causal dependencies
|
|
are satisfied before applying deltas
|
|
```
|
|
|
|
### Conflict Freedom Theorem
|
|
|
|
```
|
|
For any two concurrent deltas Δa and Δb:
|
|
merge(Δa, Δb) = merge(Δb, Δa) [Commutativity]
|
|
merge(Δa, merge(Δb, Δc)) = merge(merge(Δa, Δb), Δc) [Associativity]
|
|
merge(Δa, Δa) = Δa [Idempotence]
|
|
```
|
|
|
|
These properties ensure:
|
|
- Order-independent convergence
|
|
- Safe retry/redelivery
|
|
- Partition tolerance
|
|
|
|
---
|
|
|
|
## Considered Options
|
|
|
|
### Option 1: Last-Write-Wins (LWW)
|
|
|
|
**Description**: Latest timestamp wins, simple conflict resolution.
|
|
|
|
**Pros**:
|
|
- Extremely simple
|
|
- Low overhead
|
|
- Deterministic
|
|
|
|
**Cons**:
|
|
- Clock skew sensitivity
|
|
- Loses concurrent updates
|
|
- No semantic awareness
|
|
|
|
**Verdict**: Available as strategy option, not default.
|
|
|
|
### Option 2: Pure Vector Clocks
|
|
|
|
**Description**: Track causality, reject concurrent writes.
|
|
|
|
**Pros**:
|
|
- Perfect causality tracking
|
|
- No data loss
|
|
|
|
**Cons**:
|
|
- Requires conflict handling at application level
|
|
- Concurrent writes fail
|
|
|
|
**Verdict**: Rejected - too restrictive for vector workloads.
|
|
|
|
### Option 3: Operational Transform (OT)
|
|
|
|
**Description**: Transform operations to maintain consistency.
|
|
|
|
**Pros**:
|
|
- Preserves all intentions
|
|
- Used successfully in collaborative editing
|
|
|
|
**Cons**:
|
|
- Complex transformation functions
|
|
- Hard to prove correctness
|
|
- Doesn't map well to vector semantics
|
|
|
|
**Verdict**: Rejected - CRDT semantics more natural for vectors.
|
|
|
|
### Option 4: CRDT with Causal Ordering (Selected)
|
|
|
|
**Description**: CRDT merge with per-dimension version tracking.
|
|
|
|
**Pros**:
|
|
- Automatic convergence
|
|
- Semantically meaningful merges
|
|
- Flexible strategies
|
|
- Proven correctness
|
|
|
|
**Cons**:
|
|
- Per-dimension overhead
|
|
- More complex than LWW
|
|
|
|
**Verdict**: Adopted - optimal balance of correctness and flexibility.
|
|
|
|
---
|
|
|
|
## Technical Specification
|
|
|
|
### Conflict Detection API
|
|
|
|
```rust
|
|
/// Detect conflicts between deltas
|
|
pub fn detect_conflicts(
|
|
local_delta: &VectorDelta,
|
|
remote_delta: &VectorDelta,
|
|
) -> ConflictReport {
|
|
let mut conflicts = Vec::new();
|
|
|
|
// Check if targeting same vector
|
|
if local_delta.vector_id != remote_delta.vector_id {
|
|
return ConflictReport::NoConflict;
|
|
}
|
|
|
|
// Check causality
|
|
let ordering = local_delta.clock.compare(&remote_delta.clock);
|
|
|
|
if ordering != ClockOrdering::Concurrent {
|
|
return ConflictReport::Ordered { ordering };
|
|
}
|
|
|
|
// Analyze operation conflicts
|
|
let op_conflicts = analyze_operation_conflicts(
|
|
&local_delta.operation,
|
|
&remote_delta.operation,
|
|
);
|
|
|
|
ConflictReport::Conflicts {
|
|
vector_id: local_delta.vector_id.clone(),
|
|
local_delta_id: local_delta.delta_id.clone(),
|
|
remote_delta_id: remote_delta.delta_id.clone(),
|
|
dimension_conflicts: op_conflicts,
|
|
}
|
|
}
|
|
```
|
|
|
|
### Configuration
|
|
|
|
```rust
|
|
#[derive(Debug, Clone, Serialize, Deserialize)]
|
|
pub struct ConflictConfig {
|
|
/// Default resolution strategy
|
|
pub default_strategy: ConflictStrategy,
|
|
/// Per-namespace strategies
|
|
pub namespace_strategies: HashMap<String, ConflictStrategy>,
|
|
/// Per-dimension strategies (dimension index -> strategy)
|
|
pub dimension_strategies: HashMap<usize, ConflictStrategy>,
|
|
/// Whether to log conflicts
|
|
pub log_conflicts: bool,
|
|
/// Conflict callback for custom handling
|
|
#[serde(skip)]
|
|
pub conflict_callback: Option<ConflictCallback>,
|
|
/// Tombstone retention duration
|
|
pub tombstone_retention: Duration,
|
|
}
|
|
|
|
impl Default for ConflictConfig {
|
|
fn default() -> Self {
|
|
Self {
|
|
default_strategy: ConflictStrategy::LastWriteWins,
|
|
namespace_strategies: HashMap::new(),
|
|
dimension_strategies: HashMap::new(),
|
|
log_conflicts: true,
|
|
conflict_callback: None,
|
|
tombstone_retention: Duration::from_secs(86400 * 7), // 7 days
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
## Consequences
|
|
|
|
### Benefits
|
|
|
|
1. **Automatic Convergence**: All replicas converge without coordination
|
|
2. **Partition Tolerance**: Works during network partitions
|
|
3. **Semantic Merging**: Vector-appropriate conflict resolution
|
|
4. **Flexibility**: Configurable per-dimension strategies
|
|
5. **Auditability**: All conflicts logged with resolution
|
|
|
|
### Risks and Mitigations
|
|
|
|
| Risk | Probability | Impact | Mitigation |
|
|
|------|-------------|--------|------------|
|
|
| Memory overhead | Medium | Medium | Lazy per-dimension tracking |
|
|
| Merge complexity | Low | Medium | Thorough testing, formal verification |
|
|
| Strategy misconfiguration | Medium | High | Sensible defaults, validation |
|
|
| Tombstone accumulation | Medium | Medium | Garbage collection policies |
|
|
|
|
---
|
|
|
|
## References
|
|
|
|
1. Shapiro, M., et al. "Conflict-free Replicated Data Types." SSS 2011.
|
|
2. Kleppmann, M., & Almeida, P. S. "A Conflict-Free Replicated JSON Datatype." IEEE TPDS 2017.
|
|
3. Ruvector conflict.rs: Existing conflict resolution implementation
|
|
4. ADR-DB-001: Delta Behavior Core Architecture
|
|
|
|
---
|
|
|
|
## Related Decisions
|
|
|
|
- **ADR-DB-001**: Delta Behavior Core Architecture
|
|
- **ADR-DB-003**: Delta Propagation Protocol
|
|
- **ADR-DB-005**: Delta Index Updates
|