ADR-DB-004: Delta Conflict Resolution
Status: Proposed
Date: 2026-01-28
Authors: RuVector Architecture Team
Deciders: Architecture Review Board
Parent: ADR-DB-001 Delta Behavior Core Architecture
Version History
| Version | Date | Author | Changes |
|---|---|---|---|
| 0.1 | 2026-01-28 | Architecture Team | Initial proposal |
Context and Problem Statement
The Conflict Challenge
In distributed delta-first systems, concurrent updates to the same vector can create conflicts:
Time ─────────────────────────────────────────>
Replica A: v0 ──[Δa: dim[5]=0.8]──> v1a
            \
             \
Replica B:    ──[Δb: dim[5]=0.3]──> v1b
Conflict: Both replicas modified dim[5] concurrently
Conflict Scenarios
| Scenario | Frequency | Complexity |
|---|---|---|
| Same dimension, different values | High | Simple |
| Overlapping sparse updates | Medium | Moderate |
| Scale vs. sparse conflict | Low | Complex |
| Delete vs. update race | Low | Critical |
Requirements
- Deterministic: Same conflicts resolve identically on all replicas
- Commutative: Order of conflict discovery doesn't affect outcome
- Low Latency: Resolution shouldn't block writes
- Meaningful: Results should be mathematically sensible for vectors
Decision
Adopt CRDT-Based Resolution with Causal Ordering
We implement conflict resolution using Conflict-free Replicated Data Types (CRDTs) with vector-specific merge semantics.
CRDT Design for Vectors
Vector as a CRDT
/// CRDT-enabled vector with per-dimension version tracking
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CrdtVector {
/// Vector ID
pub id: VectorId,
/// Dimensions with per-dimension causality
pub dimensions: Vec<CrdtDimension>,
/// Overall vector clock
pub clock: VectorClock,
/// Deletion marker
pub tombstone: Option<Tombstone>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CrdtDimension {
/// Current value
pub value: f32,
/// Last update clock
pub clock: VectorClock,
/// Originating replica
pub origin: ReplicaId,
/// Timestamp of update
pub timestamp: DateTime<Utc>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Tombstone {
pub deleted_at: DateTime<Utc>,
pub deleted_by: ReplicaId,
pub clock: VectorClock,
}
Merge Operation
impl CrdtVector {
/// Merge another CRDT vector into this one
pub fn merge(&mut self, other: &CrdtVector) -> MergeResult {
assert_eq!(self.id, other.id);
let mut conflicts = Vec::new();
// Handle tombstone
self.tombstone = match (&self.tombstone, &other.tombstone) {
(None, None) => None,
(Some(t), None) | (None, Some(t)) => Some(t.clone()),
(Some(t1), Some(t2)) => {
// Latest tombstone wins
Some(if t1.deleted_at > t2.deleted_at { t1.clone() } else { t2.clone() })
}
};
// If deleted, no need to merge dimensions
if self.tombstone.is_some() {
return MergeResult { conflicts, tombstoned: true };
}
// Merge each dimension. Index-based iteration avoids holding a mutable
// borrow of self.dimensions while calling resolve_dimension_conflict(&self)
debug_assert_eq!(self.dimensions.len(), other.dimensions.len());
for i in 0..self.dimensions.len() {
let ordering = self.dimensions[i].clock.compare(&other.dimensions[i].clock);
match ordering {
ClockOrdering::Before => {
// Other is strictly newer, take it
self.dimensions[i] = other.dimensions[i].clone();
}
ClockOrdering::After | ClockOrdering::Equal => {
// Self is newer or equal, keep it
}
ClockOrdering::Concurrent => {
// Conflict! Apply resolution strategy
let resolved = self.resolve_dimension_conflict(i, &self.dimensions[i], &other.dimensions[i]);
conflicts.push(DimensionConflict {
dimension: i,
local_value: self.dimensions[i].value,
remote_value: other.dimensions[i].value,
resolved_value: resolved.dimension.value,
strategy: resolved.strategy,
});
self.dimensions[i] = resolved.dimension;
}
}
}
// Update overall clock
self.clock.merge(&other.clock);
MergeResult { conflicts, tombstoned: false }
}
fn resolve_dimension_conflict(
&self,
dim_idx: usize,
local: &CrdtDimension,
remote: &CrdtDimension,
) -> ResolvedDimension {
// Strategy selection based on configured policy
match self.conflict_strategy(dim_idx) {
ConflictStrategy::LastWriteWins => {
// Latest timestamp wins; tie-break on origin (assumes ReplicaId: Ord)
// so concurrent merges pick the same winner on every replica
let winner = if (local.timestamp, &local.origin) > (remote.timestamp, &remote.origin) { local } else { remote };
ResolvedDimension {
dimension: winner.clone(),
strategy: ConflictStrategy::LastWriteWins,
}
}
ConflictStrategy::MaxValue => {
// Take maximum value
let max_val = local.value.max(remote.value);
let winner = if local.value >= remote.value { local } else { remote };
ResolvedDimension {
dimension: CrdtDimension {
value: max_val,
clock: merge_clocks(&local.clock, &remote.clock),
origin: winner.origin.clone(),
timestamp: local.timestamp.max(remote.timestamp),
},
strategy: ConflictStrategy::MaxValue,
}
}
ConflictStrategy::Average => {
// Average the values
let avg = (local.value + remote.value) / 2.0;
ResolvedDimension {
dimension: CrdtDimension {
value: avg,
clock: merge_clocks(&local.clock, &remote.clock),
origin: "merged".into(), // synthetic replica id marking a merged value
timestamp: local.timestamp.max(remote.timestamp),
},
strategy: ConflictStrategy::Average,
}
}
ConflictStrategy::ReplicaPriority(priorities) => {
// Higher priority replica wins
let local_priority = priorities.get(&local.origin).copied().unwrap_or(0);
let remote_priority = priorities.get(&remote.origin).copied().unwrap_or(0);
let winner = if local_priority >= remote_priority { local } else { remote };
ResolvedDimension {
dimension: winner.clone(),
strategy: ConflictStrategy::ReplicaPriority(priorities),
}
}
}
}
}
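To make the merge concrete, here is a minimal, self-contained sketch of per-dimension last-write-wins merging. The types are simplified stand-ins, not the production API: a single `u64` timestamp replaces the vector clock, and only the LWW strategy is shown.

```rust
// Simplified stand-ins for CrdtVector/CrdtDimension: one u64 timestamp
// per dimension instead of a full vector clock, LWW resolution only.
#[derive(Clone, Debug, PartialEq)]
struct Dim {
    value: f32,
    ts: u64,
}

// Per-dimension last-write-wins merge over two equal-length vectors.
fn merge_dims(a: &[Dim], b: &[Dim]) -> Vec<Dim> {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| if y.ts > x.ts { y.clone() } else { x.clone() })
        .collect()
}

fn main() {
    let replica_a = vec![Dim { value: 0.8, ts: 2 }, Dim { value: 0.1, ts: 1 }];
    let replica_b = vec![Dim { value: 0.3, ts: 1 }, Dim { value: 0.9, ts: 3 }];

    // Merging in either order converges to the same state.
    let ab = merge_dims(&replica_a, &replica_b);
    let ba = merge_dims(&replica_b, &replica_a);
    assert_eq!(ab, ba);
    assert_eq!(ab[0].value, 0.8); // A wrote dim 0 later
    assert_eq!(ab[1].value, 0.9); // B wrote dim 1 later
    println!("converged: {:?}", ab);
}
```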
Conflict Resolution Strategies
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum ConflictStrategy {
/// Last write wins based on timestamp
LastWriteWins,
/// Take maximum value (for monotonic dimensions)
MaxValue,
/// Take minimum value
MinValue,
/// Average conflicting values
Average,
/// Weighted average based on replica trust
WeightedAverage(HashMap<ReplicaId, f32>),
/// Replica priority ordering
ReplicaPriority(HashMap<ReplicaId, u32>),
/// Custom merge function (this variant cannot be derived for
/// Serialize/Deserialize or Debug and needs custom impls or skipping)
Custom(CustomMergeFn),
}
pub type CustomMergeFn = Arc<dyn Fn(f32, f32, &ConflictContext) -> f32 + Send + Sync>;
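As an illustration, a custom strategy can be supplied as an ordinary function or closure wrapped in the `Arc`. The `ConflictContext` below is a stub standing in for the real context type, which would carry replica ids and clocks:

```rust
use std::sync::Arc;

// Stub standing in for the real ConflictContext.
struct ConflictContext {
    dimension: usize,
}

type CustomMergeFn = Arc<dyn Fn(f32, f32, &ConflictContext) -> f32 + Send + Sync>;

// Example policy: average the conflicting values, clamped to [0, 1].
fn clamp_avg(local: f32, remote: f32, _ctx: &ConflictContext) -> f32 {
    ((local + remote) / 2.0).clamp(0.0, 1.0)
}

fn main() {
    let merge: CustomMergeFn = Arc::new(clamp_avg);
    let ctx = ConflictContext { dimension: 5 };
    let merged = merge(0.8, 0.3, &ctx);
    assert!((merged - 0.55).abs() < 1e-6);
    println!("dim {} merged to {}", ctx.dimension, merged);
}
```

Note that for convergence a custom function should be commutative in its first two arguments, since either replica may see either value as "local".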
Vector Clock Implementation
/// Extended vector clock for delta tracking
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct VectorClock {
/// Replica -> logical timestamp mapping
clock: HashMap<ReplicaId, u64>,
}
impl VectorClock {
pub fn new() -> Self {
Self { clock: HashMap::new() }
}
/// Increment for local replica
pub fn increment(&mut self, replica: &ReplicaId) {
let counter = self.clock.entry(replica.clone()).or_insert(0);
*counter += 1;
}
/// Get timestamp for replica
pub fn get(&self, replica: &ReplicaId) -> u64 {
self.clock.get(replica).copied().unwrap_or(0)
}
/// Merge with another clock (take max)
pub fn merge(&mut self, other: &VectorClock) {
for (replica, &ts) in &other.clock {
let current = self.clock.entry(replica.clone()).or_insert(0);
*current = (*current).max(ts);
}
}
/// Compare two clocks for causality
pub fn compare(&self, other: &VectorClock) -> ClockOrdering {
let mut less_than = false;
let mut greater_than = false;
// Check all replicas in self
for (replica, &self_ts) in &self.clock {
let other_ts = other.get(replica);
if self_ts < other_ts {
less_than = true;
} else if self_ts > other_ts {
greater_than = true;
}
}
// Check replicas only in other
for (replica, &other_ts) in &other.clock {
if !self.clock.contains_key(replica) && other_ts > 0 {
less_than = true;
}
}
match (less_than, greater_than) {
(false, false) => ClockOrdering::Equal,
(true, false) => ClockOrdering::Before,
(false, true) => ClockOrdering::After,
(true, true) => ClockOrdering::Concurrent,
}
}
/// Check if concurrent (conflicting)
pub fn is_concurrent(&self, other: &VectorClock) -> bool {
matches!(self.compare(other), ClockOrdering::Concurrent)
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ClockOrdering {
Equal,
Before,
After,
Concurrent,
}
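A standalone sketch of the comparison logic (with `String` replica ids for brevity; this mirrors `compare` above but is not the production type) shows how concurrent writes are detected and how merging the clocks restores causal order:

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClockOrdering {
    Equal,
    Before,
    After,
    Concurrent,
}

// Minimal standalone clock mirroring the compare() logic above.
#[derive(Default, Clone)]
struct Clock(HashMap<String, u64>);

impl Clock {
    fn tick(&mut self, replica: &str) {
        *self.0.entry(replica.to_string()).or_insert(0) += 1;
    }

    fn compare(&self, other: &Clock) -> ClockOrdering {
        let keys: HashSet<_> = self.0.keys().chain(other.0.keys()).collect();
        let (mut less, mut greater) = (false, false);
        for k in keys {
            let a = self.0.get(k).copied().unwrap_or(0);
            let b = other.0.get(k).copied().unwrap_or(0);
            if a < b { less = true; }
            if a > b { greater = true; }
        }
        match (less, greater) {
            (false, false) => ClockOrdering::Equal,
            (true, false) => ClockOrdering::Before,
            (false, true) => ClockOrdering::After,
            (true, true) => ClockOrdering::Concurrent,
        }
    }
}

fn main() {
    let mut a = Clock::default();
    let mut b = Clock::default();
    a.tick("A"); // A writes without having seen B's write
    b.tick("B"); // B writes without having seen A's write
    assert_eq!(a.compare(&b), ClockOrdering::Concurrent); // conflict

    b.0.extend(a.0.clone()); // B merges A's entries (keys disjoint here)
    b.tick("B");
    assert_eq!(a.compare(&b), ClockOrdering::Before); // causally ordered again
    println!("ok");
}
```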
Operation-Based Delta Merging
/// Merge concurrent delta operations
pub fn merge_delta_operations(
local: &DeltaOperation,
remote: &DeltaOperation,
strategy: &ConflictStrategy,
) -> DeltaOperation {
match (local, remote) {
// Both sparse: merge index sets
(
DeltaOperation::Sparse { indices: li, values: lv },
DeltaOperation::Sparse { indices: ri, values: rv },
) => {
// BTreeSet gives sorted iteration, so the merged output is deterministic
let local_map: HashMap<_, _> = li.iter().copied().zip(lv.iter().copied()).collect();
let remote_map: HashMap<_, _> = ri.iter().copied().zip(rv.iter().copied()).collect();
let all_indices: BTreeSet<_> = li.iter().chain(ri.iter()).copied().collect();
let mut merged_indices = Vec::with_capacity(all_indices.len());
let mut merged_values = Vec::with_capacity(all_indices.len());
for idx in all_indices {
let value = match (local_map.get(&idx), remote_map.get(&idx)) {
(Some(&l), None) => l,
(None, Some(&r)) => r,
(Some(&l), Some(&r)) => resolve_value_conflict(l, r, strategy),
(None, None) => unreachable!(),
};
merged_indices.push(idx);
merged_values.push(value);
}
DeltaOperation::Sparse {
indices: merged_indices,
values: merged_values,
}
}
// Sparse vs Dense: apply sparse changes on top of dense
(
DeltaOperation::Sparse { indices, values },
DeltaOperation::Dense { vector },
)
| (
DeltaOperation::Dense { vector },
DeltaOperation::Sparse { indices, values },
) => {
let mut result = vector.clone();
for (&idx, &val) in indices.iter().zip(values.iter()) {
result[idx as usize] = val;
}
DeltaOperation::Dense { vector: result }
}
// Both dense: element-wise merge
(
DeltaOperation::Dense { vector: lv },
DeltaOperation::Dense { vector: rv },
) => {
let merged: Vec<f32> = lv.iter()
.zip(rv.iter())
.map(|(&l, &r)| resolve_value_conflict(l, r, strategy))
.collect();
DeltaOperation::Dense { vector: merged }
}
// Scale operations: compose
(
DeltaOperation::Scale { factor: f1 },
DeltaOperation::Scale { factor: f2 },
) => {
DeltaOperation::Scale { factor: f1 * f2 }
}
// Delete wins over updates (tombstone semantics)
(DeltaOperation::Delete, _) | (_, DeltaOperation::Delete) => {
DeltaOperation::Delete
}
// Other combinations: convert to dense and merge
_ => {
// Fallback: materialize both and merge
DeltaOperation::Dense {
vector: vec![], // Would compute actual merge
}
}
}
}
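The sparse-sparse branch can be distilled into a small deterministic sketch: union the index sets through a sorted map and resolve only the overlapping indices. The types here are simplified (an index-to-value map instead of parallel vectors), and the symmetric Average strategy is hard-coded:

```rust
use std::collections::BTreeMap;

// Simplified sparse delta: sorted index -> value map. BTreeMap keeps the
// merged output deterministic across replicas.
fn merge_sparse(
    local: &BTreeMap<u32, f32>,
    remote: &BTreeMap<u32, f32>,
    resolve: impl Fn(f32, f32) -> f32,
) -> BTreeMap<u32, f32> {
    let mut out = local.clone();
    for (&idx, &rv) in remote {
        out.entry(idx)
            .and_modify(|lv| *lv = resolve(*lv, rv)) // overlapping index: resolve
            .or_insert(rv); // remote-only index: take as-is
    }
    out
}

fn main() {
    let local = BTreeMap::from([(2, 0.8), (7, 0.1)]);
    let remote = BTreeMap::from([(2, 0.4), (9, 0.5)]);
    let avg = |l: f32, r: f32| (l + r) / 2.0;

    let merged = merge_sparse(&local, &remote, avg);
    assert_eq!(merged[&2], 0.6); // conflicting index averaged
    assert_eq!(merged[&7], 0.1); // local-only index kept
    assert_eq!(merged[&9], 0.5); // remote-only index kept
    // Average is symmetric, so this merge commutes:
    assert_eq!(merged, merge_sparse(&remote, &local, avg));
    println!("{:?}", merged);
}
```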
fn resolve_value_conflict(local: f32, remote: f32, strategy: &ConflictStrategy) -> f32 {
match strategy {
// No timestamps at this level: callers must pass the causally later
// write as `remote`, otherwise LWW here is not order-independent
ConflictStrategy::LastWriteWins => remote,
ConflictStrategy::MaxValue => local.max(remote),
ConflictStrategy::MinValue => local.min(remote),
ConflictStrategy::Average => (local + remote) / 2.0,
ConflictStrategy::WeightedAverage(weights) => {
// Would need context for proper weighting
(local + remote) / 2.0
}
_ => remote, // Default fallback
}
}
Consistency Guarantees
Eventual Consistency
The CRDT approach provides strong eventual consistency:
- Eventual Delivery: The propagation layer ensures all deltas eventually reach all replicas
- Convergence: Replicas that have received the same set of deltas converge to the same state
- Termination: Merge operations always terminate
Causal Consistency
Vector clocks ensure causal ordering:
Property: If Δa happens-before Δb, then on all replicas:
Δa is applied before Δb
Proof sketch: Vector clock comparison ensures a delta's causal
dependencies have been applied before the delta itself is applied
Conflict Freedom Theorem
For any two concurrent deltas Δa and Δb:
merge(Δa, Δb) = merge(Δb, Δa) [Commutativity]
merge(Δa, merge(Δb, Δc)) = merge(merge(Δa, Δb), Δc) [Associativity]
merge(Δa, Δa) = Δa [Idempotence]
These properties ensure:
- Order-independent convergence
- Safe retry/redelivery
- Partition tolerance
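These laws are easy to spot-check for the MaxValue strategy, here reduced to a plain element-wise max over `f32` slices:

```rust
// Element-wise max over two equal-length slices: the MaxValue merge
// reduced to its essence.
fn merge_max(a: &[f32], b: &[f32]) -> Vec<f32> {
    a.iter().zip(b.iter()).map(|(&x, &y)| x.max(y)).collect()
}

fn main() {
    let (a, b, c) = (vec![0.1, 0.9], vec![0.5, 0.2], vec![0.3, 0.7]);

    // Commutativity: merge(a, b) == merge(b, a)
    assert_eq!(merge_max(&a, &b), merge_max(&b, &a));
    // Associativity: merge(a, merge(b, c)) == merge(merge(a, b), c)
    assert_eq!(
        merge_max(&a, &merge_max(&b, &c)),
        merge_max(&merge_max(&a, &b), &c)
    );
    // Idempotence: merge(a, a) == a
    assert_eq!(merge_max(&a, &a), a);
    println!("commutative, associative, idempotent");
}
```

Note that not every strategy satisfies all three laws as a raw value combiner: Average commutes but is not associative, which is one reason the merge tracks causality per dimension rather than repeatedly re-averaging already-merged values.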
Considered Options
Option 1: Last-Write-Wins (LWW)
Description: Latest timestamp wins, simple conflict resolution.
Pros:
- Extremely simple
- Low overhead
- Deterministic
Cons:
- Clock skew sensitivity
- Loses concurrent updates
- No semantic awareness
Verdict: Not adopted as the overall mechanism; retained as a configurable strategy (currently the initial default_strategy in ConflictConfig).
Option 2: Pure Vector Clocks
Description: Track causality, reject concurrent writes.
Pros:
- Perfect causality tracking
- No data loss
Cons:
- Requires conflict handling at application level
- Concurrent writes fail
Verdict: Rejected - too restrictive for vector workloads.
Option 3: Operational Transform (OT)
Description: Transform operations to maintain consistency.
Pros:
- Preserves all intentions
- Used successfully in collaborative editing
Cons:
- Complex transformation functions
- Hard to prove correctness
- Doesn't map well to vector semantics
Verdict: Rejected - CRDT semantics more natural for vectors.
Option 4: CRDT with Causal Ordering (Selected)
Description: CRDT merge with per-dimension version tracking.
Pros:
- Automatic convergence
- Semantically meaningful merges
- Flexible strategies
- Proven correctness
Cons:
- Per-dimension overhead
- More complex than LWW
Verdict: Adopted - optimal balance of correctness and flexibility.
Technical Specification
Conflict Detection API
/// Detect conflicts between deltas
pub fn detect_conflicts(
local_delta: &VectorDelta,
remote_delta: &VectorDelta,
) -> ConflictReport {
let mut conflicts = Vec::new();
// Check if targeting same vector
if local_delta.vector_id != remote_delta.vector_id {
return ConflictReport::NoConflict;
}
// Check causality
let ordering = local_delta.clock.compare(&remote_delta.clock);
if ordering != ClockOrdering::Concurrent {
return ConflictReport::Ordered { ordering };
}
// Analyze operation conflicts
let op_conflicts = analyze_operation_conflicts(
&local_delta.operation,
&remote_delta.operation,
);
ConflictReport::Conflicts {
vector_id: local_delta.vector_id.clone(),
local_delta_id: local_delta.delta_id.clone(),
remote_delta_id: remote_delta.delta_id.clone(),
dimension_conflicts: op_conflicts,
}
}
Configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ConflictConfig {
/// Default resolution strategy
pub default_strategy: ConflictStrategy,
/// Per-namespace strategies
pub namespace_strategies: HashMap<String, ConflictStrategy>,
/// Per-dimension strategies (dimension index -> strategy)
pub dimension_strategies: HashMap<usize, ConflictStrategy>,
/// Whether to log conflicts
pub log_conflicts: bool,
/// Conflict callback for custom handling
#[serde(skip)]
pub conflict_callback: Option<ConflictCallback>,
/// Tombstone retention duration
pub tombstone_retention: Duration,
}
impl Default for ConflictConfig {
fn default() -> Self {
Self {
default_strategy: ConflictStrategy::LastWriteWins,
namespace_strategies: HashMap::new(),
dimension_strategies: HashMap::new(),
log_conflicts: true,
conflict_callback: None,
tombstone_retention: Duration::from_secs(86400 * 7), // 7 days
}
}
}
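The precedence among the three strategy maps is not spelled out above. A plausible lookup order is dimension override first, then namespace, then default; this is an assumption, sketched here with strategies reduced to string labels:

```rust
use std::collections::HashMap;

// Sketch of the lookup order implied by ConflictConfig (assumed, not
// specified): dimension override > namespace override > global default.
struct Config {
    default_strategy: String,
    namespace_strategies: HashMap<String, String>,
    dimension_strategies: HashMap<usize, String>,
}

impl Config {
    fn strategy_for(&self, namespace: &str, dim: usize) -> &str {
        self.dimension_strategies
            .get(&dim)
            .or_else(|| self.namespace_strategies.get(namespace))
            .unwrap_or(&self.default_strategy)
    }
}

fn main() {
    let cfg = Config {
        default_strategy: "lww".into(),
        namespace_strategies: HashMap::from([("counters".into(), "max".into())]),
        dimension_strategies: HashMap::from([(5, "average".into())]),
    };
    assert_eq!(cfg.strategy_for("counters", 5), "average"); // dimension wins
    assert_eq!(cfg.strategy_for("counters", 0), "max");     // namespace fallback
    assert_eq!(cfg.strategy_for("other", 0), "lww");        // global default
    println!("lookup precedence ok");
}
```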
Consequences
Benefits
- Automatic Convergence: All replicas converge without coordination
- Partition Tolerance: Works during network partitions
- Semantic Merging: Vector-appropriate conflict resolution
- Flexibility: Configurable per-dimension strategies
- Auditability: All conflicts logged with resolution
Risks and Mitigations
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Memory overhead | Medium | Medium | Lazy per-dimension tracking |
| Merge complexity | Low | Medium | Thorough testing, formal verification |
| Strategy misconfiguration | Medium | High | Sensible defaults, validation |
| Tombstone accumulation | Medium | Medium | Garbage collection policies |
References
- Shapiro, M., et al. "Conflict-free Replicated Data Types." SSS 2011.
- Kleppmann, M., & Beresford, A. R. "A Conflict-Free Replicated JSON Datatype." IEEE TPDS 2017.
- RuVector conflict.rs: Existing conflict resolution implementation
- ADR-DB-001: Delta Behavior Core Architecture
Related Decisions
- ADR-DB-001: Delta Behavior Core Architecture
- ADR-DB-003: Delta Propagation Protocol
- ADR-DB-005: Delta Index Updates