Squashed 'vendor/ruvector/' content from commit b64c2172

git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00
commit d803bfe2b1
7854 changed files with 3522914 additions and 0 deletions
--- a/crates/ruvector-mincut/docs/adr/DDC-001-coherence-gate-design-criteria.md
+++ b/crates/ruvector-mincut/docs/adr/DDC-001-coherence-gate-design-criteria.md
@@ -0,0 +1,370 @@
+# DDC-001: Anytime-Valid Coherence Gate - Design Decision Criteria
+
+**Version**: 1.0
+**Date**: 2026-01-17
+**Related ADR**: ADR-001-anytime-valid-coherence-gate
+
+## Purpose
+
+This document specifies the design decision criteria for implementing the Anytime-Valid Coherence Gate (AVCG). It provides concrete guidance for architectural choices, implementation trade-offs, and acceptance criteria.
+
+---
+
+## 1. Graph Model Design Decisions
+
+### DDC-1.1: Action Graph Construction
+
+**Decision Required**: How to construct the action graph G_t from agent state?
+
+| Option | Description | Pros | Cons | Recommendation |
+|--------|-------------|------|------|----------------|
+| **A. State-Action Pairs** | Nodes = (state, action), Edges = transitions | Fine-grained control; precise cuts | Large graphs; O(\|S\|·\|A\|) nodes | Use for high-stakes domains |
+| **B. Abstract State Clusters** | Nodes = state clusters, Edges = aggregate transitions | Smaller graphs; faster updates | May miss nuanced boundaries | **Recommended for v0** |
+| **C. Learned Embeddings** | Nodes = learned state embeddings | Adaptive; captures latent structure | Requires training data; less interpretable | Future enhancement |
+
+**Acceptance Criteria**:
+- [ ] Graph construction completes in < 100μs for typical agent states
+- [ ] Graph accurately represents reachability to unsafe states
+- [ ] Witness partitions are human-interpretable
+
+### DDC-1.2: Edge Weight Semantics
+
+**Decision Required**: What do edge weights represent?
+
+| Option | Interpretation | Use Case |
+|--------|---------------|----------|
+| **A. Risk Scores** | Higher weight = higher risk of unsafe outcome | Min-cut = minimum total risk across the boundary to unsafe states |
+| **B. Inverse Probability** | Higher weight = less likely transition | Min-cut = lowest total inverse-probability weight across the boundary to unsafe states |
+| **C. Unit Weights** | All edges weight 1.0 | Min-cut = fewest transitions to block to isolate unsafe states |
+| **D. Conformal Set Size** | Weight = \|C_t\| for that action | Natural integration with predictive uncertainty |
+
+**Recommendation**: Option D, which ties edge weights (and therefore the min-cut value) to conformal prediction-set size.
+
+**Acceptance Criteria**:
+- [ ] Weight semantics are documented and consistent
+- [ ] Min-cut value has interpretable meaning for operators
+- [ ] Weights update correctly on new observations
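To make the min-cut value and witness partition concrete, here is a minimal sketch using Edmonds-Karp max-flow on a toy action graph. The function name `min_cut`, the node labels, and the weights are illustrative assumptions, not part of the crate's API, and the production algorithm in `ruvector-mincut` is not claimed to work this way.

```python
from collections import deque

def min_cut(edges, source, sink):
    """Edmonds-Karp max-flow; returns (cut_value, source_side, sink_side)."""
    # Residual capacities as a nested dict; reverse edges start at 0.
    cap = {}
    for u, v, w in edges:
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0.0) + w
        cap[v].setdefault(u, 0.0)

    def bfs():
        # Breadth-first search over edges that still have residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0.0
    while True:
        parent = bfs()
        if sink not in parent:
            break
        # Walk back from the sink, find the bottleneck, push flow along the path.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        flow += push

    # Nodes still reachable from the source form one side of the witness partition.
    source_side = set(parent)
    return flow, source_side, set(cap) - source_side

# Hypothetical toy graph: (from, to, weight)
edges = [("start", "a", 3), ("start", "b", 2),
         ("a", "unsafe", 1), ("b", "unsafe", 2), ("a", "b", 1)]
value, safe_side, unsafe_side = min_cut(edges, "start", "unsafe")
print(value, sorted(safe_side), sorted(unsafe_side))
```

The returned pair of node sets is the witness partition (S, V\S) an operator would inspect; the crossing edges are the transitions the cut value accounts for.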
+
+---
+
+## 2. Conformal Predictor Architecture
+
+### DDC-2.1: Base Predictor Selection
+
+**Decision Required**: Which base predictor to wrap with conformal prediction?
+
+| Option | Characteristics | Computational Cost |
+|--------|----------------|-------------------|
+| **A. Neural Network** | High capacity; requires calibration | Medium-High |
+| **B. Random Forest** | Built-in uncertainty; robust | Medium |
+| **C. Gaussian Process** | Natural uncertainty; O(n³) training | High |
+| **D. Ensemble with Dropout** | Approximate Bayesian; scalable | Medium |
+
+**Recommendation**: Option D (Ensemble with Dropout) for balance of capacity and uncertainty.
+
+**Acceptance Criteria**:
+- [ ] Base predictor achieves acceptable accuracy on held-out data
+- [ ] Prediction latency < 10ms for single action
+- [ ] Uncertainty estimates correlate with actual error rates
+
+### DDC-2.2: Non-Conformity Score Function
+
+**Decision Required**: How to compute non-conformity scores?
+
+| Option | Formula | Properties |
+|--------|---------|------------|
+| **A. Absolute Residual** | s(x,y) = \|y - ŷ(x)\| | Simple; symmetric |
+| **B. Normalized Residual** | s(x,y) = \|y - ŷ(x)\| / σ̂(x) | Scale-invariant |
+| **C. CQR** | s(x,y) = max(q̂_lo(x) - y, y - q̂_hi(x)) | Heteroscedastic coverage |
+
+**Recommendation**: Option C (CQR) for heteroscedastic agent environments.
+
+**Acceptance Criteria**:
+- [ ] Marginal coverage ≥ 1 - α over calibration window
+- [ ] Conditional coverage approximately uniform across feature space
+- [ ] Prediction sets are not trivially large
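The CQR score in Option C can be sketched as follows, assuming a standard split-conformal calibration step in which the margin is the ceil((n+1)(1-α))-th smallest calibration score. The function names and the calibration data are hypothetical; the quantile estimates q̂_lo and q̂_hi are taken as given.

```python
import math

def cqr_score(y, q_lo, q_hi):
    """CQR non-conformity score: positive when y falls outside [q_lo, q_hi]."""
    return max(q_lo - y, y - q_hi)

def conformal_quantile(scores, alpha):
    """Finite-sample quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def cqr_interval(q_lo, q_hi, q_hat):
    """Widen the quantile band by q_hat (a negative q_hat shrinks it)."""
    return q_lo - q_hat, q_hi + q_hat

# Hypothetical calibration triples: (observed y, lower quantile, upper quantile)
calibration = [(1.0, 0.5, 1.5), (2.0, 1.0, 1.75), (0.5, 1.0, 2.0), (3.0, 2.0, 2.25)]
scores = [cqr_score(y, lo, hi) for y, lo, hi in calibration]
q_hat = conformal_quantile(scores, alpha=0.2)
print(cqr_interval(2.0, 5.0, q_hat))
```

With only four calibration points the margin is the largest score; a realistic calibration window would be far larger.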
+
+### DDC-2.3: Shift Adaptation Method
+
+**Decision Required**: How to adapt conformal predictor to distribution shift?
+
+| Method | Adaptation Speed | Conservativeness |
+|--------|-----------------|------------------|
+| **A. ACI (Adaptive Conformal)** | Medium | High |
+| **B. Retrospective Adjustment** | Fast | Medium |
+| **C. COP (Conformal Optimistic)** | Fastest | Low (but valid) |
+| **D. CORE (RL-based)** | Adaptive | Task-dependent |
+
+**Recommendation**: Hybrid approach:
+- Use COP for normal operation (fast, less conservative)
+- Fall back to ACI under detected severe shift
+- Use retrospective adjustment for post-hoc correction
+
+**Acceptance Criteria**:
+- [ ] Coverage maintained during gradual shift (δ < 0.1/step)
+- [ ] Recovery to target coverage within 100 steps after abrupt shift
+- [ ] No catastrophic coverage failures (coverage never < 0.5)
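For reference, the ACI fallback can be sketched with its usual online update, α_{t+1} = α_t + γ(α − err_t), where err_t is 1 when the last prediction set missed the outcome. The step size `gamma` and the helper names are assumptions for illustration; COP and CORE are not shown.

```python
def aci_step(alpha_t, target_alpha, covered, gamma=0.01):
    """One ACI update: alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t)."""
    err_t = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - err_t)

def run_aci(covered_flags, target_alpha=0.1, gamma=0.01):
    """Return the working levels alpha_1, ..., alpha_{T+1} for a coverage history."""
    alphas = [target_alpha]
    for covered in covered_flags:
        alphas.append(aci_step(alphas[-1], target_alpha, covered, gamma))
    return alphas

def effective_level(alpha_t):
    """Clamp to [0, 1]: 0 means 'predict everything', 1 means 'predict nothing'."""
    return min(1.0, max(0.0, alpha_t))

# Two misses followed by a hit: the working level drops (wider sets), then recovers slightly.
trajectory = run_aci([False, False, True], target_alpha=0.1, gamma=0.05)
print([round(a, 4) for a in trajectory])
```

A miss lowers the working level, which widens subsequent prediction sets; a larger `gamma` speeds recovery after abrupt shift at the cost of noisier set sizes.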
+
+---
+
+## 3. E-Process Construction
+
+### DDC-3.1: E-Value Computation Method
+
+**Decision Required**: How to compute per-action e-values?
+
+| Method | Requirements | Robustness |
+|--------|--------------|------------|
+| **A. Likelihood Ratio** | Density models for H₀ and H₁ | Low (model-dependent) |
+| **B. Universal Inference** | Split data; no density needed | Medium |
+| **C. Mixture E-Values** | Multiple alternatives | High (hedged) |
+| **D. Betting E-Values** | Online learning framework | High (adaptive) |
+
+**Recommendation**: Option C (Mixture E-Values) for robustness:
+```
+e_t = (1/K) Σ_k e_t^{(k)}
+```
+Where each e_t^{(k)} tests a different alternative hypothesis.
+
+**Acceptance Criteria**:
+- [ ] E[e_t | H₀] ≤ 1 verified empirically
+- [ ] Power against reasonable alternatives > 0.5
+- [ ] Computation time < 1ms per e-value
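A minimal sketch of the mixture formula, assuming binary safe/unsafe outcomes and the "Action Safety" null P(unsafe) ≤ p₀. Each component is a Bernoulli likelihood ratio against one alternative rate q_k > p₀; the rates and function names are illustrative assumptions.

```python
def bernoulli_lr_evalue(unsafe, p0, q):
    """Likelihood ratio of Bernoulli(q) against Bernoulli(p0) for one outcome."""
    return q / p0 if unsafe else (1 - q) / (1 - p0)

def mixture_evalue(unsafe, p0, alternatives):
    """Equal-weight mixture e_t = (1/K) * sum_k e_t^(k)."""
    return sum(bernoulli_lr_evalue(unsafe, p0, q) for q in alternatives) / len(alternatives)

def expected_under(p, p0, alternatives):
    """Exact expectation of the mixture e-value when P(unsafe) = p."""
    return (p * mixture_evalue(True, p0, alternatives)
            + (1 - p) * mixture_evalue(False, p0, alternatives))

p0 = 0.05
alternatives = [0.1, 0.2, 0.4]  # hypothetical unsafe rates, all above p0
print(expected_under(0.05, p0, alternatives))  # at the null boundary
print(expected_under(0.30, p0, alternatives))  # under a clear violation
```

Because every alternative exceeds p₀, the expectation is nondecreasing in the true unsafe rate, so checking the boundary p = p₀ covers the whole composite null.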
+
+### DDC-3.2: E-Process Update Rule
+
+**Decision Required**: How to update the e-process over time?
+
+| Rule | Formula | Properties |
+|------|---------|------------|
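+| **A. Product** | E_t = Π_{i=1}^t e_i | Aggressive; exponential growth under H₁; anytime-valid if each e_i is valid given the past |
+| **B. Average** | E_t = (1/t) Σ_{i=1}^t e_i | Conservative; valid at a fixed t but not a supermartingale |
+| **C. Exponential Moving** | E_t = λ·e_t + (1-λ)·E_{t-1} | Balanced; forgetting; not a supermartingale in general |
+| **D. Mixture Supermartingale** | E_t = Σ_j w_j · E_t^{(j)} | Robust; hedged; requires w_j ≥ 0 and Σ_j w_j = 1 |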
+
+**Recommendation**:
+- Option A (Product) for high-stakes single decisions
+- Option D (Mixture) for continuous monitoring
+
+**Acceptance Criteria**:
+- [ ] E_t remains a nonnegative supermartingale under H₀
+- [ ] Stopping time τ has valid Type I error: P(E_τ ≥ 1/α) ≤ α
+- [ ] Power grows with evidence accumulation
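The product rule (Option A) and the 1/α crossing check from the acceptance criteria can be sketched as below. The class name `ProductEProcess` is hypothetical, and tracking the product in log space is an implementation assumption to avoid overflow, not something the source prescribes.

```python
import math

class ProductEProcess:
    """E_t = prod_{i<=t} e_i, tracked in log space."""

    def __init__(self):
        self.log_value = 0.0  # E_0 = 1

    def update(self, e_value):
        if e_value < 0:
            raise ValueError("e-values must be nonnegative")
        # A zero e-value sends the product to 0 permanently.
        self.log_value += math.log(e_value) if e_value > 0 else -math.inf
        return self.value

    @property
    def value(self):
        # math.exp overflows above roughly 709; report infinity instead of raising.
        return math.exp(self.log_value) if self.log_value < 700 else math.inf

    def crosses(self, alpha):
        """True once E_t >= 1/alpha, the threshold in the Type I error bound."""
        return self.log_value >= -math.log(alpha)

ep = ProductEProcess()
for e in [2.0, 2.0, 0.5]:
    ep.update(e)
print(ep.value, ep.crosses(0.05))
```

The zero-absorption behaviour illustrates why the product is called aggressive: a single e-value of 0 wipes out all accumulated evidence, which a mixture over several processes hedges against.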
+
+### DDC-3.3: Null Hypothesis Specification
+
+**Decision Required**: What constitutes the "coherence" null hypothesis?
+
+| Formulation | Meaning |
+|-------------|---------|
+| **A. Action Safety** | H₀: P(action leads to unsafe state) ≤ p₀ |
+| **B. State Stability** | H₀: P(state deviates from normal) ≤ p₀ |
+| **C. Policy Consistency** | H₀: Current policy ≈ reference policy |
+| **D. Composite** | H₀: (A) ∧ (B) ∧ (C) |
+
+**Recommendation**: Start with Option A, extend to Option D for production.
+
+**Acceptance Criteria**:
+- [ ] H₀ is well-specified and testable
+- [ ] False alarm rate matches target α
+- [ ] Null violations are meaningfully dangerous
+
+---
+
+## 4. Integration Architecture
+
+### DDC-4.1: Signal Combination Strategy
+
+**Decision Required**: How to combine the three signals into a gate decision?
+
+| Strategy | Logic | Properties |
+|----------|-------|------------|
+| **A. Sequential Short-Circuit** | Cut → Conformal → E-process | Fast rejection; ordered |
+| **B. Parallel with Voting** | All evaluate; majority rules | Robust; slower |
+| **C. Weighted Integration** | score = w₁·cut + w₂·conf + w₃·e | Flexible; needs tuning |
+| **D. Hierarchical** | E-process gates conformal, which gates cut | Layered authority |
+
+**Recommendation**: Option A (Sequential Short-Circuit):
+1. Min-cut DENY is immediate (structural safety)
+2. Conformal uncertainty gates e-process (no point accumulating evidence if the outcome is unpredictable)
+3. E-process makes final permit/defer decision
+
+**Acceptance Criteria**:
+- [ ] Gate latency < 50ms for typical decisions
+- [ ] No single-point-of-failure (graceful degradation)
+- [ ] Decision audit trail is complete
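The sequential short-circuit order can be sketched as a single function. This is one reading of the recommendation, with loud assumptions: the conformal stage returns DEFER when the prediction set exceeds `theta_uncertainty` as a fraction of the outcome space, and the e-process stage returns DENY below `tau_deny` and PERMIT at or above `tau_permit`, using the DDC-5.1 defaults. The function and parameter names are hypothetical and do not describe the existing `GateController`.

```python
from enum import Enum

class Decision(Enum):
    PERMIT = "permit"
    DEFER = "defer"
    DENY = "deny"

def gate(cut_denies, set_fraction, e_value,
         theta_uncertainty=0.5, tau_deny=0.01, tau_permit=100.0):
    """Return (decision, deciding_stage) using sequential short-circuit logic."""
    # 1. Structural check: a min-cut DENY is immediate.
    if cut_denies:
        return Decision.DENY, "min-cut"
    # 2. Conformal check: if the outcome is too unpredictable, defer
    #    without consulting the e-process (assumed behaviour).
    if set_fraction > theta_uncertainty:
        return Decision.DEFER, "conformal"
    # 3. E-process makes the final call (assumed threshold directions).
    if e_value < tau_deny:
        return Decision.DENY, "e-process"
    if e_value >= tau_permit:
        return Decision.PERMIT, "e-process"
    return Decision.DEFER, "e-process"

print(gate(cut_denies=False, set_fraction=0.2, e_value=150.0))
```

Returning the deciding stage alongside the decision is one simple way to feed the audit trail required by the acceptance criteria.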
+
+### DDC-4.2: Graceful Degradation
+
+**Decision Required**: How should the gate behave when components fail?
+
+| Component Failure | Fallback Behavior |
+|-------------------|-------------------|
+| Min-cut unavailable | Defer all actions; alert operator |
+| Conformal predictor fails | Use widened prediction sets (conservative) |
+| E-process computation fails | Use last valid e-value; decay confidence |
+| All components fail | Full DENY; require human approval |
+
+**Acceptance Criteria**:
+- [ ] Failure detection within 100ms
+- [ ] Fallback is never more permissive than normal operation (full DENY is the most conservative fallback)
+- [ ] Recovery is automatic when component restores
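The fallback table can be encoded directly as a lookup. The component keys and action strings below are hypothetical labels chosen for illustration; only the mapping itself comes from the table.

```python
KNOWN_COMPONENTS = {"mincut", "conformal", "eprocess"}

def fallback(failed):
    """Map a collection of failed components to the fallback actions from the table."""
    failed = set(failed)
    unknown = failed - KNOWN_COMPONENTS
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    # All components down: full DENY and human approval.
    if failed == KNOWN_COMPONENTS:
        return ["deny_all", "require_human_approval"]
    actions = []
    if "mincut" in failed:
        actions += ["defer_all", "alert_operator"]
    if "conformal" in failed:
        actions.append("widen_prediction_sets")
    if "eprocess" in failed:
        actions.append("reuse_last_e_value_with_decay")
    return actions

print(fallback({"conformal"}))
print(fallback({"mincut", "conformal", "eprocess"}))
```

How partial failures of two components should combine is not specified in the table; the sketch simply concatenates the individual fallbacks.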
+
+### DDC-4.3: Latency Budget Allocation
+
+**Decision Required**: How to allocate total latency budget across components?
+
+Given total budget T_total (e.g., 50ms):
+
+| Component | Allocation | Rationale |
+|-----------|------------|-----------|
+| Min-cut update | 0.2 · T | Amortized; subpolynomial update time |
+| Conformal prediction | 0.4 · T | Main computation |
+| E-process update | 0.2 · T | Arithmetic; fast |
+| Decision logic | 0.1 · T | Simple rules |
+| Receipt generation | 0.1 · T | Hashing; logging |
+
+**Acceptance Criteria**:
+- [ ] p99 latency < T_total
+- [ ] No component exceeds 2× its budget
+- [ ] Latency monitoring in place
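As a worked example of the allocation, the sketch below turns the shares into per-component budgets for T_total = 50 ms and flags components that exceed twice their budget. The dictionary keys and the `violations` helper are illustrative names, and the measured latencies are made up.

```python
SHARES = {
    "mincut_update": 0.2,
    "conformal_prediction": 0.4,
    "eprocess_update": 0.2,
    "decision_logic": 0.1,
    "receipt_generation": 0.1,
}

def allocate(total_ms):
    """Per-component latency budget in milliseconds."""
    return {name: share * total_ms for name, share in SHARES.items()}

def violations(measured_ms, total_ms, slack=2.0):
    """Names of components whose measured latency exceeds slack x budget."""
    budgets = allocate(total_ms)
    return sorted(name for name, ms in measured_ms.items()
                  if ms > slack * budgets[name])

budgets = allocate(50.0)
print(budgets)
# Hypothetical measurements: decision logic at 12 ms breaks its 2 x 5 ms limit.
print(violations({"decision_logic": 12.0, "eprocess_update": 15.0}, 50.0))
```

With the 50 ms example budget, conformal prediction gets 20 ms, which leaves headroom over the 10 ms single-action prediction target in DDC-2.1.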
+
+---
+
+## 5. Operational Parameters
+
+### DDC-5.1: Threshold Configuration
+
+| Parameter | Symbol | Default | Range | Tuning Guidance |
+|-----------|--------|---------|-------|-----------------|
+| E-process deny threshold | τ_deny | 0.01 | [0.001, 0.1] | Lower = more evidence required to deny |
+| E-process permit threshold | τ_permit | 100 | [10, 1000] | Higher = more evidence required |
+| Uncertainty threshold | θ_uncertainty | 0.5 | [0.1, 1.0] | Fraction of outcome space |
+| Confidence threshold | θ_confidence | 0.1 | [0.01, 0.3] | Fraction of outcome space |
+| Conformal coverage target | 1-α | 0.9 | [0.8, 0.99] | Higher = larger sets |
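One way to keep these parameters inside their documented ranges is a validated configuration object. The sketch below assumes a frozen dataclass named `GateThresholds`; the class and field names are hypothetical, while the defaults and ranges are copied from the table.

```python
from dataclasses import dataclass

RANGES = {
    "tau_deny": (0.001, 0.1),
    "tau_permit": (10.0, 1000.0),
    "theta_uncertainty": (0.1, 1.0),
    "theta_confidence": (0.01, 0.3),
    "coverage": (0.8, 0.99),
}

@dataclass(frozen=True)
class GateThresholds:
    tau_deny: float = 0.01
    tau_permit: float = 100.0
    theta_uncertainty: float = 0.5
    theta_confidence: float = 0.1
    coverage: float = 0.9  # conformal coverage target, 1 - alpha

    def __post_init__(self):
        # Reject any value outside the range documented in the table.
        for name, (lo, hi) in RANGES.items():
            value = getattr(self, name)
            if not lo <= value <= hi:
                raise ValueError(f"{name}={value} outside [{lo}, {hi}]")

    @property
    def alpha(self):
        """Miscoverage level implied by the coverage target."""
        return 1.0 - self.coverage

defaults = GateThresholds()
print(defaults)
```

Constructing `GateThresholds(tau_permit=5.0)` raises `ValueError`, so a misconfigured deployment fails at startup rather than at decision time.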
+
+### DDC-5.2: Audit Requirements
+
+| Requirement | Specification |
+|-------------|---------------|
+| Receipt retention | 90 days minimum |
+| Receipt format | JSON + protobuf |
+| Receipt signing | Ed25519 signature |
+| Receipt searchability | Indexed by action_id, timestamp, decision |
+| Receipt integrity | Merkle tree for batch verification |
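The batch-integrity requirement can be illustrated with a small Merkle root over JSON receipts, using only the standard library. Several choices here are assumptions rather than requirements from the table: SHA-256 as the hash, canonical JSON with sorted keys, the 0x00/0x01 leaf/node prefixes, and duplicating the last node on odd levels. Ed25519 signing is omitted because it needs a third-party library.

```python
import hashlib
import json

def leaf_hash(receipt):
    """Hash one receipt; canonical JSON makes key order irrelevant."""
    data = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(b"\x00" + data).digest()

def merkle_root(receipts):
    """Hex Merkle root of a non-empty batch of receipts."""
    if not receipts:
        raise ValueError("cannot build a Merkle tree over an empty batch")
    level = [leaf_hash(r) for r in receipts]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node (assumed convention)
        level = [hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()

# Hypothetical receipts carrying the indexed fields from the table.
r1 = {"action_id": "a1", "timestamp": 1, "decision": "permit"}
r2 = {"action_id": "a2", "timestamp": 2, "decision": "deny"}
r3 = {"action_id": "a3", "timestamp": 3, "decision": "defer"}
root = merkle_root([r1, r2, r3])
print(root)
```

Signing only the root lets a verifier check a whole batch with one signature, and any altered receipt changes the root.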
+
+---
+
+## 6. Testing & Validation Criteria
+
+### DDC-6.1: Unit Test Coverage
+
+| Module | Coverage Target | Critical Paths |
+|--------|-----------------|----------------|
+| conformal/ | ≥ 90% | Prediction set generation; shift adaptation |
+| eprocess/ | ≥ 95% | E-value validity; supermartingale property |
+| anytime_gate/ | ≥ 90% | Decision logic; receipt generation |
+
+### DDC-6.2: Integration Test Scenarios
+
+| Scenario | Expected Behavior |
+|----------|-------------------|
+| Normal operation | Permit rate > 90% |
+| Gradual shift | Coverage maintained; permit rate may decrease |
+| Abrupt shift | Temporary DEFER; recovery within 100 steps |
+| Adversarial probe | DENY rate increases; alerts generated |
+| Component failure | Graceful degradation; no unsafe permits |
+
+### DDC-6.3: Benchmark Requirements
+
+| Metric | Target | Measurement Method |
+|--------|--------|-------------------|
+| Gate latency p50 | < 10ms | Continuous profiling |
+| Gate latency p99 | < 50ms | Continuous profiling |
+| False deny rate | < 5% | Simulation with known-safe actions |
+| Missed unsafe rate | < 0.1% | Simulation with known-unsafe actions |
+| Coverage maintenance | ≥ 85% | Real distribution shift scenarios |
+
+---
+
+## 7. Implementation Phases
+
+### Phase 1: Foundation (v0.1)
+- [ ] E-value and e-process core implementation
+- [ ] Basic conformal prediction with ACI
+- [ ] Integration with existing `GateController`
+- [ ] Simple witness receipts
+
+### Phase 2: Adaptation (v0.2)
+- [ ] COP and retrospective adjustment
+- [ ] Mixture e-values for robustness
+- [ ] Graph model with conformal-based weights
+- [ ] Enhanced audit trail
+
+### Phase 3: Production (v1.0)
+- [ ] CORE RL-based adaptation
+- [ ] Learned graph construction
+- [ ] Cryptographic receipt signing
+- [ ] Full monitoring and alerting
+
+---
+
+## 8. Open Questions for Review
+
+1. **Graph Model Scope**: Should the action graph include only immediate actions or multi-step lookahead?
+
+2. **E-Process Null**: Is "action safety" the right null hypothesis, or should we test "policy consistency"?
+
+3. **Threshold Learning**: Should thresholds be fixed or learned via meta-optimization?
+
+4. **Human-in-Loop**: How should DEFER decisions be presented to human operators?
+
+5. **Adversarial Robustness**: How does AVCG perform against adaptive adversaries who observe gate decisions?
+
+---
+
+## 9. Sign-Off
+
+| Role | Name | Date | Signature |
+|------|------|------|-----------|
+| Architecture Lead | | | |
+| Security Lead | | | |
+| ML Lead | | | |
+| Engineering Lead | | | |
+
+---
+
+## Appendix A: Glossary
+
+| Term | Definition |
+|------|------------|
+| **E-value** | Nonnegative test statistic e with E[e] ≤ 1 under the null |
+| **E-process** | Nonnegative process built from e-values (e.g., their running product) that is a supermartingale under the null |
+| **Conformal Prediction** | Distribution-free method for building prediction sets with finite-sample coverage guarantees |
+| **Witness Partition** | Explicit (S, V\S) showing which vertices are separated |
+| **Anytime-Valid** | Guarantee holds at any stopping time |
+| **COP** | Conformal Optimistic Prediction |
+| **CORE** | Conformal Regression via Reinforcement Learning |
+| **ACI** | Adaptive Conformal Inference |
+
+## Appendix B: Key Equations
+
+### E-Value Validity
+```
+E_H₀[e] ≤ 1
+```
+
+### Anytime-Valid Type I Error
+```
+P_H₀(∃t: E_t ≥ 1/α) ≤ α
+```
+
+### Conformal Coverage
+```
+P(Y_{t+1} ∈ C_t(X_{t+1})) ≥ 1 - α
+```
+
+### E-Value Composition
+```
+e₁ · e₂ is a valid e-value if e₁, e₂ are independent under H₀ (or if e₂ is valid conditional on e₁)
+```