Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

This commit is contained in:
ruv
2026-02-28 14:39:40 -05:00
7854 changed files with 3522914 additions and 0 deletions

vendor/ruvector/docs/dag/00-INDEX.md vendored Normal file

@@ -0,0 +1,83 @@
# Neural Self-Learning DAG Implementation Plan
## Project Overview
This document set provides a complete implementation plan for integrating a Neural Self-Learning DAG system into RuVector-Postgres, with optional QuDAG distributed consensus integration.
## Document Index
| Document | Description | Priority |
|----------|-------------|----------|
| [01-ARCHITECTURE.md](./01-ARCHITECTURE.md) | System architecture and component overview | P0 |
| [02-DAG-ATTENTION-MECHANISMS.md](./02-DAG-ATTENTION-MECHANISMS.md) | 7 specialized DAG attention implementations | P0 |
| [03-SONA-INTEGRATION.md](./03-SONA-INTEGRATION.md) | Self-Optimizing Neural Architecture integration | P0 |
| [04-POSTGRES-INTEGRATION.md](./04-POSTGRES-INTEGRATION.md) | PostgreSQL extension integration details | P0 |
| [05-QUERY-PLAN-DAG.md](./05-QUERY-PLAN-DAG.md) | Query plan as learnable DAG structure | P1 |
| [06-MINCUT-OPTIMIZATION.md](./06-MINCUT-OPTIMIZATION.md) | Min-cut based bottleneck detection | P1 |
| [07-SELF-HEALING.md](./07-SELF-HEALING.md) | Self-healing and adaptive repair | P1 |
| [08-QUDAG-INTEGRATION.md](./08-QUDAG-INTEGRATION.md) | QuDAG distributed consensus integration | P2 |
| [09-SQL-API.md](./09-SQL-API.md) | Complete SQL API specification | P0 |
| [10-TESTING-STRATEGY.md](./10-TESTING-STRATEGY.md) | Testing approach and benchmarks | P1 |
| [11-AGENT-TASKS.md](./11-AGENT-TASKS.md) | 15-agent swarm task breakdown | P0 |
| [12-MILESTONES.md](./12-MILESTONES.md) | Implementation milestones and timeline | P0 |
## Quick Start for Agents
1. Read [01-ARCHITECTURE.md](./01-ARCHITECTURE.md) for system overview
2. Check [11-AGENT-TASKS.md](./11-AGENT-TASKS.md) for your assigned tasks
3. Follow task-specific documents as referenced
4. Coordinate via shared memory patterns in [03-SONA-INTEGRATION.md](./03-SONA-INTEGRATION.md)
## Project Goals
### Primary Goals
- Create self-learning query optimization for RuVector-Postgres
- Implement 7 DAG-centric attention mechanisms
- Integrate SONA two-tier learning system
- Provide adaptive cost estimation
- Enable bottleneck detection via min-cut analysis
### Secondary Goals
- QuDAG distributed consensus for federated learning
- Self-healing index maintenance
- HDC state compression for efficient sync
- Production-ready SQL API
## Success Metrics
| Metric | Target | Measurement |
|--------|--------|-------------|
| Query latency improvement | 30-50% | Benchmark suite |
| Pattern recall accuracy | >95% | Test coverage |
| Learning overhead | <5% | Per-query timing |
| Bottleneck detection | O(n^0.12) | Algorithmic analysis |
| Memory overhead | <100MB | Per-table measurement |
## Dependencies
### Required Crates (Internal)
- `ruvector-postgres` - PostgreSQL extension framework
- `ruvector-attention` - 39 attention mechanisms
- `ruvector-gnn` - Graph neural network layers
- `ruvector-graph` - Query execution DAG
- `ruvector-mincut` - Subpolynomial min-cut
- `ruvector-nervous-system` - BTSP, HDC, spiking networks
- `sona` - Self-Optimizing Neural Architecture
### Required Crates (External)
- `pgrx` - PostgreSQL Rust extension framework
- `dashmap` - Concurrent hashmap
- `parking_lot` - Fast synchronization primitives
- `ndarray` - N-dimensional arrays
- `rayon` - Parallel iterators
### Optional (QuDAG Integration)
- `qudag` - Quantum-resistant DAG consensus
- `ml-kem` - Post-quantum key encapsulation
- `ml-dsa` - Post-quantum signatures
## Version
- Plan Version: 1.0.0
- Target RuVector Version: 0.5.0
- Last Updated: 2025-12-29


@@ -0,0 +1,484 @@
# Neural Self-Learning DAG Architecture
## Overview
The Neural Self-Learning DAG system transforms RuVector-Postgres from a static query executor into an adaptive system that learns optimal configurations from query patterns.
## System Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ NEURAL DAG RUVECTOR-POSTGRES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ SQL INTERFACE LAYER │ │
│ │ ruvector_enable_neural_dag() | ruvector_dag_patterns() | ... │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────┴───────────────────────────────────┐ │
│ │ QUERY OPTIMIZER LAYER │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │ │
│ │ │ Pattern │ │ Attention │ │ Cost │ │ Plan │ │ │
│ │ │ Matcher │ │ Selector │ │ Estimator │ │ Rewriter │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────┴───────────────────────────────────┐ │
│ │ DAG ATTENTION LAYER │ │
│ │ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │ │
│ │ │Topological│ │ Causal │ │ Critical │ │ MinCut │ │ │
│ │ │ Attention │ │ Cone │ │ Path │ │ Gated │ │ │
│ │ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │ │
│ │ ┌───────────┐ ┌───────────┐ ┌───────────┐ │ │
│ │ │Hierarchic │ │ Parallel │ │ Temporal │ │ │
│ │ │ Lorentz │ │ Branch │ │ BTSP │ │ │
│ │ └───────────┘ └───────────┘ └───────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────┴───────────────────────────────────┐ │
│ │ SONA LEARNING LAYER │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ INSTANT LOOP (<100μs) BACKGROUND LOOP (hourly) │ │ │
│ │ │ ┌─────────────┐ ┌─────────────┐ │ │ │
│ │ │ │ MicroLoRA │ │ BaseLoRA │ │ │ │
│ │ │ │ (rank 1-2) │ │ (rank 8) │ │ │ │
│ │ │ └─────────────┘ └─────────────┘ │ │ │
│ │ │ ┌─────────────┐ ┌─────────────┐ │ │ │
│ │ │ │ Trajectory │ ──────────────► │ ReasoningBk │ │ │ │
│ │ │ │ Buffer │ │ (K-means) │ │ │ │
│ │ │ └─────────────┘ └─────────────┘ │ │ │
│ │ │ ┌─────────────┐ │ │ │
│ │ │ │ EWC++ │ │ │ │
│ │ │ │ (forgetting)│ │ │ │
│ │ │ └─────────────┘ │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────┴───────────────────────────────────┐ │
│ │ OPTIMIZATION LAYER │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │ │
│ │ │ MinCut │ │ HDC │ │ BTSP │ │ Self- │ │ │
│ │ │ Analysis │ │ State │ │ Memory │ │ Healing │ │ │
│ │ │ O(n^0.12) │ │ Compression │ │ One-Shot │ │ Engine │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────┴───────────────────────────────────┐ │
│ │ STORAGE LAYER │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │ │
│ │ │ Pattern │ │ Embedding │ │ Trajectory │ │ Index │ │ │
│ │ │ Store │ │ Cache │ │ History │ │ Metadata │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ OPTIONAL: QUDAG CONSENSUS LAYER │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │ │
│ │ │ Federated │ │ Pattern │ │ ML-DSA │ │ rUv │ │ │
│ │ │ Learning │ │ Consensus │ │ Signatures │ │ Tokens │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Component Descriptions
### 1. SQL Interface Layer
Provides PostgreSQL-native functions for interacting with the Neural DAG system.
**Key Components:**
- `ruvector_enable_neural_dag()` - Enable learning for a table
- `ruvector_dag_patterns()` - View learned patterns
- `ruvector_attention_*()` - DAG attention functions
- `ruvector_dag_learn()` - Trigger learning cycle
**Location:** `crates/ruvector-postgres/src/dag/operators.rs`
### 2. Query Optimizer Layer
Intercepts queries and applies learned optimizations.
**Key Components:**
- **Pattern Matcher**: Finds similar past query patterns via cosine similarity
- **Attention Selector**: UCB bandit for choosing optimal attention type
- **Cost Estimator**: Adaptive cost model with micro-LoRA updates
- **Plan Rewriter**: Applies learned operator ordering and parameters
**Location:** `crates/ruvector-postgres/src/dag/optimizer/`
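The Pattern Matcher's cosine-similarity lookup can be sketched as a brute-force top-k scan. This is only an illustration: `StoredPattern` and `top_k_patterns` are hypothetical names, and the real matcher may well use an index rather than a linear scan.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has (near) zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have equal length");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum::<f32>();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a < 1e-8 || norm_b < 1e-8 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical stored pattern: an id plus its embedding.
struct StoredPattern {
    id: u64,
    embedding: Vec<f32>,
}

/// Return up to `k` (pattern id, similarity) pairs, best match first.
fn top_k_patterns(query: &[f32], patterns: &[StoredPattern], k: usize) -> Vec<(u64, f32)> {
    let mut scored: Vec<(u64, f32)> = patterns
        .iter()
        .map(|p| (p.id, cosine_similarity(query, &p.embedding)))
        .collect();
    // Sort descending by similarity; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A caller would compare the best similarity against its confidence threshold before applying the stored configuration.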
### 3. DAG Attention Layer
Seven specialized attention mechanisms for DAG structures.
| Attention Type | Use Case | Complexity |
|----------------|----------|------------|
| Topological | Respect DAG ordering | O(n·k) |
| Causal Cone | Distance-weighted ancestors | O(n·d) |
| Critical Path | Focus on bottlenecks | O(n + critical_len) |
| MinCut Gated | Gate by criticality | O(n^0.12 + n·k) |
| Hierarchical Lorentz | Deep nesting | O(n·d) |
| Parallel Branch | Coordinate branches | O(n·b) |
| Temporal BTSP | Time-correlated patterns | O(n·w) |
**Location:** `crates/ruvector-postgres/src/dag/attention/`
### 4. SONA Learning Layer
Two-tier learning system for continuous optimization.
**Instant Loop (per-query):**
- MicroLoRA adaptation (rank 1-2)
- Trajectory recording
- <100μs overhead
**Background Loop (hourly):**
- K-means++ pattern extraction
- BaseLoRA updates (rank 8)
- EWC++ constraint application
**Location:** `crates/ruvector-postgres/src/dag/learning/`
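As a rough picture of what a rank 1-2 adapter costs per query, the sketch below applies the standard low-rank update `y = x + scale * B(Ax)`. The `LoraAdapter` type and its field names are assumptions for illustration; the actual `sona` MicroLoRA API is not shown in this document.

```rust
/// Low-rank adapter (hypothetical type).
/// `a` is rank x dim (down-projection), `b` is dim x rank (up-projection).
struct LoraAdapter {
    a: Vec<Vec<f32>>, // `rank` rows, each of length `dim`
    b: Vec<Vec<f32>>, // `dim` rows, each of length `rank`
    scale: f32,
}

impl LoraAdapter {
    /// Returns x + scale * B (A x). Cost is O(rank * dim), which is why
    /// rank 1-2 keeps the per-query path cheap.
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        // Down-project: h = A x (length = rank).
        let h: Vec<f32> = self
            .a
            .iter()
            .map(|row| row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>())
            .collect();
        // Up-project and add the residual input.
        x.iter()
            .zip(&self.b)
            .map(|(xi, row)| {
                let delta = row.iter().zip(&h).map(|(w, v)| w * v).sum::<f32>();
                xi + self.scale * delta
            })
            .collect()
    }
}
```

With `b` initialized to zeros the adapter is the identity, which is the usual LoRA starting point.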
### 5. Optimization Layer
Advanced optimization components.
**Key Components:**
- **MinCut Analysis**: O(n^0.12) bottleneck detection
- **HDC State**: 10K-bit hypervector compression
- **BTSP Memory**: One-shot pattern recall
- **Self-Healing**: Proactive index repair
**Location:** `crates/ruvector-postgres/src/dag/optimization/`
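To make the "10K-bit hypervector" figure concrete, here is a minimal sketch of a binary hypervector with XOR binding and Hamming similarity. The `Hypervector` type, the SplitMix64 seeding, and the method names are illustrative assumptions, not the `ruvector-nervous-system` API. Packed into `u64` words, 10,000 bits take 157 words, or 1,256 bytes.

```rust
const HDC_BITS: usize = 10_000;
const HDC_WORDS: usize = (HDC_BITS + 63) / 64; // 157 words = 1256 bytes

/// Binary hypervector packed into u64 words (hypothetical type).
#[derive(Clone, PartialEq, Debug)]
struct Hypervector {
    words: [u64; HDC_WORDS],
}

impl Hypervector {
    /// Deterministic pseudo-random hypervector from a seed (SplitMix64).
    fn from_seed(seed: u64) -> Self {
        let mut state = seed;
        let mut words = [0u64; HDC_WORDS];
        for w in words.iter_mut() {
            state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
            let mut z = state;
            z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
            *w = z ^ (z >> 31);
        }
        // Clear the unused high bits of the last word so they never count as differences.
        let used = HDC_BITS % 64;
        if used != 0 {
            words[HDC_WORDS - 1] &= (1u64 << used) - 1;
        }
        Hypervector { words }
    }

    /// Binding: element-wise XOR. XOR is its own inverse, so binding twice
    /// with the same vector recovers the original.
    fn bind(&self, other: &Self) -> Self {
        let mut words = [0u64; HDC_WORDS];
        for (out, (a, b)) in words.iter_mut().zip(self.words.iter().zip(&other.words)) {
            *out = a ^ b;
        }
        Hypervector { words }
    }

    /// Number of differing bits.
    fn hamming(&self, other: &Self) -> u32 {
        self.words
            .iter()
            .zip(&other.words)
            .map(|(a, b)| (a ^ b).count_ones())
            .sum()
    }

    /// 1.0 for identical vectors, about 0.5 for unrelated random vectors.
    fn similarity(&self, other: &Self) -> f32 {
        1.0 - self.hamming(other) as f32 / HDC_BITS as f32
    }
}
```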
### 6. Storage Layer
Persistent storage for learned patterns and state.
**Key Components:**
- **Pattern Store**: DashMap + PostgreSQL tables
- **Embedding Cache**: LRU cache for hot embeddings
- **Trajectory History**: Ring buffer for recent queries
- **Index Metadata**: Pattern-to-index mappings
**Location:** `crates/ruvector-postgres/src/dag/storage/`
### 7. QuDAG Consensus Layer (Optional)
Distributed learning via quantum-resistant consensus.
**Key Components:**
- **Federated Learning**: Privacy-preserving pattern sharing
- **Pattern Consensus**: QR-Avalanche for pattern validation
- **ML-DSA Signatures**: Quantum-resistant pattern signing
- **rUv Tokens**: Incentivize learning contributions
**Location:** `crates/ruvector-postgres/src/dag/qudag/`
## Data Flow
### Query Execution Flow
```
SQL Query
┌─────────────────────────────────────┐
│ 1. Pattern Matching │
│ - Embed query plan │
│ - Find similar patterns in │
│ ReasoningBank (cosine sim) │
│ - Return top-k matches │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 2. Optimization Decision │
│ - If pattern found (conf > 0.8): │
│ Apply learned configuration │
│ - Else: │
│ Use defaults + micro-LoRA │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 3. Attention Selection │
│ - UCB bandit selects attention │
│ - Based on query pattern type │
│ - Exploration vs exploitation │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 4. Plan Execution │
│ - Execute with optimized params │
│ - Record operator timings │
│ - Track intermediate results │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 5. Trajectory Recording │
│ - Store query embedding │
│ - Store operator activations │
│ - Store outcome metrics │
│ - Compute quality score │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 6. Instant Learning │
│ - MicroLoRA gradient accumulate │
│ - Auto-flush at 100 queries │
│ - Update attention selector │
└─────────────────────────────────────┘
```
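Step 3 of the flow above relies on a UCB bandit. A minimal UCB1 sketch, assuming one arm per attention type and a scalar reward per query, might look like the following; `UcbSelector` is a hypothetical name and the real `attention_selector.rs` may differ. With `c = 1.414` (about the square root of 2) the score matches the classic UCB1 bound.

```rust
/// UCB1 bandit over a fixed number of arms (hypothetical type).
struct UcbSelector {
    counts: Vec<u64>,
    reward_sums: Vec<f64>,
    c: f64, // exploration constant
}

impl UcbSelector {
    fn new(num_arms: usize, c: f64) -> Self {
        UcbSelector {
            counts: vec![0; num_arms],
            reward_sums: vec![0.0; num_arms],
            c,
        }
    }

    /// Pick the arm with the highest upper confidence bound.
    fn select(&self) -> usize {
        // Try every arm once before trusting any estimate.
        if let Some(untried) = self.counts.iter().position(|&n| n == 0) {
            return untried;
        }
        let total: u64 = self.counts.iter().sum();
        let ln_total = (total as f64).ln();
        let mut best = 0;
        let mut best_score = f64::NEG_INFINITY;
        for (arm, (&n, &sum)) in self.counts.iter().zip(&self.reward_sums).enumerate() {
            let n = n as f64;
            // Mean reward plus an exploration bonus that shrinks as the arm is pulled.
            let score = sum / n + self.c * (ln_total / n).sqrt();
            if score > best_score {
                best_score = score;
                best = arm;
            }
        }
        best
    }

    /// Record the observed reward for an arm.
    fn update(&mut self, arm: usize, reward: f64) {
        self.counts[arm] += 1;
        self.reward_sums[arm] += reward;
    }
}
```

The reward would typically be derived from the trajectory's quality score, so a better-performing attention type is selected more often over time.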
### Learning Cycle Flow
```
Hourly Trigger
┌─────────────────────────────────────┐
│ 1. Drain Trajectory Buffer │
│ - Collect 1000+ trajectories │
│ - Filter by quality threshold │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 2. K-means++ Clustering │
│ - 100 clusters │
│ - Deterministic initialization │
│ - Max 100 iterations │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 3. Pattern Extraction │
│ - Compute cluster centroids │
│ - Extract optimal parameters │
│ - Calculate confidence scores │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 4. EWC++ Constraint Check │
│ - Compute Fisher information │
│ - Apply forgetting prevention │
│ - Detect task boundaries │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 5. BaseLoRA Update │
│ - Apply constrained gradients │
│ - Update all layers │
│ - Merge weights if needed │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ 6. ReasoningBank Update │
│ - Store new patterns │
│ - Consolidate similar patterns │
│ - Evict low-confidence patterns │
└─────────────────────────────────────┘
```
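Step 4 can be illustrated with the standard (online) EWC formulation: a quadratic penalty `lambda/2 * sum_i F_i * (theta_i - anchor_i)^2` and an exponentially decayed diagonal Fisher estimate. This sketch is an assumption about the general technique; the specifics of SONA's EWC++ (such as task-boundary detection) are not covered, and `EwcState` is a hypothetical type.

```rust
/// Diagonal EWC state (hypothetical type).
struct EwcState {
    fisher: Vec<f32>, // running diagonal Fisher information estimate
    anchor: Vec<f32>, // parameters to stay close to
    lambda: f32,      // penalty strength
    decay: f32,       // Fisher decay factor
}

impl EwcState {
    /// F <- decay * F + (1 - decay) * g^2
    fn update_fisher(&mut self, grad: &[f32]) {
        for (f, g) in self.fisher.iter_mut().zip(grad) {
            *f = self.decay * *f + (1.0 - self.decay) * g * g;
        }
    }

    /// lambda/2 * sum_i F_i * (theta_i - anchor_i)^2
    fn penalty(&self, params: &[f32]) -> f32 {
        let sum: f32 = params
            .iter()
            .zip(&self.anchor)
            .zip(&self.fisher)
            .map(|((p, a), f)| f * (p - a) * (p - a))
            .sum::<f32>();
        0.5 * self.lambda * sum
    }

    /// Task gradient plus the penalty gradient lambda * F_i * (theta_i - anchor_i).
    fn constrain(&self, params: &[f32], grad: &[f32]) -> Vec<f32> {
        grad.iter()
            .zip(params)
            .zip(self.anchor.iter().zip(&self.fisher))
            .map(|((g, p), (a, f))| g + self.lambda * f * (p - a))
            .collect()
    }
}
```

Parameters with high Fisher values are pulled back toward the anchor more strongly, which is what protects previously learned patterns.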
## Module Dependencies
```
ruvector-postgres/src/dag/
├── mod.rs # Module root, re-exports
├── operators.rs # SQL function definitions
├── attention/
│ ├── mod.rs # Attention trait and registry
│ ├── topological.rs # TopologicalAttention
│ ├── causal_cone.rs # CausalConeAttention
│ ├── critical_path.rs # CriticalPathAttention
│ ├── mincut_gated.rs # MinCutGatedAttention
│ ├── hierarchical.rs # HierarchicalLorentzAttention
│ ├── parallel_branch.rs # ParallelBranchAttention
│ ├── temporal_btsp.rs # TemporalBTSPAttention
│ └── ensemble.rs # EnsembleAttention
├── learning/
│ ├── mod.rs # Learning coordinator
│ ├── sona_engine.rs # SONA integration wrapper
│ ├── trajectory.rs # Trajectory buffer
│ ├── patterns.rs # Pattern extraction
│ ├── reasoning_bank.rs # Pattern storage
│ ├── ewc.rs # EWC++ integration
│ └── attention_selector.rs # UCB bandit selector
├── optimizer/
│ ├── mod.rs # Optimizer coordinator
│ ├── pattern_matcher.rs # Pattern matching
│ ├── cost_estimator.rs # Adaptive costs
│ └── plan_rewriter.rs # Plan transformation
├── optimization/
│ ├── mod.rs # Optimization utilities
│ ├── mincut.rs # Min-cut integration
│ ├── hdc_state.rs # HDC compression
│ ├── btsp_memory.rs # BTSP one-shot
│ └── self_healing.rs # Self-healing engine
├── storage/
│ ├── mod.rs # Storage coordinator
│ ├── pattern_store.rs # Pattern persistence
│ ├── embedding_cache.rs # Embedding LRU
│ └── trajectory_store.rs # Trajectory history
├── qudag/
│ ├── mod.rs # QuDAG integration
│ ├── federated.rs # Federated learning
│ ├── consensus.rs # Pattern consensus
│ ├── signatures.rs # ML-DSA signing
│ └── tokens.rs # rUv token interface
└── types/
    ├── mod.rs # Type definitions
    ├── neural_plan.rs # NeuralDagPlan
    ├── trajectory.rs # DagTrajectory
    ├── pattern.rs # LearnedDagPattern
    └── metrics.rs # ExecutionMetrics
```
## Configuration
### Default Configuration
```rust
pub struct NeuralDagConfig {
// Learning
pub learning_enabled: bool, // true
pub max_trajectories: usize, // 10000
pub pattern_clusters: usize, // 100
pub quality_threshold: f32, // 0.3
pub background_interval_ms: u64, // 3600000 (1 hour)
// Attention
pub default_attention: DagAttentionType, // Topological
pub attention_exploration: f32, // 0.1
pub ucb_exploration_c: f32, // 1.414
// SONA
pub micro_lora_rank: usize, // 2
pub micro_lora_lr: f32, // 0.002
pub base_lora_rank: usize, // 8
pub base_lora_lr: f32, // 0.001
// EWC++
pub ewc_lambda: f32, // 2000.0
pub ewc_max_lambda: f32, // 15000.0
pub ewc_fisher_decay: f32, // 0.999
// MinCut
pub mincut_enabled: bool, // true
pub mincut_threshold: f32, // 0.5
// HDC
pub hdc_dimensions: usize, // 10000
// Self-Healing
pub healing_enabled: bool, // true
pub healing_check_interval_ms: u64, // 300000 (5 min)
}
```
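The defaults listed in the comments above could be captured in a `Default` implementation, sketched here. The struct is repeated so the example compiles on its own, and the `DagAttentionType` variants are assumed from the attention table rather than taken from the crate.

```rust
/// Stand-in for the attention type enum; variant names are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DagAttentionType {
    Topological,
    CausalCone,
    CriticalPath,
    MinCutGated,
    HierarchicalLorentz,
    ParallelBranch,
    TemporalBtsp,
}

#[derive(Clone, Debug)]
pub struct NeuralDagConfig {
    pub learning_enabled: bool,
    pub max_trajectories: usize,
    pub pattern_clusters: usize,
    pub quality_threshold: f32,
    pub background_interval_ms: u64,
    pub default_attention: DagAttentionType,
    pub attention_exploration: f32,
    pub ucb_exploration_c: f32,
    pub micro_lora_rank: usize,
    pub micro_lora_lr: f32,
    pub base_lora_rank: usize,
    pub base_lora_lr: f32,
    pub ewc_lambda: f32,
    pub ewc_max_lambda: f32,
    pub ewc_fisher_decay: f32,
    pub mincut_enabled: bool,
    pub mincut_threshold: f32,
    pub hdc_dimensions: usize,
    pub healing_enabled: bool,
    pub healing_check_interval_ms: u64,
}

impl Default for NeuralDagConfig {
    fn default() -> Self {
        NeuralDagConfig {
            learning_enabled: true,
            max_trajectories: 10_000,
            pattern_clusters: 100,
            quality_threshold: 0.3,
            background_interval_ms: 3_600_000, // 1 hour
            default_attention: DagAttentionType::Topological,
            attention_exploration: 0.1,
            ucb_exploration_c: 1.414,
            micro_lora_rank: 2,
            micro_lora_lr: 0.002,
            base_lora_rank: 8,
            base_lora_lr: 0.001,
            ewc_lambda: 2000.0,
            ewc_max_lambda: 15_000.0,
            ewc_fisher_decay: 0.999,
            mincut_enabled: true,
            mincut_threshold: 0.5,
            hdc_dimensions: 10_000,
            healing_enabled: true,
            healing_check_interval_ms: 300_000, // 5 minutes
        }
    }
}
```

Struct update syntax then allows targeted overrides, for example `NeuralDagConfig { mincut_enabled: false, ..Default::default() }`.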
### PostgreSQL GUC Variables
```sql
-- Enable/disable neural DAG
SET ruvector.neural_dag_enabled = true;
-- Learning parameters
SET ruvector.dag_learning_rate = 0.002;
SET ruvector.dag_pattern_clusters = 100;
SET ruvector.dag_quality_threshold = 0.3;
-- Attention parameters
SET ruvector.dag_attention_type = 'auto';
SET ruvector.dag_attention_exploration = 0.1;
-- EWC parameters
SET ruvector.dag_ewc_lambda = 2000.0;
-- MinCut parameters
SET ruvector.dag_mincut_enabled = true;
SET ruvector.dag_mincut_threshold = 0.5;
```
## Performance Targets
| Operation | Target Latency | Notes |
|-----------|----------------|-------|
| Pattern matching | <1ms | Top-5 similar patterns |
| Attention computation | <500μs | Per operator |
| MicroLoRA forward | <100μs | Per query |
| Trajectory recording | <50μs | Non-blocking |
| Background learning | <5s | 1000 trajectories |
| MinCut analysis | <10ms | O(n^0.12) |
| HDC encoding | <100μs | 10K dimensions |
## Memory Budget
| Component | Budget | Notes |
|-----------|--------|-------|
| Pattern Store | 50MB | ~1000 patterns per table |
| Embedding Cache | 20MB | LRU for hot embeddings |
| Trajectory Buffer | 20MB | 10K trajectories |
| MicroLoRA | 10KB | Per active query |
| BaseLoRA | 400KB | Per table |
| HDC State | 1.2KB | Per state snapshot |
**Total per table:** ~100MB maximum
## Thread Safety
All components use thread-safe primitives:
- `DashMap` for concurrent pattern storage
- `parking_lot::RwLock` for embedding cache
- `crossbeam::queue::ArrayQueue` for trajectory buffer
- `AtomicU64` for counters and statistics
- PostgreSQL background workers for learning cycles
## Error Handling
```rust
pub enum NeuralDagError {
// Configuration errors
InvalidConfig(String),
TableNotEnabled(String),
// Learning errors
InsufficientTrajectories,
PatternExtractionFailed,
EwcConstraintViolation,
// Attention errors
AttentionComputationFailed,
InvalidDagStructure,
// Storage errors
PatternStoreFull,
EmbeddingCacheMiss,
// MinCut errors
MinCutComputationFailed,
GraphDisconnected,
// QuDAG errors (optional)
ConsensusTimeout,
SignatureVerificationFailed,
}
```
All errors are logged and non-fatal - the system falls back to default behavior on error.
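The log-and-fall-back policy can be sketched as a small helper. The error enum here is a two-variant subset, and `with_fallback` together with the `Vec<String>` log sink are hypothetical stand-ins for whatever logging the extension actually uses.

```rust
/// Two-variant subset of the error enum, for illustration only.
#[derive(Debug)]
enum NeuralDagError {
    PatternExtractionFailed,
    AttentionComputationFailed,
}

/// Return the computed value, or log the error and return the default.
fn with_fallback<T>(
    result: Result<T, NeuralDagError>,
    default: T,
    log: &mut Vec<String>,
) -> T {
    match result {
        Ok(value) => value,
        Err(err) => {
            log.push(format!("neural DAG error, using default: {:?}", err));
            default
        }
    }
}
```

Wrapping each learned decision this way keeps a failure in the learning path from ever failing the query itself.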

File diff suppressed because it is too large Load Diff



@@ -0,0 +1,914 @@
# Query Plan as Learnable DAG
## Overview
This document specifies how PostgreSQL query plans are represented as DAGs (Directed Acyclic Graphs) and how they become targets for neural learning.
## Query Plan DAG Structure
### Conceptual Model
```
                 ┌─────────────┐
                 │   RESULT    │  (Root)
                 └──────┬──────┘
                 ┌──────┴──────┐
                 │    SORT     │
                 └──────┬──────┘
           ┌────────────┴────────────┐
           │                         │
    ┌──────┴──────┐           ┌──────┴──────┐
    │   FILTER    │           │   FILTER    │
    └──────┬──────┘           └──────┬──────┘
           │                         │
    ┌──────┴──────┐           ┌──────┴──────┐
    │  HNSW SCAN  │           │  SEQ SCAN   │
    │ (documents) │           │  (authors)  │
    └─────────────┘           └─────────────┘
      Leaf Nodes                Leaf Nodes
```
### NeuralDagPlan Structure
```rust
use std::collections::HashMap;

/// Query plan enhanced with neural learning capabilities
#[derive(Clone, Debug)]
pub struct NeuralDagPlan {
// ═══════════════════════════════════════════════════════════════
// BASE PLAN STRUCTURE
// ═══════════════════════════════════════════════════════════════
/// Plan ID (unique per execution)
pub plan_id: u64,
/// Root operator
pub root: OperatorNode,
/// All operators in topological order (leaves first)
pub operators: Vec<OperatorNode>,
/// Edges: parent_id -> Vec<child_id>
pub edges: HashMap<OperatorId, Vec<OperatorId>>,
/// Reverse edges: child_id -> parent_id
pub reverse_edges: HashMap<OperatorId, OperatorId>,
/// Pipeline breakers (blocking operators)
pub pipeline_breakers: Vec<OperatorId>,
/// Parallelism configuration
pub parallelism: usize,
// ═══════════════════════════════════════════════════════════════
// NEURAL ENHANCEMENTS
// ═══════════════════════════════════════════════════════════════
/// Operator embeddings (256-dim per operator)
pub operator_embeddings: Vec<Vec<f32>>,
/// Plan embedding (computed from operators)
pub plan_embedding: Option<Vec<f32>>,
/// Attention weights between operators
pub attention_weights: Vec<Vec<f32>>,
/// Selected attention type
pub attention_type: DagAttentionType,
/// Trajectory ID (links to ReasoningBank)
pub trajectory_id: Option<u64>,
// ═══════════════════════════════════════════════════════════════
// LEARNED PARAMETERS
// ═══════════════════════════════════════════════════════════════
/// Learned cost estimates per operator
pub learned_costs: Option<Vec<f32>>,
/// Execution parameters
pub params: ExecutionParams,
/// Pattern match info (if pattern was applied)
pub pattern_match: Option<PatternMatch>,
// ═══════════════════════════════════════════════════════════════
// OPTIMIZATION STATE
// ═══════════════════════════════════════════════════════════════
/// MinCut criticality per operator
pub criticalities: Option<Vec<f32>>,
/// Critical path operators
pub critical_path: Option<Vec<OperatorId>>,
/// Bottleneck score (0.0 - 1.0)
pub bottleneck_score: Option<f32>,
}
/// Single operator in the plan DAG
#[derive(Clone, Debug)]
pub struct OperatorNode {
/// Unique operator ID
pub id: OperatorId,
/// Operator type
pub op_type: OperatorType,
/// Target table (if applicable)
pub table_name: Option<String>,
/// Index used (if applicable)
pub index_name: Option<String>,
/// Filter predicate (if applicable)
pub filter: Option<FilterExpr>,
/// Join condition (if join)
pub join_condition: Option<JoinCondition>,
/// Projected columns
pub projection: Vec<String>,
/// Estimated rows
pub estimated_rows: f64,
/// Estimated cost
pub estimated_cost: f64,
/// Operator embedding (learned)
pub embedding: Vec<f32>,
/// Depth in DAG (0 = root; each child is one level deeper than its parent)
pub depth: usize,
/// Is this on critical path?
pub is_critical: bool,
/// MinCut criticality score
pub criticality: f32,
}
/// Operator types
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum OperatorType {
// Scan operators (leaves)
SeqScan,
IndexScan,
IndexOnlyScan,
HnswScan,
IvfFlatScan,
BitmapScan,
// Join operators
NestedLoop,
HashJoin,
MergeJoin,
// Aggregation operators
Aggregate,
GroupAggregate,
HashAggregate,
// Sort operators
Sort,
IncrementalSort,
// Filter operators
Filter,
Result,
// Set operators
Append,
MergeAppend,
Union,
Intersect,
Except,
// Subquery operators
SubqueryScan,
CteScan,
MaterializeNode,
// Utility
Limit,
Unique,
WindowAgg,
// Parallel
Gather,
GatherMerge,
}
/// Pattern match information
#[derive(Clone, Debug)]
pub struct PatternMatch {
pub pattern_id: PatternId,
pub confidence: f32,
pub similarity: f32,
pub applied_params: ExecutionParams,
}
```
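The `operators` field is documented as being in topological order with leaves first. A small check of that invariant against the `edges` map could look like this sketch, where `OperatorId` is assumed to be a plain integer alias and `is_leaves_first` is a hypothetical helper.

```rust
use std::collections::{HashMap, HashSet};

type OperatorId = u64; // assumed alias; the real type is not shown here

/// True if every child listed in `edges` appears before its parent in `order`.
fn is_leaves_first(
    order: &[OperatorId],
    edges: &HashMap<OperatorId, Vec<OperatorId>>,
) -> bool {
    let mut seen: HashSet<OperatorId> = HashSet::new();
    for id in order {
        if let Some(children) = edges.get(id) {
            // A parent may only appear once all of its children have been seen.
            if !children.iter().all(|child| seen.contains(child)) {
                return false;
            }
        }
        seen.insert(*id);
    }
    true
}
```

A check like this is cheap enough to run as a debug assertion after plan conversion.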
### Operator Embedding
```rust
impl OperatorNode {
    /// Generate embedding for this operator.
    ///
    /// Layout: 13 type slots of 16 features fill [0, 208); scalar features
    /// start at 208 so they cannot collide with the type block of operators
    /// whose type_index() is 8 or higher.
    pub fn generate_embedding(&mut self, config: &EmbeddingConfig) {
        const TYPE_WIDTH: usize = 16;
        const TYPE_SLOTS: usize = 13; // type_index() returns 0..=12
        const SCALAR_BASE: usize = TYPE_WIDTH * TYPE_SLOTS; // 208
        const MIN_DIM: usize = SCALAR_BASE + 24; // highest index written is 231
        let dim = config.hidden_dim;
        assert!(dim >= MIN_DIM, "hidden_dim must be at least {}", MIN_DIM);
        let mut embedding = vec![0.0f32; dim];
        // 1. Operator type encoding (one-hot style, but dense)
        let type_offset = self.op_type.type_index() * TYPE_WIDTH;
        embedding[type_offset..type_offset + TYPE_WIDTH]
            .copy_from_slice(&self.op_type.type_features());
        // 2. Cardinality encoding (log scale)
        let log_rows = (self.estimated_rows + 1.0).ln();
        embedding[SCALAR_BASE] = (log_rows / 20.0) as f32; // Normalize
        // 3. Cost encoding (log scale)
        let log_cost = (self.estimated_cost + 1.0).ln();
        embedding[SCALAR_BASE + 1] = (log_cost / 30.0) as f32; // Normalize
        // 4. Depth encoding
        embedding[SCALAR_BASE + 2] = self.depth as f32 / 20.0;
        // 5. Table/index encoding (if applicable): 16 nibbles of the table-name hash
        if let Some(ref table) = self.table_name {
            let table_hash = hash_string(table);
            let table_offset = SCALAR_BASE + 4;
            for i in 0..16 {
                embedding[table_offset + i] = ((table_hash >> (i * 4)) & 0xF) as f32 / 16.0;
            }
        }
        // 6. Filter complexity encoding
        if let Some(ref filter) = self.filter {
            let filter_offset = SCALAR_BASE + 20;
            embedding[filter_offset] = filter.complexity() as f32 / 10.0;
            embedding[filter_offset + 1] = filter.selectivity_estimate();
        }
        // 7. Join encoding
        if let Some(ref join) = self.join_condition {
            let join_offset = SCALAR_BASE + 22;
            embedding[join_offset] = join.join_type.type_index() as f32 / 4.0;
            embedding[join_offset + 1] = join.estimated_selectivity;
        }
        // L2 normalize
        let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 1e-8 {
            for x in &mut embedding {
                *x /= norm;
            }
        }
        self.embedding = embedding;
    }
}
impl OperatorType {
/// Get feature vector for operator type
fn type_features(&self) -> [f32; 16] {
match self {
// Scans - low cost per row
OperatorType::SeqScan => [1.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
OperatorType::IndexScan => [0.8, 0.2, 0.0, 0.0, 0.1, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
OperatorType::HnswScan => [0.6, 0.4, 0.0, 0.0, 0.05, 0.8, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
OperatorType::IvfFlatScan => [0.7, 0.3, 0.0, 0.0, 0.08, 0.7, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
// Joins - high cost
OperatorType::NestedLoop => [0.0, 0.0, 1.0, 0.0, 0.9, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
OperatorType::HashJoin => [0.0, 0.0, 0.8, 0.2, 0.5, 0.0, 0.0, 0.6, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
OperatorType::MergeJoin => [0.0, 0.0, 0.6, 0.4, 0.4, 0.0, 0.0, 0.4, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
// Aggregation - blocking
OperatorType::Aggregate => [0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0],
OperatorType::HashAggregate => [0.0, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0, 0.5, 0.8, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0],
// Sort - blocking
OperatorType::Sort => [0.0, 0.0, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.7, 0.0, 0.0, 0.0, 0.0],
// Default
_ => [0.5; 16],
}
}
fn type_index(&self) -> usize {
match self {
OperatorType::SeqScan => 0,
OperatorType::IndexScan => 1,
OperatorType::IndexOnlyScan => 1,
OperatorType::HnswScan => 2,
OperatorType::IvfFlatScan => 3,
OperatorType::BitmapScan => 4,
OperatorType::NestedLoop => 5,
OperatorType::HashJoin => 6,
OperatorType::MergeJoin => 7,
OperatorType::Aggregate | OperatorType::GroupAggregate | OperatorType::HashAggregate => 8,
OperatorType::Sort | OperatorType::IncrementalSort => 9,
OperatorType::Filter => 10,
OperatorType::Limit => 11,
_ => 12,
}
}
}
```
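The scalar features above use a log-scale encoding followed by L2 normalization. The two helpers below isolate those steps as a sketch; the function names are illustrative. With a divisor of 20, row counts up to roughly e^20 (about 4.85e8) map into [0, 1]; one million rows, for example, encode to about 0.69.

```rust
/// Log-scale feature: ln(value + 1) / scale.
fn log_feature(value: f64, scale: f64) -> f32 {
    ((value + 1.0).ln() / scale) as f32
}

/// Scale a vector to unit L2 norm in place; near-zero vectors are left untouched.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 1e-8 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}
```

Normalizing last means cosine similarity between two operator embeddings reduces to a plain dot product.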
## Plan Conversion from PostgreSQL
### PlannedStmt to NeuralDagPlan
```rust
impl NeuralDagPlan {
    /// Convert PostgreSQL PlannedStmt to NeuralDagPlan
    pub unsafe fn from_planned_stmt(stmt: *mut pg_sys::PlannedStmt) -> Self {
        let mut plan = NeuralDagPlan::new();
        // Extract plan tree
        let plan_tree = (*stmt).planTree;
        // convert_plan_node pushes every descendant but not the node it returns,
        // so the root is appended here; it lands last, keeping leaves-first order.
        let root = Self::convert_plan_node(plan_tree, &mut plan, 0);
        plan.operators.push(root.clone());
        plan.root = root;
        // Compute topological order
        plan.compute_topological_order();
        // Generate embeddings
        plan.generate_embeddings();
        // Identify pipeline breakers
        plan.identify_pipeline_breakers();
        plan
    }
/// Recursively convert plan nodes
unsafe fn convert_plan_node(
node: *mut pg_sys::Plan,
plan: &mut NeuralDagPlan,
depth: usize,
) -> OperatorNode {
if node.is_null() {
panic!("Null plan node");
}
let node_type = (*node).type_;
let estimated_rows = (*node).plan_rows;
let estimated_cost = (*node).total_cost;
let op_type = Self::pg_node_to_op_type(node_type, node);
let op_id = plan.next_operator_id();
let mut operator = OperatorNode {
id: op_id,
op_type,
table_name: Self::extract_table_name(node),
index_name: Self::extract_index_name(node),
filter: Self::extract_filter(node),
join_condition: Self::extract_join_condition(node),
projection: Self::extract_projection(node),
estimated_rows,
estimated_cost,
embedding: vec![],
depth,
is_critical: false,
criticality: 0.0,
};
// Process children
let left_plan = (*node).lefttree;
let right_plan = (*node).righttree;
let mut child_ids = Vec::new();
if !left_plan.is_null() {
let left_op = Self::convert_plan_node(left_plan, plan, depth + 1);
child_ids.push(left_op.id);
plan.reverse_edges.insert(left_op.id, op_id);
plan.operators.push(left_op);
}
if !right_plan.is_null() {
let right_op = Self::convert_plan_node(right_plan, plan, depth + 1);
child_ids.push(right_op.id);
plan.reverse_edges.insert(right_op.id, op_id);
plan.operators.push(right_op);
}
if !child_ids.is_empty() {
plan.edges.insert(op_id, child_ids);
}
operator
}
/// Map PostgreSQL node type to OperatorType
unsafe fn pg_node_to_op_type(node_type: pg_sys::NodeTag, node: *mut pg_sys::Plan) -> OperatorType {
match node_type {
pg_sys::NodeTag::T_SeqScan => OperatorType::SeqScan,
pg_sys::NodeTag::T_IndexScan => {
// Check if it's HNSW or IVFFlat
let index_scan = node as *mut pg_sys::IndexScan;
let index_oid = (*index_scan).indexid;
if Self::is_hnsw_index(index_oid) {
OperatorType::HnswScan
} else if Self::is_ivfflat_index(index_oid) {
OperatorType::IvfFlatScan
} else {
OperatorType::IndexScan
}
}
pg_sys::NodeTag::T_IndexOnlyScan => OperatorType::IndexOnlyScan,
pg_sys::NodeTag::T_BitmapHeapScan => OperatorType::BitmapScan,
pg_sys::NodeTag::T_NestLoop => OperatorType::NestedLoop,
pg_sys::NodeTag::T_HashJoin => OperatorType::HashJoin,
pg_sys::NodeTag::T_MergeJoin => OperatorType::MergeJoin,
pg_sys::NodeTag::T_Agg => {
let agg = node as *mut pg_sys::Agg;
match (*agg).aggstrategy {
pg_sys::AggStrategy::AGG_HASHED => OperatorType::HashAggregate,
pg_sys::AggStrategy::AGG_SORTED => OperatorType::GroupAggregate,
_ => OperatorType::Aggregate,
}
}
pg_sys::NodeTag::T_Sort => OperatorType::Sort,
pg_sys::NodeTag::T_IncrementalSort => OperatorType::IncrementalSort,
pg_sys::NodeTag::T_Limit => OperatorType::Limit,
pg_sys::NodeTag::T_Unique => OperatorType::Unique,
pg_sys::NodeTag::T_Append => OperatorType::Append,
pg_sys::NodeTag::T_MergeAppend => OperatorType::MergeAppend,
pg_sys::NodeTag::T_Gather => OperatorType::Gather,
pg_sys::NodeTag::T_GatherMerge => OperatorType::GatherMerge,
pg_sys::NodeTag::T_WindowAgg => OperatorType::WindowAgg,
pg_sys::NodeTag::T_SubqueryScan => OperatorType::SubqueryScan,
pg_sys::NodeTag::T_CteScan => OperatorType::CteScan,
pg_sys::NodeTag::T_Material => OperatorType::MaterializeNode,
pg_sys::NodeTag::T_Result => OperatorType::Result,
_ => OperatorType::Filter, // Default
}
}
}
```
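The recursive conversion is a post-order walk: ids are assigned on the way down, operators are recorded on the way back up. The sketch below reproduces that shape on a plain Rust tree so it can run without PostgreSQL; `PlanNode` and `FlatPlan` are hypothetical stand-ins for `pg_sys::Plan` and `NeuralDagPlan`.

```rust
use std::collections::HashMap;

type OperatorId = usize;

/// Stand-in for a PostgreSQL plan node with left/right children.
struct PlanNode {
    name: &'static str,
    left: Option<Box<PlanNode>>,
    right: Option<Box<PlanNode>>,
}

/// Flattened plan: operators in leaves-first order plus both edge maps.
#[derive(Default)]
struct FlatPlan {
    operators: Vec<(OperatorId, &'static str, usize)>, // (id, name, depth)
    edges: HashMap<OperatorId, Vec<OperatorId>>,       // parent -> children
    reverse_edges: HashMap<OperatorId, OperatorId>,    // child -> parent
    next_id: OperatorId,
}

impl FlatPlan {
    /// Add `node` and its subtree; returns the id assigned to `node`.
    fn add(&mut self, node: &PlanNode, depth: usize) -> OperatorId {
        // The id is taken before recursing, so the root gets id 0.
        let id = self.next_id;
        self.next_id += 1;
        let mut children = Vec::new();
        for child in [node.left.as_deref(), node.right.as_deref()] {
            if let Some(child) = child {
                let child_id = self.add(child, depth + 1);
                self.reverse_edges.insert(child_id, id);
                children.push(child_id);
            }
        }
        if !children.is_empty() {
            self.edges.insert(id, children);
        }
        // Recorded after the children, which yields leaves-first order.
        self.operators.push((id, node.name, depth));
        id
    }
}
```

Note that the node itself is pushed inside `add`, so the root ends up as the last entry of `operators`.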
## Plan Embedding Computation
### Hierarchical Aggregation
```rust
impl NeuralDagPlan {
    /// Generate plan-level embedding from operator embeddings
    pub fn generate_plan_embedding(&mut self) {
        // Nothing to aggregate for an empty plan
        if self.operator_embeddings.is_empty() {
            self.plan_embedding = None;
            return;
        }
        let dim = self.operator_embeddings[0].len();
        let mut plan_embedding = vec![0.0f32; dim];
        let total_cost = self.total_cost();
        // Weighted sum: operators nearer the root (smaller depth) and
        // operators with a larger share of the cost receive more weight
        for (op, op_embedding) in self.operators.iter().zip(&self.operator_embeddings) {
            let depth_weight = 1.0 / (op.depth as f32 + 1.0);
            // Guard against a zero total cost, which would produce NaN weights
            let cost_weight = if total_cost > 0.0 {
                (op.estimated_cost / total_cost).min(1.0) as f32
            } else {
                0.0
            };
            let weight = depth_weight * 0.5 + cost_weight * 0.5;
            for (acc, &val) in plan_embedding.iter_mut().zip(op_embedding) {
                *acc += weight * val;
            }
        }
        // L2 normalize
        let norm: f32 = plan_embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 1e-8 {
            for x in &mut plan_embedding {
                *x /= norm;
            }
        }
        self.plan_embedding = Some(plan_embedding);
    }
    /// Generate embedding using attention over operators
    pub fn generate_plan_embedding_with_attention(&mut self, attention: &dyn DagAttention) {
        // Use root operator as query. Operators are stored leaves first, so
        // index 0 is a leaf; look the root up by id and fall back to the
        // embedding stored on the root node itself.
        let root_embedding = self.operators.iter()
            .position(|op| op.id == self.root.id)
            .and_then(|i| self.operator_embeddings.get(i))
            .cloned()
            .unwrap_or_else(|| self.root.embedding.clone());
        // Build context from all operators
        let ctx = self.build_dag_context();
        // Compute attention weights
        let query_node = DagNode {
            id: self.root.id,
            embedding: root_embedding,
        };
        let output = attention.forward(&query_node, &ctx, &AttentionConfig::default())
            .expect("Attention computation failed");
        // Store attention weights
        self.attention_weights = vec![output.weights.clone()];
        // Use aggregated output as plan embedding
        self.plan_embedding = Some(output.aggregated);
    }
fn build_dag_context(&self) -> DagContext {
DagContext {
nodes: self.operators.iter()
.map(|op| DagNode {
id: op.id,
embedding: op.embedding.clone(),
})
.collect(),
edges: self.edges.clone(),
reverse_edges: self.reverse_edges.iter()
.map(|(&child, &parent)| (child, vec![parent]))
.collect(),
depths: self.operators.iter()
.map(|op| (op.id, op.depth))
.collect(),
timestamps: None,
criticalities: self.criticalities.as_ref().map(|c| {
self.operators.iter()
.enumerate()
.map(|(i, op)| (op.id, c[i]))
.collect()
}),
}
}
}
```
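The depth-and-cost weighting above can be sketched in isolation. This is a minimal illustration rather than the crate's implementation: `Op` and `aggregate` are hypothetical names, and the total cost is assumed to be the sum of operator costs, since `total_cost()` is not defined in this section.

```rust
/// Hypothetical stand-in for an operator: only the fields the weighting needs.
struct Op {
    depth: usize,
    estimated_cost: f64,
    embedding: Vec<f32>,
}

/// Blend depth and cost weights 50/50, sum the embeddings, then L2-normalize.
/// Returns `None` for an empty plan instead of panicking.
fn aggregate(ops: &[Op]) -> Option<Vec<f32>> {
    let dim = ops.first()?.embedding.len();
    // Assumption: total plan cost is the sum of per-operator costs.
    let total_cost: f64 = ops.iter().map(|o| o.estimated_cost).sum();
    let mut out = vec![0.0f32; dim];
    for op in ops {
        let depth_weight = 1.0 / (op.depth as f32 + 1.0);
        let cost_weight = if total_cost > 0.0 {
            (op.estimated_cost / total_cost).min(1.0) as f32
        } else {
            0.0
        };
        let weight = 0.5 * depth_weight + 0.5 * cost_weight;
        for (acc, &v) in out.iter_mut().zip(&op.embedding) {
            *acc += weight * v;
        }
    }
    let norm = out.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 1e-8 {
        out.iter_mut().for_each(|x| *x /= norm);
    }
    Some(out)
}
```

With a root operator (depth 0, cost 30) and one child (depth 1, cost 10), the root receives weight 0.875 and the child 0.375 before normalization, so the root dominates the plan embedding.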
## Plan Optimization
### Learned Cost Adjustment
```rust
impl NeuralDagPlan {
/// Apply learned cost adjustments
pub fn apply_learned_costs(&mut self) {
if let Some(ref learned_costs) = self.learned_costs {
for (i, op) in self.operators.iter_mut().enumerate() {
if i < learned_costs.len() {
// Adjust estimated cost by learned factor
let adjustment = learned_costs[i];
op.estimated_cost *= (1.0 + adjustment) as f64;
}
}
}
}
/// Reorder operators based on learned pattern
pub fn reorder_operators(&mut self, optimal_ordering: &[OperatorId]) {
// Only reorder within commutative operators (e.g., join order)
let join_ops: Vec<_> = self.operators.iter()
.filter(|op| matches!(op.op_type,
OperatorType::HashJoin |
OperatorType::MergeJoin |
OperatorType::NestedLoop))
.map(|op| op.id)
.collect();
if join_ops.len() < 2 {
return; // Nothing to reorder
}
// Apply learned ordering
// This is a simplified version - real implementation needs
// to preserve DAG constraints
for (i, &target_id) in optimal_ordering.iter().enumerate() {
if i < join_ops.len() {
// Swap join operators to match target ordering
// (preserving child relationships)
}
}
}
/// Apply learned execution parameters
pub fn apply_params(&mut self, params: &ExecutionParams) {
self.params = params.clone();
// Apply to relevant operators
for op in &mut self.operators {
match op.op_type {
OperatorType::HnswScan => {
if let Some(ef) = params.ef_search {
op.embedding[160] = ef as f32 / 100.0; // Encode in embedding
}
}
OperatorType::IvfFlatScan => {
if let Some(probes) = params.probes {
op.embedding[161] = probes as f32 / 50.0;
}
}
_ => {}
}
}
}
}
```
### Critical Path Analysis
```rust
impl NeuralDagPlan {
/// Compute critical path through the plan DAG
pub fn compute_critical_path(&mut self) {
// Dynamic programming: longest path
let mut longest_to: HashMap<OperatorId, f64> = HashMap::new();
let mut longest_from: HashMap<OperatorId, f64> = HashMap::new();
let mut predecessor: HashMap<OperatorId, OperatorId> = HashMap::new();
// Forward pass (leaves to root) - longest path TO each node
for op in self.operators.iter().rev() { // Reverse topo order
let mut max_cost = 0.0;
let mut max_pred = None;
if let Some(children) = self.edges.get(&op.id) {
for &child_id in children {
let child_cost = longest_to.get(&child_id).unwrap_or(&0.0);
if *child_cost > max_cost {
max_cost = *child_cost;
max_pred = Some(child_id);
}
}
}
longest_to.insert(op.id, max_cost + op.estimated_cost);
if let Some(pred) = max_pred {
predecessor.insert(op.id, pred);
}
}
// Backward pass (root to leaves) - longest path FROM each node
for op in &self.operators {
let mut max_cost = 0.0;
if let Some(&parent_id) = self.reverse_edges.get(&op.id) {
let parent_cost = longest_from.get(&parent_id).unwrap_or(&0.0);
max_cost = max_cost.max(*parent_cost + self.get_operator(parent_id).estimated_cost);
}
longest_from.insert(op.id, max_cost);
}
// Find critical path
let global_longest = longest_to.values().cloned().fold(0.0, f64::max);
let mut critical_path = Vec::new();
for op in &self.operators {
let total_through = longest_to[&op.id] + longest_from[&op.id];
if (total_through - global_longest).abs() < 1e-6 {
critical_path.push(op.id);
}
}
// Mark operators
for op in &mut self.operators {
op.is_critical = critical_path.contains(&op.id);
}
self.critical_path = Some(critical_path);
}
    /// Compute bottleneck score (0.0 - 1.0)
    pub fn compute_bottleneck_score(&mut self) {
        if let Some(ref critical_path) = self.critical_path {
            let total_cost = self.total_cost();
            // An empty path or a zero total cost has no meaningful bottleneck
            if critical_path.is_empty() || total_cost <= 0.0 {
                self.bottleneck_score = Some(0.0);
                return;
            }
            // Bottleneck = largest single-operator share of the total cost
            let max_single = critical_path.iter()
                .map(|&id| self.get_operator(id).estimated_cost)
                .fold(0.0, f64::max);
            self.bottleneck_score = Some((max_single / total_cost) as f32);
        }
    }
}
```
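The `predecessor` map built in `compute_critical_path` is never read. One use for it would be recovering a single root-to-leaf critical path. A minimal sketch, assuming nodes are indexed in topological order with the root at index 0 (`critical_path` and its slice-based inputs are illustrative and not part of `NeuralDagPlan`):

```rust
/// Returns (cost of the longest path, node indices along one such path).
/// `children[i]` lists the inputs of node `i`; `cost[i]` is its own cost.
/// Assumes parents have smaller indices than their children, so iterating
/// in reverse visits every child before its parent.
fn critical_path(children: &[Vec<usize>], cost: &[f64]) -> (f64, Vec<usize>) {
    let n = cost.len();
    if n == 0 {
        return (0.0, Vec::new());
    }
    let mut longest = vec![0.0f64; n];
    let mut next: Vec<Option<usize>> = vec![None; n];
    for i in (0..n).rev() {
        let mut best = 0.0;
        for &c in &children[i] {
            if longest[c] > best {
                best = longest[c];
                next[i] = Some(c); // remember the most expensive input
            }
        }
        longest[i] = best + cost[i];
    }
    // Walk the recorded links from the root down to a leaf.
    let mut path = Vec::new();
    let mut cur = Some(0);
    while let Some(i) = cur {
        path.push(i);
        cur = next[i];
    }
    (longest[0], path)
}
```

Each node and edge is visited once, so this runs in O(n + m) for n operators and m edges.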
## Learning Target: Plan Quality
### Quality Computation
```rust
/// Compute quality score for a plan execution
pub fn compute_plan_quality(plan: &NeuralDagPlan, metrics: &ExecutionMetrics) -> f32 {
    // Multi-objective quality function
    // 1. Latency score (lower latency is better)
    // Target scales with sqrt(operator count): 10ms for one operator,
    // 100ms for a 100-operator plan
    let complexity = plan.operators.len() as f32;
    let target_latency_us = 10000.0 * complexity.sqrt();
    let latency_score = (target_latency_us / (metrics.latency_us as f32 + 1.0)).min(1.0);
    // 2. Accuracy score (for vector queries)
    // If we have relevance feedback
    let accuracy_score = if let Some(precision) = metrics.precision {
        precision
    } else {
        1.0 // Assume accurate if no feedback
    };
    // 3. Efficiency score (rows per millisecond, capped at 1.0)
    let efficiency_score = if metrics.latency_us > 0 {
        (metrics.rows_processed as f32 / metrics.latency_us as f32 * 1000.0).min(1.0)
    } else {
        1.0
    };
    // 4. Memory score (lower usage is better)
    let target_memory = 10_000_000.0 * complexity; // 10MB per operator
    let memory_score = (target_memory / (metrics.memory_bytes as f32 + 1.0)).min(1.0);
    // 5. Cache efficiency
    let cache_score = metrics.cache_hit_rate;
    // Weighted combination (weights sum to 1.0)
    let weights = [0.35, 0.25, 0.15, 0.15, 0.10];
    let scores = [latency_score, accuracy_score, efficiency_score, memory_score, cache_score];
    weights.iter().zip(scores.iter())
        .map(|(w, s)| w * s)
        .sum()
}
```
### Gradient Estimation
```rust
impl DagTrajectory {
    /// Estimate gradient for REINFORCE-style learning
    pub fn estimate_gradient(&self) -> Vec<f32> {
        let dim = self.plan_embedding.len();
        let mut gradient = vec![0.0; dim];
        // REINFORCE with baseline
        let baseline = 0.5; // Could be learned
        let advantage = self.quality - baseline;
        // gradient += advantage * activation
        // Simplified: use plan embedding as "activation"
        for (i, &val) in self.plan_embedding.iter().enumerate() {
            gradient[i] = advantage * val;
        }
        // Also incorporate operator-level signals
        let uniform = 1.0 / self.operator_embeddings.len() as f32;
        for (op_idx, op_embedding) in self.operator_embeddings.iter().enumerate() {
            // Weight by attention. `attention_weights` holds one vector per
            // attention pass, so look the operator up inside the first pass
            // and fall back to a uniform weight when it is missing.
            let attention_weight = self.attention_weights
                .first()
                .and_then(|w| w.get(op_idx))
                .copied()
                .unwrap_or(uniform);
            for (i, &val) in op_embedding.iter().enumerate() {
                if i < dim {
                    gradient[i] += advantage * val * attention_weight * 0.5;
                }
            }
        }
        // L2 normalize
        let norm: f32 = gradient.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 1e-8 {
            for x in &mut gradient {
                *x /= norm;
            }
        }
        gradient
    }
}
```
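The fixed `baseline = 0.5` is marked "could be learned". One common option is an exponential moving average of observed quality. The sketch below is an assumption about how that could look; `EmaBaseline` and its `alpha` smoothing factor are hypothetical and do not appear elsewhere in this plan.

```rust
/// Running REINFORCE baseline: exponential moving average of plan quality.
struct EmaBaseline {
    value: f32,
    alpha: f32, // smoothing factor in (0, 1]; larger reacts faster
}

impl EmaBaseline {
    fn new(initial: f32, alpha: f32) -> Self {
        Self { value: initial, alpha }
    }

    /// Returns the advantage relative to the baseline *before* this sample,
    /// then moves the baseline toward the observed quality.
    fn advantage_and_update(&mut self, quality: f32) -> f32 {
        let advantage = quality - self.value;
        self.value += self.alpha * advantage;
        advantage
    }
}
```

Computing the advantage before the update keeps the current sample out of its own baseline, which is the usual reason for this ordering.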
## Integration with PostgreSQL Planner
### Plan Modification Points
```rust
/// Points where neural DAG can influence planning
pub enum PlanModificationPoint {
/// Before any planning
PrePlanning,
/// After join enumeration, before selecting best join order
JoinOrdering,
/// After creating base plan, before optimization
PreOptimization,
/// After optimization, before execution
PostOptimization,
/// During execution (adaptive)
DuringExecution,
}
impl NeuralDagPlan {
/// Apply neural modifications at specified point
pub fn apply_modifications(&mut self, point: PlanModificationPoint, engine: &DagSonaEngine) {
match point {
PlanModificationPoint::PrePlanning => {
// Hint optimal parameters based on query pattern
self.apply_pre_planning_hints(engine);
}
PlanModificationPoint::JoinOrdering => {
// Suggest optimal join order
if let Some(ordering) = engine.suggest_join_order(&self.plan_embedding) {
self.reorder_operators(&ordering);
}
}
PlanModificationPoint::PreOptimization => {
// Adjust cost estimates
if let Some(costs) = engine.predict_costs(&self.plan_embedding) {
self.learned_costs = Some(costs);
self.apply_learned_costs();
}
}
PlanModificationPoint::PostOptimization => {
// Final parameter tuning
if let Some(params) = engine.suggest_params(&self.plan_embedding) {
self.apply_params(&params);
}
}
PlanModificationPoint::DuringExecution => {
// Adaptive re-planning (future work)
}
}
}
}
```
## Serialization
### Plan Persistence
```rust
impl NeuralDagPlan {
/// Serialize plan for storage
pub fn to_bytes(&self) -> Vec<u8> {
bincode::serialize(self).expect("Serialization failed")
}
/// Deserialize plan
pub fn from_bytes(bytes: &[u8]) -> Result<Self, bincode::Error> {
bincode::deserialize(bytes)
}
/// Export to JSON for debugging
pub fn to_json(&self) -> serde_json::Value {
json!({
"plan_id": self.plan_id,
"operators": self.operators.iter().map(|op| json!({
"id": op.id,
"type": format!("{:?}", op.op_type),
"table": op.table_name,
"estimated_rows": op.estimated_rows,
"estimated_cost": op.estimated_cost,
"depth": op.depth,
"is_critical": op.is_critical,
"criticality": op.criticality,
})).collect::<Vec<_>>(),
"edges": self.edges,
"attention_type": format!("{:?}", self.attention_type),
"bottleneck_score": self.bottleneck_score,
"params": {
"ef_search": self.params.ef_search,
"probes": self.params.probes,
"parallelism": self.params.parallelism,
}
})
}
}
```
## Performance Considerations
| Operation | Complexity | Target Latency |
|-----------|------------|----------------|
| Plan conversion | O(n) | <1ms |
| Embedding generation | O(n × d) | <500μs |
| Plan embedding | O(n × d) | <200μs |
| Critical path | O(n²) | <1ms |
| MinCut criticality (all operators) | O(n^1.12) | <10ms |
| Pattern matching | O(k × d) | <1ms |
Where n = operators, d = embedding dimension (256), k = patterns (100).


@@ -0,0 +1,667 @@
# MinCut Optimization Specification
## Overview
This document specifies how the subpolynomial O(n^0.12) min-cut algorithm from `ruvector-mincut` integrates with the Neural DAG system for bottleneck detection and optimization.
## MinCut Integration Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          MINCUT OPTIMIZATION LAYER                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                     SUBPOLYNOMIAL MINCUT ENGINE                     │   │
│   │        ┌─────────────┐    ┌─────────────┐    ┌─────────────┐        │   │
│   │        │ Hierarchical│    │  LocalKCut  │    │   LinkCut   │        │   │
│   │        │Decomposition│    │   Oracle    │    │    Tree     │        │   │
│   │        └─────────────┘    └─────────────┘    └─────────────┘        │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                      │
│   ┌──────────────────────────────────┴──────────────────────────────────┐   │
│   │                      DAG CRITICALITY ANALYZER                       │   │
│   │        ┌─────────────┐    ┌─────────────┐    ┌─────────────┐        │   │
│   │        │  Operator   │    │ Bottleneck  │    │  Critical   │        │   │
│   │        │ Criticality │    │  Detection  │    │    Path     │        │   │
│   │        └─────────────┘    └─────────────┘    └─────────────┘        │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                      │                                      │
│   ┌──────────────────────────────────┴──────────────────────────────────┐   │
│   │                        OPTIMIZATION ACTIONS                         │   │
│   │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐  │   │
│   │  │    Gated    │  │ Redundancy  │  │  Parallel   │  │   Self-    │  │   │
│   │  │  Attention  │  │  Injection  │  │  Expansion  │  │  Healing   │  │   │
│   │  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘  │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
## DAG MinCut Engine
### Core Structure
```rust
/// MinCut engine adapted for query plan DAGs
pub struct DagMinCutEngine {
/// Subpolynomial min-cut algorithm
mincut: SubpolynomialMinCut,
/// Graph representation of current DAG
graph: DynamicGraph,
/// Cached criticality scores
criticality_cache: DashMap<OperatorId, f32>,
/// Configuration
config: MinCutConfig,
/// Metrics
metrics: MinCutMetrics,
}
#[derive(Clone, Debug)]
pub struct MinCutConfig {
/// Enable/disable mincut analysis
pub enabled: bool,
/// Criticality threshold for bottleneck detection
pub bottleneck_threshold: f32,
/// Maximum operators to analyze
pub max_operators: usize,
/// Cache TTL in seconds
pub cache_ttl_secs: u64,
/// Enable self-healing
pub self_healing_enabled: bool,
/// Healing check interval
pub healing_interval_ms: u64,
}
impl Default for MinCutConfig {
fn default() -> Self {
Self {
enabled: true,
bottleneck_threshold: 0.5,
max_operators: 1000,
cache_ttl_secs: 300,
self_healing_enabled: true,
healing_interval_ms: 300000, // 5 minutes
}
}
}
```
### Graph Construction from DAG
```rust
impl DagMinCutEngine {
/// Build graph from query plan DAG
pub fn build_from_plan(&mut self, plan: &NeuralDagPlan) {
self.graph.clear();
// Add vertices (operators)
for op in &plan.operators {
let weight = self.operator_weight(op);
self.graph.add_vertex(op.id, weight);
}
// Add edges (data flow)
for (&parent_id, children) in &plan.edges {
for &child_id in children {
// Edge weight = data volume estimate
let parent_op = plan.get_operator(parent_id);
let weight = parent_op.estimated_rows as f64;
self.graph.add_edge(parent_id, child_id, weight);
}
}
// Initialize min-cut structure
self.mincut.initialize(&self.graph);
}
/// Compute operator weight for min-cut
fn operator_weight(&self, op: &OperatorNode) -> f64 {
// Weight based on:
// 1. Estimated cost (primary)
// 2. Blocking nature (pipeline breakers are heavier)
// 3. Parallelizability (less parallelizable = heavier)
let base_weight = op.estimated_cost;
let blocking_factor = if op.is_pipeline_breaker() {
2.0
} else {
1.0
};
let parallel_factor = match op.op_type {
OperatorType::Sort | OperatorType::Aggregate => 1.5,
OperatorType::HashJoin => 1.2,
_ => 1.0,
};
base_weight * blocking_factor * parallel_factor
}
}
```
## Criticality Computation
### Operator Criticality
```rust
impl DagMinCutEngine {
/// Compute criticality for all operators
pub fn compute_all_criticalities(&self, plan: &NeuralDagPlan) -> HashMap<OperatorId, f32> {
let global_cut = self.mincut.query();
let mut criticalities = HashMap::new();
for op in &plan.operators {
let criticality = self.compute_operator_criticality(op.id, global_cut);
criticalities.insert(op.id, criticality);
}
criticalities
}
    /// Compute criticality for a single operator
    /// Criticality = how much removing this operator would reduce min-cut
    pub fn compute_operator_criticality(&self, op_id: OperatorId, global_cut: u64) -> f32 {
        // Check cache first
        if let Some(cached) = self.criticality_cache.get(&op_id) {
            return *cached;
        }
        // Use LocalKCut oracle
        let query = LocalKCutQuery {
            seed_vertices: vec![op_id],
            budget_k: global_cut,
            radius: 3, // Local neighborhood
        };
        let criticality = match self.mincut.local_query(query) {
            LocalKCutResult::Found { cut_value, .. } => {
                // Criticality = (global - local) / global.
                // saturating_sub keeps the unsigned subtraction from
                // underflowing if the local cut exceeds the global value.
                if global_cut > 0 {
                    global_cut.saturating_sub(cut_value) as f32 / global_cut as f32
                } else {
                    0.0
                }
            }
            LocalKCutResult::NoneInLocality => 0.0,
        };
        // Cache result
        self.criticality_cache.insert(op_id, criticality);
        criticality
    }
    /// Identify bottleneck operators
    pub fn identify_bottlenecks(&self, plan: &NeuralDagPlan) -> Vec<BottleneckInfo> {
        let criticalities = self.compute_all_criticalities(plan);
        let mut bottlenecks: Vec<_> = criticalities.iter()
            .filter(|(_, &crit)| crit > self.config.bottleneck_threshold)
            .map(|(&op_id, &crit)| {
                let op = plan.get_operator(op_id);
                BottleneckInfo {
                    operator_id: op_id,
                    operator_type: op.op_type.clone(),
                    criticality: crit,
                    estimated_cost: op.estimated_cost,
                    recommendation: self.generate_recommendation(op, crit),
                }
            })
            .collect();
        // Sort by criticality (most critical first).
        // total_cmp gives a total order, so a NaN score cannot panic the sort.
        bottlenecks.sort_by(|a, b| b.criticality.total_cmp(&a.criticality));
        bottlenecks
    }
/// Generate optimization recommendation for bottleneck
fn generate_recommendation(&self, op: &OperatorNode, criticality: f32) -> OptimizationRecommendation {
match op.op_type {
OperatorType::SeqScan => {
OptimizationRecommendation::CreateIndex {
table: op.table_name.clone().unwrap_or_default(),
columns: op.filter.as_ref()
.map(|f| f.columns())
.unwrap_or_default(),
}
}
OperatorType::HnswScan | OperatorType::IvfFlatScan => {
if criticality > 0.8 {
OptimizationRecommendation::IncreaseEfSearch {
current: 40, // Would be extracted from plan
recommended: 80,
}
} else {
OptimizationRecommendation::None
}
}
OperatorType::NestedLoop => {
OptimizationRecommendation::ConsiderHashJoin {
estimated_improvement: criticality * 50.0,
}
}
OperatorType::Sort => {
if op.estimated_rows > 100000.0 {
OptimizationRecommendation::AddSortIndex {
columns: op.projection.clone(),
}
} else {
OptimizationRecommendation::None
}
}
OperatorType::HashAggregate if op.estimated_rows > 1000000.0 => {
OptimizationRecommendation::ConsiderPartitioning {
partition_key: op.projection.first().cloned(),
}
}
_ => OptimizationRecommendation::None,
}
}
}
/// Information about a bottleneck
#[derive(Clone, Debug)]
pub struct BottleneckInfo {
pub operator_id: OperatorId,
pub operator_type: OperatorType,
pub criticality: f32,
pub estimated_cost: f64,
pub recommendation: OptimizationRecommendation,
}
/// Optimization recommendations
#[derive(Clone, Debug)]
pub enum OptimizationRecommendation {
None,
CreateIndex { table: String, columns: Vec<String> },
IncreaseEfSearch { current: usize, recommended: usize },
ConsiderHashJoin { estimated_improvement: f32 },
AddSortIndex { columns: Vec<String> },
ConsiderPartitioning { partition_key: Option<String> },
AddParallelism { recommended_workers: usize },
MaterializeSubquery { subquery_id: OperatorId },
}
```
## MinCut Gated Attention Integration
### Gating Mechanism
```rust
impl DagMinCutEngine {
/// Compute attention gates based on criticality
pub fn compute_attention_gates(
&self,
plan: &NeuralDagPlan,
) -> Vec<f32> {
let criticalities = self.compute_all_criticalities(plan);
plan.operators.iter()
.map(|op| {
let crit = criticalities.get(&op.id).unwrap_or(&0.0);
if *crit > self.config.bottleneck_threshold {
1.0 // Full attention for bottlenecks
} else {
crit / self.config.bottleneck_threshold // Scaled
}
})
.collect()
}
/// Apply gating to attention weights
pub fn gate_attention_weights(
&self,
weights: &[f32],
gates: &[f32],
) -> Vec<f32> {
assert_eq!(weights.len(), gates.len());
let gated: Vec<f32> = weights.iter()
.zip(gates.iter())
.map(|(w, g)| w * g)
.collect();
// Renormalize
let sum: f32 = gated.iter().sum();
if sum > 1e-8 {
gated.iter().map(|w| w / sum).collect()
} else {
vec![1.0 / weights.len() as f32; weights.len()]
}
}
}
```
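A small numeric walk-through of the gating rule may help. The free functions below mirror `compute_attention_gates` and `gate_attention_weights` on plain slices. The zero-threshold guard in `gate` is an added assumption; the original does not say how a threshold of 0 should behave.

```rust
/// Gate for one operator: full attention above the threshold, linear ramp below.
fn gate(criticality: f32, threshold: f32) -> f32 {
    // Assumption: a non-positive threshold means "gate nothing".
    if threshold <= 0.0 || criticality > threshold {
        1.0
    } else {
        criticality / threshold
    }
}

/// Multiply weights by gates and renormalize so the result sums to 1.
/// Falls back to a uniform distribution when every gated weight is ~0.
fn apply_gates(weights: &[f32], gates: &[f32]) -> Vec<f32> {
    assert_eq!(weights.len(), gates.len());
    let gated: Vec<f32> = weights.iter().zip(gates).map(|(w, g)| w * g).collect();
    let sum: f32 = gated.iter().sum();
    if sum > 1e-8 {
        gated.iter().map(|w| w / sum).collect()
    } else {
        vec![1.0 / weights.len() as f32; weights.len()]
    }
}
```

With criticalities `[0.8, 0.25, 0.0]` and the default threshold 0.5, the gates are `[1.0, 0.5, 0.0]`. Applied to weights `[0.5, 0.3, 0.2]`, the gated values `[0.5, 0.15, 0.0]` renormalize to roughly `[0.77, 0.23, 0.0]`, shifting attention toward the bottleneck.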
## Dynamic Updates
### Incremental MinCut Maintenance
```rust
impl DagMinCutEngine {
/// Handle operator cost update (O(n^0.12) amortized)
pub fn update_operator_cost(&mut self, op_id: OperatorId, new_cost: f64) {
let old_weight = self.graph.get_vertex_weight(op_id);
let new_weight = new_cost * self.get_operator_factors(op_id);
// Update graph
self.graph.update_vertex_weight(op_id, new_weight);
// Incremental min-cut update
// The subpolynomial algorithm handles this efficiently
self.mincut.on_vertex_weight_change(op_id, old_weight, new_weight);
// Invalidate cache for affected operators
self.invalidate_local_cache(op_id);
}
/// Handle edge addition (e.g., plan change)
pub fn add_edge(&mut self, from: OperatorId, to: OperatorId, weight: f64) {
self.graph.add_edge(from, to, weight);
self.mincut.insert_edge(from, to);
self.invalidate_local_cache(from);
self.invalidate_local_cache(to);
}
/// Handle edge removal
pub fn remove_edge(&mut self, from: OperatorId, to: OperatorId) {
self.graph.remove_edge(from, to);
self.mincut.delete_edge(from, to);
self.invalidate_local_cache(from);
self.invalidate_local_cache(to);
}
/// Invalidate cache for operator and neighbors
fn invalidate_local_cache(&self, op_id: OperatorId) {
self.criticality_cache.remove(&op_id);
// Also invalidate neighbors (within radius 3)
let neighbors = self.graph.get_neighbors_within_radius(op_id, 3);
for neighbor in neighbors {
self.criticality_cache.remove(&neighbor);
}
}
}
```
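`get_neighbors_within_radius` is not shown in this document. One plausible implementation is a depth-bounded breadth-first search. The sketch below assumes an adjacency-list graph keyed by `u32` ids; the real `DynamicGraph` API may differ.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// All vertices reachable from `start` in at most `radius` hops, excluding `start`.
fn neighbors_within_radius(
    adj: &HashMap<u32, Vec<u32>>,
    start: u32,
    radius: usize,
) -> HashSet<u32> {
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    while let Some((node, dist)) = queue.pop_front() {
        if dist == radius {
            continue; // do not expand past the radius
        }
        if let Some(neighbors) = adj.get(&node) {
            for &next in neighbors {
                // `insert` returns true only the first time a vertex is seen
                if seen.insert(next) {
                    queue.push_back((next, dist + 1));
                }
            }
        }
    }
    seen.remove(&start);
    seen
}
```

Because the radius is a small constant (3 in `invalidate_local_cache`), the cost per invalidation depends on local degree rather than on the size of the whole plan.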
## Self-Healing Integration
### Bottleneck Detection Loop
```rust
impl DagMinCutEngine {
/// Background bottleneck detection
pub fn run_health_check(&self, plan: &NeuralDagPlan) -> HealthCheckResult {
let start = Instant::now();
// Compute global min-cut
let global_cut = self.mincut.query();
// Identify bottlenecks
let bottlenecks = self.identify_bottlenecks(plan);
// Compute health score
let health_score = self.compute_health_score(&bottlenecks);
// Generate alerts if needed
let alerts = self.generate_alerts(&bottlenecks);
HealthCheckResult {
global_mincut: global_cut,
health_score,
bottleneck_count: bottlenecks.len(),
severe_bottlenecks: bottlenecks.iter()
.filter(|b| b.criticality > 0.8)
.count(),
bottlenecks,
alerts,
duration: start.elapsed(),
}
}
fn compute_health_score(&self, bottlenecks: &[BottleneckInfo]) -> f32 {
if bottlenecks.is_empty() {
return 1.0;
}
// Score decreases with bottleneck severity
let max_criticality = bottlenecks.iter()
.map(|b| b.criticality)
.fold(0.0, f32::max);
let avg_criticality = bottlenecks.iter()
.map(|b| b.criticality)
.sum::<f32>() / bottlenecks.len() as f32;
1.0 - (max_criticality * 0.6 + avg_criticality * 0.4)
}
fn generate_alerts(&self, bottlenecks: &[BottleneckInfo]) -> Vec<Alert> {
bottlenecks.iter()
.filter(|b| b.criticality > 0.7)
.map(|b| Alert {
severity: if b.criticality > 0.9 {
AlertSeverity::Critical
} else if b.criticality > 0.8 {
AlertSeverity::Warning
} else {
AlertSeverity::Info
},
message: format!(
"Bottleneck detected: {:?} (criticality: {:.2})",
b.operator_type, b.criticality
),
recommendation: b.recommendation.clone(),
})
.collect()
}
}
#[derive(Clone, Debug)]
pub struct HealthCheckResult {
pub global_mincut: u64,
pub health_score: f32,
pub bottleneck_count: usize,
pub severe_bottlenecks: usize,
pub bottlenecks: Vec<BottleneckInfo>,
pub alerts: Vec<Alert>,
pub duration: Duration,
}
#[derive(Clone, Debug)]
pub struct Alert {
pub severity: AlertSeverity,
pub message: String,
pub recommendation: OptimizationRecommendation,
}
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum AlertSeverity {
Info,
Warning,
Critical,
}
```
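As a worked example of the health formula, the standalone function below reproduces `compute_health_score` on a slice of criticalities. The final `clamp` is an added safeguard, assumed here in case a criticality falls outside [0, 1].

```rust
/// Health = 1 - (0.6 * max criticality + 0.4 * mean criticality).
fn health_score(criticalities: &[f32]) -> f32 {
    if criticalities.is_empty() {
        return 1.0; // no bottlenecks means a fully healthy plan
    }
    let max = criticalities.iter().copied().fold(0.0, f32::max);
    let avg = criticalities.iter().sum::<f32>() / criticalities.len() as f32;
    (1.0 - (max * 0.6 + avg * 0.4)).clamp(0.0, 1.0)
}
```

Two bottlenecks with criticalities 0.9 and 0.6 give max 0.9 and mean 0.75, so the score is 1 - (0.54 + 0.30) = 0.16. The 0.6 weight on the maximum means a single severe bottleneck pulls the score down even when the rest are mild.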
## Redundancy Injection
### Bypass Path Creation
```rust
impl DagMinCutEngine {
/// Suggest redundant paths to reduce bottleneck impact
pub fn suggest_redundancy(&self, plan: &NeuralDagPlan) -> Vec<RedundancySuggestion> {
let bottlenecks = self.identify_bottlenecks(plan);
let mut suggestions = Vec::new();
for bottleneck in &bottlenecks {
if bottleneck.criticality > 0.7 {
// Find alternative paths around this operator
let alternatives = self.find_alternative_paths(
plan,
bottleneck.operator_id,
);
if let Some(alt) = alternatives.first() {
suggestions.push(RedundancySuggestion {
bottleneck_id: bottleneck.operator_id,
alternative_path: alt.clone(),
estimated_improvement: self.estimate_improvement(
bottleneck,
alt,
),
});
}
}
}
suggestions
}
fn find_alternative_paths(
&self,
plan: &NeuralDagPlan,
bottleneck_id: OperatorId,
) -> Vec<AlternativePath> {
let mut alternatives = Vec::new();
let bottleneck = plan.get_operator(bottleneck_id);
match bottleneck.op_type {
OperatorType::SeqScan => {
// Alternative: Index scan if index exists
if let Some(ref table) = bottleneck.table_name {
if let Some(index) = self.find_usable_index(table, &bottleneck.filter) {
alternatives.push(AlternativePath::UseIndex {
index_name: index,
estimated_speedup: 10.0,
});
}
}
}
OperatorType::NestedLoop => {
// Alternative: Hash join
alternatives.push(AlternativePath::ReplaceJoin {
new_join_type: OperatorType::HashJoin,
estimated_speedup: 5.0,
});
}
OperatorType::Sort => {
// Alternative: Pre-sorted input via index
alternatives.push(AlternativePath::SortedIndex {
columns: bottleneck.projection.clone(),
estimated_speedup: 3.0,
});
}
_ => {}
}
alternatives
}
fn estimate_improvement(
&self,
bottleneck: &BottleneckInfo,
alternative: &AlternativePath,
) -> f32 {
let base_cost = bottleneck.estimated_cost;
let speedup = alternative.estimated_speedup();
let new_cost = base_cost / speedup;
let improvement = (base_cost - new_cost) / base_cost;
improvement as f32 * bottleneck.criticality
}
}
#[derive(Clone, Debug)]
pub struct RedundancySuggestion {
pub bottleneck_id: OperatorId,
pub alternative_path: AlternativePath,
pub estimated_improvement: f32,
}
#[derive(Clone, Debug)]
pub enum AlternativePath {
UseIndex { index_name: String, estimated_speedup: f64 },
ReplaceJoin { new_join_type: OperatorType, estimated_speedup: f64 },
SortedIndex { columns: Vec<String>, estimated_speedup: f64 },
Materialize { subquery_id: OperatorId, estimated_speedup: f64 },
Parallelize { workers: usize, estimated_speedup: f64 },
}
impl AlternativePath {
fn estimated_speedup(&self) -> f64 {
match self {
Self::UseIndex { estimated_speedup, .. } => *estimated_speedup,
Self::ReplaceJoin { estimated_speedup, .. } => *estimated_speedup,
Self::SortedIndex { estimated_speedup, .. } => *estimated_speedup,
Self::Materialize { estimated_speedup, .. } => *estimated_speedup,
Self::Parallelize { estimated_speedup, .. } => *estimated_speedup,
}
}
}
```
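In `estimate_improvement` the base cost cancels: `(base - base / speedup) / base` equals `1 - 1 / speedup`. The sketch below shows both forms side by side; the function names and the zero guards are illustrative assumptions.

```rust
/// Form used above: relative cost reduction scaled by criticality.
fn improvement(base_cost: f64, speedup: f64, criticality: f32) -> f32 {
    if base_cost <= 0.0 || speedup <= 0.0 {
        return 0.0; // assumed fallback for degenerate inputs
    }
    let new_cost = base_cost / speedup;
    ((base_cost - new_cost) / base_cost) as f32 * criticality
}

/// Equivalent closed form: depends only on speedup and criticality.
fn improvement_closed_form(speedup: f64, criticality: f32) -> f32 {
    if speedup <= 0.0 {
        return 0.0;
    }
    (1.0 - 1.0 / speedup) as f32 * criticality
}
```

For a nested loop replaced by a hash join (speedup 5.0) on an operator with criticality 0.75, both forms give 0.8 × 0.75 = 0.6 regardless of the operator's cost.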
## SQL Interface
```sql
-- Compute mincut criticality for a plan
SELECT * FROM ruvector_dag_mincut_criticality('documents');
-- Get bottleneck analysis
SELECT * FROM ruvector_dag_bottlenecks('documents');
-- Get health check result
SELECT ruvector_dag_mincut_health('documents');
-- Get redundancy suggestions
SELECT * FROM ruvector_dag_redundancy_suggestions('documents');
-- Enable/disable mincut analysis
SET ruvector.dag_mincut_enabled = true;
SET ruvector.dag_mincut_threshold = 0.5;
```
## Performance Characteristics
| Operation | Complexity | Typical Latency |
|-----------|------------|-----------------|
| Global min-cut query | O(1) | <1μs |
| Single criticality | O(n^0.12) | <100μs |
| All criticalities | O(n^1.12) | <10ms (100 ops) |
| Edge insert | O(n^0.12) amortized | <100μs |
| Edge delete | O(n^0.12) amortized | <100μs |
| Health check | O(n^1.12) | <50ms |
## Memory Usage
| Component | Size | Notes |
|-----------|------|-------|
| Graph structure | O(n + m) | Vertices + edges |
| Hierarchical decomposition | O(n log n) | Multi-level |
| LinkCut tree | O(n) | Sleator-Tarjan |
| Criticality cache | O(n) | Bounded by TTL |
| LocalKCut coloring | O(k² log n) | Per query |
**Typical overhead:** ~1MB per 1000 operators

File diff suppressed because it is too large


@@ -0,0 +1,739 @@
# QuDAG Integration Specification
## Overview
This document specifies the optional integration between the RuVector-Postgres Neural DAG system and QuDAG (Quantum-resistant Distributed DAG) for federated learning and distributed consensus on learned patterns.
## Integration Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           QUDAG INTEGRATION LAYER                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                         FEDERATED LEARNING                          │   │
│   │                                                                     │   │
│   │       Node A (US)          Node B (EU)         Node C (Asia)        │   │
│   │     ┌──────────────┐     ┌──────────────┐     ┌──────────────┐      │   │
│   │     │ RuVector-PG  │     │ RuVector-PG  │     │ RuVector-PG  │      │   │
│   │     │ ┌──────────┐ │     │ ┌──────────┐ │     │ ┌──────────┐ │      │   │
│   │     │ │ Patterns │ │     │ │ Patterns │ │     │ │ Patterns │ │      │   │
│   │     │ └────┬─────┘ │     │ └────┬─────┘ │     │ └────┬─────┘ │      │   │
│   │     └──────┼───────┘     └──────┼───────┘     └──────┼───────┘      │   │
│   │            │                    │                    │              │   │
│   │            └────────────────────┼────────────────────┘              │   │
│   │                                 ▼                                   │   │
│   │                        ┌─────────────────┐                          │   │
│   │                        │  QuDAG Network  │                          │   │
│   │                        │ (QR-Avalanche)  │                          │   │
│   │                        └────────┬────────┘                          │   │
│   │                                 │                                   │   │
│   │            ┌────────────────────┼────────────────────┐              │   │
│   │            ▼                    ▼                    ▼              │   │
│   │      ┌────────────┐       ┌────────────┐       ┌────────────┐       │   │
│   │      │ Consensus  │       │ Consensus  │       │ Consensus  │       │   │
│   │      │  Patterns  │       │  Patterns  │       │  Patterns  │       │   │
│   │      └────────────┘       └────────────┘       └────────────┘       │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                           SECURITY LAYER                            │   │
│   │   ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐    │   │
│   │   │   ML-KEM   │  │   ML-DSA   │  │Differential│  │    rUv     │    │   │
│   │   │ Encryption │  │ Signatures │  │  Privacy   │  │   Tokens   │    │   │
│   │   └────────────┘  └────────────┘  └────────────┘  └────────────┘    │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
## QuDAG Client
### Core Structure
```rust
pub struct QuDagClient {
/// QuDAG node connection
node_url: String,
/// Node identity (ML-DSA keypair)
identity: QuDagIdentity,
/// Local pattern cache
pattern_cache: DashMap<PatternId, ConsensusPattern>,
/// Pending proposals
pending_proposals: DashMap<ProposalId, PatternProposal>,
/// Configuration
config: QuDagConfig,
/// Metrics
metrics: QuDagMetrics,
}
#[derive(Clone)]
pub struct QuDagIdentity {
/// ML-DSA-65 public key
pub public_key: MlDsaPublicKey,
/// ML-DSA-65 private key (encrypted at rest)
private_key: MlDsaPrivateKey,
/// Node identifier
pub node_id: NodeId,
/// Dark address (for anonymous communication)
pub dark_address: Option<DarkAddress>,
}
#[derive(Clone, Debug)]
pub struct QuDagConfig {
/// Enable QuDAG integration
pub enabled: bool,
/// QuDAG node URL
pub node_url: String,
/// Differential privacy epsilon
pub dp_epsilon: f64,
/// Minimum validators for consensus
pub min_validators: usize,
/// Consensus timeout (seconds)
pub consensus_timeout_secs: u64,
/// Sync interval (seconds)
pub sync_interval_secs: u64,
/// Maximum patterns per proposal
pub max_patterns_per_proposal: usize,
/// rUv staking requirement
pub min_stake_ruv: u64,
}
impl Default for QuDagConfig {
fn default() -> Self {
Self {
enabled: false,
node_url: "https://yyz.qudag.darknet/mcp".to_string(),
dp_epsilon: 1.0,
min_validators: 5,
consensus_timeout_secs: 30,
sync_interval_secs: 3600,
max_patterns_per_proposal: 100,
min_stake_ruv: 10,
}
}
}
```
### Pattern Proposal
```rust
impl QuDagClient {
/// Propose local patterns for consensus
pub async fn propose_patterns(
&self,
patterns: &[LearnedDagPattern],
) -> Result<ProposalId, QuDagError> {
// 1. Add differential privacy noise
let noisy_patterns = self.add_dp_noise(patterns)?;
// 2. Create proposal
let proposal = PatternProposal {
id: self.generate_proposal_id(),
proposer: self.identity.node_id.clone(),
patterns: noisy_patterns,
stake: self.config.min_stake_ruv,
timestamp: SystemTime::now(),
signature: None,
};
// 3. Sign with ML-DSA
let signed_proposal = self.sign_proposal(proposal)?;
// 4. Submit to QuDAG network
self.submit_proposal(&signed_proposal).await?;
// 5. Track pending
self.pending_proposals.insert(signed_proposal.id, signed_proposal.clone());
Ok(signed_proposal.id)
}
    /// Add differential privacy noise to patterns
    fn add_dp_noise(&self, patterns: &[LearnedDagPattern]) -> Result<Vec<NoisyPattern>, QuDagError> {
        let epsilon = self.config.dp_epsilon;
        patterns.iter()
            .map(|p| {
                // Add Laplace noise to each centroid coordinate.
                // The scale 1/epsilon corresponds to an assumed sensitivity of 1.
                let noisy_centroid: Vec<f32> = p.centroid.iter()
                    .map(|&v| {
                        let noise = laplace_sample(0.0, 1.0 / epsilon);
                        v + noise as f32
                    })
                    .collect();
                // Quantize quality scores to one decimal place
                let quantized_quality = (p.avg_metrics.quality * 10.0).round() / 10.0;
                Ok(NoisyPattern {
                    centroid: noisy_centroid,
                    attention_type: p.optimal_attention.clone(),
                    quality: quantized_quality,
                    sample_count_bucket: bucket_sample_count(p.sample_count),
                })
            })
            .collect()
    }
/// Sign proposal with ML-DSA-65
fn sign_proposal(&self, mut proposal: PatternProposal) -> Result<PatternProposal, QuDagError> {
let message = proposal.to_signing_bytes();
let signature = self.identity.private_key.sign(&message)?;
proposal.signature = Some(signature);
Ok(proposal)
}
/// Submit proposal to QuDAG network
async fn submit_proposal(&self, proposal: &PatternProposal) -> Result<(), QuDagError> {
// Connect to QuDAG MCP server
let client = McpClient::connect(&self.config.node_url).await?;
// Call dag_submit tool
let response = client.call_tool("dag_submit", json!({
"vertex_type": "pattern_proposal",
"payload": proposal.to_encrypted_bytes(&self.get_network_key())?,
"parents": self.get_recent_vertices().await?,
})).await?;
if response["success"].as_bool().unwrap_or(false) {
Ok(())
} else {
Err(QuDagError::SubmissionFailed(
response["error"].as_str().unwrap_or("Unknown error").to_string()
))
}
}
}
#[derive(Clone, Debug)]
pub struct PatternProposal {
pub id: ProposalId,
pub proposer: NodeId,
pub patterns: Vec<NoisyPattern>,
pub stake: u64,
pub timestamp: SystemTime,
pub signature: Option<MlDsaSignature>,
}
#[derive(Clone, Debug)]
pub struct NoisyPattern {
/// Centroid with DP noise
pub centroid: Vec<f32>,
/// Attention type (no noise needed)
pub attention_type: DagAttentionType,
/// Quantized quality
pub quality: f64,
/// Bucketed sample count (privacy)
pub sample_count_bucket: SampleCountBucket,
}
#[derive(Clone, Debug)]
pub enum SampleCountBucket {
Few, // < 10
Some, // 10-49
Many, // 50-200
Lots, // > 200
}
```
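The listing calls `laplace_sample` and `bucket_sample_count` without defining them. The sketch below shows one way they could look, using only the standard library. It is an illustration under assumptions, not the project's implementation: `laplace_from_uniform` is a hypothetical name, the caller is assumed to supply a uniform draw in (0, 1) from its own random source, and a count of exactly 50 is assumed to fall into `Many`. The listing uses a noise scale of `1.0 / epsilon`, which corresponds to a sensitivity of 1; whether that bound holds for the centroids is not stated in the source.

```rust
/// Bucketed sample count, mirroring the enum in the listing above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SampleCountBucket {
    Few,
    Some,
    Many,
    Lots,
}

/// Hypothetical helper: map an exact count to a coarse bucket.
/// Assumption: a count of exactly 50 belongs to `Many`.
pub fn bucket_sample_count(count: usize) -> SampleCountBucket {
    match count {
        0..=9 => SampleCountBucket::Few,
        10..=49 => SampleCountBucket::Some,
        50..=200 => SampleCountBucket::Many,
        _ => SampleCountBucket::Lots,
    }
}

/// Hypothetical helper: turn a uniform draw `u` in (0, 1) into a
/// Laplace(location, scale) sample using the inverse CDF.
pub fn laplace_from_uniform(u: f64, location: f64, scale: f64) -> f64 {
    let centered = u - 0.5;
    location - scale * centered.signum() * (1.0 - 2.0 * centered.abs()).ln()
}
```

Taking the uniform draw as a parameter keeps the transform deterministic and easy to test; a real implementation would feed it from a cryptographically secure generator.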
### Consensus Validation
```rust
impl QuDagClient {
/// Validate incoming pattern proposals
pub async fn validate_proposal(
&self,
proposal: &PatternProposal,
) -> Result<ValidationResult, QuDagError> {
// 1. Verify signature
if !self.verify_signature(proposal)? {
return Ok(ValidationResult::Rejected {
reason: "Invalid signature".to_string(),
});
}
// 2. Check stake
let balance = self.get_ruv_balance(&proposal.proposer).await?;
if balance < proposal.stake {
return Ok(ValidationResult::Rejected {
reason: "Insufficient stake".to_string(),
});
}
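// 3. Validate pattern quality. Reject empty proposals first: the mean of
// zero scores is NaN, and `NaN < 0.3` is false, so the check would pass.
if proposal.patterns.is_empty() {
return Ok(ValidationResult::Rejected {
reason: "Empty proposal".to_string(),
});
}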
let quality_scores: Vec<f64> = proposal.patterns.iter()
.map(|p| p.quality)
.collect();
let avg_quality = quality_scores.iter().sum::<f64>() / quality_scores.len() as f64;
if avg_quality < 0.3 {
return Ok(ValidationResult::Rejected {
reason: "Low quality patterns".to_string(),
});
}
// 4. Check for duplicate patterns
let duplicates = self.check_duplicates(&proposal.patterns).await?;
if duplicates > proposal.patterns.len() / 2 {
return Ok(ValidationResult::Rejected {
reason: "Too many duplicate patterns".to_string(),
});
}
// 5. Compute accuracy improvement (sample-based)
let improvement = self.estimate_improvement(&proposal.patterns).await?;
Ok(ValidationResult::Accepted {
quality_score: avg_quality,
improvement_estimate: improvement,
validator: self.identity.node_id.clone(),
})
}
/// Submit validation to QuDAG
pub async fn submit_validation(
&self,
proposal_id: ProposalId,
result: &ValidationResult,
) -> Result<(), QuDagError> {
let validation = Validation {
proposal_id,
result: result.clone(),
validator: self.identity.node_id.clone(),
timestamp: SystemTime::now(),
signature: None,
};
let signed = self.sign_validation(validation)?;
let client = McpClient::connect(&self.config.node_url).await?;
client.call_tool("dag_submit", json!({
"vertex_type": "pattern_validation",
"payload": signed.to_encrypted_bytes(&self.get_network_key())?,
"parents": [proposal_id.to_string()],
})).await?;
Ok(())
}
}
#[derive(Clone, Debug)]
pub enum ValidationResult {
Accepted {
quality_score: f64,
improvement_estimate: f32,
validator: NodeId,
},
Rejected {
reason: String,
},
}
```
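Step 3 of `validate_proposal` averages the quality scores by dividing by the number of patterns. A minimal sketch of the same gate, written so that an empty score list can never pass, is shown below. The helper names `mean_quality` and `passes_quality_gate` are hypothetical and do not appear in the source.

```rust
/// Hypothetical helper: mean of the quality scores, or `None` when empty.
pub fn mean_quality(scores: &[f64]) -> Option<f64> {
    if scores.is_empty() {
        return None;
    }
    Some(scores.iter().sum::<f64>() / scores.len() as f64)
}

/// Returns true only when scores exist and their mean reaches `threshold`.
pub fn passes_quality_gate(scores: &[f64], threshold: f64) -> bool {
    matches!(mean_quality(scores), Some(avg) if avg >= threshold)
}
```

Expressing the check as "mean is at least the threshold" instead of "mean is below the threshold" also rejects NaN scores, since every comparison with NaN is false.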
### Pattern Synchronization
```rust
impl QuDagClient {
/// Sync consensus patterns from QuDAG
pub async fn sync_patterns(&self) -> Result<SyncResult, QuDagError> {
let start = Instant::now();
// 1. Get latest consensus patterns
let client = McpClient::connect(&self.config.node_url).await?;
let response = client.call_tool("dag_query", json!({
"query_type": "consensus_patterns",
"since": self.last_sync_timestamp(),
"limit": 1000,
})).await?;
let consensus_patterns: Vec<ConsensusPattern> = serde_json::from_value(
response["patterns"].clone()
)?;
// 2. Verify signatures
let verified: Vec<_> = consensus_patterns.into_iter()
.filter(|p| self.verify_consensus_signature(p).unwrap_or(false))
.collect();
// 3. Update local cache
let mut new_count = 0;
for pattern in &verified {
if !self.pattern_cache.contains_key(&pattern.id) {
self.pattern_cache.insert(pattern.id, pattern.clone());
new_count += 1;
}
}
// 4. Update local ReasoningBank
let imported = self.import_to_reasoning_bank(&verified)?;
Ok(SyncResult {
patterns_received: verified.len(),
new_patterns: new_count,
patterns_imported: imported,
duration: start.elapsed(),
})
}
/// Import consensus patterns to local ReasoningBank
fn import_to_reasoning_bank(&self, patterns: &[ConsensusPattern]) -> Result<usize, QuDagError> {
let engines = get_all_dag_engines();
let mut imported = 0;
for pattern in patterns {
// Store the pattern in every local engine (no per-type matching yet),
// so `imported` counts one entry per (pattern, engine) pair
for engine in &engines {
let local_pattern = LearnedDagPattern {
id: self.generate_local_pattern_id(),
centroid: pattern.centroid.clone(),
optimal_params: ExecutionParams::default(),
optimal_attention: pattern.attention_type.clone(),
confidence: pattern.consensus_confidence,
sample_count: pattern.total_samples,
avg_metrics: AverageMetrics {
latency_us: 0.0, // Unknown from consensus
memory_bytes: 0.0,
quality: pattern.avg_quality,
},
updated_at: SystemTime::now(),
};
let mut bank = engine.dag_reasoning_bank.write();
bank.store(local_pattern);
imported += 1;
}
}
Ok(imported)
}
}
#[derive(Clone, Debug)]
pub struct ConsensusPattern {
pub id: PatternId,
pub centroid: Vec<f32>,
pub attention_type: DagAttentionType,
pub avg_quality: f64,
pub total_samples: usize,
pub consensus_confidence: f32,
pub validators: Vec<NodeId>,
pub signatures: Vec<MlDsaSignature>,
pub finalized_at: SystemTime,
}
#[derive(Clone, Debug)]
pub struct SyncResult {
pub patterns_received: usize,
pub new_patterns: usize,
pub patterns_imported: usize,
pub duration: Duration,
}
```
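The cache update in step 3 checks `contains_key` and then calls `insert`, which takes two lookups and leaves a gap between them under concurrent use. The sketch below shows the same "count only new patterns" logic with a single lookup through the standard `HashMap` entry API. It is a simplified, single-threaded illustration: `CachedPattern` and `merge_into_cache` are hypothetical names, and the source appears to use a concurrent map rather than `HashMap`.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical, trimmed-down stand-in for `ConsensusPattern`.
#[derive(Clone, Debug, PartialEq)]
pub struct CachedPattern {
    pub id: u64,
    pub avg_quality: f64,
}

/// Inserts patterns whose id is not cached yet and returns how many were new.
pub fn merge_into_cache(
    cache: &mut HashMap<u64, CachedPattern>,
    incoming: &[CachedPattern],
) -> usize {
    let mut new_count = 0;
    for pattern in incoming {
        // A vacant entry means the id has not been seen before.
        if let Entry::Vacant(slot) = cache.entry(pattern.id) {
            slot.insert(pattern.clone());
            new_count += 1;
        }
    }
    new_count
}
```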
## rUv Token Integration
### Token Economy
```rust
pub struct RuvTokenClient {
/// QuDAG client reference
qudag: Arc<QuDagClient>,
/// Local balance cache
balance_cache: AtomicU64,
/// Pending rewards
pending_rewards: DashMap<TransactionId, PendingReward>,
}
impl RuvTokenClient {
/// Check rUv balance
pub async fn get_balance(&self) -> Result<u64, QuDagError> {
let client = McpClient::connect(&self.qudag.config.node_url).await?;
let response = client.call_tool("ruv_balance", json!({
"address": self.qudag.identity.node_id.to_string(),
})).await?;
let balance = response["balance"].as_u64().unwrap_or(0);
self.balance_cache.store(balance, Ordering::Relaxed);
Ok(balance)
}
/// Stake rUv for pattern proposal
pub async fn stake(&self, amount: u64) -> Result<TransactionId, QuDagError> {
let client = McpClient::connect(&self.qudag.config.node_url).await?;
let response = client.call_tool("ruv_stake", json!({
"amount": amount,
"purpose": "pattern_proposal",
"signature": self.sign_stake_request(amount)?,
})).await?;
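// Avoid panicking on a malformed response: a missing tx_id becomes an error
let tx_id = response["tx_id"].as_str().ok_or_else(|| {
QuDagError::SubmissionFailed("Stake response is missing tx_id".to_string())
})?;
Ok(TransactionId::from_str(tx_id)?)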
}
/// Claim rewards for accepted patterns
pub async fn claim_rewards(&self) -> Result<ClaimResult, QuDagError> {
let client = McpClient::connect(&self.qudag.config.node_url).await?;
let response = client.call_tool("ruv_claim_rewards", json!({
"address": self.qudag.identity.node_id.to_string(),
"signature": self.sign_claim_request()?,
})).await?;
let claimed = response["claimed"].as_u64().unwrap_or(0);
let new_balance = response["new_balance"].as_u64().unwrap_or(0);
self.balance_cache.store(new_balance, Ordering::Relaxed);
Ok(ClaimResult {
amount_claimed: claimed,
new_balance,
})
}
}
/// Reward structure
#[derive(Clone, Debug)]
pub struct RewardStructure {
/// Base reward for accepted pattern
pub pattern_accepted: u64, // 10 rUv
/// Bonus for accuracy improvement
pub accuracy_bonus_per_percent: u64, // 10 rUv per 1%
/// Validation reward
pub validation_reward: u64, // 2 rUv
/// Penalty for rejected pattern
pub rejection_penalty: u64, // 5 rUv
/// Byzantine behavior penalty
pub byzantine_penalty: u64, // 1000 rUv
}
impl Default for RewardStructure {
fn default() -> Self {
Self {
pattern_accepted: 10,
accuracy_bonus_per_percent: 10,
validation_reward: 2,
rejection_penalty: 5,
byzantine_penalty: 1000,
}
}
}
```
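The source lists the individual reward amounts but does not say how they combine into a payout. The sketch below is one plausible reading, offered only as an illustration: an accepted pattern earns the base reward plus the accuracy bonus for each whole percentage point of improvement, and a rejected pattern costs the rejection penalty. `proposal_payout` is a hypothetical function, and the struct is a trimmed copy of `RewardStructure`.

```rust
/// Trimmed copy of the reward table from the listing above.
#[derive(Clone, Debug)]
pub struct RewardStructure {
    pub pattern_accepted: u64,
    pub accuracy_bonus_per_percent: u64,
    pub rejection_penalty: u64,
}

/// Hypothetical payout rule (an assumption, not taken from the source):
/// accepted = base + bonus * whole percent improvement, rejected = -penalty.
pub fn proposal_payout(
    rewards: &RewardStructure,
    accepted: bool,
    improvement_percent: u64,
) -> i64 {
    if accepted {
        let bonus = rewards.accuracy_bonus_per_percent * improvement_percent;
        (rewards.pattern_accepted + bonus) as i64
    } else {
        -(rewards.rejection_penalty as i64)
    }
}
```

With the default values (10, 10, 5), an accepted pattern that improves accuracy by 3% would earn 40 rUv under this rule, and a rejected one would lose 5 rUv.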
## Security Layer
### ML-KEM Encryption
```rust
pub struct PatternEncryption {
/// Network public key (for encryption)
network_key: MlKemPublicKey,
/// Local private key (for decryption)
local_key: MlKemPrivateKey,
}
impl PatternEncryption {
/// Encrypt pattern for network transmission
pub fn encrypt(&self, pattern: &NoisyPattern) -> Result<EncryptedPattern, CryptoError> {
let plaintext = pattern.to_bytes();
// Encapsulate shared secret
let (ciphertext, shared_secret) = self.network_key.encapsulate()?;
// Derive key from shared secret
let key = blake3::derive_key("QuDAG Pattern Encryption", &shared_secret);
// Encrypt with ChaCha20-Poly1305
let nonce = generate_nonce();
let encrypted = chacha20_poly1305_encrypt(&key, &nonce, &plaintext)?;
Ok(EncryptedPattern {
ciphertext,
encrypted_data: encrypted,
nonce,
})
}
/// Decrypt pattern from network
pub fn decrypt(&self, encrypted: &EncryptedPattern) -> Result<NoisyPattern, CryptoError> {
// Decapsulate shared secret
let shared_secret = self.local_key.decapsulate(&encrypted.ciphertext)?;
// Derive key
let key = blake3::derive_key("QuDAG Pattern Encryption", &shared_secret);
// Decrypt
let plaintext = chacha20_poly1305_decrypt(
&key,
&encrypted.nonce,
&encrypted.encrypted_data,
)?;
NoisyPattern::from_bytes(&plaintext)
}
}
```
### ML-DSA Signatures
```rust
pub struct PatternSigning {
/// Signing key
private_key: MlDsaPrivateKey,
/// Verification key
public_key: MlDsaPublicKey,
}
impl PatternSigning {
/// Sign pattern proposal
pub fn sign_proposal(&self, proposal: &PatternProposal) -> Result<MlDsaSignature, CryptoError> {
let message = proposal.to_signing_bytes();
self.private_key.sign(&message)
}
/// Verify proposal signature
pub fn verify_proposal(
&self,
proposal: &PatternProposal,
public_key: &MlDsaPublicKey,
) -> Result<bool, CryptoError> {
let message = proposal.to_signing_bytes();
let signature = proposal.signature.as_ref()
.ok_or(CryptoError::MissingSignature)?;
public_key.verify(&message, signature)
}
/// Sign validation
pub fn sign_validation(&self, validation: &Validation) -> Result<MlDsaSignature, CryptoError> {
let message = validation.to_signing_bytes();
self.private_key.sign(&message)
}
}
```
## SQL Interface
```sql
-- Enable QuDAG integration
SELECT ruvector_dag_qudag_enable('{
"node_url": "https://yyz.qudag.darknet/mcp",
"dp_epsilon": 1.0,
"min_stake_ruv": 10
}'::jsonb);
-- Register identity
SELECT ruvector_dag_qudag_register();
-- Propose patterns for consensus
SELECT ruvector_dag_qudag_propose('documents');
-- Sync consensus patterns
SELECT ruvector_dag_qudag_sync();
-- Get rUv balance
SELECT ruvector_dag_ruv_balance();
-- Claim rewards
SELECT ruvector_dag_ruv_claim();
-- Get QuDAG status
SELECT ruvector_dag_qudag_status();
```
## Configuration
### PostgreSQL GUC Variables
```sql
-- Enable/disable QuDAG
SET ruvector.dag_qudag_enabled = true;
-- QuDAG node URL
SET ruvector.dag_qudag_node_url = 'https://yyz.qudag.darknet/mcp';
-- Differential privacy epsilon
SET ruvector.dag_qudag_dp_epsilon = 1.0;
-- Sync interval (seconds)
SET ruvector.dag_qudag_sync_interval = 3600;
-- Minimum stake for proposals
SET ruvector.dag_qudag_min_stake = 10;
```
## Metrics
```rust
// `AtomicU64` does not implement `Clone`, so `Clone` cannot be derived here
#[derive(Debug, Default)]
pub struct QuDagMetrics {
pub proposals_submitted: AtomicU64,
pub proposals_accepted: AtomicU64,
pub proposals_rejected: AtomicU64,
pub validations_performed: AtomicU64,
pub patterns_synced: AtomicU64,
pub ruv_earned: AtomicU64,
pub ruv_spent: AtomicU64,
pub last_sync_time: AtomicU64,
}
impl QuDagMetrics {
pub fn to_json(&self) -> serde_json::Value {
json!({
"proposals_submitted": self.proposals_submitted.load(Ordering::Relaxed),
"proposals_accepted": self.proposals_accepted.load(Ordering::Relaxed),
"proposals_rejected": self.proposals_rejected.load(Ordering::Relaxed),
"acceptance_rate": self.acceptance_rate(),
"validations_performed": self.validations_performed.load(Ordering::Relaxed),
"patterns_synced": self.patterns_synced.load(Ordering::Relaxed),
"ruv_net": self.ruv_net(),
})
}
fn acceptance_rate(&self) -> f64 {
let submitted = self.proposals_submitted.load(Ordering::Relaxed);
let accepted = self.proposals_accepted.load(Ordering::Relaxed);
if submitted > 0 {
accepted as f64 / submitted as f64
} else {
0.0
}
}
fn ruv_net(&self) -> i64 {
self.ruv_earned.load(Ordering::Relaxed) as i64
- self.ruv_spent.load(Ordering::Relaxed) as i64
}
}
```
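Atomic counters cannot be cloned, so a common way to hand metrics to callers is to copy the current values into a plain snapshot struct. The sketch below illustrates that pattern with two of the counters; `ProposalCounters`, `CounterSnapshot`, and `snapshot` are hypothetical names rather than part of the documented API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical subset of the metrics counters.
#[derive(Debug, Default)]
pub struct ProposalCounters {
    pub submitted: AtomicU64,
    pub accepted: AtomicU64,
}

/// Plain-value copy of the counters that is cheap to clone and compare.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CounterSnapshot {
    pub submitted: u64,
    pub accepted: u64,
}

impl ProposalCounters {
    /// Reads each counter once. The two loads are not taken atomically
    /// together, which is usually acceptable for monitoring output.
    pub fn snapshot(&self) -> CounterSnapshot {
        CounterSnapshot {
            submitted: self.submitted.load(Ordering::Relaxed),
            accepted: self.accepted.load(Ordering::Relaxed),
        }
    }
}

impl CounterSnapshot {
    /// Fraction of submitted proposals that were accepted (0.0 when none).
    pub fn acceptance_rate(&self) -> f64 {
        if self.submitted > 0 {
            self.accepted as f64 / self.submitted as f64
        } else {
            0.0
        }
    }
}
```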

790
vendor/ruvector/docs/dag/09-SQL-API.md vendored Normal file

@@ -0,0 +1,790 @@
# SQL API Reference
## Overview
Complete SQL API for the Neural DAG Learning system integrated with RuVector-Postgres.
## Configuration Functions
### System Configuration
```sql
-- Enable/disable neural DAG learning
SELECT ruvector.dag_set_enabled(enabled BOOLEAN) RETURNS VOID;
-- Configure learning rate
SELECT ruvector.dag_set_learning_rate(rate FLOAT8) RETURNS VOID;
-- Set attention mechanism
SELECT ruvector.dag_set_attention(
mechanism TEXT -- 'topological', 'causal_cone', 'critical_path',
-- 'mincut_gated', 'hierarchical_lorentz',
-- 'parallel_branch', 'temporal_btsp', 'auto'
) RETURNS VOID;
-- Configure SONA parameters
SELECT ruvector.dag_configure_sona(
micro_lora_rank INT DEFAULT 2,
base_lora_rank INT DEFAULT 8,
ewc_lambda FLOAT8 DEFAULT 5000.0,
pattern_clusters INT DEFAULT 100
) RETURNS VOID;
-- Set QuDAG network endpoint
SELECT ruvector.dag_set_qudag_endpoint(
endpoint TEXT,
stake_amount FLOAT8 DEFAULT 0.0
) RETURNS VOID;
```
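`dag_set_attention` accepts a fixed set of mechanism names. The sketch below shows how the Rust side of the extension might validate that argument. It is illustrative only: the `AttentionMechanism` enum and `parse_mechanism` function are hypothetical, and the error text simply reuses the `RV002 INVALID_ATTENTION` code from the error table in this document.

```rust
/// Hypothetical enum covering the names listed for `dag_set_attention`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AttentionMechanism {
    Topological,
    CausalCone,
    CriticalPath,
    MinCutGated,
    HierarchicalLorentz,
    ParallelBranch,
    TemporalBtsp,
    Auto,
}

/// Parses a mechanism name exactly as written in the SQL API listing.
pub fn parse_mechanism(name: &str) -> Result<AttentionMechanism, String> {
    match name {
        "topological" => Ok(AttentionMechanism::Topological),
        "causal_cone" => Ok(AttentionMechanism::CausalCone),
        "critical_path" => Ok(AttentionMechanism::CriticalPath),
        "mincut_gated" => Ok(AttentionMechanism::MinCutGated),
        "hierarchical_lorentz" => Ok(AttentionMechanism::HierarchicalLorentz),
        "parallel_branch" => Ok(AttentionMechanism::ParallelBranch),
        "temporal_btsp" => Ok(AttentionMechanism::TemporalBtsp),
        "auto" => Ok(AttentionMechanism::Auto),
        other => Err(format!("RV002 INVALID_ATTENTION: '{}'", other)),
    }
}
```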
### Runtime Status
```sql
-- Get current configuration
SELECT * FROM ruvector.dag_config();
-- Returns: (enabled, learning_rate, attention_mechanism,
-- micro_lora_rank, base_lora_rank, ewc_lambda, qudag_endpoint)
-- Get system status
SELECT * FROM ruvector.dag_status();
-- Returns: (active_patterns, total_trajectories, avg_improvement,
-- attention_hits, learning_rate_effective, qudag_connected)
-- Check health
SELECT * FROM ruvector.dag_health_check();
-- Returns: (component, status, last_check, message)
```
## Query Analysis Functions
### Plan Analysis
```sql
-- Analyze query plan and return neural DAG insights
SELECT * FROM ruvector.dag_analyze_plan(
query_text TEXT
) RETURNS TABLE (
node_id INT,
operator_type TEXT,
criticality FLOAT8,
bottleneck_score FLOAT8,
embedding VECTOR(256),
parent_ids INT[],
child_ids INT[],
estimated_cost FLOAT8,
recommendations TEXT[]
);
-- Get critical path for query
SELECT * FROM ruvector.dag_critical_path(
query_text TEXT
) RETURNS TABLE (
path_position INT,
node_id INT,
operator_type TEXT,
accumulated_cost FLOAT8,
attention_weight FLOAT8
);
-- Identify bottlenecks
SELECT * FROM ruvector.dag_bottlenecks(
query_text TEXT,
threshold FLOAT8 DEFAULT 0.7
) RETURNS TABLE (
node_id INT,
operator_type TEXT,
bottleneck_score FLOAT8,
impact_estimate FLOAT8,
suggested_action TEXT
);
-- Get min-cut analysis
SELECT * FROM ruvector.dag_mincut_analysis(
query_text TEXT
) RETURNS TABLE (
cut_id INT,
source_nodes INT[],
sink_nodes INT[],
cut_capacity FLOAT8,
parallelization_opportunity BOOLEAN
);
```
### Query Optimization
```sql
-- Get optimization suggestions
SELECT * FROM ruvector.dag_suggest_optimizations(
query_text TEXT
) RETURNS TABLE (
suggestion_id INT,
category TEXT, -- 'index', 'join_order', 'parallelism', 'memory'
description TEXT,
expected_improvement FLOAT8,
implementation_sql TEXT,
confidence FLOAT8
);
-- Rewrite query using learned patterns
SELECT ruvector.dag_rewrite_query(
query_text TEXT
) RETURNS TEXT;
-- Estimate query with neural predictions
SELECT * FROM ruvector.dag_estimate(
query_text TEXT
) RETURNS TABLE (
metric TEXT,
postgres_estimate FLOAT8,
neural_estimate FLOAT8,
confidence FLOAT8
);
```
## Attention Mechanism Functions
### Attention Scores
```sql
-- Compute attention for query DAG
SELECT * FROM ruvector.dag_attention_scores(
query_text TEXT,
mechanism TEXT DEFAULT 'auto'
) RETURNS TABLE (
node_id INT,
attention_weight FLOAT8,
query_contribution FLOAT8[],
key_contribution FLOAT8[]
);
-- Get attention matrix
SELECT ruvector.dag_attention_matrix(
query_text TEXT,
mechanism TEXT DEFAULT 'auto'
) RETURNS FLOAT8[][];
-- Visualize attention (returns a DOT graph by default; see the format parameter)
SELECT ruvector.dag_attention_visualize(
query_text TEXT,
mechanism TEXT DEFAULT 'auto',
format TEXT DEFAULT 'dot' -- 'dot', 'json', 'ascii'
) RETURNS TEXT;
```
### Attention Configuration
```sql
-- Set attention hyperparameters
SELECT ruvector.dag_attention_configure(
mechanism TEXT,
params JSONB
-- Example params:
-- topological: {"max_depth": 5, "decay_factor": 0.9}
-- causal_cone: {"time_window": 1000, "future_discount": 0.5}
-- critical_path: {"path_weight": 2.0, "branch_penalty": 0.3}
-- mincut_gated: {"gate_threshold": 0.1, "flow_capacity": "cost"}
-- hierarchical_lorentz: {"curvature": -1.0, "time_scale": 0.1}
-- parallel_branch: {"max_branches": 8, "sync_penalty": 0.2}
-- temporal_btsp: {"plateau_duration": 100, "eligibility_decay": 0.95}
) RETURNS VOID;
-- Get attention statistics
SELECT * FROM ruvector.dag_attention_stats()
RETURNS TABLE (
mechanism TEXT,
invocations BIGINT,
avg_latency_us FLOAT8,
hit_rate FLOAT8,
improvement_ratio FLOAT8
);
```
## SONA Learning Functions
### Pattern Management
```sql
-- Store a learned pattern
SELECT ruvector.dag_store_pattern(
pattern_vector VECTOR(256),
pattern_metadata JSONB,
quality_score FLOAT8
) RETURNS BIGINT; -- pattern_id
-- Query similar patterns
SELECT * FROM ruvector.dag_query_patterns(
query_vector VECTOR(256),
k INT DEFAULT 5,
similarity_threshold FLOAT8 DEFAULT 0.7
) RETURNS TABLE (
pattern_id BIGINT,
similarity FLOAT8,
quality_score FLOAT8,
metadata JSONB,
usage_count INT,
last_used TIMESTAMPTZ
);
-- Get pattern clusters (ReasoningBank)
SELECT * FROM ruvector.dag_pattern_clusters()
RETURNS TABLE (
cluster_id INT,
centroid VECTOR(256),
member_count INT,
avg_quality FLOAT8,
representative_query TEXT
);
-- Force pattern consolidation
SELECT ruvector.dag_consolidate_patterns(
target_clusters INT DEFAULT 100
) RETURNS TABLE (
clusters_before INT,
clusters_after INT,
patterns_merged INT,
consolidation_time_ms FLOAT8
);
```
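`dag_query_patterns` takes a query vector, a result count `k`, and a similarity threshold. The sketch below shows the brute-force equivalent of that lookup, assuming cosine similarity (suggested by the `vector_cosine_ops` index in the installation SQL, but not confirmed by the source). The function names are hypothetical, and a real implementation would use the vector index instead of scanning every pattern.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either is all zeros.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical brute-force lookup: keep patterns at or above `threshold`,
/// order them by descending similarity, and return at most `k` of them.
pub fn query_patterns(
    patterns: &[(i64, Vec<f32>)],
    query: &[f32],
    k: usize,
    threshold: f32,
) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = patterns
        .iter()
        .map(|(id, vector)| (*id, cosine_similarity(vector, query)))
        .filter(|(_, similarity)| *similarity >= threshold)
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```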
### Trajectory Management
```sql
-- Record a learning trajectory
SELECT ruvector.dag_record_trajectory(
query_hash BIGINT,
dag_structure JSONB,
execution_time_ms FLOAT8,
improvement_ratio FLOAT8,
attention_mechanism TEXT
) RETURNS BIGINT; -- trajectory_id
-- Get trajectory history
SELECT * FROM ruvector.dag_trajectory_history(
time_range TSTZRANGE DEFAULT NULL,
min_improvement FLOAT8 DEFAULT 0.0,
limit_count INT DEFAULT 100
) RETURNS TABLE (
trajectory_id BIGINT,
query_hash BIGINT,
recorded_at TIMESTAMPTZ,
execution_time_ms FLOAT8,
improvement_ratio FLOAT8,
attention_mechanism TEXT
);
-- Analyze trajectory trends
SELECT * FROM ruvector.dag_trajectory_trends(
window_size INTERVAL DEFAULT '1 hour'
) RETURNS TABLE (
window_start TIMESTAMPTZ,
trajectory_count INT,
avg_improvement FLOAT8,
best_mechanism TEXT,
pattern_discoveries INT
);
```
### Learning Control
```sql
-- Trigger immediate learning cycle
SELECT ruvector.dag_learn_now() RETURNS TABLE (
patterns_updated INT,
new_clusters INT,
ewc_constraints_updated INT,
cycle_time_ms FLOAT8
);
-- Reset learning state (use with caution)
SELECT ruvector.dag_reset_learning(
preserve_patterns BOOLEAN DEFAULT TRUE,
preserve_trajectories BOOLEAN DEFAULT FALSE
) RETURNS VOID;
-- Export learned state
SELECT ruvector.dag_export_state() RETURNS BYTEA;
-- Import learned state
SELECT ruvector.dag_import_state(state_data BYTEA) RETURNS TABLE (
patterns_imported INT,
trajectories_imported INT,
clusters_restored INT
);
-- Get EWC constraint info
SELECT * FROM ruvector.dag_ewc_constraints()
RETURNS TABLE (
parameter_name TEXT,
fisher_importance FLOAT8,
optimal_value FLOAT8,
last_updated TIMESTAMPTZ
);
```
## Self-Healing Functions
### Health Monitoring
```sql
-- Run comprehensive health check
SELECT * FROM ruvector.dag_health_report()
RETURNS TABLE (
subsystem TEXT,
status TEXT,
score FLOAT8,
issues TEXT[],
recommendations TEXT[]
);
-- Get anomaly detection results
SELECT * FROM ruvector.dag_anomalies(
time_range TSTZRANGE DEFAULT tstzrange(now() - interval '1 hour', now(), '[]')
) RETURNS TABLE (
anomaly_id BIGINT,
detected_at TIMESTAMPTZ,
anomaly_type TEXT,
severity TEXT,
affected_component TEXT,
z_score FLOAT8,
resolved BOOLEAN
);
-- Check index health
SELECT * FROM ruvector.dag_index_health()
RETURNS TABLE (
index_name TEXT,
index_type TEXT,
fragmentation FLOAT8,
recall_estimate FLOAT8,
recommended_action TEXT
);
-- Check learning drift
SELECT * FROM ruvector.dag_learning_drift()
RETURNS TABLE (
metric TEXT,
current_value FLOAT8,
baseline_value FLOAT8,
drift_magnitude FLOAT8,
trend TEXT
);
```
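`dag_anomalies` reports a `z_score` for each anomaly. The source does not describe how the score is computed, so the sketch below shows the textbook definition as an assumption: the distance of an observation from the historical mean, measured in sample standard deviations. The function name and the choice of the sample (n - 1) variance are illustrative.

```rust
/// Z-score of `observed` relative to `history`.
/// Returns `None` when there are fewer than two history points or when the
/// history has zero variance, since the score is undefined in both cases.
pub fn z_score(history: &[f64], observed: f64) -> Option<f64> {
    if history.len() < 2 {
        return None;
    }
    let n = history.len() as f64;
    let mean = history.iter().sum::<f64>() / n;
    let variance = history.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / (n - 1.0);
    let std_dev = variance.sqrt();
    if std_dev == 0.0 {
        return None;
    }
    Some((observed - mean) / std_dev)
}
```

A detector built on this would typically flag observations whose absolute score exceeds a chosen cutoff; the cutoff used by the extension is not given in the source.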
### Repair Operations
```sql
-- Trigger automatic repair
SELECT * FROM ruvector.dag_auto_repair()
RETURNS TABLE (
repair_id BIGINT,
repair_type TEXT,
target TEXT,
status TEXT,
duration_ms FLOAT8
);
-- Rebalance specific index
SELECT ruvector.dag_rebalance_index(
index_name TEXT,
target_recall FLOAT8 DEFAULT 0.95
) RETURNS TABLE (
vectors_moved INT,
new_recall FLOAT8,
duration_ms FLOAT8
);
-- Reset pattern quality scores
SELECT ruvector.dag_reset_pattern_quality(
pattern_ids BIGINT[] DEFAULT NULL -- NULL = all patterns
) RETURNS INT; -- patterns reset
-- Force cluster recomputation
SELECT ruvector.dag_recompute_clusters(
algorithm TEXT DEFAULT 'kmeans_pp'
) RETURNS TABLE (
old_clusters INT,
new_clusters INT,
silhouette_score FLOAT8
);
```
## QuDAG Integration Functions
### Network Operations
```sql
-- Connect to QuDAG network
SELECT ruvector.qudag_connect(
endpoint TEXT,
identity_key BYTEA DEFAULT NULL -- auto-generate if NULL
) RETURNS TABLE (
connected BOOLEAN,
node_id TEXT,
network_version TEXT
);
-- Get network status
SELECT * FROM ruvector.qudag_status()
RETURNS TABLE (
connected BOOLEAN,
node_id TEXT,
peers INT,
latest_round BIGINT,
sync_status TEXT
);
-- Propose pattern to network
SELECT ruvector.qudag_propose_pattern(
pattern_vector VECTOR(256),
metadata JSONB,
stake_amount FLOAT8 DEFAULT 0.0
) RETURNS TABLE (
proposal_id TEXT,
submitted_at TIMESTAMPTZ,
status TEXT
);
-- Check proposal status
SELECT * FROM ruvector.qudag_proposal_status(
proposal_id TEXT
) RETURNS TABLE (
status TEXT,
votes_for INT,
votes_against INT,
finalized BOOLEAN,
finalized_at TIMESTAMPTZ
);
-- Sync patterns from network
SELECT * FROM ruvector.qudag_sync_patterns(
since_round BIGINT DEFAULT 0
) RETURNS TABLE (
patterns_received INT,
patterns_applied INT,
conflicts_resolved INT
);
```
### Token Operations
```sql
-- Get rUv balance
SELECT ruvector.qudag_balance() RETURNS FLOAT8;
-- Stake tokens
SELECT ruvector.qudag_stake(
amount FLOAT8
) RETURNS TABLE (
new_stake FLOAT8,
tx_hash TEXT
);
-- Claim rewards
SELECT * FROM ruvector.qudag_claim_rewards()
RETURNS TABLE (
amount FLOAT8,
tx_hash TEXT,
source TEXT
);
-- Get staking info
SELECT * FROM ruvector.qudag_staking_info()
RETURNS TABLE (
staked_amount FLOAT8,
pending_rewards FLOAT8,
lock_until TIMESTAMPTZ,
apr_estimate FLOAT8
);
```
### Cryptographic Operations
```sql
-- Generate ML-KEM keypair
SELECT ruvector.qudag_generate_kem_keypair()
RETURNS TABLE (
public_key BYTEA,
secret_key_id TEXT -- stored securely
);
-- Encrypt data for peer
SELECT ruvector.qudag_encrypt(
plaintext BYTEA,
recipient_pubkey BYTEA
) RETURNS TABLE (
ciphertext BYTEA,
encapsulated_key BYTEA
);
-- Decrypt received data
SELECT ruvector.qudag_decrypt(
ciphertext BYTEA,
encapsulated_key BYTEA,
secret_key_id TEXT
) RETURNS BYTEA;
-- Sign data
SELECT ruvector.qudag_sign(
data BYTEA
) RETURNS BYTEA; -- ML-DSA signature
-- Verify signature
SELECT ruvector.qudag_verify(
data BYTEA,
signature BYTEA,
pubkey BYTEA
) RETURNS BOOLEAN;
```
## Monitoring and Statistics
### Performance Metrics
```sql
-- Get overall statistics
SELECT * FROM ruvector.dag_statistics()
RETURNS TABLE (
metric TEXT,
value FLOAT8,
unit TEXT,
updated_at TIMESTAMPTZ
);
-- Get latency breakdown
SELECT * FROM ruvector.dag_latency_breakdown(
time_range TSTZRANGE DEFAULT tstzrange(now() - interval '1 hour', now(), '[]')
) RETURNS TABLE (
component TEXT,
p50_us FLOAT8,
p95_us FLOAT8,
p99_us FLOAT8,
max_us FLOAT8
);
-- Get memory usage
SELECT * FROM ruvector.dag_memory_usage()
RETURNS TABLE (
component TEXT,
allocated_bytes BIGINT,
used_bytes BIGINT,
peak_bytes BIGINT
);
-- Get throughput metrics
SELECT * FROM ruvector.dag_throughput(
window_size INTERVAL DEFAULT '1 minute' -- WINDOW is a reserved word in PostgreSQL
) RETURNS TABLE (
metric TEXT,
count BIGINT,
per_second FLOAT8
);
```
### Debugging
```sql
-- Enable debug logging
SELECT ruvector.dag_set_debug(
enabled BOOLEAN,
components TEXT[] DEFAULT ARRAY['all']
) RETURNS VOID;
-- Get recent debug logs
SELECT * FROM ruvector.dag_debug_logs(
since TIMESTAMPTZ DEFAULT now() - interval '5 minutes',
component TEXT DEFAULT NULL,
severity TEXT DEFAULT NULL -- 'debug', 'info', 'warn', 'error'
) RETURNS TABLE (
logged_at TIMESTAMPTZ,
component TEXT,
severity TEXT,
message TEXT,
context JSONB
);
-- Trace single query
SELECT * FROM ruvector.dag_trace_query(
query_text TEXT
) RETURNS TABLE (
step INT,
operation TEXT,
duration_us FLOAT8,
details JSONB
);
-- Export diagnostics bundle
SELECT ruvector.dag_export_diagnostics() RETURNS BYTEA;
```
## Batch Operations
### Bulk Processing
```sql
-- Analyze multiple queries
SELECT * FROM ruvector.dag_bulk_analyze(
queries TEXT[]
) RETURNS TABLE (
query_index INT,
bottleneck_count INT,
suggested_mechanism TEXT,
estimated_improvement FLOAT8
);
-- Pre-warm patterns for workload
SELECT ruvector.dag_prewarm_patterns(
representative_queries TEXT[]
) RETURNS TABLE (
patterns_loaded INT,
cache_hit_rate FLOAT8
);
-- Batch record trajectories
SELECT ruvector.dag_bulk_record_trajectories(
trajectories JSONB[]
) RETURNS INT; -- trajectories recorded
```
## Views
### System Views
```sql
-- Active configuration
CREATE VIEW ruvector.dag_active_config AS
SELECT * FROM ruvector.dag_config();
-- Recent patterns
CREATE VIEW ruvector.dag_recent_patterns AS
SELECT pattern_id, created_at, quality_score, usage_count
FROM ruvector.dag_patterns
WHERE created_at > now() - interval '24 hours'
ORDER BY quality_score DESC;
-- Attention effectiveness
CREATE VIEW ruvector.dag_attention_effectiveness AS
SELECT
attention_mechanism AS mechanism,
count(*) as uses,
avg(improvement_ratio) as avg_improvement,
percentile_cont(0.95) WITHIN GROUP (ORDER BY improvement_ratio) as p95_improvement
FROM ruvector.dag_trajectories
WHERE recorded_at > now() - interval '7 days'
GROUP BY attention_mechanism;
-- Health summary
CREATE VIEW ruvector.dag_health_summary AS
SELECT
subsystem,
status,
score,
coalesce(array_length(issues, 1), 0) as issue_count -- array_length is NULL for empty arrays
FROM ruvector.dag_health_report();
```
## Installation SQL
```sql
-- Create extension
CREATE EXTENSION IF NOT EXISTS ruvector_dag CASCADE;
-- Required tables (auto-created by extension)
-- ruvector.dag_patterns - Learned patterns storage
-- ruvector.dag_trajectories - Learning trajectory history
-- ruvector.dag_clusters - Pattern clusters (ReasoningBank)
-- ruvector.dag_anomalies - Detected anomalies log
-- ruvector.dag_repairs - Repair history
-- ruvector.dag_qudag_proposals - QuDAG proposal tracking
-- Recommended indexes
CREATE INDEX ON ruvector.dag_patterns USING hnsw (pattern_vector vector_cosine_ops);
CREATE INDEX ON ruvector.dag_trajectories (recorded_at DESC);
CREATE INDEX ON ruvector.dag_trajectories (query_hash);
CREATE INDEX ON ruvector.dag_anomalies (detected_at DESC) WHERE NOT resolved;
-- Grant permissions
GRANT USAGE ON SCHEMA ruvector TO PUBLIC;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA ruvector TO PUBLIC;
GRANT SELECT ON ALL TABLES IN SCHEMA ruvector TO PUBLIC;
```
## Usage Examples
### Basic Query Optimization
```sql
-- Enable neural DAG learning
SELECT ruvector.dag_set_enabled(true);
-- Analyze a query
SELECT * FROM ruvector.dag_analyze_plan($$
SELECT v.*, m.category
FROM vectors v
JOIN metadata m ON v.id = m.vector_id
WHERE v.embedding <-> $1 < 0.5
ORDER BY v.embedding <-> $1
LIMIT 100
$$);
-- Get optimization suggestions
SELECT * FROM ruvector.dag_suggest_optimizations($$
SELECT v.*, m.category
FROM vectors v
JOIN metadata m ON v.id = m.vector_id
WHERE v.embedding <-> $1 < 0.5
ORDER BY v.embedding <-> $1
LIMIT 100
$$);
```
### Attention Mechanism Selection
```sql
-- Let system choose best attention
SELECT ruvector.dag_set_attention('auto');
-- Or manually select based on workload
-- For deep query plans:
SELECT ruvector.dag_set_attention('topological');
-- For time-series workloads:
SELECT ruvector.dag_set_attention('causal_cone');
-- For CPU-bound queries:
SELECT ruvector.dag_set_attention('critical_path');
```
### Distributed Learning with QuDAG
```sql
-- Connect to QuDAG network
SELECT * FROM ruvector.qudag_connect(
'https://qudag.example.com:8443'
);
-- Stake tokens for participation
SELECT * FROM ruvector.qudag_stake(100.0);
-- Patterns are now automatically shared and validated
-- Check sync status
SELECT * FROM ruvector.qudag_status();
```
## Error Codes
| Code | Name | Description |
|------|------|-------------|
| RV001 | DAG_DISABLED | Neural DAG learning is disabled |
| RV002 | INVALID_ATTENTION | Unknown attention mechanism |
| RV003 | PATTERN_NOT_FOUND | Referenced pattern does not exist |
| RV004 | LEARNING_FAILED | Learning cycle failed |
| RV005 | QUDAG_DISCONNECTED | Not connected to QuDAG network |
| RV006 | QUDAG_AUTH_FAILED | QuDAG authentication failed |
| RV007 | INSUFFICIENT_STAKE | Not enough staked tokens |
| RV008 | CRYPTO_ERROR | Cryptographic operation failed |
| RV009 | REPAIR_FAILED | Self-healing repair failed |
| RV010 | TRAJECTORY_OVERFLOW | Trajectory buffer full |
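
On the Rust side, the codes in the table above could be represented as an enum so that call sites cannot mistype a code string. The sketch below is one hypothetical way to do that; `DagErrorCode`, `code`, and `from_code` are illustrative names and are not taken from the source.

```rust
/// Hypothetical enum mirroring the error code table (RV001 to RV010).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DagErrorCode {
    DagDisabled,
    InvalidAttention,
    PatternNotFound,
    LearningFailed,
    QudagDisconnected,
    QudagAuthFailed,
    InsufficientStake,
    CryptoError,
    RepairFailed,
    TrajectoryOverflow,
}

/// Every variant, used for reverse lookup by code string.
const ALL_ERROR_CODES: [DagErrorCode; 10] = [
    DagErrorCode::DagDisabled,
    DagErrorCode::InvalidAttention,
    DagErrorCode::PatternNotFound,
    DagErrorCode::LearningFailed,
    DagErrorCode::QudagDisconnected,
    DagErrorCode::QudagAuthFailed,
    DagErrorCode::InsufficientStake,
    DagErrorCode::CryptoError,
    DagErrorCode::RepairFailed,
    DagErrorCode::TrajectoryOverflow,
];

impl DagErrorCode {
    /// The code string as it appears in the table.
    pub fn code(self) -> &'static str {
        match self {
            DagErrorCode::DagDisabled => "RV001",
            DagErrorCode::InvalidAttention => "RV002",
            DagErrorCode::PatternNotFound => "RV003",
            DagErrorCode::LearningFailed => "RV004",
            DagErrorCode::QudagDisconnected => "RV005",
            DagErrorCode::QudagAuthFailed => "RV006",
            DagErrorCode::InsufficientStake => "RV007",
            DagErrorCode::CryptoError => "RV008",
            DagErrorCode::RepairFailed => "RV009",
            DagErrorCode::TrajectoryOverflow => "RV010",
        }
    }

    /// Looks up a variant by its code string.
    pub fn from_code(code: &str) -> Option<Self> {
        ALL_ERROR_CODES.iter().copied().find(|e| e.code() == code)
    }
}
```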
---
*Document: 09-SQL-API.md | Version: 1.0 | Last Updated: 2025-01-XX*

File diff suppressed because it is too large


@@ -0,0 +1,881 @@
# Agent Task Assignments
## Overview
Task breakdown for 15-agent swarm implementing the Neural DAG Learning system. Each agent has specific responsibilities, dependencies, and deliverables.
## Swarm Topology
```
                    ┌─────────────────────┐
                    │     QUEEN AGENT     │
                    │    (Coordinator)    │
                    │      Agent #0       │
                    └──────────┬──────────┘
        ┌──────────────────────┼──────────────────────┐
        │                      │                      │
        ▼                      ▼                      ▼
┌───────────────┐      ┌───────────────┐      ┌───────────────┐
│   CORE TEAM   │      │ POSTGRES TEAM │      │  QUDAG TEAM   │
│  Agents 1-5   │      │  Agents 6-9   │      │ Agents 10-12  │
└───────────────┘      └───────────────┘      └───────────────┘
                    ┌──────────┴──────────┐
                    ▼                     ▼
            ┌───────────────┐     ┌───────────────┐
            │ TESTING TEAM  │     │   DOCS TEAM   │
            │ Agents 13-14  │     │   Agent 15    │
            └───────────────┘     └───────────────┘
```
## Agent Assignments
---
### Agent #0: Queen Coordinator
**Type**: `queen-coordinator`
**Role**: Central orchestration, dependency management, conflict resolution
**Responsibilities**:
- Monitor all agent progress via memory coordination
- Resolve cross-team dependencies and conflicts
- Manage swarm-wide configuration
- Aggregate status reports
- Make strategic decisions on implementation order
- Coordinate code reviews between teams
**Deliverables**:
- Swarm coordination logs
- Dependency resolution decisions
- Final integration verification
**Memory Keys**:
- `swarm/queen/status` - Overall swarm status
- `swarm/queen/decisions` - Strategic decisions log
- `swarm/queen/dependencies` - Cross-agent dependency tracking
**No direct code output** - Coordination only
---
### Agent #1: Core DAG Engine
**Type**: `coder`
**Role**: Core DAG data structures and algorithms
**Responsibilities**:
1. Implement `QueryDag` structure
2. Implement `OperatorNode` and `OperatorType`
3. Implement DAG traversal algorithms (topological sort, DFS, BFS)
4. Implement edge/node management
5. Implement DAG serialization/deserialization
**Files to Create/Modify**:
```
ruvector-dag/src/
├── lib.rs
├── dag/
│ ├── mod.rs
│ ├── query_dag.rs
│ ├── operator_node.rs
│ ├── traversal.rs
│ └── serialization.rs
```
**Dependencies**: None (foundational)
**Blocked By**: None
**Blocks**: Agents 2, 3, 4, 6
**Deliverables**:
- [ ] `QueryDag` struct with node/edge management
- [ ] `OperatorNode` with all operator types
- [ ] Topological sort implementation
- [ ] Cycle detection
- [ ] JSON/binary serialization
**Estimated Complexity**: Medium
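
The topological sort and cycle detection deliverables can share one routine. The sketch below uses Kahn's algorithm on a plain edge list: it returns an ordering when the graph is acyclic and `None` when a cycle prevents some node from ever becoming ready. The function signature is a simplification for illustration; the actual `QueryDag` API is not defined in this document.

```rust
use std::collections::VecDeque;

/// Kahn's algorithm over nodes `0..node_count`.
/// Each edge `(from, to)` means `from` must come before `to`.
/// Returns `None` if the graph contains a cycle.
/// Panics if an edge refers to a node outside `0..node_count`.
pub fn topological_sort(node_count: usize, edges: &[(usize, usize)]) -> Option<Vec<usize>> {
    let mut adjacency: Vec<Vec<usize>> = vec![Vec::new(); node_count];
    let mut in_degree = vec![0usize; node_count];
    for &(from, to) in edges {
        adjacency[from].push(to);
        in_degree[to] += 1;
    }

    // Start with every node that has no unmet dependencies.
    let mut ready: VecDeque<usize> = (0..node_count).filter(|&n| in_degree[n] == 0).collect();
    let mut order = Vec::with_capacity(node_count);

    while let Some(node) = ready.pop_front() {
        order.push(node);
        for &next in &adjacency[node] {
            in_degree[next] -= 1;
            if in_degree[next] == 0 {
                ready.push_back(next);
            }
        }
    }

    // Nodes on a cycle never reach in-degree zero, so they are missing here.
    if order.len() == node_count {
        Some(order)
    } else {
        None
    }
}
```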
---
### Agent #2: Attention Mechanisms (Basic)
**Type**: `coder`
**Role**: Implement first 4 attention mechanisms
**Responsibilities**:
1. Implement `DagAttention` trait
2. Implement `TopologicalAttention`
3. Implement `CausalConeAttention`
4. Implement `CriticalPathAttention`
5. Implement `MinCutGatedAttention`
**Files to Create/Modify**:
```
ruvector-dag/src/attention/
├── mod.rs
├── traits.rs
├── topological.rs
├── causal_cone.rs
├── critical_path.rs
└── mincut_gated.rs
```
**Dependencies**: Agent #1 (QueryDag)
**Blocked By**: Agent #1
**Blocks**: Agents 6, 13
**Deliverables**:
- [ ] `DagAttention` trait definition
- [ ] `TopologicalAttention` with decay
- [ ] `CausalConeAttention` with temporal awareness
- [ ] `CriticalPathAttention` with path computation
- [ ] `MinCutGatedAttention` with flow-based gating
**Estimated Complexity**: High
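
As an illustration of what "`TopologicalAttention` with decay" could look like, here is a minimal sketch. The scoring rule (weight `decay^depth`, then normalize) is an assumption made for this example, not the specified formula; node depths are assumed to be precomputed from a topological order.

```rust
/// Hypothetical decay-based scoring: weight node `i` by `decay^depths[i]`,
/// then normalize so that the scores sum to 1.
pub fn topological_decay_scores(depths: &[u32], decay: f32) -> Vec<f32> {
    let raw: Vec<f32> = depths.iter().map(|&d| decay.powi(d as i32)).collect();
    let total: f32 = raw.iter().sum();
    if total == 0.0 {
        // Empty input or decay == 0 with no root nodes: nothing to normalize.
        return vec![0.0; depths.len()];
    }
    raw.into_iter().map(|w| w / total).collect()
}
```

Normalizing at the end is what gives the "scores sum to ~1.0" and "all scores in [0, 1]" properties that the milestone acceptance tests check for every mechanism.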
---
### Agent #3: Attention Mechanisms (Advanced)
**Type**: `coder`
**Role**: Implement advanced attention mechanisms
**Responsibilities**:
1. Implement `HierarchicalLorentzAttention`
2. Implement `ParallelBranchAttention`
3. Implement `TemporalBTSPAttention`
4. Implement `AttentionSelector` (UCB bandit)
5. Implement attention caching
**Files to Create/Modify**:
```
ruvector-dag/src/attention/
├── hierarchical_lorentz.rs
├── parallel_branch.rs
├── temporal_btsp.rs
├── selector.rs
└── cache.rs
```
**Dependencies**: Agent #1 (QueryDag), Agent #2 (DagAttention trait)
**Blocked By**: Agents #1, #2
**Blocks**: Agents 6, 13
**Deliverables**:
- [ ] `HierarchicalLorentzAttention` with hyperbolic ops
- [ ] `ParallelBranchAttention` with branch detection
- [ ] `TemporalBTSPAttention` with eligibility traces
- [ ] `AttentionSelector` with UCB selection
- [ ] LRU attention cache
**Estimated Complexity**: Very High
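
The UCB bandit behind `AttentionSelector` can be sketched with the classic UCB1 rule. This is a minimal illustration that assumes one arm per attention mechanism and rewards in [0, 1]; the type name `UcbSelector` and its methods are hypothetical, not the planned API.

```rust
/// Minimal UCB1 bandit over `n_arms` arms (one arm per attention mechanism).
pub struct UcbSelector {
    pulls: Vec<u32>,
    reward_sums: Vec<f64>,
}

impl UcbSelector {
    /// Assumes `n_arms > 0`.
    pub fn new(n_arms: usize) -> Self {
        Self { pulls: vec![0; n_arms], reward_sums: vec![0.0; n_arms] }
    }

    /// Tries each arm once, then picks the arm maximizing mean + sqrt(2 ln t / n_i).
    pub fn select(&self) -> usize {
        if let Some(untried) = self.pulls.iter().position(|&p| p == 0) {
            return untried;
        }
        let total: f64 = self.pulls.iter().map(|&p| f64::from(p)).sum();
        let mut best = 0;
        let mut best_score = f64::NEG_INFINITY;
        for (arm, (&p, &sum)) in self.pulls.iter().zip(&self.reward_sums).enumerate() {
            let n = f64::from(p);
            let score = sum / n + (2.0 * total.ln() / n).sqrt();
            if score > best_score {
                best = arm;
                best_score = score;
            }
        }
        best
    }

    /// Records the observed reward for the arm that was played.
    pub fn update(&mut self, arm: usize, reward: f64) {
        self.pulls[arm] += 1;
        self.reward_sums[arm] += reward;
    }

    /// Number of times each arm has been played.
    pub fn pulls(&self) -> &[u32] {
        &self.pulls
    }
}
```

The exploration bonus shrinks as an arm is played more often, so the selector keeps sampling weaker mechanisms occasionally while concentrating on the one with the best observed reward.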
---
### Agent #4: SONA Integration
**Type**: `coder`
**Role**: Self-Optimizing Neural Architecture integration
**Responsibilities**:
1. Implement `DagSonaEngine`
2. Implement `MicroLoRA` adaptation
3. Implement `DagTrajectoryBuffer`
4. Implement `DagReasoningBank`
5. Implement `EwcPlusPlus` constraints
**Files to Create/Modify**:
```
ruvector-dag/src/sona/
├── mod.rs
├── engine.rs
├── micro_lora.rs
├── trajectory.rs
├── reasoning_bank.rs
└── ewc.rs
```
**Dependencies**: Agent #1 (QueryDag)
**Blocked By**: Agent #1
**Blocks**: Agents 6, 8, 9, 10, 13
**Deliverables**:
- [ ] `DagSonaEngine` orchestration
- [ ] `MicroLoRA` rank-2 adaptation (<100μs)
- [ ] Lock-free trajectory buffer
- [ ] K-means++ clustering for patterns
- [ ] EWC++ with Fisher information
**Estimated Complexity**: Very High
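
To make the "rank-2 adaptation" deliverable concrete, the sketch below applies a rank-2 low-rank update to a vector, `y = x + scale * B (A x)`. It only illustrates the general LoRA idea; `Rank2Adapter`, its field layout, and the `scale` parameter are assumptions, and the real `MicroLoRA` may differ.

```rust
/// Dot product of two equally sized slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Hypothetical rank-2 adapter: A is 2 x d (two rows), B is d x 2 (d rows).
pub struct Rank2Adapter {
    a: [Vec<f32>; 2],
    b: Vec<[f32; 2]>,
    scale: f32,
}

impl Rank2Adapter {
    pub fn new(a: [Vec<f32>; 2], b: Vec<[f32; 2]>, scale: f32) -> Self {
        assert!(a[0].len() == b.len() && a[1].len() == b.len(), "dimension mismatch");
        Self { a, b, scale }
    }

    /// Computes `x + scale * B (A x)` in O(d) time.
    pub fn apply(&self, x: &[f32]) -> Vec<f32> {
        assert_eq!(x.len(), self.b.len(), "input dimension mismatch");
        // Project down to the 2-dimensional bottleneck.
        let h = [dot(&self.a[0], x), dot(&self.a[1], x)];
        // Project back up and add the residual.
        x.iter()
            .zip(&self.b)
            .map(|(&xi, bi)| xi + self.scale * (bi[0] * h[0] + bi[1] * h[1]))
            .collect()
    }
}
```

Because the bottleneck has only two dimensions, applying the adapter costs a handful of dot products, which is what makes a sub-100μs budget plausible for moderate `d`.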
---
### Agent #5: MinCut Optimization
**Type**: `coder`
**Role**: Subpolynomial min-cut algorithms
**Responsibilities**:
1. Implement `DagMinCutEngine`
2. Implement `LocalKCut` oracle
3. Implement dynamic update algorithms
4. Implement bottleneck detection
5. Implement redundancy suggestions
**Files to Create/Modify**:
```
ruvector-dag/src/mincut/
├── mod.rs
├── engine.rs
├── local_kcut.rs
├── dynamic_updates.rs
├── bottleneck.rs
└── redundancy.rs
```
**Dependencies**: Agent #1 (QueryDag)
**Blocked By**: Agent #1
**Blocks**: Agent #2 (MinCutGatedAttention), Agent #6
**Deliverables**:
- [ ] `DagMinCutEngine` with O(n^0.12) updates
- [ ] `LocalKCut` oracle implementation
- [ ] Hierarchical decomposition
- [ ] Bottleneck scoring algorithm
- [ ] Redundancy recommendation engine
**Estimated Complexity**: Very High
---
### Agent #6: PostgreSQL Core Integration
**Type**: `backend-dev`
**Role**: Core PostgreSQL extension integration
**Responsibilities**:
1. Set up pgrx extension structure
2. Implement GUC variables
3. Implement global state management
4. Implement query hooks (planner, executor)
5. Implement background worker registration
**Files to Create/Modify**:
```
ruvector-postgres/src/dag/
├── mod.rs
├── extension.rs
├── guc.rs
├── state.rs
├── hooks.rs
└── worker.rs
```
**Dependencies**: Agents #1-5 (all core components)
**Blocked By**: Agents #1, #2, #3, #4, #5
**Blocks**: Agents #7, #8, #9
**Deliverables**:
- [ ] Extension scaffolding with pgrx
- [ ] All GUC variables from spec
- [ ] Thread-safe global state (DashMap)
- [ ] Planner hook for DAG analysis
- [ ] Executor hooks for trajectory capture
- [ ] Background worker main loop
**Estimated Complexity**: High
---
### Agent #7: PostgreSQL SQL Functions (Part 1)
**Type**: `backend-dev`
**Role**: Core SQL function implementations
**Responsibilities**:
1. Configuration functions
2. Query analysis functions
3. Attention functions
4. Basic status/health functions
**Files to Create/Modify**:
```
ruvector-postgres/src/dag/
├── functions/
│ ├── mod.rs
│ ├── config.rs
│ ├── analysis.rs
│ ├── attention.rs
│ └── status.rs
```
**SQL Functions**:
- `dag_set_enabled`
- `dag_set_learning_rate`
- `dag_set_attention`
- `dag_configure_sona`
- `dag_config`
- `dag_status`
- `dag_analyze_plan`
- `dag_critical_path`
- `dag_bottlenecks`
- `dag_attention_scores`
- `dag_attention_matrix`
**Dependencies**: Agent #6 (PostgreSQL core)
**Blocked By**: Agent #6
**Blocks**: Agent #13
**Deliverables**:
- [ ] All configuration SQL functions
- [ ] Query analysis functions
- [ ] Attention computation functions
- [ ] Status reporting functions
**Estimated Complexity**: Medium
---
### Agent #8: PostgreSQL SQL Functions (Part 2)
**Type**: `backend-dev`
**Role**: Learning and pattern SQL functions
**Responsibilities**:
1. Pattern management functions
2. Trajectory functions
3. Learning control functions
4. Self-healing functions
**Files to Create/Modify**:
```
ruvector-postgres/src/dag/
├── functions/
│ ├── patterns.rs
│ ├── trajectories.rs
│ ├── learning.rs
│ └── healing.rs
```
**SQL Functions**:
- `dag_store_pattern`
- `dag_query_patterns`
- `dag_pattern_clusters`
- `dag_consolidate_patterns`
- `dag_record_trajectory`
- `dag_trajectory_history`
- `dag_learn_now`
- `dag_reset_learning`
- `dag_health_report`
- `dag_anomalies`
- `dag_auto_repair`
**Dependencies**: Agent #6 (PostgreSQL core), Agent #4 (SONA), Agent #9 (healing system, for the healing functions)
**Blocked By**: Agents #4, #6 (plus Agent #9 for the healing functions)
**Blocks**: Agent #13
**Deliverables**:
- [ ] Pattern CRUD functions
- [ ] Trajectory management functions
- [ ] Learning control functions
- [ ] Health and healing functions
**Estimated Complexity**: Medium
---
### Agent #9: Self-Healing System
**Type**: `coder`
**Role**: Autonomous self-healing implementation
**Responsibilities**:
1. Implement `AnomalyDetector`
2. Implement `IndexHealthChecker`
3. Implement `LearningDriftDetector`
4. Implement repair strategies
5. Implement healing orchestrator
**Files to Create/Modify**:
```
ruvector-dag/src/healing/
├── mod.rs
├── anomaly.rs
├── index_health.rs
├── drift_detector.rs
├── strategies.rs
└── orchestrator.rs
```
**Dependencies**: Agent #4 (SONA), Agent #6 (PostgreSQL hooks)
**Blocked By**: Agents #4, #6
**Blocks**: Agent #8 (healing SQL functions), Agent #13
**Deliverables**:
- [ ] Z-score anomaly detection
- [ ] HNSW/IVFFlat health monitoring
- [ ] Pattern drift detection
- [ ] Repair strategy implementations
- [ ] Healing loop orchestration
**Estimated Complexity**: High
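
The Z-score anomaly detection deliverable can be sketched as a sliding window with a sample standard deviation. This is a minimal illustration: `ZScoreDetector` and its single-call `observe` API are hypothetical and simpler than the planned `AnomalyDetector`, and the newest value is scored against a window that already includes it.

```rust
use std::collections::VecDeque;

/// Sliding-window Z-score detector (hypothetical, simplified API).
pub struct ZScoreDetector {
    window: VecDeque<f64>,
    window_size: usize,
    z_threshold: f64,
}

impl ZScoreDetector {
    pub fn new(window_size: usize, z_threshold: f64) -> Self {
        assert!(window_size >= 2, "need at least two samples for a variance");
        Self { window: VecDeque::with_capacity(window_size), window_size, z_threshold }
    }

    /// Adds `value` to the window and returns `Some(z)` if |z| exceeds the threshold.
    pub fn observe(&mut self, value: f64) -> Option<f64> {
        if self.window.len() == self.window_size {
            self.window.pop_front();
        }
        self.window.push_back(value);

        let n = self.window.len() as f64;
        if n < 2.0 {
            return None;
        }
        let mean = self.window.iter().sum::<f64>() / n;
        let variance =
            self.window.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / (n - 1.0);
        let std_dev = variance.sqrt();
        if std_dev == 0.0 {
            // A constant window has no spread, so no Z-score is defined.
            return None;
        }
        let z = (value - mean) / std_dev;
        (z.abs() > self.z_threshold).then_some(z)
    }
}
```

Including the new value in the window keeps the score finite even when the baseline is perfectly constant, at the cost of slightly damping the score of a single large spike.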
---
### Agent #10: QuDAG Client
**Type**: `coder`
**Role**: QuDAG network client implementation
**Responsibilities**:
1. Implement `QuDagClient`
2. Implement network communication
3. Implement pattern proposal flow
4. Implement consensus validation
5. Implement pattern synchronization
**Files to Create/Modify**:
```
ruvector-dag/src/qudag/
├── mod.rs
├── client.rs
├── network.rs
├── proposal.rs
├── consensus.rs
└── sync.rs
```
**Dependencies**: Agent #4 (patterns to propose)
**Blocked By**: Agent #4
**Blocks**: Agents #11, #12
**Deliverables**:
- [ ] QuDAG network client
- [ ] Async communication layer
- [ ] Pattern proposal protocol
- [ ] Consensus status tracking
- [ ] Pattern sync mechanism
**Estimated Complexity**: High
---
### Agent #11: QuDAG Cryptography
**Type**: `security-manager`
**Role**: Quantum-resistant cryptography
**Responsibilities**:
1. Implement ML-KEM-768 wrapper
2. Implement ML-DSA signature wrapper
3. Implement identity management
4. Implement secure key storage
5. Implement differential privacy for patterns
**Files to Create/Modify**:
```
ruvector-dag/src/qudag/
├── crypto/
│ ├── mod.rs
│ ├── ml_kem.rs
│ ├── ml_dsa.rs
│ ├── identity.rs
│ ├── keystore.rs
│ └── differential_privacy.rs
```
**Dependencies**: Agent #10 (QuDAG client)
**Blocked By**: Agent #10
**Blocks**: Agent #12
**Deliverables**:
- [ ] ML-KEM-768 encrypt/decrypt
- [ ] ML-DSA sign/verify
- [ ] Identity keypair management
- [ ] Secure keystore (zeroize)
- [ ] Laplace noise for DP
**Estimated Complexity**: High
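
For the "Laplace noise for DP" deliverable, the core step is turning a uniform random draw into a Laplace sample via the inverse CDF. The sketch below assumes the caller supplies the uniform draw `u` (for example from a cryptographically secure RNG); both function names are hypothetical, and a production implementation would also need to address floating-point side channels.

```rust
/// Inverse-CDF Laplace sample: maps a uniform draw `u` in (0, 1) to Laplace(0, scale).
pub fn laplace_from_uniform(u: f64, scale: f64) -> f64 {
    assert!(u > 0.0 && u < 1.0, "u must lie strictly inside (0, 1)");
    let centered = u - 0.5;
    -scale * centered.signum() * (1.0 - 2.0 * centered.abs()).ln()
}

/// Noise scale for the Laplace mechanism: sensitivity / epsilon.
pub fn laplace_scale(sensitivity: f64, epsilon: f64) -> f64 {
    assert!(sensitivity >= 0.0 && epsilon > 0.0, "invalid DP parameters");
    sensitivity / epsilon
}
```

Each coordinate of a pattern vector would receive an independent draw with scale `laplace_scale(sensitivity, epsilon)`; a smaller `epsilon` means more noise and stronger privacy.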
---
### Agent #12: QuDAG Token Integration
**Type**: `backend-dev`
**Role**: rUv token operations
**Responsibilities**:
1. Implement staking interface
2. Implement reward claiming
3. Implement balance tracking
4. Implement token SQL functions
5. Implement governance participation
**Files to Create/Modify**:
```
ruvector-dag/src/qudag/
├── tokens/
│ ├── mod.rs
│ ├── staking.rs
│ ├── rewards.rs
│ └── governance.rs
ruvector-postgres/src/dag/functions/
├── qudag.rs (SQL functions for QuDAG)
```
**Dependencies**: Agent #10 (QuDAG client), Agent #11 (crypto)
**Blocked By**: Agents #10, #11
**Blocks**: Agent #13
**Deliverables**:
- [ ] Staking operations
- [ ] Reward computation
- [ ] Balance queries
- [ ] QuDAG SQL functions
- [ ] Governance voting
**Estimated Complexity**: Medium
---
### Agent #13: Test Suite Developer
**Type**: `tester`
**Role**: Comprehensive test implementation
**Responsibilities**:
1. Unit tests for all modules
2. Integration tests
3. Property-based tests
4. Benchmark tests
5. CI pipeline setup
**Files to Create/Modify**:
```
ruvector-dag/tests/
├── unit/
│ ├── attention/
│ ├── sona/
│ ├── mincut/
│ ├── healing/
│ └── qudag/
├── integration/
│ ├── postgres/
│ └── qudag/
├── property/
└── fixtures/
ruvector-dag/benches/
├── attention_bench.rs
├── sona_bench.rs
└── mincut_bench.rs
.github/workflows/
└── dag-tests.yml
```
**Dependencies**: All code agents (1-12), Agent #14 (fixtures)
**Blocked By**: Agents #1-12, #14 (tests require implementations and fixtures)
**Blocks**: None (can test incrementally)
**Deliverables**:
- [ ] >80% unit test coverage
- [ ] All integration tests passing
- [ ] Property tests (1000+ cases)
- [ ] Benchmarks meeting performance targets
- [ ] CI/CD pipeline
**Estimated Complexity**: High
---
### Agent #14: Test Data & Fixtures
**Type**: `tester`
**Role**: Test data generation and fixtures
**Responsibilities**:
1. Generate realistic query DAGs
2. Generate synthetic patterns
3. Generate trajectory data
4. Create mock QuDAG server
5. Create test databases
**Files to Create/Modify**:
```
ruvector-dag/tests/
├── fixtures/
│ ├── dag_generator.rs
│ ├── pattern_generator.rs
│ ├── trajectory_generator.rs
│ └── mock_qudag.rs
├── data/
│ ├── sample_dags.json
│ ├── sample_patterns.bin
│ └── sample_trajectories.json
```
**Dependencies**: Agent #1 (DAG structure definitions)
**Blocked By**: Agent #1
**Blocks**: Agent #13 (needs fixtures)
**Deliverables**:
- [ ] DAG generator for all complexity levels
- [ ] Pattern generator for learning tests
- [ ] Mock QuDAG server for network tests
- [ ] Sample data files
- [ ] Test database setup scripts
**Estimated Complexity**: Medium
---
### Agent #15: Documentation & Examples
**Type**: `api-docs`
**Role**: API documentation and usage examples
**Responsibilities**:
1. Rust API documentation
2. SQL API documentation
3. Usage examples
4. Integration guides
5. Troubleshooting guides
**Files to Create/Modify**:
```
ruvector-dag/
├── README.md
├── examples/
│ ├── basic_usage.rs
│ ├── attention_selection.rs
│ ├── learning_workflow.rs
│ └── qudag_integration.rs
docs/dag/
├── USAGE.md
├── TROUBLESHOOTING.md
└── EXAMPLES.md
```
**Dependencies**: All code agents (1-12)
**Blocked By**: None (can document spec first, update with impl)
**Blocks**: None
**Deliverables**:
- [ ] Complete rustdoc for all public APIs
- [ ] SQL function documentation
- [ ] Working code examples
- [ ] Integration guide
- [ ] Troubleshooting guide
**Estimated Complexity**: Medium
---
## Task Dependencies Graph
```
                   ┌─────┐
                   │  0  │ Queen
                   └──┬──┘
        ┌─────────────┼─────────────┐
        │             │             │
     ┌──┴──┐       ┌──┴──┐       ┌──┴──┐
     │  1  │       │ 14  │       │ 15  │
     └──┬──┘       └──┬──┘       └─────┘
        │             │
   ┌────┼────┬────────┤
   │    │    │        │
┌──┴─┐┌─┴──┐┌┴──┐ ┌───┴───┐
│ 2  ││ 4  ││ 5 │ │ (13)  │
└──┬─┘└─┬──┘└─┬─┘ └───────┘
   │    │     │
┌──┴─┐  │     │
│ 3  │  │     │
└──┬─┘  │     │
   │    │     │
   └────┼─────┘
     ┌──┴──┐
     │  6  │ PostgreSQL Core
     └──┬──┘
   ┌────┼────┬────┐
   │    │    │    │
┌──┴─┐┌─┴──┐ │ ┌──┴──┐
│ 7  ││ 8  │ │ │  9  │
└────┘└────┘ │ └─────┘
          ┌──┴──┐
          │ 10  │ QuDAG Client
          └──┬──┘
          ┌──┴──┐
          │ 11  │ QuDAG Crypto
          └──┬──┘
          ┌──┴──┐
          │ 12  │ QuDAG Tokens
          └──┬──┘
          ┌──┴──┐
          │ 13  │ Tests
          └─────┘
```
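
The "Blocked By" entries in the agent sections define a dependency relation from which the earliest possible start wave of each agent can be derived. The following is a minimal sketch of that computation (longest-path layering); the function name and the map-based input format are assumptions for illustration.

```rust
use std::collections::BTreeMap;

/// Earliest wave per agent: 0 for unblocked agents, else 1 + the latest blocker's wave.
/// Returns `None` if the relation contains a cycle or references an unknown agent.
pub fn earliest_waves(blocked_by: &BTreeMap<u32, Vec<u32>>) -> Option<BTreeMap<u32, u32>> {
    let mut waves: BTreeMap<u32, u32> = BTreeMap::new();
    while waves.len() < blocked_by.len() {
        let mut progressed = false;
        for (&agent, blockers) in blocked_by {
            if waves.contains_key(&agent) {
                continue;
            }
            // An agent is resolvable only once every blocker already has a wave.
            let resolved: Option<Vec<u32>> =
                blockers.iter().map(|b| waves.get(b).copied()).collect();
            if let Some(blocker_waves) = resolved {
                let wave = blocker_waves.iter().max().map_or(0, |m| m + 1);
                waves.insert(agent, wave);
                progressed = true;
            }
        }
        if !progressed {
            // No agent could be placed: the remaining ones block each other.
            return None;
        }
    }
    Some(waves)
}
```

Feeding in the "Blocked By" lists as written places Agents #1 and #15 in wave 0 and Agent #6 in wave 3, since Agent #6 must wait for Agent #3, which in turn waits for Agent #2.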
## Execution Phases
### Phase 1: Foundation (Agents 1, 14, 15)
- Agent #1: Core DAG Engine
- Agent #14: Test fixtures (parallel)
- Agent #15: Documentation skeleton (parallel)
**Duration**: Can start immediately
**Milestone**: QueryDag and OperatorNode complete
### Phase 2: Core Features (Agents 2, 3, 4, 5)
- Agent #2: Basic Attention
- Agent #3: Advanced Attention (after Agent #2)
- Agent #4: SONA Integration
- Agent #5: MinCut Optimization
**Duration**: After Phase 1 foundation
**Milestone**: All attention mechanisms and learning components
### Phase 3: PostgreSQL Integration (Agents 6, 7, 8, 9)
- Agent #6: PostgreSQL Core
- Agent #7: SQL Functions Part 1 (after Agent #6)
- Agent #8: SQL Functions Part 2 (after Agent #6)
- Agent #9: Self-Healing (after Agent #6)
**Duration**: After Phase 2 core features
**Milestone**: Full PostgreSQL extension functional
### Phase 4: QuDAG Integration (Agents 10, 11, 12)
- Agent #10: QuDAG Client
- Agent #11: QuDAG Crypto (after Agent #10)
- Agent #12: QuDAG Tokens (after Agent #11)
**Duration**: Can start after Agent #4 (SONA)
**Milestone**: Distributed pattern learning operational
### Phase 5: Testing & Validation (Agent 13)
- Agent #13: Full test suite
- Integration testing
- Performance validation
**Duration**: Ongoing throughout, intensive at end
**Milestone**: All tests passing, benchmarks met
## Coordination Protocol
### Memory Keys for Cross-Agent Communication
```
swarm/dag/
├── status/
│   ├── agent_{N}_status     # Individual agent status
│   ├── phase_status         # Current phase
│   └── blockers             # Active blockers
├── artifacts/
│   ├── agent_{N}_files      # Files created/modified
│   ├── interfaces           # Shared interface definitions
│   └── schemas              # Data schemas
├── decisions/
│   ├── api_decisions        # API design decisions
│   ├── implementation       # Implementation choices
│   └── conflicts            # Resolved conflicts
└── metrics/
    ├── progress             # Completion percentages
    ├── performance          # Performance measurements
    └── issues               # Known issues
```
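
Since every agent derives its keys from the same layout, a pair of small helpers avoids typos in hand-written key strings. This is only a sketch: the helper names are hypothetical, and only the two per-agent key patterns from the tree above are covered.

```rust
/// Builds the per-agent status key, e.g. `swarm/dag/status/agent_4_status`.
pub fn agent_status_key(agent: u32) -> String {
    format!("swarm/dag/status/agent_{agent}_status")
}

/// Builds the per-agent artifact key, e.g. `swarm/dag/artifacts/agent_4_files`.
pub fn agent_files_key(agent: u32) -> String {
    format!("swarm/dag/artifacts/agent_{agent}_files")
}
```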
### Communication Hooks
Each agent MUST run before work:
```bash
npx claude-flow@alpha hooks pre-task --description "Agent #{N}: {task}"
npx claude-flow@alpha hooks session-restore --session-id "swarm-dag"
```
Each agent MUST run after work:
```bash
npx claude-flow@alpha hooks post-edit --file "{file}" --memory-key "swarm/dag/artifacts/agent_{N}_files"
npx claude-flow@alpha hooks post-task --task-id "agent_{N}_{task}"
```
## Success Criteria
| Agent | Must Complete | Performance Target |
|-------|---------------|-------------------|
| #1 | QueryDag, traversals, serialization | - |
| #2 | 4 attention mechanisms | <100μs per mechanism |
| #3 | 3 attention mechanisms + selector | <200μs per mechanism |
| #4 | SONA engine, MicroLoRA, ReasoningBank | <100μs adaptation |
| #5 | MinCut engine, dynamic updates | O(n^0.12) amortized |
| #6 | Extension scaffold, hooks, worker | - |
| #7 | 11 SQL functions | <5ms per function |
| #8 | 11 SQL functions | <5ms per function |
| #9 | Healing system | <1s detection latency |
| #10 | QuDAG client, sync | <500ms network ops |
| #11 | ML-KEM, ML-DSA | <10ms crypto ops |
| #12 | Token operations | <100ms token ops |
| #13 | >80% coverage, all benchmarks | - |
| #14 | All fixtures, mock server | - |
| #15 | Complete docs, examples | - |
---
*Document: 11-AGENT-TASKS.md | Version: 1.0 | Last Updated: 2025-01-XX*

@@ -0,0 +1,757 @@
# Implementation Milestones
## Overview
Structured milestone plan for implementing the Neural DAG Learning system with 15-agent swarm coordination.
## Milestone Summary
```
┌─────────────────────────────────────────────────────────────────────────┐
│                   NEURAL DAG LEARNING IMPLEMENTATION                    │
├─────────────────────────────────────────────────────────────────────────┤
│  M1: Foundation        ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   15%   │
│  M2: Core Attention    ████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░   25%   │
│  M3: SONA Learning     ████████████████░░░░░░░░░░░░░░░░░░░░░░░░   35%   │
│  M4: PostgreSQL        ████████████████████████░░░░░░░░░░░░░░░░   55%   │
│  M5: Self-Healing      ████████████████████████████░░░░░░░░░░░░   65%   │
│  M6: QuDAG Integration ████████████████████████████████░░░░░░░░   80%   │
│  M7: Testing           ████████████████████████████████████░░░░   90%   │
│  M8: Production Ready  ████████████████████████████████████████  100%   │
└─────────────────────────────────────────────────────────────────────────┘
```
---
## Milestone 1: Foundation
**Status**: Not Started
**Priority**: Critical
**Agents**: #1, #14, #15
### Objectives
- [ ] Establish core DAG data structures
- [ ] Create test fixture infrastructure
- [ ] Initialize documentation structure
### Deliverables
| Deliverable | Agent | Status | Notes |
|-------------|-------|--------|-------|
| `QueryDag` struct | #1 | Pending | Node/edge management |
| `OperatorNode` enum | #1 | Pending | All 15+ operator types |
| Topological sort | #1 | Pending | O(V+E) implementation |
| Cycle detection | #1 | Pending | For validation |
| DAG serialization | #1 | Pending | JSON + binary formats |
| Test DAG generator | #14 | Pending | All complexity levels |
| Mock fixtures | #14 | Pending | Sample data |
| Doc skeleton | #15 | Pending | README + guides |
### Acceptance Criteria
```rust
// Core functionality must work
let mut dag = QueryDag::new();
dag.add_node(0, OperatorNode::SeqScan { table: "users".into() });
dag.add_node(1, OperatorNode::Filter { predicate: "id > 0".into() });
dag.add_edge(0, 1).unwrap();
let sorted = dag.topological_sort().unwrap();
assert_eq!(sorted, vec![0, 1]);
let json = dag.to_json().unwrap();
let restored = QueryDag::from_json(&json).unwrap();
assert_eq!(dag, restored);
```
### Files Created
```
ruvector-dag/
├── Cargo.toml
├── src/
│   ├── lib.rs
│   └── dag/
│       ├── mod.rs
│       ├── query_dag.rs
│       ├── operator_node.rs
│       ├── traversal.rs
│       └── serialization.rs
└── tests/
    └── fixtures/
        ├── dag_generator.rs
        └── sample_dags.json
```
### Exit Criteria
- [ ] All unit tests pass for DAG module
- [ ] Benchmark: create 1000-node DAG in <10ms
- [ ] Documentation: rustdoc for all public items
- [ ] Code review approved by Queen agent
---
## Milestone 2: Core Attention Mechanisms
**Status**: Not Started
**Priority**: Critical
**Agents**: #2, #3
### Objectives
- [ ] Implement all 7 attention mechanisms
- [ ] Implement attention selector (UCB bandit)
- [ ] Achieve performance targets
### Deliverables
| Deliverable | Agent | Status | Target |
|-------------|-------|--------|--------|
| `DagAttention` trait | #2 | Pending | - |
| `TopologicalAttention` | #2 | Pending | <50μs/100 nodes |
| `CausalConeAttention` | #2 | Pending | <100μs/100 nodes |
| `CriticalPathAttention` | #2 | Pending | <75μs/100 nodes |
| `MinCutGatedAttention` | #2 | Pending | <200μs/100 nodes |
| `HierarchicalLorentzAttention` | #3 | Pending | <150μs/100 nodes |
| `ParallelBranchAttention` | #3 | Pending | <100μs/100 nodes |
| `TemporalBTSPAttention` | #3 | Pending | <120μs/100 nodes |
| `AttentionSelector` | #3 | Pending | UCB regret O(√T) |
| Attention cache | #3 | Pending | 10K entry LRU |
### Acceptance Criteria
```rust
// All mechanisms implement the trait
let mechanisms: Vec<Box<dyn DagAttention>> = vec![
    Box::new(TopologicalAttention::new(config.clone())),
    Box::new(CausalConeAttention::new(config.clone())),
    Box::new(CriticalPathAttention::new(config.clone())),
    Box::new(MinCutGatedAttention::new(config.clone())),
    Box::new(HierarchicalLorentzAttention::new(config.clone())),
    Box::new(ParallelBranchAttention::new(config.clone())),
    Box::new(TemporalBTSPAttention::new(config.clone())),
];

// Iterate by reference so `mechanisms` is still usable for the selector below
for mechanism in &mechanisms {
    let scores = mechanism.forward(&dag).unwrap();

    // Scores sum to ~1.0
    let sum: f32 = scores.values().sum();
    assert!((sum - 1.0).abs() < 0.001);

    // All scores in [0, 1]
    assert!(scores.values().all(|&s| s >= 0.0 && s <= 1.0));
}

// Selector chooses based on history
let mut selector = AttentionSelector::new(mechanisms.len());
for _ in 0..100 {
    let chosen = selector.select();
    let reward = simulate_query_improvement();
    selector.update(chosen, reward);
}
```
### Performance Benchmarks
| Mechanism | 10 nodes | 100 nodes | 500 nodes | 1000 nodes |
|-----------|----------|-----------|-----------|------------|
| Topological | <5μs | <50μs | <200μs | <500μs |
| CausalCone | <10μs | <100μs | <400μs | <1ms |
| CriticalPath | <8μs | <75μs | <300μs | <700μs |
| MinCutGated | <20μs | <200μs | <800μs | <2ms |
| HierarchicalLorentz | <15μs | <150μs | <600μs | <1.5ms |
| ParallelBranch | <10μs | <100μs | <400μs | <1ms |
| TemporalBTSP | <12μs | <120μs | <500μs | <1.2ms |
### Files Created
```
ruvector-dag/src/attention/
├── mod.rs
├── traits.rs
├── topological.rs
├── causal_cone.rs
├── critical_path.rs
├── mincut_gated.rs
├── hierarchical_lorentz.rs
├── parallel_branch.rs
├── temporal_btsp.rs
├── selector.rs
└── cache.rs
```
### Exit Criteria
- [ ] All 7 mechanisms pass unit tests
- [ ] All performance benchmarks met
- [ ] Property tests pass (1000 cases each)
- [ ] Selector converges to best mechanism in tests
- [ ] Code review approved
---
## Milestone 3: SONA Learning System
**Status**: Not Started
**Priority**: Critical
**Agents**: #4, #5
### Objectives
- [ ] Implement SONA engine with two-tier learning
- [ ] Implement MinCut optimization engine
- [ ] Achieve subpolynomial update complexity
### Deliverables
| Deliverable | Agent | Status | Target |
|-------------|-------|--------|--------|
| `DagSonaEngine` | #4 | Pending | Orchestration |
| `MicroLoRA` | #4 | Pending | <100μs adapt |
| `DagTrajectoryBuffer` | #4 | Pending | Lock-free, 1K cap |
| `DagReasoningBank` | #4 | Pending | 100 clusters, <2ms search |
| `EwcPlusPlus` | #4 | Pending | λ=5000 default |
| `DagMinCutEngine` | #5 | Pending | - |
| `LocalKCut` oracle | #5 | Pending | Local approximation |
| Dynamic updates | #5 | Pending | O(n^0.12) amortized |
| Bottleneck detection | #5 | Pending | - |
### Acceptance Criteria
```rust
// SONA instant loop
let mut sona = DagSonaEngine::new(config);
let dag = create_query_dag();

let start = Instant::now();
let enhanced = sona.pre_query(&dag).unwrap();
assert!(start.elapsed() < Duration::from_micros(100));

// Learning from trajectory
sona.post_query(&dag, &execution_metrics);

// Verify learning happened
let patterns = sona.reasoning_bank.query_similar(&dag.embedding(), 1);
assert!(!patterns.is_empty());

// MinCut dynamic updates
let mut mincut = DagMinCutEngine::new();
mincut.build_from_dag(&large_dag);

let timings: Vec<Duration> = (0..1000)
    .map(|_| {
        let start = Instant::now();
        mincut.update_edge(rand_u(), rand_v(), rand_weight());
        start.elapsed()
    })
    .collect();
let amortized = timings.iter().sum::<Duration>() / 1000;
// Verify subpolynomial: amortized << O(n)
```
### Files Created
```
ruvector-dag/src/
├── sona/
│   ├── mod.rs
│   ├── engine.rs
│   ├── micro_lora.rs
│   ├── trajectory.rs
│   ├── reasoning_bank.rs
│   └── ewc.rs
└── mincut/
    ├── mod.rs
    ├── engine.rs
    ├── local_kcut.rs
    ├── dynamic_updates.rs
    ├── bottleneck.rs
    └── redundancy.rs
```
### Exit Criteria
- [ ] MicroLoRA adapts in <100μs
- [ ] Pattern search in <2ms for 10K patterns
- [ ] EWC prevents catastrophic forgetting (>80% task retention)
- [ ] MinCut updates are O(n^0.12) amortized
- [ ] All tests pass
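
The forgetting-prevention criterion rests on the EWC penalty, which pulls parameters toward values that mattered for earlier tasks: (λ/2) · Σ Fᵢ (θᵢ − θ*ᵢ)². The sketch below shows only this base quadratic penalty, using the λ = 5000 default from the deliverables table in its usage; the online Fisher updates that distinguish EWC++ are not shown, and the function name is hypothetical.

```rust
/// EWC-style quadratic penalty: (lambda / 2) * sum_i F_i * (theta_i - theta_star_i)^2.
///
/// `theta` are the current parameters, `theta_star` the parameters consolidated
/// after earlier tasks, and `fisher` the diagonal Fisher information estimates.
pub fn ewc_penalty(theta: &[f32], theta_star: &[f32], fisher: &[f32], lambda: f32) -> f32 {
    assert!(
        theta.len() == theta_star.len() && theta.len() == fisher.len(),
        "all slices must have the same length"
    );
    let weighted: f32 = theta
        .iter()
        .zip(theta_star)
        .zip(fisher)
        .map(|((t, s), f)| f * (t - s).powi(2))
        .sum();
    0.5 * lambda * weighted
}
```

Parameters with a large Fisher value are expensive to move, so new learning is steered toward directions that earlier patterns did not rely on.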
---
## Milestone 4: PostgreSQL Integration
**Status**: Not Started
**Priority**: Critical
**Agents**: #6, #7, #8
### Objectives
- [ ] Create functional PostgreSQL extension
- [ ] Implement all SQL functions
- [ ] Hook into query execution pipeline
### Deliverables
| Deliverable | Agent | Status | Notes |
|-------------|-------|--------|-------|
| pgrx extension setup | #6 | Pending | Extension skeleton |
| GUC variables | #6 | Pending | All config vars |
| Global state | #6 | Pending | DashMap-based |
| Planner hook | #6 | Pending | DAG analysis |
| Executor hooks | #6 | Pending | Trajectory capture |
| Background worker | #6 | Pending | Learning loop |
| Config SQL funcs | #7 | Pending | 5 functions |
| Analysis SQL funcs | #7 | Pending | 6 functions |
| Attention SQL funcs | #7 | Pending | 3 functions |
| Pattern SQL funcs | #8 | Pending | 4 functions |
| Trajectory SQL funcs | #8 | Pending | 3 functions |
| Learning SQL funcs | #8 | Pending | 4 functions |
### Acceptance Criteria
```sql
-- Extension loads successfully
CREATE EXTENSION ruvector_dag CASCADE;

-- Configuration works
SELECT ruvector.dag_set_enabled(true);
SELECT ruvector.dag_set_attention('auto');

-- Query analysis works
SELECT * FROM ruvector.dag_analyze_plan($$
    SELECT * FROM vectors
    WHERE embedding <-> '[0.1,0.2,0.3]' < 0.5
    LIMIT 10
$$);

-- Patterns are stored
INSERT INTO test_vectors SELECT generate_series(1,1000), random_vector(128);
SELECT COUNT(*) FROM ruvector.dag_pattern_clusters(); -- Should have clusters

-- Learning improves over time
DO $$
DECLARE
    initial_time FLOAT8;
    final_time FLOAT8;
BEGIN
    -- Run workload
    FOR i IN 1..100 LOOP
        PERFORM * FROM test_vectors ORDER BY embedding <-> random_vector(128) LIMIT 10;
    END LOOP;

    -- Check improvement
    SELECT avg_improvement INTO final_time FROM ruvector.dag_status();
    RAISE NOTICE 'Improvement ratio: %', final_time;
END $$;
```
### Files Created
```
ruvector-postgres/src/dag/
├── mod.rs
├── extension.rs
├── guc.rs
├── state.rs
├── hooks.rs
├── worker.rs
└── functions/
    ├── mod.rs
    ├── config.rs
    ├── analysis.rs
    ├── attention.rs
    ├── patterns.rs
    ├── trajectories.rs
    └── learning.rs
```
### Exit Criteria
- [ ] Extension creates without errors
- [ ] All 25+ SQL functions work
- [ ] Query hooks capture execution data
- [ ] Background worker runs learning loop
- [ ] Integration tests pass
---
## Milestone 5: Self-Healing System
**Status**: Not Started
**Priority**: High
**Agents**: #9
### Objectives
- [ ] Implement autonomous anomaly detection
- [ ] Implement automatic repair strategies
- [ ] Integrate with healing SQL functions
### Deliverables
| Deliverable | Status | Notes |
|-------------|--------|-------|
| `AnomalyDetector` | Pending | Z-score based |
| `IndexHealthChecker` | Pending | HNSW/IVFFlat |
| `LearningDriftDetector` | Pending | Pattern quality trends |
| `RepairStrategy` trait | Pending | Strategy interface |
| `IndexRebalanceStrategy` | Pending | Rebalance indexes |
| `PatternResetStrategy` | Pending | Reset bad patterns |
| `HealingOrchestrator` | Pending | Main loop |
### Acceptance Criteria
```rust
// Anomaly detection
let mut detector = AnomalyDetector::new(AnomalyConfig {
    z_threshold: 3.0,
    window_size: 100,
});

// Inject anomaly
for _ in 0..99 {
    detector.observe(1.0); // Normal
}
detector.observe(100.0); // Anomaly

let anomalies = detector.detect();
assert!(!anomalies.is_empty());
assert!(anomalies[0].z_score > 3.0);

// Self-healing
let orchestrator = HealingOrchestrator::new(config);
orchestrator.run_cycle().unwrap();

// Verify repairs applied
let health = orchestrator.health_report();
assert!(health.overall_score > 0.8);
```
### Files Created
```
ruvector-dag/src/healing/
├── mod.rs
├── anomaly.rs
├── index_health.rs
├── drift_detector.rs
├── strategies.rs
└── orchestrator.rs
ruvector-postgres/src/dag/functions/
└── healing.rs
```
### Exit Criteria
- [ ] Anomalies detected within 1s
- [ ] Repairs applied automatically
- [ ] No false positives in 24h test
- [ ] SQL healing functions work
- [ ] Integration tests pass
---
## Milestone 6: QuDAG Integration
**Status**: Not Started
**Priority**: High
**Agents**: #10, #11, #12
### Objectives
- [ ] Connect to QuDAG network
- [ ] Implement quantum-resistant crypto
- [ ] Enable distributed pattern learning
### Deliverables
| Deliverable | Agent | Status | Notes |
|-------------|-------|--------|-------|
| `QuDagClient` | #10 | Pending | Network client |
| Pattern proposal | #10 | Pending | Submit patterns |
| Pattern sync | #10 | Pending | Receive patterns |
| Consensus validation | #10 | Pending | Track votes |
| ML-KEM-768 | #11 | Pending | Encryption |
| ML-DSA | #11 | Pending | Signatures |
| Identity management | #11 | Pending | Key generation |
| Differential privacy | #11 | Pending | Pattern noise |
| Staking interface | #12 | Pending | Token staking |
| Reward claiming | #12 | Pending | Earn rUv |
| QuDAG SQL funcs | #12 | Pending | SQL interface |
### Acceptance Criteria
```rust
// Connect to network
let client = QuDagClient::connect("https://qudag.example.com:8443").await?;
assert!(client.is_connected());

// Propose pattern with DP
let pattern = PatternProposal {
    vector: pattern_vector,
    metadata: metadata,
    noise: laplace_noise(epsilon),
};
let proposal_id = client.propose_pattern(pattern).await?;

// Wait for consensus
let status = client.wait_for_consensus(&proposal_id, timeout).await?;
assert!(matches!(status, ConsensusStatus::Finalized));

// Sync patterns
let new_patterns = client.sync_patterns(since_round).await?;
for pattern in new_patterns {
    reasoning_bank.import_pattern(pattern);
}

// Token operations
let balance = client.get_balance().await?;
client.stake(100.0).await?;
let rewards = client.claim_rewards().await?;
```
### Files Created
```
ruvector-dag/src/qudag/
├── mod.rs
├── client.rs
├── network.rs
├── proposal.rs
├── consensus.rs
├── sync.rs
├── crypto/
│   ├── mod.rs
│   ├── ml_kem.rs
│   ├── ml_dsa.rs
│   ├── identity.rs
│   ├── keystore.rs
│   └── differential_privacy.rs
└── tokens/
    ├── mod.rs
    ├── staking.rs
    ├── rewards.rs
    └── governance.rs
ruvector-postgres/src/dag/functions/
└── qudag.rs
```
### Exit Criteria
- [ ] Connect to test QuDAG network
- [ ] Pattern proposals finalize
- [ ] Pattern sync works bidirectionally
- [ ] ML-KEM/ML-DSA operations work
- [ ] Token operations succeed
- [ ] SQL functions work
---
## Milestone 7: Comprehensive Testing
**Status**: Not Started
**Priority**: High
**Agents**: #13, #14
### Objectives
- [ ] Achieve >80% test coverage
- [ ] All benchmarks meet targets
- [ ] CI/CD pipeline operational
### Deliverables
| Category | Count | Status |
|----------|-------|--------|
| Unit tests (attention) | 50+ | Pending |
| Unit tests (sona) | 40+ | Pending |
| Unit tests (mincut) | 30+ | Pending |
| Unit tests (healing) | 25+ | Pending |
| Unit tests (qudag) | 35+ | Pending |
| Integration tests (postgres) | 20+ | Pending |
| Integration tests (qudag) | 15+ | Pending |
| Property tests | 20+ | Pending |
| Benchmarks | 15+ | Pending |
### Performance Verification
| Component | Target | Test |
|-----------|--------|------|
| Topological attention | <50μs/100 nodes | Benchmark |
| MicroLoRA | <100μs | Benchmark |
| Pattern search | <2ms/10K | Benchmark |
| MinCut update | O(n^0.12) | Benchmark |
| Query analysis | <5ms | Integration |
| Full learning cycle | <100ms | Integration |
### Coverage Targets
```
Overall: >80%
attention/: >90%
sona/: >85%
mincut/: >85%
healing/: >80%
qudag/: >75%
functions/: >85%
```
### Files Created
```
ruvector-dag/
├── tests/
│   ├── unit/
│   │   ├── attention/
│   │   ├── sona/
│   │   ├── mincut/
│   │   ├── healing/
│   │   └── qudag/
│   ├── integration/
│   │   ├── postgres/
│   │   └── qudag/
│   ├── property/
│   └── fixtures/
└── benches/
    ├── attention_bench.rs
    ├── sona_bench.rs
    └── mincut_bench.rs
.github/workflows/
├── dag-tests.yml
└── dag-benchmarks.yml
```
### Exit Criteria
- [ ] Coverage >80%
- [ ] All tests pass
- [ ] All benchmarks meet targets
- [ ] CI pipeline green
- [ ] No critical issues
---
## Milestone 8: Production Ready
**Status**: Not Started
**Priority**: Critical
**Agents**: All
### Objectives
- [ ] Complete documentation
- [ ] Performance optimization
- [ ] Security audit
- [ ] Release preparation
### Deliverables
| Deliverable | Status |
|-------------|--------|
| Complete rustdoc | Pending |
| SQL API docs | Pending |
| Usage examples | Pending |
| Integration guide | Pending |
| Troubleshooting guide | Pending |
| Performance tuning guide | Pending |
| Security review | Pending |
| CHANGELOG | Pending |
| Release notes | Pending |
### Security Checklist
- [ ] No secret exposure
- [ ] Input validation on all SQL functions
- [ ] Safe memory handling (no leaks)
- [ ] Cryptographic review (ML-KEM/ML-DSA)
- [ ] Differential privacy parameters validated
- [ ] No SQL injection vectors
- [ ] Resource limits enforced
### Performance Optimization
- [ ] Profile and optimize hot paths
- [ ] Memory usage optimization
- [ ] Cache tuning
- [ ] Query plan caching
- [ ] Background worker tuning
### Release Checklist
- [ ] Version bump
- [ ] CHANGELOG updated
- [ ] All tests pass
- [ ] Benchmarks verified
- [ ] Documentation complete
- [ ] Examples tested
- [ ] Binary artifacts built
- [ ] Ready to publish to crates.io (if applicable)
### Exit Criteria
- [ ] All previous milestones complete
- [ ] Documentation complete
- [ ] Security review passed
- [ ] Performance targets met
- [ ] Ready for production deployment
---
## Risk Register
| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| MinCut O(n^0.12) update target not achievable | High | Medium | Fall back to O(√n) algorithm |
| PostgreSQL hook instability | High | Low | Extensive testing, fallback modes |
| QuDAG network unavailable | Medium | Medium | Local-only fallback mode |
| Performance regression | Medium | Medium | Continuous benchmarking in CI |
| Memory leaks | High | Low | Valgrind/Miri testing |
| Cross-agent coordination failures | Medium | Medium | Queen agent mediation |
## Dependencies
### External Dependencies
| Dependency | Version | Purpose |
|------------|---------|---------|
| pgrx | ^0.11 | PostgreSQL extension |
| tokio | ^1.0 | Async runtime |
| dashmap | ^5.0 | Concurrent hashmap |
| crossbeam | ^0.8 | Lock-free structures |
| ndarray | ^0.15 | Numeric arrays |
| ml-kem | TBD | ML-KEM-768 |
| ml-dsa | TBD | ML-DSA signatures |
### Internal Dependencies
- `ruvector-core`: Vector operations, SONA base
- `ruvector-graph`: GNN, attention base
- `ruvector-postgres`: Extension infrastructure
---
## Completion Tracking
| Milestone | Weight | Status | Completion |
|-----------|--------|--------|------------|
| M1: Foundation | 15% | Not Started | 0% |
| M2: Core Attention | 10% | Not Started | 0% |
| M3: SONA Learning | 10% | Not Started | 0% |
| M4: PostgreSQL | 20% | Not Started | 0% |
| M5: Self-Healing | 10% | Not Started | 0% |
| M6: QuDAG | 15% | Not Started | 0% |
| M7: Testing | 10% | Not Started | 0% |
| M8: Production | 10% | Not Started | 0% |
| **TOTAL** | **100%** | - | **0%** |
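
The TOTAL row is a weighted sum: each milestone contributes its weight multiplied by its own completion percentage. The following sketch shows that arithmetic, assuming the weights from the table. The `Milestone` type and function names are hypothetical and exist only for illustration.

```rust
/// One row of the completion-tracking table (hypothetical type).
struct Milestone {
    name: &'static str,
    weight_pct: u32,     // share of the whole project, in percent
    completion_pct: f64, // progress within the milestone, 0.0 to 100.0
}

/// Builds the eight milestones with the weights from the table and the
/// given per-milestone completion percentages (M1 through M8, in order).
fn plan(completions: [f64; 8]) -> Vec<Milestone> {
    const ROWS: [(&str, u32); 8] = [
        ("M1: Foundation", 15),
        ("M2: Core Attention", 10),
        ("M3: SONA Learning", 10),
        ("M4: PostgreSQL", 20),
        ("M5: Self-Healing", 10),
        ("M6: QuDAG", 15),
        ("M7: Testing", 10),
        ("M8: Production", 10),
    ];
    ROWS.iter()
        .zip(completions)
        .map(|(&(name, weight_pct), completion_pct)| Milestone {
            name,
            weight_pct,
            completion_pct,
        })
        .collect()
}

/// Weighted overall completion in percent.
/// Returns `None` when the weights do not sum to 100.
fn overall_completion(milestones: &[Milestone]) -> Option<f64> {
    let total_weight: u32 = milestones.iter().map(|m| m.weight_pct).sum();
    if total_weight != 100 {
        return None;
    }
    let total = milestones
        .iter()
        .map(|m| f64::from(m.weight_pct) * m.completion_pct / 100.0)
        .sum();
    Some(total)
}

/// Formats one table-style summary line for a milestone.
fn summary_line(m: &Milestone) -> String {
    format!("{} ({}%): {:.0}%", m.name, m.weight_pct, m.completion_pct)
}
```

For example, with M1 fully complete and M4 half complete, the overall figure is 15 × 1.0 + 20 × 0.5 = 25%. Rejecting weight sets that do not sum to 100 guards against the table drifting out of sync when milestones are added or reweighted.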
---
*Document: 12-MILESTONES.md | Version: 1.0 | Last Updated: 2025-01-XX*