Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
83
docs/dag/00-INDEX.md
Normal file
@@ -0,0 +1,83 @@
# Neural Self-Learning DAG Implementation Plan

## Project Overview

This document set provides a complete implementation plan for integrating a Neural Self-Learning DAG system into RuVector-Postgres, with optional QuDAG distributed consensus integration.

## Document Index

| Document | Description | Priority |
|----------|-------------|----------|
| [01-ARCHITECTURE.md](./01-ARCHITECTURE.md) | System architecture and component overview | P0 |
| [02-DAG-ATTENTION-MECHANISMS.md](./02-DAG-ATTENTION-MECHANISMS.md) | 7 specialized DAG attention implementations | P0 |
| [03-SONA-INTEGRATION.md](./03-SONA-INTEGRATION.md) | Self-Optimizing Neural Architecture integration | P0 |
| [04-POSTGRES-INTEGRATION.md](./04-POSTGRES-INTEGRATION.md) | PostgreSQL extension integration details | P0 |
| [05-QUERY-PLAN-DAG.md](./05-QUERY-PLAN-DAG.md) | Query plan as learnable DAG structure | P1 |
| [06-MINCUT-OPTIMIZATION.md](./06-MINCUT-OPTIMIZATION.md) | Min-cut based bottleneck detection | P1 |
| [07-SELF-HEALING.md](./07-SELF-HEALING.md) | Self-healing and adaptive repair | P1 |
| [08-QUDAG-INTEGRATION.md](./08-QUDAG-INTEGRATION.md) | QuDAG distributed consensus integration | P2 |
| [09-SQL-API.md](./09-SQL-API.md) | Complete SQL API specification | P0 |
| [10-TESTING-STRATEGY.md](./10-TESTING-STRATEGY.md) | Testing approach and benchmarks | P1 |
| [11-AGENT-TASKS.md](./11-AGENT-TASKS.md) | 15-agent swarm task breakdown | P0 |
| [12-MILESTONES.md](./12-MILESTONES.md) | Implementation milestones and timeline | P0 |

## Quick Start for Agents

1. Read [01-ARCHITECTURE.md](./01-ARCHITECTURE.md) for system overview
2. Check [11-AGENT-TASKS.md](./11-AGENT-TASKS.md) for your assigned tasks
3. Follow task-specific documents as referenced
4. Coordinate via shared memory patterns in [03-SONA-INTEGRATION.md](./03-SONA-INTEGRATION.md)

## Project Goals

### Primary Goals

- Create self-learning query optimization for RuVector-Postgres
- Implement 7 DAG-centric attention mechanisms
- Integrate SONA two-tier learning system
- Provide adaptive cost estimation
- Enable bottleneck detection via min-cut analysis

### Secondary Goals

- QuDAG distributed consensus for federated learning
- Self-healing index maintenance
- HDC state compression for efficient sync
- Production-ready SQL API

## Success Metrics

| Metric | Target | Measurement |
|--------|--------|-------------|
| Query latency improvement | 30-50% | Benchmark suite |
| Pattern recall accuracy | >95% | Test coverage |
| Learning overhead | <5% | Per-query timing |
| Bottleneck detection | O(n^0.12) | Algorithmic analysis |
| Memory overhead | <100MB | Per-table measurement |

## Dependencies

### Required Crates (Internal)

- `ruvector-postgres` - PostgreSQL extension framework
- `ruvector-attention` - 39 attention mechanisms
- `ruvector-gnn` - Graph neural network layers
- `ruvector-graph` - Query execution DAG
- `ruvector-mincut` - Subpolynomial min-cut
- `ruvector-nervous-system` - BTSP, HDC, spiking networks
- `sona` - Self-Optimizing Neural Architecture

### Required Crates (External)

- `pgrx` - PostgreSQL Rust extension framework
- `dashmap` - Concurrent hashmap
- `parking_lot` - Fast synchronization primitives
- `ndarray` - N-dimensional arrays
- `rayon` - Parallel iterators

### Optional (QuDAG Integration)

- `qudag` - Quantum-resistant DAG consensus
- `ml-kem` - Post-quantum key encapsulation
- `ml-dsa` - Post-quantum signatures

## Version

- Plan Version: 1.0.0
- Target RuVector Version: 0.5.0
- Last Updated: 2025-12-29
484
docs/dag/01-ARCHITECTURE.md
Normal file
@@ -0,0 +1,484 @@
# Neural Self-Learning DAG Architecture

## Overview

The Neural Self-Learning DAG system transforms RuVector-Postgres from a static query executor into an adaptive system that learns optimal configurations from query patterns.

## System Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                      NEURAL DAG RUVECTOR-POSTGRES                       │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                        SQL INTERFACE LAYER                        │  │
│  │    ruvector_enable_neural_dag() | ruvector_dag_patterns() | ...   │  │
│  └─────────────────────────────────┬─────────────────────────────────┘  │
│                                    │                                    │
│  ┌─────────────────────────────────┴─────────────────────────────────┐  │
│  │                       QUERY OPTIMIZER LAYER                       │  │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐   │  │
│  │  │   Pattern   │ │  Attention  │ │    Cost     │ │    Plan    │   │  │
│  │  │   Matcher   │ │  Selector   │ │  Estimator  │ │  Rewriter  │   │  │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘   │  │
│  └─────────────────────────────────┬─────────────────────────────────┘  │
│                                    │                                    │
│  ┌─────────────────────────────────┴─────────────────────────────────┐  │
│  │                        DAG ATTENTION LAYER                        │  │
│  │  ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐          │  │
│  │  │Topological│ │  Causal   │ │ Critical  │ │  MinCut   │          │  │
│  │  │ Attention │ │   Cone    │ │   Path    │ │   Gated   │          │  │
│  │  └───────────┘ └───────────┘ └───────────┘ └───────────┘          │  │
│  │  ┌───────────┐ ┌───────────┐ ┌───────────┐                        │  │
│  │  │Hierarchic │ │ Parallel  │ │ Temporal  │                        │  │
│  │  │  Lorentz  │ │  Branch   │ │   BTSP    │                        │  │
│  │  └───────────┘ └───────────┘ └───────────┘                        │  │
│  └─────────────────────────────────┬─────────────────────────────────┘  │
│                                    │                                    │
│  ┌─────────────────────────────────┴─────────────────────────────────┐  │
│  │                        SONA LEARNING LAYER                        │  │
│  │  ┌─────────────────────────────────────────────────────────────┐  │  │
│  │  │  INSTANT LOOP (<100μs)          BACKGROUND LOOP (hourly)    │  │  │
│  │  │  ┌─────────────┐                 ┌─────────────┐            │  │  │
│  │  │  │  MicroLoRA  │                 │  BaseLoRA   │            │  │  │
│  │  │  │ (rank 1-2)  │                 │  (rank 8)   │            │  │  │
│  │  │  └─────────────┘                 └─────────────┘            │  │  │
│  │  │  ┌─────────────┐                 ┌─────────────┐            │  │  │
│  │  │  │ Trajectory  │ ──────────────► │ ReasoningBk │            │  │  │
│  │  │  │   Buffer    │                 │  (K-means)  │            │  │  │
│  │  │  └─────────────┘                 └─────────────┘            │  │  │
│  │  │  ┌─────────────┐                                            │  │  │
│  │  │  │    EWC++    │                                            │  │  │
│  │  │  │ (forgetting)│                                            │  │  │
│  │  │  └─────────────┘                                            │  │  │
│  │  └─────────────────────────────────────────────────────────────┘  │  │
│  └─────────────────────────────────┬─────────────────────────────────┘  │
│                                    │                                    │
│  ┌─────────────────────────────────┴─────────────────────────────────┐  │
│  │                        OPTIMIZATION LAYER                         │  │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐   │  │
│  │  │   MinCut    │ │     HDC     │ │    BTSP     │ │   Self-    │   │  │
│  │  │  Analysis   │ │    State    │ │   Memory    │ │  Healing   │   │  │
│  │  │  O(n^0.12)  │ │ Compression │ │  One-Shot   │ │   Engine   │   │  │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘   │  │
│  └─────────────────────────────────┬─────────────────────────────────┘  │
│                                    │                                    │
│  ┌─────────────────────────────────┴─────────────────────────────────┐  │
│  │                           STORAGE LAYER                           │  │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐   │  │
│  │  │   Pattern   │ │  Embedding  │ │ Trajectory  │ │   Index    │   │  │
│  │  │    Store    │ │    Cache    │ │   History   │ │  Metadata  │   │  │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘   │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                  OPTIONAL: QUDAG CONSENSUS LAYER                  │  │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐   │  │
│  │  │  Federated  │ │   Pattern   │ │   ML-DSA    │ │    rUv     │   │  │
│  │  │  Learning   │ │  Consensus  │ │ Signatures  │ │   Tokens   │   │  │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘   │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```

## Component Descriptions

### 1. SQL Interface Layer

Provides PostgreSQL-native functions for interacting with the Neural DAG system.

**Key Components:**
- `ruvector_enable_neural_dag()` - Enable learning for a table
- `ruvector_dag_patterns()` - View learned patterns
- `ruvector_attention_*()` - DAG attention functions
- `ruvector_dag_learn()` - Trigger learning cycle

**Location:** `crates/ruvector-postgres/src/dag/operators.rs`

### 2. Query Optimizer Layer

Intercepts queries and applies learned optimizations.

**Key Components:**
- **Pattern Matcher**: Finds similar past query patterns via cosine similarity
- **Attention Selector**: UCB bandit for choosing optimal attention type
- **Cost Estimator**: Adaptive cost model with micro-LoRA updates
- **Plan Rewriter**: Applies learned operator ordering and parameters

**Location:** `crates/ruvector-postgres/src/dag/optimizer.rs`
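The Attention Selector is described above as a UCB bandit. As a rough illustration of that idea, here is a minimal std-only UCB1 sketch over attention types; all names and the reward interface are illustrative, not the actual `ruvector` API:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DagAttentionType { Topological, CausalCone, CriticalPath }

pub struct AttentionSelector {
    arms: Vec<(DagAttentionType, f64, u64)>, // (type, total reward, pulls)
    total_pulls: u64,
    exploration_c: f64, // e.g. 1.414, matching ucb_exploration_c below
}

impl AttentionSelector {
    pub fn new(exploration_c: f64) -> Self {
        let arms = vec![
            (DagAttentionType::Topological, 0.0, 0),
            (DagAttentionType::CausalCone, 0.0, 0),
            (DagAttentionType::CriticalPath, 0.0, 0),
        ];
        Self { arms, total_pulls: 0, exploration_c }
    }

    /// Pick the arm with the highest UCB1 score; unpulled arms go first.
    pub fn select(&self) -> DagAttentionType {
        let t = self.total_pulls.max(1) as f64;
        let mut best = (self.arms[0].0, f64::NEG_INFINITY);
        for &(ty, reward, pulls) in &self.arms {
            let score = if pulls == 0 {
                f64::INFINITY // explore every arm at least once
            } else {
                reward / pulls as f64
                    + self.exploration_c * (t.ln() / pulls as f64).sqrt()
            };
            if score > best.1 {
                best = (ty, score);
            }
        }
        best.0
    }

    /// Feed back an observed reward (e.g. normalized latency improvement).
    pub fn update(&mut self, ty: DagAttentionType, reward: f64) {
        for arm in &mut self.arms {
            if arm.0 == ty {
                arm.1 += reward;
                arm.2 += 1;
            }
        }
        self.total_pulls += 1;
    }
}

fn main() {
    let mut selector = AttentionSelector::new(1.414);
    // Warm up: each arm is tried once with zero reward.
    for _ in 0..3 {
        let ty = selector.select();
        selector.update(ty, 0.0);
    }
    // A strong reward signal pulls selection toward CriticalPath.
    selector.update(DagAttentionType::CriticalPath, 10.0);
    assert_eq!(selector.select(), DagAttentionType::CriticalPath);
}
```

The exploration constant trades off revisiting known-good attention types against trying under-sampled ones, which is the "exploration vs exploitation" step in the query execution flow.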

### 3. DAG Attention Layer

Seven specialized attention mechanisms for DAG structures.

| Attention Type | Use Case | Complexity |
|----------------|----------|------------|
| Topological | Respect DAG ordering | O(n·k) |
| Causal Cone | Distance-weighted ancestors | O(n·d) |
| Critical Path | Focus on bottlenecks | O(n + critical_len) |
| MinCut Gated | Gate by criticality | O(n^0.12 + n·k) |
| Hierarchical Lorentz | Deep nesting | O(n·d) |
| Parallel Branch | Coordinate branches | O(n·b) |
| Temporal BTSP | Time-correlated patterns | O(n·w) |

**Location:** `crates/ruvector-postgres/src/dag/attention/`

### 4. SONA Learning Layer

Two-tier learning system for continuous optimization.

**Instant Loop (per-query):**
- MicroLoRA adaptation (rank 1-2)
- Trajectory recording
- <100μs overhead

**Background Loop (hourly):**
- K-means++ pattern extraction
- BaseLoRA updates (rank 8)
- EWC++ constraint application

**Location:** `crates/ruvector-postgres/src/dag/learning/`

### 5. Optimization Layer

Advanced optimization components.

**Key Components:**
- **MinCut Analysis**: O(n^0.12) bottleneck detection
- **HDC State**: 10K-bit hypervector compression
- **BTSP Memory**: One-shot pattern recall
- **Self-Healing**: Proactive index repair

**Location:** `crates/ruvector-postgres/src/dag/optimization/`
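The HDC State component bundles system state into binary hypervectors. A minimal sketch of the core operations (majority-vote bundling and Hamming similarity), with the dimension shrunk from the 10K bits mentioned above to keep the example readable; the function names are illustrative:

```rust
/// Bundle several binary hypervectors by per-bit majority vote
/// (ties resolved to 1). Real HDC state uses ~10_000 bits.
fn bundle(vectors: &[Vec<u8>]) -> Vec<u8> {
    let dim = vectors[0].len();
    (0..dim)
        .map(|i| {
            let ones: usize = vectors.iter().map(|v| v[i] as usize).sum();
            if ones * 2 >= vectors.len() { 1 } else { 0 }
        })
        .collect()
}

/// Hamming similarity in [0, 1]; ~0.5 means "unrelated" at high dimension.
fn similarity(a: &[u8], b: &[u8]) -> f32 {
    let matching = a.iter().zip(b).filter(|(x, y)| x == y).count();
    matching as f32 / a.len() as f32
}

fn main() {
    let state_a = vec![1, 0, 1, 1, 0, 0, 1, 0];
    let state_b = vec![1, 1, 1, 0, 0, 1, 1, 0];
    let bundled = bundle(&[state_a.clone(), state_b.clone()]);
    // The bundle stays recognizably close to both inputs,
    // which is what makes compressed state sync possible.
    assert!(similarity(&bundled, &state_a) >= 0.5);
    assert!(similarity(&bundled, &state_b) >= 0.5);
}
```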

### 6. Storage Layer

Persistent storage for learned patterns and state.

**Key Components:**
- **Pattern Store**: DashMap + PostgreSQL tables
- **Embedding Cache**: LRU cache for hot embeddings
- **Trajectory History**: Ring buffer for recent queries
- **Index Metadata**: Pattern-to-index mappings

**Location:** `crates/ruvector-postgres/src/dag/storage/`
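The Trajectory History above is a bounded ring buffer: recent queries are kept, the oldest are evicted. A std-only sketch of that shape (the real store is concurrent; this single-threaded stand-in and its names are illustrative):

```rust
use std::collections::VecDeque;

/// Fixed-capacity ring buffer: push evicts the oldest entry on overflow.
pub struct TrajectoryHistory<T> {
    buf: VecDeque<T>,
    cap: usize,
}

impl<T> TrajectoryHistory<T> {
    pub fn new(cap: usize) -> Self {
        Self { buf: VecDeque::with_capacity(cap), cap }
    }

    pub fn push(&mut self, item: T) {
        if self.buf.len() == self.cap {
            self.buf.pop_front(); // drop the oldest trajectory
        }
        self.buf.push_back(item);
    }

    /// Drain everything for a background learning cycle.
    pub fn drain(&mut self) -> Vec<T> {
        self.buf.drain(..).collect()
    }
}

fn main() {
    let mut history = TrajectoryHistory::new(3);
    for query_id in 0..5 {
        history.push(query_id);
    }
    // Only the 3 most recent trajectories survive.
    assert_eq!(history.drain(), vec![2, 3, 4]);
}
```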

### 7. QuDAG Consensus Layer (Optional)

Distributed learning via quantum-resistant consensus.

**Key Components:**
- **Federated Learning**: Privacy-preserving pattern sharing
- **Pattern Consensus**: QR-Avalanche for pattern validation
- **ML-DSA Signatures**: Quantum-resistant pattern signing
- **rUv Tokens**: Incentivize learning contributions

**Location:** `crates/ruvector-postgres/src/dag/qudag/`

## Data Flow

### Query Execution Flow

```
SQL Query
    │
    ▼
┌─────────────────────────────────────┐
│ 1. Pattern Matching                 │
│    - Embed query plan               │
│    - Find similar patterns in       │
│      ReasoningBank (cosine sim)     │
│    - Return top-k matches           │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 2. Optimization Decision            │
│    - If pattern found (conf > 0.8): │
│        Apply learned configuration  │
│    - Else:                          │
│        Use defaults + micro-LoRA    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 3. Attention Selection              │
│    - UCB bandit selects attention   │
│    - Based on query pattern type    │
│    - Exploration vs exploitation    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 4. Plan Execution                   │
│    - Execute with optimized params  │
│    - Record operator timings        │
│    - Track intermediate results     │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 5. Trajectory Recording             │
│    - Store query embedding          │
│    - Store operator activations     │
│    - Store outcome metrics          │
│    - Compute quality score          │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 6. Instant Learning                 │
│    - MicroLoRA gradient accumulate  │
│    - Auto-flush at 100 queries      │
│    - Update attention selector      │
└─────────────────────────────────────┘
```
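Step 1's cosine-similarity lookup can be sketched as follows; the pattern store is modeled as a plain slice of embeddings here, which is a simplification of the real ReasoningBank:

```rust
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na < 1e-8 || nb < 1e-8 { 0.0 } else { dot / (na * nb) }
}

/// Return (pattern index, similarity) for the k best matches.
fn top_k_patterns(query: &[f32], patterns: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = patterns
        .iter()
        .map(|p| cosine(query, p))
        .enumerate()
        .collect();
    // Sort by similarity, descending, and keep the top k.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}

fn main() {
    let patterns = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let hits = top_k_patterns(&[1.0, 0.1], &patterns, 2);
    // The nearest stored pattern is the [1, 0] one.
    assert_eq!(hits[0].0, 0);
}
```

The step 2 decision then reduces to checking whether `hits[0].1` (combined with the pattern's stored confidence) clears the 0.8 threshold.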

### Learning Cycle Flow

```
Hourly Trigger
    │
    ▼
┌─────────────────────────────────────┐
│ 1. Drain Trajectory Buffer          │
│    - Collect 1000+ trajectories     │
│    - Filter by quality threshold    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 2. K-means++ Clustering             │
│    - 100 clusters                   │
│    - Deterministic initialization   │
│    - Max 100 iterations             │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 3. Pattern Extraction               │
│    - Compute cluster centroids      │
│    - Extract optimal parameters     │
│    - Calculate confidence scores    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 4. EWC++ Constraint Check           │
│    - Compute Fisher information     │
│    - Apply forgetting prevention    │
│    - Detect task boundaries         │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 5. BaseLoRA Update                  │
│    - Apply constrained gradients    │
│    - Update all layers              │
│    - Merge weights if needed        │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│ 6. ReasoningBank Update             │
│    - Store new patterns             │
│    - Consolidate similar patterns   │
│    - Evict low-confidence patterns  │
└─────────────────────────────────────┘
```
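Step 4's forgetting prevention follows the standard EWC shape: deviation from consolidated weights is penalized in proportion to Fisher information. A small sketch under that assumption (shapes and names are illustrative; `lambda` matches the `ewc_lambda` default of 2000.0):

```rust
/// EWC-style quadratic penalty: 0.5 * lambda * sum_i F_i * (w_i - w*_i)^2
fn ewc_penalty(new_w: &[f32], old_w: &[f32], fisher: &[f32], lambda: f32) -> f32 {
    0.5 * lambda
        * new_w
            .iter()
            .zip(old_w)
            .zip(fisher)
            .map(|((n, o), f)| f * (n - o) * (n - o))
            .sum::<f32>()
}

fn main() {
    let old_w = [0.5, -0.2];
    let fisher = [1.0, 0.0]; // second weight is unimportant to past tasks
    // Moving an important weight incurs a penalty...
    let p_important = ewc_penalty(&[0.6, -0.2], &old_w, &fisher, 2000.0);
    // ...while moving an unimportant one is free.
    let p_unimportant = ewc_penalty(&[0.5, 0.8], &old_w, &fisher, 2000.0);
    assert!(p_important > 0.0);
    assert_eq!(p_unimportant, 0.0);
}
```

The BaseLoRA update in step 5 would add this penalty's gradient to its loss, which is what "constrained gradients" refers to.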

## Module Dependencies

```
ruvector-postgres/src/dag/
├── mod.rs                    # Module root, re-exports
├── operators.rs              # SQL function definitions
│
├── attention/
│   ├── mod.rs                # Attention trait and registry
│   ├── topological.rs        # TopologicalAttention
│   ├── causal_cone.rs        # CausalConeAttention
│   ├── critical_path.rs      # CriticalPathAttention
│   ├── mincut_gated.rs       # MinCutGatedAttention
│   ├── hierarchical.rs       # HierarchicalLorentzAttention
│   ├── parallel_branch.rs    # ParallelBranchAttention
│   ├── temporal_btsp.rs      # TemporalBTSPAttention
│   └── ensemble.rs           # EnsembleAttention
│
├── learning/
│   ├── mod.rs                # Learning coordinator
│   ├── sona_engine.rs        # SONA integration wrapper
│   ├── trajectory.rs         # Trajectory buffer
│   ├── patterns.rs           # Pattern extraction
│   ├── reasoning_bank.rs     # Pattern storage
│   ├── ewc.rs                # EWC++ integration
│   └── attention_selector.rs # UCB bandit selector
│
├── optimizer/
│   ├── mod.rs                # Optimizer coordinator
│   ├── pattern_matcher.rs    # Pattern matching
│   ├── cost_estimator.rs     # Adaptive costs
│   └── plan_rewriter.rs      # Plan transformation
│
├── optimization/
│   ├── mod.rs                # Optimization utilities
│   ├── mincut.rs             # Min-cut integration
│   ├── hdc_state.rs          # HDC compression
│   ├── btsp_memory.rs        # BTSP one-shot
│   └── self_healing.rs       # Self-healing engine
│
├── storage/
│   ├── mod.rs                # Storage coordinator
│   ├── pattern_store.rs      # Pattern persistence
│   ├── embedding_cache.rs    # Embedding LRU
│   └── trajectory_store.rs   # Trajectory history
│
├── qudag/
│   ├── mod.rs                # QuDAG integration
│   ├── federated.rs          # Federated learning
│   ├── consensus.rs          # Pattern consensus
│   ├── signatures.rs         # ML-DSA signing
│   └── tokens.rs             # rUv token interface
│
└── types/
    ├── mod.rs                # Type definitions
    ├── neural_plan.rs        # NeuralDagPlan
    ├── trajectory.rs         # DagTrajectory
    ├── pattern.rs            # LearnedDagPattern
    └── metrics.rs            # ExecutionMetrics
```

## Configuration

### Default Configuration

```rust
pub struct NeuralDagConfig {
    // Learning
    pub learning_enabled: bool,              // true
    pub max_trajectories: usize,             // 10000
    pub pattern_clusters: usize,             // 100
    pub quality_threshold: f32,              // 0.3
    pub background_interval_ms: u64,         // 3600000 (1 hour)

    // Attention
    pub default_attention: DagAttentionType, // Topological
    pub attention_exploration: f32,          // 0.1
    pub ucb_exploration_c: f32,              // 1.414

    // SONA
    pub micro_lora_rank: usize,              // 2
    pub micro_lora_lr: f32,                  // 0.002
    pub base_lora_rank: usize,               // 8
    pub base_lora_lr: f32,                   // 0.001

    // EWC++
    pub ewc_lambda: f32,                     // 2000.0
    pub ewc_max_lambda: f32,                 // 15000.0
    pub ewc_fisher_decay: f32,               // 0.999

    // MinCut
    pub mincut_enabled: bool,                // true
    pub mincut_threshold: f32,               // 0.5

    // HDC
    pub hdc_dimensions: usize,               // 10000

    // Self-Healing
    pub healing_enabled: bool,               // true
    pub healing_check_interval_ms: u64,      // 300000 (5 min)
}
```

### PostgreSQL GUC Variables

```sql
-- Enable/disable neural DAG
SET ruvector.neural_dag_enabled = true;

-- Learning parameters
SET ruvector.dag_learning_rate = 0.002;
SET ruvector.dag_pattern_clusters = 100;
SET ruvector.dag_quality_threshold = 0.3;

-- Attention parameters
SET ruvector.dag_attention_type = 'auto';
SET ruvector.dag_attention_exploration = 0.1;

-- EWC parameters
SET ruvector.dag_ewc_lambda = 2000.0;

-- MinCut parameters
SET ruvector.dag_mincut_enabled = true;
SET ruvector.dag_mincut_threshold = 0.5;
```

## Performance Targets

| Operation | Target Latency | Notes |
|-----------|----------------|-------|
| Pattern matching | <1ms | Top-5 similar patterns |
| Attention computation | <500μs | Per operator |
| MicroLoRA forward | <100μs | Per query |
| Trajectory recording | <50μs | Non-blocking |
| Background learning | <5s | 1000 trajectories |
| MinCut analysis | <10ms | O(n^0.12) |
| HDC encoding | <100μs | 10K dimensions |

## Memory Budget

| Component | Budget | Notes |
|-----------|--------|-------|
| Pattern Store | 50MB | ~1000 patterns per table |
| Embedding Cache | 20MB | LRU for hot embeddings |
| Trajectory Buffer | 20MB | 10K trajectories |
| MicroLoRA | 10KB | Per active query |
| BaseLoRA | 400KB | Per table |
| HDC State | 1.2KB | Per state snapshot |

**Total per table:** ~100MB maximum

## Thread Safety

All components use thread-safe primitives:

- `DashMap` for concurrent pattern storage
- `parking_lot::RwLock` for embedding cache
- `crossbeam::ArrayQueue` for trajectory buffer
- `AtomicU64` for counters and statistics
- PostgreSQL background workers for learning cycles
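The `AtomicU64` counters are the simplest of these primitives; a tiny std-only illustration of lock-free statistics shared across threads (the counter name is made up for the example):

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

fn main() {
    // Hypothetical "queries observed" statistic, shared across workers.
    let queries_seen = Arc::new(AtomicU64::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&queries_seen);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // Relaxed is enough for a statistic nothing synchronizes on.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }

    // No increments are lost, and no lock was taken.
    assert_eq!(queries_seen.load(Ordering::Relaxed), 4000);
}
```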

## Error Handling

```rust
pub enum NeuralDagError {
    // Configuration errors
    InvalidConfig(String),
    TableNotEnabled(String),

    // Learning errors
    InsufficientTrajectories,
    PatternExtractionFailed,
    EwcConstraintViolation,

    // Attention errors
    AttentionComputationFailed,
    InvalidDagStructure,

    // Storage errors
    PatternStoreFull,
    EmbeddingCacheMiss,

    // MinCut errors
    MinCutComputationFailed,
    GraphDisconnected,

    // QuDAG errors (optional)
    ConsensusTimeout,
    SignatureVerificationFailed,
}
```

All errors are logged and non-fatal; the system falls back to default behavior on error.
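That fallback contract can be sketched as a `Result` that is logged and replaced with defaults rather than propagated; `ExecutionParams`, its `ef_search` field, and the helper functions here are illustrative, only `NeuralDagError` comes from the enum above:

```rust
#[derive(Debug)]
pub enum NeuralDagError {
    InsufficientTrajectories,
    AttentionComputationFailed,
}

#[derive(Debug, PartialEq)]
pub struct ExecutionParams {
    pub ef_search: usize, // hypothetical learned search parameter
}

impl Default for ExecutionParams {
    fn default() -> Self {
        Self { ef_search: 64 }
    }
}

/// Stand-in for the learned-parameter lookup; fails on cold start.
fn learned_params() -> Result<ExecutionParams, NeuralDagError> {
    Err(NeuralDagError::InsufficientTrajectories)
}

/// The query never fails because learning failed: log and fall back.
fn params_for_query() -> ExecutionParams {
    learned_params().unwrap_or_else(|e| {
        eprintln!("neural DAG disabled for this query: {:?}", e);
        ExecutionParams::default()
    })
}

fn main() {
    assert_eq!(params_for_query(), ExecutionParams::default());
}
```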
1236
docs/dag/02-DAG-ATTENTION-MECHANISMS.md
Normal file
File diff suppressed because it is too large
1009
docs/dag/03-SONA-INTEGRATION.md
Normal file
File diff suppressed because it is too large
1202
docs/dag/04-POSTGRES-INTEGRATION.md
Normal file
File diff suppressed because it is too large
914
docs/dag/05-QUERY-PLAN-DAG.md
Normal file
@@ -0,0 +1,914 @@
# Query Plan as Learnable DAG

## Overview

This document specifies how PostgreSQL query plans are represented as DAGs (Directed Acyclic Graphs) and how they become targets for neural learning.

## Query Plan DAG Structure

### Conceptual Model

```
                    ┌─────────────┐
                    │   RESULT    │  (Root)
                    └──────┬──────┘
                           │
                    ┌──────┴──────┐
                    │    SORT     │
                    └──────┬──────┘
                           │
              ┌────────────┴────────────┐
              │                         │
       ┌──────┴──────┐           ┌──────┴──────┐
       │   FILTER    │           │   FILTER    │
       └──────┬──────┘           └──────┬──────┘
              │                         │
       ┌──────┴──────┐           ┌──────┴──────┐
       │  HNSW SCAN  │           │  SEQ SCAN   │
       │ (documents) │           │  (authors)  │
       └─────────────┘           └─────────────┘

        Leaf Nodes                Leaf Nodes
```

### NeuralDagPlan Structure

```rust
/// Query plan enhanced with neural learning capabilities
#[derive(Clone, Debug)]
pub struct NeuralDagPlan {
    // ═══════════════════════════════════════════════════════════════
    // BASE PLAN STRUCTURE
    // ═══════════════════════════════════════════════════════════════

    /// Plan ID (unique per execution)
    pub plan_id: u64,

    /// Root operator
    pub root: OperatorNode,

    /// All operators in topological order (leaves first)
    pub operators: Vec<OperatorNode>,

    /// Edges: parent_id -> Vec<child_id>
    pub edges: HashMap<OperatorId, Vec<OperatorId>>,

    /// Reverse edges: child_id -> parent_id
    pub reverse_edges: HashMap<OperatorId, OperatorId>,

    /// Pipeline breakers (blocking operators)
    pub pipeline_breakers: Vec<OperatorId>,

    /// Parallelism configuration
    pub parallelism: usize,

    // ═══════════════════════════════════════════════════════════════
    // NEURAL ENHANCEMENTS
    // ═══════════════════════════════════════════════════════════════

    /// Operator embeddings (256-dim per operator)
    pub operator_embeddings: Vec<Vec<f32>>,

    /// Plan embedding (computed from operators)
    pub plan_embedding: Option<Vec<f32>>,

    /// Attention weights between operators
    pub attention_weights: Vec<Vec<f32>>,

    /// Selected attention type
    pub attention_type: DagAttentionType,

    /// Trajectory ID (links to ReasoningBank)
    pub trajectory_id: Option<u64>,

    // ═══════════════════════════════════════════════════════════════
    // LEARNED PARAMETERS
    // ═══════════════════════════════════════════════════════════════

    /// Learned cost estimates per operator
    pub learned_costs: Option<Vec<f32>>,

    /// Execution parameters
    pub params: ExecutionParams,

    /// Pattern match info (if pattern was applied)
    pub pattern_match: Option<PatternMatch>,

    // ═══════════════════════════════════════════════════════════════
    // OPTIMIZATION STATE
    // ═══════════════════════════════════════════════════════════════

    /// MinCut criticality per operator
    pub criticalities: Option<Vec<f32>>,

    /// Critical path operators
    pub critical_path: Option<Vec<OperatorId>>,

    /// Bottleneck score (0.0 - 1.0)
    pub bottleneck_score: Option<f32>,
}

/// Single operator in the plan DAG
#[derive(Clone, Debug)]
pub struct OperatorNode {
    /// Unique operator ID
    pub id: OperatorId,

    /// Operator type
    pub op_type: OperatorType,

    /// Target table (if applicable)
    pub table_name: Option<String>,

    /// Index used (if applicable)
    pub index_name: Option<String>,

    /// Filter predicate (if applicable)
    pub filter: Option<FilterExpr>,

    /// Join condition (if join)
    pub join_condition: Option<JoinCondition>,

    /// Projected columns
    pub projection: Vec<String>,

    /// Estimated rows
    pub estimated_rows: f64,

    /// Estimated cost
    pub estimated_cost: f64,

    /// Operator embedding (learned)
    pub embedding: Vec<f32>,

    /// Depth in DAG (0 = leaf)
    pub depth: usize,

    /// Is this on critical path?
    pub is_critical: bool,

    /// MinCut criticality score
    pub criticality: f32,
}

/// Operator types
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum OperatorType {
    // Scan operators (leaves)
    SeqScan,
    IndexScan,
    IndexOnlyScan,
    HnswScan,
    IvfFlatScan,
    BitmapScan,

    // Join operators
    NestedLoop,
    HashJoin,
    MergeJoin,

    // Aggregation operators
    Aggregate,
    GroupAggregate,
    HashAggregate,

    // Sort operators
    Sort,
    IncrementalSort,

    // Filter operators
    Filter,
    Result,

    // Set operators
    Append,
    MergeAppend,
    Union,
    Intersect,
    Except,

    // Subquery operators
    SubqueryScan,
    CteScan,
    MaterializeNode,

    // Utility
    Limit,
    Unique,
    WindowAgg,

    // Parallel
    Gather,
    GatherMerge,
}

/// Pattern match information
#[derive(Clone, Debug)]
pub struct PatternMatch {
    pub pattern_id: PatternId,
    pub confidence: f32,
    pub similarity: f32,
    pub applied_params: ExecutionParams,
}
```
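The `operators` field holds the DAG "in topological order (leaves first)". One way to produce that ordering from the `edges` map (parent -> children) is Kahn's algorithm; a self-contained sketch, with `OperatorId` assumed to be `u64` for the example:

```rust
use std::collections::HashMap;

type OperatorId = u64; // assumed here; the real alias is defined elsewhere

/// Leaves-first topological order over a parent -> children edge map.
fn topological_order(
    edges: &HashMap<OperatorId, Vec<OperatorId>>,
    nodes: &[OperatorId],
) -> Vec<OperatorId> {
    // Remaining unprocessed children per node; leaves start at zero.
    let mut out_deg: HashMap<OperatorId, usize> = nodes
        .iter()
        .map(|&n| (n, edges.get(&n).map_or(0, |c| c.len())))
        .collect();
    // Reverse lookup: child -> parents.
    let mut parents: HashMap<OperatorId, Vec<OperatorId>> = HashMap::new();
    for (&p, children) in edges {
        for &c in children {
            parents.entry(c).or_default().push(p);
        }
    }
    let empty: Vec<OperatorId> = Vec::new();
    let mut ready: Vec<OperatorId> =
        nodes.iter().copied().filter(|n| out_deg[n] == 0).collect();
    let mut order = Vec::new();
    while let Some(n) = ready.pop() {
        order.push(n);
        // A processed child unblocks its parents.
        for &p in parents.get(&n).unwrap_or(&empty) {
            let d = out_deg.get_mut(&p).unwrap();
            *d -= 1;
            if *d == 0 {
                ready.push(p);
            }
        }
    }
    order
}

fn main() {
    // RESULT(0) -> SORT(1) -> { SCAN(2), SCAN(3) }, as in the diagram above.
    let mut edges: HashMap<OperatorId, Vec<OperatorId>> = HashMap::new();
    edges.insert(0, vec![1]);
    edges.insert(1, vec![2, 3]);
    let order = topological_order(&edges, &[0, 1, 2, 3]);
    assert_eq!(order.len(), 4);
    assert_eq!(*order.last().unwrap(), 0); // root last, leaves first
}
```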
|
||||
|
||||
### Operator Embedding
|
||||
|
||||
```rust
|
||||
impl OperatorNode {
|
||||
/// Generate embedding for this operator
|
||||
pub fn generate_embedding(&mut self, config: &EmbeddingConfig) {
|
||||
let dim = config.hidden_dim;
|
||||
let mut embedding = vec![0.0; dim];
|
||||
|
||||
// 1. Operator type encoding (one-hot style, but dense)
|
||||
let type_offset = self.op_type.type_index() * 16;
|
||||
for i in 0..16 {
|
||||
embedding[type_offset + i] = self.op_type.type_features()[i];
|
||||
}
|
||||
|
||||
// 2. Cardinality encoding (log scale)
|
||||
let card_offset = 128;
|
||||
let log_rows = (self.estimated_rows + 1.0).ln();
|
||||
embedding[card_offset] = log_rows / 20.0; // Normalize
|
||||
|
||||
// 3. Cost encoding (log scale)
|
||||
let cost_offset = 129;
|
||||
let log_cost = (self.estimated_cost + 1.0).ln();
|
||||
embedding[cost_offset] = log_cost / 30.0; // Normalize
|
||||
|
||||
// 4. Depth encoding
|
||||
let depth_offset = 130;
|
||||
embedding[depth_offset] = self.depth as f32 / 20.0;
|
||||
|
||||
// 5. Table/index encoding (if applicable)
|
||||
if let Some(ref table) = self.table_name {
|
||||
let table_hash = hash_string(table);
|
||||
let table_offset = 132;
|
||||
for i in 0..16 {
|
||||
embedding[table_offset + i] = ((table_hash >> (i * 4)) & 0xF) as f32 / 16.0;
|
||||
}
|
||||
}
|
||||
|
||||
// 6. Filter complexity encoding
|
||||
if let Some(ref filter) = self.filter {
|
||||
let filter_offset = 148;
|
||||
embedding[filter_offset] = filter.complexity() as f32 / 10.0;
|
||||
embedding[filter_offset + 1] = filter.selectivity_estimate();
|
||||
}
|
||||
|
||||
// 7. Join encoding
|
||||
if let Some(ref join) = self.join_condition {
|
||||
let join_offset = 150;
|
||||
embedding[join_offset] = join.join_type.type_index() as f32 / 4.0;
|
||||
embedding[join_offset + 1] = join.estimated_selectivity;
|
||||
}
|
||||
|
||||
// L2 normalize
|
||||
let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
|
||||
if norm > 1e-8 {
|
||||
for x in &mut embedding {
|
||||
*x /= norm;
|
||||
}
|
||||
}
|
||||
|
||||
self.embedding = embedding;
|
||||
}
|
||||
}

impl OperatorType {
    /// Get feature vector for operator type
    fn type_features(&self) -> [f32; 16] {
        match self {
            // Scans - low cost per row
            OperatorType::SeqScan => [1.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            OperatorType::IndexScan => [0.8, 0.2, 0.0, 0.0, 0.1, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            OperatorType::HnswScan => [0.6, 0.4, 0.0, 0.0, 0.05, 0.8, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            OperatorType::IvfFlatScan => [0.7, 0.3, 0.0, 0.0, 0.08, 0.7, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],

            // Joins - high cost
            OperatorType::NestedLoop => [0.0, 0.0, 1.0, 0.0, 0.9, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            OperatorType::HashJoin => [0.0, 0.0, 0.8, 0.2, 0.5, 0.0, 0.0, 0.6, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            OperatorType::MergeJoin => [0.0, 0.0, 0.6, 0.4, 0.4, 0.0, 0.0, 0.4, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],

            // Aggregation - blocking
            OperatorType::Aggregate => [0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0],
            OperatorType::HashAggregate => [0.0, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0, 0.5, 0.8, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0],

            // Sort - blocking
            OperatorType::Sort => [0.0, 0.0, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.7, 0.0, 0.0, 0.0, 0.0],

            // Default
            _ => [0.5; 16],
        }
    }

    fn type_index(&self) -> usize {
        match self {
            OperatorType::SeqScan => 0,
            OperatorType::IndexScan => 1,
            OperatorType::IndexOnlyScan => 1,
            OperatorType::HnswScan => 2,
            OperatorType::IvfFlatScan => 3,
            OperatorType::BitmapScan => 4,
            OperatorType::NestedLoop => 5,
            OperatorType::HashJoin => 6,
            OperatorType::MergeJoin => 7,
            OperatorType::Aggregate | OperatorType::GroupAggregate | OperatorType::HashAggregate => 8,
            OperatorType::Sort | OperatorType::IncrementalSort => 9,
            OperatorType::Filter => 10,
            OperatorType::Limit => 11,
            _ => 12,
        }
    }
}
```
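
Every embedding built above ends with the same L2 normalization step (with a `norm > 1e-8` guard against near-zero vectors). A minimal standalone sketch of that step; the function name is illustrative, not from the extension:

```rust
/// L2-normalize a vector in place, skipping near-zero vectors
/// (mirrors the `norm > 1e-8` guard used in the embedding code).
fn l2_normalize(v: &mut [f32]) {
    let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 1e-8 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn main() {
    let mut v = vec![3.0_f32, 4.0];
    l2_normalize(&mut v);
    // 3-4-5 triangle: normalized components are 0.6 and 0.8
    assert!((v[0] - 0.6).abs() < 1e-6 && (v[1] - 0.8).abs() < 1e-6);

    // Near-zero vectors are left untouched to avoid division blow-up
    let mut z = vec![0.0_f32, 0.0];
    l2_normalize(&mut z);
    assert_eq!(z, vec![0.0, 0.0]);
}
```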

## Plan Conversion from PostgreSQL

### PlannedStmt to NeuralDagPlan

```rust
impl NeuralDagPlan {
    /// Convert PostgreSQL PlannedStmt to NeuralDagPlan
    pub unsafe fn from_planned_stmt(stmt: *mut pg_sys::PlannedStmt) -> Self {
        let mut plan = NeuralDagPlan::new();

        // Extract plan tree
        let plan_tree = (*stmt).planTree;
        plan.root = Self::convert_plan_node(plan_tree, &mut plan, 0);

        // Compute topological order
        plan.compute_topological_order();

        // Generate embeddings
        plan.generate_embeddings();

        // Identify pipeline breakers
        plan.identify_pipeline_breakers();

        plan
    }

    /// Recursively convert plan nodes
    unsafe fn convert_plan_node(
        node: *mut pg_sys::Plan,
        plan: &mut NeuralDagPlan,
        depth: usize,
    ) -> OperatorNode {
        if node.is_null() {
            panic!("Null plan node");
        }

        let node_type = (*node).type_;
        let estimated_rows = (*node).plan_rows;
        let estimated_cost = (*node).total_cost;

        let op_type = Self::pg_node_to_op_type(node_type, node);
        let op_id = plan.next_operator_id();

        let operator = OperatorNode {
            id: op_id,
            op_type,
            table_name: Self::extract_table_name(node),
            index_name: Self::extract_index_name(node),
            filter: Self::extract_filter(node),
            join_condition: Self::extract_join_condition(node),
            projection: Self::extract_projection(node),
            estimated_rows,
            estimated_cost,
            embedding: vec![],
            depth,
            is_critical: false,
            criticality: 0.0,
        };

        // Process children
        let left_plan = (*node).lefttree;
        let right_plan = (*node).righttree;

        let mut child_ids = Vec::new();

        if !left_plan.is_null() {
            let left_op = Self::convert_plan_node(left_plan, plan, depth + 1);
            child_ids.push(left_op.id);
            plan.reverse_edges.insert(left_op.id, op_id);
            plan.operators.push(left_op);
        }

        if !right_plan.is_null() {
            let right_op = Self::convert_plan_node(right_plan, plan, depth + 1);
            child_ids.push(right_op.id);
            plan.reverse_edges.insert(right_op.id, op_id);
            plan.operators.push(right_op);
        }

        if !child_ids.is_empty() {
            plan.edges.insert(op_id, child_ids);
        }

        operator
    }

    /// Map PostgreSQL node type to OperatorType
    unsafe fn pg_node_to_op_type(node_type: pg_sys::NodeTag, node: *mut pg_sys::Plan) -> OperatorType {
        match node_type {
            pg_sys::NodeTag::T_SeqScan => OperatorType::SeqScan,
            pg_sys::NodeTag::T_IndexScan => {
                // Check if it's HNSW or IVFFlat
                let index_scan = node as *mut pg_sys::IndexScan;
                let index_oid = (*index_scan).indexid;

                if Self::is_hnsw_index(index_oid) {
                    OperatorType::HnswScan
                } else if Self::is_ivfflat_index(index_oid) {
                    OperatorType::IvfFlatScan
                } else {
                    OperatorType::IndexScan
                }
            }
            pg_sys::NodeTag::T_IndexOnlyScan => OperatorType::IndexOnlyScan,
            pg_sys::NodeTag::T_BitmapHeapScan => OperatorType::BitmapScan,
            pg_sys::NodeTag::T_NestLoop => OperatorType::NestedLoop,
            pg_sys::NodeTag::T_HashJoin => OperatorType::HashJoin,
            pg_sys::NodeTag::T_MergeJoin => OperatorType::MergeJoin,
            pg_sys::NodeTag::T_Agg => {
                let agg = node as *mut pg_sys::Agg;
                match (*agg).aggstrategy {
                    pg_sys::AggStrategy::AGG_HASHED => OperatorType::HashAggregate,
                    pg_sys::AggStrategy::AGG_SORTED => OperatorType::GroupAggregate,
                    _ => OperatorType::Aggregate,
                }
            }
            pg_sys::NodeTag::T_Sort => OperatorType::Sort,
            pg_sys::NodeTag::T_IncrementalSort => OperatorType::IncrementalSort,
            pg_sys::NodeTag::T_Limit => OperatorType::Limit,
            pg_sys::NodeTag::T_Unique => OperatorType::Unique,
            pg_sys::NodeTag::T_Append => OperatorType::Append,
            pg_sys::NodeTag::T_MergeAppend => OperatorType::MergeAppend,
            pg_sys::NodeTag::T_Gather => OperatorType::Gather,
            pg_sys::NodeTag::T_GatherMerge => OperatorType::GatherMerge,
            pg_sys::NodeTag::T_WindowAgg => OperatorType::WindowAgg,
            pg_sys::NodeTag::T_SubqueryScan => OperatorType::SubqueryScan,
            pg_sys::NodeTag::T_CteScan => OperatorType::CteScan,
            pg_sys::NodeTag::T_Material => OperatorType::MaterializeNode,
            pg_sys::NodeTag::T_Result => OperatorType::Result,
            _ => OperatorType::Filter, // Default
        }
    }
}
```
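
The tag dispatch in `pg_node_to_op_type` can be exercised outside PostgreSQL with plain string tags. A hypothetical, dependency-free sketch (the enum and function names here are illustrative stand-ins, not the extension's types):

```rust
#[derive(Debug, PartialEq)]
enum OpKind {
    SeqScan,
    IndexScan,
    HashJoin,
    Sort,
    Other,
}

/// Simplified stand-in for `pg_node_to_op_type`, keyed on tag names
/// instead of `pg_sys::NodeTag` values.
fn tag_to_op(tag: &str) -> OpKind {
    match tag {
        "T_SeqScan" => OpKind::SeqScan,
        "T_IndexScan" => OpKind::IndexScan,
        "T_HashJoin" => OpKind::HashJoin,
        "T_Sort" => OpKind::Sort,
        // Unknown tags fall into a default bucket, like the
        // `OperatorType::Filter` fallback above.
        _ => OpKind::Other,
    }
}

fn main() {
    assert_eq!(tag_to_op("T_HashJoin"), OpKind::HashJoin);
    assert_eq!(tag_to_op("T_Material"), OpKind::Other);
}
```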

## Plan Embedding Computation

### Hierarchical Aggregation

```rust
impl NeuralDagPlan {
    /// Generate plan-level embedding from operator embeddings
    pub fn generate_plan_embedding(&mut self) {
        let dim = self.operator_embeddings[0].len();
        let mut plan_embedding = vec![0.0; dim];

        // Method 1: Weighted sum by depth (deeper = lower weight)
        for (i, op) in self.operators.iter().enumerate() {
            let depth_weight = 1.0 / (op.depth as f32 + 1.0);
            let cost_weight = (op.estimated_cost / self.total_cost()).min(1.0) as f32;
            let weight = depth_weight * 0.5 + cost_weight * 0.5;

            for (j, &val) in self.operator_embeddings[i].iter().enumerate() {
                plan_embedding[j] += weight * val;
            }
        }

        // L2 normalize
        let norm: f32 = plan_embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 1e-8 {
            for x in &mut plan_embedding {
                *x /= norm;
            }
        }

        self.plan_embedding = Some(plan_embedding);
    }

    /// Generate embedding using attention over operators
    pub fn generate_plan_embedding_with_attention(&mut self, attention: &dyn DagAttention) {
        // Use root operator as query
        let root_embedding = &self.operator_embeddings[0];

        // Build context from all operators
        let ctx = self.build_dag_context();

        // Compute attention weights
        let query_node = DagNode {
            id: self.root.id,
            embedding: root_embedding.clone(),
        };

        let output = attention.forward(&query_node, &ctx, &AttentionConfig::default())
            .expect("Attention computation failed");

        // Store attention weights
        self.attention_weights = vec![output.weights.clone()];

        // Use aggregated output as plan embedding
        self.plan_embedding = Some(output.aggregated);
    }

    fn build_dag_context(&self) -> DagContext {
        DagContext {
            nodes: self.operators.iter()
                .map(|op| DagNode {
                    id: op.id,
                    embedding: op.embedding.clone(),
                })
                .collect(),
            edges: self.edges.clone(),
            reverse_edges: self.reverse_edges.iter()
                .map(|(&child, &parent)| (child, vec![parent]))
                .collect(),
            depths: self.operators.iter()
                .map(|op| (op.id, op.depth))
                .collect(),
            timestamps: None,
            criticalities: self.criticalities.as_ref().map(|c| {
                self.operators.iter()
                    .enumerate()
                    .map(|(i, op)| (op.id, c[i]))
                    .collect()
            }),
        }
    }
}
```
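
The depth/cost-weighted sum in `generate_plan_embedding` can be isolated into a small free function. A minimal sketch, assuming the weighting scheme above (0.5 × depth weight + 0.5 × clamped cost share, then L2 normalization); names are illustrative:

```rust
/// Depth/cost-weighted aggregation of operator embeddings into a
/// single unit-length plan embedding, as in `generate_plan_embedding`.
fn aggregate(embeddings: &[Vec<f32>], depths: &[usize], costs: &[f64], total_cost: f64) -> Vec<f32> {
    let dim = embeddings[0].len();
    let mut out = vec![0.0_f32; dim];
    for (i, emb) in embeddings.iter().enumerate() {
        let depth_w = 1.0 / (depths[i] as f32 + 1.0);
        let cost_w = (costs[i] / total_cost).min(1.0) as f32;
        let w = 0.5 * depth_w + 0.5 * cost_w;
        for (j, &v) in emb.iter().enumerate() {
            out[j] += w * v;
        }
    }
    // L2 normalize, guarding against the all-zero case
    let norm: f32 = out.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 1e-8 {
        for x in &mut out {
            *x /= norm;
        }
    }
    out
}

fn main() {
    let embs = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let out = aggregate(&embs, &[0, 1], &[50.0, 50.0], 100.0);
    // The result is unit-length regardless of the input weights.
    let norm: f32 = out.iter().map(|x| x * x).sum::<f32>().sqrt();
    assert!((norm - 1.0).abs() < 1e-5);
    // The shallower (depth 0) operator dominates the direction.
    assert!(out[0] > out[1]);
}
```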

## Plan Optimization

### Learned Cost Adjustment

```rust
impl NeuralDagPlan {
    /// Apply learned cost adjustments
    pub fn apply_learned_costs(&mut self) {
        if let Some(ref learned_costs) = self.learned_costs {
            for (i, op) in self.operators.iter_mut().enumerate() {
                if i < learned_costs.len() {
                    // Adjust estimated cost by learned factor
                    let adjustment = learned_costs[i];
                    op.estimated_cost *= (1.0 + adjustment) as f64;
                }
            }
        }
    }

    /// Reorder operators based on learned pattern
    pub fn reorder_operators(&mut self, optimal_ordering: &[OperatorId]) {
        // Only reorder within commutative operators (e.g., join order)
        let join_ops: Vec<_> = self.operators.iter()
            .filter(|op| matches!(op.op_type,
                OperatorType::HashJoin |
                OperatorType::MergeJoin |
                OperatorType::NestedLoop))
            .map(|op| op.id)
            .collect();

        if join_ops.len() < 2 {
            return; // Nothing to reorder
        }

        // Apply learned ordering.
        // This is a simplified version - a real implementation needs
        // to preserve DAG constraints.
        for (i, &target_id) in optimal_ordering.iter().enumerate() {
            if i < join_ops.len() {
                // Swap join operators to match target ordering
                // (preserving child relationships)
            }
        }
    }

    /// Apply learned execution parameters
    pub fn apply_params(&mut self, params: &ExecutionParams) {
        self.params = params.clone();

        // Apply to relevant operators
        for op in &mut self.operators {
            match op.op_type {
                OperatorType::HnswScan => {
                    if let Some(ef) = params.ef_search {
                        op.embedding[160] = ef as f32 / 100.0; // Encode in embedding
                    }
                }
                OperatorType::IvfFlatScan => {
                    if let Some(probes) = params.probes {
                        op.embedding[161] = probes as f32 / 50.0;
                    }
                }
                _ => {}
            }
        }
    }
}
```
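
The multiplicative adjustment in `apply_learned_costs` is simple enough to pin down with a worked example: cost' = cost × (1 + adjustment), where adjustment may be negative. A minimal sketch with illustrative names:

```rust
/// Multiplicative learned-cost adjustment, as in `apply_learned_costs`:
/// cost' = cost * (1 + adjustment). Extra costs without a matching
/// adjustment are left unchanged.
fn adjust_costs(costs: &mut [f64], adjustments: &[f32]) {
    for (i, c) in costs.iter_mut().enumerate() {
        if i < adjustments.len() {
            *c *= 1.0 + adjustments[i] as f64;
        }
    }
}

fn main() {
    let mut costs = vec![100.0, 200.0, 300.0];
    // +50% on the first operator, -25% on the second, none for the third.
    adjust_costs(&mut costs, &[0.5, -0.25]);
    assert!((costs[0] - 150.0).abs() < 1e-9);
    assert!((costs[1] - 150.0).abs() < 1e-9);
    assert!((costs[2] - 300.0).abs() < 1e-9);
}
```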

### Critical Path Analysis

```rust
impl NeuralDagPlan {
    /// Compute critical path through the plan DAG
    pub fn compute_critical_path(&mut self) {
        // Dynamic programming: longest path
        let mut longest_to: HashMap<OperatorId, f64> = HashMap::new();
        let mut longest_from: HashMap<OperatorId, f64> = HashMap::new();
        let mut predecessor: HashMap<OperatorId, OperatorId> = HashMap::new();

        // Forward pass (leaves to root) - longest path TO each node
        for op in self.operators.iter().rev() { // Reverse topo order
            let mut max_cost = 0.0;
            let mut max_pred = None;

            if let Some(children) = self.edges.get(&op.id) {
                for &child_id in children {
                    let child_cost = longest_to.get(&child_id).unwrap_or(&0.0);
                    if *child_cost > max_cost {
                        max_cost = *child_cost;
                        max_pred = Some(child_id);
                    }
                }
            }

            longest_to.insert(op.id, max_cost + op.estimated_cost);
            if let Some(pred) = max_pred {
                predecessor.insert(op.id, pred);
            }
        }

        // Backward pass (root to leaves) - longest path FROM each node
        for op in &self.operators {
            let mut max_cost = 0.0;

            if let Some(&parent_id) = self.reverse_edges.get(&op.id) {
                let parent_cost = longest_from.get(&parent_id).unwrap_or(&0.0);
                max_cost = max_cost.max(*parent_cost + self.get_operator(parent_id).estimated_cost);
            }

            longest_from.insert(op.id, max_cost);
        }

        // Find critical path
        let global_longest = longest_to.values().cloned().fold(0.0, f64::max);

        let mut critical_path = Vec::new();
        for op in &self.operators {
            let total_through = longest_to[&op.id] + longest_from[&op.id];
            if (total_through - global_longest).abs() < 1e-6 {
                critical_path.push(op.id);
            }
        }

        // Mark operators
        for op in &mut self.operators {
            op.is_critical = critical_path.contains(&op.id);
        }

        self.critical_path = Some(critical_path);
    }

    /// Compute bottleneck score (0.0 - 1.0)
    pub fn compute_bottleneck_score(&mut self) {
        if let Some(ref critical_path) = self.critical_path {
            if critical_path.is_empty() {
                self.bottleneck_score = Some(0.0);
                return;
            }

            // Bottleneck = max(single_op_cost / total_cost)
            let total_cost = self.total_cost();
            let max_single = critical_path.iter()
                .map(|&id| self.get_operator(id).estimated_cost)
                .fold(0.0, f64::max);

            self.bottleneck_score = Some((max_single / total_cost) as f32);
        }
    }
}
```
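
The forward pass above is a standard longest-path dynamic program over a DAG in reverse topological order. A self-contained sketch of just that pass on a toy three-node plan (names and the `u32` node ids are illustrative):

```rust
use std::collections::HashMap;

/// Longest (most expensive) path from each node down to a leaf,
/// computed leaves-first -- the forward pass of `compute_critical_path`.
fn longest_to(
    order: &[u32],                       // topological order, root first
    edges: &HashMap<u32, Vec<u32>>,      // parent -> children
    cost: &HashMap<u32, f64>,
) -> HashMap<u32, f64> {
    let mut longest: HashMap<u32, f64> = HashMap::new();
    // Iterate in reverse topological order so children are done first.
    for &node in order.iter().rev() {
        let child_max = edges
            .get(&node)
            .map(|cs| cs.iter().map(|c| longest[c]).fold(0.0, f64::max))
            .unwrap_or(0.0);
        longest.insert(node, child_max + cost[&node]);
    }
    longest
}

fn main() {
    // Root 0 with children 1 and 2; costs 1, 10, 3.
    let edges: HashMap<u32, Vec<u32>> = [(0, vec![1, 2])].into_iter().collect();
    let cost: HashMap<u32, f64> =
        [(0, 1.0), (1, 10.0), (2, 3.0)].into_iter().collect();
    let l = longest_to(&[0, 1, 2], &edges, &cost);
    // Root's longest path = its own cost + most expensive child (node 1).
    assert_eq!(l[&0], 11.0);
    assert_eq!(l[&2], 3.0);
}
```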

## Learning Target: Plan Quality

### Quality Computation

```rust
/// Compute quality score for a plan execution
pub fn compute_plan_quality(plan: &NeuralDagPlan, metrics: &ExecutionMetrics) -> f32 {
    // Multi-objective quality function

    // 1. Latency score (lower is better)
    // Target: 10ms for simple queries, 1s for complex
    let complexity = plan.operators.len() as f32;
    let target_latency_us = 10000.0 * complexity.sqrt();
    let latency_score = (target_latency_us / (metrics.latency_us as f32 + 1.0)).min(1.0);

    // 2. Accuracy score (for vector queries),
    // if we have relevance feedback
    let accuracy_score = if let Some(precision) = metrics.precision {
        precision
    } else {
        1.0 // Assume accurate if no feedback
    };

    // 3. Efficiency score (rows per microsecond)
    let efficiency_score = if metrics.latency_us > 0 {
        (metrics.rows_processed as f32 / metrics.latency_us as f32 * 1000.0).min(1.0)
    } else {
        1.0
    };

    // 4. Memory score (lower is better)
    let target_memory = 10_000_000.0 * complexity; // 10MB per operator
    let memory_score = (target_memory / (metrics.memory_bytes as f32 + 1.0)).min(1.0);

    // 5. Cache efficiency
    let cache_score = metrics.cache_hit_rate;

    // Weighted combination
    let weights = [0.35, 0.25, 0.15, 0.15, 0.10];
    let scores = [latency_score, accuracy_score, efficiency_score, memory_score, cache_score];

    weights.iter().zip(scores.iter())
        .map(|(w, s)| w * s)
        .sum()
}
```
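
Since the five weights sum to 1.0, the combined quality is a convex combination of the per-objective scores: all-perfect scores give exactly 1.0, and zeroing every objective but one leaves that objective's weight. A minimal check of the final weighted combination (illustrative function name):

```rust
/// Weighted multi-objective combination used by `compute_plan_quality`:
/// [latency, accuracy, efficiency, memory, cache] scores.
fn combine(scores: [f32; 5]) -> f32 {
    let weights = [0.35, 0.25, 0.15, 0.15, 0.10];
    weights.iter().zip(scores.iter()).map(|(w, s)| w * s).sum()
}

fn main() {
    // All objectives perfect -> quality 1.0 (weights sum to 1).
    assert!((combine([1.0; 5]) - 1.0).abs() < 1e-6);
    // Only latency perfect -> quality equals the latency weight.
    assert!((combine([1.0, 0.0, 0.0, 0.0, 0.0]) - 0.35).abs() < 1e-6);
}
```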

### Gradient Estimation

```rust
impl DagTrajectory {
    /// Estimate gradient for REINFORCE-style learning
    pub fn estimate_gradient(&self) -> Vec<f32> {
        let dim = self.plan_embedding.len();
        let mut gradient = vec![0.0; dim];

        // REINFORCE with baseline
        let baseline = 0.5; // Could be learned
        let advantage = self.quality - baseline;

        // gradient += advantage * activation
        // Simplified: use the plan embedding as the "activation"
        for (i, &val) in self.plan_embedding.iter().enumerate() {
            gradient[i] = advantage * val;
        }

        // Also incorporate operator-level signals
        let uniform = 1.0 / self.operator_embeddings.len() as f32;
        for (op_idx, op_embedding) in self.operator_embeddings.iter().enumerate() {
            // Weight by attention, falling back to a uniform weight
            let attention_weight = self.attention_weights.first()
                .and_then(|w| w.get(op_idx).copied())
                .unwrap_or(uniform);

            for (i, &val) in op_embedding.iter().enumerate() {
                if i < dim {
                    gradient[i] += advantage * val * attention_weight * 0.5;
                }
            }
        }

        // L2 normalize
        let norm: f32 = gradient.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 1e-8 {
            for x in &mut gradient {
                *x /= norm;
            }
        }

        gradient
    }
}
```
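
The plan-level term of the estimator is just advantage × embedding, normalized: above the baseline the gradient points along the embedding, below it points against. A minimal sketch of that term in isolation (names are illustrative):

```rust
/// REINFORCE-with-baseline gradient direction: advantage * embedding,
/// L2-normalized -- the plan-level term of `estimate_gradient`.
fn reinforce_grad(embedding: &[f32], quality: f32, baseline: f32) -> Vec<f32> {
    let advantage = quality - baseline;
    let mut g: Vec<f32> = embedding.iter().map(|&v| advantage * v).collect();
    let norm: f32 = g.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 1e-8 {
        for x in &mut g {
            *x /= norm;
        }
    }
    g
}

fn main() {
    // Quality above baseline: gradient points along the embedding.
    let g = reinforce_grad(&[1.0, 0.0], 0.9, 0.5);
    assert!(g[0] > 0.0);
    // Quality below baseline: gradient points against it.
    let g = reinforce_grad(&[1.0, 0.0], 0.1, 0.5);
    assert!(g[0] < 0.0);
}
```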

## Integration with PostgreSQL Planner

### Plan Modification Points

```rust
/// Points where the neural DAG can influence planning
pub enum PlanModificationPoint {
    /// Before any planning
    PrePlanning,

    /// After join enumeration, before selecting best join order
    JoinOrdering,

    /// After creating base plan, before optimization
    PreOptimization,

    /// After optimization, before execution
    PostOptimization,

    /// During execution (adaptive)
    DuringExecution,
}

impl NeuralDagPlan {
    /// Apply neural modifications at the specified point
    pub fn apply_modifications(&mut self, point: PlanModificationPoint, engine: &DagSonaEngine) {
        match point {
            PlanModificationPoint::PrePlanning => {
                // Hint optimal parameters based on query pattern
                self.apply_pre_planning_hints(engine);
            }

            PlanModificationPoint::JoinOrdering => {
                // Suggest optimal join order
                if let Some(ordering) = engine.suggest_join_order(&self.plan_embedding) {
                    self.reorder_operators(&ordering);
                }
            }

            PlanModificationPoint::PreOptimization => {
                // Adjust cost estimates
                if let Some(costs) = engine.predict_costs(&self.plan_embedding) {
                    self.learned_costs = Some(costs);
                    self.apply_learned_costs();
                }
            }

            PlanModificationPoint::PostOptimization => {
                // Final parameter tuning
                if let Some(params) = engine.suggest_params(&self.plan_embedding) {
                    self.apply_params(&params);
                }
            }

            PlanModificationPoint::DuringExecution => {
                // Adaptive re-planning (future work)
            }
        }
    }
}
```

## Serialization

### Plan Persistence

```rust
impl NeuralDagPlan {
    /// Serialize plan for storage
    pub fn to_bytes(&self) -> Vec<u8> {
        bincode::serialize(self).expect("Serialization failed")
    }

    /// Deserialize plan
    pub fn from_bytes(bytes: &[u8]) -> Result<Self, bincode::Error> {
        bincode::deserialize(bytes)
    }

    /// Export to JSON for debugging
    pub fn to_json(&self) -> serde_json::Value {
        json!({
            "plan_id": self.plan_id,
            "operators": self.operators.iter().map(|op| json!({
                "id": op.id,
                "type": format!("{:?}", op.op_type),
                "table": op.table_name,
                "estimated_rows": op.estimated_rows,
                "estimated_cost": op.estimated_cost,
                "depth": op.depth,
                "is_critical": op.is_critical,
                "criticality": op.criticality,
            })).collect::<Vec<_>>(),
            "edges": self.edges,
            "attention_type": format!("{:?}", self.attention_type),
            "bottleneck_score": self.bottleneck_score,
            "params": {
                "ef_search": self.params.ef_search,
                "probes": self.params.probes,
                "parallelism": self.params.parallelism,
            }
        })
    }
}
```

## Performance Considerations

| Operation | Complexity | Target Latency |
|-----------|------------|----------------|
| Plan conversion | O(n) | <1ms |
| Embedding generation | O(n × d) | <500μs |
| Plan embedding | O(n × d) | <200μs |
| Critical path | O(n²) | <1ms |
| MinCut criticality | O(n^0.12) | <10ms |
| Pattern matching | O(k × d) | <1ms |

Where n = operators, d = embedding dimension (256), k = patterns (100).
667
docs/dag/06-MINCUT-OPTIMIZATION.md
Normal file
@@ -0,0 +1,667 @@

# MinCut Optimization Specification

## Overview

This document specifies how the subpolynomial O(n^0.12) min-cut algorithm from `ruvector-mincut` integrates with the Neural DAG system for bottleneck detection and optimization.

## MinCut Integration Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                         MINCUT OPTIMIZATION LAYER                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                   SUBPOLYNOMIAL MINCUT ENGINE                       │    │
│  │   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                 │    │
│  │   │ Hierarchical│  │  LocalKCut  │  │   LinkCut   │                 │    │
│  │   │Decomposition│  │   Oracle    │  │    Tree     │                 │    │
│  │   └─────────────┘  └─────────────┘  └─────────────┘                 │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                    │                                        │
│  ┌─────────────────────────────────┴───────────────────────────────────┐    │
│  │                     DAG CRITICALITY ANALYZER                        │    │
│  │   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                 │    │
│  │   │  Operator   │  │ Bottleneck  │  │  Critical   │                 │    │
│  │   │ Criticality │  │  Detection  │  │    Path     │                 │    │
│  │   └─────────────┘  └─────────────┘  └─────────────┘                 │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                    │                                        │
│  ┌─────────────────────────────────┴───────────────────────────────────┐    │
│  │                        OPTIMIZATION ACTIONS                         │    │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐  │    │
│  │  │   Gated     │  │ Redundancy  │  │  Parallel   │  │   Self-    │  │    │
│  │  │  Attention  │  │  Injection  │  │  Expansion  │  │  Healing   │  │    │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘  │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## DAG MinCut Engine

### Core Structure

```rust
/// MinCut engine adapted for query plan DAGs
pub struct DagMinCutEngine {
    /// Subpolynomial min-cut algorithm
    mincut: SubpolynomialMinCut,

    /// Graph representation of current DAG
    graph: DynamicGraph,

    /// Cached criticality scores
    criticality_cache: DashMap<OperatorId, f32>,

    /// Configuration
    config: MinCutConfig,

    /// Metrics
    metrics: MinCutMetrics,
}

#[derive(Clone, Debug)]
pub struct MinCutConfig {
    /// Enable/disable mincut analysis
    pub enabled: bool,

    /// Criticality threshold for bottleneck detection
    pub bottleneck_threshold: f32,

    /// Maximum operators to analyze
    pub max_operators: usize,

    /// Cache TTL in seconds
    pub cache_ttl_secs: u64,

    /// Enable self-healing
    pub self_healing_enabled: bool,

    /// Healing check interval
    pub healing_interval_ms: u64,
}

impl Default for MinCutConfig {
    fn default() -> Self {
        Self {
            enabled: true,
            bottleneck_threshold: 0.5,
            max_operators: 1000,
            cache_ttl_secs: 300,
            self_healing_enabled: true,
            healing_interval_ms: 300000, // 5 minutes
        }
    }
}
```

### Graph Construction from DAG

```rust
impl DagMinCutEngine {
    /// Build graph from query plan DAG
    pub fn build_from_plan(&mut self, plan: &NeuralDagPlan) {
        self.graph.clear();

        // Add vertices (operators)
        for op in &plan.operators {
            let weight = self.operator_weight(op);
            self.graph.add_vertex(op.id, weight);
        }

        // Add edges (data flow)
        for (&parent_id, children) in &plan.edges {
            for &child_id in children {
                // Edge weight = data volume estimate
                let parent_op = plan.get_operator(parent_id);
                let weight = parent_op.estimated_rows as f64;
                self.graph.add_edge(parent_id, child_id, weight);
            }
        }

        // Initialize min-cut structure
        self.mincut.initialize(&self.graph);
    }

    /// Compute operator weight for min-cut
    fn operator_weight(&self, op: &OperatorNode) -> f64 {
        // Weight based on:
        // 1. Estimated cost (primary)
        // 2. Blocking nature (pipeline breakers are heavier)
        // 3. Parallelizability (less parallelizable = heavier)

        let base_weight = op.estimated_cost;

        let blocking_factor = if op.is_pipeline_breaker() {
            2.0
        } else {
            1.0
        };

        let parallel_factor = match op.op_type {
            OperatorType::Sort | OperatorType::Aggregate => 1.5,
            OperatorType::HashJoin => 1.2,
            _ => 1.0,
        };

        base_weight * blocking_factor * parallel_factor
    }
}
```
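
The weight formula in `operator_weight` multiplies three factors: base cost, a 2x penalty for pipeline breakers, and an operator-specific parallelizability factor. A minimal standalone sketch (the function name is illustrative):

```rust
/// Operator weight for min-cut: base cost scaled by blocking and
/// parallelizability factors, mirroring `operator_weight`.
fn weight(cost: f64, pipeline_breaker: bool, parallel_factor: f64) -> f64 {
    let blocking = if pipeline_breaker { 2.0 } else { 1.0 };
    cost * blocking * parallel_factor
}

fn main() {
    // A Sort (blocking, parallel factor 1.5) is 3x heavier than its raw cost.
    assert_eq!(weight(100.0, true, 1.5), 300.0);
    // A plain scan (non-blocking, factor 1.0) keeps its base cost.
    assert_eq!(weight(100.0, false, 1.0), 100.0);
}
```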

## Criticality Computation

### Operator Criticality

```rust
impl DagMinCutEngine {
    /// Compute criticality for all operators
    pub fn compute_all_criticalities(&self, plan: &NeuralDagPlan) -> HashMap<OperatorId, f32> {
        let global_cut = self.mincut.query();
        let mut criticalities = HashMap::new();

        for op in &plan.operators {
            let criticality = self.compute_operator_criticality(op.id, global_cut);
            criticalities.insert(op.id, criticality);
        }

        criticalities
    }

    /// Compute criticality for a single operator.
    /// Criticality = how much removing this operator would reduce the min-cut.
    pub fn compute_operator_criticality(&self, op_id: OperatorId, global_cut: u64) -> f32 {
        // Check cache first
        if let Some(cached) = self.criticality_cache.get(&op_id) {
            return *cached;
        }

        // Use LocalKCut oracle
        let query = LocalKCutQuery {
            seed_vertices: vec![op_id],
            budget_k: global_cut,
            radius: 3, // Local neighborhood
        };

        let criticality = match self.mincut.local_query(query) {
            LocalKCutResult::Found { cut_value, .. } => {
                // Criticality = (global - local) / global
                if global_cut > 0 {
                    (global_cut - cut_value) as f32 / global_cut as f32
                } else {
                    0.0
                }
            }
            LocalKCutResult::NoneInLocality => 0.0,
        };

        // Cache result
        self.criticality_cache.insert(op_id, criticality);

        criticality
    }

    /// Identify bottleneck operators
    pub fn identify_bottlenecks(&self, plan: &NeuralDagPlan) -> Vec<BottleneckInfo> {
        let criticalities = self.compute_all_criticalities(plan);

        let mut bottlenecks: Vec<_> = criticalities.iter()
            .filter(|(_, &crit)| crit > self.config.bottleneck_threshold)
            .map(|(&op_id, &crit)| {
                let op = plan.get_operator(op_id);
                BottleneckInfo {
                    operator_id: op_id,
                    operator_type: op.op_type.clone(),
                    criticality: crit,
                    estimated_cost: op.estimated_cost,
                    recommendation: self.generate_recommendation(op, crit),
                }
            })
            .collect();

        // Sort by criticality (most critical first)
        bottlenecks.sort_by(|a, b| b.criticality.partial_cmp(&a.criticality).unwrap());

        bottlenecks
    }

    /// Generate optimization recommendation for a bottleneck
    fn generate_recommendation(&self, op: &OperatorNode, criticality: f32) -> OptimizationRecommendation {
        match op.op_type {
            OperatorType::SeqScan => {
                OptimizationRecommendation::CreateIndex {
                    table: op.table_name.clone().unwrap_or_default(),
                    columns: op.filter.as_ref()
                        .map(|f| f.columns())
                        .unwrap_or_default(),
                }
            }

            OperatorType::HnswScan | OperatorType::IvfFlatScan => {
                if criticality > 0.8 {
                    OptimizationRecommendation::IncreaseEfSearch {
                        current: 40, // Would be extracted from plan
                        recommended: 80,
                    }
                } else {
                    OptimizationRecommendation::None
                }
            }

            OperatorType::NestedLoop => {
                OptimizationRecommendation::ConsiderHashJoin {
                    estimated_improvement: criticality * 50.0,
                }
            }

            OperatorType::Sort => {
                if op.estimated_rows > 100000.0 {
                    OptimizationRecommendation::AddSortIndex {
                        columns: op.projection.clone(),
                    }
                } else {
                    OptimizationRecommendation::None
                }
            }

            OperatorType::HashAggregate if op.estimated_rows > 1000000.0 => {
                OptimizationRecommendation::ConsiderPartitioning {
                    partition_key: op.projection.first().cloned(),
                }
            }

            _ => OptimizationRecommendation::None,
        }
    }
}

/// Information about a bottleneck
#[derive(Clone, Debug)]
pub struct BottleneckInfo {
    pub operator_id: OperatorId,
    pub operator_type: OperatorType,
    pub criticality: f32,
    pub estimated_cost: f64,
    pub recommendation: OptimizationRecommendation,
}

/// Optimization recommendations
#[derive(Clone, Debug)]
pub enum OptimizationRecommendation {
    None,
    CreateIndex { table: String, columns: Vec<String> },
    IncreaseEfSearch { current: usize, recommended: usize },
    ConsiderHashJoin { estimated_improvement: f32 },
    AddSortIndex { columns: Vec<String> },
    ConsiderPartitioning { partition_key: Option<String> },
    AddParallelism { recommended_workers: usize },
    MaterializeSubquery { subquery_id: OperatorId },
}
```
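
The criticality formula is the relative drop in cut value: (global − local) / global, with a guard for a zero global cut. A minimal standalone check (illustrative name; the real code obtains `local_cut` from the LocalKCut oracle):

```rust
/// Criticality of an operator as computed above:
/// (global - local) / global, clamped to 0 when the global cut is 0.
fn criticality(global_cut: u64, local_cut: u64) -> f32 {
    if global_cut > 0 {
        (global_cut - local_cut) as f32 / global_cut as f32
    } else {
        0.0
    }
}

fn main() {
    // A local cut of 25 against a global cut of 100: the operator's
    // neighborhood accounts for 75% of the global cut value.
    assert_eq!(criticality(100, 25), 0.75);
    // Degenerate graph with zero cut: criticality is defined as 0.
    assert_eq!(criticality(0, 0), 0.0);
}
```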

## MinCut Gated Attention Integration

### Gating Mechanism

```rust
impl DagMinCutEngine {
    /// Compute attention gates based on criticality
    pub fn compute_attention_gates(
        &self,
        plan: &NeuralDagPlan,
    ) -> Vec<f32> {
        let criticalities = self.compute_all_criticalities(plan);

        plan.operators.iter()
            .map(|op| {
                let crit = criticalities.get(&op.id).unwrap_or(&0.0);

                if *crit > self.config.bottleneck_threshold {
                    1.0 // Full attention for bottlenecks
                } else {
                    crit / self.config.bottleneck_threshold // Scaled
                }
            })
            .collect()
    }

    /// Apply gating to attention weights
    pub fn gate_attention_weights(
        &self,
        weights: &[f32],
        gates: &[f32],
    ) -> Vec<f32> {
        assert_eq!(weights.len(), gates.len());

        let gated: Vec<f32> = weights.iter()
            .zip(gates.iter())
            .map(|(w, g)| w * g)
            .collect();

        // Renormalize
        let sum: f32 = gated.iter().sum();
        if sum > 1e-8 {
            gated.iter().map(|w| w / sum).collect()
        } else {
            vec![1.0 / weights.len() as f32; weights.len()]
        }
    }
}
```
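
The gating step is elementwise multiplication followed by renormalization to a probability distribution, with a uniform fallback when every gate is closed. A self-contained sketch of the same shape as `gate_attention_weights` (free function, illustrative name):

```rust
/// Gate attention weights by per-operator gates, then renormalize to
/// sum to 1; fall back to uniform attention when all gates are closed.
fn gate(weights: &[f32], gates: &[f32]) -> Vec<f32> {
    assert_eq!(weights.len(), gates.len());
    let gated: Vec<f32> = weights.iter().zip(gates).map(|(w, g)| w * g).collect();
    let sum: f32 = gated.iter().sum();
    if sum > 1e-8 {
        gated.iter().map(|w| w / sum).collect()
    } else {
        vec![1.0 / weights.len() as f32; weights.len()]
    }
}

fn main() {
    // A fully closed gate removes an operator; the rest renormalize.
    let out = gate(&[0.5, 0.5], &[1.0, 0.0]);
    assert!((out[0] - 1.0).abs() < 1e-6 && out[1] == 0.0);
    // All gates closed: fall back to uniform attention.
    let out = gate(&[0.5, 0.5], &[0.0, 0.0]);
    assert!((out[0] - 0.5).abs() < 1e-6);
}
```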

## Dynamic Updates

### Incremental MinCut Maintenance

```rust
impl DagMinCutEngine {
    /// Handle operator cost update (O(n^0.12) amortized)
    pub fn update_operator_cost(&mut self, op_id: OperatorId, new_cost: f64) {
        let old_weight = self.graph.get_vertex_weight(op_id);
        let new_weight = new_cost * self.get_operator_factors(op_id);

        // Update graph
        self.graph.update_vertex_weight(op_id, new_weight);

        // Incremental min-cut update;
        // the subpolynomial algorithm handles this efficiently
        self.mincut.on_vertex_weight_change(op_id, old_weight, new_weight);

        // Invalidate cache for affected operators
        self.invalidate_local_cache(op_id);
    }

    /// Handle edge addition (e.g., plan change)
    pub fn add_edge(&mut self, from: OperatorId, to: OperatorId, weight: f64) {
        self.graph.add_edge(from, to, weight);
        self.mincut.insert_edge(from, to);
        self.invalidate_local_cache(from);
        self.invalidate_local_cache(to);
    }

    /// Handle edge removal
    pub fn remove_edge(&mut self, from: OperatorId, to: OperatorId) {
        self.graph.remove_edge(from, to);
        self.mincut.delete_edge(from, to);
        self.invalidate_local_cache(from);
        self.invalidate_local_cache(to);
    }

    /// Invalidate cache for operator and neighbors
    fn invalidate_local_cache(&self, op_id: OperatorId) {
        self.criticality_cache.remove(&op_id);

        // Also invalidate neighbors (within radius 3)
        let neighbors = self.graph.get_neighbors_within_radius(op_id, 3);
        for neighbor in neighbors {
            self.criticality_cache.remove(&neighbor);
        }
    }
}
```

## Self-Healing Integration

### Bottleneck Detection Loop

```rust
impl DagMinCutEngine {
    /// Background bottleneck detection
    pub fn run_health_check(&self, plan: &NeuralDagPlan) -> HealthCheckResult {
        let start = Instant::now();

        // Compute global min-cut
        let global_cut = self.mincut.query();

        // Identify bottlenecks
        let bottlenecks = self.identify_bottlenecks(plan);

        // Compute health score
        let health_score = self.compute_health_score(&bottlenecks);

        // Generate alerts if needed
        let alerts = self.generate_alerts(&bottlenecks);

        HealthCheckResult {
            global_mincut: global_cut,
            health_score,
            bottleneck_count: bottlenecks.len(),
            severe_bottlenecks: bottlenecks.iter()
                .filter(|b| b.criticality > 0.8)
                .count(),
            bottlenecks,
            alerts,
            duration: start.elapsed(),
        }
    }

    fn compute_health_score(&self, bottlenecks: &[BottleneckInfo]) -> f32 {
        if bottlenecks.is_empty() {
            return 1.0;
        }

        // Score decreases with bottleneck severity
        let max_criticality = bottlenecks.iter()
            .map(|b| b.criticality)
            .fold(0.0, f32::max);

        let avg_criticality = bottlenecks.iter()
            .map(|b| b.criticality)
            .sum::<f32>() / bottlenecks.len() as f32;

        1.0 - (max_criticality * 0.6 + avg_criticality * 0.4)
    }

    fn generate_alerts(&self, bottlenecks: &[BottleneckInfo]) -> Vec<Alert> {
        bottlenecks.iter()
            .filter(|b| b.criticality > 0.7)
            .map(|b| Alert {
                severity: if b.criticality > 0.9 {
                    AlertSeverity::Critical
                } else if b.criticality > 0.8 {
                    AlertSeverity::Warning
                } else {
                    AlertSeverity::Info
                },
                message: format!(
                    "Bottleneck detected: {:?} (criticality: {:.2})",
                    b.operator_type, b.criticality
                ),
                recommendation: b.recommendation.clone(),
            })
            .collect()
    }
}

#[derive(Clone, Debug)]
pub struct HealthCheckResult {
    pub global_mincut: u64,
    pub health_score: f32,
    pub bottleneck_count: usize,
    pub severe_bottlenecks: usize,
    pub bottlenecks: Vec<BottleneckInfo>,
    pub alerts: Vec<Alert>,
    pub duration: Duration,
}

#[derive(Clone, Debug)]
pub struct Alert {
    pub severity: AlertSeverity,
    pub message: String,
    pub recommendation: OptimizationRecommendation,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum AlertSeverity {
    Info,
    Warning,
    Critical,
}
```
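
The health score weights the single worst bottleneck at 60% and the average severity at 40%. A standalone version of the formula (a hypothetical free function mirroring `compute_health_score` above) makes the weighting concrete:

```rust
/// Standalone version of the health-score formula:
/// score = 1 - (0.6 * max_criticality + 0.4 * avg_criticality).
fn health_score(criticalities: &[f32]) -> f32 {
    if criticalities.is_empty() {
        return 1.0; // No bottlenecks: perfectly healthy.
    }
    let max = criticalities.iter().cloned().fold(0.0, f32::max);
    let avg = criticalities.iter().sum::<f32>() / criticalities.len() as f32;
    1.0 - (max * 0.6 + avg * 0.4)
}

fn main() {
    // One severe (1.0) and one moderate (0.5) bottleneck:
    // 1 - (1.0 * 0.6 + 0.75 * 0.4) = 0.1
    println!("{:.2}", health_score(&[1.0, 0.5]));
}
```

A single fully critical operator therefore drags the score below the 0.7 alert band even when all other operators are healthy, which is the intended bias toward worst-case behavior.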

## Redundancy Injection

### Bypass Path Creation

```rust
impl DagMinCutEngine {
    /// Suggest redundant paths to reduce bottleneck impact
    pub fn suggest_redundancy(&self, plan: &NeuralDagPlan) -> Vec<RedundancySuggestion> {
        let bottlenecks = self.identify_bottlenecks(plan);
        let mut suggestions = Vec::new();

        for bottleneck in &bottlenecks {
            if bottleneck.criticality > 0.7 {
                // Find alternative paths around this operator
                let alternatives = self.find_alternative_paths(
                    plan,
                    bottleneck.operator_id,
                );

                if let Some(alt) = alternatives.first() {
                    suggestions.push(RedundancySuggestion {
                        bottleneck_id: bottleneck.operator_id,
                        alternative_path: alt.clone(),
                        estimated_improvement: self.estimate_improvement(
                            bottleneck,
                            alt,
                        ),
                    });
                }
            }
        }

        suggestions
    }

    fn find_alternative_paths(
        &self,
        plan: &NeuralDagPlan,
        bottleneck_id: OperatorId,
    ) -> Vec<AlternativePath> {
        let mut alternatives = Vec::new();

        let bottleneck = plan.get_operator(bottleneck_id);

        match bottleneck.op_type {
            OperatorType::SeqScan => {
                // Alternative: index scan if a usable index exists
                if let Some(ref table) = bottleneck.table_name {
                    if let Some(index) = self.find_usable_index(table, &bottleneck.filter) {
                        alternatives.push(AlternativePath::UseIndex {
                            index_name: index,
                            estimated_speedup: 10.0,
                        });
                    }
                }
            }

            OperatorType::NestedLoop => {
                // Alternative: hash join
                alternatives.push(AlternativePath::ReplaceJoin {
                    new_join_type: OperatorType::HashJoin,
                    estimated_speedup: 5.0,
                });
            }

            OperatorType::Sort => {
                // Alternative: pre-sorted input via index
                alternatives.push(AlternativePath::SortedIndex {
                    columns: bottleneck.projection.clone(),
                    estimated_speedup: 3.0,
                });
            }

            _ => {}
        }

        alternatives
    }

    fn estimate_improvement(
        &self,
        bottleneck: &BottleneckInfo,
        alternative: &AlternativePath,
    ) -> f32 {
        let base_cost = bottleneck.estimated_cost;
        let speedup = alternative.estimated_speedup();

        let new_cost = base_cost / speedup;
        let improvement = (base_cost - new_cost) / base_cost;

        improvement as f32 * bottleneck.criticality
    }
}

#[derive(Clone, Debug)]
pub struct RedundancySuggestion {
    pub bottleneck_id: OperatorId,
    pub alternative_path: AlternativePath,
    pub estimated_improvement: f32,
}

#[derive(Clone, Debug)]
pub enum AlternativePath {
    UseIndex { index_name: String, estimated_speedup: f64 },
    ReplaceJoin { new_join_type: OperatorType, estimated_speedup: f64 },
    SortedIndex { columns: Vec<String>, estimated_speedup: f64 },
    Materialize { subquery_id: OperatorId, estimated_speedup: f64 },
    Parallelize { workers: usize, estimated_speedup: f64 },
}

impl AlternativePath {
    fn estimated_speedup(&self) -> f64 {
        match self {
            Self::UseIndex { estimated_speedup, .. } => *estimated_speedup,
            Self::ReplaceJoin { estimated_speedup, .. } => *estimated_speedup,
            Self::SortedIndex { estimated_speedup, .. } => *estimated_speedup,
            Self::Materialize { estimated_speedup, .. } => *estimated_speedup,
            Self::Parallelize { estimated_speedup, .. } => *estimated_speedup,
        }
    }
}
```
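
The `estimate_improvement` arithmetic reduces to a short closed form: the fractional cost reduction `1 - 1/speedup`, scaled by the operator's criticality. A standalone sketch (hypothetical free function, same math as the method above):

```rust
/// improvement = ((base - base/speedup) / base) * criticality
///             = (1 - 1/speedup) * criticality
fn weighted_improvement(base_cost: f64, speedup: f64, criticality: f32) -> f32 {
    let new_cost = base_cost / speedup;
    ((base_cost - new_cost) / base_cost) as f32 * criticality
}

fn main() {
    // Replacing a nested loop (5x speedup) on a 0.9-critical operator:
    // (1 - 1/5) * 0.9 = 0.72
    println!("{:.2}", weighted_improvement(100.0, 5.0, 0.9));
}
```

Scaling by criticality means a large local speedup on a non-critical operator still ranks low, which keeps suggestions focused on cuts that actually constrain the plan.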

## SQL Interface

```sql
-- Compute min-cut criticality for a plan
SELECT * FROM ruvector_dag_mincut_criticality('documents');

-- Get bottleneck analysis
SELECT * FROM ruvector_dag_bottlenecks('documents');

-- Get health check result
SELECT ruvector_dag_mincut_health('documents');

-- Get redundancy suggestions
SELECT * FROM ruvector_dag_redundancy_suggestions('documents');

-- Enable/disable min-cut analysis
SET ruvector.dag_mincut_enabled = true;
SET ruvector.dag_mincut_threshold = 0.5;
```

## Performance Characteristics

| Operation | Complexity | Typical Latency |
|-----------|------------|-----------------|
| Global min-cut query | O(1) | <1μs |
| Single criticality | O(n^0.12) | <100μs |
| All criticalities | O(n^1.12) | <10ms (100 ops) |
| Edge insert | O(n^0.12) amortized | <100μs |
| Edge delete | O(n^0.12) amortized | <100μs |
| Health check | O(n^1.12) | <50ms |

## Memory Usage

| Component | Size | Notes |
|-----------|------|-------|
| Graph structure | O(n + m) | Vertices + edges |
| Hierarchical decomposition | O(n log n) | Multi-level |
| LinkCut tree | O(n) | Sleator-Tarjan |
| Criticality cache | O(n) | Bounded by TTL |
| LocalKCut coloring | O(k² log n) | Per query |

**Typical overhead:** ~1MB per 1000 operators

1033
docs/dag/07-SELF-HEALING.md
Normal file
File diff suppressed because it is too large
739
docs/dag/08-QUDAG-INTEGRATION.md
Normal file
@@ -0,0 +1,739 @@

# QuDAG Integration Specification

## Overview

This document specifies the optional integration between the RuVector-Postgres Neural DAG system and QuDAG (Quantum-resistant Distributed DAG) for federated learning and distributed consensus on learned patterns.

## Integration Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          QUDAG INTEGRATION LAYER                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                         FEDERATED LEARNING                            │  │
│  │                                                                       │  │
│  │   Node A (US)          Node B (EU)          Node C (Asia)             │  │
│  │   ┌──────────────┐     ┌──────────────┐     ┌──────────────┐          │  │
│  │   │ RuVector-PG  │     │ RuVector-PG  │     │ RuVector-PG  │          │  │
│  │   │ ┌──────────┐ │     │ ┌──────────┐ │     │ ┌──────────┐ │          │  │
│  │   │ │ Patterns │ │     │ │ Patterns │ │     │ │ Patterns │ │          │  │
│  │   │ └────┬─────┘ │     │ └────┬─────┘ │     │ └────┬─────┘ │          │  │
│  │   └──────┼───────┘     └──────┼───────┘     └──────┼───────┘          │  │
│  │          │                    │                    │                  │  │
│  │          └────────────────────┼────────────────────┘                  │  │
│  │                               ▼                                       │  │
│  │                      ┌─────────────────┐                              │  │
│  │                      │  QuDAG Network  │                              │  │
│  │                      │ (QR-Avalanche)  │                              │  │
│  │                      └────────┬────────┘                              │  │
│  │                               │                                       │  │
│  │          ┌────────────────────┼────────────────────┐                  │  │
│  │          ▼                    ▼                    ▼                  │  │
│  │   ┌────────────┐       ┌────────────┐       ┌────────────┐            │  │
│  │   │ Consensus  │       │ Consensus  │       │ Consensus  │            │  │
│  │   │ Patterns   │       │ Patterns   │       │ Patterns   │            │  │
│  │   └────────────┘       └────────────┘       └────────────┘            │  │
│  │                                                                       │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                           SECURITY LAYER                              │  │
│  │  ┌────────────┐  ┌────────────┐  ┌──────────────┐  ┌────────────┐     │  │
│  │  │   ML-KEM   │  │   ML-DSA   │  │ Differential │  │    rUv     │     │  │
│  │  │ Encryption │  │ Signatures │  │   Privacy    │  │   Tokens   │     │  │
│  │  └────────────┘  └────────────┘  └──────────────┘  └────────────┘     │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## QuDAG Client

### Core Structure

```rust
pub struct QuDagClient {
    /// QuDAG node connection
    node_url: String,

    /// Node identity (ML-DSA keypair)
    identity: QuDagIdentity,

    /// Local pattern cache
    pattern_cache: DashMap<PatternId, ConsensusPattern>,

    /// Pending proposals
    pending_proposals: DashMap<ProposalId, PatternProposal>,

    /// Configuration
    config: QuDagConfig,

    /// Metrics
    metrics: QuDagMetrics,
}

#[derive(Clone)]
pub struct QuDagIdentity {
    /// ML-DSA-65 public key
    pub public_key: MlDsaPublicKey,

    /// ML-DSA-65 private key (encrypted at rest)
    private_key: MlDsaPrivateKey,

    /// Node identifier
    pub node_id: NodeId,

    /// Dark address (for anonymous communication)
    pub dark_address: Option<DarkAddress>,
}

#[derive(Clone, Debug)]
pub struct QuDagConfig {
    /// Enable QuDAG integration
    pub enabled: bool,

    /// QuDAG node URL
    pub node_url: String,

    /// Differential privacy epsilon
    pub dp_epsilon: f64,

    /// Minimum validators for consensus
    pub min_validators: usize,

    /// Consensus timeout (seconds)
    pub consensus_timeout_secs: u64,

    /// Sync interval (seconds)
    pub sync_interval_secs: u64,

    /// Maximum patterns per proposal
    pub max_patterns_per_proposal: usize,

    /// rUv staking requirement
    pub min_stake_ruv: u64,
}

impl Default for QuDagConfig {
    fn default() -> Self {
        Self {
            enabled: false,
            node_url: "https://yyz.qudag.darknet/mcp".to_string(),
            dp_epsilon: 1.0,
            min_validators: 5,
            consensus_timeout_secs: 30,
            sync_interval_secs: 3600,
            max_patterns_per_proposal: 100,
            min_stake_ruv: 10,
        }
    }
}
```

### Pattern Proposal

```rust
impl QuDagClient {
    /// Propose local patterns for consensus
    pub async fn propose_patterns(
        &self,
        patterns: &[LearnedDagPattern],
    ) -> Result<ProposalId, QuDagError> {
        // 1. Add differential privacy noise
        let noisy_patterns = self.add_dp_noise(patterns)?;

        // 2. Create proposal
        let proposal = PatternProposal {
            id: self.generate_proposal_id(),
            proposer: self.identity.node_id.clone(),
            patterns: noisy_patterns,
            stake: self.config.min_stake_ruv,
            timestamp: SystemTime::now(),
            signature: None,
        };

        // 3. Sign with ML-DSA
        let signed_proposal = self.sign_proposal(proposal)?;

        // 4. Submit to QuDAG network
        self.submit_proposal(&signed_proposal).await?;

        // 5. Track pending
        self.pending_proposals.insert(signed_proposal.id, signed_proposal.clone());

        Ok(signed_proposal.id)
    }

    /// Add differential privacy noise to patterns
    fn add_dp_noise(&self, patterns: &[LearnedDagPattern]) -> Result<Vec<NoisyPattern>, QuDagError> {
        let epsilon = self.config.dp_epsilon;

        patterns.iter()
            .map(|p| {
                // Add Laplace noise to centroid
                let noisy_centroid: Vec<f32> = p.centroid.iter()
                    .map(|&v| {
                        let noise = laplace_sample(0.0, 1.0 / epsilon);
                        v + noise as f32
                    })
                    .collect();

                // Quantize quality scores
                let quantized_quality = (p.avg_metrics.quality * 10.0).round() / 10.0;

                Ok(NoisyPattern {
                    centroid: noisy_centroid,
                    attention_type: p.optimal_attention.clone(),
                    quality: quantized_quality,
                    sample_count_bucket: bucket_sample_count(p.sample_count),
                })
            })
            .collect()
    }

    /// Sign proposal with ML-DSA-65
    fn sign_proposal(&self, mut proposal: PatternProposal) -> Result<PatternProposal, QuDagError> {
        let message = proposal.to_signing_bytes();
        let signature = self.identity.private_key.sign(&message)?;
        proposal.signature = Some(signature);
        Ok(proposal)
    }

    /// Submit proposal to QuDAG network
    async fn submit_proposal(&self, proposal: &PatternProposal) -> Result<(), QuDagError> {
        // Connect to QuDAG MCP server
        let client = McpClient::connect(&self.config.node_url).await?;

        // Call dag_submit tool
        let response = client.call_tool("dag_submit", json!({
            "vertex_type": "pattern_proposal",
            "payload": proposal.to_encrypted_bytes(&self.get_network_key())?,
            "parents": self.get_recent_vertices().await?,
        })).await?;

        if response["success"].as_bool().unwrap_or(false) {
            Ok(())
        } else {
            Err(QuDagError::SubmissionFailed(
                response["error"].as_str().unwrap_or("Unknown error").to_string()
            ))
        }
    }
}

#[derive(Clone, Debug)]
pub struct PatternProposal {
    pub id: ProposalId,
    pub proposer: NodeId,
    pub patterns: Vec<NoisyPattern>,
    pub stake: u64,
    pub timestamp: SystemTime,
    pub signature: Option<MlDsaSignature>,
}

#[derive(Clone, Debug)]
pub struct NoisyPattern {
    /// Centroid with DP noise
    pub centroid: Vec<f32>,

    /// Attention type (no noise needed)
    pub attention_type: DagAttentionType,

    /// Quantized quality
    pub quality: f64,

    /// Bucketed sample count (privacy)
    pub sample_count_bucket: SampleCountBucket,
}

#[derive(Clone, Debug)]
pub enum SampleCountBucket {
    Few,   // < 10
    Some,  // 10-50
    Many,  // 50-200
    Lots,  // > 200
}
```
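
`laplace_sample` and `bucket_sample_count` are referenced above but not defined in this document. A plausible std-only sketch is below; the uniform draw is passed in explicitly to keep the example deterministic (a real implementation would pull `u` from a cryptographically secure RNG), and the bucket boundaries follow the enum comments.

```rust
#[derive(Debug, PartialEq)]
pub enum SampleCountBucket { Few, Some, Many, Lots }

/// Inverse-CDF sample from Laplace(mu, b), given a uniform u in (0, 1).
/// With b = sensitivity / epsilon this is the standard Laplace mechanism scale.
pub fn laplace_sample(mu: f64, b: f64, u: f64) -> f64 {
    let p = u - 0.5;
    mu - b * p.signum() * (1.0 - 2.0 * p.abs()).ln()
}

/// Bucket raw sample counts per the enum comments (< 10, 10-50, 50-200, > 200).
pub fn bucket_sample_count(n: usize) -> SampleCountBucket {
    match n {
        0..=9 => SampleCountBucket::Few,
        10..=50 => SampleCountBucket::Some,
        51..=200 => SampleCountBucket::Many,
        _ => SampleCountBucket::Lots,
    }
}

fn main() {
    // u = 0.5 sits exactly at the median, so no noise is added.
    println!("{}", laplace_sample(0.0, 1.0, 0.5));
    println!("{:?}", bucket_sample_count(42)); // Some
}
```

Bucketing the sample count (rather than reporting it exactly) coarsens a side channel: an exact count could otherwise reveal how many queries a given node has executed.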

### Consensus Validation

```rust
impl QuDagClient {
    /// Validate incoming pattern proposals
    pub async fn validate_proposal(
        &self,
        proposal: &PatternProposal,
    ) -> Result<ValidationResult, QuDagError> {
        // 1. Verify signature
        if !self.verify_signature(proposal)? {
            return Ok(ValidationResult::Rejected {
                reason: "Invalid signature".to_string(),
            });
        }

        // 2. Check stake
        let balance = self.get_ruv_balance(&proposal.proposer).await?;
        if balance < proposal.stake {
            return Ok(ValidationResult::Rejected {
                reason: "Insufficient stake".to_string(),
            });
        }

        // 3. Validate pattern quality
        let quality_scores: Vec<f64> = proposal.patterns.iter()
            .map(|p| p.quality)
            .collect();

        let avg_quality = quality_scores.iter().sum::<f64>() / quality_scores.len() as f64;
        if avg_quality < 0.3 {
            return Ok(ValidationResult::Rejected {
                reason: "Low quality patterns".to_string(),
            });
        }

        // 4. Check for duplicate patterns
        let duplicates = self.check_duplicates(&proposal.patterns).await?;
        if duplicates > proposal.patterns.len() / 2 {
            return Ok(ValidationResult::Rejected {
                reason: "Too many duplicate patterns".to_string(),
            });
        }

        // 5. Compute accuracy improvement (sample-based)
        let improvement = self.estimate_improvement(&proposal.patterns).await?;

        Ok(ValidationResult::Accepted {
            quality_score: avg_quality,
            improvement_estimate: improvement,
            validator: self.identity.node_id.clone(),
        })
    }

    /// Submit validation to QuDAG
    pub async fn submit_validation(
        &self,
        proposal_id: ProposalId,
        result: &ValidationResult,
    ) -> Result<(), QuDagError> {
        let validation = Validation {
            proposal_id,
            result: result.clone(),
            validator: self.identity.node_id.clone(),
            timestamp: SystemTime::now(),
            signature: None,
        };

        let signed = self.sign_validation(validation)?;

        let client = McpClient::connect(&self.config.node_url).await?;
        client.call_tool("dag_submit", json!({
            "vertex_type": "pattern_validation",
            "payload": signed.to_encrypted_bytes(&self.get_network_key())?,
            "parents": [proposal_id.to_string()],
        })).await?;

        Ok(())
    }
}

#[derive(Clone, Debug)]
pub enum ValidationResult {
    Accepted {
        quality_score: f64,
        improvement_estimate: f32,
        validator: NodeId,
    },
    Rejected {
        reason: String,
    },
}
```

### Pattern Synchronization

```rust
impl QuDagClient {
    /// Sync consensus patterns from QuDAG
    pub async fn sync_patterns(&self) -> Result<SyncResult, QuDagError> {
        let start = Instant::now();

        // 1. Get latest consensus patterns
        let client = McpClient::connect(&self.config.node_url).await?;

        let response = client.call_tool("dag_query", json!({
            "query_type": "consensus_patterns",
            "since": self.last_sync_timestamp(),
            "limit": 1000,
        })).await?;

        let consensus_patterns: Vec<ConsensusPattern> = serde_json::from_value(
            response["patterns"].clone()
        )?;

        // 2. Verify signatures
        let verified: Vec<_> = consensus_patterns.into_iter()
            .filter(|p| self.verify_consensus_signature(p).unwrap_or(false))
            .collect();

        // 3. Update local cache
        let mut new_count = 0;
        for pattern in &verified {
            if !self.pattern_cache.contains_key(&pattern.id) {
                self.pattern_cache.insert(pattern.id, pattern.clone());
                new_count += 1;
            }
        }

        // 4. Update local ReasoningBank
        let imported = self.import_to_reasoning_bank(&verified)?;

        Ok(SyncResult {
            patterns_received: verified.len(),
            new_patterns: new_count,
            patterns_imported: imported,
            duration: start.elapsed(),
        })
    }

    /// Import consensus patterns to the local ReasoningBank
    fn import_to_reasoning_bank(&self, patterns: &[ConsensusPattern]) -> Result<usize, QuDagError> {
        let engines = get_all_dag_engines();
        let mut imported = 0;

        for pattern in patterns {
            // Import the pattern into each local engine
            for engine in &engines {
                let local_pattern = LearnedDagPattern {
                    id: self.generate_local_pattern_id(),
                    centroid: pattern.centroid.clone(),
                    optimal_params: ExecutionParams::default(),
                    optimal_attention: pattern.attention_type.clone(),
                    confidence: pattern.consensus_confidence,
                    sample_count: pattern.total_samples,
                    avg_metrics: AverageMetrics {
                        latency_us: 0.0, // Unknown from consensus
                        memory_bytes: 0.0,
                        quality: pattern.avg_quality,
                    },
                    updated_at: SystemTime::now(),
                };

                let mut bank = engine.dag_reasoning_bank.write();
                bank.store(local_pattern);
                imported += 1;
            }
        }

        Ok(imported)
    }
}

#[derive(Clone, Debug)]
pub struct ConsensusPattern {
    pub id: PatternId,
    pub centroid: Vec<f32>,
    pub attention_type: DagAttentionType,
    pub avg_quality: f64,
    pub total_samples: usize,
    pub consensus_confidence: f32,
    pub validators: Vec<NodeId>,
    pub signatures: Vec<MlDsaSignature>,
    pub finalized_at: SystemTime,
}

#[derive(Clone, Debug)]
pub struct SyncResult {
    pub patterns_received: usize,
    pub new_patterns: usize,
    pub patterns_imported: usize,
    pub duration: Duration,
}
```

## rUv Token Integration

### Token Economy

```rust
pub struct RuvTokenClient {
    /// QuDAG client reference
    qudag: Arc<QuDagClient>,

    /// Local balance cache
    balance_cache: AtomicU64,

    /// Pending rewards
    pending_rewards: DashMap<TransactionId, PendingReward>,
}

impl RuvTokenClient {
    /// Check rUv balance
    pub async fn get_balance(&self) -> Result<u64, QuDagError> {
        let client = McpClient::connect(&self.qudag.config.node_url).await?;

        let response = client.call_tool("ruv_balance", json!({
            "address": self.qudag.identity.node_id.to_string(),
        })).await?;

        let balance = response["balance"].as_u64().unwrap_or(0);
        self.balance_cache.store(balance, Ordering::Relaxed);

        Ok(balance)
    }

    /// Stake rUv for pattern proposal
    pub async fn stake(&self, amount: u64) -> Result<TransactionId, QuDagError> {
        let client = McpClient::connect(&self.qudag.config.node_url).await?;

        let response = client.call_tool("ruv_stake", json!({
            "amount": amount,
            "purpose": "pattern_proposal",
            "signature": self.sign_stake_request(amount)?,
        })).await?;

        Ok(TransactionId::from_str(response["tx_id"].as_str().unwrap())?)
    }

    /// Claim rewards for accepted patterns
    pub async fn claim_rewards(&self) -> Result<ClaimResult, QuDagError> {
        let client = McpClient::connect(&self.qudag.config.node_url).await?;

        let response = client.call_tool("ruv_claim_rewards", json!({
            "address": self.qudag.identity.node_id.to_string(),
            "signature": self.sign_claim_request()?,
        })).await?;

        let claimed = response["claimed"].as_u64().unwrap_or(0);
        let new_balance = response["new_balance"].as_u64().unwrap_or(0);

        self.balance_cache.store(new_balance, Ordering::Relaxed);

        Ok(ClaimResult {
            amount_claimed: claimed,
            new_balance,
        })
    }
}

/// Reward structure
#[derive(Clone, Debug)]
pub struct RewardStructure {
    /// Base reward for accepted pattern
    pub pattern_accepted: u64, // 10 rUv

    /// Bonus for accuracy improvement
    pub accuracy_bonus_per_percent: u64, // 10 rUv per 1%

    /// Validation reward
    pub validation_reward: u64, // 2 rUv

    /// Penalty for rejected pattern
    pub rejection_penalty: u64, // 5 rUv

    /// Byzantine behavior penalty
    pub byzantine_penalty: u64, // 1000 rUv
}

impl Default for RewardStructure {
    fn default() -> Self {
        Self {
            pattern_accepted: 10,
            accuracy_bonus_per_percent: 10,
            validation_reward: 2,
            rejection_penalty: 5,
            byzantine_penalty: 1000,
        }
    }
}
```
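
Under the default `RewardStructure`, the payout for a single proposal outcome can be sketched as below. This is illustrative only: actual settlement happens on the QuDAG side, and the reading of the accuracy bonus as "per whole percentage point of improvement" is an assumption drawn from the field comments.

```rust
/// Net reward for one proposal outcome under the default RewardStructure
/// values above: accepted patterns earn base + bonus, rejected ones pay
/// the rejection penalty.
fn proposal_reward(accepted: bool, improvement_percent: u64) -> i64 {
    // Default values from RewardStructure::default() above.
    let (base, bonus_per_pct, penalty) = (10_i64, 10_i64, 5_i64);
    if accepted {
        base + bonus_per_pct * improvement_percent as i64
    } else {
        -penalty
    }
}

fn main() {
    // Accepted pattern with a 3% accuracy improvement: 10 + 10 * 3 = 40 rUv.
    println!("{}", proposal_reward(true, 3));
    // Rejected proposal: -5 rUv.
    println!("{}", proposal_reward(false, 0));
}
```

Because the rejection penalty (5 rUv) is half the base acceptance reward, a node needs a better-than-1-in-3 acceptance rate just to break even before bonuses, which is the economic disincentive against spamming low-quality patterns.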

## Security Layer

### ML-KEM Encryption

```rust
pub struct PatternEncryption {
    /// Network public key (for encryption)
    network_key: MlKemPublicKey,

    /// Local private key (for decryption)
    local_key: MlKemPrivateKey,
}

impl PatternEncryption {
    /// Encrypt pattern for network transmission
    pub fn encrypt(&self, pattern: &NoisyPattern) -> Result<EncryptedPattern, CryptoError> {
        let plaintext = pattern.to_bytes();

        // Encapsulate shared secret
        let (ciphertext, shared_secret) = self.network_key.encapsulate()?;

        // Derive key from shared secret
        let key = blake3::derive_key("QuDAG Pattern Encryption", &shared_secret);

        // Encrypt with ChaCha20-Poly1305
        let nonce = generate_nonce();
        let encrypted = chacha20_poly1305_encrypt(&key, &nonce, &plaintext)?;

        Ok(EncryptedPattern {
            ciphertext,
            encrypted_data: encrypted,
            nonce,
        })
    }

    /// Decrypt pattern from network
    pub fn decrypt(&self, encrypted: &EncryptedPattern) -> Result<NoisyPattern, CryptoError> {
        // Decapsulate shared secret
        let shared_secret = self.local_key.decapsulate(&encrypted.ciphertext)?;

        // Derive key
        let key = blake3::derive_key("QuDAG Pattern Encryption", &shared_secret);

        // Decrypt
        let plaintext = chacha20_poly1305_decrypt(
            &key,
            &encrypted.nonce,
            &encrypted.encrypted_data,
        )?;

        NoisyPattern::from_bytes(&plaintext)
    }
}
```

### ML-DSA Signatures

```rust
pub struct PatternSigning {
    /// Signing key
    private_key: MlDsaPrivateKey,

    /// Verification key
    public_key: MlDsaPublicKey,
}

impl PatternSigning {
    /// Sign pattern proposal
    pub fn sign_proposal(&self, proposal: &PatternProposal) -> Result<MlDsaSignature, CryptoError> {
        let message = proposal.to_signing_bytes();
        self.private_key.sign(&message)
    }

    /// Verify proposal signature
    pub fn verify_proposal(
        &self,
        proposal: &PatternProposal,
        public_key: &MlDsaPublicKey,
    ) -> Result<bool, CryptoError> {
        let message = proposal.to_signing_bytes();
        let signature = proposal.signature.as_ref()
            .ok_or(CryptoError::MissingSignature)?;

        public_key.verify(&message, signature)
    }

    /// Sign validation
    pub fn sign_validation(&self, validation: &Validation) -> Result<MlDsaSignature, CryptoError> {
        let message = validation.to_signing_bytes();
        self.private_key.sign(&message)
    }
}
```

## SQL Interface

```sql
-- Enable QuDAG integration
SELECT ruvector_dag_qudag_enable('{
    "node_url": "https://yyz.qudag.darknet/mcp",
    "dp_epsilon": 1.0,
    "min_stake_ruv": 10
}'::jsonb);

-- Register identity
SELECT ruvector_dag_qudag_register();

-- Propose patterns for consensus
SELECT ruvector_dag_qudag_propose('documents');

-- Sync consensus patterns
SELECT ruvector_dag_qudag_sync();

-- Get rUv balance
SELECT ruvector_dag_ruv_balance();

-- Claim rewards
SELECT ruvector_dag_ruv_claim();

-- Get QuDAG status
SELECT ruvector_dag_qudag_status();
```

## Configuration

### PostgreSQL GUC Variables

```sql
-- Enable/disable QuDAG
SET ruvector.dag_qudag_enabled = true;

-- QuDAG node URL
SET ruvector.dag_qudag_node_url = 'https://yyz.qudag.darknet/mcp';

-- Differential privacy epsilon
SET ruvector.dag_qudag_dp_epsilon = 1.0;

-- Sync interval (seconds)
SET ruvector.dag_qudag_sync_interval = 3600;

-- Minimum stake for proposals
SET ruvector.dag_qudag_min_stake = 10;
```

## Metrics

```rust
// Note: AtomicU64 is not Clone, so the struct derives only Debug and Default.
#[derive(Debug, Default)]
pub struct QuDagMetrics {
    pub proposals_submitted: AtomicU64,
    pub proposals_accepted: AtomicU64,
    pub proposals_rejected: AtomicU64,
    pub validations_performed: AtomicU64,
    pub patterns_synced: AtomicU64,
    pub ruv_earned: AtomicU64,
    pub ruv_spent: AtomicU64,
    pub last_sync_time: AtomicU64,
}

impl QuDagMetrics {
    pub fn to_json(&self) -> serde_json::Value {
        json!({
            "proposals_submitted": self.proposals_submitted.load(Ordering::Relaxed),
            "proposals_accepted": self.proposals_accepted.load(Ordering::Relaxed),
            "proposals_rejected": self.proposals_rejected.load(Ordering::Relaxed),
            "acceptance_rate": self.acceptance_rate(),
            "validations_performed": self.validations_performed.load(Ordering::Relaxed),
            "patterns_synced": self.patterns_synced.load(Ordering::Relaxed),
            "ruv_net": self.ruv_net(),
        })
    }

    fn acceptance_rate(&self) -> f64 {
        let submitted = self.proposals_submitted.load(Ordering::Relaxed);
        let accepted = self.proposals_accepted.load(Ordering::Relaxed);
        if submitted > 0 {
            accepted as f64 / submitted as f64
        } else {
            0.0
        }
    }

    fn ruv_net(&self) -> i64 {
        self.ruv_earned.load(Ordering::Relaxed) as i64
            - self.ruv_spent.load(Ordering::Relaxed) as i64
    }
}
```
|
||||
790
docs/dag/09-SQL-API.md
Normal file
@@ -0,0 +1,790 @@
# SQL API Reference

## Overview

Complete SQL API for the Neural DAG Learning system integrated with RuVector-Postgres.

## Configuration Functions

### System Configuration

```sql
-- Enable/disable neural DAG learning
SELECT ruvector.dag_set_enabled(enabled BOOLEAN) RETURNS VOID;

-- Configure learning rate
SELECT ruvector.dag_set_learning_rate(rate FLOAT8) RETURNS VOID;

-- Set attention mechanism
SELECT ruvector.dag_set_attention(
    mechanism TEXT  -- 'topological', 'causal_cone', 'critical_path',
                    -- 'mincut_gated', 'hierarchical_lorentz',
                    -- 'parallel_branch', 'temporal_btsp', 'auto'
) RETURNS VOID;

-- Configure SONA parameters
SELECT ruvector.dag_configure_sona(
    micro_lora_rank INT DEFAULT 2,
    base_lora_rank INT DEFAULT 8,
    ewc_lambda FLOAT8 DEFAULT 5000.0,
    pattern_clusters INT DEFAULT 100
) RETURNS VOID;

-- Set QuDAG network endpoint
SELECT ruvector.dag_set_qudag_endpoint(
    endpoint TEXT,
    stake_amount FLOAT8 DEFAULT 0.0
) RETURNS VOID;
```

### Runtime Status

```sql
-- Get current configuration
SELECT * FROM ruvector.dag_config();
-- Returns: (enabled, learning_rate, attention_mechanism,
--           micro_lora_rank, base_lora_rank, ewc_lambda, qudag_endpoint)

-- Get system status
SELECT * FROM ruvector.dag_status();
-- Returns: (active_patterns, total_trajectories, avg_improvement,
--           attention_hits, learning_rate_effective, qudag_connected)

-- Check health
SELECT * FROM ruvector.dag_health_check();
-- Returns: (component, status, last_check, message)
```
## Query Analysis Functions

### Plan Analysis

```sql
-- Analyze query plan and return neural DAG insights
SELECT * FROM ruvector.dag_analyze_plan(
    query_text TEXT
) RETURNS TABLE (
    node_id INT,
    operator_type TEXT,
    criticality FLOAT8,
    bottleneck_score FLOAT8,
    embedding VECTOR(256),
    parent_ids INT[],
    child_ids INT[],
    estimated_cost FLOAT8,
    recommendations TEXT[]
);

-- Get critical path for query
SELECT * FROM ruvector.dag_critical_path(
    query_text TEXT
) RETURNS TABLE (
    path_position INT,
    node_id INT,
    operator_type TEXT,
    accumulated_cost FLOAT8,
    attention_weight FLOAT8
);

-- Identify bottlenecks
SELECT * FROM ruvector.dag_bottlenecks(
    query_text TEXT,
    threshold FLOAT8 DEFAULT 0.7
) RETURNS TABLE (
    node_id INT,
    operator_type TEXT,
    bottleneck_score FLOAT8,
    impact_estimate FLOAT8,
    suggested_action TEXT
);

-- Get min-cut analysis
SELECT * FROM ruvector.dag_mincut_analysis(
    query_text TEXT
) RETURNS TABLE (
    cut_id INT,
    source_nodes INT[],
    sink_nodes INT[],
    cut_capacity FLOAT8,
    parallelization_opportunity BOOLEAN
);
```

### Query Optimization

```sql
-- Get optimization suggestions
SELECT * FROM ruvector.dag_suggest_optimizations(
    query_text TEXT
) RETURNS TABLE (
    suggestion_id INT,
    category TEXT,  -- 'index', 'join_order', 'parallelism', 'memory'
    description TEXT,
    expected_improvement FLOAT8,
    implementation_sql TEXT,
    confidence FLOAT8
);

-- Rewrite query using learned patterns
SELECT ruvector.dag_rewrite_query(
    query_text TEXT
) RETURNS TEXT;

-- Estimate query with neural predictions
SELECT * FROM ruvector.dag_estimate(
    query_text TEXT
) RETURNS TABLE (
    metric TEXT,
    postgres_estimate FLOAT8,
    neural_estimate FLOAT8,
    confidence FLOAT8
);
```
## Attention Mechanism Functions

### Attention Scores

```sql
-- Compute attention for query DAG
SELECT * FROM ruvector.dag_attention_scores(
    query_text TEXT,
    mechanism TEXT DEFAULT 'auto'
) RETURNS TABLE (
    node_id INT,
    attention_weight FLOAT8,
    query_contribution FLOAT8[],
    key_contribution FLOAT8[]
);

-- Get attention matrix
SELECT ruvector.dag_attention_matrix(
    query_text TEXT,
    mechanism TEXT DEFAULT 'auto'
) RETURNS FLOAT8[][];

-- Visualize attention (returns DOT graph)
SELECT ruvector.dag_attention_visualize(
    query_text TEXT,
    mechanism TEXT DEFAULT 'auto',
    format TEXT DEFAULT 'dot'  -- 'dot', 'json', 'ascii'
) RETURNS TEXT;
```

### Attention Configuration

```sql
-- Set attention hyperparameters
SELECT ruvector.dag_attention_configure(
    mechanism TEXT,
    params JSONB
    -- Example params:
    --   topological: {"max_depth": 5, "decay_factor": 0.9}
    --   causal_cone: {"time_window": 1000, "future_discount": 0.5}
    --   critical_path: {"path_weight": 2.0, "branch_penalty": 0.3}
    --   mincut_gated: {"gate_threshold": 0.1, "flow_capacity": "cost"}
    --   hierarchical_lorentz: {"curvature": -1.0, "time_scale": 0.1}
    --   parallel_branch: {"max_branches": 8, "sync_penalty": 0.2}
    --   temporal_btsp: {"plateau_duration": 100, "eligibility_decay": 0.95}
) RETURNS VOID;

-- Get attention statistics
SELECT * FROM ruvector.dag_attention_stats()
RETURNS TABLE (
    mechanism TEXT,
    invocations BIGINT,
    avg_latency_us FLOAT8,
    hit_rate FLOAT8,
    improvement_ratio FLOAT8
);
```
## SONA Learning Functions

### Pattern Management

```sql
-- Store a learned pattern
SELECT ruvector.dag_store_pattern(
    pattern_vector VECTOR(256),
    pattern_metadata JSONB,
    quality_score FLOAT8
) RETURNS BIGINT;  -- pattern_id

-- Query similar patterns
SELECT * FROM ruvector.dag_query_patterns(
    query_vector VECTOR(256),
    k INT DEFAULT 5,
    similarity_threshold FLOAT8 DEFAULT 0.7
) RETURNS TABLE (
    pattern_id BIGINT,
    similarity FLOAT8,
    quality_score FLOAT8,
    metadata JSONB,
    usage_count INT,
    last_used TIMESTAMPTZ
);

-- Get pattern clusters (ReasoningBank)
SELECT * FROM ruvector.dag_pattern_clusters()
RETURNS TABLE (
    cluster_id INT,
    centroid VECTOR(256),
    member_count INT,
    avg_quality FLOAT8,
    representative_query TEXT
);

-- Force pattern consolidation
SELECT ruvector.dag_consolidate_patterns(
    target_clusters INT DEFAULT 100
) RETURNS TABLE (
    clusters_before INT,
    clusters_after INT,
    patterns_merged INT,
    consolidation_time_ms FLOAT8
);
```
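Semantically, `dag_query_patterns` is a top-k cosine scan over stored pattern vectors, bounded by `k` and `similarity_threshold`. A minimal plain-Rust sketch of that computation (hypothetical types; the extension actually serves this from the HNSW index on `ruvector.dag_patterns`):

```rust
/// Cosine similarity between two vectors; 0.0 if either is the zero vector.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Return (pattern_id, similarity) pairs at or above `threshold`,
/// best first, at most `k` of them -- mirroring the function's
/// k / similarity_threshold parameters.
fn query_patterns(
    query: &[f64],
    patterns: &[(u64, Vec<f64>)],
    k: usize,
    threshold: f64,
) -> Vec<(u64, f64)> {
    let mut scored: Vec<(u64, f64)> = patterns
        .iter()
        .map(|(id, v)| (*id, cosine(query, v)))
        .filter(|(_, s)| *s >= threshold)
        .collect();
    // Sort by descending similarity, then keep the k best.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}
```

The same filter-then-truncate order explains why fewer than `k` rows can come back when the threshold is strict.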
### Trajectory Management

```sql
-- Record a learning trajectory
SELECT ruvector.dag_record_trajectory(
    query_hash BIGINT,
    dag_structure JSONB,
    execution_time_ms FLOAT8,
    improvement_ratio FLOAT8,
    attention_mechanism TEXT
) RETURNS BIGINT;  -- trajectory_id

-- Get trajectory history
SELECT * FROM ruvector.dag_trajectory_history(
    time_range TSTZRANGE DEFAULT NULL,
    min_improvement FLOAT8 DEFAULT 0.0,
    limit_count INT DEFAULT 100
) RETURNS TABLE (
    trajectory_id BIGINT,
    query_hash BIGINT,
    recorded_at TIMESTAMPTZ,
    execution_time_ms FLOAT8,
    improvement_ratio FLOAT8,
    attention_mechanism TEXT
);

-- Analyze trajectory trends
SELECT * FROM ruvector.dag_trajectory_trends(
    window_size INTERVAL DEFAULT '1 hour'
) RETURNS TABLE (
    window_start TIMESTAMPTZ,
    trajectory_count INT,
    avg_improvement FLOAT8,
    best_mechanism TEXT,
    pattern_discoveries INT
);
```

### Learning Control

```sql
-- Trigger immediate learning cycle
SELECT ruvector.dag_learn_now() RETURNS TABLE (
    patterns_updated INT,
    new_clusters INT,
    ewc_constraints_updated INT,
    cycle_time_ms FLOAT8
);

-- Reset learning state (use with caution)
SELECT ruvector.dag_reset_learning(
    preserve_patterns BOOLEAN DEFAULT TRUE,
    preserve_trajectories BOOLEAN DEFAULT FALSE
) RETURNS VOID;

-- Export learned state
SELECT ruvector.dag_export_state() RETURNS BYTEA;

-- Import learned state
SELECT ruvector.dag_import_state(state_data BYTEA) RETURNS TABLE (
    patterns_imported INT,
    trajectories_imported INT,
    clusters_restored INT
);

-- Get EWC constraint info
SELECT * FROM ruvector.dag_ewc_constraints()
RETURNS TABLE (
    parameter_name TEXT,
    fisher_importance FLOAT8,
    optimal_value FLOAT8,
    last_updated TIMESTAMPTZ
);
```
## Self-Healing Functions

### Health Monitoring

```sql
-- Run comprehensive health check
SELECT * FROM ruvector.dag_health_report()
RETURNS TABLE (
    subsystem TEXT,
    status TEXT,
    score FLOAT8,
    issues TEXT[],
    recommendations TEXT[]
);

-- Get anomaly detection results
SELECT * FROM ruvector.dag_anomalies(
    time_range TSTZRANGE DEFAULT '[now - 1 hour, now]'::TSTZRANGE
) RETURNS TABLE (
    anomaly_id BIGINT,
    detected_at TIMESTAMPTZ,
    anomaly_type TEXT,
    severity TEXT,
    affected_component TEXT,
    z_score FLOAT8,
    resolved BOOLEAN
);

-- Check index health
SELECT * FROM ruvector.dag_index_health()
RETURNS TABLE (
    index_name TEXT,
    index_type TEXT,
    fragmentation FLOAT8,
    recall_estimate FLOAT8,
    recommended_action TEXT
);

-- Check learning drift
SELECT * FROM ruvector.dag_learning_drift()
RETURNS TABLE (
    metric TEXT,
    current_value FLOAT8,
    baseline_value FLOAT8,
    drift_magnitude FLOAT8,
    trend TEXT
);
```

### Repair Operations

```sql
-- Trigger automatic repair
SELECT * FROM ruvector.dag_auto_repair()
RETURNS TABLE (
    repair_id BIGINT,
    repair_type TEXT,
    target TEXT,
    status TEXT,
    duration_ms FLOAT8
);

-- Rebalance specific index
SELECT ruvector.dag_rebalance_index(
    index_name TEXT,
    target_recall FLOAT8 DEFAULT 0.95
) RETURNS TABLE (
    vectors_moved INT,
    new_recall FLOAT8,
    duration_ms FLOAT8
);

-- Reset pattern quality scores
SELECT ruvector.dag_reset_pattern_quality(
    pattern_ids BIGINT[] DEFAULT NULL  -- NULL = all patterns
) RETURNS INT;  -- patterns reset

-- Force cluster recomputation
SELECT ruvector.dag_recompute_clusters(
    algorithm TEXT DEFAULT 'kmeans_pp'
) RETURNS TABLE (
    old_clusters INT,
    new_clusters INT,
    silhouette_score FLOAT8
);
```
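The `z_score` column returned by `dag_anomalies` is the distance of a sample from a rolling baseline, in standard deviations. A minimal sketch of that statistic, assuming a simple population mean/stddev over recent samples (the extension's detector is presumably richer than this):

```rust
/// z-score of `x` against a history window: (x - mean) / stddev.
/// Returns 0.0 when the history has no variance.
fn z_score(history: &[f64], x: f64) -> f64 {
    let n = history.len() as f64;
    let mean = history.iter().sum::<f64>() / n;
    let var = history.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt();
    if std == 0.0 { 0.0 } else { (x - mean) / std }
}

/// Flag a sample when its |z-score| exceeds the severity threshold.
fn is_anomalous(history: &[f64], x: f64, threshold: f64) -> bool {
    z_score(history, x).abs() > threshold
}
```

A latency sample of 25 ms against a baseline hovering around 10 ms would be flagged at the common threshold of 3, while 10.5 ms would not.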
## QuDAG Integration Functions

### Network Operations

```sql
-- Connect to QuDAG network
SELECT ruvector.qudag_connect(
    endpoint TEXT,
    identity_key BYTEA DEFAULT NULL  -- auto-generate if NULL
) RETURNS TABLE (
    connected BOOLEAN,
    node_id TEXT,
    network_version TEXT
);

-- Get network status
SELECT * FROM ruvector.qudag_status()
RETURNS TABLE (
    connected BOOLEAN,
    node_id TEXT,
    peers INT,
    latest_round BIGINT,
    sync_status TEXT
);

-- Propose pattern to network
SELECT ruvector.qudag_propose_pattern(
    pattern_vector VECTOR(256),
    metadata JSONB,
    stake_amount FLOAT8 DEFAULT 0.0
) RETURNS TABLE (
    proposal_id TEXT,
    submitted_at TIMESTAMPTZ,
    status TEXT
);

-- Check proposal status
SELECT * FROM ruvector.qudag_proposal_status(
    proposal_id TEXT
) RETURNS TABLE (
    status TEXT,
    votes_for INT,
    votes_against INT,
    finalized BOOLEAN,
    finalized_at TIMESTAMPTZ
);

-- Sync patterns from network
SELECT * FROM ruvector.qudag_sync_patterns(
    since_round BIGINT DEFAULT 0
) RETURNS TABLE (
    patterns_received INT,
    patterns_applied INT,
    conflicts_resolved INT
);
```

### Token Operations

```sql
-- Get rUv balance
SELECT ruvector.qudag_balance() RETURNS FLOAT8;

-- Stake tokens
SELECT ruvector.qudag_stake(
    amount FLOAT8
) RETURNS TABLE (
    new_stake FLOAT8,
    tx_hash TEXT
);

-- Claim rewards
SELECT * FROM ruvector.qudag_claim_rewards()
RETURNS TABLE (
    amount FLOAT8,
    tx_hash TEXT,
    source TEXT
);

-- Get staking info
SELECT * FROM ruvector.qudag_staking_info()
RETURNS TABLE (
    staked_amount FLOAT8,
    pending_rewards FLOAT8,
    lock_until TIMESTAMPTZ,
    apr_estimate FLOAT8
);
```

### Cryptographic Operations

```sql
-- Generate ML-KEM keypair
SELECT ruvector.qudag_generate_kem_keypair()
RETURNS TABLE (
    public_key BYTEA,
    secret_key_id TEXT  -- stored securely
);

-- Encrypt data for peer
SELECT ruvector.qudag_encrypt(
    plaintext BYTEA,
    recipient_pubkey BYTEA
) RETURNS TABLE (
    ciphertext BYTEA,
    encapsulated_key BYTEA
);

-- Decrypt received data
SELECT ruvector.qudag_decrypt(
    ciphertext BYTEA,
    encapsulated_key BYTEA,
    secret_key_id TEXT
) RETURNS BYTEA;

-- Sign data
SELECT ruvector.qudag_sign(
    data BYTEA
) RETURNS BYTEA;  -- ML-DSA signature

-- Verify signature
SELECT ruvector.qudag_verify(
    data BYTEA,
    signature BYTEA,
    pubkey BYTEA
) RETURNS BOOLEAN;
```
## Monitoring and Statistics

### Performance Metrics

```sql
-- Get overall statistics
SELECT * FROM ruvector.dag_statistics()
RETURNS TABLE (
    metric TEXT,
    value FLOAT8,
    unit TEXT,
    updated_at TIMESTAMPTZ
);

-- Get latency breakdown
SELECT * FROM ruvector.dag_latency_breakdown(
    time_range TSTZRANGE DEFAULT '[now - 1 hour, now]'::TSTZRANGE
) RETURNS TABLE (
    component TEXT,
    p50_us FLOAT8,
    p95_us FLOAT8,
    p99_us FLOAT8,
    max_us FLOAT8
);

-- Get memory usage
SELECT * FROM ruvector.dag_memory_usage()
RETURNS TABLE (
    component TEXT,
    allocated_bytes BIGINT,
    used_bytes BIGINT,
    peak_bytes BIGINT
);

-- Get throughput metrics
SELECT * FROM ruvector.dag_throughput(
    window INTERVAL DEFAULT '1 minute'
) RETURNS TABLE (
    metric TEXT,
    count BIGINT,
    per_second FLOAT8
);
```

### Debugging

```sql
-- Enable debug logging
SELECT ruvector.dag_set_debug(
    enabled BOOLEAN,
    components TEXT[] DEFAULT ARRAY['all']
) RETURNS VOID;

-- Get recent debug logs
SELECT * FROM ruvector.dag_debug_logs(
    since TIMESTAMPTZ DEFAULT now() - interval '5 minutes',
    component TEXT DEFAULT NULL,
    severity TEXT DEFAULT NULL  -- 'debug', 'info', 'warn', 'error'
) RETURNS TABLE (
    logged_at TIMESTAMPTZ,
    component TEXT,
    severity TEXT,
    message TEXT,
    context JSONB
);

-- Trace single query
SELECT * FROM ruvector.dag_trace_query(
    query_text TEXT
) RETURNS TABLE (
    step INT,
    operation TEXT,
    duration_us FLOAT8,
    details JSONB
);

-- Export diagnostics bundle
SELECT ruvector.dag_export_diagnostics() RETURNS BYTEA;
```
## Batch Operations

### Bulk Processing

```sql
-- Analyze multiple queries
SELECT * FROM ruvector.dag_bulk_analyze(
    queries TEXT[]
) RETURNS TABLE (
    query_index INT,
    bottleneck_count INT,
    suggested_mechanism TEXT,
    estimated_improvement FLOAT8
);

-- Pre-warm patterns for workload
SELECT ruvector.dag_prewarm_patterns(
    representative_queries TEXT[]
) RETURNS TABLE (
    patterns_loaded INT,
    cache_hit_rate FLOAT8
);

-- Batch record trajectories
SELECT ruvector.dag_bulk_record_trajectories(
    trajectories JSONB[]
) RETURNS INT;  -- trajectories recorded
```
## Views

### System Views

```sql
-- Active configuration
CREATE VIEW ruvector.dag_active_config AS
SELECT * FROM ruvector.dag_config();

-- Recent patterns
CREATE VIEW ruvector.dag_recent_patterns AS
SELECT pattern_id, created_at, quality_score, usage_count
FROM ruvector.dag_patterns
WHERE created_at > now() - interval '24 hours'
ORDER BY quality_score DESC;

-- Attention effectiveness
CREATE VIEW ruvector.dag_attention_effectiveness AS
SELECT
    mechanism,
    count(*) AS uses,
    avg(improvement_ratio) AS avg_improvement,
    percentile_cont(0.95) WITHIN GROUP (ORDER BY improvement_ratio) AS p95_improvement
FROM ruvector.dag_trajectories
WHERE recorded_at > now() - interval '7 days'
GROUP BY mechanism;

-- Health summary
CREATE VIEW ruvector.dag_health_summary AS
SELECT
    subsystem,
    status,
    score,
    array_length(issues, 1) AS issue_count
FROM ruvector.dag_health_report();
```
## Installation SQL

```sql
-- Create extension
CREATE EXTENSION IF NOT EXISTS ruvector_dag CASCADE;

-- Required tables (auto-created by extension)
--   ruvector.dag_patterns        - Learned patterns storage
--   ruvector.dag_trajectories    - Learning trajectory history
--   ruvector.dag_clusters        - Pattern clusters (ReasoningBank)
--   ruvector.dag_anomalies       - Detected anomalies log
--   ruvector.dag_repairs         - Repair history
--   ruvector.dag_qudag_proposals - QuDAG proposal tracking

-- Recommended indexes
CREATE INDEX ON ruvector.dag_patterns USING hnsw (pattern_vector vector_cosine_ops);
CREATE INDEX ON ruvector.dag_trajectories (recorded_at DESC);
CREATE INDEX ON ruvector.dag_trajectories (query_hash);
CREATE INDEX ON ruvector.dag_anomalies (detected_at DESC) WHERE NOT resolved;

-- Grant permissions
GRANT USAGE ON SCHEMA ruvector TO PUBLIC;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA ruvector TO PUBLIC;
GRANT SELECT ON ALL TABLES IN SCHEMA ruvector TO PUBLIC;
```
## Usage Examples

### Basic Query Optimization

```sql
-- Enable neural DAG learning
SELECT ruvector.dag_set_enabled(true);

-- Analyze a query
SELECT * FROM ruvector.dag_analyze_plan($$
    SELECT v.*, m.category
    FROM vectors v
    JOIN metadata m ON v.id = m.vector_id
    WHERE v.embedding <-> $1 < 0.5
    ORDER BY v.embedding <-> $1
    LIMIT 100
$$);

-- Get optimization suggestions
SELECT * FROM ruvector.dag_suggest_optimizations($$
    SELECT v.*, m.category
    FROM vectors v
    JOIN metadata m ON v.id = m.vector_id
    WHERE v.embedding <-> $1 < 0.5
    ORDER BY v.embedding <-> $1
    LIMIT 100
$$);
```

### Attention Mechanism Selection

```sql
-- Let the system choose the best attention mechanism
SELECT ruvector.dag_set_attention('auto');

-- Or manually select based on workload:
-- For deep query plans:
SELECT ruvector.dag_set_attention('topological');

-- For time-series workloads:
SELECT ruvector.dag_set_attention('causal_cone');

-- For CPU-bound queries:
SELECT ruvector.dag_set_attention('critical_path');
```

### Distributed Learning with QuDAG

```sql
-- Connect to QuDAG network
SELECT * FROM ruvector.qudag_connect(
    'https://qudag.example.com:8443'
);

-- Stake tokens for participation
SELECT * FROM ruvector.qudag_stake(100.0);

-- Patterns are now automatically shared and validated.
-- Check sync status:
SELECT * FROM ruvector.qudag_status();
```
## Error Codes

| Code | Name | Description |
|------|------|-------------|
| RV001 | DAG_DISABLED | Neural DAG learning is disabled |
| RV002 | INVALID_ATTENTION | Unknown attention mechanism |
| RV003 | PATTERN_NOT_FOUND | Referenced pattern does not exist |
| RV004 | LEARNING_FAILED | Learning cycle failed |
| RV005 | QUDAG_DISCONNECTED | Not connected to QuDAG network |
| RV006 | QUDAG_AUTH_FAILED | QuDAG authentication failed |
| RV007 | INSUFFICIENT_STAKE | Not enough staked tokens |
| RV008 | CRYPTO_ERROR | Cryptographic operation failed |
| RV009 | REPAIR_FAILED | Self-healing repair failed |
| RV010 | TRAJECTORY_OVERFLOW | Trajectory buffer full |

---

*Document: 09-SQL-API.md | Version: 1.0 | Last Updated: 2025-01-XX*
1013
docs/dag/10-TESTING-STRATEGY.md
Normal file
File diff suppressed because it is too large
Load Diff
881
docs/dag/11-AGENT-TASKS.md
Normal file
@@ -0,0 +1,881 @@
# Agent Task Assignments

## Overview

Task breakdown for the 15-agent swarm implementing the Neural DAG Learning system. Each agent has specific responsibilities, dependencies, and deliverables.

## Swarm Topology

```
                     ┌─────────────────────┐
                     │     QUEEN AGENT     │
                     │    (Coordinator)    │
                     │      Agent #0       │
                     └──────────┬──────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
         ▼                      ▼                      ▼
 ┌───────────────┐      ┌───────────────┐      ┌───────────────┐
 │   CORE TEAM   │      │ POSTGRES TEAM │      │  QUDAG TEAM   │
 │  Agents 1-5   │      │  Agents 6-9   │      │ Agents 10-12  │
 └───────┬───────┘      └───────────────┘      └───────────────┘
         │
         ├─────────────────────┐
         ▼                     ▼
 ┌───────────────┐      ┌───────────────┐
 │ TESTING TEAM  │      │   DOCS TEAM   │
 │ Agents 13-14  │      │   Agent 15    │
 └───────────────┘      └───────────────┘
```
## Agent Assignments

---

### Agent #0: Queen Coordinator

**Type**: `queen-coordinator`

**Role**: Central orchestration, dependency management, conflict resolution

**Responsibilities**:
- Monitor all agent progress via memory coordination
- Resolve cross-team dependencies and conflicts
- Manage swarm-wide configuration
- Aggregate status reports
- Make strategic decisions on implementation order
- Coordinate code reviews between teams

**Deliverables**:
- Swarm coordination logs
- Dependency resolution decisions
- Final integration verification

**Memory Keys**:
- `swarm/queen/status` - Overall swarm status
- `swarm/queen/decisions` - Strategic decisions log
- `swarm/queen/dependencies` - Cross-agent dependency tracking

**No direct code output** - Coordination only

---
### Agent #1: Core DAG Engine

**Type**: `coder`

**Role**: Core DAG data structures and algorithms

**Responsibilities**:
1. Implement `QueryDag` structure
2. Implement `OperatorNode` and `OperatorType`
3. Implement DAG traversal algorithms (topological sort, DFS, BFS)
4. Implement edge/node management
5. Implement DAG serialization/deserialization

**Files to Create/Modify**:
```
ruvector-dag/src/
├── lib.rs
├── dag/
│   ├── mod.rs
│   ├── query_dag.rs
│   ├── operator_node.rs
│   ├── traversal.rs
│   └── serialization.rs
```

**Dependencies**: None (foundational)

**Blocked By**: None

**Blocks**: Agents 2, 3, 4, 6

**Deliverables**:
- [ ] `QueryDag` struct with node/edge management
- [ ] `OperatorNode` with all operator types
- [ ] Topological sort implementation
- [ ] Cycle detection
- [ ] JSON/binary serialization

**Estimated Complexity**: Medium
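The topological-sort and cycle-detection deliverables above can be covered by a single Kahn's-algorithm pass: any node never emitted sits on a cycle. A self-contained sketch with a hypothetical adjacency-list `Dag` (the real `QueryDag` API may differ):

```rust
use std::collections::VecDeque;

/// Adjacency-list DAG; node ids are indices. Illustrative shape only.
struct Dag {
    adj: Vec<Vec<usize>>,
}

impl Dag {
    fn new(n: usize) -> Self {
        Dag { adj: vec![Vec::new(); n] }
    }

    fn add_edge(&mut self, from: usize, to: usize) {
        self.adj[from].push(to);
    }

    /// Kahn's algorithm: returns a topological order, or None if a cycle exists.
    fn topo_sort(&self) -> Option<Vec<usize>> {
        let n = self.adj.len();
        let mut indegree = vec![0usize; n];
        for edges in &self.adj {
            for &to in edges {
                indegree[to] += 1;
            }
        }
        // Start from all roots (indegree 0) and peel layer by layer.
        let mut queue: VecDeque<usize> =
            (0..n).filter(|&v| indegree[v] == 0).collect();
        let mut order = Vec::with_capacity(n);
        while let Some(v) = queue.pop_front() {
            order.push(v);
            for &to in &self.adj[v] {
                indegree[to] -= 1;
                if indegree[to] == 0 {
                    queue.push_back(to);
                }
            }
        }
        // Nodes that were never emitted lie on a cycle.
        if order.len() == n { Some(order) } else { None }
    }
}
```

One pass therefore yields both the execution order and the cycle check for free.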
---

### Agent #2: Attention Mechanisms (Basic)

**Type**: `coder`

**Role**: Implement first 4 attention mechanisms

**Responsibilities**:
1. Implement `DagAttention` trait
2. Implement `TopologicalAttention`
3. Implement `CausalConeAttention`
4. Implement `CriticalPathAttention`
5. Implement `MinCutGatedAttention`

**Files to Create/Modify**:
```
ruvector-dag/src/attention/
├── mod.rs
├── traits.rs
├── topological.rs
├── causal_cone.rs
├── critical_path.rs
└── mincut_gated.rs
```

**Dependencies**: Agent #1 (QueryDag)

**Blocked By**: Agent #1

**Blocks**: Agents 6, 13

**Deliverables**:
- [ ] `DagAttention` trait definition
- [ ] `TopologicalAttention` with decay
- [ ] `CausalConeAttention` with temporal awareness
- [ ] `CriticalPathAttention` with path computation
- [ ] `MinCutGatedAttention` with flow-based gating

**Estimated Complexity**: High

---
### Agent #3: Attention Mechanisms (Advanced)

**Type**: `coder`

**Role**: Implement advanced attention mechanisms

**Responsibilities**:
1. Implement `HierarchicalLorentzAttention`
2. Implement `ParallelBranchAttention`
3. Implement `TemporalBTSPAttention`
4. Implement `AttentionSelector` (UCB bandit)
5. Implement attention caching

**Files to Create/Modify**:
```
ruvector-dag/src/attention/
├── hierarchical_lorentz.rs
├── parallel_branch.rs
├── temporal_btsp.rs
├── selector.rs
└── cache.rs
```

**Dependencies**: Agent #1 (QueryDag), Agent #2 (DagAttention trait)

**Blocked By**: Agents #1, #2

**Blocks**: Agents 6, 13

**Deliverables**:
- [ ] `HierarchicalLorentzAttention` with hyperbolic ops
- [ ] `ParallelBranchAttention` with branch detection
- [ ] `TemporalBTSPAttention` with eligibility traces
- [ ] `AttentionSelector` with UCB selection
- [ ] LRU attention cache

**Estimated Complexity**: Very High
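The UCB bandit behind `AttentionSelector` picks, per query, the mechanism maximizing mean observed reward plus an exploration bonus `c * sqrt(ln(total) / pulls)`. A std-only sketch under that assumption (illustrative names, not the crate's actual API):

```rust
/// UCB1 bandit over attention mechanisms.
struct AttentionSelector {
    names: Vec<&'static str>,
    pulls: Vec<u64>,
    reward_sum: Vec<f64>,
    c: f64, // exploration constant
}

impl AttentionSelector {
    fn new(names: Vec<&'static str>, c: f64) -> Self {
        let n = names.len();
        AttentionSelector { names, pulls: vec![0; n], reward_sum: vec![0.0; n], c }
    }

    /// Choose the next mechanism to try.
    fn select(&self) -> usize {
        // Try every arm once before trusting the statistics.
        if let Some(i) = self.pulls.iter().position(|&p| p == 0) {
            return i;
        }
        let total: u64 = self.pulls.iter().sum();
        let mut best = 0;
        let mut best_score = f64::NEG_INFINITY;
        for i in 0..self.names.len() {
            let mean = self.reward_sum[i] / self.pulls[i] as f64;
            let bonus = self.c * ((total as f64).ln() / self.pulls[i] as f64).sqrt();
            if mean + bonus > best_score {
                best_score = mean + bonus;
                best = i;
            }
        }
        best
    }

    /// Feed back the observed improvement ratio for the chosen mechanism.
    fn update(&mut self, arm: usize, reward: f64) {
        self.pulls[arm] += 1;
        self.reward_sum[arm] += reward;
    }
}
```

Under this policy, a mechanism that consistently yields higher improvement ratios accumulates most of the pulls while the others are still retried occasionally.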
---

### Agent #4: SONA Integration

**Type**: `coder`

**Role**: Self-Optimizing Neural Architecture integration

**Responsibilities**:
1. Implement `DagSonaEngine`
2. Implement `MicroLoRA` adaptation
3. Implement `DagTrajectoryBuffer`
4. Implement `DagReasoningBank`
5. Implement `EwcPlusPlus` constraints

**Files to Create/Modify**:
```
ruvector-dag/src/sona/
├── mod.rs
├── engine.rs
├── micro_lora.rs
├── trajectory.rs
├── reasoning_bank.rs
└── ewc.rs
```

**Dependencies**: Agent #1 (QueryDag)

**Blocked By**: Agent #1

**Blocks**: Agents 6, 7, 13

**Deliverables**:
- [ ] `DagSonaEngine` orchestration
- [ ] `MicroLoRA` rank-2 adaptation (<100μs)
- [ ] Lock-free trajectory buffer
- [ ] K-means++ clustering for patterns
- [ ] EWC++ with Fisher information

**Estimated Complexity**: Very High
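The rank-2 "MicroLoRA" shape adds a low-rank correction B·(A·x) to the frozen base W·x: with rank r, the extra cost is just two small matvecs through an r-dimensional bottleneck, which is what makes the <100μs budget plausible. A minimal dense sketch (hypothetical types, not the crate's API):

```rust
/// Dense matrix-vector product over Vec-of-rows.
fn matvec(m: &[Vec<f64>], x: &[f64]) -> Vec<f64> {
    m.iter()
        .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum())
        .collect()
}

/// Apply the adapted layer: y = W x + B (A x).
/// W is d x d (frozen base), A is r x d, B is d x r, with r = 2 for MicroLoRA.
fn lora_forward(
    w: &[Vec<f64>],
    a: &[Vec<f64>],
    b: &[Vec<f64>],
    x: &[f64],
) -> Vec<f64> {
    let base = matvec(w, x);            // W x: the untouched base projection
    let compressed = matvec(a, x);      // A x: project into the rank-r bottleneck
    let delta = matvec(b, &compressed); // B (A x): expand back to d dims
    base.iter().zip(&delta).map(|(y, d)| y + d).collect()
}
```

Only A and B are trained, so the adaptation state per layer is 2·r·d floats rather than d².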
---

### Agent #5: MinCut Optimization

**Type**: `coder`

**Role**: Subpolynomial min-cut algorithms

**Responsibilities**:
1. Implement `DagMinCutEngine`
2. Implement `LocalKCut` oracle
3. Implement dynamic update algorithms
4. Implement bottleneck detection
5. Implement redundancy suggestions

**Files to Create/Modify**:
```
ruvector-dag/src/mincut/
├── mod.rs
├── engine.rs
├── local_kcut.rs
├── dynamic_updates.rs
├── bottleneck.rs
└── redundancy.rs
```

**Dependencies**: Agent #1 (QueryDag)

**Blocked By**: Agent #1

**Blocks**: Agent #2 (MinCutGatedAttention), Agent #6

**Deliverables**:
- [ ] `DagMinCutEngine` with O(n^0.12) updates
- [ ] `LocalKCut` oracle implementation
- [ ] Hierarchical decomposition
- [ ] Bottleneck scoring algorithm
- [ ] Redundancy recommendation engine

**Estimated Complexity**: Very High
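For orientation, the quantity `DagMinCutEngine` maintains is the s-t min-cut capacity, which by max-flow/min-cut duality equals the maximum flow. The baseline below is a plain Edmonds-Karp sketch on a dense capacity matrix; it only illustrates what is being computed and makes no attempt at the subpolynomial dynamic updates the engine targets:

```rust
use std::collections::VecDeque;

/// Edmonds-Karp max flow; the returned value equals the s-t min-cut capacity.
/// O(V * E^2) -- fine for small query DAGs, not for dynamic maintenance.
fn min_cut_value(mut cap: Vec<Vec<f64>>, s: usize, t: usize) -> f64 {
    let n = cap.len();
    let mut flow = 0.0;
    loop {
        // BFS for a shortest augmenting path in the residual graph.
        let mut parent = vec![usize::MAX; n];
        parent[s] = s;
        let mut q = VecDeque::from([s]);
        while let Some(u) = q.pop_front() {
            for v in 0..n {
                if parent[v] == usize::MAX && cap[u][v] > 1e-12 {
                    parent[v] = u;
                    q.push_back(v);
                }
            }
        }
        if parent[t] == usize::MAX {
            return flow; // no augmenting path left: flow is maximal
        }
        // Bottleneck capacity along the found path.
        let mut aug = f64::INFINITY;
        let mut v = t;
        while v != s {
            let u = parent[v];
            aug = aug.min(cap[u][v]);
            v = u;
        }
        // Push flow, updating forward and residual capacities.
        let mut v = t;
        while v != s {
            let u = parent[v];
            cap[u][v] -= aug;
            cap[v][u] += aug;
            v = u;
        }
        flow += aug;
    }
}
```

The edges saturated when no augmenting path remains are exactly the cut the bottleneck detector would report.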
---
|

### Agent #6: PostgreSQL Core Integration

**Type**: `backend-dev`

**Role**: Core PostgreSQL extension integration

**Responsibilities**:
1. Set up pgrx extension structure
2. Implement GUC variables
3. Implement global state management
4. Implement query hooks (planner, executor)
5. Implement background worker registration

**Files to Create/Modify**:
```
ruvector-postgres/src/dag/
├── mod.rs
├── extension.rs
├── guc.rs
├── state.rs
├── hooks.rs
└── worker.rs
```

**Dependencies**: Agents #1-5 (all core components)

**Blocked By**: Agents #1, #2, #3, #4, #5

**Blocks**: Agents #7, #8, #9

**Deliverables**:
- [ ] Extension scaffolding with pgrx
- [ ] All GUC variables from spec
- [ ] Thread-safe global state (DashMap)
- [ ] Planner hook for DAG analysis
- [ ] Executor hooks for trajectory capture
- [ ] Background worker main loop

**Estimated Complexity**: High

---

### Agent #7: PostgreSQL SQL Functions (Part 1)

**Type**: `backend-dev`

**Role**: Core SQL function implementations

**Responsibilities**:
1. Configuration functions
2. Query analysis functions
3. Attention functions
4. Basic status/health functions

**Files to Create/Modify**:
```
ruvector-postgres/src/dag/
├── functions/
│   ├── mod.rs
│   ├── config.rs
│   ├── analysis.rs
│   ├── attention.rs
│   └── status.rs
```

**SQL Functions**:
- `dag_set_enabled`
- `dag_set_learning_rate`
- `dag_set_attention`
- `dag_configure_sona`
- `dag_config`
- `dag_status`
- `dag_analyze_plan`
- `dag_critical_path`
- `dag_bottlenecks`
- `dag_attention_scores`
- `dag_attention_matrix`

**Dependencies**: Agent #6 (PostgreSQL core)

**Blocked By**: Agent #6

**Blocks**: Agent #13

**Deliverables**:
- [ ] All configuration SQL functions
- [ ] Query analysis functions
- [ ] Attention computation functions
- [ ] Status reporting functions

**Estimated Complexity**: Medium

---

### Agent #8: PostgreSQL SQL Functions (Part 2)

**Type**: `backend-dev`

**Role**: Learning and pattern SQL functions

**Responsibilities**:
1. Pattern management functions
2. Trajectory functions
3. Learning control functions
4. Self-healing functions

**Files to Create/Modify**:
```
ruvector-postgres/src/dag/
├── functions/
│   ├── patterns.rs
│   ├── trajectories.rs
│   ├── learning.rs
│   └── healing.rs
```

**SQL Functions**:
- `dag_store_pattern`
- `dag_query_patterns`
- `dag_pattern_clusters`
- `dag_consolidate_patterns`
- `dag_record_trajectory`
- `dag_trajectory_history`
- `dag_learn_now`
- `dag_reset_learning`
- `dag_health_report`
- `dag_anomalies`
- `dag_auto_repair`

**Dependencies**: Agent #6 (PostgreSQL core), Agent #4 (SONA)

**Blocked By**: Agents #4, #6

**Blocks**: Agent #13

**Deliverables**:
- [ ] Pattern CRUD functions
- [ ] Trajectory management functions
- [ ] Learning control functions
- [ ] Health and healing functions

**Estimated Complexity**: Medium

---

### Agent #9: Self-Healing System

**Type**: `coder`

**Role**: Autonomous self-healing implementation

**Responsibilities**:
1. Implement `AnomalyDetector`
2. Implement `IndexHealthChecker`
3. Implement `LearningDriftDetector`
4. Implement repair strategies
5. Implement healing orchestrator

**Files to Create/Modify**:
```
ruvector-dag/src/healing/
├── mod.rs
├── anomaly.rs
├── index_health.rs
├── drift_detector.rs
├── strategies.rs
└── orchestrator.rs
```

**Dependencies**: Agent #4 (SONA), Agent #6 (PostgreSQL hooks)

**Blocked By**: Agents #4, #6

**Blocks**: Agent #8 (healing SQL functions), Agent #13

**Deliverables**:
- [ ] Z-score anomaly detection
- [ ] HNSW/IVFFlat health monitoring
- [ ] Pattern drift detection
- [ ] Repair strategy implementations
- [ ] Healing loop orchestration

**Estimated Complexity**: High
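
The z-score deliverable reduces to keeping rolling moments over a window and flagging observations whose standardized distance from the window mean exceeds a threshold. A simplified sketch (the struct name and window policy here are illustrative; Milestone 5 specifies the actual `AnomalyDetector` contract):

```rust
/// Rolling z-score anomaly check over a fixed window.
/// Simplified take on the detector Agent #9 is tasked with.
struct ZScoreDetector {
    window: Vec<f64>,
    cap: usize,
    threshold: f64,
}

impl ZScoreDetector {
    fn new(cap: usize, threshold: f64) -> Self {
        Self { window: Vec::new(), cap, threshold }
    }

    /// Returns Some(z) if `x` is anomalous w.r.t. the current window,
    /// then folds `x` into the window.
    fn observe(&mut self, x: f64) -> Option<f64> {
        let verdict = if self.window.len() >= 2 {
            let n = self.window.len() as f64;
            let mean = self.window.iter().sum::<f64>() / n;
            let var = self.window.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
            // Guard against zero variance on flat workloads.
            let z = (x - mean) / var.sqrt().max(1e-9);
            (z.abs() > self.threshold).then_some(z)
        } else {
            None // too few samples to standardize
        };
        if self.window.len() == self.cap {
            self.window.remove(0); // evict oldest observation
        }
        self.window.push(x);
        verdict
    }
}
```

A production version would keep running sums instead of rescanning the window, but the decision rule is the same.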

---

### Agent #10: QuDAG Client

**Type**: `coder`

**Role**: QuDAG network client implementation

**Responsibilities**:
1. Implement `QuDagClient`
2. Implement network communication
3. Implement pattern proposal flow
4. Implement consensus validation
5. Implement pattern synchronization

**Files to Create/Modify**:
```
ruvector-dag/src/qudag/
├── mod.rs
├── client.rs
├── network.rs
├── proposal.rs
├── consensus.rs
└── sync.rs
```

**Dependencies**: Agent #4 (patterns to propose)

**Blocked By**: Agent #4

**Blocks**: Agents #11, #12

**Deliverables**:
- [ ] QuDAG network client
- [ ] Async communication layer
- [ ] Pattern proposal protocol
- [ ] Consensus status tracking
- [ ] Pattern sync mechanism

**Estimated Complexity**: High

---

### Agent #11: QuDAG Cryptography

**Type**: `security-manager`

**Role**: Quantum-resistant cryptography

**Responsibilities**:
1. Implement ML-KEM-768 wrapper
2. Implement ML-DSA signature wrapper
3. Implement identity management
4. Implement secure key storage
5. Implement differential privacy for patterns

**Files to Create/Modify**:
```
ruvector-dag/src/qudag/
├── crypto/
│   ├── mod.rs
│   ├── ml_kem.rs
│   ├── ml_dsa.rs
│   ├── identity.rs
│   ├── keystore.rs
│   └── differential_privacy.rs
```

**Dependencies**: Agent #10 (QuDAG client)

**Blocked By**: Agent #10

**Blocks**: Agent #12

**Deliverables**:
- [ ] ML-KEM-768 encrypt/decrypt
- [ ] ML-DSA sign/verify
- [ ] Identity keypair management
- [ ] Secure keystore (zeroize)
- [ ] Laplace noise for DP

**Estimated Complexity**: High
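
For the differential-privacy deliverable, Laplace noise with scale b = sensitivity/ε can be sampled by inverse-CDF from a single uniform draw. A sketch (the uniform is taken as a parameter to keep the example deterministic; a real implementation would draw it from a CSPRNG):

```rust
/// Sample Laplace(0, b) noise via inverse CDF, with b = sensitivity / epsilon.
/// Sketch of the DP step for shared patterns; `u` must lie in (0, 1).
fn laplace_noise(u: f64, sensitivity: f64, epsilon: f64) -> f64 {
    let b = sensitivity / epsilon;
    let centered = u - 0.5;
    // Inverse CDF of the Laplace distribution: sign flips at the median,
    // magnitude follows -b * ln(1 - 2|u - 0.5|).
    -b * centered.signum() * (1.0 - 2.0 * centered.abs()).ln()
}
```

Smaller ε means larger b and noisier (more private) pattern vectors; the sensitivity bound must cover the worst-case influence of any single trajectory.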

---

### Agent #12: QuDAG Token Integration

**Type**: `backend-dev`

**Role**: rUv token operations

**Responsibilities**:
1. Implement staking interface
2. Implement reward claiming
3. Implement balance tracking
4. Implement token SQL functions
5. Implement governance participation

**Files to Create/Modify**:
```
ruvector-dag/src/qudag/
├── tokens/
│   ├── mod.rs
│   ├── staking.rs
│   ├── rewards.rs
│   └── governance.rs

ruvector-postgres/src/dag/functions/
├── qudag.rs (SQL functions for QuDAG)
```

**Dependencies**: Agent #10 (QuDAG client), Agent #11 (crypto)

**Blocked By**: Agents #10, #11

**Blocks**: Agent #13

**Deliverables**:
- [ ] Staking operations
- [ ] Reward computation
- [ ] Balance queries
- [ ] QuDAG SQL functions
- [ ] Governance voting

**Estimated Complexity**: Medium

---

### Agent #13: Test Suite Developer

**Type**: `tester`

**Role**: Comprehensive test implementation

**Responsibilities**:
1. Unit tests for all modules
2. Integration tests
3. Property-based tests
4. Benchmark tests
5. CI pipeline setup

**Files to Create/Modify**:
```
ruvector-dag/tests/
├── unit/
│   ├── attention/
│   ├── sona/
│   ├── mincut/
│   ├── healing/
│   └── qudag/
├── integration/
│   ├── postgres/
│   └── qudag/
├── property/
└── fixtures/

ruvector-dag/benches/
├── attention_bench.rs
├── sona_bench.rs
└── mincut_bench.rs

.github/workflows/
└── dag-tests.yml
```

**Dependencies**: All code agents (1-12)

**Blocked By**: Agents #1-12 (tests require implementations)

**Blocks**: None (can test incrementally)

**Deliverables**:
- [ ] >80% unit test coverage
- [ ] All integration tests passing
- [ ] Property tests (1000+ cases)
- [ ] Benchmarks meeting performance targets
- [ ] CI/CD pipeline

**Estimated Complexity**: High

---

### Agent #14: Test Data & Fixtures

**Type**: `tester`

**Role**: Test data generation and fixtures

**Responsibilities**:
1. Generate realistic query DAGs
2. Generate synthetic patterns
3. Generate trajectory data
4. Create mock QuDAG server
5. Create test databases

**Files to Create/Modify**:
```
ruvector-dag/tests/
├── fixtures/
│   ├── dag_generator.rs
│   ├── pattern_generator.rs
│   ├── trajectory_generator.rs
│   └── mock_qudag.rs
├── data/
│   ├── sample_dags.json
│   ├── sample_patterns.bin
│   └── sample_trajectories.json
```

**Dependencies**: Agent #1 (DAG structure definitions)

**Blocked By**: Agent #1

**Blocks**: Agent #13 (needs fixtures)

**Deliverables**:
- [ ] DAG generator for all complexity levels
- [ ] Pattern generator for learning tests
- [ ] Mock QuDAG server for network tests
- [ ] Sample data files
- [ ] Test database setup scripts

**Estimated Complexity**: Medium

---

### Agent #15: Documentation & Examples

**Type**: `api-docs`

**Role**: API documentation and usage examples

**Responsibilities**:
1. Rust API documentation
2. SQL API documentation
3. Usage examples
4. Integration guides
5. Troubleshooting guides

**Files to Create/Modify**:
```
ruvector-dag/
├── README.md
├── examples/
│   ├── basic_usage.rs
│   ├── attention_selection.rs
│   ├── learning_workflow.rs
│   └── qudag_integration.rs

docs/dag/
├── USAGE.md
├── TROUBLESHOOTING.md
└── EXAMPLES.md
```

**Dependencies**: All code agents (1-12)

**Blocked By**: None (can document spec first, update with impl)

**Blocks**: None

**Deliverables**:
- [ ] Complete rustdoc for all public APIs
- [ ] SQL function documentation
- [ ] Working code examples
- [ ] Integration guide
- [ ] Troubleshooting guide

**Estimated Complexity**: Medium

---

## Task Dependencies Graph

```
                   ┌─────┐
                   │  0  │ Queen
                   └──┬──┘
                      │
        ┌─────────────┼─────────────┐
        │             │             │
     ┌──┴──┐       ┌──┴──┐       ┌──┴──┐
     │  1  │       │ 14  │       │ 15  │
     └──┬──┘       └──┬──┘       └─────┘
        │             │
   ┌────┼────┬────────┤
   │    │    │        │
┌──┴─┐┌─┴──┐┌┴──┐┌────┴──┐
│ 2  ││ 4  ││ 5 ││ (13)  │
└──┬─┘└─┬──┘└─┬─┘└───────┘
   │    │     │
┌──┴─┐  │     │
│ 3  │  │     │
└──┬─┘  │     │
   │    │     │
   └────┼─────┘
        │
     ┌──┴──┐
     │  6  │ PostgreSQL Core
     └──┬──┘
        │
   ┌────┼────┬────┐
   │    │    │    │
┌──┴─┐┌─┴──┐ │ ┌──┴──┐
│ 7  ││ 8  │ │ │  9  │
└────┘└────┘ │ └─────┘
             │
          ┌──┴──┐
          │ 10  │ QuDAG Client
          └──┬──┘
             │
          ┌──┴──┐
          │ 11  │ QuDAG Crypto
          └──┬──┘
             │
          ┌──┴──┐
          │ 12  │ QuDAG Tokens
          └──┬──┘
             │
          ┌──┴──┐
          │ 13  │ Tests
          └─────┘
```

## Execution Phases

### Phase 1: Foundation (Agents 1, 14, 15)
- Agent #1: Core DAG Engine
- Agent #14: Test fixtures (parallel)
- Agent #15: Documentation skeleton (parallel)

**Duration**: Can start immediately
**Milestone**: QueryDag and OperatorNode complete

### Phase 2: Core Features (Agents 2, 3, 4, 5)
- Agent #2: Basic Attention
- Agent #3: Advanced Attention (after Agent #2)
- Agent #4: SONA Integration
- Agent #5: MinCut Optimization

**Duration**: After Phase 1 foundation
**Milestone**: All attention mechanisms and learning components

### Phase 3: PostgreSQL Integration (Agents 6, 7, 8, 9)
- Agent #6: PostgreSQL Core
- Agent #7: SQL Functions Part 1 (after Agent #6)
- Agent #8: SQL Functions Part 2 (after Agent #6)
- Agent #9: Self-Healing (after Agent #6)

**Duration**: After Phase 2 core features
**Milestone**: Full PostgreSQL extension functional

### Phase 4: QuDAG Integration (Agents 10, 11, 12)
- Agent #10: QuDAG Client
- Agent #11: QuDAG Crypto (after Agent #10)
- Agent #12: QuDAG Tokens (after Agent #11)

**Duration**: Can start after Agent #4 (SONA)
**Milestone**: Distributed pattern learning operational

### Phase 5: Testing & Validation (Agent 13)
- Agent #13: Full test suite
- Integration testing
- Performance validation

**Duration**: Ongoing throughout, intensive at end
**Milestone**: All tests passing, benchmarks met

## Coordination Protocol

### Memory Keys for Cross-Agent Communication

```
swarm/dag/
├── status/
│   ├── agent_{N}_status      # Individual agent status
│   ├── phase_status          # Current phase
│   └── blockers              # Active blockers
├── artifacts/
│   ├── agent_{N}_files       # Files created/modified
│   ├── interfaces            # Shared interface definitions
│   └── schemas               # Data schemas
├── decisions/
│   ├── api_decisions         # API design decisions
│   ├── implementation        # Implementation choices
│   └── conflicts             # Resolved conflicts
└── metrics/
    ├── progress              # Completion percentages
    ├── performance           # Performance measurements
    └── issues                # Known issues
```

### Communication Hooks

Each agent MUST run before work:
```bash
npx claude-flow@alpha hooks pre-task --description "Agent #{N}: {task}"
npx claude-flow@alpha hooks session-restore --session-id "swarm-dag"
```

Each agent MUST run after work:
```bash
npx claude-flow@alpha hooks post-edit --file "{file}" --memory-key "swarm/dag/artifacts/agent_{N}_files"
npx claude-flow@alpha hooks post-task --task-id "agent_{N}_{task}"
```

## Success Criteria

| Agent | Must Complete | Performance Target |
|-------|---------------|-------------------|
| #1 | QueryDag, traversals, serialization | - |
| #2 | 4 attention mechanisms | <100μs per mechanism |
| #3 | 3 attention mechanisms + selector | <200μs per mechanism |
| #4 | SONA engine, MicroLoRA, ReasoningBank | <100μs adaptation |
| #5 | MinCut engine, dynamic updates | O(n^0.12) amortized |
| #6 | Extension scaffold, hooks, worker | - |
| #7 | 11 SQL functions | <5ms per function |
| #8 | 11 SQL functions | <5ms per function |
| #9 | Healing system | <1s detection latency |
| #10 | QuDAG client, sync | <500ms network ops |
| #11 | ML-KEM, ML-DSA | <10ms crypto ops |
| #12 | Token operations | <100ms token ops |
| #13 | >80% coverage, all benchmarks | - |
| #14 | All fixtures, mock server | - |
| #15 | Complete docs, examples | - |

---

*Document: 11-AGENT-TASKS.md | Version: 1.0 | Last Updated: 2025-01-XX*
757 docs/dag/12-MILESTONES.md Normal file
@@ -0,0 +1,757 @@

# Implementation Milestones

## Overview

A structured milestone plan for implementing the Neural DAG Learning system with 15-agent swarm coordination.

## Milestone Summary

```
┌──────────────────────────────────────────────────────────────────────┐
│                  NEURAL DAG LEARNING IMPLEMENTATION                  │
├──────────────────────────────────────────────────────────────────────┤
│ M1: Foundation         ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  15% │
│ M2: Core Attention     ████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░  25% │
│ M3: SONA Learning      ████████████████░░░░░░░░░░░░░░░░░░░░░░░░  35% │
│ M4: PostgreSQL         ████████████████████████░░░░░░░░░░░░░░░░  55% │
│ M5: Self-Healing       ████████████████████████████░░░░░░░░░░░░  65% │
│ M6: QuDAG Integration  ████████████████████████████████░░░░░░░░  80% │
│ M7: Testing            ████████████████████████████████████░░░░  90% │
│ M8: Production Ready   ████████████████████████████████████████ 100% │
└──────────────────────────────────────────────────────────────────────┘
```

---

## Milestone 1: Foundation

**Status**: Not Started
**Priority**: Critical
**Agents**: #1, #14, #15

### Objectives

- [ ] Establish core DAG data structures
- [ ] Create test fixture infrastructure
- [ ] Initialize documentation structure

### Deliverables

| Deliverable | Agent | Status | Notes |
|-------------|-------|--------|-------|
| `QueryDag` struct | #1 | Pending | Node/edge management |
| `OperatorNode` enum | #1 | Pending | All 15+ operator types |
| Topological sort | #1 | Pending | O(V+E) implementation |
| Cycle detection | #1 | Pending | For validation |
| DAG serialization | #1 | Pending | JSON + binary formats |
| Test DAG generator | #14 | Pending | All complexity levels |
| Mock fixtures | #14 | Pending | Sample data |
| Doc skeleton | #15 | Pending | README + guides |

### Acceptance Criteria

```rust
// Core functionality must work
let mut dag = QueryDag::new();
dag.add_node(0, OperatorNode::SeqScan { table: "users".into() });
dag.add_node(1, OperatorNode::Filter { predicate: "id > 0".into() });
dag.add_edge(0, 1).unwrap();

let sorted = dag.topological_sort().unwrap();
assert_eq!(sorted, vec![0, 1]);

let json = dag.to_json().unwrap();
let restored = QueryDag::from_json(&json).unwrap();
assert_eq!(dag, restored);
```
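
The O(V+E) topological-sort deliverable is typically Kahn's algorithm: repeatedly emit zero-in-degree nodes, which also yields the cycle detection deliverable for free. A standalone sketch (a free function over edge lists; the real `QueryDag` method signature may differ):

```rust
/// Kahn's algorithm: O(V + E) topological order, or None if a cycle exists.
fn topological_sort(n: usize, edges: &[(usize, usize)]) -> Option<Vec<usize>> {
    let mut adj = vec![Vec::new(); n];
    let mut indeg = vec![0usize; n];
    for &(u, v) in edges {
        adj[u].push(v);
        indeg[v] += 1;
    }
    // Seed with every node that has no incoming edges.
    let mut ready: Vec<usize> = (0..n).filter(|&v| indeg[v] == 0).collect();
    let mut order = Vec::with_capacity(n);
    while let Some(u) = ready.pop() {
        order.push(u);
        for &v in &adj[u] {
            indeg[v] -= 1;
            if indeg[v] == 0 {
                ready.push(v);
            }
        }
    }
    // Fewer than n emitted nodes means a cycle blocked the rest.
    (order.len() == n).then_some(order)
}
```

Running it once per plan validates the DAG invariant before any attention mechanism touches it.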

### Files Created

```
ruvector-dag/
├── Cargo.toml
├── src/
│   ├── lib.rs
│   └── dag/
│       ├── mod.rs
│       ├── query_dag.rs
│       ├── operator_node.rs
│       ├── traversal.rs
│       └── serialization.rs
└── tests/
    └── fixtures/
        ├── dag_generator.rs
        └── sample_dags.json
```

### Exit Criteria

- [ ] All unit tests pass for DAG module
- [ ] Benchmark: create 1000-node DAG in <10ms
- [ ] Documentation: rustdoc for all public items
- [ ] Code review approved by Queen agent

---

## Milestone 2: Core Attention Mechanisms

**Status**: Not Started
**Priority**: Critical
**Agents**: #2, #3

### Objectives

- [ ] Implement all 7 attention mechanisms
- [ ] Implement attention selector (UCB bandit)
- [ ] Achieve performance targets

### Deliverables

| Deliverable | Agent | Status | Target |
|-------------|-------|--------|--------|
| `DagAttention` trait | #2 | Pending | - |
| `TopologicalAttention` | #2 | Pending | <50μs/100 nodes |
| `CausalConeAttention` | #2 | Pending | <100μs/100 nodes |
| `CriticalPathAttention` | #2 | Pending | <75μs/100 nodes |
| `MinCutGatedAttention` | #2 | Pending | <200μs/100 nodes |
| `HierarchicalLorentzAttention` | #3 | Pending | <150μs/100 nodes |
| `ParallelBranchAttention` | #3 | Pending | <100μs/100 nodes |
| `TemporalBTSPAttention` | #3 | Pending | <120μs/100 nodes |
| `AttentionSelector` | #3 | Pending | UCB regret O(√T) |
| Attention cache | #3 | Pending | 10K entry LRU |

### Acceptance Criteria

```rust
// All mechanisms implement trait
let mechanisms: Vec<Box<dyn DagAttention>> = vec![
    Box::new(TopologicalAttention::new(config)),
    Box::new(CausalConeAttention::new(config)),
    Box::new(CriticalPathAttention::new(config)),
    Box::new(MinCutGatedAttention::new(config)),
    Box::new(HierarchicalLorentzAttention::new(config)),
    Box::new(ParallelBranchAttention::new(config)),
    Box::new(TemporalBTSPAttention::new(config)),
];

for mechanism in &mechanisms {
    let scores = mechanism.forward(&dag).unwrap();

    // Scores sum to ~1.0
    let sum: f32 = scores.values().sum();
    assert!((sum - 1.0).abs() < 0.001);

    // All scores in [0, 1]
    assert!(scores.values().all(|&s| s >= 0.0 && s <= 1.0));
}

// Selector chooses based on history
let mut selector = AttentionSelector::new(mechanisms.len());
for _ in 0..100 {
    let chosen = selector.select();
    let reward = simulate_query_improvement();
    selector.update(chosen, reward);
}
```
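
The selector's O(√T) regret target matches classic UCB1: pick the arm maximizing the empirical mean plus an exploration bonus √(2 ln T / nᵢ). A minimal sketch (the struct and method names only loosely mirror the spec's `AttentionSelector`; reward scaling to [0, 1] is an assumption):

```rust
/// Minimal UCB1 bandit over k attention mechanisms.
struct Ucb1 {
    counts: Vec<u64>, // pulls per arm
    means: Vec<f64>,  // running mean reward per arm
    total: u64,       // total pulls T
}

impl Ucb1 {
    fn new(k: usize) -> Self {
        Self { counts: vec![0; k], means: vec![0.0; k], total: 0 }
    }

    /// Pick the arm maximizing mean + sqrt(2 ln T / n); untried arms first.
    fn select(&self) -> usize {
        if let Some(i) = self.counts.iter().position(|&c| c == 0) {
            return i;
        }
        let t = self.total as f64;
        (0..self.counts.len())
            .max_by(|&a, &b| {
                let ucb =
                    |i: usize| self.means[i] + (2.0 * t.ln() / self.counts[i] as f64).sqrt();
                ucb(a).partial_cmp(&ucb(b)).unwrap()
            })
            .unwrap()
    }

    /// Fold in the observed reward (e.g. normalized latency improvement).
    fn update(&mut self, arm: usize, reward: f64) {
        self.counts[arm] += 1;
        self.total += 1;
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm] as f64;
    }
}
```

The exploration bonus shrinks as an arm accumulates pulls, so the selector keeps probing under-sampled mechanisms without abandoning the current best one.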

### Performance Benchmarks

| Mechanism | 10 nodes | 100 nodes | 500 nodes | 1000 nodes |
|-----------|----------|-----------|-----------|------------|
| Topological | <5μs | <50μs | <200μs | <500μs |
| CausalCone | <10μs | <100μs | <400μs | <1ms |
| CriticalPath | <8μs | <75μs | <300μs | <700μs |
| MinCutGated | <20μs | <200μs | <800μs | <2ms |
| HierarchicalLorentz | <15μs | <150μs | <600μs | <1.5ms |
| ParallelBranch | <10μs | <100μs | <400μs | <1ms |
| TemporalBTSP | <12μs | <120μs | <500μs | <1.2ms |

### Files Created

```
ruvector-dag/src/attention/
├── mod.rs
├── traits.rs
├── topological.rs
├── causal_cone.rs
├── critical_path.rs
├── mincut_gated.rs
├── hierarchical_lorentz.rs
├── parallel_branch.rs
├── temporal_btsp.rs
├── selector.rs
└── cache.rs
```

### Exit Criteria

- [ ] All 7 mechanisms pass unit tests
- [ ] All performance benchmarks met
- [ ] Property tests pass (1000 cases each)
- [ ] Selector converges to best mechanism in tests
- [ ] Code review approved

---

## Milestone 3: SONA Learning System

**Status**: Not Started
**Priority**: Critical
**Agents**: #4, #5

### Objectives

- [ ] Implement SONA engine with two-tier learning
- [ ] Implement MinCut optimization engine
- [ ] Achieve subpolynomial update complexity

### Deliverables

| Deliverable | Agent | Status | Target |
|-------------|-------|--------|--------|
| `DagSonaEngine` | #4 | Pending | Orchestration |
| `MicroLoRA` | #4 | Pending | <100μs adapt |
| `DagTrajectoryBuffer` | #4 | Pending | Lock-free, 1K cap |
| `DagReasoningBank` | #4 | Pending | 100 clusters, <2ms search |
| `EwcPlusPlus` | #4 | Pending | λ=5000 default |
| `DagMinCutEngine` | #5 | Pending | - |
| `LocalKCut` oracle | #5 | Pending | Local approximation |
| Dynamic updates | #5 | Pending | O(n^0.12) amortized |
| Bottleneck detection | #5 | Pending | - |

### Acceptance Criteria

```rust
// SONA instant loop
let mut sona = DagSonaEngine::new(config);
let dag = create_query_dag();

let start = Instant::now();
let enhanced = sona.pre_query(&dag).unwrap();
assert!(start.elapsed() < Duration::from_micros(100));

// Learning from trajectory
sona.post_query(&dag, &execution_metrics);

// Verify learning happened
let patterns = sona.reasoning_bank.query_similar(&dag.embedding(), 1);
assert!(!patterns.is_empty());

// MinCut dynamic updates
let mut mincut = DagMinCutEngine::new();
mincut.build_from_dag(&large_dag);

let timings: Vec<Duration> = (0..1000)
    .map(|_| {
        let start = Instant::now();
        mincut.update_edge(rand_u(), rand_v(), rand_weight());
        start.elapsed()
    })
    .collect();

let amortized = timings.iter().sum::<Duration>() / 1000;
// Verify subpolynomial: amortized << O(n)
```
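
The catastrophic-forgetting exit criterion hinges on EWC's quadratic penalty, λ/2 · Σᵢ Fᵢ(θᵢ − θ*ᵢ)², which anchors parameters important to past tasks (large Fisher values) while leaving the rest free to adapt. A sketch of just the penalty term (EWC++'s online Fisher-information update is omitted; names are illustrative):

```rust
/// EWC quadratic penalty: (lambda / 2) * sum_i F_i * (theta_i - anchor_i)^2.
/// `fisher` holds the diagonal Fisher information estimated on past tasks,
/// `anchor` the parameter values consolidated after those tasks.
fn ewc_penalty(theta: &[f64], anchor: &[f64], fisher: &[f64], lambda: f64) -> f64 {
    0.5 * lambda
        * theta
            .iter()
            .zip(anchor)
            .zip(fisher)
            .map(|((t, a), f)| f * (t - a).powi(2))
            .sum::<f64>()
}
```

This term is simply added to the task loss; with the spec's default λ = 5000, parameters with high Fisher weight are effectively frozen.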

### Files Created

```
ruvector-dag/src/
├── sona/
│   ├── mod.rs
│   ├── engine.rs
│   ├── micro_lora.rs
│   ├── trajectory.rs
│   ├── reasoning_bank.rs
│   └── ewc.rs
└── mincut/
    ├── mod.rs
    ├── engine.rs
    ├── local_kcut.rs
    ├── dynamic_updates.rs
    ├── bottleneck.rs
    └── redundancy.rs
```

### Exit Criteria

- [ ] MicroLoRA adapts in <100μs
- [ ] Pattern search in <2ms for 10K patterns
- [ ] EWC prevents catastrophic forgetting (>80% task retention)
- [ ] MinCut updates are O(n^0.12) amortized
- [ ] All tests pass

---

## Milestone 4: PostgreSQL Integration

**Status**: Not Started
**Priority**: Critical
**Agents**: #6, #7, #8

### Objectives

- [ ] Create functional PostgreSQL extension
- [ ] Implement all SQL functions
- [ ] Hook into query execution pipeline

### Deliverables

| Deliverable | Agent | Status | Notes |
|-------------|-------|--------|-------|
| pgrx extension setup | #6 | Pending | Extension skeleton |
| GUC variables | #6 | Pending | All config vars |
| Global state | #6 | Pending | DashMap-based |
| Planner hook | #6 | Pending | DAG analysis |
| Executor hooks | #6 | Pending | Trajectory capture |
| Background worker | #6 | Pending | Learning loop |
| Config SQL funcs | #7 | Pending | 5 functions |
| Analysis SQL funcs | #7 | Pending | 6 functions |
| Attention SQL funcs | #7 | Pending | 3 functions |
| Pattern SQL funcs | #8 | Pending | 4 functions |
| Trajectory SQL funcs | #8 | Pending | 3 functions |
| Learning SQL funcs | #8 | Pending | 4 functions |

### Acceptance Criteria

```sql
-- Extension loads successfully
CREATE EXTENSION ruvector_dag CASCADE;

-- Configuration works
SELECT ruvector.dag_set_enabled(true);
SELECT ruvector.dag_set_attention('auto');

-- Query analysis works
SELECT * FROM ruvector.dag_analyze_plan($$
    SELECT * FROM vectors
    WHERE embedding <-> '[0.1,0.2,0.3]' < 0.5
    LIMIT 10
$$);

-- Patterns are stored
INSERT INTO test_vectors SELECT generate_series(1,1000), random_vector(128);
SELECT COUNT(*) FROM ruvector.dag_pattern_clusters(); -- Should have clusters

-- Learning improves over time
DO $$
DECLARE
    initial_time FLOAT8;
    final_time FLOAT8;
BEGIN
    -- Run workload
    FOR i IN 1..100 LOOP
        PERFORM * FROM test_vectors ORDER BY embedding <-> random_vector(128) LIMIT 10;
    END LOOP;

    -- Check improvement
    SELECT avg_improvement INTO final_time FROM ruvector.dag_status();
    RAISE NOTICE 'Improvement ratio: %', final_time;
END $$;
```

### Files Created

```
ruvector-postgres/src/dag/
├── mod.rs
├── extension.rs
├── guc.rs
├── state.rs
├── hooks.rs
├── worker.rs
└── functions/
    ├── mod.rs
    ├── config.rs
    ├── analysis.rs
    ├── attention.rs
    ├── patterns.rs
    ├── trajectories.rs
    └── learning.rs
```

### Exit Criteria

- [ ] Extension creates without errors
- [ ] All 25+ SQL functions work
- [ ] Query hooks capture execution data
- [ ] Background worker runs learning loop
- [ ] Integration tests pass

---

## Milestone 5: Self-Healing System

**Status**: Not Started
**Priority**: High
**Agents**: #9

### Objectives

- [ ] Implement autonomous anomaly detection
- [ ] Implement automatic repair strategies
- [ ] Integrate with healing SQL functions

### Deliverables

| Deliverable | Status | Notes |
|-------------|--------|-------|
| `AnomalyDetector` | Pending | Z-score based |
| `IndexHealthChecker` | Pending | HNSW/IVFFlat |
| `LearningDriftDetector` | Pending | Pattern quality trends |
| `RepairStrategy` trait | Pending | Strategy interface |
| `IndexRebalanceStrategy` | Pending | Rebalance indexes |
| `PatternResetStrategy` | Pending | Reset bad patterns |
| `HealingOrchestrator` | Pending | Main loop |

### Acceptance Criteria

```rust
// Anomaly detection
let mut detector = AnomalyDetector::new(AnomalyConfig {
    z_threshold: 3.0,
    window_size: 100,
});

// Inject anomaly
for _ in 0..99 {
    detector.observe(1.0); // Normal
}
detector.observe(100.0); // Anomaly

let anomalies = detector.detect();
assert!(!anomalies.is_empty());
assert!(anomalies[0].z_score > 3.0);

// Self-healing
let orchestrator = HealingOrchestrator::new(config);
orchestrator.run_cycle().unwrap();

// Verify repairs applied
let health = orchestrator.health_report();
assert!(health.overall_score > 0.8);
```

### Files Created

```
ruvector-dag/src/healing/
├── mod.rs
├── anomaly.rs
├── index_health.rs
├── drift_detector.rs
├── strategies.rs
└── orchestrator.rs

ruvector-postgres/src/dag/functions/
└── healing.rs
```

### Exit Criteria

- [ ] Anomalies detected within 1s
- [ ] Repairs applied automatically
- [ ] No false positives in 24h test
- [ ] SQL healing functions work
- [ ] Integration tests pass

---
||||
|
||||
## Milestone 6: QuDAG Integration
|
||||
|
||||
**Status**: Not Started
|
||||
**Priority**: High
|
||||
**Agents**: #10, #11, #12
|
||||
|
||||
### Objectives
|
||||
|
||||
- [ ] Connect to QuDAG network
|
||||
- [ ] Implement quantum-resistant crypto
|
||||
- [ ] Enable distributed pattern learning
|
||||
|
||||
### Deliverables
|
||||
|
||||
| Deliverable | Agent | Status | Notes |
|
||||
|-------------|-------|--------|-------|
|
||||
| `QuDagClient` | #10 | Pending | Network client |
|
||||
| Pattern proposal | #10 | Pending | Submit patterns |
|
||||
| Pattern sync | #10 | Pending | Receive patterns |
|
||||
| Consensus validation | #10 | Pending | Track votes |
|
||||
| ML-KEM-768 | #11 | Pending | Encryption |
|
||||
| ML-DSA | #11 | Pending | Signatures |
|
||||
| Identity management | #11 | Pending | Key generation |
|
||||
| Differential privacy | #11 | Pending | Pattern noise |
|
||||
| Staking interface | #12 | Pending | Token staking |
|
||||
| Reward claiming | #12 | Pending | Earn rUv |
|
||||
| QuDAG SQL funcs | #12 | Pending | SQL interface |
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
```rust
|
||||
// Connect to network
|
||||
let client = QuDagClient::connect("https://qudag.example.com:8443").await?;
|
||||
assert!(client.is_connected());
|
||||
|
||||
// Propose pattern with DP
|
||||
let pattern = PatternProposal {
|
||||
vector: pattern_vector,
|
||||
metadata: metadata,
|
||||
noise: laplace_noise(epsilon),
|
||||
};
|
||||
let proposal_id = client.propose_pattern(pattern).await?;
|
||||
|
||||
// Wait for consensus
|
||||
let status = client.wait_for_consensus(&proposal_id, timeout).await?;
|
||||
assert!(matches!(status, ConsensusStatus::Finalized));
|
||||
|
||||
// Sync patterns
|
||||
let new_patterns = client.sync_patterns(since_round).await?;
|
||||
for pattern in new_patterns {
|
||||
reasoning_bank.import_pattern(pattern);
|
||||
}
|
||||
|
||||
// Token operations
|
||||
let balance = client.get_balance().await?;
|
||||
client.stake(100.0).await?;
|
||||
let rewards = client.claim_rewards().await?;
|
||||
```
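The `laplace_noise(epsilon)` helper above is left undefined; it refers to the standard Laplace mechanism for adding differential-privacy noise before a pattern leaves the node. A minimal sketch via inverse-CDF sampling follows — the `sensitivity` argument and the explicit uniform draw `u` are assumptions for illustration, and a real implementation would draw `u` from a CSPRNG:

```rust
/// Laplace noise with scale b = sensitivity / epsilon, via the inverse CDF.
/// `u` must lie in (-0.5, 0.5); in production it comes from a CSPRNG.
fn laplace_noise(epsilon: f64, sensitivity: f64, u: f64) -> f64 {
    let b = sensitivity / epsilon;
    -b * u.signum() * (1.0 - 2.0 * u.abs()).ln()
}

fn main() {
    // Smaller epsilon => larger scale b => noisier (more private) values.
    let tight = laplace_noise(10.0, 1.0, 0.25); // b = 0.1
    let loose = laplace_noise(0.1, 1.0, 0.25);  // b = 10.0
    assert!(tight.abs() < loose.abs());
    println!("eps=10: {:.4}, eps=0.1: {:.4}", tight, loose);
}
```

The same `u` produces noise exactly 100x larger when epsilon drops by 100x, which is the privacy/utility trade-off the "Differential privacy parameters validated" item in Milestone 8 has to review.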

### Files Created

```
ruvector-dag/src/qudag/
├── mod.rs
├── client.rs
├── network.rs
├── proposal.rs
├── consensus.rs
├── sync.rs
├── crypto/
│   ├── mod.rs
│   ├── ml_kem.rs
│   ├── ml_dsa.rs
│   ├── identity.rs
│   ├── keystore.rs
│   └── differential_privacy.rs
└── tokens/
    ├── mod.rs
    ├── staking.rs
    ├── rewards.rs
    └── governance.rs

ruvector-postgres/src/dag/functions/
└── qudag.rs
```

### Exit Criteria

- [ ] Connect to test QuDAG network
- [ ] Pattern proposals finalize
- [ ] Pattern sync works bidirectionally
- [ ] ML-KEM/ML-DSA operations work
- [ ] Token operations succeed
- [ ] SQL functions work

---

## Milestone 7: Comprehensive Testing

**Status**: Not Started
**Priority**: High
**Agents**: #13, #14

### Objectives

- [ ] Achieve >80% test coverage
- [ ] All benchmarks meet targets
- [ ] CI/CD pipeline operational

### Deliverables

| Category | Count | Status |
|----------|-------|--------|
| Unit tests (attention) | 50+ | Pending |
| Unit tests (sona) | 40+ | Pending |
| Unit tests (mincut) | 30+ | Pending |
| Unit tests (healing) | 25+ | Pending |
| Unit tests (qudag) | 35+ | Pending |
| Integration tests (postgres) | 20+ | Pending |
| Integration tests (qudag) | 15+ | Pending |
| Property tests | 20+ | Pending |
| Benchmarks | 15+ | Pending |

### Performance Verification

| Component | Target | Test |
|-----------|--------|------|
| Topological attention | <50μs / 100 nodes | Benchmark |
| MicroLoRA | <100μs | Benchmark |
| Pattern search | <2ms / 10K | Benchmark |
| MinCut update | O(n^0.12) | Benchmark |
| Query analysis | <5ms | Integration |
| Full learning cycle | <100ms | Integration |

### Coverage Targets

```
Overall:     >80%
attention/:  >90%
sona/:       >85%
mincut/:     >85%
healing/:    >80%
qudag/:      >75%
functions/:  >85%
```

### Files Created

```
ruvector-dag/
├── tests/
│   ├── unit/
│   │   ├── attention/
│   │   ├── sona/
│   │   ├── mincut/
│   │   ├── healing/
│   │   └── qudag/
│   ├── integration/
│   │   ├── postgres/
│   │   └── qudag/
│   ├── property/
│   └── fixtures/
└── benches/
    ├── attention_bench.rs
    ├── sona_bench.rs
    └── mincut_bench.rs

.github/workflows/
├── dag-tests.yml
└── dag-benchmarks.yml
```

### Exit Criteria

- [ ] Coverage >80%
- [ ] All tests pass
- [ ] All benchmarks meet targets
- [ ] CI pipeline green
- [ ] No critical issues

---

## Milestone 8: Production Ready

**Status**: Not Started
**Priority**: Critical
**Agents**: All

### Objectives

- [ ] Complete documentation
- [ ] Performance optimization
- [ ] Security audit
- [ ] Release preparation

### Deliverables

| Deliverable | Status |
|-------------|--------|
| Complete rustdoc | Pending |
| SQL API docs | Pending |
| Usage examples | Pending |
| Integration guide | Pending |
| Troubleshooting guide | Pending |
| Performance tuning guide | Pending |
| Security review | Pending |
| CHANGELOG | Pending |
| Release notes | Pending |

### Security Checklist

- [ ] No secret exposure
- [ ] Input validation on all SQL functions
- [ ] Safe memory handling (no leaks)
- [ ] Cryptographic review (ML-KEM/ML-DSA)
- [ ] Differential privacy parameters validated
- [ ] No SQL injection vectors
- [ ] Resource limits enforced

### Performance Optimization

- [ ] Profile and optimize hot paths
- [ ] Memory usage optimization
- [ ] Cache tuning
- [ ] Query plan caching
- [ ] Background worker tuning

### Release Checklist

- [ ] Version bump
- [ ] CHANGELOG updated
- [ ] All tests pass
- [ ] Benchmarks verified
- [ ] Documentation complete
- [ ] Examples tested
- [ ] Binary artifacts built
- [ ] crates.io ready (if applicable)

### Exit Criteria

- [ ] All previous milestones complete
- [ ] Documentation complete
- [ ] Security review passed
- [ ] Performance targets met
- [ ] Ready for production deployment

---

## Risk Register

| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| MinCut complexity target not achievable | High | Medium | Fall back to O(√n) algorithm |
| PostgreSQL hook instability | High | Low | Extensive testing, fallback modes |
| QuDAG network unavailable | Medium | Medium | Local-only fallback mode |
| Performance regression | Medium | Medium | Continuous benchmarking in CI |
| Memory leaks | High | Low | Valgrind/Miri testing |
| Cross-agent coordination failures | Medium | Medium | Queen agent mediation |

## Dependencies

### External Dependencies

| Dependency | Version | Purpose |
|------------|---------|---------|
| pgrx | ^0.11 | PostgreSQL extension |
| tokio | ^1.0 | Async runtime |
| dashmap | ^5.0 | Concurrent hashmap |
| crossbeam | ^0.8 | Lock-free structures |
| ndarray | ^0.15 | Numeric arrays |
| ml-kem | TBD | ML-KEM-768 |
| ml-dsa | TBD | ML-DSA signatures |

### Internal Dependencies

- `ruvector-core`: Vector operations, SONA base
- `ruvector-graph`: GNN, attention base
- `ruvector-postgres`: Extension infrastructure

---

## Completion Tracking

| Milestone | Weight | Status | Completion |
|-----------|--------|--------|------------|
| M1: Foundation | 15% | Not Started | 0% |
| M2: Core Attention | 10% | Not Started | 0% |
| M3: SONA Learning | 10% | Not Started | 0% |
| M4: PostgreSQL | 20% | Not Started | 0% |
| M5: Self-Healing | 10% | Not Started | 0% |
| M6: QuDAG | 15% | Not Started | 0% |
| M7: Testing | 10% | Not Started | 0% |
| M8: Production | 10% | Not Started | 0% |
| **TOTAL** | **100%** | - | **0%** |
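The overall figure is the weight-averaged sum of the per-milestone completion percentages; a quick sketch of the arithmetic, with weights taken from the table above:

```rust
/// Weighted completion: sum over milestones of weight_i * completion_i / 100.
fn overall(milestones: &[(f64, f64)]) -> f64 {
    milestones.iter().map(|(w, c)| w * c / 100.0).sum()
}

fn main() {
    // (weight %, completion %) for M1..M8, in table order.
    let m = [
        (15.0, 0.0), (10.0, 0.0), (10.0, 0.0), (20.0, 0.0),
        (10.0, 0.0), (15.0, 0.0), (10.0, 0.0), (10.0, 0.0),
    ];
    // Weights must cover the whole plan.
    assert_eq!(m.iter().map(|(w, _)| w).sum::<f64>(), 100.0);
    println!("overall completion: {:.0}%", overall(&m)); // 0% until work starts
}
```

For example, finishing only M4 (weight 20%) would move the total to 20%, while finishing only M2 would move it to 10%.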

---

*Document: 12-MILESTONES.md | Version: 1.0 | Last Updated: 2025-01-XX*