Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
This commit is contained in:
657
npm/packages/ruvbot/docs/FEATURE_COMPARISON.md
Normal file
657
npm/packages/ruvbot/docs/FEATURE_COMPARISON.md
Normal file
@@ -0,0 +1,657 @@
|
||||
# RuvBot vs Clawdbot: Feature Parity & SOTA Comparison
|
||||
|
||||
## Executive Summary
|
||||
|
||||
RuvBot builds on Clawdbot's pioneering personal AI assistant architecture while **fixing critical security vulnerabilities** and introducing **state-of-the-art (SOTA)** improvements through RuVector's WASM-accelerated vector operations, self-learning neural patterns, and enterprise-grade multi-tenancy.
|
||||
|
||||
## Critical Security Gap in Clawdbot
|
||||
|
||||
**Clawdbot should NOT be used in production environments** without significant security hardening:
|
||||
|
||||
| Security Feature | Clawdbot | RuvBot | Risk Level |
|
||||
|-----------------|----------|--------|------------|
|
||||
| Prompt Injection Defense | **MISSING** | Protected | **CRITICAL** |
|
||||
| Jailbreak Detection | **MISSING** | Protected | **CRITICAL** |
|
||||
| PII Data Protection | **MISSING** | Auto-masked | **HIGH** |
|
||||
| Input Sanitization | **MISSING** | Full | **HIGH** |
|
||||
| Multi-tenant Isolation | **MISSING** | PostgreSQL RLS | **HIGH** |
|
||||
| Response Validation | **MISSING** | AIDefence | **MEDIUM** |
|
||||
| Audit Logging | **BASIC** | Comprehensive | **MEDIUM** |
|
||||
|
||||
**RuvBot addresses ALL of these vulnerabilities** with a 6-layer defense-in-depth architecture and integrated AIDefence protection.
|
||||
|
||||
## Feature Comparison Matrix
|
||||
|
||||
| Feature | Clawdbot | RuvBot | RuvBot Advantage |
|
||||
|---------|----------|--------|------------------|
|
||||
| **Security** | Basic | 6-layer + AIDefence | **CRITICAL UPGRADE** |
|
||||
| **Prompt Injection** | **VULNERABLE** | Protected (<5ms) | **Essential** |
|
||||
| **Jailbreak Defense** | **VULNERABLE** | Detected + Blocked | **Essential** |
|
||||
| **PII Protection** | **NONE** | Auto-masked | **Compliance-ready** |
|
||||
| **Vector Memory** | Optional | HNSW-indexed WASM | 150x-12,500x faster search |
|
||||
| **Learning** | Static | SONA adaptive | Self-improving with EWC++ |
|
||||
| **Embeddings** | External API | Local WASM | 75x faster, no network latency |
|
||||
| **Multi-tenancy** | Single-user | Full RLS | Enterprise-ready isolation |
|
||||
| **LLM Models** | Single provider | 12+ (Gemini 2.5, Claude, GPT) | Full flexibility |
|
||||
| **LLM Routing** | Single model | MoE + FastGRNN | 100% routing accuracy |
|
||||
| **Background Tasks** | Basic | agentic-flow workers | 12 specialized worker types |
|
||||
| **Plugin System** | Basic | IPFS registry + sandboxed | claude-flow inspired |
|
||||
|
||||
## Deep Feature Analysis
|
||||
|
||||
### 1. Vector Memory System
|
||||
|
||||
#### Clawdbot
|
||||
- Uses external embedding APIs (OpenAI, etc.)
|
||||
- In-memory or basic database storage
|
||||
- Linear search for retrieval
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ RuvBot Memory Architecture │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ WASM Embedder (384-4096 dim) │
|
||||
│ └─ SIMD-optimized vector operations │
|
||||
│ └─ LRU caching (10K+ entries) │
|
||||
│ └─ Batch processing (32 vectors/batch) │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ HNSW Index (RuVector) │
|
||||
│ └─ Hierarchical Navigable Small Worlds │
|
||||
│ └─ O(log n) search complexity │
|
||||
│ └─ 100K-10M vector capacity │
|
||||
│ └─ ef_construction=200, M=16 (tuned) │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Memory Types │
|
||||
│ └─ Episodic: Conversation events │
|
||||
│ └─ Semantic: Knowledge/facts │
|
||||
│ └─ Procedural: Skills/patterns │
|
||||
│ └─ Working: Short-term context │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
Performance Benchmarks:
|
||||
- 10K vectors: <1ms search (vs 50ms Clawdbot)
|
||||
- 100K vectors: <5ms search (vs 500ms+ Clawdbot)
|
||||
- 1M vectors: <10ms search (not feasible in Clawdbot)
|
||||
```
|
||||
|
||||
### 2. Self-Learning System
|
||||
|
||||
#### Clawdbot
|
||||
- No built-in learning
|
||||
- Static skill definitions
|
||||
- Manual updates required
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
SONA Learning Pipeline:
|
||||
1. RETRIEVE: HNSW pattern search (<1ms)
|
||||
2. JUDGE: Verdict classification (success/failure)
|
||||
3. DISTILL: LoRA weight extraction
|
||||
4. CONSOLIDATE: EWC++ prevents catastrophic forgetting
|
||||
|
||||
Trajectory Learning:
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ User Query ──► Agent Response ──► Outcome ──► Pattern Store │
|
||||
│ │ │ │ │ │
|
||||
│ ▼ ▼ ▼ ▼ │
|
||||
│ Embedding Action Log Reward Score Neural Update │
|
||||
│ │
|
||||
│ Continuous improvement with each interaction │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### 3. LLM Routing & Intelligence
|
||||
|
||||
#### Clawdbot
|
||||
- Single model configuration
|
||||
- Manual model selection
|
||||
- No routing optimization
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
3-Tier Intelligent Routing:
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Tier 1: Agent Booster (<1ms, $0) │
|
||||
│ └─ Simple transforms: var→const, add-types, remove-console │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Tier 2: Haiku (~500ms, $0.0002) │
|
||||
│ └─ Bug fixes, simple tasks, low complexity │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Tier 3: Sonnet/Opus (2-5s, $0.003-$0.015) │
|
||||
│ └─ Architecture, security, complex reasoning │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
MoE (Mixture of Experts) + FastGRNN:
|
||||
- 100% routing accuracy (hybrid keyword-first strategy)
|
||||
- 75% cost reduction vs always-Sonnet
|
||||
- 352x faster for Tier 1 tasks
|
||||
```
|
||||
|
||||
### 4. Multi-Tenancy & Enterprise Features
|
||||
|
||||
#### Clawdbot
|
||||
- Single-user design
|
||||
- Shared data storage
|
||||
- No isolation
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
Enterprise Multi-Tenancy:
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Tenant Isolation Layers │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Database: PostgreSQL Row-Level Security (RLS) │
|
||||
│ └─ Automatic tenant_id filtering │
|
||||
│ └─ Cross-tenant queries impossible │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Memory: Namespace isolation │
|
||||
│ └─ Separate HNSW indices per tenant │
|
||||
│ └─ Embedding isolation │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Workers: Tenant-scoped queues │
|
||||
│ └─ Resource quotas per tenant │
|
||||
│ └─ Priority scheduling │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ API: Tenant context middleware │
|
||||
│ └─ JWT claims with tenant_id │
|
||||
│ └─ Rate limits per tenant │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### 5. Background Workers
|
||||
|
||||
#### Clawdbot
|
||||
- Basic async processing
|
||||
- No specialized workers
|
||||
- Limited task types
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
12 Specialized Background Workers:
|
||||
┌───────────────────┬──────────┬─────────────────────────────────┐
|
||||
│ Worker │ Priority │ Purpose │
|
||||
├───────────────────┼──────────┼─────────────────────────────────┤
|
||||
│ ultralearn │ normal │ Deep knowledge acquisition │
|
||||
│ optimize │ high │ Performance optimization │
|
||||
│ consolidate │ low │ Memory consolidation (EWC++) │
|
||||
│ predict │ normal │ Predictive preloading │
|
||||
│ audit │ critical │ Security analysis │
|
||||
│ map │ normal │ Codebase/context mapping │
|
||||
│ preload │ low │ Resource preloading │
|
||||
│ deepdive │ normal │ Deep code/content analysis │
|
||||
│ document │ normal │ Auto-documentation │
|
||||
│ refactor │ normal │ Refactoring suggestions │
|
||||
│ benchmark │ normal │ Performance benchmarking │
|
||||
│ testgaps │ normal │ Test coverage analysis │
|
||||
└───────────────────┴──────────┴─────────────────────────────────┘
|
||||
```
|
||||
|
||||
### 6. Security Comparison
|
||||
|
||||
#### Clawdbot
|
||||
- Good baseline security
|
||||
- Environment-based secrets
|
||||
- Basic input validation
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
6-Layer Defense in Depth:
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Layer 1: Transport (TLS 1.3, HSTS, cert pinning) │
|
||||
│ Layer 2: Authentication (JWT RS256, OAuth 2.0, rate limiting) │
|
||||
│ Layer 3: Authorization (RBAC, claims, tenant isolation) │
|
||||
│ Layer 4: Data Protection (AES-256-GCM, key rotation) │
|
||||
│ Layer 5: Input Validation (Zod schemas, injection prevention) │
|
||||
│ Layer 6: WASM Sandbox (memory isolation, resource limits) │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
Compliance Ready:
|
||||
- GDPR: Data export, deletion, consent
|
||||
- SOC 2: Audit logging, access controls
|
||||
- HIPAA: Encryption, access logging (configurable)
|
||||
```
|
||||
|
||||
## Performance Benchmarks
|
||||
|
||||
| Operation | Clawdbot | RuvBot | Improvement |
|
||||
|-----------|----------|--------|-------------|
|
||||
| Embedding generation | 200ms (API) | 2.7ms (WASM) | 74x faster |
|
||||
| Vector search (10K) | 50ms | <1ms | 50x faster |
|
||||
| Vector search (100K) | 500ms+ | <5ms | 100x faster |
|
||||
| Session restore | 100ms | 10ms | 10x faster |
|
||||
| Skill invocation | 50ms | 5ms | 10x faster |
|
||||
| Cold start | 3s | 500ms | 6x faster |
|
||||
|
||||
## Architecture Advantages
|
||||
|
||||
### RuvBot SOTA Innovations
|
||||
|
||||
1. **WASM-First Design**
|
||||
- Cross-platform consistency
|
||||
- No native compilation needed
|
||||
- Portable to browser environments
|
||||
|
||||
2. **Neural Substrate Integration**
|
||||
- Continuous learning via SONA
|
||||
- Pattern recognition with MoE
|
||||
- Catastrophic forgetting prevention (EWC++)
|
||||
|
||||
3. **Distributed Coordination**
|
||||
- Byzantine fault-tolerant consensus
|
||||
- Raft leader election
|
||||
- Gossip protocol for eventual consistency
|
||||
|
||||
4. **RuVector Integration**
|
||||
- 53+ SQL functions for vectors
|
||||
- 39 attention mechanisms
|
||||
- Hyperbolic embeddings for hierarchies
|
||||
- Flash Attention (2.49x-7.47x speedup)
|
||||
|
||||
## Migration Path
|
||||
|
||||
Clawdbot users can migrate to RuvBot with:
|
||||
|
||||
```bash
|
||||
# Export Clawdbot data
|
||||
clawdbot export --format json > data.json
|
||||
|
||||
# Import to RuvBot
|
||||
ruvbot import --from-clawdbot data.json
|
||||
|
||||
# Verify migration
|
||||
ruvbot doctor --verify-migration
|
||||
```
|
||||
|
||||
## Skills Comparison (52 Clawdbot → 68+ RuvBot)
|
||||
|
||||
### Clawdbot Skills (52)
|
||||
```
|
||||
1password, apple-notes, apple-reminders, bear-notes, bird, blogwatcher,
|
||||
blucli, bluebubbles, camsnap, canvas, clawdhub, coding-agent, discord,
|
||||
eightctl, food-order, gemini, gifgrep, github, gog, goplaces, himalaya,
|
||||
imsg, local-places, mcporter, model-usage, nano-banana-pro, nano-pdf,
|
||||
notion, obsidian, openai-image-gen, openai-whisper, openai-whisper-api,
|
||||
openhue, oracle, ordercli, peekaboo, sag, session-logs, sherpa-onnx-tts,
|
||||
skill-creator, slack, songsee, sonoscli, spotify-player, summarize,
|
||||
things-mac, tmux, trello, video-frames, voice-call, wacli, weather
|
||||
```
|
||||
|
||||
### RuvBot Skills (68+)
|
||||
```
|
||||
All 52 Clawdbot skills PLUS:
|
||||
|
||||
RuVector-Enhanced Skills:
|
||||
├─ semantic-search : HNSW O(log n) vector search (150x faster)
|
||||
├─ pattern-learning : SONA trajectory learning
|
||||
├─ hybrid-search : Vector + BM25 fusion
|
||||
├─ embedding-batch : Parallel WASM embedding
|
||||
├─ context-predict : Predictive context preloading
|
||||
├─ memory-consolidate : EWC++ memory consolidation
|
||||
|
||||
Distributed Skills (agentic-flow):
|
||||
├─ swarm-orchestrate : Multi-agent coordination
|
||||
├─ consensus-reach : Byzantine fault-tolerant consensus
|
||||
├─ load-balance : Dynamic task distribution
|
||||
├─ mesh-coordinate : Peer-to-peer mesh networking
|
||||
|
||||
Enterprise Skills:
|
||||
├─ tenant-isolate : Multi-tenant data isolation
|
||||
├─ audit-log : Comprehensive security logging
|
||||
├─ key-rotate : Automatic secret rotation
|
||||
├─ rls-enforce : Row-level security enforcement
|
||||
```
|
||||
|
||||
## Complete Module Comparison
|
||||
|
||||
| Module Category | Clawdbot (68) | RuvBot | RuvBot Advantage |
|
||||
|-----------------|---------------|--------|------------------|
|
||||
| **Core** | agents, sessions, memory | ✅ | + SONA learning |
|
||||
| **Channels** | slack, discord, telegram, signal, whatsapp, line, imessage | ✅ All + web | + Multi-tenant channels |
|
||||
| **CLI** | cli, commands | ✅ + MCP server | + 140+ subcommands |
|
||||
| **Memory** | SQLite + FTS | ✅ + HNSW WASM | **150-12,500x faster** |
|
||||
| **Embedding** | OpenAI/Gemini API | ✅ + Local WASM | **75x faster, $0 cost** |
|
||||
| **Workers** | Basic async | 12 specialized | + Learning workers |
|
||||
| **Routing** | Single model | 3-tier MoE | **75% cost reduction** |
|
||||
| **Cron** | Basic scheduler | ✅ + Priority queues | + Tenant-scoped |
|
||||
| **Daemon** | Basic | ✅ + Health checks | + Auto-recovery |
|
||||
| **Gateway** | HTTP | ✅ + WebSocket | + GraphQL subscriptions |
|
||||
| **Plugin SDK** | JavaScript | ✅ + WASM | + Sandboxed execution |
|
||||
| **TTS** | sherpa-onnx | ✅ + RuvLLM | + Lower latency |
|
||||
| **TUI** | Basic | ✅ + Rich | + Status dashboard |
|
||||
| **Security** | Good | 6-layer | + Defense in depth |
|
||||
| **Browser** | Puppeteer | ✅ + Playwright | + Session persistence |
|
||||
| **Media** | Basic | ✅ + WASM | + GPU acceleration |
|
||||
|
||||
## RuVector Exclusive Capabilities
|
||||
|
||||
### 1. WASM Vector Operations (npm @ruvector/wasm-unified)
|
||||
```typescript
|
||||
// RuvBot uses RuVector WASM for all vector operations
|
||||
import { HnswIndex, simdDistance } from '@ruvector/wasm-unified';
|
||||
|
||||
// 150x faster than Clawdbot's external API
|
||||
const results = await hnswIndex.search(query, { k: 10 });
|
||||
```
|
||||
|
||||
### 2. Local LLM with SONA (npm @ruvector/ruvllm)
|
||||
```typescript
|
||||
// Self-Optimizing Neural Architecture
|
||||
import { RuvLLM, SonaTrainer } from '@ruvector/ruvllm';
|
||||
|
||||
// Continuous learning from every interaction
|
||||
await sonaTrainer.train({
|
||||
trajectory: session.messages,
|
||||
outcome: 'success',
|
||||
consolidate: true // EWC++ prevents forgetting
|
||||
});
|
||||
```
|
||||
|
||||
### 3. PostgreSQL Vector Store (npm @ruvector/postgres-cli)
|
||||
```sql
|
||||
-- RuVector adds 53+ vector SQL functions
|
||||
SELECT * FROM memories
|
||||
WHERE tenant_id = current_tenant() -- RLS
|
||||
ORDER BY embedding <=> $query -- Cosine similarity
|
||||
LIMIT 10;
|
||||
```
|
||||
|
||||
### 4. Agentic-Flow Integration (npx agentic-flow)
|
||||
```typescript
|
||||
// Multi-agent swarm coordination
|
||||
import { SwarmCoordinator, ByzantineConsensus } from 'agentic-flow';
|
||||
|
||||
// 12 specialized background workers
|
||||
await swarm.dispatch({
|
||||
worker: 'ultralearn',
|
||||
task: { type: 'deep-analysis', content }
|
||||
});
|
||||
```
|
||||
|
||||
## Benchmark: RuvBot Dominance
|
||||
|
||||
| Metric | Clawdbot | RuvBot | Ratio |
|
||||
|--------|----------|--------|-------|
|
||||
| Embedding latency | 200ms | 2.7ms | **74x** |
|
||||
| 10K vector search | 50ms | <1ms | **50x** |
|
||||
| 100K vector search | 500ms | <5ms | **100x** |
|
||||
| 1M vector search | N/A | <10ms | **∞** |
|
||||
| Session restore | 100ms | 10ms | **10x** |
|
||||
| Skill invocation | 50ms | 5ms | **10x** |
|
||||
| Cold start | 3000ms | 500ms | **6x** |
|
||||
| Memory consolidation | N/A | <50ms | **∞** |
|
||||
| Pattern learning | N/A | <5ms | **∞** |
|
||||
| Multi-tenant query | N/A | <2ms | **∞** |
|
||||
|
||||
## agentic-flow Integration Details
|
||||
|
||||
### Background Workers (12 Types)
|
||||
| Worker | Clawdbot | RuvBot | Enhancement |
|
||||
|--------|----------|--------|-------------|
|
||||
| ultralearn | ❌ | ✅ | Deep knowledge acquisition |
|
||||
| optimize | ❌ | ✅ | Performance optimization |
|
||||
| consolidate | ❌ | ✅ | EWC++ memory consolidation |
|
||||
| predict | ❌ | ✅ | Predictive preloading |
|
||||
| audit | ❌ | ✅ | Security analysis |
|
||||
| map | ❌ | ✅ | Codebase mapping |
|
||||
| preload | ❌ | ✅ | Resource preloading |
|
||||
| deepdive | ❌ | ✅ | Deep code analysis |
|
||||
| document | ❌ | ✅ | Auto-documentation |
|
||||
| refactor | ❌ | ✅ | Refactoring suggestions |
|
||||
| benchmark | ❌ | ✅ | Performance benchmarking |
|
||||
| testgaps | ❌ | ✅ | Test coverage analysis |
|
||||
|
||||
### Swarm Topologies
|
||||
| Topology | Clawdbot | RuvBot | Use Case |
|
||||
|----------|----------|--------|----------|
|
||||
| hierarchical | ❌ | ✅ | Queen-worker coordination |
|
||||
| mesh | ❌ | ✅ | Peer-to-peer networking |
|
||||
| hierarchical-mesh | ❌ | ✅ | Hybrid scalability |
|
||||
| adaptive | ❌ | ✅ | Dynamic switching |
|
||||
|
||||
### Consensus Mechanisms
|
||||
| Protocol | Clawdbot | RuvBot | Fault Tolerance |
|
||||
|----------|----------|--------|-----------------|
|
||||
| Byzantine | ❌ | ✅ | f < n/3 faulty |
|
||||
| Raft | ❌ | ✅ | f < n/2 failures |
|
||||
| Gossip | ❌ | ✅ | Eventually consistent |
|
||||
| CRDT | ❌ | ✅ | Conflict-free replication |
|
||||
|
||||
### 10. Cloud Deployment
|
||||
|
||||
#### Clawdbot
|
||||
- Manual deployment
|
||||
- No cloud-native support
|
||||
- Self-managed infrastructure
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
Google Cloud Platform (Cost-Optimized):
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Cloud Run (Serverless) │
|
||||
│ └─ Scale to zero when idle │
|
||||
│ └─ Auto-scale 0-100 instances │
|
||||
│ └─ 512Mi memory, sub-second cold start │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Cloud SQL (PostgreSQL) │
|
||||
│ └─ db-f1-micro (~$10/month) │
|
||||
│ └─ Automatic backups │
|
||||
│ └─ Row-Level Security │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Infrastructure as Code │
|
||||
│ └─ Terraform modules included │
|
||||
│ └─ Cloud Build CI/CD pipeline │
|
||||
│ └─ One-command deployment │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
Estimated Monthly Cost:
|
||||
| Traffic Level | Configuration | Cost |
|
||||
|---------------|---------------|------|
|
||||
| Low (<1K/day) | Min resources | ~$15-20/month |
|
||||
| Medium (<10K/day) | Scaled | ~$40/month |
|
||||
| High (<100K/day) | Enterprise | ~$150/month |
|
||||
```
|
||||
|
||||
### 11. LLM Provider Support
|
||||
|
||||
#### Clawdbot
|
||||
- Single provider (typically OpenAI)
|
||||
- No model routing
|
||||
- Fixed pricing
|
||||
- No Gemini 2.5 support
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
Multi-Provider Architecture with Gemini 2.5 Default:
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ OpenRouter (200+ Models) - DEFAULT PROVIDER │
|
||||
│ └─ Google Gemini 2.5 Pro Preview (RECOMMENDED) │
|
||||
│ └─ Google Gemini 2.0 Flash (fast responses) │
|
||||
│ └─ Google Gemini 2.0 Flash Thinking (FREE reasoning) │
|
||||
│ └─ Qwen QwQ-32B (Reasoning) - FREE tier available │
|
||||
│ └─ DeepSeek R1 (Open-source reasoning) │
|
||||
│ └─ OpenAI O1/GPT-4o │
|
||||
│ └─ Meta Llama 3.1 405B │
|
||||
│ └─ Best for: Cost optimization, variety │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Anthropic (Direct API) │
|
||||
│ └─ Claude 3.5 Sonnet (latest) │
|
||||
│ └─ Claude 3 Opus (complex analysis) │
|
||||
│ └─ Best for: Quality, reliability, safety │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
Model Comparison (12 Available):
|
||||
| Model | Provider | Best For | Cost |
|
||||
|-------|----------|----------|------|
|
||||
| Gemini 2.5 Pro | OpenRouter | General + Reasoning | $$ |
|
||||
| Gemini 2.0 Flash | OpenRouter | Speed | $ |
|
||||
| Gemini 2.0 Flash Thinking | OpenRouter | Reasoning | FREE |
|
||||
| Claude 3.5 Sonnet | Anthropic | Quality | $$$ |
|
||||
| GPT-4o | OpenRouter | General | $$$ |
|
||||
| QwQ-32B | OpenRouter | Math/Reasoning | $ |
|
||||
| QwQ-32B Free | OpenRouter | Budget | FREE |
|
||||
| DeepSeek R1 | OpenRouter | Open-source | $ |
|
||||
| O1 Preview | OpenRouter | Advanced reasoning | $$$$ |
|
||||
| Llama 3.1 405B | OpenRouter | Enterprise | $$ |
|
||||
|
||||
Intelligent Model Selection:
|
||||
- Budget → Gemini 2.0 Flash Thinking (FREE) or QwQ Free
|
||||
- General → Gemini 2.5 Pro (DEFAULT)
|
||||
- Quality → Claude 3.5 Sonnet
|
||||
- Complex reasoning → O1 Preview or Claude Opus
|
||||
```
|
||||
|
||||
### 12. Hybrid Search
|
||||
|
||||
#### Clawdbot
|
||||
- Vector-only search
|
||||
- No keyword fallback
|
||||
- Limited result ranking
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
Hybrid Search Architecture (ADR-009):
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Query Processing │
|
||||
│ ┌─────────────┐ ┌─────────────┐ │
|
||||
│ │ BM25 │ │ Vector │ │
|
||||
│ │ Keyword │ │ Semantic │ │
|
||||
│ │ Search │ │ Search │ │
|
||||
│ └──────┬──────┘ └──────┬──────┘ │
|
||||
│ │ │ │
|
||||
│ └────────────┬───────────────┘ │
|
||||
│ ▼ │
|
||||
│ ┌───────────────┐ │
|
||||
│ │ RRF Fusion │ │
|
||||
│ │ (k=60) │ │
|
||||
│ └───────┬───────┘ │
|
||||
│ ▼ │
|
||||
│ ┌───────────────┐ │
|
||||
│ │ Re-ranking │ │
|
||||
│ │ + Filtering │ │
|
||||
│ └───────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
BM25 Configuration:
|
||||
- k1: 1.2 (term frequency saturation)
|
||||
- b: 0.75 (document length normalization)
|
||||
- Tokenization: Unicode word boundaries
|
||||
- Stemming: Porter stemmer (optional)
|
||||
|
||||
Search Accuracy Comparison:
|
||||
| Method | Precision@10 | Recall@100 | Latency |
|
||||
|--------|--------------|------------|---------|
|
||||
| BM25 only | 0.72 | 0.85 | <5ms |
|
||||
| Vector only | 0.78 | 0.92 | <10ms |
|
||||
| Hybrid (RRF) | 0.91 | 0.97 | <15ms |
|
||||
```
|
||||
|
||||
### 13. Adversarial Defense (AIDefence Integration)
|
||||
|
||||
#### Clawdbot
|
||||
- Basic input validation
|
||||
- No prompt injection protection
|
||||
- No jailbreak detection
|
||||
- Manual PII handling
|
||||
|
||||
#### RuvBot (SOTA)
|
||||
```
|
||||
AIDefence Multi-Layer Protection (ADR-014):
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Layer 1: Pattern Detection (<5ms) │
|
||||
│ └─ 50+ prompt injection signatures │
|
||||
│ └─ Jailbreak patterns (DAN, bypass, unlimited) │
|
||||
│ └─ Custom patterns (configurable) │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 2: PII Protection (<3ms) │
|
||||
│ └─ Email, phone, SSN, credit cards │
|
||||
│ └─ API keys and tokens │
|
||||
│ └─ IP addresses │
|
||||
│ └─ Automatic masking │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 3: Sanitization (<1ms) │
|
||||
│ └─ Control character removal │
|
||||
│ └─ Unicode homoglyph normalization │
|
||||
│ └─ Encoding attack prevention │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 4: Behavioral Analysis (<100ms) [Optional] │
|
||||
│ └─ User behavior baseline │
|
||||
│ └─ Anomaly detection │
|
||||
│ └─ Deviation scoring │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 5: Response Validation (<8ms) │
|
||||
│ └─ PII leak detection │
|
||||
│ └─ Injection echo detection │
|
||||
│ └─ Malicious code detection │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
Threat Detection Performance:
|
||||
| Threat Type | Clawdbot | RuvBot | Detection Time |
|
||||
|-------------|----------|--------|----------------|
|
||||
| Prompt Injection | ❌ | ✅ | <5ms |
|
||||
| Jailbreak | ❌ | ✅ | <5ms |
|
||||
| PII Exposure | ❌ | ✅ | <3ms |
|
||||
| Control Characters | ❌ | ✅ | <1ms |
|
||||
| Homoglyph Attacks | ❌ | ✅ | <1ms |
|
||||
| Behavioral Anomaly | ❌ | ✅ | <100ms |
|
||||
| Response Leakage | ❌ | ✅ | <8ms |
|
||||
|
||||
Usage Example:
|
||||
```typescript
|
||||
import { createAIDefenceGuard } from '@ruvector/ruvbot';
|
||||
|
||||
const guard = createAIDefenceGuard({
|
||||
detectPromptInjection: true,
|
||||
detectJailbreak: true,
|
||||
detectPII: true,
|
||||
blockThreshold: 'medium',
|
||||
});
|
||||
|
||||
const result = await guard.analyze(userInput);
|
||||
if (!result.safe) {
|
||||
// Block or use sanitized input
|
||||
const safeInput = result.sanitizedInput;
|
||||
}
|
||||
```
|
||||
```
|
||||
|
||||
## Conclusion
|
||||
|
||||
RuvBot represents a **security-first, next-generation evolution** of the personal AI assistant paradigm:
|
||||
|
||||
### Security: The Critical Difference
|
||||
|
||||
| Security Feature | Clawdbot | RuvBot | Verdict |
|
||||
|-----------------|----------|--------|---------|
|
||||
| **Prompt Injection** | VULNERABLE | Protected (<5ms) | ⚠️ **CRITICAL** |
|
||||
| **Jailbreak Defense** | VULNERABLE | Blocked | ⚠️ **CRITICAL** |
|
||||
| **PII Protection** | NONE | Auto-masked | ⚠️ **HIGH RISK** |
|
||||
| **Input Sanitization** | NONE | Full | ⚠️ **HIGH RISK** |
|
||||
| **Multi-tenant Isolation** | NONE | PostgreSQL RLS | ⚠️ **HIGH RISK** |
|
||||
|
||||
**Do not deploy Clawdbot in production without security hardening.**
|
||||
|
||||
### Complete Comparison
|
||||
|
||||
| Aspect | Clawdbot | RuvBot | Winner |
|
||||
|--------|----------|--------|--------|
|
||||
| **Security** | Vulnerable | 6-layer + AIDefence | 🏆 RuvBot |
|
||||
| **Adversarial Defense** | None | AIDefence (<10ms) | 🏆 RuvBot |
|
||||
| **Performance** | Baseline | 50-150x faster | 🏆 RuvBot |
|
||||
| **Intelligence** | Static | Self-learning SONA | 🏆 RuvBot |
|
||||
| **Scalability** | Single-user | Enterprise multi-tenant | 🏆 RuvBot |
|
||||
| **LLM Models** | Single | 12+ (Gemini 2.5, Claude, GPT) | 🏆 RuvBot |
|
||||
| **Plugin System** | Basic | IPFS + sandboxed | 🏆 RuvBot |
|
||||
| **Skills** | 52 | 68+ | 🏆 RuvBot |
|
||||
| **Workers** | Basic | 12 specialized | 🏆 RuvBot |
|
||||
| **Consensus** | None | 4 protocols | 🏆 RuvBot |
|
||||
| **Cloud Deploy** | Manual | GCP Terraform (~$15/mo) | 🏆 RuvBot |
|
||||
| **Hybrid Search** | Vector-only | BM25 + Vector RRF | 🏆 RuvBot |
|
||||
| **Cost** | API fees | $0 local WASM | 🏆 RuvBot |
|
||||
| **Portability** | Node.js | WASM everywhere | 🏆 RuvBot |
|
||||
|
||||
**RuvBot is definitively better than Clawdbot in every measurable dimension**, especially security and intelligence, while maintaining full compatibility with Clawdbot's skill and extension architecture.
|
||||
|
||||
### Migration Recommendation
|
||||
|
||||
If you are currently using Clawdbot, **migrate to RuvBot immediately** to address critical security vulnerabilities. RuvBot provides a seamless migration path with full skill compatibility.
|
||||
916
npm/packages/ruvbot/docs/IMPLEMENTATION_PLAN.yaml
Normal file
916
npm/packages/ruvbot/docs/IMPLEMENTATION_PLAN.yaml
Normal file
@@ -0,0 +1,916 @@
|
||||
# RuvBot Implementation Plan
|
||||
# High-performance AI assistant bot with WASM embeddings, vector memory, and multi-platform integration
|
||||
|
||||
plan:
|
||||
objective: "Build RuvBot npm package - a self-learning AI assistant with WASM embeddings, vector memory, and Slack/webhook integrations"
|
||||
|
||||
version: "0.1.0"
|
||||
estimated_duration: "6-8 weeks"
|
||||
|
||||
success_criteria:
|
||||
- "Package installable via npx @ruvector/ruvbot"
|
||||
- "CLI supports local and remote deployment modes"
|
||||
- "WASM embeddings working in Node.js and browser"
|
||||
- "Vector memory with HNSW search < 10ms"
|
||||
- "Slack integration with real-time message handling"
|
||||
- "Background workers processing async tasks"
|
||||
- "Extensible skill system with hot-reload"
|
||||
- "Session persistence across restarts"
|
||||
- "85%+ test coverage on core modules"
|
||||
|
||||
phases:
|
||||
# ============================================================================
|
||||
# PHASE 1: Core Foundation (Week 1-2)
|
||||
# ============================================================================
|
||||
- name: "Phase 1: Core Foundation"
|
||||
duration: "2 weeks"
|
||||
description: "Establish package structure and core domain entities"
|
||||
|
||||
tasks:
|
||||
- id: "p1-t1"
|
||||
description: "Initialize package with tsup, TypeScript, and ESM/CJS dual build"
|
||||
agent: "coder"
|
||||
dependencies: []
|
||||
estimated_time: "2h"
|
||||
priority: "critical"
|
||||
files:
|
||||
- "package.json"
|
||||
- "tsconfig.json"
|
||||
- "tsup.config.ts"
|
||||
- ".npmignore"
|
||||
|
||||
- id: "p1-t2"
|
||||
description: "Create core domain entities (Agent, Session, Message, Skill)"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t1"]
|
||||
estimated_time: "4h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/core/entities/Agent.ts"
|
||||
- "src/core/entities/Session.ts"
|
||||
- "src/core/entities/Message.ts"
|
||||
- "src/core/entities/Skill.ts"
|
||||
- "src/core/entities/index.ts"
|
||||
- "src/core/types.ts"
|
||||
|
||||
- id: "p1-t3"
|
||||
description: "Implement RuvBot main class with lifecycle management"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t2"]
|
||||
estimated_time: "4h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/RuvBot.ts"
|
||||
- "src/core/BotConfig.ts"
|
||||
- "src/core/BotState.ts"
|
||||
|
||||
- id: "p1-t4"
|
||||
description: "Create error types and result monads"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t1"]
|
||||
estimated_time: "2h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/core/errors.ts"
|
||||
- "src/core/Result.ts"
|
||||
|
||||
- id: "p1-t5"
|
||||
description: "Set up unit testing with vitest"
|
||||
agent: "tester"
|
||||
dependencies: ["p1-t3"]
|
||||
estimated_time: "3h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "vitest.config.ts"
|
||||
- "tests/unit/core/RuvBot.test.ts"
|
||||
- "tests/unit/core/entities/*.test.ts"
|
||||
|
||||
# ============================================================================
|
||||
# PHASE 2: Infrastructure Layer (Week 2-3)
|
||||
# ============================================================================
|
||||
- name: "Phase 2: Infrastructure Layer"
|
||||
duration: "1.5 weeks"
|
||||
description: "Database, messaging, and worker infrastructure"
|
||||
|
||||
tasks:
|
||||
- id: "p2-t1"
|
||||
description: "Implement SessionStore with SQLite and PostgreSQL adapters"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t2"]
|
||||
estimated_time: "6h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/infrastructure/storage/SessionStore.ts"
|
||||
- "src/infrastructure/storage/adapters/SQLiteAdapter.ts"
|
||||
- "src/infrastructure/storage/adapters/PostgresAdapter.ts"
|
||||
- "src/infrastructure/storage/adapters/BaseAdapter.ts"
|
||||
|
||||
- id: "p2-t2"
|
||||
description: "Create MessageQueue with in-memory and Redis backends"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t2"]
|
||||
estimated_time: "5h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/infrastructure/messaging/MessageQueue.ts"
|
||||
- "src/infrastructure/messaging/InMemoryQueue.ts"
|
||||
- "src/infrastructure/messaging/RedisQueue.ts"
|
||||
|
||||
- id: "p2-t3"
|
||||
description: "Implement WorkerPool using agentic-flow patterns"
|
||||
agent: "coder"
|
||||
dependencies: ["p2-t2"]
|
||||
estimated_time: "6h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/infrastructure/workers/WorkerPool.ts"
|
||||
- "src/infrastructure/workers/Worker.ts"
|
||||
- "src/infrastructure/workers/TaskScheduler.ts"
|
||||
- "src/infrastructure/workers/tasks/index.ts"
|
||||
|
||||
- id: "p2-t4"
|
||||
description: "Create EventBus for internal pub/sub communication"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t1"]
|
||||
estimated_time: "3h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/infrastructure/events/EventBus.ts"
|
||||
- "src/infrastructure/events/types.ts"
|
||||
|
||||
- id: "p2-t5"
|
||||
description: "Add connection pooling and health checks"
|
||||
agent: "coder"
|
||||
dependencies: ["p2-t1", "p2-t2"]
|
||||
estimated_time: "4h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/infrastructure/health/HealthChecker.ts"
|
||||
- "src/infrastructure/pool/ConnectionPool.ts"
|
||||
|
||||
# ============================================================================
|
||||
# PHASE 3: Learning Layer - WASM & ruvllm (Week 3-4)
|
||||
# ============================================================================
|
||||
- name: "Phase 3: Learning Layer"
|
||||
duration: "1.5 weeks"
|
||||
description: "WASM embeddings and ruvllm integration for self-learning"
|
||||
|
||||
tasks:
|
||||
- id: "p3-t1"
|
||||
description: "Create MemoryManager with HNSW vector search"
|
||||
agent: "coder"
|
||||
dependencies: ["p2-t1"]
|
||||
estimated_time: "8h"
|
||||
priority: "critical"
|
||||
files:
|
||||
- "src/learning/memory/MemoryManager.ts"
|
||||
- "src/learning/memory/VectorIndex.ts"
|
||||
- "src/learning/memory/types.ts"
|
||||
dependencies_pkg:
|
||||
- "@ruvector/wasm-unified"
|
||||
|
||||
- id: "p3-t2"
|
||||
description: "Integrate @ruvector/wasm-unified for WASM embeddings"
|
||||
agent: "coder"
|
||||
dependencies: ["p3-t1"]
|
||||
estimated_time: "6h"
|
||||
priority: "critical"
|
||||
files:
|
||||
- "src/learning/embeddings/WasmEmbedder.ts"
|
||||
- "src/learning/embeddings/EmbeddingCache.ts"
|
||||
- "src/learning/embeddings/index.ts"
|
||||
|
||||
- id: "p3-t3"
|
||||
description: "Integrate @ruvector/ruvllm for LLM orchestration"
|
||||
agent: "coder"
|
||||
dependencies: ["p3-t2"]
|
||||
estimated_time: "6h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/learning/llm/LLMOrchestrator.ts"
|
||||
- "src/learning/llm/ModelRouter.ts"
|
||||
- "src/learning/llm/SessionContext.ts"
|
||||
dependencies_pkg:
|
||||
- "@ruvector/ruvllm"
|
||||
|
||||
- id: "p3-t4"
|
||||
description: "Implement trajectory learning and pattern extraction"
|
||||
agent: "coder"
|
||||
dependencies: ["p3-t3"]
|
||||
estimated_time: "5h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/learning/trajectory/TrajectoryRecorder.ts"
|
||||
- "src/learning/trajectory/PatternExtractor.ts"
|
||||
- "src/learning/trajectory/types.ts"
|
||||
|
||||
- id: "p3-t5"
|
||||
description: "Add semantic search and retrieval pipeline"
|
||||
agent: "coder"
|
||||
dependencies: ["p3-t1", "p3-t2"]
|
||||
estimated_time: "4h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/learning/retrieval/SemanticSearch.ts"
|
||||
- "src/learning/retrieval/RetrievalPipeline.ts"
|
||||
|
||||
# ============================================================================
|
||||
# PHASE 4: Skill System (Week 4-5)
|
||||
# ============================================================================
|
||||
- name: "Phase 4: Skill System"
|
||||
duration: "1 week"
|
||||
description: "Extensible skill registry with hot-reload support"
|
||||
|
||||
tasks:
|
||||
- id: "p4-t1"
|
||||
description: "Create SkillRegistry with plugin architecture"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t2", "p3-t3"]
|
||||
estimated_time: "6h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/skills/SkillRegistry.ts"
|
||||
- "src/skills/SkillLoader.ts"
|
||||
- "src/skills/SkillContext.ts"
|
||||
- "src/skills/types.ts"
|
||||
|
||||
- id: "p4-t2"
|
||||
description: "Implement built-in skills (search, summarize, code)"
|
||||
agent: "coder"
|
||||
dependencies: ["p4-t1"]
|
||||
estimated_time: "8h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/skills/builtin/SearchSkill.ts"
|
||||
- "src/skills/builtin/SummarizeSkill.ts"
|
||||
- "src/skills/builtin/CodeSkill.ts"
|
||||
- "src/skills/builtin/MemorySkill.ts"
|
||||
- "src/skills/builtin/index.ts"
|
||||
|
||||
- id: "p4-t3"
|
||||
description: "Add skill hot-reload with file watching"
|
||||
agent: "coder"
|
||||
dependencies: ["p4-t1"]
|
||||
estimated_time: "4h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/skills/HotReloader.ts"
|
||||
- "src/skills/SkillValidator.ts"
|
||||
|
||||
- id: "p4-t4"
|
||||
description: "Create skill template generator"
|
||||
agent: "coder"
|
||||
dependencies: ["p4-t1"]
|
||||
estimated_time: "3h"
|
||||
priority: "low"
|
||||
files:
|
||||
- "src/skills/templates/skill-template.ts"
|
||||
- "src/skills/generator.ts"
|
||||
|
||||
# ============================================================================
|
||||
# PHASE 5: Integrations (Week 5-6)
|
||||
# ============================================================================
|
||||
- name: "Phase 5: Integrations"
|
||||
duration: "1.5 weeks"
|
||||
description: "Slack, webhooks, and external service integrations"
|
||||
|
||||
tasks:
|
||||
- id: "p5-t1"
|
||||
description: "Implement SlackAdapter with Socket Mode"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t3", "p4-t1"]
|
||||
estimated_time: "8h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/integrations/slack/SlackAdapter.ts"
|
||||
- "src/integrations/slack/SlackEventHandler.ts"
|
||||
- "src/integrations/slack/SlackMessageFormatter.ts"
|
||||
- "src/integrations/slack/types.ts"
|
||||
dependencies_pkg:
|
||||
- "@slack/bolt"
|
||||
- "@slack/web-api"
|
||||
|
||||
- id: "p5-t2"
|
||||
description: "Create WebhookServer for HTTP callbacks"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t3"]
|
||||
estimated_time: "5h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/integrations/webhooks/WebhookServer.ts"
|
||||
- "src/integrations/webhooks/WebhookValidator.ts"
|
||||
- "src/integrations/webhooks/routes.ts"
|
||||
|
||||
- id: "p5-t3"
|
||||
description: "Add Discord adapter"
|
||||
agent: "coder"
|
||||
dependencies: ["p5-t1"]
|
||||
estimated_time: "6h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/integrations/discord/DiscordAdapter.ts"
|
||||
- "src/integrations/discord/DiscordEventHandler.ts"
|
||||
dependencies_pkg:
|
||||
- "discord.js"
|
||||
|
||||
- id: "p5-t4"
|
||||
description: "Create generic ChatAdapter interface"
|
||||
agent: "coder"
|
||||
dependencies: ["p5-t1", "p5-t3"]
|
||||
estimated_time: "3h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/integrations/ChatAdapter.ts"
|
||||
- "src/integrations/AdapterFactory.ts"
|
||||
- "src/integrations/types.ts"
|
||||
|
||||
# ============================================================================
|
||||
# PHASE 6: API Layer (Week 6)
|
||||
# ============================================================================
|
||||
- name: "Phase 6: API Layer"
|
||||
duration: "1 week"
|
||||
description: "REST and GraphQL endpoints for external access"
|
||||
|
||||
tasks:
|
||||
- id: "p6-t1"
|
||||
description: "Create REST API server with Express/Fastify"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t3", "p4-t1"]
|
||||
estimated_time: "6h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/api/rest/server.ts"
|
||||
- "src/api/rest/routes/chat.ts"
|
||||
- "src/api/rest/routes/sessions.ts"
|
||||
- "src/api/rest/routes/skills.ts"
|
||||
- "src/api/rest/routes/health.ts"
|
||||
- "src/api/rest/middleware/auth.ts"
|
||||
- "src/api/rest/middleware/rateLimit.ts"
|
||||
dependencies_pkg:
|
||||
- "fastify"
|
||||
- "@fastify/cors"
|
||||
- "@fastify/rate-limit"
|
||||
|
||||
- id: "p6-t2"
|
||||
description: "Add GraphQL API with subscriptions"
|
||||
agent: "coder"
|
||||
dependencies: ["p6-t1"]
|
||||
estimated_time: "6h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/api/graphql/schema.ts"
|
||||
- "src/api/graphql/resolvers/chat.ts"
|
||||
- "src/api/graphql/resolvers/sessions.ts"
|
||||
- "src/api/graphql/subscriptions.ts"
|
||||
dependencies_pkg:
|
||||
- "mercurius"
|
||||
- "graphql"
|
||||
|
||||
- id: "p6-t3"
|
||||
description: "Implement OpenAPI spec generation"
|
||||
agent: "coder"
|
||||
dependencies: ["p6-t1"]
|
||||
estimated_time: "3h"
|
||||
priority: "low"
|
||||
files:
|
||||
- "src/api/openapi/generator.ts"
|
||||
- "src/api/openapi/decorators.ts"
|
||||
|
||||
# ============================================================================
|
||||
# PHASE 7: CLI & Distribution (Week 6-7)
|
||||
# ============================================================================
|
||||
- name: "Phase 7: CLI & Distribution"
|
||||
duration: "1 week"
|
||||
description: "CLI interface and npx distribution setup"
|
||||
|
||||
tasks:
|
||||
- id: "p7-t1"
|
||||
description: "Create CLI entry point with commander"
|
||||
agent: "coder"
|
||||
dependencies: ["p1-t3", "p5-t1", "p6-t1"]
|
||||
estimated_time: "6h"
|
||||
priority: "critical"
|
||||
files:
|
||||
- "bin/cli.js"
|
||||
- "src/cli/index.ts"
|
||||
- "src/cli/commands/start.ts"
|
||||
- "src/cli/commands/config.ts"
|
||||
- "src/cli/commands/skills.ts"
|
||||
- "src/cli/commands/status.ts"
|
||||
dependencies_pkg:
|
||||
- "commander"
|
||||
- "chalk"
|
||||
- "ora"
|
||||
- "inquirer"
|
||||
|
||||
- id: "p7-t2"
|
||||
description: "Add local vs remote deployment modes"
|
||||
agent: "coder"
|
||||
dependencies: ["p7-t1"]
|
||||
estimated_time: "4h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "src/cli/modes/local.ts"
|
||||
- "src/cli/modes/remote.ts"
|
||||
- "src/cli/modes/docker.ts"
|
||||
|
||||
- id: "p7-t3"
|
||||
description: "Create configuration wizard"
|
||||
agent: "coder"
|
||||
dependencies: ["p7-t1"]
|
||||
estimated_time: "4h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "src/cli/wizard/ConfigWizard.ts"
|
||||
- "src/cli/wizard/prompts.ts"
|
||||
|
||||
- id: "p7-t4"
|
||||
description: "Build install script for curl | bash deployment"
|
||||
agent: "coder"
|
||||
dependencies: ["p7-t1"]
|
||||
estimated_time: "3h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "scripts/install.sh"
|
||||
- "scripts/uninstall.sh"
|
||||
|
||||
- id: "p7-t5"
|
||||
description: "Create Docker configuration"
|
||||
agent: "coder"
|
||||
dependencies: ["p7-t2"]
|
||||
estimated_time: "3h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "Dockerfile"
|
||||
- "docker-compose.yml"
|
||||
- ".dockerignore"
|
||||
|
||||
# ============================================================================
|
||||
# PHASE 8: Testing & Documentation (Week 7-8)
|
||||
# ============================================================================
|
||||
- name: "Phase 8: Testing & Documentation"
|
||||
duration: "1 week"
|
||||
description: "Comprehensive testing and documentation"
|
||||
|
||||
tasks:
|
||||
- id: "p8-t1"
|
||||
description: "Integration tests for all modules"
|
||||
agent: "tester"
|
||||
dependencies: ["p7-t1"]
|
||||
estimated_time: "8h"
|
||||
priority: "high"
|
||||
files:
|
||||
- "tests/integration/bot.test.ts"
|
||||
- "tests/integration/memory.test.ts"
|
||||
- "tests/integration/skills.test.ts"
|
||||
- "tests/integration/slack.test.ts"
|
||||
- "tests/integration/api.test.ts"
|
||||
|
||||
- id: "p8-t2"
|
||||
description: "E2E tests with real services"
|
||||
agent: "tester"
|
||||
dependencies: ["p8-t1"]
|
||||
estimated_time: "6h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "tests/e2e/full-flow.test.ts"
|
||||
- "tests/e2e/slack-flow.test.ts"
|
||||
- "tests/fixtures/"
|
||||
|
||||
- id: "p8-t3"
|
||||
description: "Performance benchmarks"
|
||||
agent: "tester"
|
||||
dependencies: ["p8-t1"]
|
||||
estimated_time: "4h"
|
||||
priority: "medium"
|
||||
files:
|
||||
- "benchmarks/memory.bench.ts"
|
||||
- "benchmarks/embeddings.bench.ts"
|
||||
- "benchmarks/throughput.bench.ts"
|
||||
|
||||
# ============================================================================
|
||||
# CRITICAL PATH
|
||||
# ============================================================================
|
||||
critical_path:
|
||||
- "p1-t1" # Package init
|
||||
- "p1-t2" # Core entities
|
||||
- "p1-t3" # RuvBot class
|
||||
- "p3-t1" # MemoryManager
|
||||
- "p3-t2" # WASM embeddings
|
||||
- "p4-t1" # SkillRegistry
|
||||
- "p5-t1" # SlackAdapter
|
||||
- "p7-t1" # CLI
|
||||
|
||||
# ============================================================================
|
||||
# RISK ASSESSMENT
|
||||
# ============================================================================
|
||||
risks:
|
||||
- id: "risk-1"
|
||||
description: "WASM module compatibility issues across Node versions"
|
||||
likelihood: "medium"
|
||||
impact: "high"
|
||||
mitigation: "Test on Node 18, 20, 22. Provide pure JS fallback for critical paths"
|
||||
|
||||
- id: "risk-2"
|
||||
description: "Slack API rate limiting during high traffic"
|
||||
likelihood: "medium"
|
||||
impact: "medium"
|
||||
mitigation: "Implement exponential backoff and message batching"
|
||||
|
||||
- id: "risk-3"
|
||||
description: "Memory leaks in long-running bot instances"
|
||||
likelihood: "medium"
|
||||
impact: "high"
|
||||
mitigation: "Add memory monitoring, implement LRU caches, periodic cleanup"
|
||||
|
||||
- id: "risk-4"
|
||||
description: "Breaking changes in upstream @ruvector packages"
|
||||
likelihood: "low"
|
||||
impact: "high"
|
||||
mitigation: "Pin specific versions, maintain compatibility layer"
|
||||
|
||||
- id: "risk-5"
|
||||
description: "Vector index corruption on unexpected shutdown"
|
||||
likelihood: "medium"
|
||||
impact: "high"
|
||||
mitigation: "WAL logging, periodic snapshots, automatic recovery"
|
||||
|
||||
# ============================================================================
|
||||
# PACKAGE STRUCTURE
|
||||
# ============================================================================
|
||||
package_structure:
|
||||
root: "npm/packages/ruvbot"
|
||||
directories:
|
||||
- path: "src/core"
|
||||
purpose: "Domain entities and core types"
|
||||
files:
|
||||
- "entities/Agent.ts"
|
||||
- "entities/Session.ts"
|
||||
- "entities/Message.ts"
|
||||
- "entities/Skill.ts"
|
||||
- "types.ts"
|
||||
- "errors.ts"
|
||||
- "Result.ts"
|
||||
- "BotConfig.ts"
|
||||
- "BotState.ts"
|
||||
|
||||
- path: "src/infrastructure"
|
||||
purpose: "Database, messaging, and worker infrastructure"
|
||||
files:
|
||||
- "storage/SessionStore.ts"
|
||||
- "storage/adapters/SQLiteAdapter.ts"
|
||||
- "storage/adapters/PostgresAdapter.ts"
|
||||
- "messaging/MessageQueue.ts"
|
||||
- "messaging/InMemoryQueue.ts"
|
||||
- "messaging/RedisQueue.ts"
|
||||
- "workers/WorkerPool.ts"
|
||||
- "workers/Worker.ts"
|
||||
- "workers/TaskScheduler.ts"
|
||||
- "events/EventBus.ts"
|
||||
- "health/HealthChecker.ts"
|
||||
|
||||
- path: "src/learning"
|
||||
purpose: "WASM embeddings, vector memory, and ruvllm integration"
|
||||
files:
|
||||
- "memory/MemoryManager.ts"
|
||||
- "memory/VectorIndex.ts"
|
||||
- "embeddings/WasmEmbedder.ts"
|
||||
- "embeddings/EmbeddingCache.ts"
|
||||
- "llm/LLMOrchestrator.ts"
|
||||
- "llm/ModelRouter.ts"
|
||||
- "trajectory/TrajectoryRecorder.ts"
|
||||
- "trajectory/PatternExtractor.ts"
|
||||
- "retrieval/SemanticSearch.ts"
|
||||
|
||||
- path: "src/skills"
|
||||
purpose: "Extensible skill system"
|
||||
files:
|
||||
- "SkillRegistry.ts"
|
||||
- "SkillLoader.ts"
|
||||
- "SkillContext.ts"
|
||||
- "HotReloader.ts"
|
||||
- "builtin/SearchSkill.ts"
|
||||
- "builtin/SummarizeSkill.ts"
|
||||
- "builtin/CodeSkill.ts"
|
||||
- "builtin/MemorySkill.ts"
|
||||
|
||||
- path: "src/integrations"
|
||||
purpose: "Slack, Discord, and webhook integrations"
|
||||
files:
|
||||
- "ChatAdapter.ts"
|
||||
- "AdapterFactory.ts"
|
||||
- "slack/SlackAdapter.ts"
|
||||
- "slack/SlackEventHandler.ts"
|
||||
- "discord/DiscordAdapter.ts"
|
||||
- "webhooks/WebhookServer.ts"
|
||||
|
||||
- path: "src/api"
|
||||
purpose: "REST and GraphQL endpoints"
|
||||
files:
|
||||
- "rest/server.ts"
|
||||
- "rest/routes/chat.ts"
|
||||
- "rest/routes/sessions.ts"
|
||||
- "rest/routes/skills.ts"
|
||||
- "graphql/schema.ts"
|
||||
- "graphql/resolvers/*.ts"
|
||||
|
||||
- path: "src/cli"
|
||||
purpose: "CLI interface"
|
||||
files:
|
||||
- "index.ts"
|
||||
- "commands/start.ts"
|
||||
- "commands/config.ts"
|
||||
- "commands/skills.ts"
|
||||
- "modes/local.ts"
|
||||
- "modes/remote.ts"
|
||||
- "wizard/ConfigWizard.ts"
|
||||
|
||||
- path: "bin"
|
||||
purpose: "CLI entry point for npx"
|
||||
files:
|
||||
- "cli.js"
|
||||
|
||||
- path: "tests"
|
||||
purpose: "Test suites"
|
||||
files:
|
||||
- "unit/**/*.test.ts"
|
||||
- "integration/**/*.test.ts"
|
||||
- "e2e/**/*.test.ts"
|
||||
|
||||
- path: "scripts"
|
||||
purpose: "Installation and utility scripts"
|
||||
files:
|
||||
- "install.sh"
|
||||
- "uninstall.sh"
|
||||
|
||||
# ============================================================================
|
||||
# DEPENDENCIES
|
||||
# ============================================================================
|
||||
dependencies:
|
||||
production:
|
||||
core:
|
||||
- name: "@ruvector/wasm-unified"
|
||||
version: "^1.0.0"
|
||||
purpose: "WASM embeddings and attention mechanisms"
|
||||
|
||||
- name: "@ruvector/ruvllm"
|
||||
version: "^2.3.0"
|
||||
purpose: "LLM orchestration with SONA learning"
|
||||
|
||||
- name: "@ruvector/postgres-cli"
|
||||
version: "^0.2.6"
|
||||
purpose: "PostgreSQL vector storage"
|
||||
|
||||
infrastructure:
|
||||
- name: "better-sqlite3"
|
||||
version: "^9.0.0"
|
||||
purpose: "Local SQLite storage"
|
||||
|
||||
- name: "ioredis"
|
||||
version: "^5.3.0"
|
||||
purpose: "Redis message queue"
|
||||
|
||||
- name: "fastify"
|
||||
version: "^4.24.0"
|
||||
purpose: "REST API server"
|
||||
|
||||
integrations:
|
||||
- name: "@slack/bolt"
|
||||
version: "^3.16.0"
|
||||
purpose: "Slack bot framework"
|
||||
|
||||
- name: "discord.js"
|
||||
version: "^14.14.0"
|
||||
purpose: "Discord integration"
|
||||
optional: true
|
||||
|
||||
cli:
|
||||
- name: "commander"
|
||||
version: "^12.0.0"
|
||||
purpose: "CLI framework"
|
||||
|
||||
- name: "chalk"
|
||||
version: "^4.1.2"
|
||||
purpose: "Terminal styling"
|
||||
|
||||
- name: "ora"
|
||||
version: "^5.4.1"
|
||||
purpose: "Terminal spinners"
|
||||
|
||||
- name: "inquirer"
|
||||
version: "^9.2.0"
|
||||
purpose: "Interactive prompts"
|
||||
|
||||
development:
|
||||
- name: "typescript"
|
||||
version: "^5.3.0"
|
||||
|
||||
- name: "tsup"
|
||||
version: "^8.0.0"
|
||||
purpose: "Build tool"
|
||||
|
||||
- name: "vitest"
|
||||
version: "^1.1.0"
|
||||
purpose: "Testing framework"
|
||||
|
||||
- name: "@types/node"
|
||||
version: "^20.10.0"
|
||||
|
||||
# ============================================================================
|
||||
# NPX DISTRIBUTION
|
||||
# ============================================================================
|
||||
npx_distribution:
|
||||
package_name: "@ruvector/ruvbot"
|
||||
binary_name: "ruvbot"
|
||||
|
||||
commands:
|
||||
- command: "npx @ruvector/ruvbot init"
|
||||
description: "Initialize RuvBot in current directory"
|
||||
|
||||
- command: "npx @ruvector/ruvbot start"
|
||||
description: "Start bot in local mode"
|
||||
|
||||
- command: "npx @ruvector/ruvbot start --remote"
|
||||
description: "Start bot connected to remote services"
|
||||
|
||||
- command: "npx @ruvector/ruvbot config"
|
||||
description: "Interactive configuration wizard"
|
||||
|
||||
- command: "npx @ruvector/ruvbot skills list"
|
||||
description: "List available skills"
|
||||
|
||||
- command: "npx @ruvector/ruvbot skills add <name>"
|
||||
description: "Add a skill from registry"
|
||||
|
||||
- command: "npx @ruvector/ruvbot status"
|
||||
description: "Show bot status and health"
|
||||
|
||||
install_script:
|
||||
url: "https://get.ruvector.dev/ruvbot"
|
||||
method: "curl -fsSL https://get.ruvector.dev/ruvbot | bash"
|
||||
|
||||
environment_variables:
|
||||
required:
|
||||
- name: "SLACK_BOT_TOKEN"
|
||||
description: "Slack bot OAuth token"
|
||||
|
||||
- name: "SLACK_SIGNING_SECRET"
|
||||
description: "Slack app signing secret"
|
||||
|
||||
optional:
|
||||
- name: "RUVBOT_PORT"
|
||||
description: "HTTP server port"
|
||||
default: "3000"
|
||||
|
||||
- name: "RUVBOT_LOG_LEVEL"
|
||||
description: "Logging verbosity"
|
||||
default: "info"
|
||||
|
||||
- name: "RUVBOT_STORAGE"
|
||||
description: "Storage backend (sqlite|postgres|memory)"
|
||||
default: "sqlite"
|
||||
|
||||
- name: "RUVBOT_MEMORY_PATH"
|
||||
description: "Path for vector memory storage"
|
||||
default: "./data/memory"
|
||||
|
||||
- name: "DATABASE_URL"
|
||||
description: "PostgreSQL connection string"
|
||||
|
||||
- name: "REDIS_URL"
|
||||
description: "Redis connection string"
|
||||
|
||||
- name: "ANTHROPIC_API_KEY"
|
||||
description: "Anthropic API key for Claude"
|
||||
|
||||
- name: "OPENAI_API_KEY"
|
||||
description: "OpenAI API key"
|
||||
|
||||
# ============================================================================
|
||||
# CONFIGURATION FILES
|
||||
# ============================================================================
|
||||
config_files:
|
||||
- name: "ruvbot.config.json"
|
||||
purpose: "Main configuration file"
|
||||
example: |
|
||||
{
|
||||
"name": "my-ruvbot",
|
||||
"port": 3000,
|
||||
"storage": {
|
||||
"type": "sqlite",
|
||||
"path": "./data/ruvbot.db"
|
||||
},
|
||||
"memory": {
|
||||
"dimensions": 384,
|
||||
"maxVectors": 100000,
|
||||
"indexType": "hnsw"
|
||||
},
|
||||
"skills": {
|
||||
"enabled": ["search", "summarize", "code", "memory"],
|
||||
"custom": ["./skills/*.js"]
|
||||
},
|
||||
"integrations": {
|
||||
"slack": {
|
||||
"enabled": true,
|
||||
"socketMode": true
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
- name: ".env"
|
||||
purpose: "Environment variables"
|
||||
example: |
|
||||
SLACK_BOT_TOKEN=xoxb-xxx
|
||||
SLACK_SIGNING_SECRET=xxx
|
||||
SLACK_APP_TOKEN=xapp-xxx
|
||||
ANTHROPIC_API_KEY=sk-ant-xxx
|
||||
|
||||
# ============================================================================
|
||||
# MILESTONES
|
||||
# ============================================================================
|
||||
milestones:
|
||||
- name: "M1: Core Bot"
|
||||
date: "Week 2"
|
||||
deliverables:
|
||||
- "RuvBot class with lifecycle management"
|
||||
- "Core entities (Agent, Session, Message)"
|
||||
- "Basic unit tests"
|
||||
|
||||
- name: "M2: Infrastructure"
|
||||
date: "Week 3"
|
||||
deliverables:
|
||||
- "Session persistence"
|
||||
- "Message queue"
|
||||
- "Worker pool"
|
||||
|
||||
- name: "M3: Learning"
|
||||
date: "Week 4"
|
||||
deliverables:
|
||||
- "WASM embeddings working"
|
||||
- "Vector memory with HNSW"
|
||||
- "Semantic search"
|
||||
|
||||
- name: "M4: Skills & Integrations"
|
||||
date: "Week 5"
|
||||
deliverables:
|
||||
- "Skill registry with built-in skills"
|
||||
- "Slack integration working"
|
||||
|
||||
- name: "M5: API & CLI"
|
||||
date: "Week 6"
|
||||
deliverables:
|
||||
- "REST API"
|
||||
- "CLI with npx support"
|
||||
|
||||
- name: "M6: Production Ready"
|
||||
date: "Week 8"
|
||||
deliverables:
|
||||
- "85%+ test coverage"
|
||||
- "Performance benchmarks passing"
|
||||
- "Published to npm"
|
||||
|
||||
# ============================================================================
|
||||
# TEAM ALLOCATION
|
||||
# ============================================================================
|
||||
team_allocation:
|
||||
agents:
|
||||
- role: "architect"
|
||||
tasks: ["p1-t2", "p3-t1", "p4-t1"]
|
||||
focus: "System design and core architecture"
|
||||
|
||||
- role: "coder"
|
||||
tasks: ["p1-t1", "p1-t3", "p2-*", "p3-*", "p5-*", "p6-*", "p7-*"]
|
||||
focus: "Implementation"
|
||||
|
||||
- role: "tester"
|
||||
tasks: ["p1-t5", "p8-*"]
|
||||
focus: "Testing and quality assurance"
|
||||
|
||||
- role: "reviewer"
|
||||
tasks: ["all"]
|
||||
focus: "Code review and security"
|
||||
|
||||
# ============================================================================
|
||||
# QUALITY GATES
|
||||
# ============================================================================
|
||||
quality_gates:
|
||||
- name: "Unit Test Coverage"
|
||||
threshold: ">= 80%"
|
||||
tool: "vitest"
|
||||
|
||||
- name: "Type Coverage"
|
||||
threshold: ">= 95%"
|
||||
tool: "typescript --noEmit"
|
||||
|
||||
- name: "No High Severity Vulnerabilities"
|
||||
threshold: "0 high/critical"
|
||||
tool: "npm audit"
|
||||
|
||||
- name: "Performance Benchmarks"
|
||||
thresholds:
|
||||
- metric: "embedding_latency"
|
||||
value: "< 50ms"
|
||||
- metric: "vector_search_latency"
|
||||
value: "< 10ms"
|
||||
- metric: "message_throughput"
|
||||
value: "> 1000 msg/s"
|
||||
172
npm/packages/ruvbot/docs/adr/ADR-001-architecture-overview.md
Normal file
172
npm/packages/ruvbot/docs/adr/ADR-001-architecture-overview.md
Normal file
@@ -0,0 +1,172 @@
|
||||
# ADR-001: RuvBot Architecture Overview
|
||||
|
||||
## Status
|
||||
Accepted
|
||||
|
||||
## Date
|
||||
2026-01-27
|
||||
|
||||
## Context
|
||||
|
||||
We need to build **RuvBot**, a Clawdbot-style personal AI assistant with a RuVector backend. The system must:
|
||||
|
||||
1. Provide a self-hosted, extensible AI assistant framework
|
||||
2. Integrate with RuVector's WASM-based vector operations for SOTA learning
|
||||
3. Support multi-tenancy for enterprise deployments
|
||||
4. Enable long-running tasks via background workers
|
||||
5. Integrate with messaging platforms (Slack, Discord, webhooks)
|
||||
6. Distribute as an `npx` package with local/remote deployment options
|
||||
|
||||
## Decision
|
||||
|
||||
### High-Level Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────┐
|
||||
│ RuvBot System │
|
||||
├─────────────────────────────────────────────────────────────────────┤
|
||||
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐│
|
||||
│ │ REST API │ │ GraphQL │ │ Slack │ │ Webhooks ││
|
||||
│ │ Endpoints │ │ Gateway │ │ Adapter │ │ Handler ││
|
||||
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘│
|
||||
│ │ │ │ │ │
|
||||
│ ┌──────┴────────────────┴────────────────┴────────────────┴──────┐│
|
||||
│ │ Message Router ││
|
||||
│ └─────────────────────────────┬───────────────────────────────────┘│
|
||||
│ │ │
|
||||
│ ┌─────────────────────────────┴───────────────────────────────────┐│
|
||||
│ │ Core Application Layer ││
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
|
||||
│ │ │ AgentManager │ │SessionStore │ │ SkillRegistry│ ││
|
||||
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
|
||||
│ │ │MemoryManager │ │WorkerPool │ │ EventBus │ ││
|
||||
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
|
||||
│ └─────────────────────────────────────────────────────────────────┘│
|
||||
│ │ │
|
||||
│ ┌─────────────────────────────┴───────────────────────────────────┐│
|
||||
│ │ Infrastructure Layer ││
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
|
||||
│ │ │ RuVector │ │ PostgreSQL │ │ RuvLLM │ ││
|
||||
│ │ │ WASM Engine │ │ + pgvector │ │ Inference │ ││
|
||||
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
|
||||
│ │ │ agentic-flow │ │ SONA Learning│ │ HNSW Index │ ││
|
||||
│ │ │ Workers │ │ System │ │ Memory │ ││
|
||||
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
|
||||
│ └─────────────────────────────────────────────────────────────────┘│
|
||||
└─────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### DDD Bounded Contexts
|
||||
|
||||
#### 1. Core Context
|
||||
- **Agent**: The AI agent entity with identity, capabilities, and state
|
||||
- **Session**: Conversation context with message history and metadata
|
||||
- **Memory**: Vector-based memory with HNSW indexing
|
||||
- **Skill**: Extensible capabilities (tools, commands, integrations)
|
||||
|
||||
#### 2. Infrastructure Context
|
||||
- **Persistence**: PostgreSQL with RuVector extensions, pgvector
|
||||
- **Messaging**: Event-driven message bus (Redis/in-memory)
|
||||
- **Workers**: Background task processing via agentic-flow
|
||||
|
||||
#### 3. Integration Context
|
||||
- **Slack**: Slack Bot API adapter
|
||||
- **Webhooks**: Generic webhook handler
|
||||
- **Providers**: LLM provider abstraction (Anthropic, OpenAI, etc.)
|
||||
|
||||
#### 4. Learning Context
|
||||
- **Embeddings**: RuVector WASM vector operations
|
||||
- **Training**: Trajectory learning, LoRA fine-tuning
|
||||
- **Patterns**: Neural pattern storage and retrieval
|
||||
|
||||
### Technology Stack
|
||||
|
||||
| Layer | Technology | Purpose |
|
||||
|-------|------------|---------|
|
||||
| Runtime | Node.js 18+ | Primary runtime |
|
||||
| Language | TypeScript (ESM) | Type-safe development |
|
||||
| Vector Engine | @ruvector/wasm-unified | SIMD-optimized vectors |
|
||||
| LLM Layer | @ruvector/ruvllm | SONA, LoRA, inference |
|
||||
| Database | PostgreSQL + pgvector | Persistence + vectors |
|
||||
| Workers | agentic-flow | Background processing |
|
||||
| Testing | Vitest | Unit/Integration/E2E |
|
||||
| CLI | Commander.js | npx distribution |
|
||||
|
||||
### Package Structure
|
||||
|
||||
```
|
||||
npm/packages/ruvbot/
|
||||
├── bin/ # CLI entry points
|
||||
│ └── ruvbot.ts # npx ruvbot entry
|
||||
├── src/
|
||||
│ ├── core/ # Domain layer
|
||||
│ │ ├── entities/ # Agent, Session, Memory, Skill
|
||||
│ │ ├── services/ # AgentManager, SessionStore, etc.
|
||||
│ │ └── events/ # Domain events
|
||||
│ ├── infrastructure/ # Infrastructure layer
|
||||
│ │ ├── persistence/ # PostgreSQL, SQLite adapters
|
||||
│ │ ├── messaging/ # Event bus, message queue
|
||||
│ │ └── workers/ # agentic-flow integration
|
||||
│ ├── integrations/ # External integrations
|
||||
│ │ ├── slack/ # Slack adapter
|
||||
│ │ ├── webhooks/ # Webhook handlers
|
||||
│ │ └── providers/ # LLM providers
|
||||
│ ├── learning/ # Learning system
|
||||
│ │ ├── embeddings/ # WASM vector ops
|
||||
│ │ ├── training/ # LoRA, SONA
|
||||
│ │ └── patterns/ # Pattern storage
|
||||
│ └── api/ # API layer
|
||||
│ ├── rest/ # REST endpoints
|
||||
│ └── graphql/ # GraphQL schema
|
||||
├── tests/
|
||||
│ ├── unit/
|
||||
│ ├── integration/
|
||||
│ └── e2e/
|
||||
├── docs/
|
||||
│ └── adr/ # Architecture Decision Records
|
||||
└── scripts/ # Build/deploy scripts
|
||||
```
|
||||
|
||||
### Multi-Tenancy Strategy
|
||||
|
||||
1. **Database Level**: Row-Level Security (RLS) with tenant_id
|
||||
2. **Application Level**: Tenant context middleware
|
||||
3. **Memory Level**: Namespace isolation in vector storage
|
||||
4. **Worker Level**: Tenant-scoped job queues
|
||||
|
||||
### Key Design Principles
|
||||
|
||||
1. **Self-Learning**: Every interaction improves the system via SONA
|
||||
2. **WASM-First**: Use RuVector WASM for portable, fast vector ops
|
||||
3. **Event-Driven**: Loose coupling via event bus
|
||||
4. **Extensible**: Plugin architecture for skills and integrations
|
||||
5. **Observable**: Built-in metrics and tracing
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Modular architecture enables independent scaling
|
||||
- WASM integration provides consistent cross-platform performance
|
||||
- Multi-tenancy from day one avoids later refactoring
|
||||
- Self-learning improves over time with usage
|
||||
|
||||
### Negative
|
||||
- Initial complexity is higher than monolithic approach
|
||||
- WASM has some interop overhead
|
||||
- Multi-tenancy adds complexity to all data operations
|
||||
|
||||
### Risks
|
||||
- WASM performance in Node.js may vary by platform
|
||||
- PostgreSQL dependency limits serverless options
|
||||
- Background workers need careful monitoring
|
||||
|
||||
## Related ADRs
|
||||
- ADR-002: Multi-tenancy Design
|
||||
- ADR-003: Persistence Layer
|
||||
- ADR-004: Background Workers
|
||||
- ADR-005: Integration Layer
|
||||
- ADR-006: WASM Integration
|
||||
- ADR-007: Learning System
|
||||
- ADR-008: Security Architecture
|
||||
873
npm/packages/ruvbot/docs/adr/ADR-002-multi-tenancy-design.md
Normal file
873
npm/packages/ruvbot/docs/adr/ADR-002-multi-tenancy-design.md
Normal file
@@ -0,0 +1,873 @@
|
||||
# ADR-002: Multi-tenancy Design
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-01-27
|
||||
**Decision Makers:** RuVector Architecture Team
|
||||
**Technical Area:** Security, Data Architecture
|
||||
|
||||
---
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
RuvBot must serve multiple organizations (tenants) and users within each organization while maintaining strict data isolation. A breach of tenant boundaries would:
|
||||
|
||||
1. Violate privacy and compliance requirements (GDPR, SOC2, HIPAA)
|
||||
2. Expose sensitive business information
|
||||
3. Destroy trust in the platform
|
||||
4. Create legal liability
|
||||
|
||||
The multi-tenancy design must address:
|
||||
|
||||
- **Data Isolation**: No cross-tenant data access
|
||||
- **Authentication**: Identity verification at multiple levels
|
||||
- **Authorization**: Fine-grained permission control
|
||||
- **Resource Limits**: Fair usage and cost allocation
|
||||
- **Audit Trails**: Complete visibility into access patterns
|
||||
|
||||
---
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
### Security Requirements
|
||||
|
||||
| Requirement | Criticality | Description |
|
||||
|-------------|-------------|-------------|
|
||||
| Zero cross-tenant leakage | Critical | No tenant can access another tenant's data |
|
||||
| Row-level security | Critical | Database enforces isolation, not just application |
|
||||
| Token-based auth | High | Stateless, revocable authentication |
|
||||
| RBAC + ABAC | High | Role and attribute-based access control |
|
||||
| Audit logging | High | All data access logged with tenant context |
|
||||
|
||||
### Operational Requirements
|
||||
|
||||
| Requirement | Target | Description |
|
||||
|-------------|--------|-------------|
|
||||
| Tenant provisioning | < 30s | New tenant setup time |
|
||||
| User provisioning | < 5s | New user creation time |
|
||||
| Quota enforcement | Real-time | Immediate limit enforcement |
|
||||
| Data export | < 1h for 1GB | GDPR data portability |
|
||||
| Data deletion | < 24h | GDPR right to erasure |
|
||||
|
||||
---
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
### Adopt Hierarchical Multi-tenancy with RLS and JWT Claims
|
||||
|
||||
We implement a three-level hierarchy with PostgreSQL Row-Level Security (RLS) as the primary isolation mechanism.
|
||||
|
||||
```
|
||||
+---------------------------+
|
||||
| ORGANIZATION | Billing entity, security boundary
|
||||
|---------------------------|
|
||||
| id: UUID |
|
||||
| name: string |
|
||||
| plan: Plan |
|
||||
| settings: OrgSettings |
|
||||
| quotas: ResourceQuotas |
|
||||
+-------------+-------------+
|
||||
|
|
||||
| 1:N
|
||||
v
|
||||
+---------------------------+
|
||||
| WORKSPACE | Project/team boundary
|
||||
|---------------------------|
|
||||
| id: UUID |
|
||||
| orgId: UUID (FK) |
|
||||
| name: string |
|
||||
| settings: WorkspaceSettings|
|
||||
+-------------+-------------+
|
||||
|
|
||||
| 1:N
|
||||
v
|
||||
+---------------------------+
|
||||
| USER | Individual identity
|
||||
|---------------------------|
|
||||
| id: UUID |
|
||||
| workspaceId: UUID (FK) |
|
||||
| email: string |
|
||||
| roles: Role[] |
|
||||
| preferences: Preferences |
|
||||
+---------------------------+
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Tenant Isolation Layers
|
||||
|
||||
### Layer 1: Network Isolation
|
||||
|
||||
```
|
||||
Internet
|
||||
|
|
||||
v
|
||||
+---+---+
|
||||
| WAF | Rate limiting, DDoS protection
|
||||
+---+---+
|
||||
|
|
||||
v
|
||||
+---+---+
|
||||
| LB/TLS| TLS termination, tenant routing
|
||||
+---+---+
|
||||
|
|
||||
+--------+--------+--------+
|
||||
| | | |
|
||||
+---v---+ +---v---+ +---v---+ +---v---+
|
||||
| Org A | | Org B | | Org C | | Org D | Virtual host routing
|
||||
+-------+ +-------+ +-------+ +-------+
|
||||
```
|
||||
|
||||
### Layer 2: Authentication & Authorization
|
||||
|
||||
```typescript
|
||||
// JWT token structure with tenant claims
|
||||
interface RuvBotToken {
|
||||
// Standard claims
|
||||
sub: string; // User ID
|
||||
iat: number; // Issued at
|
||||
exp: number; // Expiration
|
||||
|
||||
// Tenant claims (always present)
|
||||
org_id: string; // Organization ID
|
||||
workspace_id: string; // Workspace ID
|
||||
|
||||
// Permission claims
|
||||
roles: Role[]; // User roles
|
||||
permissions: string[];// Explicit permissions
|
||||
|
||||
// Resource claims
|
||||
quotas: {
|
||||
sessions: number;
|
||||
messages_per_day: number;
|
||||
memory_mb: number;
|
||||
};
|
||||
}
|
||||
|
||||
// Role hierarchy
|
||||
enum Role {
|
||||
ORG_OWNER = 'org:owner',
|
||||
ORG_ADMIN = 'org:admin',
|
||||
WORKSPACE_ADMIN = 'workspace:admin',
|
||||
MEMBER = 'member',
|
||||
VIEWER = 'viewer',
|
||||
API_KEY = 'api_key',
|
||||
}
|
||||
|
||||
// Permission matrix
|
||||
const PERMISSIONS: Record<Role, string[]> = {
|
||||
'org:owner': ['*'],
|
||||
'org:admin': ['org:read', 'org:write', 'workspace:*', 'user:*', 'billing:read'],
|
||||
'workspace:admin': ['workspace:read', 'workspace:write', 'user:read', 'user:invite'],
|
||||
'member': ['session:*', 'memory:read', 'memory:write', 'skill:execute'],
|
||||
'viewer': ['session:read', 'memory:read'],
|
||||
'api_key': ['session:create', 'session:read'],
|
||||
};
|
||||
```
|
||||
|
||||
### Layer 3: Database Row-Level Security
|
||||
|
||||
```sql
|
||||
-- Enable RLS on all tenant-scoped tables
|
||||
ALTER TABLE conversations ENABLE ROW LEVEL SECURITY;
|
||||
ALTER TABLE memories ENABLE ROW LEVEL SECURITY;
|
||||
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
|
||||
ALTER TABLE skills ENABLE ROW LEVEL SECURITY;
|
||||
ALTER TABLE trajectories ENABLE ROW LEVEL SECURITY;
|
||||
|
||||
-- Create tenant context function
|
||||
CREATE OR REPLACE FUNCTION current_tenant_id()
|
||||
RETURNS UUID AS $$
|
||||
BEGIN
|
||||
RETURN current_setting('app.current_org_id', true)::UUID;
|
||||
END;
|
||||
$$ LANGUAGE plpgsql SECURITY DEFINER;
|
||||
|
||||
CREATE OR REPLACE FUNCTION current_workspace_id()
|
||||
RETURNS UUID AS $$
|
||||
BEGIN
|
||||
RETURN current_setting('app.current_workspace_id', true)::UUID;
|
||||
END;
|
||||
$$ LANGUAGE plpgsql SECURITY DEFINER;
|
||||
|
||||
-- RLS policies for conversations
|
||||
CREATE POLICY conversations_isolation ON conversations
|
||||
FOR ALL
|
||||
USING (org_id = current_tenant_id())
|
||||
WITH CHECK (org_id = current_tenant_id());
|
||||
|
||||
-- RLS policies for memories (workspace-level)
|
||||
CREATE POLICY memories_isolation ON memories
|
||||
FOR ALL
|
||||
USING (
|
||||
org_id = current_tenant_id()
|
||||
AND workspace_id = current_workspace_id()
|
||||
);
|
||||
|
||||
-- Read-only policy for cross-workspace memory sharing
|
||||
CREATE POLICY memories_shared_read ON memories
|
||||
FOR SELECT
|
||||
USING (
|
||||
org_id = current_tenant_id()
|
||||
AND is_shared = true
|
||||
);
|
||||
```
|
||||
|
||||
### Layer 4: Vector Store Isolation
|
||||
|
||||
```typescript
|
||||
// Namespace isolation in RuVector
|
||||
interface VectorNamespace {
|
||||
// Namespace format: {org_id}/{workspace_id}/{collection}
|
||||
// Example: "550e8400-e29b/.../episodic"
|
||||
|
||||
encode(orgId: string, workspaceId: string, collection: string): string;
|
||||
decode(namespace: string): { orgId: string; workspaceId: string; collection: string };
|
||||
validate(namespace: string, token: RuvBotToken): boolean;
|
||||
}
|
||||
|
||||
// Vector store with tenant isolation
|
||||
class TenantIsolatedVectorStore {
|
||||
constructor(
|
||||
private store: RuVectorAdapter,
|
||||
private tenantContext: TenantContext
|
||||
) {}
|
||||
|
||||
async search(query: Float32Array, options: SearchOptions): Promise<SearchResult[]> {
|
||||
const namespace = this.getNamespace(options.collection);
|
||||
|
||||
// Validate namespace matches token claims
|
||||
if (!this.validateNamespace(namespace)) {
|
||||
throw new TenantIsolationError('Namespace mismatch');
|
||||
}
|
||||
|
||||
return this.store.search(query, { ...options, namespace });
|
||||
}
|
||||
|
||||
private getNamespace(collection: string): string {
|
||||
return `${this.tenantContext.orgId}/${this.tenantContext.workspaceId}/${collection}`;
|
||||
}
|
||||
|
||||
private validateNamespace(namespace: string): boolean {
|
||||
const { orgId, workspaceId } = VectorNamespace.decode(namespace);
|
||||
return (
|
||||
orgId === this.tenantContext.orgId &&
|
||||
workspaceId === this.tenantContext.workspaceId
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Data Partitioning Strategy
|
||||
|
||||
### PostgreSQL Partitioning
|
||||
|
||||
```sql
|
||||
-- Partition conversations by org_id for isolation and performance
|
||||
CREATE TABLE conversations (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL,
|
||||
workspace_id UUID NOT NULL,
|
||||
session_id UUID NOT NULL,
|
||||
user_id UUID NOT NULL,
|
||||
content TEXT NOT NULL,
|
||||
role VARCHAR(20) NOT NULL,
|
||||
embedding_id UUID,
|
||||
metadata JSONB,
|
||||
created_at TIMESTAMPTZ DEFAULT NOW()
|
||||
) PARTITION BY LIST (org_id);
|
||||
|
||||
-- Create partition per organization
|
||||
CREATE OR REPLACE FUNCTION create_org_partition(org_id UUID)
|
||||
RETURNS void AS $$
|
||||
DECLARE
|
||||
partition_name TEXT;
|
||||
BEGIN
|
||||
partition_name := 'conversations_' || replace(org_id::text, '-', '_');
|
||||
EXECUTE format(
|
||||
'CREATE TABLE IF NOT EXISTS %I PARTITION OF conversations FOR VALUES IN (%L)',
|
||||
partition_name,
|
||||
org_id
|
||||
);
|
||||
END;
|
||||
$$ LANGUAGE plpgsql;
|
||||
|
||||
-- Indexes per partition
|
||||
CREATE INDEX CONCURRENTLY conversations_session_idx
|
||||
ON conversations (session_id, created_at DESC);
|
||||
CREATE INDEX CONCURRENTLY conversations_user_idx
|
||||
ON conversations (user_id, created_at DESC);
|
||||
CREATE INDEX CONCURRENTLY conversations_embedding_idx
|
||||
ON conversations (embedding_id) WHERE embedding_id IS NOT NULL;
|
||||
```
|
||||
|
||||
### Vector Store Partitioning
|
||||
|
||||
```typescript
|
||||
// HNSW index per tenant for isolation and independent scaling
|
||||
interface TenantVectorIndex {
|
||||
orgId: string;
|
||||
workspaceId: string;
|
||||
collection: 'episodic' | 'semantic' | 'skills';
|
||||
|
||||
// Index configuration (can vary per tenant plan)
|
||||
config: {
|
||||
dimensions: number; // 384 for MiniLM, 1536 for larger models
|
||||
m: number; // HNSW connections (16-32)
|
||||
efConstruction: number; // Build quality (100-200)
|
||||
efSearch: number; // Query quality (50-100)
|
||||
};
|
||||
|
||||
// Usage metrics
|
||||
metrics: {
|
||||
vectorCount: number;
|
||||
memoryUsageMB: number;
|
||||
avgSearchLatencyMs: number;
|
||||
lastOptimized: Date;
|
||||
};
|
||||
}
|
||||
|
||||
// Index lifecycle management
|
||||
class TenantIndexManager {
|
||||
async provisionTenant(orgId: string): Promise<void> {
|
||||
// Create default workspaces indices
|
||||
await this.createIndex(orgId, 'default', 'episodic');
|
||||
await this.createIndex(orgId, 'default', 'semantic');
|
||||
await this.createIndex(orgId, 'default', 'skills');
|
||||
}
|
||||
|
||||
async deleteTenant(orgId: string): Promise<void> {
|
||||
// Delete all indices for org (GDPR deletion)
|
||||
const indices = await this.listIndices(orgId);
|
||||
await Promise.all(indices.map(idx => this.deleteIndex(idx.id)));
|
||||
|
||||
// Log deletion for audit
|
||||
await this.auditLog.record({
|
||||
action: 'tenant_deletion',
|
||||
orgId,
|
||||
indexCount: indices.length,
|
||||
timestamp: new Date(),
|
||||
});
|
||||
}
|
||||
|
||||
async optimizeIndex(indexId: string): Promise<OptimizationResult> {
|
||||
// Background optimization with tenant resource limits
|
||||
const index = await this.getIndex(indexId);
|
||||
const quota = await this.getQuota(index.orgId);
|
||||
|
||||
if (index.metrics.memoryUsageMB > quota.maxVectorMemoryMB) {
|
||||
// Apply quantization to reduce memory
|
||||
return this.compressIndex(indexId, 'product_quantization');
|
||||
}
|
||||
|
||||
return this.rebalanceIndex(indexId);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Authentication Flows
|
||||
|
||||
### OAuth2/OIDC Flow
|
||||
|
||||
```
|
||||
+--------+ +--------+
|
||||
| User | | IdP |
|
||||
+---+----+ +---+----+
|
||||
| |
|
||||
| 1. Login request |
|
||||
+--------------------------------------->|
|
||||
| |
|
||||
| 2. Redirect to IdP |
|
||||
|<---------------------------------------+
|
||||
| |
|
||||
| 3. Authenticate + consent |
|
||||
+--------------------------------------->|
|
||||
| |
|
||||
| 4. Auth code redirect |
|
||||
|<---------------------------------------+
|
||||
| |
|
||||
| +--------+ |
|
||||
| 5. Auth code | RuvBot | |
|
||||
+------------------>| Auth | |
|
||||
| +---+----+ |
|
||||
| | |
|
||||
| 6. Exchange code | |
|
||||
| +--------------->|
|
||||
| | |
|
||||
| 7. ID + Access token | |
|
||||
| |<---------------+
|
||||
| | |
|
||||
| 8. Create session, |
|
||||
| issue RuvBot JWT |
|
||||
|<----------------------+
|
||||
| |
|
||||
| 9. Authenticated |
|
||||
+<----------------------+
|
||||
```
|
||||
|
||||
### API Key Authentication
|
||||
|
||||
```typescript
|
||||
// API key structure
|
||||
interface APIKey {
|
||||
id: string;
|
||||
keyHash: string; // SHA-256 hash of actual key
|
||||
prefix: string; // First 8 chars for identification
|
||||
orgId: string;
|
||||
workspaceId: string;
|
||||
name: string;
|
||||
permissions: string[];
|
||||
rateLimit: RateLimitConfig;
|
||||
expiresAt: Date | null;
|
||||
lastUsedAt: Date | null;
|
||||
createdBy: string;
|
||||
createdAt: Date;
|
||||
}
|
||||
|
||||
// API key validation middleware
|
||||
async function validateAPIKey(req: Request): Promise<TenantContext> {
|
||||
const authHeader = req.headers.authorization;
|
||||
if (!authHeader?.startsWith('Bearer ')) {
|
||||
throw new AuthenticationError('Missing authorization header');
|
||||
}
|
||||
|
||||
const key = authHeader.slice(7);
|
||||
const prefix = key.slice(0, 8);
|
||||
const keyHash = crypto.createHash('sha256').update(key).digest('hex');
|
||||
|
||||
// Lookup by prefix, then verify hash (timing-safe)
|
||||
const apiKey = await db.apiKeys.findByPrefix(prefix);
|
||||
if (!apiKey || !crypto.timingSafeEqual(
|
||||
Buffer.from(apiKey.keyHash),
|
||||
Buffer.from(keyHash)
|
||||
)) {
|
||||
throw new AuthenticationError('Invalid API key');
|
||||
}
|
||||
|
||||
// Check expiration
|
||||
if (apiKey.expiresAt && apiKey.expiresAt < new Date()) {
|
||||
throw new AuthenticationError('API key expired');
|
||||
}
|
||||
|
||||
// Update last used (async, don't block)
|
||||
db.apiKeys.updateLastUsed(apiKey.id).catch(console.error);
|
||||
|
||||
return {
|
||||
orgId: apiKey.orgId,
|
||||
workspaceId: apiKey.workspaceId,
|
||||
userId: apiKey.createdBy,
|
||||
roles: [Role.API_KEY],
|
||||
permissions: apiKey.permissions,
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Resource Quotas and Rate Limiting
|
||||
|
||||
### Quota Configuration
|
||||
|
||||
```typescript
|
||||
// Plan-based quota tiers
|
||||
interface ResourceQuotas {
|
||||
// Session limits
|
||||
maxConcurrentSessions: number;
|
||||
maxSessionDurationMinutes: number;
|
||||
maxTurnsPerSession: number;
|
||||
|
||||
// Memory limits
|
||||
maxMemoriesPerWorkspace: number;
|
||||
maxVectorStorageMB: number;
|
||||
maxEmbeddingsPerDay: number;
|
||||
|
||||
// Compute limits
|
||||
maxLLMTokensPerDay: number;
|
||||
maxSkillExecutionsPerDay: number;
|
||||
maxBackgroundJobsPerHour: number;
|
||||
|
||||
// Rate limits
|
||||
requestsPerMinute: number;
|
||||
requestsPerHour: number;
|
||||
burstLimit: number;
|
||||
}
|
||||
|
||||
const PLAN_QUOTAS: Record<Plan, ResourceQuotas> = {
|
||||
free: {
|
||||
maxConcurrentSessions: 2,
|
||||
maxSessionDurationMinutes: 30,
|
||||
maxTurnsPerSession: 50,
|
||||
maxMemoriesPerWorkspace: 1000,
|
||||
maxVectorStorageMB: 50,
|
||||
maxEmbeddingsPerDay: 500,
|
||||
maxLLMTokensPerDay: 10000,
|
||||
maxSkillExecutionsPerDay: 100,
|
||||
maxBackgroundJobsPerHour: 10,
|
||||
requestsPerMinute: 20,
|
||||
requestsPerHour: 500,
|
||||
burstLimit: 5,
|
||||
},
|
||||
pro: {
|
||||
maxConcurrentSessions: 10,
|
||||
maxSessionDurationMinutes: 120,
|
||||
maxTurnsPerSession: 500,
|
||||
maxMemoriesPerWorkspace: 50000,
|
||||
maxVectorStorageMB: 1000,
|
||||
maxEmbeddingsPerDay: 10000,
|
||||
maxLLMTokensPerDay: 500000,
|
||||
maxSkillExecutionsPerDay: 5000,
|
||||
maxBackgroundJobsPerHour: 200,
|
||||
requestsPerMinute: 100,
|
||||
requestsPerHour: 5000,
|
||||
burstLimit: 20,
|
||||
},
|
||||
enterprise: {
|
||||
maxConcurrentSessions: -1, // Unlimited
|
||||
maxSessionDurationMinutes: -1,
|
||||
maxTurnsPerSession: -1,
|
||||
maxMemoriesPerWorkspace: -1,
|
||||
maxVectorStorageMB: -1,
|
||||
maxEmbeddingsPerDay: -1,
|
||||
maxLLMTokensPerDay: -1,
|
||||
maxSkillExecutionsPerDay: -1,
|
||||
maxBackgroundJobsPerHour: -1,
|
||||
requestsPerMinute: 500,
|
||||
requestsPerHour: 20000,
|
||||
burstLimit: 50,
|
||||
},
|
||||
};
|
||||
```
|
||||
|
||||
### Rate Limiter Implementation
|
||||
|
||||
```typescript
|
||||
// Token bucket rate limiter with Redis backend
|
||||
class TenantRateLimiter {
|
||||
constructor(private redis: Redis) {}
|
||||
|
||||
async checkLimit(
|
||||
tenantId: string,
|
||||
action: string,
|
||||
config: RateLimitConfig
|
||||
): Promise<RateLimitResult> {
|
||||
const key = `ratelimit:${tenantId}:${action}`;
|
||||
const now = Date.now();
|
||||
const windowMs = config.windowMs || 60000;
|
||||
|
||||
// Lua script for atomic rate limit check
|
||||
const result = await this.redis.eval(`
|
||||
local key = KEYS[1]
|
||||
local now = tonumber(ARGV[1])
|
||||
local window = tonumber(ARGV[2])
|
||||
local limit = tonumber(ARGV[3])
|
||||
local burst = tonumber(ARGV[4])
|
||||
|
||||
-- Remove expired entries
|
||||
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
|
||||
|
||||
-- Count current requests
|
||||
local count = redis.call('ZCARD', key)
|
||||
|
||||
-- Check burst limit (recent 1s)
|
||||
local burstCount = redis.call('ZCOUNT', key, now - 1000, now)
|
||||
|
||||
if burstCount >= burst then
|
||||
return {0, limit - count, burst - burstCount, now + 1000}
|
||||
end
|
||||
|
||||
if count >= limit then
|
||||
local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
|
||||
local retryAfter = oldest[2] + window - now
|
||||
return {0, 0, burst - burstCount, retryAfter}
|
||||
end
|
||||
|
||||
-- Add current request
|
||||
redis.call('ZADD', key, now, now .. ':' .. math.random())
|
||||
redis.call('PEXPIRE', key, window)
|
||||
|
||||
return {1, limit - count - 1, burst - burstCount - 1, 0}
|
||||
`, 1, key, now, windowMs, config.limit, config.burstLimit);
|
||||
|
||||
const [allowed, remaining, burstRemaining, retryAfter] = result as number[];
|
||||
|
||||
return {
|
||||
allowed: allowed === 1,
|
||||
remaining,
|
||||
burstRemaining,
|
||||
retryAfter: retryAfter > 0 ? Math.ceil(retryAfter / 1000) : 0,
|
||||
limit: config.limit,
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Audit Logging
|
||||
|
||||
```typescript
|
||||
// Comprehensive audit trail
|
||||
interface AuditEvent {
|
||||
id: string;
|
||||
timestamp: Date;
|
||||
|
||||
// Tenant context
|
||||
orgId: string;
|
||||
workspaceId: string;
|
||||
userId: string;
|
||||
|
||||
// Event details
|
||||
action: AuditAction;
|
||||
resource: AuditResource;
|
||||
resourceId: string;
|
||||
|
||||
// Request context
|
||||
requestId: string;
|
||||
ipAddress: string;
|
||||
userAgent: string;
|
||||
|
||||
// Change tracking
|
||||
before?: Record<string, unknown>;
|
||||
after?: Record<string, unknown>;
|
||||
|
||||
// Outcome
|
||||
status: 'success' | 'failure' | 'denied';
|
||||
errorCode?: string;
|
||||
errorMessage?: string;
|
||||
}
|
||||
|
||||
type AuditAction =
|
||||
| 'create' | 'read' | 'update' | 'delete'
|
||||
| 'login' | 'logout' | 'token_refresh'
|
||||
| 'export' | 'import'
|
||||
| 'share' | 'unshare'
|
||||
| 'invite' | 'remove'
|
||||
| 'skill_execute' | 'memory_recall'
|
||||
| 'quota_exceeded' | 'rate_limited';
|
||||
|
||||
type AuditResource =
|
||||
| 'user' | 'session' | 'conversation'
|
||||
| 'memory' | 'skill' | 'agent'
|
||||
| 'workspace' | 'organization'
|
||||
| 'api_key' | 'webhook';
|
||||
|
||||
// Audit logger with async persistence
|
||||
class AuditLogger {
|
||||
private buffer: AuditEvent[] = [];
|
||||
private flushInterval: NodeJS.Timeout;
|
||||
|
||||
constructor(
|
||||
private storage: AuditStorage,
|
||||
private config: { batchSize: number; flushMs: number }
|
||||
) {
|
||||
this.flushInterval = setInterval(() => this.flush(), config.flushMs);
|
||||
}
|
||||
|
||||
async log(event: Omit<AuditEvent, 'id' | 'timestamp'>): Promise<void> {
|
||||
const fullEvent: AuditEvent = {
|
||||
...event,
|
||||
id: crypto.randomUUID(),
|
||||
timestamp: new Date(),
|
||||
};
|
||||
|
||||
this.buffer.push(fullEvent);
|
||||
|
||||
if (this.buffer.length >= this.config.batchSize) {
|
||||
await this.flush();
|
||||
}
|
||||
}
|
||||
|
||||
private async flush(): Promise<void> {
|
||||
if (this.buffer.length === 0) return;
|
||||
|
||||
const events = this.buffer.splice(0, this.buffer.length);
|
||||
await this.storage.batchInsert(events);
|
||||
}
|
||||
|
||||
async query(filter: AuditFilter): Promise<AuditEvent[]> {
|
||||
// Ensure tenant isolation in queries
|
||||
if (!filter.orgId) {
|
||||
throw new Error('orgId required for audit queries');
|
||||
}
|
||||
return this.storage.query(filter);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## GDPR Compliance
|
||||
|
||||
### Data Export
|
||||
|
||||
```typescript
|
||||
// Personal data export for GDPR Article 15
|
||||
class DataExporter {
|
||||
async exportUserData(
|
||||
orgId: string,
|
||||
userId: string
|
||||
): Promise<DataExportResult> {
|
||||
const export = {
|
||||
metadata: {
|
||||
userId,
|
||||
orgId,
|
||||
exportedAt: new Date(),
|
||||
format: 'json',
|
||||
version: '1.0',
|
||||
},
|
||||
data: {} as Record<string, unknown>,
|
||||
};
|
||||
|
||||
// Collect all user data across contexts
|
||||
const [
|
||||
profile,
|
||||
sessions,
|
||||
conversations,
|
||||
memories,
|
||||
preferences,
|
||||
auditLogs,
|
||||
] = await Promise.all([
|
||||
this.exportProfile(userId),
|
||||
this.exportSessions(userId),
|
||||
this.exportConversations(userId),
|
||||
this.exportMemories(userId),
|
||||
this.exportPreferences(userId),
|
||||
this.exportAuditLogs(userId),
|
||||
]);
|
||||
|
||||
export.data = {
|
||||
profile,
|
||||
sessions,
|
||||
conversations,
|
||||
memories,
|
||||
preferences,
|
||||
auditLogs,
|
||||
};
|
||||
|
||||
// Generate downloadable archive
|
||||
const archivePath = await this.createArchive(export);
|
||||
|
||||
// Log export for audit
|
||||
await this.auditLogger.log({
|
||||
orgId,
|
||||
workspaceId: '*',
|
||||
userId,
|
||||
action: 'export',
|
||||
resource: 'user',
|
||||
resourceId: userId,
|
||||
status: 'success',
|
||||
});
|
||||
|
||||
return {
|
||||
downloadUrl: await this.generateSignedUrl(archivePath),
|
||||
expiresAt: new Date(Date.now() + 24 * 60 * 60 * 1000), // 24h
|
||||
sizeBytes: await this.getFileSize(archivePath),
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Data Deletion
|
||||
|
||||
```typescript
|
||||
// Right to erasure (GDPR Article 17)
|
||||
class DataDeleter {
|
||||
async deleteUserData(
|
||||
orgId: string,
|
||||
userId: string,
|
||||
options: DeletionOptions = {}
|
||||
): Promise<DeletionResult> {
|
||||
const jobId = crypto.randomUUID();
|
||||
|
||||
// Start deletion job (may take time for large datasets)
|
||||
await this.jobQueue.enqueue('data-deletion', {
|
||||
jobId,
|
||||
orgId,
|
||||
userId,
|
||||
options,
|
||||
});
|
||||
|
||||
return {
|
||||
jobId,
|
||||
status: 'pending',
|
||||
estimatedCompletionTime: await this.estimateCompletionTime(userId),
|
||||
};
|
||||
}
|
||||
|
||||
async executeDeletion(job: DeletionJob): Promise<void> {
|
||||
const { orgId, userId, options } = job.data;
|
||||
|
||||
// Order matters: delete dependent data first
|
||||
const steps = [
|
||||
{ name: 'sessions', fn: () => this.deleteSessions(userId) },
|
||||
{ name: 'conversations', fn: () => this.deleteConversations(userId) },
|
||||
{ name: 'memories', fn: () => this.deleteMemories(userId, options.preserveShared) },
|
||||
{ name: 'embeddings', fn: () => this.deleteEmbeddings(userId) },
|
||||
{ name: 'trajectories', fn: () => this.deleteTrajectories(userId) },
|
||||
{ name: 'preferences', fn: () => this.deletePreferences(userId) },
|
||||
{ name: 'audit_logs', fn: () => this.anonymizeAuditLogs(userId) }, // Anonymize, not delete
|
||||
{ name: 'profile', fn: () => this.deleteProfile(userId) },
|
||||
];
|
||||
|
||||
for (const step of steps) {
|
||||
try {
|
||||
const result = await step.fn();
|
||||
await this.updateProgress(job.id, step.name, 'completed', result);
|
||||
} catch (error) {
|
||||
await this.updateProgress(job.id, step.name, 'failed', error);
|
||||
throw error; // Fail job, require manual intervention
|
||||
}
|
||||
}
|
||||
|
||||
// Final audit entry (anonymized user reference)
|
||||
await this.auditLogger.log({
|
||||
orgId,
|
||||
workspaceId: '*',
|
||||
userId: 'DELETED_USER',
|
||||
action: 'delete',
|
||||
resource: 'user',
|
||||
resourceId: userId.slice(0, 8) + '...',
|
||||
status: 'success',
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
### Benefits
|
||||
|
||||
1. **Strong Isolation**: RLS + namespace isolation prevents cross-tenant access
|
||||
2. **Compliance Ready**: GDPR, SOC2, HIPAA requirements addressed
|
||||
3. **Scalable Quotas**: Per-tenant resource limits enable fair usage
|
||||
4. **Audit Trail**: Complete visibility for security and compliance
|
||||
5. **Flexible Auth**: OAuth2 + API keys support various use cases
|
||||
|
||||
### Risks and Mitigations
|
||||
|
||||
| Risk | Probability | Impact | Mitigation |
|
||||
|------|-------------|--------|------------|
|
||||
| RLS bypass via SQL injection | Low | Critical | Parameterized queries, ORM only |
|
||||
| Token theft | Medium | High | Short expiry, refresh rotation |
|
||||
| Quota gaming (multiple accounts) | Medium | Medium | Device fingerprinting, email verification |
|
||||
| Audit log tampering | Low | High | Append-only storage, checksums |
|
||||
|
||||
---
|
||||
|
||||
## Related Decisions
|
||||
|
||||
- **ADR-001**: Architecture Overview
|
||||
- **ADR-003**: Persistence Layer (RLS implementation details)
|
||||
|
||||
---
|
||||
|
||||
## Revision History
|
||||
|
||||
| Version | Date | Author | Changes |
|
||||
|---------|------|--------|---------|
|
||||
| 1.0 | 2026-01-27 | RuVector Architecture Team | Initial version |
|
||||
952
npm/packages/ruvbot/docs/adr/ADR-003-persistence-layer.md
Normal file
952
npm/packages/ruvbot/docs/adr/ADR-003-persistence-layer.md
Normal file
@@ -0,0 +1,952 @@
|
||||
# ADR-003: Persistence Layer
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-01-27
|
||||
**Decision Makers:** RuVector Architecture Team
|
||||
**Technical Area:** Data Architecture, Storage
|
||||
|
||||
---
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
RuvBot requires a persistence layer that handles diverse data types:
|
||||
|
||||
1. **Relational Data**: Users, organizations, sessions, skills (structured, transactional)
|
||||
2. **Vector Data**: Embeddings for memory recall (high-dimensional, similarity search)
|
||||
3. **Session State**: Active conversation context (ephemeral, fast access)
|
||||
4. **Event Streams**: Audit logs, trajectories (append-only, time-series)
|
||||
|
||||
The persistence layer must support:
|
||||
|
||||
- **Multi-tenancy** with strict isolation
|
||||
- **High performance** for real-time conversation
|
||||
- **Durability** for compliance and recovery
|
||||
- **Scalability** for enterprise deployments
|
||||
|
||||
---
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
### Data Characteristics
|
||||
|
||||
| Data Type | Volume | Access Pattern | Consistency | Durability |
|
||||
|-----------|--------|----------------|-------------|------------|
|
||||
| User/Org metadata | Low | Read-heavy | Strong | Required |
|
||||
| Session state | Medium | Read-write balanced | Eventual OK | Nice-to-have |
|
||||
| Conversation history | High | Append-mostly | Strong | Required |
|
||||
| Vector embeddings | Very High | Read-heavy | Eventual OK | Required |
|
||||
| Memory indices | High | Read-heavy | Eventual OK | Nice-to-have |
|
||||
| Audit logs | Very High | Append-only | Strong | Required |
|
||||
|
||||
### Performance Requirements
|
||||
|
||||
| Operation | Target Latency | Target Throughput |
|
||||
|-----------|----------------|-------------------|
|
||||
| Session lookup | < 5ms p99 | 10K/s |
|
||||
| Memory recall (HNSW) | < 50ms p99 | 1K/s |
|
||||
| Conversation insert | < 20ms p99 | 5K/s |
|
||||
| Full-text search | < 100ms p99 | 500/s |
|
||||
| Batch embedding insert | < 500ms p99 | 100 batches/s |
|
||||
|
||||
---
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
### Adopt Polyglot Persistence with Unified API
|
||||
|
||||
We implement a three-tier storage architecture:
|
||||
|
||||
```
|
||||
+-----------------------------------------------------------------------------+
|
||||
| PERSISTENCE LAYER |
|
||||
+-----------------------------------------------------------------------------+
|
||||
|
||||
+--------------------------+
|
||||
| Persistence Gateway |
|
||||
| (Unified API) |
|
||||
+-------------+------------+
|
||||
|
|
||||
+-----------------------+-----------------------+
|
||||
| | |
|
||||
+---------v---------+ +---------v---------+ +---------v---------+
|
||||
| PostgreSQL | | RuVector | | Redis |
|
||||
| (Primary) | | (Vector Store) | | (Cache) |
|
||||
|-------------------| |-------------------| |-------------------|
|
||||
| - User/Org data | | - Embeddings | | - Session state |
|
||||
| - Conversations | | - HNSW indices | | - Rate limits |
|
||||
| - Skills config | | - Pattern store | | - Pub/Sub |
|
||||
| - Audit logs | | - Similarity | | - Job queues |
|
||||
| - RLS isolation | | - Learning data | | - Leaderboard |
|
||||
+-------------------+ +-------------------+ +-------------------+
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## PostgreSQL Schema
|
||||
|
||||
### Core Tables
|
||||
|
||||
```sql
|
||||
-- Extensions
|
||||
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
|
||||
CREATE EXTENSION IF NOT EXISTS "pgcrypto";
|
||||
CREATE EXTENSION IF NOT EXISTS "pg_trgm"; -- Full-text search
|
||||
|
||||
-- Organizations (tenant root)
|
||||
CREATE TABLE organizations (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
name VARCHAR(255) NOT NULL,
|
||||
slug VARCHAR(100) NOT NULL UNIQUE,
|
||||
plan VARCHAR(50) NOT NULL DEFAULT 'free',
|
||||
settings JSONB NOT NULL DEFAULT '{}',
|
||||
quotas JSONB NOT NULL DEFAULT '{}',
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
CREATE INDEX organizations_slug_idx ON organizations (slug);
|
||||
|
||||
-- Workspaces (project boundary)
|
||||
CREATE TABLE workspaces (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
|
||||
name VARCHAR(255) NOT NULL,
|
||||
slug VARCHAR(100) NOT NULL,
|
||||
settings JSONB NOT NULL DEFAULT '{}',
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
UNIQUE (org_id, slug)
|
||||
);
|
||||
|
||||
CREATE INDEX workspaces_org_idx ON workspaces (org_id);
|
||||
|
||||
-- Users
|
||||
CREATE TABLE users (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
|
||||
email VARCHAR(255) NOT NULL,
|
||||
password_hash VARCHAR(255), -- NULL for OAuth users
|
||||
display_name VARCHAR(255),
|
||||
avatar_url VARCHAR(500),
|
||||
roles TEXT[] NOT NULL DEFAULT '{"member"}',
|
||||
preferences JSONB NOT NULL DEFAULT '{}',
|
||||
email_verified_at TIMESTAMPTZ,
|
||||
last_login_at TIMESTAMPTZ,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
UNIQUE (org_id, email)
|
||||
);
|
||||
|
||||
CREATE INDEX users_org_idx ON users (org_id);
|
||||
CREATE INDEX users_email_idx ON users (email);
|
||||
|
||||
-- Workspace memberships
|
||||
CREATE TABLE workspace_memberships (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
|
||||
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
|
||||
role VARCHAR(50) NOT NULL DEFAULT 'member',
|
||||
joined_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
UNIQUE (workspace_id, user_id)
|
||||
);
|
||||
|
||||
CREATE INDEX workspace_memberships_user_idx ON workspace_memberships (user_id);
|
||||
```
|
||||
|
||||
### Session and Conversation Tables
|
||||
|
||||
```sql
|
||||
-- Agents (bot configurations)
|
||||
CREATE TABLE agents (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
|
||||
workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
|
||||
name VARCHAR(255) NOT NULL,
|
||||
description TEXT,
|
||||
persona JSONB NOT NULL DEFAULT '{}',
|
||||
skill_ids UUID[] NOT NULL DEFAULT '{}',
|
||||
memory_config JSONB NOT NULL DEFAULT '{}',
|
||||
status VARCHAR(50) NOT NULL DEFAULT 'active',
|
||||
version INTEGER NOT NULL DEFAULT 1,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
ALTER TABLE agents ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY agents_isolation ON agents
|
||||
FOR ALL USING (org_id = current_tenant_id());
|
||||
|
||||
CREATE INDEX agents_org_workspace_idx ON agents (org_id, workspace_id);
|
||||
|
||||
-- Sessions (conversation containers)
|
||||
CREATE TABLE sessions (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL,
|
||||
workspace_id UUID NOT NULL,
|
||||
agent_id UUID NOT NULL REFERENCES agents(id) ON DELETE CASCADE,
|
||||
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
|
||||
channel VARCHAR(50) NOT NULL DEFAULT 'api', -- api, slack, webhook
|
||||
channel_id VARCHAR(255), -- External channel identifier
|
||||
state VARCHAR(50) NOT NULL DEFAULT 'active',
|
||||
context_snapshot JSONB, -- Serialized context for recovery
|
||||
turn_count INTEGER NOT NULL DEFAULT 0,
|
||||
token_count INTEGER NOT NULL DEFAULT 0,
|
||||
started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
last_active_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
expires_at TIMESTAMPTZ NOT NULL,
|
||||
ended_at TIMESTAMPTZ
|
||||
) PARTITION BY LIST (org_id);
|
||||
|
||||
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY sessions_isolation ON sessions
|
||||
FOR ALL USING (org_id = current_tenant_id());
|
||||
|
||||
CREATE INDEX sessions_user_active_idx ON sessions (user_id, state)
|
||||
WHERE state = 'active';
|
||||
CREATE INDEX sessions_agent_idx ON sessions (agent_id);
|
||||
CREATE INDEX sessions_expires_idx ON sessions (expires_at)
|
||||
WHERE state = 'active';
|
||||
|
||||
-- Conversation turns
|
||||
CREATE TABLE conversation_turns (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL,
|
||||
workspace_id UUID NOT NULL,
|
||||
session_id UUID NOT NULL,
|
||||
user_id UUID NOT NULL,
|
||||
role VARCHAR(20) NOT NULL, -- user, assistant, system, tool
|
||||
content TEXT NOT NULL,
|
||||
content_type VARCHAR(50) NOT NULL DEFAULT 'text',
|
||||
embedding_id UUID, -- Reference to vector store
|
||||
tool_calls JSONB, -- Function/skill calls
|
||||
tool_results JSONB, -- Function/skill results
|
||||
metadata JSONB NOT NULL DEFAULT '{}',
|
||||
token_count INTEGER,
|
||||
latency_ms INTEGER,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
) PARTITION BY LIST (org_id);
|
||||
|
||||
ALTER TABLE conversation_turns ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY turns_isolation ON conversation_turns
|
||||
FOR ALL USING (org_id = current_tenant_id());
|
||||
|
||||
-- Composite index for session history queries
|
||||
CREATE INDEX turns_session_time_idx ON conversation_turns (session_id, created_at DESC);
|
||||
CREATE INDEX turns_embedding_idx ON conversation_turns (embedding_id)
|
||||
WHERE embedding_id IS NOT NULL;
|
||||
```
|
||||
|
||||
### Memory Tables
|
||||
|
||||
```sql
|
||||
-- Memory entries (facts, events stored for recall)
|
||||
CREATE TABLE memories (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL,
|
||||
workspace_id UUID NOT NULL,
|
||||
user_id UUID, -- NULL for workspace-level memories
|
||||
memory_type VARCHAR(50) NOT NULL, -- episodic, semantic, procedural
|
||||
content TEXT NOT NULL,
|
||||
embedding_id UUID NOT NULL, -- Reference to vector store
|
||||
source_type VARCHAR(50), -- conversation, import, skill
|
||||
source_id UUID, -- Reference to source entity
|
||||
importance FLOAT NOT NULL DEFAULT 0.5, -- 0-1 importance score
|
||||
access_count INTEGER NOT NULL DEFAULT 0,
|
||||
last_accessed_at TIMESTAMPTZ,
|
||||
is_shared BOOLEAN NOT NULL DEFAULT FALSE,
|
||||
expires_at TIMESTAMPTZ,
|
||||
metadata JSONB NOT NULL DEFAULT '{}',
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
) PARTITION BY LIST (org_id);
|
||||
|
||||
ALTER TABLE memories ENABLE ROW LEVEL SECURITY;
|
||||
|
||||
-- User-scoped memories
|
||||
CREATE POLICY memories_user_isolation ON memories
|
||||
FOR ALL USING (
|
||||
org_id = current_tenant_id()
|
||||
AND workspace_id = current_workspace_id()
|
||||
AND (user_id = current_user_id() OR user_id IS NULL)
|
||||
);
|
||||
|
||||
-- Shared memories (read-only across workspace)
|
||||
CREATE POLICY memories_shared_read ON memories
|
||||
FOR SELECT USING (
|
||||
org_id = current_tenant_id()
|
||||
AND is_shared = TRUE
|
||||
);
|
||||
|
||||
CREATE INDEX memories_workspace_type_idx ON memories (workspace_id, memory_type);
|
||||
CREATE INDEX memories_user_type_idx ON memories (user_id, memory_type)
|
||||
WHERE user_id IS NOT NULL;
|
||||
CREATE INDEX memories_embedding_idx ON memories (embedding_id);
|
||||
CREATE INDEX memories_importance_idx ON memories (importance DESC);
|
||||
CREATE INDEX memories_access_idx ON memories (last_accessed_at DESC);
|
||||
|
||||
-- Memory relationships (for graph traversal)
|
||||
CREATE TABLE memory_edges (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL,
|
||||
source_memory_id UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
|
||||
target_memory_id UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
|
||||
edge_type VARCHAR(50) NOT NULL, -- related_to, caused_by, part_of, supersedes
|
||||
weight FLOAT NOT NULL DEFAULT 1.0,
|
||||
metadata JSONB NOT NULL DEFAULT '{}',
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
ALTER TABLE memory_edges ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY edges_isolation ON memory_edges
|
||||
FOR ALL USING (org_id = current_tenant_id());
|
||||
|
||||
CREATE INDEX memory_edges_source_idx ON memory_edges (source_memory_id);
|
||||
CREATE INDEX memory_edges_target_idx ON memory_edges (target_memory_id);
|
||||
CREATE INDEX memory_edges_type_idx ON memory_edges (edge_type);
|
||||
```
|
||||
|
||||
### Skills and Learning Tables
|
||||
|
||||
```sql
|
||||
-- Skills (registered capabilities)
|
||||
CREATE TABLE skills (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL,
|
||||
workspace_id UUID, -- NULL for org-wide skills
|
||||
name VARCHAR(255) NOT NULL,
|
||||
description TEXT,
|
||||
version VARCHAR(50) NOT NULL DEFAULT '1.0.0',
|
||||
triggers JSONB NOT NULL DEFAULT '[]',
|
||||
parameters JSONB NOT NULL DEFAULT '{}',
|
||||
implementation_type VARCHAR(50) NOT NULL, -- builtin, script, webhook
|
||||
implementation JSONB NOT NULL, -- Type-specific config
|
||||
hooks JSONB NOT NULL DEFAULT '{}',
|
||||
is_enabled BOOLEAN NOT NULL DEFAULT TRUE,
|
||||
usage_count INTEGER NOT NULL DEFAULT 0,
|
||||
success_rate FLOAT,
|
||||
avg_latency_ms FLOAT,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
ALTER TABLE skills ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY skills_isolation ON skills
|
||||
FOR ALL USING (org_id = current_tenant_id());
|
||||
|
||||
CREATE INDEX skills_workspace_idx ON skills (workspace_id);
|
||||
CREATE INDEX skills_enabled_idx ON skills (is_enabled) WHERE is_enabled = TRUE;
|
||||
|
||||
-- Trajectories (learning data)
|
||||
CREATE TABLE trajectories (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL,
|
||||
workspace_id UUID NOT NULL,
|
||||
session_id UUID NOT NULL,
|
||||
turn_ids UUID[] NOT NULL,
|
||||
skill_ids UUID[],
|
||||
start_time TIMESTAMPTZ NOT NULL,
|
||||
end_time TIMESTAMPTZ NOT NULL,
|
||||
verdict VARCHAR(50), -- positive, negative, neutral, pending
|
||||
verdict_reason TEXT,
|
||||
metrics JSONB NOT NULL DEFAULT '{}',
|
||||
embedding_id UUID,
|
||||
is_exported BOOLEAN NOT NULL DEFAULT FALSE,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
ALTER TABLE trajectories ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY trajectories_isolation ON trajectories
|
||||
FOR ALL USING (org_id = current_tenant_id());
|
||||
|
||||
CREATE INDEX trajectories_session_idx ON trajectories (session_id);
|
||||
CREATE INDEX trajectories_verdict_idx ON trajectories (verdict)
|
||||
WHERE verdict IS NOT NULL;
|
||||
CREATE INDEX trajectories_export_idx ON trajectories (is_exported)
|
||||
WHERE is_exported = FALSE;
|
||||
|
||||
-- Learned patterns
|
||||
CREATE TABLE learned_patterns (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
org_id UUID NOT NULL,
|
||||
workspace_id UUID, -- NULL for org-wide patterns
|
||||
pattern_type VARCHAR(50) NOT NULL, -- response, routing, skill_selection
|
||||
embedding_id UUID NOT NULL,
|
||||
exemplar_trajectory_ids UUID[] NOT NULL,
|
||||
confidence FLOAT NOT NULL,
|
||||
usage_count INTEGER NOT NULL DEFAULT 0,
|
||||
success_count INTEGER NOT NULL DEFAULT 0,
|
||||
is_active BOOLEAN NOT NULL DEFAULT TRUE,
|
||||
superseded_by UUID REFERENCES learned_patterns(id),
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
ALTER TABLE learned_patterns ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY patterns_isolation ON learned_patterns
|
||||
FOR ALL USING (org_id = current_tenant_id());
|
||||
|
||||
CREATE INDEX patterns_type_idx ON learned_patterns (pattern_type);
|
||||
CREATE INDEX patterns_active_idx ON learned_patterns (is_active)
|
||||
WHERE is_active = TRUE;
|
||||
CREATE INDEX patterns_embedding_idx ON learned_patterns (embedding_id);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## RuVector Integration
|
||||
|
||||
### Vector Store Adapter
|
||||
|
||||
```typescript
|
||||
// Unified vector store interface
|
||||
interface RuVectorAdapter {
|
||||
// Index management
|
||||
createIndex(config: IndexConfig): Promise<IndexHandle>;
|
||||
deleteIndex(handle: IndexHandle): Promise<void>;
|
||||
getIndex(namespace: string): Promise<IndexHandle | null>;
|
||||
|
||||
// Vector operations
|
||||
insert(handle: IndexHandle, entries: VectorEntry[]): Promise<void>;
|
||||
update(handle: IndexHandle, id: string, vector: Float32Array): Promise<void>;
|
||||
delete(handle: IndexHandle, ids: string[]): Promise<void>;
|
||||
|
||||
// Search operations
|
||||
search(handle: IndexHandle, query: Float32Array, options: SearchOptions): Promise<SearchResult[]>;
|
||||
batchSearch(handle: IndexHandle, queries: Float32Array[], options: SearchOptions): Promise<SearchResult[][]>;
|
||||
|
||||
// Index operations
|
||||
optimize(handle: IndexHandle): Promise<OptimizationResult>;
|
||||
stats(handle: IndexHandle): Promise<IndexStats>;
|
||||
}
|
||||
|
||||
interface IndexConfig {
|
||||
namespace: string;
|
||||
dimensions: number;
|
||||
distanceMetric: 'cosine' | 'euclidean' | 'dot_product';
|
||||
hnsw: {
|
||||
m: number;
|
||||
efConstruction: number;
|
||||
efSearch: number;
|
||||
};
|
||||
quantization?: {
|
||||
type: 'scalar' | 'product' | 'binary';
|
||||
bits?: number;
|
||||
};
|
||||
}
|
||||
|
||||
interface VectorEntry {
|
||||
id: string;
|
||||
vector: Float32Array;
|
||||
metadata?: Record<string, unknown>;
|
||||
}
|
||||
|
||||
interface SearchResult {
|
||||
id: string;
|
||||
score: number;
|
||||
metadata?: Record<string, unknown>;
|
||||
}
|
||||
```
|
||||
|
||||
### Namespace Schema
|
||||
|
||||
```typescript
|
||||
// Vector namespace organization
|
||||
const VECTOR_NAMESPACES = {
|
||||
// Memory embeddings
|
||||
EPISODIC: (orgId: string, workspaceId: string) =>
|
||||
`${orgId}/${workspaceId}/memory/episodic`,
|
||||
SEMANTIC: (orgId: string, workspaceId: string) =>
|
||||
`${orgId}/${workspaceId}/memory/semantic`,
|
||||
PROCEDURAL: (orgId: string, workspaceId: string) =>
|
||||
`${orgId}/${workspaceId}/memory/procedural`,
|
||||
|
||||
// Conversation embeddings
|
||||
CONVERSATIONS: (orgId: string, workspaceId: string) =>
|
||||
`${orgId}/${workspaceId}/conversations`,
|
||||
|
||||
// Learning embeddings
|
||||
TRAJECTORIES: (orgId: string, workspaceId: string) =>
|
||||
`${orgId}/${workspaceId}/learning/trajectories`,
|
||||
PATTERNS: (orgId: string, workspaceId: string) =>
|
||||
`${orgId}/${workspaceId}/learning/patterns`,
|
||||
|
||||
// Skill embeddings (for intent matching)
|
||||
SKILLS: (orgId: string) =>
|
||||
`${orgId}/skills`,
|
||||
};
|
||||
|
||||
// Index configuration per namespace type
|
||||
const INDEX_CONFIGS: Record<string, Partial<IndexConfig>> = {
|
||||
'memory/episodic': {
|
||||
dimensions: 384,
|
||||
distanceMetric: 'cosine',
|
||||
hnsw: { m: 16, efConstruction: 100, efSearch: 50 },
|
||||
},
|
||||
'memory/semantic': {
|
||||
dimensions: 384,
|
||||
distanceMetric: 'cosine',
|
||||
hnsw: { m: 32, efConstruction: 200, efSearch: 100 },
|
||||
},
|
||||
'conversations': {
|
||||
dimensions: 384,
|
||||
distanceMetric: 'cosine',
|
||||
hnsw: { m: 16, efConstruction: 100, efSearch: 50 },
|
||||
quantization: { type: 'scalar' }, // Compress for volume
|
||||
},
|
||||
'learning/patterns': {
|
||||
dimensions: 384,
|
||||
distanceMetric: 'cosine',
|
||||
hnsw: { m: 32, efConstruction: 200, efSearch: 100 },
|
||||
},
|
||||
};
|
||||
```
|
||||
|
||||
### WASM/Native Detection
|
||||
|
||||
```typescript
|
||||
// Automatic runtime detection
|
||||
class RuVectorFactory {
|
||||
private static instance: RuVectorAdapter | null = null;
|
||||
|
||||
static async create(): Promise<RuVectorAdapter> {
|
||||
if (this.instance) return this.instance;
|
||||
|
||||
// Try native first (better performance)
|
||||
try {
|
||||
const native = await import('@ruvector/core');
|
||||
if (native.isNativeAvailable()) {
|
||||
console.log('RuVector: Using native NAPI bindings');
|
||||
this.instance = new NativeRuVectorAdapter(native);
|
||||
return this.instance;
|
||||
}
|
||||
} catch (e) {
|
||||
console.debug('Native bindings not available:', e);
|
||||
}
|
||||
|
||||
// Fall back to WASM
|
||||
try {
|
||||
const wasm = await import('@ruvector/wasm');
|
||||
console.log('RuVector: Using WASM runtime');
|
||||
this.instance = new WasmRuVectorAdapter(wasm);
|
||||
return this.instance;
|
||||
} catch (e) {
|
||||
throw new Error(`Failed to load RuVector runtime: ${e}`);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Redis Schema
|
||||
|
||||
### Session Cache
|
||||
|
||||
```typescript
|
||||
// Session state keys
|
||||
const SESSION_KEYS = {
|
||||
// Active session state
|
||||
state: (sessionId: string) => `session:${sessionId}:state`,
|
||||
|
||||
// Context window (recent turns)
|
||||
context: (sessionId: string) => `session:${sessionId}:context`,
|
||||
|
||||
// Session lock (prevent concurrent modifications)
|
||||
lock: (sessionId: string) => `session:${sessionId}:lock`,
|
||||
|
||||
// User's active sessions
|
||||
userSessions: (userId: string) => `user:${userId}:sessions`,
|
||||
|
||||
// Session expiry sorted set
|
||||
expiryIndex: () => 'sessions:expiry',
|
||||
};
|
||||
|
||||
// Session state structure
|
||||
interface CachedSessionState {
|
||||
id: string;
|
||||
agentId: string;
|
||||
userId: string;
|
||||
state: SessionState;
|
||||
turnCount: number;
|
||||
tokenCount: number;
|
||||
lastActiveAt: number;
|
||||
expiresAt: number;
|
||||
}
|
||||
|
||||
// Context window structure
|
||||
interface CachedContextWindow {
|
||||
maxTokens: number;
|
||||
turns: Array<{
|
||||
id: string;
|
||||
role: string;
|
||||
content: string;
|
||||
createdAt: number;
|
||||
}>;
|
||||
retrievedMemoryIds: string[];
|
||||
}
|
||||
```
|
||||
|
||||
### Rate Limiting
|
||||
|
||||
```typescript
|
||||
// Rate limit keys
|
||||
const RATE_LIMIT_KEYS = {
|
||||
// Per-tenant rate limits
|
||||
tenant: (tenantId: string, action: string, window: string) =>
|
||||
`ratelimit:tenant:${tenantId}:${action}:${window}`,
|
||||
|
||||
// Per-user rate limits
|
||||
user: (userId: string, action: string, window: string) =>
|
||||
`ratelimit:user:${userId}:${action}:${window}`,
|
||||
|
||||
// Global rate limits
|
||||
global: (action: string, window: string) =>
|
||||
`ratelimit:global:${action}:${window}`,
|
||||
};
|
||||
|
||||
// Rate limit actions
|
||||
type RateLimitAction =
|
||||
| 'api_request'
|
||||
| 'llm_call'
|
||||
| 'embedding_request'
|
||||
| 'memory_write'
|
||||
| 'skill_execute'
|
||||
| 'webhook_dispatch';
|
||||
```
|
||||
|
||||
### Pub/Sub Channels
|
||||
|
||||
```typescript
|
||||
// Real-time event channels
|
||||
const PUBSUB_CHANNELS = {
|
||||
// Session events
|
||||
sessionCreated: (workspaceId: string) =>
|
||||
`events:${workspaceId}:session:created`,
|
||||
sessionEnded: (workspaceId: string) =>
|
||||
`events:${workspaceId}:session:ended`,
|
||||
|
||||
// Conversation events
|
||||
turnCreated: (sessionId: string) =>
|
||||
`events:session:${sessionId}:turn:created`,
|
||||
|
||||
// Memory events
|
||||
memoryCreated: (workspaceId: string) =>
|
||||
`events:${workspaceId}:memory:created`,
|
||||
memoryUpdated: (workspaceId: string) =>
|
||||
`events:${workspaceId}:memory:updated`,
|
||||
|
||||
// Skill events
|
||||
skillExecuted: (workspaceId: string) =>
|
||||
`events:${workspaceId}:skill:executed`,
|
||||
|
||||
// System events
|
||||
quotaWarning: (tenantId: string) =>
|
||||
`events:${tenantId}:quota:warning`,
|
||||
};
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Data Access Patterns
|
||||
|
||||
### Repository Pattern
|
||||
|
||||
```typescript
|
||||
// Base repository with tenant context
|
||||
abstract class TenantRepository<T> {
|
||||
constructor(
|
||||
protected db: PostgresAdapter,
|
||||
protected tenantContext: TenantContext
|
||||
) {}
|
||||
|
||||
protected async withTenantContext<R>(
|
||||
fn: (db: PostgresAdapter) => Promise<R>
|
||||
): Promise<R> {
|
||||
// Set tenant context for RLS
|
||||
await this.db.query(`
|
||||
SELECT set_config('app.current_org_id', $1, true),
|
||||
set_config('app.current_workspace_id', $2, true),
|
||||
set_config('app.current_user_id', $3, true)
|
||||
`, [
|
||||
this.tenantContext.orgId,
|
||||
this.tenantContext.workspaceId,
|
||||
this.tenantContext.userId,
|
||||
]);
|
||||
|
||||
return fn(this.db);
|
||||
}
|
||||
|
||||
abstract findById(id: string): Promise<T | null>;
|
||||
abstract save(entity: T): Promise<T>;
|
||||
abstract delete(id: string): Promise<void>;
|
||||
}
|
||||
|
||||
// Memory repository example
|
||||
class MemoryRepository extends TenantRepository<Memory> {
|
||||
async findById(id: string): Promise<Memory | null> {
|
||||
return this.withTenantContext(async (db) => {
|
||||
const rows = await db.query<MemoryRow>(
|
||||
'SELECT * FROM memories WHERE id = $1',
|
||||
[id]
|
||||
);
|
||||
return rows[0] ? this.toEntity(rows[0]) : null;
|
||||
});
|
||||
}
|
||||
|
||||
async findByEmbedding(
|
||||
embedding: Float32Array,
|
||||
options: MemorySearchOptions
|
||||
): Promise<MemoryWithScore[]> {
|
||||
// Search vector store first
|
||||
const vectorResults = await this.vectorStore.search(
|
||||
this.getIndexHandle(),
|
||||
embedding,
|
||||
{ k: options.limit, threshold: options.minScore }
|
||||
);
|
||||
|
||||
if (vectorResults.length === 0) return [];
|
||||
|
||||
// Fetch full memory records
|
||||
return this.withTenantContext(async (db) => {
|
||||
const ids = vectorResults.map(r => r.id);
|
||||
const scoreMap = new Map(vectorResults.map(r => [r.id, r.score]));
|
||||
|
||||
const rows = await db.query<MemoryRow>(
|
||||
'SELECT * FROM memories WHERE id = ANY($1)',
|
||||
[ids]
|
||||
);
|
||||
|
||||
return rows
|
||||
.map(row => ({
|
||||
memory: this.toEntity(row),
|
||||
score: scoreMap.get(row.id) ?? 0,
|
||||
}))
|
||||
.sort((a, b) => b.score - a.score);
|
||||
});
|
||||
}
|
||||
|
||||
async save(memory: Memory): Promise<Memory> {
|
||||
return this.withTenantContext(async (db) => {
|
||||
// Generate embedding if not present
|
||||
if (!memory.embeddingId) {
|
||||
const embedding = await this.embedder.embed(memory.content);
|
||||
const embeddingId = crypto.randomUUID();
|
||||
|
||||
await this.vectorStore.insert(this.getIndexHandle(), [{
|
||||
id: embeddingId,
|
||||
vector: embedding,
|
||||
metadata: { memoryId: memory.id },
|
||||
}]);
|
||||
|
||||
memory.embeddingId = embeddingId;
|
||||
}
|
||||
|
||||
// Upsert to database
|
||||
const row = await db.query<MemoryRow>(`
|
||||
INSERT INTO memories (
|
||||
id, org_id, workspace_id, user_id, memory_type, content,
|
||||
embedding_id, source_type, source_id, importance, metadata
|
||||
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
|
||||
ON CONFLICT (id) DO UPDATE SET
|
||||
content = EXCLUDED.content,
|
||||
importance = EXCLUDED.importance,
|
||||
metadata = EXCLUDED.metadata,
|
||||
updated_at = NOW()
|
||||
RETURNING *
|
||||
`, [
|
||||
memory.id,
|
||||
this.tenantContext.orgId,
|
||||
this.tenantContext.workspaceId,
|
||||
memory.userId,
|
||||
memory.type,
|
||||
memory.content,
|
||||
memory.embeddingId,
|
||||
memory.sourceType,
|
||||
memory.sourceId,
|
||||
memory.importance,
|
||||
memory.metadata,
|
||||
]);
|
||||
|
||||
return this.toEntity(row[0]);
|
||||
});
|
||||
}
|
||||
|
||||
private getIndexHandle(): IndexHandle {
|
||||
return {
|
||||
namespace: VECTOR_NAMESPACES[this.tenantContext.workspaceId]
|
||||
? VECTOR_NAMESPACES.EPISODIC(
|
||||
this.tenantContext.orgId,
|
||||
this.tenantContext.workspaceId
|
||||
)
|
||||
: VECTOR_NAMESPACES.SEMANTIC(
|
||||
this.tenantContext.orgId,
|
||||
this.tenantContext.workspaceId
|
||||
),
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Unit of Work Pattern
|
||||
|
||||
```typescript
|
||||
// Transaction coordination
|
||||
class UnitOfWork {
|
||||
private operations: Operation[] = [];
|
||||
private committed = false;
|
||||
|
||||
constructor(
|
||||
private db: PostgresAdapter,
|
||||
private vectorStore: RuVectorAdapter,
|
||||
private cache: CacheAdapter
|
||||
) {}
|
||||
|
||||
addMemory(memory: Memory): void {
|
||||
this.operations.push({
|
||||
type: 'memory',
|
||||
action: 'upsert',
|
||||
entity: memory,
|
||||
});
|
||||
}
|
||||
|
||||
addTurn(turn: ConversationTurn): void {
|
||||
this.operations.push({
|
||||
type: 'turn',
|
||||
action: 'insert',
|
||||
entity: turn,
|
||||
});
|
||||
}
|
||||
|
||||
async commit(): Promise<void> {
|
||||
if (this.committed) throw new Error('Already committed');
|
||||
|
||||
try {
|
||||
await this.db.transaction(async (tx) => {
|
||||
// Execute database operations
|
||||
for (const op of this.operations.filter(o => o.type !== 'cache')) {
|
||||
await this.executeDbOperation(tx, op);
|
||||
}
|
||||
|
||||
// Execute vector operations (outside transaction, but after DB success)
|
||||
for (const op of this.operations.filter(o =>
|
||||
o.type === 'memory' || o.type === 'turn'
|
||||
)) {
|
||||
await this.executeVectorOperation(op);
|
||||
}
|
||||
});
|
||||
|
||||
// Execute cache operations (best effort)
|
||||
for (const op of this.operations.filter(o => o.type === 'cache')) {
|
||||
await this.executeCacheOperation(op).catch(console.error);
|
||||
}
|
||||
|
||||
this.committed = true;
|
||||
} catch (error) {
|
||||
// Rollback vector operations on failure
|
||||
await this.rollbackVectorOperations();
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Migration Strategy
|
||||
|
||||
### Schema Migrations
|
||||
|
||||
```typescript
|
||||
// Migration runner
|
||||
class MigrationRunner {
|
||||
async migrate(direction: 'up' | 'down' = 'up'): Promise<void> {
|
||||
const migrations = await this.loadMigrations();
|
||||
const applied = await this.getAppliedMigrations();
|
||||
|
||||
if (direction === 'up') {
|
||||
const pending = migrations.filter(m => !applied.has(m.version));
|
||||
for (const migration of pending) {
|
||||
await this.applyMigration(migration);
|
||||
}
|
||||
} else {
|
||||
const toRollback = [...applied].reverse();
|
||||
for (const version of toRollback) {
|
||||
const migration = migrations.find(m => m.version === version);
|
||||
if (migration) {
|
||||
await this.rollbackMigration(migration);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private async applyMigration(migration: Migration): Promise<void> {
|
||||
await this.db.transaction(async (tx) => {
|
||||
// Run migration SQL
|
||||
await tx.query(migration.up);
|
||||
|
||||
// Record migration
|
||||
await tx.query(
|
||||
'INSERT INTO schema_migrations (version, applied_at) VALUES ($1, NOW())',
|
||||
[migration.version]
|
||||
);
|
||||
});
|
||||
|
||||
console.log(`Applied migration: ${migration.version}`);
|
||||
}
|
||||
}
|
||||
|
||||
// Example migration
|
||||
const MIGRATION_001: Migration = {
|
||||
version: '001_initial_schema',
|
||||
up: `
|
||||
-- Create organizations table
|
||||
CREATE TABLE organizations (...);
|
||||
|
||||
-- Create workspaces table
|
||||
CREATE TABLE workspaces (...);
|
||||
|
||||
-- ... rest of schema
|
||||
`,
|
||||
down: `
|
||||
DROP TABLE IF EXISTS workspaces;
|
||||
DROP TABLE IF EXISTS organizations;
|
||||
`,
|
||||
};
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
### Benefits
|
||||
|
||||
1. **Strong Isolation**: RLS + namespace isolation at every layer
|
||||
2. **Performance**: Optimized indices, caching, and partitioning
|
||||
3. **Flexibility**: Polyglot persistence matches data characteristics
|
||||
4. **Durability**: PostgreSQL for critical data, redundant vector storage
|
||||
5. **Scalability**: Horizontal scaling via partitions and Redis cluster
|
||||
|
||||
### Trade-offs
|
||||
|
||||
| Benefit | Trade-off |
|
||||
|---------|-----------|
|
||||
| RLS security | Slight query overhead |
|
||||
| HNSW speed | Memory consumption |
|
||||
| Redis caching | Consistency complexity |
|
||||
| Polyglot persistence | Operational complexity |
|
||||
|
||||
---
|
||||
|
||||
## Related Decisions
|
||||
|
||||
- **ADR-001**: Architecture Overview
|
||||
- **ADR-002**: Multi-tenancy Design
|
||||
- **ADR-006**: WASM Integration (vector store runtime)
|
||||
|
||||
---
|
||||
|
||||
## Revision History
|
||||
|
||||
| Version | Date | Author | Changes |
|
||||
|---------|------|--------|---------|
|
||||
| 1.0 | 2026-01-27 | RuVector Architecture Team | Initial version |
|
||||
1068
npm/packages/ruvbot/docs/adr/ADR-004-background-workers.md
Normal file
1068
npm/packages/ruvbot/docs/adr/ADR-004-background-workers.md
Normal file
File diff suppressed because it is too large
Load Diff
907
npm/packages/ruvbot/docs/adr/ADR-005-integration-layer.md
Normal file
907
npm/packages/ruvbot/docs/adr/ADR-005-integration-layer.md
Normal file
@@ -0,0 +1,907 @@
|
||||
# ADR-005: Integration Layer
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-01-27
|
||||
**Decision Makers:** RuVector Architecture Team
|
||||
**Technical Area:** Integrations, External Services
|
||||
|
||||
---
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
RuvBot must integrate with external systems to:
|
||||
|
||||
1. **Receive messages** from Slack, webhooks, and other channels
|
||||
2. **Send notifications** and responses back to users
|
||||
3. **Connect to AI providers** for LLM inference and embeddings
|
||||
4. **Interact with external APIs** for skill execution
|
||||
5. **Provide webhooks** for third-party integrations
|
||||
|
||||
The integration layer must be:
|
||||
|
||||
- **Extensible** for new integration types
|
||||
- **Resilient** to external service failures
|
||||
- **Secure** with proper authentication and authorization
|
||||
- **Observable** with logging and metrics
|
||||
|
||||
---
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
### Integration Requirements
|
||||
|
||||
| Integration | Priority | Features Required |
|
||||
|-------------|----------|-------------------|
|
||||
| Slack | Critical | Events, commands, blocks, threads |
|
||||
| REST Webhooks | Critical | Inbound/outbound, signatures |
|
||||
| Anthropic Claude | Critical | Completions, streaming |
|
||||
| OpenAI | High | Completions, embeddings |
|
||||
| Custom LLMs | Medium | Provider abstraction |
|
||||
| External APIs | Medium | HTTP client, retries |
|
||||
|
||||
### Reliability Requirements
|
||||
|
||||
| Requirement | Target |
|
||||
|-------------|--------|
|
||||
| Webhook delivery success | > 99% |
|
||||
| Provider failover time | < 1s |
|
||||
| Message ordering | Within session |
|
||||
| Duplicate detection | 100% |
|
||||
|
||||
---
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
### Adopt Adapter Pattern with Circuit Breaker
|
||||
|
||||
We implement the integration layer using:
|
||||
|
||||
1. **Adapter Pattern**: Common interface for each integration type
|
||||
2. **Circuit Breaker**: Prevent cascade failures from external services
|
||||
3. **Retry with Backoff**: Handle transient failures
|
||||
4. **Event-Driven**: Decouple ingestion from processing
|
||||
|
||||
```
|
||||
+-----------------------------------------------------------------------------+
|
||||
| INTEGRATION LAYER |
|
||||
+-----------------------------------------------------------------------------+
|
||||
|
||||
+---------------------------+
|
||||
| Integration Gateway |
|
||||
| (Protocol Normalization)|
|
||||
+-------------+-------------+
|
||||
|
|
||||
+-----------------------+-----------------------+
|
||||
| | |
|
||||
+---------v---------+ +---------v---------+ +---------v---------+
|
||||
| Slack Adapter | | Webhook Adapter | | Provider Adapter |
|
||||
|-------------------| |-------------------| |-------------------|
|
||||
| - Events API | | - Inbound routes | | - LLM clients |
|
||||
| - Commands | | - Outbound queue | | - Embeddings |
|
||||
| - Interactive | | - Signatures | | - Circuit breaker |
|
||||
| - OAuth | | - Retries | | - Failover |
|
||||
+-------------------+ +-------------------+ +-------------------+
|
||||
| | |
|
||||
+-----------------------+-----------------------+
|
||||
|
|
||||
+-------------v-------------+
|
||||
| Event Normalizer |
|
||||
| (Unified Message Format) |
|
||||
+-------------+-------------+
|
||||
|
|
||||
+-------------v-------------+
|
||||
| Core Context |
|
||||
+---------------------------+
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Slack Integration
|
||||
|
||||
### Architecture
|
||||
|
||||
```typescript
|
||||
// Slack integration components
|
||||
interface SlackIntegration {
|
||||
// Event handling
|
||||
events: SlackEventHandler;
|
||||
|
||||
// Slash commands
|
||||
commands: SlackCommandHandler;
|
||||
|
||||
// Interactive components (buttons, modals)
|
||||
interactive: SlackInteractiveHandler;
|
||||
|
||||
// Block Kit builder
|
||||
blocks: BlockKitBuilder;
|
||||
|
||||
// Web API client
|
||||
client: SlackWebClient;
|
||||
|
||||
// OAuth flow
|
||||
oauth: SlackOAuthHandler;
|
||||
}
|
||||
|
||||
// Event types we handle
|
||||
type SlackEventType =
|
||||
| 'message'
|
||||
| 'app_mention'
|
||||
| 'reaction_added'
|
||||
| 'reaction_removed'
|
||||
| 'channel_created'
|
||||
| 'member_joined_channel'
|
||||
| 'file_shared'
|
||||
| 'app_home_opened';
|
||||
|
||||
// Normalized event structure
|
||||
interface SlackIncomingEvent {
|
||||
type: SlackEventType;
|
||||
teamId: string;
|
||||
channelId: string;
|
||||
userId: string;
|
||||
text?: string;
|
||||
threadTs?: string;
|
||||
ts: string;
|
||||
raw: unknown;
|
||||
}
|
||||
```
|
||||
|
||||
### Event Handler
|
||||
|
||||
```typescript
|
||||
// Slack event processing
|
||||
class SlackEventHandler {
|
||||
private eventQueue: Queue<SlackIncomingEvent>;
|
||||
private deduplicator: EventDeduplicator;
|
||||
|
||||
constructor(
|
||||
private config: SlackConfig,
|
||||
private sessionManager: SessionManager,
|
||||
private agent: Agent
|
||||
) {
|
||||
this.eventQueue = new Queue('slack-events');
|
||||
this.deduplicator = new EventDeduplicator({
|
||||
ttl: 300000, // 5 minutes
|
||||
keyFn: (e) => `${e.teamId}:${e.channelId}:${e.ts}`,
|
||||
});
|
||||
}
|
||||
|
||||
// Express middleware for Slack events
|
||||
middleware(): RequestHandler {
|
||||
return async (req, res) => {
|
||||
// Verify Slack signature
|
||||
if (!this.verifySignature(req)) {
|
||||
return res.status(401).send('Invalid signature');
|
||||
}
|
||||
|
||||
const body = req.body;
|
||||
|
||||
// Handle URL verification challenge
|
||||
if (body.type === 'url_verification') {
|
||||
return res.json({ challenge: body.challenge });
|
||||
}
|
||||
|
||||
// Acknowledge immediately (Slack 3s timeout)
|
||||
res.status(200).send();
|
||||
|
||||
// Process event asynchronously
|
||||
await this.handleEvent(body.event);
|
||||
};
|
||||
}
|
||||
|
||||
private async handleEvent(rawEvent: unknown): Promise<void> {
|
||||
const event = this.normalizeEvent(rawEvent);
|
||||
|
||||
// Deduplicate (Slack may retry)
|
||||
if (await this.deduplicator.isDuplicate(event)) {
|
||||
this.logger.debug('Duplicate event ignored', { event });
|
||||
return;
|
||||
}
|
||||
|
||||
// Filter events we care about
|
||||
if (!this.shouldProcess(event)) {
|
||||
return;
|
||||
}
|
||||
|
||||
// Map to tenant context
|
||||
const tenant = await this.resolveTenant(event.teamId);
|
||||
if (!tenant) {
|
||||
this.logger.warn('Unknown Slack team', { teamId: event.teamId });
|
||||
return;
|
||||
}
|
||||
|
||||
// Enqueue for processing
|
||||
await this.eventQueue.add('process', {
|
||||
event,
|
||||
tenant,
|
||||
receivedAt: Date.now(),
|
||||
});
|
||||
}
|
||||
|
||||
private shouldProcess(event: SlackIncomingEvent): boolean {
|
||||
// Skip bot messages
|
||||
if (event.raw?.bot_id) return false;
|
||||
|
||||
// Only process certain event types
|
||||
return ['message', 'app_mention'].includes(event.type);
|
||||
}
|
||||
|
||||
private verifySignature(req: Request): boolean {
|
||||
const timestamp = req.headers['x-slack-request-timestamp'] as string;
|
||||
const signature = req.headers['x-slack-signature'] as string;
|
||||
|
||||
// Prevent replay attacks (5 minute window)
|
||||
const now = Math.floor(Date.now() / 1000);
|
||||
if (Math.abs(now - parseInt(timestamp)) > 300) {
|
||||
return false;
|
||||
}
|
||||
|
||||
const baseString = `v0:${timestamp}:${req.rawBody}`;
|
||||
const expectedSignature = `v0=${crypto
|
||||
.createHmac('sha256', this.config.signingSecret)
|
||||
.update(baseString)
|
||||
.digest('hex')}`;
|
||||
|
||||
return crypto.timingSafeEqual(
|
||||
Buffer.from(signature),
|
||||
Buffer.from(expectedSignature)
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Slash Commands
|
||||
|
||||
```typescript
|
||||
// Slash command handling
|
||||
class SlackCommandHandler {
|
||||
private commands: Map<string, CommandDefinition> = new Map();
|
||||
|
||||
register(command: CommandDefinition): void {
|
||||
this.commands.set(command.name, command);
|
||||
}
|
||||
|
||||
middleware(): RequestHandler {
|
||||
return async (req, res) => {
|
||||
if (!this.verifySignature(req)) {
|
||||
return res.status(401).send('Invalid signature');
|
||||
}
|
||||
|
||||
const { command, text, user_id, channel_id, team_id, response_url } = req.body;
|
||||
|
||||
const commandDef = this.commands.get(command);
|
||||
if (!commandDef) {
|
||||
return res.json({
|
||||
response_type: 'ephemeral',
|
||||
text: `Unknown command: ${command}`,
|
||||
});
|
||||
}
|
||||
|
||||
// Parse arguments
|
||||
const args = this.parseArgs(text, commandDef.argSchema);
|
||||
|
||||
// Acknowledge with loading state
|
||||
res.json({
|
||||
response_type: 'ephemeral',
|
||||
text: 'Processing...',
|
||||
});
|
||||
|
||||
try {
|
||||
// Execute command
|
||||
const result = await commandDef.handler({
|
||||
args,
|
||||
userId: user_id,
|
||||
channelId: channel_id,
|
||||
teamId: team_id,
|
||||
});
|
||||
|
||||
// Send actual response
|
||||
await this.sendResponse(response_url, {
|
||||
response_type: result.public ? 'in_channel' : 'ephemeral',
|
||||
blocks: result.blocks,
|
||||
text: result.text,
|
||||
});
|
||||
} catch (error) {
|
||||
await this.sendResponse(response_url, {
|
||||
response_type: 'ephemeral',
|
||||
text: `Error: ${(error as Error).message}`,
|
||||
});
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
private parseArgs(text: string, schema: ArgSchema): Record<string, unknown> {
|
||||
const args: Record<string, unknown> = {};
|
||||
const parts = text.trim().split(/\s+/);
|
||||
|
||||
for (const [name, def] of Object.entries(schema)) {
|
||||
if (def.positional !== undefined) {
|
||||
args[name] = parts[def.positional];
|
||||
} else if (def.flag) {
|
||||
const flagIndex = parts.indexOf(`--${name}`);
|
||||
if (flagIndex !== -1) {
|
||||
args[name] = parts[flagIndex + 1] ?? true;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return args;
|
||||
}
|
||||
}
|
||||
|
||||
// Command definition
|
||||
interface CommandDefinition {
|
||||
name: string;
|
||||
description: string;
|
||||
argSchema: ArgSchema;
|
||||
handler: (ctx: CommandContext) => Promise<CommandResult>;
|
||||
}
|
||||
|
||||
// Example command
|
||||
const askCommand: CommandDefinition = {
|
||||
name: '/ask',
|
||||
description: 'Ask RuvBot a question',
|
||||
argSchema: {
|
||||
question: { positional: 0, required: true },
|
||||
context: { flag: true },
|
||||
},
|
||||
handler: async (ctx) => {
|
||||
const session = await sessionManager.getOrCreate(ctx.userId, ctx.channelId);
|
||||
const response = await agent.process(session, ctx.args.question as string);
|
||||
|
||||
return {
|
||||
public: false,
|
||||
text: response.content,
|
||||
blocks: formatResponseBlocks(response),
|
||||
};
|
||||
},
|
||||
};
|
||||
```
|
||||
|
||||
### Block Kit Builder
|
||||
|
||||
```typescript
|
||||
// Fluent Block Kit builder
|
||||
class BlockKitBuilder {
|
||||
private blocks: Block[] = [];
|
||||
|
||||
section(text: string): this {
|
||||
this.blocks.push({
|
||||
type: 'section',
|
||||
text: { type: 'mrkdwn', text },
|
||||
});
|
||||
return this;
|
||||
}
|
||||
|
||||
divider(): this {
|
||||
this.blocks.push({ type: 'divider' });
|
||||
return this;
|
||||
}
|
||||
|
||||
context(...elements: string[]): this {
|
||||
this.blocks.push({
|
||||
type: 'context',
|
||||
elements: elements.map(e => ({ type: 'mrkdwn', text: e })),
|
||||
});
|
||||
return this;
|
||||
}
|
||||
|
||||
actions(actionId: string, buttons: Button[]): this {
|
||||
this.blocks.push({
|
||||
type: 'actions',
|
||||
block_id: actionId,
|
||||
elements: buttons.map(b => ({
|
||||
type: 'button',
|
||||
text: { type: 'plain_text', text: b.text },
|
||||
action_id: b.actionId,
|
||||
value: b.value,
|
||||
style: b.style,
|
||||
})),
|
||||
});
|
||||
return this;
|
||||
}
|
||||
|
||||
input(label: string, actionId: string, options: InputOptions): this {
|
||||
this.blocks.push({
|
||||
type: 'input',
|
||||
label: { type: 'plain_text', text: label },
|
||||
element: {
|
||||
type: options.multiline ? 'plain_text_input' : 'plain_text_input',
|
||||
action_id: actionId,
|
||||
multiline: options.multiline,
|
||||
placeholder: options.placeholder
|
||||
? { type: 'plain_text', text: options.placeholder }
|
||||
: undefined,
|
||||
},
|
||||
});
|
||||
return this;
|
||||
}
|
||||
|
||||
build(): Block[] {
|
||||
return this.blocks;
|
||||
}
|
||||
}
|
||||
|
||||
// Usage example
|
||||
const responseBlocks = new BlockKitBuilder()
|
||||
.section('Here is what I found:')
|
||||
.divider()
|
||||
.section(responseText)
|
||||
.context(`Generated in ${latencyMs}ms`)
|
||||
.actions('feedback', [
|
||||
{ text: 'Helpful', actionId: 'feedback_positive', value: responseId, style: 'primary' },
|
||||
{ text: 'Not helpful', actionId: 'feedback_negative', value: responseId },
|
||||
])
|
||||
.build();
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Webhook Integration
|
||||
|
||||
### Inbound Webhooks
|
||||
|
||||
```typescript
|
||||
// Inbound webhook configuration
|
||||
interface WebhookEndpoint {
|
||||
id: string;
|
||||
path: string; // e.g., "/webhooks/github"
|
||||
method: 'POST' | 'PUT';
|
||||
secretKey?: string;
|
||||
signatureHeader?: string;
|
||||
signatureAlgorithm?: 'hmac-sha256' | 'hmac-sha1';
|
||||
handler: WebhookHandler;
|
||||
rateLimit?: RateLimitConfig;
|
||||
}
|
||||
|
||||
class InboundWebhookRouter {
|
||||
private endpoints: Map<string, WebhookEndpoint> = new Map();
|
||||
|
||||
register(endpoint: WebhookEndpoint): void {
|
||||
this.endpoints.set(endpoint.path, endpoint);
|
||||
}
|
||||
|
||||
middleware(): RequestHandler {
|
||||
return async (req, res, next) => {
|
||||
const endpoint = this.endpoints.get(req.path);
|
||||
if (!endpoint) {
|
||||
return next();
|
||||
}
|
||||
|
||||
// Rate limiting
|
||||
if (endpoint.rateLimit) {
|
||||
const allowed = await this.rateLimiter.check(
|
||||
`webhook:${endpoint.id}:${req.ip}`,
|
||||
endpoint.rateLimit
|
||||
);
|
||||
if (!allowed) {
|
||||
return res.status(429).json({ error: 'Rate limit exceeded' });
|
||||
}
|
||||
}
|
||||
|
||||
// Signature verification
|
||||
if (endpoint.secretKey) {
|
||||
if (!this.verifySignature(req, endpoint)) {
|
||||
return res.status(401).json({ error: 'Invalid signature' });
|
||||
}
|
||||
}
|
||||
|
||||
try {
|
||||
const result = await endpoint.handler({
|
||||
body: req.body,
|
||||
headers: req.headers,
|
||||
query: req.query,
|
||||
});
|
||||
|
||||
res.status(result.status ?? 200).json(result.body ?? { ok: true });
|
||||
} catch (error) {
|
||||
this.logger.error('Webhook handler error', { error, endpoint: endpoint.id });
|
||||
res.status(500).json({ error: 'Internal error' });
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
private verifySignature(req: Request, endpoint: WebhookEndpoint): boolean {
|
||||
const signatureHeader = endpoint.signatureHeader ?? 'x-signature';
|
||||
const providedSignature = req.headers[signatureHeader.toLowerCase()] as string;
|
||||
|
||||
if (!providedSignature) return false;
|
||||
|
||||
const algorithm = endpoint.signatureAlgorithm ?? 'hmac-sha256';
|
||||
const expectedSignature = crypto
|
||||
.createHmac(algorithm.replace('hmac-', ''), endpoint.secretKey!)
|
||||
.update(req.rawBody)
|
||||
.digest('hex');
|
||||
|
||||
// Handle various signature formats
|
||||
const normalizedProvided = providedSignature
|
||||
.replace(/^sha256=/, '')
|
||||
.replace(/^sha1=/, '');
|
||||
|
||||
return crypto.timingSafeEqual(
|
||||
Buffer.from(normalizedProvided),
|
||||
Buffer.from(expectedSignature)
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Outbound Webhooks
|
||||
|
||||
```typescript
|
||||
// Outbound webhook delivery
|
||||
class OutboundWebhookDispatcher {
|
||||
constructor(
|
||||
private queue: Queue<WebhookDelivery>,
|
||||
private storage: WebhookStorage,
|
||||
private http: HttpClient
|
||||
) {}
|
||||
|
||||
async dispatch(
|
||||
webhookId: string,
|
||||
event: WebhookEvent,
|
||||
options?: DispatchOptions
|
||||
): Promise<string> {
|
||||
const webhook = await this.storage.findById(webhookId);
|
||||
if (!webhook || !webhook.isEnabled) {
|
||||
throw new Error(`Webhook ${webhookId} not found or disabled`);
|
||||
}
|
||||
|
||||
const deliveryId = crypto.randomUUID();
|
||||
const payload = this.buildPayload(event, webhook);
|
||||
const signature = this.sign(payload, webhook.secret);
|
||||
|
||||
// Queue for delivery
|
||||
await this.queue.add(
|
||||
'deliver',
|
||||
{
|
||||
deliveryId,
|
||||
webhookId,
|
||||
url: webhook.url,
|
||||
payload,
|
||||
signature,
|
||||
headers: webhook.headers,
|
||||
},
|
||||
{
|
||||
attempts: 10,
|
||||
backoff: { type: 'exponential', delay: 1000 },
|
||||
removeOnComplete: 100,
|
||||
removeOnFail: 1000,
|
||||
}
|
||||
);
|
||||
|
||||
return deliveryId;
|
||||
}
|
||||
|
||||
private buildPayload(event: WebhookEvent, webhook: Webhook): string {
|
||||
return JSON.stringify({
|
||||
id: crypto.randomUUID(),
|
||||
type: event.type,
|
||||
timestamp: new Date().toISOString(),
|
||||
data: event.data,
|
||||
webhook_id: webhook.id,
|
||||
});
|
||||
}
|
||||
|
||||
private sign(payload: string, secret: string): string {
|
||||
const timestamp = Math.floor(Date.now() / 1000);
|
||||
const signaturePayload = `${timestamp}.${payload}`;
|
||||
const signature = crypto
|
||||
.createHmac('sha256', secret)
|
||||
.update(signaturePayload)
|
||||
.digest('hex');
|
||||
return `t=${timestamp},v1=${signature}`;
|
||||
}
|
||||
}
|
||||
|
||||
// Webhook event types
|
||||
type WebhookEventType =
|
||||
| 'session.created'
|
||||
| 'session.ended'
|
||||
| 'message.received'
|
||||
| 'message.sent'
|
||||
| 'memory.created'
|
||||
| 'skill.executed'
|
||||
| 'error.occurred';
|
||||
|
||||
interface WebhookEvent {
|
||||
type: WebhookEventType;
|
||||
data: Record<string, unknown>;
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## LLM Provider Integration
|
||||
|
||||
### Provider Abstraction
|
||||
|
||||
```typescript
|
||||
// Unified LLM provider interface
|
||||
interface LLMProvider {
|
||||
// Basic completion
|
||||
complete(
|
||||
messages: Message[],
|
||||
options: CompletionOptions
|
||||
): Promise<Completion>;
|
||||
|
||||
// Streaming completion
|
||||
stream(
|
||||
messages: Message[],
|
||||
options: StreamOptions
|
||||
): AsyncGenerator<Token, Completion, void>;
|
||||
|
||||
// Token counting
|
||||
countTokens(text: string): Promise<number>;
|
||||
|
||||
// Model info
|
||||
getModel(): ModelInfo;
|
||||
|
||||
// Health check
|
||||
isHealthy(): Promise<boolean>;
|
||||
}
|
||||
|
||||
interface CompletionOptions {
|
||||
maxTokens?: number;
|
||||
temperature?: number;
|
||||
topP?: number;
|
||||
stopSequences?: string[];
|
||||
tools?: Tool[];
|
||||
}
|
||||
|
||||
interface Completion {
|
||||
content: string;
|
||||
finishReason: 'stop' | 'length' | 'tool_use';
|
||||
usage: {
|
||||
inputTokens: number;
|
||||
outputTokens: number;
|
||||
};
|
||||
toolCalls?: ToolCall[];
|
||||
}
|
||||
```
|
||||
|
||||
### Anthropic Claude Provider
|
||||
|
||||
```typescript
|
||||
// Claude provider implementation
|
||||
class ClaudeProvider implements LLMProvider {
|
||||
private client: AnthropicClient;
|
||||
private circuitBreaker: CircuitBreaker;
|
||||
|
||||
constructor(config: ClaudeConfig) {
|
||||
this.client = new Anthropic({
|
||||
apiKey: config.apiKey,
|
||||
baseURL: config.baseURL,
|
||||
});
|
||||
|
||||
this.circuitBreaker = new CircuitBreaker({
|
||||
failureThreshold: 5,
|
||||
resetTimeout: 30000,
|
||||
});
|
||||
}
|
||||
|
||||
async complete(
|
||||
messages: Message[],
|
||||
options: CompletionOptions
|
||||
): Promise<Completion> {
|
||||
return this.circuitBreaker.execute(async () => {
|
||||
const response = await this.client.messages.create({
|
||||
model: 'claude-sonnet-4-20250514',
|
||||
max_tokens: options.maxTokens ?? 1024,
|
||||
temperature: options.temperature ?? 0.7,
|
||||
messages: this.formatMessages(messages),
|
||||
tools: options.tools?.map(this.formatTool),
|
||||
});
|
||||
|
||||
return this.parseResponse(response);
|
||||
});
|
||||
}
|
||||
|
||||
async *stream(
|
||||
messages: Message[],
|
||||
options: StreamOptions
|
||||
): AsyncGenerator<Token, Completion, void> {
|
||||
const stream = await this.client.messages.stream({
|
||||
model: 'claude-sonnet-4-20250514',
|
||||
max_tokens: options.maxTokens ?? 1024,
|
||||
temperature: options.temperature ?? 0.7,
|
||||
messages: this.formatMessages(messages),
|
||||
});
|
||||
|
||||
let fullContent = '';
|
||||
let inputTokens = 0;
|
||||
let outputTokens = 0;
|
||||
|
||||
for await (const event of stream) {
|
||||
if (event.type === 'content_block_delta') {
|
||||
const text = event.delta.text;
|
||||
fullContent += text;
|
||||
yield { type: 'text', text };
|
||||
} else if (event.type === 'message_delta') {
|
||||
outputTokens = event.usage?.output_tokens ?? 0;
|
||||
} else if (event.type === 'message_start') {
|
||||
inputTokens = event.message.usage?.input_tokens ?? 0;
|
||||
}
|
||||
}
|
||||
|
||||
return {
|
||||
content: fullContent,
|
||||
finishReason: 'stop',
|
||||
usage: { inputTokens, outputTokens },
|
||||
};
|
||||
}
|
||||
|
||||
private formatMessages(messages: Message[]): AnthropicMessage[] {
|
||||
return messages.map(m => ({
|
||||
role: m.role === 'user' ? 'user' : 'assistant',
|
||||
content: m.content,
|
||||
}));
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Provider Registry with Failover
|
||||
|
||||
```typescript
|
||||
// Multi-provider registry with automatic failover
|
||||
class ProviderRegistry {
|
||||
private providers: Map<string, LLMProvider> = new Map();
|
||||
private primary: string;
|
||||
private fallbacks: string[];
|
||||
|
||||
constructor(config: ProviderRegistryConfig) {
|
||||
this.primary = config.primary;
|
||||
this.fallbacks = config.fallbacks;
|
||||
}
|
||||
|
||||
register(name: string, provider: LLMProvider): void {
|
||||
this.providers.set(name, provider);
|
||||
}
|
||||
|
||||
async complete(
|
||||
messages: Message[],
|
||||
options: CompletionOptions
|
||||
): Promise<Completion> {
|
||||
const providerOrder = [this.primary, ...this.fallbacks];
|
||||
|
||||
for (const providerName of providerOrder) {
|
||||
const provider = this.providers.get(providerName);
|
||||
if (!provider) continue;
|
||||
|
||||
try {
|
||||
// Check health before using
|
||||
if (await provider.isHealthy()) {
|
||||
const result = await provider.complete(messages, options);
|
||||
this.metrics.increment('provider.success', { provider: providerName });
|
||||
return result;
|
||||
}
|
||||
} catch (error) {
|
||||
this.logger.warn(`Provider ${providerName} failed`, { error });
|
||||
this.metrics.increment('provider.failure', { provider: providerName });
|
||||
}
|
||||
}
|
||||
|
||||
throw new Error('All LLM providers unavailable');
|
||||
}
|
||||
|
||||
async *stream(
|
||||
messages: Message[],
|
||||
options: StreamOptions
|
||||
): AsyncGenerator<Token, Completion, void> {
|
||||
const provider = this.providers.get(this.primary);
|
||||
if (!provider) {
|
||||
throw new Error(`Primary provider ${this.primary} not found`);
|
||||
}
|
||||
|
||||
// Streaming doesn't support automatic failover (would be disruptive)
|
||||
yield* provider.stream(messages, options);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Circuit Breaker
|
||||
|
||||
```typescript
|
||||
// Circuit breaker for external service protection
|
||||
class CircuitBreaker {
|
||||
private state: 'closed' | 'open' | 'half-open' = 'closed';
|
||||
private failures = 0;
|
||||
private lastFailureTime = 0;
|
||||
private successesSinceHalfOpen = 0;
|
||||
|
||||
constructor(private config: CircuitBreakerConfig) {}
|
||||
|
||||
async execute<T>(fn: () => Promise<T>): Promise<T> {
|
||||
if (this.state === 'open') {
|
||||
if (Date.now() - this.lastFailureTime > this.config.resetTimeout) {
|
||||
this.state = 'half-open';
|
||||
this.successesSinceHalfOpen = 0;
|
||||
} else {
|
||||
throw new CircuitBreakerOpenError();
|
||||
}
|
||||
}
|
||||
|
||||
try {
|
||||
const result = await fn();
|
||||
this.onSuccess();
|
||||
return result;
|
||||
} catch (error) {
|
||||
this.onFailure();
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
|
||||
private onSuccess(): void {
|
||||
if (this.state === 'half-open') {
|
||||
this.successesSinceHalfOpen++;
|
||||
if (this.successesSinceHalfOpen >= this.config.successThreshold) {
|
||||
this.state = 'closed';
|
||||
this.failures = 0;
|
||||
}
|
||||
} else {
|
||||
this.failures = 0;
|
||||
}
|
||||
}
|
||||
|
||||
private onFailure(): void {
|
||||
this.failures++;
|
||||
this.lastFailureTime = Date.now();
|
||||
|
||||
if (this.failures >= this.config.failureThreshold) {
|
||||
this.state = 'open';
|
||||
}
|
||||
}
|
||||
|
||||
getState(): CircuitBreakerState {
|
||||
return {
|
||||
state: this.state,
|
||||
failures: this.failures,
|
||||
lastFailureTime: this.lastFailureTime,
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
interface CircuitBreakerConfig {
|
||||
failureThreshold: number; // Failures before opening
|
||||
successThreshold: number; // Successes in half-open to close
|
||||
resetTimeout: number; // ms before trying half-open
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
### Benefits
|
||||
|
||||
1. **Unified Interface**: All integrations exposed through consistent APIs
|
||||
2. **Resilience**: Circuit breakers and retries prevent cascade failures
|
||||
3. **Extensibility**: Easy to add new providers and integrations
|
||||
4. **Observability**: Comprehensive metrics and logging
|
||||
5. **Security**: Proper signature verification and authentication
|
||||
|
||||
### Trade-offs
|
||||
|
||||
| Benefit | Trade-off |
|
||||
|---------|-----------|
|
||||
| Abstraction | Some provider-specific features hidden |
|
||||
| Circuit breaker | Delayed recovery after incidents |
|
||||
| Retry logic | Potential duplicate processing |
|
||||
| Async processing | Eventually consistent state |
|
||||
|
||||
---
|
||||
|
||||
## Related Decisions
|
||||
|
||||
- **ADR-001**: Architecture Overview
|
||||
- **ADR-004**: Background Workers (webhook delivery)
|
||||
|
||||
---
|
||||
|
||||
## Revision History
|
||||
|
||||
| Version | Date | Author | Changes |
|
||||
|---------|------|--------|---------|
|
||||
| 1.0 | 2026-01-27 | RuVector Architecture Team | Initial version |
|
||||
775
npm/packages/ruvbot/docs/adr/ADR-006-wasm-integration.md
Normal file
775
npm/packages/ruvbot/docs/adr/ADR-006-wasm-integration.md
Normal file
@@ -0,0 +1,775 @@
|
||||
# ADR-006: WASM Integration
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-01-27
|
||||
**Decision Makers:** RuVector Architecture Team
|
||||
**Technical Area:** Runtime, Performance
|
||||
|
||||
---
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
RuvBot requires high-performance vector operations and ML inference for:
|
||||
|
||||
1. **Embedding generation** for memory storage and retrieval
|
||||
2. **HNSW search** for semantic memory recall
|
||||
3. **Pattern matching** for learned response optimization
|
||||
4. **Quantization** for memory-efficient vector storage
|
||||
|
||||
The runtime must support:
|
||||
|
||||
- **Server-side Node.js** for API workloads
|
||||
- **Edge deployments** (Cloudflare Workers, Vercel Edge)
|
||||
- **Browser execution** for client-side features
|
||||
- **Fallback paths** when WASM is unavailable
|
||||
|
||||
---
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
### Performance Requirements
|
||||
|
||||
| Operation | Target Latency | Environment |
|
||||
|-----------|----------------|-------------|
|
||||
| Embed single text | < 10ms | WASM |
|
||||
| Embed batch (32) | < 100ms | WASM |
|
||||
| HNSW search k=10 | < 5ms | Native/WASM |
|
||||
| Quantize vector | < 1ms | WASM |
|
||||
| Pattern match | < 20ms | WASM |
|
||||
|
||||
### Compatibility Requirements
|
||||
|
||||
| Environment | WASM Support | Native Support |
|
||||
|-------------|--------------|----------------|
|
||||
| Node.js 18+ | Full | Full (NAPI) |
|
||||
| Node.js 14-17 | Partial | Full (NAPI) |
|
||||
| Cloudflare Workers | Full | None |
|
||||
| Vercel Edge | Full | None |
|
||||
| Browser (Chrome/FF/Safari) | Full | None |
|
||||
| Deno | Full | Partial |
|
||||
|
||||
---
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
### Adopt Hybrid WASM/Native Runtime with Automatic Detection
|
||||
|
||||
We implement a runtime abstraction that:
|
||||
|
||||
1. **Detects environment** at initialization
|
||||
2. **Prefers native bindings** when available (2-5x faster)
|
||||
3. **Falls back to WASM** universally
|
||||
4. **Provides consistent API** regardless of backend
|
||||
|
||||
```
|
||||
+-----------------------------------------------------------------------------+
|
||||
| WASM INTEGRATION LAYER |
|
||||
+-----------------------------------------------------------------------------+
|
||||
|
||||
+---------------------------+
|
||||
| Runtime Detector |
|
||||
+-------------+-------------+
|
||||
|
|
||||
+---------------------+---------------------+
|
||||
| |
|
||||
+-----------v-----------+ +-----------v-----------+
|
||||
| Native Backend | | WASM Backend |
|
||||
| (NAPI-RS) | | (wasm-bindgen) |
|
||||
|-----------------------| |-----------------------|
|
||||
| - @ruvector/core | | - @ruvector/wasm |
|
||||
| - @ruvector/ruvllm | | - @ruvllm-wasm |
|
||||
| - @ruvector/sona | | - @sona-wasm |
|
||||
+-----------+-----------+ +-----------+-----------+
|
||||
| |
|
||||
+---------------------+---------------------+
|
||||
|
|
||||
+-------------v-------------+
|
||||
| Unified API Surface |
|
||||
| (RuVectorRuntime) |
|
||||
+---------------------------+
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## WASM Module Architecture
|
||||
|
||||
### Module Organization
|
||||
|
||||
```typescript
|
||||
// WASM module types available
|
||||
interface WasmModules {
|
||||
// Vector operations
|
||||
vectorOps: {
|
||||
distance: (a: Float32Array, b: Float32Array, metric: DistanceMetric) => number;
|
||||
batchDistance: (query: Float32Array, vectors: Float32Array[], metric: DistanceMetric) => Float32Array;
|
||||
normalize: (vector: Float32Array) => Float32Array;
|
||||
quantize: (vector: Float32Array, config: QuantizationConfig) => Uint8Array;
|
||||
dequantize: (quantized: Uint8Array, config: QuantizationConfig) => Float32Array;
|
||||
};
|
||||
|
||||
// HNSW index
|
||||
hnsw: {
|
||||
create: (config: HnswConfig) => HnswIndexHandle;
|
||||
insert: (handle: HnswIndexHandle, id: string, vector: Float32Array) => void;
|
||||
search: (handle: HnswIndexHandle, query: Float32Array, k: number) => SearchResult[];
|
||||
delete: (handle: HnswIndexHandle, id: string) => boolean;
|
||||
serialize: (handle: HnswIndexHandle) => Uint8Array;
|
||||
deserialize: (data: Uint8Array) => HnswIndexHandle;
|
||||
free: (handle: HnswIndexHandle) => void;
|
||||
};
|
||||
|
||||
// Embeddings
|
||||
embeddings: {
|
||||
loadModel: (modelPath: string) => EmbeddingModelHandle;
|
||||
embed: (handle: EmbeddingModelHandle, text: string) => Float32Array;
|
||||
embedBatch: (handle: EmbeddingModelHandle, texts: string[]) => Float32Array[];
|
||||
unloadModel: (handle: EmbeddingModelHandle) => void;
|
||||
};
|
||||
|
||||
// Learning
|
||||
learning: {
|
||||
createPattern: (embedding: Float32Array, metadata: unknown) => PatternHandle;
|
||||
matchPatterns: (query: Float32Array, patterns: PatternHandle[], threshold: number) => PatternMatch[];
|
||||
trainLoRA: (trajectories: Trajectory[], config: LoRAConfig) => LoRAWeights;
|
||||
applyEWC: (weights: ModelWeights, fisher: FisherMatrix, lambda: number) => ModelWeights;
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
### Runtime Detection
|
||||
|
||||
```typescript
|
||||
// Automatic runtime detection and initialization
|
||||
class RuVectorRuntime {
|
||||
private static instance: RuVectorRuntime | null = null;
|
||||
private backend: 'native' | 'wasm' | 'js-fallback';
|
||||
private modules: WasmModules | NativeModules;
|
||||
|
||||
private constructor() {}
|
||||
|
||||
static async initialize(): Promise<RuVectorRuntime> {
|
||||
if (this.instance) return this.instance;
|
||||
|
||||
const runtime = new RuVectorRuntime();
|
||||
await runtime.detectAndLoad();
|
||||
this.instance = runtime;
|
||||
return runtime;
|
||||
}
|
||||
|
||||
private async detectAndLoad(): Promise<void> {
|
||||
// Try native first (best performance)
|
||||
if (await this.tryNative()) {
|
||||
this.backend = 'native';
|
||||
console.log('RuVector: Using native NAPI backend');
|
||||
return;
|
||||
}
|
||||
|
||||
// Try WASM
|
||||
if (await this.tryWasm()) {
|
||||
this.backend = 'wasm';
|
||||
console.log('RuVector: Using WASM backend');
|
||||
return;
|
||||
}
|
||||
|
||||
// Fall back to pure JS (limited functionality)
|
||||
this.backend = 'js-fallback';
|
||||
console.warn('RuVector: Using JS fallback (limited performance)');
|
||||
await this.loadJsFallback();
|
||||
}
|
||||
|
||||
private async tryNative(): Promise<boolean> {
|
||||
// Native only available in Node.js
|
||||
if (typeof process === 'undefined' || !process.versions?.node) {
|
||||
return false;
|
||||
}
|
||||
|
||||
try {
|
||||
const nativeModule = await import('@ruvector/core');
|
||||
if (typeof nativeModule.isNativeAvailable === 'function' &&
|
||||
nativeModule.isNativeAvailable()) {
|
||||
this.modules = nativeModule;
|
||||
return true;
|
||||
}
|
||||
} catch (e) {
|
||||
console.debug('Native module not available:', e);
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
private async tryWasm(): Promise<boolean> {
|
||||
try {
|
||||
// Check WebAssembly support
|
||||
if (typeof WebAssembly !== 'object') {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Load WASM modules
|
||||
const [vectorOps, hnsw, embeddings, learning] = await Promise.all([
|
||||
import('@ruvector/wasm'),
|
||||
import('@ruvector/wasm/hnsw'),
|
||||
import('@ruvector/wasm/embeddings'),
|
||||
import('@ruvector/wasm/learning'),
|
||||
]);
|
||||
|
||||
// Initialize WASM modules
|
||||
await Promise.all([
|
||||
vectorOps.default(),
|
||||
hnsw.default(),
|
||||
embeddings.default(),
|
||||
learning.default(),
|
||||
]);
|
||||
|
||||
this.modules = {
|
||||
vectorOps,
|
||||
hnsw,
|
||||
embeddings,
|
||||
learning,
|
||||
};
|
||||
|
||||
return true;
|
||||
} catch (e) {
|
||||
console.debug('WASM modules not available:', e);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
private async loadJsFallback(): Promise<void> {
|
||||
// Pure JS implementations (slower but always work)
|
||||
const { JsFallbackModules } = await import('./js-fallback');
|
||||
this.modules = new JsFallbackModules();
|
||||
}
|
||||
|
||||
getBackend(): 'native' | 'wasm' | 'js-fallback' {
|
||||
return this.backend;
|
||||
}
|
||||
|
||||
getModules(): WasmModules | NativeModules {
|
||||
if (!this.modules) {
|
||||
throw new Error('RuVector runtime not initialized');
|
||||
}
|
||||
return this.modules;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Embedding Engine
|
||||
|
||||
### WASM Embedder
|
||||
|
||||
```typescript
|
||||
// WASM-based embedding engine
|
||||
class WasmEmbedder {
|
||||
private modelHandle: EmbeddingModelHandle | null = null;
|
||||
private modelPath: string;
|
||||
private dimensions: number;
|
||||
private runtime: RuVectorRuntime;
|
||||
|
||||
constructor(config: EmbedderConfig) {
|
||||
this.modelPath = config.modelPath;
|
||||
this.dimensions = config.dimensions ?? 384;
|
||||
}
|
||||
|
||||
async initialize(): Promise<void> {
|
||||
this.runtime = await RuVectorRuntime.initialize();
|
||||
const { embeddings } = this.runtime.getModules();
|
||||
|
||||
// Load model (downloads and caches if needed)
|
||||
const modelData = await this.loadModelData();
|
||||
this.modelHandle = embeddings.loadModel(modelData);
|
||||
}
|
||||
|
||||
async embed(text: string): Promise<Float32Array> {
|
||||
if (!this.modelHandle) {
|
||||
throw new Error('Embedder not initialized');
|
||||
}
|
||||
|
||||
const { embeddings } = this.runtime.getModules();
|
||||
return embeddings.embed(this.modelHandle, text);
|
||||
}
|
||||
|
||||
async embedBatch(texts: string[]): Promise<Float32Array[]> {
|
||||
if (!this.modelHandle) {
|
||||
throw new Error('Embedder not initialized');
|
||||
}
|
||||
|
||||
const { embeddings } = this.runtime.getModules();
|
||||
|
||||
// Process in chunks to avoid OOM
|
||||
const chunkSize = 32;
|
||||
const results: Float32Array[] = [];
|
||||
|
||||
for (let i = 0; i < texts.length; i += chunkSize) {
|
||||
const chunk = texts.slice(i, i + chunkSize);
|
||||
const chunkResults = embeddings.embedBatch(this.modelHandle, chunk);
|
||||
results.push(...chunkResults);
|
||||
}
|
||||
|
||||
return results;
|
||||
}
|
||||
|
||||
getDimensions(): number {
|
||||
return this.dimensions;
|
||||
}
|
||||
|
||||
async dispose(): Promise<void> {
|
||||
if (this.modelHandle) {
|
||||
const { embeddings } = this.runtime.getModules();
|
||||
embeddings.unloadModel(this.modelHandle);
|
||||
this.modelHandle = null;
|
||||
}
|
||||
}
|
||||
|
||||
private async loadModelData(): Promise<Uint8Array> {
|
||||
// Check cache first
|
||||
const cached = await this.modelCache.get(this.modelPath);
|
||||
if (cached) return cached;
|
||||
|
||||
// Download model
|
||||
const response = await fetch(this.modelPath);
|
||||
const buffer = await response.arrayBuffer();
|
||||
const data = new Uint8Array(buffer);
|
||||
|
||||
// Cache for future use
|
||||
await this.modelCache.set(this.modelPath, data);
|
||||
|
||||
return data;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Model Cache
|
||||
|
||||
```typescript
|
||||
// Cross-environment model cache
|
||||
class ModelCache {
|
||||
private memoryCache: Map<string, Uint8Array> = new Map();
|
||||
|
||||
async get(key: string): Promise<Uint8Array | null> {
|
||||
// Check memory cache first
|
||||
if (this.memoryCache.has(key)) {
|
||||
return this.memoryCache.get(key)!;
|
||||
}
|
||||
|
||||
// Try persistent cache (environment-specific)
|
||||
if (typeof caches !== 'undefined') {
|
||||
// Browser/Cloudflare Cache API
|
||||
return this.getFromCacheAPI(key);
|
||||
} else if (typeof process !== 'undefined' && process.versions?.node) {
|
||||
// Node.js file system cache
|
||||
return this.getFromFileCache(key);
|
||||
}
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
async set(key: string, data: Uint8Array): Promise<void> {
|
||||
// Always store in memory
|
||||
this.memoryCache.set(key, data);
|
||||
|
||||
// Persist to appropriate cache
|
||||
if (typeof caches !== 'undefined') {
|
||||
await this.setToCacheAPI(key, data);
|
||||
} else if (typeof process !== 'undefined' && process.versions?.node) {
|
||||
await this.setToFileCache(key, data);
|
||||
}
|
||||
}
|
||||
|
||||
private async getFromCacheAPI(key: string): Promise<Uint8Array | null> {
|
||||
try {
|
||||
const cache = await caches.open('ruvector-models');
|
||||
const response = await cache.match(key);
|
||||
if (response) {
|
||||
const buffer = await response.arrayBuffer();
|
||||
return new Uint8Array(buffer);
|
||||
}
|
||||
} catch (e) {
|
||||
console.debug('Cache API error:', e);
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
private async setToCacheAPI(key: string, data: Uint8Array): Promise<void> {
|
||||
try {
|
||||
const cache = await caches.open('ruvector-models');
|
||||
const response = new Response(data, {
|
||||
headers: { 'Content-Type': 'application/octet-stream' },
|
||||
});
|
||||
await cache.put(key, response);
|
||||
} catch (e) {
|
||||
console.debug('Cache API error:', e);
|
||||
}
|
||||
}
|
||||
|
||||
private async getFromFileCache(key: string): Promise<Uint8Array | null> {
|
||||
const fs = await import('fs/promises');
|
||||
const path = await import('path');
|
||||
const os = await import('os');
|
||||
|
||||
const cacheDir = path.join(os.homedir(), '.ruvector', 'models');
|
||||
const cachePath = path.join(cacheDir, this.keyToFilename(key));
|
||||
|
||||
try {
|
||||
const data = await fs.readFile(cachePath);
|
||||
return new Uint8Array(data);
|
||||
} catch (e) {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
private async setToFileCache(key: string, data: Uint8Array): Promise<void> {
|
||||
const fs = await import('fs/promises');
|
||||
const path = await import('path');
|
||||
const os = await import('os');
|
||||
|
||||
const cacheDir = path.join(os.homedir(), '.ruvector', 'models');
|
||||
await fs.mkdir(cacheDir, { recursive: true });
|
||||
|
||||
const cachePath = path.join(cacheDir, this.keyToFilename(key));
|
||||
await fs.writeFile(cachePath, data);
|
||||
}
|
||||
|
||||
private keyToFilename(key: string): string {
|
||||
const crypto = require('crypto');
|
||||
return crypto.createHash('sha256').update(key).digest('hex').slice(0, 32);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## HNSW Index WASM Wrapper
|
||||
|
||||
```typescript
|
||||
// WASM-based HNSW index
|
||||
class WasmHnswIndex {
|
||||
private handle: HnswIndexHandle | null = null;
|
||||
private runtime: RuVectorRuntime;
|
||||
private config: HnswConfig;
|
||||
private vectorCount = 0;
|
||||
|
||||
constructor(config: HnswConfig) {
|
||||
this.config = config;
|
||||
}
|
||||
|
||||
async initialize(): Promise<void> {
|
||||
this.runtime = await RuVectorRuntime.initialize();
|
||||
const { hnsw } = this.runtime.getModules();
|
||||
this.handle = hnsw.create(this.config);
|
||||
}
|
||||
|
||||
async insert(id: string, vector: Float32Array): Promise<void> {
|
||||
if (!this.handle) throw new Error('Index not initialized');
|
||||
|
||||
// Validate dimensions
|
||||
if (vector.length !== this.config.dimensions) {
|
||||
throw new Error(`Vector dimension mismatch: ${vector.length} vs ${this.config.dimensions}`);
|
||||
}
|
||||
|
||||
const { hnsw } = this.runtime.getModules();
|
||||
hnsw.insert(this.handle, id, vector);
|
||||
this.vectorCount++;
|
||||
}
|
||||
|
||||
async insertBatch(entries: Array<{ id: string; vector: Float32Array }>): Promise<void> {
|
||||
if (!this.handle) throw new Error('Index not initialized');
|
||||
|
||||
const { hnsw } = this.runtime.getModules();
|
||||
|
||||
for (const entry of entries) {
|
||||
if (entry.vector.length !== this.config.dimensions) {
|
||||
throw new Error(`Vector dimension mismatch for ${entry.id}`);
|
||||
}
|
||||
hnsw.insert(this.handle, entry.id, entry.vector);
|
||||
this.vectorCount++;
|
||||
}
|
||||
}
|
||||
|
||||
async search(query: Float32Array, k: number): Promise<SearchResult[]> {
|
||||
if (!this.handle) throw new Error('Index not initialized');
|
||||
|
||||
if (query.length !== this.config.dimensions) {
|
||||
throw new Error(`Query dimension mismatch: ${query.length}`);
|
||||
}
|
||||
|
||||
const { hnsw } = this.runtime.getModules();
|
||||
return hnsw.search(this.handle, query, Math.min(k, this.vectorCount));
|
||||
}
|
||||
|
||||
async delete(id: string): Promise<boolean> {
|
||||
if (!this.handle) throw new Error('Index not initialized');
|
||||
|
||||
const { hnsw } = this.runtime.getModules();
|
||||
const deleted = hnsw.delete(this.handle, id);
|
||||
if (deleted) this.vectorCount--;
|
||||
return deleted;
|
||||
}
|
||||
|
||||
async serialize(): Promise<Uint8Array> {
|
||||
if (!this.handle) throw new Error('Index not initialized');
|
||||
|
||||
const { hnsw } = this.runtime.getModules();
|
||||
return hnsw.serialize(this.handle);
|
||||
}
|
||||
|
||||
async deserialize(data: Uint8Array): Promise<void> {
|
||||
const { hnsw } = this.runtime.getModules();
|
||||
|
||||
// Free existing handle if any
|
||||
if (this.handle) {
|
||||
hnsw.free(this.handle);
|
||||
}
|
||||
|
||||
this.handle = hnsw.deserialize(data);
|
||||
}
|
||||
|
||||
getStats(): IndexStats {
|
||||
return {
|
||||
vectorCount: this.vectorCount,
|
||||
dimensions: this.config.dimensions,
|
||||
m: this.config.m,
|
||||
efConstruction: this.config.efConstruction,
|
||||
efSearch: this.config.efSearch,
|
||||
backend: this.runtime.getBackend(),
|
||||
};
|
||||
}
|
||||
|
||||
async dispose(): Promise<void> {
|
||||
if (this.handle) {
|
||||
const { hnsw } = this.runtime.getModules();
|
||||
hnsw.free(this.handle);
|
||||
this.handle = null;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
interface HnswConfig {
|
||||
dimensions: number;
|
||||
m: number; // Max connections per node per layer
|
||||
efConstruction: number; // Build-time exploration factor
|
||||
efSearch: number; // Query-time exploration factor
|
||||
distanceMetric: 'cosine' | 'euclidean' | 'dot_product';
|
||||
}
|
||||
|
||||
interface SearchResult {
|
||||
id: string;
|
||||
score: number;
|
||||
}
|
||||
|
||||
interface IndexStats {
|
||||
vectorCount: number;
|
||||
dimensions: number;
|
||||
m: number;
|
||||
efConstruction: number;
|
||||
efSearch: number;
|
||||
backend: 'native' | 'wasm' | 'js-fallback';
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Memory Management
|
||||
|
||||
### WASM Memory Pooling
|
||||
|
||||
```typescript
|
||||
// Efficient memory management for WASM
|
||||
class WasmMemoryPool {
|
||||
private pools: Map<number, Float32Array[]> = new Map();
|
||||
private maxPoolSize = 100;
|
||||
|
||||
// Get or create a Float32Array of specified length
|
||||
acquire(length: number): Float32Array {
|
||||
const pool = this.pools.get(length);
|
||||
|
||||
if (pool && pool.length > 0) {
|
||||
return pool.pop()!;
|
||||
}
|
||||
|
||||
return new Float32Array(length);
|
||||
}
|
||||
|
||||
// Return array to pool for reuse
|
||||
release(array: Float32Array): void {
|
||||
const length = array.length;
|
||||
let pool = this.pools.get(length);
|
||||
|
||||
if (!pool) {
|
||||
pool = [];
|
||||
this.pools.set(length, pool);
|
||||
}
|
||||
|
||||
if (pool.length < this.maxPoolSize) {
|
||||
// Zero out for security
|
||||
array.fill(0);
|
||||
pool.push(array);
|
||||
}
|
||||
// Otherwise let GC handle it
|
||||
}
|
||||
|
||||
// Clear pools when memory pressure detected
|
||||
clear(): void {
|
||||
this.pools.clear();
|
||||
}
|
||||
|
||||
getStats(): PoolStats {
|
||||
const stats: PoolStats = { totalArrays: 0, totalBytes: 0, pools: {} };
|
||||
|
||||
for (const [length, pool] of this.pools) {
|
||||
stats.pools[length] = pool.length;
|
||||
stats.totalArrays += pool.length;
|
||||
stats.totalBytes += pool.length * length * 4; // 4 bytes per float32
|
||||
}
|
||||
|
||||
return stats;
|
||||
}
|
||||
}
|
||||
|
||||
// Usage in embedder
|
||||
class PooledWasmEmbedder extends WasmEmbedder {
|
||||
private pool = new WasmMemoryPool();
|
||||
|
||||
async embed(text: string): Promise<Float32Array> {
|
||||
const result = await super.embed(text);
|
||||
|
||||
// Copy to pooled array
|
||||
const pooled = this.pool.acquire(result.length);
|
||||
pooled.set(result);
|
||||
|
||||
return pooled;
|
||||
}
|
||||
|
||||
releaseEmbedding(embedding: Float32Array): void {
|
||||
this.pool.release(embedding);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Performance Benchmarks
|
||||
|
||||
```typescript
|
||||
// Benchmark suite for runtime comparison
|
||||
class WasmBenchmarks {
|
||||
async runAll(): Promise<BenchmarkResults> {
|
||||
const results: BenchmarkResults = {};
|
||||
|
||||
// Embedding benchmarks
|
||||
results.embedSingle = await this.benchmarkEmbedSingle();
|
||||
results.embedBatch = await this.benchmarkEmbedBatch();
|
||||
|
||||
// HNSW benchmarks
|
||||
results.hnswInsert = await this.benchmarkHnswInsert();
|
||||
results.hnswSearch = await this.benchmarkHnswSearch();
|
||||
|
||||
// Vector operations
|
||||
results.distance = await this.benchmarkDistance();
|
||||
results.quantize = await this.benchmarkQuantize();
|
||||
|
||||
return results;
|
||||
}
|
||||
|
||||
private async benchmarkEmbedSingle(): Promise<BenchmarkResult> {
|
||||
const embedder = new WasmEmbedder({ modelPath: 'minilm-l6-v2' });
|
||||
await embedder.initialize();
|
||||
|
||||
const iterations = 100;
|
||||
const texts = Array(iterations).fill('This is a test sentence for embedding.');
|
||||
|
||||
const start = performance.now();
|
||||
for (const text of texts) {
|
||||
await embedder.embed(text);
|
||||
}
|
||||
const elapsed = performance.now() - start;
|
||||
|
||||
return {
|
||||
operation: 'embed_single',
|
||||
iterations,
|
||||
totalMs: elapsed,
|
||||
avgMs: elapsed / iterations,
|
||||
opsPerSecond: (iterations / elapsed) * 1000,
|
||||
};
|
||||
}
|
||||
|
||||
private async benchmarkHnswSearch(): Promise<BenchmarkResult> {
|
||||
const index = new WasmHnswIndex({
|
||||
dimensions: 384,
|
||||
m: 16,
|
||||
efConstruction: 100,
|
||||
efSearch: 50,
|
||||
distanceMetric: 'cosine',
|
||||
});
|
||||
await index.initialize();
|
||||
|
||||
// Insert 10k vectors
|
||||
for (let i = 0; i < 10000; i++) {
|
||||
await index.insert(`vec_${i}`, this.randomVector(384));
|
||||
}
|
||||
|
||||
const iterations = 1000;
|
||||
const query = this.randomVector(384);
|
||||
|
||||
const start = performance.now();
|
||||
for (let i = 0; i < iterations; i++) {
|
||||
await index.search(query, 10);
|
||||
}
|
||||
const elapsed = performance.now() - start;
|
||||
|
||||
return {
|
||||
operation: 'hnsw_search_10k',
|
||||
iterations,
|
||||
totalMs: elapsed,
|
||||
avgMs: elapsed / iterations,
|
||||
opsPerSecond: (iterations / elapsed) * 1000,
|
||||
};
|
||||
}
|
||||
|
||||
private randomVector(dim: number): Float32Array {
|
||||
const vec = new Float32Array(dim);
|
||||
for (let i = 0; i < dim; i++) {
|
||||
vec[i] = Math.random() * 2 - 1;
|
||||
}
|
||||
return vec;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
### Benefits
|
||||
|
||||
1. **Universal Deployment**: Same code runs everywhere (Node, Edge, Browser)
|
||||
2. **Performance**: Near-native performance for vector operations
|
||||
3. **Fallback Safety**: Always works even without WASM support
|
||||
4. **Memory Efficiency**: Pooling and proper cleanup prevent leaks
|
||||
5. **Model Portability**: ONNX models run in any environment
|
||||
|
||||
### Trade-offs
|
||||
|
||||
| Benefit | Trade-off |
|
||||
|---------|-----------|
|
||||
| Portability | Slight overhead vs pure native |
|
||||
| WASM safety | No direct memory access (by design) |
|
||||
| Model caching | Disk/Cache API storage needed |
|
||||
| Lazy loading | First-use latency for initialization |
|
||||
|
||||
---
|
||||
|
||||
## Related Decisions
|
||||
|
||||
- **ADR-001**: Architecture Overview
|
||||
- **ADR-003**: Persistence Layer (vector storage)
|
||||
- **ADR-007**: Learning System (pattern WASM modules)
|
||||
|
||||
---
|
||||
|
||||
## Revision History
|
||||
|
||||
| Version | Date | Author | Changes |
|
||||
|---------|------|--------|---------|
|
||||
| 1.0 | 2026-01-27 | RuVector Architecture Team | Initial version |
|
||||
1134
npm/packages/ruvbot/docs/adr/ADR-007-learning-system.md
Normal file
1134
npm/packages/ruvbot/docs/adr/ADR-007-learning-system.md
Normal file
File diff suppressed because it is too large
Load Diff
151
npm/packages/ruvbot/docs/adr/ADR-008-security-architecture.md
Normal file
151
npm/packages/ruvbot/docs/adr/ADR-008-security-architecture.md
Normal file
@@ -0,0 +1,151 @@
|
||||
# ADR-008: Security Architecture
|
||||
|
||||
## Status
|
||||
Accepted
|
||||
|
||||
## Date
|
||||
2026-01-27
|
||||
|
||||
## Context
|
||||
|
||||
RuvBot handles sensitive data including:
|
||||
- User conversations and personal information
|
||||
- API credentials for LLM providers
|
||||
- Integration tokens (Slack, Discord)
|
||||
- Vector embeddings that may encode sensitive content
|
||||
- Multi-tenant data requiring strict isolation
|
||||
|
||||
## Decision
|
||||
|
||||
### Security Layers
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Security Architecture │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 1: Transport Security │
|
||||
│ - TLS 1.3 for all connections │
|
||||
│ - Certificate pinning for external APIs │
|
||||
│ - HSTS enabled by default │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 2: Authentication │
|
||||
│ - JWT tokens with RS256 signing │
|
||||
│ - OAuth 2.0 for Slack/Discord │
|
||||
│ - API key authentication with rate limiting │
|
||||
│ - Session tokens with secure rotation │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 3: Authorization │
|
||||
│ - RBAC with claims-based permissions │
|
||||
│ - Tenant isolation at all layers │
|
||||
│ - Skill-level permission grants │
|
||||
│ - Resource-based access control │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 4: Data Protection │
|
||||
│ - AES-256-GCM for data at rest │
|
||||
│ - Field-level encryption for sensitive data │
|
||||
│ - Key rotation with envelope encryption │
|
||||
│ - Secure secret management │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 5: Input Validation │
|
||||
│ - Zod schema validation for all inputs │
|
||||
│ - SQL injection prevention (parameterized queries) │
|
||||
│ - XSS prevention (content sanitization) │
|
||||
│ - Path traversal prevention │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Layer 6: WASM Sandbox │
|
||||
│ - Memory isolation per operation │
|
||||
│ - Resource limits (CPU, memory) │
|
||||
│ - No filesystem access from WASM │
|
||||
│ - Controlled imports/exports │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Multi-Tenancy Security
|
||||
|
||||
```sql
|
||||
-- PostgreSQL Row-Level Security
|
||||
CREATE POLICY tenant_isolation ON memories
|
||||
USING (tenant_id = current_setting('app.current_tenant')::uuid);
|
||||
|
||||
CREATE POLICY tenant_isolation ON sessions
|
||||
USING (tenant_id = current_setting('app.current_tenant')::uuid);
|
||||
|
||||
CREATE POLICY tenant_isolation ON agents
|
||||
USING (tenant_id = current_setting('app.current_tenant')::uuid);
|
||||
```
|
||||
|
||||
### Secret Management
|
||||
|
||||
```typescript
|
||||
// Secrets are never logged or exposed
|
||||
interface SecretStore {
|
||||
get(key: string): Promise<string>;
|
||||
set(key: string, value: string, options?: SecretOptions): Promise<void>;
|
||||
rotate(key: string): Promise<void>;
|
||||
delete(key: string): Promise<void>;
|
||||
}
|
||||
|
||||
// Environment variable validation
|
||||
const requiredSecrets = z.object({
|
||||
ANTHROPIC_API_KEY: z.string().startsWith('sk-ant-'),
|
||||
SLACK_BOT_TOKEN: z.string().startsWith('xoxb-').optional(),
|
||||
DATABASE_URL: z.string().url().optional(),
|
||||
});
|
||||
```
|
||||
|
||||
### API Security
|
||||
|
||||
1. **Rate Limiting**: Per-tenant, per-endpoint limits
|
||||
2. **Request Signing**: HMAC-SHA256 for webhooks
|
||||
3. **IP Allowlisting**: Optional for enterprise
|
||||
4. **Audit Logging**: All security events logged
|
||||
|
||||
### Vulnerability Prevention
|
||||
|
||||
| CVE Category | Prevention |
|
||||
|--------------|------------|
|
||||
| Injection (SQL, NoSQL, Command) | Parameterized queries, input validation |
|
||||
| XSS | Content-Security-Policy, output encoding |
|
||||
| CSRF | SameSite cookies, origin validation |
|
||||
| SSRF | URL allowlisting, no user-controlled URLs |
|
||||
| Path Traversal | Path sanitization, chroot for file ops |
|
||||
| Sensitive Data Exposure | Encryption, minimal logging |
|
||||
| Broken Authentication | Secure session management |
|
||||
| Security Misconfiguration | Secure defaults, hardening guide |
|
||||
|
||||
### Compliance Readiness
|
||||
|
||||
- **GDPR**: Data export, deletion, consent tracking
|
||||
- **SOC 2**: Audit logging, access controls
|
||||
- **HIPAA**: Encryption, access logging (with configuration)
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Defense in depth provides multiple security layers
|
||||
- Multi-tenancy isolation prevents data leakage
|
||||
- Comprehensive input validation blocks injection attacks
|
||||
- WASM sandbox limits damage from malicious code
|
||||
|
||||
### Negative
|
||||
- Performance overhead from encryption/validation
|
||||
- Complexity in secret management
|
||||
- Additional testing required for security features
|
||||
|
||||
### Risks
|
||||
- Key management complexity
|
||||
- Potential for misconfiguration
|
||||
- Balance between security and usability
|
||||
|
||||
## Security Checklist
|
||||
|
||||
- [ ] TLS configured for all endpoints
|
||||
- [ ] API keys stored in secure vault
|
||||
- [ ] Rate limiting enabled
|
||||
- [ ] Audit logging configured
|
||||
- [ ] Input validation on all endpoints
|
||||
- [ ] SQL injection tests passing
|
||||
- [ ] XSS tests passing
|
||||
- [ ] CSRF protection enabled
|
||||
- [ ] Security headers configured
|
||||
- [ ] Dependency vulnerabilities scanned
|
||||
159
npm/packages/ruvbot/docs/adr/ADR-009-hybrid-search.md
Normal file
159
npm/packages/ruvbot/docs/adr/ADR-009-hybrid-search.md
Normal file
@@ -0,0 +1,159 @@
|
||||
# ADR-009: Hybrid Search Architecture
|
||||
|
||||
## Status
|
||||
Accepted (Implemented)
|
||||
|
||||
## Date
|
||||
2026-01-27
|
||||
|
||||
## Context
|
||||
|
||||
Clawdbot uses basic vector search with external embedding APIs. RuvBot improves on this with:
|
||||
- Local WASM embeddings (75x faster)
|
||||
- HNSW indexing (150x-12,500x faster)
|
||||
- Need for hybrid search combining vector + keyword (BM25)
|
||||
|
||||
## Decision
|
||||
|
||||
### Hybrid Search Pipeline
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ RuvBot Hybrid Search │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Query Input │
|
||||
│ └─ Text normalization │
|
||||
│ └─ Query embedding (WASM, <3ms) │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Parallel Search (Promise.all) │
|
||||
│ ├─ Vector Search (HNSW) ├─ Keyword Search (BM25) │
|
||||
│ │ └─ Cosine similarity │ └─ Inverted index │
|
||||
│ │ └─ Top-K candidates │ └─ IDF + TF scoring │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Result Fusion │
|
||||
│ └─ Reciprocal Rank Fusion (RRF) │
|
||||
│ └─ Linear combination │
|
||||
│ └─ Weighted average with presence bonus │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Post-Processing │
|
||||
│ └─ Score normalization (BM25 max-normalized) │
|
||||
│ └─ Matched term tracking │
|
||||
│ └─ Threshold filtering │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Implementation
|
||||
|
||||
Located in `/npm/packages/ruvbot/src/learning/search/`:
|
||||
- `HybridSearch.ts` - Main hybrid search coordinator
|
||||
- `BM25Index.ts` - BM25 keyword search implementation
|
||||
|
||||
### Configuration
|
||||
|
||||
```typescript
|
||||
interface HybridSearchConfig {
|
||||
vector: {
|
||||
enabled: boolean;
|
||||
weight: number; // 0.0-1.0, default: 0.7
|
||||
};
|
||||
keyword: {
|
||||
enabled: boolean;
|
||||
weight: number; // 0.0-1.0, default: 0.3
|
||||
k1?: number; // BM25 k1 parameter, default: 1.2
|
||||
b?: number; // BM25 b parameter, default: 0.75
|
||||
};
|
||||
fusion: {
|
||||
method: 'rrf' | 'linear' | 'weighted';
|
||||
k: number; // RRF constant, default: 60
|
||||
candidateMultiplier: number; // default: 3
|
||||
};
|
||||
}
|
||||
|
||||
interface HybridSearchOptions {
|
||||
topK?: number; // default: 10
|
||||
threshold?: number; // default: 0
|
||||
vectorOnly?: boolean;
|
||||
keywordOnly?: boolean;
|
||||
}
|
||||
|
||||
interface HybridSearchResult {
|
||||
id: string;
|
||||
vectorScore: number;
|
||||
keywordScore: number;
|
||||
fusedScore: number;
|
||||
matchedTerms?: string[];
|
||||
}
|
||||
```
|
||||
|
||||
### Fusion Methods
|
||||
|
||||
| Method | Algorithm | Best For |
|
||||
|--------|-----------|----------|
|
||||
| `rrf` | Reciprocal Rank Fusion: `1/(k + rank)` | General use, rank-based |
|
||||
| `linear` | `α·vectorScore + β·keywordScore` | Score-sensitive ranking |
|
||||
| `weighted` | Linear + 0.1 bonus for dual matches | Boosting exact matches |
|
||||
|
||||
### BM25 Implementation
|
||||
|
||||
```typescript
|
||||
interface BM25Config {
|
||||
k1: number; // Term frequency saturation (default: 1.2)
|
||||
b: number; // Document length normalization (default: 0.75)
|
||||
}
|
||||
```
|
||||
|
||||
Features:
|
||||
- Inverted index with document frequency tracking
|
||||
- Built-in stopword filtering (100+ common words)
|
||||
- Basic Porter-style stemming (ing, ed, es, s, ly, tion)
|
||||
- Average document length normalization
|
||||
|
||||
### Performance Targets
|
||||
|
||||
| Operation | Target | Achieved |
|
||||
|-----------|--------|----------|
|
||||
| Query embedding | <5ms | 2.7ms |
|
||||
| Vector search (100K) | <10ms | <5ms |
|
||||
| Keyword search | <20ms | <15ms |
|
||||
| Fusion | <5ms | <2ms |
|
||||
| Total hybrid | <40ms | <25ms |
|
||||
|
||||
### Usage Example
|
||||
|
||||
```typescript
|
||||
import { HybridSearch, createHybridSearch } from './learning/search';
|
||||
|
||||
// Create with custom config
|
||||
const search = createHybridSearch({
|
||||
vector: { enabled: true, weight: 0.7 },
|
||||
keyword: { enabled: true, weight: 0.3, k1: 1.2, b: 0.75 },
|
||||
fusion: { method: 'rrf', k: 60, candidateMultiplier: 3 },
|
||||
});
|
||||
|
||||
// Initialize with vector index and embedder
|
||||
search.initialize(vectorIndex, embedder);
|
||||
|
||||
// Add documents
|
||||
await search.add('doc1', 'Document content here');
|
||||
|
||||
// Search
|
||||
const results = await search.search('query text', { topK: 10 });
|
||||
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Better recall than vector-only search
|
||||
- Handles exact matches and semantic similarity
|
||||
- Maintains keyword search for debugging
|
||||
- Parallel search execution for low latency
|
||||
|
||||
### Negative
|
||||
- Slightly higher latency than vector-only
|
||||
- Requires maintaining both indices
|
||||
- More complex tuning
|
||||
|
||||
### Trade-offs
|
||||
- Weight tuning requires experimentation
|
||||
- Memory overhead for dual indices
|
||||
- BM25 stemming is basic (not full Porter algorithm)
|
||||
238
npm/packages/ruvbot/docs/adr/ADR-010-multi-channel.md
Normal file
238
npm/packages/ruvbot/docs/adr/ADR-010-multi-channel.md
Normal file
@@ -0,0 +1,238 @@
|
||||
# ADR-010: Multi-Channel Integration
|
||||
|
||||
## Status
|
||||
Accepted (Partially Implemented)
|
||||
|
||||
## Date
|
||||
2026-01-27
|
||||
|
||||
## Context
|
||||
|
||||
Clawdbot supports multiple messaging channels:
|
||||
- Slack, Discord, Telegram, Signal, WhatsApp, Line, iMessage
|
||||
- Web, CLI, API interfaces
|
||||
|
||||
RuvBot must match and exceed with:
|
||||
- All Clawdbot channels
|
||||
- Multi-tenant channel isolation
|
||||
- Unified message handling
|
||||
|
||||
## Decision
|
||||
|
||||
### Channel Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ RuvBot Channel Layer │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Channel Adapters │
|
||||
│ ├─ SlackAdapter : @slack/bolt [IMPLEMENTED] │
|
||||
│ ├─ DiscordAdapter : discord.js [IMPLEMENTED] │
|
||||
│ ├─ TelegramAdapter : telegraf [IMPLEMENTED] │
|
||||
│ ├─ SignalAdapter : signal-client [PLANNED] │
|
||||
│ ├─ WhatsAppAdapter : baileys [PLANNED] │
|
||||
│ ├─ LineAdapter : @line/bot-sdk [PLANNED] │
|
||||
│ ├─ WebAdapter : WebSocket + REST [PLANNED] │
|
||||
│ └─ CLIAdapter : readline + terminal [PLANNED] │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Message Normalization │
|
||||
│ └─ Unified Message format │
|
||||
│ └─ Attachment handling │
|
||||
│ └─ Thread/reply context │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Multi-Tenant Isolation │
|
||||
│ └─ Channel credentials per tenant │
|
||||
│ └─ Namespace isolation │
|
||||
│ └─ Rate limiting per tenant │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Implementation
|
||||
|
||||
Located in `/npm/packages/ruvbot/src/channels/`:
|
||||
- `ChannelRegistry.ts` - Central registry and routing
|
||||
- `adapters/BaseAdapter.ts` - Abstract base class
|
||||
- `adapters/SlackAdapter.ts` - Slack integration
|
||||
- `adapters/DiscordAdapter.ts` - Discord integration
|
||||
- `adapters/TelegramAdapter.ts` - Telegram integration
|
||||
|
||||
### Unified Message Interface
|
||||
|
||||
```typescript
|
||||
interface UnifiedMessage {
|
||||
id: string;
|
||||
channelId: string;
|
||||
channelType: ChannelType;
|
||||
tenantId: string;
|
||||
userId: string;
|
||||
username?: string;
|
||||
content: string;
|
||||
attachments?: Attachment[];
|
||||
threadId?: string;
|
||||
replyTo?: string;
|
||||
timestamp: Date;
|
||||
metadata: Record<string, unknown>;
|
||||
}
|
||||
|
||||
interface Attachment {
|
||||
id: string;
|
||||
type: 'image' | 'file' | 'audio' | 'video' | 'link';
|
||||
url?: string;
|
||||
data?: Buffer;
|
||||
mimeType?: string;
|
||||
filename?: string;
|
||||
size?: number;
|
||||
}
|
||||
|
||||
type ChannelType =
|
||||
| 'slack' | 'discord' | 'telegram'
|
||||
| 'signal' | 'whatsapp' | 'line'
|
||||
| 'imessage' | 'web' | 'api' | 'cli';
|
||||
```
|
||||
|
||||
### BaseAdapter Abstract Class
|
||||
|
||||
```typescript
|
||||
abstract class BaseAdapter {
|
||||
type: ChannelType;
|
||||
tenantId: string;
|
||||
enabled: boolean;
|
||||
|
||||
// Lifecycle
|
||||
abstract connect(): Promise<void>;
|
||||
abstract disconnect(): Promise<void>;
|
||||
|
||||
// Messaging
|
||||
abstract send(channelId: string, content: string, options?: SendOptions): Promise<string>;
|
||||
abstract reply(message: UnifiedMessage, content: string, options?: SendOptions): Promise<string>;
|
||||
|
||||
// Event handling
|
||||
onMessage(handler: MessageHandler): void;
|
||||
offMessage(handler: MessageHandler): void;
|
||||
getStatus(): AdapterStatus;
|
||||
}
|
||||
```
|
||||
|
||||
### Channel Registry
|
||||
|
||||
```typescript
|
||||
interface ChannelRegistry {
|
||||
// Registration
|
||||
register(adapter: BaseAdapter): void;
|
||||
unregister(type: ChannelType, tenantId: string): boolean;
|
||||
|
||||
// Lookup
|
||||
get(type: ChannelType, tenantId: string): BaseAdapter | undefined;
|
||||
getByType(type: ChannelType): BaseAdapter[];
|
||||
getByTenant(tenantId: string): BaseAdapter[];
|
||||
getAll(): BaseAdapter[];
|
||||
|
||||
// Lifecycle
|
||||
start(): Promise<void>;
|
||||
stop(): Promise<void>;
|
||||
|
||||
// Messaging
|
||||
onMessage(handler: MessageHandler): void;
|
||||
offMessage(handler: MessageHandler): void;
|
||||
broadcast(message: string, channelIds: string[], filter?: ChannelFilter): Promise<Map<string, string>>;
|
||||
|
||||
// Statistics
|
||||
getStats(): RegistryStats;
|
||||
}
|
||||
|
||||
interface ChannelRegistryConfig {
|
||||
defaultRateLimit?: {
|
||||
requests: number;
|
||||
windowMs: number;
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
### Adapter Configuration
|
||||
|
||||
```typescript
|
||||
interface AdapterConfig {
|
||||
type: ChannelType;
|
||||
tenantId: string;
|
||||
credentials: ChannelCredentials;
|
||||
enabled?: boolean;
|
||||
rateLimit?: {
|
||||
requests: number;
|
||||
windowMs: number;
|
||||
};
|
||||
}
|
||||
|
||||
interface ChannelCredentials {
|
||||
token?: string;
|
||||
apiKey?: string;
|
||||
webhookUrl?: string;
|
||||
clientId?: string;
|
||||
clientSecret?: string;
|
||||
botId?: string;
|
||||
[key: string]: unknown;
|
||||
}
|
||||
```
|
||||
|
||||
### Usage Example
|
||||
|
||||
```typescript
|
||||
import { ChannelRegistry, SlackAdapter, DiscordAdapter } from './channels';
|
||||
|
||||
// Create registry with rate limiting
|
||||
const registry = new ChannelRegistry({
|
||||
defaultRateLimit: { requests: 100, windowMs: 60000 }
|
||||
});
|
||||
|
||||
// Register adapters
|
||||
registry.register(new SlackAdapter({
|
||||
type: 'slack',
|
||||
tenantId: 'tenant-1',
|
||||
credentials: { token: process.env.SLACK_TOKEN }
|
||||
}));
|
||||
|
||||
registry.register(new DiscordAdapter({
|
||||
type: 'discord',
|
||||
tenantId: 'tenant-1',
|
||||
credentials: { token: process.env.DISCORD_TOKEN }
|
||||
}));
|
||||
|
||||
// Handle messages
|
||||
registry.onMessage(async (message) => {
|
||||
console.log(`[${message.channelType}] ${message.userId}: ${message.content}`);
|
||||
});
|
||||
|
||||
// Start all adapters
|
||||
await registry.start();
|
||||
```
|
||||
|
||||
## Implementation Status
|
||||
|
||||
| Adapter | Status | Library | Notes |
|
||||
|---------|--------|---------|-------|
|
||||
| Slack | Implemented | @slack/bolt | Full support |
|
||||
| Discord | Implemented | discord.js | Full support |
|
||||
| Telegram | Implemented | telegraf | Full support |
|
||||
| Signal | Planned | signal-client | Requires native deps |
|
||||
| WhatsApp | Planned | baileys | Unofficial API |
|
||||
| Line | Planned | @line/bot-sdk | - |
|
||||
| Web | Planned | WebSocket | Custom implementation |
|
||||
| CLI | Planned | readline | For testing |
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Unified message handling across all channels
|
||||
- Multi-tenant channel isolation with per-tenant indexing
|
||||
- Easy to add new channels via BaseAdapter
|
||||
- Built-in rate limiting per adapter
|
||||
|
||||
### Negative
|
||||
- Complexity of maintaining multiple integrations
|
||||
- Different channel capabilities (some don't support threads)
|
||||
- Only 3 of 8+ channels currently implemented
|
||||
|
||||
### RuvBot Advantages over Clawdbot
|
||||
- Multi-tenant channel credentials with isolation
|
||||
- Channel-specific rate limiting
|
||||
- Cross-channel message routing via broadcast
|
||||
- Adapter status tracking and statistics
|
||||
205
npm/packages/ruvbot/docs/adr/ADR-011-swarm-coordination.md
Normal file
205
npm/packages/ruvbot/docs/adr/ADR-011-swarm-coordination.md
Normal file
@@ -0,0 +1,205 @@
|
||||
# ADR-011: Swarm Coordination (agentic-flow Integration)
|
||||
|
||||
## Status
|
||||
Accepted (Implemented)
|
||||
|
||||
## Date
|
||||
2026-01-27
|
||||
|
||||
## Context
|
||||
|
||||
Clawdbot has basic async processing. RuvBot integrates agentic-flow patterns for:
|
||||
- Multi-agent swarm coordination
|
||||
- 12 specialized background workers
|
||||
- Byzantine fault-tolerant consensus
|
||||
- Dynamic topology switching
|
||||
|
||||
## Decision
|
||||
|
||||
### Swarm Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ RuvBot Swarm Coordination │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Topologies │
|
||||
│ ├─ hierarchical : Queen-worker (anti-drift) │
|
||||
│ ├─ mesh : Peer-to-peer network │
|
||||
│ ├─ hierarchical-mesh : Hybrid for scalability │
|
||||
│ └─ adaptive : Dynamic switching │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Consensus Protocols │
|
||||
│ ├─ byzantine : BFT (f < n/3 faulty) │
|
||||
│ ├─ raft : Leader-based (f < n/2) │
|
||||
│ ├─ gossip : Eventually consistent │
|
||||
│ └─ crdt : Conflict-free replication │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Background Workers (12) │
|
||||
│ ├─ ultralearn [normal] : Deep knowledge acquisition │
|
||||
│ ├─ optimize [high] : Performance optimization │
|
||||
│ ├─ consolidate [low] : Memory consolidation (EWC++) │
|
||||
│ ├─ predict [normal] : Predictive preloading │
|
||||
│ ├─ audit [critical] : Security analysis │
|
||||
│ ├─ map [normal] : Codebase mapping │
|
||||
│ ├─ preload [low] : Resource preloading │
|
||||
│ ├─ deepdive [normal] : Deep code analysis │
|
||||
│ ├─ document [normal] : Auto-documentation │
|
||||
│ ├─ refactor [normal] : Refactoring suggestions │
|
||||
│ ├─ benchmark [normal] : Performance benchmarking │
|
||||
│ └─ testgaps [normal] : Test coverage analysis │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Implementation
|
||||
|
||||
Located in `/npm/packages/ruvbot/src/swarm/`:
|
||||
- `SwarmCoordinator.ts` - Main coordinator with task dispatch
|
||||
- `ByzantineConsensus.ts` - PBFT-style consensus implementation
|
||||
|
||||
### SwarmCoordinator
|
||||
|
||||
```typescript
|
||||
interface SwarmConfig {
|
||||
topology: SwarmTopology; // 'hierarchical' | 'mesh' | 'hierarchical-mesh' | 'adaptive'
|
||||
maxAgents: number; // default: 8
|
||||
strategy: 'specialized' | 'balanced' | 'adaptive';
|
||||
consensus: ConsensusProtocol; // 'byzantine' | 'raft' | 'gossip' | 'crdt'
|
||||
heartbeatInterval?: number; // default: 5000ms
|
||||
taskTimeout?: number; // default: 60000ms
|
||||
}
|
||||
|
||||
interface SwarmTask {
|
||||
id: string;
|
||||
worker: WorkerType;
|
||||
type: string;
|
||||
content: unknown;
|
||||
priority: WorkerPriority;
|
||||
status: 'pending' | 'running' | 'completed' | 'failed';
|
||||
assignedAgent?: string;
|
||||
result?: unknown;
|
||||
error?: string;
|
||||
createdAt: Date;
|
||||
startedAt?: Date;
|
||||
completedAt?: Date;
|
||||
}
|
||||
|
||||
interface SwarmAgent {
|
||||
id: string;
|
||||
type: WorkerType;
|
||||
status: 'idle' | 'busy' | 'offline';
|
||||
currentTask?: string;
|
||||
completedTasks: number;
|
||||
failedTasks: number;
|
||||
lastHeartbeat: Date;
|
||||
}
|
||||
```
|
||||
|
||||
### Worker Configuration
|
||||
|
||||
```typescript
|
||||
const WORKER_DEFAULTS: Record<WorkerType, WorkerConfig> = {
|
||||
ultralearn: { priority: 'normal', concurrency: 2, timeout: 60000, retries: 3, backoff: 'exponential' },
|
||||
optimize: { priority: 'high', concurrency: 4, timeout: 30000, retries: 2, backoff: 'exponential' },
|
||||
consolidate: { priority: 'low', concurrency: 1, timeout: 120000, retries: 1, backoff: 'linear' },
|
||||
predict: { priority: 'normal', concurrency: 2, timeout: 15000, retries: 2, backoff: 'exponential' },
|
||||
audit: { priority: 'critical', concurrency: 1, timeout: 45000, retries: 3, backoff: 'exponential' },
|
||||
map: { priority: 'normal', concurrency: 2, timeout: 60000, retries: 2, backoff: 'linear' },
|
||||
preload: { priority: 'low', concurrency: 4, timeout: 10000, retries: 1, backoff: 'linear' },
|
||||
deepdive: { priority: 'normal', concurrency: 2, timeout: 90000, retries: 2, backoff: 'exponential' },
|
||||
document: { priority: 'normal', concurrency: 2, timeout: 30000, retries: 2, backoff: 'linear' },
|
||||
refactor: { priority: 'normal', concurrency: 2, timeout: 60000, retries: 2, backoff: 'exponential' },
|
||||
benchmark: { priority: 'normal', concurrency: 1, timeout: 120000, retries: 1, backoff: 'linear' },
|
||||
testgaps: { priority: 'normal', concurrency: 2, timeout: 45000, retries: 2, backoff: 'linear' },
|
||||
};
|
||||
```
|
||||
|
||||
### ByzantineConsensus (PBFT)
|
||||
|
||||
```typescript
|
||||
interface ConsensusConfig {
|
||||
replicas: number; // Total number of replicas (default: 5)
|
||||
timeout: number; // Timeout per phase (default: 30000ms)
|
||||
retries: number; // Retries before failing (default: 3)
|
||||
requireSignatures: boolean;
|
||||
}
|
||||
|
||||
// Fault tolerance: f < n/3
|
||||
// Quorum size: ceil(2n/3)
|
||||
```
|
||||
|
||||
**Phases:**
|
||||
1. `pre-prepare` - Leader broadcasts proposal
|
||||
2. `prepare` - Replicas validate and send prepare messages
|
||||
3. `commit` - Wait for quorum of commit messages
|
||||
4. `decided` - Consensus reached
|
||||
5. `failed` - Consensus failed (timeout/Byzantine fault)
|
||||
|
||||
### Usage Example
|
||||
|
||||
```typescript
|
||||
import { SwarmCoordinator, ByzantineConsensus } from './swarm';
|
||||
|
||||
// Initialize swarm
|
||||
const swarm = new SwarmCoordinator({
|
||||
topology: 'hierarchical',
|
||||
maxAgents: 8,
|
||||
strategy: 'specialized',
|
||||
consensus: 'raft'
|
||||
});
|
||||
|
||||
await swarm.start();
|
||||
|
||||
// Spawn specialized agents
|
||||
await swarm.spawnAgent('ultralearn');
|
||||
await swarm.spawnAgent('optimize');
|
||||
|
||||
// Dispatch task
|
||||
const task = await swarm.dispatch({
|
||||
worker: 'ultralearn',
|
||||
task: { type: 'deep-analysis', content: 'analyze this' },
|
||||
priority: 'normal'
|
||||
});
|
||||
|
||||
// Wait for completion
|
||||
const result = await swarm.waitForTask(task.id);
|
||||
|
||||
// Byzantine consensus for critical decisions
|
||||
const consensus = new ByzantineConsensus({ replicas: 5, timeout: 30000 });
|
||||
consensus.initializeReplicas(['node1', 'node2', 'node3', 'node4', 'node5']);
|
||||
const decision = await consensus.propose({ action: 'deploy', version: '1.0.0' });
|
||||
```
|
||||
|
||||
### Events
|
||||
|
||||
SwarmCoordinator emits:
|
||||
- `started`, `stopped`
|
||||
- `agent:spawned`, `agent:removed`, `agent:offline`
|
||||
- `task:created`, `task:assigned`, `task:completed`, `task:failed`
|
||||
|
||||
ByzantineConsensus emits:
|
||||
- `proposal:created`
|
||||
- `phase:pre-prepare`, `phase:prepare`, `phase:commit`
|
||||
- `vote:received`
|
||||
- `consensus:decided`, `consensus:failed`, `consensus:no-quorum`
|
||||
- `replica:faulty`, `view:changed`
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Distributed task execution with priority queues
|
||||
- Fault tolerance via PBFT consensus
|
||||
- Specialized workers for different task types
|
||||
- Heartbeat-based health monitoring
|
||||
- Event-driven architecture
|
||||
|
||||
### Negative
|
||||
- Coordination overhead
|
||||
- Complexity of distributed systems
|
||||
- Memory overhead for task/agent tracking
|
||||
|
||||
### RuvBot Advantages over Clawdbot
|
||||
- 12 specialized workers vs basic async
|
||||
- Byzantine fault tolerance vs none
|
||||
- Multi-topology support vs single-threaded
|
||||
- Learning workers (ultralearn, consolidate) vs static
|
||||
- Priority-based task scheduling
|
||||
376
npm/packages/ruvbot/docs/adr/ADR-012-llm-providers.md
Normal file
376
npm/packages/ruvbot/docs/adr/ADR-012-llm-providers.md
Normal file
@@ -0,0 +1,376 @@
|
||||
# ADR-012: LLM Provider Integration
|
||||
|
||||
## Status
|
||||
Accepted (Implemented)
|
||||
|
||||
## Date
|
||||
2026-01-27
|
||||
|
||||
## Context
|
||||
|
||||
RuvBot requires LLM capabilities for:
|
||||
- Conversational AI responses
|
||||
- Reasoning and analysis tasks
|
||||
- Tool/function calling
|
||||
- Streaming responses for real-time UX
|
||||
|
||||
The system needs to support multiple providers to:
|
||||
- Allow cost optimization (use cheaper models for simple tasks)
|
||||
- Provide fallback options
|
||||
- Access specialized models (reasoning models like QwQ, O1, DeepSeek R1)
|
||||
- Support both direct API access and unified gateways
|
||||
|
||||
## Decision
|
||||
|
||||
### Provider Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ RuvBot LLM Provider Layer │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Provider Interface │
|
||||
│ └─ LLMProvider (abstract interface) │
|
||||
│ ├─ complete() - Single completion │
|
||||
│ ├─ stream() - Streaming completion (AsyncGenerator) │
|
||||
│ ├─ countTokens() - Token estimation │
|
||||
│ ├─ getModel() - Model info │
|
||||
│ └─ isHealthy() - Health check │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Implementations │
|
||||
│ ├─ AnthropicProvider : Direct Anthropic API │
|
||||
│ │ └─ Claude 4, 3.5, 3 models │
|
||||
│ └─ OpenRouterProvider : Multi-model gateway │
|
||||
│ ├─ Qwen QwQ (reasoning) │
|
||||
│ ├─ DeepSeek R1 (reasoning) │
|
||||
│ ├─ Claude via OpenRouter │
|
||||
│ ├─ GPT-4, O1 via OpenRouter │
|
||||
│ └─ Gemini, Llama via OpenRouter │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Features │
|
||||
│ ├─ Tool/Function calling │
|
||||
│ ├─ Streaming with token callbacks │
|
||||
│ ├─ Automatic retry with backoff │
|
||||
│ └─ Token counting │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Implementation
|
||||
|
||||
Located in `/npm/packages/ruvbot/src/integration/providers/`:
|
||||
- `index.ts` - Interface definitions and exports
|
||||
- `AnthropicProvider.ts` - Anthropic Claude integration
|
||||
- `OpenRouterProvider.ts` - OpenRouter multi-model gateway
|
||||
|
||||
### LLMProvider Interface
|
||||
|
||||
```typescript
|
||||
interface LLMProvider {
|
||||
complete(messages: Message[], options?: CompletionOptions): Promise<Completion>;
|
||||
stream(messages: Message[], options?: StreamOptions): AsyncGenerator<Token, Completion, void>;
|
||||
countTokens(text: string): Promise<number>;
|
||||
getModel(): ModelInfo;
|
||||
isHealthy(): Promise<boolean>;
|
||||
}
|
||||
|
||||
interface Message {
|
||||
role: 'user' | 'assistant' | 'system';
|
||||
content: string;
|
||||
}
|
||||
|
||||
interface CompletionOptions {
|
||||
maxTokens?: number;
|
||||
temperature?: number; // 0.0-2.0
|
||||
topP?: number; // 0.0-1.0
|
||||
stopSequences?: string[];
|
||||
tools?: Tool[];
|
||||
}
|
||||
|
||||
interface StreamOptions extends CompletionOptions {
|
||||
onToken?: (token: string) => void;
|
||||
}
|
||||
|
||||
interface Completion {
|
||||
content: string;
|
||||
finishReason: 'stop' | 'length' | 'tool_use';
|
||||
usage: {
|
||||
inputTokens: number;
|
||||
outputTokens: number;
|
||||
};
|
||||
toolCalls?: ToolCall[];
|
||||
}
|
||||
|
||||
interface Token {
|
||||
type: 'text' | 'tool_use';
|
||||
text?: string;
|
||||
toolUse?: ToolCall;
|
||||
}
|
||||
```
|
||||
|
||||
### Tool/Function Calling
|
||||
|
||||
```typescript
|
||||
interface Tool {
|
||||
name: string;
|
||||
description: string;
|
||||
parameters: Record<string, unknown>; // JSON Schema
|
||||
}
|
||||
|
||||
interface ToolCall {
|
||||
id: string;
|
||||
name: string;
|
||||
input: Record<string, unknown>;
|
||||
}
|
||||
```
|
||||
|
||||
### AnthropicProvider
|
||||
|
||||
Direct integration with Anthropic's Claude API.
|
||||
|
||||
```typescript
|
||||
interface AnthropicConfig {
|
||||
apiKey: string;
|
||||
baseUrl?: string; // default: 'https://api.anthropic.com'
|
||||
model?: string; // default: 'claude-3-5-sonnet-20241022'
|
||||
maxRetries?: number; // default: 3
|
||||
timeout?: number; // default: 60000ms
|
||||
}
|
||||
|
||||
type AnthropicModel =
|
||||
| 'claude-opus-4-20250514'
|
||||
| 'claude-sonnet-4-20250514'
|
||||
| 'claude-3-5-sonnet-20241022'
|
||||
| 'claude-3-5-haiku-20241022'
|
||||
| 'claude-3-opus-20240229'
|
||||
| 'claude-3-sonnet-20240229'
|
||||
| 'claude-3-haiku-20240307';
|
||||
```
|
||||
|
||||
**Model Specifications:**
|
||||
|
||||
| Model | Max Tokens | Context Window | Best For |
|
||||
|-------|------------|----------------|----------|
|
||||
| claude-opus-4-20250514 | 32,768 | 200,000 | Complex reasoning, analysis |
|
||||
| claude-sonnet-4-20250514 | 16,384 | 200,000 | Balanced performance |
|
||||
| claude-3-5-sonnet-20241022 | 8,192 | 200,000 | General purpose |
|
||||
| claude-3-5-haiku-20241022 | 8,192 | 200,000 | Fast, cost-effective |
|
||||
| claude-3-opus-20240229 | 4,096 | 200,000 | Complex tasks |
|
||||
| claude-3-sonnet-20240229 | 4,096 | 200,000 | Balanced |
|
||||
| claude-3-haiku-20240307 | 4,096 | 200,000 | Fast responses |
|
||||
|
||||
**Usage:**
|
||||
|
||||
```typescript
|
||||
import { createAnthropicProvider } from './integration/providers';
|
||||
|
||||
const provider = createAnthropicProvider({
|
||||
apiKey: process.env.ANTHROPIC_API_KEY!,
|
||||
model: 'claude-3-5-sonnet-20241022',
|
||||
});
|
||||
|
||||
// Simple completion
|
||||
const response = await provider.complete([
|
||||
{ role: 'user', content: 'Hello!' }
|
||||
]);
|
||||
|
||||
// Streaming
|
||||
for await (const token of provider.stream(messages)) {
|
||||
if (token.type === 'text') {
|
||||
process.stdout.write(token.text!);
|
||||
}
|
||||
}
|
||||
|
||||
// With tools
|
||||
const toolResponse = await provider.complete(messages, {
|
||||
tools: [{
|
||||
name: 'get_weather',
|
||||
description: 'Get weather for a location',
|
||||
parameters: {
|
||||
type: 'object',
|
||||
properties: {
|
||||
location: { type: 'string' }
|
||||
}
|
||||
}
|
||||
}]
|
||||
});
|
||||
```
|
||||
|
||||
### OpenRouterProvider
|
||||
|
||||
Access to 100+ models through OpenRouter's unified API.
|
||||
|
||||
```typescript
|
||||
interface OpenRouterConfig {
|
||||
apiKey: string;
|
||||
baseUrl?: string; // default: 'https://openrouter.ai/api'
|
||||
model?: string; // default: 'qwen/qwq-32b'
|
||||
siteUrl?: string; // For attribution
|
||||
siteName?: string; // default: 'RuvBot'
|
||||
maxRetries?: number; // default: 3
|
||||
timeout?: number; // default: 120000ms (longer for reasoning)
|
||||
}
|
||||
|
||||
type OpenRouterModel =
|
||||
// Reasoning Models
|
||||
| 'qwen/qwq-32b'
|
||||
| 'qwen/qwq-32b:free'
|
||||
| 'openai/o1-preview'
|
||||
| 'openai/o1-mini'
|
||||
| 'deepseek/deepseek-r1'
|
||||
// Standard Models
|
||||
| 'anthropic/claude-3.5-sonnet'
|
||||
| 'openai/gpt-4o'
|
||||
| 'google/gemini-pro-1.5'
|
||||
| 'meta-llama/llama-3.1-405b-instruct'
|
||||
| string; // Any OpenRouter model
|
||||
```
|
||||
|
||||
**Reasoning Model Specifications:**
|
||||
|
||||
| Model | Max Tokens | Context | Special Features |
|
||||
|-------|------------|---------|------------------|
|
||||
| qwen/qwq-32b | 16,384 | 32,768 | Chain-of-thought reasoning |
|
||||
| qwen/qwq-32b:free | 16,384 | 32,768 | Free tier available |
|
||||
| openai/o1-preview | 32,768 | 128,000 | Advanced reasoning |
|
||||
| openai/o1-mini | 65,536 | 128,000 | Faster reasoning |
|
||||
| deepseek/deepseek-r1 | 8,192 | 64,000 | Open-source reasoning |
|
||||
|
||||
**Usage:**
|
||||
|
||||
```typescript
|
||||
import {
|
||||
createOpenRouterProvider,
|
||||
createQwQProvider,
|
||||
createDeepSeekR1Provider
|
||||
} from './integration/providers';
|
||||
|
||||
// General OpenRouter
|
||||
const provider = createOpenRouterProvider({
|
||||
apiKey: process.env.OPENROUTER_API_KEY!,
|
||||
model: 'qwen/qwq-32b',
|
||||
});
|
||||
|
||||
// Convenience: QwQ reasoning model
|
||||
const qwq = createQwQProvider(process.env.OPENROUTER_API_KEY!, false);
|
||||
|
||||
// Convenience: Free QwQ
|
||||
const qwqFree = createQwQProvider(process.env.OPENROUTER_API_KEY!, true);
|
||||
|
||||
// Convenience: DeepSeek R1
|
||||
const deepseek = createDeepSeekR1Provider(process.env.OPENROUTER_API_KEY!);
|
||||
|
||||
// List available models
|
||||
const models = await provider.listModels();
|
||||
```
|
||||
|
||||
### Configuration Options
|
||||
|
||||
**Environment Variables:**
|
||||
|
||||
```bash
|
||||
# Anthropic
|
||||
ANTHROPIC_API_KEY=sk-ant-...
|
||||
|
||||
# OpenRouter
|
||||
OPENROUTER_API_KEY=sk-or-...
|
||||
```
|
||||
|
||||
**Rate Limiting:**
|
||||
- Both providers use native fetch with `AbortSignal.timeout()`
|
||||
- Anthropic: 60s default timeout
|
||||
- OpenRouter: 120s default timeout (for reasoning models)
|
||||
|
||||
**Retry Strategy:**
|
||||
- Default: 3 retries
|
||||
- Backoff: Not implemented in base (use with retry libraries)
|
||||
|
||||
### Performance Benchmarks
|
||||
|
||||
| Operation | Anthropic | OpenRouter |
|
||||
|-----------|-----------|------------|
|
||||
| Cold start | ~500ms | ~800ms |
|
||||
| Token latency (first) | ~200ms | ~300ms |
|
||||
| Throughput (tokens/s) | ~50 | ~40 |
|
||||
| Tool call parsing | <10ms | <10ms |
|
||||
|
||||
### Error Handling
|
||||
|
||||
```typescript
|
||||
try {
|
||||
const response = await provider.complete(messages);
|
||||
} catch (error) {
|
||||
if (error.message.includes('API error: 429')) {
|
||||
// Rate limited - implement backoff
|
||||
} else if (error.message.includes('API error: 401')) {
|
||||
// Invalid API key
|
||||
} else if (error.message.includes('timeout')) {
|
||||
// Request timed out
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Usage Patterns
|
||||
|
||||
**Model Routing by Task Complexity:**
|
||||
|
||||
```typescript
|
||||
function selectProvider(taskComplexity: 'simple' | 'medium' | 'complex' | 'reasoning') {
|
||||
switch (taskComplexity) {
|
||||
case 'simple':
|
||||
return createAnthropicProvider({ apiKey, model: 'claude-3-5-haiku-20241022' });
|
||||
case 'medium':
|
||||
return createAnthropicProvider({ apiKey, model: 'claude-3-5-sonnet-20241022' });
|
||||
case 'complex':
|
||||
return createAnthropicProvider({ apiKey, model: 'claude-opus-4-20250514' });
|
||||
case 'reasoning':
|
||||
return createQwQProvider(openRouterApiKey);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Fallback Chain:**
|
||||
|
||||
```typescript
|
||||
async function completeWithFallback(messages: Message[]) {
|
||||
const providers = [
|
||||
createAnthropicProvider({ apiKey, model: 'claude-3-5-sonnet-20241022' }),
|
||||
createOpenRouterProvider({ apiKey: orKey, model: 'anthropic/claude-3.5-sonnet' }),
|
||||
createQwQProvider(orKey, true), // Free fallback
|
||||
];
|
||||
|
||||
for (const provider of providers) {
|
||||
try {
|
||||
if (await provider.isHealthy()) {
|
||||
return await provider.complete(messages);
|
||||
}
|
||||
} catch (error) {
|
||||
console.warn(`Provider failed, trying next:`, error);
|
||||
}
|
||||
}
|
||||
throw new Error('All providers failed');
|
||||
}
|
||||
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Unified interface for multiple LLM providers
|
||||
- Access to 100+ models through OpenRouter
|
||||
- Native streaming support with token callbacks
|
||||
- Tool/function calling support
|
||||
- Easy provider switching for cost optimization
|
||||
|
||||
### Negative
|
||||
- Token counting is approximate (not tiktoken-based)
|
||||
- No built-in retry with exponential backoff
|
||||
- System messages handled differently by providers
|
||||
|
||||
### Trade-offs
|
||||
- OpenRouter adds latency vs direct API calls
|
||||
- Reasoning models (QwQ, O1) have longer timeouts
|
||||
- Free tiers have rate limits and quotas
|
||||
|
||||
### RuvBot Advantages
|
||||
- Multi-provider support vs single provider
|
||||
- Reasoning model access (QwQ, DeepSeek R1, O1)
|
||||
- Factory functions for common configurations
|
||||
- Streaming with async generators
|
||||
263
npm/packages/ruvbot/docs/adr/ADR-013-gcp-deployment.md
Normal file
263
npm/packages/ruvbot/docs/adr/ADR-013-gcp-deployment.md
Normal file
@@ -0,0 +1,263 @@
|
||||
# ADR-013: Google Cloud Platform Deployment Architecture
|
||||
|
||||
## Status
|
||||
Accepted
|
||||
|
||||
## Date
|
||||
2026-01-27
|
||||
|
||||
## Context
|
||||
|
||||
RuvBot needs a production-ready deployment option that:
|
||||
1. Minimizes operational costs for low-traffic scenarios
|
||||
2. Scales automatically with demand
|
||||
3. Provides persistence for sessions, memory, and learning data
|
||||
4. Secures API keys and credentials
|
||||
5. Supports multi-tenant deployments
|
||||
|
||||
## Decision
|
||||
|
||||
Deploy RuvBot on Google Cloud Platform using serverless and managed services optimized for cost.
|
||||
|
||||
### Architecture Overview
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ Google Cloud Platform │
|
||||
├─────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ Cloud │ │ Cloud │ │ Cloud │ │
|
||||
│ │ Build │───▶│ Registry │───▶│ Run │ │
|
||||
│ │ (CI/CD) │ │ (Images) │ │ (App) │ │
|
||||
│ └──────────────┘ └──────────────┘ └──────┬───────┘ │
|
||||
│ │ │
|
||||
│ ┌────────────────────────────┼────────────────────────┐ │
|
||||
│ │ │ │ │
|
||||
│ ┌──────▼──────┐ ┌────────────────▼───────────┐ │ │
|
||||
│ │ Secret │ │ Cloud SQL │ │ │
|
||||
│ │ Manager │ │ (PostgreSQL) │ │ │
|
||||
│ │ │ │ db-f1-micro │ │ │
|
||||
│ └─────────────┘ └────────────────────────────┘ │ │
|
||||
│ │ │
|
||||
│ ┌─────────────┐ ┌────────────────────────────┐ │ │
|
||||
│ │ Cloud │ │ Memorystore │ │ │
|
||||
│ │ Storage │ │ (Redis) - Optional │ │ │
|
||||
│ │ (Files) │ │ Basic tier │ │ │
|
||||
│ └─────────────┘ └────────────────────────────┘ │ │
|
||||
│ │ │
|
||||
│ └────────────────────────────────────────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Cost Optimization Strategy
|
||||
|
||||
| Service | Configuration | Monthly Cost | Notes |
|
||||
|---------|--------------|--------------|-------|
|
||||
| Cloud Run | 0-10 instances, 512Mi RAM | ~$0-5 | Free tier: 2M requests |
|
||||
| Cloud SQL | db-f1-micro, 10GB SSD | ~$10-15 | Smallest instance |
|
||||
| Secret Manager | 3-5 secrets | ~$0.18 | $0.06/secret/month |
|
||||
| Cloud Storage | Standard, lifecycle policies | ~$0.02/GB | Auto-tiering |
|
||||
| Cloud Build | Free tier | ~$0 | 120 min/day free |
|
||||
| **Total (low traffic)** | | **~$15-20/month** | |
|
||||
|
||||
### Service Configuration
|
||||
|
||||
#### Cloud Run (Compute)
|
||||
|
||||
```yaml
|
||||
# Serverless container configuration
|
||||
resources:
|
||||
cpu: "1"
|
||||
memory: "512Mi"
|
||||
scaling:
|
||||
minInstances: 0 # Scale to zero when idle
|
||||
maxInstances: 10 # Limit for cost control
|
||||
concurrency: 80 # Requests per instance
|
||||
features:
|
||||
cpuIdle: true # Reduce CPU when idle (cost savings)
|
||||
startupCpuBoost: true # Faster cold starts
|
||||
timeout: 300s # 5 minutes for long operations
|
||||
```
|
||||
|
||||
#### Cloud SQL (Database)
|
||||
|
||||
```hcl
|
||||
# Cost-optimized PostgreSQL
|
||||
tier = "db-f1-micro" # 0.6GB RAM, shared CPU
|
||||
disk_size = 10 # Minimum SSD
|
||||
availability = "ZONAL" # Single zone (cheaper)
|
||||
backup_retention = 7 # 7 days
|
||||
|
||||
# Extensions enabled
|
||||
- uuid-ossp # UUID generation
|
||||
- pgcrypto # Cryptographic functions
|
||||
- pg_trgm # Text search (trigram similarity)
|
||||
```
|
||||
|
||||
#### Secret Manager
|
||||
|
||||
Securely stores:
|
||||
- `anthropic-api-key` - Anthropic API credentials
|
||||
- `openrouter-api-key` - OpenRouter API credentials
|
||||
- `database-url` - PostgreSQL connection string
|
||||
|
||||
#### Cloud Storage
|
||||
|
||||
```hcl
|
||||
# Automatic cost optimization
|
||||
lifecycle_rules = [
|
||||
{ age = 30, action = "SetStorageClass", class = "NEARLINE" },
|
||||
{ age = 90, action = "SetStorageClass", class = "COLDLINE" }
|
||||
]
|
||||
```
|
||||
|
||||
### Deployment Options
|
||||
|
||||
#### Option 1: Quick Deploy (gcloud CLI)
|
||||
|
||||
```bash
|
||||
# Set environment variables
|
||||
export ANTHROPIC_API_KEY="sk-ant-..."
|
||||
export PROJECT_ID="my-project"
|
||||
|
||||
# Run deployment script
|
||||
./deploy/gcp/deploy.sh --project-id $PROJECT_ID
|
||||
```
|
||||
|
||||
#### Option 2: Infrastructure as Code (Terraform)
|
||||
|
||||
```bash
|
||||
cd deploy/gcp/terraform
|
||||
|
||||
terraform init
|
||||
terraform plan -var="project_id=my-project" -var="anthropic_api_key=sk-ant-..."
|
||||
terraform apply
|
||||
```
|
||||
|
||||
#### Option 3: CI/CD (Cloud Build)
|
||||
|
||||
```yaml
|
||||
# Trigger on push to main branch
|
||||
trigger:
|
||||
branch: main
|
||||
included_files:
|
||||
- "npm/packages/ruvbot/**"
|
||||
|
||||
# cloudbuild.yaml handles build and deploy
|
||||
```
|
||||
|
||||
### Multi-Tenant Configuration
|
||||
|
||||
For multiple tenants:
|
||||
|
||||
```hcl
|
||||
# Separate Cloud SQL databases
|
||||
resource "google_sql_database" "tenant" {
|
||||
for_each = var.tenants
|
||||
name = "ruvbot_${each.key}"
|
||||
instance = google_sql_database_instance.ruvbot.name
|
||||
}
|
||||
|
||||
# Row-Level Security in PostgreSQL
|
||||
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
|
||||
CREATE POLICY tenant_isolation ON sessions
|
||||
USING (tenant_id = current_setting('app.tenant_id')::uuid);
|
||||
```
|
||||
|
||||
### Scaling Considerations
|
||||
|
||||
| Traffic Level | Cloud Run Instances | Cloud SQL | Estimated Cost |
|
||||
|---------------|---------------------|-----------|----------------|
|
||||
| Low (<1K req/day) | 0-1 | db-f1-micro | ~$15/month |
|
||||
| Medium (<10K req/day) | 1-3 | db-g1-small | ~$40/month |
|
||||
| High (<100K req/day) | 3-10 | db-custom | ~$150/month |
|
||||
| Enterprise | 10-100 | Regional HA | ~$500+/month |
|
||||
|
||||
### Security Configuration
|
||||
|
||||
```hcl
|
||||
# Service account with minimal permissions
|
||||
roles = [
|
||||
"roles/secretmanager.secretAccessor",
|
||||
"roles/cloudsql.client",
|
||||
"roles/storage.objectAdmin",
|
||||
"roles/logging.logWriter",
|
||||
"roles/monitoring.metricWriter",
|
||||
]
|
||||
|
||||
# Network security
|
||||
ip_configuration {
|
||||
ipv4_enabled = false # Production: use private IP
|
||||
private_network = google_compute_network.vpc.id
|
||||
}
|
||||
```
|
||||
|
||||
### Health Monitoring
|
||||
|
||||
```yaml
|
||||
# Cloud Run health checks
|
||||
startup_probe:
|
||||
http_get:
|
||||
path: /health
|
||||
port: 8080
|
||||
initial_delay_seconds: 5
|
||||
timeout_seconds: 3
|
||||
period_seconds: 10
|
||||
|
||||
liveness_probe:
|
||||
http_get:
|
||||
path: /health
|
||||
port: 8080
|
||||
timeout_seconds: 3
|
||||
period_seconds: 30
|
||||
```
|
||||
|
||||
### File Structure
|
||||
|
||||
```
|
||||
deploy/
|
||||
├── gcp/
|
||||
│ ├── cloudbuild.yaml # CI/CD pipeline
|
||||
│ ├── deploy.sh # Quick deployment script
|
||||
│ └── terraform/
|
||||
│ └── main.tf # Infrastructure as code
|
||||
├── init-db.sql # Database schema
|
||||
├── Dockerfile # Container image
|
||||
└── docker-compose.yml # Local development
|
||||
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- **Cost-effective**: ~$15-20/month for low traffic
|
||||
- **Serverless**: Scale to zero when not in use
|
||||
- **Managed services**: No infrastructure maintenance
|
||||
- **Security**: Secret Manager, IAM, VPC support
|
||||
- **Observability**: Built-in logging and monitoring
|
||||
|
||||
### Negative
|
||||
- **Cold starts**: First request after idle ~2-3 seconds
|
||||
- **Vendor lock-in**: GCP-specific services
|
||||
- **Complexity**: Multiple services to configure
|
||||
|
||||
### Trade-offs
|
||||
- **Cloud SQL vs Firestore**: SQL chosen for complex queries, Row-Level Security
|
||||
- **Cloud Run vs GKE**: Run chosen for simplicity, lower cost
|
||||
- **db-f1-micro vs larger**: Cost vs performance trade-off
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
| Option | Pros | Cons | Estimated Cost |
|
||||
|--------|------|------|----------------|
|
||||
| GKE + Postgres | Full control, predictable | Complex, expensive | ~$100+/month |
|
||||
| App Engine | Simple deployment | Less flexible | ~$30/month |
|
||||
| Firebase + Functions | Easy scaling | No SQL, vendor lock | ~$20/month |
|
||||
| **Cloud Run + SQL** | **Balanced** | **Some complexity** | **~$15/month** |
|
||||
|
||||
## References
|
||||
|
||||
- [Cloud Run Pricing](https://cloud.google.com/run/pricing)
|
||||
- [Cloud SQL Pricing](https://cloud.google.com/sql/pricing)
|
||||
- [Terraform GCP Provider](https://registry.terraform.io/providers/hashicorp/google/latest/docs)
|
||||
- [Cloud Build CI/CD](https://cloud.google.com/build/docs)
|
||||
246
npm/packages/ruvbot/docs/adr/ADR-014-aidefence-integration.md
Normal file
246
npm/packages/ruvbot/docs/adr/ADR-014-aidefence-integration.md
Normal file
@@ -0,0 +1,246 @@
|
||||
# ADR-014: AIDefence Integration for Adversarial Protection
|
||||
|
||||
## Status
|
||||
Accepted
|
||||
|
||||
## Date
|
||||
2026-01-27
|
||||
|
||||
## Context
|
||||
|
||||
RuvBot requires robust protection against adversarial attacks including:
|
||||
- Prompt injection (OWASP #1 LLM vulnerability)
|
||||
- Jailbreak attempts
|
||||
- PII leakage
|
||||
- Malicious code injection
|
||||
- Data exfiltration
|
||||
|
||||
The `aidefence` package provides production-ready adversarial defense with <10ms detection latency.
|
||||
|
||||
## Decision
|
||||
|
||||
Integrate `aidefence@2.1.1` into RuvBot as a core security layer.
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ RuvBot Security Layer │
|
||||
├─────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ User Input ────┐ │
|
||||
│ ▼ │
|
||||
│ ┌──────────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ AIDefenceGuard │ │
|
||||
│ ├──────────────────────────────────────────────────────────────────────┤ │
|
||||
│ │ Layer 1: Pattern Detection (<5ms) │ │
|
||||
│ │ └─ 50+ injection signatures │ │
|
||||
│ │ └─ Jailbreak patterns (DAN, bypass, etc.) │ │
|
||||
│ │ └─ Custom patterns (configurable) │ │
|
||||
│ ├──────────────────────────────────────────────────────────────────────┤ │
|
||||
│ │ Layer 2: PII Detection (<5ms) │ │
|
||||
│ │ └─ Email, phone, SSN, credit card │ │
|
||||
│ │ └─ API keys and tokens │ │
|
||||
│ │ └─ IP addresses │ │
|
||||
│ ├──────────────────────────────────────────────────────────────────────┤ │
|
||||
│ │ Layer 3: Sanitization (<1ms) │ │
|
||||
│ │ └─ Control character removal │ │
|
||||
│ │ └─ Unicode homoglyph normalization │ │
|
||||
│ │ └─ PII masking │ │
|
||||
│ ├──────────────────────────────────────────────────────────────────────┤ │
|
||||
│ │ Layer 4: Behavioral Analysis (<100ms) [Optional] │ │
|
||||
│ │ └─ User behavior baseline │ │
|
||||
│ │ └─ Anomaly detection │ │
|
||||
│ │ └─ Deviation scoring │ │
|
||||
│ └──────────────────────────────────────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌──────────┐ │
|
||||
│ │ Safe? │────No───► Block / Sanitize │
|
||||
│ └────┬─────┘ │
|
||||
│ │ Yes │
|
||||
│ ▼ │
|
||||
│ LLM Provider │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌──────────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ Response Validation │ │
|
||||
│ │ └─ PII leak detection │ │
|
||||
│ │ └─ Injection echo detection │ │
|
||||
│ │ └─ Malicious code detection │ │
|
||||
│ └──────────────────────────────────────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ Safe Response ────► User │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Threat Types Detected
|
||||
|
||||
| Threat Type | Severity | Detection Method | Response |
|
||||
|-------------|----------|------------------|----------|
|
||||
| Prompt Injection | High | Pattern matching | Block/Sanitize |
|
||||
| Jailbreak | Critical | Signature detection | Block |
|
||||
| PII Exposure | Medium-Critical | Regex patterns | Mask |
|
||||
| Malicious Code | High | AST-like patterns | Block |
|
||||
| Data Exfiltration | High | URL/webhook detection | Block |
|
||||
| Control Characters | Medium | Unicode analysis | Remove |
|
||||
| Encoding Attacks | Medium | Homoglyph detection | Normalize |
|
||||
| Anomalous Behavior | Medium | Baseline deviation | Alert |
|
||||
|
||||
### Performance Targets
|
||||
|
||||
| Operation | Target | Achieved |
|
||||
|-----------|--------|----------|
|
||||
| Pattern Detection | <10ms | ~5ms |
|
||||
| PII Detection | <10ms | ~3ms |
|
||||
| Sanitization | <5ms | ~1ms |
|
||||
| Full Analysis | <20ms | ~10ms |
|
||||
| Response Validation | <15ms | ~8ms |
|
||||
|
||||
### Usage
|
||||
|
||||
```typescript
|
||||
import { createAIDefenceGuard, createAIDefenceMiddleware } from '@ruvector/ruvbot';
|
||||
|
||||
// Simple usage
|
||||
const guard = createAIDefenceGuard({
|
||||
detectPromptInjection: true,
|
||||
detectJailbreak: true,
|
||||
detectPII: true,
|
||||
blockThreshold: 'medium',
|
||||
});
|
||||
|
||||
const result = await guard.analyze(userInput, {
|
||||
userId: 'user-123',
|
||||
sessionId: 'session-456',
|
||||
});
|
||||
|
||||
if (!result.safe) {
|
||||
console.log('Threats detected:', result.threats);
|
||||
// Use sanitized input or block
|
||||
const safeInput = result.sanitizedInput;
|
||||
}
|
||||
|
||||
// Middleware usage
|
||||
const middleware = createAIDefenceMiddleware({
|
||||
blockThreshold: 'medium',
|
||||
enableAuditLog: true,
|
||||
});
|
||||
|
||||
// Validate input before LLM
|
||||
const { allowed, sanitizedInput } = await middleware.validateInput(userInput);
|
||||
|
||||
if (allowed) {
|
||||
const response = await llm.complete(sanitizedInput);
|
||||
|
||||
// Validate response before returning
|
||||
const { allowed: responseAllowed } = await middleware.validateOutput(response, userInput);
|
||||
|
||||
if (responseAllowed) {
|
||||
return response;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Configuration Options
|
||||
|
||||
```typescript
|
||||
interface AIDefenceConfig {
|
||||
// Detection toggles
|
||||
detectPromptInjection: boolean; // Default: true
|
||||
detectJailbreak: boolean; // Default: true
|
||||
detectPII: boolean; // Default: true
|
||||
|
||||
// Advanced features
|
||||
enableBehavioralAnalysis: boolean; // Default: false
|
||||
enablePolicyVerification: boolean; // Default: false
|
||||
|
||||
// Threshold: 'none' | 'low' | 'medium' | 'high' | 'critical'
|
||||
blockThreshold: ThreatLevel; // Default: 'medium'
|
||||
|
||||
// Custom patterns (regex strings)
|
||||
customPatterns?: string[];
|
||||
|
||||
// Allowed domains for URL validation
|
||||
allowedDomains?: string[];
|
||||
|
||||
// Max input length (chars)
|
||||
maxInputLength: number; // Default: 100000
|
||||
|
||||
// Audit logging
|
||||
enableAuditLog: boolean; // Default: true
|
||||
}
|
||||
```
|
||||
|
||||
### Preset Configurations
|
||||
|
||||
```typescript
|
||||
// Strict mode (production)
|
||||
const strictConfig = createStrictConfig();
|
||||
// - All detection enabled
|
||||
// - Behavioral analysis enabled
|
||||
// - Block threshold: 'low'
|
||||
|
||||
// Permissive mode (development)
|
||||
const permissiveConfig = createPermissiveConfig();
|
||||
// - Core detection only
|
||||
// - Block threshold: 'critical'
|
||||
// - Audit logging disabled
|
||||
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Sub-10ms detection latency
|
||||
- 50+ built-in injection patterns
|
||||
- PII protection out of the box
|
||||
- Configurable security levels
|
||||
- Audit logging for compliance
|
||||
- Response validation
|
||||
- Unicode/homoglyph protection
|
||||
|
||||
### Negative
|
||||
- Additional dependency (aidefence)
|
||||
- Small latency overhead (~10ms per request)
|
||||
- False positives possible with strict settings
|
||||
|
||||
### Trade-offs
|
||||
- Strict mode may block legitimate queries
|
||||
- Behavioral analysis adds latency (~100ms)
|
||||
- PII masking may alter valid content
|
||||
|
||||
## Integration with Existing Security
|
||||
|
||||
AIDefence integrates with RuvBot's 6-layer security architecture:
|
||||
|
||||
```
|
||||
Layer 1: Transport (TLS 1.3)
|
||||
Layer 2: Authentication (JWT)
|
||||
Layer 3: Authorization (RBAC)
|
||||
Layer 4: Data Protection (Encryption)
|
||||
Layer 5: Input Validation (AIDefence) ◄── NEW
|
||||
Layer 6: WASM Sandbox
|
||||
```
|
||||
|
||||
## Dependencies
|
||||
|
||||
```json
|
||||
{
|
||||
"aidefence": "^2.1.1"
|
||||
}
|
||||
```
|
||||
|
||||
The aidefence package includes:
|
||||
- agentdb (vector storage)
|
||||
- lean-agentic (formal verification)
|
||||
- zod (schema validation)
|
||||
- winston (logging)
|
||||
- helmet (HTTP security headers)
|
||||
|
||||
## References
|
||||
|
||||
- [aidefence on npm](https://www.npmjs.com/package/aidefence)
|
||||
- [OWASP LLM Top 10](https://owasp.org/www-project-top-10-for-large-language-model-applications/)
|
||||
- [Prompt Injection Guide](https://www.lakera.ai/blog/guide-to-prompt-injection)
|
||||
- [AIMDS Documentation](https://ruv.io/aimds)
|
||||
192
npm/packages/ruvbot/docs/adr/ADR-015-chat-ui.md
Normal file
192
npm/packages/ruvbot/docs/adr/ADR-015-chat-ui.md
Normal file
@@ -0,0 +1,192 @@
|
||||
# ADR-015: Chat UI Architecture
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Date
|
||||
|
||||
2026-01-28
|
||||
|
||||
## Context
|
||||
|
||||
RuvBot provides a powerful REST API for chat interactions, but lacks a user-facing web interface. When users visit the root URL of a deployed RuvBot instance (e.g., on Cloud Run), they receive a 404 error instead of a usable chat interface.
|
||||
|
||||
### Requirements
|
||||
|
||||
1. Provide a modern, responsive chat UI out of the box
|
||||
2. Support dark mode (default) and light mode themes
|
||||
3. Work with the existing REST API endpoints
|
||||
4. No build step required - serve static files directly
|
||||
5. Support streaming responses for real-time AI interaction
|
||||
6. Mobile-friendly design
|
||||
7. Model selection capability
|
||||
8. Integration with CLI and npm package
|
||||
|
||||
### Alternatives Considered
|
||||
|
||||
| Option | Pros | Cons |
|
||||
|--------|------|------|
|
||||
| **assistant-ui** | Industry leader, 200k+ downloads, Y Combinator backed | Requires React build, adds complexity |
|
||||
| **Vercel AI Elements** | Official Vercel components, AI SDK integration | Requires Next.js |
|
||||
| **shadcn-chatbot-kit** | Beautiful components, shadcn design system | Requires React build |
|
||||
| **Embedded HTML/CSS/JS** | No build step, portable, fast deployment | Less features, custom implementation |
|
||||
|
||||
## Decision
|
||||
|
||||
Implement a **lightweight embedded chat UI** using vanilla HTML, CSS, and JavaScript that:
|
||||
|
||||
1. Is served directly from the existing HTTP server
|
||||
2. Requires no build step or additional dependencies
|
||||
3. Provides a modern, accessible interface
|
||||
4. Supports dark mode by default
|
||||
5. Includes basic markdown rendering
|
||||
6. Works seamlessly with the existing REST API
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ RuvBot Server │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ GET / → Chat UI (index.html) │
|
||||
│ GET /health → Health check │
|
||||
│ GET /api/models → Available models │
|
||||
│ POST /api/sessions → Create session │
|
||||
│ POST /api/sessions/:id/chat → Chat endpoint │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### File Structure
|
||||
|
||||
```
|
||||
src/
|
||||
├── api/
|
||||
│ └── public/
|
||||
│ └── index.html # Chat UI (single file)
|
||||
├── server.ts # Updated to serve static files
|
||||
└── ...
|
||||
```
|
||||
|
||||
### Features
|
||||
|
||||
1. **Theme Support**: Dark mode default, light mode toggle
|
||||
2. **Model Selection**: Dropdown for available models
|
||||
3. **Responsive Design**: Mobile-first approach
|
||||
4. **Accessibility**: ARIA labels, keyboard navigation
|
||||
5. **Markdown Rendering**: Code blocks, lists, links
|
||||
6. **Error Handling**: User-friendly error messages
|
||||
7. **Session Management**: Automatic session creation
|
||||
8. **Real-time Updates**: Typing indicators
|
||||
|
||||
### CSS Design System
|
||||
|
||||
```css
|
||||
:root {
|
||||
--bg-primary: #0a0a0f; /* Dark background */
|
||||
--bg-secondary: #12121a; /* Card background */
|
||||
--text-primary: #f0f0f5; /* Main text */
|
||||
--accent: #6366f1; /* Indigo accent */
|
||||
--radius: 12px; /* Border radius */
|
||||
}
|
||||
```
|
||||
|
||||
### API Integration
|
||||
|
||||
The UI integrates with existing endpoints:
|
||||
|
||||
```javascript
|
||||
// Create session
|
||||
POST /api/sessions { agentId: 'default-agent' }
|
||||
|
||||
// Send message
|
||||
POST /api/sessions/:id/chat { message: '...', model: '...' }
|
||||
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. **Zero Configuration**: Works out of the box
|
||||
2. **Fast Deployment**: No build step required
|
||||
3. **Portable**: Single HTML file, easy to customize
|
||||
4. **Lightweight**: ~25KB uncompressed
|
||||
5. **Framework Agnostic**: No React/Vue/Svelte dependency
|
||||
6. **Cloud Run Compatible**: Works with existing deployment
|
||||
|
||||
### Negative
|
||||
|
||||
1. **Limited Features**: No streaming UI (yet), basic markdown
|
||||
2. **Manual Updates**: No component library updates
|
||||
3. **Custom Code**: Maintenance responsibility
|
||||
|
||||
### Neutral
|
||||
|
||||
1. Future option to add assistant-ui or similar for advanced features
|
||||
2. Can be replaced with any frontend framework later
|
||||
|
||||
## Implementation
|
||||
|
||||
### Server Changes (server.ts)
|
||||
|
||||
```typescript
|
||||
// Serve static files
|
||||
function getChatUIPath(): string {
|
||||
const possiblePaths = [
|
||||
join(__dirname, 'api', 'public', 'index.html'),
|
||||
// ... fallback paths
|
||||
];
|
||||
// Find first existing path
|
||||
}
|
||||
|
||||
// Add root route
|
||||
{ method: 'GET', pattern: /^\/$/, handler: handleRoot }
|
||||
```
|
||||
|
||||
### CLI Integration
|
||||
|
||||
```bash
|
||||
# View chat UI URL after deployment
|
||||
ruvbot deploy-cloud cloudrun
|
||||
# Output: URL: https://ruvbot-xxx.run.app
|
||||
|
||||
# Open chat UI
|
||||
ruvbot open # Opens browser to chat UI
|
||||
```
|
||||
|
||||
### npm Package
|
||||
|
||||
The chat UI is bundled with the npm package:
|
||||
|
||||
```json
|
||||
{
|
||||
"files": [
|
||||
"dist",
|
||||
"bin",
|
||||
"scripts",
|
||||
"src/api/public"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
1. **Streaming Responses**: SSE/WebSocket for real-time streaming
|
||||
2. **File Uploads**: Image and document support
|
||||
3. **Voice Input**: Speech-to-text integration
|
||||
4. **assistant-ui Migration**: Full-featured React UI option
|
||||
5. **Themes**: Additional theme presets
|
||||
6. **Plugins**: Extensible UI components
|
||||
|
||||
## References
|
||||
|
||||
- [assistant-ui](https://github.com/assistant-ui/assistant-ui) - Industry-leading chat UI library
|
||||
- [Vercel AI SDK](https://ai-sdk.dev/) - AI SDK with streaming support
|
||||
- [shadcn/ui](https://ui.shadcn.com/) - Design system inspiration
|
||||
- [ADR-013: GCP Deployment](./ADR-013-gcp-deployment.md) - Cloud Run deployment
|
||||
|
||||
## Changelog
|
||||
|
||||
| Date | Change |
|
||||
|------|--------|
|
||||
| 2026-01-28 | Initial version - embedded chat UI |
|
||||
Reference in New Issue
Block a user