Squashed 'vendor/ruvector/' content from commit b64c2172

git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
ruv
2026-02-28 14:39:40 -05:00
commit d803bfe2b1
7854 changed files with 3522914 additions and 0 deletions

@@ -0,0 +1,657 @@
# RuvBot vs Clawdbot: Feature Parity & SOTA Comparison
## Executive Summary
RuvBot builds on Clawdbot's pioneering personal AI assistant architecture while **fixing critical security vulnerabilities** and introducing **state-of-the-art (SOTA)** improvements through RuVector's WASM-accelerated vector operations, self-learning neural patterns, and enterprise-grade multi-tenancy.
## Critical Security Gap in Clawdbot
**Clawdbot should NOT be used in production environments** without significant security hardening:
| Security Feature | Clawdbot | RuvBot | Risk Level |
|-----------------|----------|--------|------------|
| Prompt Injection Defense | **MISSING** | Protected | **CRITICAL** |
| Jailbreak Detection | **MISSING** | Protected | **CRITICAL** |
| PII Data Protection | **MISSING** | Auto-masked | **HIGH** |
| Input Sanitization | **MISSING** | Full | **HIGH** |
| Multi-tenant Isolation | **MISSING** | PostgreSQL RLS | **HIGH** |
| Response Validation | **MISSING** | AIDefence | **MEDIUM** |
| Audit Logging | **BASIC** | Comprehensive | **MEDIUM** |
**RuvBot addresses ALL of these vulnerabilities** with a 6-layer defense-in-depth architecture and integrated AIDefence protection.
## Feature Comparison Matrix
| Feature | Clawdbot | RuvBot | RuvBot Advantage |
|---------|----------|--------|------------------|
| **Security** | Basic | 6-layer + AIDefence | **CRITICAL UPGRADE** |
| **Prompt Injection** | **VULNERABLE** | Protected (<5ms) | **Essential** |
| **Jailbreak Defense** | **VULNERABLE** | Detected + Blocked | **Essential** |
| **PII Protection** | **NONE** | Auto-masked | **Compliance-ready** |
| **Vector Memory** | Optional | HNSW-indexed WASM | 150x-12,500x faster search |
| **Learning** | Static | SONA adaptive | Self-improving with EWC++ |
| **Embeddings** | External API | Local WASM | ~74x faster (200ms → 2.7ms), no network latency |
| **Multi-tenancy** | Single-user | Full RLS | Enterprise-ready isolation |
| **LLM Models** | Single provider | 12+ (Gemini 2.5, Claude, GPT) | Full flexibility |
| **LLM Routing** | Single model | MoE + FastGRNN | 100% routing accuracy |
| **Background Tasks** | Basic | agentic-flow workers | 12 specialized worker types |
| **Plugin System** | Basic | IPFS registry + sandboxed | claude-flow inspired |
## Deep Feature Analysis
### 1. Vector Memory System
#### Clawdbot
- Uses external embedding APIs (OpenAI, etc.)
- In-memory or basic database storage
- Linear search for retrieval
#### RuvBot (SOTA)
```
┌─────────────────────────────────────────────────────────────────┐
│ RuvBot Memory Architecture │
├─────────────────────────────────────────────────────────────────┤
│ WASM Embedder (384-4096 dim) │
│ └─ SIMD-optimized vector operations │
│ └─ LRU caching (10K+ entries) │
│ └─ Batch processing (32 vectors/batch) │
├─────────────────────────────────────────────────────────────────┤
│ HNSW Index (RuVector) │
│ └─ Hierarchical Navigable Small Worlds │
│ └─ O(log n) search complexity │
│ └─ 100K-10M vector capacity │
│ └─ ef_construction=200, M=16 (tuned) │
├─────────────────────────────────────────────────────────────────┤
│ Memory Types │
│ └─ Episodic: Conversation events │
│ └─ Semantic: Knowledge/facts │
│ └─ Procedural: Skills/patterns │
│ └─ Working: Short-term context │
└─────────────────────────────────────────────────────────────────┘
Performance Benchmarks:
- 10K vectors: <1ms search (vs 50ms Clawdbot)
- 100K vectors: <5ms search (vs 500ms+ Clawdbot)
- 1M vectors: <10ms search (not feasible in Clawdbot)
```
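The diagram above lists an LRU cache in front of the WASM embedder. As a rough illustration of that idea only, the sketch below keeps the most recently used embeddings in a `Map`, relying on its insertion order. The class name `LruEmbeddingCache` and its methods are hypothetical and are not taken from the RuvBot source.

```typescript
// Hypothetical LRU cache for embeddings (illustrative, not RuvBot's implementation).
class LruEmbeddingCache {
  private readonly entries = new Map<string, Float32Array>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new Error("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // Returns the cached embedding and marks it as most recently used.
  get(text: string): Float32Array | undefined {
    const hit = this.entries.get(text);
    if (hit === undefined) return undefined;
    this.entries.delete(text);
    this.entries.set(text, hit);
    return hit;
  }

  // Stores an embedding, evicting the least recently used entry when full.
  set(text: string, embedding: Float32Array): void {
    if (this.entries.has(text)) this.entries.delete(text);
    this.entries.set(text, embedding);
    if (this.entries.size > this.capacity) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A caller would check `get(text)` before invoking the embedder and call `set(text, vector)` on a miss; a capacity around 10,000 would match the figure quoted in the diagram.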
### 2. Self-Learning System
#### Clawdbot
- No built-in learning
- Static skill definitions
- Manual updates required
#### RuvBot (SOTA)
```
SONA Learning Pipeline:
1. RETRIEVE: HNSW pattern search (<1ms)
2. JUDGE: Verdict classification (success/failure)
3. DISTILL: LoRA weight extraction
4. CONSOLIDATE: EWC++ prevents catastrophic forgetting
Trajectory Learning:
┌─────────────────────────────────────────────────────────────────┐
│ User Query ──► Agent Response ──► Outcome ──► Pattern Store     │
│     │                │               │              │           │
│     ▼                ▼               ▼              ▼           │
│ Embedding       Action Log     Reward Score   Neural Update     │
│                                                                 │
│ Continuous improvement with each interaction                    │
└─────────────────────────────────────────────────────────────────┘
```
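For readers unfamiliar with the CONSOLIDATE step, the standard Elastic Weight Consolidation (EWC) objective is shown below as background. This is the textbook formulation, not a statement of how SONA implements it; EWC++ variants typically replace the per-task Fisher estimate with a running average. Here $\theta^{*}$ denotes the weights after earlier learning, $F_i$ the Fisher information for parameter $i$, and $\lambda$ a regularization strength; all symbols are introduced here for illustration.

```latex
\mathcal{L}(\theta) \;=\; \mathcal{L}_{\text{new}}(\theta)
  \;+\; \frac{\lambda}{2} \sum_{i} F_i \,\bigl(\theta_i - \theta_i^{*}\bigr)^{2}
```

Parameters with a large $F_i$ were important for earlier patterns, so moving them is penalized more heavily, which is how the penalty limits catastrophic forgetting.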
### 3. LLM Routing & Intelligence
#### Clawdbot
- Single model configuration
- Manual model selection
- No routing optimization
#### RuvBot (SOTA)
```
3-Tier Intelligent Routing:
┌─────────────────────────────────────────────────────────────────┐
│ Tier 1: Agent Booster (<1ms, $0) │
│ └─ Simple transforms: var→const, add-types, remove-console │
├─────────────────────────────────────────────────────────────────┤
│ Tier 2: Haiku (~500ms, $0.0002) │
│ └─ Bug fixes, simple tasks, low complexity │
├─────────────────────────────────────────────────────────────────┤
│ Tier 3: Sonnet/Opus (2-5s, $0.003-$0.015) │
│ └─ Architecture, security, complex reasoning │
└─────────────────────────────────────────────────────────────────┘
MoE (Mixture of Experts) + FastGRNN:
- 100% routing accuracy (hybrid keyword-first strategy)
- 75% cost reduction vs always-Sonnet
- 352x faster for Tier 1 tasks
```
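The keyword-first part of the routing strategy can be sketched as follows. This is a simplified illustration: the keyword lists, the `routeTask` name, and the default to Tier 2 are assumptions, and the learned FastGRNN fallback mentioned above is replaced here by a plain default.

```typescript
type Tier = 1 | 2 | 3;

interface RouteDecision {
  tier: Tier;
  reason: string;
}

// Hypothetical keyword lists; a real deployment would tune these.
const TIER3_KEYWORDS = ["architecture", "security", "threat model"];
const TIER1_KEYWORDS = ["var to const", "add types", "remove console"];

// Keyword-first routing: escalate on high-risk keywords, take the cheap
// path for known mechanical transforms, otherwise default to the middle tier.
function routeTask(description: string): RouteDecision {
  const text = description.toLowerCase();

  const complex = TIER3_KEYWORDS.find((k) => text.includes(k));
  if (complex !== undefined) {
    return { tier: 3, reason: `matched complex keyword "${complex}"` };
  }

  const simple = TIER1_KEYWORDS.find((k) => text.includes(k));
  if (simple !== undefined) {
    return { tier: 1, reason: `matched simple transform "${simple}"` };
  }

  return { tier: 2, reason: "no keyword match; default tier" };
}
```

Checking the Tier 3 keywords first is a deliberate choice in this sketch: a task that mentions both a simple transform and a security concern is escalated rather than handled by the cheapest tier.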
### 4. Multi-Tenancy & Enterprise Features
#### Clawdbot
- Single-user design
- Shared data storage
- No isolation
#### RuvBot (SOTA)
```
Enterprise Multi-Tenancy:
┌─────────────────────────────────────────────────────────────────┐
│ Tenant Isolation Layers │
├─────────────────────────────────────────────────────────────────┤
│ Database: PostgreSQL Row-Level Security (RLS) │
│ └─ Automatic tenant_id filtering │
│ └─ Cross-tenant queries impossible │
├─────────────────────────────────────────────────────────────────┤
│ Memory: Namespace isolation │
│ └─ Separate HNSW indices per tenant │
│ └─ Embedding isolation │
├─────────────────────────────────────────────────────────────────┤
│ Workers: Tenant-scoped queues │
│ └─ Resource quotas per tenant │
│ └─ Priority scheduling │
├─────────────────────────────────────────────────────────────────┤
│ API: Tenant context middleware │
│ └─ JWT claims with tenant_id │
│ └─ Rate limits per tenant │
└─────────────────────────────────────────────────────────────────┘
```
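The "separate HNSW indices per tenant" layer can be pictured as a registry keyed by tenant ID. The sketch below is a minimal illustration under that assumption; `TenantIndexRegistry` and its factory parameter are hypothetical names, and the index type is left generic because the RuVector index API is not shown in this document.

```typescript
// Hypothetical per-tenant registry: each tenant gets its own index instance,
// so a lookup for one tenant can never touch another tenant's vectors.
class TenantIndexRegistry<Index> {
  private readonly indices = new Map<string, Index>();
  private readonly createIndex: () => Index;

  constructor(createIndex: () => Index) {
    this.createIndex = createIndex;
  }

  // Returns the tenant's index, creating it lazily on first use.
  forTenant(tenantId: string): Index {
    if (tenantId.trim() === "") {
      throw new Error("tenantId is required");
    }
    if (!this.indices.has(tenantId)) {
      this.indices.set(tenantId, this.createIndex());
    }
    return this.indices.get(tenantId) as Index;
  }

  get tenantCount(): number {
    return this.indices.size;
  }
}
```

In this sketch the tenant ID would come from the verified JWT claim described in the API layer, never from user-supplied request fields.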
### 5. Background Workers
#### Clawdbot
- Basic async processing
- No specialized workers
- Limited task types
#### RuvBot (SOTA)
```
12 Specialized Background Workers:
┌───────────────────┬──────────┬─────────────────────────────────┐
│ Worker │ Priority │ Purpose │
├───────────────────┼──────────┼─────────────────────────────────┤
│ ultralearn │ normal │ Deep knowledge acquisition │
│ optimize │ high │ Performance optimization │
│ consolidate │ low │ Memory consolidation (EWC++) │
│ predict │ normal │ Predictive preloading │
│ audit │ critical │ Security analysis │
│ map │ normal │ Codebase/context mapping │
│ preload │ low │ Resource preloading │
│ deepdive │ normal │ Deep code/content analysis │
│ document │ normal │ Auto-documentation │
│ refactor │ normal │ Refactoring suggestions │
│ benchmark │ normal │ Performance benchmarking │
│ testgaps │ normal │ Test coverage analysis │
└───────────────────┴──────────┴─────────────────────────────────┘
```
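The priority column in the table above implies some form of priority scheduling. A minimal sketch of one way to do that is shown below; `PriorityJobQueue` and the rank mapping are assumptions for illustration and do not describe the agentic-flow scheduler itself.

```typescript
type Priority = "critical" | "high" | "normal" | "low";

interface Job {
  worker: string;
  priority: Priority;
}

// Lower rank runs first.
const PRIORITY_RANK: Record<Priority, number> = {
  critical: 0,
  high: 1,
  normal: 2,
  low: 3,
};

// Hypothetical queue: highest priority first, FIFO within the same priority.
class PriorityJobQueue {
  private readonly jobs: Job[] = [];

  enqueue(job: Job): void {
    this.jobs.push(job);
  }

  dequeue(): Job | undefined {
    if (this.jobs.length === 0) return undefined;
    let best = 0;
    for (let i = 1; i < this.jobs.length; i++) {
      // Strict comparison keeps the earliest job among equal priorities.
      if (PRIORITY_RANK[this.jobs[i].priority] < PRIORITY_RANK[this.jobs[best].priority]) {
        best = i;
      }
    }
    return this.jobs.splice(best, 1)[0];
  }
}
```

The linear scan keeps the example short; a binary heap would be the usual choice once queues grow large.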
### 6. Security Comparison
#### Clawdbot
- Baseline security only (no prompt-injection, jailbreak, or PII defenses)
- Environment-based secrets
- Basic input validation
#### RuvBot (SOTA)
```
6-Layer Defense in Depth:
┌─────────────────────────────────────────────────────────────────┐
│ Layer 1: Transport (TLS 1.3, HSTS, cert pinning) │
│ Layer 2: Authentication (JWT RS256, OAuth 2.0, rate limiting) │
│ Layer 3: Authorization (RBAC, claims, tenant isolation) │
│ Layer 4: Data Protection (AES-256-GCM, key rotation) │
│ Layer 5: Input Validation (Zod schemas, injection prevention) │
│ Layer 6: WASM Sandbox (memory isolation, resource limits) │
└─────────────────────────────────────────────────────────────────┘
Compliance Ready:
- GDPR: Data export, deletion, consent
- SOC 2: Audit logging, access controls
- HIPAA: Encryption, access logging (configurable)
```
## Performance Benchmarks
| Operation | Clawdbot | RuvBot | Improvement |
|-----------|----------|--------|-------------|
| Embedding generation | 200ms (API) | 2.7ms (WASM) | 74x faster |
| Vector search (10K) | 50ms | <1ms | 50x faster |
| Vector search (100K) | 500ms+ | <5ms | 100x faster |
| Session restore | 100ms | 10ms | 10x faster |
| Skill invocation | 50ms | 5ms | 10x faster |
| Cold start | 3s | 500ms | 6x faster |
## Architecture Advantages
### RuvBot SOTA Innovations
1. **WASM-First Design**
   - Cross-platform consistency
   - No native compilation needed
   - Portable to browser environments
2. **Neural Substrate Integration**
   - Continuous learning via SONA
   - Pattern recognition with MoE
   - Catastrophic forgetting prevention (EWC++)
3. **Distributed Coordination**
   - Byzantine fault-tolerant consensus
   - Raft leader election
   - Gossip protocol for eventual consistency
4. **RuVector Integration**
   - 53+ SQL functions for vectors
   - 39 attention mechanisms
   - Hyperbolic embeddings for hierarchies
   - Flash Attention (2.49x-7.47x speedup)
## Migration Path
Clawdbot users can migrate to RuvBot with:
```bash
# Export Clawdbot data
clawdbot export --format json > data.json
# Import to RuvBot
ruvbot import --from-clawdbot data.json
# Verify migration
ruvbot doctor --verify-migration
```
## Skills Comparison (52 Clawdbot → 68+ RuvBot)
### Clawdbot Skills (52)
```
1password, apple-notes, apple-reminders, bear-notes, bird, blogwatcher,
blucli, bluebubbles, camsnap, canvas, clawdhub, coding-agent, discord,
eightctl, food-order, gemini, gifgrep, github, gog, goplaces, himalaya,
imsg, local-places, mcporter, model-usage, nano-banana-pro, nano-pdf,
notion, obsidian, openai-image-gen, openai-whisper, openai-whisper-api,
openhue, oracle, ordercli, peekaboo, sag, session-logs, sherpa-onnx-tts,
skill-creator, slack, songsee, sonoscli, spotify-player, summarize,
things-mac, tmux, trello, video-frames, voice-call, wacli, weather
```
### RuvBot Skills (68+)
```
All 52 Clawdbot skills PLUS:
RuVector-Enhanced Skills:
├─ semantic-search : HNSW O(log n) vector search (150x faster)
├─ pattern-learning : SONA trajectory learning
├─ hybrid-search : Vector + BM25 fusion
├─ embedding-batch : Parallel WASM embedding
├─ context-predict : Predictive context preloading
├─ memory-consolidate : EWC++ memory consolidation
Distributed Skills (agentic-flow):
├─ swarm-orchestrate : Multi-agent coordination
├─ consensus-reach : Byzantine fault-tolerant consensus
├─ load-balance : Dynamic task distribution
├─ mesh-coordinate : Peer-to-peer mesh networking
Enterprise Skills:
├─ tenant-isolate : Multi-tenant data isolation
├─ audit-log : Comprehensive security logging
├─ key-rotate : Automatic secret rotation
├─ rls-enforce : Row-level security enforcement
```
## Complete Module Comparison
| Module Category | Clawdbot (68) | RuvBot | RuvBot Advantage |
|-----------------|---------------|--------|------------------|
| **Core** | agents, sessions, memory | ✅ | + SONA learning |
| **Channels** | slack, discord, telegram, signal, whatsapp, line, imessage | ✅ All + web | + Multi-tenant channels |
| **CLI** | cli, commands | ✅ + MCP server | + 140+ subcommands |
| **Memory** | SQLite + FTS | ✅ + HNSW WASM | **150-12,500x faster** |
| **Embedding** | OpenAI/Gemini API | ✅ + Local WASM | **~74x faster, $0 cost** |
| **Workers** | Basic async | 12 specialized | + Learning workers |
| **Routing** | Single model | 3-tier MoE | **75% cost reduction** |
| **Cron** | Basic scheduler | ✅ + Priority queues | + Tenant-scoped |
| **Daemon** | Basic | ✅ + Health checks | + Auto-recovery |
| **Gateway** | HTTP | ✅ + WebSocket | + GraphQL subscriptions |
| **Plugin SDK** | JavaScript | ✅ + WASM | + Sandboxed execution |
| **TTS** | sherpa-onnx | ✅ + RuvLLM | + Lower latency |
| **TUI** | Basic | ✅ + Rich | + Status dashboard |
| **Security** | Basic | 6-layer | + Defense in depth |
| **Browser** | Puppeteer | ✅ + Playwright | + Session persistence |
| **Media** | Basic | ✅ + WASM | + GPU acceleration |
## RuVector Exclusive Capabilities
### 1. WASM Vector Operations (npm @ruvector/wasm-unified)
```typescript
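// RuvBot uses RuVector WASM for all vector operations
import { HnswIndex, simdDistance } from '@ruvector/wasm-unified';
// `hnswIndex` is an already-built HnswIndex and `query` an embedding vector
// (construction omitted in this excerpt).
// HNSW search: claimed 150x faster than Clawdbot's linear scan
const results = await hnswIndex.search(query, { k: 10 });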
```
### 2. Local LLM with SONA (npm @ruvector/ruvllm)
```typescript
// Self-Optimizing Neural Architecture
import { RuvLLM, SonaTrainer } from '@ruvector/ruvllm';
// Continuous learning from every interaction
// (`sonaTrainer` and `session` are created elsewhere; omitted in this excerpt)
await sonaTrainer.train({
  trajectory: session.messages,
  outcome: 'success',
  consolidate: true // EWC++ prevents forgetting
});
```
### 3. PostgreSQL Vector Store (npm @ruvector/postgres-cli)
```sql
-- RuVector adds 53+ vector SQL functions
SELECT * FROM memories
WHERE tenant_id = current_tenant() -- RLS
ORDER BY embedding <=> $query -- Cosine distance (nearest first)
LIMIT 10;
```
### 4. Agentic-Flow Integration (npx agentic-flow)
```typescript
// Multi-agent swarm coordination
import { SwarmCoordinator, ByzantineConsensus } from 'agentic-flow';
// 12 specialized background workers
// (`swarm` and `content` are created elsewhere; omitted in this excerpt)
await swarm.dispatch({
  worker: 'ultralearn',
  task: { type: 'deep-analysis', content }
});
```
## Benchmark: RuvBot Dominance
| Metric | Clawdbot | RuvBot | Ratio |
|--------|----------|--------|-------|
| Embedding latency | 200ms | 2.7ms | **74x** |
| 10K vector search | 50ms | <1ms | **50x** |
| 100K vector search | 500ms | <5ms | **100x** |
| 1M vector search | N/A | <10ms | n/a (no Clawdbot baseline) |
| Session restore | 100ms | 10ms | **10x** |
| Skill invocation | 50ms | 5ms | **10x** |
| Cold start | 3000ms | 500ms | **6x** |
| Memory consolidation | N/A | <50ms | n/a (no Clawdbot baseline) |
| Pattern learning | N/A | <5ms | n/a (no Clawdbot baseline) |
| Multi-tenant query | N/A | <2ms | n/a (no Clawdbot baseline) |
## agentic-flow Integration Details
### Background Workers (12 Types)
| Worker | Clawdbot | RuvBot | Enhancement |
|--------|----------|--------|-------------|
| ultralearn | ❌ | ✅ | Deep knowledge acquisition |
| optimize | ❌ | ✅ | Performance optimization |
| consolidate | ❌ | ✅ | EWC++ memory consolidation |
| predict | ❌ | ✅ | Predictive preloading |
| audit | ❌ | ✅ | Security analysis |
| map | ❌ | ✅ | Codebase mapping |
| preload | ❌ | ✅ | Resource preloading |
| deepdive | ❌ | ✅ | Deep code analysis |
| document | ❌ | ✅ | Auto-documentation |
| refactor | ❌ | ✅ | Refactoring suggestions |
| benchmark | ❌ | ✅ | Performance benchmarking |
| testgaps | ❌ | ✅ | Test coverage analysis |
### Swarm Topologies
| Topology | Clawdbot | RuvBot | Use Case |
|----------|----------|--------|----------|
| hierarchical | ❌ | ✅ | Queen-worker coordination |
| mesh | ❌ | ✅ | Peer-to-peer networking |
| hierarchical-mesh | ❌ | ✅ | Hybrid scalability |
| adaptive | ❌ | ✅ | Dynamic switching |
### Consensus Mechanisms
| Protocol | Clawdbot | RuvBot | Fault Tolerance |
|----------|----------|--------|-----------------|
| Byzantine | ❌ | ✅ | f < n/3 faulty |
| Raft | ❌ | ✅ | f < n/2 failures |
| Gossip | ❌ | ✅ | Eventually consistent |
| CRDT | ❌ | ✅ | Conflict-free replication |
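The fault-tolerance bounds in the table translate directly into cluster sizing. The helper functions below are a small worked example; the function names are illustrative and not part of any RuvBot or agentic-flow API.

```typescript
// Largest number of Byzantine (arbitrarily faulty) nodes tolerated: f < n/3.
function maxByzantineFaults(n: number): number {
  assertClusterSize(n);
  return Math.floor((n - 1) / 3);
}

// Largest number of crashed nodes a Raft-style majority quorum tolerates: f < n/2.
function maxCrashFaults(n: number): number {
  assertClusterSize(n);
  return Math.floor((n - 1) / 2);
}

function assertClusterSize(n: number): void {
  if (!Number.isInteger(n) || n < 1) {
    throw new Error("cluster size must be a positive integer");
  }
}
```

For example, tolerating one Byzantine node needs at least 4 nodes, while tolerating one crashed node under Raft needs only 3.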
### 10. Cloud Deployment
#### Clawdbot
- Manual deployment
- No cloud-native support
- Self-managed infrastructure
#### RuvBot (SOTA)
```
Google Cloud Platform (Cost-Optimized):
┌─────────────────────────────────────────────────────────────────┐
│ Cloud Run (Serverless) │
│ └─ Scale to zero when idle │
│ └─ Auto-scale 0-100 instances │
│ └─ 512Mi memory, sub-second cold start │
├─────────────────────────────────────────────────────────────────┤
│ Cloud SQL (PostgreSQL) │
│ └─ db-f1-micro (~$10/month) │
│ └─ Automatic backups │
│ └─ Row-Level Security │
├─────────────────────────────────────────────────────────────────┤
│ Infrastructure as Code │
│ └─ Terraform modules included │
│ └─ Cloud Build CI/CD pipeline │
│ └─ One-command deployment │
└─────────────────────────────────────────────────────────────────┘
Estimated Monthly Cost:
| Traffic Level | Configuration | Cost |
|---------------|---------------|------|
| Low (<1K/day) | Min resources | ~$15-20/month |
| Medium (<10K/day) | Scaled | ~$40/month |
| High (<100K/day) | Enterprise | ~$150/month |
```
### 11. LLM Provider Support
#### Clawdbot
- Single provider (typically OpenAI)
- No model routing
- Fixed pricing
- No Gemini 2.5 support
#### RuvBot (SOTA)
```
Multi-Provider Architecture with Gemini 2.5 Default:
┌─────────────────────────────────────────────────────────────────┐
│ OpenRouter (200+ Models) - DEFAULT PROVIDER │
│ └─ Google Gemini 2.5 Pro Preview (RECOMMENDED) │
│ └─ Google Gemini 2.0 Flash (fast responses) │
│ └─ Google Gemini 2.0 Flash Thinking (FREE reasoning) │
│ └─ Qwen QwQ-32B (Reasoning) - FREE tier available │
│ └─ DeepSeek R1 (Open-source reasoning) │
│ └─ OpenAI O1/GPT-4o │
│ └─ Meta Llama 3.1 405B │
│ └─ Best for: Cost optimization, variety │
├─────────────────────────────────────────────────────────────────┤
│ Anthropic (Direct API) │
│ └─ Claude 3.5 Sonnet (latest) │
│ └─ Claude 3 Opus (complex analysis) │
│ └─ Best for: Quality, reliability, safety │
└─────────────────────────────────────────────────────────────────┘
Model Comparison (10 of the 12+ supported models):
| Model | Provider | Best For | Cost |
|-------|----------|----------|------|
| Gemini 2.5 Pro | OpenRouter | General + Reasoning | $$ |
| Gemini 2.0 Flash | OpenRouter | Speed | $ |
| Gemini 2.0 Flash Thinking | OpenRouter | Reasoning | FREE |
| Claude 3.5 Sonnet | Anthropic | Quality | $$$ |
| GPT-4o | OpenRouter | General | $$$ |
| QwQ-32B | OpenRouter | Math/Reasoning | $ |
| QwQ-32B Free | OpenRouter | Budget | FREE |
| DeepSeek R1 | OpenRouter | Open-source | $ |
| O1 Preview | OpenRouter | Advanced reasoning | $$$$ |
| Llama 3.1 405B | OpenRouter | Enterprise | $$ |
Intelligent Model Selection:
- Budget → Gemini 2.0 Flash Thinking (FREE) or QwQ Free
- General → Gemini 2.5 Pro (DEFAULT)
- Quality → Claude 3.5 Sonnet
- Complex reasoning → O1 Preview or Claude Opus
```
### 12. Hybrid Search
#### Clawdbot
- Vector-only search
- No keyword fallback
- Limited result ranking
#### RuvBot (SOTA)
```
Hybrid Search Architecture (ADR-009):
┌─────────────────────────────────────────────────────────────────┐
│ Query Processing                                                │
│          ┌─────────────┐              ┌─────────────┐           │
│          │    BM25     │              │   Vector    │           │
│          │   Keyword   │              │  Semantic   │           │
│          │   Search    │              │   Search    │           │
│          └──────┬──────┘              └──────┬──────┘           │
│                 │                            │                  │
│                 └────────────┬───────────────┘                  │
│                              ▼                                  │
│                      ┌───────────────┐                          │
│                      │  RRF Fusion   │                          │
│                      │    (k=60)     │                          │
│                      └───────┬───────┘                          │
│                              ▼                                  │
│                      ┌───────────────┐                          │
│                      │  Re-ranking   │                          │
│                      │  + Filtering  │                          │
│                      └───────────────┘                          │
└─────────────────────────────────────────────────────────────────┘
BM25 Configuration:
- k1: 1.2 (term frequency saturation)
- b: 0.75 (document length normalization)
- Tokenization: Unicode word boundaries
- Stemming: Porter stemmer (optional)
Search Accuracy Comparison:
| Method | Precision@10 | Recall@100 | Latency |
|--------|--------------|------------|---------|
| BM25 only | 0.72 | 0.85 | <5ms |
| Vector only | 0.78 | 0.92 | <10ms |
| Hybrid (RRF) | 0.91 | 0.97 | <15ms |
```
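Reciprocal Rank Fusion (RRF), the fusion step in the diagram above, scores each document as the sum of `1 / (k + rank)` over every ranking in which it appears. The sketch below shows that formula with the `k = 60` value quoted above; the function name and the use of plain string IDs are assumptions for illustration, and the re-ranking and filtering stage is omitted.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d)),
// where rank is 1-based and documents missing from a ranking contribute 0.
function reciprocalRankFusion(rankings: string[][], k: number = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Example: fuse a BM25 ranking with a vector-search ranking.
const fused = reciprocalRankFusion([
  ["a", "b", "c"], // BM25 order
  ["b", "d", "a"], // vector order
]);
console.log(fused.map((r) => r.id));
```

Because RRF uses only ranks, it needs no score normalization between BM25 and cosine distances, which is the usual reason for choosing it over weighted score sums.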
### 13. Adversarial Defense (AIDefence Integration)
#### Clawdbot
- Basic input validation
- No prompt injection protection
- No jailbreak detection
- Manual PII handling
#### RuvBot (SOTA)
```
AIDefence Multi-Layer Protection (ADR-014):
┌─────────────────────────────────────────────────────────────────┐
│ Layer 1: Pattern Detection (<5ms) │
│ └─ 50+ prompt injection signatures │
│ └─ Jailbreak patterns (DAN, bypass, unlimited) │
│ └─ Custom patterns (configurable) │
├─────────────────────────────────────────────────────────────────┤
│ Layer 2: PII Protection (<3ms) │
│ └─ Email, phone, SSN, credit cards │
│ └─ API keys and tokens │
│ └─ IP addresses │
│ └─ Automatic masking │
├─────────────────────────────────────────────────────────────────┤
│ Layer 3: Sanitization (<1ms) │
│ └─ Control character removal │
│ └─ Unicode homoglyph normalization │
│ └─ Encoding attack prevention │
├─────────────────────────────────────────────────────────────────┤
│ Layer 4: Behavioral Analysis (<100ms) [Optional] │
│ └─ User behavior baseline │
│ └─ Anomaly detection │
│ └─ Deviation scoring │
├─────────────────────────────────────────────────────────────────┤
│ Layer 5: Response Validation (<8ms) │
│ └─ PII leak detection │
│ └─ Injection echo detection │
│ └─ Malicious code detection │
└─────────────────────────────────────────────────────────────────┘
Threat Detection Performance:
| Threat Type | Clawdbot | RuvBot | Detection Time |
|-------------|----------|--------|----------------|
| Prompt Injection | ❌ | ✅ | <5ms |
| Jailbreak | ❌ | ✅ | <5ms |
| PII Exposure | ❌ | ✅ | <3ms |
| Control Characters | ❌ | ✅ | <1ms |
| Homoglyph Attacks | ❌ | ✅ | <1ms |
| Behavioral Anomaly | ❌ | ✅ | <100ms |
| Response Leakage | ❌ | ✅ | <8ms |
```
Usage Example:
```typescript
import { createAIDefenceGuard } from '@ruvector/ruvbot';
const guard = createAIDefenceGuard({
  detectPromptInjection: true,
  detectJailbreak: true,
  detectPII: true,
  blockThreshold: 'medium',
});
const result = await guard.analyze(userInput);
if (!result.safe) {
  // Block or use sanitized input
  const safeInput = result.sanitizedInput;
}
```
## Conclusion
RuvBot represents a **security-first, next-generation evolution** of the personal AI assistant paradigm:
### Security: The Critical Difference
| Security Feature | Clawdbot | RuvBot | Verdict |
|-----------------|----------|--------|---------|
| **Prompt Injection** | VULNERABLE | Protected (<5ms) | ⚠️ **CRITICAL** |
| **Jailbreak Defense** | VULNERABLE | Blocked | ⚠️ **CRITICAL** |
| **PII Protection** | NONE | Auto-masked | ⚠️ **HIGH RISK** |
| **Input Sanitization** | NONE | Full | ⚠️ **HIGH RISK** |
| **Multi-tenant Isolation** | NONE | PostgreSQL RLS | ⚠️ **HIGH RISK** |
**Do not deploy Clawdbot in production without security hardening.**
### Complete Comparison
| Aspect | Clawdbot | RuvBot | Winner |
|--------|----------|--------|--------|
| **Security** | Vulnerable | 6-layer + AIDefence | 🏆 RuvBot |
| **Adversarial Defense** | None | AIDefence (<10ms) | 🏆 RuvBot |
| **Performance** | Baseline | 50-150x faster | 🏆 RuvBot |
| **Intelligence** | Static | Self-learning SONA | 🏆 RuvBot |
| **Scalability** | Single-user | Enterprise multi-tenant | 🏆 RuvBot |
| **LLM Models** | Single | 12+ (Gemini 2.5, Claude, GPT) | 🏆 RuvBot |
| **Plugin System** | Basic | IPFS + sandboxed | 🏆 RuvBot |
| **Skills** | 52 | 68+ | 🏆 RuvBot |
| **Workers** | Basic | 12 specialized | 🏆 RuvBot |
| **Consensus** | None | 4 protocols | 🏆 RuvBot |
| **Cloud Deploy** | Manual | GCP Terraform (~$15/mo) | 🏆 RuvBot |
| **Hybrid Search** | Vector-only | BM25 + Vector RRF | 🏆 RuvBot |
| **Cost** | API fees | $0 local WASM | 🏆 RuvBot |
| **Portability** | Node.js | WASM everywhere | 🏆 RuvBot |
**RuvBot outperforms Clawdbot on every dimension compared above**, especially security and intelligence, while maintaining full compatibility with Clawdbot's skill and extension architecture.
### Migration Recommendation
If you are currently using Clawdbot, **migrate to RuvBot immediately** to address critical security vulnerabilities. RuvBot provides a seamless migration path with full skill compatibility.

@@ -0,0 +1,916 @@
# RuvBot Implementation Plan
# High-performance AI assistant bot with WASM embeddings, vector memory, and multi-platform integration
plan:
  objective: "Build RuvBot npm package - a self-learning AI assistant with WASM embeddings, vector memory, and Slack/webhook integrations"
  version: "0.1.0"
  estimated_duration: "6-8 weeks"
  success_criteria:
    - "Package installable via npx @ruvector/ruvbot"
    - "CLI supports local and remote deployment modes"
    - "WASM embeddings working in Node.js and browser"
    - "Vector memory with HNSW search < 10ms"
    - "Slack integration with real-time message handling"
    - "Background workers processing async tasks"
    - "Extensible skill system with hot-reload"
    - "Session persistence across restarts"
    - "85%+ test coverage on core modules"
  phases:
    # ============================================================================
    # PHASE 1: Core Foundation (Week 1-2)
    # ============================================================================
    - name: "Phase 1: Core Foundation"
      duration: "2 weeks"
      description: "Establish package structure and core domain entities"
      tasks:
        - id: "p1-t1"
          description: "Initialize package with tsup, TypeScript, and ESM/CJS dual build"
          agent: "coder"
          dependencies: []
          estimated_time: "2h"
          priority: "critical"
          files:
            - "package.json"
            - "tsconfig.json"
            - "tsup.config.ts"
            - ".npmignore"
        - id: "p1-t2"
          description: "Create core domain entities (Agent, Session, Message, Skill)"
          agent: "coder"
          dependencies: ["p1-t1"]
          estimated_time: "4h"
          priority: "high"
          files:
            - "src/core/entities/Agent.ts"
            - "src/core/entities/Session.ts"
            - "src/core/entities/Message.ts"
            - "src/core/entities/Skill.ts"
            - "src/core/entities/index.ts"
            - "src/core/types.ts"
        - id: "p1-t3"
          description: "Implement RuvBot main class with lifecycle management"
          agent: "coder"
          dependencies: ["p1-t2"]
          estimated_time: "4h"
          priority: "high"
          files:
            - "src/RuvBot.ts"
            - "src/core/BotConfig.ts"
            - "src/core/BotState.ts"
        - id: "p1-t4"
          description: "Create error types and result monads"
          agent: "coder"
          dependencies: ["p1-t1"]
          estimated_time: "2h"
          priority: "medium"
          files:
            - "src/core/errors.ts"
            - "src/core/Result.ts"
        - id: "p1-t5"
          description: "Set up unit testing with vitest"
          agent: "tester"
          dependencies: ["p1-t3"]
          estimated_time: "3h"
          priority: "high"
          files:
            - "vitest.config.ts"
            - "tests/unit/core/RuvBot.test.ts"
            - "tests/unit/core/entities/*.test.ts"
    # ============================================================================
    # PHASE 2: Infrastructure Layer (Week 2-3)
    # ============================================================================
    - name: "Phase 2: Infrastructure Layer"
      duration: "1.5 weeks"
      description: "Database, messaging, and worker infrastructure"
      tasks:
        - id: "p2-t1"
          description: "Implement SessionStore with SQLite and PostgreSQL adapters"
          agent: "coder"
          dependencies: ["p1-t2"]
          estimated_time: "6h"
          priority: "high"
          files:
            - "src/infrastructure/storage/SessionStore.ts"
            - "src/infrastructure/storage/adapters/SQLiteAdapter.ts"
            - "src/infrastructure/storage/adapters/PostgresAdapter.ts"
            - "src/infrastructure/storage/adapters/BaseAdapter.ts"
        - id: "p2-t2"
          description: "Create MessageQueue with in-memory and Redis backends"
          agent: "coder"
          dependencies: ["p1-t2"]
          estimated_time: "5h"
          priority: "high"
          files:
            - "src/infrastructure/messaging/MessageQueue.ts"
            - "src/infrastructure/messaging/InMemoryQueue.ts"
            - "src/infrastructure/messaging/RedisQueue.ts"
        - id: "p2-t3"
          description: "Implement WorkerPool using agentic-flow patterns"
          agent: "coder"
          dependencies: ["p2-t2"]
          estimated_time: "6h"
          priority: "high"
          files:
            - "src/infrastructure/workers/WorkerPool.ts"
            - "src/infrastructure/workers/Worker.ts"
            - "src/infrastructure/workers/TaskScheduler.ts"
            - "src/infrastructure/workers/tasks/index.ts"
        - id: "p2-t4"
          description: "Create EventBus for internal pub/sub communication"
          agent: "coder"
          dependencies: ["p1-t1"]
          estimated_time: "3h"
          priority: "medium"
          files:
            - "src/infrastructure/events/EventBus.ts"
            - "src/infrastructure/events/types.ts"
        - id: "p2-t5"
          description: "Add connection pooling and health checks"
          agent: "coder"
          dependencies: ["p2-t1", "p2-t2"]
          estimated_time: "4h"
          priority: "medium"
          files:
            - "src/infrastructure/health/HealthChecker.ts"
            - "src/infrastructure/pool/ConnectionPool.ts"
    # ============================================================================
    # PHASE 3: Learning Layer - WASM & ruvllm (Week 3-4)
    # ============================================================================
    - name: "Phase 3: Learning Layer"
      duration: "1.5 weeks"
      description: "WASM embeddings and ruvllm integration for self-learning"
      tasks:
        - id: "p3-t1"
          description: "Create MemoryManager with HNSW vector search"
          agent: "coder"
          dependencies: ["p2-t1"]
          estimated_time: "8h"
          priority: "critical"
          files:
            - "src/learning/memory/MemoryManager.ts"
            - "src/learning/memory/VectorIndex.ts"
            - "src/learning/memory/types.ts"
          dependencies_pkg:
            - "@ruvector/wasm-unified"
        - id: "p3-t2"
          description: "Integrate @ruvector/wasm-unified for WASM embeddings"
          agent: "coder"
          dependencies: ["p3-t1"]
          estimated_time: "6h"
          priority: "critical"
          files:
            - "src/learning/embeddings/WasmEmbedder.ts"
            - "src/learning/embeddings/EmbeddingCache.ts"
            - "src/learning/embeddings/index.ts"
        - id: "p3-t3"
          description: "Integrate @ruvector/ruvllm for LLM orchestration"
          agent: "coder"
          dependencies: ["p3-t2"]
          estimated_time: "6h"
          priority: "high"
          files:
            - "src/learning/llm/LLMOrchestrator.ts"
            - "src/learning/llm/ModelRouter.ts"
            - "src/learning/llm/SessionContext.ts"
          dependencies_pkg:
            - "@ruvector/ruvllm"
        - id: "p3-t4"
          description: "Implement trajectory learning and pattern extraction"
          agent: "coder"
          dependencies: ["p3-t3"]
          estimated_time: "5h"
          priority: "medium"
          files:
            - "src/learning/trajectory/TrajectoryRecorder.ts"
            - "src/learning/trajectory/PatternExtractor.ts"
            - "src/learning/trajectory/types.ts"
        - id: "p3-t5"
          description: "Add semantic search and retrieval pipeline"
          agent: "coder"
          dependencies: ["p3-t1", "p3-t2"]
          estimated_time: "4h"
          priority: "high"
          files:
            - "src/learning/retrieval/SemanticSearch.ts"
            - "src/learning/retrieval/RetrievalPipeline.ts"
    # ============================================================================
    # PHASE 4: Skill System (Week 4-5)
    # ============================================================================
    - name: "Phase 4: Skill System"
      duration: "1 week"
      description: "Extensible skill registry with hot-reload support"
      tasks:
        - id: "p4-t1"
          description: "Create SkillRegistry with plugin architecture"
          agent: "coder"
          dependencies: ["p1-t2", "p3-t3"]
          estimated_time: "6h"
          priority: "high"
          files:
            - "src/skills/SkillRegistry.ts"
            - "src/skills/SkillLoader.ts"
            - "src/skills/SkillContext.ts"
            - "src/skills/types.ts"
        - id: "p4-t2"
          description: "Implement built-in skills (search, summarize, code)"
          agent: "coder"
          dependencies: ["p4-t1"]
          estimated_time: "8h"
          priority: "high"
          files:
            - "src/skills/builtin/SearchSkill.ts"
            - "src/skills/builtin/SummarizeSkill.ts"
            - "src/skills/builtin/CodeSkill.ts"
            - "src/skills/builtin/MemorySkill.ts"
            - "src/skills/builtin/index.ts"
        - id: "p4-t3"
          description: "Add skill hot-reload with file watching"
          agent: "coder"
          dependencies: ["p4-t1"]
          estimated_time: "4h"
          priority: "medium"
          files:
            - "src/skills/HotReloader.ts"
            - "src/skills/SkillValidator.ts"
        - id: "p4-t4"
          description: "Create skill template generator"
          agent: "coder"
          dependencies: ["p4-t1"]
          estimated_time: "3h"
          priority: "low"
          files:
            - "src/skills/templates/skill-template.ts"
            - "src/skills/generator.ts"
    # ============================================================================
    # PHASE 5: Integrations (Week 5-6)
    # ============================================================================
    - name: "Phase 5: Integrations"
      duration: "1.5 weeks"
      description: "Slack, webhooks, and external service integrations"
      tasks:
        - id: "p5-t1"
          description: "Implement SlackAdapter with Socket Mode"
          agent: "coder"
          dependencies: ["p1-t3", "p4-t1"]
          estimated_time: "8h"
          priority: "high"
          files:
            - "src/integrations/slack/SlackAdapter.ts"
            - "src/integrations/slack/SlackEventHandler.ts"
            - "src/integrations/slack/SlackMessageFormatter.ts"
            - "src/integrations/slack/types.ts"
          dependencies_pkg:
            - "@slack/bolt"
            - "@slack/web-api"
        - id: "p5-t2"
          description: "Create WebhookServer for HTTP callbacks"
          agent: "coder"
          dependencies: ["p1-t3"]
          estimated_time: "5h"
          priority: "high"
          files:
            - "src/integrations/webhooks/WebhookServer.ts"
            - "src/integrations/webhooks/WebhookValidator.ts"
            - "src/integrations/webhooks/routes.ts"
        - id: "p5-t3"
          description: "Add Discord adapter"
          agent: "coder"
          dependencies: ["p5-t1"]
          estimated_time: "6h"
          priority: "medium"
          files:
            - "src/integrations/discord/DiscordAdapter.ts"
            - "src/integrations/discord/DiscordEventHandler.ts"
          dependencies_pkg:
            - "discord.js"
        - id: "p5-t4"
          description: "Create generic ChatAdapter interface"
          agent: "coder"
          dependencies: ["p5-t1", "p5-t3"]
          estimated_time: "3h"
          priority: "medium"
          files:
            - "src/integrations/ChatAdapter.ts"
            - "src/integrations/AdapterFactory.ts"
            - "src/integrations/types.ts"
# ============================================================================
# PHASE 6: API Layer (Week 6)
# ============================================================================
- name: "Phase 6: API Layer"
duration: "1 week"
description: "REST and GraphQL endpoints for external access"
tasks:
- id: "p6-t1"
description: "Create REST API server with Fastify"
agent: "coder"
dependencies: ["p1-t3", "p4-t1"]
estimated_time: "6h"
priority: "high"
files:
- "src/api/rest/server.ts"
- "src/api/rest/routes/chat.ts"
- "src/api/rest/routes/sessions.ts"
- "src/api/rest/routes/skills.ts"
- "src/api/rest/routes/health.ts"
- "src/api/rest/middleware/auth.ts"
- "src/api/rest/middleware/rateLimit.ts"
dependencies_pkg:
- "fastify"
- "@fastify/cors"
- "@fastify/rate-limit"
- id: "p6-t2"
description: "Add GraphQL API with subscriptions"
agent: "coder"
dependencies: ["p6-t1"]
estimated_time: "6h"
priority: "medium"
files:
- "src/api/graphql/schema.ts"
- "src/api/graphql/resolvers/chat.ts"
- "src/api/graphql/resolvers/sessions.ts"
- "src/api/graphql/subscriptions.ts"
dependencies_pkg:
- "mercurius"
- "graphql"
- id: "p6-t3"
description: "Implement OpenAPI spec generation"
agent: "coder"
dependencies: ["p6-t1"]
estimated_time: "3h"
priority: "low"
files:
- "src/api/openapi/generator.ts"
- "src/api/openapi/decorators.ts"
# ============================================================================
# PHASE 7: CLI & Distribution (Week 6-7)
# ============================================================================
- name: "Phase 7: CLI & Distribution"
duration: "1 week"
description: "CLI interface and npx distribution setup"
tasks:
- id: "p7-t1"
description: "Create CLI entry point with commander"
agent: "coder"
dependencies: ["p1-t3", "p5-t1", "p6-t1"]
estimated_time: "6h"
priority: "critical"
files:
- "bin/cli.js"
- "src/cli/index.ts"
- "src/cli/commands/start.ts"
- "src/cli/commands/config.ts"
- "src/cli/commands/skills.ts"
- "src/cli/commands/status.ts"
dependencies_pkg:
- "commander"
- "chalk"
- "ora"
- "inquirer"
- id: "p7-t2"
description: "Add local vs remote deployment modes"
agent: "coder"
dependencies: ["p7-t1"]
estimated_time: "4h"
priority: "high"
files:
- "src/cli/modes/local.ts"
- "src/cli/modes/remote.ts"
- "src/cli/modes/docker.ts"
- id: "p7-t3"
description: "Create configuration wizard"
agent: "coder"
dependencies: ["p7-t1"]
estimated_time: "4h"
priority: "medium"
files:
- "src/cli/wizard/ConfigWizard.ts"
- "src/cli/wizard/prompts.ts"
- id: "p7-t4"
description: "Build install script for curl | bash deployment"
agent: "coder"
dependencies: ["p7-t1"]
estimated_time: "3h"
priority: "medium"
files:
- "scripts/install.sh"
- "scripts/uninstall.sh"
- id: "p7-t5"
description: "Create Docker configuration"
agent: "coder"
dependencies: ["p7-t2"]
estimated_time: "3h"
priority: "medium"
files:
- "Dockerfile"
- "docker-compose.yml"
- ".dockerignore"
# ============================================================================
# PHASE 8: Testing & Documentation (Week 7-8)
# ============================================================================
- name: "Phase 8: Testing & Documentation"
duration: "1 week"
description: "Comprehensive testing and documentation"
tasks:
- id: "p8-t1"
description: "Integration tests for all modules"
agent: "tester"
dependencies: ["p7-t1"]
estimated_time: "8h"
priority: "high"
files:
- "tests/integration/bot.test.ts"
- "tests/integration/memory.test.ts"
- "tests/integration/skills.test.ts"
- "tests/integration/slack.test.ts"
- "tests/integration/api.test.ts"
- id: "p8-t2"
description: "E2E tests with real services"
agent: "tester"
dependencies: ["p8-t1"]
estimated_time: "6h"
priority: "medium"
files:
- "tests/e2e/full-flow.test.ts"
- "tests/e2e/slack-flow.test.ts"
- "tests/fixtures/"
- id: "p8-t3"
description: "Performance benchmarks"
agent: "tester"
dependencies: ["p8-t1"]
estimated_time: "4h"
priority: "medium"
files:
- "benchmarks/memory.bench.ts"
- "benchmarks/embeddings.bench.ts"
- "benchmarks/throughput.bench.ts"
# ============================================================================
# CRITICAL PATH
# ============================================================================
critical_path:
- "p1-t1" # Package init
- "p1-t2" # Core entities
- "p1-t3" # RuvBot class
- "p3-t1" # MemoryManager
- "p3-t2" # WASM embeddings
- "p4-t1" # SkillRegistry
- "p5-t1" # SlackAdapter
- "p7-t1" # CLI
# ============================================================================
# RISK ASSESSMENT
# ============================================================================
risks:
- id: "risk-1"
description: "WASM module compatibility issues across Node versions"
likelihood: "medium"
impact: "high"
mitigation: "Test on Node 18, 20, 22. Provide pure JS fallback for critical paths"
- id: "risk-2"
description: "Slack API rate limiting during high traffic"
likelihood: "medium"
impact: "medium"
mitigation: "Implement exponential backoff and message batching"
- id: "risk-3"
description: "Memory leaks in long-running bot instances"
likelihood: "medium"
impact: "high"
mitigation: "Add memory monitoring, implement LRU caches, periodic cleanup"
- id: "risk-4"
description: "Breaking changes in upstream @ruvector packages"
likelihood: "low"
impact: "high"
mitigation: "Pin specific versions, maintain compatibility layer"
- id: "risk-5"
description: "Vector index corruption on unexpected shutdown"
likelihood: "medium"
impact: "high"
mitigation: "WAL logging, periodic snapshots, automatic recovery"
# ============================================================================
# PACKAGE STRUCTURE
# ============================================================================
package_structure:
root: "npm/packages/ruvbot"
directories:
- path: "src/core"
purpose: "Domain entities and core types"
files:
- "entities/Agent.ts"
- "entities/Session.ts"
- "entities/Message.ts"
- "entities/Skill.ts"
- "types.ts"
- "errors.ts"
- "Result.ts"
- "BotConfig.ts"
- "BotState.ts"
- path: "src/infrastructure"
purpose: "Database, messaging, and worker infrastructure"
files:
- "storage/SessionStore.ts"
- "storage/adapters/SQLiteAdapter.ts"
- "storage/adapters/PostgresAdapter.ts"
- "messaging/MessageQueue.ts"
- "messaging/InMemoryQueue.ts"
- "messaging/RedisQueue.ts"
- "workers/WorkerPool.ts"
- "workers/Worker.ts"
- "workers/TaskScheduler.ts"
- "events/EventBus.ts"
- "health/HealthChecker.ts"
- path: "src/learning"
purpose: "WASM embeddings, vector memory, and ruvllm integration"
files:
- "memory/MemoryManager.ts"
- "memory/VectorIndex.ts"
- "embeddings/WasmEmbedder.ts"
- "embeddings/EmbeddingCache.ts"
- "llm/LLMOrchestrator.ts"
- "llm/ModelRouter.ts"
- "trajectory/TrajectoryRecorder.ts"
- "trajectory/PatternExtractor.ts"
- "retrieval/SemanticSearch.ts"
- path: "src/skills"
purpose: "Extensible skill system"
files:
- "SkillRegistry.ts"
- "SkillLoader.ts"
- "SkillContext.ts"
- "HotReloader.ts"
- "builtin/SearchSkill.ts"
- "builtin/SummarizeSkill.ts"
- "builtin/CodeSkill.ts"
- "builtin/MemorySkill.ts"
- path: "src/integrations"
purpose: "Slack, Discord, and webhook integrations"
files:
- "ChatAdapter.ts"
- "AdapterFactory.ts"
- "slack/SlackAdapter.ts"
- "slack/SlackEventHandler.ts"
- "discord/DiscordAdapter.ts"
- "webhooks/WebhookServer.ts"
- path: "src/api"
purpose: "REST and GraphQL endpoints"
files:
- "rest/server.ts"
- "rest/routes/chat.ts"
- "rest/routes/sessions.ts"
- "rest/routes/skills.ts"
- "graphql/schema.ts"
- "graphql/resolvers/*.ts"
- path: "src/cli"
purpose: "CLI interface"
files:
- "index.ts"
- "commands/start.ts"
- "commands/config.ts"
- "commands/skills.ts"
- "modes/local.ts"
- "modes/remote.ts"
- "wizard/ConfigWizard.ts"
- path: "bin"
purpose: "CLI entry point for npx"
files:
- "cli.js"
- path: "tests"
purpose: "Test suites"
files:
- "unit/**/*.test.ts"
- "integration/**/*.test.ts"
- "e2e/**/*.test.ts"
- path: "scripts"
purpose: "Installation and utility scripts"
files:
- "install.sh"
- "uninstall.sh"
# ============================================================================
# DEPENDENCIES
# ============================================================================
dependencies:
production:
core:
- name: "@ruvector/wasm-unified"
version: "^1.0.0"
purpose: "WASM embeddings and attention mechanisms"
- name: "@ruvector/ruvllm"
version: "^2.3.0"
purpose: "LLM orchestration with SONA learning"
- name: "@ruvector/postgres-cli"
version: "^0.2.6"
purpose: "PostgreSQL vector storage"
infrastructure:
- name: "better-sqlite3"
version: "^9.0.0"
purpose: "Local SQLite storage"
- name: "ioredis"
version: "^5.3.0"
purpose: "Redis message queue"
- name: "fastify"
version: "^4.24.0"
purpose: "REST API server"
integrations:
- name: "@slack/bolt"
version: "^3.16.0"
purpose: "Slack bot framework"
- name: "discord.js"
version: "^14.14.0"
purpose: "Discord integration"
optional: true
cli:
- name: "commander"
version: "^12.0.0"
purpose: "CLI framework"
- name: "chalk"
version: "^4.1.2"
purpose: "Terminal styling"
- name: "ora"
version: "^5.4.1"
purpose: "Terminal spinners"
- name: "inquirer"
version: "^9.2.0"
purpose: "Interactive prompts"
development:
- name: "typescript"
version: "^5.3.0"
- name: "tsup"
version: "^8.0.0"
purpose: "Build tool"
- name: "vitest"
version: "^1.1.0"
purpose: "Testing framework"
- name: "@types/node"
version: "^20.10.0"
# ============================================================================
# NPX DISTRIBUTION
# ============================================================================
npx_distribution:
package_name: "@ruvector/ruvbot"
binary_name: "ruvbot"
commands:
- command: "npx @ruvector/ruvbot init"
description: "Initialize RuvBot in current directory"
- command: "npx @ruvector/ruvbot start"
description: "Start bot in local mode"
- command: "npx @ruvector/ruvbot start --remote"
description: "Start bot connected to remote services"
- command: "npx @ruvector/ruvbot config"
description: "Interactive configuration wizard"
- command: "npx @ruvector/ruvbot skills list"
description: "List available skills"
- command: "npx @ruvector/ruvbot skills add <name>"
description: "Add a skill from registry"
- command: "npx @ruvector/ruvbot status"
description: "Show bot status and health"
install_script:
url: "https://get.ruvector.dev/ruvbot"
method: "curl -fsSL https://get.ruvector.dev/ruvbot | bash"
environment_variables:
required:
- name: "SLACK_BOT_TOKEN"
description: "Slack bot OAuth token"
- name: "SLACK_SIGNING_SECRET"
description: "Slack app signing secret"
- name: "SLACK_APP_TOKEN"
description: "Slack app-level token (xapp-), required when Socket Mode is enabled"
optional:
- name: "RUVBOT_PORT"
description: "HTTP server port"
default: "3000"
- name: "RUVBOT_LOG_LEVEL"
description: "Logging verbosity"
default: "info"
- name: "RUVBOT_STORAGE"
description: "Storage backend (sqlite|postgres|memory)"
default: "sqlite"
- name: "RUVBOT_MEMORY_PATH"
description: "Path for vector memory storage"
default: "./data/memory"
- name: "DATABASE_URL"
description: "PostgreSQL connection string"
- name: "REDIS_URL"
description: "Redis connection string"
- name: "ANTHROPIC_API_KEY"
description: "Anthropic API key for Claude"
- name: "OPENAI_API_KEY"
description: "OpenAI API key"
# ============================================================================
# CONFIGURATION FILES
# ============================================================================
config_files:
- name: "ruvbot.config.json"
purpose: "Main configuration file"
example: |
{
"name": "my-ruvbot",
"port": 3000,
"storage": {
"type": "sqlite",
"path": "./data/ruvbot.db"
},
"memory": {
"dimensions": 384,
"maxVectors": 100000,
"indexType": "hnsw"
},
"skills": {
"enabled": ["search", "summarize", "code", "memory"],
"custom": ["./skills/*.js"]
},
"integrations": {
"slack": {
"enabled": true,
"socketMode": true
}
}
}
- name: ".env"
purpose: "Environment variables"
example: |
SLACK_BOT_TOKEN=xoxb-xxx
SLACK_SIGNING_SECRET=xxx
SLACK_APP_TOKEN=xapp-xxx
ANTHROPIC_API_KEY=sk-ant-xxx
# ============================================================================
# MILESTONES
# ============================================================================
milestones:
- name: "M1: Core Bot"
date: "Week 2"
deliverables:
- "RuvBot class with lifecycle management"
- "Core entities (Agent, Session, Message)"
- "Basic unit tests"
- name: "M2: Infrastructure"
date: "Week 3"
deliverables:
- "Session persistence"
- "Message queue"
- "Worker pool"
- name: "M3: Learning"
date: "Week 4"
deliverables:
- "WASM embeddings working"
- "Vector memory with HNSW"
- "Semantic search"
- name: "M4: Skills & Integrations"
date: "Week 5"
deliverables:
- "Skill registry with built-in skills"
- "Slack integration working"
- name: "M5: API & CLI"
date: "Week 6"
deliverables:
- "REST API"
- "CLI with npx support"
- name: "M6: Production Ready"
date: "Week 8"
deliverables:
- "85%+ test coverage"
- "Performance benchmarks passing"
- "Published to npm"
# ============================================================================
# TEAM ALLOCATION
# ============================================================================
team_allocation:
agents:
- role: "architect"
tasks: ["p1-t2", "p3-t1", "p4-t1"]
focus: "System design and core architecture"
- role: "coder"
tasks: ["p1-t1", "p1-t3", "p2-*", "p3-*", "p4-*", "p5-*", "p6-*", "p7-*"]
focus: "Implementation"
- role: "tester"
tasks: ["p1-t5", "p8-*"]
focus: "Testing and quality assurance"
- role: "reviewer"
tasks: ["all"]
focus: "Code review and security"
# ============================================================================
# QUALITY GATES
# ============================================================================
quality_gates:
- name: "Unit Test Coverage"
threshold: ">= 80%"
tool: "vitest"
- name: "Type Coverage"
threshold: ">= 95%"
tool: "tsc --noEmit"
- name: "No High Severity Vulnerabilities"
threshold: "0 high/critical"
tool: "npm audit"
- name: "Performance Benchmarks"
thresholds:
- metric: "embedding_latency"
value: "< 50ms"
- metric: "vector_search_latency"
value: "< 10ms"
- metric: "message_throughput"
value: "> 1000 msg/s"

@@ -0,0 +1,172 @@
# ADR-001: RuvBot Architecture Overview
## Status
Accepted
## Date
2026-01-27
## Context
We need to build **RuvBot**, a Clawdbot-style personal AI assistant with a RuVector backend. The system must:
1. Provide a self-hosted, extensible AI assistant framework
2. Integrate with RuVector's WASM-based vector operations for SOTA learning
3. Support multi-tenancy for enterprise deployments
4. Enable long-running tasks via background workers
5. Integrate with messaging platforms (Slack, Discord, webhooks)
6. Distribute as an `npx` package with local/remote deployment options
## Decision
### High-Level Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│ RuvBot System │
├─────────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐│
│ │ REST API │ │ GraphQL │ │ Slack │ │ Webhooks ││
│ │ Endpoints │ │ Gateway │ │ Adapter │ │ Handler ││
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘│
│ │ │ │ │ │
│ ┌──────┴────────────────┴────────────────┴────────────────┴──────┐│
│ │ Message Router ││
│ └─────────────────────────────┬───────────────────────────────────┘│
│ │ │
│ ┌─────────────────────────────┴───────────────────────────────────┐│
│ │ Core Application Layer ││
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
│ │ │ AgentManager │ │SessionStore │ │ SkillRegistry│ ││
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
│ │ │MemoryManager │ │WorkerPool │ │ EventBus │ ││
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
│ └─────────────────────────────────────────────────────────────────┘│
│ │ │
│ ┌─────────────────────────────┴───────────────────────────────────┐│
│ │ Infrastructure Layer ││
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
│ │ │ RuVector │ │ PostgreSQL │ │ RuvLLM │ ││
│ │ │ WASM Engine │ │ + pgvector │ │ Inference │ ││
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││
│ │ │ agentic-flow │ │ SONA Learning│ │ HNSW Index │ ││
│ │ │ Workers │ │ System │ │ Memory │ ││
│ │ └──────────────┘ └──────────────┘ └──────────────┘ ││
│ └─────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────┘
```
### DDD Bounded Contexts
#### 1. Core Context
- **Agent**: The AI agent entity with identity, capabilities, and state
- **Session**: Conversation context with message history and metadata
- **Memory**: Vector-based memory with HNSW indexing
- **Skill**: Extensible capabilities (tools, commands, integrations)
#### 2. Infrastructure Context
- **Persistence**: PostgreSQL with RuVector extensions, pgvector
- **Messaging**: Event-driven message bus (Redis/in-memory)
- **Workers**: Background task processing via agentic-flow
#### 3. Integration Context
- **Slack**: Slack Bot API adapter
- **Webhooks**: Generic webhook handler
- **Providers**: LLM provider abstraction (Anthropic, OpenAI, etc.)
#### 4. Learning Context
- **Embeddings**: RuVector WASM vector operations
- **Training**: Trajectory learning, LoRA fine-tuning
- **Patterns**: Neural pattern storage and retrieval
### Technology Stack
| Layer | Technology | Purpose |
|-------|------------|---------|
| Runtime | Node.js 18+ | Primary runtime |
| Language | TypeScript (ESM) | Type-safe development |
| Vector Engine | @ruvector/wasm-unified | SIMD-optimized vectors |
| LLM Layer | @ruvector/ruvllm | SONA, LoRA, inference |
| Database | PostgreSQL + pgvector | Persistence + vectors |
| Workers | agentic-flow | Background processing |
| Testing | Vitest | Unit/Integration/E2E |
| CLI | Commander.js | npx distribution |
### Package Structure
```
npm/packages/ruvbot/
├── bin/ # CLI entry points
│ └── ruvbot.ts # npx ruvbot entry
├── src/
│ ├── core/ # Domain layer
│ │ ├── entities/ # Agent, Session, Memory, Skill
│ │ ├── services/ # AgentManager, SessionStore, etc.
│ │ └── events/ # Domain events
│ ├── infrastructure/ # Infrastructure layer
│ │ ├── persistence/ # PostgreSQL, SQLite adapters
│ │ ├── messaging/ # Event bus, message queue
│ │ └── workers/ # agentic-flow integration
│ ├── integrations/ # External integrations
│ │ ├── slack/ # Slack adapter
│ │ ├── webhooks/ # Webhook handlers
│ │ └── providers/ # LLM providers
│ ├── learning/ # Learning system
│ │ ├── embeddings/ # WASM vector ops
│ │ ├── training/ # LoRA, SONA
│ │ └── patterns/ # Pattern storage
│ └── api/ # API layer
│ ├── rest/ # REST endpoints
│ └── graphql/ # GraphQL schema
├── tests/
│ ├── unit/
│ ├── integration/
│ └── e2e/
├── docs/
│ └── adr/ # Architecture Decision Records
└── scripts/ # Build/deploy scripts
```
### Multi-Tenancy Strategy
1. **Database Level**: Row-Level Security (RLS) with tenant_id
2. **Application Level**: Tenant context middleware
3. **Memory Level**: Namespace isolation in vector storage
4. **Worker Level**: Tenant-scoped job queues
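
The application-level tenant context (item 2) could be carried with Node's `AsyncLocalStorage`, so downstream code can read the tenant without threading it through every call. This is a minimal sketch under that assumption; `TenantContext`, `runWithTenant`, `requireTenant`, and `vectorNamespace` are illustrative names, not identifiers from the RuvBot codebase.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical shape; the real context may carry roles, quotas, etc.
interface TenantContext {
  orgId: string;
  workspaceId: string;
}

const tenantStorage = new AsyncLocalStorage<TenantContext>();

// Run `fn` with the given tenant bound to the current async call chain.
function runWithTenant<T>(ctx: TenantContext, fn: () => T): T {
  return tenantStorage.run(ctx, fn);
}

// Fail closed: code that needs a tenant throws when none is bound.
function requireTenant(): TenantContext {
  const ctx = tenantStorage.getStore();
  if (!ctx) throw new Error('No tenant context bound to this request');
  return ctx;
}

// Namespace helper for vector storage (item 3), derived from the bound tenant.
function vectorNamespace(collection: string): string {
  const { orgId, workspaceId } = requireTenant();
  return `${orgId}/${workspaceId}/${collection}`;
}
```

A request middleware would wrap each handler in `runWithTenant(...)` after authentication; anything that runs outside such a scope fails instead of silently operating without a tenant.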
### Key Design Principles
1. **Self-Learning**: Every interaction improves the system via SONA
2. **WASM-First**: Use RuVector WASM for portable, fast vector ops
3. **Event-Driven**: Loose coupling via event bus
4. **Extensible**: Plugin architecture for skills and integrations
5. **Observable**: Built-in metrics and tracing
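
As an illustration of principle 3, a small typed event bus can be layered on Node's built-in `EventEmitter`. The event names and payload shapes below are assumptions made for the example; the ADR does not define RuvBot's actual event catalogue.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical event map; actual RuvBot events may differ.
interface BotEvents {
  'message.received': { sessionId: string; text: string };
  'skill.executed': { skill: string; durationMs: number };
}

class TypedEventBus {
  private emitter = new EventEmitter();

  // Subscribe to one event; the payload type follows from the event name.
  on<K extends keyof BotEvents>(event: K, handler: (payload: BotEvents[K]) => void): void {
    this.emitter.on(event, handler);
  }

  // Publish synchronously to all subscribers of `event`.
  emit<K extends keyof BotEvents>(event: K, payload: BotEvents[K]): void {
    this.emitter.emit(event, payload);
  }
}
```

Producers and consumers only share the event map, which is what keeps adapters, skills, and the learning system loosely coupled.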
## Consequences
### Positive
- Modular architecture enables independent scaling
- WASM integration provides consistent cross-platform performance
- Multi-tenancy from day one avoids later refactoring
- Self-learning improves over time with usage
### Negative
- Initial complexity is higher than with a monolithic approach
- Calls across the JS/WASM boundary carry some interop overhead
- Multi-tenancy adds complexity to all data operations
### Risks
- WASM performance in Node.js may vary by platform
- PostgreSQL dependency limits serverless options
- Background workers need careful monitoring
## Related ADRs
- ADR-002: Multi-tenancy Design
- ADR-003: Persistence Layer
- ADR-004: Background Workers
- ADR-005: Integration Layer
- ADR-006: WASM Integration
- ADR-007: Learning System
- ADR-008: Security Architecture

@@ -0,0 +1,873 @@
# ADR-002: Multi-tenancy Design
**Status:** Accepted
**Date:** 2026-01-27
**Decision Makers:** RuVector Architecture Team
**Technical Area:** Security, Data Architecture
---
## Context and Problem Statement
RuvBot must serve multiple organizations (tenants) and users within each organization while maintaining strict data isolation. A breach of tenant boundaries would:
1. Violate privacy and compliance requirements (GDPR, SOC2, HIPAA)
2. Expose sensitive business information
3. Destroy trust in the platform
4. Create legal liability
The multi-tenancy design must address:
- **Data Isolation**: No cross-tenant data access
- **Authentication**: Identity verification at multiple levels
- **Authorization**: Fine-grained permission control
- **Resource Limits**: Fair usage and cost allocation
- **Audit Trails**: Complete visibility into access patterns
---
## Decision Drivers
### Security Requirements
| Requirement | Criticality | Description |
|-------------|-------------|-------------|
| Zero cross-tenant leakage | Critical | No tenant can access another tenant's data |
| Row-level security | Critical | Database enforces isolation, not just application |
| Token-based auth | High | Stateless, revocable authentication |
| RBAC + ABAC | High | Role and attribute-based access control |
| Audit logging | High | All data access logged with tenant context |
### Operational Requirements
| Requirement | Target | Description |
|-------------|--------|-------------|
| Tenant provisioning | < 30s | New tenant setup time |
| User provisioning | < 5s | New user creation time |
| Quota enforcement | Real-time | Immediate limit enforcement |
| Data export | < 1h for 1GB | GDPR data portability |
| Data deletion | < 24h | GDPR right to erasure |
---
## Decision Outcome
### Adopt Hierarchical Multi-tenancy with RLS and JWT Claims
We implement a three-level hierarchy with PostgreSQL Row-Level Security (RLS) as the primary isolation mechanism.
```
+---------------------------+
| ORGANIZATION | Billing entity, security boundary
|---------------------------|
| id: UUID |
| name: string |
| plan: Plan |
| settings: OrgSettings |
| quotas: ResourceQuotas |
+-------------+-------------+
|
| 1:N
v
+---------------------------+
| WORKSPACE | Project/team boundary
|---------------------------|
| id: UUID |
| orgId: UUID (FK) |
| name: string |
| settings: WorkspaceSettings|
+-------------+-------------+
|
| 1:N
v
+---------------------------+
| USER | Individual identity
|---------------------------|
| id: UUID |
| workspaceId: UUID (FK) |
| email: string |
| roles: Role[] |
| preferences: Preferences |
+---------------------------+
```
---
## Tenant Isolation Layers
### Layer 1: Network Isolation
```
Internet
|
v
+---+---+
| WAF | Rate limiting, DDoS protection
+---+---+
|
v
+---+---+
| LB/TLS| TLS termination, tenant routing
+---+---+
|
+--------+--------+--------+
| | | |
+---v---+ +---v---+ +---v---+ +---v---+
| Org A | | Org B | | Org C | | Org D | Virtual host routing
+-------+ +-------+ +-------+ +-------+
```
### Layer 2: Authentication & Authorization
```typescript
// JWT token structure with tenant claims
interface RuvBotToken {
// Standard claims
sub: string; // User ID
iat: number; // Issued at
exp: number; // Expiration
// Tenant claims (always present)
org_id: string; // Organization ID
workspace_id: string; // Workspace ID
// Permission claims
roles: Role[]; // User roles
permissions: string[];// Explicit permissions
// Resource claims
quotas: {
sessions: number;
messages_per_day: number;
memory_mb: number;
};
}
// Role hierarchy
enum Role {
ORG_OWNER = 'org:owner',
ORG_ADMIN = 'org:admin',
WORKSPACE_ADMIN = 'workspace:admin',
MEMBER = 'member',
VIEWER = 'viewer',
API_KEY = 'api_key',
}
// Permission matrix
const PERMISSIONS: Record<Role, string[]> = {
'org:owner': ['*'],
'org:admin': ['org:read', 'org:write', 'workspace:*', 'user:*', 'billing:read'],
'workspace:admin': ['workspace:read', 'workspace:write', 'user:read', 'user:invite'],
'member': ['session:*', 'memory:read', 'memory:write', 'skill:execute'],
'viewer': ['session:read', 'memory:read'],
'api_key': ['session:create', 'session:read'],
};
```
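
The matrix relies on `*` and `resource:*` wildcards, but the ADR does not spell out how they are matched. One plausible reading, sketched below with illustrative function names, is that `*` grants everything and `resource:*` grants any action on that resource.

```typescript
// Returns true when a granted pattern covers the requested permission.
// Assumed rules: '*' grants everything; 'resource:*' grants any action on that resource.
function matches(granted: string, requested: string): boolean {
  if (granted === '*' || granted === requested) return true;
  if (granted.endsWith(':*')) {
    const prefix = granted.slice(0, -1); // drop '*', keep the trailing ':'
    return requested.startsWith(prefix);
  }
  return false;
}

// True when any granted pattern covers the requested permission.
function hasPermission(granted: readonly string[], requested: string): boolean {
  return granted.some((g) => matches(g, requested));
}

// The 'member' row of the matrix above, repeated so the sketch is self-contained.
const memberPermissions = ['session:*', 'memory:read', 'memory:write', 'skill:execute'];
```

Keeping the trailing `:` in the prefix matters: without it, `session:*` would also match an unrelated `sessions:read` permission.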
### Layer 3: Database Row-Level Security
```sql
-- Enable RLS on all tenant-scoped tables
ALTER TABLE conversations ENABLE ROW LEVEL SECURITY;
ALTER TABLE memories ENABLE ROW LEVEL SECURITY;
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
ALTER TABLE skills ENABLE ROW LEVEL SECURITY;
ALTER TABLE trajectories ENABLE ROW LEVEL SECURITY;
-- Create tenant context function
CREATE OR REPLACE FUNCTION current_tenant_id()
RETURNS UUID AS $$
BEGIN
RETURN current_setting('app.current_org_id', true)::UUID;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
CREATE OR REPLACE FUNCTION current_workspace_id()
RETURNS UUID AS $$
BEGIN
RETURN current_setting('app.current_workspace_id', true)::UUID;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
-- RLS policies for conversations
CREATE POLICY conversations_isolation ON conversations
FOR ALL
USING (org_id = current_tenant_id())
WITH CHECK (org_id = current_tenant_id());
-- RLS policies for memories (workspace-level)
CREATE POLICY memories_isolation ON memories
FOR ALL
USING (
org_id = current_tenant_id()
AND workspace_id = current_workspace_id()
);
-- Read-only policy for cross-workspace memory sharing
CREATE POLICY memories_shared_read ON memories
FOR SELECT
USING (
org_id = current_tenant_id()
AND is_shared = true
);
```
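
The policies read `app.current_org_id` and `app.current_workspace_id`, so the application has to set both before it queries. The sketch below shows one way to do that per transaction using PostgreSQL's `set_config(name, value, is_local)`. The `Queryable` interface and `withTenant` helper are assumptions for illustration; they stand in for whatever database client the project uses.

```typescript
// Minimal client abstraction (assumption); all calls must share one connection.
interface Queryable {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

async function withTenant<T>(
  client: Queryable,
  orgId: string,
  workspaceId: string,
  fn: (client: Queryable) => Promise<T>,
): Promise<T> {
  await client.query('BEGIN');
  try {
    // is_local = true scopes the settings to this transaction, so they
    // cannot leak to the next request that reuses a pooled connection.
    await client.query("SELECT set_config('app.current_org_id', $1, true)", [orgId]);
    await client.query("SELECT set_config('app.current_workspace_id', $1, true)", [workspaceId]);
    const result = await fn(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  }
}
```

Note that PostgreSQL table owners bypass RLS unless `FORCE ROW LEVEL SECURITY` is set, and superusers or roles with `BYPASSRLS` always bypass it, so the application should connect as a separate, unprivileged role.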
### Layer 4: Vector Store Isolation
```typescript
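// Namespace isolation in RuVector
// Namespace format: {org_id}/{workspace_id}/{collection}
// Example: "550e8400-e29b/.../episodic"
// Declared as an object (not an interface) because validateNamespace() below
// calls VectorNamespace.decode() at runtime.
const VectorNamespace = {
  encode(orgId: string, workspaceId: string, collection: string): string {
    return `${orgId}/${workspaceId}/${collection}`;
  },
  decode(namespace: string): { orgId: string; workspaceId: string; collection: string } {
    const [orgId, workspaceId, collection] = namespace.split('/');
    return { orgId, workspaceId, collection };
  },
  validate(namespace: string, token: RuvBotToken): boolean {
    const { orgId, workspaceId } = VectorNamespace.decode(namespace);
    return orgId === token.org_id && workspaceId === token.workspace_id;
  },
};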
// Vector store with tenant isolation
class TenantIsolatedVectorStore {
constructor(
private store: RuVectorAdapter,
private tenantContext: TenantContext
) {}
async search(query: Float32Array, options: SearchOptions): Promise<SearchResult[]> {
const namespace = this.getNamespace(options.collection);
// Validate namespace matches token claims
if (!this.validateNamespace(namespace)) {
throw new TenantIsolationError('Namespace mismatch');
}
return this.store.search(query, { ...options, namespace });
}
private getNamespace(collection: string): string {
return `${this.tenantContext.orgId}/${this.tenantContext.workspaceId}/${collection}`;
}
private validateNamespace(namespace: string): boolean {
const { orgId, workspaceId } = VectorNamespace.decode(namespace);
return (
orgId === this.tenantContext.orgId &&
workspaceId === this.tenantContext.workspaceId
);
}
}
```
---
## Data Partitioning Strategy
### PostgreSQL Partitioning
```sql
-- Partition conversations by org_id for isolation and performance
CREATE TABLE conversations (
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    org_id UUID NOT NULL,
    workspace_id UUID NOT NULL,
    session_id UUID NOT NULL,
    user_id UUID NOT NULL,
    content TEXT NOT NULL,
    role VARCHAR(20) NOT NULL,
    embedding_id UUID,
    metadata JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (org_id, id)
) PARTITION BY LIST (org_id);
-- Create partition per organization
CREATE OR REPLACE FUNCTION create_org_partition(org_id UUID)
RETURNS void AS $$
DECLARE
partition_name TEXT;
BEGIN
partition_name := 'conversations_' || replace(org_id::text, '-', '_');
EXECUTE format(
'CREATE TABLE IF NOT EXISTS %I PARTITION OF conversations FOR VALUES IN (%L)',
partition_name,
org_id
);
END;
$$ LANGUAGE plpgsql;
-- Indexes declared on the partitioned parent cascade to every partition.
-- (CREATE INDEX CONCURRENTLY is not supported on partitioned tables.)
CREATE INDEX conversations_session_idx
    ON conversations (session_id, created_at DESC);
CREATE INDEX conversations_user_idx
    ON conversations (user_id, created_at DESC);
CREATE INDEX conversations_embedding_idx
    ON conversations (embedding_id) WHERE embedding_id IS NOT NULL;
```
### Vector Store Partitioning
```typescript
// HNSW index per tenant for isolation and independent scaling
interface TenantVectorIndex {
orgId: string;
workspaceId: string;
collection: 'episodic' | 'semantic' | 'skills';
// Index configuration (can vary per tenant plan)
config: {
dimensions: number; // 384 for MiniLM, 1536 for larger models
m: number; // HNSW connections (16-32)
efConstruction: number; // Build quality (100-200)
efSearch: number; // Query quality (50-100)
};
// Usage metrics
metrics: {
vectorCount: number;
memoryUsageMB: number;
avgSearchLatencyMs: number;
lastOptimized: Date;
};
}
// Index lifecycle management
class TenantIndexManager {
async provisionTenant(orgId: string): Promise<void> {
// Create indices for the default workspace
await this.createIndex(orgId, 'default', 'episodic');
await this.createIndex(orgId, 'default', 'semantic');
await this.createIndex(orgId, 'default', 'skills');
}
async deleteTenant(orgId: string): Promise<void> {
// Delete all indices for org (GDPR deletion)
const indices = await this.listIndices(orgId);
await Promise.all(indices.map(idx => this.deleteIndex(idx.id)));
// Log deletion for audit
await this.auditLog.record({
action: 'tenant_deletion',
orgId,
indexCount: indices.length,
timestamp: new Date(),
});
}
async optimizeIndex(indexId: string): Promise<OptimizationResult> {
// Background optimization with tenant resource limits
const index = await this.getIndex(indexId);
const quota = await this.getQuota(index.orgId);
if (index.metrics.memoryUsageMB > quota.maxVectorMemoryMB) {
// Apply quantization to reduce memory
return this.compressIndex(indexId, 'product_quantization');
}
return this.rebalanceIndex(indexId);
}
}
```
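
The `memoryUsageMB` metric and the memory quota check in `optimizeIndex` imply some way of sizing an index. A rough estimate is sketched below, assuming float32 vectors and the per-element rule of thumb published by the hnswlib project (about `d * 4 + M * 2 * 4` bytes per vector). RuVector's real footprint may differ, and `estimateHnswMemoryMB` is an illustrative name.

```typescript
// Rough HNSW memory estimate in MiB.
// Assumption: float32 vectors plus about 2*M 4-byte neighbor links per element.
function estimateHnswMemoryMB(vectorCount: number, dimensions: number, m: number): number {
  const bytesPerVector = dimensions * 4; // float32 components
  const bytesPerLinks = m * 2 * 4;       // base-layer neighbor ids
  return (vectorCount * (bytesPerVector + bytesPerLinks)) / (1024 * 1024);
}

// Example inputs taken from values that appear in this document:
// 100,000 vectors, 384 dimensions, M = 16.
const estimateMB = estimateHnswMemoryMB(100_000, 384, 16);
console.log(`estimated index size: ${estimateMB.toFixed(1)} MiB`);
```

An estimate like this could help decide, before inserting, whether a tenant is about to exceed its plan's vector memory quota and needs quantization.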
---
## Authentication Flows
### OAuth2/OIDC Flow
```
+--------+ +--------+
| User | | IdP |
+---+----+ +---+----+
| |
| 1. Login request |
+--------------------------------------->|
| |
| 2. Redirect to IdP |
|<---------------------------------------+
| |
| 3. Authenticate + consent |
+--------------------------------------->|
| |
| 4. Auth code redirect |
|<---------------------------------------+
| |
| +--------+ |
| 5. Auth code | RuvBot | |
+------------------>| Auth | |
| +---+----+ |
| | |
| 6. Exchange code | |
| +--------------->|
| | |
| 7. ID + Access token | |
| |<---------------+
| | |
| 8. Create session, |
| issue RuvBot JWT |
|<----------------------+
| |
| 9. Authenticated |
+<----------------------+
```
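
Step 8 issues a RuvBot JWT once the IdP tokens have been exchanged. The sketch below shows HS256 signing and verification using only `node:crypto`; it is meant to make the token mechanics concrete, not to describe RuvBot's implementation. The trimmed `Claims` type and both function names are assumptions, and a production service would more likely use a maintained JWT library and asymmetric keys.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Claims mirror the RuvBotToken interface from Layer 2, trimmed for brevity.
interface Claims {
  sub: string;
  org_id: string;
  workspace_id: string;
  iat: number;
  exp: number;
}

const b64url = (input: string): string => Buffer.from(input, 'utf8').toString('base64url');

// Build header.payload.signature with an HMAC-SHA256 signature.
function signJwt(claims: Claims, secret: string): string {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const payload = b64url(JSON.stringify(claims));
  const signature = createHmac('sha256', secret).update(`${header}.${payload}`).digest('base64url');
  return `${header}.${payload}.${signature}`;
}

// Returns the claims when the signature is valid and the token is unexpired, else null.
function verifyJwt(token: string, secret: string, nowSeconds: number): Claims | null {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const expected = createHmac('sha256', secret).update(`${header}.${payload}`).digest();
  const actual = Buffer.from(signature, 'base64url');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8')) as Claims;
  return claims.exp > nowSeconds ? claims : null;
}
```

Because `org_id` and `workspace_id` live inside the signed payload, a client cannot switch tenants without invalidating the signature.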
### API Key Authentication
```typescript
import * as crypto from 'node:crypto';

// API key structure
interface APIKey {
id: string;
keyHash: string; // SHA-256 hash of actual key
prefix: string; // First 8 chars for identification
orgId: string;
workspaceId: string;
name: string;
permissions: string[];
rateLimit: RateLimitConfig;
expiresAt: Date | null;
lastUsedAt: Date | null;
createdBy: string;
createdAt: Date;
}
// API key validation middleware
async function validateAPIKey(req: Request): Promise<TenantContext> {
const authHeader = req.headers.authorization;
if (!authHeader?.startsWith('Bearer ')) {
throw new AuthenticationError('Missing authorization header');
}
const key = authHeader.slice(7);
const prefix = key.slice(0, 8);
const keyHash = crypto.createHash('sha256').update(key).digest('hex');
// Lookup by prefix, then verify hash (timing-safe)
const apiKey = await db.apiKeys.findByPrefix(prefix);
if (!apiKey || !crypto.timingSafeEqual(
Buffer.from(apiKey.keyHash),
Buffer.from(keyHash)
)) {
throw new AuthenticationError('Invalid API key');
}
// Check expiration
if (apiKey.expiresAt && apiKey.expiresAt < new Date()) {
throw new AuthenticationError('API key expired');
}
// Update last used (async, don't block)
db.apiKeys.updateLastUsed(apiKey.id).catch(console.error);
return {
orgId: apiKey.orgId,
workspaceId: apiKey.workspaceId,
userId: apiKey.createdBy,
roles: [Role.API_KEY],
permissions: apiKey.permissions,
};
}
```
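The validation middleware assumes that each stored record holds the key's first 8 characters and a SHA-256 hash of the full key. A minimal sketch of how such a key could be issued follows. `issueApiKey`, `hashKey`, and the 32-byte key length are assumptions made for illustration.

```typescript
import { createHash, randomBytes } from 'node:crypto';

// Hypothetical result of issuing a key; only prefix and keyHash would be stored.
interface IssuedKey {
  plaintext: string; // returned to the caller once, never persisted
  prefix: string;    // first 8 characters, used for lookup
  keyHash: string;   // hex SHA-256 of the full key, used for verification
}

function hashKey(key: string): string {
  return createHash('sha256').update(key).digest('hex');
}

function issueApiKey(): IssuedKey {
  // 32 random bytes rendered as 64 hex characters (length is an assumption).
  const plaintext = randomBytes(32).toString('hex');
  return {
    plaintext,
    prefix: plaintext.slice(0, 8),
    keyHash: hashKey(plaintext),
  };
}
```

Because the key is high-entropy random data, a single fast hash is adequate here; a password-style slow hash is not needed for this kind of secret.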
---
## Resource Quotas and Rate Limiting
### Quota Configuration
```typescript
// Plan-based quota tiers
interface ResourceQuotas {
// Session limits
maxConcurrentSessions: number;
maxSessionDurationMinutes: number;
maxTurnsPerSession: number;
// Memory limits
maxMemoriesPerWorkspace: number;
maxVectorStorageMB: number;
maxEmbeddingsPerDay: number;
// Compute limits
maxLLMTokensPerDay: number;
maxSkillExecutionsPerDay: number;
maxBackgroundJobsPerHour: number;
// Rate limits
requestsPerMinute: number;
requestsPerHour: number;
burstLimit: number;
}
const PLAN_QUOTAS: Record<Plan, ResourceQuotas> = {
free: {
maxConcurrentSessions: 2,
maxSessionDurationMinutes: 30,
maxTurnsPerSession: 50,
maxMemoriesPerWorkspace: 1000,
maxVectorStorageMB: 50,
maxEmbeddingsPerDay: 500,
maxLLMTokensPerDay: 10000,
maxSkillExecutionsPerDay: 100,
maxBackgroundJobsPerHour: 10,
requestsPerMinute: 20,
requestsPerHour: 500,
burstLimit: 5,
},
pro: {
maxConcurrentSessions: 10,
maxSessionDurationMinutes: 120,
maxTurnsPerSession: 500,
maxMemoriesPerWorkspace: 50000,
maxVectorStorageMB: 1000,
maxEmbeddingsPerDay: 10000,
maxLLMTokensPerDay: 500000,
maxSkillExecutionsPerDay: 5000,
maxBackgroundJobsPerHour: 200,
requestsPerMinute: 100,
requestsPerHour: 5000,
burstLimit: 20,
},
enterprise: {
maxConcurrentSessions: -1, // Unlimited
maxSessionDurationMinutes: -1,
maxTurnsPerSession: -1,
maxMemoriesPerWorkspace: -1,
maxVectorStorageMB: -1,
maxEmbeddingsPerDay: -1,
maxLLMTokensPerDay: -1,
maxSkillExecutionsPerDay: -1,
maxBackgroundJobsPerHour: -1,
requestsPerMinute: 500,
requestsPerHour: 20000,
burstLimit: 50,
},
};
```
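Since the enterprise tier uses `-1` to mean "unlimited", any quota check has to treat that sentinel specially instead of comparing against it. A small sketch, with `remainingQuota` and `isWithinQuota` as hypothetical helper names:

```typescript
// Sentinel used by the quota tables for "no limit".
const UNLIMITED = -1;

// Remaining budget for a quota; unlimited quotas report Infinity.
function remainingQuota(limit: number, used: number): number {
  if (limit === UNLIMITED) return Number.POSITIVE_INFINITY;
  return Math.max(0, limit - used);
}

// True when `requested` more units fit under the quota.
function isWithinQuota(limit: number, used: number, requested = 1): boolean {
  return remainingQuota(limit, used) >= requested;
}
```

Centralizing the sentinel check in one helper avoids scattered `limit < 0` comparisons that are easy to get wrong.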
### Rate Limiter Implementation
```typescript
// Sliding-window log rate limiter backed by a Redis sorted set
class TenantRateLimiter {
constructor(private redis: Redis) {}
async checkLimit(
tenantId: string,
action: string,
config: RateLimitConfig
): Promise<RateLimitResult> {
const key = `ratelimit:${tenantId}:${action}`;
const now = Date.now();
const windowMs = config.windowMs || 60000;
// Lua script for atomic rate limit check
const result = await this.redis.eval(`
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
local burst = tonumber(ARGV[4])
-- Remove expired entries
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
-- Count current requests
local count = redis.call('ZCARD', key)
-- Check burst limit (recent 1s)
local burstCount = redis.call('ZCOUNT', key, now - 1000, now)
if burstCount >= burst then
return {0, limit - count, burst - burstCount, 1000} -- retry delay in ms (relative, like the window branch)
end
if count >= limit then
local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
local retryAfter = oldest[2] + window - now
return {0, 0, burst - burstCount, retryAfter}
end
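-- Add current request. The member must be unique per request, so the caller
-- supplies an ID (math.random() can be seeded identically on each script run)
redis.call('ZADD', key, now, now .. ':' .. ARGV[5])
redis.call('PEXPIRE', key, window)
return {1, limit - count - 1, burst - burstCount - 1, 0}
`, 1, key, now, windowMs, config.limit, config.burstLimit, crypto.randomUUID());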
const [allowed, remaining, burstRemaining, retryAfter] = result as number[];
return {
allowed: allowed === 1,
remaining,
burstRemaining,
retryAfter: retryAfter > 0 ? Math.ceil(retryAfter / 1000) : 0,
limit: config.limit,
};
}
}
```
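The Lua script keeps one timestamp per request and counts those inside the window, with a separate one-second burst check. The same decision logic can be sketched in memory, which is handy for unit tests. `InMemorySlidingWindow` and its types are illustrative and not part of RuvBot.

```typescript
interface WindowConfig {
  limit: number;      // max requests per window
  burstLimit: number; // max requests in the most recent second
  windowMs: number;
}

interface Decision {
  allowed: boolean;
  remaining: number;
  retryAfterMs: number;
}

class InMemorySlidingWindow {
  private hits = new Map<string, number[]>();

  check(key: string, now: number, cfg: WindowConfig): Decision {
    // Drop timestamps that fell out of the window (the ZREMRANGEBYSCORE step).
    const recent = (this.hits.get(key) ?? []).filter(t => t > now - cfg.windowMs);
    this.hits.set(key, recent);

    // Burst check over the last second (the ZCOUNT step).
    const burstCount = recent.filter(t => t >= now - 1000).length;
    if (burstCount >= cfg.burstLimit) {
      return { allowed: false, remaining: cfg.limit - recent.length, retryAfterMs: 1000 };
    }
    if (recent.length >= cfg.limit) {
      // The oldest entry decides when a slot frees up.
      return { allowed: false, remaining: 0, retryAfterMs: recent[0] + cfg.windowMs - now };
    }
    recent.push(now);
    return { allowed: true, remaining: cfg.limit - recent.length, retryAfterMs: 0 };
  }
}
```

Passing `now` in explicitly keeps the sketch deterministic under test; the Redis version receives it as `ARGV[1]` for the same reason.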
---
## Audit Logging
```typescript
// Comprehensive audit trail
interface AuditEvent {
id: string;
timestamp: Date;
// Tenant context
orgId: string;
workspaceId: string;
userId: string;
// Event details
action: AuditAction;
resource: AuditResource;
resourceId: string;
// Request context
requestId: string;
ipAddress: string;
userAgent: string;
// Change tracking
before?: Record<string, unknown>;
after?: Record<string, unknown>;
// Outcome
status: 'success' | 'failure' | 'denied';
errorCode?: string;
errorMessage?: string;
}
type AuditAction =
| 'create' | 'read' | 'update' | 'delete'
| 'login' | 'logout' | 'token_refresh'
| 'export' | 'import'
| 'share' | 'unshare'
| 'invite' | 'remove'
| 'skill_execute' | 'memory_recall'
| 'quota_exceeded' | 'rate_limited';
type AuditResource =
| 'user' | 'session' | 'conversation'
| 'memory' | 'skill' | 'agent'
| 'workspace' | 'organization'
| 'api_key' | 'webhook';
// Audit logger with async persistence
class AuditLogger {
private buffer: AuditEvent[] = [];
private flushInterval: NodeJS.Timeout;
constructor(
private storage: AuditStorage,
private config: { batchSize: number; flushMs: number }
) {
// Catch here so a failed background flush is not an unhandled rejection
this.flushInterval = setInterval(() => {
this.flush().catch(console.error);
}, config.flushMs);
}
async log(event: Omit<AuditEvent, 'id' | 'timestamp'>): Promise<void> {
const fullEvent: AuditEvent = {
...event,
id: crypto.randomUUID(),
timestamp: new Date(),
};
this.buffer.push(fullEvent);
if (this.buffer.length >= this.config.batchSize) {
await this.flush();
}
}
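private async flush(): Promise<void> {
if (this.buffer.length === 0) return;
const events = this.buffer.splice(0, this.buffer.length);
try {
await this.storage.batchInsert(events);
} catch (error) {
// Put the batch back so a failed write does not drop audit events
this.buffer.unshift(...events);
throw error;
}
}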
async query(filter: AuditFilter): Promise<AuditEvent[]> {
// Ensure tenant isolation in queries
if (!filter.orgId) {
throw new Error('orgId required for audit queries');
}
return this.storage.query(filter);
}
}
```
---
## GDPR Compliance
### Data Export
```typescript
// Personal data export for GDPR Article 15
class DataExporter {
async exportUserData(
orgId: string,
userId: string
): Promise<DataExportResult> {
// `export` is a reserved word, so the payload uses a different name
const exportData = {
metadata: {
userId,
orgId,
exportedAt: new Date(),
format: 'json',
version: '1.0',
},
data: {} as Record<string, unknown>,
};
// Collect all user data across contexts
const [
profile,
sessions,
conversations,
memories,
preferences,
auditLogs,
] = await Promise.all([
this.exportProfile(userId),
this.exportSessions(userId),
this.exportConversations(userId),
this.exportMemories(userId),
this.exportPreferences(userId),
this.exportAuditLogs(userId),
]);
exportData.data = {
profile,
sessions,
conversations,
memories,
preferences,
auditLogs,
};
// Generate downloadable archive
const archivePath = await this.createArchive(exportData);
// Log export for audit
await this.auditLogger.log({
orgId,
workspaceId: '*',
userId,
action: 'export',
resource: 'user',
resourceId: userId,
status: 'success',
});
return {
downloadUrl: await this.generateSignedUrl(archivePath),
expiresAt: new Date(Date.now() + 24 * 60 * 60 * 1000), // 24h
sizeBytes: await this.getFileSize(archivePath),
};
}
}
```
### Data Deletion
```typescript
// Right to erasure (GDPR Article 17)
class DataDeleter {
async deleteUserData(
orgId: string,
userId: string,
options: DeletionOptions = {}
): Promise<DeletionResult> {
const jobId = crypto.randomUUID();
// Start deletion job (may take time for large datasets)
await this.jobQueue.enqueue('data-deletion', {
jobId,
orgId,
userId,
options,
});
return {
jobId,
status: 'pending',
estimatedCompletionTime: await this.estimateCompletionTime(userId),
};
}
async executeDeletion(job: DeletionJob): Promise<void> {
const { orgId, userId, options } = job.data;
// Order matters: delete dependent data first
const steps = [
{ name: 'sessions', fn: () => this.deleteSessions(userId) },
{ name: 'conversations', fn: () => this.deleteConversations(userId) },
{ name: 'memories', fn: () => this.deleteMemories(userId, options.preserveShared) },
{ name: 'embeddings', fn: () => this.deleteEmbeddings(userId) },
{ name: 'trajectories', fn: () => this.deleteTrajectories(userId) },
{ name: 'preferences', fn: () => this.deletePreferences(userId) },
{ name: 'audit_logs', fn: () => this.anonymizeAuditLogs(userId) }, // Anonymize, not delete
{ name: 'profile', fn: () => this.deleteProfile(userId) },
];
for (const step of steps) {
try {
const result = await step.fn();
await this.updateProgress(job.id, step.name, 'completed', result);
} catch (error) {
await this.updateProgress(job.id, step.name, 'failed', error);
throw error; // Fail job, require manual intervention
}
}
// Final audit entry (anonymized user reference)
await this.auditLogger.log({
orgId,
workspaceId: '*',
userId: 'DELETED_USER',
action: 'delete',
resource: 'user',
resourceId: userId.slice(0, 8) + '...',
status: 'success',
});
}
}
```
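The deletion steps anonymize audit logs rather than deleting them, so the trail survives without identifying the user. One way this could look is sketched below. The `AuditRecord` shape and the choice of which fields to scrub are assumptions; only the `DELETED_USER` marker comes from the code above.

```typescript
// Reduced, hypothetical view of an audit event for this sketch.
interface AuditRecord {
  userId: string;
  ipAddress: string;
  userAgent: string;
  action: string;
}

// Returns a scrubbed copy when the record belongs to the deleted user.
function anonymizeAuditRecord(rec: AuditRecord, deletedUserId: string): AuditRecord {
  if (rec.userId !== deletedUserId) return rec;
  return {
    ...rec,
    userId: 'DELETED_USER',
    ipAddress: '0.0.0.0', // placeholder value, an assumption
    userAgent: 'redacted',
  };
}
```

Keeping `action` and the other non-identifying fields intact preserves the record's value for security review.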
---
## Consequences
### Benefits
1. **Strong Isolation**: RLS + namespace isolation prevents cross-tenant access
2. **Compliance Ready**: GDPR export and erasure flows, plus audit logging that supports SOC 2 and HIPAA controls
3. **Scalable Quotas**: Per-tenant resource limits enable fair usage
4. **Audit Trail**: Complete visibility for security and compliance
5. **Flexible Auth**: OAuth2 + API keys support various use cases
### Risks and Mitigations
| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| RLS bypass via SQL injection | Low | Critical | Parameterized queries, ORM only |
| Token theft | Medium | High | Short expiry, refresh rotation |
| Quota gaming (multiple accounts) | Medium | Medium | Device fingerprinting, email verification |
| Audit log tampering | Low | High | Append-only storage, checksums |
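The "append-only storage, checksums" mitigation can be made concrete with a hash chain, where each record's hash covers the previous record's hash. The sketch below is one possible approach, not the documented RuvBot implementation; `ChainedRecord`, `appendRecord`, and `verifyChain` are hypothetical names.

```typescript
import { createHash } from 'node:crypto';

interface ChainedRecord {
  payload: string;  // serialized audit event
  prevHash: string; // hash of the preceding record
  hash: string;     // SHA-256 over prevHash + payload
}

// Starting value for the first record in a chain.
const GENESIS = '0'.repeat(64);

function digest(prevHash: string, payload: string): string {
  return createHash('sha256').update(prevHash).update(payload).digest('hex');
}

function appendRecord(chain: ChainedRecord[], payload: string): ChainedRecord[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS;
  return [...chain, { payload, prevHash, hash: digest(prevHash, payload) }];
}

// Any edited, removed, or reordered record breaks the chain from that point on.
function verifyChain(chain: ChainedRecord[]): boolean {
  let prev = GENESIS;
  for (const rec of chain) {
    if (rec.prevHash !== prev || rec.hash !== digest(rec.prevHash, rec.payload)) {
      return false;
    }
    prev = rec.hash;
  }
  return true;
}
```

A chain like this detects tampering but does not prevent it, so it complements append-only storage rather than replacing it.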
---
## Related Decisions
- **ADR-001**: Architecture Overview
- **ADR-003**: Persistence Layer (RLS implementation details)
---
## Revision History
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-01-27 | RuVector Architecture Team | Initial version |

# ADR-003: Persistence Layer
**Status:** Accepted
**Date:** 2026-01-27
**Decision Makers:** RuVector Architecture Team
**Technical Area:** Data Architecture, Storage
---
## Context and Problem Statement
RuvBot requires a persistence layer that handles diverse data types:
1. **Relational Data**: Users, organizations, sessions, skills (structured, transactional)
2. **Vector Data**: Embeddings for memory recall (high-dimensional, similarity search)
3. **Session State**: Active conversation context (ephemeral, fast access)
4. **Event Streams**: Audit logs, trajectories (append-only, time-series)
The persistence layer must support:
- **Multi-tenancy** with strict isolation
- **High performance** for real-time conversation
- **Durability** for compliance and recovery
- **Scalability** for enterprise deployments
---
## Decision Drivers
### Data Characteristics
| Data Type | Volume | Access Pattern | Consistency | Durability |
|-----------|--------|----------------|-------------|------------|
| User/Org metadata | Low | Read-heavy | Strong | Required |
| Session state | Medium | Read-write balanced | Eventual OK | Nice-to-have |
| Conversation history | High | Append-mostly | Strong | Required |
| Vector embeddings | Very High | Read-heavy | Eventual OK | Required |
| Memory indices | High | Read-heavy | Eventual OK | Nice-to-have |
| Audit logs | Very High | Append-only | Strong | Required |
### Performance Requirements
| Operation | Target Latency | Target Throughput |
|-----------|----------------|-------------------|
| Session lookup | < 5ms p99 | 10K/s |
| Memory recall (HNSW) | < 50ms p99 | 1K/s |
| Conversation insert | < 20ms p99 | 5K/s |
| Full-text search | < 100ms p99 | 500/s |
| Batch embedding insert | < 500ms p99 | 100 batches/s |
---
## Decision Outcome
### Adopt Polyglot Persistence with Unified API
We implement three specialized stores behind a single Persistence Gateway:
```
+---------------------------------------------------------------------------+
|                             PERSISTENCE LAYER                             |
+---------------------------------------------------------------------------+
                        +--------------------------+
                        |   Persistence Gateway    |
                        |      (Unified API)       |
                        +-------------+------------+
                                      |
              +-----------------------+-----------------------+
              |                       |                       |
    +---------v---------+   +---------v---------+   +---------v---------+
    |    PostgreSQL     |   |     RuVector      |   |       Redis       |
    |     (Primary)     |   |  (Vector Store)   |   |      (Cache)      |
    |-------------------|   |-------------------|   |-------------------|
    | - User/Org data   |   | - Embeddings      |   | - Session state   |
    | - Conversations   |   | - HNSW indices    |   | - Rate limits     |
    | - Skills config   |   | - Pattern store   |   | - Pub/Sub         |
    | - Audit logs      |   | - Similarity      |   | - Job queues      |
    | - RLS isolation   |   | - Learning data   |   | - Leaderboard     |
    +-------------------+   +-------------------+   +-------------------+
```
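The diagram amounts to a routing table from data kind to backing store. A minimal sketch of that mapping follows. The `DataKind` names and the `storeFor` helper are illustrative and do not describe the gateway's actual API.

```typescript
type Store = 'postgres' | 'ruvector' | 'redis';

// Data kinds taken from the three boxes in the diagram (names are assumptions).
type DataKind =
  | 'user' | 'conversation' | 'skill_config' | 'audit_log'
  | 'embedding' | 'pattern'
  | 'session_state' | 'rate_limit' | 'job_queue';

const STORE_FOR_KIND: Record<DataKind, Store> = {
  user: 'postgres',
  conversation: 'postgres',
  skill_config: 'postgres',
  audit_log: 'postgres',
  embedding: 'ruvector',
  pattern: 'ruvector',
  session_state: 'redis',
  rate_limit: 'redis',
  job_queue: 'redis',
};

// Resolves which store owns a given kind of data.
function storeFor(kind: DataKind): Store {
  return STORE_FOR_KIND[kind];
}
```

Typing the table as `Record<DataKind, Store>` makes the compiler flag any data kind that has no assigned store.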
---
## PostgreSQL Schema
### Core Tables
```sql
-- Extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pgcrypto";
CREATE EXTENSION IF NOT EXISTS "pg_trgm"; -- Trigram similarity (fuzzy text matching)
-- Organizations (tenant root)
CREATE TABLE organizations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(255) NOT NULL,
slug VARCHAR(100) NOT NULL UNIQUE,
plan VARCHAR(50) NOT NULL DEFAULT 'free',
settings JSONB NOT NULL DEFAULT '{}',
quotas JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX organizations_slug_idx ON organizations (slug);
-- Workspaces (project boundary)
CREATE TABLE workspaces (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
name VARCHAR(255) NOT NULL,
slug VARCHAR(100) NOT NULL,
settings JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (org_id, slug)
);
CREATE INDEX workspaces_org_idx ON workspaces (org_id);
-- Users
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
email VARCHAR(255) NOT NULL,
password_hash VARCHAR(255), -- NULL for OAuth users
display_name VARCHAR(255),
avatar_url VARCHAR(500),
roles TEXT[] NOT NULL DEFAULT '{"member"}',
preferences JSONB NOT NULL DEFAULT '{}',
email_verified_at TIMESTAMPTZ,
last_login_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (org_id, email)
);
CREATE INDEX users_org_idx ON users (org_id);
CREATE INDEX users_email_idx ON users (email);
-- Workspace memberships
CREATE TABLE workspace_memberships (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
role VARCHAR(50) NOT NULL DEFAULT 'member',
joined_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (workspace_id, user_id)
);
CREATE INDEX workspace_memberships_user_idx ON workspace_memberships (user_id);
```
### Session and Conversation Tables
```sql
-- Agents (bot configurations)
CREATE TABLE agents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
name VARCHAR(255) NOT NULL,
description TEXT,
persona JSONB NOT NULL DEFAULT '{}',
skill_ids UUID[] NOT NULL DEFAULT '{}',
memory_config JSONB NOT NULL DEFAULT '{}',
status VARCHAR(50) NOT NULL DEFAULT 'active',
version INTEGER NOT NULL DEFAULT 1,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
ALTER TABLE agents ENABLE ROW LEVEL SECURITY;
CREATE POLICY agents_isolation ON agents
FOR ALL USING (org_id = current_tenant_id());
CREATE INDEX agents_org_workspace_idx ON agents (org_id, workspace_id);
-- Sessions (conversation containers)
CREATE TABLE sessions (
id UUID NOT NULL DEFAULT gen_random_uuid(),
org_id UUID NOT NULL,
workspace_id UUID NOT NULL,
agent_id UUID NOT NULL REFERENCES agents(id) ON DELETE CASCADE,
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
channel VARCHAR(50) NOT NULL DEFAULT 'api', -- api, slack, webhook
channel_id VARCHAR(255), -- External channel identifier
state VARCHAR(50) NOT NULL DEFAULT 'active',
context_snapshot JSONB, -- Serialized context for recovery
turn_count INTEGER NOT NULL DEFAULT 0,
token_count INTEGER NOT NULL DEFAULT 0,
started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
last_active_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
expires_at TIMESTAMPTZ NOT NULL,
ended_at TIMESTAMPTZ,
-- A primary key on a partitioned table must include the partition key
PRIMARY KEY (org_id, id)
) PARTITION BY LIST (org_id);
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
CREATE POLICY sessions_isolation ON sessions
FOR ALL USING (org_id = current_tenant_id());
CREATE INDEX sessions_user_active_idx ON sessions (user_id, state)
WHERE state = 'active';
CREATE INDEX sessions_agent_idx ON sessions (agent_id);
CREATE INDEX sessions_expires_idx ON sessions (expires_at)
WHERE state = 'active';
-- Conversation turns
CREATE TABLE conversation_turns (
id UUID NOT NULL DEFAULT gen_random_uuid(),
org_id UUID NOT NULL,
workspace_id UUID NOT NULL,
session_id UUID NOT NULL,
user_id UUID NOT NULL,
role VARCHAR(20) NOT NULL, -- user, assistant, system, tool
content TEXT NOT NULL,
content_type VARCHAR(50) NOT NULL DEFAULT 'text',
embedding_id UUID, -- Reference to vector store
tool_calls JSONB, -- Function/skill calls
tool_results JSONB, -- Function/skill results
metadata JSONB NOT NULL DEFAULT '{}',
token_count INTEGER,
latency_ms INTEGER,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
-- A primary key on a partitioned table must include the partition key
PRIMARY KEY (org_id, id)
) PARTITION BY LIST (org_id);
ALTER TABLE conversation_turns ENABLE ROW LEVEL SECURITY;
CREATE POLICY turns_isolation ON conversation_turns
FOR ALL USING (org_id = current_tenant_id());
-- Composite index for session history queries
CREATE INDEX turns_session_time_idx ON conversation_turns (session_id, created_at DESC);
CREATE INDEX turns_embedding_idx ON conversation_turns (embedding_id)
WHERE embedding_id IS NOT NULL;
```
### Memory Tables
```sql
-- Memory entries (facts, events stored for recall)
CREATE TABLE memories (
id UUID NOT NULL DEFAULT gen_random_uuid(),
org_id UUID NOT NULL,
workspace_id UUID NOT NULL,
user_id UUID, -- NULL for workspace-level memories
memory_type VARCHAR(50) NOT NULL, -- episodic, semantic, procedural
content TEXT NOT NULL,
embedding_id UUID NOT NULL, -- Reference to vector store
source_type VARCHAR(50), -- conversation, import, skill
source_id UUID, -- Reference to source entity
importance FLOAT NOT NULL DEFAULT 0.5, -- 0-1 importance score
access_count INTEGER NOT NULL DEFAULT 0,
last_accessed_at TIMESTAMPTZ,
is_shared BOOLEAN NOT NULL DEFAULT FALSE,
expires_at TIMESTAMPTZ,
metadata JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
-- A primary key on a partitioned table must include the partition key
PRIMARY KEY (org_id, id)
) PARTITION BY LIST (org_id);
ALTER TABLE memories ENABLE ROW LEVEL SECURITY;
-- User-scoped memories
CREATE POLICY memories_user_isolation ON memories
FOR ALL USING (
org_id = current_tenant_id()
AND workspace_id = current_workspace_id()
AND (user_id = current_user_id() OR user_id IS NULL)
);
-- Shared memories (read-only, visible across all workspaces in the org)
CREATE POLICY memories_shared_read ON memories
FOR SELECT USING (
org_id = current_tenant_id()
AND is_shared = TRUE
);
CREATE INDEX memories_workspace_type_idx ON memories (workspace_id, memory_type);
CREATE INDEX memories_user_type_idx ON memories (user_id, memory_type)
WHERE user_id IS NOT NULL;
CREATE INDEX memories_embedding_idx ON memories (embedding_id);
CREATE INDEX memories_importance_idx ON memories (importance DESC);
CREATE INDEX memories_access_idx ON memories (last_accessed_at DESC);
-- Memory relationships (for graph traversal)
CREATE TABLE memory_edges (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id UUID NOT NULL,
source_memory_id UUID NOT NULL,
target_memory_id UUID NOT NULL,
edge_type VARCHAR(50) NOT NULL, -- related_to, caused_by, part_of, supersedes
weight FLOAT NOT NULL DEFAULT 1.0,
metadata JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
-- Composite keys: memories is keyed by (org_id, id) because it is partitioned by org_id
FOREIGN KEY (org_id, source_memory_id) REFERENCES memories (org_id, id) ON DELETE CASCADE,
FOREIGN KEY (org_id, target_memory_id) REFERENCES memories (org_id, id) ON DELETE CASCADE
);
ALTER TABLE memory_edges ENABLE ROW LEVEL SECURITY;
CREATE POLICY edges_isolation ON memory_edges
FOR ALL USING (org_id = current_tenant_id());
CREATE INDEX memory_edges_source_idx ON memory_edges (source_memory_id);
CREATE INDEX memory_edges_target_idx ON memory_edges (target_memory_id);
CREATE INDEX memory_edges_type_idx ON memory_edges (edge_type);
```
### Skills and Learning Tables
```sql
-- Skills (registered capabilities)
CREATE TABLE skills (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id UUID NOT NULL,
workspace_id UUID, -- NULL for org-wide skills
name VARCHAR(255) NOT NULL,
description TEXT,
version VARCHAR(50) NOT NULL DEFAULT '1.0.0',
triggers JSONB NOT NULL DEFAULT '[]',
parameters JSONB NOT NULL DEFAULT '{}',
implementation_type VARCHAR(50) NOT NULL, -- builtin, script, webhook
implementation JSONB NOT NULL, -- Type-specific config
hooks JSONB NOT NULL DEFAULT '{}',
is_enabled BOOLEAN NOT NULL DEFAULT TRUE,
usage_count INTEGER NOT NULL DEFAULT 0,
success_rate FLOAT,
avg_latency_ms FLOAT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
ALTER TABLE skills ENABLE ROW LEVEL SECURITY;
CREATE POLICY skills_isolation ON skills
FOR ALL USING (org_id = current_tenant_id());
CREATE INDEX skills_workspace_idx ON skills (workspace_id);
CREATE INDEX skills_enabled_idx ON skills (is_enabled) WHERE is_enabled = TRUE;
-- Trajectories (learning data)
CREATE TABLE trajectories (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id UUID NOT NULL,
workspace_id UUID NOT NULL,
session_id UUID NOT NULL,
turn_ids UUID[] NOT NULL,
skill_ids UUID[],
start_time TIMESTAMPTZ NOT NULL,
end_time TIMESTAMPTZ NOT NULL,
verdict VARCHAR(50), -- positive, negative, neutral, pending
verdict_reason TEXT,
metrics JSONB NOT NULL DEFAULT '{}',
embedding_id UUID,
is_exported BOOLEAN NOT NULL DEFAULT FALSE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
ALTER TABLE trajectories ENABLE ROW LEVEL SECURITY;
CREATE POLICY trajectories_isolation ON trajectories
FOR ALL USING (org_id = current_tenant_id());
CREATE INDEX trajectories_session_idx ON trajectories (session_id);
CREATE INDEX trajectories_verdict_idx ON trajectories (verdict)
WHERE verdict IS NOT NULL;
CREATE INDEX trajectories_export_idx ON trajectories (is_exported)
WHERE is_exported = FALSE;
-- Learned patterns
CREATE TABLE learned_patterns (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id UUID NOT NULL,
workspace_id UUID, -- NULL for org-wide patterns
pattern_type VARCHAR(50) NOT NULL, -- response, routing, skill_selection
embedding_id UUID NOT NULL,
exemplar_trajectory_ids UUID[] NOT NULL,
confidence FLOAT NOT NULL,
usage_count INTEGER NOT NULL DEFAULT 0,
success_count INTEGER NOT NULL DEFAULT 0,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
superseded_by UUID REFERENCES learned_patterns(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
ALTER TABLE learned_patterns ENABLE ROW LEVEL SECURITY;
CREATE POLICY patterns_isolation ON learned_patterns
FOR ALL USING (org_id = current_tenant_id());
CREATE INDEX patterns_type_idx ON learned_patterns (pattern_type);
CREATE INDEX patterns_active_idx ON learned_patterns (is_active)
WHERE is_active = TRUE;
CREATE INDEX patterns_embedding_idx ON learned_patterns (embedding_id);
```
---
## RuVector Integration
### Vector Store Adapter
```typescript
// Unified vector store interface
interface RuVectorAdapter {
// Index management
createIndex(config: IndexConfig): Promise<IndexHandle>;
deleteIndex(handle: IndexHandle): Promise<void>;
getIndex(namespace: string): Promise<IndexHandle | null>;
// Vector operations
insert(handle: IndexHandle, entries: VectorEntry[]): Promise<void>;
update(handle: IndexHandle, id: string, vector: Float32Array): Promise<void>;
delete(handle: IndexHandle, ids: string[]): Promise<void>;
// Search operations
search(handle: IndexHandle, query: Float32Array, options: SearchOptions): Promise<SearchResult[]>;
batchSearch(handle: IndexHandle, queries: Float32Array[], options: SearchOptions): Promise<SearchResult[][]>;
// Index operations
optimize(handle: IndexHandle): Promise<OptimizationResult>;
stats(handle: IndexHandle): Promise<IndexStats>;
}
interface IndexConfig {
namespace: string;
dimensions: number;
distanceMetric: 'cosine' | 'euclidean' | 'dot_product';
hnsw: {
m: number;
efConstruction: number;
efSearch: number;
};
quantization?: {
type: 'scalar' | 'product' | 'binary';
bits?: number;
};
}
interface VectorEntry {
id: string;
vector: Float32Array;
metadata?: Record<string, unknown>;
}
interface SearchResult {
id: string;
score: number;
metadata?: Record<string, unknown>;
}
```
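For tests and small datasets, the `search` contract can be checked against an exact brute-force scan before trusting an approximate HNSW index. The sketch below implements cosine similarity only. `bruteForceSearch` and its local types are illustrative and deliberately simpler than the adapter interface.

```typescript
interface Entry {
  id: string;
  vector: Float32Array;
}

interface Hit {
  id: string;
  score: number;
}

// Cosine similarity in [-1, 1]; zero vectors score 0 by convention here.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Exact top-k search: score every entry, filter by threshold, sort descending.
function bruteForceSearch(
  entries: Entry[],
  query: Float32Array,
  k: number,
  threshold = -1
): Hit[] {
  return entries
    .map(e => ({ id: e.id, score: cosineSimilarity(e.vector, query) }))
    .filter(h => h.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Comparing an index's results against this scan on a sample of queries gives a direct recall measurement for a chosen `efSearch`.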
### Namespace Schema
```typescript
// Vector namespace organization
const VECTOR_NAMESPACES = {
// Memory embeddings
EPISODIC: (orgId: string, workspaceId: string) =>
`${orgId}/${workspaceId}/memory/episodic`,
SEMANTIC: (orgId: string, workspaceId: string) =>
`${orgId}/${workspaceId}/memory/semantic`,
PROCEDURAL: (orgId: string, workspaceId: string) =>
`${orgId}/${workspaceId}/memory/procedural`,
// Conversation embeddings
CONVERSATIONS: (orgId: string, workspaceId: string) =>
`${orgId}/${workspaceId}/conversations`,
// Learning embeddings
TRAJECTORIES: (orgId: string, workspaceId: string) =>
`${orgId}/${workspaceId}/learning/trajectories`,
PATTERNS: (orgId: string, workspaceId: string) =>
`${orgId}/${workspaceId}/learning/patterns`,
// Skill embeddings (for intent matching)
SKILLS: (orgId: string) =>
`${orgId}/skills`,
};
// Index configuration per namespace type
const INDEX_CONFIGS: Record<string, Partial<IndexConfig>> = {
'memory/episodic': {
dimensions: 384,
distanceMetric: 'cosine',
hnsw: { m: 16, efConstruction: 100, efSearch: 50 },
},
'memory/semantic': {
dimensions: 384,
distanceMetric: 'cosine',
hnsw: { m: 32, efConstruction: 200, efSearch: 100 },
},
'conversations': {
dimensions: 384,
distanceMetric: 'cosine',
hnsw: { m: 16, efConstruction: 100, efSearch: 50 },
quantization: { type: 'scalar' }, // Compress for volume
},
'learning/patterns': {
dimensions: 384,
distanceMetric: 'cosine',
hnsw: { m: 32, efConstruction: 200, efSearch: 100 },
},
};
```
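`INDEX_CONFIGS` is keyed by namespace type, while the namespaces themselves carry organization and workspace prefixes, so a lookup has to strip those first. A sketch under the assumption that IDs never contain `/` follows. `namespaceKind`, `hnswFor`, and `DEFAULT_HNSW` are hypothetical, and the table repeats two entries from the configuration above to stay self-contained.

```typescript
interface HnswParams {
  m: number;
  efConstruction: number;
  efSearch: number;
}

const HNSW_BY_KIND: Record<string, HnswParams> = {
  'memory/episodic': { m: 16, efConstruction: 100, efSearch: 50 },
  'memory/semantic': { m: 32, efConstruction: 200, efSearch: 100 },
};

// Fallback for namespace types without an explicit entry (an assumption).
const DEFAULT_HNSW: HnswParams = { m: 16, efConstruction: 100, efSearch: 50 };

// `${orgId}/skills` has one ID segment; workspace-level namespaces have two.
function namespaceKind(namespace: string): string {
  const parts = namespace.split('/');
  return parts.length === 2 ? parts[1] : parts.slice(2).join('/');
}

function hnswFor(namespace: string): HnswParams {
  return HNSW_BY_KIND[namespaceKind(namespace)] ?? DEFAULT_HNSW;
}
```

An explicit default matters because several namespaces (procedural memory, trajectories, skills) have no entry in the configuration table.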
### WASM/Native Detection
```typescript
// Automatic runtime detection
class RuVectorFactory {
private static instance: RuVectorAdapter | null = null;
static async create(): Promise<RuVectorAdapter> {
if (this.instance) return this.instance;
// Try native first (better performance)
try {
const native = await import('@ruvector/core');
if (native.isNativeAvailable()) {
console.log('RuVector: Using native NAPI bindings');
this.instance = new NativeRuVectorAdapter(native);
return this.instance;
}
} catch (e) {
console.debug('Native bindings not available:', e);
}
// Fall back to WASM
try {
const wasm = await import('@ruvector/wasm');
console.log('RuVector: Using WASM runtime');
this.instance = new WasmRuVectorAdapter(wasm);
return this.instance;
} catch (e) {
throw new Error(`Failed to load RuVector runtime: ${e}`);
}
}
}
```
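The factory caches the resolved adapter, so two callers that arrive before the first import finishes would each load a runtime. Caching the promise instead closes that gap. The helper below is a generic sketch of that idea; `memoizeAsync` is a hypothetical name, not a RuVector API.

```typescript
// Wraps an async factory so concurrent callers share one in-flight promise.
function memoizeAsync<T>(factory: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending !== null) return pending;
    const created = factory().catch((err: unknown) => {
      pending = null; // allow a retry after a failed load
      throw err;
    });
    pending = created;
    return created;
  };
}
```

With this in place, the native-then-WASM detection would be written once as the factory function and exposed through the memoized wrapper.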
---
## Redis Schema
### Session Cache
```typescript
// Session state keys
const SESSION_KEYS = {
// Active session state
state: (sessionId: string) => `session:${sessionId}:state`,
// Context window (recent turns)
context: (sessionId: string) => `session:${sessionId}:context`,
// Session lock (prevent concurrent modifications)
lock: (sessionId: string) => `session:${sessionId}:lock`,
// User's active sessions
userSessions: (userId: string) => `user:${userId}:sessions`,
// Session expiry sorted set
expiryIndex: () => 'sessions:expiry',
};
// Session state structure
interface CachedSessionState {
id: string;
agentId: string;
userId: string;
state: SessionState;
turnCount: number;
tokenCount: number;
lastActiveAt: number;
expiresAt: number;
}
// Context window structure
interface CachedContextWindow {
maxTokens: number;
turns: Array<{
id: string;
role: string;
content: string;
createdAt: number;
}>;
retrievedMemoryIds: string[];
}
```
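`CachedContextWindow` carries a `maxTokens` budget, which implies that older turns are dropped once the budget is exceeded. One way to do that is sketched below. The 4-characters-per-token estimate is a rough assumption, and `trimToBudget` is an illustrative name.

```typescript
interface Turn {
  id: string;
  role: string;
  content: string;
  createdAt: number;
}

// Crude heuristic (assumption): roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keeps the most recent turns that fit in the budget, preserving order.
function trimToBudget(turns: Turn[], maxTokens: number): Turn[] {
  const kept: Turn[] = [];
  let used = 0;
  // Walk from newest to oldest so recent context survives.
  for (let i = turns.length - 1; i >= 0; i--) {
    const cost = estimateTokens(turns[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(turns[i]);
    used += cost;
  }
  return kept;
}
```

A real deployment would swap the estimator for the tokenizer of the target model, since character counts can be far off for code or non-English text.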
### Rate Limiting
```typescript
// Rate limit keys
const RATE_LIMIT_KEYS = {
// Per-tenant rate limits
tenant: (tenantId: string, action: string, window: string) =>
`ratelimit:tenant:${tenantId}:${action}:${window}`,
// Per-user rate limits
user: (userId: string, action: string, window: string) =>
`ratelimit:user:${userId}:${action}:${window}`,
// Global rate limits
global: (action: string, window: string) =>
`ratelimit:global:${action}:${window}`,
};
// Rate limit actions
type RateLimitAction =
| 'api_request'
| 'llm_call'
| 'embedding_request'
| 'memory_write'
| 'skill_execute'
| 'webhook_dispatch';
```
### Pub/Sub Channels
```typescript
// Real-time event channels
const PUBSUB_CHANNELS = {
// Session events
sessionCreated: (workspaceId: string) =>
`events:${workspaceId}:session:created`,
sessionEnded: (workspaceId: string) =>
`events:${workspaceId}:session:ended`,
// Conversation events
turnCreated: (sessionId: string) =>
`events:session:${sessionId}:turn:created`,
// Memory events
memoryCreated: (workspaceId: string) =>
`events:${workspaceId}:memory:created`,
memoryUpdated: (workspaceId: string) =>
`events:${workspaceId}:memory:updated`,
// Skill events
skillExecuted: (workspaceId: string) =>
`events:${workspaceId}:skill:executed`,
// System events
quotaWarning: (tenantId: string) =>
`events:${tenantId}:quota:warning`,
};
```
---
## Data Access Patterns
### Repository Pattern
```typescript
// Base repository with tenant context
abstract class TenantRepository<T> {
constructor(
protected db: PostgresAdapter,
protected tenantContext: TenantContext
) {}
protected async withTenantContext<R>(
fn: (db: PostgresAdapter) => Promise<R>
): Promise<R> {
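// Set tenant context for RLS. set_config(..., true) is transaction-local, so the
// settings and the tenant-scoped queries must share one transaction (and connection).
// Assumes the transaction handle exposes the same query API as PostgresAdapter.
return this.db.transaction(async (tx) => {
await tx.query(`
SELECT set_config('app.current_org_id', $1, true),
set_config('app.current_workspace_id', $2, true),
set_config('app.current_user_id', $3, true)
`, [
this.tenantContext.orgId,
this.tenantContext.workspaceId,
this.tenantContext.userId,
]);
return fn(tx);
});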
}
abstract findById(id: string): Promise<T | null>;
abstract save(entity: T): Promise<T>;
abstract delete(id: string): Promise<void>;
}
// Memory repository example
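class MemoryRepository extends TenantRepository<Memory> {
// Collaborators used below; the Embedder type is assumed, not defined in this ADR
constructor(
db: PostgresAdapter,
tenantContext: TenantContext,
private vectorStore: RuVectorAdapter,
private embedder: Embedder
) {
super(db, tenantContext);
}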
async findById(id: string): Promise<Memory | null> {
return this.withTenantContext(async (db) => {
const rows = await db.query<MemoryRow>(
'SELECT * FROM memories WHERE id = $1',
[id]
);
return rows[0] ? this.toEntity(rows[0]) : null;
});
}
async findByEmbedding(
embedding: Float32Array,
options: MemorySearchOptions
): Promise<MemoryWithScore[]> {
// Search vector store first
const vectorResults = await this.vectorStore.search(
this.getIndexHandle(),
embedding,
{ k: options.limit, threshold: options.minScore }
);
if (vectorResults.length === 0) return [];
// Fetch full memory records. Vector IDs are embedding IDs (see save()),
// so match on embedding_id rather than the memory's own id.
return this.withTenantContext(async (db) => {
const ids = vectorResults.map(r => r.id);
const scoreMap = new Map(vectorResults.map(r => [r.id, r.score]));
const rows = await db.query<MemoryRow>(
'SELECT * FROM memories WHERE embedding_id = ANY($1)',
[ids]
);
return rows
.map(row => ({
memory: this.toEntity(row),
score: scoreMap.get(row.embedding_id) ?? 0,
}))
.sort((a, b) => b.score - a.score);
});
}
async save(memory: Memory): Promise<Memory> {
return this.withTenantContext(async (db) => {
// Generate embedding if not present
if (!memory.embeddingId) {
const embedding = await this.embedder.embed(memory.content);
const embeddingId = crypto.randomUUID();
await this.vectorStore.insert(this.getIndexHandle(), [{
id: embeddingId,
vector: embedding,
metadata: { memoryId: memory.id },
}]);
memory.embeddingId = embeddingId;
}
// Upsert to database
const row = await db.query<MemoryRow>(`
INSERT INTO memories (
id, org_id, workspace_id, user_id, memory_type, content,
embedding_id, source_type, source_id, importance, metadata
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
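-- memories is partitioned by org_id, so its unique key (and this conflict target) includes org_id
ON CONFLICT (org_id, id) DO UPDATE SET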
content = EXCLUDED.content,
importance = EXCLUDED.importance,
metadata = EXCLUDED.metadata,
updated_at = NOW()
RETURNING *
`, [
memory.id,
this.tenantContext.orgId,
this.tenantContext.workspaceId,
memory.userId,
memory.type,
memory.content,
memory.embeddingId,
memory.sourceType,
memory.sourceId,
memory.importance,
memory.metadata,
]);
return this.toEntity(row[0]);
});
}
private getIndexHandle(memoryType: 'episodic' | 'semantic' = 'semantic'): IndexHandle {
// Pick the namespace from the memory type; VECTOR_NAMESPACES is not keyed by workspace ID
const toNamespace = memoryType === 'episodic'
? VECTOR_NAMESPACES.EPISODIC
: VECTOR_NAMESPACES.SEMANTIC;
return {
namespace: toNamespace(
this.tenantContext.orgId,
this.tenantContext.workspaceId
),
};
}
}
```
### Unit of Work Pattern
```typescript
// Transaction coordination
class UnitOfWork {
private operations: Operation[] = [];
private committed = false;
constructor(
private db: PostgresAdapter,
private vectorStore: RuVectorAdapter,
private cache: CacheAdapter
) {}
addMemory(memory: Memory): void {
this.operations.push({
type: 'memory',
action: 'upsert',
entity: memory,
});
}
addTurn(turn: ConversationTurn): void {
this.operations.push({
type: 'turn',
action: 'insert',
entity: turn,
});
}
async commit(): Promise<void> {
if (this.committed) throw new Error('Already committed');
try {
await this.db.transaction(async (tx) => {
// Execute database operations
for (const op of this.operations.filter(o => o.type !== 'cache')) {
await this.executeDbOperation(tx, op);
}
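// Execute vector operations after the DB writes but before the transaction commits,
// so a vector failure also aborts the DB transaction. The vector store itself is not
// transactional, hence the compensating rollback in the catch block below.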
for (const op of this.operations.filter(o =>
o.type === 'memory' || o.type === 'turn'
)) {
await this.executeVectorOperation(op);
}
});
// Execute cache operations (best effort)
for (const op of this.operations.filter(o => o.type === 'cache')) {
await this.executeCacheOperation(op).catch(console.error);
}
this.committed = true;
} catch (error) {
// Rollback vector operations on failure
await this.rollbackVectorOperations();
throw error;
}
}
}
```
---
## Migration Strategy
### Schema Migrations
```typescript
// Migration runner
class MigrationRunner {
async migrate(direction: 'up' | 'down' = 'up'): Promise<void> {
const migrations = await this.loadMigrations();
const applied = await this.getAppliedMigrations();
if (direction === 'up') {
const pending = migrations.filter(m => !applied.has(m.version));
for (const migration of pending) {
await this.applyMigration(migration);
}
} else {
const toRollback = [...applied].reverse();
for (const version of toRollback) {
const migration = migrations.find(m => m.version === version);
if (migration) {
await this.rollbackMigration(migration);
}
}
}
}
private async applyMigration(migration: Migration): Promise<void> {
await this.db.transaction(async (tx) => {
// Run migration SQL
await tx.query(migration.up);
// Record migration
await tx.query(
'INSERT INTO schema_migrations (version, applied_at) VALUES ($1, NOW())',
[migration.version]
);
});
console.log(`Applied migration: ${migration.version}`);
}
}
// Example migration
const MIGRATION_001: Migration = {
version: '001_initial_schema',
up: `
-- Create organizations table
CREATE TABLE organizations (...);
-- Create workspaces table
CREATE TABLE workspaces (...);
-- ... rest of schema
`,
down: `
DROP TABLE IF EXISTS workspaces;
DROP TABLE IF EXISTS organizations;
`,
};
```
---
## Consequences
### Benefits
1. **Strong Isolation**: RLS + namespace isolation at every layer
2. **Performance**: Optimized indices, caching, and partitioning
3. **Flexibility**: Polyglot persistence matches data characteristics
4. **Durability**: PostgreSQL for critical data, redundant vector storage
5. **Scalability**: Horizontal scaling via partitions and Redis cluster
### Trade-offs
| Benefit | Trade-off |
|---------|-----------|
| RLS security | Slight query overhead |
| HNSW speed | Memory consumption |
| Redis caching | Consistency complexity |
| Polyglot persistence | Operational complexity |
---
## Related Decisions
- **ADR-001**: Architecture Overview
- **ADR-002**: Multi-tenancy Design
- **ADR-006**: WASM Integration (vector store runtime)
---
## Revision History
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-01-27 | RuVector Architecture Team | Initial version |

File diff suppressed because it is too large


@@ -0,0 +1,907 @@
# ADR-005: Integration Layer
**Status:** Accepted
**Date:** 2026-01-27
**Decision Makers:** RuVector Architecture Team
**Technical Area:** Integrations, External Services
---
## Context and Problem Statement
RuvBot must integrate with external systems to:
1. **Receive messages** from Slack, webhooks, and other channels
2. **Send notifications** and responses back to users
3. **Connect to AI providers** for LLM inference and embeddings
4. **Interact with external APIs** for skill execution
5. **Provide webhooks** for third-party integrations
The integration layer must be:
- **Extensible** for new integration types
- **Resilient** to external service failures
- **Secure** with proper authentication and authorization
- **Observable** with logging and metrics
---
## Decision Drivers
### Integration Requirements
| Integration | Priority | Features Required |
|-------------|----------|-------------------|
| Slack | Critical | Events, commands, blocks, threads |
| REST Webhooks | Critical | Inbound/outbound, signatures |
| Anthropic Claude | Critical | Completions, streaming |
| OpenAI | High | Completions, embeddings |
| Custom LLMs | Medium | Provider abstraction |
| External APIs | Medium | HTTP client, retries |
### Reliability Requirements
| Requirement | Target |
|-------------|--------|
| Webhook delivery success | > 99% |
| Provider failover time | < 1s |
| Message ordering | Within session |
| Duplicate detection | 100% |
---
## Decision Outcome
### Adopt Adapter Pattern with Circuit Breaker
We implement the integration layer using:
1. **Adapter Pattern**: Common interface for each integration type
2. **Circuit Breaker**: Prevent cascade failures from external services
3. **Retry with Backoff**: Handle transient failures
4. **Event-Driven**: Decouple ingestion from processing
```
+-----------------------------------------------------------------------------+
| INTEGRATION LAYER |
+-----------------------------------------------------------------------------+
+---------------------------+
| Integration Gateway |
| (Protocol Normalization)|
+-------------+-------------+
|
+-----------------------+-----------------------+
| | |
+---------v---------+ +---------v---------+ +---------v---------+
| Slack Adapter | | Webhook Adapter | | Provider Adapter |
|-------------------| |-------------------| |-------------------|
| - Events API | | - Inbound routes | | - LLM clients |
| - Commands | | - Outbound queue | | - Embeddings |
| - Interactive | | - Signatures | | - Circuit breaker |
| - OAuth | | - Retries | | - Failover |
+-------------------+ +-------------------+ +-------------------+
| | |
+-----------------------+-----------------------+
|
+-------------v-------------+
| Event Normalizer |
| (Unified Message Format) |
+-------------+-------------+
|
+-------------v-------------+
| Core Context |
+---------------------------+
```
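The "Retry with Backoff" item in the list above is not shown in code elsewhere in this ADR. The following is a minimal sketch of one common variant (exponential backoff with full jitter), not the RuvBot implementation. `RetryOptions`, `backoffDelay`, and `withRetry` are hypothetical names, and the injectable `sleep` and `random` hooks exist only to make the helper testable.

```typescript
// Hypothetical helper; names and option shape are assumptions, not RuvBot APIs.
interface RetryOptions {
  attempts: number;     // total tries, including the first
  baseDelayMs: number;  // ceiling for the delay before the first retry
  maxDelayMs: number;   // upper bound on any single delay
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  random?: () => number;                 // injectable for tests, returns [0, 1)
}

// Delay before retry number `retry` (1-based) using full jitter:
// a uniform random value in [0, min(maxDelayMs, baseDelayMs * 2^(retry - 1))).
function backoffDelay(retry: number, opts: RetryOptions): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** (retry - 1));
  const random = opts.random ?? Math.random;
  return Math.floor(random() * ceiling);
}

// Runs `fn` until it resolves or the attempt budget is exhausted,
// then rethrows the last error.
async function withRetry<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < opts.attempts) {
        await sleep(backoffDelay(attempt, opts));
      }
    }
  }
  throw lastError;
}
```

Jitter spreads retries from many clients over time so that a recovering service is not hit by synchronized bursts. A real integration would likely also restrict retries to errors known to be transient.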
---
## Slack Integration
### Architecture
```typescript
// Slack integration components
interface SlackIntegration {
// Event handling
events: SlackEventHandler;
// Slash commands
commands: SlackCommandHandler;
// Interactive components (buttons, modals)
interactive: SlackInteractiveHandler;
// Block Kit builder
blocks: BlockKitBuilder;
// Web API client
client: SlackWebClient;
// OAuth flow
oauth: SlackOAuthHandler;
}
// Event types we handle
type SlackEventType =
| 'message'
| 'app_mention'
| 'reaction_added'
| 'reaction_removed'
| 'channel_created'
| 'member_joined_channel'
| 'file_shared'
| 'app_home_opened';
// Normalized event structure
interface SlackIncomingEvent {
type: SlackEventType;
teamId: string;
channelId: string;
userId: string;
text?: string;
threadTs?: string;
ts: string;
raw: unknown;
}
```
### Event Handler
```typescript
// Slack event processing
class SlackEventHandler {
private eventQueue: Queue<SlackIncomingEvent>;
private deduplicator: EventDeduplicator;
constructor(
private config: SlackConfig,
private sessionManager: SessionManager,
private agent: Agent
) {
this.eventQueue = new Queue('slack-events');
this.deduplicator = new EventDeduplicator({
ttl: 300000, // 5 minutes
keyFn: (e) => `${e.teamId}:${e.channelId}:${e.ts}`,
});
}
// Express middleware for Slack events
middleware(): RequestHandler {
return async (req, res) => {
// Verify Slack signature
if (!this.verifySignature(req)) {
return res.status(401).send('Invalid signature');
}
const body = req.body;
// Handle URL verification challenge
if (body.type === 'url_verification') {
return res.json({ challenge: body.challenge });
}
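// Acknowledge immediately (Slack 3s timeout)
res.status(200).send();
// Process event asynchronously. The response has already been sent, so errors
// must be handled here rather than propagated to Express.
this.handleEvent(body.event).catch((error) => {
this.logger.error('Slack event handling failed', { error });
});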
};
}
private async handleEvent(rawEvent: unknown): Promise<void> {
const event = this.normalizeEvent(rawEvent);
// Deduplicate (Slack may retry)
if (await this.deduplicator.isDuplicate(event)) {
this.logger.debug('Duplicate event ignored', { event });
return;
}
// Filter events we care about
if (!this.shouldProcess(event)) {
return;
}
// Map to tenant context
const tenant = await this.resolveTenant(event.teamId);
if (!tenant) {
this.logger.warn('Unknown Slack team', { teamId: event.teamId });
return;
}
// Enqueue for processing
await this.eventQueue.add('process', {
event,
tenant,
receivedAt: Date.now(),
});
}
private shouldProcess(event: SlackIncomingEvent): boolean {
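// Skip bot messages (`raw` is typed `unknown`, so narrow it before reading bot_id)
if ((event.raw as { bot_id?: string } | null | undefined)?.bot_id) return false;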
// Only process certain event types
return ['message', 'app_mention'].includes(event.type);
}
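private verifySignature(req: Request): boolean {
const timestamp = req.headers['x-slack-request-timestamp'];
const signature = req.headers['x-slack-signature'];
if (typeof timestamp !== 'string' || typeof signature !== 'string') {
return false;
}
// Prevent replay attacks (5 minute window); reject non-numeric timestamps,
// which would otherwise pass the comparison as NaN
const now = Math.floor(Date.now() / 1000);
const requestTime = Number.parseInt(timestamp, 10);
if (Number.isNaN(requestTime) || Math.abs(now - requestTime) > 300) {
return false;
}
const baseString = `v0:${timestamp}:${req.rawBody}`;
const expectedSignature = `v0=${crypto
.createHmac('sha256', this.config.signingSecret)
.update(baseString)
.digest('hex')}`;
// timingSafeEqual throws on length mismatch, so compare lengths first
const provided = Buffer.from(signature);
const expected = Buffer.from(expectedSignature);
return provided.length === expected.length &&
crypto.timingSafeEqual(provided, expected);
}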
}
```
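`EventDeduplicator` is used by the handler above but is not defined in this ADR. The following is a minimal single-process sketch of what such a component could look like, assuming only the `ttl` and `keyFn` options seen in the constructor call. The class name `InMemoryEventDeduplicator`, the synchronous `check` method, and the injectable `now` clock are hypothetical additions.

```typescript
// Hypothetical sketch; only `ttl` and `keyFn` come from the handler above.
interface DeduplicatorOptions<E> {
  ttl: number;                  // ms to remember a key
  keyFn: (event: E) => string;  // derives the identity of an event
  now?: () => number;           // injectable clock (assumption, for tests)
}

class InMemoryEventDeduplicator<E> {
  private readonly expiries = new Map<string, number>(); // key -> expiry (ms)
  private readonly now: () => number;

  constructor(private readonly options: DeduplicatorOptions<E>) {
    this.now = options.now ?? (() => Date.now());
  }

  // Returns true if the event key was recorded within the TTL;
  // otherwise records it and returns false.
  check(event: E): boolean {
    const current = this.now();
    this.evictExpired(current);
    const key = this.options.keyFn(event);
    if (this.expiries.has(key)) return true;
    this.expiries.set(key, current + this.options.ttl);
    return false;
  }

  // Async wrapper matching the `await deduplicator.isDuplicate(event)` call site.
  async isDuplicate(event: E): Promise<boolean> {
    return this.check(event);
  }

  // Linear scan on every check; acceptable for a sketch, not for high volume.
  private evictExpired(current: number): void {
    this.expiries.forEach((expiry, key) => {
      if (expiry <= current) this.expiries.delete(key);
    });
  }
}
```

An in-memory map only deduplicates within one process. With several handler instances behind a load balancer, the duplicate-detection target in the reliability table would need a shared store with atomic set-if-absent semantics.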
### Slash Commands
```typescript
// Slash command handling
class SlackCommandHandler {
private commands: Map<string, CommandDefinition> = new Map();
register(command: CommandDefinition): void {
this.commands.set(command.name, command);
}
middleware(): RequestHandler {
return async (req, res) => {
if (!this.verifySignature(req)) {
return res.status(401).send('Invalid signature');
}
const { command, text, user_id, channel_id, team_id, response_url } = req.body;
const commandDef = this.commands.get(command);
if (!commandDef) {
return res.json({
response_type: 'ephemeral',
text: `Unknown command: ${command}`,
});
}
// Parse arguments
const args = this.parseArgs(text, commandDef.argSchema);
// Acknowledge with loading state
res.json({
response_type: 'ephemeral',
text: 'Processing...',
});
try {
// Execute command
const result = await commandDef.handler({
args,
userId: user_id,
channelId: channel_id,
teamId: team_id,
});
// Send actual response
await this.sendResponse(response_url, {
response_type: result.public ? 'in_channel' : 'ephemeral',
blocks: result.blocks,
text: result.text,
});
} catch (error) {
await this.sendResponse(response_url, {
response_type: 'ephemeral',
text: `Error: ${(error as Error).message}`,
});
}
};
}
private parseArgs(text: string, schema: ArgSchema): Record<string, unknown> {
const args: Record<string, unknown> = {};
const parts = text.trim().split(/\s+/);
for (const [name, def] of Object.entries(schema)) {
if (def.positional !== undefined) {
args[name] = parts[def.positional];
} else if (def.flag) {
const flagIndex = parts.indexOf(`--${name}`);
if (flagIndex !== -1) {
args[name] = parts[flagIndex + 1] ?? true;
}
}
}
return args;
}
}
// Command definition
interface CommandDefinition {
name: string;
description: string;
argSchema: ArgSchema;
handler: (ctx: CommandContext) => Promise<CommandResult>;
}
// Example command
const askCommand: CommandDefinition = {
name: '/ask',
description: 'Ask RuvBot a question',
argSchema: {
question: { positional: 0, required: true },
context: { flag: true },
},
handler: async (ctx) => {
const session = await sessionManager.getOrCreate(ctx.userId, ctx.channelId);
const response = await agent.process(session, ctx.args.question as string);
return {
public: false,
text: response.content,
blocks: formatResponseBlocks(response),
};
},
};
```
### Block Kit Builder
```typescript
// Fluent Block Kit builder
class BlockKitBuilder {
private blocks: Block[] = [];
section(text: string): this {
this.blocks.push({
type: 'section',
text: { type: 'mrkdwn', text },
});
return this;
}
divider(): this {
this.blocks.push({ type: 'divider' });
return this;
}
context(...elements: string[]): this {
this.blocks.push({
type: 'context',
elements: elements.map(e => ({ type: 'mrkdwn', text: e })),
});
return this;
}
actions(actionId: string, buttons: Button[]): this {
this.blocks.push({
type: 'actions',
block_id: actionId,
elements: buttons.map(b => ({
type: 'button',
text: { type: 'plain_text', text: b.text },
action_id: b.actionId,
value: b.value,
style: b.style,
})),
});
return this;
}
input(label: string, actionId: string, options: InputOptions): this {
this.blocks.push({
type: 'input',
label: { type: 'plain_text', text: label },
element: {
type: 'plain_text_input',
action_id: actionId,
multiline: options.multiline,
placeholder: options.placeholder
? { type: 'plain_text', text: options.placeholder }
: undefined,
},
});
return this;
}
build(): Block[] {
return this.blocks;
}
}
// Usage example
const responseBlocks = new BlockKitBuilder()
.section('Here is what I found:')
.divider()
.section(responseText)
.context(`Generated in ${latencyMs}ms`)
.actions('feedback', [
{ text: 'Helpful', actionId: 'feedback_positive', value: responseId, style: 'primary' },
{ text: 'Not helpful', actionId: 'feedback_negative', value: responseId },
])
.build();
```
---
## Webhook Integration
### Inbound Webhooks
```typescript
// Inbound webhook configuration
interface WebhookEndpoint {
id: string;
path: string; // e.g., "/webhooks/github"
method: 'POST' | 'PUT';
secretKey?: string;
signatureHeader?: string;
signatureAlgorithm?: 'hmac-sha256' | 'hmac-sha1';
handler: WebhookHandler;
rateLimit?: RateLimitConfig;
}
class InboundWebhookRouter {
private endpoints: Map<string, WebhookEndpoint> = new Map();
register(endpoint: WebhookEndpoint): void {
this.endpoints.set(endpoint.path, endpoint);
}
middleware(): RequestHandler {
return async (req, res, next) => {
const endpoint = this.endpoints.get(req.path);
if (!endpoint) {
return next();
}
// Rate limiting
if (endpoint.rateLimit) {
const allowed = await this.rateLimiter.check(
`webhook:${endpoint.id}:${req.ip}`,
endpoint.rateLimit
);
if (!allowed) {
return res.status(429).json({ error: 'Rate limit exceeded' });
}
}
// Signature verification
if (endpoint.secretKey) {
if (!this.verifySignature(req, endpoint)) {
return res.status(401).json({ error: 'Invalid signature' });
}
}
try {
const result = await endpoint.handler({
body: req.body,
headers: req.headers,
query: req.query,
});
res.status(result.status ?? 200).json(result.body ?? { ok: true });
} catch (error) {
this.logger.error('Webhook handler error', { error, endpoint: endpoint.id });
res.status(500).json({ error: 'Internal error' });
}
};
}
private verifySignature(req: Request, endpoint: WebhookEndpoint): boolean {
const signatureHeader = endpoint.signatureHeader ?? 'x-signature';
const providedSignature = req.headers[signatureHeader.toLowerCase()];
if (typeof providedSignature !== 'string') return false;
const algorithm = endpoint.signatureAlgorithm ?? 'hmac-sha256';
const expectedSignature = crypto
.createHmac(algorithm.replace('hmac-', ''), endpoint.secretKey!)
.update(req.rawBody)
.digest('hex');
// Handle various signature formats
const normalizedProvided = providedSignature
.replace(/^sha256=/, '')
.replace(/^sha1=/, '');
// timingSafeEqual throws on length mismatch, so compare lengths first
const provided = Buffer.from(normalizedProvided);
const expected = Buffer.from(expectedSignature);
return provided.length === expected.length &&
crypto.timingSafeEqual(provided, expected);
}
}
```
### Outbound Webhooks
```typescript
// Outbound webhook delivery
class OutboundWebhookDispatcher {
constructor(
private queue: Queue<WebhookDelivery>,
private storage: WebhookStorage,
private http: HttpClient
) {}
async dispatch(
webhookId: string,
event: WebhookEvent,
options?: DispatchOptions
): Promise<string> {
const webhook = await this.storage.findById(webhookId);
if (!webhook || !webhook.isEnabled) {
throw new Error(`Webhook ${webhookId} not found or disabled`);
}
const deliveryId = crypto.randomUUID();
const payload = this.buildPayload(event, webhook);
const signature = this.sign(payload, webhook.secret);
// Queue for delivery
await this.queue.add(
'deliver',
{
deliveryId,
webhookId,
url: webhook.url,
payload,
signature,
headers: webhook.headers,
},
{
attempts: 10,
backoff: { type: 'exponential', delay: 1000 },
removeOnComplete: 100,
removeOnFail: 1000,
}
);
return deliveryId;
}
private buildPayload(event: WebhookEvent, webhook: Webhook): string {
return JSON.stringify({
id: crypto.randomUUID(),
type: event.type,
timestamp: new Date().toISOString(),
data: event.data,
webhook_id: webhook.id,
});
}
private sign(payload: string, secret: string): string {
const timestamp = Math.floor(Date.now() / 1000);
const signaturePayload = `${timestamp}.${payload}`;
const signature = crypto
.createHmac('sha256', secret)
.update(signaturePayload)
.digest('hex');
return `t=${timestamp},v1=${signature}`;
}
}
// Webhook event types
type WebhookEventType =
| 'session.created'
| 'session.ended'
| 'message.received'
| 'message.sent'
| 'memory.created'
| 'skill.executed'
| 'error.occurred';
interface WebhookEvent {
type: WebhookEventType;
data: Record<string, unknown>;
}
```
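The dispatcher above signs each delivery as `t=<unix seconds>,v1=<hex HMAC-SHA256 of "<t>.<payload>">`. The following sketch shows how a receiving service might verify that format. It is an illustration under assumptions: the function names, the 300-second tolerance, and the `nowSeconds` parameter are hypothetical, and the ADR does not say which HTTP header carries the signature.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Mirrors OutboundWebhookDispatcher.sign(), with the timestamp passed in
// explicitly so the example is deterministic.
function signPayload(payload: string, secret: string, timestamp: number): string {
  const digest = createHmac('sha256', secret)
    .update(`${timestamp}.${payload}`)
    .digest('hex');
  return `t=${timestamp},v1=${digest}`;
}

// Hypothetical receiver-side check. `toleranceSeconds` bounds replay attacks.
function verifyWebhookSignature(
  payload: string,
  header: string,
  secret: string,
  toleranceSeconds = 300,
  nowSeconds = Math.floor(Date.now() / 1000)
): boolean {
  // Parse "k=v,k=v" pairs; malformed parts are ignored.
  const fields = new Map<string, string>();
  for (const part of header.split(',')) {
    const eq = part.indexOf('=');
    if (eq > 0) fields.set(part.slice(0, eq), part.slice(eq + 1));
  }
  const timestamp = Number(fields.get('t'));
  const provided = fields.get('v1');
  if (!Number.isInteger(timestamp) || !provided) return false;
  if (Math.abs(nowSeconds - timestamp) > toleranceSeconds) return false;

  const expected = createHmac('sha256', secret)
    .update(`${timestamp}.${payload}`)
    .digest('hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  const a = Buffer.from(provided);
  const b = Buffer.from(expected);
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The receiver must verify against the raw request body exactly as received, since re-serializing parsed JSON can change the bytes and invalidate the signature.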
---
## LLM Provider Integration
### Provider Abstraction
```typescript
// Unified LLM provider interface
interface LLMProvider {
// Basic completion
complete(
messages: Message[],
options: CompletionOptions
): Promise<Completion>;
// Streaming completion
stream(
messages: Message[],
options: StreamOptions
): AsyncGenerator<Token, Completion, void>;
// Token counting
countTokens(text: string): Promise<number>;
// Model info
getModel(): ModelInfo;
// Health check
isHealthy(): Promise<boolean>;
}
interface CompletionOptions {
maxTokens?: number;
temperature?: number;
topP?: number;
stopSequences?: string[];
tools?: Tool[];
}
interface Completion {
content: string;
finishReason: 'stop' | 'length' | 'tool_use';
usage: {
inputTokens: number;
outputTokens: number;
};
toolCalls?: ToolCall[];
}
```
### Anthropic Claude Provider
```typescript
// Claude provider implementation
class ClaudeProvider implements LLMProvider {
private client: AnthropicClient;
private circuitBreaker: CircuitBreaker;
constructor(config: ClaudeConfig) {
this.client = new Anthropic({
apiKey: config.apiKey,
baseURL: config.baseURL,
});
this.circuitBreaker = new CircuitBreaker({
failureThreshold: 5,
successThreshold: 2, // required by CircuitBreakerConfig; the value here is illustrative
resetTimeout: 30000,
});
}
async complete(
messages: Message[],
options: CompletionOptions
): Promise<Completion> {
return this.circuitBreaker.execute(async () => {
const response = await this.client.messages.create({
model: 'claude-sonnet-4-20250514',
max_tokens: options.maxTokens ?? 1024,
temperature: options.temperature ?? 0.7,
messages: this.formatMessages(messages),
tools: options.tools?.map((tool) => this.formatTool(tool)),
});
return this.parseResponse(response);
});
}
async *stream(
messages: Message[],
options: StreamOptions
): AsyncGenerator<Token, Completion, void> {
const stream = await this.client.messages.stream({
model: 'claude-sonnet-4-20250514',
max_tokens: options.maxTokens ?? 1024,
temperature: options.temperature ?? 0.7,
messages: this.formatMessages(messages),
});
let fullContent = '';
let inputTokens = 0;
let outputTokens = 0;
for await (const event of stream) {
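if (event.type === 'content_block_delta') {
// Only text deltas carry `.text`; skip other delta kinds (e.g. tool input JSON)
if (event.delta.type !== 'text_delta') continue;
const text = event.delta.text;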
fullContent += text;
yield { type: 'text', text };
} else if (event.type === 'message_delta') {
outputTokens = event.usage?.output_tokens ?? 0;
} else if (event.type === 'message_start') {
inputTokens = event.message.usage?.input_tokens ?? 0;
}
}
return {
content: fullContent,
finishReason: 'stop',
usage: { inputTokens, outputTokens },
};
}
private formatMessages(messages: Message[]): AnthropicMessage[] {
return messages.map(m => ({
role: m.role === 'user' ? 'user' : 'assistant',
content: m.content,
}));
}
}
```
### Provider Registry with Failover
```typescript
// Multi-provider registry with automatic failover
class ProviderRegistry {
private providers: Map<string, LLMProvider> = new Map();
private primary: string;
private fallbacks: string[];
constructor(config: ProviderRegistryConfig) {
this.primary = config.primary;
this.fallbacks = config.fallbacks;
}
register(name: string, provider: LLMProvider): void {
this.providers.set(name, provider);
}
async complete(
messages: Message[],
options: CompletionOptions
): Promise<Completion> {
const providerOrder = [this.primary, ...this.fallbacks];
for (const providerName of providerOrder) {
const provider = this.providers.get(providerName);
if (!provider) continue;
try {
// Check health before using
if (await provider.isHealthy()) {
const result = await provider.complete(messages, options);
this.metrics.increment('provider.success', { provider: providerName });
return result;
}
} catch (error) {
this.logger.warn(`Provider ${providerName} failed`, { error });
this.metrics.increment('provider.failure', { provider: providerName });
}
}
throw new Error('All LLM providers unavailable');
}
async *stream(
messages: Message[],
options: StreamOptions
): AsyncGenerator<Token, Completion, void> {
const provider = this.providers.get(this.primary);
if (!provider) {
throw new Error(`Primary provider ${this.primary} not found`);
}
// Streaming doesn't support automatic failover (would be disruptive)
yield* provider.stream(messages, options);
}
}
```
---
## Circuit Breaker
```typescript
// Circuit breaker for external service protection
class CircuitBreaker {
private state: 'closed' | 'open' | 'half-open' = 'closed';
private failures = 0;
private lastFailureTime = 0;
private successesSinceHalfOpen = 0;
constructor(private config: CircuitBreakerConfig) {}
async execute<T>(fn: () => Promise<T>): Promise<T> {
if (this.state === 'open') {
if (Date.now() - this.lastFailureTime > this.config.resetTimeout) {
this.state = 'half-open';
this.successesSinceHalfOpen = 0;
} else {
throw new CircuitBreakerOpenError();
}
}
try {
const result = await fn();
this.onSuccess();
return result;
} catch (error) {
this.onFailure();
throw error;
}
}
private onSuccess(): void {
if (this.state === 'half-open') {
this.successesSinceHalfOpen++;
if (this.successesSinceHalfOpen >= this.config.successThreshold) {
this.state = 'closed';
this.failures = 0;
}
} else {
this.failures = 0;
}
}
private onFailure(): void {
this.failures++;
this.lastFailureTime = Date.now();
if (this.failures >= this.config.failureThreshold) {
this.state = 'open';
}
}
getState(): CircuitBreakerState {
return {
state: this.state,
failures: this.failures,
lastFailureTime: this.lastFailureTime,
};
}
}
interface CircuitBreakerConfig {
failureThreshold: number; // Failures before opening
successThreshold: number; // Successes in half-open to close
resetTimeout: number; // ms before trying half-open
}
```
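The transitions implemented by `CircuitBreaker` can also be read as a pure function over a state snapshot, which makes the timing behaviour easy to check without real timers. The following sketch restates the same rules in that form. The names `BreakerSnapshot`, `BreakerEvent`, `step`, and `replay` are hypothetical, and an `attempt` on a still-open breaker simply leaves the state unchanged here, whereas the class above throws `CircuitBreakerOpenError`.

```typescript
type Phase = 'closed' | 'open' | 'half-open';

interface BreakerSnapshot {
  phase: Phase;
  failures: number;
  successesSinceHalfOpen: number;
  lastFailureTime: number; // ms
}

interface BreakerConfig {
  failureThreshold: number; // failures before opening
  successThreshold: number; // successes in half-open before closing
  resetTimeout: number;     // ms before an open breaker may go half-open
}

type BreakerEvent =
  | { kind: 'attempt'; at: number }
  | { kind: 'success' }
  | { kind: 'failure'; at: number };

const CLOSED: BreakerSnapshot = {
  phase: 'closed',
  failures: 0,
  successesSinceHalfOpen: 0,
  lastFailureTime: 0,
};

function step(s: BreakerSnapshot, e: BreakerEvent, c: BreakerConfig): BreakerSnapshot {
  switch (e.kind) {
    case 'attempt':
      // An open breaker moves to half-open once resetTimeout has elapsed.
      if (s.phase === 'open' && e.at - s.lastFailureTime > c.resetTimeout) {
        return { ...s, phase: 'half-open', successesSinceHalfOpen: 0 };
      }
      return s;
    case 'success': {
      if (s.phase !== 'half-open') return { ...s, failures: 0 };
      const successes = s.successesSinceHalfOpen + 1;
      return successes >= c.successThreshold
        ? { ...s, phase: 'closed', failures: 0, successesSinceHalfOpen: successes }
        : { ...s, successesSinceHalfOpen: successes };
    }
    case 'failure': {
      // The failure count is not reset on entering half-open, so a single
      // failure there re-opens the breaker.
      const failures = s.failures + 1;
      const phase: Phase = failures >= c.failureThreshold ? 'open' : s.phase;
      return { ...s, phase, failures, lastFailureTime: e.at };
    }
  }
}

// Folds a sequence of events over the initial closed state.
function replay(events: BreakerEvent[], c: BreakerConfig): BreakerSnapshot {
  return events.reduce((s, e) => step(s, e, c), CLOSED);
}
```

Replaying three failures, an attempt after the reset timeout, and two successes walks the breaker through closed, open, half-open, and back to closed.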
---
## Consequences
### Benefits
1. **Unified Interface**: All integrations exposed through consistent APIs
2. **Resilience**: Circuit breakers and retries prevent cascade failures
3. **Extensibility**: Easy to add new providers and integrations
4. **Observability**: Comprehensive metrics and logging
5. **Security**: Proper signature verification and authentication
### Trade-offs
| Benefit | Trade-off |
|---------|-----------|
| Abstraction | Some provider-specific features hidden |
| Circuit breaker | Delayed recovery after incidents |
| Retry logic | Potential duplicate processing |
| Async processing | Eventually consistent state |
---
## Related Decisions
- **ADR-001**: Architecture Overview
- **ADR-004**: Background Workers (webhook delivery)
---
## Revision History
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-01-27 | RuVector Architecture Team | Initial version |


@@ -0,0 +1,775 @@
# ADR-006: WASM Integration
**Status:** Accepted
**Date:** 2026-01-27
**Decision Makers:** RuVector Architecture Team
**Technical Area:** Runtime, Performance
---
## Context and Problem Statement
RuvBot requires high-performance vector operations and ML inference for:
1. **Embedding generation** for memory storage and retrieval
2. **HNSW search** for semantic memory recall
3. **Pattern matching** for learned response optimization
4. **Quantization** for memory-efficient vector storage
The runtime must support:
- **Server-side Node.js** for API workloads
- **Edge deployments** (Cloudflare Workers, Vercel Edge)
- **Browser execution** for client-side features
- **Fallback paths** when WASM is unavailable
---
## Decision Drivers
### Performance Requirements
| Operation | Target Latency | Environment |
|-----------|----------------|-------------|
| Embed single text | < 10ms | WASM |
| Embed batch (32) | < 100ms | WASM |
| HNSW search k=10 | < 5ms | Native/WASM |
| Quantize vector | < 1ms | WASM |
| Pattern match | < 20ms | WASM |
### Compatibility Requirements
| Environment | WASM Support | Native Support |
|-------------|--------------|----------------|
| Node.js 18+ | Full | Full (NAPI) |
| Node.js 14-17 | Partial | Full (NAPI) |
| Cloudflare Workers | Full | None |
| Vercel Edge | Full | None |
| Browser (Chrome/FF/Safari) | Full | None |
| Deno | Full | Partial |
---
## Decision Outcome
### Adopt Hybrid WASM/Native Runtime with Automatic Detection
We implement a runtime abstraction that:
1. **Detects environment** at initialization
2. **Prefers native bindings** when available (2-5x faster)
3. **Falls back to WASM** universally
4. **Provides consistent API** regardless of backend
```
+-----------------------------------------------------------------------------+
| WASM INTEGRATION LAYER |
+-----------------------------------------------------------------------------+
+---------------------------+
| Runtime Detector |
+-------------+-------------+
|
+---------------------+---------------------+
| |
+-----------v-----------+ +-----------v-----------+
| Native Backend | | WASM Backend |
| (NAPI-RS) | | (wasm-bindgen) |
|-----------------------| |-----------------------|
| - @ruvector/core | | - @ruvector/wasm |
| - @ruvector/ruvllm | | - @ruvllm-wasm |
| - @ruvector/sona | | - @sona-wasm |
+-----------+-----------+ +-----------+-----------+
| |
+---------------------+---------------------+
|
+-------------v-------------+
| Unified API Surface |
| (RuVectorRuntime) |
+---------------------------+
```
---
## WASM Module Architecture
### Module Organization
```typescript
// WASM module types available
interface WasmModules {
// Vector operations
vectorOps: {
distance: (a: Float32Array, b: Float32Array, metric: DistanceMetric) => number;
batchDistance: (query: Float32Array, vectors: Float32Array[], metric: DistanceMetric) => Float32Array;
normalize: (vector: Float32Array) => Float32Array;
quantize: (vector: Float32Array, config: QuantizationConfig) => Uint8Array;
dequantize: (quantized: Uint8Array, config: QuantizationConfig) => Float32Array;
};
// HNSW index
hnsw: {
create: (config: HnswConfig) => HnswIndexHandle;
insert: (handle: HnswIndexHandle, id: string, vector: Float32Array) => void;
search: (handle: HnswIndexHandle, query: Float32Array, k: number) => SearchResult[];
delete: (handle: HnswIndexHandle, id: string) => boolean;
serialize: (handle: HnswIndexHandle) => Uint8Array;
deserialize: (data: Uint8Array) => HnswIndexHandle;
free: (handle: HnswIndexHandle) => void;
};
// Embeddings
embeddings: {
loadModel: (modelData: Uint8Array) => EmbeddingModelHandle; // WasmEmbedder passes the model bytes, not a path
embed: (handle: EmbeddingModelHandle, text: string) => Float32Array;
embedBatch: (handle: EmbeddingModelHandle, texts: string[]) => Float32Array[];
unloadModel: (handle: EmbeddingModelHandle) => void;
};
// Learning
learning: {
createPattern: (embedding: Float32Array, metadata: unknown) => PatternHandle;
matchPatterns: (query: Float32Array, patterns: PatternHandle[], threshold: number) => PatternMatch[];
trainLoRA: (trajectories: Trajectory[], config: LoRAConfig) => LoRAWeights;
applyEWC: (weights: ModelWeights, fisher: FisherMatrix, lambda: number) => ModelWeights;
};
}
```
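`QuantizationConfig` is not defined in this ADR, so the exact scheme behind `quantize` and `dequantize` is unspecified. As an illustration of the general idea, the following sketch implements plain 8-bit scalar quantization with a per-vector minimum and scale. The names `ScalarQuantized`, `quantizeScalar`, and `dequantizeScalar` are hypothetical and this is not necessarily what the WASM module does.

```typescript
// Hypothetical representation: one byte per dimension plus two floats.
interface ScalarQuantized {
  codes: Uint8Array; // values in 0..255
  min: number;       // value represented by code 0
  scale: number;     // value step between adjacent codes
}

function quantizeScalar(vector: Float32Array): ScalarQuantized {
  if (vector.length === 0) {
    return { codes: new Uint8Array(0), min: 0, scale: 1 };
  }
  let min = vector[0];
  let max = vector[0];
  for (let i = 1; i < vector.length; i++) {
    if (vector[i] < min) min = vector[i];
    if (vector[i] > max) max = vector[i];
  }
  // A constant vector has zero range; use scale 1 to avoid dividing by zero.
  const range = max - min;
  const scale = range === 0 ? 1 : range / 255;
  const codes = new Uint8Array(vector.length);
  for (let i = 0; i < vector.length; i++) {
    codes[i] = Math.round((vector[i] - min) / scale);
  }
  return { codes, min, scale };
}

function dequantizeScalar(q: ScalarQuantized): Float32Array {
  const out = new Float32Array(q.codes.length);
  for (let i = 0; i < q.codes.length; i++) {
    out[i] = q.min + q.codes[i] * q.scale;
  }
  return out;
}
```

This stores one byte per dimension instead of four, at the cost of a reconstruction error of at most about half a `scale` step per component.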
### Runtime Detection
```typescript
// Automatic runtime detection and initialization
class RuVectorRuntime {
private static instance: RuVectorRuntime | null = null;
private backend: 'native' | 'wasm' | 'js-fallback';
private modules: WasmModules | NativeModules;
private constructor() {}
static async initialize(): Promise<RuVectorRuntime> {
if (this.instance) return this.instance;
const runtime = new RuVectorRuntime();
await runtime.detectAndLoad();
this.instance = runtime;
return runtime;
}
private async detectAndLoad(): Promise<void> {
// Try native first (best performance)
if (await this.tryNative()) {
this.backend = 'native';
console.log('RuVector: Using native NAPI backend');
return;
}
// Try WASM
if (await this.tryWasm()) {
this.backend = 'wasm';
console.log('RuVector: Using WASM backend');
return;
}
// Fall back to pure JS (limited functionality)
this.backend = 'js-fallback';
console.warn('RuVector: Using JS fallback (limited performance)');
await this.loadJsFallback();
}
private async tryNative(): Promise<boolean> {
// Native only available in Node.js
if (typeof process === 'undefined' || !process.versions?.node) {
return false;
}
try {
const nativeModule = await import('@ruvector/core');
if (typeof nativeModule.isNativeAvailable === 'function' &&
nativeModule.isNativeAvailable()) {
this.modules = nativeModule;
return true;
}
} catch (e) {
console.debug('Native module not available:', e);
}
return false;
}
private async tryWasm(): Promise<boolean> {
try {
// Check WebAssembly support
if (typeof WebAssembly !== 'object') {
return false;
}
// Load WASM modules
const [vectorOps, hnsw, embeddings, learning] = await Promise.all([
import('@ruvector/wasm'),
import('@ruvector/wasm/hnsw'),
import('@ruvector/wasm/embeddings'),
import('@ruvector/wasm/learning'),
]);
// Initialize WASM modules
await Promise.all([
vectorOps.default(),
hnsw.default(),
embeddings.default(),
learning.default(),
]);
this.modules = {
vectorOps,
hnsw,
embeddings,
learning,
};
return true;
} catch (e) {
console.debug('WASM modules not available:', e);
return false;
}
}
private async loadJsFallback(): Promise<void> {
// Pure JS implementations (slower but always work)
const { JsFallbackModules } = await import('./js-fallback');
this.modules = new JsFallbackModules();
}
getBackend(): 'native' | 'wasm' | 'js-fallback' {
return this.backend;
}
getModules(): WasmModules | NativeModules {
if (!this.modules) {
throw new Error('RuVector runtime not initialized');
}
return this.modules;
}
}
```
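The `./js-fallback` module loaded by `loadJsFallback()` is not shown in this ADR. The following sketch shows how the `distance` and `batchDistance` operations from `WasmModules.vectorOps` might look in pure TypeScript. The `DistanceMetric` values are an assumption (the type is not defined here), and only cosine and Euclidean distance are covered.

```typescript
// Assumed metric names; the ADR does not define DistanceMetric.
type DistanceMetric = 'cosine' | 'euclidean';

function distance(a: Float32Array, b: Float32Array, metric: DistanceMetric): number {
  if (a.length !== b.length) {
    throw new RangeError(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  let squaredDiff = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
    const d = a[i] - b[i];
    squaredDiff += d * d;
  }
  if (metric === 'euclidean') return Math.sqrt(squaredDiff);
  // Cosine distance = 1 - cosine similarity. A zero vector has no direction,
  // so it is treated as maximally dissimilar here (a design choice of this sketch).
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 1 : 1 - dot / denominator;
}

// Distances from one query to many vectors, in input order.
function batchDistance(
  query: Float32Array,
  vectors: Float32Array[],
  metric: DistanceMetric
): Float32Array {
  const out = new Float32Array(vectors.length);
  for (let i = 0; i < vectors.length; i++) {
    out[i] = distance(query, vectors[i], metric);
  }
  return out;
}
```

A fallback like this keeps the API usable everywhere, but it scans vectors linearly and will not approach the latency targets listed for the WASM and native backends.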
---
## Embedding Engine
### WASM Embedder
```typescript
// WASM-based embedding engine
class WasmEmbedder {
private modelHandle: EmbeddingModelHandle | null = null;
private modelPath: string;
private dimensions: number;
private runtime!: RuVectorRuntime; // assigned in initialize()
private modelCache = new ModelCache(); // used by loadModelData()
constructor(config: EmbedderConfig) {
this.modelPath = config.modelPath;
this.dimensions = config.dimensions ?? 384;
}
async initialize(): Promise<void> {
this.runtime = await RuVectorRuntime.initialize();
const { embeddings } = this.runtime.getModules();
// Load model (downloads and caches if needed)
const modelData = await this.loadModelData();
this.modelHandle = embeddings.loadModel(modelData);
}
async embed(text: string): Promise<Float32Array> {
if (!this.modelHandle) {
throw new Error('Embedder not initialized');
}
const { embeddings } = this.runtime.getModules();
return embeddings.embed(this.modelHandle, text);
}
async embedBatch(texts: string[]): Promise<Float32Array[]> {
if (!this.modelHandle) {
throw new Error('Embedder not initialized');
}
const { embeddings } = this.runtime.getModules();
// Process in chunks to avoid OOM
const chunkSize = 32;
const results: Float32Array[] = [];
for (let i = 0; i < texts.length; i += chunkSize) {
const chunk = texts.slice(i, i + chunkSize);
const chunkResults = embeddings.embedBatch(this.modelHandle, chunk);
results.push(...chunkResults);
}
return results;
}
getDimensions(): number {
return this.dimensions;
}
async dispose(): Promise<void> {
if (this.modelHandle) {
const { embeddings } = this.runtime.getModules();
embeddings.unloadModel(this.modelHandle);
this.modelHandle = null;
}
}
private async loadModelData(): Promise<Uint8Array> {
// Check cache first
const cached = await this.modelCache.get(this.modelPath);
if (cached) return cached;
// Download model
const response = await fetch(this.modelPath);
if (!response.ok) {
throw new Error(`Failed to download model: HTTP ${response.status}`);
}
const buffer = await response.arrayBuffer();
const data = new Uint8Array(buffer);
// Cache for future use
await this.modelCache.set(this.modelPath, data);
return data;
}
}
```
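The chunking used by `embedBatch` does not depend on the embedding backend, so it can be sketched and tested on its own. The helper below is hypothetical (`processInChunks` is not a RuVector API), and the callback stands in for `embeddings.embedBatch`.

```typescript
// Hypothetical helper (not part of RuVector): apply a batch function to
// fixed-size slices of the input and concatenate the results in order.
function processInChunks<T, R>(
  items: T[],
  chunkSize: number,
  processChunk: (chunk: T[]) => R[],
): R[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const results: R[] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    // Push one result at a time so large chunks never hit argument-count limits
    for (const result of processChunk(chunk)) {
      results.push(result);
    }
  }
  return results;
}

// Stand-in for embeddings.embedBatch: records chunk sizes, returns text lengths.
const seenChunkSizes: number[] = [];
const lengths = processInChunks(['a', 'bb', 'ccc', 'dddd', 'eeeee'], 2, (chunk) => {
  seenChunkSizes.push(chunk.length);
  return chunk.map((text) => text.length);
});
console.log(seenChunkSizes, lengths);
```

The chunk size bounds peak memory per call; results keep the input order because chunks are processed sequentially.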
### Model Cache
```typescript
// Cross-environment model cache
class ModelCache {
private memoryCache: Map<string, Uint8Array> = new Map();
async get(key: string): Promise<Uint8Array | null> {
// Check memory cache first
if (this.memoryCache.has(key)) {
return this.memoryCache.get(key)!;
}
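// Try persistent cache (environment-specific)
let data: Uint8Array | null = null;
if (typeof caches !== 'undefined') {
// Browser/Cloudflare Cache API
data = await this.getFromCacheAPI(key);
} else if (typeof process !== 'undefined' && process.versions?.node) {
// Node.js file system cache
data = await this.getFromFileCache(key);
}
// Promote persistent hits so later lookups are served from memory
if (data) this.memoryCache.set(key, data);
return data;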
}
async set(key: string, data: Uint8Array): Promise<void> {
// Always store in memory
this.memoryCache.set(key, data);
// Persist to appropriate cache
if (typeof caches !== 'undefined') {
await this.setToCacheAPI(key, data);
} else if (typeof process !== 'undefined' && process.versions?.node) {
await this.setToFileCache(key, data);
}
}
private async getFromCacheAPI(key: string): Promise<Uint8Array | null> {
try {
const cache = await caches.open('ruvector-models');
const response = await cache.match(key);
if (response) {
const buffer = await response.arrayBuffer();
return new Uint8Array(buffer);
}
} catch (e) {
console.debug('Cache API error:', e);
}
return null;
}
private async setToCacheAPI(key: string, data: Uint8Array): Promise<void> {
try {
const cache = await caches.open('ruvector-models');
const response = new Response(data, {
headers: { 'Content-Type': 'application/octet-stream' },
});
await cache.put(key, response);
} catch (e) {
console.debug('Cache API error:', e);
}
}
private async getFromFileCache(key: string): Promise<Uint8Array | null> {
const fs = await import('fs/promises');
const path = await import('path');
const os = await import('os');
const cacheDir = path.join(os.homedir(), '.ruvector', 'models');
const cachePath = path.join(cacheDir, await this.keyToFilename(key));
try {
const data = await fs.readFile(cachePath);
return new Uint8Array(data);
} catch (e) {
// A missing or unreadable file is treated as a cache miss
return null;
}
}
private async setToFileCache(key: string, data: Uint8Array): Promise<void> {
const fs = await import('fs/promises');
const path = await import('path');
const os = await import('os');
const cacheDir = path.join(os.homedir(), '.ruvector', 'models');
await fs.mkdir(cacheDir, { recursive: true });
const cachePath = path.join(cacheDir, await this.keyToFilename(key));
await fs.writeFile(cachePath, data);
}
private async keyToFilename(key: string): Promise<string> {
// Dynamic import (like fs/path/os above) keeps Node-only modules out of browser
// bundles and avoids require(), which is unavailable in ES modules
const crypto = await import('crypto');
return crypto.createHash('sha256').update(key).digest('hex').slice(0, 32);
}
}
```
---
## HNSW Index WASM Wrapper
```typescript
// WASM-based HNSW index
class WasmHnswIndex {
private handle: HnswIndexHandle | null = null;
private runtime: RuVectorRuntime;
private config: HnswConfig;
private vectorCount = 0;
constructor(config: HnswConfig) {
this.config = config;
}
async initialize(): Promise<void> {
this.runtime = await RuVectorRuntime.initialize();
const { hnsw } = this.runtime.getModules();
this.handle = hnsw.create(this.config);
}
async insert(id: string, vector: Float32Array): Promise<void> {
if (!this.handle) throw new Error('Index not initialized');
// Validate dimensions
if (vector.length !== this.config.dimensions) {
throw new Error(`Vector dimension mismatch: ${vector.length} vs ${this.config.dimensions}`);
}
const { hnsw } = this.runtime.getModules();
hnsw.insert(this.handle, id, vector);
this.vectorCount++;
}
async insertBatch(entries: Array<{ id: string; vector: Float32Array }>): Promise<void> {
if (!this.handle) throw new Error('Index not initialized');
// Validate every entry first so a bad vector cannot leave the batch half-inserted
for (const entry of entries) {
if (entry.vector.length !== this.config.dimensions) {
throw new Error(`Vector dimension mismatch for ${entry.id}`);
}
}
const { hnsw } = this.runtime.getModules();
for (const entry of entries) {
hnsw.insert(this.handle, entry.id, entry.vector);
this.vectorCount++;
}
}
async search(query: Float32Array, k: number): Promise<SearchResult[]> {
if (!this.handle) throw new Error('Index not initialized');
if (query.length !== this.config.dimensions) {
throw new Error(`Query dimension mismatch: ${query.length}`);
}
const { hnsw } = this.runtime.getModules();
return hnsw.search(this.handle, query, Math.min(k, this.vectorCount));
}
async delete(id: string): Promise<boolean> {
if (!this.handle) throw new Error('Index not initialized');
const { hnsw } = this.runtime.getModules();
const deleted = hnsw.delete(this.handle, id);
if (deleted) this.vectorCount--;
return deleted;
}
async serialize(): Promise<Uint8Array> {
if (!this.handle) throw new Error('Index not initialized');
const { hnsw } = this.runtime.getModules();
return hnsw.serialize(this.handle);
}
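async deserialize(data: Uint8Array): Promise<void> {
const { hnsw } = this.runtime.getModules();
// Free existing handle if any
if (this.handle) {
hnsw.free(this.handle);
}
this.handle = hnsw.deserialize(data);
// NOTE: vectorCount is not restored from the serialized data. Because search()
// clamps k to vectorCount, the count must also be restored (for example, by
// persisting it alongside the index) before searching a deserialized index.
}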
getStats(): IndexStats {
return {
vectorCount: this.vectorCount,
dimensions: this.config.dimensions,
m: this.config.m,
efConstruction: this.config.efConstruction,
efSearch: this.config.efSearch,
backend: this.runtime.getBackend(),
};
}
async dispose(): Promise<void> {
if (this.handle) {
const { hnsw } = this.runtime.getModules();
hnsw.free(this.handle);
this.handle = null;
}
}
}
interface HnswConfig {
dimensions: number;
m: number; // Max connections per node per layer
efConstruction: number; // Build-time exploration factor
efSearch: number; // Query-time exploration factor
distanceMetric: 'cosine' | 'euclidean' | 'dot_product';
}
interface SearchResult {
id: string;
score: number;
}
interface IndexStats {
vectorCount: number;
dimensions: number;
m: number;
efConstruction: number;
efSearch: number;
backend: 'native' | 'wasm' | 'js-fallback';
}
```
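HNSW search is approximate, so it helps to have an exact baseline to check results against. The sketch below is a brute-force cosine search plus a recall measure. The names `exactSearch` and `recallAtK` are illustrative and not part of the wrapper above, and it assumes that a higher `score` means a closer match.

```typescript
interface ScoredId {
  id: string;
  score: number; // cosine similarity; higher is closer (assumption)
}

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Exact k-nearest-neighbour search: O(n * d) per query, used only as ground truth.
function exactSearch(
  entries: Array<{ id: string; vector: Float32Array }>,
  query: Float32Array,
  k: number,
): ScoredId[] {
  return entries
    .map((entry) => ({ id: entry.id, score: cosineSimilarity(entry.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, Math.max(0, k));
}

// Fraction of the exact top-k ids that the approximate result also returned.
function recallAtK(approxIds: string[], exactIds: string[]): number {
  if (exactIds.length === 0) return 1;
  const truth = new Set(exactIds);
  let hits = 0;
  for (const id of approxIds) {
    if (truth.has(id)) hits++;
  }
  return hits / exactIds.length;
}

const demoEntries = [
  { id: 'a', vector: new Float32Array([1, 0]) },
  { id: 'b', vector: new Float32Array([0, 1]) },
  { id: 'c', vector: new Float32Array([1, 1]) },
];
const exactTop2 = exactSearch(demoEntries, new Float32Array([1, 0]), 2);
console.log(exactTop2.map((r) => r.id));
```

Comparing `recallAtK` across `efSearch` values is one way to choose the query-time exploration factor for a given dataset.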
---
## Memory Management
### WASM Memory Pooling
```typescript
// Efficient memory management for WASM
class WasmMemoryPool {
private pools: Map<number, Float32Array[]> = new Map();
private maxPoolSize = 100;
// Get or create a Float32Array of specified length
acquire(length: number): Float32Array {
const pool = this.pools.get(length);
if (pool && pool.length > 0) {
return pool.pop()!;
}
return new Float32Array(length);
}
// Return array to pool for reuse
release(array: Float32Array): void {
const length = array.length;
let pool = this.pools.get(length);
if (!pool) {
pool = [];
this.pools.set(length, pool);
}
if (pool.length < this.maxPoolSize) {
// Zero out for security
array.fill(0);
pool.push(array);
}
// Otherwise let GC handle it
}
// Clear pools when memory pressure detected
clear(): void {
this.pools.clear();
}
getStats(): PoolStats {
const stats: PoolStats = { totalArrays: 0, totalBytes: 0, pools: {} };
for (const [length, pool] of this.pools) {
stats.pools[length] = pool.length;
stats.totalArrays += pool.length;
stats.totalBytes += pool.length * length * 4; // 4 bytes per float32
}
return stats;
}
}
// Usage in embedder
class PooledWasmEmbedder extends WasmEmbedder {
private pool = new WasmMemoryPool();
async embed(text: string): Promise<Float32Array> {
const result = await super.embed(text);
// Copy to pooled array
const pooled = this.pool.acquire(result.length);
pooled.set(result);
return pooled;
}
releaseEmbedding(embedding: Float32Array): void {
this.pool.release(embedding);
}
}
```
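Pooled arrays are zeroed and reused after `release`, so a caller that keeps a reference past that point will see its data overwritten. One way to make the lifetime explicit is a scoped helper, sketched below. `withPooledArray` and `DemoPool` are hypothetical names; `DemoPool` is only a minimal stand-in for `WasmMemoryPool`.

```typescript
interface Float32Pool {
  acquire(length: number): Float32Array;
  release(array: Float32Array): void;
}

// Acquire a buffer, run fn, and always return the buffer to the pool.
// fn must not keep a reference to the buffer after it returns.
function withPooledArray<T>(
  pool: Float32Pool,
  length: number,
  fn: (buffer: Float32Array) => T,
): T {
  const buffer = pool.acquire(length);
  try {
    return fn(buffer);
  } finally {
    pool.release(buffer);
  }
}

// Minimal stand-in pool for the demo (assumption, not the class above).
class DemoPool implements Float32Pool {
  private free: Float32Array[] = [];

  acquire(length: number): Float32Array {
    const index = this.free.findIndex((array) => array.length === length);
    if (index >= 0) {
      return this.free.splice(index, 1)[0];
    }
    return new Float32Array(length);
  }

  release(array: Float32Array): void {
    array.fill(0); // zero before reuse, as in the pool above
    this.free.push(array);
  }
}

const demoPool = new DemoPool();
let firstBuffer: Float32Array | null = null;
const sum = withPooledArray(demoPool, 3, (buffer) => {
  firstBuffer = buffer;
  buffer.set([1, 2, 3]);
  return buffer[0] + buffer[1] + buffer[2];
});
const reused = demoPool.acquire(3);
console.log(sum, reused === firstBuffer, Array.from(reused));
```

This helper covers synchronous use only; for asynchronous work the buffer would need to be released after the awaited operation completes.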
---
## Performance Benchmarks
```typescript
// Benchmark suite for runtime comparison
class WasmBenchmarks {
async runAll(): Promise<BenchmarkResults> {
const results: BenchmarkResults = {};
// Embedding benchmarks
results.embedSingle = await this.benchmarkEmbedSingle();
results.embedBatch = await this.benchmarkEmbedBatch();
// HNSW benchmarks
results.hnswInsert = await this.benchmarkHnswInsert();
results.hnswSearch = await this.benchmarkHnswSearch();
// Vector operations
results.distance = await this.benchmarkDistance();
results.quantize = await this.benchmarkQuantize();
return results;
}
private async benchmarkEmbedSingle(): Promise<BenchmarkResult> {
const embedder = new WasmEmbedder({ modelPath: 'minilm-l6-v2' });
await embedder.initialize();
const iterations = 100;
const texts = Array(iterations).fill('This is a test sentence for embedding.');
const start = performance.now();
for (const text of texts) {
await embedder.embed(text);
}
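const elapsed = performance.now() - start;
// Release the model handle so repeated benchmark runs do not leak WASM memory
await embedder.dispose();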
return {
operation: 'embed_single',
iterations,
totalMs: elapsed,
avgMs: elapsed / iterations,
opsPerSecond: (iterations / elapsed) * 1000,
};
}
private async benchmarkHnswSearch(): Promise<BenchmarkResult> {
const index = new WasmHnswIndex({
dimensions: 384,
m: 16,
efConstruction: 100,
efSearch: 50,
distanceMetric: 'cosine',
});
await index.initialize();
// Insert 10k vectors
for (let i = 0; i < 10000; i++) {
await index.insert(`vec_${i}`, this.randomVector(384));
}
const iterations = 1000;
const query = this.randomVector(384);
const start = performance.now();
for (let i = 0; i < iterations; i++) {
await index.search(query, 10);
}
const elapsed = performance.now() - start;
// Free the index handle so repeated benchmark runs do not leak WASM memory
await index.dispose();
return {
operation: 'hnsw_search_10k',
iterations,
totalMs: elapsed,
avgMs: elapsed / iterations,
opsPerSecond: (iterations / elapsed) * 1000,
};
}
private randomVector(dim: number): Float32Array {
const vec = new Float32Array(dim);
for (let i = 0; i < dim; i++) {
vec[i] = Math.random() * 2 - 1;
}
return vec;
}
}
```
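The timing loops above all follow the same pattern, which can be factored into one helper. The sketch below is an assumption about how that might look, not code from the benchmark suite. It adds warm-up iterations so that one-time costs such as lazy initialization are excluded from the measurement.

```typescript
interface TimingResult {
  operation: string;
  iterations: number;
  totalMs: number;
  avgMs: number;
  opsPerSecond: number;
}

// Hypothetical helper: run fn `warmup` times untimed, then `iterations` times timed.
async function timeOperation(
  operation: string,
  iterations: number,
  fn: () => void | Promise<void>,
  warmup: number = 10,
): Promise<TimingResult> {
  for (let i = 0; i < warmup; i++) {
    await fn();
  }
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    await fn();
  }
  const totalMs = performance.now() - start;
  return {
    operation,
    iterations,
    totalMs,
    avgMs: totalMs / iterations,
    // Guard against a zero reading from a coarse timer
    opsPerSecond: totalMs > 0 ? (iterations / totalMs) * 1000 : Infinity,
  };
}
```

A call such as `timeOperation('embed_single', 100, () => embedder.embed(text).then(() => undefined))` would produce the same fields as the `BenchmarkResult` objects returned above.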
---
## Consequences
### Benefits
1. **Universal Deployment**: Same code runs everywhere (Node, Edge, Browser)
2. **Performance**: Near-native performance for vector operations
3. **Fallback Safety**: Always works even without WASM support
4. **Memory Efficiency**: Pooling and proper cleanup prevent leaks
5. **Model Portability**: ONNX models run in any environment
### Trade-offs
| Benefit | Trade-off |
|---------|-----------|
| Portability | Slight overhead vs pure native |
| WASM safety | No direct memory access (by design) |
| Model caching | Disk/Cache API storage needed |
| Lazy loading | First-use latency for initialization |
---
## Related Decisions
- **ADR-001**: Architecture Overview
- **ADR-003**: Persistence Layer (vector storage)
- **ADR-007**: Learning System (pattern WASM modules)
---
## Revision History
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-01-27 | RuVector Architecture Team | Initial version |


@@ -0,0 +1,151 @@
# ADR-008: Security Architecture
## Status
Accepted
## Date
2026-01-27
## Context
RuvBot handles sensitive data including:
- User conversations and personal information
- API credentials for LLM providers
- Integration tokens (Slack, Discord)
- Vector embeddings that may encode sensitive content
- Multi-tenant data requiring strict isolation
## Decision
### Security Layers
```
┌─────────────────────────────────────────────────────────────────┐
│ Security Architecture │
├─────────────────────────────────────────────────────────────────┤
│ Layer 1: Transport Security │
│ - TLS 1.3 for all connections │
│ - Certificate pinning for external APIs │
│ - HSTS enabled by default │
├─────────────────────────────────────────────────────────────────┤
│ Layer 2: Authentication │
│ - JWT tokens with RS256 signing │
│ - OAuth 2.0 for Slack/Discord │
│ - API key authentication with rate limiting │
│ - Session tokens with secure rotation │
├─────────────────────────────────────────────────────────────────┤
│ Layer 3: Authorization │
│ - RBAC with claims-based permissions │
│ - Tenant isolation at all layers │
│ - Skill-level permission grants │
│ - Resource-based access control │
├─────────────────────────────────────────────────────────────────┤
│ Layer 4: Data Protection │
│ - AES-256-GCM for data at rest │
│ - Field-level encryption for sensitive data │
│ - Key rotation with envelope encryption │
│ - Secure secret management │
├─────────────────────────────────────────────────────────────────┤
│ Layer 5: Input Validation │
│ - Zod schema validation for all inputs │
│ - SQL injection prevention (parameterized queries) │
│ - XSS prevention (content sanitization) │
│ - Path traversal prevention │
├─────────────────────────────────────────────────────────────────┤
│ Layer 6: WASM Sandbox │
│ - Memory isolation per operation │
│ - Resource limits (CPU, memory) │
│ - No filesystem access from WASM │
│ - Controlled imports/exports │
└─────────────────────────────────────────────────────────────────┘
```
### Multi-Tenancy Security
```sql
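-- PostgreSQL Row-Level Security
-- Policies only take effect once RLS is enabled on each table.
-- Table owners bypass policies unless FORCE ROW LEVEL SECURITY is also set.
ALTER TABLE memories ENABLE ROW LEVEL SECURITY;
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
ALTER TABLE agents ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON memories
USING (tenant_id = current_setting('app.current_tenant')::uuid);
CREATE POLICY tenant_isolation ON sessions
USING (tenant_id = current_setting('app.current_tenant')::uuid);
CREATE POLICY tenant_isolation ON agents
USING (tenant_id = current_setting('app.current_tenant')::uuid);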
```
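The policies read `app.current_tenant`, so the application has to set it before running tenant-scoped queries. A minimal sketch follows, assuming a database client that exposes a `query(sql, params)` method. The `Queryable` interface and `withTenant` helper are hypothetical; `set_config(name, value, is_local)` is the standard PostgreSQL function, and passing `true` scopes the setting to the current transaction.

```typescript
// Minimal client shape assumed for this sketch; a real driver may differ.
interface Queryable {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

// Run `work` inside a transaction with app.current_tenant set for that transaction only.
async function withTenant<T>(
  client: Queryable,
  tenantId: string,
  work: (client: Queryable) => Promise<T>,
): Promise<T> {
  await client.query('BEGIN');
  try {
    // is_local = true: the setting is discarded at COMMIT or ROLLBACK,
    // so it cannot leak to the next user of a pooled connection.
    await client.query("SELECT set_config('app.current_tenant', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  }
}
```

Passing the tenant id as a bound parameter avoids building the statement by string concatenation.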
### Secret Management
```typescript
// Secrets are never logged or exposed
interface SecretStore {
get(key: string): Promise<string>;
set(key: string, value: string, options?: SecretOptions): Promise<void>;
rotate(key: string): Promise<void>;
delete(key: string): Promise<void>;
}
// Environment variable validation
import { z } from 'zod';
const requiredSecrets = z.object({
ANTHROPIC_API_KEY: z.string().startsWith('sk-ant-'),
SLACK_BOT_TOKEN: z.string().startsWith('xoxb-').optional(),
DATABASE_URL: z.string().url().optional(),
});
```
### API Security
1. **Rate Limiting**: Per-tenant, per-endpoint limits
2. **Request Signing**: HMAC-SHA256 for webhooks
3. **IP Allowlisting**: Optional for enterprise
4. **Audit Logging**: All security events logged
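The HMAC-SHA256 request signing in item 2 can be sketched with Node's built-in `crypto` module. This is an illustration under assumptions: the hex encoding and the function names are chosen for the example, and individual providers (Slack, for instance) define their own signature formats and headers.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Compute the hex-encoded HMAC-SHA256 of a raw request body.
function signPayload(secret: string, payload: string): string {
  return createHmac('sha256', secret).update(payload, 'utf8').digest('hex');
}

// Verify a received signature in constant time.
function verifySignature(secret: string, payload: string, signatureHex: string): boolean {
  const expected = Buffer.from(signPayload(secret, payload), 'hex');
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so reject those up front
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(expected, received);
}

const demoSecret = 'example-secret'; // placeholder, not a real credential
const demoBody = '{"event":"message"}';
const demoSignature = signPayload(demoSecret, demoBody);
console.log(verifySignature(demoSecret, demoBody, demoSignature));
```

The signature must be computed over the raw request bytes, before any JSON parsing, or re-serialization differences will cause valid requests to fail verification.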
### Vulnerability Prevention
| Vulnerability Category | Prevention |
|--------------|------------|
| Injection (SQL, NoSQL, Command) | Parameterized queries, input validation |
| XSS | Content-Security-Policy, output encoding |
| CSRF | SameSite cookies, origin validation |
| SSRF | URL allowlisting, no user-controlled URLs |
| Path Traversal | Path sanitization, chroot for file ops |
| Sensitive Data Exposure | Encryption, minimal logging |
| Broken Authentication | Secure session management |
| Security Misconfiguration | Secure defaults, hardening guide |
### Compliance Readiness
- **GDPR**: Data export, deletion, consent tracking
- **SOC 2**: Audit logging, access controls
- **HIPAA**: Encryption, access logging (with configuration)
## Consequences
### Positive
- Defense in depth provides multiple security layers
- Multi-tenancy isolation prevents data leakage
- Comprehensive input validation blocks injection attacks
- WASM sandbox limits damage from malicious code
### Negative
- Performance overhead from encryption/validation
- Complexity in secret management
- Additional testing required for security features
### Risks
- Key management complexity
- Potential for misconfiguration
- Balance between security and usability
## Security Checklist
- [ ] TLS configured for all endpoints
- [ ] API keys stored in secure vault
- [ ] Rate limiting enabled
- [ ] Audit logging configured
- [ ] Input validation on all endpoints
- [ ] SQL injection tests passing
- [ ] XSS tests passing
- [ ] CSRF protection enabled
- [ ] Security headers configured
- [ ] Dependency vulnerabilities scanned


@@ -0,0 +1,159 @@
# ADR-009: Hybrid Search Architecture
## Status
Accepted (Implemented)
## Date
2026-01-27
## Context
Clawdbot uses basic vector search with external embedding APIs. RuvBot improves on this with:
- Local WASM embeddings (75x faster)
- HNSW indexing (150x-12,500x faster)
- Hybrid search that combines vector similarity with keyword (BM25) matching
## Decision
### Hybrid Search Pipeline
```
┌─────────────────────────────────────────────────────────────────┐
│ RuvBot Hybrid Search │
├─────────────────────────────────────────────────────────────────┤
│ Query Input │
│ └─ Text normalization │
│ └─ Query embedding (WASM, <3ms) │
├─────────────────────────────────────────────────────────────────┤
│ Parallel Search (Promise.all) │
│ ├─ Vector Search (HNSW) ├─ Keyword Search (BM25) │
│ │ └─ Cosine similarity │ └─ Inverted index │
│ │ └─ Top-K candidates │ └─ IDF + TF scoring │
├─────────────────────────────────────────────────────────────────┤
│ Result Fusion │
│ └─ Reciprocal Rank Fusion (RRF) │
│ └─ Linear combination │
│ └─ Weighted average with presence bonus │
├─────────────────────────────────────────────────────────────────┤
│ Post-Processing │
│ └─ Score normalization (BM25 max-normalized) │
│ └─ Matched term tracking │
│ └─ Threshold filtering │
└─────────────────────────────────────────────────────────────────┘
```
### Implementation
Located in `/npm/packages/ruvbot/src/learning/search/`:
- `HybridSearch.ts` - Main hybrid search coordinator
- `BM25Index.ts` - BM25 keyword search implementation
### Configuration
```typescript
interface HybridSearchConfig {
vector: {
enabled: boolean;
weight: number; // 0.0-1.0, default: 0.7
};
keyword: {
enabled: boolean;
weight: number; // 0.0-1.0, default: 0.3
k1?: number; // BM25 k1 parameter, default: 1.2
b?: number; // BM25 b parameter, default: 0.75
};
fusion: {
method: 'rrf' | 'linear' | 'weighted';
k: number; // RRF constant, default: 60
candidateMultiplier: number; // default: 3
};
}
interface HybridSearchOptions {
topK?: number; // default: 10
threshold?: number; // default: 0
vectorOnly?: boolean;
keywordOnly?: boolean;
}
interface HybridSearchResult {
id: string;
vectorScore: number;
keywordScore: number;
fusedScore: number;
matchedTerms?: string[];
}
```
### Fusion Methods
| Method | Algorithm | Best For |
|--------|-----------|----------|
| `rrf` | Reciprocal Rank Fusion: sum of `1/(k + rank)` over each result list containing the document | General use, rank-based |
| `linear` | `α·vectorScore + β·keywordScore` | Score-sensitive ranking |
| `weighted` | Linear + 0.1 bonus for dual matches | Boosting exact matches |
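As an illustration of the `rrf` row, the sketch below fuses any number of ranked id lists. It is a generic implementation of Reciprocal Rank Fusion, not the code in `HybridSearch.ts`. It assumes 1-based ranks and does not apply the per-source weights from the configuration.

```typescript
interface FusedResult {
  id: string;
  fusedScore: number;
}

// Each ranking is a list of ids ordered best-first.
function reciprocalRankFusion(rankings: string[][], k: number = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // 1-based rank (assumption)
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, fusedScore]) => ({ id, fusedScore }))
    .sort((a, b) => b.fusedScore - a.fusedScore);
}

const vectorRanking = ['a', 'b', 'c'];
const keywordRanking = ['b', 'd'];
const fused = reciprocalRankFusion([vectorRanking, keywordRanking]);
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

Document `b` wins because it appears in both lists. RRF uses only ranks, so the vector and keyword scores never need to be on the same scale.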
### BM25 Implementation
```typescript
interface BM25Config {
k1: number; // Term frequency saturation (default: 1.2)
b: number; // Document length normalization (default: 0.75)
}
```
Features:
- Inverted index with document frequency tracking
- Built-in stopword filtering (100+ common words)
- Basic Porter-style stemming (ing, ed, es, s, ly, tion)
- Average document length normalization
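To show how `k1` and `b` enter the score, here is a compact BM25 sketch. It is not the `BM25Index.ts` implementation: it omits stopword filtering and stemming, rescans the documents instead of using an inverted index, and assumes the `ln(1 + (N - df + 0.5) / (df + 0.5))` form of IDF.

```typescript
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Returns one BM25 score per document for the given query.
function bm25Scores(docs: string[], query: string, k1: number = 1.2, b: number = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const totalLength = tokenized.reduce((sum, doc) => sum + doc.length, 0);
  const avgLength = totalLength / Math.max(1, tokenized.length);
  const queryTerms = Array.from(new Set(tokenize(query)));

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((token) => token === term).length;
      if (tf === 0) continue;
      // Document frequency: number of documents containing the term
      const df = tokenized.filter((other) => other.indexOf(term) !== -1).length;
      const idf = Math.log(1 + (tokenized.length - df + 0.5) / (df + 0.5));
      // b controls how strongly long documents are penalized
      const lengthNorm = 1 - b + b * (doc.length / avgLength);
      // k1 controls how quickly repeated terms stop adding score
      score += (idf * (tf * (k1 + 1))) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}

const bm25Demo = bm25Scores(['the cat sat', 'dogs bark loudly', 'cat cat nap'], 'cat');
console.log(bm25Demo);
```

With `b = 0` document length is ignored; with a large `k1` the score grows almost linearly with term frequency.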
### Performance Targets
| Operation | Target | Achieved |
|-----------|--------|----------|
| Query embedding | <5ms | 2.7ms |
| Vector search (100K) | <10ms | <5ms |
| Keyword search | <20ms | <15ms |
| Fusion | <5ms | <2ms |
| Total hybrid | <40ms | <25ms |
### Usage Example
```typescript
import { HybridSearch, createHybridSearch } from './learning/search';
// Create with custom config
const search = createHybridSearch({
vector: { enabled: true, weight: 0.7 },
keyword: { enabled: true, weight: 0.3, k1: 1.2, b: 0.75 },
fusion: { method: 'rrf', k: 60, candidateMultiplier: 3 },
});
// Initialize with vector index and embedder
search.initialize(vectorIndex, embedder);
// Add documents
await search.add('doc1', 'Document content here');
// Search
const results = await search.search('query text', { topK: 10 });
```
## Consequences
### Positive
- Better recall than vector-only search
- Handles exact matches and semantic similarity
- Maintains keyword search for debugging
- Parallel search execution for low latency
### Negative
- Slightly higher latency than vector-only
- Requires maintaining both indices
- More complex tuning
### Trade-offs
- Weight tuning requires experimentation
- Memory overhead for dual indices
- BM25 stemming is basic (not full Porter algorithm)


@@ -0,0 +1,238 @@
# ADR-010: Multi-Channel Integration
## Status
Accepted (Partially Implemented)
## Date
2026-01-27
## Context
Clawdbot supports multiple messaging channels:
- Slack, Discord, Telegram, Signal, WhatsApp, Line, iMessage
- Web, CLI, API interfaces
RuvBot must match and exceed this with:
- All Clawdbot channels
- Multi-tenant channel isolation
- Unified message handling
## Decision
### Channel Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ RuvBot Channel Layer │
├─────────────────────────────────────────────────────────────────┤
│ Channel Adapters │
│ ├─ SlackAdapter : @slack/bolt [IMPLEMENTED] │
│ ├─ DiscordAdapter : discord.js [IMPLEMENTED] │
│ ├─ TelegramAdapter : telegraf [IMPLEMENTED] │
│ ├─ SignalAdapter : signal-client [PLANNED] │
│ ├─ WhatsAppAdapter : baileys [PLANNED] │
│ ├─ LineAdapter : @line/bot-sdk [PLANNED] │
│ ├─ WebAdapter : WebSocket + REST [PLANNED] │
│ └─ CLIAdapter : readline + terminal [PLANNED] │
├─────────────────────────────────────────────────────────────────┤
│ Message Normalization │
│ └─ Unified Message format │
│ └─ Attachment handling │
│ └─ Thread/reply context │
├─────────────────────────────────────────────────────────────────┤
│ Multi-Tenant Isolation │
│ └─ Channel credentials per tenant │
│ └─ Namespace isolation │
│ └─ Rate limiting per tenant │
└─────────────────────────────────────────────────────────────────┘
```
### Implementation
Located in `/npm/packages/ruvbot/src/channels/`:
- `ChannelRegistry.ts` - Central registry and routing
- `adapters/BaseAdapter.ts` - Abstract base class
- `adapters/SlackAdapter.ts` - Slack integration
- `adapters/DiscordAdapter.ts` - Discord integration
- `adapters/TelegramAdapter.ts` - Telegram integration
### Unified Message Interface
```typescript
interface UnifiedMessage {
id: string;
channelId: string;
channelType: ChannelType;
tenantId: string;
userId: string;
username?: string;
content: string;
attachments?: Attachment[];
threadId?: string;
replyTo?: string;
timestamp: Date;
metadata: Record<string, unknown>;
}
interface Attachment {
id: string;
type: 'image' | 'file' | 'audio' | 'video' | 'link';
url?: string;
data?: Buffer;
mimeType?: string;
filename?: string;
size?: number;
}
type ChannelType =
| 'slack' | 'discord' | 'telegram'
| 'signal' | 'whatsapp' | 'line'
| 'imessage' | 'web' | 'api' | 'cli';
```
### BaseAdapter Abstract Class
```typescript
abstract class BaseAdapter {
type: ChannelType;
tenantId: string;
enabled: boolean;
// Lifecycle
abstract connect(): Promise<void>;
abstract disconnect(): Promise<void>;
// Messaging
abstract send(channelId: string, content: string, options?: SendOptions): Promise<string>;
abstract reply(message: UnifiedMessage, content: string, options?: SendOptions): Promise<string>;
// Event handling
onMessage(handler: MessageHandler): void;
offMessage(handler: MessageHandler): void;
getStatus(): AdapterStatus;
}
```
### Channel Registry
```typescript
interface ChannelRegistry {
// Registration
register(adapter: BaseAdapter): void;
unregister(type: ChannelType, tenantId: string): boolean;
// Lookup
get(type: ChannelType, tenantId: string): BaseAdapter | undefined;
getByType(type: ChannelType): BaseAdapter[];
getByTenant(tenantId: string): BaseAdapter[];
getAll(): BaseAdapter[];
// Lifecycle
start(): Promise<void>;
stop(): Promise<void>;
// Messaging
onMessage(handler: MessageHandler): void;
offMessage(handler: MessageHandler): void;
broadcast(message: string, channelIds: string[], filter?: ChannelFilter): Promise<Map<string, string>>;
// Statistics
getStats(): RegistryStats;
}
interface ChannelRegistryConfig {
defaultRateLimit?: {
requests: number;
windowMs: number;
};
}
```
### Adapter Configuration
```typescript
interface AdapterConfig {
type: ChannelType;
tenantId: string;
credentials: ChannelCredentials;
enabled?: boolean;
rateLimit?: {
requests: number;
windowMs: number;
};
}
interface ChannelCredentials {
token?: string;
apiKey?: string;
webhookUrl?: string;
clientId?: string;
clientSecret?: string;
botId?: string;
[key: string]: unknown;
}
```
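The `rateLimit` shape (`requests` per `windowMs`) maps naturally onto a fixed-window counter. The sketch below is one possible reading, not the registry's actual limiter. `FixedWindowLimiter` is a hypothetical name, and the key format (`tenantId:channelType`) is an assumption chosen to reflect per-tenant isolation.

```typescript
interface RateLimit {
  requests: number;
  windowMs: number;
}

class FixedWindowLimiter {
  private readonly limit: RateLimit;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: RateLimit) {
    this.limit = limit;
  }

  // Returns true if the call is allowed. `now` is injectable for testing.
  tryAcquire(key: string, now: number = Date.now()): boolean {
    if (this.limit.requests <= 0) return false;
    const current = this.windows.get(key);
    if (!current || now - current.start >= this.limit.windowMs) {
      // First call for this key, or the previous window has expired
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit.requests) return false;
    current.count += 1;
    return true;
  }
}

const limiter = new FixedWindowLimiter({ requests: 2, windowMs: 1000 });
const decisions = [
  limiter.tryAcquire('tenant-1:slack', 0),
  limiter.tryAcquire('tenant-1:slack', 10),
  limiter.tryAcquire('tenant-1:slack', 20),
  limiter.tryAcquire('tenant-2:slack', 20),
  limiter.tryAcquire('tenant-1:slack', 1000),
];
console.log(decisions);
```

Fixed windows allow short bursts at window boundaries; a sliding window or token bucket would smooth that at the cost of more state.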
### Usage Example
```typescript
import { ChannelRegistry, SlackAdapter, DiscordAdapter } from './channels';
// Create registry with rate limiting
const registry = new ChannelRegistry({
defaultRateLimit: { requests: 100, windowMs: 60000 }
});
// Register adapters
registry.register(new SlackAdapter({
type: 'slack',
tenantId: 'tenant-1',
credentials: { token: process.env.SLACK_TOKEN }
}));
registry.register(new DiscordAdapter({
type: 'discord',
tenantId: 'tenant-1',
credentials: { token: process.env.DISCORD_TOKEN }
}));
// Handle messages
registry.onMessage(async (message) => {
console.log(`[${message.channelType}] ${message.userId}: ${message.content}`);
});
// Start all adapters
await registry.start();
```
## Implementation Status
| Adapter | Status | Library | Notes |
|---------|--------|---------|-------|
| Slack | Implemented | @slack/bolt | Full support |
| Discord | Implemented | discord.js | Full support |
| Telegram | Implemented | telegraf | Full support |
| Signal | Planned | signal-client | Requires native deps |
| WhatsApp | Planned | baileys | Unofficial API |
| Line | Planned | @line/bot-sdk | - |
| Web | Planned | WebSocket | Custom implementation |
| CLI | Planned | readline | For testing |
## Consequences
### Positive
- Unified message handling across all channels
- Multi-tenant channel isolation with per-tenant indexing
- Easy to add new channels via BaseAdapter
- Built-in rate limiting per adapter
### Negative
- Complexity of maintaining multiple integrations
- Different channel capabilities (some don't support threads)
- Only 3 of 8+ channels currently implemented
### RuvBot Advantages over Clawdbot
- Multi-tenant channel credentials with isolation
- Channel-specific rate limiting
- Cross-channel message routing via broadcast
- Adapter status tracking and statistics


@@ -0,0 +1,205 @@
# ADR-011: Swarm Coordination (agentic-flow Integration)
## Status
Accepted (Implemented)
## Date
2026-01-27
## Context
Clawdbot has basic async processing. RuvBot integrates agentic-flow patterns for:
- Multi-agent swarm coordination
- 12 specialized background workers
- Byzantine fault-tolerant consensus
- Dynamic topology switching
## Decision
### Swarm Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ RuvBot Swarm Coordination │
├─────────────────────────────────────────────────────────────────┤
│ Topologies │
│ ├─ hierarchical : Queen-worker (anti-drift) │
│ ├─ mesh : Peer-to-peer network │
│ ├─ hierarchical-mesh : Hybrid for scalability │
│ └─ adaptive : Dynamic switching │
├─────────────────────────────────────────────────────────────────┤
│ Consensus Protocols │
│ ├─ byzantine : BFT (f < n/3 faulty) │
│ ├─ raft : Leader-based (f < n/2) │
│ ├─ gossip : Eventually consistent │
│ └─ crdt : Conflict-free replication │
├─────────────────────────────────────────────────────────────────┤
│ Background Workers (12) │
│ ├─ ultralearn [normal] : Deep knowledge acquisition │
│ ├─ optimize [high] : Performance optimization │
│ ├─ consolidate [low] : Memory consolidation (EWC++) │
│ ├─ predict [normal] : Predictive preloading │
│ ├─ audit [critical] : Security analysis │
│ ├─ map [normal] : Codebase mapping │
│ ├─ preload [low] : Resource preloading │
│ ├─ deepdive [normal] : Deep code analysis │
│ ├─ document [normal] : Auto-documentation │
│ ├─ refactor [normal] : Refactoring suggestions │
│ ├─ benchmark [normal] : Performance benchmarking │
│ └─ testgaps [normal] : Test coverage analysis │
└─────────────────────────────────────────────────────────────────┘
```
### Implementation
Located in `/npm/packages/ruvbot/src/swarm/`:
- `SwarmCoordinator.ts` - Main coordinator with task dispatch
- `ByzantineConsensus.ts` - PBFT-style consensus implementation
### SwarmCoordinator
```typescript
interface SwarmConfig {
topology: SwarmTopology; // 'hierarchical' | 'mesh' | 'hierarchical-mesh' | 'adaptive'
maxAgents: number; // default: 8
strategy: 'specialized' | 'balanced' | 'adaptive';
consensus: ConsensusProtocol; // 'byzantine' | 'raft' | 'gossip' | 'crdt'
heartbeatInterval?: number; // default: 5000ms
taskTimeout?: number; // default: 60000ms
}
interface SwarmTask {
id: string;
worker: WorkerType;
type: string;
content: unknown;
priority: WorkerPriority;
status: 'pending' | 'running' | 'completed' | 'failed';
assignedAgent?: string;
result?: unknown;
error?: string;
createdAt: Date;
startedAt?: Date;
completedAt?: Date;
}
interface SwarmAgent {
id: string;
type: WorkerType;
status: 'idle' | 'busy' | 'offline';
currentTask?: string;
completedTasks: number;
failedTasks: number;
lastHeartbeat: Date;
}
```
### Worker Configuration
```typescript
const WORKER_DEFAULTS: Record<WorkerType, WorkerConfig> = {
ultralearn: { priority: 'normal', concurrency: 2, timeout: 60000, retries: 3, backoff: 'exponential' },
optimize: { priority: 'high', concurrency: 4, timeout: 30000, retries: 2, backoff: 'exponential' },
consolidate: { priority: 'low', concurrency: 1, timeout: 120000, retries: 1, backoff: 'linear' },
predict: { priority: 'normal', concurrency: 2, timeout: 15000, retries: 2, backoff: 'exponential' },
audit: { priority: 'critical', concurrency: 1, timeout: 45000, retries: 3, backoff: 'exponential' },
map: { priority: 'normal', concurrency: 2, timeout: 60000, retries: 2, backoff: 'linear' },
preload: { priority: 'low', concurrency: 4, timeout: 10000, retries: 1, backoff: 'linear' },
deepdive: { priority: 'normal', concurrency: 2, timeout: 90000, retries: 2, backoff: 'exponential' },
document: { priority: 'normal', concurrency: 2, timeout: 30000, retries: 2, backoff: 'linear' },
refactor: { priority: 'normal', concurrency: 2, timeout: 60000, retries: 2, backoff: 'exponential' },
benchmark: { priority: 'normal', concurrency: 1, timeout: 120000, retries: 1, backoff: 'linear' },
testgaps: { priority: 'normal', concurrency: 2, timeout: 45000, retries: 2, backoff: 'linear' },
};
```
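The `retries` and `backoff` fields imply a retry loop with growing delays. The sketch below shows one way to interpret them and is not taken from `SwarmCoordinator.ts`. The 1000 ms base delay, the 30000 ms cap, and the reading of `retries` as attempts after the first are all assumptions.

```typescript
type Backoff = 'linear' | 'exponential';

// Delay before retry number `attempt` (1-based).
function backoffDelayMs(
  attempt: number,
  strategy: Backoff,
  baseMs: number = 1000,
  maxMs: number = 30000,
): number {
  const raw = strategy === 'exponential'
    ? baseMs * Math.pow(2, attempt - 1) // 1x, 2x, 4x, ...
    : baseMs * attempt;                 // 1x, 2x, 3x, ...
  return Math.min(raw, maxMs);
}

// Run fn once, then up to `retries` more times, sleeping between attempts.
async function withRetries<T>(
  fn: () => Promise<T>,
  retries: number,
  strategy: Backoff,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    if (attempt > 0) {
      await sleep(backoffDelayMs(attempt, strategy));
    }
    try {
      return await fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

Injecting `sleep` keeps the helper testable without real waiting. Production use would usually add random jitter so that many failing workers do not retry in lockstep.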
### ByzantineConsensus (PBFT)
```typescript
interface ConsensusConfig {
replicas: number; // Total number of replicas (default: 5)
timeout: number; // Timeout per phase (default: 30000ms)
retries: number; // Retries before failing (default: 3)
requireSignatures: boolean;
}
// Fault tolerance: f < n/3
// Quorum size: ceil(2n/3)
```
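The two comments above translate directly into arithmetic on the replica count. The helper below is an illustrative sketch (`pbftParameters` and `hasQuorum` are hypothetical names) that computes the largest tolerated `f` with `f < n/3` and the `ceil(2n/3)` quorum.

```typescript
interface PbftParameters {
  maxFaulty: number; // largest f with f < n/3
  quorum: number;    // ceil(2n/3)
}

function pbftParameters(replicas: number): PbftParameters {
  if (!Number.isInteger(replicas) || replicas < 1) {
    throw new RangeError('replicas must be a positive integer');
  }
  return {
    maxFaulty: Math.floor((replicas - 1) / 3),
    quorum: Math.ceil((2 * replicas) / 3),
  };
}

// True once enough matching votes have been collected for a phase.
function hasQuorum(matchingVotes: number, replicas: number): boolean {
  return matchingVotes >= pbftParameters(replicas).quorum;
}

console.log(pbftParameters(5)); // { maxFaulty: 1, quorum: 4 }
```

With the default of 5 replicas, one faulty replica is tolerated and 4 matching votes are needed. Moving from 5 to 6 replicas does not raise the tolerance; 7 replicas are needed to tolerate two faults.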
**Phases:**
1. `pre-prepare` - Leader broadcasts proposal
2. `prepare` - Replicas validate and send prepare messages
3. `commit` - Replicas send commit messages and wait for a quorum
4. `decided` - Terminal state: consensus reached
5. `failed` - Terminal state: consensus failed (timeout/Byzantine fault)
### Usage Example
```typescript
import { SwarmCoordinator, ByzantineConsensus } from './swarm';
// Initialize swarm
const swarm = new SwarmCoordinator({
topology: 'hierarchical',
maxAgents: 8,
strategy: 'specialized',
consensus: 'raft'
});
await swarm.start();
// Spawn specialized agents
await swarm.spawnAgent('ultralearn');
await swarm.spawnAgent('optimize');
// Dispatch task
const task = await swarm.dispatch({
worker: 'ultralearn',
task: { type: 'deep-analysis', content: 'analyze this' },
priority: 'normal'
});
// Wait for completion
const result = await swarm.waitForTask(task.id);
// Byzantine consensus for critical decisions
const consensus = new ByzantineConsensus({ replicas: 5, timeout: 30000 });
consensus.initializeReplicas(['node1', 'node2', 'node3', 'node4', 'node5']);
const decision = await consensus.propose({ action: 'deploy', version: '1.0.0' });
```
### Events
SwarmCoordinator emits:
- `started`, `stopped`
- `agent:spawned`, `agent:removed`, `agent:offline`
- `task:created`, `task:assigned`, `task:completed`, `task:failed`
ByzantineConsensus emits:
- `proposal:created`
- `phase:pre-prepare`, `phase:prepare`, `phase:commit`
- `vote:received`
- `consensus:decided`, `consensus:failed`, `consensus:no-quorum`
- `replica:faulty`, `view:changed`
## Consequences
### Positive
- Distributed task execution with priority queues
- Fault tolerance via PBFT consensus
- Specialized workers for different task types
- Heartbeat-based health monitoring
- Event-driven architecture
### Negative
- Coordination overhead
- Complexity of distributed systems
- Memory overhead for task/agent tracking
### RuvBot Advantages over Clawdbot
- 12 specialized workers vs basic async
- Byzantine fault tolerance vs none
- Multi-topology support vs single-threaded
- Learning workers (ultralearn, consolidate) vs static
- Priority-based task scheduling

@@ -0,0 +1,376 @@
# ADR-012: LLM Provider Integration
## Status
Accepted (Implemented)
## Date
2026-01-27
## Context
RuvBot requires LLM capabilities for:
- Conversational AI responses
- Reasoning and analysis tasks
- Tool/function calling
- Streaming responses for real-time UX
The system needs to support multiple providers to:
- Allow cost optimization (use cheaper models for simple tasks)
- Provide fallback options
- Access specialized models (reasoning models like QwQ, O1, DeepSeek R1)
- Support both direct API access and unified gateways
## Decision
### Provider Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ RuvBot LLM Provider Layer │
├─────────────────────────────────────────────────────────────────┤
│ Provider Interface │
│ └─ LLMProvider (abstract interface) │
│ ├─ complete() - Single completion │
│ ├─ stream() - Streaming completion (AsyncGenerator) │
│ ├─ countTokens() - Token estimation │
│ ├─ getModel() - Model info │
│ └─ isHealthy() - Health check │
├─────────────────────────────────────────────────────────────────┤
│ Implementations │
│ ├─ AnthropicProvider : Direct Anthropic API │
│ │ └─ Claude 4, 3.5, 3 models │
│ └─ OpenRouterProvider : Multi-model gateway │
│ ├─ Qwen QwQ (reasoning) │
│ ├─ DeepSeek R1 (reasoning) │
│ ├─ Claude via OpenRouter │
│ ├─ GPT-4, O1 via OpenRouter │
│ └─ Gemini, Llama via OpenRouter │
├─────────────────────────────────────────────────────────────────┤
│ Features │
│ ├─ Tool/Function calling │
│ ├─ Streaming with token callbacks │
│ ├─ Automatic retry (no built-in backoff) │
│ └─ Token counting │
└─────────────────────────────────────────────────────────────────┘
```
### Implementation
Located in `/npm/packages/ruvbot/src/integration/providers/`:
- `index.ts` - Interface definitions and exports
- `AnthropicProvider.ts` - Anthropic Claude integration
- `OpenRouterProvider.ts` - OpenRouter multi-model gateway
### LLMProvider Interface
```typescript
interface LLMProvider {
complete(messages: Message[], options?: CompletionOptions): Promise<Completion>;
stream(messages: Message[], options?: StreamOptions): AsyncGenerator<Token, Completion, void>;
countTokens(text: string): Promise<number>;
getModel(): ModelInfo;
isHealthy(): Promise<boolean>;
}
interface Message {
role: 'user' | 'assistant' | 'system';
content: string;
}
interface CompletionOptions {
maxTokens?: number;
temperature?: number; // 0.0-2.0
topP?: number; // 0.0-1.0
stopSequences?: string[];
tools?: Tool[];
}
interface StreamOptions extends CompletionOptions {
onToken?: (token: string) => void;
}
interface Completion {
content: string;
finishReason: 'stop' | 'length' | 'tool_use';
usage: {
inputTokens: number;
outputTokens: number;
};
toolCalls?: ToolCall[];
}
interface Token {
type: 'text' | 'tool_use';
text?: string;
toolUse?: ToolCall;
}
```
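One detail of the `stream()` signature is easy to miss: the generator's return value is the final `Completion`, and a `for await` loop discards it. The sketch below drains a stream manually to keep both the text and the final completion. `collectStream` and `fakeStream` are hypothetical names, and the types are minimal copies of the interface above.

```typescript
// Minimal copies of the shapes from the LLMProvider interface.
interface ToolCall { id: string; name: string; input: Record<string, unknown>; }
interface Token { type: 'text' | 'tool_use'; text?: string; toolUse?: ToolCall; }
interface Completion {
  content: string;
  finishReason: 'stop' | 'length' | 'tool_use';
  usage: { inputTokens: number; outputTokens: number };
  toolCalls?: ToolCall[];
}

// Drains the stream with next() so the generator's return value
// (the final Completion) is not lost.
async function collectStream(
  stream: AsyncGenerator<Token, Completion, void>,
  onText?: (text: string) => void,
): Promise<{ text: string; completion: Completion }> {
  let text = '';
  for (;;) {
    const step = await stream.next();
    if (step.done) {
      return { text, completion: step.value };
    }
    if (step.value.type === 'text' && step.value.text !== undefined) {
      text += step.value.text;
      onText?.(step.value.text);
    }
  }
}

// Hypothetical stand-in for provider.stream(), for illustration only.
async function* fakeStream(): AsyncGenerator<Token, Completion, void> {
  yield { type: 'text', text: 'Hel' };
  yield { type: 'text', text: 'lo' };
  return {
    content: 'Hello',
    finishReason: 'stop',
    usage: { inputTokens: 1, outputTokens: 2 },
  };
}
```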
### Tool/Function Calling
```typescript
interface Tool {
name: string;
description: string;
parameters: Record<string, unknown>; // JSON Schema
}
interface ToolCall {
id: string;
name: string;
input: Record<string, unknown>;
}
```
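When a completion finishes with `finishReason: 'tool_use'`, the caller has to execute the returned `toolCalls`. The following is a sketch of one way to dispatch them. The `ToolHandler` type, the `runToolCalls` function, and the result shape are assumptions for illustration, not part of the provider API.

```typescript
interface ToolCall { id: string; name: string; input: Record<string, unknown>; }

// Hypothetical handler type and result shape (not defined by the providers).
type ToolHandler = (input: Record<string, unknown>) => Promise<unknown> | unknown;
interface ToolResult { id: string; ok: boolean; result: unknown; }

// Runs each tool call against a registry of handlers, capturing failures
// per call instead of aborting the whole batch.
async function runToolCalls(
  calls: ToolCall[],
  handlers: Map<string, ToolHandler>,
): Promise<ToolResult[]> {
  const results: ToolResult[] = [];
  for (const call of calls) {
    const handler = handlers.get(call.name);
    if (!handler) {
      results.push({ id: call.id, ok: false, result: `Unknown tool: ${call.name}` });
      continue;
    }
    try {
      results.push({ id: call.id, ok: true, result: await handler(call.input) });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ id: call.id, ok: false, result: message });
    }
  }
  return results;
}
```

Keeping the registry in a `Map` avoids accidental matches against inherited object properties such as `toString`.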
### AnthropicProvider
Direct integration with Anthropic's Claude API.
```typescript
interface AnthropicConfig {
apiKey: string;
baseUrl?: string; // default: 'https://api.anthropic.com'
model?: string; // default: 'claude-3-5-sonnet-20241022'
maxRetries?: number; // default: 3
timeout?: number; // default: 60000ms
}
type AnthropicModel =
| 'claude-opus-4-20250514'
| 'claude-sonnet-4-20250514'
| 'claude-3-5-sonnet-20241022'
| 'claude-3-5-haiku-20241022'
| 'claude-3-opus-20240229'
| 'claude-3-sonnet-20240229'
| 'claude-3-haiku-20240307';
```
**Model Specifications:**
| Model | Max Tokens | Context Window | Best For |
|-------|------------|----------------|----------|
| claude-opus-4-20250514 | 32,768 | 200,000 | Complex reasoning, analysis |
| claude-sonnet-4-20250514 | 16,384 | 200,000 | Balanced performance |
| claude-3-5-sonnet-20241022 | 8,192 | 200,000 | General purpose |
| claude-3-5-haiku-20241022 | 8,192 | 200,000 | Fast, cost-effective |
| claude-3-opus-20240229 | 4,096 | 200,000 | Complex tasks |
| claude-3-sonnet-20240229 | 4,096 | 200,000 | Balanced |
| claude-3-haiku-20240307 | 4,096 | 200,000 | Fast responses |
**Usage:**
```typescript
import { createAnthropicProvider } from './integration/providers';
const provider = createAnthropicProvider({
apiKey: process.env.ANTHROPIC_API_KEY!,
model: 'claude-3-5-sonnet-20241022',
});
// Simple completion
const response = await provider.complete([
{ role: 'user', content: 'Hello!' }
]);
// Streaming
for await (const token of provider.stream(messages)) {
if (token.type === 'text') {
process.stdout.write(token.text!);
}
}
// With tools
const toolResponse = await provider.complete(messages, {
tools: [{
name: 'get_weather',
description: 'Get weather for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string' }
}
}
}]
});
```
### OpenRouterProvider
Access to 100+ models through OpenRouter's unified API.
```typescript
interface OpenRouterConfig {
apiKey: string;
baseUrl?: string; // default: 'https://openrouter.ai/api'
model?: string; // default: 'qwen/qwq-32b'
siteUrl?: string; // For attribution
siteName?: string; // default: 'RuvBot'
maxRetries?: number; // default: 3
timeout?: number; // default: 120000ms (longer for reasoning)
}
type OpenRouterModel =
// Reasoning Models
| 'qwen/qwq-32b'
| 'qwen/qwq-32b:free'
| 'openai/o1-preview'
| 'openai/o1-mini'
| 'deepseek/deepseek-r1'
// Standard Models
| 'anthropic/claude-3.5-sonnet'
| 'openai/gpt-4o'
| 'google/gemini-pro-1.5'
| 'meta-llama/llama-3.1-405b-instruct'
| string; // Any OpenRouter model
```
**Reasoning Model Specifications:**
| Model | Max Tokens | Context | Special Features |
|-------|------------|---------|------------------|
| qwen/qwq-32b | 16,384 | 32,768 | Chain-of-thought reasoning |
| qwen/qwq-32b:free | 16,384 | 32,768 | Free tier available |
| openai/o1-preview | 32,768 | 128,000 | Advanced reasoning |
| openai/o1-mini | 65,536 | 128,000 | Faster reasoning |
| deepseek/deepseek-r1 | 8,192 | 64,000 | Open-source reasoning |
**Usage:**
```typescript
import {
createOpenRouterProvider,
createQwQProvider,
createDeepSeekR1Provider
} from './integration/providers';
// General OpenRouter
const provider = createOpenRouterProvider({
apiKey: process.env.OPENROUTER_API_KEY!,
model: 'qwen/qwq-32b',
});
// Convenience: QwQ reasoning model
const qwq = createQwQProvider(process.env.OPENROUTER_API_KEY!, false);
// Convenience: Free QwQ
const qwqFree = createQwQProvider(process.env.OPENROUTER_API_KEY!, true);
// Convenience: DeepSeek R1
const deepseek = createDeepSeekR1Provider(process.env.OPENROUTER_API_KEY!);
// List available models
const models = await provider.listModels();
```
### Configuration Options
**Environment Variables:**
```bash
# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
# OpenRouter
OPENROUTER_API_KEY=sk-or-...
```
**Timeouts:**
- Both providers use native fetch with `AbortSignal.timeout()`
- Anthropic: 60s default timeout
- OpenRouter: 120s default timeout (for reasoning models)
**Retry Strategy:**
- Default: 3 retries
- Backoff: not implemented in the base providers; wrap calls in a retry helper or library if backoff is needed
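Since backoff is left to the caller, a small wrapper is one option. The sketch below implements capped exponential backoff with full jitter. `withBackoff`, `BackoffOptions`, and the default delays are hypothetical and not part of the provider API.

```typescript
// Illustrative retry wrapper; all names and defaults here are assumptions.
interface BackoffOptions {
  retries?: number;      // retries after the first attempt (default: 3)
  baseDelayMs?: number;  // ceiling for the first retry delay (default: 500)
  maxDelayMs?: number;   // upper bound on any single delay (default: 8000)
  shouldRetry?: (error: unknown) => boolean;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withBackoff<T>(fn: () => Promise<T>, opts: BackoffOptions = {}): Promise<T> {
  const { retries = 3, baseDelayMs = 500, maxDelayMs = 8000, shouldRetry = () => true } = opts;
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up once retries are exhausted or the error is not retryable.
      if (attempt >= retries || !shouldRetry(error)) throw error;
      // Delay ceiling doubles per attempt, is capped, then randomized (full jitter).
      const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
      await sleep(Math.random() * ceiling);
    }
  }
}

// Possible usage with a provider (retry only on rate limiting):
// await withBackoff(() => provider.complete(messages), {
//   shouldRetry: (e) => e instanceof Error && e.message.includes('API error: 429'),
// });
```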
### Performance Benchmarks
| Operation | Anthropic | OpenRouter |
|-----------|-----------|------------|
| Cold start | ~500ms | ~800ms |
| Token latency (first) | ~200ms | ~300ms |
| Throughput (tokens/s) | ~50 | ~40 |
| Tool call parsing | <10ms | <10ms |
### Error Handling
```typescript
try {
  const response = await provider.complete(messages);
} catch (error) {
  // Under strict TypeScript `error` is `unknown`, so narrow before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('API error: 429')) {
    // Rate limited - implement backoff
  } else if (message.includes('API error: 401')) {
    // Invalid API key
  } else if (message.includes('timeout')) {
    // Request timed out
  }
}
```
### Usage Patterns
**Model Routing by Task Complexity:**
```typescript
function selectProvider(taskComplexity: 'simple' | 'medium' | 'complex' | 'reasoning') {
switch (taskComplexity) {
case 'simple':
return createAnthropicProvider({ apiKey, model: 'claude-3-5-haiku-20241022' });
case 'medium':
return createAnthropicProvider({ apiKey, model: 'claude-3-5-sonnet-20241022' });
case 'complex':
return createAnthropicProvider({ apiKey, model: 'claude-opus-4-20250514' });
case 'reasoning':
return createQwQProvider(openRouterApiKey);
}
}
```
**Fallback Chain:**
```typescript
async function completeWithFallback(messages: Message[]) {
const providers = [
createAnthropicProvider({ apiKey, model: 'claude-3-5-sonnet-20241022' }),
createOpenRouterProvider({ apiKey: orKey, model: 'anthropic/claude-3.5-sonnet' }),
createQwQProvider(orKey, true), // Free fallback
];
for (const provider of providers) {
try {
if (await provider.isHealthy()) {
return await provider.complete(messages);
}
} catch (error) {
console.warn(`Provider failed, trying next:`, error);
}
}
throw new Error('All providers failed');
}
```
## Consequences
### Positive
- Unified interface for multiple LLM providers
- Access to 100+ models through OpenRouter
- Native streaming support with token callbacks
- Tool/function calling support
- Easy provider switching for cost optimization
### Negative
- Token counting is approximate (not tiktoken-based)
- No built-in retry with exponential backoff
- System messages handled differently by providers
### Trade-offs
- OpenRouter adds latency vs direct API calls
- Reasoning models (QwQ, O1) have longer timeouts
- Free tiers have rate limits and quotas
### RuvBot Advantages
- Multi-provider support vs single provider
- Reasoning model access (QwQ, DeepSeek R1, O1)
- Factory functions for common configurations
- Streaming with async generators

@@ -0,0 +1,263 @@
# ADR-013: Google Cloud Platform Deployment Architecture
## Status
Accepted
## Date
2026-01-27
## Context
RuvBot needs a production-ready deployment option that:
1. Minimizes operational costs for low-traffic scenarios
2. Scales automatically with demand
3. Provides persistence for sessions, memory, and learning data
4. Secures API keys and credentials
5. Supports multi-tenant deployments
## Decision
Deploy RuvBot on Google Cloud Platform using serverless and managed services optimized for cost.
### Architecture Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Google Cloud Platform │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Cloud │ │ Cloud │ │ Cloud │ │
│ │ Build │───▶│ Registry │───▶│ Run │ │
│ │ (CI/CD) │ │ (Images) │ │ (App) │ │
│ └──────────────┘ └──────────────┘ └──────┬───────┘ │
│ │ │
│ ┌────────────────────────────┼────────────────────────┐ │
│ │ │ │ │
│ ┌──────▼──────┐ ┌────────────────▼───────────┐ │ │
│ │ Secret │ │ Cloud SQL │ │ │
│ │ Manager │ │ (PostgreSQL) │ │ │
│ │ │ │ db-f1-micro │ │ │
│ └─────────────┘ └────────────────────────────┘ │ │
│ │ │
│ ┌─────────────┐ ┌────────────────────────────┐ │ │
│ │ Cloud │ │ Memorystore │ │ │
│ │ Storage │ │ (Redis) - Optional │ │ │
│ │ (Files) │ │ Basic tier │ │ │
│ └─────────────┘ └────────────────────────────┘ │ │
│ │ │
│ └────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Cost Optimization Strategy
| Service | Configuration | Monthly Cost | Notes |
|---------|--------------|--------------|-------|
| Cloud Run | 0-10 instances, 512Mi RAM | ~$0-5 | Free tier: 2M requests |
| Cloud SQL | db-f1-micro, 10GB SSD | ~$10-15 | Smallest instance |
| Secret Manager | 3-5 secrets | ~$0.18-0.30 | $0.06/secret/month |
| Cloud Storage | Standard, lifecycle policies | ~$0.02/GB | Auto-tiering |
| Cloud Build | Free tier | ~$0 | 120 min/day free |
| **Total (low traffic)** | | **~$15-20/month** | |
### Service Configuration
#### Cloud Run (Compute)
```yaml
# Serverless container configuration
resources:
cpu: "1"
memory: "512Mi"
scaling:
minInstances: 0 # Scale to zero when idle
maxInstances: 10 # Limit for cost control
concurrency: 80 # Requests per instance
features:
cpuIdle: true # Reduce CPU when idle (cost savings)
startupCpuBoost: true # Faster cold starts
timeout: 300s # 5 minutes for long operations
```
#### Cloud SQL (Database)
```hcl
# Cost-optimized PostgreSQL
tier = "db-f1-micro" # 0.6GB RAM, shared CPU
disk_size = 10 # Minimum SSD
availability = "ZONAL" # Single zone (cheaper)
backup_retention = 7 # 7 days
# Extensions enabled:
#   uuid-ossp  - UUID generation
#   pgcrypto   - Cryptographic functions
#   pg_trgm    - Text search (trigram similarity)
```
#### Secret Manager
Securely stores:
- `anthropic-api-key` - Anthropic API credentials
- `openrouter-api-key` - OpenRouter API credentials
- `database-url` - PostgreSQL connection string
#### Cloud Storage
```hcl
# Automatic cost optimization
lifecycle_rules = [
{ age = 30, action = "SetStorageClass", class = "NEARLINE" },
{ age = 90, action = "SetStorageClass", class = "COLDLINE" }
]
```
### Deployment Options
#### Option 1: Quick Deploy (gcloud CLI)
```bash
# Set environment variables
export ANTHROPIC_API_KEY="sk-ant-..."
export PROJECT_ID="my-project"
# Run deployment script
./deploy/gcp/deploy.sh --project-id $PROJECT_ID
```
#### Option 2: Infrastructure as Code (Terraform)
```bash
cd deploy/gcp/terraform
terraform init
terraform plan -var="project_id=my-project" -var="anthropic_api_key=sk-ant-..."
terraform apply
```
#### Option 3: CI/CD (Cloud Build)
```yaml
# Trigger on push to main branch
trigger:
  branch: main
  included_files:
    - "npm/packages/ruvbot/**"
# cloudbuild.yaml handles build and deploy
```
### Multi-Tenant Configuration
For multiple tenants:
```hcl
# Separate Cloud SQL databases
resource "google_sql_database" "tenant" {
  for_each = var.tenants
  name     = "ruvbot_${each.key}"
  instance = google_sql_database_instance.ruvbot.name
}
```
Row-Level Security in PostgreSQL:
```sql
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON sessions
  USING (tenant_id = current_setting('app.tenant_id')::uuid);
```
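The `tenant_isolation` policy only takes effect if `app.tenant_id` is set for the current request. A sketch of one way to do that from application code follows. `withTenant` and the `Queryable` interface are hypothetical. `Queryable` is intended to match the `query(text, values)` shape offered by clients from the `pg` package, but that compatibility is an assumption here.

```typescript
// Minimal query interface assumed for this sketch.
interface Queryable {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Hypothetical helper: runs `work` inside a transaction with app.tenant_id
// set, so the row-level security policy filters rows for that tenant.
async function withTenant<T>(
  client: Queryable,
  tenantId: string,
  work: (client: Queryable) => Promise<T>,
): Promise<T> {
  await client.query('BEGIN');
  try {
    // The third argument (true) scopes the setting to the current transaction.
    await client.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query('COMMIT');
    return result;
  } catch (error) {
    await client.query('ROLLBACK');
    throw error;
  }
}
```

Scoping the setting to the transaction keeps a pooled connection from carrying one tenant's ID into the next request. Note that PostgreSQL does not apply row security to the table owner unless `FORCE ROW LEVEL SECURITY` is set on the table, so the application role should not own `sessions`.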
### Scaling Considerations
| Traffic Level | Cloud Run Instances | Cloud SQL | Estimated Cost |
|---------------|---------------------|-----------|----------------|
| Low (<1K req/day) | 0-1 | db-f1-micro | ~$15/month |
| Medium (<10K req/day) | 1-3 | db-g1-small | ~$40/month |
| High (<100K req/day) | 3-10 | db-custom | ~$150/month |
| Enterprise | 10-100 | Regional HA | ~$500+/month |
### Security Configuration
```hcl
# Service account with minimal permissions
roles = [
"roles/secretmanager.secretAccessor",
"roles/cloudsql.client",
"roles/storage.objectAdmin",
"roles/logging.logWriter",
"roles/monitoring.metricWriter",
]
# Network security
ip_configuration {
ipv4_enabled = false # Production: use private IP
private_network = google_compute_network.vpc.id
}
```
### Health Monitoring
```yaml
# Cloud Run health checks
startup_probe:
  http_get:
    path: /health
    port: 8080
  initial_delay_seconds: 5
  timeout_seconds: 3
  period_seconds: 10
liveness_probe:
  http_get:
    path: /health
    port: 8080
  timeout_seconds: 3
  period_seconds: 30
```
### File Structure
```
deploy/
├── gcp/
│ ├── cloudbuild.yaml # CI/CD pipeline
│ ├── deploy.sh # Quick deployment script
│ └── terraform/
│ └── main.tf # Infrastructure as code
├── init-db.sql # Database schema
├── Dockerfile # Container image
└── docker-compose.yml # Local development
```
## Consequences
### Positive
- **Cost-effective**: ~$15-20/month for low traffic
- **Serverless**: Scale to zero when not in use
- **Managed services**: No infrastructure maintenance
- **Security**: Secret Manager, IAM, VPC support
- **Observability**: Built-in logging and monitoring
### Negative
- **Cold starts**: First request after idle ~2-3 seconds
- **Vendor lock-in**: GCP-specific services
- **Complexity**: Multiple services to configure
### Trade-offs
- **Cloud SQL vs Firestore**: SQL chosen for complex queries, Row-Level Security
- **Cloud Run vs GKE**: Run chosen for simplicity, lower cost
- **db-f1-micro vs larger**: Cost vs performance trade-off
## Alternatives Considered
| Option | Pros | Cons | Estimated Cost |
|--------|------|------|----------------|
| GKE + Postgres | Full control, predictable | Complex, expensive | ~$100+/month |
| App Engine | Simple deployment | Less flexible | ~$30/month |
| Firebase + Functions | Easy scaling | No SQL, vendor lock-in | ~$20/month |
| **Cloud Run + SQL** | **Balanced** | **Some complexity** | **~$15-20/month** |
## References
- [Cloud Run Pricing](https://cloud.google.com/run/pricing)
- [Cloud SQL Pricing](https://cloud.google.com/sql/pricing)
- [Terraform GCP Provider](https://registry.terraform.io/providers/hashicorp/google/latest/docs)
- [Cloud Build CI/CD](https://cloud.google.com/build/docs)

@@ -0,0 +1,246 @@
# ADR-014: AIDefence Integration for Adversarial Protection
## Status
Accepted
## Date
2026-01-27
## Context
RuvBot requires robust protection against adversarial attacks including:
- Prompt injection (LLM01, the top entry in the OWASP Top 10 for LLM Applications)
- Jailbreak attempts
- PII leakage
- Malicious code injection
- Data exfiltration
The `aidefence` package provides production-ready adversarial defense with <10ms detection latency.
## Decision
Integrate `aidefence@2.1.1` into RuvBot as a core security layer.
### Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ RuvBot Security Layer │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ User Input ────┐ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ AIDefenceGuard │ │
│ ├──────────────────────────────────────────────────────────────────────┤ │
│ │ Layer 1: Pattern Detection (<5ms) │ │
│ │ └─ 50+ injection signatures │ │
│ │ └─ Jailbreak patterns (DAN, bypass, etc.) │ │
│ │ └─ Custom patterns (configurable) │ │
│ ├──────────────────────────────────────────────────────────────────────┤ │
│ │ Layer 2: PII Detection (<5ms) │ │
│ │ └─ Email, phone, SSN, credit card │ │
│ │ └─ API keys and tokens │ │
│ │ └─ IP addresses │ │
│ ├──────────────────────────────────────────────────────────────────────┤ │
│ │ Layer 3: Sanitization (<1ms) │ │
│ │ └─ Control character removal │ │
│ │ └─ Unicode homoglyph normalization │ │
│ │ └─ PII masking │ │
│ ├──────────────────────────────────────────────────────────────────────┤ │
│ │ Layer 4: Behavioral Analysis (<100ms) [Optional] │ │
│ │ └─ User behavior baseline │ │
│ │ └─ Anomaly detection │ │
│ │ └─ Deviation scoring │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────┐ │
│ │ Safe? │────No───► Block / Sanitize │
│ └────┬─────┘ │
│ │ Yes │
│ ▼ │
│ LLM Provider │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Response Validation │ │
│ │ └─ PII leak detection │ │
│ │ └─ Injection echo detection │ │
│ │ └─ Malicious code detection │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Safe Response ────► User │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Threat Types Detected
| Threat Type | Severity | Detection Method | Response |
|-------------|----------|------------------|----------|
| Prompt Injection | High | Pattern matching | Block/Sanitize |
| Jailbreak | Critical | Signature detection | Block |
| PII Exposure | Medium-Critical | Regex patterns | Mask |
| Malicious Code | High | AST-like patterns | Block |
| Data Exfiltration | High | URL/webhook detection | Block |
| Control Characters | Medium | Unicode analysis | Remove |
| Encoding Attacks | Medium | Homoglyph detection | Normalize |
| Anomalous Behavior | Medium | Baseline deviation | Alert |
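To make the "Remove", "Normalize", and "Mask" responses concrete, here is a simplified sketch of a sanitization pass. It is not the `aidefence` implementation. The function name and the patterns are illustrative, and NFKC normalization only folds compatibility forms (for example fullwidth letters), not cross-script lookalikes such as Cyrillic letters.

```typescript
// Illustrative only: simplified patterns, not the aidefence implementation.
// Control characters except tab (\t), line feed (\n), and carriage return (\r).
const CONTROL_CHARS = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g;
// Deliberately loose email pattern, good enough for a masking example.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function sanitizeInput(input: string): string {
  return input
    .normalize('NFKC')            // fold compatibility forms to canonical characters
    .replace(CONTROL_CHARS, '')   // remove control characters
    .replace(EMAIL, '[EMAIL]');   // mask email addresses
}
```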
### Performance Targets
| Operation | Target | Achieved |
|-----------|--------|----------|
| Pattern Detection | <10ms | ~5ms |
| PII Detection | <10ms | ~3ms |
| Sanitization | <5ms | ~1ms |
| Full Analysis | <20ms | ~10ms |
| Response Validation | <15ms | ~8ms |
### Usage
```typescript
import { createAIDefenceGuard, createAIDefenceMiddleware } from '@ruvector/ruvbot';
// Simple usage
const guard = createAIDefenceGuard({
detectPromptInjection: true,
detectJailbreak: true,
detectPII: true,
blockThreshold: 'medium',
});
const result = await guard.analyze(userInput, {
userId: 'user-123',
sessionId: 'session-456',
});
if (!result.safe) {
console.log('Threats detected:', result.threats);
// Use sanitized input or block
const safeInput = result.sanitizedInput;
}
// Middleware usage
const middleware = createAIDefenceMiddleware({
blockThreshold: 'medium',
enableAuditLog: true,
});
// Validate input before LLM
const { allowed, sanitizedInput } = await middleware.validateInput(userInput);
if (allowed) {
const response = await llm.complete(sanitizedInput);
// Validate response before returning
const { allowed: responseAllowed } = await middleware.validateOutput(response, userInput);
if (responseAllowed) {
return response;
}
}
```
### Configuration Options
```typescript
interface AIDefenceConfig {
// Detection toggles
detectPromptInjection: boolean; // Default: true
detectJailbreak: boolean; // Default: true
detectPII: boolean; // Default: true
// Advanced features
enableBehavioralAnalysis: boolean; // Default: false
enablePolicyVerification: boolean; // Default: false
// Threshold: 'none' | 'low' | 'medium' | 'high' | 'critical'
blockThreshold: ThreatLevel; // Default: 'medium'
// Custom patterns (regex strings)
customPatterns?: string[];
// Allowed domains for URL validation
allowedDomains?: string[];
// Max input length (chars)
maxInputLength: number; // Default: 100000
// Audit logging
enableAuditLog: boolean; // Default: true
}
```
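The `blockThreshold` option implies an ordering over threat levels. The sketch below shows one plausible reading, in which input is blocked when the detected level is at or above the threshold. This semantics and the `shouldBlock` helper are assumptions; the package may apply the threshold differently.

```typescript
type ThreatLevel = 'none' | 'low' | 'medium' | 'high' | 'critical';

// Levels in ascending order of severity, as listed in the config comment.
const LEVELS: readonly ThreatLevel[] = ['none', 'low', 'medium', 'high', 'critical'];

// Assumed semantics: block when a threat was detected and its level is at
// or above the configured threshold.
function shouldBlock(detected: ThreatLevel, threshold: ThreatLevel): boolean {
  if (detected === 'none') return false;
  return LEVELS.indexOf(detected) >= LEVELS.indexOf(threshold);
}
```

Under this reading, a threshold of `'low'` blocks any detected threat, while `'critical'` blocks only the most severe ones.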
### Preset Configurations
```typescript
// Strict mode (production)
const strictConfig = createStrictConfig();
// - All detection enabled
// - Behavioral analysis enabled
// - Block threshold: 'low'
// Permissive mode (development)
const permissiveConfig = createPermissiveConfig();
// - Core detection only
// - Block threshold: 'critical'
// - Audit logging disabled
```
## Consequences
### Positive
- Sub-10ms detection latency
- 50+ built-in injection patterns
- PII protection out of the box
- Configurable security levels
- Audit logging for compliance
- Response validation
- Unicode/homoglyph protection
### Negative
- Additional dependency (aidefence)
- Small latency overhead (~10ms for input analysis, plus ~8ms when responses are validated)
- False positives possible with strict settings
### Trade-offs
- Strict mode may block legitimate queries
- Behavioral analysis adds latency (~100ms)
- PII masking may alter valid content
## Integration with Existing Security
AIDefence integrates with RuvBot's 6-layer security architecture:
```
Layer 1: Transport (TLS 1.3)
Layer 2: Authentication (JWT)
Layer 3: Authorization (RBAC)
Layer 4: Data Protection (Encryption)
Layer 5: Input Validation (AIDefence) ◄── NEW
Layer 6: WASM Sandbox
```
## Dependencies
```json
{
"aidefence": "^2.1.1"
}
```
The aidefence package includes:
- agentdb (vector storage)
- lean-agentic (formal verification)
- zod (schema validation)
- winston (logging)
- helmet (HTTP security headers)
## References
- [aidefence on npm](https://www.npmjs.com/package/aidefence)
- [OWASP LLM Top 10](https://owasp.org/www-project-top-10-for-large-language-model-applications/)
- [Prompt Injection Guide](https://www.lakera.ai/blog/guide-to-prompt-injection)
- [AIMDS Documentation](https://ruv.io/aimds)

@@ -0,0 +1,192 @@
# ADR-015: Chat UI Architecture
## Status
Accepted
## Date
2026-01-28
## Context
RuvBot provides a powerful REST API for chat interactions, but lacks a user-facing web interface. When users visit the root URL of a deployed RuvBot instance (e.g., on Cloud Run), they receive a 404 error instead of a usable chat interface.
### Requirements
1. Provide a modern, responsive chat UI out of the box
2. Support dark mode (default) and light mode themes
3. Work with the existing REST API endpoints
4. No build step required - serve static files directly
5. Support streaming responses for real-time AI interaction (deferred in this version; see Future Enhancements)
6. Mobile-friendly design
7. Model selection capability
8. Integration with CLI and npm package
### Alternatives Considered
| Option | Pros | Cons |
|--------|------|------|
| **assistant-ui** | Industry leader, 200k+ downloads, Y Combinator backed | Requires React build, adds complexity |
| **Vercel AI Elements** | Official Vercel components, AI SDK integration | Requires Next.js |
| **shadcn-chatbot-kit** | Beautiful components, shadcn design system | Requires React build |
| **Embedded HTML/CSS/JS** | No build step, portable, fast deployment | Fewer features, custom implementation |
## Decision
Implement a **lightweight embedded chat UI** using vanilla HTML, CSS, and JavaScript that:
1. Is served directly from the existing HTTP server
2. Requires no build step or additional dependencies
3. Provides a modern, accessible interface
4. Supports dark mode by default
5. Includes basic markdown rendering
6. Works seamlessly with the existing REST API
### Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ RuvBot Server │
├─────────────────────────────────────────────────────────────────┤
│ GET / → Chat UI (index.html) │
│ GET /health → Health check │
│ GET /api/models → Available models │
│ POST /api/sessions → Create session │
│ POST /api/sessions/:id/chat → Chat endpoint │
└─────────────────────────────────────────────────────────────────┘
```
### File Structure
```
src/
├── api/
│ └── public/
│ └── index.html # Chat UI (single file)
├── server.ts # Updated to serve static files
└── ...
```
### Features
1. **Theme Support**: Dark mode default, light mode toggle
2. **Model Selection**: Dropdown for available models
3. **Responsive Design**: Mobile-first approach
4. **Accessibility**: ARIA labels, keyboard navigation
5. **Markdown Rendering**: Code blocks, lists, links
6. **Error Handling**: User-friendly error messages
7. **Session Management**: Automatic session creation
8. **Real-time Updates**: Typing indicators
### CSS Design System
```css
:root {
--bg-primary: #0a0a0f; /* Dark background */
--bg-secondary: #12121a; /* Card background */
--text-primary: #f0f0f5; /* Main text */
--accent: #6366f1; /* Indigo accent */
--radius: 12px; /* Border radius */
}
```
### API Integration
The UI integrates with existing endpoints:
```javascript
// Create session
POST /api/sessions { agentId: 'default-agent' }
// Send message
POST /api/sessions/:id/chat { message: '...', model: '...' }
```
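The two calls above could be wrapped in a small client along the following lines. This is a sketch: `createChatClient` and `FetchLike` are hypothetical names, and because this ADR does not specify response shapes, results are returned as untyped JSON.

```typescript
// Narrow fetch-like signature so the sketch does not depend on DOM typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function createChatClient(baseUrl: string, fetchImpl: FetchLike) {
  async function post(path: string, body: unknown): Promise<unknown> {
    const res = await fetchImpl(`${baseUrl}${path}`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    });
    if (!res.ok) {
      throw new Error(`Request to ${path} failed with status ${res.status}`);
    }
    return res.json();
  }

  return {
    // POST /api/sessions
    createSession: (agentId = 'default-agent') => post('/api/sessions', { agentId }),
    // POST /api/sessions/:id/chat (model is omitted from the body when undefined)
    sendMessage: (sessionId: string, message: string, model?: string) =>
      post(`/api/sessions/${encodeURIComponent(sessionId)}/chat`, { message, model }),
  };
}

// In a browser, the global fetch can be passed through:
// const client = createChatClient(window.location.origin, (url, init) => fetch(url, init));
```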
## Consequences
### Positive
1. **Zero Configuration**: Works out of the box
2. **Fast Deployment**: No build step required
3. **Portable**: Single HTML file, easy to customize
4. **Lightweight**: ~25KB uncompressed
5. **Framework Agnostic**: No React/Vue/Svelte dependency
6. **Cloud Run Compatible**: Works with existing deployment
### Negative
1. **Limited Features**: No streaming UI (yet), basic markdown
2. **Manual Updates**: No component library updates
3. **Custom Code**: Maintenance responsibility
### Neutral
1. Future option to add assistant-ui or similar for advanced features
2. Can be replaced with any frontend framework later
## Implementation
### Server Changes (server.ts)
```typescript
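import { existsSync } from 'node:fs';
import { join } from 'node:path';

// Serve static files
function getChatUIPath(): string {
  const possiblePaths = [
    join(__dirname, 'api', 'public', 'index.html'),
    // ... fallback paths
  ];
  // Find first existing path. Throwing when none exists is an assumption;
  // the ADR does not specify the fallback behavior.
  const found = possiblePaths.find((candidate) => existsSync(candidate));
  if (!found) {
    throw new Error('Chat UI index.html not found');
  }
  return found;
}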
// Add root route
{ method: 'GET', pattern: /^\/$/, handler: handleRoot }
```
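The route entry above suggests a table of `{ method, pattern, handler }` records matched by regular expression. A minimal sketch of how such a table might be dispatched is shown below. The `Route` interface, the string-returning handlers, and `dispatch` are stand-ins for illustration and do not reflect the actual `server.ts` types.

```typescript
// Hypothetical route shape modeled on the { method, pattern, handler } entry.
interface Route {
  method: string;
  pattern: RegExp;
  handler: (params: string[]) => string;
}

const routes: Route[] = [
  { method: 'GET', pattern: /^\/$/, handler: () => 'chat-ui' },
  { method: 'GET', pattern: /^\/health$/, handler: () => 'health' },
  {
    method: 'POST',
    pattern: /^\/api\/sessions\/([^/]+)\/chat$/,
    handler: ([sessionId]) => `chat:${sessionId}`,
  },
];

// Returns the result of the first route whose method and pattern both match,
// passing captured groups (such as the session ID) to the handler.
function dispatch(method: string, path: string): string | undefined {
  for (const route of routes) {
    if (route.method !== method) continue;
    const match = route.pattern.exec(path);
    if (match) return route.handler(match.slice(1));
  }
  return undefined;
}
```

Anchoring the root pattern with `^` and `$` keeps `GET /` from shadowing the API routes.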
### CLI Integration
```bash
# View chat UI URL after deployment
ruvbot deploy-cloud cloudrun
# Output: URL: https://ruvbot-xxx.run.app
# Open chat UI
ruvbot open # Opens browser to chat UI
```
### npm Package
The chat UI is bundled with the npm package:
```json
{
"files": [
"dist",
"bin",
"scripts",
"src/api/public"
]
}
```
## Future Enhancements
1. **Streaming Responses**: SSE/WebSocket for real-time streaming
2. **File Uploads**: Image and document support
3. **Voice Input**: Speech-to-text integration
4. **assistant-ui Migration**: Full-featured React UI option
5. **Themes**: Additional theme presets
6. **Plugins**: Extensible UI components
## References
- [assistant-ui](https://github.com/assistant-ui/assistant-ui) - Industry-leading chat UI library
- [Vercel AI SDK](https://ai-sdk.dev/) - AI SDK with streaming support
- [shadcn/ui](https://ui.shadcn.com/) - Design system inspiration
- [ADR-013: GCP Deployment](./ADR-013-gcp-deployment.md) - Cloud Run deployment
## Changelog
| Date | Change |
|------|--------|
| 2026-01-28 | Initial version - embedded chat UI |