Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
# Performance Optimization Guide

## Overview

Agentic-Synth is optimized for high-performance synthetic data generation with the following targets:

- **Sub-second response times** for cached requests
- **100+ concurrent generations** supported
- **Memory efficient** data handling (< 400MB)
- **50%+ cache hit rate** for typical workloads

## Performance Targets

| Metric | Target | Notes |
|--------|--------|-------|
| P99 Latency | < 1000ms | < 100ms for cached requests |
| Throughput | > 10 req/s | Scales with concurrency |
| Memory Usage | < 400MB | With 1000-item cache |
| Cache Hit Rate | > 50% | Depends on workload patterns |
| Error Rate | < 1% | With retry logic |

## Optimization Strategies

### 1. Context Caching

**Configuration:**

```typescript
const synth = new AgenticSynth({
  enableCache: true,
  cacheSize: 1000,   // Adjust based on memory
  cacheTTL: 3600000, // 1 hour in milliseconds
});
```

**Benefits:**

- Reduces API calls by 50-80%
- Sub-100ms latency for cache hits
- Automatic LRU eviction

**Best Practices:**

- Use consistent prompts for better cache hits
- Increase cache size for repetitive workloads
- Monitor cache hit rate with `synth.getMetrics()`
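
Cache lookups are keyed on the prompt string, so trivially different formatting ("Generate users" vs. "generate  users\n") produces misses. A minimal sketch of normalizing prompts before submission (the `normalizePrompt` helper is illustrative, not part of the library):

```typescript
// Hypothetical helper: collapse whitespace and casing differences so
// semantically identical prompts map to the same cache key.
function normalizePrompt(prompt: string): string {
  return prompt
    .trim()
    .replace(/\s+/g, ' ') // collapse runs of whitespace into one space
    .toLowerCase();
}

// Both variants now resolve to the same cache entry.
const a = normalizePrompt('Generate 10 users ');
const b = normalizePrompt('generate  10 users');
```

Lowercasing is a judgment call: it improves hit rates but changes the prompt, so apply it only where casing does not carry meaning.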

### 2. Model Routing

**Configuration:**

```typescript
const synth = new AgenticSynth({
  modelPreference: [
    'claude-sonnet-4-5-20250929',
    'claude-3-5-sonnet-20241022'
  ],
});
```

**Features:**

- Automatic load balancing
- Performance-based routing
- Error handling and fallback
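
Preference-ordered fallback amounts to trying each model in turn and returning the first success. A sketch of that behavior, assuming only that each model call is an async function (`callModel` is a stand-in, not the library's API):

```typescript
// Try each model in preference order; fall through to the next on error.
async function generateWithFallback<T>(
  models: string[],
  callModel: (model: string) => Promise<T>
): Promise<T> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await callModel(model);
    } catch (err) {
      lastError = err; // remember the failure, try the next preference
    }
  }
  throw lastError; // every model failed
}
```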

### 3. Concurrent Generation

**Configuration:**

```typescript
const synth = new AgenticSynth({
  maxConcurrency: 100, // Adjust based on API limits
});
```

**Usage:**

```typescript
const prompts = [...]; // 100+ prompts
const results = await synth.generateBatch(prompts, {
  maxTokens: 500
});
```

**Performance:**

- 2-3x faster than sequential
- Respects concurrency limits
- Automatic batching
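
The concurrency cap can be pictured as a fixed worker pool pulling from a shared queue; this is a sketch of the general technique, not the library's actual internals:

```typescript
// Run at most `limit` tasks at once while preserving result order.
// Workers claim indices from a shared counter; since JS is
// single-threaded, `next++` cannot race between workers.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++; // claim the next task index
      results[i] = await tasks[i]();
    }
  }

  const pool = Array.from(
    { length: Math.max(1, Math.min(limit, tasks.length)) },
    () => worker()
  );
  await Promise.all(pool);
  return results;
}
```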

### 4. Memory Management

**Configuration:**

```typescript
const synth = new AgenticSynth({
  memoryLimit: 512 * 1024 * 1024, // 512MB
});
```

**Features:**

- Automatic memory tracking
- LRU eviction when over limit
- Periodic cleanup with `synth.optimize()`
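
Eviction under a memory limit can be sketched as an LRU cache with a byte budget. This illustrates the general technique only (sizes are approximated by string length; it is not the library's implementation):

```typescript
// LRU cache bounded by total bytes rather than entry count.
// Map iteration order is insertion order, so the first key is the
// least recently used once we re-insert on every hit.
class ByteBudgetCache {
  private entries = new Map<string, string>();
  private bytes = 0;
  constructor(private limit: number) {}

  set(key: string, value: string): void {
    if (this.entries.has(key)) this.remove(key);
    this.entries.set(key, value);
    this.bytes += value.length;
    // Evict oldest entries until we are back under the budget.
    for (const k of this.entries.keys()) {
      if (this.bytes <= this.limit) break;
      this.remove(k);
    }
  }

  get(key: string): string | undefined {
    const v = this.entries.get(key);
    if (v !== undefined) {
      // Refresh recency: move the entry to the back of the Map.
      this.entries.delete(key);
      this.entries.set(key, v);
    }
    return v;
  }

  private remove(key: string): void {
    const v = this.entries.get(key);
    if (v !== undefined) {
      this.bytes -= v.length;
      this.entries.delete(key);
    }
  }

  get size(): number { return this.entries.size; }
}
```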

### 5. Streaming for Large Outputs

**Usage:**

```typescript
const stream = synth.generateStream(prompt, {
  maxTokens: 4096
});

for await (const chunk of stream) {
  // Process chunk immediately
  processChunk(chunk);
}
```

**Benefits:**

- Lower time-to-first-byte
- Reduced memory usage
- Better user experience
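
The memory benefit comes from handling one chunk at a time instead of buffering the whole output. A self-contained sketch using a stand-in async generator (`fakeStream` plays the role of `generateStream` here):

```typescript
// Illustrative stand-in for a token stream.
async function* fakeStream(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) {
    yield chunk;
  }
}

// Each chunk is processed and then dropped, so peak memory stays
// proportional to one chunk rather than the full output.
async function totalLength(stream: AsyncIterable<string>): Promise<number> {
  let total = 0;
  for await (const chunk of stream) {
    total += chunk.length; // "process" the chunk, then let it go
  }
  return total;
}
```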

## Benchmarking

### Running Benchmarks

```bash
# Run all benchmarks
npm run benchmark

# Run specific suite
npm run benchmark -- --suite "Throughput Test"

# With custom settings
npm run benchmark -- --iterations 20 --concurrency 200

# Generate report
npm run benchmark -- --output benchmarks/report.md
```

### Benchmark Suites

1. **Throughput Test**: Measures requests per second
2. **Latency Test**: Measures P50/P95/P99 latencies
3. **Memory Test**: Measures memory usage and leaks
4. **Cache Test**: Measures cache effectiveness
5. **Concurrency Test**: Tests concurrent request handling
6. **Streaming Test**: Measures streaming performance
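
For reference, P50/P95/P99 figures like those in the latency suite follow the standard nearest-rank percentile computation; a self-contained sketch (not the benchmark runner's actual code):

```typescript
// Nearest-rank percentile: sort the samples and index by rank.
function percentile(samplesMs: number[], p: number): number {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}
```

Note that P99 only becomes meaningful with enough samples; with fewer than 100 measurements it simply returns the maximum.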

### Analyzing Results

```bash
# Analyze performance
npm run perf:analyze

# Generate detailed report
npm run perf:report
```

## Bottleneck Detection

The built-in bottleneck analyzer automatically detects:

### 1. Latency Bottlenecks

- **Cause**: Slow API responses, network issues
- **Solution**: Increase cache size, optimize prompts
- **Impact**: 30-50% latency reduction

### 2. Throughput Bottlenecks

- **Cause**: Low concurrency, sequential processing
- **Solution**: Increase `maxConcurrency`, use the batch API
- **Impact**: 2-3x throughput increase

### 3. Memory Bottlenecks

- **Cause**: Large cache, memory leaks
- **Solution**: Reduce cache size, call `synth.optimize()`
- **Impact**: 40-60% memory reduction

### 4. Cache Bottlenecks

- **Cause**: Low hit rate, small cache
- **Solution**: Increase cache size, normalize cache keys
- **Impact**: 20-40% hit-rate improvement

## CI/CD Integration

### Performance Regression Detection

```bash
# Run in CI
npm run benchmark:ci
```

**Features:**

- Automatic threshold checking
- Fails the build on regression
- Generates reports to upload as CI artifacts
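
The threshold check behind a CI gate like this can be sketched as a comparison against a stored baseline; the field names and tolerance below are assumptions for illustration, not the tool's actual report schema:

```typescript
// Summary shape assumed for this sketch.
interface BenchSummary {
  p99LatencyMs: number;
  throughputRps: number;
}

// Flag latency growth or throughput loss beyond a noise tolerance.
function hasRegression(
  current: BenchSummary,
  baseline: BenchSummary,
  tolerance = 0.1 // allow 10% noise before failing the build
): boolean {
  const latencyRegressed =
    current.p99LatencyMs > baseline.p99LatencyMs * (1 + tolerance);
  const throughputRegressed =
    current.throughputRps < baseline.throughputRps * (1 - tolerance);
  return latencyRegressed || throughputRegressed;
}
```

In CI, a wrapper would load the baseline from a committed file, run the benchmarks, and exit non-zero when `hasRegression` returns true.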

### GitHub Actions Example

```yaml
- name: Performance Benchmarks
  run: npm run benchmark:ci

- name: Upload Report
  uses: actions/upload-artifact@v3
  with:
    name: performance-report
    path: benchmarks/performance-report.md
```

## Profiling

### CPU Profiling

```bash
npm run benchmark:profile
node --prof-process isolate-*.log > profile.txt
```

### Memory Profiling

```bash
node --expose-gc --max-old-space-size=512 dist/benchmarks/runner.js
```

### Chrome DevTools

```bash
node --inspect-brk dist/benchmarks/runner.js
# Open chrome://inspect
```

## Optimization Checklist

- [ ] Enable caching for repetitive workloads
- [ ] Set an appropriate cache size (1000+ items)
- [ ] Configure concurrency based on API limits
- [ ] Use the batch API for multiple generations
- [ ] Implement streaming for large outputs
- [ ] Monitor memory usage regularly
- [ ] Run benchmarks before releases
- [ ] Set up CI/CD performance tests
- [ ] Profile bottlenecks periodically
- [ ] Optimize prompt patterns for cache hits

## Performance Monitoring

### Runtime Metrics

```typescript
// Get current metrics
const metrics = synth.getMetrics();
console.log('Cache:', metrics.cache);
console.log('Memory:', metrics.memory);
console.log('Router:', metrics.router);
```

### Performance Monitor

```typescript
import { PerformanceMonitor } from '@ruvector/agentic-synth';

const monitor = new PerformanceMonitor();
monitor.start();

// ... run workload ...

const metrics = monitor.getMetrics();
console.log('Throughput:', metrics.throughput);
console.log('P99 Latency:', metrics.p99LatencyMs);
```

### Bottleneck Analysis

```typescript
import { BottleneckAnalyzer } from '@ruvector/agentic-synth';

const analyzer = new BottleneckAnalyzer();
const report = analyzer.analyze(metrics);

if (report.detected) {
  console.log('Bottlenecks:', report.bottlenecks);
  console.log('Recommendations:', report.recommendations);
}
```

## Best Practices

1. **Cache Strategy**: Use prompts as cache keys and normalize their formatting
2. **Concurrency**: Start with 100 and increase based on API limits
3. **Memory**: Monitor with `getMetrics()`, call `optimize()` periodically
4. **Streaming**: Use for outputs longer than 1000 tokens
5. **Benchmarking**: Run before releases and track trends over time
6. **Monitoring**: Enable in production and set up alerts
7. **Optimization**: Profile first, then optimize the identified bottlenecks
8. **Testing**: Include performance tests in CI/CD

## Troubleshooting

### High Latency

- Check the cache hit rate
- Increase cache size
- Optimize prompt patterns
- Check network connectivity

### Low Throughput

- Increase `maxConcurrency`
- Use the batch API
- Reduce `maxTokens`
- Check API rate limits

### High Memory Usage

- Reduce cache size
- Call `optimize()` regularly
- Use streaming for large outputs
- Check for memory leaks

### Low Cache Hit Rate

- Normalize prompt formatting
- Increase cache size
- Increase the cache TTL
- Review workload patterns

## Additional Resources

- [API Documentation](./API.md)
- [Examples](../examples/)
- [Benchmark Source](../src/benchmarks/)
- [GitHub Issues](https://github.com/ruvnet/ruvector/issues)