Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

2026-02-28 14:39:40 -05:00
parent 7885bf6278 d803bfe2b1
commit cd5943df23
7854 changed files with 3522914 additions and 0 deletions
--- a/vendor/ruvector/npm/packages/agentic-synth/docs/BENCHMARKS.md
+++ b/vendor/ruvector/npm/packages/agentic-synth/docs/BENCHMARKS.md
@@ -0,0 +1,492 @@
+# Benchmark Suite Documentation
+
+## Overview
+
+The agentic-synth benchmark suite provides comprehensive performance testing across multiple dimensions:
+- Data generation throughput
+- API latency and percentiles
+- Memory usage profiling
+- Cache effectiveness
+- Streaming performance
+- Concurrent generation scenarios
+
+## Quick Start
+
+```bash
+# Install dependencies
+npm install
+
+# Build project
+npm run build
+
+# Run all benchmarks
+npm run benchmark
+
+# Run specific benchmark
+npm run benchmark -- --suite "Throughput Test"
+
+# Run with custom configuration
+npm run benchmark -- --iterations 20 --concurrency 200
+
+# Generate report
+npm run benchmark -- --output benchmarks/report.md
+```
+
+## Benchmark Suites
+
+### 1. Throughput Benchmark
+
+**Measures**: Requests per second at various concurrency levels
+
+**Configuration**:
+```typescript
+{
+  iterations: 10,
+  concurrency: 100,
+  maxTokens: 100
+}
+```
+
+**Targets**:
+- Minimum: 10 req/s
+- Target: 50+ req/s
+- Optimal: 100+ req/s
+
+### 2. Latency Benchmark
+
+**Measures**: Response time percentiles (P50, P95, P99)
+
+**Configuration**:
+```typescript
+{
+  iterations: 50,
+  maxTokens: 50
+}
+```
+
+**Targets**:
+- P50: < 500ms
+- P95: < 800ms
+- P99: < 1000ms
+- Cached: < 100ms
+
+### 3. Memory Benchmark
+
+**Measures**: Memory usage patterns and leak detection
+
+**Configuration**:
+```typescript
+{
+  iterations: 100,
+  maxTokens: 100,
+  enableGC: true
+}
+```
+
+**Targets**:
+- Peak: < 400MB
+- Final (after GC): < 200MB
+- No memory leaks
+
+### 4. Cache Benchmark
+
+**Measures**: Cache hit rates and effectiveness
+
+**Configuration**:
+```typescript
+{
+  cacheSize: 1000,
+  ttl: 3600000,
+  repeatRatio: 0.5
+}
+```
+
+**Targets**:
+- Hit rate: > 50%
+- Optimal: > 80%
+
+### 5. Concurrency Benchmark
+
+**Measures**: Performance at various concurrency levels
+
+**Tests**: 10, 50, 100, 200 concurrent requests
+
+**Targets**:
+- 10 concurrent: < 2s total
+- 50 concurrent: < 5s total
+- 100 concurrent: < 10s total
+- 200 concurrent: < 20s total
+
+### 6. Streaming Benchmark
+
+**Measures**: Streaming performance and time-to-first-byte
+
+**Configuration**:
+```typescript
+{
+  maxTokens: 500,
+  measureFirstChunk: true
+}
+```
+
+**Targets**:
+- First chunk: < 200ms
+- Total duration: < 5s
+- Chunks: 50-100
+
+## CLI Usage
+
+### Basic Commands
+
+```bash
+# Run all benchmarks
+agentic-synth benchmark
+
+# Run specific suite
+agentic-synth benchmark --suite "Latency Test"
+
+# Custom iterations
+agentic-synth benchmark --iterations 20
+
+# Custom concurrency
+agentic-synth benchmark --concurrency 200
+
+# Output report
+agentic-synth benchmark --output report.md
+```
+
+### Advanced Options
+
+```bash
+# Full configuration
+agentic-synth benchmark \
+  --suite "All" \
+  --iterations 20 \
+  --concurrency 100 \
+  --warmup 5 \
+  --output benchmarks/detailed-report.md
+```
+
+## Programmatic Usage
+
+### Running Benchmarks
+
+```typescript
+import {
+  BenchmarkRunner,
+  ThroughputBenchmark,
+  LatencyBenchmark,
+  BenchmarkAnalyzer,
+  BenchmarkReporter
+} from '@ruvector/agentic-synth/benchmarks';
+import { AgenticSynth } from '@ruvector/agentic-synth';
+
+const synth = new AgenticSynth({
+  enableCache: true,
+  maxConcurrency: 100
+});
+
+const runner = new BenchmarkRunner();
+runner.registerSuite(new ThroughputBenchmark(synth));
+runner.registerSuite(new LatencyBenchmark(synth));
+
+const result = await runner.runAll({
+  name: 'My Benchmark',
+  iterations: 10,
+  concurrency: 100,
+  warmupIterations: 2,
+  timeout: 300000
+});
+
+console.log('Throughput:', result.metrics.throughput);
+console.log('P99 Latency:', result.metrics.p99LatencyMs);
+```
+
+### Analyzing Results
+
+```typescript
+import { BenchmarkAnalyzer } from '@ruvector/agentic-synth/benchmarks';
+
+const analyzer = new BenchmarkAnalyzer();
+analyzer.analyze(result);
+
+// Automatic bottleneck detection
+// Optimization recommendations
+// Performance comparison
+```
+
+### Generating Reports
+
+```typescript
+import { BenchmarkReporter } from '@ruvector/agentic-synth/benchmarks';
+
+const reporter = new BenchmarkReporter();
+
+// Markdown report
+await reporter.generateMarkdown([result], 'report.md');
+
+// JSON data export
+await reporter.generateJSON([result], 'data.json');
+```
+
+## CI/CD Integration
+
+### GitHub Actions
+
+```yaml
+name: Performance Benchmarks
+
+on: [push, pull_request]
+
+jobs:
+  benchmark:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v3
+        with:
+          node-version: '18'
+
+      - name: Install Dependencies
+        run: npm ci
+
+      - name: Build
+        run: npm run build
+
+      - name: Run Benchmarks
+        run: npm run benchmark:ci
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+
+      - name: Upload Report
+        uses: actions/upload-artifact@v3
+        with:
+          name: performance-report
+          path: benchmarks/performance-report.md
+
+      - name: Check Regression
+        run: |
+          if [ $? -ne 0 ]; then
+            echo "Performance regression detected!"
+            exit 1
+          fi
+```
+
+### GitLab CI
+
+```yaml
+benchmark:
+  stage: test
+  script:
+    - npm ci
+    - npm run build
+    - npm run benchmark:ci
+  artifacts:
+    paths:
+      - benchmarks/performance-report.md
+    when: always
+  only:
+    - main
+    - merge_requests
+```
+
+## Performance Regression Detection
+
+The CI runner automatically checks for regressions:
+
+```typescript
+{
+  maxP99Latency: 1000,      // 1 second
+  minThroughput: 10,        // 10 req/s
+  maxMemoryMB: 400,         // 400MB
+  minCacheHitRate: 0.5,     // 50%
+  maxErrorRate: 0.01        // 1%
+}
+```
+
+**Exit Codes**:
+- 0: All tests passed
+- 1: Performance regression detected
+
+## Report Formats
+
+### Markdown Report
+
+Includes:
+- Performance metrics table
+- Latency distribution
+- Optimization recommendations
+- Historical trends
+- Pass/fail status
+
+### JSON Report
+
+Includes:
+- Raw metrics data
+- Timestamp
+- Configuration
+- Recommendations
+- Full result objects
+
+## Performance Metrics
+
+### Collected Metrics
+
+| Metric | Description | Unit |
+|--------|-------------|------|
+| throughput | Requests per second | req/s |
+| p50LatencyMs | 50th percentile latency | ms |
+| p95LatencyMs | 95th percentile latency | ms |
+| p99LatencyMs | 99th percentile latency | ms |
+| avgLatencyMs | Average latency | ms |
+| cacheHitRate | Cache hit ratio | 0-1 |
+| memoryUsageMB | Memory usage | MB |
+| cpuUsagePercent | CPU usage | % |
+| concurrentRequests | Active requests | count |
+| errorRate | Error ratio | 0-1 |
+
+### Performance Targets
+
+| Category | Metric | Target | Optimal |
+|----------|--------|--------|---------|
+| Speed | P99 Latency | < 1000ms | < 500ms |
+| Speed | Throughput | > 10 req/s | > 50 req/s |
+| Cache | Hit Rate | > 50% | > 80% |
+| Memory | Usage | < 400MB | < 200MB |
+| Reliability | Error Rate | < 1% | < 0.1% |
+
+## Bottleneck Analysis
+
+### Automatic Detection
+
+The analyzer automatically detects:
+
+1. **Latency Bottlenecks**
+   - Slow API responses
+   - Network issues
+   - Cache misses
+
+2. **Throughput Bottlenecks**
+   - Low concurrency
+   - Sequential processing
+   - API rate limits
+
+3. **Memory Bottlenecks**
+   - Large cache size
+   - Memory leaks
+   - Excessive buffering
+
+4. **Cache Bottlenecks**
+   - Low hit rate
+   - Small cache size
+   - Poor key strategy
+
+### Recommendations
+
+Each bottleneck includes:
+- Category (cache, routing, memory, etc.)
+- Severity (low, medium, high, critical)
+- Issue description
+- Optimization recommendation
+- Estimated improvement
+- Implementation effort
+
+## Best Practices
+
+### Running Benchmarks
+
+1. **Warmup**: Always use warmup iterations (2-5)
+2. **Iterations**: Use 10+ for statistical significance
+3. **Concurrency**: Test at expected load levels
+4. **Environment**: Run in consistent environment
+5. **Monitoring**: Watch system resources
+
+### Analyzing Results
+
+1. **Trends**: Compare across multiple runs
+2. **Baselines**: Establish performance baselines
+3. **Regressions**: Set up automated checks
+4. **Profiling**: Profile bottlenecks before optimizing
+5. **Documentation**: Document optimization changes
+
+### CI/CD Integration
+
+1. **Automation**: Run on every PR/commit
+2. **Thresholds**: Set realistic regression thresholds
+3. **Artifacts**: Save reports and data
+4. **Notifications**: Alert on regressions
+5. **History**: Track performance over time
+
+## Troubleshooting
+
+### Common Issues
+
+**High Variance**:
+- Increase warmup iterations
+- Run more iterations
+- Check system load
+
+**API Errors**:
+- Verify API key
+- Check rate limits
+- Review network connectivity
+
+**Out of Memory**:
+- Reduce concurrency
+- Decrease cache size
+- Enable GC
+
+**Slow Benchmarks**:
+- Reduce iterations
+- Decrease concurrency
+- Use smaller maxTokens
+
+## Advanced Features
+
+### Custom Benchmarks
+
+```typescript
+import { BenchmarkSuite } from '@ruvector/agentic-synth/benchmarks';
+
+class CustomBenchmark implements BenchmarkSuite {
+  name = 'Custom Test';
+
+  async run(): Promise<void> {
+    // Your benchmark logic
+  }
+}
+
+runner.registerSuite(new CustomBenchmark());
+```
+
+### Custom Thresholds
+
+```typescript
+import { BottleneckAnalyzer } from '@ruvector/agentic-synth/benchmarks';
+
+const analyzer = new BottleneckAnalyzer();
+analyzer.setThresholds({
+  maxP99LatencyMs: 500,    // Stricter than default
+  minThroughput: 50,       // Higher than default
+  maxMemoryMB: 300         // Lower than default
+});
+```
+
+### Performance Hooks
+
+```bash
+# Pre-benchmark hook
+npx claude-flow@alpha hooks pre-task --description "Benchmarking"
+
+# Post-benchmark hook
+npx claude-flow@alpha hooks post-task --task-id "bench-123"
+```
+
+## Resources
+
+- [Performance Optimization Guide](./PERFORMANCE.md)
+- [API Documentation](./API.md)
+- [Examples](../examples/)
+- [Source Code](../src/benchmarks/)