Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

vendor/ruvector/benchmarks/docs/LOAD_TEST_SCENARIOS.md (vendored, new file, 582 lines)
@@ -0,0 +1,582 @@
# RuVector Load Testing Scenarios

## Overview

This document defines comprehensive load testing scenarios for the globally distributed RuVector system, targeting 500 million concurrent learning streams with burst capacity up to 25 billion.

## Test Environment

### Global Regions
- **Americas**: us-central1, us-east1, us-west1, southamerica-east1
- **Europe**: europe-west1, europe-west3, europe-north1
- **Asia-Pacific**: asia-east1, asia-southeast1, asia-northeast1, australia-southeast1
- **Total**: 11 regions

### Infrastructure
- **Cloud Run**: Auto-scaling instances (10-1000 per region)
- **Load Balancer**: Global HTTPS LB with Cloud CDN
- **Database**: Cloud SQL PostgreSQL (multi-region)
- **Cache**: Memorystore Redis (128GB per region)
- **Monitoring**: Cloud Monitoring + OpenTelemetry

---

## Scenario Categories

### 1. Baseline Scenarios

#### 1.1 Steady State (500M Concurrent)
**Objective**: Validate that the system handles the target baseline load

**Configuration**:
- Total connections: 500M globally
- Distribution: Proportional to region capacity
  - Tier-1 regions (5): 80M each = 400M
  - Tier-2 regions (6): ~16.7M each = 100M
- Query rate: 50K QPS globally
- Test duration: 4 hours
- Ramp-up: 30 minutes

**Success Criteria**:
- P99 latency < 50ms
- P50 latency < 10ms
- Error rate < 0.1%
- No memory leaks
- CPU utilization 60-80%
- All regions healthy

**Load Pattern**:
```javascript
{
  type: "ramped-arrival-rate",
  stages: [
    { duration: "30m", target: 50000 }, // Ramp up
    { duration: "4h", target: 50000 },  // Steady
    { duration: "15m", target: 0 }      // Ramp down
  ]
}
```
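
Profiles like the one above can be sanity-checked offline before burning test budget. The sketch below is plain Node.js (not a k6 script; `totalQueries` and `parseDuration` are hypothetical helpers) and integrates the target QPS over each stage, assuming k6-style linear ramps between targets:

```javascript
// Approximate total queries issued by an arrival-rate profile, assuming each
// stage ramps linearly from the previous target (trapezoidal integration).
function totalQueries(stages, startTarget = 0) {
  let prev = startTarget;
  let total = 0;
  for (const { duration, target } of stages) {
    total += ((prev + target) / 2) * parseDuration(duration);
    prev = target;
  }
  return total;
}

// Parse "30m" / "4h" / "90s" into seconds.
function parseDuration(d) {
  const [, n, unit] = d.match(/^(\d+)([smh])$/);
  return Number(n) * { s: 1, m: 60, h: 3600 }[unit];
}

const steadyState = [
  { duration: "30m", target: 50000 }, // ramp up
  { duration: "4h", target: 50000 },  // steady
  { duration: "15m", target: 0 },     // ramp down
];
console.log(totalQueries(steadyState)); // 787500000 (~788M queries end to end)
```

A figure like this is useful for pre-sizing result storage and estimating per-query costs before the run.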
#### 1.2 Daily Peak (750M Concurrent)
**Objective**: Handle 1.5x baseline during peak hours

**Configuration**:
- Total connections: 750M globally
- Peak hours: 18:00-22:00 local time per region
- Query rate: 75K QPS
- Test duration: 5 hours
- Multiple peaks (simulate time zones)

**Success Criteria**:
- P99 latency < 75ms
- P50 latency < 15ms
- Error rate < 0.5%
- Auto-scaling triggers within 60s
- Cost < $5K for test

---

### 2. Burst Scenarios

#### 2.1 World Cup Final (50x Burst)
**Objective**: Handle a massive spike during a major sporting event

**Event Profile**:
- **Pre-event**: 30 minutes before kickoff
- **Peak**: During match (90 minutes + 15 min halftime)
- **Post-event**: 60 minutes after final whistle
- **Geography**: Concentrated in specific regions (France, Argentina)

**Configuration**:
- Baseline: 500M concurrent
- Peak: 25B concurrent (50x)
- Primary regions: europe-west3 (France), southamerica-east1 (Argentina)
- Secondary spillover: All Europe/Americas regions
- Query rate: 2.5M QPS at peak
- Test duration: 4 hours

**Load Pattern**:
```javascript
{
  stages: [
    // Pre-event buzz (30 min before)
    { duration: "30m", target: 500000 },  // 10x baseline
    { duration: "15m", target: 2500000 }, // 50x PEAK
    // First half (45 min)
    { duration: "45m", target: 2500000 }, // Sustained peak
    // Halftime (15 min - slight drop)
    { duration: "15m", target: 1500000 }, // 30x
    // Second half (45 min)
    { duration: "45m", target: 2500000 }, // Back to peak
    // Extra time / penalties (30 min)
    { duration: "30m", target: 3000000 }, // 60x SUPER PEAK
    // Post-game analysis (30 min)
    { duration: "30m", target: 1000000 }, // 20x
    // Gradual decline (30 min)
    { duration: "30m", target: 100000 }   // 2x
  ]
}
```

**Regional Distribution**:
- **France**: 40% (10B peak)
- **Argentina**: 35% (8.75B peak)
- **Spain/Italy/Portugal**: 10% (2.5B peak)
- **Rest of Europe**: 8% (2B peak)
- **Americas**: 5% (1.25B peak)
- **Asia/Pacific**: 2% (500M peak)
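
The per-region peaks above follow directly from the share table. A minimal, illustrative Node.js sketch (the `regionalPeaks` helper and the share keys are assumptions, not part of any RuVector API) makes the arithmetic explicit:

```javascript
// Shares mirror the World Cup distribution table above; they must sum to 1.
const WORLD_CUP_SHARES = {
  "france": 0.40,
  "argentina": 0.35,
  "spain-italy-portugal": 0.10,
  "rest-of-europe": 0.08,
  "americas": 0.05,
  "asia-pacific": 0.02,
};

// Split a global peak (connections) across regions by percentage share.
function regionalPeaks(globalPeak, shares) {
  return Object.fromEntries(
    Object.entries(shares).map(([region, share]) => [
      region,
      Math.round(globalPeak * share), // round to whole connections
    ])
  );
}

const peaks = regionalPeaks(25e9, WORLD_CUP_SHARES);
console.log(peaks["france"]); // 10000000000 (the 10B figure above)
```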
**Success Criteria**:
- System survives without crash
- P99 latency < 200ms (degraded acceptable)
- P50 latency < 50ms
- Error rate < 5% (acceptable during super peak)
- Auto-scaling completes within 10 minutes
- No cascading failures
- Graceful degradation activated when needed
- Cost < $100K for full test

**Pre-warming**:
- Enable predictive scaling 15 minutes before test
- Pre-allocate 25x capacity in primary regions
- Warm up CDN caches
- Increase database connection pools

#### 2.2 Product Launch (10x Burst)
**Objective**: Handle a viral traffic spike (e.g., AI model release)

**Configuration**:
- Baseline: 500M concurrent
- Peak: 5B concurrent (10x)
- Distribution: Global, concentrated in US
- Query rate: 500K QPS
- Test duration: 2 hours
- Pattern: Sudden spike, gradual decline

**Load Pattern**:
```javascript
{
  stages: [
    { duration: "5m", target: 500000 },  // 10x instant spike
    { duration: "30m", target: 500000 }, // Sustained
    { duration: "45m", target: 300000 }, // Gradual decline
    { duration: "40m", target: 100000 }  // Return to normal
  ]
}
```

**Success Criteria**:
- Reactive scaling responds within 60s
- P99 latency < 100ms
- Error rate < 2%
- No downtime

#### 2.3 Flash Crowd (25x Burst)
**Objective**: Unpredictable viral event

**Configuration**:
- Baseline: 500M concurrent
- Peak: 12.5B concurrent (25x)
- Geography: Unpredictable (use US for test)
- Query rate: 1.25M QPS
- Test duration: 90 minutes
- Pattern: Very rapid spike (< 2 minutes)

**Load Pattern**:
```javascript
{
  stages: [
    { duration: "2m", target: 1250000 },  // 25x in 2 minutes!
    { duration: "30m", target: 1250000 }, // Hold peak
    { duration: "30m", target: 750000 },  // Decline
    { duration: "28m", target: 100000 }   // Return
  ]
}
```

**Success Criteria**:
- System survives without manual intervention
- Reactive scaling activates immediately
- P99 latency < 150ms
- Error rate < 3%
- Cost cap respected

---

### 3. Failover Scenarios

#### 3.1 Single Region Failure
**Objective**: Validate regional failover

**Configuration**:
- Baseline: 500M concurrent
- Failed region: europe-west1 (80M connections)
- Failover targets: europe-west3, europe-north1
- Query rate: 50K QPS
- Test duration: 1 hour
- Failure trigger: 30 minutes into test

**Procedure**:
1. Run baseline load for 30 minutes
2. Simulate region failure (kill all instances in europe-west1)
3. Observe failover behavior
4. Measure recovery time
5. Validate data consistency
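
One way to reason about step 3 is to redistribute the failed region's 80M connections across the failover targets in proportion to their spare capacity. The sketch below is illustrative only; the capacities are hypothetical numbers, not measurements of the real deployment:

```javascript
// Redistribute a failed region's connections to failover targets in
// proportion to each target's spare capacity (capacity - current load).
function redistribute(failedConnections, targets) {
  const spare = targets.map((t) => t.capacity - t.current);
  const totalSpare = spare.reduce((a, b) => a + b, 0);
  return targets.map((t, i) => ({
    region: t.region,
    added: failedConnections * (spare[i] / totalSpare),
  }));
}

// europe-west1 fails with 80M connections; europe-west3 and europe-north1
// have (hypothetical) headroom of 60M and 20M respectively.
const plan = redistribute(80e6, [
  { region: "europe-west3", capacity: 140e6, current: 80e6 }, // 60M spare
  { region: "europe-north1", capacity: 30e6, current: 10e6 }, // 20M spare
]);
console.log(plan[0].added, plan[1].added); // 60000000 20000000
```

If the combined spare capacity is smaller than the failed region's load, graceful degradation (see the burst scenarios) has to absorb the difference.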
**Success Criteria**:
- Failover completes within 60 seconds
- Connection loss < 5%
- No data loss
- P99 latency spike < 200ms during failover
- Automatic recovery when region restored

#### 3.2 Multi-Region Cascade Failure
**Objective**: Test disaster recovery

**Configuration**:
- Baseline: 500M concurrent
- Failed regions: europe-west1, europe-west3 (160M connections)
- Failover: Global redistribution
- Test duration: 2 hours
- Progressive failures (15 min apart)

**Procedure**:
1. Run baseline load
2. Kill europe-west1 at T+30m
3. Kill europe-west3 at T+45m
4. Observe cascade prevention
5. Validate global recovery

**Success Criteria**:
- No cascading failures
- Circuit breakers activate
- Graceful degradation if needed
- Connection loss < 10%
- System remains stable

#### 3.3 Database Failover
**Objective**: Test database resilience

**Configuration**:
- Baseline: 500M concurrent
- Database: Trigger Cloud SQL failover to replica
- Query rate: 50K QPS (read-heavy)
- Test duration: 1 hour
- Failure trigger: 20 minutes into test

**Success Criteria**:
- Failover completes within 30 seconds
- Connection pool recovers automatically
- Read queries continue with < 5% errors
- Write queries resume after failover
- No permanent data loss

---

### 4. Workload Scenarios

#### 4.1 Read-Heavy (90% Reads)
**Objective**: Validate cache effectiveness

**Configuration**:
- Total connections: 500M
- Query mix: 90% similarity search, 10% updates
- Cache hit rate target: > 75%
- Query rate: 50K QPS
- Test duration: 2 hours

**Success Criteria**:
- P99 latency < 30ms (due to caching)
- Cache hit rate > 75%
- Database CPU < 50%

#### 4.2 Write-Heavy (40% Writes)
**Objective**: Test write throughput

**Configuration**:
- Total connections: 500M
- Query mix: 60% reads, 40% vector updates
- Query rate: 50K QPS
- Test duration: 2 hours
- Vector dimensions: 768

**Success Criteria**:
- P99 latency < 100ms
- Database CPU < 80%
- Replication lag < 5 seconds
- No write conflicts

#### 4.3 Mixed Workload (Realistic)
**Objective**: Simulate production traffic

**Configuration**:
- Total connections: 500M
- Query mix:
  - 70% similarity search
  - 15% filtered search
  - 10% vector inserts
  - 5% deletes
- Query rate: 50K QPS
- Test duration: 4 hours
- Varying vector dimensions (384, 768, 1536)

**Success Criteria**:
- P99 latency < 50ms
- All operations succeed
- Resource utilization balanced
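
A load generator can realize the 70/15/10/5 mix above with a single uniform random draw against cumulative weights. A minimal sketch (`pickOperation` is a hypothetical helper, not part of the benchmark suite's API):

```javascript
// The mixed-workload query mix from the configuration above.
const QUERY_MIX = [
  { op: "similarity_search", weight: 0.70 },
  { op: "filtered_search", weight: 0.15 },
  { op: "insert", weight: 0.10 },
  { op: "delete", weight: 0.05 },
];

// Pick an operation type using one uniform draw in [0, 1).
function pickOperation(mix, r = Math.random()) {
  let cumulative = 0;
  for (const { op, weight } of mix) {
    cumulative += weight;
    if (r < cumulative) return op;
  }
  return mix[mix.length - 1].op; // guard against floating-point drift
}

console.log(pickOperation(QUERY_MIX, 0.72)); // "filtered_search"
```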
---

### 5. Stress Scenarios

#### 5.1 Gradual Load Increase
**Objective**: Find the breaking point

**Configuration**:
- Start: 100M concurrent
- End: Until the system breaks
- Increment: +100M every 30 minutes
- Query rate: Proportional to connections
- Test duration: Until failure

**Success Criteria**:
- Identify maximum capacity
- Measure degradation curve
- Observe failure modes
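
The step profile above is mechanical to generate. A sketch (the real test runs until failure; the ceiling parameter here is only so the example terminates):

```javascript
// Build the gradual-increase profile: start at `startM` million concurrent
// and add `stepM` million every 30 minutes up to `ceilingM` million.
function gradualStages(startM, stepM, ceilingM) {
  const stages = [];
  for (let target = startM; target <= ceilingM; target += stepM) {
    stages.push({ duration: "30m", target: target * 1e6 });
  }
  return stages;
}

const profile = gradualStages(100, 100, 500);
console.log(profile.length); // 5 steps: 100M, 200M, 300M, 400M, 500M
```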
#### 5.2 Long-Duration Soak Test
**Objective**: Detect memory leaks and resource exhaustion

**Configuration**:
- Total connections: 500M
- Query rate: 50K QPS
- Test duration: 24 hours
- Pattern: Steady state

**Success Criteria**:
- No memory leaks
- No connection leaks
- Stable performance over time
- Resource cleanup works

---

## Test Execution Strategy

### Sequential Execution (Standard Suite)
Total time: ~20 hours

1. Baseline Steady State (4h)
2. Daily Peak (5h)
3. Product Launch 10x (2h)
4. Single Region Failover (1h)
5. Read-Heavy Workload (2h)
6. Write-Heavy Workload (2h)
7. Mixed Workload (4h)

### Burst Suite (Special Events)
Total time: ~8.5 hours

1. World Cup 50x (4h)
2. Flash Crowd 25x (1.5h)
3. Multi-Region Cascade (2h)
4. Database Failover (1h)

### Quick Validation (Smoke Test)
Total time: ~2 hours

1. Baseline Steady State - 30 minutes
2. Product Launch 10x - 30 minutes
3. Single Region Failover - 30 minutes
4. Mixed Workload - 30 minutes

---

## Monitoring During Tests

### Real-Time Metrics
- Connection count per region
- Query latency percentiles (p50, p95, p99)
- Error rates by type
- CPU/Memory utilization
- Network throughput
- Database connections
- Cache hit rates

### Alerts
- P99 latency > 50ms (warning)
- P99 latency > 100ms (critical)
- Error rate > 1% (warning)
- Error rate > 5% (critical)
- Region unhealthy
- Database connections > 90%
- Cost > $10K/hour
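
The latency and error-rate thresholds above map naturally onto a severity function. A sketch (`alertLevel` is a hypothetical helper; the region, connection-pool, and cost alerts would be separate checks):

```javascript
// Map observed p99 latency (ms) and error rate (%) onto the alert levels
// defined above: critical trumps warning, warning trumps ok.
function alertLevel({ p99Ms, errorRatePct }) {
  if (p99Ms > 100 || errorRatePct > 5) return "critical";
  if (p99Ms > 50 || errorRatePct > 1) return "warning";
  return "ok";
}

console.log(alertLevel({ p99Ms: 42, errorRatePct: 0.1 })); // "ok"
console.log(alertLevel({ p99Ms: 80, errorRatePct: 0.1 })); // "warning"
console.log(alertLevel({ p99Ms: 80, errorRatePct: 6 }));   // "critical"
```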
### Dashboards
1. Executive: High-level metrics, SLA status
2. Operations: Regional health, resource utilization
3. Cost: Hourly spend, projections
4. Performance: Latency distributions, throughput

---

## Cost Estimates

### Per-Test Costs

| Scenario | Duration | Peak Load | Estimated Cost |
|----------|----------|-----------|----------------|
| Baseline Steady | 4h | 500M | $180 |
| Daily Peak | 5h | 750M | $350 |
| World Cup 50x | 4h | 25B | $80,000 |
| Product Launch 10x | 2h | 5B | $3,600 |
| Flash Crowd 25x | 1.5h | 12.5B | $28,000 |
| Single Region Failover | 1h | 500M | $45 |
| Workload Tests | 2h | 500M | $90 |

### Full Suite Costs
- **Standard Suite**: ~$4.5K
- **Burst Suite**: ~$112K
- **Quick Validation**: ~$150
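
A rough cost-per-million-queries figure falls out of the table above. The numbers below are back-of-envelope, using the steady-state scenario's estimated cost and its 50K QPS over the 4-hour steady phase (ramps ignored):

```javascript
// Cost per million queries, given a total test cost and total query count.
function costPerMillionQueries(totalCostUsd, totalQueries) {
  return totalCostUsd / (totalQueries / 1e6);
}

// Baseline steady state: ~$180 for 50K QPS x 4h = 720M queries.
const cpm = costPerMillionQueries(180, 50000 * 4 * 3600);
console.log(cpm.toFixed(2)); // "0.25" (dollars per million queries)
```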
**Cost Optimization**:
- Use committed use discounts (30% off)
- Run tests in low-cost regions when possible
- Use preemptible instances for load generators
- Leverage CDN caching
- Clean up resources immediately after tests

---

## Pre-Test Checklist

### Infrastructure
- [ ] All regions deployed and healthy
- [ ] Load balancer configured
- [ ] CDN enabled
- [ ] Database replicas ready
- [ ] Redis caches warmed
- [ ] Monitoring dashboards set up
- [ ] Alerting policies active
- [ ] Budget alerts configured

### Load Generation
- [ ] k6 scripts validated
- [ ] Load generators deployed in all regions
- [ ] Test data prepared
- [ ] Baseline traffic running
- [ ] Credentials configured
- [ ] Results storage ready

### Team
- [ ] On-call engineer available
- [ ] Communication channels open (Slack)
- [ ] Runbook reviewed
- [ ] Rollback plan ready
- [ ] Stakeholders notified

---

## Post-Test Analysis

### Deliverables
1. Test execution log
2. Metrics summary (latency, throughput, errors)
3. SLA compliance report
4. Cost breakdown
5. Bottleneck analysis
6. Recommendations document
7. Performance comparison (vs. previous tests)

### Key Questions
- Did we meet SLA targets?
- Where did bottlenecks occur?
- How well did auto-scaling perform?
- Were there any unexpected failures?
- What was the actual cost vs. estimate?
- What improvements should we make?

---

## Example: Running World Cup Test

```bash
# 1. Pre-warm infrastructure
cd /home/user/ruvector/src/burst-scaling
npm run build
node dist/burst-predictor.js --event "World Cup Final" --time "2026-07-15T18:00:00Z"

# 2. Deploy load generators
cd /home/user/ruvector/benchmarks
npm run deploy:generators

# 3. Run scenario
npm run scenario:worldcup -- \
  --regions "europe-west3,southamerica-east1" \
  --peak-multiplier 50 \
  --duration "3h" \
  --enable-notifications

# 4. Monitor (separate terminal)
npm run dashboard

# 5. Collect results
npm run analyze -- --test-id "worldcup-2026-final-test"

# 6. Generate report
npm run report -- --test-id "worldcup-2026-final-test" --format pdf
```

---

## Troubleshooting

### High Error Rates
- Check: Database connection pool exhaustion
- Check: Network bandwidth limits
- Check: Rate limiting too aggressive
- Action: Scale up resources or enable degradation

### High Latency
- Check: Cold cache (low hit rate)
- Check: Database query performance
- Check: Network latency between regions
- Action: Warm caches, optimize queries, adjust routing

### Failed Auto-Scaling
- Check: GCP quotas and limits
- Check: Budget caps
- Check: IAM permissions
- Action: Request quota increase, adjust caps

### Cost Overruns
- Check: Instances not scaling down
- Check: Database overprovisioned
- Check: Excessive logging
- Action: Force scale-in, reduce logging verbosity

---

## Next Steps

1. **Run Quick Validation**: Ensure the system is ready
2. **Run Standard Suite**: Comprehensive testing
3. **Schedule Burst Tests**: Coordinate with the team (expensive!)
4. **Iterate Based on Results**: Tune thresholds and configurations
5. **Document Learnings**: Update runbooks and architecture docs

---

## References

- [Architecture Overview](/home/user/ruvector/docs/cloud-architecture/architecture-overview.md)
- [Scaling Strategy](/home/user/ruvector/docs/cloud-architecture/scaling-strategy.md)
- [Burst Scaling](/home/user/ruvector/src/burst-scaling/README.md)
- [Benchmarking Guide](/home/user/ruvector/benchmarks/README.md)
- [Operations Runbook](/home/user/ruvector/src/burst-scaling/RUNBOOK.md)

---

**Document Version**: 1.0
**Last Updated**: 2025-11-20
**Author**: RuVector Performance Team

vendor/ruvector/benchmarks/docs/QUICKSTART.md (vendored, new file, 235 lines)
@@ -0,0 +1,235 @@
# RuVector Benchmarks - Quick Start Guide

Get up and running with RuVector benchmarks in 5 minutes!

## Prerequisites

- Node.js 18+ and npm
- k6 load testing tool
- Access to a RuVector cluster

## Installation

### Step 1: Install k6

**macOS:**
```bash
brew install k6
```

**Linux (Debian/Ubuntu):**
```bash
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
  --keyserver hkp://keyserver.ubuntu.com:80 \
  --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | \
  sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6
```

**Windows:**
```powershell
choco install k6
```

### Step 2: Run Setup Script

```bash
cd /home/user/ruvector/benchmarks
./setup.sh
```

This will:
- Check dependencies
- Install TypeScript/ts-node
- Create the results directory
- Configure the environment

### Step 3: Configure Environment

Edit the `.env` file with your cluster URL:

```bash
BASE_URL=https://your-ruvector-cluster.example.com
PARALLEL=1
ENABLE_HOOKS=true
```

## Running Your First Test

### Quick Validation (45 minutes)

```bash
npm run test:quick
```

This runs the `baseline_100m` scenario:
- 100M concurrent connections
- 30 minutes steady-state
- Validates basic functionality

### View Results

```bash
# Start visualization dashboard
npm run dashboard

# Open in browser
open http://localhost:8000/visualization-dashboard.html
```

## Common Scenarios

### Baseline Test (500M connections)
```bash
npm run test:baseline
```
Duration: 3h 15m

### Burst Test (10x spike)
```bash
npm run test:burst
```
Duration: 20m

### Standard Test Suite
```bash
npm run test:standard
```
Duration: ~6 hours

## Understanding Results

After a test completes, check:

```bash
results/
  run-{timestamp}/
    {scenario}-metrics.json   # Raw metrics
    {scenario}-analysis.json  # Analysis report
    {scenario}-report.md      # Human-readable report
    SUMMARY.md                # Overall summary
```

### Key Metrics

- **P99 Latency**: Should be < 50ms (baseline)
- **Throughput**: Queries per second
- **Error Rate**: Should be < 0.01%
- **Availability**: Should be > 99.99%

### Performance Score

Each test gets a score from 0 to 100:
- 90+: Excellent
- 80-89: Good
- 70-79: Fair
- <70: Needs improvement
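
The bands above can be expressed as a simple lookup. A sketch (assumption: scores are plain numbers in 0-100; the band names come from the list above):

```javascript
// Map a 0-100 performance score onto the bands defined above.
function scoreBand(score) {
  if (score >= 90) return "Excellent";
  if (score >= 80) return "Good";
  if (score >= 70) return "Fair";
  return "Needs improvement";
}

console.log(scoreBand(92)); // "Excellent"
console.log(scoreBand(68)); // "Needs improvement"
```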
## Troubleshooting

### Connection Failed
```bash
# Test cluster connectivity
curl -v https://your-cluster.example.com/health
```

### k6 Errors
```bash
# Verify k6 installation
k6 version

# Reinstall if needed
brew reinstall k6  # macOS
```

### High Memory Usage
```bash
# Increase Node.js memory
export NODE_OPTIONS="--max-old-space-size=8192"
```

## Docker Usage

### Build Image
```bash
docker build -t ruvector-benchmark .
```

### Run Test
```bash
docker run \
  -e BASE_URL="https://your-cluster.example.com" \
  -v $(pwd)/results:/benchmarks/results \
  ruvector-benchmark run baseline_100m
```

## Next Steps

1. **Review README.md** for comprehensive documentation
2. **Explore scenarios** in `benchmark-scenarios.ts`
3. **Customize tests** for your workload
4. **Set up CI/CD** for continuous benchmarking

## Quick Command Reference

```bash
# List all scenarios
npm run list

# Run specific scenario
ts-node benchmark-runner.ts run <scenario-name>

# Run scenario group
ts-node benchmark-runner.ts group <group-name>

# View dashboard
npm run dashboard

# Clean results
npm run clean
```

## Available Scenarios

### Baseline Tests
- `baseline_100m` - Quick validation (45m)
- `baseline_500m` - Full baseline (3h 15m)

### Burst Tests
- `burst_10x` - 10x spike (20m)
- `burst_25x` - 25x spike (35m)
- `burst_50x` - 50x spike (50m)

### Workload Tests
- `read_heavy` - 95% reads (1h 50m)
- `write_heavy` - 70% writes (1h 50m)
- `balanced_workload` - 50/50 split (1h 50m)

### Failover Tests
- `regional_failover` - Single region failure (45m)
- `multi_region_failover` - Multiple region failure (55m)

### Real-World Tests
- `world_cup` - Sporting event simulation (3h)
- `black_friday` - E-commerce peak (14h)

### Scenario Groups
- `quick_validation` - Fast validation suite
- `standard_suite` - Standard test suite
- `stress_suite` - Stress testing
- `reliability_suite` - Failover tests
- `full_suite` - All scenarios

## Support

- **Documentation**: See README.md
- **Issues**: https://github.com/ruvnet/ruvector/issues
- **Slack**: https://ruvector.slack.com

---

**Ready to benchmark!** 🚀

Start with: `npm run test:quick`

vendor/ruvector/benchmarks/docs/README.md (vendored, new file, 665 lines)
@@ -0,0 +1,665 @@
# RuVector Benchmarking Suite
|
||||
|
||||
Comprehensive benchmarking tool for testing the globally distributed RuVector vector search system at scale (500M+ concurrent connections).
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Features](#features)
|
||||
- [Prerequisites](#prerequisites)
|
||||
- [Installation](#installation)
|
||||
- [Quick Start](#quick-start)
|
||||
- [Benchmark Scenarios](#benchmark-scenarios)
|
||||
- [Running Benchmarks](#running-benchmarks)
|
||||
- [Understanding Results](#understanding-results)
|
||||
- [Best Practices](#best-practices)
|
||||
- [Cost Estimation](#cost-estimation)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
- [Advanced Usage](#advanced-usage)
|
||||
|
||||
## Overview
|
||||
|
||||
This benchmarking suite provides enterprise-grade load testing capabilities for RuVector, supporting:
|
||||
|
||||
- **Massive Scale**: Test up to 25B concurrent connections
|
||||
- **Multi-Region**: Distributed load generation across 11 GCP regions
|
||||
- **Comprehensive Metrics**: Latency, throughput, errors, resource utilization, costs
|
||||
- **SLA Validation**: Automated checking against 99.99% availability, <50ms p99 latency targets
|
||||
- **Advanced Analysis**: Statistical analysis, bottleneck identification, recommendations
|
||||
|
||||
## Features
|
||||
|
||||
### Load Generation
|
||||
- Multi-protocol support (HTTP, HTTP/2, WebSocket, gRPC)
|
||||
- Realistic query patterns (uniform, hotspot, Zipfian, burst)
|
||||
- Configurable ramp-up/down rates
|
||||
- Connection lifecycle management
|
||||
- Geographic distribution
|
||||
|
||||
### Metrics Collection
|
||||
- Latency distribution (p50, p90, p95, p99, p99.9)
|
||||
- Throughput tracking (QPS, bandwidth)
|
||||
- Error analysis by type and region
|
||||
- Resource utilization (CPU, memory, network)
|
||||
- Cost per million queries
|
||||
- Regional performance comparison
|
||||
|
||||
### Analysis & Reporting
|
||||
- Statistical analysis with anomaly detection
|
||||
- SLA compliance checking
|
||||
- Bottleneck identification
|
||||
- Performance score calculation
|
||||
- Actionable recommendations
|
||||
- Interactive visualization dashboard
|
||||
- Markdown and JSON reports
|
||||
- CSV export for further analysis
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Required
|
||||
- **Node.js**: v18+ (for TypeScript execution)
|
||||
- **k6**: Latest version ([installation guide](https://k6.io/docs/getting-started/installation/))
|
||||
- **Access**: RuVector cluster endpoint
|
||||
|
||||
### Optional
|
||||
- **Claude Flow**: For hooks integration
|
||||
```bash
|
||||
npm install -g claude-flow@alpha
|
||||
```
|
||||
- **Docker**: For containerized execution
|
||||
- **GCP Account**: For multi-region load generation
|
||||
|
||||
## Installation
|
||||
|
||||
1. **Clone Repository**
|
||||
```bash
|
||||
cd /home/user/ruvector/benchmarks
|
||||
```
|
||||
|
||||
2. **Install Dependencies**
|
||||
```bash
|
||||
npm install -g typescript ts-node
|
||||
npm install k6 @types/k6
|
||||
```
|
||||
|
||||
3. **Verify Installation**
|
||||
```bash
|
||||
k6 version
|
||||
ts-node --version
|
||||
```
|
||||
|
||||
4. **Configure Environment**
|
||||
```bash
|
||||
export BASE_URL="https://your-ruvector-cluster.example.com"
|
||||
export PARALLEL=2 # Number of parallel scenarios
|
||||
```
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Run a Single Scenario
|
||||
|
||||
```bash
|
||||
# Quick validation (100M connections, 45 minutes)
|
||||
ts-node benchmark-runner.ts run baseline_100m
|
||||
|
||||
# Full baseline test (500M connections, 3+ hours)
|
||||
ts-node benchmark-runner.ts run baseline_500m
|
||||
|
||||
# Burst test (10x spike to 5B connections)
|
||||
ts-node benchmark-runner.ts run burst_10x
|
||||
```
|
||||
|
||||
### Run Scenario Groups
|
||||
|
||||
```bash
|
||||
# Quick validation suite (~1 hour)
|
||||
ts-node benchmark-runner.ts group quick_validation
|
||||
|
||||
# Standard test suite (~6 hours)
|
||||
ts-node benchmark-runner.ts group standard_suite
|
||||
|
||||
# Full stress testing suite (~10 hours)
|
||||
ts-node benchmark-runner.ts group stress_suite
|
||||
|
||||
# All scenarios (~48 hours)
|
||||
ts-node benchmark-runner.ts group full_suite
|
||||
```
|
||||
|
||||
### List Available Tests
|
||||
|
||||
```bash
|
||||
ts-node benchmark-runner.ts list
|
||||
```
|
||||
|
||||
## Benchmark Scenarios
|
||||
|
||||
### Baseline Tests
|
||||
|
||||
#### baseline_500m
|
||||
- **Description**: Steady-state operation with 500M concurrent connections
|
||||
- **Duration**: 3h 15m
|
||||
- **Target**: P99 < 50ms, 99.99% availability
|
||||
- **Use Case**: Production capacity validation
|
||||
|
||||
#### baseline_100m
|
||||
- **Description**: Smaller baseline for quick validation
|
||||
- **Duration**: 45m
|
||||
- **Target**: P99 < 50ms, 99.99% availability
|
||||
- **Use Case**: CI/CD integration, quick regression tests
|
||||
|
||||
### Burst Tests
|
||||
|
||||
#### burst_10x
|
||||
- **Description**: Sudden spike to 5B concurrent (10x baseline)
|
||||
- **Duration**: 20m
|
||||
- **Target**: P99 < 100ms, 99.9% availability
|
||||
- **Use Case**: Flash sale, viral event simulation
|
||||
|
||||
#### burst_25x
|
||||
- **Description**: Extreme spike to 12.5B concurrent (25x baseline)
|
||||
- **Duration**: 35m
|
||||
- **Target**: P99 < 150ms, 99.5% availability
|
||||
- **Use Case**: Major global event (Olympics, elections)
|
||||
|
||||
#### burst_50x
|
||||
- **Description**: Maximum spike to 25B concurrent (50x baseline)
|
||||
- **Duration**: 50m
|
||||
- **Target**: P99 < 200ms, 99% availability
|
||||
- **Use Case**: Stress testing absolute limits

### Failover Tests

#### regional_failover
- **Description**: Test recovery when one region fails
- **Duration**: 45m
- **Target**: <10% throughput degradation, <1% errors
- **Use Case**: Disaster recovery validation

#### multi_region_failover
- **Description**: Test recovery when multiple regions fail
- **Duration**: 55m
- **Target**: <20% throughput degradation, <2% errors
- **Use Case**: Multi-region outage preparation

### Workload Tests

#### read_heavy
- **Description**: 95% reads, 5% writes (typical production workload)
- **Duration**: 1h 50m
- **Target**: P99 < 50ms, 99.99% availability
- **Use Case**: Production simulation

#### write_heavy
- **Description**: 70% writes, 30% reads (batch indexing scenario)
- **Duration**: 1h 50m
- **Target**: P99 < 80ms, 99.95% availability
- **Use Case**: Bulk data ingestion

#### balanced_workload
- **Description**: 50% reads, 50% writes
- **Duration**: 1h 50m
- **Target**: P99 < 60ms, 99.98% availability
- **Use Case**: Mixed workload validation

### Real-World Scenarios

#### world_cup
- **Description**: Predictable spike with geographic concentration (Europe)
- **Duration**: 3h
- **Target**: P99 < 100ms during matches
- **Use Case**: Major sporting event

#### black_friday
- **Description**: Sustained high load with periodic spikes
- **Duration**: 14h
- **Target**: P99 < 80ms, 99.95% availability
- **Use Case**: E-commerce peak period

## Running Benchmarks

### Basic Usage

```bash
# Set environment variables
export BASE_URL="https://ruvector.example.com"
export REGION="us-east1"

# Run single test
ts-node benchmark-runner.ts run baseline_500m

# Run with custom config
BASE_URL="https://staging.example.com" \
PARALLEL=3 \
ts-node benchmark-runner.ts group standard_suite
```

### With Claude Flow Hooks

```bash
# Enable hooks (default)
export ENABLE_HOOKS=true

# Disable hooks
export ENABLE_HOOKS=false

ts-node benchmark-runner.ts run baseline_500m
```

Hooks will automatically:
- Execute `npx claude-flow@alpha hooks pre-task` before each test
- Store results in swarm memory
- Execute `npx claude-flow@alpha hooks post-task` after completion

### Multi-Region Execution

To distribute load across regions:

```bash
# Deploy load generators to GCP regions
for region in us-east1 us-west1 europe-west1 asia-east1; do
  gcloud compute instances create "k6-${region}" \
    --zone="${region}-a" \
    --machine-type="n2-standard-32" \
    --image-family="ubuntu-2004-lts" \
    --image-project="ubuntu-os-cloud" \
    --metadata-from-file=startup-script=setup-k6.sh
done

# Run distributed test
ts-node benchmark-runner.ts run baseline_500m
```

### Docker Execution

```bash
# Build container
docker build -t ruvector-benchmark .

# Run test
docker run \
  -e BASE_URL="https://ruvector.example.com" \
  -v $(pwd)/results:/results \
  ruvector-benchmark run baseline_500m
```

## Understanding Results

### Output Structure

```
results/
  run-{timestamp}/
    {scenario}-{timestamp}-raw.json       # Raw K6 metrics
    {scenario}-{timestamp}-metrics.json   # Processed metrics
    {scenario}-{timestamp}-metrics.csv    # CSV export
    {scenario}-{timestamp}-analysis.json  # Analysis report
    {scenario}-{timestamp}-report.md      # Markdown report
    SUMMARY.md                            # Multi-scenario summary
```

### Key Metrics

#### Latency
- **P50 (Median)**: 50% of requests faster than this
- **P90**: 90% of requests faster than this
- **P95**: 95% of requests faster than this
- **P99**: 99% of requests faster than this (SLA target)
- **P99.9**: 99.9% of requests faster than this

**Target**: P99 < 50ms for baseline, < 100ms for burst
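
The percentile figures above come from sorting raw latency samples. A nearest-rank sketch of the computation (illustrative only; k6 computes these metrics internally):

```typescript
// Nearest-rank percentile: the smallest sample such that at least p%
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

const latencies = [4, 6, 7, 9, 12, 15, 22, 31, 48, 95]; // ms
console.log(`P50=${percentile(latencies, 50)}ms`); // 12ms: half the samples are <= 12
console.log(`P99=${percentile(latencies, 99)}ms`); // 95ms: the slowest request dominates P99
```

This is why P99 is the SLA target: the median can look healthy while a long tail ruins the worst 1% of requests.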

#### Throughput
- **QPS**: Queries per second
- **Peak QPS**: Maximum sustained throughput
- **Average QPS**: Mean throughput over test duration

**Target**: 50M QPS for 500M baseline connections

#### Error Rate
- **Total Errors**: Count of failed requests
- **Error Rate %**: Percentage of requests that failed
- **By Type**: Breakdown (timeout, connection, server, client)
- **By Region**: Geographic distribution

**Target**: < 0.01% error rate (99.99% success)

#### Availability
- **Uptime %**: Percentage of time system was available
- **Downtime**: Total milliseconds of unavailability
- **MTBF**: Mean time between failures
- **MTTR**: Mean time to recovery

**Target**: 99.99% availability (52 minutes/year downtime)
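
The availability target translates directly into a downtime budget; a small sketch of the arithmetic behind the ~52 minutes quoted above (assuming a 365-day year):

```typescript
// Downtime budget implied by an availability target.
function downtimeBudgetMinutesPerYear(availabilityPct: number): number {
  const minutesPerYear = 365 * 24 * 60; // 525,600
  return ((100 - availabilityPct) / 100) * minutesPerYear;
}

// 99.99% leaves roughly 52.6 minutes of downtime per year;
// each additional "nine" shrinks the budget tenfold.
console.log(downtimeBudgetMinutesPerYear(99.99).toFixed(1)); // 52.6
```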

#### Resource Utilization
- **CPU %**: Average and peak CPU usage
- **Memory %**: Average and peak memory usage
- **Network**: Bandwidth, ingress/egress bytes
- **Per Region**: Resource usage by geographic location

**Alert Thresholds**: CPU > 80%, Memory > 85%

#### Cost
- **Total Cost**: Compute + network + storage
- **Cost Per Million**: Total cost per million queries served
- **Per Region**: Cost breakdown by location

**Target**: < $0.50 per million queries
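
Cost per million queries is a simple normalization of total spend; a sketch (illustrative helper, not taken from the analysis code):

```typescript
// Normalize a run's total cost against the < $0.50 per-million-queries target.
function costPerMillionQueries(totalCostUsd: number, totalQueries: number): number {
  return (totalCostUsd / totalQueries) * 1_000_000;
}

// e.g. a hypothetical $3,900 baseline run that served 10B queries:
const cpm = costPerMillionQueries(3_900, 10_000_000_000);
console.log(`$${cpm.toFixed(2)} per million queries`); // comfortably under the $0.50 target
```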

### Performance Score

Overall score (0-100) calculated from:
- **Performance** (35%): Latency and throughput
- **Reliability** (35%): Availability and error rate
- **Scalability** (20%): Resource utilization efficiency
- **Efficiency** (10%): Cost effectiveness

**Grades**:
- 90-100: Excellent
- 80-89: Good
- 70-79: Fair
- 60-69: Needs Improvement
- <60: Poor
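
Under the weights above, the overall score is a weighted sum of the four component scores. A sketch of the calculation (the field names are assumptions, not the analyzer's actual interface):

```typescript
// Component scores, each already normalized to 0-100.
interface ComponentScores {
  performance: number;
  reliability: number;
  scalability: number;
  efficiency: number;
}

// Weighted sum using the 35/35/20/10 split listed above.
function overallScore(s: ComponentScores): number {
  return (
    s.performance * 0.35 +
    s.reliability * 0.35 +
    s.scalability * 0.2 +
    s.efficiency * 0.1
  );
}

function grade(score: number): string {
  if (score >= 90) return "Excellent";
  if (score >= 80) return "Good";
  if (score >= 70) return "Fair";
  if (score >= 60) return "Needs Improvement";
  return "Poor";
}

const scores = { performance: 92, reliability: 88, scalability: 80, efficiency: 70 };
console.log(`${overallScore(scores).toFixed(0)}/100 (${grade(overallScore(scores))})`);
```

Performance and reliability dominate (70% combined), so a cheap but slow run cannot score well.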

### SLA Compliance

✅ **PASSED** if all criteria met:
- P99 latency < 50ms (baseline) or scenario target
- Availability >= 99.99%
- Error rate < 0.01%

❌ **FAILED** if any criterion violated
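
The gate is a strict conjunction: all three criteria must hold. A minimal sketch (field names assumed, not taken from the analyzer):

```typescript
// Metrics extracted from a completed run (hypothetical shape).
interface RunMetrics {
  p99LatencyMs: number;
  availabilityPct: number;
  errorRatePct: number;
}

// SLA gate mirroring the pass criteria above; the P99 target defaults to
// the 50ms baseline but can be overridden per scenario.
function slaPassed(m: RunMetrics, p99TargetMs = 50): boolean {
  return (
    m.p99LatencyMs < p99TargetMs &&
    m.availabilityPct >= 99.99 &&
    m.errorRatePct < 0.01
  );
}
```

Because the criteria are ANDed, a single violation (say, 99.9% availability with excellent latency) fails the whole run.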

### Analysis Report

Each test generates an analysis report with:

1. **Statistical Analysis**
   - Summary statistics
   - Distribution histograms
   - Time series charts
   - Anomaly detection

2. **SLA Compliance**
   - Pass/fail status
   - Violation details
   - Duration and severity

3. **Bottlenecks**
   - Identified constraints
   - Current vs. threshold values
   - Impact assessment
   - Recommendations

4. **Recommendations**
   - Prioritized action items
   - Implementation guidance
   - Estimated impact and cost

### Visualization Dashboard

Open `visualization-dashboard.html` in a browser to view:

- Real-time metrics
- Interactive charts
- Geographic heat maps
- Historical comparisons
- Cost analysis

## Best Practices

### Before Running Tests

1. **Baseline Environment**
   - Ensure the cluster is healthy
   - No active deployments or maintenance
   - Stable configuration

2. **Resource Allocation**
   - Sufficient load generator capacity
   - Network bandwidth provisioned
   - Monitoring systems ready

3. **Communication**
   - Notify the team of the upcoming test
   - Schedule during low-traffic periods
   - Have a rollback plan ready

### During Tests

1. **Monitoring**
   - Watch real-time metrics
   - Check for anomalies
   - Monitor costs

2. **Safety**
   - Start with smaller tests (baseline_100m)
   - Gradually increase load
   - Be ready to abort if issues are detected

3. **Documentation**
   - Note any unusual events
   - Document configuration changes
   - Record observations

### After Tests

1. **Analysis**
   - Review all metrics
   - Identify bottlenecks
   - Compare to previous runs

2. **Reporting**
   - Share results with the team
   - Document findings
   - Create action items

3. **Follow-Up**
   - Implement recommendations
   - Re-test after changes
   - Track improvements over time

### Test Frequency

- **Quick Validation**: Daily (CI/CD)
- **Standard Suite**: Weekly
- **Stress Testing**: Monthly
- **Full Suite**: Quarterly

## Cost Estimation

### Load Generation Costs

Per hour of testing:
- **Compute**: ~$1,000/hour (distributed load generators)
- **Network**: ~$200/hour (egress traffic)
- **Storage**: ~$10/hour (results storage)

**Total**: ~$1,200/hour

### Scenario Cost Estimates

| Scenario | Duration | Estimated Cost |
|----------|----------|----------------|
| baseline_100m | 45m | $900 |
| baseline_500m | 3h 15m | $3,900 |
| burst_10x | 20m | $400 |
| burst_25x | 35m | $700 |
| burst_50x | 50m | $1,000 |
| read_heavy | 1h 50m | $2,200 |
| world_cup | 3h | $3,600 |
| black_friday | 14h | $16,800 |
| **Full Suite** | ~48h | **~$57,600** |
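
The per-scenario figures above are simply duration multiplied by the ~$1,200/hour blended rate; a sketch of that estimate:

```typescript
// Estimated test cost at the blended ~$1,200/hour load-generation rate
// (compute + network + storage, per the breakdown above).
function estimateCostUsd(durationMinutes: number, ratePerHourUsd = 1_200): number {
  return (durationMinutes / 60) * ratePerHourUsd;
}

console.log(estimateCostUsd(45));          // baseline_100m (45m)  -> 900
console.log(estimateCostUsd(3 * 60 + 15)); // baseline_500m (3h15m) -> 3900
```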

### Cost Optimization

1. **Use Spot Instances**: 60-80% savings on load generators
2. **Regional Selection**: Test in fewer regions
3. **Shorter Duration**: Reduce the steady-state phase
4. **Parallel Execution**: Minimize total runtime

## Troubleshooting

### Common Issues

#### K6 Not Found
```bash
# Install k6
brew install k6      # macOS
sudo apt install k6  # Linux
choco install k6     # Windows
```

#### Connection Refused
```bash
# Check the cluster endpoint
curl -v https://your-ruvector-cluster.example.com/health

# Verify network connectivity
ping your-ruvector-cluster.example.com
```

#### Out of Memory
```bash
# Increase the Node.js memory limit
export NODE_OPTIONS="--max-old-space-size=8192"

# Use a smaller scenario
ts-node benchmark-runner.ts run baseline_100m
```

#### High Error Rate
- Check cluster health
- Verify capacity (not overloaded)
- Review network latency
- Check authentication/authorization

#### Slow Performance
- Insufficient load generator capacity
- Network bandwidth limitations
- Target cluster under-provisioned
- Configuration issues (connection limits, timeouts)

### Debug Mode

```bash
# Enable verbose logging
export DEBUG=true
export LOG_LEVEL=debug

ts-node benchmark-runner.ts run baseline_500m
```

### Support

For issues or questions:
- GitHub Issues: https://github.com/ruvnet/ruvector/issues
- Documentation: https://docs.ruvector.io
- Community: https://discord.gg/ruvector

## Advanced Usage

### Custom Scenarios

Create a custom scenario by adding an entry to the `SCENARIOS` map in `benchmark-scenarios.ts`:

```typescript
export const SCENARIOS = {
  // ...existing scenarios...
  my_custom_test: {
    name: 'My Custom Test',
    description: 'Custom workload pattern',
    config: {
      targetConnections: 1000000000,
      rampUpDuration: '15m',
      steadyStateDuration: '1h',
      rampDownDuration: '10m',
      queriesPerConnection: 100,
      queryInterval: '1000',
      protocol: 'http',
      vectorDimension: 768,
      queryPattern: 'uniform',
    },
    k6Options: {
      // K6 configuration
    },
    expectedMetrics: {
      p99Latency: 50,
      errorRate: 0.01,
      throughput: 100000000,
      availability: 99.99,
    },
    duration: '1h25m',
    tags: ['custom'],
  },
};
```

### Integration with CI/CD

```yaml
# .github/workflows/benchmark.yml
name: Benchmark
on:
  schedule:
    - cron: '0 0 * * 0' # Weekly
  workflow_dispatch:

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
      - name: Install k6
        run: |
          sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
          echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
          sudo apt-get update
          sudo apt-get install k6
      - name: Run benchmark
        env:
          BASE_URL: ${{ secrets.BASE_URL }}
        run: |
          cd benchmarks
          ts-node benchmark-runner.ts run baseline_100m
      - name: Upload results
        uses: actions/upload-artifact@v3
        with:
          name: benchmark-results
          path: benchmarks/results/
```

### Programmatic Usage

```typescript
import { BenchmarkRunner } from './benchmark-runner';

const runner = new BenchmarkRunner({
  baseUrl: 'https://ruvector.example.com',
  parallelScenarios: 2,
  enableHooks: true,
});

// Run single scenario
const run = await runner.runScenario('baseline_500m');
console.log(`Score: ${run.analysis?.score.overall}/100`);

// Run multiple scenarios
const results = await runner.runScenarios([
  'baseline_500m',
  'burst_10x',
  'read_heavy',
]);

// Check if all passed SLA
const allPassed = Array.from(results.values()).every(
  r => r.analysis?.slaCompliance.met
);
```

---

**Happy Benchmarking!** 🚀

For questions or contributions, please visit: https://github.com/ruvnet/ruvector