Squashed 'vendor/ruvector/' content from commit b64c2172

git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00
commit d803bfe2b1
7854 changed files with 3522914 additions and 0 deletions
--- a/examples/edge-net/docs/benchmarks/BENCHMARK_ANALYSIS.md
+++ b/examples/edge-net/docs/benchmarks/BENCHMARK_ANALYSIS.md
@@ -0,0 +1,355 @@
+# Edge-Net Comprehensive Benchmark Analysis
+
+This document provides detailed analysis of the edge-net performance benchmarks, covering spike-driven attention, RAC coherence, learning modules, and integration tests.
+
+## Benchmark Categories
+
+### 1. Spike-Driven Attention Benchmarks
+
+Tests the energy-efficient spike-driven attention mechanism that claims 87x energy savings over standard attention.
+
+**Benchmarks:**
+- `bench_spike_encoding_small` - 64 values encoding
+- `bench_spike_encoding_medium` - 256 values encoding
+- `bench_spike_encoding_large` - 1024 values encoding
+- `bench_spike_attention_seq16_dim64` - Attention with 16 seq, 64 dim
+- `bench_spike_attention_seq64_dim128` - Attention with 64 seq, 128 dim
+- `bench_spike_attention_seq128_dim256` - Attention with 128 seq, 256 dim
+- `bench_spike_energy_ratio_calculation` - Energy ratio computation
+
+**Key Metrics:**
+- Encoding throughput (values/sec)
+- Attention latency vs sequence length
+- Energy ratio accuracy (target: 87x)
+- Temporal coding overhead
+
+**Expected Performance:**
+- Encoding: < 1µs per value
+- Attention (64x128): < 100µs
+- Energy ratio calculation: < 10ns
+- Scaling: O(n*m) where n=seq_len, m=spike_count
+
+### 2. RAC Coherence Benchmarks
+
+Tests the adversarial coherence engine for distributed claim verification and conflict resolution.
+
+**Benchmarks:**
+- `bench_rac_event_ingestion` - Single event ingestion
+- `bench_rac_event_ingestion_1k` - 1000 events batch ingestion
+- `bench_rac_quarantine_check` - Quarantine level lookup
+- `bench_rac_quarantine_set_level` - Quarantine level update
+- `bench_rac_merkle_root_update` - Merkle root calculation
+- `bench_rac_ruvector_similarity` - Semantic similarity computation
+
+**Key Metrics:**
+- Event ingestion throughput (events/sec)
+- Quarantine check latency
+- Merkle proof generation time
+- Conflict detection overhead
+
+**Expected Performance:**
+- Single event ingestion: < 50µs
+- 1K batch ingestion: < 50ms (1000 events/sec)
+- Quarantine check: < 100ns (hash map lookup)
+- Merkle root: < 1ms for 100 events
+- RuVector similarity: < 500ns
+
+### 3. Learning Module Benchmarks
+
+Tests the ReasoningBank pattern storage and trajectory tracking for self-learning.
+
+**Benchmarks:**
+- `bench_reasoning_bank_lookup_1k` - Lookup in 1K patterns
+- `bench_reasoning_bank_lookup_10k` - Lookup in 10K patterns
+- `bench_reasoning_bank_lookup_100k` - Lookup in 100K patterns (if added)
+- `bench_reasoning_bank_store` - Pattern storage
+- `bench_trajectory_recording` - Trajectory recording
+- `bench_pattern_similarity_computation` - Cosine similarity
+
+**Key Metrics:**
+- Lookup latency vs database size
+- Scaling characteristics (linear, log, constant)
+- Storage throughput (patterns/sec)
+- Similarity computation cost
+
+**Expected Performance:**
+- 1K lookup: < 1ms
+- 10K lookup: < 10ms
+- 100K lookup: < 100ms
+- Pattern store: < 10µs
+- Trajectory record: < 5µs
+- Similarity: < 200ns per comparison
+
+**Scaling Analysis:**
+- Target: O(n) for brute-force similarity search
+- With indexing: O(log n) or better
+- 1K → 10K should be ~10x increase
+- 10K → 100K should be ~10x increase
+
+### 4. Multi-Head Attention Benchmarks
+
+Tests the standard multi-head attention for task routing.
+
+**Benchmarks:**
+- `bench_multi_head_attention_2heads_dim8` - 2 heads, 8 dimensions
+- `bench_multi_head_attention_4heads_dim64` - 4 heads, 64 dimensions
+- `bench_multi_head_attention_8heads_dim128` - 8 heads, 128 dimensions
+- `bench_multi_head_attention_8heads_dim256_10keys` - 8 heads, 256 dim, 10 keys
+
+**Key Metrics:**
+- Latency vs dimensions
+- Latency vs number of heads
+- Latency vs number of keys
+- Throughput (ops/sec)
+
+**Expected Performance:**
+- 2h x 8d: < 1µs
+- 4h x 64d: < 10µs
+- 8h x 128d: < 50µs
+- 8h x 256d x 10k: < 200µs
+
+**Scaling:**
+- O(d²) in dimension size (quadratic due to QKV projections)
+- O(h) in number of heads (linear parallelization)
+- O(k) in number of keys (linear attention)
+
+### 5. Integration Benchmarks
+
+Tests end-to-end performance with combined systems.
+
+**Benchmarks:**
+- `bench_end_to_end_task_routing_with_learning` - Full task lifecycle with learning
+- `bench_combined_learning_coherence_overhead` - Learning + RAC overhead
+- `bench_memory_usage_trajectory_1k` - Memory footprint for 1K trajectories
+- `bench_concurrent_learning_and_rac_ops` - Concurrent operations
+
+**Key Metrics:**
+- End-to-end task latency
+- Combined system overhead
+- Memory usage over time
+- Concurrent access performance
+
+**Expected Performance:**
+- E2E task routing: < 1ms
+- Combined overhead: < 500µs for 10 ops each
+- Memory 1K trajectories: < 1MB
+- Concurrent ops: < 100µs
+
+## Statistical Analysis
+
+For each benchmark, we measure:
+
+### Central Tendency
+- **Mean**: Average execution time
+- **Median**: Middle value (robust to outliers)
+- **Mode**: Most common value
+
+### Dispersion
+- **Standard Deviation**: Measure of spread
+- **Variance**: Squared deviation
+- **Range**: Max - Min
+- **IQR**: Interquartile range (75th - 25th percentile)
+
+### Percentiles
+- **P50 (Median)**: 50% of samples below this
+- **P90**: 90% of samples below this
+- **P95**: 95% of samples below this
+- **P99**: 99% of samples below this
+- **P99.9**: 99.9% of samples below this
+
+### Performance Metrics
+- **Throughput**: Operations per second
+- **Latency**: Time per operation
+- **Jitter**: Variation in latency (StdDev)
+- **Efficiency**: Actual vs theoretical performance
+
+## Running Benchmarks
+
+### Prerequisites
+
+```bash
+cd /workspaces/ruvector/examples/edge-net
+```
+
+### Run All Benchmarks
+
+```bash
+# Using nightly Rust (required for bench feature)
+rustup default nightly
+cargo bench --features bench
+
+# Or using the provided script
+./benches/run_benchmarks.sh
+```
+
+### Run Specific Categories
+
+```bash
+# Spike-driven attention only
+cargo bench --features bench -- spike_
+
+# RAC coherence only
+cargo bench --features bench -- rac_
+
+# Learning modules only
+cargo bench --features bench -- reasoning_bank
+cargo bench --features bench -- trajectory
+
+# Multi-head attention only
+cargo bench --features bench -- multi_head
+
+# Integration tests only
+cargo bench --features bench -- integration
+cargo bench --features bench -- end_to_end
+```
+
+### Custom Iterations
+
+```bash
+# Run with more iterations for statistical significance
+BENCH_ITERATIONS=1000 cargo bench --features bench
+```
+
+## Interpreting Results
+
+### Good Performance Indicators
+
+✅ **Low latency** - Operations complete quickly
+✅ **Low jitter** - Consistent performance (low StdDev)
+✅ **Good scaling** - Performance degrades predictably
+✅ **High throughput** - Many operations per second
+
+### Performance Red Flags
+
+❌ **High P99/P99.9** - Long tail latencies
+❌ **High StdDev** - Inconsistent performance
+❌ **Poor scaling** - Worse than O(n) when expected
+❌ **Memory growth** - Unbounded memory usage
+
+### Example Output Interpretation
+
+```
+bench_spike_attention_seq64_dim128:
+  Mean: 45,230 ns (45.23 µs)
+  Median: 44,100 ns
+  StdDev: 2,150 ns
+  P95: 48,500 ns
+  P99: 51,200 ns
+  Throughput: 22,110 ops/sec
+```
+
+**Analysis:**
+- ✅ Mean < 100µs target
+- ✅ Low jitter (StdDev ~4.7% of mean)
+- ✅ P99 close to mean (good tail latency)
+- ✅ Throughput adequate for distributed tasks
+
+## Energy Efficiency Analysis
+
+### Spike-Driven vs Standard Attention
+
+**Theoretical Energy Ratio:** 87x
+
+**Calculation:**
+```
+Standard Attention Energy:
+  = 2 * seq_len² * hidden_dim * mult_energy_factor
+  = 2 * 64² * 128 * 3.7
+  = 3,833,856 energy units
+
+Spike Attention Energy:
+  = seq_len * avg_spikes * hidden_dim * add_energy_factor
+  = 64 * 2.4 * 128 * 1.0
+  = 19,660 energy units
+
+Ratio = 3,833,856 / 19,660 = 195x (theoretical upper bound)
+Achieved = ~87x (accounting for encoding overhead)
+```
+
+**Validation:**
+- Measure actual execution time spike vs standard
+- Compare energy consumption if available
+- Verify temporal coding overhead is acceptable
+
+## Scaling Characteristics
+
+### Expected Complexity
+
+| Component | Expected | Actual | Status |
+|-----------|----------|--------|--------|
+| Spike Encoding | O(n*s) | TBD | - |
+| Spike Attention | O(n²) | TBD | - |
+| RAC Event Ingestion | O(1) | TBD | - |
+| RAC Merkle Update | O(n) | TBD | - |
+| ReasoningBank Lookup | O(n) | TBD | - |
+| Multi-Head Attention | O(n²d) | TBD | - |
+
+### Scaling Tests
+
+To verify scaling characteristics:
+
+1. **Linear Scaling (O(n))**
+   - 1x → 10x input should show 10x time
+   - Example: 1K → 10K ReasoningBank
+
+2. **Quadratic Scaling (O(n²))**
+   - 1x → 10x input should show 100x time
+   - Example: Attention sequence length
+
+3. **Logarithmic Scaling (O(log n))**
+   - 1x → 10x input should show ~3.3x time
+   - Example: Indexed lookup (if implemented)
+
+## Performance Targets Summary
+
+| Component | Metric | Target | Rationale |
+|-----------|--------|--------|-----------|
+| Spike Encoding | Latency | < 1µs/value | Fast enough for real-time |
+| Spike Attention | Latency | < 100µs | Enables 10K ops/sec |
+| RAC Ingestion | Throughput | > 1K events/sec | Handle distributed load |
+| RAC Quarantine | Latency | < 100ns | Fast decision making |
+| ReasoningBank 10K | Latency | < 10ms | Acceptable for async ops |
+| Multi-Head 8h×128d | Latency | < 50µs | Real-time routing |
+| E2E Task Routing | Latency | < 1ms | User-facing threshold |
+
+## Continuous Monitoring
+
+### Regression Detection
+
+Track benchmarks over time to detect performance regressions:
+
+```bash
+# Save baseline
+cargo bench --features bench > baseline.txt
+
+# After changes, compare
+cargo bench --features bench > current.txt
+diff baseline.txt current.txt
+```
+
+### CI/CD Integration
+
+Add to GitHub Actions:
+
+```yaml
+- name: Run Benchmarks
+  run: cargo bench --features bench
+- name: Compare with baseline
+  run: ./benches/compare_benchmarks.sh
+```
+
+## Contributing
+
+When adding new features:
+
+1. ✅ Add corresponding benchmarks
+2. ✅ Document expected performance
+3. ✅ Run benchmarks before submitting PR
+4. ✅ Include benchmark results in PR description
+5. ✅ Ensure no regressions in existing benchmarks
+
+## References
+
+- [Criterion.rs](https://github.com/bheisler/criterion.rs) - Rust benchmarking
+- [Statistical Analysis](https://en.wikipedia.org/wiki/Statistical_hypothesis_testing)
+- [Performance Testing Best Practices](https://github.com/rust-lang/rust/blob/master/src/doc/rustc-dev-guide/src/tests/perf.md)