Edge-Net Comprehensive Benchmark Suite

Overview

This directory contains the benchmark suite for the edge-net distributed compute intelligence network. The suite covers all critical performance paths: spike-driven attention, RAC coherence, learning modules, and end-to-end integration scenarios.

Quick Start

# Navigate to edge-net directory
cd /workspaces/ruvector/examples/edge-net

# Install nightly Rust (required for bench feature)
rustup default nightly

# Run all benchmarks
cargo bench --features bench

# Or use the provided script
./benches/run_benchmarks.sh

Benchmark Structure

Total Benchmarks: 47

1. Spike-Driven Attention (7 benchmarks)

  • Energy-efficient attention with 87x claimed savings
  • Tests encoding, attention computation, and energy ratio
  • Located in src/bench.rs lines 522-596

2. RAC Coherence Engine (6 benchmarks)

  • Adversarial coherence for distributed claims
  • Tests event ingestion, quarantine, Merkle proofs
  • Located in src/bench.rs lines 598-747

3. Learning Modules (5 benchmarks)

  • ReasoningBank pattern storage and lookup
  • Tests trajectory tracking and similarity computation
  • Located in src/bench.rs lines 749-865

4. Multi-Head Attention (4 benchmarks)

  • Standard attention for task routing
  • Tests scaling with dimensions and heads
  • Located in src/bench.rs lines 867-925

5. Integration (4 benchmarks)

  • End-to-end performance tests
  • Tests combined system overhead
  • Located in src/bench.rs lines 927-1105

6. Legacy Benchmarks (21 benchmarks)

  • Credit operations, QDAG, tasks, security
  • Network topology, economic engine
  • Located in src/bench.rs lines 1-520

Running Benchmarks

All Benchmarks

cargo bench --features bench

By Category

# Spike-driven attention
cargo bench --features bench -- spike_

# RAC coherence
cargo bench --features bench -- rac_

# Learning modules
cargo bench --features bench -- reasoning_bank
cargo bench --features bench -- trajectory
cargo bench --features bench -- pattern_similarity

# Multi-head attention
cargo bench --features bench -- multi_head

# Integration
cargo bench --features bench -- integration
cargo bench --features bench -- end_to_end
cargo bench --features bench -- concurrent

Specific Benchmark

# Run a single benchmark
cargo bench --features bench -- bench_spike_attention_seq64_dim128

Custom Iterations

# Run with more iterations for statistical significance
BENCH_ITERATIONS=1000 cargo bench --features bench

Output Format

Each benchmark produces output like:

test bench_spike_attention_seq64_dim128 ... bench:      45,230 ns/iter (+/- 2,150)

Interpretation:

  • 45,230 ns/iter: Mean execution time (45.23 µs)
  • (+/- 2,150): Standard deviation (±2.15 µs, 4.7% jitter)

Derived Metrics:

  • Throughput: 1,000,000,000 / 45,230 = 22,110 ops/sec
  • P99 (approx): Mean + 3*StdDev = 51,680 ns
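
The derived metrics can be computed mechanically from the reported mean and standard deviation. A minimal sketch (note the 3-sigma P99 is a rough normal-distribution approximation, not a measured percentile):

```rust
/// Operations per second from a mean latency in nanoseconds per iteration.
fn throughput_ops_per_sec(mean_ns: f64) -> f64 {
    1_000_000_000.0 / mean_ns
}

/// Rough P99 estimate assuming roughly normal latency: mean + 3 * stddev.
fn p99_approx_ns(mean_ns: f64, stddev_ns: f64) -> f64 {
    mean_ns + 3.0 * stddev_ns
}

fn main() {
    let (mean, stddev) = (45_230.0, 2_150.0);
    println!("throughput: {:.0} ops/sec", throughput_ops_per_sec(mean)); // ~22,110 (rounded)
    println!("p99 approx: {:.0} ns", p99_approx_ns(mean, stddev)); // 51,680
}
```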

Performance Targets

Benchmark                        Target         Rationale
Spike Encoding                   < 1 µs/value   Real-time encoding
Spike Attention (64×128)         < 100 µs       10K ops/sec throughput
RAC Event Ingestion              < 50 µs        20K events/sec
RAC Quarantine Check             < 100 ns       Hot path operation
ReasoningBank Lookup (10K)       < 10 ms        Acceptable async delay
Multi-Head Attention (8h×128d)   < 50 µs        Real-time routing
E2E Task Routing                 < 1 ms         User-facing threshold

Key Metrics

Spike-Driven Attention

Energy Efficiency Calculation:

Standard Attention Energy = 2 * seq² * dim * 3.7 pJ
Spike Attention Energy = seq * spikes * dim * 1.0 pJ

For seq=64, dim=256, spikes=2.4:
  Standard: 7,759,462 pJ
  Spike: 39,322 pJ
  Ratio: 197.3x (theoretical)
  Achieved: ~87x (with encoding overhead)
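
A quick sanity check of the energy model: dim cancels out of the ratio, which reduces to 2 * seq * 3.7 / spikes. A sketch under the stated constants (3.7 pJ per MAC, 1.0 pJ per accumulate):

```rust
/// Standard attention energy per the model above: 2 * seq^2 * dim * 3.7 pJ.
fn standard_energy_pj(seq: f64, dim: f64) -> f64 {
    2.0 * seq * seq * dim * 3.7
}

/// Spike attention energy per the model above: seq * spikes * dim * 1.0 pJ.
fn spike_energy_pj(seq: f64, spikes: f64, dim: f64) -> f64 {
    seq * spikes * dim * 1.0
}

fn main() {
    let (seq, dim, spikes) = (64.0, 256.0, 2.4);
    let ratio = standard_energy_pj(seq, dim) / spike_energy_pj(seq, spikes, dim);
    // dim cancels, so ratio = 2 * seq * 3.7 / spikes, roughly 197x theoretical.
    println!("theoretical ratio: {:.1}x", ratio);
}
```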

Validation:

  • Energy ratio should be 70x - 100x
  • Encoding overhead should be < 60% of total time
  • Attention should scale O(n*m) with n=seq_len, m=spike_count

RAC Coherence Performance

Expected Throughput:

  • Single event: 1-2M events/sec
  • Batch 1K events: 1.2K-1.6K batches/sec
  • Quarantine check: 10M-20M checks/sec
  • Merkle update: 100K-200K updates/sec

Scaling:

  • Event ingestion: O(1) amortized
  • Merkle update: O(log n) per event
  • Quarantine: O(1) hash lookup
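
The O(1) quarantine check amounts to a hash-set membership test over claim IDs. A minimal sketch; the `ClaimId` type and `Quarantine` struct here are illustrative assumptions, not edge-net's actual types:

```rust
use std::collections::HashSet;

// Illustrative: a claim identifier modeled as a 32-byte hash.
type ClaimId = [u8; 32];

struct Quarantine {
    quarantined: HashSet<ClaimId>,
}

impl Quarantine {
    fn new() -> Self {
        Self { quarantined: HashSet::new() }
    }

    fn quarantine(&mut self, id: ClaimId) {
        self.quarantined.insert(id);
    }

    /// O(1) average-case hash lookup: the hot-path operation benchmarked above.
    fn is_quarantined(&self, id: &ClaimId) -> bool {
        self.quarantined.contains(id)
    }
}

fn main() {
    let mut q = Quarantine::new();
    q.quarantine([1u8; 32]);
    assert!(q.is_quarantined(&[1u8; 32]));
    assert!(!q.is_quarantined(&[2u8; 32]));
}
```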

Learning Module Scaling

ReasoningBank Lookup:

Without indexing (current):

1K patterns: ~200 µs (linear scan)
10K patterns: ~2 ms (10x scaling)
100K patterns: ~20 ms (10x scaling)

With ANN indexing (future optimization):

1K patterns: ~2 µs (log scaling)
10K patterns: ~2.6 µs (1.3x scaling)
100K patterns: ~3.2 µs (1.2x scaling)

Validation:

  • 1K → 10K should scale ~10x (linear)
  • Store operation < 10 µs
  • Similarity computation < 300 ns
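
The linear-scan lookup behind the ~10x-per-decade scaling above can be sketched as a cosine-similarity argmax over stored pattern embeddings (names here are illustrative, not ReasoningBank's actual API):

```rust
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// O(n) scan over all stored patterns -- the source of the linear lookup scaling.
fn best_match(query: &[f32], patterns: &[Vec<f32>]) -> Option<(usize, f32)> {
    patterns
        .iter()
        .enumerate()
        .map(|(i, p)| (i, cosine_similarity(query, p)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
}

fn main() {
    let patterns = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let (idx, score) = best_match(&[1.0, 0.1], &patterns).unwrap();
    println!("best pattern {idx} (similarity {score:.3})");
}
```

An ANN index (HNSW, etc.) replaces this full scan with a graph or tree traversal, which is where the projected log-scaling numbers come from.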

Multi-Head Attention Complexity

Time Complexity: O(h * d * (d + k))

  • h = number of heads
  • d = dimension per head
  • k = number of keys

Scaling Verification:

  • 2x dimensions → 4x time (quadratic)
  • 2x heads → 2x time (linear)
  • 2x keys → 2x time (linear)
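
These ratios fall out of the O(h * d * (d + k)) cost model directly; a quick sketch. Note the 4x-for-dims and 2x-for-keys figures are asymptotic, holding once the doubled term dominates:

```rust
/// Cost model for multi-head attention: h heads, d dims per head, k keys.
fn attention_cost(h: u64, d: u64, k: u64) -> u64 {
    h * d * (d + k)
}

fn main() {
    let (h, d, k) = (8, 128, 64);
    let base = attention_cost(h, d, k) as f64;
    // Doubling h is exactly 2x; doubling d tends toward 4x as the d^2 term
    // dominates; doubling k tends toward 2x only once k dominates d.
    println!("2x heads: {:.2}x", attention_cost(2 * h, d, k) as f64 / base);
    println!("2x dims:  {:.2}x", attention_cost(h, 2 * d, k) as f64 / base);
    println!("2x keys:  {:.2}x", attention_cost(h, d, 2 * k) as f64 / base);
}
```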

Benchmark Analysis Tools

benchmark_runner.rs

Provides statistical analysis and reporting:

use benchmark_runner::BenchmarkSuite;

let mut suite = BenchmarkSuite::new();
suite.run_benchmark("test", 100, || {
    // benchmark code
});

println!("{}", suite.generate_report());

Features:

  • Mean, median, std dev, percentiles
  • Throughput calculation
  • Comparative analysis
  • Pass/fail against targets
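
The suite's statistics are straightforward to reproduce; a minimal sketch of mean, standard deviation, and percentile over nanosecond samples (nearest-rank percentile, a common simple choice — the runner's exact method may differ):

```rust
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Population standard deviation.
fn std_dev(samples: &[f64]) -> f64 {
    let m = mean(samples);
    (samples.iter().map(|x| (x - m).powi(2)).sum::<f64>() / samples.len() as f64).sqrt()
}

/// Nearest-rank percentile over a sorted copy of the samples (p in 0..=100).
fn percentile(samples: &[f64], p: f64) -> f64 {
    let mut sorted = samples.to_vec();
    sorted.sort_by(f64::total_cmp);
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.saturating_sub(1)]
}

fn main() {
    let samples: Vec<f64> = (1..=100).map(|i| i as f64).collect();
    println!("mean = {}", mean(&samples)); // 50.5
    println!("p99  = {}", percentile(&samples, 99.0)); // 99
}
```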

run_benchmarks.sh

Automated benchmark execution:

./benches/run_benchmarks.sh

Output:

  • Saves results to benchmark_results/
  • Generates timestamped reports
  • Runs all benchmark categories
  • Produces text logs for analysis

Documentation

BENCHMARK_ANALYSIS.md

Comprehensive guide covering:

  • Benchmark categories and purpose
  • Statistical analysis methodology
  • Performance targets and rationale
  • Scaling characteristics
  • Optimization opportunities

BENCHMARK_SUMMARY.md

Quick reference with:

  • 47 benchmark breakdown
  • Expected results summary
  • Key performance indicators
  • Running instructions

BENCHMARK_RESULTS.md

Theoretical analysis including:

  • Energy efficiency calculations
  • Complexity analysis
  • Performance budgets
  • Bottleneck identification
  • Optimization recommendations

Interpreting Results

Good Performance Indicators

  • Low Mean Latency - Fast execution
  • Low Jitter - Consistent performance (StdDev < 10% of mean)
  • Expected Scaling - Matches theoretical complexity
  • High Throughput - Many ops/sec

Performance Red Flags

  • High P99/P99.9 - Long tail latencies
  • High StdDev - Inconsistent performance (>20% jitter)
  • Poor Scaling - Worse than expected complexity
  • Memory Growth - Unbounded memory usage

Example Analysis

bench_spike_attention_seq64_dim128:
  Mean: 45,230 ns (45.23 µs)
  StdDev: 2,150 ns (4.7%)
  Throughput: 22,110 ops/sec

✅ Below 100µs target
✅ Low jitter (<5%)
✅ Adequate throughput

Optimization Opportunities

Based on theoretical analysis:

High Priority

  1. ANN Indexing for ReasoningBank

    • Expected: 100x speedup for 10K+ patterns
    • Libraries: FAISS, Annoy, HNSW
    • Effort: Medium (1-2 weeks)
  2. SIMD for Spike Encoding

    • Expected: 4-8x speedup
    • Use: std::simd or intrinsics
    • Effort: Low (few days)
  3. Parallel Merkle Updates

    • Expected: 4-8x speedup on multi-core
    • Use: Rayon parallel iterators
    • Effort: Low (few days)

Medium Priority

  1. Flash Attention

    • Expected: 2-3x speedup
    • Complexity: High
    • Effort: High (2-3 weeks)
  2. Bloom Filters for Quarantine

    • Expected: 2x speedup for negative lookups
    • Complexity: Low
    • Effort: Low (few days)
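
The Bloom-filter idea can be sketched with a bit vector and double hashing (sizes and hash choices here are illustrative). False positives are possible but false negatives are not, which is what makes it safe as a pre-filter for negative quarantine lookups:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct BloomFilter {
    bits: Vec<bool>,
    hashes: u64,
}

impl BloomFilter {
    fn new(bits: usize, hashes: u64) -> Self {
        Self { bits: vec![false; bits], hashes }
    }

    // Double hashing: derive k bit indices from two base hash values.
    fn indices<T: Hash>(&self, item: &T) -> Vec<usize> {
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        let h1 = hasher.finish();
        let h2 = h1.rotate_left(31) | 1; // odd step so the probe covers the table
        let len = self.bits.len() as u64;
        (0..self.hashes)
            .map(|i| (h1.wrapping_add(i.wrapping_mul(h2)) % len) as usize)
            .collect()
    }

    fn insert<T: Hash>(&mut self, item: &T) {
        for i in self.indices(item) {
            self.bits[i] = true;
        }
    }

    /// False means definitely absent; true means possibly present.
    fn may_contain<T: Hash>(&self, item: &T) -> bool {
        self.indices(item).iter().all(|&i| self.bits[i])
    }
}

fn main() {
    let mut filter = BloomFilter::new(1024, 3);
    filter.insert(&"claim-42");
    assert!(filter.may_contain(&"claim-42"));
    // Most non-inserted keys are rejected here without touching the full set.
    println!("unknown key possibly present: {}", filter.may_contain(&"claim-99"));
}
```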

CI/CD Integration

Regression Detection

name: Benchmarks
on: [push, pull_request]
jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: nightly
          override: true
      - run: cargo bench --features bench
      - run: ./benches/compare_benchmarks.sh

Performance Budgets

Assert maximum latencies:

#[bench]
fn bench_critical(b: &mut Bencher) {
    b.iter(|| {
        // code under test
    });

    // Bencher::iter returns no statistics, so enforce the budget manually.
    let start = std::time::Instant::now();
    for _ in 0..1_000 { /* code under test */ }
    let mean = start.elapsed() / 1_000;
    assert!(mean < std::time::Duration::from_micros(100));
}

Troubleshooting

Benchmark Not Running

# Ensure nightly Rust
rustup default nightly

# Check feature is enabled
cargo bench --features bench -- --list

# Verify dependencies
cargo check --features bench

Inconsistent Results

# Increase iterations
BENCH_ITERATIONS=1000 cargo bench --features bench

# Reduce system noise
sudo systemctl stop cron
sudo systemctl stop atd

# Pin to CPU core
taskset -c 0 cargo bench --features bench

High Variance

  • Close other applications
  • Disable CPU frequency scaling
  • Run on dedicated benchmark machine
  • Increase warmup iterations

Contributing

When adding benchmarks:

  1. Add to appropriate category in src/bench.rs
  2. Document expected performance
  3. Update this README
  4. Run full suite before PR
  5. Include results in PR description

License

MIT - See LICENSE file in repository root.