Files
wifi-densepose/examples/edge-net/docs/performance/performance-analysis.md
ruv d803bfe2b1 Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00

558 lines
15 KiB
Markdown

# Edge-Net Performance Analysis
## Executive Summary
This document provides a comprehensive analysis of performance bottlenecks in the edge-net system, identifying O(n) or worse operations and providing optimization recommendations.
## Critical Performance Bottlenecks
### 1. Credit Ledger Operations (O(n) issues)
#### `WasmCreditLedger::balance()` - **HIGH PRIORITY**
**Location**: `src/credits/mod.rs:124-132`
```rust
pub fn balance(&self) -> u64 {
let total_earned: u64 = self.earned.values().sum();
let total_spent: u64 = self.spent.values()
.map(|(pos, neg)| pos.saturating_sub(*neg))
.sum();
total_earned.saturating_sub(total_spent).saturating_sub(self.staked)
}
```
**Problem**: O(n) where n = number of transactions. Called frequently, iterates all transactions.
**Impact**:
- Called on every credit/deduct operation
- Performance degrades linearly with transaction history
- 1000 transactions = 1000 operations per balance check
**Optimization**:
```rust
// Add cached balance field
local_balance: u64,
// Update on credit/deduct instead of recalculating
pub fn credit(&mut self, amount: u64, reason: &str) -> Result<(), JsValue> {
// ... existing code ...
self.local_balance += amount; // O(1)
Ok(())
}
pub fn balance(&self) -> u64 {
self.local_balance // O(1)
}
```
**Estimated Improvement**: 1000x faster for 1000 transactions
---
#### `WasmCreditLedger::merge()` - **MEDIUM PRIORITY**
**Location**: `src/credits/mod.rs:238-265`
**Problem**: O(m) where m = size of remote ledger state. CRDT merge iterates all entries.
**Impact**:
- Network sync operations
- Large ledgers cause sync delays
**Optimization**:
- Delta-based sync (send only changes since last sync)
- Bloom filters for quick diff detection
- Batch merging with lazy evaluation
---
### 2. QDAG Transaction Processing (O(n²) risk)
#### Tip Selection - **HIGH PRIORITY**
**Location**: `src/credits/qdag.rs:358-366`
```rust
fn select_tips(&self, count: usize) -> Result<Vec<[u8; 32]>, JsValue> {
if self.tips.is_empty() {
return Ok(vec![]);
}
// Simple random selection (would use weighted selection in production)
let tips: Vec<[u8; 32]> = self.tips.iter().copied().take(count).collect();
Ok(tips)
}
```
**Problem**:
- Currently O(1) but marked for weighted selection
- Weighted selection would be O(n) where n = number of tips
- Tips grow with transaction volume
**Impact**: Transaction creation slows as network grows
**Optimization**:
```rust
// Maintain weighted tip index
struct TipIndex {
tips: Vec<[u8; 32]>,
weights: Vec<f32>,
cumulative: Vec<f32>, // Cumulative distribution
}
// Binary search for O(log n) weighted selection
fn select_weighted(&self, count: usize) -> Vec<[u8; 32]> {
// Binary search on cumulative distribution
// O(count * log n) instead of O(count * n)
}
```
**Estimated Improvement**: 100x faster for 1000 tips
---
#### Transaction Validation Chain Walk - **MEDIUM PRIORITY**
**Location**: `src/credits/qdag.rs:248-301`
**Problem**: Recursive validation of parent transactions can create O(depth) traversal
**Impact**: Deep DAG chains slow validation
**Optimization**:
- Checkpoint system (validate only since last checkpoint)
- Parallel validation using rayon
- Validation caching
---
### 3. Security System Q-Learning (O(n) growth)
#### Attack Pattern Detection - **MEDIUM PRIORITY**
**Location**: `src/security/mod.rs:517-530`
```rust
pub fn detect_attack(&self, features: &[f32]) -> f32 {
let mut max_match = 0.0f32;
for pattern in &self.attack_patterns {
let similarity = self.pattern_similarity(&pattern.fingerprint, features);
let threat_score = similarity * pattern.severity * pattern.confidence;
max_match = max_match.max(threat_score);
}
max_match
}
```
**Problem**: O(n*m) where n = patterns, m = feature dimensions. Linear scan on every request.
**Impact**:
- Called on every incoming request
- 1000 patterns = 1000 similarity calculations per request
**Optimization**:
```rust
// Use KD-Tree or Ball Tree for O(log n) similarity search
use kdtree::KdTree;
struct OptimizedPatternDetector {
pattern_tree: KdTree<f32, usize, &'static [f32]>,
patterns: Vec<AttackPattern>,
}
pub fn detect_attack(&self, features: &[f32]) -> f32 {
// KD-tree nearest neighbor: O(log n)
let nearest = self.pattern_tree.nearest(features, 5, &squared_euclidean);
// Only check top-k similar patterns
}
```
**Estimated Improvement**: 10-100x faster depending on pattern count
---
#### Decision History Pruning - **LOW PRIORITY**
**Location**: `src/security/mod.rs:433-437`
```rust
if self.decisions.len() > 10000 {
self.decisions.drain(0..5000);
}
```
**Problem**: O(n) drain operation on vector. Can cause latency spikes.
**Optimization**:
```rust
// Use circular buffer (VecDeque) for O(1) removal
use std::collections::VecDeque;
decisions: VecDeque<SecurityDecision>,
// Or use time-based eviction instead of count-based
```
---
### 4. Network Topology Operations (O(n) peer operations)
#### Peer Connection Updates - **LOW PRIORITY**
**Location**: `src/evolution/mod.rs:50-60`
```rust
pub fn update_connection(&mut self, from: &str, to: &str, success_rate: f32) {
if let Some(connections) = self.connectivity.get_mut(from) {
if let Some(conn) = connections.iter_mut().find(|(id, _)| id == to) {
conn.1 = conn.1 * (1.0 - self.learning_rate) + success_rate * self.learning_rate;
} else {
connections.push((to.to_string(), success_rate));
}
}
}
```
**Problem**: O(n) linear search through connections for each update
**Impact**: Frequent peer interaction updates cause slowdown
**Optimization**:
```rust
// Use HashMap for O(1) lookup
connectivity: HashMap<String, HashMap<String, f32>>,
pub fn update_connection(&mut self, from: &str, to: &str, success_rate: f32) {
self.connectivity
.entry(from.to_string())
.or_insert_with(HashMap::new)
.entry(to.to_string())
.and_modify(|score| {
*score = *score * (1.0 - self.learning_rate) + success_rate * self.learning_rate;
})
.or_insert(success_rate);
}
```
---
#### Optimal Peer Selection - **MEDIUM PRIORITY**
**Location**: `src/evolution/mod.rs:63-77`
```rust
pub fn get_optimal_peers(&self, node_id: &str, count: usize) -> Vec<String> {
if let Some(connections) = self.connectivity.get(node_id) {
let mut sorted: Vec<_> = connections.iter().collect();
sorted.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
for (peer_id, _score) in sorted.into_iter().take(count) {
peers.push(peer_id.clone());
}
}
peers
}
```
**Problem**: O(n log n) sort on every call. Wasteful for small `count`.
**Optimization**:
```rust
// Use partial sort (nth_element) for O(n) when count << connections.len()
use std::cmp::Ordering;
pub fn get_optimal_peers(&self, node_id: &str, count: usize) -> Vec<String> {
if let Some(connections) = self.connectivity.get(node_id) {
let mut peers: Vec<_> = connections.iter().collect();
if count >= peers.len() {
return peers.iter().map(|(id, _)| (*id).clone()).collect();
}
// Partial sort: O(n) for finding top-k
peers.select_nth_unstable_by(count, |a, b| {
b.1.partial_cmp(&a.1).unwrap_or(Ordering::Equal)
});
peers[..count].iter().map(|(id, _)| (*id).clone()).collect()
} else {
Vec::new()
}
}
```
**Estimated Improvement**: 10x faster for count=5, connections=1000
---
### 5. Task Queue Operations (O(n) search)
#### Task Claiming - **HIGH PRIORITY**
**Location**: `src/tasks/mod.rs:335-347`
```rust
pub async fn claim_next(
&mut self,
identity: &crate::identity::WasmNodeIdentity,
) -> Result<Option<Task>, JsValue> {
for task in &self.pending {
if !self.claimed.contains_key(&task.id) {
self.claimed.insert(task.id.clone(), identity.node_id());
return Ok(Some(task.clone()));
}
}
Ok(None)
}
```
**Problem**: O(n) linear search through pending tasks
**Impact**:
- Every worker scans all pending tasks
- 1000 pending tasks = 1000 checks per claim attempt
**Optimization**:
```rust
// Priority queue with indexed lookup
use std::collections::{BinaryHeap, HashMap};
struct TaskQueue {
pending: BinaryHeap<PrioritizedTask>,
claimed: HashMap<String, String>,
task_index: HashMap<String, Task>, // Fast lookup
}
pub async fn claim_next(&mut self, identity: &Identity) -> Option<Task> {
while let Some(prioritized) = self.pending.pop() {
if !self.claimed.contains_key(&prioritized.id) {
self.claimed.insert(prioritized.id.clone(), identity.node_id());
return self.task_index.get(&prioritized.id).cloned();
}
}
None
}
```
**Estimated Improvement**: 100x faster for large queues
---
### 6. Optimization Engine Routing (O(n) filter operations)
#### Node Score Calculation - **MEDIUM PRIORITY**
**Location**: `src/evolution/mod.rs:476-492`
```rust
fn calculate_node_score(&self, node_id: &str, task_type: &str) -> f32 {
let history: Vec<_> = self.routing_history.iter()
.filter(|d| d.selected_node == node_id && d.task_type == task_type)
.collect();
// ... calculations ...
}
```
**Problem**: O(n) filter on every node scoring. Called multiple times during selection.
**Impact**: Large routing history (10K+ entries) causes significant slowdown
**Optimization**:
```rust
// Maintain indexed aggregates
struct RoutingStats {
success_count: u64,
total_count: u64,
total_latency: u64,
}
routing_stats: HashMap<(String, String), RoutingStats>, // (node_id, task_type) -> stats
fn calculate_node_score(&self, node_id: &str, task_type: &str) -> f32 {
let key = (node_id.to_string(), task_type.to_string());
if let Some(stats) = self.routing_stats.get(&key) {
let success_rate = stats.success_count as f32 / stats.total_count as f32;
let avg_latency = stats.total_latency as f32 / stats.total_count as f32;
// O(1) calculation
} else {
0.5 // Unknown
}
}
```
**Estimated Improvement**: 1000x faster for 10K history
---
## Memory Optimization Opportunities
### 1. String Allocations
**Problem**: Heavy use of `String::clone()` and `to_string()` throughout codebase
**Impact**: Heap allocations, GC pressure
**Examples**:
- Node IDs cloned repeatedly
- Task IDs duplicated across structures
- Transaction hashes as byte arrays then converted to strings
**Optimization**:
```rust
// Use Arc<str> for shared immutable strings
use std::sync::Arc;
type NodeId = Arc<str>;
type TaskId = Arc<str>;
// Or use string interning
use string_cache::DefaultAtom as Atom;
```
---
### 2. HashMap Growth
**Problem**: HashMaps without capacity hints cause multiple reallocations
**Examples**:
- `connectivity: HashMap<String, Vec<(String, f32)>>`
- `routing_history: Vec<RoutingDecision>`
**Optimization**:
```rust
// Pre-allocate with estimated capacity
let mut connectivity = HashMap::with_capacity(expected_nodes);
// Or use SmallVec for small connection lists
use smallvec::SmallVec;
type ConnectionList = SmallVec<[(String, f32); 8]>;
```
---
## Algorithmic Improvements
### 1. Batch Operations
**Current**: Individual credit/deduct operations
**Improved**: Batch multiple operations
```rust
pub fn batch_credit(&mut self, transactions: &[(u64, &str)]) -> Result<(), JsValue> {
let total: u64 = transactions.iter().map(|(amt, _)| amt).sum();
self.local_balance += total;
for (amount, reason) in transactions {
let event_id = Uuid::new_v4().to_string();
*self.earned.entry(event_id).or_insert(0) += amount;
}
Ok(())
}
```
---
### 2. Lazy Evaluation
**Current**: Eager computation of metrics
**Improved**: Compute on-demand with caching
```rust
struct CachedMetric<T> {
value: Option<T>,
dirty: bool,
}
impl EconomicEngine {
fn get_health(&mut self) -> &EconomicHealth {
if self.health_cache.dirty {
self.health_cache.value = Some(self.calculate_health());
self.health_cache.dirty = false;
}
self.health_cache.value.as_ref().unwrap()
}
}
```
---
## Benchmark Targets
Based on the analysis, here are performance targets:
| Operation | Current (est.) | Target | Improvement |
|-----------|---------------|--------|-------------|
| Balance check (1K txs) | 1ms | 10ns | 100,000x |
| QDAG tip selection | 100µs | 1µs | 100x |
| Attack detection | 500µs | 5µs | 100x |
| Task claiming | 10ms | 100µs | 100x |
| Peer selection | 1ms | 10µs | 100x |
| Node scoring | 5ms | 5µs | 1000x |
---
## Priority Implementation Order
### Phase 1: Critical Bottlenecks (Week 1)
1. ✅ Cache ledger balance (O(n) → O(1))
2. ✅ Index task queue (O(n) → O(log n))
3. ✅ Index routing stats (O(n) → O(1))
### Phase 2: High Impact (Week 2)
4. ✅ Optimize peer selection (O(n log n) → O(n))
5. ✅ KD-tree for attack patterns (O(n) → O(log n))
6. ✅ Weighted tip selection (O(n) → O(log n))
### Phase 3: Polish (Week 3)
7. ✅ String interning
8. ✅ Batch operations API
9. ✅ Lazy evaluation caching
10. ✅ Memory pool allocators
---
## Testing Strategy
### Benchmark Suite
Run comprehensive benchmarks in `src/bench.rs`:
```bash
cargo bench --features=bench
```
### Load Testing
```rust
// Simulate 10K nodes, 100K transactions
#[test]
fn stress_test_large_network() {
let mut topology = NetworkTopology::new();
for i in 0..10_000 {
topology.register_node(&format!("node-{}", i), &[0.5, 0.3, 0.2]);
}
let start = Instant::now();
topology.get_optimal_peers("node-0", 10);
let elapsed = start.elapsed();
assert!(elapsed < Duration::from_millis(1)); // Target: <1ms
}
```
### Memory Profiling
```bash
# Using valgrind/massif
valgrind --tool=massif target/release/edge-net-bench
# Using heaptrack
heaptrack target/release/edge-net-bench
```
---
## Conclusion
The edge-net system has several O(n) and O(n log n) operations that will become bottlenecks as the network scales. The priority optimizations focus on:
1. **Caching computed values** (balance, routing stats)
2. **Using appropriate data structures** (indexed collections, priority queues)
3. **Avoiding linear scans** (spatial indexes for patterns, partial sorting)
4. **Reducing allocations** (string interning, capacity hints)
Implementing Phase 1 optimizations alone should provide **100-1000x** improvements for critical operations.
## Next Steps
1. Run baseline benchmarks to establish current performance
2. Implement Phase 1 optimizations with before/after benchmarks
3. Profile memory usage under load
4. Document performance characteristics in API docs
5. Set up continuous performance monitoring