Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'
This commit is contained in:
84
vendor/ruvector/crates/profiling/README.md
vendored
Normal file
84
vendor/ruvector/crates/profiling/README.md
vendored
Normal file
@@ -0,0 +1,84 @@
|
||||
# Ruvector Performance Profiling
|
||||
|
||||
This directory contains profiling scripts, reports, and analysis for Ruvector performance optimization.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
profiling/
|
||||
├── scripts/ # Profiling and benchmarking scripts
|
||||
├── reports/ # Generated profiling reports
|
||||
├── flamegraphs/ # CPU flamegraphs
|
||||
├── memory/ # Memory profiling data
|
||||
└── benchmarks/ # Benchmark results
|
||||
```
|
||||
|
||||
## Profiling Tools
|
||||
|
||||
### CPU Profiling
|
||||
- **perf**: Linux performance counters
|
||||
- **flamegraph**: Visualization of CPU hotspots
|
||||
- **cargo-flamegraph**: Integrated Rust profiling
|
||||
|
||||
### Memory Profiling
|
||||
- **valgrind**: Memory leak detection and profiling
|
||||
- **heaptrack**: Heap memory profiling
|
||||
- **massif**: Heap profiler
|
||||
|
||||
### Cache Profiling
|
||||
- **perf-cache**: Cache miss analysis
|
||||
- **cachegrind**: Cache simulation
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# Install profiling tools
|
||||
./scripts/install_tools.sh
|
||||
|
||||
# Run CPU profiling
|
||||
./scripts/cpu_profile.sh
|
||||
|
||||
# Run memory profiling
|
||||
./scripts/memory_profile.sh
|
||||
|
||||
# Generate flamegraph
|
||||
./scripts/generate_flamegraph.sh
|
||||
|
||||
# Run comprehensive benchmark suite
|
||||
./scripts/benchmark_all.sh
|
||||
```
|
||||
|
||||
## Performance Targets
|
||||
|
||||
- **Throughput**: 50,000+ queries per second (QPS)
|
||||
- **Latency**: Sub-millisecond p50 latency (<1ms)
|
||||
- **Recall**: 95% recall at high QPS
|
||||
- **Memory**: Efficient memory usage with minimal allocations in hot paths
|
||||
- **Scalability**: Linear scaling from 1-128 threads
|
||||
|
||||
## Optimization Areas
|
||||
|
||||
1. **CPU Optimization**
|
||||
- SIMD intrinsics (AVX2/AVX-512)
|
||||
- Target-specific compilation
|
||||
- Hot path optimization
|
||||
|
||||
2. **Memory Optimization**
|
||||
- Arena allocation
|
||||
- Object pooling
|
||||
- Zero-copy operations
|
||||
|
||||
3. **Cache Optimization**
|
||||
- Structure-of-Arrays layout
|
||||
- Cache-line alignment
|
||||
- Prefetching
|
||||
|
||||
4. **Concurrency Optimization**
|
||||
- Lock-free data structures
|
||||
- RwLock optimization
|
||||
- Rayon tuning
|
||||
|
||||
5. **Compile-Time Optimization**
|
||||
- Profile-Guided Optimization (PGO)
|
||||
- Link-Time Optimization (LTO)
|
||||
- Target CPU features
|
||||
Reference in New Issue
Block a user