# ruvector-scipix Benchmark Suite

Comprehensive performance benchmarking for the Scipix OCR clone, using Criterion.

## Overview

This benchmark suite provides detailed performance analysis across all critical components of the OCR system:

- **OCR Latency**: end-to-end OCR performance metrics
- **Preprocessing**: image preprocessing pipeline performance
- **LaTeX Generation**: LaTeX AST generation and string building
- **Inference**: model inference benchmarks (detection, recognition, math)
- **Cache**: embedding cache and similarity search performance
- **API**: REST API request/response handling
- **Memory**: memory usage, growth, and fragmentation analysis
## Performance Targets

### Primary Targets

- **Single Image OCR**: < 100 ms at P95
- **Batch Processing (16 images)**: < 500 ms total
- **Preprocessing Pipeline**: < 20 ms
- **LaTeX Generation**: < 5 ms

### Secondary Targets

- **Cache Hit Latency**: < 1 ms
- **Similarity Search (1000 embeddings)**: < 10 ms
- **API Request Parsing**: < 0.5 ms
- **Model Warm-up**: < 200 ms
## Running Benchmarks

### Run All Benchmarks

```bash
cd examples/scipix
./scripts/run_benchmarks.sh all
```
### Run a Specific Benchmark Suite

```bash
# OCR latency benchmarks
./scripts/run_benchmarks.sh latency

# Preprocessing benchmarks
./scripts/run_benchmarks.sh preprocessing

# LaTeX generation benchmarks
./scripts/run_benchmarks.sh latex

# Model inference benchmarks
./scripts/run_benchmarks.sh inference

# Cache benchmarks
./scripts/run_benchmarks.sh cache

# API benchmarks
./scripts/run_benchmarks.sh api

# Memory benchmarks
./scripts/run_benchmarks.sh memory
```
### Quick Benchmark Suite

For rapid iteration during development:

```bash
./scripts/run_benchmarks.sh quick
```

### CI Benchmark Suite

Minimal sample counts for continuous integration:

```bash
./scripts/run_benchmarks.sh ci
```
## Baseline Tracking

### Save Current Results as a Baseline

```bash
BASELINE=v1.0 ./scripts/run_benchmarks.sh all
```

### Compare with a Saved Baseline

```bash
./scripts/run_benchmarks.sh compare v1.0
```

### Compare with the Main Branch

```bash
BASELINE=main ./scripts/run_benchmarks.sh all
./scripts/run_benchmarks.sh compare main
```
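Under the hood, the wrapper script presumably forwards to Criterion's built-in baseline support; if you prefer to invoke Criterion directly, the equivalent flags are:

```shell
# Save the current run as a named baseline (Criterion built-in flag)
cargo bench -- --save-baseline v1.0

# Run again later, comparing against that saved baseline
cargo bench -- --baseline v1.0
```

Named baselines are stored under `target/criterion` alongside the HTML reports.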
## Benchmark Details

### 1. OCR Latency Benchmarks (`ocr_latency.rs`)

Tests end-to-end OCR performance across various scenarios:

- **Single Image OCR**: different image sizes (224×224 to 1024×1024)
- **Batch Processing**: batch sizes from 1 to 32 images
- **Cold vs. Warm Start**: model initialization overhead
- **Latency Percentiles**: P50, P95, and P99 measurements
- **Throughput**: images per second

**Key Metrics:**

- Mean latency
- P95/P99 latency
- Throughput (images/sec)
- Batch efficiency
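As a rough illustration of how these percentiles are computed, here is a minimal std-only sketch (Criterion does this, and much more, internally); `fake_ocr` is a hypothetical stand-in for the real pipeline:

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the real OCR call; replace with the actual pipeline.
fn fake_ocr(pixels: &[u8]) -> usize {
    pixels.iter().filter(|&&p| p > 128).count()
}

/// Return the p-th percentile (p in 0.0..=1.0) of a set of timing samples.
fn percentile(samples: &mut [Duration], p: f64) -> Duration {
    samples.sort();
    let idx = ((samples.len() as f64 - 1.0) * p).round() as usize;
    samples[idx]
}

fn main() {
    let image = vec![0u8; 224 * 224];
    // Collect 100 timing samples of the stubbed OCR call.
    let mut samples: Vec<Duration> = (0..100)
        .map(|_| {
            let start = Instant::now();
            std::hint::black_box(fake_ocr(&image)); // keep the call from being optimized away
            start.elapsed()
        })
        .collect();

    println!("p50 = {:?}", percentile(&mut samples, 0.50));
    println!("p95 = {:?}", percentile(&mut samples, 0.95));
    println!("p99 = {:?}", percentile(&mut samples, 0.99));
}
```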
### 2. Preprocessing Benchmarks (`preprocessing.rs`)

Image preprocessing pipeline performance:

- **Individual Transforms**: grayscale, blur, threshold, edge detection
- **Full Pipeline**: the sequential preprocessing chain
- **Parallel vs. Sequential**: batch processing comparison
- **Resize Operations**: nearest-neighbor and bilinear interpolation

**Key Metrics:**

- Transform latency
- Pipeline total time
- Parallel speedup
- Memory overhead
### 3. LaTeX Generation Benchmarks (`latex_generation.rs`)

LaTeX code generation from the AST:

- **Simple Expressions**: fractions, powers, sums
- **Complex Expressions**: matrices, integrals, summations
- **AST Traversal**: impact of tree depth on performance
- **String Building**: optimization strategies
- **Batch Generation**: multiple expressions

**Key Metrics:**

- Generation latency
- AST traversal time
- String concatenation efficiency
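The string-building strategies above largely come down to avoiding intermediate allocations. A minimal sketch of the idea, using a hypothetical two-variant AST rather than the project's actual node type:

```rust
/// Hypothetical minimal AST node, for illustration only.
enum Node {
    Symbol(&'static str),
    Frac(Box<Node>, Box<Node>),
}

/// Naive generation: every `format!` allocates a fresh intermediate string.
fn generate_naive(node: &Node) -> String {
    match node {
        Node::Symbol(s) => s.to_string(),
        Node::Frac(num, den) => {
            format!("\\frac{{{}}}{{{}}}", generate_naive(num), generate_naive(den))
        }
    }
}

/// Faster: thread one pre-allocated buffer through the whole traversal.
fn generate_into(node: &Node, out: &mut String) {
    match node {
        Node::Symbol(s) => out.push_str(s),
        Node::Frac(num, den) => {
            out.push_str("\\frac{");
            generate_into(num, out);
            out.push_str("}{");
            generate_into(den, out);
            out.push('}');
        }
    }
}

fn main() {
    let ast = Node::Frac(Box::new(Node::Symbol("x")), Box::new(Node::Symbol("y")));
    let mut buf = String::with_capacity(64);
    generate_into(&ast, &mut buf);
    assert_eq!(buf, generate_naive(&ast));
    println!("{buf}"); // \frac{x}{y}
}
```

Threading one pre-allocated buffer through the traversal turns many small allocations into, at most, a few buffer growths.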
### 4. Inference Benchmarks (`inference.rs`)

Neural network model inference:

- **Text Detection Model**: bounding box detection
- **Text Recognition Model**: OCR text extraction
- **Math Model**: mathematical notation recognition
- **Tensor Preprocessing**: image-to-tensor conversion
- **Output Postprocessing**: NMS, confidence filtering, CTC decoding
- **Batch Inference**: multi-image processing
- **Model Warm-up**: initialization overhead

**Key Metrics:**

- Inference latency per model
- Batch throughput
- Preprocessing overhead
- Postprocessing time
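As an illustration of the postprocessing stage, here is a self-contained sketch of confidence filtering followed by greedy NMS; the `Det` type and the thresholds are hypothetical, not the project's actual types:

```rust
/// Hypothetical detection box: (x1, y1, x2, y2) corners plus a confidence score.
#[derive(Clone, Copy, Debug)]
struct Det { x1: f32, y1: f32, x2: f32, y2: f32, score: f32 }

/// Intersection-over-union of two boxes (0.0 when they do not overlap).
fn iou(a: Det, b: Det) -> f32 {
    let ix = (a.x2.min(b.x2) - a.x1.max(b.x1)).max(0.0);
    let iy = (a.y2.min(b.y2) - a.y1.max(b.y1)).max(0.0);
    let inter = ix * iy;
    let area = |d: Det| (d.x2 - d.x1) * (d.y2 - d.y1);
    inter / (area(a) + area(b) - inter)
}

/// Confidence filtering followed by greedy non-maximum suppression.
fn nms(mut dets: Vec<Det>, conf_thresh: f32, iou_thresh: f32) -> Vec<Det> {
    dets.retain(|d| d.score >= conf_thresh);
    dets.sort_by(|a, b| b.score.total_cmp(&a.score)); // highest confidence first
    let mut keep: Vec<Det> = Vec::new();
    for d in dets {
        // Keep a box only if it does not overlap an already-kept box too much.
        if keep.iter().all(|k| iou(*k, d) < iou_thresh) {
            keep.push(d);
        }
    }
    keep
}

fn main() {
    let dets = vec![
        Det { x1: 0.0, y1: 0.0, x2: 10.0, y2: 10.0, score: 0.9 },
        Det { x1: 1.0, y1: 1.0, x2: 11.0, y2: 11.0, score: 0.8 }, // overlaps the first
        Det { x1: 50.0, y1: 50.0, x2: 60.0, y2: 60.0, score: 0.7 },
        Det { x1: 0.0, y1: 0.0, x2: 5.0, y2: 5.0, score: 0.2 },   // below threshold
    ];
    let kept = nms(dets, 0.5, 0.5);
    println!("kept {} boxes", kept.len()); // kept 2 boxes
}
```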
### 5. Cache Benchmarks (`cache.rs`)

Embedding cache and similarity search:

- **Embedding Generation**: image-to-vector embedding
- **Similarity Search**: linear scan and approximate nearest neighbor (ANN)
- **Cache Hit/Miss Latency**: lookup performance
- **Cache Insertion**: adding new entries
- **Batch Operations**: multi-query performance
- **Cache Statistics**: memory and efficiency metrics

**Key Metrics:**

- Embedding generation time
- Search latency (linear vs. ANN)
- Hit/miss ratio impact
- Memory per embedding
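The linear similarity-search baseline can be sketched in a few lines of std-only Rust; the embedding layout here is hypothetical:

```rust
/// Cosine similarity between two embedding vectors (assumed equal length).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Linear scan over the cache: O(n·d). This is the baseline that an
/// approximate-nearest-neighbor index is measured against.
fn nearest(query: &[f32], cache: &[Vec<f32>]) -> Option<(usize, f32)> {
    cache
        .iter()
        .enumerate()
        .map(|(i, e)| (i, cosine(query, e)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
}

fn main() {
    let cache = vec![
        vec![1.0, 0.0, 0.0],
        vec![0.0, 1.0, 0.0],
        vec![0.7, 0.7, 0.0],
    ];
    let (idx, sim) = nearest(&[0.9, 0.1, 0.0], &cache).unwrap();
    println!("best match: index {idx}, similarity {sim:.3}");
}
```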
### 6. API Benchmarks (`api.rs`)

REST API performance:

- **Request Parsing**: JSON deserialization
- **Response Serialization**: JSON encoding
- **Concurrent Requests**: multi-client handling
- **Middleware Overhead**: auth, logging, validation, rate limiting
- **Error Handling**: error response generation
- **End-to-End Request**: the full request cycle

**Key Metrics:**

- Parse/serialize latency
- Middleware overhead
- Concurrent throughput
- Error handling time
### 7. Memory Benchmarks (`memory.rs`)

Memory usage and management:

- **Peak Memory**: maximum usage during inference
- **Memory per Image**: how memory scales with batch size
- **Model Loading**: memory required for model initialization
- **Memory Growth**: leak detection over time
- **Fragmentation**: allocation/deallocation patterns
- **Cache Memory**: embedding storage overhead
- **Memory Pools**: pool vs. heap allocation
- **Tensor Layouts**: HWC vs. CHW memory impact

**Key Metrics:**

- Peak memory usage
- Memory growth rate
- Fragmentation level
- Pool efficiency
## HTML Reports

Criterion automatically generates detailed HTML reports with:

- Performance graphs
- Statistical analysis
- Regression detection
- Historical comparisons

### View Reports

After running benchmarks, open:

```bash
open target/criterion/report/index.html
```

Or, for a specific benchmark:

```bash
open target/criterion/ocr_latency/report/index.html
```
## Interpreting Results

### Latency Metrics

- **Mean**: average latency across all samples
- **Median (P50)**: 50th percentile; half of all requests are faster
- **P95**: 95th percentile; 95% of requests are faster
- **P99**: 99th percentile; 99% of requests are faster
- **Standard Deviation**: spread of the latency distribution

### Throughput Metrics

- **Images/Second**: processing rate
- **Batch Efficiency**: speedup gained from batching
- **Sustainable Throughput**: the maximum rate at which at least 95% of requests still succeed
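Batch efficiency, as used here, can be read as the speedup of one n-image batch over n sequential single-image runs. A small sketch with made-up timings:

```rust
use std::time::Duration;

/// Speedup from batching: the time for n sequential single-image runs
/// divided by the time for one batch of n images. 1.0 means batching
/// gains nothing; values approaching n mean near-perfect scaling.
fn batch_efficiency(single: Duration, batch: Duration, n: u32) -> f64 {
    (single.as_secs_f64() * n as f64) / batch.as_secs_f64()
}

fn main() {
    // Hypothetical measurements: 80 ms per image alone, 500 ms for a 16-image batch.
    let speedup = batch_efficiency(Duration::from_millis(80), Duration::from_millis(500), 16);
    println!("batch speedup: {speedup:.2}x");
}
```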
### Regression Detection

Criterion detects performance regressions automatically:

- **Green**: performance improved
- **Yellow**: minor change (within noise)
- **Red**: performance regressed

### Memory Metrics

- **Peak Usage**: maximum memory at any point
- **Growth Rate**: memory increase over time
- **Fragmentation**: memory layout efficiency
## Best Practices

### Running Benchmarks

- **Consistent Environment**: run on the same hardware every time
- **Quiet System**: close other applications
- **Multiple Samples**: use a sufficient sample size (50-100)
- **Warm-up**: allow for one-time initialization and cache warming
- **Baseline Tracking**: save results for later comparison

### Analyzing Results

- **Focus on Percentiles**: P95/P99 matter more than the mean
- **Check Variance**: high variance indicates instability
- **Profile Outliers**: investigate extreme values
- **Memory Leaks**: monitor the growth rate
- **Regression Limits**: set acceptable thresholds
### Optimization Workflow

1. **Baseline**: establish current performance
2. **Profile**: identify bottlenecks
3. **Optimize**: implement improvements
4. **Benchmark**: measure the impact
5. **Compare**: verify the improvement against the baseline
6. **Iterate**: repeat until targets are met
## Continuous Integration

### CI Benchmark Configuration

```yaml
# .github/workflows/benchmark.yml
name: Benchmarks

on:
  pull_request:
  push:
    branches: [main]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
      - name: Run benchmarks
        run: |
          cd examples/scipix
          ./scripts/run_benchmarks.sh ci
      - name: Compare with baseline
        run: |
          cd examples/scipix
          ./scripts/run_benchmarks.sh compare main
```
## Troubleshooting

### Benchmarks Running Slowly

- Reduce the sample size: `cargo bench -- --sample-size 10`
- Use quick mode: `./scripts/run_benchmarks.sh quick`
- Run only the specific benchmarks you need
### Inconsistent Results

- Ensure the system is idle
- Disable CPU frequency scaling
- Run with a higher sample size
- Check for thermal throttling
### Memory Issues

- Monitor system memory during benchmark runs
- Use memory profiling tools (e.g., Valgrind, heaptrack)
- Check for leaks with the memory growth benchmarks
## Contributing

When adding new features:

- Add corresponding benchmarks
- Set performance targets
- Run a baseline before and after the change
- Document any performance impact
- Update this documentation