Sparse Inference Engine - Test Suite

This is the test suite for the RuVector sparse inference engine: 78+ tests and 10 benchmarks across 1516 lines of test code.

Test Structure

Unit Tests (tests/unit/)

Predictor Tests (predictor_tests.rs - 12 tests)

  • Low-rank predictor creation and configuration
  • Active neuron prediction validation
  • Top-K mode functionality
  • Calibration effectiveness
  • Input validation and edge cases
  • Consistency and determinism

Sparse FFN Tests (sparse_ffn_tests.rs - 14 tests)

  • Sparse vs dense computation equivalence
  • Different activation functions (ReLU, GeLU, SiLU)
  • SwiGLU paired neuron handling
  • Empty and partial activation sets
  • Out-of-bounds and duplicate neuron handling
  • Deterministic output verification
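
As an illustration of the dense/sparse equivalence and index-handling checks listed above, here is a minimal, self-contained sketch. The `Ffn` type, its methods, and `example_ffn` are hypothetical and are not the crate's API. Silently skipping out-of-bounds and duplicate indices is only one possible policy, and the real implementation may return an error instead.

```rust
/// Hypothetical two-layer ReLU FFN for illustration only (not the crate's API).
/// `w1` has one row per hidden neuron; `w2` has one row per output element.
struct Ffn {
    w1: Vec<Vec<f32>>,
    w2: Vec<Vec<f32>>,
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

impl Ffn {
    /// Dense pass: every hidden neuron is evaluated.
    fn forward_dense(&self, input: &[f32]) -> Vec<f32> {
        let hidden: Vec<f32> = self.w1.iter().map(|row| dot(row, input).max(0.0)).collect();
        self.w2.iter().map(|row| dot(row, &hidden)).collect()
    }

    /// Sparse pass: only the neurons listed in `active` contribute.
    /// Assumed policy: out-of-bounds and repeated indices are skipped.
    fn forward_sparse(&self, input: &[f32], active: &[usize]) -> Vec<f32> {
        let mut seen = vec![false; self.w1.len()];
        let mut out = vec![0.0f32; self.w2.len()];
        for &n in active {
            if n >= self.w1.len() || seen[n] {
                continue;
            }
            seen[n] = true;
            let h = dot(&self.w1[n], input).max(0.0);
            for (o, row) in out.iter_mut().zip(&self.w2) {
                *o += row[n] * h;
            }
        }
        out
    }
}

/// Small fixture with exactly representable weights (2 inputs, 3 hidden, 1 output).
fn example_ffn() -> Ffn {
    Ffn {
        w1: vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![-1.0, -1.0]],
        w2: vec![vec![1.0, 2.0, 3.0]],
    }
}

#[test]
fn sparse_with_all_neurons_matches_dense() {
    let ffn = example_ffn();
    let x = [1.0f32, 2.0];
    assert_eq!(ffn.forward_sparse(&x, &[0, 1, 2]), ffn.forward_dense(&x));
}
```

The fixture uses small exact values so that `assert_eq!` is safe. With realistic weights the two passes accumulate in a different order, so a tolerance-based comparison would be the more robust choice.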

Quantization Tests (quantization_tests.rs - 15 tests)

  • INT8 quantization roundtrip accuracy
  • INT4 compression ratios
  • Different group sizes (16, 32, 64, 128)
  • Selective row dequantization
  • Range preservation
  • Uniform and zero value handling
  • Odd-length array support
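
The roundtrip, zero-value, and odd-length checks above can be pictured with the following sketch. It assumes a symmetric per-group INT8 scheme with one scale per group; `QuantizedInt8`, `quantize_int8`, `dequantize_int8`, and `max_abs_error` are hypothetical names, and the crate's actual quantization format may differ.

```rust
/// Hypothetical symmetric INT8 container: one scale per group (not the crate's API).
struct QuantizedInt8 {
    values: Vec<i8>,
    scales: Vec<f32>,
    group_size: usize,
}

fn quantize_int8(data: &[f32], group_size: usize) -> QuantizedInt8 {
    assert!(group_size > 0, "group_size must be non-zero");
    let mut values = Vec::with_capacity(data.len());
    let mut scales = Vec::new();
    // `chunks` yields a shorter final group when the length is not a multiple
    // of `group_size`, which is how odd-length inputs are handled here.
    for group in data.chunks(group_size) {
        let max_abs = group.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        // All-zero group: any non-zero scale works; 1.0 avoids dividing by zero.
        let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
        scales.push(scale);
        for v in group {
            values.push((v / scale).round().clamp(-127.0, 127.0) as i8);
        }
    }
    QuantizedInt8 { values, scales, group_size }
}

fn dequantize_int8(q: &QuantizedInt8) -> Vec<f32> {
    q.values
        .iter()
        .enumerate()
        .map(|(i, &v)| v as f32 * q.scales[i / q.group_size])
        .collect()
}

/// Largest element-wise absolute difference between two slices.
fn max_abs_error(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}

#[test]
fn int8_roundtrip_error_is_bounded_by_half_a_step() {
    let data = [0.5f32, -1.0, 0.25, 0.75, 2.0]; // 5 values, group size 4
    let q = quantize_int8(&data, 4);
    let largest_scale = q.scales.iter().cloned().fold(0.0f32, f32::max);
    let restored = dequantize_int8(&q);
    assert!(max_abs_error(&data, &restored) <= largest_scale / 2.0 + 1e-6);
}
```

Under this scheme rounding moves each value by at most half a quantization step, so half the largest group scale is a natural tolerance for a roundtrip test.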

Integration Tests (tests/integration/)

Model Loading Tests (model_loading_tests.rs - 15 tests)

  • GGUF header parsing
  • Invalid format detection
  • Model structure validation
  • Forward pass execution
  • Configuration handling
  • Multiple model sizes
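
For readers unfamiliar with what header parsing and invalid format detection involve, the sketch below parses the fixed-size prefix of a GGUF file. It assumes the version 2 and later layout (4-byte magic `GGUF`, little-endian `u32` version, then `u64` tensor and metadata counts). `GgufHeader`, `HeaderError`, and `parse_gguf_header` are hypothetical names, not the crate's loader API.

```rust
/// Fixed-size prefix of a GGUF file, assuming the version 2+ layout.
#[derive(Debug, PartialEq)]
struct GgufHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
}

/// Reads a little-endian u64 starting at byte offset `i`.
fn read_u64_le(bytes: &[u8], i: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[i..i + 8]);
    u64::from_le_bytes(buf)
}

fn parse_gguf_header(bytes: &[u8]) -> Result<GgufHeader, HeaderError> {
    // 4 magic bytes + u32 version + two u64 counts = 24 bytes.
    if bytes.len() < 24 {
        return Err(HeaderError::TooShort);
    }
    if &bytes[..4] != b"GGUF" {
        return Err(HeaderError::BadMagic);
    }
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    Ok(GgufHeader {
        version,
        tensor_count: read_u64_le(bytes, 8),
        metadata_kv_count: read_u64_le(bytes, 16),
    })
}

#[test]
fn rejects_wrong_magic() {
    let mut file = vec![0u8; 24];
    file[..4].copy_from_slice(b"GGML");
    assert_eq!(parse_gguf_header(&file), Err(HeaderError::BadMagic));
}
```

Returning a distinct error per failure mode lets a test assert on why a file was rejected, rather than only that it was.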

Sparse Inference Tests (sparse_inference_tests.rs - 12 tests)

  • Full sparse pipeline execution
  • Dense vs sparse accuracy comparison
  • Batch processing
  • Calibration improvements
  • Different sparsity levels (10%-90%)
  • Consistency verification
  • Extreme input handling

Property-Based Tests (tests/property/mod.rs - 10 tests)

These tests use proptest to generate inputs and check the following invariants:

  • Output finiteness invariants
  • Valid index generation
  • Dense/sparse equivalence
  • Quantization ordering preservation
  • Top-K constraints
  • Dimension correctness
  • INT4 roundtrip properties
  • Output dimension consistency
  • SwiGLU output validation
  • Calibration robustness

Benchmark Tests (benches/sparse_inference_bench.rs - 10 benchmarks)

Performance Comparisons:

  1. Sparse vs Dense: Baseline comparison
  2. Sparsity Levels: 30%, 50%, 70%, 90% sparsity
  3. Predictor Performance: Prediction latency
  4. Top-K Modes: K=100, 500, 1000, 2000
  5. Sparse FFN: Dense vs 10% vs 50% sparse
  6. Activation Functions: ReLU, GeLU, SiLU comparison
  7. Quantization: Dequantization of 1, 10, 100 rows
  8. INT4 vs INT8: Quantization speed and accuracy
  9. Calibration: Sample sizes 10, 50, 100, 500
  10. SwiGLU: Dense vs sparse comparison

Common Test Utilities (tests/common/mod.rs)

Helper functions for all tests:

  • random_vector(dim) - Generate test vectors
  • random_activations(max) - Generate activation patterns
  • create_test_ffn(input, hidden) - FFN factory
  • create_calibrated_predictor() - Pre-calibrated predictor
  • create_quantized_matrix(rows, cols) - Quantized weights
  • load_test_llama_model() - Test model loader
  • assert_vectors_close(a, b, tol) - Approximate equality
  • mse(a, b) - Mean squared error
  • generate_calibration_data(n) - Calibration dataset
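
To give a feel for the comparison helpers, here is one plausible shape for `mse` and `assert_vectors_close`. Only the names and parameter lists come from the list above. The `f32` slice types, the panic messages, and the choice to treat empty inputs as zero error are assumptions, and the real helpers in tests/common/mod.rs may differ.

```rust
/// Mean squared error between two equal-length slices.
/// Assumption: two empty slices have an error of 0.0.
fn mse(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "length mismatch");
    if a.is_empty() {
        return 0.0;
    }
    let sum: f32 = a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum();
    sum / a.len() as f32
}

/// Panics if the lengths differ or any pair of elements differs by more than `tol`.
fn assert_vectors_close(a: &[f32], b: &[f32], tol: f32) {
    assert_eq!(a.len(), b.len(), "length mismatch: {} vs {}", a.len(), b.len());
    for (i, (x, y)) in a.iter().zip(b).enumerate() {
        assert!(
            (x - y).abs() <= tol,
            "index {}: {} vs {} exceeds tolerance {}",
            i, x, y, tol
        );
    }
}

#[test]
fn close_vectors_pass_and_mse_is_small() {
    let expected = [1.0f32, 2.0, 3.0];
    let actual = [1.0005f32, 1.9995, 3.0];
    assert_vectors_close(&expected, &actual, 1e-3);
    assert!(mse(&expected, &actual) < 1e-6);
}
```

Reporting the failing index in the panic message makes a mismatch in a long output vector much quicker to track down.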

Running Tests

# Run all tests
cargo test -p ruvector-sparse-inference

# Run specific test categories
cargo test -p ruvector-sparse-inference --test unit
cargo test -p ruvector-sparse-inference --test integration
cargo test -p ruvector-sparse-inference --test property

# Run unit tests for a specific module
cargo test -p ruvector-sparse-inference predictor_tests
cargo test -p ruvector-sparse-inference quantization_tests
cargo test -p ruvector-sparse-inference sparse_ffn_tests

# Run benchmarks
cargo bench -p ruvector-sparse-inference

# Run specific benchmark
cargo bench -p ruvector-sparse-inference -- sparse_vs_dense
cargo bench -p ruvector-sparse-inference -- sparsity_levels
cargo bench -p ruvector-sparse-inference -- quantization

Test Coverage Goals

  • Statements: >80%
  • Branches: >75%
  • Functions: >80%
  • Lines: >80%

Test Characteristics

Tests follow the FIRST principles:

  • Fast: Unit tests <100ms
  • Isolated: No dependencies between tests
  • Repeatable: Same result every time
  • Self-validating: Clear pass/fail
  • Timely: Written with implementation

Property-Based Testing

The proptest strategies draw inputs from the following ranges:

  • Input values: -10.0 to 10.0
  • Vector dimensions: 256 to 1024
  • Hidden dimensions: 512 to 4096
  • Group sizes: 16, 32, 64, 128
  • Sample counts: 1 to 100
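
The sketch below shows the kind of invariant such a test checks, using the input value and vector dimension ranges listed above. To stay dependency-free it uses a small hand-rolled generator as a stand-in for a proptest strategy, so it illustrates the idea rather than the suite's actual proptest code. `top_k_indices`, `Lcg`, and `check_top_k_invariants` are hypothetical names, and drawing `k` from 1 up to the dimension is an assumption.

```rust
/// Hypothetical top-K selector: indices of the `k` largest scores, in ascending index order.
fn top_k_indices(scores: &[f32], k: usize) -> Vec<usize> {
    let mut order: Vec<usize> = (0..scores.len()).collect();
    // `total_cmp` is a total order over f32, so the comparator is always well defined.
    order.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]));
    order.truncate(k);
    order.sort_unstable();
    order
}

/// Tiny linear congruential generator, a std-only stand-in for a proptest strategy.
struct Lcg(u64);

impl Lcg {
    fn next_u32(&mut self) -> u32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 32) as u32
    }

    /// Integer in the inclusive range [lo, hi].
    fn usize_in(&mut self, lo: usize, hi: usize) -> usize {
        lo + (self.next_u32() as usize) % (hi - lo + 1)
    }

    /// Float in [-10.0, 10.0], the input value range listed above.
    fn input_value(&mut self) -> f32 {
        (self.next_u32() as f64 / u32::MAX as f64 * 20.0 - 10.0) as f32
    }
}

/// Checks the top-K invariants on `cases` random inputs and returns the number checked.
fn check_top_k_invariants(seed: u64, cases: usize) -> usize {
    let mut rng = Lcg(seed);
    for _ in 0..cases {
        let dim = rng.usize_in(256, 1024); // vector dimension range listed above
        let k = rng.usize_in(1, dim); // assumed range for K
        let scores: Vec<f32> = (0..dim).map(|_| rng.input_value()).collect();
        let picked = top_k_indices(&scores, k);

        // Exactly K indices, strictly increasing (so unique), all in bounds.
        assert_eq!(picked.len(), k);
        assert!(picked.windows(2).all(|w| w[0] < w[1]));
        assert!(picked.iter().all(|&i| i < dim));

        // No unpicked score may exceed the weakest picked score.
        let mut is_picked = vec![false; dim];
        for &i in &picked {
            is_picked[i] = true;
        }
        let weakest = picked.iter().map(|&i| scores[i]).fold(f32::INFINITY, f32::min);
        assert!((0..dim).all(|i| is_picked[i] || scores[i] <= weakest));
    }
    cases
}

#[test]
fn top_k_invariants_hold_for_random_inputs() {
    assert_eq!(check_top_k_invariants(42, 100), 100);
}
```

A fixed seed keeps the run repeatable. proptest adds what this stand-in lacks, notably automatic shrinking of failing inputs to a minimal counterexample.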

Edge Cases Tested

  1. Empty inputs: Zero-length vectors, no active neurons
  2. Boundary values: Maximum dimensions, extreme values
  3. Invalid inputs: Wrong dimensions, out-of-bounds indices
  4. Numerical stability: Very large/small values, precision loss
  5. Concurrent operations: Parallel inference requests
  6. Memory efficiency: Large datasets, quantization compression

Test Organization

tests/
├── common/
│   └── mod.rs                    # Shared test utilities
├── unit/
│   ├── predictor_tests.rs        # Neuron prediction tests
│   ├── sparse_ffn_tests.rs       # Sparse computation tests
│   └── quantization_tests.rs     # Weight compression tests
├── integration/
│   ├── model_loading_tests.rs    # GGUF parsing tests
│   └── sparse_inference_tests.rs # End-to-end pipeline tests
└── property/
    └── mod.rs                     # Property-based tests

benches/
└── sparse_inference_bench.rs      # Performance benchmarks

Future Test Additions

Potential areas for expansion:

  1. Stress tests for memory limits
  2. Concurrent inference benchmarks
  3. Hardware-specific SIMD tests
  4. Model-specific accuracy tests
  5. Calibration strategy comparisons
  6. Cache effectiveness tests
  7. Quantization accuracy analysis

Total Test Count: 78+ tests and 10 benchmarks across 1516 lines of test code

  • 68 unit and integration tests (41 unit, 27 integration)
  • 10 property-based tests
  • 10 performance benchmarks