RuVector PostgreSQL Extension - Testing Guide

Overview

This document describes the comprehensive test framework for ruvector-postgres, a high-performance PostgreSQL vector similarity search extension.

Test Organization

Test Structure

tests/
├── unit_vector_tests.rs              # Unit tests for RuVector type
├── unit_halfvec_tests.rs             # Unit tests for HalfVec type
├── integration_distance_tests.rs     # pgrx integration tests
├── property_based_tests.rs           # Property-based tests with proptest
├── pgvector_compatibility_tests.rs   # pgvector regression tests
├── stress_tests.rs                   # Concurrency and memory stress tests
├── simd_consistency_tests.rs         # SIMD vs scalar consistency
├── quantized_types_test.rs           # Quantized vector types
├── parallel_execution_test.rs        # Parallel query execution
└── hnsw_index_tests.sql              # SQL-level index tests

Test Categories

1. Unit Tests

Purpose: Test individual components in isolation.

Files:

  • unit_vector_tests.rs - RuVector type
  • unit_halfvec_tests.rs - HalfVec type

Coverage:

  • Vector creation and initialization
  • Varlena serialization/deserialization
  • Vector arithmetic operations
  • String parsing and formatting
  • Memory layout and alignment
  • Edge cases and boundary conditions

Example:

#[test]
fn test_varlena_roundtrip_basic() {
    unsafe {
        let v1 = RuVector::from_slice(&[1.0, 2.0, 3.0]);
        let varlena = v1.to_varlena();
        let v2 = RuVector::from_varlena(varlena);
        assert_eq!(v1, v2);
        pgrx::pg_sys::pfree(varlena as *mut std::ffi::c_void);
    }
}
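The "String parsing and formatting" item can be covered with a round-trip test in the same spirit as the varlena example. The following is a minimal sketch, assuming the pgvector-style text form `[1,2,3]`; `parse_vector` and `format_vector` are hypothetical stand-ins working on plain `Vec<f32>`, not the extension's actual API.

```rust
/// Hypothetical parser for the pgvector-style text form "[1,2,3]".
fn parse_vector(text: &str) -> Result<Vec<f32>, String> {
    let inner = text
        .trim()
        .strip_prefix('[')
        .and_then(|s| s.strip_suffix(']'))
        .ok_or_else(|| format!("expected '[...]', got {text:?}"))?;
    if inner.trim().is_empty() {
        return Err("vector must have at least one dimension".to_string());
    }
    inner
        .split(',')
        .map(|part| {
            part.trim()
                .parse::<f32>()
                .map_err(|e| format!("invalid element {part:?}: {e}"))
        })
        .collect()
}

/// Hypothetical formatter producing the same text form.
fn format_vector(values: &[f32]) -> String {
    let parts: Vec<String> = values.iter().map(|v| v.to_string()).collect();
    format!("[{}]", parts.join(","))
}

#[test]
fn test_text_roundtrip() {
    let v = parse_vector("[1, 2.5, -3]").unwrap();
    assert_eq!(v, vec![1.0, 2.5, -3.0]);
    // Formatting and re-parsing must give back the same values.
    assert_eq!(parse_vector(&format_vector(&v)).unwrap(), v);
    // Malformed input is rejected rather than silently truncated.
    assert!(parse_vector("1,2,3").is_err());
    assert!(parse_vector("[1,abc]").is_err());
}
```

Returning `Result` instead of panicking keeps the invalid-input cases testable without `#[should_panic]`.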

2. pgrx Integration Tests

Purpose: Test the extension running inside PostgreSQL.

File: integration_distance_tests.rs

Coverage:

  • SQL operators (<->, <=>, <#>, <+>)
  • Distance functions (L2, cosine, inner product, L1)
  • SIMD consistency across vector sizes
  • Error handling and validation
  • Symmetry properties

Example:

#[pg_test]
fn test_l2_distance_basic() {
    let a = RuVector::from_slice(&[0.0, 0.0, 0.0]);
    let b = RuVector::from_slice(&[3.0, 4.0, 0.0]);
    let dist = ruvector_l2_distance(a, b);
    assert!((dist - 5.0).abs() < 1e-5);
}
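When writing expected values for these tests, it can help to have a scalar reference for each operator. The sketch below assumes the pgvector convention (`<->` is L2, `<=>` is cosine distance, `<#>` is the negative inner product, `<+>` is L1) and uses plain slices in place of `RuVector`; the function names are illustrative only.

```rust
/// L2 (Euclidean) distance, the value expected from `<->`.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// L1 (Manhattan) distance, the value expected from `<+>`.
fn l1_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Inner product; under the pgvector convention `<#>` returns its negation.
fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine distance = 1 - cosine similarity, the value expected from `<=>`.
/// Yields NaN for a zero vector in this sketch.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = inner_product(a, a).sqrt();
    let norm_b = inner_product(b, b).sqrt();
    1.0 - inner_product(a, b) / (norm_a * norm_b)
}

#[test]
fn test_reference_values() {
    let (a, b) = ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]);
    assert!((l2_distance(&a, &b) - 8.0f32.sqrt()).abs() < 1e-5);
    assert!((l1_distance(&a, &b) - 4.0).abs() < 1e-5);
    assert!((inner_product(&a, &b) - 10.0).abs() < 1e-5);
    assert!((cosine_distance(&a, &b) - (1.0 - 10.0 / 14.0)).abs() < 1e-5);
}
```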

3. Property-Based Tests

Purpose: Verify mathematical properties hold for random inputs.

File: property_based_tests.rs

Framework: proptest

Properties Tested:

Distance Functions

  • Non-negativity: d(a,b) ≥ 0
  • Symmetry: d(a,b) = d(b,a)
  • Identity: d(a,a) = 0
  • Triangle inequality: d(a,c) ≤ d(a,b) + d(b,c) (applies to the L2 and L1 metrics; cosine distance and inner product do not satisfy it in general)
  • Bounded ranges (cosine: [0,2])

Vector Operations

  • Normalization produces unit vectors
  • Addition identity: v + 0 = v
  • Subtraction inverse: (a + b) - b = a
  • Scalar multiplication: associativity, identity
  • Dot product: commutativity
  • Norm squared equals self-dot product

Example:

proptest! {
    #[test]
    fn prop_l2_distance_non_negative(
        // Generate coordinate pairs so both vectors always share a dimension;
        // independent lengths would skip most cases.
        pairs in prop::collection::vec(
            (-1000.0f32..1000.0f32, -1000.0f32..1000.0f32),
            1..100
        )
    ) {
        let (v1, v2): (Vec<f32>, Vec<f32>) = pairs.into_iter().unzip();
        let dist = euclidean_distance(&v1, &v2);
        prop_assert!(dist >= 0.0);
        prop_assert!(dist.is_finite());
    }
}
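The metric properties listed above can also be checked without any external crate, which is handy for a quick sanity run. This is a minimal sketch, not the proptest suite itself: `Lcg` is a hypothetical deterministic generator, `l2` is a scalar stand-in for the extension's distance function, and the triangle inequality is given a small slack to absorb `f32` rounding.

```rust
/// Tiny deterministic generator (LCG) so the sketch needs no external crates.
struct Lcg(u64);

impl Lcg {
    /// Returns a value in [-1000, 1000).
    fn next_f32(&mut self) -> f32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let unit = (self.0 >> 40) as f32 / (1u64 << 24) as f32; // [0, 1)
        unit * 2000.0 - 1000.0
    }

    fn vector(&mut self, dim: usize) -> Vec<f32> {
        (0..dim).map(|_| self.next_f32()).collect()
    }
}

fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// Checks non-negativity, symmetry, identity, and the triangle inequality
/// for L2 on `cases` pseudo-random triples. Returns true if all hold.
fn l2_metric_properties_hold(seed: u64, cases: usize) -> bool {
    let mut rng = Lcg(seed);
    (0..cases).all(|i| {
        let dim = 1 + i % 64;
        let (a, b, c) = (rng.vector(dim), rng.vector(dim), rng.vector(dim));
        let (ab, bc, ac) = (l2(&a, &b), l2(&b, &c), l2(&a, &c));
        let slack = 1e-4 * (ab + bc).max(1.0); // absorb f32 rounding
        ab >= 0.0 && ab == l2(&b, &a) && l2(&a, &a) == 0.0 && ac <= ab + bc + slack
    })
}

#[test]
fn test_l2_is_a_metric() {
    assert!(l2_metric_properties_hold(42, 500));
}
```

A fixed seed makes failures reproducible; proptest adds shrinking on top of that, which this sketch does not attempt.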

4. pgvector Compatibility Tests

Purpose: Ensure drop-in compatibility with pgvector.

File: pgvector_compatibility_tests.rs

Coverage:

  • Distance calculation parity
  • Operator symbol compatibility
  • Array conversion functions
  • Text format parsing
  • Known regression values
  • High-dimensional vectors
  • Nearest neighbor ordering

Example:

#[pg_test]
fn test_pgvector_example_l2() {
    // Example from pgvector docs
    let a = RuVector::from_slice(&[1.0, 2.0, 3.0]);
    let b = RuVector::from_slice(&[3.0, 2.0, 1.0]);
    let dist = ruvector_l2_distance(a, b);
    // sqrt(8) ≈ 2.828
    assert!((dist - 2.828427).abs() < 0.001);
}
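Nearest neighbor ordering can be verified against a brute-force reference that sorts rows by distance. A minimal sketch follows, with plain `Vec<f32>` rows standing in for table data; `nearest_neighbor_order` is a hypothetical helper that mirrors what `ORDER BY embedding <-> query` is expected to return.

```rust
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// Returns the indices of `rows` ordered by ascending L2 distance to `query`.
fn nearest_neighbor_order(query: &[f32], rows: &[Vec<f32>]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..rows.len()).collect();
    // total_cmp gives a total order over f32, so a NaN cannot break the sort.
    order.sort_by(|&i, &j| l2(query, &rows[i]).total_cmp(&l2(query, &rows[j])));
    order
}

#[test]
fn test_nearest_neighbor_ordering() {
    let rows = vec![vec![3.0, 4.0], vec![1.0, 0.0], vec![0.0, 0.0]];
    // Distances from the origin are 5, 1, and 0 respectively.
    assert_eq!(nearest_neighbor_order(&[0.0, 0.0], &rows), vec![2, 1, 0]);
}
```

Because `sort_by` is stable, rows at equal distance keep their input order, so a comparison against index results should treat ties as interchangeable.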

5. Stress Tests

Purpose: Verify stability under load and concurrency.

File: stress_tests.rs

Coverage:

  • Concurrent vector creation (8 threads × 100 vectors)
  • Concurrent distance calculations (16 threads × 1000 ops)
  • Large batch allocations (10,000 vectors)
  • Memory reuse patterns
  • Thread safety (shared read-only access)
  • Varlena round-trip stress (10,000 iterations)

Example:

use std::sync::Arc;
use std::thread;

#[test]
fn test_concurrent_distance_calculations() {
    let num_threads = 16;
    let calculations_per_thread = 1000;
    let v1 = Arc::new(RuVector::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0]));
    let v2 = Arc::new(RuVector::from_slice(&[5.0, 4.0, 3.0, 2.0, 1.0]));

    let handles: Vec<_> = (0..num_threads)
        .map(|_| {
            let v1 = Arc::clone(&v1);
            let v2 = Arc::clone(&v2);
            thread::spawn(move || {
                for _ in 0..calculations_per_thread {
                    let _ = v1.dot(&*v2);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
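A stress test is more useful when each thread's result is checked rather than discarded, since a data race would then show up as a wrong value instead of passing silently. The following sketch assumes nothing about `RuVector`: it shares plain `Vec<f32>` data through `Arc`, and `dot` and `concurrent_dot` are illustrative names.

```rust
use std::sync::Arc;
use std::thread;

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Spawns `num_threads` workers that each compute the same dot product
/// `iterations` times, and returns every worker's final result.
fn concurrent_dot(num_threads: usize, iterations: usize) -> Vec<f32> {
    let a: Arc<Vec<f32>> = Arc::new(vec![1.0, 2.0, 3.0, 4.0, 5.0]);
    let b: Arc<Vec<f32>> = Arc::new(vec![5.0, 4.0, 3.0, 2.0, 1.0]);

    let handles: Vec<_> = (0..num_threads)
        .map(|_| {
            let (a, b) = (Arc::clone(&a), Arc::clone(&b));
            thread::spawn(move || {
                let mut last = 0.0;
                for _ in 0..iterations {
                    last = dot(&a, &b);
                }
                last
            })
        })
        .collect();

    // join() surfaces a panic from any worker as an Err.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}

#[test]
fn test_concurrent_results_agree() {
    // 1*5 + 2*4 + 3*3 + 4*2 + 5*1 = 35
    assert_eq!(concurrent_dot(16, 1000), vec![35.0; 16]);
}
```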

6. SIMD Consistency Tests

Purpose: Verify SIMD implementations match scalar fallback.

File: simd_consistency_tests.rs

Coverage:

  • AVX-512, AVX2, NEON vs scalar
  • Various vector sizes (1, 7, 8, 15, 16, 31, 32, 64, 128, 256)
  • Negative values
  • Zero vectors
  • Small and large values
  • Random data (100 iterations)

Example:

#[test]
fn test_euclidean_scalar_vs_simd_various_sizes() {
    for size in [8, 16, 32, 64, 128, 256] {
        let a: Vec<f32> = (0..size).map(|i| i as f32 * 0.1).collect();
        let b: Vec<f32> = (0..size).map(|i| (size - i) as f32 * 0.1).collect();

        let scalar = scalar::euclidean_distance(&a, &b);

        #[cfg(target_arch = "x86_64")]
        if is_x86_feature_detected!("avx2") {
            let simd = simd::euclidean_distance_avx2_wrapper(&a, &b);
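            // Use a relative tolerance: for the larger sizes the distance exceeds 100,
            // where a fixed 1e-5 is smaller than one f32 ULP.
            let tolerance = 1e-5 * scalar.abs().max(1.0);
            assert!((scalar - simd).abs() <= tolerance);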
        }
    }
}
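The reason SIMD and scalar results differ at all is summation order: a vectorized kernel keeps several independent accumulators and handles the leftover elements separately. The sketch below imitates that structure in portable Rust so the comparison logic can be exercised on any machine, including the non-multiple-of-8 sizes from the coverage list. `l2_lanes8` is a stand-in for illustration, not the extension's AVX2 kernel.

```rust
/// Sequential scalar reference.
fn l2_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// Portable stand-in for a SIMD kernel: 8 independent accumulators
/// (like 8 f32 lanes) plus a scalar tail for the remainder.
fn l2_lanes8(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    let mut acc = [0.0f32; 8];
    let mut chunks_a = a.chunks_exact(8);
    let mut chunks_b = b.chunks_exact(8);
    for (ca, cb) in chunks_a.by_ref().zip(chunks_b.by_ref()) {
        for lane in 0..8 {
            let d = ca[lane] - cb[lane];
            acc[lane] += d * d;
        }
    }
    // Elements that do not fill a whole chunk are handled one by one.
    let tail: f32 = chunks_a
        .remainder()
        .iter()
        .zip(chunks_b.remainder())
        .map(|(x, y)| (x - y) * (x - y))
        .sum();
    (acc.iter().sum::<f32>() + tail).sqrt()
}

/// Relative comparison, since the two summation orders round differently.
fn close(x: f32, y: f32) -> bool {
    (x - y).abs() <= 1e-5 * x.abs().max(y.abs()).max(1.0)
}

/// Returns true if both implementations agree for every size in `sizes`.
fn lanes_match_scalar(sizes: &[usize]) -> bool {
    sizes.iter().all(|&n| {
        let a: Vec<f32> = (0..n).map(|i| i as f32 * 0.1).collect();
        let b: Vec<f32> = (0..n).map(|i| (n - i) as f32 * 0.1).collect();
        close(l2_scalar(&a, &b), l2_lanes8(&a, &b))
    })
}

#[test]
fn test_lanes_match_scalar() {
    assert!(lanes_match_scalar(&[1, 7, 8, 15, 16, 31, 32, 64, 128, 256]));
}
```

Sizes such as 7, 15, and 31 exercise the tail path, which is where off-by-one errors in real SIMD kernels tend to hide.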

Running Tests

All Tests

# From the ruvector repository root
cd crates/ruvector-postgres
cargo test

Specific Test Suite

# Tests defined inside the library crate (src/) only
cargo test --lib

# All test targets under tests/ (Cargo treats each file there as an integration test)
cargo test --test '*'

# Specific test file
cargo test --test unit_vector_tests

# Property-based tests
cargo test --test property_based_tests

pgrx Tests

# Requires PostgreSQL 14, 15, or 16
cargo pgrx test pg16

# Run specific pgrx test
cargo pgrx test pg16 test_l2_distance_basic

With Coverage

# Install tarpaulin
cargo install cargo-tarpaulin

# Generate coverage report
cargo tarpaulin --out Html --output-dir coverage

Test Metrics

Current Coverage

Overall: ~85% line coverage

By Component:

  • Core types: 92%
  • Distance functions: 95%
  • Operators: 88%
  • Index implementations: 75%
  • Quantization: 82%

Performance Benchmarks

Distance Calculations (1M pairs, 128 dimensions):

  • Scalar: 120ms
  • AVX2: 45ms (2.7x faster)
  • AVX-512: 32ms (3.8x faster)

Vector Operations:

  • Normalization: 15μs/vector (1024 dims)
  • Varlena roundtrip: 2.5μs/vector
  • String parsing: 8μs/vector

Debugging Failed Tests

Common Issues

  1. Floating Point Precision

    // ❌ Too strict
    assert_eq!(result, expected);
    
    // ✅ Use epsilon
    assert!((result - expected).abs() < 1e-5);
    
  2. SIMD Availability

    #[cfg(target_arch = "x86_64")]
    if is_x86_feature_detected!("avx2") {
        // Run AVX2 test
    }
    
  3. PostgreSQL Memory Management

    unsafe {
        let ptr = v.to_varlena();
        // Use ptr...
        pgrx::pg_sys::pfree(ptr as *mut std::ffi::c_void);
    }
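A fixed epsilon works for values near 1 but becomes too strict for large magnitudes and too loose for tiny ones. One common approach is to accept either an absolute or a relative tolerance. The helper below is a minimal sketch; `approx_eq` and its parameter names are illustrative and not part of the crate.

```rust
/// True when `a` and `b` agree within an absolute tolerance (useful near zero)
/// or a relative tolerance (useful for large magnitudes).
fn approx_eq(a: f32, b: f32, abs_tol: f32, rel_tol: f32) -> bool {
    if a == b {
        return true; // also covers equal infinities
    }
    let diff = (a - b).abs();
    // Any comparison involving NaN is false, so NaN never compares as equal.
    diff <= abs_tol || diff <= rel_tol * a.abs().max(b.abs())
}

#[test]
fn test_approx_eq() {
    assert!(approx_eq(0.1 + 0.2, 0.3, 1e-6, 1e-6));
    // Off by 1.0, but that is only one part in a million.
    assert!(approx_eq(1.0e6, 1.0e6 + 1.0, 1e-5, 1e-5));
    assert!(!approx_eq(1.0, 1.1, 1e-5, 1e-5));
    assert!(!approx_eq(f32::NAN, f32::NAN, 1e-5, 1e-5));
}
```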
    

Verbose Output

cargo test -- --nocapture --test-threads=1

Running Single Test

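# Substring filter: runs every test whose path contains this name
cargo test test_l2_distance_basic

# Exact match: --exact compares against the full test path, including any module prefix
cargo test test_l2_distance_basic -- --exact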

CI/CD Integration

GitHub Actions

name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
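      - uses: actions/checkout@v4
      - name: Run tests
        run: cargo test --all-features
      # cargo-pgrx must be installed and initialized before pgrx tests can run.
      # Its version should match the pgrx dependency in Cargo.toml.
      - name: Install cargo-pgrx
        run: cargo install --locked cargo-pgrx
      - name: Initialize pgrx with PostgreSQL 16
        run: cargo pgrx init --pg16 download
      - name: Run pgrx tests
        run: cargo pgrx test pg16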

Test Development Guidelines

1. Test Naming

  • Use descriptive names: test_l2_distance_basic
  • Group related tests: test_l2_*, test_cosine_*
  • Indicate expected behavior: test_parse_invalid

2. Test Structure

#[test]
fn test_feature_scenario() {
    // Arrange
    let input = setup_test_data();

    // Act
    let result = perform_operation(input);

    // Assert
    assert_eq!(result, expected);
}

3. Edge Cases

Always test:

  • Empty input
  • Single element
  • Very large input
  • Negative values
  • Zero values
  • Boundary values
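The checklist above translates naturally into one focused test per operation. As an illustration, here is a sketch for normalization; `normalize` is a hypothetical helper that returns `None` for inputs with no usable norm, which may differ from how the extension handles these cases.

```rust
/// Hypothetical normalization: returns None for empty, zero-norm, or
/// overflowing input instead of dividing by zero or infinity.
fn normalize(v: &[f32]) -> Option<Vec<f32>> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 || !norm.is_finite() {
        return None;
    }
    Some(v.iter().map(|x| x / norm).collect())
}

#[test]
fn test_normalize_edge_cases() {
    assert_eq!(normalize(&[]), None); // empty input
    assert_eq!(normalize(&[0.0, 0.0]), None); // zero values
    assert_eq!(normalize(&[-4.0]), Some(vec![-1.0])); // single negative element
    assert_eq!(normalize(&[f32::MAX, f32::MAX]), None); // squared norm overflows

    let unit = normalize(&[3.0, 4.0]).unwrap();
    assert!((unit[0] - 0.6).abs() < 1e-6 && (unit[1] - 0.8).abs() < 1e-6);
}
```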

4. Error Cases

#[test]
#[should_panic(expected = "dimension mismatch")]
fn test_invalid_dimensions() {
    let a = RuVector::from_slice(&[1.0, 2.0]);
    let b = RuVector::from_slice(&[1.0, 2.0, 3.0]);
    let _ = a.add(&b); // Should panic
}
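`#[should_panic(expected = "...")]` passes when the panic message contains the given substring, so the message text is part of the contract being tested. The sketch below shows a self-contained version of the pattern together with a `Result`-returning variant that can be asserted on directly; `add` and `try_add` are hypothetical functions over slices, not `RuVector` methods.

```rust
/// Hypothetical element-wise addition that panics on mismatched dimensions.
fn add(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "dimension mismatch: {} vs {}", a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Non-panicking variant that reports the same condition as an error.
fn try_add(a: &[f32], b: &[f32]) -> Result<Vec<f32>, String> {
    if a.len() != b.len() {
        return Err(format!("dimension mismatch: {} vs {}", a.len(), b.len()));
    }
    Ok(add(a, b))
}

#[test]
#[should_panic(expected = "dimension mismatch")]
fn test_add_panics_on_mismatch() {
    let _ = add(&[1.0, 2.0], &[1.0, 2.0, 3.0]);
}

#[test]
fn test_try_add_reports_mismatch() {
    let err = try_add(&[1.0, 2.0], &[1.0, 2.0, 3.0]).unwrap_err();
    assert!(err.contains("dimension mismatch"));
}
```

The `Result` form lets a test inspect the error value itself, whereas `#[should_panic]` can only match on the message.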

Future Test Additions

Planned

  • Fuzzing tests with cargo-fuzz
  • Performance regression tests
  • Index corruption recovery tests
  • Multi-node distributed tests
  • Backup/restore validation

Nice to Have

  • SQL injection tests
  • Authentication/authorization tests
  • Compatibility matrix (PostgreSQL versions)
  • Platform-specific tests (Windows, macOS, ARM)

Resources

Support

For test failures or questions:

  1. Check existing issues: https://github.com/ruvnet/ruvector/issues
  2. Run with verbose output
  3. Check PostgreSQL logs
  4. Create minimal reproduction case