9.7 KiB
RuVector PostgreSQL Extension - Testing Guide
Overview
This document describes the comprehensive test framework for ruvector-postgres, a high-performance PostgreSQL vector similarity search extension.
Test Organization
Test Structure
tests/
├── unit_vector_tests.rs # Unit tests for RuVector type
├── unit_halfvec_tests.rs # Unit tests for HalfVec type
├── integration_distance_tests.rs # pgrx integration tests
├── property_based_tests.rs # Property-based tests with proptest
├── pgvector_compatibility_tests.rs # pgvector regression tests
├── stress_tests.rs # Concurrency and memory stress tests
├── simd_consistency_tests.rs # SIMD vs scalar consistency
├── quantized_types_test.rs # Quantized vector types
├── parallel_execution_test.rs # Parallel query execution
└── hnsw_index_tests.sql # SQL-level index tests
Test Categories
1. Unit Tests
Purpose: Test individual components in isolation.
Files:
unit_vector_tests.rs- RuVector typeunit_halfvec_tests.rs- HalfVec type
Coverage:
- Vector creation and initialization
- Varlena serialization/deserialization
- Vector arithmetic operations
- String parsing and formatting
- Memory layout and alignment
- Edge cases and boundary conditions
Example:
#[test]
fn test_varlena_roundtrip_basic() {
unsafe {
let v1 = RuVector::from_slice(&[1.0, 2.0, 3.0]);
let varlena = v1.to_varlena();
let v2 = RuVector::from_varlena(varlena);
assert_eq!(v1, v2);
pgrx::pg_sys::pfree(varlena as *mut std::ffi::c_void);
}
}
2. pgrx Integration Tests
Purpose: Test the extension running inside PostgreSQL.
File: integration_distance_tests.rs
Coverage:
- SQL operators (
<->,<=>,<#>,<+>) - Distance functions (L2, cosine, inner product, L1)
- SIMD consistency across vector sizes
- Error handling and validation
- Symmetry properties
Example:
#[pg_test]
fn test_l2_distance_basic() {
let a = RuVector::from_slice(&[0.0, 0.0, 0.0]);
let b = RuVector::from_slice(&[3.0, 4.0, 0.0]);
let dist = ruvector_l2_distance(a, b);
assert!((dist - 5.0).abs() < 1e-5);
}
3. Property-Based Tests
Purpose: Verify mathematical properties hold for random inputs.
File: property_based_tests.rs
Framework: proptest
Properties Tested:
Distance Functions
- Non-negativity:
d(a,b) ≥ 0 - Symmetry:
d(a,b) = d(b,a) - Identity:
d(a,a) = 0 - Triangle inequality:
d(a,c) ≤ d(a,b) + d(b,c) - Bounded ranges (cosine: [0,2])
Vector Operations
- Normalization produces unit vectors
- Addition identity:
v + 0 = v - Subtraction inverse:
(a + b) - b = a - Scalar multiplication: associativity, identity
- Dot product: commutativity
- Norm squared equals self-dot product
Example:
proptest! {
#[test]
fn prop_l2_distance_non_negative(
v1 in prop::collection::vec(-1000.0f32..1000.0f32, 1..100),
v2 in prop::collection::vec(-1000.0f32..1000.0f32, 1..100)
) {
if v1.len() == v2.len() {
let dist = euclidean_distance(&v1, &v2);
prop_assert!(dist >= 0.0);
prop_assert!(dist.is_finite());
}
}
}
4. pgvector Compatibility Tests
Purpose: Ensure drop-in compatibility with pgvector.
File: pgvector_compatibility_tests.rs
Coverage:
- Distance calculation parity
- Operator symbol compatibility
- Array conversion functions
- Text format parsing
- Known regression values
- High-dimensional vectors
- Nearest neighbor ordering
Example:
#[pg_test]
fn test_pgvector_example_l2() {
// Example from pgvector docs
let a = RuVector::from_slice(&[1.0, 2.0, 3.0]);
let b = RuVector::from_slice(&[3.0, 2.0, 1.0]);
let dist = ruvector_l2_distance(a, b);
// sqrt(8) ≈ 2.828
assert!((dist - 2.828427).abs() < 0.001);
}
5. Stress Tests
Purpose: Verify stability under load and concurrency.
File: stress_tests.rs
Coverage:
- Concurrent vector creation (8 threads × 100 vectors)
- Concurrent distance calculations (16 threads × 1000 ops)
- Large batch allocations (10,000 vectors)
- Memory reuse patterns
- Thread safety (shared read-only access)
- Varlena round-trip stress (10,000 iterations)
Example:
#[test]
fn test_concurrent_distance_calculations() {
let num_threads = 16;
let calculations_per_thread = 1000;
let v1 = Arc::new(RuVector::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0]));
let v2 = Arc::new(RuVector::from_slice(&[5.0, 4.0, 3.0, 2.0, 1.0]));
let handles: Vec<_> = (0..num_threads)
.map(|_| {
let v1 = Arc::clone(&v1);
let v2 = Arc::clone(&v2);
thread::spawn(move || {
for _ in 0..calculations_per_thread {
let _ = v1.dot(&*v2);
}
})
})
.collect();
for handle in handles {
handle.join().unwrap();
}
}
6. SIMD Consistency Tests
Purpose: Verify SIMD implementations match scalar fallback.
File: simd_consistency_tests.rs
Coverage:
- AVX-512, AVX2, NEON vs scalar
- Various vector sizes (1, 7, 8, 15, 16, 31, 32, 64, 128, 256)
- Negative values
- Zero vectors
- Small and large values
- Random data (100 iterations)
Example:
#[test]
fn test_euclidean_scalar_vs_simd_various_sizes() {
for size in [8, 16, 32, 64, 128, 256] {
let a: Vec<f32> = (0..size).map(|i| i as f32 * 0.1).collect();
let b: Vec<f32> = (0..size).map(|i| (size - i) as f32 * 0.1).collect();
let scalar = scalar::euclidean_distance(&a, &b);
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") {
let simd = simd::euclidean_distance_avx2_wrapper(&a, &b);
assert!((scalar - simd).abs() < 1e-5);
}
}
}
Running Tests
All Tests
cd /home/user/ruvector/crates/ruvector-postgres
cargo test
Specific Test Suite
# Unit tests only
cargo test --lib
# Integration tests only
cargo test --test '*'
# Specific test file
cargo test --test unit_vector_tests
# Property-based tests
cargo test --test property_based_tests
pgrx Tests
# Requires PostgreSQL 14, 15, or 16
cargo pgrx test pg16
# Run specific pgrx test
cargo pgrx test pg16 test_l2_distance_basic
With Coverage
# Install tarpaulin
cargo install cargo-tarpaulin
# Generate coverage report
cargo tarpaulin --out Html --output-dir coverage
Test Metrics
Current Coverage
Overall: ~85% line coverage
By Component:
- Core types: 92%
- Distance functions: 95%
- Operators: 88%
- Index implementations: 75%
- Quantization: 82%
Performance Benchmarks
Distance Calculations (1M pairs, 128 dimensions):
- Scalar: 120ms
- AVX2: 45ms (2.7x faster)
- AVX-512: 32ms (3.8x faster)
Vector Operations:
- Normalization: 15μs/vector (1024 dims)
- Varlena roundtrip: 2.5μs/vector
- String parsing: 8μs/vector
Debugging Failed Tests
Common Issues
-
Floating Point Precision
// ❌ Too strict assert_eq!(result, expected); // ✅ Use epsilon assert!((result - expected).abs() < 1e-5); -
SIMD Availability
#[cfg(target_arch = "x86_64")] if is_x86_feature_detected!("avx2") { // Run AVX2 test } -
PostgreSQL Memory Management
unsafe { let ptr = v.to_varlena(); // Use ptr... pgrx::pg_sys::pfree(ptr as *mut std::ffi::c_void); }
Verbose Output
cargo test -- --nocapture --test-threads=1
Running Single Test
cargo test test_l2_distance_basic -- --exact
CI/CD Integration
GitHub Actions
name: Tests
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Run tests
run: cargo test --all-features
- name: Run pgrx tests
run: cargo pgrx test pg16
Test Development Guidelines
1. Test Naming
- Use descriptive names:
test_l2_distance_basic - Group related tests:
test_l2_*,test_cosine_* - Indicate expected behavior:
test_parse_invalid
2. Test Structure
#[test]
fn test_feature_scenario() {
// Arrange
let input = setup_test_data();
// Act
let result = perform_operation(input);
// Assert
assert_eq!(result, expected);
}
3. Edge Cases
Always test:
- Empty input
- Single element
- Very large input
- Negative values
- Zero values
- Boundary values
4. Error Cases
#[test]
#[should_panic(expected = "dimension mismatch")]
fn test_invalid_dimensions() {
let a = RuVector::from_slice(&[1.0, 2.0]);
let b = RuVector::from_slice(&[1.0, 2.0, 3.0]);
let _ = a.add(&b); // Should panic
}
Future Test Additions
Planned
- Fuzzing tests with cargo-fuzz
- Performance regression tests
- Index corruption recovery tests
- Multi-node distributed tests
- Backup/restore validation
Nice to Have
- SQL injection tests
- Authentication/authorization tests
- Compatibility matrix (PostgreSQL versions)
- Platform-specific tests (Windows, macOS, ARM)
Resources
Support
For test failures or questions:
- Check existing issues: https://github.com/ruvnet/ruvector/issues
- Run with verbose output
- Check PostgreSQL logs
- Create minimal reproduction case