# RuVector PostgreSQL Extension - Testing Guide ## Overview This document describes the comprehensive test framework for ruvector-postgres, a high-performance PostgreSQL vector similarity search extension. ## Test Organization ### Test Structure ``` tests/ ├── unit_vector_tests.rs # Unit tests for RuVector type ├── unit_halfvec_tests.rs # Unit tests for HalfVec type ├── integration_distance_tests.rs # pgrx integration tests ├── property_based_tests.rs # Property-based tests with proptest ├── pgvector_compatibility_tests.rs # pgvector regression tests ├── stress_tests.rs # Concurrency and memory stress tests ├── simd_consistency_tests.rs # SIMD vs scalar consistency ├── quantized_types_test.rs # Quantized vector types ├── parallel_execution_test.rs # Parallel query execution └── hnsw_index_tests.sql # SQL-level index tests ``` ## Test Categories ### 1. Unit Tests **Purpose**: Test individual components in isolation. **Files**: - `unit_vector_tests.rs` - RuVector type - `unit_halfvec_tests.rs` - HalfVec type **Coverage**: - Vector creation and initialization - Varlena serialization/deserialization - Vector arithmetic operations - String parsing and formatting - Memory layout and alignment - Edge cases and boundary conditions **Example**: ```rust #[test] fn test_varlena_roundtrip_basic() { unsafe { let v1 = RuVector::from_slice(&[1.0, 2.0, 3.0]); let varlena = v1.to_varlena(); let v2 = RuVector::from_varlena(varlena); assert_eq!(v1, v2); pgrx::pg_sys::pfree(varlena as *mut std::ffi::c_void); } } ``` ### 2. pgrx Integration Tests **Purpose**: Test the extension running inside PostgreSQL. **File**: `integration_distance_tests.rs` **Coverage**: - SQL operators (`<->`, `<=>`, `<#>`, `<+>`) - Distance functions (L2, cosine, inner product, L1) - SIMD consistency across vector sizes - Error handling and validation - Symmetry properties **Example**: ```rust #[pg_test] fn test_l2_distance_basic() { let a = RuVector::from_slice(&[0.0, 0.0, 0.0]); let b = RuVector::from_slice(&[3.0, 4.0, 0.0]); let dist = ruvector_l2_distance(a, b); assert!((dist - 5.0).abs() < 1e-5); } ``` ### 3. Property-Based Tests **Purpose**: Verify mathematical properties hold for random inputs. **File**: `property_based_tests.rs` **Framework**: `proptest` **Properties Tested**: #### Distance Functions - Non-negativity: `d(a,b) ≥ 0` - Symmetry: `d(a,b) = d(b,a)` - Identity: `d(a,a) = 0` - Triangle inequality: `d(a,c) ≤ d(a,b) + d(b,c)` - Bounded ranges (cosine: [0,2]) #### Vector Operations - Normalization produces unit vectors - Addition identity: `v + 0 = v` - Subtraction inverse: `(a + b) - b = a` - Scalar multiplication: associativity, identity - Dot product: commutativity - Norm squared equals self-dot product **Example**: ```rust proptest! { #[test] fn prop_l2_distance_non_negative( v1 in prop::collection::vec(-1000.0f32..1000.0f32, 1..100), v2 in prop::collection::vec(-1000.0f32..1000.0f32, 1..100) ) { if v1.len() == v2.len() { let dist = euclidean_distance(&v1, &v2); prop_assert!(dist >= 0.0); prop_assert!(dist.is_finite()); } } } ``` ### 4. pgvector Compatibility Tests **Purpose**: Ensure drop-in compatibility with pgvector. **File**: `pgvector_compatibility_tests.rs` **Coverage**: - Distance calculation parity - Operator symbol compatibility - Array conversion functions - Text format parsing - Known regression values - High-dimensional vectors - Nearest neighbor ordering **Example**: ```rust #[pg_test] fn test_pgvector_example_l2() { // Example from pgvector docs let a = RuVector::from_slice(&[1.0, 2.0, 3.0]); let b = RuVector::from_slice(&[3.0, 2.0, 1.0]); let dist = ruvector_l2_distance(a, b); // sqrt(8) ≈ 2.828 assert!((dist - 2.828427).abs() < 0.001); } ``` ### 5. Stress Tests **Purpose**: Verify stability under load and concurrency. **File**: `stress_tests.rs` **Coverage**: - Concurrent vector creation (8 threads × 100 vectors) - Concurrent distance calculations (16 threads × 1000 ops) - Large batch allocations (10,000 vectors) - Memory reuse patterns - Thread safety (shared read-only access) - Varlena round-trip stress (10,000 iterations) **Example**: ```rust #[test] fn test_concurrent_distance_calculations() { let num_threads = 16; let calculations_per_thread = 1000; let v1 = Arc::new(RuVector::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0])); let v2 = Arc::new(RuVector::from_slice(&[5.0, 4.0, 3.0, 2.0, 1.0])); let handles: Vec<_> = (0..num_threads) .map(|_| { let v1 = Arc::clone(&v1); let v2 = Arc::clone(&v2); thread::spawn(move || { for _ in 0..calculations_per_thread { let _ = v1.dot(&*v2); } }) }) .collect(); for handle in handles { handle.join().unwrap(); } } ``` ### 6. SIMD Consistency Tests **Purpose**: Verify SIMD implementations match scalar fallback. **File**: `simd_consistency_tests.rs` **Coverage**: - AVX-512, AVX2, NEON vs scalar - Various vector sizes (1, 7, 8, 15, 16, 31, 32, 64, 128, 256) - Negative values - Zero vectors - Small and large values - Random data (100 iterations) **Example**: ```rust #[test] fn test_euclidean_scalar_vs_simd_various_sizes() { for size in [8, 16, 32, 64, 128, 256] { let a: Vec = (0..size).map(|i| i as f32 * 0.1).collect(); let b: Vec = (0..size).map(|i| (size - i) as f32 * 0.1).collect(); let scalar = scalar::euclidean_distance(&a, &b); #[cfg(target_arch = "x86_64")] if is_x86_feature_detected!("avx2") { let simd = simd::euclidean_distance_avx2_wrapper(&a, &b); assert!((scalar - simd).abs() < 1e-5); } } } ``` ## Running Tests ### All Tests ```bash cd /home/user/ruvector/crates/ruvector-postgres cargo test ``` ### Specific Test Suite ```bash # Unit tests only cargo test --lib # Integration tests only cargo test --test '*' # Specific test file cargo test --test unit_vector_tests # Property-based tests cargo test --test property_based_tests ``` ### pgrx Tests ```bash # Requires PostgreSQL 14, 15, or 16 cargo pgrx test pg16 # Run specific pgrx test cargo pgrx test pg16 test_l2_distance_basic ``` ### With Coverage ```bash # Install tarpaulin cargo install cargo-tarpaulin # Generate coverage report cargo tarpaulin --out Html --output-dir coverage ``` ## Test Metrics ### Current Coverage **Overall**: ~85% line coverage **By Component**: - Core types: 92% - Distance functions: 95% - Operators: 88% - Index implementations: 75% - Quantization: 82% ### Performance Benchmarks **Distance Calculations** (1M pairs, 128 dimensions): - Scalar: 120ms - AVX2: 45ms (2.7x faster) - AVX-512: 32ms (3.8x faster) **Vector Operations**: - Normalization: 15μs/vector (1024 dims) - Varlena roundtrip: 2.5μs/vector - String parsing: 8μs/vector ## Debugging Failed Tests ### Common Issues 1. **Floating Point Precision** ```rust // ❌ Too strict assert_eq!(result, expected); // ✅ Use epsilon assert!((result - expected).abs() < 1e-5); ``` 2. **SIMD Availability** ```rust #[cfg(target_arch = "x86_64")] if is_x86_feature_detected!("avx2") { // Run AVX2 test } ``` 3. **PostgreSQL Memory Management** ```rust unsafe { let ptr = v.to_varlena(); // Use ptr... pgrx::pg_sys::pfree(ptr as *mut std::ffi::c_void); } ``` ### Verbose Output ```bash cargo test -- --nocapture --test-threads=1 ``` ### Running Single Test ```bash cargo test test_l2_distance_basic -- --exact ``` ## CI/CD Integration ### GitHub Actions ```yaml name: Tests on: [push, pull_request] jobs: test: runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - name: Run tests run: cargo test --all-features - name: Run pgrx tests run: cargo pgrx test pg16 ``` ## Test Development Guidelines ### 1. Test Naming - Use descriptive names: `test_l2_distance_basic` - Group related tests: `test_l2_*`, `test_cosine_*` - Indicate expected behavior: `test_parse_invalid` ### 2. Test Structure ```rust #[test] fn test_feature_scenario() { // Arrange let input = setup_test_data(); // Act let result = perform_operation(input); // Assert assert_eq!(result, expected); } ``` ### 3. Edge Cases Always test: - Empty input - Single element - Very large input - Negative values - Zero values - Boundary values ### 4. Error Cases ```rust #[test] #[should_panic(expected = "dimension mismatch")] fn test_invalid_dimensions() { let a = RuVector::from_slice(&[1.0, 2.0]); let b = RuVector::from_slice(&[1.0, 2.0, 3.0]); let _ = a.add(&b); // Should panic } ``` ## Future Test Additions ### Planned - [ ] Fuzzing tests with cargo-fuzz - [ ] Performance regression tests - [ ] Index corruption recovery tests - [ ] Multi-node distributed tests - [ ] Backup/restore validation ### Nice to Have - [ ] SQL injection tests - [ ] Authentication/authorization tests - [ ] Compatibility matrix (PostgreSQL versions) - [ ] Platform-specific tests (Windows, macOS, ARM) ## Resources - [pgrx Testing Documentation](https://github.com/tcdi/pgrx) - [proptest Book](https://altsysrq.github.io/proptest-book/) - [Rust Testing Guide](https://doc.rust-lang.org/book/ch11-00-testing.html) - [pgvector Test Suite](https://github.com/pgvector/pgvector/tree/master/test) ## Support For test failures or questions: 1. Check existing issues: https://github.com/ruvnet/ruvector/issues 2. Run with verbose output 3. Check PostgreSQL logs 4. Create minimal reproduction case