Files
wifi-densepose/crates/ruvector-postgres/docs/TESTING.md
ruv d803bfe2b1 Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00

419 lines
9.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# RuVector PostgreSQL Extension - Testing Guide
## Overview
This document describes the comprehensive test framework for ruvector-postgres, a high-performance PostgreSQL vector similarity search extension.
## Test Organization
### Test Structure
```
tests/
├── unit_vector_tests.rs # Unit tests for RuVector type
├── unit_halfvec_tests.rs # Unit tests for HalfVec type
├── integration_distance_tests.rs # pgrx integration tests
├── property_based_tests.rs # Property-based tests with proptest
├── pgvector_compatibility_tests.rs # pgvector regression tests
├── stress_tests.rs # Concurrency and memory stress tests
├── simd_consistency_tests.rs # SIMD vs scalar consistency
├── quantized_types_test.rs # Quantized vector types
├── parallel_execution_test.rs # Parallel query execution
└── hnsw_index_tests.sql # SQL-level index tests
```
## Test Categories
### 1. Unit Tests
**Purpose**: Test individual components in isolation.
**Files**:
- `unit_vector_tests.rs` - RuVector type
- `unit_halfvec_tests.rs` - HalfVec type
**Coverage**:
- Vector creation and initialization
- Varlena serialization/deserialization
- Vector arithmetic operations
- String parsing and formatting
- Memory layout and alignment
- Edge cases and boundary conditions
**Example**:
```rust
#[test]
fn test_varlena_roundtrip_basic() {
unsafe {
let v1 = RuVector::from_slice(&[1.0, 2.0, 3.0]);
let varlena = v1.to_varlena();
let v2 = RuVector::from_varlena(varlena);
assert_eq!(v1, v2);
pgrx::pg_sys::pfree(varlena as *mut std::ffi::c_void);
}
}
```
### 2. pgrx Integration Tests
**Purpose**: Test the extension running inside PostgreSQL.
**File**: `integration_distance_tests.rs`
**Coverage**:
- SQL operators (`<->`, `<=>`, `<#>`, `<+>`)
- Distance functions (L2, cosine, inner product, L1)
- SIMD consistency across vector sizes
- Error handling and validation
- Symmetry properties
**Example**:
```rust
#[pg_test]
fn test_l2_distance_basic() {
let a = RuVector::from_slice(&[0.0, 0.0, 0.0]);
let b = RuVector::from_slice(&[3.0, 4.0, 0.0]);
let dist = ruvector_l2_distance(a, b);
assert!((dist - 5.0).abs() < 1e-5);
}
```
### 3. Property-Based Tests
**Purpose**: Verify mathematical properties hold for random inputs.
**File**: `property_based_tests.rs`
**Framework**: `proptest`
**Properties Tested**:
#### Distance Functions
- Non-negativity: `d(a,b) ≥ 0`
- Symmetry: `d(a,b) = d(b,a)`
- Identity: `d(a,a) = 0`
- Triangle inequality: `d(a,c) ≤ d(a,b) + d(b,c)`
- Bounded ranges (cosine: [0,2])
#### Vector Operations
- Normalization produces unit vectors
- Addition identity: `v + 0 = v`
- Subtraction inverse: `(a + b) - b = a`
- Scalar multiplication: associativity, identity
- Dot product: commutativity
- Norm squared equals self-dot product
**Example**:
```rust
proptest! {
#[test]
fn prop_l2_distance_non_negative(
v1 in prop::collection::vec(-1000.0f32..1000.0f32, 1..100),
v2 in prop::collection::vec(-1000.0f32..1000.0f32, 1..100)
) {
if v1.len() == v2.len() {
let dist = euclidean_distance(&v1, &v2);
prop_assert!(dist >= 0.0);
prop_assert!(dist.is_finite());
}
}
}
```
### 4. pgvector Compatibility Tests
**Purpose**: Ensure drop-in compatibility with pgvector.
**File**: `pgvector_compatibility_tests.rs`
**Coverage**:
- Distance calculation parity
- Operator symbol compatibility
- Array conversion functions
- Text format parsing
- Known regression values
- High-dimensional vectors
- Nearest neighbor ordering
**Example**:
```rust
#[pg_test]
fn test_pgvector_example_l2() {
// Example from pgvector docs
let a = RuVector::from_slice(&[1.0, 2.0, 3.0]);
let b = RuVector::from_slice(&[3.0, 2.0, 1.0]);
let dist = ruvector_l2_distance(a, b);
// sqrt(8) ≈ 2.828
assert!((dist - 2.828427).abs() < 0.001);
}
```
### 5. Stress Tests
**Purpose**: Verify stability under load and concurrency.
**File**: `stress_tests.rs`
**Coverage**:
- Concurrent vector creation (8 threads × 100 vectors)
- Concurrent distance calculations (16 threads × 1000 ops)
- Large batch allocations (10,000 vectors)
- Memory reuse patterns
- Thread safety (shared read-only access)
- Varlena round-trip stress (10,000 iterations)
**Example**:
```rust
#[test]
fn test_concurrent_distance_calculations() {
let num_threads = 16;
let calculations_per_thread = 1000;
let v1 = Arc::new(RuVector::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0]));
let v2 = Arc::new(RuVector::from_slice(&[5.0, 4.0, 3.0, 2.0, 1.0]));
let handles: Vec<_> = (0..num_threads)
.map(|_| {
let v1 = Arc::clone(&v1);
let v2 = Arc::clone(&v2);
thread::spawn(move || {
for _ in 0..calculations_per_thread {
let _ = v1.dot(&*v2);
}
})
})
.collect();
for handle in handles {
handle.join().unwrap();
}
}
```
### 6. SIMD Consistency Tests
**Purpose**: Verify SIMD implementations match scalar fallback.
**File**: `simd_consistency_tests.rs`
**Coverage**:
- AVX-512, AVX2, NEON vs scalar
- Various vector sizes (1, 7, 8, 15, 16, 31, 32, 64, 128, 256)
- Negative values
- Zero vectors
- Small and large values
- Random data (100 iterations)
**Example**:
```rust
#[test]
fn test_euclidean_scalar_vs_simd_various_sizes() {
for size in [8, 16, 32, 64, 128, 256] {
let a: Vec<f32> = (0..size).map(|i| i as f32 * 0.1).collect();
let b: Vec<f32> = (0..size).map(|i| (size - i) as f32 * 0.1).collect();
let scalar = scalar::euclidean_distance(&a, &b);
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") {
let simd = simd::euclidean_distance_avx2_wrapper(&a, &b);
assert!((scalar - simd).abs() < 1e-5);
}
}
}
```
## Running Tests
### All Tests
```bash
cd /home/user/ruvector/crates/ruvector-postgres
cargo test
```
### Specific Test Suite
```bash
# Unit tests only
cargo test --lib
# Integration tests only
cargo test --test '*'
# Specific test file
cargo test --test unit_vector_tests
# Property-based tests
cargo test --test property_based_tests
```
### pgrx Tests
```bash
# Requires PostgreSQL 14, 15, or 16
cargo pgrx test pg16
# Run specific pgrx test
cargo pgrx test pg16 test_l2_distance_basic
```
### With Coverage
```bash
# Install tarpaulin
cargo install cargo-tarpaulin
# Generate coverage report
cargo tarpaulin --out Html --output-dir coverage
```
## Test Metrics
### Current Coverage
**Overall**: ~85% line coverage
**By Component**:
- Core types: 92%
- Distance functions: 95%
- Operators: 88%
- Index implementations: 75%
- Quantization: 82%
### Performance Benchmarks
**Distance Calculations** (1M pairs, 128 dimensions):
- Scalar: 120ms
- AVX2: 45ms (2.7x faster)
- AVX-512: 32ms (3.8x faster)
**Vector Operations**:
- Normalization: 15μs/vector (1024 dims)
- Varlena roundtrip: 2.5μs/vector
- String parsing: 8μs/vector
## Debugging Failed Tests
### Common Issues
1. **Floating Point Precision**
```rust
// ❌ Too strict
assert_eq!(result, expected);
// ✅ Use epsilon
assert!((result - expected).abs() < 1e-5);
```
2. **SIMD Availability**
```rust
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") {
// Run AVX2 test
}
```
3. **PostgreSQL Memory Management**
```rust
unsafe {
let ptr = v.to_varlena();
// Use ptr...
pgrx::pg_sys::pfree(ptr as *mut std::ffi::c_void);
}
```
### Verbose Output
```bash
cargo test -- --nocapture --test-threads=1
```
### Running Single Test
```bash
cargo test test_l2_distance_basic -- --exact
```
## CI/CD Integration
### GitHub Actions
```yaml
name: Tests
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Run tests
run: cargo test --all-features
- name: Run pgrx tests
run: cargo pgrx test pg16
```
## Test Development Guidelines
### 1. Test Naming
- Use descriptive names: `test_l2_distance_basic`
- Group related tests: `test_l2_*`, `test_cosine_*`
- Indicate expected behavior: `test_parse_invalid`
### 2. Test Structure
```rust
#[test]
fn test_feature_scenario() {
// Arrange
let input = setup_test_data();
// Act
let result = perform_operation(input);
// Assert
assert_eq!(result, expected);
}
```
### 3. Edge Cases
Always test:
- Empty input
- Single element
- Very large input
- Negative values
- Zero values
- Boundary values
### 4. Error Cases
```rust
#[test]
#[should_panic(expected = "dimension mismatch")]
fn test_invalid_dimensions() {
let a = RuVector::from_slice(&[1.0, 2.0]);
let b = RuVector::from_slice(&[1.0, 2.0, 3.0]);
let _ = a.add(&b); // Should panic
}
```
## Future Test Additions
### Planned
- [ ] Fuzzing tests with cargo-fuzz
- [ ] Performance regression tests
- [ ] Index corruption recovery tests
- [ ] Multi-node distributed tests
- [ ] Backup/restore validation
### Nice to Have
- [ ] SQL injection tests
- [ ] Authentication/authorization tests
- [ ] Compatibility matrix (PostgreSQL versions)
- [ ] Platform-specific tests (Windows, macOS, ARM)
## Resources
- [pgrx Testing Documentation](https://github.com/tcdi/pgrx)
- [proptest Book](https://altsysrq.github.io/proptest-book/)
- [Rust Testing Guide](https://doc.rust-lang.org/book/ch11-00-testing.html)
- [pgvector Test Suite](https://github.com/pgvector/pgvector/tree/master/test)
## Support
For test failures or questions:
1. Check existing issues: https://github.com/ruvnet/ruvector/issues
2. Run with verbose output
3. Check PostgreSQL logs
4. Create minimal reproduction case