Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

This commit is contained in:
ruv
2026-02-28 14:39:40 -05:00
7854 changed files with 3522914 additions and 0 deletions

@@ -0,0 +1,307 @@
# RuVector Benchmark Suite
Comprehensive benchmarks comparing ruvector vs pgvector across multiple dimensions.
## Overview
This benchmark suite provides:
1. **Rust Benchmarks** - Low-level performance testing using Criterion
2. **SQL Benchmarks** - Realistic PostgreSQL workload testing
3. **Automated CI** - GitHub Actions workflow for continuous benchmarking
## Quick Start
### Run All Benchmarks
```bash
cd crates/ruvector-postgres
bash benches/scripts/run_benchmarks.sh
```
### Run Individual Benchmarks
```bash
# Distance function benchmarks
cargo bench --bench distance_bench
# HNSW index benchmarks
cargo bench --bench index_bench
# Quantization benchmarks
cargo bench --bench quantization_bench
# Quantized distance benchmarks
cargo bench --bench quantized_distance_bench
```
### Run SQL Benchmarks
```bash
# Setup database
createdb ruvector_bench
psql -d ruvector_bench -c 'CREATE EXTENSION ruvector;'
psql -d ruvector_bench -c 'CREATE EXTENSION vector;'  # pgvector's extension is named "vector"
# Quick benchmark (10k vectors)
psql -d ruvector_bench -f benches/sql/quick_benchmark.sql
# Full workload (1M vectors)
psql -d ruvector_bench -f benches/sql/benchmark_workload.sql
```
## Benchmark Categories
### 1. Distance Function Benchmarks (`distance_bench.rs`)
Tests distance calculation performance across different vector dimensions:
- **L2 (Euclidean) Distance**: Scalar vs SIMD implementations
- **Cosine Distance**: Normalized similarity measurement
- **Inner Product**: Dot product for maximum inner product search
- **Batch Operations**: Sequential vs parallel processing
**Dimensions tested**: 128, 384, 768, 1536, 3072
**Key metrics**:
- Single operation latency
- Throughput (ops/sec)
- SIMD speedup vs scalar
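For reference, all three metrics reduce to short scalar definitions; the SIMD variants are measured against implementations equivalent to this sketch (illustrative, not the crate's actual API):

```rust
/// Textbook scalar definitions of the three benchmarked distance metrics.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (na * nb)
}

/// Negated so that smaller means "closer", matching the other metrics.
fn inner_product_distance(a: &[f32], b: &[f32]) -> f32 {
    -a.iter().zip(b).map(|(x, y)| x * y).sum::<f32>()
}

fn main() {
    let a = [1.0, 0.0, 0.0];
    let b = [0.0, 1.0, 0.0];
    println!("{}", l2(&a, &b));                      // sqrt(2)
    println!("{}", cosine_distance(&a, &b));         // orthogonal => 1.0
    println!("{}", inner_product_distance(&a, &b));
}
```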
### 2. HNSW Index Benchmarks (`index_bench.rs`)
Tests Hierarchical Navigable Small World graph index:
#### Build Benchmarks
- Index construction time vs dataset size (1K, 10K, 100K, 1M vectors)
- Impact of `ef_construction` parameter (16, 32, 64, 128, 256)
- Impact of `M` parameter (8, 12, 16, 24, 32, 48)
#### Search Benchmarks
- Query latency vs dataset size
- Impact of `ef_search` parameter (10, 20, 40, 80, 160, 320)
- Impact of `k` (number of neighbors: 1, 5, 10, 20, 50, 100)
#### Recall Accuracy
- Recall@10 vs `ef_search` values
- Ground truth comparison
#### Memory Usage
- Index size vs dataset size
- Memory per vector overhead
**Dimensions tested**: 128, 384, 768, 1536
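The memory numbers can be sanity-checked with a back-of-the-envelope model: raw f32 storage plus 8-byte neighbor ids. The constants below (m0 = 2·M on layer 0, one amortized set of upper-layer links) are assumptions for illustration, not measured values:

```rust
/// Rough per-vector HNSW memory model: raw f32 components plus u64
/// neighbor ids (~2*M links on layer 0, ~M amortized on upper layers).
fn estimated_bytes_per_vector(dims: usize, m: usize) -> usize {
    let vector_bytes = dims * 4;   // f32 components
    let layer0_links = 2 * m * 8;  // m0 = 2*M neighbors, u64 ids
    let upper_links = m * 8;       // amortized upper-layer neighbors
    vector_bytes + layer0_links + upper_links
}

fn main() {
    // 768-dim index with M = 16: ~3 KB of raw vector, ~384 B of graph.
    println!("{}", estimated_bytes_per_vector(768, 16));
}
```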
### 3. Quantization Benchmarks (`quantization_bench.rs`)
Tests vector compression and quantized search:
#### Scalar Quantization (SQ8)
- Encoding/decoding speed
- Distance calculation speedup
- Recall vs exact search
- Memory reduction (4x compression)
#### Binary Quantization
- Encoding speed
- Hamming distance calculation (SIMD)
- Massive compression (32x for f32)
- Re-ranking strategies
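Hamming distance over binarized vectors is an XOR followed by a popcount per 64-bit word, which is exactly what the SIMD path accelerates. A scalar sketch (the sign-based `binarize` rule is an assumed convention, not necessarily the crate's):

```rust
/// Hamming distance between binary-quantized vectors packed into u64
/// words: XOR, then count differing bits.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Binarize an f32 vector by sign, packing 64 components per word.
fn binarize(v: &[f32]) -> Vec<u64> {
    v.chunks(64)
        .map(|chunk| {
            chunk.iter().enumerate().fold(0u64, |acc, (i, &x)| {
                if x > 0.0 { acc | (1 << i) } else { acc }
            })
        })
        .collect()
}

fn main() {
    let a = binarize(&[1.0, -2.0, 3.0, -4.0]);
    let b = binarize(&[1.0, 2.0, -3.0, -4.0]);
    println!("{}", hamming(&a, &b)); // components 1 and 2 differ in sign
}
```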
#### Product Quantization (PQ)
- ADC (Asymmetric Distance Computation)
- SIMD vs scalar lookup
- Configurable compression ratios
**Key metrics**:
- Speedup vs exact search
- Recall@10 accuracy
- Compression ratio
- Throughput improvement
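As a sketch of where the 4x figure for SQ8 comes from: each f32 component is mapped to one of 256 levels over a [min, max] range, so one u8 replaces one f32. The codec below is illustrative, not the crate's actual implementation:

```rust
/// SQ8 sketch: quantize each f32 component onto 256 levels over [min, max].
fn sq8_encode(v: &[f32], min: f32, max: f32) -> Vec<u8> {
    let scale = 255.0 / (max - min);
    v.iter()
        .map(|&x| ((x - min) * scale).round().clamp(0.0, 255.0) as u8)
        .collect()
}

/// Reconstruct approximate f32 values; max error is (max - min) / 255 / 2.
fn sq8_decode(codes: &[u8], min: f32, max: f32) -> Vec<f32> {
    let scale = (max - min) / 255.0;
    codes.iter().map(|&c| min + c as f32 * scale).collect()
}

fn main() {
    let v = [-1.0f32, 0.0, 0.5, 1.0];
    let codes = sq8_encode(&v, -1.0, 1.0);
    println!("{:?} -> {:?}", codes, sq8_decode(&codes, -1.0, 1.0));
}
```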
### 4. SQL Workload Benchmarks
Realistic PostgreSQL scenarios:
#### Quick Benchmark (`quick_benchmark.sql`)
- 10,000 vectors, 768 dimensions
- Sequential scan baseline
- HNSW index build
- Index search performance
- Distance function comparisons
#### Full Workload (`benchmark_workload.sql`)
- 1,000,000 vectors, 1536 dimensions
- 1,000 queries for statistical significance
- P50, P99 latency measurements
- Memory usage analysis
- Recall accuracy testing
- ruvector vs pgvector comparison
## Understanding Results
### Criterion Output
```
Distance/euclidean/scalar/768
time: [2.1234 µs 2.1456 µs 2.1678 µs]
thrpt: [354.23 Melem/s 357.89 Melem/s 361.55 Melem/s]
```
- **time**: Mean execution time with confidence intervals
- **thrpt**: Throughput (operations per second)
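The two lines are redundant: throughput is simply the element count divided by the time estimate. Checking the 768-element example against its 2.1456 µs midpoint:

```rust
/// Criterion's thrpt line is derived from the time line: elements / time.
/// Elements per microsecond is numerically equal to Melem/s.
fn throughput_melem_per_s(elements: f64, time_us: f64) -> f64 {
    elements / time_us
}

fn main() {
    // ~357.94 Melem/s, matching the middle thrpt value above.
    println!("{:.2} Melem/s", throughput_melem_per_s(768.0, 2.1456));
}
```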
### Comparing Implementations
```bash
# Set baseline
cargo bench --bench distance_bench -- --save-baseline main
# Make changes, then compare
cargo bench --bench distance_bench -- --baseline main
```
### SQL Benchmark Interpretation
```sql
p50_ms | p99_ms | avg_ms | min_ms | max_ms
--------+--------+--------+--------+--------
0.856 | 1.234 | 0.912 | 0.654 | 2.456
```
- **p50**: Median latency (50th percentile)
- **p99**: 99th percentile latency (worst 1%)
- **avg**: Average latency
- **min/max**: Best and worst case
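Percentiles are read off the sorted latency sample. A nearest-rank sketch; note that SQL aggregates such as `percentile_cont` interpolate between neighbors, so values can differ slightly:

```rust
/// Nearest-rank percentile over an ascending-sorted latency sample.
fn percentile(sorted_ms: &[f64], p: f64) -> f64 {
    let rank = ((p / 100.0) * sorted_ms.len() as f64).ceil() as usize;
    sorted_ms[rank.saturating_sub(1).min(sorted_ms.len() - 1)]
}

fn main() {
    // 100 latencies: 0.1 ms, 0.2 ms, ..., 10.0 ms.
    let lat: Vec<f64> = (1..=100).map(|i| i as f64 / 10.0).collect();
    println!("p50 = {}", percentile(&lat, 50.0)); // 5.0
    println!("p99 = {}", percentile(&lat, 99.0)); // 9.9
}
```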
## Performance Targets
### Distance Functions
| Operation | Dimension | Target Throughput |
|-----------|-----------|-------------------|
| L2 (SIMD) | 768 | > 400 Mops/s |
| L2 (SIMD) | 1536 | > 200 Mops/s |
| Cosine | 768 | > 300 Mops/s |
| Inner Product | 768 | > 500 Mops/s |
### HNSW Index
| Dataset Size | Build Time | Search Latency | Recall@10 |
|--------------|------------|----------------|-----------|
| 100K | < 30s | < 1ms | > 0.95 |
| 1M | < 5min | < 2ms | > 0.95 |
| 10M | < 1hr | < 5ms | > 0.90 |
### Quantization
| Method | Compression | Speedup | Recall@10 |
|---------|-------------|---------|-----------|
| SQ8 | 4x | 2-3x | > 0.95 |
| Binary | 32x | 10-20x | > 0.85 |
| PQ(8) | 16x | 5-10x | > 0.90 |
## Continuous Integration
The GitHub Actions workflow runs automatically on:
- Pull requests touching benchmark code
- Pushes to `main` and `develop` branches
- Manual workflow dispatch
Results are:
- Posted as PR comments
- Stored as artifacts (30-day retention)
- Tracked over time on main branch
- Compared against baseline
### Triggering Manual Runs
```bash
# From GitHub UI: Actions → Benchmarks → Run workflow
# Or using gh CLI
gh workflow run benchmarks.yml
```
### Enabling SQL Benchmarks in CI
SQL benchmarks are disabled by default because they take too long for routine CI runs. Enable them via workflow dispatch:
```bash
gh workflow run benchmarks.yml -f run_sql_benchmarks=true
```
## Advanced Usage
### Profiling with Criterion
```bash
# Profile for 5 seconds per benchmark (requires a profiler hook, e.g. pprof, configured in the bench harness)
cargo bench --bench distance_bench -- --profile-time=5
# Output to specific format
cargo bench --bench distance_bench -- --output-format bencher
```
### Custom Benchmark Parameters
Edit benchmark files to adjust:
- Vector dimensions
- Dataset sizes
- Number of queries
- HNSW parameters (M, ef_construction, ef_search)
- Quantization settings
### Comparing with pgvector
Ensure pgvector is installed:
```bash
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
```
Then run SQL benchmarks for side-by-side comparison.
## Interpreting Regressions
### Performance Degradation Alert
If CI fails due to performance regression:
1. **Check the comparison**: Review the baseline vs current results
2. **Validate the change**: Ensure it's not due to measurement noise
3. **Profile the code**: Use flamegraphs to identify bottlenecks
4. **Consider trade-offs**: Sometimes correctness > speed
### Common Causes
- **SIMD disabled**: Check compiler flags
- **Debug build**: Ensure --release mode
- **Thermal throttling**: CPU overheating in CI
- **Cache effects**: Different data access patterns
## Contributing
When adding benchmarks:
1. Add to appropriate `*_bench.rs` file
2. Update this README
3. Ensure benchmarks complete in < 5 minutes
4. Use `black_box()` to prevent optimization
5. Test both small and large inputs
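On point 4: without `black_box`, the optimizer may notice that a pure computation's result is unused and delete the measured work entirely. A minimal illustration using `std::hint::black_box` (Criterion's `black_box` plays the same role):

```rust
use std::hint::black_box;

/// A pure function: with its inputs known and output unused, the whole
/// call could be constant-folded or eliminated, giving a bogus timing.
fn sum_of_squares(v: &[f32]) -> f32 {
    v.iter().map(|x| x * x).sum()
}

fn main() {
    let v: Vec<f32> = (0..1024).map(|i| i as f32).collect();
    for _ in 0..1000 {
        // Launder both input and output so the work cannot be removed.
        black_box(sum_of_squares(black_box(&v)));
    }
    println!("done");
}
```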
## Resources
- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [HNSW Paper](https://arxiv.org/abs/1603.09320)
- [Product Quantization Paper](https://ieeexplore.ieee.org/document/5432202)
- [pgvector Repository](https://github.com/pgvector/pgvector)
## License
Same as the ruvector project: MIT

@@ -0,0 +1,565 @@
//! Comprehensive distance function benchmarks
//!
//! Compare SIMD vs scalar implementations across different vector sizes
//! and distance metrics (L2, cosine, inner product, Manhattan).
//!
//! Dimensions tested: 128, 384, 768, 1536, 3072
//! This covers common embedding sizes:
//! - 128: compact embeddings (e.g., FaceNet face vectors)
//! - 384: all-MiniLM-L6-v2
//! - 768: BERT base, RoBERTa
//! - 1536: OpenAI text-embedding-ada-002
//! - 3072: OpenAI text-embedding-3-large
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;
// ============================================================================
// Distance Implementations
// ============================================================================
mod distance_impl {
/// Scalar Euclidean distance
pub fn euclidean_scalar(a: &[f32], b: &[f32]) -> f32 {
a.iter()
.zip(b.iter())
.map(|(x, y)| {
let diff = x - y;
diff * diff
})
.sum::<f32>()
.sqrt()
}
/// Scalar cosine distance
pub fn cosine_scalar(a: &[f32], b: &[f32]) -> f32 {
let mut dot = 0.0f32;
let mut norm_a = 0.0f32;
let mut norm_b = 0.0f32;
for (x, y) in a.iter().zip(b.iter()) {
dot += x * y;
norm_a += x * x;
norm_b += y * y;
}
let denominator = (norm_a * norm_b).sqrt();
if denominator == 0.0 {
return 1.0;
}
1.0 - (dot / denominator)
}
/// Scalar inner product distance (negative)
pub fn inner_product_scalar(a: &[f32], b: &[f32]) -> f32 {
-a.iter().zip(b.iter()).map(|(x, y)| x * y).sum::<f32>()
}
/// Scalar Manhattan distance
pub fn manhattan_scalar(a: &[f32], b: &[f32]) -> f32 {
a.iter()
.zip(b.iter())
.map(|(x, y)| (x - y).abs())
.sum::<f32>()
}
/// AVX2 Euclidean distance
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2", enable = "fma")]
pub unsafe fn euclidean_avx2(a: &[f32], b: &[f32]) -> f32 {
use std::arch::x86_64::*;
let n = a.len();
let mut sum = _mm256_setzero_ps();
let chunks = n / 8;
for i in 0..chunks {
let offset = i * 8;
let va = _mm256_loadu_ps(a.as_ptr().add(offset));
let vb = _mm256_loadu_ps(b.as_ptr().add(offset));
let diff = _mm256_sub_ps(va, vb);
sum = _mm256_fmadd_ps(diff, diff, sum);
}
// Horizontal sum
let sum_high = _mm256_extractf128_ps(sum, 1);
let sum_low = _mm256_castps256_ps128(sum);
let sum128 = _mm_add_ps(sum_high, sum_low);
let sum64 = _mm_add_ps(sum128, _mm_movehl_ps(sum128, sum128));
let sum32 = _mm_add_ss(sum64, _mm_shuffle_ps(sum64, sum64, 1));
let mut result = _mm_cvtss_f32(sum32);
// Handle remainder
for i in (chunks * 8)..n {
let diff = a[i] - b[i];
result += diff * diff;
}
result.sqrt()
}
/// AVX2 cosine distance
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2", enable = "fma")]
pub unsafe fn cosine_avx2(a: &[f32], b: &[f32]) -> f32 {
use std::arch::x86_64::*;
let n = a.len();
let mut dot_sum = _mm256_setzero_ps();
let mut norm_a_sum = _mm256_setzero_ps();
let mut norm_b_sum = _mm256_setzero_ps();
let chunks = n / 8;
for i in 0..chunks {
let offset = i * 8;
let va = _mm256_loadu_ps(a.as_ptr().add(offset));
let vb = _mm256_loadu_ps(b.as_ptr().add(offset));
dot_sum = _mm256_fmadd_ps(va, vb, dot_sum);
norm_a_sum = _mm256_fmadd_ps(va, va, norm_a_sum);
norm_b_sum = _mm256_fmadd_ps(vb, vb, norm_b_sum);
}
// Horizontal sums
let h_dot = horizontal_sum_avx2(dot_sum);
let h_norm_a = horizontal_sum_avx2(norm_a_sum);
let h_norm_b = horizontal_sum_avx2(norm_b_sum);
// Handle remainder
let mut dot = h_dot;
let mut norm_a = h_norm_a;
let mut norm_b = h_norm_b;
for i in (chunks * 8)..n {
dot += a[i] * b[i];
norm_a += a[i] * a[i];
norm_b += b[i] * b[i];
}
let denom = (norm_a * norm_b).sqrt();
if denom == 0.0 {
return 1.0;
}
1.0 - (dot / denom)
}
/// AVX2 inner product
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2", enable = "fma")]
pub unsafe fn inner_product_avx2(a: &[f32], b: &[f32]) -> f32 {
use std::arch::x86_64::*;
let n = a.len();
let mut sum = _mm256_setzero_ps();
let chunks = n / 8;
for i in 0..chunks {
let offset = i * 8;
let va = _mm256_loadu_ps(a.as_ptr().add(offset));
let vb = _mm256_loadu_ps(b.as_ptr().add(offset));
sum = _mm256_fmadd_ps(va, vb, sum);
}
let mut result = horizontal_sum_avx2(sum);
// Handle remainder
for i in (chunks * 8)..n {
result += a[i] * b[i];
}
-result
}
#[cfg(target_arch = "x86_64")]
#[inline]
unsafe fn horizontal_sum_avx2(v: std::arch::x86_64::__m256) -> f32 {
use std::arch::x86_64::*;
let sum_high = _mm256_extractf128_ps(v, 1);
let sum_low = _mm256_castps256_ps128(v);
let sum128 = _mm_add_ps(sum_high, sum_low);
let sum64 = _mm_add_ps(sum128, _mm_movehl_ps(sum128, sum128));
let sum32 = _mm_add_ss(sum64, _mm_shuffle_ps(sum64, sum64, 1));
_mm_cvtss_f32(sum32)
}
#[cfg(not(target_arch = "x86_64"))]
pub unsafe fn euclidean_avx2(a: &[f32], b: &[f32]) -> f32 {
euclidean_scalar(a, b)
}
#[cfg(not(target_arch = "x86_64"))]
pub unsafe fn cosine_avx2(a: &[f32], b: &[f32]) -> f32 {
cosine_scalar(a, b)
}
#[cfg(not(target_arch = "x86_64"))]
pub unsafe fn inner_product_avx2(a: &[f32], b: &[f32]) -> f32 {
inner_product_scalar(a, b)
}
}
// ============================================================================
// Test Data Generation
// ============================================================================
fn generate_vectors(dims: usize, seed: u64) -> (Vec<f32>, Vec<f32>) {
let mut rng = ChaCha8Rng::seed_from_u64(seed);
let a: Vec<f32> = (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect();
let b: Vec<f32> = (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect();
(a, b)
}
fn generate_normalized_vectors(dims: usize, seed: u64) -> (Vec<f32>, Vec<f32>) {
let (mut a, mut b) = generate_vectors(dims, seed);
// Normalize vectors
let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
for x in &mut a {
*x /= norm_a;
}
for x in &mut b {
*x /= norm_b;
}
(a, b)
}
fn generate_vector_dataset(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
let mut rng = ChaCha8Rng::seed_from_u64(seed);
(0..n)
.map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
.collect()
}
// ============================================================================
// Euclidean Distance Benchmarks
// ============================================================================
const DIMENSIONS: [usize; 5] = [128, 384, 768, 1536, 3072];
fn bench_euclidean(c: &mut Criterion) {
let mut group = c.benchmark_group("Euclidean Distance");
for dims in DIMENSIONS.iter() {
let (a, b) = generate_vectors(*dims, 42);
group.throughput(Throughput::Elements(*dims as u64));
group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
bench.iter(|| distance_impl::euclidean_scalar(black_box(&a), black_box(&b)))
});
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
group.bench_with_input(BenchmarkId::new("avx2", dims), dims, |bench, _| {
bench
.iter(|| unsafe { distance_impl::euclidean_avx2(black_box(&a), black_box(&b)) })
});
}
}
group.finish();
}
// ============================================================================
// Cosine Distance Benchmarks
// ============================================================================
fn bench_cosine(c: &mut Criterion) {
let mut group = c.benchmark_group("Cosine Distance");
for dims in DIMENSIONS.iter() {
let (a, b) = generate_vectors(*dims, 42);
group.throughput(Throughput::Elements(*dims as u64));
group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
bench.iter(|| distance_impl::cosine_scalar(black_box(&a), black_box(&b)))
});
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
group.bench_with_input(BenchmarkId::new("avx2", dims), dims, |bench, _| {
bench.iter(|| unsafe { distance_impl::cosine_avx2(black_box(&a), black_box(&b)) })
});
}
}
group.finish();
}
// ============================================================================
// Cosine Distance for Pre-Normalized Vectors
// ============================================================================
fn bench_cosine_normalized(c: &mut Criterion) {
let mut group = c.benchmark_group("Cosine Distance (Normalized)");
for dims in DIMENSIONS.iter() {
let (a, b) = generate_normalized_vectors(*dims, 42);
group.throughput(Throughput::Elements(*dims as u64));
// For normalized vectors, cosine = 1 - dot product
group.bench_with_input(BenchmarkId::new("scalar_dot", dims), dims, |bench, _| {
bench.iter(|| {
let dot: f32 = a.iter().zip(&b).map(|(x, y)| x * y).sum();
1.0 - black_box(dot)
})
});
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
group.bench_with_input(BenchmarkId::new("avx2_dot", dims), dims, |bench, _| {
bench.iter(|| unsafe {
1.0 + distance_impl::inner_product_avx2(black_box(&a), black_box(&b))
})
});
}
}
group.finish();
}
// ============================================================================
// Inner Product Benchmarks
// ============================================================================
fn bench_inner_product(c: &mut Criterion) {
let mut group = c.benchmark_group("Inner Product");
for dims in DIMENSIONS.iter() {
let (a, b) = generate_vectors(*dims, 42);
group.throughput(Throughput::Elements(*dims as u64));
group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
bench.iter(|| distance_impl::inner_product_scalar(black_box(&a), black_box(&b)))
});
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
group.bench_with_input(BenchmarkId::new("avx2", dims), dims, |bench, _| {
bench.iter(|| unsafe {
distance_impl::inner_product_avx2(black_box(&a), black_box(&b))
})
});
}
}
group.finish();
}
// ============================================================================
// Manhattan Distance Benchmarks
// ============================================================================
fn bench_manhattan(c: &mut Criterion) {
let mut group = c.benchmark_group("Manhattan Distance");
for dims in DIMENSIONS.iter() {
let (a, b) = generate_vectors(*dims, 42);
group.throughput(Throughput::Elements(*dims as u64));
group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
bench.iter(|| distance_impl::manhattan_scalar(black_box(&a), black_box(&b)))
});
}
group.finish();
}
// ============================================================================
// Batch Distance Benchmarks (1000 vectors)
// ============================================================================
fn bench_batch_sequential(c: &mut Criterion) {
let mut group = c.benchmark_group("Batch Distance (Sequential, 1000 vectors)");
for dims in [128, 384, 1536].iter() {
let query = generate_vectors(*dims, 42).0;
let vectors = generate_vector_dataset(1000, *dims, 123);
group.throughput(Throughput::Elements(1000));
group.bench_with_input(BenchmarkId::new("euclidean", dims), dims, |bench, _| {
bench.iter(|| {
vectors
.iter()
.map(|v| distance_impl::euclidean_scalar(black_box(&query), black_box(v)))
.collect::<Vec<_>>()
})
});
group.bench_with_input(BenchmarkId::new("cosine", dims), dims, |bench, _| {
bench.iter(|| {
vectors
.iter()
.map(|v| distance_impl::cosine_scalar(black_box(&query), black_box(v)))
.collect::<Vec<_>>()
})
});
group.bench_with_input(BenchmarkId::new("inner_product", dims), dims, |bench, _| {
bench.iter(|| {
vectors
.iter()
.map(|v| distance_impl::inner_product_scalar(black_box(&query), black_box(v)))
.collect::<Vec<_>>()
})
});
}
group.finish();
}
fn bench_batch_parallel(c: &mut Criterion) {
let mut group = c.benchmark_group("Batch Distance (Parallel, 1000 vectors)");
for dims in [128, 384, 1536].iter() {
let query = generate_vectors(*dims, 42).0;
let vectors = generate_vector_dataset(1000, *dims, 123);
group.throughput(Throughput::Elements(1000));
group.bench_with_input(
BenchmarkId::new("euclidean_rayon", dims),
dims,
|bench, _| {
bench.iter(|| {
vectors
.par_iter()
.map(|v| distance_impl::euclidean_scalar(black_box(&query), black_box(v)))
.collect::<Vec<_>>()
})
},
);
group.bench_with_input(BenchmarkId::new("cosine_rayon", dims), dims, |bench, _| {
bench.iter(|| {
vectors
.par_iter()
.map(|v| distance_impl::cosine_scalar(black_box(&query), black_box(v)))
.collect::<Vec<_>>()
})
});
}
group.finish();
}
// ============================================================================
// Large Batch Benchmarks (10K vectors)
// ============================================================================
fn bench_large_batch(c: &mut Criterion) {
let mut group = c.benchmark_group("Large Batch Distance (10K vectors)");
group.sample_size(10);
for dims in [384, 768, 1536].iter() {
let query = generate_vectors(*dims, 42).0;
let vectors = generate_vector_dataset(10_000, *dims, 123);
group.throughput(Throughput::Elements(10_000));
group.bench_with_input(BenchmarkId::new("sequential", dims), dims, |bench, _| {
bench.iter(|| {
vectors
.iter()
.map(|v| distance_impl::euclidean_scalar(black_box(&query), black_box(v)))
.collect::<Vec<_>>()
})
});
group.bench_with_input(BenchmarkId::new("parallel", dims), dims, |bench, _| {
bench.iter(|| {
vectors
.par_iter()
.map(|v| distance_impl::euclidean_scalar(black_box(&query), black_box(v)))
.collect::<Vec<_>>()
})
});
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
group.bench_with_input(BenchmarkId::new("parallel_avx2", dims), dims, |bench, _| {
bench.iter(|| {
vectors
.par_iter()
.map(|v| unsafe {
distance_impl::euclidean_avx2(black_box(&query), black_box(v))
})
.collect::<Vec<_>>()
})
});
}
}
group.finish();
}
// ============================================================================
// SIMD Speedup Comparison
// ============================================================================
fn bench_simd_speedup(c: &mut Criterion) {
let mut group = c.benchmark_group("SIMD Speedup Analysis");
#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
for dims in DIMENSIONS.iter() {
let (a, b) = generate_vectors(*dims, 42);
// Euclidean
group.bench_with_input(
BenchmarkId::new("euclidean_scalar", dims),
dims,
|bench, _| {
bench.iter(|| distance_impl::euclidean_scalar(black_box(&a), black_box(&b)))
},
);
group.bench_with_input(
BenchmarkId::new("euclidean_avx2", dims),
dims,
|bench, _| {
bench.iter(|| unsafe {
distance_impl::euclidean_avx2(black_box(&a), black_box(&b))
})
},
);
// Cosine
group.bench_with_input(BenchmarkId::new("cosine_scalar", dims), dims, |bench, _| {
bench.iter(|| distance_impl::cosine_scalar(black_box(&a), black_box(&b)))
});
group.bench_with_input(BenchmarkId::new("cosine_avx2", dims), dims, |bench, _| {
bench.iter(|| unsafe { distance_impl::cosine_avx2(black_box(&a), black_box(&b)) })
});
}
}
group.finish();
}
criterion_group!(
benches,
bench_euclidean,
bench_cosine,
bench_cosine_normalized,
bench_inner_product,
bench_manhattan,
bench_batch_sequential,
bench_batch_parallel,
bench_large_batch,
bench_simd_speedup,
);
criterion_main!(benches);

@@ -0,0 +1,782 @@
//! End-to-End benchmarks for RuVector PostgreSQL extension
//!
//! Comprehensive benchmarks for:
//! - Full query pipeline latency
//! - Insert throughput
//! - Concurrent query scaling
//! - Memory usage under load
//! - pgvector comparison baselines
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};
use std::sync::Arc;
use std::time::{Duration, Instant};
// ============================================================================
// Simulated Vector Index (Full Pipeline)
// ============================================================================
mod index {
use dashmap::DashMap;
use parking_lot::RwLock;
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;
use std::cmp::Ordering;
use std::collections::{BinaryHeap, HashMap, HashSet};
use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};
/// Full-featured HNSW index for benchmarking
pub struct HnswIndex {
pub nodes: DashMap<u64, Vec<f32>>,
pub neighbors: DashMap<u64, Vec<Vec<u64>>>,
pub entry_point: RwLock<Option<u64>>,
pub max_layer: AtomicUsize,
pub m: usize,
pub m0: usize,
pub ef_construction: usize,
pub ef_search: usize,
pub dimensions: usize,
next_id: AtomicUsize,
rng: RwLock<ChaCha8Rng>,
}
impl HnswIndex {
pub fn new(
dimensions: usize,
m: usize,
ef_construction: usize,
ef_search: usize,
seed: u64,
) -> Self {
Self {
nodes: DashMap::new(),
neighbors: DashMap::new(),
entry_point: RwLock::new(None),
max_layer: AtomicUsize::new(0),
m,
m0: m * 2,
ef_construction,
ef_search,
dimensions,
next_id: AtomicUsize::new(0),
rng: RwLock::new(ChaCha8Rng::seed_from_u64(seed)),
}
}
pub fn len(&self) -> usize {
self.nodes.len()
}
fn random_level(&self) -> usize {
let ml = 1.0 / (self.m as f64).ln();
let mut rng = self.rng.write();
let r: f64 = rng.gen();
((-r.ln() * ml).floor() as usize).min(32)
}
fn distance(&self, a: &[f32], b: &[f32]) -> f32 {
a.iter()
.zip(b.iter())
.map(|(x, y)| (x - y).powi(2))
.sum::<f32>()
.sqrt()
}
pub fn insert(&self, vector: Vec<f32>) -> u64 {
let id = self.next_id.fetch_add(1, AtomicOrdering::Relaxed) as u64;
let level = self.random_level();
// Initialize neighbor lists for all layers
let mut neighbor_lists = Vec::with_capacity(level + 1);
for _ in 0..=level {
neighbor_lists.push(Vec::new());
}
self.nodes.insert(id, vector.clone());
self.neighbors.insert(id, neighbor_lists);
let current_entry = *self.entry_point.read();
if current_entry.is_none() {
*self.entry_point.write() = Some(id);
self.max_layer.store(level, AtomicOrdering::Relaxed);
return id;
}
// Simplified insertion
let entry_id = current_entry.unwrap();
// Connect to some neighbors
if self.nodes.contains_key(&entry_id) {
let max_conn = self.m0; // layer-0 lists are capped at m0 = 2*M
if let Some(mut neighbors) = self.neighbors.get_mut(&id) {
neighbors[0].push(entry_id);
}
if let Some(mut entry_neighbors) = self.neighbors.get_mut(&entry_id) {
if entry_neighbors[0].len() < max_conn {
entry_neighbors[0].push(id);
}
}
}
if level > self.max_layer.load(AtomicOrdering::Relaxed) {
*self.entry_point.write() = Some(id);
self.max_layer.store(level, AtomicOrdering::Relaxed);
}
id
}
pub fn insert_batch(&self, vectors: &[Vec<f32>]) -> Vec<u64> {
vectors.iter().map(|v| self.insert(v.clone())).collect()
}
pub fn insert_batch_parallel(&self, vectors: &[Vec<f32>]) -> Vec<u64> {
// Parallel insertion with batching
vectors.par_iter().map(|v| self.insert(v.clone())).collect()
}
pub fn search(&self, query: &[f32], k: usize) -> Vec<(u64, f32)> {
// Brute force for simplicity in benchmarks
let mut results: Vec<(u64, f32)> = self
.nodes
.iter()
.map(|entry| {
let dist = self.distance(query, entry.value());
(*entry.key(), dist)
})
.collect();
results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
results.truncate(k);
results
}
pub fn search_parallel(&self, query: &[f32], k: usize) -> Vec<(u64, f32)> {
let mut results: Vec<(u64, f32)> = self
.nodes
.iter()
.collect::<Vec<_>>()
.par_iter()
.map(|entry| {
let dist = self.distance(query, entry.value());
(*entry.key(), dist)
})
.collect();
results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
results.truncate(k);
results
}
pub fn memory_usage(&self) -> usize {
let vector_bytes = self.nodes.len() * self.dimensions * 4;
let neighbor_bytes: usize = self
.neighbors
.iter()
.map(|entry| entry.value().iter().map(|l| l.len() * 8).sum::<usize>())
.sum();
vector_bytes + neighbor_bytes
}
}
}
use index::HnswIndex;
// ============================================================================
// Test Data Generation
// ============================================================================
fn generate_random_vectors(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
let mut rng = ChaCha8Rng::seed_from_u64(seed);
(0..n)
.map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
.collect()
}
fn generate_normalized_vectors(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
let vectors = generate_random_vectors(n, dims, seed);
vectors
.into_iter()
.map(|v| {
let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
v.into_iter().map(|x| x / norm).collect()
})
.collect()
}
// ============================================================================
// Full Query Pipeline Benchmarks
// ============================================================================
fn bench_query_pipeline(c: &mut Criterion) {
let mut group = c.benchmark_group("Query Pipeline");
for &dims in [128, 384, 768, 1536].iter() {
for &n in [10_000, 100_000].iter() {
let vectors = generate_random_vectors(n, dims, 42);
let query = vectors[0].clone();
let index = HnswIndex::new(dims, 16, 64, 40, 42);
index.insert_batch(&vectors);
group.throughput(Throughput::Elements(1));
// Full pipeline: search + post-process
group.bench_with_input(BenchmarkId::new(format!("{}d", dims), n), &n, |bench, _| {
bench.iter(|| {
// Search
let results = index.search(&query, 10);
// Post-process (e.g., fetch metadata, rerank)
let processed: Vec<_> = results
.iter()
.map(|(id, dist)| {
// Simulate metadata lookup
let metadata = id.to_string();
(*id, *dist, metadata)
})
.collect();
black_box(processed)
})
});
}
}
group.finish();
}
fn bench_query_pipeline_parallel(c: &mut Criterion) {
let mut group = c.benchmark_group("Query Pipeline (Parallel)");
let dims = 768;
let n = 100_000;
let vectors = generate_random_vectors(n, dims, 42);
let queries: Vec<Vec<f32>> = generate_random_vectors(100, dims, 999);
let index = HnswIndex::new(dims, 16, 64, 40, 42);
index.insert_batch(&vectors);
group.throughput(Throughput::Elements(100));
group.bench_function("sequential", |bench| {
bench.iter(|| {
queries
.iter()
.map(|q| index.search(q, 10))
.collect::<Vec<_>>()
})
});
group.bench_function("parallel_queries", |bench| {
bench.iter(|| {
queries
.par_iter()
.map(|q| index.search(q, 10))
.collect::<Vec<_>>()
})
});
group.bench_function("parallel_search_internal", |bench| {
bench.iter(|| {
queries
.iter()
.map(|q| index.search_parallel(q, 10))
.collect::<Vec<_>>()
})
});
group.bench_function("full_parallel", |bench| {
bench.iter(|| {
queries
.par_iter()
.map(|q| index.search_parallel(q, 10))
.collect::<Vec<_>>()
})
});
group.finish();
}
// ============================================================================
// Insert Throughput Benchmarks
// ============================================================================
fn bench_insert_throughput(c: &mut Criterion) {
let mut group = c.benchmark_group("Insert Throughput");
group.sample_size(10);
for &dims in [128, 384, 768, 1536].iter() {
for &n in [1_000, 10_000, 100_000].iter() {
let vectors = generate_random_vectors(n, dims, 42);
group.throughput(Throughput::Elements(n as u64));
group.bench_with_input(
BenchmarkId::new(format!("{}d", dims), n),
&vectors,
|bench, vecs| {
bench.iter(|| {
let index = HnswIndex::new(dims, 16, 64, 40, 42);
index.insert_batch(vecs);
black_box(index.len())
})
},
);
}
}
group.finish();
}
fn bench_insert_throughput_parallel(c: &mut Criterion) {
let mut group = c.benchmark_group("Insert Throughput (Parallel)");
group.sample_size(10);
let dims = 768;
for &n in [10_000, 100_000].iter() {
let vectors = generate_random_vectors(n, dims, 42);
group.throughput(Throughput::Elements(n as u64));
group.bench_with_input(
BenchmarkId::new("sequential", n),
&vectors,
|bench, vecs| {
bench.iter(|| {
let index = HnswIndex::new(dims, 16, 64, 40, 42);
index.insert_batch(vecs);
black_box(index.len())
})
},
);
group.bench_with_input(BenchmarkId::new("parallel", n), &vectors, |bench, vecs| {
bench.iter(|| {
let index = HnswIndex::new(dims, 16, 64, 40, 42);
index.insert_batch_parallel(vecs);
black_box(index.len())
})
});
}
group.finish();
}
fn bench_insert_batching(c: &mut Criterion) {
let mut group = c.benchmark_group("Insert Batch Sizes");
group.sample_size(10);
let dims = 768;
let n = 10_000;
let vectors = generate_random_vectors(n, dims, 42);
for &batch_size in [1, 10, 100, 1000, 10000].iter() {
group.throughput(Throughput::Elements(n as u64));
group.bench_with_input(
BenchmarkId::from_parameter(batch_size),
&batch_size,
|bench, &bs| {
bench.iter(|| {
let index = HnswIndex::new(dims, 16, 64, 40, 42);
for chunk in vectors.chunks(bs) {
index.insert_batch(chunk);
}
black_box(index.len())
})
},
);
}
group.finish();
}
// ============================================================================
// Concurrent Query Scaling
// ============================================================================
fn bench_concurrent_scaling(c: &mut Criterion) {
let mut group = c.benchmark_group("Concurrent Query Scaling");
group.sample_size(10);
let dims = 768;
let n = 100_000;
let vectors = generate_random_vectors(n, dims, 42);
let queries = generate_random_vectors(1000, dims, 999);
let index = Arc::new(HnswIndex::new(dims, 16, 64, 40, 42));
index.insert_batch(&vectors);
for &num_threads in [1, 2, 4, 8, 16].iter() {
group.throughput(Throughput::Elements(1000));
group.bench_with_input(
BenchmarkId::from_parameter(num_threads),
&num_threads,
|bench, &threads| {
let pool = rayon::ThreadPoolBuilder::new()
.num_threads(threads)
.build()
.unwrap();
bench.iter(|| {
pool.install(|| {
queries.par_iter().for_each(|q| {
black_box(index.search(q, 10));
});
})
})
},
);
}
group.finish();
}
fn bench_mixed_workload(c: &mut Criterion) {
let mut group = c.benchmark_group("Mixed Read/Write Workload");
group.sample_size(10);
let dims = 768;
let n = 50_000;
let vectors = generate_random_vectors(n, dims, 42);
let queries = generate_random_vectors(100, dims, 999);
let new_vectors = generate_random_vectors(1000, dims, 123);
let index = Arc::new(HnswIndex::new(dims, 16, 64, 40, 42));
index.insert_batch(&vectors);
// Read-heavy (90% reads, 10% writes)
group.bench_function("read_heavy", |bench| {
let idx = index.clone();
bench.iter(|| {
// 90 reads
for q in queries.iter().take(90) {
black_box(idx.search(q, 10));
}
// 10 writes
for v in new_vectors.iter().take(10) {
black_box(idx.insert(v.clone()));
}
})
});
// Balanced (50% reads, 50% writes)
group.bench_function("balanced", |bench| {
let idx = index.clone();
bench.iter(|| {
for (q, v) in queries.iter().take(50).zip(new_vectors.iter().take(50)) {
black_box(idx.search(q, 10));
black_box(idx.insert(v.clone()));
}
})
});
// Write-heavy (10% reads, 90% writes)
group.bench_function("write_heavy", |bench| {
let idx = index.clone();
bench.iter(|| {
// 10 reads
for q in queries.iter().take(10) {
black_box(idx.search(q, 10));
}
// 90 writes
for v in new_vectors.iter().take(90) {
black_box(idx.insert(v.clone()));
}
})
});
group.finish();
}
// ============================================================================
// Memory Usage Under Load
// ============================================================================
fn bench_memory_growth(c: &mut Criterion) {
let mut group = c.benchmark_group("Memory Growth");
group.sample_size(10);
let dims = 768;
for &n in [1_000, 10_000, 50_000, 100_000].iter() {
let vectors = generate_random_vectors(n, dims, 42);
group.bench_with_input(BenchmarkId::from_parameter(n), &vectors, |bench, vecs| {
bench.iter(|| {
let index = HnswIndex::new(dims, 16, 64, 40, 42);
index.insert_batch(vecs);
let memory = index.memory_usage();
let per_vector = memory as f64 / n as f64;
black_box((memory, per_vector))
})
});
}
group.finish();
}
fn bench_memory_efficiency(c: &mut Criterion) {
let mut group = c.benchmark_group("Memory Efficiency (M parameter)");
group.sample_size(10);
let dims = 768;
let n = 10_000;
let vectors = generate_random_vectors(n, dims, 42);
for &m in [8, 12, 16, 24, 32, 48].iter() {
group.bench_with_input(BenchmarkId::from_parameter(m), &m, |bench, &m_val| {
bench.iter(|| {
let index = HnswIndex::new(dims, m_val, 64, 40, 42);
index.insert_batch(&vectors);
let memory = index.memory_usage();
let per_vector = memory as f64 / n as f64;
black_box(per_vector)
})
});
}
group.finish();
}
// ============================================================================
// Latency Distribution
// ============================================================================
fn bench_latency_distribution(c: &mut Criterion) {
let mut group = c.benchmark_group("Latency Distribution");
group.sample_size(10);
let dims = 768;
let n = 100_000;
let vectors = generate_random_vectors(n, dims, 42);
let queries = generate_random_vectors(1000, dims, 999);
let index = HnswIndex::new(dims, 16, 64, 40, 42);
index.insert_batch(&vectors);
group.bench_function("collect_percentiles", |bench| {
bench.iter(|| {
let mut latencies: Vec<Duration> = Vec::with_capacity(queries.len());
for query in &queries {
let start = Instant::now();
black_box(index.search(query, 10));
latencies.push(start.elapsed());
}
latencies.sort();
let p50 = latencies[latencies.len() / 2];
let p95 = latencies[(latencies.len() as f64 * 0.95) as usize];
let p99 = latencies[(latencies.len() as f64 * 0.99) as usize];
let p999 = latencies[(latencies.len() as f64 * 0.999) as usize];
black_box((p50, p95, p99, p999))
})
});
group.finish();
}
// ============================================================================
// Dimension Scaling
// ============================================================================
fn bench_dimension_scaling(c: &mut Criterion) {
let mut group = c.benchmark_group("Dimension Scaling");
group.sample_size(10);
let n = 10_000;
for &dims in [64, 128, 256, 384, 512, 768, 1024, 1536, 2048, 3072].iter() {
let vectors = generate_random_vectors(n, dims, 42);
let query = vectors[0].clone();
let index = HnswIndex::new(dims, 16, 64, 40, 42);
index.insert_batch(&vectors);
group.bench_with_input(BenchmarkId::new("search", dims), &dims, |bench, _| {
bench.iter(|| black_box(index.search(&query, 10)))
});
}
group.finish();
}
// ============================================================================
// pgvector Comparison Baselines
// ============================================================================
fn bench_baseline_brute_force(c: &mut Criterion) {
let mut group = c.benchmark_group("Baseline Brute Force");
group.sample_size(10);
for &dims in [128, 384, 768, 1536].iter() {
for &n in [1_000, 10_000, 100_000].iter() {
let vectors = generate_random_vectors(n, dims, 42);
let query = vectors[0].clone();
group.throughput(Throughput::Elements(n as u64));
// Sequential brute force
group.bench_with_input(
BenchmarkId::new(format!("{}d_seq", dims), n),
&vectors,
|bench, vecs| {
bench.iter(|| {
let mut distances: Vec<(usize, f32)> = vecs
.iter()
.enumerate()
.map(|(i, v)| {
let dist: f32 = query
.iter()
.zip(v.iter())
.map(|(a, b)| (a - b).powi(2))
.sum::<f32>()
.sqrt();
(i, dist)
})
.collect();
distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
distances.truncate(10);
black_box(distances)
})
},
);
// Parallel brute force
group.bench_with_input(
BenchmarkId::new(format!("{}d_par", dims), n),
&vectors,
|bench, vecs| {
bench.iter(|| {
let mut distances: Vec<(usize, f32)> = vecs
.par_iter()
.enumerate()
.map(|(i, v)| {
let dist: f32 = query
.iter()
.zip(v.iter())
.map(|(a, b)| (a - b).powi(2))
.sum::<f32>()
.sqrt();
(i, dist)
})
.collect();
distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
distances.truncate(10);
black_box(distances)
})
},
);
}
}
group.finish();
}
// ============================================================================
// Recall vs Throughput Tradeoff
// ============================================================================
fn bench_recall_throughput_tradeoff(c: &mut Criterion) {
let mut group = c.benchmark_group("Recall vs Throughput");
group.sample_size(10);
let dims = 768;
let n = 10_000;
let vectors = generate_random_vectors(n, dims, 42);
let query = vectors[0].clone();
// Compute ground truth
let ground_truth: Vec<usize> = {
let mut distances: Vec<(usize, f32)> = vectors
.iter()
.enumerate()
.map(|(i, v)| {
let dist: f32 = query
.iter()
.zip(v.iter())
.map(|(a, b)| (a - b).powi(2))
.sum::<f32>()
.sqrt();
(i, dist)
})
.collect();
distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
distances.iter().take(10).map(|(i, _)| *i).collect()
};
for &ef_search in [10, 20, 40, 80, 160, 320].iter() {
let index = HnswIndex::new(dims, 16, 64, ef_search, 42);
index.insert_batch(&vectors);
group.bench_with_input(
BenchmarkId::from_parameter(ef_search),
&ef_search,
|bench, _| {
bench.iter(|| {
let results = index.search(&query, 10);
// Calculate recall
let recall = results
.iter()
.filter(|(id, _)| ground_truth.contains(&(*id as usize)))
.count() as f64
/ 10.0;
black_box(recall)
})
},
);
}
group.finish();
}
criterion_group!(
benches,
// Query Pipeline
bench_query_pipeline,
bench_query_pipeline_parallel,
// Insert Throughput
bench_insert_throughput,
bench_insert_throughput_parallel,
bench_insert_batching,
// Concurrent Scaling
bench_concurrent_scaling,
bench_mixed_workload,
// Memory Usage
bench_memory_growth,
bench_memory_efficiency,
// Latency
bench_latency_distribution,
// Dimension Scaling
bench_dimension_scaling,
// Baselines
bench_baseline_brute_force,
// Recall/Throughput
bench_recall_throughput_tradeoff,
);
criterion_main!(benches);


@@ -0,0 +1,742 @@
//! Hybrid search benchmarks
//!
//! Benchmarks for combining vector search with keyword/BM25 scoring:
//! - Vector-only vs hybrid latency
//! - BM25 scoring overhead
//! - Fusion algorithm comparison (RRF, weighted sum)
//! - Parallel branch execution gain
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;
// ============================================================================
// BM25 Implementation
// ============================================================================
mod bm25 {
use std::cmp::Ordering;
use std::collections::HashMap;
    /// Simple tokenizer: lowercases, splits on non-alphanumeric characters, and keeps tokens longer than 2 chars
pub fn tokenize(text: &str) -> Vec<String> {
text.to_lowercase()
.split(|c: char| !c.is_alphanumeric())
.filter(|s| !s.is_empty() && s.len() > 2)
.map(|s| s.to_string())
.collect()
}
/// BM25 scoring index
pub struct BM25Index {
/// Document frequency for each term
pub doc_freq: HashMap<String, usize>,
/// Term frequency per document
pub term_freq: Vec<HashMap<String, usize>>,
/// Document lengths
pub doc_lengths: Vec<usize>,
/// Average document length
pub avg_doc_len: f64,
/// Number of documents
pub num_docs: usize,
/// BM25 parameters
pub k1: f64,
pub b: f64,
}
impl BM25Index {
pub fn new(k1: f64, b: f64) -> Self {
Self {
doc_freq: HashMap::new(),
term_freq: Vec::new(),
doc_lengths: Vec::new(),
avg_doc_len: 0.0,
num_docs: 0,
k1,
b,
}
}
pub fn build(&mut self, documents: &[String]) {
self.num_docs = documents.len();
self.term_freq = Vec::with_capacity(documents.len());
self.doc_lengths = Vec::with_capacity(documents.len());
let mut total_len = 0usize;
for doc in documents {
let tokens = tokenize(doc);
self.doc_lengths.push(tokens.len());
total_len += tokens.len();
let mut tf: HashMap<String, usize> = HashMap::new();
let mut seen_terms: std::collections::HashSet<String> =
std::collections::HashSet::new();
for token in tokens {
*tf.entry(token.clone()).or_insert(0) += 1;
if !seen_terms.contains(&token) {
*self.doc_freq.entry(token.clone()).or_insert(0) += 1;
seen_terms.insert(token);
}
}
self.term_freq.push(tf);
}
self.avg_doc_len = total_len as f64 / documents.len() as f64;
}
/// Calculate IDF for a term
fn idf(&self, term: &str) -> f64 {
let df = self.doc_freq.get(term).copied().unwrap_or(0) as f64;
if df == 0.0 {
return 0.0;
}
((self.num_docs as f64 - df + 0.5) / (df + 0.5) + 1.0).ln()
}
/// Score a document against a query
pub fn score(&self, doc_id: usize, query_tokens: &[String]) -> f64 {
if doc_id >= self.term_freq.len() {
return 0.0;
}
let doc_tf = &self.term_freq[doc_id];
let doc_len = self.doc_lengths[doc_id] as f64;
let mut score = 0.0;
for term in query_tokens {
let tf = doc_tf.get(term).copied().unwrap_or(0) as f64;
if tf == 0.0 {
continue;
}
let idf = self.idf(term);
let numerator = tf * (self.k1 + 1.0);
let denominator =
tf + self.k1 * (1.0 - self.b + self.b * (doc_len / self.avg_doc_len));
score += idf * (numerator / denominator);
}
score
}
/// Search and return top-k documents
pub fn search(&self, query: &str, k: usize) -> Vec<(usize, f64)> {
let query_tokens = tokenize(query);
let mut scores: Vec<(usize, f64)> = (0..self.num_docs)
.map(|doc_id| (doc_id, self.score(doc_id, &query_tokens)))
.filter(|(_, score)| *score > 0.0)
.collect();
scores.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(Ordering::Equal));
scores.truncate(k);
scores
}
}
}
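The BM25 formula above combines an IDF term with a saturating term-frequency component. As a standalone sanity check (hypothetical two-document corpus, not part of the benchmark suite), the per-term score can be computed directly:

```rust
// Hypothetical standalone BM25 per-term score, using the same formula and the
// same k1 = 1.2, b = 0.75 parameters as the benchmarks above.
fn bm25_score(tf: f64, df: f64, num_docs: f64, doc_len: f64, avg_len: f64, k1: f64, b: f64) -> f64 {
    if tf == 0.0 || df == 0.0 {
        return 0.0;
    }
    // IDF with the +1 smoothing used in BM25Index::idf above.
    let idf = ((num_docs - df + 0.5) / (df + 0.5) + 1.0).ln();
    // Saturating term-frequency component, normalized by document length.
    idf * (tf * (k1 + 1.0)) / (tf + k1 * (1.0 - b + b * (doc_len / avg_len)))
}

fn main() {
    // "neural" appears twice in doc 0 (length 4) and nowhere else: df = 1, N = 2.
    let s = bm25_score(2.0, 1.0, 2.0, 4.0, 4.0, 1.2, 0.75);
    assert!(s > 0.0);
    // A term absent from the document contributes nothing.
    assert_eq!(bm25_score(0.0, 1.0, 2.0, 4.0, 4.0, 1.2, 0.75), 0.0);
    println!("bm25 score for matching term: {:.4}", s);
}
```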
// ============================================================================
// Vector Search (Simplified)
// ============================================================================
mod vector_search {
use std::cmp::Ordering;
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
a.iter()
.zip(b.iter())
.map(|(x, y)| (x - y).powi(2))
.sum::<f32>()
.sqrt()
}
pub fn search(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
let mut results: Vec<(usize, f32)> = vectors
.iter()
.enumerate()
.map(|(i, v)| (i, euclidean_distance(query, v)))
.collect();
results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal));
results.truncate(k);
results
}
pub fn search_parallel(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
use rayon::prelude::*;
let mut results: Vec<(usize, f32)> = vectors
.par_iter()
.enumerate()
.map(|(i, v)| (i, euclidean_distance(query, v)))
.collect();
results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal));
results.truncate(k);
results
}
}
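A quick standalone check of the brute-force logic above (toy 2-d vectors, hypothetical and not part of the benchmarks): a query that matches a stored vector exactly should rank first at distance zero.

```rust
// Same Euclidean distance as vector_search::euclidean_distance above.
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

fn main() {
    let vectors = vec![vec![0.0, 0.0], vec![3.0, 4.0], vec![1.0, 1.0]];
    let query = vec![3.0, 4.0];
    // Brute-force scan, sorted ascending by distance.
    let mut results: Vec<(usize, f32)> = vectors
        .iter()
        .enumerate()
        .map(|(i, v)| (i, euclidean_distance(&query, v)))
        .collect();
    results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    // The stored copy of the query wins at distance zero.
    assert_eq!(results[0], (1, 0.0));
    // 3-4-5 triangle: distance from the origin is exactly 5.
    assert_eq!(euclidean_distance(&[0.0, 0.0], &[3.0, 4.0]), 5.0);
    println!("nearest: {:?}", results[0]);
}
```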
// ============================================================================
// Fusion Algorithms
// ============================================================================
mod fusion {
use std::collections::HashMap;
/// Reciprocal Rank Fusion
pub fn rrf(
vector_results: &[(usize, f32)],
text_results: &[(usize, f64)],
k: usize,
rrf_k: f64,
) -> Vec<(usize, f64)> {
let mut scores: HashMap<usize, f64> = HashMap::new();
// Vector results
for (rank, (doc_id, _)) in vector_results.iter().enumerate() {
let rrf_score = 1.0 / (rrf_k + rank as f64 + 1.0);
*scores.entry(*doc_id).or_insert(0.0) += rrf_score;
}
// Text results
for (rank, (doc_id, _)) in text_results.iter().enumerate() {
let rrf_score = 1.0 / (rrf_k + rank as f64 + 1.0);
*scores.entry(*doc_id).or_insert(0.0) += rrf_score;
}
let mut results: Vec<(usize, f64)> = scores.into_iter().collect();
results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
results.truncate(k);
results
}
    /// Weighted score fusion (normalizes both score sets internally before combining)
pub fn weighted_sum(
vector_results: &[(usize, f32)],
text_results: &[(usize, f64)],
k: usize,
vector_weight: f64,
text_weight: f64,
) -> Vec<(usize, f64)> {
// Normalize vector scores (lower distance = higher score)
let max_dist = vector_results
.iter()
.map(|(_, d)| *d)
.fold(0.0f32, f32::max);
let vector_scores: HashMap<usize, f64> = vector_results
.iter()
.map(|(id, dist)| (*id, (1.0 - dist / max_dist.max(1e-6)) as f64))
.collect();
// Normalize text scores
let max_text = text_results.iter().map(|(_, s)| *s).fold(0.0f64, f64::max);
let text_scores: HashMap<usize, f64> = text_results
.iter()
.map(|(id, score)| (*id, score / max_text.max(1e-6)))
.collect();
// Combine
let mut all_ids: std::collections::HashSet<usize> = std::collections::HashSet::new();
all_ids.extend(vector_scores.keys());
all_ids.extend(text_scores.keys());
let mut results: Vec<(usize, f64)> = all_ids
.iter()
.map(|&id| {
let v_score = vector_scores.get(&id).copied().unwrap_or(0.0);
let t_score = text_scores.get(&id).copied().unwrap_or(0.0);
(id, vector_weight * v_score + text_weight * t_score)
})
.collect();
results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
results.truncate(k);
results
}
/// Disjunctive Normalization
pub fn disjunctive_normalization(
vector_results: &[(usize, f32)],
text_results: &[(usize, f64)],
k: usize,
) -> Vec<(usize, f64)> {
let mut scores: HashMap<usize, f64> = HashMap::new();
// Vector results (convert distance to similarity)
let max_dist = vector_results
.iter()
.map(|(_, d)| *d)
.fold(0.0f32, f32::max);
for (doc_id, dist) in vector_results {
let sim = 1.0 - (*dist / max_dist.max(1e-6)) as f64;
scores.insert(*doc_id, sim);
}
// Text results (add if not present, max if present)
let max_text = text_results.iter().map(|(_, s)| *s).fold(0.0f64, f64::max);
for (doc_id, score) in text_results {
let norm_score = score / max_text.max(1e-6);
let current = scores.entry(*doc_id).or_insert(0.0);
*current = current.max(norm_score);
}
let mut results: Vec<(usize, f64)> = scores.into_iter().collect();
results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
results.truncate(k);
results
}
}
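Reciprocal Rank Fusion scores each document as the sum of 1/(rrf_k + rank + 1) over the result lists, so a document ranked highly in both branches dominates. A minimal standalone sketch mirroring the `fusion::rrf` logic above (toy document IDs, hypothetical inputs):

```rust
use std::collections::HashMap;

// RRF over two already-ranked ID lists; same scoring scheme as fusion::rrf above.
fn rrf_fuse(vector_ranked: &[usize], text_ranked: &[usize], rrf_k: f64) -> Vec<(usize, f64)> {
    let mut scores: HashMap<usize, f64> = HashMap::new();
    for (rank, id) in vector_ranked.iter().enumerate() {
        *scores.entry(*id).or_insert(0.0) += 1.0 / (rrf_k + rank as f64 + 1.0);
    }
    for (rank, id) in text_ranked.iter().enumerate() {
        *scores.entry(*id).or_insert(0.0) += 1.0 / (rrf_k + rank as f64 + 1.0);
    }
    let mut out: Vec<(usize, f64)> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    // Doc 7 is ranked first in both lists, so it should win the fused ranking.
    let fused = rrf_fuse(&[7, 3, 1], &[7, 1, 9], 60.0);
    assert_eq!(fused[0].0, 7);
    println!("top fused doc: {}", fused[0].0);
}
```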
use bm25::{tokenize, BM25Index};
use fusion::{disjunctive_normalization, rrf, weighted_sum};
use vector_search::{search as vector_search_fn, search_parallel as vector_search_parallel};
// ============================================================================
// Test Data Generation
// ============================================================================
fn generate_random_vectors(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
let mut rng = ChaCha8Rng::seed_from_u64(seed);
(0..n)
.map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
.collect()
}
fn generate_random_documents(n: usize, seed: u64) -> Vec<String> {
let words = [
"machine",
"learning",
"artificial",
"intelligence",
"neural",
"network",
"deep",
"training",
"model",
"data",
"algorithm",
"optimization",
"gradient",
"descent",
"backpropagation",
"convolution",
"recurrent",
"transformer",
"attention",
"embedding",
"vector",
"search",
"similarity",
"distance",
"nearest",
"neighbor",
"index",
"query",
"retrieval",
"ranking",
"database",
"storage",
"distributed",
"parallel",
"processing",
];
let mut rng = ChaCha8Rng::seed_from_u64(seed);
(0..n)
.map(|_| {
let len = rng.gen_range(20..100);
(0..len)
.map(|_| words[rng.gen_range(0..words.len())])
.collect::<Vec<_>>()
.join(" ")
})
.collect()
}
// ============================================================================
// Vector-Only vs Hybrid Benchmarks
// ============================================================================
fn bench_vector_only(c: &mut Criterion) {
let mut group = c.benchmark_group("Vector Only Search");
for &n in [10_000, 100_000].iter() {
let dims = 768;
let vectors = generate_random_vectors(n, dims, 42);
let query = vectors[0].clone();
group.throughput(Throughput::Elements(n as u64));
group.bench_with_input(BenchmarkId::new("sequential", n), &n, |bench, _| {
bench.iter(|| black_box(vector_search_fn(&vectors, &query, 10)))
});
group.bench_with_input(BenchmarkId::new("parallel", n), &n, |bench, _| {
bench.iter(|| black_box(vector_search_parallel(&vectors, &query, 10)))
});
}
group.finish();
}
fn bench_text_only(c: &mut Criterion) {
let mut group = c.benchmark_group("Text Only (BM25) Search");
for &n in [10_000, 100_000].iter() {
let documents = generate_random_documents(n, 42);
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(&documents);
let query = "machine learning neural network";
group.throughput(Throughput::Elements(n as u64));
group.bench_with_input(BenchmarkId::from_parameter(n), &n, |bench, _| {
bench.iter(|| black_box(bm25.search(query, 10)))
});
}
group.finish();
}
fn bench_hybrid_search(c: &mut Criterion) {
let mut group = c.benchmark_group("Hybrid Search");
for &n in [10_000, 100_000].iter() {
let dims = 768;
let vectors = generate_random_vectors(n, dims, 42);
let documents = generate_random_documents(n, 42);
let vector_query = vectors[0].clone();
let text_query = "machine learning neural network";
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(&documents);
group.throughput(Throughput::Elements(n as u64));
// Sequential hybrid
group.bench_with_input(BenchmarkId::new("sequential", n), &n, |bench, _| {
bench.iter(|| {
let vector_results = vector_search_fn(&vectors, &vector_query, 100);
let text_results = bm25.search(text_query, 100);
black_box(rrf(&vector_results, &text_results, 10, 60.0))
})
});
// Parallel hybrid (branches)
group.bench_with_input(BenchmarkId::new("parallel_branches", n), &n, |bench, _| {
bench.iter(|| {
let (vector_results, text_results) = rayon::join(
|| vector_search_parallel(&vectors, &vector_query, 100),
|| bm25.search(text_query, 100),
);
black_box(rrf(&vector_results, &text_results, 10, 60.0))
})
});
}
group.finish();
}
// ============================================================================
// BM25 Overhead Benchmarks
// ============================================================================
fn bench_bm25_build(c: &mut Criterion) {
let mut group = c.benchmark_group("BM25 Index Build");
for &n in [1_000, 10_000, 100_000].iter() {
let documents = generate_random_documents(n, 42);
group.throughput(Throughput::Elements(n as u64));
group.bench_with_input(BenchmarkId::from_parameter(n), &documents, |bench, docs| {
bench.iter(|| {
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(docs);
black_box(bm25)
})
});
}
group.finish();
}
fn bench_bm25_query_lengths(c: &mut Criterion) {
let mut group = c.benchmark_group("BM25 Query Length");
let n = 100_000;
let documents = generate_random_documents(n, 42);
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(&documents);
let queries = [
"machine",
"machine learning",
"machine learning neural network",
"machine learning neural network deep training model",
"machine learning neural network deep training model algorithm optimization gradient descent",
];
for query in queries.iter() {
let token_count = tokenize(query).len();
group.bench_with_input(
BenchmarkId::new("tokens", token_count),
query,
|bench, q| bench.iter(|| black_box(bm25.search(q, 10))),
);
}
group.finish();
}
// ============================================================================
// Fusion Algorithm Comparison
// ============================================================================
fn bench_fusion_algorithms(c: &mut Criterion) {
let mut group = c.benchmark_group("Fusion Algorithms");
let n = 100_000;
let dims = 768;
let vectors = generate_random_vectors(n, dims, 42);
let documents = generate_random_documents(n, 42);
let vector_query = vectors[0].clone();
let text_query = "machine learning neural network";
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(&documents);
// Pre-compute search results
let vector_results = vector_search_fn(&vectors, &vector_query, 1000);
let text_results = bm25.search(text_query, 1000);
for &k in [10, 50, 100].iter() {
group.bench_with_input(BenchmarkId::new("rrf", k), &k, |bench, &k_val| {
bench.iter(|| black_box(rrf(&vector_results, &text_results, k_val, 60.0)))
});
group.bench_with_input(BenchmarkId::new("weighted_sum", k), &k, |bench, &k_val| {
bench.iter(|| {
black_box(weighted_sum(
&vector_results,
&text_results,
k_val,
0.6,
0.4,
))
})
});
group.bench_with_input(
BenchmarkId::new("disjunctive_norm", k),
&k,
|bench, &k_val| {
bench.iter(|| {
black_box(disjunctive_normalization(
&vector_results,
&text_results,
k_val,
))
})
},
);
}
group.finish();
}
fn bench_rrf_k_parameter(c: &mut Criterion) {
let mut group = c.benchmark_group("RRF K Parameter");
let n = 100_000;
let dims = 768;
let vectors = generate_random_vectors(n, dims, 42);
let documents = generate_random_documents(n, 42);
let vector_query = vectors[0].clone();
let text_query = "machine learning neural network";
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(&documents);
let vector_results = vector_search_fn(&vectors, &vector_query, 1000);
let text_results = bm25.search(text_query, 1000);
for &rrf_k in [1.0, 20.0, 60.0, 100.0, 200.0].iter() {
group.bench_with_input(
BenchmarkId::from_parameter(rrf_k as i32),
&rrf_k,
|bench, &k| bench.iter(|| black_box(rrf(&vector_results, &text_results, 10, k))),
);
}
group.finish();
}
fn bench_weight_ratios(c: &mut Criterion) {
let mut group = c.benchmark_group("Weight Ratios");
let n = 100_000;
let dims = 768;
let vectors = generate_random_vectors(n, dims, 42);
let documents = generate_random_documents(n, 42);
let vector_query = vectors[0].clone();
let text_query = "machine learning neural network";
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(&documents);
let vector_results = vector_search_fn(&vectors, &vector_query, 1000);
let text_results = bm25.search(text_query, 1000);
let ratios = [
(0.0, 1.0, "text_only"),
(0.3, 0.7, "text_heavy"),
(0.5, 0.5, "balanced"),
(0.7, 0.3, "vector_heavy"),
(1.0, 0.0, "vector_only"),
];
for (vector_w, text_w, name) in ratios.iter() {
group.bench_with_input(
BenchmarkId::from_parameter(name),
&(*vector_w, *text_w),
|bench, &(v_w, t_w)| {
bench.iter(|| black_box(weighted_sum(&vector_results, &text_results, 10, v_w, t_w)))
},
);
}
group.finish();
}
// ============================================================================
// Parallel Branch Execution
// ============================================================================
fn bench_parallel_execution_gain(c: &mut Criterion) {
let mut group = c.benchmark_group("Parallel Branch Execution");
for &n in [10_000, 50_000, 100_000].iter() {
let dims = 768;
let vectors = generate_random_vectors(n, dims, 42);
let documents = generate_random_documents(n, 42);
let vector_query = vectors[0].clone();
let text_query = "machine learning neural network";
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(&documents);
// Sequential
group.bench_with_input(BenchmarkId::new("sequential", n), &n, |bench, _| {
bench.iter(|| {
let vector_results = vector_search_fn(&vectors, &vector_query, 100);
let text_results = bm25.search(text_query, 100);
black_box((vector_results, text_results))
})
});
// Parallel with rayon::join
group.bench_with_input(BenchmarkId::new("parallel_join", n), &n, |bench, _| {
bench.iter(|| {
let (vector_results, text_results) = rayon::join(
|| vector_search_fn(&vectors, &vector_query, 100),
|| bm25.search(text_query, 100),
);
black_box((vector_results, text_results))
})
});
// Parallel vector search only
group.bench_with_input(BenchmarkId::new("parallel_vector", n), &n, |bench, _| {
bench.iter(|| {
let vector_results = vector_search_parallel(&vectors, &vector_query, 100);
let text_results = bm25.search(text_query, 100);
black_box((vector_results, text_results))
})
});
// Full parallel
group.bench_with_input(BenchmarkId::new("full_parallel", n), &n, |bench, _| {
bench.iter(|| {
let (vector_results, text_results) = rayon::join(
|| vector_search_parallel(&vectors, &vector_query, 100),
|| bm25.search(text_query, 100),
);
black_box((vector_results, text_results))
})
});
}
group.finish();
}
// ============================================================================
// Candidate Count Analysis
// ============================================================================
fn bench_candidate_counts(c: &mut Criterion) {
let mut group = c.benchmark_group("Candidate Count Analysis");
let n = 100_000;
let dims = 768;
let vectors = generate_random_vectors(n, dims, 42);
let documents = generate_random_documents(n, 42);
let vector_query = vectors[0].clone();
let text_query = "machine learning neural network";
let mut bm25 = BM25Index::new(1.2, 0.75);
bm25.build(&documents);
for &candidates in [50, 100, 200, 500, 1000, 2000].iter() {
group.bench_with_input(
BenchmarkId::from_parameter(candidates),
&candidates,
|bench, &k_candidates| {
bench.iter(|| {
let (vector_results, text_results) = rayon::join(
|| vector_search_parallel(&vectors, &vector_query, k_candidates),
|| bm25.search(text_query, k_candidates),
);
black_box(rrf(&vector_results, &text_results, 10, 60.0))
})
},
);
}
group.finish();
}
criterion_group!(
benches,
// Vector vs Text
bench_vector_only,
bench_text_only,
bench_hybrid_search,
// BM25 Overhead
bench_bm25_build,
bench_bm25_query_lengths,
// Fusion Algorithms
bench_fusion_algorithms,
bench_rrf_k_parameter,
bench_weight_ratios,
// Parallel Execution
bench_parallel_execution_gain,
bench_candidate_counts,
);
criterion_main!(benches);

File diff suppressed because it is too large


@@ -0,0 +1,915 @@
//! Index integrity and graph maintenance benchmarks
//!
//! Benchmarks for v2 structural integrity features:
//! - Contracted graph construction
//! - Mincut computation time
//! - State transition overhead
//! - Gating check latency
//! - Graph connectivity verification
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;
use std::cmp::Ordering;
use std::collections::{BinaryHeap, HashMap, HashSet, VecDeque};
// ============================================================================
// Graph Structures for Index Integrity
// ============================================================================
mod graph {
use std::cmp::Ordering;
use std::collections::{BinaryHeap, HashMap, HashSet, VecDeque};
/// Node in the HNSW graph (simplified)
#[derive(Clone)]
pub struct GraphNode {
pub id: u64,
pub neighbors: Vec<u64>,
pub layer: usize,
}
/// Graph for integrity checking
pub struct Graph {
pub nodes: HashMap<u64, GraphNode>,
pub max_layer: usize,
}
impl Graph {
pub fn new() -> Self {
Self {
nodes: HashMap::new(),
max_layer: 0,
}
}
pub fn add_node(&mut self, id: u64, layer: usize) {
self.nodes.insert(
id,
GraphNode {
id,
neighbors: Vec::new(),
layer,
},
);
self.max_layer = self.max_layer.max(layer);
}
pub fn add_edge(&mut self, from: u64, to: u64) {
if let Some(node) = self.nodes.get_mut(&from) {
if !node.neighbors.contains(&to) {
node.neighbors.push(to);
}
}
}
pub fn len(&self) -> usize {
self.nodes.len()
}
}
/// Contracted graph for integrity verification
pub struct ContractedGraph {
/// Super-nodes (contracted regions)
pub super_nodes: Vec<SuperNode>,
/// Edges between super-nodes
pub super_edges: Vec<(usize, usize, f32)>,
/// Node to super-node mapping
pub node_mapping: HashMap<u64, usize>,
}
#[derive(Clone)]
pub struct SuperNode {
pub id: usize,
pub original_nodes: Vec<u64>,
pub internal_edges: usize,
}
impl ContractedGraph {
pub fn new() -> Self {
Self {
super_nodes: Vec::new(),
super_edges: Vec::new(),
node_mapping: HashMap::new(),
}
}
/// Build contracted graph from original graph
pub fn build_from_graph(graph: &Graph, contraction_factor: usize) -> Self {
let mut contracted = ContractedGraph::new();
// Group nodes by region (simplified partitioning)
let node_ids: Vec<u64> = graph.nodes.keys().copied().collect();
for (i, chunk) in node_ids.chunks(contraction_factor).enumerate() {
let super_node = SuperNode {
id: i,
original_nodes: chunk.to_vec(),
internal_edges: chunk
.iter()
.filter_map(|&id| graph.nodes.get(&id))
.flat_map(|n| n.neighbors.iter())
.filter(|&&neighbor| chunk.contains(&neighbor))
.count(),
};
for &node_id in chunk {
contracted.node_mapping.insert(node_id, i);
}
contracted.super_nodes.push(super_node);
}
// Build super edges
let mut edge_weights: HashMap<(usize, usize), f32> = HashMap::new();
for node in graph.nodes.values() {
let from_super = contracted.node_mapping[&node.id];
for &neighbor in &node.neighbors {
if let Some(&to_super) = contracted.node_mapping.get(&neighbor) {
if from_super != to_super {
let key = if from_super < to_super {
(from_super, to_super)
} else {
(to_super, from_super)
};
*edge_weights.entry(key).or_insert(0.0) += 1.0;
}
}
}
}
contracted.super_edges = edge_weights
.into_iter()
.map(|((a, b), w)| (a, b, w))
.collect();
contracted
}
pub fn num_super_nodes(&self) -> usize {
self.super_nodes.len()
}
pub fn num_super_edges(&self) -> usize {
self.super_edges.len()
}
}
    /// Mincut computation via max-flow (Edmonds-Karp: Ford-Fulkerson with BFS augmenting paths)
pub struct MincutComputer {
/// Adjacency list with capacities
adj: Vec<Vec<(usize, f32)>>,
pub n: usize,
}
impl MincutComputer {
pub fn from_contracted_graph(contracted: &ContractedGraph) -> Self {
let n = contracted.num_super_nodes();
let mut adj: Vec<Vec<(usize, f32)>> = vec![Vec::new(); n];
for &(a, b, w) in &contracted.super_edges {
adj[a].push((b, w));
adj[b].push((a, w));
}
Self { adj, n }
}
/// Find mincut using BFS-based augmenting paths
pub fn compute_mincut(&self, source: usize, sink: usize) -> f32 {
if source == sink || self.n == 0 {
return 0.0;
}
// Create residual capacity matrix
let mut residual: Vec<Vec<f32>> = vec![vec![0.0; self.n]; self.n];
for (from, edges) in self.adj.iter().enumerate() {
for &(to, cap) in edges {
residual[from][to] = cap;
}
}
let mut max_flow = 0.0;
// BFS to find augmenting path
loop {
let mut parent = vec![None; self.n];
let mut visited = vec![false; self.n];
let mut queue = VecDeque::new();
visited[source] = true;
queue.push_back(source);
while let Some(u) = queue.pop_front() {
for v in 0..self.n {
if !visited[v] && residual[u][v] > 0.0 {
visited[v] = true;
parent[v] = Some(u);
queue.push_back(v);
}
}
}
if !visited[sink] {
break;
}
// Find minimum residual capacity along path
let mut path_flow = f32::MAX;
let mut v = sink;
while let Some(u) = parent[v] {
path_flow = path_flow.min(residual[u][v]);
v = u;
}
// Update residual capacities
v = sink;
while let Some(u) = parent[v] {
residual[u][v] -= path_flow;
residual[v][u] += path_flow;
v = u;
}
max_flow += path_flow;
}
max_flow
}
/// Compute global mincut (minimum over all pairs)
pub fn compute_global_mincut(&self) -> f32 {
if self.n <= 1 {
return 0.0;
}
let mut min_cut = f32::MAX;
// Fix node 0 as the source and take the minimum s-t cut over all sinks;
// for an undirected graph this yields the global mincut at the cost of
// n - 1 max-flow runs (Stoer-Wagner would avoid them)
for sink in 1..self.n {
let cut = self.compute_mincut(0, sink);
min_cut = min_cut.min(cut);
}
min_cut
}
}
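// Illustrative sketch (not part of the benchmarks): the same BFS
// augmenting-path scheme on a hard-coded 4-node diamond with undirected
// capacities 0-1: 3, 1-3: 2, 0-2: 2, 2-3: 3. The minimum 0-3 cut is
// {0,1} | {2,3}, crossing edges 0-2 (2.0) and 1-3 (2.0), so the max flow is 4.0.
#[allow(dead_code)]
fn demo_edmonds_karp() -> f32 {
    use std::collections::VecDeque;
    let n = 4;
    let mut residual = vec![vec![0.0f32; n]; n];
    for &(a, b, c) in &[(0usize, 1usize, 3.0f32), (1, 3, 2.0), (0, 2, 2.0), (2, 3, 3.0)] {
        residual[a][b] = c;
        residual[b][a] = c;
    }
    let (source, sink) = (0usize, 3usize);
    let mut max_flow = 0.0f32;
    loop {
        // BFS for an augmenting path in the residual graph
        let mut parent = vec![None; n];
        let mut visited = vec![false; n];
        let mut queue = VecDeque::new();
        visited[source] = true;
        queue.push_back(source);
        while let Some(u) = queue.pop_front() {
            for v in 0..n {
                if !visited[v] && residual[u][v] > 0.0 {
                    visited[v] = true;
                    parent[v] = Some(u);
                    queue.push_back(v);
                }
            }
        }
        if !visited[sink] {
            break;
        }
        // Bottleneck capacity along the path
        let mut path_flow = f32::MAX;
        let mut v = sink;
        while let Some(u) = parent[v] {
            path_flow = path_flow.min(residual[u][v]);
            v = u;
        }
        // Push flow and update residual capacities
        v = sink;
        while let Some(u) = parent[v] {
            residual[u][v] -= path_flow;
            residual[v][u] += path_flow;
            v = u;
        }
        max_flow += path_flow;
    }
    max_flow
}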
/// State machine for index integrity
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IndexState {
Uninitialized,
Building,
Ready,
Updating,
Corrupted,
Recovering,
}
pub struct IndexStateMachine {
pub state: IndexState,
pub transition_count: usize,
pub last_integrity_check: std::time::Instant,
pub integrity_score: f32,
}
impl IndexStateMachine {
pub fn new() -> Self {
Self {
state: IndexState::Uninitialized,
transition_count: 0,
last_integrity_check: std::time::Instant::now(),
integrity_score: 1.0,
}
}
pub fn can_transition(&self, to: IndexState) -> bool {
match (self.state, to) {
(IndexState::Uninitialized, IndexState::Building) => true,
(IndexState::Building, IndexState::Ready) => true,
(IndexState::Ready, IndexState::Updating) => true,
(IndexState::Updating, IndexState::Ready) => true,
(_, IndexState::Corrupted) => true,
(IndexState::Corrupted, IndexState::Recovering) => true,
(IndexState::Recovering, IndexState::Ready) => true,
_ => false,
}
}
pub fn transition(&mut self, to: IndexState) -> Result<(), &'static str> {
if self.can_transition(to) {
self.state = to;
self.transition_count += 1;
Ok(())
} else {
Err("Invalid state transition")
}
}
}
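// Illustrative sketch: a self-contained copy of the core transition table so
// the allowed lifecycle can be checked by hand. An index must pass through
// Ready before it may be updated; Building -> Updating is rejected.
#[allow(dead_code)]
fn demo_lifecycle_rules() -> (bool, bool) {
    #[derive(Clone, Copy, PartialEq, Eq)]
    enum S { Uninitialized, Building, Ready, Updating }
    fn allowed(from: S, to: S) -> bool {
        matches!(
            (from, to),
            (S::Uninitialized, S::Building)
                | (S::Building, S::Ready)
                | (S::Ready, S::Updating)
                | (S::Updating, S::Ready)
        )
    }
    (allowed(S::Uninitialized, S::Building), allowed(S::Building, S::Updating))
}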
/// Gating check for index operations
pub struct GatingCheck {
/// Minimum connectivity threshold
pub min_connectivity: f32,
/// Maximum allowed dead nodes
pub max_dead_nodes_ratio: f32,
/// Maximum layer imbalance
pub max_layer_imbalance: f32,
}
impl Default for GatingCheck {
    fn default() -> Self {
        Self {
            min_connectivity: 0.95,
            max_dead_nodes_ratio: 0.01,
            max_layer_imbalance: 2.0,
        }
    }
}
impl GatingCheck {
/// Check if graph passes all gates
pub fn check(&self, graph: &Graph) -> GatingResult {
let connectivity = self.check_connectivity(graph);
let dead_ratio = self.check_dead_nodes(graph);
let layer_balance = self.check_layer_balance(graph);
GatingResult {
passed: connectivity >= self.min_connectivity
&& dead_ratio <= self.max_dead_nodes_ratio
&& layer_balance <= self.max_layer_imbalance,
connectivity,
dead_nodes_ratio: dead_ratio,
layer_imbalance: layer_balance,
}
}
fn check_connectivity(&self, graph: &Graph) -> f32 {
if graph.len() <= 1 {
return 1.0;
}
// BFS from first node
let start = *graph.nodes.keys().next().unwrap();
let mut visited = HashSet::new();
let mut queue = VecDeque::new();
visited.insert(start);
queue.push_back(start);
while let Some(node) = queue.pop_front() {
if let Some(n) = graph.nodes.get(&node) {
for &neighbor in &n.neighbors {
if !visited.contains(&neighbor) && graph.nodes.contains_key(&neighbor) {
visited.insert(neighbor);
queue.push_back(neighbor);
}
}
}
}
visited.len() as f32 / graph.len() as f32
}
    fn check_dead_nodes(&self, graph: &Graph) -> f32 {
        if graph.len() == 0 {
            return 0.0;
        }
        let dead_count = graph
            .nodes
            .values()
            .filter(|n| n.neighbors.is_empty())
            .count();
        dead_count as f32 / graph.len() as f32
    }
fn check_layer_balance(&self, graph: &Graph) -> f32 {
if graph.max_layer == 0 {
return 1.0;
}
let mut layer_counts = vec![0usize; graph.max_layer + 1];
for node in graph.nodes.values() {
layer_counts[node.layer] += 1;
}
let max_count = layer_counts.iter().max().copied().unwrap_or(1) as f32;
let min_count = layer_counts
.iter()
.filter(|&&c| c > 0)
.min()
.copied()
.unwrap_or(1) as f32;
max_count / min_count
}
}
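// Illustrative sketch of the connectivity gate: BFS from one node over a
// hypothetical 4-node graph where 0-1-2 are connected and node 3 is isolated,
// so the reachable fraction is 3/4 = 0.75 (below the 0.95 gate).
#[allow(dead_code)]
fn demo_connectivity_ratio() -> f32 {
    use std::collections::{HashMap, HashSet, VecDeque};
    let mut nodes: HashMap<u64, Vec<u64>> = HashMap::new();
    nodes.insert(0, vec![1]);
    nodes.insert(1, vec![0, 2]);
    nodes.insert(2, vec![1]);
    nodes.insert(3, vec![]);
    let start = 0u64;
    let mut visited = HashSet::new();
    let mut queue = VecDeque::new();
    visited.insert(start);
    queue.push_back(start);
    while let Some(node) = queue.pop_front() {
        if let Some(neighbors) = nodes.get(&node) {
            for &nb in neighbors {
                if !visited.contains(&nb) && nodes.contains_key(&nb) {
                    visited.insert(nb);
                    queue.push_back(nb);
                }
            }
        }
    }
    visited.len() as f32 / nodes.len() as f32
}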
#[derive(Debug)]
pub struct GatingResult {
pub passed: bool,
pub connectivity: f32,
pub dead_nodes_ratio: f32,
pub layer_imbalance: f32,
}
}
use graph::{ContractedGraph, GatingCheck, Graph, IndexState, IndexStateMachine, MincutComputer};
// ============================================================================
// Test Data Generation
// ============================================================================
fn generate_random_graph(n: usize, avg_neighbors: usize, max_layer: usize, seed: u64) -> Graph {
let mut rng = ChaCha8Rng::seed_from_u64(seed);
let mut graph = Graph::new();
// Add nodes with random layers
for id in 0..n {
let layer = if id == 0 {
max_layer
} else {
let ml = 1.0 / (16.0_f64).ln();
let r: f64 = rng.gen();
((-r.ln() * ml).floor() as usize).min(max_layer)
};
graph.add_node(id as u64, layer);
}
// Add random edges (maintaining HNSW-like structure)
for id in 0..n {
let num_neighbors = rng.gen_range(1..=avg_neighbors * 2);
for _ in 0..num_neighbors {
let neighbor = rng.gen_range(0..n) as u64;
if neighbor != id as u64 {
graph.add_edge(id as u64, neighbor);
}
}
}
graph
}
fn generate_connected_graph(n: usize, avg_neighbors: usize, seed: u64) -> Graph {
let mut rng = ChaCha8Rng::seed_from_u64(seed);
let mut graph = Graph::new();
// Add nodes
for id in 0..n {
let layer = if id == 0 { 5 } else { rng.gen_range(0..=5) };
graph.add_node(id as u64, layer);
}
// Ensure connectivity: chain all nodes
for id in 1..n {
graph.add_edge(id as u64, (id - 1) as u64);
graph.add_edge((id - 1) as u64, id as u64);
}
// Add random extra edges
for id in 0..n {
let num_extra = rng.gen_range(0..avg_neighbors);
for _ in 0..num_extra {
let neighbor = rng.gen_range(0..n) as u64;
if neighbor != id as u64 {
graph.add_edge(id as u64, neighbor);
}
}
}
graph
}
// ============================================================================
// Contracted Graph Benchmarks
// ============================================================================
fn bench_contracted_graph_build(c: &mut Criterion) {
let mut group = c.benchmark_group("Contracted Graph Build");
group.sample_size(10);
for &n in [1_000, 10_000, 100_000].iter() {
let graph = generate_connected_graph(n, 16, 42);
for &factor in [10, 50, 100, 500].iter() {
if factor > n {
continue;
}
group.bench_with_input(
BenchmarkId::new(format!("n{}_factor{}", n, factor), n),
&(&graph, factor),
|bench, (g, f)| bench.iter(|| black_box(ContractedGraph::build_from_graph(g, *f))),
);
}
}
group.finish();
}
fn bench_contracted_graph_memory(c: &mut Criterion) {
let mut group = c.benchmark_group("Contracted Graph Memory");
group.sample_size(10);
for &n in [10_000, 100_000].iter() {
let graph = generate_connected_graph(n, 16, 42);
for &factor in [10, 50, 100].iter() {
group.bench_with_input(
BenchmarkId::new(format!("n{}_factor{}", n, factor), n),
&(&graph, factor),
|bench, (g, f)| {
bench.iter(|| {
let contracted = ContractedGraph::build_from_graph(g, *f);
// Calculate memory usage
let super_node_mem = contracted
.super_nodes
.iter()
.map(|sn| sn.original_nodes.len() * 8)
.sum::<usize>();
let edge_mem = contracted.super_edges.len()
    * std::mem::size_of::<(usize, usize, f32)>(); // 24 bytes on 64-bit (padding included)
let mapping_mem = contracted.node_mapping.len() * 16;
black_box(super_node_mem + edge_mem + mapping_mem)
})
},
);
}
}
group.finish();
}
// ============================================================================
// Mincut Computation Benchmarks
// ============================================================================
fn bench_mincut_compute(c: &mut Criterion) {
let mut group = c.benchmark_group("Mincut Computation");
group.sample_size(10);
for &n in [1_000, 5_000, 10_000].iter() {
let graph = generate_connected_graph(n, 16, 42);
let contracted = ContractedGraph::build_from_graph(&graph, 50);
let mincut_computer = MincutComputer::from_contracted_graph(&contracted);
group.bench_with_input(
BenchmarkId::new("single_pair", n),
&mincut_computer,
|bench, mc| bench.iter(|| black_box(mc.compute_mincut(0, mc.n - 1))),
);
group.bench_with_input(
BenchmarkId::new("global", n),
&mincut_computer,
|bench, mc| bench.iter(|| black_box(mc.compute_global_mincut())),
);
}
group.finish();
}
fn bench_mincut_contraction_factors(c: &mut Criterion) {
let mut group = c.benchmark_group("Mincut vs Contraction Factor");
group.sample_size(10);
let n = 10_000;
let graph = generate_connected_graph(n, 16, 42);
for &factor in [10, 25, 50, 100, 200].iter() {
let contracted = ContractedGraph::build_from_graph(&graph, factor);
let mincut_computer = MincutComputer::from_contracted_graph(&contracted);
group.bench_with_input(
BenchmarkId::from_parameter(factor),
&mincut_computer,
|bench, mc| bench.iter(|| black_box(mc.compute_global_mincut())),
);
}
group.finish();
}
// ============================================================================
// State Transition Benchmarks
// ============================================================================
fn bench_state_transitions(c: &mut Criterion) {
let mut group = c.benchmark_group("State Transitions");
// Single transition
group.bench_function("single_transition", |bench| {
bench.iter(|| {
let mut sm = IndexStateMachine::new();
black_box(sm.transition(IndexState::Building))
})
});
// Full lifecycle
group.bench_function("full_lifecycle", |bench| {
bench.iter(|| {
let mut sm = IndexStateMachine::new();
sm.transition(IndexState::Building).ok();
sm.transition(IndexState::Ready).ok();
sm.transition(IndexState::Updating).ok();
sm.transition(IndexState::Ready).ok();
black_box(sm.state)
})
});
// Transition check only (no mutation)
group.bench_function("transition_check", |bench| {
let sm = IndexStateMachine::new();
bench.iter(|| black_box(sm.can_transition(IndexState::Building)))
});
// Many transitions
group.bench_function("1000_transitions", |bench| {
bench.iter(|| {
let mut sm = IndexStateMachine::new();
sm.transition(IndexState::Building).ok();
sm.transition(IndexState::Ready).ok();
for _ in 0..500 {
sm.transition(IndexState::Updating).ok();
sm.transition(IndexState::Ready).ok();
}
black_box(sm.transition_count)
})
});
group.finish();
}
fn bench_state_machine_overhead(c: &mut Criterion) {
let mut group = c.benchmark_group("State Machine Overhead");
// Measure overhead of state checking before operations
let graph = generate_connected_graph(10_000, 16, 42);
group.bench_function("with_state_check", |bench| {
let mut sm = IndexStateMachine::new();
sm.transition(IndexState::Building).ok();
sm.transition(IndexState::Ready).ok();
bench.iter(|| {
// Simulate operation with state check
if sm.state == IndexState::Ready {
// Perform "operation"
let count = graph.nodes.len();
black_box(count)
} else {
black_box(0)
}
})
});
group.bench_function("without_state_check", |bench| {
bench.iter(|| {
// Perform operation directly
let count = graph.nodes.len();
black_box(count)
})
});
group.finish();
}
// ============================================================================
// Gating Check Benchmarks
// ============================================================================
fn bench_gating_check(c: &mut Criterion) {
let mut group = c.benchmark_group("Gating Check");
for &n in [1_000, 10_000, 100_000].iter() {
let graph = generate_connected_graph(n, 16, 42);
let gating = GatingCheck::default();
group.bench_with_input(
BenchmarkId::new("full_check", n),
&(&graph, &gating),
|bench, (g, gate)| bench.iter(|| black_box(gate.check(g))),
);
}
group.finish();
}
fn bench_connectivity_check(c: &mut Criterion) {
let mut group = c.benchmark_group("Connectivity Check");
for &n in [1_000, 10_000, 100_000].iter() {
// Well-connected graph
let connected_graph = generate_connected_graph(n, 16, 42);
// Sparse graph (may have disconnected components)
let sparse_graph = generate_random_graph(n, 2, 5, 42);
let gating = GatingCheck::default();
group.bench_with_input(
BenchmarkId::new("connected", n),
&(&connected_graph, &gating),
|bench, (g, gate)| bench.iter(|| black_box(gate.check(g).connectivity)),
);
group.bench_with_input(
BenchmarkId::new("sparse", n),
&(&sparse_graph, &gating),
|bench, (g, gate)| bench.iter(|| black_box(gate.check(g).connectivity)),
);
}
group.finish();
}
fn bench_dead_node_detection(c: &mut Criterion) {
let mut group = c.benchmark_group("Dead Node Detection");
for &n in [10_000, 100_000].iter() {
let graph = generate_connected_graph(n, 16, 42);
let gating = GatingCheck::default();
group.bench_with_input(
BenchmarkId::from_parameter(n),
&(&graph, &gating),
|bench, (g, gate)| bench.iter(|| black_box(gate.check(g).dead_nodes_ratio)),
);
}
group.finish();
}
fn bench_layer_balance_check(c: &mut Criterion) {
let mut group = c.benchmark_group("Layer Balance Check");
for &n in [10_000, 100_000].iter() {
let graph = generate_random_graph(n, 16, 10, 42);
let gating = GatingCheck::default();
group.bench_with_input(
BenchmarkId::from_parameter(n),
&(&graph, &gating),
|bench, (g, gate)| bench.iter(|| black_box(gate.check(g).layer_imbalance)),
);
}
group.finish();
}
// ============================================================================
// Parallel Integrity Checks
// ============================================================================
fn bench_parallel_integrity(c: &mut Criterion) {
let mut group = c.benchmark_group("Parallel Integrity Check");
group.sample_size(10);
let n = 100_000;
let graph = generate_connected_graph(n, 16, 42);
let gating = GatingCheck::default();
// Sequential checks
group.bench_function("sequential", |bench| {
bench.iter(|| {
let result = gating.check(&graph);
black_box(result)
})
});
// Parallel checks (connectivity, dead nodes, layer balance)
group.bench_function("parallel", |bench| {
bench.iter(|| {
let (connectivity, (dead_ratio, layer_balance)) = rayon::join(
|| {
// Connectivity check
if graph.len() <= 1 {
return 1.0;
}
let start = *graph.nodes.keys().next().unwrap();
let mut visited = HashSet::new();
let mut queue = VecDeque::new();
visited.insert(start);
queue.push_back(start);
while let Some(node) = queue.pop_front() {
if let Some(n) = graph.nodes.get(&node) {
for &neighbor in &n.neighbors {
if !visited.contains(&neighbor)
&& graph.nodes.contains_key(&neighbor)
{
visited.insert(neighbor);
queue.push_back(neighbor);
}
}
}
}
visited.len() as f32 / graph.len() as f32
},
|| {
rayon::join(
|| {
// Dead nodes
let dead = graph
.nodes
.values()
.filter(|n| n.neighbors.is_empty())
.count();
dead as f32 / graph.len() as f32
},
|| {
// Layer balance
let mut layer_counts = vec![0usize; graph.max_layer + 1];
for node in graph.nodes.values() {
layer_counts[node.layer] += 1;
}
let max_count = layer_counts.iter().max().copied().unwrap_or(1) as f32;
let min_count = layer_counts
.iter()
.filter(|&&c| c > 0)
.min()
.copied()
.unwrap_or(1) as f32;
max_count / min_count
},
)
},
);
let passed = connectivity >= gating.min_connectivity
&& dead_ratio <= gating.max_dead_nodes_ratio
&& layer_balance <= gating.max_layer_imbalance;
black_box(passed)
})
});
group.finish();
}
// ============================================================================
// Complete Integrity Pipeline
// ============================================================================
fn bench_full_integrity_pipeline(c: &mut Criterion) {
let mut group = c.benchmark_group("Full Integrity Pipeline");
group.sample_size(10);
for &n in [10_000, 50_000, 100_000].iter() {
let graph = generate_connected_graph(n, 16, 42);
let gating = GatingCheck::default();
group.bench_with_input(BenchmarkId::from_parameter(n), &n, |bench, _| {
bench.iter(|| {
// 1. State check
let mut sm = IndexStateMachine::new();
sm.transition(IndexState::Building).ok();
sm.transition(IndexState::Ready).ok();
// 2. Gating check
let gate_result = gating.check(&graph);
// 3. If passed, build contracted graph
if gate_result.passed {
let contracted = ContractedGraph::build_from_graph(&graph, 100);
// 4. Compute mincut
let mincut_computer = MincutComputer::from_contracted_graph(&contracted);
let mincut = mincut_computer.compute_global_mincut();
black_box((gate_result, mincut))
} else {
black_box((gate_result, 0.0))
}
})
});
}
group.finish();
}
criterion_group!(
benches,
// Contracted Graph
bench_contracted_graph_build,
bench_contracted_graph_memory,
// Mincut
bench_mincut_compute,
bench_mincut_contraction_factors,
// State Transitions
bench_state_transitions,
bench_state_machine_overhead,
// Gating Checks
bench_gating_check,
bench_connectivity_check,
bench_dead_node_detection,
bench_layer_balance_check,
// Parallel Integrity
bench_parallel_integrity,
// Full Pipeline
bench_full_integrity_pipeline,
);
criterion_main!(benches);

@@ -0,0 +1,434 @@
//! Comprehensive quantization benchmarks
//!
//! Compares exact vs quantized search with different quantization methods
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use ruvector_postgres::types::{BinaryVec, ProductVec, RuVector, ScalarVec};
// ============================================================================
// Test Data Generation
// ============================================================================
fn generate_vectors(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
let mut rng = ChaCha8Rng::seed_from_u64(seed);
(0..n)
.map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
.collect()
}
// ============================================================================
// Scalar Quantization (SQ8) Benchmarks
// ============================================================================
fn bench_sq8_quantization(c: &mut Criterion) {
let mut group = c.benchmark_group("sq8_quantization");
for dims in [128, 384, 768, 1536, 3072].iter() {
let data: Vec<f32> = (0..*dims).map(|i| (i as f32) * 0.001).collect();
group.bench_with_input(BenchmarkId::new("encode", dims), dims, |bench, _| {
bench.iter(|| black_box(ScalarVec::from_f32(&data)));
});
let encoded = ScalarVec::from_f32(&data);
group.bench_with_input(BenchmarkId::new("decode", dims), dims, |bench, _| {
bench.iter(|| black_box(encoded.to_f32()));
});
}
group.finish();
}
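// Sketch of min/max scalar quantization to u8 and back (an assumed scheme;
// ScalarVec's actual encoding may differ). The round trip keeps every value
// within half a quantization step of the original.
#[allow(dead_code)]
fn demo_sq8_roundtrip_error() -> f32 {
    let data = [0.0f32, 0.5, 1.0];
    let (min, max) = (0.0f32, 1.0f32);
    let scale = (max - min) / 255.0;
    let codes: Vec<u8> = data.iter().map(|&x| ((x - min) / scale).round() as u8).collect();
    let decoded: Vec<f32> = codes.iter().map(|&c| min + c as f32 * scale).collect();
    // Maximum absolute reconstruction error
    data.iter()
        .zip(&decoded)
        .map(|(a, b)| (a - b).abs())
        .fold(0.0f32, f32::max)
}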
fn bench_sq8_distance(c: &mut Criterion) {
let mut group = c.benchmark_group("sq8_distance");
for dims in [128, 384, 768, 1536, 3072].iter() {
let a_data: Vec<f32> = (0..*dims).map(|i| i as f32 * 0.1).collect();
let b_data: Vec<f32> = (0..*dims).map(|i| (*dims - i) as f32 * 0.1).collect();
let a_exact = RuVector::from_slice(&a_data);
let b_exact = RuVector::from_slice(&b_data);
let a_sq8 = ScalarVec::from_f32(&a_data);
let b_sq8 = ScalarVec::from_f32(&b_data);
group.bench_with_input(BenchmarkId::new("exact", dims), dims, |bench, _| {
bench.iter(|| black_box(a_exact.dot(&b_exact)));
});
group.bench_with_input(BenchmarkId::new("quantized", dims), dims, |bench, _| {
bench.iter(|| black_box(a_sq8.distance(&b_sq8)));
});
}
group.finish();
}
fn bench_sq8_search(c: &mut Criterion) {
let mut group = c.benchmark_group("sq8_search");
for dims in [128, 768, 1536].iter() {
let n = 10000;
let vectors = generate_vectors(n, *dims, 42);
let query = generate_vectors(1, *dims, 999)[0].clone();
// Exact search
let exact_vecs: Vec<RuVector> = vectors.iter().map(|v| RuVector::from_slice(v)).collect();
let exact_query = RuVector::from_slice(&query);
group.bench_with_input(BenchmarkId::new("exact", dims), dims, |bench, _| {
bench.iter(|| {
let mut distances: Vec<(usize, f32)> = exact_vecs
.iter()
.enumerate()
.map(|(id, vec)| {
let dist = exact_query.dot(vec);
(id, -dist) // Negative for max inner product
})
.collect();
distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
let top_k: Vec<_> = distances[..10].to_vec();
black_box(top_k)
});
});
// Quantized search
let sq8_vecs: Vec<ScalarVec> = vectors.iter().map(|v| ScalarVec::from_f32(v)).collect();
let sq8_query = ScalarVec::from_f32(&query);
group.bench_with_input(BenchmarkId::new("quantized", dims), dims, |bench, _| {
bench.iter(|| {
let mut distances: Vec<(usize, f32)> = sq8_vecs
.iter()
.enumerate()
.map(|(id, vec)| (id, sq8_query.distance(vec)))
.collect();
distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
let top_k: Vec<_> = distances[..10].to_vec();
black_box(top_k)
});
});
}
group.finish();
}
// ============================================================================
// Binary Quantization Benchmarks
// ============================================================================
fn bench_binary_quantization(c: &mut Criterion) {
let mut group = c.benchmark_group("binary_quantization");
for dims in [128, 512, 1024, 2048, 4096].iter() {
let data: Vec<f32> = (0..*dims)
.map(|i| if i % 2 == 0 { 1.0 } else { -1.0 })
.collect();
group.bench_with_input(BenchmarkId::new("encode", dims), dims, |bench, _| {
bench.iter(|| black_box(BinaryVec::from_f32(&data)));
});
}
group.finish();
}
fn bench_binary_hamming(c: &mut Criterion) {
let mut group = c.benchmark_group("binary_hamming");
for dims in [128, 512, 1024, 2048, 4096, 8192].iter() {
let a_data: Vec<f32> = (0..*dims)
.map(|i| if i % 2 == 0 { 1.0 } else { -1.0 })
.collect();
let b_data: Vec<f32> = (0..*dims)
.map(|i| if i % 3 == 0 { 1.0 } else { -1.0 })
.collect();
let a = BinaryVec::from_f32(&a_data);
let b = BinaryVec::from_f32(&b_data);
group.bench_with_input(BenchmarkId::new("simd", dims), dims, |bench, _| {
bench.iter(|| black_box(a.hamming_distance(&b)));
});
}
group.finish();
}
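// Sketch of what hamming_distance computes: sign-bit packing followed by
// XOR + popcount over the packed words (assumed semantics; BinaryVec may pack
// bits differently). The two patterns below differ in sign at indices
// 1, 3, 4, and 6, so the distance is 4.
#[allow(dead_code)]
fn demo_hamming_popcount() -> u32 {
    let a: [f32; 8] = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0];
    let b: [f32; 8] = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0];
    let pack = |v: &[f32; 8]| -> u8 {
        v.iter()
            .enumerate()
            .fold(0u8, |acc, (i, &x)| if x >= 0.0 { acc | (1u8 << i) } else { acc })
    };
    (pack(&a) ^ pack(&b)).count_ones()
}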
fn bench_binary_search(c: &mut Criterion) {
let mut group = c.benchmark_group("binary_search");
for dims in [1024, 2048, 4096].iter() {
let n = 100000;
let vectors = generate_vectors(n, *dims, 42);
let query = generate_vectors(1, *dims, 999)[0].clone();
let binary_vecs: Vec<BinaryVec> = vectors.iter().map(|v| BinaryVec::from_f32(v)).collect();
let binary_query = BinaryVec::from_f32(&query);
group.bench_with_input(BenchmarkId::new("scan", dims), dims, |bench, _| {
bench.iter(|| {
let mut distances: Vec<(usize, u32)> = binary_vecs
.iter()
.enumerate()
.map(|(id, vec)| (id, binary_query.hamming_distance(vec)))
.collect();
distances.sort_by_key(|k| k.1);
let top_k: Vec<_> = distances[..10].to_vec();
black_box(top_k)
});
});
}
group.finish();
}
// ============================================================================
// Product Quantization (PQ) Benchmarks
// ============================================================================
fn bench_pq_adc_distance(c: &mut Criterion) {
let mut group = c.benchmark_group("pq_adc_distance");
for m in [8u8, 16, 32, 48, 64].iter() {
let k: usize = 256; // Number of centroids
        let codes: Vec<u8> = (0..*m as usize).map(|i| ((i * 7) % k) as u8).collect();
        let pq = ProductVec::new((*m as usize * 32) as u16, *m, 255, codes);
        // Create distance table (m subspaces x k centroids)
        let mut table = Vec::with_capacity(*m as usize * k);
        for i in 0..(*m as usize * k) {
            table.push((i % 100) as f32 * 0.01);
        }
group.bench_with_input(BenchmarkId::new("simd", m), m, |bench, _| {
bench.iter(|| black_box(pq.adc_distance_simd(&table)));
});
group.bench_with_input(BenchmarkId::new("flat", m), m, |bench, _| {
bench.iter(|| black_box(pq.adc_distance_flat(&table)));
});
}
group.finish();
}
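// Sketch of ADC (asymmetric distance computation): a PQ code stores one
// centroid index per subspace, and the distance is a sum of per-subspace
// lookups in a precomputed table laid out as table[sub * k + code]
// (assumed layout, matching how the flat table is built above).
#[allow(dead_code)]
fn demo_adc_lookup() -> f32 {
    let k = 4usize; // centroids per subspace (256 in the benchmark)
    let codes = [1u8, 3, 0]; // m = 3 sub-codes
    let table: Vec<f32> = (0..codes.len() * k).map(|i| i as f32).collect();
    codes
        .iter()
        .enumerate()
        .map(|(sub, &code)| table[sub * k + code as usize])
        .sum()
}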
// ============================================================================
// Compression Ratio Benchmarks
// ============================================================================
fn bench_compression_comparison(c: &mut Criterion) {
let mut group = c.benchmark_group("compression_ratio");
for dims in [384, 768, 1536, 3072].iter() {
let data: Vec<f32> = (0..*dims).map(|i| (i as f32) * 0.001).collect();
let original_size = dims * std::mem::size_of::<f32>();
group.bench_with_input(BenchmarkId::new("binary", dims), dims, |bench, _| {
bench.iter(|| {
let binary = black_box(BinaryVec::from_f32(&data));
let compressed = binary.memory_size();
let ratio = original_size as f32 / compressed as f32;
black_box(ratio)
});
});
group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
bench.iter(|| {
let scalar = black_box(ScalarVec::from_f32(&data));
let compressed = scalar.memory_size();
let ratio = original_size as f32 / compressed as f32;
black_box(ratio)
});
});
group.bench_with_input(BenchmarkId::new("product", dims), dims, |bench, _| {
bench.iter(|| {
let m = (dims / 32).min(64);
let pq = black_box(ProductVec::new(*dims as u16, m as u8, 255, vec![0; m]));
let compressed = pq.memory_size();
let ratio = original_size as f32 / compressed as f32;
black_box(ratio)
});
});
}
group.finish();
}
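// Back-of-envelope payload ratios for a 768-dim f32 vector: binary packs one
// bit per dimension (32x), SQ8 one byte per dimension (4x). The real types
// carry small headers, so the measured ratios above come out slightly lower.
#[allow(dead_code)]
fn demo_payload_ratios() -> (f32, f32) {
    let dims = 768usize;
    let original = dims * std::mem::size_of::<f32>(); // 3072 bytes
    let binary = dims / 8; // 96 bytes
    let sq8 = dims; // 768 bytes
    (original as f32 / binary as f32, original as f32 / sq8 as f32)
}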
// ============================================================================
// Speedup vs Accuracy Trade-off
// ============================================================================
fn bench_quantization_tradeoff(c: &mut Criterion) {
let mut group = c.benchmark_group("quantization_tradeoff");
group.sample_size(10);
let dims = 768;
let n = 10000;
let num_queries = 100;
let vectors = generate_vectors(n, dims, 42);
let queries = generate_vectors(num_queries, dims, 999);
// Compute ground truth
let exact_vecs: Vec<RuVector> = vectors.iter().map(|v| RuVector::from_slice(v)).collect();
let ground_truth: Vec<Vec<usize>> = queries
.iter()
.map(|query| {
let query_vec = RuVector::from_slice(query);
let mut distances: Vec<(usize, f32)> = exact_vecs
.iter()
.enumerate()
.map(|(id, vec)| {
let diff = query_vec.sub(vec);
let dist = diff.norm();
(id, dist)
})
.collect();
distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
distances.iter().take(10).map(|(id, _)| *id).collect()
})
.collect();
// Benchmark SQ8
let sq8_vecs: Vec<ScalarVec> = vectors.iter().map(|v| ScalarVec::from_f32(v)).collect();
group.bench_function("sq8_speedup", |bench| {
bench.iter(|| {
for (i, query) in queries.iter().enumerate() {
let sq8_query = ScalarVec::from_f32(query);
let mut distances: Vec<(usize, f32)> = sq8_vecs
.iter()
.enumerate()
.map(|(id, vec)| (id, sq8_query.distance(vec)))
.collect();
distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
let results: Vec<usize> = distances.iter().take(10).map(|(id, _)| *id).collect();
// Compute recall
let hits = results
.iter()
.filter(|id| ground_truth[i].contains(id))
.count();
black_box(hits as f32 / 10.0);
}
});
});
// Benchmark Binary
let binary_vecs: Vec<BinaryVec> = vectors.iter().map(|v| BinaryVec::from_f32(v)).collect();
group.bench_function("binary_speedup", |bench| {
bench.iter(|| {
for (i, query) in queries.iter().enumerate() {
let binary_query = BinaryVec::from_f32(query);
let mut distances: Vec<(usize, u32)> = binary_vecs
.iter()
.enumerate()
.map(|(id, vec)| (id, binary_query.hamming_distance(vec)))
.collect();
distances.sort_by_key(|k| k.1);
let results: Vec<usize> = distances.iter().take(10).map(|(id, _)| *id).collect();
// Compute recall
let hits = results
.iter()
.filter(|id| ground_truth[i].contains(id))
.count();
black_box(hits as f32 / 10.0);
}
});
});
group.finish();
}
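// Sketch of the recall@k metric computed inside the loops above: the fraction
// of the ground-truth top-k ids recovered by the quantized search.
#[allow(dead_code)]
fn demo_recall_at_k() -> f32 {
    let ground_truth = [1usize, 2, 3, 4, 5];
    let results = [5usize, 4, 9, 2, 7]; // 3 of 5 hits
    let hits = results.iter().filter(|&&id| ground_truth.contains(&id)).count();
    hits as f32 / ground_truth.len() as f32
}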
// ============================================================================
// Throughput Comparison
// ============================================================================
fn bench_quantization_throughput(c: &mut Criterion) {
let mut group = c.benchmark_group("quantization_throughput");
let dims = 1536;
let n = 100000;
let vectors = generate_vectors(n, dims, 42);
let query = generate_vectors(1, dims, 999)[0].clone();
// Exact
let exact_vecs: Vec<RuVector> = vectors.iter().map(|v| RuVector::from_slice(v)).collect();
let exact_query = RuVector::from_slice(&query);
group.bench_function("exact_scan", |bench| {
bench.iter(|| {
let mut total = 0.0f32;
for vec in &exact_vecs {
total += exact_query.dot(vec);
}
black_box(total)
});
});
// SQ8
let sq8_vecs: Vec<ScalarVec> = vectors.iter().map(|v| ScalarVec::from_f32(v)).collect();
let sq8_query = ScalarVec::from_f32(&query);
group.bench_function("sq8_scan", |bench| {
bench.iter(|| {
let mut total = 0.0f32;
for vec in &sq8_vecs {
total += sq8_query.distance(vec);
}
black_box(total)
});
});
// Binary
let binary_vecs: Vec<BinaryVec> = vectors.iter().map(|v| BinaryVec::from_f32(v)).collect();
let binary_query = BinaryVec::from_f32(&query);
group.bench_function("binary_scan", |bench| {
bench.iter(|| {
let mut total = 0u64;
for vec in &binary_vecs {
total += binary_query.hamming_distance(vec) as u64;
}
black_box(total)
});
});
group.finish();
}
criterion_group!(
benches,
bench_sq8_quantization,
bench_sq8_distance,
bench_sq8_search,
bench_binary_quantization,
bench_binary_hamming,
bench_binary_search,
bench_pq_adc_distance,
bench_compression_comparison,
bench_quantization_tradeoff,
bench_quantization_throughput,
);
criterion_main!(benches);

@@ -0,0 +1,217 @@
//! Benchmarks for quantized vector distance calculations
//!
//! Compares scalar vs SIMD implementations for all quantized types
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion};
use ruvector_postgres::types::{BinaryVec, ProductVec, ScalarVec};
// ============================================================================
// BinaryVec Benchmarks
// ============================================================================
fn bench_binaryvec_hamming(c: &mut Criterion) {
let mut group = c.benchmark_group("binaryvec_hamming");
for dims in [128, 512, 1024, 2048, 4096].iter() {
let a_data: Vec<f32> = (0..*dims)
.map(|i| if i % 2 == 0 { 1.0 } else { -1.0 })
.collect();
let b_data: Vec<f32> = (0..*dims)
.map(|i| if i % 3 == 0 { 1.0 } else { -1.0 })
.collect();
let a = BinaryVec::from_f32(&a_data);
let b = BinaryVec::from_f32(&b_data);
group.bench_with_input(BenchmarkId::new("simd", dims), dims, |bencher, _| {
bencher.iter(|| black_box(a.hamming_distance(&b)));
});
}
group.finish();
}
fn bench_binaryvec_quantization(c: &mut Criterion) {
let mut group = c.benchmark_group("binaryvec_quantization");
for dims in [128, 512, 1024, 2048, 4096].iter() {
let data: Vec<f32> = (0..*dims).map(|i| (i as f32) * 0.01).collect();
group.bench_with_input(BenchmarkId::new("from_f32", dims), dims, |bencher, _| {
bencher.iter(|| black_box(BinaryVec::from_f32(&data)));
});
}
group.finish();
}
// ============================================================================
// ScalarVec Benchmarks
// ============================================================================
fn bench_scalarvec_distance(c: &mut Criterion) {
let mut group = c.benchmark_group("scalarvec_distance");
for dims in [128, 512, 1024, 2048, 4096].iter() {
let a_data: Vec<f32> = (0..*dims).map(|i| i as f32 * 0.1).collect();
let b_data: Vec<f32> = (0..*dims).map(|i| (*dims - i) as f32 * 0.1).collect();
let a = ScalarVec::from_f32(&a_data);
let b = ScalarVec::from_f32(&b_data);
group.bench_with_input(BenchmarkId::new("simd", dims), dims, |bencher, _| {
bencher.iter(|| black_box(a.distance(&b)));
});
}
group.finish();
}
fn bench_scalarvec_quantization(c: &mut Criterion) {
let mut group = c.benchmark_group("scalarvec_quantization");
for dims in [128, 512, 1024, 2048, 4096].iter() {
let data: Vec<f32> = (0..*dims).map(|i| (i as f32) * 0.01).collect();
group.bench_with_input(BenchmarkId::new("from_f32", dims), dims, |bencher, _| {
bencher.iter(|| black_box(ScalarVec::from_f32(&data)));
});
let scalar = ScalarVec::from_f32(&data);
group.bench_with_input(BenchmarkId::new("to_f32", dims), dims, |bencher, _| {
bencher.iter(|| black_box(scalar.to_f32()));
});
}
group.finish();
}
// ============================================================================
// ProductVec Benchmarks
// ============================================================================
fn bench_productvec_adc_distance(c: &mut Criterion) {
let mut group = c.benchmark_group("productvec_adc_distance");
for m in [8u8, 16, 32, 48, 64].iter() {
let k: usize = 256;
let codes: Vec<u8> = (0..*m).map(|i| ((i * 7) % k as u8) as u8).collect();
let pq = ProductVec::new((*m as usize * 32) as u16, *m, 255, codes);
// Create distance table
let mut table = Vec::with_capacity(*m as usize * k);
for i in 0..(*m as usize * k) {
table.push((i % 100) as f32 * 0.01);
}
group.bench_with_input(BenchmarkId::new("simd", m), m, |bencher, _| {
bencher.iter(|| black_box(pq.adc_distance_simd(&table)));
});
group.bench_with_input(BenchmarkId::new("flat", m), m, |bencher, _| {
bencher.iter(|| black_box(pq.adc_distance_flat(&table)));
});
}
group.finish();
}
// ============================================================================
// Compression Benchmarks
// ============================================================================
fn bench_compression_ratios(c: &mut Criterion) {
let mut group = c.benchmark_group("compression");
let dims = 1536; // OpenAI embedding size
let data: Vec<f32> = (0..dims).map(|i| (i as f32) * 0.001).collect();
// Original size
let original_size = dims * std::mem::size_of::<f32>();
group.bench_function("binary_quantize", |bencher| {
bencher.iter(|| {
let binary = black_box(BinaryVec::from_f32(&data));
let ratio = original_size as f32 / binary.memory_size() as f32;
black_box(ratio)
});
});
group.bench_function("scalar_quantize", |bencher| {
bencher.iter(|| {
let scalar = black_box(ScalarVec::from_f32(&data));
let ratio = original_size as f32 / scalar.memory_size() as f32;
black_box(ratio)
});
});
group.bench_function("product_quantize", |bencher| {
bencher.iter(|| {
let pq = black_box(ProductVec::new(dims as u16, 48, 255, vec![0; 48]));
let ratio = original_size as f32 / pq.memory_size() as f32;
black_box(ratio)
});
});
group.finish();
}
// ============================================================================
// Throughput Benchmarks
// ============================================================================
fn bench_throughput_comparison(c: &mut Criterion) {
let mut group = c.benchmark_group("throughput");
let dims = 1024;
let num_vectors = 1000;
// Generate test data
let vectors: Vec<Vec<f32>> = (0..num_vectors)
.map(|i| (0..dims).map(|j| ((i * dims + j) as f32) * 0.001).collect())
.collect();
let query = vectors[0].clone();
// Quantize all vectors
let binary_vecs: Vec<BinaryVec> = vectors.iter().map(|v| BinaryVec::from_f32(v)).collect();
let scalar_vecs: Vec<ScalarVec> = vectors.iter().map(|v| ScalarVec::from_f32(v)).collect();
let query_binary = BinaryVec::from_f32(&query);
let query_scalar = ScalarVec::from_f32(&query);
group.bench_function("binary_scan", |bencher| {
bencher.iter(|| {
let mut total_dist = 0u32;
for v in &binary_vecs {
total_dist += black_box(query_binary.hamming_distance(v));
}
black_box(total_dist)
});
});
group.bench_function("scalar_scan", |bencher| {
bencher.iter(|| {
let mut total_dist = 0.0f32;
for v in &scalar_vecs {
total_dist += black_box(query_scalar.distance(v));
}
black_box(total_dist)
});
});
group.finish();
}
criterion_group!(
benches,
bench_binaryvec_hamming,
bench_binaryvec_quantization,
bench_scalarvec_distance,
bench_scalarvec_quantization,
bench_productvec_adc_distance,
bench_compression_ratios,
bench_throughput_comparison,
);
criterion_main!(benches);

@@ -0,0 +1,173 @@
#!/bin/bash
# Comprehensive benchmark runner script
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Configuration
BENCHMARK_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
RESULTS_DIR="${BENCHMARK_DIR}/results"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# Create results directory
mkdir -p "${RESULTS_DIR}"
echo -e "${BLUE}==================================================${NC}"
echo -e "${BLUE} RuVector Comprehensive Benchmark Suite${NC}"
echo -e "${BLUE}==================================================${NC}"
echo ""
# ============================================================================
# Rust Benchmarks
# ============================================================================
echo -e "${GREEN}Running Rust benchmarks...${NC}"
echo ""
# Distance benchmarks
echo -e "${YELLOW}1. Distance function benchmarks${NC}"
cargo bench --bench distance_bench -- --output-format bencher | tee "${RESULTS_DIR}/distance_${TIMESTAMP}.txt"
# Index benchmarks
echo -e "${YELLOW}2. HNSW index benchmarks${NC}"
cargo bench --bench index_bench -- --output-format bencher | tee "${RESULTS_DIR}/index_${TIMESTAMP}.txt"
# Quantization benchmarks
echo -e "${YELLOW}3. Quantization benchmarks${NC}"
cargo bench --bench quantization_bench -- --output-format bencher | tee "${RESULTS_DIR}/quantization_${TIMESTAMP}.txt"
# Quantized distance benchmarks
echo -e "${YELLOW}4. Quantized distance benchmarks${NC}"
cargo bench --bench quantized_distance_bench -- --output-format bencher | tee "${RESULTS_DIR}/quantized_distance_${TIMESTAMP}.txt"
# ============================================================================
# SQL Benchmarks (if PostgreSQL is available)
# ============================================================================
if command -v psql &> /dev/null; then
echo ""
echo -e "${GREEN}Running SQL benchmarks...${NC}"
echo ""
# Check if test database exists
if psql -lqt | cut -d \| -f 1 | grep -qw ruvector_bench; then
echo -e "${YELLOW}5. Quick SQL benchmark${NC}"
psql -d ruvector_bench -f "${BENCHMARK_DIR}/sql/quick_benchmark.sql" | tee "${RESULTS_DIR}/sql_quick_${TIMESTAMP}.txt"
echo -e "${YELLOW}6. Full workload benchmark${NC}"
echo -e "${RED}Warning: This may take several minutes...${NC}"
psql -d ruvector_bench -f "${BENCHMARK_DIR}/sql/benchmark_workload.sql" | tee "${RESULTS_DIR}/sql_workload_${TIMESTAMP}.txt"
else
echo -e "${YELLOW}Skipping SQL benchmarks (database 'ruvector_bench' not found)${NC}"
echo -e "${YELLOW}To run SQL benchmarks:${NC}"
echo -e " createdb ruvector_bench"
echo -e " psql -d ruvector_bench -c 'CREATE EXTENSION ruvector;'"
echo -e " psql -d ruvector_bench -c 'CREATE EXTENSION pgvector;'"
fi
else
echo -e "${YELLOW}Skipping SQL benchmarks (psql not found)${NC}"
fi
# ============================================================================
# Generate Summary Report
# ============================================================================
echo ""
echo -e "${GREEN}Generating summary report...${NC}"
cat > "${RESULTS_DIR}/summary_${TIMESTAMP}.md" <<EOF
# RuVector Benchmark Results
**Date:** $(date)
**Platform:** $(uname -s) $(uname -m)
**Rust Version:** $(rustc --version)
## Benchmark Files
- Distance functions: \`distance_${TIMESTAMP}.txt\`
- HNSW index: \`index_${TIMESTAMP}.txt\`
- Quantization: \`quantization_${TIMESTAMP}.txt\`
- Quantized distance: \`quantized_distance_${TIMESTAMP}.txt\`
## SQL Benchmarks
EOF
if [ -f "${RESULTS_DIR}/sql_quick_${TIMESTAMP}.txt" ]; then
cat >> "${RESULTS_DIR}/summary_${TIMESTAMP}.md" <<EOF
- Quick benchmark: \`sql_quick_${TIMESTAMP}.txt\`
- Full workload: \`sql_workload_${TIMESTAMP}.txt\`
EOF
else
cat >> "${RESULTS_DIR}/summary_${TIMESTAMP}.md" <<EOF
SQL benchmarks were not run. See setup instructions above.
EOF
fi
cat >> "${RESULTS_DIR}/summary_${TIMESTAMP}.md" <<EOF
## System Information
\`\`\`
$(uname -a)
\`\`\`
### CPU Information
\`\`\`
$(lscpu 2>/dev/null || sysctl -n machdep.cpu.brand_string 2>/dev/null || echo "CPU info not available")
\`\`\`
### Memory Information
\`\`\`
$(free -h 2>/dev/null || vm_stat || echo "Memory info not available")
\`\`\`
## Running the Benchmarks
To reproduce these results:
\`\`\`bash
cd crates/ruvector-postgres
bash benches/scripts/run_benchmarks.sh
\`\`\`
## Comparing with Previous Results
\`\`\`bash
# Install cargo-criterion for better comparison
cargo install cargo-criterion
# Run with baseline
cargo criterion --bench distance_bench --baseline main
\`\`\`
EOF
echo ""
echo -e "${GREEN}==================================================${NC}"
echo -e "${GREEN} Benchmark Complete!${NC}"
echo -e "${GREEN}==================================================${NC}"
echo ""
echo -e "Results saved to: ${BLUE}${RESULTS_DIR}${NC}"
echo -e "Summary report: ${BLUE}${RESULTS_DIR}/summary_${TIMESTAMP}.md${NC}"
echo ""
# ============================================================================
# Optional: Open results in browser if criterion HTML is available
# ============================================================================
if [ -d "target/criterion" ]; then
echo -e "${YELLOW}Criterion HTML reports available at:${NC}"
echo -e " ${BLUE}file://$(pwd)/target/criterion/report/index.html${NC}"
fi
echo ""
echo -e "${GREEN}Done!${NC}"

@@ -0,0 +1,381 @@
-- Realistic workload benchmark for ruvector vs pgvector
-- This script tests common operations with realistic dataset sizes
\timing on
\set ECHO all
-- Configuration
\set num_vectors 1000000
\set num_queries 1000
\set dims 1536
\set k 10
BEGIN;
-- ============================================================================
-- Setup Test Tables
-- ============================================================================
DROP TABLE IF EXISTS vectors_ruvector CASCADE;
DROP TABLE IF EXISTS vectors_pgvector CASCADE;
DROP TABLE IF EXISTS queries CASCADE;
-- Create tables
CREATE TABLE vectors_ruvector (
id SERIAL PRIMARY KEY,
embedding ruvector(:dims),
metadata JSONB
);
CREATE TABLE vectors_pgvector (
id SERIAL PRIMARY KEY,
embedding vector(:dims),
metadata JSONB
);
CREATE TABLE queries (
id SERIAL PRIMARY KEY,
query_vector ruvector(:dims)
);
-- ============================================================================
-- Generate Test Data
-- ============================================================================
\echo 'Generating test data...'
-- Insert vectors (ruvector)
INSERT INTO vectors_ruvector (embedding, metadata)
SELECT
array_to_ruvector(ARRAY(
SELECT random()::real
FROM generate_series(1, :dims)
)),
jsonb_build_object('category', i % 100)
FROM generate_series(1, :num_vectors) i;
-- Insert vectors (pgvector)
INSERT INTO vectors_pgvector (embedding, metadata)
SELECT
ARRAY(
SELECT random()::real
FROM generate_series(1, :dims)
)::vector(:dims),
jsonb_build_object('category', i % 100)
FROM generate_series(1, :num_vectors) i;
-- Generate query vectors
INSERT INTO queries (query_vector)
SELECT
array_to_ruvector(ARRAY(
SELECT random()::real
FROM generate_series(1, :dims)
))
FROM generate_series(1, :num_queries);
COMMIT;
-- ============================================================================
-- Benchmark 1: Sequential Scan (No Index)
-- ============================================================================
\echo ''
\echo '=== Benchmark 1: Sequential Scan (No Index) ==='
\echo ''
-- Get a test query
\set test_query 'SELECT query_vector FROM queries WHERE id = 1'
-- RuVector scan
\echo 'RuVector sequential scan (p50, p99 latency):'
SELECT
percentile_cont(0.5) WITHIN GROUP (ORDER BY duration) AS p50_ms,
percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) AS p99_ms,
AVG(duration) AS avg_ms,
MIN(duration) AS min_ms,
MAX(duration) AS max_ms
FROM (
SELECT
id,
extract(epoch FROM (clock_timestamp() - start_time)) * 1000 AS duration
FROM (
SELECT
id,
clock_timestamp() AS start_time,
ARRAY(SELECT v.id FROM vectors_ruvector v ORDER BY v.embedding <-> (:test_query)::ruvector LIMIT :k) AS knn
FROM queries
LIMIT 100
) t
) times;
-- PGVector scan
\echo 'pgvector sequential scan (p50, p99 latency):'
SELECT
percentile_cont(0.5) WITHIN GROUP (ORDER BY duration) AS p50_ms,
percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) AS p99_ms,
AVG(duration) AS avg_ms,
MIN(duration) AS min_ms,
MAX(duration) AS max_ms
FROM (
SELECT
id,
extract(epoch FROM (clock_timestamp() - start_time)) * 1000 AS duration
FROM (
SELECT
id,
clock_timestamp() AS start_time,
ARRAY(SELECT v.id FROM vectors_pgvector v ORDER BY v.embedding <-> (SELECT query_vector::vector FROM queries WHERE id = 1) LIMIT :k) AS knn
FROM queries
LIMIT 100
) t
) times;
-- ============================================================================
-- Benchmark 2: Build Index
-- ============================================================================
\echo ''
\echo '=== Benchmark 2: Index Build Time ==='
\echo ''
-- RuVector HNSW
\echo 'Building ruvector HNSW index...'
\timing on
CREATE INDEX vectors_ruvector_hnsw_idx ON vectors_ruvector
USING hnsw (embedding ruvector_l2_ops)
WITH (m = 16, ef_construction = 64);
-- PGVector HNSW
\echo 'Building pgvector HNSW index...'
\timing on
CREATE INDEX vectors_pgvector_hnsw_idx ON vectors_pgvector
USING hnsw (embedding vector_l2_ops)
WITH (m = 16, ef_construction = 64);
-- ============================================================================
-- Benchmark 3: Index Search Performance
-- ============================================================================
\echo ''
\echo '=== Benchmark 3: Index Search (HNSW) ==='
\echo ''
-- Warm up
SELECT COUNT(*)
FROM (
SELECT v.embedding <-> q.query_vector AS d
FROM vectors_ruvector v
CROSS JOIN queries q
LIMIT 100
) warmup;
-- RuVector HNSW search
\echo 'RuVector HNSW search (p50, p99 latency):'
SELECT
percentile_cont(0.5) WITHIN GROUP (ORDER BY duration) AS p50_ms,
percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) AS p99_ms,
AVG(duration) AS avg_ms,
MIN(duration) AS min_ms,
MAX(duration) AS max_ms
FROM (
SELECT
id,
extract(epoch FROM (clock_timestamp() - start_time)) * 1000 AS duration
FROM (
SELECT
q.id,
clock_timestamp() AS start_time,
ARRAY(SELECT v.id FROM vectors_ruvector v ORDER BY v.embedding <-> q.query_vector LIMIT :k) AS knn
FROM queries q
LIMIT 1000
) t
) times;
-- PGVector HNSW search
\echo 'pgvector HNSW search (p50, p99 latency):'
SELECT
percentile_cont(0.5) WITHIN GROUP (ORDER BY duration) AS p50_ms,
percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) AS p99_ms,
AVG(duration) AS avg_ms,
MIN(duration) AS min_ms,
MAX(duration) AS max_ms
FROM (
SELECT
id,
extract(epoch FROM (clock_timestamp() - start_time)) * 1000 AS duration
FROM (
SELECT
q.id,
clock_timestamp() AS start_time,
ARRAY(SELECT v.id FROM vectors_pgvector v ORDER BY v.embedding <-> q.query_vector::vector LIMIT :k) AS knn
FROM queries q
LIMIT 1000
) t
) times;
-- ============================================================================
-- Benchmark 4: Distance Function Performance
-- ============================================================================
\echo ''
\echo '=== Benchmark 4: Distance Functions ==='
\echo ''
-- L2 Distance
\echo 'L2 Distance (100k calculations):'
\timing on
SELECT SUM(ruvector_l2_distance(v1.embedding, v2.embedding))
FROM vectors_ruvector v1
CROSS JOIN vectors_ruvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;
\timing on
SELECT SUM(v1.embedding <-> v2.embedding)
FROM vectors_pgvector v1
CROSS JOIN vectors_pgvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;
-- Cosine Distance
\echo 'Cosine Distance (100k calculations):'
\timing on
SELECT SUM(ruvector_cosine_distance(v1.embedding, v2.embedding))
FROM vectors_ruvector v1
CROSS JOIN vectors_ruvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;
\timing on
SELECT SUM(v1.embedding <=> v2.embedding)
FROM vectors_pgvector v1
CROSS JOIN vectors_pgvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;
-- Inner Product
\echo 'Inner Product (100k calculations):'
\timing on
SELECT SUM(ruvector_inner_product(v1.embedding, v2.embedding))
FROM vectors_ruvector v1
CROSS JOIN vectors_ruvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;
\timing on
SELECT SUM(v1.embedding <#> v2.embedding)
FROM vectors_pgvector v1
CROSS JOIN vectors_pgvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;
-- ============================================================================
-- Benchmark 5: Index Recall Accuracy
-- ============================================================================
\echo ''
\echo '=== Benchmark 5: Index Recall ==='
\echo ''
-- Create ground truth table (disable index scans so the "true" neighbors
-- come from an exact sequential scan, not from the HNSW index built above)
SET enable_indexscan = off;
SET enable_bitmapscan = off;
DROP TABLE IF EXISTS ground_truth;
CREATE TEMP TABLE ground_truth AS
SELECT
q.id AS query_id,
ARRAY_AGG(v.id ORDER BY v.embedding <-> q.query_vector) AS true_neighbors
FROM queries q
CROSS JOIN LATERAL (
SELECT id, embedding
FROM vectors_ruvector
ORDER BY embedding <-> q.query_vector
LIMIT :k
) v
WHERE q.id <= 100
GROUP BY q.id;
RESET enable_indexscan;
RESET enable_bitmapscan;
-- Compute recall for ruvector HNSW
WITH hnsw_results AS (
SELECT
q.id AS query_id,
ARRAY_AGG(v.id ORDER BY v.embedding <-> q.query_vector) AS hnsw_neighbors
FROM queries q
CROSS JOIN LATERAL (
SELECT id
FROM vectors_ruvector
ORDER BY embedding <-> q.query_vector
LIMIT :k
) v
WHERE q.id <= 100
GROUP BY q.id
)
SELECT
AVG(
(
SELECT COUNT(*)
FROM unnest(h.hnsw_neighbors) AS hn
WHERE hn = ANY(g.true_neighbors)
)::float / :k
) AS recall
FROM hnsw_results h
JOIN ground_truth g ON h.query_id = g.query_id;
-- ============================================================================
-- Benchmark 6: Memory Usage
-- ============================================================================
\echo ''
\echo '=== Benchmark 6: Memory Usage ==='
\echo ''
-- Table sizes
\echo 'Table sizes:'
SELECT
'ruvector' AS type,
pg_size_pretty(pg_total_relation_size('vectors_ruvector')) AS total_size,
pg_size_pretty(pg_relation_size('vectors_ruvector')) AS table_size,
pg_size_pretty(pg_indexes_size('vectors_ruvector')) AS index_size
UNION ALL
SELECT
'pgvector' AS type,
pg_size_pretty(pg_total_relation_size('vectors_pgvector')) AS total_size,
pg_size_pretty(pg_relation_size('vectors_pgvector')) AS table_size,
pg_size_pretty(pg_indexes_size('vectors_pgvector')) AS index_size;
-- Index sizes
\echo 'Index sizes:'
SELECT
indexname,
pg_size_pretty(pg_relation_size(indexname::regclass)) AS size
FROM pg_indexes
WHERE tablename IN ('vectors_ruvector', 'vectors_pgvector')
ORDER BY tablename, indexname;
-- ============================================================================
-- Benchmark 7: Quantization Performance
-- ============================================================================
\echo ''
\echo '=== Benchmark 7: Quantization ==='
\echo ''
-- Create quantized tables
DROP TABLE IF EXISTS vectors_scalar;
CREATE TABLE vectors_scalar (
id SERIAL PRIMARY KEY,
embedding scalarvec
);
INSERT INTO vectors_scalar (embedding)
SELECT quantize_scalar(embedding)
FROM vectors_ruvector
LIMIT 100000;
-- Quantized search
\echo 'Scalar quantized search:'
\timing on
SELECT id
FROM vectors_scalar
ORDER BY embedding <-> quantize_scalar((SELECT query_vector FROM queries WHERE id = 1))
LIMIT :k;
-- ============================================================================
-- Cleanup
-- ============================================================================
\echo ''
\echo '=== Benchmark Complete ==='
\echo ''
DROP TABLE IF EXISTS vectors_ruvector CASCADE;
DROP TABLE IF EXISTS vectors_pgvector CASCADE;
DROP TABLE IF EXISTS queries CASCADE;
DROP TABLE IF EXISTS vectors_scalar CASCADE;

@@ -0,0 +1,123 @@
-- Quick benchmark script for development testing
-- Smaller dataset for faster iteration
\timing on
\set ECHO all
-- Configuration
\set num_vectors 10000
\set num_queries 100
\set dims 768
\set k 10
BEGIN;
-- ============================================================================
-- Setup
-- ============================================================================
DROP TABLE IF EXISTS test_vectors CASCADE;
DROP TABLE IF EXISTS test_queries CASCADE;
CREATE TABLE test_vectors (
id SERIAL PRIMARY KEY,
embedding ruvector(:dims)
);
CREATE TABLE test_queries (
id SERIAL PRIMARY KEY,
query_vector ruvector(:dims)
);
-- ============================================================================
-- Load Data
-- ============================================================================
\echo 'Loading test data...'
INSERT INTO test_vectors (embedding)
SELECT
array_to_ruvector(ARRAY(
SELECT random()::real
FROM generate_series(1, :dims)
))
FROM generate_series(1, :num_vectors);
INSERT INTO test_queries (query_vector)
SELECT
array_to_ruvector(ARRAY(
SELECT random()::real
FROM generate_series(1, :dims)
))
FROM generate_series(1, :num_queries);
COMMIT;
-- ============================================================================
-- Sequential Scan Baseline
-- ============================================================================
\echo ''
\echo 'Sequential scan baseline:'
EXPLAIN ANALYZE
SELECT id
FROM test_vectors
ORDER BY embedding <-> (SELECT query_vector FROM test_queries WHERE id = 1)
LIMIT :k;
-- ============================================================================
-- Build HNSW Index
-- ============================================================================
\echo ''
\echo 'Building HNSW index...'
CREATE INDEX test_vectors_hnsw_idx ON test_vectors
USING hnsw (embedding ruvector_l2_ops)
WITH (m = 16, ef_construction = 64);
-- ============================================================================
-- Index Search
-- ============================================================================
\echo ''
\echo 'HNSW index search:'
EXPLAIN ANALYZE
SELECT id
FROM test_vectors
ORDER BY embedding <-> (SELECT query_vector FROM test_queries WHERE id = 1)
LIMIT :k;
-- ============================================================================
-- Distance Functions
-- ============================================================================
\echo ''
\echo 'Distance function performance (1000 calculations):'
-- L2
\timing on
SELECT SUM(ruvector_l2_distance(v1.embedding, v2.embedding))
FROM test_vectors v1, test_vectors v2
WHERE v1.id <= 10 AND v2.id <= 100;
-- Cosine
\timing on
SELECT SUM(ruvector_cosine_distance(v1.embedding, v2.embedding))
FROM test_vectors v1, test_vectors v2
WHERE v1.id <= 10 AND v2.id <= 100;
-- Inner Product
\timing on
SELECT SUM(ruvector_inner_product(v1.embedding, v2.embedding))
FROM test_vectors v1, test_vectors v2
WHERE v1.id <= 10 AND v2.id <= 100;
-- ============================================================================
-- Cleanup
-- ============================================================================
DROP TABLE IF EXISTS test_vectors CASCADE;
DROP TABLE IF EXISTS test_queries CASCADE;
\echo ''
\echo 'Quick benchmark complete!'