Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
307
crates/ruvector-postgres/benches/README.md
Normal file
@@ -0,0 +1,307 @@
# RuVector Benchmark Suite

Comprehensive benchmarks comparing ruvector with pgvector across distance functions, indexing, and quantization.

## Overview

This benchmark suite provides:

1. **Rust Benchmarks** - Low-level performance testing using Criterion
2. **SQL Benchmarks** - Realistic PostgreSQL workload testing
3. **Automated CI** - GitHub Actions workflow for continuous benchmarking

## Quick Start

### Run All Benchmarks

```bash
cd crates/ruvector-postgres
bash benches/scripts/run_benchmarks.sh
```

### Run Individual Benchmarks

```bash
# Distance function benchmarks
cargo bench --bench distance_bench

# HNSW index benchmarks
cargo bench --bench index_bench

# Quantization benchmarks
cargo bench --bench quantization_bench

# Quantized distance benchmarks
cargo bench --bench quantized_distance_bench
```

### Run SQL Benchmarks

```bash
# Set up the database
createdb ruvector_bench
psql -d ruvector_bench -c 'CREATE EXTENSION ruvector;'
psql -d ruvector_bench -c 'CREATE EXTENSION vector;'  # pgvector's extension is named "vector"

# Quick benchmark (10k vectors)
psql -d ruvector_bench -f benches/sql/quick_benchmark.sql

# Full workload (1M vectors)
psql -d ruvector_bench -f benches/sql/benchmark_workload.sql
```

## Benchmark Categories

### 1. Distance Function Benchmarks (`distance_bench.rs`)

Tests distance calculation performance across different vector dimensions:

- **L2 (Euclidean) Distance**: Scalar vs SIMD implementations
- **Cosine Distance**: Normalized similarity measurement
- **Inner Product**: Dot product for maximum inner product search
- **Batch Operations**: Sequential vs parallel processing

**Dimensions tested**: 128, 384, 768, 1536, 3072

**Key metrics**:
- Single-operation latency
- Throughput (ops/sec)
- SIMD speedup over scalar

### 2. HNSW Index Benchmarks (`index_bench.rs`)

Tests the Hierarchical Navigable Small World graph index:

#### Build Benchmarks
- Index construction time vs dataset size (1K, 10K, 100K, 1M vectors)
- Impact of the `ef_construction` parameter (16, 32, 64, 128, 256)
- Impact of the `M` parameter (8, 12, 16, 24, 32, 48)

#### Search Benchmarks
- Query latency vs dataset size
- Impact of the `ef_search` parameter (10, 20, 40, 80, 160, 320)
- Impact of `k` (number of neighbors: 1, 5, 10, 20, 50, 100)

#### Recall Accuracy
- Recall@10 across `ef_search` values
- Ground truth comparison
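The recall measurement boils down to set overlap between the approximate result IDs and the exact (ground-truth) top-k. A minimal sketch of such a helper (hypothetical, not the actual `index_bench.rs` code):

```rust
use std::collections::HashSet;

/// Fraction of the true top-k neighbors that the approximate search returned.
fn recall_at_k(approx: &[u64], exact: &[u64], k: usize) -> f32 {
    // Ground-truth set: the k exact nearest-neighbor IDs.
    let truth: HashSet<&u64> = exact.iter().take(k).collect();
    // Count how many of the approximate top-k are actually in the truth set.
    let hits = approx.iter().take(k).filter(|id| truth.contains(id)).count();
    hits as f32 / k.min(exact.len()) as f32
}
```

Averaging this over many queries gives the Recall@10 figures reported below.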

#### Memory Usage
- Index size vs dataset size
- Per-vector memory overhead

**Dimensions tested**: 128, 384, 768, 1536

### 3. Quantization Benchmarks (`quantization_bench.rs`)

Tests vector compression and quantized search:

#### Scalar Quantization (SQ8)
- Encoding/decoding speed
- Distance calculation speedup
- Recall vs exact search
- Memory reduction (4x compression)

#### Binary Quantization
- Encoding speed
- Hamming distance calculation (SIMD)
- Massive compression (32x for f32)
- Re-ranking strategies

#### Product Quantization (PQ)
- ADC (Asymmetric Distance Computation)
- SIMD vs scalar lookup
- Configurable compression ratios

**Key metrics**:
- Speedup vs exact search
- Recall@10 accuracy
- Compression ratio
- Throughput improvement
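For intuition on the SQ8 numbers: scalar quantization maps each `f32` to one byte via an affine min/max scale, giving the 4x compression. The round-trip below is an illustrative sketch only; the actual quantizer in `quantization_bench.rs` may use a different scaling scheme:

```rust
/// Quantize a vector to u8 codes plus the (min, max) needed to decode.
/// Assumes per-vector min/max scaling -- an illustrative choice.
fn sq8_encode(v: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let scale = if max > min { 255.0 / (max - min) } else { 0.0 };
    let codes = v.iter().map(|x| ((x - min) * scale).round() as u8).collect();
    (codes, min, max)
}

/// Reconstruct approximate f32 values from the u8 codes.
fn sq8_decode(codes: &[u8], min: f32, max: f32) -> Vec<f32> {
    let step = if max > min { (max - min) / 255.0 } else { 0.0 };
    codes.iter().map(|&c| min + c as f32 * step).collect()
}
```

The reconstruction error per component is bounded by half a quantization step, which is why recall stays high at 4x compression.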

### 4. SQL Workload Benchmarks

Realistic PostgreSQL scenarios:

#### Quick Benchmark (`quick_benchmark.sql`)
- 10,000 vectors, 768 dimensions
- Sequential-scan baseline
- HNSW index build
- Index search performance
- Distance function comparisons

#### Full Workload (`benchmark_workload.sql`)
- 1,000,000 vectors, 1536 dimensions
- 1,000 queries for statistical significance
- P50 and P99 latency measurements
- Memory usage analysis
- Recall accuracy testing
- ruvector vs pgvector comparison

## Understanding Results

### Criterion Output

```
Distance/euclidean/scalar/768
time: [2.1234 µs 2.1456 µs 2.1678 µs]
thrpt: [354.23 Melem/s 357.89 Melem/s 361.55 Melem/s]
```

- **time**: Lower bound, point estimate, and upper bound of the mean execution time
- **thrpt**: Throughput (here, vector elements processed per second)

### Comparing Implementations

```bash
# Set baseline
cargo bench --bench distance_bench -- --save-baseline main

# Make changes, then compare
cargo bench --bench distance_bench -- --baseline main
```

### SQL Benchmark Interpretation

```sql
 p50_ms | p99_ms | avg_ms | min_ms | max_ms
--------+--------+--------+--------+--------
  0.856 |  1.234 |  0.912 |  0.654 |  2.456
```

- **p50**: Median latency (50th percentile)
- **p99**: 99th-percentile latency (the slowest 1% of queries)
- **avg**: Mean latency
- **min/max**: Best and worst case
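The percentile columns can be reproduced from raw latency samples with a nearest-rank calculation; note that PostgreSQL's `percentile_cont` interpolates between samples instead, so values may differ slightly. A rough sketch:

```rust
/// Nearest-rank percentile: sorts the samples, then takes the value at
/// rank ceil(p/100 * n). Differs from interpolating estimators like
/// PostgreSQL's percentile_cont.
fn percentile(samples: &mut [f64], p: f64) -> f64 {
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    samples[rank.saturating_sub(1)]
}
```

With 1,000 queries (the full workload), p99 is backed by the 10 slowest samples, which is why the suite uses that many queries for statistical significance.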

## Performance Targets

### Distance Functions

| Operation     | Dimension | Target Throughput |
|---------------|-----------|-------------------|
| L2 (SIMD)     | 768       | > 400 Mops/s      |
| L2 (SIMD)     | 1536      | > 200 Mops/s      |
| Cosine        | 768       | > 300 Mops/s      |
| Inner Product | 768       | > 500 Mops/s      |

### HNSW Index

| Dataset Size | Build Time | Search Latency | Recall@10 |
|--------------|------------|----------------|-----------|
| 100K         | < 30s      | < 1ms          | > 0.95    |
| 1M           | < 5min     | < 2ms          | > 0.95    |
| 10M          | < 1hr      | < 5ms          | > 0.90    |

### Quantization

| Method | Compression | Speedup | Recall@10 |
|--------|-------------|---------|-----------|
| SQ8    | 4x          | 2-3x    | > 0.95    |
| Binary | 32x         | 10-20x  | > 0.85    |
| PQ(8)  | 16x         | 5-10x   | > 0.90    |

## Continuous Integration

The GitHub Actions workflow runs automatically on:

- Pull requests touching benchmark code
- Pushes to the `main` and `develop` branches
- Manual workflow dispatch

Results are:
- Posted as PR comments
- Stored as artifacts (30-day retention)
- Tracked over time on the main branch
- Compared against the baseline

### Triggering Manual Runs

```bash
# From the GitHub UI: Actions → Benchmarks → Run workflow

# Or using the gh CLI
gh workflow run benchmarks.yml
```

### Enabling SQL Benchmarks in CI

SQL benchmarks are disabled by default because they take too long for routine CI runs. Enable them via workflow dispatch:

```bash
gh workflow run benchmarks.yml -f run_sql_benchmarks=true
```

## Advanced Usage

### Profiling with Criterion

```bash
# Profile each benchmark for 5 seconds (pair with a profiler hook
# such as pprof to produce flamegraphs)
cargo bench --bench distance_bench -- --profile-time=5

# Emit results in the `bencher` output format
cargo bench --bench distance_bench -- --output-format bencher
```

### Custom Benchmark Parameters

Edit the benchmark files to adjust:

- Vector dimensions
- Dataset sizes
- Number of queries
- HNSW parameters (`M`, `ef_construction`, `ef_search`)
- Quantization settings

### Comparing with pgvector

Ensure pgvector is installed:

```bash
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
```

Then run the SQL benchmarks for a side-by-side comparison.

## Interpreting Regressions

### Performance Degradation Alert

If CI fails due to a performance regression:

1. **Check the comparison**: Review the baseline vs current results
2. **Validate the change**: Make sure the difference is not measurement noise
3. **Profile the code**: Use flamegraphs to identify bottlenecks
4. **Consider trade-offs**: Sometimes correctness matters more than speed

### Common Causes

- **SIMD disabled**: Check compiler flags and target features
- **Debug build**: Ensure `--release` mode (`cargo bench` uses it by default)
- **Thermal throttling**: CPU overheating on the CI runner
- **Cache effects**: Different data access patterns

## Contributing

When adding benchmarks:

1. Add them to the appropriate `*_bench.rs` file
2. Update this README
3. Ensure the benchmarks complete in under 5 minutes
4. Use `black_box()` to prevent the compiler from optimizing the measured work away
5. Test both small and large inputs
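On point 4: `black_box` (re-exported by Criterion, and available in the standard library as `std::hint::black_box`) hides a value from the optimizer, so the compiler cannot constant-fold the measured computation away. A minimal sketch of the pattern:

```rust
use std::hint::black_box;

/// Plain scalar L2 distance, the kind of work a benchmark would measure.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// One measured iteration. Inside a Criterion closure this would be
/// `bench.iter(|| l2(black_box(a), black_box(b)))`; without black_box the
/// compiler may precompute the result and the benchmark measures nothing.
fn bench_once(a: &[f32], b: &[f32]) -> f32 {
    l2(black_box(a), black_box(b))
}
```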

## Resources

- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [HNSW Paper](https://arxiv.org/abs/1603.09320)
- [Product Quantization Paper](https://ieeexplore.ieee.org/document/5432202)
- [pgvector Repository](https://github.com/pgvector/pgvector)

## License

Same as the ruvector project: MIT
565
crates/ruvector-postgres/benches/distance_bench.rs
Normal file
@@ -0,0 +1,565 @@
//! Comprehensive distance function benchmarks
//!
//! Compare SIMD vs scalar implementations across different vector sizes
//! and distance metrics (L2, cosine, inner product, Manhattan).
//!
//! Dimensions tested: 128, 384, 768, 1536, 3072
//! This covers common embedding sizes:
//! - 128: compact sentence-embedding models
//! - 384: all-MiniLM-L6-v2
//! - 768: BERT base, RoBERTa
//! - 1536: OpenAI text-embedding-ada-002
//! - 3072: OpenAI text-embedding-3-large

use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;

// ============================================================================
// Distance Implementations
// ============================================================================

mod distance_impl {
    /// Scalar Euclidean distance
    pub fn euclidean_scalar(a: &[f32], b: &[f32]) -> f32 {
        a.iter()
            .zip(b.iter())
            .map(|(x, y)| {
                let diff = x - y;
                diff * diff
            })
            .sum::<f32>()
            .sqrt()
    }

    /// Scalar cosine distance
    pub fn cosine_scalar(a: &[f32], b: &[f32]) -> f32 {
        let mut dot = 0.0f32;
        let mut norm_a = 0.0f32;
        let mut norm_b = 0.0f32;

        for (x, y) in a.iter().zip(b.iter()) {
            dot += x * y;
            norm_a += x * x;
            norm_b += y * y;
        }

        let denominator = (norm_a * norm_b).sqrt();
        if denominator == 0.0 {
            return 1.0;
        }

        1.0 - (dot / denominator)
    }

    /// Scalar inner product distance (negated, so smaller means closer)
    pub fn inner_product_scalar(a: &[f32], b: &[f32]) -> f32 {
        -a.iter().zip(b.iter()).map(|(x, y)| x * y).sum::<f32>()
    }

    /// Scalar Manhattan distance
    pub fn manhattan_scalar(a: &[f32], b: &[f32]) -> f32 {
        a.iter()
            .zip(b.iter())
            .map(|(x, y)| (x - y).abs())
            .sum::<f32>()
    }

    /// AVX2 Euclidean distance
    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "avx2", enable = "fma")]
    pub unsafe fn euclidean_avx2(a: &[f32], b: &[f32]) -> f32 {
        use std::arch::x86_64::*;

        let n = a.len();
        let mut sum = _mm256_setzero_ps();

        let chunks = n / 8;
        for i in 0..chunks {
            let offset = i * 8;
            let va = _mm256_loadu_ps(a.as_ptr().add(offset));
            let vb = _mm256_loadu_ps(b.as_ptr().add(offset));
            let diff = _mm256_sub_ps(va, vb);
            sum = _mm256_fmadd_ps(diff, diff, sum);
        }

        // Horizontal sum of the 8 lanes
        let mut result = horizontal_sum_avx2(sum);

        // Handle remainder elements
        for i in (chunks * 8)..n {
            let diff = a[i] - b[i];
            result += diff * diff;
        }

        result.sqrt()
    }

    /// AVX2 cosine distance
    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "avx2", enable = "fma")]
    pub unsafe fn cosine_avx2(a: &[f32], b: &[f32]) -> f32 {
        use std::arch::x86_64::*;

        let n = a.len();
        let mut dot_sum = _mm256_setzero_ps();
        let mut norm_a_sum = _mm256_setzero_ps();
        let mut norm_b_sum = _mm256_setzero_ps();

        let chunks = n / 8;
        for i in 0..chunks {
            let offset = i * 8;
            let va = _mm256_loadu_ps(a.as_ptr().add(offset));
            let vb = _mm256_loadu_ps(b.as_ptr().add(offset));

            dot_sum = _mm256_fmadd_ps(va, vb, dot_sum);
            norm_a_sum = _mm256_fmadd_ps(va, va, norm_a_sum);
            norm_b_sum = _mm256_fmadd_ps(vb, vb, norm_b_sum);
        }

        // Horizontal sums
        let mut dot = horizontal_sum_avx2(dot_sum);
        let mut norm_a = horizontal_sum_avx2(norm_a_sum);
        let mut norm_b = horizontal_sum_avx2(norm_b_sum);

        // Handle remainder elements
        for i in (chunks * 8)..n {
            dot += a[i] * b[i];
            norm_a += a[i] * a[i];
            norm_b += b[i] * b[i];
        }

        let denom = (norm_a * norm_b).sqrt();
        if denom == 0.0 {
            return 1.0;
        }
        1.0 - (dot / denom)
    }

    /// AVX2 inner product (negated)
    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "avx2", enable = "fma")]
    pub unsafe fn inner_product_avx2(a: &[f32], b: &[f32]) -> f32 {
        use std::arch::x86_64::*;

        let n = a.len();
        let mut sum = _mm256_setzero_ps();

        let chunks = n / 8;
        for i in 0..chunks {
            let offset = i * 8;
            let va = _mm256_loadu_ps(a.as_ptr().add(offset));
            let vb = _mm256_loadu_ps(b.as_ptr().add(offset));
            sum = _mm256_fmadd_ps(va, vb, sum);
        }

        let mut result = horizontal_sum_avx2(sum);

        // Handle remainder elements
        for i in (chunks * 8)..n {
            result += a[i] * b[i];
        }

        -result
    }

    /// Reduce the 8 lanes of an AVX2 register to a single f32 sum
    #[cfg(target_arch = "x86_64")]
    #[inline]
    unsafe fn horizontal_sum_avx2(v: std::arch::x86_64::__m256) -> f32 {
        use std::arch::x86_64::*;
        let sum_high = _mm256_extractf128_ps(v, 1);
        let sum_low = _mm256_castps256_ps128(v);
        let sum128 = _mm_add_ps(sum_high, sum_low);
        let sum64 = _mm_add_ps(sum128, _mm_movehl_ps(sum128, sum128));
        let sum32 = _mm_add_ss(sum64, _mm_shuffle_ps(sum64, sum64, 1));
        _mm_cvtss_f32(sum32)
    }

    // Scalar fallbacks so non-x86_64 targets still compile; kept `unsafe`
    // to match the AVX2 signatures at call sites.
    #[cfg(not(target_arch = "x86_64"))]
    pub unsafe fn euclidean_avx2(a: &[f32], b: &[f32]) -> f32 {
        euclidean_scalar(a, b)
    }

    #[cfg(not(target_arch = "x86_64"))]
    pub unsafe fn cosine_avx2(a: &[f32], b: &[f32]) -> f32 {
        cosine_scalar(a, b)
    }

    #[cfg(not(target_arch = "x86_64"))]
    pub unsafe fn inner_product_avx2(a: &[f32], b: &[f32]) -> f32 {
        inner_product_scalar(a, b)
    }
}

// ============================================================================
// Test Data Generation
// ============================================================================

fn generate_vectors(dims: usize, seed: u64) -> (Vec<f32>, Vec<f32>) {
    let mut rng = ChaCha8Rng::seed_from_u64(seed);
    let a: Vec<f32> = (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect();
    let b: Vec<f32> = (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect();
    (a, b)
}

fn generate_normalized_vectors(dims: usize, seed: u64) -> (Vec<f32>, Vec<f32>) {
    let (mut a, mut b) = generate_vectors(dims, seed);

    // Normalize to unit length
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();

    for x in &mut a {
        *x /= norm_a;
    }
    for x in &mut b {
        *x /= norm_b;
    }

    (a, b)
}

fn generate_vector_dataset(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
    let mut rng = ChaCha8Rng::seed_from_u64(seed);
    (0..n)
        .map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
        .collect()
}

// ============================================================================
// Euclidean Distance Benchmarks
// ============================================================================

const DIMENSIONS: [usize; 5] = [128, 384, 768, 1536, 3072];

fn bench_euclidean(c: &mut Criterion) {
    let mut group = c.benchmark_group("Euclidean Distance");

    for dims in DIMENSIONS.iter() {
        let (a, b) = generate_vectors(*dims, 42);

        group.throughput(Throughput::Elements(*dims as u64));

        group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
            bench.iter(|| distance_impl::euclidean_scalar(black_box(&a), black_box(&b)))
        });

        #[cfg(target_arch = "x86_64")]
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            group.bench_with_input(BenchmarkId::new("avx2", dims), dims, |bench, _| {
                bench
                    .iter(|| unsafe { distance_impl::euclidean_avx2(black_box(&a), black_box(&b)) })
            });
        }
    }

    group.finish();
}

// ============================================================================
// Cosine Distance Benchmarks
// ============================================================================

fn bench_cosine(c: &mut Criterion) {
    let mut group = c.benchmark_group("Cosine Distance");

    for dims in DIMENSIONS.iter() {
        let (a, b) = generate_vectors(*dims, 42);

        group.throughput(Throughput::Elements(*dims as u64));

        group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
            bench.iter(|| distance_impl::cosine_scalar(black_box(&a), black_box(&b)))
        });

        #[cfg(target_arch = "x86_64")]
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            group.bench_with_input(BenchmarkId::new("avx2", dims), dims, |bench, _| {
                bench.iter(|| unsafe { distance_impl::cosine_avx2(black_box(&a), black_box(&b)) })
            });
        }
    }

    group.finish();
}

// ============================================================================
// Cosine Distance for Pre-Normalized Vectors
// ============================================================================

fn bench_cosine_normalized(c: &mut Criterion) {
    let mut group = c.benchmark_group("Cosine Distance (Normalized)");

    for dims in DIMENSIONS.iter() {
        let (a, b) = generate_normalized_vectors(*dims, 42);

        group.throughput(Throughput::Elements(*dims as u64));

        // For unit vectors, cosine distance reduces to 1 - dot(a, b)
        group.bench_with_input(BenchmarkId::new("scalar_dot", dims), dims, |bench, _| {
            bench.iter(|| {
                let dot: f32 = a.iter().zip(&b).map(|(x, y)| x * y).sum();
                1.0 - black_box(dot)
            })
        });

        #[cfg(target_arch = "x86_64")]
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            group.bench_with_input(BenchmarkId::new("avx2_dot", dims), dims, |bench, _| {
                // inner_product_avx2 returns -dot, so 1.0 + (-dot) == 1.0 - dot
                bench.iter(|| unsafe {
                    1.0 + distance_impl::inner_product_avx2(black_box(&a), black_box(&b))
                })
            });
        }
    }

    group.finish();
}

// ============================================================================
// Inner Product Benchmarks
// ============================================================================

fn bench_inner_product(c: &mut Criterion) {
    let mut group = c.benchmark_group("Inner Product");

    for dims in DIMENSIONS.iter() {
        let (a, b) = generate_vectors(*dims, 42);

        group.throughput(Throughput::Elements(*dims as u64));

        group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
            bench.iter(|| distance_impl::inner_product_scalar(black_box(&a), black_box(&b)))
        });

        #[cfg(target_arch = "x86_64")]
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            group.bench_with_input(BenchmarkId::new("avx2", dims), dims, |bench, _| {
                bench.iter(|| unsafe {
                    distance_impl::inner_product_avx2(black_box(&a), black_box(&b))
                })
            });
        }
    }

    group.finish();
}

// ============================================================================
// Manhattan Distance Benchmarks
// ============================================================================

fn bench_manhattan(c: &mut Criterion) {
    let mut group = c.benchmark_group("Manhattan Distance");

    for dims in DIMENSIONS.iter() {
        let (a, b) = generate_vectors(*dims, 42);

        group.throughput(Throughput::Elements(*dims as u64));

        group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
            bench.iter(|| distance_impl::manhattan_scalar(black_box(&a), black_box(&b)))
        });
    }

    group.finish();
}

// ============================================================================
// Batch Distance Benchmarks (1000 vectors)
// ============================================================================

fn bench_batch_sequential(c: &mut Criterion) {
    let mut group = c.benchmark_group("Batch Distance (Sequential, 1000 vectors)");

    for dims in [128, 384, 1536].iter() {
        let query = generate_vectors(*dims, 42).0;
        let vectors = generate_vector_dataset(1000, *dims, 123);

        group.throughput(Throughput::Elements(1000));

        group.bench_with_input(BenchmarkId::new("euclidean", dims), dims, |bench, _| {
            bench.iter(|| {
                vectors
                    .iter()
                    .map(|v| distance_impl::euclidean_scalar(black_box(&query), black_box(v)))
                    .collect::<Vec<_>>()
            })
        });

        group.bench_with_input(BenchmarkId::new("cosine", dims), dims, |bench, _| {
            bench.iter(|| {
                vectors
                    .iter()
                    .map(|v| distance_impl::cosine_scalar(black_box(&query), black_box(v)))
                    .collect::<Vec<_>>()
            })
        });

        group.bench_with_input(BenchmarkId::new("inner_product", dims), dims, |bench, _| {
            bench.iter(|| {
                vectors
                    .iter()
                    .map(|v| distance_impl::inner_product_scalar(black_box(&query), black_box(v)))
                    .collect::<Vec<_>>()
            })
        });
    }

    group.finish();
}

fn bench_batch_parallel(c: &mut Criterion) {
    let mut group = c.benchmark_group("Batch Distance (Parallel, 1000 vectors)");

    for dims in [128, 384, 1536].iter() {
        let query = generate_vectors(*dims, 42).0;
        let vectors = generate_vector_dataset(1000, *dims, 123);

        group.throughput(Throughput::Elements(1000));

        group.bench_with_input(
            BenchmarkId::new("euclidean_rayon", dims),
            dims,
            |bench, _| {
                bench.iter(|| {
                    vectors
                        .par_iter()
                        .map(|v| distance_impl::euclidean_scalar(black_box(&query), black_box(v)))
                        .collect::<Vec<_>>()
                })
            },
        );

        group.bench_with_input(BenchmarkId::new("cosine_rayon", dims), dims, |bench, _| {
            bench.iter(|| {
                vectors
                    .par_iter()
                    .map(|v| distance_impl::cosine_scalar(black_box(&query), black_box(v)))
                    .collect::<Vec<_>>()
            })
        });
    }

    group.finish();
}

// ============================================================================
// Large Batch Benchmarks (10K vectors)
// ============================================================================

fn bench_large_batch(c: &mut Criterion) {
    let mut group = c.benchmark_group("Large Batch Distance (10K vectors)");
    group.sample_size(10);

    for dims in [384, 768, 1536].iter() {
        let query = generate_vectors(*dims, 42).0;
        let vectors = generate_vector_dataset(10_000, *dims, 123);

        group.throughput(Throughput::Elements(10_000));

        group.bench_with_input(BenchmarkId::new("sequential", dims), dims, |bench, _| {
            bench.iter(|| {
                vectors
                    .iter()
                    .map(|v| distance_impl::euclidean_scalar(black_box(&query), black_box(v)))
                    .collect::<Vec<_>>()
            })
        });

        group.bench_with_input(BenchmarkId::new("parallel", dims), dims, |bench, _| {
            bench.iter(|| {
                vectors
                    .par_iter()
                    .map(|v| distance_impl::euclidean_scalar(black_box(&query), black_box(v)))
                    .collect::<Vec<_>>()
            })
        });

        #[cfg(target_arch = "x86_64")]
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            group.bench_with_input(BenchmarkId::new("parallel_avx2", dims), dims, |bench, _| {
                bench.iter(|| {
                    vectors
                        .par_iter()
                        .map(|v| unsafe {
                            distance_impl::euclidean_avx2(black_box(&query), black_box(v))
                        })
                        .collect::<Vec<_>>()
                })
            });
        }
    }

    group.finish();
}

// ============================================================================
// SIMD Speedup Comparison
// ============================================================================

fn bench_simd_speedup(c: &mut Criterion) {
    let mut group = c.benchmark_group("SIMD Speedup Analysis");

    #[cfg(target_arch = "x86_64")]
    if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
        for dims in DIMENSIONS.iter() {
            let (a, b) = generate_vectors(*dims, 42);

            // Euclidean
            group.bench_with_input(
                BenchmarkId::new("euclidean_scalar", dims),
                dims,
                |bench, _| {
                    bench.iter(|| distance_impl::euclidean_scalar(black_box(&a), black_box(&b)))
                },
            );

            group.bench_with_input(
                BenchmarkId::new("euclidean_avx2", dims),
                dims,
                |bench, _| {
                    bench.iter(|| unsafe {
                        distance_impl::euclidean_avx2(black_box(&a), black_box(&b))
                    })
                },
            );

            // Cosine
            group.bench_with_input(BenchmarkId::new("cosine_scalar", dims), dims, |bench, _| {
                bench.iter(|| distance_impl::cosine_scalar(black_box(&a), black_box(&b)))
            });

            group.bench_with_input(BenchmarkId::new("cosine_avx2", dims), dims, |bench, _| {
                bench.iter(|| unsafe { distance_impl::cosine_avx2(black_box(&a), black_box(&b)) })
            });
        }
    }

    group.finish();
}

criterion_group!(
    benches,
    bench_euclidean,
    bench_cosine,
    bench_cosine_normalized,
    bench_inner_product,
    bench_manhattan,
    bench_batch_sequential,
    bench_batch_parallel,
    bench_large_batch,
    bench_simd_speedup,
);

criterion_main!(benches);
782
crates/ruvector-postgres/benches/e2e_bench.rs
Normal file
@@ -0,0 +1,782 @@
//! End-to-end benchmarks for the RuVector PostgreSQL extension
//!
//! Comprehensive benchmarks for:
//! - Full query pipeline latency
//! - Insert throughput
//! - Concurrent query scaling
//! - Memory usage under load
//! - pgvector comparison baselines

use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};
use std::sync::Arc;
use std::time::{Duration, Instant};

// ============================================================================
// Simulated Vector Index (Full Pipeline)
// ============================================================================

mod index {
    use dashmap::DashMap;
    use parking_lot::RwLock;
    use rand::prelude::*;
    use rand_chacha::ChaCha8Rng;
    use rayon::prelude::*;
    use std::cmp::Ordering;
    use std::collections::{BinaryHeap, HashMap, HashSet};
    use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};

    /// Full-featured HNSW index for benchmarking
    pub struct HnswIndex {
        pub nodes: DashMap<u64, Vec<f32>>,
        pub neighbors: DashMap<u64, Vec<Vec<u64>>>,
        pub entry_point: RwLock<Option<u64>>,
        pub max_layer: AtomicUsize,
        pub m: usize,
        pub m0: usize,
        pub ef_construction: usize,
        pub ef_search: usize,
        pub dimensions: usize,
        next_id: AtomicUsize,
        rng: RwLock<ChaCha8Rng>,
    }

    impl HnswIndex {
        pub fn new(
            dimensions: usize,
            m: usize,
            ef_construction: usize,
            ef_search: usize,
            seed: u64,
        ) -> Self {
            Self {
                nodes: DashMap::new(),
                neighbors: DashMap::new(),
                entry_point: RwLock::new(None),
                max_layer: AtomicUsize::new(0),
                m,
                m0: m * 2,
                ef_construction,
                ef_search,
                dimensions,
                next_id: AtomicUsize::new(0),
                rng: RwLock::new(ChaCha8Rng::seed_from_u64(seed)),
            }
        }

        pub fn len(&self) -> usize {
            self.nodes.len()
        }

        /// Draw a layer from the standard HNSW level distribution:
        /// level = floor(-ln(U) * mL) with mL = 1 / ln(M)
        fn random_level(&self) -> usize {
            let ml = 1.0 / (self.m as f64).ln();
            let mut rng = self.rng.write();
            let r: f64 = rng.gen();
            ((-r.ln() * ml).floor() as usize).min(32)
        }

        fn distance(&self, a: &[f32], b: &[f32]) -> f32 {
            a.iter()
                .zip(b.iter())
                .map(|(x, y)| (x - y).powi(2))
                .sum::<f32>()
                .sqrt()
        }

        pub fn insert(&self, vector: Vec<f32>) -> u64 {
            let id = self.next_id.fetch_add(1, AtomicOrdering::Relaxed) as u64;
            let level = self.random_level();

            // Initialize neighbor lists for all layers
            let mut neighbor_lists = Vec::with_capacity(level + 1);
            for _ in 0..=level {
                neighbor_lists.push(Vec::new());
            }

            self.nodes.insert(id, vector.clone());
            self.neighbors.insert(id, neighbor_lists);

            let current_entry = *self.entry_point.read();

            if current_entry.is_none() {
                *self.entry_point.write() = Some(id);
                self.max_layer.store(level, AtomicOrdering::Relaxed);
                return id;
            }

            // Simplified insertion: link the new node only to the entry point
            let entry_id = current_entry.unwrap();

            if self.nodes.get(&entry_id).is_some() {
                let max_conn = if level == 0 { self.m0 } else { self.m };

                if let Some(mut neighbors) = self.neighbors.get_mut(&id) {
                    neighbors[0].push(entry_id);
                }

                if let Some(mut entry_neighbors) = self.neighbors.get_mut(&entry_id) {
                    if entry_neighbors[0].len() < max_conn {
                        entry_neighbors[0].push(id);
                    }
                }
            }

            if level > self.max_layer.load(AtomicOrdering::Relaxed) {
                *self.entry_point.write() = Some(id);
                self.max_layer.store(level, AtomicOrdering::Relaxed);
            }

            id
        }

        pub fn insert_batch(&self, vectors: &[Vec<f32>]) -> Vec<u64> {
            vectors.iter().map(|v| self.insert(v.clone())).collect()
        }

        pub fn insert_batch_parallel(&self, vectors: &[Vec<f32>]) -> Vec<u64> {
            // Parallel insertion via rayon
            vectors.par_iter().map(|v| self.insert(v.clone())).collect()
        }

        pub fn search(&self, query: &[f32], k: usize) -> Vec<(u64, f32)> {
            // Brute force for simplicity in benchmarks
            let mut results: Vec<(u64, f32)> = self
                .nodes
                .iter()
                .map(|entry| {
                    let dist = self.distance(query, entry.value());
(*entry.key(), dist)
|
||||
})
|
||||
.collect();
|
||||
|
||||
results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
|
||||
results.truncate(k);
|
||||
results
|
||||
}
|
||||
|
||||
pub fn search_parallel(&self, query: &[f32], k: usize) -> Vec<(u64, f32)> {
|
||||
let mut results: Vec<(u64, f32)> = self
|
||||
.nodes
|
||||
.iter()
|
||||
.collect::<Vec<_>>()
|
||||
.par_iter()
|
||||
.map(|entry| {
|
||||
let dist = self.distance(query, entry.value());
|
||||
(*entry.key(), dist)
|
||||
})
|
||||
.collect();
|
||||
|
||||
results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
|
||||
results.truncate(k);
|
||||
results
|
||||
}
|
||||
|
||||
pub fn memory_usage(&self) -> usize {
|
||||
let vector_bytes = self.nodes.len() * self.dimensions * 4;
|
||||
let neighbor_bytes: usize = self
|
||||
.neighbors
|
||||
.iter()
|
||||
.map(|entry| entry.value().iter().map(|l| l.len() * 8).sum::<usize>())
|
||||
.sum();
|
||||
vector_bytes + neighbor_bytes
|
||||
}
|
||||
}
|
||||
}
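`random_level` above maps a uniform draw r in (0, 1) to a layer via floor(-ln(r) / ln(M)), the standard HNSW geometric layer assignment. A minimal std-only sketch of that mapping (hypothetical free function mirroring the formula, not part of the index above):

```rust
/// Layer assignment used by HNSW: floor(-ln(r) * ml) with ml = 1 / ln(M),
/// capped at 32 as in the index above.
fn level_for_draw(r: f64, m: usize) -> usize {
    let ml = 1.0 / (m as f64).ln();
    ((-r.ln() * ml).floor() as usize).min(32)
}

fn main() {
    // Most draws land on layer 0; only r < 1/M promotes to layer 1 or higher.
    assert_eq!(level_for_draw(0.5, 16), 0);
    assert_eq!(level_for_draw(0.01, 16), 1);
    assert_eq!(level_for_draw(1e-300, 16), 32); // the cap kicks in
}
```

With M = 16, roughly 1/16 of nodes reach layer 1, 1/256 reach layer 2, and so on, which is what keeps the upper layers sparse.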

use index::HnswIndex;

// ============================================================================
// Test Data Generation
// ============================================================================

fn generate_random_vectors(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
    let mut rng = ChaCha8Rng::seed_from_u64(seed);
    (0..n)
        .map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
        .collect()
}

fn generate_normalized_vectors(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
    let vectors = generate_random_vectors(n, dims, seed);
    vectors
        .into_iter()
        .map(|v| {
            let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
            v.into_iter().map(|x| x / norm).collect()
        })
        .collect()
}
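`generate_normalized_vectors` divides each component by the L2 norm, so every output vector lies on the unit sphere. A quick std-only check of that invariant (hypothetical helper name, same arithmetic):

```rust
/// Scale a vector to unit L2 norm, as generate_normalized_vectors does per row.
fn normalize(v: &[f32]) -> Vec<f32> {
    let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    v.iter().map(|x| x / norm).collect()
}

fn main() {
    let unit = normalize(&[3.0, 4.0]); // norm is 5.0
    assert!((unit[0] - 0.6).abs() < 1e-6);
    assert!((unit[1] - 0.8).abs() < 1e-6);
    // The normalized vector has unit length
    let len_sq: f32 = unit.iter().map(|x| x * x).sum();
    assert!((len_sq - 1.0).abs() < 1e-6);
}
```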

// ============================================================================
// Full Query Pipeline Benchmarks
// ============================================================================

fn bench_query_pipeline(c: &mut Criterion) {
    let mut group = c.benchmark_group("Query Pipeline");

    for &dims in [128, 384, 768, 1536].iter() {
        for &n in [10_000, 100_000].iter() {
            let vectors = generate_random_vectors(n, dims, 42);
            let query = vectors[0].clone();

            let index = HnswIndex::new(dims, 16, 64, 40, 42);
            index.insert_batch(&vectors);

            group.throughput(Throughput::Elements(1));

            // Full pipeline: search + post-process
            group.bench_with_input(BenchmarkId::new(format!("{}d", dims), n), &n, |bench, _| {
                bench.iter(|| {
                    // Search
                    let results = index.search(&query, 10);

                    // Post-process (e.g., fetch metadata, rerank)
                    let processed: Vec<_> = results
                        .iter()
                        .map(|(id, dist)| {
                            // Simulate metadata lookup
                            let metadata = id.to_string();
                            (*id, *dist, metadata)
                        })
                        .collect();

                    black_box(processed)
                })
            });
        }
    }

    group.finish();
}

fn bench_query_pipeline_parallel(c: &mut Criterion) {
    let mut group = c.benchmark_group("Query Pipeline (Parallel)");

    let dims = 768;
    let n = 100_000;
    let vectors = generate_random_vectors(n, dims, 42);
    let queries: Vec<Vec<f32>> = generate_random_vectors(100, dims, 999);

    let index = HnswIndex::new(dims, 16, 64, 40, 42);
    index.insert_batch(&vectors);

    group.throughput(Throughput::Elements(100));

    group.bench_function("sequential", |bench| {
        bench.iter(|| {
            queries
                .iter()
                .map(|q| index.search(q, 10))
                .collect::<Vec<_>>()
        })
    });

    group.bench_function("parallel_queries", |bench| {
        bench.iter(|| {
            queries
                .par_iter()
                .map(|q| index.search(q, 10))
                .collect::<Vec<_>>()
        })
    });

    group.bench_function("parallel_search_internal", |bench| {
        bench.iter(|| {
            queries
                .iter()
                .map(|q| index.search_parallel(q, 10))
                .collect::<Vec<_>>()
        })
    });

    group.bench_function("full_parallel", |bench| {
        bench.iter(|| {
            queries
                .par_iter()
                .map(|q| index.search_parallel(q, 10))
                .collect::<Vec<_>>()
        })
    });

    group.finish();
}

// ============================================================================
// Insert Throughput Benchmarks
// ============================================================================

fn bench_insert_throughput(c: &mut Criterion) {
    let mut group = c.benchmark_group("Insert Throughput");
    group.sample_size(10);

    for &dims in [128, 384, 768, 1536].iter() {
        for &n in [1_000, 10_000, 100_000].iter() {
            let vectors = generate_random_vectors(n, dims, 42);

            group.throughput(Throughput::Elements(n as u64));

            group.bench_with_input(
                BenchmarkId::new(format!("{}d", dims), n),
                &vectors,
                |bench, vecs| {
                    bench.iter(|| {
                        let index = HnswIndex::new(dims, 16, 64, 40, 42);
                        index.insert_batch(vecs);
                        black_box(index.len())
                    })
                },
            );
        }
    }

    group.finish();
}

fn bench_insert_throughput_parallel(c: &mut Criterion) {
    let mut group = c.benchmark_group("Insert Throughput (Parallel)");
    group.sample_size(10);

    let dims = 768;

    for &n in [10_000, 100_000].iter() {
        let vectors = generate_random_vectors(n, dims, 42);

        group.throughput(Throughput::Elements(n as u64));

        group.bench_with_input(
            BenchmarkId::new("sequential", n),
            &vectors,
            |bench, vecs| {
                bench.iter(|| {
                    let index = HnswIndex::new(dims, 16, 64, 40, 42);
                    index.insert_batch(vecs);
                    black_box(index.len())
                })
            },
        );

        group.bench_with_input(BenchmarkId::new("parallel", n), &vectors, |bench, vecs| {
            bench.iter(|| {
                let index = HnswIndex::new(dims, 16, 64, 40, 42);
                index.insert_batch_parallel(vecs);
                black_box(index.len())
            })
        });
    }

    group.finish();
}

fn bench_insert_batching(c: &mut Criterion) {
    let mut group = c.benchmark_group("Insert Batch Sizes");
    group.sample_size(10);

    let dims = 768;
    let n = 10_000;
    let vectors = generate_random_vectors(n, dims, 42);

    for &batch_size in [1, 10, 100, 1000, 10000].iter() {
        group.throughput(Throughput::Elements(n as u64));

        group.bench_with_input(
            BenchmarkId::from_parameter(batch_size),
            &batch_size,
            |bench, &bs| {
                bench.iter(|| {
                    let index = HnswIndex::new(dims, 16, 64, 40, 42);

                    for chunk in vectors.chunks(bs) {
                        index.insert_batch(chunk);
                    }

                    black_box(index.len())
                })
            },
        );
    }

    group.finish();
}

// ============================================================================
// Concurrent Query Scaling
// ============================================================================

fn bench_concurrent_scaling(c: &mut Criterion) {
    let mut group = c.benchmark_group("Concurrent Query Scaling");
    group.sample_size(10);

    let dims = 768;
    let n = 100_000;
    let vectors = generate_random_vectors(n, dims, 42);
    let queries = generate_random_vectors(1000, dims, 999);

    let index = Arc::new(HnswIndex::new(dims, 16, 64, 40, 42));
    index.insert_batch(&vectors);

    for &num_threads in [1, 2, 4, 8, 16].iter() {
        group.throughput(Throughput::Elements(1000));

        group.bench_with_input(
            BenchmarkId::from_parameter(num_threads),
            &num_threads,
            |bench, &threads| {
                let pool = rayon::ThreadPoolBuilder::new()
                    .num_threads(threads)
                    .build()
                    .unwrap();

                bench.iter(|| {
                    pool.install(|| {
                        queries.par_iter().for_each(|q| {
                            black_box(index.search(q, 10));
                        });
                    })
                })
            },
        );
    }

    group.finish();
}

fn bench_mixed_workload(c: &mut Criterion) {
    let mut group = c.benchmark_group("Mixed Read/Write Workload");
    group.sample_size(10);

    let dims = 768;
    let n = 50_000;
    let vectors = generate_random_vectors(n, dims, 42);
    let queries = generate_random_vectors(100, dims, 999);
    let new_vectors = generate_random_vectors(1000, dims, 123);

    let index = Arc::new(HnswIndex::new(dims, 16, 64, 40, 42));
    index.insert_batch(&vectors);

    // Read-heavy (90% reads, 10% writes)
    group.bench_function("read_heavy", |bench| {
        let idx = index.clone();
        bench.iter(|| {
            // 90 reads
            for q in queries.iter().take(90) {
                black_box(idx.search(q, 10));
            }
            // 10 writes
            for v in new_vectors.iter().take(10) {
                black_box(idx.insert(v.clone()));
            }
        })
    });

    // Balanced (50% reads, 50% writes)
    group.bench_function("balanced", |bench| {
        let idx = index.clone();
        bench.iter(|| {
            for (q, v) in queries.iter().take(50).zip(new_vectors.iter().take(50)) {
                black_box(idx.search(q, 10));
                black_box(idx.insert(v.clone()));
            }
        })
    });

    // Write-heavy (10% reads, 90% writes)
    group.bench_function("write_heavy", |bench| {
        let idx = index.clone();
        bench.iter(|| {
            // 10 reads
            for q in queries.iter().take(10) {
                black_box(idx.search(q, 10));
            }
            // 90 writes
            for v in new_vectors.iter().take(90) {
                black_box(idx.insert(v.clone()));
            }
        })
    });

    group.finish();
}

// ============================================================================
// Memory Usage Under Load
// ============================================================================

fn bench_memory_growth(c: &mut Criterion) {
    let mut group = c.benchmark_group("Memory Growth");
    group.sample_size(10);

    let dims = 768;

    for &n in [1_000, 10_000, 50_000, 100_000].iter() {
        let vectors = generate_random_vectors(n, dims, 42);

        group.bench_with_input(BenchmarkId::from_parameter(n), &vectors, |bench, vecs| {
            bench.iter(|| {
                let index = HnswIndex::new(dims, 16, 64, 40, 42);
                index.insert_batch(vecs);

                let memory = index.memory_usage();
                let per_vector = memory as f64 / n as f64;

                black_box((memory, per_vector))
            })
        });
    }

    group.finish();
}

fn bench_memory_efficiency(c: &mut Criterion) {
    let mut group = c.benchmark_group("Memory Efficiency (M parameter)");
    group.sample_size(10);

    let dims = 768;
    let n = 10_000;
    let vectors = generate_random_vectors(n, dims, 42);

    for &m in [8, 12, 16, 24, 32, 48].iter() {
        group.bench_with_input(BenchmarkId::from_parameter(m), &m, |bench, &m_val| {
            bench.iter(|| {
                let index = HnswIndex::new(dims, m_val, 64, 40, 42);
                index.insert_batch(&vectors);

                let memory = index.memory_usage();
                let per_vector = memory as f64 / n as f64;

                black_box(per_vector)
            })
        });
    }

    group.finish();
}

// ============================================================================
// Latency Distribution
// ============================================================================

fn bench_latency_distribution(c: &mut Criterion) {
    let mut group = c.benchmark_group("Latency Distribution");
    group.sample_size(10);

    let dims = 768;
    let n = 100_000;
    let vectors = generate_random_vectors(n, dims, 42);
    let queries = generate_random_vectors(1000, dims, 999);

    let index = HnswIndex::new(dims, 16, 64, 40, 42);
    index.insert_batch(&vectors);

    group.bench_function("collect_percentiles", |bench| {
        bench.iter(|| {
            let mut latencies: Vec<Duration> = Vec::with_capacity(queries.len());

            for query in &queries {
                let start = Instant::now();
                black_box(index.search(query, 10));
                latencies.push(start.elapsed());
            }

            latencies.sort();

            let p50 = latencies[latencies.len() / 2];
            let p95 = latencies[(latencies.len() as f64 * 0.95) as usize];
            let p99 = latencies[(latencies.len() as f64 * 0.99) as usize];
            let p999 = latencies[(latencies.len() as f64 * 0.999) as usize];

            black_box((p50, p95, p99, p999))
        })
    });

    group.finish();
}
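The percentile extraction above indexes a sorted latency vector at floor(len * q). A std-only sketch of the same indexing (hypothetical helper; it adds a clamp so q close to 1.0 cannot index past the end, which the inline version above relies on len = 1000 to avoid):

```rust
/// Nearest-rank percentile over an already-sorted slice, clamped to the last element.
fn percentile(sorted: &[u64], q: f64) -> u64 {
    let idx = ((sorted.len() as f64 * q) as usize).min(sorted.len() - 1);
    sorted[idx]
}

fn main() {
    let latencies: Vec<u64> = (1..=100).collect(); // already sorted
    assert_eq!(percentile(&latencies, 0.50), 51); // index 50, the upper median
    assert_eq!(percentile(&latencies, 0.95), 96); // index 95
    assert_eq!(percentile(&latencies, 0.999), 100); // clamped to the last element
}
```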

// ============================================================================
// Dimension Scaling
// ============================================================================

fn bench_dimension_scaling(c: &mut Criterion) {
    let mut group = c.benchmark_group("Dimension Scaling");
    group.sample_size(10);

    let n = 10_000;

    for &dims in [64, 128, 256, 384, 512, 768, 1024, 1536, 2048, 3072].iter() {
        let vectors = generate_random_vectors(n, dims, 42);
        let query = vectors[0].clone();

        let index = HnswIndex::new(dims, 16, 64, 40, 42);
        index.insert_batch(&vectors);

        group.bench_with_input(BenchmarkId::new("search", dims), &dims, |bench, _| {
            bench.iter(|| black_box(index.search(&query, 10)))
        });
    }

    group.finish();
}

// ============================================================================
// pgvector Comparison Baselines
// ============================================================================

fn bench_baseline_brute_force(c: &mut Criterion) {
    let mut group = c.benchmark_group("Baseline Brute Force");
    group.sample_size(10);

    for &dims in [128, 384, 768, 1536].iter() {
        for &n in [1_000, 10_000, 100_000].iter() {
            let vectors = generate_random_vectors(n, dims, 42);
            let query = vectors[0].clone();

            group.throughput(Throughput::Elements(n as u64));

            // Sequential brute force
            group.bench_with_input(
                BenchmarkId::new(format!("{}d_seq", dims), n),
                &vectors,
                |bench, vecs| {
                    bench.iter(|| {
                        let mut distances: Vec<(usize, f32)> = vecs
                            .iter()
                            .enumerate()
                            .map(|(i, v)| {
                                let dist: f32 = query
                                    .iter()
                                    .zip(v.iter())
                                    .map(|(a, b)| (a - b).powi(2))
                                    .sum::<f32>()
                                    .sqrt();
                                (i, dist)
                            })
                            .collect();

                        distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
                        distances.truncate(10);
                        black_box(distances)
                    })
                },
            );

            // Parallel brute force
            group.bench_with_input(
                BenchmarkId::new(format!("{}d_par", dims), n),
                &vectors,
                |bench, vecs| {
                    bench.iter(|| {
                        let mut distances: Vec<(usize, f32)> = vecs
                            .par_iter()
                            .enumerate()
                            .map(|(i, v)| {
                                let dist: f32 = query
                                    .iter()
                                    .zip(v.iter())
                                    .map(|(a, b)| (a - b).powi(2))
                                    .sum::<f32>()
                                    .sqrt();
                                (i, dist)
                            })
                            .collect();

                        distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
                        distances.truncate(10);
                        black_box(distances)
                    })
                },
            );
        }
    }

    group.finish();
}

// ============================================================================
// Recall vs Throughput Tradeoff
// ============================================================================

fn bench_recall_throughput_tradeoff(c: &mut Criterion) {
    let mut group = c.benchmark_group("Recall vs Throughput");
    group.sample_size(10);

    let dims = 768;
    let n = 10_000;
    let vectors = generate_random_vectors(n, dims, 42);
    let query = vectors[0].clone();

    // Compute ground truth
    let ground_truth: Vec<usize> = {
        let mut distances: Vec<(usize, f32)> = vectors
            .iter()
            .enumerate()
            .map(|(i, v)| {
                let dist: f32 = query
                    .iter()
                    .zip(v.iter())
                    .map(|(a, b)| (a - b).powi(2))
                    .sum::<f32>()
                    .sqrt();
                (i, dist)
            })
            .collect();
        distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
        distances.iter().take(10).map(|(i, _)| *i).collect()
    };

    for &ef_search in [10, 20, 40, 80, 160, 320].iter() {
        let index = HnswIndex::new(dims, 16, 64, ef_search, 42);
        index.insert_batch(&vectors);

        group.bench_with_input(
            BenchmarkId::from_parameter(ef_search),
            &ef_search,
            |bench, _| {
                bench.iter(|| {
                    // Note: search() in this harness is brute force, so recall is
                    // 1.0 by construction regardless of ef_search; swap in a real
                    // HNSW search to observe the actual tradeoff.
                    let results = index.search(&query, 10);

                    // Calculate recall
                    let recall = results
                        .iter()
                        .filter(|(id, _)| ground_truth.contains(&(*id as usize)))
                        .count() as f64
                        / 10.0;

                    black_box(recall)
                })
            },
        );
    }

    group.finish();
}
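Recall above is the fraction of returned ids that appear in the brute-force ground truth. A std-only sketch of that computation (hypothetical helper name):

```rust
use std::collections::HashSet;

/// Fraction of returned ids that appear in the ground-truth top-k.
fn recall_at_k(results: &[u64], ground_truth: &[u64]) -> f64 {
    let truth: HashSet<u64> = ground_truth.iter().copied().collect();
    results.iter().filter(|id| truth.contains(id)).count() as f64
        / ground_truth.len() as f64
}

fn main() {
    let truth = [1, 2, 3, 4, 5];
    let found = [1, 2, 3, 9, 10]; // 3 of 5 true neighbors recovered
    assert!((recall_at_k(&found, &truth) - 0.6).abs() < 1e-12);
}
```

Using a `HashSet` for the ground truth keeps the check O(k) per query instead of the O(k²) `contains` scan over a `Vec` used in the loop above.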

criterion_group!(
    benches,
    // Query Pipeline
    bench_query_pipeline,
    bench_query_pipeline_parallel,
    // Insert Throughput
    bench_insert_throughput,
    bench_insert_throughput_parallel,
    bench_insert_batching,
    // Concurrent Scaling
    bench_concurrent_scaling,
    bench_mixed_workload,
    // Memory Usage
    bench_memory_growth,
    bench_memory_efficiency,
    // Latency
    bench_latency_distribution,
    // Dimension Scaling
    bench_dimension_scaling,
    // Baselines
    bench_baseline_brute_force,
    // Recall/Throughput
    bench_recall_throughput_tradeoff,
);

criterion_main!(benches);

742
crates/ruvector-postgres/benches/hybrid_bench.rs
Normal file
@@ -0,0 +1,742 @@
//! Hybrid search benchmarks
//!
//! Benchmarks for combining vector search with keyword/BM25 scoring:
//! - Vector-only vs hybrid latency
//! - BM25 scoring overhead
//! - Fusion algorithm comparison (RRF, weighted sum)
//! - Parallel branch execution gain

use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;
use std::cmp::Ordering;
use std::collections::{BinaryHeap, HashMap, HashSet};

// ============================================================================
// BM25 Implementation
// ============================================================================

mod bm25 {
    use std::cmp::Ordering;
    use std::collections::HashMap;

    /// Simple tokenizer
    pub fn tokenize(text: &str) -> Vec<String> {
        text.to_lowercase()
            .split(|c: char| !c.is_alphanumeric())
            .filter(|s| !s.is_empty() && s.len() > 2)
            .map(|s| s.to_string())
            .collect()
    }

    /// BM25 scoring index
    pub struct BM25Index {
        /// Document frequency for each term
        pub doc_freq: HashMap<String, usize>,
        /// Term frequency per document
        pub term_freq: Vec<HashMap<String, usize>>,
        /// Document lengths
        pub doc_lengths: Vec<usize>,
        /// Average document length
        pub avg_doc_len: f64,
        /// Number of documents
        pub num_docs: usize,
        /// BM25 parameters
        pub k1: f64,
        pub b: f64,
    }

    impl BM25Index {
        pub fn new(k1: f64, b: f64) -> Self {
            Self {
                doc_freq: HashMap::new(),
                term_freq: Vec::new(),
                doc_lengths: Vec::new(),
                avg_doc_len: 0.0,
                num_docs: 0,
                k1,
                b,
            }
        }

        pub fn build(&mut self, documents: &[String]) {
            self.num_docs = documents.len();
            self.term_freq = Vec::with_capacity(documents.len());
            self.doc_lengths = Vec::with_capacity(documents.len());

            let mut total_len = 0usize;

            for doc in documents {
                let tokens = tokenize(doc);
                self.doc_lengths.push(tokens.len());
                total_len += tokens.len();

                let mut tf: HashMap<String, usize> = HashMap::new();
                let mut seen_terms: std::collections::HashSet<String> =
                    std::collections::HashSet::new();

                for token in tokens {
                    *tf.entry(token.clone()).or_insert(0) += 1;

                    if !seen_terms.contains(&token) {
                        *self.doc_freq.entry(token.clone()).or_insert(0) += 1;
                        seen_terms.insert(token);
                    }
                }

                self.term_freq.push(tf);
            }

            self.avg_doc_len = total_len as f64 / documents.len() as f64;
        }

        /// Calculate IDF for a term
        fn idf(&self, term: &str) -> f64 {
            let df = self.doc_freq.get(term).copied().unwrap_or(0) as f64;
            if df == 0.0 {
                return 0.0;
            }
            ((self.num_docs as f64 - df + 0.5) / (df + 0.5) + 1.0).ln()
        }

        /// Score a document against a query
        pub fn score(&self, doc_id: usize, query_tokens: &[String]) -> f64 {
            if doc_id >= self.term_freq.len() {
                return 0.0;
            }

            let doc_tf = &self.term_freq[doc_id];
            let doc_len = self.doc_lengths[doc_id] as f64;

            let mut score = 0.0;

            for term in query_tokens {
                let tf = doc_tf.get(term).copied().unwrap_or(0) as f64;
                if tf == 0.0 {
                    continue;
                }

                let idf = self.idf(term);
                let numerator = tf * (self.k1 + 1.0);
                let denominator =
                    tf + self.k1 * (1.0 - self.b + self.b * (doc_len / self.avg_doc_len));

                score += idf * (numerator / denominator);
            }

            score
        }

        /// Search and return top-k documents
        pub fn search(&self, query: &str, k: usize) -> Vec<(usize, f64)> {
            let query_tokens = tokenize(query);

            let mut scores: Vec<(usize, f64)> = (0..self.num_docs)
                .map(|doc_id| (doc_id, self.score(doc_id, &query_tokens)))
                .filter(|(_, score)| *score > 0.0)
                .collect();

            scores.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(Ordering::Equal));
            scores.truncate(k);
            scores
        }
    }
}
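The `idf` method above computes ln((N - df + 0.5) / (df + 0.5) + 1), the smoothed Okapi BM25 inverse document frequency: rarer terms score higher, and the +1 inside the log keeps the value positive even for terms that occur in every document. A std-only sketch of just that term (hypothetical helper; N and df values are assumptions for the example):

```rust
/// Smoothed BM25 inverse document frequency for a term appearing in
/// `doc_freq` of `num_docs` documents.
fn idf(num_docs: usize, doc_freq: usize) -> f64 {
    let n = num_docs as f64;
    let df = doc_freq as f64;
    ((n - df + 0.5) / (df + 0.5) + 1.0).ln()
}

fn main() {
    // Term in 1 of 2 docs: ln((2 - 1 + 0.5) / 1.5 + 1) = ln(2)
    assert!((idf(2, 1) - 2f64.ln()).abs() < 1e-12);
    // Rarer terms score higher...
    assert!(idf(1000, 10) > idf(1000, 500));
    // ...and even a term in every document stays slightly positive.
    assert!(idf(1000, 1000) > 0.0);
}
```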

// ============================================================================
// Vector Search (Simplified)
// ============================================================================

mod vector_search {
    use std::cmp::Ordering;

    pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
        a.iter()
            .zip(b.iter())
            .map(|(x, y)| (x - y).powi(2))
            .sum::<f32>()
            .sqrt()
    }

    pub fn search(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
        let mut results: Vec<(usize, f32)> = vectors
            .iter()
            .enumerate()
            .map(|(i, v)| (i, euclidean_distance(query, v)))
            .collect();

        results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal));
        results.truncate(k);
        results
    }

    pub fn search_parallel(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
        use rayon::prelude::*;

        let mut results: Vec<(usize, f32)> = vectors
            .par_iter()
            .enumerate()
            .map(|(i, v)| (i, euclidean_distance(query, v)))
            .collect();

        results.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal));
        results.truncate(k);
        results
    }
}
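`euclidean_distance` above is the plain L2 metric; the 3-4-5 right triangle makes a convenient sanity check. A std-only sketch that duplicates the function so the snippet runs standalone:

```rust
/// Same L2 distance as vector_search::euclidean_distance above.
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

fn main() {
    // 3-4-5 triangle: sqrt(9 + 16) = 5
    assert_eq!(euclidean_distance(&[0.0, 0.0], &[3.0, 4.0]), 5.0);
    // Distance from a point to itself is zero
    assert_eq!(euclidean_distance(&[1.0, 2.0], &[1.0, 2.0]), 0.0);
}
```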

// ============================================================================
// Fusion Algorithms
// ============================================================================

mod fusion {
    use std::collections::{HashMap, HashSet};

    /// Reciprocal Rank Fusion
    pub fn rrf(
        vector_results: &[(usize, f32)],
        text_results: &[(usize, f64)],
        k: usize,
        rrf_k: f64,
    ) -> Vec<(usize, f64)> {
        let mut scores: HashMap<usize, f64> = HashMap::new();

        // Vector results
        for (rank, (doc_id, _)) in vector_results.iter().enumerate() {
            let rrf_score = 1.0 / (rrf_k + rank as f64 + 1.0);
            *scores.entry(*doc_id).or_insert(0.0) += rrf_score;
        }

        // Text results
        for (rank, (doc_id, _)) in text_results.iter().enumerate() {
            let rrf_score = 1.0 / (rrf_k + rank as f64 + 1.0);
            *scores.entry(*doc_id).or_insert(0.0) += rrf_score;
        }

        let mut results: Vec<(usize, f64)> = scores.into_iter().collect();
        results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        results.truncate(k);
        results
    }

    /// Weighted score fusion (requires normalized scores)
    pub fn weighted_sum(
        vector_results: &[(usize, f32)],
        text_results: &[(usize, f64)],
        k: usize,
        vector_weight: f64,
        text_weight: f64,
    ) -> Vec<(usize, f64)> {
        // Normalize vector scores (lower distance = higher score)
        let max_dist = vector_results
            .iter()
            .map(|(_, d)| *d)
            .fold(0.0f32, f32::max);
        let vector_scores: HashMap<usize, f64> = vector_results
            .iter()
            .map(|(id, dist)| (*id, (1.0 - dist / max_dist.max(1e-6)) as f64))
            .collect();

        // Normalize text scores
        let max_text = text_results.iter().map(|(_, s)| *s).fold(0.0f64, f64::max);
        let text_scores: HashMap<usize, f64> = text_results
            .iter()
            .map(|(id, score)| (*id, score / max_text.max(1e-6)))
            .collect();

        // Combine
        let mut all_ids: HashSet<usize> = HashSet::new();
        all_ids.extend(vector_scores.keys());
        all_ids.extend(text_scores.keys());

        let mut results: Vec<(usize, f64)> = all_ids
            .iter()
            .map(|&id| {
                let v_score = vector_scores.get(&id).copied().unwrap_or(0.0);
                let t_score = text_scores.get(&id).copied().unwrap_or(0.0);
                (id, vector_weight * v_score + text_weight * t_score)
            })
            .collect();

        results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        results.truncate(k);
        results
    }

    /// Disjunctive Normalization
    pub fn disjunctive_normalization(
        vector_results: &[(usize, f32)],
        text_results: &[(usize, f64)],
        k: usize,
    ) -> Vec<(usize, f64)> {
        let mut scores: HashMap<usize, f64> = HashMap::new();

        // Vector results (convert distance to similarity)
        let max_dist = vector_results
            .iter()
            .map(|(_, d)| *d)
            .fold(0.0f32, f32::max);
        for (doc_id, dist) in vector_results {
            let sim = 1.0 - (*dist / max_dist.max(1e-6)) as f64;
            scores.insert(*doc_id, sim);
        }

        // Text results (add if not present, max if present)
        let max_text = text_results.iter().map(|(_, s)| *s).fold(0.0f64, f64::max);
        for (doc_id, score) in text_results {
            let norm_score = score / max_text.max(1e-6);
            let current = scores.entry(*doc_id).or_insert(0.0);
            *current = current.max(norm_score);
        }

        let mut results: Vec<(usize, f64)> = scores.into_iter().collect();
        results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        results.truncate(k);
        results
    }
}
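`rrf` above scores each document as the sum of 1/(rrf_k + rank + 1) over the result lists it appears in, so a document ranked moderately in both branches can beat one ranked first in only one. A std-only sketch with two toy ranked id lists (the ids are assumptions for the example):

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over any number of best-first ranked lists.
fn rrf_scores(lists: &[&[usize]], rrf_k: f64) -> HashMap<usize, f64> {
    let mut scores = HashMap::new();
    for list in lists {
        for (rank, doc) in list.iter().enumerate() {
            *scores.entry(*doc).or_insert(0.0) += 1.0 / (rrf_k + rank as f64 + 1.0);
        }
    }
    scores
}

fn main() {
    let vector_ranked = [10, 20, 30]; // best-first vector branch
    let text_ranked = [20, 40]; // best-first BM25 branch
    let scores = rrf_scores(&[&vector_ranked[..], &text_ranked[..]], 60.0);

    // Doc 20 appears in both lists (1/62 + 1/61), so it beats doc 10 (1/61)
    // even though it never ranked first in either branch.
    let best = scores
        .iter()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(doc, _)| *doc);
    assert_eq!(best, Some(20));
}
```

The constant rrf_k = 60 is the value commonly used in the RRF literature; it damps the gap between adjacent ranks so that agreement across branches dominates.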

use bm25::{tokenize, BM25Index};
use fusion::{disjunctive_normalization, rrf, weighted_sum};
use vector_search::{search as vector_search_fn, search_parallel as vector_search_parallel};

// ============================================================================
// Test Data Generation
// ============================================================================

fn generate_random_vectors(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
    let mut rng = ChaCha8Rng::seed_from_u64(seed);
    (0..n)
        .map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
        .collect()
}

fn generate_random_documents(n: usize, seed: u64) -> Vec<String> {
    let words = [
        "machine", "learning", "artificial", "intelligence", "neural", "network",
        "deep", "training", "model", "data", "algorithm", "optimization",
        "gradient", "descent", "backpropagation", "convolution", "recurrent",
        "transformer", "attention", "embedding", "vector", "search", "similarity",
        "distance", "nearest", "neighbor", "index", "query", "retrieval",
        "ranking", "database", "storage", "distributed", "parallel", "processing",
    ];

    let mut rng = ChaCha8Rng::seed_from_u64(seed);

    (0..n)
        .map(|_| {
            let len = rng.gen_range(20..100);
            (0..len)
                .map(|_| words[rng.gen_range(0..words.len())])
                .collect::<Vec<_>>()
                .join(" ")
        })
        .collect()
}
|
||||
|
||||
// ============================================================================
|
||||
// Vector-Only vs Hybrid Benchmarks
|
||||
// ============================================================================
|
||||
|
||||
fn bench_vector_only(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Vector Only Search");
|
||||
|
||||
for &n in [10_000, 100_000].iter() {
|
||||
let dims = 768;
|
||||
let vectors = generate_random_vectors(n, dims, 42);
|
||||
let query = vectors[0].clone();
|
||||
|
||||
group.throughput(Throughput::Elements(n as u64));
|
||||
|
||||
group.bench_with_input(BenchmarkId::new("sequential", n), &n, |bench, _| {
|
||||
bench.iter(|| black_box(vector_search_fn(&vectors, &query, 10)))
|
||||
});
|
||||
|
||||
group.bench_with_input(BenchmarkId::new("parallel", n), &n, |bench, _| {
|
||||
bench.iter(|| black_box(vector_search_parallel(&vectors, &query, 10)))
|
||||
});
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
fn bench_text_only(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Text Only (BM25) Search");
|
||||
|
||||
for &n in [10_000, 100_000].iter() {
|
||||
let documents = generate_random_documents(n, 42);
|
||||
|
||||
let mut bm25 = BM25Index::new(1.2, 0.75);
|
||||
bm25.build(&documents);
|
||||
|
||||
let query = "machine learning neural network";
|
||||
|
||||
group.throughput(Throughput::Elements(n as u64));
|
||||
|
||||
group.bench_with_input(BenchmarkId::from_parameter(n), &n, |bench, _| {
|
||||
bench.iter(|| black_box(bm25.search(query, 10)))
|
||||
});
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
fn bench_hybrid_search(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Hybrid Search");
|
||||
|
||||
for &n in [10_000, 100_000].iter() {
|
||||
let dims = 768;
|
||||
let vectors = generate_random_vectors(n, dims, 42);
|
||||
let documents = generate_random_documents(n, 42);
|
||||
let vector_query = vectors[0].clone();
|
||||
let text_query = "machine learning neural network";
|
||||
|
||||
let mut bm25 = BM25Index::new(1.2, 0.75);
|
||||
bm25.build(&documents);
|
||||
|
||||
group.throughput(Throughput::Elements(n as u64));
|
||||
|
||||
// Sequential hybrid
|
||||
group.bench_with_input(BenchmarkId::new("sequential", n), &n, |bench, _| {
|
||||
bench.iter(|| {
|
||||
let vector_results = vector_search_fn(&vectors, &vector_query, 100);
|
||||
let text_results = bm25.search(text_query, 100);
|
||||
black_box(rrf(&vector_results, &text_results, 10, 60.0))
|
||||
})
|
||||
});
|
||||
|
||||
// Parallel hybrid (branches)
|
||||
group.bench_with_input(BenchmarkId::new("parallel_branches", n), &n, |bench, _| {
|
||||
bench.iter(|| {
|
||||
let (vector_results, text_results) = rayon::join(
|
||||
|| vector_search_parallel(&vectors, &vector_query, 100),
|
||||
|| bm25.search(text_query, 100),
|
||||
);
|
||||
black_box(rrf(&vector_results, &text_results, 10, 60.0))
|
||||
})
|
||||
});
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
// ============================================================================
|
||||
// BM25 Overhead Benchmarks
|
||||
// ============================================================================
|
||||
|
||||
fn bench_bm25_build(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("BM25 Index Build");
|
||||
|
||||
for &n in [1_000, 10_000, 100_000].iter() {
|
||||
let documents = generate_random_documents(n, 42);
|
||||
|
||||
group.throughput(Throughput::Elements(n as u64));
|
||||
|
||||
group.bench_with_input(BenchmarkId::from_parameter(n), &documents, |bench, docs| {
|
||||
bench.iter(|| {
|
||||
let mut bm25 = BM25Index::new(1.2, 0.75);
|
||||
bm25.build(docs);
|
||||
black_box(bm25)
|
||||
})
|
||||
});
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
fn bench_bm25_query_lengths(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("BM25 Query Length");
|
||||
|
||||
let n = 100_000;
|
||||
let documents = generate_random_documents(n, 42);
|
||||
|
||||
let mut bm25 = BM25Index::new(1.2, 0.75);
|
||||
bm25.build(&documents);
|
||||
|
||||
let queries = [
|
||||
"machine",
|
||||
"machine learning",
|
||||
"machine learning neural network",
|
||||
"machine learning neural network deep training model",
|
||||
"machine learning neural network deep training model algorithm optimization gradient descent",
|
||||
];
|
||||
|
||||
for query in queries.iter() {
|
||||
let token_count = tokenize(query).len();
|
||||
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("tokens", token_count),
|
||||
query,
|
||||
|bench, q| bench.iter(|| black_box(bm25.search(q, 10))),
|
||||
);
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
// ============================================================================
|
||||
// Fusion Algorithm Comparison
|
||||
// ============================================================================
|
||||
|
||||
fn bench_fusion_algorithms(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Fusion Algorithms");
|
||||
|
||||
let n = 100_000;
|
||||
let dims = 768;
|
||||
let vectors = generate_random_vectors(n, dims, 42);
|
||||
let documents = generate_random_documents(n, 42);
|
||||
let vector_query = vectors[0].clone();
|
||||
let text_query = "machine learning neural network";
|
||||
|
||||
let mut bm25 = BM25Index::new(1.2, 0.75);
|
||||
bm25.build(&documents);
|
||||
|
||||
// Pre-compute search results
|
||||
let vector_results = vector_search_fn(&vectors, &vector_query, 1000);
|
||||
let text_results = bm25.search(text_query, 1000);
|
||||
|
||||
for &k in [10, 50, 100].iter() {
|
||||
group.bench_with_input(BenchmarkId::new("rrf", k), &k, |bench, &k_val| {
|
||||
bench.iter(|| black_box(rrf(&vector_results, &text_results, k_val, 60.0)))
|
||||
});
|
||||
|
||||
group.bench_with_input(BenchmarkId::new("weighted_sum", k), &k, |bench, &k_val| {
|
||||
bench.iter(|| {
|
||||
black_box(weighted_sum(
|
||||
&vector_results,
|
||||
&text_results,
|
||||
k_val,
|
||||
0.6,
|
||||
0.4,
|
||||
))
|
||||
})
|
||||
});
|
||||
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("disjunctive_norm", k),
|
||||
&k,
|
||||
|bench, &k_val| {
|
||||
bench.iter(|| {
|
||||
black_box(disjunctive_normalization(
|
||||
&vector_results,
|
||||
&text_results,
|
||||
k_val,
|
||||
))
|
||||
})
|
||||
},
|
||||
);
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
fn bench_rrf_k_parameter(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("RRF K Parameter");
|
||||
|
||||
let n = 100_000;
|
||||
let dims = 768;
|
||||
let vectors = generate_random_vectors(n, dims, 42);
|
||||
let documents = generate_random_documents(n, 42);
|
||||
let vector_query = vectors[0].clone();
|
||||
let text_query = "machine learning neural network";
|
||||
|
||||
let mut bm25 = BM25Index::new(1.2, 0.75);
|
||||
bm25.build(&documents);
|
||||
|
||||
let vector_results = vector_search_fn(&vectors, &vector_query, 1000);
|
||||
let text_results = bm25.search(text_query, 1000);
|
||||
|
||||
for &rrf_k in [1.0, 20.0, 60.0, 100.0, 200.0].iter() {
|
||||
group.bench_with_input(
|
||||
BenchmarkId::from_parameter(rrf_k as i32),
|
||||
&rrf_k,
|
||||
|bench, &k| bench.iter(|| black_box(rrf(&vector_results, &text_results, 10, k))),
|
||||
);
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
fn bench_weight_ratios(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Weight Ratios");
|
||||
|
||||
let n = 100_000;
|
||||
let dims = 768;
|
||||
let vectors = generate_random_vectors(n, dims, 42);
|
||||
let documents = generate_random_documents(n, 42);
|
||||
let vector_query = vectors[0].clone();
|
||||
let text_query = "machine learning neural network";
|
||||
|
||||
let mut bm25 = BM25Index::new(1.2, 0.75);
|
||||
bm25.build(&documents);
|
||||
|
||||
let vector_results = vector_search_fn(&vectors, &vector_query, 1000);
|
||||
let text_results = bm25.search(text_query, 1000);
|
||||
|
||||
let ratios = [
|
||||
(0.0, 1.0, "text_only"),
|
||||
(0.3, 0.7, "text_heavy"),
|
||||
(0.5, 0.5, "balanced"),
|
||||
(0.7, 0.3, "vector_heavy"),
|
||||
(1.0, 0.0, "vector_only"),
|
||||
];
|
||||
|
||||
for (vector_w, text_w, name) in ratios.iter() {
|
||||
group.bench_with_input(
|
||||
BenchmarkId::from_parameter(name),
|
||||
&(*vector_w, *text_w),
|
||||
|bench, &(v_w, t_w)| {
|
||||
bench.iter(|| black_box(weighted_sum(&vector_results, &text_results, 10, v_w, t_w)))
|
||||
},
|
||||
);
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
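The `weighted_sum` fusion swept above is defined in the `fusion` module, which is outside this excerpt. A self-contained sketch of the max-normalized weighted-sum idea these ratios exercise (the name `weighted_fuse` and the exact normalization are illustrative, not the crate's API):

```rust
use std::collections::HashMap;

/// Max-normalize each score list, then combine per document as
/// `v_w * vector_score + t_w * text_score` (missing scores count as 0).
fn weighted_fuse(
    vector_results: &[(usize, f64)],
    text_results: &[(usize, f64)],
    k: usize,
    v_w: f64,
    t_w: f64,
) -> Vec<(usize, f64)> {
    let mut scores: HashMap<usize, f64> = HashMap::new();
    for (results, weight) in [(vector_results, v_w), (text_results, t_w)] {
        let max = results.iter().map(|(_, s)| *s).fold(0.0f64, f64::max).max(1e-6);
        for &(doc, s) in results {
            *scores.entry(doc).or_insert(0.0) += weight * s / max;
        }
    }
    let mut out: Vec<(usize, f64)> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out.truncate(k);
    out
}

fn main() {
    let vector = vec![(1, 0.9), (2, 0.8), (3, 0.1)];
    let text = vec![(2, 12.0), (4, 6.0)];
    let fused = weighted_fuse(&vector, &text, 3, 0.6, 0.4);
    // Doc 2 appears in both lists, so it outranks doc 1 despite the
    // lower raw vector score: 0.6 * (0.8 / 0.9) + 0.4 * 1.0 > 0.6 * 1.0.
    assert_eq!(fused[0].0, 2);
    println!("{:?}", fused);
}
```

Because each list is normalized by its own maximum, BM25 scores (unbounded) and cosine similarities (bounded) end up on comparable scales before the weights apply, which is why the `(0.3, 0.7)`/`(0.7, 0.3)` sweeps above are meaningful.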
// ============================================================================
// Parallel Branch Execution
// ============================================================================

fn bench_parallel_execution_gain(c: &mut Criterion) {
    let mut group = c.benchmark_group("Parallel Branch Execution");

    for &n in [10_000, 50_000, 100_000].iter() {
        let dims = 768;
        let vectors = generate_random_vectors(n, dims, 42);
        let documents = generate_random_documents(n, 42);
        let vector_query = vectors[0].clone();
        let text_query = "machine learning neural network";

        let mut bm25 = BM25Index::new(1.2, 0.75);
        bm25.build(&documents);

        // Sequential
        group.bench_with_input(BenchmarkId::new("sequential", n), &n, |bench, _| {
            bench.iter(|| {
                let vector_results = vector_search_fn(&vectors, &vector_query, 100);
                let text_results = bm25.search(text_query, 100);
                black_box((vector_results, text_results))
            })
        });

        // Parallel with rayon::join
        group.bench_with_input(BenchmarkId::new("parallel_join", n), &n, |bench, _| {
            bench.iter(|| {
                let (vector_results, text_results) = rayon::join(
                    || vector_search_fn(&vectors, &vector_query, 100),
                    || bm25.search(text_query, 100),
                );
                black_box((vector_results, text_results))
            })
        });

        // Parallel vector search only
        group.bench_with_input(BenchmarkId::new("parallel_vector", n), &n, |bench, _| {
            bench.iter(|| {
                let vector_results = vector_search_parallel(&vectors, &vector_query, 100);
                let text_results = bm25.search(text_query, 100);
                black_box((vector_results, text_results))
            })
        });

        // Full parallel
        group.bench_with_input(BenchmarkId::new("full_parallel", n), &n, |bench, _| {
            bench.iter(|| {
                let (vector_results, text_results) = rayon::join(
                    || vector_search_parallel(&vectors, &vector_query, 100),
                    || bm25.search(text_query, 100),
                );
                black_box((vector_results, text_results))
            })
        });
    }

    group.finish();
}

// ============================================================================
// Candidate Count Analysis
// ============================================================================

fn bench_candidate_counts(c: &mut Criterion) {
    let mut group = c.benchmark_group("Candidate Count Analysis");

    let n = 100_000;
    let dims = 768;
    let vectors = generate_random_vectors(n, dims, 42);
    let documents = generate_random_documents(n, 42);
    let vector_query = vectors[0].clone();
    let text_query = "machine learning neural network";

    let mut bm25 = BM25Index::new(1.2, 0.75);
    bm25.build(&documents);

    for &candidates in [50, 100, 200, 500, 1000, 2000].iter() {
        group.bench_with_input(
            BenchmarkId::from_parameter(candidates),
            &candidates,
            |bench, &k_candidates| {
                bench.iter(|| {
                    let (vector_results, text_results) = rayon::join(
                        || vector_search_parallel(&vectors, &vector_query, k_candidates),
                        || bm25.search(text_query, k_candidates),
                    );
                    black_box(rrf(&vector_results, &text_results, 10, 60.0))
                })
            },
        );
    }

    group.finish();
}

criterion_group!(
    benches,
    // Vector vs Text
    bench_vector_only,
    bench_text_only,
    bench_hybrid_search,
    // BM25 Overhead
    bench_bm25_build,
    bench_bm25_query_lengths,
    // Fusion Algorithms
    bench_fusion_algorithms,
    bench_rrf_k_parameter,
    bench_weight_ratios,
    // Parallel Execution
    bench_parallel_execution_gain,
    bench_candidate_counts,
);

criterion_main!(benches);
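The `rrf` fusion these groups sweep (with `rrf_k` from 1 to 200) is rank-based rather than score-based: each list contributes `1 / (rrf_k + rank)` per document, so raw score scales never need normalizing. Since the `fusion` module is outside this excerpt, here is a standalone sketch of standard Reciprocal Rank Fusion (names and signature are illustrative, not the crate's API):

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over ranked doc-id lists; ranks start at 1.
fn rrf_fuse(lists: &[&[usize]], k: usize, rrf_k: f64) -> Vec<(usize, f64)> {
    let mut scores: HashMap<usize, f64> = HashMap::new();
    for list in lists {
        for (rank, &doc) in list.iter().enumerate() {
            *scores.entry(doc).or_insert(0.0) += 1.0 / (rrf_k + (rank + 1) as f64);
        }
    }
    let mut out: Vec<(usize, f64)> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out.truncate(k);
    out
}

fn main() {
    let vector_ranked = [10usize, 20, 30];
    let text_ranked = [20usize, 40, 10];
    let fused = rrf_fuse(&[&vector_ranked[..], &text_ranked[..]], 2, 60.0);
    // Doc 20 is ranked in both lists (2nd and 1st), so it wins:
    // 1/62 + 1/61 > 1/61 + 1/63 (doc 10's total).
    assert_eq!(fused[0].0, 20);
    assert_eq!(fused[1].0, 10);
}
```

Smaller `rrf_k` values sharpen the influence of top ranks; the sweep in `bench_rrf_k_parameter` measures only the (tiny) cost difference, since the arithmetic is identical for any `rrf_k`.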
1394	crates/ruvector-postgres/benches/index_bench.rs	Normal file (diff suppressed because it is too large)
915	crates/ruvector-postgres/benches/integrity_bench.rs	Normal file
@@ -0,0 +1,915 @@
//! Index integrity and graph maintenance benchmarks
//!
//! Benchmarks for v2 structural integrity features:
//! - Contracted graph construction
//! - Mincut computation time
//! - State transition overhead
//! - Gating check latency
//! - Graph connectivity verification

use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use rayon::prelude::*;
use std::cmp::Ordering;
use std::collections::{BinaryHeap, HashMap, HashSet, VecDeque};

// ============================================================================
// Graph Structures for Index Integrity
// ============================================================================

mod graph {
    use std::cmp::Ordering;
    use std::collections::{BinaryHeap, HashMap, HashSet, VecDeque};

    /// Node in the HNSW graph (simplified)
    #[derive(Clone)]
    pub struct GraphNode {
        pub id: u64,
        pub neighbors: Vec<u64>,
        pub layer: usize,
    }

    /// Graph for integrity checking
    pub struct Graph {
        pub nodes: HashMap<u64, GraphNode>,
        pub max_layer: usize,
    }

    impl Graph {
        pub fn new() -> Self {
            Self {
                nodes: HashMap::new(),
                max_layer: 0,
            }
        }

        pub fn add_node(&mut self, id: u64, layer: usize) {
            self.nodes.insert(
                id,
                GraphNode {
                    id,
                    neighbors: Vec::new(),
                    layer,
                },
            );
            self.max_layer = self.max_layer.max(layer);
        }

        pub fn add_edge(&mut self, from: u64, to: u64) {
            if let Some(node) = self.nodes.get_mut(&from) {
                if !node.neighbors.contains(&to) {
                    node.neighbors.push(to);
                }
            }
        }

        pub fn len(&self) -> usize {
            self.nodes.len()
        }
    }

    /// Contracted graph for integrity verification
    pub struct ContractedGraph {
        /// Super-nodes (contracted regions)
        pub super_nodes: Vec<SuperNode>,
        /// Edges between super-nodes
        pub super_edges: Vec<(usize, usize, f32)>,
        /// Node to super-node mapping
        pub node_mapping: HashMap<u64, usize>,
    }

    #[derive(Clone)]
    pub struct SuperNode {
        pub id: usize,
        pub original_nodes: Vec<u64>,
        pub internal_edges: usize,
    }

    impl ContractedGraph {
        pub fn new() -> Self {
            Self {
                super_nodes: Vec::new(),
                super_edges: Vec::new(),
                node_mapping: HashMap::new(),
            }
        }

        /// Build contracted graph from original graph
        pub fn build_from_graph(graph: &Graph, contraction_factor: usize) -> Self {
            let mut contracted = ContractedGraph::new();

            // Group nodes by region (simplified partitioning)
            let node_ids: Vec<u64> = graph.nodes.keys().copied().collect();
            let _num_super_nodes = (node_ids.len() / contraction_factor).max(1);

            for (i, chunk) in node_ids.chunks(contraction_factor).enumerate() {
                let super_node = SuperNode {
                    id: i,
                    original_nodes: chunk.to_vec(),
                    internal_edges: chunk
                        .iter()
                        .filter_map(|&id| graph.nodes.get(&id))
                        .flat_map(|n| n.neighbors.iter())
                        .filter(|&&neighbor| chunk.contains(&neighbor))
                        .count(),
                };

                for &node_id in chunk {
                    contracted.node_mapping.insert(node_id, i);
                }

                contracted.super_nodes.push(super_node);
            }

            // Build super edges
            let mut edge_weights: HashMap<(usize, usize), f32> = HashMap::new();

            for node in graph.nodes.values() {
                let from_super = contracted.node_mapping[&node.id];

                for &neighbor in &node.neighbors {
                    if let Some(&to_super) = contracted.node_mapping.get(&neighbor) {
                        if from_super != to_super {
                            let key = if from_super < to_super {
                                (from_super, to_super)
                            } else {
                                (to_super, from_super)
                            };
                            *edge_weights.entry(key).or_insert(0.0) += 1.0;
                        }
                    }
                }
            }

            contracted.super_edges = edge_weights
                .into_iter()
                .map(|((a, b), w)| (a, b, w))
                .collect();

            contracted
        }

        pub fn num_super_nodes(&self) -> usize {
            self.super_nodes.len()
        }

        pub fn num_super_edges(&self) -> usize {
            self.super_edges.len()
        }
    }

    /// Mincut computation using Ford-Fulkerson algorithm
    pub struct MincutComputer {
        /// Adjacency list with capacities
        adj: Vec<Vec<(usize, f32)>>,
        pub n: usize,
    }

    impl MincutComputer {
        pub fn from_contracted_graph(contracted: &ContractedGraph) -> Self {
            let n = contracted.num_super_nodes();
            let mut adj: Vec<Vec<(usize, f32)>> = vec![Vec::new(); n];

            for &(a, b, w) in &contracted.super_edges {
                adj[a].push((b, w));
                adj[b].push((a, w));
            }

            Self { adj, n }
        }

        /// Find mincut using BFS-based augmenting paths
        pub fn compute_mincut(&self, source: usize, sink: usize) -> f32 {
            if source == sink || self.n == 0 {
                return 0.0;
            }

            // Create residual capacity matrix
            let mut residual: Vec<Vec<f32>> = vec![vec![0.0; self.n]; self.n];

            for (from, edges) in self.adj.iter().enumerate() {
                for &(to, cap) in edges {
                    residual[from][to] = cap;
                }
            }

            let mut max_flow = 0.0;

            // BFS to find augmenting path
            loop {
                let mut parent = vec![None; self.n];
                let mut visited = vec![false; self.n];
                let mut queue = VecDeque::new();

                visited[source] = true;
                queue.push_back(source);

                while let Some(u) = queue.pop_front() {
                    for v in 0..self.n {
                        if !visited[v] && residual[u][v] > 0.0 {
                            visited[v] = true;
                            parent[v] = Some(u);
                            queue.push_back(v);
                        }
                    }
                }

                if !visited[sink] {
                    break;
                }

                // Find minimum residual capacity along path
                let mut path_flow = f32::MAX;
                let mut v = sink;
                while let Some(u) = parent[v] {
                    path_flow = path_flow.min(residual[u][v]);
                    v = u;
                }

                // Update residual capacities
                v = sink;
                while let Some(u) = parent[v] {
                    residual[u][v] -= path_flow;
                    residual[v][u] += path_flow;
                    v = u;
                }

                max_flow += path_flow;
            }

            max_flow
        }

        /// Compute global mincut (minimum over all pairs)
        pub fn compute_global_mincut(&self) -> f32 {
            if self.n <= 1 {
                return 0.0;
            }

            let mut min_cut = f32::MAX;

            // Use Stoer-Wagner-like approach: fix node 0 as source
            for sink in 1..self.n {
                let cut = self.compute_mincut(0, sink);
                min_cut = min_cut.min(cut);
            }

            min_cut
        }
    }
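`compute_mincut` above is an Edmonds-Karp style max-flow (BFS augmenting paths over a residual matrix); by the max-flow/min-cut theorem the flow it returns equals the s-t mincut. The mechanism can be sanity-checked on a tiny directed example with a known answer, independent of the benchmark types:

```rust
use std::collections::VecDeque;

/// BFS-based max-flow (Edmonds-Karp) on a dense capacity matrix.
/// The returned flow equals the s-t mincut value.
fn max_flow(mut residual: Vec<Vec<f32>>, s: usize, t: usize) -> f32 {
    let n = residual.len();
    let mut flow = 0.0;
    loop {
        // BFS for a shortest augmenting path in the residual graph.
        let mut parent = vec![None; n];
        let mut visited = vec![false; n];
        let mut queue = VecDeque::from([s]);
        visited[s] = true;
        while let Some(u) = queue.pop_front() {
            for v in 0..n {
                if !visited[v] && residual[u][v] > 0.0 {
                    visited[v] = true;
                    parent[v] = Some(u);
                    queue.push_back(v);
                }
            }
        }
        if !visited[t] {
            return flow; // no augmenting path left
        }
        // Bottleneck along the path, then update residual capacities.
        let mut path_flow = f32::MAX;
        let mut v = t;
        while let Some(u) = parent[v] {
            path_flow = path_flow.min(residual[u][v]);
            v = u;
        }
        v = t;
        while let Some(u) = parent[v] {
            residual[u][v] -= path_flow;
            residual[v][u] += path_flow;
            v = u;
        }
        flow += path_flow;
    }
}

fn main() {
    // 0 -> 1 (cap 3), 0 -> 2 (cap 2), 1 -> 3 (cap 2), 2 -> 3 (cap 3)
    let cap = vec![
        vec![0.0, 3.0, 2.0, 0.0],
        vec![0.0, 0.0, 0.0, 2.0],
        vec![0.0, 0.0, 0.0, 3.0],
        vec![0.0, 0.0, 0.0, 0.0],
    ];
    // Bottleneck paths give min(3,2) + min(2,3) = 4.
    assert_eq!(max_flow(cap, 0, 3), 4.0);
}
```

Note the benchmarked version runs this over the *contracted* graph, which is why the contraction factor sweep below matters: the dense `n x n` residual matrix makes cost grow quadratically in the number of super-nodes.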
    /// State machine for index integrity
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum IndexState {
        Uninitialized,
        Building,
        Ready,
        Updating,
        Corrupted,
        Recovering,
    }

    pub struct IndexStateMachine {
        pub state: IndexState,
        pub transition_count: usize,
        pub last_integrity_check: std::time::Instant,
        pub integrity_score: f32,
    }

    impl IndexStateMachine {
        pub fn new() -> Self {
            Self {
                state: IndexState::Uninitialized,
                transition_count: 0,
                last_integrity_check: std::time::Instant::now(),
                integrity_score: 1.0,
            }
        }

        pub fn can_transition(&self, to: IndexState) -> bool {
            match (self.state, to) {
                (IndexState::Uninitialized, IndexState::Building) => true,
                (IndexState::Building, IndexState::Ready) => true,
                (IndexState::Ready, IndexState::Updating) => true,
                (IndexState::Updating, IndexState::Ready) => true,
                (_, IndexState::Corrupted) => true,
                (IndexState::Corrupted, IndexState::Recovering) => true,
                (IndexState::Recovering, IndexState::Ready) => true,
                _ => false,
            }
        }

        pub fn transition(&mut self, to: IndexState) -> Result<(), &'static str> {
            if self.can_transition(to) {
                self.state = to;
                self.transition_count += 1;
                Ok(())
            } else {
                Err("Invalid state transition")
            }
        }
    }
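The transition table in `can_transition` is a small DFA: a single happy path `Uninitialized -> Building -> Ready <-> Updating`, with `Corrupted` reachable from anywhere and recoverable only via `Recovering`. A condensed, self-contained mirror of the same table (local names, not the benchmark's types):

```rust
/// Minimal mirror of the IndexState transition table above.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum State {
    Uninitialized,
    Building,
    Ready,
    Updating,
    Corrupted,
    Recovering,
}

fn can_transition(from: State, to: State) -> bool {
    use State::*;
    matches!(
        (from, to),
        (Uninitialized, Building)
            | (Building, Ready)
            | (Ready, Updating)
            | (Updating, Ready)
            | (_, Corrupted)
            | (Corrupted, Recovering)
            | (Recovering, Ready)
    )
}

fn main() {
    // The happy path is allowed...
    assert!(can_transition(State::Uninitialized, State::Building));
    assert!(can_transition(State::Building, State::Ready));
    // ...skipping Building is not, while any state may become Corrupted.
    assert!(!can_transition(State::Uninitialized, State::Ready));
    assert!(can_transition(State::Updating, State::Corrupted));
}
```

Since the check is one `match` over two `Copy` enums, the "State Machine Overhead" benchmarks below are really measuring whether a branch predictor-friendly comparison is visible at all next to the guarded operation.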
    /// Gating check for index operations
    pub struct GatingCheck {
        /// Minimum connectivity threshold
        pub min_connectivity: f32,
        /// Maximum allowed dead nodes
        pub max_dead_nodes_ratio: f32,
        /// Maximum layer imbalance
        pub max_layer_imbalance: f32,
    }

    impl GatingCheck {
        pub fn default() -> Self {
            Self {
                min_connectivity: 0.95,
                max_dead_nodes_ratio: 0.01,
                max_layer_imbalance: 2.0,
            }
        }

        /// Check if graph passes all gates
        pub fn check(&self, graph: &Graph) -> GatingResult {
            let connectivity = self.check_connectivity(graph);
            let dead_ratio = self.check_dead_nodes(graph);
            let layer_balance = self.check_layer_balance(graph);

            GatingResult {
                passed: connectivity >= self.min_connectivity
                    && dead_ratio <= self.max_dead_nodes_ratio
                    && layer_balance <= self.max_layer_imbalance,
                connectivity,
                dead_nodes_ratio: dead_ratio,
                layer_imbalance: layer_balance,
            }
        }

        fn check_connectivity(&self, graph: &Graph) -> f32 {
            if graph.len() <= 1 {
                return 1.0;
            }

            // BFS from first node
            let start = *graph.nodes.keys().next().unwrap();
            let mut visited = HashSet::new();
            let mut queue = VecDeque::new();

            visited.insert(start);
            queue.push_back(start);

            while let Some(node) = queue.pop_front() {
                if let Some(n) = graph.nodes.get(&node) {
                    for &neighbor in &n.neighbors {
                        if !visited.contains(&neighbor) && graph.nodes.contains_key(&neighbor) {
                            visited.insert(neighbor);
                            queue.push_back(neighbor);
                        }
                    }
                }
            }

            visited.len() as f32 / graph.len() as f32
        }

        fn check_dead_nodes(&self, graph: &Graph) -> f32 {
            let dead_count = graph
                .nodes
                .values()
                .filter(|n| n.neighbors.is_empty())
                .count();

            dead_count as f32 / graph.len() as f32
        }

        fn check_layer_balance(&self, graph: &Graph) -> f32 {
            if graph.max_layer == 0 {
                return 1.0;
            }

            let mut layer_counts = vec![0usize; graph.max_layer + 1];
            for node in graph.nodes.values() {
                layer_counts[node.layer] += 1;
            }

            let max_count = layer_counts.iter().max().copied().unwrap_or(1) as f32;
            let min_count = layer_counts
                .iter()
                .filter(|&&c| c > 0)
                .min()
                .copied()
                .unwrap_or(1) as f32;

            max_count / min_count
        }
    }

    #[derive(Debug)]
    pub struct GatingResult {
        pub passed: bool,
        pub connectivity: f32,
        pub dead_nodes_ratio: f32,
        pub layer_imbalance: f32,
    }
}
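`check_connectivity` above reports the fraction of nodes reachable by BFS from an arbitrary start node; 1.0 means the graph has a single (weakly) connected component. A condensed standalone version over a plain adjacency map (illustrative, not the module's API):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Fraction of nodes reachable from an arbitrary start node,
/// as used by a connectivity gate (1.0 means fully connected).
fn connectivity_ratio(adj: &HashMap<u64, Vec<u64>>) -> f32 {
    let Some(&start) = adj.keys().next() else { return 1.0 };
    let mut visited = HashSet::from([start]);
    let mut queue = VecDeque::from([start]);
    while let Some(u) = queue.pop_front() {
        for &v in adj.get(&u).into_iter().flatten() {
            if adj.contains_key(&v) && visited.insert(v) {
                queue.push_back(v);
            }
        }
    }
    visited.len() as f32 / adj.len() as f32
}

fn main() {
    let mut adj: HashMap<u64, Vec<u64>> = HashMap::new();
    adj.insert(0, vec![1]);
    adj.insert(1, vec![0]);
    adj.insert(2, vec![3]);
    adj.insert(3, vec![2]);
    // Two equal components: whichever one BFS starts in, only half the
    // graph is reachable, so a 0.95 connectivity gate would fail.
    assert_eq!(connectivity_ratio(&adj), 0.5);
}
```

One caveat the benchmark inherits: starting from `keys().next()` measures the component containing an arbitrary node, so on a graph with one huge and one tiny component the reported ratio depends on which component the start falls in.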
use graph::{ContractedGraph, GatingCheck, Graph, IndexState, IndexStateMachine, MincutComputer};

// ============================================================================
// Test Data Generation
// ============================================================================

fn generate_random_graph(n: usize, avg_neighbors: usize, max_layer: usize, seed: u64) -> Graph {
    let mut rng = ChaCha8Rng::seed_from_u64(seed);
    let mut graph = Graph::new();

    // Add nodes with random layers
    for id in 0..n {
        let layer = if id == 0 {
            max_layer
        } else {
            let ml = 1.0 / (16.0_f64).ln();
            let r: f64 = rng.gen();
            ((-r.ln() * ml).floor() as usize).min(max_layer)
        };
        graph.add_node(id as u64, layer);
    }

    // Add random edges (maintaining HNSW-like structure)
    for id in 0..n {
        let num_neighbors = rng.gen_range(1..=avg_neighbors * 2);
        for _ in 0..num_neighbors {
            let neighbor = rng.gen_range(0..n) as u64;
            if neighbor != id as u64 {
                graph.add_edge(id as u64, neighbor);
            }
        }
    }

    graph
}
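`generate_random_graph` draws layers from the usual HNSW geometric distribution: `layer = floor(-ln(r) * mL)` with `mL = 1/ln(M)` and `M = 16` here, so each successive layer is roughly 16x rarer than the one below. The formula in isolation, with a few hand-checkable inputs:

```rust
/// HNSW-style layer assignment: floor(-ln(r) * mL), mL = 1 / ln(M), M = 16.
fn layer_for(r: f64) -> usize {
    let ml = 1.0 / (16.0f64).ln();
    (-r.ln() * ml).floor() as usize
}

fn main() {
    // Likely draws (large r) land on layer 0; only exponentially
    // rare small r values reach higher layers.
    assert_eq!(layer_for(0.5), 0);
    assert_eq!(layer_for(0.01), 1);
    assert_eq!(layer_for(0.0001), 3);
}
```

Equivalently, `layer_for(r)` is `floor(log_16(1/r))`, which is why pinning node 0 to `max_layer` in the generator guarantees at least one entry point at the top layer.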
|
||||
|
||||
fn generate_connected_graph(n: usize, avg_neighbors: usize, seed: u64) -> Graph {
|
||||
let mut rng = ChaCha8Rng::seed_from_u64(seed);
|
||||
let mut graph = Graph::new();
|
||||
|
||||
// Add nodes
|
||||
for id in 0..n {
|
||||
let layer = if id == 0 { 5 } else { rng.gen_range(0..=5) };
|
||||
graph.add_node(id as u64, layer);
|
||||
}
|
||||
|
||||
// Ensure connectivity: chain all nodes
|
||||
for id in 1..n {
|
||||
graph.add_edge(id as u64, (id - 1) as u64);
|
||||
graph.add_edge((id - 1) as u64, id as u64);
|
||||
}
|
||||
|
||||
// Add random extra edges
|
||||
for id in 0..n {
|
||||
let num_extra = rng.gen_range(0..avg_neighbors);
|
||||
for _ in 0..num_extra {
|
||||
let neighbor = rng.gen_range(0..n) as u64;
|
||||
if neighbor != id as u64 {
|
||||
graph.add_edge(id as u64, neighbor);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
graph
|
||||
}
|
||||
|
||||
// ============================================================================
|
||||
// Contracted Graph Benchmarks
|
||||
// ============================================================================
|
||||
|
||||
fn bench_contracted_graph_build(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Contracted Graph Build");
|
||||
group.sample_size(10);
|
||||
|
||||
for &n in [1_000, 10_000, 100_000].iter() {
|
||||
let graph = generate_connected_graph(n, 16, 42);
|
||||
|
||||
for &factor in [10, 50, 100, 500].iter() {
|
||||
if factor > n {
|
||||
continue;
|
||||
}
|
||||
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new(format!("n{}_factor{}", n, factor), n),
|
||||
&(&graph, factor),
|
||||
|bench, (g, f)| bench.iter(|| black_box(ContractedGraph::build_from_graph(g, *f))),
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
fn bench_contracted_graph_memory(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Contracted Graph Memory");
|
||||
group.sample_size(10);
|
||||
|
||||
for &n in [10_000, 100_000].iter() {
|
||||
let graph = generate_connected_graph(n, 16, 42);
|
||||
|
||||
for &factor in [10, 50, 100].iter() {
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new(format!("n{}_factor{}", n, factor), n),
|
||||
&(&graph, factor),
|
||||
|bench, (g, f)| {
|
||||
bench.iter(|| {
|
||||
let contracted = ContractedGraph::build_from_graph(g, *f);
|
||||
|
||||
// Calculate memory usage
|
||||
let super_node_mem = contracted
|
||||
.super_nodes
|
||||
.iter()
|
||||
.map(|sn| sn.original_nodes.len() * 8)
|
||||
.sum::<usize>();
|
||||
let edge_mem = contracted.super_edges.len() * 20; // (usize, usize, f32)
|
||||
let mapping_mem = contracted.node_mapping.len() * 16;
|
||||
|
||||
black_box(super_node_mem + edge_mem + mapping_mem)
|
||||
})
|
||||
},
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
// ============================================================================
|
||||
// Mincut Computation Benchmarks
|
||||
// ============================================================================
|
||||
|
||||
fn bench_mincut_compute(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Mincut Computation");
|
||||
group.sample_size(10);
|
||||
|
||||
for &n in [1_000, 5_000, 10_000].iter() {
|
||||
let graph = generate_connected_graph(n, 16, 42);
|
||||
let contracted = ContractedGraph::build_from_graph(&graph, 50);
|
||||
let mincut_computer = MincutComputer::from_contracted_graph(&contracted);
|
||||
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("single_pair", n),
|
||||
&mincut_computer,
|
||||
|bench, mc| bench.iter(|| black_box(mc.compute_mincut(0, mc.n - 1))),
|
||||
);
|
||||
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("global", n),
|
||||
&mincut_computer,
|
||||
|bench, mc| bench.iter(|| black_box(mc.compute_global_mincut())),
|
||||
);
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
fn bench_mincut_contraction_factors(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("Mincut vs Contraction Factor");
|
||||
group.sample_size(10);
|
||||
|
||||
let n = 10_000;
|
||||
let graph = generate_connected_graph(n, 16, 42);
|
||||
|
||||
for &factor in [10, 25, 50, 100, 200].iter() {
|
||||
let contracted = ContractedGraph::build_from_graph(&graph, factor);
|
||||
let mincut_computer = MincutComputer::from_contracted_graph(&contracted);
|
||||
|
||||
group.bench_with_input(
|
||||
BenchmarkId::from_parameter(factor),
|
||||
&mincut_computer,
|
||||
|bench, mc| bench.iter(|| black_box(mc.compute_global_mincut())),
|
||||
);
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
// ============================================================================
// State Transition Benchmarks
// ============================================================================

fn bench_state_transitions(c: &mut Criterion) {
    let mut group = c.benchmark_group("State Transitions");

    // Single transition
    group.bench_function("single_transition", |bench| {
        bench.iter(|| {
            let mut sm = IndexStateMachine::new();
            black_box(sm.transition(IndexState::Building))
        })
    });

    // Full lifecycle
    group.bench_function("full_lifecycle", |bench| {
        bench.iter(|| {
            let mut sm = IndexStateMachine::new();
            sm.transition(IndexState::Building).ok();
            sm.transition(IndexState::Ready).ok();
            sm.transition(IndexState::Updating).ok();
            sm.transition(IndexState::Ready).ok();
            black_box(sm.state)
        })
    });

    // Transition check only (no mutation)
    group.bench_function("transition_check", |bench| {
        let sm = IndexStateMachine::new();
        bench.iter(|| black_box(sm.can_transition(IndexState::Building)))
    });

    // Many transitions
    group.bench_function("1000_transitions", |bench| {
        bench.iter(|| {
            let mut sm = IndexStateMachine::new();
            sm.transition(IndexState::Building).ok();
            sm.transition(IndexState::Ready).ok();

            for _ in 0..500 {
                sm.transition(IndexState::Updating).ok();
                sm.transition(IndexState::Ready).ok();
            }

            black_box(sm.transition_count)
        })
    });

    group.finish();
}

fn bench_state_machine_overhead(c: &mut Criterion) {
    let mut group = c.benchmark_group("State Machine Overhead");

    // Measure overhead of state checking before operations
    let graph = generate_connected_graph(10_000, 16, 42);

    group.bench_function("with_state_check", |bench| {
        let mut sm = IndexStateMachine::new();
        sm.transition(IndexState::Building).ok();
        sm.transition(IndexState::Ready).ok();

        bench.iter(|| {
            // Simulate operation with state check
            if sm.state == IndexState::Ready {
                // Perform "operation"
                let count = graph.nodes.len();
                black_box(count)
            } else {
                black_box(0)
            }
        })
    });

    group.bench_function("without_state_check", |bench| {
        bench.iter(|| {
            // Perform operation directly
            let count = graph.nodes.len();
            black_box(count)
        })
    });

    group.finish();
}

// ============================================================================
// Gating Check Benchmarks
// ============================================================================

fn bench_gating_check(c: &mut Criterion) {
    let mut group = c.benchmark_group("Gating Check");

    for &n in [1_000, 10_000, 100_000].iter() {
        let graph = generate_connected_graph(n, 16, 42);
        let gating = GatingCheck::default();

        group.bench_with_input(
            BenchmarkId::new("full_check", n),
            &(&graph, &gating),
            |bench, (g, gate)| bench.iter(|| black_box(gate.check(g))),
        );
    }

    group.finish();
}

fn bench_connectivity_check(c: &mut Criterion) {
    let mut group = c.benchmark_group("Connectivity Check");

    for &n in [1_000, 10_000, 100_000].iter() {
        // Well-connected graph
        let connected_graph = generate_connected_graph(n, 16, 42);

        // Sparse graph (may have disconnected components)
        let sparse_graph = generate_random_graph(n, 2, 5, 42);

        let gating = GatingCheck::default();

        group.bench_with_input(
            BenchmarkId::new("connected", n),
            &(&connected_graph, &gating),
            |bench, (g, gate)| bench.iter(|| black_box(gate.check(g).connectivity)),
        );

        group.bench_with_input(
            BenchmarkId::new("sparse", n),
            &(&sparse_graph, &gating),
            |bench, (g, gate)| bench.iter(|| black_box(gate.check(g).connectivity)),
        );
    }

    group.finish();
}

fn bench_dead_node_detection(c: &mut Criterion) {
    let mut group = c.benchmark_group("Dead Node Detection");

    for &n in [10_000, 100_000].iter() {
        let graph = generate_connected_graph(n, 16, 42);
        let gating = GatingCheck::default();

        group.bench_with_input(
            BenchmarkId::from_parameter(n),
            &(&graph, &gating),
            |bench, (g, gate)| bench.iter(|| black_box(gate.check(g).dead_nodes_ratio)),
        );
    }

    group.finish();
}

fn bench_layer_balance_check(c: &mut Criterion) {
    let mut group = c.benchmark_group("Layer Balance Check");

    for &n in [10_000, 100_000].iter() {
        let graph = generate_random_graph(n, 16, 10, 42);
        let gating = GatingCheck::default();

        group.bench_with_input(
            BenchmarkId::from_parameter(n),
            &(&graph, &gating),
            |bench, (g, gate)| bench.iter(|| black_box(gate.check(g).layer_imbalance)),
        );
    }

    group.finish();
}

// ============================================================================
// Parallel Integrity Checks
// ============================================================================

fn bench_parallel_integrity(c: &mut Criterion) {
    let mut group = c.benchmark_group("Parallel Integrity Check");
    group.sample_size(10);

    let n = 100_000;
    let graph = generate_connected_graph(n, 16, 42);
    let gating = GatingCheck::default();

    // Sequential checks
    group.bench_function("sequential", |bench| {
        bench.iter(|| {
            let result = gating.check(&graph);
            black_box(result)
        })
    });

    // Parallel checks (connectivity, dead nodes, layer balance)
    group.bench_function("parallel", |bench| {
        bench.iter(|| {
            let (connectivity, (dead_ratio, layer_balance)) = rayon::join(
                || {
                    // Connectivity check
                    if graph.len() <= 1 {
                        return 1.0;
                    }
                    let start = *graph.nodes.keys().next().unwrap();
                    let mut visited = HashSet::new();
                    let mut queue = VecDeque::new();
                    visited.insert(start);
                    queue.push_back(start);
                    while let Some(node) = queue.pop_front() {
                        if let Some(n) = graph.nodes.get(&node) {
                            for &neighbor in &n.neighbors {
                                if !visited.contains(&neighbor)
                                    && graph.nodes.contains_key(&neighbor)
                                {
                                    visited.insert(neighbor);
                                    queue.push_back(neighbor);
                                }
                            }
                        }
                    }
                    visited.len() as f32 / graph.len() as f32
                },
                || {
                    rayon::join(
                        || {
                            // Dead nodes
                            let dead = graph
                                .nodes
                                .values()
                                .filter(|n| n.neighbors.is_empty())
                                .count();
                            dead as f32 / graph.len() as f32
                        },
                        || {
                            // Layer balance
                            let mut layer_counts = vec![0usize; graph.max_layer + 1];
                            for node in graph.nodes.values() {
                                layer_counts[node.layer] += 1;
                            }
                            let max_count = layer_counts.iter().max().copied().unwrap_or(1) as f32;
                            let min_count = layer_counts
                                .iter()
                                .filter(|&&c| c > 0)
                                .min()
                                .copied()
                                .unwrap_or(1) as f32;
                            max_count / min_count
                        },
                    )
                },
            );

            let passed = connectivity >= gating.min_connectivity
                && dead_ratio <= gating.max_dead_nodes_ratio
                && layer_balance <= gating.max_layer_imbalance;

            black_box(passed)
        })
    });

    group.finish();
}

// ============================================================================
// Complete Integrity Pipeline
// ============================================================================

fn bench_full_integrity_pipeline(c: &mut Criterion) {
    let mut group = c.benchmark_group("Full Integrity Pipeline");
    group.sample_size(10);

    for &n in [10_000, 50_000, 100_000].iter() {
        let graph = generate_connected_graph(n, 16, 42);
        let gating = GatingCheck::default();

        group.bench_with_input(BenchmarkId::from_parameter(n), &n, |bench, _| {
            bench.iter(|| {
                // 1. State check
                let mut sm = IndexStateMachine::new();
                sm.transition(IndexState::Building).ok();
                sm.transition(IndexState::Ready).ok();

                // 2. Gating check
                let gate_result = gating.check(&graph);

                // 3. If passed, build contracted graph
                if gate_result.passed {
                    let contracted = ContractedGraph::build_from_graph(&graph, 100);

                    // 4. Compute mincut
                    let mincut_computer = MincutComputer::from_contracted_graph(&contracted);
                    let mincut = mincut_computer.compute_global_mincut();

                    black_box((gate_result, mincut))
                } else {
                    black_box((gate_result, 0.0))
                }
            })
        });
    }

    group.finish();
}

criterion_group!(
    benches,
    // Contracted Graph
    bench_contracted_graph_build,
    bench_contracted_graph_memory,
    // Mincut
    bench_mincut_compute,
    bench_mincut_contraction_factors,
    // State Transitions
    bench_state_transitions,
    bench_state_machine_overhead,
    // Gating Checks
    bench_gating_check,
    bench_connectivity_check,
    bench_dead_node_detection,
    bench_layer_balance_check,
    // Parallel Integrity
    bench_parallel_integrity,
    // Full Pipeline
    bench_full_integrity_pipeline,
);

criterion_main!(benches);
434
crates/ruvector-postgres/benches/quantization_bench.rs
Normal file
@@ -0,0 +1,434 @@
//! Comprehensive quantization benchmarks
//!
//! Compares exact vs quantized search with different quantization methods

use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use ruvector_postgres::distance::DistanceMetric;
use ruvector_postgres::types::{BinaryVec, ProductVec, RuVector, ScalarVec};

// ============================================================================
// Test Data Generation
// ============================================================================

fn generate_vectors(n: usize, dims: usize, seed: u64) -> Vec<Vec<f32>> {
    let mut rng = ChaCha8Rng::seed_from_u64(seed);
    (0..n)
        .map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
        .collect()
}

// ============================================================================
// Scalar Quantization (SQ8) Benchmarks
// ============================================================================

fn bench_sq8_quantization(c: &mut Criterion) {
    let mut group = c.benchmark_group("sq8_quantization");

    for dims in [128, 384, 768, 1536, 3072].iter() {
        let data: Vec<f32> = (0..*dims).map(|i| (i as f32) * 0.001).collect();

        group.bench_with_input(BenchmarkId::new("encode", dims), dims, |bench, _| {
            bench.iter(|| black_box(ScalarVec::from_f32(&data)));
        });

        let encoded = ScalarVec::from_f32(&data);
        group.bench_with_input(BenchmarkId::new("decode", dims), dims, |bench, _| {
            bench.iter(|| black_box(encoded.to_f32()));
        });
    }

    group.finish();
}

fn bench_sq8_distance(c: &mut Criterion) {
    let mut group = c.benchmark_group("sq8_distance");

    for dims in [128, 384, 768, 1536, 3072].iter() {
        let a_data: Vec<f32> = (0..*dims).map(|i| i as f32 * 0.1).collect();
        let b_data: Vec<f32> = (0..*dims).map(|i| (*dims - i) as f32 * 0.1).collect();

        let a_exact = RuVector::from_slice(&a_data);
        let b_exact = RuVector::from_slice(&b_data);

        let a_sq8 = ScalarVec::from_f32(&a_data);
        let b_sq8 = ScalarVec::from_f32(&b_data);

        group.bench_with_input(BenchmarkId::new("exact", dims), dims, |bench, _| {
            bench.iter(|| black_box(a_exact.dot(&b_exact)));
        });

        group.bench_with_input(BenchmarkId::new("quantized", dims), dims, |bench, _| {
            bench.iter(|| black_box(a_sq8.distance(&b_sq8)));
        });
    }

    group.finish();
}

fn bench_sq8_search(c: &mut Criterion) {
    let mut group = c.benchmark_group("sq8_search");

    for dims in [128, 768, 1536].iter() {
        let n = 10000;
        let vectors = generate_vectors(n, *dims, 42);
        let query = generate_vectors(1, *dims, 999)[0].clone();

        // Exact search
        let exact_vecs: Vec<RuVector> = vectors.iter().map(|v| RuVector::from_slice(v)).collect();
        let exact_query = RuVector::from_slice(&query);

        group.bench_with_input(BenchmarkId::new("exact", dims), dims, |bench, _| {
            bench.iter(|| {
                let mut distances: Vec<(usize, f32)> = exact_vecs
                    .iter()
                    .enumerate()
                    .map(|(id, vec)| {
                        let dist = exact_query.dot(vec);
                        (id, -dist) // Negative for max inner product
                    })
                    .collect();

                distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
                let top_k: Vec<_> = distances[..10].to_vec();
                black_box(top_k)
            });
        });

        // Quantized search
        let sq8_vecs: Vec<ScalarVec> = vectors.iter().map(|v| ScalarVec::from_f32(v)).collect();
        let sq8_query = ScalarVec::from_f32(&query);

        group.bench_with_input(BenchmarkId::new("quantized", dims), dims, |bench, _| {
            bench.iter(|| {
                let mut distances: Vec<(usize, f32)> = sq8_vecs
                    .iter()
                    .enumerate()
                    .map(|(id, vec)| (id, sq8_query.distance(vec)))
                    .collect();

                distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
                let top_k: Vec<_> = distances[..10].to_vec();
                black_box(top_k)
            });
        });
    }

    group.finish();
}

// ============================================================================
// Binary Quantization Benchmarks
// ============================================================================

fn bench_binary_quantization(c: &mut Criterion) {
    let mut group = c.benchmark_group("binary_quantization");

    for dims in [128, 512, 1024, 2048, 4096].iter() {
        let data: Vec<f32> = (0..*dims)
            .map(|i| if i % 2 == 0 { 1.0 } else { -1.0 })
            .collect();

        group.bench_with_input(BenchmarkId::new("encode", dims), dims, |bench, _| {
            bench.iter(|| black_box(BinaryVec::from_f32(&data)));
        });
    }

    group.finish();
}

fn bench_binary_hamming(c: &mut Criterion) {
    let mut group = c.benchmark_group("binary_hamming");

    for dims in [128, 512, 1024, 2048, 4096, 8192].iter() {
        let a_data: Vec<f32> = (0..*dims)
            .map(|i| if i % 2 == 0 { 1.0 } else { -1.0 })
            .collect();
        let b_data: Vec<f32> = (0..*dims)
            .map(|i| if i % 3 == 0 { 1.0 } else { -1.0 })
            .collect();

        let a = BinaryVec::from_f32(&a_data);
        let b = BinaryVec::from_f32(&b_data);

        group.bench_with_input(BenchmarkId::new("simd", dims), dims, |bench, _| {
            bench.iter(|| black_box(a.hamming_distance(&b)));
        });
    }

    group.finish();
}

fn bench_binary_search(c: &mut Criterion) {
    let mut group = c.benchmark_group("binary_search");

    for dims in [1024, 2048, 4096].iter() {
        let n = 100000;
        let vectors = generate_vectors(n, *dims, 42);
        let query = generate_vectors(1, *dims, 999)[0].clone();

        let binary_vecs: Vec<BinaryVec> = vectors.iter().map(|v| BinaryVec::from_f32(v)).collect();
        let binary_query = BinaryVec::from_f32(&query);

        group.bench_with_input(BenchmarkId::new("scan", dims), dims, |bench, _| {
            bench.iter(|| {
                let mut distances: Vec<(usize, u32)> = binary_vecs
                    .iter()
                    .enumerate()
                    .map(|(id, vec)| (id, binary_query.hamming_distance(vec)))
                    .collect();

                distances.sort_by_key(|k| k.1);
                let top_k: Vec<_> = distances[..10].to_vec();
                black_box(top_k)
            });
        });
    }

    group.finish();
}

// ============================================================================
// Product Quantization (PQ) Benchmarks
// ============================================================================

fn bench_pq_adc_distance(c: &mut Criterion) {
    let mut group = c.benchmark_group("pq_adc_distance");

    for m in [8u8, 16, 32, 48, 64].iter() {
        let k: usize = 256; // Number of centroids
        // Compute codes in usize: `(i * 7) % (k as u8)` would be `% 0` (a
        // panic), and `i * 7` overflows u8 for the larger values of m.
        let codes: Vec<u8> = (0..*m as usize).map(|i| ((i * 7) % k) as u8).collect();
        let pq = ProductVec::new((*m as usize * 32) as u16, *m, 255, codes);

        // Create distance table
        let mut table = Vec::with_capacity(*m as usize * k);
        for i in 0..(*m as usize * k) {
            table.push((i % 100) as f32 * 0.01);
        }

        group.bench_with_input(BenchmarkId::new("simd", m), m, |bench, _| {
            bench.iter(|| black_box(pq.adc_distance_simd(&table)));
        });

        group.bench_with_input(BenchmarkId::new("flat", m), m, |bench, _| {
            bench.iter(|| black_box(pq.adc_distance_flat(&table)));
        });
    }

    group.finish();
}

// ============================================================================
// Compression Ratio Benchmarks
// ============================================================================

fn bench_compression_comparison(c: &mut Criterion) {
    let mut group = c.benchmark_group("compression_ratio");

    for dims in [384, 768, 1536, 3072].iter() {
        let data: Vec<f32> = (0..*dims).map(|i| (i as f32) * 0.001).collect();
        let original_size = dims * std::mem::size_of::<f32>();

        group.bench_with_input(BenchmarkId::new("binary", dims), dims, |bench, _| {
            bench.iter(|| {
                let binary = black_box(BinaryVec::from_f32(&data));
                let compressed = binary.memory_size();
                let ratio = original_size as f32 / compressed as f32;
                black_box(ratio)
            });
        });

        group.bench_with_input(BenchmarkId::new("scalar", dims), dims, |bench, _| {
            bench.iter(|| {
                let scalar = black_box(ScalarVec::from_f32(&data));
                let compressed = scalar.memory_size();
                let ratio = original_size as f32 / compressed as f32;
                black_box(ratio)
            });
        });

        group.bench_with_input(BenchmarkId::new("product", dims), dims, |bench, _| {
            bench.iter(|| {
                let m = (dims / 32).min(64);
                let pq = black_box(ProductVec::new(*dims as u16, m as u8, 255, vec![0; m]));
                let compressed = pq.memory_size();
                let ratio = original_size as f32 / compressed as f32;
                black_box(ratio)
            });
        });
    }

    group.finish();
}

// ============================================================================
// Speedup vs Accuracy Trade-off
// ============================================================================

fn bench_quantization_tradeoff(c: &mut Criterion) {
    let mut group = c.benchmark_group("quantization_tradeoff");
    group.sample_size(10);

    let dims = 768;
    let n = 10000;
    let num_queries = 100;

    let vectors = generate_vectors(n, dims, 42);
    let queries = generate_vectors(num_queries, dims, 999);

    // Compute ground truth
    let exact_vecs: Vec<RuVector> = vectors.iter().map(|v| RuVector::from_slice(v)).collect();

    let ground_truth: Vec<Vec<usize>> = queries
        .iter()
        .map(|query| {
            let query_vec = RuVector::from_slice(query);
            let mut distances: Vec<(usize, f32)> = exact_vecs
                .iter()
                .enumerate()
                .map(|(id, vec)| {
                    let diff = query_vec.sub(vec);
                    let dist = diff.norm();
                    (id, dist)
                })
                .collect();

            distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
            distances.iter().take(10).map(|(id, _)| *id).collect()
        })
        .collect();

    // Benchmark SQ8
    let sq8_vecs: Vec<ScalarVec> = vectors.iter().map(|v| ScalarVec::from_f32(v)).collect();

    group.bench_function("sq8_speedup", |bench| {
        bench.iter(|| {
            for (i, query) in queries.iter().enumerate() {
                let sq8_query = ScalarVec::from_f32(query);
                let mut distances: Vec<(usize, f32)> = sq8_vecs
                    .iter()
                    .enumerate()
                    .map(|(id, vec)| (id, sq8_query.distance(vec)))
                    .collect();

                distances.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
                let results: Vec<usize> = distances.iter().take(10).map(|(id, _)| *id).collect();

                // Compute recall
                let hits = results
                    .iter()
                    .filter(|id| ground_truth[i].contains(id))
                    .count();

                black_box(hits as f32 / 10.0);
            }
        });
    });

    // Benchmark Binary
    let binary_vecs: Vec<BinaryVec> = vectors.iter().map(|v| BinaryVec::from_f32(v)).collect();

    group.bench_function("binary_speedup", |bench| {
        bench.iter(|| {
            for (i, query) in queries.iter().enumerate() {
                let binary_query = BinaryVec::from_f32(query);
                let mut distances: Vec<(usize, u32)> = binary_vecs
                    .iter()
                    .enumerate()
                    .map(|(id, vec)| (id, binary_query.hamming_distance(vec)))
                    .collect();

                distances.sort_by_key(|k| k.1);
                let results: Vec<usize> = distances.iter().take(10).map(|(id, _)| *id).collect();

                // Compute recall
                let hits = results
                    .iter()
                    .filter(|id| ground_truth[i].contains(id))
                    .count();

                black_box(hits as f32 / 10.0);
            }
        });
    });

    group.finish();
}

// ============================================================================
// Throughput Comparison
// ============================================================================

fn bench_quantization_throughput(c: &mut Criterion) {
    let mut group = c.benchmark_group("quantization_throughput");

    let dims = 1536;
    let n = 100000;

    let vectors = generate_vectors(n, dims, 42);
    let query = generate_vectors(1, dims, 999)[0].clone();

    // Exact
    let exact_vecs: Vec<RuVector> = vectors.iter().map(|v| RuVector::from_slice(v)).collect();
    let exact_query = RuVector::from_slice(&query);

    group.bench_function("exact_scan", |bench| {
        bench.iter(|| {
            let mut total = 0.0f32;
            for vec in &exact_vecs {
                total += exact_query.dot(vec);
            }
            black_box(total)
        });
    });

    // SQ8
    let sq8_vecs: Vec<ScalarVec> = vectors.iter().map(|v| ScalarVec::from_f32(v)).collect();
    let sq8_query = ScalarVec::from_f32(&query);

    group.bench_function("sq8_scan", |bench| {
        bench.iter(|| {
            let mut total = 0.0f32;
            for vec in &sq8_vecs {
                total += sq8_query.distance(vec);
            }
            black_box(total)
        });
    });

    // Binary
    let binary_vecs: Vec<BinaryVec> = vectors.iter().map(|v| BinaryVec::from_f32(v)).collect();
    let binary_query = BinaryVec::from_f32(&query);

    group.bench_function("binary_scan", |bench| {
        bench.iter(|| {
            let mut total = 0u64;
            for vec in &binary_vecs {
                total += binary_query.hamming_distance(vec) as u64;
            }
            black_box(total)
        });
    });

    group.finish();
}

criterion_group!(
    benches,
    bench_sq8_quantization,
    bench_sq8_distance,
    bench_sq8_search,
    bench_binary_quantization,
    bench_binary_hamming,
    bench_binary_search,
    bench_pq_adc_distance,
    bench_compression_comparison,
    bench_quantization_tradeoff,
    bench_quantization_throughput,
);

criterion_main!(benches);
217
crates/ruvector-postgres/benches/quantized_distance_bench.rs
Normal file
@@ -0,0 +1,217 @@
//! Benchmarks for quantized vector distance calculations
//!
//! Compares scalar vs SIMD implementations for all quantized types

use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion};
use ruvector_postgres::types::{BinaryVec, ProductVec, ScalarVec};

// ============================================================================
// BinaryVec Benchmarks
// ============================================================================

fn bench_binaryvec_hamming(c: &mut Criterion) {
    let mut group = c.benchmark_group("binaryvec_hamming");

    for dims in [128, 512, 1024, 2048, 4096].iter() {
        let a_data: Vec<f32> = (0..*dims)
            .map(|i| if i % 2 == 0 { 1.0 } else { -1.0 })
            .collect();
        let b_data: Vec<f32> = (0..*dims)
            .map(|i| if i % 3 == 0 { 1.0 } else { -1.0 })
            .collect();

        let a = BinaryVec::from_f32(&a_data);
        let b = BinaryVec::from_f32(&b_data);

        group.bench_with_input(BenchmarkId::new("simd", dims), dims, |bencher, _| {
            bencher.iter(|| black_box(a.hamming_distance(&b)));
        });
    }

    group.finish();
}

fn bench_binaryvec_quantization(c: &mut Criterion) {
    let mut group = c.benchmark_group("binaryvec_quantization");

    for dims in [128, 512, 1024, 2048, 4096].iter() {
        let data: Vec<f32> = (0..*dims).map(|i| (i as f32) * 0.01).collect();

        group.bench_with_input(BenchmarkId::new("from_f32", dims), dims, |bencher, _| {
            bencher.iter(|| black_box(BinaryVec::from_f32(&data)));
        });
    }

    group.finish();
}

// ============================================================================
// ScalarVec Benchmarks
// ============================================================================

fn bench_scalarvec_distance(c: &mut Criterion) {
    let mut group = c.benchmark_group("scalarvec_distance");

    for dims in [128, 512, 1024, 2048, 4096].iter() {
        let a_data: Vec<f32> = (0..*dims).map(|i| i as f32 * 0.1).collect();
        let b_data: Vec<f32> = (0..*dims).map(|i| (*dims - i) as f32 * 0.1).collect();

        let a = ScalarVec::from_f32(&a_data);
        let b = ScalarVec::from_f32(&b_data);

        group.bench_with_input(BenchmarkId::new("simd", dims), dims, |bencher, _| {
            bencher.iter(|| black_box(a.distance(&b)));
        });
    }

    group.finish();
}

fn bench_scalarvec_quantization(c: &mut Criterion) {
    let mut group = c.benchmark_group("scalarvec_quantization");

    for dims in [128, 512, 1024, 2048, 4096].iter() {
        let data: Vec<f32> = (0..*dims).map(|i| (i as f32) * 0.01).collect();

        group.bench_with_input(BenchmarkId::new("from_f32", dims), dims, |bencher, _| {
            bencher.iter(|| black_box(ScalarVec::from_f32(&data)));
        });

        let scalar = ScalarVec::from_f32(&data);
        group.bench_with_input(BenchmarkId::new("to_f32", dims), dims, |bencher, _| {
            bencher.iter(|| black_box(scalar.to_f32()));
        });
    }

    group.finish();
}

// ============================================================================
// ProductVec Benchmarks
// ============================================================================

fn bench_productvec_adc_distance(c: &mut Criterion) {
    let mut group = c.benchmark_group("productvec_adc_distance");

    for m in [8u8, 16, 32, 48, 64].iter() {
        let k: usize = 256;
        // Compute codes in usize: `(i * 7) % (k as u8)` would be `% 0` (a
        // panic), and `i * 7` overflows u8 for the larger values of m.
        let codes: Vec<u8> = (0..*m as usize).map(|i| ((i * 7) % k) as u8).collect();
        let pq = ProductVec::new((*m as usize * 32) as u16, *m, 255, codes);

        // Create distance table
        let mut table = Vec::with_capacity(*m as usize * k);
        for i in 0..(*m as usize * k) {
            table.push((i % 100) as f32 * 0.01);
        }

        group.bench_with_input(BenchmarkId::new("simd", m), m, |bencher, _| {
            bencher.iter(|| black_box(pq.adc_distance_simd(&table)));
        });

        group.bench_with_input(BenchmarkId::new("flat", m), m, |bencher, _| {
            bencher.iter(|| black_box(pq.adc_distance_flat(&table)));
        });
    }

    group.finish();
}

// ============================================================================
// Compression Benchmarks
// ============================================================================

fn bench_compression_ratios(c: &mut Criterion) {
    let mut group = c.benchmark_group("compression");

    let dims = 1536; // OpenAI embedding size
    let data: Vec<f32> = (0..dims).map(|i| (i as f32) * 0.001).collect();

    // Original size
    let original_size = dims * std::mem::size_of::<f32>();

    group.bench_function("binary_quantize", |bencher| {
        bencher.iter(|| {
            let binary = black_box(BinaryVec::from_f32(&data));
            let ratio = original_size as f32 / binary.memory_size() as f32;
            black_box(ratio)
        });
    });

    group.bench_function("scalar_quantize", |bencher| {
        bencher.iter(|| {
            let scalar = black_box(ScalarVec::from_f32(&data));
            let ratio = original_size as f32 / scalar.memory_size() as f32;
            black_box(ratio)
        });
    });

    group.bench_function("product_quantize", |bencher| {
        bencher.iter(|| {
            let pq = black_box(ProductVec::new(dims as u16, 48, 255, vec![0; 48]));
            let ratio = original_size as f32 / pq.memory_size() as f32;
            black_box(ratio)
        });
    });

    group.finish();
}

// ============================================================================
// Throughput Benchmarks
// ============================================================================

fn bench_throughput_comparison(c: &mut Criterion) {
    let mut group = c.benchmark_group("throughput");

    let dims = 1024;
    let num_vectors = 1000;

    // Generate test data
    let vectors: Vec<Vec<f32>> = (0..num_vectors)
        .map(|i| (0..dims).map(|j| ((i * dims + j) as f32) * 0.001).collect())
        .collect();

    let query = vectors[0].clone();

    // Quantize all vectors
    let binary_vecs: Vec<BinaryVec> = vectors.iter().map(|v| BinaryVec::from_f32(v)).collect();
    let scalar_vecs: Vec<ScalarVec> = vectors.iter().map(|v| ScalarVec::from_f32(v)).collect();

    let query_binary = BinaryVec::from_f32(&query);
    let query_scalar = ScalarVec::from_f32(&query);

    group.bench_function("binary_scan", |bencher| {
        bencher.iter(|| {
            let mut total_dist = 0u32;
            for v in &binary_vecs {
                total_dist += black_box(query_binary.hamming_distance(v));
            }
            black_box(total_dist)
        });
    });

    group.bench_function("scalar_scan", |bencher| {
        bencher.iter(|| {
            let mut total_dist = 0.0f32;
            for v in &scalar_vecs {
                total_dist += black_box(query_scalar.distance(v));
            }
            black_box(total_dist)
        });
    });

    group.finish();
}

criterion_group!(
    benches,
    bench_binaryvec_hamming,
    bench_binaryvec_quantization,
    bench_scalarvec_distance,
    bench_scalarvec_quantization,
    bench_productvec_adc_distance,
    bench_compression_ratios,
    bench_throughput_comparison,
);

criterion_main!(benches);
173
crates/ruvector-postgres/benches/scripts/run_benchmarks.sh
Executable file
@@ -0,0 +1,173 @@
#!/bin/bash
# Comprehensive benchmark runner script

set -e

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Configuration
BENCHMARK_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
RESULTS_DIR="${BENCHMARK_DIR}/results"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)

# Create results directory
mkdir -p "${RESULTS_DIR}"

echo -e "${BLUE}==================================================${NC}"
echo -e "${BLUE} RuVector Comprehensive Benchmark Suite${NC}"
echo -e "${BLUE}==================================================${NC}"
echo ""

# ============================================================================
# Rust Benchmarks
# ============================================================================

echo -e "${GREEN}Running Rust benchmarks...${NC}"
echo ""

# Distance benchmarks
echo -e "${YELLOW}1. Distance function benchmarks${NC}"
cargo bench --bench distance_bench -- --output-format bencher | tee "${RESULTS_DIR}/distance_${TIMESTAMP}.txt"

# Index benchmarks
echo -e "${YELLOW}2. HNSW index benchmarks${NC}"
cargo bench --bench index_bench -- --output-format bencher | tee "${RESULTS_DIR}/index_${TIMESTAMP}.txt"

# Quantization benchmarks
echo -e "${YELLOW}3. Quantization benchmarks${NC}"
cargo bench --bench quantization_bench -- --output-format bencher | tee "${RESULTS_DIR}/quantization_${TIMESTAMP}.txt"

# Quantized distance benchmarks
echo -e "${YELLOW}4. Quantized distance benchmarks${NC}"
cargo bench --bench quantized_distance_bench -- --output-format bencher | tee "${RESULTS_DIR}/quantized_distance_${TIMESTAMP}.txt"

# ============================================================================
# SQL Benchmarks (if PostgreSQL is available)
# ============================================================================

if command -v psql &> /dev/null; then
    echo ""
    echo -e "${GREEN}Running SQL benchmarks...${NC}"
    echo ""

    # Check if test database exists
    if psql -lqt | cut -d \| -f 1 | grep -qw ruvector_bench; then
        echo -e "${YELLOW}5. Quick SQL benchmark${NC}"
        psql -d ruvector_bench -f "${BENCHMARK_DIR}/sql/quick_benchmark.sql" | tee "${RESULTS_DIR}/sql_quick_${TIMESTAMP}.txt"

        echo -e "${YELLOW}6. Full workload benchmark${NC}"
        echo -e "${RED}Warning: This may take several minutes...${NC}"
        psql -d ruvector_bench -f "${BENCHMARK_DIR}/sql/benchmark_workload.sql" | tee "${RESULTS_DIR}/sql_workload_${TIMESTAMP}.txt"
    else
        echo -e "${YELLOW}Skipping SQL benchmarks (database 'ruvector_bench' not found)${NC}"
        echo -e "${YELLOW}To run SQL benchmarks:${NC}"
        echo -e "  createdb ruvector_bench"
        echo -e "  psql -d ruvector_bench -c 'CREATE EXTENSION ruvector;'"
        echo -e "  psql -d ruvector_bench -c 'CREATE EXTENSION pgvector;'"
    fi
else
    echo -e "${YELLOW}Skipping SQL benchmarks (psql not found)${NC}"
fi

# ============================================================================
# Generate Summary Report
# ============================================================================

echo ""
echo -e "${GREEN}Generating summary report...${NC}"

cat > "${RESULTS_DIR}/summary_${TIMESTAMP}.md" <<EOF
# RuVector Benchmark Results

**Date:** $(date)
**Platform:** $(uname -s) $(uname -m)
**Rust Version:** $(rustc --version)

## Benchmark Files

- Distance functions: \`distance_${TIMESTAMP}.txt\`
- HNSW index: \`index_${TIMESTAMP}.txt\`
- Quantization: \`quantization_${TIMESTAMP}.txt\`
- Quantized distance: \`quantized_distance_${TIMESTAMP}.txt\`

## SQL Benchmarks

EOF

if [ -f "${RESULTS_DIR}/sql_quick_${TIMESTAMP}.txt" ]; then
    cat >> "${RESULTS_DIR}/summary_${TIMESTAMP}.md" <<EOF
- Quick benchmark: \`sql_quick_${TIMESTAMP}.txt\`
- Full workload: \`sql_workload_${TIMESTAMP}.txt\`

EOF
else
    cat >> "${RESULTS_DIR}/summary_${TIMESTAMP}.md" <<EOF
SQL benchmarks were not run. See setup instructions above.

EOF
fi

cat >> "${RESULTS_DIR}/summary_${TIMESTAMP}.md" <<EOF
## System Information

\`\`\`
$(uname -a)
\`\`\`

### CPU Information

\`\`\`
$(lscpu 2>/dev/null || sysctl -a | grep machdep.cpu || echo "CPU info not available")
\`\`\`

### Memory Information

\`\`\`
$(free -h 2>/dev/null || vm_stat || echo "Memory info not available")
\`\`\`

## Running the Benchmarks

To reproduce these results:

\`\`\`bash
cd crates/ruvector-postgres
bash benches/scripts/run_benchmarks.sh
\`\`\`

## Comparing with Previous Results

\`\`\`bash
# Install cargo-criterion for better comparison
cargo install cargo-criterion

# Run with baseline
cargo criterion --bench distance_bench --baseline main
\`\`\`
EOF

echo ""
echo -e "${GREEN}==================================================${NC}"
echo -e "${GREEN} Benchmark Complete!${NC}"
echo -e "${GREEN}==================================================${NC}"
echo ""
echo -e "Results saved to: ${BLUE}${RESULTS_DIR}${NC}"
echo -e "Summary report: ${BLUE}${RESULTS_DIR}/summary_${TIMESTAMP}.md${NC}"
echo ""

# ============================================================================
# Optional: Open results in browser if criterion HTML is available
# ============================================================================

if [ -d "target/criterion" ]; then
    echo -e "${YELLOW}Criterion HTML reports available at:${NC}"
    echo -e "  ${BLUE}file://$(pwd)/target/criterion/report/index.html${NC}"
fi

echo ""
echo -e "${GREEN}Done!${NC}"
381
crates/ruvector-postgres/benches/sql/benchmark_workload.sql
Normal file
@@ -0,0 +1,381 @@
-- Realistic workload benchmark for ruvector vs pgvector
-- This script tests common operations with realistic dataset sizes

\timing on
\set ECHO all

-- Configuration
\set num_vectors 1000000
\set num_queries 1000
\set dims 1536
\set k 10

BEGIN;

-- ============================================================================
-- Setup Test Tables
-- ============================================================================

DROP TABLE IF EXISTS vectors_ruvector CASCADE;
DROP TABLE IF EXISTS vectors_pgvector CASCADE;
DROP TABLE IF EXISTS queries CASCADE;

-- Create tables
CREATE TABLE vectors_ruvector (
    id SERIAL PRIMARY KEY,
    embedding ruvector(:dims),
    metadata JSONB
);

CREATE TABLE vectors_pgvector (
    id SERIAL PRIMARY KEY,
    embedding vector(:dims),
    metadata JSONB
);

CREATE TABLE queries (
    id SERIAL PRIMARY KEY,
    query_vector ruvector(:dims)
);

-- ============================================================================
-- Generate Test Data
-- ============================================================================

\echo 'Generating test data...'

-- Insert vectors (ruvector)
INSERT INTO vectors_ruvector (embedding, metadata)
SELECT
    array_to_ruvector(ARRAY(
        SELECT random()::real
        FROM generate_series(1, :dims)
    )),
    jsonb_build_object('category', i % 100)
FROM generate_series(1, :num_vectors) i;

-- Insert vectors (pgvector)
INSERT INTO vectors_pgvector (embedding, metadata)
SELECT
    ARRAY(
        SELECT random()::real
        FROM generate_series(1, :dims)
    )::vector(:dims),
    jsonb_build_object('category', i % 100)
FROM generate_series(1, :num_vectors) i;

-- Generate query vectors
INSERT INTO queries (query_vector)
SELECT
    array_to_ruvector(ARRAY(
        SELECT random()::real
        FROM generate_series(1, :dims)
    ))
FROM generate_series(1, :num_queries);

COMMIT;

-- ============================================================================
-- Benchmark 1: Sequential Scan (No Index)
-- ============================================================================

\echo ''
\echo '=== Benchmark 1: Sequential Scan (No Index) ==='
\echo ''

-- Get a test query
\set test_query 'SELECT query_vector FROM queries WHERE id = 1'
-- RuVector scan
\echo 'RuVector sequential scan (p50, p99 latency):'
SELECT
    percentile_cont(0.5) WITHIN GROUP (ORDER BY duration) AS p50_ms,
    percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) AS p99_ms,
    AVG(duration) AS avg_ms,
    MIN(duration) AS min_ms,
    MAX(duration) AS max_ms
FROM (
    SELECT
        id,
        extract(milliseconds FROM (clock_timestamp() - start_time)) AS duration
    FROM (
        SELECT
            id,
            clock_timestamp() AS start_time,
            -- ARRAY(...) keeps the multi-row top-k subquery legal in a
            -- scalar context; a bare subquery errors for k > 1
            ARRAY(SELECT id FROM vectors_ruvector v ORDER BY v.embedding <-> (:test_query)::ruvector LIMIT :k)
        FROM queries
        LIMIT 100
    ) t
) times;

-- PGVector scan
\echo 'pgvector sequential scan (p50, p99 latency):'
SELECT
    percentile_cont(0.5) WITHIN GROUP (ORDER BY duration) AS p50_ms,
    percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) AS p99_ms,
    AVG(duration) AS avg_ms,
    MIN(duration) AS min_ms,
    MAX(duration) AS max_ms
FROM (
    SELECT
        id,
        extract(milliseconds FROM (clock_timestamp() - start_time)) AS duration
    FROM (
        SELECT
            id,
            clock_timestamp() AS start_time,
            ARRAY(SELECT id FROM vectors_pgvector v ORDER BY v.embedding <-> (SELECT query_vector::vector FROM queries WHERE id = 1) LIMIT :k)
        FROM queries
        LIMIT 100
    ) t
) times;
-- ============================================================================
-- Benchmark 2: Build Index
-- ============================================================================

\echo ''
\echo '=== Benchmark 2: Index Build Time ==='
\echo ''

-- RuVector HNSW
\echo 'Building ruvector HNSW index...'
\timing on
CREATE INDEX vectors_ruvector_hnsw_idx ON vectors_ruvector
USING hnsw (embedding ruvector_l2_ops)
WITH (m = 16, ef_construction = 64);

-- PGVector HNSW
\echo 'Building pgvector HNSW index...'
\timing on
CREATE INDEX vectors_pgvector_hnsw_idx ON vectors_pgvector
USING hnsw (embedding vector_l2_ops)
WITH (m = 16, ef_construction = 64);

-- ============================================================================
-- Benchmark 3: Index Search Performance
-- ============================================================================

\echo ''
\echo '=== Benchmark 3: Index Search (HNSW) ==='
\echo ''

-- Warm up with a single query vector (a full queries x vectors cross join
-- would compute a billion distances)
SELECT COUNT(*) FROM vectors_ruvector v
WHERE v.embedding <-> (SELECT query_vector FROM queries WHERE id = 1) < 1000;
-- RuVector HNSW search
\echo 'RuVector HNSW search (p50, p99 latency):'
SELECT
    percentile_cont(0.5) WITHIN GROUP (ORDER BY duration) AS p50_ms,
    percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) AS p99_ms,
    AVG(duration) AS avg_ms,
    MIN(duration) AS min_ms,
    MAX(duration) AS max_ms
FROM (
    SELECT
        id,
        extract(milliseconds FROM (clock_timestamp() - start_time)) AS duration
    FROM (
        SELECT
            q.id,
            clock_timestamp() AS start_time,
            -- ARRAY(...) keeps the multi-row top-k subquery legal in a
            -- scalar context; a bare subquery errors for k > 1
            ARRAY(SELECT id FROM vectors_ruvector v ORDER BY v.embedding <-> q.query_vector LIMIT :k)
        FROM queries q
        LIMIT 1000
    ) t
) times;

-- PGVector HNSW search
\echo 'pgvector HNSW search (p50, p99 latency):'
SELECT
    percentile_cont(0.5) WITHIN GROUP (ORDER BY duration) AS p50_ms,
    percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) AS p99_ms,
    AVG(duration) AS avg_ms,
    MIN(duration) AS min_ms,
    MAX(duration) AS max_ms
FROM (
    SELECT
        id,
        extract(milliseconds FROM (clock_timestamp() - start_time)) AS duration
    FROM (
        SELECT
            q.id,
            clock_timestamp() AS start_time,
            ARRAY(SELECT id FROM vectors_pgvector v ORDER BY v.embedding <-> q.query_vector::vector LIMIT :k)
        FROM queries q
        LIMIT 1000
    ) t
) times;
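The latency queries above summarize per-query durations with `percentile_cont`, which linearly interpolates between the two nearest ranks. For readers post-processing the raw result files instead, a minimal Rust sketch of the same interpolated-percentile semantics (illustrative, not part of the benchmark suite):

```rust
// Linear-interpolated percentile, matching PostgreSQL's
// percentile_cont(p) WITHIN GROUP (ORDER BY x) semantics.
fn percentile_cont(samples: &mut [f64], p: f64) -> f64 {
    assert!(!samples.is_empty() && (0.0..=1.0).contains(&p));
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    // Fractional rank into the sorted samples, then interpolate.
    let rank = p * (samples.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f64;
    samples[lo] + (samples[hi] - samples[lo]) * frac
}

fn main() {
    let mut xs = vec![10.0, 20.0, 30.0, 40.0];
    // rank = 0.5 * 3 = 1.5, halfway between 20 and 30
    assert!((percentile_cont(&mut xs, 0.5) - 25.0).abs() < 1e-9);
    let mut ys = vec![1.0, 2.0, 3.0];
    assert!((percentile_cont(&mut ys, 0.99) - 2.98).abs() < 1e-9);
}
```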
-- ============================================================================
-- Benchmark 4: Distance Function Performance
-- ============================================================================

\echo ''
\echo '=== Benchmark 4: Distance Functions ==='
\echo ''

-- L2 Distance
\echo 'L2 Distance (100k calculations):'
\timing on
SELECT SUM(ruvector_l2_distance(v1.embedding, v2.embedding))
FROM vectors_ruvector v1
CROSS JOIN vectors_ruvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;

\timing on
SELECT SUM(v1.embedding <-> v2.embedding)
FROM vectors_pgvector v1
CROSS JOIN vectors_pgvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;

-- Cosine Distance
\echo 'Cosine Distance (100k calculations):'
\timing on
SELECT SUM(ruvector_cosine_distance(v1.embedding, v2.embedding))
FROM vectors_ruvector v1
CROSS JOIN vectors_ruvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;

\timing on
SELECT SUM(v1.embedding <=> v2.embedding)
FROM vectors_pgvector v1
CROSS JOIN vectors_pgvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;

-- Inner Product
\echo 'Inner Product (100k calculations):'
\timing on
SELECT SUM(ruvector_inner_product(v1.embedding, v2.embedding))
FROM vectors_ruvector v1
CROSS JOIN vectors_ruvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;

\timing on
SELECT SUM(v1.embedding <#> v2.embedding)
FROM vectors_pgvector v1
CROSS JOIN vectors_pgvector v2
WHERE v1.id <= 100 AND v2.id <= 1000;
-- ============================================================================
-- Benchmark 5: Index Recall Accuracy
-- ============================================================================

\echo ''
\echo '=== Benchmark 5: Index Recall ==='
\echo ''

-- Create ground truth table with exact (non-index) search; the HNSW index
-- built in Benchmark 2 would otherwise answer both sides and recall would
-- be trivially 1.0
SET enable_indexscan = off;
DROP TABLE IF EXISTS ground_truth;
CREATE TEMP TABLE ground_truth AS
SELECT
    q.id AS query_id,
    ARRAY_AGG(v.id ORDER BY v.embedding <-> q.query_vector) AS true_neighbors
FROM queries q
CROSS JOIN LATERAL (
    SELECT id, embedding
    FROM vectors_ruvector
    ORDER BY embedding <-> q.query_vector
    LIMIT :k
) v
WHERE q.id <= 100
GROUP BY q.id;
RESET enable_indexscan;

-- Compute recall for ruvector HNSW
WITH hnsw_results AS (
    SELECT
        q.id AS query_id,
        ARRAY_AGG(v.id ORDER BY v.embedding <-> q.query_vector) AS hnsw_neighbors
    FROM queries q
    CROSS JOIN LATERAL (
        -- embedding must be selected here: the outer ARRAY_AGG orders by it
        SELECT id, embedding
        FROM vectors_ruvector
        ORDER BY embedding <-> q.query_vector
        LIMIT :k
    ) v
    WHERE q.id <= 100
    GROUP BY q.id
)
SELECT
    AVG(
        (
            SELECT COUNT(*)
            FROM unnest(h.hnsw_neighbors) AS hn
            WHERE hn = ANY(g.true_neighbors)
        )::float / :k
    ) AS recall
FROM hnsw_results h
JOIN ground_truth g ON h.query_id = g.query_id;
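The recall query above computes, per query, the fraction of approximate neighbors that appear in the exact top-k set, then averages across queries. The same recall@k metric in a few lines of plain Rust (an illustrative sketch for checking results offline):

```rust
use std::collections::HashSet;

// recall@k: fraction of approximate neighbors present in the exact
// top-k neighbor set for one query.
fn recall_at_k(approx: &[i64], exact: &[i64]) -> f64 {
    let truth: HashSet<_> = exact.iter().collect();
    let hits = approx.iter().filter(|id| truth.contains(id)).count();
    hits as f64 / exact.len() as f64
}

fn main() {
    let exact = [1, 2, 3, 4, 5];
    let approx = [1, 2, 3, 9, 10]; // 3 of 5 exact neighbors found
    assert!((recall_at_k(&approx, &exact) - 0.6).abs() < 1e-12);
    println!("recall@5 = {}", recall_at_k(&approx, &exact));
}
```

Averaging this value over all benchmarked queries reproduces the single `recall` number the SQL reports.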
-- ============================================================================
-- Benchmark 6: Memory Usage
-- ============================================================================

\echo ''
\echo '=== Benchmark 6: Memory Usage ==='
\echo ''

-- Table sizes
\echo 'Table sizes:'
SELECT
    'ruvector' AS type,
    pg_size_pretty(pg_total_relation_size('vectors_ruvector')) AS total_size,
    pg_size_pretty(pg_relation_size('vectors_ruvector')) AS table_size,
    pg_size_pretty(pg_indexes_size('vectors_ruvector')) AS index_size
UNION ALL
SELECT
    'pgvector' AS type,
    pg_size_pretty(pg_total_relation_size('vectors_pgvector')) AS total_size,
    pg_size_pretty(pg_relation_size('vectors_pgvector')) AS table_size,
    pg_size_pretty(pg_indexes_size('vectors_pgvector')) AS index_size;

-- Index sizes
\echo 'Index sizes:'
SELECT
    indexname,
    pg_size_pretty(pg_relation_size(indexname::regclass)) AS size
FROM pg_indexes
WHERE tablename IN ('vectors_ruvector', 'vectors_pgvector')
ORDER BY tablename, indexname;

-- ============================================================================
-- Benchmark 7: Quantization Performance
-- ============================================================================

\echo ''
\echo '=== Benchmark 7: Quantization ==='
\echo ''

-- Create quantized tables
DROP TABLE IF EXISTS vectors_scalar;
CREATE TABLE vectors_scalar (
    id SERIAL PRIMARY KEY,
    embedding scalarvec
);

INSERT INTO vectors_scalar (embedding)
SELECT quantize_scalar(embedding)
FROM vectors_ruvector
LIMIT 100000;

-- Quantized search
\echo 'Scalar quantized search:'
\timing on
SELECT id
FROM vectors_scalar
ORDER BY embedding <-> quantize_scalar((SELECT query_vector FROM queries WHERE id = 1))
LIMIT :k;

-- ============================================================================
-- Cleanup
-- ============================================================================

\echo ''
\echo '=== Benchmark Complete ==='
\echo ''

DROP TABLE IF EXISTS vectors_ruvector CASCADE;
DROP TABLE IF EXISTS vectors_pgvector CASCADE;
DROP TABLE IF EXISTS queries CASCADE;
DROP TABLE IF EXISTS vectors_scalar CASCADE;
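Benchmark 7 compresses vectors with `quantize_scalar` before searching. The exact ruvector encoding is not shown in this diff; as a hedged sketch, min/max affine quantization to 8-bit codes (one common scalar-quantization scheme) works like this:

```rust
// Illustrative int8-style scalar quantization round trip (min/max affine).
// This shows the general technique only; the actual ruvector scalarvec
// encoding may differ.

fn quantize(v: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    // Map [min, max] onto the 256 available codes.
    let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
    let codes = v.iter().map(|&x| ((x - min) / scale).round() as u8).collect();
    (codes, min, scale)
}

fn dequantize(codes: &[u8], min: f32, scale: f32) -> Vec<f32> {
    codes.iter().map(|&c| min + c as f32 * scale).collect()
}

fn main() {
    let v = [0.0f32, 0.5, 1.0];
    let (codes, min, scale) = quantize(&v);
    let back = dequantize(&codes, min, scale);
    for (a, b) in v.iter().zip(&back) {
        // Round-trip error is at most half a quantization step.
        assert!((a - b).abs() < 0.01);
    }
}
```

The 4x size reduction (f32 to u8 per dimension) is what makes the quantized scan in Benchmark 7 cheaper than the full-precision one, at the cost of a bounded rounding error per component.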
123
crates/ruvector-postgres/benches/sql/quick_benchmark.sql
Normal file
@@ -0,0 +1,123 @@
-- Quick benchmark script for development testing
-- Smaller dataset for faster iteration

\timing on
\set ECHO all

-- Configuration
\set num_vectors 10000
\set num_queries 100
\set dims 768
\set k 10

BEGIN;

-- ============================================================================
-- Setup
-- ============================================================================

DROP TABLE IF EXISTS test_vectors CASCADE;
DROP TABLE IF EXISTS test_queries CASCADE;

CREATE TABLE test_vectors (
    id SERIAL PRIMARY KEY,
    embedding ruvector(:dims)
);

CREATE TABLE test_queries (
    id SERIAL PRIMARY KEY,
    query_vector ruvector(:dims)
);

-- ============================================================================
-- Load Data
-- ============================================================================

\echo 'Loading test data...'

INSERT INTO test_vectors (embedding)
SELECT
    array_to_ruvector(ARRAY(
        SELECT random()::real
        FROM generate_series(1, :dims)
    ))
FROM generate_series(1, :num_vectors);

INSERT INTO test_queries (query_vector)
SELECT
    array_to_ruvector(ARRAY(
        SELECT random()::real
        FROM generate_series(1, :dims)
    ))
FROM generate_series(1, :num_queries);

COMMIT;

-- ============================================================================
-- Sequential Scan Baseline
-- ============================================================================

\echo ''
\echo 'Sequential scan baseline:'
EXPLAIN ANALYZE
SELECT id
FROM test_vectors
ORDER BY embedding <-> (SELECT query_vector FROM test_queries WHERE id = 1)
LIMIT :k;

-- ============================================================================
-- Build HNSW Index
-- ============================================================================

\echo ''
\echo 'Building HNSW index...'
CREATE INDEX test_vectors_hnsw_idx ON test_vectors
USING hnsw (embedding ruvector_l2_ops)
WITH (m = 16, ef_construction = 64);

-- ============================================================================
-- Index Search
-- ============================================================================

\echo ''
\echo 'HNSW index search:'
EXPLAIN ANALYZE
SELECT id
FROM test_vectors
ORDER BY embedding <-> (SELECT query_vector FROM test_queries WHERE id = 1)
LIMIT :k;

-- ============================================================================
-- Distance Functions
-- ============================================================================

\echo ''
\echo 'Distance function performance (1000 calculations):'

-- L2
\timing on
SELECT SUM(ruvector_l2_distance(v1.embedding, v2.embedding))
FROM test_vectors v1, test_vectors v2
WHERE v1.id <= 10 AND v2.id <= 100;

-- Cosine
\timing on
SELECT SUM(ruvector_cosine_distance(v1.embedding, v2.embedding))
FROM test_vectors v1, test_vectors v2
WHERE v1.id <= 10 AND v2.id <= 100;

-- Inner Product
\timing on
SELECT SUM(ruvector_inner_product(v1.embedding, v2.embedding))
FROM test_vectors v1, test_vectors v2
WHERE v1.id <= 10 AND v2.id <= 100;

-- ============================================================================
-- Cleanup
-- ============================================================================

DROP TABLE IF EXISTS test_vectors CASCADE;
DROP TABLE IF EXISTS test_queries CASCADE;

\echo ''
\echo 'Quick benchmark complete!'