Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'
This commit is contained in:
536
vendor/ruvector/crates/ruvector-postgres/docs/ARCHITECTURE.md
vendored
Normal file
536
vendor/ruvector/crates/ruvector-postgres/docs/ARCHITECTURE.md
vendored
Normal file
@@ -0,0 +1,536 @@
|
||||
# RuVector-Postgres Architecture
|
||||
|
||||
## Overview
|
||||
|
||||
RuVector-Postgres is a high-performance, drop-in replacement for the pgvector extension, built in Rust using the pgrx framework. It provides SIMD-optimized vector similarity search with advanced indexing algorithms, quantization support, and hybrid search capabilities.
|
||||
|
||||
## Design Goals
|
||||
|
||||
1. **pgvector API Compatibility**: 100% compatible SQL interface with pgvector
|
||||
2. **Superior Performance**: 2-10x faster than pgvector through SIMD and algorithmic optimizations
|
||||
3. **Memory Efficiency**: Up to 32x memory reduction via quantization
|
||||
4. **Neon Compatibility**: Designed for serverless PostgreSQL (Neon, Supabase, etc.)
|
||||
5. **Production Ready**: Battle-tested algorithms from ruvector-core
|
||||
|
||||
## Architecture Diagram
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ PostgreSQL Server │
|
||||
├─────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ RuVector-Postgres Extension │ │
|
||||
│ ├─────────────────────────────────────────────────────────────────────────┤ │
|
||||
│ │ │ │
|
||||
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐ │ │
|
||||
│ │ │ Vector │ │ HNSW │ │ IVFFlat │ │ Flat Index │ │ │
|
||||
│ │ │ Type │ │ Index │ │ Index │ │ (fallback) │ │ │
|
||||
│ │ │ │ │ │ │ │ │ │ │ │
|
||||
│ │ │ - ruvector │ │ - O(log n) │ │ - O(√n) │ │ - O(n) │ │ │
|
||||
│ │ │ - halfvec │ │ - 95%+ rec │ │ - clusters │ │ - exact search │ │ │
|
||||
│ │ │ - sparsevec │ │ - SIMD ops │ │ - training │ │ │ │ │
|
||||
│ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └────────┬────────┘ │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ ┌──────┴────────────────┴────────────────┴───────────────────┴────────┐ │ │
|
||||
│ │ │ SIMD Distance Layer │ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────────┐ │ │ │
|
||||
│ │ │ │ AVX-512 │ │ AVX2 │ │ NEON │ │ Scalar │ │ │ │
|
||||
│ │ │ │ (x86_64) │ │ (x86_64) │ │ (ARM64) │ │ Fallback │ │ │ │
|
||||
│ │ │ └────────────┘ └────────────┘ └────────────┘ └────────────────┘ │ │ │
|
||||
│ │ └──────────────────────────────────────────────────────────────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │ ┌──────────────────────────────────────────────────────────────────────┐ │ │
|
||||
│ │ │ Quantization Engine │ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────────┐ │ │ │
|
||||
│ │ │ │ Scalar │ │ Product │ │ Binary │ │ Half-Prec │ │ │ │
|
||||
│ │ │ │ (4x) │ │ (8-16x) │ │ (32x) │ │ (2x) │ │ │ │
|
||||
│ │ │ └────────────┘ └────────────┘ └────────────┘ └────────────────┘ │ │ │
|
||||
│ │ └──────────────────────────────────────────────────────────────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │ ┌──────────────────────────────────────────────────────────────────────┐ │ │
|
||||
│ │ │ Hybrid Search Engine │ │ │
|
||||
│ │ │ │ │ │
|
||||
│ │ │ ┌─────────────────────┐ ┌─────────────────────┐ ┌──────────────┐ │ │ │
|
||||
│ │ │ │ Vector Similarity │ │ BM25 Text Search │ │ RRF Fusion │ │ │ │
|
||||
│ │ │ │ (dense) │ │ (sparse) │ │ (ranking) │ │ │ │
|
||||
│ │ │ └─────────────────────┘ └─────────────────────┘ └──────────────┘ │ │ │
|
||||
│ │ └──────────────────────────────────────────────────────────────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ └─────────────────────────────────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Core Components
|
||||
|
||||
### 1. Vector Types
|
||||
|
||||
#### `ruvector` - Primary Vector Type
|
||||
|
||||
**Varlena Memory Layout (Zero-Copy Design)**
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ RuVector Varlena Layout │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Bytes 0-3 │ Bytes 4-5 │ Bytes 6-7 │ Bytes 8+ │
|
||||
│ vl_len_ │ dimensions │ _unused │ f32 data... │
|
||||
│ (varlena hdr)│ (u16) │ (padding) │ [dim0, dim1...] │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ 4 bytes │ 2 bytes │ 2 bytes │ 4*dims bytes │
|
||||
│ PostgreSQL │ pgvector │ Alignment │ Vector data │
|
||||
│ header │ compatible │ to 8 bytes │ (f32 floats) │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Key Layout Features:**
|
||||
|
||||
1. **Varlena Header (VARHDRSZ)**: Standard PostgreSQL variable-length type header (4 bytes)
|
||||
2. **Dimensions (u16)**: Compatible with pgvector's 16-bit dimension count (max 16,000)
|
||||
3. **Padding (2 bytes)**: Ensures f32 data is 8-byte aligned for efficient SIMD access
|
||||
4. **Data Array**: Contiguous f32 elements for zero-copy SIMD operations
|
||||
|
||||
**Memory Alignment Requirements:**
|
||||
|
||||
- Total header size: 8 bytes (4 + 2 + 2)
|
||||
- Data alignment: 8-byte aligned for optimal performance
|
||||
- SIMD alignment:
|
||||
- AVX-512 prefers 64-byte alignment (checked at runtime)
|
||||
- AVX2 prefers 32-byte alignment (checked at runtime)
|
||||
- Unaligned loads used as fallback (minimal performance penalty)
|
||||
|
||||
**Zero-Copy Access Pattern:**
|
||||
|
||||
```rust
|
||||
// Direct pointer access to varlena data (zero allocation)
|
||||
pub unsafe fn as_ptr(&self) -> *const f32 {
|
||||
// Skip varlena header (4 bytes) + RuVectorHeader (4 bytes)
|
||||
let base = self as *const _ as *const u8;
|
||||
base.add(VARHDRSZ + RuVectorHeader::SIZE) as *const f32
|
||||
}
|
||||
|
||||
// SIMD functions operate directly on this pointer
|
||||
let distance = l2_distance_ptr_avx512(vec_a.as_ptr(), vec_b.as_ptr(), dims);
|
||||
```
|
||||
|
||||
**SQL Usage:**
|
||||
|
||||
```sql
|
||||
-- Dimensions: 1 to 16,000
|
||||
-- Storage: 4 bytes per dimension (f32) + 8 bytes header
|
||||
CREATE TABLE items (
|
||||
id SERIAL PRIMARY KEY,
|
||||
embedding ruvector(1536) -- OpenAI embedding dimensions
|
||||
);
|
||||
|
||||
-- Total storage per vector: 8 + (1536 * 4) = 6,152 bytes
|
||||
```
|
||||
|
||||
#### `halfvec` - Half-Precision Vector
|
||||
|
||||
**Varlena Layout:**
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ HalfVec Varlena Layout │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Bytes 0-3 │ Bytes 4-5 │ Bytes 6-7 │ Bytes 8+ │
|
||||
│ vl_len_ │ dimensions │ _unused │ f16 data... │
|
||||
│ (varlena hdr)│ (u16) │ (padding) │ [dim0, dim1...] │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ 4 bytes │ 2 bytes │ 2 bytes │ 2*dims bytes │
|
||||
│ PostgreSQL │ pgvector │ Alignment │ Half-precision │
|
||||
│ header │ compatible │ to 8 bytes │ (f16 floats) │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Storage Benefits:**
|
||||
|
||||
- 50% memory savings vs ruvector
|
||||
- Minimal accuracy loss (<0.01% for most embeddings)
|
||||
- SIMD f16 support on modern CPUs (AVX-512 FP16, ARM Neon FP16)
|
||||
|
||||
```sql
|
||||
-- Storage: 2 bytes per dimension (f16) + 8 bytes header
|
||||
-- 50% memory savings, minimal accuracy loss
|
||||
CREATE TABLE items (
|
||||
id SERIAL PRIMARY KEY,
|
||||
embedding halfvec(1536)
|
||||
);
|
||||
|
||||
-- Total storage per vector: 8 + (1536 * 2) = 3,080 bytes
|
||||
```
|
||||
|
||||
#### `sparsevec` - Sparse Vector
|
||||
|
||||
**Varlena Layout:**
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ SparseVec Varlena Layout │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ Bytes 0-3 │ Bytes 4-7 │ Bytes 8-11 │ Bytes 12+ │
|
||||
│ vl_len_ │ dimensions │ nnz │ indices+values │
|
||||
│ (varlena hdr)│ (u32) │ (u32) │ [(idx,val)...] │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ 4 bytes │ 4 bytes │ 4 bytes │ 8*nnz bytes │
|
||||
│ PostgreSQL │ Total dims │ Non-zero │ (u32,f32) pairs │
|
||||
│ header │ (full size) │ count │ for sparse data │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Storage:** Only non-zero elements stored (u32 index + f32 value pairs)
|
||||
|
||||
```sql
|
||||
-- Storage: Only non-zero elements stored
|
||||
-- Ideal for high-dimensional sparse data (BM25, TF-IDF)
|
||||
CREATE TABLE items (
|
||||
id SERIAL PRIMARY KEY,
|
||||
sparse_embedding sparsevec(50000)
|
||||
);
|
||||
|
||||
-- Total storage: 12 + (nnz * 8) bytes
|
||||
-- Example: 100 non-zero out of 50,000 = 12 + 800 = 812 bytes
|
||||
```
|
||||
|
||||
### 2. Distance Operators
|
||||
|
||||
| Operator | Distance Metric | Description | SIMD Optimized |
|
||||
|----------|----------------|-------------|----------------|
|
||||
| `<->` | L2 (Euclidean) | `sqrt(sum((a[i] - b[i])^2))` | ✓ |
|
||||
| `<#>` | Inner Product | `-sum(a[i] * b[i])` (negative for ORDER BY) | ✓ |
|
||||
| `<=>` | Cosine | `1 - (a·b)/(‖a‖‖b‖)` | ✓ |
|
||||
| `<+>` | L1 (Manhattan) | `sum(abs(a[i] - b[i]))` | ✓ |
|
||||
| `<~>` | Hamming | Bit differences (binary vectors) | ✓ |
|
||||
| `<%>` | Jaccard | Set similarity (sparse vectors) | - |
|
||||
|
||||
### 3. SIMD Dispatch Mechanism
|
||||
|
||||
**Runtime Feature Detection:**
|
||||
|
||||
```rust
|
||||
/// Initialize SIMD dispatch table at extension load
|
||||
pub fn init_simd_dispatch() {
|
||||
#[cfg(target_arch = "x86_64")]
|
||||
{
|
||||
if is_x86_feature_detected!("avx512f") {
|
||||
SIMD_LEVEL.store(SimdLevel::AVX512, Ordering::Relaxed);
|
||||
return;
|
||||
}
|
||||
if is_x86_feature_detected!("avx2") {
|
||||
SIMD_LEVEL.store(SimdLevel::AVX2, Ordering::Relaxed);
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(target_arch = "aarch64")]
|
||||
{
|
||||
if is_aarch64_feature_detected!("neon") {
|
||||
SIMD_LEVEL.store(SimdLevel::NEON, Ordering::Relaxed);
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
SIMD_LEVEL.store(SimdLevel::Scalar, Ordering::Relaxed);
|
||||
}
|
||||
```
|
||||
|
||||
**Dispatch Flow:**
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Distance Function Call (SQL Operator) │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ ↓ │
|
||||
│ ┌─────────────────────────────────────────────────────────────┐│
|
||||
│ │ euclidean_distance(a: &[f32], b: &[f32]) -> f32 ││
|
||||
│ │ ↓ ││
|
||||
│ │ Check SIMD_LEVEL (atomic read, cached) ││
|
||||
│ └─────────────────────────────────────────────────────────────┘│
|
||||
│ ↓ │
|
||||
│ ┌────────────────────┴────────────────────┐ │
|
||||
│ ↓ ↓ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
|
||||
│ │ AVX-512? │ │ AVX2? │ │ NEON/Scalar? │ │
|
||||
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────────────┘ │
|
||||
│ ↓ ↓ ↓ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
|
||||
│ │ 16 floats/ │ │ 8 floats/ │ │ 4 floats (NEON) or │ │
|
||||
│ │ iteration │ │ iteration │ │ 1 float (scalar) │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ _mm512_* │ │ _mm256_* │ │ vaddq_f32/for loop │ │
|
||||
│ │ FMA support │ │ FMA support │ │ │ │
|
||||
│ └──────────────┘ └──────────────┘ └──────────────────────┘ │
|
||||
│ ↓ ↓ ↓ │
|
||||
│ └────────────────────┬─────────────────┘ │
|
||||
│ ↓ │
|
||||
│ ┌──────────────────┐ │
|
||||
│ │ Return distance │ │
|
||||
│ └──────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Performance Characteristics:**
|
||||
|
||||
| SIMD Level | Floats/Iter | Relative Speed | Instruction Examples |
|
||||
|------------|-------------|----------------|---------------------|
|
||||
| AVX-512 | 16 | 16x | `_mm512_loadu_ps`, `_mm512_fmadd_ps` |
|
||||
| AVX2 | 8 | 8x | `_mm256_loadu_ps`, `_mm256_fmadd_ps` |
|
||||
| NEON | 4 | 4x | `vld1q_f32`, `vmlaq_f32` |
|
||||
| Scalar | 1 | 1x | Standard f32 operations |
|
||||
|
||||
### 4. TOAST Handling
|
||||
|
||||
**TOAST (The Oversized-Attribute Storage Technique):**
|
||||
|
||||
PostgreSQL automatically TOASTs values > ~2KB. RuVector handles this transparently:
|
||||
|
||||
```rust
|
||||
/// Detoast varlena pointer if needed
|
||||
#[inline]
|
||||
unsafe fn detoast_vector(raw: *mut varlena) -> *mut varlena {
|
||||
if VARATT_IS_EXTENDED(raw) {
|
||||
// PostgreSQL automatically detoasts
|
||||
pg_detoast_datum(raw as *const varlena) as *mut varlena
|
||||
} else {
|
||||
raw
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**When TOAST Occurs:**
|
||||
|
||||
- RuVector: ~512+ dimensions (2048+ bytes)
|
||||
- HalfVec: ~1024+ dimensions (2048+ bytes)
|
||||
- Automatic compression and external storage
|
||||
|
||||
**Performance Impact:**
|
||||
|
||||
- First access: Detoasting overhead (~10-50μs)
|
||||
- Subsequent access: Cached in PostgreSQL buffer
|
||||
- Index operations: Typically work with detoasted values
|
||||
|
||||
### 5. Index Types
|
||||
|
||||
#### HNSW (Hierarchical Navigable Small World)
|
||||
|
||||
```sql
|
||||
CREATE INDEX ON items USING ruhnsw (embedding ruvector_l2_ops)
|
||||
WITH (m = 16, ef_construction = 200);
|
||||
```
|
||||
|
||||
**Parameters:**
|
||||
- `m`: Maximum connections per layer (default: 16, range: 2-100)
|
||||
- `ef_construction`: Build-time search breadth (default: 64, range: 4-1000)
|
||||
|
||||
**Characteristics:**
|
||||
- Search: O(log n)
|
||||
- Insert: O(log n)
|
||||
- Memory: ~1.5x index overhead
|
||||
- Recall: 95-99%+ with tuned parameters
|
||||
|
||||
**HNSW Index Layout:**
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ HNSW Index Structure │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ Layer L (top): ○──────○ │
|
||||
│ │ │ │
|
||||
│ Layer L-1: ○──○───○──○ │
|
||||
│ │ │ │ │ │
|
||||
│ Layer L-2: ○──○───○──○──○──○ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ Layer 0 (base): ○──○───○──○──○──○──○──○──○ │
|
||||
│ │
|
||||
│ Entry Point: Top layer node │
|
||||
│ Search: Greedy descent + local beam search │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
#### IVFFlat (Inverted File with Flat Quantization)
|
||||
|
||||
```sql
|
||||
CREATE INDEX ON items USING ruivfflat (embedding ruvector_l2_ops)
|
||||
WITH (lists = 100);
|
||||
```
|
||||
|
||||
**Parameters:**
|
||||
- `lists`: Number of clusters (default: sqrt(n), recommended: rows/1000 to rows/10000)
|
||||
|
||||
**Characteristics:**
|
||||
- Search: O(√n)
|
||||
- Insert: O(1) after training
|
||||
- Memory: Minimal overhead
|
||||
- Recall: 90-95% with `probes = sqrt(lists)`
|
||||
|
||||
## Query Execution Flow
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Query: SELECT ... ORDER BY v <-> q │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ 1. Parse & Plan │
|
||||
│ └─> Identify index scan opportunity │
|
||||
│ │
|
||||
│ 2. Index Selection │
|
||||
│ └─> Choose HNSW/IVFFlat based on cost estimation │
|
||||
│ │
|
||||
│ 3. Index Scan (SIMD-accelerated) │
|
||||
│ ├─> HNSW: Navigate layers, beam search at layer 0 │
|
||||
│ └─> IVFFlat: Probe nearest centroids, scan cells │
|
||||
│ │
|
||||
│ 4. Distance Calculation (per candidate) │
|
||||
│ ├─> Detoast vector if needed │
|
||||
│ ├─> Zero-copy pointer access │
|
||||
│ ├─> SIMD dispatch (AVX-512/AVX2/NEON/Scalar) │
|
||||
│ └─> Full precision or quantized distance │
|
||||
│ │
|
||||
│ 5. Result Aggregation │
|
||||
│ └─> Return top-k with distances │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Comparison with pgvector
|
||||
|
||||
| Feature | pgvector 0.8.0 | RuVector-Postgres |
|
||||
|---------|---------------|-------------------|
|
||||
| Vector dimensions | 16,000 max | 16,000 max |
|
||||
| HNSW index | ✓ | ✓ (optimized) |
|
||||
| IVFFlat index | ✓ | ✓ (optimized) |
|
||||
| Half-precision | ✓ | ✓ |
|
||||
| Sparse vectors | ✓ | ✓ |
|
||||
| Binary quantization | ✓ | ✓ |
|
||||
| Product quantization | ✗ | ✓ |
|
||||
| Scalar quantization | ✗ | ✓ |
|
||||
| AVX-512 optimized | Partial | Full |
|
||||
| ARM NEON optimized | ✗ | ✓ |
|
||||
| Zero-copy access | ✗ | ✓ |
|
||||
| Varlena alignment | Basic | Optimized (8-byte) |
|
||||
| Hybrid search | ✗ | ✓ |
|
||||
| Filtered HNSW | Partial | ✓ |
|
||||
| Parallel queries | ✓ | ✓ (PARALLEL SAFE) |
|
||||
|
||||
## Thread Safety
|
||||
|
||||
RuVector-Postgres is fully thread-safe:
|
||||
|
||||
- **Read operations**: Lock-free concurrent reads
|
||||
- **Write operations**: Fine-grained locking per graph layer
|
||||
- **Index builds**: Parallel with work-stealing
|
||||
|
||||
```rust
|
||||
// Internal synchronization primitives
|
||||
pub struct HnswIndex {
|
||||
layers: Vec<RwLock<Layer>>, // Per-layer locks
|
||||
entry_point: AtomicUsize, // Lock-free entry point
|
||||
node_count: AtomicUsize, // Lock-free counter
|
||||
vectors: DashMap<NodeId, Vec<f32>>, // Concurrent hashmap
|
||||
}
|
||||
```
|
||||
|
||||
## Extension Dependencies
|
||||
|
||||
```toml
|
||||
[dependencies]
|
||||
pgrx = "0.12" # PostgreSQL extension framework
|
||||
simsimd = "5.9" # SIMD-accelerated distance functions
|
||||
parking_lot = "0.12" # Fast synchronization primitives
|
||||
dashmap = "6.0" # Concurrent hashmap
|
||||
rayon = "1.10" # Data parallelism
|
||||
half = "2.4" # Half-precision floats
|
||||
bitflags = "2.6" # Compact flags storage
|
||||
```
|
||||
|
||||
## Performance Tuning
|
||||
|
||||
### Index Build Performance
|
||||
|
||||
```sql
|
||||
-- Parallel index build (uses all available cores)
|
||||
SET maintenance_work_mem = '8GB';
|
||||
SET max_parallel_maintenance_workers = 8;
|
||||
|
||||
CREATE INDEX CONCURRENTLY ON items
|
||||
USING ruhnsw (embedding ruvector_l2_ops)
|
||||
WITH (m = 32, ef_construction = 400);
|
||||
```
|
||||
|
||||
### Search Performance
|
||||
|
||||
```sql
|
||||
-- Adjust search quality vs speed tradeoff
|
||||
SET ruvector.ef_search = 200; -- Higher = better recall, slower
|
||||
SET ruvector.probes = 10; -- For IVFFlat: more probes = better recall
|
||||
|
||||
-- Use iterative scan for filtered queries
|
||||
SELECT * FROM items
|
||||
WHERE category = 'electronics'
|
||||
ORDER BY embedding <-> '[0.1, 0.2, ...]'::ruvector
|
||||
LIMIT 10;
|
||||
```
|
||||
|
||||
## File Structure
|
||||
|
||||
```
|
||||
crates/ruvector-postgres/
|
||||
├── Cargo.toml # Rust dependencies
|
||||
├── ruvector.control # Extension metadata
|
||||
├── docs/
|
||||
│ ├── ARCHITECTURE.md # This file
|
||||
│ ├── NEON_COMPATIBILITY.md # Neon deployment guide
|
||||
│ ├── SIMD_OPTIMIZATION.md # SIMD implementation details
|
||||
│ ├── INSTALLATION.md # Installation instructions
|
||||
│ ├── API.md # SQL API reference
|
||||
│ └── MIGRATION.md # Migration from pgvector
|
||||
├── sql/
|
||||
│ ├── ruvector--0.1.0.sql # Extension SQL definitions
|
||||
│ └── ruvector--0.0.0--0.1.0.sql # Migration script
|
||||
├── src/
|
||||
│ ├── lib.rs # Extension entry point
|
||||
│ ├── types/
|
||||
│ │ ├── mod.rs
|
||||
│ │ ├── vector.rs # ruvector type (zero-copy varlena)
|
||||
│ │ ├── halfvec.rs # Half-precision vector
|
||||
│ │ └── sparsevec.rs # Sparse vector
|
||||
│ ├── distance/
|
||||
│ │ ├── mod.rs
|
||||
│ │ ├── simd.rs # SIMD implementations (AVX-512/AVX2/NEON)
|
||||
│ │ └── scalar.rs # Scalar fallbacks
|
||||
│ ├── index/
|
||||
│ │ ├── mod.rs
|
||||
│ │ ├── hnsw.rs # HNSW implementation
|
||||
│ │ ├── ivfflat.rs # IVFFlat implementation
|
||||
│ │ └── scan.rs # Index scan operators
|
||||
│ ├── quantization/
|
||||
│ │ ├── mod.rs
|
||||
│ │ ├── scalar.rs # SQ8 quantization
|
||||
│ │ ├── product.rs # PQ quantization
|
||||
│ │ └── binary.rs # Binary quantization
|
||||
│ ├── operators.rs # SQL operators (<->, <=>, etc.)
|
||||
│ └── functions.rs # SQL functions
|
||||
└── tests/
|
||||
├── integration_tests.rs
|
||||
└── compatibility_tests.rs # pgvector compatibility
|
||||
```
|
||||
|
||||
## Version History
|
||||
|
||||
- **0.1.0**: Initial release with pgvector compatibility
|
||||
- HNSW and IVFFlat indexes
|
||||
- SIMD-optimized distance functions
|
||||
- Scalar quantization support
|
||||
- Neon compatibility
|
||||
- Zero-copy varlena access
|
||||
- AVX-512/AVX2/NEON support
|
||||
|
||||
## License
|
||||
|
||||
MIT License - Same as ruvector-core
|
||||
Reference in New Issue
Block a user