6.4 KiB
6.4 KiB
ADR-0027: Fix HNSW Index Segmentation Fault with Parameterized Queries
Status
Accepted - 2026-01-28
Context
Problem Statement
GitHub Issue #141 reported a critical (P0) bug where HNSW indexes on ruvector(384) columns cause PostgreSQL to crash with a segmentation fault when executing similarity queries with parameterized query vectors.
Symptoms
- Warning:
"HNSW: Could not extract query vector, using zeros" - Warning:
"HNSW v2: Bitmap scans not supported for k-NN queries" - Fatal:
"server process terminated by signal 11: Segmentation fault"
Root Cause Analysis
The bug has three contributing factors:
-
Query Vector Extraction Failure
- The
hnsw_rescancallback extracts the query vector from PostgreSQL'sorderby.sk_argumentdatum - The extraction code only handles direct
ruvectordatums viaRuVector::from_polymorphic_datum() - Parameterized queries (prepared statements, application drivers) pass text representations that require conversion
- When extraction fails, the code falls back to a zero vector
- The
-
Invalid Zero Vector Handling
- A zero vector is mathematically invalid for similarity search (especially in hyperbolic/Poincaré space)
- The HNSW search algorithm proceeds with this invalid vector without validation
- Distance calculations with zero vectors cause undefined behavior
-
Missing Error Handling
- No validation before executing HNSW search
- Segmentation fault instead of graceful PostgreSQL error
- No dimension mismatch checking
Impact
- Production Adoption Blocked: Modern applications use parameterized queries (ORMs, prepared statements, SQL injection prevention)
- 100% Reproducible: Any parameterized HNSW query triggers the crash
- Workaround Required: Sequential scans with 10-15x performance penalty
Decision
Fix Strategy
Implement a comprehensive query vector extraction pipeline with proper validation:
1. Multi-Method Query Vector Extraction
// Method 1: Direct RuVector extraction (literals, casts)
if let Some(vector) = RuVector::from_polymorphic_datum(datum, false, typoid) {
state.query_vector = vector.as_slice().to_vec();
state.query_valid = true;
}
// Method 2: Text parameter conversion (parameterized queries)
if !state.query_valid && is_text_type(typoid) {
if let Some(vec) = try_convert_text_to_ruvector(datum) {
state.query_vector = vec;
state.query_valid = true;
}
}
// Method 3: Validated varlena fallback
if !state.query_valid {
// ... with size and dimension validation
}
2. Validation Before Search
// Reject invalid queries with clear error messages
if !state.query_valid || state.query_vector.is_empty() {
pgrx::error!("HNSW: Could not extract query vector...");
}
if is_zero_vector(&state.query_vector) {
pgrx::error!("HNSW: Query vector is all zeros...");
}
if state.query_vector.len() != state.dimensions {
pgrx::error!("HNSW: Dimension mismatch...");
}
3. Track Query Validity State
Add query_valid: bool field to HnswScanState to track extraction success across methods.
Changes Made
| File | Changes |
|---|---|
crates/ruvector-postgres/src/index/hnsw_am.rs |
Multi-method extraction, validation, zero-vector check |
crates/ruvector-postgres/src/index/ivfflat_am.rs |
Same fixes applied for consistency |
Key Functions Added/Modified
hnsw_rescan()- Complete rewrite of query extraction logictry_convert_text_to_ruvector()- New function for text→ruvector conversionis_zero_vector()- New validation helperivfflat_amrescan()- Parallel fix for IVFFlat indexivfflat_try_convert_text_to_ruvector()- IVFFlat text conversion
Consequences
Positive
- Parameterized queries work: Prepared statements, ORMs, application drivers all function correctly
- Graceful error handling: PostgreSQL ERROR instead of segfault
- Clear error messages: Users understand what went wrong and how to fix it
- Dimension validation: Catches mismatched query/index dimensions early
- Zero-vector protection: Invalid queries rejected before search execution
Negative
- Slight overhead: Additional validation on each query (negligible, ~1μs)
- Text parsing: Manual vector parsing for text parameters (only when other methods fail)
Neutral
- No API changes: Existing queries continue to work unchanged
- IVFFlat also fixed: Consistent behavior across both index types
Test Plan
Unit Tests
-- 1. Literal query (baseline - should work)
SELECT * FROM test_hnsw ORDER BY embedding <=> '[0.1,0.2,0.3]'::ruvector(3) LIMIT 5;
-- 2. Prepared statement (was crashing, now works)
PREPARE search AS SELECT * FROM test_hnsw ORDER BY embedding <=> $1::ruvector(3) LIMIT 5;
EXECUTE search('[0.1,0.2,0.3]');
-- 3. Function with text parameter (was crashing, now works)
SELECT * FROM search_similar('[0.1,0.2,0.3]');
-- 4. Zero vector (was crashing, now errors gracefully)
SELECT * FROM test_hnsw ORDER BY embedding <=> '[0,0,0]'::ruvector(3) LIMIT 5;
-- ERROR: HNSW: Query vector is all zeros...
-- 5. Dimension mismatch (was undefined behavior, now errors)
SELECT * FROM test_hnsw ORDER BY embedding <=> '[0.1,0.2]'::ruvector(2) LIMIT 5;
-- ERROR: HNSW: Query vector has 2 dimensions but index expects 3
Integration Tests
- Node.js pg driver with parameterized queries
- Python psycopg with prepared statements
- Rust sqlx with query parameters
- Load test with 10k concurrent parameterized queries
Related
- Issue: #141 - HNSW Segmentation Fault with Parameterized Queries
- Reporter: Mark Allen, NexaDental CTO
- Priority: P0 (Critical) - Production blocker
Implementation Checklist
- Fix
hnsw_rescan()query extraction - Add
try_convert_text_to_ruvector()helper - Add
is_zero_vector()validation - Add
query_validfield to scan state - Apply same fix to IVFFlat for consistency
- Compile verification
- Add regression tests
- Update documentation
- Build new Docker image
- Test with production dataset (6,975 rows)
- Release v2.0.1 patch