ADR-0027: Fix HNSW Index Segmentation Fault with Parameterized Queries

Status

Accepted - 2026-01-28

Context

Problem Statement

GitHub Issue #141 reported a critical (P0) bug where HNSW indexes on ruvector(384) columns cause PostgreSQL to crash with a segmentation fault when executing similarity queries with parameterized query vectors.

Symptoms

  1. Warning: "HNSW: Could not extract query vector, using zeros"
  2. Warning: "HNSW v2: Bitmap scans not supported for k-NN queries"
  3. Fatal: "server process terminated by signal 11: Segmentation fault"

Root Cause Analysis

The bug has three contributing factors:

  1. Query Vector Extraction Failure

    • The hnsw_rescan callback extracts the query vector from PostgreSQL's orderby.sk_argument datum
    • The extraction code only handles direct ruvector datums via RuVector::from_polymorphic_datum()
    • Parameterized queries (prepared statements, application drivers) pass text representations that require conversion
    • When extraction fails, the code falls back to a zero vector
  2. Invalid Zero Vector Handling

    • A zero vector is not a valid similarity query: it has zero norm, so normalized measures such as cosine similarity are undefined for it (the problem is especially acute in hyperbolic/Poincaré space)
    • The HNSW search algorithm proceeds with this invalid vector without validation
    • Distance calculations with zero vectors cause undefined behavior
  3. Missing Error Handling

    • No validation before executing HNSW search
    • Segmentation fault instead of graceful PostgreSQL error
    • No dimension mismatch checking
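To make the second factor concrete, the sketch below shows what a cosine-style distance does with an all-zero query. It assumes the operator computes a cosine distance, which this ADR does not state; `cosine_distance` and `compare_distances` are hypothetical helpers, not the ruvector kernel. In safe Rust the result is NaN rather than a crash, but NaN cannot be ordered against other distances, so a nearest-neighbor search has no sensible way to rank candidates.

```rust
use std::cmp::Ordering;

/// Hypothetical cosine distance, for illustration only (not the ruvector kernel).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    // With an all-zero input this evaluates 0.0 / 0.0, which is NaN.
    1.0 - dot / (norm_a * norm_b)
}

/// Compare two candidate distances the way a priority queue would need to.
/// Returns None when either side is NaN, because NaN has no ordering.
fn compare_distances(d1: f32, d2: f32) -> Option<Ordering> {
    d1.partial_cmp(&d2)
}
```

Calling `cosine_distance` with a zero query yields NaN, and `compare_distances` then returns `None`, which is why the fix rejects such queries before the search starts.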

Impact

  • Production Adoption Blocked: Modern applications use parameterized queries (ORMs, prepared statements, SQL injection prevention)
  • 100% Reproducible: Any parameterized HNSW query triggers the crash
  • Workaround Required: Sequential scans with 10-15x performance penalty

Decision

Fix Strategy

Implement a comprehensive query vector extraction pipeline with proper validation:

1. Multi-Method Query Vector Extraction

// Method 1: Direct RuVector extraction (literals, casts)
if let Some(vector) = RuVector::from_polymorphic_datum(datum, false, typoid) {
    state.query_vector = vector.as_slice().to_vec();
    state.query_valid = true;
}

// Method 2: Text parameter conversion (parameterized queries)
if !state.query_valid && is_text_type(typoid) {
    if let Some(vec) = try_convert_text_to_ruvector(datum) {
        state.query_vector = vec;
        state.query_valid = true;
    }
}

// Method 3: Validated varlena fallback
if !state.query_valid {
    // ... with size and dimension validation
}

2. Query Vector Validation

// Reject invalid queries with clear error messages
if !state.query_valid || state.query_vector.is_empty() {
    pgrx::error!("HNSW: Could not extract query vector...");
}

if is_zero_vector(&state.query_vector) {
    pgrx::error!("HNSW: Query vector is all zeros...");
}

if state.query_vector.len() != state.dimensions {
    pgrx::error!("HNSW: Dimension mismatch...");
}
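
The three checks above can be sketched outside of pgrx as a plain function that returns a `Result`. The ADR names `is_zero_vector()` but does not show its body, so the implementation here is an assumption, and `QueryError` and `validate_query` are hypothetical names introduced only for this illustration.

```rust
/// Hypothetical error type; the extension reports these via pgrx::error! instead.
#[derive(Debug, PartialEq)]
enum QueryError {
    Empty,
    AllZeros,
    DimensionMismatch { got: usize, expected: usize },
}

/// Assumed implementation: true when every component equals zero.
fn is_zero_vector(v: &[f32]) -> bool {
    v.iter().all(|&x| x == 0.0)
}

/// Run the checks in the same order as the snippet above:
/// empty, then all zeros, then dimension mismatch.
fn validate_query(v: &[f32], expected_dims: usize) -> Result<(), QueryError> {
    if v.is_empty() {
        return Err(QueryError::Empty);
    }
    if is_zero_vector(v) {
        return Err(QueryError::AllZeros);
    }
    if v.len() != expected_dims {
        return Err(QueryError::DimensionMismatch {
            got: v.len(),
            expected: expected_dims,
        });
    }
    Ok(())
}
```

Because the zero check runs before the dimension check, an all-zero vector of the wrong length is reported as all zeros rather than as a mismatch.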

3. Track Query Validity State

Add query_valid: bool field to HnswScanState to track extraction success across methods.
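
A minimal sketch of how such a flag can gate the fallback chain. Only the field names `query_vector`, `query_valid`, and `dimensions` come from the snippets in this ADR; the field types, the `try_method` helper, and the rest of the struct layout are assumptions, since the real `HnswScanState` is not shown here.

```rust
/// Simplified stand-in for the scan state; the real struct holds more fields.
#[derive(Debug, Default)]
struct HnswScanState {
    query_vector: Vec<f32>,
    query_valid: bool,
    dimensions: usize,
}

impl HnswScanState {
    /// Hypothetical helper: run one extraction method unless an earlier
    /// method already succeeded. The closure is not called in that case.
    fn try_method<F>(&mut self, extract: F)
    where
        F: FnOnce() -> Option<Vec<f32>>,
    {
        if self.query_valid {
            return;
        }
        if let Some(v) = extract() {
            self.query_vector = v;
            self.query_valid = true;
        }
    }
}
```

Each extraction method is passed as a closure, so later and more expensive fallbacks are skipped once `query_valid` is set.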

Changes Made

| File | Changes |
|------|---------|
| crates/ruvector-postgres/src/index/hnsw_am.rs | Multi-method extraction, validation, zero-vector check |
| crates/ruvector-postgres/src/index/ivfflat_am.rs | Same fixes applied for consistency |

Key Functions Added/Modified

  • hnsw_rescan() - Complete rewrite of query extraction logic
  • try_convert_text_to_ruvector() - New function for text→ruvector conversion
  • is_zero_vector() - New validation helper
  • ivfflat_amrescan() - Parallel fix for IVFFlat index
  • ivfflat_try_convert_text_to_ruvector() - IVFFlat text conversion
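
The text conversion step can be pictured as a small parser for the `[x,y,z]` literal form used in the test queries. This is a sketch under assumptions: `parse_vector_text` is a hypothetical stand-in, and the real `try_convert_text_to_ruvector()` works on a PostgreSQL datum rather than a `&str` and may accept a different grammar.

```rust
/// Hypothetical parser for a bracketed, comma-separated vector literal.
/// Returns None for missing brackets, empty bodies, unparsable tokens,
/// and non-finite values such as NaN or infinity.
fn parse_vector_text(input: &str) -> Option<Vec<f32>> {
    let inner = input.trim().strip_prefix('[')?.strip_suffix(']')?;
    if inner.trim().is_empty() {
        return None;
    }
    inner
        .split(',')
        .map(|tok| tok.trim().parse::<f32>().ok().filter(|x| x.is_finite()))
        .collect() // collecting into Option<Vec<_>> fails on the first None
}
```

Returning `Option` mirrors the `if let Some(vec) = ...` pattern in the extraction snippet, so a failed parse simply leaves `query_valid` unset and lets the next method or the final error check take over.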

Consequences

Positive

  • Parameterized queries work: Prepared statements, ORMs, application drivers all function correctly
  • Graceful error handling: PostgreSQL ERROR instead of segfault
  • Clear error messages: Users understand what went wrong and how to fix it
  • Dimension validation: Catches mismatched query/index dimensions early
  • Zero-vector protection: Invalid queries rejected before search execution

Negative

  • Slight overhead: Additional validation on each query (negligible, ~1μs)
  • Text parsing: Manual vector parsing for text parameters (only when other methods fail)

Neutral

  • No API changes: Existing queries continue to work unchanged
  • IVFFlat also fixed: Consistent behavior across both index types

Test Plan

Unit Tests

-- 1. Literal query (baseline - should work)
SELECT * FROM test_hnsw ORDER BY embedding <=> '[0.1,0.2,0.3]'::ruvector(3) LIMIT 5;

-- 2. Prepared statement (was crashing, now works)
PREPARE search AS SELECT * FROM test_hnsw ORDER BY embedding <=> $1::ruvector(3) LIMIT 5;
EXECUTE search('[0.1,0.2,0.3]');

-- 3. Function with text parameter (was crashing, now works)
SELECT * FROM search_similar('[0.1,0.2,0.3]');

-- 4. Zero vector (was crashing, now errors gracefully)
SELECT * FROM test_hnsw ORDER BY embedding <=> '[0,0,0]'::ruvector(3) LIMIT 5;
-- ERROR: HNSW: Query vector is all zeros...

-- 5. Dimension mismatch (was undefined behavior, now errors)
SELECT * FROM test_hnsw ORDER BY embedding <=> '[0.1,0.2]'::ruvector(2) LIMIT 5;
-- ERROR: HNSW: Query vector has 2 dimensions but index expects 3

Integration Tests

  • Node.js pg driver with parameterized queries
  • Python psycopg with prepared statements
  • Rust sqlx with query parameters
  • Load test with 10k concurrent parameterized queries

Related Issues

  • Issue: #141 - HNSW Segmentation Fault with Parameterized Queries
  • Reporter: Mark Allen, NexaDental CTO
  • Priority: P0 (Critical) - Production blocker

Implementation Checklist

  • Fix hnsw_rescan() query extraction
  • Add try_convert_text_to_ruvector() helper
  • Add is_zero_vector() validation
  • Add query_valid field to scan state
  • Apply same fix to IVFFlat for consistency
  • Compile verification
  • Add regression tests
  • Update documentation
  • Build new Docker image
  • Test with production dataset (6,975 rows)
  • Release v2.0.1 patch

References