Neon Postgres Compatibility Guide

Overview

RuVector-Postgres is designed with first-class support for Neon's serverless PostgreSQL platform. This guide covers deployment, configuration, and optimization for Neon environments.

Neon Platform Overview

Neon is a serverless PostgreSQL platform with a distinctive architecture:

  • Separation of Storage and Compute: Compute nodes are stateless
  • Scale to Zero: Instances automatically suspend when idle
  • Instant Branching: Copy-on-write database branches
  • Dynamic Extension Loading: Custom extensions loaded on demand
  • Connection Pooling: Built-in pooling with PgBouncer

Compatibility Matrix

| Neon Feature       | RuVector Support | Notes                         |
|--------------------|------------------|-------------------------------|
| PostgreSQL 14      | ✓ Full           | Tested                        |
| PostgreSQL 15      | ✓ Full           | Tested                        |
| PostgreSQL 16      | ✓ Full           | Recommended                   |
| PostgreSQL 17      | ✓ Full           | Latest                        |
| PostgreSQL 18      | ✓ Full           | Beta support                  |
| Scale to Zero      | ✓ Full           | <100ms cold start             |
| Instant Branching  | ✓ Full           | Index state preserved         |
| Connection Pooling | ✓ Full           | Thread-safe, no session state |
| Read Replicas      | ✓ Full           | Consistent reads              |
| Autoscaling        | ✓ Full           | Dynamic memory handling       |
| Autosuspend        | ✓ Full           | Fast wake-up                  |

Design Considerations for Neon

1. Stateless Compute

Neon compute nodes are ephemeral and may be replaced at any time. RuVector-Postgres handles this by keeping no state outside PostgreSQL:

// No global mutable state that requires persistence
// All state lives in PostgreSQL's shared memory or storage

#[pg_guard]
pub fn _PG_init() {
    // Lightweight initialization - no disk I/O
    // SIMD feature detection cached in thread-local
    init_simd_dispatch();

    // Register GUCs (configuration variables)
    register_gucs();

    // No background workers (Neon restriction)
    // All maintenance is on-demand or during queries
}

Key Principles:

  • No file-based state: Everything in PostgreSQL shared buffers
  • No background workers: All work is query-driven
  • Fast initialization: Extension loads in <100ms
  • Memory-mapped indexes: Loaded from storage on demand

2. Fast Cold Start

A fast cold start is critical for scale-to-zero. RuVector-Postgres initializes in under 100ms:

┌─────────────────────────────────────────────────────────────────┐
│                    Cold Start Timeline                           │
├─────────────────────────────────────────────────────────────────┤
│  0ms   │ Extension .so loaded by PostgreSQL                     │
│  5ms   │ _PG_init() called                                      │
│  10ms  │ SIMD feature detection complete                        │
│  15ms  │ GUC registration complete                              │
│  20ms  │ Operator/function registration complete                │
│  25ms  │ Index access method registration complete              │
│  50ms  │ First query ready                                      │
│  75ms  │ Index mmap from storage (on first access)              │
│ 100ms  │ Full warm state achieved                               │
└─────────────────────────────────────────────────────────────────┘

Optimization Techniques:

  1. Lazy Index Loading: Indexes mmap'd from storage on first access
  2. No Precomputation: No tables built at startup
  3. Minimal Allocations: Stack-based init where possible
  4. Cached SIMD Detection: One-time CPU feature detection
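
The cached SIMD detection in item 4 can be sketched with the standard library alone. This is a minimal illustration, not RuVector's actual `init_simd_dispatch()`: `SimdLevel`, `detect_simd`, `simd_level`, and the `DETECT_CALLS` counter are hypothetical names, and a process-wide `OnceLock` stands in for the thread-local cache mentioned earlier (a PostgreSQL backend is a single-threaded process, so the two behave the same here).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Widest SIMD instruction set the running CPU supports (hypothetical enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SimdLevel {
    Scalar,
    Avx2,
}

/// Counts how often detection actually runs; present only to show it runs once.
pub static DETECT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Probe the CPU. On x86_64 this checks for AVX2 at runtime; on other
/// architectures it falls back to scalar code.
fn detect_simd() -> SimdLevel {
    DETECT_CALLS.fetch_add(1, Ordering::Relaxed);
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            return SimdLevel::Avx2;
        }
    }
    SimdLevel::Scalar
}

/// Process-wide cache for the detection result.
static SIMD_LEVEL: OnceLock<SimdLevel> = OnceLock::new();

/// Return the cached SIMD level, detecting it on the first call only.
pub fn simd_level() -> SimdLevel {
    *SIMD_LEVEL.get_or_init(detect_simd)
}
```

Every distance function can call `simd_level()` freely: after the first call it is a single atomic load, so the cost of detection is paid once per backend process.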

Comparison with pgvector:

| Metric              | RuVector | pgvector |
|---------------------|----------|----------|
| Cold start time     | 50ms     | 120ms    |
| Memory at init      | 2 MB     | 8 MB     |
| First query latency | +10ms    | +50ms    |

3. Memory Efficiency

Neon compute instances have memory limits based on compute units (CU). RuVector-Postgres is memory-conscious:

-- Check memory usage
SELECT * FROM ruvector_memory_stats();

┌──────────────────────────────────────────────────────────────┐
│                  Memory Statistics                           │
├──────────────────────────────────────────────────────────────┤
│ index_memory_mb        │ 256                                 │
│ vector_cache_mb        │ 64                                  │
│ quantization_tables_mb │ 8                                   │
│ total_extension_mb     │ 328                                 │
└──────────────────────────────────────────────────────────────┘

Memory Optimization Strategies:

-- Limit index memory (for smaller Neon instances)
SET ruvector.max_index_memory = '256MB';

-- Use quantization to reduce memory footprint
CREATE INDEX ON items USING ruhnsw (embedding ruvector_l2_ops)
WITH (quantization = 'sq8');  -- 4x memory reduction

-- Use half-precision vectors
CREATE TABLE items (embedding halfvec(1536));  -- 50% memory savings
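
The reduction factors quoted in the comments above follow from the number of bytes stored per vector component. The sketch below works through that arithmetic. `Component` and `raw_vector_bytes` are illustrative names rather than part of RuVector, `sq8` is assumed to mean one byte per component, and the estimate covers raw vector data only (HNSW graph links, codebooks, and page overhead are excluded).

```rust
/// Storage format of one vector component (illustrative, not a RuVector type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Component {
    /// 32-bit float, the full-precision baseline.
    F32,
    /// 16-bit float, as in `halfvec`.
    F16,
    /// 8-bit scalar quantization, assumed for `quantization = 'sq8'`.
    Sq8,
}

impl Component {
    /// Bytes used by a single component in this format.
    pub const fn bytes(self) -> u64 {
        match self {
            Component::F32 => 4,
            Component::F16 => 2,
            Component::Sq8 => 1,
        }
    }
}

/// Bytes needed for the raw components of `count` vectors of `dims` dimensions.
pub fn raw_vector_bytes(count: u64, dims: u64, component: Component) -> u64 {
    count * dims * component.bytes()
}
```

For 1,000,000 vectors of 1536 dimensions this gives 6,144,000,000 bytes at full precision, 3,072,000,000 as `halfvec`, and 1,536,000,000 with one byte per component, which is where the 50% and 4x figures come from.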

Memory by Compute Unit:

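| Neon CU | RAM    | Recommended Index Size | Quantization      |
|---------|--------|------------------------|-------------------|
| 0.25    | 1 GB   | <128 MB                | Required (sq8/pq) |
| 0.5     | 2 GB   | <512 MB                | Recommended (sq8) |
| 1.0     | 4 GB   | <2 GB                  | Optional          |
| 2.0     | 8 GB   | <4 GB                  | Optional          |
| 4.0+    | 16+ GB | <8 GB                  | None              |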

4. No Background Workers

Neon restricts background workers for resource management. RuVector-Postgres is designed without them:

// ❌ NOT USED: Background workers
// BackgroundWorker::register("ruvector_maintenance", ...);

// ✓ USED: On-demand operations
// - Index vacuum during INSERT/UPDATE
// - Statistics during ANALYZE
// - Maintenance via explicit SQL functions

Alternative Maintenance Patterns:

-- Explicit index maintenance (replaces background vacuum)
SELECT ruvector_index_maintenance('items_embedding_idx');

-- Scheduled via pg_cron (if available)
SELECT cron.schedule('vacuum-index', '0 2 * * *',
    $$SELECT ruvector_index_maintenance('items_embedding_idx')$$);

-- Manual statistics update
ANALYZE items;

5. Connection Pooling Considerations

Neon uses PgBouncer in transaction mode for connection pooling. RuVector-Postgres is fully compatible:

Compatible Features:

  • ✓ No session-level state
  • ✓ No temp tables or cursors
  • ✓ All settings via GUCs (can be set per-transaction)
  • ✓ Thread-safe distance calculations

Usage Pattern:

-- Each transaction is independent
BEGIN;
SET LOCAL ruvector.ef_search = 100;  -- Transaction-local setting
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;

-- Next transaction (potentially different connection)
BEGIN;
SET LOCAL ruvector.ef_search = 200;  -- Different setting
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;
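
On the client side, the pattern above amounts to always sending the setting and the query inside the same transaction. The helper below is a sketch that only builds the statement list and does not talk to a database. `scoped_search_statements` is a hypothetical name. Typing `ef_search` as an integer keeps the interpolated value safe, while the query text is passed through unchanged and is assumed to come from trusted code.

```rust
/// Build the statements for one search that is safe behind a transaction-mode pooler.
pub fn scoped_search_statements(ef_search: u32, query_sql: &str) -> Vec<String> {
    vec![
        "BEGIN".to_string(),
        // SET LOCAL reverts at COMMIT or ROLLBACK, so the value cannot leak to
        // whichever client the pooler hands this server connection to next.
        format!("SET LOCAL ruvector.ef_search = {ef_search}"),
        query_sql.to_string(),
        "COMMIT".to_string(),
    ]
}
```

A caller would execute the four statements in order on a single connection checked out from its driver of choice.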

6. Index Persistence

How Indexes Are Stored:

  • HNSW/IVFFlat indexes stored in PostgreSQL pages
  • Automatically replicated to Neon storage layer
  • Preserved across compute restarts
  • Shared across branches (copy-on-write)

Index Build on Neon:

-- Non-blocking index build (recommended on Neon)
CREATE INDEX CONCURRENTLY items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 32, ef_construction = 200);

-- Monitor progress
SELECT
    phase,
    blocks_total,
    blocks_done,
    tuples_total,
    tuples_done
FROM pg_stat_progress_create_index;
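
The columns returned by `pg_stat_progress_create_index` are raw counters. A sketch of turning one done/total pair into a percentage is shown below. `progress_percent` is an illustrative helper, and it returns `None` while the total is still zero, which can be the case before a phase has sized its work.

```rust
/// Convert a done/total counter pair into a percentage in the range 0.0..=100.0.
/// Returns `None` when the total is zero and no meaningful ratio exists yet.
pub fn progress_percent(done: u64, total: u64) -> Option<f64> {
    if total == 0 {
        return None;
    }
    // Clamp in case `done` briefly exceeds an estimated `total`.
    Some(done.min(total) as f64 / total as f64 * 100.0)
}
```

The same function applies to both the `blocks_done`/`blocks_total` and the `tuples_done`/`tuples_total` pairs.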

Neon-Specific Limitations

1. Extension Installation (Scale Plan Required)

Free Plan:

  • Pre-approved extensions only (pgvector is included)
  • RuVector requires custom extension approval

Scale Plan:

  • Custom extensions allowed
  • Contact support for installation

Enterprise Plan:

  • Dedicated support for custom extensions
  • Faster approval process

2. Compute Suspension

Behavior:

  • Compute suspends after 5 minutes of inactivity (configurable)
  • First query after suspension: +100-200ms latency
  • Indexes loaded from storage on first access

Mitigation:

-- Keep-alive query (via cron or application)
SELECT 1;

-- Or use Neon's suspend_timeout setting
-- In Neon console: Project Settings → Compute → Autosuspend delay
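
If a keep-alive query is used, its interval has to stay below the autosuspend delay. The sketch below shows that calculation with a hypothetical `keepalive_interval` helper and a caller-chosen safety margin. It is an illustration, not a Neon API.

```rust
use std::time::Duration;

/// Pick a keep-alive interval that fires `margin` before the compute would
/// suspend. Returns `None` when the margin leaves no usable interval.
pub fn keepalive_interval(autosuspend_delay: Duration, margin: Duration) -> Option<Duration> {
    autosuspend_delay
        .checked_sub(margin)
        .filter(|interval| !interval.is_zero())
}
```

With the default 5-minute delay and a 30-second margin the interval is 270 seconds. A keep-alive keeps the compute running, so it trades the cost savings of scale-to-zero for lower first-query latency.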

3. Memory Constraints

Observation:

  • Neon may limit memory below advertised CU limits
  • Large index builds may fail with OOM

Solutions:

-- Build index with lower memory
SET maintenance_work_mem = '256MB';
CREATE INDEX CONCURRENTLY ...;

-- Use quantization for large datasets
CREATE INDEX CONCURRENTLY ... WITH (quantization = 'pq16');  -- 16x memory reduction

4. Extension Update Process

Current Process:

  1. Open support ticket with Neon
  2. Provide new .so and SQL files
  3. Neon reviews and deploys
  4. Extension available for ALTER EXTENSION UPDATE

Future: Self-service extension updates (roadmap item)

Requesting RuVector on Neon

For Scale Plan Customers

Step 1: Open Support Ticket

Navigate to: Neon Console → Support

Ticket Template:

Subject: Custom Extension Request - RuVector-Postgres

Body:
I would like to install the RuVector-Postgres extension for vector similarity search.

Details:
- Extension: ruvector-postgres
- Version: 0.1.19
- PostgreSQL version: 16 (or your version)
- Project ID: [your-project-id]

Use case:
[Describe your vector search use case]

Repository: https://github.com/ruvnet/ruvector
Documentation: https://github.com/ruvnet/ruvector/tree/main/crates/ruvector-postgres

I can provide pre-built binaries if needed.

Step 2: Provide Extension Artifacts

Neon will request:

  1. Shared Library (.so file):

    # Build for PostgreSQL 16
    cargo pgrx package --pg-config /path/to/pg_config
    # Artifact: target/release/ruvector-pg16/usr/lib/postgresql/16/lib/ruvector.so
    
  2. Control File (ruvector.control):

    comment = 'High-performance vector similarity search'
    default_version = '0.1.19'
    module_pathname = '$libdir/ruvector'
    relocatable = true
    
  3. SQL Scripts:

    • ruvector--0.1.0.sql (initial schema)
    • ruvector--0.1.0--0.1.19.sql (migration script)
    
  4. Security Documentation:

    • Memory safety audit
    • No unsafe FFI calls
    • No network access
    • Resource limits
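
The script names in item 3 follow PostgreSQL's extension naming convention: `extension--version.sql` for an install script and `extension--old--new.sql` for an update script. The helpers below sketch that convention. The function names are illustrative.

```rust
/// File name of the script that installs `version` of `extension` directly.
pub fn install_script(extension: &str, version: &str) -> String {
    format!("{extension}--{version}.sql")
}

/// File name of the script that updates `extension` from `from` to `to`.
pub fn update_script(extension: &str, from: &str, to: &str) -> String {
    format!("{extension}--{from}--{to}.sql")
}
```

Generating the names from version strings helps keep the packaged files in step with `default_version` in the control file.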

Step 3: Security Review

Neon engineers will review:

  • ✓ Rust memory safety guarantees
  • ✓ No unsafe system calls
  • ✓ Sandboxed execution
  • ✓ Resource limits (memory, CPU)
  • ✓ No file system access beyond PostgreSQL

Timeline: 1-2 weeks for approval.

Step 4: Deployment

Once approved:

-- Extension becomes available
CREATE EXTENSION ruvector;

-- Verify
SELECT ruvector_version();

For Free Plan Users

Option 1: Request via Discord

  1. Join Neon Discord
  2. Post in #feedback channel
  3. Include use case and expected usage

Option 2: Use pgvector (Pre-installed)

-- pgvector is available on all plans
CREATE EXTENSION vector;

-- RuVector provides migration path
-- (See MIGRATION.md)

Migration from pgvector

RuVector-Postgres is API-compatible with pgvector. Migration is seamless:

Step 1: Create Parallel Tables

-- Keep existing pgvector table (for rollback)
-- ALTER TABLE items RENAME TO items_pgvector;

-- Create new table with ruvector
CREATE TABLE items_ruvector (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding ruvector(1536)
);

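-- Copy data (automatic type conversion)
INSERT INTO items_ruvector (id, content, embedding)
SELECT id, content, embedding::ruvector FROM items;

-- Advance the SERIAL sequence past the copied ids so later inserts do not collide
SELECT setval(pg_get_serial_sequence('items_ruvector', 'id'),
              (SELECT max(id) FROM items_ruvector));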

Step 2: Rebuild Indexes

-- Drop old pgvector index (if exists)
-- DROP INDEX items_embedding_idx;

-- Create optimized HNSW index
CREATE INDEX items_embedding_ruhnsw_idx ON items_ruvector
USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 32, ef_construction = 200);

-- Analyze for query planner
ANALYZE items_ruvector;

Step 3: Validate Results

-- Compare search results
WITH pgvector_results AS (
    SELECT id, embedding <-> '[...]'::vector AS dist
    FROM items ORDER BY dist LIMIT 10
),
ruvector_results AS (
    SELECT id, embedding <-> '[...]'::ruvector AS dist
    FROM items_ruvector ORDER BY dist LIMIT 10
)
SELECT
    p.id AS pg_id,
    r.id AS ru_id,
    p.id = r.id AS id_match,
    abs(p.dist - r.dist) < 0.0001 AS dist_match
FROM pgvector_results p
FULL OUTER JOIN ruvector_results r ON p.id = r.id;

-- Expect id_match = true and dist_match = true on every row.
-- A NULL pg_id or ru_id marks an id that appears in only one of the two top-10 lists.
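
Because HNSW search is approximate, the two top-10 lists can differ slightly even when both indexes are healthy. A common way to quantify the agreement is recall@k: the fraction of reference ids that also appear in the candidate list. The sketch below assumes the application has already fetched both id lists and that ids are unique within each list. `recall_at_k` is an illustrative name.

```rust
use std::collections::HashSet;

/// Fraction of `reference` ids that also occur in `candidate`.
/// Returns `None` for an empty reference list.
pub fn recall_at_k(reference: &[i64], candidate: &[i64]) -> Option<f64> {
    if reference.is_empty() {
        return None;
    }
    let candidate_ids: HashSet<i64> = candidate.iter().copied().collect();
    let hits = reference
        .iter()
        .filter(|id| candidate_ids.contains(*id))
        .count();
    Some(hits as f64 / reference.len() as f64)
}
```

Treating the pgvector ids as the reference and the RuVector ids as the candidate, a value of 1.0 means the two lists contain the same ids. Averaging over many sample queries gives a steadier picture than a single comparison.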

Step 4: Switch Over

-- Atomic swap
BEGIN;
ALTER TABLE items RENAME TO items_old;
ALTER TABLE items_ruvector RENAME TO items;
COMMIT;

-- Validate application queries
-- ... run tests ...

-- Drop old table after validation period (e.g., 1 week)
DROP TABLE items_old;

Performance Tuning for Neon

Instance Size Recommendations

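| Neon CU | RAM   | Max Vectors | Recommended Settings         |
|---------|-------|-------------|------------------------------|
| 0.25    | 1 GB  | 100K        | m=8, ef=64, sq8 quant        |
| 0.5     | 2 GB  | 500K        | m=16, ef=100, sq8 quant      |
| 1.0     | 4 GB  | 2M          | m=24, ef=150, optional quant |
| 2.0     | 8 GB  | 5M          | m=32, ef=200, no quant       |
| 4.0     | 16 GB | 10M+        | m=48, ef=300, no quant       |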
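
The table can be read as a lookup from dataset size to the smallest suitable compute size. The sketch below encodes the rows as data. `Tier`, `TIERS`, and `recommend_tier` are illustrative names, the `ef` field is copied from the table as-is, and datasets larger than the last row are assumed to map to the 4.0 CU row.

```rust
/// One row of the sizing table (illustrative type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Tier {
    pub cu: f32,
    pub max_vectors: u64,
    pub m: u32,
    pub ef: u32,
    pub quantization: &'static str,
}

/// Rows of the table, ordered from smallest to largest compute size.
pub const TIERS: [Tier; 5] = [
    Tier { cu: 0.25, max_vectors: 100_000, m: 8, ef: 64, quantization: "sq8" },
    Tier { cu: 0.5, max_vectors: 500_000, m: 16, ef: 100, quantization: "sq8" },
    Tier { cu: 1.0, max_vectors: 2_000_000, m: 24, ef: 150, quantization: "optional" },
    Tier { cu: 2.0, max_vectors: 5_000_000, m: 32, ef: 200, quantization: "none" },
    Tier { cu: 4.0, max_vectors: 10_000_000, m: 48, ef: 300, quantization: "none" },
];

/// Smallest tier whose vector limit covers `vector_count`.
/// Counts beyond the last row fall back to the largest tier.
pub fn recommend_tier(vector_count: u64) -> Tier {
    TIERS
        .iter()
        .copied()
        .find(|tier| vector_count <= tier.max_vectors)
        .unwrap_or(TIERS[TIERS.len() - 1])
}
```

Keeping the rows in a constant makes it straightforward to adjust the thresholds after measuring real workloads.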

Query Optimization

-- High recall (use for important queries)
SET ruvector.ef_search = 200;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;

-- Low latency (use for real-time queries)
SET ruvector.ef_search = 40;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;

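-- Per-query tuning (SET LOCAL only takes effect inside a transaction block)
BEGIN;
SET LOCAL ruvector.ef_search = 100;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;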

Index Build Settings

-- For small Neon instances
SET maintenance_work_mem = '512MB';
SET max_parallel_maintenance_workers = 2;

-- For large Neon instances
SET maintenance_work_mem = '4GB';
SET max_parallel_maintenance_workers = 8;

-- Always use CONCURRENTLY on Neon
CREATE INDEX CONCURRENTLY ...;

Neon Branching with RuVector

How Branching Works

Neon branches use copy-on-write, so indexes are instantly available:

Parent Branch                Child Branch
┌─────────────┐             ┌─────────────┐
│ items       │             │ items       │ (copy-on-write)
│ ├─ data     │──shared────→│ ├─ data     │
│ └─ index    │──shared────→│ └─ index    │
└─────────────┘             └─────────────┘
                                   ↓
                              Modify data
                                   ↓
                            ┌─────────────┐
                            │ items       │
                            │ ├─ data     │ (diverged)
                            │ └─ index    │ (needs rebuild)
                            └─────────────┘

Branch Creation Workflow

-- In parent branch: Create index
CREATE INDEX items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops);

-- Create child branch via Neon Console or API
-- Index is instantly available (no rebuild needed)

-- In child branch: Index is read-only until data changes
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
-- Uses parent's index ✓

-- After INSERT/UPDATE in child:
-- Index diverges and needs rebuild
INSERT INTO items VALUES (...);
REINDEX INDEX items_embedding_idx;  -- or CREATE INDEX CONCURRENTLY

Branch-Specific Tuning

-- Development branch: Faster queries, lower recall
ALTER DATABASE dev_branch SET ruvector.ef_search = 20;

-- Staging branch: Balanced
ALTER DATABASE staging SET ruvector.ef_search = 100;

-- Production branch: High recall
ALTER DATABASE prod SET ruvector.ef_search = 200;

Monitoring on Neon

Extension Metrics

-- Index statistics
SELECT * FROM ruvector_index_stats();

┌────────────────────────────────────────────────────────────────┐
│                    Index Statistics                            │
├────────────────────────────────────────────────────────────────┤
│ index_name              │ items_embedding_idx                  │
│ index_size_mb           │ 512                                  │
│ vector_count            │ 1000000                              │
│ dimensions              │ 1536                                 │
│ build_time_seconds      │ 45.2                                 │
│ fragmentation_pct       │ 2.3                                  │
└────────────────────────────────────────────────────────────────┘

Query Performance

-- Explain analyze for vector queries
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM items
ORDER BY embedding <-> '[0.1, 0.2, ...]'::ruvector
LIMIT 10;

-- Output includes:
-- - Index Scan using items_embedding_idx
-- - Distance calculations: 15000
-- - Buffers: shared hit=250, read=10
-- - Execution time: 12.5ms

Neon Metrics Integration

Use Neon's monitoring dashboard:

  1. Query Time: Track vector query latencies
  2. Buffer Hit Ratio: Monitor index cache efficiency
  3. Compute Usage: Track CPU during index builds
  4. Memory Usage: Monitor vector memory consumption
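
The buffer hit ratio in item 2 can also be derived from the `Buffers:` line of `EXPLAIN (ANALYZE, BUFFERS)`: shared hits divided by hits plus reads. A minimal sketch with an illustrative `buffer_hit_ratio` helper:

```rust
/// Share of buffer accesses served from shared buffers, in the range 0.0..=1.0.
/// Returns `None` when no buffers were touched at all.
pub fn buffer_hit_ratio(shared_hit: u64, shared_read: u64) -> Option<f64> {
    let total = shared_hit + shared_read;
    if total == 0 {
        None
    } else {
        Some(shared_hit as f64 / total as f64)
    }
}
```

For a plan reporting `shared hit=250, read=10` the ratio is 250 / 260, roughly 0.96. A ratio that stays low on repeated runs of the same query suggests the index does not fit in cache.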

Troubleshooting

Cold Start Slow

Symptom: First query after suspend takes >500ms

Diagnosis:

-- Confirm the extension is installed and check its version
SELECT extname, extversion FROM pg_extension WHERE extname = 'ruvector';

-- Check SIMD detection
SELECT ruvector_simd_info();

Solution:

  • Expected: 100-200ms for first query
  • If >500ms: Contact Neon support (compute issue)
  • Use keep-alive queries to prevent suspension

Memory Pressure

Symptom: Index build fails with OOM

Diagnosis:

-- Check current memory usage
SELECT * FROM ruvector_memory_stats();

-- Check the shared_buffers setting on this compute
SELECT current_setting('shared_buffers');

Solution:

-- Reduce index memory
SET ruvector.max_index_memory = '128MB';

-- Use aggressive quantization
CREATE INDEX ... WITH (quantization = 'pq16');

-- Upgrade Neon compute unit
-- Neon Console → Project Settings → Compute → Scale up

Index Build Timeout

Symptom: CREATE INDEX times out on large dataset

Solution:

-- Always use CONCURRENTLY
CREATE INDEX CONCURRENTLY items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops);

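-- Split into batches by id range (a bare LIMIT does not give disjoint batches)
CREATE TABLE items_batch_1 AS SELECT * FROM items WHERE id BETWEEN 1 AND 100000;
CREATE INDEX ... ON items_batch_1;
-- Repeat for the remaining id ranges, then query the batches with UNION ALL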
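
When splitting a table into batches, key ranges keep the batches disjoint and repeatable, which a bare `LIMIT` does not guarantee. The sketch below computes inclusive id ranges. It assumes an integer `id` column such as the one used in the migration example, and `batch_ranges` is an illustrative name.

```rust
/// Split the inclusive id range `min_id..=max_id` into consecutive inclusive
/// ranges of at most `batch_size` ids each.
pub fn batch_ranges(min_id: i64, max_id: i64, batch_size: i64) -> Vec<(i64, i64)> {
    let mut ranges = Vec::new();
    if batch_size <= 0 || min_id > max_id {
        return ranges;
    }
    let mut start = min_id;
    while start <= max_id {
        // saturating_add guards against overflow near i64::MAX.
        let end = start.saturating_add(batch_size - 1).min(max_id);
        ranges.push((start, end));
        if end == max_id {
            break;
        }
        start = end + 1;
    }
    ranges
}
```

Each pair maps to a predicate such as `WHERE id BETWEEN 1 AND 100000`. Gaps in the id sequence only make some batches smaller, never overlapping.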

Connection Pool Compatibility

Symptom: Settings not persisting across queries

Cause: PgBouncer transaction mode resets session state

Solution:

-- Use SET LOCAL (transaction-scoped)
BEGIN;
SET LOCAL ruvector.ef_search = 100;
SELECT ... ORDER BY embedding <-> query;
COMMIT;

-- Or set a per-database default (picked up by new sessions)
ALTER DATABASE mydb SET ruvector.ef_search = 100;

Support Resources