ruv d803bfe2b1 Squashed 'vendor/ruvector/' content from commit b64c2172

git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900

2026-02-28 14:39:40 -05:00

Neon Postgres Compatibility Guide

Overview

RuVector-Postgres is designed with first-class support for Neon's serverless PostgreSQL platform. This guide covers deployment, configuration, and optimization for Neon environments.

Neon Platform Overview

Neon is a serverless PostgreSQL platform with a distinctive architecture:

Separation of Storage and Compute: Compute nodes are stateless
Scale to Zero: Instances automatically suspend when idle
Instant Branching: Copy-on-write database branches
Dynamic Extension Loading: Custom extensions loaded on demand
Connection Pooling: Built-in pooling with PgBouncer

Compatibility Matrix

Neon Feature	RuVector Support	Notes
PostgreSQL 14	✓ Full	Tested
PostgreSQL 15	✓ Full	Tested
PostgreSQL 16	✓ Full	Recommended
PostgreSQL 17	✓ Full	Latest
PostgreSQL 18	✓ Full	Beta support
Scale to Zero	✓ Full	<100ms extension init (compute wake-up is additional)
Instant Branching	✓ Full	Index state preserved
Connection Pooling	✓ Full	Thread-safe, no session state
Read Replicas	✓ Full	Consistent reads
Autoscaling	✓ Full	Dynamic memory handling
Autosuspend	✓ Full	Fast wake-up

Design Considerations for Neon

1. Stateless Compute

Neon compute nodes are ephemeral and may be replaced at any time. RuVector-Postgres handles this by:

// No global mutable state that requires persistence
// All state lives in PostgreSQL's shared memory or storage

#[pg_guard]
pub extern "C" fn _PG_init() {  // C ABI entry point; recent pgrx releases expect extern "C-unwind"
    // Lightweight initialization - no disk I/O
    // SIMD feature detection cached in thread-local
    init_simd_dispatch();

    // Register GUCs (configuration variables)
    register_gucs();

    // No background workers (Neon restriction)
    // All maintenance is on-demand or during queries
}

Key Principles:

No file-based state: Everything in PostgreSQL shared buffers
No background workers: All work is query-driven
Fast initialization: Extension loads in <100ms
Memory-mapped indexes: Loaded from storage on demand

2. Fast Cold Start

Critical for scale-to-zero. RuVector-Postgres achieves sub-100ms initialization:

┌─────────────────────────────────────────────────────────────────┐
│                    Cold Start Timeline                           │
├─────────────────────────────────────────────────────────────────┤
│  0ms   │ Extension .so loaded by PostgreSQL                     │
│  5ms   │ _PG_init() called                                      │
│  10ms  │ SIMD feature detection complete                        │
│  15ms  │ GUC registration complete                              │
│  20ms  │ Operator/function registration complete                │
│  25ms  │ Index access method registration complete              │
│  50ms  │ First query ready                                      │
│  75ms  │ Index mmap from storage (on first access)              │
│ 100ms  │ Full warm state achieved                               │
└─────────────────────────────────────────────────────────────────┘

Optimization Techniques:

Lazy Index Loading: Indexes mmap'd from storage on first access
No Precomputation: No tables built at startup
Minimal Allocations: Stack-based init where possible
Cached SIMD Detection: One-time CPU feature detection

Comparison with pgvector:

Metric	RuVector	pgvector
Cold start time	50ms	120ms
Memory at init	2 MB	8 MB
First query latency	+10ms	+50ms

3. Memory Efficiency

Neon compute instances have memory limits based on compute units (CU). RuVector-Postgres is memory-conscious:

-- Check memory usage
SELECT * FROM ruvector_memory_stats();

┌──────────────────────────────────────────────────────────────┐
│                  Memory Statistics                            │
├──────────────────────────────────────────────────────────────┤
│ index_memory_mb        │ 256                                 │
│ vector_cache_mb        │ 64                                  │
│ quantization_tables_mb │ 8                                   │
│ total_extension_mb     │ 328                                 │
└──────────────────────────────────────────────────────────────┘

Memory Optimization Strategies:

-- Limit index memory (for smaller Neon instances)
SET ruvector.max_index_memory = '256MB';

-- Use quantization to reduce memory footprint
CREATE INDEX ON items USING ruhnsw (embedding ruvector_l2_ops)
WITH (quantization = 'sq8');  -- 4x memory reduction

-- Use half-precision vectors
CREATE TABLE items (embedding halfvec(1536));  -- 50% memory savings

Memory by Compute Unit:

Neon CU	RAM	Recommended Index Size	Quantization
0.25	1 GB	<128 MB	Required (sq8/pq)
0.5	2 GB	<512 MB	Recommended (sq8)
1.0	4 GB	<2 GB	Optional
2.0	8 GB	<4 GB	Optional
4.0+	16+ GB	<8 GB	None
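
To see where the 4x (sq8) and 50% (halfvec) figures come from, the raw vector payload can be estimated with simple arithmetic. The sketch below is a back-of-envelope calculation only: the Precision enum and function names are hypothetical, and it ignores graph links, page headers, and any other overhead of RuVector's real on-disk layout.

```rust
/// Storage format of one vector component (hypothetical names).
#[derive(Debug, Clone, Copy)]
enum Precision {
    F32, // full-precision float
    F16, // half-precision (halfvec)
    Sq8, // 8-bit scalar quantization
}

impl Precision {
    fn bytes_per_component(self) -> u64 {
        match self {
            Precision::F32 => 4,
            Precision::F16 => 2,
            Precision::Sq8 => 1,
        }
    }
}

/// Raw payload size: vectors x dimensions x bytes per component.
fn raw_vector_bytes(count: u64, dims: u64, precision: Precision) -> u64 {
    count * dims * precision.bytes_per_component()
}

fn to_mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}

fn main() {
    // One million 1536-dimensional vectors, as in the examples above.
    for p in [Precision::F32, Precision::F16, Precision::Sq8] {
        let bytes = raw_vector_bytes(1_000_000, 1536, p);
        println!("{:?}: {:.0} MiB", p, to_mib(bytes));
    }
}
```

Comparing the result against the "Recommended Index Size" column gives a quick first check of whether a dataset needs quantization on a given compute size.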

4. No Background Workers

Neon restricts background workers for resource management. RuVector-Postgres is designed without them:

// ❌ NOT USED: Background workers
// BackgroundWorker::register("ruvector_maintenance", ...);

// ✓ USED: On-demand operations
// - Index vacuum during INSERT/UPDATE
// - Statistics during ANALYZE
// - Maintenance via explicit SQL functions

Alternative Maintenance Patterns:

-- Explicit index maintenance (replaces background vacuum)
SELECT ruvector_index_maintenance('items_embedding_idx');

-- Scheduled via pg_cron (if available)
SELECT cron.schedule('vacuum-index', '0 2 * * *',
    $$SELECT ruvector_index_maintenance('items_embedding_idx')$$);

-- Manual statistics update
ANALYZE items;

5. Connection Pooling Considerations

Neon uses PgBouncer in transaction mode for connection pooling. RuVector-Postgres is fully compatible:

Compatible Features:

✓ No session-level state
✓ No temp tables or cursors
✓ All settings via GUCs (can be set per-transaction)
✓ Thread-safe distance calculations

Usage Pattern:

-- Each transaction is independent
BEGIN;
SET LOCAL ruvector.ef_search = 100;  -- Transaction-local setting
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;

-- Next transaction (potentially different connection)
BEGIN;
SET LOCAL ruvector.ef_search = 200;  -- Different setting
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;
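
On the application side, the same pattern can be wrapped in a small helper so that every search carries its own transaction-scoped setting. This is a sketch under assumptions: scoped_search_sql is a hypothetical helper, the table and column names are taken from the examples above, and the statements would be sent through whatever PostgreSQL client the application already uses. SET does not accept bind parameters, so the value is formatted into the statement; typing it as u32 keeps arbitrary text out of the SQL.

```rust
/// Builds the statements for one search that is safe behind a
/// transaction-mode pooler: the setting lives and dies with the transaction.
fn scoped_search_sql(ef_search: u32, limit: u32) -> Vec<String> {
    vec![
        "BEGIN".to_string(),
        // SET cannot take a bind parameter; a u32 cannot carry injected SQL.
        format!("SET LOCAL ruvector.ef_search = {}", ef_search),
        // $1 is the query vector, bound by the client driver.
        format!(
            "SELECT * FROM items ORDER BY embedding <-> $1 LIMIT {}",
            limit
        ),
        "COMMIT".to_string(),
    ]
}

fn main() {
    for statement in scoped_search_sql(100, 10) {
        println!("{};", statement);
    }
}
```

If a parameterized form is preferred, PostgreSQL's set_config(name, value, is_local) function with is_local = true has the same transaction-local effect.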

6. Index Persistence

How Indexes Are Stored:

HNSW/IVFFlat indexes stored in PostgreSQL pages
Automatically replicated to Neon storage layer
Preserved across compute restarts
Shared across branches (copy-on-write)

Index Build on Neon:

-- Non-blocking index build (recommended on Neon)
CREATE INDEX CONCURRENTLY items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 32, ef_construction = 200);

-- Monitor progress
SELECT
    phase,
    blocks_total,
    blocks_done,
    tuples_total,
    tuples_done
FROM pg_stat_progress_create_index;
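
The counters returned by the progress view can be turned into a percentage on the client. A minimal sketch, assuming the application has already read blocks_done and blocks_total (both bigint) from the query above; progress_pct is a hypothetical helper, and it returns None while the current phase reports no total.

```rust
/// Percentage complete for one pair of progress counters, or None when
/// the current build phase does not report a total yet.
fn progress_pct(done: i64, total: i64) -> Option<f64> {
    if total <= 0 {
        return None;
    }
    Some(100.0 * done as f64 / total as f64)
}

fn main() {
    match progress_pct(250, 1000) {
        Some(pct) => println!("index build: {:.1}% of blocks scanned", pct),
        None => println!("index build: progress not reported for this phase"),
    }
}
```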

Neon-Specific Limitations

1. Extension Installation (Scale Plan Required)

Free Plan:

Pre-approved extensions only (pgvector is included)
RuVector requires custom extension approval

Scale Plan:

Custom extensions allowed
Contact support for installation

Enterprise Plan:

Dedicated support for custom extensions
Faster approval process

2. Compute Suspension

Behavior:

Compute suspends after 5 minutes of inactivity (configurable)
First query after suspension: +100-200ms latency
Indexes loaded from storage on first access

Mitigation:

-- Keep-alive query (via cron or application); note that this keeps the compute active and billed
SELECT 1;

-- Or use Neon's suspend_timeout setting
-- In Neon console: Project Settings → Compute → Autosuspend delay

3. Memory Constraints

Observation:

Neon may limit memory below advertised CU limits
Large index builds may fail with OOM

Solutions:

-- Build index with lower memory
SET maintenance_work_mem = '256MB';
CREATE INDEX CONCURRENTLY ...;

-- Use quantization for large datasets
CREATE INDEX ... WITH (quantization = 'pq16');  -- 16x memory reduction

4. Extension Update Process

Current Process:

Open support ticket with Neon
Provide new .so and SQL files
Neon reviews and deploys
Extension available for ALTER EXTENSION UPDATE

Future: Self-service extension updates (roadmap item)

Requesting RuVector on Neon

For Scale Plan Customers

Step 1: Open Support Ticket

Navigate to: Neon Console → Support

Ticket Template:

Subject: Custom Extension Request - RuVector-Postgres

Body:
I would like to install the RuVector-Postgres extension for vector similarity search.

Details:
- Extension: ruvector-postgres
- Version: 0.1.19
- PostgreSQL version: 16 (or your version)
- Project ID: [your-project-id]

Use case:
[Describe your vector search use case]

Repository: https://github.com/ruvnet/ruvector
Documentation: https://github.com/ruvnet/ruvector/tree/main/crates/ruvector-postgres

I can provide pre-built binaries if needed.

Step 2: Provide Extension Artifacts

Neon will request:

Shared Library (.so file):

# Build for PostgreSQL 16
cargo pgrx package --pg-config /path/to/pg_config
# Artifact: target/release/ruvector-pg16/usr/lib/postgresql/16/lib/ruvector.so

Control File (ruvector.control):

comment = 'High-performance vector similarity search'
default_version = '0.1.19'
module_pathname = '$libdir/ruvector'
relocatable = true

SQL Scripts:

- ruvector--0.1.0.sql (initial schema)
- ruvector--0.1.0--0.1.19.sql (migration script)

Security Documentation:

- Memory safety audit
- No unsafe FFI calls
- No network access
- Resource limits
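
The script file names follow PostgreSQL's extension naming convention: extension--version.sql for an install script and extension--old--new.sql for an update script. The helpers below are a hypothetical sketch for generating those names when packaging a release; they are not part of RuVector or pgrx.

```rust
/// File name of the script that installs `version` from scratch.
fn install_script(extension: &str, version: &str) -> String {
    format!("{}--{}.sql", extension, version)
}

/// File name of the script that upgrades `from` to `to`.
fn update_script(extension: &str, from: &str, to: &str) -> String {
    format!("{}--{}--{}.sql", extension, from, to)
}

fn main() {
    println!("{}", install_script("ruvector", "0.1.0"));
    println!("{}", update_script("ruvector", "0.1.0", "0.1.19"));
}
```

PostgreSQL uses these names to work out how to reach the default_version in the control file, so the artifacts sent to Neon need a script path from the base version to that version.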

Step 3: Security Review

Neon engineers will review:

✓ Rust memory safety guarantees
✓ No unsafe system calls
✓ Sandboxed execution
✓ Resource limits (memory, CPU)
✓ No file system access beyond PostgreSQL

Timeline: 1-2 weeks for approval.

Step 4: Deployment

Once approved:

-- Extension becomes available
CREATE EXTENSION ruvector;

-- Verify
SELECT ruvector_version();

For Free Plan Users

Option 1: Request via Discord

Join Neon Discord
Post in #feedback channel
Include use case and expected usage

Option 2: Use pgvector (Pre-installed)

-- pgvector is available on all plans
CREATE EXTENSION vector;

-- RuVector provides migration path
-- (See MIGRATION.md)

Migration from pgvector

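RuVector-Postgres is API-compatible with pgvector, so migration comes down to copying the data into a ruvector column, rebuilding the index, validating the results, and swapping the tables: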

Step 1: Create Parallel Tables

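-- Keep the existing pgvector table in place (for rollback);
-- it is renamed to items_old during the switch-over in Step 4

-- Create new table with ruvector
CREATE TABLE items_ruvector (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding ruvector(1536)
);

-- Copy data (automatic type conversion)
INSERT INTO items_ruvector (id, content, embedding)
SELECT id, content, embedding::ruvector FROM items;

-- ids were copied explicitly, so advance the SERIAL sequence past them
SELECT setval(pg_get_serial_sequence('items_ruvector', 'id'),
              (SELECT COALESCE(max(id), 1) FROM items_ruvector));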

Step 2: Rebuild Indexes

-- Drop old pgvector index (if exists)
-- DROP INDEX items_embedding_idx;

-- Create optimized HNSW index
CREATE INDEX items_embedding_ruhnsw_idx ON items_ruvector
USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 32, ef_construction = 200);

-- Analyze for query planner
ANALYZE items_ruvector;

Step 3: Validate Results

-- Compare search results
WITH pgvector_results AS (
    SELECT id, embedding <-> '[...]'::vector AS dist
    FROM items ORDER BY dist LIMIT 10
),
ruvector_results AS (
    SELECT id, embedding <-> '[...]'::ruvector AS dist
    FROM items_ruvector ORDER BY dist LIMIT 10
)
SELECT
    p.id AS pg_id,
    r.id AS ru_id,
    p.id = r.id AS id_match,
    abs(p.dist - r.dist) < 0.0001 AS dist_match
FROM pgvector_results p
FULL OUTER JOIN ruvector_results r ON p.id = r.id;

-- All rows should have id_match=true, dist_match=true
-- (an id returned by only one side shows NULL in the other side's columns)
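
Because both indexes are approximate, an exact row-for-row match is a strict test; measuring the overlap between the two top-k id lists is a softer check. A minimal sketch, assuming the ids from each query have been fetched into slices; overlap_at_k is a hypothetical helper, not a RuVector function.

```rust
use std::collections::HashSet;

/// Fraction of the reference ids that also appear in the candidate list.
/// Returns 1.0 for an empty reference list (nothing to miss).
fn overlap_at_k(reference: &[i64], candidate: &[i64]) -> f64 {
    if reference.is_empty() {
        return 1.0;
    }
    let expected: HashSet<i64> = reference.iter().copied().collect();
    let got: HashSet<i64> = candidate.iter().copied().collect();
    let hits = expected.intersection(&got).count();
    hits as f64 / expected.len() as f64
}

fn main() {
    let pgvector_ids = [11, 42, 7, 93];
    let ruvector_ids = [11, 42, 7, 58];
    println!(
        "overlap: {:.2}",
        overlap_at_k(&pgvector_ids, &ruvector_ids)
    );
}
```

Running this over a sample of representative query vectors, and agreeing on an acceptable threshold beforehand, gives a clearer go/no-go signal than a single query.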

Step 4: Switch Over

-- Atomic swap
BEGIN;
ALTER TABLE items RENAME TO items_old;
ALTER TABLE items_ruvector RENAME TO items;
COMMIT;

-- Validate application queries
-- ... run tests ...

-- Drop old table after validation period (e.g., 1 week)
DROP TABLE items_old;

Performance Tuning for Neon

Instance Size Recommendations

Neon CU	RAM	Max Vectors	Recommended Settings
0.25	1 GB	100K	`m=8, ef=64, sq8 quant`
0.5	2 GB	500K	`m=16, ef=100, sq8 quant`
1.0	4 GB	2M	`m=24, ef=150, optional quant`
2.0	8 GB	5M	`m=32, ef=200, no quant`
4.0	16 GB	10M+	`m=48, ef=300, no quant`
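
The table can be encoded as a lookup so that deployment tooling picks settings from the compute size. This is a sketch only: the Recommendation struct and recommend function are hypothetical, the values are copied from the table above, and sizes between the listed tiers are assumed to fall back to the next lower tier.

```rust
/// Settings from the instance size table (field names are hypothetical).
#[derive(Debug, PartialEq)]
struct Recommendation {
    m: u32,
    ef: u32,
    quantization: &'static str,
}

/// Maps a Neon compute size (in CU) to the table row at or below it.
fn recommend(compute_units: f64) -> Recommendation {
    if compute_units < 0.5 {
        Recommendation { m: 8, ef: 64, quantization: "sq8" }
    } else if compute_units < 1.0 {
        Recommendation { m: 16, ef: 100, quantization: "sq8" }
    } else if compute_units < 2.0 {
        Recommendation { m: 24, ef: 150, quantization: "optional" }
    } else if compute_units < 4.0 {
        Recommendation { m: 32, ef: 200, quantization: "none" }
    } else {
        Recommendation { m: 48, ef: 300, quantization: "none" }
    }
}

fn main() {
    for cu in [0.25, 0.5, 1.0, 2.0, 4.0] {
        println!("{} CU -> {:?}", cu, recommend(cu));
    }
}
```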

Query Optimization

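-- High recall (use for important queries)
-- Plain SET is session-scoped; over a pooled connection prefer SET LOCAL as shown at the end
SET ruvector.ef_search = 200;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;

-- Low latency (use for real-time queries)
SET ruvector.ef_search = 40;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;

-- Per-query tuning (SET LOCAL only takes effect inside a transaction block)
BEGIN;
SET LOCAL ruvector.ef_search = 100;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;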

Index Build Settings

-- For small Neon instances
SET maintenance_work_mem = '512MB';
SET max_parallel_maintenance_workers = 2;

-- For large Neon instances
SET maintenance_work_mem = '4GB';
SET max_parallel_maintenance_workers = 8;

-- Always use CONCURRENTLY on Neon
CREATE INDEX CONCURRENTLY ...;

Neon Branching with RuVector

How Branching Works

Neon branches use copy-on-write, so indexes are instantly available:

Parent Branch                Child Branch
┌─────────────┐             ┌─────────────┐
│ items       │             │ items       │ (copy-on-write)
│ ├─ data     │──shared────→│ ├─ data     │
│ └─ index    │──shared────→│ └─ index    │
└─────────────┘             └─────────────┘
                                   ↓
                              Modify data
                                   ↓
                            ┌─────────────┐
                            │ items       │
                            │ ├─ data     │ (diverged)
                            │ └─ index    │ (needs rebuild)
                            └─────────────┘
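
The sharing shown in the diagram can be modeled with a reference-counted buffer that is copied only on the first write. This is a toy model under stated assumptions, not Neon's storage implementation: the Branch struct is hypothetical, and it copies the whole buffer on write, whereas real copy-on-write storage works at a much finer granularity.

```rust
use std::rc::Rc;

/// A branch holding a shared, reference-counted view of its pages.
struct Branch {
    pages: Rc<Vec<u32>>,
}

impl Branch {
    /// Creating a branch copies only the pointer, not the pages.
    fn fork(&self) -> Branch {
        Branch { pages: Rc::clone(&self.pages) }
    }

    /// The first write to shared pages clones them; later writes reuse the copy.
    fn write(&mut self, value: u32) {
        Rc::make_mut(&mut self.pages).push(value);
    }

    fn shares_storage_with(&self, other: &Branch) -> bool {
        Rc::ptr_eq(&self.pages, &other.pages)
    }
}

fn main() {
    let parent = Branch { pages: Rc::new(vec![1, 2, 3]) };
    let mut child = parent.fork();
    println!("shared after fork: {}", child.shares_storage_with(&parent));

    child.write(4);
    println!("shared after write: {}", child.shares_storage_with(&parent));
    println!("parent: {:?}, child: {:?}", parent.pages, child.pages);
}
```

The parent never observes the child's write, which is why a branch is a safe place to experiment with index parameters.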

Branch Creation Workflow

-- In parent branch: Create index
CREATE INDEX items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops);

-- Create child branch via Neon Console or API
-- Index is instantly available (no rebuild needed)

-- In child branch: Index is read-only until data changes
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
-- Uses parent's index ✓

-- After INSERT/UPDATE in child:
-- Index diverges and needs rebuild
INSERT INTO items VALUES (...);
REINDEX INDEX items_embedding_idx;  -- or CREATE INDEX CONCURRENTLY

Branch-Specific Tuning

-- Database names below are placeholders; run each statement while connected to that branch
-- Development branch: Faster queries, lower recall
ALTER DATABASE dev_branch SET ruvector.ef_search = 20;

-- Staging branch: Balanced
ALTER DATABASE staging SET ruvector.ef_search = 100;

-- Production branch: High recall
ALTER DATABASE prod SET ruvector.ef_search = 200;

Monitoring on Neon

Extension Metrics

-- Index statistics
SELECT * FROM ruvector_index_stats();

┌────────────────────────────────────────────────────────────────┐
│                    Index Statistics                             │
├────────────────────────────────────────────────────────────────┤
│ index_name              │ items_embedding_idx                  │
│ index_size_mb           │ 512                                  │
│ vector_count            │ 1000000                              │
│ dimensions              │ 1536                                 │
│ build_time_seconds      │ 45.2                                 │
│ fragmentation_pct       │ 2.3                                  │
└────────────────────────────────────────────────────────────────┘

Query Performance

-- Explain analyze for vector queries
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM items
ORDER BY embedding <-> '[0.1, 0.2, ...]'::ruvector
LIMIT 10;

-- Output includes:
-- - Index Scan using items_embedding_idx
-- - Distance calculations: 15000
-- - Buffers: shared hit=250, read=10
-- - Execution time: 12.5ms

Neon Metrics Integration

Use Neon's monitoring dashboard:

Query Time: Track vector query latencies
Buffer Hit Ratio: Monitor index cache efficiency
Compute Usage: Track CPU during index builds
Memory Usage: Monitor vector memory consumption

Troubleshooting

Cold Start Slow

Symptom: First query after suspend takes >500ms

Diagnosis:

-- Confirm the extension is installed and check its version
SELECT extname, extversion FROM pg_extension WHERE extname = 'ruvector';

-- Check SIMD detection
SELECT ruvector_simd_info();

Solution:

Expected: 100-200ms for first query
If >500ms: Contact Neon support (compute issue)
Use keep-alive queries to prevent suspension

Memory Pressure

Symptom: Index build fails with OOM

Diagnosis:

-- Check current memory usage
SELECT * FROM ruvector_memory_stats();

-- Check shared_buffers (a rough indicator of buffer memory, not the compute size itself)
SELECT current_setting('shared_buffers');

Solution:

-- Reduce index memory
SET ruvector.max_index_memory = '128MB';

-- Use aggressive quantization
CREATE INDEX ... WITH (quantization = 'pq16');

-- Upgrade Neon compute unit
-- Neon Console → Project Settings → Compute → Scale up

Index Build Timeout

Symptom: CREATE INDEX times out on large dataset

Solution:

-- Always use CONCURRENTLY
CREATE INDEX CONCURRENTLY items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops);

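-- Split into batches (ORDER BY keeps the batch boundaries deterministic)
CREATE TABLE items_batch_1 AS SELECT * FROM items ORDER BY id LIMIT 100000;
CREATE INDEX ... ON items_batch_1;
-- Repeat for the remaining batches, then query them together with UNION ALL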

Connection Pool Compatibility

Symptom: Settings not persisting across queries

Cause: PgBouncer transaction mode resets session state

Solution:

-- Use SET LOCAL (transaction-scoped)
BEGIN;
SET LOCAL ruvector.ef_search = 100;
SELECT ... ORDER BY embedding <-> query;
COMMIT;

-- Or set a per-database default (applies to new sessions)
ALTER DATABASE mydb SET ruvector.ef_search = 100;

Support Resources

Neon Documentation: https://neon.tech/docs
RuVector GitHub: https://github.com/ruvnet/ruvector
RuVector Issues: https://github.com/ruvnet/ruvector/issues
Neon Discord: https://discord.gg/92vNTzKDGp
Neon Support: console.neon.tech → Support (Scale plan+)
