Neon Postgres Compatibility Guide
Overview
RuVector-Postgres is designed with first-class support for Neon's serverless PostgreSQL platform. This guide covers deployment, configuration, and optimization for Neon environments.
Neon Platform Overview
Neon is a serverless PostgreSQL platform with a distinctive architecture:
- Separation of Storage and Compute: Compute nodes are stateless
- Scale to Zero: Instances automatically suspend when idle
- Instant Branching: Copy-on-write database branches
- Dynamic Extension Loading: Custom extensions loaded on demand
- Connection Pooling: Built-in pooling with PgBouncer
Compatibility Matrix
| Neon Feature | RuVector Support | Notes |
|---|---|---|
| PostgreSQL 14 | ✓ Full | Tested |
| PostgreSQL 15 | ✓ Full | Tested |
| PostgreSQL 16 | ✓ Full | Recommended |
| PostgreSQL 17 | ✓ Full | Latest |
| PostgreSQL 18 | ✓ Full | Beta support |
| Scale to Zero | ✓ Full | <100ms cold start |
| Instant Branching | ✓ Full | Index state preserved |
| Connection Pooling | ✓ Full | Thread-safe, no session state |
| Read Replicas | ✓ Full | Consistent reads |
| Autoscaling | ✓ Full | Dynamic memory handling |
| Autosuspend | ✓ Full | Fast wake-up |
Design Considerations for Neon
1. Stateless Compute
Neon compute nodes are ephemeral and may be replaced at any time. RuVector-Postgres handles this by:
// No global mutable state that requires persistence
// All state lives in PostgreSQL's shared memory or storage
#[pg_guard]
pub extern "C" fn _PG_init() {
    // Called by PostgreSQL through the C ABI when the library loads
    // Lightweight initialization - no disk I/O
    // SIMD feature detection cached in thread-local
    init_simd_dispatch();
    // Register GUCs (configuration variables)
    register_gucs();
    // No background workers (Neon restriction)
    // All maintenance is on-demand or during queries
}
Key Principles:
- No file-based state: Everything in PostgreSQL shared buffers
- No background workers: All work is query-driven
- Fast initialization: Extension loads in <100ms
- Memory-mapped indexes: Loaded from storage on demand
2. Fast Cold Start
Critical for scale-to-zero. RuVector-Postgres achieves sub-100ms initialization:
┌─────────────────────────────────────────────────────────────────┐
│ Cold Start Timeline │
├─────────────────────────────────────────────────────────────────┤
│ 0ms │ Extension .so loaded by PostgreSQL │
│ 5ms │ _PG_init() called │
│ 10ms │ SIMD feature detection complete │
│ 15ms │ GUC registration complete │
│ 20ms │ Operator/function registration complete │
│ 25ms │ Index access method registration complete │
│ 50ms │ First query ready │
│ 75ms │ Index mmap from storage (on first access) │
│ 100ms │ Full warm state achieved │
└─────────────────────────────────────────────────────────────────┘
Optimization Techniques:
- Lazy Index Loading: Indexes mmap'd from storage on first access
- No Precomputation: No tables built at startup
- Minimal Allocations: Stack-based init where possible
- Cached SIMD Detection: One-time CPU feature detection
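The one-time CPU feature detection listed above can be sketched with the standard library alone. This is a minimal illustration, not RuVector's actual `init_simd_dispatch()`: the names `SimdLevel`, `detect`, and `simd_level` are hypothetical, and the sketch caches the result in a process-wide `OnceLock` rather than the thread-local cache the extension describes.

```rust
use std::sync::OnceLock;

/// SIMD capability tiers distinguished by this sketch (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SimdLevel {
    Scalar,
    Avx2,
    Avx512,
    Neon,
}

/// Holds the detection result after the first call.
static SIMD_LEVEL: OnceLock<SimdLevel> = OnceLock::new();

/// Probe the CPU; each check compiles only on the matching architecture.
pub fn detect() -> SimdLevel {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return SimdLevel::Avx512;
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return SimdLevel::Avx2;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return SimdLevel::Neon;
        }
    }
    SimdLevel::Scalar
}

/// Return the cached level, running `detect` only on the first call.
pub fn simd_level() -> SimdLevel {
    *SIMD_LEVEL.get_or_init(detect)
}
```

Every later call to `simd_level()` is a cheap read of the cached value, which is the property that keeps initialization work out of the cold-start path.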
Comparison with pgvector:
| Metric | RuVector | pgvector |
|---|---|---|
| Cold start time | 50ms | 120ms |
| Memory at init | 2 MB | 8 MB |
| First query latency | +10ms | +50ms |
3. Memory Efficiency
Neon compute instances have memory limits based on compute units (CU). RuVector-Postgres is memory-conscious:
-- Check memory usage
SELECT * FROM ruvector_memory_stats();
┌──────────────────────────────────────────────────────────────┐
│ Memory Statistics │
├──────────────────────────────────────────────────────────────┤
│ index_memory_mb │ 256 │
│ vector_cache_mb │ 64 │
│ quantization_tables_mb │ 8 │
│ total_extension_mb │ 328 │
└──────────────────────────────────────────────────────────────┘
Memory Optimization Strategies:
-- Limit index memory (for smaller Neon instances)
SET ruvector.max_index_memory = '256MB';
-- Use quantization to reduce memory footprint
CREATE INDEX ON items USING ruhnsw (embedding ruvector_l2_ops)
WITH (quantization = 'sq8'); -- 4x memory reduction
-- Use half-precision vectors
CREATE TABLE items (embedding halfvec(1536)); -- 50% memory savings
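To see where the 4x figure for `sq8` comes from, here is a minimal sketch of generic 8-bit scalar quantization: each 4-byte `f32` component becomes a 1-byte code. It is an assumption about the general technique, not RuVector's actual implementation; `Sq8Vector`, `quantize_sq8`, and `dequantize_sq8` are hypothetical names.

```rust
/// One vector stored as 8-bit codes plus the affine map needed to decode it.
pub struct Sq8Vector {
    pub min: f32,
    pub scale: f32,
    pub codes: Vec<u8>,
}

/// Map each component linearly from [min, max] onto the codes 0..=255.
pub fn quantize_sq8(v: &[f32]) -> Sq8Vector {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    // A constant (or empty) vector has no range; any positive scale works.
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let codes = v
        .iter()
        .map(|&x| ((x - min) / scale).round() as u8)
        .collect();
    Sq8Vector { min, scale, codes }
}

/// Reconstruct approximate f32 components from the codes.
pub fn dequantize_sq8(q: &Sq8Vector) -> Vec<f32> {
    q.codes.iter().map(|&c| q.min + c as f32 * q.scale).collect()
}
```

The reconstruction error per component is at most about half of `scale`, which is the recall cost paid for the smaller footprint.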
Memory by Compute Unit:
| Neon CU | RAM | Recommended Index Size | Quantization |
|---|---|---|---|
| 0.25 | 1 GB | <128 MB | Required (sq8/pq) |
| 0.5 | 2 GB | <512 MB | Recommended (sq8) |
| 1.0 | 4 GB | <2 GB | Optional |
| 2.0 | 8 GB | <4 GB | Optional |
| 4.0+ | 16+ GB | <8 GB | None |
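When picking a row from the table, a rough estimate of raw vector storage helps. The sketch below multiplies vector count, dimensions, and bytes per component. It deliberately ignores HNSW graph links and per-tuple overhead, so treat the result as a lower bound; `Component`, `raw_vector_bytes`, and `to_mib` are hypothetical helpers, not part of the extension.

```rust
/// Storage format of a single vector component.
#[derive(Debug, Clone, Copy)]
pub enum Component {
    F32, // full precision
    F16, // half precision (halfvec)
    Sq8, // 8-bit scalar quantization
}

impl Component {
    /// Bytes used by one component in this format.
    pub fn bytes(self) -> u64 {
        match self {
            Component::F32 => 4,
            Component::F16 => 2,
            Component::Sq8 => 1,
        }
    }
}

/// Raw bytes for `count` vectors of `dims` components each.
pub fn raw_vector_bytes(count: u64, dims: u64, component: Component) -> u64 {
    count * dims * component.bytes()
}

/// Convert bytes to MiB for comparison with memory limits.
pub fn to_mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}
```

For example, 100,000 vectors of 1536 dimensions need about 586 MiB as `f32`, 293 MiB as half precision, and 146 MiB with `sq8`, before any index overhead.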
4. No Background Workers
Neon restricts background workers for resource management. RuVector-Postgres is designed without them:
// ❌ NOT USED: Background workers
// BackgroundWorker::register("ruvector_maintenance", ...);
// ✓ USED: On-demand operations
// - Index vacuum during INSERT/UPDATE
// - Statistics during ANALYZE
// - Maintenance via explicit SQL functions
Alternative Maintenance Patterns:
-- Explicit index maintenance (replaces background vacuum)
SELECT ruvector_index_maintenance('items_embedding_idx');
-- Scheduled via pg_cron (if available)
SELECT cron.schedule('vacuum-index', '0 2 * * *',
$$SELECT ruvector_index_maintenance('items_embedding_idx')$$);
-- Manual statistics update
ANALYZE items;
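One way to decide when to call an explicit maintenance function without a background worker is to piggyback a cheap check on writes. The sketch below is a hypothetical illustration of that pattern, not RuVector's internal logic: `MaintenanceTracker` and its threshold are assumptions, and the counters would in practice come from the application or from index statistics.

```rust
/// Tracks live and dead index entries to decide when maintenance is worthwhile.
pub struct MaintenanceTracker {
    live: u64,
    dead: u64,
    /// Fraction of dead entries (0.0..=1.0) that triggers maintenance.
    threshold: f64,
}

impl MaintenanceTracker {
    pub fn new(threshold: f64) -> Self {
        MaintenanceTracker { live: 0, dead: 0, threshold }
    }

    /// Record an inserted entry.
    pub fn record_insert(&mut self) {
        self.live += 1;
    }

    /// Record a deleted entry; returns true when maintenance is now due.
    pub fn record_delete(&mut self) -> bool {
        self.live = self.live.saturating_sub(1);
        self.dead += 1;
        self.needs_maintenance()
    }

    /// True when the dead fraction has reached the threshold.
    pub fn needs_maintenance(&self) -> bool {
        let total = self.live + self.dead;
        total > 0 && (self.dead as f64 / total as f64) >= self.threshold
    }

    /// Call after running the maintenance function to reset the dead count.
    pub fn run_maintenance(&mut self) {
        self.dead = 0;
    }
}
```

When `record_delete` returns true, the caller would issue `SELECT ruvector_index_maintenance(...)` and then call `run_maintenance()`.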
5. Connection Pooling Considerations
Neon uses PgBouncer in transaction mode for connection pooling. RuVector-Postgres is fully compatible:
Compatible Features:
- ✓ No session-level state
- ✓ No temp tables or cursors
- ✓ All settings via GUCs (can be set per-transaction)
- ✓ Thread-safe distance calculations
Usage Pattern:
-- Each transaction is independent
BEGIN;
SET LOCAL ruvector.ef_search = 100; -- Transaction-local setting
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;
-- Next transaction (potentially different connection)
BEGIN;
SET LOCAL ruvector.ef_search = 200; -- Different setting
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;
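On the client side, the usage pattern above amounts to always wrapping the setting and the query in the same transaction. A minimal sketch of a helper that builds such a batch follows; `scoped_query` is a hypothetical name, and whether a multi-statement string can be sent in one round trip depends on your driver and its protocol mode.

```rust
/// Wrap `query` in a transaction that sets ruvector.ef_search only for its duration.
pub fn scoped_query(ef_search: u32, query: &str) -> String {
    // Strip surrounding whitespace and any trailing semicolon so that
    // exactly one terminator is emitted after the query.
    let body = query.trim().trim_end_matches(';');
    format!(
        "BEGIN;\nSET LOCAL ruvector.ef_search = {};\n{};\nCOMMIT;",
        ef_search, body
    )
}
```

Because `ef_search` is typed as `u32`, the setting value cannot carry arbitrary SQL. The query text itself is inserted verbatim and should come from trusted code.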
6. Index Persistence
How Indexes Are Stored:
- HNSW/IVFFlat indexes stored in PostgreSQL pages
- Automatically replicated to Neon storage layer
- Preserved across compute restarts
- Shared across branches (copy-on-write)
Index Build on Neon:
-- Non-blocking index build (recommended on Neon)
CREATE INDEX CONCURRENTLY items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 32, ef_construction = 200);
-- Monitor progress
SELECT
phase,
blocks_total,
blocks_done,
tuples_total,
tuples_done
FROM pg_stat_progress_create_index;
Neon-Specific Limitations
1. Extension Installation (Scale Plan Required)
Free Plan:
- Pre-approved extensions only (pgvector is included)
- RuVector requires custom extension approval
Scale Plan:
- Custom extensions allowed
- Contact support for installation
Enterprise Plan:
- Dedicated support for custom extensions
- Faster approval process
2. Compute Suspension
Behavior:
- Compute suspends after 5 minutes of inactivity (configurable)
- First query after suspension: +100-200ms latency
- Indexes loaded from storage on first access
Mitigation:
-- Keep-alive query (via cron or application)
SELECT 1;
-- Or use Neon's suspend_timeout setting
-- In Neon console: Project Settings → Compute → Autosuspend delay
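If you go the keep-alive route, the ping interval has to stay below the autosuspend delay. The sketch below shows that arithmetic under stated assumptions: `keepalive_interval` and `ping_due` are hypothetical helpers, and the safety margin is a value you choose.

```rust
use std::time::Duration;

/// Interval between keep-alive pings: the autosuspend delay minus a safety margin.
/// Returns None when the margin leaves no usable interval.
pub fn keepalive_interval(autosuspend: Duration, margin: Duration) -> Option<Duration> {
    autosuspend.checked_sub(margin).filter(|d| !d.is_zero())
}

/// True when the connection has been idle long enough that a ping should be sent.
pub fn ping_due(idle: Duration, autosuspend: Duration, margin: Duration) -> bool {
    match keepalive_interval(autosuspend, margin) {
        Some(interval) => idle >= interval,
        // No safe interval exists, so ping immediately.
        None => true,
    }
}
```

With the default 5-minute delay and a 30-second margin, a ping every 270 seconds keeps the compute awake. Note that keep-alive pings also prevent scale-to-zero, so compute keeps accruing usage while they run.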
3. Memory Constraints
Observation:
- Neon may limit memory below advertised CU limits
- Large index builds may fail with OOM
Solutions:
-- Build index with lower memory
SET maintenance_work_mem = '256MB';
CREATE INDEX CONCURRENTLY ...;
-- Use quantization for large datasets
WITH (quantization = 'pq16'); -- 16x memory reduction
4. Extension Update Process
Current Process:
- Open support ticket with Neon
- Provide new .so and SQL files
- Neon reviews and deploys
- Extension available for ALTER EXTENSION UPDATE
Future: Self-service extension updates (roadmap item)
Requesting RuVector on Neon
For Scale Plan Customers
Step 1: Open Support Ticket
Navigate to: Neon Console → Support
Ticket Template:
Subject: Custom Extension Request - RuVector-Postgres
Body:
I would like to install the RuVector-Postgres extension for vector similarity search.
Details:
- Extension: ruvector-postgres
- Version: 0.1.19
- PostgreSQL version: 16 (or your version)
- Project ID: [your-project-id]
Use case:
[Describe your vector search use case]
Repository: https://github.com/ruvnet/ruvector
Documentation: https://github.com/ruvnet/ruvector/tree/main/crates/ruvector-postgres
I can provide pre-built binaries if needed.
Step 2: Provide Extension Artifacts
Neon will request:
- Shared Library (.so file):
  # Build for PostgreSQL 16
  cargo pgrx package --pg-config /path/to/pg_config
  # Artifact: target/release/ruvector-pg16/usr/lib/postgresql/16/lib/ruvector.so
- Control File (ruvector.control):
  comment = 'High-performance vector similarity search'
  default_version = '0.1.19'
  module_pathname = '$libdir/ruvector'
  relocatable = true
- SQL Scripts:
  - ruvector--0.1.0.sql (initial schema)
  - ruvector--0.1.0--0.1.19.sql (migration script)
- Security Documentation:
  - Memory safety audit
  - No unsafe FFI calls
  - No network access
  - Resource limits
Step 3: Security Review
Neon engineers will review:
- ✓ Rust memory safety guarantees
- ✓ No unsafe system calls
- ✓ Sandboxed execution
- ✓ Resource limits (memory, CPU)
- ✓ No file system access beyond PostgreSQL
Timeline: 1-2 weeks for approval.
Step 4: Deployment
Once approved:
-- Extension becomes available
CREATE EXTENSION ruvector;
-- Verify
SELECT ruvector_version();
For Free Plan Users
Option 1: Request via Discord
- Join Neon Discord
- Post in #feedback channel
- Include use case and expected usage
Option 2: Use pgvector (Pre-installed)
-- pgvector is available on all plans
CREATE EXTENSION vector;
-- RuVector provides migration path
-- (See MIGRATION.md)
Migration from pgvector
RuVector-Postgres is API-compatible with pgvector, so migration can run side by side with the existing table in four steps:
Step 1: Create Parallel Tables
-- Keep existing pgvector table (for rollback)
-- ALTER TABLE items RENAME TO items_pgvector;
-- Create new table with ruvector
CREATE TABLE items_ruvector (
id SERIAL PRIMARY KEY,
content TEXT,
embedding ruvector(1536)
);
-- Copy data (automatic type conversion)
INSERT INTO items_ruvector (id, content, embedding)
SELECT id, content, embedding::ruvector FROM items;
Step 2: Rebuild Indexes
-- Drop old pgvector index (if exists)
-- DROP INDEX items_embedding_idx;
-- Create optimized HNSW index
CREATE INDEX items_embedding_ruhnsw_idx ON items_ruvector
USING ruhnsw (embedding ruvector_l2_ops)
WITH (m = 32, ef_construction = 200);
-- Analyze for query planner
ANALYZE items_ruvector;
Step 3: Validate Results
-- Compare search results
WITH pgvector_results AS (
SELECT id, embedding <-> '[...]'::vector AS dist
FROM items ORDER BY dist LIMIT 10
),
ruvector_results AS (
SELECT id, embedding <-> '[...]'::ruvector AS dist
FROM items_ruvector ORDER BY dist LIMIT 10
)
SELECT
p.id AS pg_id,
r.id AS ru_id,
p.id = r.id AS id_match,
abs(p.dist - r.dist) < 0.0001 AS dist_match
FROM pgvector_results p
FULL OUTER JOIN ruvector_results r ON p.id = r.id;
-- All rows should have id_match=true, dist_match=true
-- (NULL in either column marks an id returned by only one of the two searches)
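Because HNSW search is approximate, two indexes can legitimately return slightly different top-k lists, so an exact row-by-row match may be too strict. A common alternative is to measure the overlap of the two id lists. The sketch below assumes ids are unique within each list; `overlap_at_k` is a hypothetical helper applied to ids fetched by your application.

```rust
use std::collections::HashSet;

/// Fraction of the reference top-k ids that also appear in the candidate top-k.
/// Returns 1.0 when the reference list is empty (nothing to miss).
pub fn overlap_at_k(reference: &[i64], candidate: &[i64], k: usize) -> f64 {
    let ref_set: HashSet<i64> = reference.iter().take(k).copied().collect();
    if ref_set.is_empty() {
        return 1.0;
    }
    let hits = candidate
        .iter()
        .take(k)
        .copied()
        .filter(|id| ref_set.contains(id))
        .count();
    hits as f64 / ref_set.len() as f64
}
```

For instance, reference ids `[1, 2, 3, 4]` against candidate ids `[1, 2, 4, 5]` give an overlap of 0.75. Averaging this over a sample of real queries gives a more robust acceptance criterion than a single comparison.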
Step 4: Switch Over
-- Atomic swap
BEGIN;
ALTER TABLE items RENAME TO items_old;
ALTER TABLE items_ruvector RENAME TO items;
COMMIT;
-- Validate application queries
-- ... run tests ...
-- Drop old table after validation period (e.g., 1 week)
DROP TABLE items_old;
Performance Tuning for Neon
Instance Size Recommendations
| Neon CU | RAM | Max Vectors | Recommended Settings |
|---|---|---|---|
| 0.25 | 1 GB | 100K | m=8, ef=64, sq8 quant |
| 0.5 | 2 GB | 500K | m=16, ef=100, sq8 quant |
| 1.0 | 4 GB | 2M | m=24, ef=150, optional quant |
| 2.0 | 8 GB | 5M | m=32, ef=200, no quant |
| 4.0 | 16 GB | 10M+ | m=48, ef=300, no quant |
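The table can be encoded as a small lookup so that deployment scripts pick consistent parameters. This is a sketch under assumptions: `Settings`, `Quant`, and `recommended_settings` are hypothetical names, and compute sizes that fall between rows are rounded down to the nearest listed row.

```rust
/// Quantization recommendation from the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Quant {
    Sq8,
    Optional,
    Off,
}

/// Recommended index parameters for a given compute size.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Settings {
    pub m: u32,
    pub ef: u32,
    pub quant: Quant,
}

/// Look up the table row for `cu` compute units (rounding down between rows).
pub fn recommended_settings(cu: f64) -> Settings {
    if cu < 0.5 {
        Settings { m: 8, ef: 64, quant: Quant::Sq8 }
    } else if cu < 1.0 {
        Settings { m: 16, ef: 100, quant: Quant::Sq8 }
    } else if cu < 2.0 {
        Settings { m: 24, ef: 150, quant: Quant::Optional }
    } else if cu < 4.0 {
        Settings { m: 32, ef: 200, quant: Quant::Off }
    } else {
        Settings { m: 48, ef: 300, quant: Quant::Off }
    }
}
```

Rounding down is the conservative choice here, since a smaller `m` and `ef` use less memory on an instance that sits between two table rows.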
Query Optimization
-- High recall (use for important queries)
SET ruvector.ef_search = 200;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
-- Low latency (use for real-time queries)
SET ruvector.ef_search = 40;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
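-- Per-query tuning (SET LOCAL only takes effect inside a transaction)
BEGIN;
SET LOCAL ruvector.ef_search = 100;
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
COMMIT;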
Index Build Settings
-- For small Neon instances
SET maintenance_work_mem = '512MB';
SET max_parallel_maintenance_workers = 2;
-- For large Neon instances
SET maintenance_work_mem = '4GB';
SET max_parallel_maintenance_workers = 8;
-- Always use CONCURRENTLY on Neon
CREATE INDEX CONCURRENTLY ...;
Neon Branching with RuVector
How Branching Works
Neon branches use copy-on-write, so indexes are instantly available:
Parent Branch Child Branch
┌─────────────┐ ┌─────────────┐
│ items │ │ items │ (copy-on-write)
│ ├─ data │──shared────→│ ├─ data │
│ └─ index │──shared────→│ └─ index │
└─────────────┘ └─────────────┘
↓
Modify data
↓
┌─────────────┐
│ items │
│ ├─ data │ (diverged)
│ └─ index │ (needs rebuild)
└─────────────┘
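The sharing shown in the diagram can be modelled in a few lines with reference-counted pages. This is a toy model of copy-on-write in general, not Neon's storage engine: `Branch` and its methods are hypothetical, and pages are plain byte vectors.

```rust
use std::sync::Arc;

type Page = Vec<u8>;

/// A branch is a list of pages; pages are shared until one side writes.
#[derive(Clone)]
pub struct Branch {
    pages: Vec<Arc<Page>>,
}

impl Branch {
    pub fn new(pages: Vec<Page>) -> Self {
        Branch { pages: pages.into_iter().map(Arc::new).collect() }
    }

    /// Creating a branch copies only the page pointers, not the page contents.
    pub fn branch(&self) -> Branch {
        self.clone()
    }

    /// Writing clones the page first if another branch still shares it.
    pub fn write(&mut self, page: usize, offset: usize, byte: u8) {
        Arc::make_mut(&mut self.pages[page])[offset] = byte;
    }

    pub fn read(&self, page: usize, offset: usize) -> u8 {
        self.pages[page][offset]
    }

    /// True while both branches point at the same physical page.
    pub fn shares_page_with(&self, other: &Branch, page: usize) -> bool {
        Arc::ptr_eq(&self.pages[page], &other.pages[page])
    }
}
```

After a child writes to one page, only that page diverges. Untouched data and index pages remain shared with the parent, which is why a new branch is available immediately.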
Branch Creation Workflow
-- In parent branch: Create index
CREATE INDEX items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops);
-- Create child branch via Neon Console or API
-- Index is instantly available (no rebuild needed)
-- In child branch: Index is read-only until data changes
SELECT * FROM items ORDER BY embedding <-> query LIMIT 10;
-- Uses parent's index ✓
-- After INSERT/UPDATE in child:
-- Index diverges and needs rebuild
INSERT INTO items VALUES (...);
REINDEX INDEX items_embedding_idx; -- or CREATE INDEX CONCURRENTLY
Branch-Specific Tuning
-- Development branch: Faster queries, lower recall
ALTER DATABASE dev_branch SET ruvector.ef_search = 20;
-- Staging branch: Balanced
ALTER DATABASE staging SET ruvector.ef_search = 100;
-- Production branch: High recall
ALTER DATABASE prod SET ruvector.ef_search = 200;
Monitoring on Neon
Extension Metrics
-- Index statistics
SELECT * FROM ruvector_index_stats();
┌────────────────────────────────────────────────────────────────┐
│ Index Statistics │
├────────────────────────────────────────────────────────────────┤
│ index_name │ items_embedding_idx │
│ index_size_mb │ 512 │
│ vector_count │ 1000000 │
│ dimensions │ 1536 │
│ build_time_seconds │ 45.2 │
│ fragmentation_pct │ 2.3 │
└────────────────────────────────────────────────────────────────┘
Query Performance
-- Explain analyze for vector queries
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM items
ORDER BY embedding <-> '[0.1, 0.2, ...]'::ruvector
LIMIT 10;
-- Output includes:
-- - Index Scan using items_embedding_idx
-- - Distance calculations: 15000
-- - Buffers: shared hit=250, read=10
-- - Execution time: 12.5ms
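The `Buffers` line in the plan output gives a quick per-query cache hit ratio: hits divided by hits plus reads. A minimal sketch of that arithmetic, using the hypothetical helper `buffer_hit_ratio`:

```rust
/// Share of buffer accesses served from shared buffers rather than read from storage.
/// Returns None when the query touched no buffers.
pub fn buffer_hit_ratio(hit: u64, read: u64) -> Option<f64> {
    let total = hit + read;
    if total == 0 {
        None
    } else {
        Some(hit as f64 / total as f64)
    }
}
```

For the sample output above, `shared hit=250, read=10` gives 250 / 260, or roughly 96%. A noticeably lower ratio right after a cold start is expected while index pages are loaded from storage.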
Neon Metrics Integration
Use Neon's monitoring dashboard:
- Query Time: Track vector query latencies
- Buffer Hit Ratio: Monitor index cache efficiency
- Compute Usage: Track CPU during index builds
- Memory Usage: Monitor vector memory consumption
Troubleshooting
Cold Start Slow
Symptom: First query after suspend takes >500ms
Diagnosis:
-- Check extension load time
SELECT extname, extversion FROM pg_extension WHERE extname = 'ruvector';
-- Check SIMD detection
SELECT ruvector_simd_info();
Solution:
- Expected: 100-200ms for first query
- If >500ms: Contact Neon support (compute issue)
- Use keep-alive queries to prevent suspension
Memory Pressure
Symptom: Index build fails with OOM
Diagnosis:
-- Check current memory usage
SELECT * FROM ruvector_memory_stats();
-- Check Neon compute size
SELECT current_setting('shared_buffers');
Solution:
-- Reduce index memory
SET ruvector.max_index_memory = '128MB';
-- Use aggressive quantization
CREATE INDEX ... WITH (quantization = 'pq16');
-- Upgrade Neon compute unit
-- Neon Console → Project Settings → Compute → Scale up
Index Build Timeout
Symptom: CREATE INDEX times out on large dataset
Solution:
-- Always use CONCURRENTLY
CREATE INDEX CONCURRENTLY items_embedding_idx ON items
USING ruhnsw (embedding ruvector_l2_ops);
-- Split into batches
CREATE TABLE items_batch_1 AS SELECT * FROM items LIMIT 100000;
CREATE INDEX ... ON items_batch_1;
-- Repeat for batches, then UNION ALL
Connection Pool Compatibility
Symptom: Settings not persisting across queries
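Cause: In PgBouncer transaction mode, consecutive transactions may run on different server connections, so session-level SET values do not reliably carry over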
Solution:
-- Use SET LOCAL (transaction-scoped)
BEGIN;
SET LOCAL ruvector.ef_search = 100;
SELECT ... ORDER BY embedding <-> query;
COMMIT;
-- Or set a per-database default (applies to new sessions)
ALTER DATABASE mydb SET ruvector.ef_search = 100;
Support Resources
- Neon Documentation: https://neon.tech/docs
- RuVector GitHub: https://github.com/ruvnet/ruvector
- RuVector Issues: https://github.com/ruvnet/ruvector/issues
- Neon Discord: https://discord.gg/92vNTzKDGp
- Neon Support: console.neon.tech → Support (Scale plan+)