HNSW Index - Quick Reference Guide

Installation

# Build and install
cd /home/user/ruvector/crates/ruvector-postgres
cargo pgrx install

-- Enable in the database (run from psql)
CREATE EXTENSION ruvector;

Index Creation

-- L2 distance (default)
CREATE INDEX ON table USING hnsw (column hnsw_l2_ops);

-- With custom parameters
CREATE INDEX ON table USING hnsw (column hnsw_l2_ops)
    WITH (m = 32, ef_construction = 128);

-- Cosine distance
CREATE INDEX ON table USING hnsw (column hnsw_cosine_ops);

-- Inner product
CREATE INDEX ON table USING hnsw (column hnsw_ip_ops);

Query Syntax

-- L2 distance
SELECT * FROM table ORDER BY column <-> query_vector LIMIT 10;

-- Cosine distance
SELECT * FROM table ORDER BY column <=> query_vector LIMIT 10;

-- Inner product
SELECT * FROM table ORDER BY column <#> query_vector LIMIT 10;

Parameters

Index Build Parameters

| Parameter | Default | Range | Description |
|---|---|---|---|
| m | 16 | 2-128 | Max connections per layer |
| ef_construction | 64 | 4-1000 | Build candidate list size |

Query Parameters

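| Parameter | Default | Range | Description |
|---|---|---|---|
| ruvector.ef_search | 40 | 1-1000 | Search candidate list size |

-- Set globally (takes effect after a configuration reload, e.g. SELECT pg_reload_conf();)
ALTER SYSTEM SET ruvector.ef_search = 100;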

-- Set per session
SET ruvector.ef_search = 100;

-- Set per transaction
SET LOCAL ruvector.ef_search = 100;

Distance Metrics

| Metric | Operator | Use Case | Formula |
|---|---|---|---|
| L2 | <-> | General distance | √(Σ(a-b)²) |
| Cosine | <=> | Direction similarity | 1-(a·b)/(‖a‖‖b‖) |
| Inner Product | <#> | Max similarity | -Σ(a*b) |
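
The formulas in the table can be checked by hand with a few lines of Python. This is a minimal sketch that restates the formulas as written above; it is not the extension's implementation, and the function names are hypothetical.

```python
import math

def l2_distance(a, b):
    """Euclidean distance, the formula listed for <->."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity, the formula listed for <=>."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def negative_inner_product(a, b):
    """Negated dot product, the formula listed for <#>."""
    return -sum(x * y for x, y in zip(a, b))

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
print(l2_distance(a, b))             # sqrt(27), about 5.196
print(cosine_distance(a, b))         # about 0.0254
print(negative_inner_product(a, b))  # -32.0
```

Because the inner product is negated, a smaller value means a more similar pair for all three metrics, which is why every query orders ascending.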

Performance Tuning

For Better Recall

-- Increase ef_search
SET ruvector.ef_search = 100;

-- Rebuild with higher ef_construction
CREATE INDEX ON table USING hnsw (column hnsw_l2_ops)
    WITH (ef_construction = 200);

For Faster Build

-- Lower ef_construction
CREATE INDEX ON table USING hnsw (column hnsw_l2_ops)
    WITH (ef_construction = 32);

-- Increase memory
SET maintenance_work_mem = '4GB';

For Less Memory

-- Lower m
CREATE INDEX ON table USING hnsw (column hnsw_l2_ops)
    WITH (m = 8);

Common Queries

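-- Basic nearest-neighbor search
SELECT id, column <-> query AS dist
FROM table
ORDER BY column <-> query
LIMIT 10;

-- Nearest-neighbor search with a filter
SELECT id, column <-> query AS dist
FROM table
WHERE created_at > NOW() - INTERVAL '7 days'
ORDER BY column <-> query
LIMIT 10;

-- Hybrid search (text_rank and vector_dist are placeholders for a
-- precomputed text rank and vector distance)
SELECT
    id,
    0.3 * text_rank + 0.7 * (1/(1+vector_dist)) AS score
FROM table
WHERE text_column @@ search_query
ORDER BY score DESC
LIMIT 10;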
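
The weighted score in the hybrid query can be traced outside the database. The sketch below assumes `text_rank` and `vector_dist` are already available as plain numbers; the function name and the sample rows are hypothetical and only illustrate how the 0.3/0.7 weighting behaves.

```python
def hybrid_score(text_rank, vector_dist, text_weight=0.3, vector_weight=0.7):
    """Blend a text rank with a distance mapped into (0, 1] by 1/(1+d)."""
    similarity = 1.0 / (1.0 + vector_dist)
    return text_weight * text_rank + vector_weight * similarity

# Hypothetical rows: (id, text_rank, vector_dist)
candidates = [("a", 0.9, 2.0), ("b", 0.2, 0.0), ("c", 0.5, 0.5)]

# Highest score first, mirroring ORDER BY score DESC
ranked = sorted(candidates, key=lambda r: hybrid_score(r[1], r[2]), reverse=True)
for row_id, text_rank, vector_dist in ranked:
    print(row_id, round(hybrid_score(text_rank, vector_dist), 4))
```

With these weights, row "b" (an exact vector match with a weak text rank) outranks row "a" (a strong text rank with a distant vector), which shows how heavily the 0.7 weight favors vector proximity.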

Maintenance

-- View statistics
SELECT ruvector_memory_stats();

-- Perform maintenance
SELECT ruvector_index_maintenance('index_name');

-- Vacuum
VACUUM ANALYZE table;

-- Rebuild index
REINDEX INDEX index_name;

Monitoring

-- Check index size
SELECT pg_size_pretty(pg_relation_size('index_name'));

-- Explain query
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM table ORDER BY column <-> query LIMIT 10;

Operators Reference

-- Distance operators
ARRAY[1,2,3]::real[] <-> ARRAY[4,5,6]::real[]  -- L2
ARRAY[1,2,3]::real[] <=> ARRAY[4,5,6]::real[]  -- Cosine
ARRAY[1,2,3]::real[] <#> ARRAY[4,5,6]::real[]  -- Inner product

-- Vector utilities
vector_normalize(ARRAY[3,4]::real[])           -- Normalize
vector_norm(ARRAY[3,4]::real[])                -- L2 norm
vector_add(a::real[], b::real[])               -- Add vectors
vector_sub(a::real[], b::real[])               -- Subtract

Typical Performance

| Dataset | Dimensions | Build Time | Query Time | Memory |
|---|---|---|---|---|
| 10K | 128 | ~1s | <1ms | ~10MB |
| 100K | 128 | ~20s | ~2ms | ~100MB |
| 1M | 128 | ~5min | ~5ms | ~1GB |
| 10M | 128 | ~1hr | ~10ms | ~10GB |
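
As a rough sanity check on the Memory column, the raw vector payload can be computed directly. This sketch assumes 4 bytes per component (the `real[]` element size) and ignores graph links and page overhead, so it gives a lower bound rather than the index's actual footprint; the function name is hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: components stored as 4-byte floats

def raw_vector_bytes(num_vectors, dimensions):
    """Bytes needed for the vector components alone, without index overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

for n in (10_000, 100_000, 1_000_000, 10_000_000):
    megabytes = raw_vector_bytes(n, 128) / 1e6
    print(f"{n:>10,} vectors x 128 dims: {megabytes:,.1f} MB raw")
```

Under that assumption the raw payload is about half of each figure in the table, leaving the rest for graph connections and storage overhead.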

Parameter Recommendations

Small Dataset (<100K vectors)

CREATE INDEX ON table USING hnsw (column hnsw_l2_ops)
    WITH (m = 16, ef_construction = 64);
SET ruvector.ef_search = 40;

Medium Dataset (100K-1M vectors)

CREATE INDEX ON table USING hnsw (column hnsw_l2_ops)
    WITH (m = 16, ef_construction = 128);
SET ruvector.ef_search = 64;

Large Dataset (>1M vectors)

CREATE INDEX ON table USING hnsw (column hnsw_l2_ops)
    WITH (m = 32, ef_construction = 200);
SET ruvector.ef_search = 100;
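
The three tiers above can be captured in a small lookup for scripts that provision indexes. This is only a sketch of the recommendations as listed; the helper name is hypothetical and not part of ruvector.

```python
def recommend_hnsw_params(num_vectors):
    """Return the recommended settings for a dataset of the given size."""
    if num_vectors < 100_000:          # small dataset
        return {"m": 16, "ef_construction": 64, "ef_search": 40}
    if num_vectors <= 1_000_000:       # medium dataset
        return {"m": 16, "ef_construction": 128, "ef_search": 64}
    return {"m": 32, "ef_construction": 200, "ef_search": 100}  # large dataset

params = recommend_hnsw_params(250_000)
print(f"WITH (m = {params['m']}, ef_construction = {params['ef_construction']})")
print(f"SET ruvector.ef_search = {params['ef_search']};")
```

These values are starting points; recall and latency should still be measured on the actual data before settling on them.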

Troubleshooting

Slow Queries

  • ✓ Increase ef_search
  • ✓ Check index exists: \d table
  • ✓ Analyze query: EXPLAIN ANALYZE

Low Recall

  • ✓ Increase ef_search
  • ✓ Rebuild with higher ef_construction
  • ✓ Use higher m value

Out of Memory

  • ✓ Lower m value
  • ✓ Increase maintenance_work_mem
  • ✓ Build index in batches

Index Build Fails

  • ✓ Check data quality (no NULLs)
  • ✓ Verify dimensions match
  • ✓ Increase maintenance_work_mem

Files and Documentation

  • Implementation: /home/user/ruvector/crates/ruvector-postgres/src/index/hnsw_am.rs
  • SQL: /home/user/ruvector/crates/ruvector-postgres/sql/hnsw_index.sql
  • Tests: /home/user/ruvector/crates/ruvector-postgres/tests/hnsw_index_tests.sql
  • Docs: /home/user/ruvector/docs/HNSW_INDEX.md
  • Examples: /home/user/ruvector/docs/HNSW_USAGE_EXAMPLE.md
  • Summary: /home/user/ruvector/docs/HNSW_IMPLEMENTATION_SUMMARY.md

Version Info

  • Implementation Version: 1.0
  • PostgreSQL: 14, 15, 16, 17
  • Extension: ruvector 0.1.0
  • pgrx: 0.12.x

Last Updated: December 2, 2025