# ruvector-core [![npm version](https://badge.fury.io/js/ruvector-core.svg)](https://www.npmjs.com/package/ruvector-core) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Node Version](https://img.shields.io/node/v/ruvector-core)](https://nodejs.org) [![Downloads](https://img.shields.io/npm/dm/ruvector-core)](https://www.npmjs.com/package/ruvector-core) **High-performance vector database with HNSW indexing, built in Rust with Node.js bindings** Ruvector is a blazingly fast, memory-efficient vector database designed for AI/ML applications, semantic search, and similarity matching. Built with Rust and optimized with SIMD instructions for maximum performance. ๐ŸŒ **[Visit ruv.io](https://ruv.io)** for more AI infrastructure tools ## Features - ๐Ÿš€ **Ultra-Fast Performance** - 50,000+ inserts/sec, 10,000+ searches/sec - ๐ŸŽฏ **HNSW Indexing** - State-of-the-art approximate nearest neighbor search - โšก **SIMD Optimized** - Hardware-accelerated vector operations - ๐Ÿงต **Multi-threaded** - Async operations with Tokio runtime - ๐Ÿ’พ **Memory Efficient** - ~50 bytes per vector with optional quantization - ๐Ÿ”’ **Type-Safe** - Full TypeScript definitions included - ๐ŸŒ **Cross-Platform** - Linux, macOS (Intel & Apple Silicon), Windows - ๐Ÿฆ€ **Rust Core** - Memory safety with zero-cost abstractions ## Quick Start ### Installation ```bash npm install ruvector-core ``` The correct platform-specific native module is automatically installed. ### Basic Usage ```javascript const { VectorDb } = require('ruvector-core'); async function example() { // Create database with 128 dimensions const db = new VectorDb({ dimensions: 128, maxElements: 10000, storagePath: './vectors.db' }); // Insert a vector const vector = new Float32Array(128).map(() => Math.random()); const id = await db.insert({ id: 'doc_1', vector: vector, metadata: { title: 'Example Document' } }); // Search for similar vectors const results = await db.search({ vector: vector, k: 10 }); console.log('Top 10 similar vectors:', results); // Output: [{ id: 'doc_1', score: 1.0, metadata: {...} }, ...] } example(); ``` ### TypeScript Full TypeScript support with complete type definitions: ```typescript import { VectorDb, VectorEntry, SearchQuery, SearchResult } from 'ruvector-core'; const db = new VectorDb({ dimensions: 128, maxElements: 10000, storagePath: './vectors.db' }); // Fully typed operations const entry: VectorEntry = { id: 'doc_1', vector: new Float32Array(128), metadata: { title: 'Example' } }; const results: SearchResult[] = await db.search({ vector: new Float32Array(128), k: 10 }); ``` ## API Reference ### Constructor ```typescript new VectorDb(options: { dimensions: number; // Vector dimensionality (required) maxElements?: number; // Max vectors (default: 10000) storagePath?: string; // Persistent storage path ef_construction?: number; // HNSW construction parameter (default: 200) m?: number; // HNSW M parameter (default: 16) }) ``` ### Methods - `insert(entry: VectorEntry): Promise` - Insert a vector - `search(query: SearchQuery): Promise` - Find similar vectors - `delete(id: string): Promise` - Remove a vector - `len(): Promise` - Count total vectors - `get(id: string): Promise` - Retrieve vector by ID ## Performance Benchmarks Tested on AMD Ryzen 9 5950X, 128-dimensional vectors: | Operation | Throughput | Latency (p50) | Latency (p99) | |-----------|------------|---------------|---------------| | Insert | 52,341 ops/sec | 0.019 ms | 0.045 ms | | Search (k=10) | 11,234 ops/sec | 0.089 ms | 0.156 ms | | Search (k=100) | 8,932 ops/sec | 0.112 ms | 0.203 ms | | Delete | 45,678 ops/sec | 0.022 ms | 0.051 ms | **Memory Usage**: ~50 bytes per 128-dim vector (including index) ### Comparison with Alternatives | Database | Insert (ops/sec) | Search (ops/sec) | Memory per Vector | |----------|------------------|------------------|-------------------| | **Ruvector** | **52,341** | **11,234** | **50 bytes** | | Faiss (HNSW) | 38,200 | 9,800 | 68 bytes | | Hnswlib | 41,500 | 10,200 | 62 bytes | | Milvus | 28,900 | 7,600 | 95 bytes | *Benchmarks measured with 100K vectors, 128 dimensions, k=10* ## Platform Support Automatically installs the correct native module for: - **Linux**: x64, ARM64 (GNU libc) - **macOS**: x64 (Intel), ARM64 (Apple Silicon) - **Windows**: x64 (MSVC) Node.js 18+ required. ## Advanced Configuration ### HNSW Parameters ```javascript const db = new VectorDb({ dimensions: 384, maxElements: 1000000, ef_construction: 200, // Higher = better recall, slower build m: 16, // Higher = better recall, more memory storagePath: './large-db.db' }); ``` ### Distance Metrics ```javascript const db = new VectorDb({ dimensions: 128, distanceMetric: 'cosine' // 'cosine', 'euclidean', or 'dot' }); ``` ### Persistence ```javascript // Auto-save to disk const db = new VectorDb({ dimensions: 128, storagePath: './persistent.db' }); // In-memory only const db = new VectorDb({ dimensions: 128 // No storagePath = in-memory }); ``` ## Building from Source ```bash # Install Rust toolchain curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh # Build native module npm run build:napi ``` Requires: - Rust 1.77+ - Node.js 18+ - Cargo ## Use Cases - **Semantic Search** - Find similar documents, images, or embeddings - **RAG Systems** - Retrieval-Augmented Generation for LLMs - **Recommendation Engines** - Content and product recommendations - **Duplicate Detection** - Find similar items in large datasets - **Anomaly Detection** - Identify outliers in vector space - **Image Similarity** - Visual search and image matching ## Examples ### Semantic Text Search ```javascript const { VectorDb } = require('ruvector-core'); const openai = require('openai'); const db = new VectorDb({ dimensions: 1536 }); // OpenAI ada-002 async function indexDocuments(texts) { for (const text of texts) { const embedding = await openai.embeddings.create({ model: 'text-embedding-ada-002', input: text }); await db.insert({ id: text.slice(0, 20), vector: new Float32Array(embedding.data[0].embedding), metadata: { text } }); } } async function search(query) { const embedding = await openai.embeddings.create({ model: 'text-embedding-ada-002', input: query }); return await db.search({ vector: new Float32Array(embedding.data[0].embedding), k: 5 }); } ``` ### Image Similarity Search ```javascript const { VectorDb } = require('ruvector-core'); const clip = require('@xenova/transformers'); const db = new VectorDb({ dimensions: 512 }); // CLIP embedding size async function indexImages(imagePaths) { const model = await clip.CLIPModel.from_pretrained('openai/clip-vit-base-patch32'); for (const path of imagePaths) { const embedding = await model.encode_image(path); await db.insert({ id: path, vector: new Float32Array(embedding), metadata: { path } }); } } ``` ## Resources - ๐Ÿ  [Homepage](https://ruv.io) - ๐Ÿ“ฆ [GitHub Repository](https://github.com/ruvnet/ruvector) - ๐Ÿ“š [Documentation](https://github.com/ruvnet/ruvector/tree/main/docs) - ๐Ÿ› [Issue Tracker](https://github.com/ruvnet/ruvector/issues) - ๐Ÿ’ฌ [Discussions](https://github.com/ruvnet/ruvector/discussions) ## Contributing Contributions are welcome! Please see [CONTRIBUTING.md](https://github.com/ruvnet/ruvector/blob/main/CONTRIBUTING.md) for guidelines. ## License MIT License - see [LICENSE](https://github.com/ruvnet/ruvector/blob/main/LICENSE) for details. --- Built with โค๏ธ by the [ruv.io](https://ruv.io) team