dearsky/wifi-densepose

Fork 0

Files

ruv cd5943df23 Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

2026-02-28 14:39:40 -05:00

24 KiB

Raw Permalink Blame History

🚀 Ruvector

High-Performance Vector Database for Node.js and Browsers

Blazing-fast vector similarity search powered by Rust • Sub-millisecond queries • Universal deployment

Quick Start • Documentation • Examples • API Reference

🌟 Why rUvector?

In the age of AI, vector similarity search is the foundation of modern applications—from RAG systems to recommendation engines. Ruvector brings enterprise-grade vector search performance to your Node.js and browser applications.

The Problem

Existing JavaScript vector databases force you to choose:

Performance: Pure JS solutions are 100x slower than native code
Portability: Server-only solutions can't run in browsers
Scale: Memory-intensive implementations struggle with large datasets

The Solution

Ruvector eliminates these trade-offs:

⚡ 10-100x Faster: Native Rust performance via NAPI-RS with <0.5ms query latency
🌍 Universal Deployment: Runs everywhere—Node.js (native), browsers (WASM), edge devices
💾 Memory Efficient: 4-32x compression with advanced quantization
🎯 Production Ready: Battle-tested HNSW indexing with 95%+ recall
🔒 Zero Dependencies: Pure Rust implementation with no external runtime dependencies
📘 Type Safe: Complete TypeScript definitions auto-generated from Rust

📦 Installation

Node.js (Native Performance)

npm install ruvector

Platform Support:

✅ Linux (x64, ARM64, musl)
✅ macOS (x64, Apple Silicon)
✅ Windows (x64)
✅ Node.js 18.0+

WebAssembly (Browser & Edge)

npm install @ruvector/wasm

Browser Support:

✅ Chrome 91+ (Full SIMD support)
✅ Firefox 89+ (Full SIMD support)
✅ Safari 16.4+ (Partial SIMD)
✅ Edge 91+

CLI Tools

npm install -g ruvector-cli

Or use directly:

npx ruvector --help

⚡ Quick Start

5-Minute Getting Started

Node.js:

const { VectorDB } = require('ruvector');

// Create database with 384 dimensions (e.g., for sentence-transformers)
const db = VectorDB.withDimensions(384);

// Insert vectors with metadata
await db.insert({
  vector: new Float32Array(384).fill(0.1),
  metadata: { text: 'Hello world', category: 'greeting' }
});

// Search for similar vectors
const results = await db.search({
  vector: new Float32Array(384).fill(0.15),
  k: 10
});

console.log(results); // [{ id, score, metadata }, ...]

TypeScript:

import { VectorDB, JsDbOptions } from 'ruvector';

// Advanced configuration
const options: JsDbOptions = {
  dimensions: 768,
  distanceMetric: 'Cosine',
  storagePath: './vectors.db',
  hnswConfig: {
    m: 32,
    efConstruction: 200,
    efSearch: 100
  }
};

const db = new VectorDB(options);

// Batch insert for better performance
const ids = await db.insertBatch([
  { vector: new Float32Array([...]), metadata: { text: 'doc1' } },
  { vector: new Float32Array([...]), metadata: { text: 'doc2' } }
]);

WebAssembly (Browser):

import init, { VectorDB } from '@ruvector/wasm';

// Initialize WASM (one-time setup)
await init();

// Create database (runs entirely in browser!)
const db = new VectorDB(384, 'cosine', true);

// Insert and search
db.insert(new Float32Array([0.1, 0.2, 0.3]), 'doc1');
const results = db.search(new Float32Array([0.15, 0.25, 0.35]), 10);

CLI:

# Create database
npx ruvector create --dimensions 384 --path ./vectors.db

# Insert vectors from JSON
npx ruvector insert --input embeddings.json

# Search for similar vectors
npx ruvector search --query "[0.1, 0.2, 0.3, ...]" --top-k 10

# Run performance benchmark
npx ruvector benchmark --queries 1000

🚀 Features

Core Capabilities

Feature	Description	Node.js	WASM
HNSW Indexing	Hierarchical Navigable Small World for fast ANN search	✅	✅
Distance Metrics	Cosine, Euclidean, Dot Product, Manhattan	✅	✅
Product Quantization	4-32x memory compression with minimal accuracy loss	✅	✅
SIMD Acceleration	Hardware-accelerated operations (2-4x speedup)	✅	✅
Batch Operations	Efficient bulk insert/search (10-50x faster)	✅	✅
Persistence	Save/load database state	✅	✅
TypeScript Support	Full type definitions included	✅	✅
Async/Await	Promise-based API	✅	N/A
Web Workers	Background processing in browsers	N/A	✅
IndexedDB	Browser persistence layer	N/A	✅

Performance Highlights

Metric                  Node.js (Native)    WASM (Browser)    Pure JS
──────────────────────────────────────────────────────────────────────
Query Latency (p50)     <0.5ms              <1ms              50ms+
Insert (10K vectors)    2.1s                3.2s              45s
Memory (1M vectors)     800MB               ~1GB              3GB
Throughput (QPS)        50K+                25K+              100-1K

📖 API Reference

VectorDB Class

Constructor

// Option 1: Full configuration
const db = new VectorDB({
  dimensions: 384,                    // Required: Vector dimensions
  distanceMetric?: 'Cosine' | 'Euclidean' | 'DotProduct' | 'Manhattan',
  storagePath?: string,               // Persistence path
  hnswConfig?: {
    m?: number,              // Connections per layer (16-64)
    efConstruction?: number, // Build quality (100-500)
    efSearch?: number,       // Search quality (50-500)
    maxElements?: number     // Max capacity
  },
  quantization?: {
    type: 'none' | 'scalar' | 'product' | 'binary',
    subspaces?: number,      // For product quantization
    k?: number               // Codebook size
  }
});

// Option 2: Simple factory (recommended for getting started)
const db = VectorDB.withDimensions(384);

Methods

`insert(entry): Promise<string>`

Insert a single vector with optional metadata.

const id = await db.insert({
  id?: string,                    // Optional (auto-generated UUID)
  vector: Float32Array,           // Required: Vector data
  metadata?: Record<string, any>  // Optional: JSON object
});

Example:

const id = await db.insert({
  vector: new Float32Array([0.1, 0.2, 0.3]),
  metadata: {
    text: 'example document',
    category: 'research',
    timestamp: Date.now()
  }
});

`insertBatch(entries): Promise<string[]>`

Insert multiple vectors efficiently (10-50x faster than sequential).

const ids = await db.insertBatch([
  { vector: new Float32Array([...]), metadata: { ... } },
  { vector: new Float32Array([...]), metadata: { ... } }
]);

`search(query): Promise<SearchResult[]>`

Search for k-nearest neighbors.

const results = await db.search({
  vector: Float32Array,           // Required: Query vector
  k: number,                      // Required: Number of results
  filter?: Record<string, any>,   // Optional: Metadata filters
  efSearch?: number               // Optional: Search quality override
});

// Result format:
interface SearchResult {
  id: string;           // Vector ID
  score: number;        // Distance (lower = more similar)
  vector?: number[];    // Original vector (optional)
  metadata?: any;       // Metadata object
}

Example:

const results = await db.search({
  vector: new Float32Array(queryEmbedding),
  k: 10,
  filter: { category: 'research', year: 2024 }
});

results.forEach(result => {
  const similarity = 1 - result.score;  // Convert distance to similarity
  console.log(`${result.metadata.text}: ${similarity.toFixed(3)}`);
});

`get(id): Promise<VectorEntry | null>`

Retrieve a specific vector by ID.

const entry = await db.get('vector-id');
if (entry) {
  console.log(entry.vector, entry.metadata);
}

`delete(id): Promise<boolean>`

Delete a vector by ID.

const deleted = await db.delete('vector-id');

`len(): Promise<number>`

Get total vector count.

const count = await db.len();
console.log(`Database contains ${count} vectors`);

`isEmpty(): Promise<boolean>`

Check if database is empty.

if (await db.isEmpty()) {
  console.log('No vectors yet');
}

CLI Reference

Global Commands

npx ruvector <command> [options]

Command	Description	Example
`create`	Create new database	`npx ruvector create --dimensions 384`
`insert`	Insert vectors from file	`npx ruvector insert --input data.json`
`search`	Search for similar vectors	`npx ruvector search --query "[...]" -k 10`
`info`	Show database statistics	`npx ruvector info --db vectors.db`
`benchmark`	Run performance tests	`npx ruvector benchmark --queries 1000`
`export`	Export database to file	`npx ruvector export --output backup.json`

Common Options

--db <PATH>          # Database file path (default: ./ruvector.db)
--config <FILE>      # Configuration file
--debug              # Enable debug logging
--no-color           # Disable colored output
--help               # Show help
--version            # Show version

See CLI Documentation for complete reference.

🏗️ Architecture

Package Structure

ruvector/
├── ruvector              # Main Node.js package (auto-detects platform)
│   ├── Native bindings   # NAPI-RS for Linux/macOS/Windows
│   └── WASM fallback     # WebAssembly for unsupported platforms
│
├── @ruvector/core        # Core package (optional direct install)
│   └── Pure Rust impl    # Core vector database engine
│
├── @ruvector/wasm        # WebAssembly package for browsers
│   ├── Standard WASM     # Base WebAssembly build
│   └── SIMD WASM         # SIMD-optimized build (2-4x faster)
│
└── ruvector-cli          # Command-line tools
    ├── Database mgmt     # Create, insert, search
    └── MCP server        # Model Context Protocol server

Platform Detection Flow

┌─────────────────────────────────────┐
│   User: npm install ruvector        │
└─────────────────┬───────────────────┘
                  │
                  ▼
         ┌────────────────┐
         │ Platform Check │
         └────────┬───────┘
                  │
        ┌─────────┴─────────┐
        │                   │
        ▼                   ▼
  ┌──────────┐      ┌──────────────┐
  │ Supported│      │ Unsupported  │
  │ Platform │      │   Platform   │
  └────┬─────┘      └──────┬───────┘
       │                   │
       ▼                   ▼
┌──────────────┐    ┌─────────────┐
│ Native NAPI  │    │ WASM Fallback│
│ (Rust→Node)  │    │ (Rust→WASM) │
└──────────────┘    └─────────────┘
       │                   │
       └─────────┬─────────┘
                 │
                 ▼
        ┌─────────────────┐
        │ VectorDB Ready  │
        └─────────────────┘

Native vs WASM Decision Tree

Condition	Package Loaded	Performance
Node.js + Supported Platform	Native NAPI	⚡⚡⚡ (Fastest)
Node.js + Unsupported Platform	WASM	⚡⚡ (Fast)
Browser (Modern)	WASM + SIMD	⚡⚡ (Fast)
Browser (Older)	WASM	⚡ (Good)

📊 Performance

Benchmarks vs Other Vector Databases

Local Performance (1M vectors, 384 dimensions):

Database	Query (p50)	Insert (10K)	Memory	Recall@10	Offline
Ruvector	0.4ms	2.1s	800MB	95%+	✅
Pinecone	~2ms	N/A	N/A	93%	❌
Qdrant	~1ms	~3s	1.5GB	94%	✅
ChromaDB	~50ms	~45s	3GB	85%	✅
Pure JS	100ms+	45s+	3GB+	80%	✅

Native vs WASM Performance

10,000 vectors, 384 dimensions:

Operation	Native (Node.js)	WASM (Browser)	Speedup
Insert (individual)	1.1s	3.2s	2.9x
Insert (batch)	0.4s	1.2s	3.0x
Search k=10 (100 queries)	0.2s	0.5s	2.5x
Search k=100 (100 queries)	0.7s	1.8s	2.6x

Optimization Tips

HNSW Parameters (Quality vs Speed):

// High recall (research, critical apps)
const highRecall = {
  m: 64,              // More connections
  efConstruction: 400,
  efSearch: 200
};

// Balanced (default, most apps)
const balanced = {
  m: 32,
  efConstruction: 200,
  efSearch: 100
};

// Fast (real-time apps)
const fast = {
  m: 16,              // Fewer connections
  efConstruction: 100,
  efSearch: 50
};

Memory Optimization with Quantization:

// Product Quantization: 8-32x compression
const compressed = {
  quantization: {
    type: 'product',
    subspaces: 16,
    k: 256
  }
};

// Binary Quantization: 32x compression, very fast
const minimal = {
  quantization: { type: 'binary' }
};

💡 Advanced Usage

1. RAG (Retrieval-Augmented Generation)

Build production-ready RAG systems with fast vector retrieval:

const { VectorDB } = require('ruvector');
const { OpenAI } = require('openai');

class RAGSystem {
  constructor() {
    this.db = VectorDB.withDimensions(1536); // OpenAI ada-002
    this.openai = new OpenAI();
  }

  async indexDocument(text, metadata) {
    const chunks = this.chunkText(text, 512);

    const embeddings = await this.openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: chunks
    });

    await this.db.insertBatch(
      embeddings.data.map((emb, i) => ({
        vector: new Float32Array(emb.embedding),
        metadata: { ...metadata, chunk: i, text: chunks[i] }
      }))
    );
  }

  async query(question, k = 5) {
    const embedding = await this.openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: [question]
    });

    const results = await this.db.search({
      vector: new Float32Array(embedding.data[0].embedding),
      k
    });

    const context = results.map(r => r.metadata.text).join('\n\n');

    const completion = await this.openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        { role: 'system', content: 'Answer based on context.' },
        { role: 'user', content: `Context:\n${context}\n\nQuestion: ${question}` }
      ]
    });

    return {
      answer: completion.choices[0].message.content,
      sources: results.map(r => r.metadata)
    };
  }

  chunkText(text, maxLength) {
    // Implement your chunking strategy
    return text.match(new RegExp(`.{1,${maxLength}}`, 'g')) || [];
  }
}

2. Semantic Code Search

Find similar code patterns across your codebase:

import { VectorDB } from 'ruvector';
import { pipeline } from '@xenova/transformers';

// Use code-specific embedding model
const embedder = await pipeline('feature-extraction', 'Xenova/codebert-base');
const db = VectorDB.withDimensions(768);

async function indexCodebase(files: Array<{ path: string, code: string }>) {
  for (const file of files) {
    const embedding = await embedder(file.code, {
      pooling: 'mean',
      normalize: true
    });

    await db.insert({
      vector: new Float32Array(embedding.data),
      metadata: {
        path: file.path,
        code: file.code,
        language: file.path.split('.').pop()
      }
    });
  }
}

async function findSimilarCode(query: string, k = 10) {
  const embedding = await embedder(query, {
    pooling: 'mean',
    normalize: true
  });

  return await db.search({
    vector: new Float32Array(embedding.data),
    k
  });
}

3. Recommendation Engine

Build personalized recommendation systems:

class RecommendationEngine {
  constructor() {
    this.db = VectorDB.withDimensions(128);
  }

  async addItem(itemId, features, metadata) {
    await this.db.insert({
      id: itemId,
      vector: new Float32Array(features),
      metadata: { ...metadata, addedAt: Date.now() }
    });
  }

  async recommendSimilar(itemId, k = 10) {
    const item = await this.db.get(itemId);
    if (!item) return [];

    const results = await this.db.search({
      vector: item.vector,
      k: k + 1
    });

    return results
      .filter(r => r.id !== itemId)
      .slice(0, k)
      .map(r => ({
        id: r.id,
        similarity: 1 - r.score,
        ...r.metadata
      }));
  }
}

4. Browser-Based Semantic Search (WASM)

Offline-first semantic search running entirely in the browser:

import init, { VectorDB } from '@ruvector/wasm';
import { IndexedDBPersistence } from '@ruvector/wasm/indexeddb';

await init();

const db = new VectorDB(384, 'cosine', true);
const persistence = new IndexedDBPersistence('semantic_search');

// Load cached vectors from IndexedDB
await persistence.open();
await persistence.loadAll(async (progress) => {
  if (progress.vectors.length > 0) {
    db.insertBatch(progress.vectors);
  }
  console.log(`Loading: ${progress.percent * 100}%`);
});

// Add new documents
async function indexDocument(text, embedding) {
  const id = db.insert(embedding, null, { text });
  await persistence.save({ id, vector: embedding, metadata: { text } });
}

// Search offline
function search(queryEmbedding, k = 10) {
  return db.search(queryEmbedding, k);
}

🎯 Examples

Complete Working Examples

The repository includes full working examples:

Node.js Examples:

simple.mjs - Basic operations
advanced.mjs - HNSW tuning & batching
semantic-search.mjs - Text similarity

Browser Examples:

Vanilla JS Demo - Pure JavaScript
React Demo - React integration

Run Examples:

# Clone repository
git clone https://github.com/ruvnet/ruvector.git
cd ruvector

# Node.js examples
cd crates/ruvector-node
npm install && npm run build
node examples/simple.mjs

# Browser example
cd ../../examples/wasm-react
npm install && npm start

🛠️ Building from Source

Prerequisites

Rust: 1.77 or higher
Node.js: 18.0 or higher
Build Tools:
- Linux: build-essential
- macOS: Xcode Command Line Tools
- Windows: Visual Studio Build Tools

Build Steps

# Clone repository
git clone https://github.com/ruvnet/ruvector.git
cd ruvector

# Build all crates
cargo build --release --workspace

# Build Node.js bindings
cd crates/ruvector-node
npm install && npm run build

# Build WASM
cd ../ruvector-wasm
npm install && npm run build:web

# Run tests
cargo test --workspace
npm test

Cross-Platform Builds

# Install cross-compilation tools
npm install -g @napi-rs/cli

# Build for specific platforms
npx napi build --platform --release

# Available targets:
# - linux-x64-gnu, linux-arm64-gnu, linux-x64-musl
# - darwin-x64, darwin-arm64
# - win32-x64-msvc

🤝 Contributing & License

Contributing

We welcome contributions! Areas where you can help:

🐛 Bug Fixes - Help us squash bugs
✨ New Features - Add capabilities and integrations
📝 Documentation - Improve guides and API docs
🧪 Testing - Add test coverage
🌍 Translations - Translate documentation

How to Contribute:

Fork the repository: github.com/ruvnet/ruvector
Create a feature branch: git checkout -b feature/amazing-feature
Commit your changes: git commit -m 'Add amazing feature'
Push to the branch: git push origin feature/amazing-feature
Open a Pull Request

See Contributing Guidelines for details.

License

MIT License - Free to use for commercial and personal projects.

See LICENSE for full details.

🌐 Community & Support

Get Help

GitHub Issues: Report bugs or request features
GitHub Discussions: Ask questions and share ideas
Discord: Join our community
Twitter: @ruvnet

Documentation

Getting Started Guide - Complete tutorial
API Reference - Full API documentation
Performance Tuning - Optimization guide
Complete Documentation - All documentation

Enterprise Support

Need enterprise support, custom development, or consulting?

📧 Contact: enterprise@ruv.io

🙏 Acknowledgments

Built with world-class open source technologies:

NAPI-RS - Native Node.js bindings for Rust
wasm-bindgen - Rust/WASM integration
HNSW - HNSW algorithm implementation
SimSIMD - SIMD-accelerated distance metrics
redb - Embedded database engine
Tokio - Async runtime for Rust

Special thanks to the Rust, Node.js, and WebAssembly communities! 🎉

🚀 Ready to Get Started?

npm install ruvector

Built by rUv • Open Source on GitHub

Status: Production Ready | Version: 0.1.0 | Performance: <0.5ms latency

Perfect for: RAG Systems • Semantic Search • Recommendation Engines • AI Agents

Get Started • Documentation • Examples • API Reference

24 KiB Raw Permalink Blame History