Squashed 'vendor/ruvector/' content from commit b64c2172

git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00
commit d803bfe2b1
7854 changed files with 3522914 additions and 0 deletions
--- a/crates/ruvector-router-core/README.md
+++ b/crates/ruvector-router-core/README.md
@@ -0,0 +1,761 @@
+# Router Core
+
+[![Rust](https://img.shields.io/badge/rust-1.77%2B-orange.svg)](https://www.rust-lang.org)
+[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
+[![Performance](https://img.shields.io/badge/latency-<0.5ms-green.svg)](../../docs/TECHNICAL_PLAN.md)
+
+**High-performance vector database and neural routing inference engine built in Rust.**
+
+Core engine powering Ruvector's intelligent request distribution, model selection, and sub-millisecond vector similarity search. Combines advanced indexing algorithms with SIMD-optimized distance calculations for maximum performance.
+
+## 🎯 Overview
+
+Router Core is the foundation of Ruvector's vector database capabilities, providing:
+
+- **Neural Routing**: Intelligent request distribution across multiple models and endpoints
+- **Vector Database**: High-performance storage and retrieval with HNSW indexing
+- **Model Selection**: Adaptive routing strategies for multi-model AI systems
+- **SIMD Acceleration**: Hardware-optimized vector operations via simsimd
+- **Memory Efficiency**: Advanced quantization techniques (4-32x compression)
+- **Zero Dependencies**: Pure Rust implementation with minimal external dependencies
+
+## ⚡ Key Features
+
+### Core Capabilities
+
+- **Sub-Millisecond Search**: <0.5ms p50 latency with HNSW indexing
+- **HNSW Indexing**: Hierarchical Navigable Small World for fast approximate nearest neighbor search
+- **Multiple Distance Metrics**: Euclidean, Cosine, Dot Product, Manhattan
+- **Advanced Quantization**: Scalar (4x), Product (8-16x), Binary (32x) compression
+- **SIMD Optimizations**: Hardware-accelerated distance calculations
+- **Zero-Copy I/O**: Memory-mapped files for efficient data access
+- **Thread-Safe**: Concurrent read/write operations with minimal locking
+- **Persistent Storage**: Durable vector storage with redb backend
+
+### Neural Routing Features
+
+- **Intelligent Request Distribution**: Route queries to optimal model endpoints
+- **Load Balancing**: Distribute workload across multiple inference servers
+- **Model Selection**: Automatically select best model based on query characteristics
+- **Adaptive Strategies**: Learn and optimize routing decisions over time
+- **Latency Optimization**: Minimize end-to-end inference time
+- **Failover Support**: Automatic fallback to backup endpoints
+
+## 📦 Installation
+
+Add to your `Cargo.toml`:
+
+```toml
+[dependencies]
+router-core = "0.1.0"
+```
+
+Or use the full ruvector package:
+
+```toml
+[dependencies]
+ruvector-core = "0.1.0"
+```
+
+## 🚀 Quick Start
+
+### Basic Vector Database
+
+```rust
+use router_core::{VectorDB, VectorEntry, SearchQuery, DistanceMetric};
+use std::collections::HashMap;
+
+// Create database with builder pattern
+let db = VectorDB::builder()
+    .dimensions(384)           // Vector dimensions
+    .distance_metric(DistanceMetric::Cosine)
+    .hnsw_m(32)               // HNSW connections per node
+    .hnsw_ef_construction(200) // Construction accuracy
+    .storage_path("./vectors.db")
+    .build()?;
+
+// Insert vectors
+let entry = VectorEntry {
+    id: "doc1".to_string(),
+    vector: vec![0.1; 384],
+    metadata: HashMap::new(),
+    timestamp: chrono::Utc::now().timestamp(),
+};
+
+db.insert(entry)?;
+
+// Search for similar vectors
+let query = SearchQuery {
+    vector: vec![0.1; 384],
+    k: 10,                     // Top 10 results
+    filters: None,
+    threshold: Some(0.8),      // Minimum similarity
+    ef_search: Some(100),      // Search accuracy
+};
+
+let results = db.search(query)?;
+for result in results {
+    println!("{}: {}", result.id, result.score);
+}
+```
+
+### Batch Operations
+
+```rust
+use router_core::{VectorDB, VectorEntry};
+
+// Insert multiple vectors efficiently
+let entries: Vec<VectorEntry> = (0..1000)
+    .map(|i| VectorEntry {
+        id: format!("doc{}", i),
+        vector: vec![0.1; 384],
+        metadata: HashMap::new(),
+        timestamp: chrono::Utc::now().timestamp(),
+    })
+    .collect();
+
+// Batch insert (much faster than individual inserts)
+db.insert_batch(entries)?;
+
+// Check statistics
+let stats = db.stats();
+println!("Total vectors: {}", stats.total_vectors);
+println!("Avg latency: {:.2}μs", stats.avg_query_latency_us);
+```
+
+### Advanced Configuration
+
+```rust
+use router_core::{VectorDB, DistanceMetric, QuantizationType};
+
+let db = VectorDB::builder()
+    .dimensions(768)                          // Larger embeddings
+    .max_elements(10_000_000)                 // 10M vectors
+    .distance_metric(DistanceMetric::Cosine)  // Cosine similarity
+    .hnsw_m(64)                               // More connections = higher recall
+    .hnsw_ef_construction(400)                // Higher accuracy during build
+    .hnsw_ef_search(200)                      // Search-time accuracy
+    .quantization(QuantizationType::Scalar)   // 4x memory compression
+    .mmap_vectors(true)                       // Memory-mapped storage
+    .storage_path("./large_db.redb")
+    .build()?;
+```
+
+## 🧠 Neural Routing Strategies
+
+Router Core supports multiple routing strategies for intelligent request distribution:
+
+### 1. **Round-Robin Routing**
+
+Simple load balancing across endpoints:
+
+```rust
+use router_core::routing::{Router, RoundRobinStrategy};
+
+let router = Router::new(RoundRobinStrategy::new(vec![
+    "http://model1:8080",
+    "http://model2:8080",
+    "http://model3:8080",
+]));
+
+let endpoint = router.select_endpoint(&query)?;
+```
+
+### 2. **Latency-Based Routing**
+
+Route to fastest available endpoint:
+
+```rust
+use router_core::routing::{Router, LatencyBasedStrategy};
+
+let router = Router::new(LatencyBasedStrategy::new(vec![
+    ("http://model1:8080", 50),  // 50ms avg latency
+    ("http://model2:8080", 30),  // 30ms avg latency (preferred)
+    ("http://model3:8080", 100), // 100ms avg latency
+]));
+```
+
+### 3. **Semantic Routing**
+
+Route based on query similarity to model specializations:
+
+```rust
+use router_core::routing::{Router, SemanticStrategy};
+
+// Define model specializations with example vectors
+let models = vec![
+    ("general-model", vec![0.1; 384]),  // General queries
+    ("code-model", vec![0.8, 0.2, ...]), // Code-related queries
+    ("math-model", vec![0.3, 0.9, ...]), // Math queries
+];
+
+let router = Router::new(SemanticStrategy::new(models));
+
+// Routes to most appropriate model based on query vector
+let endpoint = router.select_endpoint(&query_vector)?;
+```
+
+### 4. **Adaptive Routing**
+
+Learn optimal routing decisions over time:
+
+```rust
+use router_core::routing::{Router, AdaptiveStrategy};
+
+let mut router = Router::new(AdaptiveStrategy::new());
+
+// Router learns from feedback
+router.record_request(&query, &endpoint, latency, success)?;
+
+// Routing improves with more data
+let best_endpoint = router.select_endpoint(&query)?;
+```
+
+## 🎨 Distance Metrics
+
+Router Core supports multiple distance metrics with SIMD optimization:
+
+### Cosine Similarity
+
+Best for normalized embeddings (recommended for most AI applications):
+
+```rust
+use router_core::{DistanceMetric, distance::calculate_distance};
+
+let a = vec![1.0, 0.0, 0.0];
+let b = vec![0.9, 0.1, 0.0];
+
+let dist = calculate_distance(&a, &b, DistanceMetric::Cosine)?;
+// Returns 1 - cosine_similarity (0 = identical, 2 = opposite)
+```
+
+### Euclidean Distance (L2)
+
+Measures absolute geometric distance:
+
+```rust
+let dist = calculate_distance(&a, &b, DistanceMetric::Euclidean)?;
+// Returns sqrt(sum((a[i] - b[i])^2))
+```
+
+### Dot Product
+
+Fast similarity for pre-normalized vectors:
+
+```rust
+let dist = calculate_distance(&a, &b, DistanceMetric::DotProduct)?;
+// Returns -sum(a[i] * b[i]) (negated for distance)
+```
+
+### Manhattan Distance (L1)
+
+Sum of absolute differences:
+
+```rust
+let dist = calculate_distance(&a, &b, DistanceMetric::Manhattan)?;
+// Returns sum(|a[i] - b[i]|)
+```
+
+## 🗜️ Quantization Techniques
+
+Reduce memory usage with minimal accuracy loss:
+
+### Scalar Quantization (4x compression)
+
+Compress float32 to int8:
+
+```rust
+use router_core::{QuantizationType, VectorDB};
+
+let db = VectorDB::builder()
+    .dimensions(384)
+    .quantization(QuantizationType::Scalar)
+    .build()?;
+
+// Automatic quantization on insert
+// 384 dims × 4 bytes = 1536 bytes → 384 bytes + overhead
+```
+
+### Product Quantization (8-16x compression)
+
+Divide vector into subspaces and quantize independently:
+
+```rust
+let db = VectorDB::builder()
+    .dimensions(384)
+    .quantization(QuantizationType::Product {
+        subspaces: 8,    // Divide into 8 subspaces
+        k: 256,          // 256 centroids per subspace
+    })
+    .build()?;
+
+// 384 dims × 4 bytes = 1536 bytes → 8 bytes + overhead
+```
+
+### Binary Quantization (32x compression)
+
+Compress to 1 bit per dimension:
+
+```rust
+let db = VectorDB::builder()
+    .dimensions(384)
+    .quantization(QuantizationType::Binary)
+    .build()?;
+
+// 384 dims × 4 bytes = 1536 bytes → 48 bytes + overhead
+// Fast Hamming distance for similarity
+```
+
+### Compression Ratio Comparison
+
+```rust
+use router_core::quantization::calculate_compression_ratio;
+
+let dims = 384;
+
+let none_ratio = calculate_compression_ratio(dims, QuantizationType::None);
+// 1x - no compression
+
+let scalar_ratio = calculate_compression_ratio(dims, QuantizationType::Scalar);
+// ~4x compression
+
+let product_ratio = calculate_compression_ratio(
+    dims,
+    QuantizationType::Product { subspaces: 8, k: 256 }
+);
+// ~8-16x compression
+
+let binary_ratio = calculate_compression_ratio(dims, QuantizationType::Binary);
+// ~32x compression
+```
+
+## 📊 HNSW Index Configuration
+
+Tune the HNSW index for your performance/accuracy requirements:
+
+### M Parameter (Connections per Node)
+
+Controls graph connectivity and search accuracy:
+
+```rust
+// Low M = faster build, less memory, lower recall
+let db_fast = VectorDB::builder()
+    .hnsw_m(16)  // Minimal connections
+    .build()?;
+
+// Medium M = balanced (default)
+let db_balanced = VectorDB::builder()
+    .hnsw_m(32)  // Default setting
+    .build()?;
+
+// High M = slower build, more memory, higher recall
+let db_accurate = VectorDB::builder()
+    .hnsw_m(64)  // Maximum accuracy
+    .build()?;
+```
+
+### ef_construction (Build-Time Accuracy)
+
+Controls accuracy during index construction:
+
+```rust
+// Fast build, lower recall
+let db_fast = VectorDB::builder()
+    .hnsw_ef_construction(100)
+    .build()?;
+
+// Balanced (default)
+let db_balanced = VectorDB::builder()
+    .hnsw_ef_construction(200)
+    .build()?;
+
+// Slow build, maximum recall
+let db_accurate = VectorDB::builder()
+    .hnsw_ef_construction(400)
+    .build()?;
+```
+
+### ef_search (Query-Time Accuracy)
+
+Can be adjusted per query for dynamic performance/accuracy tradeoff:
+
+```rust
+// Fast search, lower recall
+let query_fast = SearchQuery {
+    vector: query_vec,
+    k: 10,
+    ef_search: Some(50),  // Override default
+    ..Default::default()
+};
+
+// Accurate search
+let query_accurate = SearchQuery {
+    vector: query_vec,
+    k: 10,
+    ef_search: Some(200),  // Higher accuracy
+    ..Default::default()
+};
+```
+
+## 🎯 Use Cases
+
+### Multi-Model AI Systems
+
+Route queries to specialized models based on content:
+
+```rust
+// Route code questions to code model, math to math model, etc.
+let router = SemanticRouter::new(vec![
+    ("gpt-4-code", code_specialization_vector),
+    ("gpt-4-math", math_specialization_vector),
+    ("gpt-4-general", general_specialization_vector),
+]);
+
+let best_model = router.route(&user_query_embedding)?;
+```
+
+### Load Balancing
+
+Distribute inference load across multiple servers:
+
+```rust
+// Balance load across 10 GPU servers
+let router = LoadBalancer::new(vec![
+    "gpu-0.internal:8080",
+    "gpu-1.internal:8080",
+    // ... gpu-9
+]);
+
+let endpoint = router.next_endpoint()?;
+```
+
+### RAG (Retrieval-Augmented Generation)
+
+Fast context retrieval for LLMs:
+
+```rust
+// Store document embeddings
+for doc in documents {
+    let embedding = embed_model.encode(&doc.text)?;
+    db.insert(VectorEntry {
+        id: doc.id,
+        vector: embedding,
+        metadata: doc.metadata,
+        timestamp: now(),
+    })?;
+}
+
+// Retrieve relevant context for query
+let query_embedding = embed_model.encode(&user_query)?;
+let context_docs = db.search(SearchQuery {
+    vector: query_embedding,
+    k: 5,  // Top 5 most relevant
+    threshold: Some(0.7),
+    ..Default::default()
+})?;
+```
+
+### Semantic Search
+
+Build intelligent search engines:
+
+```rust
+// Index product catalog
+for product in catalog {
+    let embedding = encode_product(&product)?;
+    db.insert(VectorEntry {
+        id: product.sku,
+        vector: embedding,
+        metadata: product.to_metadata(),
+        timestamp: now(),
+    })?;
+}
+
+// Search by natural language
+let search_embedding = encode_query("comfortable running shoes")?;
+let results = db.search(SearchQuery {
+    vector: search_embedding,
+    k: 20,
+    filters: Some(HashMap::from([
+        ("category", "footwear"),
+        ("in_stock", true),
+    ])),
+    ..Default::default()
+})?;
+```
+
+### Agent Memory Systems
+
+Store and retrieve agent experiences:
+
+```rust
+// Store agent observations
+struct AgentMemory {
+    db: VectorDB,
+}
+
+impl AgentMemory {
+    pub fn remember(&self, observation: &str, context: Vec<f32>) -> Result<()> {
+        self.db.insert(VectorEntry {
+            id: uuid::Uuid::new_v4().to_string(),
+            vector: context,
+            metadata: HashMap::from([
+                ("observation", observation.into()),
+                ("timestamp", now().into()),
+            ]),
+            timestamp: now(),
+        })
+    }
+
+    pub fn recall(&self, query_context: Vec<f32>, k: usize) -> Result<Vec<String>> {
+        let results = self.db.search(SearchQuery {
+            vector: query_context,
+            k,
+            ..Default::default()
+        })?;
+
+        Ok(results.iter()
+            .filter_map(|r| r.metadata.get("observation"))
+            .map(|v| v.as_str().unwrap().to_string())
+            .collect())
+    }
+}
+```
+
+## 🔧 Configuration Guide
+
+### Optimizing for Different Workloads
+
+#### High Throughput (Batch Processing)
+
+```rust
+let db = VectorDB::builder()
+    .dimensions(384)
+    .hnsw_m(16)                  // Lower M for faster queries
+    .hnsw_ef_construction(100)   // Faster build
+    .hnsw_ef_search(50)          // Lower default search accuracy
+    .quantization(QuantizationType::Scalar)  // Compress for speed
+    .mmap_vectors(true)          // Reduce memory pressure
+    .build()?;
+```
+
+#### High Accuracy (Research/Analysis)
+
+```rust
+let db = VectorDB::builder()
+    .dimensions(768)
+    .hnsw_m(64)                  // Maximum connections
+    .hnsw_ef_construction(400)   // High build accuracy
+    .hnsw_ef_search(200)         // High search accuracy
+    .quantization(QuantizationType::None)  // No compression
+    .build()?;
+```
+
+#### Memory Constrained (Edge Devices)
+
+```rust
+let db = VectorDB::builder()
+    .dimensions(256)             // Smaller embeddings
+    .max_elements(100_000)       // Limit dataset size
+    .hnsw_m(16)                  // Fewer connections
+    .quantization(QuantizationType::Binary)  // 32x compression
+    .mmap_vectors(true)          // Use disk instead of RAM
+    .build()?;
+```
+
+#### Balanced (Production Default)
+
+```rust
+let db = VectorDB::builder()
+    .dimensions(384)
+    .hnsw_m(32)
+    .hnsw_ef_construction(200)
+    .hnsw_ef_search(100)
+    .quantization(QuantizationType::Scalar)
+    .mmap_vectors(true)
+    .build()?;
+```
+
+## 📈 Performance Characteristics
+
+### Latency Benchmarks
+
+```
+Configuration          Query Latency (p50)    Recall@10
+─────────────────────────────────────────────────────────
+Uncompressed, M=64     0.3ms                  98.5%
+Scalar Quant, M=32     0.4ms                  96.2%
+Product Quant, M=32    0.5ms                  94.8%
+Binary Quant, M=16     0.6ms                  91.3%
+```
+
+### Memory Usage (1M vectors @ 384 dims)
+
+```
+Quantization           Memory Usage    Compression Ratio
+───────────────────────────────────────────────────────
+None (float32)         1536 MB         1x
+Scalar (int8)          392 MB          3.9x
+Product (8 subspaces)  120 MB          12.8x
+Binary (1 bit/dim)     52 MB           29.5x
+```
+
+### Throughput (1M vectors)
+
+```
+Operation              Throughput      Notes
+─────────────────────────────────────────────────────────
+Single Insert          ~100K/sec       Sequential
+Batch Insert           ~500K/sec       Parallel (rayon)
+Query (k=10)           ~50K QPS        ef_search=100
+Query (k=100)          ~20K QPS        ef_search=100
+```
+
+## 🏗️ Integration with Vector Database
+
+Router Core integrates seamlessly with the main Ruvector database:
+
+```rust
+use ruvector_core::VectorDB as MainDB;
+use router_core::VectorDB as RouterDB;
+
+// Use router-core for specialized routing logic
+let router_db = RouterDB::builder()
+    .dimensions(384)
+    .build()?;
+
+// Or use main ruvector-core for full features
+let main_db = MainDB::builder()
+    .dimensions(384)
+    .build()?;
+
+// Both share the same API!
+```
+
+## 🧪 Building and Testing
+
+### Build
+
+```bash
+# Build library
+cargo build --release -p router-core
+
+# Build with all features
+cargo build --release -p router-core --all-features
+
+# Build static library
+cargo build --release -p router-core --lib
+```
+
+### Test
+
+```bash
+# Run all tests
+cargo test -p router-core
+
+# Run specific test
+cargo test -p router-core test_hnsw_insert_and_search
+
+# Run with logging
+RUST_LOG=debug cargo test -p router-core
+```
+
+### Benchmark
+
+```bash
+# Run benchmarks
+cargo bench -p router-core
+
+# Run specific benchmark
+cargo bench -p router-core --bench vector_search
+
+# With criterion output
+cargo bench -p router-core -- --output-format verbose
+```
+
+## 📚 API Documentation
+
+### Core Types
+
+- **`VectorDB`**: Main database interface
+- **`VectorEntry`**: Vector with ID, data, and metadata
+- **`SearchQuery`**: Query parameters for similarity search
+- **`SearchResult`**: Search result with ID, score, and metadata
+- **`DistanceMetric`**: Enum for distance calculation methods
+- **`QuantizationType`**: Enum for compression methods
+
+### Key Methods
+
+```rust
+// VectorDB
+pub fn new(config: VectorDbConfig) -> Result<Self>
+pub fn builder() -> VectorDbBuilder
+pub fn insert(&self, entry: VectorEntry) -> Result<String>
+pub fn insert_batch(&self, entries: Vec<VectorEntry>) -> Result<Vec<String>>
+pub fn search(&self, query: SearchQuery) -> Result<Vec<SearchResult>>
+pub fn delete(&self, id: &str) -> Result<bool>
+pub fn get(&self, id: &str) -> Result<Option<VectorEntry>>
+pub fn stats(&self) -> VectorDbStats
+pub fn count(&self) -> Result<usize>
+
+// Distance calculations
+pub fn calculate_distance(a: &[f32], b: &[f32], metric: DistanceMetric) -> Result<f32>
+pub fn batch_distance(query: &[f32], vectors: &[Vec<f32>], metric: DistanceMetric) -> Result<Vec<f32>>
+
+// Quantization
+pub fn quantize(vector: &[f32], qtype: QuantizationType) -> Result<QuantizedVector>
+pub fn dequantize(quantized: &QuantizedVector) -> Vec<f32>
+pub fn calculate_compression_ratio(original_dims: usize, qtype: QuantizationType) -> f32
+```
+
+## 🔗 Links
+
+- **Main Repository**: [github.com/ruvnet/ruvector](https://github.com/ruvnet/ruvector)
+- **Documentation**: [docs/README.md](../../docs/README.md)
+- **API Reference**: [docs/api/RUST_API.md](../../docs/api/RUST_API.md)
+- **Performance Guide**: [docs/optimization/PERFORMANCE_TUNING_GUIDE.md](../../docs/optimization/PERFORMANCE_TUNING_GUIDE.md)
+- **Examples**: [examples/](../../examples/)
+
+## 📊 Related Crates
+
+- **`ruvector-core`**: Full-featured vector database (superset of router-core)
+- **`ruvector-node`**: Node.js bindings via NAPI-RS
+- **`ruvector-wasm`**: WebAssembly bindings for browsers
+- **`router-cli`**: Command-line interface for router operations
+- **`router-ffi`**: Foreign function interface for C/C++
+- **`router-wasm`**: WebAssembly bindings for router
+
+## 🤝 Contributing
+
+Contributions are welcome! Please see:
+
+- **[Contributing Guidelines](../../docs/development/CONTRIBUTING.md)**
+- **[Development Guide](../../docs/development/MIGRATION.md)**
+- **[Code of Conduct](../../CODE_OF_CONDUCT.md)**
+
+## 📜 License
+
+MIT License - see [LICENSE](../../LICENSE) for details.
+
+## 🙏 Acknowledgments
+
+Built with battle-tested technologies:
+
+- **HNSW**: Hierarchical Navigable Small World algorithm
+- **Product Quantization**: Memory-efficient vector compression
+- **simsimd**: SIMD-accelerated similarity computations
+- **redb**: Embedded database for persistent storage
+- **rayon**: Data parallelism for batch operations
+- **parking_lot**: High-performance synchronization primitives
+
+---
+
+<div align="center">
+
+**Part of the [Ruvector](https://github.com/ruvnet/ruvector) ecosystem**
+
+Built by [rUv](https://ruv.io) • Production Ready • MIT Licensed
+
+[Documentation](../../docs/README.md) • [API Reference](../../docs/api/RUST_API.md) • [Examples](../../examples/) • [Benchmarks](../../benchmarks/)
+
+</div>