# Architecture Analysis: Sublinear-Time Solver Integration with ruvector

**Agent**: 5 -- Architecture & System Design
**Date**: 2026-02-20
**Status**: Complete
**Scope**: Full-stack architectural mapping, compatibility analysis, and integration strategy

---

## Table of Contents

1. [ruvector's Current Architecture Patterns](#1-ruvectors-current-architecture-patterns)
2. [Architectural Compatibility with Sublinear-Time Solver](#2-architectural-compatibility-with-sublinear-time-solver)
3. [Layered Integration Strategy (Rust -> WASM -> JS -> API)](#3-layered-integration-strategy)
4. [Module Boundary Recommendations](#4-module-boundary-recommendations)
5. [Dependency Injection Points](#5-dependency-injection-points)
6. [Event-Driven Integration Patterns](#6-event-driven-integration-patterns)
7. [Performance Architecture Considerations](#7-performance-architecture-considerations)

---

## 1. ruvector's Current Architecture Patterns

### 1.1 Macro-Architecture: Rust Workspace Monorepo

ruvector is organized as a Cargo workspace monorepo with more than 75 crates under
`/crates`. The workspace configuration in `Cargo.toml` lists roughly 100 workspace members
spanning core database functionality, mathematical engines, neural systems, governance layers,
and multiple deployment targets.

**Topology**: The codebase follows a layered architecture with a clear separation between
computational cores and their platform bindings:

```
Layer 0: Mathematical Foundations
  ruvector-math, ruvector-mincut, ruqu-core, ruqu-algorithms

Layer 1: Core Engines
  ruvector-core, ruvector-graph, ruvector-dag, ruvector-sparse-inference,
  prime-radiant, sona, cognitum-gate-kernel, cognitum-gate-tilezero

Layer 2: Platform Bindings
  *-wasm crates (wasm-bindgen), *-node crates (NAPI-RS), *-ffi crates

Layer 3: Integration Services
  ruvector-server (axum REST), mcp-gate (MCP/JSON-RPC), ruvector-cli (clap)

Layer 4: Distribution & Orchestration
  ruvector-cluster, ruvector-raft, ruvector-replication, ruvector-delta-consensus
```

### 1.2 The Core-Binding-Surface Pattern

Every major subsystem in ruvector follows a consistent three-part decomposition:

| Component | Purpose | Example |
|-----------|---------|---------|
| **Core** (pure Rust) | Algorithms, data structures, business logic | `ruvector-core`, `ruvector-graph`, `ruvector-math` |
| **WASM binding** | Browser/edge deployment via `wasm-bindgen` | `ruvector-wasm`, `ruvector-graph-wasm`, `ruvector-math-wasm` |
| **Node binding** | Server-side deployment via NAPI-RS | `ruvector-node`, `ruvector-graph-node`, `ruvector-gnn-node` |

This pattern is the primary architectural convention in ruvector. It appears in at least
15 subsystems: core, graph, GNN, attention, mincut, DAG, sparse-inference, math,
domain-expansion, economy, exotic, learning, nervous-system, tiny-dancer, and the
prime-radiant advanced WASM.

Key characteristics observed in the codebase:

- **Pure Rust cores** use `no_std`-compatible patterns where possible, avoiding I/O and
  platform-specific code.
- **WASM crates** wrap core types in `#[wasm_bindgen]`-annotated structs with `JsValue`
  serialization via `serde_wasm_bindgen`. They handle browser-specific concerns like
  IndexedDB persistence, Web Worker pool management, and Float32Array interop.
- **Node crates** use `#[napi]` macros with `tokio::task::spawn_blocking` for async I/O,
  leveraging zero-copy `Float32Array` buffers through NAPI-RS.

### 1.3 Dependency Management Strategy

The workspace `Cargo.toml` centralizes all shared dependencies. Critical shared dependencies
relevant to the sublinear-time solver integration:

- **Linear algebra**: `ndarray 0.16` (ruvector-math uses this extensively)
- **Numerics**: `rand 0.8`, `rand_distr 0.4`
- **WASM**: `wasm-bindgen 0.2`, `js-sys 0.3`, `web-sys 0.3`
- **Node.js**: `napi 2.16`, `napi-derive 2.16`
- **Async**: `tokio 1.41` (multi-thread runtime), `futures 0.3`
- **SIMD**: `simsimd 5.9` (distance calculations)
- **Serialization**: `serde 1.0`, `rkyv 0.8`, `bincode 2.0.0-rc.3`
- **Concurrency**: `rayon 1.10`, `crossbeam 0.8`, `dashmap 6.1`, `parking_lot 0.12`

Notable **absence**: `nalgebra` is not currently a workspace dependency. The sublinear-time
solver uses `nalgebra` as its linear algebra backend. This is a significant compatibility
consideration (analyzed in Section 2).

### 1.4 Feature Flag Architecture

ruvector makes extensive use of Cargo feature flags for conditional compilation:

- `storage` / `storage-memory`: Toggle between REDB-backed and in-memory storage
- `parallel`: Enables lock-free structures and rayon parallelism (disabled on `wasm32`)
- `collections`: Multi-collection support (requires file I/O, so conditionally excluded in WASM)
- `kernel-pack`: ADR-005 compliant secure WASM kernel execution
- `full`: Enables async-dependent modules (healing, qudag, sona) in the DAG crate
- `api-embeddings` / `real-embeddings`: External embedding model support

### 1.5 Event Sourcing and Domain Events

The `prime-radiant` crate implements a comprehensive event sourcing pattern through its
`events.rs` module. Domain events are defined as a tagged enum (`DomainEvent`) covering:

- Substrate events (NodeCreated, NodeUpdated, NodeRemoved, EdgeCreated, EdgeRemoved)
- Coherence computation events (energy calculations, residual updates)
- Governance events (policy changes, witness records)

Events are serialized with `serde` using `#[serde(tag = "type")]` for deterministic replay
and tamper detection via content hashes. This aligns well with the sublinear-time solver's
potential need for computation provenance tracking.

### 1.6 MCP Integration Pattern

The `mcp-gate` crate provides a Model Context Protocol server using JSON-RPC 2.0 over stdio.
Tools are defined declaratively with JSON Schema input specifications. The architecture uses
`Arc<RwLock<TileZero>>` for shared state with the coherence gate engine. This existing MCP
infrastructure provides a natural extension point for exposing solver capabilities to AI agents.

### 1.7 Server Architecture

`ruvector-server` uses `axum` with tower middleware layers (compression, CORS, tracing).
Routes are modular (health, collections, points). The server shares application state via
`AppState` and uses the standard Rust web service pattern with `Router` composition.

---

## 2. Architectural Compatibility with Sublinear-Time Solver

### 2.1 Structural Alignment Matrix

| Solver Component | ruvector Equivalent | Compatibility | Notes |
|------------------|---------------------|---------------|-------|
| Rust core library (`sublinear_solver`) | `ruvector-core`, `ruvector-math` | **HIGH** | Both are pure Rust crates with algorithm-focused design |
| WASM layer (`wasm-bindgen`) | `ruvector-wasm`, `*-wasm` crates | **HIGH** | Identical binding technology, identical patterns |
| JS bridge (`solver.js`, etc.) | `npm/core/src/index.ts` | **HIGH** | Both provide platform-detection loaders and typed APIs |
| Express server | `ruvector-server` (axum) | **MEDIUM** | Different frameworks (Express vs axum) but compatible at API level |
| MCP integration (40+ tools) | `mcp-gate` (3 tools) | **HIGH** | Same protocol, ruvector has established patterns |
| CLI (NPX) | `ruvector-cli` (clap) | **MEDIUM** | Different CLI paradigms; ruvector uses native Rust CLI |
| TypeScript types | `npm/core/src/index.ts` | **HIGH** | ruvector already publishes TypeScript definitions |
| 9 workspace crates | ~75+ workspace crates | **HIGH** | Same Cargo workspace model |

### 2.2 Linear Algebra Backend Divergence

**This is the single most significant architectural tension.**

- **Sublinear-time solver**: Uses `nalgebra` for matrix operations, linear algebra, and
  numerical computation.
- **ruvector**: Uses `ndarray 0.16` in `ruvector-math` and raw `Vec<f32>` with SIMD intrinsics
  in `ruvector-core`.

**Resolution strategy**: Introduce `nalgebra` as a workspace dependency and create an
adapter layer. The two libraries can coexist. The adapter should provide zero-cost conversions
between `nalgebra::DMatrix<f32>` and `ndarray::Array2<f32>` views using shared memory backing.
Specifically:

```rust
// Proposed adapter in crates/ruvector-math/src/nalgebra_bridge.rs
use nalgebra::DMatrix;
use ndarray::{ArrayView2, ShapeBuilder};

/// Zero-copy view of a nalgebra DMatrix as an ndarray ArrayView2.
///
/// `.f()` requests column-major (Fortran) strides, so element (i, j) of the
/// view is element (i, j) of the matrix.
pub fn dmatrix_to_ndarray_view(m: &DMatrix<f32>) -> ArrayView2<'_, f32> {
    let (rows, cols) = m.shape();
    let slice = m.as_slice();
    ArrayView2::from_shape((rows, cols).f(), slice)
        .expect("nalgebra DMatrix is always contiguous column-major")
}
```

Note: `nalgebra` uses column-major storage while `ndarray` defaults to row-major. The adapter
must therefore build the view with column-major strides (as `.f()` does here), or build it over
the `(cols, rows)` shape and call `.reversed_axes()`. Reading the slice with default row-major
strides would place elements at the wrong indices.
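
To make the layout difference concrete, here is a minimal std-only sketch. It does not use `nalgebra` or `ndarray`; the helper names (`get_col_major`, `get_row_major`, `col_major_to_row_major`) are illustrative assumptions, not part of either library or of ruvector.

```rust
/// Read element (r, c) from a column-major buffer (the layout nalgebra uses).
fn get_col_major(data: &[f32], rows: usize, r: usize, c: usize) -> f32 {
    data[c * rows + r]
}

/// Read element (r, c) from a row-major buffer (ndarray's default layout).
fn get_row_major(data: &[f32], cols: usize, r: usize, c: usize) -> f32 {
    data[r * cols + c]
}

/// Copy a column-major buffer into a freshly allocated row-major buffer.
/// This is the copying fallback; a strided view avoids the allocation.
fn col_major_to_row_major(data: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(data.len(), rows * cols, "buffer length must match shape");
    let mut out = Vec::with_capacity(data.len());
    for r in 0..rows {
        for c in 0..cols {
            out.push(get_col_major(data, rows, r, c));
        }
    }
    out
}
```

For the 2x3 matrix `[[1, 2, 3], [4, 5, 6]]`, the column-major buffer is `[1, 4, 2, 5, 3, 6]`. Indexing that buffer with row-major arithmetic reads `4` at position (0, 1) instead of `2`, which is the failure mode the adapter has to rule out.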

### 2.3 Server Framework Compatibility

The sublinear-time solver uses Express.js with session management and streaming. ruvector
uses axum (Rust). These are not in conflict because they serve different layers:

- **Solver Express server**: JS-level API for browser and Node clients, session management,
  streaming results.
- **ruvector axum server**: Rust-level REST API for database operations.

The integration can either keep the solver's Express functionality as a separate API surface
or, preferably, expose solver endpoints through axum with the same streaming semantics using
axum's SSE (Server-Sent Events) or WebSocket support.

### 2.4 WASM Compilation Target Compatibility

Both projects target `wasm32-unknown-unknown` via `wasm-bindgen`. ruvector already manages
the WASM-specific constraints:

- No `std::fs`, `std::net` in WASM builds
- `parking_lot::Mutex` in place of `std::sync::Mutex` (chosen because it does not panic on web)
- `getrandom` with `wasm_js` feature for random number generation
- Console error panic hooks for debugging

The sublinear-time solver's WASM layer should be able to reuse these patterns directly. The
existing `ruvector-wasm` crate demonstrates the complete pattern including IndexedDB persistence,
Web Worker pools, Float32Array interop, and SIMD detection.

---

## 3. Layered Integration Strategy

### 3.1 Layer Architecture Overview

```
+===========================================================================+
|                           APPLICATION CONSUMERS                           |
|    MCP Agents | REST Clients | Browser Apps | CLI Users | Edge Devices    |
+===========================================================================+
         |             |              |             |            |
+===========================================================================+
|                           API SURFACE (Layer 4)                           |
|    mcp-gate         | ruvector-server | solver-server | ruvector-cli      |
|    (JSON-RPC/stdio) | (axum REST)     | (axum SSE)    | (clap binary)     |
+===========================================================================+
            |                  |                |               |
+===========================================================================+
|                          JS/TS BRIDGE (Layer 3)                           |
|          npm/core/index.ts | solver-bridge.ts | solver-worker.ts          |
|          Platform detection, typed wrappers, async coordination           |
+===========================================================================+
                   |                  |                  |
+===========================================================================+
|                          WASM SURFACE (Layer 2)                           |
|         ruvector-wasm | ruvector-solver-wasm | ruvector-math-wasm         |
|            wasm-bindgen, Float32Array, Web Workers, IndexedDB             |
+===========================================================================+
                         |                         |
+===========================================================================+
|                            RUST CORE (Layer 1)                            |
|      ruvector-core | ruvector-solver | ruvector-math | ruvector-dag       |
|              Pure algorithms, nalgebra/ndarray, SIMD, rayon               |
+===========================================================================+
                                      |
+===========================================================================+
|                         MATH FOUNDATION (Layer 0)                         |
|         nalgebra | ndarray | simsimd | ndarray-linalg (optional)          |
+===========================================================================+
```

### 3.2 Layer 0 -> Layer 1: Rust Core Integration

**New crate**: `crates/ruvector-solver` (or `crates/sublinear-solver` if preserving the
upstream name is preferred).

Structure:

```
crates/ruvector-solver/
  Cargo.toml
  src/
    lib.rs              # Public API: traits, types, re-exports
    algorithms/
      mod.rs            # Algorithm registry
      bmssp.rs          # Bounded Multi-Source Shortest Path solver
      fast.rs           # Fast solver variants
      sublinear.rs      # Core sublinear-time algorithms
    backend/
      mod.rs            # Backend abstraction
      nalgebra.rs       # nalgebra-backed implementation
      ndarray.rs        # ndarray bridge for ruvector interop
    config.rs           # Solver configuration
    error.rs            # Error types
    types.rs            # Core domain types (matrices, results, bounds)
```

Integration points with existing ruvector crates:

- **`ruvector-math`**: The solver's mathematical operations (optimal transport, spectral
  methods, tropical algebra) overlap with `ruvector-math`. Common abstractions should be
  extracted into shared traits.
- **`ruvector-dag`**: Sublinear graph algorithms can be applied to DAG bottleneck analysis.
  The `DagMinCutEngine` already performs bottleneck detection that the codebase describes as
  subpolynomial, O(n^0.12); solver algorithms could provide alternative or improved
  implementations.
- **`ruvector-sparse-inference`**: Sparse matrix operations and activation-locality patterns
  in the inference engine are natural consumers of sublinear-time solvers.

### 3.3 Layer 1 -> Layer 2: WASM Compilation

**New crate**: `crates/ruvector-solver-wasm`

This follows the established ruvector pattern exactly:

```rust
// crates/ruvector-solver-wasm/src/lib.rs
use js_sys::Float32Array;
use wasm_bindgen::prelude::*;
use ruvector_solver::{SublinearSolver, SolverConfig};

/// Install the panic hook once when the module is instantiated.
#[wasm_bindgen(start)]
pub fn init() {
    console_error_panic_hook::set_once();
}

#[wasm_bindgen]
pub struct JsSolver {
    inner: SublinearSolver,
}

#[wasm_bindgen]
impl JsSolver {
    /// Build a solver from a plain JS configuration object.
    #[wasm_bindgen(constructor)]
    pub fn new(config: JsValue) -> Result<JsSolver, JsValue> {
        let config: SolverConfig = serde_wasm_bindgen::from_value(config)?;
        let solver = SublinearSolver::new(config)
            .map_err(|e| JsValue::from_str(&e.to_string()))?;
        Ok(JsSolver { inner: solver })
    }

    /// Copy the input out of JS memory, solve, and serialize the result.
    pub fn solve(&self, input: Float32Array) -> Result<JsValue, JsValue> {
        let data = input.to_vec();
        let result = self.inner.solve(&data)
            .map_err(|e| JsValue::from_str(&e.to_string()))?;
        serde_wasm_bindgen::to_value(&result)
            .map_err(|e| JsValue::from_str(&e.to_string()))
    }
}
```

Critical WASM considerations:

1. **nalgebra WASM compatibility**: `nalgebra` compiles to WASM without issues. Ensure
   `default-features = false` if the `std` feature pulls in incompatible dependencies.
2. **Memory limits**: WASM linear memory is limited (default 256 pages of 64 KiB = 16 MiB).
   Sublinear algorithms are inherently memory-efficient, which is an advantage. However, large
   matrix operations may need chunked processing.
3. **No threads by default**: WASM does not support `std::thread`. Use the existing
   `worker-pool.js` and `worker.js` patterns from `ruvector-wasm` for parallelism.
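
The memory-limit arithmetic and the chunked-processing idea can be sketched as follows. This is an illustrative std-only example: `max_chunk_len` and `budgeted_chunks` are hypothetical helpers, and the assumption that each chunk needs a fixed number of same-sized `f32` working buffers is a simplification rather than a property of the solver.

```rust
/// Size of one WebAssembly linear-memory page in bytes (64 KiB).
const WASM_PAGE_BYTES: usize = 64 * 1024;

/// Convert a page count into bytes.
fn pages_to_bytes(pages: usize) -> usize {
    pages * WASM_PAGE_BYTES
}

/// Largest number of f32 elements per chunk that fits in `budget_bytes`,
/// assuming each chunk needs `buffers_per_chunk` working buffers of that length.
fn max_chunk_len(budget_bytes: usize, buffers_per_chunk: usize) -> usize {
    budget_bytes / (buffers_per_chunk.max(1) * std::mem::size_of::<f32>())
}

/// Split `input` into chunks that respect the memory budget.
/// A chunk always holds at least one element so the iterator makes progress.
fn budgeted_chunks(
    input: &[f32],
    budget_bytes: usize,
    buffers_per_chunk: usize,
) -> std::slice::Chunks<'_, f32> {
    let len = max_chunk_len(budget_bytes, buffers_per_chunk).max(1);
    input.chunks(len)
}
```

With 256 pages, `pages_to_bytes` gives 16 MiB; a 4 MiB solver budget with two working buffers per chunk allows chunks of 524,288 `f32` values.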

### 3.4 Layer 2 -> Layer 3: JavaScript Bridge

**New package**: `npm/solver/` (or extension of `npm/core/`)

```typescript
// npm/solver/src/index.ts
// The WASM crate exports its wrapper struct as `JsSolver`.
import { JsSolver as WasmSolver } from '../pkg/ruvector_solver_wasm';
// Assumed helper module, modeled on ruvector-wasm's worker-pool.js pattern.
import { workerPool } from './worker-pool';

export interface SolverConfig {
  algorithm: 'bmssp' | 'fast' | 'sublinear';
  tolerance?: number;
  maxIterations?: number;
  dimensions?: number;
}

export interface SolverResult {
  solution: Float32Array;
  iterations: number;
  converged: boolean;
  residualNorm: number;
  wallTimeMs: number;
}

export class SublinearSolver {
  private inner: WasmSolver;

  // Keep the config so solveAsync can forward it to the worker.
  constructor(private readonly config: SolverConfig) {
    this.inner = new WasmSolver(config);
  }

  solve(input: Float32Array): SolverResult {
    return this.inner.solve(input);
  }

  async solveAsync(input: Float32Array): Promise<SolverResult> {
    // Offload to Web Worker for non-blocking execution
    return workerPool.dispatch('solve', { input, config: this.config });
  }
}
```

### 3.5 Layer 3 -> Layer 4: API Surface

For the axum-based server integration, add a new route module:

```rust
// crates/ruvector-server/src/routes/solver.rs
use axum::{
    extract::State,
    response::sse::Event,
    routing::{get, post},
    Json, Router,
};
use ruvector_solver::{SublinearSolver, SolverConfig};

// The handlers (`solve`, `solve_stream`, `get_config`, `update_config`) and
// `AppState` are assumed to be defined elsewhere in the server crate.
pub fn routes() -> Router<AppState> {
    Router::new()
        .route("/solver/solve", post(solve))
        .route("/solver/solve/stream", post(solve_stream))
        .route("/solver/config", get(get_config).put(update_config))
}
```

For the MCP integration, add new tools to `mcp-gate`:

```rust
McpTool {
    name: "solve_sublinear".to_string(),
    description: "Execute a sublinear-time solver on the provided input data".to_string(),
    input_schema: serde_json::json!({
        "type": "object",
        "properties": {
            "algorithm": { "type": "string", "enum": ["bmssp", "fast", "sublinear"] },
            "input": { "type": "array", "items": { "type": "number" } },
            "tolerance": { "type": "number", "default": 1e-6 }
        },
        "required": ["algorithm", "input"]
    }),
}
```

---

## 4. Module Boundary Recommendations

### 4.1 Boundary Principles

The following boundaries should be enforced through Cargo crate visibility and trait-based
abstraction:

```
                    PUBLIC API BOUNDARY
                    ===================
                             |
              +--------------+--------------+
              |                             |
      Solver Core Trait            ruvector Core Trait
       (SolverEngine)           (VectorDB, SearchEngine)
              |                             |
       +------+------+              +-------+------+
       |      |      |              |       |      |
     BMSSP  Fast  Sublin          HNSW    Graph   DAG
```

### 4.2 Recommended Trait Boundaries

**Solver engine trait** (new, in `ruvector-solver`):

```rust
pub trait SolverEngine: Send + Sync {
    type Input;
    type Output;
    type Error: std::error::Error;

    fn solve(&self, input: &Self::Input) -> Result<Self::Output, Self::Error>;
    fn solve_with_budget(
        &self,
        input: &Self::Input,
        budget: ComputeBudget,
    ) -> Result<Self::Output, Self::Error>;
    fn estimate_complexity(&self, input: &Self::Input) -> ComplexityEstimate;
}
```

**Numeric backend trait** (new, in `ruvector-math` or `ruvector-solver`):

```rust
pub trait NumericBackend: Send + Sync {
    type Matrix;
    type Vector;

    fn mat_mul(&self, a: &Self::Matrix, b: &Self::Matrix) -> Self::Matrix;
    fn svd(&self, m: &Self::Matrix) -> (Self::Matrix, Self::Vector, Self::Matrix);
    fn eigenvalues(&self, m: &Self::Matrix) -> Self::Vector;
    fn norm(&self, v: &Self::Vector) -> f64;
}
```

This trait allows the solver to abstract over `nalgebra` and `ndarray` backends, and also
enables future GPU-accelerated backends (the `prime-radiant` crate already has a GPU module
with buffer management and kernel dispatch).
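
As a rough illustration of how a backend plugs into such a trait, the sketch below implements a reduced version of `NumericBackend` (only `mat_mul` and `norm`; `svd` and `eigenvalues` are omitted for brevity) over plain `Vec<f64>` storage. `DenseMatrix`, `PlainBackend`, and `squared` are hypothetical names introduced for this example; a real backend would delegate to `nalgebra` or `ndarray`.

```rust
/// Reduced backend trait for illustration: two of the four proposed methods.
pub trait NumericBackend: Send + Sync {
    type Matrix;
    type Vector;

    fn mat_mul(&self, a: &Self::Matrix, b: &Self::Matrix) -> Self::Matrix;
    fn norm(&self, v: &Self::Vector) -> f64;
}

/// Dense row-major matrix used by the illustrative backend.
#[derive(Debug, Clone, PartialEq)]
pub struct DenseMatrix {
    pub rows: usize,
    pub cols: usize,
    pub data: Vec<f64>,
}

/// Backend with no external dependencies (hypothetical).
pub struct PlainBackend;

impl NumericBackend for PlainBackend {
    type Matrix = DenseMatrix;
    type Vector = Vec<f64>;

    /// Naive triple-loop product in i-k-j order for sequential access to `b`.
    fn mat_mul(&self, a: &Self::Matrix, b: &Self::Matrix) -> Self::Matrix {
        assert_eq!(a.cols, b.rows, "inner dimensions must agree");
        let mut data = vec![0.0; a.rows * b.cols];
        for i in 0..a.rows {
            for k in 0..a.cols {
                let aik = a.data[i * a.cols + k];
                for j in 0..b.cols {
                    data[i * b.cols + j] += aik * b.data[k * b.cols + j];
                }
            }
        }
        DenseMatrix { rows: a.rows, cols: b.cols, data }
    }

    /// Euclidean norm.
    fn norm(&self, v: &Self::Vector) -> f64 {
        v.iter().map(|x| x * x).sum::<f64>().sqrt()
    }
}

/// Generic code depends only on the trait, never on a concrete library.
pub fn squared<B: NumericBackend>(backend: &B, m: &B::Matrix) -> B::Matrix {
    backend.mat_mul(m, m)
}
```

Because `squared` is written against the trait, swapping `PlainBackend` for a `nalgebra`-backed or GPU-backed implementation would not change the calling code.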

### 4.3 Crate Dependency Graph (Proposed)

```
ruvector-solver-wasm -----> ruvector-solver -----> ruvector-math
        |                        |                      |
        |                        |                      +---> nalgebra (new dep)
        |                        |                      +---> ndarray (existing)
        |                        |
        |                        +---> ruvector-core (optional, for VectorDB integration)
        |
        +---> wasm-bindgen, serde_wasm_bindgen (existing workspace deps)

ruvector-solver-node -----> ruvector-solver
        |
        +---> napi, napi-derive (existing workspace deps)

mcp-gate -----> ruvector-solver (optional feature)
ruvector-server -----> ruvector-solver (optional feature)
ruvector-dag -----> ruvector-solver (optional feature for bottleneck algorithms)
```

### 4.4 Feature Flag Recommendations

```toml
[features]
default = []
nalgebra-backend = ["nalgebra"]
ndarray-backend = ["ndarray"]
wasm = ["wasm-bindgen", "serde_wasm_bindgen", "js-sys"]
parallel = ["rayon"]
simd = []  # Auto-detected via cfg(target_feature)
gpu = ["ruvector-math/gpu"]
full = ["nalgebra-backend", "ndarray-backend", "parallel"]
```

---

## 5. Dependency Injection Points

### 5.1 Core DI Architecture

ruvector uses a combination of generic type parameters and `Arc<dyn Trait>` for dependency
injection. The following injection points are relevant for the sublinear-time solver:

#### 5.1.1 Numeric Backend Injection

The solver's core algorithm implementations should accept a generic numeric backend:

```rust
pub struct SublinearSolver<B: NumericBackend = NalgebraBackend> {
    backend: B,
    config: SolverConfig,
}

impl<B: NumericBackend> SublinearSolver<B> {
    pub fn with_backend(backend: B, config: SolverConfig) -> Self {
        Self { backend, config }
    }
}
```

This allows ruvector consumers who already have `ndarray` matrices to use the solver
without conversion overhead.

#### 5.1.2 Distance Function Injection

ruvector-core's `DistanceMetric` enum defines four distance functions (Euclidean, Cosine,
DotProduct, Manhattan). The solver may need additional distance metrics or custom distance
functions. Injection point:

```rust
pub trait DistanceFunction: Send + Sync {
    fn distance(&self, a: &[f32], b: &[f32]) -> f32;
    fn name(&self) -> &str;
}

// Adapt ruvector's existing DistanceMetric
impl DistanceFunction for DistanceMetric {
    fn distance(&self, a: &[f32], b: &[f32]) -> f32 {
        match self {
            DistanceMetric::Euclidean => simsimd_euclidean(a, b),
            DistanceMetric::Cosine => simsimd_cosine(a, b),
            // ...
        }
    }
}
```
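
A custom metric outside the four built-ins could be injected as sketched below. The trait is repeated so the example stands alone; `Chebyshev` and `nearest` are hypothetical names introduced here, not existing ruvector items.

```rust
pub trait DistanceFunction: Send + Sync {
    fn distance(&self, a: &[f32], b: &[f32]) -> f32;
    fn name(&self) -> &str;
}

/// Chebyshev (L-infinity) distance: the largest per-component difference.
pub struct Chebyshev;

impl DistanceFunction for Chebyshev {
    fn distance(&self, a: &[f32], b: &[f32]) -> f32 {
        debug_assert_eq!(a.len(), b.len(), "vectors must have equal length");
        a.iter()
            .zip(b)
            .map(|(x, y)| (x - y).abs())
            .fold(0.0, f32::max)
    }

    fn name(&self) -> &str {
        "chebyshev"
    }
}

/// Index of the candidate closest to `query` under any injected metric.
/// Returns `None` when there are no candidates.
pub fn nearest(
    metric: &dyn DistanceFunction,
    query: &[f32],
    candidates: &[Vec<f32>],
) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .map(|(i, c)| (i, metric.distance(query, c)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

Taking `&dyn DistanceFunction` keeps the caller independent of the metric, so the same search code works with the adapted `DistanceMetric` enum or a solver-specific metric.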

#### 5.1.3 Storage Backend Injection

ruvector-core already has conditional compilation for storage backends (`storage` vs
`storage-memory`). The solver should use a similar pattern for result caching:

```rust
pub trait SolverCache: Send + Sync {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&self, key: &[u8], value: &[u8]);
    fn invalidate(&self, key: &[u8]);
}
```

Implementations could include:

- `InMemoryCache` (default, using `DashMap`)
- `VectorDBCache` (using ruvector-core's VectorDB for nearest-neighbor result caching)
- `WasmCache` (using IndexedDB, following the `ruvector-wasm/src/indexeddb.js` pattern)
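
A minimal sketch of the in-memory variant follows. To stay dependency-free it uses `std::sync::RwLock<HashMap>` where the text proposes `DashMap`; that substitution is an assumption made for the example, and a production cache would also need an eviction policy.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

pub trait SolverCache: Send + Sync {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&self, key: &[u8], value: &[u8]);
    fn invalidate(&self, key: &[u8]);
}

/// Unbounded in-memory cache guarded by a reader-writer lock.
#[derive(Default)]
pub struct InMemoryCache {
    entries: RwLock<HashMap<Vec<u8>, Vec<u8>>>,
}

impl SolverCache for InMemoryCache {
    /// Return a copy of the cached bytes, if present.
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.entries
            .read()
            .expect("cache lock poisoned")
            .get(key)
            .cloned()
    }

    /// Insert or overwrite the entry for `key`.
    fn put(&self, key: &[u8], value: &[u8]) {
        self.entries
            .write()
            .expect("cache lock poisoned")
            .insert(key.to_vec(), value.to_vec());
    }

    /// Remove the entry for `key`; a missing key is not an error.
    fn invalidate(&self, key: &[u8]) {
        self.entries
            .write()
            .expect("cache lock poisoned")
            .remove(key);
    }
}
```

Keys and values are opaque bytes, so the caller decides how to derive a key (for example, a hash of the input and the solver configuration) and how to serialize results.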

#### 5.1.4 Compute Budget Injection

Following `prime-radiant`'s compute ladder pattern (Lane 0 Reflex through Lane 3 Human),
the solver should accept compute budgets:

```rust
use serde::{Deserialize, Serialize};
use std::time::Duration;

// The derives let these types be embedded in serializable solver events.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ComputeBudget {
    pub max_wall_time: Duration,
    pub max_iterations: usize,
    pub max_memory_bytes: usize,
    pub lane: ComputeLane,
}

#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
pub enum ComputeLane {
    Reflex,      // < 1ms, local only
    Retrieval,   // ~ 10ms, can fetch cached results
    Heavy,       // ~ 100ms, full solver execution
    Deliberate,  // unbounded, with streaming progress
}
```
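
How a solver loop might honor such a budget is sketched below. The types are simplified std-only copies, `Outcome` is a hypothetical result type, and Newton's iteration for a square root stands in for a real solver step; only the iteration and wall-time limits are enforced here.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ComputeLane { Reflex, Retrieval, Heavy, Deliberate }

#[derive(Debug, Clone)]
pub struct ComputeBudget {
    pub max_wall_time: Duration,
    pub max_iterations: usize,
    pub max_memory_bytes: usize, // not enforced in this sketch
    pub lane: ComputeLane,
}

/// Hypothetical result type: either a converged value or the best effort so far.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Converged { iterations: usize, value: f64 },
    BudgetExhausted { iterations: usize, best: f64 },
}

/// Newton iteration for sqrt(target), checked against the budget every step.
pub fn sqrt_with_budget(target: f64, tolerance: f64, budget: &ComputeBudget) -> Outcome {
    let start = Instant::now();
    let mut x = target.max(1.0); // initial guess
    for iteration in 0..budget.max_iterations {
        // Stop early if the wall-clock allowance is used up.
        if start.elapsed() > budget.max_wall_time {
            return Outcome::BudgetExhausted { iterations: iteration, best: x };
        }
        x = 0.5 * (x + target / x);
        if (x * x - target).abs() <= tolerance {
            return Outcome::Converged { iterations: iteration + 1, value: x };
        }
    }
    Outcome::BudgetExhausted { iterations: budget.max_iterations, best: x }
}
```

Returning the best iterate on exhaustion, rather than an error, lets a caller decide whether a partial answer is acceptable for the lane it is running in.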

### 5.2 WASM-Specific Injection Points

In the WASM layer, dependency injection occurs through JavaScript configuration objects:

```typescript
interface SolverOptions {
  // Backend selection
  backend?: 'wasm-simd' | 'wasm-baseline' | 'js-fallback';

  // Worker pool configuration
  workerCount?: number;
  workerUrl?: string;

  // Memory management
  maxMemoryMB?: number;
  useSharedArrayBuffer?: boolean;

  // Progress callback (for streaming)
  onProgress?: (progress: SolverProgress) => void;
}
```

### 5.3 Server-Level Injection

At the API layer, the solver should be injected into the axum `AppState`:

```rust
pub struct AppState {
    // Existing
    pub vector_db: Arc<RwLock<CoreVectorDB>>,
    pub collection_manager: Arc<RwLock<CoreCollectionManager>>,

    // New: solver engine injection
    pub solver: Arc<dyn SolverEngine<Input = SolverInput, Output = SolverOutput, Error = SolverError>>,
}
```

---

## 6. Event-Driven Integration Patterns

### 6.1 Alignment with Prime-Radiant Event Sourcing

The `prime-radiant` crate's `DomainEvent` enum provides a proven event-sourcing pattern.
The solver should emit analogous events for computation provenance:

```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum SolverEvent {
    /// A solve request was received
    SolveRequested {
        request_id: String,
        algorithm: String,
        input_dimensions: (usize, usize),
        timestamp: Timestamp,
    },

    /// An iteration completed
    IterationCompleted {
        request_id: String,
        iteration: usize,
        residual_norm: f64,
        wall_time_us: u64,
        timestamp: Timestamp,
    },

    /// The solver converged to a solution
    SolveConverged {
        request_id: String,
        total_iterations: usize,
        final_residual: f64,
        total_wall_time_us: u64,
        timestamp: Timestamp,
    },

    /// The solver exceeded its compute budget
    BudgetExhausted {
        request_id: String,
        budget: ComputeBudget,
        best_residual: f64,
        timestamp: Timestamp,
    },

    /// A complexity estimate was computed
    ComplexityEstimated {
        request_id: String,
        estimated_flops: u64,
        estimated_memory_bytes: u64,
        recommended_lane: ComputeLane,
        timestamp: Timestamp,
    },
}
```

### 6.2 Event Bus Integration

The solver events should be published to the same event infrastructure that prime-radiant
uses. The recommended pattern is a channel-based event bus:

```rust
pub struct SolverWithEvents<S: SolverEngine> {
    solver: S,
    event_tx: tokio::sync::broadcast::Sender<SolverEvent>,
}

impl<S: SolverEngine> SolverWithEvents<S> {
    pub fn subscribe(&self) -> tokio::sync::broadcast::Receiver<SolverEvent> {
        self.event_tx.subscribe()
    }
}
```

This enables:

- **Coherence gate integration**: Prime-radiant can subscribe to solver events and include
  solver stability in its coherence energy calculations.
- **Streaming API responses**: The axum server can convert the event stream to SSE.
- **MCP progress notifications**: The MCP server can emit JSON-RPC notifications for
  long-running solve operations.
- **Telemetry and monitoring**: The `ruvector-metrics` crate can subscribe and export
  Prometheus metrics for solver operations.
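
The fan-out behavior (one publisher, several independent subscribers) can be illustrated without an async runtime. The sketch below substitutes `std::sync::mpsc` for `tokio::sync::broadcast`, uses a trimmed two-variant `SolverEvent`, and introduces a hypothetical `EventBus` type; it is meant to show the pattern, not the proposed implementation.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Trimmed event type for the example (no timestamps).
#[derive(Debug, Clone, PartialEq)]
pub enum SolverEvent {
    IterationCompleted { request_id: String, iteration: usize, residual_norm: f64 },
    SolveConverged { request_id: String, total_iterations: usize, final_residual: f64 },
}

/// Minimal fan-out bus: each subscriber gets its own channel and a clone of every event.
#[derive(Default)]
pub struct EventBus {
    subscribers: Mutex<Vec<Sender<SolverEvent>>>,
}

impl EventBus {
    /// Register a new subscriber and hand back its receiving end.
    pub fn subscribe(&self) -> Receiver<SolverEvent> {
        let (tx, rx) = channel();
        self.subscribers.lock().expect("bus lock poisoned").push(tx);
        rx
    }

    /// Deliver `event` to all live subscribers and prune those whose receiver
    /// was dropped. Returns the number of subscribers that received the event.
    pub fn publish(&self, event: SolverEvent) -> usize {
        let mut subs = self.subscribers.lock().expect("bus lock poisoned");
        subs.retain(|tx| tx.send(event.clone()).is_ok());
        subs.len()
    }
}
```

In this shape a metrics exporter and an SSE handler would each call `subscribe()` once and drain their own receiver, so a slow or departed consumer does not affect the others.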

### 6.3 Coherence Gate as Solver Governor

A powerful integration pattern connects the solver to prime-radiant's coherence gate:

```
Solve Request --> Complexity Estimate --> Gate Decision --> Execute or Escalate
                                                |
                                    Prime-Radiant evaluates:
                                    - Energy budget available?
                                    - System coherence stable?
                                    - Resource contention low?
```

The `cognitum-gate-tilezero` crate's `permit_action` tool can govern solver execution:

```rust
// Before executing a solver, request permission from the gate
let action = ActionContext {
    action_id: format!("solve-{}", request_id),
    action_type: "heavy_compute".into(),
    target: ActionTarget {
        device: "solver-engine".into(),
        path: format!("/solver/{}", algorithm),
    },
    metadata: ActionMetadata {
        estimated_cost: complexity.estimated_flops as f64,
        estimated_duration_ms: complexity.estimated_wall_time_ms,
    },
};

// `solve_with_token` is a proposed method; it is not part of the
// `SolverEngine` trait sketched in Section 4.2.
match gate.permit_action(action).await {
    GateDecision::Permit(token) => solver.solve_with_token(input, token),
    GateDecision::Defer(info) => escalate_to_queue(input, info),
    GateDecision::Deny(reason) => Err(SolverError::Denied(reason)),
}
```
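
The estimate-then-decide flow can be sketched in isolation as follows. Everything here is an assumption made for illustration: the lane thresholds are taken from the comments on `ComputeLane` in Section 5.1.4, while `SystemState`, `recommend_lane`, `decide`, and the decision policy itself are hypothetical and do not reflect prime-radiant's actual gate logic.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ComputeLane { Reflex, Retrieval, Heavy, Deliberate }

pub struct ComplexityEstimate {
    pub estimated_wall_time_ms: f64,
    pub estimated_memory_bytes: u64,
}

/// Hypothetical snapshot of what the gate knows about the system.
pub struct SystemState {
    pub free_memory_bytes: u64,
    pub coherence_stable: bool,
}

#[derive(Debug, PartialEq)]
pub enum GateDecision { Permit, Defer, Deny }

/// Map an estimate to a lane using the ~1ms / ~10ms / ~100ms bands.
pub fn recommend_lane(e: &ComplexityEstimate) -> ComputeLane {
    match e.estimated_wall_time_ms {
        t if t < 1.0 => ComputeLane::Reflex,
        t if t <= 10.0 => ComputeLane::Retrieval,
        t if t <= 100.0 => ComputeLane::Heavy,
        _ => ComputeLane::Deliberate,
    }
}

/// Illustrative policy: deny what cannot fit in memory, always permit cheap
/// work, and defer expensive work while the system is not coherent.
pub fn decide(e: &ComplexityEstimate, s: &SystemState) -> GateDecision {
    if e.estimated_memory_bytes > s.free_memory_bytes {
        return GateDecision::Deny;
    }
    match recommend_lane(e) {
        ComputeLane::Reflex | ComputeLane::Retrieval => GateDecision::Permit,
        _ if s.coherence_stable => GateDecision::Permit,
        _ => GateDecision::Defer,
    }
}
```

Keeping the decision a pure function of the estimate and a state snapshot makes it straightforward to unit-test and to replay from recorded events.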

### 6.4 DAG Integration Events

The `ruvector-dag` crate's query plan optimizer can emit events when bottleneck analysis
identifies nodes that would benefit from sublinear-time solving. This requires an additional
`BottleneckSolverRequested` variant on `SolverEvent`, beyond those listed in Section 6.1:

```rust
// In ruvector-dag when a bottleneck is detected
SolverEvent::BottleneckSolverRequested {
    dag_id: dag.id(),
    bottleneck_nodes: bottlenecks.iter().map(|b| b.node_id).collect(),
    estimated_speedup: bottlenecks.iter().map(|b| b.speedup_potential).sum(),
    timestamp: now(),
}
```

---

## 7. Performance Architecture Considerations

### 7.1 Memory Architecture

#### Current ruvector Memory Model

ruvector-core uses several memory optimization strategies:

- **Arena allocator** (`arena.rs`): Cache-aligned vector allocation with `CACHE_LINE_SIZE`
  awareness and batch allocation via `BatchVectorAllocator`.
- **SoA storage** (`cache_optimized.rs`): Structure-of-Arrays layout for cache-friendly
  sequential access to vector components.
- **Memory pools** (`memory.rs`): Basic allocation tracking with optional limits.
- **Paged memory** (ADR-006): 2MB page-granular allocation with LRU eviction and
  Hot/Warm/Cold residency tiers.

#### Solver Memory Requirements

Sublinear-time algorithms are inherently memory-efficient (often O(n^alpha) for alpha < 1),
but the nalgebra backend may allocate large intermediate matrices. Recommendations:

1. **Use ruvector's arena allocator** for solver-internal scratch space. Wrap nalgebra
   allocations in arena-backed storage:

   ```rust
   pub struct SolverArena {
       inner: Arena,
       scratch_matrices: Vec<DMatrix<f32>>,
   }
   ```

2. **Integrate with ADR-006 paged memory** for large problem instances. The solver should
   respect the memory pool's limit and request pages through the established interface rather
   than allocating directly.

3. **WASM memory budget**: In WASM, limit solver memory to a configurable fraction of the
   linear memory. The default WASM memory of 16MB is tight; ensure the solver can operate
   within 4-8MB for typical problem sizes, using the `ComputeBudget.max_memory_bytes` field.

### 7.2 SIMD Optimization Strategy

ruvector uses `simsimd 5.9` for distance calculations, achieving approximately 16M ops/sec
for 512-dimensional vectors. The solver should leverage SIMD at three levels:

1. **Auto-vectorization**: Write inner loops in a SIMD-friendly style (sequential access,
   no branches, aligned data). Rust's LLVM backend can often auto-vectorize such loops for
   both native and WASM targets.

2. **Explicit SIMD**: For hot paths, use `std::arch` intrinsics selected per target
   architecture, with runtime feature detection where the platform supports it:

   ```rust
   #[cfg(target_arch = "x86_64")]
   use std::arch::x86_64::*;

   #[cfg(target_arch = "wasm32")]
   use std::arch::wasm32::*;
   ```

   The existing `ruvector-core/src/simd_intrinsics.rs` provides patterns for this.

3. **WASM SIMD128**: The `ruvector-wasm` crate already detects SIMD support via
   `detect_simd()`. Ensure the solver WASM crate is compiled with `-C target-feature=+simd128`
   for WASM SIMD support, with a non-SIMD fallback.
|
|
|
|
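To make the "SIMD-friendly style" from item 1 concrete, here is a minimal sketch of a dot product shaped for auto-vectorization. The function name and the lane count of 8 are assumptions for illustration, not part of ruvector, and whether the compiler actually vectorizes the loop depends on the target and optimization flags.

```rust
/// Dot product written so the hot loop has sequential access, a fixed trip
/// count, and no branches. `LANES = 8` is an illustrative choice.
pub fn dot_simd_friendly(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    const LANES: usize = 8;

    // Independent accumulators let the compiler map each lane to a SIMD lane.
    let mut acc = [0.0f32; LANES];
    let a_chunks = a.chunks_exact(LANES);
    let b_chunks = b.chunks_exact(LANES);
    // Capture the leftover elements before the iterators are consumed.
    let a_tail = a_chunks.remainder();
    let b_tail = b_chunks.remainder();

    for (ca, cb) in a_chunks.zip(b_chunks) {
        for i in 0..LANES {
            acc[i] += ca[i] * cb[i];
        }
    }

    // Horizontal reduction, then a scalar pass over the tail.
    let mut sum: f32 = acc.iter().sum();
    for (x, y) in a_tail.iter().zip(b_tail) {
        sum += x * y;
    }
    sum
}
```

Because the per-lane accumulators change the order of floating-point additions, results can differ in the last bits from a naive left-to-right loop; benchmarks and tests comparing against a scalar reference should use a tolerance.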
### 7.3 Concurrency Architecture

#### Native (Server) Concurrency

ruvector uses a rich concurrency toolkit:

- **Rayon** for data-parallel operations (conditional on `feature = "parallel"`)
- **Crossbeam** for lock-free data structures
- **DashMap** for concurrent hash maps
- **parking_lot** for efficient mutexes and RwLocks
- **Tokio** for async I/O and task scheduling
- **Lock-free structures** (`lockfree.rs`): `AtomicVectorPool`, `LockFreeWorkQueue`,
  `LockFreeBatchProcessor`

The solver should integrate with this concurrency model:

```rust
#[cfg(feature = "parallel")]
use rayon::prelude::*;

impl SublinearSolver {
    /// Solves `input` in chunks, in parallel when the `parallel` feature is enabled.
    pub fn solve_parallel(&self, input: &[f32]) -> Result<SolverResult> {
        #[cfg(feature = "parallel")]
        {
            input.par_chunks(self.config.chunk_size)
                .map(|chunk| self.solve_chunk(chunk))
                // Merge pairwise; an error on either side becomes the merged result.
                .reduce_with(|a, b| self.merge_results(a?, b?))
                // `reduce_with` returns `None` only when there were no chunks.
                .unwrap_or(Err(SolverError::EmptyInput))
        }
        #[cfg(not(feature = "parallel"))]
        {
            self.solve_sequential(input)
        }
    }
}
```

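The chunk-then-merge shape above can also be exercised without Rayon. The following is a dependency-free sketch using `std::thread::scope`; the error variants, the placeholder `solve_chunk` (a chunk sum), and the merge (addition) are assumptions standing in for the real solver logic, and spawning one thread per chunk is only reasonable for illustration.

```rust
use std::thread;

/// Illustrative error type; variant names are assumptions.
#[derive(Debug, PartialEq)]
pub enum SolverError {
    EmptyInput,
    InvalidChunkSize,
    NonFinite,
}

/// Stand-in for the per-chunk solve: sums the chunk, rejecting NaN/infinity.
fn solve_chunk(chunk: &[f32]) -> Result<f32, SolverError> {
    if chunk.iter().any(|v| !v.is_finite()) {
        return Err(SolverError::NonFinite);
    }
    Ok(chunk.iter().sum())
}

/// Solves each chunk on its own scoped thread, then merges in input order.
pub fn solve_chunked(input: &[f32], chunk_size: usize) -> Result<f32, SolverError> {
    if chunk_size == 0 {
        return Err(SolverError::InvalidChunkSize);
    }
    if input.is_empty() {
        return Err(SolverError::EmptyInput);
    }
    // Scoped threads may borrow `input` directly, so no data is cloned.
    let partials: Vec<Result<f32, SolverError>> = thread::scope(|s| {
        let handles: Vec<_> = input
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || solve_chunk(chunk)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    });
    // Merge step: the first error encountered (in input order) wins.
    partials
        .into_iter()
        .try_fold(0.0f32, |acc, r| r.map(|v| acc + v))
}
```

Unlike the Rayon version, this sketch validates `chunk_size` explicitly, since chunking a slice with a size of zero panics.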
#### WASM Concurrency

A WASM module cannot spawn native threads on its own. In the browser, the solver must use
Web Workers for parallelism:

- Follow the `ruvector-wasm/src/worker-pool.js` pattern
- Use `SharedArrayBuffer` for zero-copy data sharing between workers (requires
  `Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp`)
- Fall back to `postMessage` with transferable `ArrayBuffer` when SAB is unavailable

### 7.4 Latency Targets by Deployment Context

| Context | Target Latency | Memory Budget | Strategy |
|---------|---------------|---------------|----------|
| **WASM (browser)** | < 50ms for 10K elements | 4-8 MB | SIMD128, single-threaded, streaming |
| **WASM (edge/Cloudflare)** | < 10ms for 10K elements | 128 MB | SIMD128, limited workers |
| **Node.js (NAPI)** | < 5ms for 10K elements | 512 MB | Native SIMD, Rayon parallel |
| **Server (axum)** | < 2ms for 10K elements | 2 GB | Full SIMD, Rayon, memory-mapped |
| **MCP (agent)** | Budget-dependent | Configurable | Gate-governed, compute ladder |

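One way to keep these targets enforceable is to encode the table as data that tests and benchmarks can check against. The sketch below is an assumption about how that might look: the `DeploymentContext` and `Targets` names are hypothetical, and the browser memory value uses the upper end of the 4-8 MB range.

```rust
use std::time::Duration;

/// Deployment contexts from the table above (type name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeploymentContext {
    WasmBrowser,
    WasmEdge,
    NodeNapi,
    Server,
    Mcp,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Targets {
    /// Latency target for a 10K-element problem; `None` means budget-dependent.
    pub latency_10k: Option<Duration>,
    /// Upper memory bound in bytes; `None` means configurable.
    pub memory_bytes: Option<u64>,
}

const MIB: u64 = 1024 * 1024;

impl DeploymentContext {
    /// Returns the latency and memory targets listed in the table.
    pub fn targets(self) -> Targets {
        let (ms, mem) = match self {
            Self::WasmBrowser => (Some(50), Some(8 * MIB)), // upper end of 4-8 MB
            Self::WasmEdge => (Some(10), Some(128 * MIB)),
            Self::NodeNapi => (Some(5), Some(512 * MIB)),
            Self::Server => (Some(2), Some(2048 * MIB)),
            Self::Mcp => (None, None), // governed by the gate at runtime
        };
        Targets {
            latency_10k: ms.map(Duration::from_millis),
            memory_bytes: mem,
        }
    }
}
```

A benchmark harness could then compare a measured 10K-element latency against `ctx.targets().latency_10k` and fail the run on regression.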
### 7.5 Benchmarking Integration

ruvector uses `criterion 0.5` for benchmarking with HTML reports. The solver should integrate
into the existing benchmark infrastructure:

```rust
// benches/solver_benchmarks.rs
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};
use ruvector_solver::{SublinearSolver, SolverConfig};

fn bench_sublinear_solve(c: &mut Criterion) {
    let mut group = c.benchmark_group("sublinear_solver");

    for size in [100, 1_000, 10_000, 100_000] {
        group.bench_with_input(
            BenchmarkId::new("bmssp", size),
            &size,
            |b, &size| {
                // Build the solver and input once, outside the timed loop.
                let solver = SublinearSolver::new(SolverConfig::default());
                let input: Vec<f32> = (0..size).map(|i| i as f32).collect();
                b.iter(|| solver.solve(&input));
            },
        );
    }
    group.finish();
}

// Register the benchmark and generate the `main` entry point.
criterion_group!(benches, bench_sublinear_solve);
criterion_main!(benches);
```

The benchmark results should be stored in the existing `bench_results/` directory in JSON
format, matching the schema used by `comparison_benchmark.json` and `latency_benchmark.json`.

### 7.6 Profile-Guided Optimization

The workspace `Cargo.toml` already configures aggressive release optimizations:

```toml
[profile.release]
opt-level = 3
lto = "fat"
codegen-units = 1
strip = true
```

These settings are critical for solver performance. Additional considerations:

- **PGO (Profile-Guided Optimization)**: For the NAPI binary, consider adding a PGO training
  step using representative solver workloads.
- **WASM opt**: Run `wasm-opt -O3` on the solver WASM output (the existing build scripts
  in `ruvector-wasm` likely already do this).
- **Link-time optimization across crates**: The `lto = "fat"` setting enables cross-crate
  LTO, which is essential for inlining nalgebra operations into solver hot paths.

### 7.7 Zero-Copy Data Path

The critical performance path for the solver is the data pipeline from API input to solver
core and back. Minimize copies:

```
API (axum):  body bytes --deserialize--> SolverInput
                                              |
                                +---------borrow-----------+
                                |                          |
                     nalgebra::DMatrixSlice          result buffer
                                |                          |
                                +------solve-------->------+
                                                           |
                                              --serialize--> API response bytes
```

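The borrow step in the native path above can be sketched as a view type that never owns the input, paired with a solve that writes into a caller-provided buffer. This is an illustration only: `MatrixView` is a hypothetical std-only stand-in for a nalgebra matrix view, and the "solve" is a placeholder that computes row sums.

```rust
/// Borrowed, row-major view over caller-owned data. Hypothetical stand-in
/// for a nalgebra matrix view; it holds a reference and never copies.
pub struct MatrixView<'a> {
    data: &'a [f32],
    rows: usize,
    cols: usize,
}

impl<'a> MatrixView<'a> {
    /// Returns `None` if `rows * cols` overflows or does not match `data.len()`.
    pub fn new(data: &'a [f32], rows: usize, cols: usize) -> Option<Self> {
        (rows.checked_mul(cols)? == data.len()).then(|| Self { data, rows, cols })
    }

    /// Borrow one row straight out of the underlying buffer.
    pub fn row(&self, r: usize) -> &'a [f32] {
        &self.data[r * self.cols..(r + 1) * self.cols]
    }
}

/// Placeholder solve: writes each row's sum into `out`, which the caller owns
/// and can reuse across requests. Returns false on a size mismatch.
pub fn solve_into(view: &MatrixView<'_>, out: &mut [f32]) -> bool {
    if out.len() != view.rows {
        return false;
    }
    for (r, slot) in out.iter_mut().enumerate() {
        *slot = view.row(r).iter().sum();
    }
    true
}
```

Because both the input view and the output buffer are borrowed, the only allocations on this path are the ones made by deserialization and serialization at the API boundary.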
For the WASM path:

```
JS Float32Array --view (no copy)--> wasm linear memory --solve--> wasm linear memory
                                                                           |
                                                  --view (no copy)--> JS Float32Array
```

The key is to use `Float32Array::view()` in wasm-bindgen rather than `Float32Array::copy_from()`
wherever the solver does not need to retain ownership of the input data. Note that `view()` is
`unsafe`: the view is invalidated if WASM linear memory grows, so it must be consumed before
any further allocation on the Rust side.

---

## Summary of Key Recommendations

1. **Create `crates/ruvector-solver`** as a new pure-Rust workspace member, following the
   established core-binding-surface pattern.

2. **Add `nalgebra` as a workspace dependency** and create a bridge module in `ruvector-math`
   for zero-cost conversions between nalgebra and ndarray representations.

3. **Follow the existing three-crate pattern** exactly: `ruvector-solver` (core),
   `ruvector-solver-wasm` (browser), `ruvector-solver-node` (server).

4. **Integrate with prime-radiant's event sourcing** by emitting `SolverEvent`s through
   a broadcast channel, enabling coherence gate governance and streaming API responses.

5. **Use the coherence gate as a solver governor** to prevent runaway computation and
   integrate with the compute ladder (Lane 0-3).

6. **Inject the solver into `AppState`** for axum server integration, and add new MCP
   tools to `mcp-gate` for AI agent access.

7. **Respect ruvector's memory architecture** by integrating with the arena allocator,
   SoA storage patterns, and ADR-006 paged memory management.

8. **Target WASM SIMD128** for browser performance, with graceful fallback to scalar code
   detected at runtime via the existing `detect_simd()` mechanism.

9. **Use Rayon with feature gating** for native parallelism, and Web Workers for WASM
   parallelism, following the patterns already established in `ruvector-wasm`.

10. **Integrate benchmarks into the existing `criterion` infrastructure** and store results
    in the `bench_results/` directory for regression tracking.