# Architecture Analysis: Sublinear-Time Solver Integration with ruvector
**Agent**: 5 -- Architecture & System Design
**Date**: 2026-02-20
**Status**: Complete
**Scope**: Full-stack architectural mapping, compatibility analysis, and integration strategy
---
## Table of Contents
1. [ruvector's Current Architecture Patterns](#1-ruvectors-current-architecture-patterns)
2. [Architectural Compatibility with Sublinear-Time Solver](#2-architectural-compatibility-with-sublinear-time-solver)
3. [Layered Integration Strategy (Rust -> WASM -> JS -> API)](#3-layered-integration-strategy)
4. [Module Boundary Recommendations](#4-module-boundary-recommendations)
5. [Dependency Injection Points](#5-dependency-injection-points)
6. [Event-Driven Integration Patterns](#6-event-driven-integration-patterns)
7. [Performance Architecture Considerations](#7-performance-architecture-considerations)
---
## 1. ruvector's Current Architecture Patterns
### 1.1 Macro-Architecture: Rust Workspace Monorepo
ruvector is organized as a Cargo workspace monorepo with more than 75 crates under
`/crates`. The workspace configuration in `Cargo.toml` lists roughly 100 workspace members
spanning core database functionality, mathematical engines, neural systems, governance layers,
and multiple deployment targets.
**Topology**: The codebase follows a layered architecture with a clear separation between
computational cores and their platform bindings:
```
Layer 0: Mathematical Foundations
ruvector-math, ruvector-mincut, ruqu-core, ruqu-algorithms
Layer 1: Core Engines
ruvector-core, ruvector-graph, ruvector-dag, ruvector-sparse-inference,
prime-radiant, sona, cognitum-gate-kernel, cognitum-gate-tilezero
Layer 2: Platform Bindings
*-wasm crates (wasm-bindgen), *-node crates (NAPI-RS), *-ffi crates
Layer 3: Integration Services
ruvector-server (axum REST), mcp-gate (MCP/JSON-RPC), ruvector-cli (clap)
Layer 4: Distribution & Orchestration
ruvector-cluster, ruvector-raft, ruvector-replication, ruvector-delta-consensus
```
### 1.2 The Core-Binding-Surface Pattern
Every major subsystem in ruvector follows a consistent three-part decomposition:
| Component | Purpose | Example |
|-----------|---------|---------|
| **Core** (pure Rust) | Algorithms, data structures, business logic | `ruvector-core`, `ruvector-graph`, `ruvector-math` |
| **WASM binding** | Browser/edge deployment via `wasm-bindgen` | `ruvector-wasm`, `ruvector-graph-wasm`, `ruvector-math-wasm` |
| **Node binding** | Server-side deployment via NAPI-RS | `ruvector-node`, `ruvector-graph-node`, `ruvector-gnn-node` |
This pattern is the primary architectural convention in ruvector. It appears in at least
15 subsystems: core, graph, GNN, attention, mincut, DAG, sparse-inference, math,
domain-expansion, economy, exotic, learning, nervous-system, tiny-dancer, and the
prime-radiant advanced WASM.
Key characteristics observed in the codebase:
- **Pure Rust cores** use `no_std`-compatible patterns where possible, avoiding I/O and
platform-specific code.
- **WASM crates** wrap core types in `#[wasm_bindgen]`-annotated structs with `JsValue`
serialization via `serde_wasm_bindgen`. They handle browser-specific concerns like
IndexedDB persistence, Web Worker pool management, and Float32Array interop.
- **Node crates** use `#[napi]` macros with `tokio::task::spawn_blocking` to keep blocking or
CPU-bound work off the async runtime, leveraging zero-copy `Float32Array` buffers through NAPI-RS.
### 1.3 Dependency Management Strategy
The workspace `Cargo.toml` centralizes all shared dependencies. Critical shared dependencies
relevant to the sublinear-time solver integration:
- **Linear algebra**: `ndarray 0.16` (ruvector-math uses this extensively)
- **Numerics**: `rand 0.8`, `rand_distr 0.4`
- **WASM**: `wasm-bindgen 0.2`, `js-sys 0.3`, `web-sys 0.3`
- **Node.js**: `napi 2.16`, `napi-derive 2.16`
- **Async**: `tokio 1.41` (multi-thread runtime), `futures 0.3`
- **SIMD**: `simsimd 5.9` (distance calculations)
- **Serialization**: `serde 1.0`, `rkyv 0.8`, `bincode 2.0.0-rc.3`
- **Concurrency**: `rayon 1.10`, `crossbeam 0.8`, `dashmap 6.1`, `parking_lot 0.12`
Notable **absence**: `nalgebra` is not currently a workspace dependency. The sublinear-time
solver uses `nalgebra` as its linear algebra backend. This is a significant compatibility
consideration (analyzed in Section 2).
### 1.4 Feature Flag Architecture
ruvector makes extensive use of Cargo feature flags for conditional compilation:
- `storage` / `storage-memory`: Toggle between REDB-backed and in-memory storage
- `parallel`: Enables lock-free structures and rayon parallelism (disabled on `wasm32`)
- `collections`: Multi-collection support (requires file I/O, so conditionally excluded in WASM)
- `kernel-pack`: ADR-005 compliant secure WASM kernel execution
- `full`: Enables async-dependent modules (healing, qudag, sona) in the DAG crate
- `api-embeddings` / `real-embeddings`: External embedding model support
### 1.5 Event Sourcing and Domain Events
The `prime-radiant` crate implements a comprehensive event sourcing pattern through its
`events.rs` module. Domain events are defined as a tagged enum (`DomainEvent`) covering:
- Substrate events (NodeCreated, NodeUpdated, NodeRemoved, EdgeCreated, EdgeRemoved)
- Coherence computation events (energy calculations, residual updates)
- Governance events (policy changes, witness records)
Events are serialized with `serde` using `#[serde(tag = "type")]` for deterministic replay
and tamper detection via content hashes. This aligns well with the sublinear-time solver's
potential need for computation provenance tracking.
### 1.6 MCP Integration Pattern
The `mcp-gate` crate provides a Model Context Protocol server using JSON-RPC 2.0 over stdio.
Tools are defined declaratively with JSON Schema input specifications. The architecture uses
`Arc<RwLock<TileZero>>` for shared state with the coherence gate engine. This existing MCP
infrastructure provides a natural extension point for exposing solver capabilities to AI agents.
### 1.7 Server Architecture
`ruvector-server` uses `axum` with tower middleware layers (compression, CORS, tracing).
Routes are modular (health, collections, points). The server shares application state via
`AppState` and uses the standard Rust web service pattern with `Router` composition.
---
## 2. Architectural Compatibility with Sublinear-Time Solver
### 2.1 Structural Alignment Matrix
| Solver Component | ruvector Equivalent | Compatibility | Notes |
|-----------------|--------------------|----|-------|
| Rust core library (`sublinear_solver`) | `ruvector-core`, `ruvector-math` | **HIGH** | Both are pure Rust crates with algorithm-focused design |
| WASM layer (`wasm-bindgen`) | `ruvector-wasm`, `*-wasm` crates | **HIGH** | Identical binding technology, identical patterns |
| JS bridge (`solver.js`, etc.) | `npm/core/src/index.ts` | **HIGH** | Both provide platform-detection loaders and typed APIs |
| Express server | `ruvector-server` (axum) | **MEDIUM** | Different frameworks (Express vs axum) but compatible at API level |
| MCP integration (40+ tools) | `mcp-gate` (3 tools) | **HIGH** | Same protocol, ruvector has established patterns |
| CLI (NPX) | `ruvector-cli` (clap) | **MEDIUM** | Different CLI paradigms; ruvector uses native Rust CLI |
| TypeScript types | `npm/core/src/index.ts` | **HIGH** | ruvector already publishes TypeScript definitions |
| 9 workspace crates | ~75+ workspace crates | **HIGH** | Same Cargo workspace model |
### 2.2 Linear Algebra Backend Divergence
**This is the single most significant architectural tension.**
- **Sublinear-time solver**: Uses `nalgebra` for matrix operations, linear algebra, and
numerical computation.
- **ruvector**: Uses `ndarray 0.16` in `ruvector-math` and raw `Vec<f32>` with SIMD intrinsics
in `ruvector-core`.
**Resolution strategy**: Introduce `nalgebra` as a workspace dependency and create an
adapter layer. The two libraries can coexist. The adapter should provide zero-cost conversions
between `nalgebra::DMatrix<f32>` and `ndarray::Array2<f32>` views using shared memory backing.
Specifically:
```rust
// Proposed adapter in crates/ruvector-math/src/nalgebra_bridge.rs
use nalgebra::DMatrix;
use ndarray::{ArrayView2, ShapeBuilder};

/// Zero-copy view conversion from nalgebra DMatrix to ndarray ArrayView2.
/// `.f()` declares the buffer as column-major (Fortran order), matching nalgebra.
pub fn dmatrix_to_ndarray_view(m: &DMatrix<f32>) -> ArrayView2<'_, f32> {
    let (rows, cols) = m.shape();
    ArrayView2::from_shape((rows, cols).f(), m.as_slice())
        .expect("nalgebra DMatrix is always contiguous column-major")
}
```
Note: `nalgebra` uses column-major storage while `ndarray` defaults to row-major. The adapter
must therefore declare the column-major layout (for example with `ShapeBuilder::f()`), or build
a row-major view of shape `(cols, rows)` and call `.reversed_axes()`. Interpreting the buffer as
row-major `(rows, cols)` would scramble the elements.
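To make the layout issue concrete, here is a minimal std-only sketch (no `nalgebra` or `ndarray`; the helper names are illustrative, not part of either library) that reads the same flat buffer under both conventions:

```rust
/// Element (r, c) of a matrix with `rows` rows stored column-major
/// (the layout nalgebra uses).
fn get_col_major(data: &[f32], rows: usize, r: usize, c: usize) -> f32 {
    data[c * rows + r]
}

/// Element (r, c) of a matrix with `cols` columns stored row-major
/// (ndarray's default layout).
fn get_row_major(data: &[f32], cols: usize, r: usize, c: usize) -> f32 {
    data[r * cols + c]
}

/// The 2x3 matrix [[1, 2, 3], [4, 5, 6]] as a column-major buffer,
/// returned together with its (rows, cols).
fn sample_col_major() -> (Vec<f32>, usize, usize) {
    (vec![1.0, 4.0, 2.0, 5.0, 3.0, 6.0], 2, 3)
}
```

Reading element (0, 1) with `get_col_major` yields 2.0 as expected, while `get_row_major` with the same `(rows, cols)` yields 4.0. Reading the buffer as a row-major matrix of the swapped shape gives exactly the transpose, which is why reversing the axes of such a view recovers the original matrix without copying.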
### 2.3 Server Framework Compatibility
The sublinear-time solver uses Express.js with session management and streaming. ruvector
uses axum (Rust). These are not in conflict because they serve different layers:
- **Solver Express server**: JS-level API for browser and Node clients, session management,
streaming results.
- **ruvector axum server**: Rust-level REST API for database operations.
The integration should layer the solver's Express functionality as a separate API surface,
or preferably, expose solver endpoints through axum with the same streaming semantics using
axum's SSE (Server-Sent Events) or WebSocket support.
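For orientation, the SSE wire format that such a streaming endpoint would emit is simple enough to sketch by hand. The helper below is only an illustration of the framing (in axum, `axum::response::sse::Event` takes care of this); the function name and the `iteration` event name used in the note afterwards are assumptions, not part of either codebase:

```rust
/// Formats one Server-Sent Events frame: an optional `event:` line, one
/// `data:` line per line of payload, and a terminating blank line.
fn sse_frame(event: Option<&str>, data: &str) -> String {
    let mut out = String::new();
    if let Some(name) = event {
        out.push_str("event: ");
        out.push_str(name);
        out.push('\n');
    }
    // Multi-line payloads must be split so that every line carries its own prefix.
    for line in data.split('\n') {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    // The blank line marks the end of the frame.
    out.push('\n');
    out
}
```

A per-iteration progress update could then be framed as `sse_frame(Some("iteration"), payload_json)`, where the client dispatches on the event name.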
### 2.4 WASM Compilation Target Compatibility
Both projects target `wasm32-unknown-unknown` via `wasm-bindgen`. ruvector already manages
the WASM-specific constraints:
- No `std::fs`, `std::net` in WASM builds
- `parking_lot::Mutex` instead of `std::sync::Mutex` (the `parking_lot` version does not panic on web)
- `getrandom` with `wasm_js` feature for random number generation
- Console error panic hooks for debugging
The sublinear-time solver's WASM layer should be able to reuse these patterns directly. The
existing `ruvector-wasm` crate demonstrates the complete pattern including IndexedDB persistence,
Web Worker pools, Float32Array interop, and SIMD detection.
---
## 3. Layered Integration Strategy
### 3.1 Layer Architecture Overview
```
+===========================================================================+
| APPLICATION CONSUMERS |
| MCP Agents | REST Clients | Browser Apps | CLI Users | Edge Devices |
+===========================================================================+
| | | | |
+===========================================================================+
| API SURFACE (Layer 4) |
| mcp-gate | ruvector-server | solver-server | ruvector-cli |
| (JSON-RPC/stdio) | (axum REST) | (axum SSE) | (clap binary) |
+===========================================================================+
| | | |
+===========================================================================+
| JS/TS BRIDGE (Layer 3) |
| npm/core/index.ts | solver-bridge.ts | solver-worker.ts |
| Platform detection, typed wrappers, async coordination |
+===========================================================================+
| | |
+===========================================================================+
| WASM SURFACE (Layer 2) |
| ruvector-wasm | ruvector-solver-wasm | ruvector-math-wasm |
| wasm-bindgen, Float32Array, Web Workers, IndexedDB |
+===========================================================================+
| |
+===========================================================================+
| RUST CORE (Layer 1) |
| ruvector-core | ruvector-solver | ruvector-math | ruvector-dag |
| Pure algorithms, nalgebra/ndarray, SIMD, rayon |
+===========================================================================+
|
+===========================================================================+
| MATH FOUNDATION (Layer 0) |
| nalgebra | ndarray | simsimd | ndarray-linalg (optional) |
+===========================================================================+
```
### 3.2 Layer 0 -> Layer 1: Rust Core Integration
**New crate**: `crates/ruvector-solver` (or `crates/sublinear-solver` if preserving the
upstream name is preferred).
Structure:
```
crates/ruvector-solver/
  Cargo.toml
  src/
    lib.rs              # Public API: traits, types, re-exports
    algorithms/
      mod.rs            # Algorithm registry
      bmssp.rs          # Bounded Multi-Source Shortest Path (BMSSP) solver
      fast.rs           # Fast solver variants
      sublinear.rs      # Core sublinear-time algorithms
    backend/
      mod.rs            # Backend abstraction
      nalgebra.rs       # nalgebra-backed implementation
      ndarray.rs        # ndarray bridge for ruvector interop
    config.rs           # Solver configuration
    error.rs            # Error types
    types.rs            # Core domain types (matrices, results, bounds)
```
Integration points with existing ruvector crates:
- **`ruvector-math`**: The solver's mathematical operations (optimal transport, spectral
methods, tropical algebra) overlap with `ruvector-math`. Common abstractions should be
extracted into shared traits.
- **`ruvector-dag`**: Sublinear graph algorithms can be applied to DAG bottleneck analysis.
The `DagMinCutEngine` already uses subpolynomial O(n^0.12) bottleneck detection; solver
algorithms could provide alternative or improved implementations.
- **`ruvector-sparse-inference`**: Sparse matrix operations and activation-locality patterns
in the inference engine are natural consumers of sublinear-time solvers.
### 3.3 Layer 1 -> Layer 2: WASM Compilation
**New crate**: `crates/ruvector-solver-wasm`
This follows the established ruvector pattern exactly:
```rust
// crates/ruvector-solver-wasm/src/lib.rs
use js_sys::Float32Array;
use wasm_bindgen::prelude::*;
use ruvector_solver::{SublinearSolver, SolverConfig, SolverResult};

#[wasm_bindgen(start)]
pub fn init() {
    console_error_panic_hook::set_once();
}

#[wasm_bindgen]
pub struct JsSolver {
    inner: SublinearSolver,
}

#[wasm_bindgen]
impl JsSolver {
    #[wasm_bindgen(constructor)]
    pub fn new(config: JsValue) -> Result<JsSolver, JsValue> {
        let config: SolverConfig = serde_wasm_bindgen::from_value(config)?;
        let solver = SublinearSolver::new(config)
            .map_err(|e| JsValue::from_str(&e.to_string()))?;
        Ok(JsSolver { inner: solver })
    }

    #[wasm_bindgen]
    pub fn solve(&self, input: Float32Array) -> Result<JsValue, JsValue> {
        // Copies the JS array into WASM linear memory.
        let data = input.to_vec();
        let result = self.inner.solve(&data)
            .map_err(|e| JsValue::from_str(&e.to_string()))?;
        serde_wasm_bindgen::to_value(&result)
            .map_err(|e| JsValue::from_str(&e.to_string()))
    }
}
```
Critical WASM considerations:
1. **nalgebra WASM compatibility**: `nalgebra` compiles to WASM without issues. Ensure
`default-features = false` if the `std` feature pulls in incompatible dependencies.
2. **Memory limits**: WASM linear memory is limited (default 256 pages = 16MB). Sublinear
algorithms are inherently memory-efficient, which is an advantage. However, large matrix
operations may need chunked processing.
3. **No threads by default**: WASM does not support `std::thread`. Use the existing
`worker-pool.js` and `worker.js` patterns from `ruvector-wasm` for parallelism.
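The memory arithmetic and the chunking idea from item 2 can be sketched as follows. This is a std-only illustration: the function names are hypothetical, and the summation merely stands in for whatever per-chunk work the solver would do.

```rust
/// Size of one WebAssembly linear-memory page (fixed by the spec at 64 KiB).
const WASM_PAGE_BYTES: usize = 64 * 1024;

/// Bytes available in `pages` pages of linear memory.
fn linear_memory_bytes(pages: usize) -> usize {
    pages * WASM_PAGE_BYTES
}

/// Largest number of `f32` values that fit in `budget_bytes`.
fn max_f32_elements(budget_bytes: usize) -> usize {
    budget_bytes / std::mem::size_of::<f32>()
}

/// Processes `data` in chunks no larger than `budget_bytes`, so that only one
/// chunk's worth of scratch space is needed at a time.
/// Returns the accumulated sum and the number of chunks visited.
fn chunked_sum(data: &[f32], budget_bytes: usize) -> (f64, usize) {
    // Guard against a zero-length chunk when the budget is below 4 bytes.
    let chunk_len = max_f32_elements(budget_bytes).max(1);
    let mut total = 0.0f64;
    let mut chunks = 0;
    for chunk in data.chunks(chunk_len) {
        total += chunk.iter().map(|&x| x as f64).sum::<f64>();
        chunks += 1;
    }
    (total, chunks)
}
```

With 256 pages, `linear_memory_bytes` gives 16,777,216 bytes, room for about 4.2 million `f32` values before accounting for the stack, the allocator, and the module's own data.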
### 3.4 Layer 2 -> Layer 3: JavaScript Bridge
**New package**: `npm/solver/` (or extension of `npm/core/`)
```typescript
// npm/solver/src/index.ts
// `JsSolver` is the class exported by the wasm-bindgen crate above.
import { JsSolver as WasmSolver } from '../pkg/ruvector_solver_wasm';

// Assumed to be provided by a worker-pool module following `ruvector-wasm`.
declare const workerPool: {
  dispatch(task: string, payload: unknown): Promise<SolverResult>;
};

export interface SolverConfig {
  algorithm: 'bmssp' | 'fast' | 'sublinear';
  tolerance?: number;
  maxIterations?: number;
  dimensions?: number;
}

export interface SolverResult {
  solution: Float32Array;
  iterations: number;
  converged: boolean;
  residualNorm: number;
  wallTimeMs: number;
}

export class SublinearSolver {
  private inner: WasmSolver;
  private readonly config: SolverConfig;

  constructor(config: SolverConfig) {
    this.config = config;
    this.inner = new WasmSolver(config);
  }

  solve(input: Float32Array): SolverResult {
    return this.inner.solve(input) as SolverResult;
  }

  async solveAsync(input: Float32Array): Promise<SolverResult> {
    // Offload to Web Worker for non-blocking execution
    return workerPool.dispatch('solve', { input, config: this.config });
  }
}
```
### 3.5 Layer 3 -> Layer 4: API Surface
For the axum-based server integration, add a new route module:
```rust
// crates/ruvector-server/src/routes/solver.rs
use axum::{
    extract::State,
    response::sse::Event,
    routing::{get, post},
    Json, Router,
};
use ruvector_solver::{SublinearSolver, SolverConfig};

// The handlers (`solve`, `solve_stream`, `get_config`, `update_config`) and
// `AppState` are assumed to be defined elsewhere in the server crate.
pub fn routes() -> Router<AppState> {
    Router::new()
        .route("/solver/solve", post(solve))
        .route("/solver/solve/stream", post(solve_stream))
        .route("/solver/config", get(get_config).put(update_config))
}
```
For the MCP integration, add new tools to `mcp-gate`:
```rust
McpTool {
    name: "solve_sublinear".to_string(),
    description: "Execute a sublinear-time solver on the provided input data".to_string(),
    input_schema: serde_json::json!({
        "type": "object",
        "properties": {
            "algorithm": { "type": "string", "enum": ["bmssp", "fast", "sublinear"] },
            "input": { "type": "array", "items": { "type": "number" } },
            "tolerance": { "type": "number", "default": 1e-6 }
        },
        "required": ["algorithm", "input"]
    }),
}
```
---
## 4. Module Boundary Recommendations
### 4.1 Boundary Principles
The following boundaries should be enforced through Cargo crate visibility and trait-based
abstraction:
```
                    PUBLIC API BOUNDARY
                    ===================
                             |
              +--------------+--------------+
              |                             |
      Solver Core Trait            ruvector Core Trait
       (SolverEngine)           (VectorDB, SearchEngine)
              |                             |
       +------+------+              +-------+------+
       |      |      |              |       |      |
     BMSSP  Fast  Sublin          HNSW    Graph   DAG
```
### 4.2 Recommended Trait Boundaries
**Solver engine trait** (new, in `ruvector-solver`):
```rust
pub trait SolverEngine: Send + Sync {
    type Input;
    type Output;
    type Error: std::error::Error;

    fn solve(&self, input: &Self::Input) -> Result<Self::Output, Self::Error>;

    fn solve_with_budget(
        &self,
        input: &Self::Input,
        budget: ComputeBudget,
    ) -> Result<Self::Output, Self::Error>;

    fn estimate_complexity(&self, input: &Self::Input) -> ComplexityEstimate;
}
```
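To show how an implementation might plug into this trait, the following self-contained sketch implements it for a plain Jacobi iteration. Jacobi is used only as a familiar stand-in and is not a sublinear-time algorithm; `ComputeBudget`, `ComplexityEstimate`, `LinearSystem`, `JacobiSolver`, and `ToyError` are simplified, hypothetical types defined here so the example compiles on its own.

```rust
use std::fmt;

// Hypothetical, reduced versions of the budget and estimate types.
pub struct ComputeBudget { pub max_iterations: usize }
pub struct ComplexityEstimate { pub estimated_flops: u64 }

#[derive(Debug, PartialEq)]
pub enum ToyError { NotConverged { iterations: usize } }

impl fmt::Display for ToyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToyError::NotConverged { iterations } =>
                write!(f, "no convergence after {iterations} iterations"),
        }
    }
}
impl std::error::Error for ToyError {}

pub trait SolverEngine: Send + Sync {
    type Input;
    type Output;
    type Error: std::error::Error;
    fn solve(&self, input: &Self::Input) -> Result<Self::Output, Self::Error>;
    fn solve_with_budget(&self, input: &Self::Input, budget: ComputeBudget)
        -> Result<Self::Output, Self::Error>;
    fn estimate_complexity(&self, input: &Self::Input) -> ComplexityEstimate;
}

/// Dense system `a * x = b`, with `a` stored as a list of rows.
pub struct LinearSystem { pub a: Vec<Vec<f64>>, pub b: Vec<f64> }

/// Plain Jacobi iteration; converges for strictly diagonally dominant `a`.
pub struct JacobiSolver { pub tolerance: f64, pub default_iterations: usize }

impl SolverEngine for JacobiSolver {
    type Input = LinearSystem;
    type Output = Vec<f64>;
    type Error = ToyError;

    fn solve(&self, input: &LinearSystem) -> Result<Vec<f64>, ToyError> {
        let budget = ComputeBudget { max_iterations: self.default_iterations };
        self.solve_with_budget(input, budget)
    }

    fn solve_with_budget(&self, input: &LinearSystem, budget: ComputeBudget)
        -> Result<Vec<f64>, ToyError> {
        let n = input.b.len();
        let mut x = vec![0.0; n];
        for _ in 0..budget.max_iterations {
            // x_i' = (b_i - sum over j != i of a_ij * x_j) / a_ii
            let next: Vec<f64> = (0..n).map(|i| {
                let off: f64 = (0..n).filter(|&j| j != i)
                    .map(|j| input.a[i][j] * x[j]).sum();
                (input.b[i] - off) / input.a[i][i]
            }).collect();
            // Largest per-component change in this sweep.
            let change = next.iter().zip(&x)
                .map(|(p, q)| (p - q).abs()).fold(0.0, f64::max);
            x = next;
            if change < self.tolerance { return Ok(x); }
        }
        Err(ToyError::NotConverged { iterations: budget.max_iterations })
    }

    fn estimate_complexity(&self, input: &LinearSystem) -> ComplexityEstimate {
        let n = input.b.len() as u64;
        // Each sweep costs roughly 2 * n^2 floating-point operations.
        ComplexityEstimate { estimated_flops: 2 * n * n * self.default_iterations as u64 }
    }
}
```

Having `solve` delegate to `solve_with_budget` with a default budget keeps a single code path for both entry points, which is one way to make sure budget enforcement cannot be bypassed.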
**Numeric backend trait** (new, in `ruvector-math` or `ruvector-solver`):
```rust
pub trait NumericBackend: Send + Sync {
    type Matrix;
    type Vector;

    fn mat_mul(&self, a: &Self::Matrix, b: &Self::Matrix) -> Self::Matrix;
    fn svd(&self, m: &Self::Matrix) -> (Self::Matrix, Self::Vector, Self::Matrix);
    fn eigenvalues(&self, m: &Self::Matrix) -> Self::Vector;
    fn norm(&self, v: &Self::Vector) -> f64;
}
```
This trait allows the solver to abstract over `nalgebra` and `ndarray` backends, and also
enables future GPU-accelerated backends (the `prime-radiant` crate already has a GPU module
with buffer management and kernel dispatch).
### 4.3 Crate Dependency Graph (Proposed)
```
ruvector-solver-wasm -----> ruvector-solver -----> ruvector-math
        |                        |                      |
        |                        |                      +---> nalgebra (new dep)
        |                        |                      +---> ndarray (existing)
        |                        |
        |                        +---> ruvector-core (optional, for VectorDB integration)
        |
        +---> wasm-bindgen, serde_wasm_bindgen (existing workspace deps)

ruvector-solver-node -----> ruvector-solver
        |
        +---> napi, napi-derive (existing workspace deps)

mcp-gate -----> ruvector-solver (optional feature)
ruvector-server -----> ruvector-solver (optional feature)
ruvector-dag -----> ruvector-solver (optional feature for bottleneck algorithms)
```
### 4.4 Feature Flag Recommendations
```toml
[features]
default = []
nalgebra-backend = ["nalgebra"]
ndarray-backend = ["ndarray"]
wasm = ["wasm-bindgen", "serde-wasm-bindgen", "js-sys"]
parallel = ["rayon"]
simd = [] # Auto-detected via cfg(target_feature)
gpu = ["ruvector-math/gpu"]
full = ["nalgebra-backend", "ndarray-backend", "parallel"]
```
---
## 5. Dependency Injection Points
### 5.1 Core DI Architecture
ruvector uses a combination of generic type parameters and `Arc<dyn Trait>` for dependency
injection. The following injection points are relevant for the sublinear-time solver:
#### 5.1.1 Numeric Backend Injection
The solver's core algorithm implementations should accept a generic numeric backend:
```rust
pub struct SublinearSolver<B: NumericBackend = NalgebraBackend> {
    backend: B,
    config: SolverConfig,
}

impl<B: NumericBackend> SublinearSolver<B> {
    pub fn with_backend(backend: B, config: SolverConfig) -> Self {
        Self { backend, config }
    }
}
```
This allows ruvector consumers who already have `ndarray` matrices to use the solver
without conversion overhead.
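A self-contained sketch of this injection pattern is shown below. To stay dependency-free it uses a reduced, hypothetical `NumericBackend` with only `dot` and `norm`, and two toy backends (`PlainBackend`, `CountingBackend`) in place of the nalgebra and ndarray ones; the `residual_norm` method is likewise illustrative.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Reduced stand-in for the `NumericBackend` trait: only what this sketch needs.
pub trait NumericBackend: Send + Sync {
    fn dot(&self, a: &[f64], b: &[f64]) -> f64;
    fn norm(&self, v: &[f64]) -> f64 {
        self.dot(v, v).sqrt()
    }
}

/// Default backend: straightforward iterator arithmetic.
#[derive(Default)]
pub struct PlainBackend;

impl NumericBackend for PlainBackend {
    fn dot(&self, a: &[f64], b: &[f64]) -> f64 {
        a.iter().zip(b).map(|(x, y)| x * y).sum()
    }
}

/// Alternative backend that counts `dot` calls, e.g. for tests or profiling.
#[derive(Default)]
pub struct CountingBackend {
    calls: AtomicUsize,
}

impl CountingBackend {
    pub fn calls(&self) -> usize {
        self.calls.load(Ordering::Relaxed)
    }
}

impl NumericBackend for CountingBackend {
    fn dot(&self, a: &[f64], b: &[f64]) -> f64 {
        self.calls.fetch_add(1, Ordering::Relaxed);
        PlainBackend.dot(a, b)
    }
}

pub struct SublinearSolver<B: NumericBackend = PlainBackend> {
    backend: B,
}

/// Constructor for the default backend only.
impl SublinearSolver {
    pub fn new() -> Self {
        Self { backend: PlainBackend }
    }
}

impl<B: NumericBackend> SublinearSolver<B> {
    pub fn with_backend(backend: B) -> Self {
        Self { backend }
    }

    pub fn backend(&self) -> &B {
        &self.backend
    }

    /// Residual norm ||b - a * x|| for a dense matrix `a` given as rows.
    pub fn residual_norm(&self, a: &[Vec<f64>], x: &[f64], b: &[f64]) -> f64 {
        let r: Vec<f64> = a.iter().zip(b)
            .map(|(row, bi)| bi - self.backend.dot(row, x))
            .collect();
        self.backend.norm(&r)
    }
}
```

Because the backend is a type parameter rather than a trait object, each instantiation is monomorphized and the backend calls can be inlined; the default type parameter keeps the common case free of generic noise at call sites.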
#### 5.1.2 Distance Function Injection
ruvector-core's `DistanceMetric` enum defines four distance functions (Euclidean, Cosine,
DotProduct, Manhattan). The solver may need additional distance metrics or custom distance
functions. Injection point:
```rust
pub trait DistanceFunction: Send + Sync {
    fn distance(&self, a: &[f32], b: &[f32]) -> f32;
    fn name(&self) -> &str;
}

// Adapt ruvector's existing DistanceMetric
impl DistanceFunction for DistanceMetric {
    fn distance(&self, a: &[f32], b: &[f32]) -> f32 {
        match self {
            DistanceMetric::Euclidean => simsimd_euclidean(a, b),
            DistanceMetric::Cosine => simsimd_cosine(a, b),
            // ...
        }
    }
    // `name()` must also be implemented to satisfy the trait; elided here.
}
```
#### 5.1.3 Storage Backend Injection
ruvector-core already has conditional compilation for storage backends (`storage` vs
`storage_memory`). The solver should use a similar pattern for result caching:
```rust
pub trait SolverCache: Send + Sync {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&self, key: &[u8], value: &[u8]);
    fn invalidate(&self, key: &[u8]);
}
```
Implementations could include:
- `InMemoryCache` (default, using `DashMap`)
- `VectorDBCache` (using ruvector-core's VectorDB for nearest-neighbor result caching)
- `WasmCache` (using IndexedDB, following the `ruvector-wasm/src/indexeddb.js` pattern)
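As an illustration of the first option, here is a minimal in-memory implementation. It is a sketch under one stated substitution: a `std::sync::RwLock<HashMap>` stands in for `DashMap` so that the example needs no external crates.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

pub trait SolverCache: Send + Sync {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&self, key: &[u8], value: &[u8]);
    fn invalidate(&self, key: &[u8]);
}

/// In-memory cache; `RwLock<HashMap>` is a std-only stand-in for `DashMap`.
#[derive(Default)]
pub struct InMemoryCache {
    entries: RwLock<HashMap<Vec<u8>, Vec<u8>>>,
}

impl SolverCache for InMemoryCache {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        // Clone the value out so the read lock is released immediately.
        self.entries.read().unwrap().get(key).cloned()
    }

    fn put(&self, key: &[u8], value: &[u8]) {
        self.entries.write().unwrap().insert(key.to_vec(), value.to_vec());
    }

    fn invalidate(&self, key: &[u8]) {
        self.entries.write().unwrap().remove(key);
    }
}
```

Since the trait takes `&self` everywhere, an `Arc<dyn SolverCache>` can be shared across solver instances and threads without further wrapping.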
#### 5.1.4 Compute Budget Injection
Following `prime-radiant`'s compute ladder pattern (Lane 0 Reflex through Lane 3 Human),
the solver should accept compute budgets. In the solver-side enum, Lane 3 is named `Deliberate`:
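```rust
use serde::{Deserialize, Serialize};
use std::time::Duration;

// The derives are needed because `SolverEvent` (Section 6.1) embeds both types.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ComputeBudget {
    pub max_wall_time: Duration,
    pub max_iterations: usize,
    pub max_memory_bytes: usize,
    pub lane: ComputeLane,
}

#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
pub enum ComputeLane {
    Reflex,     // < 1ms, local only
    Retrieval,  // ~ 10ms, can fetch cached results
    Heavy,      // ~ 100ms, full solver execution
    Deliberate, // unbounded, with streaming progress
}
```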
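One way the budget could be used in practice is sketched below: a lane recommendation derived from an estimated duration, and a check that a running solve is still within its budget. The thresholds are assumptions read off the lane comments above (1 ms, 10 ms, 100 ms), and both function names are hypothetical.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComputeLane {
    Reflex,
    Retrieval,
    Heavy,
    Deliberate,
}

pub struct ComputeBudget {
    pub max_wall_time: Duration,
    pub max_iterations: usize,
    pub max_memory_bytes: usize,
    pub lane: ComputeLane,
}

/// Picks the cheapest lane whose nominal latency covers the estimate.
/// The cut-offs are illustrative assumptions, not measured values.
pub fn recommend_lane(estimated: Duration) -> ComputeLane {
    if estimated < Duration::from_millis(1) {
        ComputeLane::Reflex
    } else if estimated <= Duration::from_millis(10) {
        ComputeLane::Retrieval
    } else if estimated <= Duration::from_millis(100) {
        ComputeLane::Heavy
    } else {
        ComputeLane::Deliberate
    }
}

/// True while a run that has consumed the given resources is within `budget`.
pub fn within_budget(
    budget: &ComputeBudget,
    elapsed: Duration,
    iterations: usize,
    memory_bytes: usize,
) -> bool {
    elapsed <= budget.max_wall_time
        && iterations <= budget.max_iterations
        && memory_bytes <= budget.max_memory_bytes
}
```

A solver loop would call `within_budget` once per iteration and stop with its best result so far as soon as the check fails.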
### 5.2 WASM-Specific Injection Points
In the WASM layer, dependency injection occurs through JavaScript configuration objects:
```typescript
interface SolverOptions {
  // Backend selection
  backend?: 'wasm-simd' | 'wasm-baseline' | 'js-fallback';
  // Worker pool configuration
  workerCount?: number;
  workerUrl?: string;
  // Memory management
  maxMemoryMB?: number;
  useSharedArrayBuffer?: boolean;
  // Progress callback (for streaming)
  onProgress?: (progress: SolverProgress) => void;
}
```
### 5.3 Server-Level Injection
At the API layer, the solver should be injected into the axum `AppState`:
```rust
pub struct AppState {
    // Existing
    pub vector_db: Arc<RwLock<CoreVectorDB>>,
    pub collection_manager: Arc<RwLock<CoreCollectionManager>>,
    // New: solver engine injection
    pub solver: Arc<dyn SolverEngine<Input = SolverInput, Output = SolverOutput, Error = SolverError>>,
}
```
---
## 6. Event-Driven Integration Patterns
### 6.1 Alignment with Prime-Radiant Event Sourcing
The `prime-radiant` crate's `DomainEvent` enum provides a proven event-sourcing pattern.
The solver should emit analogous events for computation provenance:
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum SolverEvent {
    /// A solve request was received
    SolveRequested {
        request_id: String,
        algorithm: String,
        input_dimensions: (usize, usize),
        timestamp: Timestamp,
    },
    /// An iteration completed
    IterationCompleted {
        request_id: String,
        iteration: usize,
        residual_norm: f64,
        wall_time_us: u64,
        timestamp: Timestamp,
    },
    /// The solver converged to a solution
    SolveConverged {
        request_id: String,
        total_iterations: usize,
        final_residual: f64,
        total_wall_time_us: u64,
        timestamp: Timestamp,
    },
    /// The solver exceeded its compute budget
    BudgetExhausted {
        request_id: String,
        budget: ComputeBudget,
        best_residual: f64,
        timestamp: Timestamp,
    },
    /// A complexity estimate was computed
    ComplexityEstimated {
        request_id: String,
        estimated_flops: u64,
        estimated_memory_bytes: u64,
        recommended_lane: ComputeLane,
        timestamp: Timestamp,
    },
}
```
### 6.2 Event Bus Integration
The solver events should be published to the same event infrastructure that prime-radiant
uses. The recommended pattern is a channel-based event bus:
```rust
pub struct SolverWithEvents<S: SolverEngine> {
    solver: S,
    event_tx: tokio::sync::broadcast::Sender<SolverEvent>,
}

impl<S: SolverEngine> SolverWithEvents<S> {
    pub fn subscribe(&self) -> tokio::sync::broadcast::Receiver<SolverEvent> {
        self.event_tx.subscribe()
    }
}
```
This enables:
- **Coherence gate integration**: Prime-radiant can subscribe to solver events and include
solver stability in its coherence energy calculations.
- **Streaming API responses**: The axum server can convert the event stream to SSE.
- **MCP progress notifications**: The MCP server can emit JSON-RPC notifications for
long-running solve operations.
- **Telemetry and monitoring**: The `ruvector-metrics` crate can subscribe and export
Prometheus metrics for solver operations.
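The fan-out behaviour that makes these subscriptions possible can be sketched without tokio. The version below uses one `std::sync::mpsc` channel per subscriber as a stand-in for `tokio::sync::broadcast`; unlike the real broadcast channel it is unbounded and has no lag handling. `EventBus` and the trimmed-down `SolverEvent` are hypothetical.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Trimmed-down event type with just two of the variants from Section 6.1.
#[derive(Debug, Clone, PartialEq)]
pub enum SolverEvent {
    IterationCompleted { request_id: String, iteration: usize, residual_norm: f64 },
    SolveConverged { request_id: String, total_iterations: usize },
}

/// Minimal fan-out bus: every subscriber gets its own channel.
#[derive(Default)]
pub struct EventBus {
    subscribers: Mutex<Vec<Sender<SolverEvent>>>,
}

impl EventBus {
    /// Registers a new subscriber and returns its receiving end.
    pub fn subscribe(&self) -> Receiver<SolverEvent> {
        let (tx, rx) = channel();
        self.subscribers.lock().unwrap().push(tx);
        rx
    }

    /// Sends a clone of `event` to every live subscriber and prunes those
    /// whose receiver has been dropped. Returns the number of deliveries.
    pub fn publish(&self, event: SolverEvent) -> usize {
        let mut subs = self.subscribers.lock().unwrap();
        subs.retain(|tx| tx.send(event.clone()).is_ok());
        subs.len()
    }
}
```

Each consumer (gate, SSE handler, metrics exporter) would hold its own receiver and drain it independently, so a slow consumer does not block the solver.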
### 6.3 Coherence Gate as Solver Governor
A powerful integration pattern connects the solver to prime-radiant's coherence gate:
```
Solve Request --> Complexity Estimate --> Gate Decision --> Execute or Escalate
                                                |
                                     Prime-Radiant evaluates:
                                     - Energy budget available?
                                     - System coherence stable?
                                     - Resource contention low?
```
The `cognitum-gate-tilezero` crate's `permit_action` tool can govern solver execution:
```rust
// Before executing a solver, request permission from the gate
let action = ActionContext {
    action_id: format!("solve-{}", request_id),
    action_type: "heavy_compute".into(),
    target: ActionTarget {
        device: "solver-engine".into(),
        path: format!("/solver/{}", algorithm),
    },
    metadata: ActionMetadata {
        estimated_cost: complexity.estimated_flops as f64,
        estimated_duration_ms: complexity.estimated_wall_time_ms,
    },
};

match gate.permit_action(action).await {
    GateDecision::Permit(token) => solver.solve_with_token(input, token),
    GateDecision::Defer(info) => escalate_to_queue(input, info),
    GateDecision::Deny(reason) => Err(SolverError::Denied(reason)),
}
```
### 6.4 DAG Integration Events
The `ruvector-dag` crate's query plan optimizer can emit events when bottleneck analysis
identifies nodes that would benefit from sublinear-time solving:
```rust
// In ruvector-dag when a bottleneck is detected.
// Note: `BottleneckSolverRequested` would be an additional `SolverEvent`
// variant; it is not part of the enum sketched in Section 6.1.
SolverEvent::BottleneckSolverRequested {
    dag_id: dag.id(),
    bottleneck_nodes: bottlenecks.iter().map(|b| b.node_id).collect(),
    estimated_speedup: bottlenecks.iter().map(|b| b.speedup_potential).sum(),
    timestamp: now(),
}
```
---
## 7. Performance Architecture Considerations
### 7.1 Memory Architecture
#### Current ruvector Memory Model
ruvector-core uses several memory optimization strategies:
- **Arena allocator** (`arena.rs`): Cache-aligned vector allocation with `CACHE_LINE_SIZE`
awareness and batch allocation via `BatchVectorAllocator`.
- **SoA storage** (`cache_optimized.rs`): Structure-of-Arrays layout for cache-friendly
sequential access to vector components.
- **Memory pools** (`memory.rs`): Basic allocation tracking with optional limits.
- **Paged memory** (ADR-006): 2MB page-granular allocation with LRU eviction and
Hot/Warm/Cold residency tiers.
#### Solver Memory Requirements
Sublinear-time algorithms are inherently memory-efficient (often O(n^alpha) for alpha < 1),
but the nalgebra backend may allocate large intermediate matrices. Recommendations:
1. **Use ruvector's arena allocator** for solver-internal scratch space. Wrap nalgebra
allocations in arena-backed storage:
```rust
pub struct SolverArena {
    inner: Arena,
    scratch_matrices: Vec<DMatrix<f32>>,
}
```
2. **Integrate with ADR-006 paged memory** for large problem instances. The solver should
respect the memory pool's limit and request pages through the established interface rather
than allocating directly.
3. **WASM memory budget**: In WASM, limit solver memory to a configurable fraction of the
linear memory. The default WASM memory of 16MB is tight; ensure the solver can operate
within 4-8MB for typical problem sizes, using the `ComputeBudget.max_memory_bytes` field.
### 7.2 SIMD Optimization Strategy
ruvector uses `simsimd 5.9` for distance calculations, achieving approximately 16M ops/sec
for 512-dimensional vectors. The solver should leverage SIMD at two levels:
1. **Auto-vectorization**: Write inner loops in a SIMD-friendly style (sequential access,
no branches, aligned data). LLVM can often auto-vectorize such loops on native targets, and
on WASM when the `simd128` target feature is enabled.
2. **Explicit SIMD**: For hot paths, use `std::arch` intrinsics. The `cfg` attributes below
select the architecture at compile time; on x86_64, `is_x86_feature_detected!` adds runtime
detection of specific instruction sets:
```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;
#[cfg(target_arch = "wasm32")]
use std::arch::wasm32::*;
```
The existing `ruvector-core/src/simd_intrinsics.rs` provides patterns for this.
3. **WASM SIMD128**: The `ruvector-wasm` crate already detects SIMD support via
`detect_simd()`. Compile the solver WASM crate with `-C target-feature=+simd128` for WASM
SIMD support, and ship a separate non-SIMD build as the fallback, because a module that
contains SIMD instructions fails validation on engines without SIMD support.
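As an example of the auto-vectorization-friendly style from item 1, the dot product below uses fixed-width chunks and independent accumulators. It is a sketch, not code from `ruvector-core`; the function name and the lane width of 8 are assumptions, and whether the compiler actually vectorizes it depends on the target and optimization level.

```rust
/// Dot product written so the compiler has a good chance of vectorizing it:
/// fixed-width chunks, independent accumulators, no branches in the hot loop.
pub fn dot_chunked(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    const LANES: usize = 8;

    let a_chunks = a.chunks_exact(LANES);
    let b_chunks = b.chunks_exact(LANES);
    // Elements left over when the length is not a multiple of LANES.
    let a_tail = a_chunks.remainder();
    let b_tail = b_chunks.remainder();

    // One accumulator per lane removes the loop-carried dependency that a
    // single running sum would create.
    let mut acc = [0.0f32; LANES];
    for (ca, cb) in a_chunks.zip(b_chunks) {
        for i in 0..LANES {
            acc[i] += ca[i] * cb[i];
        }
    }

    let tail: f32 = a_tail.iter().zip(b_tail).map(|(x, y)| x * y).sum();
    acc.iter().sum::<f32>() + tail
}
```

Note that splitting the sum across accumulators changes the floating-point summation order, so results can differ from a naive loop in the last bits for general inputs.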
### 7.3 Concurrency Architecture
#### Native (Server) Concurrency
ruvector uses a rich concurrency toolkit:
- **Rayon** for data-parallel operations (conditional on `feature = "parallel"`)
- **Crossbeam** for lock-free data structures
- **DashMap** for concurrent hash maps
- **Parking_lot** for efficient mutexes and RwLocks
- **Tokio** for async I/O and task scheduling
- **Lock-free structures** (`lockfree.rs`): `AtomicVectorPool`, `LockFreeWorkQueue`,
`LockFreeBatchProcessor`
The solver should integrate with this concurrency model:
```rust
#[cfg(feature = "parallel")]
use rayon::prelude::*;

impl SublinearSolver {
    pub fn solve_parallel(&self, input: &[f32]) -> Result<SolverResult> {
        #[cfg(feature = "parallel")]
        {
            input.par_chunks(self.config.chunk_size)
                .map(|chunk| self.solve_chunk(chunk))
                .reduce_with(|a, b| self.merge_results(a?, b?))
                .unwrap_or(Err(SolverError::EmptyInput))
        }
        #[cfg(not(feature = "parallel"))]
        {
            self.solve_sequential(input)
        }
    }
}
```
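The chunk-map-reduce shape of `solve_parallel` can be tried out without Rayon. The sketch below uses `std::thread::scope` as a stand-in, spawning one thread per chunk (a simplification; Rayon schedules chunks on a work-stealing pool). `Partial`, `solve_chunk`, and `merge` are hypothetical placeholders for the solver's real per-chunk work and merge step.

```rust
use std::thread;

/// Per-chunk partial result; here just a sum of squares and an element count.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Partial {
    pub sum_sq: f64,
    pub count: usize,
}

fn solve_chunk(chunk: &[f32]) -> Partial {
    Partial {
        sum_sq: chunk.iter().map(|&x| (x as f64) * (x as f64)).sum(),
        count: chunk.len(),
    }
}

fn merge(a: Partial, b: Partial) -> Partial {
    Partial { sum_sq: a.sum_sq + b.sum_sq, count: a.count + b.count }
}

/// Splits `input` into chunks, processes each on its own scoped thread, and
/// merges the partial results in chunk order. Returns `None` for empty input.
pub fn solve_parallel(input: &[f32], chunk_size: usize) -> Option<Partial> {
    let chunk_size = chunk_size.max(1);
    thread::scope(|scope| {
        // Scoped threads may borrow `input` because they are joined before
        // `scope` returns.
        let handles: Vec<_> = input
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || solve_chunk(chunk)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .reduce(merge)
    })
}
```

The `None` case plays the role of `SolverError::EmptyInput` in the Rayon version: with no chunks there is nothing to reduce.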
#### WASM Concurrency
WASM does not support native threads. The solver must use Web Workers for parallelism:
- Follow the `ruvector-wasm/src/worker-pool.js` pattern
- Use `SharedArrayBuffer` for zero-copy data sharing between workers (requires
`Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp`)
- Fall back to `postMessage` with transferable `ArrayBuffer` when SAB is unavailable
### 7.4 Latency Targets by Deployment Context
| Context | Target Latency | Memory Budget | Strategy |
|---------|---------------|---------------|----------|
| **WASM (browser)** | < 50ms for 10K elements | 4-8 MB | SIMD128, single-threaded, streaming |
| **WASM (edge/Cloudflare)** | < 10ms for 10K elements | 128 MB | SIMD128, limited workers |
| **Node.js (NAPI)** | < 5ms for 10K elements | 512 MB | Native SIMD, Rayon parallel |
| **Server (axum)** | < 2ms for 10K elements | 2 GB | Full SIMD, Rayon, memory-mapped |
| **MCP (agent)** | Budget-dependent | Configurable | Gate-governed, compute ladder |
### 7.5 Benchmarking Integration
ruvector uses `criterion 0.5` for benchmarking with HTML reports. The solver should integrate
into the existing benchmark infrastructure:
```rust
// benches/solver_benchmarks.rs
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};
use ruvector_solver::{SublinearSolver, SolverConfig};

fn bench_sublinear_solve(c: &mut Criterion) {
    let mut group = c.benchmark_group("sublinear_solver");
    for size in [100, 1_000, 10_000, 100_000] {
        group.bench_with_input(
            BenchmarkId::new("bmssp", size),
            &size,
            |b, &size| {
                // `new` returns a `Result` in the WASM sketch (Section 3.3).
                let solver = SublinearSolver::new(SolverConfig::default())
                    .expect("default config should be valid");
                let input: Vec<f32> = (0..size).map(|i| i as f32).collect();
                b.iter(|| solver.solve(&input));
            },
        );
    }
    group.finish();
}

// Register the benchmark with criterion's generated `main`.
criterion_group!(benches, bench_sublinear_solve);
criterion_main!(benches);
```
The benchmark results should be stored in the existing `bench_results/` directory in JSON
format, matching the schema used by `comparison_benchmark.json` and `latency_benchmark.json`.
### 7.6 Profile-Guided Optimization
The workspace `Cargo.toml` already configures aggressive release optimizations:
```toml
[profile.release]
opt-level = 3
lto = "fat"
codegen-units = 1
strip = true
```
These settings are critical for solver performance. Additional considerations:
- **PGO (Profile-Guided Optimization)**: For the NAPI binary, consider adding a PGO training
step using representative solver workloads.
- **WASM opt**: Run `wasm-opt -O3` on the solver WASM output (the existing build scripts
in `ruvector-wasm` likely already do this).
- **Link-time optimization across crates**: The `lto = "fat"` setting enables cross-crate
LTO, which lets LLVM inline across crate boundaries, including nalgebra routines in solver
hot paths that are not already generic or marked `#[inline]`.
### 7.7 Zero-Copy Data Path
The critical performance path for the solver is the data pipeline from API input to solver
core and back. Minimize copies:
```
API (axum):  body bytes --deserialize--> SolverInput
                                              |
                               +---------borrow-----------+
                               |                          |
                     nalgebra::DMatrixSlice          result buffer
                               |                          |
                               +------solve-------->------+
                                                          |
                                           --serialize--> API response bytes
```
For the WASM path:
```
JS Float32Array --view (no copy)--> wasm linear memory --solve--> wasm linear memory
                                                                        |
                                                    --view (no copy)--> JS Float32Array
```
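The key is to use `Float32Array::view()` in wasm-bindgen rather than `Float32Array::copy_from()`
wherever the solver does not need to retain ownership of the data. Note that `view()` is
`unsafe`: the resulting array aliases WASM linear memory and is invalidated if that memory
grows, so it must be consumed before any further allocation. On the input side, avoiding a
copy requires the JS caller to write directly into a buffer allocated inside WASM memory.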
---
## Summary of Key Recommendations
1. **Create `crates/ruvector-solver`** as a new pure-Rust workspace member, following the
established core-binding-surface pattern.
2. **Add `nalgebra` as a workspace dependency** and create a bridge module in `ruvector-math`
for zero-cost conversions between nalgebra and ndarray representations.
3. **Follow the existing three-crate pattern** exactly: `ruvector-solver` (core),
`ruvector-solver-wasm` (browser), `ruvector-solver-node` (server).
4. **Integrate with prime-radiant's event sourcing** by emitting `SolverEvent`s through
a broadcast channel, enabling coherence gate governance and streaming API responses.
5. **Use the coherence gate as a solver governor** to prevent runaway computation and
integrate with the compute ladder (Lane 0-3).
6. **Inject the solver into `AppState`** for axum server integration, and add new MCP
tools to `mcp-gate` for AI agent access.
7. **Respect ruvector's memory architecture** by integrating with the arena allocator,
SoA storage patterns, and ADR-006 paged memory management.
8. **Target WASM SIMD128** for browser performance, with graceful fallback to scalar code
detected at runtime via the existing `detect_simd()` mechanism.
9. **Use Rayon with feature gating** for native parallelism, and Web Workers for WASM
parallelism, following the patterns already established in `ruvector-wasm`.
10. **Integrate benchmarks into the existing `criterion` infrastructure** and store results
in the `bench_results/` directory for regression tracking.