# 06 - WebAssembly Integration Analysis **Agent**: 6 (WASM Integration Specialist) **Date**: 2026-02-20 **Scope**: ruvector codebase WASM capabilities, build pipeline, SIMD acceleration, memory management, deployment strategies, module loading, and benchmarking framework --- ## Table of Contents 1. [Existing WASM Usage in ruvector](#1-existing-wasm-usage-in-ruvector) 2. [WASM Build Pipeline Compatibility](#2-wasm-build-pipeline-compatibility) 3. [SIMD Acceleration Opportunities](#3-simd-acceleration-opportunities) 4. [Memory Management Patterns](#4-memory-management-patterns) 5. [Browser vs Node.js Deployment Strategies](#5-browser-vs-nodejs-deployment-strategies) 6. [WASM Module Loading and Initialization Patterns](#6-wasm-module-loading-and-initialization-patterns) 7. [Performance Benchmarking Framework for WASM](#7-performance-benchmarking-framework-for-wasm) 8. [Recommendations for the Sublinear-Time Solver](#8-recommendations-for-the-sublinear-time-solver) --- ## 1. Existing WASM Usage in ruvector ### 1.1 Scale of WASM Infrastructure The ruvector project has a **massive, mature WASM infrastructure**. The workspace defines **27 dedicated WASM crates** in the Cargo workspace, spanning vector database operations, attention mechanisms, graph algorithms, ML inference, and self-learning solvers. This is not an experimental feature -- it is a first-class deployment target. #### WASM Crate Inventory (27 crates) | Crate | Description | Target | Size | |-------|-------------|--------|------| | `ruvector-wasm` | Core vector DB bindings (HNSW, insert, search, delete) | `wasm32-unknown-unknown` (wasm-bindgen) | ~28 KB src | | `rvf-solver-wasm` | Self-learning temporal solver (Thompson Sampling, PolicyKernel) | `wasm32-unknown-unknown` (no_std + alloc, `extern "C"`) | ~160 KB compiled | | `rvf-wasm` | RVF format microkernel for browser/edge vector ops | `wasm32-unknown-unknown` | - | | `micro-hnsw-wasm` | Neuromorphic HNSW with spiking neural nets | `wasm32-unknown-unknown` | 11.8 KB compiled | | `ruvector-attention-wasm` | 18+ attention mechanisms (Flash, MoE, Hyperbolic) | `wasm32-unknown-unknown` (wasm-bindgen) | - | | `ruvector-attention-unified-wasm` | Unified attention API | `wasm32-unknown-unknown` | 339 KB compiled | | `ruvector-learning-wasm` | MicroLoRA adaptation (<100us latency) | `wasm32-unknown-unknown` | 39 KB compiled | | `ruvector-nervous-system-wasm` | Bio-inspired neural simulation | `wasm32-unknown-unknown` | 178 KB compiled | | `ruvector-economy-wasm` | Compute credit management | `wasm32-unknown-unknown` | 181 KB compiled | | `ruvector-exotic-wasm` | Quantum, hyperbolic, topological | `wasm32-unknown-unknown` | 149 KB compiled | | `ruvector-sparse-inference-wasm` | Sparse matrix inference with WASM SIMD | `wasm32-unknown-unknown` | - | | `ruvector-delta-wasm` | Delta operations with SIMD | `wasm32-unknown-unknown` | - | | `ruvector-mincut-wasm` | Subpolynomial-time dynamic min-cut | `wasm32-unknown-unknown` | - | | `ruvector-mincut-gated-transformer-wasm` | Gated transformer min-cut | `wasm32-unknown-unknown` | - | | `ruvector-graph-wasm` | Graph operations | `wasm32-unknown-unknown` | - | | `ruvector-gnn-wasm` | Graph neural networks | `wasm32-unknown-unknown` | - | | `ruvector-dag-wasm` | Minimal DAG for browser/embedded | `wasm32-unknown-unknown` | - | | `ruvector-math-wasm` | Math operations (Wasserstein, manifolds, spherical) | `wasm32-unknown-unknown` | - | | `ruvector-router-wasm` | Query routing | `wasm32-unknown-unknown` | - | | `ruvector-fpga-transformer-wasm` | FPGA transformer simulation | `wasm32-unknown-unknown` | - | | `ruvector-temporal-tensor-wasm` | Temporal tensor operations | `wasm32-unknown-unknown` | - | | `ruvector-tiny-dancer-wasm` | Lightweight operations | `wasm32-unknown-unknown` | - | | `ruvector-hyperbolic-hnsw-wasm` | Hyperbolic HNSW | `wasm32-unknown-unknown` | - | | `ruvector-domain-expansion-wasm` | Cross-domain transfer learning | `wasm32-unknown-unknown` | - | | `ruvllm-wasm` | LLM inference | `wasm32-unknown-unknown` | - | | `ruqu-wasm` | Quantum operations | `wasm32-unknown-unknown` | - | | `exo-wasm` (example) | Exo AI experiment | `wasm32-unknown-unknown` | - | ### 1.2 Two Distinct WASM Binding Strategies The codebase employs **two fundamentally different WASM integration patterns**: #### Pattern A: wasm-bindgen + wasm-pack (High-Level, Browser-First) Used by: `ruvector-wasm`, `ruvector-attention-wasm`, `ruvector-math-wasm`, most `-wasm` crates. ```rust // crates/ruvector-wasm/src/lib.rs use wasm_bindgen::prelude::*; use js_sys::{Float32Array, Object, Promise}; use web_sys::{console, IdbDatabase, IdbFactory}; #[wasm_bindgen(start)] pub fn init() { console_error_panic_hook::set_once(); tracing_wasm::set_as_global_default(); } #[wasm_bindgen] pub struct VectorDB { /* ... */ } #[wasm_bindgen] impl VectorDB { #[wasm_bindgen(constructor)] pub fn new(dimensions: usize, metric: Option, use_hnsw: Option) -> Result { /* ... */ } } ``` Key dependencies: `wasm-bindgen`, `wasm-bindgen-futures`, `js-sys`, `web-sys`, `serde-wasm-bindgen`, `console_error_panic_hook`. Advantages: Rich JS interop, automatic TypeScript type generation, Promise support, access to Web APIs (IndexedDB, Workers, console). #### Pattern B: no_std + extern "C" ABI (Low-Level, Minimal) Used by: `rvf-solver-wasm`, `rvf-wasm`, `micro-hnsw-wasm`. ```rust // crates/rvf/rvf-solver-wasm/src/lib.rs #![no_std] extern crate alloc; #[no_mangle] pub extern "C" fn rvf_solver_create() -> i32 { registry().create() } #[no_mangle] pub extern "C" fn rvf_solver_train(handle: i32, count: i32, /* ... */) -> i32 { /* ... */ } ``` Key dependencies: `dlmalloc` (global allocator), `libm`, `serde` (no_std + alloc). No wasm-bindgen. Advantages: Minimal binary size (~160 KB for rvf-solver-wasm, 11.8 KB for micro-hnsw-wasm), no JS runtime dependency, runs on bare wasm32-unknown-unknown, suitable for self-bootstrapping RVF files. ### 1.3 Kernel Pack System (ADR-005) The `ruvector-wasm` crate includes a sophisticated **Kernel Pack System** (`/crates/ruvector-wasm/src/kernel/`) for secure, sandboxed execution of ML compute kernels via Wasmtime: - **Manifest parsing** (`manifest.rs`): Declares kernel categories (Positional/RoPE, Normalization/RMSNorm, Activation/SwiGLU, KV-Cache, Adapter/LoRA), tensor specs, resource limits - **Ed25519 signature verification** (`signature.rs`): Supply chain security for kernel packs - **SHA256 hash verification** (`hash.rs`): Content integrity - **Epoch-based execution budgets** (`epoch.rs`): Coarse-grained interruption with configurable tick intervals (10ms server, 1ms embedded) - **Shared memory protocol** (`memory.rs`): 16-byte aligned allocation, region overlap validation, tensor layout management - **Kernel runtime** (`runtime.rs`): `KernelRuntime` trait with compile/instantiate/execute lifecycle, mock runtime for testing - **Trusted allowlist** (`allowlist.rs`): Restricts which kernel IDs may execute This kernel pack system is directly relevant to the sublinear-time solver because it provides a ready-made infrastructure for sandboxed execution of solver kernels with resource limits. ### 1.4 Self-Bootstrapping WASM (RVF Format) The `rvf-types` crate defines a `WasmHeader` (`/crates/rvf/rvf-types/src/wasm_bootstrap.rs`) for embedding WASM modules directly inside `.rvf` data files: ``` .rvf file +-- WASM_SEG (role=Interpreter, ~50 KB) +-- WASM_SEG (role=Microkernel, ~5.5 KB) +-- VEC_SEG (data) ``` Roles: `Microkernel`, `Interpreter`, `Combined`, `Extension`, `ControlPlane`. Targets: `Wasm32`, `WasiP1`, `WasiP2`, `Browser`, `BareTile`. Feature flags: `WASM_FEAT_SIMD`, `WASM_FEAT_BULK_MEMORY`, `WASM_FEAT_MULTI_VALUE`, `WASM_FEAT_REFERENCE_TYPES`, `WASM_FEAT_THREADS`, `WASM_FEAT_TAIL_CALL`, `WASM_FEAT_GC`, `WASM_FEAT_EXCEPTION_HANDLING`. ### 1.5 Unified WASM TypeScript API The `@ruvector/wasm-unified` npm package (`/npm/packages/ruvector-wasm-unified/src/index.ts`) provides a high-level TypeScript surface combining all WASM modules: ```typescript export interface UnifiedEngine { attention: AttentionEngine; // 14+ mechanisms learning: LearningEngine; // MicroLoRA, SONA, BTSP, RL nervous: NervousEngine; // Bio-inspired neural simulation economy: EconomyEngine; // Compute credits exotic: ExoticEngine; // Quantum, hyperbolic, topological version(): string; getStats(): UnifiedStats; init(): Promise; dispose(): void; } ``` --- ## 2. WASM Build Pipeline Compatibility ### 2.1 Workspace-Level Configuration The root `Cargo.toml` defines workspace-level WASM dependencies: ```toml # /Cargo.toml (workspace) [workspace.dependencies] wasm-bindgen = "0.2" wasm-bindgen-futures = "0.4" js-sys = "0.3" web-sys = { version = "0.3", features = ["Worker", "MessagePort", "console"] } getrandom = { version = "0.3", features = ["wasm_js"] } ``` There is also a getrandom compatibility patch for WASM: ```toml # In ruvector-wasm/Cargo.toml getrandom02 = { package = "getrandom", version = "0.2", features = ["js"] } [target.'cfg(target_arch = "wasm32")'.dependencies] getrandom = { workspace = true, features = ["wasm_js"] } ``` And a workspace-level patch for hnsw_rs to use rand 0.8 for WASM compatibility: ```toml [patch.crates-io] hnsw_rs = { path = "./patches/hnsw_rs" } ``` ### 2.2 Build Profiles Two distinct WASM build profiles exist: #### Profile 1: Size-Optimized (for wasm-bindgen crates) ```toml # crates/ruvector-wasm/Cargo.toml [profile.release] opt-level = "z" # Optimize for size lto = true # Link-time optimization codegen-units = 1 # Single codegen unit panic = "abort" # No unwind tables [profile.release.package."*"] opt-level = "z" [package.metadata.wasm-pack.profile.release] wasm-opt = false # Disable wasm-opt (already optimized by LTO) ``` #### Profile 2: Size-Optimized + Strip (for no_std crates) ```toml # crates/rvf/rvf-solver-wasm/Cargo.toml [profile.release] opt-level = "z" lto = true codegen-units = 1 strip = true # Also strips debug symbols ``` #### Profile 3: Workspace Default Release (native) ```toml # Root Cargo.toml [profile.release] opt-level = 3 # Optimize for speed lto = "fat" codegen-units = 1 strip = true panic = "unwind" # Keeps unwind tables (unlike WASM profile) ``` ### 2.3 Build Tooling The test script at `/scripts/test/test-wasm.mjs` demonstrates the build command: ```bash wasm-pack build crates/ruvector-attention-wasm --target web --release ``` For no_std crates like rvf-solver-wasm, the standard cargo command with WASM target is used: ```bash cargo build --target wasm32-unknown-unknown --release -p rvf-solver-wasm ``` ### 2.4 Sublinear-Time Solver Build Compatibility The rvf-solver-wasm crate provides the closest precedent for a sublinear-time solver WASM build: - **Target**: `wasm32-unknown-unknown` (no WASI dependency) - **Allocator**: `dlmalloc` (global allocator for `alloc`) - **Math**: `libm` (no_std-compatible math functions) - **Serialization**: `serde` + `serde_json` (no_std + alloc features) - **Crypto**: `rvf-crypto` (SHAKE-256 witness chain) - **Panic handler**: `core::arch::wasm32::unreachable()` - **ABI**: `extern "C"` exports (no wasm-bindgen overhead) - **Crate type**: `cdylib` only (no rlib) This approach produces binaries in the ~160 KB range, which is excellent for edge deployment. --- ## 3. SIMD Acceleration Opportunities ### 3.1 Existing WASM SIMD Infrastructure The codebase has **extensive WASM SIMD128 support** across multiple crates, all using `core::arch::wasm32::*` intrinsics. Every SIMD function provides dual implementations: a `#[cfg(target_feature = "simd128")]` version using WASM SIMD intrinsics and a `#[cfg(not(target_feature = "simd128"))]` scalar fallback. #### WASM SIMD Operations Already Implemented | Crate | File | Operations | |-------|------|------------| | `ruvector-delta-wasm` | `src/simd.rs` | `f32x4` add, sub, scale, dot, L2 norm, diff, abs, clamp, count_nonzero | | `ruvector-sparse-inference` | `src/backend/wasm.rs` | `f32x4` dot product, ReLU, vector add, AXPY | | `ruvector-mincut` | `src/wasm/simd.rs` | `v128` popcount (table lookup method), XOR, boundary computation, batch membership | | `ruvector-core` | `src/simd_intrinsics.rs` | x86_64 (AVX2, AVX-512, FMA), aarch64 (NEON, unrolled), INT8 quantized, batch operations | #### SIMD Operations in ruvector-delta-wasm/src/simd.rs (Representative) ```rust use core::arch::wasm32::*; #[cfg(target_feature = "simd128")] pub fn simd_dot(a: &[f32], b: &[f32]) -> f32 { let chunks = a.len() / 4; let mut sum_vec = f32x4_splat(0.0); for i in 0..chunks { let offset = i * 4; unsafe { let a_vec = v128_load(a.as_ptr().add(offset) as *const v128); let b_vec = v128_load(b.as_ptr().add(offset) as *const v128); let prod = f32x4_mul(a_vec, b_vec); sum_vec = f32x4_add(sum_vec, prod); } } // Horizontal sum + remainder handling let sum_array: [f32; 4] = unsafe { core::mem::transmute(sum_vec) }; let mut sum = sum_array[0] + sum_array[1] + sum_array[2] + sum_array[3]; for i in (chunks * 4)..a.len() { sum += a[i] * b[i]; } sum } ``` #### SIMD Operations in ruvector-sparse-inference/src/backend/wasm.rs (Backend Trait) ```rust pub struct WasmBackend; impl Backend for WasmBackend { fn dot_product(&self, a: &[f32], b: &[f32]) -> f32 { /* SIMD dispatch */ } fn sparse_matmul(&self, matrix: &Array2, input: &[f32], rows: &[usize]) -> Vec; fn sparse_matmul_accumulate(&self, matrix: &Array2, input: &[f32], cols: &[usize], output: &mut [f32]); fn activation(&self, data: &mut [f32], activation_type: ActivationType); // ReLU via SIMD fn add(&self, a: &mut [f32], b: &[f32]); fn axpy(&self, a: &mut [f32], b: &[f32], scalar: f32); fn name(&self) -> &'static str { "WASM-SIMD" } fn simd_width(&self) -> usize { 4 } // 128-bit = 4 x f32 } ``` ### 3.2 SIMD Acceleration Opportunities for the Sublinear-Time Solver Based on the sublinear-time solver's core operations, the following SIMD acceleration points are identified: | Operation | SIMD Strategy | Expected Speedup | Existing Pattern | |-----------|---------------|-------------------|------------------| | Distance computation (dot, cosine, euclidean) | `f32x4_mul` + `f32x4_add` accumulation | 2-4x | `ruvector-delta-wasm/src/simd.rs` | | Vector normalization | `f32x4_mul` (scale) + `f32x4_add` (L2 norm) | 2-4x | `simd_l2_norm_squared`, `simd_scale` | | Bitset operations (partition tracking) | `v128_xor`, `v128_and`, popcount via lookup | 4-8x | `ruvector-mincut/src/wasm/simd.rs` | | Sparse matrix-vector multiply | SIMD dot + sparse row selection | 2-4x | `WasmBackend::sparse_matmul` | | Activation functions (ReLU, GELU) | `f32x4_max` with zero splat | 2-4x | `relu_wasm_simd` | | Thompson Sampling bandit updates | Scalar (branching-heavy) | 1x (no benefit) | N/A | | Sort/selection (top-k) | Scalar (comparison-heavy) | 1x (no benefit) | N/A | ### 3.3 SIMD Feature Detection The `ruvector-wasm` crate exposes SIMD detection to JS: ```rust #[wasm_bindgen(js_name = detectSIMD)] pub fn detect_simd() -> bool { #[cfg(target_feature = "simd128")] { true } #[cfg(not(target_feature = "simd128"))] { false } } ``` For the sublinear-time solver, SIMD should be compiled in via `RUSTFLAGS="-C target-feature=+simd128"` at build time, with scalar fallbacks for environments that do not support it. ### 3.4 Native SIMD Comparison The native codebase (`ruvector-core/src/simd_intrinsics.rs`) supports: - **x86_64**: AVX2 (256-bit, 8 x f32), AVX-512 (512-bit, 16 x f32), FMA, INT8 quantized - **aarch64**: NEON (128-bit, 4 x f32), 4x loop unrolling, FMA via `vfmaq_f32` - **WASM**: SIMD128 (128-bit, 4 x f32) WASM SIMD128 provides the same width as NEON (4 x f32) but lacks FMA (`f32x4_fma` is not available in stable WASM SIMD). This means the sublinear-time solver WASM build will be approximately 2-3x slower than a native NEON build for distance computations, and 4-8x slower than an AVX-512 build. However, it will still be significantly faster than scalar fallback. --- ## 4. Memory Management Patterns ### 4.1 Shared Memory Protocol (Kernel Pack System) The kernel pack system at `/crates/ruvector-wasm/src/kernel/memory.rs` defines a mature shared memory protocol: ```rust pub struct SharedMemoryProtocol { total_size: usize, // Total memory in bytes current_offset: usize, // Bump allocator position alignment: usize, // Typically 16 bytes } impl SharedMemoryProtocol { pub fn default_settings() -> Self { Self::new(256, 16) // 256 pages = 16 MB, 16-byte alignment } pub fn allocate(&mut self, size: usize) -> Result { let aligned_offset = self.align_offset(self.current_offset); // ...bounds check... self.current_offset = aligned_offset + size; Ok(aligned_offset) } } ``` The `KernelInvocationDescriptor` manages tensor memory layout: ```rust pub struct KernelInvocationDescriptor { pub descriptor: KernelDescriptor, // input_a, input_b, output, scratch, params offsets+sizes protocol: SharedMemoryProtocol, } ``` The `MemoryLayoutValidator` prevents region overlap and bounds violations. ### 4.2 Typed Arrays / Zero-Copy Transfer The wasm-bindgen crates use `Float32Array` for zero-copy data transfer between JS and WASM: ```rust // Input: JS Float32Array -> Rust Vec pub fn insert(&self, vector: Float32Array, ...) -> Result { let vector_data: Vec = vector.to_vec(); // Copy from JS typed array // ... } // Output: Rust Vec -> JS Float32Array pub fn vector(&self) -> Float32Array { Float32Array::from(&self.inner.vector[..]) // Copy to JS typed array } ``` Note: `Float32Array::to_vec()` and `Float32Array::from()` perform copies. True zero-copy requires accessing WASM linear memory directly from JS, which is demonstrated in the pwa-loader: ```javascript // Zero-copy write into WASM memory function wasmWrite(data) { const ptr = wasmInstance.exports.rvf_alloc(data.length); const mem = new Uint8Array(wasmMemory.buffer, ptr, data.length); mem.set(data); // Direct memory write return ptr; } // Zero-copy read from WASM memory function wasmRead(ptr, len) { return new Uint8Array(wasmMemory.buffer, ptr, len).slice(); } ``` ### 4.3 Memory Patterns in rvf-solver-wasm (no_std) The no_std solver uses `dlmalloc` as global allocator and manages its own instance registry: ```rust // Global mutable registry - safe in single-threaded WASM static mut REGISTRY: Registry = Registry::new(); const MAX_INSTANCES: usize = 8; struct SolverInstance { solver: AdaptiveSolver, last_result_json: Vec, // Heap-allocated via dlmalloc policy_json: Vec, witness_chain: Vec, } ``` Memory export for external reads uses raw pointer copies: ```rust #[no_mangle] pub extern "C" fn rvf_solver_result_read(handle: i32, out_ptr: i32) -> i32 { let data = &inst.last_result_json; unsafe { core::ptr::copy_nonoverlapping(data.as_ptr(), out_ptr as *mut u8, data.len()); } data.len() as i32 } ``` ### 4.4 Memory Limits | Configuration | Max Pages | Memory Limit | Context | |---------------|-----------|--------------|---------| | Server runtime | 1024 | 64 MB | `RuntimeConfig::server()` | | Embedded runtime | 64 | 4 MB | `RuntimeConfig::embedded()` | | Default shared memory | 256 | 16 MB | `SharedMemoryProtocol::default_settings()` | | Microkernel (RVF) | 2-4 | 128-256 KB | `WasmHeader` min/max pages | | WASM page size | 1 | 64 KB | `WASM_PAGE_SIZE = 65536` | ### 4.5 Security Boundary Validation The `ruvector-wasm` crate enforces input validation at the WASM boundary: ```rust const MAX_VECTOR_DIMENSIONS: usize = 65536; #[wasm_bindgen(constructor)] pub fn new(vector: Float32Array, ...) -> Result { let vec_len = vector.length() as usize; if vec_len == 0 { return Err(JsValue::from_str("Vector cannot be empty")); } if vec_len > MAX_VECTOR_DIMENSIONS { return Err(JsValue::from_str(&format!( "Vector dimensions {} exceed maximum allowed {}", vec_len, MAX_VECTOR_DIMENSIONS ))); } // ... } ``` --- ## 5. Browser vs Node.js Deployment Strategies ### 5.1 Browser Deployment (Primary) The ruvector-wasm crate is browser-first, using: - **IndexedDB persistence**: `web-sys` features include `IdbDatabase`, `IdbFactory`, `IdbObjectStore`, `IdbRequest`, `IdbTransaction`, `IdbOpenDbRequest` (`/crates/ruvector-wasm/Cargo.toml`) - **Web Workers**: Embedded JavaScript worker pool (`/crates/ruvector-wasm/src/worker-pool.js`, `/crates/ruvector-wasm/src/worker.js`) for parallel operations - **Tracing via console**: `tracing-wasm` sends logs to browser dev tools - **Promise-based async**: `wasm-bindgen-futures` for async operations - **getrandom via JS**: `getrandom` with `wasm_js` feature uses `crypto.getRandomValues()` - **PWA support**: The pwa-loader example (`/examples/pwa-loader/app.js`) demonstrates offline-capable WASM loading #### Browser Loading Pattern ```javascript // From examples/pwa-loader/app.js async function loadWasm() { const response = await fetch(WASM_PATH); const bytes = await response.arrayBuffer(); const importObject = { env: {} }; const result = await WebAssembly.instantiate(bytes, importObject); wasmInstance = result.instance; wasmMemory = wasmInstance.exports.memory; } ``` #### Browser SIMD Support WASM SIMD128 is supported in Chrome 91+, Firefox 89+, Safari 16.4+, and Edge 91+. This covers >95% of active browsers as of 2026. Feature detection can be done via: ```javascript const simdSupported = WebAssembly.validate( new Uint8Array([0,97,115,109,1,0,0,0,1,5,1,96,0,1,123,3,2,1,0,10,10,1,8,0,65,0,253,15,253,98,11]) ); ``` ### 5.2 Node.js Deployment The project supports Node.js via: - **wasm-pack `--target nodejs`**: Generates CommonJS bindings - **Direct instantiation** from test scripts (`/scripts/test/test-wasm.mjs`): ```javascript import { readFileSync } from 'fs'; const wasmBuffer = readFileSync(wasmPath); const mathWasm = await import(join(pkgPath, 'ruvector_math_wasm.js')); await mathWasm.default(wasmBuffer); ``` - **Edge-net example**: `/examples/edge-net/pkg/node/` provides Node-specific WASM packages Node.js has had WASM SIMD support since v16.4 (V8 9.1+). For the sublinear-time solver, Node.js deployment enables server-side and CLI usage with the same WASM binary. ### 5.3 Edge / Embedded Deployment The `micro-hnsw-wasm` crate (11.8 KB) and `rvf-solver-wasm` (~160 KB) demonstrate ultra-compact deployment: - **iOS/Swift**: `/examples/wasm/ios/` includes Swift resources with embedded WASM - **Self-bootstrapping**: The WASM_SEG system embeds WASM interpreters inside data files - **Target platforms**: `WasmTarget::Wasm32`, `WasiP1`, `WasiP2`, `Browser`, `BareTile` ### 5.4 Deployment Target Matrix | Target | WASM Format | Binding | SIMD | Size Budget | Persistence | |--------|-------------|---------|------|-------------|-------------| | Browser (Chrome/FF/Safari) | wasm-bindgen | JS glue + TS types | SIMD128 | <500 KB | IndexedDB | | Node.js (>= 16.4) | wasm-bindgen (nodejs) or raw | CommonJS/ESM | SIMD128 | <1 MB | fs | | Cloudflare Workers | wasm-bindgen (web) | ESM | SIMD128 | <1 MB | KV | | iOS/Swift | raw wasm32 | C FFI | Optional | <200 KB | CoreData | | Bare-metal / RVF | no_std cdylib | extern "C" | Optional | <200 KB | None | --- ## 6. WASM Module Loading and Initialization Patterns ### 6.1 Pattern 1: wasm-bindgen Auto-Init Used by most WASM crates. The `#[wasm_bindgen(start)]` attribute runs initialization automatically: ```rust #[wasm_bindgen(start)] pub fn init() { console_error_panic_hook::set_once(); tracing_wasm::set_as_global_default(); } ``` JS side (generated by wasm-pack): ```javascript import init, { VectorDB } from './ruvector_wasm.js'; await init(); // Loads + instantiates + runs start function const db = new VectorDB(384, 'cosine', true); ``` ### 6.2 Pattern 2: Manual WebAssembly.instantiate Used by the pwa-loader and no_std modules: ```javascript const response = await fetch(WASM_PATH); const bytes = await response.arrayBuffer(); const importObject = { env: {} }; const result = await WebAssembly.instantiate(bytes, importObject); wasmInstance = result.instance; wasmMemory = wasmInstance.exports.memory; ``` This pattern offers maximum control: the host can inspect exports before calling any function, handle errors granularly, and manage memory directly. ### 6.3 Pattern 3: Streaming Instantiation For large modules, `WebAssembly.instantiateStreaming` should be used (not currently in the codebase but recommended): ```javascript const result = await WebAssembly.instantiateStreaming( fetch(WASM_PATH), importObject ); ``` This starts compiling while bytes are still downloading, reducing load time by up to 50%. ### 6.4 Pattern 4: Unified Engine Lazy Init The `@ruvector/wasm-unified` uses lazy initialization: ```typescript let defaultEngine: UnifiedEngine | null = null; export async function getDefaultEngine(): Promise { if (!defaultEngine) { defaultEngine = await createUnifiedEngine(); await defaultEngine.init(); } return defaultEngine; } ``` ### 6.5 Pattern 5: Instance Registry (rvf-solver-wasm) The solver WASM uses a handle-based instance registry: ```rust static mut REGISTRY: Registry = Registry::new(); // Max 8 concurrent solvers // JS creates solver: let handle = wasmInstance.exports.rvf_solver_create(); // JS uses solver: wasmInstance.exports.rvf_solver_train(handle, 100, 1, 10, seedLo, seedHi); // JS reads result: let len = wasmInstance.exports.rvf_solver_result_len(handle); let ptr = wasmInstance.exports.rvf_solver_alloc(len); wasmInstance.exports.rvf_solver_result_read(handle, ptr); let json = new TextDecoder().decode(new Uint8Array(wasmMemory.buffer, ptr, len)); // JS destroys: wasmInstance.exports.rvf_solver_destroy(handle); ``` This is the recommended pattern for the sublinear-time solver because it: - Supports multiple concurrent solver instances - Avoids global state issues - Enables resource cleanup - Works across all deployment targets (browser, Node, bare-metal) --- ## 7. Performance Benchmarking Framework for WASM ### 7.1 Existing Benchmark Infrastructure #### In-WASM Benchmark Function The `ruvector-wasm` crate includes a built-in benchmark export: ```rust #[wasm_bindgen(js_name = benchmark)] pub fn benchmark(name: &str, iterations: usize, dimensions: usize) -> Result { let start = Instant::now(); for i in 0..iterations { let vector: Vec = (0..dimensions) .map(|_| js_sys::Math::random() as f32) .collect(); let vector_arr = Float32Array::from(&vector[..]); db.insert(vector_arr, Some(format!("vec_{}", i)), None)?; } let duration = start.elapsed(); Ok(iterations as f64 / duration.as_secs_f64()) } ``` #### WASM Solver Benchmark Binary The `/examples/benchmarks/src/bin/wasm_solver_bench.rs` provides a native vs WASM comparison framework: ``` WASM vs Native AGI Solver Benchmark Config: holdout=50, training=50, cycles=3, budget=200 NATIVE SOLVER RESULTS Mode Acc% Cost Noise% Time Pass A baseline xx.x% xxx.x xx.x% xxxms PASS B compiler xx.x% xxx.x xx.x% xxxms PASS C learned xx.x% xxx.x xx.x% xxxms PASS WASM REFERENCE METRICS Native total time: xxxms WASM expected: ~xxxms (2-5x native) ``` This establishes the expected WASM overhead: **2-5x slower than native** for the self-learning solver workload. #### SIMD Benchmarks The `/crates/prime-radiant/benches/simd_benchmarks.rs` and `/crates/ruvector-sparse-inference/benches/simd_kernels.rs` provide Criterion benchmarks for SIMD operations that can be adapted for WASM SIMD. ### 7.2 Recommended Benchmarking Framework for the Sublinear-Time Solver ``` sublinear-time-solver/benches/ wasm_bench.rs -- In-Rust Criterion benchmarks (native baseline) wasm_bench.mjs -- Node.js WASM performance runner wasm_bench.html -- Browser WASM performance runner bench_harness.rs -- Shared benchmark harness (puzzle generation) ``` #### Metrics to Track | Metric | Description | Measurement | |--------|-------------|-------------| | `solve_throughput` | Puzzles solved per second | `iterations / elapsed_secs` | | `solve_latency_p50` | Median solve time | Percentile of individual solve times | | `solve_latency_p99` | 99th percentile solve time | Percentile of individual solve times | | `memory_peak_bytes` | Peak WASM linear memory usage | `memory.buffer.byteLength` | | `module_load_ms` | Time to instantiate WASM module | `performance.now()` around `WebAssembly.instantiate` | | `simd_speedup` | SIMD vs scalar performance ratio | Compare SIMD build vs non-SIMD build | | `wasm_native_ratio` | WASM-to-native performance overhead | Compare WASM throughput vs native Criterion results | | `binary_size_bytes` | Compiled .wasm file size | `wc -c *.wasm` | | `accuracy_parity` | Solver accuracy matches native | Bit-exact or epsilon comparison of results | #### Benchmark Protocol 1. **Native baseline**: Run the solver natively with Criterion (3+ iterations, warm-up) 2. **WASM baseline**: Load the same solver as WASM, run identical workload in Node.js 3. **WASM SIMD**: Build with `RUSTFLAGS="-C target-feature=+simd128"`, measure speedup 4. **Browser measurement**: Run in Chrome with `performance.now()`, measure real-world latency 5. **Size budget**: Track .wasm binary size across commits (regression alerts if >200 KB) 6. **Accuracy validation**: Compare solver output JSON between native and WASM (must match to f64 epsilon) --- ## 8. Recommendations for the Sublinear-Time Solver ### 8.1 Binding Strategy: Use no_std + extern "C" (Pattern B) For the sublinear-time solver WASM module, adopt the `rvf-solver-wasm` pattern: - **no_std + alloc**: Minimizes binary size, avoids JS runtime dependency - **dlmalloc global allocator**: Proven in rvf-solver-wasm - **extern "C" exports**: Maximum portability (browser, Node, embedded, bare-metal) - **Handle-based instance registry**: Supports concurrent solver instances - **Result reads via pointer+length**: JSON serialization of results into WASM memory, host reads via typed array view Do not use wasm-bindgen for the core solver. A thin wasm-bindgen wrapper can be created separately if a richer JS API is needed. ### 8.2 SIMD Strategy: Conditional Compilation ```rust // In the solver crate #[cfg(all(target_arch = "wasm32", target_feature = "simd128"))] mod simd_wasm { use core::arch::wasm32::*; pub fn distance_l2_simd(a: &[f32], b: &[f32]) -> f32 { /* SIMD128 */ } } #[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))] mod simd_wasm { pub fn distance_l2_simd(a: &[f32], b: &[f32]) -> f32 { /* scalar fallback */ } } ``` Build two variants: - `solver.wasm` -- scalar fallback (maximum compatibility) - `solver-simd.wasm` -- SIMD128 enabled (Chrome 91+, FF 89+, Safari 16.4+, Node 16.4+) ### 8.3 Memory Strategy: Bump Allocator + Shared Memory Protocol Adopt the `SharedMemoryProtocol` pattern from the kernel pack system: 1. Allocate a fixed arena at solver creation (e.g., 256 pages = 16 MB) 2. Use 16-byte aligned bump allocation for tensor data 3. Reset the allocator between solve invocations (amortized O(1)) 4. Validate memory regions before kernel execution 5. Export `memory` so the host can directly view/write typed arrays without copying ### 8.4 Build Profile ```toml [profile.release] opt-level = "z" lto = true codegen-units = 1 strip = true panic = "abort" ``` Target binary size: <200 KB (consistent with existing rvf-solver-wasm at ~160 KB). ### 8.5 Feature Detection Export ```rust #[no_mangle] pub extern "C" fn solver_capabilities() -> u32 { let mut caps = 0u32; #[cfg(target_feature = "simd128")] { caps |= 0x01; } // SIMD available #[cfg(feature = "thompson-sampling")] { caps |= 0x02; } // Thompson Sampling enabled #[cfg(feature = "witness-chain")] { caps |= 0x04; } // Witness chain enabled caps } ``` ### 8.6 Testing Strategy - Use `wasm-bindgen-test` with `run_in_browser` for browser tests (existing pattern) - Use the Node.js test harness at `/scripts/test/test-wasm.mjs` as a template - Validate accuracy parity with native build via `wasm_solver_bench` - Run SIMD-specific tests with `RUSTFLAGS="-C target-feature=+simd128"` in CI --- ## Appendix A: File Reference ### Core WASM Source Files | File | Purpose | |------|---------| | `/crates/ruvector-wasm/src/lib.rs` | Main VectorDB WASM bindings (wasm-bindgen) | | `/crates/ruvector-wasm/src/kernel/mod.rs` | Kernel pack system entry point | | `/crates/ruvector-wasm/src/kernel/memory.rs` | Shared memory protocol, bump allocator | | `/crates/ruvector-wasm/src/kernel/runtime.rs` | Kernel runtime trait, mock runtime, manager | | `/crates/ruvector-wasm/src/kernel/epoch.rs` | Epoch-based execution budgets | | `/crates/ruvector-wasm/src/kernel/signature.rs` | Ed25519 kernel pack verification | | `/crates/ruvector-wasm/src/kernel/manifest.rs` | Kernel manifest parsing | | `/crates/ruvector-wasm/Cargo.toml` | WASM dependency configuration | ### SIMD Source Files | File | Purpose | |------|---------| | `/crates/ruvector-delta-wasm/src/simd.rs` | WASM SIMD128 f32x4 operations | | `/crates/ruvector-sparse-inference/src/backend/wasm.rs` | WASM SIMD backend with Backend trait | | `/crates/ruvector-mincut/src/wasm/simd.rs` | WASM SIMD128 bitset operations | | `/crates/ruvector-core/src/simd_intrinsics.rs` | Native SIMD (AVX2/AVX-512/NEON) reference | ### Solver WASM Source Files | File | Purpose | |------|---------| | `/crates/rvf/rvf-solver-wasm/src/lib.rs` | Self-learning solver WASM exports (no_std) | | `/crates/rvf/rvf-solver-wasm/src/engine.rs` | Adaptive solver engine | | `/crates/rvf/rvf-solver-wasm/src/policy.rs` | PolicyKernel with Thompson Sampling | | `/crates/rvf/rvf-solver-wasm/Cargo.toml` | no_std WASM build configuration | ### Build and Test Files | File | Purpose | |------|---------| | `/Cargo.toml` | Workspace WASM dependencies and build profiles | | `/scripts/test/test-wasm.mjs` | Node.js WASM test runner | | `/examples/benchmarks/src/bin/wasm_solver_bench.rs` | Native vs WASM benchmark comparison | | `/examples/pwa-loader/app.js` | Browser WASM loading and memory management | ### RVF Self-Bootstrap Files | File | Purpose | |------|---------| | `/crates/rvf/rvf-types/src/wasm_bootstrap.rs` | WasmHeader, WasmRole, WasmTarget, feature flags | ### TypeScript/npm Files | File | Purpose | |------|---------| | `/npm/packages/ruvector-wasm-unified/src/index.ts` | Unified WASM engine TypeScript API | --- ## Appendix B: WASM Binary Size Inventory | Binary | Size | Strategy | |--------|------|----------| | `micro_hnsw.wasm` | 11.8 KB | no_std, bare minimum | | `ruvector_learning_wasm_bg.wasm` | 39 KB | wasm-bindgen | | `ruvector_exotic_wasm_bg.wasm` | 149 KB | wasm-bindgen | | `ruvector_nervous_system_wasm_bg.wasm` | 178 KB | wasm-bindgen | | `ruvector_economy_wasm_bg.wasm` | 181 KB | wasm-bindgen | | `ruvector_attention_unified_wasm_bg.wasm` | 339 KB | wasm-bindgen | | `rvf-solver-wasm` (estimated) | ~160 KB | no_std + dlmalloc | The sublinear-time solver should target the **<200 KB** range using the no_std approach, consistent with `rvf-solver-wasm`.