06 - WebAssembly Integration Analysis
Agent: 6 (WASM Integration Specialist) Date: 2026-02-20 Scope: ruvector codebase WASM capabilities, build pipeline, SIMD acceleration, memory management, deployment strategies, module loading, and benchmarking framework
Table of Contents
- Existing WASM Usage in ruvector
- WASM Build Pipeline Compatibility
- SIMD Acceleration Opportunities
- Memory Management Patterns
- Browser vs Node.js Deployment Strategies
- WASM Module Loading and Initialization Patterns
- Performance Benchmarking Framework for WASM
- Recommendations for the Sublinear-Time Solver
1. Existing WASM Usage in ruvector
1.1 Scale of WASM Infrastructure
The ruvector project has a large, mature WASM infrastructure. The Cargo workspace defines 27 dedicated WASM crates, spanning vector database operations, attention mechanisms, graph algorithms, ML inference, and self-learning solvers. This is not an experimental feature -- it is a first-class deployment target.
WASM Crate Inventory (27 crates)
| Crate | Description | Target | Size |
|---|---|---|---|
| ruvector-wasm | Core vector DB bindings (HNSW, insert, search, delete) | wasm32-unknown-unknown (wasm-bindgen) | ~28 KB src |
| rvf-solver-wasm | Self-learning temporal solver (Thompson Sampling, PolicyKernel) | wasm32-unknown-unknown (no_std + alloc, extern "C") | ~160 KB compiled |
| rvf-wasm | RVF format microkernel for browser/edge vector ops | wasm32-unknown-unknown | - |
| micro-hnsw-wasm | Neuromorphic HNSW with spiking neural nets | wasm32-unknown-unknown | 11.8 KB compiled |
| ruvector-attention-wasm | 18+ attention mechanisms (Flash, MoE, Hyperbolic) | wasm32-unknown-unknown (wasm-bindgen) | - |
| ruvector-attention-unified-wasm | Unified attention API | wasm32-unknown-unknown | 339 KB compiled |
| ruvector-learning-wasm | MicroLoRA adaptation (<100 us latency) | wasm32-unknown-unknown | 39 KB compiled |
| ruvector-nervous-system-wasm | Bio-inspired neural simulation | wasm32-unknown-unknown | 178 KB compiled |
| ruvector-economy-wasm | Compute credit management | wasm32-unknown-unknown | 181 KB compiled |
| ruvector-exotic-wasm | Quantum, hyperbolic, topological operations | wasm32-unknown-unknown | 149 KB compiled |
| ruvector-sparse-inference-wasm | Sparse matrix inference with WASM SIMD | wasm32-unknown-unknown | - |
| ruvector-delta-wasm | Delta operations with SIMD | wasm32-unknown-unknown | - |
| ruvector-mincut-wasm | Subpolynomial-time dynamic min-cut | wasm32-unknown-unknown | - |
| ruvector-mincut-gated-transformer-wasm | Gated transformer min-cut | wasm32-unknown-unknown | - |
| ruvector-graph-wasm | Graph operations | wasm32-unknown-unknown | - |
| ruvector-gnn-wasm | Graph neural networks | wasm32-unknown-unknown | - |
| ruvector-dag-wasm | Minimal DAG for browser/embedded | wasm32-unknown-unknown | - |
| ruvector-math-wasm | Math operations (Wasserstein, manifolds, spherical) | wasm32-unknown-unknown | - |
| ruvector-router-wasm | Query routing | wasm32-unknown-unknown | - |
| ruvector-fpga-transformer-wasm | FPGA transformer simulation | wasm32-unknown-unknown | - |
| ruvector-temporal-tensor-wasm | Temporal tensor operations | wasm32-unknown-unknown | - |
| ruvector-tiny-dancer-wasm | Lightweight operations | wasm32-unknown-unknown | - |
| ruvector-hyperbolic-hnsw-wasm | Hyperbolic HNSW | wasm32-unknown-unknown | - |
| ruvector-domain-expansion-wasm | Cross-domain transfer learning | wasm32-unknown-unknown | - |
| ruvllm-wasm | LLM inference | wasm32-unknown-unknown | - |
| ruqu-wasm | Quantum operations | wasm32-unknown-unknown | - |
| exo-wasm (example) | Exo AI experiment | wasm32-unknown-unknown | - |
1.2 Two Distinct WASM Binding Strategies
The codebase employs two fundamentally different WASM integration patterns:
Pattern A: wasm-bindgen + wasm-pack (High-Level, Browser-First)
Used by: ruvector-wasm, ruvector-attention-wasm, ruvector-math-wasm, most -wasm crates.
// crates/ruvector-wasm/src/lib.rs
use wasm_bindgen::prelude::*;
use js_sys::{Float32Array, Object, Promise};
use web_sys::{console, IdbDatabase, IdbFactory};
#[wasm_bindgen(start)]
pub fn init() {
console_error_panic_hook::set_once();
tracing_wasm::set_as_global_default();
}
#[wasm_bindgen]
pub struct VectorDB { /* ... */ }
#[wasm_bindgen]
impl VectorDB {
#[wasm_bindgen(constructor)]
pub fn new(dimensions: usize, metric: Option<String>, use_hnsw: Option<bool>)
-> Result<VectorDB, JsValue> { /* ... */ }
}
Key dependencies: wasm-bindgen, wasm-bindgen-futures, js-sys, web-sys, serde-wasm-bindgen, console_error_panic_hook.
Advantages: Rich JS interop, automatic TypeScript type generation, Promise support, access to Web APIs (IndexedDB, Workers, console).
Pattern B: no_std + extern "C" ABI (Low-Level, Minimal)
Used by: rvf-solver-wasm, rvf-wasm, micro-hnsw-wasm.
// crates/rvf/rvf-solver-wasm/src/lib.rs
#![no_std]
extern crate alloc;
#[no_mangle]
pub extern "C" fn rvf_solver_create() -> i32 {
registry().create()
}
#[no_mangle]
pub extern "C" fn rvf_solver_train(handle: i32, count: i32, /* ... */) -> i32 { /* ... */ }
Key dependencies: dlmalloc (global allocator), libm, serde (no_std + alloc). No wasm-bindgen.
Advantages: Minimal binary size (~160 KB for rvf-solver-wasm, 11.8 KB for micro-hnsw-wasm), no JS runtime dependency, runs on bare wasm32-unknown-unknown, suitable for self-bootstrapping RVF files.
1.3 Kernel Pack System (ADR-005)
The ruvector-wasm crate includes a sophisticated Kernel Pack System (/crates/ruvector-wasm/src/kernel/) for secure, sandboxed execution of ML compute kernels via Wasmtime:
- Manifest parsing (manifest.rs): Declares kernel categories (Positional/RoPE, Normalization/RMSNorm, Activation/SwiGLU, KV-Cache, Adapter/LoRA), tensor specs, and resource limits
- Ed25519 signature verification (signature.rs): Supply chain security for kernel packs
- SHA256 hash verification (hash.rs): Content integrity
- Epoch-based execution budgets (epoch.rs): Coarse-grained interruption with configurable tick intervals (10 ms server, 1 ms embedded)
- Shared memory protocol (memory.rs): 16-byte aligned allocation, region overlap validation, tensor layout management
- Kernel runtime (runtime.rs): KernelRuntime trait with compile/instantiate/execute lifecycle, plus a mock runtime for testing
- Trusted allowlist (allowlist.rs): Restricts which kernel IDs may execute
This kernel pack system is directly relevant to the sublinear-time solver because it provides a ready-made infrastructure for sandboxed execution of solver kernels with resource limits.
1.4 Self-Bootstrapping WASM (RVF Format)
The rvf-types crate defines a WasmHeader (/crates/rvf/rvf-types/src/wasm_bootstrap.rs) for embedding WASM modules directly inside .rvf data files:
.rvf file
+-- WASM_SEG (role=Interpreter, ~50 KB)
+-- WASM_SEG (role=Microkernel, ~5.5 KB)
+-- VEC_SEG (data)
Roles: Microkernel, Interpreter, Combined, Extension, ControlPlane.
Targets: Wasm32, WasiP1, WasiP2, Browser, BareTile.
Feature flags: WASM_FEAT_SIMD, WASM_FEAT_BULK_MEMORY, WASM_FEAT_MULTI_VALUE, WASM_FEAT_REFERENCE_TYPES, WASM_FEAT_THREADS, WASM_FEAT_TAIL_CALL, WASM_FEAT_GC, WASM_FEAT_EXCEPTION_HANDLING.
1.5 Unified WASM TypeScript API
The @ruvector/wasm-unified npm package (/npm/packages/ruvector-wasm-unified/src/index.ts) provides a high-level TypeScript surface combining all WASM modules:
export interface UnifiedEngine {
attention: AttentionEngine; // 14+ mechanisms
learning: LearningEngine; // MicroLoRA, SONA, BTSP, RL
nervous: NervousEngine; // Bio-inspired neural simulation
economy: EconomyEngine; // Compute credits
exotic: ExoticEngine; // Quantum, hyperbolic, topological
version(): string;
getStats(): UnifiedStats;
init(): Promise<void>;
dispose(): void;
}
2. WASM Build Pipeline Compatibility
2.1 Workspace-Level Configuration
The root Cargo.toml defines workspace-level WASM dependencies:
# /Cargo.toml (workspace)
[workspace.dependencies]
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
web-sys = { version = "0.3", features = ["Worker", "MessagePort", "console"] }
getrandom = { version = "0.3", features = ["wasm_js"] }
There is also a getrandom compatibility patch for WASM:
# In ruvector-wasm/Cargo.toml
getrandom02 = { package = "getrandom", version = "0.2", features = ["js"] }
[target.'cfg(target_arch = "wasm32")'.dependencies]
getrandom = { workspace = true, features = ["wasm_js"] }
And a workspace-level patch for hnsw_rs to use rand 0.8 for WASM compatibility:
[patch.crates-io]
hnsw_rs = { path = "./patches/hnsw_rs" }
2.2 Build Profiles
Two distinct WASM build profiles exist:
Profile 1: Size-Optimized (for wasm-bindgen crates)
# crates/ruvector-wasm/Cargo.toml
[profile.release]
opt-level = "z" # Optimize for size
lto = true # Link-time optimization
codegen-units = 1 # Single codegen unit
panic = "abort" # No unwind tables
[profile.release.package."*"]
opt-level = "z"
[package.metadata.wasm-pack.profile.release]
wasm-opt = false # Disable wasm-opt (already optimized by LTO)
Profile 2: Size-Optimized + Strip (for no_std crates)
# crates/rvf/rvf-solver-wasm/Cargo.toml
[profile.release]
opt-level = "z"
lto = true
codegen-units = 1
strip = true # Also strips debug symbols
Profile 3: Workspace Default Release (native)
# Root Cargo.toml
[profile.release]
opt-level = 3 # Optimize for speed
lto = "fat"
codegen-units = 1
strip = true
panic = "unwind" # Keeps unwind tables (unlike WASM profile)
2.3 Build Tooling
The test script at /scripts/test/test-wasm.mjs demonstrates the build command:
wasm-pack build crates/ruvector-attention-wasm --target web --release
For no_std crates like rvf-solver-wasm, the standard cargo command with WASM target is used:
cargo build --target wasm32-unknown-unknown --release -p rvf-solver-wasm
2.4 Sublinear-Time Solver Build Compatibility
The rvf-solver-wasm crate provides the closest precedent for a sublinear-time solver WASM build:
- Target: wasm32-unknown-unknown (no WASI dependency)
- Allocator: dlmalloc (global allocator for alloc)
- Math: libm (no_std-compatible math functions)
- Serialization: serde + serde_json (no_std + alloc features)
- Crypto: rvf-crypto (SHAKE-256 witness chain)
- Panic handler: core::arch::wasm32::unreachable()
- ABI: extern "C" exports (no wasm-bindgen overhead)
- Crate type: cdylib only (no rlib)
This approach produces binaries in the ~160 KB range, which is excellent for edge deployment.
3. SIMD Acceleration Opportunities
3.1 Existing WASM SIMD Infrastructure
The codebase has extensive WASM SIMD128 support across multiple crates, all using core::arch::wasm32::* intrinsics. Every SIMD function provides dual implementations: a #[cfg(target_feature = "simd128")] version using WASM SIMD intrinsics and a #[cfg(not(target_feature = "simd128"))] scalar fallback.
WASM SIMD Operations Already Implemented
| Crate | File | Operations |
|---|---|---|
| ruvector-delta-wasm | src/simd.rs | f32x4 add, sub, scale, dot, L2 norm, diff, abs, clamp, count_nonzero |
| ruvector-sparse-inference | src/backend/wasm.rs | f32x4 dot product, ReLU, vector add, AXPY |
| ruvector-mincut | src/wasm/simd.rs | v128 popcount (table lookup method), XOR, boundary computation, batch membership |
| ruvector-core | src/simd_intrinsics.rs | x86_64 (AVX2, AVX-512, FMA), aarch64 (NEON, unrolled), INT8 quantized, batch operations |
SIMD Operations in ruvector-delta-wasm/src/simd.rs (Representative)
use core::arch::wasm32::*;
#[cfg(target_feature = "simd128")]
pub fn simd_dot(a: &[f32], b: &[f32]) -> f32 {
let chunks = a.len() / 4;
let mut sum_vec = f32x4_splat(0.0);
for i in 0..chunks {
let offset = i * 4;
unsafe {
let a_vec = v128_load(a.as_ptr().add(offset) as *const v128);
let b_vec = v128_load(b.as_ptr().add(offset) as *const v128);
let prod = f32x4_mul(a_vec, b_vec);
sum_vec = f32x4_add(sum_vec, prod);
}
}
// Horizontal sum + remainder handling
let sum_array: [f32; 4] = unsafe { core::mem::transmute(sum_vec) };
let mut sum = sum_array[0] + sum_array[1] + sum_array[2] + sum_array[3];
for i in (chunks * 4)..a.len() { sum += a[i] * b[i]; }
sum
}
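As noted in Section 3.1, every SIMD function ships with a scalar twin behind #[cfg(not(target_feature = "simd128"))], which the excerpt above omits. A sketch of such a fallback (illustrative, not the crate's actual code) that mirrors the SIMD path's 4-lane accumulation order, so SIMD and scalar builds produce identical floating-point results:

```rust
// Scalar fallback sketch. Accumulating into four lanes and summing them
// in the same order as the SIMD horizontal sum keeps results
// bit-identical across both builds.
pub fn scalar_dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 4;
    let mut acc = [0.0f32; 4];
    for i in 0..chunks {
        let o = i * 4;
        for lane in 0..4 {
            acc[lane] += a[o + lane] * b[o + lane];
        }
    }
    // Same summation order as the SIMD horizontal sum above.
    let mut sum = acc[0] + acc[1] + acc[2] + acc[3];
    // Remainder elements that do not fill a full 4-lane chunk.
    for i in (chunks * 4)..a.len() {
        sum += a[i] * b[i];
    }
    sum
}
```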
SIMD Operations in ruvector-sparse-inference/src/backend/wasm.rs (Backend Trait)
pub struct WasmBackend;
impl Backend for WasmBackend {
fn dot_product(&self, a: &[f32], b: &[f32]) -> f32 { /* SIMD dispatch */ }
fn sparse_matmul(&self, matrix: &Array2<f32>, input: &[f32], rows: &[usize]) -> Vec<f32>;
fn sparse_matmul_accumulate(&self, matrix: &Array2<f32>, input: &[f32], cols: &[usize], output: &mut [f32]);
fn activation(&self, data: &mut [f32], activation_type: ActivationType); // ReLU via SIMD
fn add(&self, a: &mut [f32], b: &[f32]);
fn axpy(&self, a: &mut [f32], b: &[f32], scalar: f32);
fn name(&self) -> &'static str { "WASM-SIMD" }
fn simd_width(&self) -> usize { 4 } // 128-bit = 4 x f32
}
3.2 SIMD Acceleration Opportunities for the Sublinear-Time Solver
Based on the sublinear-time solver's core operations, the following SIMD acceleration points are identified:
| Operation | SIMD Strategy | Expected Speedup | Existing Pattern |
|---|---|---|---|
| Distance computation (dot, cosine, euclidean) | f32x4_mul + f32x4_add accumulation | 2-4x | ruvector-delta-wasm/src/simd.rs |
| Vector normalization | f32x4_mul (scale) + f32x4_add (L2 norm) | 2-4x | simd_l2_norm_squared, simd_scale |
| Bitset operations (partition tracking) | v128_xor, v128_and, popcount via lookup | 4-8x | ruvector-mincut/src/wasm/simd.rs |
| Sparse matrix-vector multiply | SIMD dot + sparse row selection | 2-4x | WasmBackend::sparse_matmul |
| Activation functions (ReLU, GELU) | f32x4_max with zero splat | 2-4x | relu_wasm_simd |
| Thompson Sampling bandit updates | Scalar (branching-heavy) | 1x (no benefit) | N/A |
| Sort/selection (top-k) | Scalar (comparison-heavy) | 1x (no benefit) | N/A |
3.3 SIMD Feature Detection
The ruvector-wasm crate exposes SIMD detection to JS:
#[wasm_bindgen(js_name = detectSIMD)]
pub fn detect_simd() -> bool {
#[cfg(target_feature = "simd128")]
{ true }
#[cfg(not(target_feature = "simd128"))]
{ false }
}
For the sublinear-time solver, SIMD should be compiled in via RUSTFLAGS="-C target-feature=+simd128" at build time, with scalar fallbacks for environments that do not support it.
3.4 Native SIMD Comparison
The native codebase (ruvector-core/src/simd_intrinsics.rs) supports:
- x86_64: AVX2 (256-bit, 8 x f32), AVX-512 (512-bit, 16 x f32), FMA, INT8 quantized
- aarch64: NEON (128-bit, 4 x f32), 4x loop unrolling, FMA via
vfmaq_f32 - WASM: SIMD128 (128-bit, 4 x f32)
WASM SIMD128 provides the same width as NEON (4 x f32) but lacks FMA (f32x4_fma is not available in stable WASM SIMD). This means the sublinear-time solver WASM build will be approximately 2-3x slower than a native NEON build for distance computations, and 4-8x slower than an AVX-512 build. However, it will still be significantly faster than scalar fallback.
4. Memory Management Patterns
4.1 Shared Memory Protocol (Kernel Pack System)
The kernel pack system at /crates/ruvector-wasm/src/kernel/memory.rs defines a mature shared memory protocol:
pub struct SharedMemoryProtocol {
total_size: usize, // Total memory in bytes
current_offset: usize, // Bump allocator position
alignment: usize, // Typically 16 bytes
}
impl SharedMemoryProtocol {
pub fn default_settings() -> Self {
Self::new(256, 16) // 256 pages = 16 MB, 16-byte alignment
}
pub fn allocate(&mut self, size: usize) -> Result<usize, KernelError> {
let aligned_offset = self.align_offset(self.current_offset);
// ...bounds check...
self.current_offset = aligned_offset + size;
Ok(aligned_offset)
}
}
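The elided bounds check above is the whole trick; a self-contained sketch of the align-and-bump logic (illustrative names, assuming a power-of-two alignment as in the 16-byte default):

```rust
// Standalone sketch of the bump-allocation logic (names are
// illustrative, not the actual ruvector API).
struct BumpArena {
    total_size: usize,
    current_offset: usize,
    alignment: usize, // must be a power of two
}

impl BumpArena {
    fn new(total_size: usize, alignment: usize) -> Self {
        assert!(alignment.is_power_of_two());
        Self { total_size, current_offset: 0, alignment }
    }

    /// Returns the aligned offset of the new region, or None if the
    /// arena is exhausted.
    fn allocate(&mut self, size: usize) -> Option<usize> {
        // Round the cursor up to the next aligned offset.
        let aligned = (self.current_offset + self.alignment - 1) & !(self.alignment - 1);
        // Bounds check before committing the bump.
        let end = aligned.checked_add(size)?;
        if end > self.total_size {
            return None;
        }
        self.current_offset = end;
        Some(aligned)
    }

    /// Resetting between invocations makes per-solve allocation
    /// amortized O(1), as recommended in Section 8.3.
    fn reset(&mut self) {
        self.current_offset = 0;
    }
}
```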
The KernelInvocationDescriptor manages tensor memory layout:
pub struct KernelInvocationDescriptor {
pub descriptor: KernelDescriptor, // input_a, input_b, output, scratch, params offsets+sizes
protocol: SharedMemoryProtocol,
}
The MemoryLayoutValidator prevents region overlap and bounds violations.
4.2 Typed Arrays / Zero-Copy Transfer
The wasm-bindgen crates use Float32Array for zero-copy data transfer between JS and WASM:
// Input: JS Float32Array -> Rust Vec<f32>
pub fn insert(&self, vector: Float32Array, ...) -> Result<String, JsValue> {
let vector_data: Vec<f32> = vector.to_vec(); // Copy from JS typed array
// ...
}
// Output: Rust Vec<f32> -> JS Float32Array
pub fn vector(&self) -> Float32Array {
Float32Array::from(&self.inner.vector[..]) // Copy to JS typed array
}
Note: Float32Array::to_vec() and Float32Array::from() perform copies. True zero-copy requires accessing WASM linear memory directly from JS, which is demonstrated in the pwa-loader:
// Zero-copy write into WASM memory
function wasmWrite(data) {
const ptr = wasmInstance.exports.rvf_alloc(data.length);
const mem = new Uint8Array(wasmMemory.buffer, ptr, data.length);
mem.set(data); // Direct memory write
return ptr;
}
// Read from WASM memory (`.slice()` copies the bytes out, so the result
// stays valid even if WASM memory later grows and detaches the buffer)
function wasmRead(ptr, len) {
return new Uint8Array(wasmMemory.buffer, ptr, len).slice();
}
4.3 Memory Patterns in rvf-solver-wasm (no_std)
The no_std solver uses dlmalloc as global allocator and manages its own instance registry:
// Global mutable registry - safe in single-threaded WASM
static mut REGISTRY: Registry = Registry::new();
const MAX_INSTANCES: usize = 8;
struct SolverInstance {
solver: AdaptiveSolver,
last_result_json: Vec<u8>, // Heap-allocated via dlmalloc
policy_json: Vec<u8>,
witness_chain: Vec<u8>,
}
Memory export for external reads uses raw pointer copies:
#[no_mangle]
pub extern "C" fn rvf_solver_result_read(handle: i32, out_ptr: i32) -> i32 {
let data = &inst.last_result_json;
unsafe {
core::ptr::copy_nonoverlapping(data.as_ptr(), out_ptr as *mut u8, data.len());
}
data.len() as i32
}
4.4 Memory Limits
| Configuration | Max Pages | Memory Limit | Context |
|---|---|---|---|
| Server runtime | 1024 | 64 MB | RuntimeConfig::server() |
| Embedded runtime | 64 | 4 MB | RuntimeConfig::embedded() |
| Default shared memory | 256 | 16 MB | SharedMemoryProtocol::default_settings() |
| Microkernel (RVF) | 2-4 | 128-256 KB | WasmHeader min/max pages |
| WASM page size | 1 | 64 KB | WASM_PAGE_SIZE = 65536 |
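All of the limits above are multiples of the fixed 64 KiB page size; the conversion is simple enough to encode directly:

```rust
// WASM linear memory grows in fixed 64 KiB pages; the table's byte
// limits are just page counts multiplied out.
const WASM_PAGE_SIZE: usize = 65536;

fn pages_to_bytes(pages: usize) -> usize {
    pages * WASM_PAGE_SIZE
}
```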
4.5 Security Boundary Validation
The ruvector-wasm crate enforces input validation at the WASM boundary:
const MAX_VECTOR_DIMENSIONS: usize = 65536;
#[wasm_bindgen(constructor)]
pub fn new(vector: Float32Array, ...) -> Result<JsVectorEntry, JsValue> {
let vec_len = vector.length() as usize;
if vec_len == 0 {
return Err(JsValue::from_str("Vector cannot be empty"));
}
if vec_len > MAX_VECTOR_DIMENSIONS {
return Err(JsValue::from_str(&format!(
"Vector dimensions {} exceed maximum allowed {}", vec_len, MAX_VECTOR_DIMENSIONS
)));
}
// ...
}
5. Browser vs Node.js Deployment Strategies
5.1 Browser Deployment (Primary)
The ruvector-wasm crate is browser-first, using:
- IndexedDB persistence: web-sys features include IdbDatabase, IdbFactory, IdbObjectStore, IdbRequest, IdbTransaction, IdbOpenDbRequest (/crates/ruvector-wasm/Cargo.toml)
- Web Workers: Embedded JavaScript worker pool (/crates/ruvector-wasm/src/worker-pool.js, /crates/ruvector-wasm/src/worker.js) for parallel operations
- Tracing via console: tracing-wasm sends logs to browser dev tools
- Promise-based async: wasm-bindgen-futures for async operations
- getrandom via JS: getrandom with the wasm_js feature uses crypto.getRandomValues()
- PWA support: The pwa-loader example (/examples/pwa-loader/app.js) demonstrates offline-capable WASM loading
Browser Loading Pattern
// From examples/pwa-loader/app.js
async function loadWasm() {
const response = await fetch(WASM_PATH);
const bytes = await response.arrayBuffer();
const importObject = { env: {} };
const result = await WebAssembly.instantiate(bytes, importObject);
wasmInstance = result.instance;
wasmMemory = wasmInstance.exports.memory;
}
Browser SIMD Support
WASM SIMD128 is supported in Chrome 91+, Firefox 89+, Safari 16.4+, and Edge 91+. This covers >95% of active browsers as of 2026. Feature detection can be done via:
const simdSupported = WebAssembly.validate(
new Uint8Array([0,97,115,109,1,0,0,0,1,5,1,96,0,1,123,3,2,1,0,10,10,1,8,0,65,0,253,15,253,98,11])
);
5.2 Node.js Deployment
The project supports Node.js via:
- wasm-pack --target nodejs: Generates CommonJS bindings
- Direct instantiation from test scripts (/scripts/test/test-wasm.mjs):
import { readFileSync } from 'fs';
const wasmBuffer = readFileSync(wasmPath);
const mathWasm = await import(join(pkgPath, 'ruvector_math_wasm.js'));
await mathWasm.default(wasmBuffer);
- Edge-net example: /examples/edge-net/pkg/node/ provides Node-specific WASM packages
Node.js has had WASM SIMD support since v16.4 (V8 9.1+). For the sublinear-time solver, Node.js deployment enables server-side and CLI usage with the same WASM binary.
5.3 Edge / Embedded Deployment
The micro-hnsw-wasm crate (11.8 KB) and rvf-solver-wasm (~160 KB) demonstrate ultra-compact deployment:
- iOS/Swift: /examples/wasm/ios/ includes Swift resources with embedded WASM
- Self-bootstrapping: The WASM_SEG system embeds WASM interpreters inside data files
- Target platforms: WasmTarget::Wasm32, WasiP1, WasiP2, Browser, BareTile
5.4 Deployment Target Matrix
| Target | WASM Format | Binding | SIMD | Size Budget | Persistence |
|---|---|---|---|---|---|
| Browser (Chrome/FF/Safari) | wasm-bindgen | JS glue + TS types | SIMD128 | <500 KB | IndexedDB |
| Node.js (>= 16.4) | wasm-bindgen (nodejs) or raw | CommonJS/ESM | SIMD128 | <1 MB | fs |
| Cloudflare Workers | wasm-bindgen (web) | ESM | SIMD128 | <1 MB | KV |
| iOS/Swift | raw wasm32 | C FFI | Optional | <200 KB | CoreData |
| Bare-metal / RVF | no_std cdylib | extern "C" | Optional | <200 KB | None |
6. WASM Module Loading and Initialization Patterns
6.1 Pattern 1: wasm-bindgen Auto-Init
Used by most WASM crates. The #[wasm_bindgen(start)] attribute runs initialization automatically:
#[wasm_bindgen(start)]
pub fn init() {
console_error_panic_hook::set_once();
tracing_wasm::set_as_global_default();
}
JS side (generated by wasm-pack):
import init, { VectorDB } from './ruvector_wasm.js';
await init(); // Loads + instantiates + runs start function
const db = new VectorDB(384, 'cosine', true);
6.2 Pattern 2: Manual WebAssembly.instantiate
Used by the pwa-loader and no_std modules:
const response = await fetch(WASM_PATH);
const bytes = await response.arrayBuffer();
const importObject = { env: {} };
const result = await WebAssembly.instantiate(bytes, importObject);
wasmInstance = result.instance;
wasmMemory = wasmInstance.exports.memory;
This pattern offers maximum control: the host can inspect exports before calling any function, handle errors granularly, and manage memory directly.
6.3 Pattern 3: Streaming Instantiation
For large modules, WebAssembly.instantiateStreaming should be used (not currently in the codebase but recommended):
const result = await WebAssembly.instantiateStreaming(
fetch(WASM_PATH),
importObject
);
This starts compiling while bytes are still downloading, reducing load time by up to 50%.
6.4 Pattern 4: Unified Engine Lazy Init
The @ruvector/wasm-unified uses lazy initialization:
let defaultEngine: UnifiedEngine | null = null;
export async function getDefaultEngine(): Promise<UnifiedEngine> {
if (!defaultEngine) {
defaultEngine = await createUnifiedEngine();
await defaultEngine.init();
}
return defaultEngine;
}
6.5 Pattern 5: Instance Registry (rvf-solver-wasm)
The solver WASM uses a handle-based instance registry:
static mut REGISTRY: Registry = Registry::new(); // Max 8 concurrent solvers
// JS creates solver:
let handle = wasmInstance.exports.rvf_solver_create();
// JS uses solver:
wasmInstance.exports.rvf_solver_train(handle, 100, 1, 10, seedLo, seedHi);
// JS reads result:
let len = wasmInstance.exports.rvf_solver_result_len(handle);
let ptr = wasmInstance.exports.rvf_solver_alloc(len);
wasmInstance.exports.rvf_solver_result_read(handle, ptr);
let json = new TextDecoder().decode(new Uint8Array(wasmMemory.buffer, ptr, len));
// JS destroys:
wasmInstance.exports.rvf_solver_destroy(handle);
This is the recommended pattern for the sublinear-time solver because it:
- Supports multiple concurrent solver instances
- Avoids global state issues
- Enables resource cleanup
- Works across all deployment targets (browser, Node, bare-metal)
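A safe-Rust sketch of such a registry (the real rvf-solver-wasm version lives behind static mut, which is sound in single-threaded WASM; Solver here is a placeholder type, not the project's actual struct):

```rust
use std::array;

const MAX_INSTANCES: usize = 8;

#[derive(Default)]
struct Solver; // stand-in for the real solver state

struct Registry {
    slots: [Option<Solver>; MAX_INSTANCES],
}

impl Registry {
    fn new() -> Self {
        Self { slots: array::from_fn(|_| None) }
    }

    /// Returns a handle in 0..MAX_INSTANCES, or -1 when all slots are in use.
    fn create(&mut self) -> i32 {
        for (i, slot) in self.slots.iter_mut().enumerate() {
            if slot.is_none() {
                *slot = Some(Solver::default());
                return i as i32;
            }
        }
        -1
    }

    /// Frees the slot; returns false for an unknown or already-freed handle.
    fn destroy(&mut self, handle: i32) -> bool {
        // Negative handles wrap on the cast and fail the bounds check.
        let idx = handle as usize;
        if idx < MAX_INSTANCES && self.slots[idx].is_some() {
            self.slots[idx] = None;
            true
        } else {
            false
        }
    }
}
```

Reusing freed slots keeps handles small and bounded, which matters when the host addresses the registry through a flat extern "C" ABI.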
7. Performance Benchmarking Framework for WASM
7.1 Existing Benchmark Infrastructure
In-WASM Benchmark Function
The ruvector-wasm crate includes a built-in benchmark export:
#[wasm_bindgen(js_name = benchmark)]
pub fn benchmark(name: &str, iterations: usize, dimensions: usize) -> Result<f64, JsValue> {
let start = Instant::now();
for i in 0..iterations {
let vector: Vec<f32> = (0..dimensions)
.map(|_| js_sys::Math::random() as f32)
.collect();
let vector_arr = Float32Array::from(&vector[..]);
db.insert(vector_arr, Some(format!("vec_{}", i)), None)?;
}
let duration = start.elapsed();
Ok(iterations as f64 / duration.as_secs_f64())
}
WASM Solver Benchmark Binary
The /examples/benchmarks/src/bin/wasm_solver_bench.rs provides a native vs WASM comparison framework:
WASM vs Native AGI Solver Benchmark
Config: holdout=50, training=50, cycles=3, budget=200
NATIVE SOLVER RESULTS
Mode Acc% Cost Noise% Time Pass
A baseline xx.x% xxx.x xx.x% xxxms PASS
B compiler xx.x% xxx.x xx.x% xxxms PASS
C learned xx.x% xxx.x xx.x% xxxms PASS
WASM REFERENCE METRICS
Native total time: xxxms
WASM expected: ~xxxms (2-5x native)
This establishes the expected WASM overhead: 2-5x slower than native for the self-learning solver workload.
SIMD Benchmarks
The /crates/prime-radiant/benches/simd_benchmarks.rs and /crates/ruvector-sparse-inference/benches/simd_kernels.rs provide Criterion benchmarks for SIMD operations that can be adapted for WASM SIMD.
7.2 Recommended Benchmarking Framework for the Sublinear-Time Solver
sublinear-time-solver/benches/
wasm_bench.rs -- In-Rust Criterion benchmarks (native baseline)
wasm_bench.mjs -- Node.js WASM performance runner
wasm_bench.html -- Browser WASM performance runner
bench_harness.rs -- Shared benchmark harness (puzzle generation)
Metrics to Track
| Metric | Description | Measurement |
|---|---|---|
| solve_throughput | Puzzles solved per second | iterations / elapsed_secs |
| solve_latency_p50 | Median solve time | Percentile of individual solve times |
| solve_latency_p99 | 99th percentile solve time | Percentile of individual solve times |
| memory_peak_bytes | Peak WASM linear memory usage | memory.buffer.byteLength |
| module_load_ms | Time to instantiate WASM module | performance.now() around WebAssembly.instantiate |
| simd_speedup | SIMD vs scalar performance ratio | Compare SIMD build vs non-SIMD build |
| wasm_native_ratio | WASM-to-native performance overhead | Compare WASM throughput vs native Criterion results |
| binary_size_bytes | Compiled .wasm file size | wc -c *.wasm |
| accuracy_parity | Solver accuracy matches native | Bit-exact or epsilon comparison of results |
Benchmark Protocol
- Native baseline: Run the solver natively with Criterion (3+ iterations, warm-up)
- WASM baseline: Load the same solver as WASM, run identical workload in Node.js
- WASM SIMD: Build with RUSTFLAGS="-C target-feature=+simd128", measure speedup
- Browser measurement: Run in Chrome with performance.now(), measure real-world latency
- Size budget: Track .wasm binary size across commits (regression alerts if >200 KB)
- Accuracy validation: Compare solver output JSON between native and WASM (must match to f64 epsilon)
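For the latency percentiles in the metrics table, the nearest-rank method is easy to reproduce identically in the native and WASM harnesses; a sketch:

```rust
/// Nearest-rank percentile over per-solve latencies (e.g. milliseconds):
/// returns the smallest sample such that at least p percent of the data
/// is at or below it. Sorts the slice in place.
fn percentile(samples: &mut [f64], p: f64) -> f64 {
    assert!(!samples.is_empty() && (0.0..=100.0).contains(&p));
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    samples[rank.max(1) - 1]
}
```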
8. Recommendations for the Sublinear-Time Solver
8.1 Binding Strategy: Use no_std + extern "C" (Pattern B)
For the sublinear-time solver WASM module, adopt the rvf-solver-wasm pattern:
- no_std + alloc: Minimizes binary size, avoids JS runtime dependency
- dlmalloc global allocator: Proven in rvf-solver-wasm
- extern "C" exports: Maximum portability (browser, Node, embedded, bare-metal)
- Handle-based instance registry: Supports concurrent solver instances
- Result reads via pointer+length: JSON serialization of results into WASM memory, host reads via typed array view
Do not use wasm-bindgen for the core solver. A thin wasm-bindgen wrapper can be created separately if a richer JS API is needed.
8.2 SIMD Strategy: Conditional Compilation
// In the solver crate
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
mod simd_wasm {
use core::arch::wasm32::*;
pub fn distance_l2_simd(a: &[f32], b: &[f32]) -> f32 { /* SIMD128 */ }
}
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
mod simd_wasm {
pub fn distance_l2_simd(a: &[f32], b: &[f32]) -> f32 { /* scalar fallback */ }
}
Build two variants:
- solver.wasm -- scalar fallback (maximum compatibility)
- solver-simd.wasm -- SIMD128 enabled (Chrome 91+, FF 89+, Safari 16.4+, Node 16.4+)
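The scalar module's body can stay trivially simple; an illustrative fallback implementation (a sketch under the naming of the cfg example above, not the solver's actual code):

```rust
// Scalar L2 distance: squared differences summed, then square-rooted.
pub fn distance_l2(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = x - y;
            d * d
        })
        .sum::<f32>()
        .sqrt()
}
```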
8.3 Memory Strategy: Bump Allocator + Shared Memory Protocol
Adopt the SharedMemoryProtocol pattern from the kernel pack system:
- Allocate a fixed arena at solver creation (e.g., 256 pages = 16 MB)
- Use 16-byte aligned bump allocation for tensor data
- Reset the allocator between solve invocations (amortized O(1))
- Validate memory regions before kernel execution
- Export memory so the host can directly view/write typed arrays without copying
8.4 Build Profile
[profile.release]
opt-level = "z"
lto = true
codegen-units = 1
strip = true
panic = "abort"
Target binary size: <200 KB (consistent with existing rvf-solver-wasm at ~160 KB).
8.5 Feature Detection Export
#[no_mangle]
pub extern "C" fn solver_capabilities() -> u32 {
let mut caps = 0u32;
#[cfg(target_feature = "simd128")]
{ caps |= 0x01; } // SIMD available
#[cfg(feature = "thompson-sampling")]
{ caps |= 0x02; } // Thompson Sampling enabled
#[cfg(feature = "witness-chain")]
{ caps |= 0x04; } // Witness chain enabled
caps
}
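On the host side (or in tests), the bitfield decodes with plain masks; the flag values below are the ones assumed in the export sketch above:

```rust
// Capability flags matching the solver_capabilities() sketch above
// (assumed values, not a published ABI).
const CAP_SIMD: u32 = 0x01;
const CAP_THOMPSON_SAMPLING: u32 = 0x02;
const CAP_WITNESS_CHAIN: u32 = 0x04;

/// Decodes the capability bitfield into human-readable feature names.
fn describe_caps(caps: u32) -> Vec<&'static str> {
    let mut out = Vec::new();
    if caps & CAP_SIMD != 0 { out.push("simd128"); }
    if caps & CAP_THOMPSON_SAMPLING != 0 { out.push("thompson-sampling"); }
    if caps & CAP_WITNESS_CHAIN != 0 { out.push("witness-chain"); }
    out
}
```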
8.6 Testing Strategy
- Use wasm-bindgen-test with run_in_browser for browser tests (existing pattern)
- Use the Node.js test harness at /scripts/test/test-wasm.mjs as a template
- Validate accuracy parity with the native build via wasm_solver_bench
- Run SIMD-specific tests with RUSTFLAGS="-C target-feature=+simd128" in CI
Appendix A: File Reference
Core WASM Source Files
| File | Purpose |
|---|---|
| /crates/ruvector-wasm/src/lib.rs | Main VectorDB WASM bindings (wasm-bindgen) |
| /crates/ruvector-wasm/src/kernel/mod.rs | Kernel pack system entry point |
| /crates/ruvector-wasm/src/kernel/memory.rs | Shared memory protocol, bump allocator |
| /crates/ruvector-wasm/src/kernel/runtime.rs | Kernel runtime trait, mock runtime, manager |
| /crates/ruvector-wasm/src/kernel/epoch.rs | Epoch-based execution budgets |
| /crates/ruvector-wasm/src/kernel/signature.rs | Ed25519 kernel pack verification |
| /crates/ruvector-wasm/src/kernel/manifest.rs | Kernel manifest parsing |
| /crates/ruvector-wasm/Cargo.toml | WASM dependency configuration |
SIMD Source Files
| File | Purpose |
|---|---|
| /crates/ruvector-delta-wasm/src/simd.rs | WASM SIMD128 f32x4 operations |
| /crates/ruvector-sparse-inference/src/backend/wasm.rs | WASM SIMD backend with Backend trait |
| /crates/ruvector-mincut/src/wasm/simd.rs | WASM SIMD128 bitset operations |
| /crates/ruvector-core/src/simd_intrinsics.rs | Native SIMD (AVX2/AVX-512/NEON) reference |
Solver WASM Source Files
| File | Purpose |
|---|---|
| /crates/rvf/rvf-solver-wasm/src/lib.rs | Self-learning solver WASM exports (no_std) |
| /crates/rvf/rvf-solver-wasm/src/engine.rs | Adaptive solver engine |
| /crates/rvf/rvf-solver-wasm/src/policy.rs | PolicyKernel with Thompson Sampling |
| /crates/rvf/rvf-solver-wasm/Cargo.toml | no_std WASM build configuration |
Build and Test Files
| File | Purpose |
|---|---|
| /Cargo.toml | Workspace WASM dependencies and build profiles |
| /scripts/test/test-wasm.mjs | Node.js WASM test runner |
| /examples/benchmarks/src/bin/wasm_solver_bench.rs | Native vs WASM benchmark comparison |
| /examples/pwa-loader/app.js | Browser WASM loading and memory management |
RVF Self-Bootstrap Files
| File | Purpose |
|---|---|
| /crates/rvf/rvf-types/src/wasm_bootstrap.rs | WasmHeader, WasmRole, WasmTarget, feature flags |
TypeScript/npm Files
| File | Purpose |
|---|---|
| /npm/packages/ruvector-wasm-unified/src/index.ts | Unified WASM engine TypeScript API |
Appendix B: WASM Binary Size Inventory
| Binary | Size | Strategy |
|---|---|---|
| micro_hnsw.wasm | 11.8 KB | no_std, bare minimum |
| ruvector_learning_wasm_bg.wasm | 39 KB | wasm-bindgen |
| ruvector_exotic_wasm_bg.wasm | 149 KB | wasm-bindgen |
| ruvector_nervous_system_wasm_bg.wasm | 178 KB | wasm-bindgen |
| ruvector_economy_wasm_bg.wasm | 181 KB | wasm-bindgen |
| ruvector_attention_unified_wasm_bg.wasm | 339 KB | wasm-bindgen |
| rvf-solver-wasm (estimated) | ~160 KB | no_std + dlmalloc |
The sublinear-time solver should target the <200 KB range using the no_std approach, consistent with rvf-solver-wasm.