@ruvector/learning-wasm - Ultra-Fast MicroLoRA for WebAssembly

Ultra-fast Low-Rank Adaptation (LoRA) for WebAssembly with sub-100 microsecond adaptation latency. Designed for real-time per-operator-type learning in query optimization systems, edge AI, and browser-based machine learning applications.

Key Features

  • Rank-2 LoRA Architecture: Minimal parameter count (4d parameters per adapter: a d x 2 down projection plus a 2 x d up projection) for efficient edge deployment
  • Sub-100us Adaptation Latency: Instant weight updates enabling real-time learning
  • Per-Operator Scoping: Separate adapters for 17 different operator types (scan, filter, join, aggregate, etc.)
  • Zero-Allocation Forward Pass: Direct memory access for maximum performance
  • Trajectory Buffer: Track learning history with success rate analytics
  • WASM-Optimized: no_std compatible with minimal allocations
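A quick back-of-envelope on the footprint (a sketch assuming each adapter stores both projection matrices, d x 2 and 2 x d, as shown in the Architecture section below):

```javascript
// Hypothetical back-of-envelope: per-adapter and total parameter counts,
// assuming each adapter stores both A (d x 2) and B (2 x d).
const d = 256;                               // embedding dimension
const rank = 2;
const paramsPerAdapter = 2 * rank * d;       // 2d + 2d = 4d = 1024
const numOperatorTypes = 17;                 // one adapter per operator type
const totalParams = numOperatorTypes * paramsPerAdapter; // 17,408
const bytesF32 = totalParams * 4;            // 69,632 bytes ~= 68 KiB

console.log(paramsPerAdapter, totalParams, bytesF32);
```

Even with all 17 per-operator adapters instantiated at 256 dimensions, the trainable state stays well under 100 KiB.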

Installation

npm install ruvector-learning-wasm
# or
yarn add ruvector-learning-wasm
# or
pnpm add ruvector-learning-wasm

Quick Start

TypeScript/JavaScript

import init, { WasmMicroLoRA, WasmScopedLoRA, WasmTrajectoryBuffer } from 'ruvector-learning-wasm';

// Initialize WASM module
await init();

// Create a MicroLoRA engine (256-dim embeddings)
const lora = new WasmMicroLoRA(256, 0.1, 0.01);

// Forward pass with typed arrays
const input = new Float32Array(256).fill(0.1);
const output = lora.forward_array(input);

// Adapt based on gradient
const gradient = new Float32Array(256);
gradient.fill(0.05);
lora.adapt_array(gradient);

// Or use reward-based adaptation
lora.adapt_with_reward(0.15); // 15% improvement

console.log(`Adaptations: ${lora.adapt_count()}`);
console.log(`Delta norm: ${lora.delta_norm()}`);

Zero-Allocation Forward Pass

For maximum performance, use direct memory access:

// Get buffer pointers
const inputPtr = lora.get_input_ptr();
const outputPtr = lora.get_output_ptr();

// Write directly to WASM linear memory (`wasmInstance` is the initialized
// module returned by `await init()`)
const inputView = new Float32Array(wasmInstance.memory.buffer, inputPtr, 256);
inputView.set(inputData);

// Execute forward pass (zero allocation)
lora.forward();

// Read output directly from WASM memory
const result = new Float32Array(wasmInstance.memory.buffer, outputPtr, 256);
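One caveat when holding views into WASM memory: growing the memory detaches existing Float32Array views, so re-create the view after any call that might allocate. A standalone illustration using a raw WebAssembly.Memory (no package import needed):

```javascript
// Standalone caveat demo (raw WebAssembly.Memory, no package import):
// growing WASM memory replaces its ArrayBuffer and detaches old views,
// so views should be re-created after any call that might allocate.
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB

// View over the first 256 floats
let view = new Float32Array(memory.buffer, 0, 256);
view[0] = 1.5;

// Growing detaches the old view (its byteLength drops to 0)...
memory.grow(1);
console.log(view.byteLength); // 0 -- stale view, do not use

// ...so re-create the view from the current buffer before touching it
view = new Float32Array(memory.buffer, 0, 256);
console.log(view[0]); // 1.5 -- contents survive the growth
```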

Per-Operator Scoped LoRA

import { WasmScopedLoRA } from 'ruvector-learning-wasm';

const scopedLora = new WasmScopedLoRA(256, 0.1, 0.01);

// Operator types: 0=Scan, 1=Filter, 2=Join, 3=Aggregate, 4=Project, 5=Sort, ...
const SCAN_OP = 0;
const JOIN_OP = 2;

// Forward pass for specific operator
const scanOutput = scopedLora.forward_array(SCAN_OP, input);

// Adapt specific operator based on improvement
scopedLora.adapt_with_reward(JOIN_OP, 0.25);

// Get operator name
console.log(WasmScopedLoRA.scope_name(SCAN_OP)); // "Scan"

// Check per-operator statistics
console.log(`Scan adaptations: ${scopedLora.adapt_count(SCAN_OP)}`);
console.log(`Total adaptations: ${scopedLora.total_adapt_count()}`);
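Conceptually, scoping just means one independent adapter per operator type. A plain-JS sketch of that bookkeeping (hypothetical ScopedAdapters class, not the package's internals):

```javascript
// Minimal sketch of per-operator scoping (plain JS, not the package's
// internals): each operator type gets its own independent adapter state,
// so learning for joins never interferes with learning for scans.
class ScopedAdapters {
  constructor(numScopes) {
    // One adapter record per operator type
    this.adapters = Array.from({ length: numScopes }, () => ({
      adaptCount: 0,
      cumulativeReward: 0,
    }));
  }
  adaptWithReward(opType, improvement) {
    const a = this.adapters[opType];
    a.adaptCount += 1;
    a.cumulativeReward += improvement;
  }
  adaptCount(opType) {
    return this.adapters[opType].adaptCount;
  }
  totalAdaptCount() {
    return this.adapters.reduce((sum, a) => sum + a.adaptCount, 0);
  }
}

const scoped = new ScopedAdapters(17);
scoped.adaptWithReward(2, 0.25); // reward the Join adapter
scoped.adaptWithReward(0, 0.10); // reward the Scan adapter
console.log(scoped.adaptCount(2));     // 1
console.log(scoped.totalAdaptCount()); // 2
```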

Trajectory Tracking

import { WasmTrajectoryBuffer } from 'ruvector-learning-wasm';

const buffer = new WasmTrajectoryBuffer(1000, 256);

// Record trajectories
buffer.record(
  embedding,      // Float32Array
  2,              // operator type (JOIN)
  5,              // attention mechanism used
  45.2,           // actual execution time (ms)
  120.5           // baseline execution time (ms)
);

// Analyze learning progress
console.log(`Success rate: ${(buffer.success_rate() * 100).toFixed(1)}%`);
console.log(`Best improvement: ${buffer.best_improvement()}x`);
console.log(`Mean improvement: ${buffer.mean_improvement()}x`);
console.log(`Best attention mechanism: ${buffer.best_attention()}`);

// Count high-quality trajectories
const highQualityCount = buffer.high_quality_count(0.5); // >50% improvement
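The buffer's analytics can be understood as simple aggregates over recorded (exec, baseline) pairs. A plain-JS sketch, assuming the improvement ratio is baseline_ms / exec_ms and "success" means beating the baseline (these definitions are assumptions, not confirmed internals):

```javascript
// Sketch of the analytics the buffer exposes, assuming the improvement
// ratio is baseline_ms / exec_ms and "success" means beating the baseline
// (assumed definitions for illustration, not the package's internals).
const trajectories = [
  { execMs: 45.2, baselineMs: 120.5 },  // faster than baseline
  { execMs: 80.0, baselineMs: 100.0 },  // faster than baseline
  { execMs: 150.0, baselineMs: 100.0 }, // regression
];

const ratios = trajectories.map(t => t.baselineMs / t.execMs);
const successRate = ratios.filter(r => r > 1).length / ratios.length;
const bestImprovement = Math.max(...ratios);
const meanImprovement = ratios.reduce((s, r) => s + r, 0) / ratios.length;
// Count trajectories improving by more than 50% (ratio > 1.5)
const highQualityCount = ratios.filter(r => r > 1.5).length;

console.log(successRate.toFixed(2));     // "0.67"
console.log(bestImprovement.toFixed(2)); // "2.67"
console.log(highQualityCount);           // 1
```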

Architecture

Input Embedding (d-dim)
       |
       v
  +----------+
  | A: d x 2 |  Down projection (d -> 2)
  +----------+
       |
       v
  +----------+
  | B: 2 x d |  Up projection (2 -> d)
  +----------+
       |
       v
Delta W = alpha * (A @ B)
       |
       v
Output = Input + Input @ Delta W
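The diagram translates to a few lines of code. A plain-JS sketch of the rank-2 forward pass, assuming the residual form y = x + alpha * (x @ A) @ B (assumed composition for illustration; the package's exact kernel may differ):

```javascript
// Plain-JS sketch of the rank-2 forward pass shown above, assuming the
// residual form y = x + alpha * (x @ A) @ B with A: d x 2 and B: 2 x d.
// (Assumed composition for illustration; the package's kernel may differ.)
function loraForward(x, A, B, alpha) {
  const d = x.length;
  // Down projection: h = x @ A  (d -> 2)
  const h = [0, 0];
  for (let j = 0; j < 2; j++) {
    for (let i = 0; i < d; i++) h[j] += x[i] * A[i][j];
  }
  // Up projection plus residual: y = x + alpha * (h @ B)  (2 -> d)
  const y = new Array(d);
  for (let i = 0; i < d; i++) {
    y[i] = x[i] + alpha * (h[0] * B[0][i] + h[1] * B[1][i]);
  }
  return y;
}

// Tiny worked example with d = 4 and only the first rank direction active
const x = [1, 2, 3, 4];
const A = x.map(() => [0.1, 0.0]);              // A: 4 x 2
const B = [[0.5, 0.5, 0.5, 0.5], [0, 0, 0, 0]]; // B: 2 x 4
const y = loraForward(x, A, B, 0.1);
// h = [0.1 * (1+2+3+4), 0] = [1, 0], so each y_i ~= x_i + 0.1 * 1 * 0.5
console.log(y);
```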

Performance Benchmarks

| Operation            | Latency | Throughput   |
|----------------------|---------|--------------|
| Forward (256-dim)    | ~15us   | 66K ops/sec  |
| Adapt (gradient)     | ~25us   | 40K ops/sec  |
| Forward (zero-alloc) | ~8us    | 125K ops/sec |
| Scoped forward       | ~20us   | 50K ops/sec  |
| Trajectory record    | ~5us    | 200K ops/sec |

Tested on Chrome 120+ / Node.js 20+ with WASM SIMD support.

API Reference

WasmMicroLoRA

| Method                            | Description                              |
|-----------------------------------|------------------------------------------|
| new(dim?, alpha?, learning_rate?) | Create engine (defaults: 256, 0.1, 0.01) |
| forward_array(input)              | Forward pass with a Float32Array         |
| forward()                         | Zero-allocation forward using buffers    |
| adapt_array(gradient)             | Adapt with a gradient vector             |
| adapt_with_reward(improvement)    | Reward-based adaptation                  |
| delta_norm()                      | Get weight change magnitude              |
| adapt_count()                     | Number of adaptations                    |
| reset()                           | Reset to initial state                   |

WasmScopedLoRA

| Method                                  | Description                           |
|-----------------------------------------|---------------------------------------|
| new(dim?, alpha?, learning_rate?)       | Create scoped manager                 |
| forward_array(op_type, input)           | Forward pass for a specific operator  |
| adapt_with_reward(op_type, improvement) | Operator-specific adaptation          |
| scope_name(op_type)                     | Get operator name (static)            |
| total_adapt_count()                     | Total adaptations across all operators|
| set_category_fallback(enabled)          | Enable category fallback              |

WasmTrajectoryBuffer

| Method                                                        | Description                        |
|---------------------------------------------------------------|------------------------------------|
| new(capacity?, embedding_dim?)                                | Create buffer                      |
| record(embedding, op_type, attention_type, exec_ms, baseline_ms) | Record a trajectory             |
| success_rate()                                                | Get success rate (0.0-1.0)         |
| best_improvement()                                            | Get best improvement ratio         |
| mean_improvement()                                            | Get mean improvement ratio         |
| high_quality_count(threshold)                                 | Count trajectories above threshold |

Use Cases

  • Query Optimization: Learn optimal attention mechanisms per SQL operator
  • Edge AI Personalization: Real-time model adaptation on user devices
  • Browser ML: In-browser fine-tuning without server round-trips
  • Federated Learning: Lightweight local adaptation for aggregation
  • Reinforcement Learning: Fast policy adaptation from rewards

Bundle Size

  • WASM binary: ~39KB (uncompressed)
  • Gzip compressed: ~15KB
  • JavaScript glue: ~5KB

License

MIT OR Apache-2.0


Keywords: LoRA, Low-Rank Adaptation, machine learning, WASM, WebAssembly, neural network, edge AI, adaptation, fine-tuning, query optimization, real-time learning, micro LoRA, rank-2, browser ML