Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

This commit is contained in:
ruv
2026-02-28 14:39:40 -05:00
7854 changed files with 3522914 additions and 0 deletions


@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 rUv

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


@@ -0,0 +1,220 @@
# ruvector-attention-wasm
WebAssembly bindings for the ruvector-attention package, providing high-performance attention mechanisms for browser and Node.js environments.
## Features
- **Multiple Attention Mechanisms**:
  - Scaled Dot-Product Attention
  - Multi-Head Attention
  - Hyperbolic Attention (for hierarchical data)
  - Linear Attention (Performer-style)
  - Flash Attention (memory-efficient)
  - Local-Global Attention
  - Mixture of Experts (MoE) Attention
- **CGT Sheaf Attention** (coherence-gated via Prime-Radiant)
- **Training Utilities**:
  - InfoNCE contrastive loss
  - Adam optimizer
  - AdamW optimizer (with decoupled weight decay)
  - Learning rate scheduler (warmup + cosine decay)
- **TypeScript Support**: Full type definitions and modern API
## Installation
```bash
npm install ruvector-attention-wasm
```
## Usage
### TypeScript/JavaScript
```typescript
import { initialize, MultiHeadAttention, utils } from 'ruvector-attention-wasm';
// Initialize WASM module
await initialize();
// Create multi-head attention
const attention = new MultiHeadAttention({ dim: 64, numHeads: 8 });
// Prepare inputs
const query = new Float32Array(64);
const keys = [new Float32Array(64), new Float32Array(64)];
const values = [new Float32Array(64), new Float32Array(64)];
// Compute attention
const output = attention.compute(query, keys, values);
// Use utilities
const similarity = utils.cosineSimilarity(query, keys[0]);
```
### Advanced Examples
#### Hyperbolic Attention
```typescript
import { HyperbolicAttention } from 'ruvector-attention-wasm';
const hyperbolic = new HyperbolicAttention({
  dim: 128,
  curvature: 1.0
});
const output = hyperbolic.compute(query, keys, values);
```
#### MoE Attention with Expert Stats
```typescript
import { MoEAttention } from 'ruvector-attention-wasm';
const moe = new MoEAttention({
  dim: 64,
  numExperts: 4,
  topK: 2
});
const output = moe.compute(query, keys, values);
// Get expert utilization
const stats = moe.getExpertStats();
console.log('Load balance:', stats.loadBalance);
```
#### Training with InfoNCE Loss
```typescript
import { InfoNCELoss, Adam } from 'ruvector-attention-wasm';
const loss = new InfoNCELoss(0.07);
const optimizer = new Adam(paramCount, {
  learningRate: 0.001,
  beta1: 0.9,
  beta2: 0.999,
});
// Inside the training loop:
const lossValue = loss.compute(anchor, positive, negatives);
optimizer.step(params, gradients);
```
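For intuition, the quantity `InfoNCELoss.compute` returns can be sketched in plain TypeScript. This is an illustrative reference under the standard InfoNCE definition; the dot-product similarity below is an assumption, not necessarily the metric used by the WASM implementation:

```typescript
// Illustrative InfoNCE loss; not the library's WASM code.
function dot(a: Float32Array, b: Float32Array): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

function infoNCE(
  anchor: Float32Array,
  positive: Float32Array,
  negatives: Float32Array[],
  temperature = 0.07,
): number {
  // Temperature-scaled similarity of the positive pair.
  const pos = Math.exp(dot(anchor, positive) / temperature);
  let denom = pos;
  for (const neg of negatives) {
    denom += Math.exp(dot(anchor, neg) / temperature);
  }
  // Loss is near zero when the positive dominates the denominator.
  return -Math.log(pos / denom);
}
```

A lower temperature sharpens the distribution, penalizing hard negatives more aggressively.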
#### Learning Rate Scheduling
```typescript
import { LRScheduler, AdamW } from 'ruvector-attention-wasm';
const scheduler = new LRScheduler({
  initialLR: 0.001,
  warmupSteps: 1000,
  totalSteps: 10000,
});
const optimizer = new AdamW(paramCount, {
  learningRate: scheduler.getLR(),
  weightDecay: 0.01,
});
// Training loop
for (let step = 0; step < 10000; step++) {
  optimizer.learningRate = scheduler.getLR();
  optimizer.step(params, gradients);
  scheduler.step();
}
```
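The warmup + cosine decay shape is easy to state directly. A minimal sketch, assuming linear warmup to `initialLR` followed by cosine decay to zero (the exact boundary handling inside `LRScheduler` is an assumption):

```typescript
// Warmup + cosine decay schedule; an illustrative sketch,
// not the LRScheduler source.
function warmupCosineLR(
  step: number,
  initialLR: number,
  warmupSteps: number,
  totalSteps: number,
): number {
  if (step < warmupSteps) {
    // Linear warmup from 0 up to initialLR.
    return (initialLR * step) / warmupSteps;
  }
  // Cosine decay from initialLR down to 0 over the remaining steps.
  const progress = (step - warmupSteps) / (totalSteps - warmupSteps);
  return initialLR * 0.5 * (1 + Math.cos(Math.PI * progress));
}
```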
## Building from Source
### Prerequisites
- Rust 1.70+
- wasm-pack
### Build Commands
```bash
# Build for web (ES modules)
wasm-pack build --target web --out-dir pkg
# Build for Node.js
wasm-pack build --target nodejs --out-dir pkg-node
# Build for bundlers (webpack, vite, etc.)
wasm-pack build --target bundler --out-dir pkg-bundler
# Run tests
wasm-pack test --headless --firefox
```
## API Reference
### Attention Mechanisms
- `MultiHeadAttention` - Standard multi-head attention
- `HyperbolicAttention` - Attention in hyperbolic space
- `LinearAttention` - Linear complexity attention (Performer)
- `FlashAttention` - Memory-efficient attention
- `LocalGlobalAttention` - Combined local and global attention
- `MoEAttention` - Mixture of Experts attention
- `CGTSheafAttention` - Coherence-gated via Prime-Radiant energy
- `scaledDotAttention()` - Functional API for basic attention
### CGT Sheaf Attention (Prime-Radiant Integration)
The CGT (Coherence-Gated Transformer) Sheaf Attention mechanism uses Prime-Radiant's sheaf Laplacian energy to gate attention based on mathematical consistency:
```typescript
import { CGTSheafAttention } from 'ruvector-attention-wasm';
const cgtAttention = new CGTSheafAttention({
  dim: 128,
  numHeads: 8,
  coherenceThreshold: 0.3, // Block if energy > threshold
});
// Attention is gated by coherence energy
const result = cgtAttention.compute(query, keys, values);
console.log('Coherence energy:', result.energy);
console.log('Is coherent:', result.isCoherent);
```
**Key features:**
- Energy-weighted attention: Lower coherence energy → higher attention
- Automatic hallucination detection via residual analysis
- GPU-accelerated with wgpu WGSL shaders (vec4 optimized)
- SIMD fallback (AVX-512/AVX2/NEON)
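The gating idea can be sketched in plain TypeScript. This is only an illustration: the real mechanism derives per-key energies from the sheaf Laplacian, and the `exp(-energy)`-style down-weighting below (subtracting energy from the logit before softmax) is an assumption, not the library's code:

```typescript
// Sketch of coherence gating: keys with high energy get exponentially
// less attention, and the result is flagged incoherent above a threshold.
function coherenceGatedWeights(
  scores: number[],
  energies: number[],
  threshold: number,
): { weights: number[]; isCoherent: boolean } {
  // Subtracting energy from the logit multiplies exp(score) by exp(-energy):
  // lower coherence energy -> higher attention.
  const gated = scores.map((s, i) => s - energies[i]);
  const m = Math.max(...gated);
  const exps = gated.map((g) => Math.exp(g - m));
  const sum = exps.reduce((a, b) => a + b, 0);
  const weights = exps.map((e) => e / sum);
  // Attention-weighted total energy decides the coherence flag.
  const totalEnergy = energies.reduce((a, e, i) => a + e * weights[i], 0);
  return { weights, isCoherent: totalEnergy <= threshold };
}
```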
### Training
- `InfoNCELoss` - Contrastive loss function
- `Adam` - Adam optimizer
- `AdamW` - AdamW optimizer with weight decay
- `LRScheduler` - Learning rate scheduler
### Utilities
- `utils.cosineSimilarity()` - Cosine similarity between vectors
- `utils.l2Norm()` - L2 norm of a vector
- `utils.normalize()` - Normalize vector to unit length
- `utils.softmax()` - Apply softmax transformation
- `utils.attentionWeights()` - Compute attention weights from scores
- `utils.batchNormalize()` - Batch normalization
- `utils.randomOrthogonalMatrix()` - Generate random orthogonal matrix
- `utils.pairwiseDistances()` - Compute pairwise distances
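These helpers follow the standard definitions. For reference, the semantics of a few of them in plain TypeScript (not the WASM source; `softmax` here returns a new array for clarity, whereas the WASM `softmax` normalizes in place):

```typescript
// Standard-definition reference versions of a few utils.* helpers.
function l2Norm(v: Float32Array): number {
  let s = 0;
  for (const x of v) s += x * x;
  return Math.sqrt(s);
}

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot / (l2Norm(a) * l2Norm(b));
}

function softmax(v: Float32Array): Float32Array {
  // Subtract the max before exponentiating, for numerical stability.
  let m = -Infinity;
  for (const x of v) if (x > m) m = x;
  const exps = Array.from(v, (x) => Math.exp(x - m));
  const sum = exps.reduce((a, b) => a + b, 0);
  return Float32Array.from(exps, (e) => e / sum);
}
```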
## Performance
The WASM bindings provide near-native performance for attention computations:
- Optimized with `opt-level = "s"` and LTO
- SIMD acceleration where available
- Efficient memory management
- Zero-copy data transfer where possible
## License
MIT OR Apache-2.0


@@ -0,0 +1,28 @@
{
  "name": "ruvector-attention-wasm",
  "collaborators": [
    "Ruvector Team"
  ],
  "description": "High-performance WebAssembly attention mechanisms: Multi-Head, Flash, Hyperbolic, MoE, CGT Sheaf Attention with GPU acceleration for transformers and LLMs",
  "version": "2.0.5",
  "license": "MIT",
  "repository": {
    "type": "git",
    "url": "https://github.com/ruvnet/ruvector"
  },
  "files": [
    "ruvector_attention_wasm_bg.wasm",
    "ruvector_attention_wasm.js",
    "ruvector_attention_wasm.d.ts"
  ],
  "main": "ruvector_attention_wasm.js",
  "homepage": "https://ruv.io/ruvector",
  "types": "ruvector_attention_wasm.d.ts",
  "keywords": [
    "wasm",
    "attention",
    "transformer",
    "flash-attention",
    "llm"
  ]
}


@@ -0,0 +1,359 @@
/* tslint:disable */
/* eslint-disable */
/**
 * Adam optimizer
 */
export class WasmAdam {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Create a new Adam optimizer
   *
   * # Arguments
   * * `param_count` - Number of parameters
   * * `learning_rate` - Learning rate
   */
  constructor(param_count: number, learning_rate: number);
  /**
   * Reset optimizer state
   */
  reset(): void;
  /**
   * Perform optimization step
   *
   * # Arguments
   * * `params` - Current parameter values (will be updated in-place)
   * * `gradients` - Gradient values
   */
  step(params: Float32Array, gradients: Float32Array): void;
  /**
   * Get current learning rate
   */
  learning_rate: number;
}
/**
 * AdamW optimizer (Adam with decoupled weight decay)
 */
export class WasmAdamW {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Create a new AdamW optimizer
   *
   * # Arguments
   * * `param_count` - Number of parameters
   * * `learning_rate` - Learning rate
   * * `weight_decay` - Weight decay coefficient
   */
  constructor(param_count: number, learning_rate: number, weight_decay: number);
  /**
   * Reset optimizer state
   */
  reset(): void;
  /**
   * Perform optimization step with weight decay
   */
  step(params: Float32Array, gradients: Float32Array): void;
  /**
   * Get current learning rate
   */
  learning_rate: number;
  /**
   * Get weight decay
   */
  readonly weight_decay: number;
}
/**
 * Flash attention mechanism
 */
export class WasmFlashAttention {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Compute flash attention
   */
  compute(query: Float32Array, keys: any, values: any): Float32Array;
  /**
   * Create a new flash attention instance
   *
   * # Arguments
   * * `dim` - Embedding dimension
   * * `block_size` - Block size for tiling
   */
  constructor(dim: number, block_size: number);
}
/**
 * Hyperbolic attention mechanism
 */
export class WasmHyperbolicAttention {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Compute hyperbolic attention
   */
  compute(query: Float32Array, keys: any, values: any): Float32Array;
  /**
   * Create a new hyperbolic attention instance
   *
   * # Arguments
   * * `dim` - Embedding dimension
   * * `curvature` - Hyperbolic curvature parameter
   */
  constructor(dim: number, curvature: number);
  /**
   * Get the curvature
   */
  readonly curvature: number;
}
/**
 * InfoNCE contrastive loss for training
 */
export class WasmInfoNCELoss {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Compute InfoNCE loss
   *
   * # Arguments
   * * `anchor` - Anchor embedding
   * * `positive` - Positive example embedding
   * * `negatives` - Array of negative example embeddings
   */
  compute(anchor: Float32Array, positive: Float32Array, negatives: any): number;
  /**
   * Create a new InfoNCE loss instance
   *
   * # Arguments
   * * `temperature` - Temperature parameter for softmax
   */
  constructor(temperature: number);
}
/**
 * Learning rate scheduler
 */
export class WasmLRScheduler {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Get learning rate for current step
   */
  get_lr(): number;
  /**
   * Create a new learning rate scheduler with warmup and cosine decay
   *
   * # Arguments
   * * `initial_lr` - Initial learning rate
   * * `warmup_steps` - Number of warmup steps
   * * `total_steps` - Total training steps
   */
  constructor(initial_lr: number, warmup_steps: number, total_steps: number);
  /**
   * Reset scheduler
   */
  reset(): void;
  /**
   * Advance to next step
   */
  step(): void;
}
/**
 * Linear attention (Performer-style)
 */
export class WasmLinearAttention {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Compute linear attention
   */
  compute(query: Float32Array, keys: any, values: any): Float32Array;
  /**
   * Create a new linear attention instance
   *
   * # Arguments
   * * `dim` - Embedding dimension
   * * `num_features` - Number of random features
   */
  constructor(dim: number, num_features: number);
}
/**
 * Local-global attention mechanism
 */
export class WasmLocalGlobalAttention {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Compute local-global attention
   */
  compute(query: Float32Array, keys: any, values: any): Float32Array;
  /**
   * Create a new local-global attention instance
   *
   * # Arguments
   * * `dim` - Embedding dimension
   * * `local_window` - Size of local attention window
   * * `global_tokens` - Number of global attention tokens
   */
  constructor(dim: number, local_window: number, global_tokens: number);
}
/**
 * Mixture of Experts (MoE) attention
 */
export class WasmMoEAttention {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Compute MoE attention
   */
  compute(query: Float32Array, keys: any, values: any): Float32Array;
  /**
   * Create a new MoE attention instance
   *
   * # Arguments
   * * `dim` - Embedding dimension
   * * `num_experts` - Number of expert attention mechanisms
   * * `top_k` - Number of experts to use per query
   */
  constructor(dim: number, num_experts: number, top_k: number);
}
/**
 * Multi-head attention mechanism
 */
export class WasmMultiHeadAttention {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Compute multi-head attention
   */
  compute(query: Float32Array, keys: any, values: any): Float32Array;
  /**
   * Create a new multi-head attention instance
   *
   * # Arguments
   * * `dim` - Embedding dimension
   * * `num_heads` - Number of attention heads
   */
  constructor(dim: number, num_heads: number);
  /**
   * Get the dimension
   */
  readonly dim: number;
  /**
   * Get the number of heads
   */
  readonly num_heads: number;
}
/**
 * SGD optimizer with momentum
 */
export class WasmSGD {
  free(): void;
  [Symbol.dispose](): void;
  /**
   * Create a new SGD optimizer
   *
   * # Arguments
   * * `param_count` - Number of parameters
   * * `learning_rate` - Learning rate
   * * `momentum` - Momentum coefficient (default: 0)
   */
  constructor(param_count: number, learning_rate: number, momentum?: number | null);
  /**
   * Reset optimizer state
   */
  reset(): void;
  /**
   * Perform optimization step
   */
  step(params: Float32Array, gradients: Float32Array): void;
  /**
   * Get current learning rate
   */
  learning_rate: number;
}
/**
 * Compute attention weights from scores
 */
export function attention_weights(scores: Float32Array, temperature?: number | null): void;
/**
 * Get information about available attention mechanisms
 */
export function available_mechanisms(): any;
/**
 * Batch normalize vectors
 */
export function batch_normalize(vectors: any, epsilon?: number | null): Float32Array;
/**
 * Compute cosine similarity between two vectors
 */
export function cosine_similarity(a: Float32Array, b: Float32Array): number;
/**
 * Initialize the WASM module with panic hook
 */
export function init(): void;
/**
 * Compute L2 norm of a vector
 */
export function l2_norm(vec: Float32Array): number;
/**
 * Log a message to the browser console
 */
export function log(message: string): void;
/**
 * Log an error to the browser console
 */
export function log_error(message: string): void;
/**
 * Normalize a vector to unit length
 */
export function normalize(vec: Float32Array): void;
/**
 * Compute pairwise distances between vectors
 */
export function pairwise_distances(vectors: any): Float32Array;
/**
 * Generate random orthogonal matrix (for initialization)
 */
export function random_orthogonal_matrix(dim: number): Float32Array;
/**
 * Compute scaled dot-product attention
 *
 * # Arguments
 * * `query` - Query vector as Float32Array
 * * `keys` - Array of key vectors
 * * `values` - Array of value vectors
 * * `scale` - Optional scaling factor (defaults to 1/sqrt(dim))
 */
export function scaled_dot_attention(query: Float32Array, keys: any, values: any, scale?: number | null): Float32Array;
/**
 * Compute softmax of a vector
 */
export function softmax(vec: Float32Array): void;
/**
 * Get the version of the ruvector-attention-wasm crate
 */
export function version(): string;

File diff suppressed because it is too large.


@@ -0,0 +1,71 @@
/* tslint:disable */
/* eslint-disable */
export const memory: WebAssembly.Memory;
export const __wbg_wasmadam_free: (a: number, b: number) => void;
export const __wbg_wasmadamw_free: (a: number, b: number) => void;
export const __wbg_wasmflashattention_free: (a: number, b: number) => void;
export const __wbg_wasmhyperbolicattention_free: (a: number, b: number) => void;
export const __wbg_wasminfonceloss_free: (a: number, b: number) => void;
export const __wbg_wasmlinearattention_free: (a: number, b: number) => void;
export const __wbg_wasmmoeattention_free: (a: number, b: number) => void;
export const __wbg_wasmmultiheadattention_free: (a: number, b: number) => void;
export const __wbg_wasmsgd_free: (a: number, b: number) => void;
export const attention_weights: (a: number, b: number, c: number, d: number) => void;
export const available_mechanisms: () => number;
export const batch_normalize: (a: number, b: number, c: number) => void;
export const cosine_similarity: (a: number, b: number, c: number, d: number, e: number) => void;
export const l2_norm: (a: number, b: number) => number;
export const log: (a: number, b: number) => void;
export const log_error: (a: number, b: number) => void;
export const normalize: (a: number, b: number, c: number, d: number) => void;
export const pairwise_distances: (a: number, b: number) => void;
export const random_orthogonal_matrix: (a: number, b: number) => void;
export const scaled_dot_attention: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const softmax: (a: number, b: number, c: number) => void;
export const version: (a: number) => void;
export const wasmadam_learning_rate: (a: number) => number;
export const wasmadam_new: (a: number, b: number) => number;
export const wasmadam_reset: (a: number) => void;
export const wasmadam_set_learning_rate: (a: number, b: number) => void;
export const wasmadam_step: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const wasmadamw_new: (a: number, b: number, c: number) => number;
export const wasmadamw_reset: (a: number) => void;
export const wasmadamw_step: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const wasmadamw_weight_decay: (a: number) => number;
export const wasmflashattention_compute: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const wasmflashattention_new: (a: number, b: number) => number;
export const wasmhyperbolicattention_compute: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const wasmhyperbolicattention_curvature: (a: number) => number;
export const wasmhyperbolicattention_new: (a: number, b: number) => number;
export const wasminfonceloss_compute: (a: number, b: number, c: number, d: number, e: number, f: number, g: number) => void;
export const wasminfonceloss_new: (a: number) => number;
export const wasmlinearattention_compute: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const wasmlinearattention_new: (a: number, b: number) => number;
export const wasmlocalglobalattention_compute: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const wasmlocalglobalattention_new: (a: number, b: number, c: number) => number;
export const wasmlrscheduler_get_lr: (a: number) => number;
export const wasmlrscheduler_new: (a: number, b: number, c: number) => number;
export const wasmlrscheduler_reset: (a: number) => void;
export const wasmlrscheduler_step: (a: number) => void;
export const wasmmoeattention_compute: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const wasmmoeattention_new: (a: number, b: number, c: number) => number;
export const wasmmultiheadattention_compute: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const wasmmultiheadattention_dim: (a: number) => number;
export const wasmmultiheadattention_new: (a: number, b: number, c: number) => void;
export const wasmmultiheadattention_num_heads: (a: number) => number;
export const wasmsgd_learning_rate: (a: number) => number;
export const wasmsgd_new: (a: number, b: number, c: number) => number;
export const wasmsgd_reset: (a: number) => void;
export const wasmsgd_set_learning_rate: (a: number, b: number) => void;
export const wasmsgd_step: (a: number, b: number, c: number, d: number, e: number, f: number) => void;
export const init: () => void;
export const wasmadamw_set_learning_rate: (a: number, b: number) => void;
export const wasmadamw_learning_rate: (a: number) => number;
export const __wbg_wasmlocalglobalattention_free: (a: number, b: number) => void;
export const __wbg_wasmlrscheduler_free: (a: number, b: number) => void;
export const __wbindgen_export: (a: number, b: number) => number;
export const __wbindgen_export2: (a: number, b: number, c: number, d: number) => number;
export const __wbindgen_export3: (a: number) => void;
export const __wbindgen_export4: (a: number, b: number, c: number) => void;
export const __wbindgen_add_to_stack_pointer: (a: number) => number;
export const __wbindgen_start: () => void;