# MicroLoRA - Browser-Compatible Lightweight LoRA Adaptation

MicroLoRA provides ultra-lightweight LoRA (Low-Rank Adaptation) for real-time adaptation of language models directly in web browsers.

## Features

- **Tiny Memory Footprint**: Rank 1-4 adapters each use <50KB
- **Pure WASM**: No threading, no file I/O, fully browser-compatible
- **Real-time Adaptation**: Update weights from user feedback with <1ms latency
- **Serialization**: JSON-based persistence for localStorage/IndexedDB
- **TypeScript-Friendly**: Full type definitions with getter/setter patterns

## Architecture

```
┌─────────────────┐
│    Base LLM     │
│    (frozen)     │
└────────┬────────┘
         │
         ├──────────┐
         │          │
┌────────▼────────┐ │
│      Input      │ │
│    (768-dim)    │ │
└────────┬────────┘ │
         │          │
         ▼          │
┌─────────────────┐ │
│     LoRA A      │ │  Down projection
│    (768 x 2)    │ │  (in_features x rank)
└────────┬────────┘ │
         │          │
         ▼          │
┌─────────────────┐ │
│  Intermediate   │ │
│     (2-dim)     │ │
└────────┬────────┘ │
         │          │
         ▼          │
┌─────────────────┐ │
│     LoRA B      │ │  Up projection
│    (2 x 768)    │ │  (rank x out_features)
└────────┬────────┘ │
         │          │
         ▼          │
┌─────────────────┐ │
│   LoRA Output   │ │  Scaled by (alpha / rank)
│    (768-dim)    │ │
└────────┬────────┘ │
         │          │
         └──────────┤
                    │
         ┌──────────▼───────┐
         │   Final Output   │
         │  (base + LoRA)   │
         └──────────────────┘
```

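In code terms, the diagram computes `y = x + (alpha / rank) · B(A(x))`. A minimal plain-JavaScript sketch of that computation (illustrative only: the row-major matrix layout is an assumption, and the WASM adapter performs the equivalent internally):

```javascript
// Illustrative reference, not the WASM implementation.
// A is in_features x rank, B is rank x out_features, both row-major.
function loraForward(x, A, B, rank, alpha) {
  const scaling = alpha / rank;
  const inF = x.length;          // e.g. 768
  const outF = B.length / rank;  // e.g. 768

  // Down projection: h = A^T x
  const h = new Float32Array(rank);
  for (let r = 0; r < rank; r++) {
    for (let i = 0; i < inF; i++) h[r] += A[i * rank + r] * x[i];
  }

  // Up projection + scaling: delta = scaling * B^T h
  const delta = new Float32Array(outF);
  for (let o = 0; o < outF; o++) {
    for (let r = 0; r < rank; r++) delta[o] += B[r * outF + o] * h[r];
    delta[o] *= scaling;
  }

  // Final output = base hidden state + LoRA delta
  return x.map((v, i) => v + delta[i]);
}
```
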
## Quick Start

### Basic Usage

```javascript
import init, { MicroLoraWasm, MicroLoraConfigWasm, AdaptFeedbackWasm } from 'ruvllm-wasm';

// Initialize WASM
await init();

// Create adapter config
const config = new MicroLoraConfigWasm();
config.rank = 2;          // Rank 1-4 (2 recommended for browser)
config.alpha = 4.0;       // Scaling factor
config.inFeatures = 768;  // Match your model's hidden size
config.outFeatures = 768;

// Create the adapter
const lora = new MicroLoraWasm(config);

// Apply LoRA to hidden states
const hiddenState = new Float32Array(768);
const output = lora.apply(hiddenState);
```

### Real-time Adaptation

```javascript
// User provides feedback on model output
const feedback = new AdaptFeedbackWasm(0.8); // Quality score [0.0, 1.0]
feedback.learningRate = 0.01;

// Adapt weights based on feedback
lora.adapt(hiddenState, feedback);

// Apply updates (can batch multiple adapt calls)
lora.applyUpdates(0.01);

// Get statistics
const stats = lora.stats();
console.log(`Average quality: ${stats.avgQuality}`);
console.log(`Samples seen: ${stats.samplesSeen}`);
```

### Persistence

```javascript
// Save to localStorage
const json = lora.toJson();
localStorage.setItem('lora-state', json);

// Restore from localStorage
const saved = localStorage.getItem('lora-state');
const restored = MicroLoraWasm.fromJson(saved);
```

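localStorage is synchronous and typically capped around 5MB per origin; for many adapters the same JSON string can go into IndexedDB instead. A minimal sketch using the raw IndexedDB API (database and store names here are illustrative):

```javascript
// Sketch: persisting the serialized adapter in IndexedDB.
function saveLoraToIndexedDB(lora, key = 'lora-state') {
  return new Promise((resolve, reject) => {
    const open = indexedDB.open('lora-db', 1);
    open.onupgradeneeded = () => open.result.createObjectStore('adapters');
    open.onerror = () => reject(open.error);
    open.onsuccess = () => {
      const tx = open.result.transaction('adapters', 'readwrite');
      tx.objectStore('adapters').put(lora.toJson(), key);
      tx.oncomplete = () => resolve();
      tx.onerror = () => reject(tx.error);
    };
  });
}
```
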
## API Reference

### MicroLoraConfigWasm

Configuration for the LoRA adapter.

**Properties:**
- `rank: number` - LoRA rank (1-4, clamped). Default: 2
- `alpha: number` - Scaling factor. Default: 4.0
- `inFeatures: number` - Input dimension. Default: 768
- `outFeatures: number` - Output dimension. Default: 768

**Methods:**
- `memoryBytes(): number` - Calculate memory footprint in bytes
- `computeScaling(): number` - Get computed scaling (alpha / rank)

### MicroLoraWasm

The main LoRA adapter.

**Constructor:**
- `new MicroLoraWasm(config: MicroLoraConfigWasm)`

**Methods:**
- `apply(input: Float32Array): Float32Array` - Apply LoRA transformation
- `adapt(input: Float32Array, feedback: AdaptFeedbackWasm): void` - Accumulate gradients
- `applyUpdates(learningRate: number): void` - Apply accumulated gradients
- `reset(): void` - Reset to initial state
- `stats(): MicroLoraStatsWasm` - Get adapter statistics
- `toJson(): string` - Serialize to JSON
- `fromJson(json: string): MicroLoraWasm` - Deserialize from JSON (static)
- `pendingUpdates(): number` - Get number of pending gradient updates
- `getConfig(): MicroLoraConfigWasm` - Get current configuration

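A brief sketch tying the inspection methods together (continuing the Quick Start `lora`):

```javascript
// Inspect accumulated-but-unapplied gradient updates
console.log(`Pending updates: ${lora.pendingUpdates()}`);

// Read back the active configuration
const cfg = lora.getConfig();
console.log(`rank=${cfg.rank}, scaling=${cfg.computeScaling()}`); // scaling = alpha / rank

// Discard learned weights and pending gradients, returning to the initial state
lora.reset();
```
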
### AdaptFeedbackWasm

Feedback for weight adaptation.

**Constructor:**
- `new AdaptFeedbackWasm(quality: number)` - Quality score [0.0, 1.0]

**Properties:**
- `quality: number` - Quality/reward signal [0.0, 1.0]
- `learningRate: number` - Learning rate. Default: 0.01

### MicroLoraStatsWasm

Adapter statistics.

**Properties:**
- `samplesSeen: number` - Total samples seen
- `avgQuality: number` - Average quality score
- `memoryBytes: number` - Memory usage in bytes
- `paramCount: number` - Total parameter count

**Methods:**
- `toJson(): string` - Convert to JSON string

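For logging or telemetry, the whole snapshot can be exported at once (a sketch; the exact JSON key casing depends on the serializer):

```javascript
const stats = lora.stats();
console.log(`${stats.paramCount} params in ${stats.memoryBytes} bytes`);
console.log(JSON.parse(stats.toJson())); // full snapshot; key names may differ (e.g. snake_case)
```
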
## Memory Footprint

Memory usage for different configurations:

| Config | Memory | Parameters |
|--------|--------|------------|
| Rank 1, 768×768 | 6KB | 1,536 |
| Rank 2, 768×768 | 12KB | 3,072 |
| Rank 4, 768×768 | 24KB | 6,144 |
| Rank 2, 512×512 | 8KB | 2,048 |

Formula: `(in_features × rank + rank × out_features) × 4 bytes`

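`memoryBytes()` should agree with this formula; a quick check against the rank-2 row (assuming the Quick Start `config`):

```javascript
// Rank 2, 768x768: (768*2 + 2*768) * 4 = 12,288 bytes (~12KB)
const expected = (config.inFeatures * config.rank + config.rank * config.outFeatures) * 4;
console.log(expected, config.memoryBytes()); // both expected to print 12288
```
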
## Use Cases

### 1. Personalized Chat Interface

```javascript
// Adapt based on user thumbs up/down.
// `lora` and `interactionCount` are assumed to be maintained by the surrounding app.
async function handleUserFeedback(hiddenStates, wasHelpful) {
  const feedback = new AdaptFeedbackWasm(wasHelpful ? 0.9 : 0.3);
  lora.adapt(hiddenStates, feedback);

  // Apply after every 5 interactions
  if (interactionCount % 5 === 0) {
    lora.applyUpdates(0.02);

    // Persist to localStorage
    localStorage.setItem('chat-lora', lora.toJson());
  }
}
```

### 2. Domain-Specific Fine-tuning

```javascript
// Adapt to a technical domain over time.
// Each `input` is a Float32Array of hidden states from prior conversations.
const conversations = [
  { input: codeHelpQuery, quality: 0.85 },
  { input: technicalExplanation, quality: 0.92 },
  // ...
];

for (const conv of conversations) {
  const feedback = new AdaptFeedbackWasm(conv.quality);
  lora.adapt(conv.input, feedback);
}

lora.applyUpdates(0.01);
```

### 3. Multi-User Adapters

```javascript
// Store separate adapters per user
function getUserLora(userId) {
  const key = `lora-${userId}`;
  const saved = localStorage.getItem(key);

  if (saved) {
    return MicroLoraWasm.fromJson(saved);
  }

  const config = new MicroLoraConfigWasm();
  return new MicroLoraWasm(config);
}

function saveUserLora(userId, lora) {
  localStorage.setItem(`lora-${userId}`, lora.toJson());
}
```

## Performance Tips

### 1. Batch Gradient Updates

```javascript
// ❌ Bad: Update after every sample
for (const sample of samples) {
  lora.adapt(sample.input, sample.feedback);
  lora.applyUpdates(0.01); // Expensive!
}

// ✅ Good: Batch updates
for (const sample of samples) {
  lora.adapt(sample.input, sample.feedback);
}
lora.applyUpdates(0.01); // Once at the end
```

### 2. Choose Optimal Rank

- **Rank 1**: Fastest, minimal memory (~6KB), good for simple adaptations
- **Rank 2**: Best balance, recommended for most use cases (~12KB)
- **Rank 4**: More expressive, use when quality matters more than size (~24KB)

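If rank is chosen programmatically, the tradeoff reduces to a byte budget via the memory formula above; an illustrative helper (the function name is ours, not part of the API):

```javascript
// Illustrative: choose the largest supported rank that fits a byte budget.
function chooseRank(inFeatures, outFeatures, maxBytes) {
  for (const rank of [4, 2, 1]) {
    if ((inFeatures * rank + rank * outFeatures) * 4 <= maxBytes) return rank;
  }
  return 1;
}

const config = new MicroLoraConfigWasm();
config.rank = chooseRank(768, 768, 16 * 1024); // -> 2 (12KB fits, 24KB does not)
```
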
### 3. Learning Rate Guidelines

- Start with `0.01` for general use
- Increase to `0.02-0.05` for faster adaptation
- Decrease to `0.001-0.005` for fine-grained control
- Use adaptive rates based on quality variance, as in the snippet below

```javascript
// `computeQualityVariance` is sketched after this block
const variance = computeQualityVariance(recentSamples);
const adaptiveLR = 0.01 * (1 + variance);
lora.applyUpdates(adaptiveLR);
```

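`computeQualityVariance` is not part of the WASM API; one plausible definition, assuming each recent sample records the quality score passed to `adapt`:

```javascript
// Sketch (not part of the API): sample variance of recent quality scores.
function computeQualityVariance(samples) {
  if (samples.length < 2) return 0;
  const qs = samples.map((s) => s.quality);
  const mean = qs.reduce((a, b) => a + b, 0) / qs.length;
  return qs.reduce((a, q) => a + (q - mean) ** 2, 0) / (qs.length - 1);
}
```
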
## Comparison with Full LoRA

| Feature | MicroLoRA | Standard LoRA |
|---------|-----------|---------------|
| Memory | 6-24KB | 50-500KB |
| Rank | 1-4 | 8-64 |
| Adaptation | Real-time (<1ms) | Batch (>100ms) |
| Threading | None | Multi-threaded |
| Platform | Browser only | Any |
| Gradients | Simplified | Full backprop |

## Browser Compatibility

Requires:
- WebAssembly support
- Float32Array support
- localStorage for persistence (optional)

Tested on:
- Chrome 90+
- Firefox 88+
- Safari 14+
- Edge 90+

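A small detection sketch for these requirements before calling `init()`:

```javascript
// Feature detection before loading the WASM module.
const supported =
  typeof WebAssembly === 'object' &&
  typeof Float32Array === 'function';

let canPersist = false;
try {
  localStorage.setItem('__probe', '1');
  localStorage.removeItem('__probe');
  canPersist = true;
} catch {
  // localStorage unavailable (e.g. blocked storage); persistence is optional
}
```
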
## Advanced: Integration with Base Model

```javascript
// `baseModel` stands in for your host model wrapper; `generate` and
// `projectToTokens` are placeholders for whatever API it exposes.
async function generateWithLoRA(prompt, lora) {
  // 1. Get base model output and hidden states
  const { output, hiddenStates } = await baseModel.generate(prompt);

  // 2. Apply LoRA transformation to hidden states
  const loraOutput = lora.apply(hiddenStates);

  // 3. Combine (additive)
  const finalHidden = hiddenStates.map((h, i) => h + loraOutput[i]);

  // 4. Project to tokens
  const tokens = await baseModel.projectToTokens(finalHidden);

  return tokens;
}
```

## Troubleshooting

### High Memory Usage

```javascript
// Check actual memory usage
const stats = lora.stats();
console.log(`Memory: ${stats.memoryBytes} bytes`);

// If too high, reduce rank (and recreate the adapter with the new config)
config.rank = 1; // Instead of 2 or 4
```

### Slow Adaptation

```javascript
// Increase learning rate
feedback.learningRate = 0.05; // Instead of 0.01

// Or apply updates more frequently
if (sampleCount % 3 === 0) { // Instead of % 10
  lora.applyUpdates(0.02);
}
```

### Quality Not Improving

```javascript
// Check if feedback is balanced
const stats = lora.stats();
if (stats.avgQuality < 0.4 || stats.avgQuality > 0.9) {
  console.warn('Feedback may be too one-sided');
}

// Add quality normalization (minQuality/maxQuality tracked over recent raw scores)
const normalizedQuality = (rawQuality - minQuality) / (maxQuality - minQuality);
feedback.quality = normalizedQuality;
```

## Examples

See `examples/micro_lora_example.ts` for complete working examples including:
- Basic usage
- Online learning loop
- Serialization/deserialization
- Browser storage integration
- Multi-user scenarios

## License

MIT License - see LICENSE file for details