# Ruvector WASM Build Guide

## Overview

This guide provides instructions for building the Ruvector WASM bindings. The WASM module enables high-performance vector database operations directly in web browsers and Node.js environments.

## Implementation Status

✅ **Completed Components:**

1. **Core WASM Bindings** (`/crates/ruvector-wasm/src/lib.rs`)
   - Full VectorDB API (insert, search, delete, batch operations)
   - Proper error handling with WasmResult types
   - Console panic hook for debugging
   - JavaScript-compatible types (JsVectorEntry, JsSearchResult)

2. **SIMD Support**
   - Dual build configuration (with/without SIMD)
   - Feature flags in Cargo.toml
   - Runtime SIMD detection via `detectSIMD()` function

3. **Web Workers Integration** (`/crates/ruvector-wasm/src/worker.js`)
   - Message passing for async operations
   - Support for insert, search, delete, batch operations
   - Zero-copy transfers preparation

4. **Worker Pool Management** (`/crates/ruvector-wasm/src/worker-pool.js`)
   - Automatic pool sizing (4-8 workers based on CPU cores)
   - Round-robin task distribution
   - Promise-based API
   - Error handling and timeouts

5. **IndexedDB Persistence** (`/crates/ruvector-wasm/src/indexeddb.js`)
   - Save/load vectors to IndexedDB
   - Batch operations for performance
   - Progressive loading with callbacks
   - LRU cache implementation (1000 hot vectors)

6. **Examples**
   - Vanilla JavaScript example (`/examples/wasm-vanilla/index.html`)
   - React + Web Workers example (`/examples/wasm-react/`)

7. **Tests**
   - Comprehensive WASM tests (`/crates/ruvector-wasm/tests/wasm.rs`)
   - Browser-based testing with wasm-bindgen-test

8. **Build Configuration**
   - Optimized for size (target: <500KB gzipped)
   - Multiple build targets (web, nodejs, bundler)
   - Size verification scripts

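The pool sizing and round-robin distribution listed under Worker Pool Management could look roughly like the sketch below. `choosePoolSize` and `makeRoundRobin` are hypothetical names used only for illustration, and the clamp to 4-8 workers simply mirrors the range stated in the list; the actual logic in `worker-pool.js` may differ.

```javascript
// Hypothetical helper: clamp the reported core count to the 4-8 range.
function choosePoolSize(cores, min = 4, max = 8) {
  const n = Number.isInteger(cores) && cores > 0 ? cores : min;
  return Math.min(max, Math.max(min, n));
}

// Hypothetical helper: hand out workers in strict rotation.
function makeRoundRobin(workers) {
  let next = 0;
  return function pick() {
    const worker = workers[next];
    next = (next + 1) % workers.length;
    return worker;
  };
}

// In a browser, the core count would typically come from navigator.hardwareConcurrency.
const cores = typeof navigator !== 'undefined' ? navigator.hardwareConcurrency : undefined;
const size = choosePoolSize(cores);
const pick = makeRoundRobin(Array.from({ length: size }, (_, i) => `worker-${i}`));
console.log(size, pick(), pick());
```

Round-robin ignores how busy each worker is, which keeps dispatch cheap but can leave a slow task blocking later ones assigned to the same worker.
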
## Prerequisites

```bash
# Add the wasm32 target (assumes Rust is already installed via rustup)
rustup target add wasm32-unknown-unknown

# Install wasm-pack
cargo install wasm-pack

# Optional: Install wasm-opt for further optimization
npm install -g wasm-opt
```

## Building

### Standard Web Build

```bash
cd crates/ruvector-wasm
wasm-pack build --target web --out-dir pkg --release
```

### SIMD-Enabled Build

```bash
cd crates/ruvector-wasm
wasm-pack build --target web --out-dir pkg-simd --release -- --features simd
```

### All Targets

```bash
cd crates/ruvector-wasm
npm run build:all
```

This will build for:
- Web (`pkg/`)
- Web with SIMD (`pkg-simd/`)
- Node.js (`pkg-node/`)
- Bundler (`pkg-bundler/`)

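The contents of the `build:all` script are not shown in this guide. Assuming it runs one `wasm-pack build` per output directory listed above, the equivalent invocations can be sketched as a small table plus a formatter. The `builds` array and `wasmPackCommand` are illustrative names, and the assumption that only the `pkg-simd/` build enables the `simd` feature is inferred from the directory names.

```javascript
// Assumed mapping from wasm-pack target to output directory, taken from the list above.
const builds = [
  { target: 'web', outDir: 'pkg', features: [] },
  { target: 'web', outDir: 'pkg-simd', features: ['simd'] },
  { target: 'nodejs', outDir: 'pkg-node', features: [] },
  { target: 'bundler', outDir: 'pkg-bundler', features: [] },
];

// Build the command line for one entry.
function wasmPackCommand({ target, outDir, features }) {
  const base = `wasm-pack build --target ${target} --out-dir ${outDir} --release`;
  // Arguments after `--` are forwarded to cargo.
  return features.length > 0 ? `${base} -- --features ${features.join(',')}` : base;
}

for (const build of builds) {
  console.log(wasmPackCommand(build));
}
```

The first two printed commands match the standard and SIMD-enabled builds shown earlier in this section.
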
## Known Build Issues & Solutions

### Issue: getrandom 0.3 Compatibility

**Problem:** Some dependencies (notably `rand` via `uuid`) pull in `getrandom` 0.3.4. On `wasm32-unknown-unknown`, that version needs its `wasm_js` backend selected with a `--cfg` flag passed through `RUSTFLAGS`; enabling a Cargo feature alone is not enough.

**Solution Options:**

1. **Use .cargo/config.toml** (Already configured):
   ```toml
   [target.wasm32-unknown-unknown]
   rustflags = ['--cfg', 'getrandom_backend="wasm_js"']
   ```

2. **Disable uuid feature** (Implemented):
   ```toml
   # In ruvector-core/Cargo.toml
   [features]
   default = ["simd", "uuid-support"]
   uuid-support = ["uuid"]

   # In ruvector-wasm/Cargo.toml
   [dependencies]
   ruvector-core = { path = "../ruvector-core", default-features = false }
   ```

3. **Alternative: Use timestamp-based IDs** (Fallback):
   For WASM builds, use `Date.now()` + random suffixes instead of UUIDs

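The timestamp-based fallback in option 3 could be implemented on the JavaScript side along these lines. This is a minimal sketch: `makeTimestampId`, its `vec` prefix, and the 8-character suffix are illustrative choices, not part of the Ruvector API.

```javascript
// Hypothetical ID generator: millisecond timestamp plus a random base-36 suffix.
// Not a UUID and not cryptographically strong; collisions are unlikely but possible.
function makeTimestampId(prefix = 'vec', suffixLength = 8) {
  let suffix = '';
  while (suffix.length < suffixLength) {
    // Append one random base-36 character (0-9, a-z) per iteration.
    suffix += Math.floor(Math.random() * 36).toString(36);
  }
  return `${prefix}_${Date.now().toString(36)}_${suffix}`;
}

console.log(makeTimestampId());
```

Because the timestamp comes first, IDs generated later sort after earlier ones as long as the encoded timestamps have the same length, which can be convenient when inspecting stored entries.
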
### Issue: Large Binary Size

**Solution:**

1. Enable LTO and size optimization (already configured):
   ```toml
   [profile.release]
   opt-level = "z"
   lto = true
   codegen-units = 1
   panic = "abort"
   ```

2. Run wasm-opt:
   ```bash
   npm run optimize
   ```

3. Verify size:
   ```bash
   npm run size
   ```

## Usage Examples

### Vanilla JavaScript

```html
<!DOCTYPE html>
<html>
<head>
  <title>Ruvector WASM</title>
</head>
<body>
  <script type="module">
    import init, { VectorDB } from './pkg/ruvector_wasm.js';

    await init();

    const db = new VectorDB(384, 'cosine', true);

    // Insert vector
    const vector = new Float32Array(384).map(() => Math.random());
    const id = db.insert(vector, 'vec_1', { label: 'test' });

    // Search
    const query = new Float32Array(384).map(() => Math.random());
    const results = db.search(query, 10);

    console.log('Results:', results);
  </script>
</body>
</html>
```

### With Web Workers

```javascript
import { WorkerPool } from '@ruvector/wasm/worker-pool';

const pool = new WorkerPool(
  '/worker.js',
  '/pkg/ruvector_wasm.js',
  {
    poolSize: 4,
    dimensions: 384,
    metric: 'cosine'
  }
);

await pool.init();

// Parallel insert
const entries = Array(1000).fill(0).map((_, i) => ({
  vector: Array(384).fill(0).map(() => Math.random()),
  id: `vec_${i}`,
  metadata: { index: i }
}));

const ids = await pool.insertBatch(entries);

// Parallel search (random query vector with the same dimensions as the entries)
const query = Array(384).fill(0).map(() => Math.random());
const results = await pool.search(query, 10);

// Cleanup
pool.terminate();
```

### With IndexedDB

```javascript
import { IndexedDBPersistence } from '@ruvector/wasm/indexeddb';

const persistence = new IndexedDBPersistence('my_database');
await persistence.open();

// Save vectors (`vectors` is assumed to be an array of entries you already hold)
await persistence.saveBatch(vectors);

// Load with progress (`db` is assumed to be an initialized VectorDB)
await persistence.loadAll((progress) => {
  console.log(`Loaded ${progress.loaded} vectors`);

  if (progress.vectors.length > 0) {
    db.insertBatch(progress.vectors);
  }
});

// Get stats
const stats = await persistence.getStats();
console.log(`Cache hit rate: ${(stats.cacheHitRate * 100).toFixed(2)}%`);
```

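To illustrate what an LRU cache with a hit rate means in this context, here is a minimal sketch built on `Map`. The `LruCache` class and its fields are hypothetical and written only for this guide; the cache inside `indexeddb.js` may be implemented differently. The default capacity of 1000 mirrors the "1000 hot vectors" figure from the implementation status list.

```javascript
// Minimal LRU cache built on Map, which iterates keys in insertion order.
class LruCache {
  constructor(capacity = 1000) {
    this.capacity = capacity;
    this.map = new Map();
    this.hits = 0;
    this.misses = 0;
  }

  get(key) {
    if (!this.map.has(key)) {
      this.misses += 1;
      return undefined;
    }
    const value = this.map.get(key);
    // Re-insert to mark this key as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    this.hits += 1;
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.capacity) {
      // The first key in iteration order is the least recently used.
      this.map.delete(this.map.keys().next().value);
    }
    this.map.set(key, value);
  }

  get hitRate() {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

const cache = new LruCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // 'a' becomes most recently used
cache.set('c', 3); // capacity reached, so the least recently used key ('b') is evicted
console.log(cache.get('b'), cache.hitRate);
```

A low hit rate suggests the working set is larger than the cache, in which case most reads fall through to IndexedDB.
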
## Testing

### Browser Tests

```bash
cd crates/ruvector-wasm
wasm-pack test --headless --chrome
wasm-pack test --headless --firefox
```

### Node.js Tests

```bash
wasm-pack test --node
```

## Performance Optimization Tips

1. **Enable SIMD**: Use the SIMD build for 2-4x speedup on supported browsers
2. **Use Batch Operations**: `insertBatch` is 5-10x faster than multiple `insert` calls
3. **Use Web Workers**: Distribute operations across workers for parallel processing
4. **Enable LRU Cache**: Keep hot vectors in IndexedDB cache
5. **Optimize Vector Size**: Smaller dimensions = faster operations
6. **Choose Appropriate Metric**: Dot product is fastest, Euclidean is slowest

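For reference when choosing a metric, the three distance functions can be written in scalar form as below. These are the textbook definitions, not Ruvector's implementation (which runs in Rust and may use SIMD or pre-normalized vectors), so the relative costs seen in practice can differ from what this sketch suggests. They show why dot product does the least work per element.

```javascript
// Dot product: one multiply and one add per element.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Euclidean distance: subtract, multiply, add per element, then a square root.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Cosine similarity: a dot product normalized by both vector lengths.
// Returns NaN if either vector is all zeros.
function cosineSimilarity(a, b) {
  return dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
}

console.log(dot([1, 2, 3], [4, 5, 6]), euclidean([0, 0], [3, 4]), cosineSimilarity([1, 0], [0, 1]));
```

If vectors are normalized to unit length ahead of time, cosine similarity reduces to a plain dot product.
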
## Browser Compatibility

| Browser | Version | SIMD Support | Web Workers | IndexedDB |
|---------|---------|--------------|-------------|-----------|
| Chrome  | 91+     | ✅           | ✅          | ✅        |
| Firefox | 89+     | ✅           | ✅          | ✅        |
| Safari  | 16.4+   | Partial      | ✅          | ✅        |
| Edge    | 91+     | ✅           | ✅          | ✅        |

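Since SIMD support varies by browser, a page can probe for it at runtime and load the matching build. The sketch below validates a tiny hand-assembled module containing a `v128` instruction; it is one common way to feature-detect SIMD and is not necessarily how Ruvector's own `detectSIMD()` works. `supportsWasmSimd` and `packageDir` are hypothetical names, and the directory names follow the build outputs described earlier.

```javascript
// Tiny module whose single function returns a v128; validation fails on engines without SIMD.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,                   // "\0asm" magic + version 1
  1, 5, 1, 96, 0, 1, 123,                        // type section: () -> v128
  3, 2, 1, 0,                                    // function section: one function of type 0
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11,  // code: i32.const 0; i8x16.splat; i8x16.popcnt; end
]);

// Returns true when the engine accepts the SIMD probe module.
function supportsWasmSimd() {
  return typeof WebAssembly === 'object' && WebAssembly.validate(SIMD_PROBE);
}

// Hypothetical loader choice: pick the SIMD package directory when available.
function packageDir() {
  return supportsWasmSimd() ? './pkg-simd' : './pkg';
}

console.log(supportsWasmSimd(), packageDir());
```

`WebAssembly.validate` only checks the bytes and never runs them, so the probe is cheap enough to call once at startup.
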
## Size Benchmarks

Expected sizes after optimization:

- **Base build**: ~450KB gzipped
- **SIMD build**: ~480KB gzipped
- **With wasm-opt -Oz**: ~380KB gzipped

## Troubleshooting

### CORS Errors with Workers

Ensure your server sends the two response headers below. They enable cross-origin isolation, which browsers require before pages and workers can use `SharedArrayBuffer`:

```javascript
{
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp'
}
```

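How these headers get set depends on your server. As a minimal sketch, the helper below applies them to any response object that exposes `setHeader(name, value)`, such as the one passed to a Node.js `http` request handler. `applyIsolationHeaders` and `fakeRes` are hypothetical names used for illustration.

```javascript
// The two headers from the snippet above.
const ISOLATION_HEADERS = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
};

// Works with any response object exposing setHeader(name, value).
function applyIsolationHeaders(res) {
  for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
    res.setHeader(name, value);
  }
  return res;
}

// Stand-in response object so the sketch runs without a server.
const fakeRes = {
  headers: {},
  setHeader(name, value) { this.headers[name] = value; },
};
console.log(applyIsolationHeaders(fakeRes).headers);
```

The headers need to be present on the HTML document response; static hosts usually expose them through their own configuration instead of code.
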
### Memory Issues

Increase the WASM memory limit if needed. Sizes are given in 64 KiB pages, and the import below only takes effect if the module is built to import its memory:

```javascript
// In worker.js or main thread
// 256 pages = 16 MiB initial, 512 pages = 32 MiB maximum
WebAssembly.instantiate(module, {
  env: {
    memory: new WebAssembly.Memory({ initial: 256, maximum: 512 })
  }
});
```

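To pick sensible values for `initial` and `maximum`, it helps to estimate how many pages the raw vector data needs. WebAssembly memory is sized in 64 KiB pages and each `f32` component takes 4 bytes. The helper below is a hypothetical back-of-the-envelope calculation; it ignores index structures, metadata, and allocator overhead, so treat the result as a lower bound.

```javascript
const WASM_PAGE_BYTES = 64 * 1024; // page size fixed by the WebAssembly specification

// Raw Float32 storage only; index structures and metadata add more on top.
function pagesForVectors(count, dimensions) {
  const bytes = count * dimensions * Float32Array.BYTES_PER_ELEMENT;
  return Math.ceil(bytes / WASM_PAGE_BYTES);
}

console.log(pagesForVectors(10000, 384));  // 15,360,000 bytes -> 235 pages
console.log(pagesForVectors(100000, 384)); // 153,600,000 bytes -> 2344 pages
```

By this estimate, 10,000 vectors of 384 dimensions already come close to the 256-page initial size in the snippet above, and 100,000 vectors would need well over the 512-page maximum.
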
### IndexedDB Quota Errors

Check available storage:

```javascript
if ('storage' in navigator && 'estimate' in navigator.storage) {
  const estimate = await navigator.storage.estimate();
  console.log(`Using ${estimate.usage} of ${estimate.quota} bytes`);
}
```

## Next Steps

1. **Complete Build Debugging**: Resolve getrandom compatibility issues
2. **Add More Examples**: Vue.js, Svelte, Angular examples
3. **Benchmarking Suite**: Compare performance across browsers
4. **CDN Distribution**: Publish to npm and CDNs
5. **Documentation**: Interactive playground and tutorials

## Contributing

See main repository for contribution guidelines.

## License

MIT