Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'
This commit is contained in:
329
vendor/ruvector/docs/guides/wasm-build-guide.md
vendored
Normal file
329
vendor/ruvector/docs/guides/wasm-build-guide.md
vendored
Normal file
@@ -0,0 +1,329 @@
|
||||
# Ruvector WASM Build Guide
|
||||
|
||||
## Overview
|
||||
|
||||
This guide provides instructions for building the Ruvector WASM bindings. The WASM module enables high-performance vector database operations directly in web browsers and Node.js environments.
|
||||
|
||||
## Implementation Status
|
||||
|
||||
✅ **Completed Components:**
|
||||
|
||||
1. **Core WASM Bindings** (`/crates/ruvector-wasm/src/lib.rs`)
|
||||
- Full VectorDB API (insert, search, delete, batch operations)
|
||||
- Proper error handling with WasmResult types
|
||||
- Console panic hook for debugging
|
||||
- JavaScript-compatible types (JsVectorEntry, JsSearchResult)
|
||||
|
||||
2. **SIMD Support**
|
||||
- Dual build configuration (with/without SIMD)
|
||||
- Feature flags in Cargo.toml
|
||||
- Runtime SIMD detection via `detectSIMD()` function
|
||||
|
||||
3. **Web Workers Integration** (`/crates/ruvector-wasm/src/worker.js`)
|
||||
- Message passing for async operations
|
||||
- Support for insert, search, delete, batch operations
|
||||
- Zero-copy transfers preparation
|
||||
|
||||
4. **Worker Pool Management** (`/crates/ruvector-wasm/src/worker-pool.js`)
|
||||
- Automatic pool sizing (4-8 workers based on CPU cores)
|
||||
- Round-robin task distribution
|
||||
- Promise-based API
|
||||
- Error handling and timeouts
|
||||
|
||||
5. **IndexedDB Persistence** (`/crates/ruvector-wasm/src/indexeddb.js`)
|
||||
- Save/load vectors to IndexedDB
|
||||
- Batch operations for performance
|
||||
- Progressive loading with callbacks
|
||||
- LRU cache implementation (1000 hot vectors)
|
||||
|
||||
6. **Examples**
|
||||
- Vanilla JavaScript example (`/examples/wasm-vanilla/index.html`)
|
||||
- React + Web Workers example (`/examples/wasm-react/`)
|
||||
|
||||
7. **Tests**
|
||||
- Comprehensive WASM tests (`/crates/ruvector-wasm/tests/wasm.rs`)
|
||||
- Browser-based testing with wasm-bindgen-test
|
||||
|
||||
8. **Build Configuration**
|
||||
- Optimized for size (target: <500KB gzipped)
|
||||
- Multiple build targets (web, nodejs, bundler)
|
||||
- Size verification scripts
|
||||
|
||||
## Prerequisites
|
||||
|
||||
```bash
|
||||
# Install Rust with wasm32 target
|
||||
rustup target add wasm32-unknown-unknown
|
||||
|
||||
# Install wasm-pack
|
||||
cargo install wasm-pack
|
||||
|
||||
# Optional: Install wasm-opt for further optimization
|
||||
npm install -g wasm-opt
|
||||
```
|
||||
|
||||
## Building
|
||||
|
||||
### Standard Web Build
|
||||
|
||||
```bash
|
||||
cd crates/ruvector-wasm
|
||||
wasm-pack build --target web --out-dir pkg --release
|
||||
```
|
||||
|
||||
### SIMD-Enabled Build
|
||||
|
||||
```bash
|
||||
cd crates/ruvector-wasm
|
||||
wasm-pack build --target web --out-dir pkg-simd --release -- --features simd
|
||||
```
|
||||
|
||||
### All Targets
|
||||
|
||||
```bash
|
||||
cd crates/ruvector-wasm
|
||||
npm run build:all
|
||||
```
|
||||
|
||||
This will build for:
|
||||
- Web (`pkg/`)
|
||||
- Web with SIMD (`pkg-simd/`)
|
||||
- Node.js (`pkg-node/`)
|
||||
- Bundler (`pkg-bundler/`)
|
||||
|
||||
## Known Build Issues & Solutions
|
||||
|
||||
### Issue: getrandom 0.3 Compatibility
|
||||
|
||||
**Problem:** Some dependencies (notably `rand` via `uuid`) pull in `getrandom` 0.3.4, which requires the `wasm_js` feature flag that must be set via `RUSTFLAGS` configuration flags, not just Cargo features.
|
||||
|
||||
**Solution Options:**
|
||||
|
||||
1. **Use .cargo/config.toml** (Already configured):
|
||||
```toml
|
||||
[target.wasm32-unknown-unknown]
|
||||
rustflags = ['--cfg', 'getrandom_backend="wasm_js"']
|
||||
```
|
||||
|
||||
2. **Disable uuid feature** (Implemented):
|
||||
```toml
|
||||
# In ruvector-core/Cargo.toml
|
||||
[features]
|
||||
default = ["simd", "uuid-support"]
|
||||
uuid-support = ["uuid"]
|
||||
|
||||
# In ruvector-wasm/Cargo.toml
|
||||
[dependencies]
|
||||
ruvector-core = { path = "../ruvector-core", default-features = false }
|
||||
```
|
||||
|
||||
3. **Alternative: Use timestamp-based IDs** (Fallback):
|
||||
For WASM builds, use `Date.now()` + random suffixes instead of UUIDs
|
||||
|
||||
### Issue: Large Binary Size
|
||||
|
||||
**Solution:**
|
||||
|
||||
1. Enable LTO and size optimization (already configured):
|
||||
```toml
|
||||
[profile.release]
|
||||
opt-level = "z"
|
||||
lto = true
|
||||
codegen-units = 1
|
||||
panic = "abort"
|
||||
```
|
||||
|
||||
2. Run wasm-opt:
|
||||
```bash
|
||||
npm run optimize
|
||||
```
|
||||
|
||||
3. Verify size:
|
||||
```bash
|
||||
npm run size
|
||||
```
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Vanilla JavaScript
|
||||
|
||||
```html
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>Ruvector WASM</title>
|
||||
</head>
|
||||
<body>
|
||||
<script type="module">
|
||||
import init, { VectorDB } from './pkg/ruvector_wasm.js';
|
||||
|
||||
await init();
|
||||
|
||||
const db = new VectorDB(384, 'cosine', true);
|
||||
|
||||
// Insert vector
|
||||
const vector = new Float32Array(384).map(() => Math.random());
|
||||
const id = db.insert(vector, 'vec_1', { label: 'test' });
|
||||
|
||||
// Search
|
||||
const query = new Float32Array(384).map(() => Math.random());
|
||||
const results = db.search(query, 10);
|
||||
|
||||
console.log('Results:', results);
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
```
|
||||
|
||||
### With Web Workers
|
||||
|
||||
```javascript
|
||||
import { WorkerPool } from '@ruvector/wasm/worker-pool';
|
||||
|
||||
const pool = new WorkerPool(
|
||||
'/worker.js',
|
||||
'/pkg/ruvector_wasm.js',
|
||||
{
|
||||
poolSize: 4,
|
||||
dimensions: 384,
|
||||
metric: 'cosine'
|
||||
}
|
||||
);
|
||||
|
||||
await pool.init();
|
||||
|
||||
// Parallel insert
|
||||
const entries = Array(1000).fill(0).map((_, i) => ({
|
||||
vector: Array(384).fill(0).map(() => Math.random()),
|
||||
id: `vec_${i}`,
|
||||
metadata: { index: i }
|
||||
}));
|
||||
|
||||
const ids = await pool.insertBatch(entries);
|
||||
|
||||
// Parallel search
|
||||
const results = await pool.search(query, 10);
|
||||
|
||||
// Cleanup
|
||||
pool.terminate();
|
||||
```
|
||||
|
||||
### With IndexedDB
|
||||
|
||||
```javascript
|
||||
import { IndexedDBPersistence } from '@ruvector/wasm/indexeddb';
|
||||
|
||||
const persistence = new IndexedDBPersistence('my_database');
|
||||
await persistence.open();
|
||||
|
||||
// Save vectors
|
||||
await persistence.saveBatch(vectors);
|
||||
|
||||
// Load with progress
|
||||
await persistence.loadAll((progress) => {
|
||||
console.log(`Loaded ${progress.loaded} vectors`);
|
||||
|
||||
if (progress.vectors.length > 0) {
|
||||
db.insertBatch(progress.vectors);
|
||||
}
|
||||
});
|
||||
|
||||
// Get stats
|
||||
const stats = await persistence.getStats();
|
||||
console.log(`Cache hit rate: ${(stats.cacheHitRate * 100).toFixed(2)}%`);
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
### Browser Tests
|
||||
|
||||
```bash
|
||||
cd crates/ruvector-wasm
|
||||
wasm-pack test --headless --chrome
|
||||
wasm-pack test --headless --firefox
|
||||
```
|
||||
|
||||
### Node.js Tests
|
||||
|
||||
```bash
|
||||
wasm-pack test --node
|
||||
```
|
||||
|
||||
## Performance Optimization Tips
|
||||
|
||||
1. **Enable SIMD**: Use the SIMD build for 2-4x speedup on supported browsers
|
||||
2. **Use Batch Operations**: `insertBatch` is 5-10x faster than multiple `insert` calls
|
||||
3. **Use Web Workers**: Distribute operations across workers for parallel processing
|
||||
4. **Enable LRU Cache**: Keep hot vectors in IndexedDB cache
|
||||
5. **Optimize Vector Size**: Smaller dimensions = faster operations
|
||||
6. **Choose Appropriate Metric**: Dot product is fastest, Euclidean is slowest
|
||||
|
||||
## Browser Compatibility
|
||||
|
||||
| Browser | Version | SIMD Support | Web Workers | IndexedDB |
|
||||
|---------|---------|--------------|-------------|-----------|
|
||||
| Chrome | 91+ | ✅ | ✅ | ✅ |
|
||||
| Firefox | 89+ | ✅ | ✅ | ✅ |
|
||||
| Safari | 16.4+ | Partial | ✅ | ✅ |
|
||||
| Edge | 91+ | ✅ | ✅ | ✅ |
|
||||
|
||||
## Size Benchmarks
|
||||
|
||||
Expected sizes after optimization:
|
||||
|
||||
- **Base build**: ~450KB gzipped
|
||||
- **SIMD build**: ~480KB gzipped
|
||||
- **With wasm-opt -Oz**: ~380KB gzipped
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### CORS Errors with Workers
|
||||
|
||||
Ensure your server sends proper CORS headers:
|
||||
|
||||
```javascript
|
||||
{
|
||||
'Cross-Origin-Opener-Policy': 'same-origin',
|
||||
'Cross-Origin-Embedder-Policy': 'require-corp'
|
||||
}
|
||||
```
|
||||
|
||||
### Memory Issues
|
||||
|
||||
Increase WASM memory limit if needed:
|
||||
|
||||
```javascript
|
||||
// In worker.js or main thread
|
||||
WebAssembly.instantiate(module, {
|
||||
env: {
|
||||
memory: new WebAssembly.Memory({ initial: 256, maximum: 512 })
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
### IndexedDB Quota Errors
|
||||
|
||||
Check available storage:
|
||||
|
||||
```javascript
|
||||
if ('storage' in navigator && 'estimate' in navigator.storage) {
|
||||
const estimate = await navigator.storage.estimate();
|
||||
console.log(`Using ${estimate.usage} of ${estimate.quota} bytes`);
|
||||
}
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Complete Build Debugging**: Resolve getrandom compatibility issues
|
||||
2. **Add More Examples**: Vue.js, Svelte, Angular examples
|
||||
3. **Benchmarking Suite**: Compare performance across browsers
|
||||
4. **CDN Distribution**: Publish to npm and CDNs
|
||||
5. **Documentation**: Interactive playground and tutorials
|
||||
|
||||
## Contributing
|
||||
|
||||
See main repository for contribution guidelines.
|
||||
|
||||
## License
|
||||
|
||||
MIT
|
||||
Reference in New Issue
Block a user