wifi-densepose/crates/rvlite/docs/00_EXISTING_WASM_ANALYSIS.md
ruv d803bfe2b1 Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00

Existing WASM Implementations Analysis

Summary

RuVector already has extensive WASM implementations we can learn from and potentially reuse for RvLite!


1. Existing WASM Crates

1.1 ruvector-wasm (Main WASM Package)

Location: /workspaces/ruvector/crates/ruvector-wasm/

Features:

  • Full VectorDB API (insert, search, delete, batch)
  • SIMD acceleration (opt-in feature)
  • IndexedDB persistence
  • Web Workers support
  • Zero-copy transfers
  • Security limits (MAX_VECTOR_DIMENSIONS: 65536)

Key Dependencies:

ruvector-core = { path = "../ruvector-core", features = ["memory-only"] }
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
web-sys = { version = "0.3", features = ["IdbDatabase", "IdbObjectStore", ...] }
serde-wasm-bindgen = "0.6"
console_error_panic_hook = "0.1"
tracing-wasm = "0.2"

Release Profile (Size Optimization):

[profile.release]
opt-level = "z"      # Optimize for size
lto = true           # Link-time optimization
codegen-units = 1    # Single codegen unit
panic = "abort"      # No unwinding

Architecture Lessons:

// Security: Validate dimensions
const MAX_VECTOR_DIMENSIONS: usize = 65536;

// Error handling across WASM boundary
#[derive(Serialize, Deserialize)]
pub struct WasmError {
    pub message: String,
    pub kind: String,
}

// WASM-friendly API
#[wasm_bindgen]
impl JsVectorEntry {
    #[wasm_bindgen(constructor)]
    pub fn new(
        vector: Float32Array,
        id: Option<String>,
        metadata: Option<JsValue>,
    ) -> Result<JsVectorEntry, JsValue> {
        // ...
    }
}

1.2 sona (Self-Optimizing Neural Architecture)

Location: /workspaces/ruvector/crates/sona/

Features:

  • Runtime-adaptive learning
  • Two-tier LoRA
  • EWC++ (Elastic Weight Consolidation)
  • ReasoningBank integration
  • Dual target support: WASM + NAPI (Node.js native)

Feature Flags:

[features]
default = ["serde-support"]
wasm = ["wasm-bindgen", "wasm-bindgen-futures", "console_error_panic_hook", ...]
napi = ["dep:napi", "dep:napi-derive", "serde-support"]
serde-support = ["serde", "serde_json"]

Key Insight: Supports both WASM and native Node.js via feature flags!

1.3 micro-hnsw-wasm (Ultra-Lightweight HNSW)

Location: /workspaces/ruvector/crates/micro-hnsw-wasm/

Features:

  • Only 11.8KB WASM (incredibly small!)
  • Neuromorphic HNSW with spiking neural networks
  • LIF neurons
  • STDP learning
  • Winner-take-all
  • Dendritic computation
  • No dependencies ([dependencies] section is empty!)

Size Optimization (Maximum):

[profile.release]
opt-level = "z"
lto = true
codegen-units = 1
panic = "abort"
strip = true          # Strip debug symbols

Key Insight: Proof that aggressive optimization can achieve sub-12KB WASM!

1.4 Other WASM Crates

| Crate | Purpose | Status |
|---|---|---|
| ruvector-attention-wasm | Attention mechanisms | Built (pkg/ruvector_attention_wasm_bg.wasm) |
| ruvector-gnn-wasm | Graph Neural Networks | Exists |
| ruvector-graph-wasm | Graph operations | Exists |
| ruvector-tiny-dancer-wasm | Tiny Dancer routing | Exists |
| ruvector-router-wasm | Router | Exists |

2. Existing Examples

2.1 WASM Examples Directory

Location: /workspaces/ruvector/examples/wasm/

Structure:

examples/wasm/
└── ios/
    ├── dist/
    │   └── recommendation.wasm
    └── swift/
        └── Resources/
            └── recommendation.wasm

iOS Integration: Shows how to use WASM in Swift/iOS apps!

2.2 Other Examples

  • examples/scipix/wasm_demo.html - SciFi visualization demo
  • npm/tests/unit/wasm.test.js - WASM unit tests
  • docs/guides/wasm-api.md - WASM API documentation
  • docs/guides/wasm-build-guide.md - Build instructions

3. Key Learnings for RvLite

3.1 Architecture Patterns to Adopt

Use ruvector-core as Foundation

# RvLite can depend on existing ruvector-core
[dependencies]
ruvector-core = { path = "../ruvector-core", features = ["memory-only"] }

Benefit: Reuse battle-tested vector operations, SIMD, quantization.

Security-First API Design

// Validate inputs before allocation
const MAX_VECTOR_DIMENSIONS: usize = 65536;
const MAX_BATCH_SIZE: usize = 10000;

if vec_len > MAX_VECTOR_DIMENSIONS {
    return Err("Vector too large");
}
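
The fragment above can be expanded into a small standalone sketch. The two limits are the ones quoted in this section; the function names and the empty-vector check are assumptions added for illustration, not code taken from ruvector-wasm.

```rust
// Limits mirror the constants quoted above.
const MAX_VECTOR_DIMENSIONS: usize = 65536;
const MAX_BATCH_SIZE: usize = 10000;

/// Hypothetical helper: rejects empty or oversized vectors before any
/// buffer is allocated for them.
pub fn validate_dimensions(vec_len: usize) -> Result<(), String> {
    if vec_len == 0 {
        // Assumption: an empty vector is treated as invalid input.
        return Err("Vector must not be empty".to_string());
    }
    if vec_len > MAX_VECTOR_DIMENSIONS {
        return Err(format!(
            "Vector too large: {} > {}",
            vec_len, MAX_VECTOR_DIMENSIONS
        ));
    }
    Ok(())
}

/// Hypothetical helper: checks the batch length first, then each vector.
pub fn validate_batch(lengths: &[usize]) -> Result<(), String> {
    if lengths.len() > MAX_BATCH_SIZE {
        return Err(format!(
            "Batch too large: {} > {}",
            lengths.len(),
            MAX_BATCH_SIZE
        ));
    }
    lengths.iter().try_for_each(|&len| validate_dimensions(len))
}
```

Checking the batch length before walking the vectors keeps the cost of rejecting a hostile input proportional to the limit rather than to the input.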

Error Handling Pattern

#[derive(Serialize, Deserialize)]
pub struct WasmError {
    pub message: String,
    pub kind: String,
}

impl From<WasmError> for JsValue {
    // Convert to JS-friendly error object
}
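
One way to feed this flat error shape is to map an internal error enum onto it. The sketch below uses only the standard library so it compiles on its own; `CoreError`, its variants, and `check_dims` are hypothetical names, and the final hop into `JsValue` (for example through serde-wasm-bindgen) is left out.

```rust
use std::fmt;

/// Hypothetical internal error type; the variants are illustrative only.
#[derive(Debug)]
pub enum CoreError {
    DimensionMismatch { expected: usize, actual: usize },
    NotFound(String),
    Storage(String),
}

impl fmt::Display for CoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CoreError::DimensionMismatch { expected, actual } => {
                write!(f, "expected {} dimensions, got {}", expected, actual)
            }
            CoreError::NotFound(id) => write!(f, "entry not found: {}", id),
            CoreError::Storage(msg) => write!(f, "storage error: {}", msg),
        }
    }
}

/// Flat shape that crosses the WASM boundary (same two fields as above).
#[derive(Debug, Clone, PartialEq)]
pub struct WasmError {
    pub message: String,
    pub kind: String,
}

impl From<CoreError> for WasmError {
    fn from(err: CoreError) -> Self {
        // `kind` is a stable machine-readable tag; `message` is for humans.
        let kind = match &err {
            CoreError::DimensionMismatch { .. } => "DimensionMismatch",
            CoreError::NotFound(_) => "NotFound",
            CoreError::Storage(_) => "Storage",
        };
        WasmError {
            message: err.to_string(),
            kind: kind.to_string(),
        }
    }
}

/// Hypothetical caller showing the conversion at an API edge.
pub fn check_dims(expected: usize, actual: usize) -> Result<(), WasmError> {
    if expected != actual {
        return Err(CoreError::DimensionMismatch { expected, actual }.into());
    }
    Ok(())
}
```

Keeping `kind` separate from `message` lets JavaScript callers branch on the error category without parsing text.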

WASM-Friendly Types

// Use wasm-bindgen compatible types
#[wasm_bindgen]
pub struct Database {
    inner: Arc<Mutex<CoreDatabase>>,
}

#[wasm_bindgen]
impl Database {
    #[wasm_bindgen(constructor)]
    pub fn new(options: JsValue) -> Result<Database, JsValue> {
        let opts: DbOptions = from_value(options)?;
        // ...
    }

    pub async fn search(
        &self,
        query: Float32Array,
        limit: usize,
    ) -> Result<JsValue, JsValue> {
        // Return JsValue (serialized results)
    }
}
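
The shared-handle shape behind `inner: Arc<Mutex<CoreDatabase>>` can be tried without any WASM tooling. In this std-only sketch `CoreDatabase` is a hypothetical in-memory stand-in and the search is a brute-force scan, not the indexed search of ruvector-core; the wasm-bindgen attributes and `JsValue` conversions are omitted.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the core database: a plain in-memory list.
#[derive(Default)]
pub struct CoreDatabase {
    entries: Vec<(String, Vec<f32>)>,
}

/// Cheaply cloneable handle; every clone shares the same underlying store.
#[derive(Clone, Default)]
pub struct Database {
    inner: Arc<Mutex<CoreDatabase>>,
}

impl Database {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn insert(&self, id: &str, vector: Vec<f32>) -> Result<(), String> {
        let mut db = self.inner.lock().map_err(|e| e.to_string())?;
        db.entries.push((id.to_string(), vector));
        Ok(())
    }

    /// Brute-force nearest neighbours by squared Euclidean distance.
    /// Entries whose dimension differs from the query are skipped.
    pub fn search(&self, query: &[f32], limit: usize) -> Result<Vec<(String, f32)>, String> {
        let db = self.inner.lock().map_err(|e| e.to_string())?;
        let mut scored: Vec<(String, f32)> = db
            .entries
            .iter()
            .filter(|(_, v)| v.len() == query.len())
            .map(|(id, v)| {
                let dist = v
                    .iter()
                    .zip(query)
                    .map(|(a, b)| (a - b) * (a - b))
                    .sum::<f32>();
                (id.clone(), dist)
            })
            .collect();
        scored.sort_by(|a, b| a.1.total_cmp(&b.1));
        scored.truncate(limit);
        Ok(scored)
    }
}
```

Because methods take `&self` and lock internally, the exported type needs no `&mut self` methods, which tends to be easier to use from JavaScript callers that hold several references to one object.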

3.2 Build Configuration Best Practices

Size Optimization

[profile.release]
opt-level = "z"        # Optimize for size (not speed)
lto = true             # Link-time optimization
codegen-units = 1      # Single codegen unit (better optimization)
panic = "abort"        # No unwinding (saves space)
strip = true           # Strip debug symbols

[profile.release.package."*"]
opt-level = "z"        # Apply to all dependencies

[package.metadata.wasm-pack.profile.release]
wasm-opt = false       # Disable wasm-opt (manual optimization)

WASM-Specific Dependencies

# getrandom 0.2 needs the "js" feature on wasm32 (0.3 renamed it to "wasm_js")
[target.'cfg(target_arch = "wasm32")'.dependencies]
getrandom = { version = "0.2", features = ["js"] }

3.3 Feature Flags Strategy

[features]
default = []
simd = ["ruvector-core/simd"]              # WASM SIMD
sql = ["dep:sql-parser"]                    # SQL engine
sparql = ["dep:sparql-parser"]              # SPARQL engine
cypher = ["dep:cypher-parser"]              # Cypher engine
gnn = ["dep:ruvector-gnn-wasm"]             # GNN layers
learning = ["dep:sona"]                     # ReasoningBank
graph = ["dep:ruvector-graph-wasm"]         # Graph operations
hyperbolic = ["dep:hyperbolic-embeddings"]  # Hyperbolic spaces

# Feature bundles
full = ["sql", "sparql", "cypher", "gnn", "learning", "graph", "hyperbolic"]
lite = ["sql"]  # Minimal bundle

Benefit: Users can opt-in to features, reducing bundle size.
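
On the Rust side, these flags would typically surface as `cfg` checks. The sketch below assumes the feature names from the table above; `enabled_engines` is a hypothetical helper, and when compiled outside Cargo (no features set) it reports only the always-present vector core.

```rust
/// Hypothetical helper: lists the query engines compiled into this build.
/// Feature names follow the `[features]` table above.
#[allow(unexpected_cfgs)] // features are only declared when built through Cargo
pub fn enabled_engines() -> Vec<&'static str> {
    // Core vector operations are assumed to be present in every build.
    let mut engines = vec!["vector"];
    if cfg!(feature = "sql") {
        engines.push("sql");
    }
    if cfg!(feature = "sparql") {
        engines.push("sparql");
    }
    if cfg!(feature = "cypher") {
        engines.push("cypher");
    }
    engines
}
```

Code behind a disabled feature is not compiled at all, which is where the bundle-size saving comes from.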

3.4 Persistence Strategy

From ruvector-wasm:

// IndexedDB persistence
async fn save_to_indexeddb(&self) -> Result<(), JsValue> {
    let window = web_sys::window().unwrap();
    let idb: IdbFactory = window.indexed_db()?.unwrap();

    // Open database
    let open_request = idb.open_with_u32("rvlite", 1)?;

    // ... store data
}

RvLite Should Support:

  1. IndexedDB (browser) - 50MB+ quota
  2. OPFS (Origin Private File System) - Larger quota
  3. File System (Node.js) - Unlimited

3.5 Dual-Target Strategy (from sona)

Support both WASM and Node.js native:

[features]
wasm = ["wasm-bindgen", ...]
napi = ["dep:napi", "dep:napi-derive"]

Benefit:

  • Browser: Use WASM
  • Node.js: Use native addon (faster)
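
The dual-target idea amounts to keeping the logic in plain Rust and adding thin binding layers on top. The sketch below is an illustration only: `dot` and `binding_layer` are hypothetical, and it selects on `target_arch` rather than on the `wasm`/`napi` feature flags that sona uses, so that it compiles without Cargo.

```rust
/// Shared core logic: compiled unchanged for every target.
pub fn dot(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(x, y)| x * y).sum())
}

/// Reports which binding layer a build of this code would sit behind.
/// Assumption: wasm32 builds get wasm-bindgen wrappers, all other
/// targets get a native addon wrapper.
pub fn binding_layer() -> &'static str {
    if cfg!(target_arch = "wasm32") {
        "wasm-bindgen"
    } else {
        "native addon"
    }
}
```

With this split, the wrappers only convert arguments and results, so both targets exercise the same tested core.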

4. What We Can Reuse

4.1 Direct Dependencies

  • ruvector-core - Vector types, distances, SIMD, quantization
  • ruvector-gnn-wasm - GNN layers (if exists)
  • ruvector-graph-wasm - Graph operations (if exists)
  • sona - ReasoningBank, adaptive learning
  • micro-hnsw-wasm - Ultra-lightweight HNSW

4.2 Patterns & Code

  • Error handling - WasmError pattern from ruvector-wasm
  • IndexedDB persistence - From ruvector-wasm
  • Build configuration - Cargo.toml profiles
  • Security validation - Input limits, bounds checking
  • TypeScript types - From existing packages

4.3 Testing Infrastructure

  • wasm-bindgen-test - Browser test runner
  • Unit tests - From npm/tests/unit/wasm.test.js
  • Benchmarks - From node_modules/agentdb wasm benchmarks


5. RvLite Differentiation

What Makes RvLite Different?

| Feature | Existing WASM Crates | RvLite |
|---|---|---|
| Scope | Vector operations only | Complete database (SQL/SPARQL/Cypher) |
| Query Languages | Programmatic API | 3 query languages |
| Graph Support | Limited | Full graph DB (Cypher, SPARQL) |
| Self-Learning | sona (separate) | Built-in ReasoningBank |
| Standalone | Needs backend | Fully standalone |
| Storage Engine | Basic persistence | ACID transactions |

RvLite = All existing WASM crates + Standalone DB + Query engines


6.1 Layered Approach

┌─────────────────────────────────────────┐
│  RvLite (crates/rvlite/)               │
│  - SQL/SPARQL/Cypher engines           │
│  - Storage engine                       │
│  - Transaction manager                  │
│  - WASM bindings                        │
└───────────────┬─────────────────────────┘
                │ depends on
┌───────────────▼─────────────────────────┐
│  Existing WASM Crates                   │
│  - ruvector-core (vectors)              │
│  - sona (learning)                      │
│  - micro-hnsw-wasm (indexing)           │
│  - ruvector-gnn-wasm (GNN)              │
│  - ruvector-graph-wasm (graph)          │
└─────────────────────────────────────────┘

6.2 File Structure

crates/rvlite/
├── Cargo.toml          # Similar to ruvector-wasm
├── src/
│   ├── lib.rs          # WASM bindings
│   ├── storage/        # NEW: Storage engine
│   ├── query/          # NEW: Query engines
│   │   ├── sql.rs
│   │   ├── sparql.rs
│   │   └── cypher.rs
│   ├── transaction.rs  # NEW: ACID transactions
│   └── error.rs        # Copy from ruvector-wasm
├── tests/
│   └── wasm.rs         # Similar to ruvector-wasm/tests/wasm.rs
└── pkg/                # Built by wasm-pack
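
To make the `query/` layout concrete, here is a minimal sketch of a front door that routes to one module per language. Everything in it is hypothetical: the `QueryLanguage` enum, the `execute` signatures, and the stub engines that only echo their input stand in for real parsers and executors.

```rust
/// Hypothetical selector for the three planned query languages.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum QueryLanguage {
    Sql,
    Sparql,
    Cypher,
}

// Inline modules mirror query/sql.rs, query/sparql.rs and query/cypher.rs.
// Each stub echoes the trimmed query where a real engine would parse and run it.
mod sql {
    pub fn execute(query: &str) -> Result<String, String> {
        Ok(format!("sql: {}", query.trim()))
    }
}

mod sparql {
    pub fn execute(query: &str) -> Result<String, String> {
        Ok(format!("sparql: {}", query.trim()))
    }
}

mod cypher {
    pub fn execute(query: &str) -> Result<String, String> {
        Ok(format!("cypher: {}", query.trim()))
    }
}

/// Single entry point that the WASM bindings in lib.rs could call.
pub fn execute(lang: QueryLanguage, query: &str) -> Result<String, String> {
    if query.trim().is_empty() {
        return Err("empty query".to_string());
    }
    match lang {
        QueryLanguage::Sql => sql::execute(query),
        QueryLanguage::Sparql => sparql::execute(query),
        QueryLanguage::Cypher => cypher::execute(query),
    }
}
```

Passing the language explicitly avoids guessing it from keywords, which is ambiguous (for example, `CREATE` is valid in both SQL and Cypher).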

6.3 Dependency Strategy

[dependencies]
# Reuse existing crates
ruvector-core = { path = "../ruvector-core", features = ["memory-only"] }
ruvector-gnn-wasm = { path = "../ruvector-gnn-wasm", optional = true }
ruvector-graph-wasm = { path = "../ruvector-graph-wasm", optional = true }
sona = { path = "../sona", features = ["wasm"], optional = true }

# New dependencies
sql-parser = { version = "0.9", optional = true }
sparql-parser = { version = "0.3", optional = true }  # Or custom
cypher-parser = { version = "0.1", optional = true }  # Or custom

# Standard WASM stack (from ruvector-wasm)
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
web-sys = { version = "0.3", features = ["IdbDatabase", ...] }
serde-wasm-bindgen = "0.6"
console_error_panic_hook = "0.1"

7. Action Items

Immediate Next Steps

  1. Review existing implementations (DONE - this document)
  2. Create RvLite crate using ruvector-wasm as template
  3. Add dependencies on existing WASM crates
  4. Extract query engines from ruvector-postgres (remove pgrx)
  5. Build storage engine using patterns from ruvector-wasm
  6. Implement WASM bindings following ruvector-wasm patterns
  7. Test with existing WASM test infrastructure

Quick Win: Minimal Viable Product

Week 1: Create ruvector-wasm-lite

# Just vector operations + SQL
[dependencies]
ruvector-core = { ... }
sql-parser = { ... }

Week 2: Add SPARQL

// Reuse ruvector-postgres/src/graph/sparql (remove pgrx)

Week 3: Add Cypher + GNN

ruvector-gnn-wasm = { ... }

Week 4: Polish and optimize

  • Size optimization
  • Performance tuning
  • Documentation

8. Size Budget Analysis

Existing Sizes

  • micro-hnsw-wasm: 11.8KB (minimal HNSW)
  • ruvector_wasm_bg.wasm: ~500KB (full vector ops)
  • sona_bg.wasm: ~300KB (learning system)

RvLite Target

| Component | Estimated Size | Cumulative |
|---|---|---|
| ruvector-core | ~500KB | 500KB |
| SQL parser | ~200KB | 700KB |
| SPARQL parser | ~300KB | 1MB |
| Cypher parser | ~200KB | 1.2MB |
| sona (learning) | ~300KB | 1.5MB |
| micro-hnsw | ~12KB | 1.512MB |
| Storage engine | ~200KB | 1.7MB |
| Total (gzipped) | ~2-3MB | |

Verdict: Much smaller than original 5-6MB estimate! 🎉
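
The cumulative column can be checked with a few lines of arithmetic. The sketch below copies the per-component estimates from the table; the function names are hypothetical. The uncompressed estimates sum to 1712KB (about 1.7MB), which sits below the table's ~2-3MB total row.

```rust
/// (component, estimated size in KB), copied from the size budget table.
const COMPONENTS: [(&str, u32); 7] = [
    ("ruvector-core", 500),
    ("SQL parser", 200),
    ("SPARQL parser", 300),
    ("Cypher parser", 200),
    ("sona (learning)", 300),
    ("micro-hnsw", 12),
    ("Storage engine", 200),
];

/// Running total after each component, in KB.
pub fn cumulative_kb() -> Vec<(&'static str, u32)> {
    let mut total = 0;
    COMPONENTS
        .iter()
        .map(|&(name, kb)| {
            total += kb;
            (name, total)
        })
        .collect()
}

/// Sum of all component estimates, in KB.
pub fn total_kb() -> u32 {
    COMPONENTS.iter().map(|&(_, kb)| kb).sum()
}
```

These are the document's estimates, not measurements; the same helper could be pointed at real `.wasm` file sizes once a build exists.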


9. Conclusion

We have a HUGE head start!

  • Battle-tested WASM infrastructure
  • Security patterns established
  • Build optimization figured out
  • Multiple working examples
  • Reusable components (ruvector-core, sona, micro-hnsw)

RvLite can be built MUCH FASTER (4-5 weeks instead of 8) by:

  1. Reusing ruvector-wasm patterns
  2. Depending on existing WASM crates
  3. Extracting query engines from ruvector-postgres
  4. Following established build configs

Next: Continue SPARC documentation with this context!