ADR-QE-001: Quantum Engine Core Architecture
Status: Proposed
Date: 2026-02-06
Authors: ruv.io, RuVector Team
Deciders: Architecture Review Board
Context
Problem Statement
ruVector needs a quantum simulation engine for on-device quantum algorithm experimentation. The platform runs on distributed edge systems, primarily targeting Cognitum's 256-core low-power processors, and emphasizes ultra-low-power event-driven computing. Quantum simulation is a natural extension of ruVector's mathematical computation capabilities: the same SIMD-optimized linear algebra that powers vector search and neural inference can drive state-vector manipulation for quantum circuits.
Requirements
The engine must support gate-model quantum circuit simulation up to approximately 25 qubits, covering the following algorithm families:
| Algorithm Family | Use Case | Typical Qubits | Gate Depth |
|---|---|---|---|
| VQE (Variational Quantum Eigensolver) | Molecular simulation, optimization | 8-20 | 50-500 per iteration |
| Grover's Search | Unstructured database search | 8-25 | O(sqrt(2^n)) |
| QAOA (Quantum Approximate Optimization) | Combinatorial optimization | 10-25 | O(p * edges) |
| Quantum Error Correction | Surface code, stabilizer circuits | 9-25 (logical + ancilla) | Repetitive syndrome rounds |
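To make the O(sqrt(2^n)) gate depth listed for Grover's search concrete, the sketch below uses the textbook estimate of about (pi/4) * sqrt(2^n) iterations for a single marked item. The function name `grover_iterations` is hypothetical and not part of any ruQu API.

```rust
/// Approximate Grover iteration count for one marked item among 2^n entries,
/// using the textbook estimate floor(pi/4 * sqrt(2^n)).
fn grover_iterations(n_qubits: u32) -> u64 {
    let search_space = 2f64.powi(n_qubits as i32); // N = 2^n
    (std::f64::consts::FRAC_PI_4 * search_space.sqrt()).floor() as u64
}
```

Each iteration applies one oracle and one diffusion step, so total gate depth also depends on how those two operators decompose into gates.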
Memory Scaling Analysis
Quantum state-vector simulation stores the full amplitude vector of 2^n complex numbers. Each amplitude is a pair of f64 values (real + imaginary = 16 bytes). Memory grows exponentially:
| Qubits | Amplitudes | State Size | With Scratch Buffer |
|---|---|---|---|
| 10 | 1,024 | 16 KiB | 32 KiB |
| 15 | 32,768 | 512 KiB | 1 MiB |
| 20 | 1,048,576 | 16 MiB | 32 MiB |
| 22 | 4,194,304 | 64 MiB | 128 MiB |
| 24 | 16,777,216 | 256 MiB | 512 MiB |
| 25 | 33,554,432 | 512 MiB | 1 GiB (1.07 GB) |
| 26 | 67,108,864 | 1 GiB (1.07 GB) | 2 GiB (2.14 GB) |
| 28 | 268,435,456 | 4 GiB (4.29 GB) | 8 GiB (8.59 GB) |
| 30 | 1,073,741,824 | 16 GiB (17.18 GB) | 32 GiB (34.36 GB) |
At 25 qubits the state vector requires 512 MiB, or 1 GiB (about 1.07 GB) with a scratch buffer for intermediate calculations. The design treats this as the practical ceiling for WebAssembly, whose 32-bit address space caps linear memory at 4 GiB. Native execution with sufficient RAM can push to 30+ qubits.
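The scaling arithmetic above can be reproduced with a few lines of Rust. This is a minimal sketch: `state_vector_bytes` and `BYTES_PER_AMPLITUDE_F64` are illustrative names, and it assumes exactly one scratch buffer the same size as the state.

```rust
/// Bytes per amplitude: one complex number made of two f64 values.
const BYTES_PER_AMPLITUDE_F64: u64 = 16;

/// State-vector size in bytes for `n_qubits`, or `None` if it overflows u64.
/// `with_scratch` doubles the figure to cover one equally sized scratch buffer.
fn state_vector_bytes(n_qubits: u32, with_scratch: bool) -> Option<u64> {
    let amplitudes = 1u64.checked_shl(n_qubits)?; // 2^n amplitudes
    let bytes = amplitudes.checked_mul(BYTES_PER_AMPLITUDE_F64)?;
    if with_scratch {
        bytes.checked_mul(2)
    } else {
        Some(bytes)
    }
}
```

For 25 qubits this yields 536,870,912 bytes (512 MiB) without scratch and 1,073,741,824 bytes (1 GiB) with it.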
Edge Computing Constraints
Cognitum's 256-core processors operate under strict power and memory budgets:
- Power envelope: Event-driven activation; cores idle at near-zero draw
- Memory: Shared pool, typically 2-8 GB per node
- Interconnect: Low-latency mesh between cores, suitable for parallel simulation
- Workload model: Burst computation triggered by agent events, not continuous
The quantum engine must respect this model: allocate state only when a simulation is triggered, execute the circuit, return results, and immediately release all memory.
Decision
Implement a pure Rust state-vector quantum simulator as a new crate family
(ruQu quantum engine) within the ruVector workspace. The following architectural
decisions define the engine.
1. Pure Rust Implementation (No C/C++ FFI)
The entire simulation engine is written in Rust with no foreign function interface dependencies. This ensures:
- Compilation to `wasm32-unknown-unknown` without Emscripten or C toolchains
- Memory safety guarantees throughout the simulation pipeline
- Unified build system via Cargo across all targets
- No external library version conflicts or platform-specific linking issues
2. State-Vector Simulation as Primary Backend
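The engine uses explicit full-amplitude state-vector representation as its primary simulation mode. Each gate application transforms the full 2^n amplitude vector. This is equivalent to a matrix-vector multiplication, but the gate's small unitary is applied to groups of amplitudes rather than as an explicit 2^n x 2^n matrix.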
Circuit Execution Model:
|psi_0> ──[H]─────────[CNOT]────[Rz(theta)]───[Measure]── classical bits
           |            |            |            |
           v            v            v            v
[init]  [apply_H]  [apply_CNOT]  [apply_Rz]   [sample]
  |        |            |            |            |
2^n f64 2^n f64      2^n f64      2^n f64     collapse
complex complex      complex      complex     to basis
Gate application follows the standard decomposition:
- Single-qubit gates: Iterate amplitude pairs (i, i XOR 2^target), apply 2x2 unitary. O(2^n) operations per gate.
- Two-qubit gates: Iterate amplitude quadruples, apply 4x4 unitary. O(2^n) operations per gate.
- Multi-qubit gates: Decompose into single and two-qubit gates, or apply directly via 2^k x 2^k matrix on k target qubits.
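The pair iteration described above can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual code: the `C64` type, the function names, and the convention that qubit 0 is the least significant bit of the basis index are all hypothetical. CNOT is shown as a special-cased two-qubit gate that swaps amplitudes instead of multiplying by a full 4x4 matrix.

```rust
use std::ops::{Add, Mul};

/// Minimal complex number for this sketch (a real crate might use `num-complex`).
#[derive(Clone, Copy, Debug, PartialEq)]
struct C64 {
    re: f64,
    im: f64,
}

impl Add for C64 {
    type Output = C64;
    fn add(self, o: C64) -> C64 {
        C64 { re: self.re + o.re, im: self.im + o.im }
    }
}

impl Mul for C64 {
    type Output = C64;
    fn mul(self, o: C64) -> C64 {
        C64 {
            re: self.re * o.re - self.im * o.im,
            im: self.re * o.im + self.im * o.re,
        }
    }
}

/// Applies the 2x2 unitary `u` (row-major) to qubit `target`, in place.
/// Assumption: qubit 0 is the least significant bit of the basis index.
fn apply_single_qubit(state: &mut [C64], target: usize, u: [[C64; 2]; 2]) {
    let mask = 1usize << target;
    for i in 0..state.len() {
        // Handle each pair once, starting from the index whose target bit is 0.
        if (i & mask) == 0 {
            let j = i ^ mask; // partner index: i XOR 2^target
            let (a0, a1) = (state[i], state[j]);
            state[i] = u[0][0] * a0 + u[0][1] * a1;
            state[j] = u[1][0] * a0 + u[1][1] * a1;
        }
    }
}

/// CNOT: where the control bit is 1, swap the two amplitudes that differ
/// only in the target bit.
fn apply_cnot(state: &mut [C64], control: usize, target: usize) {
    let (c, t) = (1usize << control, 1usize << target);
    for i in 0..state.len() {
        if (i & c) != 0 && (i & t) == 0 {
            state.swap(i, i | t);
        }
    }
}

/// Hadamard gate as a row-major 2x2 matrix.
fn hadamard() -> [[C64; 2]; 2] {
    let h = C64 { re: std::f64::consts::FRAC_1_SQRT_2, im: 0.0 };
    let minus_h = C64 { re: -h.re, im: 0.0 };
    [[h, h], [h, minus_h]]
}
```

Starting from |00>, applying `hadamard()` to qubit 0 and then `apply_cnot(&mut state, 0, 1)` leaves equal amplitudes on |00> and |11>, the usual Bell state. Both loops touch every index once, which is the O(2^n) cost per gate noted above.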
3. Qubit Limits and Precision
| Parameter | WASM Target | Native Target |
|---|---|---|
| Max qubits (default) | 25 | 30+ (RAM-dependent) |
| Max qubits (hard limit) | 26 (with f32) | Memory-limited |
| Precision (default) | Complex f64 | Complex f64 |
| Precision (optional) | Complex f32 | Complex f32 |
| State size at max | ~512 MB at 25 qubits (~1.07 GB with scratch buffer) | ~17 GB at 30 qubits (~34 GB with scratch buffer) |
Complex f64 is the default precision, providing approximately 15 decimal digits of accuracy, which is sufficient for quantum chemistry applications and deep circuits where accumulated floating-point error matters. An optional f32 mode halves memory usage at the cost of precision (roughly 7 decimal digits), which suits shallow circuits and approximate optimization.
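The effect of halving amplitude size on the qubit ceiling can be checked with a small helper. This is a sketch only: `max_qubits` is a hypothetical name, and the rule of "state plus one equally sized scratch buffer" is an assumption carried over from the memory scaling analysis. The ADR does not state which memory budget its limits were derived from.

```rust
use std::mem::size_of;

/// Largest qubit count whose state vector plus one equally sized scratch
/// buffer fits in `budget_bytes`. `T` is the scalar type (f32 or f64);
/// each amplitude stores two of them. Returns `None` if nothing fits.
fn max_qubits<T>(budget_bytes: u64) -> Option<u32> {
    let amp_bytes = 2 * size_of::<T>() as u64; // real + imaginary part
    let max_amplitudes = budget_bytes / (2 * amp_bytes); // two buffers
    if max_amplitudes == 0 {
        None
    } else {
        // floor(log2(max_amplitudes))
        Some(63 - max_amplitudes.leading_zeros())
    }
}
```

Under an assumed 1 GiB budget, `max_qubits::<f64>` gives 25 and `max_qubits::<f32>` gives 26: one extra qubit for half the bytes per amplitude.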
4. Event-Driven Activation Model
The engine follows ruVector's event-driven philosophy:
Agent Context          ruQu Engine             Memory
     |                      |                     |
     |-- trigger(circuit) ->|                     |
     |                      |-- allocate(2^n) --->|
     |                      |<---- state_ptr -----|
     |                      |                     |
     |                      |-- [execute gates] ->|
     |                      |-- [measure] ------->|
     |                      |                     |
     |<-- results ----------|                     |
     |                      |-- deallocate() ---->|
     |                      |                     |
  (idle)                 (inert)               (freed)
- Inert by default: No background threads, no persistent allocations
- Allocate on demand: State vector created when circuit execution begins
- Free immediately: All simulation memory released upon result delivery
- No global state: Multiple concurrent simulations supported via independent state handles (no shared mutable global)
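In Rust, this lifecycle maps naturally onto ownership: a state vector owned by a single function call is freed when that call returns. The sketch below illustrates the idea with a hypothetical `Gate` enum and `run_circuit` function. It supports only a Pauli-X gate and returns a compact result (the probability of reading 1 on each qubit) rather than the full state.

```rust
/// Hypothetical gate set for this sketch: only Pauli-X (bit flip) on a qubit.
enum Gate {
    X(usize),
}

/// Runs a circuit and returns, per qubit, the probability of measuring 1.
/// The state vector exists only inside this call; no global state is touched.
fn run_circuit(n_qubits: usize, gates: &[Gate]) -> Vec<f64> {
    // Allocate on demand: 2^n amplitudes as (re, im), starting in |0...0>.
    let mut state = vec![(0.0f64, 0.0f64); 1usize << n_qubits];
    state[0] = (1.0, 0.0);

    for gate in gates {
        match gate {
            Gate::X(q) => {
                let mask = 1usize << *q;
                for i in 0..state.len() {
                    if (i & mask) == 0 {
                        state.swap(i, i | mask);
                    }
                }
            }
        }
    }

    // Reduce to an n-element result before the 2^n-element state goes away.
    let mut p_one = vec![0.0f64; n_qubits];
    for (i, (re, im)) in state.iter().enumerate() {
        let p = re * re + im * im;
        for (q, slot) in p_one.iter_mut().enumerate() {
            if (i & (1usize << q)) != 0 {
                *slot += p;
            }
        }
    }
    p_one
    // `state` is dropped here, releasing all simulation memory.
}
```

Because each call owns its own `state`, several simulations can run concurrently on separate threads without sharing mutable data.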
5. Dual-Target Compilation
The crate supports two compilation targets from a single codebase:
                 ruqu-core
                     |
          +----------+----------+
          |                     |
   [native target]  [wasm32-unknown-unknown]
          |                     |
- Full SIMD (AVX2,      - WASM SIMD128
  AVX-512, NEON)        - 4 GB address limit
- Rayon threading       - Optional SharedArrayBuffer
- Optional GPU (wgpu)   - No GPU
- 30+ qubits            - 25 qubit ceiling
- Full OS integration   - Sandboxed
Conditional compilation via Cargo feature flags controls target-specific code paths. The public API surface is identical across targets.
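A minimal sketch of how one public function can sit on top of target-specific limits is shown below. It switches on `target_arch` rather than on a Cargo feature so that it compiles standalone. The names `MAX_QUBITS` and `check_qubit_count` are hypothetical, and the native value of 30 is a placeholder for the RAM-dependent limit.

```rust
/// Default qubit ceiling, selected at compile time per target.
#[cfg(target_arch = "wasm32")]
pub const MAX_QUBITS: u32 = 25;

/// Placeholder for native builds, where the real limit depends on RAM.
#[cfg(not(target_arch = "wasm32"))]
pub const MAX_QUBITS: u32 = 30;

/// Same signature on every target; only the limit behind it differs.
pub fn check_qubit_count(n_qubits: u32) -> Result<u32, String> {
    if n_qubits <= MAX_QUBITS {
        Ok(n_qubits)
    } else {
        Err(format!(
            "{} qubits requested, but this target supports at most {}",
            n_qubits, MAX_QUBITS
        ))
    }
}
```

Callers see one API, and a request above the ceiling fails early with an explicit message instead of exhausting memory mid-simulation.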
6. Optional Tensor Network Mode
For circuits with limited entanglement (e.g., shallow QAOA, certain VQE ansatze), the engine offers an optional tensor network backend:
- Represents the quantum state as a network of tensors rather than a single exponential vector
- Memory scales as O(n * chi^2) where chi is the bond dimension (maximum entanglement width)
- Efficient for circuits where entanglement grows slowly or remains bounded
- Falls back to full state-vector when bond dimension exceeds threshold
- Enabled via the `tensor-network` feature flag
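The memory trade-off and the fallback rule can be illustrated with a rough estimate. The sketch assumes a matrix-product-state layout (one tensor of 2 * chi * chi complex f64 entries per qubit), which is only one possible tensor network form. All names and the threshold value are hypothetical.

```rust
/// Rough tensor-network footprint in bytes: n tensors, each holding
/// 2 * chi * chi complex f64 entries (16 bytes each). Assumes an MPS layout.
fn tensor_network_bytes(n_qubits: u64, chi: u64) -> u64 {
    n_qubits * 2 * chi * chi * 16
}

/// Full state-vector footprint in bytes for comparison (no scratch buffer).
fn full_state_bytes(n_qubits: u32) -> u64 {
    (1u64 << n_qubits) * 16
}

#[derive(Debug, PartialEq)]
enum Backend {
    TensorNetwork,
    StateVector,
}

/// Fall back to the state vector once the bond dimension exceeds the threshold.
fn choose_backend(chi: u64, chi_threshold: u64) -> Backend {
    if chi > chi_threshold {
        Backend::StateVector
    } else {
        Backend::TensorNetwork
    }
}
```

At 25 qubits with chi = 16, the estimate is 204,800 bytes against 536,870,912 bytes for the full state vector, which is why bounded-entanglement circuits benefit from this mode.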
Alternatives Considered
Alternative 1: Qukit (Rust, WASM-ready)
A pre-1.0 Rust quantum simulator with WASM support.
| Criterion | Assessment |
|---|---|
| Maturity | Pre-1.0, limited community |
| WASM support | Present but untested at scale |
| Optimization | Basic; no SIMD, no gate fusion |
| Integration | Would require adapter layer |
| Maintenance | External dependency risk |
Rejected: Insufficient optimization depth and maturity for production use.
Alternative 2: QuantRS2 (Rust, Python-focused)
A Rust quantum simulator primarily targeting Python bindings via PyO3.
| Criterion | Assessment |
|---|---|
| Performance | Good benchmarks on native |
| WASM support | Not a design target |
| Dependencies | Heavy; Python-oriented build |
| API design | Python-first, Rust API secondary |
| Integration | Significant impedance mismatch |
Rejected: Python-centric design creates unnecessary weight and integration friction for a Rust-native edge system.
Alternative 3: roqoqo + QuEST (Rust frontend, C backend)
roqoqo provides a Rust circuit description layer; QuEST is a high-performance C/C++ state-vector simulator.
| Criterion | Assessment |
|---|---|
| Performance | Excellent (QuEST is highly optimized) |
| WASM support | QuEST's C code breaks WASM compilation |
| Maintenance | External C library maintenance burden |
| Memory safety | C backend outside Rust safety guarantees |
Rejected: C dependency is incompatible with WASM target requirement.
Alternative 4: Quant-Iron (Rust + OpenCL)
A Rust simulator leveraging OpenCL for GPU acceleration.
| Criterion | Assessment |
|---|---|
| Performance | Excellent on GPU-equipped hardware |
| WASM support | OpenCL incompatible with WASM |
| Edge deployment | Most edge nodes lack discrete GPUs |
| Complexity | OpenCL runtime adds operational burden |
Rejected: OpenCL dependency incompatible with WASM and edge deployment model.
Alternative 5: No Simulator (Cloud Quantum APIs)
Delegate all quantum computation to cloud-based quantum simulators or hardware.
| Criterion | Assessment |
|---|---|
| Performance | Network-bound latency |
| Offline support | None; requires connectivity |
| Cost | Per-execution charges |
| Privacy | Circuit data sent to third party |
| Edge philosophy | Violates offline-first design |
Rejected: Fundamentally incompatible with ruVector's offline-first edge computing philosophy.
Consequences
Positive
- Full control: Complete ownership of the simulation pipeline, enabling deep integration with ruVector's math, SIMD, and memory subsystems
- WASM portable: A single codebase compiles to WebAssembly and runs in standard WASM runtimes, enabling browser-based quantum experimentation
- No non-Rust dependencies: Eliminates supply chain risk from C/C++ or Python library dependencies
- Edge-aligned: Event-driven activation model matches Cognitum's power architecture
- Extensible: Gate set, noise models, and backends can evolve independently
Negative
- Development effort: Building a competitive quantum simulator from scratch requires significant engineering investment
- Maintenance burden: Team must benchmark, optimize, and maintain the simulation engine alongside the rest of ruVector
- Classical simulation limits: Exponential memory growth is inherent to full state-vector simulation; the engine cannot go much beyond ~30 qubits on practical hardware
Risks and Mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Performance below competitors | Medium | High | Benchmark-driven development against QuantRS2/Qukit |
| Floating-point accuracy drift | Low | Medium | Comprehensive numerical tests, optional f64 enforcement |
| WASM memory exhaustion | Medium | Medium | Hard qubit limit with clear error messages (ADR-QE-003) |
| Scope creep into hardware simulation | Low | Low | Strict scope: gate-model only, no analog/pulse simulation |
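One simple numerical test for floating-point drift is a norm check: unitary gates preserve the total probability, so the sum of squared amplitude magnitudes should stay at 1. The helper names and the tolerance below are hypothetical, a sketch of the kind of check such a test suite might include.

```rust
/// Sum of |amplitude|^2 over the state; stays at 1 under exact unitary gates.
fn norm_squared(state: &[(f64, f64)]) -> f64 {
    state.iter().map(|(re, im)| re * re + im * im).sum()
}

/// True when the norm has drifted from 1 by no more than `tolerance`.
fn is_normalized(state: &[(f64, f64)], tolerance: f64) -> bool {
    (norm_squared(state) - 1.0).abs() <= tolerance
}
```

Running such a check after long gate sequences in both f32 and f64 modes would show how quickly error accumulates at each precision.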
References
- ADR-005: WASM Runtime Integration
- ADR-003: SIMD Optimization Strategy
- ADR-006: Memory Management
- ADR-014: Coherence Engine
- ADR-QE-002: Crate Structure & Integration
- ADR-QE-003: WASM Compilation Strategy
- ADR-QE-004: Performance Optimization & Benchmarks
- Nielsen & Chuang, "Quantum Computation and Quantum Information" (2010)
- Aaronson & Gottesman, "Improved simulation of stabilizer circuits" (2004)