ADR-QE-001: Quantum Engine Core Architecture

Status: Proposed
Date: 2026-02-06
Authors: ruv.io, RuVector Team
Deciders: Architecture Review Board

Context

Problem Statement

ruVector needs a quantum simulation engine for on-device quantum algorithm experimentation. The platform runs on distributed edge systems, primarily targeting Cognitum's 256-core low-power processors, and emphasizes ultra-low-power event-driven computing. Quantum simulation is a natural extension of ruVector's mathematical computation capabilities: the same SIMD-optimized linear algebra that powers vector search and neural inference can drive state-vector manipulation for quantum circuits.

Requirements

The engine must support gate-model quantum circuit simulation up to approximately 25 qubits, covering the following algorithm families:

Algorithm Family	Use Case	Typical Qubits	Gate Depth
VQE (Variational Quantum Eigensolver)	Molecular simulation, optimization	8-20	50-500 per iteration
Grover's Search	Unstructured database search	8-25	O(sqrt(2^n))
QAOA (Quantum Approximate Optimization)	Combinatorial optimization	10-25	O(p * edges)
Quantum Error Correction	Surface code, stabilizer circuits	9-25 (logical + ancilla)	Repetitive syndrome rounds
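
To make the Grover row concrete: for a single marked item, the textbook iteration count is about floor(pi/4 * sqrt(2^n)). The sketch below evaluates that formula; the function name is illustrative and not part of any ruQu API, and the per-iteration gate cost (oracle plus diffusion) is not modeled.

```rust
/// Approximate number of Grover iterations for a search space of 2^n
/// entries with exactly one marked item: floor(pi/4 * sqrt(2^n)).
/// Hypothetical helper for sizing estimates only.
fn grover_iterations(num_qubits: u32) -> u64 {
    // N = 2^n basis states in the search space.
    let space = (1u64 << num_qubits) as f64;
    (std::f64::consts::FRAC_PI_4 * space.sqrt()).floor() as u64
}
```

Under this formula, 8 qubits need 12 iterations and 20 qubits need 804, which is why circuit depth grows quickly toward the upper end of the 8-25 qubit range.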

Memory Scaling Analysis

Quantum state-vector simulation stores the full amplitude vector of 2^n complex numbers. Each amplitude is a pair of f64 values (real + imaginary = 16 bytes). Memory grows exponentially:

Qubits  Amplitudes       State Size     With Scratch Buffer
------  -----------      ----------     -------------------
10      1,024            16 KB          32 KB
15      32,768           512 KB         1 MB
20      1,048,576        16 MB          32 MB
22      4,194,304        64 MB          128 MB
24      16,777,216       256 MB         512 MB
25      33,554,432       512 MB         1.07 GB
26      67,108,864       1.07 GB        2.14 GB
28      268,435,456      4.29 GB        8.59 GB
30      1,073,741,824    17.18 GB       34.36 GB

At 25 qubits the state vector requires 512 MiB, or 1 GiB (about 1.07 GB) with a scratch buffer for intermediate calculations. In the table above, sizes up to 512 MB are binary units (powers of two), while the GB figures are decimal. This is the practical ceiling within WebAssembly's 32-bit (4 GB) address space. Native execution with sufficient RAM can reach 30 or more qubits.
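
The table entries follow directly from 2^n amplitudes at 16 bytes each. A minimal sketch of that arithmetic, with hypothetical function names, assuming a single scratch buffer the same size as the state:

```rust
/// Bytes per amplitude in the default precision: two f64 values (re + im).
const BYTES_PER_AMPLITUDE_F64: u64 = 16;

/// Size in bytes of a full state vector for `num_qubits` qubits.
fn state_bytes(num_qubits: u32, bytes_per_amplitude: u64) -> u64 {
    (1u64 << num_qubits) * bytes_per_amplitude
}

/// Size in bytes including one scratch buffer of equal size
/// (an assumption matching the "With Scratch Buffer" column).
fn state_bytes_with_scratch(num_qubits: u32, bytes_per_amplitude: u64) -> u64 {
    2 * state_bytes(num_qubits, bytes_per_amplitude)
}
```

For 25 qubits this gives 536,870,912 bytes for the state and 1,073,741,824 bytes with scratch, which is the 1.07 GB figure in decimal units.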

Edge Computing Constraints

Cognitum's 256-core processors operate under strict power and memory budgets:

Power envelope: Event-driven activation; cores idle at near-zero draw
Memory: Shared pool, typically 2-8 GB per node
Interconnect: Low-latency mesh between cores, suitable for parallel simulation
Workload model: Burst computation triggered by agent events, not continuous

The quantum engine must respect this model: allocate state only when a simulation is triggered, execute the circuit, return results, and immediately release all memory.

Decision

Implement a pure Rust state-vector quantum simulator as a new crate family (ruQu quantum engine) within the ruVector workspace. The following architectural decisions define the engine.

1. Pure Rust Implementation (No C/C++ FFI)

The entire simulation engine is written in Rust with no foreign function interface dependencies. This ensures:

Compilation to wasm32-unknown-unknown without emscripten or C toolchains
Memory safety guarantees throughout the simulation pipeline
Unified build system via Cargo across all targets
No external library version conflicts or platform-specific linking issues

2. State-Vector Simulation as Primary Backend

The engine uses explicit full-amplitude state-vector representation as its primary simulation mode. Each gate application transforms the full 2^n amplitude vector via matrix-vector multiplication.

Circuit Execution Model:

  |psi_0>──────[H]────────[CNOT]──────[Rz(theta)]───[Measure]── classical bits
     |          |            |             |            |
     v          v            v             v            v
  [init]    [apply_H]  [apply_CNOT]   [apply_Rz]    [sample]
     |          |            |             |            |
  2^n f64    2^n f64      2^n f64       2^n f64     collapse
  complex    complex      complex       complex     to basis
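
The final sample stage in the diagram can be sketched as follows. This is an illustration under stated assumptions, not the engine's implementation: `Amp` and `measure_all` are hypothetical names, the caller supplies a uniform random number `r` in [0, 1) so the sketch needs no RNG crate, and the collapsed state drops any global phase.

```rust
/// Minimal complex amplitude (hypothetical stand-in for the engine's type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Amp {
    re: f64,
    im: f64,
}

impl Amp {
    /// Squared magnitude, i.e. the probability weight of this amplitude.
    fn norm_sqr(self) -> f64 {
        self.re * self.re + self.im * self.im
    }
}

/// Measure all qubits: pick a basis state with probability |amplitude|^2
/// using the caller-supplied uniform sample `r`, then collapse the state
/// onto that basis state. Returns the measured basis-state index.
fn measure_all(state: &mut [Amp], r: f64) -> usize {
    // Fall back to the last index if rounding keeps the running sum below r.
    let mut outcome = state.len().saturating_sub(1);
    let mut cumulative = 0.0;
    for (i, a) in state.iter().enumerate() {
        cumulative += a.norm_sqr();
        if r < cumulative {
            outcome = i;
            break;
        }
    }
    // Collapse: the measured basis state gets amplitude 1, all others 0.
    for (i, a) in state.iter_mut().enumerate() {
        let re = if i == outcome { 1.0 } else { 0.0 };
        *a = Amp { re, im: 0.0 };
    }
    outcome
}
```

For a two-qubit Bell state with equal weight on indices 0 and 3, a sample of `r = 0.25` yields index 0 and `r = 0.75` yields index 3.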

Gate application follows the standard decomposition:

Single-qubit gates: Iterate amplitude pairs (i, i XOR 2^target), apply 2x2 unitary. O(2^n) operations per gate.
Two-qubit gates: Iterate amplitude quadruples, apply 4x4 unitary. O(2^n) operations per gate.
Multi-qubit gates: Decompose into single and two-qubit gates, or apply directly via 2^k x 2^k matrix on k target qubits.
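
The pair iteration described above can be sketched in a few lines. The code below is a minimal, unoptimized illustration (no SIMD, no threading); `C64`, `apply_single`, `apply_cnot`, and `bell_state` are hypothetical names, and CNOT is special-cased as an amplitude swap rather than a general 4x4 multiply.

```rust
use std::ops::{Add, Mul};

/// Minimal complex number (hypothetical stand-in for the engine's type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct C64 {
    re: f64,
    im: f64,
}

impl C64 {
    const ZERO: C64 = C64 { re: 0.0, im: 0.0 };
    const ONE: C64 = C64 { re: 1.0, im: 0.0 };
}

impl Add for C64 {
    type Output = C64;
    fn add(self, o: C64) -> C64 {
        C64 { re: self.re + o.re, im: self.im + o.im }
    }
}

impl Mul for C64 {
    type Output = C64;
    fn mul(self, o: C64) -> C64 {
        C64 {
            re: self.re * o.re - self.im * o.im,
            im: self.re * o.im + self.im * o.re,
        }
    }
}

/// Apply a 2x2 unitary `u` (row-major) to qubit `target`.
/// Visits each pair (i, i XOR 2^target) exactly once: O(2^n) work.
fn apply_single(state: &mut [C64], target: usize, u: [[C64; 2]; 2]) {
    let bit = 1usize << target;
    for i in 0..state.len() {
        if (i & bit) == 0 {
            let j = i ^ bit; // partner index with the target bit set
            let (a0, a1) = (state[i], state[j]);
            state[i] = u[0][0] * a0 + u[0][1] * a1;
            state[j] = u[1][0] * a0 + u[1][1] * a1;
        }
    }
}

/// CNOT as a permutation: where the control bit is 1, swap the two
/// amplitudes that differ only in the target bit.
fn apply_cnot(state: &mut [C64], control: usize, target: usize) {
    let (c, t) = (1usize << control, 1usize << target);
    for i in 0..state.len() {
        if (i & c) != 0 && (i & t) == 0 {
            state.swap(i, i ^ t);
        }
    }
}

/// Hadamard gate as a 2x2 matrix.
fn hadamard() -> [[C64; 2]; 2] {
    let h = C64 { re: std::f64::consts::FRAC_1_SQRT_2, im: 0.0 };
    let nh = C64 { re: -h.re, im: 0.0 };
    [[h, h], [h, nh]]
}

/// Prepare (|00> + |11>) / sqrt(2) from |00> using H on qubit 0, then CNOT.
fn bell_state() -> Vec<C64> {
    let mut state = vec![C64::ZERO; 4];
    state[0] = C64::ONE;
    apply_single(&mut state, 0, hadamard());
    apply_cnot(&mut state, 0, 1);
    state
}
```

Calling `bell_state()` leaves amplitude 1/sqrt(2) at indices 0 and 3 and zero elsewhere. Neither gate allocates: both update the amplitude vector in place, so this particular sketch needs no scratch buffer.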

3. Qubit Limits and Precision

Parameter	WASM Target	Native Target
Max qubits (default)	25	30+ (RAM-dependent)
Max qubits (hard limit)	26 (with f32)	Memory-limited
Precision (default)	Complex f64	Complex f64
Precision (optional)	Complex f32	Complex f32
State size at max	~512 MB at 25 qubits (~1.07 GB with scratch)	~17 GB at 30 qubits (~34 GB with scratch)

Complex f64 is the default precision. It provides approximately 15 decimal digits of accuracy, which is sufficient for quantum chemistry applications and for deep circuits where accumulated floating-point error matters. An optional f32 mode halves memory usage (8 bytes per amplitude instead of 16) at the cost of precision; it is suitable for shallow circuits and approximate optimization.
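
The effect of precision on the qubit ceiling can be checked with a short calculation. The sketch below is illustrative only; `max_qubits_for_budget` is a hypothetical helper that assumes the whole budget is available to a single state vector with no scratch buffer.

```rust
use std::mem::size_of;

/// Largest qubit count whose state vector fits in `budget_bytes`, where
/// each amplitude is a (re, im) pair of scalar type `T` (f32 or f64).
/// Returns None if not even a single amplitude fits.
fn max_qubits_for_budget<T>(budget_bytes: u64) -> Option<u32> {
    let bytes_per_amplitude = 2 * size_of::<T>() as u64;
    let amplitudes = budget_bytes / bytes_per_amplitude;
    if amplitudes == 0 {
        return None;
    }
    // floor(log2(amplitudes)): index of the highest set bit.
    Some(63 - amplitudes.leading_zeros())
}
```

With a 512 MiB budget this returns 25 for `f64` and 26 for `f32`, matching the default and hard-limit rows of the table above.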

4. Event-Driven Activation Model

The engine follows ruVector's event-driven philosophy:

Agent Context          ruQu Engine              Memory
     |                      |                      |
     |-- trigger(circuit) ->|                      |
     |                      |-- allocate(2^n) ---->|
     |                      |<---- state_ptr ------|
     |                      |                      |
     |                      |-- [execute gates] -->|
     |                      |-- [measure] -------->|
     |                      |                      |
     |<-- results ---------|                      |
     |                      |-- deallocate() ----->|
     |                      |                      |
   (idle)                (inert)               (freed)

Inert by default: No background threads, no persistent allocations
Allocate on demand: State vector created when circuit execution begins
Free immediately: All simulation memory released upon result delivery
No global state: Multiple concurrent simulations supported via independent state handles (no shared mutable global)
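
In Rust, this lifecycle maps naturally onto ownership: the state vector is a local value that is dropped when the run returns. The sketch below is a hedged illustration, not the ruQu API. `Circuit`, `RunResult`, `run_circuit`, and `run_concurrently` are hypothetical, gate execution is elided, and the threads are spawned by the caller purely to show that independent runs share no mutable state.

```rust
use std::thread;

/// Hypothetical circuit description; only the qubit count matters here.
struct Circuit {
    num_qubits: u32,
}

/// Hypothetical compact result handed back to the triggering agent.
#[derive(Debug, PartialEq)]
struct RunResult {
    most_likely: usize,
    probability: f64,
}

/// Allocate on demand, execute, return a small result. The state vector
/// is owned by this call and freed when it returns.
fn run_circuit(circuit: &Circuit) -> RunResult {
    let dim = 1usize << circuit.num_qubits;
    let mut state = vec![(0.0f64, 0.0f64); dim]; // allocate(2^n)
    state[0] = (1.0, 0.0); // start in |0...0>
    // Gate execution would transform `state` here.
    let (most_likely, probability) = state
        .iter()
        .map(|&(re, im)| re * re + im * im)
        .enumerate()
        .fold((0usize, 0.0f64), |best, (i, p)| if p > best.1 { (i, p) } else { best });
    RunResult { most_likely, probability }
    // `state` goes out of scope here, releasing all simulation memory.
}

/// Run several circuits at once. Each run owns its own state vector,
/// so no locks or global state are involved.
fn run_concurrently(circuits: &[Circuit]) -> Vec<RunResult> {
    thread::scope(|scope| {
        let handles: Vec<_> = circuits
            .iter()
            .map(|c| scope.spawn(move || run_circuit(c)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("simulation thread panicked"))
            .collect::<Vec<_>>()
    })
}
```

Because no gates are applied in this sketch, every run reports basis state 0 with probability 1.0. The point is the shape of the lifecycle: nothing is allocated before `run_circuit` is called and nothing survives it except the result.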

5. Dual-Target Compilation

The crate supports two compilation targets from a single codebase:

                    ruqu-core
                       |
            +----------+----------+
            |                     |
    [native target]       [wasm32-unknown-unknown]
            |                     |
    - Full SIMD (AVX2,      - WASM SIMD128
      AVX-512, NEON)        - 4GB address limit
    - Rayon threading        - Optional SharedArrayBuffer
    - Optional GPU (wgpu)    - No GPU
    - 30+ qubits             - 25 qubit ceiling
    - Full OS integration    - Sandboxed

Conditional compilation via Cargo feature flags controls target-specific code paths. The public API surface is identical across targets.
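
As a rough illustration of one API with target-specific limits, the sketch below selects a default qubit ceiling at compile time. It uses `cfg(target_arch = ...)` rather than a Cargo feature, since feature names are not specified here; `cfg(feature = "...")` attributes work the same way. All names are hypothetical, and the native value of 30 is an assumed stand-in for "30+ (RAM-dependent)".

```rust
use std::fmt;

/// Default ceiling on WASM, per the 25-qubit limit described above.
#[cfg(target_arch = "wasm32")]
const DEFAULT_MAX_QUBITS: u32 = 25;

/// Assumed native default; the real limit depends on available RAM.
#[cfg(not(target_arch = "wasm32"))]
const DEFAULT_MAX_QUBITS: u32 = 30;

/// Error returned when a circuit asks for more qubits than the target allows.
#[derive(Debug, PartialEq)]
struct QubitLimitError {
    requested: u32,
    limit: u32,
}

impl fmt::Display for QubitLimitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "requested {} qubits, but this target allows at most {}",
            self.requested, self.limit
        )
    }
}

/// Same signature on every target; only the limit differs.
fn check_qubit_count(requested: u32) -> Result<u32, QubitLimitError> {
    if requested > DEFAULT_MAX_QUBITS {
        Err(QubitLimitError { requested, limit: DEFAULT_MAX_QUBITS })
    } else {
        Ok(requested)
    }
}
```

Checking the limit before allocation lets an over-sized request fail with a readable message instead of exhausting memory.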

6. Optional Tensor Network Mode

For circuits with limited entanglement (e.g., shallow QAOA, certain VQE ansatze), the engine offers an optional tensor network backend:

Represents the quantum state as a network of tensors rather than a single exponential vector
Memory scales as O(n * chi^2) where chi is the bond dimension (maximum entanglement width)
Efficient for circuits where entanglement grows slowly or remains bounded
Falls back to full state-vector when bond dimension exceeds threshold
Enabled via the tensor-network feature flag
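
To give a sense of the O(n * chi^2) scaling, the sketch below estimates memory for a matrix product state (MPS), one common tensor network layout. The choice of MPS is an assumption, since the network type is not specified here; the function names and the threshold parameter are likewise hypothetical.

```rust
/// Which backend a run would use (hypothetical enum).
#[derive(Debug, PartialEq)]
enum Backend {
    TensorNetwork,
    StateVector,
}

/// Upper-bound memory estimate for an MPS: one tensor per qubit of shape
/// chi x 2 x chi, with 16 bytes per complex f64 entry.
fn mps_bytes(num_qubits: u64, chi: u64) -> u64 {
    num_qubits * 2 * chi * chi * 16
}

/// Fall back to the full state vector once the bond dimension
/// exceeds the threshold.
fn choose_backend(chi: u64, chi_threshold: u64) -> Backend {
    if chi > chi_threshold {
        Backend::StateVector
    } else {
        Backend::TensorNetwork
    }
}
```

Under these assumptions, 25 qubits at bond dimension 64 need about 3.3 MB, compared with 512 MB for the full state vector. The advantage disappears as chi grows toward 2^(n/2), which is why a fallback threshold is needed.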

Alternatives Considered

Alternative 1: Qukit (Rust, WASM-ready)

A pre-1.0 Rust quantum simulator with WASM support.

Criterion	Assessment
Maturity	Pre-1.0, limited community
WASM support	Present but untested at scale
Optimization	Basic; no SIMD, no gate fusion
Integration	Would require adapter layer
Maintenance	External dependency risk

Rejected: Insufficient optimization depth and maturity for production use.

Alternative 2: QuantRS2 (Rust, Python-focused)

A Rust quantum simulator primarily targeting Python bindings via PyO3.

Criterion	Assessment
Performance	Good benchmarks on native
WASM support	Not a design target
Dependencies	Heavy; Python-oriented build
API design	Python-first, Rust API secondary
Integration	Significant impedance mismatch

Rejected: Python-centric design creates unnecessary weight and integration friction for a Rust-native edge system.

Alternative 3: roqoqo + QuEST (Rust frontend, C backend)

roqoqo provides a Rust circuit description layer; QuEST is a high-performance C/C++ state-vector simulator.

Criterion	Assessment
Performance	Excellent (QuEST is highly optimized)
WASM support	QuEST's C code breaks WASM compilation
Maintenance	External C library maintenance burden
Memory safety	C backend outside Rust safety guarantees

Rejected: C dependency is incompatible with WASM target requirement.

Alternative 4: Quant-Iron (Rust + OpenCL)

A Rust simulator leveraging OpenCL for GPU acceleration.

Criterion	Assessment
Performance	Excellent on GPU-equipped hardware
WASM support	OpenCL incompatible with WASM
Edge deployment	Most edge nodes lack discrete GPUs
Complexity	OpenCL runtime adds operational burden

Rejected: OpenCL dependency incompatible with WASM and edge deployment model.

Alternative 5: No Simulator (Cloud Quantum APIs)

Delegate all quantum computation to cloud-based quantum simulators or hardware.

Criterion	Assessment
Performance	Network-bound latency
Offline support	None; requires connectivity
Cost	Per-execution charges
Privacy	Circuit data sent to third party
Edge philosophy	Violates offline-first design

Rejected: Fundamentally incompatible with ruVector's offline-first edge computing philosophy.

Consequences

Positive

Full control: Complete ownership of the simulation pipeline, enabling deep integration with ruVector's math, SIMD, and memory subsystems
WASM portable: A single codebase compiles to wasm32-unknown-unknown and runs in standard WASM runtimes, enabling browser-based quantum experimentation
No non-Rust dependencies: Eliminates supply chain risk from C/C++ or Python library dependencies
Edge-aligned: Event-driven activation model matches Cognitum's power architecture
Extensible: Gate set, noise models, and backends can evolve independently

Negative

Development effort: Building a competitive quantum simulator from scratch requires significant engineering investment
Maintenance burden: Team must benchmark, optimize, and maintain the simulation engine alongside the rest of ruVector
Classical simulation limits: Exponential memory scaling is inherent to full state-vector simulation; the engine cannot go much beyond ~30 qubits on practical hardware

Risks and Mitigations

Risk	Likelihood	Impact	Mitigation
Performance below competitors	Medium	High	Benchmark-driven development against QuantRS2/Qukit
Floating-point accuracy drift	Low	Medium	Comprehensive numerical tests, optional f64 enforcement
WASM memory exhaustion	Medium	Medium	Hard qubit limit with clear error messages (ADR-QE-003)
Scope creep into hardware simulation	Low	Low	Strict scope: gate-model only, no analog/pulse simulation

References

ADR-005: WASM Runtime Integration
ADR-003: SIMD Optimization Strategy
ADR-006: Memory Management
ADR-014: Coherence Engine
ADR-QE-002: Crate Structure & Integration
ADR-QE-003: WASM Compilation Strategy
ADR-QE-004: Performance Optimization & Benchmarks
Nielsen & Chuang, "Quantum Computation and Quantum Information" (2010)
Aaronson & Gottesman, "Improved simulation of stabilizer circuits" (2004)
