ADR-001: ruQu Architecture - Classical Nervous System for Quantum Machines

Status: Proposed
Date: 2026-01-17
Authors: ruv.io, RuVector Team
Deciders: Architecture Review Board
SDK: Claude-Flow

Version History

| Version | Date | Author | Changes |
|---|---|---|---|
| 0.1 | 2026-01-17 | ruv.io | Initial architecture proposal |

Context

The Quantum Operability Problem

Quantum computers in 2025 have achieved remarkable milestones:

  • Google Willow: Below-threshold error correction (0.143% logical error per cycle at distance 7)
  • Quantinuum Helios: 98 qubits with 48 logical qubits at 2:1 ratio
  • Riverlane: 240ns ASIC decoder latency
  • IonQ: 99.99%+ two-qubit gate fidelity

Yet these systems remain fragile laboratory instruments, not operable production systems.

The gap is not in the quantum hardware or the decoders. The gap is in the classical control intelligence that mediates between hardware and algorithms.

Current Limitations

| Limitation | Impact |
|---|---|
| Monolithic treatment | Entire device treated as one object per cycle |
| Reactive control | Decoders react after errors accumulate |
| Static policies | Fixed decoder, schedule, cadence |
| Superlinear overhead | Control infrastructure scales worse than qubit count |

The Missing Primitive

Current systems can ask:

"What is the most likely correction?"

They cannot ask:

"Is this system still internally consistent enough to trust action?"

That question, answered continuously at microsecond timescales, is the missing primitive.


Decision

Introduce ruQu: A Two-Layer Classical Nervous System

We propose ruQu, a classical control layer combining:

  1. RuVector Memory Layer: Pattern recognition and historical mitigation retrieval
  2. Dynamic Min-Cut Gate: Real-time structural coherence assessment

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                              ruQu FABRIC                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                         TILE ZERO (Coordinator)                       │ │
│  │  • Supergraph merge                  • Global min-cut evaluation     │ │
│  │  • Permit token issuance             • Hash-chained receipt log      │ │
│  └───────────────────────────────────────────────────────────────────────┘ │
│                                      │                                      │
│         ┌────────────────────────────┼────────────────────────────┐        │
│         ▼                            ▼                            ▼         │
│  ┌─────────────┐            ┌─────────────┐            ┌─────────────┐     │
│  │ WORKER TILE │            │ WORKER TILE │            │ WORKER TILE │     │
│  │   [1-85]    │   × 85     │  [86-170]   │   × 85     │ [171-255]   │× 85 │
│  │             │            │             │            │             │     │
│  │ • Patch     │            │ • Patch     │            │ • Patch     │     │
│  │ • Syndromes │            │ • Syndromes │            │ • Syndromes │     │
│  │ • Local cut │            │ • Local cut │            │ • Local cut │     │
│  │ • E-accum   │            │ • E-accum   │            │ • E-accum   │     │
│  └─────────────┘            └─────────────┘            └─────────────┘     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Core Components

1. Operational Graph Model

The operational graph includes all elements that can affect quantum coherence:

| Node Type | Examples | Edge Type |
|---|---|---|
| Qubits | Data, ancilla, flag | Coupling strength |
| Couplers | ZZ, XY, tunable | Crosstalk correlation |
| Readout | Resonators, amplifiers | Signal path dependency |
| Control | Flux, microwave, DC | Control line routing |
| Classical | Clocks, temperature, calibration | State dependency |
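As an illustration of the table above, here is a minimal sketch of how such a graph could be held in memory. The names `NodeKind`, `Edge`, and `OperationalGraph` are hypothetical and are not taken from the ruQu crates; a single `f64` weight stands in for whichever edge quantity (coupling strength, crosstalk correlation, and so on) applies to a given pair of nodes.

```rust
/// Node categories from the operational graph table (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeKind {
    Qubit,
    Coupler,
    Readout,
    Control,
    Classical,
}

/// Undirected weighted edge; the weight is a stand-in for the edge type's metric.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Edge {
    pub a: usize,
    pub b: usize,
    pub weight: f64,
}

#[derive(Debug, Default)]
pub struct OperationalGraph {
    nodes: Vec<NodeKind>,
    edges: Vec<Edge>,
}

impl OperationalGraph {
    /// Adds a node and returns its index.
    pub fn add_node(&mut self, kind: NodeKind) -> usize {
        self.nodes.push(kind);
        self.nodes.len() - 1
    }

    /// Adds an undirected edge; rejects unknown endpoints, self-loops,
    /// and weights that are negative or not finite.
    pub fn add_edge(&mut self, a: usize, b: usize, weight: f64) -> Result<(), String> {
        if a >= self.nodes.len() || b >= self.nodes.len() || a == b {
            return Err(format!("invalid endpoints ({a}, {b})"));
        }
        if !weight.is_finite() || weight < 0.0 {
            return Err(format!("invalid weight {weight}"));
        }
        self.edges.push(Edge { a, b, weight });
        Ok(())
    }

    /// Sum of weights on edges touching `v`. Cutting every edge at `v`
    /// isolates it, so this is an upper bound on the global min-cut.
    pub fn weighted_degree(&self, v: usize) -> f64 {
        self.edges
            .iter()
            .filter(|e| e.a == v || e.b == v)
            .map(|e| e.weight)
            .sum()
    }
}
```

An edge list keeps the sketch short; a real tile would more likely use the fixed-size vertex, edge, and adjacency arrays described in Appendix B.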

2. Dynamic Min-Cut as Coherence Metric

The min-cut between "healthy" and "unhealthy" partitions provides:

  • Structural fragility: Low cut value = boundary forming
  • Localization: Cut edges identify the fracture point
  • Early warning: Cut value drops before logical errors spike

Complexity: O(n^{o(1)}) update time via SubpolynomialMinCut from ruvector-mincut
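To make the metric concrete, the sketch below computes a global min-cut on a small weighted graph and returns one side of the cut. It uses the classic static Stoer-Wagner algorithm (O(n^3) on a dense matrix) purely as an illustrative stand-in; it is not the subpolynomial dynamic algorithm provided by `SubpolynomialMinCut`, and the function names are hypothetical.

```rust
/// Build a symmetric weight matrix from an undirected edge list.
/// Panics if an endpoint is >= n (acceptable for a sketch).
pub fn matrix_from_edges(n: usize, edges: &[(usize, usize, f64)]) -> Vec<Vec<f64>> {
    let mut w = vec![vec![0.0; n]; n];
    for &(a, b, weight) in edges {
        w[a][b] += weight;
        w[b][a] += weight;
    }
    w
}

/// Global minimum cut via Stoer-Wagner. `w` must be a symmetric n×n matrix
/// of non-negative weights. Returns (cut value, vertices on one side).
pub fn stoer_wagner(mut w: Vec<Vec<f64>>) -> Option<(f64, Vec<usize>)> {
    let n = w.len();
    if n < 2 || w.iter().any(|row| row.len() != n) {
        return None;
    }
    // members[v] lists the original vertices merged into super-vertex v.
    let mut members: Vec<Vec<usize>> = (0..n).map(|v| vec![v]).collect();
    let mut active: Vec<usize> = (0..n).collect();
    let mut best: Option<(f64, Vec<usize>)> = None;

    while active.len() > 1 {
        // One "minimum cut phase": grow a set in maximum-adjacency order.
        let mut attach = vec![0.0_f64; n];
        let mut added = vec![false; n];
        let mut prev = usize::MAX;
        let mut last = usize::MAX;
        for _ in 0..active.len() {
            let mut sel = usize::MAX;
            for &v in &active {
                if !added[v] && (sel == usize::MAX || attach[v] > attach[sel]) {
                    sel = v;
                }
            }
            added[sel] = true;
            prev = last;
            last = sel;
            for &v in &active {
                if !added[v] {
                    attach[v] += w[sel][v];
                }
            }
        }
        // Cut of the phase: `last` (with everything merged into it) vs the rest.
        let cut = attach[last];
        if best.as_ref().map_or(true, |(b, _)| cut < *b) {
            best = Some((cut, members[last].clone()));
        }
        // Merge `last` into `prev` and drop it from the active set.
        let moved = std::mem::take(&mut members[last]);
        members[prev].extend(moved);
        for &v in &active {
            if v != prev && v != last {
                let extra = w[last][v];
                w[prev][v] += extra;
                let merged = w[prev][v];
                w[v][prev] = merged;
            }
        }
        active.retain(|&v| v != last);
    }
    best
}
```

For two tightly coupled triangles joined by one weak edge of weight 0.5, the cut value is 0.5 and the returned side is one of the triangles, which mirrors the "fracture point" localization described above.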

3. Three-Filter Decision Logic

┌─────────────────────────────────────────────────────────────────┐
│                    FILTER 1: STRUCTURAL                         │
│  Local fragility detection → Global cut confirmation            │
│  Cut ≥ threshold → Coherent                                     │
│  Cut < threshold → Boundary forming → Quarantine                │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    FILTER 2: SHIFT                              │
│  Nonconformity scores → Aggregated shift pressure               │
│  Shift < threshold → Distribution stable                        │
│  Shift ≥ threshold → Drift detected → Conservative mode         │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    FILTER 3: EVIDENCE                           │
│  Running e-value accumulators → Anytime-valid testing           │
│  E ≥ τ_permit → Accept (permit immediately)                     │
│  E ≤ τ_deny → Reject (deny immediately)                         │
│  Otherwise → Continue (gather more evidence)                    │
└─────────────────────────────────────────────────────────────────┘
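The evidence filter's stopping rule can be sketched as a running product of likelihood ratios kept in log space. Everything specific here is an assumption made for illustration: the Bernoulli model of a single detector firing, the two firing rates, the direction of the ratio (healthy over degraded, so that a large E supports a permit), and the names `EvidenceAccumulator` and `Evidence`. The ADR only specifies the three outcomes and the two thresholds.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Evidence {
    Permit,
    Deny,
    Continue,
}

/// Running log e-value with two stopping thresholds (hypothetical sketch).
pub struct EvidenceAccumulator {
    log_e: f64,
    log_permit: f64,
    log_deny: f64,
    p_healthy: f64,  // assumed detector firing rate when the patch is healthy
    p_degraded: f64, // assumed firing rate when the patch is degraded
}

impl EvidenceAccumulator {
    /// Returns None if rates are outside (0, 1) or thresholds are not
    /// ordered as 0 < tau_deny < tau_permit.
    pub fn new(p_healthy: f64, p_degraded: f64, tau_permit: f64, tau_deny: f64) -> Option<Self> {
        let rate_ok = |p: f64| p > 0.0 && p < 1.0;
        if !rate_ok(p_healthy) || !rate_ok(p_degraded) {
            return None;
        }
        if !(tau_deny > 0.0 && tau_deny < tau_permit) {
            return None;
        }
        Some(Self {
            log_e: 0.0, // E starts at 1
            log_permit: tau_permit.ln(),
            log_deny: tau_deny.ln(),
            p_healthy,
            p_degraded,
        })
    }

    /// Fold one observation into the e-value and report the current verdict.
    pub fn observe(&mut self, detector_fired: bool) -> Evidence {
        let (num, den) = if detector_fired {
            (self.p_healthy, self.p_degraded)
        } else {
            (1.0 - self.p_healthy, 1.0 - self.p_degraded)
        };
        self.log_e += (num / den).ln();
        self.verdict()
    }

    /// E >= tau_permit permits, E <= tau_deny denies, otherwise keep sampling.
    pub fn verdict(&self) -> Evidence {
        if self.log_e >= self.log_permit {
            Evidence::Permit
        } else if self.log_e <= self.log_deny {
            Evidence::Deny
        } else {
            Evidence::Continue
        }
    }
}
```

With assumed rates of 0.01 (healthy) and 0.10 (degraded) and thresholds of 20 and 0.05, two detector firings from a fresh start are enough to deny, while a permit needs 32 consecutive quiet rounds. That asymmetry is a property of these example numbers, not a claim about ruQu's tuning.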

4. Tile Architecture

Each worker tile (64KB memory budget):

| Component | Size | Purpose |
|---|---|---|
| Patch Graph | ~32KB | Local graph shard (vertices, edges, adjacency) |
| Syndrome Ring | ~16KB | Rolling syndrome history (1024 rounds) |
| Evidence Accumulator | ~4KB | E-value computation |
| Local Min-Cut | ~8KB | Boundary candidates, cut cache, witness fragments |
| Control/Scratch | ~4KB | Delta buffer, report scratch, stack |
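The syndrome ring is the simplest of these components to sketch: a fixed array of 1024 slots at 16 bytes per round (the per-round size comes from Appendix B), overwritten in arrival order. `SyndromeRing` is a hypothetical name and is not meant to describe the crate's actual `SyndromeBuffer`.

```rust
pub const ROUNDS: usize = 1024;
pub const BYTES_PER_ROUND: usize = 16;

/// Fixed-capacity rolling history of syndrome rounds (16 KB of payload).
pub struct SyndromeRing {
    slots: Box<[[u8; BYTES_PER_ROUND]; ROUNDS]>,
    pushed: usize, // total rounds ever pushed
}

impl SyndromeRing {
    pub fn new() -> Self {
        Self {
            slots: Box::new([[0u8; BYTES_PER_ROUND]; ROUNDS]),
            pushed: 0,
        }
    }

    /// Append a round, overwriting the oldest one once the ring is full.
    pub fn push(&mut self, round: [u8; BYTES_PER_ROUND]) {
        self.slots[self.pushed % ROUNDS] = round;
        self.pushed += 1;
    }

    /// Number of rounds currently retained (at most ROUNDS).
    pub fn rounds_held(&self) -> usize {
        self.pushed.min(ROUNDS)
    }

    /// Look back `age` rounds; age 0 is the most recent round.
    pub fn latest(&self, age: usize) -> Option<&[u8; BYTES_PER_ROUND]> {
        if age >= self.rounds_held() {
            return None;
        }
        Some(&self.slots[(self.pushed - 1 - age) % ROUNDS])
    }
}

/// Payload size of the ring in bytes, for checking against the 16 KB budget.
pub fn ring_payload_bytes() -> usize {
    std::mem::size_of::<[[u8; BYTES_PER_ROUND]; ROUNDS]>()
}
```

The write cursor lives outside the 16 KB payload here; whether the real tile counts it against the ring or the control region is not stated in this ADR.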

5. Decision Output

The coherence gate outputs a decision every cycle:

enum GateDecision {
    Safe {
        region_mask: RegionMask,     // Which regions are stable
        permit_token: PermitToken,   // Signed authorization
    },
    Cautious {
        region_mask: RegionMask,     // Which regions need care
        lead_time: Cycles,           // Estimated cycles before degradation
        recommendations: Vec<Action>, // Suggested mitigations
    },
    Unsafe {
        quarantine_mask: RegionMask, // Which regions to isolate
        recovery_mode: RecoveryMode, // How to recover
        witness: WitnessReceipt,     // Audit trail
    },
}
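
A controller consuming this output might look like the sketch below, which maps each decision to a decoding plan (fast path, slow path on the flagged regions, or quarantine). The field types are simplified stand-ins invented for the example, as are `DecoderPlan` and `plan_decoding`; only the variant and field names come from the enum above.

```rust
// Stand-in types, assumed for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RegionMask(pub u64); // one bit per region
pub struct PermitToken(pub [u8; 64]);
pub struct WitnessReceipt(pub u64);
pub type Cycles = u32;
pub enum Action {
    IncreaseSyndromeRounds,
    SwitchToConservativeDecoder,
}
pub enum RecoveryMode {
    LocalReset,
}

pub enum GateDecision {
    Safe { region_mask: RegionMask, permit_token: PermitToken },
    Cautious { region_mask: RegionMask, lead_time: Cycles, recommendations: Vec<Action> },
    Unsafe { quarantine_mask: RegionMask, recovery_mode: RecoveryMode, witness: WitnessReceipt },
}

/// What the decoder controller should do this cycle (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum DecoderPlan {
    Fast,
    Slow { regions: RegionMask },
    Quarantine { regions: RegionMask },
}

/// Exhaustive match: adding a new GateDecision variant becomes a compile error here.
pub fn plan_decoding(decision: &GateDecision) -> DecoderPlan {
    match decision {
        GateDecision::Safe { .. } => DecoderPlan::Fast,
        GateDecision::Cautious { region_mask, .. } => DecoderPlan::Slow { regions: *region_mask },
        GateDecision::Unsafe { quarantine_mask, .. } => {
            DecoderPlan::Quarantine { regions: *quarantine_mask }
        }
    }
}
```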

Rationale

Why Min-Cut for Coherence?

  1. Graph structure captures dependencies: Qubits, couplers, and control lines form a natural graph
  2. Cut value quantifies fragility: Low cut = system splitting into incoherent partitions
  3. Edges identify the boundary: Know exactly which connections are failing
  4. Subpolynomial updates: O(n^{o(1)}) enables real-time tracking

Why Three Filters?

| Filter | What It Catches | Timescale |
|---|---|---|
| Structural | Partition formation, hardware failures | Immediate |
| Shift | Calibration drift, environmental changes | Gradual |
| Evidence | Statistical anomalies, rare events | Cumulative |

All three must agree for PERMIT. Any one can trigger DENY or DEFER.
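
One compact way to express this rule is to order the verdicts by severity and take the maximum. The sketch assumes that DENY outranks DEFER when filters disagree, which the text above does not state explicitly; `Verdict` here is a local illustrative type and `combine` is a hypothetical helper.

```rust
/// Variant order defines severity: Permit < Defer < Deny.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Verdict {
    Permit,
    Defer,
    Deny,
}

/// The combined verdict is the most severe of the three filter verdicts,
/// so Permit results only when all three filters permit.
pub fn combine(structural: Verdict, shift: Verdict, evidence: Verdict) -> Verdict {
    structural.max(shift).max(evidence)
}
```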

Why 256 Tiles?

  • Maps to practical FPGA/ASIC fabric sizes
  • 255 workers can cover ~512 qubits each (130K qubit system)
  • Single TileZero keeps coordination simple
  • Power of 2 enables efficient addressing
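
The arithmetic behind these bullets can be checked directly: 255 workers × 512 qubits is 130,560 qubits, and any tile ID fits in a single byte. The contiguous block assignment in `tile_for_qubit` is an assumption made for illustration; the ADR does not say how qubits are mapped to tiles.

```rust
pub const WORKER_TILES: u32 = 255; // tile 0 is the coordinator (TileZero)
pub const QUBITS_PER_WORKER: u32 = 512;

/// Total qubits covered by the worker tiles.
pub fn fabric_capacity() -> u32 {
    WORKER_TILES * QUBITS_PER_WORKER
}

/// Worker tile ID (1..=255) for a qubit index, assuming contiguous blocks
/// of 512 qubits per tile. Returns None beyond the fabric's capacity.
pub fn tile_for_qubit(qubit: u32) -> Option<u8> {
    let block = qubit / QUBITS_PER_WORKER;
    if block < WORKER_TILES {
        Some((block + 1) as u8)
    } else {
        None
    }
}
```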

Why Not Just Improve Decoders?

Decoders answer: "What correction should I apply?"

ruQu answers: "Should I apply any correction right now?"

These are complementary, not competing. ruQu tells decoders when to work hard and when to relax.


Alternatives Considered

Alternative 1: Purely Statistical Approach

Use only statistical tests on syndrome streams without graph structure.

Rejected because:

  • Cannot identify where problems are forming
  • Cannot leverage structural dependencies
  • Cannot provide localized quarantine

Alternative 2: Post-Hoc Analysis

Analyze syndrome logs offline to detect patterns.

Rejected because:

  • No real-time intervention possible
  • Problems detected after logical failures
  • Cannot enable adaptive control

Alternative 3: Hardware-Only Solution

Implement all logic in quantum hardware or cryogenic electronics.

Rejected because:

  • Inflexible to algorithm changes
  • High development cost
  • Limited to simple policies

Alternative 4: Single-Level Evaluation

No tile hierarchy, evaluate whole system each cycle.

Rejected because:

  • Does not scale beyond ~1000 qubits
  • Cannot provide regional policies
  • Single point of failure

Consequences

Benefits

  1. Localized Recovery: Quarantine smallest region, keep rest running
  2. Early Warning: Detect correlated failures before logical errors
  3. Selective Overhead: Extra work only where needed
  4. Bounded Latency: Constant-time decision every cycle
  5. Audit Trail: Cryptographic proof of every decision
  6. Scalability: Effort scales with structure, not system size
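
The audit-trail benefit rests on hash chaining: each receipt commits to the hash of the one before it, so altering any past entry breaks every later link. The sketch below shows only that chaining structure. It uses the standard library's `DefaultHasher`, which is not cryptographic and is an assumed stand-in chosen to keep the example dependency-free; `DemoReceipt` and `DemoReceiptLog` are hypothetical names unrelated to the crate's `ReceiptLog`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone)]
pub struct DemoReceipt {
    pub seq: u64,
    pub decision: String,
    pub prev_hash: u64,
    pub hash: u64,
}

/// Hash of one link. NOT cryptographic: a real log would use a secure hash.
fn link_hash(seq: u64, decision: &str, prev_hash: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (seq, decision, prev_hash).hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, Default)]
pub struct DemoReceiptLog {
    pub entries: Vec<DemoReceipt>,
}

impl DemoReceiptLog {
    /// Append a receipt that commits to the previous entry's hash (0 for the first).
    pub fn append(&mut self, decision: &str) {
        let seq = self.entries.len() as u64;
        let prev_hash = self.entries.last().map_or(0, |r| r.hash);
        let hash = link_hash(seq, decision, prev_hash);
        self.entries.push(DemoReceipt {
            seq,
            decision: decision.to_string(),
            prev_hash,
            hash,
        });
    }

    /// Index of the first entry whose link does not verify, if any.
    pub fn first_broken_link(&self) -> Option<usize> {
        let mut expected_prev = 0u64;
        for (i, r) in self.entries.iter().enumerate() {
            let ok = r.seq == i as u64
                && r.prev_hash == expected_prev
                && r.hash == link_hash(r.seq, &r.decision, r.prev_hash);
            if !ok {
                return Some(i);
            }
            expected_prev = r.hash;
        }
        None
    }
}
```

Verification walks the chain once, so an auditor can locate the earliest tampered receipt in time linear in the log length.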

Risks and Mitigations

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Graph model mismatch | Medium | High | Learn graph from trajectories |
| Threshold tuning difficulty | Medium | Medium | Adaptive thresholds via meta-learning |
| FPGA latency exceeds budget | Low | High | ASIC path for production |
| Correlated noise overwhelms detection | Low | High | Multiple detection modalities |

Performance Targets

| Metric | Target | Rationale |
|---|---|---|
| Gate decision latency | < 4 μs p99 | Compatible with 1 MHz syndrome rate |
| Memory per tile | ≤ 64 KB | Fits in FPGA BRAM |
| Power consumption | < 100 mW | Cryo-compatible ASIC path |
| Lead time for correlation | > 100 cycles | Actionable warning |

Implementation Status

Completed (v0.1.0)

Core Implementation (340+ tests passing):

| Module | Status | Description |
|---|---|---|
| ruqu::types | Complete | GateDecision, RegionMask, Verdict, FilterResults |
| ruqu::syndrome | Complete | DetectorBitmap (SIMD-ready), SyndromeBuffer, SyndromeDelta |
| ruqu::filters | Complete | StructuralFilter, ShiftFilter, EvidenceFilter, FilterPipeline |
| ruqu::tile | Complete | WorkerTile (64KB), TileZero, PatchGraph, ReceiptLog |
| ruqu::fabric | Complete | QuantumFabric, FabricBuilder, CoherenceGate, PatchMap |
| ruqu::error | Complete | RuQuError with thiserror |

Security Review (see docs/SECURITY-REVIEW.md):

  • 3 Critical findings fixed (signature length, verification, hash chain)
  • 5 High findings fixed (bounds validation, hex panic, TTL validation)
  • Ed25519 64-byte signatures implemented
  • Bounds checking in release mode

Test Coverage:

  • 90 library unit tests
  • 66 integration tests
  • Property-based tests with proptest
  • Memory budget verification (64KB per tile)

Benchmarks (see benches/):

  • latency_bench.rs - Gate decision latency profiling
  • throughput_bench.rs - Syndrome ingestion rates
  • scaling_bench.rs - Code distance/qubit scaling
  • memory_bench.rs - Memory efficiency verification

Implementation Phases

Phase 1: Simulation Demo (v0.1) COMPLETE

  • Stim simulation stream
  • Baseline decoder (PyMatching)
  • ruQu gate + partition only
  • Controller switches fast/slow decode

Deliverables:

  • Gate latency distribution
  • Correlation detection lead time
  • Logical error vs overhead curve

Phase 2: FPGA Prototype (v0.2)

  • AMD VU19P or equivalent
  • Full 256-tile fabric
  • Real syndrome stream from hardware
  • Integration with existing decoder

Phase 3: ASIC Design (v1.0)

  • Custom 256-tile fabric
  • < 250 ns latency target
  • ~100 mW power budget
  • Capable of operating at 4 K

Integration Points

RuVector Components Used

| Component | Purpose |
|---|---|
| ruvector-mincut::SubpolynomialMinCut | O(n^{o(1)}) dynamic cut |
| ruvector-mincut::WitnessTree | Cut certificates |
| cognitum-gate-kernel | Worker tile implementation |
| cognitum-gate-tilezero | Coordinator implementation |
| rvlite | Pattern memory storage |

External Interfaces

| Interface | Protocol | Purpose |
|---|---|---|
| Syndrome input | Streaming binary | Hardware syndrome data |
| Decoder control | gRPC/REST | Switch decoder modes |
| Calibration | gRPC | Trigger targeted calibration |
| Monitoring | Prometheus | Export metrics |
| Audit | Log files / API | Receipt chain export |

Open Questions

  1. Optimal patch size: How many qubits per worker tile?
  2. Overlap band width: How much redundancy at tile boundaries?
  3. Threshold initialization: How to set thresholds for new hardware?
  4. Multi-chip coordination: How to extend to federated systems?
  5. Learning integration: How to update graph model online?

References

  1. El-Hayek, Henzinger, Li. "Dynamic Min-Cut with Subpolynomial Update Time." arXiv:2512.13105, 2025.
  2. Google Quantum AI. "Quantum error correction below the surface code threshold." Nature, 2024.
  3. Riverlane. "Collision Clustering Decoder." Nature Communications, 2025.
  4. RuVector Team. "ADR-001: Anytime-Valid Coherence Gate." 2026.

Appendix A: Latency Analysis

Critical Path Breakdown

Syndrome Arrival        → 0 ns
  │
  ▼ Ring buffer append  → +50 ns
Delta Dispatch
  │
  ▼ Graph update        → +200 ns (amortized O(n^{o(1)}))
Worker Tick
  │
  ▼ Local cut eval      → +500 ns
  ▼ Report generation   → +100 ns
Worker Report Complete
  │
  ▼ Report collection   → +500 ns (parallel from 255 tiles)
TileZero Merge
  │
  ▼ Global cut          → +300 ns
  ▼ Three-filter eval   → +100 ns
Gate Decision
  │
  ▼ Token signing       → +500 ns (Ed25519)
  ▼ Receipt append      → +100 ns
Decision Complete       → ~2,350 ns total

Margin                  → ~1,650 ns (to 4 μs budget)
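
The totals above can be re-derived mechanically. The sketch below sums the nine stage latencies, computes the margin against the 4 μs budget, and also works out how many syndrome rounds arrive while one decision is in flight at a 1 MHz syndrome rate (one round every 1,000 ns). The constant and function names are hypothetical.

```rust
/// Stage latencies in nanoseconds, copied from the critical path breakdown.
pub const CRITICAL_PATH_NS: [(&str, u32); 9] = [
    ("ring buffer append", 50),
    ("graph update", 200),
    ("local cut eval", 500),
    ("report generation", 100),
    ("report collection", 500),
    ("global cut", 300),
    ("three-filter eval", 100),
    ("token signing", 500),
    ("receipt append", 100),
];

/// End-to-end latency of the critical path.
pub fn total_ns() -> u32 {
    CRITICAL_PATH_NS.iter().map(|&(_, ns)| ns).sum()
}

/// Remaining headroom under a latency budget, or None if over budget.
pub fn margin_ns(budget_ns: u32) -> Option<u32> {
    budget_ns.checked_sub(total_ns())
}

/// Number of syndrome rounds that arrive during one decision (rounded up).
/// `round_period_ns` must be non-zero.
pub fn rounds_in_flight(latency_ns: u32, round_period_ns: u32) -> u32 {
    (latency_ns + round_period_ns - 1) / round_period_ns
}
```

Since 2,350 ns spans three 1,000 ns rounds, keeping pace with a 1 MHz stream implies that the stages overlap across rounds; how that pipelining is organized is not specified in this ADR.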

Appendix B: Memory Layout

Worker Tile (64 KB)

0x0000 - 0x7FFF : Patch Graph (32 KB)
  0x0000 - 0x1FFF : Vertex array (512 vertices × 16 bytes)
  0x2000 - 0x5FFF : Edge array (2048 edges × 8 bytes)
  0x6000 - 0x7FFF : Adjacency lists

0x8000 - 0xBFFF : Syndrome Ring (16 KB)
  1024 rounds × 16 bytes per round

0xC000 - 0xCFFF : Evidence Accumulator (4 KB)
  Hypothesis states, log e-values, window stats

0xD000 - 0xEFFF : Local Min-Cut State (8 KB)
  Boundary candidates, cut cache, witness fragments

0xF000 - 0xFFFF : Control (4 KB)
  Delta buffer, report scratch, stack
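
A layout like this is easy to get subtly wrong, so it is worth checking that the regions are contiguous and add up to exactly 64 KB. The sketch below encodes the address ranges from the map above (end addresses inclusive) and verifies them; the names are hypothetical.

```rust
/// (name, first byte, last byte) for each region of the worker tile map.
pub const REGIONS: [(&str, u32, u32); 5] = [
    ("Patch Graph", 0x0000, 0x7FFF),
    ("Syndrome Ring", 0x8000, 0xBFFF),
    ("Evidence Accumulator", 0xC000, 0xCFFF),
    ("Local Min-Cut State", 0xD000, 0xEFFF),
    ("Control", 0xF000, 0xFFFF),
];

/// Size in bytes of region `i` (inclusive address range).
pub fn region_size(i: usize) -> u32 {
    let (_, start, end) = REGIONS[i];
    end - start + 1
}

/// Sum of all region sizes.
pub fn total_bytes() -> u32 {
    (0..REGIONS.len()).map(region_size).sum()
}

/// True if the map starts at 0 and every region begins right after the previous one.
pub fn layout_is_contiguous() -> bool {
    let mut next_start = 0u32;
    for &(_, start, end) in REGIONS.iter() {
        if start != next_start || end < start {
            return false;
        }
        next_start = end + 1;
    }
    true
}
```

The sub-allocations are consistent as well: 512 vertices × 16 bytes fills 0x0000 to 0x1FFF, 2048 edges × 8 bytes fills 0x2000 to 0x5FFF, and 1024 rounds × 16 bytes fills the 16 KB syndrome ring.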

Appendix C: Decision Flow Pseudocode

def gate_evaluate(tile_reports: List[TileReport]) -> GateDecision:
    # Merge reports into supergraph
    supergraph = merge_reports(tile_reports)

    # Filter 1: Structural
    global_cut = supergraph.min_cut()
    if global_cut < THRESHOLD_STRUCTURAL:
        boundary = supergraph.cut_edges()
        return GateDecision.Unsafe(
            quarantine_mask=identify_regions(boundary),
            recovery_mode=RecoveryMode.LocalReset,
            witness=generate_witness(supergraph, boundary)
        )

    # Filter 2: Shift
    shift_pressure = supergraph.aggregate_shift()
    if shift_pressure >= THRESHOLD_SHIFT:
        affected = supergraph.high_shift_regions()
        return GateDecision.Cautious(
            region_mask=affected,
            lead_time=estimate_lead_time(shift_pressure),
            recommendations=[
                Action.IncreaseSyndromeRounds(affected),
                Action.SwitchToConservativeDecoder(affected)
            ]
        )

    # Filter 3: Evidence
    e_value = supergraph.aggregate_evidence()
    if e_value <= THRESHOLD_DENY:
        return GateDecision.Unsafe(...)
    elif e_value < THRESHOLD_PERMIT:
        return GateDecision.Cautious(
            lead_time=evidence_to_lead_time(e_value),
            ...
        )

    # All filters pass
    return GateDecision.Safe(
        region_mask=RegionMask.all(),
        permit_token=sign_permit(supergraph.hash())
    )