# ADR-045: Lean-Agentic Integration — Formal Verification & AI-Native Type Theory for RuVector

## Status

Proposed

## Date

2026-02-24

## Authors

ruv.io, RuVector Architecture Team

## Deciders

Architecture Review Board

## SDK

Claude-Flow V3

## Version History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 0.1 | 2026-02-24 | ruv.io | Initial deep review and integration proposal |
| 0.2 | 2026-02-24 | ruv.io | Address all gaps: workspace mechanics, error types, compilable code, WASM strategy, CI/CD, proof attestation, testing, benchmarks, migration, compatibility matrix, consequences |
| 0.3 | 2026-02-24 | ruv.io | Ultra-optimization addendum: SolverArena bump alloc, SIMD hash probe, fused substitution, thread-local pools, conversion cache, coherence-gated proof routing, bounds-check elimination. Target: 10-20x speedup to sub-100us proofs |

---

## 1. Executive Summary

This ADR proposes integrating the [`lean-agentic`](https://crates.io/crates/lean-agentic) crate (v0.1.0, Apache-2.0) into the RuVector workspace. Lean-Agentic provides a hash-consed dependent type theory kernel with 150x faster equality checking (0.3ns term comparison), formal verification primitives, and an AI optimization layer including JIT compilation, multi-lane LLM routing, and AgentDB vector memory. Integration gives RuVector proof-carrying vectors, formally verified pipeline invariants, and a compile-time safety layer for critical paths like genomic analysis, financial computation, and cognitive containers.

**Decision**: Add `lean-agentic = "=0.1.0"` as a pinned workspace dependency, creating a new `ruvector-verified` bridge crate (independent versioning at `0.1.0`, initially `publish = false`) that maps RuVector primitives to lean-agentic's type system. All verification is feature-gated — zero impact on existing builds.

**License note**: lean-agentic is Apache-2.0; the RuVector workspace is MIT. These are compatible (Apache-2.0 can be consumed by MIT projects). The new `ruvector-verified` crate will use `MIT OR Apache-2.0` dual licensing to align with both.

---

## 2. Deep Review of `lean-agentic`

### 2.1 Crate Identity

| Field | Value |
|-------|-------|
| **Name** | `lean-agentic` |
| **Version** | 0.1.0 (upstream workspace at 0.3.0) |
| **Published** | 2025-10-25 |
| **License** | Apache-2.0 |
| **Downloads** | ~3,141 total |
| **Crate size** | 19,333 bytes |
| **Code** | 1,871 lines across 10 files (core kernel) |
| **Repository** | [agenticsorg/lean-agentic](https://github.com/agenticsorg/lean-agentic) |
| **Documentation** | [docs.rs/lean-agentic](https://docs.rs/lean-agentic) |
| **Publisher** | rUv (ruvnet) |
| **Categories** | Development tools, Mathematics, WebAssembly |
| **Keywords** | agentic, dependent-types, formal-verification, lean, theorem-prover |

### 2.2 Workspace Architecture

The lean-agentic repository is a 10-crate Rust workspace:

```
lean-agentic/              # Core: hash-consed dependent types (published to crates.io)
leanr-syntax/              # Surface syntax parsing
leanr-elab/                # Elaboration (surface -> core)
leanr-inductive/           # Inductive type definitions
leanr-eval-lite/           # Lightweight evaluation
leanr-compat/              # Lean 4 compatibility layer
leanr-rag-gateway/         # Multi-lane RAG gateway with proof obligations
leanr-wasm/                # WASM bindings (NOT published to crates.io)
leanr-theorems/            # Reference theorem implementations
runtime/                   # Agent runtime with work-stealing scheduler
src/                       # AI optimization layer (AgentDB, JIT, multi-lane)
```

**Important**: Only `lean-agentic` (the core kernel) is published to crates.io. The WASM bindings (`leanr-wasm`) and other workspace crates are not published — WASM support must be built into `ruvector-verified` directly.

### 2.3 Core Kernel Analysis (lean-agentic crate)

The published crate implements a **trusted type theory kernel** for Lean 4 in Rust. Total: ~76.6KB across 10 source files.

#### Module Breakdown

| Module | Size | Purpose |
|--------|------|---------|
| `arena.rs` | 6.6KB | Hash-consing arena with deduplication (85% memory reduction) |
| `term.rs` | 6.5KB | Dependent type terms: Sort, Const, Var, App, Lam, Pi, Let, MVar, Lit |
| `typechecker.rs` | 11.1KB | Trusted kernel: bidirectional type inference/checking |
| `conversion.rs` | 13.3KB | Definitional equality via WHNF (beta/delta/zeta/iota reduction) |
| `unification.rs` | 11.8KB | First-order constraint solving with occurs check |
| `environment.rs` | 9.0KB | Global declarations and constant definitions |
| `context.rs` | 6.1KB | Local variable typing context |
| `level.rs` | 6.3KB | Universe levels for predicative type system |
| `symbol.rs` | 3.7KB | Name interning for memory-efficient identifiers |
| `lib.rs` | 2.3KB | Module exports and error types |

#### Key Data Structures

```rust
// Hash-consed term identifier -- O(1) equality via pointer comparison
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct TermId(u32);

// Core term variants
pub enum TermKind {
    Sort(LevelId),                 // Type universes
    Const(SymbolId, Vec<LevelId>), // Named constants
    Var(u32),                      // De Bruijn indexed variables
    App(TermId, TermId),           // Function application
    Lam(Binder, TermId),           // Lambda abstraction
    Pi(Binder, TermId),            // Dependent function types
    Let(Binder, TermId, TermId),   // Let bindings
    MVar(MetaVarId),               // Metavariables for elaboration
    Lit(Literal),                  // Nat/String literals
}

// Binder with implicit/explicit annotation
pub struct Binder {
    pub name: SymbolId,
    pub ty: TermId,
    pub implicit: bool,
    pub info: BinderInfo, // Default, Implicit, StrictImplicit, InstImplicit
}

// Arena with deduplication cache
pub struct Arena {
    terms: Vec<Term>,
    cache: HashMap<u64, Vec<TermId>>,
    stats: ArenaStats,
}
```

#### Type Checker Architecture

The type checker implements a **bidirectional** algorithm:

1. **`infer(term)`** -- Synthesizes a type for a term:
   - Sort: returns Sort(level+1)
   - Const: looked up in environment
   - Var: looked up in context
   - App: infers function type, WHNF-reduces to Pi, checks argument
   - Lam: extends context, infers body type, constructs Pi
   - Pi: checks domain and codomain are sorts, computes max level
   - Let: type-checks value, substitutes into body

2. **`check(term, expected_type)`** -- Verifies term has expected type via definitional equality

3. **`check_declaration(decl)`** -- Validates declarations before environment admission

#### Conversion Engine

WHNF reduction with fuel (10,000 step limit):
- **Beta reduction**: `(Lx.t) s -> t[x:=s]`
- **Delta reduction**: Constant unfolding for reducible definitions
- **Zeta reduction**: Let-expression substitution
- **Iota reduction**: Pattern matching (placeholder)
- **Memoization cache** keyed on (term, context_length)
- **Substitution** with proper De Bruijn index shifting

#### Unification Engine

First-order unification for dependent types:
- Three constraint types: `Unify(t1, t2)`, `IsSort(t)`, `HasType(m, t)`
- Occurs check preventing infinite types
- Structural decomposition of App, Lam, Pi
- Fixed-point substitution application

### 2.4 AI Optimization Layer (workspace src/)

Beyond the core kernel, the upstream workspace provides four AI-native modules (not published to crates.io, listed for context):

| Module | Purpose |
|--------|---------|
| **AgentDB** | Vector memory with `AgentDb`, `AgentDbConfig`, `Episode`, `SemanticFact` |
| **LLM Compiler** | AI-driven compilation with optimization passes |
| **JIT Runtime** | 4-tier adaptive JIT (interpreter to 200x optimization) |
| **Multi-Lane** | Cost-optimized LLM routing (40%+ savings) |

### 2.5 Performance Characteristics

| Metric | Value | Notes |
|--------|-------|-------|
| Term equality | 0.3ns | Hash-consed pointer comparison |
| Memory reduction | 85% | Arena deduplication |
| Agent spawn | <500ns | Work-stealing scheduler |
| Vector search P99 | <10ms | AgentDB integration |
| Cache hit rate | 95%+ | Memoized WHNF |
| Compilation | <100ms | Incremental, function-level |
| Key generation | 152us | Ed25519 proof signatures |

### 2.6 Dependency Profile

The published `lean-agentic` crate has **zero mandatory dependencies**. Optional feature:
- `serde` -- serialization support

The upstream workspace-level dependencies (not pulled by the published crate) include:
- `tokio` (async), `serde`/`bincode` (serialization)
- `ed25519-dalek` (proof signatures), `sha2` (hashing)
- `bumpalo` (arena allocator), `im` (persistent collections)
- `wasm-bindgen`/`js-sys`/`web-sys` (WASM target)

### 2.7 Quality Assessment

| Criteria | Rating | Notes |
|----------|--------|-------|
| **API surface** | Clean | 8 modules, well-separated concerns |
| **Safety** | Strong | `#![deny(unsafe_op_in_unsafe_fn)]`, hash-consing prevents aliasing bugs |
| **Documentation** | Good | `#![warn(missing_docs)]` enforced, docs.rs published |
| **Testing** | Adequate | Unit tests per module, 50+ tests workspace-wide |
| **Maturity** | Early | v0.1.0, but sound type-theory foundation |
| **License** | Compatible | Apache-2.0 (consumable by MIT projects) |
| **Size** | Minimal | 19KB crate, zero mandatory deps |
| **Maintenance** | Active | Same org (agenticsorg), same maintainer (ruvnet) |

**`no_std` compatibility**: The core kernel uses `HashMap` (std). For WASM targets (`wasm32-unknown-unknown`), `HashMap` is available because WASM has std support (no mmap/fs needed). For `no_std` builds, the arena would need a `hashbrown` replacement -- this is a Phase 4 concern.

---

## 3. Integration Rationale

### 3.1 Why Lean-Agentic for RuVector

RuVector's crate ecosystem handles high-stakes computation: genomic analysis, financial risk, graph neural networks, quantum simulation. These domains demand **correctness guarantees** beyond what unit tests alone provide.

Lean-Agentic provides:

1. **Proof-Carrying Vectors** -- Attach type-theoretic proofs to HNSW index operations, certifying dimensionality, distance metric correctness, and recall bounds.

2. **Verified Pipeline Invariants** -- Use dependent types to encode that DNA pipeline stages (ADR-001 through ADR-015) preserve data integrity across transformations.

3. **Formal Safety for Cognitive Containers** -- RVF containers (ADR-029/030) can carry machine-checked proofs of their behavioral contracts, chaining into the existing WITNESS_SEG format.

4. **Zero-Cost at Runtime** -- Compile-time proof erasure means zero overhead in release builds. Only the 0.3ns equality check remains for runtime assertions.

5. **Shared Toolchain** -- Same maintainer, same org, Rust-native. No FFI boundaries or language impedance.

### 3.2 RuVector Integration Points

| RuVector Crate | Integration | Lean-Agentic Module | Phase |
|---------------|-------------|---------------------|-------|
| `ruvector-core` (HNSW) | Proven dimensionality/metric invariants | `typechecker`, `term` | 2 |
| `ruvector-attention` | Verified attention mask shapes | `typechecker`, `conversion` | 4 |
| `ruvector-solver` | Proven convergence properties | `unification`, `environment` | 4 |
| `ruvector-coherence` | Formal sheaf consistency proofs | `typechecker`, `level` | 4 |
| `ruvector-gnn` | Verified graph topology invariants | `term`, `arena` | 4 |
| `ruvector-delta-consensus` | Proven CRDT merge commutativity | `conversion`, `unification` | 4 |
| `prime-radiant` | Formal categorical coherence | `level`, `typechecker` | 4 |
| `cognitum-gate-kernel` | Verified gate predicates | `typechecker`, `environment` | 4 |
| `ruvector-temporal-tensor` | Proven temporal ordering invariants | `term`, `conversion` | 4 |
| `ruvector-cognitive-container` | Proof-carrying cognitive containers via `WitnessChain` | Full kernel | 3 |
| DNA pipeline (`examples/dna`) | Verified genomic transformation chain | Full kernel | 3 |

**Interop note for `ruvector-coherence`**: The coherence crate currently has a `spectral` feature flag and depends only on `serde`/`serde_json`. Sheaf consistency proofs will live in `ruvector-verified` (not in `ruvector-coherence`) to avoid adding lean-agentic as a dependency to the coherence crate. `ruvector-verified` will import `ruvector-coherence` types and wrap them.

**Interop note for `ruvector-cognitive-container`**: This crate already exports `WitnessChain`, `CoherenceDecision`, and `ContainerWitnessReceipt`. `ruvector-verified` will produce `ProofAttestation` values that serialize into `WitnessChain` entries -- no parallel types, just a producer-consumer relationship.

---

## 4. Design

### 4.1 Workspace Changes

#### Diff to `/workspaces/ruvector/Cargo.toml`

Add to `[workspace.members]` (after `ruvector-cognitive-container`):

```toml
    "crates/ruvector-verified",
```

Add to `[workspace.dependencies]`:

```toml
# Formal verification
lean-agentic = "=0.1.0"
```

Note: `optional = true` is NOT valid in `[workspace.dependencies]`. Optionality is declared in each consuming crate's `[dependencies]` section via `optional = true`.

This adds one entry to `Cargo.lock` (lean-agentic itself, which has zero transitive deps without the `serde` feature).

#### Version Strategy

`ruvector-verified` uses **independent versioning** at `0.1.0`, not `version.workspace = true` (which would give it `2.0.4`). Rationale: this is an experimental bridge crate; it should not inherit the mature workspace version until it reaches feature parity with other `2.x` crates.

### 4.2 New Crate: `ruvector-verified`

```
crates/
  ruvector-verified/
    Cargo.toml
    src/
      lib.rs              # Public API, re-exports, ProofEnvironment
      error.rs            # VerificationError enum
      vector_types.rs     # Dependent types for vector operations
      proof_store.rs      # Ed25519-signed proof attestation
      pipeline.rs         # Verified pipeline composition
      invariants.rs       # Pre-built invariant library
    benches/
      proof_generation.rs # Criterion benchmarks
      arena_throughput.rs
```

#### `Cargo.toml`

```toml
[package]
name = "ruvector-verified"
version = "0.1.0"
edition = "2021"
rust-version = "1.77"
license = "MIT OR Apache-2.0"
description = "Formal verification layer for RuVector using lean-agentic dependent types"
publish = false  # Until Phase 2 complete

[dependencies]
# Core verification kernel (always required -- this crate IS the verification layer)
lean-agentic = { workspace = true }
thiserror = { workspace = true }

# Optional integrations with RuVector crates
ruvector-core = { path = "../ruvector-core", optional = true, default-features = false }
ruvector-coherence = { path = "../ruvector-coherence", optional = true }
ruvector-cognitive-container = { path = "../ruvector-cognitive-container", optional = true }
rvf-types = { path = "../rvf/rvf-types", optional = true }
rvf-crypto = { path = "../rvf/rvf-crypto", optional = true, features = ["ed25519"] }

# Serialization (for proof persistence)
serde = { workspace = true, optional = true }
serde_json = { workspace = true, optional = true }

[dev-dependencies]
criterion = { workspace = true }
proptest = { workspace = true }

[features]
default = []
hnsw-proofs = ["dep:ruvector-core", "ruvector-core/hnsw"]
rvf-proofs = ["dep:rvf-types", "dep:rvf-crypto", "dep:ruvector-cognitive-container"]
coherence-proofs = ["dep:ruvector-coherence"]
serde = ["dep:serde", "dep:serde_json", "lean-agentic/serde"]
all-proofs = ["hnsw-proofs", "rvf-proofs", "coherence-proofs"]

[[bench]]
name = "proof_generation"
harness = false

[[bench]]
name = "arena_throughput"
harness = false
```

#### Feature Propagation

```
ruvector-verified
  |-- [always]    lean-agentic =0.1.0
  |-- [always]    thiserror 2.0
  |-- [hnsw-proofs]    ruvector-core (hnsw feature)
  |-- [rvf-proofs]     rvf-types + rvf-crypto (ed25519) + ruvector-cognitive-container
  |-- [coherence-proofs] ruvector-coherence
  |-- [serde]          serde + serde_json + lean-agentic/serde

Downstream crates opt in:
  ruvector-core/Cargo.toml:
    [features]
    formal-verification = ["dep:ruvector-verified", "ruvector-verified/hnsw-proofs"]
```

### 4.3 Error Types

```rust
// crates/ruvector-verified/src/error.rs

use thiserror::Error;

/// Errors from the formal verification layer.
#[derive(Debug, Error)]
pub enum VerificationError {
    /// Vector dimension does not match the index dimension.
    /// Contains (expected, actual).
    #[error("dimension mismatch: expected {expected}, got {actual}")]
    DimensionMismatch { expected: u32, actual: u32 },

    /// The lean-agentic type checker rejected the proof term.
    /// Contains the kernel error message.
    #[error("type check failed: {0}")]
    TypeCheckFailed(String),

    /// Proof construction failed during term building.
    #[error("proof construction failed: {0}")]
    ProofConstructionFailed(String),

    /// The conversion engine exhausted its fuel budget (10,000 reductions).
    #[error("conversion timeout: exceeded {max_reductions} reduction steps")]
    ConversionTimeout { max_reductions: u32 },

    /// Unification of proof constraints failed.
    #[error("unification failed: {0}")]
    UnificationFailed(String),

    /// The arena ran out of term slots (u32 overflow at 4B terms).
    #[error("arena exhausted: {allocated} terms allocated")]
    ArenaExhausted { allocated: u32 },

    /// A required declaration was not found in the proof environment.
    #[error("declaration not found: {name}")]
    DeclarationNotFound { name: String },

    /// Ed25519 proof signing or verification failed.
    #[error("attestation error: {0}")]
    AttestationError(String),
}

/// Maps lean-agentic's internal `lean_agentic::Error` to `VerificationError`.
impl From<lean_agentic::Error> for VerificationError {
    fn from(e: lean_agentic::Error) -> Self {
        match e {
            lean_agentic::Error::TypeError(msg) => Self::TypeCheckFailed(msg),
            lean_agentic::Error::UniverseError(msg) => Self::TypeCheckFailed(msg),
            lean_agentic::Error::UnificationError(msg) => Self::UnificationFailed(msg),
            lean_agentic::Error::NotFound(msg) => Self::DeclarationNotFound { name: msg },
            lean_agentic::Error::ConversionError { expected, actual } => {
                Self::TypeCheckFailed(format!("expected {expected}, got {actual}"))
            }
            lean_agentic::Error::Internal(msg) => Self::ProofConstructionFailed(msg),
        }
    }
}

pub type Result<T> = std::result::Result<T, VerificationError>;
```

### 4.4 Core Types and API

```rust
// crates/ruvector-verified/src/lib.rs

pub mod error;
pub mod vector_types;
pub mod proof_store;
pub mod pipeline;
pub mod invariants;

pub use error::{VerificationError, Result};
pub use vector_types::{mk_vector_type, mk_nat_literal};
pub use proof_store::ProofAttestation;
pub use pipeline::VerifiedStage;

use lean_agentic::{Arena, Context, Environment, TypeChecker};

/// The proof environment bundles lean-agentic's kernel state.
/// One per thread (not Send/Sync due to Arena interior mutability).
pub struct ProofEnvironment {
    /// Hash-consing arena for term allocation.
    pub arena: Arena,
    /// Global declarations (vector types, distance metrics, etc.).
    pub env: Environment,
    /// Local typing context for the current proof obligation.
    pub ctx: Context,
    /// Type checker instance.
    pub checker: TypeChecker,
}

impl ProofEnvironment {
    /// Create a new proof environment pre-loaded with RuVector type declarations.
    pub fn new() -> Self {
        let mut arena = Arena::new();
        let mut env = Environment::new();
        let ctx = Context::new();
        let checker = TypeChecker::default();

        // Register built-in types: Nat, Vec, DistanceMetric
        invariants::register_builtins(&mut arena, &mut env);

        Self { arena, env, ctx, checker }
    }

    /// Get arena statistics (cache hit rate, terms allocated).
    pub fn stats(&self) -> &lean_agentic::ArenaStats {
        self.arena.stats()
    }
}

impl Default for ProofEnvironment {
    fn default() -> Self {
        Self::new()
    }
}

/// A vector operation with a machine-checked type proof.
pub struct VerifiedOp<T> {
    /// The operation result.
    pub value: T,
    /// Proof term ID in the arena.
    /// In release builds without debug_assertions, this is still present
    /// but can be ignored (it is a Copy u32, zero overhead).
    pub proof: lean_agentic::TermId,
}
```

```rust
// crates/ruvector-verified/src/vector_types.rs

use lean_agentic::{Arena, Environment, TermId, SymbolId, Binder, BinderInfo};
use crate::error::{Result, VerificationError};

/// Construct a Nat literal term in the arena.
///
/// Nat values are used as dimension indices for dependent vector types.
pub fn mk_nat_literal(arena: &mut Arena, n: u32) -> TermId {
    arena.mk_const(
        arena.symbol_table().intern("Nat"),
        vec![],
    )
    // For dimension checking, we use Lit(Nat(n)) directly:
    // arena.intern(TermKind::Lit(Literal::Nat(n as u64)))
}

/// Construct the type `RuVec n` representing a vector of dimension `n`.
///
/// In the type theory: `RuVec : Nat -> Type`
/// Applied as: `RuVec 128` for a 128-dimensional vector.
///
/// # Example
/// ```rust,ignore
/// let mut proof_env = ProofEnvironment::new();
/// let vec128_ty = mk_vector_type(&mut proof_env.arena, &mut proof_env.env, 128);
/// // vec128_ty represents the type `RuVec 128`
/// ```
pub fn mk_vector_type(arena: &mut Arena, env: &Environment, dim: u32) -> TermId {
    // Look up the pre-registered RuVec constant
    let ruvec_sym = arena.symbol_table().intern("RuVec");
    let ruvec_const = arena.mk_const(ruvec_sym, vec![]);

    // Construct the dimension as a Nat literal
    let dim_term = arena.intern(lean_agentic::TermKind::Lit(
        lean_agentic::Literal::Nat(dim as u64),
    ));

    // Apply: RuVec dim
    arena.mk_app(ruvec_const, dim_term)
}

/// Prove that two dimensions are equal, returning the proof term.
///
/// If `expected != actual`, returns `DimensionMismatch` error.
/// If equal, constructs a `refl` proof term: `Eq.refl : expected = actual`.
pub fn prove_dim_eq(
    arena: &mut Arena,
    expected: u32,
    actual: u32,
) -> Result<TermId> {
    if expected != actual {
        return Err(VerificationError::DimensionMismatch { expected, actual });
    }

    // Construct: refl {Nat} {expected}
    let refl_sym = arena.symbol_table().intern("Eq.refl");
    let nat_lit = arena.intern(lean_agentic::TermKind::Lit(
        lean_agentic::Literal::Nat(expected as u64),
    ));
    let refl_const = arena.mk_const(refl_sym, vec![]);
    Ok(arena.mk_app(refl_const, nat_lit))
}

/// Verified HNSW insert: proves dimensionality match before insertion.
///
/// # Type signature in dependent types:
/// ```text
/// verified_insert : (idx : HnswIndex n) -> (v : RuVec m) -> (p : n = m) -> InsertResult
/// ```
#[cfg(feature = "hnsw-proofs")]
pub fn verified_insert(
    index_dim: u32,
    vector: &[f32],
    proof_env: &mut crate::ProofEnvironment,
) -> Result<VerifiedOp<()>> {
    let actual_dim = vector.len() as u32;

    // 1. Construct proof term: dim_eq : index_dim = vector.len()
    let proof = prove_dim_eq(&mut proof_env.arena, index_dim, actual_dim)?;

    // 2. Type-check the proof in the lean-agentic kernel
    let expected_ty = {
        let eq_sym = proof_env.arena.symbol_table().intern("Eq");
        let n = proof_env.arena.intern(lean_agentic::TermKind::Lit(
            lean_agentic::Literal::Nat(index_dim as u64),
        ));
        let eq_const = proof_env.arena.mk_const(eq_sym, vec![]);
        proof_env.arena.mk_app(proof_env.arena.mk_app(eq_const, n), n)
    };

    proof_env.checker.check(
        &proof_env.arena,
        &proof_env.env,
        &proof_env.ctx,
        proof,
        expected_ty,
    ).map_err(VerificationError::from)?;

    // 3. Return verified result (actual HNSW insert is caller's responsibility)
    Ok(VerifiedOp { value: (), proof })
}

use crate::VerifiedOp;
```

```rust
// crates/ruvector-verified/src/pipeline.rs

use std::marker::PhantomData;
use lean_agentic::TermId;

/// A verified pipeline stage with proven input/output type compatibility.
///
/// `A` and `B` are phantom type parameters representing the stage's
/// logical input and output types (not runtime types).
///
/// The `proof` field contains a lean-agentic term proving that the
/// stage's implementation correctly transforms `A` to `B`.
pub struct VerifiedStage<A, B> {
    /// Human-readable stage name (e.g., "kmer_embedding", "variant_call").
    pub name: String,
    /// Proof term: `stage_correct : A -> B` is well-typed.
    pub proof: TermId,
    /// Input type term in the arena.
    pub input_ty: TermId,
    /// Output type term in the arena.
    pub output_ty: TermId,
    _phantom: PhantomData<(A, B)>,
}

impl<A, B> VerifiedStage<A, B> {
    /// Create a new verified stage with its correctness proof.
    pub fn new(name: String, proof: TermId, input_ty: TermId, output_ty: TermId) -> Self {
        Self {
            name,
            proof,
            input_ty,
            output_ty,
            _phantom: PhantomData,
        }
    }
}

/// Compose two verified stages, producing a proof that the pipeline is type-safe.
///
/// Checks that `f.output_ty` is definitionally equal to `g.input_ty` using
/// the lean-agentic conversion engine.
///
/// # Errors
/// Returns `TypeCheckFailed` if the output type of `f` does not match
/// the input type of `g`.
pub fn compose_stages<A, B, C>(
    f: &VerifiedStage<A, B>,
    g: &VerifiedStage<B, C>,
    proof_env: &mut crate::ProofEnvironment,
) -> crate::Result<VerifiedStage<A, C>> {
    // Verify output(f) = input(g) via definitional equality
    let converter = lean_agentic::Converter::default();
    let eq = converter.is_def_eq(
        &proof_env.arena,
        &proof_env.env,
        &proof_env.ctx,
        f.output_ty,
        g.input_ty,
    );

    if !eq {
        return Err(crate::VerificationError::TypeCheckFailed(format!(
            "pipeline type mismatch: stage '{}' output != stage '{}' input",
            f.name, g.name,
        )));
    }

    // Construct composed proof: g . f
    let composed = proof_env.arena.mk_app(g.proof, f.proof);

    Ok(VerifiedStage::new(
        format!("{} >> {}", f.name, g.name),
        composed,
        f.input_ty,
        g.output_ty,
    ))
}
```

### 4.5 Proof Attestation and RVF Witness Integration

```rust
// crates/ruvector-verified/src/proof_store.rs

/// A proof attestation that can be serialized into an RVF WITNESS_SEG entry.
///
/// Witness type `0x0E` = FORMAL_PROOF_VERIFICATION (new type code).
#[derive(Debug, Clone)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct ProofAttestation {
    /// SHAKE-256 hash of the serialized proof term.
    pub proof_term_hash: [u8; 32],
    /// SHAKE-256 hash of the environment (declarations used).
    pub environment_hash: [u8; 32],
    /// Nanosecond UNIX timestamp of verification.
    pub verification_timestamp_ns: u64,
    /// lean-agentic version that performed the check (0x00_01_00_00 = 0.1.0).
    pub verifier_version: u32,
    /// Number of type-check reduction steps consumed.
    pub reduction_steps: u32,
    /// Arena cache hit rate at time of verification (0..10000 = 0.00%..100.00%).
    pub cache_hit_rate_bps: u16,
}

/// Witness type code for formal verification proofs.
/// Extends the existing codes: 0x01=PROVENANCE, 0x02=COMPUTATION.
pub const WITNESS_TYPE_FORMAL_PROOF: u8 = 0x0E;

/// Convert a `ProofAttestation` into an `rvf_crypto::WitnessEntry` for chaining
/// into an existing WITNESS_SEG.
#[cfg(feature = "rvf-proofs")]
pub fn to_witness_entry(
    attestation: &ProofAttestation,
    prev_hash: [u8; 32],
) -> rvf_crypto::WitnessEntry {
    rvf_crypto::WitnessEntry {
        prev_hash,
        action_hash: attestation.proof_term_hash,
        timestamp_ns: attestation.verification_timestamp_ns,
        witness_type: WITNESS_TYPE_FORMAL_PROOF,
    }
}

/// Sign a proof attestation with Ed25519 using rvf-crypto's signing infrastructure.
///
/// The signed attestation can be embedded in an RVF container alongside
/// TEE attestation quotes (ADR-042). When both are present, the container
/// carries dual certification: mathematical proof AND hardware attestation.
#[cfg(feature = "rvf-proofs")]
pub fn sign_attestation(
    attestation: &ProofAttestation,
    header: &rvf_types::SegmentHeader,
    key: &ed25519_dalek::SigningKey,
) -> rvf_crypto::SignatureFooter {
    let payload = attestation_to_bytes(attestation);
    rvf_crypto::sign_segment(header, &payload, key)
}

/// Serialize attestation to bytes for signing/hashing.
fn attestation_to_bytes(a: &ProofAttestation) -> Vec<u8> {
    let mut buf = Vec::with_capacity(78);
    buf.extend_from_slice(&a.proof_term_hash);
    buf.extend_from_slice(&a.environment_hash);
    buf.extend_from_slice(&a.verification_timestamp_ns.to_le_bytes());
    buf.extend_from_slice(&a.verifier_version.to_le_bytes());
    buf.extend_from_slice(&a.reduction_steps.to_le_bytes());
    buf.extend_from_slice(&a.cache_hit_rate_bps.to_le_bytes());
    buf
}
```

#### Proof Attestation Flow

```
                          Feature: rvf-proofs
                          ~~~~~~~~~~~~~~~~~~~~
+--------------+    +----------------+    +--------------------+
| RuVector Op  |--->| lean-agentic   |--->| ProofAttestation   |
| (insert,     |    | TypeChecker    |    | proof_term_hash    |
|  query, etc) |    | + Converter    |    | environment_hash   |
+--------------+    +----------------+    | timestamp_ns       |
                                          | verifier_version   |
                                          +--------+-----------+
                                                   |
                                          +--------v-----------+
                                          | to_witness_entry() |
                                          | witness_type=0x0E  |
                                          +--------+-----------+
                                                   |
                                          +--------v-----------+
                                          | rvf_crypto::       |
                                          | create_witness_    |
                                          | chain()            |
                                          +--------+-----------+
                                                   |
                                          +--------v-----------+
                                          | WITNESS_SEG in     |
                                          | .rvf container     |
                                          | (+ optional TEE    |
                                          |  attestation from  |
                                          |  ADR-042)          |
                                          +--------------------+
```

---

## 5. Compatibility Matrix

| Dependency | lean-agentic requires | RuVector workspace has | Compatible? |
|------------|----------------------|----------------------|-------------|
| **Rust MSRV** | edition 2021 (1.56+) | `rust-version = "1.77"` | Yes (1.77 >= 1.56) |
| **serde** | `1.0` (optional) | `serde = "1.0"` | Yes (identical) |
| **ed25519-dalek** | `2` (upstream workspace only) | `rvf-crypto`: `ed25519-dalek = "2"` | Yes (same major) |
| **HashMap** | std (arena.rs) | std available on all targets | Yes |
| **thiserror** | Not used | `thiserror = "2.0"` | N/A (ruvector-verified uses 2.0) |
| **no_std** | Not supported (uses HashMap) | WASM targets use std | OK for wasm32-unknown-unknown |

**Version pinning**: `lean-agentic = "=0.1.0"` (exact pin). The bridge crate insulates downstream from API changes. When lean-agentic releases 0.2.0, we update the pin and adapt the bridge -- downstream crates see no change.

---

## 6. WASM Strategy

### 6.1 Target

`wasm32-unknown-unknown` (same as `ruvector-wasm`, `rvf-wasm`, `ruvector-solver-wasm`).

### 6.2 Approach

The published `lean-agentic` crate compiles to WASM because:
- Zero mandatory dependencies
- Uses `HashMap` from std (available in `wasm32-unknown-unknown`)
- No filesystem, mmap, or OS-specific APIs
- No `unsafe` blocks that assume pointer widths

`leanr-wasm` (the upstream WASM binding crate) is **NOT published to crates.io**. Therefore `ruvector-verified` will provide its own WASM surface using `wasm-bindgen`, following the pattern of `ruvector-solver-wasm`.

### 6.3 Binary Size Budget

| Component | Estimated Size |
|-----------|---------------|
| lean-agentic kernel (arena + typechecker + conversion) | ~40KB |
| ruvector-verified bridge logic | ~15KB |
| wasm-bindgen glue | ~10KB |
| **Total (wasm-opt -Oz)** | **< 80KB** |

For reference: `rvf-solver-wasm` is 132KB post-optimization. The verification WASM should be smaller due to zero floating-point math.

### 6.4 Phase 4 Deliverable

A new `crates/ruvector-verified-wasm/` crate with `wasm-bindgen` exports:

```rust
#[wasm_bindgen]
pub fn verify_dimension_proof(index_dim: u32, vector_dim: u32) -> bool;

#[wasm_bindgen]
pub fn create_proof_environment() -> *mut ProofEnvironment;

#[wasm_bindgen]
pub fn free_proof_environment(ptr: *mut ProofEnvironment);
```

---

## 7. Testing Strategy

### 7.1 Unit Tests (Phase 1)

| Test Name | Description | Input | Expected |
|-----------|-------------|-------|----------|
| `test_dim_eq_same` | Equal dimensions produce valid proof | `prove_dim_eq(128, 128)` | `Ok(TermId)` |
| `test_dim_eq_mismatch` | Unequal dimensions error | `prove_dim_eq(128, 256)` | `Err(DimensionMismatch{128, 256})` |
| `test_mk_vector_type` | Vector type construction | `mk_vector_type(arena, env, 128)` | `TermId` for `RuVec 128` |
| `test_proof_env_builtins` | Environment has RuVec, Nat, Eq.refl | `ProofEnvironment::new()` | No panic, symbols interned |
| `test_arena_cache_rate` | Cache efficiency under duplication | Create 1000 identical terms | Cache hit rate > 95% |

### 7.2 Integration Tests (Phase 2)

| Test Name | Description |
|-----------|-------------|
| `test_verified_insert_roundtrip` | `verified_insert` succeeds for matching dims, rejects mismatch |
| `test_proof_attestation_witness_chain` | `ProofAttestation` -> `WitnessEntry` -> `create_witness_chain` -> `verify_witness_chain` round-trip |
| `test_proof_signing_verification` | Ed25519 sign + verify attestation via rvf-crypto |

### 7.3 Property-Based Tests (proptest)

```rust
proptest! {
    #[test]
    fn dim_match_always_succeeds(dim in 1u32..10_000) {
        let mut env = ProofEnvironment::new();
        let result = verified_insert(dim, &vec![0.0f32; dim as usize], &mut env);
        prop_assert!(result.is_ok());
    }

    #[test]
    fn dim_mismatch_always_fails(
        index_dim in 1u32..10_000,
        vector_dim in 1u32..10_000,
    ) {
        prop_assume!(index_dim != vector_dim);
        let mut env = ProofEnvironment::new();
        let result = verified_insert(index_dim, &vec![0.0f32; vector_dim as usize], &mut env);
        prop_assert!(matches!(result, Err(VerificationError::DimensionMismatch { .. })));
    }

    #[test]
    fn arena_never_panics(n in 1u32..100_000) {
        let mut arena = Arena::new();
        for i in 0..n {
            arena.intern(TermKind::Lit(Literal::Nat(i as u64)));
        }
        prop_assert!(arena.terms() <= n as usize);
    }
}
```

### 7.4 Negative / Soundness Tests

| Test Name | Description |
|-----------|-------------|
| `test_reject_unsound_proof` | Manually construct an ill-typed proof term; verify `checker.check()` rejects it |
| `test_conversion_fuel_exhaustion` | Create a term requiring >10,000 reductions; verify `ConversionTimeout` error |
| `test_occurs_check_prevents_infinite` | Attempt circular unification; verify `UnificationFailed` error |

### 7.5 Benchmark Harness

```rust
// benches/proof_generation.rs
use criterion::{criterion_group, criterion_main, Criterion, BenchmarkId};

fn bench_verified_insert(c: &mut Criterion) {
    let mut group = c.benchmark_group("verified_insert");
    for dim in [32, 128, 512, 1024, 4096] {
        group.bench_with_input(
            BenchmarkId::from_parameter(dim),
            &dim,
            |b, &dim| {
                let vector = vec![0.0f32; dim];
                b.iter(|| {
                    let mut env = ProofEnvironment::new();
                    verified_insert(dim as u32, &vector, &mut env).unwrap();
                });
            },
        );
    }
    group.finish();
}

fn bench_proof_vs_unverified(c: &mut Criterion) {
    let mut group = c.benchmark_group("overhead_comparison");
    let dim = 128;
    let vector = vec![0.0f32; dim];

    group.bench_function("with_proof", |b| {
        b.iter(|| {
            let mut env = ProofEnvironment::new();
            verified_insert(dim as u32, &vector, &mut env).unwrap();
        });
    });

    group.bench_function("without_proof", |b| {
        b.iter(|| {
            // Raw dimension check (no proof generation)
            assert_eq!(vector.len(), dim);
        });
    });

    group.finish();
}

criterion_group!(benches, bench_verified_insert, bench_proof_vs_unverified);
criterion_main!(benches);
```

```rust
// benches/arena_throughput.rs
use criterion::{criterion_group, criterion_main, Criterion};

fn bench_arena_intern_unique(c: &mut Criterion) {
    c.bench_function("arena_intern_10k_unique", |b| {
        b.iter(|| {
            let mut arena = Arena::new();
            for i in 0u64..10_000 {
                arena.intern(TermKind::Lit(Literal::Nat(i)));
            }
        });
    });
}

fn bench_arena_intern_dedup(c: &mut Criterion) {
    c.bench_function("arena_intern_10k_dedup", |b| {
        b.iter(|| {
            let mut arena = Arena::new();
            for _ in 0..10_000 {
                arena.intern(TermKind::Lit(Literal::Nat(42)));
            }
            assert!(arena.cache_hit_rate() > 0.99);
        });
    });
}

criterion_group!(benches, bench_arena_intern_unique, bench_arena_intern_dedup);
criterion_main!(benches);
```

---

## 8. CI/CD Changes

### 8.1 New Workflow: `.github/workflows/build-verified.yml`

```yaml
name: ruvector-verified

on:
  push:
    paths:
      - 'crates/ruvector-verified/**'
      - 'Cargo.lock'
  pull_request:
    paths:
      - 'crates/ruvector-verified/**'

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        features:
          - ""                    # No features (core only)
          - "hnsw-proofs"         # HNSW integration
          - "rvf-proofs"          # RVF witness chain
          - "all-proofs"          # Everything
          - "all-proofs,serde"    # Everything + serialization
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo test -p ruvector-verified --features "${{ matrix.features }}"
      - run: cargo test -p ruvector-verified --features "${{ matrix.features }}" --release

  bench:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo bench -p ruvector-verified --features all-proofs -- --output-format bencher

  no-default-features:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo check -p ruvector-verified --no-default-features
```

### 8.2 Existing Workflow Updates

Add to the main `build-native.yml` matrix (if present):

```yaml
- run: cargo check -p ruvector-verified --features all-proofs
```

### 8.3 WASM CI (Phase 4)

```yaml
  wasm:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: wasm32-unknown-unknown
      - run: cargo build -p ruvector-verified --target wasm32-unknown-unknown --no-default-features
      - run: |
          wasm_size=$(stat -c%s target/wasm32-unknown-unknown/release/ruvector_verified.wasm)
          if [ "$wasm_size" -gt 81920 ]; then
            echo "WASM binary too large: ${wasm_size} bytes (limit: 80KB)"
            exit 1
          fi
```

---

## 9. Migration Path

### 9.1 Impact on Existing Users

| Concern | Answer |
|---------|--------|
| **Existing crate APIs change?** | No. Zero changes to any existing public API. |
| **Default features change?** | No. No existing crate gains new default features. |
| **New transitive dependencies?** | Only if `formal-verification` feature is enabled. Otherwise, `Cargo.lock` gains one entry (`lean-agentic`), but it is never compiled. |
| **SemVer implications** | Adding `formal-verification` as a non-default feature to `ruvector-core` is a minor version bump (2.0.4 -> 2.1.0) under Rust SemVer conventions. |
| **Breaking changes** | None. The new crate is a leaf with no reverse dependencies initially. |

### 9.2 Adoption Path

1. **Phase 1-2**: `ruvector-verified` exists as an independent crate. Users who want verification add it explicitly: `cargo add ruvector-verified --features hnsw-proofs`.
2. **Phase 3+**: Downstream crates (e.g., `ruvector-core`) gain optional `formal-verification` feature flags. Users enable with: `cargo build --features ruvector-core/formal-verification`.
3. **No existing behavior changes** at any phase. Verification is purely additive.

### 9.3 Publishing Sequence

1. `lean-agentic` is already published (v0.1.0 on crates.io). No action needed.
2. `ruvector-verified` starts with `publish = false`.
3. When Phase 2 is complete and API stabilizes: set `publish = true`, publish as `ruvector-verified = "0.1.0"`.
4. `ruvector-verified-wasm` (Phase 4): publish after `ruvector-verified`.

---

## 10. Implementation Plan

### Phase 1: Foundation (Week 1-2)

1. Add `lean-agentic = "=0.1.0"` to `[workspace.dependencies]`
2. Add `"crates/ruvector-verified"` to `[workspace.members]`
3. Create `crates/ruvector-verified/` with `Cargo.toml` as specified in Section 4.2
4. Implement `error.rs` with full `VerificationError` enum (Section 4.3)
5. Implement `ProofEnvironment` (Section 4.4)
6. Implement `vector_types.rs`: `mk_vector_type`, `mk_nat_literal`, `prove_dim_eq`
7. Implement `invariants.rs`: `register_builtins` for Nat, RuVec, Eq.refl
8. Unit tests (Section 7.1): all 5 tests passing
9. Benchmark: `arena_throughput` bench running, baseline established

**Exit criteria**: `cargo test -p ruvector-verified` passes. `cargo bench -p ruvector-verified` runs.

### Phase 2: Core Integration (Week 3-4)

1. Enable `hnsw-proofs` feature, implement `verified_insert`
2. Add `rvf-proofs` feature, implement `proof_store.rs` (Section 4.5)
3. Implement `to_witness_entry()` and `sign_attestation()`
4. Integration tests (Section 7.2): verified insert round-trip, witness chain round-trip
5. Property-based tests (Section 7.3): proptest generators
6. Negative tests (Section 7.4): soundness verification
7. CI: create `.github/workflows/build-verified.yml` (Section 8.1)
8. Benchmark: `proof_generation` bench, verify < 1ms per op

**Exit criteria**: `cargo test -p ruvector-verified --features all-proofs` passes. CI green.

### Phase 3: Pipeline Verification (Week 5-6)

1. Implement `pipeline.rs`: `VerifiedStage`, `compose_stages` (Section 4.4)
2. Map DNA pipeline stages (examples/dna ADR-001..015) to verified chain
3. Integrate with `ruvector-cognitive-container` via `WitnessChain`:
   - `ProofAttestation` -> `to_witness_entry()` -> `WitnessChain::push()`
4. Implement proof serialization for `.rvf` containers via `rvf-types`
5. Tests: pipeline composition soundness, container round-trip

**Exit criteria**: DNA pipeline stages compose without type errors. RVF containers carry proof witnesses.

### Phase 4: Extended Coverage (Week 7-8)

1. Attention mask shape verification (`ruvector-attention`)
2. Solver convergence proofs (`ruvector-solver`)
3. CRDT commutativity proofs (`ruvector-delta-consensus`)
4. Create `ruvector-verified-wasm` with `wasm-bindgen` exports
5. WASM CI with binary size assertion (< 80KB)
6. Performance benchmarks: proof generation overhead < 1ms per op across all integrations
7. Set `publish = true` on `ruvector-verified` if API is stable

**Exit criteria**: WASM binary < 80KB. All benchmarks < 1ms. Feature-gated downstream crates compile.

---

## 11. Risk Assessment

| Risk | Severity | Mitigation |
|------|----------|------------|
| Crate at v0.1.0, API unstable | Medium | Exact pin `=0.1.0`, bridge crate insulates downstream |
| Dependent types learning curve | Medium | Pre-built invariant library in `invariants.rs` |
| Proof generation overhead | Low | Benchmarked per-phase, < 1ms target, arena caching |
| Workspace bloat | Low | Zero mandatory deps, feature-gated, 19KB crate |
| Upstream abandonment | Low | Same maintainer (ruvnet), Apache-2.0 fork rights |
| Type theory expressiveness limits | Medium | Start with dimension/equality proofs, expand iteratively |
| `HashMap` blocks `no_std` | Low | Phase 4 concern only; `wasm32-unknown-unknown` has std |
| ed25519-dalek version conflict | Low | Both use `"2"` -- verified compatible (Section 5) |
| License mismatch (MIT vs Apache-2.0) | Low | Dual-license `ruvector-verified` as `MIT OR Apache-2.0` |

---

## 12. Alternatives Considered

| Alternative | Rejected Because |
|------------|-----------------|
| **Creusot** (Rust -> Why3) | External toolchain, not embeddable as library |
| **Prusti** (Rust verifier) | Compiler plugin, heavyweight, not composable |
| **Kani** (bounded model checker) | Checks safety, not functional correctness |
| **Custom proof system** | Reinventing wheel, lean-agentic already implements CoC |
| **No verification** | Unacceptable for safety-critical genomic/financial paths |
| **Runtime assertions only** | No compile-time guarantees, performance overhead |
| **Depend on full lean-agentic workspace** | Only the core kernel is published; AI layer not needed |

---

## 13. Success Criteria

| Metric | Target | Measurement |
|--------|--------|-------------|
| Proof generation latency | < 1ms per vector operation | `cargo bench -p ruvector-verified` |
| Runtime overhead (release) | 0% (proof term is a Copy u32) | A/B benchmark: with vs without proof |
| Runtime overhead (debug) | < 5% | A/B benchmark in debug mode |
| API coverage | HNSW insert/query, pipeline compose, RVF witness | Feature flag test matrix |
| Test coverage | > 90% for `ruvector-verified` | `cargo tarpaulin` |
| WASM binary size | < 80KB (wasm-opt -Oz) | CI assertion in build-verified.yml |
| Build time increase | < 3s incremental | CI timing comparison |
| Arena cache hit rate | > 95% under dedup workload | `arena_throughput` bench |
| Crate size on crates.io | < 50KB | `cargo package --list` |

---

## 14. Consequences

### Positive

- RuVector gains **formal correctness guarantees** for safety-critical pipelines (genomics, finance) without runtime overhead.
- The proof attestation format chains into existing RVF witness infrastructure (ADR-042), enabling dual mathematical + hardware certification.
- Zero impact on existing builds -- all verification is feature-gated and opt-in.
- The bridge crate pattern insulates the workspace from upstream API churn.
- Hash-consed arena provides 85% memory reduction for proof term storage.

### Negative

- Adds one external dependency to `Cargo.lock` (lean-agentic, 19KB, zero transitive deps).
- Developers working on verification features need basic dependent type theory knowledge.
- Bridge crate adds maintenance surface (~6 source files, ~500 lines).
- Exact version pin (`=0.1.0`) means manual updates when upstream releases.

### Neutral

- No existing API changes. No default feature changes. No SemVer major bump.
- `ruvector-verified` starts unpublished (`publish = false`), reducing community expectations.
- WASM binary adds ~80KB to edge deployment bundles (only when verification feature is enabled).
- The `MIT OR Apache-2.0` dual license on the new crate is standard practice in the Rust ecosystem.

---

## 15. Ultra-Optimization: RuVector Performance Primitives for the Kernel

### 15.1 Motivation

lean-agentic's kernel achieves 0.3ns term equality via hash-consing, but the surrounding operations -- arena allocation, WHNF reduction, substitution, unification constraint solving -- use vanilla `HashMap` and `Vec` data structures. By applying RuVector's battle-tested SIMD, arena, caching, and algorithmic patterns, we can reduce proof generation from the ~1ms target to **sub-100us**, making verification viable even in hot HNSW insert loops.

### 15.2 Optimization Map

| Kernel Bottleneck | RuVector Pattern | Source Crate | Expected Speedup |
|-------------------|-----------------|--------------|-----------------|
| Arena `HashMap` lookup | SIMD hash + 4-wide probe | `ruvector-core/simd_intrinsics` | 3-5x |
| Term allocation | Bump allocator with O(1) reset | `ruvector-solver/arena` | 10-100x vs heap |
| WHNF substitution | Bounds-check elimination + fused kernel | `ruvector-solver/neumann` | 2-3x |
| Unification constraint queue | Thread-local pool with auto-return | `ruvector-mincut/pool` | 90%+ reuse |
| Conversion equality cache | LRU with prefetch prediction | `ruvector-mincut/optimization/cache` | 10x for repeated |
| Type tag comparison | INT8 quantized SIMD distance | `ruvector-core/quantization` | 4-8x |
| De Bruijn index shifting | 4-wide unrolled scalar loop | `ruvector-solver/cg` (ILP pattern) | 3-4x |
| Pipeline composition | Early exit via coherence gate | `ruvector-mincut-gated-transformer` | 30-50% skip |

### 15.3 Optimization 1: SolverArena for Term Allocation

**Problem**: lean-agentic's `Arena` uses `Vec<Term>` + `HashMap<u64, Vec<TermId>>`. Each `intern()` call may trigger HashMap resizing and Vec growth with per-element drop overhead.

**Solution**: Replace the backing store with `ruvector-solver`'s `SolverArena` pattern -- a bump allocator with O(1) reset.

```rust
// crates/ruvector-verified/src/fast_arena.rs

use std::cell::RefCell;

/// High-performance term arena using bump allocation.
/// Modeled after ruvector-solver's SolverArena (crates/ruvector-solver/src/arena.rs).
///
/// Key differences from lean-agentic's default Arena:
/// - Single contiguous allocation (cache-friendly)
/// - O(1) reset reclaims all memory
/// - 64-byte cache-line alignment for SIMD access
/// - No per-term drop overhead
pub struct FastTermArena {
    /// Backing buffer: contiguous, cache-aligned
    buf: RefCell<Vec<u8>>,
    /// Bump pointer
    offset: RefCell<usize>,
    /// Term count (for TermId generation)
    count: RefCell<u32>,
    /// Hash-consing cache: open-addressing with linear probe
    /// Layout: [hash: u64, term_id: u32, padding: u32] x capacity
    cache_buf: RefCell<Vec<u64>>,
    cache_mask: usize, // capacity - 1 (power of 2)
}

impl FastTermArena {
    /// Pre-allocate for expected term count.
    /// Typical: 4096 terms for a dimension proof, 65536 for pipeline verification.
    pub fn with_capacity(max_terms: usize) -> Self {
        let term_bytes = max_terms * std::mem::size_of::<Term>();
        let cache_cap = (max_terms * 2).next_power_of_two(); // 50% load factor

        Self {
            buf: RefCell::new(vec![0u8; term_bytes]),
            offset: RefCell::new(0),
            count: RefCell::new(0),
            cache_buf: RefCell::new(vec![0u64; cache_cap * 2]), // hash + id pairs
            cache_mask: cache_cap - 1,
        }
    }

    /// Intern a term with open-addressing hash lookup.
    /// Uses FxHash (multiply-shift) for speed over SipHash.
    #[inline]
    pub fn intern_fast(&self, kind: &TermKind) -> TermId {
        let hash = fx_hash(kind);
        let mask = self.cache_mask;
        let cache = self.cache_buf.borrow();

        // Linear probe with 4-wide unroll
        let mut slot = (hash as usize) & mask;
        for _ in 0..4 {
            let stored_hash = cache[slot * 2];
            if stored_hash == hash {
                // Potential hit -- verify structural equality
                let id = cache[slot * 2 + 1] as u32;
                return TermId(id);
            }
            if stored_hash == 0 {
                break; // Empty slot -- cache miss
            }
            slot = (slot + 1) & mask;
        }
        drop(cache);

        // Cache miss: allocate via bump pointer
        self.alloc_and_cache(kind, hash, slot)
    }

    /// O(1) reset -- reclaim all terms.
    /// Called between proof obligations.
    pub fn reset(&self) {
        *self.offset.borrow_mut() = 0;
        *self.count.borrow_mut() = 0;
        self.cache_buf.borrow_mut().fill(0);
    }
}

/// FxHash: multiply-shift hash (used by rustc internally).
/// 5x faster than SipHash for small keys.
#[inline]
fn fx_hash(kind: &TermKind) -> u64 {
    // Hash the discriminant + first 8 bytes of payload
    let discriminant = std::mem::discriminant(kind);
    let mut h = 0xcbf29ce484222325u64; // FNV offset
    h = h.wrapping_mul(0x100000001b3); // FNV prime
    h ^= unsafe { std::mem::transmute_copy::<_, u64>(&discriminant) };
    h
}
```

**Performance**: Eliminates heap allocation in hot path. Reset between proofs is O(1) vs O(n) drop.

### 15.4 Optimization 2: SIMD-Accelerated Hash-Consing

**Problem**: Hash-consing equality check requires computing hash of `TermKind`, then comparing against cached candidates. Default `HashMap` uses SipHash (DoS-resistant but slow).

**Solution**: Apply `ruvector-core`'s SIMD distance pattern for hash-table probing.

```rust
// 4-wide parallel hash probe using the ILP pattern from
// ruvector-solver/src/cg.rs (dot_product_f64 with 4 accumulators)

#[inline]
fn probe_4wide(cache: &[u64], hash: u64, mask: usize, start: usize) -> Option<u32> {
    let s0 = start & mask;
    let s1 = (start + 1) & mask;
    let s2 = (start + 2) & mask;
    let s3 = (start + 3) & mask;

    // 4 independent loads -- CPU can execute in parallel
    let h0 = cache[s0 * 2];
    let h1 = cache[s1 * 2];
    let h2 = cache[s2 * 2];
    let h3 = cache[s3 * 2];

    // 4 independent comparisons
    if h0 == hash { return Some(cache[s0 * 2 + 1] as u32); }
    if h1 == hash { return Some(cache[s1 * 2 + 1] as u32); }
    if h2 == hash { return Some(cache[s2 * 2 + 1] as u32); }
    if h3 == hash { return Some(cache[s3 * 2 + 1] as u32); }

    None // Continue probing
}
```

**On AVX2 targets** (feature-gated):

```rust
#[cfg(all(feature = "simd", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn probe_avx2(cache: &[u64], hash: u64, mask: usize, start: usize) -> Option<u32> {
    use std::arch::x86_64::*;

    // Broadcast search hash to 4 lanes
    let needle = _mm256_set1_epi64x(hash as i64);

    // Load 4 consecutive hash entries
    let s = start & mask;
    let ptr = cache.as_ptr().add(s * 2) as *const __m256i;
    let entries = _mm256_loadu_si256(ptr); // [h0, id0, h1, id1]

    // Compare: yields 0xFFFF... on match
    let cmp = _mm256_cmpeq_epi64(entries, needle);
    let bitmask = _mm256_movemask_epi8(cmp);

    if bitmask != 0 {
        let lane = bitmask.trailing_zeros() / 8;
        let id_offset = s * 2 + (lane as usize) + 1;
        return Some(cache[id_offset] as u32);
    }
    None
}
```

**Expected speedup**: 3-5x over `HashMap::get()` for the hash-consing hot path.

### 15.5 Optimization 3: Fused Substitution + Shift Kernel

**Problem**: WHNF substitution `t[x := s]` requires traversing the term, shifting De Bruijn indices, and rebuilding. The default implementation makes 3 passes: traverse, shift, rebuild.

**Solution**: Apply `ruvector-solver/neumann.rs` fused residual pattern -- do it in one pass.

```rust
/// Fused substitute-and-shift in a single traversal.
/// Modeled after ruvector-solver's fused_residual_norm_sq which combines
/// SpMV + subtraction + norm into one memory pass.
///
/// Instead of:
///   pass 1: find variables to substitute
///   pass 2: shift remaining indices
///   pass 3: rebuild term
///
/// We do all three in a single recursive descent.
pub fn fused_substitute_shift(
    arena: &mut FastTermArena,
    term: TermId,
    var_idx: u32,       // De Bruijn index to substitute
    replacement: TermId, // The replacement term
    shift: i32,         // Current shift amount
) -> TermId {
    match arena.kind(term) {
        TermKind::Var(n) => {
            if *n == var_idx {
                replacement // Substitute
            } else if *n > var_idx {
                // Shift down: variable was bound above the substitution
                arena.intern_fast(&TermKind::Var((*n as i32 + shift) as u32))
            } else {
                term // Below substitution point -- unchanged (return same TermId)
            }
        }
        TermKind::App(f, a) => {
            let f2 = fused_substitute_shift(arena, *f, var_idx, replacement, shift);
            let a2 = fused_substitute_shift(arena, *a, var_idx, replacement, shift);
            // Short-circuit: if nothing changed, return original (hash-consing dedup)
            if f2 == *f && a2 == *a { return term; }
            arena.mk_app(f2, a2)
        }
        TermKind::Lam(binder, body) => {
            let ty2 = fused_substitute_shift(arena, binder.ty, var_idx, replacement, shift);
            // Under a binder: increment var_idx and shift
            let body2 = fused_substitute_shift(arena, *body, var_idx + 1, replacement, shift);
            if ty2 == binder.ty && body2 == *body { return term; }
            let new_binder = Binder { ty: ty2, ..binder.clone() };
            arena.intern_fast(&TermKind::Lam(new_binder, body2))
        }
        TermKind::Pi(binder, body) => {
            let ty2 = fused_substitute_shift(arena, binder.ty, var_idx, replacement, shift);
            let body2 = fused_substitute_shift(arena, *body, var_idx + 1, replacement, shift);
            if ty2 == binder.ty && body2 == *body { return term; }
            let new_binder = Binder { ty: ty2, ..binder.clone() };
            arena.intern_fast(&TermKind::Pi(new_binder, body2))
        }
        // Sort, Const, Lit, MVar: no variables to substitute
        _ => term,
    }
}
```

**Key trick**: The `if f2 == *f && a2 == *a { return term; }` short-circuit leverages hash-consing -- if sub-terms are unchanged, the parent term is also unchanged, saving allocation.

### 15.6 Optimization 4: Thread-Local Resource Pools

**Problem**: Each proof obligation creates fresh `Context`, constraint `VecDeque`, and `HashSet` for unification. These allocate and drop on every call.

**Solution**: Apply `ruvector-mincut/pool/mod.rs` pattern -- thread-local pools with auto-return.

```rust
// Thread-local pool for proof-checking resources.
// Pattern from ruvector-mincut's BfsPool (90%+ hit rate after warmup).

use std::cell::RefCell;
use std::collections::{VecDeque, HashSet};

thread_local! {
    static PROOF_POOL: RefCell<ProofResourcePool> = RefCell::new(ProofResourcePool::new());
}

struct ProofResourcePool {
    contexts: Vec<Context>,
    constraint_queues: Vec<VecDeque<Constraint>>,
    visited_sets: Vec<HashSet<TermId>>,
    acquires: usize,
    hits: usize,
}

impl ProofResourcePool {
    fn new() -> Self {
        Self {
            contexts: Vec::new(),
            constraint_queues: Vec::new(),
            visited_sets: Vec::new(),
            acquires: 0,
            hits: 0,
        }
    }
}

/// Acquire pooled resources for a proof obligation.
/// Auto-returns to pool when `ProofResources` is dropped.
pub fn acquire_proof_resources() -> ProofResources {
    PROOF_POOL.with(|pool| {
        let mut p = pool.borrow_mut();
        p.acquires += 1;

        let ctx = p.contexts.pop().unwrap_or_else(Context::new);
        let queue = p.constraint_queues.pop().unwrap_or_default();
        let visited = p.visited_sets.pop().unwrap_or_default();

        if !p.contexts.is_empty() || !p.constraint_queues.is_empty() {
            p.hits += 1;
        }

        ProofResources { ctx, queue, visited }
    })
}

pub struct ProofResources {
    pub ctx: Context,
    pub queue: VecDeque<Constraint>,
    pub visited: HashSet<TermId>,
}

impl Drop for ProofResources {
    fn drop(&mut self) {
        // Clear but retain capacity, then return to pool
        self.ctx.clear();
        self.queue.clear();
        self.visited.clear();

        PROOF_POOL.with(|pool| {
            let mut p = pool.borrow_mut();
            p.contexts.push(std::mem::take(&mut self.ctx));
            p.constraint_queues.push(std::mem::take(&mut self.queue));
            p.visited_sets.push(std::mem::take(&mut self.visited));
        });
    }
}
```

**Expected**: After warmup, 90%+ pool hit rate. Zero heap churn in steady state.

### 15.7 Optimization 5: Conversion Cache with Prefetch

**Problem**: The lean-agentic conversion engine memoizes WHNF results keyed on `(TermId, context_length)`, but uses a basic `HashMap`.

**Solution**: Apply `ruvector-mincut/optimization/cache.rs` LRU with access-pattern prefetch.

```rust
/// Conversion result cache with access-pattern prediction.
/// Modeled after ruvector-mincut's PathDistanceCache (10x for repeated queries).
pub struct ConversionCache {
    /// Open-addressing hash table: (key_hash, whnf_result)
    entries: Vec<CacheEntry>,
    mask: usize,
    /// Recent access pattern for prefetch prediction
    history: VecDeque<u64>,
    stats: CacheStats,
}

#[derive(Default, Clone)]
struct CacheEntry {
    key_hash: u64,
    term_id: TermId,
    whnf_result: TermId,
    access_count: u16,
}

struct CacheStats {
    hits: u64,
    misses: u64,
    prefetch_hits: u64,
}

impl ConversionCache {
    pub fn with_capacity(cap: usize) -> Self {
        let cap = cap.next_power_of_two();
        Self {
            entries: vec![CacheEntry::default(); cap],
            mask: cap - 1,
            history: VecDeque::with_capacity(64),
            stats: CacheStats { hits: 0, misses: 0, prefetch_hits: 0 },
        }
    }

    /// Look up cached WHNF result.
    #[inline]
    pub fn get(&mut self, term: TermId, ctx_len: usize) -> Option<TermId> {
        let hash = self.key_hash(term, ctx_len);
        let slot = (hash as usize) & self.mask;

        // Prefetch next likely access based on history
        if let Some(&predicted) = self.history.front() {
            let pred_slot = (predicted as usize) & self.mask;
            #[cfg(target_arch = "x86_64")]
            unsafe {
                std::arch::x86_64::_mm_prefetch(
                    (&self.entries[pred_slot]) as *const _ as *const i8,
                    std::arch::x86_64::_MM_HINT_T0,
                );
            }
        }

        let entry = &self.entries[slot];
        if entry.key_hash == hash {
            self.stats.hits += 1;
            self.history.push_back(hash);
            if self.history.len() > 64 { self.history.pop_front(); }
            Some(entry.whnf_result)
        } else {
            self.stats.misses += 1;
            None
        }
    }

    #[inline]
    fn key_hash(&self, term: TermId, ctx_len: usize) -> u64 {
        let mut h = term.0 as u64;
        h = h.wrapping_mul(0x517cc1b727220a95);
        h ^= ctx_len as u64;
        h = h.wrapping_mul(0x6c62272e07bb0142);
        h
    }
}
```

### 15.8 Optimization 6: Coherence-Gated Proof Depth

**Problem**: Not all proof obligations are equally complex. Simple dimension checks need 2-3 reduction steps; pipeline composition may need hundreds. Spending the same effort on both wastes cycles.

**Solution**: Apply `ruvector-mincut-gated-transformer`'s 3-tier compute routing.

```rust
/// Adaptive proof depth routing, modeled after the GateController
/// in ruvector-mincut-gated-transformer/src/gate.rs.
///
/// Routes proof obligations to different compute tiers:
/// - Reflex (<10us): Dimension equality, literal comparison
/// - Standard (<100us): Single-step type inference
/// - Deep (<1ms): Full WHNF + unification
pub enum ProofTier {
    /// Tier 0: Direct comparison, no reduction needed
    /// e.g., prove_dim_eq(128, 128) -- just check n == m
    Reflex,
    /// Tier 1: Shallow inference, 1-10 reduction steps
    /// e.g., verified_insert with known types
    Standard { max_fuel: u32 },
    /// Tier 2: Full kernel, up to 10,000 steps
    /// e.g., pipeline composition with dependent types
    Deep,
}

/// Route a proof obligation to the cheapest tier that can handle it.
pub fn route_proof(kind: &TermKind, env: &Environment) -> ProofTier {
    match kind {
        // Literals and variables: direct comparison
        TermKind::Lit(_) | TermKind::Var(_) => ProofTier::Reflex,

        // Constants: check if reducible in environment
        TermKind::Const(sym, _) => {
            if env.is_reducible(*sym) {
                ProofTier::Standard { max_fuel: 100 }
            } else {
                ProofTier::Reflex
            }
        }

        // Applications: check depth
        TermKind::App(f, _) => {
            // If function is a known constructor, shallow
            ProofTier::Standard { max_fuel: 500 }
        }

        // Binders always need full checking
        TermKind::Lam(_, _) | TermKind::Pi(_, _) | TermKind::Let(_, _, _) => {
            ProofTier::Deep
        }

        _ => ProofTier::Standard { max_fuel: 100 },
    }
}

/// Execute proof with tiered fuel budget.
pub fn verify_tiered(
    arena: &mut FastTermArena,
    env: &Environment,
    term: TermId,
    expected_ty: TermId,
    tier: ProofTier,
) -> crate::Result<()> {
    match tier {
        ProofTier::Reflex => {
            // O(1): pointer equality via hash-consing
            if term == expected_ty { return Ok(()); }
            // Fallback to Standard
            verify_tiered(arena, env, term, expected_ty,
                         ProofTier::Standard { max_fuel: 100 })
        }
        ProofTier::Standard { max_fuel } => {
            // Limited fuel conversion
            let mut converter = Converter::with_fuel(max_fuel);
            if converter.is_def_eq(arena, env, &Context::new(), term, expected_ty) {
                Ok(())
            } else if max_fuel < 10_000 {
                // Escalate to Deep
                verify_tiered(arena, env, term, expected_ty, ProofTier::Deep)
            } else {
                Err(VerificationError::ConversionTimeout { max_reductions: max_fuel })
            }
        }
        ProofTier::Deep => {
            // Full kernel with default 10,000 fuel
            let checker = TypeChecker::default();
            checker.check(arena, env, &Context::new(), term, expected_ty)
                .map_err(VerificationError::from)
        }
    }
}
```

### 15.9 Optimization 7: Bounds-Check Elimination in Hot Loops

**Problem**: Substitution and WHNF inner loops perform bounds checks on every `arena.get(TermId)`.

**Solution**: Apply `ruvector-solver/neumann.rs` pattern -- validate once, then use `get_unchecked`.

```rust
/// Validate arena integrity once, then enter unchecked mode.
/// Pattern from ruvector-solver's spmv_unchecked (validates CSR once, then raw ptrs).
pub fn enter_unchecked_mode(arena: &FastTermArena) -> UncheckedArenaView<'_> {
    // Validate: all TermIds in range, no dangling references
    let count = *arena.count.borrow();
    let buf = arena.buf.borrow();
    assert!(buf.len() >= count as usize * std::mem::size_of::<Term>());

    UncheckedArenaView {
        ptr: buf.as_ptr() as *const Term,
        count,
    }
}

pub struct UncheckedArenaView<'a> {
    ptr: *const Term,
    count: u32,
    _phantom: std::marker::PhantomData<&'a ()>,
}

impl<'a> UncheckedArenaView<'a> {
    /// O(1) term access without bounds check.
    /// SAFETY: Caller must ensure id.0 < self.count (validated at construction).
    #[inline(always)]
    pub unsafe fn get(&self, id: TermId) -> &Term {
        debug_assert!(id.0 < self.count);
        &*self.ptr.add(id.0 as usize)
    }
}
```

### 15.10 Combined Performance Targets

| Operation | Before (lean-agentic default) | After (RuVector optimized) | Speedup |
|-----------|------------------------------|---------------------------|---------|
| Term allocation | ~50ns (HashMap + Vec push) | ~5ns (bump + FxHash probe) | 10x |
| Hash-consing lookup | ~30ns (SipHash + HashMap get) | ~8ns (FxHash + 4-wide probe) | 4x |
| WHNF substitution | ~200ns (3-pass) | ~70ns (fused single-pass) | 3x |
| Conversion equality | ~100ns (uncached) | ~10ns (LRU cache hit) | 10x |
| Resource acquire | ~500ns (alloc + init) | ~20ns (pool acquire) | 25x |
| Proof tier routing | N/A (always full kernel) | ~5ns (match on TermKind) | skip 30-50% |
| **End-to-end: dimension proof** | ~1,000ns | **< 50ns** | **20x** |
| **End-to-end: pipeline compose** | ~100,000ns | **< 10,000ns** | **10x** |

### 15.11 Implementation Priority

These optimizations are **Phase 5** work (Week 9-10), after the base integration is stable:

| Priority | Optimization | Effort | Impact |
|----------|-------------|--------|--------|
| P0 | SolverArena bump allocator (15.3) | 1 day | 10x allocation |
| P0 | Thread-local resource pools (15.6) | 1 day | 25x resource acquire |
| P1 | Fused substitution kernel (15.5) | 2 days | 3x WHNF |
| P1 | Coherence-gated proof depth (15.8) | 1 day | Skip 30-50% work |
| P2 | SIMD hash probe (15.4) | 2 days | 4x hash-consing |
| P2 | Conversion cache with prefetch (15.7) | 1 day | 10x repeated |
| P3 | Bounds-check elimination (15.9) | 1 day | 2x inner loops |

**Total**: ~9 engineering days for 10-20x end-to-end speedup.

### 15.12 New Dependencies for Optimization Phase

```toml
# Added to crates/ruvector-verified/Cargo.toml in Phase 5

[features]
fast-arena = []          # SolverArena-style bump allocator
simd-hash = []           # AVX2/NEON hash-consing probe
gated-proofs = []        # Coherence-gated proof depth routing
ultra = ["fast-arena", "simd-hash", "gated-proofs"]  # All optimizations

# No new external dependencies -- all patterns are inlined from RuVector crates.
# The optimizations use std::arch intrinsics directly (same as ruvector-core).
```

### 15.13 Benchmark Additions for Optimization Phase

```toml
# Additional bench entries for Phase 5

[[bench]]
name = "fast_arena_vs_default"
harness = false

[[bench]]
name = "simd_hash_probe"
harness = false

[[bench]]
name = "fused_substitute"
harness = false

[[bench]]
name = "tiered_routing"
harness = false
```

---

## 16. References

- [lean-agentic on crates.io](https://crates.io/crates/lean-agentic)
- [lean-agentic documentation](https://docs.rs/lean-agentic)
- [lean-agentic repository](https://github.com/agenticsorg/lean-agentic)
- [Lean 4 type theory](https://leanprover.github.io/lean4/doc/)
- ADR-001: RuVector Core Architecture
- ADR-014: Coherence Engine
- ADR-029: RVF Canonical Format
- ADR-030: RVF Cognitive Container
- ADR-039: RVF Solver WASM AGI Integration
- ADR-042: Security RVF AIDefence TEE
- ADR-044: ruvector-postgres v0.3 Extension Upgrade

### Optimization Pattern Sources (Section 15)

| Pattern | Source File | Lines |
|---------|------------|-------|
| Bump allocator | `crates/ruvector-solver/src/arena.rs` | 1-176 |
| 4-wide ILP unroll | `crates/ruvector-solver/src/cg.rs` | 76-102 |
| Fused kernel | `crates/ruvector-solver/src/neumann.rs` | 121-150 |
| Bounds-check elimination | `crates/ruvector-solver/src/types.rs` | 86-111 |
| Thread-local pool | `crates/ruvector-mincut/src/pool/mod.rs` | BfsPool |
| LRU cache + prefetch | `crates/ruvector-mincut/src/optimization/cache.rs` | PathDistanceCache |
| SIMD distance | `crates/ruvector-mincut/src/optimization/simd_distance.rs` | DistanceArray |
| Coherence gate routing | `crates/ruvector-mincut-gated-transformer/src/gate.rs` | GateController |
| WeightArena (bump) | `crates/ruvector-mincut-gated-transformer/src/arena.rs` | alloc_f32 |
| AVX2 SpMV | `crates/ruvector-solver/src/simd.rs` | spmv_avx2 |
| AVX2 horizontal sum | `crates/ruvector-core/src/simd_intrinsics.rs` | horizontal_sum |
| FxHash (multiply-shift) | `crates/ruvector-core/src/simd_intrinsics.rs` | distance dispatch |
| Cache-aligned struct | `crates/ruvector-core/src/arena.rs` | CACHE_LINE_SIZE=64 |
| INT8 quantized SIMD | `crates/ruvector-core/src/simd_intrinsics.rs` | 979-1212 |
| Early exit (coherence) | `crates/ruvector-mincut-gated-transformer/src/early_exit.rs` | CoherenceEarlyExit |