Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

2026-02-28 14:39:40 -05:00
parent 7885bf6278 d803bfe2b1
commit cd5943df23
7854 changed files with 3522914 additions and 0 deletions
--- a/vendor/ruvector/docs/adr/ADR-039-rvf-solver-wasm-agi-integration.md
+++ b/vendor/ruvector/docs/adr/ADR-039-rvf-solver-wasm-agi-integration.md
@@ -0,0 +1,515 @@
+# ADR-039: RVF Solver WASM — Self-Learning AGI Engine Integration
+
+| Field | Value |
+|-------|-------|
+| **Status** | Implemented |
+| **Date** | 2026-02-16 (updated 2026-02-17) |
+| **Deciders** | RuVector core team |
+| **Supersedes** | -- |
+| **Related** | ADR-032 (RVF WASM integration), ADR-037 (Publishable RVF acceptance test), ADR-038 (npx/rvlite witness verification) |
+
+## Context
+
+ADR-037 established the publishable RVF acceptance test with a SHAKE-256 witness chain, and ADR-038 planned npm integration for **verifying** those artifacts. However, neither the existing `rvf-wasm` microkernel nor the npm packages expose the actual self-learning engine that produces the AGI benchmarks.
+
+The core AGI capabilities live exclusively in the Rust benchmarks crate (`examples/benchmarks/src/`):
+- **PolicyKernel**: Thompson Sampling two-signal model (safety Beta + cost EMA)
+- **KnowledgeCompiler**: Signature-based pattern cache with compiled skip-mode configs
+- **AdaptiveSolver**: Three-loop architecture (fast: solve, medium: policy, slow: compiler)
+- **ReasoningBank**: Trajectory tracking with checkpoint/rollback and non-regression gating
+- **Acceptance test**: Multi-cycle training/holdout evaluation with three ablation modes
+
+These components have no FFI dependencies, no filesystem access during solve, and no system clock requirements — making them ideal candidates for WASM compilation.
+
+## Decision
+
+### Create `rvf-solver-wasm` as a standalone no_std WASM module
+
+A new crate at `crates/rvf/rvf-solver-wasm/` compiles the complete self-learning solver to `wasm32-unknown-unknown`. It is a `no_std + alloc` crate (same architecture as `rvf-wasm`) with a C ABI export surface.
+
+**Key design choices:**
+
+| Choice | Rationale |
+|--------|-----------|
+| **no_std + alloc** | Matches rvf-wasm pattern, runs in any WASM runtime (browser, Node.js, edge) |
+| **Self-contained types** | Pure-integer `Date` type replaces `chrono` dependency; `BTreeMap` replaces `HashMap` |
+| **libm for float math** | `sqrt`, `log`, `cos`, `pow` via `libm` crate (pure Rust, no_std compatible) |
+| **xorshift64 RNG** | Deterministic, no `rand` crate dependency, identical to benchmarks RNG |
+| **C ABI exports** | Maximum compatibility — works with any WASM host (no wasm-bindgen required) |
+| **Handle-based API** | Up to 8 concurrent solver instances, same pattern as `rvf_store_*` exports |
+
+### WASM Export Surface
+
+```
+┌─────────────────────────────────────────────────────┐
+│              rvf-solver-wasm exports                │
+├─────────────────────────────────────────────────────┤
+│ Memory:                                             │
+│   rvf_solver_alloc(size) -> ptr                     │
+│   rvf_solver_free(ptr, size)                        │
+│                                                     │
+│ Lifecycle:                                          │
+│   rvf_solver_create() -> handle                     │
+│   rvf_solver_destroy(handle)                        │
+│                                                     │
+│ Training (three-loop learning):                     │
+│   rvf_solver_train(handle, count,                   │
+│     min_diff, max_diff, seed_lo, seed_hi) -> i32    │
+│                                                     │
+│ Acceptance test (full ablation):                    │
+│   rvf_solver_acceptance(handle, holdout,            │
+│     training, cycles, budget,                       │
+│     seed_lo, seed_hi) -> i32                        │
+│                                                     │
+│ Result / Policy / Witness reads:                    │
+│   rvf_solver_result_len(handle) -> i32              │
+│   rvf_solver_result_read(handle, out_ptr) -> i32    │
+│   rvf_solver_policy_len(handle) -> i32              │
+│   rvf_solver_policy_read(handle, out_ptr) -> i32    │
+│   rvf_solver_witness_len(handle) -> i32             │
+│   rvf_solver_witness_read(handle, out_ptr) -> i32   │
+└─────────────────────────────────────────────────────┘
+```
+
+### Architecture Preserved in WASM
+
+The WASM module preserves all five AGI capabilities:
+
+1. **Thompson Sampling two-signal model** — Beta posterior for safety (correct & no early-commit) + EMA for cost. Gamma sampling via Marsaglia's method using `libm`.
+
+2. **18 context buckets** — 3 range (small/medium/large) x 3 distractor (clean/some/heavy) x 2 noise = 18 buckets. Each bucket maintains per-arm stats for `None`, `Weekday`, `Hybrid` skip modes.
+
+3. **Speculative dual-path** — When top-2 arms are within delta 0.15 and variance > 0.02, the solver speculatively executes the secondary arm. This is preserved identically in WASM.
+
+4. **KnowledgeCompiler** — Constraint signature cache (`v1:{difficulty}:{sorted_constraint_types}`). Compiles successful trajectories into optimized configs with compiled skip-mode, step budget, and confidence scores.
+
+5. **Three-loop solver** — Fast (constraint propagation + solve), Medium (PolicyKernel selection), Slow (ReasoningBank → KnowledgeCompiler). Checkpoint/rollback on accuracy regression.
+
+### Integration with RVF Ecosystem
+
+```
+┌──────────────────────┐         ┌──────────────────────┐
+│   rvf-solver-wasm    │         │     rvf-wasm         │
+│   (self-learning     │ ──────▶ │   (verification)     │
+│    AGI engine)       │ witness │                      │
+│                      │ chain   │ rvf_witness_verify   │
+│ rvf_solver_train     │         │ rvf_witness_count    │
+│ rvf_solver_acceptance│         │                      │
+│ rvf_solver_witness_* │         │ rvf_store_*          │
+└──────────┬───────────┘         └──────────────────────┘
+           │ uses
+    ┌──────▼──────┐
+    │  rvf-crypto  │
+    │  SHAKE-256   │
+    │  witness     │
+    │  chain       │
+    └─────────────┘
+```
+
+The solver produces a SHAKE-256 witness chain (via `rvf_crypto::create_witness_chain`) for every acceptance test run. This chain is in the native 73-byte-per-entry format, directly verifiable by `rvf_witness_verify` in the rvf-wasm microkernel.
+
+### npm Integration Path
+
+#### High-Level SDK (`@ruvector/rvf-solver`)
+
+The `@ruvector/rvf-solver` npm package provides a typed TypeScript wrapper around the raw WASM C-ABI exports, with automatic WASM loading, memory management, and JSON deserialization.
+
+```typescript
+import { RvfSolver } from '@ruvector/rvf-solver';
+
+// Create solver (lazy-loads WASM on first call)
+const solver = await RvfSolver.create();
+
+// Train on 1000 puzzles (three-loop learning)
+const result = solver.train({ count: 1000, minDifficulty: 1, maxDifficulty: 10, seed: 42n });
+console.log(`Accuracy: ${(result.accuracy * 100).toFixed(1)}%`);
+
+// Run full acceptance test (A/B/C ablation)
+const manifest = solver.acceptance({ holdoutSize: 100, trainingPerCycle: 100, cycles: 5, seed: 42n });
+console.log(`Mode C passed: ${manifest.allPassed}`);
+
+// Inspect policy state (Thompson Sampling parameters, context buckets)
+const policy = solver.policy();
+console.log(`Context buckets: ${Object.keys(policy?.contextStats ?? {}).length}`);
+
+// Get tamper-evident witness chain (73 bytes per entry, SHAKE-256)
+const chain = solver.witnessChain();
+console.log(`Witness chain: ${chain?.length ?? 0} bytes`);
+
+solver.destroy();
+```
+
+The SDK also re-exports through the unified `@ruvector/rvf` package:
+
+```typescript
+// Unified import — solver + database in one package
+import { RvfDatabase, RvfSolver } from '@ruvector/rvf';
+```
+
+#### npm Package Structure
+
+```
+npm/packages/rvf-solver/
+├── package.json          # @ruvector/rvf-solver, CJS/ESM dual exports
+├── tsconfig.json         # ES2020 target, strict mode, declarations
+├── pkg/
+│   ├── rvf_solver.js     # WASM loader (singleton, Node CJS/ESM + browser)
+│   ├── rvf_solver.d.ts   # Low-level WASM C-ABI type declarations
+│   └── rvf_solver_bg.wasm  # Built from rvf-solver-wasm crate
+└── src/
+    ├── index.ts          # Barrel exports: RvfSolver + all types
+    ├── solver.ts         # RvfSolver class (create/train/acceptance/policy/witnessChain/destroy)
+    └── types.ts          # TrainOptions, AcceptanceManifest, PolicyState, etc.
+```
+
+| Type | Fields | Purpose |
+|------|--------|---------|
+| `TrainOptions` | `count`, `minDifficulty?`, `maxDifficulty?`, `seed?` | Configure training run |
+| `TrainResult` | `trained`, `correct`, `accuracy`, `patternsLearned` | Training outcome |
+| `AcceptanceOptions` | `holdoutSize?`, `trainingPerCycle?`, `cycles?`, `stepBudget?`, `seed?` | Configure acceptance test |
+| `AcceptanceManifest` | `modeA`, `modeB`, `modeC`, `allPassed`, `witnessEntries`, `witnessChainBytes` | Full ablation results |
+| `PolicyState` | `contextStats`, `earlyCommitPenalties`, `prepass`, `speculativeAttempts` | Thompson Sampling state |
+| `SkipModeStats` | `attempts`, `successes`, `alphaSafety`, `betaSafety`, `costEma` | Per-arm bandit stats |
+
+#### Low-Level WASM Usage (advanced)
+
+```javascript
+// Direct WASM C-ABI usage (without the SDK wrapper)
+const wasm = await WebAssembly.instantiate(solverModule);
+
+const handle = wasm.exports.rvf_solver_create();
+const correct = wasm.exports.rvf_solver_train(handle, 1000, 1, 10, 42, 0);
+
+const len = wasm.exports.rvf_solver_result_len(handle);
+const ptr = wasm.exports.rvf_solver_alloc(len);
+wasm.exports.rvf_solver_result_read(handle, ptr);
+const json = new TextDecoder().decode(new Uint8Array(wasm.memory.buffer, ptr, len));
+
+// Witness chain verifiable by rvf-wasm
+const wLen = wasm.exports.rvf_solver_witness_len(handle);
+const wPtr = wasm.exports.rvf_solver_alloc(wLen);
+wasm.exports.rvf_solver_witness_read(handle, wPtr);
+const chain = new Uint8Array(wasm.memory.buffer, wPtr, wLen);
+const verified = rvfWasm.exports.rvf_witness_verify(chainPtr, wLen);
+
+wasm.exports.rvf_solver_destroy(handle);
+```
+
+## Module Structure
+
+```
+crates/rvf/rvf-solver-wasm/
+├── Cargo.toml          # no_std + alloc, dlmalloc, libm, serde_json
+├── src/
+│   ├── lib.rs          # WASM exports, instance registry, panic handler
+│   ├── alloc_setup.rs  # dlmalloc global allocator, rvf_solver_alloc/free
+│   ├── types.rs        # Date arithmetic, Constraint, Puzzle, Rng64
+│   ├── policy.rs       # PolicyKernel, Thompson Sampling, KnowledgeCompiler
+│   └── engine.rs       # AdaptiveSolver, ReasoningBank, PuzzleGenerator, acceptance test
+```
+
+| File | Lines | Purpose |
+|------|-------|---------|
+| `types.rs` | 239 | Pure-integer date math (Howard Hinnant algorithm), constraints, puzzle type |
+| `policy.rs` | ~480 | Full Thompson Sampling with Marsaglia gamma sampling, 18-bucket context |
+| `engine.rs` | ~490 | Three-loop solver, acceptance test runner, puzzle generator |
+| `lib.rs` | ~280 | 12 WASM exports, handle registry (8 slots), witness chain integration |
+
+## Binary Size
+
+| Build | Size |
+|-------|------|
+| Release (wasm32-unknown-unknown) | ~171 KB |
+| After wasm-opt -Oz | 132 KB |
+
+## npm Package Ecosystem
+
+The AGI solver is exposed through a layered npm package architecture:
+
+| Package | Version | Role | Install |
+|---------|---------|------|---------|
+| `@ruvector/rvf-solver` | 0.1.3 | Typed TypeScript SDK for the self-learning solver | `npm i @ruvector/rvf-solver` |
+| `@ruvector/rvf` | 0.1.9 | Unified SDK re-exporting solver + database | `npm i @ruvector/rvf` |
+| `@ruvector/rvf-node` | 0.1.7 | Native NAPI bindings with AGI methods (`indexStats`, `verifyWitness`, `freeze`, `metric`) | `npm i @ruvector/rvf-node` |
+| `@ruvector/rvf-wasm` | 0.1.6 | WASM microkernel with witness verification | `npm i @ruvector/rvf-wasm` |
+
+### Dependency Graph
+
+```
+@ruvector/rvf (unified SDK)
+├── @ruvector/rvf-node (required, native NAPI)
+├── @ruvector/rvf-wasm (optional, browser fallback)
+└── @ruvector/rvf-solver (optional, AGI solver)
+    └── rvf-solver-wasm WASM binary (loaded at runtime)
+```
+
+### AGI NAPI Methods (rvf-node)
+
+The native NAPI bindings expose AGI-relevant methods beyond basic vector CRUD:
+
+| Method | Returns | Purpose |
+|--------|---------|---------|
+| `indexStats()` | `RvfIndexStats` | HNSW index statistics (layers, M, ef_construction, indexed count) |
+| `verifyWitness()` | `RvfWitnessResult` | Verify tamper-evident SHAKE-256 witness chain integrity |
+| `freeze()` | `void` | Snapshot-freeze current state for deterministic replay |
+| `metric()` | `string` | Get distance metric name (`l2`, `cosine`, `dotproduct`) |
+
+## Consequences
+
+### Positive
+
+- The actual self-learning AGI engine runs in the browser, Node.js, and edge runtimes via WASM
+- No Rust toolchain required for end users — `npm install` + WASM load is sufficient
+- Deterministic: same seed → same puzzles → same learning → same witness chain
+- Witness chains produced in WASM are verifiable by the existing `rvf_witness_verify` export
+- PolicyKernel state is inspectable via `rvf_solver_policy_read` (JSON serializable)
+- Handle-based API supports up to 8 concurrent solver instances
+- 132 KB binary (after wasm-opt -Oz) includes the complete solver, Thompson Sampling, and serde_json
+- TypeScript SDK (`@ruvector/rvf-solver`) provides ergonomic async API with automatic WASM memory management
+- Unified SDK (`@ruvector/rvf`) re-exports solver alongside database for single-import usage
+- Native NAPI bindings expose AGI methods (index stats, witness verification, freeze) for server-side usage
+
+### Negative
+
+- Date arithmetic is reimplemented (pure-integer) rather than using `chrono`, requiring validation against the original
+- `HashMap` → `BTreeMap` changes iteration order (sorted vs hash-order), which may produce different witness chain hashes than the native benchmarks
+- Float math via `libm` may have minor precision differences vs std `f64` methods, affecting Thompson Sampling distributions
+- The puzzle generator is simplified compared to the full benchmarks generator (no cross-cultural constraints)
+
+### Neutral
+
+- The native benchmarks crate remains the reference implementation for full-fidelity acceptance tests
+- The WASM module is a faithful port, not a binding — both implementations should converge on the same acceptance test outcomes given identical seeds
+- `rvf-solver-wasm` is a member of the `crates/rvf` workspace alongside `rvf-wasm`
+
+### Implementation Notes (2026-02-17)
+
+- WASM loader (`pkg/rvf_solver.js`) rewritten as pure CJS to fix ESM/CJS interop — `import.meta.url` and `export default` removed
+- Snake_case → camelCase field mapping added in `solver.ts` for `train()`, `policy()`, and `acceptance()` methods
+- `AcceptanceModeResult` type updated to match actual WASM output: `passed`, `accuracyMaintained`, `costImproved`, `robustnessImproved`, `zeroViolations`, `dimensionsImproved`, `cycles[]`
+- SDK tests added at `npm/packages/rvf-solver/test/solver.test.mjs` (import validation, type structure, WASM integration)
+- Security review completed: WASM loader path validation flagged as LOW risk (library-internal API), `JSON.parse` on WASM memory is trusted
+
+---
+
+## Appendix: Public Package Documentation
+
+# @ruvector/rvf-solver
+
+[![npm](https://img.shields.io/npm/v/@ruvector/rvf-solver)](https://www.npmjs.com/package/@ruvector/rvf-solver)
+[![license](https://img.shields.io/npm/l/@ruvector/rvf-solver)](https://github.com/ruvnet/ruvector/blob/main/LICENSE)
+![platforms](https://img.shields.io/badge/platforms-Node.js%20%7C%20Browser%20%7C%20Edge-blue)
+
+Self-learning temporal solver with Thompson Sampling, PolicyKernel, ReasoningBank, and SHAKE-256 tamper-evident witness chains. Runs in the browser, Node.js, and edge runtimes via WebAssembly.
+
+### Install
+
+```bash
+npm install @ruvector/rvf-solver
+```
+
+Or via the unified SDK:
+
+```bash
+npm install @ruvector/rvf
+```
+
+### Features
+
+- **Thompson Sampling two-signal model** — safety Beta distribution + cost EMA for adaptive policy selection
+- **18 context-bucketed bandits** — 3 range x 3 distractor x 2 noise levels for fine-grained context awareness
+- **KnowledgeCompiler with signature-based pattern cache** — distills learned patterns into reusable compiled configurations
+- **Speculative dual-path execution** — runs two candidate arms in parallel, picks the winner
+- **Three-loop adaptive solver** — fast: constraint propagation solve, medium: PolicyKernel skip-mode selection, slow: KnowledgeCompiler pattern distillation
+- **SHAKE-256 tamper-evident witness chain** — 73 bytes per entry, cryptographically linked proof of all operations
+- **Full acceptance test with A/B/C ablation modes** — validates learned policy outperforms fixed and compiler baselines
+- **~132 KB WASM binary, `no_std`** — runs anywhere WebAssembly does (browsers, Node.js, Deno, Cloudflare Workers, edge runtimes)
+
+### Quick Start
+
+```typescript
+import { RvfSolver } from '@ruvector/rvf-solver';
+
+// Create a solver instance (loads WASM on first call)
+const solver = await RvfSolver.create();
+
+// Train on 100 puzzles (difficulty 1-5)
+const result = solver.train({ count: 100, minDifficulty: 1, maxDifficulty: 5 });
+console.log(`Accuracy: ${(result.accuracy * 100).toFixed(1)}%`);
+console.log(`Patterns learned: ${result.patternsLearned}`);
+
+// Run full acceptance test (A/B/C ablation)
+const manifest = solver.acceptance({ cycles: 3 });
+console.log(`All passed: ${manifest.allPassed}`);
+
+// Inspect Thompson Sampling policy state
+const policy = solver.policy();
+console.log(`Context buckets: ${Object.keys(policy?.contextStats ?? {}).length}`);
+console.log(`Speculative attempts: ${policy?.speculativeAttempts}`);
+
+// Get raw SHAKE-256 witness chain
+const chain = solver.witnessChain();
+console.log(`Witness chain: ${chain?.length ?? 0} bytes`);
+
+// Free WASM resources
+solver.destroy();
+```
+
+### API Reference
+
+#### `RvfSolver.create(): Promise<RvfSolver>`
+
+Creates a new solver instance. Initializes the WASM module on the first call; subsequent calls reuse the loaded module. Up to 8 concurrent instances are supported.
+
+#### `solver.train(options: TrainOptions): TrainResult`
+
+Trains the solver on randomly generated puzzles using the three-loop architecture. The fast loop applies constraint propagation, the medium loop selects skip modes via Thompson Sampling, and the slow loop distills patterns into the KnowledgeCompiler cache.
+
+#### `solver.acceptance(options?: AcceptanceOptions): AcceptanceManifest`
+
+Runs the full acceptance test with training/holdout cycles across all three ablation modes (A, B, C). Returns a manifest with per-cycle metrics, pass/fail status, and witness chain metadata.
+
+#### `solver.policy(): PolicyState | null`
+
+Returns the current Thompson Sampling policy state including per-context-bucket arm statistics, KnowledgeCompiler cache stats, and speculative execution counters. Returns `null` if no training has been performed.
+
+#### `solver.witnessChain(): Uint8Array | null`
+
+Returns the raw SHAKE-256 witness chain bytes. Each entry is 73 bytes and provides tamper-evident proof of all training and acceptance operations. Returns `null` if the chain is empty. The returned `Uint8Array` is a copy safe to use after `destroy()`.
+
+#### `solver.destroy(): void`
+
+Frees the WASM solver instance and releases all associated memory. The instance must not be used after calling `destroy()`.
+
+### Types
+
+#### TrainOptions
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `count` | `number` | required | Number of puzzles to generate and solve |
+| `minDifficulty` | `number` | `1` | Minimum puzzle difficulty (1-10) |
+| `maxDifficulty` | `number` | `10` | Maximum puzzle difficulty (1-10) |
+| `seed` | `bigint \| number` | random | RNG seed for reproducible runs |
+
+#### TrainResult
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `trained` | `number` | Number of puzzles trained on |
+| `correct` | `number` | Number solved correctly |
+| `accuracy` | `number` | Accuracy ratio (correct / trained) |
+| `patternsLearned` | `number` | Patterns distilled by the ReasoningBank |
+
+#### AcceptanceOptions
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `holdoutSize` | `number` | `50` | Number of holdout puzzles per cycle |
+| `trainingPerCycle` | `number` | `200` | Number of training puzzles per cycle |
+| `cycles` | `number` | `5` | Number of train/test cycles |
+| `stepBudget` | `number` | `500` | Maximum constraint propagation steps per puzzle |
+| `seed` | `bigint \| number` | random | RNG seed for reproducible runs |
+
+#### AcceptanceManifest
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `version` | `number` | Manifest schema version |
+| `modeA` | `AcceptanceModeResult` | Mode A results (fixed heuristic) |
+| `modeB` | `AcceptanceModeResult` | Mode B results (compiler-suggested) |
+| `modeC` | `AcceptanceModeResult` | Mode C results (learned policy) |
+| `allPassed` | `boolean` | `true` if Mode C passed |
+| `witnessEntries` | `number` | Number of entries in the witness chain |
+| `witnessChainBytes` | `number` | Total witness chain size in bytes |
+
+#### AcceptanceModeResult
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `passed` | `boolean` | Whether this mode met the accuracy threshold |
+| `accuracyMaintained` | `boolean` | Accuracy maintained across cycles |
+| `costImproved` | `boolean` | Cost per solve improved |
+| `robustnessImproved` | `boolean` | Noise robustness improved |
+| `zeroViolations` | `boolean` | No constraint violations |
+| `dimensionsImproved` | `number` | Number of dimensions that improved |
+| `cycles` | `CycleMetrics[]` | Per-cycle accuracy and cost metrics |
+
+#### PolicyState
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `contextStats` | `Record<string, Record<string, SkipModeStats>>` | Per-context-bucket, per-arm Thompson Sampling statistics |
+| `earlyCommitPenalties` | `number` | Total early-commit penalty cost |
+| `earlyCommitsTotal` | `number` | Total early-commit attempts |
+| `earlyCommitsWrong` | `number` | Early commits that were incorrect |
+| `prepass` | `string` | Current prepass strategy identifier |
+| `speculativeAttempts` | `number` | Number of speculative dual-path executions |
+| `speculativeArm2Wins` | `number` | Times the second speculative arm won |
+
+### Acceptance Test Modes
+
+The acceptance test validates the solver's learning capability through three ablation modes run across multiple train/test cycles:
+
+**Mode A (Fixed)** -- Uses a fixed heuristic skip-mode policy. This establishes the baseline performance without any learning. The policy does not adapt regardless of puzzle characteristics.
+
+**Mode B (Compiler)** -- Uses the KnowledgeCompiler's signature-based pattern cache to select skip modes. The compiler distills observed patterns into compiled configurations but does not perform online Thompson Sampling updates.
+
+**Mode C (Learned)** -- Uses the full Thompson Sampling two-signal model with context-bucketed bandits. This is the complete system: the fast loop solves, the medium loop selects arms based on safety Beta and cost EMA, and the slow loop feeds patterns back to the compiler. Mode C should outperform both A and B, demonstrating genuine self-improvement.
+
+The test passes when Mode C achieves the accuracy threshold on holdout puzzles. The witness chain records every training and evaluation operation for tamper-evident auditability.
+
+### Architecture
+
+The solver uses a three-loop adaptive architecture:
+
+```
+-----------------------------------------------+
+|  Slow Loop: KnowledgeCompiler                  |
+|  - Signature-based pattern cache               |
+|  - Distills observations into compiled configs  |
+-----------------------------------------------+
+        |                          ^
+        v                          |
+-----------------------------------------------+
+|  Medium Loop: PolicyKernel                     |
+|  - Thompson Sampling (safety Beta + cost EMA)  |
+|  - 18 context buckets (range x distractor x noise) |
+|  - Speculative dual-path execution             |
+-----------------------------------------------+
+        |                          ^
+        v                          |
+-----------------------------------------------+
+|  Fast Loop: Constraint Propagation Solver      |
+|  - Generates and solves puzzles                |
+|  - Reports outcomes back to PolicyKernel       |
+-----------------------------------------------+
+        |
+        v
+-----------------------------------------------+
+|  SHAKE-256 Witness Chain (73 bytes/entry)      |
+|  - Tamper-evident proof of all operations      |
+-----------------------------------------------+
+```
+
+### Unified SDK
+
+When using the `@ruvector/rvf` unified SDK, the solver is available as a sub-module:
+
+```typescript
+import { RvfSolver } from '@ruvector/rvf';
+
+const solver = await RvfSolver.create();
+const result = solver.train({ count: 100 });
+console.log(`Accuracy: ${(result.accuracy * 100).toFixed(1)}%`);
+solver.destroy();
+```
+
+### Related Packages
+
+| Package | Description |
+|---------|-------------|
+| [`@ruvector/rvf`](https://www.npmjs.com/package/@ruvector/rvf) | Unified TypeScript SDK |
+| [`@ruvector/rvf-node`](https://www.npmjs.com/package/@ruvector/rvf-node) | Native N-API bindings for Node.js |
+| [`@ruvector/rvf-wasm`](https://www.npmjs.com/package/@ruvector/rvf-wasm) | Browser WASM package |
+| [`@ruvector/rvf-mcp-server`](https://www.npmjs.com/package/@ruvector/rvf-mcp-server) | MCP server for AI agents |