Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
This commit is contained in:
@@ -0,0 +1,859 @@
|
||||
# ADR-040: Causal Atlas RVF Runtime — Planet Detection & Life Candidate Scoring
|
||||
|
||||
**Status:** Proposed
|
||||
**Date:** 2026-02-18
|
||||
**Author:** System Architect (AgentDB v3)
|
||||
**Supersedes:** None
|
||||
**Related:** ADR-003 (RVF Format), ADR-006 (Unified Self-Learning RVF), ADR-007 (Full Capability Integration), ADR-008 (Chat UI RVF)
|
||||
**Package:** `@agentdb/causal-atlas`
|
||||
|
||||
## Context
|
||||
|
||||
ADR-008 demonstrated that a single RVF artifact can embed a minimal Linux
|
||||
userspace, an LLM inference engine, and a self-learning pipeline into one
|
||||
portable file. This ADR extends that pattern to scientific computing: a
|
||||
portable RVF runtime that ingests public astronomy and physics datasets,
|
||||
builds a multi-scale interaction graph, maintains a dynamic coherence field,
|
||||
and emits replayable witness logs for every derived claim.
|
||||
|
||||
The design draws engineering inspiration from causal sets, loop-gravity-style
|
||||
discretization, and holographic boundary encoding, but it is implemented as a
|
||||
practical data system, not a physics simulator. The holographic principle
|
||||
manifests as a concrete design choice: primarily store and index boundaries,
|
||||
and treat interior state as reconstructable from boundary witnesses and
|
||||
retained archetypes.
|
||||
|
||||
### Existing Capabilities (ADR-003 through ADR-008)
|
||||
|
||||
| Component | Package | Relevant APIs |
|
||||
|-----------|---------|---------------|
|
||||
| **RVF segments** | `@ruvector/rvf`, `@ruvector/rvf-node` | `embedKernel`, `extractKernel`, `embedEbpf`, `segments`, `derive` |
|
||||
| **HNSW indexing** | `@ruvector/rvf-node` | `ingestBatch`, `query`, `compact`, HNSW with metadata filters |
|
||||
| **Witness chains** | `@ruvector/rvf-node`, `RvfSolver` | `verifyWitness`, SHAKE-256 witness chains, signed root hash |
|
||||
| **Graph transactions** | `NativeAccelerator` | `graphTransaction`, `graphBatchInsert`, Cypher queries |
|
||||
| **SIMD embeddings** | `@ruvector/ruvllm` | 768-dim SIMD embed, cosine/dot/L2, HNSW memory search |
|
||||
| **SONA learning** | `SonaLearningBackend` | Micro-LoRA, trajectory recording, EWC++ |
|
||||
| **Federated coordination** | `FederatedSessionManager` | Cross-agent trajectories, warm-start patterns |
|
||||
| **Contrastive training** | `ContrastiveTrainer` | InfoNCE, hard negative mining, 3-stage curriculum |
|
||||
| **Adaptive index** | `AdaptiveIndexTuner` | 5-tier compression, Matryoshka truncation, health monitoring |
|
||||
| **Kernel embedding** | `KernelBuilder` (ADR-008) | Minimal Linux boot from KERNEL_SEG + INITRD_SEG |
|
||||
| **Lazy model download** | `ChatInference` (ADR-008) | Deferred GGUF load on first inference call |
|
||||
|
||||
### What This ADR Adds
|
||||
|
||||
1. Domain adapters for astronomy data (light curves, spectra, galaxy catalogs)
|
||||
2. Compressed causal atlas with partial-order event graph
|
||||
3. Coherence field index with cut pressure and partition entropy
|
||||
4. Multi-scale interaction memory with budget-controlled tiered retention
|
||||
5. Boundary evolution tracker with holographic-style boundary-first storage
|
||||
6. Planet detection pipeline (Kepler/TESS transit search)
|
||||
7. Life candidate scoring pipeline (spectral disequilibrium signatures)
|
||||
8. Progressive data download from public sources on first activation
|
||||
|
||||
## Goal State
|
||||
|
||||
A single RVF artifact that boots a minimal Linux userspace, progressively
|
||||
downloads and ingests public astronomy and physics datasets on first
|
||||
activation (lazy, like ADR-008's GGUF model download), builds a multi-scale
|
||||
interaction graph, maintains a dynamic coherence field, and emits replayable
|
||||
witness logs for every derived claim.
|
||||
|
||||
### Primary Outputs
|
||||
|
||||
| # | Output | Description |
|
||||
|---|--------|-------------|
|
||||
| 1 | **Atlas snapshots** | Queryable causal partial order plus embeddings |
|
||||
| 2 | **Coherence field** | Partition tree plus cut pressure signals over time |
|
||||
| 3 | **Multi-scale memory** | Delta-encoded interaction history from seconds to micro-windows |
|
||||
| 4 | **Boundary tracker** | Boundary changes, drift, and anomaly alerts |
|
||||
| 5 | **Planet candidates** | Ranked list with traceable evidence |
|
||||
| 6 | **Life candidates** | Ranked list of spectral disequilibrium signatures with traceable evidence |
|
||||
|
||||
### Non-Goals
|
||||
|
||||
1. Proving quantum gravity
|
||||
2. Replacing astrophysical pipelines end-to-end
|
||||
3. Claiming life detection without conventional follow-up observation
|
||||
|
||||
## Public Data Sources
|
||||
|
||||
All data is progressively downloaded from public archives on first activation.
|
||||
The RVF artifact ships with download manifests and integrity hashes, not the
|
||||
raw data itself.
|
||||
|
||||
### Planet Finding
|
||||
|
||||
| Source | Access | Reference |
|
||||
|--------|--------|-----------|
|
||||
| Kepler light curves and pixel files | MAST bulk and portal | [archive.stsci.edu/kepler](https://archive.stsci.edu/missions-and-data/kepler) |
|
||||
| TESS light curves and full-frame images | MAST portal | [archive.stsci.edu/tess](https://archive.stsci.edu/missions-and-data/tess) |
|
||||
|
||||
### Life-Relevant Spectra
|
||||
|
||||
| Source | Access | Reference |
|
||||
|--------|--------|-----------|
|
||||
| JWST exoplanet spectra | exo.MAST and MAST holdings | [archive.stsci.edu](https://archive.stsci.edu/home) |
|
||||
| NASA Exoplanet Archive parameters | Cross-linking to spectra and mission products | [exoplanetarchive.ipac.caltech.edu](https://exoplanetarchive.ipac.caltech.edu/) |
|
||||
|
||||
### Large-Scale Structure
|
||||
|
||||
| Source | Access | Reference |
|
||||
|--------|--------|-----------|
|
||||
| SDSS public catalogs (spectra, redshifts) | DR17 | [sdss4.org/dr17](https://www.sdss4.org/dr17/) |
|
||||
|
||||
### Progressive Download Strategy
|
||||
|
||||
Following the lazy-download pattern established in ADR-008 for GGUF models:
|
||||
|
||||
1. **Manifest-first**: RVF ships with `MANIFEST_SEG` containing download URLs,
|
||||
SHA-256 hashes, expected sizes, and priority tiers
|
||||
2. **Tier 0 (boot)**: Minimal curated dataset (~50 MB) for offline demo —
|
||||
100 Kepler targets with known confirmed planets, embedded in VEC_SEG
|
||||
3. **Tier 1 (first run)**: Download 1,000 Kepler targets on first pipeline
|
||||
activation. Background download, progress reported via CLI/HTTP
|
||||
4. **Tier 2 (expansion)**: Full Kepler/TESS catalog download on explicit
|
||||
`rvf ingest --expand` command
|
||||
5. **Tier 3 (spectra)**: JWST and archive spectra downloaded when life
|
||||
candidate pipeline is first activated
|
||||
6. **Seal-on-complete**: After download, data is ingested into VEC_SEG and
|
||||
INDEX_SEG, a new witness root is committed, and the RVF is sealed into
|
||||
a reproducible snapshot
|
||||
|
||||
```
|
||||
Download state machine:
|
||||
|
||||
[boot] ──first-inference──> [downloading-tier-1]
|
||||
│ │
|
||||
│ (offline demo works) │ (progress: 0-100%)
|
||||
│ │
|
||||
▼ ▼
|
||||
[tier-0-only] [tier-1-ready]
|
||||
│
|
||||
rvf ingest --expand
|
||||
│
|
||||
▼
|
||||
[tier-2-ready]
|
||||
│
|
||||
life pipeline activated
|
||||
│
|
||||
▼
|
||||
[tier-3-ready] ──seal──> [sealed-snapshot]
|
||||
```
|
||||
|
||||
Each tier download:
|
||||
- Resumes from last byte on interruption (HTTP Range headers)
|
||||
- Validates SHA-256 after download
|
||||
- Commits a witness record for the download event
|
||||
- Can be skipped with `--offline` flag (uses whatever is already present)
|
||||
|
||||
## RVF Artifact Layout
|
||||
|
||||
Extends the ADR-003 segment model with domain-specific segments.
|
||||
|
||||
| # | Segment | Contents |
|
||||
|---|---------|----------|
|
||||
| 1 | `MANIFEST_SEG` | Segment table, hashes, policy, budgets, version gates, **download manifests** |
|
||||
| 2 | `KERNEL_SEG` | Minimal Linux kernel image for portable boot (reuse ADR-008) |
|
||||
| 3 | `INITRD_SEG` | Minimal userspace: busybox, RuVector binaries, data ingest tools, query server |
|
||||
| 4 | `EBPF_SEG` | Socket allow-list and syscall reduction. Default: local loopback + explicit download ports only |
|
||||
| 5 | `VEC_SEG` | Embedding vectors: light-curve windows, spectrum windows, graph node descriptors, partition boundary descriptors |
|
||||
| 6 | `INDEX_SEG` | HNSW unified attention index for vectors and boundary descriptors |
|
||||
| 7 | `GRAPH_SEG` | Dynamic interaction graph: nodes, edges, timestamps, authority, provenance |
|
||||
| 8 | `DELTA_SEG` | Append-only change log of graph updates and field updates |
|
||||
| 9 | `WITNESS_SEG` | Deterministic witness chain: canonical serialization, signed root hash progression |
|
||||
| 10 | `POLICY_SEG` | Data provenance requirements, candidate publishing thresholds, deny rules, confidence floors |
|
||||
| 11 | `DASHBOARD_SEG` | Vite-bundled Three.js visualization app — static assets served by runtime HTTP server |
|
||||
|
||||
## Data Model
|
||||
|
||||
### Core Entities
|
||||
|
||||
```typescript
|
||||
interface Event {
|
||||
id: string;
|
||||
t_start: number; // epoch seconds
|
||||
t_end: number;
|
||||
domain: 'kepler' | 'tess' | 'jwst' | 'sdss' | 'derived';
|
||||
payload_hash: string; // SHA-256 of raw data window
|
||||
provenance: Provenance;
|
||||
}
|
||||
|
||||
interface Observation {
|
||||
id: string;
|
||||
instrument: string; // 'kepler-lc' | 'tess-ffi' | 'jwst-nirspec' | ...
|
||||
target_id: string; // e.g., KIC or TIC identifier
|
||||
data_pointer: string; // segment offset into VEC_SEG
|
||||
calibration_version: string;
|
||||
provenance: Provenance;
|
||||
}
|
||||
|
||||
interface InteractionEdge {
|
||||
src_event_id: string;
|
||||
dst_event_id: string;
|
||||
type: 'causal' | 'periodicity' | 'shape_similarity' | 'co_occurrence' | 'spatial';
|
||||
weight: number;
|
||||
lag: number; // temporal lag in seconds
|
||||
confidence: number;
|
||||
provenance: Provenance;
|
||||
}
|
||||
|
||||
interface Boundary {
|
||||
boundary_id: string;
|
||||
partition_left_set_hash: string;
|
||||
partition_right_set_hash: string;
|
||||
cut_weight: number;
|
||||
cut_witness: string; // witness chain reference
|
||||
stability_score: number;
|
||||
}
|
||||
|
||||
interface Candidate {
|
||||
candidate_id: string;
|
||||
category: 'planet' | 'life';
|
||||
evidence_pointers: string[]; // event and edge IDs
|
||||
score: number;
|
||||
uncertainty: number;
|
||||
publishable: boolean; // based on POLICY_SEG rules
|
||||
witness_trace: string; // WITNESS_SEG reference for replay
|
||||
}
|
||||
|
||||
interface Provenance {
|
||||
source: string; // 'mast-kepler' | 'mast-tess' | 'mast-jwst' | ...
|
||||
download_witness: string; // witness chain entry for the download
|
||||
transform_chain: string[]; // ordered list of transform IDs applied
|
||||
timestamp: string; // ISO-8601
|
||||
}
|
||||
```
|
||||
|
||||
### Domain Adapters
|
||||
|
||||
#### Planet Transit Adapter
|
||||
|
||||
```
|
||||
Input: flux time series + cadence metadata (Kepler/TESS FITS)
|
||||
Output: Event nodes for windows
|
||||
InteractionEdges for periodicity hints and shape similarity
|
||||
Candidate nodes for dip detections
|
||||
```
|
||||
|
||||
#### Spectrum Adapter
|
||||
|
||||
```
|
||||
Input: wavelength, flux, error arrays (JWST NIRSpec, etc.)
|
||||
Output: Event nodes for band windows
|
||||
InteractionEdges for molecule feature co-occurrence
|
||||
Disequilibrium score components
|
||||
```
|
||||
|
||||
#### Cosmic Web Adapter (optional, Phase 2+)
|
||||
|
||||
```
|
||||
Input: galaxy positions and redshifts (SDSS)
|
||||
Output: Graph of spatial adjacency and filament membership
|
||||
```
|
||||
|
||||
## The Four System Constructs
|
||||
|
||||
### 1. Compressed Causal Atlas
|
||||
|
||||
**Definition**: A partial order of events plus minimal sufficient descriptors
|
||||
to reproduce derived edges.
|
||||
|
||||
**Construction**:
|
||||
|
||||
1. **Windowing** — Light curves into overlapping windows at multiple scales
|
||||
- Scales: 2 hours, 12 hours, 3 days, 27 days
|
||||
|
||||
2. **Feature extraction** — Robust features per window
|
||||
- Flux derivative statistics
|
||||
- Autocorrelation peaks
|
||||
- Wavelet energy bands
|
||||
- Transit-shaped matched filter response
|
||||
|
||||
3. **Embedding** — RuVector SIMD embed per window, stored in VEC_SEG
|
||||
|
||||
4. **Causal edges** — Add edge when window A precedes window B and improves
|
||||
predictability of B (conditional mutual information proxy or prediction gain,
|
||||
subject to POLICY_SEG constraints)
|
||||
- Edge weight: prediction gain magnitude
|
||||
- Provenance: exact windows, transform IDs, threshold used
|
||||
|
||||
5. **Atlas compression**
|
||||
- Keep only top-k causal parents per node
|
||||
- Retain stable boundary witnesses
|
||||
- Delta-encode updates into DELTA_SEG
|
||||
|
||||
**Output API**:
|
||||
|
||||
| Endpoint | Returns |
|
||||
|----------|---------|
|
||||
| `atlas.query(event_id)` | Parents, children, plus provenance |
|
||||
| `atlas.trace(candidate_id)` | Minimal causal chain for a candidate |
|
||||
|
||||
### 2. Coherence Field Index
|
||||
|
||||
**Definition**: A field over the atlas graph that assigns coherence pressure
|
||||
and cut stability over time.
|
||||
|
||||
**Signals**:
|
||||
|
||||
| Signal | Description |
|
||||
|--------|-------------|
|
||||
| Cut pressure | Minimum cut values over selected subgraphs |
|
||||
| Partition entropy | Distribution of cluster sizes and churn rate |
|
||||
| Disagreement | Cross-detector disagreement rate |
|
||||
| Drift | Embedding distribution shift in sliding window |
|
||||
|
||||
**Algorithm**:
|
||||
|
||||
1. Maintain a partition tree. Update with dynamic min-cut on incremental
|
||||
graph changes
|
||||
2. For each update epoch:
|
||||
- Compute cut witnesses for top boundaries
|
||||
- Emit boundary events into GRAPH_SEG
|
||||
- Append witness record into WITNESS_SEG
|
||||
3. Index boundaries via descriptor vector:
|
||||
- Cut value, partition sizes, local graph curvature proxy, recent churn
|
||||
|
||||
**Query API**:
|
||||
|
||||
| Endpoint | Returns |
|
||||
|----------|---------|
|
||||
| `coherence.get(target_id, epoch)` | Field values for target at epoch |
|
||||
| `boundary.nearest(descriptor)` | Similar historical boundary states via INDEX_SEG |
|
||||
|
||||
### 3. Multi-Scale Interaction Memory
|
||||
|
||||
**Definition**: A memory that retains interactions at multiple time resolutions
|
||||
with strict budget control.
|
||||
|
||||
**Three tiers**:
|
||||
|
||||
| Tier | Resolution | Content |
|
||||
|------|-----------|---------|
|
||||
| **S** | Seconds to minutes | High-fidelity deltas |
|
||||
| **M** | Hours to days | Aggregated deltas |
|
||||
| **L** | Weeks to months | Boundary summaries and archetypes |
|
||||
|
||||
**Retention rules**:
|
||||
1. Preserve events that are boundary-critical
|
||||
2. Preserve events that are candidate evidence
|
||||
3. Compress everything else via archetype clustering in INDEX_SEG
|
||||
|
||||
**Mechanism**:
|
||||
- DELTA_SEG is append-only
|
||||
- Periodic compaction produces a new RVF root with a witness proof of
|
||||
preservation rules applied
|
||||
|
||||
### 4. Boundary Evolution Tracker
|
||||
|
||||
**Definition**: A tracker that treats boundaries as primary objects that evolve
|
||||
over time.
|
||||
|
||||
**This is where the holographic flavor is implemented.** You primarily store
|
||||
and index boundaries, and treat interior state as reconstructable from boundary
|
||||
witnesses and retained archetypes.
|
||||
|
||||
**Output API**:
|
||||
|
||||
| Endpoint | Returns |
|
||||
|----------|---------|
|
||||
| `boundary.timeline(target_id)` | Boundary evolution over time |
|
||||
| `boundary.alerts` | Alerts when: cut pressure spikes, boundary identity flips, disagreement exceeds threshold, drift persists beyond policy |
|
||||
|
||||
## Planet Detection Pipeline
|
||||
|
||||
### Stage P0: Ingest
|
||||
|
||||
**Input**: Kepler or TESS light curves from MAST (progressively downloaded)
|
||||
|
||||
1. Normalize flux
|
||||
2. Remove obvious systematics (detrending)
|
||||
3. Segment into windows and store as Event nodes
|
||||
|
||||
### Stage P1: Candidate Generation
|
||||
|
||||
1. Matched filter bank for transit-like dips
|
||||
2. Period search on candidate dip times (BLS or similar)
|
||||
3. Create Candidate node per period hypothesis
|
||||
|
||||
### Stage P2: Coherence Gating
|
||||
|
||||
Candidate must pass all gates:
|
||||
|
||||
| Gate | Requirement |
|
||||
|------|-------------|
|
||||
| Multi-scale stability | Stable across multiple window scales |
|
||||
| Boundary consistency | Consistent boundary signature around transit times |
|
||||
| Low drift | Drift below threshold across adjacent windows |
|
||||
|
||||
**Score components**:
|
||||
|
||||
| Component | Description |
|
||||
|-----------|-------------|
|
||||
| SNR-like strength | Signal-to-noise of transit dip |
|
||||
| Shape consistency | Cross-transit shape agreement |
|
||||
| Period stability | Variance of period estimates |
|
||||
| Coherence stability | Coherence field stability around candidate |
|
||||
|
||||
**Emit**: Candidate with evidence pointers + witness trace listing exact
|
||||
windows, transforms, and thresholds used.
|
||||
|
||||
## Life Candidate Pipeline
|
||||
|
||||
Life detection here means pre-screening for non-equilibrium atmospheric
|
||||
chemistry signatures, not proof.
|
||||
|
||||
### Stage L0: Ingest
|
||||
|
||||
**Input**: Published or mission spectra tied to targets via MAST and NASA
|
||||
Exoplanet Archive (progressively downloaded on first pipeline activation)
|
||||
|
||||
1. Normalize and denoise within instrument error model
|
||||
2. Window spectra by wavelength bands
|
||||
3. Create band Event nodes
|
||||
|
||||
### Stage L1: Feature Extraction
|
||||
|
||||
1. Identify absorption features and confidence bands
|
||||
2. Encode presence vectors for key molecule families (H2O, CO2, CH4, O3, NH3, etc.)
|
||||
3. Build InteractionEdges between features that co-occur in physically
|
||||
meaningful patterns
|
||||
|
||||
### Stage L2: Disequilibrium Scoring
|
||||
|
||||
**Core concept**: Life-like systems maintain chemical ratios that resist
|
||||
thermodynamic relaxation.
|
||||
|
||||
**Implementation as graph scoring**:
|
||||
|
||||
1. Build a reaction plausibility graph (prior rule set in POLICY_SEG)
|
||||
2. Compute inconsistency score between observed co-occurrences and expected
|
||||
equilibrium patterns
|
||||
3. Track stability of that score across epochs and observation sets
|
||||
|
||||
**Score components**:
|
||||
|
||||
| Component | Description |
|
||||
|-----------|-------------|
|
||||
| Persistent multi-molecule imbalance | Proxy for non-equilibrium chemistry |
|
||||
| Feature repeatability | Agreement across instruments or visits |
|
||||
| Contamination risk penalty | Instrument artifact and stellar contamination |
|
||||
| Stellar activity confound penalty | Host star variability coupling |
|
||||
|
||||
**Output**: Life candidate list with explicit uncertainty + required follow-up
|
||||
observations list generated by POLICY_SEG rules.
|
||||
|
||||
## Runtime and Portability
|
||||
|
||||
### Boot Sequence
|
||||
|
||||
1. RVF boots minimal Linux from KERNEL_SEG and INITRD_SEG (reuse ADR-008 `KernelBuilder`)
|
||||
2. Starts `rvf-runtime` daemon exposing local HTTP and CLI
|
||||
3. On first inference/query, progressively downloads required data tier
|
||||
|
||||
### Local Interfaces
|
||||
|
||||
**CLI**:
|
||||
```bash
|
||||
rvf run artifact.rvf # boot the runtime
|
||||
rvf query planet list # ranked planet candidates
|
||||
rvf query life list # ranked life candidates
|
||||
rvf trace <candidate_id> # full witness trace for any candidate
|
||||
rvf ingest --expand # download tier-2 full catalog
|
||||
rvf status # download progress, segment sizes, witness count
|
||||
```
|
||||
|
||||
**HTTP**:
|
||||
```
|
||||
GET / # Three.js dashboard (served from DASHBOARD_SEG)
|
||||
GET /assets/* # Dashboard static assets
|
||||
|
||||
GET /api/atlas/query?event_id=... # causal parents/children
|
||||
GET /api/atlas/trace?candidate_id=... # minimal causal chain
|
||||
GET /api/coherence?target_id=...&epoch= # field values
|
||||
GET /api/boundary/timeline?target_id=...
|
||||
GET /api/boundary/alerts
|
||||
GET /api/candidates/planet # ranked planet list
|
||||
GET /api/candidates/life # ranked life list
|
||||
GET /api/candidates/:id/trace # witness trace
|
||||
GET /api/status # system health + download progress
|
||||
GET /api/memory/tiers # tier S/M/L utilization
|
||||
|
||||
WS /ws/live # real-time boundary alerts, pipeline progress, candidate updates
|
||||
```
|
||||
|
||||
### Determinism
|
||||
|
||||
1. Fixed seeds for all stochastic operations
|
||||
2. Canonical serialization of every intermediate artifact
|
||||
3. Witness chain commits after each epoch
|
||||
4. Two-machine reproducibility: identical RVF root hash for identical input
|
||||
|
||||
### Security Defaults
|
||||
|
||||
1. Network off by default
|
||||
2. If enabled, eBPF allow-list: MAST/archive download ports + local loopback only
|
||||
3. No remote writes without explicit policy toggle in POLICY_SEG
|
||||
4. Downloaded data verified against MANIFEST_SEG hashes before ingestion
|
||||
|
||||
## Three.js Visualization Dashboard
|
||||
|
||||
The RVF embeds a Vite-bundled Three.js dashboard in `DASHBOARD_SEG`. The
|
||||
runtime HTTP server serves it at `/` (root). All visualizations are driven
|
||||
by the same API endpoints the CLI uses, so every rendered frame corresponds
|
||||
to queryable, witness-backed data.
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
DASHBOARD_SEG (inside RVF)
|
||||
dist/
|
||||
index.html # Vite SPA entry
|
||||
assets/
|
||||
main.[hash].js # Three.js + D3 + app logic (tree-shaken)
|
||||
main.[hash].css # Tailwind/minimal styles
|
||||
worker.js # Web Worker for graph layout
|
||||
|
||||
Runtime serves:
|
||||
GET / -> DASHBOARD_SEG/dist/index.html
|
||||
GET /assets/* -> DASHBOARD_SEG/dist/assets/*
|
||||
GET /api/* -> JSON API (atlas, coherence, candidates, etc.)
|
||||
WS /ws/live -> Live streaming of boundary alerts and pipeline progress
|
||||
```
|
||||
|
||||
**Build pipeline**: Vite builds the dashboard at package time into a single
|
||||
tree-shaken bundle. The bundle is embedded into `DASHBOARD_SEG` during RVF
|
||||
assembly. No Node.js required at runtime — the dashboard is pure static
|
||||
assets served by the existing HTTP server.
|
||||
|
||||
### Dashboard Views
|
||||
|
||||
#### V1: Causal Atlas Explorer (Three.js 3D)
|
||||
|
||||
Interactive 3D force-directed graph of the causal atlas.
|
||||
|
||||
| Feature | Implementation |
|
||||
|---------|---------------|
|
||||
| **Node rendering** | `THREE.InstancedMesh` for events — color by domain (Kepler=blue, TESS=cyan, JWST=gold, derived=white) |
|
||||
| **Edge rendering** | `THREE.LineSegments` with opacity mapped to edge weight |
|
||||
| **Causal flow** | Animated particles along causal edges showing temporal direction |
|
||||
| **Scale selector** | Toggle between window scales (2h, 12h, 3d, 27d) — re-layouts graph |
|
||||
| **Candidate highlight** | Click candidate in sidebar to trace its causal chain in 3D, dimming unrelated nodes |
|
||||
| **Witness replay** | Step through witness chain entries, animating graph state forward/backward |
|
||||
| **LOD** | Level-of-detail: far=boundary nodes only, mid=top-k events, close=full subgraph |
|
||||
|
||||
Data source: `GET /api/atlas/query`, `GET /api/atlas/trace`
|
||||
|
||||
#### V2: Coherence Field Heatmap (Three.js + shader)
|
||||
|
||||
Real-time coherence field rendered as a colored surface over the atlas graph.
|
||||
|
||||
| Feature | Implementation |
|
||||
|---------|---------------|
|
||||
| **Field surface** | `THREE.PlaneGeometry` subdivided grid, vertex colors from coherence values |
|
||||
| **Cut pressure** | Red hotspots where cut pressure is high, cool blue where stable |
|
||||
| **Partition boundaries** | Glowing wireframe lines at partition cuts |
|
||||
| **Time scrubber** | Scrub through epochs to see coherence evolution |
|
||||
| **Drift overlay** | Toggle to show embedding drift as animated vector arrows |
|
||||
| **Alert markers** | Pulsing icons at boundary alert locations |
|
||||
|
||||
Data source: `GET /api/coherence`, `GET /api/boundary/timeline`, `WS /ws/live`
|
||||
|
||||
#### V3: Planet Candidate Dashboard (2D panels + 3D orbit)
|
||||
|
||||
Split view combining data panels with 3D orbital visualization.
|
||||
|
||||
| Panel | Content |
|
||||
|-------|---------|
|
||||
| **Ranked list** | Sortable table: candidate ID, score, uncertainty, period, SNR, publishable status |
|
||||
| **Light curve viewer** | Interactive D3 chart: raw flux, detrended flux, transit model overlay, per-window score |
|
||||
| **Phase-folded plot** | All transits folded at detected period, with confidence band |
|
||||
| **3D orbit preview** | `THREE.Line` showing inferred orbital path around host star, sized by uncertainty |
|
||||
| **Evidence trace** | Expandable tree showing witness chain from raw data to final score |
|
||||
| **Score breakdown** | Radar chart: SNR, shape consistency, period stability, coherence stability |
|
||||
|
||||
Data source: `GET /api/candidates/planet`, `GET /api/candidates/:id/trace`
|
||||
|
||||
#### V4: Life Candidate Dashboard (2D panels + 3D molecule)
|
||||
|
||||
Split view for spectral disequilibrium analysis.
|
||||
|
||||
| Panel | Content |
|
||||
|-------|---------|
|
||||
| **Ranked list** | Sortable table: candidate ID, disequilibrium score, uncertainty, molecule flags, publishable |
|
||||
| **Spectrum viewer** | Interactive D3 chart: wavelength vs flux, molecule absorption bands highlighted |
|
||||
| **Molecule presence matrix** | Heatmap of detected molecule families vs confidence |
|
||||
| **3D molecule overlay** | `THREE.Sprite` labels at absorption wavelengths in a 3D wavelength space |
|
||||
| **Reaction graph** | Force-directed graph of molecule co-occurrences vs equilibrium expectations |
|
||||
| **Confound panel** | Bar chart: stellar activity penalty, contamination risk, repeatability score |
|
||||
|
||||
Data source: `GET /api/candidates/life`, `GET /api/candidates/:id/trace`
|
||||
|
||||
#### V5: System Status Dashboard
|
||||
|
||||
Operational health and download progress.
|
||||
|
||||
| Panel | Content |
|
||||
|-------|---------|
|
||||
| **Download progress** | Per-tier progress bars with byte counts and ETA |
|
||||
| **Segment sizes** | Stacked bar chart of RVF segment utilization |
|
||||
| **Memory tiers** | S/M/L tier fill levels and compaction history |
|
||||
| **Witness chain** | Scrolling log of recent witness entries with hash preview |
|
||||
| **Pipeline status** | P0/P1/P2 and L0/L1/L2 stage indicators with event counts |
|
||||
| **Performance** | Query latency histogram, events/second throughput |
|
||||
|
||||
Data source: `GET /api/status`, `GET /api/memory/tiers`, `WS /ws/live`
|
||||
|
||||
### WebSocket Live Stream
|
||||
|
||||
```typescript
|
||||
// WS /ws/live — server pushes events as they happen
|
||||
interface LiveEvent {
|
||||
type: 'boundary_alert' | 'candidate_new' | 'candidate_update' |
|
||||
'download_progress' | 'witness_commit' | 'pipeline_stage' |
|
||||
'coherence_update';
|
||||
timestamp: string;
|
||||
data: Record<string, unknown>;
|
||||
}
|
||||
```
|
||||
|
||||
The dashboard subscribes on connect and updates all views in real-time as
|
||||
pipelines process data and boundaries evolve.
|
||||
|
||||
### Vite Build Configuration
|
||||
|
||||
```typescript
|
||||
// vite.config.ts for dashboard build
|
||||
import { defineConfig } from 'vite';
|
||||
|
||||
export default defineConfig({
|
||||
build: {
|
||||
outDir: 'dist/dashboard',
|
||||
assetsDir: 'assets',
|
||||
rollupOptions: {
|
||||
output: {
|
||||
manualChunks: {
|
||||
three: ['three'], // ~150 KB gzipped
|
||||
d3: ['d3-scale', 'd3-axis', 'd3-shape', 'd3-selection'],
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
**Bundle budget**: < 500 KB gzipped total (Three.js ~150 KB, D3 subset ~30 KB,
|
||||
app logic ~50 KB, styles ~10 KB). The dashboard adds minimal overhead to the
|
||||
RVF artifact.
|
||||
|
||||
### Design Decision: D5 — Dashboard Embedded in RVF
|
||||
|
||||
The Three.js dashboard is bundled at build time and embedded in `DASHBOARD_SEG`
|
||||
rather than served from an external CDN or requiring a separate install. This
|
||||
ensures:
|
||||
|
||||
1. **Fully offline**: Works without network after boot
|
||||
2. **Version-locked**: Dashboard always matches the API version it queries
|
||||
3. **Single artifact**: One RVF file = runtime + data + visualization
|
||||
4. **Witness-aligned**: Dashboard renders exactly the data the witness chain
|
||||
can verify
|
||||
|
||||
## Package Structure
|
||||
|
||||
```
|
||||
packages/agentdb-causal-atlas/
|
||||
src/
|
||||
index.ts # createCausalAtlasServer() factory
|
||||
CausalAtlasServer.ts # HTTP + CLI runtime + dashboard serving + WS
|
||||
CausalAtlasEngine.ts # Core atlas, coherence, memory, boundary
|
||||
adapters/
|
||||
PlanetTransitAdapter.ts # Kepler/TESS light curve ingestion
|
||||
SpectrumAdapter.ts # JWST/archive spectral ingestion
|
||||
CosmicWebAdapter.ts # SDSS spatial graph (Phase 2)
|
||||
pipelines/
|
||||
PlanetDetection.ts # P0-P2 planet detection pipeline
|
||||
LifeCandidate.ts # L0-L2 life candidate pipeline
|
||||
constructs/
|
||||
CausalAtlas.ts # Compressed causal partial order
|
||||
CoherenceField.ts # Partition tree + cut pressure
|
||||
MultiScaleMemory.ts # Tiered S/M/L retention
|
||||
BoundaryTracker.ts # Boundary evolution + alerts
|
||||
download/
|
||||
ProgressiveDownloader.ts # Tiered lazy download with resume
|
||||
DataManifest.ts # URL + hash + size manifests
|
||||
KernelBuilder.ts # Reuse/extend from ADR-008
|
||||
dashboard/ # Vite + Three.js visualization app
|
||||
vite.config.ts # Build config — outputs to dist/dashboard/
|
||||
index.html # SPA entry point
|
||||
src/
|
||||
main.ts # App bootstrap, router, WS connection
|
||||
api.ts # Typed fetch wrappers for /api/* endpoints
|
||||
ws.ts # WebSocket client for /ws/live
|
||||
views/
|
||||
AtlasExplorer.ts # V1: 3D causal atlas (Three.js force graph)
|
||||
CoherenceHeatmap.ts # V2: Coherence field surface + cut pressure
|
||||
PlanetDashboard.ts # V3: Planet candidates + light curves + 3D orbit
|
||||
LifeDashboard.ts # V4: Life candidates + spectra + molecule graph
|
||||
StatusDashboard.ts # V5: System health, downloads, witness log
|
||||
three/
|
||||
AtlasGraph.ts # InstancedMesh nodes, LineSegments edges, particles
|
||||
CoherenceSurface.ts # PlaneGeometry with vertex-colored field
|
||||
OrbitPreview.ts # Orbital path visualization
|
||||
CausalFlow.ts # Animated particles along causal edges
|
||||
LODController.ts # Level-of-detail: boundary → top-k → full
|
||||
charts/
|
||||
LightCurveChart.ts # D3 flux time series with transit overlay
|
||||
SpectrumChart.ts # D3 wavelength vs flux with molecule bands
|
||||
RadarChart.ts # Score breakdown radar
|
||||
MoleculeMatrix.ts # Heatmap of molecule presence vs confidence
|
||||
components/
|
||||
Sidebar.ts # Candidate list, filters, search
|
||||
TimeScrubber.ts # Epoch scrubber for coherence replay
|
||||
WitnessLog.ts # Scrolling witness chain entries
|
||||
DownloadProgress.ts # Tier progress bars
|
||||
styles/
|
||||
main.css # Minimal Tailwind or hand-rolled styles
|
||||
tests/
|
||||
causal-atlas.test.ts
|
||||
planet-detection.test.ts
|
||||
life-candidate.test.ts
|
||||
progressive-download.test.ts
|
||||
coherence-field.test.ts
|
||||
boundary-tracker.test.ts
|
||||
dashboard.test.ts # Dashboard build + API integration tests
|
||||
```
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
### Phase 1: Core Atlas + Planet Detection + Dashboard Shell (v0.1)
|
||||
|
||||
**Scope**: Kepler and TESS only. No spectra. No life scoring.
|
||||
|
||||
1. Implement `ProgressiveDownloader` with tier-0 curated dataset (100 Kepler targets)
|
||||
2. Implement `PlanetTransitAdapter` for FITS light curve ingestion
|
||||
3. Implement `CausalAtlas` with windowing, feature extraction, SIMD embedding
|
||||
4. Implement `PlanetDetection` pipeline (P0-P2)
|
||||
5. Implement `WITNESS_SEG` with SHAKE-256 chain
|
||||
6. CLI: `rvf run`, `rvf query planet list`, `rvf trace`
|
||||
7. HTTP: `/api/candidates/planet`, `/api/atlas/trace`
|
||||
8. Dashboard: Vite scaffold, V1 Atlas Explorer (Three.js 3D graph), V3 Planet
|
||||
Dashboard (ranked list + light curve chart), V5 Status Dashboard (download
|
||||
progress + witness log). Embedded in `DASHBOARD_SEG`, served at `/`
|
||||
9. WebSocket `/ws/live` for real-time pipeline progress
|
||||
|
||||
**Acceptance**: 1,000 Kepler targets, top-100 ranked list includes >= 80
|
||||
confirmed planets, every item replays to same score and witness root on two
|
||||
machines. Dashboard renders atlas graph and candidate list in browser.
|
||||
|
||||
### Phase 2: Coherence Field + Boundary Tracker + Dashboard V2 (v0.2)
|
||||
|
||||
1. Implement `CoherenceField` with dynamic min-cut, partition entropy
|
||||
2. Implement `BoundaryTracker` with timeline and alerts
|
||||
3. Implement `MultiScaleMemory` with S/M/L tiers and budget control
|
||||
4. Add coherence gating to planet pipeline
|
||||
5. HTTP: `/api/coherence`, `/api/boundary/*`, `/api/memory/tiers`
|
||||
6. Dashboard: V2 Coherence Heatmap (Three.js field surface + cut pressure
|
||||
overlay + time scrubber), boundary alert markers via WebSocket
|
||||
|
||||
### Phase 3: Life Candidate Pipeline + Dashboard V4 (v0.3)
|
||||
|
||||
1. Implement `SpectrumAdapter` for JWST/archive spectral data
|
||||
2. Implement `LifeCandidate` pipeline (L0-L2)
|
||||
3. Implement disequilibrium scoring with reaction plausibility graph
|
||||
4. Tier-3 progressive download for spectral data
|
||||
5. CLI: `rvf query life list`
|
||||
6. HTTP: `/api/candidates/life`
|
||||
7. Dashboard: V4 Life Dashboard (spectrum viewer + molecule presence matrix
|
||||
+ reaction graph + confound panel)
|
||||
|
||||
**Acceptance**: Published spectra with known atmospheric detections vs nulls,
|
||||
AUC > 0.8, every score includes confound penalties and provenance trace.
|
||||
Dashboard renders spectrum analysis in browser.
|
||||
|
||||
### Phase 4: Cosmic Web + Full Integration (v0.4)
|
||||
|
||||
1. `CosmicWebAdapter` for SDSS spatial graph
|
||||
2. Cross-domain coherence (planet candidates enriched by large-scale context)
|
||||
3. Dashboard: 3D cosmic web view, cross-domain candidate linking
|
||||
4. Full offline demo with sealed RVF snapshot
|
||||
5. `rvf ingest --expand` for tier-2 bulk download
|
||||
6. Dashboard polish: LOD optimization, mobile-responsive layout, dark/light theme
|
||||
|
||||
## Evaluation Plan
|
||||
|
||||
### Planet Detection Acceptance Test
|
||||
|
||||
| Metric | Requirement |
|
||||
|--------|-------------|
|
||||
| Recall@100 | >= 80 confirmed planets in top 100 |
|
||||
| False positives@100 | Documented with witness traces |
|
||||
| Median time per star | Measured and reported |
|
||||
| Reproducibility | Identical root hash on two machines |
|
||||
|
||||
### Life Candidate Acceptance Test
|
||||
|
||||
| Metric | Requirement |
|
||||
|--------|-------------|
|
||||
| AUC (detected vs null) | > 0.8 |
|
||||
| Confound penalties | Present on every score |
|
||||
| Provenance trace | Complete for every score |
|
||||
|
||||
### System Acceptance Test
|
||||
|
||||
| Test | Requirement |
|
||||
|------|-------------|
|
||||
| Boot reproducibility | Identical root hash across two machines |
|
||||
| Query determinism | Identical results for same dataset snapshot |
|
||||
| Witness verification | `verifyWitness` passes for all chains |
|
||||
| Progressive download | Resumes correctly after interruption |
|
||||
|
||||
## Failure Modes and Fix Path
|
||||
|
||||
| Failure | Fix |
|
||||
|---------|-----|
|
||||
| Noise dominates coherence field | Strengthen policy priors, add confound penalties, enforce multi-epoch stability |
|
||||
| Over-compression kills rare signals | Boundary-critical retention rules + candidate evidence pinning |
|
||||
| Spurious life signals from stellar activity | Model stellar variability as its own interaction graph, penalize coupling |
|
||||
| Compute blow-up | Strict budgets in POLICY_SEG, tiered memory, boundary-first indexing |
|
||||
| Download interruption | HTTP Range resume, partial-ingest checkpoint, witness for partial state |
|
||||
|
||||
## Design Decisions
|
||||
|
||||
### D1: Kepler/TESS only in v1, spectra in v3
|
||||
|
||||
Phase 1 delivers a concrete, testable planet-detection system. Life scoring
|
||||
requires additional instrument-specific adapters and more nuanced policy
|
||||
rules. Separating them de-risks the schedule.
|
||||
|
||||
### D2: Progressive download with embedded demo subset
|
||||
|
||||
The RVF artifact ships with a curated ~50 MB tier-0 dataset for fully offline
|
||||
demonstration. Full catalog data is downloaded lazily, following the pattern
|
||||
proven in ADR-008 for GGUF model files. This keeps the initial artifact small
|
||||
(< 100 MB without kernel) while supporting the full 1,000+ target benchmark.
|
||||
|
||||
### D3: Boundary-first storage (holographic principle)
|
||||
|
||||
Boundaries are stored as first-class indexed objects. Interior state is
|
||||
reconstructed on-demand from boundary witnesses and retained archetypes.
|
||||
This reduces storage by 10-50x for large graphs while preserving
|
||||
queryability and reproducibility.
|
||||
|
||||
### D4: Witness chain for every derived claim
|
||||
|
||||
Every candidate, every coherence measurement, and every boundary change is
|
||||
committed to the SHAKE-256 witness chain. This enables two-machinevisu
|
||||
reproducibility verification and provides a complete audit trail from raw
|
||||
data to final score.
|
||||
|
||||
## References
|
||||
|
||||
1. [MAST — Kepler](https://archive.stsci.edu/missions-and-data/kepler)
|
||||
2. [MAST — TESS](https://archive.stsci.edu/missions-and-data/tess)
|
||||
3. [MAST Home](https://archive.stsci.edu/home)
|
||||
4. [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/)
|
||||
5. [SDSS DR17](https://www.sdss4.org/dr17/)
|
||||
6. ADR-003: RVF Native Format Integration
|
||||
7. ADR-006: Unified Self-Learning RVF Integration
|
||||
8. ADR-007: RuVector Full Capability Integration
|
||||
9. ADR-008: Chat UI RVF Kernel Embedding
|
||||
Reference in New Issue
Block a user