git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
259 lines
12 KiB
Markdown
259 lines
12 KiB
Markdown
# ADR-051: Physics-Informed Graph Transformer Layers
|
|
|
|
## Status
|
|
|
|
Accepted
|
|
|
|
## Date
|
|
|
|
2026-02-25
|
|
|
|
## Context
|
|
|
|
Many real-world graphs -- molecular dynamics simulations, particle physics detectors, protein interaction networks, climate meshes -- obey physical conservation laws, symmetries, and variational principles. Standard graph transformers learn representations from data alone, ignoring these inductive biases. This wastes training data (100x more samples required to implicitly learn energy conservation) and produces physically inconsistent predictions that diverge after a few integration steps.
|
|
|
|
RuVector already provides the building blocks for physics-informed graph transformers across several crates:
|
|
|
|
- `ruvector-mincut-gated-transformer/src/energy_gate.rs`: `EnergyGate`, `EnergyGateConfig` for energy-based gating decisions
|
|
- `ruvector-attention/src/sheaf/restriction.rs`: `RestrictionMap` for parallel transport (gauge connections on graph fiber bundles)
|
|
- `ruvector-attention/src/sheaf/attention.rs`: `SheafAttention`, `SheafAttentionConfig` for sheaf cohomology attention
|
|
- `ruvector-attention/src/transport/sliced_wasserstein.rs`: `SlicedWassersteinAttention` for optimal transport on graphs
|
|
- `ruvector-attention/src/pde_attention/diffusion.rs`: `DiffusionAttention` for heat/diffusion equation on graphs
|
|
- `ruvector-attention/src/pde_attention/laplacian.rs`: graph Laplacian operators for PDE discretization
|
|
- `ruvector-attention/src/curvature/fused_attention.rs`: `MixedCurvatureFusedAttention` for Ricci flow
|
|
- `ruvector-verified/src/gated.rs`: `ProofTier`, `route_proof`, `verify_tiered` for proof-gated verification
|
|
|
|
However, there is no unified module that composes these into physics-informed graph transformer layers with formally verified conservation laws. The research document `docs/research/gnn-v2/22-physics-informed-graph-transformers.md` outlines the theoretical framework but defines no implementation path through the existing crates.
|
|
|
|
## Decision
|
|
|
|
We will implement a `physics` module in `ruvector-graph-transformer` behind the `physics` feature flag. The module provides three layer types -- `HamiltonianGraphNet`, `LagrangianAttention`, and `GaugeEquivariantMP` -- each integrated with the proof-gated mutation protocol (ADR-047) to certify conservation laws per forward step.
|
|
|
|
### HamiltonianGraphNet
|
|
|
|
Symplectic leapfrog integration that PROVES energy is conserved, not just checks post-hoc:
|
|
|
|
```rust
|
|
/// Hamiltonian graph network with symplectic integration.
|
|
///
|
|
/// Each forward step produces a ProofGate<HamiltonianOutput> whose
|
|
/// proof requirement is energy conservation within tolerance.
|
|
pub struct HamiltonianGraphNet {
|
|
/// Learned kinetic energy: T(p) via MLP.
|
|
kinetic_net: MLP,
|
|
/// Learned potential energy: V(q) + sum_{(i,j)} U(q_i, q_j).
|
|
potential_net: GraphAttentionPotential,
|
|
/// Integration timestep (fixed or learned).
|
|
dt: f32,
|
|
/// Leapfrog steps per layer.
|
|
num_steps: usize,
|
|
/// Energy tolerance for proof gate (relative |dE/E|).
|
|
energy_tolerance: f64,
|
|
/// Bridges to ruvector-mincut-gated-transformer::energy_gate.
|
|
energy_gate: EnergyGateConfig,
|
|
}
|
|
|
|
impl HamiltonianGraphNet {
|
|
/// Symplectic forward pass with energy conservation proof.
|
|
///
|
|
/// Executes Stormer-Verlet leapfrog integration on the graph.
|
|
/// After integration, computes |H_final - H_initial| / |H_initial|
|
|
/// and routes through ProofTier::Reflex (< 10 ns) since this is
|
|
/// a scalar comparison. If drift exceeds tolerance, escalates to
|
|
/// ProofTier::Standard for diagnosis.
|
|
pub fn forward(
|
|
&self,
|
|
positions: &mut [f32], // [n x d] node positions (q)
|
|
momenta: &mut [f32], // [n x d] node momenta (p)
|
|
graph: &impl GraphRepr,
|
|
env: &mut ProofEnvironment,
|
|
) -> Result<ProofGate<HamiltonianOutput>>;
|
|
|
|
/// Compute Hamiltonian H(q, p) = T(p) + V(q) + sum U(q_i, q_j).
|
|
pub fn hamiltonian(
|
|
&self,
|
|
positions: &[f32],
|
|
momenta: &[f32],
|
|
graph: &impl GraphRepr,
|
|
) -> f32;
|
|
}
|
|
```
|
|
|
|
The proof requirement for each step is:
|
|
|
|
```rust
|
|
ProofRequirement::InvariantPreserved {
|
|
invariant_id: ENERGY_CONSERVATION_INVARIANT,
|
|
}
|
|
```
|
|
|
|
This maps to `ProofKind::DimensionEquality` (scalar comparison of energy values) and routes to `ProofTier::Reflex` in steady state, keeping overhead below 10 ns per step.
|
|
|
|
### GaugeEquivariantMP
|
|
|
|
Uses sheaf restriction maps as gauge connections:
|
|
|
|
```rust
|
|
/// Gauge-equivariant message passing using sheaf attention.
|
|
///
|
|
/// Restriction maps from ruvector-attention::sheaf serve as connection
|
|
/// forms (parallel transport operators) on the graph fiber bundle.
|
|
/// Attention weights are invariant under gauge transformations g_i at
|
|
/// each node because keys are parallel-transported to the query frame
|
|
/// before the dot product: alpha_{ij} = softmax(q_i^T A_{ij} k_j).
|
|
pub struct GaugeEquivariantMP {
|
|
/// Sheaf attention (restriction maps = gauge connections).
|
|
sheaf_attention: SheafAttention,
|
|
/// Gauge group dimension.
|
|
gauge_dim: usize,
|
|
/// Yang-Mills regularization strength.
|
|
ym_lambda: f32,
|
|
/// Proof requirement: gauge invariance check.
|
|
gauge_proof: ProofRequirement,
|
|
}
|
|
|
|
impl GaugeEquivariantMP {
|
|
/// Gauge-invariant attention forward pass.
|
|
///
|
|
/// Parallel-transports keys via RestrictionMap before dot product.
|
|
/// Computes Yang-Mills action as regularization loss.
|
|
pub fn forward(
|
|
&self,
|
|
queries: &[f32],
|
|
keys: &[f32],
|
|
values: &[f32],
|
|
graph: &impl GraphRepr,
|
|
env: &mut ProofEnvironment,
|
|
) -> Result<ProofGate<AttentionOutput>>;
|
|
|
|
/// Yang-Mills action: S_YM = sum_{plaquettes} ||F_{ijk}||^2.
|
|
/// Measures curvature (field strength) of the gauge connection.
|
|
pub fn yang_mills_action(&self, graph: &impl GraphRepr) -> f32;
|
|
}
|
|
```
|
|
|
|
### LagrangianAttention
|
|
|
|
Action-minimizing message passing via optimal transport:
|
|
|
|
```rust
|
|
/// Lagrangian attention using action-weighted optimal transport.
|
|
///
|
|
/// The attention weight between nodes i and j is proportional to
|
|
/// exp(-beta * W_2(mu_i, mu_j)), where W_2 is Wasserstein-2 distance.
|
|
/// This is the information-geometric dual of kinetic energy in
|
|
/// Wasserstein space: L = (1/2) ||d mu/dt||^2_{W_2}.
|
|
///
|
|
/// Delegates to ruvector-attention::transport::SlicedWassersteinAttention
|
|
/// for the transport computation and wraps in proof gate for
|
|
/// action bound verification.
|
|
pub struct LagrangianAttention {
|
|
/// Sliced Wasserstein transport from ruvector-attention.
|
|
transport: SlicedWassersteinAttention,
|
|
/// Inverse temperature for action weighting.
|
|
beta: f32,
|
|
/// Variational integrator timestep.
|
|
dt: f32,
|
|
/// Action bound proof requirement.
|
|
action_proof: ProofRequirement,
|
|
}
|
|
|
|
impl LagrangianAttention {
|
|
/// Variational forward pass.
|
|
///
|
|
/// Computes discrete Euler-Lagrange equations on the graph.
|
|
/// Action bound is verified via ProofTier::Standard (bounded
|
|
/// fuel for action functional evaluation).
|
|
pub fn forward(
|
|
&self,
|
|
q_prev: &[f32],
|
|
q_curr: &[f32],
|
|
graph: &impl GraphRepr,
|
|
env: &mut ProofEnvironment,
|
|
) -> Result<ProofGate<LagrangianOutput>>;
|
|
}
|
|
```
|
|
|
|
### PDE Attention Integration
|
|
|
|
The existing `ruvector-attention/src/pde_attention/diffusion.rs` provides diffusion on graphs. The `physics` module wraps this with conservation proofs:
|
|
|
|
```rust
|
|
/// PDE attention with mass conservation proof.
|
|
///
|
|
/// Bridges to ruvector_attention::pde_attention::DiffusionAttention.
|
|
/// After each diffusion step, proves total mass is conserved:
|
|
/// sum_i h_i(t+dt) == sum_i h_i(t) within tolerance.
|
|
pub struct ConservativePdeAttention {
|
|
diffusion: DiffusionAttention,
|
|
mass_tolerance: f64,
|
|
}
|
|
```
|
|
|
|
### Feature Flag
|
|
|
|
```toml
|
|
# In crates/ruvector-graph-transformer/Cargo.toml
|
|
[features]
|
|
physics = [
|
|
"ruvector-mincut-gated-transformer/energy_gate",
|
|
"ruvector-attention/pde_attention",
|
|
"ruvector-attention/sheaf",
|
|
"ruvector-attention/transport",
|
|
]
|
|
```
|
|
|
|
## Consequences
|
|
|
|
### Positive
|
|
|
|
- Energy conservation is guaranteed by construction via symplectic integration and formally verified per step
|
|
- Gauge invariance from sheaf attention ensures predictions are coordinate-independent
|
|
- PDE attention with mass conservation proof prevents unphysical feature drift
|
|
- Physics priors reduce required training data by encoding known laws, with estimated 100x improvement for molecular dynamics tasks
|
|
- All layers compose with the proof-gated mutation protocol (ADR-047), producing auditable attestation chains
|
|
|
|
### Negative
|
|
|
|
- Leapfrog integration adds O(num_steps) overhead per layer compared to a standard residual connection
|
|
- Yang-Mills regularization requires computing holonomies around plaquettes (small graph cycles), which is O(triangles) per forward pass
|
|
- `LagrangianAttention` requires Newton iteration to solve the implicit discrete Euler-Lagrange equation (5 iterations by default)
|
|
- Users must supply phase-space representations (q, p) rather than generic node features
|
|
|
|
### Risks
|
|
|
|
- If energy tolerance is set too tight, Reflex-tier proofs will fail and escalate to Standard/Deep, exceeding the 2% overhead budget (ADR-047). Mitigation: default tolerance of 1e-4 relative drift, which is achievable with double-precision leapfrog
|
|
- Sheaf restriction maps as gauge connections assume orthogonal gauge group. Extending to non-abelian groups (SU(2), SU(3)) requires operator ordering care and is deferred to a follow-up ADR
|
|
- Noether symmetry mining (automatic conservation law discovery) is not included in this ADR due to training cost; it is an extension for ADR-049's verified training pipeline
|
|
|
|
## Implementation
|
|
|
|
1. Create `crates/ruvector-graph-transformer/src/physics/mod.rs` re-exporting all layer types
|
|
2. Implement `HamiltonianGraphNet` in `physics/hamiltonian.rs`, bridging to `ruvector-mincut-gated-transformer::energy_gate`
|
|
3. Implement `GaugeEquivariantMP` in `physics/gauge.rs`, bridging to `ruvector-attention::sheaf::{SheafAttention, RestrictionMap}`
|
|
4. Implement `LagrangianAttention` in `physics/lagrangian.rs`, bridging to `ruvector-attention::transport::SlicedWassersteinAttention`
|
|
5. Implement `ConservativePdeAttention` in `physics/pde.rs`, bridging to `ruvector-attention::pde_attention::DiffusionAttention`
|
|
6. Add benchmark: `benches/physics_bench.rs` measuring energy drift over 10,000 leapfrog steps on a 1,000-node molecular graph
|
|
7. Integration test: compose `HamiltonianGraphNet` + `GaugeEquivariantMP` in a full forward pass, verify attestation chain integrity
|
|
8. Verify build: `cargo test --features physics -p ruvector-graph-transformer`
|
|
|
|
## References
|
|
|
|
- ADR-046: Graph Transformer Unified Architecture (module structure, `AttentionRegistry`)
|
|
- ADR-047: Proof-Gated Mutation Protocol (`ProofGate<T>`, `ProofRequirement`, three-tier routing)
|
|
- ADR-049: Verified Training Pipeline (conservation law invariants during training)
|
|
- Research: `docs/research/gnn-v2/22-physics-informed-graph-transformers.md`
|
|
- `crates/ruvector-mincut-gated-transformer/src/energy_gate.rs`: `EnergyGate`, `EnergyGateConfig`
|
|
- `crates/ruvector-attention/src/sheaf/restriction.rs`: `RestrictionMap`
|
|
- `crates/ruvector-attention/src/sheaf/attention.rs`: `SheafAttention`, `SheafAttentionConfig`
|
|
- `crates/ruvector-attention/src/transport/sliced_wasserstein.rs`: `SlicedWassersteinAttention`
|
|
- `crates/ruvector-attention/src/pde_attention/diffusion.rs`: `DiffusionAttention`
|
|
- `crates/ruvector-attention/src/pde_attention/laplacian.rs`: graph Laplacian
|
|
- `crates/ruvector-attention/src/curvature/fused_attention.rs`: `MixedCurvatureFusedAttention`
|
|
- `crates/ruvector-verified/src/gated.rs`: `ProofTier`, `route_proof`, `verify_tiered`
|
|
- `crates/ruvector-verified/src/proof_store.rs`: `ProofAttestation`, 82-byte witnesses
|
|
- Greydanus et al., "Hamiltonian Neural Networks" (arXiv:1906.01563, 2019)
|
|
- Cranmer et al., "Lagrangian Neural Networks" (arXiv:2003.04630, 2020)
|
|
- Cohen et al., "Gauge Equivariant Convolutional Networks" (arXiv:1902.04615, 2019)
|
|
- Hansen & Gebhart, "Sheaf Neural Networks" (arXiv:2012.06333, 2020)
|