Squashed 'vendor/ruvector/' content from commit b64c2172

git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00
commit d803bfe2b1
7854 changed files with 3522914 additions and 0 deletions
--- a/docs/research/gnn-v2/25-self-organizing-graph-transformers.md
+++ b/docs/research/gnn-v2/25-self-organizing-graph-transformers.md
@@ -0,0 +1,947 @@
+# Feature 25: Self-Organizing Graph Transformers
+
+## Overview
+
+### Problem Statement
+
+Current graph transformers operate on fixed, manually designed topologies. The graph structure is either given as input (e.g., molecule graphs, social networks) or constructed once via nearest-neighbor heuristics (e.g., HNSW). In either case, the topology is static during inference and training: it does not grow, differentiate, or reorganize in response to the data distribution. This rigidity creates three fundamental bottlenecks:
+
+1. **Topology-data mismatch**: A graph constructed for one data distribution becomes suboptimal as the distribution shifts.
+2. **No specialization**: Every node and edge in the graph plays the same generic role -- there is no mechanism for nodes to develop distinct functional identities.
+3. **No self-repair**: When parts of the graph become corrupted or irrelevant, there is no process for replacing or regenerating damaged regions.
+
+Biology solved these problems billions of years ago. Morphogenesis builds complex structures from simple rules. Embryonic development differentiates a single cell into hundreds of specialized types. Autopoiesis maintains living systems by continuously rebuilding their own components. These principles have been largely ignored in graph neural network design.
+
+### Proposed Solution
+
+Self-Organizing Graph Transformers (SOGTs) are graph attention networks that grow, differentiate, and maintain their own topology through biologically inspired developmental programs. The approach has three pillars:
+
+1. **Morphogenetic Graph Networks**: Turing pattern formation on graphs drives reaction-diffusion attention, creating spatially structured activation patterns that guide message passing and edge formation.
+2. **Developmental Graph Programs**: Graph grammars encode growth rules as L-system productions. Generic seed nodes differentiate into specialized types (hub nodes, boundary nodes, relay nodes) through a developmental program conditioned on local graph statistics.
+3. **Autopoietic Graph Transformers**: The network continuously rebuilds its own topology -- pruning dead edges, spawning new nodes, and adjusting attention weights -- to maintain a target coherence level, analogous to homeostasis in living systems.
+
+### Expected Benefits
+
+- **Adaptive Topology**: 30-50% improvement in retrieval quality on distribution-shifting workloads
+- **Self-Specialization**: Nodes develop distinct roles (hub, boundary, relay), reducing routing overhead by 40-60%
+- **Self-Repair**: Automatic recovery from node/edge corruption with <5% transient degradation
+- **Architecture Search**: Morphogenetic NAS discovers attention patterns 10x faster than random search
+- **Emergent Computation**: Local attention rules give rise to global computational patterns (sorting, clustering, routing)
+
+### Novelty Claim
+
+**Unique Contribution**: First graph transformer architecture that grows its own topology through morphogenetic, developmental, and autopoietic processes. Unlike neural architecture search (which searches within a fixed space of candidate architectures), SOGTs develop continuously through biologically grounded growth rules that operate at runtime.
+
+**Differentiators**:
+1. Reaction-diffusion attention creates Turing patterns on graphs for structured activation
+2. L-system graph grammars encode developmental programs for node specialization
+3. Autopoietic maintenance loop continuously rebuilds topology to maintain coherence
+4. Cellular automata attention rules produce emergent global computation from local rules
+5. Morphogenetic NAS discovers novel attention architectures through growth processes
+
+---
+
+## Biological Foundations
+
+### Morphogenesis and Turing Patterns
+
+Alan Turing's 1952 paper "The Chemical Basis of Morphogenesis" showed that two interacting chemicals with different diffusion rates (in later terminology, an activator and an inhibitor) can spontaneously form stable spatial patterns such as spots and stripes. Reaction-diffusion systems of this kind are widely used to model leopard spots, zebrafish stripes, and fingerprint ridges.
+
+On a graph, the Turing instability generalizes naturally. Each node holds concentrations of an activator `a` and inhibitor `h`. The dynamics follow the graph Laplacian:
+
+```
+da/dt = f(a, h) + D_a * L * a
+dh/dt = g(a, h) + D_h * L * h
+```
+
+where `L` is the graph Laplacian in the diffusion sign convention, `(L a)_i = sum_j w_ij (a_j - a_i)` (adjacency minus degree matrix), `D_h >> D_a` (the inhibitor diffuses faster), and `f`, `g` encode local reaction kinetics. The key insight is that **Turing patterns on graphs create natural attention masks**: regions of high activator concentration attend to each other, while inhibitor barriers create boundaries between attention clusters.
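+
+As a sketch of why the diffusion ratio matters, linearizing around a uniform steady state `(a*, h*)` gives the standard conditions for a Turing instability. The notation below is introduced here for illustration and is not part of the original design: `f_a`, `f_h`, `g_a`, `g_h` are the partial derivatives of `f` and `g` at the steady state, and `lambda_k <= 0` are the eigenvalues of `L` in the convention `(L a)_i = sum_j w_ij (a_j - a_i)`.
+
+```
+Stable without diffusion:
+  f_a + g_h < 0
+  f_a * g_h - f_h * g_a > 0
+
+Mode k grows when:
+  (f_a + D_a * lambda_k) * (g_h + D_h * lambda_k) - f_h * g_a < 0
+
+Necessary for any lambda < 0 to satisfy the line above:
+  D_h * f_a + D_a * g_h > 2 * sqrt(D_a * D_h * (f_a * g_h - f_h * g_a))
+```
+
+Because a finite graph has only finitely many eigenvalues `lambda_k`, the last inequality is necessary but not sufficient: a pattern forms only if at least one `lambda_k` actually falls inside the unstable band.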
+
+### Embryonic Development and Differentiation
+
+A single fertilized cell becomes a human body with 200+ cell types through a developmental program. Key principles:
+
+- **Positional information**: Cells read chemical gradients to determine their position and fate.
+- **Inductive signaling**: Cells signal neighbors to change type.
+- **Competence windows**: Cells can only respond to certain signals during specific developmental stages.
+- **Canalization**: Development is robust to perturbations -- the same endpoint is reached from varied starting conditions.
+
+For graph transformers, these principles translate to: nodes read local graph statistics (degree, centrality, neighborhood composition) to determine their functional role; they signal neighbors through message passing to coordinate specialization; and developmental stages gate which transformations are available at each growth step.
+
+### Autopoiesis and Self-Maintenance
+
+Autopoiesis (Maturana and Varela, 1972) describes systems that continuously produce and replace their own components. A living cell is autopoietic: it synthesizes the membrane that bounds it, the enzymes that catalyze reactions, and the DNA that encodes those enzymes. The system maintains itself through circular causality.
+
+For graph transformers, autopoiesis means: the attention mechanism produces the topology that shapes the attention mechanism. Dead edges are pruned. Overloaded nodes are split. Missing connections are grown. The graph maintains a target coherence level (measurable via `ruvector-coherence`) through continuous self-modification.
+
+---
+
+## Technical Design
+
+### Architecture Diagram
+
+```
+                      Data Distribution
+                             |
+                    +--------v--------+
+                    |   Seed Graph    |
+                    |  (initial K     |
+                    |   nodes)        |
+                    +--------+--------+
+                             |
+              +--------------+--------------+
+              |              |              |
+     +--------v-------+ +---v----+ +-------v--------+
+     | Morphogenetic  | | Devel- | | Autopoietic    |
+     | Field Engine   | | opment | | Maintenance    |
+     |                | | Program| | Loop           |
+     | Turing pattern | | L-sys  | | Coherence-     |
+     | on graph       | | grammar| | gated rebuild  |
+     +--------+-------+ +---+----+ +-------+--------+
+              |              |              |
+              +------+-------+------+-------+
+                     |              |
+              +------v------+ +----v-------+
+              | Topology    | | Node Type  |
+              | Growth      | | Specialize |
+              | (new edges/ | | (hub/relay/|
+              |  nodes)     | |  boundary) |
+              +------+------+ +----+-------+
+                     |              |
+                     +------+-------+
+                            |
+                   +--------v--------+
+                   | Self-Organizing |
+                   | Graph Attention |
+                   | Layer           |
+                   +--------+--------+
+                            |
+                   +--------v--------+
+                   | Query / Embed   |
+                   | / Route         |
+                   +-----------------+
+
+
+Morphogenetic Field Detail:
+
+  Node Activator (a)     Node Inhibitor (h)
+  +---+---+---+---+     +---+---+---+---+
+  |0.9|0.1|0.8|0.2|     |0.1|0.8|0.2|0.9|
+  +---+---+---+---+     +---+---+---+---+
+  |0.2|0.7|0.1|0.9|     |0.7|0.2|0.8|0.1|
+  +---+---+---+---+     +---+---+---+---+
+
+  Attention Mask = 1[a > threshold]  (hard gate)
+  High-a nodes form attention clusters
+  High-h boundaries separate clusters
+```
+
+### Core Data Structures
+
+```rust
+/// Configuration for Self-Organizing Graph Transformer
+#[derive(Debug, Clone)]
+pub struct SelfOrganizingConfig {
+    /// Initial seed graph size
+    pub seed_nodes: usize,
+
+    /// Maximum graph size (growth limit)
+    pub max_nodes: usize,
+
+    /// Embedding dimension
+    pub embed_dim: usize,
+
+    /// Morphogenetic field parameters
+    pub morpho: MorphogeneticConfig,
+
+    /// Developmental program parameters
+    pub development: DevelopmentalConfig,
+
+    /// Autopoietic maintenance parameters
+    pub autopoiesis: AutopoieticConfig,
+
+    /// Growth phase schedule
+    pub phases: Vec<GrowthPhase>,
+}
+
+/// Morphogenetic field configuration (Turing patterns on graphs)
+#[derive(Debug, Clone)]
+pub struct MorphogeneticConfig {
+    /// Activator diffusion rate
+    pub d_activator: f32,
+
+    /// Inhibitor diffusion rate (must be > d_activator)
+    pub d_inhibitor: f32,
+
+    /// Reaction kinetics: activator self-enhancement rate
+    pub rho_a: f32,
+
+    /// Reaction kinetics: inhibitor production rate
+    pub rho_h: f32,
+
+    /// Activator decay rate
+    pub mu_a: f32,
+
+    /// Inhibitor decay rate
+    pub mu_h: f32,
+
+    /// Number of reaction-diffusion steps per forward pass
+    pub rd_steps: usize,
+
+    /// Threshold for activator-based attention gating
+    pub attention_threshold: f32,
+}
+
+impl Default for MorphogeneticConfig {
+    fn default() -> Self {
+        Self {
+            d_activator: 0.01,
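+            // 10x faster diffusion. For the kinetics in `step`, linear stability
+            // analysis needs roughly d_inhibitor / d_activator > 5.83 * mu_h / mu_a
+            // for patterns to form, so this ratio may need tuning.
+            d_inhibitor: 0.1,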
+            rho_a: 0.08,
+            rho_h: 0.12,
+            mu_a: 0.03,
+            mu_h: 0.06,
+            rd_steps: 10,
+            attention_threshold: 0.5,
+        }
+    }
+}
+
+/// Node functional types arising from developmental specialization
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
+pub enum NodeType {
+    /// Undifferentiated seed node
+    Stem,
+    /// High-degree hub node (routes between clusters)
+    Hub,
+    /// Cluster boundary node (separates attention groups)
+    Boundary,
+    /// Internal relay node (local message passing)
+    Relay,
+    /// Sensory node (interfaces with external data)
+    Sensory,
+    /// Memory node (long-term information storage)
+    Memory,
+}
+
+/// Developmental program configuration
+#[derive(Debug, Clone)]
+pub struct DevelopmentalConfig {
+    /// L-system axiom (initial production)
+    pub axiom: Vec<NodeType>,
+
+    /// Production rules: (predecessor, condition, successor pattern)
+    pub rules: Vec<ProductionRule>,
+
+    /// Maximum developmental steps
+    pub max_steps: usize,
+
+    /// Competence window per rule: inclusive (min_step, max_step), indexed in parallel with `rules`
+    pub competence_windows: Vec<(usize, usize)>,
+}
+
+/// A production rule in the developmental graph grammar
+#[derive(Debug, Clone)]
+pub struct ProductionRule {
+    /// Node type that this rule applies to
+    pub predecessor: NodeType,
+
+    /// Condition: local graph statistic threshold
+    pub condition: DevelopmentalCondition,
+
+    /// Successor: the first element is the node's new type; remaining elements are spawned as new nodes
+    pub successor: Vec<NodeType>,
+
+    /// Edge pattern for newly created nodes
+    pub edge_pattern: EdgePattern,
+}
+
+/// Conditions for developmental rule activation
+#[derive(Debug, Clone)]
+pub enum DevelopmentalCondition {
+    /// Node degree exceeds threshold
+    DegreeAbove(usize),
+    /// Node centrality exceeds threshold (approximated by normalized degree in `check_condition`)
+    CentralityAbove(f32),
+    /// Activator concentration exceeds threshold
+    ActivatorAbove(f32),
+    /// Inhibitor concentration exceeds threshold
+    InhibitorAbove(f32),
+    /// Neighbor composition: fraction of type T exceeds threshold
+    NeighborFraction(NodeType, f32),
+    /// Always applies
+    Always,
+}
+
+/// Edge creation patterns for developmental rules
+#[derive(Debug, Clone)]
+pub enum EdgePattern {
+    /// Connect to parent only
+    ParentOnly,
+    /// Connect to parent and all parent neighbors
+    InheritNeighborhood,
+    /// Connect to k nearest nodes by embedding distance
+    KNearest(usize),
+    /// Connect to nodes with matching activator pattern
+    MorphogeneticAffinity,
+}
+
+/// Autopoietic maintenance configuration
+#[derive(Debug, Clone)]
+pub struct AutopoieticConfig {
+    /// Target coherence level (from ruvector-coherence)
+    pub target_coherence: f32,
+
+    /// Coherence tolerance band (maintain within +/- tolerance)
+    pub coherence_tolerance: f32,
+
+    /// Edge pruning threshold: remove edges with attention < threshold
+    pub prune_threshold: f32,
+
+    /// Node splitting threshold: split nodes with degree > threshold
+    pub split_degree_threshold: usize,
+
+    /// Edge growth rate: max new edges per maintenance cycle
+    pub max_new_edges_per_cycle: usize,
+
+    /// Maintenance cycle interval (every N forward passes)
+    pub cycle_interval: usize,
+}
+
+/// Growth phase in the developmental schedule
+#[derive(Debug, Clone)]
+pub struct GrowthPhase {
+    /// Phase name
+    pub name: String,
+
+    /// Duration in forward passes
+    pub duration: usize,
+
+    /// Which subsystems are active
+    pub morpho_active: bool,
+    pub development_active: bool,
+    pub autopoiesis_active: bool,
+
+    /// Growth rate multiplier
+    pub growth_rate: f32,
+}
+```
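+
+The `phases` schedule is declared above, but how the active phase is selected is not shown. A minimal sketch, assuming phases run back to back in list order and that a forward-pass counter picks the phase; `Phase` and `active_phase` are hypothetical names, and `Phase` is a trimmed stand-in for `GrowthPhase`:
+
+```rust
+/// Trimmed stand-in for `GrowthPhase` (hypothetical; only the fields used here).
+#[derive(Debug)]
+struct Phase {
+    name: &'static str,
+    /// Duration in forward passes
+    duration: usize,
+}
+
+/// Return the phase that contains forward pass `pass` (0-based),
+/// or None once the schedule is exhausted.
+fn active_phase(phases: &[Phase], pass: usize) -> Option<&Phase> {
+    let mut end = 0usize;
+    for phase in phases {
+        // `end` is the first pass index that no longer belongs to this phase.
+        end += phase.duration;
+        if pass < end {
+            return Some(phase);
+        }
+    }
+    None
+}
+```
+
+What should happen after the last phase ends (hold the final phase, or run autopoiesis only) is left open here; returning `None` simply makes that decision explicit for the caller.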
+
+### Key Algorithms
+
+#### 1. Morphogenetic Field Update (Reaction-Diffusion on Graph)
+
+```rust
+/// Morphogenetic field state for the graph
+pub struct MorphogeneticField {
+    /// Activator concentration per node
+    activator: Vec<f32>,
+    /// Inhibitor concentration per node
+    inhibitor: Vec<f32>,
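+    /// Graph Laplacian stored as a weighted directed edge list (src, dst, weight);
+    /// each undirected edge must appear once per direction
+    laplacian: Vec<(usize, usize, f32)>,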
+    /// Configuration
+    config: MorphogeneticConfig,
+}
+
+impl MorphogeneticField {
+    /// Run one explicit-Euler step of reaction-diffusion on the graph.
+    ///
+    /// Uses a simplified Gierer-Meinhardt model, with
+    /// (L a)_i = sum_j w_ij * (a_j - a_i):
+    ///   da/dt = rho_a * (a^2 / h) - mu_a * a + D_a * L * a
+    ///   dh/dt = rho_h * a^2 - mu_h * h + D_h * L * h
+    fn step(&mut self, dt: f32) {
+        let n = self.activator.len();
+        let mut da = vec![0.0f32; n];
+        let mut dh = vec![0.0f32; n];
+
+        // Reaction kinetics (Gierer-Meinhardt)
+        for i in 0..n {
+            let a = self.activator[i];
+            let h = self.inhibitor[i].max(1e-6); // avoid division by zero
+            da[i] += self.config.rho_a * (a * a / h) - self.config.mu_a * a;
+            dh[i] += self.config.rho_h * a * a - self.config.mu_h * h;
+        }
+
+        // Diffusion via graph Laplacian: each (src, dst, weight) entry adds
+        // weight * (x[dst] - x[src]) to src only
+        for &(src, dst, weight) in &self.laplacian {
+            let diff_a = self.activator[dst] - self.activator[src];
+            let diff_h = self.inhibitor[dst] - self.inhibitor[src];
+            da[src] += self.config.d_activator * weight * diff_a;
+            dh[src] += self.config.d_inhibitor * weight * diff_h;
+        }
+
+        // Euler integration
+        for i in 0..n {
+            self.activator[i] = (self.activator[i] + dt * da[i]).max(0.0);
+            self.inhibitor[i] = (self.inhibitor[i] + dt * dh[i]).max(0.0);
+        }
+    }
+
+    /// Compute attention mask from activator field.
+    /// Nodes with activator above threshold attend to each other.
+    fn attention_mask(&self) -> Vec<bool> {
+        self.activator.iter()
+            .map(|&a| a > self.config.attention_threshold)
+            .collect()
+    }
+
+    /// Compute morphogenetic affinity between two nodes.
+    /// Nodes with similar activator/inhibitor ratios have high affinity.
+    fn affinity(&self, i: usize, j: usize) -> f32 {
+        let ratio_i = self.activator[i] / self.inhibitor[i].max(1e-6);
+        let ratio_j = self.activator[j] / self.inhibitor[j].max(1e-6);
+        let diff = (ratio_i - ratio_j).abs();
+        (-diff * diff).exp() // Gaussian affinity
+    }
+}
+```
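+
+Whether the field above actually forms patterns depends on the ratio of the two diffusion rates. The following is a minimal sketch of a parameter check, derived by linearizing the kinetics in `step` around their uniform steady state; `turing_unstable` is a hypothetical helper, not part of the original design, and it tests only the continuum condition (a finite graph additionally needs a Laplacian eigenvalue inside the unstable band):
+
+```rust
+/// Hypothetical helper: necessary condition for a Turing instability with
+/// the simplified Gierer-Meinhardt kinetics
+///   da/dt = rho_a * a^2 / h - mu_a * a,   dh/dt = rho_h * a^2 - mu_h * h.
+///
+/// At the uniform steady state the Jacobian entries are
+///   f_a = mu_a, f_h = -mu_a^2 / rho_a, g_a = 2 * rho_a * mu_h / mu_a, g_h = -mu_h,
+/// so det(J) = mu_a * mu_h and the rho terms drop out of the condition.
+fn turing_unstable(d_a: f32, d_h: f32, mu_a: f32, mu_h: f32) -> bool {
+    // Stability without diffusion requires trace(J) = mu_a - mu_h < 0.
+    if mu_h <= mu_a {
+        return false;
+    }
+    // Band of unstable modes exists only if
+    //   D_h * f_a + D_a * g_h > 2 * sqrt(D_a * D_h * det(J)).
+    let lhs = d_h * mu_a - d_a * mu_h;
+    let rhs = 2.0 * (d_a * d_h * mu_a * mu_h).sqrt();
+    lhs > rhs
+}
+```
+
+Called with the default `MorphogeneticConfig` values, `turing_unstable(0.01, 0.1, 0.03, 0.06)` returns `false`, while raising the inhibitor rate to `0.2` returns `true`. The condition is equivalent to `d_h / d_a > (3 + 2 * sqrt(2)) * mu_h / mu_a`.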
+
+#### 2. Developmental Program (L-System Graph Grammar)
+
+```rust
+/// Developmental program executor
+pub struct DevelopmentalProgram {
+    /// Current developmental step
+    step: usize,
+    /// Production rules
+    rules: Vec<ProductionRule>,
+    /// Competence windows per rule
+    competence: Vec<(usize, usize)>,
+    /// Node type assignments
+    node_types: Vec<NodeType>,
+    /// Graph adjacency (mutable during development)
+    adjacency: Vec<Vec<usize>>,
+    /// Node embeddings
+    embeddings: Vec<Vec<f32>>,
+}
+
+impl DevelopmentalProgram {
+    /// Execute one developmental step.
+    ///
+    /// For each node, check if any production rule applies:
+    /// 1. The node type matches the rule predecessor
+    /// 2. The condition is satisfied
+    /// 3. The current step is within the competence window
+    ///
+    /// If so, apply the rule: change node type and/or spawn new nodes.
+    fn develop_step(
+        &mut self,
+        field: &MorphogeneticField,
+        max_nodes: usize,
+    ) -> Vec<DevelopmentalEvent> {
+        let mut events = Vec::new();
+        let current_n = self.node_types.len();
+
+        // Collect applicable rules (avoid borrow conflicts)
+        let mut applications: Vec<(usize, usize)> = Vec::new(); // (node_idx, rule_idx)
+
+        for node_idx in 0..current_n {
+            for (rule_idx, rule) in self.rules.iter().enumerate() {
+                // Check competence window
+                let (min_step, max_step) = self.competence[rule_idx];
+                if self.step < min_step || self.step > max_step {
+                    continue;
+                }
+
+                // Check predecessor type
+                if self.node_types[node_idx] != rule.predecessor {
+                    continue;
+                }
+
+                // Check condition
+                if self.check_condition(node_idx, &rule.condition, field) {
+                    applications.push((node_idx, rule_idx));
+                    break; // one rule per node per step
+                }
+            }
+        }
+
+        // Apply rules
+        for (node_idx, rule_idx) in applications {
+            if self.node_types.len() >= max_nodes {
+                break;
+            }
+
+            let rule = &self.rules[rule_idx];
+
+            // First element of successor replaces the node's type
+            if let Some(&new_type) = rule.successor.first() {
+                let old_type = self.node_types[node_idx];
+                self.node_types[node_idx] = new_type;
+                events.push(DevelopmentalEvent::Differentiate {
+                    node: node_idx,
+                    from: old_type,
+                    to: new_type,
+                });
+            }
+
+            // Remaining elements spawn new nodes
+            for &spawn_type in rule.successor.iter().skip(1) {
+                let new_idx = self.node_types.len();
+                if new_idx >= max_nodes { break; }
+
+                self.node_types.push(spawn_type);
+
+                // Create embedding as perturbation of parent
+                let parent_emb = self.embeddings[node_idx].clone();
+                let new_emb = perturb_embedding(&parent_emb, 0.01);
+                self.embeddings.push(new_emb);
+
+                // Create edges based on pattern
+                let new_edges = match &rule.edge_pattern {
+                    EdgePattern::ParentOnly => vec![node_idx],
+                    EdgePattern::InheritNeighborhood => {
+                        let mut edges = vec![node_idx];
+                        edges.extend_from_slice(&self.adjacency[node_idx]);
+                        edges
+                    }
+                    EdgePattern::KNearest(k) => {
+                        self.k_nearest(new_idx, *k)
+                    }
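+                    EdgePattern::MorphogeneticAffinity => {
+                        // The field has no entry for the new node yet, so match
+                        // against the parent's activator/inhibitor pattern.
+                        self.morpho_nearest(node_idx, field, 4)
+                    }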
+                };
+
+                self.adjacency.push(new_edges.clone());
+                for &neighbor in &new_edges {
+                    if neighbor < self.adjacency.len() {
+                        self.adjacency[neighbor].push(new_idx);
+                    }
+                }
+
+                events.push(DevelopmentalEvent::Spawn {
+                    parent: node_idx,
+                    child: new_idx,
+                    child_type: spawn_type,
+                });
+            }
+        }
+
+        self.step += 1;
+        events
+    }
+
+    /// Check whether a developmental condition is satisfied for a node.
+    fn check_condition(
+        &self,
+        node_idx: usize,
+        condition: &DevelopmentalCondition,
+        field: &MorphogeneticField,
+    ) -> bool {
+        match condition {
+            DevelopmentalCondition::DegreeAbove(threshold) => {
+                self.adjacency[node_idx].len() > *threshold
+            }
+            DevelopmentalCondition::ActivatorAbove(threshold) => {
+                field.activator[node_idx] > *threshold
+            }
+            DevelopmentalCondition::InhibitorAbove(threshold) => {
+                field.inhibitor[node_idx] > *threshold
+            }
+            DevelopmentalCondition::NeighborFraction(target_type, threshold) => {
+                let neighbors = &self.adjacency[node_idx];
+                if neighbors.is_empty() { return false; }
+                let count = neighbors.iter()
+                    .filter(|&&n| self.node_types[n] == *target_type)
+                    .count();
+                (count as f32 / neighbors.len() as f32) > *threshold
+            }
+            DevelopmentalCondition::CentralityAbove(threshold) => {
+                // Approximated via normalized degree centrality for efficiency
+                let degree = self.adjacency[node_idx].len() as f32;
+                let max_degree = self.adjacency.iter()
+                    .map(|adj| adj.len())
+                    .max()
+                    .unwrap_or(1)
+                    .max(1) as f32; // guard against an edgeless graph
+                (degree / max_degree) > *threshold
+            }
+            DevelopmentalCondition::Always => true,
+        }
+    }
+}
+
+/// Events produced by the developmental program
+#[derive(Debug, Clone)]
+pub enum DevelopmentalEvent {
+    /// A node changed its functional type
+    Differentiate { node: usize, from: NodeType, to: NodeType },
+    /// A new node was spawned
+    Spawn { parent: usize, child: usize, child_type: NodeType },
+    /// An edge was pruned
+    Prune { src: usize, dst: usize },
+}
+
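+/// Perturb an embedding with small deterministic pseudo-noise in [-scale, scale).
+/// The offset depends only on the dimension index, so every call applies the same
+/// offsets; use a seeded RNG instead if siblings must receive distinct embeddings.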
+fn perturb_embedding(emb: &[f32], scale: f32) -> Vec<f32> {
+    emb.iter().enumerate()
+        .map(|(i, &v)| {
+            // Deterministic pseudo-noise based on index
+            let noise = ((i as f32 * 0.618033988) % 1.0 - 0.5) * 2.0 * scale;
+            v + noise
+        })
+        .collect()
+}
+```
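+
+For comparison with the condition-gated executor above, the underlying L-system idea can be sketched as plain parallel rewriting with no graph, conditions, or competence windows. `Kind`, `Production`, and `rewrite` are hypothetical names used only for this illustration, and `Kind` is a trimmed stand-in for `NodeType`:
+
+```rust
+/// Trimmed stand-in for `NodeType` (assumption: three variants suffice here).
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum Kind {
+    Stem,
+    Hub,
+    Relay,
+}
+
+/// Context-free production: every `predecessor` symbol becomes `successor`.
+struct Production {
+    predecessor: Kind,
+    successor: Vec<Kind>,
+}
+
+/// One derivation step: rewrite every symbol in parallel.
+/// Symbols with no matching production are copied unchanged.
+fn rewrite(word: &[Kind], rules: &[Production]) -> Vec<Kind> {
+    let mut next = Vec::with_capacity(word.len());
+    for &symbol in word {
+        match rules.iter().find(|r| r.predecessor == symbol) {
+            Some(rule) => next.extend_from_slice(&rule.successor),
+            None => next.push(symbol),
+        }
+    }
+    next
+}
+
+/// Example grammar: a stem becomes a hub and spawns one stem and one relay.
+fn example_rules() -> Vec<Production> {
+    vec![Production {
+        predecessor: Kind::Stem,
+        successor: vec![Kind::Hub, Kind::Stem, Kind::Relay],
+    }]
+}
+```
+
+Starting from the axiom `[Stem]`, each step adds one hub and one relay while keeping a single stem, which mirrors the "first element replaces, the rest spawn" convention of `ProductionRule::successor`.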
+
+#### 3. Autopoietic Maintenance Loop
+
+```rust
+/// Autopoietic maintenance system
+pub struct AutopoieticMaintainer {
+    config: AutopoieticConfig,
+    /// Forward pass counter
+    pass_count: usize,
+    /// Running coherence history
+    coherence_history: Vec<f32>,
+}
+
+impl AutopoieticMaintainer {
+    /// Execute one maintenance cycle if due.
+    ///
+    /// Measures current coherence (intended to come from ruvector-coherence
+    /// metrics; `measure_coherence` below is a simple proxy), then adjusts
+    /// topology to stay within the target band.
+    fn maybe_maintain(
+        &mut self,
+        adjacency: &mut Vec<Vec<usize>>,
+        node_types: &mut Vec<NodeType>,
+        attention_weights: &[Vec<(usize, f32)>],
+        embeddings: &[Vec<f32>],
+    ) -> Vec<MaintenanceAction> {
+        self.pass_count += 1;
+        if self.pass_count % self.config.cycle_interval != 0 {
+            return Vec::new();
+        }
+
+        let mut actions = Vec::new();
+        let coherence = self.measure_coherence(attention_weights);
+        self.coherence_history.push(coherence);
+
+        let target = self.config.target_coherence;
+        let tol = self.config.coherence_tolerance;
+
+        if coherence < target - tol {
+            // Coherence too low: grow edges to increase connectivity
+            let new_edges = self.grow_edges(adjacency, embeddings);
+            actions.extend(new_edges);
+        } else if coherence > target + tol {
+            // Coherence too high: prune weak edges
+            let pruned = self.prune_edges(adjacency, attention_weights);
+            actions.extend(pruned);
+        }
+
+        // Always check for overloaded nodes
+        let splits = self.split_overloaded(adjacency, node_types, embeddings);
+        actions.extend(splits);
+
+        actions
+    }
+
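+    /// Measure coherence as mean attention weight across active edges.
+    ///
+    /// Caveat: if each node's weights are softmax-normalized (sum to 1), this
+    /// mean is roughly 1 / average degree, so adding edges lowers it. The
+    /// grow/prune branches in `maybe_maintain` would then push away from the
+    /// target band; pair them with a connectivity-based metric instead.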
+    fn measure_coherence(&self, attention_weights: &[Vec<(usize, f32)>]) -> f32 {
+        let mut total_weight = 0.0f32;
+        let mut edge_count = 0usize;
+
+        for node_weights in attention_weights {
+            for &(_neighbor, weight) in node_weights {
+                total_weight += weight;
+                edge_count += 1;
+            }
+        }
+
+        if edge_count == 0 { return 0.0; }
+        total_weight / edge_count as f32
+    }
+
+    /// Prune edges with attention weight below threshold.
+    fn prune_edges(
+        &self,
+        adjacency: &mut Vec<Vec<usize>>,
+        attention_weights: &[Vec<(usize, f32)>],
+    ) -> Vec<MaintenanceAction> {
+        let mut actions = Vec::new();
+
+        for (src, node_weights) in attention_weights.iter().enumerate() {
+            let to_prune: Vec<usize> = node_weights.iter()
+                .filter(|&&(_, w)| w < self.config.prune_threshold)
+                .map(|&(dst, _)| dst)
+                .collect();
+
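+            for dst in to_prune {
+                // Adjacency is kept symmetric elsewhere (grow_edges, split_overloaded),
+                // so remove both directions and skip edges already pruned from the other side.
+                let before = adjacency[src].len();
+                adjacency[src].retain(|&n| n != dst);
+                if adjacency[src].len() == before { continue; }
+                adjacency[dst].retain(|&n| n != src);
+                actions.push(MaintenanceAction::PruneEdge { src, dst });
+            }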
+        }
+
+        actions
+    }
+
+    /// Split nodes whose degree exceeds the threshold.
+    fn split_overloaded(
+        &self,
+        adjacency: &mut Vec<Vec<usize>>,
+        node_types: &mut Vec<NodeType>,
+        embeddings: &[Vec<f32>],
+    ) -> Vec<MaintenanceAction> {
+        let mut actions = Vec::new();
+        let n = adjacency.len();
+
+        for i in 0..n {
+            if adjacency[i].len() > self.config.split_degree_threshold {
+                // Split: new node takes half the edges
+                let mid = adjacency[i].len() / 2;
+                let split_edges: Vec<usize> = adjacency[i].drain(mid..).collect();
+
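+                let new_idx = adjacency.len();
+                adjacency.push(split_edges.clone());
+                node_types.push(node_types[i]);
+                // NOTE: `embeddings` is read-only here, so the caller must append an
+                // embedding for `new_idx` (e.g. a copy of node i's) before the next pass.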
+
+                // Reconnect transferred edges
+                for &neighbor in &split_edges {
+                    if neighbor < adjacency.len() {
+                        // Replace old -> new in neighbor lists
+                        if let Some(pos) = adjacency[neighbor].iter().position(|&n| n == i) {
+                            adjacency[neighbor][pos] = new_idx;
+                        }
+                    }
+                }
+
+                // Connect the two halves
+                adjacency[i].push(new_idx);
+                adjacency[new_idx].push(i);
+
+                actions.push(MaintenanceAction::SplitNode {
+                    original: i,
+                    new_node: new_idx,
+                    edges_transferred: split_edges.len(),
+                });
+            }
+        }
+
+        actions
+    }
+
+    /// Grow new edges to increase coherence.
+    fn grow_edges(
+        &self,
+        adjacency: &mut Vec<Vec<usize>>,
+        embeddings: &[Vec<f32>],
+    ) -> Vec<MaintenanceAction> {
+        let mut actions = Vec::new();
+        let mut added = 0;
+
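+        // Find pairs with high embedding similarity but no edge.
+        // Nodes added by a split may not have an embedding yet, so stay in bounds.
+        let n = adjacency.len().min(embeddings.len());
+        for i in 0..n {
+            if added >= self.config.max_new_edges_per_cycle { break; }
+
+            for j in (i + 1)..n {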
+                if added >= self.config.max_new_edges_per_cycle { break; }
+                if adjacency[i].contains(&j) { continue; }
+
+                let sim = cosine_similarity(&embeddings[i], &embeddings[j]);
+                if sim > 0.8 {
+                    adjacency[i].push(j);
+                    adjacency[j].push(i);
+                    added += 1;
+                    actions.push(MaintenanceAction::GrowEdge { src: i, dst: j, similarity: sim });
+                }
+            }
+        }
+
+        actions
+    }
+}
+
+/// Actions taken by the autopoietic maintainer
+#[derive(Debug, Clone)]
+pub enum MaintenanceAction {
+    PruneEdge { src: usize, dst: usize },
+    GrowEdge { src: usize, dst: usize, similarity: f32 },
+    SplitNode { original: usize, new_node: usize, edges_transferred: usize },
+}
+
+fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
+    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
+    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
+    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
+    if norm_a < 1e-8 || norm_b < 1e-8 { return 0.0; }
+    dot / (norm_a * norm_b)
+}
+```
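+
+The maintainer records `coherence_history` but decides on the latest sample alone, so a single noisy measurement can trigger a grow or prune cycle. One possible refinement, sketched under the assumption that a short moving average is an acceptable smoother; `smoothed_coherence`, `decide`, and `Adjustment` are hypothetical names, not part of the original design:
+
+```rust
+#[derive(Debug, PartialEq)]
+enum Adjustment {
+    Grow,
+    Prune,
+    Hold,
+}
+
+/// Mean of the last `window` coherence samples.
+/// Returns None when there is no history or the window is zero.
+fn smoothed_coherence(history: &[f32], window: usize) -> Option<f32> {
+    if history.is_empty() || window == 0 {
+        return None;
+    }
+    let start = history.len().saturating_sub(window);
+    let recent = &history[start..];
+    Some(recent.iter().sum::<f32>() / recent.len() as f32)
+}
+
+/// Same band logic as `maybe_maintain` (below the band grows edges, above it
+/// prunes), applied to the smoothed value instead of the latest sample.
+fn decide(history: &[f32], window: usize, target: f32, tolerance: f32) -> Adjustment {
+    match smoothed_coherence(history, window) {
+        Some(c) if c < target - tolerance => Adjustment::Grow,
+        Some(c) if c > target + tolerance => Adjustment::Prune,
+        _ => Adjustment::Hold,
+    }
+}
+```
+
+With a target of `0.5`, a tolerance of `0.1`, and a window of 4, the history `[0.5, 0.5, 0.5, 0.8]` averages to `0.575` and holds, whereas the raw last sample `0.8` would have triggered pruning.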
+
+#### 4. Cellular Automata Attention Rules
+
+```rust
+/// Cellular automaton rule for graph attention updates.
+///
+/// Each node updates its attention state based on the attention states
+/// of its neighbors, analogous to Conway's Game of Life on a graph.
+pub struct CellularAttentionRule {
+    /// Birth threshold: node activates if >= birth neighbors are active
+    pub birth_threshold: usize,
+    /// Survival range: node stays active if neighbors in [lo, hi]
+    pub survival_lo: usize,
+    pub survival_hi: usize,
+    /// Refractory period: steps before reactivation after deactivation
+    pub refractory: usize,
+}
+
+impl CellularAttentionRule {
+    /// Update attention states for all nodes.
+    fn update(
+        &self,
+        states: &mut Vec<CellState>,
+        adjacency: &[Vec<usize>],
+    ) {
+        let n = states.len();
+        let old_states: Vec<CellState> = states.clone();
+
+        for i in 0..n {
+            let active_neighbors = adjacency[i].iter()
+                .filter(|&&j| old_states[j].active)
+                .count();
+
+            match &mut states[i] {
+                s if s.active => {
+                    // Survival check
+                    if active_neighbors < self.survival_lo
+                        || active_neighbors > self.survival_hi
+                    {
+                        s.active = false;
+                        s.refractory_remaining = self.refractory;
+                    }
+                }
+                s if s.refractory_remaining > 0 => {
+                    s.refractory_remaining -= 1;
+                }
+                s => {
+                    // Birth check
+                    if active_neighbors >= self.birth_threshold {
+                        s.active = true;
+                    }
+                }
+            }
+        }
+    }
+}
+
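+/// Per-node state tracked by the cellular automaton.
+#[derive(Debug, Clone)]
+pub struct CellState {
+    /// Whether the node is currently active
+    pub active: bool,
+    /// Update steps left before the node may reactivate
+    pub refractory_remaining: usize,
+}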
+```
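+
+To see how the birth, survival, and refractory parameters interact, the following standalone sketch runs the same update logic on a four-node path graph. `Cell`, `step`, `render`, and `run_path_demo` are hypothetical names introduced only for this illustration, and the parameter values (birth = 1, survival = [1, 2]) are assumptions chosen to make the behavior easy to follow; none of this is part of the proposed API.
+
+```rust
+// Standalone sketch: `Cell` and `step` are hypothetical stand-ins that mirror
+// `CellState` and `CellularAttentionRule::update` above.
+#[derive(Debug, Clone)]
+struct Cell {
+    active: bool,
+    refractory_remaining: usize,
+}
+
+/// One synchronous update; every node reads the previous step's states.
+fn step(
+    cells: &[Cell],
+    adjacency: &[Vec<usize>],
+    birth: usize,
+    survival: (usize, usize),
+    refractory: usize,
+) -> Vec<Cell> {
+    cells
+        .iter()
+        .enumerate()
+        .map(|(i, c)| {
+            let active_neighbors =
+                adjacency[i].iter().filter(|&&j| cells[j].active).count();
+            if c.active {
+                // Survival check
+                if active_neighbors < survival.0 || active_neighbors > survival.1 {
+                    Cell { active: false, refractory_remaining: refractory }
+                } else {
+                    c.clone()
+                }
+            } else if c.refractory_remaining > 0 {
+                // Still refractory: count down, no birth check this step
+                Cell { active: false, refractory_remaining: c.refractory_remaining - 1 }
+            } else {
+                // Birth check
+                Cell { active: active_neighbors >= birth, refractory_remaining: 0 }
+            }
+        })
+        .collect()
+}
+
+/// Renders activity as a string: '#' for active, '.' for inactive.
+fn render(cells: &[Cell]) -> String {
+    cells.iter().map(|c| if c.active { '#' } else { '.' }).collect()
+}
+
+/// Runs `steps` updates on the path graph 0-1-2-3 with only node 0 active,
+/// using birth = 1 and survival = [1, 2]. Returns one row per time step.
+fn run_path_demo(steps: usize, refractory: usize) -> Vec<String> {
+    let adjacency: Vec<Vec<usize>> = vec![vec![1], vec![0, 2], vec![1, 3], vec![2]];
+    let mut cells: Vec<Cell> = (0..4)
+        .map(|i| Cell { active: i == 0, refractory_remaining: 0 })
+        .collect();
+    let mut rows = vec![render(&cells)];
+    for _ in 0..steps {
+        cells = step(&cells, &adjacency, 1, (1, 2), refractory);
+        rows.push(render(&cells));
+    }
+    rows
+}
+
+fn main() {
+    for (t, row) in run_path_demo(4, 1).iter().enumerate() {
+        println!("t={t} {row}");
+    }
+}
+```
+
+With `refractory = 1`, activity moves along the path as a single pulse (`#...`, `.#..`, `..#.`, `...#`, `....`): each node that has just fired spends the next step counting down and cannot be re-triggered by the neighbor it activated. With `refractory = 0` the same setup lets activity spread back toward node 0 (the third row becomes `#.#.`), which suggests the refractory period is what gives attention waves a direction under these assumed parameters.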
+
+---
+
+## RuVector Integration Points
+
+### Affected Crates/Modules
+
+1. **`ruvector-domain-expansion`**: The `DomainExpansionEngine` already implements cross-domain transfer with `MetaThompsonEngine`. Morphogenetic fields extend this with spatial structure over the domain graph -- each domain node carries activator/inhibitor concentrations that influence the transfer policy selection. The `PolicyKernel` population search can be guided by developmental programs that specialize kernels into domain-specific roles.
+
+2. **`ruvector-attention`**: The existing 18+ attention mechanisms (including morphological, topology, sheaf, PDE, transport, curvature, sparse, flash, hyperbolic, and MoE) serve as the building blocks that the self-organizing system selects and composes. The `topology/` module's gated attention maps directly to morphogenetic field gating. The `sheaf/` module's restriction maps provide the mathematical framework for boundary-creating attention between differentiated node types.
+
+3. **`ruvector-coherence`**: The coherence engine (`spectral.rs`, `quality.rs`, `metrics.rs`) provides the feedback signal for the autopoietic loop. The target coherence from `AutopoieticConfig` corresponds directly to the spectral coherence thresholds used in the mincut-gated-transformer. Coherence measurements drive the grow/prune/split decisions.
+
+4. **`ruvector-mincut`**: Topology optimization via mincut provides the theoretical foundation for the pruning phase of autopoiesis. The mincut-gated-transformer's `GateController` (energy gates, early exit) directly corresponds to morphogenetic field gating -- both decide which computation paths are active based on a learned signal.
+
+5. **`ruvector-nervous-system`**: The dendritic coincidence detection (`Dendrite`, `DendriticTree`, `PlateauPotential`) maps directly to the developmental differentiation model. Neurons differentiate based on their dendritic input patterns, just as graph nodes differentiate based on local topology. The `plasticity/eprop` module's e-prop learning rule can guide morphogenetic field parameter adaptation. The `GlobalWorkspace` and `OscillatoryRouter` provide the coordination substrate for cellular automata attention.
+
+6. **`ruvector-gnn`**: The core GNN layer (`layer.rs`), training loop (`training.rs`), and elastic weight consolidation (`ewc.rs`) provide the foundation. EWC is essential for developmental programs: when a node differentiates, the weights associated with its old type must be protected via Fisher-information-weighted regularization, preventing catastrophic forgetting of learned representations.
+
+### New Modules to Create
+
+```
+ruvector-gnn/src/self_organizing/
+  mod.rs
+  morphogenetic.rs     # Reaction-diffusion field on graph
+  developmental.rs     # L-system graph grammar executor
+  autopoietic.rs       # Self-maintenance loop
+  cellular_automata.rs # CA-based attention rules
+  growth_phase.rs      # Phase scheduling
+  metrics.rs           # Growth statistics and visualization
+```
+
+---
+
+## Future Roadmap
+
+### 2030: Self-Growing Graph Architectures
+
+By 2030, the developmental program becomes a learned object rather than a hand-designed grammar. The production rules themselves are parameterized by neural networks trained via reinforcement learning on downstream task performance. Key milestones:
+
+- **Learned Growth Rules**: A meta-network predicts which production rule to apply at each developmental step, conditioned on global graph statistics and task loss gradients.
+- **Topology-Aware Data Distribution Matching**: The morphogenetic field parameters are optimized so that the resulting attention cluster structure matches the data distribution's intrinsic geometry (e.g., manifold structure, cluster hierarchy).
+- **Federated Self-Organization**: Multiple SOGT instances running on different data partitions exchange developmental signals (activator/inhibitor concentrations) to coordinate topology across distributed deployments.
+- **Morphogenetic Architecture Search**: Instead of NAS over a fixed search space, the search space itself grows through morphogenetic processes. Novel attention mechanisms emerge as stable Turing patterns on the architecture search graph.
+
+### 2036: Autonomous Graph Systems
+
+By 2036, the self-organizing graph transformer becomes a fully autonomous system that evolves new attention mechanisms through its developmental program:
+
+- **Open-Ended Evolution**: The graph system exhibits open-ended evolution -- it continuously produces novel structures that are not repetitions of previous states. New node types, edge types, and attention mechanisms emerge without human intervention.
+- **Developmental Canalization**: The system develops robust developmental trajectories that reliably produce high-performing topologies despite environmental perturbation, analogous to biological canalization.
+- **Morphogenetic Memory**: Growth histories are stored as compressed developmental programs (analogous to DNA) that can be replayed, mutated, and recombined for evolutionary search over architectures.
+- **Autopoietic Resilience at Scale**: Production graph systems with millions of nodes self-repair within milliseconds of node failure, maintaining 99.999% coherence through continuous autopoietic maintenance without human intervention.
+
+---
+
+## Implementation Phases
+
+### Phase 1: Morphogenetic Fields (3 weeks)
+- Implement reaction-diffusion on graph using graph Laplacian
+- Integrate Turing pattern attention masking with the existing `ruvector-attention` crate
+- Validate pattern formation on synthetic graphs
+- Unit tests for stability and convergence
+
+### Phase 2: Developmental Programs (4 weeks)
+- Implement L-system graph grammar with production rules
+- Add competence windows and node differentiation
+- Integrate with morphogenetic fields for condition checking
+- Test developmental trajectories on benchmark graphs
+
+### Phase 3: Autopoietic Maintenance (3 weeks)
+- Implement coherence-gated topology maintenance using `ruvector-coherence`
+- Add edge pruning, node splitting, and edge growth
+- Integrate with existing HNSW index maintenance
+- Stress tests for self-repair under node deletion
+
+### Phase 4: Integration and Evaluation (2 weeks)
+- Combine all three subsystems into unified SOGT layer
+- Benchmark against static graph transformers on distribution-shifting workloads
+- Measure self-repair latency and coherence maintenance
+- Document growth phase scheduling heuristics
+
+---
+
+## Success Metrics
+
+| Metric | Target |
+|--------|--------|
+| Topology Adaptation Speed | <100 ms to respond to distribution shift |
+| Node Specialization Accuracy | >85% correct functional type assignment |
+| Self-Repair Recovery Time | <50 ms to recover from 10% node deletion |
+| Coherence Maintenance | Within ±5% of target coherence |
+| Retrieval Quality (shifting workload) | 30-50% improvement over static topology |
+| Growth Overhead | <15% additional computation per forward pass |
+| Morphogenetic Pattern Stability | Converge within 50 reaction-diffusion steps |
+
+---
+
+## Risks and Mitigations
+
+1. **Risk: Uncontrolled Growth**
+   - Mitigation: Hard `max_nodes` cap, growth rate limits per phase, energy-based cost for node creation
+
+2. **Risk: Developmental Instability**
+   - Mitigation: Canalization through competence windows, EWC regularization to protect existing weights during differentiation
+
+3. **Risk: Morphogenetic Pattern Collapse**
+   - Mitigation: Validated Turing parameter regimes (D_h/D_a > 5), stochastic perturbation to break symmetry
+
+4. **Risk: Autopoietic Oscillation**
+   - Mitigation: Hysteresis in coherence thresholds (different thresholds for grow vs. prune), exponential moving average smoothing
+
+5. **Risk: Performance Overhead**
+   - Mitigation: Amortize maintenance over many forward passes, sparse Laplacian operations, early-exit from growth phases when targets are met
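+
+The mitigation for autopoietic oscillation (Risk 4) can be sketched as an exponential moving average followed by a dead band between two thresholds. This is a minimal illustration under stated assumptions: `HysteresisGate`, `Action`, and all parameter values are hypothetical, and the direction of the decision (grow when smoothed coherence is low, prune when it is high) is an assumption made for the example rather than something this document specifies.
+
+```rust
+/// Hypothetical maintenance action; the names are illustrative only.
+#[derive(Debug, Clone, Copy, PartialEq)]
+enum Action {
+    Grow,
+    Prune,
+    Hold,
+}
+
+/// Smooths raw coherence with an EMA, then applies separate grow and prune
+/// thresholds so that readings inside the band trigger no topology change.
+struct HysteresisGate {
+    /// EMA weight on the newest sample, in (0, 1]
+    alpha: f32,
+    /// Assumed rule: grow when smoothed coherence falls below this value
+    grow_below: f32,
+    /// Assumed rule: prune when smoothed coherence rises above this value
+    prune_above: f32,
+    /// Current smoothed coherence; `None` until the first sample arrives
+    smoothed: Option<f32>,
+}
+
+impl HysteresisGate {
+    fn new(alpha: f32, grow_below: f32, prune_above: f32) -> Self {
+        assert!(alpha > 0.0 && alpha <= 1.0, "alpha must be in (0, 1]");
+        assert!(grow_below < prune_above, "thresholds must leave a dead band");
+        Self { alpha, grow_below, prune_above, smoothed: None }
+    }
+
+    /// Folds one raw coherence sample into the EMA and returns the decision.
+    fn observe(&mut self, coherence: f32) -> Action {
+        let s = match self.smoothed {
+            Some(prev) => self.alpha * coherence + (1.0 - self.alpha) * prev,
+            None => coherence, // the first sample seeds the average
+        };
+        self.smoothed = Some(s);
+        if s < self.grow_below {
+            Action::Grow
+        } else if s > self.prune_above {
+            Action::Prune
+        } else {
+            Action::Hold
+        }
+    }
+}
+
+fn main() {
+    let mut gate = HysteresisGate::new(0.5, 0.6, 0.8);
+    // Raw readings alternate on both sides of the band after the first sample.
+    for raw in [0.7, 0.55, 0.85, 0.55, 0.85] {
+        println!("raw={raw:.2} -> {:?}", gate.observe(raw));
+    }
+}
+```
+
+In this run the raw readings alternate between 0.55 and 0.85, which would flip between grow and prune if compared against the thresholds directly. The smoothed value stays inside the [0.6, 0.8] band, so every step returns `Hold`. A sustained drop or rise still moves the average out of the band and triggers an action.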