Graph Condensation (SFGC) - Implementation Plan
Overview
Problem Statement
HNSW graphs in production vector databases face critical deployment challenges:
- Memory Footprint: Full HNSW graphs require 40-120 bytes per vector for connectivity metadata
- Edge Deployment: Mobile/IoT devices cannot store million-node graphs (400MB-4.8GB overhead)
- Federated Learning: Transferring full graphs between nodes is bandwidth-prohibitive
- Cold Start: Initial graph construction is expensive for dynamic applications
Proposed Solution
Implement Structure-Preserving Graph Condensation (SFGC) that creates synthetic "super-nodes" representing clusters of original nodes. The condensed graph:
- Reduces graph size by 10-100x (configurable compression ratio)
- Preserves topological properties (small-world, scale-free characteristics)
- Maintains search accuracy within 2-5% of full graph
- Enables progressive graph expansion from condensed to full representation
Core Innovation: Unlike naive graph coarsening, SFGC learns synthetic node embeddings that maximize structural fidelity using a differentiable graph neural network.
Expected Benefits (Quantified)
| Metric | Current (Full HNSW) | With SFGC (50x) | Improvement |
|---|---|---|---|
| Memory footprint | 4.8GB (1M vectors) | 96MB | 50x reduction |
| Transfer bandwidth | 4.8GB | 96MB | 50x reduction |
| Edge device compatibility | Limited to 100K vectors | 5M vectors | 50x capacity |
| Cold start time | 120s | 8s + progressive | 15x faster |
| Search accuracy (recall@10) | 0.95 | 0.92-0.94 | 2-3% degradation |
| Search latency | 1.2ms | 1.5ms (initial), 1.2ms (expanded) | 25% slower → same |
ROI Calculation:
- Edge deployment: enables $500 devices vs $2000 workstations
- Federated learning: 50x faster synchronization (2.4s vs 120s)
- Multi-tenant SaaS: 50x more graphs per server
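The footprint and bandwidth figures above follow from a simple per-vector model. A minimal sketch of that arithmetic (the 1024-d f32 vectors and 64-byte connectivity overhead are illustrative assumptions chosen to land in the same order of magnitude as the 4.8GB figure, not measurements from this codebase):

```rust
// Rough footprint model behind the table above. Dimensions and overhead are
// illustrative assumptions, not measured values.
fn estimate_footprint_bytes(n_vectors: u64, dim: u64, graph_overhead: u64) -> u64 {
    // f32 vector data (4 bytes per component) plus HNSW connectivity metadata
    n_vectors * (dim * 4 + graph_overhead)
}

fn condensed_footprint_bytes(full_bytes: u64, compression_ratio: u64) -> u64 {
    // Condensation shrinks node count and edge lists by the same ratio
    full_bytes / compression_ratio
}
```

For 1M vectors at 1024 dimensions this yields roughly 4.2GB full and 83MB at 50x, consistent with the scale of the table.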
Technical Design
Architecture Diagram (ASCII)
┌─────────────────────────────────────────────────────────────────┐
│ Graph Condensation Pipeline │
└─────────────────────────────────────────────────────────────────┘
│
┌─────────────┴─────────────┐
▼ ▼
┌───────────────────────┐ ┌───────────────────────┐
│ Offline Condensation │ │ Online Expansion │
│ (Training) │ │ (Runtime) │
└───────────────────────┘ └───────────────────────┘
│ │
┌───────────┼───────────┐ │
▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌─────────┐ ┌─────────────┐
│Cluster │ │Synth │ │Edge │ │Progressive │
│ ing │ │Node │ │Preserv │ │Decompression│
│ │ │Learn │ │ation │ │ │
└────────┘ └────────┘ └─────────┘ └─────────────┘
│ │ │ │
└───────────┼───────────┘ │
▼ ▼
┌──────────────┐ ┌──────────────┐
│ Condensed │─────────▶│ Hybrid │
│ Graph File │ Load │ Graph Store │
│ (.cgraph) │ │ │
└──────────────┘ └──────────────┘
│
▼
┌──────────────┐
│ Search API │
│ (adaptive) │
└──────────────┘
Component Flow:
1. Offline Condensation (Training Phase):
   - Hierarchical clustering of the original graph
   - GNN-based synthetic node embedding learning
   - Edge weight optimization via a structure preservation loss
   - Export to the .cgraph format
2. Online Expansion (Runtime):
   - Load the condensed graph for fast cold start
   - Progressive decompression on cache misses
   - Adaptive switching between condensed and full graph
Core Data Structures (Rust)
/// Condensed graph representation with synthetic nodes
#[derive(Clone, Debug)]
pub struct CondensedGraph {
    /// Synthetic node embeddings (learned via GNN)
    pub synthetic_nodes: Vec<SyntheticNode>,
    /// Condensed HNSW layers (smaller topology)
    pub condensed_layers: Vec<HnswLayer>,
    /// Compression ratio (e.g., 50.0 for 50x)
    pub compression_ratio: f32,
    /// Mapping from synthetic node to original node IDs
    pub expansion_map: HashMap<NodeId, Vec<NodeId>>,
    /// Graph statistics for adaptive expansion
    pub stats: GraphStatistics,
}
/// Synthetic node representing a cluster of original nodes.
/// Note: `Clone` cannot be derived here because `AtomicU64` is not `Clone`;
/// implement it manually by snapshotting the counter value.
#[derive(Debug)]
pub struct SyntheticNode {
    /// Learned embedding (centroid of cluster, refined via GNN)
    pub embedding: Vec<f32>,
    /// Original node IDs in this cluster
    pub cluster_members: Vec<NodeId>,
    /// Cluster radius (for expansion threshold)
    pub radius: f32,
    /// Connectivity in condensed graph
    pub neighbors: Vec<(NodeId, f32)>, // (neighbor_id, edge_weight)
    /// Access frequency (for adaptive expansion)
    pub access_count: AtomicU64,
}
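The `radius` field above is used later as an expansion threshold. A minimal sketch of how it could be computed as the maximum centroid-to-member distance (`compute_cluster_radius` is the name used by the pseudocode in this plan; the Euclidean helper is an assumption):

```rust
// Euclidean distance between two embeddings.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

// Cluster radius = farthest member from the centroid; results closer than
// ~radius to a query are candidates for expansion.
fn compute_cluster_radius(centroid: &[f32], members: &[Vec<f32>]) -> f32 {
    members
        .iter()
        .map(|m| euclidean(centroid, m))
        .fold(0.0f32, f32::max)
}
```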
/// Configuration for graph condensation process
#[derive(Clone, Debug)]
pub struct CondensationConfig {
    /// Target compression ratio (10-100)
    pub compression_ratio: f32,
    /// Clustering method
    pub clustering_method: ClusteringMethod,
    /// GNN training epochs for synthetic nodes
    pub gnn_epochs: usize,
    /// Structure preservation weight (vs embedding quality)
    pub structure_weight: f32,
    /// Edge preservation strategy
    pub edge_strategy: EdgePreservationStrategy,
}
/// Note: `Debug` must be implemented manually because of the closure variant;
/// the closure is stored in an `Arc` so the enum can still be `Clone`.
#[derive(Clone)]
pub enum ClusteringMethod {
    /// Hierarchical agglomerative clustering
    Hierarchical { linkage: LinkageType },
    /// Louvain modularity-based clustering
    Louvain { resolution: f32 },
    /// Spectral clustering via graph Laplacian
    Spectral { n_components: usize },
    /// Custom clustering function
    Custom(Arc<dyn Fn(&HnswIndex) -> Vec<Vec<NodeId>> + Send + Sync>),
}
#[derive(Clone, Debug)]
pub enum EdgePreservationStrategy {
    /// Keep edges if both endpoints map to different synthetic nodes
    InterCluster,
    /// Weighted by cluster similarity
    WeightedSimilarity,
    /// Learn edge weights via GNN
    Learned,
}
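One way the `WeightedSimilarity` strategy could assign a condensed edge weight, sketched under the assumption that the weight is the centroid cosine similarity scaled by the fraction of surviving inter-cluster edges (both the formula and the function names are illustrative, not a fixed API):

```rust
// Cosine similarity between two cluster centroids.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

// Condensed edge weight: centroid similarity scaled by how many of the
// possible original inter-cluster edges actually existed.
fn condensed_edge_weight(
    centroid_u: &[f32],
    centroid_v: &[f32],
    inter_cluster_edges: usize,
    possible_edges: usize,
) -> f32 {
    cosine_similarity(centroid_u, centroid_v)
        * (inter_cluster_edges as f32 / possible_edges as f32)
}
```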
/// Hybrid graph store supporting both condensed and full graphs
pub struct HybridGraphStore {
    /// Condensed graph (always loaded)
    condensed: CondensedGraph,
    /// Full graph (lazily loaded regions)
    full_graph: Option<Arc<RwLock<HnswIndex>>>,
    /// Expanded regions cache
    expanded_cache: LruCache<NodeId, ExpandedRegion>,
    /// Expansion policy
    policy: ExpansionPolicy,
}
/// Policy for when to expand condensed nodes to full graph
#[derive(Clone, Debug)]
pub enum ExpansionPolicy {
    /// Never expand (use condensed graph only)
    Never,
    /// Expand on cache miss
    OnDemand { cache_size: usize },
    /// Expand regions with high query frequency
    Adaptive { threshold: f64 },
    /// Always use full graph
    Always,
}
/// Expanded region of the full graph
struct ExpandedRegion {
    /// Full node data for this region
    nodes: Vec<FullNode>,
    /// Last access timestamp
    last_access: Instant,
    /// Access count
    access_count: u64,
}
/// Statistics for monitoring condensation quality
#[derive(Clone, Debug, Default)]
pub struct GraphStatistics {
    /// Average cluster size
    pub avg_cluster_size: f32,
    /// Cluster size variance
    pub cluster_variance: f32,
    /// Edge preservation ratio (condensed edges / original edges)
    pub edge_preservation: f32,
    /// Average path length increase
    pub path_length_delta: f32,
    /// Clustering coefficient preservation
    pub clustering_coef_ratio: f32,
}
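Two of these statistics reduce to simple ratios over raw counts; a sketch of how they could be derived (the helper names are hypothetical):

```rust
// Fraction of original edges that survive condensation, as documented on
// the `edge_preservation` field above.
fn edge_preservation(condensed_edges: usize, original_edges: usize) -> f32 {
    condensed_edges as f32 / original_edges as f32
}

// Average cluster size = original node count over synthetic node count;
// at a 50x compression ratio this should hover near 50.
fn avg_cluster_size(n_original: usize, n_synthetic: usize) -> f32 {
    n_original as f32 / n_synthetic as f32
}
```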
Key Algorithms (Pseudocode)
Algorithm 1: Graph Condensation (Offline Training)
function condense_graph(hnsw_index, config):
    # Step 1: Hierarchical clustering
    clusters = hierarchical_cluster(
        hnsw_index.nodes,
        target_clusters = hnsw_index.size / config.compression_ratio
    )

    # Step 2: Initialize synthetic node embeddings
    synthetic_nodes = []
    for cluster in clusters:
        centroid = compute_centroid(cluster.members)
        synthetic_nodes.append(SyntheticNode {
            embedding: centroid,
            cluster_members: cluster.members,
            radius: compute_cluster_radius(cluster),
            neighbors: [],
            access_count: 0
        })

    # Step 3: Build condensed edges
    condensed_edges = build_condensed_edges(
        hnsw_index,
        clusters,
        config.edge_strategy
    )

    # Step 4: GNN-based refinement
    gnn_model = GraphNeuralNetwork(
        input_dim = embedding_dim,
        hidden_dims = [128, 64],
        output_dim = embedding_dim
    )
    optimizer = Adam(gnn_model.parameters(), lr=0.001)
    for epoch in 1..config.gnn_epochs:
        # Forward pass: refine synthetic embeddings
        refined_embeddings = gnn_model.forward(
            synthetic_nodes.embeddings,
            condensed_edges
        )
        # Compute structure preservation loss
        loss = compute_structure_loss(
            refined_embeddings,
            condensed_edges,
            original_graph = hnsw_index,
            expansion_map = clusters,
            structure_weight = config.structure_weight
        )
        # Backward pass
        loss.backward()
        optimizer.step()
        # Update synthetic embeddings
        for i, node in enumerate(synthetic_nodes):
            node.embedding = refined_embeddings[i]

    # Step 5: Build condensed HNSW layers
    condensed_layers = build_hnsw_layers(
        synthetic_nodes,
        condensed_edges,
        max_layer = hnsw_index.max_layer
    )

    return CondensedGraph {
        synthetic_nodes,
        condensed_layers,
        compression_ratio: config.compression_ratio,
        expansion_map: clusters,
        stats: compute_statistics(...)
    }
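Step 2 above (and the Phase 1 fallback that skips GNN refinement entirely) reduces to a mean over member embeddings. A self-contained Rust sketch of `compute_centroid` as this plan uses the name:

```rust
// Centroid of a cluster: component-wise mean of the member embeddings.
// This is the Phase 1 synthetic-node embedding before any GNN refinement.
fn compute_centroid(members: &[Vec<f32>]) -> Vec<f32> {
    let dim = members[0].len();
    let mut centroid = vec![0.0f32; dim];
    for m in members {
        for (c, x) in centroid.iter_mut().zip(m) {
            *c += x;
        }
    }
    let n = members.len() as f32;
    centroid.iter_mut().for_each(|c| *c /= n);
    centroid
}
```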
function compute_structure_loss(embeddings, edges, original_graph, expansion_map, structure_weight):
    # Part 1: Embedding quality (centroid fidelity)
    embedding_loss = 0
    for i, synthetic_node in enumerate(embeddings):
        cluster_members = expansion_map[i]
        original_embeddings = [original_graph.get_embedding(id) for id in cluster_members]
        true_centroid = mean(original_embeddings)
        embedding_loss += mse(synthetic_node, true_centroid)

    # Part 2: Structure preservation (edge connectivity)
    structure_loss = 0
    for (u, v, weight) in edges:
        # Check how strongly the original graph connected clusters u and v
        cluster_u = expansion_map[u]
        cluster_v = expansion_map[v]
        original_connectivity = compute_inter_cluster_connectivity(
            original_graph, cluster_u, cluster_v
        )
        predicted_connectivity = cosine_similarity(embeddings[u], embeddings[v])
        structure_loss += mse(predicted_connectivity, original_connectivity)

    # Part 3: Topological invariants
    topo_loss = 0
    condensed_clustering_coef = compute_clustering_coefficient(embeddings, edges)
    original_clustering_coef = original_graph.clustering_coefficient
    topo_loss += abs(condensed_clustering_coef - original_clustering_coef)

    return (1 - structure_weight) * embedding_loss +
           structure_weight * (structure_loss + 0.1 * topo_loss)
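The final weighted sum can be isolated as a small helper; the 0.1 factor on the topological term is taken directly from the return expression above, while the function name itself is an assumption for illustration:

```rust
// Weighted combination from compute_structure_loss: structure_weight trades
// centroid fidelity against connectivity preservation, with a fixed 0.1
// factor on the topological term, matching the pseudocode.
fn combined_loss(
    embedding_loss: f32,
    structure_loss: f32,
    topo_loss: f32,
    structure_weight: f32,
) -> f32 {
    (1.0 - structure_weight) * embedding_loss
        + structure_weight * (structure_loss + 0.1 * topo_loss)
}
```

With the `default_50x()` setting of `structure_weight: 0.7`, structure terms dominate the objective by a factor of about 2.3.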
Algorithm 2: Progressive Expansion (Online Runtime)
function search_hybrid_graph(query, k, hybrid_store):
    # Step 1: Search in condensed graph
    condensed_results = search_condensed(
        query,
        hybrid_store.condensed,
        k_initial = k * 2  # oversample
    )

    # Step 2: Decide whether to expand
    if hybrid_store.policy == ExpansionPolicy::Never:
        return refine_condensed_results(condensed_results, k)

    # Step 3: Identify expansion candidates
    expansion_candidates = []
    for result in condensed_results:
        synthetic_node = result.node
        # Expand if: high uncertainty OR cache miss OR high query frequency
        should_expand = (
            result.distance < synthetic_node.radius * 1.5 OR                # uncertainty
            not hybrid_store.expanded_cache.contains(synthetic_node.id) OR  # cache miss
            synthetic_node.access_count.load() > adaptive_threshold         # hot region
        )
        if should_expand:
            expansion_candidates.append(synthetic_node.id)

    # Step 4: Expand regions (lazily load from full graph)
    if len(expansion_candidates) > 0:
        expanded_regions = hybrid_store.expand_regions(expansion_candidates)

        # Step 5: Refine search in expanded regions
        refined_results = []
        for region in expanded_regions:
            local_results = search_full_graph(
                query,
                region.nodes,
                k_local = k
            )
            refined_results.extend(local_results)

        # Merge condensed and expanded results
        all_results = merge_results(condensed_results, refined_results)
        return top_k(all_results, k)
    else:
        # No expansion needed
        return refine_condensed_results(condensed_results, k)
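The three expansion triggers in Step 3 can be collapsed into one predicate. A Rust sketch (the 1.5 uncertainty factor and `adaptive_threshold` are the tunable values from the pseudocode; the function name is hypothetical):

```rust
// Decide whether a condensed result warrants expanding its cluster into
// full-graph nodes: any one of the three triggers from Step 3 suffices.
fn needs_expansion(
    distance: f32,
    radius: f32,
    cached: bool,
    access_count: u64,
    adaptive_threshold: u64,
) -> bool {
    let uncertain = distance < radius * 1.5; // query falls inside the cluster's fuzz zone
    let hot = access_count > adaptive_threshold; // frequently queried region
    uncertain || !cached || hot
}
```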
function expand_regions(hybrid_store, synthetic_node_ids):
    expanded = []
    for node_id in synthetic_node_ids:
        # Check cache first
        if hybrid_store.expanded_cache.contains(node_id):
            expanded.append(hybrid_store.expanded_cache.get(node_id))
            continue

        # Load from full graph (disk or memory)
        synthetic_node = hybrid_store.condensed.synthetic_nodes[node_id]
        cluster_member_ids = synthetic_node.cluster_members
        full_nodes = []
        if hybrid_store.full_graph.is_some():
            # Full graph in memory
            full_graph = hybrid_store.full_graph.unwrap()
            for member_id in cluster_member_ids:
                full_nodes.append(full_graph.get_node(member_id))
        else:
            # Load from disk (mmap)
            full_nodes = load_nodes_from_disk(cluster_member_ids)

        region = ExpandedRegion {
            nodes: full_nodes,
            last_access: now(),
            access_count: 1
        }
        # Add to cache (evict LRU if full)
        hybrid_store.expanded_cache.put(node_id, region)
        expanded.append(region)
    return expanded
API Design (Function Signatures)
// ============================================================
// Public API for Graph Condensation
// ============================================================
pub trait GraphCondensation {
    /// Condense an HNSW index into a smaller graph
    fn condense(
        &self,
        config: CondensationConfig,
    ) -> Result<CondensedGraph, CondensationError>;

    /// Save condensed graph to disk
    fn save_condensed(&self, path: &Path) -> Result<(), io::Error>;

    /// Load condensed graph from disk
    fn load_condensed(path: &Path) -> Result<CondensedGraph, io::Error>;

    /// Validate condensation quality
    fn validate_condensation(
        &self,
        condensed: &CondensedGraph,
        test_queries: &[Vec<f32>],
    ) -> ValidationMetrics;
}
pub trait HybridGraphSearch {
    /// Search using hybrid condensed/full graph
    fn search_hybrid(
        &self,
        query: &[f32],
        k: usize,
        policy: ExpansionPolicy,
    ) -> Result<Vec<SearchResult>, SearchError>;

    /// Adaptive search with automatic expansion
    fn search_adaptive(
        &self,
        query: &[f32],
        k: usize,
        recall_target: f32, // e.g., 0.95
    ) -> Result<Vec<SearchResult>, SearchError>;

    /// Get current cache statistics
    fn cache_stats(&self) -> CacheStatistics;

    /// Preload hot regions into cache
    fn warmup_cache(&mut self, query_log: &[Vec<f32>]) -> Result<(), CacheError>;
}
// ============================================================
// Configuration API
// ============================================================
impl CondensationConfig {
    /// Default configuration for 50x compression
    pub fn default_50x() -> Self {
        Self {
            compression_ratio: 50.0,
            clustering_method: ClusteringMethod::Hierarchical {
                linkage: LinkageType::Ward,
            },
            gnn_epochs: 100,
            structure_weight: 0.7,
            edge_strategy: EdgePreservationStrategy::Learned,
        }
    }

    /// Aggressive compression for edge devices (100x)
    pub fn edge_device() -> Self {
        Self {
            compression_ratio: 100.0,
            clustering_method: ClusteringMethod::Louvain {
                resolution: 1.2,
            },
            gnn_epochs: 50,
            structure_weight: 0.5,
            edge_strategy: EdgePreservationStrategy::InterCluster,
        }
    }

    /// Conservative compression for high accuracy (10x)
    pub fn high_accuracy() -> Self {
        Self {
            compression_ratio: 10.0,
            clustering_method: ClusteringMethod::Spectral {
                n_components: 128,
            },
            gnn_epochs: 200,
            structure_weight: 0.9,
            edge_strategy: EdgePreservationStrategy::Learned,
        }
    }
}
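A caller might select among these three presets by available device memory. A sketch reducing that choice to the compression ratio alone (the memory thresholds are illustrative assumptions, not recommendations from this plan):

```rust
// Map available memory to the compression ratio of the matching preset:
// <=128 MB -> edge_device() (100x), <=1 GB -> default_50x() (50x),
// otherwise high_accuracy() (10x). Thresholds are illustrative.
fn pick_compression_ratio(available_mem_mb: u64) -> f32 {
    match available_mem_mb {
        0..=128 => 100.0,
        129..=1024 => 50.0,
        _ => 10.0,
    }
}
```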
// ============================================================
// Monitoring and Metrics
// ============================================================
#[derive(Clone, Debug)]
pub struct ValidationMetrics {
    /// Recall at different k values
    pub recall_at_k: HashMap<usize, f32>,
    /// Average path length increase
    pub avg_path_length_ratio: f32,
    /// Search latency comparison
    pub latency_ratio: f32,
    /// Memory reduction achieved
    pub memory_reduction: f32,
    /// Graph property preservation
    pub property_preservation: PropertyPreservation,
}

#[derive(Clone, Debug)]
pub struct PropertyPreservation {
    pub clustering_coefficient: f32,
    pub average_degree: f32,
    pub diameter_ratio: f32,
}

#[derive(Clone, Debug)]
pub struct CacheStatistics {
    pub hit_rate: f32,
    pub eviction_count: u64,
    pub avg_expansion_time: Duration,
    pub total_expansions: u64,
}
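The `hit_rate` field would typically be derived from raw hit/miss counters rather than stored directly. A minimal sketch (the struct and method names are assumptions):

```rust
// Raw counters backing CacheStatistics::hit_rate.
struct CacheCounters {
    hits: u64,
    misses: u64,
}

impl CacheCounters {
    // Hit rate over all lookups; defined as 0.0 before any lookup occurs.
    fn hit_rate(&self) -> f32 {
        let total = self.hits + self.misses;
        if total == 0 {
            0.0
        } else {
            self.hits as f32 / total as f32
        }
    }
}
```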
Integration Points
Affected Crates/Modules
- ruvector-gnn (core GNN crate):
  - Add condensation/ module for graph compression
  - Extend HnswIndex with a condense() method
  - Add GNN training loop for synthetic node refinement
- ruvector-core:
  - Add CondensedGraph serialization format (.cgraph)
  - Extend search API with hybrid search modes
  - Add HybridGraphStore as an alternative index backend
- ruvector-gnn-node (Node.js bindings):
  - Expose condense() API to JavaScript/TypeScript
  - Add configuration builder for condensation parameters
  - Provide progress callbacks for offline condensation
- ruvector-cli:
  - Add ruvector condense command for offline condensation
  - Add ruvector validate-condensed for quality testing
  - Add visualization for condensed graph statistics
- ruvector-distributed:
  - Use condensed graphs for federated learning synchronization
  - Implement condensed graph transfer protocol
  - Add merge logic for condensed graphs from multiple nodes
New Modules to Create
crates/ruvector-gnn/src/condensation/
├── mod.rs # Public API
├── clustering.rs # Hierarchical/Louvain/Spectral clustering
├── synthetic_node.rs # Synthetic node learning via GNN
├── edge_preservation.rs # Edge weight computation
├── gnn_trainer.rs # GNN training loop
├── structure_loss.rs # Loss functions for structure preservation
├── serialization.rs # .cgraph format I/O
└── validation.rs # Quality metrics
crates/ruvector-core/src/hybrid/
├── mod.rs # HybridGraphStore
├── expansion_policy.rs # Adaptive expansion logic
├── cache.rs # LRU cache for expanded regions
└── search.rs # Hybrid search algorithms
crates/ruvector-gnn-node/condensation/
├── bindings.rs # NAPI bindings
└── typescript/
└── condensation.d.ts # TypeScript definitions
Dependencies on Other Features
- Prerequisite: Attention Mechanisms (Tier 1):
  - SFGC uses attention-weighted clustering
  - Synthetic node embeddings benefit from attention-based aggregation
  - Action: ensure the attention module is stable before SFGC integration
- Synergy: Adaptive HNSW (Tier 2, Feature #5):
  - Adaptive HNSW can use the condensed graph for cold start
  - Layer-wise compression ratios (compress higher layers more aggressively)
  - Integration: shared ExpansionPolicy trait
- Optional: Neuromorphic Spiking (Tier 2, Feature #6):
  - Spiking networks can accelerate GNN training for synthetic nodes
  - Integration: conditional compilation flag for the spiking backend
- Complementary: Sparse Attention (Tier 3, Feature #8):
  - Sparse attention patterns can guide clustering
  - Integration: use learned attention masks as clustering hints
Regression Prevention
Existing Functionality at Risk
- HNSW Search Accuracy:
  - Risk: condensed graph returns lower-quality results
  - Mitigation:
    - Validate recall@10 >= 0.92 on standard benchmarks (SIFT1M, GIST1M)
    - Add an A/B testing framework for condensed vs full graph
    - Default to conservative 10x compression
- Memory Safety (Rust):
  - Risk: expansion cache causes use-after-free or data races
  - Mitigation:
    - Use Arc<RwLock<...>> for shared ownership
    - Fuzz testing with ThreadSanitizer
    - Property-based testing with proptest
- Serialization Format Compatibility:
  - Risk: the .cgraph format breaks existing index loading
  - Mitigation:
    - Separate file extension (.cgraph vs .hnsw)
    - Version magic number in the header
    - Fallback to the full graph if condensation fails
- Node.js Bindings Performance:
  - Risk: condensation adds latency to the JavaScript API
  - Mitigation:
    - Make condensation opt-in (separate method)
    - Async/non-blocking condensation API
    - Progress callbacks to avoid blocking the event loop
Test Cases to Prevent Regressions
// Test 1: Search quality preservation
#[test]
fn test_condensed_search_recall() {
    let full_index = build_test_index(10000);
    let condensed = full_index.condense(CondensationConfig::default_50x()).unwrap();
    let test_queries = generate_test_queries(100);
    for query in test_queries {
        let full_results = full_index.search(&query, 10);
        let condensed_results = condensed.search(&query, 10);
        let recall = compute_recall(&full_results, &condensed_results);
        assert!(recall >= 0.92, "Recall dropped below 92%: {}", recall);
    }
}
// Test 2: Memory reduction
#[test]
fn test_memory_footprint() {
    let full_index = build_test_index(100000);
    let condensed = full_index.condense(CondensationConfig::default_50x()).unwrap();
    let full_size = full_index.memory_usage();
    let condensed_size = condensed.memory_usage();
    let reduction = full_size as f32 / condensed_size as f32;
    assert!(reduction >= 40.0, "Memory reduction below 40x: {}", reduction);
}
// Test 3: Serialization round-trip
#[test]
fn test_condensed_serialization() {
    let original = build_test_index(1000).condense(CondensationConfig::default_50x()).unwrap();
    let path = "/tmp/test.cgraph";
    original.save_condensed(Path::new(path)).unwrap();
    let loaded = CondensedGraph::load_condensed(Path::new(path)).unwrap();
    assert_eq!(original.synthetic_nodes.len(), loaded.synthetic_nodes.len());
    assert_eq!(original.compression_ratio, loaded.compression_ratio);
}
// Test 4: Hybrid search correctness
#[test]
fn test_hybrid_search_equivalence() {
    let full_index = build_test_index(5000);
    let condensed = full_index.condense(CondensationConfig::default_50x()).unwrap();
    // Wrap the index in Arc<RwLock> first so it can be shared with the store
    // and still be queried directly afterwards (avoids a use-after-move).
    let full_index = Arc::new(RwLock::new(full_index));
    let hybrid_store = HybridGraphStore::new(condensed, Some(Arc::clone(&full_index)));
    let query = generate_random_query();
    // With ExpansionPolicy::Always, hybrid should match the full graph
    let hybrid_results = hybrid_store.search_hybrid(&query, 10, ExpansionPolicy::Always).unwrap();
    let full_results = full_index.read().unwrap().search(&query, 10);
    assert_eq!(hybrid_results, full_results);
}
// Test 5: Concurrent expansion safety
#[test]
fn test_concurrent_expansion() {
    let hybrid_store = Arc::new(RwLock::new(build_hybrid_store()));
    let handles: Vec<_> = (0..10).map(|_| {
        let store = Arc::clone(&hybrid_store);
        thread::spawn(move || {
            let query = generate_random_query();
            let results = store.write().unwrap().search_hybrid(
                &query, 10, ExpansionPolicy::OnDemand { cache_size: 100 },
            );
            assert!(results.is_ok());
        })
    }).collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
Backward Compatibility Strategy
- API Level:
  - Keep the existing HnswIndex::search() unchanged
  - Add a new HnswIndex::condense() method (opt-in)
  - Condensed search via a separate HybridGraphStore type
- File Format:
  - Condensed graphs use the .cgraph extension
  - Original .hnsw format unchanged
  - Metadata includes version + compression ratio
- Node.js Bindings:
  - Add index.condense(config) method (returns a new CondensedIndex instance)
  - Keep index.search() behavior identical
  - Add condensedIndex.searchHybrid() for hybrid mode
- CLI:
  - ruvector build unchanged (builds the full graph)
  - New ruvector condense command (separate step)
  - Auto-detect .cgraph vs .hnsw on load
Implementation Phases
Phase 1: Core Implementation (Weeks 1-3)
Goals:
- Implement clustering algorithms (hierarchical, Louvain)
- Build basic synthetic node creation (centroid-based, no GNN)
- Implement condensed HNSW layer construction
- Basic serialization (.cgraph format)
Deliverables:
// Week 1: Clustering
crates/ruvector-gnn/src/condensation/clustering.rs
✓ hierarchical_cluster()
✓ louvain_cluster()
✓ spectral_cluster()
// Week 2: Synthetic nodes + edges
crates/ruvector-gnn/src/condensation/synthetic_node.rs
✓ create_synthetic_nodes() // centroid-based
✓ build_condensed_edges()
// Week 3: Condensed graph + serialization
crates/ruvector-gnn/src/condensation/mod.rs
✓ CondensedGraph::from_hnsw()
✓ save_condensed() / load_condensed()
Success Criteria:
- Can condense 100K vector index to 2K synthetic nodes
- Serialization round-trip preserves graph structure
- Unit tests pass for clustering algorithms
Phase 2: Integration (Weeks 4-6)
Goals:
- Integrate with the HnswIndex API
- Implement hybrid search with basic expansion policy
- Node.js bindings
Deliverables:
// Week 4: HNSW integration
crates/ruvector-gnn/src/hnsw/index.rs
✓ impl GraphCondensation for HnswIndex
// Week 5: GNN training
crates/ruvector-gnn/src/condensation/gnn_trainer.rs
✓ train_synthetic_embeddings()
✓ structure_preservation_loss()
// Week 6: Hybrid search
crates/ruvector-core/src/hybrid/
✓ HybridGraphStore::search_hybrid()
✓ ExpansionPolicy::OnDemand
Success Criteria:
- Recall@10 >= 0.90 on SIFT1M benchmark
- GNN training converges in <100 epochs
- Hybrid search passes correctness tests
Phase 3: Optimization (Weeks 7-9)
Goals:
- Performance tuning (SIMD, caching)
- Adaptive expansion policy (query frequency tracking)
- Distributed condensation for federated learning
- CLI tool for offline condensation
Deliverables:
// Week 7: Performance optimization
crates/ruvector-gnn/src/condensation/
✓ SIMD-optimized centroid computation
✓ Parallel clustering (rayon)
// Week 8: Adaptive expansion
crates/ruvector-core/src/hybrid/
✓ ExpansionPolicy::Adaptive
✓ Query frequency tracking
✓ LRU cache tuning
// Week 9: CLI + distributed
crates/ruvector-cli/src/commands/condense.rs
✓ ruvector condense --ratio 50
crates/ruvector-distributed/src/sync.rs
✓ Condensed graph synchronization
Success Criteria:
- Condensation time <10s for 1M vectors
- Adaptive expansion improves latency by 20%+
- CLI can condense production-scale graphs
Phase 4: Production Hardening (Weeks 10-12)
Goals:
- Comprehensive testing (property-based, fuzz, benchmarks)
- Documentation + examples
- Performance regression suite
- Multi-platform validation
Deliverables:
// Week 10: Testing
tests/condensation/
✓ Property-based tests (proptest)
✓ Fuzz testing (cargo-fuzz)
✓ Regression test suite
// Week 11: Documentation
docs/
✓ Graph Condensation Guide (user-facing)
✓ API documentation (rustdoc)
✓ Examples (edge device deployment)
// Week 12: Benchmarks + validation
benches/condensation.rs
✓ Condensation time benchmarks
✓ Search quality benchmarks
✓ Memory footprint benchmarks
Success Criteria:
- 100% code coverage for condensation module
- Passes all regression tests
- Documentation complete with 3+ examples
- Validated on ARM64, x86-64, WASM targets
Success Metrics
Performance Benchmarks
| Benchmark | Metric | Target | Measurement Method |
|---|---|---|---|
| Condensation Time | Time to condense 1M vectors | <10s | cargo bench condense_1m |
| Memory Reduction | Footprint ratio (full/condensed) | 50x | malloc_count |
| Search Latency (condensed only) | p99 latency | <2ms | criterion benchmark |
| Search Latency (hybrid, cold) | p99 latency on first query | <3ms | Cache miss scenario |
| Search Latency (hybrid, warm) | p99 latency after warmup | <1.5ms | Cache hit scenario |
| Expansion Time | Time to expand 1 cluster | <0.5ms | expand_regions() profiling |
Accuracy Metrics
| Dataset | Metric | Target | Baseline (Full Graph) |
|---|---|---|---|
| SIFT1M | Recall@10 (50x compression) | >=0.92 | 0.95 |
| SIFT1M | Recall@100 (50x compression) | >=0.90 | 0.94 |
| GIST1M | Recall@10 (50x compression) | >=0.90 | 0.93 |
| GloVe-200 | Recall@10 (100x compression) | >=0.85 | 0.92 |
| Custom high-dim (1536d) | Recall@10 (50x compression) | >=0.88 | 0.94 |
Memory/Latency Targets
| Configuration | Memory Footprint | Search Latency (p99) | Use Case |
|---|---|---|---|
| Full HNSW (1M vectors) | 4.8GB | 1.2ms | Server deployment |
| Condensed 50x (baseline) | 96MB | 1.5ms (cold), 1.2ms (warm) | Edge device |
| Condensed 100x (aggressive) | 48MB | 2.0ms (cold), 1.5ms (warm) | IoT device |
| Condensed 10x (conservative) | 480MB | 1.3ms (cold), 1.2ms (warm) | Embedded system |
| Hybrid (50x + on-demand) | 96MB + cache | 1.3ms (adaptive) | Mobile app |
Measurement Tools:
- Memory: massif (Valgrind), heaptrack, custom malloc_count
- Latency: criterion (Rust), perf (Linux profiling)
- Accuracy: custom recall calculator against ground truth
Quality Gates
All gates must pass before production release:
- Functional:
  - ✓ All unit tests pass (100% coverage for core logic)
  - ✓ Integration tests pass on 3+ datasets
  - ✓ Serialization round-trip is lossless
- Performance:
  - ✓ Memory reduction >= 40x (for the 50x target config)
  - ✓ Condensation time <= 15s for 1M vectors
  - ✓ Search latency penalty <= 30% (cold start)
- Accuracy:
  - ✓ Recall@10 >= 0.92 on SIFT1M (50x compression)
  - ✓ Recall@10 >= 0.85 on GIST1M (100x compression)
  - ✓ No catastrophic failures (recall < 0.5)
- Compatibility:
  - ✓ Works on Linux x86-64, ARM64, macOS
  - ✓ Node.js bindings pass all tests
  - ✓ Backward compatible with existing indexes
Risks and Mitigations
Technical Risks
Risk 1: GNN Training Instability
Description: Synthetic node embeddings may not converge during GNN training, leading to poor structure preservation.
Probability: Medium (30%)
Impact: High (blocks Phase 2)
Mitigation:
- Fallback: Start with centroid-only embeddings (no GNN) in Phase 1
- Hyperparameter Tuning: Grid search over learning rates (1e-4 to 1e-2)
- Loss Function Design: Add regularization term to prevent mode collapse
- Early Stopping: Monitor validation recall and stop if plateauing
- Alternative: Use pre-trained graph embeddings (Node2Vec, GraphSAGE) if GNN fails
Contingency Plan: If GNN training is unstable after 2 weeks of tuning, fall back to attention-weighted centroids (use existing attention mechanisms from Tier 1).
Risk 2: Cold Start Latency Regression
Description: Condensed graph search may be slower than expected due to poor synthetic node placement.
Probability: Medium (40%)
Impact: Medium (user-facing latency)
Mitigation:
- Profiling: use perf to identify bottlenecks (likely distance computations)
- SIMD Optimization: vectorize distance calculations for synthetic nodes
- Caching: precompute pairwise distances between synthetic nodes
- Pruning: reduce condensed graph connectivity (fewer edges per node)
- Hybrid Strategy: always expand the top-3 synthetic nodes to reduce uncertainty
Contingency Plan: If cold start latency exceeds 2x full graph, add "warm cache" mode that preloads frequently accessed clusters based on query distribution.
Risk 3: Memory Overhead from Expansion Cache
Description: LRU cache for expanded regions may consume more memory than expected, negating compression benefits.
Probability: Low (20%)
Impact: Medium (defeats purpose on edge devices)
Mitigation:
- Adaptive Cache Size: Dynamically adjust cache size based on available memory
- Partial Expansion: Only expand k-nearest neighbors within cluster (not full cluster)
- Compression: Store expanded regions in quantized format (int8 instead of float32)
- Eviction Policy: Evict based on access frequency + recency (LFU + LRU hybrid)
Contingency Plan: If cache overhead exceeds 20% of condensed graph size, make expansion fully on-demand (no caching) and optimize expansion from disk (mmap).
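The quantization mitigation above can be sketched as symmetric scalar int8 quantization of a region's f32 embeddings (4x smaller than float32; the scale convention and function names are illustrative, not an existing API):

```rust
// Symmetric per-vector int8 quantization: map [-max_abs, max_abs] to [-127, 127].
// Returns the quantized codes plus the scale needed to reconstruct values.
fn quantize_i8(v: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    (v.iter().map(|x| (x / scale).round() as i8).collect(), scale)
}

// Dequantize back to f32 for distance computation after expansion.
fn dequantize_i8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&x| x as f32 * scale).collect()
}
```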
Risk 4: Clustering Quality for High-Dimensional Data
Description: Hierarchical clustering may produce imbalanced clusters in high-dimensional spaces (curse of dimensionality).
Probability: High (60%)
Impact: Medium (poor compression or accuracy)
Mitigation:
- Dimensionality Reduction: Apply PCA or UMAP before clustering
- Alternative Algorithms: Try spectral clustering or Louvain (graph-based, not distance-based)
- Cluster Validation: Measure silhouette score and reject poor clusterings
- Adaptive Compression: Use variable compression ratios per region (dense regions = higher compression)
Contingency Plan: If clustering quality is poor (silhouette score < 0.3), switch to graph-based Louvain clustering using HNSW edges as adjacency matrix.
Risk 5: Serialization Format Bloat
Description: The .cgraph format may be larger than expected due to storing expansion maps and GNN weights.
Probability: Medium (35%)
Impact: Low (reduces compression benefits)
Mitigation:
- Sparse Storage: use sparse matrix formats (CSR) for expansion maps
- Quantization: store GNN embeddings in int8 (4x smaller than float32)
- Compression: apply zstd compression to the .cgraph file
- Lazy Loading: only load the expansion map on demand (not upfront)
Contingency Plan: if the .cgraph file exceeds 50% of the condensed graph target size, remove GNN weights from the serialization and recompute them on load (trading disk space for CPU time).
Operational Risks
Risk 6: User Confusion with Hybrid API
Description: Users may not understand when to use condensed vs full vs hybrid graphs.
Probability: High (70%)
Impact: Low (documentation issue)
Mitigation:
- Clear Documentation: Add decision tree (edge device → condensed, server → full, mobile → hybrid)
- Smart Defaults: Auto-detect environment (check available memory) and choose policy
- Examples: Provide 3 reference implementations (edge, mobile, server)
- Validation: add a validate_condensed() method that warns if recall is too low
Risk 7: Debugging Difficulty
Description: When condensed search returns wrong results, debugging is harder (no direct mapping to original nodes).
Probability: Medium (50%)
Impact: Medium (developer experience)
Mitigation:
- Logging: add verbose logging for expansion decisions
- Visualization: provide a tool to visualize the condensed graph and its clusters
- Explain API: add an explain_search() method that shows which clusters were searched
- Metrics: expose per-cluster recall metrics
Appendix: Related Research
This design is based on:
- Graph Condensation for GNNs (Jin et al., 2021): Core SFGC algorithm
- Structure-Preserving Graph Coarsening (Loukas, 2019): Topological invariants
- Hierarchical Navigable Small Worlds (Malkov & Yashunin, 2018): HNSW baseline
- Federated Graph Learning (Wu et al., 2022): Distributed graph synchronization
Key differences from prior work:
- Novel: GNN-based synthetic node learning (prior work used simple centroids)
- Novel: Hybrid search with adaptive expansion (prior work only used condensed graph)
- Engineering: Production-ready Rust implementation with SIMD optimization