dearsky/wifi-densepose

ruv cd5943df23 Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

2026-02-28 14:39:40 -05:00

Federated RVF Transfer Learning -- GOAP Implementation Plan

ADR: ADR-057 Date: 2026-02-26 Methodology: Goal-Oriented Action Planning (GOAP)

1. World State Assessment

1.1 Current State (What Is True Now)

State Variable	Value	Evidence
`rvf_segment_types_defined`	25 types (0x00-0x32)	`crates/rvf/rvf-types/src/segment_type.rs`
`transfer_prior_segment_exists`	true (0x30)	`SegmentType::TransferPrior`
`policy_kernel_segment_exists`	true (0x31)	`SegmentType::PolicyKernel`
`cost_curve_segment_exists`	true (0x32)	`SegmentType::CostCurve`
`rvf_bridge_serialization_works`	true	`ruvector-domain-expansion/src/rvf_bridge.rs` -- 11 passing tests
`witness_chain_support`	true	`rvf-crypto/src/witness.rs` -- `create_witness_chain`, `verify_witness_chain`
`ed25519_signing_support`	true	`rvf-crypto/src/sign.rs` -- feature-gated `ed25519`
`shake256_hashing_support`	true	`rvf-crypto/src/hash.rs` -- `shake256_128`, `shake256_256`
`sona_federated_coordinator_exists`	true	`crates/sona/src/training/federated.rs` -- `FederatedCoordinator`, `EphemeralAgent`
`sona_agent_export_works`	true	`AgentExport`, `TrajectoryExport` with quality gating
`sona_lora_weights_accessible`	true	`SonaEngine::apply_micro_lora`, `MicroLoRA`, `BaseLoRA`
`sona_ewc_support`	true	`crates/sona/src/ewc.rs` -- `EwcPlusPlus`, `TaskFisher`
`domain_expansion_engine_exists`	true	`crates/ruvector-domain-expansion/` -- 3 domains, Meta-TS, population search
`domain_expansion_transfer_works`	true	`MetaThompsonEngine::init_domain_with_transfer` with sqrt dampening
`beta_params_merge_exists`	true	`BetaParams::merge()` in `transfer.rs`
`gcloud_example_exists`	true	`examples/google-cloud/` -- Cloud Run, axum server
`rvf_workspace_defined`	true	`crates/rvf/Cargo.toml` -- 25 workspace members
`no_std_types_core`	true	`rvf-types` is `no_std` by default
`pii_stripping_exists`	false	No PII detection or redaction crate
`differential_privacy_exists`	false	No DP primitives in codebase
`federation_protocol_exists`	false	No inter-user export/import protocol
`gcloud_pubsub_integration`	false	No Pub/Sub client code
`gcloud_gcs_integration`	false	No GCS object store client code
`gcloud_firestore_integration`	false	No Firestore client code
`federated_manifest_segment`	false	No 0x33 segment type
`diff_privacy_proof_segment`	false	No 0x34 segment type
`redaction_log_segment`	false	No 0x35 segment type
`aggregate_weights_segment`	false	No 0x36 segment type
`wasm_export_path`	false	No browser-side federation
`aggregation_server`	false	No multi-user aggregation service
`mcp_federation_server`	false	No MCP server for AI agent access
`rest_api_server`	false	No REST API server for programmatic access

1.2 Goal State (What Should Be True)

State Variable	Required Value
`federated_manifest_segment`	true -- 0x33 defined and wire-coded
`diff_privacy_proof_segment`	true -- 0x34 defined and wire-coded
`redaction_log_segment`	true -- 0x35 defined and wire-coded
`aggregate_weights_segment`	true -- 0x36 defined and wire-coded
`pii_stripping_exists`	true -- `rvf-pii-strip` crate with detection, redaction, attestation
`differential_privacy_exists`	true -- `rvf-diff-privacy` crate with Gaussian mechanism, RDP accountant
`federation_protocol_exists`	true -- `rvf-federation` crate with export builder, import validator, merger
`gcloud_pubsub_integration`	true -- `rvf-gcloud` with Pub/Sub publish/subscribe
`gcloud_gcs_integration`	true -- `rvf-gcloud` with GCS upload/download
`gcloud_firestore_integration`	true -- `rvf-gcloud` with Firestore registry
`aggregation_server`	true -- `rvf-fed-aggregate` with FedAvg, Byzantine tolerance
`wasm_export_path`	true -- `rvf-fed-wasm` with browser PII strip + export
`federation_adapter`	true -- `rvf-adapters/federation` connecting SONA + domain expansion
`mcp_federation_server`	true -- `mcp-federation` crate with 6 tools + 4 resources over JSON-RPC 2.0
`rest_api_server`	true -- `rvf-fed-server` crate with REST API, SSE events, Prometheus metrics
`all_tests_pass`	true
`feature_gated`	true -- all federation code is behind the `federation` feature flag

2. Action Inventory

Each action has: preconditions, effects, estimated cost (story points, 1-13), and dependencies.
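The action model above can be sketched as a small data structure. This is an illustrative sketch only: `WorldState`, `Action`, and `plan_order` are hypothetical names, the world state is simplified to boolean facts, and the ordering loop is a greedy forward pass rather than the A* search used for the plan in Section 3.

```rust
use std::collections::BTreeMap;

/// World state as named boolean facts (a simplification: the plan also
/// tracks non-boolean values such as segment counts).
type WorldState = BTreeMap<&'static str, bool>;

/// One GOAP action: preconditions, effects, and a cost in story points.
struct Action {
    id: &'static str,
    preconditions: Vec<(&'static str, bool)>,
    effects: Vec<(&'static str, bool)>,
    cost_sp: u32,
}

impl Action {
    /// An action is applicable when every precondition holds in `state`.
    /// Facts missing from the state are treated as false.
    fn is_applicable(&self, state: &WorldState) -> bool {
        self.preconditions
            .iter()
            .all(|(key, want)| state.get(key).copied().unwrap_or(false) == *want)
    }

    /// Apply the action's effects and return the successor state.
    fn apply(&self, state: &WorldState) -> WorldState {
        let mut next = state.clone();
        for (key, value) in &self.effects {
            next.insert(*key, *value);
        }
        next
    }
}

/// Greedy forward pass: repeatedly run the first applicable action that has
/// not run yet. Returns the execution order and the total cost.
fn plan_order(actions: &[Action], start: &WorldState) -> (Vec<&'static str>, u32) {
    let mut state = start.clone();
    let mut done = vec![false; actions.len()];
    let mut order = Vec::new();
    let mut total = 0;
    loop {
        let next = actions
            .iter()
            .enumerate()
            .find(|(i, a)| !done[*i] && a.is_applicable(&state));
        match next {
            Some((i, a)) => {
                state = a.apply(&state);
                done[i] = true;
                order.push(a.id);
                total += a.cost_sp;
            }
            None => return (order, total),
        }
    }
}
```

Feeding it Actions 0.1 and 0.2 in reverse order still yields 0.1 before 0.2, because 0.2 only becomes applicable once `federated_manifest_segment` is true.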

Phase 0: Foundation -- Segment Types and Core Types

Action 0.1: Add Federation Segment Types to rvf-types

Preconditions: rvf_segment_types_defined == true
Effects: federated_manifest_segment = true, diff_privacy_proof_segment = true, redaction_log_segment = true, aggregate_weights_segment = true
Cost: 3 SP
Dependencies: None
Files Modified:
- crates/rvf/rvf-types/src/segment_type.rs -- Add FederatedManifest = 0x33, DiffPrivacyProof = 0x34, RedactionLog = 0x35, AggregateWeights = 0x36
- crates/rvf/rvf-types/src/federation.rs -- New module with header structs (FederatedManifestHeader, DiffPrivacyProofHeader, RedactionLogHeader, AggregateWeightsHeader)
- crates/rvf/rvf-types/src/lib.rs -- Add pub mod federation; (feature-gated behind federation)
- Tests: round-trip for all 4 new segment types, discriminant values

Action 0.2: Add Federation Segment Wire Codecs to rvf-wire

Preconditions: federated_manifest_segment == true
Effects: federation_wire_codecs = true
Cost: 5 SP
Dependencies: [0.1]
Files Modified:
- crates/rvf/rvf-wire/src/federation_codec.rs -- New module: encode_federated_manifest, decode_federated_manifest, and equivalents for 0x34-0x36
- crates/rvf/rvf-wire/src/lib.rs -- Add pub mod federation_codec; (feature-gated)
- Tests: encode-decode round-trip for each new segment type, fuzz edge cases (truncated payloads, wrong magic)
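A minimal sketch of the encode/decode round-trip and the two edge cases named in the tests (truncated payload, wrong magic). The header fields, the 12-byte layout, and the `RVFM` magic are assumptions made for illustration; the real layout belongs to `rvf-types` and `rvf-wire`.

```rust
/// Hypothetical magic bytes; the real RVF framing is defined in rvf-wire.
const MAGIC: [u8; 4] = *b"RVFM";
/// Fixed size of this illustrative header: magic + 2 + 2 + 4 bytes.
const HEADER_LEN: usize = 12;

/// Illustrative subset of a FederatedManifest header.
#[derive(Debug, PartialEq)]
struct ManifestHeader {
    version: u16,
    segment_count: u16,
    /// Epsilon as fixed point (epsilon * 1e6) to keep the header integer-only.
    epsilon_micro: u32,
}

#[derive(Debug, PartialEq)]
enum DecodeError {
    Truncated,
    BadMagic,
}

/// Encode the header as little-endian bytes after the magic.
fn encode_manifest(h: &ManifestHeader) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN);
    out.extend_from_slice(&MAGIC);
    out.extend_from_slice(&h.version.to_le_bytes());
    out.extend_from_slice(&h.segment_count.to_le_bytes());
    out.extend_from_slice(&h.epsilon_micro.to_le_bytes());
    out
}

/// Decode the header, checking the length before touching any field.
fn decode_manifest(buf: &[u8]) -> Result<ManifestHeader, DecodeError> {
    if buf.len() < HEADER_LEN {
        return Err(DecodeError::Truncated);
    }
    if buf[..4] != MAGIC[..] {
        return Err(DecodeError::BadMagic);
    }
    Ok(ManifestHeader {
        version: u16::from_le_bytes([buf[4], buf[5]]),
        segment_count: u16::from_le_bytes([buf[6], buf[7]]),
        epsilon_micro: u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]),
    })
}
```

Checking the length first means a truncated payload is reported as `Truncated` and never causes an out-of-bounds index.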

Phase 1: PII Stripping

Action 1.1: Create rvf-pii-strip Crate

Preconditions: rvf_workspace_defined == true
Effects: pii_detection_exists = true, pii_stripping_exists = true
Cost: 8 SP
Dependencies: [0.1]
New Files:
- crates/rvf/rvf-pii-strip/Cargo.toml -- deps: rvf-types, regex (std feature), serde
- crates/rvf/rvf-pii-strip/src/lib.rs -- Module structure
- crates/rvf/rvf-pii-strip/src/detect.rs -- PiiDetector with regex patterns for paths, IPs, emails, API keys, usernames, env refs
- crates/rvf/rvf-pii-strip/src/redact.rs -- PiiRedactor with pseudonymization (deterministic per-export)
- crates/rvf/rvf-pii-strip/src/attest.rs -- RedactionAttestor generating RedactionLog segment payload
- crates/rvf/rvf-pii-strip/src/rules.rs -- RedactionRule config, RuleSet with default + custom rules
- crates/rvf/rvf-pii-strip/src/pipeline.rs -- StripPipeline::new(rules).detect(payload).redact().attest() fluent API
- Tests: detection accuracy for each PII type, pseudonym determinism, attest hash correctness, empty input, binary content (should pass through)

Action 1.2: Create rvf-pii-strip no_std Core

Preconditions: pii_detection_exists == true
Effects: pii_strip_nostd_core = true
Cost: 3 SP
Dependencies: [1.1]
Details: Extract regex-free pattern matching into a no_std core that also builds for WASM. It uses simple byte scanning for path separators, IPv4 octets, and sk- key prefixes. Full regex detection remains behind the std feature.
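The byte-scanning approach could look roughly like the sketch below, which uses only slice and integer operations and so needs no allocator or regex engine. The function names (`scan_octet`, `find_ipv4`, `find_sk_key`) are hypothetical, and the matching rules are an assumption about what "simple byte scanning" would cover.

```rust
/// Parse one to three ASCII digits starting at `pos` as an octet (0..=255).
/// Returns the index just past the octet, or None if there is no valid octet.
fn scan_octet(input: &[u8], pos: usize) -> Option<usize> {
    let mut value: u32 = 0;
    let mut end = pos;
    while end < input.len() && end - pos < 3 && input[end].is_ascii_digit() {
        value = value * 10 + u32::from(input[end] - b'0');
        end += 1;
    }
    if end == pos || value > 255 {
        None
    } else {
        Some(end)
    }
}

/// Find the first dotted-quad IPv4 literal and return its byte range.
fn find_ipv4(input: &[u8]) -> Option<(usize, usize)> {
    for start in 0..input.len() {
        // Do not start in the middle of a run of digits.
        if start > 0 && input[start - 1].is_ascii_digit() {
            continue;
        }
        let mut pos = start;
        let mut ok = true;
        for octet in 0..4 {
            match scan_octet(input, pos) {
                Some(end) => pos = end,
                None => {
                    ok = false;
                    break;
                }
            }
            // The first three octets must each be followed by a dot.
            if octet < 3 {
                if input.get(pos) == Some(&b'.') {
                    pos += 1;
                } else {
                    ok = false;
                    break;
                }
            }
        }
        // Reject matches that run straight into another digit.
        if ok && !input.get(pos).map_or(false, |b| b.is_ascii_digit()) {
            return Some((start, pos));
        }
    }
    None
}

/// Find the first `sk-` prefix followed by at least `min_len` alphanumerics.
fn find_sk_key(input: &[u8], min_len: usize) -> Option<usize> {
    (0..input.len().saturating_sub(2)).find(|&i| {
        &input[i..i + 3] == b"sk-"
            && input[i + 3..]
                .iter()
                .take_while(|b| b.is_ascii_alphanumeric())
                .count()
                >= min_len
    })
}
```

Unlike the `IPV4` regex in Section 4.2, this scanner range-checks each octet, so a string such as `v999.1.1.1` is not flagged.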

Phase 2: Differential Privacy

Action 2.1: Create rvf-diff-privacy Crate

Preconditions: rvf_workspace_defined == true, diff_privacy_proof_segment == true
Effects: differential_privacy_exists = true
Cost: 8 SP
Dependencies: [0.1]
New Files:
- crates/rvf/rvf-diff-privacy/Cargo.toml -- deps: rvf-types, rand, rand_distr (used for Normal in Section 4.3), serde
- crates/rvf/rvf-diff-privacy/src/lib.rs -- Module structure
- crates/rvf/rvf-diff-privacy/src/mechanism.rs -- GaussianMechanism, LaplaceMechanism, ExponentialMechanism with calibrated noise
- crates/rvf/rvf-diff-privacy/src/clipping.rs -- GradientClipper with L2 norm clipping, per-parameter and global
- crates/rvf/rvf-diff-privacy/src/accountant.rs -- PrivacyAccountant using Renyi Differential Privacy (RDP) composition
- crates/rvf/rvf-diff-privacy/src/budget.rs -- PrivacyBudget tracking cumulative epsilon/delta spend per contributor
- crates/rvf/rvf-diff-privacy/src/proof.rs -- DiffPrivacyProofBuilder generating 0x34 segment payload
- crates/rvf/rvf-diff-privacy/src/config.rs -- DiffPrivacyConfig { epsilon, delta, clipping_norm, noise_multiplier, mechanism }
- Tests: noise calibration matches theoretical bounds, RDP composition is monotonically increasing, budget tracking, proof generation
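Global L2 clipping, as listed for `GradientClipper`, can be sketched as follows. Clipping is what bounds the sensitivity that the Gaussian mechanism is calibrated against. The free functions `l2_norm` and `clip_l2` are hypothetical stand-ins for whatever methods the crate ends up exposing.

```rust
/// L2 norm of a parameter vector, accumulated in f64 for stability.
fn l2_norm(params: &[f32]) -> f64 {
    params
        .iter()
        .map(|&p| f64::from(p) * f64::from(p))
        .sum::<f64>()
        .sqrt()
}

/// Scale `params` in place so that its L2 norm is at most `clip_norm`.
/// Vectors already inside the bound are left untouched.
/// Returns the norm measured before clipping.
fn clip_l2(params: &mut [f32], clip_norm: f64) -> f64 {
    let norm = l2_norm(params);
    if norm > clip_norm && norm > 0.0 {
        let scale = (clip_norm / norm) as f32;
        for p in params.iter_mut() {
            *p *= scale;
        }
    }
    norm
}
```

For example, clipping `[3.0, 4.0]` (norm 5) to a bound of 1.0 rescales it to roughly `[0.6, 0.8]`. The bound passed as `clip_norm` is then the `sensitivity` argument for noise calibration.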

Action 2.2: Create rvf-diff-privacy no_std Core

Preconditions: differential_privacy_exists == true
Effects: diff_privacy_nostd_core = true
Cost: 3 SP
Dependencies: [2.1]
Details: Core noise generation and clipping move to no_std (rand supports no_std when its default features are disabled). The RDP accountant needs f64 math (ln, sqrt), which is available in no_std through libm.

Phase 3: Federation Protocol

Action 3.1: Create rvf-federation Crate

Preconditions: federation_wire_codecs == true, pii_detection_exists == true, differential_privacy_exists == true
Effects: federation_protocol_exists = true
Cost: 13 SP
Dependencies: [0.2, 1.1, 2.1]
New Files:
- crates/rvf/rvf-federation/Cargo.toml -- deps: rvf-types, rvf-wire, rvf-crypto, rvf-pii-strip, rvf-diff-privacy, serde, serde_json
- crates/rvf/rvf-federation/src/lib.rs -- Module structure
- crates/rvf/rvf-federation/src/export.rs -- ExportBuilder:
  - .add_transfer_prior(prior) -- adds 0x30 segment
  - .add_policy_kernel(kernel) -- adds 0x31 segment
  - .add_cost_curve(curve) -- adds 0x32 segment
  - .add_sona_weights(weights) -- adds 0x36 segment
  - .set_contributor(pseudonym) -- sets contributor ID
  - .set_privacy_config(config) -- sets epsilon/delta
  - .set_pii_rules(rules) -- sets redaction rules
  - .build() -- runs PII strip pipeline, noise injection, generates manifest + redaction log + proof + witness + signature, returns Vec<u8>
- crates/rvf/rvf-federation/src/import.rs -- ImportValidator:
  - .validate(data: &[u8]) -- parses segments, verifies signature, witness chain, privacy proof, redaction log
  - .extract_priors(), .extract_kernels(), .extract_curves(), .extract_weights()
  - Returns ValidatedImport with all segments + metadata
- crates/rvf/rvf-federation/src/merge.rs -- VersionMerger:
  - .merge_transfer_prior(local, remote, weight) -- version-aware Beta parameter merging with dampening
  - .merge_policy_kernel(local_population, remote_kernel) -- inject remote kernel into population
  - .merge_sona_weights(local, remote, ewc_fisher) -- weighted average with EWC regularization
  - .merge_cost_curve(local_scoreboard, remote_curve) -- add as reference curve
- crates/rvf/rvf-federation/src/policy.rs -- FederationPolicy:
  - Allowlist/denylist for segment types
  - Quality gate threshold
  - Minimum evidence threshold
  - Rate limit configuration
  - Privacy budget limit
- crates/rvf/rvf-federation/src/manifest.rs -- FederatedManifestBuilder for 0x33 segment
- crates/rvf/rvf-federation/src/version.rs -- Version compatibility checking and negotiation
- Tests: full export/import round-trip, merge correctness, policy enforcement, version compatibility, signature verification, error cases
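The `FederationPolicy` checks listed above might be enforced along these lines. This is a sketch under assumptions: the field names, the `ExportCandidate` type, and the rule that an empty allowlist permits every non-denied segment are all hypothetical, and rate limiting is left out.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum PolicyViolation {
    SegmentNotAllowed(u8),
    QualityTooLow,
    InsufficientEvidence,
    PrivacyBudgetExceeded,
}

/// Hypothetical policy shape; segment types are raw discriminants (0x30..).
struct FederationPolicy {
    /// Empty means "allow anything that is not denied".
    allowlist: HashSet<u8>,
    denylist: HashSet<u8>,
    min_quality: f32,
    min_evidence: u64,
    max_epsilon: f64,
}

/// Summary of an export that is about to be built (illustrative).
struct ExportCandidate {
    segment_types: Vec<u8>,
    quality: f32,
    evidence: u64,
    /// Cumulative epsilon the contributor would have spent after this export.
    cumulative_epsilon: f64,
}

impl FederationPolicy {
    /// Return the first violated rule, or Ok(()) if the export may proceed.
    fn check(&self, c: &ExportCandidate) -> Result<(), PolicyViolation> {
        for &seg in &c.segment_types {
            let allowed = self.allowlist.is_empty() || self.allowlist.contains(&seg);
            if !allowed || self.denylist.contains(&seg) {
                return Err(PolicyViolation::SegmentNotAllowed(seg));
            }
        }
        if c.quality < self.min_quality {
            return Err(PolicyViolation::QualityTooLow);
        }
        if c.evidence < self.min_evidence {
            return Err(PolicyViolation::InsufficientEvidence);
        }
        if c.cumulative_epsilon > self.max_epsilon {
            return Err(PolicyViolation::PrivacyBudgetExceeded);
        }
        Ok(())
    }
}
```

Running the policy check before the PII and noise stages would let `ExportBuilder::build()` reject an export cheaply, though where the check sits in the pipeline is a design choice this plan does not fix.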

Action 3.2: Create rvf-adapters/federation

Preconditions: federation_protocol_exists == true, sona_federated_coordinator_exists == true, domain_expansion_engine_exists == true
Effects: federation_adapter = true
Cost: 8 SP
Dependencies: [3.1]
New Files:
- crates/rvf/rvf-adapters/federation/Cargo.toml -- deps: rvf-federation, sona, ruvector-domain-expansion, rvf-adapter-sona
- crates/rvf/rvf-adapters/federation/src/lib.rs -- Module structure
- crates/rvf/rvf-adapters/federation/src/export_coordinator.rs -- FederationExportCoordinator:
  - Takes &SonaEngine and &DomainExpansionEngine
  - Extracts TransferPrior from MetaThompsonEngine
  - Extracts best PolicyKernel from PopulationSearch
  - Extracts CostCurve from AccelerationScoreboard
  - Extracts SONA LoRA weights for AggregateWeights
  - Applies quality gate and minimum evidence filter
  - Passes to rvf-federation::ExportBuilder
- crates/rvf/rvf-adapters/federation/src/import_coordinator.rs -- FederationImportCoordinator:
  - Takes &mut SonaEngine and &mut DomainExpansionEngine
  - Uses rvf-federation::ImportValidator to validate
  - Uses rvf-federation::VersionMerger to merge
  - Updates local MetaThompsonEngine with merged priors
  - Injects kernels into PopulationSearch
  - Merges SONA weights with EWC protection
- Tests: end-to-end export from real engines, import into fresh engines, verify acceleration after import

Phase 4: Google Cloud Integration

Action 4.1: Create rvf-gcloud Crate

Preconditions: federation_protocol_exists == true
Effects: gcloud_pubsub_integration = true, gcloud_gcs_integration = true, gcloud_firestore_integration = true
Cost: 13 SP
Dependencies: [3.1]
New Files:
- crates/rvf/rvf-gcloud/Cargo.toml -- deps: google-cloud-pubsub, google-cloud-storage, google-cloud-firestore (or gcloud-sdk), tokio, serde, rvf-federation
- crates/rvf/rvf-gcloud/src/lib.rs -- Module structure
- crates/rvf/rvf-gcloud/src/pubsub.rs -- FederationPubSub:
  - publish_export_notification(manifest) -- publish FederatedManifest header to topic
  - subscribe_federation_events(filter) -- subscribe with domain/version filter
  - acknowledge(message_id) -- ack after successful import
  - Topic/subscription management
- crates/rvf/rvf-gcloud/src/gcs.rs -- FederationStorage:
  - upload_export(domain, contributor, data) -- upload RVF to GCS with proper naming
  - download_export(path) -- download RVF from GCS
  - list_exports(domain, since) -- list available exports
  - delete_by_contributor(pseudonym) -- right-to-deletion support
  - Lifecycle policy configuration
- crates/rvf/rvf-gcloud/src/firestore.rs -- FederationRegistry:
  - register_manifest(manifest) -- store manifest metadata
  - get_contributor_reputation(pseudonym) -- read reputation score
  - update_reputation(pseudonym, delta) -- update reputation
  - get_privacy_budget(pseudonym) -- read remaining budget
  - record_budget_spend(pseudonym, epsilon) -- deduct from budget
  - list_manifests(domain, limit) -- query manifest history
- crates/rvf/rvf-gcloud/src/auth.rs -- IAM authentication and service account management
- crates/rvf/rvf-gcloud/src/config.rs -- GCloudConfig { project_id, region, bucket, topic, collection }
- Tests: mock-based tests for all GCloud operations (no real GCloud calls in unit tests), integration test behind gcloud-integration feature flag

Action 4.2: Extend Google Cloud Example

Preconditions: gcloud_pubsub_integration == true
Effects: gcloud_example_updated = true
Cost: 5 SP
Dependencies: [4.1, 3.2]
Files Modified:
- examples/google-cloud/src/server.rs -- Add federation endpoints: POST /federation/submit, GET /federation/pull, POST /federation/aggregate, GET /federation/status
- examples/google-cloud/src/federation.rs -- New module: handler implementations using rvf-gcloud and rvf-federation
- examples/google-cloud/Cargo.toml -- Add rvf-federation, rvf-gcloud, rvf-adapters/federation deps
- examples/google-cloud/cloudrun.yaml -- Add environment variables for federation config

Phase 5: Federated Aggregation

Action 5.1: Create rvf-fed-aggregate Crate

Preconditions: federation_protocol_exists == true, differential_privacy_exists == true
Effects: aggregation_server = true
Cost: 8 SP
Dependencies: [3.1, 2.1]
New Files:
- crates/rvf/rvf-fed-aggregate/Cargo.toml -- deps: rvf-federation, rvf-diff-privacy, rvf-types, serde, tokio
- crates/rvf/rvf-fed-aggregate/src/lib.rs -- Module structure
- crates/rvf/rvf-fed-aggregate/src/round.rs -- AggregationRound:
  - Round lifecycle: Open -> Collecting -> Aggregating -> Published
  - submit(validated_import) -- add contributor
  - is_ready() -- check if min_participants reached or timeout
  - aggregate() -- trigger aggregation
- crates/rvf/rvf-fed-aggregate/src/fedavg.rs -- FedAvgAggregator:
  - Weighted average of TransferPrior Beta parameters
  - Weighted average of PolicyKnobs numeric fields
  - Weighted average of SONA LoRA deltas
  - Weight = contributor_reputation * trajectory_count * quality_score
- crates/rvf/rvf-fed-aggregate/src/fedprox.rs -- FedProxAggregator:
  - FedAvg + proximal term mu/2 * ||w_k - w_global||^2
  - For heterogeneous contributor distributions
- crates/rvf/rvf-fed-aggregate/src/byzantine.rs -- ByzantineFilter:
  - IQR-based outlier detection on parameter vectors
  - Krum aggregation: select contributor closest to peers
  - Configurable tolerance threshold
- crates/rvf/rvf-fed-aggregate/src/reputation.rs -- ReputationManager:
  - Score = f(avg_quality, trajectory_count, age, acceptance_rate)
  - Decay over time
  - Penalty for rejected submissions
- Tests: FedAvg correctness (average of known inputs), Byzantine tolerance (inject outlier, verify exclusion), reputation scoring, round lifecycle
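As an illustration of the Krum step in `ByzantineFilter`, the sketch below scores each contribution by its summed squared distance to its n - f - 2 nearest peers and selects the lowest score, where f is the assumed number of Byzantine contributors. The function names are hypothetical. The `n >= 2f + 3` guard is the participant-count condition from the original Krum formulation, not something this plan specifies.

```rust
/// Squared Euclidean distance between two parameter vectors.
fn sq_dist(a: &[f32], b: &[f32]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = f64::from(*x) - f64::from(*y);
            d * d
        })
        .sum()
}

/// Krum selection: index of the update closest to its peers, tolerating up
/// to `f` Byzantine contributors. Returns None when there are too few updates.
fn krum_select(updates: &[Vec<f32>], f: usize) -> Option<usize> {
    let n = updates.len();
    if n < 2 * f + 3 {
        return None;
    }
    // Each candidate is scored against its n - f - 2 nearest neighbours.
    let k = n - f - 2;
    let mut best: Option<(usize, f64)> = None;
    for i in 0..n {
        let mut dists: Vec<f64> = (0..n)
            .filter(|&j| j != i)
            .map(|j| sq_dist(&updates[i], &updates[j]))
            .collect();
        dists.sort_by(|a, b| a.total_cmp(b));
        let score: f64 = dists.iter().take(k).sum();
        if best.map_or(true, |(_, s)| score < s) {
            best = Some((i, score));
        }
    }
    best.map(|(i, _)| i)
}
```

With four updates clustered near (1, 1) and one poisoned update at (50, -50), the poisoned vector's distances are so large that it cannot win, which is the property the "inject outlier, verify exclusion" test checks.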

Phase 5B: MCP and API Interfaces

Action 5B.1: Create mcp-federation Crate (MCP Server)

Preconditions: federation_protocol_exists == true, gcloud_pubsub_integration == true
Effects: mcp_federation_server = true
Cost: 8 SP
Dependencies: [3.1, 4.1]
New Files:
- crates/mcp-federation/Cargo.toml -- deps: rvf-federation, rvf-gcloud, rvf-pii-strip, rvf-diff-privacy, rvf-adapters/federation, serde, serde_json, tokio
- crates/mcp-federation/src/lib.rs -- Module structure, McpFederationServer
- crates/mcp-federation/src/server.rs -- JSON-RPC 2.0 stdio transport (same pattern as mcp-gate/src/server.rs):
  - McpFederationServer::new(config) -- initialize with federation config
  - McpFederationServer::run() -- main event loop: read stdin, dispatch, write stdout
  - Handles initialize, tools/list, tools/call, resources/list, resources/read
- crates/mcp-federation/src/tools.rs -- McpFederationTools:
  - federation_export -- extracts, strips PII, applies noise, signs, uploads
  - federation_import -- pulls, validates, merges into local engines
  - federation_status -- reads budget, recent activity, reputation
  - federation_search -- queries Firestore manifest registry
  - federation_budget -- reads privacy budget details
  - federation_aggregate -- triggers server-side aggregation round
- crates/mcp-federation/src/resources.rs -- McpFederationResources:
  - federation://domains -- list of federated domains with stats
  - federation://contributors -- pseudonymized contributor list + reputation
  - federation://rounds/{round_id} -- aggregation round details
  - federation://budget -- privacy budget for current contributor
- crates/mcp-federation/src/schemas.rs -- JSON Schema definitions for all tool inputs/outputs
- Tests: tool dispatch, resource resolution, schema validation, error handling

Action 5B.2: Create rvf-fed-server Crate (REST API)

Preconditions: federation_protocol_exists == true, gcloud_pubsub_integration == true, aggregation_server == true
Effects: rest_api_server = true
Cost: 8 SP
Dependencies: [3.1, 4.1, 5.1]
New Files:
- crates/rvf/rvf-fed-server/Cargo.toml -- deps: rvf-federation, rvf-gcloud, rvf-fed-aggregate, axum, tower, tower-http, tokio, serde, serde_json, tracing, metrics, metrics-exporter-prometheus
- crates/rvf/rvf-fed-server/src/lib.rs -- Module structure, FederationServer
- crates/rvf/rvf-fed-server/src/routes.rs -- axum Router:
  - POST /v1/exports -- accept RVF bytes, validate, store in GCS, publish event
  - GET /v1/exports/{id} -- download RVF export by ID
  - GET /v1/exports?domain=&since=&limit= -- list exports
  - DELETE /v1/exports/{id} -- contributor deletes own export
  - POST /v1/aggregates -- trigger aggregation round
  - GET /v1/aggregates/{round_id} -- round status
  - GET /v1/aggregates/latest?domain= -- latest aggregate RVF
  - GET /v1/domains -- list federated domains
  - GET /v1/contributors/{pseudonym} -- contributor profile
  - GET /v1/contributors/{pseudonym}/budget -- privacy budget
  - GET /v1/health -- health check
  - GET /v1/metrics -- Prometheus metrics
  - GET /v1/events?domain= -- SSE stream
- crates/rvf/rvf-fed-server/src/auth.rs -- Authentication middleware:
  - Bearer token validation (SHAKE-256 hash lookup in Firestore)
  - Ed25519 signature verification (X-Federation-Signature, X-Federation-PublicKey)
- crates/rvf/rvf-fed-server/src/rate_limit.rs -- Tower rate limiting middleware:
  - Per-contributor, per-endpoint configurable limits
  - Token bucket algorithm
- crates/rvf/rvf-fed-server/src/sse.rs -- Server-Sent Events:
  - new_export, aggregation_complete, import_available event types
  - Domain-filtered subscriptions
- crates/rvf/rvf-fed-server/src/metrics.rs -- Prometheus metrics registration and export
- Tests: route handler tests with mock backends, auth middleware, rate limiting, SSE stream
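The token bucket named for `rate_limit.rs` can be sketched independently of Tower. In this hypothetical version the caller passes the current time in seconds, which keeps the logic deterministic in tests. A real middleware would presumably keep one bucket per (contributor, endpoint) pair and read a monotonic clock.

```rust
/// A single token bucket: holds at most `capacity` tokens and regains
/// `refill_per_sec` tokens per second.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill_secs: f64,
}

impl TokenBucket {
    /// Start full, with the clock at zero.
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        TokenBucket {
            capacity,
            tokens: capacity,
            refill_per_sec,
            last_refill_secs: 0.0,
        }
    }

    /// Refill for the time elapsed since the last call, then try to take
    /// one token. Returns true if the request is allowed.
    fn try_acquire(&mut self, now_secs: f64) -> bool {
        // Ignore clock values that go backwards.
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill_secs = self.last_refill_secs.max(now_secs);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

A bucket with capacity 2 and a refill rate of 1 token per second allows a burst of two requests, rejects a third at the same instant, and allows one more a second later.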

Phase 6: WASM Export Path

Action 6.1: Create rvf-fed-wasm Crate

Preconditions: pii_strip_nostd_core == true, diff_privacy_nostd_core == true, federation_protocol_exists == true
Effects: wasm_export_path = true
Cost: 5 SP
Dependencies: [1.2, 2.2, 3.1]
New Files:
- crates/rvf/rvf-fed-wasm/Cargo.toml -- deps: rvf-types, rvf-wire, rvf-crypto, rvf-pii-strip (no_std), rvf-diff-privacy (no_std), wasm-bindgen, js-sys
- crates/rvf/rvf-fed-wasm/src/lib.rs -- wasm-bindgen exports:
  - FederationExporter::new(config) -- create exporter with epsilon/delta/rules
  - FederationExporter::add_transfer_prior(bytes) -- add prior segment
  - FederationExporter::add_policy_kernel(bytes) -- add kernel segment
  - FederationExporter::add_cost_curve(bytes) -- add curve segment
  - FederationExporter::build() -- strip PII, add noise, sign, return Uint8Array
- crates/rvf/rvf-fed-wasm/src/js_types.rs -- JavaScript-friendly type wrappers
- npm package config for @ruvector/rvf-fed-wasm
- Tests: build with wasm-pack test --headless --chrome

Phase 7: Integration and Testing

Action 7.1: Integration Tests

Preconditions: All previous actions complete
Effects: all_tests_pass = true
Cost: 8 SP
Dependencies: [All above]
New Files:
- crates/rvf/tests/rvf-integration/src/federation.rs -- Integration tests:
  - Full export/import round-trip with real SONA and DomainExpansion engines
  - PII stripping verification (inject known PII, verify redaction)
  - Differential privacy verification (noise bounds check)
  - Version compatibility matrix (v1 export, v1 import; future v2 considerations)
  - Byzantine tolerance verification (inject poisoned export, verify exclusion)
  - Privacy budget exhaustion (export until budget depleted, verify rejection)
  - Signature verification (tamper with segment, verify rejection)
  - Witness chain verification (reorder segments, verify rejection)
  - Federated averaging correctness (known inputs, verify output)
  - End-to-end acceleration test (import learning, verify faster convergence)
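The privacy budget exhaustion case could be exercised against a tracker shaped like the sketch below. `PrivacyBudget::try_spend` and `BudgetExhausted` are hypothetical names. The sketch also adds epsilon values directly (basic composition), whereas the plan's `PrivacyAccountant` uses RDP composition, which would normally report a smaller cumulative epsilon.

```rust
/// Error returned when a requested spend does not fit the remaining budget.
#[derive(Debug, PartialEq)]
struct BudgetExhausted {
    requested: f64,
    remaining: f64,
}

/// Per-contributor cumulative epsilon tracker (illustrative).
struct PrivacyBudget {
    limit: f64,
    spent: f64,
}

impl PrivacyBudget {
    fn new(limit: f64) -> Self {
        PrivacyBudget { limit, spent: 0.0 }
    }

    /// Epsilon still available, never negative.
    fn remaining(&self) -> f64 {
        (self.limit - self.spent).max(0.0)
    }

    /// Deduct `epsilon` if it fits; otherwise leave the budget unchanged
    /// and report how much was left. Returns the new remaining budget.
    fn try_spend(&mut self, epsilon: f64) -> Result<f64, BudgetExhausted> {
        let remaining = self.remaining();
        if epsilon > remaining {
            return Err(BudgetExhausted { requested: epsilon, remaining });
        }
        self.spent += epsilon;
        Ok(self.remaining())
    }
}
```

An integration test would call `try_spend` once per export until it returns `Err`, then assert that the export is rejected and the recorded spend is unchanged.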

Action 7.2: Update Workspace Configuration

Preconditions: All new crates created
Effects: feature_gated = true
Cost: 2 SP
Dependencies: [All new crate creations]
Files Modified:
- crates/rvf/Cargo.toml -- Add new members to workspace, add workspace dependencies
- Each existing crate that gains federation feature gates

Action 7.3: CLI Extension

Preconditions: federation_adapter == true, gcloud_pubsub_integration == true
Effects: cli_federation_commands = true
Cost: 5 SP
Dependencies: [3.2, 4.1]
Files Modified:
- crates/rvf/rvf-cli/ -- Add subcommands:
  - rvf federation export --domain <id> --epsilon <val> --output <path>
  - rvf federation import --input <path>
  - rvf federation subscribe --domains <ids> --gcloud-config <path>
  - rvf federation status -- show privacy budget, contribution history
  - rvf federation revoke --contributor <pseudonym> -- right-to-deletion

3. GOAP Plan: Optimal Action Sequence

Using A* search through the action dependency graph, the optimal implementation order is:

MILESTONE 1: FOUNDATION (Week 1-2)
===================================
Sprint 1 (Week 1):
  [0.1] Add Federation Segment Types        (3 SP)
  [0.2] Add Federation Wire Codecs          (5 SP)
                                      Total: 8 SP

Sprint 2 (Week 2):
  [1.1] Create rvf-pii-strip               (8 SP)
  [2.1] Create rvf-diff-privacy            (8 SP)  -- parallel with 1.1
                                      Total: 16 SP


MILESTONE 2: CORE PROTOCOL (Week 3-4)
======================================
Sprint 3 (Week 3):
  [1.2] PII Strip no_std Core              (3 SP)
  [2.2] Diff Privacy no_std Core           (3 SP)  -- parallel with 1.2
  [3.1] Create rvf-federation (start)      (8 SP of 13)
                                      Total: 14 SP

Sprint 4 (Week 4):
  [3.1] Create rvf-federation (complete)   (5 SP remaining)
  [3.2] Create rvf-adapters/federation     (8 SP)
                                      Total: 13 SP


MILESTONE 3: CLOUD + AGGREGATION (Week 5-6)
=============================================
Sprint 5 (Week 5):
  [4.1] Create rvf-gcloud                  (13 SP)
                                      Total: 13 SP

Sprint 6 (Week 6):
  [5.1] Create rvf-fed-aggregate           (8 SP)
  [4.2] Extend Google Cloud Example        (5 SP)  -- parallel with 5.1
                                      Total: 13 SP


MILESTONE 4: INTERFACES + WASM (Week 7-8)
==========================================
Sprint 7 (Week 7):
  [5B.1] Create mcp-federation (MCP)       (8 SP)
  [5B.2] Create rvf-fed-server (REST API)  (8 SP)  -- parallel with 5B.1
                                      Total: 16 SP

Sprint 8 (Week 8):
  [6.1] Create rvf-fed-wasm               (5 SP)
  [7.2] Update Workspace Configuration    (2 SP)
  [7.3] CLI Extension                      (5 SP)  -- parallel with 6.1
                                      Total: 12 SP


MILESTONE 5: INTEGRATION (Week 9)
==================================
Sprint 9 (Week 9):
  [7.1] Integration Tests                  (8 SP)
                                      Total: 8 SP


TOTAL: 113 SP across 9 weeks (5 milestones)

Dependency Graph (Topological Order)

[0.1] Segment Types
  |
  +---> [0.2] Wire Codecs
  |       |
  |       +---> [3.1] rvf-federation ──────────────────────┐
  |       |       |                                         |
  +---> [1.1] rvf-pii-strip ──> [1.2] no_std core ──> [6.1] rvf-fed-wasm
  |       |                                                 |
  +---> [2.1] rvf-diff-privacy ──> [2.2] no_std core ──────┘
          |                           |
          +---> [5.1] rvf-fed-aggregate ──> [5B.2] rvf-fed-server (REST API)
          |                                    |
          +---> [3.1] ──> [3.2] rvf-adapters/federation
          |                  |
          +---> [4.1] rvf-gcloud ──> [4.2] Example update
          |       |            |
          |       |            +---> [5B.1] mcp-federation (MCP Server)
          |       |            |
          +-------+---> [7.3] CLI Extension
                  |
                  +---> [7.2] Workspace Config
                  |
                  +---> [7.1] Integration Tests

Critical Path

[0.1] -> [0.2] -> [3.1] -> [3.2] -> [7.1]
                     ^
                     |
[1.1] -> [1.2] -----+
                     |
[2.1] -> [2.2] -----+

Interface crates (off critical path, parallel):
[3.1] + [4.1] -> [5B.1] mcp-federation
[3.1] + [4.1] + [5.1] -> [5B.2] rvf-fed-server

The critical path runs through the segment types [0.1], wire codecs [0.2], federation protocol [3.1], federation adapter [3.2], and integration tests [7.1]. PII stripping and differential privacy can proceed in parallel but must complete before rvf-federation begins its final integration. The MCP server and REST API crates are off the critical path: they depend on rvf-federation and rvf-gcloud (the REST API also on rvf-fed-aggregate) but can be built in parallel with the WASM and CLI work.

4. Detailed Implementation Notes

4.1 Segment Type Registration

In crates/rvf/rvf-types/src/segment_type.rs, add to the enum:

/// Federated learning manifest (contributor, privacy budget, segment list).
FederatedManifest = 0x33,
/// Differential privacy proof (epsilon, delta, mechanism, noise proof).
DiffPrivacyProof = 0x34,
/// PII redaction attestation (counts, hashes, rules fired).
RedactionLog = 0x35,
/// Federated-averaged weights (LoRA deltas, participation, convergence).
AggregateWeights = 0x36,

Add to TryFrom<u8>:

0x33 => Ok(Self::FederatedManifest),
0x34 => Ok(Self::DiffPrivacyProof),
0x35 => Ok(Self::RedactionLog),
0x36 => Ok(Self::AggregateWeights),
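A self-contained sketch of the round-trip and discriminant tests from Action 0.1. Only the seven segment types in the 0x30-0x36 range are reproduced here, and the error type (`u8`, carrying the unknown value) is an assumption; the real enum in `segment_type.rs` has more variants and its own error type.

```rust
use std::convert::TryFrom;

/// Illustrative subset of the segment type enum (0x30-0x36 only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum SegmentType {
    TransferPrior = 0x30,
    PolicyKernel = 0x31,
    CostCurve = 0x32,
    FederatedManifest = 0x33,
    DiffPrivacyProof = 0x34,
    RedactionLog = 0x35,
    AggregateWeights = 0x36,
}

impl TryFrom<u8> for SegmentType {
    /// Assumption: unknown discriminants are returned as the error value.
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0x30 => Ok(Self::TransferPrior),
            0x31 => Ok(Self::PolicyKernel),
            0x32 => Ok(Self::CostCurve),
            0x33 => Ok(Self::FederatedManifest),
            0x34 => Ok(Self::DiffPrivacyProof),
            0x35 => Ok(Self::RedactionLog),
            0x36 => Ok(Self::AggregateWeights),
            other => Err(other),
        }
    }
}

/// Round-trip helper: byte -> enum -> byte must be the identity.
fn round_trips(byte: u8) -> bool {
    match SegmentType::try_from(byte) {
        Ok(seg) => seg as u8 == byte,
        Err(_) => false,
    }
}
```

Because the enum is `#[repr(u8)]` with explicit discriminants, `seg as u8` gives back the wire value, so the test reduces to checking `round_trips` for 0x33 through 0x36 and confirming that 0x37 is rejected.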

4.2 PII Detection Patterns

Core regex patterns for rvf-pii-strip:

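// Filesystem paths, which tend to embed usernames and project names.
const PATH_UNIX: &str = r"(?:/(?:home|Users|tmp|var|etc|opt)/[^\s\x00-\x1f]+)";
const PATH_WINDOWS: &str = r"(?:[A-Za-z]:\\(?:Users|Windows|Program Files)[^\s\x00-\x1f]*)";
// Network addresses and emails. IPV4 does not range-check octets and IPV6
// does not match "::"-compressed forms.
const IPV4: &str = r"\b(?:\d{1,3}\.){3}\d{1,3}\b";
const IPV6: &str = r"\b(?:[0-9a-fA-F]{1,4}:){2,7}[0-9a-fA-F]{1,4}\b";
// Fixed: "[A-Z|a-z]" also matched a literal '|' inside the TLD.
const EMAIL: &str = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b";
// Credentials. The GitHub pattern covers the ghp_, gho_, ghu_, ghs_ and ghr_ prefixes.
const API_KEY_OPENAI: &str = r"\bsk-(?:proj-)?[A-Za-z0-9]{20,}\b";
const API_KEY_AWS: &str = r"\bAKIA[A-Z0-9]{16}\b";
const API_KEY_GITHUB: &str = r"\bgh[pousr]_[A-Za-z0-9]{36,}\b";
// Fixed: no trailing \b, which fails to match after '=' padding or '/'.
const BEARER_TOKEN: &str = r"\bBearer\s+[A-Za-z0-9\-._~+/]+=*";
// Environment variable references.
const ENV_VAR_UNIX: &str = r"\$(?:HOME|USER|PATH|SHELL|TMPDIR|HOSTNAME)\b";
const ENV_VAR_WIN: &str = r"%(?:USERPROFILE|USERNAME|COMPUTERNAME|TEMP|TMP)%";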

4.3 Gaussian Mechanism Calibration

For rvf-diff-privacy, the Gaussian mechanism adds noise:

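use rand::Rng;
use rand_distr::Normal;

/// Calibrate noise for (epsilon, delta)-differential privacy.
///
/// sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon
///
/// This is the classical Gaussian-mechanism bound, which holds for
/// 0 < epsilon < 1 and 0 < delta < 1.
pub fn calibrate_gaussian(sensitivity: f64, epsilon: f64, delta: f64) -> f64 {
    debug_assert!(epsilon > 0.0 && delta > 0.0 && delta < 1.0);
    sensitivity * (2.0 * (1.25 / delta).ln()).sqrt() / epsilon
}

/// Add calibrated Gaussian noise to a parameter vector.
///
/// `sensitivity` is the L2 sensitivity, i.e. the clipping norm applied
/// to `params` beforehand.
pub fn add_gaussian_noise(
    params: &mut [f32],
    sensitivity: f64,
    epsilon: f64,
    delta: f64,
    rng: &mut impl Rng,
) {
    let sigma = calibrate_gaussian(sensitivity, epsilon, delta) as f32;
    // Normal::new rejects an invalid sigma (for example NaN from an
    // out-of-range delta), so fail with a message that names the cause.
    let dist = Normal::new(0.0f32, sigma).expect("invalid sigma: check epsilon and delta");
    for p in params.iter_mut() {
        *p += rng.sample(dist);
    }
}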
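As a worked instance of the calibration formula: with sensitivity 1.0, epsilon 0.5, and delta 1e-5, ln(1.25 / delta) is about 11.736, so sigma is sqrt(2 * 11.736) / 0.5, roughly 9.69. The sketch below wraps the same formula in a hypothetical `checked_sigma` helper that returns `None` for parameters outside the range where the classical bound applies; the helper and its range check are illustrative additions, not part of the planned API.

```rust
/// Same formula as `calibrate_gaussian`, but returning None when the
/// parameters fall outside 0 < epsilon < 1, 0 < delta < 1, or when the
/// sensitivity is negative or not finite.
fn checked_sigma(sensitivity: f64, epsilon: f64, delta: f64) -> Option<f64> {
    let valid = sensitivity.is_finite()
        && sensitivity >= 0.0
        && epsilon > 0.0
        && epsilon < 1.0
        && delta > 0.0
        && delta < 1.0;
    if !valid {
        return None;
    }
    Some(sensitivity * (2.0 * (1.25 / delta).ln()).sqrt() / epsilon)
}
```

Halving epsilon doubles sigma, and shrinking delta raises it only logarithmically, which is why the clipping norm and epsilon dominate the accuracy cost in practice.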

4.4 Renyi Differential Privacy Accountant

/// RDP accountant for privacy budget tracking.
///
/// For the Gaussian mechanism with noise multiplier sigma:
///   RDP(alpha) = alpha / (2 * sigma^2)
///
/// Convert RDP to (epsilon, delta)-DP, minimized over the tracked alpha orders:
///   epsilon = RDP(alpha) + ln(1/delta) / (alpha - 1)
///
/// Composition: RDP values add across multiple queries.
pub struct RdpAccountant {
    /// Accumulated RDP values at each alpha order.
    rdp_values: Vec<(f64, f64)>,  // (alpha, accumulated_rdp)
    /// Alpha orders to track.
    alpha_orders: Vec<f64>,
}

impl RdpAccountant {
    pub fn new() -> Self {
        let alpha_orders: Vec<f64> = (2..=256).map(|a| a as f64).collect();
        let rdp_values = alpha_orders.iter().map(|&a| (a, 0.0)).collect();
        Self { rdp_values, alpha_orders }
    }

    pub fn add_gaussian_query(&mut self, sigma: f64) {
        for (alpha, rdp) in self.rdp_values.iter_mut() {
            *rdp += *alpha / (2.0 * sigma * sigma);
        }
    }

    pub fn get_epsilon(&self, delta: f64) -> f64 {
        self.rdp_values.iter()
            .map(|(alpha, rdp)| rdp + (1.0 / delta).ln() / (alpha - 1.0))
            .fold(f64::INFINITY, f64::min)
    }
}
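To show how composition behaves, the sketch below computes the epsilon reported after a number of identical Gaussian queries. It uses the same conversion as `get_epsilon` above (accumulated RDP plus ln(1/delta) / (alpha - 1), minimized over integer alpha from 2 to 256). `epsilon_after` is a hypothetical standalone helper, not part of `RdpAccountant`.

```rust
/// Epsilon after `queries` Gaussian releases with noise multiplier `sigma`.
/// RDP values add across queries, so the accumulated RDP at order alpha is
/// queries * alpha / (2 * sigma^2).
fn epsilon_after(queries: u32, sigma: f64, delta: f64) -> f64 {
    (2..=256u32)
        .map(|a| {
            let alpha = f64::from(a);
            let rdp = f64::from(queries) * alpha / (2.0 * sigma * sigma);
            rdp + (1.0 / delta).ln() / (alpha - 1.0)
        })
        .fold(f64::INFINITY, f64::min)
}
```

With sigma = 4.0 and delta = 1e-5, one query costs an epsilon of about 1.23 and ten queries about 4.1, well below the 12.3 that adding per-query epsilons would give. The value still grows with every query, which is the monotonicity property listed in the Action 2.1 tests.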

4.5 Version-Aware Prior Merging

/// Merge a remote TransferPrior into a local one.
///
/// Uses evidence-weighted blending: the source with more training cycles
/// gets higher weight. Then applies sqrt-dampening to prevent over-confidence.
pub fn merge_transfer_priors(
    local: &TransferPrior,
    remote: &TransferPrior,
) -> TransferPrior {
    let total_cycles = local.training_cycles + remote.training_cycles;
    let remote_weight = if total_cycles > 0 {
        remote.training_cycles as f32 / total_cycles as f32
    } else {
        0.5
    };
    let local_weight = 1.0 - remote_weight;

    let mut merged = TransferPrior::uniform(local.source_domain.clone());
    merged.training_cycles = total_cycles;

    // Collect all bucket/arm combinations from both
    let all_buckets: HashSet<_> = local.bucket_priors.keys()
        .chain(remote.bucket_priors.keys())
        .collect();

    for bucket in all_buckets {
        let local_arms = local.bucket_priors.get(bucket);
        let remote_arms = remote.bucket_priors.get(bucket);

        let all_arms: HashSet<_> = local_arms.iter()
            .flat_map(|m| m.keys())
            .chain(remote_arms.iter().flat_map(|m| m.keys()))
            .collect();

        // A named uniform prior outlives the lookups below; borrowing a
        // temporary inside `unwrap_or(&...)` would not compile.
        let uniform = BetaParams::uniform();
        let mut merged_arms = HashMap::new();
        for arm in all_arms {
            let l = local_arms
                .and_then(|m| m.get(arm))
                .unwrap_or(&uniform);
            let r = remote_arms
                .and_then(|m| m.get(arm))
                .unwrap_or(&uniform);

            // Weighted blend
            let alpha = l.alpha * local_weight + r.alpha * remote_weight;
            let beta = l.beta * local_weight + r.beta * remote_weight;

            // Sqrt-dampening (same as MetaThompsonEngine::init_domain_with_transfer).
            // Clamp at zero so parameters below 1.0 cannot produce NaN.
            let dampened = BetaParams {
                alpha: 1.0 + (alpha - 1.0).max(0.0).sqrt(),
                beta: 1.0 + (beta - 1.0).max(0.0).sqrt(),
            };

            merged_arms.insert(arm.clone(), dampened);
        }
        merged.bucket_priors.insert(bucket.clone(), merged_arms);
    }

    merged
}

4.6 FedAvg Implementation

/// Federated averaging of TransferPriors.
///
/// weight_k = reputation_k * trajectory_count_k * avg_quality_k
/// w_avg = sum(weight_k * prior_k) / sum(weight_k)
pub fn fedavg_priors(
    contributions: &[(TransferPrior, f32)],  // (prior, weight)
) -> TransferPrior {
    let total_weight: f32 = contributions.iter().map(|(_, w)| w).sum();
    if total_weight < 1e-10 || contributions.is_empty() {
        return TransferPrior::uniform(DomainId("aggregate".into()));
    }

    let mut result = TransferPrior::uniform(DomainId("aggregate".into()));

    // Collect all unique bucket/arm combinations
    let all_buckets: HashSet<_> = contributions.iter()
        .flat_map(|(p, _)| p.bucket_priors.keys())
        .collect();

    for bucket in &all_buckets {
        let all_arms: HashSet<_> = contributions.iter()
            .flat_map(|(p, _)| {
                p.bucket_priors.get(*bucket)
                    .map(|m| m.keys().collect::<Vec<_>>())
                    .unwrap_or_default()
            })
            .collect();

        let mut merged_arms = HashMap::new();
        for arm in &all_arms {
            let mut alpha_sum = 0.0;
            let mut beta_sum = 0.0;

            for (prior, weight) in contributions {
                let params = prior.get_prior(bucket, arm);
                let normalized_weight = weight / total_weight;
                alpha_sum += params.alpha * normalized_weight;
                beta_sum += params.beta * normalized_weight;
            }

            merged_arms.insert((*arm).clone(), BetaParams {
                alpha: alpha_sum,
                beta: beta_sum,
            });
        }
        result.bucket_priors.insert((*bucket).clone(), merged_arms);
    }

    result.training_cycles = contributions.iter()
        .map(|(p, _)| p.training_cycles)
        .sum();

    result
}
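
The doc comment above defines weight_k as reputation_k * trajectory_count_k * avg_quality_k, but `fedavg_priors` receives the weights already computed. The following is a minimal sketch of how a caller might derive them. `ContributionStats` and its field names are assumptions made for illustration, not types from the existing crates.

```rust
/// Hypothetical per-contributor statistics; field names are illustrative.
#[derive(Debug, Clone, Copy)]
pub struct ContributionStats {
    pub reputation: f32,
    pub trajectory_count: u32,
    pub avg_quality: f32,
}

/// weight_k = reputation_k * trajectory_count_k * avg_quality_k.
/// Non-finite or negative products are treated as zero weight.
pub fn contribution_weight(s: &ContributionStats) -> f32 {
    let w = s.reputation * s.trajectory_count as f32 * s.avg_quality;
    if w.is_finite() && w > 0.0 { w } else { 0.0 }
}

/// Weights normalized to sum to 1.0 (all zeros if no one has positive weight).
pub fn normalized_weights(stats: &[ContributionStats]) -> Vec<f32> {
    let raw: Vec<f32> = stats.iter().map(contribution_weight).collect();
    let total: f32 = raw.iter().sum();
    if total <= f32::EPSILON {
        return vec![0.0; raw.len()];
    }
    raw.iter().map(|w| w / total).collect()
}
```

Zeroing out invalid weights before normalization keeps a single malformed record from turning every averaged parameter into NaN.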

4.7 Byzantine-Tolerant Aggregation (Krum)

/// Krum aggregation: select the contribution closest to its peers.
///
/// For each contribution k, compute:
///   score_k = sum of distances to the (N - f - 1) nearest neighbors
/// where f = ceil(N/3) - 1 (Byzantine tolerance).
///
/// Select the contribution with the minimum score. Returns `None` for an
/// empty input.
///
/// Note: the original Krum rule (Blanchard et al., 2017) sums *squared*
/// distances to the N - f - 2 nearest neighbors and assumes N >= 2f + 3.
/// This variant keeps the neighbor count and distance chosen in this plan.
pub fn krum_select(
    contributions: &[(TransferPrior, f32)],
) -> Option<usize> {
    let n = contributions.len();
    if n == 0 { return None; }
    // Too few contributions for Byzantine tolerance: fall back to the first.
    if n < 4 { return Some(0); }

    let f = (n as f32 / 3.0).ceil() as usize - 1;
    let neighbors_to_check = n - f - 1;

    // Flatten each prior into a parameter vector for distance computation.
    // All vectors must share the same (bucket, arm) ordering.
    let vectors: Vec<Vec<f32>> = contributions.iter()
        .map(|(p, _)| flatten_prior(p))
        .collect();

    // Compute pairwise distances
    let mut scores = vec![0.0f32; n];
    for i in 0..n {
        let mut distances: Vec<f32> = (0..n)
            .filter(|&j| j != i)
            .map(|j| l2_distance(&vectors[i], &vectors[j]))
            .collect();
        // total_cmp is a total order, so a NaN distance cannot panic the sort.
        distances.sort_by(|a, b| a.total_cmp(b));
        scores[i] = distances.iter().take(neighbors_to_check).sum();
    }

    scores.iter()
        .enumerate()
        .min_by(|a, b| a.1.total_cmp(b.1))
        .map(|(idx, _)| idx)
}
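
`krum_select` relies on two helpers, `flatten_prior` and `l2_distance`, that are not shown in this plan. Below is a minimal sketch of what they could look like, under stated assumptions: `PriorTable` is a simplified stand-in for `TransferPrior` (bucket -> arm -> (alpha, beta)), `key_union` is a hypothetical helper, and `flatten_prior` here takes an explicit key list, which differs from the one-argument call above. The point it illustrates is that every vector must use the same key order, with missing entries filled by the uniform prior (1.0, 1.0), or the distances are meaningless.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for TransferPrior: bucket -> arm -> (alpha, beta).
pub type PriorTable = BTreeMap<String, BTreeMap<String, (f32, f32)>>;

/// Sorted union of all (bucket, arm) keys seen in any prior.
pub fn key_union(priors: &[PriorTable]) -> Vec<(String, String)> {
    let mut keys: Vec<(String, String)> = priors
        .iter()
        .flat_map(|p| {
            p.iter().flat_map(|(bucket, arms)| {
                arms.keys().map(move |arm| (bucket.clone(), arm.clone()))
            })
        })
        .collect();
    keys.sort();
    keys.dedup();
    keys
}

/// Flatten a prior into [alpha, beta, alpha, beta, ...] in `keys` order.
/// Missing entries fall back to the uniform prior Beta(1, 1).
pub fn flatten_prior(prior: &PriorTable, keys: &[(String, String)]) -> Vec<f32> {
    let mut out = Vec::with_capacity(keys.len() * 2);
    for (bucket, arm) in keys {
        let (alpha, beta) = prior
            .get(bucket)
            .and_then(|arms| arms.get(arm))
            .copied()
            .unwrap_or((1.0, 1.0));
        out.push(alpha);
        out.push(beta);
    }
    out
}

/// Euclidean distance between two equal-length vectors.
pub fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    debug_assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}
```

Using ordered maps and a sorted key union makes the flattening deterministic across contributors, which a `HashMap` iteration order would not guarantee.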

5. Risk Mitigations

5.1 Poisoning Attack Prevention

Threat: Malicious contributor submits crafted learning data to degrade other users' models.

Mitigations (defense in depth):

Byzantine-tolerant aggregation (Krum) excludes outlier contributions
Reputation system: new contributors have low weight; weight grows with successful contributions
Signature verification: every export is signed, attributable to a pseudonym
Quality gate: only learning from high-quality trajectories is exported
Differential privacy noise limits the impact of any single contribution
Dampened priors: imports are sqrt-dampened before integration

5.2 Privacy Budget Management

Threat: User exports too frequently, accumulating enough epsilon to allow reconstruction.

Mitigations:

PrivacyBudget tracked per-contributor in Firestore
Each export's DiffPrivacyProof records epsilon spent
Server rejects exports when cumulative epsilon exceeds configurable limit (default: 10.0)
Alert at 80% budget usage
Budget resets annually (configurable)
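
The budget rules above can be sketched as a small state machine. This is a minimal illustration, assuming simple additive accounting of the epsilon recorded per export. The `PrivacyBudget` name comes from this plan, but its fields, the `BudgetDecision` enum, and `try_spend` are hypothetical, and Firestore persistence and the annual reset are left out.

```rust
/// Outcome of an attempted export (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum BudgetDecision {
    Accepted { remaining: f64, alert: bool },
    Rejected { requested: f64, remaining: f64 },
}

/// Per-contributor budget; field layout is an assumption.
#[derive(Debug, Clone)]
pub struct PrivacyBudget {
    pub limit: f64,
    pub spent: f64,
    pub alert_fraction: f64,
}

impl PrivacyBudget {
    /// Defaults from the plan: alert at 80% of the limit.
    pub fn new(limit: f64) -> Self {
        Self { limit, spent: 0.0, alert_fraction: 0.8 }
    }

    /// Reject exports that would push cumulative epsilon past the limit;
    /// otherwise record the spend and flag when the alert threshold is hit.
    pub fn try_spend(&mut self, epsilon: f64) -> BudgetDecision {
        let remaining = (self.limit - self.spent).max(0.0);
        if !epsilon.is_finite() || epsilon <= 0.0 || epsilon > remaining {
            return BudgetDecision::Rejected { requested: epsilon, remaining };
        }
        self.spent += epsilon;
        BudgetDecision::Accepted {
            remaining: self.limit - self.spent,
            alert: self.spent >= self.alert_fraction * self.limit,
        }
    }
}
```

With the default limit of 10.0, exports costing 4.0 and then 5.0 are accepted (the second one raises the alert), and a further export costing 2.0 is rejected because only 1.0 remains.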

5.3 Backward Compatibility

Threat: Adding new segment types breaks existing RVF readers.

Mitigations:

RVF's forward compatibility: unknown segment types are skipped by readers that do not recognize them (existing behavior)
New segments use the 0x33-0x36 range, for which the existing TryFrom<u8> implementation returns Err(_)
All federation code is behind the federation Cargo feature flag
FederatedManifest header includes a format version field for future evolution
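
The skip-unknown behavior can be illustrated with a toy reader. This is a sketch only: the three discriminants (0x30-0x32) match the existing segment types listed in the world state, but `KnownSegment`, `read_known`, and the `[type: u8][len: u8][payload]` framing are assumptions for illustration and are not the RVF wire format.

```rust
use std::convert::TryFrom;

/// Subset of segment types an older reader knows about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum KnownSegment {
    TransferPrior = 0x30,
    PolicyKernel = 0x31,
    CostCurve = 0x32,
}

impl TryFrom<u8> for KnownSegment {
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, u8> {
        match value {
            0x30 => Ok(KnownSegment::TransferPrior),
            0x31 => Ok(KnownSegment::PolicyKernel),
            0x32 => Ok(KnownSegment::CostCurve),
            other => Err(other), // e.g. the new 0x33-0x36 federation range
        }
    }
}

/// Toy framing (assumption): [type: u8][len: u8][payload; len].
/// Unknown types are skipped using the length field; truncated input stops.
pub fn read_known(mut buf: &[u8]) -> Vec<(KnownSegment, Vec<u8>)> {
    let mut out = Vec::new();
    while buf.len() >= 2 {
        let (ty, len) = (buf[0], buf[1] as usize);
        if buf.len() < 2 + len {
            break;
        }
        if let Ok(segment) = KnownSegment::try_from(ty) {
            out.push((segment, buf[2..2 + len].to_vec()));
        }
        buf = &buf[2 + len..];
    }
    out
}
```

Skipping works only because every segment carries its own length, so a reader can step over a payload it cannot interpret and still find the next segment boundary.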

5.4 Regulatory Compliance

Threat: Federation data subject to GDPR/CCPA despite PII stripping.

Mitigations:

PII stripping is mandatory at the export boundary, not optional
RedactionLog provides auditable proof that stripping occurred
Contributor pseudonym (SHAKE-256 hash) is the only identifier stored in the cloud
Right-to-deletion: revoke pseudonym -> delete all GCS objects -> Firestore cleanup
Differential privacy provides a mathematical bound on how much any individual contribution can influence, and therefore be inferred from, the published aggregate

6. Monitoring and Observability

6.1 Metrics

Metric	Type	Description
`federation.exports.total`	Counter	Total exports submitted
`federation.imports.total`	Counter	Total imports processed
`federation.rejections.total{reason}`	Counter	Imports rejected, labeled by reason
`federation.pii.detections{type}`	Counter	PII detections by type
`federation.privacy.budget.used{contributor}`	Gauge	Epsilon spent per contributor
`federation.aggregate.rounds`	Counter	Aggregation rounds completed
`federation.aggregate.participants`	Histogram	Participants per round
`federation.acceleration.factor`	Gauge	Last measured acceleration from imports
`federation.latency.export_ms`	Histogram	Export build time
`federation.latency.import_ms`	Histogram	Import + merge time

6.2 Structured Logging

All federation operations emit structured log events:

event=federation_export contributor=<pseudonym> domain=<id> segments=<count> epsilon=<val>
event=federation_import source=<pseudonym> domain=<id> valid=<bool> reason=<str>
event=federation_aggregate round=<id> participants=<count> method=<fedavg|krum>
event=pii_detection type=<path|ip|key> count=<n>
event=privacy_budget contributor=<pseudonym> remaining=<epsilon>
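
A minimal sketch of a formatter that produces lines in the key=value shape shown above. `format_event` is a hypothetical helper, and the quoting rule for values containing whitespace, `=` or `"` is an assumption added so that such lines stay parseable. A real implementation would more likely go through the project's logging framework.

```rust
/// Build one `event=<name> key=value ...` log line.
/// Values containing whitespace, '=' or '"' are emitted as quoted,
/// escaped strings (assumed convention).
pub fn format_event(event: &str, fields: &[(&str, String)]) -> String {
    let mut line = format!("event={}", event);
    for (key, value) in fields {
        line.push(' ');
        line.push_str(key);
        line.push('=');
        let needs_quotes = value
            .chars()
            .any(|c| c.is_whitespace() || c == '=' || c == '"');
        if needs_quotes {
            line.push_str(&format!("{:?}", value));
        } else {
            line.push_str(value);
        }
    }
    line
}
```

Keeping field order fixed per event type makes the lines easy to grep and diff across aggregation rounds.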

7. Testing Strategy

7.1 Unit Tests (per crate)

Crate	Test Focus	Est. Count
rvf-types (federation)	Segment type discriminants, header struct sizes	~10
rvf-wire (federation)	Codec round-trips, malformed input handling	~15
rvf-pii-strip	Detection patterns, redaction determinism, attestation hashes	~30
rvf-diff-privacy	Noise calibration, RDP composition, budget tracking	~25
rvf-federation	Export/import round-trip, policy enforcement, version compat	~30
rvf-fed-aggregate	FedAvg math, Krum selection, reputation scoring	~20
rvf-gcloud	Mock-based GCS/PubSub/Firestore operations	~25
rvf-adapters/federation	Coordinator export/import, engine integration	~15

7.2 Integration Tests

End-to-end export from real SONA + DomainExpansion -> PII strip -> noise -> sign -> validate -> import -> verify acceleration
Multi-contributor aggregation round with FedAvg
Byzantine tolerance with injected outlier
Privacy budget exhaustion and rejection
WASM export path (headless browser test)

7.3 Property-Based Tests

PII detector: any string matching a PII pattern must be redacted
Differential privacy: output distribution must satisfy (epsilon, delta) bounds
Witness chain: reordering segments must fail verification
FedAvg: result is a convex combination of inputs
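
The FedAvg property can be checked per parameter: a weighted mean with positive weights must lie between the smallest and largest input. Below is a dependency-free sketch of such a check, assuming a small linear congruential generator in place of a property-testing crate. `weighted_mean` and `convexity_holds` are hypothetical names.

```rust
/// Weighted mean of scalars; mirrors the per-parameter FedAvg update.
pub fn weighted_mean(values: &[f32], weights: &[f32]) -> Option<f32> {
    let total: f32 = weights.iter().sum();
    if values.is_empty() || values.len() != weights.len() || total <= 0.0 {
        return None;
    }
    Some(values.iter().zip(weights).map(|(v, w)| v * w / total).sum())
}

/// Run `cases` pseudo-random trials and report whether every weighted mean
/// stayed within [min(values), max(values)] up to a small float tolerance.
pub fn convexity_holds(cases: u32, seed: u64) -> bool {
    let mut state = seed;
    // Simple LCG producing values in [0, 1); not cryptographic.
    let mut next = move || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((state >> 40) as f32) / ((1u64 << 24) as f32)
    };
    for _ in 0..cases {
        let n = 1 + (next() * 8.0) as usize;
        let values: Vec<f32> = (0..n).map(|_| 1.0 + next() * 99.0).collect();
        let weights: Vec<f32> = (0..n).map(|_| 0.01 + next()).collect();
        let mean = match weighted_mean(&values, &weights) {
            Some(m) => m,
            None => return false,
        };
        let lo = values.iter().cloned().fold(f32::INFINITY, f32::min);
        let hi = values.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let tol = 1e-3 * hi.abs().max(1.0);
        if mean < lo - tol || mean > hi + tol {
            return false;
        }
    }
    true
}
```

The same bound applied to every alpha and beta in the aggregated prior gives a cheap regression test that a weighting bug cannot push parameters outside the range the contributors actually reported.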

8. Open Questions

ML-DSA-65 vs Ed25519: ADR-057 mentions both. Ed25519 is available now in rvf-crypto. ML-DSA-65 (post-quantum) would require adding pqcrypto-dilithium dependency. Recommendation: start with Ed25519, add ML-DSA-65 as a future optional feature.
Reputation bootstrapping: New contributors start with no reputation. How much weight should their first contribution receive? Recommendation: fixed minimum weight of 0.1 for first 5 contributions, then reputation-based.
Cross-region replication: Should GCS buckets be multi-region? Recommendation: start single-region (us-central1), add multi-region when >100 contributors.
Aggregation trigger: Time-based (hourly) or participant-based (every N submissions)? Recommendation: participant-based with timeout fallback. min_participants=5, max_wait=3600s.
SONA weight granularity: Export full LoRA matrices or just the rank-1/rank-2 deltas? Recommendation: export rank-matched deltas only (typically 2 * hidden_dim * rank floats = 1024 floats for rank=2, dim=256). Full matrices are unnecessary for transfer.
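
The recommended aggregation trigger (participant-based with a timeout fallback) can be sketched as a small policy type. `TriggerPolicy` and `should_aggregate` are hypothetical names, and the defaults are the values recommended above. Whether a timed-out round with fewer than `min_participants` submissions should really proceed is an assumption of this sketch, since Krum as written needs at least 4 contributions to tolerate an outlier.

```rust
use std::time::Duration;

/// Hypothetical trigger policy for starting an aggregation round.
#[derive(Debug, Clone, Copy)]
pub struct TriggerPolicy {
    pub min_participants: usize,
    pub max_wait: Duration,
}

impl Default for TriggerPolicy {
    /// Recommended defaults: min_participants=5, max_wait=3600s.
    fn default() -> Self {
        Self {
            min_participants: 5,
            max_wait: Duration::from_secs(3600),
        }
    }
}

impl TriggerPolicy {
    /// `pending`: submissions waiting for the next round.
    /// `since_first`: time elapsed since the oldest pending submission.
    pub fn should_aggregate(&self, pending: usize, since_first: Duration) -> bool {
        if pending == 0 {
            return false; // nothing to aggregate, even after a timeout
        }
        pending >= self.min_participants || since_first >= self.max_wait
    }
}
```

Measuring the timeout from the oldest pending submission, rather than from the last round, bounds how long any single contributor waits for their export to take effect.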
