feat: Add 12 ADRs for RuVector RVF integration and proof-of-reality
Comprehensive architecture decision records for integrating ruvnet/ruvector
into wifi-densepose, covering:
- ADR-002: Master integration strategy (phased rollout, new crate design)
- ADR-003: RVF cognitive containers for CSI data persistence
- ADR-004: HNSW vector search replacing fixed-threshold detection
- ADR-005: SONA self-learning with LoRA + EWC++ for online adaptation
- ADR-006: GNN-enhanced pattern recognition with temporal modeling
- ADR-007: Post-quantum cryptography (ML-DSA-65 hybrid signatures)
- ADR-008: Raft consensus for multi-AP distributed coordination
- ADR-009: RVF WASM runtime for edge/browser/IoT deployment
- ADR-010: Witness chains for tamper-evident audit trails
- ADR-011: Mock elimination and proof-of-reality (fixes np.random.rand
placeholders, ships CSI capture + SHA-256 verified pipeline)
- ADR-012: ESP32 CSI sensor mesh ($54 starter kit specification)
- ADR-013: Feature-level sensing on commodity gear (zero-cost RSSI path)
ADR-011 directly addresses the credibility gap by cataloging every
mock/placeholder in the Python codebase and specifying concrete fixes.
https://claude.ai/code/session_01Ki7pvEZtJDvqJkmyn6B714
207
docs/adr/ADR-002-ruvector-rvf-integration-strategy.md
Normal file
@@ -0,0 +1,207 @@
# ADR-002: RuVector RVF Integration Strategy

## Status
Proposed

## Date
2026-02-28

## Context

### Current System Limitations

The WiFi-DensePose system processes Channel State Information (CSI) from WiFi signals to estimate human body poses. The current architecture (Python v1 + Rust port) has several areas where intelligence and performance could be significantly improved:

1. **No persistent vector storage**: CSI feature vectors are processed transiently. Historical patterns, fingerprints, and learned representations are not persisted in a searchable vector database.

2. **Static inference models**: The modality translation network (`ModalityTranslationNetwork`) and DensePose head use fixed weights loaded at startup. There is no online learning, adaptation, or self-optimization.

3. **Naive pattern matching**: Human detection in `CSIProcessor` uses simple threshold-based confidence scoring (`amplitude_indicator`, `phase_indicator`, `motion_indicator` with fixed weights 0.4, 0.3, 0.3). No similarity search against known patterns.

4. **No cryptographic audit trail**: Life-critical disaster detection (wifi-densepose-mat) lacks tamper-evident logging for survivor detections and triage classifications.

5. **Limited edge deployment**: The WASM crate (`wifi-densepose-wasm`) provides basic bindings but lacks a self-contained runtime capable of offline operation with embedded models.

6. **Single-node architecture**: Multi-AP deployments for disaster scenarios require distributed coordination, but no consensus mechanism exists for cross-node state management.

### RuVector Capabilities

RuVector (github.com/ruvnet/ruvector) provides a comprehensive cognitive computing platform:

- **RVF (Cognitive Containers)**: Self-contained files with 25 segment types (including VEC, INDEX, KERNEL, EBPF, WASM, COW_MAP, WITNESS, and CRYPTO) that package vectors, models, and runtime into a single deployable artifact
- **HNSW Vector Search**: Hierarchical Navigable Small World indexing with SIMD acceleration and hyperbolic extensions for hierarchy-aware search
- **SONA**: Self-Optimizing Neural Architecture providing <1 ms adaptation via LoRA fine-tuning with EWC++ memory preservation
- **GNN Learning Layer**: Graph Neural Networks that learn from every query through message passing, attention weighting, and representation updates
- **46 Attention Mechanisms**: Including Flash Attention, Linear Attention, Graph Attention, Hyperbolic Attention, Mincut-gated Attention
- **Post-Quantum Cryptography**: ML-DSA-65 and SLH-DSA-128s post-quantum signatures, paired with classical Ed25519 for hybrid signing, plus SHAKE-256 hashing
- **Witness Chains**: Tamper-evident cryptographic hash-linked audit trails
- **Raft Consensus**: Leader-based distributed coordination, alongside multi-master replication with vector clocks
- **WASM Runtime**: 5.5 KB runtime bootable in 125 ms, deployable on servers, browsers, phones, IoT
- **Git-like Branching**: Copy-on-write structure (1M vectors + 100 edits ≈ 2.5 MB branch)

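
The "hash-linked" property of a witness chain can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `WitnessEntry` layout, the `append` and `verify` helpers, and the use of FNV-1a, which is not a cryptographic hash and only stands in for SHA-256 or SHAKE-256 to keep the example dependency-free. It is not RuVector's actual WITNESS segment format.

```rust
/// One entry in a hash-linked audit log (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
pub struct WitnessEntry {
    pub payload: String,
    pub prev_hash: u64,
    pub hash: u64,
}

/// FNV-1a 64-bit. NOT cryptographic; a stand-in for SHA-256/SHAKE-256
/// so the sketch needs no external crates.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hash of an entry covers the previous entry's hash plus the payload,
/// which is what links the entries into a chain.
fn entry_hash(prev_hash: u64, payload: &str) -> u64 {
    let mut buf = prev_hash.to_le_bytes().to_vec();
    buf.extend_from_slice(payload.as_bytes());
    fnv1a(&buf)
}

/// Append a record; the first entry links to a zero hash.
pub fn append(chain: &mut Vec<WitnessEntry>, payload: &str) {
    let prev_hash = chain.last().map_or(0, |e| e.hash);
    let hash = entry_hash(prev_hash, payload);
    chain.push(WitnessEntry {
        payload: payload.to_string(),
        prev_hash,
        hash,
    });
}

/// Recompute every link; any edited payload or reordered entry fails.
pub fn verify(chain: &[WitnessEntry]) -> bool {
    let mut prev = 0u64;
    for e in chain {
        if e.prev_hash != prev || e.hash != entry_hash(prev, &e.payload) {
            return false;
        }
        prev = e.hash;
    }
    true
}
```

Because each hash covers its predecessor, altering one record invalidates every later link, which is the tamper-evidence the audit trail relies on. A production chain would additionally sign the head hash.
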
## Decision

We will integrate RuVector's RVF format and intelligence capabilities into the WiFi-DensePose system through a phased, modular approach across eight integration domains, each detailed in its own ADR (ADR-003 through ADR-010).

### Integration Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          WiFi-DensePose + RuVector                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │  CSI Input   │   │  RVF Store   │   │    SONA      │   │  GNN Layer   │  │
│  │  Pipeline    │──▶│  (Vectors,   │──▶│  Self-Learn  │──▶│  Pattern     │  │
│  │              │   │   Indices)   │   │              │   │  Enhancement │  │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘   └──────┬───────┘  │
│         │                  │                  │                  │          │
│         ▼                  ▼                  ▼                  ▼          │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │  Feature     │   │   HNSW       │   │  Adaptive    │   │   Pose       │  │
│  │  Extraction  │   │   Search     │   │  Weights     │   │  Estimation  │  │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘   └──────┬───────┘  │
│         │                  │                  │                  │          │
│         └──────────────────┴──────────────────┴──────────────────┘          │
│                                      │                                      │
│                           ┌──────────▼──────────┐                           │
│                           │    Output Layer     │                           │
│                           │  • Pose Keypoints   │                           │
│                           │  • Body Segments    │                           │
│                           │  • UV Coordinates   │                           │
│                           │  • Confidence Maps  │                           │
│                           └──────────┬──────────┘                           │
│                                      │                                      │
│          ┌───────────────────────────┼───────────────────────────┐          │
│          ▼                           ▼                           ▼          │
│   ┌──────────────┐            ┌──────────────┐            ┌──────────────┐  │
│   │   Witness    │            │    Raft      │            │    WASM      │  │
│   │   Chains     │            │  Consensus   │            │    Edge      │  │
│   │   (Audit)    │            │  (Multi-AP)  │            │   Runtime    │  │
│   └──────────────┘            └──────────────┘            └──────────────┘  │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                       Post-Quantum Crypto Layer                       │  │
│  │            ML-DSA-65 │ Ed25519 │ SLH-DSA-128s │ SHAKE-256             │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
```

### New Crate: `wifi-densepose-rvf`

A new workspace member crate will serve as the integration layer:

```
crates/wifi-densepose-rvf/
├── Cargo.toml
├── src/
│   ├── lib.rs              # Public API surface
│   ├── container.rs        # RVF cognitive container management
│   ├── vector_store.rs     # HNSW-backed CSI vector storage
│   ├── search.rs           # Similarity search for fingerprinting
│   ├── learning.rs         # SONA integration for online learning
│   ├── gnn.rs              # GNN pattern enhancement layer
│   ├── attention.rs        # Attention mechanism selection
│   ├── witness.rs          # Witness chain audit trails
│   ├── consensus.rs        # Raft consensus for multi-AP
│   ├── crypto.rs           # Post-quantum crypto wrappers
│   ├── edge.rs             # WASM edge runtime integration
│   └── adapters/
│       ├── mod.rs
│       ├── signal_adapter.rs   # Bridges wifi-densepose-signal
│       ├── nn_adapter.rs       # Bridges wifi-densepose-nn
│       └── mat_adapter.rs      # Bridges wifi-densepose-mat
```

### Phased Rollout

| Phase | Timeline | ADR | Capability | Priority |
|-------|----------|-----|------------|----------|
| 1 | Weeks 1-3 | ADR-003 | RVF Cognitive Containers for CSI Data | Critical |
| 2 | Weeks 2-4 | ADR-004 | HNSW Vector Search for Signal Fingerprinting | Critical |
| 3 | Weeks 4-6 | ADR-005 | SONA Self-Learning for Pose Estimation | High |
| 4 | Weeks 5-7 | ADR-006 | GNN-Enhanced CSI Pattern Recognition | High |
| 5 | Weeks 6-8 | ADR-007 | Post-Quantum Cryptography for Secure Sensing | Medium |
| 6 | Weeks 7-9 | ADR-008 | Distributed Consensus for Multi-AP | Medium |
| 7 | Weeks 8-10 | ADR-009 | RVF WASM Runtime for Edge Deployment | Medium |
| 8 | Weeks 9-11 | ADR-010 | Witness Chains for Audit Trail Integrity | High (MAT) |

### Dependency Strategy

```toml
# In Cargo.toml workspace dependencies
[workspace.dependencies]
ruvector-core = { version = "0.1", features = ["hnsw", "sona", "gnn"] }
ruvector-data-framework = { version = "0.1", features = ["rvf", "witness", "crypto"] }
ruvector-consensus = { version = "0.1", features = ["raft"] }
ruvector-wasm = { version = "0.1", features = ["edge-runtime"] }
```

Feature flags control which RuVector capabilities are compiled in:

```toml
[features]
default = ["rvf-store", "hnsw-search"]
rvf-store = ["ruvector-data-framework/rvf"]
hnsw-search = ["ruvector-core/hnsw"]
sona-learning = ["ruvector-core/sona"]
gnn-patterns = ["ruvector-core/gnn"]
post-quantum = ["ruvector-data-framework/crypto"]
witness-chains = ["ruvector-data-framework/witness"]
raft-consensus = ["ruvector-consensus/raft"]
wasm-edge = ["ruvector-wasm/edge-runtime"]
full = ["rvf-store", "hnsw-search", "sona-learning", "gnn-patterns", "post-quantum", "witness-chains", "raft-consensus", "wasm-edge"]
```

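
On the Rust side, flags like these are typically consumed through `cfg` checks. The following is a minimal sketch of how the integration crate might report which capabilities were compiled in. The `ALL_FEATURES` constant and `enabled_capabilities` function are hypothetical names introduced here, not part of the proposed crate API; only the feature names come from the table above.

```rust
/// Feature names mirrored from the `[features]` table (excluding the
/// `default` and `full` aggregates).
pub const ALL_FEATURES: [&str; 8] = [
    "rvf-store",
    "hnsw-search",
    "sona-learning",
    "gnn-patterns",
    "post-quantum",
    "witness-chains",
    "raft-consensus",
    "wasm-edge",
];

/// Returns the capabilities that were enabled at compile time.
/// `cfg!(feature = "...")` evaluates to a constant `true` or `false`,
/// so disabled branches are removed by the compiler.
pub fn enabled_capabilities() -> Vec<&'static str> {
    let mut caps = Vec::new();
    if cfg!(feature = "rvf-store") {
        caps.push("rvf-store");
    }
    if cfg!(feature = "hnsw-search") {
        caps.push("hnsw-search");
    }
    if cfg!(feature = "sona-learning") {
        caps.push("sona-learning");
    }
    if cfg!(feature = "gnn-patterns") {
        caps.push("gnn-patterns");
    }
    if cfg!(feature = "post-quantum") {
        caps.push("post-quantum");
    }
    if cfg!(feature = "witness-chains") {
        caps.push("witness-chains");
    }
    if cfg!(feature = "raft-consensus") {
        caps.push("raft-consensus");
    }
    if cfg!(feature = "wasm-edge") {
        caps.push("wasm-edge");
    }
    caps
}
```

Whole modules would more likely be gated with `#[cfg(feature = "...")]` on the `mod` declaration, so that an edge build with only `wasm-edge` never compiles the consensus or learning code at all.
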
## Consequences

### Positive

- **Faster pattern lookup**: HNSW targets 10-100x faster CSI fingerprint matching than a linear scan over the same vectors
- **Continuous improvement**: SONA enables online adaptation without full retraining
- **Self-contained deployment**: RVF containers package everything needed for field operation
- **Tamper-evident records**: Witness chains provide cryptographic proof for disaster response auditing
- **Future-proof security**: Post-quantum signatures are designed to resist known quantum computing attacks
- **Distributed operation**: Raft consensus enables coordinated multi-AP sensing
- **Ultra-light edge**: 5.5 KB WASM runtime enables browser and IoT deployment
- **Git-like versioning**: COW branching enables experimental model variations with minimal storage

### Negative

- **Increased binary size**: Full feature set adds significant dependencies (~15-30 MB)
- **Complexity**: Eight integration domains require careful coordination
- **Learning curve**: Team must understand RuVector's cognitive container paradigm
- **API stability risk**: RuVector is pre-1.0; APIs may change
- **Testing surface**: Each integration point requires dedicated test suites

### Risks and Mitigations

| Risk | Severity | Mitigation |
|------|----------|------------|
| RuVector API breaking changes | High | Pin versions, adapter pattern isolates impact |
| Performance regression from abstraction layers | Medium | Benchmark each integration point, zero-cost abstractions |
| Feature flag combinatorial complexity | Medium | CI matrix testing for key feature combinations |
| Over-engineering for current use cases | Medium | Phased rollout, each phase independently valuable |
| Binary size bloat for edge targets | Low | Feature flags ensure only needed capabilities compile |

## Related ADRs

- **ADR-001**: WiFi-Mat Disaster Detection Architecture (existing)
- **ADR-003**: RVF Cognitive Containers for CSI Data
- **ADR-004**: HNSW Vector Search for Signal Fingerprinting
- **ADR-005**: SONA Self-Learning for Pose Estimation
- **ADR-006**: GNN-Enhanced CSI Pattern Recognition
- **ADR-007**: Post-Quantum Cryptography for Secure Sensing
- **ADR-008**: Distributed Consensus for Multi-AP Coordination
- **ADR-009**: RVF WASM Runtime for Edge Deployment
- **ADR-010**: Witness Chains for Audit Trail Integrity

## References

- [RuVector Repository](https://github.com/ruvnet/ruvector)
- [HNSW Algorithm](https://arxiv.org/abs/1603.09320)
- [LoRA: Low-Rank Adaptation](https://arxiv.org/abs/2106.09685)
- [Elastic Weight Consolidation](https://arxiv.org/abs/1612.00796)
- [Raft Consensus](https://raft.github.io/raft.pdf)
- [ML-DSA (FIPS 204)](https://csrc.nist.gov/pubs/fips/204/final)
- [WiFi-DensePose Rust ADR-001: Workspace Structure](../rust-port/wifi-densepose-rs/docs/adr/ADR-001-workspace-structure.md)
251
docs/adr/ADR-003-rvf-cognitive-containers-csi.md
Normal file
@@ -0,0 +1,251 @@
# ADR-003: RVF Cognitive Containers for CSI Data

## Status
Proposed

## Date
2026-02-28

## Context

### Problem

WiFi-DensePose processes CSI (Channel State Information) data through a multi-stage pipeline: raw capture → preprocessing → feature extraction → neural inference → pose output. Each stage produces intermediate data that is currently ephemeral:

1. **Raw CSI measurements** (`CsiData`): Amplitude matrices (num_antennas x num_subcarriers), phase arrays, SNR values, metadata. Stored only in a bounded `VecDeque` (max 500 entries in Python, similar in Rust).

2. **Extracted features** (`CsiFeatures`): Amplitude mean/variance, phase differences, correlation matrices, Doppler shifts, power spectral density. Discarded after single-pass inference.

3. **Trained model weights**: Static ONNX/PyTorch files loaded from disk. No mechanism to persist adapted weights or experimental variations.

4. **Detection results** (`HumanDetectionResult`): Confidence scores, motion scores, detection booleans. Logged but not indexed for pattern retrieval.

5. **Environment fingerprints**: Each physical space has a unique CSI signature affected by room geometry, furniture, building materials. No persistent fingerprint database exists.

### Opportunity

RuVector's RVF (Cognitive Container) format provides a single-file packaging solution with 25 segment types that can encapsulate the entire WiFi-DensePose operational state:

```
RVF Cognitive Container Structure:
┌─────────────────────────────────────────────┐
│  HEADER   │ Magic, version, segment count   │
├───────────┼─────────────────────────────────┤
│  VEC      │ CSI feature vectors             │
│  INDEX    │ HNSW index over vectors         │
│  WASM     │ Inference runtime               │
│  COW_MAP  │ Copy-on-write branch state      │
│  WITNESS  │ Audit chain entries             │
│  CRYPTO   │ Signature keys, attestations    │
│  KERNEL   │ Bootable runtime (optional)     │
│  EBPF     │ Hardware-accelerated filters    │
│  ...      │ (25 total segment types)        │
└─────────────────────────────────────────────┘
```

## Decision

We will adopt the RVF Cognitive Container format as the primary persistence and deployment unit for WiFi-DensePose operational data, implementing the following container types:

### 1. CSI Fingerprint Container (`.rvf.csi`)

Packages environment-specific CSI signatures for location recognition:

```rust
/// CSI Fingerprint container storing environment signatures
pub struct CsiFingerprintContainer {
    /// Container metadata
    metadata: ContainerMetadata,

    /// VEC segment: Normalized CSI feature vectors
    /// Each vector = [amplitude_mean(N) | amplitude_var(N) | phase_diff(N-1) | doppler(10) | psd(128)]
    /// Typical dimensionality: 64 subcarriers → 64+64+63+10+128 = 329 dimensions
    fingerprint_vectors: VecSegment,

    /// INDEX segment: HNSW index for O(log n) nearest-neighbor lookup
    hnsw_index: IndexSegment,

    /// COW_MAP: Branches for different times-of-day, occupancy levels
    branches: CowMapSegment,

    /// Metadata per vector: room_id, timestamp, occupancy_count, furniture_hash
    annotations: AnnotationSegment,
}
```

**Vector encoding**: Each CSI snapshot is encoded as a fixed-dimension vector:
```
CSI Feature Vector (329-dim for 64 subcarriers):
┌──────────────────┬──────────────────┬─────────────────┬───────────┬────────────┐
│ amplitude_mean   │ amplitude_var    │ phase_diff      │ doppler   │ psd        │
│ [f32; 64]        │ [f32; 64]        │ [f32; 63]       │ [f32; 10] │ [f32; 128] │
└──────────────────┴──────────────────┴─────────────────┴───────────┴────────────┘
```

### 2. Model Container (`.rvf.model`)

Packages neural network weights with versioning:

```rust
/// Model container with version tracking and A/B comparison
pub struct ModelContainer {
    /// Container metadata with model version history
    metadata: ContainerMetadata,

    /// Primary model weights (ONNX serialized)
    primary_weights: BlobSegment,

    /// SONA adaptation deltas (LoRA low-rank matrices)
    adaptation_deltas: VecSegment,

    /// COW branches for model experiments
    /// e.g., "baseline", "adapted-office-env", "adapted-warehouse"
    branches: CowMapSegment,

    /// Performance metrics per branch
    metrics: AnnotationSegment,

    /// Witness chain: every weight update recorded
    audit_trail: WitnessSegment,
}
```

### 3. Session Container (`.rvf.session`)

Captures a complete sensing session for replay and analysis:

```rust
/// Session container for recording and replaying sensing sessions
pub struct SessionContainer {
    /// Session metadata (start time, duration, hardware config)
    metadata: ContainerMetadata,

    /// Time-series CSI vectors at capture rate
    csi_timeseries: VecSegment,

    /// Detection results aligned to CSI timestamps
    detections: AnnotationSegment,

    /// Pose estimation outputs
    poses: VecSegment,

    /// Index for temporal range queries
    temporal_index: IndexSegment,

    /// Cryptographic integrity proof
    witness_chain: WitnessSegment,
}
```

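
The `temporal_index` field is described only as supporting temporal range queries. One plausible shape for that lookup, sketched with the standard library's ordered map, is shown below. The `TemporalIndex` type, its microsecond timestamps, and the mapping to vector offsets are assumptions for illustration; the real `IndexSegment` layout is not specified in this ADR.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory view of a session's temporal index:
/// capture timestamp (microseconds) -> offset of the CSI vector.
#[derive(Debug, Default)]
pub struct TemporalIndex {
    by_time: BTreeMap<u64, usize>,
}

impl TemporalIndex {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register a captured vector. A duplicate timestamp overwrites the
    /// earlier entry, which a real index would need to handle explicitly.
    pub fn insert(&mut self, timestamp_us: u64, vector_id: usize) {
        self.by_time.insert(timestamp_us, vector_id);
    }

    /// Vector ids captured in the half-open window [start_us, end_us),
    /// returned in time order.
    pub fn range(&self, start_us: u64, end_us: u64) -> Vec<usize> {
        if start_us >= end_us {
            // BTreeMap::range panics on an inverted range, so guard first.
            return Vec::new();
        }
        self.by_time
            .range(start_us..end_us)
            .map(|(_, &id)| id)
            .collect()
    }
}
```

An ordered map gives logarithmic seek to the window start followed by a sequential scan, which matches replay access patterns where a caller asks for "everything between t0 and t1".
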
### Container Lifecycle

```
┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Create  │───▶│  Ingest  │───▶│  Query   │───▶│  Branch  │
│ Container│    │ Vectors  │    │  (HNSW)  │    │  (COW)   │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
      │                                               │
      │         ┌──────────┐    ┌──────────┐          │
      │         │  Merge   │◀───│ Compare  │◀─────────┘
      │         │ Branches │    │ Results  │
      │         └────┬─────┘    └──────────┘
      │              │
      ▼              ▼
┌──────────┐    ┌──────────┐
│  Export  │    │  Deploy  │
│  (.rvf)  │    │  (Edge)  │
└──────────┘    └──────────┘
```

### Integration with Existing Crates

The container system integrates through adapter traits:

```rust
/// Trait for types that can be vectorized into RVF containers
pub trait RvfVectorizable {
    /// Encode self as a fixed-dimension f32 vector
    fn to_rvf_vector(&self) -> Vec<f32>;

    /// Reconstruct from an RVF vector
    fn from_rvf_vector(vec: &[f32]) -> Result<Self, RvfError>
    where
        Self: Sized;

    /// Vector dimensionality
    fn vector_dim() -> usize;
}

// Implementation for existing types
impl RvfVectorizable for CsiFeatures {
    fn to_rvf_vector(&self) -> Vec<f32> {
        let mut vec = Vec::with_capacity(Self::vector_dim());
        vec.extend(self.amplitude_mean.iter().map(|&x| x as f32));
        vec.extend(self.amplitude_variance.iter().map(|&x| x as f32));
        vec.extend(self.phase_difference.iter().map(|&x| x as f32));
        vec.extend(self.doppler_shift.iter().map(|&x| x as f32));
        vec.extend(self.power_spectral_density.iter().map(|&x| x as f32));
        vec
    }

    fn vector_dim() -> usize {
        // 64 + 64 + 63 + 10 + 128 = 329 (for 64 subcarriers)
        329
    }
    // ...
}
```

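
The excerpt above elides the decoding direction and hard-codes the dimension for 64 subcarriers. As a self-contained sketch of the same layout, the block below derives the dimension from the subcarrier count and round-trips a vector. The `CsiFeatures` struct here is a simplified stand-in with `f32` fields, and `vector_dim`, `encode`, and `decode` are hypothetical free functions; none of this is the project's actual type or trait implementation.

```rust
/// Simplified stand-in for the project's `CsiFeatures` (assumed fields).
#[derive(Debug, Clone, PartialEq)]
pub struct CsiFeatures {
    pub amplitude_mean: Vec<f32>,         // N values
    pub amplitude_variance: Vec<f32>,     // N values
    pub phase_difference: Vec<f32>,       // N - 1 values
    pub doppler_shift: Vec<f32>,          // 10 values
    pub power_spectral_density: Vec<f32>, // 128 values
}

const DOPPLER_BINS: usize = 10;
const PSD_BINS: usize = 128;

/// Encoded length for `n` subcarriers: n + n + (n - 1) + 10 + 128.
pub fn vector_dim(n: usize) -> usize {
    n + n + n.saturating_sub(1) + DOPPLER_BINS + PSD_BINS
}

/// Concatenate the fields in the documented order. Field lengths are
/// assumed to be consistent with each other; this sketch does not check.
pub fn encode(f: &CsiFeatures) -> Vec<f32> {
    let mut v = Vec::with_capacity(vector_dim(f.amplitude_mean.len()));
    v.extend_from_slice(&f.amplitude_mean);
    v.extend_from_slice(&f.amplitude_variance);
    v.extend_from_slice(&f.phase_difference);
    v.extend_from_slice(&f.doppler_shift);
    v.extend_from_slice(&f.power_spectral_density);
    v
}

/// Split a flat vector back into fields, inferring N from its length:
/// len = 3N - 1 + 138, so N = (len - 137) / 3.
pub fn decode(v: &[f32]) -> Result<CsiFeatures, String> {
    let fixed = DOPPLER_BINS + PSD_BINS - 1; // 137
    if v.len() <= fixed || (v.len() - fixed) % 3 != 0 {
        return Err(format!("unexpected vector length {}", v.len()));
    }
    let n = (v.len() - fixed) / 3;
    let (mean, rest) = v.split_at(n);
    let (var, rest) = rest.split_at(n);
    let (phase, rest) = rest.split_at(n - 1);
    let (doppler, psd) = rest.split_at(DOPPLER_BINS);
    Ok(CsiFeatures {
        amplitude_mean: mean.to_vec(),
        amplitude_variance: var.to_vec(),
        phase_difference: phase.to_vec(),
        doppler_shift: doppler.to_vec(),
        power_spectral_density: psd.to_vec(),
    })
}
```

Deriving the dimension from the subcarrier count matters in practice because hardware differs in how many subcarriers it reports, and an index built for 329-dimensional vectors cannot accept vectors of another length.
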
### Storage Characteristics

| Container Type | Typical Size | Vector Count | Use Case |
|----------------|--------------|--------------|----------|
| Fingerprint | 5-50 MB | 10K-100K | Room/building fingerprint DB |
| Model | 50-500 MB | N/A (blob) | Neural network deployment |
| Session | 10-200 MB | 50K-500K | 1-hour recording at 100 Hz |

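
The session row can be sanity-checked with simple arithmetic. The sketch below uses hypothetical helper names and assumes uncompressed 329-dimensional `f32` vectors. Under that assumption one hour at 100 Hz yields 360,000 vectors and roughly 474 MB of raw vector data, which is above the table's 10-200 MB range. The table therefore presumably assumes quantization or compression (ADR-004 mentions 2-32x reduction); that reading is an inference, not something the ADR states.

```rust
/// Number of CSI vectors captured at `rate_hz` over `duration_s` seconds.
pub fn vectors_recorded(rate_hz: u64, duration_s: u64) -> u64 {
    rate_hz * duration_s
}

/// Bytes needed to store `vectors` uncompressed f32 vectors of `dim`
/// dimensions (4 bytes per component, no index or metadata).
pub fn raw_f32_bytes(vectors: u64, dim: u64) -> u64 {
    vectors * dim * 4
}

/// Compression ratio implied by a raw size and a stored size.
pub fn implied_compression(raw_bytes: u64, stored_bytes: u64) -> f64 {
    raw_bytes as f64 / stored_bytes as f64
}
```

For the upper bound of the session range, `implied_compression(473_760_000, 200_000_000)` is about 2.4, so even the largest listed session size would need a little more than 2x reduction over raw `f32` storage.
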
### COW Branching for Environment Adaptation

The copy-on-write mechanism enables low-overhead experimentation, because each branch stores only its delta against the shared baseline:

```
main (office baseline: 50K vectors)
├── branch/morning          (delta: 500 vectors, ~15 KB)
├── branch/afternoon        (delta: 800 vectors, ~24 KB)
├── branch/occupied-10      (delta: 2K vectors, ~60 KB)
└── branch/furniture-moved  (delta: 5K vectors, ~150 KB)
```

Total overhead for 4 branches on a 50K-vector container: ~250 KB additional (0.5%).

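
The delta-only storage described above can be sketched as an overlay on a shared, immutable base. The `Branch` type and its methods are hypothetical and use in-memory collections purely for illustration; the actual COW_MAP segment works at the file level and its layout is not described in this ADR.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// A branch that shares the baseline vectors and records only changes.
pub struct Branch {
    base: Arc<Vec<Vec<f32>>>,            // shared, never mutated
    overrides: HashMap<usize, Vec<f32>>, // edited baseline rows
    appended: Vec<Vec<f32>>,             // rows that exist only here
}

impl Branch {
    pub fn new(base: Arc<Vec<Vec<f32>>>) -> Self {
        Branch { base, overrides: HashMap::new(), appended: Vec::new() }
    }

    /// Read through the overlay: branch edits win, otherwise fall back
    /// to the shared baseline, then to branch-local appended rows.
    pub fn get(&self, idx: usize) -> Option<&[f32]> {
        if let Some(v) = self.overrides.get(&idx) {
            return Some(v.as_slice());
        }
        if idx < self.base.len() {
            return Some(self.base[idx].as_slice());
        }
        self.appended.get(idx - self.base.len()).map(|v| v.as_slice())
    }

    /// Replace a baseline row in this branch only. Returns false if the
    /// index is not a baseline row.
    pub fn set(&mut self, idx: usize, v: Vec<f32>) -> bool {
        if idx >= self.base.len() {
            return false;
        }
        self.overrides.insert(idx, v);
        true
    }

    /// Add a branch-local row and return its index.
    pub fn push(&mut self, v: Vec<f32>) -> usize {
        self.appended.push(v);
        self.base.len() + self.appended.len() - 1
    }

    /// Number of rows this branch actually stores (its delta).
    pub fn delta_len(&self) -> usize {
        self.overrides.len() + self.appended.len()
    }
}
```

Creating a branch costs one reference-count increment regardless of baseline size, which is the in-memory analogue of the "COW metadata only" branch-create target listed under Performance Targets.
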
## Consequences

### Positive
- **Single-file deployment**: Move a fingerprint database between sites by copying one `.rvf` file
- **Versioned models**: A/B test model variants without duplicating full weight sets
- **Session replay**: Reproduce detection results from recorded CSI data
- **Atomic operations**: Container writes are transactional; no partial state corruption
- **Cross-platform**: Same container format works on server, WASM, and embedded targets
- **Storage efficient**: COW branching avoids duplicating unchanged data

### Negative
- **Format lock-in**: RVF is not yet a widely adopted standard
- **Serialization overhead**: Converting between native types and RVF vectors adds latency (~0.1-0.5 ms per vector)
- **Learning curve**: Team must understand segment types and container lifecycle
- **File size for sessions**: High-rate CSI capture (1000 Hz) generates large session containers

### Performance Targets

| Operation | Target Latency | Notes |
|-----------|----------------|-------|
| Container open | <10 ms | Memory-mapped I/O |
| Vector insert | <0.1 ms | Append to VEC segment |
| HNSW query (100K vectors) | <1 ms | See ADR-004 |
| Branch create | <1 ms | COW metadata only |
| Branch merge | <100 ms | Delta application |
| Container export | ~1 ms/MB | Sequential write |

## References

- [RuVector Cognitive Container Specification](https://github.com/ruvnet/ruvector)
- [Memory-Mapped I/O in Rust](https://docs.rs/memmap2)
- [Copy-on-Write Data Structures](https://en.wikipedia.org/wiki/Copy-on-write)
- ADR-002: RuVector RVF Integration Strategy
270
docs/adr/ADR-004-hnsw-vector-search-fingerprinting.md
Normal file
@@ -0,0 +1,270 @@
# ADR-004: HNSW Vector Search for Signal Fingerprinting

## Status
Proposed

## Date
2026-02-28

## Context

### Current Signal Matching Limitations

The WiFi-DensePose system needs to match incoming CSI patterns against known signatures for:

1. **Environment recognition**: Identifying which room/area the device is in based on CSI characteristics
2. **Activity classification**: Matching current CSI patterns to known human activities (walking, sitting, falling)
3. **Anomaly detection**: Determining whether current readings deviate significantly from baseline
4. **Survivor re-identification** (MAT module): Tracking individual survivors across scan sessions

Current approach in `CSIProcessor._calculate_detection_confidence()`:
```python
# Fixed thresholds, no similarity search
amplitude_indicator = np.mean(features.amplitude_mean) > 0.1
phase_indicator = np.std(features.phase_difference) > 0.05
motion_indicator = motion_score > 0.3
confidence = (0.4 * amplitude_indicator + 0.3 * phase_indicator + 0.3 * motion_indicator)
```

This is an **O(1) fixed-threshold check** that:
- Cannot learn from past observations
- Has no concept of "similar patterns seen before"
- Requires manual threshold tuning per environment
- Produces binary indicators (above/below threshold), losing gradient information

### What HNSW Provides

Hierarchical Navigable Small World (HNSW) graphs enable approximate nearest-neighbor search in high-dimensional vector spaces with:

- **Approximately O(log n) query time** vs O(n) for brute-force search
- **High recall**: >95% recall at roughly 10x the speed of exact search
- **Dynamic insertion**: New vectors added without full rebuild
- **SIMD acceleration**: RuVector's implementation uses AVX2/NEON for distance calculations

RuVector extends standard HNSW with:
- **Hyperbolic HNSW**: Search in Poincaré ball space for hierarchy-aware results (e.g., "walking" is closer to "running" than to "sitting" in activity hierarchy)
- **GNN enhancement**: Graph neural networks refine neighbor connections after queries
- **Tiered compression**: 2-32x memory reduction through adaptive quantization

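
For reference, the exact O(n) search that HNSW approximates is short enough to write out. The sketch below is that brute-force baseline, not HNSW itself: it scores every stored vector with cosine distance and keeps the `k` closest. The function names are hypothetical. A baseline like this is also a practical way to measure the recall of an approximate index on real CSI data.

```rust
/// Cosine distance in [0, 2]: 0 means same direction. Zero-length
/// vectors are treated as maximally dissimilar to avoid dividing by 0.
pub fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot / (norm_a * norm_b)
}

/// Exact k-nearest-neighbor search by scanning every stored vector.
/// Returns (index, distance) pairs sorted from nearest to farthest.
pub fn top_k(query: &[f32], db: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = db
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine_distance(query, v)))
        .collect();
    // total_cmp gives a total order on f32, so NaN cannot break the sort.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

The scan touches all `n` vectors per query, which is the cost that grows linearly with the fingerprint database and motivates the graph index.
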
## Decision

We will integrate RuVector's HNSW implementation as the primary similarity search engine for all CSI pattern matching operations, replacing fixed-threshold detection with similarity-based retrieval.

### Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                      HNSW Search Pipeline                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  CSI Input      Feature         Vector        HNSW              │
│  ────────▶    Extraction ────▶ Encode ────▶ Search              │
│               (existing)       (new)        (new)               │
│                                               │                 │
│                                 ┌─────────────┤                 │
│                                 ▼             ▼                 │
│                           Top-K Results   Confidence            │
│                           [vec_id, dist,  Score from            │
│                            metadata]      Distance Dist.        │
│                                               │                 │
│                                               ▼                 │
│                                         ┌────────────┐          │
│                                         │  Decision  │          │
│                                         │   Fusion   │          │
│                                         └────────────┘          │
│                           Combines HNSW similarity with         │
│                           existing threshold-based logic        │
└─────────────────────────────────────────────────────────────────┘
```

### Index Configuration

```rust
/// HNSW configuration tuned for CSI vector characteristics
pub struct CsiHnswConfig {
    /// Vector dimensionality (matches CsiFeatures encoding)
    dim: usize, // 329 for 64 subcarriers

    /// Maximum number of connections per node per layer
    /// Higher M = better recall, more memory
    /// CSI vectors are moderately dimensional; M=16 balances well
    m: usize, // 16

    /// Size of dynamic candidate list during construction
    /// ef_construction = 200 gives >99% recall for 329-dim vectors
    ef_construction: usize, // 200

    /// Size of dynamic candidate list during search
    /// ef_search = 64 gives >95% recall with <1ms latency at 100K vectors
    ef_search: usize, // 64

    /// Distance metric
    /// Cosine similarity works best for normalized CSI features
    metric: DistanceMetric, // Cosine

    /// Maximum elements (pre-allocated for performance)
    max_elements: usize, // 1_000_000

    /// Enable SIMD acceleration
    simd: bool, // true

    /// Quantization level for memory reduction
    quantization: Quantization, // PQ8 (product quantization, 8-bit)
}
```

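
The values in the trailing comments read as intended defaults. A sketch of how they might be captured in a `Default` implementation with basic validation follows. The enum variants, the `validate` rules, and the memory estimate are assumptions added for illustration: the estimate counts uncompressed `f32` components plus up to `2 * m` four-byte links on the base layer, a common rule of thumb for HNSW that ignores upper layers and quantization, and it is not a figure from RuVector.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DistanceMetric {
    Cosine,
    Euclidean,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Quantization {
    None,
    Pq8,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CsiHnswConfig {
    pub dim: usize,
    pub m: usize,
    pub ef_construction: usize,
    pub ef_search: usize,
    pub metric: DistanceMetric,
    pub max_elements: usize,
    pub simd: bool,
    pub quantization: Quantization,
}

impl Default for CsiHnswConfig {
    /// Defaults taken from the comments in the ADR's struct definition.
    fn default() -> Self {
        CsiHnswConfig {
            dim: 329,
            m: 16,
            ef_construction: 200,
            ef_search: 64,
            metric: DistanceMetric::Cosine,
            max_elements: 1_000_000,
            simd: true,
            quantization: Quantization::Pq8,
        }
    }
}

impl CsiHnswConfig {
    /// Reject configurations that cannot produce a usable index.
    pub fn validate(&self) -> Result<(), String> {
        if self.dim == 0 {
            return Err("dim must be positive".into());
        }
        if self.m < 2 {
            return Err("m must be at least 2".into());
        }
        if self.ef_construction < self.m {
            return Err("ef_construction should be >= m".into());
        }
        if self.ef_search == 0 {
            return Err("ef_search must be positive".into());
        }
        Ok(())
    }

    /// Rough bytes per vector before quantization: f32 components plus
    /// base-layer links (2 * m neighbors, 4 bytes each).
    pub fn approx_bytes_per_vector(&self) -> usize {
        self.dim * 4 + 2 * self.m * 4
    }
}
```

With the defaults this estimate is 1,444 bytes per vector, so a fully populated 1M-element index would be on the order of 1.4 GB unquantized, which helps explain why the configuration enables quantization.
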
### Multiple Index Strategy

Different use cases require different index configurations:

| Index Name | Vectors | Dim | Distance | Use Case |
|------------|---------|-----|----------|----------|
| `env_fingerprint` | 10K-1M | 329 | Cosine | Environment/room identification |
| `activity_pattern` | 1K-50K | 329 | Euclidean | Activity classification |
| `temporal_pattern` | 10K-500K | 329 | Cosine | Temporal anomaly detection |
| `survivor_track` | 100-10K | 329 | Cosine | MAT survivor re-identification |

### Similarity-Based Detection Enhancement

Replace fixed thresholds with distance-based confidence:

```rust
/// Enhanced detection using HNSW similarity search
pub struct SimilarityDetector {
    /// HNSW index of known human-present CSI patterns
    human_patterns: HnswIndex,

    /// HNSW index of known empty-room CSI patterns
    empty_patterns: HnswIndex,

    /// Fusion weight between similarity and threshold methods
    fusion_alpha: f64,  // 0.7 = 70% similarity, 30% threshold
}

impl SimilarityDetector {
    /// Detect human presence using similarity search + threshold fusion
    pub fn detect(&self, features: &CsiFeatures) -> DetectionResult {
        let query_vec = features.to_rvf_vector();

        // Search both indices for the 5 nearest neighbors
        let human_neighbors = self.human_patterns.search(&query_vec, 5);
        let empty_neighbors = self.empty_patterns.search(&query_vec, 5);

        // Distance-based confidence
        let avg_human_dist = human_neighbors.mean_distance();
        let avg_empty_dist = empty_neighbors.mean_distance();

        // Similarity confidence: how much closer to human patterns vs empty.
        // Falls back to 0.5 (undecided) when both mean distances are zero.
        let total_dist = avg_human_dist + avg_empty_dist;
        let similarity_confidence = if total_dist > 0.0 {
            avg_empty_dist / total_dist
        } else {
            0.5
        };

        // Fuse with traditional threshold-based confidence
        let threshold_confidence = self.traditional_threshold_detect(features);
        let fused_confidence = self.fusion_alpha * similarity_confidence
            + (1.0 - self.fusion_alpha) * threshold_confidence;

        DetectionResult {
            human_detected: fused_confidence > 0.5,
            confidence: fused_confidence,
            similarity_confidence,
            threshold_confidence,
            // Indexing [0] assumes both indices hold at least one pattern
            nearest_human_pattern: human_neighbors[0].metadata.clone(),
            nearest_empty_pattern: empty_neighbors[0].metadata.clone(),
        }
    }
}
```

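The fusion arithmetic can be checked in isolation. The sketch below is a standalone illustration, not part of the RuVector API: `similarity_confidence` and `fuse_confidence` are hypothetical helper names, and clamping `alpha` to [0, 1] is an added assumption.

```rust
/// Confidence that a sample is "human", derived from mean neighbor distances.
/// Returns 0.5 when both distances are zero (no evidence either way).
pub fn similarity_confidence(avg_human_dist: f64, avg_empty_dist: f64) -> f64 {
    let total = avg_human_dist + avg_empty_dist;
    if total > 0.0 {
        avg_empty_dist / total
    } else {
        0.5
    }
}

/// Weighted blend of similarity and threshold confidences.
/// `alpha` is clamped to [0, 1]; 0.0 means pure threshold detection.
pub fn fuse_confidence(similarity: f64, threshold: f64, alpha: f64) -> f64 {
    let a = alpha.clamp(0.0, 1.0);
    a * similarity + (1.0 - a) * threshold
}
```

With mean distances of 0.2 (human) and 0.6 (empty), the similarity confidence is 0.75; fusing it with a threshold confidence of 0.4 at alpha = 0.7 gives roughly 0.645, which clears the 0.5 decision boundary.
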
### Incremental Learning Loop

Every confirmed detection enriches the index:

```
1. CSI captured → features extracted → vector encoded
2. HNSW search returns top-K neighbors + distances
3. Detection decision made (similarity + threshold fusion)
4. If confirmed (by temporal consistency or ground truth):
   a. Insert vector into appropriate index (human/empty)
   b. GNN layer updates neighbor relationships (ADR-006)
   c. SONA adapts fusion weights (ADR-005)
5. Periodically: prune stale vectors, rebuild index layers
```

### Performance Analysis

**Memory requirements** (PQ8 quantization):

| Vector Count | Raw Size | PQ8 Compressed | HNSW Overhead | Total |
|-------------|----------|----------------|---------------|-------|
| 10,000 | 12.9 MB | 1.6 MB | 2.5 MB | 4.1 MB |
| 100,000 | 129 MB | 16 MB | 25 MB | 41 MB |
| 1,000,000 | 1.29 GB | 160 MB | 250 MB | 410 MB |

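The raw and compressed columns follow from simple arithmetic. The sketch below assumes 4-byte `f32` components and treats the 8x figure as a plain division; both function names are hypothetical. For 10,000 vectors of 329 dimensions it gives about 13.2 MB raw and 1.6 MB compressed, close to the first table row (the small gap in the raw column presumably comes from rounding or unit conventions).

```rust
/// Bytes needed to store `n` raw f32 vectors of dimension `dim`.
pub fn raw_bytes(n: u64, dim: u64) -> u64 {
    n * dim * 4
}

/// Bytes after compression by an integer ratio (8 for the PQ8 column).
pub fn compressed_bytes(raw: u64, ratio: u64) -> u64 {
    raw / ratio
}
```

The HNSW overhead column is not derived here, since it depends on graph parameters that this section does not specify.
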
**Latency expectations** (329-dim vectors, ef_search=64):

| Vector Count | Brute Force | HNSW | Speedup |
|-------------|-------------|------|---------|
| 10,000 | 3.2 ms | 0.08 ms | 40x |
| 100,000 | 32 ms | 0.3 ms | 107x |
| 1,000,000 | 320 ms | 0.9 ms | 356x |

### Hyperbolic Extension for Activity Hierarchy

WiFi-sensed activities have natural hierarchy:

```
                motion
               /      \
      locomotion      stationary
       /      \        /      \
  walking  running  sitting  lying
   /    \
normal  shuffling
```

Hyperbolic HNSW in Poincaré ball space preserves this hierarchy during search, so a query for "shuffling" returns "walking" before "sitting" even if Euclidean distances are similar.

```rust
/// Hyperbolic HNSW for hierarchy-aware activity matching
pub struct HyperbolicActivityIndex {
    index: HnswIndex,
    curvature: f64,  // -1.0 for unit Poincaré ball
}

impl HyperbolicActivityIndex {
    pub fn search(&self, query: &[f32], k: usize) -> Vec<SearchResult> {
        // Uses Poincaré distance: d(u,v) = arcosh(1 + 2||u-v||²/((1-||u||²)(1-||v||²)))
        self.index.search_hyperbolic(query, k, self.curvature)
    }
}
```

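The Poincaré distance in the comment above can be written out directly. This is a minimal sketch for curvature -1 only; `poincare_distance` is a hypothetical free function, not the RuVector implementation, and returning `None` for points on or outside the unit ball is an assumed error-handling choice.

```rust
/// Poincaré-ball distance for curvature -1:
/// d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
/// Returns None if the inputs differ in length or a point is not strictly
/// inside the unit ball.
pub fn poincare_distance(u: &[f64], v: &[f64]) -> Option<f64> {
    if u.len() != v.len() {
        return None;
    }
    let sq_norm = |x: &[f64]| x.iter().map(|a| a * a).sum::<f64>();
    let (nu, nv) = (sq_norm(u), sq_norm(v));
    if nu >= 1.0 || nv >= 1.0 {
        return None;
    }
    let diff_sq: f64 = u.iter().zip(v).map(|(a, b)| (a - b) * (a - b)).sum();
    Some((1.0 + 2.0 * diff_sq / ((1.0 - nu) * (1.0 - nv))).acosh())
}
```

Distances grow rapidly near the boundary of the ball, which is what lets leaf activities sit far apart while staying close to their shared parent.
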
## Consequences

### Positive
- **Adaptive detection**: System improves with more data; no manual threshold tuning
- **Sub-millisecond search**: HNSW provides <1ms queries even at 1M vectors
- **Memory efficient**: PQ8 reduces storage 8x with <5% recall loss
- **Hierarchy-aware**: Hyperbolic mode respects activity relationships
- **Incremental**: New patterns added without full index rebuild
- **Explainable**: "This detection matched pattern X from room Y at time Z"

### Negative
- **Cold-start problem**: Need initial fingerprint data before similarity search is useful
- **Index maintenance**: Periodic pruning and layer rebalancing needed
- **Approximation**: HNSW is approximate; may miss exact nearest neighbor (mitigated by high ef_search)
- **Memory for indices**: The HNSW graph adds roughly 1.5x the PQ8-compressed vector size, bringing the total to about 2.5x the compressed vectors (see the memory table)

### Migration Strategy

1. **Phase 1**: Run HNSW search in parallel with existing threshold detection, log both results
2. **Phase 2**: A/B test fusion weights (alpha parameter) on labeled data
3. **Phase 3**: Gradually increase fusion_alpha from 0.0 (pure threshold) to 0.7 (primarily similarity)
4. **Phase 4**: Threshold detection becomes fallback for cold-start/empty-index scenarios

## References

- [HNSW: Efficient and Robust Approximate Nearest Neighbor](https://arxiv.org/abs/1603.09320)
- [Product Quantization for Nearest Neighbor Search](https://hal.inria.fr/inria-00514462)
- [Poincaré Embeddings for Learning Hierarchical Representations](https://arxiv.org/abs/1705.08039)
- [RuVector HNSW Implementation](https://github.com/ruvnet/ruvector)
- ADR-003: RVF Cognitive Containers for CSI Data

253
docs/adr/ADR-005-sona-self-learning-pose-estimation.md
Normal file
@@ -0,0 +1,253 @@
# ADR-005: SONA Self-Learning for Pose Estimation

## Status
Proposed

## Date
2026-02-28

## Context

### Static Model Problem

The WiFi-DensePose modality translation network (`ModalityTranslationNetwork` in Python, `ModalityTranslator` in Rust) converts CSI features into visual-like feature maps that feed the DensePose head for body segmentation and UV coordinate estimation. These models are trained offline and deployed with frozen weights.

**Critical limitations of static models**:

1. **Environment drift**: CSI characteristics change when furniture moves, new objects are introduced, or building occupancy changes. A model trained in Lab A degrades in Lab B without retraining.

2. **Hardware variance**: Different WiFi chipsets (Intel AX200 vs Broadcom BCM4375 vs Qualcomm WCN6855) produce subtly different CSI patterns. Static models overfit to training hardware.

3. **Temporal drift**: Even in the same environment, CSI patterns shift with temperature, humidity, and electromagnetic interference changes throughout the day.

4. **Population bias**: Models trained on one demographic may underperform on body types, heights, or movement patterns not represented in training data.

Current mitigation: manual retraining with new data, which requires:
- Collecting labeled data in the new environment
- GPU-intensive training (hours to days)
- Model export/deployment cycle
- Downtime during switchover

### SONA Opportunity

RuVector's Self-Optimizing Neural Architecture (SONA) provides <1ms online adaptation through:

- **LoRA (Low-Rank Adaptation)**: Instead of updating all weights (millions of parameters), LoRA injects small trainable rank decomposition matrices into frozen model layers. For a weight matrix W ∈ R^(d×k), LoRA learns A ∈ R^(d×r) and B ∈ R^(r×k) where r << min(d,k), so the adapted weight is W + AB.

- **EWC++ (Elastic Weight Consolidation)**: Prevents catastrophic forgetting by penalizing changes to parameters important for previously learned tasks. Each parameter has a Fisher information-weighted importance score.

- **Online gradient accumulation**: Small batches of live data (as few as 1-10 samples) contribute to adaptation without full backward passes.

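The LoRA update described above amounts to adding a scaled low-rank product to the frozen weights. The following is a minimal dense-matrix sketch of W + α·AB; the `Matrix` alias and `lora_effective_weight` are hypothetical names, and a real implementation would operate on convolution kernels and avoid materializing the full matrix.

```rust
/// Row-major dense matrix; a hypothetical helper type for this sketch.
pub type Matrix = Vec<Vec<f64>>;

/// Computes W + alpha * (A x B), where W is d x k, A is d x r, and B is r x k.
/// Returns None if the shapes do not line up.
pub fn lora_effective_weight(w: &Matrix, a: &Matrix, b: &Matrix, alpha: f64) -> Option<Matrix> {
    let d = w.len();
    let k = w.first()?.len();
    let r = b.len();
    if a.len() != d || r == 0 {
        return None;
    }
    if w.iter().any(|row| row.len() != k)
        || a.iter().any(|row| row.len() != r)
        || b.iter().any(|row| row.len() != k)
    {
        return None;
    }
    let mut out = w.clone();
    for i in 0..d {
        for j in 0..k {
            // (A x B)[i][j] is a dot product over the rank dimension.
            let delta: f64 = (0..r).map(|t| a[i][t] * b[t][j]).sum();
            out[i][j] += alpha * delta;
        }
    }
    Some(out)
}
```

Because only A and B are trained, the number of trainable values is r·(d + k) rather than d·k.
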
## Decision

We will integrate SONA as the online learning engine for both the modality translation network and the DensePose head, enabling continuous environment-specific adaptation without offline retraining.

### Adaptation Architecture

```
┌──────────────────────────────────────────────────────────────────────┐
│                       SONA Adaptation Pipeline                       │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Frozen Base Model                    LoRA Adaptation Matrices       │
│  ┌─────────────────┐                  ┌──────────────────────┐       │
│  │ Conv2d(64,128)  │ ◀── W_frozen ──▶ │ A(64,r) × B(r,128)   │       │
│  │ Conv2d(128,256) │ ◀── W_frozen ──▶ │ A(128,r) × B(r,256)  │       │
│  │ Conv2d(256,512) │ ◀── W_frozen ──▶ │ A(256,r) × B(r,512)  │       │
│  │ ConvT(512,256)  │ ◀── W_frozen ──▶ │ A(512,r) × B(r,256)  │       │
│  │ ...             │                  │ ...                  │       │
│  └─────────────────┘                  └──────────────────────┘       │
│           │                                       │                  │
│           ▼                                       ▼                  │
│  ┌─────────────────────────────────────────────────────────┐         │
│  │ Effective Weight = W_frozen + α(AB)                     │         │
│  │ α = scaling factor (warms up from 0.0 to its target)    │         │
│  └─────────────────────────────────────────────────────────┘         │
│                               │                                      │
│                               ▼                                      │
│  ┌─────────────────────────────────────────────────────────┐         │
│  │ EWC++ Regularizer                                       │         │
│  │ L_total = L_task + λ Σ F_i (θ_i - θ*_i)²                │         │
│  │                                                         │         │
│  │ F_i = Fisher information (parameter importance)         │         │
│  │ θ*_i = optimal parameters from previous tasks           │         │
│  │ λ = regularization strength (10-100)                    │         │
│  └─────────────────────────────────────────────────────────┘         │
└──────────────────────────────────────────────────────────────────────┘
```

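The regularizer in the diagram, L_total = L_task + λ Σ F_i (θ_i - θ*_i)², can be sketched as a plain function over flattened parameter slices. The names `ewc_penalty` and `total_loss` are hypothetical, and this shows only the basic EWC penalty; any additional bookkeeping that distinguishes EWC++ (such as running Fisher estimates) is not covered.

```rust
/// EWC penalty: lambda * sum_i F_i * (theta_i - theta_star_i)^2.
/// All three slices must have the same length.
pub fn ewc_penalty(theta: &[f64], theta_star: &[f64], fisher: &[f64], lambda: f64) -> f64 {
    assert!(theta.len() == theta_star.len() && theta.len() == fisher.len());
    let sum: f64 = theta
        .iter()
        .zip(theta_star)
        .zip(fisher)
        .map(|((t, s), f)| f * (t - s) * (t - s))
        .sum();
    lambda * sum
}

/// Total loss as shown in the diagram: task loss plus the EWC penalty.
pub fn total_loss(task_loss: f64, penalty: f64) -> f64 {
    task_loss + penalty
}
```

Parameters with a large Fisher value are expensive to move, so adaptation is steered toward parameters that mattered less for earlier environments.
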
### LoRA Configuration per Layer

```rust
/// SONA LoRA configuration for WiFi-DensePose
pub struct SonaConfig {
    /// LoRA rank (r): dimensionality of adaptation matrices
    /// r=4 for encoder layers (less variation needed)
    /// r=8 for decoder layers (more expression needed)
    /// r=16 for final output layers (maximum adaptability)
    lora_ranks: HashMap<String, usize>,

    /// Scaling factor alpha: controls adaptation strength
    /// Starts at 0.0 (pure frozen model), increases to target
    alpha: f64,  // Target: 0.3

    /// Alpha warmup steps before reaching target
    alpha_warmup_steps: usize,  // 100

    /// EWC++ regularization strength
    ewc_lambda: f64,  // 50.0

    /// Fisher information estimation samples
    fisher_samples: usize,  // 200

    /// Online learning rate (much smaller than offline training)
    online_lr: f64,  // 1e-5

    /// Gradient accumulation steps before applying update
    accumulation_steps: usize,  // 10

    /// Maximum adaptation delta (safety bound)
    max_delta_norm: f64,  // 0.1
}
```

**Parameter budget**:

| Layer | Original Params | LoRA Rank | LoRA Params | Overhead |
|-------|----------------|-----------|-------------|----------|
| Encoder Conv1 (64→128) | 73,728 | 4 | 768 | 1.0% |
| Encoder Conv2 (128→256) | 294,912 | 4 | 1,536 | 0.5% |
| Encoder Conv3 (256→512) | 1,179,648 | 4 | 3,072 | 0.3% |
| Decoder ConvT1 (512→256) | 1,179,648 | 8 | 6,144 | 0.5% |
| Decoder ConvT2 (256→128) | 294,912 | 8 | 3,072 | 1.0% |
| Output Conv (128→24) | 27,648 | 16 | 2,432 | 8.8% |
| **Total** | **3,050,496** | - | **17,024** | **0.56%** |

SONA adapts **0.56% of parameters**. The target is to recover 70-90% of the accuracy improvement of full fine-tuning; this figure is an expectation to be confirmed by the validation plan, not a measured result.

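The parameter budget can be reproduced with two small formulas. The sketch below assumes each LoRA pair costs r·(in + out) values and that the "Original Params" column counts 3×3 kernel weights without biases, which is what the table's numbers appear to imply; the function and constant names are hypothetical.

```rust
/// Trainable parameters for one LoRA pair: A is d_in x r, B is r x d_out.
pub fn lora_params(d_in: usize, d_out: usize, rank: usize) -> usize {
    rank * (d_in + d_out)
}

/// Weights in a 3x3 convolution, ignoring biases (assumed kernel size).
pub fn conv3x3_params(c_in: usize, c_out: usize) -> usize {
    c_in * c_out * 9
}

/// (in_channels, out_channels, LoRA rank) for the six layers in the table.
pub const LAYERS: [(usize, usize, usize); 6] = [
    (64, 128, 4),
    (128, 256, 4),
    (256, 512, 4),
    (512, 256, 8),
    (256, 128, 8),
    (128, 24, 16),
];

/// Returns (total original parameters, total LoRA parameters).
pub fn budget() -> (usize, usize) {
    LAYERS.iter().fold((0, 0), |(orig, lora), &(i, o, r)| {
        (orig + conv3x3_params(i, o), lora + lora_params(i, o, r))
    })
}
```

Summing the six layers gives 3,050,496 original and 17,024 LoRA parameters, matching the totals row; at 4 bytes per value the LoRA deltas come to roughly 68 KB.
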
### Adaptation Trigger Conditions

```rust
/// When to trigger SONA adaptation
pub enum AdaptationTrigger {
    /// Detection confidence drops below threshold over N samples
    ConfidenceDrop {
        threshold: f64,      // 0.6
        window_size: usize,  // 50
    },

    /// CSI statistics drift beyond baseline (KL divergence)
    DistributionDrift {
        kl_threshold: f64,        // 0.5
        reference_window: usize,  // 1000
    },

    /// New environment detected (no close HNSW matches)
    NewEnvironment {
        min_distance: f64,  // 0.8 (far from all known fingerprints)
    },

    /// Periodic adaptation (maintenance)
    Periodic {
        interval_samples: usize,  // 10000
    },

    /// Manual trigger via API
    Manual,
}
```

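As one possible reading of the `ConfidenceDrop` trigger, the monitor below fires when the mean confidence over a full sliding window falls below the threshold. `ConfidenceMonitor` is a hypothetical helper, and using the window mean (rather than, say, requiring every sample to be low) is an assumption.

```rust
use std::collections::VecDeque;

/// Sliding-window monitor for a confidence-drop trigger (hypothetical helper).
pub struct ConfidenceMonitor {
    window: VecDeque<f64>,
    window_size: usize,
    threshold: f64,
}

impl ConfidenceMonitor {
    pub fn new(threshold: f64, window_size: usize) -> Self {
        Self {
            window: VecDeque::with_capacity(window_size),
            window_size,
            threshold,
        }
    }

    /// Records one confidence value; returns true when the window is full
    /// and its mean is below the threshold.
    pub fn observe(&mut self, confidence: f64) -> bool {
        if self.window_size == 0 {
            return false;
        }
        if self.window.len() == self.window_size {
            self.window.pop_front();
        }
        self.window.push_back(confidence);
        if self.window.len() < self.window_size {
            return false;
        }
        let mean = self.window.iter().sum::<f64>() / self.window.len() as f64;
        mean < self.threshold
    }
}
```

Waiting for a full window avoids triggering adaptation on the first few noisy samples after startup.
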
### Adaptation Feedback Sources

Since WiFi-DensePose lacks camera ground truth in deployment, adaptation uses **self-supervised signals**:

1. **Temporal consistency**: Pose estimates should change smoothly between frames. Jerky transitions indicate prediction error.
   ```
   L_temporal = ||pose(t) - pose(t-1)||² when Δt < 100ms
   ```

2. **Physical plausibility**: Body part positions must satisfy skeletal constraints (limb lengths, joint angles).
   ```
   L_skeleton = Σ max(0, |limb_length - expected_length| - tolerance)
   ```

3. **Multi-view agreement** (multi-AP): Different APs observing the same person should produce consistent poses.
   ```
   L_multiview = ||pose_AP1 - transform(pose_AP2)||²
   ```

4. **Detection stability**: Confidence should be high when the environment is stable.
   ```
   L_stability = -log(confidence) when variance(CSI_window) < threshold
   ```

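The first two feedback signals translate directly into code. This sketch treats a pose as a flat slice of joint coordinates and limbs as a slice of measured lengths; both representations and the function names are assumptions made for illustration.

```rust
/// Squared L2 distance between consecutive poses (flattened joint coordinates).
/// Contributes only when the frames are less than 100 ms apart; returns 0.0
/// otherwise, or when the two poses have different lengths.
pub fn temporal_loss(pose_t: &[f64], pose_prev: &[f64], dt_ms: f64) -> f64 {
    if dt_ms >= 100.0 || pose_t.len() != pose_prev.len() {
        return 0.0;
    }
    pose_t
        .iter()
        .zip(pose_prev)
        .map(|(a, b)| (a - b) * (a - b))
        .sum()
}

/// Hinge penalty on limb lengths: deviations within `tolerance` cost nothing.
pub fn skeleton_loss(limb_lengths: &[f64], expected: &[f64], tolerance: f64) -> f64 {
    limb_lengths
        .iter()
        .zip(expected)
        .map(|(l, e)| ((l - e).abs() - tolerance).max(0.0))
        .sum()
}
```

Both losses are zero for a smooth, anatomically plausible sequence, so they only push the adapters when predictions become implausible.
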
### Safety Mechanisms

```rust
/// Safety bounds prevent adaptation from degrading the model
pub struct AdaptationSafety {
    /// Maximum parameter change per update step
    max_step_norm: f64,

    /// Rollback if validation loss increases by this factor
    rollback_threshold: f64,  // 1.5 (50% worse = rollback)

    /// Keep N checkpoints for rollback
    checkpoint_count: usize,  // 5

    /// Disable adaptation after N consecutive rollbacks
    max_consecutive_rollbacks: usize,  // 3

    /// Minimum samples between adaptations
    cooldown_samples: usize,  // 100
}
```

### Persistence via RVF

Adaptation state is stored in the Model Container (ADR-003):
- LoRA matrices A and B serialized to VEC segment
- Fisher information matrix serialized alongside
- Each adaptation creates a witness chain entry (ADR-010)
- COW branching allows reverting to any previous adaptation state

```
model.rvf.model
├── main (frozen base weights)
├── branch/adapted-office-2024-01 (LoRA deltas)
├── branch/adapted-warehouse (LoRA deltas)
└── branch/adapted-outdoor-disaster (LoRA deltas)
```

## Consequences

### Positive
- **Zero-downtime adaptation**: Model improves continuously during operation
- **Tiny overhead**: 17K parameters (0.56%) vs 3M full model; <1ms per adaptation step
- **Reduced forgetting**: EWC++ penalizes drift in parameters that matter for previously seen environments
- **Portable adaptations**: LoRA deltas are ~70 KB, easily shared between devices
- **Safe rollback**: Checkpoint system prevents runaway degradation
- **Self-supervised**: No labeled data needed during deployment

### Negative
- **Bounded expressiveness**: LoRA rank limits the degree of adaptation; extreme environment changes may require offline retraining
- **Feedback noise**: Self-supervised signals are weaker than ground-truth labels; adaptation is slower and less precise
- **Compute on device**: Even small gradient computations require tensor math on the inference device
- **Complexity**: Debugging adapted models is harder than static models
- **Hyperparameter sensitivity**: EWC lambda, LoRA rank, learning rate require tuning

### Validation Plan

1. **Offline validation**: Train base model on Environment A, test SONA adaptation to Environment B with known ground truth. Measure pose estimation MPJPE (Mean Per-Joint Position Error) improvement.
2. **A/B deployment**: Run static model and SONA-adapted model in parallel on same CSI stream. Compare detection rates and pose consistency.
3. **Stress test**: Rapidly change environments (simulated) and verify EWC++ prevents catastrophic forgetting.
4. **Edge latency**: Benchmark adaptation step on target hardware (Raspberry Pi 4, Jetson Nano, browser WASM).

## References

- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
- [Elastic Weight Consolidation (EWC)](https://arxiv.org/abs/1612.00796)
- [Continual Learning with SONA](https://github.com/ruvnet/ruvector)
- [Self-Supervised WiFi Sensing](https://arxiv.org/abs/2203.11928)
- ADR-002: RuVector RVF Integration Strategy
- ADR-003: RVF Cognitive Containers for CSI Data

261
docs/adr/ADR-006-gnn-enhanced-csi-pattern-recognition.md
Normal file
@@ -0,0 +1,261 @@
# ADR-006: GNN-Enhanced CSI Pattern Recognition

## Status
Proposed

## Date
2026-02-28

## Context

### Limitations of Independent Vector Search

ADR-004 introduces HNSW-based similarity search for CSI pattern matching. While HNSW provides fast nearest-neighbor retrieval, it treats each vector independently. CSI patterns, however, have rich relational structure:

1. **Temporal adjacency**: CSI frames captured 10ms apart are more related than frames 10s apart. Sequential patterns reveal motion trajectories.

2. **Spatial correlation**: CSI readings from adjacent subcarriers are highly correlated due to frequency proximity. Antenna pairs capture different spatial perspectives.

3. **Cross-session similarity**: The "walking to kitchen" pattern from Tuesday should inform Wednesday's recognition, but the environment baseline may have shifted.

4. **Multi-person entanglement**: When multiple people are present, CSI patterns are superpositions. Disentangling requires understanding which pattern fragments co-occur.

Standard HNSW cannot capture these relationships. Each query returns neighbors based solely on vector distance, ignoring the graph structure of how patterns relate to each other.

### RuVector's GNN Enhancement

RuVector implements a Graph Neural Network layer that sits on top of the HNSW index:

```
Standard HNSW:  Query → Distance-based neighbors → Results
GNN-Enhanced:   Query → Distance-based neighbors → GNN refinement → Improved results
```

The GNN performs three operations in <1ms:
1. **Message passing**: Each node aggregates information from its HNSW neighbors
2. **Attention weighting**: Multi-head attention identifies which neighbors are most relevant for the current query context
3. **Representation update**: Node embeddings are refined based on neighborhood context

Additionally, **temporal learning** tracks query sequences to discover:
- Vectors that frequently appear together in sessions
- Temporal ordering patterns (A usually precedes B)
- Session context that changes relevance rankings

## Decision

We will integrate RuVector's GNN layer to enhance CSI pattern recognition with three core capabilities: relational search, temporal sequence modeling, and multi-person disentanglement.

### GNN Architecture for CSI

```
┌─────────────────────────────────────────────────────────────────────┐
│                   GNN-Enhanced CSI Pattern Graph                    │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  Layer 1: HNSW Spatial Graph                                        │
│  ┌───────────────────────────────────────────────────────┐          │
│  │  Nodes = CSI feature vectors                          │          │
│  │  Edges = HNSW neighbor connections (distance-based)   │          │
│  │  Node features = [amplitude | phase | doppler | PSD]  │          │
│  └───────────────────────────────────────────────────────┘          │
│                              │                                      │
│                              ▼                                      │
│  Layer 2: Temporal Edges                                            │
│  ┌───────────────────────────────────────────────────────┐          │
│  │  Additional edges between temporally adjacent vectors │          │
│  │  Edge weight = 1/Δt (closer in time = stronger)       │          │
│  │  Direction = causal (past → future)                   │          │
│  └───────────────────────────────────────────────────────┘          │
│                              │                                      │
│                              ▼                                      │
│  Layer 3: GNN Message Passing (2 rounds)                            │
│  ┌───────────────────────────────────────────────────────┐          │
│  │  Round 1: h_i = σ(W₁·h_i + Σⱼ α_ij · W₂·h_j)          │          │
│  │  Round 2: h_i = σ(W₃·h_i + Σⱼ α'_ij · W₄·h_j)         │          │
│  │  α_ij = softmax(LeakyReLU(a^T[W·h_i || W·h_j]))       │          │
│  │  (Graph Attention Network mechanism)                  │          │
│  └───────────────────────────────────────────────────────┘          │
│                              │                                      │
│                              ▼                                      │
│  Layer 4: Refined Representations                                   │
│  ┌───────────────────────────────────────────────────────┐          │
│  │  Updated vectors incorporate neighborhood context     │          │
│  │  Re-rank search results using refined distances       │          │
│  └───────────────────────────────────────────────────────┘          │
└─────────────────────────────────────────────────────────────────────┘
```

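The attention step in Layer 3 can be illustrated on its own. The sketch below takes the raw scores e_ij = a^T[W·h_i || W·h_j] as already computed, applies LeakyReLU and a softmax over one node's neighbors, then forms the weighted sum Σⱼ α_ij · (W·h_j). All function names are hypothetical, and the negative slope is a free parameter here (0.2 is a common choice in the GAT literature, not a value taken from RuVector).

```rust
/// LeakyReLU with the given negative slope.
pub fn leaky_relu(x: f64, slope: f64) -> f64 {
    if x >= 0.0 {
        x
    } else {
        slope * x
    }
}

/// Attention coefficients over one node's neighbors:
/// alpha_ij = softmax_j(LeakyReLU(e_ij)), with e_ij supplied by the caller.
pub fn attention_weights(raw_scores: &[f64], slope: f64) -> Vec<f64> {
    let activated: Vec<f64> = raw_scores.iter().map(|&e| leaky_relu(e, slope)).collect();
    // Subtract the maximum before exponentiating for numerical stability.
    let max = activated.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = activated.iter().map(|&x| (x - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|&x| x / sum).collect()
}

/// Weighted sum of already-transformed neighbor vectors: sum_j alpha_ij * (W h_j).
pub fn aggregate(weights: &[f64], neighbors: &[Vec<f64>]) -> Vec<f64> {
    let dim = neighbors.first().map_or(0, |n| n.len());
    let mut out = vec![0.0; dim];
    for (w, n) in weights.iter().zip(neighbors) {
        for (o, x) in out.iter_mut().zip(n) {
            *o += w * x;
        }
    }
    out
}
```

The coefficients always sum to 1 across a node's neighbors, so a neighbor with a higher score pulls the refined representation further toward itself.
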
### Three Integration Modes

#### Mode 1: Query-Time Refinement (Default)

GNN refines HNSW results after retrieval. No modifications to stored vectors.

```rust
pub struct GnnQueryRefiner {
    /// GNN weights (small: ~50K parameters)
    gnn_weights: GnnModel,

    /// Number of message passing rounds
    num_rounds: usize,  // 2

    /// Attention heads for neighbor weighting
    num_heads: usize,  // 4

    /// How many HNSW neighbors to consider in GNN
    neighborhood_size: usize,  // 20 (retrieve 20, GNN selects best 5)
}

impl GnnQueryRefiner {
    /// Refine HNSW results using graph context
    pub fn refine(&self, query: &[f32], hnsw_results: &[SearchResult]) -> Vec<SearchResult> {
        // Build local subgraph from query + HNSW results
        let subgraph = self.build_local_subgraph(query, hnsw_results);

        // Run message passing
        let refined = self.message_pass(&subgraph, self.num_rounds);

        // Re-rank based on refined representations
        self.rerank(query, &refined)
    }
}
```

**Latency**: +0.2ms on top of HNSW search (total <1.5ms for 100K vectors).

#### Mode 2: Temporal Sequence Recognition

Tracks CSI vector sequences to recognize activity patterns that span multiple frames:

```rust
/// Temporal pattern recognizer using GNN edges
pub struct TemporalPatternRecognizer {
    /// Sliding window of recent query vectors
    window: VecDeque<TimestampedVector>,

    /// Maximum window size (in frames)
    max_window: usize,  // 100 (10 seconds at 10 Hz)

    /// Temporal edge decay factor
    decay: f64,  // 0.95 (edges weaken with time)

    /// Known activity sequences (learned from data)
    activity_templates: HashMap<String, Vec<Vec<f32>>>,
}

impl TemporalPatternRecognizer {
    /// Feed new CSI vector and check for activity pattern matches
    pub fn observe(&mut self, vector: &[f32], timestamp: f64) -> Vec<ActivityMatch> {
        self.window.push_back(TimestampedVector { vector: vector.to_vec(), timestamp });

        // Drop the oldest frames so the window never exceeds max_window
        while self.window.len() > self.max_window {
            self.window.pop_front();
        }

        // Build temporal subgraph from window
        let temporal_graph = self.build_temporal_graph();

        // GNN aggregates temporal context
        let sequence_embedding = self.gnn_aggregate(&temporal_graph);

        // Match against known activity templates
        self.match_activities(&sequence_embedding)
    }
}
```

**Activity patterns detectable**:

| Activity | Frames Needed | CSI Signature |
|----------|--------------|---------------|
| Walking | 10-30 | Periodic Doppler oscillation |
| Falling | 5-15 | Sharp amplitude spike → stillness |
| Sitting down | 10-20 | Gradual descent in reflection height |
| Breathing (still) | 30-100 | Micro-periodic phase variation |
| Gesture (wave) | 5-15 | Localized high-frequency amplitude variation |

#### Mode 3: Multi-Person Disentanglement

When N>1 people are present, CSI is a superposition. The GNN learns to cluster pattern fragments:

```rust
/// Multi-person CSI disentanglement using GNN clustering
pub struct MultiPersonDisentangler {
    /// Maximum expected simultaneous persons
    max_persons: usize,  // 10

    /// GNN-based spectral clustering
    cluster_gnn: GnnModel,

    /// Per-person tracking state
    person_tracks: Vec<PersonTrack>,
}

impl MultiPersonDisentangler {
    /// Separate CSI features into per-person components
    pub fn disentangle(&mut self, features: &CsiFeatures) -> Vec<PersonFeatures> {
        // Decompose CSI into subcarrier groups using GNN attention
        let subcarrier_graph = self.build_subcarrier_graph(features);

        // GNN clusters subcarriers by person contribution
        let clusters = self.cluster_gnn.cluster(&subcarrier_graph, self.max_persons);

        // Extract per-person features from clustered subcarriers
        clusters.iter().map(|c| self.extract_person_features(features, c)).collect()
    }
}
```

### GNN Learning Loop

The GNN improves with every query through RuVector's built-in learning:

```
Query → HNSW retrieval → GNN refinement → User action (click/confirm/reject)
                                                │
                                                ▼
                                      Update GNN weights via:
                                      1. Positive: confirmed results get higher attention
                                      2. Negative: rejected results get lower attention
                                      3. Temporal: successful sequences reinforce edges
```

For WiFi-DensePose, "user action" is replaced by:
- **Temporal consistency**: If frame N+1 confirms frame N's detection, reinforce
- **Multi-AP agreement**: If two APs agree on detection, reinforce both
- **Physical plausibility**: If pose satisfies skeletal constraints, reinforce

### Performance Budget

| Component | Parameters | Memory | Latency (per query) |
|-----------|-----------|--------|-------------------|
| GNN weights (2 layers, 4 heads) | 52K | 208 KB | 0.15 ms |
| Temporal graph (100-frame window) | N/A | ~130 KB | 0.05 ms |
| Multi-person clustering | 18K | 72 KB | 0.3 ms |
| **Total GNN overhead** | **70K** | **410 KB** | **0.5 ms** |

## Consequences

### Positive
- **Context-aware search**: Results account for temporal and spatial relationships, not just vector distance
- **Activity recognition**: Temporal GNN enables sequence-level pattern matching
- **Multi-person support**: GNN clustering separates overlapping CSI patterns
- **Self-improving**: Every query provides learning signal to refine attention weights
- **Lightweight**: 70K parameters, 410 KB memory, 0.5ms latency overhead

### Negative
- **Training data needed**: GNN weights require initial training on CSI pattern graphs
- **Complexity**: Three modes increase testing and debugging surface
- **Graph maintenance**: Temporal edges must be pruned to prevent unbounded growth
- **Approximation**: GNN clustering for multi-person is approximate; may merge/split incorrectly

### Interaction with Other ADRs
- **ADR-004** (HNSW): GNN operates on HNSW graph structure; depends on HNSW being available
- **ADR-005** (SONA): GNN weights can be adapted via SONA LoRA for environment-specific tuning
- **ADR-003** (RVF): GNN weights stored in model container alongside inference weights
- **ADR-010** (Witness): GNN weight updates recorded in witness chain

## References

- [Graph Attention Networks (GAT)](https://arxiv.org/abs/1710.10903)
- [Temporal Graph Networks](https://arxiv.org/abs/2006.10637)
- [Spectral Clustering with Graph Neural Networks](https://arxiv.org/abs/1907.00481)
- [WiFi-based Multi-Person Sensing](https://dl.acm.org/doi/10.1145/3534592)
- [RuVector GNN Implementation](https://github.com/ruvnet/ruvector)
- ADR-004: HNSW Vector Search for Signal Fingerprinting

215
docs/adr/ADR-007-post-quantum-cryptography-secure-sensing.md
Normal file
@@ -0,0 +1,215 @@
|
|||||||
|
# ADR-007: Post-Quantum Cryptography for Secure Sensing
|
||||||
|
|
||||||
|
## Status
|
||||||
|
Proposed
|
||||||
|
|
||||||
|
## Date
|
||||||
|
2026-02-28
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
### Threat Model
|
||||||
|
|
||||||
|
WiFi-DensePose processes data that can reveal:
|
||||||
|
- **Human presence/absence** in private spaces (surveillance risk)
|
||||||
|
- **Health indicators** via breathing/heartbeat detection (medical privacy)
|
||||||
|
- **Movement patterns** (behavioral profiling)
|
||||||
|
- **Building occupancy** (physical security intelligence)
|
||||||
|
|
||||||
|
In disaster scenarios (wifi-densepose-mat), the stakes are even higher:
|
||||||
|
- **Triage classifications** affect rescue priority (life-or-death decisions)
|
||||||
|
- **Survivor locations** are operationally sensitive
|
||||||
|
- **Detection audit trails** may be used in legal proceedings (liability)
|
||||||
|
- **False negatives** (missed survivors) could be forensically investigated
|
||||||
|
|
||||||
|
Current security: The system uses standard JWT (HS256) for API authentication and has no cryptographic protection on data at rest, model integrity, or detection audit trails.
|
||||||
|
|
||||||
|
### Quantum Threat Timeline
|
||||||
|
|
||||||
|
NIST estimates cryptographically relevant quantum computers could emerge by 2030-2035. Data protected today with classical public-key cryptography can be recorded now and decrypted retroactively once such machines exist ("harvest now, decrypt later"). For a system that may remain deployed in infrastructure for decades, post-quantum readiness is prudent.
|
||||||
|
|
||||||
|
### RuVector's Crypto Stack
|
||||||
|
|
||||||
|
RuVector provides a layered cryptographic system:
|
||||||
|
|
||||||
|
| Algorithm | Purpose | Standard | Quantum Resistant |
|-----------|---------|----------|-------------------|
| ML-DSA-65 | Digital signatures | FIPS 204 | Yes (lattice-based) |
| Ed25519 | Digital signatures | RFC 8032 | No (classical fallback) |
| SLH-DSA-128s | Digital signatures | FIPS 205 | Yes (hash-based) |
| SHAKE-256 | Hashing | FIPS 202 | Yes |
| AES-256-GCM | Symmetric encryption | FIPS 197 (AES), SP 800-38D (GCM) | Yes (Grover's algorithm halves effective key strength to ~128 bits) |
|
||||||
|
|
||||||
|
## Decision
|
||||||
|
|
||||||
|
We will integrate RuVector's cryptographic layer to provide defense-in-depth for WiFi-DensePose data, using a **hybrid classical+PQ** approach where both Ed25519 and ML-DSA-65 signatures are applied (belt-and-suspenders until PQ algorithms mature).
|
||||||
|
|
||||||
|
### Cryptographic Scope
|
||||||
|
|
||||||
|
```
|
||||||
|
┌──────────────────────────────────────────────────────────────────┐
|
||||||
|
│ Cryptographic Protection Layers │
|
||||||
|
├──────────────────────────────────────────────────────────────────┤
|
||||||
|
│ │
|
||||||
|
│ 1. MODEL INTEGRITY │
|
||||||
|
│ ┌─────────────────────────────────────────────────────┐ │
|
||||||
|
│ │ Model weights signed with ML-DSA-65 + Ed25519 │ │
|
||||||
|
│ │ Signature verified at load time → reject tampered │ │
|
||||||
|
│ │ SONA adaptations co-signed with device key │ │
|
||||||
|
│ └─────────────────────────────────────────────────────┘ │
|
||||||
|
│ │
|
||||||
|
│ 2. DATA AT REST (RVF containers) │
|
||||||
|
│ ┌─────────────────────────────────────────────────────┐ │
|
||||||
|
│ │ CSI vectors encrypted with AES-256-GCM │ │
|
||||||
|
│ │ Container integrity via SHAKE-256 Merkle tree │ │
|
||||||
|
│ │ Key management: per-container keys, sealed to device │ │
|
||||||
|
│ └─────────────────────────────────────────────────────┘ │
|
||||||
|
│ │
|
||||||
|
│ 3. DATA IN TRANSIT │
|
||||||
|
│ ┌─────────────────────────────────────────────────────┐ │
|
||||||
|
│ │ API: TLS 1.3 with PQ key exchange (ML-KEM-768) │ │
|
||||||
|
│ │ WebSocket: Same TLS channel │ │
|
||||||
|
│ │ Multi-AP sync: mTLS with device certificates │ │
|
||||||
|
│ └─────────────────────────────────────────────────────┘ │
|
||||||
|
│ │
|
||||||
|
│ 4. AUDIT TRAIL (witness chains - see ADR-010) │
|
||||||
|
│ ┌─────────────────────────────────────────────────────┐ │
|
||||||
|
│ │ Every detection event hash-chained with SHAKE-256 │ │
|
||||||
|
│ │ Chain anchors signed with ML-DSA-65 │ │
|
||||||
|
│ │ Cross-device attestation via SLH-DSA-128s │ │
|
||||||
|
│ └─────────────────────────────────────────────────────┘ │
|
||||||
|
│ │
|
||||||
|
│ 5. DEVICE IDENTITY │
|
||||||
|
│ ┌─────────────────────────────────────────────────────┐ │
|
||||||
|
│ │ Each sensing device has a key pair (ML-DSA-65) │ │
|
||||||
|
│ │ Device attestation proves hardware integrity │ │
|
||||||
|
│ │ Key rotation schedule: 90 days (or on compromise) │ │
|
||||||
|
│ └─────────────────────────────────────────────────────┘ │
|
||||||
|
└──────────────────────────────────────────────────────────────────┘
|
||||||
|
```
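
The audit-trail layer (item 4 in the diagram) relies on hash chaining: each record commits to the hash of its predecessor, so altering any earlier event invalidates every later link. The sketch below illustrates only that idea. The `ChainEntry` type and function names are hypothetical, and the standard library's `DefaultHasher` is used purely as a stand-in for SHAKE-256 (it is not cryptographically secure). The actual witness chain format is the subject of ADR-010.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical audit record; field names are illustrative only.
#[derive(Clone, Debug)]
pub struct ChainEntry {
    pub prev_hash: u64,
    pub payload: String,
    pub hash: u64,
}

/// Stand-in for SHAKE-256: hashes the previous link together with the payload.
fn link_hash(prev_hash: u64, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prev_hash.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

/// Append an event, chaining it to the last entry (or to 0 for the first).
pub fn append(chain: &mut Vec<ChainEntry>, payload: &str) {
    let prev_hash = chain.last().map_or(0, |e| e.hash);
    let hash = link_hash(prev_hash, payload);
    chain.push(ChainEntry { prev_hash, payload: payload.to_string(), hash });
}

/// Recompute every link; returns the index of the first broken entry, if any.
pub fn first_broken_link(chain: &[ChainEntry]) -> Option<usize> {
    let mut expected_prev = 0u64;
    for (i, e) in chain.iter().enumerate() {
        if e.prev_hash != expected_prev || e.hash != link_hash(e.prev_hash, &e.payload) {
            return Some(i);
        }
        expected_prev = e.hash;
    }
    None
}
```

Editing the payload of any stored entry makes `first_broken_link` report that entry's index, which is the tamper-evidence property the signed chain anchors then protect.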
|
||||||
|
|
||||||
|
### Hybrid Signature Scheme
|
||||||
|
|
||||||
|
```rust
|
||||||
|
/// Hybrid signature combining classical Ed25519 with PQ ML-DSA-65
|
||||||
|
pub struct HybridSignature {
|
||||||
|
/// Classical Ed25519 signature (64 bytes)
|
||||||
|
ed25519_sig: [u8; 64],
|
||||||
|
|
||||||
|
/// Post-quantum ML-DSA-65 signature (3309 bytes)
|
||||||
|
ml_dsa_sig: Vec<u8>,
|
||||||
|
|
||||||
|
/// Signer's public key fingerprint (SHAKE-256, 32 bytes)
|
||||||
|
signer_fingerprint: [u8; 32],
|
||||||
|
|
||||||
|
/// Timestamp of signing
|
||||||
|
timestamp: u64,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl HybridSignature {
|
||||||
|
/// Verify requires BOTH signatures to be valid
|
||||||
|
pub fn verify(&self, message: &[u8], ed25519_pk: &Ed25519PublicKey,
|
||||||
|
ml_dsa_pk: &MlDsaPublicKey) -> Result<bool, CryptoError> {
|
||||||
|
let ed25519_valid = ed25519_pk.verify(message, &self.ed25519_sig)?;
|
||||||
|
let ml_dsa_valid = ml_dsa_pk.verify(message, &self.ml_dsa_sig)?;
|
||||||
|
|
||||||
|
// Both must pass (defense in depth)
|
||||||
|
Ok(ed25519_valid && ml_dsa_valid)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
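
To make the storage cost of the hybrid scheme concrete, the arithmetic below adds up the field sizes given in the `HybridSignature` comments. It is a sketch only: the constant and function names are hypothetical, and the serialized layout (for example, any length prefix on the `Vec<u8>`) is not specified by this ADR and is ignored here.

```rust
/// Sizes in bytes, taken from the `HybridSignature` field comments.
pub const ED25519_SIG_BYTES: usize = 64;
pub const ML_DSA_65_SIG_BYTES: usize = 3309;
pub const FINGERPRINT_BYTES: usize = 32;
pub const TIMESTAMP_BYTES: usize = 8; // u64

/// Raw size of one hybrid signature, ignoring any length prefix or framing
/// (the on-disk layout is an assumption, not part of this ADR).
pub const fn hybrid_signature_bytes() -> usize {
    ED25519_SIG_BYTES + ML_DSA_65_SIG_BYTES + FINGERPRINT_BYTES + TIMESTAMP_BYTES
}

/// Storage needed if `events` records are each signed individually.
pub const fn signing_overhead_bytes(events: usize) -> usize {
    events * hybrid_signature_bytes()
}
```

One hybrid signature comes to 3413 bytes, so signing 1000 records individually would add roughly 3.4 MB. This is one reason to sign chain anchors rather than every event.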
|
||||||
|
|
||||||
|
### Model Integrity Verification
|
||||||
|
|
||||||
|
```rust
|
||||||
|
/// Verify model weights have not been tampered with
|
||||||
|
pub fn verify_model_integrity(model_container: &ModelContainer) -> Result<(), SecurityError> {
|
||||||
|
// 1. Extract embedded signature from container
|
||||||
|
let signature = model_container.crypto_segment().signature()?;
|
||||||
|
|
||||||
|
// 2. Compute SHAKE-256 hash of weight data
|
||||||
|
let weight_hash = shake256(model_container.weights_segment().data());
|
||||||
|
|
||||||
|
// 3. Verify hybrid signature
|
||||||
|
let publisher_keys = load_publisher_keys()?;
|
||||||
|
if !signature.verify(&weight_hash, &publisher_keys.ed25519, &publisher_keys.ml_dsa)? {
|
||||||
|
return Err(SecurityError::ModelTampered {
|
||||||
|
expected_signer: publisher_keys.fingerprint(),
|
||||||
|
container_path: model_container.path().to_owned(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### CSI Data Encryption
|
||||||
|
|
||||||
|
For privacy-sensitive deployments, CSI vectors can be encrypted at rest:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
/// Encrypt CSI vectors for storage in RVF container
|
||||||
|
pub struct CsiEncryptor {
|
||||||
|
/// AES-256-GCM key (derived from device key + container salt)
|
||||||
|
key: Aes256GcmKey,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl CsiEncryptor {
|
||||||
|
/// Encrypt a CSI feature vector
|
||||||
|
/// Note: AES-256-GCM ciphertext does not preserve distances. HNSW search
/// over encrypted vectors would require a separate distance-preserving
/// scheme (approximate, configurable trade-off), which this function
/// does not implement.
|
||||||
|
pub fn encrypt_vector(&self, vector: &[f32]) -> EncryptedVector {
|
||||||
|
let nonce = generate_nonce();
|
||||||
|
let plaintext = bytemuck::cast_slice::<f32, u8>(vector);
|
||||||
|
let ciphertext = aes_256_gcm_encrypt(&self.key, &nonce, plaintext);
|
||||||
|
EncryptedVector { ciphertext, nonce }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
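
AES-GCM is only safe if a nonce is never reused under the same key, so how `generate_nonce()` works matters. One common approach is the deterministic construction described in NIST SP 800-38D: a fixed prefix followed by a counter. The sketch below shows that construction. `CounterNonce` is a hypothetical type, not part of RuVector, and a real deployment would also need to persist the counter across restarts.

```rust
/// Hypothetical 96-bit nonce generator: 4-byte device prefix + 8-byte counter.
pub struct CounterNonce {
    prefix: [u8; 4],
    counter: u64,
}

impl CounterNonce {
    pub fn new(prefix: [u8; 4]) -> Self {
        Self { prefix, counter: 0 }
    }

    /// Returns the next nonce, or `None` once the counter is exhausted
    /// (at which point the key must be rotated rather than a nonce reused).
    pub fn next_nonce(&mut self) -> Option<[u8; 12]> {
        let current = self.counter;
        self.counter = self.counter.checked_add(1)?;
        let mut nonce = [0u8; 12];
        nonce[..4].copy_from_slice(&self.prefix);
        nonce[4..].copy_from_slice(&current.to_be_bytes());
        Some(nonce)
    }
}
```

Random 96-bit nonces are the usual alternative. The counter form trades the need for persistent state against a hard guarantee of uniqueness per key.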
|
||||||
|
|
||||||
|
### Performance Impact
|
||||||
|
|
||||||
|
| Operation | Without Crypto | With Crypto | Overhead |
|-----------|---------------|-------------|----------|
| Model load | 50 ms | 52 ms | +2 ms (signature verify) |
| Vector insert | 0.1 ms | 0.15 ms | +0.05 ms (encrypt) |
| HNSW search | 0.3 ms | 0.35 ms | +0.05 ms (decrypt top-K) |
| Container open | 10 ms | 12 ms | +2 ms (integrity check) |
| Detection event logging | 0.01 ms | 0.5 ms | +0.49 ms (hash chain) |
|
||||||
|
|
||||||
|
### Feature Flags
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[features]
|
||||||
|
default = []
|
||||||
|
crypto-classical = ["ed25519-dalek"] # Ed25519 only
|
||||||
|
crypto-pq = ["pqcrypto-dilithium", "pqcrypto-sphincsplus"] # ML-DSA + SLH-DSA
|
||||||
|
crypto-hybrid = ["crypto-classical", "crypto-pq"] # Both (recommended)
|
||||||
|
crypto-encrypt = ["aes-gcm"] # Data-at-rest encryption
|
||||||
|
crypto-full = ["crypto-hybrid", "crypto-encrypt"]
|
||||||
|
```
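
As an illustration of how code might branch on these flags, the sketch below resolves a signature mode at compile time with the `cfg!` macro. The `SignatureMode` enum and `signature_mode` function are hypothetical names. Because `crypto-hybrid` enables both `crypto-classical` and `crypto-pq`, checking those two is sufficient.

```rust
/// Which signature mode this build was compiled with (illustrative type).
#[derive(Debug, PartialEq, Eq)]
pub enum SignatureMode {
    None,
    ClassicalOnly,
    PostQuantumOnly,
    Hybrid,
}

/// Resolve the mode from Cargo features at compile time.
pub fn signature_mode() -> SignatureMode {
    let classical = cfg!(feature = "crypto-classical");
    let pq = cfg!(feature = "crypto-pq");
    match (classical, pq) {
        (true, true) => SignatureMode::Hybrid,
        (true, false) => SignatureMode::ClassicalOnly,
        (false, true) => SignatureMode::PostQuantumOnly,
        (false, false) => SignatureMode::None,
    }
}
```

With `default = []`, a plain build resolves to `SignatureMode::None`, so callers that require signed models would need to treat that case as an explicit configuration error.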
|
||||||
|
|
||||||
|
## Consequences
|
||||||
|
|
||||||
|
### Positive
|
||||||
|
- **Future-proof**: Lattice-based signatures resist quantum attacks
|
||||||
|
- **Tamper detection**: Model poisoning and data manipulation are detectable
|
||||||
|
- **Privacy compliance**: Encrypting CSI data at rest supports GDPR/HIPAA compliance, although encryption alone does not establish it
|
||||||
|
- **Forensic integrity**: Signed audit trails are tamper-evident, which strengthens their evidentiary value
|
||||||
|
- **Low overhead**: Under 1 ms for per-vector and per-event operations; about 2 ms of one-time cost at model load and container open
|
||||||
|
|
||||||
|
### Negative
|
||||||
|
- **Signature size**: ML-DSA-65 signatures are 3.3 KB vs 64 bytes for Ed25519
|
||||||
|
- **Key management complexity**: Device key provisioning, rotation, revocation
|
||||||
|
- **HNSW on encrypted data**: Distance-preserving encryption is approximate; search recall may degrade
|
||||||
|
- **Dependency weight**: PQ crypto libraries add ~2 MB to binary
|
||||||
|
- **Standards maturity**: FIPS 204/205 are finalized but implementations are evolving
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- [FIPS 204: ML-DSA (Module-Lattice Digital Signature)](https://csrc.nist.gov/pubs/fips/204/final)
|
||||||
|
- [FIPS 205: SLH-DSA (Stateless Hash-Based Digital Signature)](https://csrc.nist.gov/pubs/fips/205/final)
|
||||||
|
- [FIPS 202: SHA-3 / SHAKE](https://csrc.nist.gov/pubs/fips/202/final)
|
||||||
|
- [RuVector Crypto Implementation](https://github.com/ruvnet/ruvector)
|
||||||
|
- ADR-002: RuVector RVF Integration Strategy
|
||||||
|
- ADR-010: Witness Chains for Audit Trail Integrity
|
||||||
284
docs/adr/ADR-008-distributed-consensus-multi-ap.md
Normal file
@@ -0,0 +1,284 @@
|
|||||||
|
# ADR-008: Distributed Consensus for Multi-AP Coordination
|
||||||
|
|
||||||
|
## Status
|
||||||
|
Proposed
|
||||||
|
|
||||||
|
## Date
|
||||||
|
2026-02-28
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
### Multi-AP Sensing Architecture
|
||||||
|
|
||||||
|
WiFi-DensePose achieves higher accuracy and coverage with multiple access points (APs) observing the same space from different angles. The disaster detection module (wifi-densepose-mat, ADR-001) explicitly requires distributed deployment:
|
||||||
|
|
||||||
|
- **Portable**: Single TX/RX units deployed around a collapse site
|
||||||
|
- **Distributed**: Multiple APs covering a large disaster zone
|
||||||
|
- **Drone-mounted**: UAVs scanning from above with coordinated flight paths
|
||||||
|
|
||||||
|
Each AP independently captures CSI data, extracts features, and runs local inference. But the distributed system needs coordination:
|
||||||
|
|
||||||
|
1. **Consistent survivor registry**: All nodes must agree on the set of detected survivors, their locations, and triage classifications. Conflicting records cause rescue teams to waste time.
|
||||||
|
|
||||||
|
2. **Coordinated scanning**: Avoid redundant scans of the same zone. Dynamically reassign APs as zones are cleared.
|
||||||
|
|
||||||
|
3. **Model synchronization**: When SONA adapts a model on one node (ADR-005), other nodes should benefit from the adaptation without re-learning.
|
||||||
|
|
||||||
|
4. **Clock synchronization**: CSI timestamps must be aligned across nodes for multi-view pose fusion (the GNN multi-person disentanglement in ADR-006 requires temporal alignment).
|
||||||
|
|
||||||
|
5. **Partition tolerance**: In disaster scenarios, network connectivity is unreliable. The system must function during partitions and reconcile when connectivity restores.
|
||||||
|
|
||||||
|
### Current State
|
||||||
|
|
||||||
|
No distributed coordination exists. Each node operates independently. The Rust workspace has no consensus crate.
|
||||||
|
|
||||||
|
### RuVector's Distributed Capabilities
|
||||||
|
|
||||||
|
RuVector provides:
|
||||||
|
- **Raft consensus**: Leader election and replicated log for strong consistency
|
||||||
|
- **Vector clocks**: Logical timestamps for causal ordering without synchronized clocks
|
||||||
|
- **Multi-master replication**: Concurrent writes with conflict resolution
|
||||||
|
- **Delta consensus**: Tracks behavioral changes across nodes for anomaly detection
|
||||||
|
- **Auto-sharding**: Distributes data based on access patterns
|
||||||
|
|
||||||
|
## Decision
|
||||||
|
|
||||||
|
We will integrate RuVector's Raft consensus implementation as the coordination backbone for multi-AP WiFi-DensePose deployments, with vector clocks for causal ordering and CRDT-based conflict resolution for partition-tolerant operation.
|
||||||
|
|
||||||
|
### Consensus Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────────────────┐
|
||||||
|
│ Multi-AP Coordination Architecture │
|
||||||
|
├─────────────────────────────────────────────────────────────────────┤
|
||||||
|
│ │
|
||||||
|
│ Normal Operation (Connected): │
|
||||||
|
│ │
|
||||||
|
│ ┌─────────┐ Raft ┌─────────┐ Raft ┌─────────┐ │
|
||||||
|
│ │ AP-1 │◀────────────▶│ AP-2 │◀────────────▶│ AP-3 │ │
|
||||||
|
│ │ (Leader)│ Replicated │(Follower│ Replicated │(Follower│ │
|
||||||
|
│ │ │ Log │ )│ Log │ )│ │
|
||||||
|
│ └────┬────┘ └────┬────┘ └────┬────┘ │
|
||||||
|
│ │ │ │ │
|
||||||
|
│ ▼ ▼ ▼ │
|
||||||
|
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
|
||||||
|
│ │ Local │ │ Local │ │ Local │ │
|
||||||
|
│ │ RVF │ │ RVF │ │ RVF │ │
|
||||||
|
│ │Container│ │Container│ │Container│ │
|
||||||
|
│ └─────────┘ └─────────┘ └─────────┘ │
|
||||||
|
│ │
|
||||||
|
│ Partitioned Operation (Disconnected): │
|
||||||
|
│ │
|
||||||
|
│ ┌─────────┐ ┌──────────────────────┐ │
|
||||||
|
│ │ AP-1 │ ← operates independently → │ AP-2 AP-3 │ │
|
||||||
|
│ │ │ │ (form sub-cluster) │ │
|
||||||
|
│ │ Local │ │ Raft between 2+3 │ │
|
||||||
|
│ │ writes │ │ │ │
|
||||||
|
│ └─────────┘ └──────────────────────┘ │
|
||||||
|
│ │ │ │
|
||||||
|
│ └──────── Reconnect: CRDT merge ─────────────┘ │
|
||||||
|
└─────────────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### Replicated State Machine
|
||||||
|
|
||||||
|
The Raft log replicates these operations across all nodes:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
/// Operations replicated via Raft consensus
|
||||||
|
#[derive(Serialize, Deserialize, Clone)]
|
||||||
|
pub enum ConsensusOp {
|
||||||
|
/// New survivor detected
|
||||||
|
SurvivorDetected {
|
||||||
|
survivor_id: Uuid,
|
||||||
|
location: GeoCoord,
|
||||||
|
triage: TriageLevel,
|
||||||
|
detecting_ap: ApId,
|
||||||
|
confidence: f64,
|
||||||
|
timestamp: VectorClock,
|
||||||
|
},
|
||||||
|
|
||||||
|
/// Survivor status updated (e.g., triage reclassification)
|
||||||
|
SurvivorUpdated {
|
||||||
|
survivor_id: Uuid,
|
||||||
|
new_triage: TriageLevel,
|
||||||
|
updating_ap: ApId,
|
||||||
|
evidence: DetectionEvidence,
|
||||||
|
},
|
||||||
|
|
||||||
|
/// Zone assignment changed
|
||||||
|
ZoneAssignment {
|
||||||
|
zone_id: ZoneId,
|
||||||
|
assigned_aps: Vec<ApId>,
|
||||||
|
priority: ScanPriority,
|
||||||
|
},
|
||||||
|
|
||||||
|
/// Model adaptation delta shared
|
||||||
|
ModelDelta {
|
||||||
|
source_ap: ApId,
|
||||||
|
lora_delta: Vec<u8>, // Serialized LoRA matrices
|
||||||
|
environment_hash: [u8; 32],
|
||||||
|
performance_metrics: AdaptationMetrics,
|
||||||
|
},
|
||||||
|
|
||||||
|
/// AP joined or left the cluster
|
||||||
|
MembershipChange {
|
||||||
|
ap_id: ApId,
|
||||||
|
action: MembershipAction, // Join | Leave | Suspect
|
||||||
|
},
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Vector Clocks for Causal Ordering
|
||||||
|
|
||||||
|
Since APs may have unsynchronized physical clocks, vector clocks provide causal ordering:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
/// Vector clock for causal ordering across APs
|
||||||
|
#[derive(Clone, Serialize, Deserialize)]
|
||||||
|
pub struct VectorClock {
|
||||||
|
/// Map from AP ID to logical timestamp
|
||||||
|
clocks: HashMap<ApId, u64>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl VectorClock {
|
||||||
|
/// Increment this AP's clock
|
||||||
|
pub fn tick(&mut self, ap_id: &ApId) {
|
||||||
|
*self.clocks.entry(ap_id.clone()).or_insert(0) += 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Merge with another clock (take max of each component)
|
||||||
|
pub fn merge(&mut self, other: &VectorClock) {
|
||||||
|
for (ap_id, &ts) in &other.clocks {
|
||||||
|
let entry = self.clocks.entry(ap_id.clone()).or_insert(0);
|
||||||
|
*entry = (*entry).max(ts);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Check if self happened-before other
|
||||||
|
pub fn happened_before(&self, other: &VectorClock) -> bool {
|
||||||
|
self.clocks.iter().all(|(k, &v)| {
|
||||||
|
other.clocks.get(k).map_or(false, |&ov| v <= ov)
|
||||||
|
}) && self.clocks != other.clocks
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
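
The case that matters most during a partition is when neither clock happened-before the other, meaning the two events are concurrent and must go through conflict resolution. The sketch below extends the idea with a four-way comparison. It assumes `String` AP identifiers (the ADR does not define `ApId`), and the `Causality` enum and `compare` method are hypothetical additions; a missing component is treated as 0.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, Default, PartialEq)]
pub struct VectorClock {
    clocks: HashMap<String, u64>,
}

#[derive(Debug, PartialEq)]
pub enum Causality { Before, After, Equal, Concurrent }

impl VectorClock {
    /// Increment this AP's component.
    pub fn tick(&mut self, ap_id: &str) {
        *self.clocks.entry(ap_id.to_string()).or_insert(0) += 1;
    }

    /// Component-wise maximum with another clock.
    pub fn merge(&mut self, other: &VectorClock) {
        for (ap_id, &ts) in &other.clocks {
            let entry = self.clocks.entry(ap_id.clone()).or_insert(0);
            *entry = (*entry).max(ts);
        }
    }

    /// True if every component of `self` is <= the matching component of
    /// `other` (a missing component counts as 0).
    fn dominated_by(&self, other: &VectorClock) -> bool {
        self.clocks
            .iter()
            .all(|(k, &v)| v <= other.clocks.get(k).copied().unwrap_or(0))
    }

    /// Classify the causal relationship between two clocks.
    pub fn compare(&self, other: &VectorClock) -> Causality {
        match (self.dominated_by(other), other.dominated_by(self)) {
            (true, true) => Causality::Equal,
            (true, false) => Causality::Before,
            (false, true) => Causality::After,
            (false, false) => Causality::Concurrent,
        }
    }
}
```

Two APs that each tick once without exchanging messages compare as `Concurrent`. After one merges the other's clock and ticks again, the merged clock compares as `After` both.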
|
||||||
|
|
||||||
|
### CRDT-Based Conflict Resolution
|
||||||
|
|
||||||
|
During network partitions, concurrent updates may conflict. We use CRDTs (Conflict-free Replicated Data Types) for automatic resolution:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
/// Survivor registry using Last-Writer-Wins Register CRDT
|
||||||
|
pub struct SurvivorRegistry {
|
||||||
|
/// Per-survivor LWW-Register: each survivor has a timestamp-tagged state
|
||||||
|
survivors: HashMap<Uuid, LwwRegister<SurvivorState>>,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Triage uses Max-wins semantics:
|
||||||
|
/// If partition A says P1 (Red/Immediate) and partition B says P2 (Yellow/Delayed),
|
||||||
|
/// after merge the survivor is classified P1 (more urgent wins)
|
||||||
|
/// Rationale: false negative (missing critical) is worse than false positive
|
||||||
|
impl CrdtMerge for TriageLevel {
|
||||||
|
fn merge(a: Self, b: Self) -> Self {
|
||||||
|
// urgency() is assumed to return a higher value for more urgent levels (P1 > P2)
|
||||||
|
if a.urgency() >= b.urgency() { a } else { b }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**CRDT merge strategies by data type**:
|
||||||
|
|
||||||
|
| Data Type | CRDT Type | Merge Strategy | Rationale |
|-----------|-----------|---------------|-----------|
| Survivor set | OR-Set | Union (never lose a detection) | Missing survivors = fatal |
| Triage level | Max-Register | Most urgent wins | Err toward caution |
| Location | LWW-Register | Latest timestamp wins | Survivors may move |
| Zone assignment | LWW-Map | Leader's assignment wins | Need authoritative coord |
| Model deltas | G-Set | Accumulate all deltas | All adaptations valuable |
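
Two of the strategies in the table can be combined in a single per-survivor merge: Max-Register for triage and LWW-Register for location. The sketch below is illustrative only. The `Triage` variants beyond P1 and P2, the `SurvivorState` fields, and the use of a plain `u64` logical timestamp are all assumptions.

```rust
/// Triage levels ordered so that a later variant is more urgent
/// (variant names are illustrative; the ADR only names P1 and P2).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Triage { P3Minor, P2Delayed, P1Immediate }

/// Hypothetical per-survivor state held by one partition.
#[derive(Clone, Debug, PartialEq)]
pub struct SurvivorState {
    pub triage: Triage,
    pub location: (f64, f64),
    pub location_ts: u64, // logical timestamp of the last location update
}

/// Merge two replicas: Max-Register for triage, LWW-Register for location.
/// Equal timestamps keep the first argument's location; a real system needs
/// a deterministic tie-breaker (for example the AP ID) to stay commutative.
pub fn merge(a: &SurvivorState, b: &SurvivorState) -> SurvivorState {
    let newer = if b.location_ts > a.location_ts { b } else { a };
    SurvivorState {
        triage: a.triage.max(b.triage),
        location: newer.location,
        location_ts: newer.location_ts,
    }
}
```

When the timestamps differ, the merge gives the same result regardless of argument order, which is the property that lets partitions reconcile in any sequence.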
|
||||||
|
|
||||||
|
### Node Discovery and Health
|
||||||
|
|
||||||
|
```rust
|
||||||
|
/// AP cluster management
|
||||||
|
pub struct ApCluster {
|
||||||
|
/// This node's identity
|
||||||
|
local_ap: ApId,
|
||||||
|
|
||||||
|
/// Raft consensus engine
|
||||||
|
raft: RaftEngine<ConsensusOp>,
|
||||||
|
|
||||||
|
/// Failure detector (phi-accrual)
|
||||||
|
failure_detector: PhiAccrualDetector,
|
||||||
|
|
||||||
|
/// Cluster membership
|
||||||
|
members: HashSet<ApId>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl ApCluster {
|
||||||
|
/// Heartbeat interval for failure detection
|
||||||
|
const HEARTBEAT_MS: u64 = 500;
|
||||||
|
|
||||||
|
/// Phi threshold for suspecting node failure
|
||||||
|
const PHI_THRESHOLD: f64 = 8.0;
|
||||||
|
|
||||||
|
/// Minimum cluster size for Raft (need majority)
|
||||||
|
const MIN_CLUSTER_SIZE: usize = 3;
|
||||||
|
}
|
||||||
|
```
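
To give a feel for what `PHI_THRESHOLD = 8.0` means with a 500 ms heartbeat, the sketch below computes phi under a simplifying assumption: heartbeat inter-arrival times are exponentially distributed with a known mean. The original phi-accrual paper fits a distribution to observed arrivals instead, so this is an approximation, and the function names are hypothetical. Phi is defined as the negative base-10 logarithm of the probability that a heartbeat would still arrive after the elapsed silence.

```rust
/// Simplified phi-accrual estimate assuming exponentially distributed
/// heartbeat inter-arrival times (an assumption made for this sketch).
pub fn phi(elapsed_ms: f64, mean_interval_ms: f64) -> f64 {
    // P(no heartbeat yet) = exp(-t / mean); phi = -log10(P)
    (elapsed_ms / mean_interval_ms) * std::f64::consts::LOG10_E
}

/// Silence (in ms) after which phi crosses `threshold` under the same model.
pub fn suspicion_timeout_ms(threshold: f64, mean_interval_ms: f64) -> f64 {
    threshold * mean_interval_ms / std::f64::consts::LOG10_E
}
```

Under this model, a threshold of 8.0 with a 500 ms mean interval corresponds to roughly 9.2 seconds of silence before a node is suspected.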
|
||||||
|
|
||||||
|
### Performance Characteristics
|
||||||
|
|
||||||
|
| Operation | Latency | Notes |
|-----------|---------|-------|
| Raft heartbeat | 500 ms interval | Configurable |
| Log replication | 1-5 ms (LAN) | Depends on payload size |
| Leader election | 1-3 seconds | After leader failure detected |
| CRDT merge (partition heal) | 10-100 ms | Proportional to divergence |
| Vector clock comparison | <0.01 ms | O(n) where n = cluster size |
| Model delta replication | 50-200 ms | ~70 KB LoRA delta |
|
||||||
|
|
||||||
|
### Deployment Configurations
|
||||||
|
|
||||||
|
| Scenario | Nodes | Consensus | Partition Strategy |
|----------|-------|-----------|-------------------|
| Single room | 1-2 | None (local only) | N/A |
| Building floor | 3-5 | Raft (3 voting members, quorum of 2) | CRDT merge on heal |
| Disaster site | 5-20 | Raft (5 voting members, quorum of 3) + zones | Zone-level sub-clusters |
| Urban search | 20-100 | Hierarchical Raft | Regional leaders |
|
||||||
|
|
||||||
|
## Consequences
|
||||||
|
|
||||||
|
### Positive
|
||||||
|
- **Consistent state**: All APs agree on survivor registry via Raft
|
||||||
|
- **Partition tolerant**: CRDT merge allows operation during disconnection
|
||||||
|
- **Causal ordering**: Vector clocks provide logical time without NTP
|
||||||
|
- **Automatic failover**: Raft leader election handles AP failures
|
||||||
|
- **Model sharing**: SONA adaptations propagate across cluster
|
||||||
|
|
||||||
|
### Negative
|
||||||
|
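- **Minimum 3 nodes**: Raft needs a majority of voters to elect a leader or commit an entry, so at least 3 nodes are required to tolerate a single failure; odd cluster sizes are preferred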
|
||||||
|
- **Network overhead**: Heartbeats and log replication consume bandwidth (~1-10 KB/s per node)
|
||||||
|
- **Complexity**: Distributed systems are inherently harder to debug
|
||||||
|
- **Latency for writes**: Raft requires majority acknowledgment before commit (1-5ms LAN)
|
||||||
|
- **Quorum loss on even splits**: If the cluster splits evenly (2+2), neither partition has quorum, so Raft commits stall until the partition heals
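
The quorum constraints in the list above reduce to simple majority arithmetic. The helper functions below are a hypothetical sketch, not part of RuVector's API.

```rust
/// Votes needed to elect a leader or commit an entry with `n` voters.
pub fn quorum(n: usize) -> usize {
    n / 2 + 1
}

/// Number of voter failures the cluster can survive while staying available.
pub fn tolerated_failures(n: usize) -> usize {
    n.saturating_sub(1) / 2
}

/// Whether a partition holding `reachable` of `n` voters can make progress.
pub fn has_quorum(reachable: usize, n: usize) -> bool {
    reachable >= quorum(n)
}
```

A 4-node cluster needs 3 votes and tolerates only 1 failure, the same as a 3-node cluster, which is why adding a fourth voter does not improve fault tolerance and why a 2+2 split leaves both sides without quorum.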
|
||||||
|
|
||||||
|
### Disaster-Specific Considerations
|
||||||
|
|
||||||
|
| Challenge | Mitigation |
|-----------|------------|
| Intermittent connectivity | Aggressive CRDT merge on reconnect; local operation during partition |
| Power failures | Raft log persisted to local SSD; recovery on restart |
| Node destruction | Raft tolerates minority failure; data replicated across survivors |
| Drone mobility | Drone APs treated as ephemeral members; data synced on landing |
| Bandwidth constraints | Delta-only replication; compress LoRA deltas |
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- [Raft Consensus Algorithm](https://raft.github.io/raft.pdf)
|
||||||
|
- [CRDTs: Conflict-free Replicated Data Types](https://hal.inria.fr/inria-00609399)
|
||||||
|
- [Vector Clocks](https://en.wikipedia.org/wiki/Vector_clock)
|
||||||
|
- [Phi Accrual Failure Detector](https://www.computer.org/csdl/proceedings-article/srds/2004/22390066/12OmNyQYtlC)
|
||||||
|
- [RuVector Distributed Consensus](https://github.com/ruvnet/ruvector)
|
||||||
|
- ADR-001: WiFi-Mat Disaster Detection Architecture
|
||||||
|
- ADR-002: RuVector RVF Integration Strategy
|
||||||
262
docs/adr/ADR-009-rvf-wasm-runtime-edge-deployment.md
Normal file
@@ -0,0 +1,262 @@
|
|||||||
|
# ADR-009: RVF WASM Runtime for Edge Deployment
|
||||||
|
|
||||||
|
## Status
|
||||||
|
Proposed
|
||||||
|
|
||||||
|
## Date
|
||||||
|
2026-02-28
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
### Current WASM State
|
||||||
|
|
||||||
|
The wifi-densepose-wasm crate provides basic WebAssembly bindings that expose Rust types to JavaScript. It enables browser-based visualization and lightweight inference but has significant limitations:
|
||||||
|
|
||||||
|
1. **No self-contained operation**: WASM module depends on external model files loaded via fetch(). If the server is unreachable, the module is useless.
|
||||||
|
|
||||||
|
2. **No persistent state**: Browser WASM has no built-in persistent storage for fingerprint databases, model weights, or session data.
|
||||||
|
|
||||||
|
3. **No offline capability**: Without network access, the WASM module cannot load models or send results.
|
||||||
|
|
||||||
|
4. **Binary size**: Current WASM bundle is not optimized. Full inference + signal processing compiles to ~5-15 MB.
|
||||||
|
|
||||||
|
### Edge Deployment Requirements
|
||||||
|
|
||||||
|
| Scenario | Platform | Constraints |
|----------|----------|------------|
| Browser dashboard | Chrome/Firefox | <10 MB download, no plugins |
| IoT sensor node | ESP32/Raspberry Pi | 256 KB - 4 GB RAM, battery powered |
| Mobile app | iOS/Android WebView | Limited background execution |
| Drone payload | Embedded Linux + WASM | Weight/power limited, intermittent connectivity |
| Field tablet | Android tablet | Offline operation in disaster zones |
|
||||||
|
|
||||||
|
### RuVector's Edge Runtime
|
||||||
|
|
||||||
|
RuVector provides a 5.5 KB WASM runtime that boots in 125ms, with:
|
||||||
|
- Self-contained operation (models + data embedded in RVF container)
|
||||||
|
- Persistent storage via RVF container (written to IndexedDB in browser, filesystem on native)
|
||||||
|
- Offline-first architecture
|
||||||
|
- SIMD acceleration when available (WASM SIMD proposal)
|
||||||
|
|
||||||
|
## Decision
|
||||||
|
|
||||||
|
We will replace the current wifi-densepose-wasm approach with an RVF-based edge runtime that packages models, fingerprint databases, and the inference engine into a single deployable RVF container.
|
||||||
|
|
||||||
|
### Edge Runtime Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
┌──────────────────────────────────────────────────────────────────┐
|
||||||
|
│ RVF Edge Deployment Container │
|
||||||
|
│ (.rvf.edge file) │
|
||||||
|
├──────────────────────────────────────────────────────────────────┤
|
||||||
|
│ │
|
||||||
|
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────────┐ │
|
||||||
|
│ │ WASM │ │ VEC │ │ INDEX │ │ MODEL (ONNX) │ │
|
||||||
|
│ │ Runtime │ │ CSI │ │ HNSW │ │ + LoRA deltas │ │
|
||||||
|
│ │ (5.5KB) │ │ Finger- │ │ Graph │ │ │ │
|
||||||
|
│ │ │ │ prints │ │ │ │ │ │
|
||||||
|
│ └──────────┘ └──────────┘ └──────────┘ └──────────────────┘ │
|
||||||
|
│ │
|
||||||
|
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────────┐ │
|
||||||
|
│ │ CRYPTO │ │ WITNESS │ │ COW_MAP │ │ CONFIG │ │
|
||||||
|
│ │ Keys │ │ Audit │ │ Branches│ │ Runtime params │ │
|
||||||
|
│ │ │ │ Chain │ │ │ │ │ │
|
||||||
|
│ └──────────┘ └──────────┘ └──────────┘ └──────────────────┘ │
|
||||||
|
│ │
|
||||||
|
│ Total container: ~0.7-62 MB depending on model + fingerprint size │
|
||||||
|
└──────────────────────────────────────────────────────────────────┘
|
||||||
|
│
|
||||||
|
│ Deploy to:
|
||||||
|
▼
|
||||||
|
┌───────────────────────────────────────────────────────────────┐
|
||||||
|
│ │
|
||||||
|
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────────────┐ │
|
||||||
|
│ │ Browser │ │ IoT │ │ Mobile │ │ Disaster Field │ │
|
||||||
|
│ │ │ │ Device │ │ App │ │ Tablet │ │
|
||||||
|
│ │ IndexedDB │ Flash │ │ App │ │ Local FS │ │
|
||||||
|
│ │ for state│ │ for │ │ Sandbox │ │ for state │ │
|
||||||
|
│ │ │ │ state │ │ for │ │ │ │
|
||||||
|
│ │ │ │ │ │ state │ │ │ │
|
||||||
|
│ └─────────┘ └─────────┘ └─────────┘ └─────────────────┘ │
|
||||||
|
└───────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### Tiered Runtime Profiles
|
||||||
|
|
||||||
|
Different deployment targets get different container configurations:
|
||||||
|
|
||||||
|
```rust
/// Weight quantization applied to the embedded model
pub enum Quantization { Int4, Int8, Float16 }

/// Where the container persists its state
pub enum StorageBackend { IndexedDB, Flash, AppSandbox, FileSystem }

/// Per-profile container configuration
pub struct EdgeConfig {
    pub model_quantization: Quantization,
    pub max_fingerprints: usize,
    pub enable_sona: bool,
    pub storage_backend: StorageBackend,
}

/// Edge runtime profiles
pub enum EdgeProfile {
    /// Full-featured browser deployment
    /// ~10 MB container, full inference + HNSW + SONA
    Browser,
    /// Minimal IoT deployment
    /// ~0.7 MB container, lightweight inference only
    IoT,
    /// Mobile app deployment
    /// ~6 MB container, inference + HNSW, limited SONA
    Mobile,
    /// Disaster field deployment (maximum capability)
    /// ~62 MB container, full stack including multi-AP consensus
    Field,
}

impl EdgeProfile {
    /// Configuration values for this profile
    pub fn config(&self) -> EdgeConfig {
        let (q, max, sona, storage) = match self {
            EdgeProfile::Browser => (Quantization::Int8, 100_000, true, StorageBackend::IndexedDB),
            EdgeProfile::IoT => (Quantization::Int4, 1_000, false, StorageBackend::Flash),
            EdgeProfile::Mobile => (Quantization::Int8, 50_000, true, StorageBackend::AppSandbox),
            EdgeProfile::Field => (Quantization::Float16, 1_000_000, true, StorageBackend::FileSystem),
        };
        EdgeConfig { model_quantization: q, max_fingerprints: max, enable_sona: sona, storage_backend: storage }
    }
}
```
|
||||||
|
|
||||||
|
### Container Size Budget
|
||||||
|
|
||||||
|
| Segment | Browser | IoT | Mobile | Field |
|---------|---------|-----|--------|-------|
| WASM runtime | 5.5 KB | 5.5 KB | 5.5 KB | 5.5 KB |
| Model (ONNX) | 3 MB (int8) | 0.5 MB (int4) | 3 MB (int8) | 12 MB (fp16) |
| HNSW index | 4 MB | 100 KB | 2 MB | 40 MB |
| Fingerprint vectors | 2 MB | 50 KB | 1 MB | 10 MB |
| Config + crypto | 50 KB | 10 KB | 50 KB | 100 KB |
| **Total** | **~10 MB** | **~0.7 MB** | **~6 MB** | **~62 MB** |
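
The totals row can be checked by summing the segment sizes per profile. The sketch below does that arithmetic. The `Budget` type is hypothetical, and sizes are converted at 1 MB = 1000 KB for simplicity.

```rust
/// Segment sizes in KB for one profile, copied from the budget table.
pub struct Budget {
    pub name: &'static str,
    /// Order: runtime, model, HNSW index, fingerprint vectors, config + crypto
    pub segments_kb: [f64; 5],
}

impl Budget {
    /// Total size in MB, treating 1 MB as 1000 KB for simplicity.
    pub fn total_mb(&self) -> f64 {
        self.segments_kb.iter().sum::<f64>() / 1000.0
    }
}

/// The four profiles from the table, in column order.
pub fn budgets() -> [Budget; 4] {
    [
        Budget { name: "Browser", segments_kb: [5.5, 3000.0, 4000.0, 2000.0, 50.0] },
        Budget { name: "IoT", segments_kb: [5.5, 500.0, 100.0, 50.0, 10.0] },
        Budget { name: "Mobile", segments_kb: [5.5, 3000.0, 2000.0, 1000.0, 50.0] },
        Budget { name: "Field", segments_kb: [5.5, 12000.0, 40000.0, 10000.0, 100.0] },
    ]
}
```

The sums come to about 9.1 MB, 0.67 MB, 6.1 MB, and 62.1 MB, so the Browser figure of ~10 MB includes roughly 1 MB of headroom while the other totals match the segment sums closely.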
|
||||||
|
|
||||||
|
### Offline-First Data Flow

```
┌────────────────────────────────────────────────────────────────────┐
│                      Offline-First Operation                       │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│  1. BOOT (125ms)                                                   │
│     ├── Open RVF container from local storage                      │
│     ├── Memory-map WASM runtime segment                            │
│     ├── Load HNSW index into memory                                │
│     └── Initialize inference engine with embedded model            │
│                                                                    │
│  2. OPERATE (continuous)                                           │
│     ├── Receive CSI data from local hardware interface             │
│     ├── Process through local pipeline (no network needed)         │
│     ├── Search HNSW index against local fingerprints               │
│     ├── Run SONA adaptation on local data                          │
│     ├── Append results to local witness chain                      │
│     └── Store updated vectors to local container                   │
│                                                                    │
│  3. SYNC (when connected)                                          │
│     ├── Push new vectors to central RVF container                  │
│     ├── Pull updated fingerprints from other nodes                 │
│     ├── Merge SONA deltas via Raft (ADR-008)                       │
│     ├── Extend witness chain with cross-node attestation           │
│     └── Update local container with merged state                   │
│                                                                    │
│  4. SLEEP (battery conservation)                                   │
│     ├── Flush pending writes to container                          │
│     ├── Close memory-mapped segments                               │
│     └── Resume from step 1 on wake                                 │
└────────────────────────────────────────────────────────────────────┘
```

### Browser-Specific Integration

```rust
/// Browser WASM entry point
#[wasm_bindgen]
pub struct WifiDensePoseEdge {
    container: RvfContainer,
    inference_engine: InferenceEngine,
    hnsw_index: HnswIndex,
    sona: Option<SonaAdapter>,
}

#[wasm_bindgen]
impl WifiDensePoseEdge {
    /// Initialize from an RVF container loaded via fetch or IndexedDB.
    /// Synchronous on purpose: nothing in the body awaits, and wasm-bindgen
    /// has poor support for async constructors.
    #[wasm_bindgen(constructor)]
    pub fn new(container_bytes: &[u8]) -> Result<WifiDensePoseEdge, JsValue> {
        let container = RvfContainer::from_bytes(container_bytes)?;
        let engine = InferenceEngine::from_container(&container)?;
        let index = HnswIndex::from_container(&container)?;
        // SONA is optional: containers without an adapter segment still load
        let sona = SonaAdapter::from_container(&container).ok();

        Ok(Self { container, inference_engine: engine, hnsw_index: index, sona })
    }

    /// Process a single CSI frame (called from JavaScript)
    #[wasm_bindgen]
    pub fn process_frame(&mut self, csi_json: &str) -> Result<String, JsValue> {
        let csi_data: CsiData = serde_json::from_str(csi_json)
            .map_err(|e| JsValue::from_str(&e.to_string()))?;

        let features = self.extract_features(&csi_data)?;
        let detection = self.detect(&features)?;
        let pose = if detection.human_detected {
            Some(self.estimate_pose(&features)?)
        } else {
            None
        };

        serde_json::to_string(&PoseResult { detection, pose })
            .map_err(|e| JsValue::from_str(&e.to_string()))
    }

    /// Save current state to IndexedDB
    #[wasm_bindgen]
    pub async fn persist(&self) -> Result<(), JsValue> {
        let bytes = self.container.serialize()?;
        // Write to IndexedDB via web-sys
        save_to_indexeddb("wifi-densepose-state", &bytes).await
    }
}
```

### Model Quantization Strategy

| Quantization | Size Reduction | Accuracy Loss | Suitable For |
|-------------|---------------|---------------|-------------|
| Float32 (baseline) | 1x | 0% | Server/desktop |
| Float16 | 2x | <0.5% | Field tablets, GPUs |
| Int8 (PTQ) | 4x | <2% | Browser, mobile |
| Int4 (GPTQ) | 8x | <5% | IoT, ultra-constrained |
| Binary (1-bit) | 32x | ~15% | MCU/ultra-edge (experimental) |

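To make the 4x figure for Int8 concrete, here is a minimal sketch of symmetric per-tensor post-training quantization using only the standard library. It is an illustration, not the project's quantizer: `quantize_int8`, `dequantize`, and the sample weights are hypothetical, and real PTQ pipelines calibrate per channel on representative data.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float weights."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0, 0.75, -0.125]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

fp32_bytes = len(weights) * array("f").itemsize  # 4 bytes per float32 weight
int8_bytes = len(q) * q.itemsize                 # 1 byte per int8 weight
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"{fp32_bytes} B -> {int8_bytes} B, max abs error {max_error:.5f}")
```

The rounding error per weight is bounded by half the scale, which is why accuracy loss stays small as long as the weight range is not dominated by a few outliers.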
## Consequences

### Positive
- **Single-file deployment**: Copy one `.rvf.edge` file to deploy anywhere
- **Offline operation**: Full functionality without network connectivity
- **125ms boot**: Near-instant readiness for emergency scenarios
- **Platform universal**: Same container format for browser, IoT, mobile, server
- **Battery efficient**: No network polling in offline mode

### Negative
- **Container size**: Even compressed, field containers are 50+ MB
- **WASM performance**: 2-5x slower than native Rust for compute-heavy operations
- **Browser limitations**: IndexedDB has storage quotas; WASM SIMD support varies
- **Update latency**: Offline devices miss updates until reconnection
- **Quantization accuracy**: Int4/Int8 models lose some detection sensitivity

## References

- [WebAssembly SIMD Proposal](https://github.com/WebAssembly/simd)
- [IndexedDB API](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API)
- [ONNX Runtime Web](https://onnxruntime.ai/docs/tutorials/web/)
- [Model Quantization Techniques](https://arxiv.org/abs/2103.13630)
- [RuVector WASM Runtime](https://github.com/ruvnet/ruvector)
- ADR-002: RuVector RVF Integration Strategy
- ADR-003: RVF Cognitive Containers for CSI Data
402
docs/adr/ADR-010-witness-chains-audit-trail-integrity.md
Normal file
@@ -0,0 +1,402 @@
# ADR-010: Witness Chains for Audit Trail Integrity

## Status
Proposed

## Date
2026-02-28

## Context

### Life-Critical Audit Requirements

The wifi-densepose-mat disaster detection module (ADR-001) makes triage classifications that directly affect rescue priority:

| Triage Level | Action | Consequence of Error |
|-------------|--------|---------------------|
| P1 (Immediate/Red) | Rescue NOW | False negative → survivor dies waiting |
| P2 (Delayed/Yellow) | Rescue within 1 hour | Misclassification → delayed rescue |
| P3 (Minor/Green) | Rescue when resources allow | Over-triage → resource waste |
| P4 (Deceased/Black) | No rescue attempted | False P4 → living person abandoned |

Post-incident investigations, liability proceedings, and operational reviews require:

1. **Non-repudiation**: Prove which device made which detection at which time
2. **Tamper evidence**: Detect if records were altered after the fact
3. **Completeness**: Prove no detections were deleted or hidden
4. **Causal chain**: Reconstruct the sequence of events leading to each triage decision
5. **Cross-device verification**: Corroborate detections across multiple APs

### Current State

Detection results are logged to the database (`wifi-densepose-db`) with standard INSERT operations. Logs can be:
- Silently modified after the fact
- Deleted without trace
- Backdated or reordered
- Lost if the database is corrupted

No cryptographic integrity mechanism exists.

### RuVector Witness Chains

RuVector implements hash-linked audit trails inspired by blockchain but without the consensus overhead:

- **Hash chain**: Each entry includes the SHAKE-256 hash of the previous entry, forming a tamper-evident chain
- **Signatures**: Chain anchors (every Nth entry) are signed with the device's key pair
- **Cross-chain attestation**: Multiple devices can cross-reference each other's chains
- **Compact**: Each chain entry is ~100-200 bytes (hash + metadata + signature reference)

## Decision

We will implement RuVector witness chains as the primary audit mechanism for all detection events, triage decisions, and model adaptation events in the WiFi-DensePose system.

### Witness Chain Structure

```
┌────────────────────────────────────────────────────────────────────┐
│                           Witness Chain                            │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│  Entry 0         Entry 1         Entry 2         Entry 3           │
│  (Genesis)                                                         │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐      │
│  │ prev: ∅  │◀───│ prev: H0 │◀───│ prev: H1 │◀───│ prev: H2 │      │
│  │ event:   │    │ event:   │    │ event:   │    │ event:   │      │
│  │ INIT     │    │ DETECT   │    │ TRIAGE   │    │ ADAPT    │      │
│  │ hash: H0 │    │ hash: H1 │    │ hash: H2 │    │ hash: H3 │      │
│  │ sig: S0  │    │          │    │          │    │ sig: S1  │      │
│  │ (anchor) │    │          │    │          │    │ (anchor) │      │
│  └──────────┘    └──────────┘    └──────────┘    └──────────┘      │
│                                                                    │
│  H0 = SHAKE-256(INIT || device_id || timestamp)                    │
│  H1 = SHAKE-256(DETECT_DATA || H0 || timestamp)                    │
│  H2 = SHAKE-256(TRIAGE_DATA || H1 || timestamp)                    │
│  H3 = SHAKE-256(ADAPT_DATA || H2 || timestamp)                     │
│                                                                    │
│  Anchor signature S0 = ML-DSA-65.sign(H0, device_key)              │
│  Anchor signature S1 = ML-DSA-65.sign(H3, device_key)              │
│  Anchor interval: every 100 entries (configurable);                │
│  the diagram anchors Entry 3 only to keep the picture small        │
└────────────────────────────────────────────────────────────────────┘
```

### Witnessed Event Types

```rust
/// Events recorded in the witness chain
#[derive(Serialize, Deserialize, Clone)]
pub enum WitnessedEvent {
    /// Chain initialization (genesis)
    ChainInit {
        device_id: DeviceId,
        firmware_version: String,
        config_hash: [u8; 32],
    },

    /// Human presence detected
    HumanDetected {
        detection_id: Uuid,
        confidence: f64,
        csi_features_hash: [u8; 32],  // Hash of input data, not raw data
        location_estimate: Option<GeoCoord>,
        model_version: String,
    },

    /// Triage classification assigned or changed
    TriageDecision {
        survivor_id: Uuid,
        previous_level: Option<TriageLevel>,
        new_level: TriageLevel,
        evidence_hash: [u8; 32],  // Hash of supporting evidence
        deciding_algorithm: String,
        confidence: f64,
    },

    /// False detection corrected
    DetectionCorrected {
        detection_id: Uuid,
        correction_type: CorrectionType,  // FalsePositive | FalseNegative | Reclassified
        reason: String,
        corrected_by: CorrectorId,  // Device or operator
    },

    /// Model adapted via SONA
    ModelAdapted {
        adaptation_id: Uuid,
        trigger: AdaptationTrigger,
        lora_delta_hash: [u8; 32],
        performance_before: f64,
        performance_after: f64,
    },

    /// Zone scan completed
    ZoneScanCompleted {
        zone_id: ZoneId,
        scan_duration_ms: u64,
        detections_count: usize,
        coverage_percentage: f64,
    },

    /// Cross-device attestation received
    CrossAttestation {
        attesting_device: DeviceId,
        attested_chain_hash: [u8; 32],
        attested_entry_index: u64,
    },

    /// Operator action (manual override)
    OperatorAction {
        operator_id: String,
        action: OperatorActionType,
        target: Uuid,  // What was acted upon
        justification: String,
    },
}
```

### Chain Entry Structure

```rust
/// A single entry in the witness chain
#[derive(Serialize, Deserialize)]
pub struct WitnessEntry {
    /// Sequential index in the chain
    index: u64,

    /// SHAKE-256 hash of the previous entry (32 bytes)
    previous_hash: [u8; 32],

    /// The witnessed event
    event: WitnessedEvent,

    /// Device that created this entry
    device_id: DeviceId,

    /// Monotonic timestamp (device-local, not wall clock)
    monotonic_timestamp: u64,

    /// Wall clock timestamp (best-effort, may be inaccurate)
    wall_timestamp: DateTime<Utc>,

    /// Vector clock for causal ordering (see ADR-008)
    vector_clock: VectorClock,

    /// This entry's hash: SHAKE-256 over the serialized entry, excluding
    /// `entry_hash` and `anchor_signature` (the signature is attached after hashing)
    entry_hash: [u8; 32],

    /// Anchor signature (present every N entries)
    anchor_signature: Option<HybridSignature>,
}
```

### Tamper Detection

```rust
/// Verify witness chain integrity
pub fn verify_chain(chain: &[WitnessEntry]) -> Result<ChainVerification, AuditError> {
    let mut verification = ChainVerification::new();

    for (i, entry) in chain.iter().enumerate() {
        // 1. Verify hash chain linkage
        if i > 0 {
            let expected_prev_hash = chain[i - 1].entry_hash;
            if entry.previous_hash != expected_prev_hash {
                verification.add_violation(ChainViolation::BrokenLink {
                    entry_index: entry.index,
                    expected_hash: expected_prev_hash,
                    actual_hash: entry.previous_hash,
                });
            }
        }

        // 2. Verify entry self-hash
        let computed_hash = compute_entry_hash(entry);
        if computed_hash != entry.entry_hash {
            verification.add_violation(ChainViolation::TamperedEntry {
                entry_index: entry.index,
            });
        }

        // 3. Verify anchor signatures
        if let Some(ref sig) = entry.anchor_signature {
            let device_keys = load_device_keys(&entry.device_id)?;
            if !sig.verify(&entry.entry_hash, &device_keys.ed25519, &device_keys.ml_dsa)? {
                verification.add_violation(ChainViolation::InvalidSignature {
                    entry_index: entry.index,
                });
            }
        }

        // 4. Verify monotonic timestamp ordering
        if i > 0 && entry.monotonic_timestamp <= chain[i - 1].monotonic_timestamp {
            verification.add_violation(ChainViolation::NonMonotonicTimestamp {
                entry_index: entry.index,
            });
        }

        verification.verified_entries += 1;
    }

    Ok(verification)
}
```

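The linkage and self-hash checks can be tried out in a few lines of Python. This is a simplified sketch under stated assumptions: entries are plain dictionaries, the canonical encoding is sorted-key JSON, and signatures, timestamps, and vector clocks are left out. The function names are illustrative and are not RuVector's API.

```python
import hashlib
import json

ZERO_HASH = bytes(32)  # stand-in for the genesis entry's "previous hash"


def entry_hash(index, previous_hash, event):
    """SHAKE-256 (32-byte output) over a canonical encoding of the entry body."""
    body = json.dumps(
        {"index": index, "prev": previous_hash.hex(), "event": event},
        sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    return hashlib.shake_256(body).digest(32)


def append(chain, event):
    """Link a new entry to the hash of the current chain tip."""
    prev = chain[-1]["hash"] if chain else ZERO_HASH
    index = len(chain)
    chain.append({"index": index, "prev": prev, "event": event,
                  "hash": entry_hash(index, prev, event)})


def verify(chain):
    """Return indices of entries whose link or self-hash does not check out."""
    bad = []
    for i, e in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else ZERO_HASH
        link_ok = e["prev"] == expected_prev
        hash_ok = entry_hash(e["index"], e["prev"], e["event"]) == e["hash"]
        if not (link_ok and hash_ok):
            bad.append(i)
    return bad


chain = []
for ev in ({"type": "INIT"},
           {"type": "DETECT", "confidence": 0.91},
           {"type": "TRIAGE", "level": "P1"}):
    append(chain, ev)

intact = verify(chain)                   # no violations on an untouched chain
chain[1]["event"]["confidence"] = 0.10   # tamper with a historical record
tampered = verify(chain)                 # entry 1 no longer matches its hash
print(intact, tampered)
```

If the attacker also recomputes the hash of entry 1, the violation moves to entry 2, whose `prev` no longer matches. Hiding the change means rewriting every later entry, which is what the signed anchors and cross-device attestations are meant to prevent.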
### Cross-Device Attestation

Multiple APs can cross-reference each other's chains for stronger guarantees:

```
Device A's chain:                    Device B's chain:
┌──────────┐                         ┌──────────┐
│ Entry 50 │                         │ Entry 73 │
│ H_A50    │◀────── cross-attest ───▶│ H_B73    │
└──────────┘                         └──────────┘

Device A records: CrossAttestation { attesting: B, hash: H_B73, index: 73 }
Device B records: CrossAttestation { attesting: A, hash: H_A50, index: 50 }

After cross-attestation:
- Neither device can rewrite entries before the attested point
  without the other device's chain becoming inconsistent
- An investigator can verify both chains agree on the attestation point
```

**Attestation frequency**: Every 5 minutes during connected operation, and immediately on significant events (P1 triage, zone completion).

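An investigator's cross-check can be sketched as a lookup: the hash that device A recorded for B's entry must still appear at that index in B's exported chain. Everything below is a hypothetical illustration (the dictionary shapes, `attestation_holds`, and the placeholder entry bytes are assumptions, not the RuVector format).

```python
import hashlib


def h(data: bytes) -> bytes:
    """32-byte SHAKE-256 digest, matching the hash size used by the chain."""
    return hashlib.shake_256(data).digest(32)


# Hypothetical exported view of device B's chain: entry index -> entry hash
chain_b = {72: h(b"B-entry-72"), 73: h(b"B-entry-73")}

# What device A recorded in its own chain when it attested B's entry 73
attestation_in_a = {"attesting": "B", "index": 73, "hash": h(b"B-entry-73")}


def attestation_holds(attestation, peer_chain):
    """True if the peer's chain still carries the attested hash at that index."""
    return peer_chain.get(attestation["index"]) == attestation["hash"]


ok_before = attestation_holds(attestation_in_a, chain_b)
chain_b[73] = h(b"B-entry-73-rewritten")  # device B rewrites its history
ok_after = attestation_holds(attestation_in_a, chain_b)
print(ok_before, ok_after)
```

Because A's record of `H_B73` is itself hashed into A's chain, B cannot make the mismatch disappear without A's chain being altered as well.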
### Storage and Retrieval

Witness chains are stored in the RVF container's WITNESS segment:

```rust
/// Witness chain storage manager
pub struct WitnessChainStore {
    /// Current chain being appended to
    active_chain: Vec<WitnessEntry>,

    /// Anchor signature interval
    anchor_interval: usize,  // 100

    /// Device signing key
    device_key: DeviceKeyPair,

    /// Cross-attestation peers
    attestation_peers: Vec<DeviceId>,

    /// RVF container for persistence
    container: RvfContainer,
}

impl WitnessChainStore {
    /// Append an event to the chain
    pub fn witness(&mut self, event: WitnessedEvent) -> Result<u64, AuditError> {
        let index = self.active_chain.len() as u64;
        let previous_hash = self.active_chain.last()
            .map(|e| e.entry_hash)
            .unwrap_or([0u8; 32]);

        let mut entry = WitnessEntry {
            index,
            previous_hash,
            event,
            device_id: self.device_key.device_id(),
            monotonic_timestamp: monotonic_now(),
            wall_timestamp: Utc::now(),
            vector_clock: self.get_current_vclock(),
            entry_hash: [0u8; 32],  // Computed below
            anchor_signature: None,
        };

        // Compute entry hash
        entry.entry_hash = compute_entry_hash(&entry);

        // Add anchor signature at interval
        if index % self.anchor_interval as u64 == 0 {
            entry.anchor_signature = Some(
                self.device_key.sign_hybrid(&entry.entry_hash)?
            );
        }

        // Persist to the RVF container first, so a failed write never leaves
        // an unpersisted entry in the in-memory chain
        self.container.append_witness(&entry)?;
        self.active_chain.push(entry);

        Ok(index)
    }

    /// Query chain for events in a time range
    pub fn query_range(&self, start: DateTime<Utc>, end: DateTime<Utc>)
        -> Vec<&WitnessEntry>
    {
        self.active_chain.iter()
            .filter(|e| e.wall_timestamp >= start && e.wall_timestamp <= end)
            .collect()
    }

    /// Export chain for external audit
    pub fn export_for_audit(&self) -> AuditBundle {
        AuditBundle {
            chain: self.active_chain.clone(),
            device_public_key: self.device_key.public_keys(),
            cross_attestations: self.collect_cross_attestations(),
            chain_summary: self.compute_summary(),
        }
    }
}
```

### Performance Impact

| Operation | Latency | Notes |
|-----------|---------|-------|
| Append entry | 0.05 ms | Hash computation + serialize |
| Append with anchor signature | 0.5 ms | + ML-DSA-65 sign |
| Verify single entry | 0.02 ms | Hash comparison |
| Verify anchor | 0.3 ms | ML-DSA-65 verify |
| Full chain verify (10K entries) | 50 ms | Sequential hash verification |
| Cross-attestation | 1 ms | Sign + network round-trip |

### Storage Requirements

Sizes assume ~200 bytes per entry, the upper end of the entry size quoted above.

| Activity Level | Entries/Hour | Size/Hour | Size/Day |
|-------------|-------------|-----------|----------|
| Low activity | ~100 | ~20 KB | ~480 KB |
| Normal operation | ~1,000 | ~200 KB | ~4.8 MB |
| Disaster response | ~10,000 | ~2 MB | ~48 MB |
| High-intensity scan | ~50,000 | ~10 MB | ~240 MB |

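The storage figures follow from entries per hour multiplied by the entry size. A small sketch of that arithmetic, assuming 200 bytes per entry and decimal units (both are assumptions; `chain_growth` is a hypothetical helper):

```python
ENTRY_BYTES = 200  # assumed average entry size, upper end of the ~100-200 byte range


def chain_growth(entries_per_hour: int) -> tuple[float, float]:
    """Return (KB per hour, MB per day) for a given append rate."""
    per_hour_kb = entries_per_hour * ENTRY_BYTES / 1000
    per_day_mb = per_hour_kb * 24 / 1000
    return per_hour_kb, per_day_mb


for label, rate in [("low", 100), ("normal", 1_000),
                    ("disaster", 10_000), ("high-intensity", 50_000)]:
    kb_hour, mb_day = chain_growth(rate)
    print(f"{label}: {kb_hour:.0f} KB/hour, {mb_day:.2f} MB/day")
```

At the high-intensity rate a week-long deployment accumulates roughly 1.7 MB × 1000, that is about 1.7 GB, which is why the archival strategy matters for long deployments.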
## Consequences

### Positive
- **Tamper-evident**: Any modification to historical records is detectable
- **Non-repudiable**: Signed anchors prove device identity
- **Complete history**: Every detection, triage, and correction is recorded
- **Cross-verified**: Multi-device attestation strengthens guarantees
- **Forensically sound**: Exportable audit bundles for legal proceedings
- **Low overhead**: 0.05ms per entry; minimal storage for normal operation

### Negative
- **Append-only growth**: Chains grow monotonically; need archival strategy for long deployments
- **Key management**: Device keys must be provisioned and protected
- **Clock dependency**: Wall-clock timestamps are best-effort; monotonic timestamps are device-local
- **Verification cost**: Full chain verification of long chains takes meaningful time (50ms/10K entries)
- **Privacy tension**: Detailed audit trails contain operational intelligence

### Regulatory Alignment

| Requirement | How Witness Chains Address It |
|------------|------------------------------|
| GDPR (Right to erasure) | Event hashes stored, not personal data; original data deletable while chain proves historical integrity |
| HIPAA (Audit controls) | Complete access/modification log with non-repudiation |
| ISO 27001 (Information security) | Tamper-evident records, access logging, integrity verification |
| NIST SP 800-53 (AU controls) | Audit record generation, protection, and review capability |
| FEMA ICS (Incident Command) | Chain of custody for all operational decisions |

## References

- [Witness Chains in Distributed Systems](https://eprint.iacr.org/2019/747)
- [SHAKE-256 (FIPS 202)](https://csrc.nist.gov/pubs/fips/202/final)
- [Tamper-Evident Logging](https://www.usenix.org/legacy/event/sec09/tech/full_papers/crosby.pdf)
- [RuVector Witness Implementation](https://github.com/ruvnet/ruvector)
- ADR-001: WiFi-Mat Disaster Detection Architecture
- ADR-007: Post-Quantum Cryptography for Secure Sensing
- ADR-008: Distributed Consensus for Multi-AP Coordination
414
docs/adr/ADR-011-python-proof-of-reality-mock-elimination.md
Normal file
@@ -0,0 +1,414 @@
# ADR-011: Python Proof-of-Reality and Mock Elimination

## Status
Proposed (URGENT)

## Date
2026-02-28

## Context

### The Credibility Problem

The WiFi-DensePose Python codebase contains real, mathematically sound signal processing (FFT, phase unwrapping, Doppler extraction, correlation features) alongside mock/placeholder code that fatally undermines credibility. External reviewers who encounter **any** mock path in the default execution flow conclude the entire system is synthetic. This is not a technical problem - it is a perception problem with technical root causes.

### Specific Mock/Placeholder Inventory

The following code paths produce fake data **in the default configuration** or are easily mistaken for indicating fake functionality:

#### Critical Severity (produces fake output on default path)

| File | Line | Issue | Impact |
|------|------|-------|--------|
| `v1/src/core/csi_processor.py` | 390 | `doppler_shift = np.random.rand(10) # Placeholder` | **Real feature extractor returns random Doppler** - kills credibility of entire feature pipeline |
| `v1/src/hardware/csi_extractor.py` | 83-84 | `amplitude = np.random.rand(...)` in CSI extraction fallback | Random data silently substituted when parsing fails |
| `v1/src/hardware/csi_extractor.py` | 129-135 | `_parse_atheros()` returns `np.random.rand()` with comment "placeholder implementation" | Named as if it parses real data, actually random |
| `v1/src/hardware/router_interface.py` | 211-212 | `np.random.rand(3, 56)` in fallback path | Silent random fallback |
| `v1/src/services/pose_service.py` | 431 | `mock_csi = np.random.randn(64, 56, 3) # Mock CSI data` | Mock CSI in production code path |
| `v1/src/services/pose_service.py` | 293-356 | `_generate_mock_poses()` with `random.randint` throughout | Entire mock pose generator in service layer |
| `v1/src/services/pose_service.py` | 489-607 | Multiple `random.randint` for occupancy, historical data | Fake statistics that look real in API responses |
| `v1/src/api/dependencies.py` | 82, 408 | "return a mock user for development" | Auth bypass in default path |

#### Moderate Severity (mock gated behind flags but confusing)

| File | Line | Issue |
|------|------|-------|
| `v1/src/config/settings.py` | 144-145 | `mock_hardware=False`, `mock_pose_data=False` defaults - correct, but mock infrastructure exists |
| `v1/src/core/router_interface.py` | 27-300 | 270+ lines of mock data generation infrastructure in production code |
| `v1/src/services/pose_service.py` | 84-88 | Silent conditional: `if not self.settings.mock_pose_data` with no logging of real-mode |
| `v1/src/services/hardware_service.py` | 72-375 | Interleaved mock/real paths throughout |

#### Low Severity (placeholders/TODOs)

| File | Line | Issue |
|------|------|-------|
| `v1/src/core/router_interface.py` | 198 | "Collect real CSI data from router (placeholder implementation)" |
| `v1/src/api/routers/health.py` | 170-171 | `uptime_seconds = 0.0 # TODO` |
| `v1/src/services/pose_service.py` | 739 | `"uptime_seconds": 0.0 # TODO` |

### Root Cause Analysis

1. **No separation between mock and real**: Mock generators live in the same modules as real processors. A reviewer reading `csi_processor.py` hits `np.random.rand(10)` at line 390 and stops trusting the 400 lines of real signal processing above it.

2. **Silent fallbacks**: When real hardware isn't available, the system silently falls back to random data instead of failing loudly. This means the default `docker compose up` produces plausible-looking but entirely fake results.

3. **No proof artifact**: There is no shipped CSI capture file, no expected output hash, no way for a reviewer to verify that the pipeline produces deterministic results from real input.

4. **Build environment fragility**: The `Dockerfile` references `requirements.txt` which doesn't exist as a standalone file. The `setup.py` hardcodes 87 dependencies. ONNX Runtime and BLAS are not in the container. A `docker build` may or may not succeed depending on the machine.

5. **No CI verification**: No GitHub Actions workflow runs the pipeline on a real or deterministic input and verifies the output.

## Decision

We will eliminate the credibility gap through five concrete changes:

### 1. Eliminate All Silent Mock Fallbacks (HARD FAIL)

**Every path that currently returns `np.random.rand()` will either be replaced with real computation or will raise an explicit error.**

```python
import numpy as np
import scipy.fft

# BEFORE (csi_processor.py:390)
doppler_shift = np.random.rand(10)  # Placeholder

# AFTER
def _extract_doppler_features(self, csi_data: CSIData) -> tuple:
    """Extract Doppler and frequency domain features from CSI temporal history."""
    psd = np.abs(scipy.fft.fft(csi_data.amplitude.flatten(), n=128)) ** 2

    if len(self.csi_history) < 2:
        # Not enough history for temporal analysis - return zeros, not random
        return np.zeros(self.window_size), psd

    # Real Doppler extraction from temporal CSI phase differences.
    # The measured phase is required here: np.angle() of a real-valued amplitude
    # is always 0. Assumes each history entry exposes a `phase` array.
    history = self.get_recent_history(self.window_size)
    phase_history = np.unwrap(np.array([h.phase for h in history]), axis=0)
    # Phase change between consecutive frames (proportional to Doppler shift)
    temporal_phase_diff = np.diff(phase_history, axis=0)
    # Average across antennas, FFT across time for Doppler spectrum
    doppler_spectrum = np.abs(scipy.fft.fft(temporal_phase_diff.mean(axis=1), axis=0))
    doppler_shift = doppler_spectrum.mean(axis=1)

    return doppler_shift, psd
```

```python
# BEFORE (csi_extractor.py:129-135)
def _parse_atheros(self, raw_data):
    """Parse Atheros CSI format (placeholder implementation)."""
    # For now, return mock data for testing
    return CSIData(amplitude=np.random.rand(3, 56), ...)

# AFTER
def _parse_atheros(self, raw_data: bytes) -> CSIData:
    """Parse Atheros CSI Tool format.

    Format: Atheros CSI Tool (Xie et al., "Precise Power Delay Profiling
    with Commodity WiFi"). The linux-80211n-csitool format is for Intel 5300
    NICs and does not apply here.
    """
    if len(raw_data) < 25:  # Minimum Atheros CSI header
        raise CSIExtractionError(
            f"Atheros CSI data too short ({len(raw_data)} bytes). "
            "Expected real CSI capture from Atheros-based NIC. "
            "See docs/hardware-setup.md for capture instructions."
        )
    # Parse actual Atheros binary format
    # ... real parsing implementation ...
```

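The hard-fail rule reduces to one pattern: validate the input, raise a descriptive error, and never manufacture a substitute. A minimal self-contained sketch follows; `read_csi_frame` and the 25-byte default are illustrative assumptions, not the project's parser.

```python
class CSIExtractionError(Exception):
    """Raised when no real CSI data can be produced from the input."""


def read_csi_frame(raw: bytes, header_len: int = 25) -> bytes:
    """Return the payload after the header, or raise.

    There is deliberately no fallback branch: an undersized frame is an
    error the caller must see, not a cue to synthesize data.
    """
    if len(raw) < header_len:
        raise CSIExtractionError(
            f"CSI frame too short ({len(raw)} bytes, need at least {header_len})"
        )
    return raw[header_len:]


try:
    read_csi_frame(b"\x00" * 10)
except CSIExtractionError as exc:
    print(f"capture rejected: {exc}")
```

Callers that want to keep running without hardware have to opt in to the mock module explicitly, which keeps synthetic data out of the default path.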
### 2. Isolate Mock Infrastructure Behind Explicit Flag with Banner

**All mock code moves to a dedicated module. Default execution NEVER touches mock paths.**

```
v1/src/
├── core/
│   ├── csi_processor.py          # Real processing only
│   └── router_interface.py       # Real hardware interface only
├── testing/                      # NEW: isolated mock module
│   ├── __init__.py
│   ├── mock_csi_generator.py     # Mock CSI generation (moved from router_interface)
│   ├── mock_pose_generator.py    # Mock poses (moved from pose_service)
│   └── fixtures/                 # Test fixtures, not production paths
│       ├── sample_csi_capture.bin    # Real captured CSI data (tiny sample)
│       └── expected_output.json      # Expected pipeline output for sample
```

**Runtime enforcement:**
```python
import logging
import os
import sys

MOCK_MODE = os.environ.get("WIFI_DENSEPOSE_MOCK", "").lower() == "true"

if MOCK_MODE:
    # Prefix EVERY log message with a mock-mode banner
    _original_log = logging.Logger._log

    def _mock_banner_log(self, level, msg, args, **kwargs):
        _original_log(self, level, f"[MOCK MODE] {msg}", args, **kwargs)

    logging.Logger._log = _mock_banner_log

    print("=" * 72, file=sys.stderr)
    print("  WARNING: RUNNING IN MOCK MODE - ALL DATA IS SYNTHETIC", file=sys.stderr)
    print("  Set WIFI_DENSEPOSE_MOCK=false for real operation", file=sys.stderr)
    print("=" * 72, file=sys.stderr)
```

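Patching the private `logging.Logger._log` works, but a `logging.Filter` reaches the same result through the public API. The sketch below is one possible alternative, not the decided implementation; `MockBannerFilter` and the logger name are hypothetical.

```python
import io
import logging

BANNER = "[MOCK MODE] "


class MockBannerFilter(logging.Filter):
    """Prefix every record passing through the handler with the mock banner."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Guard against double-prefixing when a record visits several handlers
        if not str(record.msg).startswith(BANNER):
            record.msg = f"{BANNER}{record.msg}"
        return True  # never drop the record, only annotate it


# Demo: attach the filter to a handler and capture what gets written
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(MockBannerFilter())

log = logging.getLogger("wifi_densepose.demo")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

log.info("pose estimated for %d people", 2)
captured = stream.getvalue()
print(captured, end="")
```

A filter keeps the reported caller information intact, since no extra stack frame sits between the logging call and the `logging` module, and it can be removed again in tests without restoring a patched method.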
### 3. Ship a Reproducible Proof Bundle

A small real CSI capture file + one-command verification pipeline:

```
v1/data/proof/
├── README.md                        # How to verify
├── sample_csi_capture.bin           # Real CSI data (1 second, ~50 KB)
├── sample_csi_capture_meta.json     # Capture metadata (hardware, env)
├── expected_features.json           # Expected feature extraction output
├── expected_features.sha256         # SHA-256 hash of expected output
└── verify.py                        # One-command verification script
```

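One practical concern with hashing feature output is that floating-point results can differ in the last bits across BLAS or FFT builds. A possible mitigation, sketched here as an assumption rather than as part of the proposed bundle, is to round values before canonical serialization. The helper name `canonical_hash` and the 6-decimal default are hypothetical.

```python
import hashlib
import json


def canonical_hash(features: dict, decimals: int = 6) -> str:
    """SHA-256 of a canonical JSON encoding of rounded feature vectors."""
    rounded = {
        # "+ 0.0" turns a rounded -0.0 into 0.0 so both serialize identically
        key: [round(float(x), decimals) + 0.0 for x in values]
        for key, values in features.items()
    }
    payload = json.dumps(rounded, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


run_a = {"doppler_shift": [0.1234561, 1.0], "amplitude_mean": [2.5]}
run_b = {"amplitude_mean": [2.5], "doppler_shift": [0.1234562, 1.0]}  # tiny drift, other key order
run_c = {"amplitude_mean": [2.5], "doppler_shift": [0.1250000, 1.0]}  # real change

same = canonical_hash(run_a) == canonical_hash(run_b)
different = canonical_hash(run_a) != canonical_hash(run_c)
print(same, different)
```

The trade-off is that rounding hides differences below the chosen precision, so the number of decimals should be well below the smallest change that matters to the pipeline.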
**verify.py**:
|
||||||
|
```python
#!/usr/bin/env python3
"""Verify WiFi-DensePose pipeline produces deterministic output from real CSI data.

Usage:
    python v1/data/proof/verify.py

Expected output:
    PASS: Pipeline output matches expected hash
    SHA256: <hash>

If this passes, the signal processing pipeline is producing real,
deterministic results from real captured CSI data.
"""
import hashlib
import json
import os
import sys

# Ensure reproducibility.
# NOTE: PYTHONHASHSEED is read at interpreter startup, so setting it here only
# affects child processes; export it in the shell to pin this process as well.
os.environ["PYTHONHASHSEED"] = "42"
import numpy as np
np.random.seed(42)  # Only affects any remaining random elements

# Make the v1/ package root importable (this file lives in v1/data/proof/)
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "../.."))

from src.core.csi_processor import CSIProcessor
from src.hardware.csi_extractor import CSIExtractor


def main():
    # Load real captured CSI data
    capture_path = os.path.join(os.path.dirname(__file__), "sample_csi_capture.bin")
    meta_path = os.path.join(os.path.dirname(__file__), "sample_csi_capture_meta.json")
    expected_hash_path = os.path.join(os.path.dirname(__file__), "expected_features.sha256")

    with open(meta_path) as f:
        meta = json.load(f)

    # Extract CSI from binary capture
    extractor = CSIExtractor(format=meta["format"])
    csi_data = extractor.extract_from_file(capture_path)

    # Process through feature pipeline
    config = {
        "sampling_rate": meta["sampling_rate"],
        "window_size": meta["window_size"],
        "overlap": meta["overlap"],
        "noise_threshold": meta["noise_threshold"],
    }
    processor = CSIProcessor(config)
    features = processor.extract_features(csi_data)

    # Serialize features deterministically
    output = {
        "amplitude_mean": features.amplitude_mean.tolist(),
        "amplitude_variance": features.amplitude_variance.tolist(),
        "phase_difference": features.phase_difference.tolist(),
        "doppler_shift": features.doppler_shift.tolist(),
        "psd_first_16": features.power_spectral_density[:16].tolist(),
    }
    output_json = json.dumps(output, sort_keys=True, separators=(",", ":"))
    output_hash = hashlib.sha256(output_json.encode()).hexdigest()

    # Verify against expected hash
    with open(expected_hash_path) as f:
        expected_hash = f.read().strip()

    if output_hash == expected_hash:
        print("PASS: Pipeline output matches expected hash")
        print(f"SHA256: {output_hash}")
        print(f"Features: {len(output['amplitude_mean'])} subcarriers processed")
        return 0
    else:
        print("FAIL: Hash mismatch")
        print(f"Expected: {expected_hash}")
        print(f"Got: {output_hash}")
        return 1


if __name__ == "__main__":
    sys.exit(main())
```
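
Hashing full-precision floats can make the check brittle, because the last bits of FFT or BLAS results may differ between library builds. The sketch below shows one way to canonicalize features before hashing, assuming rounding to a fixed number of decimals is acceptable for the proof bundle; `canonical_hash` and its `decimals` parameter are hypothetical and not part of `verify.py` as specified above.

```python
import hashlib
import json


def canonical_hash(features: dict, decimals: int = 6) -> str:
    """SHA-256 over a key-sorted, fixed-precision JSON encoding of the features."""
    rounded = {
        # Adding 0.0 turns a rounded -0.0 into 0.0 so both serialize identically.
        name: [round(float(value), decimals) + 0.0 for value in values]
        for name, values in features.items()
    }
    payload = json.dumps(rounded, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Rounding narrows the problem but does not remove it: a value that sits right on a rounding boundary can still land on either side, so pinned dependency versions remain the primary guarantee.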
|
||||||
|
|
||||||
|
### 4. Pin the Build Environment
|
||||||
|
|
||||||
|
**Option A (recommended): Deterministic Dockerfile that works on fresh machine**
|
||||||
|
|
||||||
|
```dockerfile
FROM python:3.11-slim

# System deps that actually matter
RUN apt-get update && apt-get install -y --no-install-recommends \
    libopenblas-dev \
    libfftw3-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Pinned requirements (not a reference to missing file)
COPY v1/requirements-lock.txt ./requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

COPY v1/ ./v1/

# Proof of reality: verify pipeline on build
RUN cd v1 && python data/proof/verify.py

EXPOSE 8000
# Default: REAL mode (mock requires explicit opt-in)
ENV WIFI_DENSEPOSE_MOCK=false
CMD ["uvicorn", "v1.src.api.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
|
||||||
|
|
||||||
|
**Key change**: `RUN cd v1 && python data/proof/verify.py` **during build** means the Docker image cannot be created unless the pipeline produces the expected output from the committed CSI capture.
|
||||||
|
|
||||||
|
**Requirements lockfile** (`v1/requirements-lock.txt`):
|
||||||
|
```
# Core (required)
fastapi==0.115.6
uvicorn[standard]==0.34.0
pydantic==2.10.4
pydantic-settings==2.7.1
numpy==1.26.4
scipy==1.14.1

# Signal processing (required)
# No ONNX required for basic pipeline verification

# Optional (install separately for full features)
# torch>=2.1.0
# onnxruntime>=1.17.0
```
|
||||||
|
|
||||||
|
### 5. CI Pipeline That Proves Reality
|
||||||
|
|
||||||
|
```yaml
# .github/workflows/verify-pipeline.yml
name: Verify Signal Pipeline

on:
  push:
    paths: ['v1/src/**', 'v1/data/proof/**']
  pull_request:
    paths: ['v1/src/**']

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install minimal deps
        run: pip install numpy scipy pydantic pydantic-settings
      - name: Verify pipeline determinism
        run: python v1/data/proof/verify.py
      - name: Verify no random in production paths
        run: |
          # Fail if np.random appears in production code (not in testing/)
          ! grep -r "np\.random\.\(rand\|randn\|randint\)" v1/src/ \
            --include="*.py" \
            --exclude-dir=testing \
            || (echo "FAIL: np.random found in production code" && exit 1)
```
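
The grep step matches text, so it also fires on comments and strings and misses calls such as `np.random.default_rng`. A syntax-aware check is one possible refinement. The sketch below assumes NumPy is imported as `np` or `numpy`; `find_random_calls` and `BANNED_ROOTS` are hypothetical names, and aliased imports such as `from numpy import random` are deliberately out of scope.

```python
import ast

BANNED_ROOTS = {"np", "numpy"}  # assumed import aliases for NumPy


def find_random_calls(source: str) -> list[int]:
    """Return sorted line numbers where `<root>.random.<name>` is referenced."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Match the attribute chain Name(root) -> .random -> .<anything>
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Attribute)
            and node.value.attr == "random"
            and isinstance(node.value.value, ast.Name)
            and node.value.value.id in BANNED_ROOTS
        ):
            hits.append(node.lineno)
    return sorted(hits)
```

A CI step could run this over every `*.py` file under `v1/src/` outside `testing/` and fail when any file returns a non-empty list.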
|
||||||
|
|
||||||
|
### Concrete File Changes Required
|
||||||
|
|
||||||
|
| File | Action | Description |
|------|--------|-------------|
| `v1/src/core/csi_processor.py:390` | **Replace** | Real Doppler extraction from temporal CSI history |
| `v1/src/hardware/csi_extractor.py:83-84` | **Replace** | Hard error with descriptive message when parsing fails |
| `v1/src/hardware/csi_extractor.py:129-135` | **Replace** | Real Atheros CSI parser or hard error with hardware instructions |
| `v1/src/hardware/router_interface.py:198-212` | **Replace** | Hard error for unimplemented hardware, or real `iwconfig` + CSI tool integration |
| `v1/src/services/pose_service.py:293-356` | **Move** | Move `_generate_mock_poses()` to `v1/src/testing/mock_pose_generator.py` |
| `v1/src/services/pose_service.py:430-431` | **Remove** | Remove mock CSI generation from production path |
| `v1/src/services/pose_service.py:489-607` | **Replace** | Real statistics from database, or explicit "no data" response |
| `v1/src/core/router_interface.py:60-300` | **Move** | Move mock generator to `v1/src/testing/mock_csi_generator.py` |
| `v1/src/api/dependencies.py:82,408` | **Replace** | Real auth check or explicit dev-mode bypass with logging |
| `v1/data/proof/` | **Create** | Proof bundle (sample capture + expected hash + verify script) |
| `v1/requirements-lock.txt` | **Create** | Pinned minimal dependencies |
| `.github/workflows/verify-pipeline.yml` | **Create** | CI verification |
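
Several rows above call for a "hard error with descriptive message" in place of a silent random fallback. The following sketch illustrates that pattern under stated assumptions: `CSIExtractionError` and `read_capture` are hypothetical names, and the real extractor's interface may differ.

```python
from pathlib import Path


class CSIExtractionError(RuntimeError):
    """Raised instead of substituting synthetic data (hypothetical name)."""


def read_capture(path: str) -> bytes:
    """Return the raw bytes of a CSI capture, or fail loudly."""
    capture = Path(path)
    if not capture.is_file():
        raise CSIExtractionError(
            f"CSI capture not found: {path}. Connect supported hardware or use "
            "the proof bundle; synthetic data requires WIFI_DENSEPOSE_MOCK=true."
        )
    data = capture.read_bytes()
    if not data:
        raise CSIExtractionError(f"CSI capture is empty: {path}")
    return data
```

The point of the design is that the caller can never receive data of unknown origin: either real bytes come back or an exception names the missing input.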
|
||||||
|
|
||||||
|
### Hardware Documentation
|
||||||
|
|
||||||
|
```
v1/docs/hardware-setup.md (to be created)

# Supported Hardware Matrix

| Chipset | Tool | OS | Capture Command |
|---------|------|----|-----------------|
| Intel 5300 | Linux 802.11n CSI Tool | Ubuntu 18.04 | `sudo ./log_to_file csi.dat` |
| Atheros AR9580 | Atheros CSI Tool | Ubuntu 14.04 | `sudo ./recv_csi csi.dat` |
| Broadcom BCM4339 | Nexmon CSI | Android/Nexus 5 | `nexutil -m1 -k1 ...` |
| ESP32 | ESP32-CSI | ESP-IDF | `csi_recv --format binary` |

# Calibration
1. Place router and receiver 2m apart, line of sight
2. Capture 10 seconds of empty-room baseline
3. Have one person walk through at normal pace
4. Capture 10 seconds during walk-through
5. Run calibration: `python v1/scripts/calibrate.py --baseline empty.dat --activity walk.dat`
```
|
||||||
|
|
||||||
|
## Consequences
|
||||||
|
|
||||||
|
### Positive
|
||||||
|
- **"Clone, build, verify" in one command**: `docker build -t wifi-densepose . && docker run --rm wifi-densepose python v1/data/proof/verify.py` produces a deterministic PASS
|
||||||
|
- **No silent fakes**: Random data never appears in production output
|
||||||
|
- **CI enforcement**: PRs that introduce `np.random` in production paths fail automatically
|
||||||
|
- **Credibility anchor**: SHA-256 verified output from a committed real CSI capture gives anyone a reproducible check that the pipeline processes that capture deterministically
|
||||||
|
- **Clear mock boundary**: Mock code exists only in `v1/src/testing/`, never imported by production modules
|
||||||
|
|
||||||
|
### Negative
|
||||||
|
- **Requires real CSI capture**: Someone must capture and commit a real CSI sample (one-time effort)
|
||||||
|
- **Build may fail without hardware**: Without mock fallback, systems without WiFi hardware cannot demo - must use proof bundle instead
|
||||||
|
- **Migration effort**: Moving mock code to separate module requires updating imports in test files
|
||||||
|
- **Stricter development workflow**: Developers must explicitly opt in to mock mode
|
||||||
|
|
||||||
|
### Acceptance Criteria
|
||||||
|
|
||||||
|
A stranger can:
|
||||||
|
1. `git clone` the repository
|
||||||
|
2. Run ONE command (`docker build .` or `python v1/data/proof/verify.py`)
|
||||||
|
3. See `PASS: Pipeline output matches expected hash` with a specific SHA-256
|
||||||
|
4. Confirm no `np.random` in any non-test file via CI badge
|
||||||
|
|
||||||
|
If this works 100% over 5 runs on a clean machine, the "fake" narrative dies.
|
||||||
|
|
||||||
|
### Answering the Two Key Questions
|
||||||
|
|
||||||
|
**Q1: Docker or Nix first?**
|
||||||
|
Recommendation: **Docker first**. The Dockerfile already exists, just needs fixing. Nix is higher quality but smaller audience. Docker gives the widest "clone and verify" coverage.
|
||||||
|
|
||||||
|
**Q2: Are external crates public and versioned?**
|
||||||
|
The Python dependencies are all public PyPI packages. The Rust `ruvector-core` and `ruvector-data-framework` crates are currently commented out in `Cargo.toml` (lines 83-84: `# ruvector-core = "0.1"`) and are not yet published to crates.io. They are internal to ruvnet. This is a blocker for the Rust path but does not affect the Python proof-of-reality work in this ADR.
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- [Linux 802.11n CSI Tool](https://dhalperi.github.io/linux-80211n-csitool/)
|
||||||
|
- [Atheros CSI Tool](https://wands.sg/research/wifi/AtherosCSI/)
|
||||||
|
- [Nexmon CSI](https://github.com/seemoo-lab/nexmon_csi)
|
||||||
|
- [ESP32 CSI](https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/wifi.html#wi-fi-channel-state-information)
|
||||||
|
- [Reproducible Builds](https://reproducible-builds.org/)
|
||||||
|
- ADR-002: RuVector RVF Integration Strategy
|
||||||
318
docs/adr/ADR-012-esp32-csi-sensor-mesh.md
Normal file
@@ -0,0 +1,318 @@
|
|||||||
|
# ADR-012: ESP32 CSI Sensor Mesh for Distributed Sensing
|
||||||
|
|
||||||
|
## Status
|
||||||
|
Proposed
|
||||||
|
|
||||||
|
## Date
|
||||||
|
2026-02-28
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
### The Hardware Reality Gap
|
||||||
|
|
||||||
|
WiFi-DensePose's Rust and Python pipelines implement real signal processing (FFT, phase unwrapping, Doppler extraction, correlation features), but the system currently has no defined path from **physical WiFi hardware → CSI bytes → pipeline input**. The `csi_extractor.py` and `router_interface.py` modules contain placeholder parsers that return `np.random.rand()` instead of real parsed data (see ADR-011).
|
||||||
|
|
||||||
|
To close this gap, we need a concrete, affordable, reproducible hardware platform that produces real CSI data and streams it into the existing pipeline.
|
||||||
|
|
||||||
|
### Why ESP32
|
||||||
|
|
||||||
|
| Factor | ESP32/ESP32-S3 | Intel 5300 (iwl5300) | Atheros AR9580 |
|--------|---------------|---------------------|----------------|
| Cost | ~$5-15/node | ~$50-100 (used NIC) | ~$30-60 (used NIC) |
| Availability | Mass produced, in stock | Discontinued, eBay only | Discontinued, eBay only |
| CSI Support | Official ESP-IDF API | Linux CSI Tool (kernel mod) | Atheros CSI Tool |
| Form Factor | Standalone MCU | Requires PCIe/Mini-PCIe host | Requires PCIe host |
| Deployment | Battery/USB, wireless | Desktop/laptop only | Desktop/laptop only |
| Antenna Config | 1-2 TX, 1-2 RX | 3 TX, 3 RX (MIMO) | 3 TX, 3 RX (MIMO) |
| Subcarriers | 52-56 (802.11n) | 30 (compressed) | 56 (full) |
| Fidelity | Lower (consumer SoC) | Higher (dedicated NIC) | Higher (dedicated NIC) |
|
||||||
|
|
||||||
|
**ESP32 wins on deployability**: It's the only option where a stranger can buy nodes on Amazon, flash firmware, and have a working CSI mesh in an afternoon. Intel 5300 and Atheros cards require specific hardware, kernel modifications, and legacy OS versions.
|
||||||
|
|
||||||
|
### ESP-IDF CSI API
|
||||||
|
|
||||||
|
Espressif provides official CSI support through three key functions:
|
||||||
|
|
||||||
|
```c
// 1. Configure what CSI data to capture
wifi_csi_config_t csi_config = {
    .lltf_en = true,            // Long Training Field (best for CSI)
    .htltf_en = true,           // HT-LTF
    .stbc_htltf2_en = true,     // STBC HT-LTF2
    .ltf_merge_en = true,       // Merge LTFs
    .channel_filter_en = false,
    .manu_scale = false,
};
esp_wifi_set_csi_config(&csi_config);

// 2. Register callback for received CSI data
esp_wifi_set_csi_rx_cb(csi_data_callback, NULL);

// 3. Enable CSI collection
esp_wifi_set_csi(true);

// Callback receives:
void csi_data_callback(void *ctx, wifi_csi_info_t *info) {
    // info->rx_ctrl: RSSI, noise_floor, channel, secondary_channel, etc.
    // info->buf: Raw CSI data (I/Q pairs per subcarrier)
    // info->len: Length of CSI data buffer
    // Typical: 112 bytes = 56 subcarriers × 2 (I,Q) × 1 byte each
}
```
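
On the aggregator side, the raw buffer described in the callback comments can be turned into per-subcarrier amplitude and phase. This is a minimal sketch, assuming each subcarrier is stored as two signed bytes with the imaginary part first and the real part second, as the ESP-IDF Wi-Fi guide describes; that ordering should be checked against the ESP-IDF version in use. `parse_csi_buffer` is a hypothetical helper name.

```python
import math
import struct


def parse_csi_buffer(buf: bytes) -> tuple[list[float], list[float]]:
    """Convert int8 (imag, real) pairs into amplitude and phase lists."""
    if len(buf) % 2:
        raise ValueError("CSI buffer must contain whole (imag, real) pairs")
    values = struct.unpack(f"{len(buf)}b", buf)  # signed 8-bit samples
    amplitude, phase = [], []
    # Assumed layout: even offsets hold the imaginary part, odd offsets the real part.
    for imag, real in zip(values[0::2], values[1::2]):
        amplitude.append(math.hypot(real, imag))  # |CSI|
        phase.append(math.atan2(imag, real))      # arg(CSI) in radians
    return amplitude, phase
```

For the 112-byte buffer mentioned above, this returns two lists of 56 values each.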
|
||||||
|
|
||||||
|
## Decision
|
||||||
|
|
||||||
|
We will build an ESP32 CSI Sensor Mesh as the primary hardware integration path, with a full stack from firmware to aggregator to Rust pipeline to visualization.
|
||||||
|
|
||||||
|
### System Architecture
|
||||||
|
|
||||||
|
```
┌────────────────────────────────────────────────────────────────┐
│                     ESP32 CSI Sensor Mesh                      │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐                      │
│  │  ESP32   │  │  ESP32   │  │  ESP32   │  ... (3-6 nodes)     │
│  │  Node 1  │  │  Node 2  │  │  Node 3  │                      │
│  │          │  │          │  │          │                      │
│  │  CSI Rx  │  │  CSI Rx  │  │  CSI Rx  │  ← WiFi frames from  │
│  │  FFT     │  │  FFT     │  │  FFT     │    consumer router   │
│  │ Features │  │ Features │  │ Features │                      │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘                      │
│       │             │             │                            │
│       │  UDP/TCP stream (WiFi or secondary channel)            │
│       │             │             │                            │
│       ▼             ▼             ▼                            │
│  ┌─────────────────────────────────────────┐                   │
│  │               Aggregator                │                   │
│  │  (Laptop / Raspberry Pi / Seed device)  │                   │
│  │                                         │                   │
│  │  1. Receive CSI streams from all nodes  │                   │
│  │  2. Timestamp alignment (per-node)      │                   │
│  │  3. Feature-level fusion                │                   │
│  │  4. Feed into Rust/Python pipeline      │                   │
│  │  5. Serve WebSocket to visualization    │                   │
│  └──────────────────┬──────────────────────┘                   │
│                     │                                          │
│                     ▼                                          │
│  ┌─────────────────────────────────────────┐                   │
│  │         WiFi-DensePose Pipeline         │                   │
│  │                                         │                   │
│  │  CsiProcessor → FeatureExtractor →      │                   │
│  │  MotionDetector → PoseEstimator →       │                   │
│  │  Three.js Visualization                 │                   │
│  └─────────────────────────────────────────┘                   │
└────────────────────────────────────────────────────────────────┘
```
|
||||||
|
|
||||||
|
### Node Firmware Specification
|
||||||
|
|
||||||
|
**ESP-IDF project**: `firmware/esp32-csi-node/`
|
||||||
|
|
||||||
|
```
firmware/esp32-csi-node/
├── CMakeLists.txt
├── sdkconfig.defaults          # Menuconfig defaults with CSI enabled
├── main/
│   ├── CMakeLists.txt
│   ├── main.c                  # Entry point, WiFi init, CSI callback
│   ├── csi_collector.c         # CSI data collection and buffering
│   ├── csi_collector.h
│   ├── feature_extract.c       # On-device FFT and feature extraction
│   ├── feature_extract.h
│   ├── stream_sender.c         # UDP stream to aggregator
│   ├── stream_sender.h
│   ├── config.h                # Node configuration (SSID, aggregator IP)
│   └── Kconfig.projbuild       # Menuconfig options
├── components/
│   └── esp_dsp/                # Espressif DSP library for FFT
└── README.md                   # Flash instructions
```
|
||||||
|
|
||||||
|
**On-device processing** (node does pre-processing before streaming):

```c
// feature_extract.c
#include <stdint.h>

typedef struct {
    uint32_t timestamp_ms;     // Local monotonic timestamp
    uint8_t  node_id;          // This node's ID
    int8_t   rssi;             // Received signal strength
    int8_t   noise_floor;      // Noise floor estimate
    uint8_t  channel;          // WiFi channel
    float    amplitude[56];    // |CSI| per subcarrier (from I/Q)
    float    phase[56];        // arg(CSI) per subcarrier
    float    doppler_energy;   // Motion energy from temporal FFT
    float    breathing_band;   // 0.1-0.5 Hz band power
    float    motion_band;      // 0.5-3 Hz band power
} csi_feature_frame_t;
// Size: ~470 bytes per frame
// At 100 Hz: ~47 KB/s per node, ~280 KB/s for 6 nodes
```

**Key firmware design decisions**:

1. **Feature extraction on-device**: Raw CSI I/Q → amplitude + phase + spectral bands. Note that a raw frame is only ~112 bytes (56 subcarriers × 2 bytes, ~11 KB/s at 100 Hz), so the ~470-byte float32 feature frame does not save bandwidth by itself; the benefit is that FFT and band-power computation run on the node. Bandwidth drops only if features are quantized or sent at a lower rate than the CSI callback.
|
||||||
|
|
||||||
|
2. **Monotonic timestamps**: Each node uses its own monotonic clock. No NTP synchronization attempted between nodes - clock drift is handled at the aggregator by fusing features, not raw phases (see "Clock Drift" section below).
|
||||||
|
|
||||||
|
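3. **UDP streaming**: Low-latency, loss-tolerant. Missing frames are acceptable; ordering is maintained via sequence numbers, which requires adding a sequence field to `csi_feature_frame_t` (the struct above does not yet carry one).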
|
||||||
|
|
||||||
|
4. **Configurable sampling rate**: 10-100 Hz via menuconfig. 100 Hz for motion detection, 10 Hz sufficient for occupancy.
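
The frame-size and bandwidth figures above can be checked with a few lines of arithmetic. The sketch below assumes a packed little-endian layout that mirrors the fields of `csi_feature_frame_t` in declaration order; `FRAME_FORMAT` and `stream_rate_bytes_per_s` are hypothetical names, and the actual wire format is not fixed by this ADR.

```python
import struct

# Assumed packed layout: uint32 timestamp, uint8 node_id, int8 rssi,
# int8 noise_floor, uint8 channel, 56 float amplitudes, 56 float phases,
# then doppler_energy, breathing_band, motion_band as floats.
FRAME_FORMAT = "<IBbbB56f56f3f"
FRAME_SIZE = struct.calcsize(FRAME_FORMAT)

# Raw CSI frame: one int8 I and one int8 Q value per subcarrier.
RAW_FRAME_SIZE = 56 * 2


def stream_rate_bytes_per_s(frame_size: int, rate_hz: int, nodes: int = 1) -> int:
    """Payload bytes per second for a given frame size, rate, and node count."""
    return frame_size * rate_hz * nodes
```

With these assumptions the feature frame is 468 bytes, which matches the "~470 bytes" and "~47 KB/s per node at 100 Hz" figures, while a raw frame at the same rate is about 11 KB/s.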
|
||||||
|
|
||||||
|
### Aggregator Specification
|
||||||
|
|
||||||
|
The aggregator runs on any machine with WiFi/Ethernet to the nodes:
|
||||||
|
|
||||||
|
```rust
// In wifi-densepose-rs, new module: crates/wifi-densepose-hardware/src/esp32/
pub struct Esp32Aggregator {
    /// UDP socket listening for node streams
    socket: UdpSocket,

    /// Per-node state (last timestamp, feature buffer, drift estimate)
    nodes: HashMap<u8, NodeState>,

    /// Ring buffer of fused feature frames
    fused_buffer: VecDeque<FusedFrame>,

    /// Channel to pipeline
    pipeline_tx: mpsc::Sender<CsiData>,
}

/// Fused frame from all nodes for one time window
pub struct FusedFrame {
    /// Timestamp (aggregator local, monotonic)
    timestamp: Instant,

    /// Per-node features (may have gaps if node dropped)
    node_features: Vec<Option<CsiFeatureFrame>>,

    /// Cross-node correlation (computed by aggregator)
    cross_node_correlation: Array2<f64>,

    /// Fused motion energy (max across nodes)
    fused_motion_energy: f64,

    /// Fused breathing band (coherent sum where phase aligns)
    fused_breathing_band: f64,
}
```
|
||||||
|
|
||||||
|
### Clock Drift Handling
|
||||||
|
|
||||||
|
ESP32 crystal oscillators drift ~20-50 ppm. Over 1 hour, two nodes may diverge by 72-180ms. This makes raw phase alignment across nodes impossible.
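
The drift figures follow directly from the ppm definition. As a small worked sketch (function names are hypothetical):

```python
def divergence_ms(drift_ppm: float, elapsed_s: float) -> float:
    """Offset in milliseconds accumulated between two clocks whose rates
    differ by drift_ppm parts per million over elapsed_s seconds."""
    # ppm * 1e-6 * seconds * 1e3 ms/s simplifies to ppm * seconds / 1000
    return drift_ppm * elapsed_s / 1000.0


def seconds_until_offset(budget_us: float, drift_ppm: float) -> float:
    """Time until the accumulated offset exceeds budget_us microseconds."""
    # 1 ppm accumulates 1 microsecond of offset per second
    return budget_us / drift_ppm
```

At 20 ppm a 1 µs alignment budget is exhausted after 0.05 s, which is why holding sub-microsecond sync across free-running nodes is treated as impractical here.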
|
||||||
|
|
||||||
|
**Solution**: Feature-level fusion, not signal-level fusion.
|
||||||
|
|
||||||
|
```
Signal-level (WRONG for ESP32):
  Align raw I/Q samples across nodes → requires <1µs sync → impractical

Feature-level (CORRECT for ESP32):
  Each node: raw CSI → amplitude + phase + spectral features (local)
  Aggregator: collect features → correlate → fuse decisions
  No cross-node phase alignment needed
```
|
||||||
|
|
||||||
|
Specifically:
|
||||||
|
- **Motion energy**: Take max across nodes (any node seeing motion = motion)
|
||||||
|
- **Breathing band**: Use node with highest SNR as primary, others as corroboration
|
||||||
|
- **Location**: Cross-node amplitude ratios estimate position (no phase needed)
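
The first two fusion rules above can be sketched as follows. This is an illustration under stated assumptions, not the aggregator's implementation: `NodeFeatures` and `fuse` are hypothetical names, SNR is approximated as RSSI minus noise floor, and the location rule is left out because it needs a calibration model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeFeatures:
    node_id: int
    rssi: int             # dBm
    noise_floor: int      # dBm
    doppler_energy: float
    breathing_band: float

    @property
    def snr_db(self) -> int:
        # Assumed SNR proxy: signal level minus noise floor
        return self.rssi - self.noise_floor


def fuse(frames: list[Optional[NodeFeatures]]) -> Optional[dict]:
    """Fuse one time window of per-node features; None entries are dropped nodes."""
    present = [frame for frame in frames if frame is not None]
    if not present:
        return None
    primary = max(present, key=lambda frame: frame.snr_db)
    return {
        # Any node seeing motion counts as motion
        "motion_energy": max(frame.doppler_energy for frame in present),
        # Breathing estimate comes from the highest-SNR node
        "breathing_band": primary.breathing_band,
        "breathing_source_node": primary.node_id,
        "nodes_reporting": len(present),
    }
```

Because only scalar features are compared, a missing or late node reduces `nodes_reporting` without invalidating the fused result.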
|
||||||
|
|
||||||
|
### Sensing Capabilities by Deployment
|
||||||
|
|
||||||
|
| Capability | 1 Node | 3 Nodes | 6 Nodes | Evidence |
|-----------|--------|---------|---------|----------|
| Presence detection | Good | Excellent | Excellent | Single-node RSSI variance |
| Coarse motion | Good | Excellent | Excellent | Doppler energy |
| Room-level location | None | Good | Excellent | Amplitude ratios |
| Respiration | Marginal | Good | Good | 0.1-0.5 Hz band, placement-sensitive |
| Heartbeat | Poor | Poor-Marginal | Marginal | Requires ideal placement, low noise |
| Multi-person count | None | Marginal | Good | Spatial diversity |
| Pose estimation | None | Poor | Marginal | Requires model + sufficient diversity |
|
||||||
|
|
||||||
|
**Honest assessment**: ESP32 CSI is lower fidelity than Intel 5300 or Atheros. Heartbeat detection is placement-sensitive and unreliable. Respiration works with good placement. Motion and presence are solid.
|
||||||
|
|
||||||
|
### Failure Modes and Mitigations
|
||||||
|
|
||||||
|
| Failure Mode | Severity | Mitigation |
|-------------|----------|------------|
| Multipath dominates in cluttered rooms | High | Mesh diversity: 3+ nodes from different angles |
| Person occludes path between node and router | Medium | Mesh: other nodes still have clear paths |
| Clock drift ruins cross-node fusion | Medium | Feature-level fusion only; no cross-node phase alignment |
| UDP packet loss during high traffic | Low | Sequence numbers, interpolation for gaps <100ms |
| ESP32 WiFi driver bugs with CSI | Medium | Pin ESP-IDF version, test on known-good boards |
| Node power failure | Low | Aggregator handles missing nodes gracefully |
|
||||||
|
|
||||||
|
### Bill of Materials (Starter Kit)
|
||||||
|
|
||||||
|
| Item | Quantity | Unit Cost | Total |
|------|----------|-----------|-------|
| ESP32-S3-DevKitC-1 | 3 | $10 | $30 |
| USB-A to USB-C cables | 3 | $3 | $9 |
| USB power adapter (multi-port) | 1 | $15 | $15 |
| Consumer WiFi router (any) | 1 | $0 (existing) | $0 |
| Aggregator (laptop or Pi 4) | 1 | $0 (existing) | $0 |
| **Total** | | | **$54** |
|
||||||
|
|
||||||
|
### Minimal Build Spec (Clone-Flash-Run)
|
||||||
|
|
||||||
|
```
# Step 1: Flash one node (requires ESP-IDF installed)
cd firmware/esp32-csi-node
idf.py set-target esp32s3
idf.py menuconfig  # Set WiFi SSID/password, aggregator IP
idf.py build flash monitor

# Step 2: Run aggregator (Docker)
docker compose -f docker-compose.esp32.yml up

# Step 3: Verify with proof bundle
# Aggregator captures 10 seconds, produces feature JSON, verifies hash
docker exec aggregator python verify_esp32.py

# Step 4: Open visualization
open http://localhost:3000  # Three.js dashboard
```
|
||||||
|
|
||||||
|
### Proof of Reality for ESP32
|
||||||
|
|
||||||
|
```
firmware/esp32-csi-node/proof/
├── captured_csi_10sec.bin       # Real 10-second CSI capture from ESP32
├── captured_csi_meta.json       # Board: ESP32-S3-DevKitC, ESP-IDF: 5.2, Router: TP-Link AX1800
├── expected_features.json       # Feature extraction output
├── expected_features.sha256     # Hash verification
└── capture_photo.jpg            # Photo of actual hardware setup
```
|
||||||
|
|
||||||
|
## Consequences
|
||||||
|
|
||||||
|
### Positive
|
||||||
|
- **$54 starter kit**: Lowest possible barrier to real CSI data
|
||||||
|
- **Mass available hardware**: ESP32 boards are in stock globally
|
||||||
|
- **Real data path**: Replaces every `np.random.rand()` placeholder with actual hardware input
|
||||||
|
- **Proof artifact**: Captured CSI + expected hash proves the pipeline processes real data
|
||||||
|
- **Scalable mesh**: Add nodes for more coverage without changing software
|
||||||
|
- **Feature-level fusion**: Avoids the impossible problem of cross-node phase synchronization
|
||||||
|
|
||||||
|
### Negative
|
||||||
|
- **Lower fidelity than research NICs**: ESP32 CSI is noisier than Intel 5300
|
||||||
|
- **Heartbeat detection unreliable**: Micro-Doppler resolution insufficient for consistent heartbeat
|
||||||
|
- **ESP-IDF learning curve**: Firmware development requires embedded C knowledge
|
||||||
|
- **WiFi interference**: Nodes sharing the same channel as data traffic adds noise
|
||||||
|
- **Placement sensitivity**: Respiration detection requires careful node positioning
|
||||||
|
|
||||||
|
### Interaction with Other ADRs
|
||||||
|
- **ADR-011** (Proof of Reality): ESP32 provides the real CSI capture for the proof bundle
|
||||||
|
- **ADR-008** (Distributed Consensus): Mesh nodes can use simplified Raft for configuration distribution
|
||||||
|
- **ADR-003** (RVF Containers): Aggregator stores CSI features in RVF format
|
||||||
|
- **ADR-004** (HNSW): Environment fingerprints from ESP32 mesh feed HNSW index
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- [Espressif ESP-CSI Repository](https://github.com/espressif/esp-csi)
|
||||||
|
- [ESP-IDF WiFi CSI API](https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/wifi.html#wi-fi-channel-state-information)
|
||||||
|
- [ESP32 CSI Research Papers](https://ieeexplore.ieee.org/document/9439871)
|
||||||
|
- [Wi-Fi Sensing with ESP32: A Tutorial](https://arxiv.org/abs/2207.07859)
|
||||||
|
- ADR-011: Python Proof-of-Reality and Mock Elimination
|
||||||
383
docs/adr/ADR-013-feature-level-sensing-commodity-gear.md
Normal file
@@ -0,0 +1,383 @@
|
|||||||
|
# ADR-013: Feature-Level Sensing on Commodity Gear (Option 3)
|
||||||
|
|
||||||
|
## Status
|
||||||
|
Proposed
|
||||||
|
|
||||||
|
## Date
|
||||||
|
2026-02-28
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
### Not Everyone Can Deploy Custom Hardware
|
||||||
|
|
||||||
|
ADR-012 specifies an ESP32 CSI mesh that provides real CSI data. However, it requires:
|
||||||
|
- Purchasing ESP32 boards
|
||||||
|
- Flashing custom firmware
|
||||||
|
- ESP-IDF toolchain installation
|
||||||
|
- Physical placement of nodes
|
||||||
|
|
||||||
|
For many users - especially those evaluating WiFi-DensePose or deploying in managed environments - modifying hardware is not an option. We need a sensing path that works with **existing, unmodified consumer WiFi gear**.
|
||||||
|
|
||||||
|
### What Commodity Hardware Exposes
|
||||||
|
|
||||||
|
Standard WiFi drivers and tools expose several metrics without custom firmware:
|
||||||
|
|
||||||
|
| Signal | Source | Availability | Sampling Rate |
|--------|--------|-------------|---------------|
| RSSI (Received Signal Strength) | `iwconfig`, `iw`, NetworkManager | Universal | 1-10 Hz |
| Noise floor | `iw dev wlan0 survey dump` | Most Linux drivers | ~1 Hz |
| Link quality | `/proc/net/wireless` | Linux | 1-10 Hz |
| MCS index / PHY rate | `iw dev wlan0 link` | Most drivers | Per-packet |
| TX/RX bytes | `/sys/class/net/wlan0/statistics/` | Universal | Continuous |
| Retry count | `iw dev wlan0 station dump` | Most drivers | ~1 Hz |
| Beacon interval timing | `iw dev wlan0 scan dump` | Universal | Per-scan |
| Channel utilization | `iw dev wlan0 survey dump` | Most drivers | ~1 Hz |
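
As an example of reading one of these sources, the sketch below parses the text of `/proc/net/wireless` into per-interface readings. It assumes the conventional layout of two header lines followed by one line per interface; `parse_proc_net_wireless` and `SAMPLE` are illustrative names, and whether the level and noise columns are in dBm depends on the driver.

```python
def parse_proc_net_wireless(text: str) -> dict[str, dict[str, float]]:
    """Map interface name to link quality, signal level, and noise level."""
    readings = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        iface, rest = line.split(":", 1)
        fields = rest.split()
        if len(fields) < 4:
            continue
        # fields[0] is the status word; the next three are link, level, noise
        link, level, noise = (float(value) for value in fields[1:4])
        readings[iface.strip()] = {
            "link_quality": link,
            "signal_level": level,
            "noise_level": noise,
        }
    return readings


SAMPLE = """Inter-| sta-|   Quality        |   Discarded packets               | Missed | WE
 face | tus | link level noise |  nwid  crypt   frag  retry   misc | beacon | 22
 wlan0: 0000   54.  -56.  -256        0      0      0      0      0        0
"""
```

Polling this file at the 1-10 Hz rate listed in the table yields the RSSI time series that the rest of the pipeline consumes.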
|
||||||
|
|
||||||
|
**RSSI is the primary signal**. It varies when humans move through the propagation path between any transmitter-receiver pair. Research confirms RSSI-based sensing for:
|
||||||
|
- Presence detection (single receiver, threshold on variance)
|
||||||
|
- Device-free motion detection (RSSI variance increases with movement)
|
||||||
|
- Coarse room-level localization (multi-receiver RSSI fingerprinting)
|
||||||
|
- Breathing detection (specialized setups, marginal quality)
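
The simplest case in the list above, presence detection from RSSI variance, can be sketched as a rolling-window detector. The window length and threshold here are illustrative placeholders rather than calibrated values, and `RssiPresenceDetector` is a hypothetical name.

```python
from collections import deque
from statistics import pvariance


class RssiPresenceDetector:
    """Flag presence when the RSSI variance over a full window exceeds a threshold."""

    def __init__(self, window: int = 50, threshold_db2: float = 2.0):
        self.samples = deque(maxlen=window)
        self.threshold_db2 = threshold_db2  # variance threshold in dB^2 (assumed)

    def update(self, rssi_dbm: float) -> bool:
        self.samples.append(rssi_dbm)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough history yet
        return pvariance(list(self.samples)) > self.threshold_db2
```

In practice the threshold would be derived from an empty-room baseline, since the quiescent variance differs between adapters and environments.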
|
||||||
|
|
||||||
|
### Research Support
|
||||||
|
|
||||||
|
- **RSSI-based presence**: Youssef et al. (2007) demonstrated device-free passive detection using RSSI from multiple receivers with >90% accuracy.
|
||||||
|
- **RSSI breathing**: Abdelnasser et al. (2015) showed respiration detection via RSSI variance in controlled settings with ~85% accuracy using 4+ receivers.
|
||||||
|
- **Device-free tracking**: Multiple receivers with RSSI fingerprinting achieve room-level (3-5m) accuracy.
|
||||||
|
|
||||||
|
## Decision
|
||||||
|
|
||||||
|
We will implement a Feature-Level Sensing module that extracts motion, presence, and coarse activity information from standard WiFi metrics available on any Linux machine without hardware modification.
|
||||||
|
|
||||||
|
### Architecture
|
||||||
|
|
||||||
|
```
┌──────────────────────────────────────────────────────────────────────┐
│                    Feature-Level Sensing Pipeline                    │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Data Sources (any Linux WiFi device):                               │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌──────────────┐             │
│  │  RSSI   │  │  Noise  │  │  Link   │  │ Packet Stats │             │
│  │ Stream  │  │  Floor  │  │ Quality │  │ (TX/RX/Retry)│             │
│  └────┬────┘  └────┬────┘  └────┬────┘  └──────┬───────┘             │
│       │            │            │              │                     │
│       └────────────┴────────────┴──────────────┘                     │
│                           │                                          │
│                           ▼                                          │
│  ┌────────────────────────────────────────────────┐                  │
│  │           Feature Extraction Engine            │                  │
│  │                                                │                  │
│  │ 1. Rolling statistics (mean, var, skew, kurt)  │                  │
│  │ 2. Spectral features (FFT of RSSI time series) │                  │
│  │ 3. Change-point detection (CUSUM, PELT)        │                  │
│  │ 4. Cross-receiver correlation                  │                  │
│  │ 5. Packet timing jitter analysis               │                  │
│  └────────────────────────┬───────────────────────┘                  │
│                           │                                          │
│                           ▼                                          │
│  ┌────────────────────────────────────────────────┐                  │
│  │           Classification / Decision            │                  │
│  │                                                │                  │
│  │ • Presence: RSSI variance > threshold          │                  │
│  │ • Motion class: spectral peak frequency        │                  │
│  │ • Occupancy change: change-point event         │                  │
│  │ • Confidence: cross-receiver agreement         │                  │
│  └────────────────────────┬───────────────────────┘                  │
│                           │                                          │
│                           ▼                                          │
│  ┌────────────────────────────────────────────────┐                  │
│  │         Output: Presence/Motion Events         │                  │
│  │                                                │                  │
│  │ { "timestamp": "...",                          │                  │
│  │   "presence": true,                            │                  │
│  │   "motion_level": "active",                    │                  │
│  │   "confidence": 0.87,                          │                  │
│  │   "receivers_agreeing": 3,                     │                  │
│  │   "rssi_variance": 4.2 }                       │                  │
│  └────────────────────────────────────────────────┘                  │
└──────────────────────────────────────────────────────────────────────┘
```
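
The output event at the bottom of the diagram could be represented as a small immutable record that serializes to JSON. The following is a minimal sketch only: the class name `PresenceEvent`, its `to_json` method, and the listed `motion_level` strings are hypothetical and not part of this ADR; the field names are taken from the diagram.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PresenceEvent:
    """Hypothetical container for the event shown in the diagram."""

    timestamp: str           # ISO 8601 string, UTC
    presence: bool
    motion_level: str        # e.g. "absent", "present_still", "active"
    confidence: float        # 0.0 .. 1.0
    receivers_agreeing: int
    rssi_variance: float     # variance of dBm readings, in dB^2

    def to_json(self) -> str:
        # sort_keys keeps the serialized form stable across runs
        return json.dumps(asdict(self), sort_keys=True)


# Values copied from the diagram; the timestamp is an arbitrary example.
event = PresenceEvent(
    timestamp=datetime(2026, 2, 28, 12, 0, tzinfo=timezone.utc).isoformat(),
    presence=True,
    motion_level="active",
    confidence=0.87,
    receivers_agreeing=3,
    rssi_variance=4.2,
)
print(event.to_json())
```

A frozen dataclass is used here so that an event cannot be mutated after classification, which keeps any downstream hashing or logging reproducible.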

### Feature Extraction Specification

```python
from __future__ import annotations

from collections import deque

import numpy as np
import scipy.fft
import scipy.stats

# FeatureSensingConfig and FeatureVector are project types defined elsewhere.


class RssiFeatureExtractor:
    """Extract sensing features from RSSI and link statistics.

    No custom hardware required. Works with any WiFi interface
    that exposes standard Linux wireless statistics.
    """

    def __init__(self, config: FeatureSensingConfig):
        self.window_size = config.window_size      # seconds (30)
        self.sampling_rate = config.sampling_rate  # Hz (10)
        # deque needs an integer maxlen, so coerce in case the config holds floats.
        buffer_len = int(self.window_size * self.sampling_rate)
        self.rssi_buffer = deque(maxlen=buffer_len)
        self.noise_buffer = deque(maxlen=buffer_len)

    def extract_features(self) -> FeatureVector:
        rssi_array = np.asarray(self.rssi_buffer, dtype=float)
        noise_array = np.asarray(self.noise_buffer, dtype=float)
        if rssi_array.size < 2 or noise_array.size == 0:
            raise ValueError("not enough samples buffered to extract features")

        return FeatureVector(
            # Time-domain statistics
            rssi_mean=np.mean(rssi_array),
            rssi_variance=np.var(rssi_array),
            rssi_skewness=scipy.stats.skew(rssi_array),
            rssi_kurtosis=scipy.stats.kurtosis(rssi_array),
            rssi_range=np.ptp(rssi_array),
            rssi_iqr=np.subtract(*np.percentile(rssi_array, [75, 25])),

            # Spectral features (FFT of RSSI time series)
            spectral_energy=self._spectral_energy(rssi_array),
            dominant_frequency=self._dominant_freq(rssi_array),
            breathing_band_power=self._band_power(rssi_array, 0.1, 0.5),  # Hz
            motion_band_power=self._band_power(rssi_array, 0.5, 3.0),     # Hz

            # Change-point features
            num_change_points=self._cusum_changes(rssi_array),
            max_step_magnitude=self._max_step(rssi_array),

            # Noise floor features (environment stability)
            noise_mean=np.mean(noise_array),
            snr_estimate=np.mean(rssi_array) - np.mean(noise_array),
        )

    def _spectral_energy(self, rssi: np.ndarray) -> float:
        """Total spectral energy excluding DC component."""
        spectrum = np.abs(scipy.fft.rfft(rssi - np.mean(rssi)))
        return float(np.sum(spectrum[1:] ** 2))

    def _dominant_freq(self, rssi: np.ndarray) -> float:
        """Dominant frequency (Hz) in RSSI time series, ignoring the DC bin."""
        spectrum = np.abs(scipy.fft.rfft(rssi - np.mean(rssi)))
        freqs = scipy.fft.rfftfreq(len(rssi), d=1.0 / self.sampling_rate)
        return float(freqs[np.argmax(spectrum[1:]) + 1])

    def _band_power(self, rssi: np.ndarray, low_hz: float, high_hz: float) -> float:
        """Power in a specific frequency band (inclusive bounds, in Hz)."""
        spectrum = np.abs(scipy.fft.rfft(rssi - np.mean(rssi))) ** 2
        freqs = scipy.fft.rfftfreq(len(rssi), d=1.0 / self.sampling_rate)
        mask = (freqs >= low_hz) & (freqs <= high_hz)
        return float(np.sum(spectrum[mask]))

    def _max_step(self, rssi: np.ndarray) -> float:
        """Largest absolute sample-to-sample change.

        The original spec calls this helper without defining it;
        this definition is an assumption.
        """
        return float(np.max(np.abs(np.diff(rssi))))

    def _cusum_changes(self, rssi: np.ndarray) -> int:
        """Count change points using a two-sided CUSUM against the window mean."""
        mean = np.mean(rssi)
        # Explicit float arrays: zeros_like would truncate if rssi were integer-typed.
        cusum_pos = np.zeros(len(rssi), dtype=float)
        cusum_neg = np.zeros(len(rssi), dtype=float)
        threshold = 3.0 * np.std(rssi)  # alarm level: 3 standard deviations
        changes = 0
        for i in range(1, len(rssi)):
            # 0.5 dB slack absorbs small fluctuations around the mean.
            cusum_pos[i] = max(0, cusum_pos[i-1] + rssi[i] - mean - 0.5)
            cusum_neg[i] = max(0, cusum_neg[i-1] - rssi[i] + mean - 0.5)
            if cusum_pos[i] > threshold or cusum_neg[i] > threshold:
                changes += 1
                cusum_pos[i] = 0
                cusum_neg[i] = 0
        return changes
```
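
One property of the `_cusum_changes` specification is easier to see with numbers. Because the reference level is the mean of the whole window, a single sustained level shift larger than twice the 0.5 dB slack leaves both halves of the window offset from that reference, so the alarm can fire repeatedly instead of once. The sketch below compares the window-mean form with a variant that re-anchors the reference after each alarm. The variant, the function names, and the step signal are illustrative assumptions and not part of this ADR; the signal is synthetic and is not proof data.

```python
from statistics import fmean, pstdev


def cusum_window_mean(x, slack=0.5, k_sigma=3.0):
    """Alarm count with the whole-window mean as reference (as in the spec)."""
    ref = fmean(x)
    threshold = k_sigma * pstdev(x)
    pos = neg = 0.0
    alarms = 0
    for v in x[1:]:
        pos = max(0.0, pos + (v - ref) - slack)
        neg = max(0.0, neg - (v - ref) - slack)
        if pos > threshold or neg > threshold:
            alarms += 1
            pos = neg = 0.0
    return alarms


def cusum_reanchored(x, slack=0.5, k_sigma=3.0):
    """Assumed variant: the reference is the running mean of the current
    segment and restarts at the sample that raised the alarm."""
    ref, n = x[0], 1
    threshold = k_sigma * pstdev(x)
    pos = neg = 0.0
    change_indices = []
    for i, v in enumerate(x[1:], start=1):
        pos = max(0.0, pos + (v - ref) - slack)
        neg = max(0.0, neg - (v - ref) - slack)
        if pos > threshold or neg > threshold:
            change_indices.append(i)
            pos = neg = 0.0
            ref, n = v, 1            # start a new segment here
        else:
            n += 1
            ref += (v - ref) / n     # incremental running mean
    return change_indices


# Synthetic step: 50 samples at -60 dBm, then 50 samples at -55 dBm.
step = [-60.0] * 50 + [-55.0] * 50
print(cusum_window_mean(step))   # many alarms for one physical change
print(cusum_reanchored(step))    # [51]: one change, one sample after the step
```

Whether the repeated alarms matter depends on how `num_change_points` is consumed downstream: as a rough activity measure the inflated count may be acceptable, but as a count of occupancy transitions the re-anchored form is closer to the intent.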

### Data Collection (No Root Required)

```python
import subprocess


class SensingError(RuntimeError):
    """Raised when a WiFi statistic cannot be read (declared here so the snippet runs)."""


class LinuxWifiCollector:
    """Collect WiFi statistics from standard Linux interfaces.

    No root required for most operations.
    No custom drivers or firmware.
    Works with NetworkManager, wpa_supplicant, or raw iw.
    """

    def __init__(self, interface: str = "wlan0"):
        self.interface = interface

    def _run_iw(self, *args: str) -> str:
        """Run `iw dev <interface> ...`; return stdout, or "" if iw is missing."""
        try:
            result = subprocess.run(
                ["iw", "dev", self.interface, *args],
                capture_output=True, text=True
            )
        except FileNotFoundError:
            return ""
        return result.stdout

    def get_rssi(self) -> float:
        """Get current RSSI (dBm) from connected AP."""
        # Method 1: /proc/net/wireless (no root)
        try:
            with open("/proc/net/wireless") as f:
                for line in f:
                    parts = line.split()
                    # Compare the interface column exactly so "wlan0" does not match "wlan01".
                    if parts and parts[0].rstrip(':') == self.interface:
                        return float(parts[3].rstrip('.'))
        except (OSError, IndexError, ValueError):
            pass  # fall through to iw

        # Method 2: iw (no root for own station)
        for line in self._run_iw("link").split('\n'):
            if 'signal:' in line:
                return float(line.split(':')[1].strip().split()[0])

        raise SensingError(f"Cannot read RSSI from {self.interface}")

    def get_noise_floor(self) -> float:
        """Get noise floor estimate (dBm), preferring the channel in use."""
        first_seen = None
        in_use = False
        for line in self._run_iw("survey", "dump").split('\n'):
            if 'frequency:' in line:
                # The survey lists every scanned channel; iw marks the active one.
                in_use = '[in use]' in line
            elif 'noise:' in line:
                value = float(line.split(':')[1].strip().split()[0])
                if in_use:
                    return value
                if first_seen is None:
                    first_seen = value
        # Fall back to the first reported value, then to a default estimate.
        return first_seen if first_seen is not None else -95.0

    def get_link_stats(self) -> dict:
        """Get link quality statistics."""
        stats = {}
        for line in self._run_iw("station", "dump").split('\n'):
            if 'tx bytes:' in line:
                stats['tx_bytes'] = int(line.split(':')[1].strip())
            elif 'rx bytes:' in line:
                stats['rx_bytes'] = int(line.split(':')[1].strip())
            elif 'tx retries:' in line:
                stats['tx_retries'] = int(line.split(':')[1].strip())
            elif 'signal:' in line:
                stats['signal'] = float(line.split(':')[1].strip().split()[0])
        return stats
```
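
The collector above returns one reading per call, while the feature extractor expects its buffers to be filled at a fixed rate (10 Hz in the spec). The ADR does not say what drives that sampling, so the following is only a sketch under assumptions: the function `sample_into`, its parameters, and the choice to skip failed reads are hypothetical. It schedules each sample against a monotonic clock so that the time spent reading does not accumulate as drift.

```python
import time
from collections import deque
from typing import Callable


def sample_into(buffer: deque, read_value: Callable[[], float], rate_hz: float,
                n_samples: int,
                sleep: Callable[[float], None] = time.sleep,
                clock: Callable[[], float] = time.monotonic) -> int:
    """Append up to n_samples readings to buffer at rate_hz.

    Returns the number of failed reads. A failed read is skipped, but the
    schedule is kept so later samples stay on the sampling grid.
    """
    period = 1.0 / rate_hz
    next_tick = clock()
    failures = 0
    for _ in range(n_samples):
        try:
            buffer.append(read_value())
        except (OSError, ValueError, RuntimeError):
            failures += 1
        # Sleep until the next grid point rather than for a fixed period,
        # so slow reads do not stretch the effective sampling interval.
        next_tick += period
        delay = next_tick - clock()
        if delay > 0:
            sleep(delay)
    return failures


# Intended use with the classes above (not executed here):
#   collector = LinuxWifiCollector("wlan0")
#   sample_into(extractor.rssi_buffer, collector.get_rssi, rate_hz=10, n_samples=300)
```

The `sleep` and `clock` parameters exist so the loop can be exercised in tests without real waiting. Note that skipped reads leave gaps, which the FFT-based features treat as if the samples were contiguous.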

### Classification Rules

```python
from __future__ import annotations

from typing import Optional

# ClassifierConfig, FeatureVector, MotionLevel and SensingResult are
# project types defined elsewhere.


class PresenceClassifier:
    """Rule-based presence and motion classifier.

    Uses simple, interpretable rules rather than ML to ensure
    transparency and debuggability.
    """

    def __init__(self, config: ClassifierConfig):
        # Variance of dBm readings is expressed in dB².
        self.variance_threshold = config.variance_threshold   # 2.0 dB²
        self.motion_threshold = config.motion_threshold       # 5.0 dB²
        self.spectral_threshold = config.spectral_threshold   # 10.0
        self.confidence_min_receivers = config.min_receivers  # 2

    def classify(self, features: FeatureVector,
                 multi_receiver: Optional[list[FeatureVector]] = None) -> SensingResult:

        # Presence: RSSI variance exceeds empty-room baseline
        presence = features.rssi_variance > self.variance_threshold

        # Motion level
        if features.rssi_variance > self.motion_threshold:
            motion = MotionLevel.ACTIVE
        elif features.rssi_variance > self.variance_threshold:
            motion = MotionLevel.PRESENT_STILL
        else:
            motion = MotionLevel.ABSENT

        # Confidence from spectral energy and receiver agreement
        spectral_conf = min(1.0, features.spectral_energy / self.spectral_threshold)
        if multi_receiver:
            # Fraction of receivers whose own presence decision matches ours.
            agreeing = sum(1 for f in multi_receiver
                           if (f.rssi_variance > self.variance_threshold) == presence)
            receiver_conf = agreeing / len(multi_receiver)
        else:
            receiver_conf = 0.5  # Single receiver = lower confidence

        confidence = 0.6 * spectral_conf + 0.4 * receiver_conf

        return SensingResult(
            presence=presence,
            motion_level=motion,
            confidence=confidence,
            dominant_frequency=features.dominant_frequency,
            breathing_band_power=features.breathing_band_power,
        )
```
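
The confidence blend in `classify` can be checked by hand. The sketch below restates the same arithmetic as a standalone function; the function name `blended_confidence` and the input values are illustrative assumptions, while the 0.6/0.4 weights and the 0.5 single-receiver default come from the classifier above.

```python
def blended_confidence(spectral_energy, spectral_threshold, receiver_votes, presence):
    """Reproduce the classifier's confidence formula.

    receiver_votes holds one presence decision (True/False) per extra receiver.
    """
    spectral_conf = min(1.0, spectral_energy / spectral_threshold)
    if receiver_votes:
        agreeing = sum(1 for vote in receiver_votes if vote == presence)
        receiver_conf = agreeing / len(receiver_votes)
    else:
        receiver_conf = 0.5  # single receiver: neutral agreement term
    return 0.6 * spectral_conf + 0.4 * receiver_conf


# Spectral energy 8.0 against a threshold of 10.0 gives spectral_conf = 0.8.
# Three of four receivers agree, so receiver_conf = 0.75.
multi = blended_confidence(8.0, 10.0, [True, True, True, False], presence=True)
single = blended_confidence(8.0, 10.0, [], presence=True)
print(round(multi, 2), round(single, 2))
```

With these inputs the multi-receiver case works out to 0.6 × 0.8 + 0.4 × 0.75 = 0.78 and the single-receiver case to 0.6 × 0.8 + 0.4 × 0.5 = 0.68. Two consequences follow from the formula as written: a single receiver can never report confidence above 0.8, and a quiet empty room (low spectral energy) yields low confidence even when the "absent" decision is correct.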

### Capability Matrix (Honest Assessment)

| Capability | Single Receiver | 3 Receivers | 6 Receivers | Accuracy |
|-----------|----------------|-------------|-------------|----------|
| Binary presence | Yes | Yes | Yes | 90-95% |
| Coarse motion (still/moving) | Yes | Yes | Yes | 85-90% |
| Room-level location | No | Marginal | Yes | 70-80% |
| Person count | No | Marginal | Marginal | 50-70% |
| Activity class (walk/sit/stand) | Marginal | Marginal | Yes | 60-75% |
| Respiration detection | No | Marginal | Marginal | 40-60% |
| Heartbeat | No | No | No | N/A |
| Body pose | No | No | No | N/A |

**Bottom line**: Feature-level sensing on commodity gear does presence and motion well. It does NOT do pose estimation, heartbeat, or reliable respiration. Any claim otherwise would be dishonest.

### Decision Matrix: Option 2 (ESP32) vs Option 3 (Commodity)

| Factor | ESP32 CSI (ADR-012) | Commodity (ADR-013) |
|--------|---------------------|---------------------|
| Headline capability | Respiration + motion | Presence + coarse motion |
| Hardware cost | $54 (3-node kit) | $0 (existing gear) |
| Setup time | 2-4 hours | 15 minutes |
| Technical barrier | Medium (firmware flash) | Low (pip install) |
| Data quality | Real CSI (amplitude + phase) | RSSI only |
| Multi-person | Marginal | Poor |
| Pose estimation | Marginal | No |
| Reproducibility | High (controlled hardware) | Medium (varies by hardware) |
| Public credibility | High (real CSI artifact) | Medium (RSSI is "obvious") |

### Proof Bundle for Commodity Sensing

```
v1/data/proof/commodity/
├── rssi_capture_30sec.json        # 30 seconds of RSSI from 3 receivers
├── rssi_capture_meta.json         # Hardware: Intel AX200, Router: TP-Link AX1800
├── scenario.txt                   # "Person walks through room at t=10s, sits at t=20s"
├── expected_features.json         # Feature extraction output
├── expected_classification.json   # Classification output
├── expected_features.sha256       # Verification hash
└── verify_commodity.py            # One-command verification
```
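
The hash check at the core of a script like `verify_commodity.py` might look roughly like the sketch below. The ADR does not specify the digest file format or the serialization, so several things here are assumptions: that `expected_features.sha256` holds a bare hex digest, that features are hashed as canonical JSON (sorted keys, fixed separators), and the helper names themselves. The feature values are illustrative placeholders, not captured data.

```python
import hashlib
import json


def canonical_json(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so that equal data
    always produces identical bytes, regardless of dict insertion order."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_features(recomputed: dict, expected_hex: str) -> bool:
    """True when the recomputed features hash to the recorded digest."""
    return sha256_hex(canonical_json(recomputed)) == expected_hex.strip().lower()


# A real run would load rssi_capture_30sec.json, re-run feature extraction,
# and read the digest from expected_features.sha256.
features = {"rssi_variance": 4.2, "rssi_mean": -57.5}
recorded = sha256_hex(canonical_json(features))

print(verify_features(features, recorded))                            # True
print(verify_features({**features, "rssi_variance": 4.3}, recorded))  # False
```

Hashing floating-point features bit-for-bit is only reproducible if the numeric pipeline is deterministic across platforms and library versions; rounding features to a fixed precision before serialization is one way to reduce that sensitivity, at the cost of a weaker check.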

### Integration with WiFi-DensePose Pipeline

The commodity sensing module outputs the same `SensingResult` type as the CSI pipeline, allowing graceful degradation:

```python
from __future__ import annotations

from enum import Enum
from typing import Protocol

# Capability lives elsewhere in the project; declared here so the snippet runs.
Capability = Enum("Capability", "PRESENCE MOTION RESPIRATION LOCATION POSE")

class SensingBackend(Protocol):
    """Common interface for all sensing backends."""

    def get_features(self) -> FeatureVector: ...
    def get_capabilities(self) -> set[Capability]: ...

class CsiBackend(SensingBackend):
    """Full CSI pipeline (ESP32 or research NIC)."""
    def get_capabilities(self) -> set[Capability]:
        return {Capability.PRESENCE, Capability.MOTION, Capability.RESPIRATION,
                Capability.LOCATION, Capability.POSE}

class CommodityBackend(SensingBackend):
    """RSSI-only commodity hardware."""
    def get_capabilities(self) -> set[Capability]:
        return {Capability.PRESENCE, Capability.MOTION}
```
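
Graceful degradation then amounts to picking whichever available backend covers the most of what a caller asks for and reporting what is lost. The sketch below illustrates that idea only: `StubBackend`, its `available` flag, and `pick_backend` are hypothetical names, and the ADR does not define how backend availability is detected.

```python
from enum import Enum, auto


class Capability(Enum):
    PRESENCE = auto()
    MOTION = auto()
    RESPIRATION = auto()
    LOCATION = auto()
    POSE = auto()


class StubBackend:
    """Stand-in for CsiBackend / CommodityBackend; only capabilities matter here."""

    def __init__(self, name, capabilities, available):
        self.name = name
        self.available = available      # hypothetical: hardware was detected
        self._capabilities = set(capabilities)

    def get_capabilities(self):
        return self._capabilities


def pick_backend(backends, wanted):
    """Return (backend, missing): the available backend covering most of
    `wanted`, plus the requested capabilities it cannot provide."""
    candidates = [b for b in backends if b.available]
    if not candidates:
        return None, set(wanted)
    best = max(candidates, key=lambda b: len(wanted & b.get_capabilities()))
    return best, wanted - best.get_capabilities()


csi = StubBackend("csi", Capability, available=False)  # no ESP32 attached
commodity = StubBackend(
    "commodity", {Capability.PRESENCE, Capability.MOTION}, available=True)

backend, missing = pick_backend(
    [csi, commodity], {Capability.PRESENCE, Capability.RESPIRATION})
print(backend.name, sorted(c.name for c in missing))
```

Returning the missing set explicitly, rather than silently dropping unsupported outputs, keeps the behavior in line with the honest capability reporting this ADR calls for.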

## Consequences

### Positive

- **Zero-cost entry**: Works with existing WiFi hardware
- **15-minute setup**: `pip install wifi-densepose && wdp sense --interface wlan0`
- **Broad adoption**: Any Linux laptop, Pi, or phone can participate
- **Honest capability reporting**: `get_capabilities()` tells users exactly what works
- **Complements ESP32**: Users start with commodity, upgrade to ESP32 for more capability
- **No mock data**: Real RSSI from real hardware, deterministic pipeline

### Negative

- **Limited capability**: No pose, no heartbeat, marginal respiration
- **Hardware variability**: RSSI calibration differs across chipsets
- **Environmental sensitivity**: Commodity RSSI is more affected by interference than CSI
- **Not a "pose estimation" demo**: This module honestly cannot do what the project name implies
- **Lower credibility ceiling**: RSSI sensing is well-known; less impressive than CSI

## References

- [Youssef et al. - Challenges in Device-Free Passive Localization](https://doi.org/10.1145/1287853.1287880)
- [Device-Free WiFi Sensing Survey](https://arxiv.org/abs/1901.09683)
- [RSSI-based Breathing Detection](https://ieeexplore.ieee.org/document/7127688)
- [Linux Wireless Tools](https://wireless.wiki.kernel.org/en/users/documentation/iw)
- ADR-011: Python Proof-of-Reality and Mock Elimination
- ADR-012: ESP32 CSI Sensor Mesh