Files
wifi-densepose/docs/adr/ADR-015-public-dataset-training-strategy.md
Claude 5dc2f66201 docs: Update ADR-015 with verified dataset specs from research
Corrects MM-Fi antenna config (1 TX / 3 RX not 3x3), adds Wi-Pose
as secondary dataset (exact 3x3 hardware match), updates subcarrier
compatibility table, promotes status to Accepted, adds proof
verification protocol and Rust implementation plan.

https://claude.ai/code/session_01BSBAQJ34SLkiJy4A8SoiL4
2026-02-28 15:14:50 +00:00

8.8 KiB
Raw Permalink Blame History

ADR-015: Public Dataset Strategy for Trained Pose Estimation Model

Status

Accepted

Context

The WiFi-DensePose system has a complete model architecture (DensePoseHead, ModalityTranslationNetwork, WiFiDensePoseRCNN) and signal processing pipeline, but no trained weights. Without a trained model, pose estimation produces random outputs regardless of input quality.

Training requires paired data: simultaneous WiFi CSI captures alongside ground-truth human pose annotations. Collecting this data from scratch requires months of effort and specialized hardware (multiple WiFi nodes + camera + motion capture rig). Several public datasets exist that can bootstrap training without custom collection.

The Teacher-Student Constraint

The CMU "DensePose From WiFi" paper (2023) trains using a teacher-student approach: a camera-based RGB pose model (e.g. Detectron2 DensePose) generates pseudo-labels during training, so the WiFi model learns to replicate those outputs. At inference, the camera is removed. This means any dataset that provides either ground-truth pose annotations or synchronized RGB frames (from which a teacher can generate labels) is sufficient for training.

56-Subcarrier Hardware Context

The system targets 56 subcarriers, which corresponds specifically to Atheros 802.11n chipsets on a 20 MHz channel using the Atheros CSI Tool. No publicly available dataset with paired pose annotations was collected at exactly 56 subcarriers:

Hardware Subcarriers Datasets
Atheros CSI Tool (20 MHz) 56 None with pose labels
Atheros CSI Tool (40 MHz) 114 MM-Fi
Intel 5300 NIC (20 MHz) 30 Person-in-WiFi, Widar 3.0, Wi-Pose, XRF55
Nexmon/Broadcom (80 MHz) 242-256 None with pose labels

MM-Fi uses the same Atheros hardware family at 40 MHz, making 114→56 interpolation physically meaningful (same chipset, different channel width).

Decision

Use MM-Fi as the primary training dataset, supplemented by Wi-Pose (NjtechCVLab) for additional diversity. XRF55 is downgraded to optional (Kinect labels need post-processing). Teacher-student pipeline fills in DensePose UV labels where only skeleton keypoints are available.

Primary Dataset: MM-Fi

Paper: "MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing" (NeurIPS 2023 Datasets & Benchmarks) Repository: https://github.com/ybhbingo/MMFi_dataset Size: 40 subjects × 27 action classes × ~320,000 frames, 4 environments Modalities: WiFi CSI, mmWave radar, LiDAR, RGB-D, IMU CSI format: 1 TX × 3 RX antennas, 114 subcarriers, 100 Hz sampling rate, 5 GHz 40 MHz (TP-Link N750 with Atheros CSI Tool), raw amplitude + phase Data tensor: [3, 114, 10] per sample (antenna-pairs × subcarriers × time frames) Pose annotations: 17-keypoint COCO skeleton in 3D + DensePose UV surface coords License: CC BY-NC 4.0 Why primary: Largest public WiFi CSI + pose dataset; richest annotations (3D keypoints + DensePose UV); same Atheros hardware family as target system; COCO keypoints map directly to the KeypointHead output format; actively maintained with NeurIPS 2023 benchmark status.

Antenna correction: MM-Fi uses 1 TX / 3 RX (3 antenna pairs), not 3×3. The existing system targets 3×3 (ESP32 mesh). The 3 RX antennas match; the TX difference means MM-Fi-trained weights will work but may benefit from fine-tuning on data from a 3-TX setup.

Secondary Dataset: Wi-Pose (NjtechCVLab)

Paper: CSI-Former (MDPI Entropy 2023) and related works Repository: https://github.com/NjtechCVLab/Wi-PoseDataset Size: 12 volunteers × 12 action classes × 166,600 packets CSI format: 3 TX × 3 RX antennas, 30 subcarriers, 5 GHz, .mat format Pose annotations: 18-keypoint AlphaPose skeleton (COCO-compatible subset) License: Research use Why secondary: 3×3 antenna array matches target ESP32 mesh hardware exactly; fully public; adds 12 different subjects and environments not in MM-Fi. Note: 30 subcarriers require zero-padding or interpolation to 56; 18→17 keypoint mapping drops one neck keypoint (index 1), compatible with COCO-17.

Excluded / Deprioritized Datasets

Dataset Reason
RF-Pose / RF-Pose3D (MIT) Custom FMCW radio, not 802.11n CSI; incompatible signal physics
Person-in-WiFi (CMU 2019) Not publicly released (IRB restriction)
Person-in-WiFi 3D (CVPR 2024) 30 subcarriers, Intel 5300; semi-public access
DensePose From WiFi (CMU) Dataset not released; only paper + architecture
Widar 3.0 Gesture labels only, no full-body pose keypoints
XRF55 Activity labels primarily; Kinect pose requires email request; lower priority
UT-HAR, WiAR, SignFi Activity/gesture labels only, no pose keypoints

Implementation Plan

Phase 1: MM-Fi Loader (Rust wifi-densepose-train crate)

Implement MmFiDataset in Rust (crates/wifi-densepose-train/src/dataset.rs):

  • Reads MM-Fi numpy .npy files: amplitude [N, 3, 3, 114] (antenna-pairs laid flat), phase [N, 3, 3, 114]
  • Resamples from 114 → 56 subcarriers (linear interpolation via subcarrier.rs)
  • Applies phase sanitization using SOTA algorithms from wifi-densepose-signal crate
  • Returns typed CsiSample structs with amplitude, phase, keypoints, visibility
  • Validation split: subjects 3340 held out

Phase 2: Wi-Pose Loader

Implement WiPoseDataset reading .mat files (via ndarray-based MATLAB reader or pre-converted .npy). Subcarrier interpolation: 30 → 56 (zero-pad high frequencies rather than interpolate, since 30-sub Intel data has different spectral occupancy than 56-sub Atheros data).

Phase 3: Teacher-Student DensePose Labels

For MM-Fi samples that provide 3D keypoints but not full DensePose UV maps:

  • Run Detectron2 DensePose on paired RGB frames to generate (part_labels, u_coords, v_coords)
  • Cache generated labels as .npy alongside original data
  • This matches the training procedure in the CMU paper exactly

Phase 4: Training Pipeline (Rust)

  • Model: WiFiDensePoseModel (tch-rs, crates/wifi-densepose-train/src/model.rs)
  • Loss: Keypoint heatmap (MSE) + DensePose part (cross-entropy) + UV (Smooth L1) + transfer (MSE)
  • Metrics: PCK@0.2 + OKS with Hungarian min-cost assignment (crates/wifi-densepose-train/src/metrics.rs)
  • Optimizer: Adam, lr=1e-3, step decay at epochs 40 and 80
  • Hardware: Single GPU (RTX 3090 or A100); MM-Fi fits in ~50 GB disk
  • Checkpointing: Save every epoch; keep best-by-validation-PCK

Phase 5: Proof Verification

verify-training binary provides the "trust kill switch" for training:

  • Fixed seed (MODEL_SEED=0, PROOF_SEED=42)
  • 50 training steps on deterministic SyntheticDataset
  • Verifies: loss decreases + SHA-256 of final weights matches stored hash
  • EXIT 0 = PASS, EXIT 1 = FAIL, EXIT 2 = SKIP (no stored hash)

Subcarrier Mismatch: MM-Fi (114) vs System (56)

MM-Fi captures 114 subcarriers at 5 GHz with 40 MHz bandwidth (Atheros CSI Tool). The system is configured for 56 subcarriers (Atheros, 20 MHz). Resolution options:

  1. Interpolate MM-Fi → 56 (chosen for Phase 1): linear interpolation preserves spectral envelope, fast, no architecture change needed
  2. Train at native 114: change CSIProcessor config; requires re-running verify.py --generate-hash to update proof hash; future option
  3. Collect native 56-sub data: ESP32 mesh at 20 MHz; best for production

Option 1 unblocks training immediately. The Rust subcarrier.rs module handles interpolation as a first-class operation with tests proving correctness.

Consequences

Positive:

  • Unblocks end-to-end training on real public data immediately
  • MM-Fi's Atheros hardware family matches target system (same CSI Tool)
  • 40 subjects × 27 actions provides reasonable diversity for first model
  • Wi-Pose's 3×3 antenna setup is an exact hardware match for ESP32 mesh
  • CC BY-NC license is compatible with research and internal use
  • Rust implementation integrates natively with wifi-densepose-signal pipeline

Negative:

  • CC BY-NC prohibits commercial deployment of weights trained solely on MM-Fi; custom data collection required before commercial release
  • MM-Fi is 1 TX / 3 RX; system targets 3 TX / 3 RX; fine-tuning needed
  • 114→56 subcarrier interpolation loses frequency resolution; acceptable for v1
  • MM-Fi captured in controlled lab environments; real-world accuracy will be lower until fine-tuned on domain-specific data

References

  • Yang et al., "MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset" (NeurIPS 2023) — arXiv:2305.10345
  • Geng et al., "DensePose From WiFi" (CMU, arXiv:2301.00250, 2023)
  • Yan et al., "Person-in-WiFi 3D" (CVPR 2024)
  • NjtechCVLab, "Wi-Pose Dataset" — github.com/NjtechCVLab/Wi-PoseDataset
  • ADR-012: ESP32 CSI Sensor Mesh (hardware target)
  • ADR-013: Feature-Level Sensing on Commodity Gear
  • ADR-014: SOTA Signal Processing Algorithms