Files
wifi-densepose/examples/data/framework/docs/PHYSICS_CLIENTS.md
ruv d803bfe2b1 Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00

11 KiB

Physics, Seismic, and Ocean Data Clients

Overview

This module provides async API clients for physics, seismic, and ocean data sources, enabling cross-disciplinary discoveries through RuVector's semantic vector search and graph coherence analysis.

New Domains

Three new domains have been added to Domain enum in ruvector_native.rs:

  • Domain::Physics - Particle physics, materials science
  • Domain::Seismic - Earthquake data, seismic activity
  • Domain::Ocean - Ocean temperature, salinity, depth profiles

Clients

1. UsgsEarthquakeClient

USGS Earthquake Hazards Program - Real-time and historical earthquake data worldwide.

Features

  • No API key required (public data)
  • Global earthquake coverage
  • Magnitude, location, depth, tsunami warnings
  • ~5 requests/second rate limit

Methods

use ruvector_data_framework::UsgsEarthquakeClient;

let client = UsgsEarthquakeClient::new()?;

// Get recent earthquakes above minimum magnitude
let recent = client.get_recent(4.5, 7).await?; // Mag 4.5+, last 7 days

// Search by geographic region
let la_quakes = client.search_by_region(
    34.05,    // latitude
    -118.25,  // longitude
    200.0,    // radius in km
    30        // days back
).await?;

// Get significant earthquakes only
let significant = client.get_significant(30).await?;

// Filter by magnitude range
let moderate = client.get_by_magnitude_range(4.0, 6.0, 7).await?;

SemanticVector Metadata

Each earthquake is converted to a SemanticVector with:

metadata: {
    "magnitude": "5.4",
    "place": "Southern California",
    "latitude": "34.05",
    "longitude": "-118.25",
    "depth_km": "10.5",
    "tsunami": "0",
    "significance": "450",
    "status": "reviewed",
    "alert": "green",
    "source": "usgs"
}

2. CernOpenDataClient

CERN Open Data Portal - LHC experiment data, particle physics datasets.

Features

  • No API key required
  • CMS, ATLAS, LHCb, ALICE experiments
  • Collision events, particle physics data
  • Educational and research datasets

Methods

use ruvector_data_framework::CernOpenDataClient;

let client = CernOpenDataClient::new()?;

// Search datasets by keywords
let higgs = client.search_datasets("Higgs").await?;
let top_quark = client.search_datasets("top quark").await?;

// Get specific dataset by record ID
let dataset = client.get_dataset(5500).await?;

// Search by experiment
let cms_data = client.search_by_experiment("CMS").await?;
let atlas_data = client.search_by_experiment("ATLAS").await?;

Available Experiments

  • "CMS" - Compact Muon Solenoid
  • "ATLAS" - A Toroidal LHC ApparatuS
  • "LHCb" - Large Hadron Collider beauty
  • "ALICE" - A Large Ion Collider Experiment

SemanticVector Metadata

metadata: {
    "recid": "12345",
    "title": "CMS 2011 Higgs to two photons dataset",
    "experiment": "CMS",
    "collision_energy": "7TeV",
    "collision_type": "pp",
    "data_type": "Dataset",
    "source": "cern"
}

3. ArgoClient

Argo Float Ocean Data - Global ocean temperature, salinity, pressure profiles.

Features

  • Global ocean coverage (4000+ floats)
  • Temperature and salinity profiles
  • Depth measurements (0-2000m typical)
  • Free public data

Methods

use ruvector_data_framework::ArgoClient;

let client = ArgoClient::new()?;

// Get recent profiles (placeholder - requires Argo GDAC integration)
let recent = client.get_recent_profiles(30).await?;

// Search by region
let atlantic = client.search_by_region(
    0.0,     // latitude
    -30.0,   // longitude
    500.0    // radius km
).await?;

// Temperature-focused profiles
let temp_data = client.get_temperature_profiles().await?;

// Create sample data for testing
let samples = client.create_sample_profiles(50)?;

Note on Implementation

The current Argo client includes a create_sample_profiles() method for demonstration. For production use, integrate with:

SemanticVector Metadata

metadata: {
    "platform_number": "1900001",
    "latitude": "35.5",
    "longitude": "-45.2",
    "temperature": "18.3",
    "salinity": "35.1",
    "depth_m": "500.0",
    "source": "argo"
}

4. MaterialsProjectClient

Materials Project - Computational materials science database (150,000+ materials).

Features

  • Crystal structures and properties
  • Band gaps, formation energies
  • Electronic and mechanical properties
  • Requires free API key from https://materialsproject.org

Methods

use ruvector_data_framework::MaterialsProjectClient;

// API key required
let api_key = std::env::var("MATERIALS_PROJECT_API_KEY")?;
let client = MaterialsProjectClient::new(api_key)?;

// Search by chemical formula
let silicon = client.search_materials("Si").await?;
let iron_oxide = client.search_materials("Fe2O3").await?;
let battery = client.search_materials("LiFePO4").await?;

// Get specific material by ID
let mp_149 = client.get_material("mp-149").await?; // Silicon

// Search by property range
let semiconductors = client.search_by_property(
    "band_gap",
    1.0,  // min eV
    3.0   // max eV
).await?;

let stable = client.search_by_property(
    "formation_energy_per_atom",
    -2.0,  // min eV/atom
    0.0    // max eV/atom
).await?;

Common Properties

  • "band_gap" - Electronic band gap (eV)
  • "formation_energy_per_atom" - Formation energy (eV/atom)
  • "energy_per_atom" - Total energy per atom
  • "density" - Density (g/cm³)
  • "volume" - Volume per atom

SemanticVector Metadata

metadata: {
    "material_id": "mp-149",
    "formula": "Si",
    "band_gap": "1.14",
    "density": "2.33",
    "formation_energy": "0.0",
    "crystal_system": "cubic",
    "elements": "Si",
    "source": "materials_project"
}

Geographic Utilities

The GeoUtils helper provides geographic calculations:

use ruvector_data_framework::GeoUtils;

// Calculate distance between two points (Haversine formula)
let distance_km = GeoUtils::distance_km(
    40.7128, -74.0060,  // NYC
    34.0522, -118.2437  // LA
);
// Returns: ~3936 km

// Check if point is within radius
let within = GeoUtils::within_radius(
    34.05, -118.25,     // Center (LA)
    32.72, -117.16,     // Point (San Diego)
    200.0               // Radius in km
);
// Returns: true

Rate Limiting

All clients implement automatic rate limiting and retry logic:

Client Rate Limit Max Retries Retry Delay
USGS 200ms (~5 req/s) 3 1s exponential
CERN 500ms (~2 req/s) 3 1s exponential
Argo 300ms (~3 req/s) 3 1s exponential
Materials Project 1000ms (1 req/s) 3 1s exponential

Cross-Domain Discovery Examples

1. Earthquake-Climate Correlations

use ruvector_data_framework::{
    UsgsEarthquakeClient, NoaaClient,
    NativeDiscoveryEngine, NativeEngineConfig
};

let mut engine = NativeDiscoveryEngine::new(NativeEngineConfig::default());

// Add earthquake data
let usgs = UsgsEarthquakeClient::new()?;
let earthquakes = usgs.get_recent(5.0, 30).await?;
for eq in earthquakes {
    engine.add_vector(eq);
}

// Add climate data
let noaa = NoaaClient::new(None)?;
let climate = noaa.get_climate_data("GHCND:USW00023174", 30).await?;
for record in climate {
    engine.add_vector(record);
}

// Discover patterns
let patterns = engine.detect_patterns();
for pattern in patterns {
    if !pattern.cross_domain_links.is_empty() {
        println!("Found cross-domain pattern: {}", pattern.description);
    }
}

2. Materials for Particle Detectors

use ruvector_data_framework::{
    CernOpenDataClient, MaterialsProjectClient
};

let cern = CernOpenDataClient::new()?;
let materials = MaterialsProjectClient::new(api_key)?;

// Get particle physics requirements
let detector_data = cern.search_datasets("detector").await?;

// Find materials with suitable properties
let semiconductors = materials.search_by_property("band_gap", 1.0, 3.0).await?;

// Add to discovery engine to find correlations
let mut engine = NativeDiscoveryEngine::new(config);
for data in detector_data {
    engine.add_vector(data);
}
for material in semiconductors {
    engine.add_vector(material);
}

let patterns = engine.detect_patterns();

3. Ocean Temperature & Seismic Activity

use ruvector_data_framework::{
    ArgoClient, UsgsEarthquakeClient
};

let argo = ArgoClient::new()?;
let usgs = UsgsEarthquakeClient::new()?;

// Get ocean data for a region
let ocean = argo.search_by_region(0.0, -30.0, 1000.0).await?;

// Get earthquakes in same region
let quakes = usgs.search_by_region(0.0, -30.0, 1000.0, 90).await?;

// Discover correlations
let mut engine = NativeDiscoveryEngine::new(config);
for profile in ocean {
    engine.add_vector(profile);
}
for eq in quakes {
    engine.add_vector(eq);
}

// Look for cross-domain patterns
let patterns = engine.detect_patterns();
for pattern in patterns.iter().filter(|p| {
    p.cross_domain_links.iter().any(|l|
        (l.source_domain == Domain::Ocean && l.target_domain == Domain::Seismic) ||
        (l.source_domain == Domain::Seismic && l.target_domain == Domain::Ocean)
    )
}) {
    println!("Ocean-Seismic correlation: {}", pattern.description);
}

Running the Example

# Basic example (no API keys required)
cargo run --example physics_discovery

# With Materials Project API key
export MATERIALS_PROJECT_API_KEY="your_key_here"
cargo run --example physics_discovery

Integration with RuVector

All clients convert data to SemanticVector format, enabling:

  1. Vector Similarity Search - Find similar earthquakes, materials, experiments
  2. Graph Coherence Analysis - Detect network fragmentation/consolidation
  3. Cross-Domain Pattern Discovery - Bridge physics, seismic, ocean domains
  4. Temporal Analysis - Track changes over time
  5. Spatial Analysis - Geographic clustering and correlation

Testing

# Run all physics client tests
cargo test physics_clients

# Run specific client tests
cargo test usgs_client
cargo test cern_client
cargo test argo_client
cargo test materials_project_client

# Run geographic utilities tests
cargo test geo_utils

API Documentation

USGS Earthquake API

CERN Open Data Portal

Argo Data

Materials Project

Future Enhancements

  1. Full Argo GDAC Integration - Parse netCDF files directly
  2. CERN Data Caching - Local cache for large datasets
  3. USGS Historical Data - Access to complete historical catalog
  4. Materials Project Batch Queries - Optimize multi-material searches
  5. Real-time Earthquake Streaming - WebSocket for live data
  6. Ocean Current Prediction - ML models for temperature forecasting

License

Part of RuVector Data Discovery Framework. See main LICENSE file.