Files
wifi-densepose/examples/data/framework/PATENT_CLIENT_README.md
ruv d803bfe2b1 Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00

5.8 KiB

Patent Database API Client

A comprehensive patent data discovery client for the RuVector framework, providing access to USPTO and EPO patent databases.

Features

USPTO PatentsView API Client

The UsptoPatentClient provides free, unauthenticated access to the USPTO PatentsView API with the following capabilities:

Search Methods

  1. Keyword Search - search_patents(query, max_results)

    • Search patents by keywords in title and abstract
    • Example: client.search_patents("quantum computing", 100).await?
  2. Assignee Search - search_by_assignee(company_name, max_results)

    • Find all patents by a specific company or organization
    • Example: client.search_by_assignee("IBM", 50).await?
  3. CPC Classification Search - search_by_cpc(cpc_class, max_results)

    • Search by Cooperative Patent Classification code
    • Example: client.search_by_cpc("Y02E", 200).await?
  4. Patent Details - get_patent(patent_number)

    • Get detailed information for a specific patent
    • Example: client.get_patent("10000000").await?
  5. Citation Analysis - get_citations(patent_number)

    • Get both citing and cited patents for citation network analysis
    • Returns: (citing_patents, cited_patents)

CPC Classification Codes of Interest

  • Y02 - Climate Change Mitigation Technologies

    • Y02E - Energy generation, transmission, distribution
    • Y02T - Climate change mitigation technologies related to transportation
    • Y02P - Climate change mitigation technologies in production processes
  • G06N - Computing Arrangements Based on AI/ML

    • G06N3 - Computing based on biological models (neural networks)
    • G06N5 - Computing based on knowledge-based models
    • G06N20 - Machine learning
  • A61 - Medical or Veterinary Science

    • A61K - Preparations for medical, dental, or toilet purposes
    • A61P - Specific therapeutic activity of chemical compounds
  • H01 - Electric Elements

    • H01L - Semiconductor devices
    • H01M - Batteries, fuel cells, capacitors

Data Format

All patent data is converted to SemanticVector format:

SemanticVector {
    id: "US10123456",              // Patent number with US prefix
    embedding: Vec<f32>,            // 512-dimension embedding from title + abstract
    domain: Domain::Research,       // Could be Domain::Innovation if added
    timestamp: DateTime<Utc>,       // Grant date or filing date
    metadata: HashMap {
        "patent_number": "10123456",
        "title": "Quantum computing system...",
        "abstract": "A quantum computing system comprising...",
        "assignee": "IBM Corporation",
        "inventors": "John Doe, Jane Smith",
        "cpc_codes": "G06N10/00, G06N99/00",
        "citations_count": "42",    // Number of patents citing this one
        "cited_count": "15",        // Number of patents cited by this one
        "source": "uspto"
    }
}

Rate Limiting

  • USPTO: 200ms between requests (~5 req/sec) - follows PatentsView API guidelines
  • EPO: 1000ms between requests (~1 req/sec) - conservative rate limiting
  • Automatic retry with exponential backoff (max 3 retries)

Usage Example

use ruvector_data_framework::{UsptoPatentClient, Result};

#[tokio::main]
async fn main() -> Result<()> {
    // Create client (no authentication required)
    let client = UsptoPatentClient::new()?;

    // Search for climate tech patents
    let patents = client.search_by_cpc("Y02E", 100).await?;

    for patent in patents {
        println!("Patent: {} - {}",
            patent.id,
            patent.metadata.get("title").unwrap_or(&"Untitled".to_string())
        );
    }

    // Search by company
    let tesla_patents = client.search_by_assignee("Tesla", 50).await?;

    // Get specific patent with citations
    if let Some(patent) = client.get_patent("10000000").await? {
        let (citing, cited) = client.get_citations(&patent.id[2..]).await?;
        println!("Cited by {} patents, cites {} patents", citing.len(), cited.len());
    }

    Ok(())
}

Run the Example

cargo run --example patent_discovery

EPO Client (Future Development)

The EpoClient is a placeholder for European Patent Office integration. Implementation requires:

API Documentation

Testing

# Run all tests
cargo test --lib patent_clients

# Run integration tests (requires network)
cargo test --lib patent_clients -- --ignored

# Test specific functionality
cargo test --lib patent_clients::tests::test_cpc_classification_mapping

Integration with Discovery Framework

The patent client integrates seamlessly with the RuVector discovery framework:

use ruvector_data_framework::{
    UsptoPatentClient,
    DiscoveryPipeline,
    PipelineConfig,
};

// 1. Fetch patent data
let client = UsptoPatentClient::new()?;
let vectors = client.search_by_cpc("G06N", 1000).await?;

// 2. Add to discovery engine
let mut pipeline = DiscoveryPipeline::new(PipelineConfig::default());
// ... add vectors to pipeline ...

// 3. Discover patterns across patent citation networks
let patterns = pipeline.run(source).await?;

Features

  • Free API access (no authentication)
  • Comprehensive patent metadata
  • Citation network analysis
  • CPC classification search
  • Rate limiting and retry logic
  • SemanticVector conversion with embeddings
  • Unit and integration tests
  • 🔄 EPO integration (planned)

Contributing

To add new patent sources:

  1. Implement the client following the pattern in patent_clients.rs
  2. Add conversion to SemanticVector format
  3. Implement rate limiting and error handling
  4. Add comprehensive tests
  5. Update this documentation