# AI/ML API Clients for RuVector Data Discovery Framework

This module provides comprehensive integration with AI/ML platforms for discovering models, datasets, and research papers.

## Available Clients
### 1. HuggingFaceClient

**Purpose:** Access the HuggingFace model hub and inference API

**Features:**
- Search models by query and task type
- Get model details and metadata
- List and search datasets
- Run model inference
- Convert models/datasets to `SemanticVector`s

**API Details:**
- Base URL: `https://huggingface.co/api`
- Rate limit: 30 requests/minute (free tier)
- API key: Optional via the `HUGGINGFACE_API_KEY` environment variable
- Mock fallback: Yes (when no API key is provided)

**Example:**
```rust
use ruvector_data_framework::HuggingFaceClient;

let client = HuggingFaceClient::new();

// Search for BERT models
let models = client.search_models("bert", Some("fill-mask")).await?;

// Get specific model
let model = client.get_model("bert-base-uncased").await?;

// Convert to vector for discovery
if let Some(m) = model {
    let vector = client.model_to_vector(&m);
    println!("Model: {}, Embedding dim: {}", vector.id, vector.embedding.len());
}

// List datasets
let datasets = client.list_datasets(Some("nlp")).await?;

// Run inference (requires API key)
let result = client.inference(
    "bert-base-uncased",
    serde_json::json!({"inputs": "Hello [MASK]!"})
).await?;
```
### 2. OllamaClient

**Purpose:** Local LLM inference with Ollama

**Features:**
- List locally available models
- Generate text completions
- Chat with message history
- Generate embeddings
- Pull models from the Ollama library
- Automatic mock fallback when Ollama is not running

**API Details:**
- Base URL: `http://localhost:11434/api` (default)
- Rate limit: None (local service)
- API key: Not required
- Mock fallback: Yes (when the Ollama service is unavailable)

**Example:**
```rust
use ruvector_data_framework::{OllamaClient, OllamaChatMessage};

let mut client = OllamaClient::new();

// Check if Ollama is running
if client.is_available().await {
    // List available models
    let models = client.list_models().await?;

    // Generate completion
    let response = client.generate(
        "llama2",
        "Explain quantum computing in simple terms"
    ).await?;

    // Chat with message history
    let messages = vec![
        OllamaChatMessage {
            role: "user".to_string(),
            content: "What is machine learning?".to_string(),
        }
    ];
    let chat_response = client.chat("llama2", messages).await?;

    // Generate embeddings
    let embedding = client.embeddings("llama2", "sample text").await?;
    println!("Embedding dimension: {}", embedding.len());
}
```
**Setup:**

```bash
# Install Ollama
curl https://ollama.ai/install.sh | sh

# Start Ollama service
ollama serve

# Pull a model
ollama pull llama2
```
### 3. ReplicateClient

**Purpose:** Access Replicate's cloud ML model platform

**Features:**
- Get model information
- Create predictions (run models)
- Check prediction status
- List model collections
- Convert models to `SemanticVector`s

**API Details:**
- Base URL: `https://api.replicate.com/v1`
- Rate limit: Varies by plan
- API key: Required via the `REPLICATE_API_TOKEN` environment variable
- Mock fallback: Yes (when no API token is provided)

**Example:**
```rust
use ruvector_data_framework::ReplicateClient;

let client = ReplicateClient::new();

// Get model info
let model = client.get_model("stability-ai", "stable-diffusion").await?;
if let Some(m) = model {
    println!("Model: {}/{}", m.owner, m.name);

    // Convert to vector
    let vector = client.model_to_vector(&m);

    // Create a prediction
    let prediction = client.create_prediction(
        "stability-ai/stable-diffusion",
        serde_json::json!({
            "prompt": "a beautiful sunset over mountains"
        })
    ).await?;

    // Check prediction status
    let status = client.get_prediction(&prediction.id).await?;
    println!("Status: {}", status.status);
}

// List collections
let collections = client.list_collections().await?;
```
**Environment Setup:**

```bash
export REPLICATE_API_TOKEN="your_token_here"
```
### 4. TogetherAiClient

**Purpose:** Access Together AI's open-source model hosting

**Features:**
- List available models
- Chat completions
- Generate embeddings
- Support for various open-source LLMs
- Convert models to `SemanticVector`s

**API Details:**
- Base URL: `https://api.together.xyz/v1`
- Rate limit: Varies by plan
- API key: Required via the `TOGETHER_API_KEY` environment variable
- Mock fallback: Yes (when no API key is provided)

**Example:**
```rust
use ruvector_data_framework::{TogetherAiClient, TogetherMessage};

let client = TogetherAiClient::new();

// List models
let models = client.list_models().await?;
for model in models.iter().take(5) {
    println!("Model: {}", model.display_name.as_deref().unwrap_or(&model.id));
    println!("Context: {} tokens", model.context_length.unwrap_or(0));
}

// Chat completion
let messages = vec![
    TogetherMessage {
        role: "user".to_string(),
        content: "Explain neural networks".to_string(),
    }
];
let response = client.chat_completion(
    "togethercomputer/llama-2-7b",
    messages
).await?;
println!("Response: {}", response);

// Generate embeddings
let embedding = client.embeddings(
    "togethercomputer/m2-bert-80M-8k-retrieval",
    "sample text for embedding"
).await?;
```
**Environment Setup:**

```bash
export TOGETHER_API_KEY="your_key_here"
```
### 5. PapersWithCodeClient

**Purpose:** Access the Papers With Code research database

**Features:**
- Search ML research papers
- Get paper details
- List datasets
- Get state-of-the-art (SOTA) benchmarks
- Search methods/techniques
- Convert papers/datasets to `SemanticVector`s

**API Details:**
- Base URL: `https://paperswithcode.com/api/v1`
- Rate limit: 60 requests/minute
- API key: Not required
- Mock fallback: Partial (for some endpoints)

**Example:**
```rust
use ruvector_data_framework::PapersWithCodeClient;

let client = PapersWithCodeClient::new();

// Search papers
let papers = client.search_papers("transformer").await?;
for paper in papers.iter().take(5) {
    println!("Title: {}", paper.title);
    if let Some(url) = &paper.url_abs {
        println!("URL: {}", url);
    }

    // Convert to vector
    let vector = client.paper_to_vector(paper);
    println!("Vector ID: {}", vector.id);
}

// Get specific paper
let paper = client.get_paper("attention-is-all-you-need").await?;

// List datasets
let datasets = client.list_datasets().await?;
for dataset in datasets.iter().take(5) {
    println!("Dataset: {}", dataset.name);
    // Convert to vector
    let vector = client.dataset_to_vector(dataset);
}

// Get SOTA results for a task
let sota_results = client.get_sota("image-classification").await?;
for result in sota_results {
    println!("Task: {}, Dataset: {}, Metric: {}, Value: {}",
        result.task, result.dataset, result.metric, result.value);
}
```
## Integration with RuVector Discovery

All clients provide conversion methods that transform their data into the `SemanticVector` format used by RuVector's discovery engine:
```rust
use ruvector_data_framework::{
    HuggingFaceClient, PapersWithCodeClient, Domain,
    NativeDiscoveryEngine, NativeEngineConfig
};

// Create clients
let hf_client = HuggingFaceClient::new();
let pwc_client = PapersWithCodeClient::new();

// Collect vectors from different sources
let mut vectors = Vec::new();

// Add HuggingFace models
let models = hf_client.search_models("transformer", None).await?;
for model in models {
    vectors.push(hf_client.model_to_vector(&model));
}

// Add research papers
let papers = pwc_client.search_papers("attention mechanism").await?;
for paper in papers {
    vectors.push(pwc_client.paper_to_vector(&paper));
}

// Run discovery analysis
let config = NativeEngineConfig::default();
let mut engine = NativeDiscoveryEngine::new(config);
for vector in vectors {
    engine.ingest_vector(vector)?;
}

// Detect patterns
let patterns = engine.detect_patterns()?;
println!("Found {} discovery patterns", patterns.len());
```
## Environment Variables

| Variable | Client | Required | Description |
|---|---|---|---|
| `HUGGINGFACE_API_KEY` | HuggingFaceClient | No | Optional for public models, required for private models/inference |
| `REPLICATE_API_TOKEN` | ReplicateClient | Yes* | Required for API access (*falls back to mock) |
| `TOGETHER_API_KEY` | TogetherAiClient | Yes* | Required for API access (*falls back to mock) |
| - | OllamaClient | No | Uses local Ollama service |
| - | PapersWithCodeClient | No | Public API, no key needed |
## Mock Data Fallback

All clients provide automatic mock data (PapersWithCodeClient only partially) when:
- API keys are not provided
- Services are unavailable
- Rate limits are exceeded (after retries)
This allows for:
- Development without API keys
- Testing without external dependencies
- Graceful degradation in production
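
The fallback decision itself is simple. Here is a minimal sketch of the pattern (the `SketchClient` type and its methods are hypothetical, shown only to illustrate the approach, not the framework's internal API):

```rust
// Hypothetical sketch of the mock-fallback pattern; these names are
// illustrative and do not appear in the framework source.
struct SketchClient {
    api_key: Option<String>,
}

impl SketchClient {
    async fn search(&self, query: &str) -> Vec<String> {
        match &self.api_key {
            // A key is configured: try the live API, fall back on any error.
            Some(_key) => self
                .fetch_live(query)
                .await
                .unwrap_or_else(|_| self.mock_results(query)),
            // No key: serve deterministic mock data immediately.
            None => self.mock_results(query),
        }
    }

    async fn fetch_live(&self, _query: &str) -> Result<Vec<String>, std::io::Error> {
        // The real HTTP call is elided in this sketch.
        Ok(vec![])
    }

    fn mock_results(&self, query: &str) -> Vec<String> {
        vec![format!("mock-result-for-{query}")]
    }
}
```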
## Rate Limiting

All clients implement automatic rate limiting:
- Configurable delays between requests
- Exponential backoff on failures
- Automatic retry logic (up to 3 retries)
- Respects API rate limits
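
A rough sketch of that retry loop (the `with_retries` helper and its 500 ms initial delay are assumptions for illustration; only the 3-retry cap and exponential backoff come from the list above):

```rust
use std::time::Duration;

// Hypothetical retry helper illustrating the documented backoff behavior;
// it is not part of the framework's public API.
async fn with_retries<F, Fut, T, E>(mut op: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
{
    let mut delay = Duration::from_millis(500); // assumed initial delay
    let mut attempt = 0;
    loop {
        match op().await {
            Ok(value) => return Ok(value),
            Err(e) => {
                attempt += 1;
                if attempt > 3 {
                    return Err(e); // up to 3 retries, then give up
                }
                tokio::time::sleep(delay).await;
                delay *= 2; // exponential backoff
            }
        }
    }
}
```

A call such as `with_retries(|| client.list_models()).await` would then absorb transient failures automatically.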
## Error Handling

All clients use the framework's `Result<T>` type with `FrameworkError`:
```rust
use ruvector_data_framework::{HuggingFaceClient, FrameworkError};

let hf_client = HuggingFaceClient::new();

match hf_client.search_models("bert", None).await {
    Ok(models) => {
        println!("Found {} models", models.len());
    }
    Err(FrameworkError::Network(e)) => {
        eprintln!("Network error: {}", e);
    }
    Err(e) => {
        eprintln!("Other error: {}", e);
    }
}
```
## Testing

The module includes comprehensive unit tests:
```bash
# Run all ML client tests
cargo test ml_clients

# Run specific client tests
cargo test ml_clients::tests::test_huggingface
cargo test ml_clients::tests::test_ollama
cargo test ml_clients::tests::test_replicate
cargo test ml_clients::tests::test_together
cargo test ml_clients::tests::test_paperswithcode

# Run integration tests (requires API keys)
cargo test ml_clients::tests --ignored
```
## Example Application

See `examples/ml_clients_demo.rs` for a complete demonstration:
```bash
# Run demo (uses mock data)
cargo run --example ml_clients_demo

# Run with API keys
export HUGGINGFACE_API_KEY="your_key"
export REPLICATE_API_TOKEN="your_token"
export TOGETHER_API_KEY="your_key"
cargo run --example ml_clients_demo
```
## Performance Considerations
- HuggingFace: 30 req/min free tier → 2 second delays
- Ollama: Local, minimal delays (100ms)
- Replicate: Pay-per-use, 1 second delays
- Together AI: Pay-per-use, 1 second delays
- Papers With Code: 60 req/min → 1 second delays
For bulk operations, use batch processing with appropriate delays.
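
For instance, a paced bulk search over several queries might look like the following sketch (the 2-second delay mirrors the HuggingFace pacing above; the error type is simplified for illustration):

```rust
use std::time::Duration;
use ruvector_data_framework::HuggingFaceClient;

// Sketch of paced bulk searching; not a prescribed framework API.
async fn bulk_search(
    client: &HuggingFaceClient,
    queries: &[&str],
) -> Result<(), Box<dyn std::error::Error>> {
    for (i, query) in queries.iter().enumerate() {
        let models = client.search_models(query, None).await?;
        println!("{}: {} models", query, models.len());
        // Pace requests to stay under the 30 req/min free-tier limit.
        if i + 1 < queries.len() {
            tokio::time::sleep(Duration::from_secs(2)).await;
        }
    }
    Ok(())
}
```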
## Architecture

All clients follow a consistent pattern:
- **Client struct**: Holds HTTP client, embedder, base URL, credentials
- **API response structs**: Deserialize API responses
- **Public methods**: High-level API operations
- **Conversion methods**: Transform to `SemanticVector`
- **Mock methods**: Provide fallback data
- **Retry logic**: Handle transient failures
- **Tests**: Comprehensive unit testing (a skeletal sketch of this shape follows)
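
The skeleton below illustrates that shape. All names are hypothetical, not taken from the framework source; the embedder, retry logic, and the step-4 `SemanticVector` conversion are elided for brevity:

```rust
use serde::Deserialize;

// Hypothetical skeleton of the client pattern; names are illustrative.
pub struct ExampleClient {
    http: reqwest::Client, // 1. client struct: HTTP client + config
    base_url: String,
    api_key: Option<String>,
}

// 2. API response struct: deserializes the remote payload.
#[derive(Deserialize)]
pub struct ExampleModel {
    pub id: String,
    pub description: Option<String>,
}

impl ExampleClient {
    // 3. Public method: a high-level API operation (retries elided).
    pub async fn get_model(&self, id: &str) -> Result<ExampleModel, reqwest::Error> {
        let url = format!("{}/models/{}", self.base_url, id);
        let mut request = self.http.get(&url);
        if let Some(key) = &self.api_key {
            request = request.bearer_auth(key); // attach credentials if set
        }
        request.send().await?.json().await
    }

    // 5. Mock method: fallback data when the API is unreachable.
    fn mock_model(&self, id: &str) -> ExampleModel {
        ExampleModel { id: id.to_string(), description: None }
    }
}
```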
## Dependencies

- `reqwest`: HTTP client
- `tokio`: Async runtime
- `serde`: Serialization/deserialization
- `chrono`: Timestamp handling
- `urlencoding`: URL parameter encoding
## Contributing

When adding new ML API clients:
- Follow the established pattern (see existing clients)
- Implement rate limiting
- Provide mock fallback data
- Add comprehensive tests (at least 15 tests)
- Update this documentation
- Add example usage
## License
Same as RuVector framework license.