Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

This commit is contained in:
ruv
2026-02-28 14:39:40 -05:00
7854 changed files with 3522914 additions and 0 deletions


@@ -0,0 +1,16 @@
[package]
name = "ruvector-metrics"
version.workspace = true
edition.workspace = true
license.workspace = true
authors.workspace = true
repository.workspace = true
readme = "README.md"
description = "Prometheus-compatible metrics collection for Ruvector vector databases"
[dependencies]
prometheus = "0.13"
lazy_static = "1.5"
serde = { workspace = true }
serde_json = { workspace = true }
chrono = { workspace = true }


@@ -0,0 +1,224 @@
# Ruvector Metrics
[![Crates.io](https://img.shields.io/crates/v/ruvector-metrics.svg)](https://crates.io/crates/ruvector-metrics)
[![Documentation](https://docs.rs/ruvector-metrics/badge.svg)](https://docs.rs/ruvector-metrics)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Rust](https://img.shields.io/badge/rust-1.77%2B-orange.svg)](https://www.rust-lang.org)
**Prometheus-compatible metrics collection for Ruvector vector databases.**
`ruvector-metrics` provides comprehensive observability with counters, gauges, histograms, and exporters for monitoring Ruvector performance and health. Part of the [Ruvector](https://github.com/ruvnet/ruvector) ecosystem.
## Why Ruvector Metrics?
- **Prometheus Native**: Direct Prometheus integration
- **Low Overhead**: Metrics are registered lazily on first use
- **Comprehensive**: Operation latencies, throughput, memory
- **Customizable**: Add custom metrics for your use case
- **Standard Format**: OpenMetrics-compatible output
## Features
### Core Metrics
- **Operation Counters**: Insert, search, delete counts
- **Latency Histograms**: Bucketed latencies from which p50, p95, and p99 can be derived
- **Throughput Gauges**: Queries per second
- **Memory Metrics**: Heap usage, vector memory
- **Index Metrics**: HNSW stats, quantization info
### Advanced Features
- **Custom Labels**: Add context to metrics
- **Metric Groups**: Enable/disable metric categories
- **JSON Export**: Alternative to Prometheus format
- **Time Series**: Historical metric tracking
## Installation
Add `ruvector-metrics` to your `Cargo.toml`:
```toml
[dependencies]
ruvector-metrics = "0.1.1"
```
## Quick Start
### Initialize Metrics
```rust
use ruvector_metrics::{gather_metrics, MetricsRecorder};

fn main() {
    // Metrics are global statics that register with the default Prometheus
    // registry on first use, so no explicit setup or config is needed.
    MetricsRecorder::set_collections_count(0);

    // Confirm the gauge is now exported
    assert!(gather_metrics().contains("ruvector_collections_total"));
}
```
### Record Metrics
```rust
use ruvector_metrics::MetricsRecorder;

fn main() {
    // Record single operations (latencies are in seconds)
    MetricsRecorder::record_search("docs", 0.004, true);
    MetricsRecorder::record_insert("docs", 0.012, 1, true);
    MetricsRecorder::record_delete("docs", true);

    // Record pre-aggregated counts of successful operations
    MetricsRecorder::record_batch("docs", 100, 50, 10);

    // Update gauges
    MetricsRecorder::set_vectors_count("docs", 10_000);
    MetricsRecorder::set_memory_usage(1024 * 1024 * 500); // 500 MiB
}
```
### Export Metrics
```rust
use ruvector_metrics::gather_metrics;

fn main() {
    // Render every registered metric in the Prometheus text format
    let prometheus_output = gather_metrics();
    println!("{}", prometheus_output);
}
```
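To make the export format concrete, here is a minimal std-only sketch of how a single labelled sample looks in the Prometheus text exposition format. `render_sample` is a hypothetical helper written for illustration and is not part of `ruvector-metrics`; the metric name and label names are the ones registered in `lib.rs`.

```rust
/// Render one sample in the Prometheus text exposition format.
/// Hypothetical helper for illustration; not part of `ruvector-metrics`.
fn render_sample(name: &str, labels: &[(&str, &str)], value: f64) -> String {
    if labels.is_empty() {
        return format!("{} {}", name, value);
    }
    let rendered: Vec<String> = labels
        .iter()
        .map(|(k, v)| {
            // Escape backslash, double quote, and newline inside label values.
            let escaped = v
                .replace('\\', "\\\\")
                .replace('"', "\\\"")
                .replace('\n', "\\n");
            format!("{}=\"{}\"", k, escaped)
        })
        .collect();
    format!("{}{{{}}} {}", name, rendered.join(","), value)
}

fn main() {
    println!("# TYPE ruvector_search_requests_total counter");
    println!(
        "{}",
        render_sample(
            "ruvector_search_requests_total",
            &[("collection", "docs"), ("status", "success")],
            3.0,
        )
    );
}
```

In practice the `prometheus` crate's encoder produces this text; the sketch only shows the shape of the lines a scraper will see.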
### HTTP Endpoint
```rust
use ruvector_metrics::{Metrics, MetricsServer};
// Start metrics server on /metrics endpoint
let server = MetricsServer::new(metrics, 9090)?;
server.start().await?;
// Access at http://localhost:9090/metrics
```
## Available Metrics
```
# Counters
ruvector_search_requests_total   # Search requests, by collection and status
ruvector_insert_requests_total   # Insert requests, by collection and status
ruvector_delete_requests_total   # Delete requests, by collection and status
ruvector_vectors_inserted_total  # Vectors inserted, by collection
ruvector_uptime_seconds          # Uptime in seconds
# Histograms
ruvector_search_latency_seconds  # Search latency, by collection
ruvector_insert_latency_seconds  # Insert latency, by collection
# Gauges
ruvector_vectors_total           # Vectors stored, by collection
ruvector_collections_total       # Number of collections
ruvector_memory_usage_bytes      # Memory usage in bytes
```
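The latency histograms are exported as cumulative buckets. The std-only sketch below illustrates that counting rule using the bucket bounds declared in `lib.rs`; `cumulative_buckets` is a hypothetical helper for illustration, not how the `prometheus` crate is implemented internally.

```rust
/// Count observations into cumulative buckets: each bucket counts every
/// value less than or equal to its upper bound, and a final implicit
/// +Inf bucket counts all observations.
/// Hypothetical helper for illustration only.
fn cumulative_buckets(bounds: &[f64], observations: &[f64]) -> Vec<u64> {
    let mut counts = vec![0u64; bounds.len() + 1];
    for &value in observations {
        for (i, &upper) in bounds.iter().enumerate() {
            if value <= upper {
                counts[i] += 1;
            }
        }
        // Every observation lands in the +Inf bucket.
        counts[bounds.len()] += 1;
    }
    counts
}

fn main() {
    // Bucket bounds (seconds) used for the latency histograms in lib.rs.
    let bounds = [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0];
    // Made-up sample latencies, in seconds.
    let latencies = [0.0003, 0.004, 0.02, 2.0];
    for (i, count) in cumulative_buckets(&bounds, &latencies).iter().enumerate() {
        match bounds.get(i) {
            Some(le) => println!("le=\"{}\" {}", le, count),
            None => println!("le=\"+Inf\" {}", count),
        }
    }
}
```

Because the largest finite bound is 1.0 second, any slower operation is visible only in the `+Inf` bucket.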
## API Overview
### Core Types
```rust
// Stateless helper whose associated functions record metrics
pub struct MetricsRecorder;

// Liveness and readiness checks
pub struct HealthChecker { /* ... */ }

#[serde(rename_all = "lowercase")]
pub enum HealthStatus {
    Healthy,
    Degraded,
    Unhealthy,
}
```
### Metrics Operations
```rust
impl MetricsRecorder {
    // Record operations (latencies are in seconds)
    pub fn record_search(collection: &str, latency_secs: f64, success: bool);
    pub fn record_insert(collection: &str, latency_secs: f64, count: usize, success: bool);
    pub fn record_delete(collection: &str, success: bool);
    pub fn record_batch(collection: &str, searches: usize, inserts: usize, deletes: usize);
    // Update gauges
    pub fn set_vectors_count(collection: &str, count: usize);
    pub fn set_collections_count(count: usize);
    pub fn set_memory_usage(bytes: usize);
}
// Export in Prometheus text format
pub fn gather_metrics() -> String;
```
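The latency histograms are named `*_latency_seconds`, so callers need a duration in seconds. One way to obtain it is sketched below with `std::time::Instant`; `timed` is a hypothetical helper for illustration and is not provided by the crate.

```rust
use std::time::Instant;

/// Run `op` and return its result with the elapsed wall-clock time in seconds.
/// Hypothetical helper for illustration; not part of `ruvector-metrics`.
fn timed<T>(op: impl FnOnce() -> T) -> (T, f64) {
    let start = Instant::now();
    let result = op();
    (result, start.elapsed().as_secs_f64())
}

fn main() {
    // Stand-in workload; a real caller would run a search or insert here.
    let (sum, latency_secs) = timed(|| (1..=1000u64).sum::<u64>());
    // In an application, `latency_secs` would be handed to a recording call
    // such as `MetricsRecorder::record_search(collection, latency_secs, true)`
    // from recorder.rs.
    println!("sum={} latency_secs={}", sum, latency_secs);
}
```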
## Grafana Dashboard
Example Grafana queries:
```promql
# Search request rate
sum(rate(ruvector_search_requests_total[5m]))
# p99 search latency
histogram_quantile(0.99, sum by (le) (rate(ruvector_search_latency_seconds_bucket[5m])))
# Memory usage in MiB
ruvector_memory_usage_bytes / 1024 / 1024
# Search error ratio
sum(rate(ruvector_search_requests_total{status="error"}[5m])) / sum(rate(ruvector_search_requests_total[5m]))
```
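The p99 query estimates a quantile from cumulative bucket counts. The sketch below is a simplified std-only approximation of that idea (linear interpolation inside the bucket containing the target rank), not the Prometheus implementation; `estimate_quantile` and the sample bucket counts are made up for illustration.

```rust
/// Estimate a quantile from cumulative histogram buckets by linear
/// interpolation inside the bucket that contains the target rank.
/// `buckets` holds (upper_bound, cumulative_count) pairs sorted by bound,
/// ending with the +Inf bucket (`f64::INFINITY`).
/// Simplified sketch for illustration only.
fn estimate_quantile(q: f64, buckets: &[(f64, f64)]) -> Option<f64> {
    let total = buckets.last()?.1;
    if total == 0.0 || !(0.0..=1.0).contains(&q) {
        return None;
    }
    let rank = q * total;
    let mut lower_bound = 0.0;
    let mut lower_count = 0.0;
    for &(upper_bound, count) in buckets {
        if count >= rank {
            let in_bucket = count - lower_count;
            if upper_bound.is_infinite() || in_bucket == 0.0 {
                // Rank falls in the +Inf bucket (or an empty one): fall back
                // to the highest bound seen so far.
                return Some(lower_bound);
            }
            let fraction = (rank - lower_count) / in_bucket;
            return Some(lower_bound + (upper_bound - lower_bound) * fraction);
        }
        lower_bound = upper_bound;
        lower_count = count;
    }
    None
}

fn main() {
    // Made-up cumulative counts for bounds 0.01 s, 0.1 s, 1.0 s, and +Inf.
    let buckets = [(0.01, 50.0), (0.1, 90.0), (1.0, 100.0), (f64::INFINITY, 100.0)];
    if let Some(p99) = estimate_quantile(0.99, &buckets) {
        println!("estimated p99 = {:.3} s", p99);
    }
}
```

The estimate can never be more precise than the bucket layout, which is why bucket bounds should bracket the latencies you care about.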
## Related Crates
- **[ruvector-core](../ruvector-core/)** - Core vector database engine
- **[ruvector-server](../ruvector-server/)** - REST API server
## Documentation
- **[Main README](../../README.md)** - Complete project overview
- **[API Documentation](https://docs.rs/ruvector-metrics)** - Full API reference
- **[GitHub Repository](https://github.com/ruvnet/ruvector)** - Source code
## License
**MIT License** - see [LICENSE](../../LICENSE) for details.

---

<div align="center">

**Part of [Ruvector](https://github.com/ruvnet/ruvector) - Built by [rUv](https://ruv.io)**

[![Star on GitHub](https://img.shields.io/github/stars/ruvnet/ruvector?style=social)](https://github.com/ruvnet/ruvector)

[Documentation](https://docs.rs/ruvector-metrics) | [Crates.io](https://crates.io/crates/ruvector-metrics) | [GitHub](https://github.com/ruvnet/ruvector)

</div>


@@ -0,0 +1,214 @@
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::time::Instant;
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(rename_all = "lowercase")]
pub enum HealthStatus {
Healthy,
Degraded,
Unhealthy,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct HealthResponse {
pub status: HealthStatus,
pub version: String,
pub uptime_seconds: u64,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct ReadinessResponse {
pub status: HealthStatus,
pub collections_count: usize,
pub total_vectors: usize,
pub details: HashMap<String, CollectionHealth>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct CollectionHealth {
pub status: HealthStatus,
pub vectors_count: usize,
pub last_updated: Option<String>,
}
#[derive(Debug)]
pub struct CollectionStats {
pub name: String,
pub vectors_count: usize,
pub last_updated: Option<chrono::DateTime<chrono::Utc>>,
}
pub struct HealthChecker {
start_time: Instant,
version: String,
}
impl HealthChecker {
/// Create a new health checker
pub fn new() -> Self {
Self {
start_time: Instant::now(),
version: env!("CARGO_PKG_VERSION").to_string(),
}
}
/// Create a health checker with custom version
pub fn with_version(version: String) -> Self {
Self {
start_time: Instant::now(),
version,
}
}
/// Get basic health status
pub fn health(&self) -> HealthResponse {
HealthResponse {
status: HealthStatus::Healthy,
version: self.version.clone(),
uptime_seconds: self.start_time.elapsed().as_secs(),
}
}
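/// Get detailed readiness status
///
/// A collection is `Healthy` when it holds at least one vector and `Degraded`
/// when it is empty. The overall status is `Healthy` when every collection is
/// healthy, `Degraded` when only some are (or when there are no collections),
/// and `Unhealthy` when none are.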
pub fn readiness(&self, collections: &[CollectionStats]) -> ReadinessResponse {
let total_vectors: usize = collections.iter().map(|c| c.vectors_count).sum();
let mut details = HashMap::new();
for collection in collections {
let status = if collection.vectors_count > 0 {
HealthStatus::Healthy
} else {
HealthStatus::Degraded
};
details.insert(
collection.name.clone(),
CollectionHealth {
status,
vectors_count: collection.vectors_count,
last_updated: collection.last_updated.map(|dt| dt.to_rfc3339()),
},
);
}
let overall_status = if collections.is_empty() {
HealthStatus::Degraded
} else if details.values().all(|c| c.status == HealthStatus::Healthy) {
HealthStatus::Healthy
} else if details.values().any(|c| c.status == HealthStatus::Healthy) {
HealthStatus::Degraded
} else {
HealthStatus::Unhealthy
};
ReadinessResponse {
status: overall_status,
collections_count: collections.len(),
total_vectors,
details,
}
}
}
impl Default for HealthChecker {
fn default() -> Self {
Self::new()
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_health_checker_new() {
let checker = HealthChecker::new();
let health = checker.health();
assert_eq!(health.status, HealthStatus::Healthy);
assert_eq!(health.version, env!("CARGO_PKG_VERSION"));
// Uptime is always >= 0 for u64, so just check it exists
let _ = health.uptime_seconds;
}
#[test]
fn test_readiness_empty_collections() {
let checker = HealthChecker::new();
let readiness = checker.readiness(&[]);
assert_eq!(readiness.status, HealthStatus::Degraded);
assert_eq!(readiness.collections_count, 0);
assert_eq!(readiness.total_vectors, 0);
}
#[test]
fn test_readiness_with_collections() {
let checker = HealthChecker::new();
let collections = vec![
CollectionStats {
name: "test1".to_string(),
vectors_count: 100,
last_updated: Some(chrono::Utc::now()),
},
CollectionStats {
name: "test2".to_string(),
vectors_count: 200,
last_updated: None,
},
];
let readiness = checker.readiness(&collections);
assert_eq!(readiness.status, HealthStatus::Healthy);
assert_eq!(readiness.collections_count, 2);
assert_eq!(readiness.total_vectors, 300);
assert_eq!(readiness.details.len(), 2);
}
#[test]
fn test_readiness_with_empty_collection() {
let checker = HealthChecker::new();
let collections = vec![CollectionStats {
name: "empty".to_string(),
vectors_count: 0,
last_updated: None,
}];
let readiness = checker.readiness(&collections);
// Collection exists but is empty (degraded), so overall is Unhealthy
// since no collections are in healthy state
assert_eq!(readiness.status, HealthStatus::Unhealthy);
assert_eq!(readiness.collections_count, 1);
assert_eq!(readiness.total_vectors, 0);
}
#[test]
fn test_collection_health_status() {
let checker = HealthChecker::new();
let collections = vec![
CollectionStats {
name: "healthy".to_string(),
vectors_count: 100,
last_updated: Some(chrono::Utc::now()),
},
CollectionStats {
name: "degraded".to_string(),
vectors_count: 0,
last_updated: None,
},
];
let readiness = checker.readiness(&collections);
assert_eq!(
readiness.details.get("healthy").unwrap().status,
HealthStatus::Healthy
);
assert_eq!(
readiness.details.get("degraded").unwrap().status,
HealthStatus::Degraded
);
}
}


@@ -0,0 +1,105 @@
use lazy_static::lazy_static;
use prometheus::{
register_counter, register_counter_vec, register_gauge, register_gauge_vec,
register_histogram_vec, Counter, CounterVec, Encoder, Gauge, GaugeVec, HistogramVec, Opts,
Registry, TextEncoder,
};
pub mod health;
pub mod recorder;
pub use health::{
CollectionHealth, CollectionStats, HealthChecker, HealthResponse, HealthStatus,
ReadinessResponse,
};
pub use recorder::MetricsRecorder;
lazy_static! {
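// NOTE: the `register_*!` macros below register with prometheus's default
// registry (the one `prometheus::gather()` reads), not with `REGISTRY`.
pub static ref REGISTRY: Registry = Registry::new();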
// Search metrics
pub static ref SEARCH_REQUESTS_TOTAL: CounterVec = register_counter_vec!(
Opts::new("ruvector_search_requests_total", "Total search requests"),
&["collection", "status"]
).unwrap();
pub static ref SEARCH_LATENCY_SECONDS: HistogramVec = register_histogram_vec!(
"ruvector_search_latency_seconds",
"Search latency in seconds",
&["collection"],
vec![0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0]
).unwrap();
// Insert metrics
pub static ref INSERT_REQUESTS_TOTAL: CounterVec = register_counter_vec!(
Opts::new("ruvector_insert_requests_total", "Total insert requests"),
&["collection", "status"]
).unwrap();
pub static ref INSERT_LATENCY_SECONDS: HistogramVec = register_histogram_vec!(
"ruvector_insert_latency_seconds",
"Insert latency in seconds",
&["collection"],
vec![0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0]
).unwrap();
pub static ref VECTORS_INSERTED_TOTAL: CounterVec = register_counter_vec!(
Opts::new("ruvector_vectors_inserted_total", "Total vectors inserted"),
&["collection"]
).unwrap();
// Delete metrics
pub static ref DELETE_REQUESTS_TOTAL: CounterVec = register_counter_vec!(
Opts::new("ruvector_delete_requests_total", "Total delete requests"),
&["collection", "status"]
).unwrap();
// Collection metrics
pub static ref VECTORS_TOTAL: GaugeVec = register_gauge_vec!(
Opts::new("ruvector_vectors_total", "Total vectors stored"),
&["collection"]
).unwrap();
pub static ref COLLECTIONS_TOTAL: Gauge = register_gauge!(
Opts::new("ruvector_collections_total", "Total number of collections")
).unwrap();
// System metrics
pub static ref MEMORY_USAGE_BYTES: Gauge = register_gauge!(
Opts::new("ruvector_memory_usage_bytes", "Memory usage in bytes")
).unwrap();
pub static ref UPTIME_SECONDS: Counter = register_counter!(
Opts::new("ruvector_uptime_seconds", "Uptime in seconds")
).unwrap();
}
/// Gather all metrics in Prometheus text format
pub fn gather_metrics() -> String {
let encoder = TextEncoder::new();
let metric_families = prometheus::gather();
let mut buffer = Vec::new();
encoder.encode(&metric_families, &mut buffer).unwrap();
String::from_utf8(buffer).unwrap()
}
#[cfg(test)]
mod tests {
use super::*;
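#[test]
fn test_gather_metrics() {
// Metrics register lazily on first access, so touch one before gathering;
// otherwise the output can be empty when this test runs on its own.
SEARCH_REQUESTS_TOTAL
.with_label_values(&["test", "success"])
.inc();
let metrics = gather_metrics();
assert!(metrics.contains("ruvector_search_requests_total"));
}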
#[test]
fn test_record_search() {
SEARCH_REQUESTS_TOTAL
.with_label_values(&["test", "success"])
.inc();
SEARCH_LATENCY_SECONDS
.with_label_values(&["test"])
.observe(0.001);
}
}


@@ -0,0 +1,175 @@
use crate::{
COLLECTIONS_TOTAL, DELETE_REQUESTS_TOTAL, INSERT_LATENCY_SECONDS, INSERT_REQUESTS_TOTAL,
MEMORY_USAGE_BYTES, SEARCH_LATENCY_SECONDS, SEARCH_REQUESTS_TOTAL, VECTORS_INSERTED_TOTAL,
VECTORS_TOTAL,
};
/// Helper struct for recording metrics
pub struct MetricsRecorder;
impl MetricsRecorder {
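/// Record a search operation
///
/// Every call increments the request counter under `status="success"` or
/// `status="error"`; latency is observed only for successful searches.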
///
/// # Arguments
/// * `collection` - The collection name
/// * `latency_secs` - The latency in seconds
/// * `success` - Whether the operation succeeded
pub fn record_search(collection: &str, latency_secs: f64, success: bool) {
let status = if success { "success" } else { "error" };
SEARCH_REQUESTS_TOTAL
.with_label_values(&[collection, status])
.inc();
if success {
SEARCH_LATENCY_SECONDS
.with_label_values(&[collection])
.observe(latency_secs);
}
}
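/// Record an insert operation
///
/// Every call increments the request counter under `status="success"` or
/// `status="error"`; latency and the inserted-vector count are recorded only
/// for successful inserts.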
///
/// # Arguments
/// * `collection` - The collection name
/// * `latency_secs` - The latency in seconds
/// * `count` - The number of vectors inserted
/// * `success` - Whether the operation succeeded
pub fn record_insert(collection: &str, latency_secs: f64, count: usize, success: bool) {
let status = if success { "success" } else { "error" };
INSERT_REQUESTS_TOTAL
.with_label_values(&[collection, status])
.inc();
if success {
INSERT_LATENCY_SECONDS
.with_label_values(&[collection])
.observe(latency_secs);
VECTORS_INSERTED_TOTAL
.with_label_values(&[collection])
.inc_by(count as f64);
}
}
/// Record a delete operation
///
/// # Arguments
/// * `collection` - The collection name
/// * `success` - Whether the operation succeeded
pub fn record_delete(collection: &str, success: bool) {
let status = if success { "success" } else { "error" };
DELETE_REQUESTS_TOTAL
.with_label_values(&[collection, status])
.inc();
}
/// Update the total vector count for a collection
///
/// # Arguments
/// * `collection` - The collection name
/// * `count` - The current number of vectors
pub fn set_vectors_count(collection: &str, count: usize) {
VECTORS_TOTAL
.with_label_values(&[collection])
.set(count as f64);
}
/// Update the total number of collections
///
/// # Arguments
/// * `count` - The current number of collections
pub fn set_collections_count(count: usize) {
COLLECTIONS_TOTAL.set(count as f64);
}
/// Update memory usage
///
/// # Arguments
/// * `bytes` - The current memory usage in bytes
pub fn set_memory_usage(bytes: usize) {
MEMORY_USAGE_BYTES.set(bytes as f64);
}
/// Record a batch of operations
///
/// All operations are counted under `status="success"`, and no latencies
/// are observed.
///
/// # Arguments
/// * `collection` - The collection name
/// * `searches` - Number of search operations
/// * `inserts` - Number of insert operations
/// * `deletes` - Number of delete operations
pub fn record_batch(collection: &str, searches: usize, inserts: usize, deletes: usize) {
if searches > 0 {
SEARCH_REQUESTS_TOTAL
.with_label_values(&[collection, "success"])
.inc_by(searches as f64);
}
if inserts > 0 {
INSERT_REQUESTS_TOTAL
.with_label_values(&[collection, "success"])
.inc_by(inserts as f64);
}
if deletes > 0 {
DELETE_REQUESTS_TOTAL
.with_label_values(&[collection, "success"])
.inc_by(deletes as f64);
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_record_search_success() {
MetricsRecorder::record_search("test", 0.001, true);
// Metrics are recorded, no panic
}
#[test]
fn test_record_search_failure() {
MetricsRecorder::record_search("test", 0.001, false);
// Metrics are recorded, no panic
}
#[test]
fn test_record_insert() {
MetricsRecorder::record_insert("test", 0.002, 10, true);
// Metrics are recorded, no panic
}
#[test]
fn test_record_delete() {
MetricsRecorder::record_delete("test", true);
// Metrics are recorded, no panic
}
#[test]
fn test_set_vectors_count() {
MetricsRecorder::set_vectors_count("test", 1000);
// Metrics are recorded, no panic
}
#[test]
fn test_set_collections_count() {
MetricsRecorder::set_collections_count(5);
// Metrics are recorded, no panic
}
#[test]
fn test_set_memory_usage() {
MetricsRecorder::set_memory_usage(1024 * 1024);
// Metrics are recorded, no panic
}
#[test]
fn test_record_batch() {
MetricsRecorder::record_batch("test", 100, 50, 10);
// Metrics are recorded, no panic
}
}