ruv d803bfe2b1 Squashed 'vendor/ruvector/' content from commit b64c2172

git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900

2026-02-28 14:39:40 -05:00

Tiny Dancer Admin API Documentation

Overview

The Tiny Dancer Admin API provides a production-ready REST API for monitoring, health checks, and administration of the AI routing system. It's designed to integrate seamlessly with Kubernetes, Prometheus, and other cloud-native tools.

Features

Health Checks: Kubernetes-compatible liveness and readiness probes
Metrics Export: Prometheus-compatible metrics endpoint
Hot Reloading: Update models without downtime
Circuit Breaker Management: Monitor and control circuit breaker state
Configuration Management: View and update router configuration
Optional Authentication: Bearer token authentication for admin endpoints
CORS Support: Configurable CORS for web applications

Quick Start

Running the Server

# With admin API feature enabled
cargo run --example admin-server --features admin-api

Basic Configuration

use ruvector_tiny_dancer_core::api::{AdminServer, AdminServerConfig};
use ruvector_tiny_dancer_core::router::Router;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let router = Router::default()?;

    let config = AdminServerConfig {
        bind_address: "0.0.0.0".to_string(),
        port: 8080,
        auth_token: Some("your-secret-token".to_string()),
        enable_cors: true,
    };

    let server = AdminServer::new(Arc::new(router), config);
    server.serve().await?;
    Ok(())
}

API Endpoints

Health Checks

`GET /health`

Basic liveness probe. It returns 200 OK whenever the service process is running and performs no dependency checks.

Response:

{
  "status": "healthy",
  "version": "0.1.0",
  "uptime_seconds": 3600
}

Use Case: Kubernetes liveness probe

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 10

`GET /health/ready`

Readiness probe that checks if the service can accept traffic.

Checks:

Circuit breaker state
Model loaded status

Response (Ready):

{
  "ready": true,
  "circuit_breaker": "closed",
  "model_loaded": true,
  "version": "0.1.0",
  "uptime_seconds": 3600
}

Response (Not Ready):

{
  "ready": false,
  "circuit_breaker": "open",
  "model_loaded": true,
  "version": "0.1.0",
  "uptime_seconds": 3600
}

Status Codes:

200 OK: Service is ready
503 Service Unavailable: Service is not ready

Use Case: Kubernetes readiness probe

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
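
The readiness rule described above can be sketched as a small pure function. This is an illustrative sketch rather than the crate's implementation: `readiness_status` is a hypothetical name, and it assumes the service counts as ready only when the circuit breaker reports `closed` and the model is loaded.

```rust
/// Hypothetical helper: maps the two readiness checks to an HTTP status code.
/// Assumption: any breaker state other than "closed" counts as not ready.
fn readiness_status(circuit_breaker: &str, model_loaded: bool) -> u16 {
    if circuit_breaker == "closed" && model_loaded {
        200 // OK: the probe passes and the instance may receive traffic
    } else {
        503 // Service Unavailable: the probe fails
    }
}

fn main() {
    // The two example responses from this section.
    println!("ready case: {}", readiness_status("closed", true));
    println!("not ready case: {}", readiness_status("open", true));
}
```

Keeping the decision in a pure function like this makes the 200/503 mapping easy to unit test without starting a server.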

Metrics

`GET /metrics`

Exports metrics in Prometheus exposition format.

Response Format: text/plain; version=0.0.4

Metrics Exported:

# HELP tiny_dancer_requests_total Total number of routing requests
# TYPE tiny_dancer_requests_total counter
tiny_dancer_requests_total 12345

# HELP tiny_dancer_lightweight_routes_total Requests routed to lightweight model
# TYPE tiny_dancer_lightweight_routes_total counter
tiny_dancer_lightweight_routes_total 10000

# HELP tiny_dancer_powerful_routes_total Requests routed to powerful model
# TYPE tiny_dancer_powerful_routes_total counter
tiny_dancer_powerful_routes_total 2345

# HELP tiny_dancer_inference_time_microseconds Average inference time
# TYPE tiny_dancer_inference_time_microseconds gauge
tiny_dancer_inference_time_microseconds 450.5

# HELP tiny_dancer_latency_microseconds Latency percentiles
# TYPE tiny_dancer_latency_microseconds gauge
tiny_dancer_latency_microseconds{quantile="0.5"} 400
tiny_dancer_latency_microseconds{quantile="0.95"} 800
tiny_dancer_latency_microseconds{quantile="0.99"} 1200

# HELP tiny_dancer_errors_total Total number of errors
# TYPE tiny_dancer_errors_total counter
tiny_dancer_errors_total 5

# HELP tiny_dancer_circuit_breaker_trips_total Circuit breaker trip count
# TYPE tiny_dancer_circuit_breaker_trips_total counter
tiny_dancer_circuit_breaker_trips_total 2

# HELP tiny_dancer_uptime_seconds Service uptime
# TYPE tiny_dancer_uptime_seconds counter
tiny_dancer_uptime_seconds 3600

Use Case: Prometheus scraping

scrape_configs:
  - job_name: 'tiny-dancer'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/metrics'
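
To make the exposition format concrete, the following sketch renders one unlabelled metric as the three-line HELP/TYPE/sample group shown above. `render_metric` is a hypothetical helper written for illustration and is not part of the crate's API.

```rust
/// Hypothetical helper: renders one unlabelled metric in the Prometheus
/// text exposition format (HELP line, TYPE line, then the sample).
fn render_metric(name: &str, help: &str, kind: &str, value: u64) -> String {
    format!(
        "# HELP {0} {1}\n# TYPE {0} {2}\n{0} {3}\n",
        name, help, kind, value
    )
}

fn main() {
    // Reproduces the first metric group listed in this section.
    print!(
        "{}",
        render_metric(
            "tiny_dancer_requests_total",
            "Total number of routing requests",
            "counter",
            12345,
        )
    );
}
```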

Admin Endpoints

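Bearer token authentication is optional. When `auth_token` is set in `AdminServerConfig`, every admin endpoint requires a matching token in the Authorization header.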

`POST /admin/reload`

Hot reload the routing model from disk without restarting the service.

Headers:

Authorization: Bearer your-secret-token

Response:

{
  "success": true,
  "message": "Model reloaded successfully"
}

Status Codes:

200 OK: Model reloaded successfully
401 Unauthorized: Invalid or missing authentication token
500 Internal Server Error: Failed to reload model

Example:

curl -X POST http://localhost:8080/admin/reload \
  -H "Authorization: Bearer your-token-here"

`GET /admin/config`

Get the current router configuration.

Headers:

Authorization: Bearer your-secret-token

Response:

{
  "model_path": "./models/fastgrnn.safetensors",
  "confidence_threshold": 0.85,
  "max_uncertainty": 0.15,
  "enable_circuit_breaker": true,
  "circuit_breaker_threshold": 5,
  "enable_quantization": true,
  "database_path": null
}

Status Codes:

200 OK: Configuration retrieved
401 Unauthorized: Invalid or missing authentication token

Example:

curl http://localhost:8080/admin/config \
  -H "Authorization: Bearer your-token-here"

`PUT /admin/config`

Update the router configuration (runtime only, not persisted).

Headers:

Authorization: Bearer your-secret-token
Content-Type: application/json

Request Body:

{
  "confidence_threshold": 0.90,
  "max_uncertainty": 0.10,
  "circuit_breaker_threshold": 10
}

Response:

{
  "success": true,
  "message": "Configuration updated",
  "updated_fields": ["confidence_threshold", "max_uncertainty", "circuit_breaker_threshold"]
}

Status Codes:

200 OK: Configuration updated
401 Unauthorized: Invalid or missing authentication token
501 Not Implemented: Feature not yet implemented

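Note: This endpoint currently returns 501 Not Implemented because runtime configuration updates require Router API extensions. The request and 200 response shown above describe the intended behavior.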
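
Since the endpoint is not implemented yet, the following is only a sketch of what partial-update semantics could look like: fields absent from the request stay unchanged, and the names of the fields that were applied are reported back. `Tunables`, `TunablesUpdate`, and `apply_update` are hypothetical names and do not exist in the crate.

```rust
/// Hypothetical mirror of the tunable fields shown in the request body.
#[derive(Debug, Clone, PartialEq)]
struct Tunables {
    confidence_threshold: f64,
    max_uncertainty: f64,
    circuit_breaker_threshold: u32,
}

/// Hypothetical partial update: `None` means "leave this field unchanged".
#[derive(Debug, Default)]
struct TunablesUpdate {
    confidence_threshold: Option<f64>,
    max_uncertainty: Option<f64>,
    circuit_breaker_threshold: Option<u32>,
}

/// Applies only the fields present in `update` and returns their names.
fn apply_update(current: &mut Tunables, update: &TunablesUpdate) -> Vec<&'static str> {
    let mut updated = Vec::new();
    if let Some(v) = update.confidence_threshold {
        current.confidence_threshold = v;
        updated.push("confidence_threshold");
    }
    if let Some(v) = update.max_uncertainty {
        current.max_uncertainty = v;
        updated.push("max_uncertainty");
    }
    if let Some(v) = update.circuit_breaker_threshold {
        current.circuit_breaker_threshold = v;
        updated.push("circuit_breaker_threshold");
    }
    updated
}

fn main() {
    // Starting values taken from the GET /admin/config example.
    let mut config = Tunables {
        confidence_threshold: 0.85,
        max_uncertainty: 0.15,
        circuit_breaker_threshold: 5,
    };
    // Only one field is supplied; the other two stay as they were.
    let update = TunablesUpdate {
        confidence_threshold: Some(0.90),
        ..Default::default()
    };
    let updated = apply_update(&mut config, &update);
    println!("updated {:?}, config is now {:?}", updated, config);
}
```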

`GET /admin/circuit-breaker`

Get the current circuit breaker status.

Headers:

Authorization: Bearer your-secret-token

Response:

{
  "enabled": true,
  "state": "closed",
  "failure_count": 2,
  "success_count": 1234
}

Status Codes:

200 OK: Status retrieved
401 Unauthorized: Invalid or missing authentication token

Example:

curl http://localhost:8080/admin/circuit-breaker \
  -H "Authorization: Bearer your-token-here"
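
To show how the status fields might relate to `circuit_breaker_threshold`, here is a deliberately simplified sketch. It assumes the breaker opens once the failure count reaches the threshold and that successes do not reset the failure count. The real breaker may behave differently (for example, with a half-open state or time-based recovery), and `BreakerSketch` is a hypothetical type.

```rust
/// Hypothetical, simplified breaker: counts outcomes and derives a state.
struct BreakerSketch {
    threshold: u32,
    failure_count: u32,
    success_count: u64,
}

impl BreakerSketch {
    fn new(threshold: u32) -> Self {
        BreakerSketch { threshold, failure_count: 0, success_count: 0 }
    }

    /// Records the outcome of one routing call.
    fn record(&mut self, ok: bool) {
        if ok {
            self.success_count += 1;
        } else {
            self.failure_count += 1;
        }
    }

    /// Assumption: "open" once failures reach the threshold, else "closed".
    fn state(&self) -> &'static str {
        if self.failure_count >= self.threshold {
            "open"
        } else {
            "closed"
        }
    }
}

fn main() {
    let mut breaker = BreakerSketch::new(5);
    breaker.record(true);
    breaker.record(false);
    breaker.record(false);
    println!(
        "state={} failure_count={} success_count={}",
        breaker.state(),
        breaker.failure_count,
        breaker.success_count
    );
}
```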

`POST /admin/circuit-breaker/reset`

Reset the circuit breaker to closed state.

Headers:

Authorization: Bearer your-secret-token

Response:

{
  "success": true,
  "message": "Circuit breaker reset successfully"
}

Status Codes:

200 OK: Circuit breaker reset
401 Unauthorized: Invalid or missing authentication token
501 Not Implemented: Feature not yet implemented

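Note: This endpoint currently returns 501 Not Implemented because resetting the circuit breaker requires Router API extensions. The 200 response shown above describes the intended behavior.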

System Information

`GET /info`

Get comprehensive system information.

Response:

{
  "version": "0.1.0",
  "api_version": "v1",
  "uptime_seconds": 3600,
  "config": {
    "model_path": "./models/fastgrnn.safetensors",
    "confidence_threshold": 0.85,
    "max_uncertainty": 0.15,
    "enable_circuit_breaker": true,
    "circuit_breaker_threshold": 5,
    "enable_quantization": true,
    "database_path": null
  },
  "circuit_breaker_enabled": true,
  "metrics": {
    "total_requests": 12345,
    "lightweight_routes": 10000,
    "powerful_routes": 2345,
    "avg_inference_time_us": 450.5,
    "p50_latency_us": 400,
    "p95_latency_us": 800,
    "p99_latency_us": 1200,
    "error_count": 5,
    "circuit_breaker_trips": 2
  }
}

Example:

curl http://localhost:8080/info

Authentication

Bearer token authentication for the admin endpoints is optional. Set `auth_token` to `Some(...)` to enable it.

Configuration

let config = AdminServerConfig {
    bind_address: "0.0.0.0".to_string(),
    port: 8080,
    auth_token: Some("your-secret-token-here".to_string()),
    enable_cors: true,
};

Usage

Include the bearer token in the Authorization header:

curl -X POST -H "Authorization: Bearer your-secret-token-here" \
  http://localhost:8080/admin/reload
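
On the server side, a bearer check of this kind can be sketched as follows. This is not the crate's middleware: `is_authorized` and `constant_time_eq` are hypothetical helpers, and the sketch assumes that a missing configured token means authentication is disabled.

```rust
/// Compares two byte strings. A length mismatch returns immediately;
/// equal-length inputs are compared without stopping at the first
/// differing byte, as a best-effort guard against timing side channels.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Hypothetical check: `header` is the raw Authorization header value,
/// `expected` is the configured token (None = authentication disabled).
fn is_authorized(header: Option<&str>, expected: Option<&str>) -> bool {
    let expected = match expected {
        None => return true, // assumption: no token configured, no check
        Some(token) => token,
    };
    match header.and_then(|h| h.strip_prefix("Bearer ")) {
        Some(presented) => constant_time_eq(presented.as_bytes(), expected.as_bytes()),
        None => false, // header missing or not using the Bearer scheme
    }
}

fn main() {
    let configured = Some("your-secret-token-here");
    let header = Some("Bearer your-secret-token-here");
    println!("authorized: {}", is_authorized(header, configured));
}
```

A request that fails this check would correspond to the 401 Unauthorized responses listed for the admin endpoints.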

Security Best Practices

Always enable authentication in production
Use strong, random tokens (minimum 32 characters)
Rotate tokens regularly
Use HTTPS in production (configure via reverse proxy)
Limit admin API access to internal networks only
Monitor failed authentication attempts

Environment Variables

export TINY_DANCER_AUTH_TOKEN="your-secret-token-here"
export TINY_DANCER_BIND_ADDRESS="0.0.0.0"
export TINY_DANCER_PORT="8080"
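
This section does not say whether the server reads these variables itself, so the sketch below assumes the application wires them into its configuration. `EnvSettings`, `settings_from`, and `settings_from_env` are hypothetical names, and the fallback values (`0.0.0.0`, `8080`) are assumptions taken from the examples in this document.

```rust
use std::env;

/// Hypothetical holder for the three values; copy them into
/// `AdminServerConfig` when constructing the server.
#[derive(Debug, PartialEq)]
struct EnvSettings {
    bind_address: String,
    port: u16,
    auth_token: Option<String>,
}

/// Builds settings from optional raw strings so the logic is testable
/// without touching the process environment.
fn settings_from(
    bind: Option<String>,
    port: Option<String>,
    token: Option<String>,
) -> Result<EnvSettings, String> {
    let port = match port {
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|e| format!("invalid TINY_DANCER_PORT {:?}: {}", raw, e))?,
        None => 8080, // assumed default
    };
    Ok(EnvSettings {
        bind_address: bind.unwrap_or_else(|| "0.0.0.0".to_string()), // assumed default
        port,
        // Treat an empty token as "authentication disabled".
        auth_token: token.filter(|t| !t.is_empty()),
    })
}

/// Reads the three variables shown above from the environment.
fn settings_from_env() -> Result<EnvSettings, String> {
    settings_from(
        env::var("TINY_DANCER_BIND_ADDRESS").ok(),
        env::var("TINY_DANCER_PORT").ok(),
        env::var("TINY_DANCER_AUTH_TOKEN").ok(),
    )
}

fn main() {
    match settings_from_env() {
        // Avoid printing the token itself.
        Ok(s) => println!("bind {}:{} auth={}", s.bind_address, s.port, s.auth_token.is_some()),
        Err(e) => eprintln!("configuration error: {}", e),
    }
}
```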

Kubernetes Integration

Deployment Example

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tiny-dancer
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tiny-dancer
  template:
    metadata:
      labels:
        app: tiny-dancer
    spec:
      containers:
      - name: tiny-dancer
        image: tiny-dancer:latest
        ports:
        - containerPort: 8080
          name: admin-api
        env:
        - name: TINY_DANCER_AUTH_TOKEN
          valueFrom:
            secretKeyRef:
              name: tiny-dancer-secrets
              key: auth-token
        livenessProbe:
          httpGet:
            path: /health
            port: admin-api
          initialDelaySeconds: 3
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/ready
            port: admin-api
          initialDelaySeconds: 5
          periodSeconds: 5
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"

Service Example

apiVersion: v1
kind: Service
metadata:
  name: tiny-dancer
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  selector:
    app: tiny-dancer
  ports:
  - name: admin-api
    port: 8080
    targetPort: 8080
  type: ClusterIP

Monitoring with Grafana

Prometheus Query Examples

# Request rate
rate(tiny_dancer_requests_total[5m])

# Error rate
rate(tiny_dancer_errors_total[5m]) / rate(tiny_dancer_requests_total[5m])

# P95 latency
tiny_dancer_latency_microseconds{quantile="0.95"}

# Lightweight routing ratio
tiny_dancer_lightweight_routes_total / tiny_dancer_requests_total

# Circuit breaker trips over time
increase(tiny_dancer_circuit_breaker_trips_total[1h])
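
As a worked example of the ratio queries, the sketch below applies the same division to the sample counter values from the `/metrics` output earlier in this document. It works on lifetime totals, whereas the `rate(...)` queries above work on per-second rates over a window, so the two will only agree when traffic is steady. `ratio` is a hypothetical helper.

```rust
/// Hypothetical helper: numerator / denominator, or None when the
/// denominator is zero (for example, before any request has been served).
fn ratio(numerator: u64, denominator: u64) -> Option<f64> {
    if denominator == 0 {
        None
    } else {
        Some(numerator as f64 / denominator as f64)
    }
}

fn main() {
    // Sample values from the metrics listing: 12345 requests,
    // 10000 lightweight routes, 5 errors.
    if let Some(share) = ratio(10_000, 12_345) {
        println!("lightweight routing ratio: {:.3}", share);
    }
    if let Some(share) = ratio(5, 12_345) {
        println!("error ratio: {:.5}", share);
    }
}
```

With these sample values, roughly 81% of requests were routed to the lightweight model and about 0.04% of requests resulted in errors.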

Dashboard Panels

Request Rate: Line graph of requests per second
Error Rate: Gauge showing error percentage
Latency Percentiles: Multi-line graph (P50, P95, P99)
Routing Distribution: Pie chart (lightweight vs powerful)
Circuit Breaker Status: Single stat panel
Uptime: Single stat panel

Performance Considerations

Metrics Collection

The metrics endpoint is designed for high-performance scraping:

No locks during read: Uses atomic operations where possible
O(1) complexity: All metrics are pre-aggregated
Minimal allocations: Prometheus format generated on-the-fly
Scrape interval: Recommended 15-30 seconds
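
The lock-free, pre-aggregated approach described above can be illustrated with atomic counters. This is a minimal sketch under the assumption that each metric is a single `AtomicU64`. `RouteCounters` is a hypothetical type and not the crate's metrics struct.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical pre-aggregated counters, updated without a mutex.
#[derive(Default)]
struct RouteCounters {
    requests_total: AtomicU64,
    lightweight_routes_total: AtomicU64,
    powerful_routes_total: AtomicU64,
}

impl RouteCounters {
    /// Adds one batch of routing decisions to the running totals.
    fn record(&self, lightweight: u64, powerful: u64) {
        self.requests_total.fetch_add(lightweight + powerful, Ordering::Relaxed);
        self.lightweight_routes_total.fetch_add(lightweight, Ordering::Relaxed);
        self.powerful_routes_total.fetch_add(powerful, Ordering::Relaxed);
    }

    /// Reads the totals in O(1). The three loads are independent, so a
    /// scrape racing with `record` may see a slightly inconsistent triple.
    fn snapshot(&self) -> (u64, u64, u64) {
        (
            self.requests_total.load(Ordering::Relaxed),
            self.lightweight_routes_total.load(Ordering::Relaxed),
            self.powerful_routes_total.load(Ordering::Relaxed),
        )
    }
}

fn main() {
    let counters = Arc::new(RouteCounters::default());
    // Four writer threads, each recording 1000 lightweight routes.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&counters);
            thread::spawn(move || {
                for _ in 0..1000 {
                    c.record(1, 0);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("{:?}", counters.snapshot());
}
```

Relaxed ordering is sufficient here because each counter is only ever incremented and read. The trade-off is that a snapshot is not an atomic view across all three counters.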

Health Check Latency

Health check: ~10μs
Readiness check: ~50μs (includes circuit breaker check)

Memory Overhead

Admin server: ~2MB base memory
Per-connection overhead: ~50KB
Metrics storage: ~1KB

Error Handling

Common Error Responses

401 Unauthorized

{
  "error": "Missing or invalid Authorization header"
}

500 Internal Server Error

{
  "success": false,
  "message": "Failed to reload model: File not found"
}

503 Service Unavailable

{
  "ready": false,
  "circuit_breaker": "open",
  "model_loaded": true,
  "version": "0.1.0",
  "uptime_seconds": 3600
}

Production Checklist

Enable authentication for admin endpoints
Configure HTTPS via reverse proxy (nginx, Envoy, etc.)
Set up Prometheus scraping
Configure Grafana dashboards
Set up alerts for error rate and latency
Implement log aggregation
Configure network policies (K8s)
Set resource limits
Enable CORS only for trusted origins
Rotate authentication tokens regularly
Monitor circuit breaker trips
Set up automated model reload workflows

Troubleshooting

Server Won't Start

Symptom: Failed to bind to 0.0.0.0:8080: Address already in use

Solution: Change the port or stop the conflicting service:

lsof -i :8080
kill <PID>
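
The same check can be made from Rust before starting the server. The sketch below tries to bind the address and reports whether that succeeded. `can_bind` is a hypothetical helper, and a successful result only shows the address was free at that moment, since the listener is dropped again immediately.

```rust
use std::net::TcpListener;

/// Hypothetical helper: returns true if `addr` (for example
/// "0.0.0.0:8080") can be bound right now. The temporary listener is
/// dropped on return, which releases the port again.
fn can_bind(addr: &str) -> bool {
    TcpListener::bind(addr).is_ok()
}

fn main() {
    let addr = "0.0.0.0:8080";
    if can_bind(addr) {
        println!("{} is free", addr);
    } else {
        println!("{} is in use or not bindable; pick another port", addr);
    }
}
```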

Authentication Failing

Symptom: 401 Unauthorized

Solution: Check that the token matches exactly:

# Test with curl
curl -H "Authorization: Bearer your-token" http://localhost:8080/admin/config

Metrics Not Updating

Symptom: Metrics show zero values

Solution: Ensure you're recording metrics after each routing operation:

use ruvector_tiny_dancer_core::api::record_routing_metrics;

// After routing
record_routing_metrics(&metrics, inference_time_us, lightweight_count, powerful_count);

Future Enhancements

Runtime configuration persistence
Circuit breaker manual reset API
WebSocket support for real-time metrics streaming
OpenTelemetry integration
Custom metric labels
Rate limiting
Request/response logging middleware
Distributed tracing integration
GraphQL API alternative
Admin UI dashboard

Support

For issues, questions, or contributions, please visit:

GitHub: https://github.com/ruvnet/ruvector
Documentation: https://docs.ruvector.io

License

This API is part of the Tiny Dancer routing system and follows the same license terms.
