ADR-013: HuggingFace Model Publishing Strategy

Status

Accepted - 2026-01-20

Context

RuvLTRA models need to be distributed to users efficiently. HuggingFace Hub is the de facto standard for model hosting and offers:

  • High-speed CDN for global distribution
  • Git-based versioning
  • Model cards for documentation
  • API for programmatic access
  • Integration with major ML frameworks

Decision

1. Repository Structure

All models are consolidated under a single HuggingFace repository:

| Repository | Purpose | Models |
|------------|---------|--------|
| ruv/ruvltra | All RuvLTRA models | Claude Code, Small, Medium, Large |

URL: https://huggingface.co/ruv/ruvltra

2. File Naming Convention

ruvltra-{size}-{quant}.gguf

Examples:

  • ruvltra-0.5b-q4_k_m.gguf
  • ruvltra-3b-q8_0.gguf
  • ruvltra-claude-code-0.5b-q4_k_m.gguf
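The convention can be sketched as a small helper. This is an illustrative sketch only: the function name `gguf_filename` and the optional `variant` segment are assumptions, inferred from the `claude-code` example, and are not part of the documented pattern.

```rust
/// Builds a GGUF file name following `ruvltra-{size}-{quant}.gguf`.
/// `variant` is a hypothetical optional segment (e.g. "claude-code"),
/// inferred from the third example rather than from the stated pattern.
fn gguf_filename(variant: Option<&str>, size: &str, quant: &str) -> String {
    // Lowercase everything so the result matches the examples
    // (q4_k_m rather than Q4_K_M).
    let size = size.to_lowercase();
    let quant = quant.to_lowercase();
    match variant {
        Some(v) => format!("ruvltra-{}-{}-{}.gguf", v.to_lowercase(), size, quant),
        None => format!("ruvltra-{}-{}.gguf", size, quant),
    }
}
```

Calling `gguf_filename(None, "3b", "q8_0")` reproduces the second example above.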

3. Authentication

Support multiple environment variable names for the HuggingFace token:

  • HF_TOKEN (primary)
  • HUGGING_FACE_HUB_TOKEN (legacy)
  • HUGGINGFACE_API_KEY (common alternative)
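One way to resolve the token is to check each name in turn and take the first non-empty value. The sketch below assumes the list order above is the priority order (only `HF_TOKEN` is explicitly marked primary). `resolve_hf_token` and `hf_token_from_env` are hypothetical names, not the project's actual helpers.

```rust
/// Candidate variable names. The order is assumed to follow the list above.
const HF_TOKEN_VARS: [&str; 3] = [
    "HF_TOKEN",
    "HUGGING_FACE_HUB_TOKEN",
    "HUGGINGFACE_API_KEY",
];

/// Returns the first non-empty token found.
/// `lookup` abstracts the environment so the logic can be tested with a map.
fn resolve_hf_token<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    HF_TOKEN_VARS
        .iter()
        .copied()
        .filter_map(|name| lookup(name))
        .map(|value| value.trim().to_string())
        // Skip variables that are set but blank.
        .find(|value| !value.is_empty())
}

/// Reads the token from the real process environment.
fn hf_token_from_env() -> Option<String> {
    resolve_hf_token(|name| std::env::var(name).ok())
}
```

Passing the lookup as a closure keeps the priority logic free of global state, so it can be exercised without mutating process environment variables.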

4. Upload Workflow

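// Upload a GGUF file with ModelUploader.
// `metadata` is assumed to be built earlier; the enclosing function must return a Result.
let token = get_hf_token().expect("no HuggingFace token found in the environment");
let uploader = ModelUploader::new(token);
uploader.upload(
    "./model.gguf",  // local file to publish
    "ruv/ruvltra",   // consolidated repository (see section 1)
    Some(metadata),
)?;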

5. Model Card Requirements

The model card of each repository must include:

  • YAML frontmatter with tags, license, language
  • Model description and capabilities
  • Hardware requirements table
  • Usage examples (Rust, Python, CLI)
  • Benchmark results (when available)
  • License information
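The frontmatter requirement could be generated rather than written by hand. The following is a minimal sketch that covers only the three fields named above. The function name `model_card_frontmatter` is hypothetical, all values are supplied by the caller, and no YAML escaping is performed.

```rust
/// Renders YAML frontmatter for a model card from license, language, and tags.
/// Values are caller-supplied placeholders; nothing here reflects the
/// contents of the actual RuvLTRA model card.
fn model_card_frontmatter(license: &str, languages: &[&str], tags: &[&str]) -> String {
    let mut out = String::from("---\n");
    out.push_str(&format!("license: {}\n", license));
    out.push_str("language:\n");
    for lang in languages {
        out.push_str(&format!("  - {}\n", lang));
    }
    out.push_str("tags:\n");
    for tag in tags {
        out.push_str(&format!("  - {}\n", tag));
    }
    // Closing delimiter; the markdown body of the card follows it.
    out.push_str("---\n");
    out
}
```

The returned string is meant to be prepended to the markdown body of the card (description, hardware table, usage examples, benchmarks).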

6. Versioning Strategy

  • Use HuggingFace's built-in Git versioning
  • Tag major releases (e.g., v1.0.0)
  • Maintain main branch for latest stable
  • Use branches for experimental variants
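Because branches and tags are ordinary Git revisions, a client can pin a download to any of them through the Hub's `resolve` URL scheme. The sketch below uses a hypothetical helper name, `resolve_url`, and the `v1.0.0` tag is only the example from the list above.

```rust
/// Builds a Hub download URL of the form
/// `https://huggingface.co/{repo}/resolve/{revision}/{filename}`.
/// `revision` may be a branch ("main"), a tag ("v1.0.0"), or a commit hash.
fn resolve_url(repo: &str, revision: &str, filename: &str) -> String {
    format!(
        "https://huggingface.co/{}/resolve/{}/{}",
        repo, revision, filename
    )
}
```

Using `"main"` as the revision tracks the latest stable upload, while a tag such as `"v1.0.0"` keeps a client on a fixed release.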

Consequences

Positive

  • Accessibility: Models available via standard HuggingFace APIs
  • Discoverability: Indexed in HuggingFace model search
  • Versioning: Full Git history for model evolution
  • CDN: Fast global downloads via HuggingFace's CDN
  • Documentation: Model cards provide user guidance

Negative

  • Storage Costs: Large models require HuggingFace Pro for private repos
  • Dependency: Reliance on external service availability
  • Sync Complexity: Must keep registry.rs in sync with HuggingFace

Mitigations

  • Use public repos, which do not require a paid plan
  • Implement fallback to direct URL downloads
  • Automate registry updates via CI/CD
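The fallback mitigation can be sketched as trying an ordered list of sources until one succeeds. This is an illustrative sketch: `fetch_with_fallback` is a hypothetical name, and the `fetch` closure stands in for whatever download call the project actually uses.

```rust
/// Tries each source URL in order and returns the first successful result.
/// `fetch` is a hypothetical stand-in for the real download call.
fn fetch_with_fallback<F>(sources: &[&str], fetch: F) -> Result<Vec<u8>, String>
where
    F: Fn(&str) -> Result<Vec<u8>, String>,
{
    let mut errors = Vec::new();
    for &url in sources {
        match fetch(url) {
            Ok(bytes) => return Ok(bytes),
            // Remember why this source failed, then try the next one.
            Err(e) => errors.push(format!("{}: {}", url, e)),
        }
    }
    Err(format!("all sources failed: [{}]", errors.join("; ")))
}
```

Listing the HuggingFace URL first and a direct mirror second means the Hub stays the primary source, and the mirror is used only when the Hub request fails.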

Implementation

Phase 1: Initial Publishing (Complete)

  • Create consolidated ruv/ruvltra repository
  • Upload Claude Code, Small, and Medium models
  • Upload Q4_K_M quantized models
  • Add comprehensive model card with badges, tutorials, architecture

Phase 2: Enhanced Distribution

  • Add Q8 quantization variants
  • Add FP16 variants for fine-tuning
  • Implement automated CI/CD publishing
  • Add SONA weight exports

Phase 3: Ecosystem Integration

  • Add to llama.cpp model zoo
  • Create Ollama modelfile
  • Publish to alternative registries (ModelScope)

References