dearsky/wifi-densepose

ruv cd5943df23 Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

2026-02-28 14:39:40 -05:00

ADR-013: HuggingFace Model Publishing Strategy

Status

Accepted - 2026-01-20

Context

RuvLTRA models need to be distributed to users efficiently. HuggingFace Hub is the de facto standard for model hosting and offers:

High-speed CDN for global distribution
Git-based versioning
Model cards for documentation
API for programmatic access
Integration with major ML frameworks

Decision

1. Repository Structure

All models are consolidated under a single HuggingFace repository:

Repository	Purpose	Models
`ruv/ruvltra`	All RuvLTRA models	Claude Code, Small, Medium, Large

URL: https://huggingface.co/ruv/ruvltra

2. File Naming Convention

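ruvltra-[{variant}-]{size}-{quant}.gguf

The {variant} segment is optional and is used only for specialized models such as claude-code.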

Examples:

ruvltra-0.5b-q4_k_m.gguf
ruvltra-3b-q8_0.gguf
ruvltra-claude-code-0.5b-q4_k_m.gguf
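
The naming convention can be captured in a small helper. This is a minimal sketch, not code from the RuvLTRA crates: the function name `gguf_file_name` is hypothetical, and treating the variant as an optional segment is an assumption drawn from the third example above. Lowercasing the size and quantization is likewise an assumption, made so that `Q4_K_M` and `q4_k_m` yield the same file name.

```rust
/// Builds a GGUF file name of the form `ruvltra-[{variant}-]{size}-{quant}.gguf`.
///
/// `variant` is optional (for example "claude-code"). All segments are
/// lowercased so that "Q4_K_M" and "q4_k_m" produce the same name.
/// Hypothetical helper for illustration only.
fn gguf_file_name(variant: Option<&str>, size: &str, quant: &str) -> String {
    let mut name = String::from("ruvltra");
    if let Some(v) = variant {
        name.push('-');
        name.push_str(&v.to_lowercase());
    }
    name.push('-');
    name.push_str(&size.to_lowercase());
    name.push('-');
    name.push_str(&quant.to_lowercase());
    name.push_str(".gguf");
    name
}
```

Calling `gguf_file_name(Some("claude-code"), "0.5b", "q4_k_m")` reproduces the third example in the list above.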

3. Authentication

Support multiple environment variable names for the HuggingFace token:

HF_TOKEN (primary)
HUGGING_FACE_HUB_TOKEN (legacy)
HUGGINGFACE_API_KEY (common alternative)
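
One way to implement the lookup is sketched below. The names `HF_TOKEN_VARS`, `resolve_hf_token`, and `hf_token_from_env` are hypothetical and may not match the real `get_hf_token` helper. The sketch assumes the variables are consulted in the order listed above and that blank values are skipped; this ADR specifies neither. The lookup function is passed in as a parameter so the precedence logic can be tested without touching the process environment.

```rust
use std::env;

/// Environment variable names in the order they are consulted.
/// Assumption: the listing order above is the precedence order.
const HF_TOKEN_VARS: [&str; 3] = [
    "HF_TOKEN",
    "HUGGING_FACE_HUB_TOKEN",
    "HUGGINGFACE_API_KEY",
];

/// Returns the first non-blank token found via `lookup`.
/// Skipping blank values is an assumption of this sketch.
fn resolve_hf_token<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    HF_TOKEN_VARS
        .iter()
        .copied()
        .filter_map(|name| lookup(name))
        .map(|value| value.trim().to_string())
        .find(|value| !value.is_empty())
}

/// Convenience wrapper that reads from the real process environment.
fn hf_token_from_env() -> Option<String> {
    resolve_hf_token(|name| env::var(name).ok())
}
```

With this shape, a caller that only has `HUGGINGFACE_API_KEY` set still gets a token, while `HF_TOKEN` wins whenever more than one variable is present.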

4. Upload Workflow

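// Upload a GGUF file to the consolidated repository using ModelUploader.
let token = get_hf_token().expect("no HuggingFace token found in the environment");
let uploader = ModelUploader::new(token);
uploader.upload(
    "./model.gguf",  // local file to publish
    "ruv/ruvltra",   // consolidated repository from section 1
    Some(metadata),  // model metadata
)?;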

5. Model Card Requirements

The repository's model card must include:

YAML frontmatter with tags, license, language
Model description and capabilities
Hardware requirements table
Usage examples (Rust, Python, CLI)
Benchmark results (when available)
License information
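
The frontmatter requirement lends itself to an automated check before publishing. The sketch below is illustrative only: `REQUIRED_KEYS` and `missing_frontmatter_keys` are hypothetical names, and the line-prefix check is a deliberate simplification rather than real YAML parsing. It assumes the frontmatter is the block between a leading `---` line and the next `---` line, with required keys at the top level.

```rust
/// Frontmatter keys required by this ADR.
const REQUIRED_KEYS: [&str; 3] = ["tags", "license", "language"];

/// Returns the required keys that are absent from the card's YAML frontmatter.
///
/// Simplification: keys are detected by line prefix ("key:"), which is
/// enough for top-level keys but is not a full YAML parse.
fn missing_frontmatter_keys(card: &str) -> Vec<&'static str> {
    let mut lines = card.lines();

    // A card without a leading "---" line has no frontmatter at all.
    if lines.next().map(str::trim) != Some("---") {
        return REQUIRED_KEYS.to_vec();
    }

    // Collect everything up to (but excluding) the closing "---" line.
    let frontmatter: Vec<&str> = lines.take_while(|l| l.trim() != "---").collect();

    REQUIRED_KEYS
        .iter()
        .copied()
        .filter(|key| {
            let prefix = format!("{}:", key);
            !frontmatter.iter().any(|l| l.starts_with(prefix.as_str()))
        })
        .collect()
}
```

An empty result means all three keys are present. A CI job could fail the publish step when the result is non-empty.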

6. Versioning Strategy

Use HuggingFace's built-in Git versioning
Tag major releases (e.g., v1.0.0)
Maintain main branch for latest stable
Use branches for experimental variants
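
Because tags and branches are both Git revisions, a client can pin a download to a release or follow the latest stable by changing only the revision segment of the Hub's `resolve` URL. The helper below is a minimal sketch: `resolve_url` is a hypothetical name, and the sketch performs no URL escaping, so it assumes revision and file names without special characters.

```rust
/// Builds a direct download URL for one file at a given revision.
///
/// `revision` may be a branch ("main"), a tag ("v1.0.0"), or a commit hash.
/// Hypothetical helper; no URL escaping is performed.
fn resolve_url(repo: &str, revision: &str, file: &str) -> String {
    format!("https://huggingface.co/{}/resolve/{}/{}", repo, revision, file)
}
```

For example, `resolve_url("ruv/ruvltra", "v1.0.0", "ruvltra-0.5b-q4_k_m.gguf")` pins the download to the `v1.0.0` tag, while passing `"main"` tracks the latest stable upload.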

Consequences

Positive

Accessibility: Models available via standard HuggingFace APIs
Discoverability: Indexed in HuggingFace model search
Versioning: Full Git history for model evolution
CDN: Fast global downloads via HuggingFace's CDN
Documentation: Model cards provide user guidance

Negative

Storage Costs: Large models require HuggingFace Pro for private repos
Dependency: Reliance on external service availability
Sync Complexity: Must keep registry.rs in sync with HuggingFace

Mitigations

Use public repos (free unlimited storage)
Implement fallback to direct URL downloads
Automate registry updates via CI/CD
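
The fallback mitigation can be sketched as an ordered list of sources tried in turn. This is an illustration under stated assumptions, not the registry's actual download path: `fetch_with_fallback` is a hypothetical name, and the `fetch` callback stands in for whatever HTTP client the crate uses, so the sketch stays independent of any networking library.

```rust
/// Tries each source in order and returns the first successful payload.
///
/// `fetch` is injected by the caller (for example, a thin wrapper around an
/// HTTP client). If every source fails, the collected errors are returned.
fn fetch_with_fallback<F>(sources: &[&str], fetch: F) -> Result<Vec<u8>, String>
where
    F: Fn(&str) -> Result<Vec<u8>, String>,
{
    let mut errors = Vec::new();
    for &source in sources {
        match fetch(source) {
            Ok(bytes) => return Ok(bytes),
            // Remember why this source failed, then move on to the next one.
            Err(e) => errors.push(format!("{}: {}", source, e)),
        }
    }
    Err(format!("all sources failed: [{}]", errors.join("; ")))
}
```

Listing the HuggingFace URL first and a direct mirror URL second keeps the Hub as the default path while bounding the impact of an outage.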

Implementation

Phase 1: Initial Publishing (Complete)

Create consolidated ruv/ruvltra repository
Upload Claude Code, Small, and Medium models
Upload Q4_K_M quantized models
Add comprehensive model card with badges, tutorials, architecture

Phase 2: Enhanced Distribution

Add Q8 quantization variants
Add FP16 variants for fine-tuning
Implement automated CI/CD publishing
Add SONA weight exports

Phase 3: Ecosystem Integration

Add to llama.cpp model zoo
Create an Ollama Modelfile
Publish to alternative registries (ModelScope)

References

HuggingFace Hub Documentation: https://huggingface.co/docs/hub
GGUF Format Specification: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
RuvLTRA Registry: crates/ruvllm/src/hub/registry.rs
Related Issue: #121
