Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

2026-02-28 14:39:40 -05:00
parent 7885bf6278 d803bfe2b1
commit cd5943df23
7854 changed files with 3522914 additions and 0 deletions
--- a/vendor/ruvector/docs/adr/ADR-013-huggingface-publishing.md
+++ b/vendor/ruvector/docs/adr/ADR-013-huggingface-publishing.md
@@ -0,0 +1,117 @@
+# ADR-013: HuggingFace Model Publishing Strategy
+
+## Status
+**Accepted** - 2026-01-20
+
+## Context
+
+RuvLTRA models need to be distributed to users efficiently. HuggingFace Hub is the industry standard for model hosting with:
+- High-speed CDN for global distribution
+- Git-based versioning
+- Model cards for documentation
+- API for programmatic access
+- Integration with major ML frameworks
+
+## Decision
+
+### 1. Repository Structure
+
+All models consolidated under a single HuggingFace repository:
+
+| Repository | Purpose | Models |
+|------------|---------|--------|
+| **`ruv/ruvltra`** | All RuvLTRA models | Claude Code, Small, Medium, Large |
+
+**URL**: https://huggingface.co/ruv/ruvltra
+
+### 2. File Naming Convention
+
+```
+ruvltra-{size}-{quant}.gguf
+```
+
+Examples (the Claude Code model adds a `claude-code` variant segment before the size):
+- `ruvltra-0.5b-q4_k_m.gguf`
+- `ruvltra-3b-q8_0.gguf`
+- `ruvltra-claude-code-0.5b-q4_k_m.gguf`
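+
+The convention can be captured in a small helper. This is a minimal sketch, not code from the RuvLTRA crates: the function name `gguf_file_name` and the optional `variant` parameter are assumptions made for illustration, and lowercasing the inputs is a design choice inferred from the examples above.
+
+```rust
+/// Builds a file name following the `ruvltra-{size}-{quant}.gguf` convention.
+/// `variant` is a hypothetical optional segment (e.g. "claude-code") that is
+/// placed before the size, mirroring the third example above.
+fn gguf_file_name(variant: Option<&str>, size: &str, quant: &str) -> String {
+    // Lowercase everything so "Q4_K_M" and "q4_k_m" yield the same name.
+    let size = size.to_lowercase();
+    let quant = quant.to_lowercase();
+    match variant {
+        Some(v) => format!("ruvltra-{}-{}-{}.gguf", v.to_lowercase(), size, quant),
+        None => format!("ruvltra-{}-{}.gguf", size, quant),
+    }
+}
+```
+
+Calling `gguf_file_name(None, "3b", "Q8_0")` yields `ruvltra-3b-q8_0.gguf`, and passing `Some("claude-code")` with `"0.5b"` and `"q4_k_m"` reproduces the third example.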
+
+### 3. Authentication
+
+Support multiple environment variable names for the HuggingFace token:
+- `HF_TOKEN` (primary)
+- `HUGGING_FACE_HUB_TOKEN` (legacy)
+- `HUGGINGFACE_API_KEY` (common alternative)
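+
+One way to resolve the token is to try the variables in the order listed. The sketch below is illustrative only: `resolve_token_with` and `resolve_hf_token` are hypothetical names (not the project's `get_hf_token`), and both the precedence order and the skipping of empty values are assumptions rather than documented behavior.
+
+```rust
+use std::env;
+
+/// Variable names from the list above, in assumed priority order.
+const HF_TOKEN_VARS: [&str; 3] = ["HF_TOKEN", "HUGGING_FACE_HUB_TOKEN", "HUGGINGFACE_API_KEY"];
+
+/// Returns the first non-empty value that `lookup` yields for the known names.
+/// Taking the lookup as a parameter keeps the logic testable without
+/// mutating the process environment.
+fn resolve_token_with<F>(lookup: F) -> Option<String>
+where
+    F: Fn(&str) -> Option<String>,
+{
+    HF_TOKEN_VARS
+        .iter()
+        .filter_map(|name| lookup(*name))
+        .map(|value| value.trim().to_string())
+        .find(|value| !value.is_empty())
+}
+
+/// Resolves the token from the real process environment.
+fn resolve_hf_token() -> Option<String> {
+    resolve_token_with(|name| env::var(name).ok())
+}
+```
+
+With this ordering, a value in `HF_TOKEN` wins over the other two names when several are set.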
+
+### 4. Upload Workflow
+
+```rust
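+// Using ModelUploader; all models go to the consolidated repository.
+let token = get_hf_token().expect("set HF_TOKEN or one of its alternatives");
+let uploader = ModelUploader::new(token);
+uploader.upload(
+    "./model.gguf",  // local GGUF file
+    "ruv/ruvltra",   // target HuggingFace repository
+    Some(metadata),  // metadata attached to the upload
+)?;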
+```
+
+### 5. Model Card Requirements
+
+The repository's model card must include:
+- YAML frontmatter with tags, license, language
+- Model description and capabilities
+- Hardware requirements table
+- Usage examples (Rust, Python, CLI)
+- Benchmark results (when available)
+- License information
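+
+The frontmatter requirement can be illustrated with a small renderer. This is a sketch under stated assumptions: `model_card_frontmatter` is a hypothetical helper, the values passed to it are placeholders rather than the project's actual license or tags, and no YAML escaping is performed, so it only suits simple values.
+
+```rust
+/// Renders model card YAML frontmatter with the `license`, `language`,
+/// and `tags` metadata keys. Values are written as-is, without escaping.
+fn model_card_frontmatter(license: &str, languages: &[&str], tags: &[&str]) -> String {
+    let mut out = String::from("---\n");
+    out.push_str(&format!("license: {}\n", license));
+    out.push_str("language:\n");
+    for lang in languages {
+        out.push_str(&format!("  - {}\n", lang));
+    }
+    out.push_str("tags:\n");
+    for tag in tags {
+        out.push_str(&format!("  - {}\n", tag));
+    }
+    out.push_str("---\n");
+    out
+}
+```
+
+The returned block is meant to sit at the very top of the repository's `README.md`, followed by the remaining sections listed above.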
+
+### 6. Versioning Strategy
+
+- Use HuggingFace's built-in Git versioning
+- Tag major releases (e.g., `v1.0.0`)
+- Maintain `main` branch for latest stable
+- Use branches for experimental variants
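+
+Because every branch and tag is a Git revision, a consumer can pin a download to a specific one through the Hub's `resolve` URL scheme. The following is a minimal sketch: the `Revision` type and `resolve_url` function are hypothetical, the tag `v1.0.0` is only the example named above and may not exist in the repository, and path segments are not URL-encoded.
+
+```rust
+/// A revision to download from; names are illustrative, not a project API.
+enum Revision<'a> {
+    /// Latest stable, i.e. the `main` branch.
+    Stable,
+    /// A tagged release such as "v1.0.0".
+    Tag(&'a str),
+    /// An experimental branch.
+    Branch(&'a str),
+}
+
+impl<'a> Revision<'a> {
+    fn as_str(&self) -> &'a str {
+        match *self {
+            Revision::Stable => "main",
+            Revision::Tag(name) | Revision::Branch(name) => name,
+        }
+    }
+}
+
+/// Builds a direct download URL for `file_name` in `repo_id` at `revision`.
+fn resolve_url(repo_id: &str, revision: &Revision, file_name: &str) -> String {
+    format!(
+        "https://huggingface.co/{}/resolve/{}/{}",
+        repo_id,
+        revision.as_str(),
+        file_name
+    )
+}
+```
+
+Pinning to a tag gives reproducible downloads, while `Revision::Stable` always follows whatever `main` currently points to.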
+
+## Consequences
+
+### Positive
+- **Accessibility**: Models available via standard HuggingFace APIs
+- **Discoverability**: Indexed in HuggingFace model search
+- **Versioning**: Full Git history for model evolution
+- **CDN**: Fast global downloads via HuggingFace's CDN
+- **Documentation**: Model cards provide user guidance
+
+### Negative
+- **Storage Costs**: Large models require HuggingFace Pro for private repos
+- **Dependency**: Reliance on external service availability
+- **Sync Complexity**: Must keep `registry.rs` in sync with the files published on HuggingFace
+
+### Mitigations
+- Use public repos (free unlimited storage)
+- Implement fallback to direct URL downloads
+- Automate registry updates via CI/CD
+
+## Implementation
+
+### Phase 1: Initial Publishing (Complete)
+- [x] Create consolidated `ruv/ruvltra` repository
+- [x] Upload Claude Code, Small, and Medium models
+- [x] Upload Q4_K_M quantized models
+- [x] Add comprehensive model card with badges, tutorials, architecture
+
+### Phase 2: Enhanced Distribution
+- [ ] Add Q8 quantization variants
+- [ ] Add FP16 variants for fine-tuning
+- [ ] Implement automated CI/CD publishing
+- [ ] Add SONA weight exports
+
+### Phase 3: Ecosystem Integration
+- [ ] Add to llama.cpp model zoo
+- [ ] Create Ollama modelfile
+- [ ] Publish to alternative registries (ModelScope)
+
+## References
+
+- HuggingFace Hub Documentation: https://huggingface.co/docs/hub
+- GGUF Format Specification: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
+- RuvLTRA Registry: `crates/ruvllm/src/hub/registry.rs`
+- Related Issue: #121