Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'
117
vendor/ruvector/docs/adr/ADR-013-huggingface-publishing.md
vendored
Normal file
@@ -0,0 +1,117 @@
# ADR-013: HuggingFace Model Publishing Strategy

## Status
**Accepted** - 2026-01-20

## Context

RuvLTRA models need to be distributed to users efficiently. HuggingFace Hub is the industry standard for model hosting with:
- High-speed CDN for global distribution
- Git-based versioning
- Model cards for documentation
- API for programmatic access
- Integration with major ML frameworks

## Decision

### 1. Repository Structure

All models are consolidated under a single HuggingFace repository:

| Repository | Purpose | Models |
|------------|---------|--------|
| **`ruv/ruvltra`** | All RuvLTRA models | Claude Code, Small, Medium, Large |

**URL**: https://huggingface.co/ruv/ruvltra

### 2. File Naming Convention

```
ruvltra-{size}-{quant}.gguf
```

Examples:
- `ruvltra-0.5b-q4_k_m.gguf`
- `ruvltra-3b-q8_0.gguf`
- `ruvltra-claude-code-0.5b-q4_k_m.gguf`

The third example carries an additional variant segment (`claude-code`) before the size.
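The naming convention can be sketched as a pair of helper functions. This is only an illustration: `gguf_file_name` and `parse_gguf_file_name` are hypothetical names, not part of the RuvLTRA codebase, and the optional variant segment is an assumption inferred from the `claude-code` example rather than something the pattern states.

```rust
/// Build a file name following `ruvltra-{size}-{quant}.gguf`.
/// `variant` is an assumed optional segment (e.g. "claude-code") placed before the size.
pub fn gguf_file_name(variant: Option<&str>, size: &str, quant: &str) -> String {
    // Lowercase everything so "Q4_K_M" becomes "q4_k_m", as in the examples.
    let size = size.to_lowercase();
    let quant = quant.to_lowercase();
    match variant {
        Some(v) => format!("ruvltra-{}-{}-{}.gguf", v.to_lowercase(), size, quant),
        None => format!("ruvltra-{}-{}.gguf", size, quant),
    }
}

/// Split a file name into (variant, size, quant); returns None if it does not match.
pub fn parse_gguf_file_name(name: &str) -> Option<(Option<String>, String, String)> {
    let stem = name.strip_prefix("ruvltra-")?.strip_suffix(".gguf")?;
    // The quantization is the last '-'-separated segment, the size the one before it.
    let (rest, quant) = stem.rsplit_once('-')?;
    if quant.is_empty() {
        return None;
    }
    match rest.rsplit_once('-') {
        // Anything left before the size is treated as the variant.
        Some((variant, size)) if !variant.is_empty() && !size.is_empty() => {
            Some((Some(variant.to_string()), size.to_string(), quant.to_string()))
        }
        Some(_) => None,
        None if !rest.is_empty() => Some((None, rest.to_string(), quant.to_string())),
        None => None,
    }
}
```

Parsing from the right keeps multi-word variants such as `claude-code` intact, because sizes and quantization labels in the examples never contain a hyphen.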

### 3. Authentication

Accept the HuggingFace token from any of the following environment variables:
- `HF_TOKEN` (primary)
- `HUGGING_FACE_HUB_TOKEN` (legacy)
- `HUGGINGFACE_API_KEY` (common alternative)
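A minimal sketch of the token lookup follows. It assumes the variables are checked in the order listed and that blank values are ignored; neither rule is stated in this ADR. The names `resolve_hf_token` and `resolve_hf_token_with` are hypothetical and are not the `get_hf_token()` helper used elsewhere in this document.

```rust
use std::env;

/// Environment variable names from the list above, in the order listed.
const HF_TOKEN_VARS: [&str; 3] = ["HF_TOKEN", "HUGGING_FACE_HUB_TOKEN", "HUGGINGFACE_API_KEY"];

/// Return the first non-blank token found through `lookup`.
/// Taking the lookup as a parameter lets the logic be tested without touching the process environment.
pub fn resolve_hf_token_with<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    HF_TOKEN_VARS
        .iter()
        .filter_map(|name| lookup(*name))
        .map(|value| value.trim().to_string())
        // Assumption: a variable set to an empty or whitespace-only string counts as unset.
        .find(|value| !value.is_empty())
}

/// Convenience wrapper that reads from the real process environment.
pub fn resolve_hf_token() -> Option<String> {
    resolve_hf_token_with(|name| env::var(name).ok())
}
```

Returning `Option<String>` rather than panicking leaves the caller free to decide whether a missing token is an error (uploads) or acceptable (downloads from a public repository).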

### 4. Upload Workflow

```rust
// Using ModelUploader
let uploader = ModelUploader::new(get_hf_token().unwrap());
uploader.upload(
    "./model.gguf",  // local file to publish
    "ruv/ruvltra",   // consolidated repository (see section 1)
    Some(metadata),  // optional model metadata
)?;
```

### 5. Model Card Requirements

Each model card must include:
- YAML frontmatter with tags, license, language
- Model description and capabilities
- Hardware requirements table
- Usage examples (Rust, Python, CLI)
- Benchmark results (when available)
- License information
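The frontmatter requirement lends itself to an automated check before publishing. The sketch below is an assumption-laden illustration: `missing_frontmatter_keys` is a hypothetical helper, and it only recognizes unindented `key:` lines between the opening and closing `---` markers rather than parsing YAML properly.

```rust
/// Frontmatter keys named in the requirements list above.
const REQUIRED_KEYS: [&str; 3] = ["tags", "license", "language"];

/// Return the required keys that are absent from a model card's YAML frontmatter.
/// Not a YAML parser: only top-level `key:` lines are recognized.
pub fn missing_frontmatter_keys(card: &str) -> Vec<&'static str> {
    let mut lines = card.lines();
    // Frontmatter must open with a `---` line at the very top of the card.
    if lines.next().map(str::trim_end) != Some("---") {
        return REQUIRED_KEYS.to_vec();
    }
    let mut present: Vec<&str> = Vec::new();
    let mut closed = false;
    for line in lines {
        if line.trim_end() == "---" {
            closed = true;
            break;
        }
        // Skip nested values, list items, and comments.
        if line.starts_with(' ') || line.starts_with('-') || line.starts_with('#') {
            continue;
        }
        if let Some((key, _)) = line.split_once(':') {
            present.push(key.trim());
        }
    }
    // An unterminated block is treated as having no frontmatter at all.
    if !closed {
        return REQUIRED_KEYS.to_vec();
    }
    REQUIRED_KEYS
        .iter()
        .copied()
        .filter(|key| !present.contains(key))
        .collect()
}
```

A check like this could run in CI ahead of an upload, failing the job when the returned list is non-empty.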

### 6. Versioning Strategy

- Use HuggingFace's built-in Git versioning
- Tag major releases (e.g., `v1.0.0`)
- Maintain `main` branch for latest stable
- Use branches for experimental variants

## Consequences

### Positive
- **Accessibility**: Models available via standard HuggingFace APIs
- **Discoverability**: Indexed in HuggingFace model search
- **Versioning**: Full Git history for model evolution
- **CDN**: Fast global downloads via HuggingFace's CDN
- **Documentation**: Model cards provide user guidance

### Negative
- **Storage Costs**: Large models require HuggingFace Pro for private repos
- **Dependency**: Reliance on external service availability
- **Sync Complexity**: Must keep registry.rs in sync with HuggingFace

### Mitigations
- Use public repos (free unlimited storage)
- Implement fallback to direct URL downloads
- Automate registry updates via CI/CD
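One possible shape for the direct-URL fallback is sketched below. It relies on the Hub's `resolve` URL layout (`https://huggingface.co/{repo}/resolve/{revision}/{file}`); the function names and the "pinned tag first, then `main`" ordering are assumptions made for illustration, not the behaviour of the existing downloader.

```rust
/// Build a direct download URL for a file in a HuggingFace model repository.
/// `revision` may be a branch name, a tag, or a commit hash.
pub fn hf_resolve_url(repo_id: &str, revision: &str, file_name: &str) -> String {
    format!(
        "https://huggingface.co/{}/resolve/{}/{}",
        repo_id, revision, file_name
    )
}

/// Candidate URLs to try in order: the pinned release tag (if any), then `main`.
pub fn download_candidates(repo_id: &str, tag: Option<&str>, file_name: &str) -> Vec<String> {
    let mut urls = Vec::new();
    if let Some(t) = tag {
        urls.push(hf_resolve_url(repo_id, t, file_name));
    }
    // `main` holds the latest stable files under the versioning strategy above.
    urls.push(hf_resolve_url(repo_id, "main", file_name));
    urls
}
```

For example, `download_candidates("ruv/ruvltra", Some("v1.0.0"), "ruvltra-3b-q8_0.gguf")` yields the tagged URL followed by the `main` URL, so a client can pin a release and still recover if the tag is missing.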

## Implementation

### Phase 1: Initial Publishing (Complete)
- [x] Create consolidated `ruv/ruvltra` repository
- [x] Upload Claude Code, Small, and Medium models
- [x] Upload Q4_K_M quantized models
- [x] Add comprehensive model card with badges, tutorials, architecture

### Phase 2: Enhanced Distribution
- [ ] Add Q8 quantization variants
- [ ] Add FP16 variants for fine-tuning
- [ ] Implement automated CI/CD publishing
- [ ] Add SONA weight exports

### Phase 3: Ecosystem Integration
- [ ] Add to llama.cpp model zoo
- [ ] Create Ollama modelfile
- [ ] Publish to alternative registries (ModelScope)

## References

- HuggingFace Hub Documentation: https://huggingface.co/docs/hub
- GGUF Format Specification: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
- RuvLTRA Registry: `crates/ruvllm/src/hub/registry.rs`
- Related Issue: #121