RuvLTRA is a collection of optimized models designed for local routing, embeddings, and task classification in Claude Code workflows—not for general code generation.
🎯 Key Philosophy
Benchmark Note: HumanEval/MBPP don't apply here. RuvLTRA isn't designed to compete with Claude for code generation from scratch.
Task ──► RuvLTRA ──► Agent Type ──► Claude API
          (free)     (100% acc)     (pay here)

Query ──► RuvLTRA ──► Embedding ──► HNSW ──► Context
           (free)      (free)      (free)    (free)
Philosophy: Simple, frequent decisions → RuvLTRA (free, <10ms, 100% accurate on the 20-case routing benchmark reported under Evaluation Results). Complex reasoning → Claude API (worth the cost).
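The second flow (query → embedding → HNSW → context) is nearest-neighbour retrieval over stored vectors. The sketch below is only an illustration under stated assumptions: it uses brute-force cosine search as a stand-in for a real HNSW index, and the stored snippets, their 3-D vectors, and the function names are made up for the example.

```javascript
// Illustrative only: brute-force top-k cosine search standing in for an HNSW index.
// The snippets and vectors below are invented; real embeddings come from the model.

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored item against the query and keep the k most similar.
function topK(queryVec, items, k) {
  return items
    .map((item) => ({ text: item.text, score: cosineSimilarity(queryVec, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const memory = [
  { text: "auth middleware notes", vector: [0.9, 0.1, 0.0] },
  { text: "database migration log", vector: [0.0, 0.8, 0.6] },
  { text: "login bug postmortem", vector: [0.8, 0.3, 0.1] },
];

console.log(topK([1, 0, 0], memory, 2).map((hit) => hit.text));
```

An HNSW index returns approximately the same neighbours without scanning every item, which matters once the store grows beyond a few thousand vectors.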
📋 Training Details
Training Data
| Dataset | Count | Description |
|---|---|---|
| Base Triplets | 578 | Claude Code routing examples |
| Claude Hard Negatives (Batch 1) | 100 | Opus 4.5 generated confusing pairs |
| Claude Hard Negatives (Batch 2) | 400 | Additional confusing pairs |
| Total | 1,078 | Combined training set |
Training Procedure
Pipeline: Hard Negative Generation → Contrastive Training → GRPO Feedback → GGUF Export
1. Generate confusing agent pairs using Claude Opus 4.5
2. Train with Triplet Loss + InfoNCE Loss
3. Apply GRPO reward scaling from Claude judgments
4. Export adapter weights for GGUF merging
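Step 2 combines two contrastive objectives. The card does not show the actual training code (which runs in Candle), so the following is a minimal sketch on plain arrays, using the triplet margin (0.5) and InfoNCE temperature (0.07) listed under Hyperparameters. Using cosine similarity as the metric, and every function name here, are assumptions.

```javascript
// Illustrative sketch only: cosine-based triplet and InfoNCE losses.
// Function names are hypothetical; this is not the project's training code.

function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosine(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Triplet loss: the positive should be at least `margin` closer than the negative.
// Distance is taken as 1 - cosine similarity (an assumption).
function tripletLoss(anchor, positive, negative, margin = 0.5) {
  const dPos = 1 - cosine(anchor, positive);
  const dNeg = 1 - cosine(anchor, negative);
  return Math.max(0, dPos - dNeg + margin);
}

// InfoNCE: cross-entropy of the positive against the positive plus all negatives,
// with similarities divided by the temperature.
function infoNceLoss(anchor, positive, negatives, temperature = 0.07) {
  const logits = [positive, ...negatives].map((v) => cosine(anchor, v) / temperature);
  const maxLogit = Math.max(...logits); // subtract the max for numerical stability
  const sumExp = logits.reduce((s, l) => s + Math.exp(l - maxLogit), 0);
  return maxLogit + Math.log(sumExp) - logits[0];
}

const anchorVec = [1, 0];
const positiveVec = [0.9, 0.1];
const hardNegativeVec = [0.8, 0.6];
console.log(tripletLoss(anchorVec, positiveVec, hardNegativeVec));
console.log(infoNceLoss(anchorVec, positiveVec, [hardNegativeVec]));
```

A hard negative sits close to the anchor, so it yields a non-zero triplet loss where an easy negative would already satisfy the margin. That is the motivation for generating confusing pairs in step 1.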
Hyperparameters
| Parameter | Value |
|---|---|
| Learning Rate | 2e-5 |
| Batch Size | 32 |
| Epochs | 30 |
| Triplet Margin | 0.5 |
| InfoNCE Temperature | 0.07 |
| Weight Decay | 0.01 |
| Optimizer | AdamW |
Training Infrastructure
Hardware: Apple Silicon (Metal GPU)
Framework: Candle (Rust ML)
Training Time: ~30 seconds for 30 epochs
Final Loss: 0.168
📊 Evaluation Results
Benchmark: Claude Flow Agent Routing (20 test cases)
| Strategy | RuvLTRA | Qwen Base | Improvement |
|---|---|---|---|
| Embedding Only | 88.2% | 40.0% | +48.2 pts |
| Keyword Only | 100.0% | 100.0% | same |
| Hybrid 60/40 | 100.0% | 95.0% | +5.0 pts |
| Keyword-First | 100.0% | 95.0% | +5.0 pts |
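The card does not define how "Hybrid 60/40" combines its two signals. One plausible reading is a weighted sum of a keyword score and an embedding score. The sketch below assumes that reading, and further assumes the keyword score carries the 60% weight. Both points, along with the function names and the toy scores, are assumptions rather than documented behaviour.

```javascript
// Hypothetical sketch of a weighted hybrid score. The 0.6 / 0.4 split between
// keyword and embedding scores is an assumed reading of "Hybrid 60/40".

function hybridScore(keywordScore, embeddingScore, keywordWeight = 0.6) {
  return keywordWeight * keywordScore + (1 - keywordWeight) * embeddingScore;
}

// Pick the agent with the highest combined score.
// `candidates` maps agent name -> { keyword, embedding }, both scores in [0, 1].
function routeHybrid(candidates, keywordWeight = 0.6) {
  let bestAgent = null;
  let bestScore = -Infinity;
  for (const [agent, s] of Object.entries(candidates)) {
    const score = hybridScore(s.keyword, s.embedding, keywordWeight);
    if (score > bestScore) {
      bestAgent = agent;
      bestScore = score;
    }
  }
  return { agent: bestAgent, score: bestScore };
}

const hybridResult = routeHybrid({
  coder: { keyword: 0.0, embedding: 0.9 },
  tester: { keyword: 1.0, embedding: 0.4 },
});
console.log(hybridResult.agent);
```

With these toy scores the keyword match outweighs the stronger embedding similarity, so the task goes to `tester`.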
Per-Agent Accuracy
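| Agent | Accuracy | Test Cases |
|---|---|---|
| coder | 100% | 3 |
| researcher | 100% | 2 |
| reviewer | 100% | 2 |
| tester | 100% | 2 |
| architect | 100% | 2 |
| security-architect | 100% | 2 |
| debugger | 100% | 2 |
| documenter | 100% | 1 |
| refactorer | 100% | 1 |
| optimizer | 100% | 1 |
| devops | 100% | 1 |
| api-docs | 100% | 1 |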
Hard Negative Performance
| Confusing Pair | Accuracy |
|---|---|
| coder vs refactorer | 82% |
| researcher vs architect | 79% |
| reviewer vs tester | 84% |
| debugger vs optimizer | 78% |
| documenter vs api-docs | 85% |
⚠️ Limitations & Intended Use
Intended Use
✅ Designed For:
- Task routing in Claude Code workflows
- Agent classification (13 types)
- Semantic embedding for HNSW search
- Local inference (<10ms latency)
- Cost optimization (avoid API calls for routing)

❌ NOT Designed For:
- General code generation
- Multi-step reasoning
- Chat/conversation
- Languages other than English
- Agent types beyond the 13 supported
Known Limitations
- Fixed Agent Types: Only routes to 13 predefined agents
- English Only: Training data is English-only
- Domain Specific: Optimized for software development tasks
- Embedding Fallback: 88.2% accuracy when keywords don't match
- Context Length: Optimal for short task descriptions (<100 tokens)
Bias Considerations
- Training data generated by Claude Opus 4.5 may inherit its biases
- Agent keywords favor common software terminology
- Security-related tasks may be over-classified to security-architect
🔧 Model Files & Checksums
Available Files
| File | Size | Format | Use Case |
|---|---|---|---|
| ruvltra-claude-code-0.5b-q4_k_m.gguf | 398 MB | GGUF Q4_K_M | Production routing |
| ruvltra-small-0.5b-q4_k_m.gguf | 398 MB | GGUF Q4_K_M | General embeddings |
| ruvltra-medium-1.1b-q4_k_m.gguf | 800 MB | GGUF Q4_K_M | Higher accuracy |
| training/v2.3-sota-stats.json | 1 KB | JSON | Training metrics |
| training/v2.3-info.json | 2 KB | JSON | Training config |
Version History
| Version | Date | Changes |
|---|---|---|
| v2.3 | 2025-01-20 | 500+ hard negatives, 48% ratio, GRPO feedback |
| v2.2 | 2025-01-15 | 100 hard negatives, 18% ratio |
| v2.1 | 2025-01-10 | Contrastive learning, triplet loss |
| v2.0 | 2025-01-05 | Hybrid routing strategy |
| v1.0 | 2024-12-20 | Initial release |
📖 Citation
BibTeX
@software{ruvltra2025,
  title={RuvLTRA: Local Task Routing for Claude Code Workflows},
  author={ruv},
  year={2025},
  url={https://huggingface.co/ruv/ruvltra},
  version={2.3},
  license={Apache-2.0},
  keywords={agent-routing, embeddings, claude-code, contrastive-learning}
}
Plain Text
ruv. (2025). RuvLTRA: Local Task Routing for Claude Code Workflows (Version 2.3).
https://huggingface.co/ruv/ruvltra
❓ FAQ & Troubleshooting
Common Questions
Q: Why use this instead of Claude API for routing?
A: RuvLTRA is free, runs locally in <10ms, and achieves 100% accuracy on the routing benchmark with the hybrid strategy. The Claude API adds ~500ms of latency and costs ~$0.003 per call.
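To put those figures in perspective, here is a back-of-the-envelope estimate. It is a rough sketch that treats the approximate numbers quoted in this answer as exact, assumes calls run sequentially, and uses the 10ms upper bound for local routing. The function name is made up for the example.

```javascript
// Rough estimate only, built from the approximate figures quoted in the FAQ answer.
const API_COST_PER_CALL_USD = 0.003; // ~$0.003 per routing call via the API
const API_LATENCY_MS = 500;          // ~500 ms per API call
const LOCAL_LATENCY_MS = 10;         // upper bound quoted for local routing

// Savings from routing `callCount` decisions locally instead of via the API.
function routingSavings(callCount) {
  return {
    usdSaved: callCount * API_COST_PER_CALL_USD,
    secondsSaved: (callCount * (API_LATENCY_MS - LOCAL_LATENCY_MS)) / 1000,
  };
}

console.log(routingSavings(10000));
```

Under these assumptions, 10,000 routing decisions come to roughly $30 in API spend and about 4,900 seconds of cumulative latency avoided.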
Q: Can I add custom agent types?
A: Not with the current model. You'd need to fine-tune with triplets including your custom agents.
Q: Does it work offline?
A: Yes, fully offline after downloading the GGUF model.
Q: What's the difference between embedding-only and hybrid?
A: Embedding-only uses semantic similarity (88.2% accuracy). Hybrid checks keywords first, then falls back to embeddings (100% accuracy).
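The keyword-then-embedding fallback described in this answer can be sketched as follows. This is a minimal illustration, not RuvLTRA's implementation: the keyword lists, the toy 2-D agent vectors, and the function names are all assumptions.

```javascript
// Hypothetical sketch of keyword-first routing with an embedding fallback.
// Keyword lists and vectors are invented for illustration.

const AGENT_KEYWORDS = {
  tester: ["test", "coverage"],
  debugger: ["bug", "crash", "stack trace"],
  documenter: ["readme", "docs"],
};

function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// 1) Return the first agent whose keyword appears in the task text.
// 2) Otherwise pick the agent whose reference embedding is most similar.
function routeKeywordFirst(task, taskEmbedding, agentEmbeddings) {
  const text = task.toLowerCase();
  for (const [agent, keywords] of Object.entries(AGENT_KEYWORDS)) {
    if (keywords.some((k) => text.includes(k))) {
      return { agent, via: "keyword" };
    }
  }
  let bestAgent = null;
  let bestSim = -Infinity;
  for (const [agent, vec] of Object.entries(agentEmbeddings)) {
    const sim = cosine(taskEmbedding, vec);
    if (sim > bestSim) {
      bestAgent = agent;
      bestSim = sim;
    }
  }
  return { agent: bestAgent, via: "embedding" };
}

// Toy 2-D vectors stand in for real model embeddings.
const agentEmbeddings = { coder: [1, 0], researcher: [0, 1] };
console.log(routeKeywordFirst("Write unit tests for the parser", [0.5, 0.5], agentEmbeddings));
console.log(routeKeywordFirst("Compare caching libraries", [0.1, 0.9], agentEmbeddings));
```

The first task contains a keyword and never reaches the embedding step. The second has no keyword match, so it falls back to similarity, which is the path where the 88.2% figure applies.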
Troubleshooting
Model loading fails:
# Ensure you have enough RAM (500MB+)
# Check file integrity
sha256sum ruvltra-claude-code-0.5b-q4_k_m.gguf
Low accuracy:
// Use keyword-first strategy for 100% accuracy
const router = new SemanticRouter({
  strategy: 'keyword-first' // Not 'embedding-only'
});
Slow inference:
# Enable Metal GPU on Apple Silicon
export GGML_METAL=1
📄 License
Apache 2.0 - Free for commercial and personal use.