Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'
This commit is contained in:
631
vendor/ruvector/crates/ruvllm/src/training/README.md
vendored
Normal file
631
vendor/ruvector/crates/ruvllm/src/training/README.md
vendored
Normal file
@@ -0,0 +1,631 @@
|
||||
# RuvLLM Training Module
|
||||
|
||||
Fine-tuning dataset generation for RuvLTRA models, focusing on Claude Flow agent task routing and model selection.
|
||||
|
||||
## SOTA Achievements (v2.3)
|
||||
|
||||
| Metric | Before | After | Method |
|
||||
|--------|--------|-------|--------|
|
||||
| **Hybrid Routing Accuracy** | 95% | **100%** | Keyword-First + Embedding Fallback |
|
||||
| **Embedding-Only Accuracy** | 45% | **88.2%** | Contrastive Learning (Triplet + InfoNCE) |
|
||||
| **Hard Negative Accuracy** | N/A | **81.2%** | Claude-Generated Confusing Pairs |
|
||||
| **Agent Types Supported** | 13 | 13 | All Claude Code agent types |
|
||||
|
||||
### Training Data (v2.3 SOTA)
|
||||
|
||||
- **Base triplets**: 578 examples from Claude Code routing data
|
||||
- **Claude-generated hard negatives**: 500+ high-quality confusing pairs
|
||||
- **Total training set**: 1,078 triplets
|
||||
- **Hard negative ratio**: 48.4% (up from 18%)
|
||||
|
||||
### Training Pipeline
|
||||
|
||||
```
|
||||
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
|
||||
│ Hard Negative │────►│ Contrastive │────►│ GRPO Feedback │
|
||||
│ Generation │ │ Training │ │ Loop │
|
||||
│ (Claude Opus) │ │ (Candle/Metal) │ │ (Claude Judge) │
|
||||
└──────────────────┘ └──────────────────┘ └──────────────────┘
|
||||
│
|
||||
▼
|
||||
┌──────────────────┐
|
||||
│ GGUF Export │
|
||||
│ (Adapter Merge) │
|
||||
└──────────────────┘
|
||||
```
|
||||
|
||||
## Overview
|
||||
|
||||
The training module generates synthetic datasets for fine-tuning RuvLTRA models on two key tasks:
|
||||
|
||||
1. **Agent Routing**: Classify tasks to appropriate Claude Flow agents (Coder, Researcher, Security, Architecture, Reviewer)
|
||||
2. **Model Selection**: Route tasks to optimal Claude models (Haiku/Sonnet/Opus) based on complexity
|
||||
|
||||
## Real Contrastive Training (v2.3 - Production)
|
||||
|
||||
The `real_trainer` module provides production-grade training with actual Candle weight updates:
|
||||
|
||||
```rust
|
||||
use ruvllm::training::{RealContrastiveTrainer, RealTrainingConfig, run_training_pipeline};
|
||||
use std::path::PathBuf;
|
||||
|
||||
// Option 1: Full pipeline with GRPO feedback
|
||||
#[tokio::main]
|
||||
async fn main() -> Result<(), String> {
|
||||
run_training_pipeline(
|
||||
&PathBuf::from("~/.ruvllm/training/combined-sota.jsonl"),
|
||||
&PathBuf::from("ruvltra-claude-code-0.5b-q4_k_m.gguf"),
|
||||
&PathBuf::from("ruvltra-claude-code-sota.gguf"),
|
||||
Some(&std::env::var("ANTHROPIC_API_KEY").unwrap()), // For GRPO
|
||||
).await
|
||||
}
|
||||
|
||||
// Option 2: Manual training with fine-grained control
|
||||
let config = RealTrainingConfig {
|
||||
model_path: PathBuf::from("ruvltra-claude-code-0.5b-q4_k_m.gguf"),
|
||||
output_path: PathBuf::from("ruvltra-claude-code-sota.gguf"),
|
||||
learning_rate: 2e-5,
|
||||
weight_decay: 0.01,
|
||||
batch_size: 16,
|
||||
epochs: 30,
|
||||
margin: 0.5, // Triplet loss margin
|
||||
temperature: 0.07, // InfoNCE temperature
|
||||
embedding_dim: 896, // Qwen 0.5B embedding size
|
||||
use_metal: true, // Apple Silicon GPU acceleration
|
||||
enable_grpo: true, // Enable GRPO reward scaling
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
let mut trainer = RealContrastiveTrainer::new(config)?;
|
||||
trainer.load_triplets("combined-sota.jsonl")?;
|
||||
|
||||
// Train with real weight updates
|
||||
let result = trainer.train()?;
|
||||
println!("Best accuracy: {:.2}%", result.best_accuracy * 100.0);
|
||||
|
||||
// Export to GGUF format
|
||||
let export = trainer.export_gguf("output.gguf")?;
|
||||
println!("Exported {} weights to {}", export.total_weights, export.weights_path.display());
|
||||
```
|
||||
|
||||
### GGUF Export
|
||||
|
||||
The trainer exports adapter weights that can be merged with the base Qwen model:
|
||||
|
||||
```bash
|
||||
# After training, merge adapter with base model
|
||||
bash output.gguf.weights/merge_adapter.sh
|
||||
|
||||
# Files created:
|
||||
# - output.gguf.weights/adapter_weights.bin (binary weights)
|
||||
# - output.gguf.weights/metadata.json (training config)
|
||||
# - output.gguf.weights/merge_adapter.sh (merge script)
|
||||
```
|
||||
|
||||
### GRPO Feedback Loop
|
||||
|
||||
GRPO (Group Relative Policy Optimization) uses Claude as a judge to improve training:
|
||||
|
||||
```rust
|
||||
use ruvllm::training::{GrpoEvaluator, GrpoFeedback};
|
||||
|
||||
let evaluator = GrpoEvaluator::new(api_key);
|
||||
|
||||
// Evaluate predictions
|
||||
let predictions = vec![
|
||||
("Add error handling".to_string(), "coder".to_string(), "coder".to_string()),
|
||||
("Review the PR".to_string(), "reviewer".to_string(), "tester".to_string()),
|
||||
];
|
||||
|
||||
let feedback = evaluator.evaluate(&predictions).await?;
|
||||
for fb in feedback {
|
||||
trainer.add_grpo_feedback(fb);
|
||||
}
|
||||
|
||||
// Re-train with GRPO-enhanced loss scaling
|
||||
let result = trainer.train()?;
|
||||
```
|
||||
|
||||
## Contrastive Learning (Simulated)
|
||||
|
||||
The `contrastive` module provides state-of-the-art embedding fine-tuning:
|
||||
|
||||
```rust
|
||||
use ruvllm::training::{ContrastiveTrainer, ContrastiveConfig, TrainingTriplet};
|
||||
|
||||
// Configure contrastive training
|
||||
let config = ContrastiveConfig {
|
||||
learning_rate: 2e-5,
|
||||
margin: 0.5, // Triplet loss margin
|
||||
temperature: 0.07, // InfoNCE temperature
|
||||
batch_size: 32,
|
||||
embedding_dim: 896, // Qwen 0.5B embedding size
|
||||
hard_negative_ratio: 0.18,
|
||||
use_metal: true, // Apple Silicon GPU
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
// Initialize and train
|
||||
let mut trainer = ContrastiveTrainer::new(config)?;
|
||||
trainer.load_triplets("triplets.jsonl")?;
|
||||
let result = trainer.train(30)?; // 30 epochs
|
||||
|
||||
println!("Final accuracy: {:.2}%", result.final_accuracy * 100.0);
|
||||
```
|
||||
|
||||
### Claude-Powered Hard Negative Generation
|
||||
|
||||
Generate high-quality confusing training pairs using Claude Opus 4.5:
|
||||
|
||||
```bash
|
||||
node scripts/training/claude-hard-negatives.js --count=10 --grpo
|
||||
|
||||
# Output: ~/.ruvllm/training/claude-hard-negatives.jsonl
|
||||
```
|
||||
|
||||
This generates triplets for confusing agent pairs:
|
||||
- `coder` vs `refactorer` (both modify code)
|
||||
- `researcher` vs `architect` (both analyze)
|
||||
- `reviewer` vs `tester` (both validate)
|
||||
- `debugger` vs `optimizer` (both fix issues)
|
||||
- And 6 more confusing pairs...
|
||||
|
||||
## Quick Start
|
||||
|
||||
```rust
|
||||
use ruvllm::training::{DatasetGenerator, DatasetConfig};
|
||||
|
||||
// Generate dataset with 100 examples per category
|
||||
let config = DatasetConfig::default();
|
||||
let mut generator = DatasetGenerator::new(config);
|
||||
let dataset = generator.generate();
|
||||
|
||||
// Export to JSONL
|
||||
dataset.export_jsonl("training.jsonl")?;
|
||||
|
||||
// Split for training/validation/test
|
||||
let (train, val, test) = dataset.split(0.7, 0.15, 0.15, 42);
|
||||
```
|
||||
|
||||
## Task Categories
|
||||
|
||||
### 1. Coder (20% of dataset)
|
||||
- **Focus**: Code generation, debugging, refactoring
|
||||
- **Examples**:
|
||||
- "Implement JWT authentication middleware in TypeScript"
|
||||
- "Debug memory leak in request handler"
|
||||
- "Refactor UserService to use dependency injection"
|
||||
|
||||
**Model Routing:**
|
||||
- Simple tasks → Haiku (quick fixes, simple functions)
|
||||
- Moderate tasks → Sonnet (components, APIs)
|
||||
- Complex tasks → Opus (algorithms, system-level)
|
||||
|
||||
### 2. Researcher (20% of dataset)
|
||||
- **Focus**: Analysis, exploration, documentation
|
||||
- **Examples**:
|
||||
- "Analyze GraphQL performance bottlenecks"
|
||||
- "Research best practices for microservices"
|
||||
- "Document REST API endpoints"
|
||||
|
||||
**Model Routing:**
|
||||
- Simple tasks → Haiku (basic docs)
|
||||
- Moderate/Complex → Sonnet (analysis, research)
|
||||
|
||||
### 3. Security (20% of dataset)
|
||||
- **Focus**: Audit, vulnerability analysis, threat detection
|
||||
- **Examples**:
|
||||
- "Audit authentication flow for security vulnerabilities"
|
||||
- "Review cryptographic key management"
|
||||
- "Identify SQL injection attack vectors"
|
||||
|
||||
**Model Routing:**
|
||||
- All tasks → Opus (security requires highest quality)
|
||||
|
||||
### 4. Architecture (20% of dataset)
|
||||
- **Focus**: System design, planning, architecture
|
||||
- **Examples**:
|
||||
- "Design microservices architecture for e-commerce"
|
||||
- "Plan database schema for multi-tenant SaaS"
|
||||
- "Architect real-time event streaming pipeline"
|
||||
|
||||
**Model Routing:**
|
||||
- Simple tasks → Sonnet (basic schemas)
|
||||
- Moderate/Complex → Opus (distributed systems)
|
||||
|
||||
### 5. Reviewer (20% of dataset)
|
||||
- **Focus**: Code review, quality assessment
|
||||
- **Examples**:
|
||||
- "Review pull request #123 for best practices"
|
||||
- "Assess code quality of UserController"
|
||||
- "Review error handling in payment service"
|
||||
|
||||
**Model Routing:**
|
||||
- Simple tasks → Haiku (standards compliance)
|
||||
- Moderate/Complex → Sonnet (quality, architecture review)
|
||||
|
||||
## Dataset Configuration
|
||||
|
||||
```rust
|
||||
use ruvllm::training::{DatasetConfig, AugmentationConfig};
|
||||
|
||||
let config = DatasetConfig {
|
||||
// Base examples per category
|
||||
examples_per_category: 100,
|
||||
|
||||
// Enable data augmentation
|
||||
enable_augmentation: true,
|
||||
|
||||
// Augmentation settings
|
||||
augmentation: AugmentationConfig {
|
||||
// Generate 2 paraphrases per example
|
||||
paraphrases_per_example: 2,
|
||||
|
||||
// Generate 2 complexity variations
|
||||
complexity_variations: 2,
|
||||
|
||||
// Enable domain transfer
|
||||
enable_domain_transfer: true,
|
||||
},
|
||||
|
||||
// Random seed for reproducibility
|
||||
seed: 42,
|
||||
};
|
||||
```
|
||||
|
||||
### Dataset Size Calculation
|
||||
|
||||
With default configuration:
|
||||
- **Base examples**: 5 categories × 100 = 500 examples
|
||||
- **Paraphrases**: 500 × 2 = 1,000 additional examples
|
||||
- **Complexity variations**: 500 × 2 = ~800 additional examples (some filtered)
|
||||
- **Domain transfer**: 500 × 1 = ~400 additional examples (some filtered)
|
||||
- **Total**: ~2,700 examples (actual varies due to filtering)
|
||||
|
||||
## Data Augmentation
|
||||
|
||||
### 1. Paraphrasing
|
||||
Replaces words with synonyms to increase linguistic diversity:
|
||||
|
||||
```
|
||||
Original: "Implement a function to validate user input"
|
||||
Paraphrased: "Create a function to validate user input"
|
||||
"Build a function to validate user input"
|
||||
```
|
||||
|
||||
### 2. Complexity Variations
|
||||
Creates examples at different complexity levels:
|
||||
|
||||
```
|
||||
Simple: "Add error handling to API endpoint"
|
||||
Moderate: "Implement error handling with retry logic"
|
||||
Complex: "Design fault-tolerant error handling with circuit breakers"
|
||||
```
|
||||
|
||||
### 3. Domain Transfer
|
||||
Applies task patterns across technical domains:
|
||||
|
||||
```
|
||||
Web: "Optimize React component rendering"
|
||||
Mobile: "Optimize Flutter widget rendering"
|
||||
Systems: "Optimize kernel thread scheduling"
|
||||
```
|
||||
|
||||
## Export Formats
|
||||
|
||||
### JSONL (Streaming Format)
|
||||
```rust
|
||||
// One JSON object per line
|
||||
dataset.export_jsonl("training.jsonl")?;
|
||||
```
|
||||
|
||||
**Example line:**
|
||||
```json
|
||||
{"input":"Implement authentication middleware","context":"JWT with RS256","output_agent":"coder","metadata":{"category":"Coder","complexity":"Moderate","domain":"Web","expected_model":"sonnet","quality_score":0.87,"tags":["auth","middleware"]}}
|
||||
```
|
||||
|
||||
### JSON (Full Array)
|
||||
```rust
|
||||
// Human-readable JSON array
|
||||
dataset.export_json("training.json")?;
|
||||
```
|
||||
|
||||
### Statistics
|
||||
```rust
|
||||
// Export dataset statistics
|
||||
dataset.export_stats("stats.json")?;
|
||||
```
|
||||
|
||||
**Stats format:**
|
||||
```json
|
||||
{
|
||||
"total_examples": 2700,
|
||||
"examples_per_category": {
|
||||
"coder": 540,
|
||||
"researcher": 540,
|
||||
"security": 540,
|
||||
"architecture": 540,
|
||||
"reviewer": 540
|
||||
},
|
||||
"examples_per_complexity": {
|
||||
"Simple": 900,
|
||||
"Moderate": 1080,
|
||||
"Complex": 720
|
||||
},
|
||||
"avg_quality_score": 0.87
|
||||
}
|
||||
```
|
||||
|
||||
## Dataset Splits
|
||||
|
||||
```rust
|
||||
// 70% train, 15% validation, 15% test
|
||||
let (train, val, test) = dataset.split(0.7, 0.15, 0.15, 42);
|
||||
|
||||
// Export each split
|
||||
ClaudeTaskDataset::new(train).export_jsonl("train.jsonl")?;
|
||||
ClaudeTaskDataset::new(val).export_jsonl("val.jsonl")?;
|
||||
ClaudeTaskDataset::new(test).export_jsonl("test.jsonl")?;
|
||||
```
|
||||
|
||||
## Example Structure
|
||||
|
||||
### ClaudeTaskExample
|
||||
```rust
|
||||
pub struct ClaudeTaskExample {
|
||||
/// Task description (model input)
|
||||
pub input: String,
|
||||
|
||||
/// Additional context
|
||||
pub context: String,
|
||||
|
||||
/// Expected agent (target output)
|
||||
pub output_agent: String,
|
||||
|
||||
/// Task metadata
|
||||
pub metadata: TaskMetadata,
|
||||
}
|
||||
```
|
||||
|
||||
### TaskMetadata
|
||||
```rust
|
||||
pub struct TaskMetadata {
|
||||
/// Task category
|
||||
pub category: TaskCategory,
|
||||
|
||||
/// Complexity level (Simple/Moderate/Complex)
|
||||
pub complexity: ComplexityLevel,
|
||||
|
||||
/// Technical domain
|
||||
pub domain: DomainType,
|
||||
|
||||
/// Recommended Claude model
|
||||
pub expected_model: String,
|
||||
|
||||
/// Quality score (0.0-1.0)
|
||||
pub quality_score: f32,
|
||||
|
||||
/// Descriptive tags
|
||||
pub tags: Vec<String>,
|
||||
}
|
||||
```
|
||||
|
||||
## Model Selection Logic
|
||||
|
||||
The dataset includes intelligent model routing based on task category and complexity:
|
||||
|
||||
| Category | Simple | Moderate | Complex |
|
||||
|----------|--------|----------|---------|
|
||||
| Coder | Haiku | Sonnet | Opus |
|
||||
| Researcher | Haiku | Sonnet | Sonnet |
|
||||
| Security | Opus | Opus | Opus |
|
||||
| Architecture | Sonnet | Opus | Opus |
|
||||
| Reviewer | Haiku | Sonnet | Sonnet |
|
||||
|
||||
**Cost Optimization:**
|
||||
- **Haiku**: ~75% cheaper than Opus, 2-3x faster
|
||||
- **Sonnet**: Balanced cost/quality for most tasks
|
||||
- **Opus**: Highest quality for complex/security-critical tasks
|
||||
|
||||
## Quality Scores
|
||||
|
||||
Training examples include quality scores (0.0-1.0) based on:
|
||||
|
||||
1. **Template Quality** (0.80-0.96)
|
||||
- Hand-crafted seed templates: 0.90-0.96
|
||||
- Paraphrased examples: 0.85-0.90
|
||||
- Domain transferred: 0.80-0.85
|
||||
|
||||
2. **Category Appropriateness**
|
||||
- Security tasks: 0.90-0.96 (critical quality)
|
||||
- Architecture tasks: 0.85-0.93 (high quality)
|
||||
- Code generation: 0.83-0.90 (good quality)
|
||||
- Research tasks: 0.80-0.89 (adequate quality)
|
||||
- Review tasks: 0.82-0.90 (good quality)
|
||||
|
||||
## Integration with RuvLTRA
|
||||
|
||||
### Fine-Tuning Pipeline
|
||||
|
||||
```rust
|
||||
use ruvllm::training::DatasetGenerator;
|
||||
use ruvllm::SonaLlm;
|
||||
|
||||
// 1. Generate dataset
|
||||
let dataset = DatasetGenerator::new(config).generate();
|
||||
|
||||
// 2. Split data
|
||||
let (train, val, _test) = dataset.split(0.7, 0.15, 0.15, 42);
|
||||
|
||||
// 3. Fine-tune model
|
||||
let model = SonaLlm::new(config)?;
|
||||
for example in train {
|
||||
let embedding = model.embed(&example.input)?;
|
||||
let target = encode_agent(&example.output_agent);
|
||||
model.train(embedding, target)?;
|
||||
}
|
||||
```
|
||||
|
||||
### Model Architecture
|
||||
|
||||
The dataset supports training multiple heads:
|
||||
|
||||
1. **Task Embedding Layer**
|
||||
- Input: Task description + context
|
||||
- Output: 768-dim semantic embedding
|
||||
|
||||
2. **Agent Classification Head**
|
||||
- Input: Task embedding
|
||||
- Output: 5-way softmax (5 agent types)
|
||||
|
||||
3. **Model Selection Head**
|
||||
- Input: Task embedding + complexity features
|
||||
- Output: 3-way softmax (Haiku/Sonnet/Opus)
|
||||
|
||||
4. **Quality Prediction Head**
|
||||
- Input: Task embedding
|
||||
- Output: Regression (0-1 quality score)
|
||||
|
||||
## Domain Types
|
||||
|
||||
The dataset covers 8 technical domains:
|
||||
|
||||
- **Web**: Frontend, backend, full-stack development
|
||||
- **Systems**: Operating systems, low-level programming
|
||||
- **DataScience**: ML, analytics, data processing
|
||||
- **Mobile**: iOS, Android, cross-platform
|
||||
- **DevOps**: Infrastructure, CI/CD, deployment
|
||||
- **Security**: Cryptography, vulnerabilities, compliance
|
||||
- **Database**: SQL, NoSQL, data modeling
|
||||
- **Api**: REST, GraphQL, API design
|
||||
|
||||
## Template System
|
||||
|
||||
The generator uses 100+ hand-crafted templates per category:
|
||||
|
||||
```rust
|
||||
TaskTemplate {
|
||||
input: "Implement a {function_type} function in {language}",
|
||||
context: "Should {requirements} and optimize for {target}",
|
||||
complexity: ComplexityLevel::Moderate,
|
||||
domain: DomainType::Web,
|
||||
tags: vec!["code-generation", "function"],
|
||||
quality: 0.87,
|
||||
}
|
||||
```
|
||||
|
||||
**Placeholders** are filled with random values:
|
||||
- `{language}`: Rust, TypeScript, Python, Go, Java
|
||||
- `{framework}`: React, Vue, Angular, Svelte
|
||||
- `{function_type}`: async, recursive, higher-order
|
||||
- `{data_structure}`: binary tree, hash map, linked list
|
||||
|
||||
## Running the Examples
|
||||
|
||||
### Complete SOTA Training Pipeline
|
||||
|
||||
```bash
|
||||
# 1. Generate 500+ Claude-powered hard negatives
|
||||
node npm/packages/ruvllm/scripts/training/claude-hard-negatives.js --count=50
|
||||
|
||||
# 2. Merge all triplets (base + hard negatives)
|
||||
cat ~/.ruvllm/training/ruvltra-finetuned/triplets.jsonl > combined.jsonl
|
||||
echo "" >> combined.jsonl
|
||||
cat ~/.ruvllm/training/claude-hard-negatives.jsonl >> combined.jsonl
|
||||
echo "" >> combined.jsonl
|
||||
cat ~/.ruvllm/training/claude-hard-negatives-batch2.jsonl >> combined.jsonl
|
||||
|
||||
# 3. Run REAL contrastive training with Candle (30 epochs)
|
||||
cargo run --example train_real --release --features candle -- \
|
||||
--triplets ~/.ruvllm/training/combined-sota.jsonl \
|
||||
--base-model ruvltra-claude-code-0.5b-q4_k_m.gguf \
|
||||
--output ruvltra-claude-code-sota.gguf \
|
||||
--epochs 30 \
|
||||
--grpo # Enable GRPO feedback loop
|
||||
|
||||
# 4. Merge trained adapter with base model
|
||||
bash ruvltra-claude-code-sota.gguf.weights/merge_adapter.sh
|
||||
|
||||
# 5. Benchmark the improvement
|
||||
node npm/packages/ruvllm/scripts/hybrid-model-compare.js
|
||||
```
|
||||
|
||||
### Simulated Contrastive Fine-Tuning (Quick Test)
|
||||
|
||||
```bash
|
||||
# Simulated training (no real weight updates, for testing)
|
||||
cargo run --example train_contrastive --release -- \
|
||||
--triplets ~/.ruvllm/training/combined-sota.jsonl \
|
||||
--epochs 30
|
||||
|
||||
# Expected output:
|
||||
# - 88%+ embedding-only accuracy
|
||||
# - 81%+ hard negative accuracy
|
||||
# - 100% hybrid routing accuracy
|
||||
```
|
||||
|
||||
### Dataset Generation
|
||||
|
||||
```bash
|
||||
# Generate dataset
|
||||
cargo run --example generate_claude_dataset --release
|
||||
|
||||
# Output files:
|
||||
# - claude_training_full.jsonl (all examples)
|
||||
# - claude_training_train.jsonl (70% training)
|
||||
# - claude_training_val.jsonl (15% validation)
|
||||
# - claude_training_test.jsonl (15% test)
|
||||
# - claude_training_stats.json (statistics)
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
# Run tests
|
||||
cargo test --package ruvllm --lib training
|
||||
|
||||
# Test specific functionality
|
||||
cargo test --package ruvllm test_dataset_generation
|
||||
cargo test --package ruvllm test_dataset_augmentation
|
||||
cargo test --package ruvllm test_model_recommendation
|
||||
```
|
||||
|
||||
## Performance
|
||||
|
||||
Dataset generation is highly optimized:
|
||||
|
||||
- **Generation Speed**: ~10,000 examples/second
|
||||
- **Memory Usage**: ~200 MB for 3,000 examples
|
||||
- **Export Speed**:
|
||||
- JSONL: ~50 MB/s
|
||||
- JSON: ~30 MB/s (pretty-printed)
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
### Planned Features
|
||||
- [ ] Parquet export format
|
||||
- [ ] HuggingFace Datasets integration
|
||||
- [ ] Multi-language support (non-English tasks)
|
||||
- [ ] Custom template loading
|
||||
- [ ] Active learning integration
|
||||
- [ ] Difficulty progression scheduling
|
||||
- [ ] Cross-validation splits
|
||||
- [ ] Balanced sampling strategies
|
||||
|
||||
### Research Directions
|
||||
- [ ] Few-shot learning examples
|
||||
- [ ] Task decomposition datasets
|
||||
- [ ] Multi-turn conversation datasets
|
||||
- [ ] Code execution feedback datasets
|
||||
- [ ] Self-improvement trajectory datasets
|
||||
|
||||
## References
|
||||
|
||||
- **Claude Flow**: https://github.com/ruvnet/claude-flow
|
||||
- **RuvLTRA Architecture**: `../../README.md`
|
||||
- **SONA Learning**: `../../../sona/README.md`
|
||||
- **Dataset Format**: `../../../../docs/claude_dataset_format.md`
|
||||
|
||||
## License
|
||||
|
||||
MIT OR Apache-2.0
|
||||
Reference in New Issue
Block a user