Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
This commit is contained in:
@@ -0,0 +1,561 @@
|
||||
# DSPy.ts + AgenticSynth Complete Integration Guide
|
||||
|
||||
## Overview
|
||||
|
||||
This comprehensive example demonstrates real-world integration between DSPy.ts (v2.1.1) and AgenticSynth for e-commerce product data generation with automatic optimization.
|
||||
|
||||
## What This Example Does
|
||||
|
||||
### 🎯 Complete Workflow
|
||||
|
||||
1. **Baseline Generation**: Uses AgenticSynth with Gemini to generate product data
|
||||
2. **DSPy Setup**: Configures OpenAI with ChainOfThought reasoning module
|
||||
3. **Optimization**: Uses BootstrapFewShot to learn from high-quality examples
|
||||
4. **Comparison**: Analyzes quality improvements, cost, and performance
|
||||
5. **Reporting**: Generates detailed comparison metrics and visualizations
|
||||
|
||||
### 🔧 Technologies Used
|
||||
|
||||
- **DSPy.ts v2.1.1**: Real modules (ChainOfThought, BootstrapFewShot, metrics)
|
||||
- **AgenticSynth**: Baseline synthetic data generation
|
||||
- **OpenAI GPT-3.5**: Optimized generation with reasoning
|
||||
- **Gemini Flash**: Fast baseline generation
|
||||
- **TypeScript**: Type-safe implementation
|
||||
|
||||
## Setup
|
||||
|
||||
### Prerequisites
|
||||
|
||||
```bash
|
||||
node >= 18.0.0
|
||||
npm >= 9.0.0
|
||||
```
|
||||
|
||||
### Environment Variables
|
||||
|
||||
Create a `.env` file in the package root:
|
||||
|
||||
```bash
|
||||
# Required
|
||||
OPENAI_API_KEY=sk-... # OpenAI API key
|
||||
GEMINI_API_KEY=... # Google AI Studio API key
|
||||
|
||||
# Optional
|
||||
ANTHROPIC_API_KEY=sk-ant-... # For Claude models
|
||||
```
|
||||
|
||||
### Installation
|
||||
|
||||
```bash
|
||||
# Install dependencies
|
||||
npm install
|
||||
|
||||
# Build the package
|
||||
npm run build
|
||||
```
|
||||
|
||||
## Running the Example
|
||||
|
||||
### Basic Usage
|
||||
|
||||
```bash
|
||||
# Set environment variables
|
||||
export OPENAI_API_KEY=sk-...
|
||||
export GEMINI_API_KEY=...
|
||||
|
||||
# Run the example
|
||||
npx tsx examples/dspy-complete-example.ts
|
||||
```
|
||||
|
||||
### Expected Output
|
||||
|
||||
```
|
||||
╔════════════════════════════════════════════════════════════════════════╗
|
||||
║ DSPy.ts + AgenticSynth Integration Example ║
|
||||
║ E-commerce Product Data Generation with Optimization ║
|
||||
╚════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
✅ Environment validated
|
||||
|
||||
🔷 PHASE 1: BASELINE GENERATION
|
||||
|
||||
📦 Generating baseline data with AgenticSynth (Gemini)...
|
||||
|
||||
✓ [1/10] UltraSound Pro Wireless Headphones
|
||||
Quality: 72.3% | Price: $249.99 | Rating: 4.7/5
|
||||
✓ [2/10] EcoLux Organic Cotton T-Shirt
|
||||
Quality: 68.5% | Price: $79.99 | Rating: 4.5/5
|
||||
...
|
||||
|
||||
✅ Baseline generation complete: 10/10 products in 8.23s
|
||||
💰 Estimated cost: $0.0005
|
||||
|
||||
🔷 PHASE 2: DSPy OPTIMIZATION
|
||||
|
||||
🧠 Setting up DSPy optimization with OpenAI...
|
||||
|
||||
📡 Configuring OpenAI language model...
|
||||
✓ Language model configured
|
||||
|
||||
🔧 Creating ChainOfThought module...
|
||||
✓ Module created
|
||||
|
||||
📚 Loading training examples...
|
||||
✓ Loaded 5 high-quality examples
|
||||
|
||||
🎯 Running BootstrapFewShot optimizer...
|
||||
✓ Optimization complete in 12.45s
|
||||
|
||||
✅ DSPy module ready for generation
|
||||
|
||||
🔷 PHASE 3: OPTIMIZED GENERATION
|
||||
|
||||
🚀 Generating optimized data with DSPy + OpenAI...
|
||||
|
||||
✓ [1/10] SmartHome Voice Assistant Hub
|
||||
Quality: 85.7% | Price: $179.99 | Rating: 4.8/5
|
||||
...
|
||||
|
||||
✅ Optimized generation complete: 10/10 products in 15.67s
|
||||
💰 Estimated cost: $0.0070
|
||||
|
||||
🔷 PHASE 4: ANALYSIS & REPORTING
|
||||
|
||||
╔════════════════════════════════════════════════════════════════════════╗
|
||||
║ COMPARISON REPORT ║
|
||||
╚════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
📊 BASELINE (AgenticSynth + Gemini)
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Products Generated: 10
|
||||
Generation Time: 8.23s
|
||||
Estimated Cost: $0.0005
|
||||
|
||||
Quality Metrics:
|
||||
Overall Quality: 68.2%
|
||||
Completeness: 72.5%
|
||||
Coherence: 65.0%
|
||||
Persuasiveness: 60.8%
|
||||
SEO Quality: 74.5%
|
||||
|
||||
🚀 OPTIMIZED (DSPy + OpenAI)
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Products Generated: 10
|
||||
Generation Time: 15.67s
|
||||
Estimated Cost: $0.0070
|
||||
|
||||
Quality Metrics:
|
||||
Overall Quality: 84.3%
|
||||
Completeness: 88.2%
|
||||
Coherence: 82.5%
|
||||
Persuasiveness: 85.0%
|
||||
SEO Quality: 81.5%
|
||||
|
||||
📈 IMPROVEMENT ANALYSIS
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Quality Gain: +23.6%
|
||||
Speed Change: +90.4%
|
||||
Cost Efficiency: +14.8%
|
||||
|
||||
📊 QUALITY COMPARISON CHART
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
Baseline: ██████████████████████████████████ 68.2%
|
||||
Optimized: ██████████████████████████████████████████ 84.3%
|
||||
|
||||
💡 KEY INSIGHTS
|
||||
────────────────────────────────────────────────────────────────────────────
|
||||
✓ Significant quality improvement with DSPy optimization
|
||||
✓ Better cost efficiency with optimized approach
|
||||
|
||||
════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
📁 Results exported to: .../examples/logs/dspy-comparison-results.json
|
||||
|
||||
✅ Example complete!
|
||||
|
||||
💡 Next steps:
|
||||
1. Review the comparison report above
|
||||
2. Check exported JSON for detailed results
|
||||
3. Experiment with different training examples
|
||||
4. Try other DSPy modules (Refine, ReAct, etc.)
|
||||
5. Adjust CONFIG parameters for your use case
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
### Customizable Parameters
|
||||
|
||||
Edit the `CONFIG` object in the example file:
|
||||
|
||||
```typescript
|
||||
const CONFIG = {
|
||||
SAMPLE_SIZE: 10, // Number of products to generate
|
||||
TRAINING_EXAMPLES: 5, // Examples for DSPy optimization
|
||||
BASELINE_MODEL: 'gemini-2.0-flash-exp',
|
||||
OPTIMIZED_MODEL: 'gpt-3.5-turbo',
|
||||
|
||||
CATEGORIES: [
|
||||
'Electronics',
|
||||
'Fashion',
|
||||
'Home & Garden',
|
||||
'Sports & Outdoors',
|
||||
'Books & Media',
|
||||
'Health & Beauty'
|
||||
],
|
||||
|
||||
PRICE_RANGES: {
|
||||
low: { min: 10, max: 50 },
|
||||
medium: { min: 50, max: 200 },
|
||||
high: { min: 200, max: 1000 }
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
## Understanding the Code
|
||||
|
||||
### Phase 1: Baseline Generation
|
||||
|
||||
```typescript
|
||||
const synth = new AgenticSynth({
|
||||
provider: 'gemini',
|
||||
model: 'gemini-2.0-flash-exp',
|
||||
apiKey: process.env.GEMINI_API_KEY
|
||||
});
|
||||
|
||||
const result = await synth.generateStructured<Product>({
|
||||
prompt: '...',
|
||||
schema: { /* product schema */ },
|
||||
count: 1
|
||||
});
|
||||
```
|
||||
|
||||
**Purpose**: Establishes baseline quality and cost metrics using standard generation.
|
||||
|
||||
### Phase 2: DSPy Setup
|
||||
|
||||
```typescript
|
||||
// Configure language model
|
||||
const lm = new OpenAILM({
|
||||
model: 'gpt-3.5-turbo',
|
||||
apiKey: process.env.OPENAI_API_KEY,
|
||||
temperature: 0.7
|
||||
});
|
||||
await lm.init();
|
||||
configureLM(lm);
|
||||
|
||||
// Create reasoning module
|
||||
const productGenerator = new ChainOfThought({
|
||||
name: 'ProductGenerator',
|
||||
signature: {
|
||||
inputs: [
|
||||
{ name: 'category', type: 'string', required: true },
|
||||
{ name: 'priceRange', type: 'string', required: true }
|
||||
],
|
||||
outputs: [
|
||||
{ name: 'name', type: 'string', required: true },
|
||||
{ name: 'description', type: 'string', required: true },
|
||||
{ name: 'price', type: 'number', required: true },
|
||||
{ name: 'rating', type: 'number', required: true }
|
||||
]
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
**Purpose**: Sets up DSPy's declarative reasoning framework.
|
||||
|
||||
### Phase 3: Optimization
|
||||
|
||||
```typescript
|
||||
const optimizer = new BootstrapFewShot({
|
||||
metric: productQualityMetric,
|
||||
maxBootstrappedDemos: 5,
|
||||
maxLabeledDemos: 3,
|
||||
teacherSettings: { temperature: 0.5 },
|
||||
maxRounds: 2
|
||||
});
|
||||
|
||||
const optimizedModule = await optimizer.compile(
|
||||
productGenerator,
|
||||
trainingExamples
|
||||
);
|
||||
```
|
||||
|
||||
**Purpose**: Learns from high-quality examples to improve generation.
|
||||
|
||||
### Phase 4: Generation with Optimized Module
|
||||
|
||||
```typescript
|
||||
const result = await optimizedModule.forward({
|
||||
category: 'Electronics',
|
||||
priceRange: '$100-$500'
|
||||
});
|
||||
|
||||
const product: Product = {
|
||||
name: result.name,
|
||||
description: result.description,
|
||||
price: result.price,
|
||||
rating: result.rating
|
||||
};
|
||||
```
|
||||
|
||||
**Purpose**: Uses optimized prompts and reasoning chains learned during compilation.
|
||||
|
||||
## Quality Metrics Explained
|
||||
|
||||
The example calculates four quality dimensions:
|
||||
|
||||
### 1. Completeness (40% weight)
|
||||
- Description length (100-500 words)
|
||||
- Contains features/benefits
|
||||
- Has call-to-action
|
||||
|
||||
### 2. Coherence (20% weight)
|
||||
- Sentence structure quality
|
||||
- Average sentence length (15-25 words ideal)
|
||||
- Natural flow
|
||||
|
||||
### 3. Persuasiveness (20% weight)
|
||||
- Persuasive language usage
|
||||
- Emotional appeal
|
||||
- Value proposition clarity
|
||||
|
||||
### 4. SEO Quality (20% weight)
|
||||
- Product name in description
|
||||
- Keyword presence
|
||||
- Discoverability
|
||||
|
||||
## Advanced Usage
|
||||
|
||||
### Using Different DSPy Modules
|
||||
|
||||
#### Refine Module (Iterative Improvement)
|
||||
|
||||
```typescript
|
||||
import { Refine } from 'dspy.ts';
|
||||
|
||||
const refiner = new Refine({
|
||||
name: 'ProductRefiner',
|
||||
signature: { /* ... */ },
|
||||
maxIterations: 3,
|
||||
constraints: [
|
||||
{ field: 'description', check: (val) => val.length >= 100 }
|
||||
]
|
||||
});
|
||||
```
|
||||
|
||||
#### ReAct Module (Reasoning + Acting)
|
||||
|
||||
```typescript
|
||||
import { ReAct } from 'dspy.ts';
|
||||
|
||||
const reactor = new ReAct({
|
||||
name: 'ProductResearcher',
|
||||
signature: { /* ... */ },
|
||||
tools: [searchTool, pricingTool]
|
||||
});
|
||||
```
|
||||
|
||||
### Custom Metrics
|
||||
|
||||
```typescript
|
||||
import { createMetric } from 'dspy.ts';
|
||||
|
||||
const customMetric = createMetric(
|
||||
'brand-consistency',
|
||||
(example, prediction) => {
|
||||
// Your custom evaluation logic
|
||||
const score = calculateBrandScore(prediction);
|
||||
return score;
|
||||
}
|
||||
);
|
||||
```
|
||||
|
||||
### Integration with AgenticDB
|
||||
|
||||
```typescript
|
||||
import { AgenticDB } from 'agentdb';
|
||||
|
||||
// Store products in vector database
|
||||
const db = new AgenticDB();
|
||||
await db.init();
|
||||
|
||||
for (const product of optimizedProducts) {
|
||||
await db.add({
|
||||
id: product.id,
|
||||
text: product.description,
|
||||
metadata: { category: product.category, price: product.price }
|
||||
});
|
||||
}
|
||||
|
||||
// Semantic search
|
||||
const similar = await db.search('wireless noise cancelling headphones', {
|
||||
limit: 5
|
||||
});
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
#### 1. Module Not Found
|
||||
|
||||
```bash
|
||||
Error: Cannot find module 'dspy.ts'
|
||||
```
|
||||
|
||||
**Solution**: Ensure dependencies are installed:
|
||||
```bash
|
||||
npm install
|
||||
```
|
||||
|
||||
#### 2. API Key Not Found
|
||||
|
||||
```bash
|
||||
❌ Missing required environment variables:
|
||||
- OPENAI_API_KEY
|
||||
```
|
||||
|
||||
**Solution**: Export environment variables:
|
||||
```bash
|
||||
export OPENAI_API_KEY=sk-...
|
||||
export GEMINI_API_KEY=...
|
||||
```
|
||||
|
||||
#### 3. Rate Limiting
|
||||
|
||||
```bash
|
||||
Error: Rate limit exceeded
|
||||
```
|
||||
|
||||
**Solution**: Add delays or reduce `SAMPLE_SIZE`:
|
||||
```typescript
|
||||
const CONFIG = {
|
||||
SAMPLE_SIZE: 5, // Reduce from 10
|
||||
// ...
|
||||
};
|
||||
```
|
||||
|
||||
#### 4. Out of Memory
|
||||
|
||||
**Solution**: Process in smaller batches:
|
||||
```typescript
|
||||
const batchSize = 5;
|
||||
for (let i = 0; i < totalProducts; i += batchSize) {
|
||||
const batch = await generateBatch(batchSize);
|
||||
// Process batch
|
||||
}
|
||||
```
|
||||
|
||||
## Performance Tips
|
||||
|
||||
### 1. Parallel Generation
|
||||
|
||||
```typescript
|
||||
const promises = categories.map(category =>
|
||||
optimizedModule.forward({ category, priceRange })
|
||||
);
|
||||
const results = await Promise.all(promises);
|
||||
```
|
||||
|
||||
### 2. Caching
|
||||
|
||||
```typescript
|
||||
const synth = new AgenticSynth({
|
||||
cacheStrategy: 'redis',
|
||||
cacheTTL: 3600,
|
||||
// ...
|
||||
});
|
||||
```
|
||||
|
||||
### 3. Streaming
|
||||
|
||||
```typescript
|
||||
for await (const product of synth.generateStream('structured', options)) {
|
||||
console.log('Generated:', product);
|
||||
// Process immediately
|
||||
}
|
||||
```
|
||||
|
||||
## Cost Optimization
|
||||
|
||||
### Model Selection Strategy
|
||||
|
||||
| Use Case | Baseline Model | Optimized Model | Notes |
|
||||
|----------|---------------|-----------------|-------|
|
||||
| High Quality | GPT-4 | Claude Opus | Premium quality |
|
||||
| Balanced | Gemini Flash | GPT-3.5 Turbo | Good quality/cost |
|
||||
| Cost-Effective | Gemini Flash | Gemini Flash | Minimal cost |
|
||||
| High Volume | Llama 3.1 | Gemini Flash | Maximum throughput |
|
||||
|
||||
### Budget Management
|
||||
|
||||
```typescript
|
||||
const CONFIG = {
|
||||
MAX_BUDGET: 1.0, // $1 USD limit
|
||||
COST_PER_TOKEN: 0.0005,
|
||||
// ...
|
||||
};
|
||||
|
||||
let totalCost = 0;
|
||||
for (let i = 0; i < products && totalCost < CONFIG.MAX_BUDGET; i++) {
|
||||
const result = await generate();
|
||||
totalCost += estimateCost(result);
|
||||
}
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
### Unit Tests
|
||||
|
||||
```typescript
|
||||
import { describe, it, expect } from 'vitest';
|
||||
import { calculateQualityMetrics } from './dspy-complete-example';
|
||||
|
||||
describe('Quality Metrics', () => {
|
||||
it('should calculate completeness correctly', () => {
|
||||
const product = {
|
||||
name: 'Test Product',
|
||||
description: 'A'.repeat(150),
|
||||
price: 99.99,
|
||||
rating: 4.5
|
||||
};
|
||||
|
||||
const metrics = calculateQualityMetrics(product);
|
||||
expect(metrics.completeness).toBeGreaterThan(0);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
### Integration Tests
|
||||
|
||||
```bash
|
||||
npm run test -- examples/dspy-complete-example.test.ts
|
||||
```
|
||||
|
||||
## Resources
|
||||
|
||||
### Documentation
|
||||
- [DSPy.ts GitHub](https://github.com/ruvnet/dspy.ts)
|
||||
- [AgenticSynth Docs](https://github.com/ruvnet/ruvector/tree/main/packages/agentic-synth)
|
||||
- [DSPy Paper](https://arxiv.org/abs/2310.03714)
|
||||
|
||||
### Examples
|
||||
- [Basic Usage](./basic-usage.ts)
|
||||
- [Integration Examples](./integration-examples.ts)
|
||||
- [Training Examples](./dspy-training-example.ts)
|
||||
|
||||
### Community
|
||||
- [Discord](https://discord.gg/dspy)
|
||||
- [GitHub Discussions](https://github.com/ruvnet/dspy.ts/discussions)
|
||||
|
||||
## License
|
||||
|
||||
MIT License - See LICENSE file for details
|
||||
|
||||
## Contributing
|
||||
|
||||
Contributions welcome! Please see CONTRIBUTING.md for guidelines.
|
||||
|
||||
---
|
||||
|
||||
**Built with ❤️ by rUv**
|
||||
Reference in New Issue
Block a user