Squashed 'vendor/ruvector/' content from commit b64c2172

git-subtree-dir: vendor/ruvector
git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
2026-02-28 14:39:40 -05:00
commit d803bfe2b1
7854 changed files with 3522914 additions and 0 deletions
--- a/npm/packages/agentic-synth/examples/docs/dspy-complete-example-guide.md
+++ b/npm/packages/agentic-synth/examples/docs/dspy-complete-example-guide.md
@@ -0,0 +1,561 @@
+# DSPy.ts + AgenticSynth Complete Integration Guide
+
+## Overview
+
+This example integrates DSPy.ts (v2.1.1) with AgenticSynth to generate e-commerce product data, then compares an unoptimized baseline against a DSPy-optimized pipeline on quality, cost, and speed.
+
+## What This Example Does
+
+### 🎯 Complete Workflow
+
+1. **Baseline Generation**: Uses AgenticSynth with Gemini to generate product data
+2. **DSPy Setup**: Configures OpenAI with a ChainOfThought reasoning module
+3. **Optimization**: Uses BootstrapFewShot to learn from high-quality examples
+4. **Comparison**: Analyzes quality improvements, cost, and performance
+5. **Reporting**: Generates detailed comparison metrics and visualizations
+
+### 🔧 Technologies Used
+
+- **DSPy.ts v2.1.1**: Real modules (ChainOfThought, BootstrapFewShot, metrics)
+- **AgenticSynth**: Baseline synthetic data generation
+- **OpenAI GPT-3.5**: Optimized generation with reasoning
+- **Gemini Flash**: Fast baseline generation
+- **TypeScript**: Type-safe implementation
+
+## Setup
+
+### Prerequisites
+
+```
+node >= 18.0.0
+npm >= 9.0.0
+```
+
+### Environment Variables
+
+Create a `.env` file in the package root:
+
+```bash
+# Required
+OPENAI_API_KEY=sk-...                    # OpenAI API key
+GEMINI_API_KEY=...                       # Google AI Studio API key
+
+# Optional
+ANTHROPIC_API_KEY=sk-ant-...             # For Claude models
+```
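+
+The example's output includes an "Environment validated" step. A minimal sketch of such a check follows; `findMissingEnv` and `REQUIRED_ENV_VARS` are hypothetical names used for illustration, not identifiers confirmed from the example file.
+
+```typescript
+// Hypothetical helper (not taken from the example file): lists the
+// required variables that are unset or blank.
+const REQUIRED_ENV_VARS: readonly string[] = ['OPENAI_API_KEY', 'GEMINI_API_KEY'];
+
+function findMissingEnv(
+  env: Record<string, string | undefined>,
+  required: readonly string[] = REQUIRED_ENV_VARS
+): string[] {
+  return required.filter((name) => (env[name] ?? '').trim() === '');
+}
+
+// A blank value counts as missing, so only GEMINI_API_KEY is reported here.
+const missing = findMissingEnv({ OPENAI_API_KEY: 'sk-test', GEMINI_API_KEY: '' });
+console.log(missing); // [ 'GEMINI_API_KEY' ]
+```
+
+In a script, passing `process.env` and exiting when the returned list is non-empty avoids spending API calls on a run that cannot finish.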
+
+### Installation
+
+```bash
+# Install dependencies
+npm install
+
+# Build the package
+npm run build
+```
+
+## Running the Example
+
+### Basic Usage
+
+```bash
+# Set environment variables
+export OPENAI_API_KEY=sk-...
+export GEMINI_API_KEY=...
+
+# Run the example
+npx tsx examples/dspy-complete-example.ts
+```
+
+### Expected Output
+
+```
+╔════════════════════════════════════════════════════════════════════════╗
+║         DSPy.ts + AgenticSynth Integration Example                    ║
+║         E-commerce Product Data Generation with Optimization           ║
+╚════════════════════════════════════════════════════════════════════════╝
+
+✅ Environment validated
+
+🔷 PHASE 1: BASELINE GENERATION
+
+📦 Generating baseline data with AgenticSynth (Gemini)...
+
+  ✓ [1/10] UltraSound Pro Wireless Headphones
+    Quality: 72.3% | Price: $249.99 | Rating: 4.7/5
+  ✓ [2/10] EcoLux Organic Cotton T-Shirt
+    Quality: 68.5% | Price: $79.99 | Rating: 4.5/5
+  ...
+
+✅ Baseline generation complete: 10/10 products in 8.23s
+💰 Estimated cost: $0.0005
+
+🔷 PHASE 2: DSPy OPTIMIZATION
+
+🧠 Setting up DSPy optimization with OpenAI...
+
+  📡 Configuring OpenAI language model...
+  ✓ Language model configured
+
+  🔧 Creating ChainOfThought module...
+  ✓ Module created
+
+  📚 Loading training examples...
+  ✓ Loaded 5 high-quality examples
+
+  🎯 Running BootstrapFewShot optimizer...
+  ✓ Optimization complete in 12.45s
+
+✅ DSPy module ready for generation
+
+🔷 PHASE 3: OPTIMIZED GENERATION
+
+🚀 Generating optimized data with DSPy + OpenAI...
+
+  ✓ [1/10] SmartHome Voice Assistant Hub
+    Quality: 85.7% | Price: $179.99 | Rating: 4.8/5
+  ...
+
+✅ Optimized generation complete: 10/10 products in 15.67s
+💰 Estimated cost: $0.0070
+
+🔷 PHASE 4: ANALYSIS & REPORTING
+
+╔════════════════════════════════════════════════════════════════════════╗
+║                     COMPARISON REPORT                                  ║
+╚════════════════════════════════════════════════════════════════════════╝
+
+📊 BASELINE (AgenticSynth + Gemini)
+────────────────────────────────────────────────────────────────────────────
+Products Generated:    10
+Generation Time:       8.23s
+Estimated Cost:        $0.0005
+
+Quality Metrics:
+  Overall Quality:     68.2%
+  Completeness:        72.5%
+  Coherence:           65.0%
+  Persuasiveness:      60.8%
+  SEO Quality:         74.5%
+
+🚀 OPTIMIZED (DSPy + OpenAI)
+────────────────────────────────────────────────────────────────────────────
+Products Generated:    10
+Generation Time:       15.67s
+Estimated Cost:        $0.0070
+
+Quality Metrics:
+  Overall Quality:     84.3%
+  Completeness:        88.2%
+  Coherence:           82.5%
+  Persuasiveness:      85.0%
+  SEO Quality:         81.5%
+
+📈 IMPROVEMENT ANALYSIS
+────────────────────────────────────────────────────────────────────────────
+Quality Gain:          +23.6%
+Speed Change:          +90.4%
+Cost Efficiency:       +14.8%
+
+📊 QUALITY COMPARISON CHART
+────────────────────────────────────────────────────────────────────────────
+Baseline:  ██████████████████████████████████ 68.2%
+Optimized: ██████████████████████████████████████████ 84.3%
+
+💡 KEY INSIGHTS
+────────────────────────────────────────────────────────────────────────────
+✓ Significant quality improvement with DSPy optimization
+✓ Better cost efficiency with optimized approach
+
+════════════════════════════════════════════════════════════════════════════
+
+📁 Results exported to: .../examples/logs/dspy-comparison-results.json
+
+✅ Example complete!
+
+💡 Next steps:
+   1. Review the comparison report above
+   2. Check exported JSON for detailed results
+   3. Experiment with different training examples
+   4. Try other DSPy modules (Refine, ReAct, etc.)
+   5. Adjust CONFIG parameters for your use case
+```
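+
+The Quality Gain and Speed Change rows in the sample report are consistent with a plain relative change against the baseline. The sketch below assumes that formula; `percentChange` is a hypothetical helper. The report does not show how the Cost Efficiency figure is derived, so it is left out.
+
+```typescript
+// Hypothetical helper: relative change of `optimized` versus `baseline`, in percent.
+function percentChange(baseline: number, optimized: number): number {
+  if (baseline === 0) {
+    throw new RangeError('baseline must be non-zero');
+  }
+  return ((optimized - baseline) / baseline) * 100;
+}
+
+// Figures copied from the sample report above.
+const qualityGain = percentChange(68.2, 84.3);  // overall quality, in %
+const timeChange = percentChange(8.23, 15.67);  // generation time, in seconds
+
+console.log(`Quality Gain: +${qualityGain.toFixed(1)}%`); // Quality Gain: +23.6%
+console.log(`Speed Change: +${timeChange.toFixed(1)}%`);  // Speed Change: +90.4%
+```
+
+Under this reading, a positive Speed Change means the optimized run took longer (15.67s versus 8.23s), not that it ran faster.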
+
+## Configuration
+
+### Customizable Parameters
+
+Edit the `CONFIG` object in the example file:
+
+```typescript
+const CONFIG = {
+  SAMPLE_SIZE: 10,           // Number of products to generate
+  TRAINING_EXAMPLES: 5,      // Examples for DSPy optimization
+  BASELINE_MODEL: 'gemini-2.0-flash-exp',
+  OPTIMIZED_MODEL: 'gpt-3.5-turbo',
+
+  CATEGORIES: [
+    'Electronics',
+    'Fashion',
+    'Home & Garden',
+    'Sports & Outdoors',
+    'Books & Media',
+    'Health & Beauty'
+  ],
+
+  PRICE_RANGES: {
+    low: { min: 10, max: 50 },
+    medium: { min: 50, max: 200 },
+    high: { min: 200, max: 1000 }
+  }
+};
+```
+
+## Understanding the Code
+
+### Phase 1: Baseline Generation
+
+```typescript
+const synth = new AgenticSynth({
+  provider: 'gemini',
+  model: 'gemini-2.0-flash-exp',
+  apiKey: process.env.GEMINI_API_KEY
+});
+
+const result = await synth.generateStructured<Product>({
+  prompt: '...',
+  schema: { /* product schema */ },
+  count: 1
+});
+```
+
+**Purpose**: Establishes baseline quality and cost metrics using plain AgenticSynth generation, with no prompt optimization applied.
+
+### Phase 2: DSPy Setup
+
+```typescript
+// Configure language model
+const lm = new OpenAILM({
+  model: 'gpt-3.5-turbo',
+  apiKey: process.env.OPENAI_API_KEY,
+  temperature: 0.7
+});
+await lm.init();
+configureLM(lm);
+
+// Create reasoning module
+const productGenerator = new ChainOfThought({
+  name: 'ProductGenerator',
+  signature: {
+    inputs: [
+      { name: 'category', type: 'string', required: true },
+      { name: 'priceRange', type: 'string', required: true }
+    ],
+    outputs: [
+      { name: 'name', type: 'string', required: true },
+      { name: 'description', type: 'string', required: true },
+      { name: 'price', type: 'number', required: true },
+      { name: 'rating', type: 'number', required: true }
+    ]
+  }
+});
+```
+
+**Purpose**: Registers the OpenAI model as DSPy's language model and declares the module's inputs and outputs as a typed signature.
+
+### Phase 3: Optimization
+
+```typescript
+const optimizer = new BootstrapFewShot({
+  metric: productQualityMetric,
+  maxBootstrappedDemos: 5,
+  maxLabeledDemos: 3,
+  teacherSettings: { temperature: 0.5 },
+  maxRounds: 2
+});
+
+const optimizedModule = await optimizer.compile(
+  productGenerator,
+  trainingExamples
+);
+```
+
+**Purpose**: Bootstraps few-shot demonstrations from the training examples, scored by `productQualityMetric`, so that the compiled module generates better output.
+
+### Phase 4: Generation with Optimized Module
+
+```typescript
+const result = await optimizedModule.forward({
+  category: 'Electronics',
+  priceRange: '$100-$500'
+});
+
+const product: Product = {
+  name: result.name,
+  description: result.description,
+  price: result.price,
+  rating: result.rating
+};
+```
+
+**Purpose**: Uses optimized prompts and reasoning chains learned during compilation.
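+
+Language model output may not match the declared types; a `price` can come back as a string, for example. The sketch below validates a raw result before it is used as a `Product`. The helpers `toFiniteNumber` and `toProduct` are hypothetical, and the 0-5 rating range is an assumption based on the "4.7/5" values in the sample output.
+
+```typescript
+// Field names follow the Product object built above.
+interface Product {
+  name: string;
+  description: string;
+  price: number;
+  rating: number;
+}
+
+// Hypothetical helper: accepts a number or a numeric string such as "179.99".
+function toFiniteNumber(value: unknown, field: string): number {
+  const parsed =
+    typeof value === 'number' ? value
+    : typeof value === 'string' && value.trim() !== '' ? Number(value)
+    : NaN;
+  if (!Number.isFinite(parsed)) {
+    throw new TypeError(`${field} is not a finite number`);
+  }
+  return parsed;
+}
+
+// Hypothetical helper: validates a raw module result and returns a typed Product.
+function toProduct(raw: Record<string, unknown>): Product {
+  const { name, description } = raw;
+  if (typeof name !== 'string' || name.trim() === '') {
+    throw new TypeError('name must be a non-empty string');
+  }
+  if (typeof description !== 'string' || description.trim() === '') {
+    throw new TypeError('description must be a non-empty string');
+  }
+  const price = toFiniteNumber(raw.price, 'price');
+  const rating = toFiniteNumber(raw.rating, 'rating');
+  if (price < 0) throw new RangeError('price must not be negative');
+  // Assumed scale: ratings run from 0 to 5.
+  if (rating < 0 || rating > 5) throw new RangeError('rating must be between 0 and 5');
+  return { name, description, price, rating };
+}
+```
+
+Rejecting a malformed result at this point keeps bad records out of the quality metrics and the exported JSON.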
+
+## Quality Metrics Explained
+
+The example scores each product on four weighted quality dimensions (the weights sum to 100%):
+
+### 1. Completeness (40% weight)
+- Description length (100-500 words)
+- Contains features/benefits
+- Has call-to-action
+
+### 2. Coherence (20% weight)
+- Sentence structure quality
+- Average sentence length (15-25 words ideal)
+- Natural flow
+
+### 3. Persuasiveness (20% weight)
+- Persuasive language usage
+- Emotional appeal
+- Value proposition clarity
+
+### 4. SEO Quality (20% weight)
+- Product name in description
+- Keyword presence
+- Discoverability
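+
+The weights above suggest that the overall score is a weighted sum of the four dimensions. A minimal sketch under that assumption follows; the `QualityScores` shape and the `overallQuality` function are hypothetical and may differ from `calculateQualityMetrics` in the example file.
+
+```typescript
+// Hypothetical shape: each dimension is scored in the range 0..1.
+interface QualityScores {
+  completeness: number;
+  coherence: number;
+  persuasiveness: number;
+  seoQuality: number;
+}
+
+// Weights taken from the section headings above (40/20/20/20).
+const QUALITY_WEIGHTS: QualityScores = {
+  completeness: 0.4,
+  coherence: 0.2,
+  persuasiveness: 0.2,
+  seoQuality: 0.2
+};
+
+// Assumed combination rule: a plain weighted sum of the four dimensions.
+function overallQuality(scores: QualityScores): number {
+  return (
+    scores.completeness * QUALITY_WEIGHTS.completeness +
+    scores.coherence * QUALITY_WEIGHTS.coherence +
+    scores.persuasiveness * QUALITY_WEIGHTS.persuasiveness +
+    scores.seoQuality * QUALITY_WEIGHTS.seoQuality
+  );
+}
+```
+
+Because completeness carries twice the weight of any other dimension, a description that is too short lowers the overall score more than a weak result in any other single dimension.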
+
+## Advanced Usage
+
+### Using Different DSPy Modules
+
+#### Refine Module (Iterative Improvement)
+
+```typescript
+import { Refine } from 'dspy.ts';
+
+const refiner = new Refine({
+  name: 'ProductRefiner',
+  signature: { /* ... */ },
+  maxIterations: 3,
+  constraints: [
+    { field: 'description', check: (val) => val.length >= 100 }
+  ]
+});
+```
+
+#### ReAct Module (Reasoning + Acting)
+
+```typescript
+import { ReAct } from 'dspy.ts';
+
+const reactor = new ReAct({
+  name: 'ProductResearcher',
+  signature: { /* ... */ },
+  tools: [searchTool, pricingTool]
+});
+```
+
+### Custom Metrics
+
+```typescript
+import { createMetric } from 'dspy.ts';
+
+const customMetric = createMetric(
+  'brand-consistency',
+  (example, prediction) => {
+    // Your custom evaluation logic
+    const score = calculateBrandScore(prediction);
+    return score;
+  }
+);
+```
+
+### Integration with AgenticDB
+
+```typescript
+import { AgenticDB } from 'agentdb';
+
+// Store products in vector database
+const db = new AgenticDB();
+await db.init();
+
+for (const product of optimizedProducts) {
+  await db.add({
+    id: product.id,
+    text: product.description,
+    metadata: { category: product.category, price: product.price }
+  });
+}
+
+// Semantic search
+const similar = await db.search('wireless noise cancelling headphones', {
+  limit: 5
+});
+```
+
+## Troubleshooting
+
+### Common Issues
+
+#### 1. Module Not Found
+
+```
+Error: Cannot find module 'dspy.ts'
+```
+
+**Solution**: Ensure dependencies are installed:
+```bash
+npm install
+```
+
+#### 2. API Key Not Found
+
+```
+❌ Missing required environment variables:
+   - OPENAI_API_KEY
+```
+
+**Solution**: Export environment variables:
+```bash
+export OPENAI_API_KEY=sk-...
+export GEMINI_API_KEY=...
+```
+
+#### 3. Rate Limiting
+
+```
+Error: Rate limit exceeded
+```
+
+**Solution**: Add delays between requests, or reduce `SAMPLE_SIZE` as shown here:
+```typescript
+const CONFIG = {
+  SAMPLE_SIZE: 5,  // Reduce from 10
+  // ...
+};
+```
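+
+The other option, adding delays, can be sketched as a retry wrapper with exponential backoff. `sleep` and `withRetry` are hypothetical helpers, not part of DSPy.ts or AgenticSynth, and this version retries on any error rather than inspecting it for a rate-limit status.
+
+```typescript
+// Hypothetical helper: resolves after `ms` milliseconds.
+const sleep = (ms: number): Promise<void> =>
+  new Promise<void>((resolve) => setTimeout(resolve, ms));
+
+// Hypothetical helper: runs `task`, waiting baseDelayMs, 2x, 4x, ... between failures.
+async function withRetry<T>(
+  task: () => Promise<T>,
+  maxAttempts = 3,
+  baseDelayMs = 1000
+): Promise<T> {
+  let lastError: unknown;
+  for (let attempt = 0; attempt < maxAttempts; attempt++) {
+    try {
+      return await task();
+    } catch (error) {
+      lastError = error;
+      // No wait after the final attempt; the error is rethrown below.
+      if (attempt < maxAttempts - 1) {
+        await sleep(baseDelayMs * 2 ** attempt);
+      }
+    }
+  }
+  throw lastError;
+}
+
+// Usage sketch (names from the guide, shown as a comment because it needs a live module):
+// const result = await withRetry(() => optimizedModule.forward({ category, priceRange }));
+```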
+
+#### 4. Out of Memory
+
+**Solution**: Process in smaller batches:
+```typescript
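+const batchSize = 5;
+for (let i = 0; i < totalProducts; i += batchSize) {
+  // Request only what remains so the final batch does not overshoot totalProducts
+  const batch = await generateBatch(Math.min(batchSize, totalProducts - i));
+  // Process batch
+}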
+```
+
+## Performance Tips
+
+### 1. Parallel Generation
+
+```typescript
+const promises = categories.map(category =>
+  optimizedModule.forward({ category, priceRange })
+);
+const results = await Promise.all(promises);
+```
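+
+`Promise.all` starts every request at once, which can trigger the rate limits described under Troubleshooting. One way to keep the parallelism while capping the number of in-flight requests is a small worker pool. `mapWithLimit` is a hypothetical helper written for this guide, not part of DSPy.ts.
+
+```typescript
+// Hypothetical helper: maps `items` through `worker` with at most `limit` calls in flight.
+async function mapWithLimit<T, R>(
+  items: readonly T[],
+  limit: number,
+  worker: (item: T) => Promise<R>
+): Promise<R[]> {
+  if (limit < 1) {
+    throw new RangeError('limit must be at least 1');
+  }
+  const results: R[] = new Array<R>(items.length);
+  let next = 0;
+
+  // Each runner claims the next unprocessed index until none remain.
+  async function run(): Promise<void> {
+    while (next < items.length) {
+      const index = next++;
+      results[index] = await worker(items[index]);
+    }
+  }
+
+  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
+  await Promise.all(runners);
+  return results; // Same order as `items`
+}
+
+// Demo with a stand-in worker; in the guide the worker would call optimizedModule.forward.
+const demoCategories = ['Electronics', 'Fashion', 'Home & Garden'];
+mapWithLimit(demoCategories, 2, async (category) => category.toUpperCase())
+  .then((names) => console.log(names));
+```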
+
+### 2. Caching
+
+```typescript
+const synth = new AgenticSynth({
+  cacheStrategy: 'redis',
+  cacheTTL: 3600,
+  // ...
+});
+```
+
+### 3. Streaming
+
+```typescript
+for await (const product of synth.generateStream('structured', options)) {
+  console.log('Generated:', product);
+  // Process immediately
+}
+```
+
+## Cost Optimization
+
+### Model Selection Strategy
+
+| Use Case | Baseline Model | Optimized Model | Notes |
+|----------|---------------|-----------------|-------|
+| High Quality | GPT-4 | Claude Opus | Premium quality |
+| Balanced | Gemini Flash | GPT-3.5 Turbo | Good quality/cost |
+| Cost-Effective | Gemini Flash | Gemini Flash | Minimal cost |
+| High Volume | Llama 3.1 | Gemini Flash | Maximum throughput |
+
+### Budget Management
+
+```typescript
+const CONFIG = {
+  MAX_BUDGET: 1.0,  // $1 USD limit
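+  COST_PER_TOKEN: 0.0005,  // Placeholder rate; substitute your provider's pricing
+  // ...
+};
+
+let totalCost = 0;
+// The budget is checked before each call, so the final call may overshoot it slightly
+for (let i = 0; i < products && totalCost < CONFIG.MAX_BUDGET; i++) {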
+  const result = await generate();
+  totalCost += estimateCost(result);
+}
+```
+
+## Testing
+
+### Unit Tests
+
+```typescript
+import { describe, it, expect } from 'vitest';
+import { calculateQualityMetrics } from './dspy-complete-example';
+
+describe('Quality Metrics', () => {
+  it('should calculate completeness correctly', () => {
+    const product = {
+      name: 'Test Product',
+      description: 'A'.repeat(150),
+      price: 99.99,
+      rating: 4.5
+    };
+
+    const metrics = calculateQualityMetrics(product);
+    expect(metrics.completeness).toBeGreaterThan(0);
+  });
+});
+```
+
+### Integration Tests
+
+```bash
+npm run test -- examples/dspy-complete-example.test.ts
+```
+
+## Resources
+
+### Documentation
+- [DSPy.ts GitHub](https://github.com/ruvnet/dspy.ts)
+- [AgenticSynth Docs](https://github.com/ruvnet/ruvector/tree/main/packages/agentic-synth)
+- [DSPy Paper](https://arxiv.org/abs/2310.03714)
+
+### Examples
+- [Basic Usage](./basic-usage.ts)
+- [Integration Examples](./integration-examples.ts)
+- [Training Examples](./dspy-training-example.ts)
+
+### Community
+- [Discord](https://discord.gg/dspy)
+- [GitHub Discussions](https://github.com/ruvnet/dspy.ts/discussions)
+
+## License
+
+MIT License - See LICENSE file for details
+
+## Contributing
+
+Contributions welcome! Please see CONTRIBUTING.md for guidelines.
+
+---
+
+**Built with ❤️ by rUv**