DSPy.ts + AgenticSynth Complete Integration Guide

Overview

This example integrates DSPy.ts (v2.1.1) with AgenticSynth to generate e-commerce product data, then uses DSPy optimization to improve the output and compares the result against an unoptimized baseline.

What This Example Does

🎯 Complete Workflow

  1. Baseline Generation: Uses AgenticSynth with Gemini to generate product data
  2. DSPy Setup: Configures OpenAI with ChainOfThought reasoning module
  3. Optimization: Uses BootstrapFewShot to learn from high-quality examples
  4. Comparison: Analyzes quality improvements, cost, and performance
  5. Reporting: Generates detailed comparison metrics and visualizations

🔧 Technologies Used

  • DSPy.ts v2.1.1: Real modules (ChainOfThought, BootstrapFewShot, metrics)
  • AgenticSynth: Baseline synthetic data generation
  • OpenAI GPT-3.5: Optimized generation with reasoning
  • Gemini Flash: Fast baseline generation
  • TypeScript: Type-safe implementation

Setup

Prerequisites

node >= 18.0.0
npm >= 9.0.0

Environment Variables

Create a .env file in the package root:

# Required
OPENAI_API_KEY=sk-...                    # OpenAI API key
GEMINI_API_KEY=...                       # Google AI Studio API key

# Optional
ANTHROPIC_API_KEY=sk-ant-...             # For Claude models
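
The run log later in this guide reports "Environment validated", and the Troubleshooting section shows a "Missing required environment variables" error. A minimal sketch of such a check follows. The helper name `findMissingEnv` and the `Env` type are hypothetical and are not taken from the example source.

```typescript
// Hypothetical helper (not from the example source): returns the names of
// required variables that are unset or blank in the given environment object.
type Env = Record<string, string | undefined>;

const REQUIRED_VARS: readonly string[] = ['OPENAI_API_KEY', 'GEMINI_API_KEY'];

function findMissingEnv(
  env: Env,
  required: readonly string[] = REQUIRED_VARS,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// In Node.js the real environment would be passed in: findMissingEnv(process.env)
const missing = findMissingEnv({ OPENAI_API_KEY: 'sk-test', GEMINI_API_KEY: '' });
console.log(missing); // [ 'GEMINI_API_KEY' ]
```

Passing the environment in as an argument keeps the check easy to test without touching real API keys.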

Installation

# Install dependencies
npm install

# Build the package
npm run build

Running the Example

Basic Usage

# Set environment variables
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...

# Run the example
npx tsx examples/dspy-complete-example.ts

Expected Output

╔════════════════════════════════════════════════════════════════════════╗
║         DSPy.ts + AgenticSynth Integration Example                    ║
║         E-commerce Product Data Generation with Optimization           ║
╚════════════════════════════════════════════════════════════════════════╝

✅ Environment validated

🔷 PHASE 1: BASELINE GENERATION

📦 Generating baseline data with AgenticSynth (Gemini)...

  ✓ [1/10] UltraSound Pro Wireless Headphones
    Quality: 72.3% | Price: $249.99 | Rating: 4.7/5
  ✓ [2/10] EcoLux Organic Cotton T-Shirt
    Quality: 68.5% | Price: $79.99 | Rating: 4.5/5
  ...

✅ Baseline generation complete: 10/10 products in 8.23s
💰 Estimated cost: $0.0005

🔷 PHASE 2: DSPy OPTIMIZATION

🧠 Setting up DSPy optimization with OpenAI...

  📡 Configuring OpenAI language model...
  ✓ Language model configured

  🔧 Creating ChainOfThought module...
  ✓ Module created

  📚 Loading training examples...
  ✓ Loaded 5 high-quality examples

  🎯 Running BootstrapFewShot optimizer...
  ✓ Optimization complete in 12.45s

✅ DSPy module ready for generation

🔷 PHASE 3: OPTIMIZED GENERATION

🚀 Generating optimized data with DSPy + OpenAI...

  ✓ [1/10] SmartHome Voice Assistant Hub
    Quality: 85.7% | Price: $179.99 | Rating: 4.8/5
  ...

✅ Optimized generation complete: 10/10 products in 15.67s
💰 Estimated cost: $0.0070

🔷 PHASE 4: ANALYSIS & REPORTING

╔════════════════════════════════════════════════════════════════════════╗
║                     COMPARISON REPORT                                  ║
╚════════════════════════════════════════════════════════════════════════╝

📊 BASELINE (AgenticSynth + Gemini)
────────────────────────────────────────────────────────────────────────────
Products Generated:    10
Generation Time:       8.23s
Estimated Cost:        $0.0005

Quality Metrics:
  Overall Quality:     68.2%
  Completeness:        72.5%
  Coherence:           65.0%
  Persuasiveness:      60.8%
  SEO Quality:         74.5%

🚀 OPTIMIZED (DSPy + OpenAI)
────────────────────────────────────────────────────────────────────────────
Products Generated:    10
Generation Time:       15.67s
Estimated Cost:        $0.0070

Quality Metrics:
  Overall Quality:     84.3%
  Completeness:        88.2%
  Coherence:           82.5%
  Persuasiveness:      85.0%
  SEO Quality:         81.5%

📈 IMPROVEMENT ANALYSIS
────────────────────────────────────────────────────────────────────────────
Quality Gain:          +23.6%
Speed Change:          +90.4%
Cost Efficiency:       +14.8%

📊 QUALITY COMPARISON CHART
────────────────────────────────────────────────────────────────────────────
Baseline:  ██████████████████████████████████ 68.2%
Optimized: ██████████████████████████████████████████ 84.3%

💡 KEY INSIGHTS
────────────────────────────────────────────────────────────────────────────
✓ Significant quality improvement with DSPy optimization
✓ Better cost efficiency with optimized approach

════════════════════════════════════════════════════════════════════════════

📁 Results exported to: .../examples/logs/dspy-comparison-results.json

✅ Example complete!

💡 Next steps:
   1. Review the comparison report above
   2. Check exported JSON for detailed results
   3. Experiment with different training examples
   4. Try other DSPy modules (Refine, ReAct, etc.)
   5. Adjust CONFIG parameters for your use case
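
The Quality Gain and Speed Change figures in the sample report are consistent with a plain relative change against the baseline. The sketch below reproduces them from the numbers printed above. `percentChange` is a hypothetical name, and the formula behind the Cost Efficiency line is not shown in this guide, so it is left out. Note that a positive Speed Change here means the optimized run took longer.

```typescript
// Relative change of `optimized` against `baseline`, in percent.
// Hypothetical helper; the example's own implementation is not shown here.
function percentChange(baseline: number, optimized: number): number {
  if (baseline === 0) {
    throw new Error('baseline must be non-zero');
  }
  return ((optimized - baseline) / baseline) * 100;
}

// Figures copied from the sample report above.
const qualityGain = percentChange(68.2, 84.3);  // overall quality, in percent
const speedChange = percentChange(8.23, 15.67); // generation time, in seconds

console.log(`Quality Gain: +${qualityGain.toFixed(1)}%`); // Quality Gain: +23.6%
console.log(`Speed Change: +${speedChange.toFixed(1)}%`); // Speed Change: +90.4%
```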

Configuration

Customizable Parameters

Edit the CONFIG object in the example file:

const CONFIG = {
  SAMPLE_SIZE: 10,           // Number of products to generate
  TRAINING_EXAMPLES: 5,      // Examples for DSPy optimization
  BASELINE_MODEL: 'gemini-2.0-flash-exp',
  OPTIMIZED_MODEL: 'gpt-3.5-turbo',

  CATEGORIES: [
    'Electronics',
    'Fashion',
    'Home & Garden',
    'Sports & Outdoors',
    'Books & Media',
    'Health & Beauty'
  ],

  PRICE_RANGES: {
    low: { min: 10, max: 50 },
    medium: { min: 50, max: 200 },
    high: { min: 200, max: 1000 }
  }
};
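
Phase 4 below passes `priceRange` to the module as a string such as '$100-$500'. One way the numeric `PRICE_RANGES` entries could be turned into that form is sketched here. `formatPriceRange` is a hypothetical helper, and whether the example performs this conversion is an assumption.

```typescript
interface PriceRange {
  min: number;
  max: number;
}

// Same values as the CONFIG.PRICE_RANGES entry above.
const PRICE_RANGES: Record<'low' | 'medium' | 'high', PriceRange> = {
  low: { min: 10, max: 50 },
  medium: { min: 50, max: 200 },
  high: { min: 200, max: 1000 },
};

// Hypothetical helper: renders a range in the "$min-$max" string form.
function formatPriceRange(range: PriceRange): string {
  return `$${range.min}-$${range.max}`;
}

console.log(formatPriceRange(PRICE_RANGES.medium)); // $50-$200
```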

Understanding the Code

Phase 1: Baseline Generation

const synth = new AgenticSynth({
  provider: 'gemini',
  model: 'gemini-2.0-flash-exp',
  apiKey: process.env.GEMINI_API_KEY
});

const result = await synth.generateStructured<Product>({
  prompt: '...',
  schema: { /* product schema */ },
  count: 1
});

Purpose: Establishes baseline quality and cost metrics using standard generation.

Phase 2: DSPy Setup

// Configure language model
const lm = new OpenAILM({
  model: 'gpt-3.5-turbo',
  apiKey: process.env.OPENAI_API_KEY,
  temperature: 0.7
});
await lm.init();
configureLM(lm);

// Create reasoning module
const productGenerator = new ChainOfThought({
  name: 'ProductGenerator',
  signature: {
    inputs: [
      { name: 'category', type: 'string', required: true },
      { name: 'priceRange', type: 'string', required: true }
    ],
    outputs: [
      { name: 'name', type: 'string', required: true },
      { name: 'description', type: 'string', required: true },
      { name: 'price', type: 'number', required: true },
      { name: 'rating', type: 'number', required: true }
    ]
  }
});

Purpose: Sets up DSPy's declarative reasoning framework.

Phase 3: Optimization

const optimizer = new BootstrapFewShot({
  metric: productQualityMetric,
  maxBootstrappedDemos: 5,
  maxLabeledDemos: 3,
  teacherSettings: { temperature: 0.5 },
  maxRounds: 2
});

const optimizedModule = await optimizer.compile(
  productGenerator,
  trainingExamples
);

Purpose: Learns from high-quality examples to improve generation.

Phase 4: Generation with Optimized Module

const result = await optimizedModule.forward({
  category: 'Electronics',
  priceRange: '$100-$500'
});

const product: Product = {
  name: result.name,
  description: result.description,
  price: result.price,
  rating: result.rating
};

Purpose: Uses optimized prompts and reasoning chains learned during compilation.
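
Language model output does not always arrive in the declared types, so it can be worth validating a result before treating it as a `Product`. The sketch below assumes the result is a loosely typed object and that ratings use the 0 to 5 scale seen in the sample output. `toProduct` and `toNumber` are hypothetical helpers, not part of DSPy.ts or the example.

```typescript
interface Product {
  name: string;
  description: string;
  price: number;
  rating: number;
}

// Accepts numbers, or numeric text such as "$179.99"; anything else is NaN.
function toNumber(value: unknown): number {
  if (typeof value === 'number') return value;
  if (typeof value === 'string' && value.trim() !== '') {
    return Number(value.replace(/[$,]/g, ''));
  }
  return NaN;
}

// Hypothetical helper: returns a Product, or null when a field is missing
// or out of range.
function toProduct(result: Record<string, unknown>): Product | null {
  const name = typeof result.name === 'string' ? result.name.trim() : '';
  const description =
    typeof result.description === 'string' ? result.description.trim() : '';
  const price = toNumber(result.price);
  const rating = toNumber(result.rating);

  if (name === '' || description === '') return null;
  if (!Number.isFinite(price) || price <= 0) return null;
  if (!Number.isFinite(rating) || rating < 0 || rating > 5) return null;

  return { name, description, price, rating };
}
```

Returning `null` rather than throwing lets a generation loop skip a bad result and continue with the next one.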

Quality Metrics Explained

The example calculates four quality dimensions:

1. Completeness (40% weight)

  • Description length (100-500 words)
  • Contains features/benefits
  • Has call-to-action

2. Coherence (20% weight)

  • Sentence structure quality
  • Average sentence length (15-25 words ideal)
  • Natural flow
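
As an illustration of the sentence-length criterion, the sketch below computes the average number of words per sentence and checks it against the 15-25 word range. The splitting rule is a deliberately simple assumption (sentence punctuation followed by whitespace or end of text) and may differ from what the example actually does. Both function names are hypothetical.

```typescript
// Average words per sentence. A sentence ends at '.', '!' or '?' followed by
// whitespace or the end of the text, so decimals like "$9.99" are not split.
function averageSentenceLength(text: string): number {
  const sentences = text
    .split(/[.!?]+(?:\s+|$)/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  if (sentences.length === 0) return 0;

  const totalWords = sentences.reduce(
    (sum, s) => sum + s.split(/\s+/).length,
    0,
  );
  return totalWords / sentences.length;
}

// True when the average falls inside the range the guide calls ideal.
function hasIdealSentenceLength(text: string, min = 15, max = 25): boolean {
  const avg = averageSentenceLength(text);
  return avg >= min && avg <= max;
}

console.log(averageSentenceLength('One two three. Four five!')); // 2.5
```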

3. Persuasiveness (20% weight)

  • Persuasive language usage
  • Emotional appeal
  • Value proposition clarity

4. SEO Quality (20% weight)

  • Product name in description
  • Keyword presence
  • Discoverability
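
The four weights above sum to 100%, so they can be combined into a single score as a weighted sum. The sketch below shows that calculation next to a plain average. The field names (in particular `seoQuality`) are assumptions. One caveat: the Overall Quality values in the sample report (68.2% and 84.3%) equal the plain average of the four dimension scores rather than the weighted sum, so the example may aggregate differently from the weights listed here.

```typescript
interface QualityScores {
  completeness: number;
  coherence: number;
  persuasiveness: number;
  seoQuality: number; // assumed field name
}

// Weights as listed in this section.
const WEIGHTS: QualityScores = {
  completeness: 0.4,
  coherence: 0.2,
  persuasiveness: 0.2,
  seoQuality: 0.2,
};

function weightedQuality(s: QualityScores): number {
  return (
    s.completeness * WEIGHTS.completeness +
    s.coherence * WEIGHTS.coherence +
    s.persuasiveness * WEIGHTS.persuasiveness +
    s.seoQuality * WEIGHTS.seoQuality
  );
}

function meanQuality(s: QualityScores): number {
  return (s.completeness + s.coherence + s.persuasiveness + s.seoQuality) / 4;
}

// Baseline dimension scores copied from the sample report.
const baselineScores: QualityScores = {
  completeness: 72.5,
  coherence: 65.0,
  persuasiveness: 60.8,
  seoQuality: 74.5,
};

console.log(weightedQuality(baselineScores).toFixed(1)); // 69.1
console.log(meanQuality(baselineScores).toFixed(1));     // 68.2
```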

Advanced Usage

Using Different DSPy Modules

Refine Module (Iterative Improvement)

import { Refine } from 'dspy.ts';

const refiner = new Refine({
  name: 'ProductRefiner',
  signature: { /* ... */ },
  maxIterations: 3,
  constraints: [
    { field: 'description', check: (val) => val.length >= 100 }
  ]
});

ReAct Module (Reasoning + Acting)

import { ReAct } from 'dspy.ts';

const reactor = new ReAct({
  name: 'ProductResearcher',
  signature: { /* ... */ },
  tools: [searchTool, pricingTool]
});

Custom Metrics

import { createMetric } from 'dspy.ts';

const customMetric = createMetric(
  'brand-consistency',
  (example, prediction) => {
    // Your custom evaluation logic
    const score = calculateBrandScore(prediction);
    return score;
  }
);
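
The snippet above calls `calculateBrandScore` without defining it. One possible shape for such a function is sketched below: it scores a prediction by the fraction of brand vocabulary terms found in its description. The term list, the `BrandPrediction` type, and the scoring rule are all illustrative assumptions.

```typescript
interface BrandPrediction {
  name: string;
  description: string;
}

// Hypothetical brand vocabulary; replace with terms from your style guide.
const BRAND_TERMS: readonly string[] = ['sustainable', 'handcrafted', 'premium'];

// Returns a score in [0, 1]: the share of brand terms that appear in the
// description, compared case-insensitively.
function calculateBrandScore(
  prediction: BrandPrediction,
  terms: readonly string[] = BRAND_TERMS,
): number {
  if (terms.length === 0) return 0;
  const text = prediction.description.toLowerCase();
  const hits = terms.filter((term) => text.includes(term.toLowerCase())).length;
  return hits / terms.length;
}
```

Keeping the score between 0 and 1 makes it straightforward to combine with other metrics or to apply a pass/fail threshold.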

Integration with AgenticDB

import { AgenticDB } from 'agentdb';

// Store products in vector database
const db = new AgenticDB();
await db.init();

for (const product of optimizedProducts) {
  await db.add({
    id: product.id,
    text: product.description,
    metadata: { category: product.category, price: product.price }
  });
}

// Semantic search
const similar = await db.search('wireless noise cancelling headphones', {
  limit: 5
});

Troubleshooting

Common Issues

1. Module Not Found

Error: Cannot find module 'dspy.ts'

Solution: Ensure dependencies are installed:

npm install

2. API Key Not Found

❌ Missing required environment variables:
   - OPENAI_API_KEY

Solution: Export environment variables:

export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...

3. Rate Limiting

Error: Rate limit exceeded

Solution: Add delays or reduce SAMPLE_SIZE:

const CONFIG = {
  SAMPLE_SIZE: 5,  // Reduce from 10
  // ...
};
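
For the "add delays" option, a common pattern is to retry a failed call with exponentially growing pauses. The sketch below is a generic helper and not part of DSPy.ts or AgenticSynth. The names `withRetry` and `sleep` are hypothetical, and for simplicity it retries on any error rather than only on rate-limit responses.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task`, retrying on failure with delays of baseDelayMs, 2x, 4x, ...
// Rethrows the last error once maxAttempts is reached.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // No point waiting after the final attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}

// Usage sketch, wrapping a generation call from this guide:
//   const result = await withRetry(() =>
//     optimizedModule.forward({ category: 'Electronics', priceRange: '$100-$500' })
//   );
```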

4. Out of Memory

Solution: Process in smaller batches:

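const batchSize = 5;
for (let i = 0; i < totalProducts; i += batchSize) {
  // Shrink the final batch so the total never exceeds totalProducts
  const size = Math.min(batchSize, totalProducts - i);
  const batch = await generateBatch(size);
  // Process the batch here, then let it go out of scope before the next one
}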

Performance Tips

1. Parallel Generation

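// One request per category, all in flight at the same time.
// `priceRange` is assumed to be defined in the enclosing scope.
const promises = categories.map(category =>
  optimizedModule.forward({ category, priceRange })
);
const results = await Promise.all(promises);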
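
Starting every request at once can run into the rate limits described under Troubleshooting. A middle ground is to cap how many requests are in flight. The sketch below is a generic helper with a hypothetical name, `mapWithConcurrency`; it is not part of DSPy.ts. Results come back in input order, and if any worker rejects, the returned promise rejects as well.

```typescript
// Maps `items` through an async `worker`, running at most `limit` calls
// at a time. Results are returned in the same order as the inputs.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage sketch with the names from the snippet above:
//   const results = await mapWithConcurrency(categories, 3, (category) =>
//     optimizedModule.forward({ category, priceRange })
//   );
```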

2. Caching

const synth = new AgenticSynth({
  cacheStrategy: 'redis',
  cacheTTL: 3600,
  // ...
});
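
To illustrate what a time-to-live setting does, here is a minimal in-memory cache where entries expire after a fixed period. It assumes `cacheTTL` is measured in seconds, so 3600 would mean one hour. The class is a conceptual sketch only and does not reflect how AgenticSynth or Redis implement caching.

```typescript
// Minimal in-memory TTL cache (conceptual sketch, hypothetical class).
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns the cached value, or undefined if absent or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// Usage sketch: key the cache on the prompt inputs.
const cache = new TtlCache<string>(3600);
cache.set('Electronics|$100-$500', 'cached product JSON');
console.log(cache.get('Electronics|$100-$500')); // cached product JSON
```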

3. Streaming

for await (const product of synth.generateStream('structured', options)) {
  console.log('Generated:', product);
  // Process immediately
}

Cost Optimization

Model Selection Strategy

Use Case        Baseline Model  Optimized Model  Notes
High Quality    GPT-4           Claude Opus      Premium quality
Balanced        Gemini Flash    GPT-3.5 Turbo    Good quality/cost
Cost-Effective  Gemini Flash    Gemini Flash     Minimal cost
High Volume     Llama 3.1       Gemini Flash     Maximum throughput

Budget Management

const CONFIG = {
  MAX_BUDGET: 1.0,  // $1 USD limit
  COST_PER_TOKEN: 0.0005,
  // ...
};

let totalCost = 0;
for (let i = 0; i < products && totalCost < CONFIG.MAX_BUDGET; i++) {
  const result = await generate();
  totalCost += estimateCost(result);
}
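
The loop above checks the budget only before each call, so the last call can push the total past `MAX_BUDGET`. The sketch below reserves an estimate before each call instead. `estimateCost`, `TokenUsage`, and `BudgetTracker` are hypothetical, and the flat per-token rate is a simplification: providers typically price input and output tokens separately, often per thousand or per million tokens.

```typescript
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
}

// Hypothetical flat-rate estimate in USD.
function estimateCost(usage: TokenUsage, costPerToken: number): number {
  return (usage.promptTokens + usage.completionTokens) * costPerToken;
}

class BudgetTracker {
  private spent = 0;
  private readonly maxBudget: number;

  constructor(maxBudget: number) {
    this.maxBudget = maxBudget;
  }

  // True if a call with this estimated cost would stay within budget.
  canAfford(estimated: number): boolean {
    return this.spent + estimated <= this.maxBudget;
  }

  record(cost: number): void {
    this.spent += cost;
  }

  get total(): number {
    return this.spent;
  }
}

// Usage sketch: stop before the call that would exceed the budget.
const tracker = new BudgetTracker(1.0);
const perCallEstimate = 0.25; // illustrative worst-case cost of one call
let callsMade = 0;
while (tracker.canAfford(perCallEstimate)) {
  // const result = await generate();  // real call would go here
  tracker.record(perCallEstimate);
  callsMade++;
}
console.log(callsMade, tracker.total); // 4 1
```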

Testing

Unit Tests

import { describe, it, expect } from 'vitest';
import { calculateQualityMetrics } from './dspy-complete-example';

describe('Quality Metrics', () => {
  it('should calculate completeness correctly', () => {
    const product = {
      name: 'Test Product',
      description: 'A'.repeat(150),
      price: 99.99,
      rating: 4.5
    };

    const metrics = calculateQualityMetrics(product);
    expect(metrics.completeness).toBeGreaterThan(0);
  });
});

Integration Tests

npm run test -- examples/dspy-complete-example.test.ts

License

MIT License - See LICENSE file for details

Contributing

Contributions welcome! Please see CONTRIBUTING.md for guidelines.


Built with ❤️ by rUv