Squashed 'vendor/ruvector/' content from commit b64c2172
git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
This commit is contained in:
443
npm/packages/agentic-synth/docs/VIDEO_DEMO_SCRIPT.md
Normal file
443
npm/packages/agentic-synth/docs/VIDEO_DEMO_SCRIPT.md
Normal file
@@ -0,0 +1,443 @@
|
||||
# 🎥 Agentic-Synth Video Tutorial Script
|
||||
|
||||
**Duration**: 8-10 minutes
|
||||
**Target Audience**: Developers, ML engineers, data scientists
|
||||
**Format**: Screen recording with voice-over
|
||||
|
||||
---
|
||||
|
||||
## Video Structure
|
||||
|
||||
1. **Introduction** (1 min)
|
||||
2. **Installation & Setup** (1 min)
|
||||
3. **Basic Usage** (2 mins)
|
||||
4. **Advanced Features** (2 mins)
|
||||
5. **Real-World Example** (2 mins)
|
||||
6. **Performance & Wrap-up** (1 min)
|
||||
|
||||
---
|
||||
|
||||
## Script
|
||||
|
||||
### Scene 1: Introduction (0:00 - 1:00)
|
||||
|
||||
**Visual**: Title card, then switch to terminal
|
||||
|
||||
**Voice-over**:
|
||||
> "Hi! Today I'll show you agentic-synth - a high-performance synthetic data generator that makes it incredibly easy to create realistic test data for your AI and ML projects.
|
||||
>
|
||||
> Whether you're training machine learning models, building RAG systems, or just need to seed your development database, agentic-synth has you covered with AI-powered data generation.
|
||||
>
|
||||
> Let's dive in!"
|
||||
|
||||
**Screen**: Show README on GitHub with badges
|
||||
|
||||
---
|
||||
|
||||
### Scene 2: Installation (1:00 - 2:00)
|
||||
|
||||
**Visual**: Terminal with command prompts
|
||||
|
||||
**Voice-over**:
|
||||
> "Installation is straightforward. You can use it as a global CLI tool or add it to your project."
|
||||
|
||||
**Type in terminal**:
|
||||
```bash
|
||||
# Global installation
|
||||
npm install -g @ruvector/agentic-synth
|
||||
|
||||
# Or use directly with npx
|
||||
npx agentic-synth --help
|
||||
```
|
||||
|
||||
**Voice-over**:
|
||||
> "You'll need an API key from Google Gemini or OpenRouter. Let's set that up quickly."
|
||||
|
||||
**Type**:
|
||||
```bash
|
||||
export GEMINI_API_KEY="your-key-here"
|
||||
```
|
||||
|
||||
**Voice-over**:
|
||||
> "And we're ready to go!"
|
||||
|
||||
---
|
||||
|
||||
### Scene 3: Basic Usage - CLI (2:00 - 3:00)
|
||||
|
||||
**Visual**: Terminal showing CLI commands
|
||||
|
||||
**Voice-over**:
|
||||
> "Let's start with the CLI. Generating data is as simple as running a single command."
|
||||
|
||||
**Type**:
|
||||
```bash
|
||||
npx agentic-synth generate \
|
||||
--type structured \
|
||||
--count 10 \
|
||||
--schema '{"name": "string", "email": "email", "age": "number"}' \
|
||||
--output users.json
|
||||
```
|
||||
|
||||
**Voice-over**:
|
||||
> "In just a few seconds, we have 10 realistic user records with names, emails, and ages. Let's look at the output."
|
||||
|
||||
**Type**:
|
||||
```bash
|
||||
cat users.json | jq '.[0:3]'
|
||||
```
|
||||
|
||||
**Visual**: Show JSON output with realistic data
|
||||
|
||||
**Voice-over**:
|
||||
> "Notice how the data looks realistic - real names, valid email formats, appropriate ages. This is all powered by AI."
|
||||
|
||||
---
|
||||
|
||||
### Scene 4: SDK Usage (3:00 - 4:00)
|
||||
|
||||
**Visual**: VS Code with TypeScript file
|
||||
|
||||
**Voice-over**:
|
||||
> "For more control, you can use the SDK directly in your code. Let me show you how simple that is."
|
||||
|
||||
**Type in editor** (`demo.ts`):
|
||||
```typescript
|
||||
import { AgenticSynth } from '@ruvector/agentic-synth';
|
||||
|
||||
// Initialize with configuration
|
||||
const synth = new AgenticSynth({
|
||||
provider: 'gemini',
|
||||
apiKey: process.env.GEMINI_API_KEY,
|
||||
cacheStrategy: 'memory', // Enable caching for 95%+ speedup
|
||||
cacheTTL: 3600
|
||||
});
|
||||
|
||||
// Generate structured data
|
||||
const users = await synth.generateStructured({
|
||||
count: 100,
|
||||
schema: {
|
||||
user_id: 'UUID',
|
||||
name: 'full name',
|
||||
email: 'valid email',
|
||||
age: 'number (18-80)',
|
||||
country: 'country name',
|
||||
subscription: 'free | pro | enterprise'
|
||||
}
|
||||
});
|
||||
|
||||
console.log(`Generated ${users.data.length} users`);
|
||||
console.log('Sample:', users.data[0]);
|
||||
```
|
||||
|
||||
**Voice-over**:
|
||||
> "Run this code..."
|
||||
|
||||
**Type in terminal**:
|
||||
```bash
|
||||
npx tsx demo.ts
|
||||
```
|
||||
|
||||
**Visual**: Show output with generated data
|
||||
|
||||
**Voice-over**:
|
||||
> "And we instantly get 100 realistic user profiles. Notice the caching - if we run this again with the same options, it's nearly instant!"
|
||||
|
||||
---
|
||||
|
||||
### Scene 5: Advanced Features - Time Series (4:00 - 5:00)
|
||||
|
||||
**Visual**: Split screen - editor on left, output on right
|
||||
|
||||
**Voice-over**:
|
||||
> "agentic-synth isn't just for simple records. It can generate complex time-series data, perfect for financial or IoT applications."
|
||||
|
||||
**Type in editor**:
|
||||
```typescript
|
||||
const stockData = await synth.generateTimeSeries({
|
||||
count: 365,
|
||||
startDate: '2024-01-01',
|
||||
interval: '1d',
|
||||
schema: {
|
||||
date: 'ISO date',
|
||||
open: 'number (100-200)',
|
||||
high: 'number (105-210)',
|
||||
low: 'number (95-195)',
|
||||
close: 'number (100-200)',
|
||||
volume: 'number (1000000-10000000)'
|
||||
},
|
||||
constraints: [
|
||||
'high must be >= open and close',
|
||||
'low must be <= open and close',
|
||||
'close influences next day open'
|
||||
]
|
||||
});
|
||||
|
||||
console.log('Generated stock data for 1 year');
|
||||
```
|
||||
|
||||
**Voice-over**:
|
||||
> "The constraints ensure our data follows real-world patterns - high prices are actually higher than opens and closes, and there's continuity between days."
|
||||
|
||||
**Show output**: Chart visualization of stock data
|
||||
|
||||
---
|
||||
|
||||
### Scene 6: Advanced Features - Streaming (5:00 - 6:00)
|
||||
|
||||
**Visual**: Editor showing streaming code
|
||||
|
||||
**Voice-over**:
|
||||
> "Need to generate millions of records? Use streaming to avoid memory issues."
|
||||
|
||||
**Type**:
|
||||
```typescript
|
||||
let count = 0;
|
||||
for await (const record of synth.generateStream('structured', {
|
||||
count: 1_000_000,
|
||||
schema: {
|
||||
id: 'UUID',
|
||||
timestamp: 'ISO timestamp',
|
||||
value: 'number'
|
||||
}
|
||||
})) {
|
||||
// Process each record individually
|
||||
await saveToDatabase(record);
|
||||
|
||||
count++;
|
||||
if (count % 10000 === 0) {
|
||||
console.log(`Processed ${count.toLocaleString()}...`);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Voice-over**:
|
||||
> "This streams records one at a time, so you can process a million records without loading everything into memory."
|
||||
|
||||
**Visual**: Show progress counter incrementing
|
||||
|
||||
---
|
||||
|
||||
### Scene 7: Real-World Example - ML Training Data (6:00 - 7:30)
|
||||
|
||||
**Visual**: Complete working example
|
||||
|
||||
**Voice-over**:
|
||||
> "Let me show you a real-world use case: generating training data for a machine learning model that predicts customer churn."
|
||||
|
||||
**Type**:
|
||||
```typescript
|
||||
// Generate training dataset with features
|
||||
const trainingData = await synth.generateStructured({
|
||||
count: 5000,
|
||||
schema: {
|
||||
customer_age: 'number (18-80)',
|
||||
annual_income: 'number (20000-200000)',
|
||||
credit_score: 'number (300-850)',
|
||||
account_tenure_months: 'number (1-360)',
|
||||
num_products: 'number (1-5)',
|
||||
balance: 'number (0-250000)',
|
||||
num_transactions_12m: 'number (0-200)',
|
||||
|
||||
// Target variable
|
||||
churn: 'boolean (higher likelihood if credit_score < 600, balance < 1000)'
|
||||
},
|
||||
constraints: [
|
||||
'Churn rate should be ~15-20%',
|
||||
'Higher income correlates with higher balance',
|
||||
'Customers with 1 product more likely to churn'
|
||||
]
|
||||
});
|
||||
|
||||
// Split into train/test
|
||||
const trainSize = Math.floor(trainingData.data.length * 0.8);
|
||||
const trainSet = trainingData.data.slice(0, trainSize);
|
||||
const testSet = trainingData.data.slice(trainSize);
|
||||
|
||||
console.log(`Training set: ${trainSet.length} samples`);
|
||||
console.log(`Test set: ${testSet.length} samples`);
|
||||
console.log(`Churn rate: ${(trainSet.filter(d => d.churn).length / trainSet.length * 100).toFixed(1)}%`);
|
||||
```
|
||||
|
||||
**Voice-over**:
|
||||
> "In minutes, we have a complete ML dataset with realistic distributions and correlations. The AI understands the constraints and generates data that actually makes sense for training models."
|
||||
|
||||
---
|
||||
|
||||
### Scene 8: Performance Highlights (7:30 - 8:30)
|
||||
|
||||
**Visual**: Show benchmark results
|
||||
|
||||
**Voice-over**:
|
||||
> "Let's talk performance. agentic-synth is incredibly fast, thanks to intelligent caching."
|
||||
|
||||
**Visual**: Show PERFORMANCE_REPORT.md metrics
|
||||
|
||||
**Voice-over**:
|
||||
> "All operations complete in sub-millisecond to low-millisecond latencies. Cache hits are essentially instant. And with an 85% cache hit rate in production, you're looking at 95%+ performance improvement for repeated queries.
|
||||
>
|
||||
> The package also handles 1000+ requests per second with linear scaling, making it perfect for production workloads."
|
||||
|
||||
---
|
||||
|
||||
### Scene 9: Wrap-up (8:30 - 9:00)
|
||||
|
||||
**Visual**: Return to terminal, show final commands
|
||||
|
||||
**Voice-over**:
|
||||
> "That's agentic-synth! To recap:
|
||||
> - Simple CLI and SDK interfaces
|
||||
> - AI-powered realistic data generation
|
||||
> - Time-series, events, and structured data support
|
||||
> - Streaming for large datasets
|
||||
> - Built-in caching for incredible performance
|
||||
> - Perfect for ML training, RAG systems, and testing
|
||||
>
|
||||
> Check out the documentation for more advanced examples, and give it a try in your next project!"
|
||||
|
||||
**Type**:
|
||||
```bash
|
||||
npm install @ruvector/agentic-synth
|
||||
```
|
||||
|
||||
**Visual**: Show GitHub repo with Star button
|
||||
|
||||
**Voice-over**:
|
||||
> "If you found this useful, star the repo on GitHub and let me know what you build with it. Thanks for watching!"
|
||||
|
||||
**Visual**: End card with links
|
||||
|
||||
---
|
||||
|
||||
## Visual Assets Needed
|
||||
|
||||
1. **Title Cards**:
|
||||
- Intro card with logo
|
||||
- Feature highlights card
|
||||
- End card with links
|
||||
|
||||
2. **Code Examples**:
|
||||
- Syntax highlighted in VS Code
|
||||
- Font: Fira Code or JetBrains Mono
|
||||
- Theme: Dark+ or Material Theme
|
||||
|
||||
3. **Terminal**:
|
||||
- Oh My Zsh with clean prompt
|
||||
- Colors: Nord or Dracula theme
|
||||
|
||||
4. **Data Visualizations**:
|
||||
- JSON output formatted with jq
|
||||
- Stock chart for time-series example
|
||||
- Progress bars for streaming
|
||||
|
||||
5. **Documentation**:
|
||||
- README.md rendered
|
||||
- Performance metrics table
|
||||
- Benchmark results
|
||||
|
||||
---
|
||||
|
||||
## Recording Tips
|
||||
|
||||
1. **Screen Setup**:
|
||||
- 1920x1080 resolution
|
||||
- Clean desktop, no distractions
|
||||
- Close unnecessary applications
|
||||
- Disable notifications
|
||||
|
||||
2. **Terminal Settings**:
|
||||
- Large font size (16-18pt)
|
||||
- High contrast theme
|
||||
- Slow down typing with tool like "Keycastr"
|
||||
|
||||
3. **Editor Settings**:
|
||||
- Zoom to 150-200%
|
||||
- Hide sidebars for cleaner view
|
||||
- Use presentation mode
|
||||
|
||||
4. **Audio**:
|
||||
- Use quality microphone
|
||||
- Record in quiet room
|
||||
- Speak clearly and at moderate pace
|
||||
- Add background music (subtle, low volume)
|
||||
|
||||
5. **Pacing**:
|
||||
- Pause between steps
|
||||
- Let output display for 2-3 seconds
|
||||
- Don't rush through commands
|
||||
- Leave time for viewers to read
|
||||
|
||||
---
|
||||
|
||||
## Post-Production Checklist
|
||||
|
||||
- [ ] Add title cards
|
||||
- [ ] Add transitions between scenes
|
||||
- [ ] Highlight important commands/output
|
||||
- [ ] Add annotations/callouts where helpful
|
||||
- [ ] Background music at 10-15% volume
|
||||
- [ ] Export at 1080p, 60fps
|
||||
- [ ] Generate subtitles/captions
|
||||
- [ ] Create thumbnail image
|
||||
- [ ] Upload to YouTube
|
||||
- [ ] Add to README as embedded video
|
||||
|
||||
---
|
||||
|
||||
## Video Description (for YouTube)
|
||||
|
||||
```markdown
|
||||
# Agentic-Synth: High-Performance Synthetic Data Generator
|
||||
|
||||
Generate realistic synthetic data for AI/ML training, RAG systems, and database seeding in minutes!
|
||||
|
||||
🔗 Links:
|
||||
- NPM: https://www.npmjs.com/package/@ruvector/agentic-synth
|
||||
- GitHub: https://github.com/ruvnet/ruvector/tree/main/packages/agentic-synth
|
||||
- Documentation: https://github.com/ruvnet/ruvector/blob/main/packages/agentic-synth/README.md
|
||||
|
||||
⚡ Performance:
|
||||
- Sub-millisecond P99 latencies
|
||||
- 85% cache hit rate
|
||||
- 1000+ req/s throughput
|
||||
- 95%+ speedup with caching
|
||||
|
||||
🎯 Use Cases:
|
||||
- Machine learning training data
|
||||
- RAG system data generation
|
||||
- Database seeding
|
||||
- API testing
|
||||
- Load testing
|
||||
|
||||
📚 Chapters:
|
||||
0:00 Introduction
|
||||
1:00 Installation & Setup
|
||||
2:00 CLI Usage
|
||||
3:00 SDK Usage
|
||||
4:00 Time-Series Data
|
||||
5:00 Streaming Large Datasets
|
||||
6:00 ML Training Example
|
||||
7:30 Performance Highlights
|
||||
8:30 Wrap-up
|
||||
|
||||
#machinelearning #AI #syntheticdata #typescript #nodejs #datascience #RAG
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Alternative: Live Coding Demo (15 min)
|
||||
|
||||
For a longer, more in-depth tutorial:
|
||||
|
||||
1. **Setup** (3 min): Project initialization, dependencies
|
||||
2. **Basic Generation** (3 min): Simple examples
|
||||
3. **Complex Schemas** (3 min): Nested structures, constraints
|
||||
4. **Integration** (3 min): Database seeding example
|
||||
5. **Performance** (2 min): Benchmarks and optimization
|
||||
6. **Q&A** (1 min): Common questions
|
||||
|
||||
---
|
||||
|
||||
**Script Version**: 1.0
|
||||
**Last Updated**: 2025-11-22
|
||||
**Status**: Ready for Recording 🎬
|
||||
Reference in New Issue
Block a user