Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

This commit is contained in:
ruv
2026-02-28 14:39:40 -05:00
7854 changed files with 3522914 additions and 0 deletions

# ruvector-scipix Examples
This directory contains comprehensive examples demonstrating various features and use cases of ruvector-scipix.
## Quick Start
All examples can be run using:
```bash
cargo run --example <example_name> -- [arguments]
```
## Examples Overview
### 1. Simple OCR (`simple_ocr.rs`)
**Basic OCR functionality with single image processing.**
Demonstrates:
- Loading and processing a single image
- OCR recognition
- Output in multiple formats (plain text, LaTeX)
- Confidence scores
**Usage:**
```bash
cargo run --example simple_ocr -- path/to/image.png
```
**Example Output:**
```
Plain Text: x² + 2x + 1 = 0
LaTeX: x^{2} + 2x + 1 = 0
Confidence: 95.3%
```
---
### 2. Batch Processing (`batch_processing.rs`)
**Parallel processing of multiple images with progress tracking.**
Demonstrates:
- Directory-based batch processing
- Parallel/concurrent processing
- Progress bar visualization
- Statistics and metrics
- JSON output
**Usage:**
```bash
cargo run --example batch_processing --features ocr -- /path/to/images output.json
```
**Features:**
- Automatic CPU core detection for optimal parallelism
- Real-time progress visualization
- Per-file error handling
- Aggregate statistics
---
### 3. API Server (`api_server.rs`)
**REST API server for OCR processing.**
Demonstrates:
- HTTP server with Axum framework
- Single and batch image processing endpoints
- Health check endpoint
- Graceful shutdown
- CORS support
- Multipart file uploads
**Usage:**
```bash
# Start server
cargo run --example api_server
# In another terminal, test the API
curl -X POST -F "image=@equation.png" http://localhost:8080/ocr
curl http://localhost:8080/health
```
**Endpoints:**
- `GET /health` - Health check
- `POST /ocr` - Process single image
- `POST /batch` - Process multiple images
---
### 4. Streaming Processing (`streaming.rs`)
**Streaming PDF processing with real-time results.**
Demonstrates:
- Large document processing
- Streaming results as pages are processed
- Real-time progress reporting
- Incremental JSON output
- Memory-efficient processing
**Usage:**
```bash
cargo run --example streaming -- document.pdf output/
```
**Features:**
- Processes pages concurrently (4 at a time)
- Saves individual page results immediately
- Generates final document summary
- Per-page timing statistics
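The windowed-concurrency idea above (at most four pages in flight at once) can be sketched with std threads alone. This is a simplified illustration, not the tokio-based `streaming.rs` itself, and `process_page` is a hypothetical stand-in for the real per-page OCR call:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for per-page OCR; the real example calls the engine here.
fn process_page(page: usize) -> String {
    format!("page {} done", page)
}

/// Process pages in windows of `window` concurrent workers,
/// collecting results in page order.
fn stream_pages(total: usize, window: usize) -> Vec<String> {
    let mut results = Vec::with_capacity(total);
    for chunk in (0..total).collect::<Vec<_>>().chunks(window) {
        let (tx, rx) = mpsc::channel();
        for &page in chunk {
            let tx = tx.clone();
            thread::spawn(move || tx.send((page, process_page(page))).unwrap());
        }
        // Drop the coordinator's sender so the receiver ends after this window.
        drop(tx);
        let mut chunk_results: Vec<(usize, String)> = rx.iter().collect();
        chunk_results.sort_by_key(|(p, _)| *p); // restore page order within the window
        results.extend(chunk_results.into_iter().map(|(_, r)| r));
    }
    results
}

fn main() {
    let results = stream_pages(10, 4);
    assert_eq!(results.len(), 10);
    println!("{:?}", results);
}
```

The window boundary is also where incremental output would be flushed, which is what keeps memory usage flat for large documents.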
---
### 5. Custom Pipeline (`custom_pipeline.rs`)
**Custom OCR pipeline with preprocessing and post-processing.**
Demonstrates:
- Image preprocessing (denoising, sharpening, binarization)
- Post-processing filters
- LaTeX validation
- Confidence filtering
- Custom output formatting
- Otsu's thresholding
**Usage:**
```bash
cargo run --example custom_pipeline -- image.png
```
**Pipeline Steps:**
1. **Preprocessing:**
- Denoising
- Contrast enhancement
- Sharpening
- Binarization (Otsu's method)
- Deskewing
2. **Post-processing:**
- Confidence filtering
- LaTeX validation
- Spell checking
- Custom formatting
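Otsu's method, used in the binarization step, picks the threshold that maximizes between-class variance over the grayscale histogram. A minimal self-contained sketch, operating on raw `u8` pixels rather than the `image` crate types used by the example:

```rust
/// Otsu's method: choose the threshold that maximizes between-class variance.
fn otsu_threshold(pixels: &[u8]) -> u8 {
    let mut hist = [0u64; 256];
    for &p in pixels {
        hist[p as usize] += 1;
    }
    let total = pixels.len() as f64;
    let sum_all: f64 = hist.iter().enumerate().map(|(i, &c)| i as f64 * c as f64).sum();

    let (mut sum_bg, mut weight_bg) = (0.0_f64, 0.0_f64);
    let (mut best_t, mut best_var) = (0u8, 0.0_f64);
    for t in 0..256 {
        weight_bg += hist[t] as f64;
        if weight_bg == 0.0 { continue; }
        let weight_fg = total - weight_bg;
        if weight_fg == 0.0 { break; }
        sum_bg += t as f64 * hist[t] as f64;
        let mean_bg = sum_bg / weight_bg;
        let mean_fg = (sum_all - sum_bg) / weight_fg;
        // Between-class variance for this candidate threshold
        // (unnormalized; scaling does not change the argmax).
        let var = weight_bg * weight_fg * (mean_bg - mean_fg).powi(2);
        if var > best_var {
            best_var = var;
            best_t = t as u8;
        }
    }
    best_t
}

fn main() {
    // A clearly bimodal "image": dark cluster at 10, bright cluster at 200.
    let mut pixels = vec![10u8; 100];
    pixels.extend(vec![200u8; 100]);
    let t = otsu_threshold(&pixels);
    assert!(t >= 10 && t < 200);
    println!("threshold = {}", t);
}
```

Pixels at or below the returned threshold become background; everything above becomes foreground.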
---
### 6. WASM Browser Demo (`wasm_demo.html`)
**Browser-based OCR demonstration.**
Demonstrates:
- WebAssembly integration
- Browser-based image upload
- Drag-and-drop interface
- Real-time visualization
- Client-side processing
**Setup:**
```bash
# Build WASM module (when available)
wasm-pack build --target web
# Serve the demo
python3 -m http.server 8000
# Open http://localhost:8000/examples/wasm_demo.html
```
**Features:**
- Modern, responsive UI
- Drag-and-drop file upload
- Live preview
- Real-time results
- No server required (runs in browser)
---
### 7. Agent-Based Processing (`lean_agentic.rs`)
**Distributed OCR processing with agent coordination.**
Demonstrates:
- Multi-agent coordination
- Distributed task processing
- Fault tolerance
- Load balancing
- Agent statistics
**Usage:**
```bash
cargo run --example lean_agentic -- /path/to/documents
```
**Features:**
- Spawns multiple OCR agents (default: 4)
- Automatic task distribution
- Per-agent statistics
- Throughput metrics
- JSON result export
**Architecture:**
```
Coordinator
├── Agent 1 (tasks: 12)
├── Agent 2 (tasks: 15)
├── Agent 3 (tasks: 11)
└── Agent 4 (tasks: 13)
```
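The coordinator/agent split shown above can be sketched with std threads and a shared work queue. This is a simplified illustration under the assumption that each agent pulls tasks from a shared queue until it is empty; the real example is async and OCR-specific:

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawn `n_agents` workers that drain a shared task queue and report
/// per-agent task counts back to the coordinator.
fn run_agents(tasks: Vec<String>, n_agents: usize) -> Vec<usize> {
    let queue = Arc::new(Mutex::new(tasks));
    let (tx, rx) = mpsc::channel();
    for agent_id in 0..n_agents {
        let queue = Arc::clone(&queue);
        let tx = tx.clone();
        thread::spawn(move || {
            let mut handled = 0;
            // Pulling from a shared queue gives automatic load balancing:
            // faster agents simply take more tasks.
            while let Some(_task) = queue.lock().unwrap().pop() {
                handled += 1;
            }
            tx.send((agent_id, handled)).unwrap();
        });
    }
    drop(tx);
    let mut counts = vec![0; n_agents];
    for (agent_id, handled) in rx {
        counts[agent_id] = handled;
    }
    counts
}

fn main() {
    let tasks: Vec<String> = (0..51).map(|i| format!("doc-{}.png", i)).collect();
    let counts = run_agents(tasks, 4);
    assert_eq!(counts.iter().sum::<usize>(), 51);
    println!("per-agent task counts: {:?}", counts);
}
```

The pull model also gives cheap fault tolerance: a slow or stalled agent never blocks the others, since unclaimed tasks stay on the queue.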
---
### 8. Accuracy Testing (`accuracy_test.rs`)
**OCR accuracy testing against ground truth datasets.**
Demonstrates:
- Dataset-based testing
- Multiple accuracy metrics
- Category-based analysis
- Statistical correlation
- Comprehensive reporting
**Usage:**
```bash
cargo run --example accuracy_test -- dataset.json
```
**Dataset Format:**
```json
[
{
"image_path": "tests/images/quadratic.png",
"ground_truth_text": "x^2 + 2x + 1 = 0",
"ground_truth_latex": "x^{2} + 2x + 1 = 0",
"category": "quadratic"
}
]
```
**Metrics Calculated:**
- **Text Accuracy** - Overall string similarity
- **Character Error Rate (CER)** - Character-level errors
- **Word Error Rate (WER)** - Word-level errors
- **LaTeX Accuracy** - LaTeX format correctness
- **Confidence Correlation** - Pearson correlation between confidence and accuracy
- **Category Breakdown** - Per-category statistics
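For reference, these metrics reduce to the standard edit-distance and correlation definitions, matching what `accuracy_test.rs` computes (with `lev` the character-level and `lev_word` the word-level Levenshtein distance, `t` the ground truth, `\hat{t}` the prediction, and `c_i`, `a_i` the per-case confidence and accuracy):

```latex
\mathrm{CER} = \frac{\mathrm{lev}(\hat{t}, t)}{|t|}
\qquad
\mathrm{WER} = \frac{\mathrm{lev}_{\mathrm{word}}(\hat{t}, t)}{|t|_{\mathrm{word}}}
\qquad
\mathrm{accuracy} = 1 - \frac{\mathrm{lev}(\hat{t}, t)}{\max(|\hat{t}|, |t|)}
```

```latex
r = \frac{\sum_i (c_i - \bar{c})(a_i - \bar{a})}
         {\sqrt{\sum_i (c_i - \bar{c})^2 \, \sum_i (a_i - \bar{a})^2}}
```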
**Example Output:**
```
Total Cases: 100
Successful: 98 (98.0%)
Average Confidence: 92.5%
Average Text Accuracy: 94.2%
Average CER: 3.1%
Average WER: 5.8%
Confidence Correlation: 0.847
Category Breakdown:
quadratic: 25 cases, 96.3% accuracy
linear: 30 cases, 98.1% accuracy
calculus: 20 cases, 89.7% accuracy
```
---
## Common Patterns
### Error Handling
All examples use `anyhow::Result` for error handling:
```rust
use anyhow::{Context, Result};
fn main() -> Result<()> {
let image = image::open(path)
.context("Failed to open image")?;
Ok(())
}
```
### Logging
Examples use `env_logger` for debug output:
```bash
# Run with debug logging
RUST_LOG=debug cargo run --example simple_ocr -- image.png
# Run with info logging (default)
RUST_LOG=info cargo run --example simple_ocr -- image.png
```
### Configuration
OCR engine configuration:
```rust
use ruvector_scipix::OcrConfig;
let config = OcrConfig {
confidence_threshold: 0.7,
max_image_size: 4096,
enable_preprocessing: true,
// ... other options
};
```
## Dependencies
Core dependencies used in examples:
- `anyhow` - Error handling
- `tokio` - Async runtime
- `image` - Image processing
- `serde/serde_json` - Serialization
- `indicatif` - Progress bars
- `axum` - HTTP server (api_server)
- `env_logger` - Logging
## Building Examples
Build all examples:
```bash
cargo build --examples
```
Build specific example:
```bash
cargo build --example simple_ocr
```
Run with optimizations:
```bash
cargo run --release --example batch_processing --features ocr -- images/ output.json
```
## Testing Examples
Create test images:
```bash
# Create test directory
mkdir -p test_images
# Add some test images
cp /path/to/math_equation.png test_images/
```
Run examples:
```bash
# Simple OCR
cargo run --example simple_ocr -- test_images/equation.png
# Batch processing
cargo run --example batch_processing -- test_images/ results.json
# Accuracy test (requires dataset)
cargo run --example accuracy_test -- test_dataset.json
```
## Integration Guide
### Using in Your Project
1. **Add dependency:**
```toml
[dependencies]
ruvector-scipix = "0.1.0"
```
2. **Basic usage:**
```rust
use ruvector_scipix::{OcrEngine, OcrConfig};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let config = OcrConfig::default();
let engine = OcrEngine::new(config).await?;
let image = image::open("equation.png")?;
let result = engine.recognize(&image).await?;
println!("Text: {}", result.text);
Ok(())
}
```
3. **Advanced usage:**
See individual examples for advanced patterns like:
- Custom pipelines
- Batch processing
- API integration
- Agent-based processing
## Performance Tips
1. **Batch Processing:**
- Use parallel processing for multiple images
- Adjust concurrency based on CPU cores
- Enable model caching for repeated runs
2. **Memory Management:**
- Stream large documents instead of loading all at once
- Use appropriate image resolution (downscale if needed)
- Clear cache periodically for long-running processes
3. **Accuracy vs Speed:**
- Higher confidence thresholds = more accuracy, slower processing
- Preprocessing improves accuracy but adds overhead
- Balance based on your use case
## Troubleshooting
### Common Issues
**"Model not found"**
```bash
# Download models first
./scripts/download_models.sh
```
**"Out of memory"**
- Reduce batch size or concurrent workers
- Downscale large images before processing
- Enable streaming for PDFs
**"Low confidence scores"**
- Enable preprocessing pipeline
- Improve image quality (resolution, contrast)
- Check for skewed or rotated images
## Contributing
When adding new examples:
1. Add the `.rs` file to `examples/`
2. Update `Cargo.toml` with example entry
3. Document in this README
4. Include usage examples and expected output
5. Add error handling and logging
6. Keep examples self-contained
## Resources
- [Main Documentation](../README.md)
- [API Reference](../docs/API.md)
- [Model Guide](../docs/MODELS.md)
- [Benchmarks](../benches/README.md)
## License
All examples are provided under the same license as ruvector-scipix.

//! Accuracy testing example
//!
//! This example demonstrates how to test OCR accuracy against a ground truth dataset.
//! It calculates various metrics including WER, CER, and confidence correlations.
//!
//! Usage:
//! ```bash
//! cargo run --example accuracy_test -- dataset.json
//! ```
//!
//! Dataset format (JSON):
//! ```json
//! [
//! {
//! "image_path": "path/to/image.png",
//! "ground_truth_text": "x^2 + 2x + 1 = 0",
//! "ground_truth_latex": "x^{2} + 2x + 1 = 0",
//! "category": "quadratic"
//! }
//! ]
//! ```
use anyhow::{Context, Result};
use ruvector_scipix::{OcrConfig, OcrEngine, OutputFormat};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
#[derive(Debug, Deserialize)]
struct TestCase {
image_path: String,
ground_truth_text: String,
ground_truth_latex: Option<String>,
category: Option<String>,
}
#[derive(Debug, Serialize)]
struct TestResult {
image_path: String,
category: String,
predicted_text: String,
predicted_latex: Option<String>,
ground_truth_text: String,
ground_truth_latex: Option<String>,
confidence: f32,
text_accuracy: f32,
latex_accuracy: Option<f32>,
character_error_rate: f32,
word_error_rate: f32,
}
#[derive(Debug, Serialize)]
struct AccuracyMetrics {
total_cases: usize,
successful_cases: usize,
failed_cases: usize,
average_confidence: f32,
average_text_accuracy: f32,
average_latex_accuracy: f32,
average_cer: f32,
average_wer: f32,
category_breakdown: HashMap<String, CategoryMetrics>,
confidence_correlation: f32,
}
#[derive(Debug, Serialize)]
struct CategoryMetrics {
count: usize,
average_accuracy: f32,
average_confidence: f32,
}
#[tokio::main]
async fn main() -> Result<()> {
let args: Vec<String> = std::env::args().collect();
if args.len() < 2 {
eprintln!("Usage: {} <dataset.json>", args[0]);
eprintln!("\nDataset format:");
eprintln!(
r#"[
{{
"image_path": "path/to/image.png",
"ground_truth_text": "x^2 + 2x + 1 = 0",
"ground_truth_latex": "x^{{2}} + 2x + 1 = 0",
"category": "quadratic"
}}
]"#
);
std::process::exit(1);
}
let dataset_path = &args[1];
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
println!("Loading test dataset: {}", dataset_path);
let dataset_content = std::fs::read_to_string(dataset_path)?;
let test_cases: Vec<TestCase> = serde_json::from_str(&dataset_content)?;
println!("Loaded {} test cases", test_cases.len());
// Initialize OCR engine
println!("Initializing OCR engine...");
let config = OcrConfig::default();
let engine = OcrEngine::new(config).await?;
println!("Running accuracy tests...\n");
let mut results = Vec::new();
for (idx, test_case) in test_cases.iter().enumerate() {
println!(
"[{}/{}] Processing: {}",
idx + 1,
test_cases.len(),
test_case.image_path
);
match run_test_case(&engine, test_case).await {
Ok(result) => {
println!(
" Accuracy: {:.2}%, CER: {:.2}%, WER: {:.2}%",
result.text_accuracy * 100.0,
result.character_error_rate * 100.0,
result.word_error_rate * 100.0
);
results.push(result);
}
Err(e) => {
eprintln!(" Error: {}", e);
}
}
}
// Calculate overall metrics
let metrics = calculate_metrics(&results);
// Display results
println!("\n{}", "=".repeat(80));
println!("Accuracy Test Results");
println!("{}", "=".repeat(80));
println!("Total Cases: {}", metrics.total_cases);
println!(
"Successful: {} ({:.1}%)",
metrics.successful_cases,
(metrics.successful_cases as f32 / metrics.total_cases as f32) * 100.0
);
println!("Failed: {}", metrics.failed_cases);
println!("\n📊 Overall Metrics:");
println!(
" Average Confidence: {:.2}%",
metrics.average_confidence * 100.0
);
println!(
" Average Text Accuracy: {:.2}%",
metrics.average_text_accuracy * 100.0
);
println!(
" Average LaTeX Accuracy: {:.2}%",
metrics.average_latex_accuracy * 100.0
);
println!(" Average CER: {:.2}%", metrics.average_cer * 100.0);
println!(" Average WER: {:.2}%", metrics.average_wer * 100.0);
println!(
" Confidence Correlation: {:.3}",
metrics.confidence_correlation
);
if !metrics.category_breakdown.is_empty() {
println!("\n📂 Category Breakdown:");
for (category, cat_metrics) in &metrics.category_breakdown {
println!(" {}:", category);
println!(" Count: {}", cat_metrics.count);
println!(
" Average Accuracy: {:.2}%",
cat_metrics.average_accuracy * 100.0
);
println!(
" Average Confidence: {:.2}%",
cat_metrics.average_confidence * 100.0
);
}
}
println!("{}", "=".repeat(80));
// Save detailed results
let json = serde_json::to_string_pretty(&serde_json::json!({
"metrics": metrics,
"results": results
}))?;
std::fs::write("accuracy_results.json", json)?;
println!("\nDetailed results saved to: accuracy_results.json");
Ok(())
}
async fn run_test_case(engine: &OcrEngine, test_case: &TestCase) -> Result<TestResult> {
let image = image::open(&test_case.image_path)
.context(format!("Failed to load image: {}", test_case.image_path))?;
let ocr_result = engine.recognize(&image).await?;
let predicted_text = ocr_result.text.clone();
let predicted_latex = ocr_result.to_format(OutputFormat::LaTeX).ok();
let text_accuracy = calculate_accuracy(&predicted_text, &test_case.ground_truth_text);
let latex_accuracy =
if let (Some(pred), Some(gt)) = (&predicted_latex, &test_case.ground_truth_latex) {
Some(calculate_accuracy(pred, gt))
} else {
None
};
let cer = calculate_character_error_rate(&predicted_text, &test_case.ground_truth_text);
let wer = calculate_word_error_rate(&predicted_text, &test_case.ground_truth_text);
Ok(TestResult {
image_path: test_case.image_path.clone(),
category: test_case
.category
.clone()
.unwrap_or_else(|| "uncategorized".to_string()),
predicted_text,
predicted_latex,
ground_truth_text: test_case.ground_truth_text.clone(),
ground_truth_latex: test_case.ground_truth_latex.clone(),
confidence: ocr_result.confidence,
text_accuracy,
latex_accuracy,
character_error_rate: cer,
word_error_rate: wer,
})
}
fn calculate_accuracy(predicted: &str, ground_truth: &str) -> f32 {
let distance = levenshtein_distance(predicted, ground_truth);
// Count characters, not bytes, so multi-byte symbols (e.g. "²") are measured correctly.
let max_len = predicted.chars().count().max(ground_truth.chars().count());
if max_len == 0 {
return 1.0;
}
1.0 - (distance as f32 / max_len as f32)
}
fn calculate_character_error_rate(predicted: &str, ground_truth: &str) -> f32 {
let distance = levenshtein_distance(predicted, ground_truth);
let gt_len = ground_truth.chars().count();
if gt_len == 0 {
return if predicted.is_empty() { 0.0 } else { 1.0 };
}
distance as f32 / gt_len as f32
}
fn calculate_word_error_rate(predicted: &str, ground_truth: &str) -> f32 {
let pred_words: Vec<&str> = predicted.split_whitespace().collect();
let gt_words: Vec<&str> = ground_truth.split_whitespace().collect();
let distance = levenshtein_distance_vec(&pred_words, &gt_words);
if gt_words.is_empty() {
return if pred_words.is_empty() { 0.0 } else { 1.0 };
}
distance as f32 / gt_words.len() as f32
}
fn levenshtein_distance(s1: &str, s2: &str) -> usize {
// Compare char sequences rather than bytes: sizing the DP table with byte
// lengths while iterating chars breaks on multi-byte UTF-8 input.
let chars1: Vec<char> = s1.chars().collect();
let chars2: Vec<char> = s2.chars().collect();
levenshtein_distance_vec(&chars1, &chars2)
}
fn levenshtein_distance_vec<T: Eq>(s1: &[T], s2: &[T]) -> usize {
let len1 = s1.len();
let len2 = s2.len();
let mut matrix = vec![vec![0; len2 + 1]; len1 + 1];
for i in 0..=len1 {
matrix[i][0] = i;
}
for j in 0..=len2 {
matrix[0][j] = j;
}
for i in 0..len1 {
for j in 0..len2 {
let cost = if s1[i] == s2[j] { 0 } else { 1 };
matrix[i + 1][j + 1] = *[
matrix[i][j + 1] + 1,
matrix[i + 1][j] + 1,
matrix[i][j] + cost,
]
.iter()
.min()
.unwrap();
}
}
matrix[len1][len2]
}
fn calculate_metrics(results: &[TestResult]) -> AccuracyMetrics {
// Failed cases are logged and skipped in main(), so `results` only
// contains successful recognitions.
let total_cases = results.len();
let successful_cases = results.len();
let failed_cases = 0;
// Guard against an empty result set so the averages stay finite.
let denom = total_cases.max(1) as f32;
let average_confidence = results.iter().map(|r| r.confidence).sum::<f32>() / denom;
let average_text_accuracy = results.iter().map(|r| r.text_accuracy).sum::<f32>() / denom;
let latex_count = results
.iter()
.filter(|r| r.latex_accuracy.is_some())
.count();
let average_latex_accuracy = if latex_count > 0 {
results.iter().filter_map(|r| r.latex_accuracy).sum::<f32>() / latex_count as f32
} else {
0.0
};
let average_cer = results.iter().map(|r| r.character_error_rate).sum::<f32>() / denom;
let average_wer = results.iter().map(|r| r.word_error_rate).sum::<f32>() / denom;
// Calculate category breakdown
let mut category_breakdown = HashMap::new();
for result in results {
let entry = category_breakdown
.entry(result.category.clone())
.or_insert_with(|| CategoryMetrics {
count: 0,
average_accuracy: 0.0,
average_confidence: 0.0,
});
entry.count += 1;
entry.average_accuracy += result.text_accuracy;
entry.average_confidence += result.confidence;
}
for metrics in category_breakdown.values_mut() {
metrics.average_accuracy /= metrics.count as f32;
metrics.average_confidence /= metrics.count as f32;
}
// Calculate confidence correlation (Pearson correlation)
let confidence_correlation = calculate_pearson_correlation(
&results.iter().map(|r| r.confidence).collect::<Vec<_>>(),
&results.iter().map(|r| r.text_accuracy).collect::<Vec<_>>(),
);
AccuracyMetrics {
total_cases,
successful_cases,
failed_cases,
average_confidence,
average_text_accuracy,
average_latex_accuracy,
average_cer,
average_wer,
category_breakdown,
confidence_correlation,
}
}
fn calculate_pearson_correlation(x: &[f32], y: &[f32]) -> f32 {
// An empty or mismatched sample has no defined correlation; return 0.
if x.is_empty() || x.len() != y.len() {
return 0.0;
}
let n = x.len() as f32;
let mean_x = x.iter().sum::<f32>() / n;
let mean_y = y.iter().sum::<f32>() / n;
let mut numerator = 0.0;
let mut sum_sq_x = 0.0;
let mut sum_sq_y = 0.0;
for i in 0..x.len() {
let diff_x = x[i] - mean_x;
let diff_y = y[i] - mean_y;
numerator += diff_x * diff_y;
sum_sq_x += diff_x * diff_x;
sum_sq_y += diff_y * diff_y;
}
if sum_sq_x == 0.0 || sum_sq_y == 0.0 {
return 0.0;
}
numerator / (sum_sq_x * sum_sq_y).sqrt()
}

//! API server example
//!
//! This example demonstrates how to create a REST API server for OCR processing.
//! It includes model preloading, graceful shutdown, and health checks.
//!
//! Usage:
//! ```bash
//! cargo run --example api_server
//!
//! # Then in another terminal:
//! curl -X POST -F "image=@equation.png" http://localhost:8080/ocr
//! ```
use axum::{
extract::{Multipart, State},
http::StatusCode,
response::{IntoResponse, Json},
routing::{get, post},
Router,
};
use ruvector_scipix::{OcrConfig, OcrEngine, OutputFormat};
use serde::{Deserialize, Serialize};
use std::sync::Arc;
use tokio::signal;
use tower_http::cors::CorsLayer;
#[derive(Clone)]
struct AppState {
engine: Arc<OcrEngine>,
}
#[derive(Serialize, Deserialize)]
struct OcrResponse {
success: bool,
text: Option<String>,
latex: Option<String>,
confidence: Option<f32>,
error: Option<String>,
}
#[derive(Serialize)]
struct HealthResponse {
status: String,
version: String,
models_loaded: bool,
}
#[tokio::main]
async fn main() -> anyhow::Result<()> {
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
println!("Initializing OCR engine...");
// Configure OCR engine
let config = OcrConfig::default();
let engine = OcrEngine::new(config).await?;
// Preload models for faster first request
println!("Preloading models...");
// TODO: Add model preloading method to OcrEngine
let state = AppState {
engine: Arc::new(engine),
};
// Build router
let app = Router::new()
.route("/health", get(health_check))
.route("/ocr", post(process_ocr))
.route("/batch", post(process_batch))
.layer(CorsLayer::permissive())
.with_state(state);
let addr = "0.0.0.0:8080";
println!("Starting server on http://{}", addr);
println!("\nEndpoints:");
println!(" GET /health - Health check");
println!(" POST /ocr - Process single image");
println!(" POST /batch - Process multiple images");
println!("\nPress Ctrl+C to shutdown");
let listener = tokio::net::TcpListener::bind(addr).await?;
// Run server with graceful shutdown
axum::serve(listener, app)
.with_graceful_shutdown(shutdown_signal())
.await?;
println!("\nServer shutdown complete");
Ok(())
}
async fn health_check() -> impl IntoResponse {
Json(HealthResponse {
status: "healthy".to_string(),
version: env!("CARGO_PKG_VERSION").to_string(),
models_loaded: true,
})
}
async fn process_ocr(State(state): State<AppState>, mut multipart: Multipart) -> impl IntoResponse {
// Treat a malformed multipart stream as "no more fields" rather than panicking.
while let Ok(Some(field)) = multipart.next_field().await {
if field.name() == Some("image") {
let data = match field.bytes().await {
Ok(bytes) => bytes,
Err(e) => {
return (
StatusCode::BAD_REQUEST,
Json(OcrResponse {
success: false,
text: None,
latex: None,
confidence: None,
error: Some(format!("Failed to read image: {}", e)),
}),
);
}
};
let image = match image::load_from_memory(&data) {
Ok(img) => img,
Err(e) => {
return (
StatusCode::BAD_REQUEST,
Json(OcrResponse {
success: false,
text: None,
latex: None,
confidence: None,
error: Some(format!("Invalid image format: {}", e)),
}),
);
}
};
match state.engine.recognize(&image).await {
Ok(result) => {
return (
StatusCode::OK,
Json(OcrResponse {
success: true,
text: Some(result.text.clone()),
latex: result.to_format(OutputFormat::LaTeX).ok(),
confidence: Some(result.confidence),
error: None,
}),
);
}
Err(e) => {
return (
StatusCode::INTERNAL_SERVER_ERROR,
Json(OcrResponse {
success: false,
text: None,
latex: None,
confidence: None,
error: Some(format!("OCR failed: {}", e)),
}),
);
}
}
}
}
(
StatusCode::BAD_REQUEST,
Json(OcrResponse {
success: false,
text: None,
latex: None,
confidence: None,
error: Some("No image field found".to_string()),
}),
)
}
async fn process_batch(
State(state): State<AppState>,
mut multipart: Multipart,
) -> impl IntoResponse {
let mut results = Vec::new();
// Treat a malformed multipart stream as "no more fields" rather than panicking.
while let Ok(Some(field)) = multipart.next_field().await {
if field.name() == Some("images") {
let data = match field.bytes().await {
Ok(bytes) => bytes,
Err(e) => {
results.push(OcrResponse {
success: false,
text: None,
latex: None,
confidence: None,
error: Some(format!("Failed to read image: {}", e)),
});
continue;
}
};
let image = match image::load_from_memory(&data) {
Ok(img) => img,
Err(e) => {
results.push(OcrResponse {
success: false,
text: None,
latex: None,
confidence: None,
error: Some(format!("Invalid image: {}", e)),
});
continue;
}
};
match state.engine.recognize(&image).await {
Ok(result) => {
results.push(OcrResponse {
success: true,
text: Some(result.text.clone()),
latex: result.to_format(OutputFormat::LaTeX).ok(),
confidence: Some(result.confidence),
error: None,
});
}
Err(e) => {
results.push(OcrResponse {
success: false,
text: None,
latex: None,
confidence: None,
error: Some(format!("OCR failed: {}", e)),
});
}
}
}
}
(StatusCode::OK, Json(results))
}
async fn shutdown_signal() {
let ctrl_c = async {
signal::ctrl_c()
.await
.expect("Failed to install Ctrl+C handler");
};
#[cfg(unix)]
let terminate = async {
signal::unix::signal(signal::unix::SignalKind::terminate())
.expect("Failed to install signal handler")
.recv()
.await;
};
#[cfg(not(unix))]
let terminate = std::future::pending::<()>();
tokio::select! {
_ = ctrl_c => {
println!("\nReceived Ctrl+C, shutting down gracefully...");
},
_ = terminate => {
println!("\nReceived termination signal, shutting down gracefully...");
},
}
}

//! Batch processing example
//!
//! This example demonstrates parallel batch processing of multiple images.
//! It processes all images in a directory concurrently with a progress bar.
//!
//! Note: This example requires the `ocr` feature to be enabled.
//!
//! Usage:
//! ```bash
//! cargo run --example batch_processing --features ocr -- /path/to/images output.json
//! ```
use anyhow::Result;
use indicatif::{ProgressBar, ProgressStyle};
use ruvector_scipix::ocr::OcrEngine;
use ruvector_scipix::OcrConfig;
use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};
use std::sync::Arc;
use tokio::sync::Semaphore;
#[derive(Debug, Serialize, Deserialize)]
struct BatchResult {
file_path: String,
success: bool,
text: Option<String>,
latex: Option<String>,
confidence: Option<f32>,
error: Option<String>,
}
#[tokio::main]
async fn main() -> Result<()> {
let args: Vec<String> = std::env::args().collect();
if args.len() < 3 {
eprintln!("Usage: {} <image_directory> <output_json>", args[0]);
eprintln!("\nExample:");
eprintln!(" {} ./images results.json", args[0]);
std::process::exit(1);
}
let image_dir = Path::new(&args[1]);
let output_file = &args[2];
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
// Collect all image files
let image_files = collect_image_files(image_dir)?;
if image_files.is_empty() {
eprintln!("No image files found in: {}", image_dir.display());
std::process::exit(1);
}
println!("Found {} images to process", image_files.len());
// Initialize OCR engine
let config = OcrConfig::default();
let engine = Arc::new(OcrEngine::new(config).await?);
// Create progress bar
let progress = ProgressBar::new(image_files.len() as u64);
progress.set_style(
ProgressStyle::default_bar()
.template("[{elapsed_precise}] {bar:40.cyan/blue} {pos}/{len} {msg}")
.unwrap()
.progress_chars("=>-"),
);
// Limit concurrent processing to avoid overwhelming the system
let max_concurrent = num_cpus::get();
let semaphore = Arc::new(Semaphore::new(max_concurrent));
// Process images in parallel
let mut tasks = Vec::new();
for image_path in image_files {
let engine = Arc::clone(&engine);
let semaphore = Arc::clone(&semaphore);
let progress = progress.clone();
let task = tokio::spawn(async move {
let _permit = semaphore.acquire().await.unwrap();
let result = process_image(&engine, &image_path).await;
progress.inc(1);
result
});
tasks.push(task);
}
// Wait for all tasks to complete
let mut results = Vec::new();
for task in tasks {
results.push(task.await?);
}
progress.finish_with_message("Complete");
// Calculate statistics
let successful = results.iter().filter(|r| r.success).count();
let failed = results.len() - successful;
// Avoid NaN when every image failed.
let avg_confidence = if successful > 0 {
results.iter().filter_map(|r| r.confidence).sum::<f32>() / successful as f32
} else {
0.0
};
println!("\n{}", "=".repeat(80));
println!("Batch Processing Complete");
println!("{}", "=".repeat(80));
println!("Total: {}", results.len());
println!(
"Successful: {} ({:.1}%)",
successful,
(successful as f32 / results.len() as f32) * 100.0
);
println!("Failed: {}", failed);
println!("Average Confidence: {:.2}%", avg_confidence * 100.0);
println!("{}", "=".repeat(80));
// Save results to JSON
let json = serde_json::to_string_pretty(&results)?;
std::fs::write(output_file, json)?;
println!("\nResults saved to: {}", output_file);
Ok(())
}
fn collect_image_files(dir: &Path) -> Result<Vec<PathBuf>> {
let mut files = Vec::new();
let extensions = ["png", "jpg", "jpeg", "bmp", "tiff", "webp"];
for entry in std::fs::read_dir(dir)? {
let entry = entry?;
let path = entry.path();
if path.is_file() {
if let Some(ext) = path.extension() {
if extensions.contains(&ext.to_str().unwrap_or("").to_lowercase().as_str()) {
files.push(path);
}
}
}
}
Ok(files)
}
async fn process_image(engine: &OcrEngine, path: &Path) -> BatchResult {
let file_path = path.to_string_lossy().to_string();
match image::open(path) {
Ok(img) => match engine.recognize(&img).await {
Ok(result) => BatchResult {
file_path,
success: true,
text: Some(result.text.clone()),
latex: result.to_format(ruvector_scipix::OutputFormat::LaTeX).ok(),
confidence: Some(result.confidence),
error: None,
},
Err(e) => BatchResult {
file_path,
success: false,
text: None,
latex: None,
confidence: None,
error: Some(e.to_string()),
},
},
Err(e) => BatchResult {
file_path,
success: false,
text: None,
latex: None,
confidence: None,
error: Some(e.to_string()),
},
}
}

//! Custom pipeline example
//!
//! This example demonstrates how to create a custom OCR pipeline with:
//! - Custom preprocessing steps
//! - Post-processing filters
//! - Integration with external services
//! - Custom output formatting
//!
//! Usage:
//! ```bash
//! cargo run --example custom_pipeline -- image.png
//! ```
use anyhow::{Context, Result};
use image::{DynamicImage, ImageBuffer, Luma};
use ruvector_scipix::{OcrConfig, OcrEngine, OcrResult, OutputFormat};
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone)]
struct CustomPipeline {
engine: OcrEngine,
preprocessing: Vec<PreprocessStep>,
postprocessing: Vec<PostprocessStep>,
}
#[derive(Debug, Clone)]
enum PreprocessStep {
Denoise,
Sharpen,
ContrastEnhancement,
Binarization,
Deskew,
}
#[derive(Debug, Clone)]
enum PostprocessStep {
SpellCheck,
LatexValidation,
ConfidenceFilter(f32),
CustomFormatter,
}
#[derive(Debug, Serialize, Deserialize)]
struct PipelineResult {
original_result: String,
processed_result: String,
latex: String,
confidence: f32,
preprocessing_steps: Vec<String>,
postprocessing_steps: Vec<String>,
validation_results: ValidationResults,
}
#[derive(Debug, Serialize, Deserialize)]
struct ValidationResults {
latex_valid: bool,
spell_check_corrections: usize,
confidence_threshold_passed: bool,
}
impl CustomPipeline {
async fn new(config: OcrConfig) -> Result<Self> {
let engine = OcrEngine::new(config).await?;
Ok(Self {
engine,
preprocessing: vec![
PreprocessStep::Denoise,
PreprocessStep::ContrastEnhancement,
PreprocessStep::Sharpen,
PreprocessStep::Binarization,
],
postprocessing: vec![
PostprocessStep::ConfidenceFilter(0.7),
PostprocessStep::LatexValidation,
PostprocessStep::SpellCheck,
PostprocessStep::CustomFormatter,
],
})
}
async fn process(&self, image: DynamicImage) -> Result<PipelineResult> {
// Apply preprocessing steps
let mut processed_image = image;
let mut preprocessing_log = Vec::new();
for step in &self.preprocessing {
processed_image = self.apply_preprocessing(processed_image, step)?;
preprocessing_log.push(format!("{:?}", step));
}
// Run OCR
let ocr_result = self.engine.recognize(&processed_image).await?;
// Apply postprocessing steps
let mut result_text = ocr_result.text.clone();
let mut postprocessing_log = Vec::new();
let mut validation = ValidationResults {
latex_valid: false,
spell_check_corrections: 0,
confidence_threshold_passed: false,
};
for step in &self.postprocessing {
let (new_text, step_validation) =
self.apply_postprocessing(result_text.clone(), &ocr_result, step)?;
result_text = new_text;
postprocessing_log.push(format!("{:?}", step));
// Update validation results
match step {
PostprocessStep::LatexValidation => {
validation.latex_valid = step_validation.unwrap_or(false);
}
PostprocessStep::SpellCheck => {
validation.spell_check_corrections = step_validation.unwrap_or(0) as usize;
}
PostprocessStep::ConfidenceFilter(threshold) => {
validation.confidence_threshold_passed = ocr_result.confidence >= *threshold;
}
_ => {}
}
}
Ok(PipelineResult {
original_result: ocr_result.text.clone(),
processed_result: result_text,
latex: ocr_result.to_format(OutputFormat::LaTeX)?,
confidence: ocr_result.confidence,
preprocessing_steps: preprocessing_log,
postprocessing_steps: postprocessing_log,
validation_results: validation,
})
}
fn apply_preprocessing(
&self,
image: DynamicImage,
step: &PreprocessStep,
) -> Result<DynamicImage> {
match step {
PreprocessStep::Denoise => Ok(denoise_image(image)),
PreprocessStep::Sharpen => Ok(sharpen_image(image)),
PreprocessStep::ContrastEnhancement => Ok(enhance_contrast(image)),
PreprocessStep::Binarization => Ok(binarize_image(image)),
PreprocessStep::Deskew => Ok(deskew_image(image)),
}
}
fn apply_postprocessing(
&self,
text: String,
result: &OcrResult,
step: &PostprocessStep,
) -> Result<(String, Option<i32>)> {
match step {
PostprocessStep::SpellCheck => {
let (corrected, corrections) = spell_check(&text);
Ok((corrected, Some(corrections as i32)))
}
PostprocessStep::LatexValidation => {
let valid = validate_latex(&text);
Ok((text, Some(if valid { 1 } else { 0 })))
}
PostprocessStep::ConfidenceFilter(threshold) => {
if result.confidence >= *threshold {
Ok((text, Some(1)))
} else {
Ok((format!("[Low Confidence] {}", text), Some(0)))
}
}
PostprocessStep::CustomFormatter => {
let formatted = custom_format(&text);
Ok((formatted, None))
}
}
}
}
// Preprocessing implementations
fn denoise_image(image: DynamicImage) -> DynamicImage {
    // Simplified denoising using a light Gaussian blur (a median filter would preserve edges better)
image.blur(1.0)
}
fn sharpen_image(image: DynamicImage) -> DynamicImage {
// Simplified sharpening
image.unsharpen(2.0, 1)
}
fn enhance_contrast(image: DynamicImage) -> DynamicImage {
// Simplified contrast enhancement
image.adjust_contrast(20.0)
}
fn binarize_image(image: DynamicImage) -> DynamicImage {
// Otsu's binarization (simplified)
let gray = image.to_luma8();
let threshold = calculate_otsu_threshold(&gray);
let binary = ImageBuffer::from_fn(gray.width(), gray.height(), |x, y| {
let pixel = gray.get_pixel(x, y)[0];
if pixel > threshold {
Luma([255u8])
} else {
Luma([0u8])
}
});
DynamicImage::ImageLuma8(binary)
}
fn deskew_image(image: DynamicImage) -> DynamicImage {
// Simplified deskew - in production, use Hough transform
image
}
fn calculate_otsu_threshold(gray: &ImageBuffer<Luma<u8>, Vec<u8>>) -> u8 {
// Simplified Otsu's method
let mut histogram = [0u32; 256];
for pixel in gray.pixels() {
histogram[pixel[0] as usize] += 1;
}
let total = gray.width() * gray.height();
let mut sum = 0u64;
for (i, &count) in histogram.iter().enumerate() {
sum += i as u64 * count as u64;
}
let mut sum_background = 0u64;
let mut weight_background = 0u32;
let mut max_variance = 0.0f64;
let mut threshold = 0u8;
for (t, &count) in histogram.iter().enumerate() {
weight_background += count;
if weight_background == 0 {
continue;
}
let weight_foreground = total - weight_background;
if weight_foreground == 0 {
break;
}
sum_background += t as u64 * count as u64;
let mean_background = sum_background as f64 / weight_background as f64;
let mean_foreground = (sum - sum_background) as f64 / weight_foreground as f64;
let variance = weight_background as f64
* weight_foreground as f64
* (mean_background - mean_foreground).powi(2);
if variance > max_variance {
max_variance = variance;
threshold = t as u8;
}
}
threshold
}
// Postprocessing implementations
fn spell_check(text: &str) -> (String, usize) {
// Simplified spell check - in production, use a proper library
// For demo, just return the original text
(text.to_string(), 0)
}
fn validate_latex(text: &str) -> bool {
// Simplified LaTeX validation
// Check for balanced braces and common LaTeX patterns
let open_braces = text.matches('{').count();
let close_braces = text.matches('}').count();
open_braces == close_braces
}
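// Note: the brace-count check above accepts out-of-order input such as "}{".
// A stricter, order-aware variant (illustrative only, not wired into the
// pipeline) tracks nesting depth instead:
#[allow(dead_code)]
fn validate_latex_ordered(text: &str) -> bool {
    let mut depth: i64 = 0;
    for c in text.chars() {
        match c {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                // A closing brace before its matching open is invalid.
                if depth < 0 {
                    return false;
                }
            }
            _ => {}
        }
    }
    depth == 0
}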
fn custom_format(text: &str) -> String {
// Custom formatting - e.g., add proper spacing, formatting
text.lines()
.map(|line| line.trim())
.filter(|line| !line.is_empty())
.collect::<Vec<_>>()
.join("\n")
}
#[tokio::main]
async fn main() -> Result<()> {
let args: Vec<String> = std::env::args().collect();
if args.len() < 2 {
eprintln!("Usage: {} <image_path>", args[0]);
std::process::exit(1);
}
let image_path = &args[1];
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
println!("Loading image: {}", image_path);
let image = image::open(image_path)?;
// Create custom pipeline
let config = OcrConfig::default();
let pipeline = CustomPipeline::new(config).await?;
println!("Processing with custom pipeline...");
let result = pipeline.process(image).await?;
// Display results
println!("\n{}", "=".repeat(80));
println!("Pipeline Results");
println!("{}", "=".repeat(80));
println!("\n📝 Original OCR Result:");
println!("{}", result.original_result);
println!("\n✨ Processed Result:");
println!("{}", result.processed_result);
println!("\n🔢 LaTeX:");
println!("{}", result.latex);
println!("\n📊 Confidence: {:.2}%", result.confidence * 100.0);
println!("\n🔧 Preprocessing Steps:");
for step in &result.preprocessing_steps {
println!(" - {}", step);
}
println!("\n🔄 Postprocessing Steps:");
for step in &result.postprocessing_steps {
println!(" - {}", step);
}
println!("\n✅ Validation:");
println!(" LaTeX Valid: {}", result.validation_results.latex_valid);
println!(
" Spell Corrections: {}",
result.validation_results.spell_check_corrections
);
println!(
" Confidence Passed: {}",
result.validation_results.confidence_threshold_passed
);
println!("\n{}", "=".repeat(80));
// Save full results
let json = serde_json::to_string_pretty(&result)?;
std::fs::write("pipeline_results.json", json)?;
println!("\nFull results saved to: pipeline_results.json");
Ok(())
}


@@ -0,0 +1,303 @@
//! Lean Agentic integration example
//!
//! This example demonstrates distributed OCR processing using agent coordination.
//! Multiple agents work together to process documents in parallel with fault tolerance.
//!
//! Usage:
//! ```bash
//! cargo run --example lean_agentic -- /path/to/documents
//! ```
use anyhow::{Context, Result};
use ruvector_scipix::{OcrConfig, OcrEngine};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::path::Path;
use std::sync::Arc;
use tokio::sync::{mpsc, RwLock};
#[derive(Debug, Clone, Serialize, Deserialize)]
struct OcrTask {
id: String,
file_path: String,
priority: u8,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
struct OcrTaskResult {
task_id: String,
agent_id: String,
success: bool,
text: Option<String>,
latex: Option<String>,
confidence: Option<f32>,
processing_time_ms: u64,
error: Option<String>,
}
#[derive(Debug, Clone)]
struct OcrAgent {
id: String,
engine: Arc<OcrEngine>,
tasks_completed: Arc<RwLock<usize>>,
}
impl OcrAgent {
async fn new(id: String, config: OcrConfig) -> Result<Self> {
let engine = OcrEngine::new(config).await?;
Ok(Self {
id,
engine: Arc::new(engine),
tasks_completed: Arc::new(RwLock::new(0)),
})
}
async fn process_task(&self, task: OcrTask) -> OcrTaskResult {
let start = std::time::Instant::now();
println!("[Agent {}] Processing task: {}", self.id, task.id);
let result = match image::open(&task.file_path) {
Ok(img) => match self.engine.recognize(&img).await {
Ok(ocr_result) => {
let mut count = self.tasks_completed.write().await;
*count += 1;
OcrTaskResult {
task_id: task.id,
agent_id: self.id.clone(),
success: true,
text: Some(ocr_result.text.clone()),
latex: ocr_result
.to_format(ruvector_scipix::OutputFormat::LaTeX)
.ok(),
confidence: Some(ocr_result.confidence),
processing_time_ms: start.elapsed().as_millis() as u64,
error: None,
}
}
Err(e) => OcrTaskResult {
task_id: task.id,
agent_id: self.id.clone(),
success: false,
text: None,
latex: None,
confidence: None,
processing_time_ms: start.elapsed().as_millis() as u64,
error: Some(e.to_string()),
},
},
Err(e) => OcrTaskResult {
task_id: task.id,
agent_id: self.id.clone(),
success: false,
text: None,
latex: None,
confidence: None,
processing_time_ms: start.elapsed().as_millis() as u64,
error: Some(e.to_string()),
},
};
println!(
"[Agent {}] Completed task: {} ({}ms)",
self.id, result.task_id, result.processing_time_ms
);
result
}
async fn get_stats(&self) -> usize {
*self.tasks_completed.read().await
}
}
struct AgentCoordinator {
agents: Vec<Arc<OcrAgent>>,
task_queue: mpsc::Sender<OcrTask>,
result_queue: mpsc::Receiver<OcrTaskResult>,
results: Arc<RwLock<HashMap<String, OcrTaskResult>>>,
}
impl AgentCoordinator {
async fn new(num_agents: usize, config: OcrConfig) -> Result<Self> {
let mut agents = Vec::new();
for i in 0..num_agents {
let agent = OcrAgent::new(format!("agent-{}", i), config.clone()).await?;
agents.push(Arc::new(agent));
}
        let (task_tx, task_rx) = mpsc::channel::<OcrTask>(100);
        let (result_tx, result_rx) = mpsc::channel::<OcrTaskResult>(100);
        // tokio's mpsc receiver cannot be cloned (resubscribe only exists on
        // broadcast channels), so the workers share it behind a mutex and
        // take turns pulling the next task.
        let task_rx = Arc::new(tokio::sync::Mutex::new(task_rx));
        // Spawn agent workers
        for agent in &agents {
            let agent = Arc::clone(agent);
            let task_rx = Arc::clone(&task_rx);
            let result_tx = result_tx.clone();
            tokio::spawn(async move {
                loop {
                    // Hold the lock only while waiting for one task.
                    let task = { task_rx.lock().await.recv().await };
                    match task {
                        Some(task) => {
                            let result = agent.process_task(task).await;
                            let _ = result_tx.send(result).await;
                        }
                        None => break,
                    }
                }
            });
        }
Ok(Self {
agents,
task_queue: task_tx,
result_queue: result_rx,
results: Arc::new(RwLock::new(HashMap::new())),
})
}
async fn submit_task(&self, task: OcrTask) -> Result<()> {
self.task_queue
.send(task)
.await
.context("Failed to submit task")?;
Ok(())
}
async fn collect_results(&mut self, expected: usize) -> Vec<OcrTaskResult> {
let mut collected = Vec::new();
while collected.len() < expected {
if let Some(result) = self.result_queue.recv().await {
let mut results = self.results.write().await;
results.insert(result.task_id.clone(), result.clone());
collected.push(result);
}
}
collected
}
async fn get_agent_stats(&self) -> HashMap<String, usize> {
let mut stats = HashMap::new();
for agent in &self.agents {
let count = agent.get_stats().await;
stats.insert(agent.id.clone(), count);
}
stats
}
}
// Note: This is a simplified implementation. In production, you would integrate with
// an actual agent framework like:
// - lean_agentic crate for agent coordination
// - tokio actors for distributed processing
// - Or a custom agent framework
#[tokio::main]
async fn main() -> Result<()> {
let args: Vec<String> = std::env::args().collect();
if args.len() < 2 {
eprintln!("Usage: {} <documents_directory>", args[0]);
eprintln!("\nExample:");
eprintln!(" {} ./documents", args[0]);
std::process::exit(1);
}
let docs_dir = Path::new(&args[1]);
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
println!("🤖 Initializing Agent Swarm...");
// Create agent coordinator with 4 agents
let num_agents = 4;
let config = OcrConfig::default();
let mut coordinator = AgentCoordinator::new(num_agents, config).await?;
println!("✅ Spawned {} OCR agents", num_agents);
// Collect tasks
let mut tasks = Vec::new();
for entry in std::fs::read_dir(docs_dir)? {
let entry = entry?;
let path = entry.path();
if path.is_file() {
if let Some(ext) = path.extension() {
let ext_str = ext.to_str().unwrap_or("").to_lowercase();
if ["png", "jpg", "jpeg", "bmp", "tiff", "webp"].contains(&ext_str.as_str()) {
let task = OcrTask {
id: format!("task-{}", tasks.len()),
file_path: path.to_string_lossy().to_string(),
priority: 1,
};
tasks.push(task);
}
}
}
}
if tasks.is_empty() {
eprintln!("No image files found in: {}", docs_dir.display());
std::process::exit(1);
}
println!("📋 Queued {} tasks for processing", tasks.len());
// Submit all tasks
for task in &tasks {
coordinator.submit_task(task.clone()).await?;
}
println!("🚀 Processing started...\n");
let start_time = std::time::Instant::now();
// Collect results
let results = coordinator.collect_results(tasks.len()).await;
let total_time = start_time.elapsed();
// Calculate statistics
let successful = results.iter().filter(|r| r.success).count();
let failed = results.len() - successful;
let avg_confidence =
results.iter().filter_map(|r| r.confidence).sum::<f32>() / successful.max(1) as f32;
let avg_time = results.iter().map(|r| r.processing_time_ms).sum::<u64>() / results.len() as u64;
// Display results
println!("\n{}", "=".repeat(80));
println!("Agent Swarm Results");
println!("{}", "=".repeat(80));
println!("Total Tasks: {}", results.len());
println!(
"Successful: {} ({:.1}%)",
successful,
(successful as f32 / results.len() as f32) * 100.0
);
println!("Failed: {}", failed);
println!("Average Confidence: {:.2}%", avg_confidence * 100.0);
println!("Average Processing Time: {}ms", avg_time);
println!("Total Time: {:.2}s", total_time.as_secs_f32());
println!(
"Throughput: {:.2} tasks/sec",
results.len() as f32 / total_time.as_secs_f32()
);
// Agent statistics
println!("\n📊 Agent Statistics:");
let agent_stats = coordinator.get_agent_stats().await;
for (agent_id, count) in agent_stats {
println!(" {}: {} tasks", agent_id, count);
}
println!("{}", "=".repeat(80));
// Save results
let json = serde_json::to_string_pretty(&results)?;
std::fs::write("agent_results.json", json)?;
println!("\nResults saved to: agent_results.json");
Ok(())
}
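// The module doc mentions fault tolerance, but this simplified example does
// not retry failed tasks. An illustrative generic retry helper (a sketch,
// not wired into the coordinator) could look like:
#[allow(dead_code)]
fn retry<T, F: FnMut() -> Result<T, String>>(mut op: F, max_attempts: u32) -> Result<T, String> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op() {
            Ok(v) => return Ok(v),
            // Give up once the attempt budget is exhausted.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => continue,
        }
    }
}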


@@ -0,0 +1,311 @@
//! Demonstration of performance optimizations in ruvector-scipix
//!
//! This example shows how to use various optimization features:
//! - SIMD operations for image processing
//! - Parallel batch processing
//! - Memory pooling
//! - Model quantization
//! - Dynamic batching
use ruvector_scipix::optimize::*;
use std::sync::Arc;
use std::time::Instant;
fn main() {
println!("=== Ruvector-Scipix Optimization Demo ===\n");
// 1. Feature Detection
demo_feature_detection();
// 2. SIMD Operations
demo_simd_operations();
// 3. Parallel Processing
demo_parallel_processing();
// 4. Memory Optimizations
demo_memory_optimizations();
// 5. Model Quantization
demo_quantization();
println!("\n=== Demo Complete ===");
}
fn demo_feature_detection() {
println!("1. CPU Feature Detection");
println!("------------------------");
let features = detect_features();
    println!("AVX2 Support: {}", if features.avx2 { "✓" } else { "✗" });
    println!(
        "AVX-512 Support: {}",
        if features.avx512f { "✓" } else { "✗" }
    );
    println!("NEON Support: {}", if features.neon { "✓" } else { "✗" });
    println!(
        "SSE4.2 Support: {}",
        if features.sse4_2 { "✓" } else { "✗" }
    );
let opt_level = get_opt_level();
println!("Optimization Level: {:?}", opt_level);
println!();
}
fn demo_simd_operations() {
println!("2. SIMD Operations");
println!("------------------");
// Create test image (512x512 RGBA)
let size = 512;
let rgba: Vec<u8> = (0..size * size * 4).map(|i| (i % 256) as u8).collect();
let mut gray = vec![0u8; size * size];
// Benchmark grayscale conversion
let iterations = 100;
let start = Instant::now();
for _ in 0..iterations {
simd::simd_grayscale(&rgba, &mut gray);
}
let simd_time = start.elapsed();
println!("Grayscale conversion ({} iterations):", iterations);
println!(
" SIMD: {:?} ({:.2} MP/s)",
simd_time,
(iterations as f64 * size as f64 * size as f64 / 1_000_000.0) / simd_time.as_secs_f64()
);
// Benchmark threshold
let mut binary = vec![0u8; size * size];
let start = Instant::now();
for _ in 0..iterations {
simd::simd_threshold(&gray, 128, &mut binary);
}
let threshold_time = start.elapsed();
println!("Threshold operation ({} iterations):", iterations);
println!(
" SIMD: {:?} ({:.2} MP/s)",
threshold_time,
(iterations as f64 * size as f64 * size as f64 / 1_000_000.0)
/ threshold_time.as_secs_f64()
);
// Benchmark normalization
let mut data: Vec<f32> = (0..8192).map(|i| i as f32).collect();
let start = Instant::now();
for _ in 0..iterations {
simd::simd_normalize(&mut data);
}
let normalize_time = start.elapsed();
println!("Normalization ({} iterations):", iterations);
println!(" SIMD: {:?}", normalize_time);
println!();
}
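// For context, a scalar grayscale baseline using the common BT.601 luma
// weights (illustrative; simd::simd_grayscale may use different weights):
#[allow(dead_code)]
fn scalar_grayscale(rgba: &[u8], gray: &mut [u8]) {
    for (px, out) in rgba.chunks_exact(4).zip(gray.iter_mut()) {
        // luma = 0.299 R + 0.587 G + 0.114 B
        let luma = 0.299 * px[0] as f32 + 0.587 * px[1] as f32 + 0.114 * px[2] as f32;
        *out = luma as u8;
    }
}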
fn demo_parallel_processing() {
println!("3. Parallel Processing");
println!("----------------------");
let data: Vec<i32> = (0..10000).collect();
// Sequential processing
let start = Instant::now();
let _seq_result: Vec<i32> = data.iter().map(|&x| expensive_computation(x)).collect();
let seq_time = start.elapsed();
// Parallel processing
let start = Instant::now();
let _par_result =
parallel::parallel_map_chunked(data.clone(), 100, |x| expensive_computation(x));
let par_time = start.elapsed();
println!("Processing 10,000 items:");
println!(" Sequential: {:?}", seq_time);
println!(" Parallel: {:?}", par_time);
println!(
" Speedup: {:.2}x",
seq_time.as_secs_f64() / par_time.as_secs_f64()
);
let threads = parallel::optimal_thread_count();
println!(" Using {} threads", threads);
println!();
}
fn expensive_computation(x: i32) -> i32 {
// Simulate some work
(0..100).fold(x, |acc, i| acc.wrapping_add(i))
}
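// Illustrative sketch of chunked parallel mapping with scoped std threads
// (the real parallel::parallel_map_chunked may be implemented differently,
// e.g. on a thread pool):
#[allow(dead_code)]
fn parallel_map_chunked_sketch<T, U, F>(data: Vec<T>, chunk_size: usize, f: F) -> Vec<U>
where
    T: Send,
    U: Send,
    F: Fn(T) -> U + Sync,
{
    // Split the input into owned chunks so each thread gets its own work.
    let mut chunks: Vec<Vec<T>> = Vec::new();
    let mut it = data.into_iter();
    loop {
        let chunk: Vec<T> = it.by_ref().take(chunk_size).collect();
        if chunk.is_empty() {
            break;
        }
        chunks.push(chunk);
    }
    std::thread::scope(|s| {
        let f = &f;
        let handles: Vec<_> = chunks
            .into_iter()
            .map(|c| s.spawn(move || c.into_iter().map(f).collect::<Vec<U>>()))
            .collect();
        // Joining in spawn order preserves the input order.
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}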
fn demo_memory_optimizations() {
println!("4. Memory Optimizations");
println!("-----------------------");
let pools = memory::GlobalPools::get();
// Benchmark buffer pool vs direct allocation
let iterations = 10000;
// Pooled allocation
let start = Instant::now();
for _ in 0..iterations {
let mut buf = pools.acquire_small();
buf.extend_from_slice(&[0u8; 512]);
}
let pooled_time = start.elapsed();
// Direct allocation
let start = Instant::now();
for _ in 0..iterations {
let mut buf = Vec::with_capacity(1024);
buf.extend_from_slice(&[0u8; 512]);
}
let direct_time = start.elapsed();
println!("Buffer allocation ({} iterations):", iterations);
println!(" Pooled: {:?}", pooled_time);
println!(" Direct: {:?}", direct_time);
println!(
" Speedup: {:.2}x",
direct_time.as_secs_f64() / pooled_time.as_secs_f64()
);
// Arena allocation
let mut arena = memory::Arena::with_capacity(1024 * 1024);
let start = Instant::now();
for _ in 0..iterations {
arena.reset();
for _ in 0..10 {
let _slice = arena.alloc(1024, 8);
}
}
let arena_time = start.elapsed();
println!(
"\nArena allocation ({} iterations, 10 allocs each):",
iterations
);
println!(" Time: {:?}", arena_time);
println!();
}
fn demo_quantization() {
println!("5. Model Quantization");
println!("---------------------");
// Create model weights
let size = 100_000;
let weights: Vec<f32> = (0..size)
.map(|i| ((i as f32 / size as f32) * 2.0 - 1.0))
.collect();
println!(
"Original model: {} weights ({:.2} MB)",
weights.len(),
(weights.len() * std::mem::size_of::<f32>()) as f64 / 1_048_576.0
);
// Quantize
let start = Instant::now();
let (quantized, params) = quantize::quantize_weights(&weights);
let quant_time = start.elapsed();
println!(
"Quantized: {} weights ({:.2} MB)",
quantized.len(),
(quantized.len() * std::mem::size_of::<i8>()) as f64 / 1_048_576.0
);
println!(
"Compression: {:.2}x",
(weights.len() * std::mem::size_of::<f32>()) as f64
/ (quantized.len() * std::mem::size_of::<i8>()) as f64
);
println!("Quantization time: {:?}", quant_time);
// Check quality
let error = quantize::quantization_error(&weights, &quantized, params);
let snr = quantize::sqnr(&weights, &quantized, params);
println!("Quality metrics:");
println!(" MSE: {:.6}", error);
println!(" SQNR: {:.2} dB", snr);
// Benchmark dequantization
let iterations = 100;
let start = Instant::now();
for _ in 0..iterations {
let _restored = quantize::dequantize(&quantized, params);
}
let dequant_time = start.elapsed();
println!(
"Dequantization ({} iterations): {:?}",
iterations, dequant_time
);
// Per-channel quantization
let weights_2d: Vec<f32> = (0..10_000).map(|i| i as f32).collect();
let shape = vec![100, 100]; // 100 channels, 100 values each
let start = Instant::now();
let per_channel = quantize::PerChannelQuant::from_f32(&weights_2d, shape);
let per_channel_time = start.elapsed();
println!("\nPer-channel quantization:");
println!(" Channels: {}", per_channel.params.len());
println!(" Time: {:?}", per_channel_time);
println!();
}
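// A minimal symmetric int8 quantization sketch, for intuition about what
// quantize::quantize_weights does (the real module may use a different
// scheme, e.g. asymmetric or per-channel scales):
#[allow(dead_code)]
fn quantize_symmetric(weights: &[f32]) -> (Vec<i8>, f32) {
    // Map the largest magnitude onto the int8 range [-127, 127].
    let max_abs = weights.iter().fold(0.0f32, |m, &w| m.max(w.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = weights
        .iter()
        .map(|&w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

#[allow(dead_code)]
fn dequantize_symmetric(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}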
// Async batching demo (would need tokio runtime)
#[allow(dead_code)]
async fn demo_batching() {
println!("6. Dynamic Batching");
println!("-------------------");
use batch::{BatchConfig, DynamicBatcher};
let config = BatchConfig {
max_batch_size: 32,
max_wait_ms: 50,
max_queue_size: 1000,
preferred_batch_size: 16,
};
let batcher = Arc::new(DynamicBatcher::new(config, |items: Vec<i32>| {
// Simulate batch processing
items.into_iter().map(|x| Ok(x * 2)).collect()
}));
// Start processing loop
let batcher_clone = batcher.clone();
tokio::spawn(async move {
batcher_clone.run().await;
});
// Add items
let mut handles = vec![];
for i in 0..100 {
let batcher = batcher.clone();
handles.push(tokio::spawn(async move { batcher.add(i).await }));
}
// Wait for results
for handle in handles {
let _ = handle.await;
}
let stats = batcher.stats().await;
println!("Queue size: {}", stats.queue_size);
println!("Max wait: {:?}", stats.max_wait_time);
batcher.shutdown().await;
}


@@ -0,0 +1,62 @@
[
{
"image_path": "test_images/quadratic_1.png",
"ground_truth_text": "x^2 + 2x + 1 = 0",
"ground_truth_latex": "x^{2} + 2x + 1 = 0",
"category": "quadratic"
},
{
"image_path": "test_images/linear_1.png",
"ground_truth_text": "y = mx + b",
"ground_truth_latex": "y = mx + b",
"category": "linear"
},
{
"image_path": "test_images/integral_1.png",
"ground_truth_text": "∫ x^2 dx = x^3/3 + C",
"ground_truth_latex": "\\int x^{2} dx = \\frac{x^{3}}{3} + C",
"category": "calculus"
},
{
"image_path": "test_images/derivative_1.png",
"ground_truth_text": "d/dx(sin(x)) = cos(x)",
"ground_truth_latex": "\\frac{d}{dx}(\\sin(x)) = \\cos(x)",
"category": "calculus"
},
{
"image_path": "test_images/matrix_1.png",
"ground_truth_text": "[1 2; 3 4]",
"ground_truth_latex": "\\begin{bmatrix} 1 & 2 \\\\ 3 & 4 \\end{bmatrix}",
"category": "linear_algebra"
},
{
"image_path": "test_images/sum_1.png",
"ground_truth_text": "Σ(i=1 to n) i = n(n+1)/2",
"ground_truth_latex": "\\sum_{i=1}^{n} i = \\frac{n(n+1)}{2}",
"category": "summation"
},
{
"image_path": "test_images/fraction_1.png",
"ground_truth_text": "(a + b) / (c - d)",
"ground_truth_latex": "\\frac{a + b}{c - d}",
"category": "fraction"
},
{
"image_path": "test_images/sqrt_1.png",
"ground_truth_text": "√(a^2 + b^2) = c",
"ground_truth_latex": "\\sqrt{a^{2} + b^{2}} = c",
"category": "radical"
},
{
"image_path": "test_images/limit_1.png",
"ground_truth_text": "lim(x→0) sin(x)/x = 1",
"ground_truth_latex": "\\lim_{x \\to 0} \\frac{\\sin(x)}{x} = 1",
"category": "calculus"
},
{
"image_path": "test_images/exponent_1.png",
"ground_truth_text": "e^(iπ) + 1 = 0",
"ground_truth_latex": "e^{i\\pi} + 1 = 0",
"category": "exponential"
}
]


@@ -0,0 +1,75 @@
//! Simple OCR example
//!
//! This example demonstrates basic OCR functionality with ruvector-scipix.
//! It processes a single image and outputs the recognized text and LaTeX.
//!
//! Usage:
//! ```bash
//! cargo run --example simple_ocr -- image.png
//! ```
use anyhow::{Context, Result};
use ruvector_scipix::{OcrConfig, OcrEngine, OutputFormat};
#[tokio::main]
async fn main() -> Result<()> {
// Parse command line arguments
let args: Vec<String> = std::env::args().collect();
if args.len() < 2 {
eprintln!("Usage: {} <image_path>", args[0]);
eprintln!("\nExample:");
eprintln!(" {} equation.png", args[0]);
std::process::exit(1);
}
let image_path = &args[1];
// Initialize logger for debug output
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
println!("Loading image: {}", image_path);
// Create default OCR configuration
let config = OcrConfig::default();
// Initialize OCR engine
println!("Initializing OCR engine...");
let engine = OcrEngine::new(config)
.await
.context("Failed to initialize OCR engine")?;
// Load and process the image
    let image = image::open(image_path)
        .with_context(|| format!("Failed to open image: {}", image_path))?;
println!("Processing image...");
let result = engine
.recognize(&image)
.await
.context("OCR recognition failed")?;
// Display results
println!("\n{}", "=".repeat(80));
println!("OCR Results");
println!("{}", "=".repeat(80));
println!("\n📝 Plain Text:");
println!("{}", result.text);
println!("\n🔢 LaTeX:");
println!("{}", result.to_format(OutputFormat::LaTeX)?);
println!("\n📊 Confidence: {:.2}%", result.confidence * 100.0);
if let Some(metadata) = &result.metadata {
println!("\n📋 Metadata:");
println!(" Language: {:?}", metadata.get("language"));
println!(
" Processing time: {:?}",
metadata.get("processing_time_ms")
);
}
println!("\n{}", "=".repeat(80));
Ok(())
}


@@ -0,0 +1,184 @@
//! Streaming PDF processing example
//!
//! This example demonstrates streaming processing of large PDF documents.
//! Results are streamed as pages are processed, with real-time progress reporting.
//!
//! Usage:
//! ```bash
//! cargo run --example streaming -- document.pdf output/
//! ```
use anyhow::{Context, Result};
use futures::stream::{self, StreamExt};
use indicatif::{ProgressBar, ProgressStyle};
use ruvector_scipix::ocr::OcrEngine;
use ruvector_scipix::output::{OcrResult, OutputFormat};
use ruvector_scipix::OcrConfig;
use serde::{Deserialize, Serialize};
use std::path::Path;
use tokio::fs;
#[derive(Debug, Serialize, Deserialize)]
struct PageResult {
page_number: usize,
text: String,
latex: Option<String>,
confidence: f32,
processing_time_ms: u64,
}
#[derive(Debug, Serialize, Deserialize)]
struct DocumentResult {
total_pages: usize,
pages: Vec<PageResult>,
total_processing_time_ms: u64,
average_confidence: f32,
}
#[tokio::main]
async fn main() -> Result<()> {
let args: Vec<String> = std::env::args().collect();
if args.len() < 3 {
eprintln!("Usage: {} <pdf_path> <output_directory>", args[0]);
eprintln!("\nExample:");
eprintln!(" {} document.pdf ./output", args[0]);
std::process::exit(1);
}
let pdf_path = Path::new(&args[1]);
let output_dir = Path::new(&args[2]);
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
// Create output directory
fs::create_dir_all(output_dir).await?;
println!("Loading PDF: {}", pdf_path.display());
// Extract pages from PDF
let pages = extract_pdf_pages(pdf_path)?;
println!("Extracted {} pages", pages.len());
// Initialize OCR engine
let config = OcrConfig::default();
let engine = OcrEngine::new(config).await?;
// Setup progress bar
let progress = ProgressBar::new(pages.len() as u64);
progress.set_style(
ProgressStyle::default_bar()
.template("[{elapsed_precise}] {bar:40.cyan/blue} {pos}/{len} {msg}")
.unwrap()
.progress_chars("=>-"),
);
let start_time = std::time::Instant::now();
let mut page_results = Vec::new();
// Process pages as a stream
let mut stream = stream::iter(pages.into_iter().enumerate())
.map(|(idx, page_data)| {
let engine = &engine;
async move { process_page(engine, idx + 1, page_data).await }
})
.buffer_unordered(4); // Process 4 pages concurrently
// Stream results and save incrementally
while let Some(result) = stream.next().await {
match result {
Ok(page_result) => {
// Save individual page result
let page_file =
output_dir.join(format!("page_{:04}.json", page_result.page_number));
let json = serde_json::to_string_pretty(&page_result)?;
fs::write(&page_file, json).await?;
progress.set_message(format!(
"Page {} - {:.1}%",
page_result.page_number,
page_result.confidence * 100.0
));
progress.inc(1);
page_results.push(page_result);
}
Err(e) => {
eprintln!("Error processing page: {}", e);
progress.inc(1);
}
}
}
progress.finish_with_message("Complete");
let total_time = start_time.elapsed().as_millis() as u64;
    // Calculate statistics (guard against the case where every page failed)
    let avg_confidence = if page_results.is_empty() {
        0.0
    } else {
        page_results.iter().map(|p| p.confidence).sum::<f32>() / page_results.len() as f32
    };
// Create document result
let doc_result = DocumentResult {
total_pages: page_results.len(),
pages: page_results,
total_processing_time_ms: total_time,
average_confidence: avg_confidence,
};
// Save complete document result
let doc_file = output_dir.join("document.json");
let json = serde_json::to_string_pretty(&doc_result)?;
fs::write(&doc_file, json).await?;
println!("\n{}", "=".repeat(80));
println!("Processing Complete");
println!("{}", "=".repeat(80));
println!("Total pages: {}", doc_result.total_pages);
println!("Total time: {:.2}s", total_time as f32 / 1000.0);
println!(
"Average time per page: {:.2}s",
(total_time as f32 / doc_result.total_pages as f32) / 1000.0
);
println!("Average confidence: {:.2}%", avg_confidence * 100.0);
println!("Results saved to: {}", output_dir.display());
println!("{}", "=".repeat(80));
Ok(())
}
fn extract_pdf_pages(_pdf_path: &Path) -> Result<Vec<Vec<u8>>> {
// TODO: Implement actual PDF extraction using pdf-extract or similar
// For now, this is a placeholder that returns mock data
println!("Note: PDF extraction is not yet implemented");
println!("This example shows the streaming architecture");
// Mock implementation - in real use, this would extract actual PDF pages
Ok(vec![vec![0u8; 100]]) // Placeholder
}
async fn process_page(
engine: &OcrEngine,
page_number: usize,
page_data: Vec<u8>,
) -> Result<PageResult> {
let start = std::time::Instant::now();
// TODO: Convert page_data to image
// For now, using a placeholder
let image = image::DynamicImage::new_rgb8(100, 100);
    let result = engine
        .recognize(&image)
        .await
        .with_context(|| format!("Failed to process page {}", page_number))?;
let processing_time = start.elapsed().as_millis() as u64;
Ok(PageResult {
page_number,
text: result.text.clone(),
latex: result.to_format(OutputFormat::LaTeX).ok(),
confidence: result.confidence,
processing_time_ms: processing_time,
})
}


@@ -0,0 +1,442 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>ruvector-scipix Browser Demo</title>
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
min-height: 100vh;
padding: 20px;
}
.container {
max-width: 1200px;
margin: 0 auto;
background: white;
border-radius: 20px;
box-shadow: 0 20px 60px rgba(0, 0, 0, 0.3);
overflow: hidden;
}
header {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 30px;
text-align: center;
}
h1 {
font-size: 2.5em;
margin-bottom: 10px;
}
.subtitle {
font-size: 1.1em;
opacity: 0.9;
}
.main-content {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 30px;
padding: 30px;
}
.upload-section {
display: flex;
flex-direction: column;
gap: 20px;
}
.upload-area {
border: 3px dashed #667eea;
border-radius: 15px;
padding: 40px;
text-align: center;
cursor: pointer;
transition: all 0.3s ease;
background: #f8f9ff;
}
.upload-area:hover {
border-color: #764ba2;
background: #f0f2ff;
transform: translateY(-2px);
}
.upload-area.dragover {
border-color: #764ba2;
background: #e8ebff;
}
.upload-icon {
font-size: 3em;
margin-bottom: 15px;
}
.file-input {
display: none;
}
.btn {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
border: none;
padding: 15px 30px;
border-radius: 10px;
font-size: 1em;
cursor: pointer;
transition: all 0.3s ease;
font-weight: 600;
}
.btn:hover {
transform: translateY(-2px);
box-shadow: 0 10px 20px rgba(102, 126, 234, 0.3);
}
.btn:disabled {
opacity: 0.5;
cursor: not-allowed;
transform: none;
}
.preview-container {
background: #f8f9ff;
border-radius: 15px;
padding: 20px;
min-height: 300px;
display: flex;
align-items: center;
justify-content: center;
}
#preview {
max-width: 100%;
max-height: 400px;
border-radius: 10px;
}
.results-section {
display: flex;
flex-direction: column;
gap: 20px;
}
.result-box {
background: #f8f9ff;
border-radius: 15px;
padding: 20px;
min-height: 150px;
}
.result-box h3 {
color: #667eea;
margin-bottom: 15px;
display: flex;
align-items: center;
gap: 10px;
}
.result-content {
background: white;
padding: 15px;
border-radius: 10px;
font-family: 'Courier New', monospace;
white-space: pre-wrap;
word-break: break-word;
max-height: 300px;
overflow-y: auto;
}
.confidence-bar {
background: #e0e0e0;
height: 30px;
border-radius: 15px;
overflow: hidden;
position: relative;
}
.confidence-fill {
background: linear-gradient(90deg, #667eea 0%, #764ba2 100%);
height: 100%;
transition: width 0.5s ease;
display: flex;
align-items: center;
justify-content: center;
color: white;
font-weight: 600;
}
.loading {
display: none;
text-align: center;
padding: 20px;
}
.loading.active {
display: block;
}
.spinner {
border: 4px solid #f3f3f3;
border-top: 4px solid #667eea;
border-radius: 50%;
width: 40px;
height: 40px;
animation: spin 1s linear infinite;
margin: 0 auto 15px;
}
@keyframes spin {
0% { transform: rotate(0deg); }
100% { transform: rotate(360deg); }
}
.error {
background: #fee;
border: 2px solid #fcc;
color: #c00;
padding: 15px;
border-radius: 10px;
margin-top: 15px;
display: none;
}
.error.active {
display: block;
}
@media (max-width: 768px) {
.main-content {
grid-template-columns: 1fr;
}
}
</style>
</head>
<body>
<div class="container">
<header>
<h1>🔢 ruvector-scipix</h1>
<p class="subtitle">Advanced Math OCR in Your Browser</p>
</header>
<div class="main-content">
<div class="upload-section">
<div class="upload-area" id="uploadArea">
<div class="upload-icon">📸</div>
<h3>Drop an image here or click to upload</h3>
<p>Supports PNG, JPG, JPEG, WebP</p>
<input type="file" id="fileInput" class="file-input" accept="image/*">
</div>
<button class="btn" id="processBtn" disabled>
🚀 Process Image
</button>
<div class="preview-container">
<canvas id="preview"></canvas>
</div>
<div class="loading" id="loading">
<div class="spinner"></div>
<p>Processing image...</p>
</div>
<div class="error" id="error"></div>
</div>
<div class="results-section">
<div class="result-box">
<h3>📝 Plain Text</h3>
<div class="result-content" id="textResult">
Results will appear here...
</div>
</div>
<div class="result-box">
<h3>🔢 LaTeX</h3>
<div class="result-content" id="latexResult">
LaTeX output will appear here...
</div>
</div>
<div class="result-box">
<h3>📊 Confidence</h3>
<div class="confidence-bar">
<div class="confidence-fill" id="confidenceFill" style="width: 0%">
0%
</div>
</div>
</div>
</div>
</div>
</div>
<script type="module">
// Note: This demo requires the WASM build of ruvector-scipix
// Build with: wasm-pack build --target web
let wasmModule = null;
let currentImage = null;
// Initialize WASM module
async function initWasm() {
try {
// In production, this would load the actual WASM module
// import init, { MathpixWasm } from './pkg/ruvector_scipix_wasm.js';
// wasmModule = await init();
console.log('WASM module would be initialized here');
// For demo purposes, we'll simulate the OCR
} catch (error) {
showError('Failed to initialize WASM module: ' + error.message);
}
}
// File upload handling
const uploadArea = document.getElementById('uploadArea');
const fileInput = document.getElementById('fileInput');
const processBtn = document.getElementById('processBtn');
const preview = document.getElementById('preview');
uploadArea.addEventListener('click', () => fileInput.click());
uploadArea.addEventListener('dragover', (e) => {
e.preventDefault();
uploadArea.classList.add('dragover');
});
uploadArea.addEventListener('dragleave', () => {
uploadArea.classList.remove('dragover');
});
uploadArea.addEventListener('drop', (e) => {
e.preventDefault();
uploadArea.classList.remove('dragover');
const file = e.dataTransfer.files[0];
if (file) handleFile(file);
});
fileInput.addEventListener('change', (e) => {
const file = e.target.files[0];
if (file) handleFile(file);
});
function handleFile(file) {
if (!file.type.startsWith('image/')) {
showError('Please select an image file');
return;
}
// Clear any stale error message from a previous attempt
hideError();
const reader = new FileReader();
reader.onload = (e) => {
const img = new Image();
img.onload = () => {
drawPreview(img);
currentImage = img;
processBtn.disabled = false;
};
img.src = e.target.result;
};
reader.readAsDataURL(file);
}
function drawPreview(img) {
const ctx = preview.getContext('2d');
const maxWidth = 500;
const maxHeight = 400;
let width = img.width;
let height = img.height;
if (width > maxWidth) {
height *= maxWidth / width;
width = maxWidth;
}
if (height > maxHeight) {
width *= maxHeight / height;
height = maxHeight;
}
preview.width = width;
preview.height = height;
ctx.drawImage(img, 0, 0, width, height);
}
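// A pure version of the contain-fit math inlined in drawPreview above,
// kept separate so it can be unit-tested without a DOM. Illustrative
// helper only; the demo code above does not call it.
function fitWithin(width, height, maxWidth, maxHeight) {
// Shrink to fit the width first, preserving aspect ratio
if (width > maxWidth) {
height *= maxWidth / width;
width = maxWidth;
}
// Then shrink again if the (possibly scaled) height still overflows
if (height > maxHeight) {
width *= maxHeight / height;
height = maxHeight;
}
return { width, height };
}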
processBtn.addEventListener('click', async () => {
if (!currentImage) return;
showLoading(true);
hideError();
try {
// In production, this would call the actual WASM OCR
// const result = await wasmModule.recognize(imageData);
// For demo, simulate OCR processing
await simulateOcr();
} catch (error) {
showError('OCR processing failed: ' + error.message);
} finally {
showLoading(false);
}
});
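// Hypothetical preprocessing sketch for a real WASM build: many OCR
// pipelines take single-channel input, so this flattens RGBA pixels
// (shaped like the object ctx.getImageData returns) to grayscale with
// the Rec. 601 luma weights. Whether the ruvector-scipix WASM API
// expects this, and the recognize() signature shown below, are
// assumptions; adjust to the actual exported API.
function toGrayscale(imageData) {
const { width, height, data } = imageData;
const gray = new Uint8Array(width * height);
for (let i = 0; i < width * height; i++) {
const r = data[i * 4], g = data[i * 4 + 1], b = data[i * 4 + 2];
// Rec. 601 luma: weights sum to 1, so white stays 255 and black 0
gray[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
}
return gray;
}
// Example (assumed API):
// const ctx = preview.getContext('2d');
// const result = wasmModule.recognize(toGrayscale(ctx.getImageData(0, 0, preview.width, preview.height)));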
async function simulateOcr() {
// Simulate processing delay
await new Promise(resolve => setTimeout(resolve, 2000));
const mockResults = {
text: 'x² + 2x + 1 = 0',
latex: 'x^{2} + 2x + 1 = 0',
confidence: 0.95
};
displayResults(mockResults);
}
function displayResults(results) {
document.getElementById('textResult').textContent = results.text;
document.getElementById('latexResult').textContent = results.latex;
const confidencePct = (results.confidence * 100).toFixed(1);
const confidenceFill = document.getElementById('confidenceFill');
confidenceFill.style.width = confidencePct + '%';
confidenceFill.textContent = confidencePct + '%';
}
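// Illustrative guard, not wired into displayResults above: clamp a
// backend-reported confidence to [0, 1] (treating NaN as 0) so a
// misbehaving result can never overflow the confidence bar.
function clampConfidence(c) {
return Math.min(1, Math.max(0, Number(c) || 0));
}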
function showLoading(show) {
const loading = document.getElementById('loading');
if (show) {
loading.classList.add('active');
processBtn.disabled = true;
} else {
loading.classList.remove('active');
processBtn.disabled = false;
}
}
function showError(message) {
const error = document.getElementById('error');
error.textContent = message;
error.classList.add('active');
}
function hideError() {
document.getElementById('error').classList.remove('active');
}
// Initialize on page load
initWasm();
</script>
</body>
</html>