# RuVector Attention CLI
A command-line interface for computing, benchmarking, converting, and serving attention mechanisms.
## Features
- **Multiple Attention Types**: Scaled dot-product, multi-head, hyperbolic, flash, linear, and mixture-of-experts (MoE)
- **Compute**: Process attention on input data with various configurations
- **Benchmark**: Performance testing across different dimensions and attention types
- **Convert**: Transform data between JSON, binary, MessagePack, and CSV formats
- **Serve**: HTTP server with REST API for attention computation
- **REPL**: Interactive shell for exploratory analysis
## Installation
```bash
cargo install --path .
```
## Usage
### Compute Attention
```bash
# Scaled dot-product attention
ruvector-attention compute -i input.json -o output.json -a scaled_dot
# Multi-head attention with 16 heads
ruvector-attention compute -i input.json -a multi_head --num-heads 16
# Hyperbolic attention with custom curvature
ruvector-attention compute -i input.json -a hyperbolic --curvature 2.0
# Flash attention (memory-efficient)
ruvector-attention compute -i input.json -a flash
# Mixture of Experts attention
ruvector-attention compute -i input.json -a moe --num-experts 8 --top-k 2
```
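For reference, scaled dot-product attention computes `softmax(QK^T / sqrt(d)) V`. Below is a minimal pure-Python sketch of that computation, to show what the `compute` subcommand is doing conceptually; it is not the crate's actual (Rust) implementation, and the function name is illustrative.

```python
import math

def scaled_dot_attention(query, keys, values):
    """Pure-Python sketch of softmax(Q.K^T / sqrt(d)) . V.

    query:  list of query vectors, each of dimension d
    keys:   list of key vectors, each of dimension d
    values: list of value vectors, one per key
    """
    d = len(keys[0])
    out = []
    for q in query:
        # Raw attention scores, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        # Numerically stable softmax over the scores
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output row: attention-weighted sum of the value vectors
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

result = scaled_dot_attention(
    query=[[0.1, 0.2, 0.3]],
    keys=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    values=[[0.7, 0.8, 0.9], [1.0, 1.1, 1.2]],
)
print(result)  # one output row: a convex combination of the value rows
```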
### Run Benchmarks
```bash
# Benchmark all attention types
ruvector-attention benchmark
# Benchmark specific types
ruvector-attention benchmark -a scaled_dot,multi_head,flash
# Custom dimensions
ruvector-attention benchmark -d 256,512,1024 -i 1000
# Output to CSV
ruvector-attention benchmark -o results.csv -f csv
```
### Convert Data
```bash
# JSON to MessagePack
ruvector-attention convert -i data.json -o data.msgpack --to msgpack
# Binary to JSON (pretty-printed)
ruvector-attention convert -i data.bin -o data.json --to json --pretty
# Auto-detect input format
ruvector-attention convert -i input.dat -o output.json --to json
```
### Start HTTP Server
```bash
# Default (localhost:8080)
ruvector-attention serve
# Custom host and port
ruvector-attention serve -H 0.0.0.0 -p 3000
# With CORS enabled
ruvector-attention serve --cors
```
### Interactive REPL
```bash
# Start REPL
ruvector-attention repl
# Commands within REPL:
attention> help
attention> load data.json
attention> type multi_head
attention> compute
attention> config
attention> quit
```
## API Endpoints
When running the server, the following endpoints are available:
- `GET /health` - Health check
- `POST /attention/scaled_dot` - Scaled dot-product attention
- `POST /attention/multi_head` - Multi-head attention
- `POST /attention/hyperbolic` - Hyperbolic attention
- `POST /attention/flash` - Flash attention
- `POST /attention/linear` - Linear attention
- `POST /attention/moe` - Mixture of Experts attention
- `POST /batch` - Batch computation
### Example Request
```bash
curl -X POST http://localhost:8080/attention/scaled_dot \
  -H "Content-Type: application/json" \
  -d '{
        "query": [[0.1, 0.2, 0.3]],
        "keys": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
        "values": [[0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
      }'
```
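The same request can be made from Python with the standard library. This is a sketch: the endpoint path and payload fields mirror the curl example above, but `build_request` and `attention_request` are hypothetical helper names, and the shape of the JSON response is not documented here.

```python
import json
import urllib.request

def build_request(query, keys, values, attention_type="scaled_dot",
                  base_url="http://localhost:8080"):
    """Prepare a POST request matching the curl example above."""
    payload = json.dumps(
        {"query": query, "keys": keys, "values": values}
    ).encode()
    return urllib.request.Request(
        f"{base_url}/attention/{attention_type}",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def attention_request(query, keys, values, attention_type="scaled_dot"):
    """Send the request to a running server and decode the JSON response."""
    req = build_request(query, keys, values, attention_type)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```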
## Configuration
Create a `ruvector-attention.toml` file:
```toml
[attention]
default_dim = 512
default_heads = 8
default_type = "scaled_dot"

[server]
host = "0.0.0.0"
port = 8080
max_batch_size = 32

[output]
format = "json"
pretty = true

[benchmark]
iterations = 100
dimensions = [128, 256, 512, 1024]
```
## Input Format
Input files should contain:
```json
{
  "query": [[...], [...], ...],
  "keys": [[...], [...], ...],
  "values": [[...], [...], ...],
  "dim": 512
}
```
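A quick way to produce a file in this format is to fill it with random vectors. This is a sketch with a hypothetical helper name (`make_input`); it assumes `query`, `keys`, and `values` may all have the same number of rows, which the schema above does not require.

```python
import json
import random

def make_input(seq_len, dim, path):
    """Write a randomly filled input file matching the schema above."""
    def matrix(rows):
        return [[random.uniform(-1.0, 1.0) for _ in range(dim)]
                for _ in range(rows)]

    data = {
        "query": matrix(seq_len),
        "keys": matrix(seq_len),
        "values": matrix(seq_len),
        "dim": dim,
    }
    with open(path, "w") as f:
        json.dump(data, f)
    return data

# e.g. a small 4-row input with 512-dim vectors:
make_input(seq_len=4, dim=512, path="input.json")
```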
## Performance
Representative benchmark results (absolute timings depend on hardware):
| Attention Type | 512-dim | 1024-dim | 2048-dim |
|---------------|---------|----------|----------|
| Scaled Dot | 0.5ms | 1.2ms | 4.8ms |
| Multi-Head | 1.2ms | 3.5ms | 14.2ms |
| Flash | 0.3ms | 0.8ms | 3.1ms |
| Linear | 0.4ms | 1.0ms | 3.9ms |
## License
MIT OR Apache-2.0