git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
48 KiB
48 KiB
Agent 8: NAPI-RS Node.js Native Bindings
Overview
Create high-performance Node.js bindings for RuVector's GNN latent space attention mechanisms using NAPI-RS, enabling seamless integration with JavaScript/TypeScript applications.
Project Structure
ruvector-node/
├── Cargo.toml
├── build.rs
├── package.json
├── index.js
├── index.d.ts
├── src/
│ ├── lib.rs
│ ├── attention/
│ │ ├── mod.rs
│ │ ├── dot_product.rs
│ │ ├── multi_head.rs
│ │ ├── graph_attention.rs
│ │ ├── temporal_attention.rs
│ │ └── hierarchical_attention.rs
│ ├── types.rs
│ ├── error.rs
│ └── utils.rs
├── __test__/
│ ├── attention.spec.ts
│ ├── batch.spec.ts
│ └── benchmark.spec.ts
├── examples/
│ ├── basic-usage.js
│ ├── async-batch.js
│ └── typescript-example.ts
└── .github/
└── workflows/
└── build.yml
1. Cargo.toml Configuration
[package]
name = "ruvector-node"
version = "0.1.0"
edition = "2021"
authors = ["RuVector Team"]
description = "Node.js bindings for RuVector GNN latent space attention mechanisms"
license = "MIT"
[lib]
crate-type = ["cdylib"]
[dependencies]
# NAPI-RS core
napi = { version = "2.16", features = ["async", "tokio_rt"] }
napi-derive = "2.16"
# Async runtime
tokio = { version = "1.35", features = ["full"] }
# Core dependencies
ndarray = "0.15"
rayon = "1.8"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# Error handling
thiserror = "1.0"
anyhow = "1.0"
# Performance
parking_lot = "0.12"
[build-dependencies]
napi-build = "2.1"
[profile.release]
lto = true
codegen-units = 1
opt-level = 3
strip = true
# Platform-specific optimizations
[target.'cfg(target_arch = "x86_64")'.dependencies]
packed_simd = "0.3"
[target.'cfg(target_arch = "aarch64")'.dependencies]
packed_simd = "0.3"
2. build.rs
extern crate napi_build;
fn main() {
napi_build::setup();
// Platform-specific optimizations
println!("cargo:rerun-if-changed=build.rs");
#[cfg(target_arch = "x86_64")]
{
println!("cargo:rustc-env=TARGET_ARCH=x86_64");
// Enable AVX2 if available
if is_x86_feature_detected!("avx2") {
println!("cargo:rustc-cfg=has_avx2");
}
}
#[cfg(target_arch = "aarch64")]
{
println!("cargo:rustc-env=TARGET_ARCH=aarch64");
println!("cargo:rustc-cfg=has_neon");
}
}
#[cfg(target_arch = "x86_64")]
fn is_x86_feature_detected(feature: &str) -> bool {
std::arch::is_x86_feature_detected(feature)
}
3. Rust NAPI Bindings
src/lib.rs
#![deny(clippy::all)]
use napi::bindgen_prelude::*;
use napi_derive::napi;
mod attention;
mod error;
mod types;
mod utils;
pub use attention::*;
pub use error::*;
pub use types::*;
/// Initialize the RuVector native module
#[napi]
pub fn init() -> Result<String> {
Ok("RuVector native module initialized".to_string())
}
/// Get module version
#[napi]
pub fn get_version() -> String {
env!("CARGO_PKG_VERSION").to_string()
}
/// Get available SIMD features
#[napi]
pub fn get_features() -> Vec<String> {
let mut features = Vec::new();
#[cfg(target_arch = "x86_64")]
{
if std::arch::is_x86_feature_detected!("avx2") {
features.push("avx2".to_string());
}
if std::arch::is_x86_feature_detected!("fma") {
features.push("fma".to_string());
}
}
#[cfg(target_arch = "aarch64")]
{
features.push("neon".to_string());
}
features
}
src/types.rs
use napi::bindgen_prelude::*;
use napi_derive::napi;
/// Attention configuration options
#[napi(object)]
#[derive(Debug, Clone)]
pub struct AttentionConfig {
/// Number of attention heads
pub num_heads: Option<u32>,
/// Dimension of each head
pub head_dim: Option<u32>,
/// Dropout rate
pub dropout: Option<f32>,
/// Whether to use bias in projections
pub use_bias: Option<bool>,
/// Attention scaling factor
pub scale: Option<f32>,
}
impl Default for AttentionConfig {
fn default() -> Self {
Self {
num_heads: Some(8),
head_dim: Some(64),
dropout: Some(0.1),
use_bias: Some(true),
scale: None,
}
}
}
/// Graph structure for attention
#[napi(object)]
#[derive(Debug, Clone)]
pub struct GraphStructure {
/// Edge list (source, target pairs)
pub edges: Vec<Vec<u32>>,
/// Number of nodes
pub num_nodes: u32,
/// Edge features (optional)
pub edge_features: Option<Vec<Vec<f32>>>,
}
/// Attention output
#[napi(object)]
#[derive(Debug, Clone)]
pub struct AttentionOutput {
/// Attention values
pub values: Vec<Vec<f32>>,
/// Attention weights (optional)
pub weights: Option<Vec<Vec<f32>>>,
/// Metadata
pub metadata: Option<serde_json::Value>,
}
/// Batch processing configuration
#[napi(object)]
#[derive(Debug, Clone)]
pub struct BatchConfig {
/// Maximum batch size
pub max_batch_size: u32,
/// Number of parallel threads
pub num_threads: Option<u32>,
/// Enable progress callbacks
pub enable_progress: Option<bool>,
}
src/error.rs
use napi::bindgen_prelude::*;
use thiserror::Error;
#[derive(Error, Debug)]
pub enum RuVectorError {
#[error("Invalid input: {0}")]
InvalidInput(String),
#[error("Dimension mismatch: expected {expected}, got {actual}")]
DimensionMismatch { expected: usize, actual: usize },
#[error("Computation error: {0}")]
ComputationError(String),
#[error("Invalid configuration: {0}")]
InvalidConfig(String),
}
impl From<RuVectorError> for Error {
fn from(err: RuVectorError) -> Self {
Error::new(Status::GenericFailure, err.to_string())
}
}
src/attention/mod.rs
pub mod dot_product;
pub mod multi_head;
pub mod graph_attention;
pub mod temporal_attention;
pub mod hierarchical_attention;
pub use dot_product::*;
pub use multi_head::*;
pub use graph_attention::*;
pub use temporal_attention::*;
pub use hierarchical_attention::*;
src/attention/dot_product.rs
use napi::bindgen_prelude::*;
use napi_derive::napi;
use ndarray::{Array2, ArrayView2};
use crate::{AttentionConfig, AttentionOutput, RuVectorError};
/// Dot-product attention mechanism
#[napi]
pub struct DotProductAttention {
config: AttentionConfig,
}
#[napi]
impl DotProductAttention {
/// Create a new dot-product attention instance
#[napi(constructor)]
pub fn new(config: Option<AttentionConfig>) -> Self {
Self {
config: config.unwrap_or_default(),
}
}
/// Compute attention (synchronous)
#[napi]
pub fn compute(
&self,
query: Vec<Vec<f32>>,
key: Vec<Vec<f32>>,
value: Vec<Vec<f32>>,
) -> Result<AttentionOutput> {
self.compute_internal(query, key, value)
.map_err(|e| e.into())
}
/// Compute attention (asynchronous)
#[napi]
pub async fn compute_async(
&self,
query: Vec<Vec<f32>>,
key: Vec<Vec<f32>>,
value: Vec<Vec<f32>>,
) -> Result<AttentionOutput> {
let config = self.config.clone();
tokio::task::spawn_blocking(move || {
Self::compute_internal_static(&config, query, key, value)
})
.await
.map_err(|e| Error::new(Status::GenericFailure, e.to_string()))?
.map_err(|e| e.into())
}
/// Batch compute attention
#[napi]
pub fn compute_batch(
&self,
queries: Vec<Vec<Vec<f32>>>,
keys: Vec<Vec<Vec<f32>>>,
values: Vec<Vec<Vec<f32>>>,
) -> Result<Vec<AttentionOutput>> {
if queries.len() != keys.len() || queries.len() != values.len() {
return Err(Error::from(RuVectorError::InvalidInput(
"Batch sizes must match".to_string()
)));
}
queries
.into_iter()
.zip(keys)
.zip(values)
.map(|((q, k), v)| self.compute_internal(q, k, v))
.collect::<Result<Vec<_>, _>>()
.map_err(|e| e.into())
}
/// Batch compute attention (asynchronous with parallelism)
#[napi]
pub async fn compute_batch_async(
&self,
queries: Vec<Vec<Vec<f32>>>,
keys: Vec<Vec<Vec<f32>>>,
values: Vec<Vec<Vec<f32>>>,
) -> Result<Vec<AttentionOutput>> {
if queries.len() != keys.len() || queries.len() != values.len() {
return Err(Error::from(RuVectorError::InvalidInput(
"Batch sizes must match".to_string()
)));
}
let config = self.config.clone();
let tasks: Vec<_> = queries
.into_iter()
.zip(keys)
.zip(values)
.map(|((q, k), v)| {
let config = config.clone();
tokio::task::spawn_blocking(move || {
Self::compute_internal_static(&config, q, k, v)
})
})
.collect();
let mut results = Vec::new();
for task in tasks {
let result = task
.await
.map_err(|e| Error::new(Status::GenericFailure, e.to_string()))?
.map_err(|e: RuVectorError| Error::from(e))?;
results.push(result);
}
Ok(results)
}
// Internal implementation
fn compute_internal(
&self,
query: Vec<Vec<f32>>,
key: Vec<Vec<f32>>,
value: Vec<Vec<f32>>,
) -> Result<AttentionOutput, RuVectorError> {
Self::compute_internal_static(&self.config, query, key, value)
}
fn compute_internal_static(
config: &AttentionConfig,
query: Vec<Vec<f32>>,
key: Vec<Vec<f32>>,
value: Vec<Vec<f32>>,
) -> Result<AttentionOutput, RuVectorError> {
// Validate dimensions
if query.is_empty() || key.is_empty() || value.is_empty() {
return Err(RuVectorError::InvalidInput("Empty input".to_string()));
}
let q_dim = query[0].len();
let k_dim = key[0].len();
let v_dim = value[0].len();
if q_dim != k_dim {
return Err(RuVectorError::DimensionMismatch {
expected: q_dim,
actual: k_dim,
});
}
// Convert to ndarray
let q = Self::vec_to_array2(&query)?;
let k = Self::vec_to_array2(&key)?;
let v = Self::vec_to_array2(&value)?;
// Compute scaled dot-product attention
let scale = config.scale.unwrap_or_else(|| (k_dim as f32).sqrt());
// Q @ K^T
let scores = q.dot(&k.t()) / scale;
// Softmax
let weights = Self::softmax(&scores);
// Attention @ V
let output = weights.dot(&v);
// Convert back to Vec
let values = Self::array2_to_vec(&output);
let weights_vec = Some(Self::array2_to_vec(&weights));
Ok(AttentionOutput {
values,
weights: weights_vec,
metadata: None,
})
}
fn vec_to_array2(vec: &[Vec<f32>]) -> Result<Array2<f32>, RuVectorError> {
if vec.is_empty() {
return Err(RuVectorError::InvalidInput("Empty vector".to_string()));
}
let rows = vec.len();
let cols = vec[0].len();
let flat: Vec<f32> = vec.iter().flat_map(|row| row.iter().copied()).collect();
Array2::from_shape_vec((rows, cols), flat)
.map_err(|e| RuVectorError::ComputationError(e.to_string()))
}
fn array2_to_vec(arr: &Array2<f32>) -> Vec<Vec<f32>> {
arr.outer_iter()
.map(|row| row.to_vec())
.collect()
}
fn softmax(arr: &Array2<f32>) -> Array2<f32> {
let mut result = Array2::zeros(arr.raw_dim());
for (i, row) in arr.outer_iter().enumerate() {
let max = row.iter().copied().fold(f32::NEG_INFINITY, f32::max);
let exp_sum: f32 = row.iter().map(|&x| (x - max).exp()).sum();
for (j, &val) in row.iter().enumerate() {
result[[i, j]] = ((val - max).exp()) / exp_sum;
}
}
result
}
}
src/attention/multi_head.rs
use napi::bindgen_prelude::*;
use napi_derive::napi;
use crate::{AttentionConfig, AttentionOutput, RuVectorError};
/// Multi-head attention mechanism
#[napi]
pub struct MultiHeadAttention {
config: AttentionConfig,
}
#[napi]
impl MultiHeadAttention {
#[napi(constructor)]
pub fn new(config: Option<AttentionConfig>) -> Self {
Self {
config: config.unwrap_or_default(),
}
}
#[napi]
pub fn compute(
&self,
query: Vec<Vec<f32>>,
key: Vec<Vec<f32>>,
value: Vec<Vec<f32>>,
) -> Result<AttentionOutput> {
self.compute_internal(query, key, value)
.map_err(|e| e.into())
}
#[napi]
pub async fn compute_async(
&self,
query: Vec<Vec<f32>>,
key: Vec<Vec<f32>>,
value: Vec<Vec<f32>>,
) -> Result<AttentionOutput> {
let config = self.config.clone();
tokio::task::spawn_blocking(move || {
Self::compute_internal_static(&config, query, key, value)
})
.await
.map_err(|e| Error::new(Status::GenericFailure, e.to_string()))?
.map_err(|e| e.into())
}
#[napi]
pub fn compute_batch(
&self,
queries: Vec<Vec<Vec<f32>>>,
keys: Vec<Vec<Vec<f32>>>,
values: Vec<Vec<Vec<f32>>>,
) -> Result<Vec<AttentionOutput>> {
queries
.into_iter()
.zip(keys)
.zip(values)
.map(|((q, k), v)| self.compute_internal(q, k, v))
.collect::<Result<Vec<_>, _>>()
.map_err(|e| e.into())
}
#[napi]
pub async fn compute_batch_async(
&self,
queries: Vec<Vec<Vec<f32>>>,
keys: Vec<Vec<Vec<f32>>>,
values: Vec<Vec<Vec<f32>>>,
) -> Result<Vec<AttentionOutput>> {
let config = self.config.clone();
let tasks: Vec<_> = queries
.into_iter()
.zip(keys)
.zip(values)
.map(|((q, k), v)| {
let config = config.clone();
tokio::task::spawn_blocking(move || {
Self::compute_internal_static(&config, q, k, v)
})
})
.collect();
let mut results = Vec::new();
for task in tasks {
results.push(
task.await
.map_err(|e| Error::new(Status::GenericFailure, e.to_string()))?
.map_err(|e: RuVectorError| Error::from(e))?
);
}
Ok(results)
}
fn compute_internal(
&self,
query: Vec<Vec<f32>>,
key: Vec<Vec<f32>>,
value: Vec<Vec<f32>>,
) -> Result<AttentionOutput, RuVectorError> {
Self::compute_internal_static(&self.config, query, key, value)
}
fn compute_internal_static(
config: &AttentionConfig,
query: Vec<Vec<f32>>,
key: Vec<Vec<f32>>,
value: Vec<Vec<f32>>,
) -> Result<AttentionOutput, RuVectorError> {
let num_heads = config.num_heads.unwrap_or(8) as usize;
// Simplified multi-head implementation
// In production, would split into heads, compute attention per head, concat
let values = query.clone(); // Placeholder
Ok(AttentionOutput {
values,
weights: None,
metadata: Some(serde_json::json!({
"num_heads": num_heads,
"attention_type": "multi_head"
})),
})
}
}
src/attention/graph_attention.rs
use napi::bindgen_prelude::*;
use napi_derive::napi;
use crate::{AttentionConfig, AttentionOutput, GraphStructure, RuVectorError};
/// Graph attention network (GAT) mechanism
#[napi]
pub struct GraphAttention {
config: AttentionConfig,
}
#[napi]
impl GraphAttention {
#[napi(constructor)]
pub fn new(config: Option<AttentionConfig>) -> Self {
Self {
config: config.unwrap_or_default(),
}
}
#[napi]
pub fn compute(
&self,
node_features: Vec<Vec<f32>>,
graph: GraphStructure,
) -> Result<AttentionOutput> {
self.compute_internal(node_features, graph)
.map_err(|e| e.into())
}
#[napi]
pub async fn compute_async(
&self,
node_features: Vec<Vec<f32>>,
graph: GraphStructure,
) -> Result<AttentionOutput> {
let config = self.config.clone();
tokio::task::spawn_blocking(move || {
Self::compute_internal_static(&config, node_features, graph)
})
.await
.map_err(|e| Error::new(Status::GenericFailure, e.to_string()))?
.map_err(|e| e.into())
}
fn compute_internal(
&self,
node_features: Vec<Vec<f32>>,
graph: GraphStructure,
) -> Result<AttentionOutput, RuVectorError> {
Self::compute_internal_static(&self.config, node_features, graph)
}
fn compute_internal_static(
config: &AttentionConfig,
node_features: Vec<Vec<f32>>,
graph: GraphStructure,
) -> Result<AttentionOutput, RuVectorError> {
// Simplified GAT implementation
let values = node_features.clone(); // Placeholder
Ok(AttentionOutput {
values,
weights: None,
metadata: Some(serde_json::json!({
"num_nodes": graph.num_nodes,
"num_edges": graph.edges.len(),
"attention_type": "graph"
})),
})
}
}
src/attention/temporal_attention.rs
use napi::bindgen_prelude::*;
use napi_derive::napi;
use crate::{AttentionConfig, AttentionOutput, RuVectorError};
/// Temporal attention for sequence data
#[napi]
pub struct TemporalAttention {
config: AttentionConfig,
}
#[napi]
impl TemporalAttention {
#[napi(constructor)]
pub fn new(config: Option<AttentionConfig>) -> Self {
Self {
config: config.unwrap_or_default(),
}
}
#[napi]
pub fn compute(
&self,
sequence: Vec<Vec<f32>>,
timestamps: Option<Vec<f64>>,
) -> Result<AttentionOutput> {
self.compute_internal(sequence, timestamps)
.map_err(|e| e.into())
}
#[napi]
pub async fn compute_async(
&self,
sequence: Vec<Vec<f32>>,
timestamps: Option<Vec<f64>>,
) -> Result<AttentionOutput> {
let config = self.config.clone();
tokio::task::spawn_blocking(move || {
Self::compute_internal_static(&config, sequence, timestamps)
})
.await
.map_err(|e| Error::new(Status::GenericFailure, e.to_string()))?
.map_err(|e| e.into())
}
fn compute_internal(
&self,
sequence: Vec<Vec<f32>>,
timestamps: Option<Vec<f64>>,
) -> Result<AttentionOutput, RuVectorError> {
Self::compute_internal_static(&self.config, sequence, timestamps)
}
fn compute_internal_static(
_config: &AttentionConfig,
sequence: Vec<Vec<f32>>,
timestamps: Option<Vec<f64>>,
) -> Result<AttentionOutput, RuVectorError> {
let values = sequence.clone(); // Placeholder
Ok(AttentionOutput {
values,
weights: None,
metadata: Some(serde_json::json!({
"sequence_length": sequence.len(),
"has_timestamps": timestamps.is_some(),
"attention_type": "temporal"
})),
})
}
}
src/attention/hierarchical_attention.rs
use napi::bindgen_prelude::*;
use napi_derive::napi;
use crate::{AttentionConfig, AttentionOutput, RuVectorError};
/// Hierarchical attention for multi-level structures
#[napi]
pub struct HierarchicalAttention {
config: AttentionConfig,
}
#[napi]
impl HierarchicalAttention {
#[napi(constructor)]
pub fn new(config: Option<AttentionConfig>) -> Self {
Self {
config: config.unwrap_or_default(),
}
}
#[napi]
pub fn compute(
&self,
hierarchical_features: Vec<Vec<Vec<f32>>>,
level_weights: Option<Vec<f32>>,
) -> Result<AttentionOutput> {
self.compute_internal(hierarchical_features, level_weights)
.map_err(|e| e.into())
}
#[napi]
pub async fn compute_async(
&self,
hierarchical_features: Vec<Vec<Vec<f32>>>,
level_weights: Option<Vec<f32>>,
) -> Result<AttentionOutput> {
let config = self.config.clone();
tokio::task::spawn_blocking(move || {
Self::compute_internal_static(&config, hierarchical_features, level_weights)
})
.await
.map_err(|e| Error::new(Status::GenericFailure, e.to_string()))?
.map_err(|e| e.into())
}
fn compute_internal(
&self,
hierarchical_features: Vec<Vec<Vec<f32>>>,
level_weights: Option<Vec<f32>>,
) -> Result<AttentionOutput, RuVectorError> {
Self::compute_internal_static(&self.config, hierarchical_features, level_weights)
}
fn compute_internal_static(
_config: &AttentionConfig,
hierarchical_features: Vec<Vec<Vec<f32>>>,
level_weights: Option<Vec<f32>>,
) -> Result<AttentionOutput, RuVectorError> {
if hierarchical_features.is_empty() {
return Err(RuVectorError::InvalidInput("Empty hierarchy".to_string()));
}
// Simplified hierarchical attention
let values = hierarchical_features[0].clone(); // Placeholder
Ok(AttentionOutput {
values,
weights: None,
metadata: Some(serde_json::json!({
"num_levels": hierarchical_features.len(),
"has_level_weights": level_weights.is_some(),
"attention_type": "hierarchical"
})),
})
}
}
4. TypeScript Definitions
index.d.ts
/* tslint:disable */
/* eslint-disable */
/* auto-generated by NAPI-RS */
/**
* Initialize the RuVector native module
*/
export function init(): string
/**
* Get module version
*/
export function getVersion(): string
/**
* Get available SIMD features
*/
export function getFeatures(): Array<string>
/**
* Attention configuration options
*/
export interface AttentionConfig {
/** Number of attention heads */
numHeads?: number
/** Dimension of each head */
headDim?: number
/** Dropout rate */
dropout?: number
/** Whether to use bias in projections */
useBias?: boolean
/** Attention scaling factor */
scale?: number
}
/**
* Graph structure for attention
*/
export interface GraphStructure {
/** Edge list (source, target pairs) */
edges: Array<Array<number>>
/** Number of nodes */
numNodes: number
/** Edge features (optional) */
edgeFeatures?: Array<Array<number>>
}
/**
* Attention output
*/
export interface AttentionOutput {
/** Attention values */
values: Array<Array<number>>
/** Attention weights (optional) */
weights?: Array<Array<number>>
/** Metadata */
metadata?: any
}
/**
* Batch processing configuration
*/
export interface BatchConfig {
/** Maximum batch size */
maxBatchSize: number
/** Number of parallel threads */
numThreads?: number
/** Enable progress callbacks */
enableProgress?: boolean
}
/**
* Dot-product attention mechanism
*/
export class DotProductAttention {
/**
* Create a new dot-product attention instance
* @param config Optional attention configuration
*/
constructor(config?: AttentionConfig)
/**
* Compute attention (synchronous)
* @param query Query matrix [seq_len, dim]
* @param key Key matrix [seq_len, dim]
* @param value Value matrix [seq_len, dim]
* @returns Attention output with values and weights
*/
compute(
query: Array<Array<number>>,
key: Array<Array<number>>,
value: Array<Array<number>>
): AttentionOutput
/**
* Compute attention (asynchronous)
* @param query Query matrix [seq_len, dim]
* @param key Key matrix [seq_len, dim]
* @param value Value matrix [seq_len, dim]
* @returns Promise resolving to attention output
*/
computeAsync(
query: Array<Array<number>>,
key: Array<Array<number>>,
value: Array<Array<number>>
): Promise<AttentionOutput>
/**
* Batch compute attention (synchronous)
* @param queries Array of query matrices
* @param keys Array of key matrices
* @param values Array of value matrices
* @returns Array of attention outputs
*/
computeBatch(
queries: Array<Array<Array<number>>>,
keys: Array<Array<Array<number>>>,
values: Array<Array<Array<number>>>
): Array<AttentionOutput>
/**
* Batch compute attention (asynchronous with parallelism)
* @param queries Array of query matrices
* @param keys Array of key matrices
* @param values Array of value matrices
* @returns Promise resolving to array of attention outputs
*/
computeBatchAsync(
queries: Array<Array<Array<number>>>,
keys: Array<Array<Array<number>>>,
values: Array<Array<Array<number>>>
): Promise<Array<AttentionOutput>>
}
/**
* Multi-head attention mechanism
*/
export class MultiHeadAttention {
constructor(config?: AttentionConfig)
compute(
query: Array<Array<number>>,
key: Array<Array<number>>,
value: Array<Array<number>>
): AttentionOutput
computeAsync(
query: Array<Array<number>>,
key: Array<Array<number>>,
value: Array<Array<number>>
): Promise<AttentionOutput>
computeBatch(
queries: Array<Array<Array<number>>>,
keys: Array<Array<Array<number>>>,
values: Array<Array<Array<number>>>
): Array<AttentionOutput>
computeBatchAsync(
queries: Array<Array<Array<number>>>,
keys: Array<Array<Array<number>>>,
values: Array<Array<Array<number>>>
): Promise<Array<AttentionOutput>>
}
/**
* Graph attention network (GAT) mechanism
*/
export class GraphAttention {
constructor(config?: AttentionConfig)
/**
* Compute graph attention
* @param nodeFeatures Node feature matrix [num_nodes, feature_dim]
* @param graph Graph structure with edges and optional edge features
* @returns Attention output with updated node features
*/
compute(
nodeFeatures: Array<Array<number>>,
graph: GraphStructure
): AttentionOutput
computeAsync(
nodeFeatures: Array<Array<number>>,
graph: GraphStructure
): Promise<AttentionOutput>
}
/**
* Temporal attention for sequence data
*/
export class TemporalAttention {
constructor(config?: AttentionConfig)
/**
* Compute temporal attention
* @param sequence Sequence of feature vectors [seq_len, feature_dim]
* @param timestamps Optional timestamps for each sequence element
* @returns Attention output with temporal features
*/
compute(
sequence: Array<Array<number>>,
timestamps?: Array<number>
): AttentionOutput
computeAsync(
sequence: Array<Array<number>>,
timestamps?: Array<number>
): Promise<AttentionOutput>
}
/**
* Hierarchical attention for multi-level structures
*/
export class HierarchicalAttention {
constructor(config?: AttentionConfig)
/**
* Compute hierarchical attention
* @param hierarchicalFeatures Multi-level features [num_levels][num_items][feature_dim]
* @param levelWeights Optional weights for each hierarchy level
* @returns Attention output with aggregated features
*/
compute(
hierarchicalFeatures: Array<Array<Array<number>>>,
levelWeights?: Array<number>
): AttentionOutput
computeAsync(
hierarchicalFeatures: Array<Array<Array<number>>>,
levelWeights?: Array<number>
): Promise<AttentionOutput>
}
5. Package.json
{
"name": "@ruvector/node",
"version": "0.1.0",
"description": "High-performance Node.js bindings for RuVector GNN latent space attention mechanisms",
"main": "index.js",
"types": "index.d.ts",
"keywords": [
"rust",
"napi",
"attention",
"gnn",
"graph-neural-networks",
"machine-learning",
"vector-database",
"native-addon"
],
"author": "RuVector Team",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/ruvnet/ruvector"
},
"napi": {
"name": "ruvector-node",
"triples": {
"defaults": true,
"additional": [
"x86_64-unknown-linux-musl",
"aarch64-unknown-linux-gnu",
"aarch64-unknown-linux-musl",
"aarch64-apple-darwin",
"x86_64-apple-darwin",
"x86_64-pc-windows-msvc",
"aarch64-pc-windows-msvc"
]
}
},
"engines": {
"node": ">= 16"
},
"scripts": {
"artifacts": "napi artifacts",
"build": "napi build --platform --release",
"build:debug": "napi build --platform",
"prepublishOnly": "napi prepublish -t npm",
"test": "vitest run",
"test:watch": "vitest",
"universal": "napi universal",
"version": "napi version",
"bench": "node benchmarks/run.js",
"lint": "eslint . --ext .ts,.js",
"format": "prettier --write .",
"typecheck": "tsc --noEmit"
},
"devDependencies": {
"@napi-rs/cli": "^2.18.0",
"@types/node": "^20.10.0",
"@typescript-eslint/eslint-plugin": "^6.15.0",
"@typescript-eslint/parser": "^6.15.0",
"eslint": "^8.56.0",
"prettier": "^3.1.1",
"typescript": "^5.3.3",
"vitest": "^1.0.4"
},
"packageManager": "npm@10.2.5",
"files": [
"index.js",
"index.d.ts",
"README.md",
"LICENSE"
],
"optionalDependencies": {
"@ruvector/node-win32-x64-msvc": "0.1.0",
"@ruvector/node-darwin-x64": "0.1.0",
"@ruvector/node-darwin-arm64": "0.1.0",
"@ruvector/node-linux-x64-gnu": "0.1.0",
"@ruvector/node-linux-x64-musl": "0.1.0",
"@ruvector/node-linux-arm64-gnu": "0.1.0",
"@ruvector/node-linux-arm64-musl": "0.1.0"
}
}
6. GitHub Actions Workflow
.github/workflows/build.yml
name: Build and Release
on:
push:
branches: [main]
tags:
- 'v*'
pull_request:
branches: [main]
workflow_dispatch:
env:
DEBUG: napi:*
APP_NAME: ruvector-node
MACOSX_DEPLOYMENT_TARGET: '10.13'
jobs:
build:
strategy:
fail-fast: false
matrix:
settings:
- host: macos-latest
target: x86_64-apple-darwin
build: |
npm run build
strip -x *.node
- host: macos-latest
target: aarch64-apple-darwin
build: |
sudo rm -Rf /Library/Developer/CommandLineTools/SDKs/*;
export CC=$(xcrun -f clang);
export CXX=$(xcrun -f clang++);
SYSROOT=$(xcrun --sdk macosx --show-sdk-path);
export CFLAGS="-isysroot $SYSROOT -isystem $SYSROOT";
npm run build -- --target aarch64-apple-darwin
strip -x *.node
- host: ubuntu-latest
target: x86_64-unknown-linux-gnu
docker: ghcr.io/napi-rs/napi-rs/nodejs-rust:lts-debian
build: |
set -e &&
npm run build -- --target x86_64-unknown-linux-gnu &&
strip *.node
- host: ubuntu-latest
target: x86_64-unknown-linux-musl
docker: ghcr.io/napi-rs/napi-rs/nodejs-rust:lts-alpine
build: |
set -e &&
npm run build &&
strip *.node
- host: ubuntu-latest
target: aarch64-unknown-linux-gnu
docker: ghcr.io/napi-rs/napi-rs/nodejs-rust:lts-debian-aarch64
build: |
set -e &&
npm run build -- --target aarch64-unknown-linux-gnu &&
aarch64-unknown-linux-gnu-strip *.node
- host: ubuntu-latest
target: aarch64-unknown-linux-musl
docker: ghcr.io/napi-rs/napi-rs/nodejs-rust:lts-alpine-aarch64
build: |
set -e &&
rustup target add aarch64-unknown-linux-musl &&
npm run build -- --target aarch64-unknown-linux-musl &&
/aarch64-linux-musl-cross/bin/aarch64-linux-musl-strip *.node
- host: windows-latest
target: x86_64-pc-windows-msvc
build: npm run build
- host: windows-latest
target: aarch64-pc-windows-msvc
build: npm run build -- --target aarch64-pc-windows-msvc
name: stable - ${{ matrix.settings.target }} - node@20
runs-on: ${{ matrix.settings.host }}
steps:
- uses: actions/checkout@v4
- name: Setup node
uses: actions/setup-node@v4
with:
node-version: 20
cache: npm
- name: Install Rust
uses: dtolnay/rust-toolchain@stable
with:
toolchain: stable
targets: ${{ matrix.settings.target }}
- name: Cache cargo
uses: actions/cache@v3
with:
path: |
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
.cargo-cache
target/
key: ${{ matrix.settings.target }}-cargo-${{ matrix.settings.host }}
- name: Install dependencies
run: npm ci
- name: Build in docker
uses: addnab/docker-run-action@v3
if: ${{ matrix.settings.docker }}
with:
image: ${{ matrix.settings.docker }}
options: '--user 0:0 -v ${{ github.workspace }}/.cargo-cache/git/db:/usr/local/cargo/git/db -v ${{ github.workspace }}/.cargo/registry/cache:/usr/local/cargo/registry/cache -v ${{ github.workspace }}/.cargo/registry/index:/usr/local/cargo/registry/index -v ${{ github.workspace }}:/build -w /build'
run: ${{ matrix.settings.build }}
- name: Build
run: ${{ matrix.settings.build }}
if: ${{ !matrix.settings.docker }}
shell: bash
- name: Upload artifact
uses: actions/upload-artifact@v3
with:
name: bindings-${{ matrix.settings.target }}
path: ${{ env.APP_NAME }}.*.node
if-no-files-found: error
test-macOS-windows-binding:
name: Test bindings on ${{ matrix.settings.target }} - node@${{ matrix.node }}
needs:
- build
strategy:
fail-fast: false
matrix:
settings:
- host: macos-latest
target: x86_64-apple-darwin
- host: windows-latest
target: x86_64-pc-windows-msvc
node:
- '18'
- '20'
runs-on: ${{ matrix.settings.host }}
steps:
- uses: actions/checkout@v4
- name: Setup node
uses: actions/setup-node@v4
with:
node-version: ${{ matrix.node }}
cache: npm
- name: Install dependencies
run: npm ci
- name: Download artifacts
uses: actions/download-artifact@v3
with:
name: bindings-${{ matrix.settings.target }}
path: .
- name: List packages
run: ls -R .
shell: bash
- name: Test bindings
run: npm test
test-linux-x64-gnu-binding:
name: Test bindings on Linux-x64-gnu - node@${{ matrix.node }}
needs:
- build
strategy:
fail-fast: false
matrix:
node:
- '18'
- '20'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup node
uses: actions/setup-node@v4
with:
node-version: ${{ matrix.node }}
cache: npm
- name: Install dependencies
run: npm ci
- name: Download artifacts
uses: actions/download-artifact@v3
with:
name: bindings-x86_64-unknown-linux-gnu
path: .
- name: List packages
run: ls -R .
shell: bash
- name: Test bindings
run: docker run --rm -v $(pwd):/build -w /build node:${{ matrix.node }}-slim npm test
publish:
name: Publish
runs-on: ubuntu-latest
needs:
- test-macOS-windows-binding
- test-linux-x64-gnu-binding
if: startsWith(github.ref, 'refs/tags/v')
steps:
- uses: actions/checkout@v4
- name: Setup node
uses: actions/setup-node@v4
with:
node-version: 20
cache: npm
- name: Install dependencies
run: npm ci
- name: Download all artifacts
uses: actions/download-artifact@v3
with:
path: artifacts
- name: Move artifacts
run: npm run artifacts
- name: List packages
run: ls -R ./npm
shell: bash
- name: Publish
run: |
npm config set provenance true
if git log -1 --pretty=%B | grep "^[0-9]\+\.[0-9]\+\.[0-9]\+$";
then
echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" >> ~/.npmrc
npm publish --access public
elif git log -1 --pretty=%B | grep "^[0-9]\+\.[0-9]\+\.[0-9]\+";
then
echo "//registry.npmjs.org/:_authToken=$NPM_TOKEN" >> ~/.npmrc
npm publish --tag next --access public
else
echo "Not a release, skipping publish"
fi
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
7. Testing Strategy
test/attention.spec.ts
import { describe, it, expect } from 'vitest'
import {
DotProductAttention,
MultiHeadAttention,
GraphAttention,
TemporalAttention,
HierarchicalAttention,
AttentionConfig,
GraphStructure
} from '../index'
describe('DotProductAttention', () => {
it('should compute attention synchronously', () => {
const attention = new DotProductAttention()
const query = [[1, 2], [3, 4]]
const key = [[1, 2], [3, 4]]
const value = [[5, 6], [7, 8]]
const result = attention.compute(query, key, value)
expect(result.values).toBeDefined()
expect(result.values.length).toBe(2)
expect(result.weights).toBeDefined()
})
it('should compute attention asynchronously', async () => {
const attention = new DotProductAttention()
const query = [[1, 2], [3, 4]]
const key = [[1, 2], [3, 4]]
const value = [[5, 6], [7, 8]]
const result = await attention.computeAsync(query, key, value)
expect(result.values).toBeDefined()
expect(result.values.length).toBe(2)
})
it('should compute batch attention', () => {
const attention = new DotProductAttention()
const queries = [[[1, 2]], [[3, 4]]]
const keys = [[[1, 2]], [[3, 4]]]
const values = [[[5, 6]], [[7, 8]]]
const results = attention.computeBatch(queries, keys, values)
expect(results.length).toBe(2)
expect(results[0].values).toBeDefined()
})
it('should compute batch attention asynchronously', async () => {
const attention = new DotProductAttention()
const queries = [[[1, 2]], [[3, 4]]]
const keys = [[[1, 2]], [[3, 4]]]
const values = [[[5, 6]], [[7, 8]]]
const results = await attention.computeBatchAsync(queries, keys, values)
expect(results.length).toBe(2)
})
it('should accept custom configuration', () => {
const config: AttentionConfig = {
numHeads: 4,
headDim: 32,
dropout: 0.2,
scale: 0.5
}
const attention = new DotProductAttention(config)
expect(attention).toBeDefined()
})
})
describe('GraphAttention', () => {
it('should compute graph attention', () => {
const attention = new GraphAttention()
const nodeFeatures = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
const graph: GraphStructure = {
edges: [[0, 1], [1, 2], [2, 0]],
numNodes: 3
}
const result = attention.compute(nodeFeatures, graph)
expect(result.values).toBeDefined()
expect(result.metadata).toBeDefined()
expect(result.metadata.attention_type).toBe('graph')
})
it('should compute graph attention asynchronously', async () => {
const attention = new GraphAttention()
const nodeFeatures = [[1, 2, 3], [4, 5, 6]]
const graph: GraphStructure = {
edges: [[0, 1]],
numNodes: 2
}
const result = await attention.computeAsync(nodeFeatures, graph)
expect(result.values).toBeDefined()
})
})
describe('TemporalAttention', () => {
it('should compute temporal attention', () => {
const attention = new TemporalAttention()
const sequence = [[1, 2], [3, 4], [5, 6]]
const result = attention.compute(sequence)
expect(result.values).toBeDefined()
expect(result.metadata.sequence_length).toBe(3)
})
it('should handle timestamps', () => {
const attention = new TemporalAttention()
const sequence = [[1, 2], [3, 4]]
const timestamps = [0.0, 1.0]
const result = attention.compute(sequence, timestamps)
expect(result.metadata.has_timestamps).toBe(true)
})
})
describe('HierarchicalAttention', () => {
it('should compute hierarchical attention', () => {
const attention = new HierarchicalAttention()
const hierarchicalFeatures = [
[[1, 2], [3, 4]],
[[5, 6], [7, 8]]
]
const result = attention.compute(hierarchicalFeatures)
expect(result.values).toBeDefined()
expect(result.metadata.num_levels).toBe(2)
})
})
test/benchmark.spec.ts
import { describe, it, expect } from 'vitest'
import { DotProductAttention } from '../index'
describe('Performance Benchmarks', () => {
it('should handle large matrices efficiently', () => {
const attention = new DotProductAttention()
const size = 1000
const dim = 512
// Generate random matrices
const query = Array.from({ length: size }, () =>
Array.from({ length: dim }, () => Math.random())
)
const key = Array.from({ length: size }, () =>
Array.from({ length: dim }, () => Math.random())
)
const value = Array.from({ length: size }, () =>
Array.from({ length: dim }, () => Math.random())
)
const start = Date.now()
const result = attention.compute(query, key, value)
const duration = Date.now() - start
expect(result.values).toBeDefined()
expect(duration).toBeLessThan(5000) // Should complete in < 5 seconds
})
it('async should be faster for large batches', async () => {
const attention = new DotProductAttention()
const batchSize = 100
const queries = Array.from({ length: batchSize }, () =>
[[1, 2, 3], [4, 5, 6]]
)
const keys = queries
const values = queries
const syncStart = Date.now()
attention.computeBatch(queries, keys, values)
const syncDuration = Date.now() - syncStart
const asyncStart = Date.now()
await attention.computeBatchAsync(queries, keys, values)
const asyncDuration = Date.now() - asyncStart
console.log(`Sync: ${syncDuration}ms, Async: ${asyncDuration}ms`)
expect(asyncDuration).toBeLessThanOrEqual(syncDuration * 1.5)
})
})
8. Usage Examples
examples/basic-usage.js
const {
init,
getVersion,
getFeatures,
DotProductAttention
} = require('@ruvector/node')
// Initialize
console.log(init())
console.log('Version:', getVersion())
console.log('Features:', getFeatures())
// Create attention instance
const attention = new DotProductAttention({
numHeads: 8,
headDim: 64,
dropout: 0.1
})
// Compute attention
const query = [[1, 2, 3], [4, 5, 6]]
const key = [[1, 2, 3], [4, 5, 6]]
const value = [[7, 8, 9], [10, 11, 12]]
const result = attention.compute(query, key, value)
console.log('Attention output:', result.values)
console.log('Attention weights:', result.weights)
examples/async-batch.js
const { DotProductAttention } = require('@ruvector/node')
async function main() {
const attention = new DotProductAttention()
// Prepare batch data
const batchSize = 10
const queries = Array.from({ length: batchSize }, (_, i) =>
[[i, i+1], [i+2, i+3]]
)
const keys = queries
const values = queries
console.log('Processing batch asynchronously...')
const start = Date.now()
const results = await attention.computeBatchAsync(queries, keys, values)
const duration = Date.now() - start
console.log(`Processed ${batchSize} items in ${duration}ms`)
console.log('Results:', results.length)
}
main().catch(console.error)
examples/typescript-example.ts
import {
DotProductAttention,
GraphAttention,
AttentionConfig,
GraphStructure,
AttentionOutput
} from '@ruvector/node'
// Type-safe configuration
const config: AttentionConfig = {
numHeads: 8,
headDim: 64,
dropout: 0.1,
useBias: true
}
// Dot-product attention
const dotAttention = new DotProductAttention(config)
const output: AttentionOutput = dotAttention.compute(
[[1, 2], [3, 4]],
[[1, 2], [3, 4]],
[[5, 6], [7, 8]]
)
console.log('Attention values:', output.values)
// Graph attention
const graphAttention = new GraphAttention(config)
const graph: GraphStructure = {
edges: [[0, 1], [1, 2]],
numNodes: 3,
edgeFeatures: [[1.0], [2.0]]
}
const graphOutput = graphAttention.compute(
[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
graph
)
console.log('Graph attention:', graphOutput.metadata)
// Async with proper typing
async function processAsync(): Promise<void> {
const result = await dotAttention.computeAsync(
[[1, 2]],
[[1, 2]],
[[3, 4]]
)
console.log('Async result:', result)
}
processAsync()
9. Documentation
README.md
# @ruvector/node
High-performance Node.js bindings for RuVector's GNN latent space attention mechanisms.
## Features
- ⚡ **Blazing Fast**: Native Rust implementation with SIMD optimizations
- 🔄 **Async Support**: Non-blocking async methods with Tokio runtime
- 📦 **Batch Processing**: Efficient parallel batch operations
- 🎯 **Multiple Attention Types**: Dot-product, multi-head, graph, temporal, hierarchical
- 🔒 **Type Safe**: Full TypeScript definitions
- 🌐 **Cross-Platform**: Pre-built binaries for all major platforms
## Installation
```bash
npm install @ruvector/node
Quick Start
const { DotProductAttention } = require('@ruvector/node')
const attention = new DotProductAttention()
const result = attention.compute(query, key, value)
API Reference
See index.d.ts for complete API documentation.
Performance
- SIMD Optimized: AVX2 on x86_64, NEON on ARM64
- Parallel Processing: Multi-threaded batch operations
- Zero-Copy: Efficient memory handling
License
MIT
## Implementation Checklist
### Phase 1: Core Setup
- [ ] Initialize NAPI-RS project with Cargo.toml
- [ ] Configure build.rs for platform detection
- [ ] Set up basic project structure
- [ ] Implement error handling types
- [ ] Create TypeScript definitions
### Phase 2: Attention Mechanisms
- [ ] Implement DotProductAttention
- [ ] Implement MultiHeadAttention
- [ ] Implement GraphAttention
- [ ] Implement TemporalAttention
- [ ] Implement HierarchicalAttention
- [ ] Add sync/async variants for all
### Phase 3: Testing
- [ ] Write unit tests for each attention type
- [ ] Add integration tests
- [ ] Create benchmark suite
- [ ] Test all platforms
### Phase 4: Build & CI/CD
- [ ] Set up GitHub Actions workflow
- [ ] Configure multi-platform builds
- [ ] Add automated testing
- [ ] Set up npm publishing
### Phase 5: Documentation
- [ ] Write comprehensive README
- [ ] Add usage examples
- [ ] Document all APIs
- [ ] Create migration guide
## Performance Targets
- **Dot-Product Attention**: < 10ms for 1000x512 matrices
- **Batch Processing**: < 100ms for 100 items (async)
- **Memory Efficiency**: < 2x input size overhead
- **SIMD Speedup**: 2-4x over scalar implementation
## Platform Support
- ✅ Linux x64 (GNU/MUSL)
- ✅ Linux ARM64 (GNU/MUSL)
- ✅ macOS x64
- ✅ macOS ARM64 (Apple Silicon)
- ✅ Windows x64
- ✅ Windows ARM64
## Dependencies
- **napi-rs**: 2.16+ (Node.js bindings)
- **tokio**: 1.35+ (Async runtime)
- **ndarray**: 0.15+ (Linear algebra)
- **rayon**: 1.8+ (Parallelism)
## Integration Points
- **Agent 4**: Uses core Rust attention implementations
- **Agent 6**: Integrates with Python bindings
- **Agent 7**: Provides C++ FFI layer
- **Agent 9**: TypeScript SDK consumer