Merge commit 'd803bfe2b1fe7f5e219e50ac20d6801a0a58ac75' as 'vendor/ruvector'

**vendor/ruvector/npm/packages/rvdna/README.md** (vendored, new file, 304 lines)

# @ruvector/rvdna

**DNA analysis in JavaScript.** Encode sequences, translate proteins, search genomes by similarity, and read the `.rvdna` AI-native file format — all from Node.js or the browser.

Built on Rust via NAPI-RS for native speed. Falls back to pure JavaScript when native bindings aren't available.

```bash
npm install @ruvector/rvdna
```

## What It Does

| Function | What It Does | Native Required? |
|---|---|---|
| `encode2bit(seq)` | Pack DNA into 2-bit bytes (4 bases per byte) | No (JS fallback) |
| `decode2bit(buf, len)` | Unpack 2-bit bytes back to DNA string | No (JS fallback) |
| `translateDna(seq)` | Translate DNA to protein amino acids | No (JS fallback) |
| `cosineSimilarity(a, b)` | Cosine similarity between two vectors | No (JS fallback) |
| `fastaToRvdna(seq, opts)` | Convert FASTA to `.rvdna` binary format | Yes |
| `readRvdna(buf)` | Parse a `.rvdna` file from a Buffer | Yes |
| `isNativeAvailable()` | Check if native Rust bindings are loaded | No |

## Quick Start

```js
const { encode2bit, decode2bit, translateDna, cosineSimilarity } = require('@ruvector/rvdna');

// Encode DNA to compact 2-bit format (4 bases per byte)
const packed = encode2bit('ACGTACGTACGT');
console.log(packed); // <Buffer 1b 1b 1b>

// Decode it back — lossless round-trip
const dna = decode2bit(packed, 12);
console.log(dna); // 'ACGTACGTACGT'

// Translate DNA to protein (standard genetic code)
const protein = translateDna('ATGGCCATTGTAATG');
console.log(protein); // 'MAIVM'

// Compare two k-mer vectors
const sim = cosineSimilarity([1, 2, 3], [1, 2, 3]);
console.log(sim); // 1.0 (identical)
```

## API Reference

### `encode2bit(sequence: string): Buffer`

Packs a DNA string into 2-bit bytes. Each byte holds 4 bases: A=00, C=01, G=10, T=11. Ambiguous bases (N) map to A.

```js
encode2bit('ACGT') // <Buffer 1b> — one byte for 4 bases
encode2bit('AAAA') // <Buffer 00>
encode2bit('TTTT') // <Buffer ff>
```

### `decode2bit(buffer: Buffer, length: number): string`

Decodes 2-bit packed bytes back to a DNA string. You must pass the original sequence length, since the last byte may contain padding.

```js
decode2bit(Buffer.from([0x1b]), 4) // 'ACGT'
```
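The padding behavior is easy to see in a standalone sketch of the documented 2-bit scheme (plain Node.js, no package needed; `pack` and `unpack` here are illustrative stand-ins for `encode2bit` and `decode2bit`):

```javascript
// Sketch of the documented 2-bit scheme (A=00, C=01, G=10, T=11), showing why
// decode needs the original length: the last byte is zero-padded.
const MAP = { A: 0, C: 1, G: 2, T: 3, N: 0 };
const BASES = ['A', 'C', 'G', 'T'];

function pack(seq) {
  const buf = new Uint8Array(Math.ceil(seq.length / 4));
  for (let i = 0; i < seq.length; i++) {
    // Two bits per base, most significant pair first within each byte
    buf[i >> 2] |= (MAP[seq[i]] || 0) << (6 - (i & 3) * 2);
  }
  return buf;
}

function unpack(buf, length) {
  let out = '';
  for (let i = 0; i < length; i++) {
    out += BASES[(buf[i >> 2] >> (6 - (i & 3) * 2)) & 3];
  }
  return out;
}

const packed = pack('ACGTA');     // 5 bases -> 2 bytes (0x1b, 0x00), last byte padded
console.log(unpack(packed, 5));   // 'ACGTA' (padding ignored)
console.log(unpack(packed, 8));   // 'ACGTAAAA' (padding bits decode as 'A')
```

Because the zero padding is indistinguishable from real `A` bases, the length argument is the only way to recover the exact original sequence.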

### `translateDna(sequence: string): string`

Translates a DNA string to a protein amino acid string using the standard genetic code. Stops at the first stop codon (TAA, TAG, TGA).

```js
translateDna('ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGA')
// 'MAIVMGR' — stops at the TGA stop codon
```

### `cosineSimilarity(a: number[], b: number[]): number`

Returns the cosine similarity between two numeric arrays. The result is between -1 and 1.

```js
cosineSimilarity([1, 0, 0], [0, 1, 0]) // 0 (orthogonal)
cosineSimilarity([1, 2, 3], [2, 4, 6]) // 1 (parallel)
```
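To make the genomics use concrete, here is a hypothetical end-to-end sketch in plain JS: build naive k-mer count vectors for two sequences and compare them with an inline cosine. The package's real k-mer vectors use a different fixed-dimension encoding, so the vocabulary and helper names here are illustrative only.

```javascript
// Count occurrences of each k-mer from a fixed vocabulary (demo only; the
// package's own vectors are built differently).
function kmerCounts(seq, k, kmers) {
  return kmers.map(km => {
    let n = 0;
    for (let i = 0; i + k <= seq.length; i++) {
      if (seq.slice(i, i + k) === km) n++;
    }
    return n;
  });
}

function cosine(a, b) {
  let dot = 0, ma = 0, mb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    ma += a[i] * a[i];
    mb += b[i] * b[i];
  }
  return (ma && mb) ? dot / (Math.sqrt(ma) * Math.sqrt(mb)) : 0;
}

const kmers = ['AC', 'CG', 'GT', 'TA'];      // tiny fixed vocabulary for the demo
const a = kmerCounts('ACGTACGT', 2, kmers);
const b = kmerCounts('ACGTACGA', 2, kmers);
console.log(cosine(a, a)); // ~1 (identical sequences)
console.log(cosine(a, b)); // ~0.96 (one base differs)
```

The point of the vector representation is exactly this: a single-base difference moves the similarity only slightly, so nearby sequences stay nearby in vector space.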

### `fastaToRvdna(sequence: string, options?: RvdnaOptions): Buffer`

Converts a raw DNA sequence to the `.rvdna` binary format with pre-computed k-mer vectors. **Requires native bindings.**

```js
const { fastaToRvdna, isNativeAvailable } = require('@ruvector/rvdna');

if (isNativeAvailable()) {
  const rvdna = fastaToRvdna('ACGTACGT...', { k: 11, dims: 512, blockSize: 500 });
  require('fs').writeFileSync('output.rvdna', rvdna);
}
```

| Option | Default | Description |
|---|---|---|
| `k` | 11 | K-mer size for vector encoding |
| `dims` | 512 | Vector dimensions per block |
| `blockSize` | 500 | Bases per vector block |
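As a back-of-envelope aid, the options above imply one `dims`-length Float32 vector per `blockSize`-base block. A hypothetical estimator for the k-mer vector payload (an assumption about the layout, not a measurement of real files):

```javascript
// Assumed model: ceil(seqLen / blockSize) blocks, each a Float32Array of
// `dims` dimensions (4 bytes per dimension). Header and section overhead
// are ignored; this is a rough sizing aid only.
function vectorBytes(seqLen, { dims = 512, blockSize = 500 } = {}) {
  const blocks = Math.ceil(seqLen / blockSize);
  return blocks * dims * 4;
}

console.log(vectorBytes(430));                   // 2048 (one 512-dim block)
console.log(vectorBytes(100000));                // 409600 (200 blocks)
console.log(vectorBytes(100000, { dims: 128 })); // 102400 (smaller vectors, coarser search)
```

Lowering `dims` or raising `blockSize` shrinks the file at the cost of coarser similarity search, which is the main trade-off these options control.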

### `readRvdna(buffer: Buffer): RvdnaFile`

Parses a `.rvdna` file. Returns the decoded sequence, k-mer vectors, variants, metadata, and file statistics. **Requires native bindings.**

```js
const fs = require('fs');
const { readRvdna } = require('@ruvector/rvdna');

const file = readRvdna(fs.readFileSync('sample.rvdna'));

console.log(file.sequenceLength);         // 430
console.log(file.sequence.slice(0, 20));  // 'ATGGTGCATCTGACTCCTGA'
console.log(file.kmerVectors.length);     // number of vector blocks
console.log(file.stats.bitsPerBase);      // ~3.2
console.log(file.stats.compressionRatio); // vs raw FASTA
```

**RvdnaFile fields:**

| Field | Type | Description |
|---|---|---|
| `version` | `number` | Format version |
| `sequenceLength` | `number` | Number of bases |
| `sequence` | `string` | Decoded DNA string |
| `kmerVectors` | `Array` | Pre-computed k-mer vector blocks |
| `variants` | `Array \| null` | Variant positions with genotype likelihoods |
| `metadata` | `Record \| null` | Key-value metadata |
| `stats.totalSize` | `number` | File size in bytes |
| `stats.bitsPerBase` | `number` | Storage efficiency |
| `stats.compressionRatio` | `number` | Compression vs raw |

## The `.rvdna` File Format

Traditional genomic formats (FASTA, FASTQ, BAM) store raw sequences. Every time an AI model needs that data, it re-encodes everything from scratch — vectors, attention matrices, features. This takes 30-120 seconds per file.

`.rvdna` stores the sequence **and** pre-computed AI features together. Open the file and everything is ready — no re-encoding.

```
.rvdna file layout:

[Magic: "RVDNA\x01\x00\x00"]     8 bytes — file identifier
[Header]                         64 bytes — version, flags, offsets
[Section 0: Sequence]            2-bit packed DNA (4 bases/byte)
[Section 1: K-mer Vectors]       HNSW-ready embeddings
[Section 2: Attention Weights]   Sparse COO matrices
[Section 3: Variant Tensor]      f16 genotype likelihoods
[Section 4: Protein Embeddings]  GNN features + contact graphs
[Section 5: Epigenomic Tracks]   Methylation + clock data
[Section 6: Metadata]            JSON provenance + checksums
```
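The 8-byte magic in the layout above is enough to cheaply sanity-check a buffer before handing it to a full parser. A minimal sketch, assuming only the magic bytes shown in the diagram (the rest of the header layout is not specified here):

```javascript
// "RVDNA" + version byte 0x01 + two zero bytes, per the layout diagram above.
const MAGIC = Buffer.from([0x52, 0x56, 0x44, 0x4e, 0x41, 0x01, 0x00, 0x00]);

function looksLikeRvdna(buf) {
  return buf.length >= 8 && buf.subarray(0, 8).equals(MAGIC);
}

console.log(looksLikeRvdna(Buffer.concat([MAGIC, Buffer.from('payload')]))); // true
console.log(looksLikeRvdna(Buffer.from('>seq1\nACGTACGT\n')));               // false (FASTA, not .rvdna)
```

A fixed leading magic like this is what makes cheap format sniffing possible without an index or extension check.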

### Format Comparison

| | FASTA | FASTQ | BAM | CRAM | **.rvdna** |
|---|---|---|---|---|---|
| **Encoding** | ASCII (1 char/base) | ASCII + Phred | Binary + ref | Ref-compressed | 2-bit packed |
| **Bits per base** | 8 | 16 | 2-4 | 0.5-2 | **3.2** (seq only) |
| **Random access** | Scan from start | Scan from start | Index ~10 µs | Decode ~50 µs | **mmap <1 µs** |
| **AI features included** | No | No | No | No | **Yes** |
| **Vector search ready** | No | No | No | No | **HNSW built-in** |
| **Zero-copy mmap** | No | No | Partial | No | **Full** |
| **Single file** | Yes | Yes | Needs `.bai` | Needs `.crai` | **Yes** |

## Platform Support

Native NAPI-RS bindings are available for these platforms:

| Platform | Architecture | Package |
|---|---|---|
| Linux | x64 (glibc) | `@ruvector/rvdna-linux-x64-gnu` |
| Linux | ARM64 (glibc) | `@ruvector/rvdna-linux-arm64-gnu` |
| macOS | x64 (Intel) | `@ruvector/rvdna-darwin-x64` |
| macOS | ARM64 (Apple Silicon) | `@ruvector/rvdna-darwin-arm64` |
| Windows | x64 | `@ruvector/rvdna-win32-x64-msvc` |

These install automatically as optional dependencies. On unsupported platforms, the basic functions (`encode2bit`, `decode2bit`, `translateDna`, `cosineSimilarity`) still work via pure JavaScript fallbacks.

## WASM (WebAssembly)

rvDNA can run entirely in the browser via WebAssembly. No server needed, and no data leaves the user's device.

### Browser Setup

```bash
# Build from the Rust source
cd examples/dna
wasm-pack build --target web --release
```

This produces a `pkg/` directory with the `.wasm` binary and its `.js` glue code.

### Using in HTML

```html
<script type="module">
  import init, { encode2bit, translateDna } from './pkg/rvdna.js';

  await init(); // Load the WASM module

  // Encode DNA
  const packed = encode2bit('ACGTACGTACGT');
  console.log('Packed bytes:', packed);

  // Translate to protein
  const protein = translateDna('ATGGCCATTGTAATG');
  console.log('Protein:', protein); // 'MAIVM'
</script>
```

### Using with Bundlers (Webpack, Vite)

```bash
# For bundler targets
wasm-pack build --target bundler --release
```

```js
// In your app
import { encode2bit, translateDna, fastaToRvdna } from '@ruvector/rvdna-wasm';

const packed = encode2bit('ACGTACGT');
const protein = translateDna('ATGGCCATT');
```

### WASM Features

| Feature | Status | Description |
|---|---|---|
| 2-bit encode/decode | Available | Pack/unpack DNA sequences |
| Protein translation | Available | Standard genetic code |
| Cosine similarity | Available | Vector comparison |
| `.rvdna` read/write | Planned | Full format support in browser |
| HNSW search | Planned | K-mer similarity search |
| Variant calling | Planned | Client-side mutation detection |

**Target WASM binary size:** <2 MB gzipped

### Privacy

WASM runs entirely client-side. DNA data never leaves the browser. This makes it suitable for:

- Clinical genomics dashboards
- Patient-facing genetic reports
- Educational tools
- Offline/edge analysis on devices with no internet

## TypeScript

Full TypeScript definitions are included. Import types directly:

```ts
import {
  encode2bit,
  decode2bit,
  translateDna,
  cosineSimilarity,
  fastaToRvdna,
  readRvdna,
  isNativeAvailable,
  RvdnaOptions,
  RvdnaFile,
} from '@ruvector/rvdna';
```

## Speed

The native (Rust) backend handles these operations on real human gene data:

| Operation | Time | What It Does |
|---|---|---|
| Single SNP call | **155 ns** | Bayesian genotyping at one position |
| Protein translation (1 kb) | **23 ns** | DNA to amino acids |
| K-mer vector (1 kb) | **591 µs** | Full pipeline with HNSW indexing |
| Complete analysis (5 genes) | **12 ms** | All stages including `.rvdna` output |

### vs Traditional Tools

| Task | Traditional Tool | Their Time | rvDNA | Speedup |
|---|---|---|---|---|
| K-mer counting | Jellyfish | 15-30 min | 2-5 sec | **180-900x** |
| Sequence similarity | BLAST | 1-5 min | 5-50 ms | **1,200-60,000x** |
| Variant calling | GATK | 30-90 min | 3-10 min | **3-30x** |
| Methylation age | R/Bioconductor | 5-15 min | 0.1-0.5 sec | **600-9,000x** |

## Rust Crate

The full Rust crate with all algorithms is available on crates.io:

```toml
[dependencies]
rvdna = "0.1"
```

See the [Rust documentation](https://docs.rs/rvdna) for the complete API, including Smith-Waterman alignment, the Horvath clock, CYP2D6 pharmacogenomics, and more.

## Links

- [GitHub](https://github.com/ruvnet/ruvector/tree/main/examples/dna) - Source code
- [crates.io](https://crates.io/crates/rvdna) - Rust crate
- [RuVector](https://github.com/ruvnet/ruvector) - Parent vector computing platform

## License

MIT

**vendor/ruvector/npm/packages/rvdna/index.d.ts** (vendored, new file, 359 lines)

/**
 * @ruvector/rvdna — AI-native genomic analysis and the .rvdna file format.
 *
 * Provides variant calling, protein translation, k-mer vector search,
 * and the compact .rvdna binary format via Rust NAPI-RS bindings.
 */

/**
 * Encode a DNA string to 2-bit packed bytes (4 bases per byte).
 * A=00, C=01, G=10, T=11. Ambiguous bases (N) map to A.
 */
export function encode2bit(sequence: string): Buffer;

/**
 * Decode 2-bit packed bytes back to a DNA string.
 * @param buffer - The 2-bit packed buffer
 * @param length - Number of bases to decode
 */
export function decode2bit(buffer: Buffer, length: number): string;

/**
 * Translate a DNA string to a protein amino acid string.
 * Uses the standard genetic code. Stops at the first stop codon.
 */
export function translateDna(sequence: string): string;

/**
 * Compute cosine similarity between two numeric arrays.
 * Returns a value between -1 and 1.
 */
export function cosineSimilarity(a: number[], b: number[]): number;

export interface RvdnaOptions {
  /** K-mer size (default: 11) */
  k?: number;
  /** Vector dimensions (default: 512) */
  dims?: number;
  /** Block size in bases (default: 500) */
  blockSize?: number;
}

/**
 * Convert a FASTA sequence string to .rvdna binary format.
 * Requires native bindings.
 */
export function fastaToRvdna(sequence: string, options?: RvdnaOptions): Buffer;

export interface RvdnaFile {
  /** Format version */
  version: number;
  /** Sequence length in bases */
  sequenceLength: number;
  /** Decoded DNA sequence */
  sequence: string;
  /** Pre-computed k-mer vector blocks */
  kmerVectors: Array<{
    k: number;
    dimensions: number;
    startPos: number;
    regionLen: number;
    vector: Float32Array;
  }>;
  /** Variant positions and genotype likelihoods */
  variants: Array<{
    position: number;
    refAllele: string;
    altAllele: string;
    likelihoods: [number, number, number];
    quality: number;
  }> | null;
  /** Metadata key-value pairs */
  metadata: Record<string, unknown> | null;
  /** File statistics */
  stats: {
    totalSize: number;
    bitsPerBase: number;
    compressionRatio: number;
  };
}

/**
 * Read a .rvdna file from a Buffer. Returns parsed sections.
 * Requires native bindings.
 */
export function readRvdna(buffer: Buffer): RvdnaFile;

/**
 * Check if native bindings are available for the current platform.
 */
export function isNativeAvailable(): boolean;

/**
 * Direct access to the native NAPI-RS module (null if not available).
 */
export const native: Record<string, Function> | null;

// -------------------------------------------------------------------
// 23andMe Genotyping Pipeline (v0.2.0)
// -------------------------------------------------------------------

/**
 * Normalize a genotype string: uppercase, trim, sort allele pair.
 * "ag" → "AG", "TC" → "CT", "DI" → "DI"
 */
export function normalizeGenotype(gt: string): string;

export interface Snp {
  rsid: string;
  chromosome: string;
  position: number;
  genotype: string;
}

export interface GenotypeData {
  snps: Record<string, Snp>;
  totalMarkers: number;
  noCalls: number;
  chrCounts: Record<string, number>;
  build: 'GRCh37' | 'GRCh38' | 'Unknown';
}

/**
 * Parse a 23andMe raw data file (v4/v5 tab-separated format).
 * Normalizes all genotype strings on load.
 */
export function parse23andMe(text: string): {
  snps: Map<string, Snp>;
  totalMarkers: number;
  noCalls: number;
  chrCounts: Map<string, number>;
  build: string;
};

export interface CypDiplotype {
  gene: string;
  allele1: string;
  allele2: string;
  activity: number;
  phenotype: 'UltraRapid' | 'Normal' | 'Intermediate' | 'Poor';
  confidence: 'Unsupported' | 'Weak' | 'Moderate' | 'Strong';
  rsidsGenotyped: number;
  rsidsMatched: number;
  rsidsTotal: number;
  notes: string[];
  details: string[];
}

/** Call CYP2D6 diplotype from a genotype map */
export function callCyp2d6(gts: Map<string, string>): CypDiplotype;

/** Call CYP2C19 diplotype from a genotype map */
export function callCyp2c19(gts: Map<string, string>): CypDiplotype;

export interface ApoeResult {
  genotype: string;
  rs429358: string;
  rs7412: string;
}

/** Determine APOE genotype from rs429358 + rs7412 */
export function determineApoe(gts: Map<string, string>): ApoeResult;

export interface AnalysisResult {
  data: GenotypeData;
  cyp2d6: CypDiplotype;
  cyp2c19: CypDiplotype;
  apoe: ApoeResult;
  homozygous: number;
  heterozygous: number;
  indels: number;
  hetRatio: number;
}

/**
 * Run the full 23andMe analysis pipeline.
 * @param text - Raw 23andMe file contents
 */
export function analyze23andMe(text: string): AnalysisResult;

// -------------------------------------------------------------------
// Biomarker Risk Scoring Engine (v0.3.0)
// -------------------------------------------------------------------

/** Clinical reference range for a single biomarker. */
export interface BiomarkerReference {
  name: string;
  unit: string;
  normalLow: number;
  normalHigh: number;
  criticalLow: number | null;
  criticalHigh: number | null;
  category: string;
}

/** Classification of a biomarker value relative to its reference range. */
export type BiomarkerClassification = 'CriticalLow' | 'Low' | 'Normal' | 'High' | 'CriticalHigh';

/** Risk score for a single clinical category. */
export interface CategoryScore {
  category: string;
  score: number;
  confidence: number;
  contributingVariants: string[];
}

/** Full biomarker + genotype risk profile for one subject. */
export interface BiomarkerProfile {
  subjectId: string;
  timestamp: number;
  categoryScores: Record<string, CategoryScore>;
  globalRiskScore: number;
  profileVector: Float32Array;
  biomarkerValues: Record<string, number>;
}

/** SNP risk descriptor. */
export interface SnpDef {
  rsid: string;
  category: string;
  wRef: number;
  wHet: number;
  wAlt: number;
  homRef: string;
  het: string;
  homAlt: string;
  maf: number;
}

/** Gene-gene interaction descriptor. */
export interface InteractionDef {
  rsidA: string;
  rsidB: string;
  modifier: number;
  category: string;
}

/** 13 clinical biomarker reference ranges. */
export const BIOMARKER_REFERENCES: readonly BiomarkerReference[];

/** 20-SNP risk table (mirrors Rust biomarker.rs). */
export const SNPS: readonly SnpDef[];

/** 6 gene-gene interaction modifiers. */
export const INTERACTIONS: readonly InteractionDef[];

/** Category ordering: Cancer Risk, Cardiovascular, Neurological, Metabolism. */
export const CAT_ORDER: readonly string[];

/** Return the static biomarker reference table. */
export function biomarkerReferences(): readonly BiomarkerReference[];

/** Compute a z-score for a value relative to a reference range. */
export function zScore(value: number, ref: BiomarkerReference): number;

/** Classify a biomarker value against its reference range. */
export function classifyBiomarker(value: number, ref: BiomarkerReference): BiomarkerClassification;

/** Compute composite risk scores from genotype data (20 SNPs, 6 interactions). */
export function computeRiskScores(genotypes: Map<string, string>): BiomarkerProfile;

/** Encode a profile into a 64-dim L2-normalized Float32Array. */
export function encodeProfileVector(profile: BiomarkerProfile): Float32Array;

/** Generate a deterministic synthetic population of biomarker profiles. */
export function generateSyntheticPopulation(count: number, seed: number): BiomarkerProfile[];

// -------------------------------------------------------------------
// Streaming Biomarker Processor (v0.3.0)
// -------------------------------------------------------------------

/** Biomarker stream definition. */
export interface BiomarkerDef {
  id: string;
  low: number;
  high: number;
}

/** 6 streaming biomarker definitions. */
export const BIOMARKER_DEFS: readonly BiomarkerDef[];

/** Configuration for the streaming biomarker simulator. */
export interface StreamConfig {
  baseIntervalMs: number;
  noiseAmplitude: number;
  driftRate: number;
  anomalyProbability: number;
  anomalyMagnitude: number;
  numBiomarkers: number;
  windowSize: number;
}

/** A single timestamped biomarker data point. */
export interface BiomarkerReading {
  timestampMs: number;
  biomarkerId: string;
  value: number;
  referenceLow: number;
  referenceHigh: number;
  isAnomaly: boolean;
  zScore: number;
}

/** Rolling statistics for a single biomarker stream. */
export interface StreamStats {
  mean: number;
  variance: number;
  min: number;
  max: number;
  count: number;
  anomalyRate: number;
  trendSlope: number;
  ema: number;
  cusumPos: number;
  cusumNeg: number;
  changepointDetected: boolean;
}

/** Result of processing a single reading. */
export interface ProcessingResult {
  accepted: boolean;
  zScore: number;
  isAnomaly: boolean;
  currentTrend: number;
}

/** Aggregate summary across all biomarker streams. */
export interface StreamSummary {
  totalReadings: number;
  anomalyCount: number;
  anomalyRate: number;
  biomarkerStats: Record<string, StreamStats>;
  throughputReadingsPerSec: number;
}

/** Fixed-capacity circular buffer backed by Float64Array. */
export class RingBuffer {
  constructor(capacity: number);
  push(item: number): void;
  toArray(): number[];
  readonly length: number;
  readonly capacity: number;
  isFull(): boolean;
  clear(): void;
  [Symbol.iterator](): IterableIterator<number>;
}

/** Streaming biomarker processor with per-stream ring buffers, z-score anomaly detection, CUSUM changepoint detection, and trend analysis. */
export class StreamProcessor {
  constructor(config?: StreamConfig);
  processReading(reading: BiomarkerReading): ProcessingResult;
  getStats(biomarkerId: string): StreamStats | null;
  summary(): StreamSummary;
}

/** Return the default stream configuration. */
export function defaultStreamConfig(): StreamConfig;

/** Generate a batch of synthetic biomarker readings. */
export function generateReadings(config: StreamConfig, count: number, seed: number): BiomarkerReading[];

**vendor/ruvector/npm/packages/rvdna/index.js** (vendored, new file, 392 lines)

const { platform, arch } = process;

// Platform-specific native binary packages
const platformMap = {
  'linux': {
    'x64': '@ruvector/rvdna-linux-x64-gnu',
    'arm64': '@ruvector/rvdna-linux-arm64-gnu'
  },
  'darwin': {
    'x64': '@ruvector/rvdna-darwin-x64',
    'arm64': '@ruvector/rvdna-darwin-arm64'
  },
  'win32': {
    'x64': '@ruvector/rvdna-win32-x64-msvc'
  }
};

function loadNativeModule() {
  const platformPackage = platformMap[platform]?.[arch];

  if (!platformPackage) {
    throw new Error(
      `Unsupported platform: ${platform}-${arch}\n` +
      `@ruvector/rvdna native bindings are available for:\n` +
      `- Linux (x64, ARM64)\n` +
      `- macOS (x64, ARM64)\n` +
      `- Windows (x64)\n\n` +
      `For other platforms, use the WASM build: npm install @ruvector/rvdna-wasm`
    );
  }

  try {
    return require(platformPackage);
  } catch (error) {
    if (error.code === 'MODULE_NOT_FOUND') {
      throw new Error(
        `Native module not found for ${platform}-${arch}\n` +
        `Please install: npm install ${platformPackage}\n` +
        `Or reinstall @ruvector/rvdna to get optional dependencies`
      );
    }
    throw error;
  }
}

// Try native first, fall back to a pure JS shim with basic functionality
let nativeModule;
try {
  nativeModule = loadNativeModule();
} catch (e) {
  // Native bindings not available — provide JS shim for basic operations
  nativeModule = null;
}

// -------------------------------------------------------------------
// Public API — wraps native bindings or provides JS fallbacks
// -------------------------------------------------------------------

/**
 * Encode a DNA string to 2-bit packed bytes (4 bases per byte).
 * A=00, C=01, G=10, T=11. Returns a Buffer.
 */
function encode2bit(sequence) {
  if (nativeModule?.encode2bit) return nativeModule.encode2bit(sequence);

  // JS fallback: case-insensitive; unknown/ambiguous bases map to A
  const map = { A: 0, C: 1, G: 2, T: 3, N: 0 };
  const len = sequence.length;
  const buf = Buffer.alloc(Math.ceil(len / 4));
  for (let i = 0; i < len; i++) {
    const byteIdx = i >> 2;
    const bitOff = 6 - (i & 3) * 2;
    buf[byteIdx] |= (map[sequence[i].toUpperCase()] || 0) << bitOff;
  }
  return buf;
}

/**
 * Decode 2-bit packed bytes back to a DNA string.
 */
function decode2bit(buffer, length) {
  if (nativeModule?.decode2bit) return nativeModule.decode2bit(buffer, length);

  const bases = ['A', 'C', 'G', 'T'];
  let result = '';
  for (let i = 0; i < length; i++) {
    const byteIdx = i >> 2;
    const bitOff = 6 - (i & 3) * 2;
    result += bases[(buffer[byteIdx] >> bitOff) & 3];
  }
  return result;
}

/**
 * Translate a DNA string to a protein amino acid string.
 */
function translateDna(sequence) {
  if (nativeModule?.translateDna) return nativeModule.translateDna(sequence);

  // JS fallback — standard genetic code
  const codons = {
    'TTT':'F','TTC':'F','TTA':'L','TTG':'L','CTT':'L','CTC':'L','CTA':'L','CTG':'L',
    'ATT':'I','ATC':'I','ATA':'I','ATG':'M','GTT':'V','GTC':'V','GTA':'V','GTG':'V',
    'TCT':'S','TCC':'S','TCA':'S','TCG':'S','CCT':'P','CCC':'P','CCA':'P','CCG':'P',
    'ACT':'T','ACC':'T','ACA':'T','ACG':'T','GCT':'A','GCC':'A','GCA':'A','GCG':'A',
    'TAT':'Y','TAC':'Y','TAA':'*','TAG':'*','CAT':'H','CAC':'H','CAA':'Q','CAG':'Q',
    'AAT':'N','AAC':'N','AAA':'K','AAG':'K','GAT':'D','GAC':'D','GAA':'E','GAG':'E',
    'TGT':'C','TGC':'C','TGA':'*','TGG':'W','CGT':'R','CGC':'R','CGA':'R','CGG':'R',
    'AGT':'S','AGC':'S','AGA':'R','AGG':'R','GGT':'G','GGC':'G','GGA':'G','GGG':'G',
  };
  let protein = '';
  for (let i = 0; i + 2 < sequence.length; i += 3) {
    const codon = sequence.slice(i, i + 3).toUpperCase();
    const aa = codons[codon] || 'X';
    if (aa === '*') break;
    protein += aa;
  }
  return protein;
}

/**
 * Compute cosine similarity between two numeric arrays.
 */
function cosineSimilarity(a, b) {
  if (nativeModule?.cosineSimilarity) return nativeModule.cosineSimilarity(a, b);

  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  magA = Math.sqrt(magA);
  magB = Math.sqrt(magB);
  return (magA && magB) ? dot / (magA * magB) : 0;
}

/**
 * Convert a FASTA sequence string to .rvdna binary format.
 * Returns a Buffer with the complete .rvdna file contents.
 */
function fastaToRvdna(sequence, options = {}) {
  if (nativeModule?.fastaToRvdna) {
    return nativeModule.fastaToRvdna(sequence, options.k || 11, options.dims || 512, options.blockSize || 500);
  }
  throw new Error('fastaToRvdna requires native bindings. Install the platform-specific package.');
}

/**
 * Read a .rvdna file from a Buffer. Returns parsed sections.
 */
function readRvdna(buffer) {
  if (nativeModule?.readRvdna) return nativeModule.readRvdna(buffer);
  throw new Error('readRvdna requires native bindings. Install the platform-specific package.');
}

/**
 * Check if native bindings are available.
 */
function isNativeAvailable() {
  return nativeModule !== null;
}
// -------------------------------------------------------------------
|
||||
// 23andMe Genotyping Pipeline (pure JS — mirrors Rust rvdna::genotyping)
|
||||
// -------------------------------------------------------------------
|
||||
|
||||
/**
|
||||
* Normalize a genotype string: uppercase, trim, sort allele pair.
|
||||
* "ag" → "AG", "TC" → "CT", "DI" → "DI"
|
||||
*/
|
||||
function normalizeGenotype(gt) {
|
||||
gt = gt.trim().toUpperCase();
|
||||
if (gt.length === 2 && gt[0] > gt[1]) {
|
||||
return gt[1] + gt[0];
|
||||
}
|
||||
return gt;
|
||||
}
|

/**
 * Parse a 23andMe raw data file (v4/v5 tab-separated format).
 * @param {string} text - Raw file contents
 * @returns {{ snps: Map<string,object>, totalMarkers: number, noCalls: number, chrCounts: Map<string,number>, build: string }}
 */
function parse23andMe(text) {
  const snps = new Map();
  const chrCounts = new Map();
  let total = 0, noCalls = 0;
  let build = 'Unknown';

  // Split on \r?\n so CRLF files (common for raw-data downloads) parse cleanly.
  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith('#')) {
      const lower = line.toLowerCase();
      if (lower.includes('build 37') || lower.includes('grch37') || lower.includes('hg19')) build = 'GRCh37';
      else if (lower.includes('build 38') || lower.includes('grch38') || lower.includes('hg38')) build = 'GRCh38';
      continue;
    }
    if (!line.trim()) continue;
    const parts = line.split('\t');
    if (parts.length < 4) continue;
    const [rsid, chrom, posStr, genotype] = parts;
    total++;
    if (genotype === '--') { noCalls++; continue; }
    const pos = parseInt(posStr, 10) || 0;
    const normGt = normalizeGenotype(genotype);
    chrCounts.set(chrom, (chrCounts.get(chrom) || 0) + 1);
    snps.set(rsid, { rsid, chromosome: chrom, position: pos, genotype: normGt });
  }

  if (total === 0) throw new Error('No markers found in file');
  return { snps, totalMarkers: total, noCalls, chrCounts, build };
}
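A minimal end-to-end sketch of the parser on a tiny synthetic input. Both helpers are copied from this file; the three-row "raw file" below is invented data, not a real genotype export:

```javascript
// Copies of normalizeGenotype and parse23andMe from this file.
function normalizeGenotype(gt) {
  gt = gt.trim().toUpperCase();
  if (gt.length === 2 && gt[0] > gt[1]) return gt[1] + gt[0];
  return gt;
}

function parse23andMe(text) {
  const snps = new Map();
  const chrCounts = new Map();
  let total = 0, noCalls = 0;
  let build = 'Unknown';
  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith('#')) {
      const lower = line.toLowerCase();
      if (lower.includes('build 37') || lower.includes('grch37') || lower.includes('hg19')) build = 'GRCh37';
      else if (lower.includes('build 38') || lower.includes('grch38') || lower.includes('hg38')) build = 'GRCh38';
      continue;
    }
    if (!line.trim()) continue;
    const parts = line.split('\t');
    if (parts.length < 4) continue;
    const [rsid, chrom, posStr, genotype] = parts;
    total++;
    if (genotype === '--') { noCalls++; continue; }
    chrCounts.set(chrom, (chrCounts.get(chrom) || 0) + 1);
    snps.set(rsid, { rsid, chromosome: chrom, position: parseInt(posStr, 10) || 0, genotype: normalizeGenotype(genotype) });
  }
  if (total === 0) throw new Error('No markers found in file');
  return { snps, totalMarkers: total, noCalls, chrCounts, build };
}

// Invented three-marker input: one header, two calls, one no-call.
const text = [
  '# reference human assembly build 37',
  'rs429358\t19\t45411941\tTC',
  'rs7412\t19\t45412079\tCC',
  'rs12345\t1\t10000\t--',
].join('\n');

const result = parse23andMe(text);
console.log(result.build);                         // 'GRCh37'
console.log(result.totalMarkers, result.noCalls);  // 3 1
console.log(result.snps.get('rs429358').genotype); // 'CT' (allele pair sorted)
```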

// CYP defining variant tables
const CYP2D6_DEFS = [
  { rsid: 'rs3892097', allele: '*4', alt: 'T', isDel: false, activity: 0.0, fn: 'No function (splicing defect)' },
  { rsid: 'rs35742686', allele: '*3', alt: '-', isDel: true, activity: 0.0, fn: 'No function (frameshift)' },
  { rsid: 'rs5030655', allele: '*6', alt: '-', isDel: true, activity: 0.0, fn: 'No function (frameshift)' },
  { rsid: 'rs1065852', allele: '*10', alt: 'T', isDel: false, activity: 0.5, fn: 'Decreased function' },
  { rsid: 'rs28371725', allele: '*41', alt: 'T', isDel: false, activity: 0.5, fn: 'Decreased function' },
  { rsid: 'rs28371706', allele: '*17', alt: 'T', isDel: false, activity: 0.5, fn: 'Decreased function' },
];

const CYP2C19_DEFS = [
  { rsid: 'rs4244285', allele: '*2', alt: 'A', isDel: false, activity: 0.0, fn: 'No function (splicing defect)' },
  { rsid: 'rs4986893', allele: '*3', alt: 'A', isDel: false, activity: 0.0, fn: 'No function (premature stop)' },
  { rsid: 'rs12248560', allele: '*17', alt: 'T', isDel: false, activity: 1.5, fn: 'Increased function' },
];

/**
 * Call a CYP diplotype from a genotype map.
 * @param {string} gene - Gene name (e.g., "CYP2D6")
 * @param {object[]} defs - Defining variant table
 * @param {Map<string,string>} gts - rsid → genotype map
 */
function callCypDiplotype(gene, defs, gts) {
  const alleles = [];
  const details = [];
  const notes = [];
  let genotyped = 0, matched = 0;

  for (const def of defs) {
    const gt = gts.get(def.rsid);
    if (gt !== undefined) {
      genotyped++;
      if (def.isDel) {
        if (gt === 'DD') { matched++; alleles.push([def.allele, def.activity], [def.allele, def.activity]); details.push(`  ${def.rsid}: ${gt} -> homozygous ${def.allele} (${def.fn})`); }
        else if (gt === 'DI') { matched++; alleles.push([def.allele, def.activity]); details.push(`  ${def.rsid}: ${gt} -> heterozygous ${def.allele} (${def.fn})`); }
        else { details.push(`  ${def.rsid}: ${gt} -> reference (no ${def.allele})`); }
      } else {
        const hom = def.alt + def.alt;
        if (gt === hom) { matched++; alleles.push([def.allele, def.activity], [def.allele, def.activity]); details.push(`  ${def.rsid}: ${gt} -> homozygous ${def.allele} (${def.fn})`); }
        else if (gt.includes(def.alt)) { matched++; alleles.push([def.allele, def.activity]); details.push(`  ${def.rsid}: ${gt} -> heterozygous ${def.allele} (${def.fn})`); }
        else { details.push(`  ${def.rsid}: ${gt} -> reference (no ${def.allele})`); }
      }
    } else {
      details.push(`  ${def.rsid}: not genotyped`);
    }
  }

  let confidence;
  if (genotyped === 0) confidence = 'Unsupported';
  else if (matched >= 2 && genotyped * 2 >= defs.length) confidence = 'Strong';
  else if ((matched >= 1 && genotyped >= 2) || genotyped * 2 >= defs.length) confidence = 'Moderate';
  else confidence = 'Weak';

  if (confidence === 'Unsupported') notes.push('Panel lacks all defining variants for this gene.');
  if (confidence === 'Weak') notes.push(`Only ${genotyped}/${defs.length} defining rsids genotyped; call unreliable.`);
  notes.push('No phase or CNV resolution from genotyping array.');

  while (alleles.length < 2) alleles.push(['*1', 1.0]);
  const activity = alleles[0][1] + alleles[1][1];
  let phenotype;
  if (activity > 2.0) phenotype = 'UltraRapid';
  else if (activity >= 1.0) phenotype = 'Normal';
  else if (activity >= 0.5) phenotype = 'Intermediate';
  else phenotype = 'Poor';

  return {
    gene, allele1: alleles[0][0], allele2: alleles[1][0],
    activity, phenotype, confidence,
    rsidsGenotyped: genotyped, rsidsMatched: matched, rsidsTotal: defs.length,
    notes, details,
  };
}

/** Call CYP2D6 diplotype */
function callCyp2d6(gts) { return callCypDiplotype('CYP2D6', CYP2D6_DEFS, gts); }

/** Call CYP2C19 diplotype */
function callCyp2c19(gts) { return callCypDiplotype('CYP2C19', CYP2C19_DEFS, gts); }

/**
 * Determine APOE genotype from rs429358 + rs7412.
 * @param {Map<string,string>} gts
 */
function determineApoe(gts) {
  const gt1 = gts.get('rs429358') || '';
  const gt2 = gts.get('rs7412') || '';
  if (!gt1 || !gt2) return { genotype: 'Unable to determine (missing data)', rs429358: gt1, rs7412: gt2 };
  const e4 = (gt1.match(/C/g) || []).length;
  const e2 = (gt2.match(/T/g) || []).length;
  const geno = {
    '0,0': 'e3/e3 (most common, baseline risk)',
    '0,1': 'e2/e3 (PROTECTIVE - reduced Alzheimer\'s risk)',
    '0,2': 'e2/e2 (protective; monitor for type III hyperlipoproteinemia)',
    '1,0': 'e3/e4 (increased Alzheimer\'s risk ~3x)',
    '1,1': 'e2/e4 (mixed - e2 partially offsets e4 risk)',
  }[`${e4},${e2}`] || (e4 >= 2 ? 'e4/e4 (significantly increased Alzheimer\'s risk ~12x)' : `Unusual: rs429358=${gt1}, rs7412=${gt2}`);
  return { genotype: geno, rs429358: gt1, rs7412: gt2 };
}
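A worked example of the APOE lookup, with the function copied verbatim from above; the genotype calls passed in are invented:

```javascript
// Copy of determineApoe from this file, for illustration only.
function determineApoe(gts) {
  const gt1 = gts.get('rs429358') || '';
  const gt2 = gts.get('rs7412') || '';
  if (!gt1 || !gt2) return { genotype: 'Unable to determine (missing data)', rs429358: gt1, rs7412: gt2 };
  const e4 = (gt1.match(/C/g) || []).length;
  const e2 = (gt2.match(/T/g) || []).length;
  const geno = {
    '0,0': 'e3/e3 (most common, baseline risk)',
    '0,1': 'e2/e3 (PROTECTIVE - reduced Alzheimer\'s risk)',
    '0,2': 'e2/e2 (protective; monitor for type III hyperlipoproteinemia)',
    '1,0': 'e3/e4 (increased Alzheimer\'s risk ~3x)',
    '1,1': 'e2/e4 (mixed - e2 partially offsets e4 risk)',
  }[`${e4},${e2}`] || (e4 >= 2 ? 'e4/e4 (significantly increased Alzheimer\'s risk ~12x)' : `Unusual: rs429358=${gt1}, rs7412=${gt2}`);
  return { genotype: geno, rs429358: gt1, rs7412: gt2 };
}

// One C at rs429358 and no T at rs7412 maps to e3/e4.
const het = determineApoe(new Map([['rs429358', 'CT'], ['rs7412', 'CC']]));
console.log(het.genotype); // "e3/e4 (increased Alzheimer's risk ~3x)"

// Missing either marker yields an explicit "unable to determine" result.
const missing = determineApoe(new Map([['rs429358', 'CT']]));
console.log(missing.genotype); // 'Unable to determine (missing data)'
```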

/**
 * Run the full 23andMe analysis pipeline.
 * @param {string} text - Raw 23andMe file contents
 * @returns {object} Full analysis result
 */
function analyze23andMe(text) {
  const data = parse23andMe(text);
  const gts = new Map();
  for (const [rsid, snp] of data.snps) gts.set(rsid, snp.genotype);

  const cyp2d6 = callCyp2d6(gts);
  const cyp2c19 = callCyp2c19(gts);
  const apoe = determineApoe(gts);

  // Variant classification
  let homozygous = 0, heterozygous = 0, indels = 0;
  const isNuc = c => 'ACGT'.includes(c);
  for (const snp of data.snps.values()) {
    const g = snp.genotype;
    if (g.length === 2) {
      if (isNuc(g[0]) && isNuc(g[1])) {
        if (g[0] === g[1]) homozygous++; else heterozygous++;
      } else {
        indels++;
      }
    }
  }

  return {
    data: { ...data, snps: Object.fromEntries(data.snps), chrCounts: Object.fromEntries(data.chrCounts) },
    cyp2d6, cyp2c19, apoe,
    homozygous, heterozygous, indels,
    hetRatio: data.totalMarkers - data.noCalls > 0 ? heterozygous / (data.totalMarkers - data.noCalls) * 100 : 0,
  };
}

// -------------------------------------------------------------------
// Biomarker Analysis Engine (v0.3.0 — mirrors biomarker.rs + biomarker_stream.rs)
// -------------------------------------------------------------------

const biomarkerModule = require('./src/biomarker');
const streamModule = require('./src/stream');

module.exports = {
  // Original API
  encode2bit,
  decode2bit,
  translateDna,
  cosineSimilarity,
  fastaToRvdna,
  readRvdna,
  isNativeAvailable,

  // 23andMe Genotyping API (v0.2.0)
  normalizeGenotype,
  parse23andMe,
  callCyp2d6,
  callCyp2c19,
  determineApoe,
  analyze23andMe,

  // Biomarker Risk Scoring Engine (v0.3.0)
  biomarkerReferences: biomarkerModule.biomarkerReferences,
  zScore: biomarkerModule.zScore,
  classifyBiomarker: biomarkerModule.classifyBiomarker,
  computeRiskScores: biomarkerModule.computeRiskScores,
  encodeProfileVector: biomarkerModule.encodeProfileVector,
  generateSyntheticPopulation: biomarkerModule.generateSyntheticPopulation,
  BIOMARKER_REFERENCES: biomarkerModule.BIOMARKER_REFERENCES,
  SNPS: biomarkerModule.SNPS,
  INTERACTIONS: biomarkerModule.INTERACTIONS,
  CAT_ORDER: biomarkerModule.CAT_ORDER,

  // Streaming Biomarker Processor (v0.3.0)
  RingBuffer: streamModule.RingBuffer,
  StreamProcessor: streamModule.StreamProcessor,
  generateReadings: streamModule.generateReadings,
  defaultStreamConfig: streamModule.defaultStreamConfig,
  BIOMARKER_DEFS: streamModule.BIOMARKER_DEFS,

  // Re-export native module for advanced use
  native: nativeModule,
};
65
vendor/ruvector/npm/packages/rvdna/package.json
vendored
Normal file
@@ -0,0 +1,65 @@
{
  "name": "@ruvector/rvdna",
  "version": "0.3.0",
  "description": "rvDNA — AI-native genomic analysis. 20-SNP biomarker risk scoring, streaming anomaly detection, 64-dim profile vectors, 23andMe genotyping, CYP2D6/CYP2C19 pharmacogenomics, variant calling, protein prediction, and HNSW vector search.",
  "main": "index.js",
  "types": "index.d.ts",
  "author": "rUv <info@ruv.io> (https://ruv.io)",
  "homepage": "https://github.com/ruvnet/ruvector/tree/main/examples/dna",
  "repository": {
    "type": "git",
    "url": "https://github.com/ruvnet/ruvector.git",
    "directory": "npm/packages/rvdna"
  },
  "bugs": {
    "url": "https://github.com/ruvnet/ruvector/issues"
  },
  "license": "MIT",
  "engines": {
    "node": ">=18.0.0"
  },
  "files": [
    "index.js",
    "index.d.ts",
    "src/",
    "README.md"
  ],
  "scripts": {
    "build:napi": "napi build --platform --release --cargo-cwd ../../../examples/dna",
    "test": "node tests/test-biomarker.js"
  },
  "devDependencies": {
    "@napi-rs/cli": "^2.18.0"
  },
  "optionalDependencies": {
    "@ruvector/rvdna-linux-x64-gnu": "0.1.0",
    "@ruvector/rvdna-linux-arm64-gnu": "0.1.0",
    "@ruvector/rvdna-darwin-x64": "0.1.0",
    "@ruvector/rvdna-darwin-arm64": "0.1.0",
    "@ruvector/rvdna-win32-x64-msvc": "0.1.0"
  },
  "publishConfig": {
    "access": "public"
  },
  "keywords": [
    "genomics",
    "bioinformatics",
    "dna",
    "rvdna",
    "biomarker",
    "health",
    "risk-score",
    "streaming",
    "anomaly-detection",
    "23andme",
    "pharmacogenomics",
    "variant-calling",
    "protein",
    "hnsw",
    "vector-search",
    "napi",
    "rust",
    "ai",
    "wasm"
  ]
}
351
vendor/ruvector/npm/packages/rvdna/src/biomarker.js
vendored
Normal file
@@ -0,0 +1,351 @@
'use strict';

// ── Clinical reference ranges (mirrors REFERENCES in biomarker.rs) ──────────

const BIOMARKER_REFERENCES = Object.freeze([
  { name: 'Total Cholesterol', unit: 'mg/dL', normalLow: 125, normalHigh: 200, criticalLow: 100, criticalHigh: 300, category: 'Lipid' },
  { name: 'LDL', unit: 'mg/dL', normalLow: 50, normalHigh: 100, criticalLow: 25, criticalHigh: 190, category: 'Lipid' },
  { name: 'HDL', unit: 'mg/dL', normalLow: 40, normalHigh: 90, criticalLow: 20, criticalHigh: null, category: 'Lipid' },
  { name: 'Triglycerides', unit: 'mg/dL', normalLow: 35, normalHigh: 150, criticalLow: 20, criticalHigh: 500, category: 'Lipid' },
  { name: 'Fasting Glucose', unit: 'mg/dL', normalLow: 70, normalHigh: 100, criticalLow: 50, criticalHigh: 250, category: 'Metabolic' },
  { name: 'HbA1c', unit: '%', normalLow: 4, normalHigh: 5.7, criticalLow: null, criticalHigh: 9, category: 'Metabolic' },
  { name: 'Homocysteine', unit: 'umol/L', normalLow: 5, normalHigh: 15, criticalLow: null, criticalHigh: 30, category: 'Metabolic' },
  { name: 'Vitamin D', unit: 'ng/mL', normalLow: 30, normalHigh: 80, criticalLow: 10, criticalHigh: 150, category: 'Nutritional' },
  { name: 'CRP', unit: 'mg/L', normalLow: 0, normalHigh: 3, criticalLow: null, criticalHigh: 10, category: 'Inflammatory' },
  { name: 'TSH', unit: 'mIU/L', normalLow: 0.4, normalHigh: 4, criticalLow: 0.1, criticalHigh: 10, category: 'Thyroid' },
  { name: 'Ferritin', unit: 'ng/mL', normalLow: 20, normalHigh: 250, criticalLow: 10, criticalHigh: 1000, category: 'Iron' },
  { name: 'Vitamin B12', unit: 'pg/mL', normalLow: 200, normalHigh: 900, criticalLow: 150, criticalHigh: null, category: 'Nutritional' },
  { name: 'Lp(a)', unit: 'nmol/L', normalLow: 0, normalHigh: 75, criticalLow: null, criticalHigh: 200, category: 'Lipid' },
]);

// ── 20-SNP risk table (mirrors SNPS in biomarker.rs) ────────────────────────

const SNPS = Object.freeze([
  { rsid: 'rs429358', category: 'Neurological', wRef: 0, wHet: 0.4, wAlt: 0.9, homRef: 'TT', het: 'CT', homAlt: 'CC', maf: 0.14 },
  { rsid: 'rs7412', category: 'Neurological', wRef: 0, wHet: -0.15, wAlt: -0.3, homRef: 'CC', het: 'CT', homAlt: 'TT', maf: 0.08 },
  { rsid: 'rs1042522', category: 'Cancer Risk', wRef: 0, wHet: 0.25, wAlt: 0.5, homRef: 'CC', het: 'CG', homAlt: 'GG', maf: 0.40 },
  { rsid: 'rs80357906', category: 'Cancer Risk', wRef: 0, wHet: 0.7, wAlt: 0.95, homRef: 'DD', het: 'DI', homAlt: 'II', maf: 0.003 },
  { rsid: 'rs28897696', category: 'Cancer Risk', wRef: 0, wHet: 0.3, wAlt: 0.6, homRef: 'GG', het: 'AG', homAlt: 'AA', maf: 0.005 },
  { rsid: 'rs11571833', category: 'Cancer Risk', wRef: 0, wHet: 0.20, wAlt: 0.5, homRef: 'AA', het: 'AT', homAlt: 'TT', maf: 0.01 },
  { rsid: 'rs1801133', category: 'Metabolism', wRef: 0, wHet: 0.35, wAlt: 0.7, homRef: 'GG', het: 'AG', homAlt: 'AA', maf: 0.32 },
  { rsid: 'rs1801131', category: 'Metabolism', wRef: 0, wHet: 0.10, wAlt: 0.25, homRef: 'TT', het: 'GT', homAlt: 'GG', maf: 0.30 },
  { rsid: 'rs4680', category: 'Neurological', wRef: 0, wHet: 0.2, wAlt: 0.45, homRef: 'GG', het: 'AG', homAlt: 'AA', maf: 0.50 },
  { rsid: 'rs1799971', category: 'Neurological', wRef: 0, wHet: 0.2, wAlt: 0.4, homRef: 'AA', het: 'AG', homAlt: 'GG', maf: 0.15 },
  { rsid: 'rs762551', category: 'Metabolism', wRef: 0, wHet: 0.15, wAlt: 0.35, homRef: 'AA', het: 'AC', homAlt: 'CC', maf: 0.37 },
  { rsid: 'rs4988235', category: 'Metabolism', wRef: 0, wHet: 0.05, wAlt: 0.15, homRef: 'AA', het: 'AG', homAlt: 'GG', maf: 0.24 },
  { rsid: 'rs53576', category: 'Neurological', wRef: 0, wHet: 0.1, wAlt: 0.25, homRef: 'GG', het: 'AG', homAlt: 'AA', maf: 0.35 },
  { rsid: 'rs6311', category: 'Neurological', wRef: 0, wHet: 0.15, wAlt: 0.3, homRef: 'CC', het: 'CT', homAlt: 'TT', maf: 0.45 },
  { rsid: 'rs1800497', category: 'Neurological', wRef: 0, wHet: 0.25, wAlt: 0.5, homRef: 'GG', het: 'AG', homAlt: 'AA', maf: 0.20 },
  { rsid: 'rs4363657', category: 'Cardiovascular', wRef: 0, wHet: 0.35, wAlt: 0.7, homRef: 'TT', het: 'CT', homAlt: 'CC', maf: 0.15 },
  { rsid: 'rs1800566', category: 'Cancer Risk', wRef: 0, wHet: 0.15, wAlt: 0.30, homRef: 'CC', het: 'CT', homAlt: 'TT', maf: 0.22 },
  { rsid: 'rs10455872', category: 'Cardiovascular', wRef: 0, wHet: 0.40, wAlt: 0.75, homRef: 'AA', het: 'AG', homAlt: 'GG', maf: 0.07 },
  { rsid: 'rs3798220', category: 'Cardiovascular', wRef: 0, wHet: 0.35, wAlt: 0.65, homRef: 'TT', het: 'CT', homAlt: 'CC', maf: 0.02 },
  { rsid: 'rs11591147', category: 'Cardiovascular', wRef: 0, wHet: -0.30, wAlt: -0.55, homRef: 'GG', het: 'GT', homAlt: 'TT', maf: 0.024 },
]);

// ── Gene-gene interactions (mirrors INTERACTIONS in biomarker.rs) ────────────

const INTERACTIONS = Object.freeze([
  { rsidA: 'rs4680', rsidB: 'rs1799971', modifier: 1.4, category: 'Neurological' },
  { rsidA: 'rs1801133', rsidB: 'rs1801131', modifier: 1.3, category: 'Metabolism' },
  { rsidA: 'rs429358', rsidB: 'rs1042522', modifier: 1.2, category: 'Cancer Risk' },
  { rsidA: 'rs80357906', rsidB: 'rs1042522', modifier: 1.5, category: 'Cancer Risk' },
  { rsidA: 'rs1801131', rsidB: 'rs4680', modifier: 1.25, category: 'Neurological' },
  { rsidA: 'rs1800497', rsidB: 'rs4680', modifier: 1.2, category: 'Neurological' },
]);

const CAT_ORDER = ['Cancer Risk', 'Cardiovascular', 'Neurological', 'Metabolism'];
const NUM_ONEHOT_SNPS = 17;

// ── Helpers ──────────────────────────────────────────────────────────────────

function genotypeCode(snp, gt) {
  if (gt === snp.homRef) return 0;
  if (gt.length === 2 && gt[0] !== gt[1]) return 1;
  return 2;
}

function snpWeight(snp, code) {
  return code === 0 ? snp.wRef : code === 1 ? snp.wHet : snp.wAlt;
}
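A tiny worked example of the genotype-code and weight helpers, copied from above and applied to the rs1801133 row of the SNPS table:

```javascript
// Copies of genotypeCode and snpWeight from this file.
function genotypeCode(snp, gt) {
  if (gt === snp.homRef) return 0;
  if (gt.length === 2 && gt[0] !== gt[1]) return 1;
  return 2;
}
function snpWeight(snp, code) {
  return code === 0 ? snp.wRef : code === 1 ? snp.wHet : snp.wAlt;
}

// rs1801133 row from the SNPS table above.
const mthfr = { rsid: 'rs1801133', wRef: 0, wHet: 0.35, wAlt: 0.7, homRef: 'GG', het: 'AG', homAlt: 'AA' };

console.log(genotypeCode(mthfr, 'GG'), snpWeight(mthfr, genotypeCode(mthfr, 'GG'))); // 0 0
console.log(genotypeCode(mthfr, 'AG'), snpWeight(mthfr, genotypeCode(mthfr, 'AG'))); // 1 0.35
console.log(genotypeCode(mthfr, 'AA'), snpWeight(mthfr, genotypeCode(mthfr, 'AA'))); // 2 0.7
```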

// Pre-built rsid -> index lookup (O(1) instead of O(n) findIndex)
const RSID_INDEX = new Map();
for (let i = 0; i < SNPS.length; i++) RSID_INDEX.set(SNPS[i].rsid, i);

// Pre-cache LPA SNP references to avoid repeated iteration
const LPA_SNPS = SNPS.filter(s => s.rsid === 'rs10455872' || s.rsid === 'rs3798220');

function snpIndex(rsid) {
  const idx = RSID_INDEX.get(rsid);
  return idx !== undefined ? idx : -1;
}

function isNonRef(genotypes, rsid) {
  const idx = RSID_INDEX.get(rsid);
  if (idx === undefined) return false;
  const gt = genotypes.get(rsid);
  return gt !== undefined && gt !== SNPS[idx].homRef;
}

function interactionMod(genotypes, ix) {
  return (isNonRef(genotypes, ix.rsidA) && isNonRef(genotypes, ix.rsidB)) ? ix.modifier : 1.0;
}

// Pre-compute category metadata (mirrors category_meta() in Rust)
const CATEGORY_META = CAT_ORDER.map(cat => {
  let maxPossible = 0;
  let expectedCount = 0;
  for (const snp of SNPS) {
    if (snp.category === cat) {
      maxPossible += Math.max(snp.wAlt, 0);
      expectedCount++;
    }
  }
  return { name: cat, maxPossible: Math.max(maxPossible, 1), expectedCount };
});

// Mulberry32 PRNG — deterministic, fast, no dependencies
function mulberry32(seed) {
  let t = (seed + 0x6D2B79F5) | 0;
  return function () {
    t = (t + 0x6D2B79F5) | 0;
    let z = t ^ (t >>> 15);
    z = Math.imul(z | 1, z);
    z ^= z + Math.imul(z ^ (z >>> 7), z | 61);
    return ((z ^ (z >>> 14)) >>> 0) / 4294967296;
  };
}

// ── Simplified MTHFR/pain scoring (mirrors health.rs analysis functions) ────

function analyzeMthfr(genotypes) {
  let score = 0;
  const gt677 = genotypes.get('rs1801133');  // MTHFR C677T (SNPS[6])
  const gt1298 = genotypes.get('rs1801131'); // MTHFR A1298C (SNPS[7])
  if (gt677) {
    const code = genotypeCode(SNPS[6], gt677);
    score += code;
  }
  if (gt1298) {
    const code = genotypeCode(SNPS[7], gt1298);
    score += code;
  }
  return { score };
}

function analyzePain(genotypes) {
  const gtComt = genotypes.get('rs4680');     // COMT Val158Met (SNPS[8])
  const gtOprm1 = genotypes.get('rs1799971'); // OPRM1 A118G (SNPS[9])
  if (!gtComt || !gtOprm1) return null;
  const comtCode = genotypeCode(SNPS[8], gtComt);
  const oprm1Code = genotypeCode(SNPS[9], gtOprm1);
  return { score: comtCode + oprm1Code };
}

// ── Public API ───────────────────────────────────────────────────────────────

function biomarkerReferences() {
  return BIOMARKER_REFERENCES;
}

function zScore(value, ref_) {
  const mid = (ref_.normalLow + ref_.normalHigh) / 2;
  const halfRange = (ref_.normalHigh - ref_.normalLow) / 2;
  if (halfRange === 0) return 0;
  return (value - mid) / halfRange;
}

function classifyBiomarker(value, ref_) {
  if (ref_.criticalLow !== null && value < ref_.criticalLow) return 'CriticalLow';
  if (value < ref_.normalLow) return 'Low';
  if (ref_.criticalHigh !== null && value > ref_.criticalHigh) return 'CriticalHigh';
  if (value > ref_.normalHigh) return 'High';
  return 'Normal';
}
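A sketch of the scoring and classification helpers (copied from above) against the Fasting Glucose reference range from BIOMARKER_REFERENCES:

```javascript
// Copies of zScore and classifyBiomarker from this file.
function zScore(value, ref_) {
  const mid = (ref_.normalLow + ref_.normalHigh) / 2;
  const halfRange = (ref_.normalHigh - ref_.normalLow) / 2;
  if (halfRange === 0) return 0;
  return (value - mid) / halfRange;
}
function classifyBiomarker(value, ref_) {
  if (ref_.criticalLow !== null && value < ref_.criticalLow) return 'CriticalLow';
  if (value < ref_.normalLow) return 'Low';
  if (ref_.criticalHigh !== null && value > ref_.criticalHigh) return 'CriticalHigh';
  if (value > ref_.normalHigh) return 'High';
  return 'Normal';
}

// Fasting Glucose row from BIOMARKER_REFERENCES above.
const glucose = { name: 'Fasting Glucose', normalLow: 70, normalHigh: 100, criticalLow: 50, criticalHigh: 250 };

console.log(zScore(85, glucose));             // 0 (midpoint of the normal range)
console.log(zScore(100, glucose));            // 1 (upper edge of normal)
console.log(classifyBiomarker(85, glucose));  // 'Normal'
console.log(classifyBiomarker(60, glucose));  // 'Low'
console.log(classifyBiomarker(300, glucose)); // 'CriticalHigh'
```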

function computeRiskScores(genotypes) {
  const catScores = new Map(); // category -> { raw, variants, count }

  for (const snp of SNPS) {
    const gt = genotypes.get(snp.rsid);
    if (gt === undefined) continue;
    const code = genotypeCode(snp, gt);
    const w = snpWeight(snp, code);
    if (!catScores.has(snp.category)) {
      catScores.set(snp.category, { raw: 0, variants: [], count: 0 });
    }
    const entry = catScores.get(snp.category);
    entry.raw += w;
    entry.count++;
    if (code > 0) entry.variants.push(snp.rsid);
  }

  for (const inter of INTERACTIONS) {
    const m = interactionMod(genotypes, inter);
    if (m > 1.0 && catScores.has(inter.category)) {
      catScores.get(inter.category).raw *= m;
    }
  }

  const categoryScores = {};
  for (const cm of CATEGORY_META) {
    const entry = catScores.get(cm.name) || { raw: 0, variants: [], count: 0 };
    const score = Math.min(Math.max(entry.raw / cm.maxPossible, 0), 1);
    const confidence = entry.count > 0 ? Math.min(entry.count / Math.max(cm.expectedCount, 1), 1) : 0;
    categoryScores[cm.name] = {
      category: cm.name,
      score,
      confidence,
      contributingVariants: entry.variants,
    };
  }

  let ws = 0, cs = 0;
  for (const c of Object.values(categoryScores)) {
    ws += c.score * c.confidence;
    cs += c.confidence;
  }
  const globalRiskScore = cs > 0 ? ws / cs : 0;

  const profile = {
    subjectId: '',
    timestamp: 0,
    categoryScores,
    globalRiskScore,
    profileVector: null,
    biomarkerValues: {},
  };
  profile.profileVector = encodeProfileVectorWithGenotypes(profile, genotypes);
  return profile;
}

function encodeProfileVector(profile) {
  return encodeProfileVectorWithGenotypes(profile, new Map());
}

function encodeProfileVectorWithGenotypes(profile, genotypes) {
  const v = new Float32Array(64);

  // Dims 0..50: one-hot genotype encoding (first 17 SNPs x 3 = 51 dims)
  for (let i = 0; i < NUM_ONEHOT_SNPS; i++) {
    const snp = SNPS[i];
    const gt = genotypes.get(snp.rsid);
    const code = gt !== undefined ? genotypeCode(snp, gt) : 0;
    v[i * 3 + code] = 1.0;
  }

  // Dims 51..54: category scores
  for (let j = 0; j < CAT_ORDER.length; j++) {
    const cs = profile.categoryScores[CAT_ORDER[j]];
    v[51 + j] = cs ? cs.score : 0;
  }
  v[55] = profile.globalRiskScore;

  // Dims 56..59: first 4 interaction modifiers
  for (let j = 0; j < 4; j++) {
    const m = interactionMod(genotypes, INTERACTIONS[j]);
    v[56 + j] = m > 1 ? m - 1 : 0;
  }

  // Dims 60..63: derived clinical scores
  v[60] = analyzeMthfr(genotypes).score / 4;
  const pain = analyzePain(genotypes);
  v[61] = pain ? pain.score / 4 : 0;
  const apoeGt = genotypes.get('rs429358');
  v[62] = apoeGt !== undefined ? genotypeCode(SNPS[0], apoeGt) / 2 : 0;

  // LPA composite: average of rs10455872 + rs3798220 genotype codes (cached)
  let lpaSum = 0, lpaCount = 0;
  for (const snp of LPA_SNPS) {
    const gt = genotypes.get(snp.rsid);
    if (gt !== undefined) {
      lpaSum += genotypeCode(snp, gt) / 2;
      lpaCount++;
    }
  }
  v[63] = lpaCount > 0 ? lpaSum / 2 : 0;

  // L2-normalize
  let norm = 0;
  for (let i = 0; i < 64; i++) norm += v[i] * v[i];
  norm = Math.sqrt(norm);
  if (norm > 0) for (let i = 0; i < 64; i++) v[i] /= norm;

  return v;
}

function randomGenotype(rng, snp) {
  const p = snp.maf;
  const q = 1 - p;
  const r = rng();
  if (r < q * q) return snp.homRef;
  if (r < q * q + 2 * p * q) return snp.het;
  return snp.homAlt;
}

function generateSyntheticPopulation(count, seed) {
  const rng = mulberry32(seed);
  const pop = [];

  for (let i = 0; i < count; i++) {
    const genotypes = new Map();
    for (const snp of SNPS) {
      genotypes.set(snp.rsid, randomGenotype(rng, snp));
    }

    const profile = computeRiskScores(genotypes);
    profile.subjectId = `SYN-${String(i).padStart(6, '0')}`;
    profile.timestamp = 1700000000 + i;

    const mthfrScore = analyzeMthfr(genotypes).score;
    const apoeCode = genotypes.get('rs429358') ? genotypeCode(SNPS[0], genotypes.get('rs429358')) : 0;
    const nqo1Idx = RSID_INDEX.get('rs1800566');
    const nqo1Code = genotypes.get('rs1800566') ? genotypeCode(SNPS[nqo1Idx], genotypes.get('rs1800566')) : 0;

    let lpaRisk = 0;
    for (const snp of LPA_SNPS) {
      const gt = genotypes.get(snp.rsid);
      if (gt) lpaRisk += genotypeCode(snp, gt);
    }

    const pcsk9Idx = RSID_INDEX.get('rs11591147');
    const pcsk9Code = genotypes.get('rs11591147') ? genotypeCode(SNPS[pcsk9Idx], genotypes.get('rs11591147')) : 0;

    for (const bref of BIOMARKER_REFERENCES) {
      const mid = (bref.normalLow + bref.normalHigh) / 2;
      const sd = (bref.normalHigh - bref.normalLow) / 4;
      let val = mid + (rng() * 3 - 1.5) * sd;

      // Gene->biomarker correlations (mirrors Rust)
      const nm = bref.name;
      if (nm === 'Homocysteine' && mthfrScore >= 2) val += sd * (mthfrScore - 1);
      if ((nm === 'Total Cholesterol' || nm === 'LDL') && apoeCode > 0) val += sd * 0.5 * apoeCode;
      if (nm === 'HDL' && apoeCode > 0) val -= sd * 0.3 * apoeCode;
      if (nm === 'Triglycerides' && apoeCode > 0) val += sd * 0.4 * apoeCode;
      if (nm === 'Vitamin B12' && mthfrScore >= 2) val -= sd * 0.4;
      if (nm === 'CRP' && nqo1Code === 2) val += sd * 0.3;
      if (nm === 'Lp(a)' && lpaRisk > 0) val += sd * 1.5 * lpaRisk;
      if ((nm === 'LDL' || nm === 'Total Cholesterol') && pcsk9Code > 0) val -= sd * 0.6 * pcsk9Code;

      val = Math.max(val, bref.criticalLow || 0, 0);
      if (bref.criticalHigh !== null) val = Math.min(val, bref.criticalHigh * 1.2);
      profile.biomarkerValues[bref.name] = Math.round(val * 10) / 10;
    }
    pop.push(profile);
  }
  return pop;
}

module.exports = {
  BIOMARKER_REFERENCES,
  SNPS,
  INTERACTIONS,
  CAT_ORDER,
  biomarkerReferences,
  zScore,
  classifyBiomarker,
  computeRiskScores,
  encodeProfileVector,
  generateSyntheticPopulation,
};
312
vendor/ruvector/npm/packages/rvdna/src/stream.js
vendored
Normal file
@@ -0,0 +1,312 @@
'use strict';

// ── Constants (identical to biomarker_stream.rs) ─────────────────────────────

const EMA_ALPHA = 0.1;
const Z_SCORE_THRESHOLD = 2.5;
const REF_OVERSHOOT = 0.20;
const CUSUM_THRESHOLD = 4.0;
const CUSUM_DRIFT = 0.5;

// ── Biomarker definitions ────────────────────────────────────────────────────

const BIOMARKER_DEFS = Object.freeze([
  { id: 'glucose', low: 70, high: 100 },
  { id: 'cholesterol_total', low: 150, high: 200 },
  { id: 'hdl', low: 40, high: 60 },
  { id: 'ldl', low: 70, high: 130 },
  { id: 'triglycerides', low: 50, high: 150 },
  { id: 'crp', low: 0.1, high: 3.0 },
]);

// ── RingBuffer ───────────────────────────────────────────────────────────────

class RingBuffer {
  constructor(capacity) {
    if (capacity <= 0) throw new Error('RingBuffer capacity must be > 0');
    this._buffer = new Float64Array(capacity);
    this._head = 0;
    this._len = 0;
    this._capacity = capacity;
  }

  push(item) {
    this._buffer[this._head] = item;
    this._head = (this._head + 1) % this._capacity;
    if (this._len < this._capacity) this._len++;
  }

  /** Push item and return evicted value (NaN if buffer wasn't full). */
  pushPop(item) {
    const wasFull = this._len === this._capacity;
    const evicted = wasFull ? this._buffer[this._head] : NaN;
    this._buffer[this._head] = item;
    this._head = (this._head + 1) % this._capacity;
    if (!wasFull) this._len++;
    return evicted;
  }

  /** Iterate in insertion order (oldest to newest). */
  *[Symbol.iterator]() {
    const start = this._len < this._capacity ? 0 : this._head;
    for (let i = 0; i < this._len; i++) {
      yield this._buffer[(start + i) % this._capacity];
    }
  }

  /** Return values as a plain array (oldest to newest). */
  toArray() {
    const arr = new Array(this._len);
    const start = this._len < this._capacity ? 0 : this._head;
    for (let i = 0; i < this._len; i++) {
      arr[i] = this._buffer[(start + i) % this._capacity];
    }
    return arr;
  }

  get length() { return this._len; }
  get capacity() { return this._capacity; }
  isFull() { return this._len === this._capacity; }

  clear() {
    this._head = 0;
    this._len = 0;
  }
}
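A minimal usage sketch of the ring buffer's wraparound behavior. The constructor, push, and toArray below are copied from the class above (the other methods are omitted for brevity):

```javascript
// Trimmed copy of RingBuffer (constructor/push/toArray only).
class RingBuffer {
  constructor(capacity) {
    if (capacity <= 0) throw new Error('RingBuffer capacity must be > 0');
    this._buffer = new Float64Array(capacity);
    this._head = 0;
    this._len = 0;
    this._capacity = capacity;
  }
  push(item) {
    this._buffer[this._head] = item;
    this._head = (this._head + 1) % this._capacity;
    if (this._len < this._capacity) this._len++;
  }
  toArray() {
    const arr = new Array(this._len);
    const start = this._len < this._capacity ? 0 : this._head;
    for (let i = 0; i < this._len; i++) arr[i] = this._buffer[(start + i) % this._capacity];
    return arr;
  }
}

// Pushing past capacity silently evicts the oldest values.
const rb = new RingBuffer(3);
for (const x of [1, 2, 3, 4, 5]) rb.push(x);
console.log(rb.toArray()); // [3, 4, 5]: the two oldest values were evicted
```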

// ── Welford's online mean+std (single-pass, mirrors Rust) ────────────────────

function windowMeanStd(buf) {
  const n = buf.length;
  if (n === 0) return [0, 0];
  let mean = 0, m2 = 0, k = 0;
  for (const x of buf) {
    k++;
    const delta = x - mean;
    mean += delta / k;
    m2 += delta * (x - mean);
  }
  if (n < 2) return [mean, 0];
  return [mean, Math.sqrt(m2 / (n - 1))];
}
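A worked example of the single-pass Welford computation, with the function copied verbatim from above:

```javascript
// Copy of windowMeanStd from this file.
function windowMeanStd(buf) {
  const n = buf.length;
  if (n === 0) return [0, 0];
  let mean = 0, m2 = 0, k = 0;
  for (const x of buf) {
    k++;
    const delta = x - mean;
    mean += delta / k;
    m2 += delta * (x - mean);
  }
  if (n < 2) return [mean, 0];
  return [mean, Math.sqrt(m2 / (n - 1))];
}

// Sample standard deviation (n - 1 denominator) of [1..5] is sqrt(2.5).
const [mean, std] = windowMeanStd([1, 2, 3, 4, 5]);
console.log(mean); // 3
console.log(std);  // ≈ 1.5811 (sqrt(2.5))
```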

// ── Trend slope via simple linear regression (mirrors Rust) ──────────────────

function computeTrendSlope(buf) {
  const n = buf.length;
  if (n < 2) return 0;
  const nf = n;
  const xm = (nf - 1) / 2;
  let ys = 0, xys = 0, xxs = 0, i = 0;
  for (const y of buf) {
    ys += y;
    xys += i * y;
    xxs += i * i;
    i++;
  }
  const ssXy = xys - nf * xm * (ys / nf);
  const ssXx = xxs - nf * xm * xm;
  return Math.abs(ssXx) < 1e-12 ? 0 : ssXy / ssXx;
}
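A worked example of the least-squares slope, with the function copied verbatim from above: a perfectly linear window returns its per-step increment, and a flat window returns zero:

```javascript
// Copy of computeTrendSlope from this file.
function computeTrendSlope(buf) {
  const n = buf.length;
  if (n < 2) return 0;
  const nf = n;
  const xm = (nf - 1) / 2;
  let ys = 0, xys = 0, xxs = 0, i = 0;
  for (const y of buf) {
    ys += y;
    xys += i * y;
    xxs += i * i;
    i++;
  }
  const ssXy = xys - nf * xm * (ys / nf);
  const ssXx = xxs - nf * xm * xm;
  return Math.abs(ssXx) < 1e-12 ? 0 : ssXy / ssXx;
}

console.log(computeTrendSlope([0, 1, 2, 3])); // 1
console.log(computeTrendSlope([5, 5, 5]));    // 0
console.log(computeTrendSlope([42]));         // 0 (too short for a trend)
```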

// ── StreamConfig ─────────────────────────────────────────────────────────────

function defaultStreamConfig() {
  return {
    baseIntervalMs: 1000,
    noiseAmplitude: 0.02,
    driftRate: 0.0,
    anomalyProbability: 0.02,
    anomalyMagnitude: 2.5,
    numBiomarkers: 6,
    windowSize: 100,
  };
}

// ── Mulberry32 PRNG ──────────────────────────────────────────────────────────

function mulberry32(seed) {
  let t = (seed + 0x6D2B79F5) | 0;
  return function () {
    t = (t + 0x6D2B79F5) | 0;
    let z = t ^ (t >>> 15);
    z = Math.imul(z | 1, z);
    z ^= z + Math.imul(z ^ (z >>> 7), z | 61);
    return ((z ^ (z >>> 14)) >>> 0) / 4294967296;
  };
}

// Box-Muller for normal distribution
function normalSample(rng, mean, stddev) {
  const u1 = rng();
  const u2 = rng();
  return mean + stddev * Math.sqrt(-2 * Math.log(u1 || 1e-12)) * Math.cos(2 * Math.PI * u2);
}

// ── Batch generation (mirrors generate_readings in Rust) ─────────────────────

function generateReadings(config, count, seed) {
  const rng = mulberry32(seed);
  const active = BIOMARKER_DEFS.slice(0, Math.min(config.numBiomarkers, BIOMARKER_DEFS.length));
  const readings = [];

  // Pre-compute distributions per biomarker
  const dists = active.map(def => {
    const range = def.high - def.low;
    const mid = (def.low + def.high) / 2;
    const sigma = Math.max(config.noiseAmplitude * range, 1e-12);
    return { mid, range, sigma };
  });

  let ts = 0;
  for (let step = 0; step < count; step++) {
    for (let j = 0; j < active.length; j++) {
      const def = active[j];
      const { mid, range, sigma } = dists[j];
      const drift = config.driftRate * range * step;
      const isAnomaly = rng() < config.anomalyProbability;
      const effectiveSigma = isAnomaly ? sigma * config.anomalyMagnitude : sigma;
      const value = Math.max(normalSample(rng, mid + drift, effectiveSigma), 0);
      readings.push({
        timestampMs: ts,
        biomarkerId: def.id,
        value,
        referenceLow: def.low,
        referenceHigh: def.high,
        isAnomaly,
        zScore: 0,
      });
    }
    ts += config.baseIntervalMs;
  }
  return readings;
}

// ── StreamProcessor ──────────────────────────────────────────────────────────

class StreamProcessor {
  constructor(config) {
    this._config = config || defaultStreamConfig();
    this._buffers = new Map();
    this._stats = new Map();
    this._totalReadings = 0;
    this._anomalyCount = 0;
    this._anomPerBio = new Map();
    this._welford = new Map();
    this._startTs = null;
this._lastTs = null;
|
||||
}
|
||||
|
||||
_initBiomarker(id) {
|
||||
this._buffers.set(id, new RingBuffer(this._config.windowSize));
|
||||
this._stats.set(id, {
|
||||
mean: 0, variance: 0, min: Infinity, max: -Infinity,
|
||||
count: 0, anomalyRate: 0, trendSlope: 0, ema: 0,
|
||||
cusumPos: 0, cusumNeg: 0, changepointDetected: false,
|
||||
});
|
||||
// Incremental Welford state for windowed mean/variance (O(1) per reading)
|
||||
this._welford.set(id, { n: 0, mean: 0, m2: 0 });
|
||||
}
|
||||
|
||||
processReading(reading) {
|
||||
const id = reading.biomarkerId;
|
||||
if (this._startTs === null) this._startTs = reading.timestampMs;
|
||||
this._lastTs = reading.timestampMs;
|
||||
|
||||
if (!this._buffers.has(id)) this._initBiomarker(id);
|
||||
|
||||
const buf = this._buffers.get(id);
|
||||
const evicted = buf.pushPop(reading.value);
|
||||
this._totalReadings++;
|
||||
|
||||
// Incremental windowed Welford: O(1) add + O(1) remove
|
||||
const w = this._welford.get(id);
|
||||
const val = reading.value;
|
||||
if (Number.isNaN(evicted)) {
|
||||
// Buffer wasn't full — just add
|
||||
w.n++;
|
||||
const d1 = val - w.mean;
|
||||
w.mean += d1 / w.n;
|
||||
w.m2 += d1 * (val - w.mean);
|
||||
} else {
|
||||
// Buffer full — remove evicted, add new (n stays the same)
|
||||
const oldMean = w.mean;
|
||||
w.mean += (val - evicted) / w.n;
|
||||
w.m2 += (val - evicted) * ((val - w.mean) + (evicted - oldMean));
|
||||
if (w.m2 < 0) w.m2 = 0; // numerical guard
|
||||
}
|
||||
const wmean = w.mean;
|
||||
const wstd = w.n > 1 ? Math.sqrt(w.m2 / (w.n - 1)) : 0;
|
||||
|
||||
const z = wstd > 1e-12 ? (val - wmean) / wstd : 0;
|
||||
|
||||
const rng = reading.referenceHigh - reading.referenceLow;
|
||||
const overshoot = REF_OVERSHOOT * rng;
|
||||
const oor = val < (reading.referenceLow - overshoot) ||
|
||||
val > (reading.referenceHigh + overshoot);
|
||||
const isAnomaly = Math.abs(z) > Z_SCORE_THRESHOLD || oor;
|
||||
|
||||
if (isAnomaly) {
|
||||
this._anomalyCount++;
|
||||
this._anomPerBio.set(id, (this._anomPerBio.get(id) || 0) + 1);
|
||||
}
|
||||
|
||||
const slope = computeTrendSlope(buf);
|
||||
const bioAnom = this._anomPerBio.get(id) || 0;
|
||||
|
||||
const st = this._stats.get(id);
|
||||
st.count++;
|
||||
st.mean = wmean;
|
||||
st.variance = wstd * wstd;
|
||||
st.trendSlope = slope;
|
||||
st.anomalyRate = bioAnom / st.count;
|
||||
if (val < st.min) st.min = val;
|
||||
if (val > st.max) st.max = val;
|
||||
st.ema = st.count === 1
|
||||
? val
|
||||
: EMA_ALPHA * val + (1 - EMA_ALPHA) * st.ema;
|
||||
|
||||
// CUSUM changepoint detection
|
||||
if (wstd > 1e-12) {
|
||||
const normDev = (val - wmean) / wstd;
|
||||
st.cusumPos = Math.max(st.cusumPos + normDev - CUSUM_DRIFT, 0);
|
||||
st.cusumNeg = Math.max(st.cusumNeg - normDev - CUSUM_DRIFT, 0);
|
||||
st.changepointDetected = st.cusumPos > CUSUM_THRESHOLD || st.cusumNeg > CUSUM_THRESHOLD;
|
||||
if (st.changepointDetected) { st.cusumPos = 0; st.cusumNeg = 0; }
|
||||
}
|
||||
|
||||
return { accepted: true, zScore: z, isAnomaly, currentTrend: slope };
|
||||
}
|
||||
|
||||
getStats(biomarkerId) {
|
||||
return this._stats.get(biomarkerId) || null;
|
||||
}
|
||||
|
||||
summary() {
|
||||
const elapsed = (this._startTs !== null && this._lastTs !== null && this._lastTs > this._startTs)
|
||||
? this._lastTs - this._startTs : 1;
|
||||
const ar = this._totalReadings > 0 ? this._anomalyCount / this._totalReadings : 0;
|
||||
const statsObj = {};
|
||||
for (const [k, v] of this._stats) statsObj[k] = { ...v };
|
||||
return {
|
||||
totalReadings: this._totalReadings,
|
||||
anomalyCount: this._anomalyCount,
|
||||
anomalyRate: ar,
|
||||
biomarkerStats: statsObj,
|
||||
throughputReadingsPerSec: this._totalReadings / (elapsed / 1000),
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
module.exports = {
|
||||
RingBuffer,
|
||||
StreamProcessor,
|
||||
BIOMARKER_DEFS,
|
||||
EMA_ALPHA,
|
||||
Z_SCORE_THRESHOLD,
|
||||
REF_OVERSHOOT,
|
||||
CUSUM_THRESHOLD,
|
||||
CUSUM_DRIFT,
|
||||
defaultStreamConfig,
|
||||
generateReadings,
|
||||
};
|
||||
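The Mulberry32 generator above is what makes `generateReadings` reproducible: the same seed always yields the same reading stream. A minimal standalone check of that property (the function is copied verbatim from the listing; the assertions around it are illustrative):

```javascript
// Mulberry32 PRNG, copied from the listing above.
function mulberry32(seed) {
  let t = (seed + 0x6D2B79F5) | 0;
  return function () {
    t = (t + 0x6D2B79F5) | 0;
    let z = t ^ (t >>> 15);
    z = Math.imul(z | 1, z);
    z ^= z + Math.imul(z ^ (z >>> 7), z | 61);
    return ((z ^ (z >>> 14)) >>> 0) / 4294967296;
  };
}

// Two generators seeded identically must produce identical streams...
const a = mulberry32(42);
const b = mulberry32(42);
for (let i = 0; i < 1000; i++) {
  if (a() !== b()) throw new Error('streams diverged');
}

// ...and every draw must lie in [0, 1), since the 32-bit result is
// divided by 2^32 (4294967296).
const c = mulberry32(7);
for (let i = 0; i < 1000; i++) {
  const x = c();
  if (!(x >= 0 && x < 1)) throw new Error('draw out of [0, 1)');
}
console.log('mulberry32 deterministic and in range');
```

This determinism is why the tests further down can assert exact counts and reproducible population statistics from fixed seeds.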
33
vendor/ruvector/npm/packages/rvdna/tests/fixtures/sample-high-risk-cardio.23andme.txt
vendored
Normal file
@@ -0,0 +1,33 @@
# 23andMe raw data file — Scenario: High-risk cardiovascular + MTHFR compound het
# This file is format version: v5
# Below is a subset of data (build 37, GRCh37/hg19)
# rsid chromosome position genotype
rs429358 19 45411941 CT
rs7412 19 45412079 CC
rs1042522 17 7579472 CG
rs80357906 17 41246537 DD
rs28897696 17 41244999 GG
rs11571833 13 32972626 AA
rs1801133 1 11856378 AA
rs1801131 1 11854476 GT
rs4680 22 19951271 AG
rs1799971 6 154360797 AG
rs762551 15 75041917 AC
rs4988235 2 136608646 AG
rs53576 3 8804371 AG
rs6311 13 47471478 CT
rs1800497 11 113270828 AG
rs4363657 12 21331549 CT
rs1800566 16 69745145 CT
rs10455872 6 161010118 AG
rs3798220 6 160961137 CT
rs11591147 1 55505647 GG
rs3892097 22 42524947 CT
rs35742686 22 42523791 DD
rs5030655 22 42522393 DD
rs1065852 22 42526694 CT
rs28371725 22 42525772 TT
rs28371706 22 42523610 CC
rs4244285 10 96541616 AG
rs4986893 10 96540410 GG
rs12248560 10 96521657 CT
33
vendor/ruvector/npm/packages/rvdna/tests/fixtures/sample-low-risk-baseline.23andme.txt
vendored
Normal file
@@ -0,0 +1,33 @@
# 23andMe raw data file — Scenario: Low-risk baseline (all reference genotypes)
# This file is format version: v5
# Below is a subset of data (build 38, GRCh38/hg38)
# rsid chromosome position genotype
rs429358 19 45411941 TT
rs7412 19 45412079 CC
rs1042522 17 7579472 CC
rs80357906 17 41246537 DD
rs28897696 17 41244999 GG
rs11571833 13 32972626 AA
rs1801133 1 11856378 GG
rs1801131 1 11854476 TT
rs4680 22 19951271 GG
rs1799971 6 154360797 AA
rs762551 15 75041917 AA
rs4988235 2 136608646 AA
rs53576 3 8804371 GG
rs6311 13 47471478 CC
rs1800497 11 113270828 GG
rs4363657 12 21331549 TT
rs1800566 16 69745145 CC
rs10455872 6 161010118 AA
rs3798220 6 160961137 TT
rs11591147 1 55505647 GG
rs3892097 22 42524947 CC
rs35742686 22 42523791 DD
rs5030655 22 42522393 DD
rs1065852 22 42526694 CC
rs28371725 22 42525772 CC
rs28371706 22 42523610 CC
rs4244285 10 96541616 GG
rs4986893 10 96540410 GG
rs12248560 10 96521657 CC
33
vendor/ruvector/npm/packages/rvdna/tests/fixtures/sample-multi-risk.23andme.txt
vendored
Normal file
@@ -0,0 +1,33 @@
# 23andMe raw data file — Scenario: APOE e4/e4 + BRCA1 carrier + NQO1 null
# This file is format version: v5
# Below is a subset of data (build 37, GRCh37/hg19)
# rsid chromosome position genotype
rs429358 19 45411941 CC
rs7412 19 45412079 CC
rs1042522 17 7579472 GG
rs80357906 17 41246537 DI
rs28897696 17 41244999 AG
rs11571833 13 32972626 AT
rs1801133 1 11856378 AG
rs1801131 1 11854476 TT
rs4680 22 19951271 AA
rs1799971 6 154360797 GG
rs762551 15 75041917 CC
rs4988235 2 136608646 GG
rs53576 3 8804371 AA
rs6311 13 47471478 TT
rs1800497 11 113270828 AA
rs4363657 12 21331549 CC
rs1800566 16 69745145 TT
rs10455872 6 161010118 GG
rs3798220 6 160961137 CC
rs11591147 1 55505647 GG
rs3892097 22 42524947 CC
rs35742686 22 42523791 DD
rs5030655 22 42522393 DD
rs1065852 22 42526694 CC
rs28371725 22 42525772 CC
rs28371706 22 42523610 CC
rs4244285 10 96541616 GG
rs4986893 10 96540410 GG
rs12248560 10 96521657 CC
33
vendor/ruvector/npm/packages/rvdna/tests/fixtures/sample-pcsk9-protective.23andme.txt
vendored
Normal file
@@ -0,0 +1,33 @@
# 23andMe raw data file — Scenario: PCSK9 protective + minimal risk
# This file is format version: v5
# Below is a subset of data (build 37, GRCh37/hg19)
# rsid chromosome position genotype
rs429358 19 45411941 TT
rs7412 19 45412079 CT
rs1042522 17 7579472 CC
rs80357906 17 41246537 DD
rs28897696 17 41244999 GG
rs11571833 13 32972626 AA
rs1801133 1 11856378 GG
rs1801131 1 11854476 TT
rs4680 22 19951271 GG
rs1799971 6 154360797 AA
rs762551 15 75041917 AA
rs4988235 2 136608646 AA
rs53576 3 8804371 GG
rs6311 13 47471478 CC
rs1800497 11 113270828 GG
rs4363657 12 21331549 TT
rs1800566 16 69745145 CC
rs10455872 6 161010118 AA
rs3798220 6 160961137 TT
rs11591147 1 55505647 GT
rs3892097 22 42524947 CC
rs35742686 22 42523791 DD
rs5030655 22 42522393 DD
rs1065852 22 42526694 CC
rs28371725 22 42525772 CC
rs28371706 22 42523610 CC
rs4244285 10 96541616 GG
rs4986893 10 96540410 GG
rs12248560 10 96521657 CC
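The four fixtures above all follow the 23andMe raw-data layout: `#`-prefixed comment lines followed by whitespace-separated `rsid chromosome position genotype` rows. A minimal illustrative reader for that layout (this is a sketch for working with the fixtures, not the package's own `parse23andMe`; the function name `parseRawGenotypes` is hypothetical):

```javascript
// Illustrative parser for the 23andMe raw-data layout used by the fixtures.
// Comment lines start with '#'; data rows are "rsid chrom position genotype".
// NOTE: sketch only — not the package's parse23andMe implementation.
function parseRawGenotypes(text) {
  const snps = new Map();
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue; // skip comments/blanks
    const [rsid, chromosome, position, genotype] = trimmed.split(/\s+/);
    if (!rsid || !genotype) continue; // ignore malformed rows
    snps.set(rsid, { chromosome, position: Number(position), genotype });
  }
  return snps;
}

// Example on a two-row snippet from the high-risk fixture:
const sample = [
  '# rsid chromosome position genotype',
  'rs429358 19 45411941 CT',
  'rs7412 19 45412079 CC',
].join('\n');

const snps = parseRawGenotypes(sample);
console.log(snps.get('rs429358').genotype); // 'CT'
console.log(snps.size); // 2
```

Splitting on `/\s+/` keeps the reader agnostic to whether columns are tab- or space-delimited, which is convenient since real 23andMe exports are tab-separated.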
457
vendor/ruvector/npm/packages/rvdna/tests/test-biomarker.js
vendored
Normal file
@@ -0,0 +1,457 @@
'use strict';

const {
  biomarkerReferences, zScore, classifyBiomarker,
  computeRiskScores, encodeProfileVector, generateSyntheticPopulation,
  SNPS, INTERACTIONS, CAT_ORDER,
} = require('../src/biomarker');

const {
  RingBuffer, StreamProcessor, generateReadings, defaultStreamConfig,
  Z_SCORE_THRESHOLD,
} = require('../src/stream');

// ── Test harness ─────────────────────────────────────────────────────────────

let passed = 0, failed = 0, benchResults = [];

function assert(cond, msg) {
  if (!cond) throw new Error(`Assertion failed: ${msg}`);
}

function assertClose(a, b, eps, msg) {
  if (Math.abs(a - b) > eps) throw new Error(`${msg}: ${a} != ${b} (eps=${eps})`);
}

function test(name, fn) {
  try {
    fn();
    passed++;
    process.stdout.write(`  PASS  ${name}\n`);
  } catch (e) {
    failed++;
    process.stdout.write(`  FAIL  ${name}: ${e.message}\n`);
  }
}

function bench(name, fn, iterations) {
  // Warmup
  for (let i = 0; i < Math.min(iterations, 1000); i++) fn();
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const elapsed = performance.now() - start;
  const perOp = (elapsed / iterations * 1000).toFixed(2);
  benchResults.push({ name, perOp: `${perOp} us`, total: `${elapsed.toFixed(1)} ms`, iterations });
  process.stdout.write(`  BENCH ${name}: ${perOp} us/op (${iterations} iters, ${elapsed.toFixed(1)} ms)\n`);
}

// ── Helpers ──────────────────────────────────────────────────────────────────

function fullHomRef() {
  const gts = new Map();
  for (const snp of SNPS) gts.set(snp.rsid, snp.homRef);
  return gts;
}

function reading(ts, id, val, lo, hi) {
  return { timestampMs: ts, biomarkerId: id, value: val, referenceLow: lo, referenceHigh: hi, isAnomaly: false, zScore: 0 };
}

function glucose(ts, val) { return reading(ts, 'glucose', val, 70, 100); }

// ═════════════════════════════════════════════════════════════════════════════
// Biomarker Reference Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Biomarker References ---\n');

test('biomarker_references_count', () => {
  assert(biomarkerReferences().length === 13, `expected 13, got ${biomarkerReferences().length}`);
});

test('z_score_midpoint_is_zero', () => {
  const ref = biomarkerReferences()[0]; // Total Cholesterol
  const mid = (ref.normalLow + ref.normalHigh) / 2;
  assertClose(zScore(mid, ref), 0, 1e-10, 'midpoint z-score');
});

test('z_score_high_bound_is_one', () => {
  const ref = biomarkerReferences()[0];
  assertClose(zScore(ref.normalHigh, ref), 1.0, 1e-10, 'high-bound z-score');
});

// ═════════════════════════════════════════════════════════════════════════════
// Classification Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Classification ---\n');

test('classify_normal', () => {
  const ref = biomarkerReferences()[0]; // 125-200
  assert(classifyBiomarker(150, ref) === 'Normal', 'expected Normal');
});

test('classify_high', () => {
  const ref = biomarkerReferences()[0]; // normalHigh=200, criticalHigh=300
  assert(classifyBiomarker(250, ref) === 'High', 'expected High');
});

test('classify_critical_high', () => {
  const ref = biomarkerReferences()[0]; // criticalHigh=300
  assert(classifyBiomarker(350, ref) === 'CriticalHigh', 'expected CriticalHigh');
});

test('classify_low', () => {
  const ref = biomarkerReferences()[0]; // normalLow=125, criticalLow=100
  assert(classifyBiomarker(110, ref) === 'Low', 'expected Low');
});

test('classify_critical_low', () => {
  const ref = biomarkerReferences()[0]; // criticalLow=100
  assert(classifyBiomarker(90, ref) === 'CriticalLow', 'expected CriticalLow');
});

// ═════════════════════════════════════════════════════════════════════════════
// Risk Scoring Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Risk Scoring ---\n');

test('all_hom_ref_low_risk', () => {
  const gts = fullHomRef();
  const profile = computeRiskScores(gts);
  assert(profile.globalRiskScore < 0.15, `hom-ref should be low risk, got ${profile.globalRiskScore}`);
});

test('high_cancer_risk', () => {
  const gts = fullHomRef();
  gts.set('rs80357906', 'DI');
  gts.set('rs1042522', 'GG');
  gts.set('rs11571833', 'TT');
  const profile = computeRiskScores(gts);
  const cancer = profile.categoryScores['Cancer Risk'];
  assert(cancer.score > 0.3, `should have elevated cancer risk, got ${cancer.score}`);
});

test('interaction_comt_oprm1', () => {
  const gts = fullHomRef();
  gts.set('rs4680', 'AA');
  gts.set('rs1799971', 'GG');
  const withInteraction = computeRiskScores(gts);
  const neuroInter = withInteraction.categoryScores['Neurological'].score;

  const gts2 = fullHomRef();
  gts2.set('rs4680', 'AA');
  const withoutFull = computeRiskScores(gts2);
  const neuroSingle = withoutFull.categoryScores['Neurological'].score;

  assert(neuroInter > neuroSingle, `interaction should amplify risk: ${neuroInter} > ${neuroSingle}`);
});

test('interaction_brca1_tp53', () => {
  const gts = fullHomRef();
  gts.set('rs80357906', 'DI');
  gts.set('rs1042522', 'GG');
  const profile = computeRiskScores(gts);
  const cancer = profile.categoryScores['Cancer Risk'];
  assert(cancer.contributingVariants.includes('rs80357906'), 'missing rs80357906');
  assert(cancer.contributingVariants.includes('rs1042522'), 'missing rs1042522');
});

// ═════════════════════════════════════════════════════════════════════════════
// Profile Vector Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Profile Vectors ---\n');

test('vector_dimension_is_64', () => {
  const gts = fullHomRef();
  const profile = computeRiskScores(gts);
  assert(profile.profileVector.length === 64, `expected 64, got ${profile.profileVector.length}`);
});

test('vector_is_l2_normalized', () => {
  const gts = fullHomRef();
  gts.set('rs4680', 'AG');
  gts.set('rs1799971', 'AG');
  const profile = computeRiskScores(gts);
  let norm = 0;
  for (let i = 0; i < 64; i++) norm += profile.profileVector[i] ** 2;
  norm = Math.sqrt(norm);
  assertClose(norm, 1.0, 1e-4, 'L2 norm');
});

test('vector_deterministic', () => {
  const gts = fullHomRef();
  gts.set('rs1801133', 'AG');
  const a = computeRiskScores(gts);
  const b = computeRiskScores(gts);
  for (let i = 0; i < 64; i++) {
    assertClose(a.profileVector[i], b.profileVector[i], 1e-10, `dim ${i}`);
  }
});

// ═════════════════════════════════════════════════════════════════════════════
// Population Generation Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Population Generation ---\n');

test('population_correct_count', () => {
  const pop = generateSyntheticPopulation(50, 42);
  assert(pop.length === 50, `expected 50, got ${pop.length}`);
  for (const p of pop) {
    assert(p.profileVector.length === 64, `expected 64-dim vector`);
    assert(Object.keys(p.biomarkerValues).length > 0, 'should have biomarker values');
    assert(p.globalRiskScore >= 0 && p.globalRiskScore <= 1, 'risk in [0,1]');
  }
});

test('population_deterministic', () => {
  const a = generateSyntheticPopulation(10, 99);
  const b = generateSyntheticPopulation(10, 99);
  for (let i = 0; i < 10; i++) {
    assert(a[i].subjectId === b[i].subjectId, 'subject IDs must match');
    assertClose(a[i].globalRiskScore, b[i].globalRiskScore, 1e-10, `risk score ${i}`);
  }
});

test('mthfr_elevates_homocysteine', () => {
  const pop = generateSyntheticPopulation(200, 7);
  const high = [], low = [];
  for (const p of pop) {
    const hcy = p.biomarkerValues['Homocysteine'] || 0;
    const metaScore = p.categoryScores['Metabolism'] ? p.categoryScores['Metabolism'].score : 0;
    if (metaScore > 0.3) high.push(hcy); else low.push(hcy);
  }
  if (high.length > 0 && low.length > 0) {
    const avgHigh = high.reduce((a, b) => a + b, 0) / high.length;
    const avgLow = low.reduce((a, b) => a + b, 0) / low.length;
    assert(avgHigh > avgLow, `MTHFR should elevate homocysteine: high=${avgHigh}, low=${avgLow}`);
  }
});

// ═════════════════════════════════════════════════════════════════════════════
// RingBuffer Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- RingBuffer ---\n');

test('ring_buffer_push_iter_len', () => {
  const rb = new RingBuffer(4);
  for (const v of [10, 20, 30]) rb.push(v);
  const arr = rb.toArray();
  assert(arr.length === 3 && arr[0] === 10 && arr[1] === 20 && arr[2] === 30, 'push/iter');
  assert(rb.length === 3, 'length');
  assert(!rb.isFull(), 'not full');
});

test('ring_buffer_overflow_keeps_newest', () => {
  const rb = new RingBuffer(3);
  for (let v = 1; v <= 4; v++) rb.push(v);
  assert(rb.isFull(), 'should be full');
  const arr = rb.toArray();
  assert(arr[0] === 2 && arr[1] === 3 && arr[2] === 4, `got [${arr}]`);
});

test('ring_buffer_capacity_one', () => {
  const rb = new RingBuffer(1);
  rb.push(42); rb.push(99);
  const arr = rb.toArray();
  assert(arr.length === 1 && arr[0] === 99, `got [${arr}]`);
});

test('ring_buffer_clear_resets', () => {
  const rb = new RingBuffer(3);
  rb.push(1); rb.push(2); rb.clear();
  assert(rb.length === 0, 'length after clear');
  assert(!rb.isFull(), 'not full after clear');
  assert(rb.toArray().length === 0, 'empty after clear');
});

// ═════════════════════════════════════════════════════════════════════════════
// Stream Processor Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Stream Processor ---\n');

test('processor_computes_stats', () => {
  const cfg = { ...defaultStreamConfig(), windowSize: 10 };
  const p = new StreamProcessor(cfg);
  const readings = generateReadings(cfg, 20, 55);
  for (const r of readings) p.processReading(r);
  const s = p.getStats('glucose');
  assert(s !== null, 'should have glucose stats');
  assert(s.count > 0 && s.mean > 0 && s.min <= s.max, 'valid stats');
});

test('processor_summary_totals', () => {
  const cfg = defaultStreamConfig();
  const p = new StreamProcessor(cfg);
  const readings = generateReadings(cfg, 30, 77);
  for (const r of readings) p.processReading(r);
  const s = p.summary();
  assert(s.totalReadings === 30 * cfg.numBiomarkers, `expected ${30 * cfg.numBiomarkers}, got ${s.totalReadings}`);
  assert(s.anomalyRate >= 0 && s.anomalyRate <= 1, 'anomaly rate in [0,1]');
});

test('processor_throughput_positive', () => {
  const cfg = defaultStreamConfig();
  const p = new StreamProcessor(cfg);
  const readings = generateReadings(cfg, 100, 88);
  for (const r of readings) p.processReading(r);
  const s = p.summary();
  assert(s.throughputReadingsPerSec > 0, 'throughput should be positive');
});

// ═════════════════════════════════════════════════════════════════════════════
// Anomaly Detection Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Anomaly Detection ---\n');

test('detects_z_score_anomaly', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 20 });
  for (let i = 0; i < 20; i++) p.processReading(glucose(i * 1000, 85));
  const r = p.processReading(glucose(20000, 300));
  assert(r.isAnomaly, 'should detect anomaly');
  assert(Math.abs(r.zScore) > Z_SCORE_THRESHOLD, `z-score ${r.zScore} should exceed threshold`);
});

test('detects_out_of_range_anomaly', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 5 });
  for (const [i, v] of [80, 82, 78, 84, 81].entries()) {
    p.processReading(glucose(i * 1000, v));
  }
  // 140 >> ref_high(100) + 20%*range(30)=106
  const r = p.processReading(glucose(5000, 140));
  assert(r.isAnomaly, 'should detect out-of-range anomaly');
});

test('zero_anomaly_for_constant_stream', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 50 });
  for (let i = 0; i < 10; i++) p.processReading(reading(i * 1000, 'crp', 1.5, 0.1, 3));
  const s = p.getStats('crp');
  assert(Math.abs(s.anomalyRate) < 1e-9, `expected zero anomaly rate, got ${s.anomalyRate}`);
});

// ═════════════════════════════════════════════════════════════════════════════
// Trend Detection Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Trend Detection ---\n');

test('positive_trend_for_increasing', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 20 });
  let r;
  for (let i = 0; i < 20; i++) r = p.processReading(glucose(i * 1000, 70 + i));
  assert(r.currentTrend > 0, `expected positive trend, got ${r.currentTrend}`);
});

test('negative_trend_for_decreasing', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 20 });
  let r;
  for (let i = 0; i < 20; i++) r = p.processReading(reading(i * 1000, 'hdl', 60 - i * 0.5, 40, 60));
  assert(r.currentTrend < 0, `expected negative trend, got ${r.currentTrend}`);
});

test('exact_slope_for_linear_series', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 10 });
  for (let i = 0; i < 10; i++) {
    p.processReading(reading(i * 1000, 'ldl', 100 + i * 3, 70, 130));
  }
  assertClose(p.getStats('ldl').trendSlope, 3.0, 1e-9, 'slope');
});

// ═════════════════════════════════════════════════════════════════════════════
// Z-score / EMA Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Z-Score / EMA ---\n');

test('z_score_small_for_near_mean', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 10 });
  for (const [i, v] of [80, 82, 78, 84, 76, 86, 81, 79, 83].entries()) {
    p.processReading(glucose(i * 1000, v));
  }
  const mean = p.getStats('glucose').mean;
  const r = p.processReading(glucose(9000, mean));
  assert(Math.abs(r.zScore) < 1, `z-score for mean value should be small, got ${r.zScore}`);
});

test('ema_converges_to_constant', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 50 });
  for (let i = 0; i < 50; i++) p.processReading(reading(i * 1000, 'crp', 2.0, 0.1, 3));
  assertClose(p.getStats('crp').ema, 2.0, 1e-6, 'EMA convergence');
});

// ═════════════════════════════════════════════════════════════════════════════
// Batch Generation Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Batch Generation ---\n');

test('generate_correct_count_and_ids', () => {
  const cfg = defaultStreamConfig();
  const readings = generateReadings(cfg, 50, 42);
  assert(readings.length === 50 * cfg.numBiomarkers, `expected ${50 * cfg.numBiomarkers}, got ${readings.length}`);
  const validIds = new Set(['glucose', 'cholesterol_total', 'hdl', 'ldl', 'triglycerides', 'crp']);
  for (const r of readings) assert(validIds.has(r.biomarkerId), `invalid id: ${r.biomarkerId}`);
});

test('generated_values_non_negative', () => {
  const readings = generateReadings(defaultStreamConfig(), 100, 999);
  for (const r of readings) assert(r.value >= 0, `negative value: ${r.value}`);
});

// ═════════════════════════════════════════════════════════════════════════════
// Benchmarks
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Benchmarks ---\n');

const benchGts = fullHomRef();
benchGts.set('rs4680', 'AG');
benchGts.set('rs1801133', 'AA');

bench('computeRiskScores (20 SNPs)', () => {
  computeRiskScores(benchGts);
}, 10000);

bench('encodeProfileVector (64-dim)', () => {
  const p = computeRiskScores(benchGts);
  encodeProfileVector(p);
}, 10000);

bench('StreamProcessor.processReading', () => {
  const p = new StreamProcessor({ ...defaultStreamConfig(), windowSize: 100 });
  const r = glucose(0, 85);
  for (let i = 0; i < 100; i++) p.processReading(r);
}, 1000);

bench('generateSyntheticPopulation(100)', () => {
  generateSyntheticPopulation(100, 42);
}, 100);

bench('RingBuffer push+iter (100 items)', () => {
  const rb = new RingBuffer(100);
  for (let i = 0; i < 100; i++) rb.push(i);
  let s = 0;
  for (const v of rb) s += v;
}, 10000);

// ═════════════════════════════════════════════════════════════════════════════
// Summary
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write(`\n${'='.repeat(60)}\n`);
process.stdout.write(`Results: ${passed} passed, ${failed} failed, ${passed + failed} total\n`);
if (benchResults.length > 0) {
  process.stdout.write('\nBenchmark Summary:\n');
  for (const b of benchResults) {
    process.stdout.write(`  ${b.name}: ${b.perOp}/op\n`);
  }
}
process.stdout.write(`${'='.repeat(60)}\n`);

process.exit(failed > 0 ? 1 : 0);
559
vendor/ruvector/npm/packages/rvdna/tests/test-real-data.js
vendored
Normal file
@@ -0,0 +1,559 @@
'use strict';

const fs = require('fs');
const path = require('path');

// Import from index.js (the package entry point) to test the full re-export chain
const rvdna = require('../index.js');

// ── Test harness ─────────────────────────────────────────────────────────────

let passed = 0, failed = 0, benchResults = [];

function assert(cond, msg) {
  if (!cond) throw new Error(`Assertion failed: ${msg}`);
}

function assertClose(a, b, eps, msg) {
  if (Math.abs(a - b) > eps) throw new Error(`${msg}: ${a} != ${b} (eps=${eps})`);
}

function assertGt(a, b, msg) {
  if (!(a > b)) throw new Error(`${msg}: expected ${a} > ${b}`);
}

function assertLt(a, b, msg) {
  if (!(a < b)) throw new Error(`${msg}: expected ${a} < ${b}`);
}

function test(name, fn) {
  try {
    fn();
    passed++;
    process.stdout.write(`  PASS ${name}\n`);
  } catch (e) {
    failed++;
    process.stdout.write(`  FAIL ${name}: ${e.message}\n`);
  }
}

function bench(name, fn, iterations) {
  // Warm-up pass (capped at 1000 calls) so JIT compilation doesn't skew the timing
  for (let i = 0; i < Math.min(iterations, 1000); i++) fn();
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const elapsed = performance.now() - start;
  const perOp = (elapsed / iterations * 1000).toFixed(2);
  benchResults.push({ name, perOp: `${perOp} us`, total: `${elapsed.toFixed(1)} ms`, iterations });
  process.stdout.write(`  BENCH ${name}: ${perOp} us/op (${iterations} iters, ${elapsed.toFixed(1)} ms)\n`);
}

// ── Fixture loading ──────────────────────────────────────────────────────────

const FIXTURES = path.join(__dirname, 'fixtures');

function loadFixture(name) {
  return fs.readFileSync(path.join(FIXTURES, name), 'utf8');
}

function parseFixtureToGenotypes(name) {
  const text = loadFixture(name);
  const data = rvdna.parse23andMe(text);
  const gts = new Map();
  for (const [rsid, snp] of data.snps) gts.set(rsid, snp.genotype);
  return { data, gts };
}
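For orientation, the raw 23andMe export that these fixtures mimic is a simple text format: `#`-prefixed comment lines followed by tab-separated `rsid / chromosome / position / genotype` rows. A minimal standalone parser for just that shape (illustrative only; `rvdna.parse23andMe` additionally reports the genome build, marker counts, and no-calls) can be sketched as:

```javascript
// Minimal sketch of the 23andMe raw text format: '#'-prefixed comment lines,
// then tab-separated columns rsid / chromosome / position / genotype.
// Illustrative only — not the package's parser.
function sketchParse23andMe(text) {
  const snps = new Map();
  for (const line of text.split('\n')) {
    if (!line || line.startsWith('#')) continue;
    const [rsid, chromosome, position, genotype] = line.split('\t');
    if (!rsid || !genotype) continue;
    snps.set(rsid, { chromosome, position: Number(position), genotype });
  }
  return snps;
}

const sample = '# comment line\nrs4680\t22\t19951271\tAG\nrs429358\t19\t45411941\tTT\n';
const snps = sketchParse23andMe(sample);
console.log(snps.get('rs4680').genotype); // AG
```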

// ═════════════════════════════════════════════════════════════════════════════
// SECTION 1: End-to-End Pipeline (parse 23andMe → biomarker scoring → stream)
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- End-to-End Pipeline ---\n');

test('e2e_high_risk_cardio_pipeline', () => {
  const { data, gts } = parseFixtureToGenotypes('sample-high-risk-cardio.23andme.txt');

  // Stage 1: 23andMe parsing
  assert(data.totalMarkers === 29, `expected 29 markers, got ${data.totalMarkers}`);
  assert(data.build === 'GRCh37', `expected GRCh37, got ${data.build}`);
  assert(data.noCalls === 0, 'no no-calls expected');

  // Stage 2: Genotyping analysis
  const analysis = rvdna.analyze23andMe(loadFixture('sample-high-risk-cardio.23andme.txt'));
  assert(analysis.cyp2d6.phenotype !== undefined, 'CYP2D6 phenotype should be defined');
  assert(analysis.cyp2c19.phenotype !== undefined, 'CYP2C19 phenotype should be defined');

  // Stage 3: Biomarker risk scoring
  const profile = rvdna.computeRiskScores(gts);
  assert(profile.profileVector.length === 64, 'profile vector should be 64-dim');
  assert(profile.globalRiskScore >= 0 && profile.globalRiskScore <= 1, 'risk in [0,1]');

  // High-risk cardiac: MTHFR 677TT + LPA het + SLCO1B1 het → elevated metabolism + cardiovascular
  const metab = profile.categoryScores['Metabolism'];
  assertGt(metab.score, 0.3, 'MTHFR 677TT should elevate metabolism risk');
  assertGt(metab.confidence, 0.5, 'metabolism confidence should be substantial');

  const cardio = profile.categoryScores['Cardiovascular'];
  assert(cardio.contributingVariants.includes('rs10455872'), 'LPA variant should contribute');
  assert(cardio.contributingVariants.includes('rs4363657'), 'SLCO1B1 variant should contribute');
  assert(cardio.contributingVariants.includes('rs3798220'), 'LPA rs3798220 should contribute');

  // Stage 4: Feed synthetic biomarker readings through the streaming processor
  const cfg = rvdna.defaultStreamConfig();
  const processor = new rvdna.StreamProcessor(cfg);
  const readings = rvdna.generateReadings(cfg, 50, 42);
  for (const r of readings) processor.processReading(r);
  const summary = processor.summary();
  assert(summary.totalReadings > 0, 'should have processed readings');
  assert(summary.anomalyRate >= 0, 'anomaly rate should be valid');
});

test('e2e_low_risk_baseline_pipeline', () => {
  const { data, gts } = parseFixtureToGenotypes('sample-low-risk-baseline.23andme.txt');

  // Parse
  assert(data.totalMarkers === 29, 'expected 29 markers');
  assert(data.build === 'GRCh38', `expected GRCh38, got ${data.build}`);

  // Score
  const profile = rvdna.computeRiskScores(gts);
  assertLt(profile.globalRiskScore, 0.15, 'all-ref should be very low risk');

  // All categories should be near-zero
  for (const [cat, cs] of Object.entries(profile.categoryScores)) {
    assertLt(cs.score, 0.05, `${cat} should be near-zero for all-ref`);
  }

  // APOE should be e3/e3
  const apoe = rvdna.determineApoe(gts);
  assert(apoe.genotype.includes('e3/e3'), `expected e3/e3, got ${apoe.genotype}`);
});

// ═════════════════════════════════════════════════════════════════════════════
// SECTION 2: Clinical Scenario Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Clinical Scenarios ---\n');

test('scenario_apoe_e4e4_brca1_carrier', () => {
  const { gts } = parseFixtureToGenotypes('sample-multi-risk.23andme.txt');
  const profile = rvdna.computeRiskScores(gts);

  // APOE e4/e4 → high neurological risk
  const neuro = profile.categoryScores['Neurological'];
  assertGt(neuro.score, 0.5, `APOE e4/e4 + COMT Met/Met should push neuro >0.5, got ${neuro.score}`);
  assert(neuro.contributingVariants.includes('rs429358'), 'APOE should contribute');
  assert(neuro.contributingVariants.includes('rs4680'), 'COMT should contribute');

  // BRCA1 carrier + TP53 variant → elevated cancer risk with interaction
  const cancer = profile.categoryScores['Cancer Risk'];
  assertGt(cancer.score, 0.4, `BRCA1 carrier + TP53 should push cancer >0.4, got ${cancer.score}`);
  assert(cancer.contributingVariants.includes('rs80357906'), 'BRCA1 should contribute');
  assert(cancer.contributingVariants.includes('rs1042522'), 'TP53 should contribute');

  // Cardiovascular should be elevated from SLCO1B1 + LPA
  const cardio = profile.categoryScores['Cardiovascular'];
  assertGt(cardio.score, 0.3, `SLCO1B1 + LPA should push cardio >0.3, got ${cardio.score}`);

  // NQO1 null (TT) should contribute to cancer
  assert(cancer.contributingVariants.includes('rs1800566'), 'NQO1 should contribute');

  // Global risk should be substantial
  assertGt(profile.globalRiskScore, 0.4, `multi-risk global should be >0.4, got ${profile.globalRiskScore}`);

  // APOE determination
  const apoe = rvdna.determineApoe(gts);
  assert(apoe.genotype.includes('e4/e4'), `expected e4/e4, got ${apoe.genotype}`);
});

test('scenario_pcsk9_protective', () => {
  const { gts } = parseFixtureToGenotypes('sample-pcsk9-protective.23andme.txt');
  const profile = rvdna.computeRiskScores(gts);

  // PCSK9 R46L het (rs11591147 GT) → negative cardiovascular weight (protective)
  const cardio = profile.categoryScores['Cardiovascular'];
  // With only the PCSK9 protective allele and no risk alleles, the cardio score should be very low
  assertLt(cardio.score, 0.05, `PCSK9 protective should keep cardio very low, got ${cardio.score}`);

  // APOE e2/e3 protective
  const apoe = rvdna.determineApoe(gts);
  assert(apoe.genotype.includes('e2/e3'), `expected e2/e3, got ${apoe.genotype}`);
});

test('scenario_mthfr_compound_heterozygote', () => {
  const { gts } = parseFixtureToGenotypes('sample-high-risk-cardio.23andme.txt');
  // This file has rs1801133=AA (677TT hom) + rs1801131=GT (1298AC het) → compound score 3

  const profile = rvdna.computeRiskScores(gts);
  const metab = profile.categoryScores['Metabolism'];

  // MTHFR compound heterozygosity should push metabolism risk up
  assertGt(metab.score, 0.3, `MTHFR compound should elevate metabolism, got ${metab.score}`);
  assert(metab.contributingVariants.includes('rs1801133'), 'rs1801133 (C677T) should contribute');
  assert(metab.contributingVariants.includes('rs1801131'), 'rs1801131 (A1298C) should contribute');

  // The MTHFR×MTHFR interaction rs1801133×rs1801131 (modifier 1.3) should amplify the combined score
});
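The modifier-1.3 amplification mentioned above can be pictured with a small sketch of one plausible interaction rule. The exact combination formula inside rvdna is not reproduced here; this assumes additive per-variant weights scaled by the pair's modifier when both variants carry risk:

```javascript
// Hypothetical sketch of an epistatic interaction modifier: when both variants
// of a registered pair carry risk weight, their combined contribution is scaled
// by the pair's modifier. Not rvdna's actual formula — shape only.
function combineWithInteraction(wA, wB, modifier, bothPresent) {
  const base = wA + wB;
  return bothPresent ? base * modifier : base;
}

const separate = combineWithInteraction(0.5, 0.25, 1.3, false); // 0.75
const together = combineWithInteraction(0.5, 0.25, 1.3, true);  // 0.975
console.log(together > separate); // true
```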

test('scenario_comt_oprm1_pain_interaction', () => {
  // Use controlled genotypes that don't saturate the category at 1.0
  const gts = new Map();
  for (const snp of rvdna.SNPS) gts.set(snp.rsid, snp.homRef);
  gts.set('rs4680', 'AA');    // COMT Met/Met
  gts.set('rs1799971', 'GG'); // OPRM1 Asp/Asp
  const profile = rvdna.computeRiskScores(gts);
  const neuro = profile.categoryScores['Neurological'];

  // Without the OPRM1 variant → no interaction modifier
  const gts2 = new Map(gts);
  gts2.set('rs1799971', 'AA'); // reference
  const profile2 = rvdna.computeRiskScores(gts2);
  const neuro2 = profile2.categoryScores['Neurological'];

  assertGt(neuro.score, neuro2.score, 'COMT×OPRM1 interaction should amplify neurological risk');
});

test('scenario_drd2_comt_interaction', () => {
  // Use controlled genotypes that don't saturate the category at 1.0
  const gts = new Map();
  for (const snp of rvdna.SNPS) gts.set(snp.rsid, snp.homRef);
  gts.set('rs1800497', 'AA'); // DRD2 A1/A1
  gts.set('rs4680', 'AA');    // COMT Met/Met
  const profile = rvdna.computeRiskScores(gts);

  // Without the DRD2 variant → no DRD2×COMT interaction
  const gts2 = new Map(gts);
  gts2.set('rs1800497', 'GG'); // reference
  const profile2 = rvdna.computeRiskScores(gts2);

  assertGt(
    profile.categoryScores['Neurological'].score,
    profile2.categoryScores['Neurological'].score,
    'DRD2×COMT interaction should amplify'
  );
});

// ═════════════════════════════════════════════════════════════════════════════
// SECTION 3: Cross-Validation (JS matches Rust expectations)
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Cross-Validation (JS ↔ Rust parity) ---\n');

test('parity_reference_count_matches_rust', () => {
  assert(rvdna.BIOMARKER_REFERENCES.length === 13, 'should have 13 references (matches Rust)');
  assert(rvdna.SNPS.length === 20, 'should have 20 SNPs (matches Rust)');
  assert(rvdna.INTERACTIONS.length === 6, 'should have 6 interactions (matches Rust)');
  assert(rvdna.CAT_ORDER.length === 4, 'should have 4 categories (matches Rust)');
});

test('parity_snp_table_exact_match', () => {
  // Verify the first and last SNPs match Rust exactly
  const first = rvdna.SNPS[0];
  assert(first.rsid === 'rs429358', 'first SNP rsid');
  assertClose(first.wHet, 0.4, 1e-10, 'first SNP wHet');
  assertClose(first.wAlt, 0.9, 1e-10, 'first SNP wAlt');
  assert(first.homRef === 'TT', 'first SNP homRef');
  assert(first.category === 'Neurological', 'first SNP category');

  const last = rvdna.SNPS[19];
  assert(last.rsid === 'rs11591147', 'last SNP rsid');
  assertClose(last.wHet, -0.30, 1e-10, 'PCSK9 wHet (negative = protective)');
  assertClose(last.wAlt, -0.55, 1e-10, 'PCSK9 wAlt (negative = protective)');
});

test('parity_interaction_table_exact_match', () => {
  const i0 = rvdna.INTERACTIONS[0];
  assert(i0.rsidA === 'rs4680' && i0.rsidB === 'rs1799971', 'first interaction pair');
  assertClose(i0.modifier, 1.4, 1e-10, 'COMT×OPRM1 modifier');

  const i3 = rvdna.INTERACTIONS[3];
  assert(i3.rsidA === 'rs80357906' && i3.rsidB === 'rs1042522', 'BRCA1×TP53 pair');
  assertClose(i3.modifier, 1.5, 1e-10, 'BRCA1×TP53 modifier');
});

test('parity_z_score_matches_rust', () => {
  // z_score(mid, ref) should be 0.0 (Rust test_z_score_midpoint_is_zero)
  const ref = rvdna.BIOMARKER_REFERENCES[0]; // Total Cholesterol
  const mid = (ref.normalLow + ref.normalHigh) / 2;
  assertClose(rvdna.zScore(mid, ref), 0, 1e-10, 'midpoint z-score = 0');
  // z_score(normalHigh, ref) should be 1.0 (Rust test_z_score_high_bound_is_one)
  assertClose(rvdna.zScore(ref.normalHigh, ref), 1, 1e-10, 'high-bound z-score = 1');
});
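The two parity assertions pin down a linear map with z(mid) = 0 and z(normalHigh) = 1. One formula consistent with both (an assumed form, not copied from `rvdna.zScore`, which may differ away from these two points) is:

```javascript
// A z-score mapping consistent with the two assertions above: the midpoint of
// the normal range maps to 0 and the upper bound maps to 1.
// Assumed form — rvdna.zScore's actual implementation is not reproduced here.
function sketchZScore(value, ref) {
  const mid = (ref.normalLow + ref.normalHigh) / 2;
  const halfRange = (ref.normalHigh - ref.normalLow) / 2;
  return (value - mid) / halfRange;
}

const ref = { normalLow: 125, normalHigh: 200 }; // Total Cholesterol normal band
console.log(sketchZScore(162.5, ref)); // 0
console.log(sketchZScore(200, ref));   // 1
```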

test('parity_classification_matches_rust', () => {
  const ref = rvdna.BIOMARKER_REFERENCES[0]; // Total Cholesterol 125-200
  assert(rvdna.classifyBiomarker(150, ref) === 'Normal', 'Normal');
  assert(rvdna.classifyBiomarker(350, ref) === 'CriticalHigh', 'CriticalHigh (>300)');
  assert(rvdna.classifyBiomarker(110, ref) === 'Low', 'Low');
  assert(rvdna.classifyBiomarker(90, ref) === 'CriticalLow', 'CriticalLow (<100)');
});
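A band classifier consistent with those four assertions can be sketched as follows. The Total Cholesterol thresholds are assumed from the comments above (criticalLow 100, normal 125-200, criticalHigh 300), and the boundary handling is a guess; `rvdna.classifyBiomarker` defines the exact edges:

```javascript
// Band classifier consistent with the four assertions above.
// Thresholds and boundary handling are assumptions, not rvdna's implementation.
function sketchClassify(value, ref) {
  if (ref.criticalHigh !== null && value > ref.criticalHigh) return 'CriticalHigh';
  if (ref.criticalLow !== null && value < ref.criticalLow) return 'CriticalLow';
  if (value > ref.normalHigh) return 'High';
  if (value < ref.normalLow) return 'Low';
  return 'Normal';
}

const chol = { normalLow: 125, normalHigh: 200, criticalLow: 100, criticalHigh: 300 };
console.log(sketchClassify(150, chol)); // Normal
console.log(sketchClassify(90, chol));  // CriticalLow
```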

test('parity_vector_layout_64dim_l2', () => {
  // Rust test_vector_dimension_is_64 and test_vector_is_l2_normalized
  const gts = new Map();
  for (const snp of rvdna.SNPS) gts.set(snp.rsid, snp.homRef);
  gts.set('rs4680', 'AG');
  gts.set('rs1799971', 'AG');
  const profile = rvdna.computeRiskScores(gts);
  assert(profile.profileVector.length === 64, '64 dims');
  let norm = 0;
  for (let i = 0; i < 64; i++) norm += profile.profileVector[i] ** 2;
  norm = Math.sqrt(norm);
  assertClose(norm, 1.0, 1e-4, 'L2 norm');
});
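The unit-norm property being checked above comes from L2 normalization: dividing a vector by its Euclidean norm yields a unit vector. A minimal standalone sketch (not the package's encoder, just the arithmetic the assertion verifies):

```javascript
// L2 normalization sketch: divide each component by the Euclidean norm so the
// result has norm 1 — the property the 64-dim profileVector assertion checks.
function l2Normalize(vec) {
  let norm = 0;
  for (const x of vec) norm += x * x;
  norm = Math.sqrt(norm);
  return norm === 0 ? vec.slice() : vec.map(x => x / norm);
}

const unit = l2Normalize([3, 4]); // [0.6, 0.8]
console.log(Math.hypot(...unit)); // 1
```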

test('parity_hom_ref_low_risk_matches_rust', () => {
  // Rust test_risk_scores_all_hom_ref_low_risk: global < 0.15
  const gts = new Map();
  for (const snp of rvdna.SNPS) gts.set(snp.rsid, snp.homRef);
  const profile = rvdna.computeRiskScores(gts);
  assertLt(profile.globalRiskScore, 0.15, 'hom-ref should be <0.15');
});

test('parity_high_cancer_matches_rust', () => {
  // Rust test_risk_scores_high_cancer_risk: cancer > 0.3
  const gts = new Map();
  for (const snp of rvdna.SNPS) gts.set(snp.rsid, snp.homRef);
  gts.set('rs80357906', 'DI');
  gts.set('rs1042522', 'GG');
  gts.set('rs11571833', 'TT');
  const profile = rvdna.computeRiskScores(gts);
  assertGt(profile.categoryScores['Cancer Risk'].score, 0.3, 'cancer > 0.3');
});

// ═════════════════════════════════════════════════════════════════════════════
// SECTION 4: Population-Scale Correlation Tests
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Population Correlations ---\n');

test('population_apoe_lowers_hdl', () => {
  // Mirrors Rust test_apoe_lowers_hdl_in_population
  const pop = rvdna.generateSyntheticPopulation(300, 88);
  const apoeHdl = [], refHdl = [];
  for (const p of pop) {
    const hdl = p.biomarkerValues['HDL'] || 0;
    const neuro = p.categoryScores['Neurological'] ? p.categoryScores['Neurological'].score : 0;
    if (neuro > 0.3) apoeHdl.push(hdl); else refHdl.push(hdl);
  }
  if (apoeHdl.length > 0 && refHdl.length > 0) {
    const avgApoe = apoeHdl.reduce((a, b) => a + b, 0) / apoeHdl.length;
    const avgRef = refHdl.reduce((a, b) => a + b, 0) / refHdl.length;
    assertLt(avgApoe, avgRef, 'APOE e4 should lower HDL');
  }
});

test('population_lpa_elevates_lpa_biomarker', () => {
  const pop = rvdna.generateSyntheticPopulation(300, 44);
  const lpaHigh = [], lpaLow = [];
  for (const p of pop) {
    const lpaVal = p.biomarkerValues['Lp(a)'] || 0;
    const cardio = p.categoryScores['Cardiovascular'] ? p.categoryScores['Cardiovascular'].score : 0;
    if (cardio > 0.2) lpaHigh.push(lpaVal); else lpaLow.push(lpaVal);
  }
  if (lpaHigh.length > 0 && lpaLow.length > 0) {
    const avgHigh = lpaHigh.reduce((a, b) => a + b, 0) / lpaHigh.length;
    const avgLow = lpaLow.reduce((a, b) => a + b, 0) / lpaLow.length;
    assertGt(avgHigh, avgLow, 'cardiovascular risk should correlate with elevated Lp(a)');
  }
});

test('population_risk_score_distribution', () => {
  const pop = rvdna.generateSyntheticPopulation(1000, 123);
  const scores = pop.map(p => p.globalRiskScore);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  const mean = scores.reduce((a, b) => a + b, 0) / scores.length;

  // Should have good spread
  assertGt(max - min, 0.2, `risk score range should be >0.2, got ${max - min}`);
  // Mean should be moderate (not all near 0 or 1)
  assertGt(mean, 0.05, 'mean risk should be >0.05');
  assertLt(mean, 0.7, 'mean risk should be <0.7');
});

test('population_all_biomarkers_within_clinical_limits', () => {
  const pop = rvdna.generateSyntheticPopulation(500, 55);
  for (const p of pop) {
    for (const bref of rvdna.BIOMARKER_REFERENCES) {
      const val = p.biomarkerValues[bref.name];
      assert(val !== undefined, `missing ${bref.name} for ${p.subjectId}`);
      assert(val >= 0, `${bref.name} should be non-negative, got ${val}`);
      if (bref.criticalHigh !== null) {
        assertLt(val, bref.criticalHigh * 1.25, `${bref.name} should be < criticalHigh*1.25`);
      }
    }
  }
});

// ═════════════════════════════════════════════════════════════════════════════
// SECTION 5: Streaming with Real-Data Correlated Biomarkers
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Streaming with Real Biomarkers ---\n');

test('stream_cusum_changepoint_on_shift', () => {
  // Mirror Rust test_cusum_changepoint_detection
  const cfg = { ...rvdna.defaultStreamConfig(), windowSize: 20 };
  const p = new rvdna.StreamProcessor(cfg);

  // Establish baseline at 85
  for (let i = 0; i < 30; i++) {
    p.processReading({
      timestampMs: i * 1000, biomarkerId: 'glucose', value: 85,
      referenceLow: 70, referenceHigh: 100, isAnomaly: false, zScore: 0,
    });
  }
  // Sustained shift to 120
  for (let i = 30; i < 50; i++) {
    p.processReading({
      timestampMs: i * 1000, biomarkerId: 'glucose', value: 120,
      referenceLow: 70, referenceHigh: 100, isAnomaly: false, zScore: 0,
    });
  }
  const stats = p.getStats('glucose');
  assertGt(stats.mean, 90, `mean should shift upward after changepoint: ${stats.mean}`);
});
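The StreamProcessor's changepoint logic isn't shown in this file; for context, a generic one-sided CUSUM detector (the standard textbook form, not ruvector's implementation) accumulates deviations above baseline plus a slack term and signals when the cumulative sum crosses a threshold:

```javascript
// Generic one-sided (upper) CUSUM sketch — textbook form, not ruvector's code.
// Accumulate deviations above (baseline + slack k); signal when the sum
// exceeds the decision threshold h.
function cusumUpperDetect(values, baseline, k, h) {
  let s = 0;
  for (let i = 0; i < values.length; i++) {
    s = Math.max(0, s + (values[i] - baseline - k));
    if (s > h) return i; // index at which the upward shift is flagged
  }
  return -1; // no changepoint detected
}

// Same shape as the test above: 30 readings at 85, then a sustained shift to 120.
const series = [...Array(30).fill(85), ...Array(20).fill(120)];
console.log(cusumUpperDetect(series, 85, 5, 25)); // 30 — flagged right at the shift
```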

test('stream_drift_detected_as_trend', () => {
  // Mirror Rust test_trend_detection
  const cfg = { ...rvdna.defaultStreamConfig(), windowSize: 50 };
  const p = new rvdna.StreamProcessor(cfg);

  // Strong upward drift
  for (let i = 0; i < 50; i++) {
    p.processReading({
      timestampMs: i * 1000, biomarkerId: 'glucose', value: 70 + i * 0.5,
      referenceLow: 70, referenceHigh: 100, isAnomaly: false, zScore: 0,
    });
  }
  assertGt(p.getStats('glucose').trendSlope, 0, 'should detect positive trend');
});
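One standard way a `trendSlope` statistic can be derived from a window of readings is an ordinary least-squares fit against the sample index (rvdna's exact estimator may differ; this is the classic formula):

```javascript
// Ordinary least-squares slope over sample indices 0..n-1 — one standard way a
// trendSlope statistic can be computed. Assumed form, not rvdna's estimator.
function olsSlope(ys) {
  const n = ys.length;
  const xMean = (n - 1) / 2;
  const yMean = ys.reduce((a, b) => a + b, 0) / n;
  let num = 0, den = 0;
  for (let i = 0; i < n; i++) {
    num += (i - xMean) * (ys[i] - yMean);
    den += (i - xMean) ** 2;
  }
  return num / den;
}

// Same drift as the test above: value = 70 + i * 0.5 recovers slope 0.5.
const drift = Array.from({ length: 50 }, (_, i) => 70 + i * 0.5);
console.log(olsSlope(drift)); // 0.5
```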

test('stream_population_biomarker_values_through_processor', () => {
  // Take synthetic population biomarker values and stream them
  const pop = rvdna.generateSyntheticPopulation(20, 77);
  const cfg = { ...rvdna.defaultStreamConfig(), windowSize: 20 };
  const p = new rvdna.StreamProcessor(cfg);

  for (let i = 0; i < pop.length; i++) {
    const homocysteine = pop[i].biomarkerValues['Homocysteine'];
    p.processReading({
      timestampMs: i * 1000, biomarkerId: 'homocysteine',
      value: homocysteine, referenceLow: 5, referenceHigh: 15,
      isAnomaly: false, zScore: 0,
    });
  }

  const stats = p.getStats('homocysteine');
  assert(stats !== null, 'should have homocysteine stats');
  assertGt(stats.count, 0, 'should have processed readings');
  assertGt(stats.mean, 0, 'mean should be positive');
});

// ═════════════════════════════════════════════════════════════════════════════
// SECTION 6: Package Re-export Verification
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Package Re-exports ---\n');

test('index_exports_all_biomarker_apis', () => {
  const expectedFns = [
    'biomarkerReferences', 'zScore', 'classifyBiomarker',
    'computeRiskScores', 'encodeProfileVector', 'generateSyntheticPopulation',
  ];
  for (const fn of expectedFns) {
    assert(typeof rvdna[fn] === 'function', `missing export: ${fn}`);
  }
  const expectedConsts = ['BIOMARKER_REFERENCES', 'SNPS', 'INTERACTIONS', 'CAT_ORDER'];
  for (const c of expectedConsts) {
    assert(rvdna[c] !== undefined, `missing export: ${c}`);
  }
});

test('index_exports_all_stream_apis', () => {
  assert(typeof rvdna.RingBuffer === 'function', 'missing RingBuffer');
  assert(typeof rvdna.StreamProcessor === 'function', 'missing StreamProcessor');
  assert(typeof rvdna.generateReadings === 'function', 'missing generateReadings');
  assert(typeof rvdna.defaultStreamConfig === 'function', 'missing defaultStreamConfig');
  assert(rvdna.BIOMARKER_DEFS !== undefined, 'missing BIOMARKER_DEFS');
});

test('index_exports_v02_apis_unchanged', () => {
  const v02fns = [
    'encode2bit', 'decode2bit', 'translateDna', 'cosineSimilarity',
    'isNativeAvailable', 'normalizeGenotype', 'parse23andMe',
    'callCyp2d6', 'callCyp2c19', 'determineApoe', 'analyze23andMe',
  ];
  for (const fn of v02fns) {
    assert(typeof rvdna[fn] === 'function', `v0.2 API missing: ${fn}`);
  }
});

// ═════════════════════════════════════════════════════════════════════════════
// SECTION 7: Optimized Benchmarks (pre/post optimization comparison)
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write('\n--- Optimized Benchmarks ---\n');

// Prepare benchmark genotypes from real fixture
const { gts: benchGts } = parseFixtureToGenotypes('sample-high-risk-cardio.23andme.txt');

bench('computeRiskScores (real 23andMe data, 20 SNPs)', () => {
  rvdna.computeRiskScores(benchGts);
}, 20000);

bench('encodeProfileVector (real profile)', () => {
  const p = rvdna.computeRiskScores(benchGts);
  rvdna.encodeProfileVector(p);
}, 20000);

bench('StreamProcessor.processReading (optimized incremental)', () => {
  const p = new rvdna.StreamProcessor({ ...rvdna.defaultStreamConfig(), windowSize: 100 });
  const r = { timestampMs: 0, biomarkerId: 'glucose', value: 85, referenceLow: 70, referenceHigh: 100, isAnomaly: false, zScore: 0 };
  for (let i = 0; i < 100; i++) {
    r.timestampMs = i * 1000;
    p.processReading(r);
  }
}, 2000);

bench('generateSyntheticPopulation(100) (optimized lookups)', () => {
  rvdna.generateSyntheticPopulation(100, 42);
}, 200);

bench('full pipeline: parse + score + stream (real data)', () => {
  const text = loadFixture('sample-high-risk-cardio.23andme.txt');
  const data = rvdna.parse23andMe(text);
  const gts = new Map();
  for (const [rsid, snp] of data.snps) gts.set(rsid, snp.genotype);
  const profile = rvdna.computeRiskScores(gts);
  const proc = new rvdna.StreamProcessor(rvdna.defaultStreamConfig());
  for (const bref of rvdna.BIOMARKER_REFERENCES) {
    const val = profile.biomarkerValues[bref.name] || ((bref.normalLow + bref.normalHigh) / 2);
    proc.processReading({
      timestampMs: 0, biomarkerId: bref.name, value: val,
      referenceLow: bref.normalLow, referenceHigh: bref.normalHigh,
      isAnomaly: false, zScore: 0,
    });
  }
}, 5000);

bench('population 1000 subjects', () => {
  rvdna.generateSyntheticPopulation(1000, 42);
}, 20);

// ═════════════════════════════════════════════════════════════════════════════
// Summary
// ═════════════════════════════════════════════════════════════════════════════

process.stdout.write(`\n${'='.repeat(70)}\n`);
process.stdout.write(`Results: ${passed} passed, ${failed} failed, ${passed + failed} total\n`);
if (benchResults.length > 0) {
  process.stdout.write('\nBenchmark Summary:\n');
  for (const b of benchResults) {
    process.stdout.write(`  ${b.name}: ${b.perOp}/op\n`);
  }
}
process.stdout.write(`${'='.repeat(70)}\n`);

process.exit(failed > 0 ? 1 : 0);