26 KiB
RVF Operations API
1. Scope
This document specifies the operational surface of an RVF runtime: error codes returned by all operations, wire formats for batch queries, batch ingest, and batch deletes, the network streaming protocol for progressive loading over HTTP and TCP, and the compaction scheduling policy. It complements the segment model (spec 01), manifest system (spec 02), and query optimization (spec 06).
All multi-byte integers are little-endian unless otherwise noted. All offsets within messages are byte offsets from the start of the message payload.
2. Error Code Enumeration
Error codes are 16-bit unsigned integers. The high byte identifies the error
category; the low byte identifies the specific error within that category.
Implementations must preserve unrecognized codes in responses and must not
treat unknown codes as fatal unless the high byte is 0x01 (format error).
Category 0x00: Success
Code Name Description
------ -------------------- ----------------------------------------
0x0000 OK Operation succeeded
0x0001 OK_PARTIAL Partial success (some items failed)
OK_PARTIAL is returned when a batch operation succeeds for some items and
fails for others. The response body contains per-item status details.
Category 0x01: Format Errors
Code Name Description
------ -------------------- ----------------------------------------
0x0100 INVALID_MAGIC Segment magic mismatch (expected 0x52564653)
0x0101 INVALID_VERSION Unsupported segment version
0x0102 INVALID_CHECKSUM Segment hash verification failed
0x0103 INVALID_SIGNATURE Cryptographic signature invalid
0x0104 TRUNCATED_SEGMENT Segment payload shorter than declared length
0x0105 INVALID_MANIFEST Root manifest validation failed
0x0106 MANIFEST_NOT_FOUND No valid MANIFEST_SEG in file
0x0107 UNKNOWN_SEGMENT_TYPE Segment type not recognized (warning, not fatal)
0x0108 ALIGNMENT_ERROR Data not at expected 64B boundary
UNKNOWN_SEGMENT_TYPE is advisory. A reader encountering an unknown segment
type should skip it and continue. All other format errors in this category
are fatal for the affected segment.
Category 0x02: Query Errors
Code Name Description
------ -------------------- ----------------------------------------
0x0200 DIMENSION_MISMATCH Query vector dimension != index dimension
0x0201 EMPTY_INDEX No index segments available
0x0202 METRIC_UNSUPPORTED Requested distance metric not available
0x0203 FILTER_PARSE_ERROR Invalid filter expression
0x0204 K_TOO_LARGE Requested K exceeds available vectors
0x0205 TIMEOUT Query exceeded time budget
When K_TOO_LARGE is returned, the response still contains all available
results. The result count will be less than the requested K.
Category 0x03: Write Errors
Code Name Description
------ -------------------- ----------------------------------------
0x0300 LOCK_HELD Another writer holds the lock
0x0301 LOCK_STALE Lock file exists but owner process is dead
0x0302 DISK_FULL Insufficient space for write
0x0303 FSYNC_FAILED Durable write failed
0x0304 SEGMENT_TOO_LARGE Segment exceeds 4 GB limit
0x0305 READ_ONLY File opened in read-only mode
LOCK_STALE is informational. The runtime may attempt to break the stale
lock and retry. If recovery succeeds, the original operation proceeds with
an OK status.
Category 0x04: Tile Errors (WASM Microkernel)
Code Name Description
------ -------------------- ----------------------------------------
0x0400 TILE_TRAP WASM trap (OOB, unreachable, stack overflow)
0x0401 TILE_OOM Tile exceeded scratch memory (64 KB)
0x0402 TILE_TIMEOUT Tile computation exceeded time budget
0x0403 TILE_INVALID_MSG Malformed hub-tile message
0x0404 TILE_UNSUPPORTED_OP Operation not available on this profile
All tile errors trigger the fault isolation protocol described in
microkernel/wasm-runtime.md section 8. The hub reassigns the tile's
work and optionally restarts the faulted tile.
Category 0x05: Crypto Errors
Code Name Description
------ -------------------- ----------------------------------------
0x0500 KEY_NOT_FOUND Referenced key_id not in CRYPTO_SEG
0x0501 KEY_EXPIRED Key past valid_until timestamp
0x0502 DECRYPT_FAILED Decryption or auth tag verification failed
0x0503 ALGO_UNSUPPORTED Cryptographic algorithm not implemented
Crypto errors are always fatal for the affected segment. An implementation must not serve data from a segment that fails signature or decryption checks.
3. Batch Query API
Wire Format: Request
Batch queries amortize connection overhead and enable the runtime to schedule vector block loads across multiple queries simultaneously.
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 query_count Number of queries in batch (max 1024)
0x04 4 k Shared top-K parameter
0x08 1 metric Distance metric: 0=L2, 1=IP, 2=cosine, 3=hamming
0x09 3 reserved Must be zero
0x0C 4 ef_search HNSW ef_search parameter
0x10 4 shared_filter_len Byte length of shared filter (0 = no filter)
0x14 var shared_filter Filter expression (applies to all queries)
var var queries[] Per-query entries (see below)
Each query entry:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 query_id Client-assigned correlation ID
0x04 2 dim Vector dimensionality
0x06 1 dtype Data type: 0=fp32, 1=fp16, 2=i8, 3=binary
0x07 1 flags Bit 0: has per-query filter
0x08 var vector Query vector (dim * sizeof(dtype) bytes)
var 4 filter_len Byte length of per-query filter (if flags bit 0)
var var filter Per-query filter (overrides shared filter)
When both a shared filter and a per-query filter are present, the per-query filter takes precedence. A per-query filter of zero length inherits the shared filter.
Wire Format: Response
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 query_count Number of query results
0x04 var results[] Per-query result entries
Each result entry:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 query_id Correlation ID from request
0x04 2 status Error code (0x0000 = OK)
0x06 2 reserved Must be zero
0x08 4 result_count Number of results returned
0x0C var results[] Array of (vector_id: u64, distance: f32) pairs
Each result pair is 12 bytes: 8 bytes for the vector ID followed by 4 bytes for the distance value. Results are sorted by distance ascending (nearest first).
Batch Scheduling
The runtime should process batch queries using the following strategy:
- Parse all query vectors and load them into memory
- Identify shared segments across queries (block deduplication)
- Load each vector block once and evaluate all relevant queries against it
- Merge per-query top-K heaps independently
- Return results as soon as each query completes (streaming response)
This amortizes I/O: if N queries touch the same vector block, the block is read once instead of N times.
4. Batch Ingest API
Wire Format: Request
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 vector_count Number of vectors to ingest (max 65536)
0x04 2 dim Vector dimensionality
0x06 1 dtype Data type: 0=fp32, 1=fp16, 2=i8, 3=binary
0x07 1 flags Bit 0: metadata_included
0x08 var vectors[] Vector entries
var var metadata[] Metadata entries (if flags bit 0)
Each vector entry:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 8 vector_id Globally unique vector ID
0x08 var vector Vector data (dim * sizeof(dtype) bytes)
Each metadata entry (when metadata_included is set):
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 2 field_count Number of metadata fields
0x02 var fields[] Field entries
Each metadata field:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 2 field_id Field identifier (application-defined)
0x02 1 value_type 0=u64, 1=i64, 2=f64, 3=string, 4=bytes
0x03 var value Encoded value (u64/i64/f64: 8B; string/bytes: 4B length + data)
Wire Format: Response
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 accepted_count Number of vectors accepted
0x04 4 rejected_count Number of vectors rejected
0x08 4 manifest_epoch Epoch of manifest after commit
0x0C var rejected_ids[] Array of rejected vector IDs (u64 * rejected_count)
var var rejected_reasons[] Array of error codes (u16 * rejected_count)
The manifest_epoch field is the epoch of the MANIFEST_SEG written after the
ingest is committed. Clients can use this value to confirm that a subsequent
read will include the ingested vectors.
Ingest Commit Semantics
- The runtime writes vectors to a new VEC_SEG (append-only)
- If metadata is included, a META_SEG is appended
- Both segments are fsynced
- A new MANIFEST_SEG is written referencing the new segments
- The manifest is fsynced
- The response is sent with the new manifest_epoch
Vectors are visible to queries only after step 6 completes.
5. Batch Delete API
Wire Format: Request
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 1 delete_type 0=by_id, 1=by_range, 2=by_filter
0x01 3 reserved Must be zero
0x04 var payload Type-specific payload (see below)
Delete by ID (delete_type = 0):
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 count Number of IDs to delete
0x04 var ids[] Array of vector IDs (u64 * count)
Delete by range (delete_type = 1):
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 8 start_id Start of range (inclusive)
0x08 8 end_id End of range (exclusive)
Delete by filter (delete_type = 2):
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 filter_len Byte length of filter expression
0x04 var filter Filter expression
Wire Format: Response
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 8 deleted_count Number of vectors deleted
0x08 2 status Error code (0x0000 = OK)
0x0A 2 reserved Must be zero
0x0C 4 manifest_epoch Epoch of manifest after delete committed
Delete Mechanics
Deletes are logical. The runtime appends a JOURNAL_SEG containing tombstone entries for the deleted vector IDs. The new MANIFEST_SEG marks affected VEC_SEGs as partially dead. Physical reclamation happens during compaction.
6. Network Streaming Protocol
6.1 HTTP Range Requests (Read-Only Access)
RVF's progressive loading model maps naturally to HTTP byte-range requests.
A client can boot from a remote .rvf file and become queryable without
downloading the entire file.
Phase 1: Boot (mandatory)
GET /file.rvf Range: bytes=-4096
Retrieves the last 4 KB of the file. This contains the Level 0 root manifest (MANIFEST_SEG). The client parses hotset pointers, the segment directory, and the profile ID.
If the file is smaller than 4 KB, the entire file is returned. If the last
4 KB does not contain a valid MANIFEST_SEG, the client extends the range
backward in 4 KB increments until one is found or 1 MB is scanned (at which
point it returns MANIFEST_NOT_FOUND).
Phase 2: Hotset (parallel, mandatory for queries)
Using offsets from the Level 0 manifest, the client issues up to 5 parallel range requests:
GET /file.rvf Range: bytes=<entrypoint_offset>-<entrypoint_end>
GET /file.rvf Range: bytes=<toplayer_offset>-<toplayer_end>
GET /file.rvf Range: bytes=<centroid_offset>-<centroid_end>
GET /file.rvf Range: bytes=<quantdict_offset>-<quantdict_end>
GET /file.rvf Range: bytes=<hotcache_offset>-<hotcache_end>
These fetch the HNSW entry point, top-layer graph, routing centroids, quantization dictionary, and the hot cache (HOT_SEG). After these 5 requests complete, the system is queryable with recall >= 0.7.
Phase 3: Level 1 (background)
GET /file.rvf Range: bytes=<l1_offset>-<l1_end>
Fetches the Level 1 manifest containing the full segment directory. This enables the client to discover all segments and plan on-demand fetches.
Phase 4: On-demand (per query)
For queries that require cold data not yet fetched:
GET /file.rvf Range: bytes=<segment_offset>-<segment_end>
The client caches fetched segments locally. Repeated queries against the same data region do not trigger additional requests.
HTTP Requirements
- Server must support
Accept-Ranges: bytes - Server must return
206 Partial Contentfor range requests - Server should support multiple ranges in a single request (
multipart/byteranges) - Client should use
If-None-Matchwith the file's ETag to detect stale caches
6.2 TCP Streaming Protocol (Real-Time Access)
For real-time ingest and low-latency queries, RVF defines a binary TCP protocol over TLS 1.3.
Connection Setup
1. Client opens TCP connection to server
2. TLS 1.3 handshake (mandatory, no plaintext mode)
3. Client sends HELLO message with protocol version and capabilities
4. Server responds with HELLO_ACK confirming capabilities
5. Connection is ready for messages
Framing
All messages are length-prefixed:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 frame_length Payload length (big-endian, max 16 MB)
0x04 1 msg_type Message type (see below)
0x05 3 msg_id Correlation ID (big-endian, wraps at 2^24)
0x08 var payload Message-specific payload
Frame length is big-endian (network byte order) for consistency with TLS framing. The 16 MB maximum prevents a single message from monopolizing the connection. Payloads larger than 16 MB must be split across multiple messages using continuation framing (see section 6.4).
Message Types
Client -> Server:
0x01 QUERY Batch query (payload = Batch Query Request)
0x02 INGEST Batch ingest (payload = Batch Ingest Request)
0x03 DELETE Batch delete (payload = Batch Delete Request)
0x04 STATUS Request server status (no payload)
0x05 SUBSCRIBE Subscribe to update notifications
Server -> Client:
0x81 QUERY_RESULT Batch query result
0x82 INGEST_ACK Batch ingest acknowledgment
0x83 DELETE_ACK Batch delete acknowledgment
0x84 STATUS_RESP Server status response
0x85 UPDATE_NOTIFY Push notification of new data
0xFF ERROR Error with code and description
ERROR Message Payload
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 2 error_code Error code from section 2
0x02 2 description_len Byte length of description string
0x04 var description UTF-8 error description (human-readable)
6.3 Streaming Ingest Protocol
The TCP protocol supports continuous ingest where the client streams vectors without waiting for per-batch acknowledgments.
Flow
Client Server
| |
|--- INGEST (batch 0) ------------->|
|--- INGEST (batch 1) ------------->| Pipelining: send without waiting
|--- INGEST (batch 2) ------------->|
| | Server writes VEC_SEGs, appends manifest
|<--- INGEST_ACK (batch 0) ---------|
|<--- INGEST_ACK (batch 1) ---------|
| | Backpressure: server delays ACK
|--- INGEST (batch 3) ------------->| Client respects window
|<--- INGEST_ACK (batch 2) ---------|
| |
Backpressure
The server controls ingest rate by delaying INGEST_ACK responses. The client must limit its in-flight (unacknowledged) ingest messages to a configurable window size (default: 8 messages). When the window is full, the client must wait for an ACK before sending the next batch.
The server should send backpressure when:
- Write queue exceeds 80% capacity
- Compaction is falling behind (dead space > 50%)
- Available disk space drops below 10%
Commit Semantics
Each INGEST_ACK contains the manifest_epoch after commit. The server
guarantees that all vectors acknowledged with epoch E are visible to any
query that reads the manifest at epoch >= E.
6.4 Continuation Framing
For payloads exceeding the 16 MB frame limit:
Frame 0: msg_type = original type, flags bit 0 = CONTINUATION_START
Frame 1: msg_type = 0x00 (CONTINUATION), flags bit 0 = 0
Frame 2: msg_type = 0x00 (CONTINUATION), flags bit 0 = 0
Frame N: msg_type = 0x00 (CONTINUATION), flags bit 1 = CONTINUATION_END
The receiver reassembles the payload from all continuation frames before processing. The msg_id is shared across all frames of a continuation sequence.
6.5 SUBSCRIBE and UPDATE_NOTIFY
The SUBSCRIBE message registers the client for push notifications when new data is committed:
SUBSCRIBE payload:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 min_epoch Only notify for epochs > this value
0x04 1 notify_flags Bit 0: ingest, Bit 1: delete, Bit 2: compaction
0x05 3 reserved Must be zero
The server sends UPDATE_NOTIFY whenever a new MANIFEST_SEG is committed that matches the subscription criteria:
UPDATE_NOTIFY payload:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 4 epoch New manifest epoch
0x04 1 event_type 0=ingest, 1=delete, 2=compaction
0x05 3 reserved Must be zero
0x08 4 affected_count Number of vectors affected
0x0C 8 new_total Total vector count after event
7. Compaction Scheduling Policy
Compaction merges small, overlapping, or partially-dead segments into larger, sealed segments. Because compaction competes with queries and ingest for I/O bandwidth, the runtime enforces a scheduling policy.
7.1 IO Budget
Compaction must consume at most 30% of available IOPS. The runtime measures IOPS over a 5-second sliding window and throttles compaction I/O to stay within budget.
available_iops = measured_iops_capacity (from benchmarking at startup)
compaction_budget = available_iops * 0.30
compaction_throttle = max(compaction_budget - current_compaction_iops, 0)
7.2 Priority Ordering
When I/O bandwidth is contended, operations are prioritized:
Priority 1 (highest): Queries (reads from VEC_SEG, INDEX_SEG, HOT_SEG)
Priority 2: Ingest (writes to VEC_SEG, META_SEG, MANIFEST_SEG)
Priority 3 (lowest): Compaction (reads + writes of sealed segments)
Compaction yields to queries and ingest. If a compaction I/O operation would cause a query to exceed its time budget, the compaction operation is deferred.
7.3 Scheduling Triggers
Compaction runs when all of the following conditions are met:
| Condition | Threshold | Rationale |
|---|---|---|
| Query load | < 50% of capacity | Avoid competing with active queries |
| Dead space ratio | > 20% of total file size | Not worth compacting small amounts |
| Segment count | > 32 active segments | Many small segments hurt read performance |
| Time since last compaction | > 60 seconds | Prevent compaction storms |
The runtime evaluates these conditions every 10 seconds.
7.4 Emergency Compaction
If dead space exceeds 70% of total file size, compaction enters emergency mode:
Emergency compaction rules:
1. Compaction preempts ingest (ingest is paused, not rejected)
2. IO budget increases to 60% of available IOPS
3. Compaction runs regardless of query load
4. Ingest resumes after dead space drops below 50%
During emergency compaction, the server responds to INGEST messages with delayed ACKs (backpressure) rather than rejecting them. Queries continue to be served at highest priority.
7.5 Compaction Progress Reporting
The STATUS response includes compaction state:
STATUS_RESP compaction fields:
Offset Size Field Description
------ ------ ------------------- ----------------------------------------
0x00 1 compaction_state 0=idle, 1=running, 2=emergency
0x01 1 progress_pct Completion percentage (0-100)
0x02 2 reserved Must be zero
0x04 8 dead_bytes Total dead space in bytes
0x0C 8 total_bytes Total file size in bytes
0x14 4 segments_remaining Segments left to compact
0x18 4 segments_completed Segments compacted in current run
0x1C 4 estimated_seconds Estimated time to completion
0x20 4 io_budget_pct Current IO budget percentage (30 or 60)
7.6 Compaction Segment Selection
The runtime selects segments for compaction using a tiered strategy:
1. Tombstoned segments: Always compacted first (reclaim dead space)
2. Small VEC_SEGs: Segments < 1 MB merged into larger segments
3. High-overlap INDEX_SEGs: Index segments covering the same ID range
4. Cold OVERLAY_SEGs: Overlay deltas merged into base segments
The compaction output is always a sealed segment (SEALED flag set). Sealed segments are immutable and can be verified independently.
8. STATUS Response Format
The STATUS message provides a snapshot of the server state for monitoring and diagnostics.
STATUS_RESP payload:
Offset Size Field Description
------ ------ ------------------- ----------------------------------------
0x00 4 protocol_version Protocol version (currently 1)
0x04 4 manifest_epoch Current manifest epoch
0x08 8 total_vectors Total vector count
0x10 8 total_segments Total segment count
0x18 8 file_size_bytes Total file size
0x20 4 query_qps Queries per second (last 5s window)
0x24 4 ingest_vps Vectors ingested per second (last 5s window)
0x28 24 compaction Compaction state (see section 7.5)
0x40 1 profile_id Active hardware profile (0x00-0x03)
0x41 1 health 0=healthy, 1=degraded, 2=read_only
0x42 2 reserved Must be zero
0x44 4 uptime_seconds Server uptime
9. Filter Expression Format
Filter expressions used in batch queries and batch deletes share a common binary encoding:
Offset Size Field Description
------ ------ ------------------ ----------------------------------------
0x00 1 op Operator enum (see below)
0x01 2 field_id Metadata field to filter on
0x03 1 value_type Value type (matches metadata field types)
0x04 var value Comparison value
var var children[] Sub-expressions (for AND/OR/NOT)
Operator enum:
0x00 EQ field == value
0x01 NE field != value
0x02 LT field < value
0x03 LE field <= value
0x04 GT field > value
0x05 GE field >= value
0x06 IN field in [values]
0x07 RANGE field in [low, high)
0x10 AND All children must match
0x11 OR Any child must match
0x12 NOT Negate single child
Filters are evaluated during the query scan phase. Vectors that do not match the filter are excluded from distance computation entirely (pre-filtering) or from the result set (post-filtering), depending on the runtime's cost model.
10. Invariants
- Error codes are stable across versions; new codes are additive only
- Batch operations are atomic per-item, not per-batch (partial success is valid)
- TCP connections are always TLS 1.3; plaintext is not permitted
- Frame length is big-endian; all other multi-byte fields are little-endian
- HTTP progressive loading must succeed with at most 7 round trips to become queryable
- Compaction never runs at more than 60% of available IOPS, even in emergency mode
- The STATUS response is always available, even during emergency compaction
- Filter expressions are limited to 64 levels of nesting depth