21 KiB
RVCOW Security Audit Report
| Field | Value |
|---|---|
| Date | 2026-02-14 |
| Auditor | Security Auditor Agent (Claude Opus 4.6) |
| Scope | RVCOW copy-on-write branching implementation per ADR-031 |
| Status | Complete |
| Files Reviewed | 17 source files across rvf-types, rvf-runtime, rvf-cli |
Executive Summary
The RVCOW implementation is structurally sound with good defensive practices (compile-time size assertions, magic number validation, repr(C) layouts). However, the audit identified 2 Critical, 6 High, 5 Medium, and 4 Low severity findings. The Critical and High findings have been fixed in-place. Medium/Low findings are documented for future remediation.
Findings Summary
| Severity | Count | Fixed |
|---|---|---|
| Critical | 2 | 2 |
| High | 6 | 5 |
| Medium | 5 | 0 |
| Low | 4 | 0 |
| Info | 3 | 0 |
| Total | 20 | 7 |
Critical Findings
C-01: Non-Cryptographic Hash Used for Integrity Verification
Severity: Critical
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/store.rs:1239-1251
Status: Documented (architectural issue requiring design decision)
Description: simple_shake256_256 is a trivially reversible XOR-fold hash, not a cryptographic hash function. Despite its name suggesting SHAKE-256, it provides near-zero collision resistance and is trivially invertible. This function is used for:
parent_hashinFileIdentity(lineage verification)filter_hashinMembershipHeader(filter integrity)- COW witness event hashes (
parent_cluster_hash,new_cluster_hash) - Cluster deduplication in space-reclaim compaction
Impact: An attacker can craft colliding inputs that produce identical hashes, defeating:
- Parent file provenance verification -- a different parent file could be substituted
- Membership filter integrity -- a modified filter bitmap could pass hash checks
- COW witness event auditing -- falsified cluster hashes in the audit trail
- Space-reclaim compaction -- different data could match parent hashes, causing data loss
Recommendation: Replace simple_shake256_256 with a real cryptographic hash. Options:
- Add
sha3crate dependency (provides SHAKE-256) for ~20KB binary increase - Use
blake3for better performance with equivalent security - At minimum, document this is a placeholder and add a
#[cfg(feature = "crypto")]gate
Note: The function comment acknowledges this: "We use a simple non-cryptographic hash here since rvf-runtime doesn't depend on rvf-crypto." However, the security implications of this choice are severe for production use. All integrity guarantees documented in ADR-031 are void until this is addressed.
C-02: KernelBinding from_bytes Does Not Validate Reserved/Padding Fields
Severity: Critical
Location: /workspaces/ruvector/crates/rvf/rvf-types/src/kernel_binding.rs:61
Status: FIXED
Description: KernelBinding::from_bytes accepted arbitrary data in _pad0 and _reserved fields. ADR-031 specifies these MUST be zero. Non-zero reserved fields enable:
- Data smuggling through the KernelBinding structure
- Future format confusion if reserved fields gain meaning
- Signature bypass if
signed_dataincludes different reserved values
Fix Applied: Added from_bytes_validated() method that rejects non-zero _pad0, non-zero _reserved, and binding_version == 0. The original from_bytes is preserved for backward compatibility with a documentation note.
High Findings
H-01: Division by Zero in CowEngine with vectors_per_cluster=0
Severity: High
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/cow.rs:106,164
Status: FIXED
Description: CowEngine::read_vector and write_vector compute cluster_id = vector_id / vectors_per_cluster. If vectors_per_cluster is 0, this causes a panic (integer division by zero). A malicious or corrupted CowMapHeader with vectors_per_cluster=0 would crash the runtime.
Fix Applied: Added assert!(vectors_per_cluster > 0) to both CowEngine::new() and CowEngine::from_parent() constructors.
H-02: Silent Write Drop on Out-of-Bounds Vector Offset
Severity: High
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/cow.rs:253-258
Status: FIXED
Description: In flush_writes, when end > cluster_data.len(), the write was silently skipped (if end <= cluster_data.len()). This means data could be silently lost without any error indication, violating write durability guarantees.
Impact: An attacker or buggy caller could trigger silent data loss by crafting vector writes where vector_offset_in_cluster + data.len() exceeds cluster size.
Fix Applied: Changed the condition to return Err(RvfError::Code(ErrorCode::ClusterNotFound)) when the write would exceed cluster bounds.
H-03: CowMapHeader Deserialization Missing Critical Validations
Severity: High
Location: /workspaces/ruvector/crates/rvf/rvf-types/src/cow_map.rs:97-124
Status: FIXED
Description: CowMapHeader::from_bytes only validated the magic number. It did not validate:
map_formatis a known enum value (could be 0xFF)cluster_size_bytesis non-zero and a power of 2 (spec requirement for SIMD alignment)vectors_per_clusteris non-zero (prevents division by zero downstream)
Fix Applied: Added validation for all three fields, returning appropriate RvfError on invalid values.
H-04: RefcountHeader Deserialization Missing Field Validation
Severity: High
Location: /workspaces/ruvector/crates/rvf/rvf-types/src/refcount.rs:59-82
Status: FIXED
Description: RefcountHeader::from_bytes did not validate:
refcount_widthmust be 1, 2, or 4 (spec requirement)_padmust be zero (spec requirement)_reservedmust be zero (spec requirement)
Invalid refcount_width could cause incorrect array indexing when reading the refcount array.
Fix Applied: Added validation for all three constraints.
H-05: CowMap Deserialize Integer Overflow
Severity: High
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/cow_map.rs:93-94
Status: FIXED
Description: CowMap::deserialize computed expected_len = 5 + count * 9 without checked arithmetic. With a crafted count value near usize::MAX / 9, the multiplication could overflow, causing expected_len to wrap to a small value. This would pass the length check and then cause out-of-bounds reads in the deserialization loop.
Fix Applied: Replaced with count.checked_mul(9).and_then(|v| v.checked_add(5)), returning CowMapCorrupt on overflow.
H-06: verify_attestation Does Not Verify manifest_root_hash
Severity: High
Location: /workspaces/ruvector/crates/rvf/rvf-cli/src/cmd/verify_attestation.rs:49-66
Status: Documented (requires architecture decision)
Description: The verify_attestation CLI command extracts and displays the KernelBinding, but does NOT actually verify that manifest_root_hash matches the current file's manifest. Per ADR-031 Section 7.5, the launcher verification sequence requires:
- Compute SHAKE-256-256 of current Level0Root
- Compare to
KernelBinding.manifest_root_hash - Refuse to boot on mismatch
The current implementation skips steps 1-3, merely displaying the hash values. This completely defeats the anti-segment-swap protection that KernelBinding is designed to provide.
Impact: An attacker can take a signed kernel from file A, embed it into file B (different vectors, different manifest), and verify-attestation will report "valid" because it only checks magic bytes, not the binding.
Recommendation: Implement the full verification sequence. This requires either:
- Computing the real manifest hash (needs crypto dependency)
- At minimum, extracting the manifest and comparing hashes using available tools
Medium Findings
M-01: MembershipHeader Deserialization Does Not Validate Reserved Fields
Severity: Medium
Location: /workspaces/ruvector/crates/rvf/rvf-types/src/membership.rs:126-171
Description: MembershipHeader::from_bytes does not validate that _reserved and _reserved2 are zero. While not as critical as KernelBinding (no signing is involved), non-zero reserved fields violate the spec and could cause future compatibility issues.
Recommendation: Add zero-check for _reserved and _reserved2 fields.
M-02: DeltaHeader Deserialization Does Not Validate Reserved Fields
Severity: Medium
Location: /workspaces/ruvector/crates/rvf/rvf-types/src/delta.rs:88-119
Description: DeltaHeader::from_bytes does not validate that _pad and _reserved are zero.
Recommendation: Add zero-check for both fields.
M-03: Freeze CLI Bypasses Store API
Severity: Medium
Location: /workspaces/ruvector/crates/rvf/rvf-cli/src/cmd/freeze.rs:43-54
Description: The freeze CLI command opens the store, but then directly opens the file again and writes raw segment bytes, bypassing the RvfStore::freeze() API. This means:
- The segment header hash is not computed/validated
- The segment is not recorded in the manifest
- The writer lock from
RvfStore::open()is held while another file handle writes
Impact: The REFCOUNT_SEG written by the CLI is effectively invisible to the runtime -- it won't be in the manifest's segment directory. The store's freeze state is not actually recorded in any way the runtime can detect on next open.
Recommendation: Use store.freeze() instead of raw file writes, or update the manifest after writing the raw segment.
M-04: Filter CLI Bypasses Store API
Severity: Medium
Location: /workspaces/ruvector/crates/rvf/rvf-cli/src/cmd/filter.rs:97-109
Description: Similar to M-03, the filter CLI command writes a raw MEMBERSHIP_SEG directly to the file, bypassing the store API. The segment is not recorded in the manifest.
Recommendation: Use the membership filter API in RvfStore instead of raw segment writes.
M-05: No Parent Chain Depth Limit Enforced
Severity: Medium
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/cow.rs:137-140
Description: ADR-031 Section 8.1 specifies a 64-level depth limit for parent chain traversal to prevent cycles and unbounded recursion. The current CowEngine::read_cluster follows ParentRef to the parent file, but there is no depth counter or cycle detection. A malicious chain of files referencing each other could cause stack overflow or infinite loops.
Recommendation: Add a depth counter to parent chain resolution. The lineage_depth field in FileIdentity should be checked against the 64-level limit.
Low Findings
L-01: generation_id Not Validated Monotonically
Severity: Low
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/membership.rs:133-135
Description: MembershipFilter::bump_generation() increments generation_id by 1, but there is no validation on deserialization that the loaded generation matches or exceeds the manifest's generation. ADR-031 specifies that stale generation IDs (lower than manifest) should be rejected with GenerationStale error.
Recommendation: Add generation validation in MembershipFilter::deserialize that compares against the expected generation from the manifest.
L-02: No Overflow Check on generation_id Increment
Severity: Low
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/membership.rs:134
Description: self.generation_id += 1 can overflow on u32::MAX. While unlikely in practice, this would cause the monotonicity invariant to be violated.
Recommendation: Use checked_add and return an error on overflow, or use saturating_add.
L-03: Cluster ID Multiplication Overflow in Parent Read
Severity: Low
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/cow.rs:139
Description: let parent_offset = cluster_id as u64 * self.cluster_size as u64; could theoretically overflow for very large cluster IDs combined with large cluster sizes, though this requires cluster_id * cluster_size > u64::MAX which is unlikely.
Recommendation: Use checked_mul for defense-in-depth.
L-04: Bitmap Filter Allows Inconsistent member_count on Deserialization
Severity: Low
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/membership.rs:147-182
Description: MembershipFilter::deserialize recomputes member_count from the bitmap bits (line 174) rather than trusting the header's member_count. This is actually good practice. However, if the header's member_count disagrees with the actual bit count, there is no warning or error. A crafted header could claim 0 members while the bitmap has all bits set.
Recommendation: Add a warning or optional validation that header.member_count == computed_count.
Informational Findings
I-01: simple_hash Duplicated in CLI
Severity: Info
Location: /workspaces/ruvector/crates/rvf/rvf-cli/src/cmd/filter.rs:132-140
Description: The filter.rs CLI command contains its own simple_hash function that is identical to simple_shake256_256 in store.rs. This is a maintenance burden -- if one is updated, the other may be forgotten.
I-02: KernelBinding Version 0 Used as "Not Present" Sentinel
Severity: Info
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/store.rs:773
Description: extract_kernel_binding uses binding_version == 0 to detect "no binding present" (line 773). This means version 0 can never be a valid binding version. This should be documented as a format invariant.
I-03: Witness Event Hash Placeholder
Severity: Info
Location: /workspaces/ruvector/crates/rvf/rvf-runtime/src/cow.rs:232
Description: When emitting a CLUSTER_COW witness event, the new_cluster_hash is initially set to [0u8; 32] and updated later (line 270). If the update loop doesn't find the matching event (e.g., due to a logic bug), the witness event would contain an all-zeros hash. Consider using a sentinel value that is explicitly invalid (e.g., [0xFF; 32]).
Security Checklist Results
1. KernelBinding Verification
- manifest_root_hash verified before kernel boot? -- NO.
verify_attestationCLI does not verify this (H-06). The runtime does not enforce this either. - KernelBinding strippable without detection? -- PARTIALLY. If the binding is removed,
extract_kernel_bindingreturnsNone(backward-compatible). Signature verification would detect removal if signatures are present, but unsigned kernels have no protection. - signed_data correctly constructed? -- YES.
embed_kernel_with_bindingincludesKernelHeader || KernelBinding || cmdline || imagein the correct order per ADR-031. - binding_version validated? -- YES (after fix C-02).
from_bytes_validatedrejects version 0. - Reserved fields checked? -- YES (after fix C-02).
from_bytes_validatedrejects non-zero reserved.
2. COW Map Security
- Malicious redirect possible? -- YES, because
simple_shake256_256cannot verify integrity (C-01). - cluster_id range validated? -- YES. Out-of-bounds lookup returns
Unallocated. - Parent chain cycle prevention? -- NO. No depth limit enforced (M-05).
- Offsets validated before dereferencing? -- YES. File I/O will return errors on invalid offsets.
- Map deterministic? -- YES. Flat array is inherently ordered by cluster_id.
3. Membership Filter Security
- Empty include filter blocks all access? -- YES. Verified by test
include_mode_empty_is_empty_view. - generation_id validated monotonically? -- NO. Not enforced at load time (L-01).
- Filter bitmap bounds checked? -- YES.
bitmap_containschecksvector_id >= vector_count. - filter_hash verified on load? -- NO. Depends on
simple_shake256_256which is non-cryptographic (C-01).
4. Crash Recovery
- Double-root scheme implemented? -- NOT YET. The runtime code does not implement the double-root scheme described in ADR-031 Section 8.3. Current implementation uses append-only manifests.
- Orphaned data accessible after failed writes? -- NO. Orphaned appended data has no manifest reference and is invisible.
- Generation counters validated? -- PARTIALLY. Increment works but no validation on load.
5. Input Validation
- Deserialization safe with arbitrary input? -- YES (after fixes H-03, H-04, H-05). All headers validate magic, enum values, and bounds.
- Magic numbers checked? -- YES. All four new headers check magic on deserialization.
- Sizes validated before allocation? -- YES (after fix H-05). Checked arithmetic prevents overflow.
- Offset+length bounds checked? -- YES. File I/O operations use
read_exactwhich fails on short reads.
6. Integer Overflow
- cluster_id * cluster_size overflow? -- LOW RISK. Uses
u64arithmetic (L-03). - vector_id / vectors_per_cluster panic on zero? -- FIXED (H-01). Constructors now assert > 0.
- Capacity calculations safe? -- YES (after fix H-05). Deserialization uses checked arithmetic.
7. Downgrade Prevention
- Signed kernel replaceable with unsigned? -- YES. No enforcement prevents replacing a signed KERNEL_SEG with an unsigned one. ADR-031 Section 9 specifies signed-required downgrade prevention, but this is not implemented.
- Older api_version forceable? -- YES. No version pinning in KernelBinding currently enforced.
- Filter mode switchable? -- YES. No mechanism prevents changing filter_mode from Include to Exclude, which could expose all vectors in a branch.
Threat Model Alignment
| ADR-031 Threat | Implementation Status | Assessment |
|---|---|---|
| Host compromise | VMM not implemented (launcher is stub) | NOT TESTABLE |
| Guest compromise | Kernel is stub; eBPF verifier not implemented | NOT TESTABLE |
| TEE integrity | Not implemented | NOT TESTABLE |
| Supply chain | Signatures supported in type system | PARTIAL |
| Replay attack | generation_id exists but not enforced | INCOMPLETE |
| Data swap | KernelBinding exists but verification not enforced | INCOMPLETE |
| Malicious alt kernel | Deterministic selection not implemented | NOT IMPLEMENTED |
| COW map poisoning | Deterministic map ordering: YES | PARTIAL (no hash verification) |
| Stale membership filter | generation_id exists but not enforced on load | INCOMPLETE |
Positive Observations
- Compile-time size assertions on all headers prevent ABI drift.
- Field offset tests verify
repr(C)layout matches spec. - Magic number validation on all
from_bytespaths. - Round-trip serialization tests catch encoding bugs.
- Frozen snapshot enforcement correctly prevents writes via
SnapshotFrozenerror. - Write coalescing correctly batches multiple writes to same cluster.
- Membership filter correctly implements fail-safe (empty include = empty view).
- Bitmap bounds checking prevents out-of-bounds bit access.
- Write buffer drain before freeze prevents data loss.
- Checked arithmetic in scan_preservable_segments prevents overflow on crafted payloads.
Recommendations (Priority Order)
- P0: Replace
simple_shake256_256with a real cryptographic hash (blake3 or sha3 crate). - P0: Implement manifest_root_hash verification in
verify_attestationand in the kernel boot path. - P1: Enforce parent chain depth limit (64 levels per ADR-031).
- P1: Enforce generation_id monotonicity on membership filter and COW map load.
- P1: Implement signed-required downgrade prevention per ADR-031 Section 9.
- P2: Fix freeze/filter CLI commands to use the store API instead of raw segment writes.
- P2: Add reserved field validation to MembershipHeader and DeltaHeader deserialization.
- P3: Add overflow protection to generation_id increment.
- P3: Add parent_hash/filter_hash consistency checks (once crypto hash is in place).
Files Modified by This Audit
| File | Change |
|---|---|
rvf-types/src/kernel_binding.rs |
Added from_bytes_validated() with reserved/pad/version checks |
rvf-types/src/cow_map.rs |
Added map_format, cluster_size_bytes, vectors_per_cluster validation |
rvf-types/src/refcount.rs |
Added refcount_width, _pad, _reserved validation |
rvf-runtime/src/cow.rs |
Added vectors_per_cluster > 0 assertion; changed silent write drop to error |
rvf-runtime/src/cow_map.rs |
Added checked arithmetic for count * 9 overflow |
Test Results After Fixes
rvf-types: 122 passed, 0 failed
rvf-runtime: 65 passed, 0 failed
rvf-cli: 0 passed, 0 failed (no unit tests)
integration: 6 passed, 2 failed (pre-existing failures in cow_branching.rs)
The 2 integration test failures (branch_inherits_vectors_via_query, branch_membership_filter_excludes_deleted) are pre-existing and unrelated to this audit's changes -- they test branch+query integration that requires the membership filter to be wired into the query path, which is not yet implemented.