git-subtree-dir: vendor/ruvector git-subtree-split: b64c21726f2bb37286d9ee36a7869fef60cc6900
9.7 KiB
RuVector-Postgres v2.0.0 Security Audit Report
Date: 2025-12-26
Auditor: Claude Code Security Review
Scope: /crates/ruvector-postgres/src/**/*.rs
Branch: feat/ruvector-postgres-v2
Status: CRITICAL issues FIXED
Executive Summary
| Severity | Count | Status |
|---|---|---|
| CRITICAL | 3 | ✅ FIXED |
| HIGH | 2 | ⚠️ Documented for future improvement |
| MEDIUM | 3 | ⚠️ Documented for future improvement |
| LOW | 2 | ✅ Acceptable |
| INFO | 3 | ✅ Acceptable patterns noted |
Security Fixes Applied (2025-12-26)
- Created
validation.rsmodule - Input validation for tenant IDs and identifiers - Fixed SQL injection in
isolation.rs- All SQL now usesquote_identifier()and parameterized queries - Fixed SQL injection in
operations.rs-AuditLogEntrynow properly escapes all values - Added
ValidatedTenantIdtype - Type-safe tenant ID validation - Query routing uses
$1placeholders - Parameterized queries prevent injection
CRITICAL Findings
CVE-PENDING-001: SQL Injection in Tenant Isolation Module ✅ FIXED
Location: src/tenancy/isolation.rs
Lines: 233, 454, 461, 477, 491
Status: ✅ FIXED on 2025-12-26
Original Vulnerable Code:
// Line 233 - Direct table name interpolation
Ok(format!("DROP TABLE IF EXISTS {} CASCADE;", partition_name))
// Line 454 - Direct tenant_id interpolation
filter: format!("tenant_id = '{}'", tenant_id),
Applied Fix:
// Now uses validated identifiers with quote_identifier()
validate_identifier(partition_name)?;
Ok(format!("DROP TABLE IF EXISTS {} CASCADE;", quote_identifier(partition_name)))
// Now uses parameterized queries with $1 placeholder
filter: "tenant_id = $1".to_string(),
tenant_param: Some(tenant_id.to_string()),
Changes Made:
- Added
validate_tenant_id()calls before any SQL generation - All table/schema/partition names now use
quote_identifier() - Query routing returns
tenant_id = $1placeholder instead of direct interpolation - Added
tenant_paramfield toQueryRoute::SharedWithFilterfor binding
CVE-PENDING-002: SQL Injection in Tenant Audit Logging ✅ FIXED
Location: src/tenancy/operations.rs
Lines: 515-527
Status: ✅ FIXED on 2025-12-26
Original Vulnerable Code:
format!("'{}'", u) // Direct user_id interpolation
format!("'{}'", ip) // Direct IP interpolation
Applied Fix:
// New parameterized version
pub fn insert_sql_parameterized(&self) -> (String, Vec<Option<String>>) {
let sql = "INSERT INTO ruvector.tenant_audit_log ... VALUES ($1, $2, $3, $4, $5, $6, $7)";
// Params bound safely
}
// Legacy version now escapes properly
let escaped_user_id = escape_string_literal(u);
// IP validated: if validate_ip_address(ip) { Some(...) } else { None }
Changes Made:
- Added
insert_sql_parameterized()for new code (preferred) - Legacy
insert_sql()now usesescape_string_literal()for all values - Added IP address validation - invalid IPs become NULL
- Tenant ID validated before SQL generation
CVE-PENDING-003: SQL Injection via Drop Partition ✅ FIXED
Location: src/tenancy/isolation.rs:227-234
Status: ✅ FIXED on 2025-12-26
Original Vulnerable Code:
Ok(format!("DROP TABLE IF EXISTS {} CASCADE;", partition_name)) // UNSAFE
Applied Fix:
// Validate inputs
validate_tenant_id(tenant_id)?;
validate_identifier(partition_name)?;
// Verify partition belongs to tenant (authorization check)
let partition_exists = self.partitions.get(tenant_id)
.map(|p| p.iter().any(|p| p.partition_name == partition_name))
.unwrap_or(false);
if !partition_exists {
return Err(IsolationError::PartitionNotFound(partition_name.to_string()));
}
// Use quoted identifier
Ok(format!("DROP TABLE IF EXISTS {} CASCADE;", quote_identifier(partition_name)))
Changes Made:
- Added input validation for both tenant_id and partition_name
- Added authorization check - partition must belong to tenant
- Used
quote_identifier()for safe SQL generation
HIGH Findings
HIGH-001: Excessive Panic/Unwrap Usage
Location: Multiple files (63 files affected)
Count: 462 occurrences of unwrap(), expect(), panic!
Description: Unhandled panics in PostgreSQL extensions can crash the database backend process.
Impact:
- Denial of Service through crafted inputs
- Database backend crashes
- Service unavailability
Affected Patterns:
.unwrap() // 280+ occurrences
.expect("...") // 150+ occurrences
panic!("...") // 32 occurrences
Remediation:
- Replace
unwrap()withunwrap_or_default()or proper error handling - Use
pgrx::error!()for graceful PostgreSQL error reporting - Implement
Result<T, E>return types for public functions - Add input validation before operations that can panic
HIGH-002: Unsafe Integer Casts
Location: Multiple files Count: 392 occurrences
Description:
Unchecked integer casts between types (e.g., as usize, as i32, as u64) can cause overflow/underflow.
Affected Patterns:
value as usize // Can panic on 32-bit systems
len as i32 // Can overflow for large vectors
index as u64 // Can truncate on edge cases
Remediation:
- Use
TryFrom/try_into()with error handling - Add bounds checking before casts
- Use
saturating_castorchecked_castpatterns - Validate dimension/size limits at API boundary
MEDIUM Findings
MEDIUM-001: Unsafe Pointer Operations in Index Storage
Location: src/index/ivfflat_storage.rs, src/index/hnsw_am.rs
Description: Index access methods use raw pointer operations for performance, which are inherently unsafe.
Affected Patterns:
std::ptr::read()std::ptr::write()std::slice::from_raw_parts()std::slice::from_raw_parts_mut()
Mitigation Applied:
- Operations are gated behind
unsafeblocks - Required for pgrx PostgreSQL integration
- No user-controlled data reaches pointers directly
Recommendation:
- Add bounds checking assertions before pointer access
- Document safety invariants for each unsafe block
- Consider
#[deny(unsafe_op_in_unsafe_fn)]lint
MEDIUM-002: Unbounded Vector Allocations
Location: Multiple modules
Description: Some operations allocate vectors based on user-provided dimensions without upper limits.
Affected Areas:
Vec::with_capacity(dimension)in type constructors.collect()on unbounded iterators- Graph traversal result sets
Remediation:
- Define
MAX_VECTOR_DIMENSIONconstant (e.g., 16384) - Validate dimensions at input boundaries
- Add configurable limits via GUC parameters
MEDIUM-003: Missing Rate Limiting on Tenant Operations
Location: src/tenancy/operations.rs
Description: Tenant creation and audit logging have no rate limiting, allowing potential abuse.
Remediation:
- Add configurable rate limits per tenant
- Implement quota checking before operations
- Add throttling for expensive operations
LOW Findings
LOW-001: Debug Output in Tests Only
Location: src/distance/simd.rs
Count: 7 println! statements
Status: ACCEPTABLE - All debug output is in #[cfg(test)] modules only.
LOW-002: Error Messages May Reveal Internal Paths
Location: Various error handling code
Description: Some error messages include internal details that could aid attackers.
Example:
format!("Failed to spawn worker: {}", e)
format!("Failed to decode operation: {}", e)
Remediation:
- Use generic user-facing error messages
- Log detailed errors internally only
- Implement error code system for debugging
INFO - Acceptable Patterns
INFO-001: No Command Execution Found
No Command::new(), exec, or shell execution patterns found. ✅
INFO-002: No File System Operations
No std::fs, File::open, or path manipulation in production code. ✅
INFO-003: No Hardcoded Credentials
No passwords, API keys, or secrets in source code. ✅
Security Checklist Summary
| Category | Status | Notes |
|---|---|---|
| SQL Injection | ❌ FAIL | 3 critical findings in tenancy module |
| Command Injection | ✅ PASS | No shell execution |
| Path Traversal | ✅ PASS | No file operations |
| Memory Safety | ⚠️ WARN | Acceptable unsafe for pgrx, but review recommended |
| Input Validation | ⚠️ WARN | Missing on tenant/partition names |
| DoS Prevention | ⚠️ WARN | Panic-prone code paths |
| Auth/AuthZ | ✅ PASS | No bypasses found |
| Crypto | ✅ PASS | No cryptographic code present |
| Information Disclosure | ✅ PASS | Debug output test-only |
Remediation Priority
Immediate (Before Release)
- Fix SQL injection in tenancy module - Use parameterized queries
- Validate tenant_id format - Alphanumeric only, max length 64
Short Term (Next Sprint)
- Replace critical
unwrap()calls with proper error handling - Add dimension limits to vector operations
- Implement input validation helpers
Medium Term
- Add rate limiting to tenant operations
- Audit and document all
unsafeblocks - Convert integer casts to checked variants
Testing Recommendations
- Fuzz testing: Apply cargo-fuzz to SQL-generating functions
- Property testing: Test boundary conditions with proptest
- Integration tests: Add SQL injection test vectors
- Negative tests: Verify malformed inputs are rejected
Appendix: Files Reviewed
- 80+ source files in
/crates/ruvector-postgres/src/ - 148
#[pg_extern]function definitions - Focus areas: tenancy, index, distance, types, graph
Report generated by Claude Code security analysis