ADR-013: Google Cloud Platform Deployment Architecture

Status

Accepted

Date

2026-01-27

Context

RuvBot needs a production-ready deployment option that:

  1. Minimizes operational costs for low-traffic scenarios
  2. Scales automatically with demand
  3. Provides persistence for sessions, memory, and learning data
  4. Secures API keys and credentials
  5. Supports multi-tenant deployments

Decision

Deploy RuvBot on Google Cloud Platform using serverless and managed services optimized for cost.

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                         Google Cloud Platform                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐                   │
│  │   Cloud      │    │   Artifact   │    │   Cloud      │                   │
│  │   Build      │───▶│   Registry   │───▶│   Run        │                   │
│  │   (CI/CD)    │    │   (Images)   │    │   (App)      │                   │
│  └──────────────┘    └──────────────┘    └──────┬───────┘                   │
│                                                  │                           │
│                     ┌────────────────────────────┼────────────────────────┐  │
│                     │                            │                        │  │
│              ┌──────▼──────┐   ┌────────────────▼───────────┐            │  │
│              │   Secret    │   │      Cloud SQL             │            │  │
│              │   Manager   │   │      (PostgreSQL)          │            │  │
│              │             │   │      db-f1-micro           │            │  │
│              └─────────────┘   └────────────────────────────┘            │  │
│                                                                          │  │
│              ┌─────────────┐   ┌────────────────────────────┐            │  │
│              │   Cloud     │   │      Memorystore           │            │  │
│              │   Storage   │   │      (Redis) - Optional    │            │  │
│              │   (Files)   │   │      Basic tier            │            │  │
│              └─────────────┘   └────────────────────────────┘            │  │
│                                                                          │  │
│                     └────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘

Cost Optimization Strategy

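| Service | Configuration | Monthly Cost | Notes |
|---------|---------------|--------------|-------|
| Cloud Run | 0-10 instances, 512Mi RAM | ~$0-5 | Free tier: 2M requests/month |
| Cloud SQL | db-f1-micro, 10GB SSD | ~$10-15 | Smallest instance |
| Secret Manager | 3-5 secrets | ~$0.18-0.30 | $0.06 per active secret version/month |
| Cloud Storage | Standard, lifecycle policies | ~$0.02/GB | Auto-tiering via lifecycle rules |
| Cloud Build | Free tier | ~$0 | 2,500 build-minutes/month free |
| Total (low traffic) | | ~$15-20/month | |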
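The total can be sanity-checked by summing the per-service ranges. The following is a minimal sketch, not part of RuvBot: the `CostRange` type and `totalRange` helper are hypothetical names, the figures are copied from the table, and Cloud Storage is assumed to hold 1 GB.

```typescript
// Hypothetical helper: sums the monthly cost ranges listed in the table above.
type CostRange = { service: string; low: number; high: number };

const lowTraffic: CostRange[] = [
  { service: "Cloud Run", low: 0, high: 5 },
  { service: "Cloud SQL", low: 10, high: 15 },
  { service: "Secret Manager", low: 0.18, high: 0.18 }, // 3 secrets x $0.06
  { service: "Cloud Storage", low: 0.02, high: 0.02 }, // assumption: 1 GB stored
  { service: "Cloud Build", low: 0, high: 0 },
];

function totalRange(items: CostRange[]): { low: number; high: number } {
  return items.reduce(
    (acc, item) => ({ low: acc.low + item.low, high: acc.high + item.high }),
    { low: 0, high: 0 },
  );
}

const total = totalRange(lowTraffic);
console.log(`~$${total.low.toFixed(2)}-${total.high.toFixed(2)}/month`);
```

The upper bound lines up with the table's ~$20 total. The lower bound applies only when Cloud Run stays within its free tier and Cloud SQL sits at the bottom of its range.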

Service Configuration

Cloud Run (Compute)

# Serverless container configuration
resources:
  cpu: "1"
  memory: "512Mi"
scaling:
  minInstances: 0      # Scale to zero when idle
  maxInstances: 10     # Limit for cost control
  concurrency: 80      # Requests per instance
features:
  cpuIdle: true        # Reduce CPU when idle (cost savings)
  startupCpuBoost: true # Faster cold starts
timeout: 300s          # 5 minutes for long operations
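As a rough illustration of what these limits imply, the sketch below computes the ceiling on simultaneous requests and a naive instance count for a given load. The function names are hypothetical, and the model considers request concurrency only; the real Cloud Run autoscaler also takes CPU utilization into account, so treat the numbers as an approximation.

```typescript
// Values copied from the scaling configuration above.
const scaling = { minInstances: 0, maxInstances: 10, concurrency: 80 };

// Upper bound on requests served at the same moment across all instances.
function maxConcurrentRequests(s: { maxInstances: number; concurrency: number }): number {
  return s.maxInstances * s.concurrency;
}

// Naive estimate of instances needed for `inFlight` simultaneous requests,
// clamped to the configured minimum and maximum.
function instancesFor(inFlight: number, s: typeof scaling): number {
  const needed = Math.ceil(inFlight / s.concurrency);
  return Math.min(Math.max(needed, s.minInstances), s.maxInstances);
}

console.log(maxConcurrentRequests(scaling)); // 10 instances x 80 requests each
console.log(instancesFor(81, scaling)); // one request over a single instance's limit
```

Requests beyond the ceiling are not served by additional instances, which is the cost-control effect of `maxInstances`.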

Cloud SQL (Database)

# Cost-optimized PostgreSQL
tier         = "db-f1-micro"  # 0.6GB RAM, shared CPU
disk_size    = 10             # Minimum SSD
availability = "ZONAL"        # Single zone (cheaper)
backup_retention = 7          # 7 days

# Extensions enabled
- uuid-ossp     # UUID generation
- pgcrypto      # Cryptographic functions
- pg_trgm       # Text search (trigram similarity)

Secret Manager

Securely stores:

  • anthropic-api-key - Anthropic API credentials
  • openrouter-api-key - OpenRouter API credentials
  • database-url - PostgreSQL connection string
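Cloud Run can expose Secret Manager secrets to the container as environment variables. Assuming RuvBot consumes them that way, a startup check could look like the sketch below. The environment variable names and the `loadSecrets` helper are assumptions for illustration; only `ANTHROPIC_API_KEY` appears elsewhere in this document.

```typescript
type Env = Record<string, string | undefined>;

// Assumed mapping of the three secrets above to environment variable names.
const SECRET_ENV_VARS = ["ANTHROPIC_API_KEY", "OPENROUTER_API_KEY", "DATABASE_URL"] as const;
type SecretName = (typeof SECRET_ENV_VARS)[number];

// Fails fast at startup if any secret is absent or empty.
function loadSecrets(env: Env): Record<SecretName, string> {
  const missing = SECRET_ENV_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report names only; secret values must never be logged.
    throw new Error(`Missing secrets: ${missing.join(", ")}`);
  }
  const out = {} as Record<SecretName, string>;
  for (const name of SECRET_ENV_VARS) {
    out[name] = env[name] as string;
  }
  return out;
}

// In the application entry point this would be called as loadSecrets(process.env).
```

Failing at startup rather than on first use means a misconfigured revision never passes its startup probe, so traffic stays on the previous revision.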

Cloud Storage

# Automatic cost optimization
lifecycle_rules = [
  { age = 30, action = "SetStorageClass", class = "NEARLINE" },
  { age = 90, action = "SetStorageClass", class = "COLDLINE" }
]
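To make the effect of these rules concrete, the sketch below maps an object's age to the storage class it would end up in. `storageClassFor` is a hypothetical helper, not a Cloud Storage API, and it assumes objects start in the Standard class. Lifecycle actions run asynchronously, so the actual transition may lag behind the age threshold.

```typescript
type StorageClass = "STANDARD" | "NEARLINE" | "COLDLINE";

// Rules copied from the lifecycle configuration above, in ascending age order.
const rules: { age: number; storageClass: StorageClass }[] = [
  { age: 30, storageClass: "NEARLINE" },
  { age: 90, storageClass: "COLDLINE" },
];

// Returns the class an object of the given age (in days) is expected to have.
function storageClassFor(ageDays: number): StorageClass {
  let current: StorageClass = "STANDARD"; // assumption: objects are created as Standard
  for (const rule of rules) {
    if (ageDays >= rule.age) current = rule.storageClass;
  }
  return current;
}

console.log(storageClassFor(45));
```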

Deployment Options

Option 1: Quick Deploy (gcloud CLI)

# Set environment variables
export ANTHROPIC_API_KEY="sk-ant-..."
export PROJECT_ID="my-project"

# Run deployment script
./deploy/gcp/deploy.sh --project-id "$PROJECT_ID"

Option 2: Infrastructure as Code (Terraform)

cd deploy/gcp/terraform

terraform init
# Save the plan so that apply reuses the same variable values
terraform plan -var="project_id=my-project" -var="anthropic_api_key=sk-ant-..." -out=tfplan
terraform apply tfplan

Option 3: CI/CD (Cloud Build)

# Trigger on push to main branch
trigger:
  branch: main
  included_files:
    - "npm/packages/ruvbot/**"

# cloudbuild.yaml handles build and deploy

Multi-Tenant Configuration

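For multiple tenants, isolation is applied at two levels, per-tenant Cloud SQL databases and PostgreSQL Row-Level Security: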

# Separate Cloud SQL databases
resource "google_sql_database" "tenant" {
  for_each = var.tenants
  name     = "ruvbot_${each.key}"
  instance = google_sql_database_instance.ruvbot.name
}

# Row-Level Security in PostgreSQL
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON sessions
  USING (tenant_id = current_setting('app.tenant_id')::uuid);
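The policy above reads `app.tenant_id` from the session, so the application has to set it before querying. One possible way to do that, sketched here under assumptions, is to wrap each unit of work in a transaction and call PostgreSQL's `set_config` with its third argument set to `true`, which scopes the value to that transaction. `Queryable` and `withTenant` are hypothetical names; `Queryable` is a minimal stand-in for a database client with a `query(text, values)` method.

```typescript
// Minimal client shape assumed for this sketch.
interface Queryable {
  query(text: string, values?: unknown[]): Promise<{ rows: unknown[] }>;
}

// Runs `work` inside a transaction with app.tenant_id set for that transaction only.
async function withTenant<T>(
  db: Queryable,
  tenantId: string,
  work: (db: Queryable) => Promise<T>,
): Promise<T> {
  await db.query("BEGIN");
  try {
    // Third argument true = local to the current transaction.
    await db.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    const result = await work(db);
    await db.query("COMMIT");
    return result;
  } catch (err) {
    await db.query("ROLLBACK");
    throw err;
  }
}
```

Two caveats apply. `db` must be a single dedicated connection for the duration of the call, not a pool that may dispatch each statement to a different connection. Also, PostgreSQL does not apply row security to the table owner unless `FORCE ROW LEVEL SECURITY` is set on the table, so the application should connect as a non-owner role.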

Scaling Considerations

| Traffic Level | Cloud Run Instances | Cloud SQL | Estimated Cost |
|---------------|---------------------|-----------|----------------|
| Low (<1K req/day) | 0-1 | db-f1-micro | ~$15/month |
| Medium (<10K req/day) | 1-3 | db-g1-small | ~$40/month |
| High (<100K req/day) | 3-10 | db-custom | ~$150/month |
| Enterprise | 10-100 | Regional HA | ~$500+/month |
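The table can be read as a simple lookup from daily request volume to a suggested configuration. The sketch below encodes it that way and also converts a daily volume into an average request rate, which helps when relating these tiers to per-instance concurrency. `tierFor` and `averageRps` are hypothetical helpers, and the thresholds are the ones in the table.

```typescript
type Tier = { label: string; maxReqPerDay: number; cloudSql: string; instances: string };

// Rows copied from the scaling table, in ascending order of traffic.
const tiers: Tier[] = [
  { label: "Low", maxReqPerDay: 1000, cloudSql: "db-f1-micro", instances: "0-1" },
  { label: "Medium", maxReqPerDay: 10000, cloudSql: "db-g1-small", instances: "1-3" },
  { label: "High", maxReqPerDay: 100000, cloudSql: "db-custom", instances: "3-10" },
];
const enterprise = { label: "Enterprise", cloudSql: "Regional HA", instances: "10-100" };

// Picks the first tier whose upper bound is above the given volume.
function tierFor(reqPerDay: number) {
  return tiers.find((t) => reqPerDay < t.maxReqPerDay) ?? enterprise;
}

// Average requests per second, assuming traffic is spread evenly over 24 hours.
function averageRps(reqPerDay: number): number {
  return reqPerDay / 86400;
}

console.log(tierFor(5000).cloudSql, averageRps(100000).toFixed(2));
```

Even 100K requests per day averages only a little over one request per second, so peak burstiness rather than average volume is what drives the instance counts.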

Security Configuration

# Service account with minimal permissions
roles = [
  "roles/secretmanager.secretAccessor",
  "roles/cloudsql.client",
  "roles/storage.objectAdmin",
  "roles/logging.logWriter",
  "roles/monitoring.metricWriter",
]

# Network security
ip_configuration {
  ipv4_enabled = false         # Production: use private IP
  private_network = google_compute_network.vpc.id
}

Health Monitoring

# Cloud Run health checks
startup_probe:
  http_get:
    path: /health
    port: 8080
  initial_delay_seconds: 5
  timeout_seconds: 3
  period_seconds: 10

liveness_probe:
  http_get:
    path: /health
    port: 8080
  timeout_seconds: 3
  period_seconds: 30
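Both probes expect the container to answer `GET /health` on port 8080. The sketch below shows one way the application side could decide its response, written as a pure function so it can be wired into any HTTP server. The `handleRequest` name, the readiness callback, and the JSON body shape are assumptions for illustration, not RuvBot's actual API.

```typescript
type HealthResponse = { status: number; body: string };

// Decides the response for a request path; `isReady` reports whether
// dependencies (for example the database connection) are available.
function handleRequest(path: string, isReady: () => boolean): HealthResponse {
  if (path !== "/health") {
    return { status: 404, body: JSON.stringify({ error: "not found" }) };
  }
  // A 2xx/3xx status passes the probe; 503 makes it fail.
  return isReady()
    ? { status: 200, body: JSON.stringify({ status: "ok" }) }
    : { status: 503, body: JSON.stringify({ status: "starting" }) };
}

console.log(handleRequest("/health", () => true));
```

Because the liveness probe uses the same path, a long-lived 503 would also cause the container to be restarted. If that is not desired, a separate path for readiness during startup is one option.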

File Structure

deploy/
├── gcp/
│   ├── cloudbuild.yaml      # CI/CD pipeline
│   ├── deploy.sh            # Quick deployment script
│   └── terraform/
│       └── main.tf          # Infrastructure as code
├── init-db.sql              # Database schema
├── Dockerfile               # Container image
└── docker-compose.yml       # Local development

Consequences

Positive

  • Cost-effective: ~$15-20/month for low traffic
  • Serverless: Scale to zero when not in use
  • Managed services: No infrastructure maintenance
  • Security: Secret Manager, IAM, VPC support
  • Observability: Built-in logging and monitoring

Negative

  • Cold starts: First request after idle ~2-3 seconds
  • Vendor lock-in: GCP-specific services
  • Complexity: Multiple services to configure

Trade-offs

  • Cloud SQL vs Firestore: SQL chosen for complex queries, Row-Level Security
  • Cloud Run vs GKE: Run chosen for simplicity, lower cost
  • db-f1-micro vs larger: Cost vs performance trade-off

Alternatives Considered

| Option | Pros | Cons | Estimated Cost |
|--------|------|------|----------------|
| GKE + Postgres | Full control, predictable | Complex, expensive | ~$100+/month |
| App Engine | Simple deployment | Less flexible | ~$30/month |
| Firebase + Functions | Easy scaling | No SQL, vendor lock-in | ~$20/month |
| Cloud Run + SQL | Balanced | Some complexity | ~$15/month |

References