
Deployment Guide

This guide covers deploying CalcBridge to production environments. CalcBridge is designed for containerized deployments with Docker and can be orchestrated with Docker Compose or Kubernetes.


Prerequisites

System Requirements

Component  Minimum    Recommended
CPU        4 cores    8+ cores
RAM        8 GB       16+ GB
Disk       50 GB SSD  200+ GB SSD
Network    1 Gbps     10 Gbps

Software Requirements

Software        Version  Purpose
Docker          24.0+    Container runtime
Docker Compose  2.20+    Multi-container orchestration
PostgreSQL      16+      Primary database
Valkey/Redis    8+       Cache and message broker

Environment Variables

Required Production Variables

Security Critical

These values MUST be changed from defaults before production deployment. Use strong, unique values generated with:

python -c "import secrets; print(secrets.token_urlsafe(32))"

# Core Security (REQUIRED - Must change from defaults)
JWT_SECRET_KEY=<secure-random-key-32-chars-minimum>
ENCRYPTION_MASTER_KEY=<secure-random-key-32-chars-minimum>

# Environment
ENVIRONMENT=production
DEBUG=false
LOG_LEVEL=INFO

# Database
DATABASE_URL=postgresql+psycopg://user:password@db-host:5432/calcbridge
DATABASE_POOL_SIZE=20
DATABASE_MAX_OVERFLOW=10
DATABASE_SSL_MODE=verify-full

# Cache
VALKEY_HOST=cache-host
VALKEY_PORT=6379
VALKEY_PASSWORD=<secure-cache-password>
VALKEY_SSL=true

# API
API_PREFIX=/api/v1
CORS_ORIGINS=["https://app.yourdomain.com"]
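
The rules stated above (secrets of at least 32 characters, ENVIRONMENT=production, DEBUG=false) can be checked before startup. The following is a minimal sketch, not CalcBridge's own validation: the function name check_production_env and the rule that the two keys must differ are assumptions made for illustration.

```python
import secrets

MIN_SECRET_LENGTH = 32  # from the "32 chars minimum" placeholders above


def check_production_env(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name in ("JWT_SECRET_KEY", "ENCRYPTION_MASTER_KEY"):
        if len(env.get(name, "")) < MIN_SECRET_LENGTH:
            problems.append(f"{name} must be at least {MIN_SECRET_LENGTH} characters")
    # Assumption for this sketch: the two keys should never share a value.
    jwt_key = env.get("JWT_SECRET_KEY")
    if jwt_key and jwt_key == env.get("ENCRYPTION_MASTER_KEY"):
        problems.append("JWT_SECRET_KEY and ENCRYPTION_MASTER_KEY must differ")
    if env.get("ENVIRONMENT") != "production":
        problems.append("ENVIRONMENT must be 'production'")
    if env.get("DEBUG", "false").lower() != "false":
        problems.append("DEBUG must be false")
    return problems


# token_urlsafe(32) encodes 32 random bytes as 43 URL-safe characters,
# which satisfies the 32-character minimum.
sample = {
    "JWT_SECRET_KEY": secrets.token_urlsafe(32),
    "ENCRYPTION_MASTER_KEY": secrets.token_urlsafe(32),
    "ENVIRONMENT": "production",
    "DEBUG": "false",
}
print(check_production_env(sample))  # prints []
```

Passing dict(os.environ) instead of the sample dictionary would run the same checks against the real process environment.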

Complete Environment Reference

Variable                         Required  Default         Description
Security
JWT_SECRET_KEY                   Yes       (insecure)      JWT signing key (32+ chars)
ENCRYPTION_MASTER_KEY            Yes       (insecure)      PII encryption key (32+ chars)
JWT_ACCESS_TOKEN_EXPIRE_MINUTES  No        30              Access token lifetime
JWT_REFRESH_TOKEN_EXPIRE_DAYS    No        7               Refresh token lifetime
Environment
ENVIRONMENT                      Yes       development     development, staging, production
DEBUG                            No        false           Enable debug mode
LOG_LEVEL                        No        INFO            Logging level
Database
DATABASE_URL                     Yes       (local)         PostgreSQL connection URL
DATABASE_POOL_SIZE               No        20              Connection pool size
DATABASE_MAX_OVERFLOW            No        10              Max overflow connections
DATABASE_SSL_MODE                No        prefer          SSL mode (verify-full for prod)
DATABASE_APP_ROLE                No        calcbridge_app  Database role for RLS
Cache
VALKEY_HOST                      Yes       localhost       Valkey/Redis host
VALKEY_PORT                      No        6379            Valkey/Redis port
VALKEY_PASSWORD                  No        (none)          Valkey/Redis password
VALKEY_SSL                       No        false           Enable SSL (true for prod)
Rate Limiting
RATE_LIMIT_ENABLED               No        true            Enable rate limiting
RATE_LIMIT_TIER_FREE             No        100             Free tier requests/min
RATE_LIMIT_TIER_ENTERPRISE       No        10000           Enterprise requests/min
File Storage
STORAGE_BACKEND                  No        local           local or s3
STORAGE_S3_BUCKET                If S3     -               S3 bucket name
STORAGE_S3_REGION                No        us-east-1       AWS region
Observability
OTEL_ENABLED                     No        false           Enable OpenTelemetry
OTEL_ENDPOINT                    If OTEL   -               OTLP collector endpoint
SENTRY_DSN                       No        -               Sentry error tracking DSN
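
Two rows in the reference are conditionally required: STORAGE_S3_BUCKET when the storage backend is s3, and OTEL_ENDPOINT when OpenTelemetry is enabled. The sketch below shows one way to report unset required variables, including those conditional rows. The helper name missing_variables is hypothetical, and the exact way CalcBridge parses boolean values is an assumption.

```python
def missing_variables(env: dict[str, str]) -> list[str]:
    """Return names of variables the reference marks as required but that are unset."""
    required = [
        "JWT_SECRET_KEY",
        "ENCRYPTION_MASTER_KEY",
        "ENVIRONMENT",
        "DATABASE_URL",
        "VALKEY_HOST",
    ]
    # Conditional rows ("If S3", "If OTEL") apply only when the feature is on.
    if env.get("STORAGE_BACKEND", "local") == "s3":
        required.append("STORAGE_S3_BUCKET")
    # Assumption: "true" in any letter case enables the feature.
    if env.get("OTEL_ENABLED", "false").lower() == "true":
        required.append("OTEL_ENDPOINT")
    return [name for name in required if not env.get(name)]


example_env = {
    "JWT_SECRET_KEY": "x" * 43,
    "ENCRYPTION_MASTER_KEY": "y" * 43,
    "ENVIRONMENT": "production",
    "DATABASE_URL": "postgresql+psycopg://user:password@db-host:5432/calcbridge",
    "VALKEY_HOST": "cache-host",
    "STORAGE_BACKEND": "s3",  # S3 selected, but no bucket configured
}
print(missing_variables(example_env))  # prints ['STORAGE_S3_BUCKET']
```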

Docker Deployment

Building the Image

# Build production image
docker build -t calcbridge:latest -f Dockerfile --target production .

# Build with specific version tag
docker build -t calcbridge:v1.0.0 -f Dockerfile --target production .

Docker Compose Production

Create a production compose file:

docker-compose.prod.yml
version: "3.8"

services:
  api:
    image: calcbridge:latest
    restart: unless-stopped
    environment:
      - ENVIRONMENT=production
      - DEBUG=false
      - DATABASE_URL=postgresql+psycopg://calcbridge:${DB_PASSWORD}@postgres:5432/calcbridge
      - DATABASE_SSL_MODE=verify-full
      - VALKEY_HOST=valkey
      - VALKEY_PASSWORD=${VALKEY_PASSWORD}
      - VALKEY_SSL=true
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
      - ENCRYPTION_MASTER_KEY=${ENCRYPTION_MASTER_KEY}
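    ports:
      # A single fixed host port conflicts with replicas > 1;
      # a host port range gives each replica its own port
      - "8000-8001:8000"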
    depends_on:
      postgres:
        condition: service_healthy
      valkey:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "2"
          memory: 2G
        reservations:
          cpus: "0.5"
          memory: 512M

  celery-worker:
    image: calcbridge:latest
    restart: unless-stopped
    command: >
      celery -A src.workers.celery_app worker
      --loglevel=INFO
      --concurrency=4
      --queues=default,parse,export
    environment:
      - DATABASE_URL=postgresql+psycopg://calcbridge:${DB_PASSWORD}@postgres:5432/calcbridge
      - VALKEY_HOST=valkey
      - VALKEY_PASSWORD=${VALKEY_PASSWORD}
    depends_on:
      postgres:
        condition: service_healthy
      valkey:
        condition: service_healthy
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "4"
          memory: 4G

  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_USER: calcbridge
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: calcbridge
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U calcbridge"]
      interval: 10s
      timeout: 5s
      retries: 5

  valkey:
    image: valkey/valkey:8-alpine
    restart: unless-stopped
    command: >
      valkey-server
      --appendonly yes
      --maxmemory 1gb
      --maxmemory-policy allkeys-lru
      --requirepass ${VALKEY_PASSWORD}
    volumes:
      - valkey_data:/data
    healthcheck:
      test: ["CMD", "valkey-cli", "-a", "${VALKEY_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:
  valkey_data:

Deploy with Docker Compose

# Create production environment file
cat > .env.prod << EOF
DB_PASSWORD=$(python -c "import secrets; print(secrets.token_urlsafe(24))")
VALKEY_PASSWORD=$(python -c "import secrets; print(secrets.token_urlsafe(24))")
JWT_SECRET_KEY=$(python -c "import secrets; print(secrets.token_urlsafe(32))")
ENCRYPTION_MASTER_KEY=$(python -c "import secrets; print(secrets.token_urlsafe(32))")
EOF

# Deploy
docker compose -f docker-compose.prod.yml --env-file .env.prod up -d

# Verify deployment
docker compose -f docker-compose.prod.yml ps
docker compose -f docker-compose.prod.yml logs -f api

Database Migrations

Apply Migrations

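Migrations must be applied before the first deployment and after each upgrade. The commands below assume the PostgreSQL container is named calcbridge-postgres. The compose file above does not set container_name, so check the actual name with docker compose ps and substitute it if it differs: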

# Option 1: Apply a single migration via Docker
docker exec -i calcbridge-postgres psql -v ON_ERROR_STOP=1 -U calcbridge -d calcbridge < db/migrations/V001__initial_schema.sql

# Option 2: Apply all migrations in order, stopping at the first failure
for migration in db/migrations/V*.sql; do
  echo "Applying: $migration"
  docker exec -i calcbridge-postgres psql -v ON_ERROR_STOP=1 -U calcbridge -d calcbridge < "$migration" || break
done

# Option 3: Use the migration script
./scripts/apply_migration_docker.sh db/migrations/V019__grant_app_role_metrics_mapping.sql
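
The shell loop above relies on the glob returning files in version order, which holds only while the version numbers are zero-padded to the same width. As a hedged sketch, the ordering can be made explicit by parsing the numeric prefix. The V<number>__<name>.sql pattern is inferred from the file names shown in this guide, and ordered_migrations is a hypothetical helper, not part of CalcBridge.

```python
import re
from pathlib import Path

# Pattern inferred from names such as V001__initial_schema.sql.
MIGRATION_NAME = re.compile(r"^V(\d+)__(.+)\.sql$")


def ordered_migrations(filenames):
    """Sort migration paths by numeric version; reject bad names and duplicates."""
    parsed = []
    for name in filenames:
        match = MIGRATION_NAME.match(Path(name).name)
        if match is None:
            raise ValueError(f"unexpected migration file name: {name}")
        parsed.append((int(match.group(1)), name))
    versions = [version for version, _ in parsed]
    if len(versions) != len(set(versions)):
        raise ValueError("duplicate migration version")
    return [name for _, name in sorted(parsed)]


files = [
    "db/migrations/V019__grant_app_role_metrics_mapping.sql",
    "db/migrations/V001__initial_schema.sql",
]
for path in ordered_migrations(files):
    print(path)  # V001 is printed before V019
```

With unpadded names, a plain lexicographic sort would place V10 before V2; the numeric sort avoids that.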

Migration Best Practices

Migration Safety

  • Always backup the database before migrations
  • Test migrations in staging first
  • Use transactions for rollback capability
  • Never modify already-applied migrations

# Backup before migration
docker exec calcbridge-postgres pg_dump -U calcbridge calcbridge > backup_$(date +%Y%m%d).sql

# Apply migration
docker exec -i calcbridge-postgres psql -U calcbridge -d calcbridge < db/migrations/V020__new_migration.sql

# Verify migration
docker exec calcbridge-postgres psql -U calcbridge -d calcbridge -c "\dt"

SSL/TLS Configuration

Database SSL

For production, use verify-full SSL mode:

# Environment variables
DATABASE_SSL_MODE=verify-full
DATABASE_SSL_CERT_PATH=/certs/client-cert.pem
DATABASE_SSL_KEY_PATH=/certs/client-key.pem
DATABASE_SSL_CA_PATH=/certs/ca-cert.pem

Reverse Proxy with TLS

Use nginx or Traefik as a reverse proxy with TLS termination:

nginx.conf
upstream calcbridge {
    server api:8000;
}

server {
    listen 443 ssl http2;
    server_name api.yourdomain.com;

    ssl_certificate /etc/nginx/ssl/fullchain.pem;
    ssl_certificate_key /etc/nginx/ssl/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;

    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "DENY" always;

    location / {
        proxy_pass http://calcbridge;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

Health Checks

CalcBridge provides multiple health check endpoints:

Endpoint          Purpose               Response
/health           Basic liveness        {"status": "healthy"}
/health/live      Kubernetes liveness   {"status": "alive"}
/health/ready     Kubernetes readiness  Checks DB + cache
/health/detailed  Full system status    All component statuses
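
After a deployment, a script can poll the readiness endpoint until the API reports that its dependencies are reachable. The following is a minimal sketch using only the standard library. It assumes /health/ready answers HTTP 200 when ready and a non-200 status otherwise, which this guide does not state explicitly; fetch_status and wait_until_ready are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # server answered with a 4xx/5xx status
    except OSError:
        return 0  # connection refused, DNS failure, timeout, and similar


def wait_until_ready(url, attempts=10, delay=3.0, fetch=fetch_status, sleep=time.sleep):
    """Poll url until it returns HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # pause between attempts, not after the last one
    return False
```

A call such as wait_until_ready("http://localhost:8000/health/ready") returns True once the endpoint answers 200 and False after the final failed attempt. The fetch and sleep parameters exist so the loop can be exercised without a network.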

Kubernetes Probes

livenessProbe:
  httpGet:
    path: /health/live
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
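
The probe settings determine how long a failing pod goes unnoticed: Kubernetes acts only after failureThreshold consecutive failures, spaced periodSeconds apart. The arithmetic below is an approximation that ignores where in the probe cycle the failure begins and how long each probe takes to time out.

```python
def detection_window(period_seconds: int, failure_threshold: int) -> int:
    """Approximate seconds of continuous failure before Kubernetes reacts."""
    return period_seconds * failure_threshold


# Values taken from the probe definitions above.
print(detection_window(period_seconds=30, failure_threshold=3))  # liveness: prints 90
print(detection_window(period_seconds=10, failure_threshold=3))  # readiness: prints 30
```

With these values, an unready pod is removed from service after roughly 30 seconds, while a hung container is restarted after roughly 90 seconds.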

Monitoring Setup

Prometheus Metrics

CalcBridge exposes Prometheus metrics at /metrics:

# prometheus.yml scrape config
scrape_configs:
  - job_name: "calcbridge"
    static_configs:
      - targets: ["api:8000"]
    metrics_path: /metrics
    scheme: http

Key Metrics

Metric                         Type       Description
http_requests_total            Counter    Total HTTP requests
http_request_duration_seconds  Histogram  Request latency
celery_task_duration_seconds   Histogram  Task processing time
db_connection_pool_size        Gauge      Database pool usage
rate_limit_exceeded_total      Counter    Rate limit violations

Grafana Dashboards

Import the provided dashboards from config/grafana/dashboards/:

  • CalcBridge Overview: Request rates, latency, error rates
  • Celery Workers: Task throughput, queue depths, worker status
  • Database Performance: Connection pool, query latency

Alerting Configuration

Alertmanager Integration

alertmanager.yml
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'severity']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'default'

receivers:
  - name: 'default'
    slack_configs:
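      # Alertmanager does not expand environment variables in its config file;
      # substitute the real webhook URL before loading (for example with envsubst)
      - api_url: '${SLACK_WEBHOOK_URL}'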
        channel: '#alerts'
        send_resolved: true

Alert Rules

alert_rules.yml
groups:
  - name: calcbridge
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: High error rate detected

      - alert: SlowResponses
        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: API response times are slow
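
The HighErrorRate threshold is easier to reason about as a count. PromQL's rate() returns a per-second average over the window, so the rule compares an absolute rate of 5xx responses, not a fraction of all requests. A small sketch of the conversion:

```python
def errors_in_window(rate_per_second: float, window_seconds: int) -> float:
    """Convert a per-second rate into the equivalent count over the window."""
    return rate_per_second * window_seconds


# Threshold 0.1/s over the 5-minute (300 s) window used by HighErrorRate.
print(round(errors_in_window(0.1, 300)))  # prints 30
```

In other words, the alert condition holds when a series averages more than about 30 server errors per 5 minutes, regardless of total traffic.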

Scaling Guidelines

Horizontal Scaling

Component       Scaling Strategy         Notes
API             Stateless, scale freely  Use load balancer
Celery Workers  Scale by queue depth     Monitor memory usage
PostgreSQL      Read replicas            Consider PgBouncer
Valkey          Cluster mode             For high availability
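
The PgBouncer note becomes concrete once the possible connection count is added up. The sketch below is a rough upper bound under stated assumptions: each process keeps its own pool capped at DATABASE_POOL_SIZE + DATABASE_MAX_OVERFLOW, each API replica runs one process, and each Celery replica runs four worker processes (from --concurrency=4). None of these are confirmed by this guide. PostgreSQL's default max_connections is 100.

```python
def max_connections(replicas: int, processes_per_replica: int,
                    pool_size: int = 20, max_overflow: int = 10) -> int:
    """Upper bound on connections if every process fills its pool and overflow."""
    return replicas * processes_per_replica * (pool_size + max_overflow)


# Replica counts from the compose file; per-replica process counts are assumptions.
api = max_connections(replicas=2, processes_per_replica=1)
workers = max_connections(replicas=2, processes_per_replica=4)

print(api)            # prints 60
print(workers)        # prints 240
print(api + workers)  # prints 300, above PostgreSQL's default limit of 100
```

If the bound exceeds the server limit, the usual options are a smaller pool per process, a higher max_connections, or a pooler such as PgBouncer in front of PostgreSQL.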

Celery Worker Scaling

# Scale workers based on workload
docker compose -f docker-compose.prod.yml up -d --scale celery-worker=4

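# Or use separate workers for different queues
# (assumes a celery-worker-calc service has been added to the compose file;
# the example compose file above does not define one)
docker compose -f docker-compose.prod.yml up -d \
  --scale celery-worker=2 \
  --scale celery-worker-calc=4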

Backup and Recovery

Database Backup

# Create backup
docker exec calcbridge-postgres pg_dump -U calcbridge calcbridge | gzip > backup_$(date +%Y%m%d_%H%M%S).sql.gz

# Automated daily backup (cron)
0 2 * * * docker exec calcbridge-postgres pg_dump -U calcbridge calcbridge | gzip > /backups/calcbridge_$(date +\%Y\%m\%d).sql.gz

# Restore from backup
gunzip -c backup_20250101.sql.gz | docker exec -i calcbridge-postgres psql -U calcbridge calcbridge
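
A backup is only useful if it can be read back. As a hedged sketch, the check below confirms that a .sql.gz file decompresses completely and is not empty. It does not validate the SQL itself, and backup_is_readable is a hypothetical helper name.

```python
import gzip
import zlib


def backup_is_readable(path: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the gzip archive decompresses fully and contains data."""
    total = 0
    try:
        with gzip.open(path, "rb") as archive:
            # Read in chunks so large dumps are never held in memory at once.
            while chunk := archive.read(chunk_size):
                total += len(chunk)
    except (OSError, EOFError, zlib.error):
        # Missing file, not a gzip file, truncated stream, or corrupt data.
        return False
    return total > 0
```

Calling backup_is_readable("backup_20250101.sql.gz") right after the dump completes catches truncated or empty archives early. A periodic test restore into a scratch database remains the stronger check.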

Valkey Backup

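# Trigger RDB snapshot (the server uses requirepass, so authenticate;
# VALKEY_PASSWORD must be set in this shell)
docker exec calcbridge-valkey valkey-cli -a "$VALKEY_PASSWORD" BGSAVE

# BGSAVE runs in the background; wait until LASTSAVE changes before copying
# Copy backup file
docker cp calcbridge-valkey:/data/dump.rdb ./backups/valkey_$(date +%Y%m%d).rdb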

Deployment Checklist

Before deploying to production:

  • All environment variables set with secure values
  • JWT_SECRET_KEY and ENCRYPTION_MASTER_KEY are unique, random, 32+ chars
  • ENVIRONMENT=production and DEBUG=false
  • Database SSL mode is verify-full
  • Valkey SSL is enabled
  • Database migrations applied
  • Health checks configured and passing
  • Monitoring and alerting configured
  • Backup strategy implemented
  • Load testing completed
  • Security scan passed
  • TLS certificates installed and valid
  • CORS origins correctly configured
  • Rate limiting enabled