
Application Deployment

Build the application Docker image, run database migrations, and start the app container.

Checklist

  • All infrastructure services are (healthy) (see Infrastructure)
  • Build the application image with docker build
  • Run database migrations with alembic upgrade head
  • Start the application container
  • Verify /health returns all components healthy

Build the Application Image

The application image bundles the React frontend (static files), the FastAPI backend, nginx, and the reranker model.

cd /opt/zol-rag

# Tag with the current git commit SHA
export GIT_SHA=$(git rev-parse --short HEAD)

# Build (takes 3-5 minutes on first build)
docker build -f docker/Dockerfile.app -t zol-rag-app:${GIT_SHA} .

# Verify image size (target: <600 MB including reranker model)
docker images zol-rag-app:${GIT_SHA}
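
The size target above can be checked by a script rather than by eye. This is a minimal sketch: the function name image_within_budget is hypothetical, and it assumes 1 MB means 1,000,000 bytes, matching the decimal units that docker images displays.

```shell
# Hypothetical helper: succeed if a size in bytes is below a budget given in MB.
# Assumption: 1 MB = 1,000,000 bytes (decimal units, as shown by `docker images`).
image_within_budget() {
  local size_bytes="$1"
  local limit_mb="${2:-600}"
  [ "$size_bytes" -lt $(( limit_mb * 1000000 )) ]
}

# Possible usage after the build (not executed here):
# size=$(docker image inspect --format '{{.Size}}' "zol-rag-app:${GIT_SHA}")
# image_within_budget "$size" 600 || echo "image exceeds the 600 MB target" >&2
```

docker image inspect reports the size in bytes, which avoids parsing the human-readable column.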

What the Build Does

The multi-stage Dockerfile:

  1. Stage 1 (frontend-build): Node 22 Alpine, npm ci, vite build → static files
  2. Stage 2 (backend-build): Python 3.12, pip install → Python dependencies
  3. Stage 3 (runtime): Slim image with nginx + supervisord, copies frontend dist + backend code, pre-caches the BGE reranker model

The reranker model is cached at build time, so the container can run with TRANSFORMERS_OFFLINE=1, which prevents runtime downloads.

Run Database Migrations

# Run Alembic migrations against the infrastructure database
docker run --rm \
  --network zol-network \
  --env-file .env.prod \
  zol-rag-app:${GIT_SHA} \
  alembic upgrade head

This creates or updates all database tables, including the tables in the app schema for SNOMED, documents, users, etc.
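
To confirm that the migration reached the latest revision, the output of alembic current can be inspected; it marks the current revision with "(head)" when the database is up to date. A minimal sketch, where the helper name is_at_head is hypothetical and a single migration head is assumed:

```shell
# Hypothetical helper: succeed if the given `alembic current` output
# marks the revision as "(head)".
is_at_head() {
  printf '%s\n' "$1" | grep -q '(head)'
}

# Possible usage (not executed here):
# current=$(docker run --rm --network zol-network --env-file .env.prod \
#   "zol-rag-app:${GIT_SHA}" alembic current 2>/dev/null)
# is_at_head "$current" || echo "database is not at the latest revision" >&2
```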

Start the Application

# Start the application container
APP_IMAGE=zol-rag-app:${GIT_SHA} \
docker compose -f docker/docker-compose.app.yml --env-file .env.prod up -d

# Check it started
docker logs zol-app --tail 20

Verify Health

# Health endpoint (passes through nginx → backend)
curl -s http://localhost:80/health | python3 -m json.tool

Expected response:

{
  "status": "healthy",
  "version": "0.1.0",
  "components": {
    "database": "healthy",
    "redis": "healthy",
    "minio": "healthy"
  }
}
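
For scripted deployments, the response can be checked automatically instead of read by eye. The sketch below assumes the response shape shown above and that python3 is available on the host (it is already used for json.tool); the function name health_ok is hypothetical.

```shell
# Hypothetical helper: read a /health JSON document on stdin and succeed only
# if the top-level status and every entry under "components" equal "healthy".
# Invalid JSON also results in a non-zero exit status.
health_ok() {
  python3 -c '
import json, sys
doc = json.load(sys.stdin)
states = [doc.get("status")] + list(doc.get("components", {}).values())
sys.exit(0 if all(s == "healthy" for s in states) else 1)
'
}

# Possible usage (not executed here):
# curl -fsS http://localhost:80/health | health_ok && echo "all components healthy"
```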

Additional Checks

# Frontend loads
curl -s -o /dev/null -w "%{http_code}" http://localhost:80/
# Expected: 200

# API docs accessible
curl -s -o /dev/null -w "%{http_code}" http://localhost:80/docs
# Expected: 200

# Deep readiness check (includes LLM circuit breaker state)
curl -s http://localhost:80/health/ready | python3 -m json.tool
# Expected: status "healthy" with llm_circuit "closed"

# Container is healthy
docker ps --filter name=zol-app --format "{{.Names}}\t{{.Status}}"
# Expected: zol-app Up X minutes (healthy)
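
The container can take a short while to reach the (healthy) state, so an automated deployment may want to retry the checks above rather than run them once. A minimal sketch of a generic retry loop; the names retry_until and container_healthy are hypothetical, and the attempt count and delay are arbitrary.

```shell
# Hypothetical helper: run a command until it succeeds, at most N times,
# sleeping DELAY seconds after each failed attempt.
# Usage: retry_until N DELAY command [args...]
retry_until() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Possible usage (not executed here): wait up to about 60 seconds.
# container_healthy() {
#   [ "$(docker inspect --format '{{.State.Health.Status}}' zol-app)" = "healthy" ]
# }
# retry_until 30 2 container_healthy || echo "zol-app did not become healthy" >&2
```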

Application Architecture

Inside the container, supervisord manages two processes:

| Process | Port | Role |
| --- | --- | --- |
| nginx | 80 (443 with TLS) | Serves static React files, reverse proxies /api and /ws to uvicorn |
| uvicorn | 8000 (internal) | FastAPI backend with 4 workers |

The container runs as non-root user appuser (UID 1000) with a read-only root filesystem.

Graceful Shutdown

The application supports graceful shutdown with request draining. When the container receives SIGTERM, in-flight requests are allowed to complete while new requests receive HTTP 503. Configure the drain timeout via uvicorn:

uvicorn app.main:app --timeout-graceful-shutdown 30

This allows zero-downtime deployments when combined with a load balancer that stops routing traffic to instances whose health checks fail.
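
One detail worth checking: docker stop sends SIGTERM and, by default, follows up with SIGKILL after 10 seconds, which would cut a 30-second drain short. The sketch below derives a stop timeout that leaves a margin beyond the drain timeout; the helper name stop_timeout and the 10-second margin are assumptions. Since supervisord sits between Docker and uvicorn, its stopwaitsecs setting (default 10 seconds) for the uvicorn program would presumably need to exceed the drain timeout as well.

```shell
# Hypothetical helper: stop timeout in seconds = drain timeout + safety margin.
# $1 = drain timeout, $2 = margin (defaults to 10, an arbitrary choice).
stop_timeout() {
  echo $(( $1 + ${2:-10} ))
}

# Possible usage (not executed here): give a 30-second drain 40 seconds in total.
# docker stop -t "$(stop_timeout 30)" zol-app
```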

Environment Variables Reference

See docker/docker-compose.app.yml for the full list. Key variables:

| Variable | Default | Description |
| --- | --- | --- |
| APP_IMAGE | zol-rag-app:latest | Image tag override |
| RAG_LLM_MODEL | openai/gpt-4.1-mini | Standard RAG model |
| RAG_ESCALATION_MODEL | openai/gpt-4.1 | Escalated query model |
| RAG_RERANKER_PROVIDER | jina | Reranker: jina or local |
| RAG_TRUE_STREAMING_ENABLED | true | WebSocket streaming |
| SAFETY_LLM_VALIDATION_ENABLED | false | LLM safety judge |

Keycloak Configuration

| Variable | Default | Description |
| --- | --- | --- |
| KEYCLOAK_URL | http://keycloak:8080 | Keycloak server URL |
| KEYCLOAK_REALM | zol | Keycloak realm name |
| KEYCLOAK_CLIENT_ID | zol-rag | OIDC client identifier |
| KEYCLOAK_CLIENT_SECRET | (none) | OIDC client secret (required) |
| KEYCLOAK_ADMIN_ROLE | admin | Role name for admin access |


SSL / HTTPS

The production deployment must use the SSL compose overlay to enable HTTPS:

APP_IMAGE=zol-rag-app:${GIT_SHA} \
docker compose -f docker/docker-compose.app.yml -f docker/docker-compose.ssl.yml \
  --env-file .env.prod up -d

Without the SSL overlay, port 443 is not exposed and https://test.medchat.health will refuse connections.

Certificate Setup

Let's Encrypt certificates must be copied to the ssl/ directory:

sudo cp /etc/letsencrypt/live/test.medchat.health/fullchain.pem ssl/fullchain.pem
sudo cp /etc/letsencrypt/live/test.medchat.health/privkey.pem ssl/privkey.pem
sudo chown deploy:deploy ssl/*.pem

Certificates auto-renew via the system certbot timer. After renewal, copy the new certs and restart the app container.
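
The copy-and-restart step after renewal can be automated with a certbot deploy hook, which certbot runs after each successful renewal with RENEWED_LINEAGE set to the certificate's live directory. A minimal sketch: the function name install_certs is hypothetical, and the target path /opt/zol-rag/ssl is an assumption based on the project directory used earlier.

```shell
# Hypothetical helper: copy the renewed certificate and key from a certbot
# lineage directory ($1) into the application's ssl directory ($2).
install_certs() {
  local src="$1"
  local dst="$2"
  cp "$src/fullchain.pem" "$dst/fullchain.pem" &&
    cp "$src/privkey.pem" "$dst/privkey.pem"
}

# Possible body of a script in /etc/letsencrypt/renewal-hooks/deploy/
# (not executed here; /opt/zol-rag/ssl is an assumed path):
# install_certs "$RENEWED_LINEAGE" /opt/zol-rag/ssl &&
#   chown deploy:deploy /opt/zol-rag/ssl/*.pem &&
#   docker restart zol-app
```

Scripts in the deploy hook directory run as root, so sudo is not needed there.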