Application Deployment
Build the application Docker image, run database migrations, and start the app container.
Checklist
- All infrastructure services are healthy (see Infrastructure)
- Build the application image with docker build
- Run database migrations with alembic upgrade head
- Start the application container
- Verify /health returns all components healthy
Build the Application Image
The application image bundles: React frontend (static files) + FastAPI backend + nginx + reranker model.
cd /opt/zol-rag
# Tag with the current git commit SHA
export GIT_SHA=$(git rev-parse --short HEAD)
# Build (takes 3-5 minutes on first build)
docker build -f docker/Dockerfile.app -t zol-rag-app:${GIT_SHA} .
# Verify image size (target: <600 MB including reranker model)
docker images zol-rag-app:${GIT_SHA}
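The size target above can also be checked automatically instead of by eye. The following is a minimal sketch, not part of the project's tooling: the function names and the DOCKER override variable (useful for dry runs) are hypothetical, and the 600 MB limit is taken from the comment in the build commands.

```shell
# Sketch: fail a build step when the image exceeds the size target.
# DOCKER is a hypothetical override for the docker binary (assumption).

# Succeeds when size_bytes is strictly below limit_mb (1 MB = 1,000,000 bytes).
size_under_limit() {
  local size_bytes="$1" limit_mb="$2"
  [ "$size_bytes" -lt $(( limit_mb * 1000 * 1000 )) ]
}

# Reads the image size in bytes and compares it with the limit (default 600 MB).
check_image_size() {
  local image="$1" limit_mb="${2:-600}" size_bytes
  size_bytes=$("${DOCKER:-docker}" image inspect --format '{{.Size}}' "$image") || return 2
  if size_under_limit "$size_bytes" "$limit_mb"; then
    echo "OK: $image is $(( size_bytes / 1000000 )) MB (limit ${limit_mb} MB)"
  else
    echo "FAIL: $image is $(( size_bytes / 1000000 )) MB (limit ${limit_mb} MB)" >&2
    return 1
  fi
}
```

A call such as `check_image_size "zol-rag-app:${GIT_SHA}"` could then run right after the build.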
What the Build Does
The multi-stage Dockerfile:
- Stage 1 (frontend-build): Node 22 Alpine, npm ci, vite build → static files
- Stage 2 (backend-build): Python 3.12, pip install → Python dependencies
- Stage 3 (runtime): Slim image with nginx + supervisord, copies frontend dist + backend code, pre-caches the BGE reranker model
The reranker model is cached during the build, so the container can run with TRANSFORMERS_OFFLINE=1 and never needs to download it at runtime.
Run Database Migrations
# Run Alembic migrations against the infrastructure database
docker run --rm \
--network zol-network \
--env-file .env.prod \
zol-rag-app:${GIT_SHA} \
alembic upgrade head
This creates or updates all database tables, including the app schema tables for SNOMED, documents, users, and so on.
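To confirm that the upgrade actually reached the latest revision, the output of `alembic current` can be inspected; Alembic appends `(head)` to the revision when the database is up to date. The sketch below assumes a single migration head and reuses the network and env file from the command above. The function names and the DOCKER override are hypothetical.

```shell
# Sketch: check that the database is at the latest Alembic revision.
# Assumes a single head; DOCKER is a hypothetical override for dry runs.

# Succeeds when the given `alembic current` output is marked as head.
at_head() {
  printf '%s\n' "$1" | grep -q '(head)'
}

# Runs `alembic current` in a throwaway container and checks the marker.
migrations_at_head() {
  local image="$1" current
  current=$("${DOCKER:-docker}" run --rm --network zol-network \
    --env-file .env.prod "$image" alembic current 2>/dev/null) || return 2
  at_head "$current"
}
```

For example, `migrations_at_head "zol-rag-app:${GIT_SHA}" || echo "migrations incomplete"` would flag a database that is behind.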
Start the Application
# Start the application container
APP_IMAGE=zol-rag-app:${GIT_SHA} \
docker compose -f docker/docker-compose.app.yml --env-file .env.prod up -d
# Check it started
docker logs zol-app --tail 20
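The container may need some time before it reports healthy, so a deployment script would typically poll rather than check once. This is a minimal sketch of such a wait loop, assuming the container defines a Docker health check (the `(healthy)` status shown later on this page suggests it does). The names wait_until and container_healthy and the DOCKER override are hypothetical.

```shell
# Sketch: poll until a command succeeds or the attempts run out.

# Runs "$@" up to `attempts` times, sleeping `delay` seconds between failures.
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$(( i + 1 ))
  done
  return 1
}

# Succeeds when Docker reports the container's health status as "healthy".
container_healthy() {
  [ "$("${DOCKER:-docker}" inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}
```

A call such as `wait_until 30 2 container_healthy zol-app` would wait up to about a minute before giving up.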
Verify Health
# Health endpoint (passes through nginx → backend)
curl -s http://localhost:80/health | python3 -m json.tool
Expected response:
{
"status": "healthy",
"version": "0.1.0",
"components": {
"database": "healthy",
"redis": "healthy",
"minio": "healthy"
}
}
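For scripted deployments, the response can be evaluated rather than read. The sketch below is one possible check, assuming the response has the shape shown above; the function name health_ok is hypothetical, and python3 is used because the commands on this page already rely on it.

```shell
# Sketch: evaluate a /health response read from stdin.
# Succeeds only if the top-level status and every component are "healthy".
# Invalid JSON makes python3 exit non-zero, so it also counts as a failure.
health_ok() {
  python3 -c '
import json, sys
doc = json.load(sys.stdin)
states = [doc.get("status")] + list(doc.get("components", {}).values())
sys.exit(0 if all(s == "healthy" for s in states) else 1)
'
}
```

Usage could look like `curl -fs http://localhost:80/health | health_ok && echo "all components healthy"`.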
Additional Checks
# Frontend loads
curl -s -o /dev/null -w "%{http_code}" http://localhost:80/
# Expected: 200
# API docs accessible
curl -s -o /dev/null -w "%{http_code}" http://localhost:80/docs
# Expected: 200
# Deep readiness check (includes LLM circuit breaker state)
curl -s http://localhost:80/health/ready | python3 -m json.tool
# Expected: status "healthy" with llm_circuit "closed"
# Container is healthy
docker ps --filter name=zol-app --format "{{.Names}}\t{{.Status}}"
# Expected: zol-app Up X minutes (healthy)
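The status-code checks above can be bundled into one smoke test that fails if any path returns an unexpected code. This is a rough sketch; the function names and the CURL override (handy for testing without a running server) are hypothetical.

```shell
# Sketch: compare HTTP status codes for several paths against one expectation.
# CURL is a hypothetical override for the curl binary (assumption).

# Prints the HTTP status code for a URL.
http_status() {
  "${CURL:-curl}" -s -o /dev/null -w '%{http_code}' "$1"
}

# Checks every path under base_url; returns non-zero if any code differs.
smoke_check() {
  local base_url="$1" expected="$2"
  shift 2
  local failed=0 path code
  for path in "$@"; do
    code=$(http_status "${base_url}${path}") || code="000"
    if [ "$code" = "$expected" ]; then
      echo "ok   $path -> $code"
    else
      echo "FAIL $path -> $code (expected $expected)"
      failed=1
    fi
  done
  return "$failed"
}
```

For the two checks above, the call would be `smoke_check http://localhost:80 200 / /docs`.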
Application Architecture
Inside the container, supervisord manages two processes:
| Process | Port | Role |
|---|---|---|
| nginx | 80 (443 with TLS) | Serves static React files, reverse proxies /api and /ws to uvicorn |
| uvicorn | 8000 (internal) | FastAPI backend with 4 workers |
The container runs as non-root user appuser (UID 1000) with a read-only root filesystem.
Graceful Shutdown
The application supports graceful shutdown with request draining. When the container receives SIGTERM, in-flight requests are allowed to complete while new requests receive HTTP 503. Configure the drain timeout via uvicorn:
uvicorn app.main:app --timeout-graceful-shutdown 30
Combined with a load balancer that respects health check failures, this allows zero-downtime deployments.
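One detail worth checking when stopping the container manually: `docker stop` sends SIGTERM and follows up with SIGKILL after a timeout that defaults to 10 seconds, which is shorter than the 30-second drain window above. The sketch below passes a longer timeout. It assumes that supervisord forwards the signal to uvicorn and that no stop grace period is already set in the compose file; neither is confirmed here. The function name, the 5-second margin, and the DOCKER override are hypothetical.

```shell
# Sketch: stop a container with a timeout longer than the drain window.
# DOCKER is a hypothetical override (e.g. DOCKER=echo prints the command).

# Gives the container drain_seconds plus a margin before Docker sends SIGKILL.
graceful_stop() {
  local container="$1" drain_seconds="${2:-30}" margin="${3:-5}"
  "${DOCKER:-docker}" stop -t "$(( drain_seconds + margin ))" "$container"
}
```

With the defaults, `graceful_stop zol-app` runs `docker stop -t 35 zol-app`.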
Environment Variables Reference
See docker/docker-compose.app.yml for the full list. Key variables:
| Variable | Default | Description |
|---|---|---|
| APP_IMAGE | zol-rag-app:latest | Image tag override |
| RAG_LLM_MODEL | openai/gpt-4.1-mini | Standard RAG model |
| RAG_ESCALATION_MODEL | openai/gpt-4.1 | Escalated query model |
| RAG_RERANKER_PROVIDER | jina | Reranker: jina or local |
| RAG_TRUE_STREAMING_ENABLED | true | WebSocket streaming |
| SAFETY_LLM_VALIDATION_ENABLED | false | LLM safety judge |
Keycloak Configuration
| Variable | Default | Description |
|---|---|---|
| KEYCLOAK_URL | http://keycloak:8080 | Keycloak server URL |
| KEYCLOAK_REALM | zol | Keycloak realm name |
| KEYCLOAK_CLIENT_ID | zol-rag | OIDC client identifier |
| KEYCLOAK_CLIENT_SECRET | (none) | OIDC client secret (required) |
| KEYCLOAK_ADMIN_ROLE | admin | Role name for admin access |
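Because KEYCLOAK_CLIENT_SECRET has no default, a missing value is easy to overlook until the app fails at startup. A preflight check on the env file can catch this earlier. The sketch below assumes plain `VAR=value` lines, as read by `--env-file`; the function names are hypothetical.

```shell
# Sketch: verify that required variables are set in an env file.
# Assumes one VAR=value pair per line, without "export" prefixes or quoting.

# Succeeds when the file defines var with a non-empty value.
env_has_value() {
  local file="$1" var="$2"
  grep -Eq "^${var}=.+" "$file"
}

# Reports every missing variable; returns non-zero if any is missing.
check_required_env() {
  local file="$1"
  shift
  local missing=0 var
  for var in "$@"; do
    if ! env_has_value "$file" "$var"; then
      echo "missing or empty: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Before starting the container, this could run as `check_required_env .env.prod KEYCLOAK_CLIENT_SECRET`.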
Next: Data Seeding →
SSL / HTTPS
The production deployment must use the SSL compose overlay to enable HTTPS:
APP_IMAGE=zol-rag-app:${GIT_SHA} \
docker compose -f docker/docker-compose.app.yml -f docker/docker-compose.ssl.yml \
--env-file .env.prod up -d
Without the SSL overlay, port 443 is not exposed and https://test.medchat.health will refuse connections.
Certificate Setup
Let's Encrypt certificates must be copied to the ssl/ directory:
sudo cp /etc/letsencrypt/live/test.medchat.health/fullchain.pem ssl/fullchain.pem
sudo cp /etc/letsencrypt/live/test.medchat.health/privkey.pem ssl/privkey.pem
sudo chown deploy:deploy ssl/*.pem
Certificates auto-renew via the system certbot timer. After renewal, copy the new certs and restart the app container.
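The copy-and-restart step after renewal can be automated with a certbot deploy hook, which certbot runs after each successful renewal with RENEWED_LINEAGE pointing at the renewed live directory. The following is a sketch under assumptions: the function names, the DOCKER override, and the target path /opt/zol-rag/ssl (inferred from the project directory used earlier) are not confirmed by this page.

```shell
# Sketch: install renewed certificates and restart the app container.
# DOCKER is a hypothetical override for dry runs (assumption).

# Copies the certificate pair from a certbot lineage directory into ssl_dir.
install_certs() {
  local lineage="$1" ssl_dir="$2" name
  for name in fullchain.pem privkey.pem; do
    if [ ! -f "$lineage/$name" ]; then
      echo "missing $lineage/$name" >&2
      return 1
    fi
  done
  cp "$lineage/fullchain.pem" "$ssl_dir/fullchain.pem" &&
    cp "$lineage/privkey.pem" "$ssl_dir/privkey.pem"
}

# Installs the certificates, fixes ownership, and restarts the container.
renew_hook() {
  local lineage="$1" ssl_dir="$2"
  install_certs "$lineage" "$ssl_dir" &&
    chown deploy:deploy "$ssl_dir"/*.pem &&
    "${DOCKER:-docker}" restart zol-app
}
```

Saved as an executable script under /etc/letsencrypt/renewal-hooks/deploy/ and ending with `renew_hook "$RENEWED_LINEAGE" /opt/zol-rag/ssl`, this would run after each renewal.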