Infrastructure Services
Start the 6 core infrastructure services and verify they are healthy. These services persist data on Docker volumes and rarely need to be restarted.
Checklist
- Start infrastructure with docker compose
- Wait for all 6 services to show (healthy)
- Verify Keycloak has imported the "zol" realm
- Check PostgreSQL, Redis, and Keycloak are accepting connections
- Verify the OPENAI_API_KEY is set in .env.prod and the host has outbound HTTPS to api.openai.com
Services Overview
| Service | Image | Purpose | Resource Limits |
|---|---|---|---|
| PostgreSQL | pgvector/pgvector:0.8.2-pg17 | Relational DB + vector search + taxonomy | 2 CPU, 4 GB RAM |
| Redis | redis:8-alpine | Cache, rate limiting | 1 CPU, 1 GB RAM |
| MinIO | minio/minio | S3-compatible document storage | 0.5 CPU, 512 MB RAM |
| Keycloak | quay.io/keycloak/keycloak:26.5 | OIDC identity provider | 1 CPU, 1 GB RAM |
| Prometheus | prom/prometheus:v3.10.0 | Metrics collection | 0.5 CPU, 512 MB RAM |
| Grafana | grafana/grafana:12.3.0 | Metrics dashboards | 0.5 CPU, 256 MB RAM |
Embedding inference runs against OpenAI's hosted API (text-embedding-3-large, configured for 1536 dimensions). The on-premise Ollama embedding container documented in earlier revisions of this page was retired in April 2026; see ADR-0048 for the migration rationale.
Docker Compose 3-File Overlay
The deployment uses a layered compose pattern with three files:
| File | Purpose | When Used |
|---|---|---|
| docker/docker-compose.infra.yml | Infrastructure services (PostgreSQL, Redis, MinIO, Keycloak, Prometheus, Grafana) | Always |
| docker/docker-compose.app.yml | Application container (FastAPI + nginx + supervisord) | Always |
| docker/docker-compose.ssl.yml | Nginx SSL overlay (port 443, certificates) | Production with TLS |
The infrastructure compose also defines two optional observability services (ClickHouse and Langfuse) behind the observability profile, which are not started by default.
Start Infrastructure Only
cd /opt/zol-rag
# Start all infrastructure services
docker compose -f docker/docker-compose.infra.yml --env-file .env.prod up -d
# Watch them come up (wait until all show "healthy")
docker compose -f docker/docker-compose.infra.yml --env-file .env.prod ps
Full Stack Start (Infrastructure + Application + SSL)
cd /opt/zol-rag
docker compose --env-file .env.prod \
-f docker/docker-compose.infra.yml \
-f docker/docker-compose.app.yml \
-f docker/docker-compose.ssl.yml up -d
This takes 2-5 minutes for all services to reach healthy status. Keycloak takes the longest on first startup due to database initialization and realm import.
Verify All Healthy
Expected output of docker compose ps:
NAME STATUS PORTS
zol-postgres running (healthy)
zol-redis running (healthy)
zol-minio running (healthy)
zol-keycloak running (healthy) 127.0.0.1:8080->8080/tcp
zol-prometheus running (healthy)
zol-grafana running (healthy) 127.0.0.1:3000->3000/tcp
All 6 services should show (healthy).
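The same check can be scripted instead of read by eye. The sketch below is one possible approach, not part of the deployment tooling: it parses the output of docker compose ps --format json and reports any of the six containers that is missing or not healthy. The JSON field names (Name, State, Health) are what recent Docker Compose v2 releases emit; treat them as an assumption and verify against your installed version. The helper names are hypothetical.

```python
import json

# Container names taken from the expected `docker compose ps` output above.
EXPECTED = {
    "zol-postgres", "zol-redis", "zol-minio",
    "zol-keycloak", "zol-prometheus", "zol-grafana",
}


def parse_ps_json(text):
    """Parse `docker compose ps --format json` output.

    Accepts both shapes seen across Compose versions: a single JSON
    array, or one JSON object per line.
    """
    text = text.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def not_healthy(containers, expected=EXPECTED):
    """Return {name: reason} for each expected container that is not running and healthy."""
    seen = {c.get("Name"): c for c in containers}
    problems = {}
    for name in sorted(expected):
        c = seen.get(name)
        if c is None:
            problems[name] = "missing"
        elif c.get("State") != "running" or c.get("Health") != "healthy":
            # Assumed fields: State and Health, as reported by Compose v2.
            problems[name] = f"{c.get('State')}/{c.get('Health') or 'no healthcheck'}"
    return problems
```

Feeding it the text produced by `docker compose -f docker/docker-compose.infra.yml --env-file .env.prod ps --format json` should give an empty dict once all six services are up; anything else names the container to investigate.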
Individual Health Checks
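# $POSTGRES_USER, $POSTGRES_DB and $REDIS_PASSWORD are expanded by the host shell.
# Load them first if needed (assuming .env.prod is shell-compatible): set -a; . ./.env.prod; set +a
# PostgreSQL
docker exec zol-postgres pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB"
# Expected: output ends with "accepting connections"
# Redis
docker exec zol-redis redis-cli -a "$REDIS_PASSWORD" ping
# Expected: PONG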
# Keycloak (verify realm exists)
curl -s http://localhost:8080/realms/zol | python3 -m json.tool
# Expected: JSON with "realm": "zol"
# MinIO
docker exec zol-minio curl -sf http://localhost:9000/minio/health/live
# Expected: exit code 0
Verify Keycloak Realm Import
On first startup, Keycloak automatically imports the "zol" realm from scripts/keycloak/zol-realm.json. Verify it loaded correctly:
# Check the realm is accessible
curl -s http://localhost:8080/realms/zol/.well-known/openid-configuration \
| python3 -c "import sys,json; d=json.load(sys.stdin); print('Issuer:', d['issuer']); print('Token endpoint:', d['token_endpoint'])"
Expected output:
Issuer: http://localhost:8080/realms/zol
Token endpoint: http://localhost:8080/realms/zol/protocol/openid-connect/token
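For an automated version of this comparison, a small validator can check the discovery document against the values shown above. This is a minimal sketch; check_discovery is a hypothetical helper, and the default base URL and realm are simply the ones used on this page.

```python
def check_discovery(doc, base_url="http://localhost:8080", realm="zol"):
    """Return a list of mismatches in an OIDC discovery document (empty list = OK)."""
    expected_issuer = f"{base_url.rstrip('/')}/realms/{realm}"
    expected_token = f"{expected_issuer}/protocol/openid-connect/token"
    problems = []
    if doc.get("issuer") != expected_issuer:
        problems.append(
            f"issuer is {doc.get('issuer')!r}, expected {expected_issuer!r}"
        )
    if doc.get("token_endpoint") != expected_token:
        problems.append(
            f"token_endpoint is {doc.get('token_endpoint')!r}, expected {expected_token!r}"
        )
    return problems
```

To use it, load the curl output with json.load(sys.stdin) and pass the resulting dict in. An issuer that names a different host usually points at a hostname or proxy setting in Keycloak rather than a failed realm import.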
External Dependencies: OpenAI
Embedding inference uses the OpenAI hosted API. Configure the following in .env.prod:
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIMENSIONS=1536
OPENAI_API_KEY=sk-...
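Before starting the stack, the four settings above can be sanity-checked with a few lines of Python. This is a sketch under stated assumptions: it treats .env.prod as simple KEY=VALUE lines, and parse_env and check_embedding_env are hypothetical helpers, not part of the application.

```python
# Values this page says must be present in .env.prod.
REQUIRED = {
    "EMBEDDING_PROVIDER": "openai",
    "EMBEDDING_MODEL": "text-embedding-3-large",
    "EMBEDDING_DIMENSIONS": "1536",
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_embedding_env(values):
    """Return a list of problems with the embedding settings (empty list = OK)."""
    problems = []
    for key, want in REQUIRED.items():
        got = values.get(key)
        if got != want:
            problems.append(f"{key}={got!r}, expected {want!r}")
    if not values.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is missing or empty")
    return problems
```

Calling check_embedding_env(parse_env(open(".env.prod").read())) returns an empty list when the file matches the configuration above. It only checks that the key is non-empty; whether the key is valid is established by the connectivity check.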
Verify reachability from the app container after the application stack is up (see Application):
docker exec zol-rag-app python -c "
from app.services.embedding_service import EmbeddingService
import asyncio

async def main():
    svc = EmbeddingService()
    print('dim=', len(await svc.embed_text('connectivity check')))

asyncio.run(main())
"
# Expected: dim= 1536
The migration trade-off and per-query cost analysis are documented in ADR-0048. The short version: on-prem Ollama bge-m3 embeddings paid a 1.7–5.8 s wall-clock tax per voice turn (cold-start + serialization); hosted text-embedding-3-large returns in 150–211 ms for ≈$0.16/year at pilot volume. Query strings are non-PHI input from the public-website chatbot, so the data-sovereignty argument was not decisive.
Volume Persistence
Data survives container restarts via named Docker volumes:
| Volume | Service | Backup Priority |
|---|---|---|
| zol-postgres-data | PostgreSQL | CRITICAL: all data + embeddings + taxonomy |
| zol-minio-data | MinIO | HIGH: uploaded documents |
| zol-redis-data | Redis | LOW: reconstructable cache |
| zol-prometheus-data | Prometheus | MEDIUM: 30-day metrics |
| zol-grafana-data | Grafana | MEDIUM: dashboards |
Keycloak stores its data in the PostgreSQL database (in a separate keycloak database), so no additional volume is needed for Keycloak persistence. The zol-postgres-data volume covers both application data and Keycloak data.
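The priority column suggests an order for volume backups. The sketch below only builds the commands, highest priority first, using the common pattern of archiving a named volume from a throwaway container. The backup directory is a hypothetical path, and backup_commands is not an existing script. Note that a file-level archive of a running PostgreSQL data directory is not guaranteed to be consistent, so for zol-postgres-data treat this as illustrative only and prefer a database-level dump or a stopped container.

```python
PRIORITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]

# Volume names and priorities copied from the table above.
VOLUMES = [
    ("zol-postgres-data", "CRITICAL"),
    ("zol-minio-data", "HIGH"),
    ("zol-redis-data", "LOW"),
    ("zol-prometheus-data", "MEDIUM"),
    ("zol-grafana-data", "MEDIUM"),
]


def backup_commands(volumes=VOLUMES, backup_dir="/opt/zol-rag/backups"):
    """Build one `docker run` tar command per volume, highest priority first.

    backup_dir is a hypothetical host path; nothing is executed here.
    """
    ranked = sorted(volumes, key=lambda v: PRIORITY_ORDER.index(v[1]))
    commands = []
    for name, _priority in ranked:
        commands.append([
            "docker", "run", "--rm",
            "-v", f"{name}:/data:ro",        # mount the volume read-only
            "-v", f"{backup_dir}:/backup",   # host directory for archives
            "alpine", "tar", "czf", f"/backup/{name}.tar.gz", "-C", "/data", ".",
        ])
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True). Returning the commands rather than running them keeps the ordering logic easy to review before anything touches the volumes.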
Troubleshooting
Service shows (unhealthy)
# Check logs for the specific service
docker logs zol-<service-name> --tail 50
# Check the health check history
docker inspect zol-<service-name> --format='{{json .State.Health.Log}}' | python3 -m json.tool
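The health log can be long, and usually only the failing probes matter. As a sketch, the JSON printed by the docker inspect command above can be filtered like this. The entry fields (Start, ExitCode, Output) are the ones Docker records for each health probe; summarize_health_log is a hypothetical helper.

```python
import json


def summarize_health_log(log_json, max_output=200):
    """Return the failing probes from a .State.Health.Log JSON string."""
    entries = json.loads(log_json) or []  # `null` means no log entries
    failures = []
    for entry in entries:
        if entry.get("ExitCode", 0) != 0:
            failures.append({
                "start": entry.get("Start"),
                "exit_code": entry["ExitCode"],
                # Probe output is truncated to keep the summary readable.
                "output": (entry.get("Output") or "").strip()[:max_output],
            })
    return failures
```

An empty result for a container that still shows (unhealthy) usually means the probe has not run since the last restart, in which case the container logs are the better starting point.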
Keycloak fails to start
# Check Keycloak logs
docker logs zol-keycloak --tail 50
# Common issues:
# 1. PostgreSQL not ready — Keycloak depends on service_healthy, but check anyway
# 2. Port 8080 already in use — check with: ss -tlnp | grep 8080
# 3. Realm import error — check scripts/keycloak/zol-realm.json for valid JSON
PostgreSQL won't start (data directory issue)
# Check the logs for permission errors on the data directory
docker logs zol-postgres --tail 20
# If the logs show permission errors, the volume may need to be recreated (data loss!)
OpenAI Embeddings unreachable
# Confirm the app container can reach OpenAI
# (single quotes make $OPENAI_API_KEY expand inside the container, not in the host shell)
docker exec zol-rag-app sh -c 'curl -sI https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"' | head -1
# Expected: HTTP/2 200 (or HTTP/2 401 if the key is wrong, which still proves connectivity)
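The interpretation of that first response line can be written down as a small function, which is handy when the check runs from a monitoring script. This is a sketch: classify_status_line is a hypothetical helper, and it only encodes the cases named in the comment above.

```python
def classify_status_line(line):
    """Interpret the first line of `curl -sI` output against api.openai.com."""
    parts = line.split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/") or not parts[1].isdigit():
        # Empty or malformed output: curl never received an HTTP response.
        return "unreachable"
    code = int(parts[1])
    if code == 200:
        return "ok"
    if code == 401:
        return "reachable, key rejected"
    return f"reachable, unexpected status {code}"
```

An "unreachable" result points at DNS, firewall, or proxy settings on the host, while "reachable, key rejected" points at the OPENAI_API_KEY value in .env.prod.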
If reachability is fine but embedding calls silently return 0-dimensional vectors, double-check that EMBEDDING_PROVIDER and EMBEDDING_MODEL are not being shadowed by a stale environment: block in any of the compose files. The historical regression that motivated ADR-0048 was exactly this class of configuration drift.
Architectural Evolution
The infrastructure stack has undergone three significant changes since initial deployment. Neo4j (knowledge graph database) was removed in March 2026 after entity relationships were migrated to PostgreSQL taxonomy tables (taxonomy_entities and taxonomy_relationships), simplifying the operational footprint; see ADR-0053 (master record). At the same time, Keycloak was added as the OIDC identity provider, replacing the legacy cookie-based authentication system. In April 2026 the on-premise Ollama embedding container was retired entirely, with embedding inference moving to OpenAI's hosted text-embedding-3-large API (ADR-0048). The net result was roughly 30× faster embeddings on voice turns and ≈1.4 GB of host RAM freed.