Infrastructure Services

Start the 6 core infrastructure services and verify they are healthy. These services persist data on Docker volumes and rarely need to be restarted.

Checklist

Start infrastructure with docker compose
Wait for all 6 services to show (healthy)
Verify Keycloak has imported the "zol" realm
Check PostgreSQL, Redis, and Keycloak are accepting connections
Verify the OPENAI_API_KEY is set in .env.prod and the host has outbound HTTPS to api.openai.com

Services Overview

Service	Image	Purpose	Resource Limits
PostgreSQL	`pgvector/pgvector:0.8.2-pg17`	Relational DB + vector search + taxonomy	2 CPU, 4 GB RAM
Redis	`redis:8-alpine`	Cache, rate limiting	1 CPU, 1 GB RAM
MinIO	`minio/minio`	S3-compatible document storage	0.5 CPU, 512 MB RAM
Keycloak	`quay.io/keycloak/keycloak:26.5`	OIDC identity provider	1 CPU, 1 GB RAM
Prometheus	`prom/prometheus:v3.10.0`	Metrics collection	0.5 CPU, 512 MB RAM
Grafana	`grafana/grafana:12.3.0`	Metrics dashboards	0.5 CPU, 256 MB RAM

External Dependency: OpenAI Embeddings API

Embedding inference runs against OpenAI's hosted API (text-embedding-3-large, requested at 1536 dimensions via EMBEDDING_DIMENSIONS rather than the model's native 3072). The on-premise Ollama embedding container documented in earlier revisions of this page was retired in April 2026; see ADR-0048 for the migration rationale.

Docker Compose 3-File Overlay

The deployment uses a layered compose pattern with three files:

File	Purpose	When Used
`docker/docker-compose.infra.yml`	Infrastructure services (PostgreSQL, Redis, MinIO, Keycloak, Prometheus, Grafana)	Always
`docker/docker-compose.app.yml`	Application container (FastAPI + nginx + supervisord)	Always
`docker/docker-compose.ssl.yml`	Nginx SSL overlay (port 443, certificates)	Production with TLS

The infrastructure compose also defines two optional observability services (ClickHouse and Langfuse) behind the observability profile, which are not started by default.

Start Infrastructure Only

cd /opt/zol-rag

# Start all infrastructure services
docker compose -f docker/docker-compose.infra.yml --env-file .env.prod up -d

# Watch them come up (wait until all show "healthy")
docker compose -f docker/docker-compose.infra.yml --env-file .env.prod ps

Full Stack Start (Infrastructure + Application + SSL)

cd /opt/zol-rag

docker compose --env-file .env.prod \
  -f docker/docker-compose.infra.yml \
  -f docker/docker-compose.app.yml \
  -f docker/docker-compose.ssl.yml up -d

Allow 2-5 minutes for all services to reach healthy status. Keycloak takes the longest on first startup because it initializes its database and imports the realm.
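
The layered invocation above can also be assembled programmatically, for example from a deployment script. The following is a minimal sketch, not part of the project: the helper name compose_argv is hypothetical, and only the file paths, the .env.prod name, and the observability profile name come from this page.

```python
# File paths as listed in the overlay table; compose_argv itself is a hypothetical helper.
INFRA = "docker/docker-compose.infra.yml"
APP = "docker/docker-compose.app.yml"
SSL = "docker/docker-compose.ssl.yml"


def compose_argv(files, env_file=".env.prod", profile=None, action=("up", "-d")):
    """Build the argument list for a layered `docker compose` call.

    Later files override earlier ones, so the order of `files` matters.
    """
    argv = ["docker", "compose", "--env-file", env_file]
    for path in files:
        argv += ["-f", path]
    if profile:
        # Global option: must appear before the subcommand (up, ps, ...).
        argv += ["--profile", profile]
    argv += list(action)
    return argv


# Full stack, equivalent to the command shown above.
print(" ".join(compose_argv([INFRA, APP, SSL])))

# Infrastructure plus the optional observability services.
print(" ".join(compose_argv([INFRA], profile="observability")))
```

The resulting list can be passed to subprocess.run(argv, check=True, cwd="/opt/zol-rag") when the script should actually start the stack.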

Verify All Healthy

Expected output of docker compose ps:

NAME              STATUS                  PORTS
zol-postgres      running (healthy)
zol-redis         running (healthy)
zol-minio         running (healthy)
zol-keycloak      running (healthy)       127.0.0.1:8080->8080/tcp
zol-prometheus    running (healthy)
zol-grafana       running (healthy)       127.0.0.1:3000->3000/tcp

All 6 services should show (healthy).
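
For scripted checks, the same information is available in machine-readable form. The sketch below assumes the output of docker compose ps --format json, which (depending on the Compose version) is either a JSON array or one JSON object per line, with "Name" and "Health" fields. The function names are hypothetical; the container names are the ones in the expected output above.

```python
import json

# Container names from the expected `docker compose ps` output above.
EXPECTED = {
    "zol-postgres", "zol-redis", "zol-minio",
    "zol-keycloak", "zol-prometheus", "zol-grafana",
}


def parse_ps(output):
    """Parse `docker compose ps --format json` output (array or one object per line)."""
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def not_healthy(output, expected=EXPECTED):
    """Return the sorted names of expected containers that are missing or not healthy."""
    health = {row.get("Name"): row.get("Health", "") for row in parse_ps(output)}
    return sorted(name for name in expected if health.get(name) != "healthy")


# Made-up sample: one healthy container, one still starting, one missing.
sample = "\n".join([
    json.dumps({"Name": "zol-postgres", "State": "running", "Health": "healthy"}),
    json.dumps({"Name": "zol-keycloak", "State": "running", "Health": "starting"}),
])
print(not_healthy(sample, {"zol-postgres", "zol-keycloak", "zol-redis"}))
# prints ['zol-keycloak', 'zol-redis']
```

An empty result means every expected container reports healthy; a wait loop can poll until that is the case or a timeout expires.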

Individual Health Checks

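# $POSTGRES_USER, $POSTGRES_DB, and $REDIS_PASSWORD are expanded by the host shell.
# Load them first (assumes .env.prod contains plain KEY=value lines):
set -a; . /opt/zol-rag/.env.prod; set +a

# PostgreSQL
docker exec zol-postgres pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB"
# Expected: output ends with "accepting connections"

# Redis
docker exec zol-redis redis-cli -a "$REDIS_PASSWORD" ping
# Expected: PONG

# Keycloak (verify realm exists)
curl -s http://localhost:8080/realms/zol | python3 -m json.tool
# Expected: JSON with "realm": "zol"

# MinIO
docker exec zol-minio curl -sf http://localhost:9000/minio/health/live
# Expected: exit code 0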

Verify Keycloak Realm Import

On first startup, Keycloak automatically imports the "zol" realm from scripts/keycloak/zol-realm.json. Verify it loaded correctly:

# Check the realm is accessible
curl -s http://localhost:8080/realms/zol/.well-known/openid-configuration \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print('Issuer:', d['issuer']); print('Token endpoint:', d['token_endpoint'])"

Expected output:

Issuer: http://localhost:8080/realms/zol
Token endpoint: http://localhost:8080/realms/zol/protocol/openid-connect/token
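
The check above can be turned into a pass/fail test. This is a minimal sketch that validates an already-fetched discovery document; the function name check_discovery is hypothetical, and the set of fields it checks (issuer, token_endpoint, jwks_uri) is an illustrative choice rather than the project's own acceptance criteria.

```python
import json


def check_discovery(doc, realm="zol"):
    """Return a list of problems found in an OIDC discovery document (empty if none)."""
    problems = []
    issuer = doc.get("issuer", "")
    if not issuer.endswith("/realms/" + realm):
        problems.append("issuer does not end with /realms/" + realm)
    for key in ("token_endpoint", "jwks_uri"):
        if not doc.get(key):
            problems.append("missing " + key)
    return problems


# Sample shaped like the expected output above.
sample = json.loads("""
{
  "issuer": "http://localhost:8080/realms/zol",
  "token_endpoint": "http://localhost:8080/realms/zol/protocol/openid-connect/token",
  "jwks_uri": "http://localhost:8080/realms/zol/protocol/openid-connect/certs"
}
""")
print(check_discovery(sample))
# prints []
```

To run it against the live endpoint, pipe the curl output into a script that calls check_discovery(json.load(sys.stdin)) and exits non-zero when the list is not empty.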

External Dependencies: OpenAI

Embedding inference uses the OpenAI hosted API. Configure the following in .env.prod:

EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIMENSIONS=1536
OPENAI_API_KEY=sk-...
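
Before starting the application, it can help to confirm that all four settings are present. The sketch below is a hypothetical pre-flight check, not project code: it assumes .env.prod uses plain KEY=value lines, and it never prints the key itself.

```python
# The four settings listed above.
REQUIRED = ("EMBEDDING_PROVIDER", "EMBEDDING_MODEL", "EMBEDDING_DIMENSIONS", "OPENAI_API_KEY")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def embedding_config_problems(values):
    """Return human-readable problems with the embedding settings (empty if none)."""
    problems = [key + " is missing or empty" for key in REQUIRED if not values.get(key)]
    dims = values.get("EMBEDDING_DIMENSIONS", "")
    if dims and not dims.isdigit():
        problems.append("EMBEDDING_DIMENSIONS is not a whole number")
    return problems


# Sample with a placeholder key; a real file would be read with open(".env.prod").read().
sample = """
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIMENSIONS=1536
OPENAI_API_KEY=sk-placeholder
"""
print(embedding_config_problems(parse_env(sample)))
# prints []
```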

Verify reachability from the app container after the application stack is up (see Application):

docker exec zol-rag-app python -c "
from app.services.embedding_service import EmbeddingService
import asyncio
async def main():
    svc = EmbeddingService()
    print('dim=', len(await svc.embed_text('connectivity check')))
asyncio.run(main())
"
# Expected: dim= 1536

Why hosted, not on-prem

The migration trade-off and per-query cost analysis are documented in ADR-0048. The short version: on-prem Ollama bge-m3 embeddings added 1.7–5.8 s of wall-clock time per voice turn (cold start plus serialization), whereas hosted text-embedding-3-large returns in 150–211 ms for ≈$0.16/year at pilot volume. The query strings are non-PHI input from the public-website chatbot, so the data-sovereignty argument was not decisive.

Volume Persistence

Data survives container restarts via named Docker volumes:

Volume	Service	Backup Priority
`zol-postgres-data`	PostgreSQL	CRITICAL — all data + embeddings + taxonomy
`zol-minio-data`	MinIO	HIGH — uploaded documents
`zol-redis-data`	Redis	LOW — reconstructable cache
`zol-prometheus-data`	Prometheus	MEDIUM — 30-day metrics
`zol-grafana-data`	Grafana	MEDIUM — dashboards

Note: Keycloak stores its data in the PostgreSQL database (in a separate keycloak database), so no additional volume is needed for Keycloak persistence. The zol-postgres-data volume covers both application data and Keycloak data.

Troubleshooting

Service shows `(unhealthy)`

# Check logs for the specific service
docker logs zol-<service-name> --tail 50

# Check the health check history
docker inspect zol-<service-name> --format='{{json .State.Health.Log}}' | python3 -m json.tool
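
The health log is a JSON list of probe results, each with Start, End, ExitCode, and Output fields. The sketch below filters it down to the failed probes; the function name failed_probes is hypothetical, and the sample timestamps are made up for illustration.

```python
import json


def failed_probes(log_json, max_chars=200):
    """Return (end_time, exit_code, output) for each failed probe in a health log."""
    entries = json.loads(log_json) or []  # "null" means no probes were recorded
    failures = []
    for entry in entries:
        if entry.get("ExitCode", 0) != 0:
            output = (entry.get("Output") or "").strip()
            failures.append((entry.get("End", "?"), entry["ExitCode"], output[:max_chars]))
    return failures


# Made-up sample in the shape of .State.Health.Log
sample = json.dumps([
    {"Start": "2026-01-01T10:00:00Z", "End": "2026-01-01T10:00:01Z",
     "ExitCode": 0, "Output": "ok"},
    {"Start": "2026-01-01T10:00:30Z", "End": "2026-01-01T10:00:31Z",
     "ExitCode": 1, "Output": "connection refused\n"},
])
for end, code, output in failed_probes(sample):
    print(end, code, output)
# prints 2026-01-01T10:00:31Z 1 connection refused
```

The Output field usually contains the probe command's own error message, which is often enough to tell a slow start from a misconfiguration.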

Keycloak fails to start

# Check Keycloak logs
docker logs zol-keycloak --tail 50

# Common issues:
# 1. PostgreSQL not ready — Keycloak depends on service_healthy, but check anyway
# 2. Port 8080 already in use — check with: ss -tlnp | grep 8080
# 3. Realm import error — check scripts/keycloak/zol-realm.json for valid JSON
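
For the third issue, the realm file can be checked before restarting Keycloak. This is a minimal sketch under the assumption that the file is a single-realm export, that is, a JSON object with a top-level "realm" field. The function name realm_file_problems is hypothetical.

```python
import json


def realm_file_problems(text, expected_realm="zol"):
    """Return problems found in a realm export file's contents (empty list if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return ["invalid JSON at line %d, column %d: %s" % (exc.lineno, exc.colno, exc.msg)]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    if data.get("realm") != expected_realm:
        return ["realm is %r, expected %r" % (data.get("realm"), expected_realm)]
    return []


print(realm_file_problems('{"realm": "zol", "enabled": true}'))
# prints []

# A trailing comma is a common hand-editing mistake and is rejected by the JSON parser.
print(realm_file_problems('{"realm": "zol",}'))
```

In practice the text would come from open("scripts/keycloak/zol-realm.json").read().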

PostgreSQL won't start (data directory issue)

# Check the logs for data-directory or permission errors
docker logs zol-postgres --tail 20
# If the logs show permission errors, the volume may need to be recreated (data loss!)

OpenAI Embeddings unreachable

# Confirm the app container can reach OpenAI.
# Single quotes make $OPENAI_API_KEY expand inside the container, not in the host shell.
docker exec zol-rag-app sh -c 'curl -sI https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY" | head -1'
# Expected: HTTP/2 200 (or HTTP/2 401 if the key is wrong, which still proves connectivity)
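
The interpretation of that status line can be written down as a small decision function. This is a hypothetical helper for use in a monitoring script; the messages are illustrative.

```python
def classify_status_line(line):
    """Map the first line of a `curl -sI` response to a short diagnosis."""
    parts = line.split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/") or not parts[1].isdigit():
        # Empty or garbled output: the request never got an HTTP response.
        return "no HTTP response: check DNS, proxy, and outbound firewall rules"
    code = int(parts[1])
    if code == 200:
        return "reachable, key accepted"
    if code == 401:
        return "reachable, key rejected"
    return "reachable, unexpected status %d" % code


print(classify_status_line("HTTP/2 200"))
# prints reachable, key accepted
print(classify_status_line("HTTP/2 401"))
# prints reachable, key rejected
```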

If reachability is fine but embeds silently return 0-dim vectors, check that EMBEDDING_PROVIDER and EMBEDDING_MODEL are not being shadowed by a stale `environment:` block in any of the compose files. The historical regression that motivated ADR-0048 was exactly this class of drift.
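
One way to look for this kind of shadowing is to compare the rendered compose configuration with the values in .env.prod. The sketch below assumes the rendered configuration comes from docker compose config --format json and has the shape services → name → environment (a key/value mapping). The function name shadowed_settings and the service name "app" in the sample are hypothetical.

```python
# Settings to watch; EMBEDDING_DIMENSIONS is included as an illustrative extra.
WATCHED = ("EMBEDDING_PROVIDER", "EMBEDDING_MODEL", "EMBEDDING_DIMENSIONS")


def shadowed_settings(rendered_config, env_values, watched=WATCHED):
    """Find services whose rendered environment disagrees with the env file.

    Returns (service, key, value_in_compose, value_in_env_file) tuples.
    """
    findings = []
    for service, spec in sorted(rendered_config.get("services", {}).items()):
        environment = spec.get("environment") or {}
        for key in watched:
            value = environment.get(key)
            if value is None or key not in env_values:
                continue
            if str(value) != str(env_values[key]):
                findings.append((service, key, str(value), str(env_values[key])))
    return findings


# Made-up sample: the app service still hard-codes the retired on-prem model.
rendered = {
    "services": {
        "app": {"environment": {"EMBEDDING_PROVIDER": "openai", "EMBEDDING_MODEL": "bge-m3"}},
        "redis": {},
    }
}
env_file = {"EMBEDDING_PROVIDER": "openai", "EMBEDDING_MODEL": "text-embedding-3-large"}
print(shadowed_settings(rendered, env_file))
# prints [('app', 'EMBEDDING_MODEL', 'bge-m3', 'text-embedding-3-large')]
```

Values written as ${EMBEDDING_MODEL} in a compose file are interpolated from the env file and therefore match; only hard-coded leftovers show up as findings.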

Architectural Evolution

The infrastructure stack has undergone three significant changes since initial deployment. In March 2026, Neo4j (the knowledge graph database) was removed after entity relationships were migrated to PostgreSQL taxonomy tables (taxonomy_entities and taxonomy_relationships), simplifying the operational footprint; see ADR-0053 (master record). At the same time, Keycloak was added as the OIDC identity provider, replacing the legacy cookie-based authentication system. In April 2026, the on-premise Ollama embedding container was retired entirely, with embedding inference moving to OpenAI's hosted text-embedding-3-large API (ADR-0048). The net result was 30× faster embeddings on voice turns and ≈1.4 GB of host RAM freed.

Next: Build and Start Application →
