Infrastructure Services

Start the 6 core infrastructure services and verify they are healthy. These services persist data on Docker volumes and rarely need to be restarted.

Checklist

  • Start infrastructure with docker compose
  • Wait for all 6 services to show (healthy)
  • Verify Keycloak has imported the "zol" realm
  • Check PostgreSQL, Redis, and Keycloak are accepting connections
  • Verify the OPENAI_API_KEY is set in .env.prod and the host has outbound HTTPS to api.openai.com

Services Overview

| Service | Image | Purpose | Resource Limits |
| --- | --- | --- | --- |
| PostgreSQL | pgvector/pgvector:0.8.2-pg17 | Relational DB + vector search + taxonomy | 2 CPU, 4 GB RAM |
| Redis | redis:8-alpine | Cache, rate limiting | 1 CPU, 1 GB RAM |
| MinIO | minio/minio | S3-compatible document storage | 0.5 CPU, 512 MB RAM |
| Keycloak | quay.io/keycloak/keycloak:26.5 | OIDC identity provider | 1 CPU, 1 GB RAM |
| Prometheus | prom/prometheus:v3.10.0 | Metrics collection | 0.5 CPU, 512 MB RAM |
| Grafana | grafana/grafana:12.3.0 | Metrics dashboards | 0.5 CPU, 256 MB RAM |

External Dependency: OpenAI Embeddings API

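Embedding inference runs against OpenAI's hosted API (text-embedding-3-large at 1536 dimensions, as set by EMBEDDING_DIMENSIONS; the model's native output is 3072 dimensions, so the shorter size has to be requested explicitly). The on-premise Ollama embedding container documented in earlier revisions of this page was retired in April 2026; see ADR-0048 for the migration rationale.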

Docker Compose 3-File Overlay

The deployment uses a layered compose pattern with three files:

| File | Purpose | When Used |
| --- | --- | --- |
| docker/docker-compose.infra.yml | Infrastructure services (PostgreSQL, Redis, MinIO, Keycloak, Prometheus, Grafana) | Always |
| docker/docker-compose.app.yml | Application container (FastAPI + nginx + supervisord) | Always |
| docker/docker-compose.ssl.yml | Nginx SSL overlay (port 443, certificates) | Production with TLS |

The infrastructure compose also defines two optional observability services (ClickHouse and Langfuse) behind the observability profile, which are not started by default.
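The overlay order and the optional profile can be captured in a small helper. The following is a minimal sketch, not part of the deployment tooling: `compose_argv` is a hypothetical name, and the profile name `observability` is taken from the paragraph above.

```python
from typing import Optional, Sequence

INFRA = "docker/docker-compose.infra.yml"
APP = "docker/docker-compose.app.yml"
SSL = "docker/docker-compose.ssl.yml"


def compose_argv(files: Sequence[str], env_file: str = ".env.prod",
                 profile: Optional[str] = None) -> list[str]:
    """Build the argv for `docker compose ... up -d` from an ordered overlay list."""
    argv = ["docker", "compose", "--env-file", env_file]
    if profile:
        # --profile is a top-level compose option, so it goes before `up`
        argv += ["--profile", profile]
    for path in files:
        # Later files override earlier ones, so the order of the list matters
        argv += ["-f", path]
    return argv + ["up", "-d"]
```

Passing the result to `subprocess.run(compose_argv([INFRA], profile="observability"), check=True)` would start the infrastructure together with ClickHouse and Langfuse; `compose_argv([INFRA, APP, SSL])` corresponds to the full-stack start.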

Start Infrastructure Only

cd /opt/zol-rag

# Start all infrastructure services
docker compose -f docker/docker-compose.infra.yml --env-file .env.prod up -d

# Watch them come up (wait until all show "healthy")
docker compose -f docker/docker-compose.infra.yml --env-file .env.prod ps

Full Stack Start (Infrastructure + Application + SSL)

cd /opt/zol-rag

docker compose --env-file .env.prod \
-f docker/docker-compose.infra.yml \
-f docker/docker-compose.app.yml \
-f docker/docker-compose.ssl.yml up -d

Allow 2-5 minutes for all services to reach healthy status. Keycloak takes the longest on first startup because it initializes its database and imports the realm.

Verify All Healthy

Expected output of docker compose ps:

NAME STATUS PORTS
zol-postgres running (healthy)
zol-redis running (healthy)
zol-minio running (healthy)
zol-keycloak running (healthy) 127.0.0.1:8080->8080/tcp
zol-prometheus running (healthy)
zol-grafana running (healthy) 127.0.0.1:3000->3000/tcp

All 6 services should show (healthy).
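For scripted checks, the same status can be read from `docker compose ps --format json` instead of the table. This is a sketch under the assumption that each entry carries `Name` and `Health` fields; depending on the Compose version the output is either one JSON array or one JSON object per line, so both are handled. The function names are illustrative.

```python
import json


def parse_ps(output: str) -> list[dict]:
    """Parse `docker compose ps --format json` output (array or one object per line)."""
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def not_healthy(output: str) -> list[str]:
    """Return the names of containers whose Health is anything other than 'healthy'."""
    return [c.get("Name", "?") for c in parse_ps(output) if c.get("Health") != "healthy"]
```

An empty result only means that no listed container is unhealthy, so a caller should also confirm that `len(parse_ps(output))` is 6 before treating the stack as up.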

Individual Health Checks

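# $POSTGRES_USER, $POSTGRES_DB and $REDIS_PASSWORD are expanded by the host shell.
# Load them first, assuming .env.prod uses plain KEY=value lines:
#   set -a; . ./.env.prod; set +a

# PostgreSQL
docker exec zol-postgres pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB"
# Expected: output ends with "accepting connections"

# Redis
docker exec zol-redis redis-cli -a "$REDIS_PASSWORD" ping
# Expected: PONG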

# Keycloak (verify realm exists)
curl -s http://localhost:8080/realms/zol | python3 -m json.tool
# Expected: JSON with "realm": "zol"

# MinIO
docker exec zol-minio curl -sf http://localhost:9000/minio/health/live
# Expected: exit code 0

Verify Keycloak Realm Import

On first startup, Keycloak automatically imports the "zol" realm from scripts/keycloak/zol-realm.json. Verify it loaded correctly:

# Check the realm is accessible
curl -s http://localhost:8080/realms/zol/.well-known/openid-configuration \
| python3 -c "import sys,json; d=json.load(sys.stdin); print('Issuer:', d['issuer']); print('Token endpoint:', d['token_endpoint'])"

Expected output:

Issuer: http://localhost:8080/realms/zol
Token endpoint: http://localhost:8080/realms/zol/protocol/openid-connect/token
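The comparison against the expected output can also be automated. The sketch below assumes the base URL and realm name shown above; `check_discovery` and `fetch_discovery` are hypothetical helper names, not part of the project.

```python
import json
from urllib.request import urlopen


def check_discovery(doc: dict, base_url: str = "http://localhost:8080",
                    realm: str = "zol") -> list[str]:
    """Return a list of problems found in an OIDC discovery document (empty if none)."""
    expected_issuer = f"{base_url}/realms/{realm}"
    expected_token = f"{expected_issuer}/protocol/openid-connect/token"
    problems = []
    if doc.get("issuer") != expected_issuer:
        problems.append(f"issuer is {doc.get('issuer')!r}, expected {expected_issuer!r}")
    if doc.get("token_endpoint") != expected_token:
        problems.append(
            f"token_endpoint is {doc.get('token_endpoint')!r}, expected {expected_token!r}")
    return problems


def fetch_discovery(base_url: str = "http://localhost:8080", realm: str = "zol") -> dict:
    """Download the discovery document from a running Keycloak instance."""
    url = f"{base_url}/realms/{realm}/.well-known/openid-configuration"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Calling `check_discovery(fetch_discovery())` on the host should return an empty list once the realm import has finished. A mismatched issuer usually means Keycloak is advertising a different hostname than the one being queried.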

External Dependencies: OpenAI

Embedding inference uses the OpenAI hosted API. Configure the following in .env.prod:

EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIMENSIONS=1536
OPENAI_API_KEY=sk-...
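Before starting the application, the four settings can be checked mechanically. This is a simplified sketch: the parser handles plain KEY=value lines only (no `export` prefix, no inline comments), and the function names are illustrative.

```python
EXPECTED = {
    "EMBEDDING_PROVIDER": "openai",
    "EMBEDDING_MODEL": "text-embedding-3-large",
    "EMBEDDING_DIMENSIONS": "1536",
}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def embedding_config_problems(text: str) -> list[str]:
    """Return a list of deviations from the embedding settings listed above."""
    env = parse_env(text)
    problems = [f"{key}={env.get(key)!r}, expected {want!r}"
                for key, want in EXPECTED.items() if env.get(key) != want]
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is missing or empty")
    return problems
```

Running `embedding_config_problems(open(".env.prod").read())` from /opt/zol-rag should return an empty list for a correctly configured host. The check does not validate the key itself.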

Verify reachability from the app container after the application stack is up (see Application):

docker exec zol-rag-app python -c "
from app.services.embedding_service import EmbeddingService
import asyncio

async def main():
    svc = EmbeddingService()
    print('dim=', len(await svc.embed_text('connectivity check')))

asyncio.run(main())
"
# Expected: dim= 1536

Why hosted, not on-prem

The migration trade-off and per-query cost analysis are documented in ADR-0048. The short version: on-prem Ollama bge-m3 embeddings paid a 1.7–5.8 s wall-clock tax per voice turn (cold-start + serialization); hosted text-embedding-3-large returns in 150–211 ms for ≈$0.16/year at pilot volume. The text sent for embedding is public-website chatbot input that contains no PHI, so the data-sovereignty argument was not decisive.
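As a worked check of what those latency figures imply, the snippet below derives the bounds on the speedup using only the numbers quoted above. Pairing the extremes of the two ranges is an assumption made for illustration; ADR-0048 remains the reference for the measured distribution.

```python
# Latency ranges quoted above, in seconds: (min, max)
on_prem_s = (1.7, 5.8)
hosted_s = (0.150, 0.211)


def speedup_bounds(slow: tuple, fast: tuple) -> tuple:
    """Smallest and largest speedup consistent with two (min, max) latency ranges."""
    # Worst case: fastest old run vs slowest new run; best case: the reverse
    return slow[0] / fast[1], slow[1] / fast[0]


low, high = speedup_bounds(on_prem_s, hosted_s)
print(f"speedup between {low:.1f}x and {high:.1f}x")
# prints: speedup between 8.1x and 38.7x
```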

Volume Persistence

Data survives container restarts via named Docker volumes:

| Volume | Service | Backup Priority |
| --- | --- | --- |
| zol-postgres-data | PostgreSQL | CRITICAL: all data + embeddings + taxonomy |
| zol-minio-data | MinIO | HIGH: uploaded documents |
| zol-redis-data | Redis | LOW: reconstructable cache |
| zol-prometheus-data | Prometheus | MEDIUM: 30-day metrics |
| zol-grafana-data | Grafana | MEDIUM: dashboards |

Note: Keycloak stores its data in the PostgreSQL database (in a separate keycloak database), so no additional volume is needed for Keycloak persistence. The zol-postgres-data volume covers both application data and Keycloak data.
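A quick way to confirm that all five volumes exist is to compare the table against the output of `docker volume ls --format '{{.Name}}'`. The sketch below assumes the volumes carry exactly the names listed above, with no Compose project prefix; `missing_volumes` is a hypothetical helper.

```python
EXPECTED_VOLUMES = {
    "zol-postgres-data",
    "zol-minio-data",
    "zol-redis-data",
    "zol-prometheus-data",
    "zol-grafana-data",
}


def missing_volumes(volume_ls_output: str) -> set[str]:
    """Return expected volume names absent from `docker volume ls` name output."""
    present = {line.strip() for line in volume_ls_output.splitlines() if line.strip()}
    return EXPECTED_VOLUMES - present
```

A non-empty result after the first start suggests the compose file names its volumes differently than assumed here, or that a service has never been started.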

Troubleshooting

Service shows (unhealthy)

# Check logs for the specific service
docker logs zol-<service-name> --tail 50

# Check the health check history
docker inspect zol-<service-name> --format='{{json .State.Health.Log}}' | python3 -m json.tool
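The health log printed by the second command is a JSON array of probe results. The sketch below filters it down to the failed probes, assuming the `Start`, `ExitCode` and `Output` fields that Docker records for each health check run; the function name is illustrative.

```python
import json


def failed_probes(health_log_json: str) -> list[tuple[str, int, str]]:
    """Return (start time, exit code, output) for every probe with a non-zero exit code."""
    # Docker prints `null` when no health check has run yet
    entries = json.loads(health_log_json) or []
    return [
        (e.get("Start", ""), e.get("ExitCode", 0), (e.get("Output") or "").strip())
        for e in entries
        if e.get("ExitCode", 0) != 0
    ]
```

Feeding it the output of the `docker inspect` command above leaves only the probes worth reading. The `Output` field usually contains the error message from the health check command itself.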

Keycloak fails to start

# Check Keycloak logs
docker logs zol-keycloak --tail 50

# Common issues:
# 1. PostgreSQL not ready — Keycloak depends on service_healthy, but check anyway
# 2. Port 8080 already in use — check with: ss -tlnp | grep 8080
# 3. Realm import error — check scripts/keycloak/zol-realm.json for valid JSON
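For the third issue, the realm file can be validated without starting Keycloak. This is a minimal sketch that checks only that the file parses and that its top-level `realm` field is "zol"; it does not validate the rest of the realm schema, and `realm_file_problems` is a hypothetical name.

```python
import json


def realm_file_problems(text: str, expected_realm: str = "zol") -> list[str]:
    """Return problems found in a realm export: invalid JSON or an unexpected realm name."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    if data.get("realm") != expected_realm:
        return [f"realm is {data.get('realm')!r}, expected {expected_realm!r}"]
    return []
```

Calling it with the contents of scripts/keycloak/zol-realm.json pinpoints the line and column of a syntax error, which is often quicker than searching the Keycloak startup log.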

PostgreSQL won't start (data directory issue)

# Look for permission errors on the data directory in the logs
docker logs zol-postgres --tail 20
# If the logs show permission errors, the volume may need to be recreated (data loss! back up first if the data is still readable)

OpenAI Embeddings unreachable

# Confirm the app container can reach OpenAI
# ($OPENAI_API_KEY is expanded by the host shell, so it must be set there)
docker exec zol-rag-app curl -sI https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY" | head -1
# Expected: HTTP/2 200 (or HTTP/2 401 if the key is wrong, which still proves connectivity)

If reachability is fine but embedding calls silently return 0-dimensional vectors, check that EMBEDDING_PROVIDER and EMBEDDING_MODEL are not being shadowed by a stale environment: block in any of the compose files. The historical regression that motivated ADR-0048 was exactly this class of configuration drift.
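A plain text scan is usually enough to spot such overrides. The sketch below is an assumption-laden shortcut rather than a YAML parser: it flags any non-comment line in a compose file that assigns one of the two variables, in either list form (`- KEY=value`) or mapping form (`KEY: value`). The function name is illustrative.

```python
import re

KEYS = ("EMBEDDING_PROVIDER", "EMBEDDING_MODEL")


def shadowing_lines(compose_text: str, keys=KEYS) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that assign one of `keys` in a compose file."""
    pattern = re.compile(
        r"^\s*-?\s*[\"']?(" + "|".join(map(re.escape, keys)) + r")\s*[:=]")
    hits = []
    for number, line in enumerate(compose_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # commented-out entries cannot shadow anything
        if pattern.match(line):
            hits.append((number, line.strip()))
    return hits
```

Each hit still needs a human look: a value such as `${EMBEDDING_MODEL}` merely passes the .env.prod setting through, whereas a literal model name overrides it.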

Architectural Evolution

The infrastructure stack has undergone three significant changes since initial deployment. First, Neo4j (knowledge graph database) was removed in March 2026 after entity relationships were migrated to PostgreSQL taxonomy tables (taxonomy_entities and taxonomy_relationships), which simplified the operational footprint; see ADR-0053 (master record). Second, Keycloak was added at the same time as the OIDC identity provider, replacing the legacy cookie-based authentication system. Third, in April 2026 the on-premise Ollama embedding container was retired entirely, with embedding inference moving to OpenAI's hosted text-embedding-3-large API (ADR-0048). The net result was 30× faster embeddings on voice turns and ≈1.4 GB of host RAM freed.
