
Security Architecture

Beyond the content-safety measures that prevent medical advice, the ZOL Intelligent Search implements a security architecture that protects against common web-application threats. The design follows the principle of defence in depth (Schneier, 2000): multiple independent security mechanisms are layered, so that a failure in one does not expose the system. The layering is anchored to the OWASP LLM Top 10 categories that apply to LLM-application infrastructure (in particular LLM06 sensitive information disclosure and LLM07 insecure plugin/integration design) rather than to web-application threats alone.

See also: RFC 6749 (OAuth 2.0), OpenID Connect Core 1.0, RFC 7519 (JWT), and ISO/IEC 27001:2022.

Security Middleware Stack

Every HTTP request passes through five security middleware layers before reaching any application logic. The sections below describe the individual mechanisms.

Authentication: Keycloak OIDC

The authentication system uses Keycloak as an external OpenID Connect (OIDC) identity provider implementing the Authorization Code Flow. The browser obtains an authorization code by redirecting through Keycloak's login form, and the FastAPI backend exchanges that code for access, refresh, and ID tokens at the Keycloak token endpoint. The tokens are returned to the browser as httpOnly cookies (access_token, refresh_token, id_token). There are no custom credential-storage or password-hashing routes: identity is delegated to Keycloak end-to-end.

This is consistent with the OWASP LLM Top 10 (@owasp_llm_top10) LLM06 (sensitive information disclosure) and LLM07 (insecure plugin/integration design) guidance: identity, session lifecycle, and token revocation are owned by an audited identity provider rather than re-implemented in the LLM-application's data plane.
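The first leg of the Authorization Code Flow is a browser redirect to Keycloak's authorization endpoint. The following is a minimal sketch of how such a URL could be assembled, not the project's actual code. The function name, the base URL, the client ID, and the callback URL are hypothetical, and the path layout assumes a recent Keycloak release without the legacy /auth prefix.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(base_url, realm, client_id, callback_url, state):
    """Build a Keycloak authorization-endpoint URL for the code flow.

    All arguments are illustrative; the real backend may assemble
    this URL differently.
    """
    query = urlencode(
        {
            "client_id": client_id,
            "response_type": "code",  # Authorization Code Flow
            "scope": "openid",  # required for an OIDC request
            "redirect_uri": callback_url,  # backend callback endpoint
            "state": state,  # validated frontend URL to return to
        }
    )
    base = base_url.rstrip("/")
    return f"{base}/realms/{realm}/protocol/openid-connect/auth?{query}"


# Hypothetical values, for illustration only.
url = build_authorization_url(
    "https://sso.example.org",
    "zol",
    "example-client",
    "https://app.example.org/api/v1/auth/oidc/callback",
    "/dashboard",
)
params = parse_qs(urlsplit(url).query)
```

Because urlencode percent-encodes each value, a state containing a full URL survives the round trip through Keycloak intact.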

Active OIDC endpoints

| Endpoint | Purpose | Auth required | Notes |
| --- | --- | --- | --- |
| GET /api/v1/auth/oidc/config | Returns the OIDC configuration (login + logout URLs) for the SPA | Public | Returns empty strings when Keycloak is disabled |
| GET /api/v1/auth/oidc/login?redirect_uri=… | Initiates the Authorization Code flow | Public | redirect_uri validated against the CORS allowlist before being passed through Keycloak's state parameter |
| GET /api/v1/auth/oidc/callback?code=…&state=… | Exchanges the authorization code for tokens; sets access_token, refresh_token, id_token httpOnly cookies | Public | The state is the validated frontend URL the browser is redirected to with ?oidc=true |
| POST /api/v1/auth/oidc/refresh | Refreshes the access token via the refresh_token cookie | Public (cookie-bearing) | Calls Keycloak's token endpoint with grant_type=refresh_token |
| GET /api/v1/auth/oidc/logout?post_logout_redirect_uri=… | Terminates the Keycloak SSO session and clears cookies | Bearer / cookie | Forwards id_token_hint from the cookie so Keycloak skips the logout confirmation screen |
| GET /api/v1/auth/me | Returns the currently-authenticated user (JIT-provisions on first call) | Bearer / cookie | Maps keycloak_id (from JWT sub) onto a row in app.users; creates the row if missing |
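The just-in-time provisioning performed by /api/v1/auth/me amounts to a get-or-create keyed on the JWT sub claim. The sketch below illustrates that idea with an in-memory dictionary standing in for the app.users table. The User fields other than keycloak_id, the store class, and the use of an email claim are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    keycloak_id: str
    email: str  # hypothetical column, not confirmed by the source


class InMemoryUserStore:
    """Hypothetical stand-in for the app.users table."""

    def __init__(self):
        self._by_keycloak_id = {}

    def get_or_create(self, claims):
        """Return the user for claims['sub'], creating it on first sight."""
        keycloak_id = claims["sub"]
        user = self._by_keycloak_id.get(keycloak_id)
        if user is None:
            user = User(
                id=len(self._by_keycloak_id) + 1,
                keycloak_id=keycloak_id,
                email=claims.get("email", ""),
            )
            self._by_keycloak_id[keycloak_id] = user
        return user


store = InMemoryUserStore()
first = store.get_or_create({"sub": "kc-123", "email": "a@example.org"})
again = store.get_or_create({"sub": "kc-123"})
other = store.get_or_create({"sub": "kc-456"})
```

A database-backed version would typically need a unique constraint on keycloak_id so that two concurrent first requests cannot create duplicate rows.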

There is intentionally no POST /api/v1/auth/login or POST /api/v1/auth/register route — those names appear nowhere in backend/app/api/auth.py. The SPA never sees raw credentials; the Keycloak-hosted login page is the only credential-collection surface. References to those endpoints in earlier revisions of this page were carried over from an aborted custom-JWT design and have been removed.

Keycloak Configuration

| Property | Value | Rationale |
| --- | --- | --- |
| Realm | zol | Tenant isolation at identity provider level |
| Frontend client | zol-rag-frontend (public) | SPA client using PKCE, no client secret |
| Backend client | zol-rag-backend (confidential) | Service-to-service authentication |
| Token format | RS256 JWT | Asymmetric signing enables stateless validation |
| Access token expiry | Configurable via Keycloak | Managed centrally, not in application code |
| Token validation | python-jose | Backend validates JWT signature, issuer, audience, expiry, and authorized party (azp) |
| Frontend library | @react-keycloak/web | React integration with automatic token refresh |

Why Keycloak OIDC?

| Approach | Credential Management | Session Lifecycle | Standards Compliance |
| --- | --- | --- | --- |
| Custom cookie-based JWT | Application manages passwords, hashing | Application manages blacklists (Redis) | Custom implementation |
| Keycloak OIDC | Delegated to Keycloak | Keycloak manages sessions, revocation | OpenID Connect certified |
| Third-party SaaS (Auth0, Okta) | Delegated to vendor | Vendor manages lifecycle | Standards-compliant but vendor lock-in |

Keycloak was selected for its open-source licensing, self-hosted deployment model (data sovereignty), mature OIDC implementation, and built-in support for realm-based multi-tenancy. The migration from the earlier cookie-based JWT approach eliminated the need for custom token blacklisting infrastructure and centralised identity management in a dedicated security component.

OIDC Redirect URL Validation

The OIDC login, callback, and logout endpoints accept redirect URLs (redirect_uri, state, post_logout_redirect_uri). To prevent open redirect attacks -- where an attacker crafts a login link that redirects to a malicious site after authentication -- all redirect URLs are validated against the configured CORS origin allowlist before use:

  1. Relative paths (e.g., /dashboard) are always allowed
  2. Absolute URLs must have an origin (scheme://host:port) that matches one of the configured CORS_ORIGINS
  3. Protocol-relative URLs (//evil.com) and non-HTTP schemes are rejected
  4. Invalid or missing URLs default to / (safe fallback)

This validation is applied in _validate_redirect_url() and called in all three OIDC endpoints (oidc_login, oidc_callback, oidc_logout).
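The four rules above can be sketched as a small pure function. This is an illustrative reconstruction from the rules as stated, not the body of _validate_redirect_url() itself. The function name, the allowed_origins parameter, and the example origins are assumptions, and the extra rejection of a leading "/\" is a precaution added here because some browsers treat a backslash like a forward slash.

```python
from urllib.parse import urlsplit


def validate_redirect_url(url, allowed_origins):
    """Return url if it is a safe redirect target, otherwise "/"."""
    if not url:
        return "/"  # rule 4: missing URL falls back to "/"
    # Rule 3: protocol-relative URLs are rejected ("/\" is an added precaution).
    if url.startswith(("//", "/\\")):
        return "/"
    # Rule 1: relative paths are always allowed.
    if url.startswith("/"):
        return url
    try:
        parts = urlsplit(url)
    except ValueError:
        return "/"  # rule 4: unparseable URL
    # Rule 3: only http(s) URLs with a host are considered.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "/"
    # Rule 2: the origin must match an allowlisted CORS origin exactly.
    origin = f"{parts.scheme}://{parts.netloc}"
    return url if origin in allowed_origins else "/"


# Hypothetical allowlist, standing in for the configured CORS_ORIGINS.
ALLOWED = {"https://app.example.org", "http://localhost:5173"}
```

Comparing the full scheme://host:port string, rather than matching on a prefix or substring, is what keeps look-alike hosts such as https://app.example.org.evil.com out of the allowlist.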

JWT Token Confusion Prevention

Beyond standard JWT validation (signature, issuer, audience, expiry), the backend performs an authorized party (azp) check to prevent token confusion attacks:

  • Tokens without an azp claim are rejected (returns None, triggering 401)
  • Tokens where azp does not match the configured keycloak_client_id are rejected

This prevents tokens issued for other Keycloak clients in the same realm from being used to authenticate against the backend API -- a subtle attack vector when multiple applications share a Keycloak realm.
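A minimal sketch of the azp check follows. It assumes the token's signature, issuer, audience, and expiry have already been verified and that the claims are available as a dictionary. The function name and the client IDs in the usage lines are illustrative, not taken from the backend.

```python
from typing import Any, Mapping, Optional


def check_authorized_party(
    claims: Mapping[str, Any], expected_client_id: str
) -> Optional[Mapping[str, Any]]:
    """Return the claims if azp matches the expected client, else None.

    A None result is what the caller would translate into a 401.
    """
    azp = claims.get("azp")
    if not azp:
        return None  # tokens without an azp claim are rejected
    if azp != expected_client_id:
        return None  # token was issued to a different client in the realm
    return claims


# Illustrative client IDs; the real value comes from keycloak_client_id.
ok = check_authorized_party({"sub": "kc-123", "azp": "zol-rag-frontend"}, "zol-rag-frontend")
wrong = check_authorized_party({"sub": "kc-123", "azp": "other-app"}, "zol-rag-frontend")
missing = check_authorized_party({"sub": "kc-123"}, "zol-rag-frontend")
```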

Public Endpoints

Certain endpoints bypass Keycloak authentication to support unauthenticated access:

  • Feedback endpoints: Public feedback submission uses plain axios requests without Bearer tokens, enabling anonymous user feedback collection
  • Health check: The /health endpoint remains unauthenticated for infrastructure monitoring

CSRF Protection

Cross-Site Request Forgery (CSRF) attacks exploit the browser's automatic cookie inclusion to make unauthorized requests. The system uses the starlette-csrf middleware, which:

  1. Generates a unique CSRF token per session
  2. Embeds the token in responses (as a cookie readable by JavaScript)
  3. Requires the token in a custom header on state-changing requests
  4. Rejects requests where the header token does not match the cookie token

This double-submit cookie pattern provides CSRF protection without server-side session state.
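The comparison at the heart of the double-submit pattern can be sketched with the standard library alone. This is not the starlette-csrf implementation: the middleware's own token format, cookie name, and header name are not reproduced here, and the function names and the set of exempt methods are assumptions.

```python
import hmac
import secrets

# Methods treated as non-state-changing in this sketch (an assumption).
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def issue_csrf_token() -> str:
    """Generate an unguessable token to place in the CSRF cookie."""
    return secrets.token_urlsafe(32)


def csrf_check_passes(method, cookie_token, header_token) -> bool:
    """Return True if the request may proceed under double-submit rules."""
    if method.upper() in SAFE_METHODS:
        return True
    if not cookie_token or not header_token:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(cookie_token.encode(), header_token.encode())


token = issue_csrf_token()
```

The pattern works because a cross-site attacker can cause the browser to send the cookie but cannot read it, and so cannot copy its value into the custom header.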

Rate Limiting

The slowapi rate limiter prevents abuse through configurable per-endpoint limits:

| Endpoint | Limit | Window | Rationale |
| --- | --- | --- | --- |
| GET /api/v1/auth/oidc/login | 20 requests | 1 minute | Throttle login redirects to Keycloak |
| POST /api/v1/auth/oidc/refresh | 30 requests | 1 minute | Bound refresh-loop attacks |
| POST /api/v1/query | 30 requests | 1 minute | Prevent abuse of the RAG endpoint |
| WS /ws/query | 30 messages | 1 minute | WebSocket abuse prevention |

Rate limit counters are stored in Redis with 1-minute TTL sliding windows, ensuring accurate counting across multiple application instances.
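To illustrate what a sliding-window limit means in practice, the sketch below keeps per-key timestamps in memory. It is a single-process stand-in for the Redis-backed counters and does not reproduce slowapi's internals. The class name, the key format, and the injectable clock are assumptions introduced for the example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """In-memory sliding-window limiter (illustrative, single process only)."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._hits = defaultdict(deque)

    def allow(self, key):
        """Record a hit for key and return True if it is within the limit."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Fake clock so the example does not depend on wall time.
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=30, window_seconds=60.0, clock=lambda: fake_now[0])
first_thirty = [limiter.allow("user-1:/api/v1/query") for _ in range(30)]
thirty_first = limiter.allow("user-1:/api/v1/query")
```

Keeping the counters in a shared store rather than in process memory is what lets every application instance see the same count for a given client.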

Rate Limiting Strategy

Authentication is delegated to Keycloak — there are no application-side login or register endpoints to brute-force. The OIDC redirect endpoint (/oidc/login) is rate-limited to prevent attempted SSO-flow flooding, and the refresh endpoint is rate-limited to bound a stolen-refresh-token replay window. The 30/minute query limit is generous for human use (a user rarely asks more than 5-10 questions per minute) while still blocking automated abuse.

Session and Token Lifecycle

Token lifecycle management is fully delegated to Keycloak:

  1. Session management: Keycloak maintains server-side sessions; logging out invalidates the session and all associated tokens
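  2. Token revocation: Keycloak's revocation endpoint invalidates the session and its refresh tokens without application-side blacklists; because the backend validates access tokens statelessly, an access token that has already been issued remains usable until it expires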
  3. Token refresh: The frontend library (@react-keycloak/web) automatically refreshes tokens before expiry, providing seamless session continuity
  4. Session timeout: Configurable idle and absolute session timeouts are enforced by Keycloak centrally

This approach eliminates the need for application-side token blacklisting infrastructure (previously implemented via Redis) and centralises session policy in the identity provider.

Multi-Tenancy

The system supports user-scoped data isolation, ensuring that:

  • Each user's conversation history is isolated
  • Document uploads are scoped to the uploading user
  • Search analytics are attributed to individual sessions
  • Administrative functions require elevated permissions

Security Headers

The API sets standard security headers on all responses:

| Header | Value | Purpose |
| --- | --- | --- |
| X-Content-Type-Options | nosniff | Prevent MIME sniffing |
| X-Frame-Options | DENY | Prevent clickjacking |
| Strict-Transport-Security | max-age=31536000; includeSubDomains | Force HTTPS (production only, disabled in debug) |
| Referrer-Policy | strict-origin-when-cross-origin | Control referrer information |
| Permissions-Policy | camera=(), microphone=(), geolocation=() | Restrict browser feature access |
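The header set listed above, including the rule that Strict-Transport-Security is sent only outside debug mode, could be expressed as a small helper like the following. The function name and the debug parameter are hypothetical; how the real middleware attaches the headers to responses is not shown.

```python
def security_headers(debug: bool) -> dict:
    """Return the response headers to add, given the debug setting."""
    headers = {
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
        "Referrer-Policy": "strict-origin-when-cross-origin",
        "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    }
    if not debug:
        # HSTS is production-only so local HTTP development keeps working.
        headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    return headers


production = security_headers(debug=False)
development = security_headers(debug=True)
```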

Migration History

The authentication architecture was migrated from a custom cookie-based JWT implementation (with Redis token blacklisting) to Keycloak OIDC in March 2026. The migration was motivated by three factors: (1) centralising identity management in a dedicated security component rather than custom application code, (2) eliminating the operational burden of maintaining a Redis-based token blacklist, and (3) aligning with OpenID Connect standards for interoperability and compliance. The Keycloak realm model also provides a natural path to multi-tenant identity isolation should ZOL expand to additional hospital partners.

References