Security Architecture
Beyond the content-safety measures that prevent medical advice, the ZOL Intelligent Search implements a security architecture protecting against common web-application threats. The security design follows the principle of defence in depth (Schneier, 2000), layering multiple independent security mechanisms; the layering is anchored to the OWASP LLM Top 10 categories that apply to LLM-application infrastructure (in particular LLM06 sensitive information disclosure and LLM07 insecure plugin/integration design) rather than to web-application threats alone.
See also: RFC 6749 (OAuth 2.0), OpenID Connect Core 1.0, RFC 7519 (JWT), and ISO/IEC 27001:2022.
Security Middleware Stack
Every HTTP request passes through five security middleware layers before reaching any application logic:
Authentication: Keycloak OIDC
The authentication system uses Keycloak as an external OpenID Connect (OIDC) identity provider implementing the Authorization Code Flow. The browser obtains an authorization code by redirecting through Keycloak's login form, and the FastAPI backend exchanges that code for access, refresh, and ID tokens at the Keycloak token endpoint. The access token is returned to the browser as an httpOnly cookie. There are no custom credential-storage or password-hashing routes; identity is delegated to Keycloak end-to-end.
This is consistent with the OWASP LLM Top 10 (@owasp_llm_top10) LLM06 (sensitive information disclosure) and LLM07 (insecure plugin/integration design) guidance: identity, session lifecycle, and token revocation are owned by an audited identity provider rather than re-implemented in the LLM-application's data plane.
Active OIDC endpoints
| Endpoint | Purpose | Auth required | Notes |
|---|---|---|---|
| GET /api/v1/auth/oidc/config | Returns the OIDC configuration (login + logout URLs) for the SPA | Public | Returns empty strings when Keycloak is disabled |
| GET /api/v1/auth/oidc/login?redirect_uri=… | Initiates the Authorization Code flow | Public | redirect_uri validated against the CORS allowlist before being passed through Keycloak's state parameter |
| GET /api/v1/auth/oidc/callback?code=…&state=… | Exchanges the authorization code for tokens; sets access_token, refresh_token, id_token httpOnly cookies | Public | The state is the validated frontend URL the browser is redirected to with ?oidc=true |
| POST /api/v1/auth/oidc/refresh | Refreshes the access token via the refresh_token cookie | Public (cookie-bearing) | Calls Keycloak's token endpoint with grant_type=refresh_token |
| GET /api/v1/auth/oidc/logout?post_logout_redirect_uri=… | Terminates the Keycloak SSO session and clears cookies | Bearer / cookie | Forwards id_token_hint from the cookie so Keycloak skips the logout confirmation screen |
| GET /api/v1/auth/me | Returns the currently-authenticated user (JIT-provisions on first call) | Bearer / cookie | Maps keycloak_id (from JWT sub) onto a row in app.users; creates the row if missing |
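The just-in-time provisioning described for /api/v1/auth/me amounts to a get-or-create keyed on the JWT sub claim. The following is a minimal sketch only: the in-memory dict stands in for the app.users table, and the User record and get_or_create_user function are hypothetical names, not taken from backend/app/api/auth.py.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping


@dataclass
class User:
    # Hypothetical record; only keycloak_id is named in the documentation.
    keycloak_id: str


def get_or_create_user(users: Dict[str, User], claims: Mapping[str, Any]) -> User:
    """Return the user for the token's sub claim, creating it on first sight."""
    keycloak_id = claims["sub"]  # Keycloak user ID carried in the JWT
    user = users.get(keycloak_id)
    if user is None:
        # First authenticated call for this identity: provision a row.
        user = User(keycloak_id=keycloak_id)
        users[keycloak_id] = user
    return user
```

In a real database the create step would also need a unique constraint on keycloak_id so that two concurrent first calls cannot insert duplicate rows.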
There is intentionally no POST /api/v1/auth/login or POST /api/v1/auth/register route — those names appear nowhere in backend/app/api/auth.py. The SPA never sees raw credentials; the Keycloak-hosted login page is the only credential-collection surface. References to those endpoints in earlier revisions of this page were carried over from an aborted custom-JWT design and have been removed.
Keycloak Configuration
| Property | Value | Rationale |
|---|---|---|
| Realm | zol | Tenant isolation at identity provider level |
| Frontend client | zol-rag-frontend (public) | SPA client using PKCE, no client secret |
| Backend client | zol-rag-backend (confidential) | Service-to-service authentication |
| Token format | RS256 JWT | Asymmetric signing enables stateless validation |
| Access token expiry | Configurable via Keycloak | Managed centrally, not in application code |
| Token validation | python-jose | Backend validates JWT signature, issuer, audience, expiry, and authorized party (azp) |
| Frontend library | @react-keycloak/web | React integration with automatic token refresh |
Why Keycloak OIDC?
| Approach | Credential Management | Session Lifecycle | Standards Compliance |
|---|---|---|---|
| Custom cookie-based JWT | Application manages passwords, hashing | Application manages blacklists (Redis) | Custom implementation |
| Keycloak OIDC | Delegated to Keycloak | Keycloak manages sessions, revocation | OpenID Connect certified |
| Third-party SaaS (Auth0, Okta) | Delegated to vendor | Vendor manages lifecycle | Standards-compliant but vendor lock-in |
Keycloak was selected for its open-source licensing, self-hosted deployment model (data sovereignty), mature OIDC implementation, and built-in support for realm-based multi-tenancy. The migration from the earlier cookie-based JWT approach eliminated the need for custom token blacklisting infrastructure and centralised identity management in a dedicated security component.
OIDC Redirect URL Validation
The OIDC login, callback, and logout endpoints accept redirect URLs (redirect_uri, state, post_logout_redirect_uri). To prevent open redirect attacks -- where an attacker crafts a login link that redirects to a malicious site after authentication -- all redirect URLs are validated against the configured CORS origin allowlist before use:
- Relative paths (e.g., /dashboard) are always allowed
- Absolute URLs must have an origin (scheme://host:port) that matches one of the configured CORS_ORIGINS
- Protocol-relative URLs (//evil.com) and non-HTTP schemes are rejected
- Invalid or missing URLs default to / (safe fallback)
This validation is applied in _validate_redirect_url() and called in all three OIDC endpoints (oidc_login, oidc_callback, oidc_logout).
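The rules above can be sketched with the standard library alone. This is an illustrative approximation, not the body of _validate_redirect_url(): the function name validate_redirect_url, the example allowlist values, and the extra rejection of backslashes are assumptions made for the sketch.

```python
from typing import FrozenSet, Optional
from urllib.parse import urlsplit

# Hypothetical allowlist; the real values come from the CORS_ORIGINS setting.
EXAMPLE_ORIGINS: FrozenSet[str] = frozenset(
    {"https://search.example.org", "http://localhost:5173"}
)


def validate_redirect_url(
    url: Optional[str], allowed_origins: FrozenSet[str] = EXAMPLE_ORIGINS
) -> str:
    """Return url if it is a safe redirect target, otherwise the fallback "/"."""
    if not url:
        return "/"
    # Protocol-relative URLs and backslash variants can escape the site.
    if url.startswith("//") or "\\" in url:
        return "/"
    try:
        parts = urlsplit(url)
    except ValueError:
        return "/"
    if not parts.scheme and not parts.netloc:
        # Relative reference: accept only paths rooted at "/".
        return url if url.startswith("/") else "/"
    if parts.scheme not in ("http", "https"):
        return "/"
    # Compare the full origin (scheme://host[:port]) against the allowlist.
    origin = f"{parts.scheme}://{parts.netloc}"
    return url if origin in allowed_origins else "/"
```

Comparing the whole origin string, rather than checking a prefix, is what defeats inputs such as https://search.example.org@evil.com/, where the allowlisted host appears only as userinfo.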
JWT Token Confusion Prevention
Beyond standard JWT validation (signature, issuer, audience, expiry), the backend performs an authorized party (azp) check to prevent token confusion attacks:
- Tokens without an azp claim are rejected (returns None, triggering 401)
- Tokens where azp does not match the configured keycloak_client_id are rejected
This prevents tokens issued for other Keycloak clients in the same realm from being used to authenticate against the backend API -- a subtle attack vector when multiple applications share a Keycloak realm.
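The azp check itself is a small post-decode step. The sketch below assumes the claims have already passed signature, issuer, audience, and expiry validation (the document attributes that to python-jose); check_authorized_party is a hypothetical helper name, and the client ID in the usage note is only an example value.

```python
from typing import Any, Mapping, Optional


def check_authorized_party(
    claims: Mapping[str, Any], expected_client_id: str
) -> Optional[Mapping[str, Any]]:
    """Return the claims if azp matches the configured client, else None."""
    azp = claims.get("azp")
    # A missing, empty, or non-string azp is treated as a rejection.
    if not isinstance(azp, str) or not azp:
        return None
    # A token minted for another client in the same realm is rejected.
    if azp != expected_client_id:
        return None
    return claims
```

For example, with zol-rag-frontend as the configured client ID, a token whose azp is some other realm client returns None, which the caller would translate into a 401 response.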
Public Endpoints
Certain endpoints bypass Keycloak authentication to support unauthenticated access:
- Feedback endpoints: Public feedback submission uses plain axios requests without Bearer tokens, enabling anonymous user feedback collection
- Health check: The /health endpoint remains unauthenticated for infrastructure monitoring
CSRF Protection
Cross-Site Request Forgery (CSRF) attacks exploit the browser's automatic cookie inclusion to make unauthorized requests. The system uses the starlette-csrf middleware, which:
- Generates a unique CSRF token per session
- Embeds the token in responses (as a cookie readable by JavaScript)
- Requires the token in a custom header on state-changing requests
- Rejects requests where the header token does not match the cookie token
This double-submit cookie pattern provides CSRF protection without server-side session state.
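The comparison at the heart of the double-submit pattern can be illustrated in a few lines. This is a simplified sketch of the pattern, not the internals of starlette-csrf (which handles token creation and verification itself); the function names and the set of exempt methods are assumptions.

```python
import hmac
import secrets
from typing import Optional

# Methods assumed not to change state, so no token is required for them.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})


def new_csrf_token() -> str:
    """Generate an unguessable token to place in the CSRF cookie."""
    return secrets.token_urlsafe(32)


def csrf_check_passes(
    method: str, cookie_token: Optional[str], header_token: Optional[str]
) -> bool:
    """Accept a state-changing request only if header and cookie tokens match."""
    if method.upper() in SAFE_METHODS:
        return True
    if not cookie_token or not header_token:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(cookie_token.encode(), header_token.encode())
```

The pattern works because a cross-site attacker can cause the browser to send the cookie but cannot read it, and therefore cannot copy its value into the custom header.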
Rate Limiting
The slowapi rate limiter prevents abuse through configurable per-endpoint limits:
| Endpoint | Limit | Window | Rationale |
|---|---|---|---|
| GET /api/v1/auth/oidc/login | 20 requests | 1 minute | Throttle login redirects to Keycloak |
| POST /api/v1/auth/oidc/refresh | 30 requests | 1 minute | Bound refresh-loop attacks |
| POST /api/v1/query | 30 requests | 1 minute | Prevent abuse of the RAG endpoint |
| WS /ws/query | 30 messages | 1 minute | WebSocket abuse prevention |
Rate limit counters are stored in Redis with 1-minute TTL sliding windows, ensuring accurate counting across multiple application instances.
Authentication is delegated to Keycloak, so there are no application-side login or register endpoints to brute-force. The OIDC redirect endpoint (/oidc/login) is rate-limited to prevent flooding of the SSO flow, and the refresh endpoint is rate-limited to bound how quickly a stolen refresh token can be replayed. The 30/minute query limit is generous for human use (a user rarely asks more than 5-10 questions per minute) while still blocking automated abuse.
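To make the sliding-window behaviour concrete, the sketch below counts requests per key within a trailing window. It is an in-memory illustration under stated assumptions: the class SlidingWindowLimiter is hypothetical, it does not reproduce slowapi's own counting strategy, and a plain dict replaces the Redis store that makes the real counters shared across instances.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within any trailing window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Example: the 30 requests per minute limit on the query endpoint.
query_limiter = SlidingWindowLimiter(limit=30, window_seconds=60.0)
```

The caller passes the current time explicitly (for instance time.monotonic()), which keeps the sketch deterministic and easy to test.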
Session and Token Lifecycle
Token lifecycle management is fully delegated to Keycloak:
- Session management: Keycloak maintains server-side sessions; logging out invalidates the session and all associated tokens
- Token revocation: Keycloak's revocation endpoint immediately invalidates tokens without application-side blacklists
- Token refresh: The frontend library (@react-keycloak/web) automatically refreshes tokens before expiry, providing seamless session continuity
- Session timeout: Configurable idle and absolute session timeouts are enforced by Keycloak centrally
This approach eliminates the need for application-side token blacklisting infrastructure (previously implemented via Redis) and centralises session policy in the identity provider.
Multi-Tenancy
The system supports user-scoped data isolation, ensuring that:
- Each user's conversation history is isolated
- Document uploads are scoped to the uploading user
- Search analytics are attributed to individual sessions
- Administrative functions require elevated permissions
Security Headers
The API sets standard security headers on all responses:
| Header | Value | Purpose |
|---|---|---|
| X-Content-Type-Options | nosniff | Prevent MIME sniffing |
| X-Frame-Options | DENY | Prevent clickjacking |
| Strict-Transport-Security | max-age=31536000; includeSubDomains | Force HTTPS (production only, disabled in debug) |
| Referrer-Policy | strict-origin-when-cross-origin | Control referrer information |
| Permissions-Policy | camera=(), microphone=(), geolocation=() | Restrict browser feature access |
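The header set in the table, including the production-only HSTS rule, can be expressed as a single function. This is a sketch: the function name security_headers and the debug flag are assumptions about how the application distinguishes production from debug mode.

```python
from typing import Dict


def security_headers(debug: bool) -> Dict[str, str]:
    """Build the response headers listed in the table above."""
    headers = {
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
        "Referrer-Policy": "strict-origin-when-cross-origin",
        "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    }
    if not debug:
        # HSTS is only meaningful over HTTPS, so it is skipped in debug mode.
        headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    return headers
```

A response middleware would copy these entries onto every outgoing response.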
The authentication architecture was migrated from a custom cookie-based JWT implementation (with Redis token blacklisting) to Keycloak OIDC in March 2026. The migration was motivated by three factors: (1) centralising identity management in a dedicated security component rather than custom application code, (2) eliminating the operational burden of maintaining a Redis-based token blacklist, and (3) aligning with OpenID Connect standards for interoperability and compliance. The Keycloak realm model also provides a natural path to multi-tenant identity isolation should ZOL expand to additional hospital partners.
References
- @owasp_llm_top10. OWASP Top 10 for LLM Applications. LLM06 (sensitive information disclosure) and LLM07 (insecure plugin/integration design) are the load-bearing categories for the auth design above.
- OWASP Foundation. (2021). OWASP Top Ten 2021. Web-application threat taxonomy used for the middleware layering.
- Hardt, D. (Ed.). (2012). The OAuth 2.0 Authorization Framework. RFC 6749, IETF. Protocol that the OIDC Authorization Code flow extends.
- Sakimura, N., Bradley, J., Jones, M., de Medeiros, B., & Mortimore, C. (2014). OpenID Connect Core 1.0. Identity layer on top of OAuth 2.0 used by Keycloak.
- Jones, M., Bradley, J., & Sakimura, N. (2015). JSON Web Token (JWT). RFC 7519, IETF. Token format validated by python-jose.
- Schneier, B. (2000). Secrets and Lies: Digital Security in a Networked World. John Wiley & Sons.
- Zou, A., Wang, Z., Kolter, J. Z., & Fredrikson, M. (2023). Universal and Transferable Adversarial Attacks on Aligned Language Models. arXiv preprint, arXiv:2307.15043. Threat-model precedent for the Adversarial Hardening layer that sits inside the middleware stack.
- Keycloak. (2024). Keycloak Server Administration Guide. Vendor reference for the IdP.