Security Architecture

Beyond the content-safety measures that prevent medical advice, the ZOL Intelligent Search implements a security architecture that protects against common web-application threats. The design follows the principle of defence in depth (Schneier, 2000), layering multiple independent security mechanisms. The layering is anchored to the OWASP LLM Top 10 categories that apply to LLM-application infrastructure, in particular LLM06 (sensitive information disclosure) and LLM07 (insecure plugin/integration design), rather than to web-application threats alone.

Relevant standards: RFC 6749 (OAuth 2.0), OpenID Connect Core 1.0, RFC 7519 (JWT), and ISO/IEC 27001:2022.

Security Middleware Stack

Every HTTP request passes through five security middleware layers before reaching any application logic.

Authentication: Keycloak OIDC

The authentication system uses Keycloak as an external OpenID Connect (OIDC) identity provider implementing the Authorization Code Flow. The browser obtains an authorization code by redirecting through Keycloak's login form, and the FastAPI backend exchanges that code for access, refresh, and ID tokens at the Keycloak token endpoint. The tokens are returned to the browser as httpOnly cookies. There are no custom credential-storage or password-hashing routes; identity is delegated to Keycloak end-to-end.
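The two halves of the flow can be sketched as follows. This is a minimal illustration, not the backend's actual code: the base URL, the choice of `zol-rag-backend` as the client that performs the exchange, and the function names are assumptions, and the endpoint paths assume a recent Keycloak release (older releases prefix them with `/auth`).

```python
from typing import Dict, Tuple
from urllib.parse import urlencode

# Assumed values for illustration; real values come from configuration.
KEYCLOAK_BASE = "https://keycloak.example.org"
REALM = "zol"
CLIENT_ID = "zol-rag-backend"  # assumption: the client used for the exchange


def authorization_url(callback_url: str, state: str) -> str:
    """Build the URL the browser is redirected to for the Keycloak login form."""
    query = urlencode({
        "response_type": "code",      # Authorization Code flow
        "client_id": CLIENT_ID,
        "redirect_uri": callback_url,  # backend callback endpoint
        "scope": "openid",
        "state": state,                # validated frontend URL
    })
    return f"{KEYCLOAK_BASE}/realms/{REALM}/protocol/openid-connect/auth?{query}"


def token_request(code: str, callback_url: str, client_secret: str) -> Tuple[str, Dict[str, str]]:
    """Return the token endpoint URL and the form body for the code exchange."""
    url = f"{KEYCLOAK_BASE}/realms/{REALM}/protocol/openid-connect/token"
    form = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": callback_url,  # must match the value sent in step one
        "client_id": CLIENT_ID,
        "client_secret": client_secret,
    }
    return url, form
```

The form body would be sent as `application/x-www-form-urlencoded` with any HTTP client; the response carries the tokens that the callback then stores in cookies.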

This is consistent with the OWASP LLM Top 10 (@owasp_llm_top10) LLM06 (sensitive information disclosure) and LLM07 (insecure plugin/integration design) guidance: identity, session lifecycle, and token revocation are owned by an audited identity provider rather than re-implemented in the LLM-application's data plane.

Active OIDC endpoints

Endpoint	Purpose	Auth required	Notes
`GET /api/v1/auth/oidc/config`	Returns the OIDC configuration (login + logout URLs) for the SPA	Public	Returns empty strings when Keycloak is disabled
`GET /api/v1/auth/oidc/login?redirect_uri=…`	Initiates the Authorization Code flow	Public	`redirect_uri` validated against the CORS allowlist before being passed through Keycloak's `state` parameter
`GET /api/v1/auth/oidc/callback?code=…&state=…`	Exchanges the authorization code for tokens; sets `access_token`, `refresh_token`, `id_token` httpOnly cookies	Public	The `state` is the validated frontend URL the browser is redirected to with `?oidc=true`
`POST /api/v1/auth/oidc/refresh`	Refreshes the access token via the `refresh_token` cookie	Public (cookie-bearing)	Calls Keycloak's token endpoint with `grant_type=refresh_token`
`GET /api/v1/auth/oidc/logout?post_logout_redirect_uri=…`	Terminates the Keycloak SSO session and clears cookies	Bearer / cookie	Forwards `id_token_hint` from the cookie so Keycloak can skip its logout confirmation screen
`GET /api/v1/auth/me`	Returns the currently-authenticated user (JIT-provisions on first call)	Bearer / cookie	Maps `keycloak_id` (from JWT `sub`) onto a row in `app.users`; creates the row if missing

There is intentionally no POST /api/v1/auth/login or POST /api/v1/auth/register route — those names appear nowhere in backend/app/api/auth.py. The SPA never sees raw credentials; the Keycloak-hosted login page is the only credential-collection surface. References to those endpoints in earlier revisions of this page were carried over from an aborted custom-JWT design and have been removed.

Keycloak Configuration

Property	Value	Rationale
Realm	`zol`	Tenant isolation at identity provider level
Frontend client	`zol-rag-frontend` (public)	SPA client using PKCE, no client secret
Backend client	`zol-rag-backend` (confidential)	Service-to-service authentication
Token format	RS256 JWT	Asymmetric signing enables stateless validation
Access token expiry	Configurable via Keycloak	Managed centrally, not in application code
Token validation	`python-jose`	Backend validates JWT signature, issuer, audience, expiry, and authorized party (`azp`)
Frontend library	`@react-keycloak/web`	React integration with automatic token refresh

Why Keycloak OIDC?

Approach	Credential Management	Session Lifecycle	Standards Compliance
Custom cookie-based JWT	Application manages passwords, hashing	Application manages blacklists (Redis)	Custom implementation
Keycloak OIDC	Delegated to Keycloak	Keycloak manages sessions, revocation	OpenID Connect certified
Third-party SaaS (Auth0, Okta)	Delegated to vendor	Vendor manages lifecycle	Standards-compliant but vendor lock-in

Keycloak was selected for its open-source licensing, self-hosted deployment model (data sovereignty), mature OIDC implementation, and built-in support for realm-based multi-tenancy. The migration from the earlier cookie-based JWT approach eliminated the need for custom token blacklisting infrastructure and centralised identity management in a dedicated security component.

OIDC Redirect URL Validation

The OIDC login, callback, and logout endpoints accept redirect URLs (redirect_uri, state, post_logout_redirect_uri). To prevent open redirect attacks -- where an attacker crafts a login link that redirects to a malicious site after authentication -- all redirect URLs are validated against the configured CORS origin allowlist before use:

Relative paths (e.g., /dashboard) are always allowed
Absolute URLs must have an origin (scheme://host:port) that matches one of the configured CORS_ORIGINS
Protocol-relative URLs (//evil.com) and non-HTTP schemes are rejected
Invalid or missing URLs default to / (safe fallback)

This validation is applied in _validate_redirect_url() and called in all three OIDC endpoints (oidc_login, oidc_callback, oidc_logout).
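The rules above can be sketched with the standard library alone. This is an illustrative approximation of what `_validate_redirect_url()` might look like, not its actual source: the allowlist values are hypothetical, and the backslash rejection is an extra precaution not described in the rules.

```python
from typing import Optional, Set
from urllib.parse import urlsplit

# Hypothetical allowlist; in the application this comes from CORS_ORIGINS.
CORS_ORIGINS = {"https://search.example.org", "http://localhost:3000"}


def validate_redirect_url(url: Optional[str], allowed_origins: Set[str] = CORS_ORIGINS) -> str:
    """Return the URL if it is safe to redirect to, else the fallback "/"."""
    if not url:
        return "/"  # missing URL: safe fallback
    # Protocol-relative URLs are rejected. Backslashes are rejected too
    # (extra precaution: some browsers treat "\" like "/").
    if url.startswith("//") or "\\" in url:
        return "/"
    if url.startswith("/"):
        return url  # relative path: always allowed
    try:
        parts = urlsplit(url)
    except ValueError:
        return "/"  # unparseable URL
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "/"  # non-HTTP scheme or no host
    # netloc includes any userinfo, so "https://good@evil.com" does not match.
    origin = f"{parts.scheme}://{parts.netloc.lower()}"
    return url if origin in allowed_origins else "/"
```

Comparing the full origin string, rather than checking a prefix of the URL, avoids accepting look-alike hosts such as `https://search.example.org.evil.com`.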

JWT Token Confusion Prevention

Beyond standard JWT validation (signature, issuer, audience, expiry), the backend performs an authorized party (azp) check to prevent token confusion attacks:

Tokens without an azp claim are rejected (returns None, triggering 401)
Tokens where azp does not match the configured keycloak_client_id are rejected

This prevents tokens issued for other Keycloak clients in the same realm from being used to authenticate against the backend API -- a subtle attack vector when multiple applications share a Keycloak realm.
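A minimal sketch of the check, assuming the claims have already passed signature, issuer, audience, and expiry validation (for example via `python-jose`). The function name and the example client IDs passed to it are illustrative.

```python
from typing import Any, Mapping, Optional


def check_authorized_party(
    claims: Mapping[str, Any], expected_client_id: str
) -> Optional[Mapping[str, Any]]:
    """Return the claims if `azp` matches the expected client, else None.

    A None result is what the caller would translate into a 401 response.
    """
    azp = claims.get("azp")
    if not isinstance(azp, str):
        return None  # missing (or malformed) azp claim: reject
    if azp != expected_client_id:
        return None  # token was issued to a different client in the realm
    return claims
```

For instance, a token carrying `"azp": "other-app"` would be rejected even though it was signed by the same realm and is otherwise valid.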

Public Endpoints

Certain endpoints bypass Keycloak authentication to support unauthenticated access:

Feedback endpoints: Public feedback submission uses plain axios requests without Bearer tokens, enabling anonymous user feedback collection
Health check: The /health endpoint remains unauthenticated for infrastructure monitoring

CSRF Protection

Cross-Site Request Forgery (CSRF) attacks exploit the browser's automatic cookie inclusion to make unauthorized requests. The system uses the starlette-csrf middleware, which:

Generates a unique CSRF token per session
Embeds the token in responses (as a cookie readable by JavaScript)
Requires the token in a custom header on state-changing requests
Rejects requests where the header token does not match the cookie token

This double-submit cookie pattern provides CSRF protection without server-side session state.
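The comparison at the heart of the double-submit pattern can be sketched as below. This is a simplified illustration rather than the middleware's implementation: the function names are hypothetical, and the real middleware's token format may differ (for example, it may sign tokens).

```python
import hmac
import secrets
from typing import Optional

# Methods that do not change state and therefore skip the check.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def new_csrf_token() -> str:
    """Generate an unguessable token to place in a JavaScript-readable cookie."""
    return secrets.token_urlsafe(32)


def csrf_ok(method: str, cookie_token: Optional[str], header_token: Optional[str]) -> bool:
    """Accept a request only if the header token echoes the cookie token."""
    if method.upper() in SAFE_METHODS:
        return True
    if not cookie_token or not header_token:
        return False  # either side missing: reject
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(cookie_token.encode(), header_token.encode())
```

A cross-site page can cause the browser to send the cookie but cannot read it, so it cannot supply the matching header value.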

Rate Limiting

The slowapi rate limiter prevents abuse through configurable per-endpoint limits:

Endpoint	Limit	Window	Rationale
GET /api/v1/auth/oidc/login	20 requests	1 minute	Throttle login redirects to Keycloak
POST /api/v1/auth/oidc/refresh	30 requests	1 minute	Bound refresh-loop attacks
POST /api/v1/query	30 requests	1 minute	Prevent abuse of the RAG endpoint
WS /ws/query	30 messages	1 minute	WebSocket abuse prevention

Rate limit counters are stored in Redis with 1-minute TTL sliding windows, ensuring accurate counting across multiple application instances.
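To make the sliding-window idea concrete, here is a small in-memory sketch. It is a stand-in for illustration only: the class name is hypothetical, it does not use slowapi or Redis, and a single-process dictionary would not give the cross-instance accuracy that the Redis-backed counters provide.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within any `window_seconds` span."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the window can be tested
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: caller would respond with 429
        hits.append(now)
        return True
```

With `limit=30` and `window_seconds=60`, keyed by client address, this mirrors the 30-requests-per-minute row for the query endpoint.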

Rate Limiting Strategy

Authentication is delegated to Keycloak; there are no application-side login or register endpoints to brute-force. The OIDC redirect endpoint (/oidc/login) is rate-limited to prevent flooding of the SSO flow, and the refresh endpoint is rate-limited to bound how often a stolen refresh token can be replayed. The 30/minute query limit is generous for human use (a user rarely asks more than 5-10 questions per minute) while still blocking automated abuse.

Session and Token Lifecycle

Token lifecycle management is fully delegated to Keycloak:

Session management: Keycloak maintains server-side sessions; logging out invalidates the session and all associated tokens
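Token revocation: Keycloak's revocation endpoint invalidates refresh tokens and their session without application-side blacklists; an access token that the backend validates statelessly by signature remains accepted until it expires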
Token refresh: The frontend library (@react-keycloak/web) automatically refreshes tokens before expiry, providing seamless session continuity
Session timeout: Configurable idle and absolute session timeouts are enforced by Keycloak centrally

This approach eliminates the need for application-side token blacklisting infrastructure (previously implemented via Redis) and centralises session policy in the identity provider.

Multi-Tenancy

The system supports user-scoped data isolation, ensuring that:

Each user's conversation history is isolated
Document uploads are scoped to the uploading user
Search analytics are attributed to individual sessions
Administrative functions require elevated permissions

Security Headers

The API sets standard security headers on all responses:

Header	Value	Purpose
X-Content-Type-Options	nosniff	Prevent MIME sniffing
X-Frame-Options	DENY	Prevent clickjacking
Strict-Transport-Security	max-age=31536000; includeSubDomains	Force HTTPS (production only, disabled in debug)
Referrer-Policy	strict-origin-when-cross-origin	Control referrer information
Permissions-Policy	camera=(), microphone=(), geolocation=()	Restrict browser feature access
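The header set in the table, including the production-only HSTS rule, can be expressed as a small helper. This is a sketch under assumptions: the function name and the `debug` flag are illustrative, and how the dictionary is attached to responses depends on the middleware in use.

```python
from typing import Dict


def security_headers(debug: bool) -> Dict[str, str]:
    """Return the response headers from the table above."""
    headers = {
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
        "Referrer-Policy": "strict-origin-when-cross-origin",
        "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    }
    if not debug:
        # HSTS only in production, so local HTTP development keeps working.
        headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    return headers
```

A response middleware would merge this dictionary into every outgoing response's headers.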

Migration History

The authentication architecture was migrated from a custom cookie-based JWT implementation (with Redis token blacklisting) to Keycloak OIDC in March 2026. The migration was motivated by three factors: (1) centralising identity management in a dedicated security component rather than custom application code, (2) eliminating the operational burden of maintaining a Redis-based token blacklist, and (3) aligning with OpenID Connect standards for interoperability and compliance. The Keycloak realm model also provides a natural path to multi-tenant identity isolation should ZOL expand to additional hospital partners.

References

@owasp_llm_top10. OWASP Top 10 for LLM Applications. LLM06 (sensitive information disclosure) and LLM07 (insecure plugin/integration design) are the load-bearing categories for the auth design above.
OWASP Foundation. (2021). OWASP Top Ten 2021. Web-application threat taxonomy used for the middleware layering.
Hardt, D. (Ed.). (2012). The OAuth 2.0 Authorization Framework. RFC 6749, IETF. Protocol that the OIDC Authorization Code flow extends.
Sakimura, N., Bradley, J., Jones, M., de Medeiros, B., & Mortimore, C. (2014). OpenID Connect Core 1.0. Identity layer on top of OAuth 2.0 used by Keycloak.
Jones, M., Bradley, J., & Sakimura, N. (2015). JSON Web Token (JWT). RFC 7519, IETF. Token format validated by python-jose.
Schneier, B. (2000). Secrets and Lies: Digital Security in a Networked World. John Wiley & Sons.
Zou, A., Wang, Z., Kolter, J. Z., & Fredrikson, M. (2023). Universal and Transferable Adversarial Attacks on Aligned Language Models. arXiv preprint, arXiv:2307.15043. Threat-model precedent for the Adversarial Hardening layer that sits inside the middleware stack.
Keycloak. (2024). Keycloak Server Administration Guide. Vendor reference for the IdP.
