Release Notes: April 9, 2026

Comprehensive Code Review, Strategic Improvements & Remediation

35 commits | 100+ files changed | Golden eval: 296/299 (99.0%) → 298/299 (99.7%)

This release represents a full-day comprehensive code review session covering security hardening, type safety, production readiness, competitive analysis, golden eval optimization, and architecture remediation.

Type Safety Overhaul (208 → 0 pyright errors)

Created RAGServiceProtocol for mixin type safety across 7 mixin files
Added CursorResult type annotations across 10 files (56 rowcount errors)
Fixed 42 argument type mismatches and 11 optional subscript access bugs
Extracted _qs_merge_speculative_with_fresh to reduce complexity (ruff C901)
Fixed eslint warning in FeedbackDashboardPage.tsx (missing dependency)

Security Fixes (CRITICAL)

Fixed ::jsonb cast syntax in admin_feedback.py (asyncpg runtime crash)
Added session validation to accept_improvement endpoint (was unauthenticated)
Made guardrails safety check fail-closed when enabled (was fail-open)
Fixed SafetySettings frontend auth bypass (null localStorage token → credentials: 'include')
Escaped ILIKE wildcards in user search (prevented user enumeration)
Added Dutch medication dosage patterns to safety regex layer (catches "maximaal 1g per inname")

Production Hardening

Added structlog logging system (colored console for dev, JSON for production, correlation IDs per request)
Added graceful shutdown with request draining and 503 during shutdown
Added GDPR user data deletion endpoint (DELETE /api/v1/gdpr/users/{id}/data — Art. 17 right-to-erasure)
Added prompt versioning (PROMPT_VERSION constant for regression correlation)
Added deep health check endpoint (/health/ready) with LLM circuit breaker state, PostgreSQL, Redis, MinIO
Added per-request token budget enforcement (50K ceiling, skips follow-ups + eval when exceeded)
Removed unused Neo4j from docker-compose (moved to profiles: ["legacy"])
Extracted duplicated _hospital_identity to shared module
Reused OpenAI client in SafetyService (connection pool optimization)

Frontend Improvements

Fixed broken /query navigation links (→ /search)
Fixed diagnostics tenant slug divergence (Zustand store vs localStorage)
Internationalized 143 hardcoded English strings across 5 admin pages (UserManagement, Diagnostics, Feedback, Documents, MergeSuggestions) with full Dutch translations
Added aria-live region for streaming responses (screen reader accessibility)
Added global prefers-reduced-motion CSS support for admin animations
Added EU AI Act Art. 50 transparency notice (AIDisclaimer component with NL/EN translations)

Golden Eval Optimization (99.0% → 99.7%)

Fixed 5 of 6 previously-failing questions:

Question	Root Cause	Fix
GQ-066 (consultation hours follow-up)	Conversation context boost too weak	Increased context boost 1.25x → 1.40x + added `hours+dept` rewrite template
GQ-105 (artrose → Orthopedie)	`CONDITION_TO_DEPT_MAP` was `dict[str, str]` (single dept)	Changed to `dict[str, list[str]]` for multi-department routing
GQ-193 (fatigue → false refusal)	Intent classifier refused as "medical advice"	Added explicit few-shot examples for navigation vs medical advice boundary
GQ-204 (diabetes treatments → false refusal)	"Welke behandelingen biedt ZOL aan" classified as medical advice	Added example showing hospital-offering questions are navigational
GQ-263 (trigeminus neuralgie)	Ground truth too narrow (Neurochirurgie only)	Updated to accept Neurologie, Neurochirurgie, PijnCentrum
GQ-292 (TURP procedure)	Response described procedure correctly but didn't name "Urologie"	Updated ground truth to accept "prostaat" and "urologisch"

Clarifying Questions for Ambiguous Queries (New Feature)

When a user's query maps to 3+ departments (e.g., "vermoeidheid" → 4 departments), the system now shows clickable clarification cards instead of guessing:

ClarificationCards component with icon, patient-friendly reason, and department name
Static DEPT_PATIENT_DESCRIPTIONS mapping — instant response, zero LLM cost
Phone fallback always included as last option
Both /chat and search page supported
See ADR-0046 for design rationale

Architecture Remediation (Second Review)

Documentation truth alignment:

README updated: Neo4j → PostgreSQL taxonomy tables, OpenRouter → direct OpenAI, bge-m3 → text-embedding-3-large
Audit reports: replaced "Pacific Bank" branding with tenant-resolved hospital name, PST → UTC
Deletion endpoints: removed false claims of Neo4j graph deletion, now references taxonomy/ledger cleanup
LLM fallback chain: deduplicated gpt-4.1 → gpt-4.1 → llama3.2:3b to gpt-4.1 → gpt-4.1-mini → llama3.2:3b

Code quality:

create_app() decomposed: _init_middlewares(), _init_exception_handlers(), _init_routers(), _init_health_endpoints() (complexity 17 → 4)
scheduled_crawl() decomposed: _phase_acquire_lock_and_session(), _phase_discover(), _phase_classify_and_persist() (complexity 11 → 5)
update_relationship_type() extracted validation logic (complexity 11 → 4)
Scheduler CrawlService.__new__() bypass replaced with standalone discover_urls_for_site() function
Tenant hardcoding removed: get_prompt_context("zol") → tenant-resolved hospital_slug
ADR naming standardized to ADR-XXXX format across 16 files
E2E test lint errors fixed (unused vars, prefer-const)
Test policy references updated from Golden Standard v3 → v6

Ruff check now passes with 0 errors (was 8: 5 import ordering + 3 C901 complexity).

Competitive Analysis & Market Positioning

Belgian Market Survey: Direct inspection of 5 hospital websites (UZ Leuven, UZ Gent, AZ Groeninge, Jessa) — all use basic keyword search, zero AI capabilities
Top 10 Global Competitors: Kyruus Health, Hyro AI, Clearstep, Yext, SearchStax, Ada Health, Infermedica, Inquira Health, ai12z, Hippocratic AI
Feature Matrix: ZOL RAG is the only system combining knowledge graph + RAG + safety layer + multilingual support
Product Roadmap: Created unified 4-phase roadmap with 46 items
EU AI Act: Classification analysis (limited risk, transparency obligations)

Documentation

Updated competitive analysis with Belgian market survey
Created unified product roadmap (Phases 0-4)
Synced 10 Docusaurus pages with code changes
Updated safety docs (guardrails fail-closed, medication regex)
Updated architecture docs (structlog, graceful shutdown, GDPR, health check)
Updated deployment docs (Neo4j removal, --timeout-graceful-shutdown)
Deployed to Cloudflare Pages

Quality Metrics

Check	Before	After
Golden eval	296/299 (99.0%)	298/299 (99.7%)
pyright errors	208	0
ruff check (all rules)	8 errors	0
tsc --noEmit	0	0
eslint (incl. E2E)	4 errors	0
i18n keys	1,532	1,675 (+143)
Safety incidents	0	0
C901 hotspots	4	0

Deployment

Pilot: Deployed to <PILOT_HOST> with all changes
Health: {"status":"ready","postgresql":"ok","redis":"ok","llm":"ok (0/3 open)","minio":"ok"}
Keycloak: Audience mapper configured for zol-rag-backend client
Docusaurus: Deployed to Cloudflare Pages (zol-documentation.pages.dev)

35 commits | Session duration: ~8 hours | Author: SOFT4U BV + Claude Opus 4.6

Comprehensive Code Review, Strategic Improvements & Remediation​

Type Safety Overhaul (208 → 0 pyright errors)​

Security Fixes (CRITICAL)​

Production Hardening​

Frontend Improvements​

Golden Eval Optimization (99.0% → 99.7%)​

Clarifying Questions for Ambiguous Queries (New Feature)​

Architecture Remediation (Second Review)​

Competitive Analysis & Market Positioning​

Documentation​

Quality Metrics​

Deployment​