Skip to main content

Release Notes: April 9, 2026

Comprehensive Code Review, Strategic Improvements & Remediation

35 commits | 100+ files changed | Golden eval: 296/299 (99.0%) → 298/299 (99.7%)

This release represents a full-day comprehensive code review session covering security hardening, type safety, production readiness, competitive analysis, golden eval optimization, and architecture remediation.


Type Safety Overhaul (208 → 0 pyright errors)

  • Created RAGServiceProtocol for mixin type safety across 7 mixin files
  • Added CursorResult type annotations across 10 files (56 rowcount errors)
  • Fixed 42 argument type mismatches and 11 optional subscript access bugs
  • Extracted _qs_merge_speculative_with_fresh to reduce complexity (ruff C901)
  • Fixed eslint warning in FeedbackDashboardPage.tsx (missing dependency)

Security Fixes (CRITICAL)

  • Fixed ::jsonb cast syntax in admin_feedback.py (asyncpg runtime crash)
  • Added session validation to accept_improvement endpoint (was unauthenticated)
  • Made guardrails safety check fail-closed when enabled (was fail-open)
  • Fixed SafetySettings frontend auth bypass (null localStorage token → credentials: 'include')
  • Escaped ILIKE wildcards in user search (prevented user enumeration)
  • Added Dutch medication dosage patterns to safety regex layer (catches "maximaal 1g per inname")

Production Hardening

  • Added structlog logging system (colored console for dev, JSON for production, correlation IDs per request)
  • Added graceful shutdown with request draining and 503 during shutdown
  • Added GDPR user data deletion endpoint (DELETE /api/v1/gdpr/users/{id}/data — Art. 17 right-to-erasure)
  • Added prompt versioning (PROMPT_VERSION constant for regression correlation)
  • Added deep health check endpoint (/health/ready) with LLM circuit breaker state, PostgreSQL, Redis, MinIO
  • Added per-request token budget enforcement (50K ceiling, skips follow-ups + eval when exceeded)
  • Removed unused Neo4j from docker-compose (moved to profiles: ["legacy"])
  • Extracted duplicated _hospital_identity to shared module
  • Reused OpenAI client in SafetyService (connection pool optimization)

Frontend Improvements

  • Fixed broken /query navigation links (→ /search)
  • Fixed diagnostics tenant slug divergence (Zustand store vs localStorage)
  • Internationalized 143 hardcoded English strings across 5 admin pages (UserManagement, Diagnostics, Feedback, Documents, MergeSuggestions) with full Dutch translations
  • Added aria-live region for streaming responses (screen reader accessibility)
  • Added global prefers-reduced-motion CSS support for admin animations
  • Added EU AI Act Art. 50 transparency notice (AIDisclaimer component with NL/EN translations)

Golden Eval Optimization (99.0% → 99.7%)

Fixed 5 of 6 previously-failing questions:

QuestionRoot CauseFix
GQ-066 (consultation hours follow-up)Conversation context boost too weakIncreased context boost 1.25x → 1.40x + added hours+dept rewrite template
GQ-105 (artrose → Orthopedie)CONDITION_TO_DEPT_MAP was dict[str, str] (single dept)Changed to dict[str, list[str]] for multi-department routing
GQ-193 (fatigue → false refusal)Intent classifier refused as "medical advice"Added explicit few-shot examples for navigation vs medical advice boundary
GQ-204 (diabetes treatments → false refusal)"Welke behandelingen biedt ZOL aan" classified as medical adviceAdded example showing hospital-offering questions are navigational
GQ-263 (trigeminus neuralgie)Ground truth too narrow (Neurochirurgie only)Updated to accept Neurologie, Neurochirurgie, PijnCentrum
GQ-292 (TURP procedure)Response described procedure correctly but didn't name "Urologie"Updated ground truth to accept "prostaat" and "urologisch"

Clarifying Questions for Ambiguous Queries (New Feature)

When a user's query maps to 3+ departments (e.g., "vermoeidheid" → 4 departments), the system now shows clickable clarification cards instead of guessing:

  • ClarificationCards component with icon, patient-friendly reason, and department name
  • Static DEPT_PATIENT_DESCRIPTIONS mapping — instant response, zero LLM cost
  • Phone fallback always included as last option
  • Both /chat and search page supported
  • See ADR-0046 for design rationale

Architecture Remediation (Second Review)

Documentation truth alignment:

  • README updated: Neo4j → PostgreSQL taxonomy tables, OpenRouter → direct OpenAI, bge-m3 → text-embedding-3-large
  • Audit reports: replaced "Pacific Bank" branding with tenant-resolved hospital name, PST → UTC
  • Deletion endpoints: removed false claims of Neo4j graph deletion, now references taxonomy/ledger cleanup
  • LLM fallback chain: deduplicated gpt-4.1 → gpt-4.1 → llama3.2:3b to gpt-4.1 → gpt-4.1-mini → llama3.2:3b

Code quality:

  • create_app() decomposed: _init_middlewares(), _init_exception_handlers(), _init_routers(), _init_health_endpoints() (complexity 17 → 4)
  • scheduled_crawl() decomposed: _phase_acquire_lock_and_session(), _phase_discover(), _phase_classify_and_persist() (complexity 11 → 5)
  • update_relationship_type() extracted validation logic (complexity 11 → 4)
  • Scheduler CrawlService.__new__() bypass replaced with standalone discover_urls_for_site() function
  • Tenant hardcoding removed: get_prompt_context("zol") → tenant-resolved hospital_slug
  • ADR naming standardized to ADR-XXXX format across 16 files
  • E2E test lint errors fixed (unused vars, prefer-const)
  • Test policy references updated from Golden Standard v3 → v6

Ruff check now passes with 0 errors (was 8: 5 import ordering + 3 C901 complexity).

Competitive Analysis & Market Positioning

  • Belgian Market Survey: Direct inspection of 5 hospital websites (UZ Leuven, UZ Gent, AZ Groeninge, Jessa) — all use basic keyword search, zero AI capabilities
  • Top 10 Global Competitors: Kyruus Health, Hyro AI, Clearstep, Yext, SearchStax, Ada Health, Infermedica, Inquira Health, ai12z, Hippocratic AI
  • Feature Matrix: ZOL RAG is the only system combining knowledge graph + RAG + safety layer + multilingual support
  • Product Roadmap: Created unified 4-phase roadmap with 46 items
  • EU AI Act: Classification analysis (limited risk, transparency obligations)

Documentation

  • Updated competitive analysis with Belgian market survey
  • Created unified product roadmap (Phases 0-4)
  • Synced 10 Docusaurus pages with code changes
  • Updated safety docs (guardrails fail-closed, medication regex)
  • Updated architecture docs (structlog, graceful shutdown, GDPR, health check)
  • Updated deployment docs (Neo4j removal, --timeout-graceful-shutdown)
  • Deployed to Cloudflare Pages

Quality Metrics

CheckBeforeAfter
Golden eval296/299 (99.0%)298/299 (99.7%)
pyright errors2080
ruff check (all rules)8 errors0
tsc --noEmit00
eslint (incl. E2E)4 errors0
i18n keys1,5321,675 (+143)
Safety incidents00
C901 hotspots40

Deployment

  • Pilot: Deployed to 88.99.184.57 with all changes
  • Health: {"status":"ready","postgresql":"ok","redis":"ok","llm":"ok (0/3 open)","minio":"ok"}
  • Keycloak: Audience mapper configured for zol-rag-backend client
  • Docusaurus: Deployed to Cloudflare Pages (zol-documentation.pages.dev)

35 commits | Session duration: ~8 hours | Author: SOFT4U BV + Claude Opus 4.6