Release Notes: April 9-10, 2026

Clarifying Questions, YAML Removal & Hospital-Agnostic Phase 5

45 commits | 3 sessions | YAML config deleted forever | Taxonomy 100% DB-backed | Golden eval: 99.7% maintained

This release delivers two major milestones: the clarifying questions feature (from spec to production, including systematic debugging of 4 nested production bugs) and the complete removal of YAML hospital configuration — the final phase of hospital-agnostic decoupling. Adding a new hospital now requires only database records: zero code changes, zero config files, zero deployment.

YAML Config Removal — Taxonomy 100% DB-Backed (Milestone)

The last YAML config file (zol.yaml, 1,300+ lines) has been deleted. The hospital taxonomy system now loads exclusively from the database.

What changed:

warm_taxonomy_cache() loads all hospitals from DB at FastAPI startup (async, ~50ms)
get_taxonomy() serves from pre-warmed in-memory cache (sync, instant)
HospitalTaxonomy.from_db() is now the only construction path
load_hospital_config() function deleted entirely
zol_taxonomy.py converted to lazy module (PEP 562 __getattr__) — no import-time taxonomy loading
taxonomy_prompt.py imports moved inside functions to break import-time dependency chain
Build scripts (golden_page_config.py, zol_scraper.py) updated to use get_taxonomy()
Cold-cache fallback builds minimal taxonomy from defaults (for tests/scripts without DB)
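
The lazy-module conversion listed above can be illustrated with a minimal sketch of a PEP 562 module-level __getattr__. The module name lazy_taxonomy_demo, the TAXONOMY attribute and the _load_taxonomy() loader are hypothetical stand-ins, not the actual contents of zol_taxonomy.py:

```python
import sys
import types

load_calls = []  # records every time the expensive loader actually runs


def _load_taxonomy() -> dict:
    """Hypothetical stand-in for the real, DB-backed taxonomy construction."""
    load_calls.append("load")
    return {"hospital": "zol", "conditions": []}


# Build a throwaway module so the sketch runs from a single file. In a real
# module file you would define `def __getattr__(name): ...` at top level.
lazy_mod = types.ModuleType("lazy_taxonomy_demo")


def _module_getattr(name: str):
    # Called only when normal attribute lookup on the module fails (PEP 562).
    if name == "TAXONOMY":
        value = _load_taxonomy()
        setattr(lazy_mod, name, value)  # cache: later lookups bypass __getattr__
        return value
    raise AttributeError(f"module {lazy_mod.__name__!r} has no attribute {name!r}")


lazy_mod.__getattr__ = _module_getattr
sys.modules[lazy_mod.__name__] = lazy_mod

loads_before_access = len(load_calls)   # nothing is loaded at import time
taxonomy = lazy_mod.TAXONOMY            # first access triggers the loader
loads_after_first = len(load_calls)
_ = lazy_mod.TAXONOMY                   # second access hits the cached attribute
loads_after_second = len(load_calls)
```

The point of the pattern is that importing the module costs nothing; the taxonomy is built on first attribute access and reused afterwards.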

Production verification:

[Taxonomy] Loaded 'zol' (0 depts, 352 conditions)
[Taxonomy] Cache warmed: 1 hospital(s)

Zero YAML files remaining in container — verified post-deployment.

Hospital-Agnostic Architecture — Phase 5 Complete

The YAML removal completes the five-phase hospital-agnostic decoupling that started March 28:

Phase	Date	What	Status
1	Mar 28	DB-backed hospital configs (`app.hospitals` table)	Done
2	Mar 29	DB-backed campuses, departments, domain knowledge	Done
3	Mar 30	Parameterized prompts (no hardcoded "ZOL")	Done
4	Mar 31	HospitalCrawler for multi-site crawling	Done
5	Apr 10	YAML removal — taxonomy 100% from DB	Done

To onboard a new hospital tenant:

Insert a record in app.hospitals (name, short_name, slug, website, phone)
Add campus and department records
Configure domain knowledge in the JSONB config column
No code changes. No config files. No deployment.

Clarifying Questions Trigger Mechanism (Core Feature)

The previous release introduced ClarificationCards as a UI component. This release adds the intelligence layer that determines when to show them.

Data layer — patient-language symptom mapping:

Added 22 Dutch patient-language symptoms (vermoeidheid, duizeligheid, borstpijn, misselijkheid, etc.) to DEPT_CONDITION_KNOWLEDGE
Each symptom maps to 3+ departments, making them genuinely ambiguous
Longest-match-first ordering prevents partial matches (e.g., borstpijn matches before pijn)
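
A minimal sketch of longest-match-first ordering, assuming a plain symptom-to-departments dict. The SYMPTOM_TO_DEPTS name, the match_symptom() helper and the department names are placeholders; the real structure of DEPT_CONDITION_KNOWLEDGE may differ:

```python
from typing import Optional

# Placeholder department names, for illustration only.
SYMPTOM_TO_DEPTS = {
    "pijn": ["dept_a", "dept_b", "dept_c"],
    "borstpijn": ["dept_d", "dept_e", "dept_f"],
}


def match_symptom(query: str, symptoms: dict) -> Optional[str]:
    """Return the longest known symptom term contained in the query, if any."""
    text = query.lower()
    # Sorting by length (descending) makes "borstpijn" win over its substring "pijn".
    for term in sorted(symptoms, key=len, reverse=True):
        if term in text:
            return term
    return None
```

With this ordering, match_symptom("Ik heb borstpijn", SYMPTOM_TO_DEPTS) yields "borstpijn", while a query that only mentions "pijn" still resolves to the shorter term.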

Query scanner — _scan_query_for_ambiguous_conditions:

Module-level function that scans the LLM-reformulated query for known ambiguous conditions
Returns the department list when a condition maps to 3+ departments
Acts as a fallback when the LLM cannot extract structured entities from the query

Null-condition fallback — the key insight:

When a user types "ik ben altijd moe" (I'm always tired), the LLM reformulates to "Welke afdelingen behandelen vermoeidheid?" but returns entities=None because there is no specific medical condition to extract
The fallback scanner catches this: it checks the reformulated query text directly and finds "vermoeidheid" mapped to 4 departments
Without this fallback, vague symptom queries would skip clarification entirely
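
The null-condition fallback can be sketched roughly as follows. This is an illustration under assumptions, not the production code: the KNOWLEDGE dict, the detect_ambiguity() wrapper, the shape of the entities object and the department names are all hypothetical.

```python
from typing import Optional

MIN_DEPARTMENTS = 3  # clarification threshold described in these notes

# Placeholder mapping; department names are illustrative only.
KNOWLEDGE = {
    "vermoeidheid": ["dept_a", "dept_b", "dept_c", "dept_d"],
    "placeholder_symptom": ["dept_a"],
}


def scan_query_for_ambiguous_conditions(query: str, knowledge: dict) -> Optional[list]:
    """Scan the reformulated query text for a condition mapped to 3+ departments."""
    text = query.lower()
    for condition in sorted(knowledge, key=len, reverse=True):
        departments = knowledge[condition]
        if condition in text and len(departments) >= MIN_DEPARTMENTS:
            return departments
    return None


def detect_ambiguity(entities: Optional[dict], reformulated_query: str,
                     knowledge: dict) -> Optional[list]:
    """Primary path uses extracted entities; fallback scans the query text."""
    condition = (entities or {}).get("condition")
    if condition:
        departments = knowledge.get(condition.lower(), [])
        return departments if len(departments) >= MIN_DEPARTMENTS else None
    # entities is None or carries no condition: do NOT return early,
    # let the fallback scanner look at the reformulated text instead.
    return scan_query_for_ambiguous_conditions(reformulated_query, knowledge)


result = detect_ambiguity(None, "Welke afdelingen behandelen vermoeidheid?", KNOWLEDGE)
```

Here result is the four-department list for "vermoeidheid", even though no entities were extracted.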

Pipeline integration — ambiguity short-circuit:

New pipeline stage intercepts before retrieval when ambiguity is detected
Returns StreamChunk(type="clarification") with department cards
Zero LLM cost for ambiguous queries — the response is fully static
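
The short-circuit described above might look like the sketch below. The StreamChunk dataclass is a simplified stand-in for the real class (only type="clarification" is taken from these notes); run_pipeline(), the payload layout and the fake retrieval stage are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, Optional


@dataclass
class StreamChunk:
    """Simplified stand-in for the real StreamChunk."""
    type: str
    payload: dict = field(default_factory=dict)


def run_pipeline(query: str,
                 ambiguous_departments: Optional[list],
                 retrieve_and_generate: Callable[[str], Iterator[StreamChunk]],
                 ) -> Iterator[StreamChunk]:
    # Ambiguity stage: emit a static clarification chunk and stop,
    # so retrieval and LLM generation are never reached.
    if ambiguous_departments:
        yield StreamChunk(type="clarification",
                          payload={"departments": ambiguous_departments})
        return
    yield from retrieve_and_generate(query)


retrieval_calls = []


def fake_retrieval(query: str) -> Iterator[StreamChunk]:
    """Hypothetical downstream stage; records that it was invoked."""
    retrieval_calls.append(query)
    yield StreamChunk(type="answer", payload={"text": "..."})


ambiguous_chunks = list(run_pipeline("moe", ["dept_a", "dept_b", "dept_c"], fake_retrieval))
calls_after_ambiguous = len(retrieval_calls)
normal_chunks = list(run_pipeline("openingsuren", None, fake_retrieval))
```

Because the generator returns right after the clarification chunk, the downstream stage is not called at all for ambiguous queries.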

Frontend integration:

ClarificationCards component with Framer Motion animations and i18n support
ZOL brand colors (white background, teal accents) matching the hospital website
Department-specific icons and patient-friendly descriptions
Phone fallback option always included as the last card
Integrated into PublicChatPage via WebSocket streaming

Production Debugging — 4 Nested Bugs

The clarification feature was spec'd, planned, and implemented cleanly — but did not work on the pilot server. Systematic debugging across 4 deploy-test-debug cycles revealed 4 bugs, each masking the next. This is an educational case study in production debugging.

Bug 1 — Cache bypass (commit 3a9567a): The ambiguity check ran after the cache check in _qs_early_exit_chunks(). A previously cached response for the same query would be returned before the ambiguity logic ever executed. Fix: moved ambiguity check before cache lookup in the pipeline.

Bug 2 — Overly defensive null guard (commit c27787f): The guard clause if not detected_entities: return None in the ambiguity checker was intended to skip when no entities were found. But this is exactly the case where the fallback scanner should run — when the LLM returns entities=None for vague symptom queries. Fix: removed the early return and allowed the fallback scanner to execute.

Bug 3 — Python dict setdefault() no-op (commit 1fcbea7): s.setdefault("detected_intent", c.intent) was silently doing nothing. The pipeline initialization function _qs_init_pipeline() pre-set all keys to None, so setdefault() saw the key as already existing and kept the None value. Fix: changed to direct assignment s["detected_intent"] = c.intent.
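
The setdefault() behavior behind Bug 3 is easy to reproduce in isolation. In this sketch the state dict stands in for the pipeline state and "symptom_query" is a made-up intent value:

```python
# Pipeline init pre-populates every key with None, as _qs_init_pipeline() did.
state = {"detected_intent": None}

# setdefault() only writes when the key is missing. The key exists
# (with value None), so this call leaves the dict unchanged.
returned = state.setdefault("detected_intent", "symptom_query")
value_after_setdefault = state["detected_intent"]

# Direct assignment overwrites regardless of the current value.
state["detected_intent"] = "symptom_query"
value_after_assignment = state["detected_intent"]
```

dict.setdefault() tests for key presence, not for a non-None value, which is why a pre-initialized None silently blocks the update.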

Bug 4 — Slug vs short_name mismatch (commit 3df13ce): The hospital-agnostic refactoring (March 31) stored the hospital slug as ziekenhuis-oost-limburg in the database, but the taxonomy YAML file was named after the short name (zol.yaml). The get_taxonomy() function used the DB slug to construct the YAML path, which raised FileNotFoundError. The exception was caught and logged only at debug level, so the failure went unnoticed. Fix: fetch short_name from the database and use it as the taxonomy config key. This fix became the catalyst for the full YAML removal (Phase 5).

Key takeaway: Each bug masked the next. Fixing the cache ordering revealed the null guard issue. Fixing the null guard revealed the setdefault() no-op. Fixing the pipeline state revealed the taxonomy lookup failure. Without systematic debugging, only the first bug would have been found.

UX Improvements

Navigational-only follow-up questions (commit 3d64188): The LLM was generating diagnostic-style follow-up questions like "Hoe lang voelt u zich al moe?" (How long have you been tired?). This sounds like a doctor screening a patient — exactly the kind of medical advice behavior the system must avoid. Updated the prompt to require navigational questions only (e.g., "Zoekt u informatie over een specifieke afdeling?").

Duplicate disclaimer removal (commit 1fcbea7): The medical disclaimer was showing twice — once in the chat stream and once below the input field. Removed the duplicate.

Card styling and lookup fix (commit cdff8a0):

Card colors updated from dark slate-800 to ZOL brand colors (white background, teal accents)
Case-insensitive department description lookup — CONDITION_TO_DEPT_MAP uses lowercase keys while DEPT_PATIENT_DESCRIPTIONS uses title-case keys; the lookup now normalizes case before matching
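
A minimal sketch of the case-normalizing lookup, assuming both structures are plain dicts. The describe_department() helper and the description text are placeholders, and "cardiologie" is used purely as an example key:

```python
from typing import Optional

# Title-case keys, as described for DEPT_PATIENT_DESCRIPTIONS (placeholder text).
DESCRIPTIONS = {"Cardiologie": "placeholder description"}


def describe_department(name: str, descriptions: dict) -> Optional[str]:
    """Look up a description regardless of the casing used on either side."""
    normalized = {key.casefold(): value for key, value in descriptions.items()}
    return normalized.get(name.casefold())


# A lowercase key, as produced by CONDITION_TO_DEPT_MAP, now resolves correctly.
found = describe_department("cardiologie", DESCRIPTIONS)
missing_without_fix = DESCRIPTIONS.get("cardiologie")
```

In production code the normalized dict would typically be built once rather than on every call; it is rebuilt here only to keep the sketch short.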

Infrastructure

Keycloak URL split (commit 4d1ce55): Separated Keycloak into public and internal URLs. The OIDC redirect was sending users to the Docker-internal hostname instead of the public URL.
SSL overlay documentation (commit c04bc70): Documented the docker-compose.ssl.yml overlay requirement for HTTPS deployments.
Null-safe campus fields (commit 0be806c): DB stores NULL for optional campus fields (phone, address) but Pydantic expects strings. Fixed with or "" coercion.
Test and code review fixes (commits 22d12be, f8f81e1, 1641887): Type ignore for intentional validation test, code review remediation, alphabetical ordering fix.
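
The null-safe campus fix in the list above comes down to coercing NULL to an empty string before validation. In this sketch a plain dataclass stands in for the Pydantic model, and the Campus fields and campus_from_row() helper are assumptions:

```python
from dataclasses import dataclass


@dataclass
class Campus:
    """Stand-in for the Pydantic model; optional fields are typed as str."""
    name: str
    phone: str
    address: str


def campus_from_row(row: dict) -> Campus:
    # DB NULL arrives as None; `or ""` turns None into an empty string
    # so string-typed fields always receive a str.
    return Campus(
        name=row["name"],
        phone=row.get("phone") or "",
        address=row.get("address") or "",
    )


campus = campus_from_row({"name": "Campus A", "phone": None, "address": None})
```

Note that `or ""` also maps other falsy values to an empty string, which is harmless for string fields.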

Documentation

ADR-0046: Clarifying questions design rationale and trigger mechanism
Docusaurus page: docs/rag/clarifying-questions — full technical documentation with Mermaid diagrams
Effort Estimation page: Weekly breakdown of project effort (110-140 prompting hours across 9 weeks, 1,351 commits)
Design specs: Clarifying questions feature spec + trigger mechanism spec
Implementation plans: Two phased plans covering UI and trigger mechanism + YAML removal plan
Release notes and roadmap updates
Deployed to Cloudflare Pages (zol-documentation.pages.dev)

Quality Metrics

Check	Status
Golden eval	298/299 (99.7%) — maintained
pyright	0 errors
ruff check	0 errors
tsc --noEmit	0 errors
eslint	0 errors
Unit tests (clarification + taxonomy)	51 passed
Safety incidents	0
YAML config files	0 (deleted)
load_hospital_config references	0 (removed)

Architecture Notes

Clarifying questions follow a layered approach:

LLM entity extraction — primary path, extracts structured conditions from natural language
Fallback query scanner — when LLM returns no entities, scans reformulated text against DEPT_CONDITION_KNOWLEDGE
Threshold check — fires clarification only when a condition maps to 3+ departments
Static response — zero LLM cost, instant response from DEPT_PATIENT_DESCRIPTIONS

Taxonomy loading follows an eager-init pattern:

Startup — warm_taxonomy_cache() queries all hospitals from DB, builds HospitalTaxonomy instances via from_db()
Runtime — get_taxonomy(key) returns from in-memory cache (sync, zero-cost)
Fallback — cold-cache builds minimal taxonomy from defaults (tests/scripts)
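
The eager-init pattern can be sketched as below. Only the names warm_taxonomy_cache() and get_taxonomy() come from these notes; the dict-based taxonomy, the _fetch_hospitals() query stub and the is_default flag are hypothetical simplifications of HospitalTaxonomy.from_db().

```python
import asyncio

_TAXONOMY_CACHE = {}


async def _fetch_hospitals() -> list:
    """Hypothetical stand-in for the async DB query that lists all hospitals."""
    await asyncio.sleep(0)
    return [{"short_name": "zol", "conditions": []}]


async def warm_taxonomy_cache() -> int:
    """Load every hospital once at startup and return the number cached."""
    for row in await _fetch_hospitals():
        _TAXONOMY_CACHE[row["short_name"]] = {**row, "is_default": False}
    return len(_TAXONOMY_CACHE)


def get_taxonomy(key: str) -> dict:
    """Synchronous lookup; falls back to a minimal default on a cold cache."""
    try:
        return _TAXONOMY_CACHE[key]
    except KeyError:
        # Cold-cache fallback for tests and scripts that run without a DB.
        return {"short_name": key, "conditions": [], "is_default": True}


cold = get_taxonomy("zol")                    # before warm-up: default taxonomy
warmed_count = asyncio.run(warm_taxonomy_cache())
warm = get_taxonomy("zol")                    # after warm-up: cached taxonomy
```

In a FastAPI application the warm-up coroutine would be awaited from the startup or lifespan hook instead of asyncio.run(), so request handlers only ever see the synchronous, in-memory path.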

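Both designs keep the happy path cheap: clarification responses are static and need no LLM call, and taxonomy lookups are served from memory. At the same time, they make the system easier to extend to new hospitals and improve the user experience for vague queries.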

45 commits | 3 sessions | Author: SOFT4U BV + Claude Opus 4.6
