Medical Knowledge Architecture
A central design challenge in building a hospital knowledge graph is separating universal medical knowledge from hospital-specific organisational data. Which department handles which condition is a fact about medicine; which doctors work in that department at a particular hospital is a fact about that hospital. Conflating these two layers produces a system that cannot be transferred, audited, or maintained.
This page documents the Three-Source Knowledge Architecture, a principled separation of automated knowledge derivation from human curation. The architecture organises all knowledge inputs by their provenance: automated web scraping (Source 1), standards-based medical ontology (Source 2: SNOMED CT), and irreducible human judgment (Source 3: curated configuration). Prior work reports that ontology-enhanced knowledge graphs improve retrieval accuracy by 22–40% (Jimeno-Yepes et al., 2012; Soman et al., 2024), and the Belgian government mandates SNOMED CT for primary diagnoses by 2027 (FPS Public Health, 2024). Grounding the knowledge graph in these foundations makes each relationship traceable to a published standard, an observable web page, or an explicit human decision.
The Problem: Coupled Domain Knowledge
In early iterations of the graph extraction pipeline, medical relationships were encoded as hardcoded Python constants scattered across multiple source files. A dictionary in medical_extraction.py mapped conditions to departments; another dictionary in typed_nodes.py mapped treatments to conditions; a third in zol_taxonomy.py defined department aliases. This arrangement had three consequences:
- No portability. Hospital-specific data (department names, campus locations, doctor rosters) was interleaved with general medical knowledge (which conditions a cardiology department typically handles). Deploying the system for a second hospital would require disentangling hundreds of constants.
- No auditability. When a clinician asked "why does the graph say Neurochirurgie handles Aneurysma?", the answer was buried in a Python dictionary with no provenance, no generation date, and no model attribution.
- Inconsistent coverage. Each constant was maintained by hand. Some departments had 20 condition mappings; others had none. There was no systematic process to ensure completeness.
The Three-Source Knowledge Architecture resolves all three problems by cleanly separating knowledge inputs by provenance: what the hospital website says (Source 1), what the medical ontology says (Source 2), and what requires human judgment (Source 3).
Architectural Overview: Three-Source Knowledge Architecture
The system organises knowledge inputs by their provenance — the mechanism by which the knowledge was derived. This separation enables automated verification: scraped data can be re-scraped, SNOMED relationships can be validated against the ontology, and curated overrides are explicitly flagged as human judgment.
Why Three Sources?
An audit of the knowledge pipeline identified 57 hardcoded data structures totalling ~3,400 entries across 10 source files. Classification by derivability:
| Category | Constants | Entries | % |
|---|---|---|---|
| Web-scrapable (Source 1) | 12 | ~500 | 15% |
| SNOMED-derivable (Source 2) | 11 | ~700 | 20% |
| LLM-derivable (model-inferable) | 7 | ~260 | 8% |
| Curated (Source 3) | 12 | ~300 | 9% |
| Hybrid (partially derivable) | 17 | ~1,600 | 47% |
85% of hardcoded data is theoretically derivable from automated sources. For an academic project evaluated on architectural quality, relying on hand-maintained Python dictionaries to encode medical knowledge is a defensibility risk, especially when SNOMED CT, a standards-based medical ontology mandated for Belgian primary diagnoses by 2027, is already imported into PostgreSQL.
Source Priority and Confidence
Each knowledge source carries a provenance tag and confidence score:
| Source | Provenance Tag | Confidence | Priority |
|---|---|---|---|
| Curated configuration | source: "curated" | 1.0 | Highest: human judgment overrides all |
| Web scraper | source: "scraper" | 1.0 | High: directly observed on hospital website |
| LLM enrichment | source: "llm_enrichment" | 0.8 | Medium: LLM-classified, human-reviewed |
| SNOMED CT | source: "snomed" | 0.7 | Lower: ontology-derived, may not reflect local practice |
Curated negative maps (e.g., DEPT_CONDITION_NEGATIVE_MAP) filter relationships from all sources equally, ensuring that plausibility guards apply regardless of provenance.
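The priority table and the negative-map filter can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the `Claim` dataclass, the `SOURCE_PRIORITY` ranking, the `resolve_claims()` helper, and the sample negative-map entry are hypothetical. Only the provenance tags, the confidence values, and the name DEPT_CONDITION_NEGATIVE_MAP come from the text above.

```python
from dataclasses import dataclass

# Provenance tags and confidence scores as listed in the table above.
SOURCE_CONFIDENCE = {"curated": 1.0, "scraper": 1.0, "llm_enrichment": 0.8, "snomed": 0.7}
# Hypothetical ranking derived from the Priority column (lower number wins).
SOURCE_PRIORITY = {"curated": 0, "scraper": 1, "llm_enrichment": 2, "snomed": 3}

# Hypothetical sample entry; the real map's contents are not shown on this page.
DEPT_CONDITION_NEGATIVE_MAP = {"Oogheelkunde": {"Hartfalen"}}


@dataclass(frozen=True)
class Claim:
    department: str
    condition: str
    source: str

    @property
    def confidence(self) -> float:
        return SOURCE_CONFIDENCE[self.source]


def resolve_claims(claims: list[Claim]) -> dict[tuple[str, str], Claim]:
    """Keep the highest-priority claim per (department, condition) pair,
    after dropping pairs vetoed by the negative map regardless of source."""
    best: dict[tuple[str, str], Claim] = {}
    for claim in claims:
        vetoed = DEPT_CONDITION_NEGATIVE_MAP.get(claim.department, set())
        if claim.condition in vetoed:
            continue  # the plausibility guard applies to every source equally
        key = (claim.department, claim.condition)
        current = best.get(key)
        if current is None or SOURCE_PRIORITY[claim.source] < SOURCE_PRIORITY[current.source]:
            best[key] = claim
    return best


claims = [
    Claim("Cardiologie", "Hartfalen", "snomed"),
    Claim("Cardiologie", "Hartfalen", "scraper"),
    Claim("Oogheelkunde", "Hartfalen", "curated"),  # vetoed by the negative map
]
resolved = resolve_claims(claims)
```

In this sketch the scraped claim replaces the SNOMED-derived one for the same pair, and the vetoed pair is dropped even though it carries the curated tag.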
Implementation: Code Organisation (Three Layers)
The Three-Source Architecture is implemented through three code layers that mirror the source separation. Each layer has a strict dependency direction: imports flow from general to specific, never the reverse.
Layer 1: Universal Medical Knowledge (medical_knowledge/)
Layer 1 is the portable foundation, implementing the vocabulary and relationship mappings shared by Sources 1, 2, and 3. It contains general Dutch medical knowledge applicable to any Belgian hospital — entity aliases, classification rules, plausibility guards, and LLM-generated relationship mappings. Nothing in this layer references ZOL-specific entities (campuses, department rosters, department-campus assignments). The layer is composed of three distinct module groups:
1a. Dutch Medical Vocabulary (dutch_medical_vocabulary.py)
The vocabulary module (988 lines, 12 sections) encodes general medical-domain knowledge in Dutch. It imports nothing from the rest of the codebase, only the Python standard library (logging, re). This makes it trivially portable to any Dutch-speaking hospital deployment.
The 12 sections:
| Section | Contents | Purpose |
|---|---|---|
| 1. Utility Functions | safe_contains(), _strip_punctuation() | Dutch compound word matching, fuzzy text matching |
| 2. Doctor Name Cleanup | Role tokens, blocklists, clean_doctor_name() | Filter non-physician names (job titles, body parts) |
| 3. Entity Type Classification | ENTITY_TYPE_OVERRIDES, DUAL_ENTITY_MAP | 63 type overrides, 4 dual-entity maps — resolve ambiguous entities (e.g., "radiotherapie" is both a department and a treatment) |
| 4. Condition Normalization | CONDITION_ALIASES, NOT_CONDITIONS, OVERLY_BROAD_CONDITIONS | 108 patient-friendly Dutch aliases ("hoge bloeddruk" → "Hypertensie"), 12 overly-broad guards, noise filtering |
| 5. Treatment Normalization | TREATMENT_ALIASES, NOT_TREATMENTS | 60 aliases, 12 noise terms filtered |
| 6. Examination Normalization | EXAMINATION_CASING, EXAMINATION_ALIASES, lookup_examination_aliases() | 32 casing rules (incl. RX→Röntgen), 22 exam alias groups, case-insensitive lookup |
| 7. Service Aliases | SERVICE_ALIASES | Deduplication map for hospital services |
| 8. Specialty Aliases | SPECIALTY_ALIASES | Medical specialty name normalization |
| 9. Noise & Hub Guards | HUB_CONDITIONS, HUB_TREATMENTS, ENTITY_BLOCKLIST | Prevent over-connected nodes ("pijn", "tumor", "allergie") from corrupting the graph |
| 10. Domain Plausibility Maps | EXAM_DOMAIN_MAP, CONDITION_DOMAIN_MAP, IMAGING_EXAMS | Domain-group-based plausibility rules |
| 11. Plausibility Guard Functions | is_plausible_used_for(), is_plausible_performs() | Runtime validation of extracted relationships |
| 12. Resolver Functions | resolve_condition(), resolve_treatment(), resolve_specialty(), resolve_entity_type() | Query-time and extraction-time alias resolution |
Key design property: Because this module has zero project imports, it can be extracted into a standalone Python package and shared across multiple hospital deployments without modification. The separation was validated by verifying that the import graph flows strictly one-way: zol_taxonomy.py imports from dutch_medical_vocabulary.py, never the reverse.
1b. Standard Belgian Hospital Departments (belgian_hospital_departments.py)
This module defines 56 standard Belgian hospital department names — the canonical department vocabulary shared across Belgian hospitals. It serves two critical roles:
- Enrichment portability. The LLM enrichment script uses STANDARD_DEPARTMENTS as the allowed-values list in its classification prompts. This ensures that generated relationship mappings reference standard department names (e.g., "Cardiologie", "Neurologie") rather than hospital-specific names (e.g., "Hartcentrum Genk", "Beroertecentrum").
- Hospital-specific resolution. The ZOL_TO_STANDARD_MAP dictionary (38 entries) maps ZOL's branded center and program names back to their standard equivalents. During graph seeding, this mapping resolves standard names to hospital-specific canonical names. For a new hospital deployment, only this mapping needs to be replaced.
# Example: ZOL-specific names → standard Belgian equivalents
ZOL_TO_STANDARD_MAP = {
    "Hartcentrum Genk": "Cardiologie",
    "Limburgs Vaatcentrum": "Vaatchirurgie",
    "Beroertecentrum": "Neurologie",
    "Slaapcentrum": "Slaapgeneeskunde",
    # ...
}
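Because the map is keyed by ZOL name, resolving a standard name back to a hospital-specific one means reading it in reverse. The sketch below shows one way to do that. The helper names are hypothetical, and falling back to the standard name when no branded equivalent exists is an assumption, not documented behaviour.

```python
from collections import defaultdict

# The four entries shown above; the remaining entries are elided in the source.
ZOL_TO_STANDARD_MAP = {
    "Hartcentrum Genk": "Cardiologie",
    "Limburgs Vaatcentrum": "Vaatchirurgie",
    "Beroertecentrum": "Neurologie",
    "Slaapcentrum": "Slaapgeneeskunde",
}


def invert_to_hospital_names(hospital_to_standard: dict[str, str]) -> dict[str, list[str]]:
    """Build standard name -> hospital-specific names (one standard name may have several)."""
    inverted: dict[str, list[str]] = defaultdict(list)
    for hospital_name, standard_name in hospital_to_standard.items():
        inverted[standard_name].append(hospital_name)
    return dict(inverted)


def resolve_standard_name(standard_name: str, inverted: dict[str, list[str]]) -> list[str]:
    """Return the hospital-specific names for a standard department.

    Assumption: a standard name without a branded equivalent is used unchanged.
    """
    return inverted.get(standard_name, [standard_name])


STANDARD_TO_ZOL = invert_to_hospital_names(ZOL_TO_STANDARD_MAP)
```

A list is returned because nothing in the source rules out two branded names sharing one standard department.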
1c. Enrichment Modules (Relationship Mappings)
Five Python modules contain pure data dictionaries mapping medical entities to each other. These mappings represent standard medical knowledge — they encode facts like "Cardiologie handles Hartfalen" or "CT Onderzoeken diagnoses Aneurysma" that are true regardless of which hospital's website is being indexed.
Key properties:
- LLM-generated, human-reviewed. The dictionaries are produced by scripts/enrich_taxonomy_llm.py, which calls a Tier 3 (reasoning/advanced) model via OpenAI with the full entity inventories from the hospital's taxonomy database. The LLM's task is not to invent relationships but to map known entities to known departments, a constrained classification task where the LLM's medical knowledge is most reliable.
- Standard department names. The enrichment prompt uses STANDARD_DEPARTMENTS from belgian_hospital_departments.py, ensuring portable output. Post-generation resolution maps standard names back to hospital-specific canonical names during graph seeding.
- Provenance metadata. Each file records the generation date, model used, and a note that human review is required before deployment.
- Five relationship types:
| Module | Relationship | Description | Entities mapped | Total mappings |
|---|---|---|---|---|
| department_conditions.py | HANDLES | Which departments handle which conditions | 148 | 232 |
| department_treatments.py | OFFERS | Which departments offer which treatments | 205 | 327 |
| department_examinations.py | PERFORMS | Which departments perform which examinations | 101 | 164 |
| condition_examinations.py | DIAGNOSES | Which examinations diagnose which conditions | 132 | 308 |
| treatment_conditions.py | TREATS | Which treatments treat which conditions | 67 | 92 |
The distinction between "entities mapped" and "total mappings" reflects the many-to-many nature of medical relationships: a single condition may be handled by multiple departments, and a single department handles multiple conditions.
Layer 2: Hospital-Specific Taxonomy (zol_taxonomy.py, taxonomy/)
The second layer contains everything specific to ZOL as an organisation. Following the layer separation refactoring, zol_taxonomy.py was reduced from approximately 2,300 lines to 1,071 lines — all general Dutch medical vocabulary was extracted to Layer 1. The hospital-specific configuration is driven by a 2,294-line YAML file (zol.yaml) validated through a Pydantic schema. What remains is strictly ZOL-specific:
- Campus definitions (exactly 4): Sint-Jan, André Dumont, Sint-Barbara, and Maas en Kempen, each with addresses, aliases, and contact information.
- Department-campus mappings (DEPARTMENT_CAMPUS_MAP, 149 entries): Which departments operate at which campuses, a fact about ZOL's organisational structure, not about medicine.
- Department roster (DEPARTMENTS, 67 entries): The canonical list of ZOL departments with their aliases, including ZOL-specific branded names (e.g., "Limburgs Vaatcentrum", "Beweegsaam").
- Centre-department mappings (CENTER_DEPARTMENT_MAP, 21 entries, 50 links): Maps multidisciplinary centres to their constituent departments, enabling transitive doctor inference (see Center–Doctor Inference).
- ZOL-specific domain knowledge maps (DEPT_CONDITION_MAP, DEPT_TREATMENT_MAP, EXAM_PERFORMS_MAP): Overrides derived from ZOL's hub pages that take precedence over universal knowledge.
- Search aliases (SEARCH_ALIASES, 239 entries): Query-time resolution rules that map user search terms to canonical ZOL entity names (e.g., "hartfilmpje" to "ECG", "NMR" to "MRI").
- Doctor page classification: URL patterns and rules for identifying authoritative doctor profile pages on ZOL's website.
The taxonomy/ package provides a FrozenTaxonomyRegistry scraped from ZOL's hub pages (authoritative listing pages for all doctors, departments, conditions, and treatments). The registry offers O(1) lookups by canonical name, with pre-built indexes for 359 doctors, 85 department aliases, 168 condition aliases, 217 treatment aliases, and 104 examination aliases.
Import direction: zol_taxonomy.py imports from dutch_medical_vocabulary.py (8 symbols: aliases, guards, utility functions). The reverse import does not exist. This enforces the architectural constraint that general medical knowledge never depends on hospital-specific data.
Layer 3: Graph Seeding (GoldenPageSeeder)
The third layer combines the previous two. The GoldenPageSeeder merges universal medical knowledge with hospital-specific taxonomy data and seeds the resulting relationships into PostgreSQL taxonomy tables:
def _merge_knowledge_maps(
    universal: dict[str, list[str]],
    hospital: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Merge universal medical knowledge with hospital-specific overrides.

    Hospital-specific values take precedence. Deduplication is case-insensitive.
    """
    ...  # body omitted in this excerpt
The merge follows a simple precedence rule: hospital-specific data takes priority. If the taxonomy's hub pages say that "Hartfalen" is handled by "Cardiologie" and "Geriatrie", and the universal knowledge layer also maps "Hartfalen" to "Cardiologie" and "Interne Geneeskunde", the merged result includes all three departments — but "Cardiologie" and "Geriatrie" (from the hospital's own data) are listed first, and duplicate entries are removed via case-insensitive deduplication.
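The precedence rule described above can be sketched as a small pure function. This is an illustrative reconstruction from the prose, not the seeder's actual body. The name `merge_knowledge_maps_sketch` is hypothetical, and matching keys case-insensitively (keeping the hospital's spelling) is an assumption.

```python
def merge_knowledge_maps_sketch(
    universal: dict[str, list[str]],
    hospital: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Merge two entity -> departments maps, listing hospital values first."""
    merged: dict[str, list[str]] = {}
    key_spelling: dict[str, str] = {}  # lower-cased key -> spelling kept in the output

    # Process the hospital map first so its values lead each merged list.
    for source in (hospital, universal):
        for key, values in source.items():
            canonical_key = key_spelling.setdefault(key.lower(), key)
            bucket = merged.setdefault(canonical_key, [])
            seen = {value.lower() for value in bucket}
            for value in values:
                if value.lower() not in seen:  # case-insensitive deduplication
                    seen.add(value.lower())
                    bucket.append(value)
    return merged


hospital_map = {"Hartfalen": ["Cardiologie", "Geriatrie"]}
universal_map = {"Hartfalen": ["Cardiologie", "Interne Geneeskunde"]}
merged_map = merge_knowledge_maps_sketch(universal_map, hospital_map)
```

With the Hartfalen example from the paragraph above, the merged list is Cardiologie, Geriatrie, Interne Geneeskunde: hospital values lead and the duplicate Cardiologie appears once.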
Layer Separation: Implementation Details
The layer separation refactoring was the most significant architectural change to the knowledge architecture since its initial design. This section documents how the separation was executed and what it enables.
Before: Monolithic Taxonomy
Before the refactoring, zol_taxonomy.py was a single 2,300+ line file containing both general Dutch medical knowledge and ZOL-specific data. The file had grown organically through iterative graph quality fixes (v1 through v7), accumulating condition aliases, treatment aliases, examination casing rules, entity type overrides, plausibility guards, hub condition blocklists, and domain plausibility maps — all interleaved with ZOL campus definitions and department mappings.
The dependency graph was flat: every module that needed medical knowledge imported from zol_taxonomy.py, making it impossible to reuse medical knowledge without also importing ZOL-specific data.
After: Three-Layer Dependency Graph
The refactoring enforced a strict dependency rule: imports flow from general to specific, never the reverse. The dutch_medical_vocabulary.py module sits at the bottom of the dependency graph with zero project imports. zol_taxonomy.py imports 8 symbols from it (aliases, guards, utility functions). Consumer modules (medical_extraction.py, typed_nodes.py) import from both layers but never create circular dependencies.
What Moved Where
| Content | Before (location) | After (location) | Lines moved |
|---|---|---|---|
| Condition aliases (108) | zol_taxonomy.py | dutch_medical_vocabulary.py | ~80 |
| Treatment aliases (60) | zol_taxonomy.py | dutch_medical_vocabulary.py | ~40 |
| Examination aliases (22 groups) | zol_taxonomy.py | dutch_medical_vocabulary.py | ~60 |
| Entity type overrides | zol_taxonomy.py | dutch_medical_vocabulary.py | ~50 |
| Hub condition/treatment guards | zol_taxonomy.py | dutch_medical_vocabulary.py | ~80 |
| Domain plausibility maps | zol_taxonomy.py | dutch_medical_vocabulary.py | ~100 |
| Doctor name cleanup rules | zol_taxonomy.py | dutch_medical_vocabulary.py | ~120 |
| Plausibility functions | zol_taxonomy.py | dutch_medical_vocabulary.py | ~70 |
| Resolver functions | zol_taxonomy.py | dutch_medical_vocabulary.py | ~50 |
| Service/specialty aliases | zol_taxonomy.py | dutch_medical_vocabulary.py | ~30 |
| Noise guards & blocklists | zol_taxonomy.py | dutch_medical_vocabulary.py | ~50 |
| Standard dept names | (did not exist) | belgian_hospital_departments.py | 138 (new) |
The total migration moved approximately 830 lines of general medical knowledge out of the hospital-specific module. Subsequent growth from iterative quality fixes brought the current counts to dutch_medical_vocabulary.py (988 lines) and zol_taxonomy.py (1,071 lines, down from ~2,300). The reduction in zol_taxonomy.py exceeds the lines migrated because the YAML-driven hospital configuration (zol.yaml, 2,294 lines) now carries the bulk of hospital-specific data declarations.
Registry Fallback Enhancement
The FrozenTaxonomyRegistry (Layer 2) provides hospital-specific entity resolution at query time and extraction time. When the registry does not contain a match for an entity, the system now falls back to the general medical knowledge maps in Layer 1 (dutch_medical_vocabulary.py) rather than to ZOL-specific maps. This ensures that fallback resolution remains portable and does not introduce hospital-specific assumptions into the general resolution path.
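The fallback order can be illustrated with a small stand-in for the registry. Everything in this sketch is an assumption made for illustration: `RegistrySketch` is not the FrozenTaxonomyRegistry API, and `resolve_condition_with_fallback()` is a hypothetical helper. The single alias shown is one quoted elsewhere on this page.

```python
from types import MappingProxyType
from typing import Mapping, Optional


class RegistrySketch:
    """Hypothetical stand-in for the Layer 2 registry: a read-only alias index."""

    def __init__(self, alias_to_canonical: Mapping[str, str]) -> None:
        # Lower-case the keys once so every lookup is a single dict hit.
        self._index = MappingProxyType(
            {alias.lower(): canonical for alias, canonical in alias_to_canonical.items()}
        )

    def lookup(self, term: str) -> Optional[str]:
        return self._index.get(term.strip().lower())


# One Layer 1 alias quoted on this page; the full map has 108 entries.
CONDITION_ALIASES = {"hoge bloeddruk": "Hypertensie"}


def resolve_condition_with_fallback(
    term: str,
    registry: RegistrySketch,
    general_aliases: Mapping[str, str] = CONDITION_ALIASES,
) -> Optional[str]:
    """Try the hospital registry first, then the portable Layer 1 aliases."""
    hit = registry.lookup(term)
    if hit is not None:
        return hit
    return general_aliases.get(term.strip().lower())


registry = RegistrySketch({"Hartfalen": "Hartfalen"})
```

The fallback path touches only Layer 1 data, so it behaves the same for any hospital's registry.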
Design Rationale
Why Separate Universal Knowledge from Hospital Data?
The separation is motivated by three engineering concerns and one epistemological principle.
1. Portability. If the system is deployed for a second hospital (e.g., Novation's other clients), Layer 1 transfers unchanged. Only Layer 2 needs to be rebuilt — scrape the new hospital's hub pages, define its campuses, and configure its department aliases. The medical knowledge that "Cardiologie handles Hartfalen" remains valid.
2. Testability. Layer 1 modules are pure data with zero side effects. They can be validated by a domain expert reading a Python dictionary — no database connections, no API calls, no runtime state. Layer 2 can be tested against the live hospital website. Layer 3 can be tested with mocked inputs from both layers.
3. Auditability. Each layer has a clear provenance chain. Layer 1 dictionaries record their generation model and date. Layer 2 data is scraped from URLs that can be revisited. Layer 3's merge logic is a deterministic function. When a clinician questions why the graph contains a particular relationship, the provenance chain identifies exactly where it came from.
4. The epistemological principle. Medical knowledge and organisational knowledge are fundamentally different kinds of facts. "Cardiologie handles Hartfalen" is a fact about medicine — it is true at ZOL, at UZ Leuven, and at any hospital with a cardiology department. "Dr. Peeters works in Cardiologie on Monday and Wednesday at campus Sint-Jan" is a fact about ZOL's staffing schedule. Mixing these in a single data structure obscures the difference and makes it impossible to reason about what is universally true versus what is locally configured.
Why LLM-Generated Knowledge?
The relationship mappings in Layer 1 could have been authored manually by domain experts. The decision to use LLM generation instead was driven by three factors:
Coverage. The ZOL taxonomy contains 65 departments, 168 conditions, 217 treatments, and 104 examinations. Manually mapping the cross-product of these entities is a combinatorial task: 65 × 168 = 10,920 potential HANDLES relationships alone. An LLM processes the full matrix in minutes.
Medical accuracy. Modern LLMs trained on medical literature encode extensive knowledge of which departments handle which conditions. For the constrained task of classifying known entities into known departments, LLM accuracy is expected to be high because the model is not generating novel medical knowledge but recognising established associations.
Human review as the quality gate. The LLM output is written to Python files that are committed to version control. A domain expert reviews the diff before deployment. The LLM handles the bulk classification; the human handles the edge cases. This division of labour is more efficient than either approach alone.
SNOMED CT as Source 2: From Query-Time to Seeding-Time
SNOMED CT is the international clinical-terminology standard; its full role, the 5-tier matcher, and the query-time mechanism are documented canonically on the SNOMED CT Terminology page. This section covers only how it participates as Source 2 of the three-source merge. Its integration is planned in two phases, the first of which is implemented:
Phase 1 (Implemented): Query-time synonym expansion — patient terms resolve to clinical synonyms and unknown conditions route to departments via FINDING_SITE. See SNOMED CT Terminology for the mechanism.
Phase 2 (Approved Design): Seeding-time graph enrichment. SNOMED CT becomes a first-class knowledge source at graph seeding time, not just a query-time fallback. This phase addresses the root cause of poor performance on SNOMED-specific queries (4/15 pass rate, 26.7%): conditions exist as graph nodes but lack HANDLES relationships to the correct departments. SNOMED FINDING_SITE data auto-creates these missing relationships.
| Enrichment | Mechanism | Impact |
|---|---|---|
| Concept IDs on nodes | Match entity names against snomed_descriptions | Deterministic concept-level matching; language-independent identity |
| Dutch synonyms as properties | Fetch all Dutch descriptions for matched concept | Replaces ~250 hand-maintained aliases with 656K SNOMED descriptions |
| IS_A hierarchy relationships | Traverse snomed_transitive_closure between existing graph nodes | "Find all subtypes" queries (diabetes → type 1, type 2, gestational) |
| FINDING_SITE → HANDLES | Condition concept → body structure → department | Auto-creates missing condition→department links (target: +7 SNOMED golden questions) |
| PROCEDURE_SITE → OFFERS | Treatment concept → body structure → department | Auto-creates missing treatment→department links |
Scoping guard: Only the ~260 entities already in the taxonomy are enriched. SNOMED's 356K concepts are NOT bulk-imported. The taxonomy grows by ~50–100 IS_A relationships and ~100–200 SNOMED-derived HANDLES/OFFERS — negligible growth with significant search quality improvement.
Why SNOMED CT complements (not replaces) LLM enrichment: SNOMED CT maps clinical concepts, not hospital workflows. It encodes that "Heart failure" IS_A "Disorder of cardiovascular system" and has FINDING_SITE "Heart structure." It does NOT encode that heart failure patients in a Belgian hospital should be directed to the Cardiology department. The bridge from body structures to hospital departments requires an organisational mapping layer (BODY_STRUCTURE_TO_DEPARTMENT, 47 entries) that is curated but universal across Belgian hospitals. For the department→condition/treatment/examination relationships that lack anatomical grounding, the LLM enrichment pipeline remains the primary source.
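The two-hop bridge from condition to body structure to department can be sketched with in-memory dictionaries. This is a simplified illustration: the real pipeline reads SNOMED tables in PostgreSQL and matches on concept IDs, whereas this sketch uses names to avoid inventing identifiers. `derive_handles()` is a hypothetical helper, and the single map entries are illustrative.

```python
from typing import Iterable

# Hypothetical stand-in for a FINDING_SITE lookup, using the heart failure example above.
FINDING_SITE = {"Hartfalen": ["Heart structure"]}
# Illustrative entry; the real map has 47 entries that are not shown on this page.
BODY_STRUCTURE_TO_DEPARTMENT = {"Heart structure": ["Cardiologie"]}

SNOMED_CONFIDENCE = 0.7  # confidence for source "snomed" in the provenance table


def derive_handles(
    conditions: Iterable[str],
    existing: set[tuple[str, str]],
) -> list[dict]:
    """Return HANDLES links implied by FINDING_SITE that are not already present.

    `existing` holds (department, condition) pairs already in the graph.
    """
    seen = set(existing)
    derived: list[dict] = []
    for condition in conditions:
        for site in FINDING_SITE.get(condition, []):
            for department in BODY_STRUCTURE_TO_DEPARTMENT.get(site, []):
                if (department, condition) in seen:
                    continue  # never duplicate a link from another source
                seen.add((department, condition))
                derived.append({
                    "department": department,
                    "condition": condition,
                    "source": "snomed",
                    "confidence": SNOMED_CONFIDENCE,
                })
    return derived
```

Conditions with no finding site, or a finding site with no department mapping, yield nothing, which is where the LLM enrichment pipeline would still be needed.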
The Dutch Medical Terminology Challenge
Building a medical search system in Dutch presents unique challenges that do not arise in English-language systems. This section documents the linguistic phenomena that drive the vocabulary architecture.
Morphological Complexity
Dutch is a head-final compounding language (Booij, 2012) in which medical terms routinely appear as single compounds without spaces: hartchirurgie (heart surgery), bloedonderzoek (blood test), ruggenmergtumor (spinal cord tumour). A keyword search for "hart" (heart) will not match "hartchirurgie" unless the system explicitly decomposes the compound or maintains an alias map. The safe_contains() utility function in the vocabulary module implements substring matching that handles this phenomenon, allowing "hart" to match "hartchirurgie", "hartritmestoornissen", and "hartfalen" without requiring morphological decomposition.
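Substring matching for compounds can be sketched as below. The body of safe_contains() is not shown on this page, so `compound_contains()` is a hypothetical stand-in. The minimum-length guard for short search terms is an assumption about what makes such matching "safe", not a description of the real function.

```python
import re


def compound_contains(needle: str, haystack: str, min_substring_len: int = 4) -> bool:
    """Case-insensitive match of `needle` inside a (possibly compound) Dutch term.

    Needles of at least `min_substring_len` characters match as substrings,
    so "hart" is found inside "hartchirurgie". Shorter needles must match as
    whole words, to avoid hits such as "ct" inside "factuur".
    """
    needle_norm = needle.strip().lower()
    haystack_norm = haystack.lower()
    if not needle_norm:
        return False
    if len(needle_norm) >= min_substring_len:
        return needle_norm in haystack_norm
    return re.search(rf"\b{re.escape(needle_norm)}\b", haystack_norm) is not None
```

This avoids morphological decomposition entirely, at the cost of occasional false positives when a stem happens to occur inside an unrelated word.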
Register Gap: Patient vs. Clinical Terminology
Hospital websites simultaneously serve two audiences — patients and referring physicians — who use fundamentally different vocabularies for the same concepts:
| Patient Dutch | Clinical Dutch | Latin/International |
|---|---|---|
| hoge bloeddruk | hypertensie | hypertensio arterialis |
| suikerziekte | diabetes mellitus | diabetes mellitus type 2 |
| beroerte | cerebrovasculair accident | CVA |
| spataders | varices | varicosis |
| grijze staar | cataract | cataracta senilis |
| open rug | spina bifida | myelomeningocele |
| huidkanker | melanoom | melanoma malignum |
| nierstenen | urolithiasis | nephrolithiasis |
The CONDITION_ALIASES map (108 entries) bridges this gap by mapping patient-friendly terms to their canonical clinical equivalents. Similarly, TREATMENT_ALIASES (60 entries) normalises treatment terminology, and EXAMINATION_ALIASES (22 groups) maps colloquial examination names to canonical forms (e.g., "NMR" → "MRI", "bloedafname" → "Bloedonderzoek").
Ambiguity: Departments, Treatments, and Conditions
In Dutch hospital terminology, a single term frequently denotes multiple entity types simultaneously. "Radiotherapie" is both a department (an organisational unit with staff) and a treatment (a therapeutic modality). "Orthopedie" is a department and a medical specialty. "Dialyse" is a treatment but "Nefrologie" (the department that provides it) is a specialty.
The ENTITY_TYPE_OVERRIDES map (63 entries) and DUAL_ENTITY_MAP (4 entries) resolve these ambiguities at extraction time. The DUAL_ENTITY_MAP is specifically designed for cases where both interpretations are correct — "Radiotherapie" genuinely needs to exist as both a department node and a treatment node in the knowledge graph:
DUAL_ENTITY_MAP = {
    "radiotherapie": ["Bestralingstherapie", "Uitwendige bestraling", "Brachytherapie"],
    "nucleaire geneeskunde": ["Radioisotopentherapie", "PET-CT"],
    "neonatologie": ["Neonatale Zorg"],
    "klinische biologie": ["Laboratoriumonderzoek"],
}
Hub Node Prevention
Certain medical terms, such as "pijn" (pain), "tumor", "allergie" (allergy), and "infectie" (infection), are so broadly applicable that they would create hub nodes connecting to nearly every department in the graph if treated as conditions. Such hubs degrade search specificity: a query for "pijn" would return every department rather than routing to the Multidisciplinair Pijncentrum (Pain Centre) where it belongs.
The HUB_CONDITIONS (8 entries) and HUB_TREATMENTS (5 entries) blocklists prevent these terms from creating HANDLES/TREATS relationships. The OVERLY_BROAD_CONDITIONS guard (12 entries) extends this protection to terms like "koorts" (fever), "vermoeidheid" (fatigue), and "hoofdpijn" (headache) that are symptoms rather than conditions.
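A guard of this kind reduces to a membership test applied before a relationship is created. The sketch below uses only the terms named in this section, so both sets are partial. `may_create_handles()` and `filter_handles()` are hypothetical helper names, and the sample pairs are illustrative.

```python
# Partial contents: only the terms named in this section (4 of 8 and 3 of 12).
HUB_CONDITIONS = frozenset({"pijn", "tumor", "allergie", "infectie"})
OVERLY_BROAD_CONDITIONS = frozenset({"koorts", "vermoeidheid", "hoofdpijn"})


def may_create_handles(condition: str) -> bool:
    """Return False for hub or overly broad terms that must not get HANDLES links."""
    key = condition.strip().lower()
    return key not in HUB_CONDITIONS and key not in OVERLY_BROAD_CONDITIONS


def filter_handles(pairs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop (department, condition) pairs whose condition is blocked."""
    return [(dept, cond) for dept, cond in pairs if may_create_handles(cond)]


candidate_pairs = [
    ("Cardiologie", "Hartfalen"),
    ("Cardiologie", "Pijn"),
    ("Neurologie", "Hoofdpijn"),
]
```

Only the Hartfalen pair survives the filter, so "pijn" never accumulates links to every department.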
Examination Normalisation
Medical imaging examinations present a specific normalisation challenge. The ZOL website uses inconsistent naming for the same examination: "RX", "RX-onderzoek", "RX Onderzoeken", "RX-beeld", and "Röntgen" all refer to X-ray imaging. Similarly, "NMR", "MRI", and "Kernspinresonantie" all refer to magnetic resonance imaging.
The EXAMINATION_CASING map (32 entries) canonicalises these variants to a single display name. The v18 update added 4 RX→Röntgen consolidation entries to eliminate fragmented examination nodes in the graph:
"rx": "Röntgen",
"rx-onderzoek": "Röntgen",
"rx onderzoeken": "Röntgen",
"rx-beeld": "Röntgen",
Specific RX procedures (RX Arthrografie, RX Colon) retain their identity as distinct examinations — the normalisation only collapses generic RX references.
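The collapse of generic RX references, with specific procedures left alone, follows from using an exact-match lookup instead of a prefix match. A minimal sketch, assuming whitespace is normalised before lookup; `normalise_examination()` is a hypothetical helper and the map holds only entries quoted on this page.

```python
# The four v18 entries shown above plus one casing rule listed in the coverage table.
EXAMINATION_CASING = {
    "rx": "Röntgen",
    "rx-onderzoek": "Röntgen",
    "rx onderzoeken": "Röntgen",
    "rx-beeld": "Röntgen",
    "nmr": "MRI",
}


def normalise_examination(name: str) -> str:
    """Map a generic examination variant to its canonical display name.

    The lookup is exact (after lower-casing and collapsing whitespace), so a
    specific procedure such as "RX Arthrografie" is returned unchanged.
    """
    cleaned = " ".join(name.split())
    return EXAMINATION_CASING.get(cleaned.lower(), cleaned)
```

A prefix rule such as "starts with rx" would wrongly merge RX Arthrografie and RX Colon into the generic Röntgen node.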
Current Vocabulary Coverage
| Category | Entries | Examples |
|---|---|---|
| Condition aliases | 108 | "hoge bloeddruk" → Hypertensie, "suikerziekte" → Diabetes Mellitus |
| Treatment aliases | 60 | "hartoperatie" → Hartchirurgie, "nierdialyse" → Hemodialyse |
| Examination alias groups | 22 | Röntgen (4 aliases), MRI (3 aliases), Echografie (2 aliases) |
| Examination casing rules | 32 | "rx" → Röntgen, "nmr" → MRI, "pet" → PET-CT |
| Entity type overrides | 63 | "dialyse" → treatment, "biopsie" → examination |
| Service aliases | 17 | "afspraak" → Afspraken, "comfort care" → Comfortzorg |
| Search aliases | 239 | "hartfilmpje" → ECG, "scan" → Radiologie |
| Hub/noise guards | 46 | 8 hub conditions + 5 hub treatments + 12 overly broad + 21 blocklist |
Together with the SNOMED CT synonym expansion layer (656K Dutch descriptions), the system resolves patient-facing Dutch, clinical Dutch, and international medical terminology to the same canonical entities.
The LLM Enrichment Pipeline
The scripts/enrich_taxonomy_llm.py script implements the generation pipeline for Layer 1's relationship mappings. Its design reflects the principle that LLMs are classifiers, not inventors: the script provides the full list of entities and asks the LLM to classify relationships between them, rather than asking it to generate entities from scratch.
Standard Department Names for Portability
A key design decision in the enrichment pipeline is the use of standard Belgian hospital department names rather than hospital-specific names in the LLM classification prompt. The pipeline loads STANDARD_DEPARTMENTS (55 entries) from belgian_hospital_departments.py as the allowed-values list for the "department" field in LLM output. This ensures that the generated relationship mappings are portable across hospitals.
The resolution flow:
When the GoldenPageSeeder seeds the taxonomy, a post-generation resolution step maps standard names back to hospital-specific canonical names using the ZOL_TO_STANDARD_MAP. For a new hospital deployment, only the *_TO_STANDARD_MAP dictionary needs to be replaced — the enrichment modules themselves remain unchanged.
Pipeline Stages
Stage 1: Entity Filtering. Raw entity names from the taxonomy database include noise — section headers from web pages ("Brochures 2", "Dag 1"), generic medical terms ("Aandoeningen", "Behandeling"), and very short strings. The filtering stage applies:
- Section header detection (pattern-based)
- Generic term removal (curated blocklist of ~60 terms)
- Minimum length threshold (3 characters)
- Case-insensitive deduplication
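The four filters above can be sketched as a single pass over the raw names. The header pattern and the two blocklist terms are taken from the examples in this section. The real pattern set and the ~60-term blocklist are not shown, so `SECTION_HEADER_RE`, `GENERIC_TERMS`, and `filter_entities()` are illustrative assumptions.

```python
import re

# Hypothetical pattern covering the two header examples quoted above.
SECTION_HEADER_RE = re.compile(r"^(brochures?|dag)\s+\d+$", re.IGNORECASE)
# Two of the generic terms named above; the curated blocklist has ~60.
GENERIC_TERMS = frozenset({"aandoeningen", "behandeling"})
MIN_LENGTH = 3


def filter_entities(raw_names: list[str]) -> list[str]:
    """Drop headers, generic terms, very short strings, and case-insensitive duplicates."""
    kept: list[str] = []
    seen: set[str] = set()
    for name in raw_names:
        cleaned = name.strip()
        key = cleaned.lower()
        if len(cleaned) < MIN_LENGTH:
            continue
        if SECTION_HEADER_RE.match(cleaned):
            continue
        if key in GENERIC_TERMS:
            continue
        if key in seen:
            continue  # keep the first spelling encountered
        seen.add(key)
        kept.append(cleaned)
    return kept


raw = ["Hartfalen", "hartfalen", "Brochures 2", "Dag 1", "Aandoeningen", "MS", "Diabetes type 2"]
```

The header pattern is deliberately narrow: a looser rule such as "any text ending in a number" would also discard legitimate entities like "Diabetes type 2".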
Stage 2: LLM Classification. Five separate LLM calls, one per relationship type. Each call provides the full list of filtered entities and the full list of standard Belgian departments (not hospital-specific names), asking the LLM to produce a JSON mapping. The prompt is structured to minimise hallucination: the LLM can only use entity names that appear in the provided lists.
Stage 3: Output Validation. The LLM occasionally maps entities to department names that do not exist in the standard list (misspellings, synonyms). The validation stage removes any mapping whose target is not in the known department set, with case-insensitive matching and deduplication.
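The validation step amounts to intersecting each LLM answer with the allowed department set. A minimal sketch: `validate_mappings()` is a hypothetical helper, the department list is a two-item sample, and dropping entities left with no valid department is an assumption.

```python
def validate_mappings(
    llm_output: dict[str, list[str]],
    known_departments: list[str],
) -> dict[str, list[str]]:
    """Keep only mappings whose target is a known department.

    Matching is case-insensitive; accepted names are rewritten to the
    canonical spelling and deduplicated in first-seen order.
    """
    canonical = {dept.lower(): dept for dept in known_departments}
    validated: dict[str, list[str]] = {}
    for entity, departments in llm_output.items():
        kept: list[str] = []
        for dept in departments:
            match = canonical.get(dept.strip().lower())
            if match is not None and match not in kept:
                kept.append(match)
        if kept:  # assumption: entities with no valid target are dropped
            validated[entity] = kept
    return validated


sample_departments = ["Cardiologie", "Neurologie"]
sample_llm_output = {
    "Hartfalen": ["cardiologie", "Cardiologie", "Hartcentrum"],
    "Migraine": ["Neurology"],
}
```

In the sample, the unknown targets "Hartcentrum" and "Neurology" are removed without any attempt to guess the intended department.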
Stage 4: Hub Page Merge. The hospital's hub pages contain authoritative relationship data scraped directly from the website (e.g., a department page that lists the conditions it handles). This data is merged with the LLM output using case-insensitive key matching, with hub page data taking precedence. This ensures that hospital-verified relationships are preserved even if the LLM disagrees.
Stage 5: Post-Merge Cleanup. A final pass removes any remaining noise keys that survived into the merged output (e.g., section headers that appeared as condition names on hub pages).
Stage 6: Output. The pipeline writes five Python modules (one per relationship type), an enrichment review report (Markdown), and the raw JSON output for auditing.
Quality Assurance
The enrichment review report (enrichment_review.md) provides transparency into the generation process:
## HANDLES
- Total entities mapped: 149
- Total mappings: 233
- Confirmed by hub pages: 139
- New from LLM: 10
This shows that 93% of the mapped HANDLES entities (139 of 149) were confirmed by the hospital's own hub page data. Only 10 came exclusively from the LLM; these are the ones requiring the most careful human review.
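As a quick check, the 93% figure corresponds to the entity count rather than the mapping count. The sketch below uses only the numbers quoted from the report and assumes that "confirmed" and "new" are counted per entity, which the sum 139 + 10 = 149 suggests.

```python
entities_mapped = 149
total_mappings = 233
confirmed_by_hub = 139
new_from_llm = 10

# The two categories partition the mapped entities.
assert confirmed_by_hub + new_from_llm == entities_mapped

confirmation_rate = confirmed_by_hub / entities_mapped
share_of_mappings = confirmed_by_hub / total_mappings

print(f"{confirmation_rate:.1%}")  # 93.3%
print(f"{share_of_mappings:.1%}")  # 59.7%
```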
For cross-entity relationships (DIAGNOSES, TREATS), the hub page confirmation rate is 0% because these relationships are not explicitly stated on hospital web pages. They represent general medical knowledge that only the LLM can provide. This is expected and acceptable: the relationships are medically standard (e.g., "CT Onderzoeken diagnoses Aneurysma") and can be validated by any clinician.
Center–Doctor Inference
A specific challenge in hospital knowledge graphs is linking doctors to multidisciplinary centres. Centres like the Borstcentrum (Breast Centre) or Beroertecentrum (Stroke Centre) are not departments with their own staff rosters — they are cross-departmental programs whose doctors are employed by constituent departments (Oncologie, Neurologie, Chirurgie, etc.). The ZOL website lists doctors under departments, not centres. Without explicit centre–doctor relationships, a patient searching for "borstcentrum dokter" would find no results.
The Transitive Inference Pattern
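The CENTER_DEPARTMENT_MAP (YAML-driven, 21 centres, 50 department links) defines the constituent departments for each centre. At seeding time, _infer_center_doctor_links() applies a one-step transitive inference: if a doctor WORKS_IN a department and that department is a constituent of a centre, the doctor is linked to the centre.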
For each centre C:
    For each constituent department D in CENTER_DEPARTMENT_MAP[C]:
        For each doctor who WORKS_IN D:
            Create WORKS_IN(doctor, C) with confidence=0.7
The confidence=0.7 (vs. 1.0 for direct WORKS_IN from hub pages) signals that the relationship is inferred rather than directly observed. This distinction is available to downstream consumers (e.g., search ranking) but is not currently surfaced to users.
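The pattern can be sketched in Python as below. This is not the body of _infer_center_doctor_links(); the `WorksIn` record, the function signature, the doctor names, and the constituent departments listed for Borstcentrum are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorksIn:
    doctor: str
    unit: str  # department or centre name
    confidence: float


INFERRED_CONFIDENCE = 0.7  # direct links from hub pages carry 1.0


def infer_center_doctor_links(center_department_map, direct_links):
    """Derive centre-level WORKS_IN links from department-level ones."""
    doctors_by_department = {}
    for link in direct_links:
        doctors_by_department.setdefault(link.unit, []).append(link.doctor)

    inferred, seen = [], set()
    for center, departments in center_department_map.items():
        for department in departments:
            for doctor in doctors_by_department.get(department, []):
                if (doctor, center) in seen:
                    continue  # already reached via another department
                seen.add((doctor, center))
                inferred.append(WorksIn(doctor, center, INFERRED_CONFIDENCE))
    return inferred


center_map = {"Borstcentrum": ["Oncologie", "Chirurgie"]}  # illustrative
direct = [
    WorksIn("Dr. A", "Oncologie", 1.0),
    WorksIn("Dr. A", "Chirurgie", 1.0),
    WorksIn("Dr. B", "Neurologie", 1.0),
]
print(infer_center_doctor_links(center_map, direct))
# [WorksIn(doctor='Dr. A', unit='Borstcentrum', confidence=0.7)]
```

The `seen` set matters because a doctor attached to two constituent departments would otherwise receive two identical centre links.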
Coverage
| Category | Count |
|---|---|
| Total centres | 21 |
| Centres with department mappings | 21 (100%) |
| Total department→centre links | 50 |
| Centres with campus entries | 21 (100%) |
The mapping covers all centre types: clinical centres (Borstcentrum, Beroertecentrum), rehabilitation programs (BeweegSaam, Revalidatie, Cognitieve Revalidatie), specialised clinics (Endometriosekliniek, Diabetische Voetkliniek), and support programs (Infectiepreventie, Zorgeenheid Gaudium).
Integration with the RAG Pipeline
The medical knowledge layer enhances the RAG pipeline at two points: graph seeding (ingestion time) and graph querying (query time).
At Ingestion Time
When the GoldenPageSeeder runs, it merges Layer 1 (universal knowledge) with Layer 2 (hospital taxonomy) and stores relationships in PostgreSQL taxonomy tables. The resulting data quality is monitored by an automated Database Doctor service:
| Metric | Score |
|---|---|
| Naming Consistency | 94/100 |
| Search Effectiveness | 92/100 |
| Doctors with department links | 100% |
| Departments with campus links | 100% |
| Centres with inferred doctors | 100% (via CENTER_DEPARTMENT_MAP) |
| Orphan nodes | 0 |
The Database Doctor evaluates naming consistency (canonical name adherence, alias coverage, casing uniformity) and search effectiveness (relationship connectivity, entity-to-document coverage, cross-entity navigability). Both scores are above 90/100, in the range where further cleanup yields diminishing returns, which indicates a production-ready knowledge base.
At Query Time
When a user asks "Welke onderzoeken worden gebruikt om hartfalen vast te stellen?" (Which examinations are used to diagnose heart failure?), the query pipeline:
- Intent classification identifies this as a condition-information query
- Entity extraction identifies "hartfalen" as a condition
- Taxonomy resolution resolves "hartfalen" to the canonical condition name (with SNOMED synonym fallback)
- DIAGNOSES traversal queries the taxonomy tables for examinations linked to this condition via the DIAGNOSES relationship
- SNOMED synonym matching (Phase B): queries can also match via the snomed_synonyms property on nodes; e.g., "suikerziekte" matches Diabetes Mellitus without requiring a static alias entry
- Response generation formats the results with source citations
Without the knowledge graph relationships, this query would fall through to vector search and return generic text passages that mention heart failure — useful but imprecise. With the graph, the system returns a structured list of specific diagnostic examinations (Echocardiografie, Bloedafname, Thorax RX) with the departments that perform them.
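The resolution and traversal steps can be sketched with in-memory data standing in for the PostgreSQL taxonomy tables. The function names, the synonym entry, and the department assigned to each examination are hypothetical; only the examination names come from the example above.

```python
# Canonical names and synonyms, keyed by casefolded term (illustrative data).
CANONICAL_CONDITIONS = {"hartfalen": "Hartfalen"}
SNOMED_SYNONYMS = {"hartinsufficiëntie": "Hartfalen"}

# (examination, condition, performing department); departments are assumed.
DIAGNOSES = [
    ("Echocardiografie", "Hartfalen", "Cardiologie"),
    ("Bloedafname", "Hartfalen", "Klinisch Laboratorium"),
    ("Thorax RX", "Hartfalen", "Radiologie"),
]


def resolve_condition(term):
    """Canonical lookup first, then SNOMED synonym fallback."""
    key = term.casefold()
    return CANONICAL_CONDITIONS.get(key) or SNOMED_SYNONYMS.get(key)


def diagnostic_examinations(term):
    """Return (examination, department) pairs, or None if unresolved."""
    condition = resolve_condition(term)
    if condition is None:
        return None  # caller falls back to vector search
    return [(exam, dept) for exam, cond, dept in DIAGNOSES if cond == condition]


print(diagnostic_examinations("hartfalen"))
print(diagnostic_examinations("Hartinsufficiëntie"))  # same result via synonym
print(diagnostic_examinations("onbekend"))            # None
```

Returning None rather than an empty list lets the caller distinguish "condition not in the graph" from "condition known but no examinations linked".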
Query Pipeline Simplification (Phase B)
With SNOMED synonyms stored directly on taxonomy entities, queries become synonym-aware without requiring query-time SNOMED lookups for known entities:
-- Before: string match on canonical_name only
SELECT * FROM app.taxonomy_entities WHERE canonical_name = :term;
-- After: synonym-aware match (Phase B)
SELECT * FROM app.taxonomy_entities
WHERE canonical_name = :term
OR metadata->'snomed_synonyms' ? :term;
This reduces query-time latency by pre-computing synonym matches at seeding time, while preserving the query-time SNOMED fallback for terms not in the taxonomy.
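The difference between the two queries can be mirrored in plain Python. This is a sketch over an in-memory list that assumes each row carries a `metadata` dictionary with an optional `snomed_synonyms` array; like the SQL, it matches exactly and is case-sensitive.

```python
def match_before(entities, term):
    """Equivalent of the first query: canonical name only."""
    return [e for e in entities if e["canonical_name"] == term]


def match_after(entities, term):
    """Equivalent of the second query: canonical name or stored synonym."""
    return [
        e for e in entities
        if e["canonical_name"] == term
        or term in e["metadata"].get("snomed_synonyms", [])
    ]


entities = [
    {"canonical_name": "Diabetes Mellitus",
     "metadata": {"snomed_synonyms": ["suikerziekte"]}},
    {"canonical_name": "Hartfalen", "metadata": {}},
]

print(len(match_before(entities, "suikerziekte")))  # 0
print(len(match_after(entities, "suikerziekte")))   # 1
```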
Multi-Tenancy Path
The layer separation is not merely an engineering exercise in clean code — it is the architectural foundation for evolving from a single-hospital deployment into a multi-tenant hospital search product. This section covers multi-tenancy from the knowledge-layer angle — which medical-vocabulary code is portable versus hospital-specific; for the system-level onboarding pathway, tenant isolation, and routing, see Multi-Tenancy Architecture.
What Transfers Unchanged (Layer 1)
The following components work for any Dutch-speaking hospital without modification:
| Component | Lines | Content |
|---|---|---|
| dutch_medical_vocabulary.py | 988 | 108 condition aliases, 60 treatment aliases, 22 examination alias groups, 63 entity type overrides, plausibility guards, resolver functions, Dutch compound word matching |
| belgian_hospital_departments.py | 138 | 56 standard Belgian department names, normalize_to_standard() function |
| Enrichment modules (5 files) | ~1,200 | 1,123 total mappings across 5 relationship types (HANDLES, OFFERS, PERFORMS, DIAGNOSES, TREATS) |
| enrich_taxonomy_llm.py | ~600 | LLM enrichment pipeline script |
| SNOMED CT reference tables | 4 tables | 356K concepts, 656K descriptions, 1.2M relationships, 4.7M transitive closure entries (see SNOMED CT integration) |
Total portable code: approximately 2,930 lines (excluding SNOMED CT data, which is loaded from the Belgian Edition RF2 distribution).
What Needs Replacement (Layer 2)
For a new hospital deployment, only the following need to be created or replaced:
| Component | Effort | Description |
|---|---|---|
| {hospital}_taxonomy.py | Medium | Campus definitions, department-campus mappings, department aliases specific to the new hospital |
| taxonomy/ scrape | Automated | Run the hub page scraper against the new hospital's website |
| {HOSPITAL}_TO_STANDARD_MAP | Low | Map the new hospital's branded department names to standard Belgian department names (analogous to ZOL_TO_STANDARD_MAP) |
| Site configuration | Low | Update site_config.py with the new hospital's domain and configuration |
The Enrichment Pipeline as a Portable Generator
The enrichment pipeline's use of standard Belgian department names means that its output is already portable. When onboarding a new hospital:
- Scrape the new hospital's hub pages to populate the taxonomy database.
- Run enrich_taxonomy_llm.py. The pipeline uses STANDARD_DEPARTMENTS, not hospital-specific names.
- Create a {HOSPITAL}_TO_STANDARD_MAP mapping the new hospital's branded names to standard equivalents.
- Seed the taxonomy. The GoldenPageSeeder resolves standard names to hospital-specific names automatically.
The enrichment modules do not need to be regenerated for each hospital. The same "Cardiologie handles Hartfalen" mapping applies universally. Only the resolution from "Cardiologie" to a hospital's specific department name (which may be "Hartcentrum", "Cardiologische Dienst", or simply "Cardiologie") happens at seeding time.
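The seeding-time resolution can be sketched by inverting the branded-to-standard map. The function name `resolve_to_hospital_names` is hypothetical, as is the fallback to the standard name when the hospital has no branded variant; a real seeder would presumably also check that the department exists at the hospital.

```python
def resolve_to_hospital_names(enrichment, hospital_to_standard):
    """Rewrite standard department names to a hospital's branded names."""
    standard_to_hospital = {}
    for hospital_name, standard_name in hospital_to_standard.items():
        standard_to_hospital.setdefault(standard_name, []).append(hospital_name)

    resolved = {}
    for entity, standard_departments in enrichment.items():
        names = []
        for standard in standard_departments:
            # Assumption: an unmapped standard name is used unchanged.
            names.extend(standard_to_hospital.get(standard, [standard]))
        resolved[entity] = names
    return resolved


enrichment = {"Hartfalen": ["Cardiologie"]}  # portable mapping
print(resolve_to_hospital_names(enrichment, {"Hartcentrum": "Cardiologie"}))
# {'Hartfalen': ['Hartcentrum']}
print(resolve_to_hospital_names(enrichment, {}))
# {'Hartfalen': ['Cardiologie']}
```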
Comparison with Related Approaches
Medical Knowledge Graphs in Literature
The approach taken in this system can be situated within the broader landscape of medical knowledge graph construction:
| Approach | Examples | Strengths | Limitations |
|---|---|---|---|
| Ontology-based | SNOMED CT, ICD-10, UMLS | Standardised, comprehensive, internationally maintained | Clinical focus (not navigational), licensing requirements, Dutch coverage gaps |
| NER + relation extraction | scispaCy, MedCAT, BioBERT | Automated, handles novel entities | Requires training data, language-specific models needed for Dutch |
| Manual curation | Hospital-internal databases | High precision, domain-expert validated | Does not scale, maintenance burden, no portability |
| LLM classification (this system) | Tier 3 model + hub page merge | Scalable, auditable, portable, Dutch-native | Requires human review, no novel entity discovery |
The ZOL system's approach is closest to constrained LLM classification with human-in-the-loop validation — a pattern that leverages LLM medical knowledge for bulk classification while maintaining auditability through version-controlled output files and hub page cross-referencing.
Hybrid Architecture Advantages
The three-layer separation offers advantages that no single approach provides:
- Scalable like LLM extraction — new entities are classified automatically
- Precise like manual curation — hub page data overrides LLM when available
- Portable like ontology-based systems — Layer 1 transfers across hospitals
- Auditable like all three — provenance chain from LLM output to hub page confirmation to human review
Way Forward: Three-Source Architecture Roadmap
The Three-Source Architecture is being implemented in phases, each with a golden evaluation gate (≥91% overall pass rate required before merging).
Phase A: Web Scraper Campus Inference (COMPLETE)
The scraper now infers department→campus mappings from doctor campus data, eliminating manual YAML campus maintenance. The Database Doctor confirms complete coverage: 0 departments are missing LOCATED_AT relationships.
Phase B: SNOMED Graph Enrichment (APPROVED DESIGN)
Add a Phase 3: SNOMED Enrichment to the graph seeding pipeline, running after all entities and relationships are built:
- Concept ID matching: Match entity names against snomed_descriptions to assign SNOMED concept IDs to graph nodes
- Synonym enrichment: Add Dutch synonyms from SNOMED descriptions as node properties (snomed_synonyms array)
- IS_A hierarchy: Create IS_A relationships between existing condition nodes that have hierarchical SNOMED relationships (depth limit: 3 hops)
- FINDING_SITE → HANDLES: Auto-create condition→department links via anatomical routing (estimated +7 SNOMED golden questions)
- PROCEDURE_SITE → OFFERS: Auto-create treatment→department links via procedure site mapping
Target: SNOMED golden question pass rate from 4/15 (26.7%) to 11–13/15 (73–87%).
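The depth-limited IS_A step can be sketched as a breadth-first walk up the hierarchy that keeps only ancestors already present as graph nodes. The function names are hypothetical, the concept identifiers are placeholders rather than real SNOMED CT IDs, and the parent map stands in for the snomed_relationships table.

```python
from collections import deque

MAX_HOPS = 3


def ancestors_within(concept_id, parents, max_hops=MAX_HOPS):
    """Return {ancestor: hops} for ancestors at most max_hops away."""
    found = {}
    queue = deque([(concept_id, 0)])
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # depth limit reached, do not expand further
        for parent in parents.get(current, ()):
            if parent not in found:
                found[parent] = hops + 1
                queue.append((parent, hops + 1))
    return found


def infer_is_a_edges(node_concepts, parents):
    """node_concepts maps graph node name -> concept id."""
    name_by_concept = {cid: name for name, cid in node_concepts.items()}
    edges = []
    for name, cid in node_concepts.items():
        for ancestor in ancestors_within(cid, parents):
            if ancestor in name_by_concept:
                edges.append((name, "IS_A", name_by_concept[ancestor]))
    return edges


# Placeholder chain C1 -> C2 -> C3 -> C4 -> C5 (child -> parent).
parents = {"C1": ["C2"], "C2": ["C3"], "C3": ["C4"], "C4": ["C5"]}
nodes = {"Diabetes Mellitus Type 2": "C1", "Diabetes Mellitus": "C2", "Ziekte": "C5"}
print(infer_is_a_edges(nodes, parents))
# [('Diabetes Mellitus Type 2', 'IS_A', 'Diabetes Mellitus'),
#  ('Diabetes Mellitus', 'IS_A', 'Ziekte')]
```

In this toy chain "Ziekte" is four hops above the first node, so no direct edge is created for it, while it is exactly three hops above "Diabetes Mellitus" and is linked.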
Phase C: Alias Elimination (PLANNED)
Replace hand-maintained alias dictionaries with SNOMED-derived synonyms at seeding time:
| Current Constant | Entries | Replaced By |
|---|---|---|
| CONDITION_ALIASES | ~125 | SNOMED snomed_descriptions synonym lookup |
| TREATMENT_ALIASES | ~72 | SNOMED snomed_descriptions synonym lookup |
| EXAMINATION_ALIASES | ~55 | SNOMED snomed_descriptions synonym lookup |
| EXAMINATION_CASING | ~30 | SNOMED preferred term lookup |
| CONDITION_DOMAIN_MAP | ~40 | SNOMED FINDING_SITE → body structure → domain group |
| EXAM_DOMAIN_MAP | ~17 | SNOMED PROCEDURE_SITE → body structure → domain group |
Impact: Eliminates ~610 hand-maintained entries (~18% of all hardcoded data).
Completed: SNOMED CT Phase 1 — Query-Time Synonym Expansion
SNOMED CT Belgian Edition (356K concepts, 656K Dutch descriptions) is integrated as a query-time synonym expansion layer following the BMQExpander pattern (Mao et al., 2024). The implementation includes:
- PostgreSQL reference tables (4 tables): snomed_concepts, snomed_descriptions, snomed_relationships, and snomed_transitive_closure (4.7M pre-computed IS-A ancestor/descendant pairs).
- SnomedTerminologyService: BMQExpander-style synonym expansion that resolves a query term (e.g., "cataract") to SNOMED synonyms (e.g., "staar") that match taxonomy entries.
- FINDING_SITE routing: Maps unknown conditions to departments via SNOMED CT body structure relationships, using 51 curated body-structure-to-department mappings with an IS-A hierarchy walk (max depth 5).
- Always-on architecture: SNOMED is not behind a feature flag. All calls are wrapped in try/except for graceful degradation when tables do not exist.
On a targeted 15-question evaluation set, synonym expansion improved entity recall from 40% to 47–60% (depending on LLM response variability), with zero infrastructure additions beyond PostgreSQL.
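The graceful-degradation behaviour can be illustrated with a small runnable sketch. It uses the standard-library sqlite3 module purely as a stand-in for PostgreSQL, and the function name `expand_synonyms` and the column names `concept_id` and `term` are assumptions, not the actual schema or the SnomedTerminologyService API.

```python
import sqlite3

SYNONYM_QUERY = """
    SELECT d2.term
    FROM snomed_descriptions AS d1
    JOIN snomed_descriptions AS d2 ON d1.concept_id = d2.concept_id
    WHERE lower(d1.term) = lower(?) AND lower(d2.term) <> lower(?)
    ORDER BY d2.term
"""


def expand_synonyms(conn, term):
    """Return [term] plus synonyms; fall back to [term] if lookup fails."""
    try:
        rows = conn.execute(SYNONYM_QUERY, (term, term)).fetchall()
    except sqlite3.Error:
        return [term]  # tables missing or query failed: degrade gracefully
    return [term] + [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
print(expand_synonyms(conn, "cataract"))  # ['cataract'] (no tables yet)

conn.execute("CREATE TABLE snomed_descriptions (concept_id TEXT, term TEXT)")
conn.executemany(
    "INSERT INTO snomed_descriptions VALUES (?, ?)",
    [("c1", "cataract"), ("c1", "staar")],  # placeholder concept id
)
print(expand_synonyms(conn, "cataract"))  # ['cataract', 'staar']
```

Catching the database error at the call site means a deployment without the SNOMED tables behaves like the pre-SNOMED system instead of failing the query.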
Future: Cross-Lingual Search
With SNOMED CT concept identifiers on graph nodes, the system can accept queries in any language that SNOMED CT supports (Dutch, English, French, German) and resolve them to the same underlying concepts. This is particularly relevant for ZOL's multilingual patient population in the Limburg province of Belgium.
Future: Neural Entity Extraction
For discovering novel entities not in the taxonomy, two approaches are under evaluation: GLiNER-BioMed (zero-shot NER) and MedCAT (NER with SNOMED CT concept linking). Both require golden question baseline evaluation (currently 108 questions across 15 categories).
Theoretical Foundations
The Three-Source Knowledge Architecture draws on established principles from knowledge engineering, medical informatics, and information retrieval:
- Separation of concerns (Dijkstra, 1982): Universal medical knowledge, standards-based ontology data, and hospital-specific organisational data are orthogonal concerns that change for different reasons and at different rates. Separating them by provenance reduces coupling and enables independent verification of each source.
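- Open-world assumption (cf. Reiter, 1978): Reiter formalised the contrasting closed-world assumption, under which any fact not recorded is taken to be false. The knowledge graph adopts the open-world view instead: the absence of a relationship in the graph does not mean the relationship does not exist. The tiered query strategy (graph then vector fallback) operationalises this assumption by treating graph results as high-confidence and vector results as exploratory.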
- Knowledge graph completion (Bordes et al., 2013): The LLM enrichment pipeline can be understood as a form of knowledge graph completion, where a pre-trained model predicts missing links between known entities based on learned representations of medical relationships.
- Ontology-enhanced RAG (Soman et al., 2024): OntologyRAG demonstrated that grounding retrieval-augmented generation in formal ontologies improves accuracy by 40% on medical QA benchmarks. The SNOMED CT graph enrichment phase operationalises this finding by embedding ontological relationships directly into the knowledge graph.
- SNOMED CT concept-based information retrieval (Ruch et al., 2006): Using SNOMED concepts and FINDING_SITE relationships for information retrieval improved MAP by 25%. This finding directly motivates the FINDING_SITE → HANDLES auto-creation in Source 2.
- Biomedical query expansion (Jimeno-Yepes et al., 2012; Mao et al., 2024): The BMQExpander pattern — using SNOMED CT and MeSH synonyms for query expansion — achieved +22% NDCG@10 on medical QA benchmarks. This approach underpins both the query-time synonym expansion and the seeding-time synonym-as-property design.
- Cross-lingual medical embeddings (Yuan et al., 2022): The CODER model demonstrated that cross-lingual embeddings improve medical entity matching by 15% F1. SNOMED CT's multilingual descriptions provide the foundation for cross-lingual search without requiring language-specific embedding models.
- Frozen ontology pattern (Uschold & Gruninger, 1996): The taxonomy layer implements a frozen snapshot of the hospital's entity inventory at scrape time. This prevents drift between the graph's entity set and the hospital's current data, at the cost of requiring periodic re-scraping.
- Multi-tenancy through abstraction (Bezemer & Zaidman, 2010): The source separation follows the principle that multi-tenant systems should isolate tenant-specific configuration from shared business logic. Sources 1 and 2 are portable; Source 3 is tenant-specific. This pattern enables horizontal scaling to additional hospitals without code duplication.
- Dutch morphological compounding (Booij, 2012): Dutch is a productive compounding language where medical terms are formed by concatenating morphemes without spaces (e.g., hartchirurgie = hart + chirurgie, schildklieraandoening = schildklier + aandoening). This morphological productivity means that a finite alias dictionary can never fully cover the space of valid Dutch medical terms, motivating the integration of SNOMED CT's 656K Dutch descriptions as a scalable synonym source.
References
- Bezemer, C. P., & Zaidman, A. (2010). Multi-tenant SaaS applications: Maintenance dream or nightmare? Joint ERCIM Workshop on Software Evolution and International Workshop on Principles of Software Evolution, 88--92. https://doi.org/10.1145/1862372.1862393
- Booij, G. (2012). The Grammar of Words: An Introduction to Linguistic Morphology (3rd ed.). Oxford University Press.
- Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling multi-relational data. Advances in Neural Information Processing Systems, 26, 2787--2795.
- Dijkstra, E. W. (1982). On the role of scientific thought. In Selected Writings on Computing: A Personal Perspective (pp. 60--66). Springer-Verlag.
- FPS Public Health. (2024). Belgian eHealth Action Plan: SNOMED CT Implementation Roadmap. Federal Public Service Health, Food Chain Safety and Environment.
- Hartendorp, R., et al. (2024). Biomedical entity linking for Dutch. In Proceedings of CL4Health Workshop, LREC-COLING 2024.
- Hogan, A., et al. (2021). Knowledge graphs. ACM Computing Surveys, 54(4), 1--37. https://doi.org/10.1145/3447772
- Jimeno-Yepes, A., Berlanga, R., & Rebholz-Schuhmann, D. (2012). Ontology-based query expansion for biomedical information retrieval. BMC Bioinformatics, 13(S14). https://doi.org/10.1186/1471-2105-13-S14-S1
- Mao, Y., et al. (2024). BMQExpander: Biomedical query expansion using SNOMED/MeSH synonyms. Proceedings of NAACL 2024.
- Reiter, R. (1978). On closed world data bases. In H. Gallaire & J. Minker (Eds.), Logic and Data Bases (pp. 55--76). Plenum Press.
- Ruch, P., et al. (2006). Using SNOMED CT body structure hierarchy for concept-based information retrieval. Proceedings of AMIA Annual Symposium, 674--678.
- Searle, T., et al. (2024). MedCAT -- Medical concept annotation toolkit. Artificial Intelligence in Medicine, 149, 102779. https://doi.org/10.1016/j.artmed.2024.102779
- SNOMED International. (2024). SNOMED CT Starter Guide. https://www.snomed.org/
- Soman, K., et al. (2024). OntologyRAG: Ontology-enhanced retrieval-augmented generation. arXiv preprint, arXiv:2412.09050.
- Uschold, M., & Gruninger, M. (1996). Ontologies: Principles, methods and applications. Knowledge Engineering Review, 11(2), 93--136. https://doi.org/10.1017/S0269888900007797
- Yuan, Z., et al. (2022). CODER: Knowledge-infused cross-lingual medical term embeddings. Findings of ACL 2022, 3924--3935. https://doi.org/10.18653/v1/2022.findings-acl.312