
Taxonomy Query Enrichment

Every non-blocked query passes through a taxonomy enrichment pipeline that injects structured hospital knowledge into the retrieval and generation stages. This page documents the query-time enrichment flow — from entity extraction through taxonomy resolution, ontology lookup, injection gating, the conditional doctor-list injection (Stage 5c), and prompt augmentation — explaining how structured knowledge-graph data complements vector search to produce grounded, entity-aware responses.

Scope vs. Query Enrichment Pipeline

Two pages cover "enrichment", at different levels. The Query Enrichment Pipeline is the canonical reference for the _qs_enrich_query() cascade — the three layers (SNOMED synonym, TREATS/OFFERS, Latin→Dutch) that append bridging vocabulary to the query string. This page is the broader taxonomy resolver flow that the cascade sits inside: entity extraction, the 12-resolver chain, the injection gate, Stage 5c, and prompt augmentation. Step 3 below is where the two meet.

Part of the Knowledge & Retrieval Steering triad

This is the query-time view of the Taxonomy. Resolvers R7–R10 fall back to SNOMED CT when the deterministic chain misses, and Stage 5b is the Value Framework reranker. See the Core Concepts flow for how all three subsystems compose on a single query.

For the ingestion-time taxonomy population see the Population Lifecycle and Seeding Pipeline. For the taxonomy data model see Knowledge Graph Overview. For the overall query pipeline context see Query Processing Pipeline.

Trade-offs

| Decision | Chosen | Alternatives considered | Rejected because |
| --- | --- | --- | --- |
| Resolver chain shape | 12 specialised resolvers, first-match wins | Single fuzzy resolver; LLM-only resolution; one resolver per entity type | A single fuzzy resolver collapses precision (cardilogie → cardiologie and radiologie), an LLM-only resolver pays $0.0001 per query for what is mostly cheap SQL, and one-resolver-per-type misses cross-type aliases (e.g., a single-word query that could be a department alias OR a condition alias). The 12-resolver chain runs the cheap precise resolvers first and falls through to the fuzzy fallback only as a last resort. |
| Injection gate | Four-rule gate: structural intent → inject; sparse vector → inject; low similarity → inject; default → suppress | Always inject; never inject; LLM-decided gate | Always-inject dilutes strong vector contexts with redundant taxonomy text and hurts faithfulness scores; never-inject costs the structural-question accuracy that the taxonomy is the authoritative source for. An LLM-decided gate adds a per-query LLM call to a path that doesn't otherwise need one. The four-rule gate is deterministic and fast, and its rules cover the cases the data shows matter. |
| Stage 5c (synthetic doctor-list injection) | Triple-gated: intent in doctor_lookup or dept_lookup AND list-signal phrase AND department hint | Always-inject the full doctor list; never inject; cross-encoder re-rank that promotes doctor pages | Always-inject blows the context budget for single-doctor queries; never-inject was the pre-fix shape and produced the dermatologist-list regression. A cross-encoder re-rank cannot discover doctors that the first-stage retrieval missed entirely. The triple gate fires only for the specific failure mode (list questions about a department's doctors) and is a no-op otherwise. |

Enrichment Pipeline Overview

Step 1: Entity Extraction

During intent classification, the Tier 2 LLM extracts structured medical entities alongside the intent. These entities drive all downstream taxonomy operations.

The ExtractedEntities structure contains:

| Field | Type | Example |
| --- | --- | --- |
| condition | str \| None | "hartkloppingen" |
| department | str \| None | "Cardiologie" |
| treatment | str \| None | "chemotherapie" |
| examination | str \| None | "MRI" |
| doctor | str \| None | "Dr. Peeters" |
| campus | str \| None | "Sint-Jan" |

For colloquial or indirect queries (e.g., "Wie kan helpen met hartproblemen?"), the LLM extracts the implicit entity ("hartproblemen" as a condition) even though no medical term appears explicitly. See ADR-0030 for the design rationale.

Implementation: rag_service.py:399–506 (_classify_intent_and_rewrite)
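As a minimal sketch of the structure in the table above, the entities could be modelled as a dataclass with six optional string fields. The field names come from the table; the `present()` helper is hypothetical and is not taken from rag_service.py.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class ExtractedEntities:
    """Entities emitted alongside the intent; every field is optional."""

    condition: Optional[str] = None
    department: Optional[str] = None
    treatment: Optional[str] = None
    examination: Optional[str] = None
    doctor: Optional[str] = None
    campus: Optional[str] = None

    def present(self) -> dict[str, str]:
        """Hypothetical helper: return only the fields that were filled in."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


# Colloquial query "Wie kan helpen met hartproblemen?": only the implicit
# condition is extracted, every other field stays None.
entities = ExtractedEntities(condition="hartproblemen")
```

Downstream stages can then iterate over `entities.present()` instead of checking each field for None.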

Step 2: Taxonomy Resolution (12-Resolver Chain)

The extracted entities pass through HospitalTaxonomy.resolve_search_query(), which runs a chain of responsibility — twelve resolvers executed in priority order. The first resolver to match wins. This design enables precise resolution even for ambiguous, misspelled, or multi-language input.

Resolver Descriptions

| # | Resolver | Purpose | Example |
| --- | --- | --- | --- |
| 1 | Enrichment Trigger | Detect navigational enhancement phrases | "meer informatie over..." |
| 2 | Campus Exact | Match campus names and aliases | "Sint-Jan", "Genk" → Campus Sint-Jan |
| 3 | Dept Skipgram | Order-independent multi-word matching | "intensive zorgen" matches "Intensieve Zorgen" |
| 4 | Dept N-gram | Consecutive 2/3-word pair matching | "spoed opname" → Spoedgevallen |
| 5 | Dept Single Alias | Broadest single-word alias match | "cardio" → Cardiologie |
| 6 | Dept Alias Map | Exact department name lookup | "Orthopedie" → Orthopedie |
| 7 | Condition Exact | SNOMED → CONDITION_ALIASES → raw keywords | "suikerziekte" → Diabetes Mellitus |
| 8 | Dept from Condition | Condition-to-department routing | Diabetes Mellitus → Endocrinologie |
| 9 | Treatment Exact | SNOMED → TREATMENT_ALIASES resolution | "hartfilmpje" → ECG |
| 10 | Examination Exact | SNOMED → EXAMINATION_ALIASES resolution | "bloedonderzoek" → Labo |
| 11 | Specialty Exact | Specialty name lookup | "orthopedisch chirurg" |
| 12 | Fuzzy Fallback | Misspelling detection (cutoff=0.8) | "cardilogie" → Cardiologie |
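The first-match-wins behaviour can be illustrated with a much smaller chain. The sketch below assumes three of the twelve resolvers and tiny hypothetical alias maps; it uses `difflib.get_close_matches` for the fuzzy step, which is an assumption about how the cutoff of 0.8 might be applied, not a statement about hospital_taxonomy.py.

```python
from difflib import get_close_matches
from typing import Callable, Optional

# Hypothetical alias data for illustration only.
DEPT_SINGLE_ALIASES = {"cardio": "Cardiologie"}
DEPT_NAMES = {
    "cardiologie": "Cardiologie",
    "orthopedie": "Orthopedie",
    "radiologie": "Radiologie",
}

Resolver = Callable[[str], Optional[str]]


def dept_single_alias(term: str) -> Optional[str]:
    return DEPT_SINGLE_ALIASES.get(term.lower())


def dept_alias_map(term: str) -> Optional[str]:
    return DEPT_NAMES.get(term.lower())


def fuzzy_fallback(term: str) -> Optional[str]:
    # n=1 keeps only the best candidate at or above the cutoff.
    matches = get_close_matches(term.lower(), list(DEPT_NAMES), n=1, cutoff=0.8)
    return DEPT_NAMES[matches[0]] if matches else None


# Cheap, precise resolvers first; the fuzzy fallback runs last.
CHAIN: list[tuple[str, Resolver]] = [
    ("dept_single_alias", dept_single_alias),
    ("dept_alias_map", dept_alias_map),
    ("fuzzy_fallback", fuzzy_fallback),
]


def resolve(term: str) -> Optional[tuple[str, str]]:
    """Return (resolver name, canonical value) for the first resolver that hits."""
    for name, resolver in CHAIN:
        hit = resolver(term)
        if hit is not None:
            return name, hit
    return None
```

Because the loop returns on the first hit, a term such as "cardio" never reaches the fuzzy step, which is what keeps the fallback from overriding a precise alias match.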

SNOMED CT Integration

Resolver 7 (Condition Exact) can optionally fall back to SNOMED CT synonym expansion via resolve_search_query_with_snomed() when the deterministic alias map misses. The synonym-expansion mechanism itself (alias map → SNOMED Dutch descriptions → IS-A concept expansion) is documented canonically in Query Enrichment → Layer 1; the underlying ontology integration is in SNOMED CT Terminology.

Implementation: hospital_taxonomy.py:583–1059 (resolve_search_query, _resolve_search_query_inner, resolve_search_query_with_snomed)

Step 3: Query Enrichment

After taxonomy resolution, the resolved canonical terms are appended to the search query (e.g. "hartkloppingen" → "hartkloppingen (Palpitaties, Cardiologie)") so that both the embedding and the BM25 tsvector see the bridging vocabulary, improving recall against the canonical-Dutch corpus.

This is the same _qs_enrich_query() cascade documented canonically — the three enrichment layers (SNOMED synonym expansion, taxonomy TREATS/OFFERS routing, Latin-to-Dutch translation), worked examples, and empirical impact — on the Query Enrichment Pipeline page. This page covers only how the step sits within the taxonomy resolver flow.

Implementation: rag_service.py:2158–2181 (_qs_enrich_query)
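The append step on its own can be sketched in a few lines. The function name `append_canonical_terms` and the duplicate-skipping rule are assumptions made for illustration; the actual `_qs_enrich_query()` cascade is documented on the Query Enrichment Pipeline page.

```python
def append_canonical_terms(query: str, canonical_terms: list[str]) -> str:
    """Append canonical terms in parentheses, skipping ones already in the query."""
    lowered = query.lower()
    extra: list[str] = []
    for term in canonical_terms:
        key = term.lower()
        # Assumed rule: do not repeat a term the user already typed,
        # and do not append the same term twice.
        if key not in lowered and key not in (t.lower() for t in extra):
            extra.append(term)
    return f"{query} ({', '.join(extra)})" if extra else query


enriched = append_canonical_terms("hartkloppingen", ["Palpitaties", "Cardiologie"])
```

With the inputs from the worked example, `enriched` is "hartkloppingen (Palpitaties, Cardiologie)", and a query that already contains every canonical term is returned unchanged.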

Step 4: Sequential Retrieval

Three operations execute sequentially (asyncpg does not support concurrent queries on the same session):

4a. Vector Search (Enriched Query)

Standard pgvector cosine similarity search using the enriched query. Returns document chunks ranked by semantic similarity. See Hybrid Search for the full retrieval architecture.

4b. Taxonomy Search (Intent-Routed SQL)

The TaxonomyQueryService routes to intent-specific SQL handlers that traverse the taxonomy relationships:

| Intent | Handler | SQL Pattern |
| --- | --- | --- |
| doctor_lookup | _handle_doctor_lookup | doctors → doctor_departments → departments → department_campuses |
| department_or_service_lookup | _handle_department_lookup | departments → department_campuses → doctors |
| condition_information | _handle_condition_info | conditions → dept_handles_condition → departments → doctors |
| treatment_or_exam_information | _handle_treatment_exam_info | treatments/examinations → dept_offers_treatment → departments |
| booking_or_contact | _handle_booking_contact | departments → department_campuses (with contact info) |

Each handler returns structured results like:

{
  "type": "department_for_condition",
  "department": "Cardiologie",
  "condition": "Palpitaties",
  "campuses": "ZOL Genk, campus Sint-Jan",
  "doctors": "Dr. Peeters, Dr. Janssen",
  "source": "taxonomy"
}

These are converted to natural language content strings and merged with vector results.
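A rough sketch of that conversion, assuming a hypothetical English sentence template (the wording the service actually produces is not shown on this page):

```python
def render_taxonomy_result(result: dict[str, str]) -> str:
    """Turn one structured taxonomy row into a sentence for the context window."""
    if result.get("type") == "department_for_condition":
        # Hypothetical template; the production wording may differ.
        return (
            f"Condition {result['condition']} is handled by department "
            f"{result['department']} (campuses: {result['campuses']}; "
            f"doctors: {result['doctors']}). Source: {result['source']}."
        )
    # Generic fallback for result types this sketch does not model.
    return ", ".join(f"{k}: {v}" for k, v in result.items() if k != "type")


row = {
    "type": "department_for_condition",
    "department": "Cardiologie",
    "condition": "Palpitaties",
    "campuses": "ZOL Genk, campus Sint-Jan",
    "doctors": "Dr. Peeters, Dr. Janssen",
    "source": "taxonomy",
}
sentence = render_taxonomy_result(row)
```

Rendering rows as sentences lets the taxonomy results sit next to vector chunks in the same context block without a separate format for the LLM to parse.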

Implementation: taxonomy/query_service.py:48–435

4c. Ontology Lookup

Running as part of the sequential retrieval chain, the ontology lookup:

  1. Entity Linking (EntityLinker.link_multiple()): Maps extracted entity mentions to their taxonomy database IDs
  2. Relationship Retrieval (OntologyQueryService.build_context()): Fetches relationships (PART_OF, TREATED_BY, HAS_FACILITY, etc.) for the linked entities
  3. Context Formatting: Produces an OntologyContext object that renders as a prompt block

The ontology block is prepended to the assembled context, giving the LLM explicit knowledge of entity relationships.

Implementation: rag_service.py:2207–2265 (_qs_ontology_lookup)

Step 5: Taxonomy Injection Gate

Not all queries benefit from taxonomy data. When vector search returns strong, relevant results, injecting taxonomy data can dilute the context with less relevant structured information. The injection gate applies four rules in order:

| Rule | Condition | Action | Rationale |
| --- | --- | --- | --- |
| 1. Structural intent | Intent is doctor_lookup, department_lookup, condition_info, treatment_info, or symptom_description | Inject | Taxonomy is the authoritative source for organizational data |
| 2. Sparse vector results | Vector returned fewer chunks than graph_injection_min_vector_results | Inject | Graph fills the retrieval gap |
| 3. Low similarity | Best vector similarity score below graph_injection_similarity_threshold | Inject | Rescue scenario: vector results are weak |
| 4. Default | Strong vector results with sufficient similarity | Suppress | Avoid diluting rich vector context |

When suppressed, taxonomy results from Step 4b are excluded from the context window. Only vector chunks proceed to the LLM.

Implementation: rag_service.py:2571–2662 (_should_inject_taxonomy_context, _build_context_from_chunks)
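The four rules can be expressed as a short ordered check. This is a sketch under stated assumptions: the function name, the returned reason labels, and the use of a plain list of similarity scores are hypothetical, while the intent names and the defaults (3 and 0.35) are taken from this page.

```python
STRUCTURAL_INTENTS = frozenset(
    {
        "doctor_lookup",
        "department_lookup",
        "condition_info",
        "treatment_info",
        "symptom_description",
    }
)


def should_inject_taxonomy(
    intent: str,
    similarities: list[float],
    min_vector_results: int = 3,
    similarity_threshold: float = 0.35,
) -> tuple[bool, str]:
    """Apply the four gate rules in order and report which one decided."""
    # Rule 1: the taxonomy is authoritative for structural questions.
    if intent in STRUCTURAL_INTENTS:
        return True, "structural_intent"
    # Rule 2: too few vector chunks, so the graph fills the gap.
    if not similarities or len(similarities) < min_vector_results:
        return True, "sparse_vector_results"
    # Rule 3: even the best chunk is a weak match.
    if max(similarities) < similarity_threshold:
        return True, "low_similarity"
    # Rule 4: strong vector context, so taxonomy results are suppressed.
    return False, "default_suppress"
```

Returning the deciding rule alongside the boolean is a design choice of the sketch: it makes gate decisions easy to log when tracing why a given query did or did not receive taxonomy context.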

Stage 5c: Synthetic Department-Doctor-List Injection

Stage 5c is a post-retrieval, pre-context-assembly step that fires only when all three of the following hold:

  1. The classified intent is DOCTOR_LOOKUP or DEPARTMENT_OR_SERVICE_LOOKUP.
  2. The user query contains a list-signal phrase matched by _LIST_SIGNAL_RE (e.g., alle, welke artsen, wie werkt er, list all, tous les médecins).
  3. A department or specialty hint can be resolved either from the classifier's ExtractedEntities (department, service, or doctor) or from a regex sweep over the rewritten query.

When all three gates pass, the stage queries the taxonomy for all doctors associated with the resolved department, builds a synthetic chunk listing them, and inserts it into the retrieved-chunks set before context assembly. This guarantees the LLM has the full roster available so the system prompt's "list all members" exception rule can fire faithfully — the LLM cannot list doctors it never saw in the context.

The stage was introduced as a regression fix for the 2026-05-09 incident: a 6-turn voice conversation about dermatologists capped at the same two names because vector retrieval surfaced individual doctor brochure pages without the shared department roster, and re-ranking could only reorder what retrieval returned. The synthetic-chunk approach guarantees the roster is in the context regardless of which doctor brochures retrieval picked up.

When any of the three gates is unsatisfied (e.g., the query is "Wie is Dr. X?" — a single-doctor lookup, no list signal), Stage 5c is a no-op and adds zero latency. When it does fire, the cost is one indexed taxonomy query (~5 ms) plus the synthetic chunk's contribution to the assembled context (a single short paragraph; well within budget).
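The triple gate can be sketched as three early returns followed by the roster lookup. Everything here is illustrative: `LIST_SIGNAL_RE` is a hypothetical stand-in that covers only the example phrases listed above (the real `_LIST_SIGNAL_RE` pattern is not shown on this page), the roster is an in-memory dict instead of a taxonomy query, and the department hint is passed in directly.

```python
import re
from typing import Optional

# Hypothetical pattern built from the example phrases on this page.
LIST_SIGNAL_RE = re.compile(
    r"\b(alle|welke artsen|wie werkt er|list all|tous les médecins)\b",
    re.IGNORECASE,
)
LIST_INTENTS = frozenset({"doctor_lookup", "department_or_service_lookup"})


def maybe_build_roster_chunk(
    intent: str,
    query: str,
    department_hint: Optional[str],
    roster: dict[str, list[str]],
) -> Optional[str]:
    """Return a synthetic roster chunk, or None when any gate fails."""
    if intent not in LIST_INTENTS:  # gate 1: intent
        return None
    if not LIST_SIGNAL_RE.search(query):  # gate 2: list-signal phrase
        return None
    if not department_hint:  # gate 3: department hint
        return None
    doctors = roster.get(department_hint)
    if not doctors:  # assumed: nothing to inject for an empty roster
        return None
    return f"Artsen van de dienst {department_hint}: {', '.join(doctors)}."


roster = {"Dermatologie": ["Dr. A", "Dr. B", "Dr. C"]}
chunk = maybe_build_roster_chunk(
    "doctor_lookup", "Welke artsen werken er bij dermatologie?", "Dermatologie", roster
)
```

A single-doctor query such as "Wie is Dr. X?" fails the second gate and returns None before any lookup happens, which matches the no-op behaviour described above.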

Interaction with post-answer enrichment

When Stage 5c fires, all department doctors are already visible to the LLM, so the post-answer taxonomy enrichment step (described below) finds zero "new" doctors and appends nothing. The two mechanisms are complementary: Stage 5c is the proactive path for queries the gate identifies as list questions; post-answer enrichment is the safety net for multi-part queries whose intent classification didn't trigger Stage 5c.

Implementation: rag_service.py:2134–2197 (_qs_maybe_inject_doctor_list); call site at rag_service.py:3537.

Stage Execution Order

For an examiner tracing a live query, the post-classification stages execute in this order:

| Stage | Purpose | Cost when active | Cost when no-op |
| --- | --- | --- | --- |
| 5a Entity extraction (via intent classification) | Extract structured entities from the query | included in intent LLM call | |
| Step 2 Taxonomy resolution (12-resolver chain) | Resolve user terms to canonical entity IDs | ~10 ms typical; ~30 ms with fuzzy fallback | |
| Step 3 Query enrichment | Append canonical terms to search query | ~1 ms | |
| Step 4 Sequential retrieval (vector + BM25 + taxonomy) | Gather candidate chunks | ~800 ms | |
| Stage 5b Value Framework affinity rerank | Multiply scores by intent × content_category matrix | ~2 ms | ~2 ms (always-on) |
| Stage 5c Synthetic doctor-list injection | Append synthetic chunk with full department roster | ~5 ms | 0 ms |
| Step 5 Taxonomy injection gate | Decide whether taxonomy results enter the context | < 1 ms | |
| Step 6 Routing hint injection | Prepend "condition X falls under department Y" directive | < 1 ms | |
| Step 7 System prompt augmentation | Append GRAPH_CONTEXT_INSTRUCTIONS | < 1 ms | |

Step 6: Routing Hint Injection

When taxonomy resolves a condition→department mapping, a routing hint is prepended to the assembled context. This is a strong directive that ensures the LLM always mentions the correct department:

--- ORGANISATIE-INFORMATIE ---
De aandoening "Palpitaties" valt onder de dienst Cardiologie.
Je MOET deze dienst vermelden in je antwoord.

The routing hint is injected regardless of the injection gate decision — even if taxonomy results are suppressed, the organizational routing information is always present. This prevents the LLM from naming incorrect departments when the vector context alone is ambiguous.

Implementation: rag_service.py:2887–2903 (_qs_inject_routing_hint)
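A minimal sketch of the prepend step, using the three-line block shown above as the template. The function name and signature are hypothetical; only the Dutch wording is taken from this page.

```python
from typing import Optional

ROUTING_HEADER = "--- ORGANISATIE-INFORMATIE ---"


def inject_routing_hint(
    context: str, condition: Optional[str], department: Optional[str]
) -> str:
    """Prepend the routing directive when a condition→department mapping exists."""
    if not condition or not department:
        # No mapping resolved: leave the assembled context untouched.
        return context
    hint = (
        f"{ROUTING_HEADER}\n"
        f'De aandoening "{condition}" valt onder de dienst {department}.\n'
        "Je MOET deze dienst vermelden in je antwoord."
    )
    return f"{hint}\n\n{context}"


augmented = inject_routing_hint("[1] Brochure hartritme", "Palpitaties", "Cardiologie")
```

Note that the sketch takes no gate decision as input, which mirrors the rule that the hint is injected whether or not taxonomy results were suppressed.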

Step 7: System Prompt Augmentation

When taxonomy data is present in the context (either via injection gate or routing hint), the system prompt receives additional GRAPH_CONTEXT_INSTRUCTIONS:

The following structured information was retrieved from the hospital knowledge graph.

  • ALWAYS include relevant department names and organizational information.
  • When a condition is discussed, you MUST mention which department(s) handle it.
  • For department routing (which department handles a condition), treat the graph data as AUTHORITATIVE.
  • Graph-derived information with a [number] marker — use it to cite.
  • Graph-derived information without a [number] marker is supplementary and should NOT be cited with numbers.

These instructions ensure the LLM prioritizes taxonomy-derived organizational data over potentially conflicting vector search results. The "AUTHORITATIVE" directive is critical — it means when the taxonomy says Cardiologie handles Palpitaties, the LLM will state this even if a vector chunk from an outdated brochure suggests otherwise.

Implementation: prompts.py:244–257, rag_service.py:4116

End-to-End Worked Example

User query: "Waar kan ik terecht met hartkloppingen?"

See it across five queries

This example traces one query. For the same flow walked across five contrasting queries — taxonomy-TREATS department routing, a doctor-list (Stage 5c) question, a medical-dosing safety-gate decision, a cross-language rewrite, and a SNOMED synonym expansion — all with values captured live from the pilot, see A Query, End-to-End.

| Stage | Input | Output |
| --- | --- | --- |
| Intent Classification + Rewriting | Raw query | One LLM call (ADR-0030) emits all three: intent=condition_information, entities={condition: "hartkloppingen"}, and rewritten_query="Welke afdeling van ZOL behandelt hartkloppingen?" (canonical Dutch; this becomes the search_query the rows below operate on). See Query Rewriting. |
| Taxonomy Resolution | entity "hartkloppingen" | condition="Palpitaties", department="Cardiologie" (Resolver 7→8) |
| Query Enrichment | "hartkloppingen" | "hartkloppingen (Palpitaties, Cardiologie)" |
| Vector Search | Enriched query | 8 chunks about palpitations, heart rhythm, Cardiologie brochure |
| Taxonomy Search | condition→dept SQL | {dept: "Cardiologie", campuses: "Sint-Jan, André Dumont", doctors: [...]} |
| Ontology Lookup | entity IDs | Palpitaties TREATED_BY Hartritme-onderzoek, Cardiologie HAS_FACILITY Hartcentrum |
| Injection Gate | Rule 1: structural intent | INJECT (condition_information) |
| Routing Hint | condition→dept | "De aandoening Palpitaties valt onder de dienst Cardiologie." |
| System Prompt | has_graph_context=true | + GRAPH_CONTEXT_INSTRUCTIONS appended |
| LLM Response | Full context | "Bij hartkloppingen kunt u terecht bij de dienst Cardiologie van ZOL..." |

Configuration

Two settings in config.py control the injection gate thresholds:

| Setting | Default | Description |
| --- | --- | --- |
| graph_injection_min_vector_results | 3 | Minimum vector results before taxonomy is suppressed |
| graph_injection_similarity_threshold | 0.35 | Minimum vector similarity before taxonomy rescue |

Both are manageable via the admin Settings API at runtime.

Key Implementation Files

Line Number References

Line numbers are approximate and shift as the codebase evolves. Use them as starting-point hints rather than exact locations.

| Component | File | Lines (approx.) |
| --- | --- | --- |
| Entity extraction (via intent) | rag_service.py | ~399–506 |
| Taxonomy resolution chain | hospital_taxonomy.py | ~583–1059 |
| Query enrichment | rag_service.py | ~2158–2181 |
| Taxonomy SQL queries | taxonomy/query_service.py | ~48–435 |
| Ontology lookup | rag_service.py | ~2207–2265 |
| Injection gate | rag_service.py | ~2571–2662 |
| Routing hint | rag_service.py | ~2887–2903 |
| Graph context instructions | prompts.py | ~244–257 |
| Prompt assembly | rag_service.py | ~4084–4198 |

Post-Answer Taxonomy Enrichment

In addition to the enrichment that runs before generation (Steps 1–7 above), the pipeline performs a post-answer enrichment step after the LLM generates its response. This addresses a specific gap: multi-part queries (e.g., "Wat zijn de bezoekuren? Bij wie kan ik terecht bij kinderpsychiatrie?") may be classified with a non-structural intent like NAVIGATIONAL, causing the taxonomy injection gate to suppress graph data. The LLM then answers from vector search alone, potentially missing relationship data.

How It Works

Specialty-token guard (2026-05-29)

The guard node G was added after the N*-afdeling incident: a DEPARTMENT entity named for a nursing-ward code (N*-afdeling) was matched on the bare word "afdeling" and then won the display tiebreak (min(by length)) over the real Endocrinologie. A department name now must carry a specialty token (an alphabetic word ≥ 4 chars that is not a generic suffix like afdeling/dienst) to enter the WORKS_IN lookup or the display; otherwise enrichment is skipped rather than printing a ward code. See Release Notes — May 29, 2026.
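The specialty-token rule can be sketched as a small predicate. The suffix set below contains only the two examples named above and is therefore an assumption about the full list; the function name is hypothetical.

```python
import re

# Assumed generic suffixes; the page names afdeling and dienst as examples.
GENERIC_SUFFIXES = frozenset({"afdeling", "dienst"})


def has_specialty_token(department_name: str) -> bool:
    """True when the name has an alphabetic word of 4+ chars that is not generic."""
    # [^\W\d_]+ matches runs of letters only (Unicode-aware, no digits or underscores).
    words = re.findall(r"[^\W\d_]+", department_name)
    return any(
        len(word) >= 4 and word.lower() not in GENERIC_SUFFIXES for word in words
    )
```

For "N*-afdeling" the only alphabetic words are "N" and "afdeling", so the predicate is false and the ward code is kept out of the lookup; "Endocrinologie" passes on its single specialty word.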

After _qs_finalize generates the response:

  1. Department scan: All published department names are checked against the response text (case-insensitive).
  2. Relationship lookup: For each matched department, WORKS_IN relationships are queried from published_relationships.
  3. Deduplication: Doctor names already mentioned in the response are filtered out.
  4. Append: Remaining doctors are appended as a clearly marked supplement:
---
*Aanvullende informatie uit de ziekenhuistaxonomie:*
Artsen verbonden aan Kinderpsychiatrie: Dr. Frauke Martens, Dr. Karen Gillaerts.
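The four steps above can be sketched as one pure function. This is an illustration under stated assumptions: the WORKS_IN relationships are passed in as a plain dict rather than queried from published_relationships, matching is simple case-insensitive substring search, and the function name is hypothetical.

```python
SUPPLEMENT_HEADER = "*Aanvullende informatie uit de ziekenhuistaxonomie:*"


def append_missing_doctors(answer: str, works_in: dict[str, list[str]]) -> str:
    """Append doctors of departments named in the answer that it did not mention."""
    lowered = answer.lower()
    lines: list[str] = []
    for department, doctors in works_in.items():
        # Step 1: department scan (case-insensitive).
        if department.lower() not in lowered:
            continue
        # Steps 2 and 3: take the department's doctors, drop those already named.
        new_doctors = [d for d in doctors if d.lower() not in lowered]
        if new_doctors:
            lines.append(
                f"Artsen verbonden aan {department}: {', '.join(new_doctors)}."
            )
    if not lines:
        return answer  # purely additive: nothing to add, answer unchanged
    # Step 4: append a clearly marked supplement after the original text.
    return answer + "\n\n---\n" + SUPPLEMENT_HEADER + "\n" + "\n".join(lines)


answer = "U kunt terecht bij Kinderpsychiatrie. Dr. Frauke Martens helpt u verder."
works_in = {
    "Kinderpsychiatrie": ["Dr. Frauke Martens", "Dr. Karen Gillaerts"],
    "Cardiologie": ["Dr. Peeters"],
}
supplemented = append_missing_doctors(answer, works_in)
```

The original answer is always a prefix of the result, which is the property the zero-regression principle relies on.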

Design Principles

| Principle | Implementation |
| --- | --- |
| Zero regression | Purely additive: never modifies existing answer text |
| Verified data only | Uses published taxonomy (operator-approved, version-controlled) |
| Non-blocking | Wrapped in try/except; failures are logged without interrupting the response |
| Badge activation | Sets has_graph_context = True when enrichment fires, triggering the "Verified with hospital data" badge |
| Multi-part safe | Works regardless of intent classification, since it runs post-generation |

This pattern is particularly valuable for the ZOL use case where patients ask complex, multi-faceted questions that span navigational and structural concerns in a single query.

References