Query Enrichment Pipeline

Hospital website content is written in everyday Dutch ("te traag werkende schildklier"), but patients arrive with Latin medical terminology ("hypothyreoïdie"), colloquial terms ("rugpijn"), or clinical abbreviations ("AVM"). Without enrichment, retrieval embeds the patient's surface form against a corpus that does not share that vocabulary, and recall suffers.

Query enrichment runs as _qs_enrich_query() in the RAG service — between intent classification and retrieval — and applies three enrichment layers in cascade. The cascade does not modify the user's question; it appends canonical terms in parentheses so that both the embedding and the BM25 tsvector see the bridging vocabulary.
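
The append-in-parentheses step can be illustrated with a minimal sketch. The function name `append_canonical_terms` and its de-duplication behaviour are assumptions made for illustration; this is not the body of _qs_enrich_query.

```python
def append_canonical_terms(query: str, canonical_terms: list[str]) -> str:
    """Return the query with canonical terms appended in parentheses.

    The user's wording is never rewritten; terms that merely repeat the
    query (case-insensitively) or each other are skipped.
    """
    seen = {query.casefold()}
    extra: list[str] = []
    for term in canonical_terms:
        key = term.casefold()
        if key not in seen:
            seen.add(key)
            extra.append(term)
    if not extra:
        return query  # nothing to bridge: the query passes through unchanged
    return f"{query} ({', '.join(extra)})"


print(append_canonical_terms("hypothyreoïdie", ["Hypothyroïdi"]))
# prints: hypothyreoïdie (Hypothyroïdi)
```

Because the original tokens stay in place, the same enriched string can be handed to both the embedding model and the BM25 path.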

Scope — this page vs. Taxonomy Query Enrichment

This page covers the pre-retrieval term-bridging cascade (_qs_enrich_query): SNOMED synonyms, taxonomy TREATS routing, and Latin→Dutch translation appended to the search query. The broader structured-knowledge injection flow — entity resolution, ontology lookup, the conditional doctor-list injection (Stage 5c), and prompt augmentation — is documented in Taxonomy Query Enrichment.

Why enrichment exists

The need for enrichment surfaced as a regression during the pilot evaluation on 2026-03-20: the SNOMED synonym expansion that existed in the Neo4j-based pipeline had been silently lost during the migration to the PostgreSQL taxonomy on 2026-03-07. Three golden evaluation questions (GQ-168, GQ-169, GQ-173) failed because medical terminology queries could not find their Dutch equivalents in the taxonomy. The cascade documented here was the structural response.

This is also the canonical example of the silent-failure discipline codified in CLAUDE.md (R1/R2/R3). The post-mortem not only closed the recall regression but also added size logs to every collection-returning function in the enrichment path, and a contract test for the cross-component handoff between the intent classifier's ExtractedEntities and the enrichment service's resolved-term injection.

Trade-offs

Decision	Chosen	Alternatives considered	Rejected because
Where to enrich	Append canonical terms in parentheses to the user query before embedding/BM25	Replace user query with canonical term; append after embedding (re-embed twice); enrich only at re-rank	Replacement loses the user's surface form, which the LLM still wants for paraphrase fidelity in the answer. Re-embedding twice doubles the cost of the dense path on every query. Late enrichment (at re-rank) is too late — the recall failure is upstream of re-rank. Pre-retrieval append leaves the original token in place and adds the canonical neighbour as a free signal for both the dense and sparse paths.
Layer order	SNOMED → taxonomy TREATS → per-word fallback	Run all layers, take union	The layers are in quality order (SNOMED is most precise, per-word fallback is most permissive). First-match short-circuits to keep the appended canonical-term list focused. Union-of-all produced noisy enriched queries (5–6 canonical terms appended) that hurt BM25 precision.
Synonym source	SNOMED CT Belgian Edition (529 K active Dutch descriptions)	Curated alias map; Wiktionary; Wikipedia redirects	A curated map cannot keep pace with clinical vocabulary; Wiktionary lacks medical depth; Wikipedia redirects are too noisy. SNOMED CT is the international standard for clinical terminology with native Dutch coverage. See SNOMED CT Terminology for ingestion and resolution detail.
Hospital-agnosticism	Latin → Dutch translation lives in the LLM intent-classifier prompt, not in code	Hardcoded term mapping in code	A code-level mapping must be maintained per hospital; the LLM-prompt approach is hospital-agnostic — the model uses its medical knowledge natively. The prompt instruction is a single sentence.
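
The "first-match short-circuits" choice in the layer-order row can be sketched as follows. This is one reading of that row, not the production control flow: the three layer functions are stand-ins with hardcoded answers, and `first_match` is a hypothetical name.

```python
from typing import Callable, Optional

Layer = Callable[[str], list[str]]


def first_match(query: str, layers: list[tuple[str, Layer]]) -> tuple[Optional[str], list[str]]:
    """Run layers in quality order and stop at the first one that yields terms."""
    for name, layer in layers:
        terms = layer(query)
        if terms:
            return name, terms  # short-circuit: later, noisier layers never run
    return None, []


# Stand-in layers (assumptions for the demo, not real lookups).
def snomed_layer(query: str) -> list[str]:
    return {"hypothyreoïdie": ["Hypothyroïdi"]}.get(query, [])


def treats_layer(query: str) -> list[str]:
    return ["Neurologie", "Neurochirurgie"] if "epilepsie" in query else []


def per_word_layer(query: str) -> list[str]:
    return []


LAYERS = [("snomed", snomed_layer), ("treats", treats_layer), ("per_word", per_word_layer)]

print(first_match("hypothyreoïdie", LAYERS))
print(first_match("ik heb last van epilepsie", LAYERS))
```

A union-of-all variant would concatenate every layer's output instead of returning early, which is the behaviour the table rejects as too noisy for BM25.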

Architecture

Layer 1 — SNOMED synonym expansion

Purpose: bridge Latin or scientific medical terms to the Dutch names used in the hospital taxonomy.

Algorithm:

1. Look up the exact query in app.snomed_descriptions (529 K active Dutch medical terms — see SNOMED CT Terminology).
2. For each match, find taxonomy entities sharing the same snomed_concept_id.
3. If the entity name differs from the query token, inject it as a search enrichment.

Worked example:

Query: "hypothyreoïdie"
SNOMED match: concept 40930008 (hypothyreoïdie)
Published taxonomy entity with concept 40930008: "Hypothyroïdi" (CONDITION, status published)
Enriched query: "hypothyreoïdie (Hypothyroïdi)"

Fallback within Layer 1: if the exact match fails, the resolver tries a fuzzy match, implemented as a substring search (LIKE '%query%') against SNOMED descriptions. This catches compound terms: a query for discushernia matches the description lumbale discushernia.
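
The exact lookup, the LIKE fallback, and the join on snomed_concept_id can be sketched against an in-memory SQLite database. Everything about the schema here is an assumption made so the example runs: the production tables live in PostgreSQL, `taxonomy_entities` is a hypothetical table name, and the `demo-0001` concept with its "Rughernia" entity is made-up demo data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in for the SNOMED and taxonomy tables (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE snomed_descriptions (term_lower TEXT, snomed_concept_id TEXT);
        CREATE TABLE taxonomy_entities (name TEXT, snomed_concept_id TEXT, status TEXT);
        INSERT INTO snomed_descriptions VALUES ('hypothyreoïdie', '40930008');
        INSERT INTO snomed_descriptions VALUES ('lumbale discushernia', 'demo-0001');
        INSERT INTO taxonomy_entities VALUES ('Hypothyroïdi', '40930008', 'published');
        INSERT INTO taxonomy_entities VALUES ('Rughernia', 'demo-0001', 'published');
        """
    )
    return conn


def snomed_synonyms(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return published taxonomy names that share a SNOMED concept with the query."""
    q = query.strip().lower()
    concept_sql = "SELECT DISTINCT snomed_concept_id FROM snomed_descriptions WHERE term_lower "
    rows = conn.execute(concept_sql + "= ?", (q,)).fetchall()
    if not rows:
        # Fallback: substring match. Note that % and _ inside q act as wildcards here.
        rows = conn.execute(concept_sql + "LIKE ?", (f"%{q}%",)).fetchall()
    names: list[str] = []
    for (concept_id,) in rows:
        entity_rows = conn.execute(
            "SELECT name FROM taxonomy_entities "
            "WHERE snomed_concept_id = ? AND status = 'published' ORDER BY name",
            (concept_id,),
        ).fetchall()
        for (name,) in entity_rows:
            # Only inject names that add vocabulary the query does not already have.
            if name.lower() != q and name not in names:
                names.append(name)
    return names


conn = build_demo_db()
print(snomed_synonyms(conn, "hypothyreoïdie"))  # exact path
print(snomed_synonyms(conn, "discushernia"))    # substring fallback path
```

A query with no SNOMED description at all returns an empty list, which leaves the search query unchanged.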

Layer 2 — taxonomy TREATS / OFFERS expansion

Purpose: when the user asks about a condition, append the department names that treat that condition so retrieval lifts the relevant department-overview pages.

Algorithm:

1. Pull the condition field from the intent classifier's ExtractedEntities.
2. Query published_relationships where relationship_type IN ('TREATS', 'OFFERS', 'PERFORMS') and target_id = condition.id.
3. Append the resolved department names to the search query.

Worked example:

Query: "ik heb last van epilepsie"
Intent classifier extracts: condition = "epilepsie"
Taxonomy: Neurologie TREATS Epilepsie; Neurochirurgie TREATS Epilepsie
Enriched query: "ik heb last van epilepsie (Neurologie, Neurochirurgie)"

This ensures department overview pages rank higher in retrieval — not just condition-explanation pages — so the LLM gets the routing answer the patient actually needs.
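
A minimal sketch of the Layer 2 lookup, again on in-memory SQLite so it runs anywhere. The `entities` table, the `source_id` column, and the function name are assumptions for illustration; only published_relationships, relationship_type, target_id, and the three relationship types come from the algorithm above.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT, entity_type TEXT);
    CREATE TABLE published_relationships (
        source_id INTEGER, relationship_type TEXT, target_id INTEGER
    );
    INSERT INTO entities VALUES (1, 'Epilepsie', 'CONDITION');
    INSERT INTO entities VALUES (2, 'Neurologie', 'DEPARTMENT');
    INSERT INTO entities VALUES (3, 'Neurochirurgie', 'DEPARTMENT');
    INSERT INTO published_relationships VALUES (2, 'TREATS', 1);
    INSERT INTO published_relationships VALUES (3, 'TREATS', 1);
    """
)


def departments_for_condition(conn: sqlite3.Connection, condition: Optional[str]) -> list[str]:
    """Return names of entities that TREAT/OFFER/PERFORM the given condition."""
    if not condition:
        return []  # layer is skipped when the classifier extracted no condition
    rows = conn.execute(
        """
        SELECT dept.name
        FROM published_relationships AS rel
        JOIN entities AS cond ON cond.id = rel.target_id
        JOIN entities AS dept ON dept.id = rel.source_id
        WHERE rel.relationship_type IN ('TREATS', 'OFFERS', 'PERFORMS')
          AND lower(cond.name) = ?
        ORDER BY rel.source_id
        """,
        (condition.lower(),),
    ).fetchall()
    # De-duplicate while preserving order (a department may have several edges).
    return list(dict.fromkeys(name for (name,) in rows))


query = "ik heb last van epilepsie"
departments = departments_for_condition(conn, "epilepsie")
print(f"{query} ({', '.join(departments)})")
# prints: ik heb last van epilepsie (Neurologie, Neurochirurgie)
```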

Layer 3 — Latin-to-Dutch translation (LLM)

Purpose: for inputs that Layers 1 and 2 cannot resolve, the intent classifier's reformulated_query carries the Latin → Dutch translation.

The intent classification prompt includes the rule:

"When the user uses Latin or scientific medical terms, ALWAYS include the common Dutch equivalent in the reformulated_query. Hospital websites use Dutch, not Latin. Use your medical knowledge to translate."

This is hospital-agnostic: there are no hardcoded term mappings. The LLM uses its medical knowledge natively, which means no per-tenant maintenance.

Relationship to Stage 5c (synthetic doctor-list injection)

When Layer 2 resolves a condition to a department, downstream Stage 5c may also fire if the intent is doctor_lookup AND the query contains a list-signal ("alle", "welke artsen", "wie werkt er"). In that case the resolved department from Layer 2 is the same hint Stage 5c uses to fetch the full doctor roster. See Taxonomy Query Enrichment and Query Pipeline §Stage 5c.
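
The firing condition described above (doctor_lookup intent AND a list-signal) can be written as a small predicate. The function name, the whole-word matching, and the idea that the three quoted phrases are the complete signal list are assumptions; the actual Stage 5c check is documented in Query Pipeline.

```python
import re

# The three example signals quoted above; the real list may be longer.
LIST_SIGNALS = ("alle", "welke artsen", "wie werkt er")


def stage_5c_should_fire(intent: str, query: str) -> bool:
    """True when the intent is doctor_lookup and the query contains a list-signal."""
    if intent != "doctor_lookup":
        return False
    text = query.casefold()
    # Whole-word matching so that "alle" does not fire on "allergie".
    return any(re.search(rf"\b{re.escape(signal)}\b", text) for signal in LIST_SIGNALS)


print(stage_5c_should_fire("doctor_lookup", "Welke artsen werken op neurologie?"))
print(stage_5c_should_fire("doctor_lookup", "Is er een arts voor allergie?"))
```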

Empirical impact

Measured on the golden evaluation set across the pre-enrichment baseline (pilot v2) and the post-enrichment build (pilot v7):

Metric	Before (pilot v2, n=268)	After (pilot v7, n=299)	Δ
SNOMED terminology category	88.0 % (22/25)	100.0 % (33/33)	+12.0 pp
Condition-department routing	94.7 % (36/38)	100.0 % (46/46)	+5.3 pp
Overall pass rate	95.9 % (257/268)	99.0 % (296/299)	+3.1 pp

The denominator change (268 → 299) reflects the growth of the golden set as new failure cases were added between the two runs, so the comparison is across different set sizes, not the same set. The category denominators grew as well (25 → 33 and 38 → 46). The category-level deltas (SNOMED, condition-department) still apply the same per-question pass/fail criterion and are the cleanest evidence of the cascade's effect.
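
The percentages and deltas in the table follow directly from the pass counts. A short check, using only the counts reported above (the helper names are illustrative):

```python
def pass_rate(passed: int, total: int) -> float:
    """Pass rate in percent, rounded to one decimal as in the table."""
    return round(100 * passed / total, 1)


# (passed, total) before and after, copied from the table above.
RESULTS = {
    "SNOMED terminology category": ((22, 25), (33, 33)),
    "Condition-department routing": ((36, 38), (46, 46)),
    "Overall pass rate": ((257, 268), (296, 299)),
}


def summarize(results: dict) -> dict:
    """Map each metric to (before %, after %, delta in percentage points)."""
    summary = {}
    for metric, (before, after) in results.items():
        b, a = pass_rate(*before), pass_rate(*after)
        summary[metric] = (b, a, round(a - b, 1))
    return summary


for metric, (b, a, delta) in summarize(RESULTS).items():
    print(f"{metric}: {b:.1f} % -> {a:.1f} % ({delta:+.1f} pp)")
```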

A re-run on the post-OpenAI-embedding-migration corpus is not yet measured.

Performance

Layer	Cost per query	Notes
Layer 1 — SNOMED exact lookup	~5 ms	Single SQL query against indexed `snomed_descriptions(term_lower)`. Fuzzy fallback adds ~5 ms when exact misses.
Layer 2 — taxonomy TREATS	~3 ms	Single SQL query against indexed `published_relationships`. Skipped when no `condition` was extracted.
Layer 3 — LLM translation	0 ms (additive)	Already happens inside intent classification; not a separate call.

Total enrichment overhead: < 10 ms in the typical case where Layers 1 and 2 both hit (~5 ms + ~3 ms), about 5 ms when only Layer 1 fires on an exact match, 0 ms when no enrichment applies.

Known limitations

dyslipidemie — SNOMED concept 370992007 exists but no published taxonomy entity carries this concept (no ZOL page about dyslipidemie). This is a content gap, not a code issue. The resolver returns no enrichment; retrieval falls back to vector-only.
hernia nuclei pulposi — the Belgian SNOMED edition maps this term to a different concept ID than discushernia. Layer 3 (LLM translation) handles this case because the model knows the two are synonyms, even though the SNOMED graph does not link them.
Compound-word boundary — bloeddrukverlagend (blood-pressure-lowering) does not match Hypertensie through SNOMED alone. The compound is not in snomed_descriptions as a single term. Layer 3 (LLM translation) recovers most of these cases; an explicit Dutch-compound splitter has not been needed at pilot scale.

Implementation pointers

Component	File	What's there
Layer 1 — SNOMED expansion	`backend/app/services/rag_service.py:_qs_enrich_query` (~line 2160)	Exact lookup + fuzzy fallback against `snomed_descriptions`
Layer 2 — TREATS expansion	same file (~line 2170)	SQL against `published_relationships`
Layer 3 — LLM translation	`backend/app/services/intent_classification_service.py` + `backend/app/prompts.py`	Latin → Dutch instruction in the intent prompt
Per-word fallback	`backend/app/services/rag_service.py:_qs_enrich_query` (~line 2180)	Iterate `>5`-char tokens against SNOMED descriptions
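
The per-word fallback row (iterate tokens longer than five characters against SNOMED descriptions) can be sketched as below. The tokenisation rule, the function names, and the use of a plain dict in place of the snomed_descriptions table are assumptions for illustration.

```python
import re
import unicodedata


def candidate_tokens(query: str, min_len_exclusive: int = 5) -> list[str]:
    """Lowercased word tokens strictly longer than min_len_exclusive characters."""
    text = unicodedata.normalize("NFC", query).lower()
    tokens = re.findall(r"\w+", text)
    # dict.fromkeys de-duplicates while keeping first-seen order.
    return list(dict.fromkeys(t for t in tokens if len(t) > min_len_exclusive))


def per_word_matches(query: str, known_terms: dict[str, str]) -> dict[str, str]:
    """Map each qualifying token to its concept id, if the term is known.

    known_terms stands in for a lookup against snomed_descriptions.
    """
    return {t: known_terms[t] for t in candidate_tokens(query) if t in known_terms}


demo_terms = {"hypothyreoïdie": "40930008"}
print(candidate_tokens("ik heb last van hypothyreoïdie"))
print(per_word_matches("ik heb last van hypothyreoïdie", demo_terms))
```

The length threshold filters out short function words ("ik", "heb", "van") before any lookup is attempted.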

Shared DepartmentResolver (2026-06-09)

Department-name matching inside _taxonomy_post_enrichment (and the voice/schedule surfaces) now routes through a shared DepartmentResolver cascade (app/services/department_resolver.py) when the department_resolver_enabled flag is on (default off pending regression eval). The resolver applies four structural tiers — normalize → exact → token-subset → alias — replacing the per-surface substring heuristics. When the flag is off, the legacy _dept_matches_query substring path is used unchanged. This change does not affect the three enrichment layers documented above; it affects only how department names extracted by those layers are matched against query text during post-answer enrichment.
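
To make the four tiers concrete, here is a rough sketch of a normalize → exact → token-subset → alias cascade. The source names the tiers but not their semantics, so the interpretation of each tier, the function names, and the alias map are all assumptions; this is not the code in app/services/department_resolver.py.

```python
import re
import unicodedata
from typing import Optional


def normalize(text: str) -> str:
    """Tier 1: casefold, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(re.findall(r"[a-z0-9]+", stripped))


def resolve_department(query: str, departments: list[str], aliases: dict[str, str]) -> Optional[str]:
    q = normalize(query)
    q_tokens = set(q.split())
    by_norm = {normalize(d): d for d in departments}
    # Tier 2: exact match on the normalized string.
    if q in by_norm:
        return by_norm[q]
    # Tier 3: every token of the department name appears in the query.
    for norm, dept in by_norm.items():
        tokens = set(norm.split())
        if tokens and tokens <= q_tokens:
            return dept
    # Tier 4: a known alias appears in the query as whole tokens.
    for alias, dept in aliases.items():
        alias_tokens = set(normalize(alias).split())
        if alias_tokens and alias_tokens <= q_tokens:
            return dept
    return None


DEPARTMENTS = ["Neurologie", "Neurochirurgie", "Neus-, keel- en oorziekten"]
ALIASES = {"KNO": "Neus-, keel- en oorziekten"}  # hypothetical alias entry

print(resolve_department("welke artsen werken op neurologie", DEPARTMENTS, ALIASES))
print(resolve_department("afspraak kno", DEPARTMENTS, ALIASES))
print(resolve_department("neuro", DEPARTMENTS, ALIASES))
```

Under these assumptions a bare "neuro" resolves to nothing, whereas a substring heuristic would match both Neurologie and Neurochirurgie.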

References

@chen2024bgem3 — historical context for the Layer 1 vector-search baseline; BGE-M3 is now the ColBERT-only model after ADR-0048.
SNOMED CT Terminology — terminology ingestion, the 12-resolver chain, and the FINDING_SITE routing extension.
Taxonomy Query Enrichment — the broader 12-resolver chain that runs alongside the cascade documented here.
ADR-0019: Contextual Embeddings — how canonical questions and chunk context are embedded.
Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020.
