Glossary

A single reference for the terms used throughout this documentation. Each entry is the canonical definition; detail pages link here rather than re-defining terms in place. Terms are grouped by area and ordered to build on one another, and each term has a stable anchor (e.g. …/glossary#rrf) so any page can deep-link to it.

How to use this page

If a page uses a term you have not met yet, look it up here first. Entries cross-link to the page that covers the concept in depth.

Core retrieval

RAG (Retrieval-Augmented Generation)

An architecture (Lewis et al., 2020) in which a language model is conditioned at inference time on documents retrieved from an external corpus, rather than relying on parametric memory alone. Every ZOL answer is generated against retrieved chunks and is citation-traceable. See What is RAG.

Chunk

A bounded passage of source content (a slice of a brochure or web page) that is embedded and indexed as the unit of retrieval. ZOL's corpus is ~10,430 chunks.

Embedding

A dense vector representation of text. ZOL's production embedding model is OpenAI text-embedding-3-large at 1536 dimensions (reduced from native 3072 to fit the pgvector HNSW 2000-d limit; ADR-0048). See Embedding Models.
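As an illustration of what reducing a 3072-dimension vector to 1536 dimensions can look like, the sketch below keeps the leading components and rescales the result to unit length. The function name shorten_embedding is hypothetical, and the truncate-then-renormalize approach is an assumption: the entry above does not say whether the shorter vector is produced this way or requested directly from the embedding API.

```python
import math


def shorten_embedding(vec, dims):
    """Keep the first `dims` components and rescale to unit L2 norm.

    Hypothetical helper for illustration; not taken from the ZOL codebase.
    """
    if dims > len(vec):
        raise ValueError("dims exceeds the length of the input vector")
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero vector cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


# Toy example with 3 -> 2 dimensions instead of 3072 -> 1536.
short = shorten_embedding([3.0, 4.0, 12.0], 2)
```

Renormalizing matters because cosine and inner-product comparisons assume comparable vector lengths after truncation.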

Dense / vector search

Retrieval by embedding similarity (cosine) over pgvector. Captures meaning, not exact words.
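Cosine similarity itself is a short computation. The sketch below is a plain-Python illustration with a hypothetical function name; in production the comparison runs inside pgvector, whose cosine operator returns a distance (1 minus the similarity) rather than the similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: similarity with a zero vector is 0.
        return 0.0
    return dot / (norm_a * norm_b)
```

Vectors pointing the same way score 1, orthogonal vectors score 0, and opposite vectors score -1, regardless of their lengths.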

Lexical / sparse retrieval

Keyword retrieval over a PostgreSQL tsvector index. Captures exact terms a dense model may blur. In ZOL this is one of the three signals fused by RRF in Hybrid Search.

BM25

BM25 (Best Matching 25) is the ranking function behind lexical retrieval: the "Okapi BM25" probabilistic model (Robertson & Zaragoza, 2009) used by Lucene and Elasticsearch. PostgreSQL's built-in full-text ranking functions (ts_rank, ts_rank_cd) are not BM25; BM25 over PostgreSQL requires an extension or custom scoring. It is a bag-of-words method: it ignores word order and scores a document on which query terms it contains, how often they occur, and how long the document is. Three intuitions drive the score:

  • Term frequency (TF): a document that mentions a query term more often is more likely to be about it, but with diminishing returns: the 10th occurrence adds far less than the 2nd. This saturation is what separates BM25 from naive TF-IDF.
  • Inverse document frequency (IDF): rare terms are more informative than common ones. A match on "glioblastoom" counts for far more than a match on "de".
  • Length normalization: long documents match more query terms by chance, so BM25 discounts them relative to the average document length; a short, focused chunk is not penalized for brevity.

The score of document D against query Q (terms q₁…qₙ) is the sum over query terms:

$$
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}
$$

where
  f(qᵢ, D) = frequency of term qᵢ in document D
  |D|      = length of D (in terms)
  avgdl    = average document length across the corpus
  k₁       = TF-saturation parameter (typically 1.2–2.0)
  b        = length-normalization strength, 0…1 (typically 0.75)

The parameter k₁ controls how quickly term frequency saturates (higher k₁ = slower saturation, so repetition matters more); b controls how aggressively long documents are penalized (b=0 = no length normalization, b=1 = full). Lucene/Elasticsearch default to k₁=1.2, b=0.75.

Worked intuition. For the query "python web framework", the document "Django is a Python web framework." scores high: it contains all three query terms, the terms are reasonably rare (high IDF), and the document is short (no length penalty). A document like "Programming tutorials for many languages." scores exactly zero: none of the query terms appear, so every summand is zero.
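The scoring formula and the worked example can be reproduced in a few lines. This is a minimal sketch, not ZOL's implementation: the tokenizer is a naive lowercase word split, the function names are hypothetical, and because the text does not spell out the IDF term, the sketch assumes the smoothed form ln(1 + (N - n + 0.5) / (n + 0.5)) used by Lucene, where N is the number of documents and n the number containing the term.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Naive tokenizer (assumption): lowercase runs of word characters."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document, in the order given."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n_docs or 1.0

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        length_norm = 1 - b + b * len(toks) / avgdl
        score = 0.0
        for term in query_terms:
            f = tf[term]
            if f == 0:
                continue  # an absent term contributes a zero summand
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * f * (k1 + 1) / (f + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "Django is a Python web framework.",
    "Programming tutorials for many languages.",
]
scores = bm25_scores("python web framework", docs)
```

With this two-document corpus the first score is positive and the second is exactly 0.0, matching the intuition above.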

Why ZOL keeps it alongside vectors. BM25 is fast, interpretable, and hard to beat at exact matches (drug names, dosages, doctor surnames, campus names), where a dense embedding can blur a near-synonym. Dense / vector search covers the complementary case: paraphrase and semantic similarity ("hartdokter" ≈ "cardioloog"). Fusing both with RRF consistently beats either alone, which is why ZOL's retrieval is a hybrid of dense + BM25 + typed-entity taxonomy lookup. See Hybrid Search.

RRF (Reciprocal Rank Fusion)

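A score-agnostic algorithm that fuses ranked lists from different retrievers into one ordering: each document's fused score is the sum of 1/(k + rank) over the lists in which it appears (k=60; ADR-0020). Only ranks are used, so the retrievers' raw scores never need to be on a common scale. See Hybrid Search.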
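The fusion rule is small enough to show in full. The sketch below is illustrative only: the function name and the alphabetical tie-break are assumptions, and ranks are taken to start at 1.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each list is ordered best-first. A document's fused score is the sum
    of 1 / (k + rank) over every list that contains it.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["a", "b", "c"]
lexical = ["b", "a", "d"]
taxonomy = ["b"]
fused = rrf_fuse([dense, lexical, taxonomy])
# fused == ["b", "a", "c", "d"]
```

Document "b" wins because it appears near the top of all three lists, even though no list's raw scores were ever compared.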

Hybrid search

ZOL's retrieval mix of three complementary signals (dense vector, sparse BM25, and typed-entity taxonomy lookup), fused with RRF. See Hybrid Search.

Reranking

Re-ordering retrieved candidates by relevance after first-pass retrieval. ZOL applies a cross-encoder (Jina v2, BGE fallback), an optional ColBERT pass, and the Value Framework affinity rerank. See Reranking.

Cross-encoder

A reranker that jointly encodes (query, candidate) for a precise relevance score — more accurate but costlier than the dual-encoder used for first-pass retrieval.

Contextual retrieval

Anthropic's technique of prepending document context to a chunk before embedding/BM25, reducing retrieval-failure rates. In ZOL this is realized via page summaries, chunk context, and canonical questions baked in at ingestion. See Contextual Retrieval and Ingestion Enrichment.

Page summary / chunk context / canonical questions

The three enrichment artifacts generated at ingestion time. See Ingestion Enrichment for exactly where each is stored and used.

Lost in the Middle

The empirical finding (Liu et al., 2024) that LLMs attend most to the start and end of a long context and under-attend to the middle. ZOL mitigates this in Context Assembly.
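One widely used mitigation is to reorder retrieved chunks so the strongest sit at the start and end of the prompt and the weakest land in the middle. The sketch below shows that reordering only as an illustration of the idea; the entry above does not describe how Context Assembly does it, and the function name is hypothetical.

```python
def reorder_for_long_context(ranked_chunks):
    """Place best-ranked chunks at both edges, weakest in the middle.

    `ranked_chunks` is ordered best-first. Even positions fill the front,
    odd positions fill the back in reverse.
    """
    front, back = [], []
    for i, chunk in enumerate(ranked_chunks):
        if i % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    return front + back[::-1]


order = reorder_for_long_context([1, 2, 3, 4, 5])
# order == [1, 3, 5, 4, 2]: ranks 1 and 2 at the edges, rank 5 in the middle
```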

Knowledge graph & taxonomy

Taxonomy

In ZOL, the structured store of hospital entities and their typed relationships (not a free-text index). It exists in two layers: see Layer A and Layer B. Structurally it is a knowledge graph; ZOL reserves the word "taxonomy" for the entity tables specifically. See Knowledge Graph.

Entity

A node in the taxonomy: a Hospital, Campus, Department, Doctor, Condition, Treatment, Examination, Center, Facility, or Service.

Relationship

A typed, directed edge between entities, e.g. WORKS_IN (Doctor→Department), HANDLES (Department→Condition), LOCATED_AT (Department→Campus), each carrying a confidence.
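A typed, directed edge with a confidence can be modelled as a small immutable record. The sketch below is an assumption about shape, not the actual schema: the class and field names are hypothetical, the confidence is assumed to lie in [0, 1], and only the three relationship types named above are registered.

```python
from dataclasses import dataclass

# Allowed (source type, target type) per relationship type, from the entry above.
ALLOWED_EDGES = {
    "WORKS_IN": ("Doctor", "Department"),
    "HANDLES": ("Department", "Condition"),
    "LOCATED_AT": ("Department", "Campus"),
}


@dataclass(frozen=True)
class Entity:
    entity_type: str
    name: str


@dataclass(frozen=True)
class Relationship:
    rel_type: str
    source: Entity
    target: Entity
    confidence: float  # assumed range: 0.0 to 1.0

    def __post_init__(self):
        expected = ALLOWED_EDGES.get(self.rel_type)
        if expected is None:
            raise ValueError(f"unknown relationship type: {self.rel_type}")
        if (self.source.entity_type, self.target.entity_type) != expected:
            raise ValueError(f"{self.rel_type} must link {expected[0]} to {expected[1]}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


edge = Relationship(
    "LOCATED_AT",
    Entity("Department", "Cardiologie"),
    Entity("Campus", "Sint-Jan"),
    confidence=0.9,
)
```

Validating direction at construction time means a reversed edge (Campus to Department) fails immediately instead of silently entering the graph.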

Layer A (config taxonomy)

Hand-curated alias maps, routing rules, and plausibility guards held in memory per tenant (HospitalTaxonomy); trusted directly. See Two layers of taxonomy.

Layer B (scraped taxonomy)

Entities + relationships harvested from the hospital website, stored in PostgreSQL (taxonomy_entities → published_entities); gated behind human review + versioned publish.

Department (Dutch: dienst)

A hospital organizational unit (Cardiologie, Neurologie, …); the top-level Department entity type. "Dienst" and "department" are used interchangeably.

Campus

One of ZOL's four physical locations: Sint-Jan, André Dumont, Sint-Barbara, Maas en Kempen. A Campus entity type.

Golden page

An authoritative listing page the hospital publishes that enumerates its own entities (e.g. the full doctor directory). The trust anchor for top-down seeding. See Golden Pages.

Hub / detail

The two values of app.golden_pages.page_type. A hub is a golden listing page (trust anchor, allowed to seed the taxonomy); a detail is a single-entity page (stored, retrievable, but not allowed to mint entities).

GOLDEN_SEED

The provenance stamp on an entity created top-down by the GoldenPageSeeder from a golden page, as opposed to bottom-up extraction from prose.

CURATED_FROM / MENTIONED_IN

Strong vs weak relationship provenance: CURATED_FROM is from a golden/hub page; MENTIONED_IN is an incidental mention in a brochure or news page.

graph_golden_only

The config gate (default True) that prevents ordinary crawled content from writing entities to the taxonomy; only hub pages and the seeder have write authority. See Golden Pages.
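The gate's decision logic can be sketched as a single predicate. This is an illustration under assumptions: the function name and the from_seeder flag are hypothetical, and the behaviour with the gate switched off (any page may write) is inferred from the description rather than stated in it.

```python
def may_write_entities(page_type, graph_golden_only=True, from_seeder=False):
    """Decide whether a content source may write entities to the taxonomy.

    page_type is "hub", "detail", or any other label for ordinary content.
    """
    if from_seeder:
        return True  # the seeder always has write authority
    if not graph_golden_only:
        return True  # gate disabled (assumed behaviour)
    return page_type == "hub"  # with the gate on, only hub pages qualify
```

With the default setting, may_write_entities("hub") is True while may_write_entities("detail") and may_write_entities("news") are False.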

Three-Source Knowledge Architecture

The provenance separation of taxonomy inputs into Source 1 (web scraper), Source 2 (SNOMED CT), Source 3 (curated config), merged with priority Curated > Scraped > SNOMED. See Medical Knowledge Architecture.
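The priority rule Curated > Scraped > SNOMED can be sketched as a layered dictionary merge in which higher-priority sources overwrite lower ones. The record shape (a key mapped to a value) and the function name are hypothetical simplifications; the real merge presumably works on richer entity records.

```python
def merge_sources(curated, scraped, snomed):
    """Merge three sources so that Curated > Scraped > SNOMED.

    Sources are applied lowest priority first, so later updates win.
    """
    merged = {}
    for source in (snomed, scraped, curated):
        merged.update(source)
    return merged


merged = merge_sources(
    curated={"ms": "Neurologie"},
    scraped={"ms": "scraped value", "griep": "scraped value"},
    snomed={"ms": "snomed value", "griep": "snomed value", "astma": "snomed value"},
)
# merged == {"ms": "Neurologie", "griep": "scraped value", "astma": "snomed value"}
```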

Draft / Publish

The versioning pattern in which extracted taxonomy lives as draft until an operator approves it and an atomic, advisory-locked publish copies it to published_entities at a new version. Queries only ever read the published snapshot. See Draft/Publish System.
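The pattern can be sketched in memory to show the invariant that readers only ever see a published snapshot. Everything here is a stand-in: the class name is hypothetical, a threading.Lock plays the role of the database advisory lock, and dictionaries replace the draft and published_entities tables.

```python
import threading
from copy import deepcopy


class TaxonomyStore:
    """In-memory illustration of draft/publish versioning."""

    def __init__(self):
        self._lock = threading.Lock()  # stand-in for an advisory lock
        self.draft = {}
        self._published = {}  # version number -> frozen snapshot
        self.current_version = 0

    def publish(self, approved_by):
        """Copy the draft to a new published version; requires an approver."""
        if not approved_by:
            raise ValueError("publish requires operator approval")
        with self._lock:  # only one publish runs at a time
            new_version = self.current_version + 1
            self._published[new_version] = deepcopy(self.draft)
            self.current_version = new_version
            return new_version

    def read(self):
        """Queries see only the latest published snapshot, never the draft."""
        return self._published.get(self.current_version, {})


store = TaxonomyStore()
store.draft["cardiologie"] = {"type": "Department"}
version = store.publish(approved_by="operator")
```

Edits made to the draft after a publish stay invisible to read() until the next publish creates a new version.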

SNOMED CT

SNOMED CT

The international clinical terminology standard. ZOL uses the Belgian Edition (356,370 concepts, 656,287 Dutch descriptions) to bridge colloquial Dutch to clinical terms. See SNOMED CT Terminology.

Concept ID

A language-neutral SNOMED identifier stamped onto a taxonomy entity at build time (snomed_concept_id); the key that links an entity to its synonyms and hierarchy worldwide.

FSN (Fully Specified Name)

A SNOMED concept's unique, unambiguous name including a semantic tag, e.g. Multiple sclerosis (disorder).

Preferred Term / Acceptable Synonyms

The default display term per language, and the alternative terms that denote the same concept (e.g. MS, multiple sclerose).

FINDING_SITE / PROCEDURE_SITE

SNOMED relationship types linking a condition/procedure concept to the body structure it concerns; ZOL uses them to route a clinical term to the responsible department.

Steering & safety

Value Framework

An intent × content-category affinity reranker (Stage 5b) that checks whether the kind of content fits the kind of question, on both chat and voice channels. See Value Framework.
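An intent × content-category affinity rerank can be pictured as a lookup table that scales each candidate's relevance. The sketch below is purely illustrative: the category names, the affinity weights, the multiplicative combination, and the function name are all assumptions. Only the two intent labels come from this glossary.

```python
# Hypothetical affinity weights: (intent, content category) -> multiplier.
AFFINITY = {
    ("condition_information", "brochure"): 1.0,
    ("condition_information", "doctor_profile"): 0.4,
    ("doctor_schedule_query", "doctor_profile"): 1.0,
    ("doctor_schedule_query", "brochure"): 0.3,
}
DEFAULT_AFFINITY = 0.5  # used for pairs missing from the table


def affinity_rerank(intent, candidates):
    """Rescale and re-sort candidates given as (chunk_id, category, relevance)."""
    rescored = [
        (chunk_id, relevance * AFFINITY.get((intent, category), DEFAULT_AFFINITY))
        for chunk_id, category, relevance in candidates
    ]
    return sorted(rescored, key=lambda item: item[1], reverse=True)


candidates = [("c1", "brochure", 0.9), ("c2", "doctor_profile", 0.8)]
ranked = affinity_rerank("doctor_schedule_query", candidates)
```

For a schedule question the doctor profile overtakes the brochure even though the brochure had the higher first-pass relevance.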

Intent classifier

The LLM step that labels a query's intent (e.g. condition_information, doctor_schedule_query) and extracts entities, steering retrieval and routing. See Query Pipeline.

Grounding

The requirement that every answer be supported by retrieved corpus content with a verifiable citation; ungrounded generation is disallowed.

Medical-advice refusal

The hard safety boundary: the system is informational/navigational only and must never give medical advice. See Safety & Compliance.

structured_call

The ~190-LOC helper (app.llm.structured) wrapping the OpenAI client at all eight LLM call sites: it enforces JSON-schema validation with retries and raises a typed StructuredCallError on exhaustion. It replaced a trialed Pydantic AI Agent (removed 2026-05-12 for ~720 ms/call overhead; see Decision-Cost Rubric).
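The validate-and-retry behaviour can be sketched without any LLM client. This is not the helper's code: the function name structured_call_sketch, the call_model callable, and the max_attempts parameter are hypothetical, and a required-keys check stands in for full JSON-schema validation.

```python
import json


class StructuredCallError(Exception):
    """Raised when no attempt produced valid structured output."""


def structured_call_sketch(call_model, required_keys, max_attempts=3):
    """Call the model until its reply parses as JSON with all required keys.

    call_model(attempt) must return the raw reply text for that attempt.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        raw = call_model(attempt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            continue  # malformed JSON: retry
        if not isinstance(data, dict):
            last_error = TypeError("reply is not a JSON object")
            continue
        missing = [key for key in required_keys if key not in data]
        if not missing:
            return data
        last_error = KeyError(f"missing keys: {missing}")
    raise StructuredCallError(
        f"no valid reply after {max_attempts} attempts"
    ) from last_error


replies = iter(["not json", '{"intent": "condition_information"}'])
result = structured_call_sketch(lambda attempt: next(replies), ["intent"])
```

Raising one typed error on exhaustion lets every call site handle failure the same way instead of inspecting raw replies.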

Voice channel

Cascade

The voice pipeline pattern: speech-to-text → LLM orchestrator (with a RAG retrieval tool) → deterministic post-processing → text-to-speech. The system controls the exact text before it is spoken, which is what makes grounding, citation, and refusal deterministically enforceable. See Voice Architecture.

STT / TTS

Speech-to-text (Deepgram Nova-3) and text-to-speech (ElevenLabs).

Barge-in

A caller speaking over the agent's TTS, which interrupts playback (full-duplex turn-taking).

Utterance pre-filter (classify_terminal)

A deterministic regex classifier that handles terminal utterances — those needing no retrieval: greeting, farewell, handoff, safety refusal, repeat, off-topic — with templated responses at zero LLM cost, before the agentic LLM is invoked. The function is named classify_terminal in code.
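A regex pre-filter of this kind can be sketched as an ordered list of patterns checked before any model call. The patterns, the labels covered, and the name classify_terminal_sketch are hypothetical examples, not the rules used by classify_terminal.

```python
import re

# Illustrative patterns only; checked in order, first match wins.
TERMINAL_PATTERNS = [
    ("greeting", re.compile(r"^\s*(hallo|goedendag|hello|hi)\b[\s!.,]*$", re.IGNORECASE)),
    ("farewell", re.compile(r"^\s*(tot ziens|bye|goodbye)\b[\s!.,]*$", re.IGNORECASE)),
    ("repeat", re.compile(r"\b(herhaal|repeat that|say that again)\b", re.IGNORECASE)),
    ("handoff", re.compile(r"\b(medewerker|operator|human)\b", re.IGNORECASE)),
]


def classify_terminal_sketch(utterance):
    """Return a terminal label, or None when the utterance needs retrieval."""
    for label, pattern in TERMINAL_PATTERNS:
        if pattern.search(utterance):
            return label
    return None


label = classify_terminal_sketch("Hallo!")
needs_retrieval = classify_terminal_sketch("Wie is de cardioloog op campus Sint-Jan?")
```

A None result means the utterance falls through to the agentic LLM; any label can be answered from a template without a model call.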


Missing a term? It likely belongs here — the goal is one canonical definition per concept. See Core Concepts for the end-to-end narrative that ties these together.