Multi-Tenancy & Hospital-Agnostic Architecture

The system is architected to serve multiple hospitals from a single deployment. All hospital-specific behaviour is driven by database configuration or per-tenant YAML, so onboarding a new hospital requires no code changes. The web/RAG channel needs no redeployment either; voice overlays are YAML files that ship with a release.

Tenant-Isolation Trade-offs

| Decision | Chosen | Alternatives considered | Rejected because |
| --- | --- | --- | --- |
| Isolation model | Shared schema with tenant_id FK on every domain row (pool model) | Schema-per-tenant; database-per-tenant | Schema-per-tenant complicates Alembic migrations (one schema-aware migration per tenant per release) and breaks single-query analytics. Database-per-tenant scales connection pool size linearly with tenant count and forbids cross-tenant analytics entirely. The pool model with row-level tenant_id filtering scales to the projected 5–10 hospital pilot fleet without rearchitecting. The three isolation models (shared schema / schema-per-tenant / database-per-tenant) and their trade-offs are the canonical multi-tenant SaaS taxonomy from Bezemer & Zaidman 2010. |
| Tenant resolution | Keycloak JWT claim → resolve_tenant_id() FastAPI dependency | URL-based tenant ID; subdomain routing; X-Tenant-Id request header | Putting the tenant ID in the URL or a header lets a misconfigured client cross tenant boundaries by accident; Keycloak claims are signed and verified per request, so the tenant identity is cryptographically bound to the authentication. Subdomain routing is on the Multi-Tenancy Roadmap for Phase 2 but not in current scope. |
| Per-tenant configuration plane | Two planes: DB-driven for web/RAG (site_crawl_configs, golden_pages, PromptContext from YAML) + YAML overlay for voice (tenant_overlays/_yaml/<slug>.yaml) | Single unified DB-driven config; single unified YAML file | A single DB-driven plane forces a schema migration for every voice-FAQ change; a single YAML file forces every web-channel crawl-config edit to ship as a redeploy. The split puts the slow-moving voice content (FAQ entries, STT phonetic-recovery rules) in version control and the fast-moving web crawl rules in DB rows that platform admins can edit through the API. |

Phase Status

Hospital-agnostic refactoring (Phase 1) is complete as of 2026-03-30. The system runs in single-tenant pilot mode for ZOL. Full multi-tenant routing (subdomain resolution, per-tenant auth) is planned for Phase 2. See Multi-Tenancy Roadmap for the full phased plan.

Core Design Principles

| Principle | Implementation |
| --- | --- |
| No hardcoded hospital references | All identity loaded from DB or YAML config at runtime |
| DB-driven crawl configuration | app.site_crawl_configs table replaces in-process constants |
| DB-driven golden pages | app.golden_pages table, scoped per hospital |
| Parameterized prompts | PromptContext dataclass injected into all LLM calls |
| Generic crawler | HospitalCrawler (renamed from ZOLCrawler) reads config from DB |
| Auto-link on publish | RelationshipAutoLinker runs after every publish to fill relationship gaps |

Configuration Architecture

app.site_crawl_configs — Crawl Configuration

Every hospital's crawl behaviour is defined by a row in this table. The HospitalCrawler loads the active config at startup via get_active_site_config(hospital_id).

| Column | Type | Purpose |
| --- | --- | --- |
| hospital_id | UUID FK | Owner hospital |
| slug | VARCHAR | Short identifier (unique per hospital) |
| domains | JSONB | List of allowed domains |
| canonical_domain | VARCHAR | Primary domain for URL normalization |
| skip_url_patterns | JSONB | Regex patterns for URLs to skip during crawl |
| golden_url_paths | JSONB | High-priority URL prefixes to prioritize |
| boilerplate_css_selectors | JSONB | CSS selectors for boilerplate removal |
| boilerplate_text_patterns | JSONB | Regex patterns for post-extraction text cleanup |
| boilerplate_pdf_patterns | JSONB | Regex patterns for PDF cover-page cleanup |
| url_category_patterns | JSONB | (regex, category) pairs for URL classification |
| crawl_depth | INTEGER | Max crawl depth |
| max_pages | INTEGER | Safety limit on total pages crawled |
| is_active | BOOLEAN | Only one active config per hospital |

The code path: get_active_site_config(hospital_id) reads the active row from site_crawl_configs and builds a SiteCrawlConfig dataclass, which is injected into HospitalCrawler at construction time.
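
The row-to-dataclass step can be sketched as follows. This is a minimal illustration, not the project's actual code: it models only a subset of the columns, the `rows` argument stands in for the result of the database query (the real function takes only `hospital_id`), and the default values for `crawl_depth` and `max_pages` are assumptions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCrawlConfig:
    """Illustrative subset of the columns in app.site_crawl_configs."""
    hospital_id: str
    slug: str
    domains: tuple
    canonical_domain: str
    skip_url_patterns: tuple = ()
    crawl_depth: int = 3       # assumed default, not from the source
    max_pages: int = 1000      # assumed default, not from the source


def _as_list(value):
    # JSONB may arrive as a decoded list or as a JSON string, depending on the driver.
    return json.loads(value) if isinstance(value, str) else list(value or [])


def get_active_site_config(rows, hospital_id):
    """Build the config from the single active row for this hospital."""
    active = [r for r in rows if r["hospital_id"] == hospital_id and r["is_active"]]
    if len(active) != 1:
        raise LookupError(f"expected exactly one active config, found {len(active)}")
    row = active[0]
    return SiteCrawlConfig(
        hospital_id=row["hospital_id"],
        slug=row["slug"],
        domains=tuple(_as_list(row["domains"])),
        canonical_domain=row["canonical_domain"],
        skip_url_patterns=tuple(_as_list(row.get("skip_url_patterns"))),
        crawl_depth=row.get("crawl_depth", 3),
        max_pages=row.get("max_pages", 1000),
    )
```

Freezing the dataclass means the crawler cannot mutate its configuration mid-crawl; a config change takes effect only when a new instance is built from the database.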

app.golden_pages — High-Value Pages

Golden pages are URLs identified as structurally valuable for entity extraction (hub pages that list doctors, departments, or conditions). They are stored per hospital in app.golden_pages and populated by the hub detection pipeline (Stage 3 of the taxonomy extraction pipeline).

| Column | Type | Purpose |
| --- | --- | --- |
| hospital_id | UUID FK | Owner hospital |
| url | VARCHAR(2000) | Page URL |
| page_type | VARCHAR(40) | hub or detail |
| confidence | FLOAT | AI classification confidence |
| status | VARCHAR(20) | proposed, confirmed, rejected |
| confirmed_by / confirmed_at | UUID / TIMESTAMPTZ | Operator approval audit trail |
| last_extracted | TIMESTAMPTZ | When entity extraction last ran on this page |

Parameterized Prompts — PromptContext

All LLM prompts across the system are parameterized via the PromptContext dataclass (backend/app/prompts.py). No prompt contains hardcoded hospital names.

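    from dataclasses import dataclass


    @dataclass(frozen=True)
    class PromptContext:
        # Generic defaults, used when no tenant-specific context is supplied.
        hospital_name: str = "Het ziekenhuis"
        hospital_full_name: str = "het ziekenhuis"
        hospital_location: str = "Belgium"
        phone_number: str = ""
        website: str = ""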

At runtime, get_prompt_context(tenant_slug) builds a PromptContext from the hospital's YAML config (sourced from HospitalTaxonomy). All prompt-building functions accept an optional ctx: PromptContext | None parameter; passing None falls back to the generic default.
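
The `None` fallback can be illustrated with a small prompt builder. The function `build_disclaimer` and its wording are hypothetical and are not taken from the codebase; only the `PromptContext` fields and the optional-`ctx` convention come from the text above. `Optional[...]` is used here in place of `PromptContext | None` so the sketch also runs on older Python versions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PromptContext:
    hospital_name: str = "Het ziekenhuis"
    hospital_full_name: str = "het ziekenhuis"
    hospital_location: str = "Belgium"
    phone_number: str = ""
    website: str = ""


def build_disclaimer(ctx: Optional[PromptContext] = None) -> str:
    """Hypothetical prompt builder: passing None falls back to the generic default."""
    ctx = ctx or PromptContext()
    text = f"This assistant provides general information on behalf of {ctx.hospital_full_name}."
    if ctx.phone_number:
        # Only mention a phone number when the tenant config supplies one.
        text += f" For urgent questions, call {ctx.phone_number}."
    return text
```

Because the defaults are generic, a call site that forgets to pass a context produces neutral text rather than another hospital's name.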

Two additional parameterized prompt helpers follow the same pattern:

| Function | Module | Purpose |
| --- | --- | --- |
| get_decomposition_prompt(hospital_name) | query_decomposition_service.py | Query decomposition instructions with hospital name |
| get_validation_prompt(hospital_name) | graph/llm_entity_validation.py | Entity validation instructions with hospital name |

The prompt context flows into the RAG pipeline, intent classifier, safety messages, disclaimers, and blocked-query responses — all surfaces that previously contained hardcoded ZOL references.

Two configuration planes

PromptContext parameterises the web/RAG channel (chat UI, intent classifier, safety messages, response generation). The voice channel uses a separate per-tenant overlay plane described in Voice Channel Tenant Overlays below. The two planes are intentionally separate so voice-FAQ edits don't require a database migration and crawler-config edits don't require a YAML release.


Voice Channel Tenant Overlays

The voice channel uses a parallel per-tenant configuration plane sourced from YAML files in backend/app/services/voice/tenant_overlays/_yaml/<slug>.yaml. The overlay loader (backend/app/services/voice/tenant_overlays/loader.py) reads the tenant's YAML at startup; the registry (backend/app/services/voice/tenant_overlays/registry.py) returns the active overlay per request via get_overlay(slug).

What an overlay carries

| Section | Purpose | Example |
| --- | --- | --- |
| faq_entries | Per-tenant FAQ entries that bypass RAG. The voice FAQ tool checks intent + key-phrase patterns first; on hit, the renderer produces the spoken answer. | "What are visiting hours?" → address_all_campuses renderer, which queries app.campuses and reads each row aloud (no hardcoded address strings in code). |
| stt_phonetic_recovery | Per-tenant phonetic-mishear corrections applied to Deepgram Nova-3 @deepgram_nova3 output before intent classification. | "afwrak" → "after-care" (Dutch STT mishears the English compound when a Dutch caller code-switches mid-utterance). |
| renderers | DB-driven renderer functions that turn structured data (campus rows, business hours) into spoken text. | A future tenant gets the right answer by populating its taxonomy and overlay; no code change. |
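
A rough sketch of how two of these sections might be consumed at runtime follows. The overlay is shown as an already-parsed dictionary, and the field names inside each section (`heard`, `meant`, `intent`, `key_phrases`, `renderer`) are assumptions made for illustration; the real YAML schema is not shown in this document.

```python
import re

# Hypothetical parsed overlay; the keys inside each entry are assumptions.
OVERLAY = {
    "stt_phonetic_recovery": [
        {"heard": "afwrak", "meant": "after-care"},
    ],
    "faq_entries": [
        {
            "intent": "campus_address",
            "key_phrases": ["address", "where are you"],
            "renderer": "address_all_campuses",
        },
    ],
}


def recover_transcript(text, overlay):
    """Apply whole-word, case-insensitive mishear corrections to STT output."""
    for rule in overlay.get("stt_phonetic_recovery", []):
        pattern = r"\b" + re.escape(rule["heard"]) + r"\b"
        # A function replacement avoids backslash interpretation in the target text.
        text = re.sub(pattern, lambda _m, r=rule["meant"]: r, text, flags=re.IGNORECASE)
    return text


def match_faq(text, intent, overlay):
    """Return the renderer name of the first FAQ entry that matches, else None."""
    lowered = text.lower()
    for entry in overlay.get("faq_entries", []):
        if entry["intent"] == intent and any(p in lowered for p in entry["key_phrases"]):
            return entry["renderer"]
    return None
```

Recovery runs before matching, so a corrected transcript is what the intent classifier and the FAQ check see. A `None` result means the request falls through to the normal RAG path.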

Why YAML for this plane (not DB rows)

| Trade-off | YAML overlay | DB-row equivalent |
| --- | --- | --- |
| Edit cadence | Slow (FAQ patterns change quarterly) | Fast (rule edits without redeploy) |
| Review surface | Git PR: a content reviewer sees every line change as a diff | Admin UI form: easy to fat-finger |
| Test coverage | Pytest fixture loads the tenant YAML and asserts every entry | Per-row contract test, harder to maintain |
| Redeploy cost | Each overlay edit is a code release | Zero |

YAML was chosen because voice content is reviewer-gated and changes infrequently. If overlay churn ever exceeds a quarterly cadence, the system can grow a tenant_voice_overlays table without changing the runtime API: get_overlay(slug) would consult a DB row instead of a parsed YAML file.

No hardcoded tenant data

A foundational rule: tenant facts (campus addresses, doctor names, hospital name) never live in source code. They live in the taxonomy DB or, for voice STT-recovery rules, in the YAML overlay. The runtime always asks get_taxonomy(slug) for facts and get_overlay(slug) for tenant-scoped voice content.

The rule was codified after the 2026-05 voice batch C–F sprint. The previous _FAQ_ENTRIES table held hardcoded ZOL strings (a campus address, parking specifics, a contact phone number), so a newly onboarded tenant would have silently inherited the wrong content. The renderer pattern (address_all_campuses reads from app.campuses) is the canonical pattern for new tenant-scoped voice answers.
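
The renderer pattern can be sketched as a pure function over campus rows. The column names used here (`name`, `street`, `postal_code`, `city`) and the spoken phrasing are assumptions; the actual shape of app.campuses is not shown in this document.

```python
def address_all_campuses(campuses):
    """Render one spoken sentence per campus row; no address strings live in code."""
    if not campuses:
        # Hypothetical fallback wording for a tenant with no campus rows yet.
        return "I could not find any campus addresses."
    sentences = [
        f"{c['name']} is located at {c['street']}, {c['postal_code']} {c['city']}."
        for c in campuses
    ]
    return " ".join(sentences)
```

Because the function receives rows rather than reading constants, onboarding a tenant changes the data passed in and leaves the renderer untouched.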


HospitalCrawler

HospitalCrawler (backend/app/crawlers/hospital_crawler.py) is the hospital-agnostic web crawler, renamed from ZOLCrawler. All site-specific constants (domains, skip patterns, boilerplate selectors) are sourced from the SiteCrawlConfig loaded from the database rather than from class-level attributes.

    HospitalCrawler.__init__()
      └─ get_active_site_config(hospital_id)
           └─ reads app.site_crawl_configs WHERE is_active = true
                → SiteCrawlConfig dataclass

The public methods should_crawl(url) and categorize_url(url) both delegate to the loaded SiteCrawlConfig, making crawler behaviour fully DB-configurable without code deployment.
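
The delegation could look roughly like the sketch below. It assumes the config is passed in already built, models only three of the config fields, and uses a hypothetical "uncategorized" label when no category pattern matches; the real matching rules may differ.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class SiteCrawlConfig:
    domains: tuple
    skip_url_patterns: tuple = ()
    url_category_patterns: tuple = ()   # (regex, category) pairs


class HospitalCrawler:
    def __init__(self, config):
        self.config = config
        # Compile once; the patterns come from the DB row, not from class constants.
        self._skip = [re.compile(p) for p in config.skip_url_patterns]
        self._categories = [(re.compile(p), cat) for p, cat in config.url_category_patterns]

    def should_crawl(self, url):
        host = urlparse(url).hostname or ""
        if host not in self.config.domains:
            return False
        return not any(p.search(url) for p in self._skip)

    def categorize_url(self, url):
        # First matching pattern wins, so row order in the config is significant.
        for pattern, category in self._categories:
            if pattern.search(url):
                return category
        return "uncategorized"   # hypothetical fallback label
```

Two crawler instances built from different configs behave differently with no subclassing, which is what makes the class hospital-agnostic.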

Default extraction rules handle common CMS patterns (Drupal, WordPress). Hospital-specific overrides are stored in the DB config and applied at runtime.


Auto-Linker in the Publish Pipeline

RelationshipAutoLinker (backend/app/services/taxonomy/relationship_autolinker.py) runs automatically as Step 7 of every PublishService.publish() call. Its purpose is to ensure newly published entities are connected to at least one department, filling relationship gaps that entity extraction may have missed.

How It Works

  1. Find orphans: Query published_entities for CONDITION, TREATMENT, and EXAMINATION entities with no TREATS, HANDLES, or PERFORMS relationship in published_relationships.
  2. Get department map: Load all DEPARTMENT entities for the hospital from published_entities.
  3. Batch classify: Send orphans to the LLM in batches of 40. The model maps each entity to the most appropriate department name from the provided list.
  4. Insert relationships: For each LLM assignment, insert a row into published_relationships with the type derived from the entity type. All inserts use ON CONFLICT DO NOTHING.
  5. Commit: If any rows were linked, the session is committed.

Relationship Type Mapping

| Entity Type | Relationship Created |
| --- | --- |
| CONDITION | TREATS (department TREATS condition) |
| TREATMENT | PERFORMS |
| EXAMINATION | PERFORMS |
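
The batching, type mapping, and conflict handling described above can be sketched without a database or an LLM. In this illustration `classify` is a stand-in for the model call, the `existing` set stands in for published_relationships, and the tuple layouts are assumptions.

```python
REL_TYPE = {"CONDITION": "TREATS", "TREATMENT": "PERFORMS", "EXAMINATION": "PERFORMS"}
BATCH_SIZE = 40


def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def autolink(orphans, departments, classify, existing):
    """Link each orphan to the department chosen by `classify`.

    orphans:     list of (entity_id, entity_type, name)
    departments: dict mapping department name -> department id
    classify:    callable(batch, department_names) -> {entity_id: department_name}
    existing:    set of (department_id, rel_type, entity_id), mutated in place
    """
    linked = 0
    for batch in batched(orphans, BATCH_SIZE):
        assignments = classify(batch, list(departments))
        for entity_id, entity_type, _name in batch:
            dept_id = departments.get(assignments.get(entity_id))
            if dept_id is None:
                continue  # the model returned no usable department name
            row = (dept_id, REL_TYPE[entity_type], entity_id)
            if row not in existing:  # mirrors ON CONFLICT DO NOTHING
                existing.add(row)
                linked += 1
    return linked
```

Looking the returned name up in `departments` discards any department the model invents, and the membership check makes a second run over the same orphans a no-op.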

Failure Isolation

The auto-linker runs inside a try/except in PublishService. Any failure is logged as a warning and does not block or roll back the publish transaction. The publish result includes an AUTO_LINKED count in the relationship summary when links were created.
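
The isolation boundary amounts to a narrow try/except around the linker call. The sketch below assumes two hypothetical callables, `commit_publish` and `autolink`, in place of the real service methods, and a plain dict as the publish summary.

```python
import logging

logger = logging.getLogger("publish")


def publish(commit_publish, autolink):
    """Commit the publish first, then run the auto-linker as a non-fatal step."""
    summary = commit_publish()            # the publish transaction itself
    try:
        linked = autolink()
        if linked:
            # The count is reported only when at least one link was created.
            summary["AUTO_LINKED"] = linked
    except Exception as exc:              # any linker failure is downgraded to a warning
        logger.warning("auto-link skipped: %s", exc)
    return summary
```

Running the linker after the publish has committed is what guarantees a linker error cannot roll the publish back.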


New Hospital Onboarding Flow

Step-by-Step

| Step | Action | API / Service |
| --- | --- | --- |
| 1 | Create hospital record | HospitalBootstrapService.bootstrap() (idempotent upsert) |
| 2 | Configure crawl settings | PUT /api/v1/hospitals/{id}/crawl-config |
| 3 | Run website crawl | Pipeline Wizard, Stage 2 (sitemap + content extraction) |
| 4 | Hub page detection | Pipeline Wizard, Stage 3 (LLM binary classifier, populates golden_pages) |
| 5 | Entity extraction | Pipeline Wizard, Stage 4 (per-hub LLM extraction + SNOMED matching + dedup) |
| 6 | Operator review | Pipeline Wizard, Review UI (approve / reject / merge entities) |
| 7 | Publish taxonomy | POST /api/v1/taxonomy/publish |
| 8 | Auto-link orphans | RelationshipAutoLinker.autolink() (runs inside publish, non-fatal) |

After Step 8, FrozenTaxonomyRegistry is invalidated and reloaded. The new hospital's taxonomy is available to the RAG pipeline within milliseconds.

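No code changes are required, and the web/RAG channel needs no redeployment. A tenant that also uses the voice channel needs its overlay YAML file, which ships with a release.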


Admin API — Crawl Configuration

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/v1/hospitals/{id}/crawl-config | Get active crawl config for a hospital |
| PUT | /api/v1/hospitals/{id}/crawl-config | Create or update crawl config (upsert by hospital_id + slug) |

Both endpoints are protected by require_admin authorization. The upsert uses INSERT ... ON CONFLICT (hospital_id, slug) DO UPDATE so repeated calls are idempotent. Only the row with is_active = true is used by the crawler at runtime.
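
The idempotency of the upsert can be demonstrated with an in-memory SQLite database standing in for PostgreSQL (SQLite 3.24+ accepts the same ON CONFLICT ... DO UPDATE shape). The table is trimmed to a few columns, and the helper name `upsert_crawl_config` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE site_crawl_configs (
        hospital_id TEXT NOT NULL,
        slug        TEXT NOT NULL,
        max_pages   INTEGER NOT NULL,
        is_active   INTEGER NOT NULL DEFAULT 1,
        UNIQUE (hospital_id, slug)
    )
    """
)


def upsert_crawl_config(hospital_id, slug, max_pages):
    # The conflict target must match the unique constraint on (hospital_id, slug).
    conn.execute(
        """
        INSERT INTO site_crawl_configs (hospital_id, slug, max_pages)
        VALUES (?, ?, ?)
        ON CONFLICT (hospital_id, slug) DO UPDATE SET max_pages = excluded.max_pages
        """,
        (hospital_id, slug, max_pages),
    )
    conn.commit()
```

Calling the helper twice with the same arguments leaves a single row, and a later call with a new value updates that row in place.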


Data Isolation

All storage layers are scoped per hospital. There is no shared mutable state between hospitals.

| Layer | Isolation Mechanism |
| --- | --- |
| PostgreSQL taxonomy tables | hospital_id FK on every entity, relationship, and version row |
| site_crawl_configs | hospital_id FK, one active row per hospital |
| golden_pages | hospital_id FK, unique constraint on (hospital_id, url) |
| Documents and chunks | hospital_id FK, all vector queries include hospital_id filter |
| Redis keys | {tenant_id}: prefix on all cache, rate-limit, and session keys |
| MinIO object storage | {tenant_id}/{document_id} path prefix |
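
Centralising the Redis and MinIO prefixes in small helpers is one way to keep call sites from building unscoped keys by hand. The helper names and validation rules below are assumptions; only the `{tenant_id}:` and `{tenant_id}/{document_id}` shapes come from the table.

```python
def redis_key(tenant_id, *parts):
    """Build a cache, rate-limit, or session key under the tenant prefix."""
    if not tenant_id:
        raise ValueError("tenant_id is required")
    return ":".join((tenant_id, *parts))


def object_path(tenant_id, document_id):
    """Build the object-storage path for a document."""
    # Rejecting "/" stops a crafted tenant id from escaping its own prefix.
    if not tenant_id or "/" in tenant_id:
        raise ValueError("invalid tenant_id")
    return f"{tenant_id}/{document_id}"
```

For example, `redis_key("t1", "session", "abc")` yields `t1:session:abc`, and `object_path("t1", "doc-9")` yields `t1/doc-9`.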

The resolve_tenant_id() FastAPI dependency extracts the tenant from Keycloak JWT claims, ensuring each request is scoped to the correct hospital before any query executes.
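
Stripped of the FastAPI wiring, the core of tenant resolution is a lookup on claims that have already been signature-verified. The claim name `tenant_id` and the exception type below are assumptions, since Keycloak claim mapping is deployment-specific and the real dependency is not shown here.

```python
class TenantResolutionError(Exception):
    """Raised when verified claims carry no usable tenant."""


def resolve_tenant_id(claims, claim_name="tenant_id"):
    """Return the tenant from JWT claims that were already signature-verified.

    This sketch performs no token verification itself; it must only ever
    receive claims from a token whose signature has been checked.
    """
    tenant = claims.get(claim_name)
    if not isinstance(tenant, str) or not tenant.strip():
        # Fail closed: a request without a tenant must never reach a query.
        raise TenantResolutionError(f"missing claim: {claim_name}")
    return tenant.strip()
```

Failing closed matters more than the lookup itself: a missing or malformed claim aborts the request instead of defaulting to some tenant.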