Multi-Tenancy & Hospital-Agnostic Architecture
The system is architected to serve multiple hospitals from a single deployment. All hospital-specific behaviour is driven by database configuration or per-tenant YAML: no code changes or redeployment are required to onboard a new hospital.
Tenant-Isolation Trade-offs
| Decision | Chosen | Alternatives considered | Rejected because |
|---|---|---|---|
| Isolation model | Shared schema with tenant_id FK on every domain row (pool model) | Schema-per-tenant; database-per-tenant | Schema-per-tenant complicates Alembic migrations (one schema-aware migration per tenant per release) and breaks single-query analytics. Database-per-tenant scales connection pool size linearly with tenant count and forbids cross-tenant analytics entirely. The pool model with row-level tenant_id filtering scales to the projected 5–10 hospital pilot fleet without rearchitecting. The three isolation models (shared schema / schema-per-tenant / database-per-tenant) and their trade-offs are the canonical multi-tenant SaaS taxonomy from Bezemer & Zaidman 2010. |
| Tenant resolution | Keycloak JWT claim → resolve_tenant_id() FastAPI dependency | URL-based tenant ID; subdomain routing; X-Tenant-Id request header | A tenant ID carried in the URL or a request header lets a misconfigured client cross tenant boundaries by accident. Keycloak claims are signed and verified on every request, so the tenant identity is cryptographically bound to the authentication. Subdomain routing is on the Multi-Tenancy Roadmap for Phase 2 but is not in current scope. |
| Per-tenant configuration plane | Two planes — DB-driven for web/RAG (site_crawl_configs, golden_pages, PromptContext from YAML) + YAML overlay for voice (tenant_overlays/_yaml/<slug>.yaml) | Single unified DB-driven config; single unified YAML file | A single DB-driven plane forces a schema migration for every voice-FAQ change; a single YAML file forces every web-channel crawl-config edit to ship as a redeploy. The split puts the slow-moving voice content (FAQ entries, STT phonetic-recovery rules) in version control and the fast-moving web crawl rules in DB rows that platform admins can edit through the API. |
Hospital-agnostic refactoring (Phase 1) is complete as of 2026-03-30. The system runs in single-tenant pilot mode for ZOL. Full multi-tenant routing (subdomain resolution, per-tenant auth) is planned for Phase 2. See Multi-Tenancy Roadmap for the full phased plan.
Core Design Principles
| Principle | Implementation |
|---|---|
| No hardcoded hospital references | All identity loaded from DB or YAML config at runtime |
| DB-driven crawl configuration | app.site_crawl_configs table replaces in-process constants |
| DB-driven golden pages | app.golden_pages table, scoped per hospital |
| Parameterized prompts | PromptContext dataclass injected into all LLM calls |
| Generic crawler | HospitalCrawler (renamed from ZOLCrawler) reads config from DB |
| Auto-link on publish | RelationshipAutoLinker runs after every publish to fill relationship gaps |
Configuration Architecture
app.site_crawl_configs — Crawl Configuration
Every hospital's crawl behaviour is defined by a row in this table. The HospitalCrawler loads the active config at startup via get_active_site_config(hospital_id).
| Column | Type | Purpose |
|---|---|---|
| `hospital_id` | UUID FK | Owner hospital |
| `slug` | VARCHAR | Short identifier (unique per hospital) |
| `domains` | JSONB | List of allowed domains |
| `canonical_domain` | VARCHAR | Primary domain for URL normalization |
| `skip_url_patterns` | JSONB | Regex patterns for URLs to skip during crawl |
| `golden_url_paths` | JSONB | High-priority URL prefixes to prioritize |
| `boilerplate_css_selectors` | JSONB | CSS selectors for boilerplate removal |
| `boilerplate_text_patterns` | JSONB | Regex patterns for post-extraction text cleanup |
| `boilerplate_pdf_patterns` | JSONB | Regex patterns for PDF cover-page cleanup |
| `url_category_patterns` | JSONB | (regex, category) pairs for URL classification |
| `crawl_depth` | INTEGER | Max crawl depth |
| `max_pages` | INTEGER | Safety limit on total pages crawled |
| `is_active` | BOOLEAN | Only one active config per hospital |
The code path: get_active_site_config(hospital_id) reads the active row from site_crawl_configs and builds a SiteCrawlConfig dataclass, which is injected into HospitalCrawler at construction time.
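The selection step can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the rows have already been fetched as dictionaries, it models only a subset of the columns, and the helper name `select_active_site_config` is hypothetical (the real `get_active_site_config(hospital_id)` queries the table itself).

```python
from dataclasses import dataclass
from typing import Any, Mapping, Sequence, Tuple


@dataclass(frozen=True)
class SiteCrawlConfig:
    """Subset of the app.site_crawl_configs columns (assumed field names)."""
    hospital_id: str
    slug: str
    domains: Tuple[str, ...]
    canonical_domain: str
    skip_url_patterns: Tuple[str, ...]
    crawl_depth: int
    max_pages: int


def select_active_site_config(
    rows: Sequence[Mapping[str, Any]], hospital_id: str
) -> SiteCrawlConfig:
    """Pick the single active row for a hospital; `rows` stands in for the DB."""
    active = [
        r for r in rows if r["hospital_id"] == hospital_id and r["is_active"]
    ]
    if len(active) != 1:
        raise LookupError(
            f"expected exactly one active config for {hospital_id}, "
            f"found {len(active)}"
        )
    row = active[0]
    # JSONB lists become tuples so the frozen dataclass stays immutable.
    return SiteCrawlConfig(
        hospital_id=row["hospital_id"],
        slug=row["slug"],
        domains=tuple(row["domains"]),
        canonical_domain=row["canonical_domain"],
        skip_url_patterns=tuple(row["skip_url_patterns"]),
        crawl_depth=row["crawl_depth"],
        max_pages=row["max_pages"],
    )
```

Raising when zero or several rows are active makes a violated "one active config per hospital" rule visible at crawler start instead of silently picking one.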
app.golden_pages — High-Value Pages
Golden pages are URLs identified as structurally valuable for entity extraction (hub pages that list doctors, departments, or conditions). They are stored per hospital in app.golden_pages and populated by the hub detection pipeline (Stage 3 of the taxonomy extraction pipeline).
| Column | Type | Purpose |
|---|---|---|
| `hospital_id` | UUID FK | Owner hospital |
| `url` | VARCHAR(2000) | Page URL |
| `page_type` | VARCHAR(40) | hub or detail |
| `confidence` | FLOAT | AI classification confidence |
| `status` | VARCHAR(20) | proposed, confirmed, rejected |
| `confirmed_by` / `confirmed_at` | UUID / TIMESTAMPTZ | Operator approval audit trail |
| `last_extracted` | TIMESTAMPTZ | When entity extraction last ran on this page |
Parameterized Prompts — PromptContext
All LLM prompts across the system are parameterized via the PromptContext dataclass (backend/app/prompts.py). No prompt contains hardcoded hospital names.
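
    from dataclasses import dataclass


    @dataclass(frozen=True)
    class PromptContext:
        """Hospital identity injected into every prompt; defaults are generic."""
        hospital_name: str = "Het ziekenhuis"
        hospital_full_name: str = "het ziekenhuis"
        hospital_location: str = "Belgium"
        phone_number: str = ""
        website: str = ""
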
At runtime, get_prompt_context(tenant_slug) builds a PromptContext from the hospital's YAML config (sourced from HospitalTaxonomy). All prompt-building functions accept an optional ctx: PromptContext | None parameter; passing None falls back to the generic default.
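A minimal sketch of the fallback behaviour, assuming the tenant's YAML has already been parsed into a dictionary. The helpers `context_from_config` and `build_disclaimer` and the disclaimer wording are hypothetical; only the `PromptContext` fields come from the definition above.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class PromptContext:
    hospital_name: str = "Het ziekenhuis"
    hospital_full_name: str = "het ziekenhuis"
    hospital_location: str = "Belgium"
    phone_number: str = ""
    website: str = ""


def context_from_config(config: Mapping[str, Any]) -> PromptContext:
    """Build a context from parsed tenant config, ignoring unknown keys."""
    known = {f.name for f in fields(PromptContext)}
    return PromptContext(**{k: str(v) for k, v in config.items() if k in known})


def build_disclaimer(ctx: Optional[PromptContext] = None) -> str:
    """Hypothetical prompt builder: None falls back to the generic default."""
    ctx = ctx or PromptContext()
    return f"{ctx.hospital_name} cannot give medical advice through this channel."
```

Keys missing from the tenant config keep their generic defaults, so a partially filled config still yields a usable prompt.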
Two additional parameterized prompt helpers follow the same pattern:
| Function | Module | Purpose |
|---|---|---|
| `get_decomposition_prompt(hospital_name)` | query_decomposition_service.py | Query decomposition instructions with hospital name |
| `get_validation_prompt(hospital_name)` | graph/llm_entity_validation.py | Entity validation instructions with hospital name |
The prompt context flows into the RAG pipeline, intent classifier, safety messages, disclaimers, and blocked-query responses — all surfaces that previously contained hardcoded ZOL references.
PromptContext parameterises the web/RAG channel (chat UI, intent classifier, safety messages, response generation). The voice channel uses a separate per-tenant overlay plane described in Voice Channel Tenant Overlays below. The two planes are intentionally separate so voice-FAQ edits don't require a database migration and crawler-config edits don't require a YAML release.
Voice Channel Tenant Overlays
The voice channel uses a parallel per-tenant configuration plane sourced from YAML files in backend/app/services/voice/tenant_overlays/_yaml/<slug>.yaml. The overlay loader (backend/app/services/voice/tenant_overlays/loader.py) reads the tenant's YAML at startup; the registry (backend/app/services/voice/tenant_overlays/registry.py) returns the active overlay per request via get_overlay(slug).
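The load-at-startup, read-per-request split might look like the sketch below. The class names `TenantOverlay` and `OverlayRegistry` and the pluggable `source` callable are assumptions for illustration; the source only confirms that a loader parses the YAML at startup and that `get_overlay(slug)` returns the active overlay.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Mapping, Tuple


@dataclass(frozen=True)
class TenantOverlay:
    slug: str
    faq_entries: Tuple[Any, ...] = ()
    stt_phonetic_recovery: Mapping[str, str] = field(default_factory=dict)


class OverlayRegistry:
    """Caches one parsed overlay per tenant slug."""

    def __init__(self, source: Callable[[str], Mapping[str, Any]]) -> None:
        # `source` returns the parsed overlay for a slug. In this sketch it
        # stands in for the YAML loader.
        self._source = source
        self._cache: Dict[str, TenantOverlay] = {}

    def load(self, slug: str) -> None:
        """Called once at startup for each known tenant."""
        raw = self._source(slug)
        self._cache[slug] = TenantOverlay(
            slug=slug,
            faq_entries=tuple(raw.get("faq_entries", ())),
            stt_phonetic_recovery=dict(raw.get("stt_phonetic_recovery", {})),
        )

    def get_overlay(self, slug: str) -> TenantOverlay:
        """Per-request lookup; never touches the source."""
        try:
            return self._cache[slug]
        except KeyError:
            raise LookupError(f"no overlay loaded for tenant {slug!r}") from None
```

Because callers only see `get_overlay(slug)`, the source behind the registry can change without touching request-handling code.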
What an overlay carries
| Section | Purpose | Example |
|---|---|---|
| `faq_entries` | Per-tenant FAQ entries that bypass RAG. The voice FAQ tool checks intent + key-phrase patterns first; on hit, the renderer produces the spoken answer. | "What are visiting hours?" → address_all_campuses renderer, which queries app.campuses and reads each row aloud (no hardcoded address strings in code). |
| `stt_phonetic_recovery` | Per-tenant phonetic-mishear corrections applied to Deepgram Nova-3 @deepgram_nova3 output before intent classification. | "afwrak" → "after-care" (Dutch STT mishears the English compound when a Dutch caller code-switches mid-utterance). |
| `renderers` | DB-driven renderer functions that turn structured data (campus rows, business hours) into spoken text. | A future tenant gets the right answer by populating its taxonomy and overlay; no code change. |
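As an illustration of the `stt_phonetic_recovery` section, the sketch below applies mishear corrections to a transcript before intent classification. It assumes the rules are a plain mishear-to-correction mapping and that matching is whole-word and case-insensitive; the function name `apply_phonetic_recovery` is hypothetical.

```python
import re
from typing import Mapping


def apply_phonetic_recovery(transcript: str, rules: Mapping[str, str]) -> str:
    """Replace known STT mishears with their intended phrase."""
    if not rules:
        return transcript
    # Try longer mishears first so multi-word rules win over their prefixes.
    ordered = sorted(rules, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(m) for m in ordered) + r")\b",
        re.IGNORECASE,
    )
    lowered = {mishear.lower(): fix for mishear, fix in rules.items()}
    return pattern.sub(lambda m: lowered[m.group(0).lower()], transcript)
```

The word-boundary anchors keep a rule from rewriting a longer word that merely contains the misheard token.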
Why YAML for this plane (not DB rows)
| Trade-off | YAML overlay | DB-row equivalent |
|---|---|---|
| Edit cadence | Slow (FAQ patterns change quarterly) | Fast (rule edits without redeploy) |
| Review surface | Git PR — a content reviewer sees every line change with diff | Admin UI form — easy to fat-finger |
| Test coverage | Pytest fixture loads the tenant YAML and asserts every entry | Per-row contract test, harder to maintain |
| Redeploy cost | Each overlay edit is a code release | Zero |
The choice is YAML because voice content is reviewer-gated and changes infrequently. If overlay churn ever exceeds quarterly cadence, the system can grow a tenant_voice_overlays table without changing the runtime API (get_overlay(slug) would consult a DB row instead of a parsed YAML).
No hardcoded tenant data
A foundational rule: tenant facts (campus addresses, doctor names, hospital name) never live in source code. They live in the taxonomy DB or, for voice STT-recovery rules, in the YAML overlay. The runtime always asks get_taxonomy(slug) for facts and get_overlay(slug) for tenant-scoped voice content. The rule was codified after the 2026-05 voice batch C–F sprint: the previous _FAQ_ENTRIES table held hardcoded ZOL strings (a campus address, parking specifics, a contact phone number), so a newly onboarded tenant would have silently inherited the wrong content. New tenant-scoped voice answers should follow the renderer pattern, in which address_all_campuses reads from app.campuses.
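A renderer in this style could look like the sketch below. It assumes the campus rows arrive as dictionaries with `name`, `street`, and `city` keys; those column names, the spoken phrasing, and the empty-result message are illustrative, not taken from the actual app.campuses schema.

```python
from typing import Mapping, Sequence


def address_all_campuses(campuses: Sequence[Mapping[str, str]]) -> str:
    """Turn campus rows into one spoken answer; no address lives in code."""
    if not campuses:
        return "I do not have an address on file."
    sentences = [
        f"{campus['name']} is at {campus['street']}, {campus['city']}."
        for campus in campuses
    ]
    return " ".join(sentences)
```

Every tenant-specific string comes from the rows passed in, so a new tenant gets correct answers once its campus data is populated.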
HospitalCrawler
HospitalCrawler (backend/app/crawlers/hospital_crawler.py) is the hospital-agnostic web crawler, renamed from ZOLCrawler. All site-specific constants (domains, skip patterns, boilerplate selectors) are sourced from the SiteCrawlConfig loaded from the database rather than class-level attributes.

    HospitalCrawler.__init__()
      └─ get_active_site_config(hospital_id)
           └─ reads app.site_crawl_configs WHERE is_active = true
                → SiteCrawlConfig dataclass

The public methods should_crawl(url) and categorize_url(url) both delegate to the loaded SiteCrawlConfig, making crawler behaviour fully DB-configurable without code deployment.
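The delegation can be sketched as below, under stated assumptions: the config is reduced to three fields, a URL is crawlable when its host is in `domains` and no skip pattern matches, and the first matching category pattern wins. The real matching rules may differ.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SiteCrawlConfig:
    domains: Tuple[str, ...]
    skip_url_patterns: Tuple[str, ...]
    url_category_patterns: Tuple[Tuple[str, str], ...]  # (regex, category)


class HospitalCrawler:
    def __init__(self, config: SiteCrawlConfig) -> None:
        self._config = config
        # Compile once at construction; the patterns come from the DB row.
        self._skip = [re.compile(p) for p in config.skip_url_patterns]
        self._categories = [
            (re.compile(p), category)
            for p, category in config.url_category_patterns
        ]

    def should_crawl(self, url: str) -> bool:
        host = urlsplit(url).hostname or ""
        if host not in self._config.domains:
            return False
        return not any(p.search(url) for p in self._skip)

    def categorize_url(self, url: str) -> Optional[str]:
        for pattern, category in self._categories:
            if pattern.search(url):
                return category
        return None
```

Nothing in the class names a hospital, so changing the config row changes the crawl behaviour.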
Default extraction rules handle common CMS patterns (Drupal, WordPress). Hospital-specific overrides are stored in the DB config and applied at runtime.
Auto-Linker in the Publish Pipeline
RelationshipAutoLinker (backend/app/services/taxonomy/relationship_autolinker.py) runs automatically as Step 7 of every PublishService.publish() call. Its purpose is to ensure newly published entities are connected to at least one department, filling relationship gaps that entity extraction may have missed.
How It Works
- Find orphans: Query `published_entities` for `CONDITION`, `TREATMENT`, and `EXAMINATION` entities with no `TREATS`, `HANDLES`, or `PERFORMS` relationship in `published_relationships`.
- Get department map: Load all `DEPARTMENT` entities for the hospital from `published_entities`.
- Batch classify: Send orphans to the LLM in batches of 40. The model maps each entity to the most appropriate department name from the provided list.
- Insert relationships: For each LLM assignment, insert a row into `published_relationships` with the type derived from the entity type. All inserts use `ON CONFLICT DO NOTHING`.
- Commit: If any rows were linked, the session is committed.
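The batching, type mapping, and conflict-tolerant insert can be sketched as follows. This is an illustration under assumptions: SQLite stands in for PostgreSQL, the three-column table layout is invented, and the LLM call is replaced by a `classify` callable supplied by the caller.

```python
import sqlite3
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

BATCH_SIZE = 40
RELATIONSHIP_FOR_TYPE = {
    "CONDITION": "TREATS",
    "TREATMENT": "PERFORMS",
    "EXAMINATION": "PERFORMS",
}
# Simplified stand-in schema; the unique constraint is what makes
# ON CONFLICT DO NOTHING skip duplicates.
SCHEMA = """
CREATE TABLE published_relationships (
    department TEXT NOT NULL,
    rel_type   TEXT NOT NULL,
    entity     TEXT NOT NULL,
    UNIQUE (department, rel_type, entity)
)
"""

Entity = Tuple[str, str]  # (entity name, entity type)
Classifier = Callable[[List[Entity], Sequence[str]], Dict[str, str]]


def batched(items: Sequence[Entity], size: int) -> Iterator[List[Entity]]:
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def autolink(
    conn: sqlite3.Connection,
    orphans: Sequence[Entity],
    departments: Sequence[str],
    classify: Classifier,
) -> int:
    """Link each orphan to a department; returns the number of new rows."""
    before = conn.total_changes
    for batch in batched(orphans, BATCH_SIZE):
        assignments = classify(batch, departments)  # entity name -> department
        for name, entity_type in batch:
            department = assignments.get(name)
            if department not in departments:
                continue  # ignore answers outside the provided list
            conn.execute(
                "INSERT INTO published_relationships (department, rel_type, entity) "
                "VALUES (?, ?, ?) ON CONFLICT DO NOTHING",
                (department, RELATIONSHIP_FOR_TYPE[entity_type], name),
            )
    linked = conn.total_changes - before
    if linked:
        conn.commit()
    return linked
```

Rejecting department names that are not in the provided list guards against a model inventing a department, and the conflict clause makes a repeated run a no-op.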
Relationship Type Mapping
| Entity Type | Relationship Created |
|---|---|
| `CONDITION` | TREATS (department TREATS condition) |
| `TREATMENT` | PERFORMS |
| `EXAMINATION` | PERFORMS |
Failure Isolation
The auto-linker runs inside a try/except in PublishService. Any failure is logged as a warning and does not block or roll back the publish transaction. The publish result includes an AUTO_LINKED count in the relationship summary when links were created.
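A minimal sketch of that isolation, assuming the publish transaction is committed before the auto-linker runs. The two callables and the `PUBLISHED` summary key are hypothetical stand-ins; only the `AUTO_LINKED` key comes from the description above.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("publish")


def publish(
    commit_publish: Callable[[], int],
    autolink: Callable[[], int],
) -> Dict[str, int]:
    """Run the publish step, then the auto-linker as a non-fatal extra."""
    summary = {"PUBLISHED": commit_publish()}
    try:
        linked = autolink()
    except Exception:
        # Log and continue: a linker failure must not undo the publish.
        logger.warning("auto-linker failed; publish is unaffected", exc_info=True)
    else:
        if linked:
            summary["AUTO_LINKED"] = linked
    return summary
```

Catching `Exception` broadly is deliberate here, since the linker depends on an external model call whose failure modes are open-ended.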
New Hospital Onboarding Flow
Step-by-Step
| Step | Action | API / Service |
|---|---|---|
| 1 | Create hospital record | HospitalBootstrapService.bootstrap() — idempotent upsert |
| 2 | Configure crawl settings | PUT /api/v1/hospitals/{id}/crawl-config |
| 3 | Run website crawl | Pipeline Wizard — Stage 2 (sitemap + content extraction) |
| 4 | Hub page detection | Pipeline Wizard — Stage 3 (LLM binary classifier, populates golden_pages) |
| 5 | Entity extraction | Pipeline Wizard — Stage 4 (per-hub LLM extraction + SNOMED matching + dedup) |
| 6 | Operator review | Pipeline Wizard — Review UI (approve / reject / merge entities) |
| 7 | Publish taxonomy | POST /api/v1/taxonomy/publish |
| 8 | Auto-link orphans | RelationshipAutoLinker.autolink() — runs inside publish, non-fatal |
After Step 8, FrozenTaxonomyRegistry is invalidated and reloaded. The new hospital's taxonomy is available to the RAG pipeline within milliseconds.
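No code changes are required, and the web/RAG channel needs no redeployment. A tenant that also uses the voice channel needs its YAML overlay added, which ships as a release (see Voice Channel Tenant Overlays).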
Admin API — Crawl Configuration
| Method | Path | Description |
|---|---|---|
| `GET` | /api/v1/hospitals/{id}/crawl-config | Get active crawl config for a hospital |
| `PUT` | /api/v1/hospitals/{id}/crawl-config | Create or update crawl config (upsert by hospital_id + slug) |
Both endpoints require admin authorization via require_admin. The upsert uses INSERT ... ON CONFLICT (hospital_id, slug) DO UPDATE, so repeated calls are idempotent. Only the row with is_active = true is used by the crawler at runtime.
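The idempotent upsert can be demonstrated with the sketch below. SQLite stands in for PostgreSQL (both accept this `ON CONFLICT ... DO UPDATE` form), the table is cut down to four columns, and the helper names are hypothetical.

```python
import json
import sqlite3
from typing import Sequence

UPSERT = """
INSERT INTO site_crawl_configs (hospital_id, slug, domains, max_pages)
VALUES (:hospital_id, :slug, :domains, :max_pages)
ON CONFLICT (hospital_id, slug) DO UPDATE SET
    domains   = excluded.domains,
    max_pages = excluded.max_pages
"""


def create_schema(conn: sqlite3.Connection) -> None:
    # The unique constraint is the conflict target the upsert relies on.
    conn.execute(
        """
        CREATE TABLE site_crawl_configs (
            hospital_id TEXT NOT NULL,
            slug        TEXT NOT NULL,
            domains     TEXT NOT NULL,
            max_pages   INTEGER NOT NULL,
            UNIQUE (hospital_id, slug)
        )
        """
    )


def upsert_crawl_config(
    conn: sqlite3.Connection,
    hospital_id: str,
    slug: str,
    domains: Sequence[str],
    max_pages: int,
) -> None:
    """Insert the config, or overwrite the row with the same (hospital_id, slug)."""
    conn.execute(
        UPSERT,
        {
            "hospital_id": hospital_id,
            "slug": slug,
            "domains": json.dumps(list(domains)),  # JSONB stand-in
            "max_pages": max_pages,
        },
    )
    conn.commit()
```

Calling `upsert_crawl_config` twice with the same hospital and slug leaves one row holding the values from the second call.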
Data Isolation
All storage layers are scoped per hospital. There is no shared mutable state between hospitals.
| Layer | Isolation Mechanism |
|---|---|
| PostgreSQL taxonomy tables | hospital_id FK on every entity, relationship, and version row |
| `site_crawl_configs` | hospital_id FK, one active row per hospital |
| `golden_pages` | hospital_id FK, unique constraint on (hospital_id, url) |
| Documents and chunks | hospital_id FK, all vector queries include hospital_id filter |
| Redis keys | {tenant_id}: prefix on all cache, rate-limit, and session keys |
| MinIO object storage | {tenant_id}/{document_id} path prefix |
The resolve_tenant_id() FastAPI dependency extracts the tenant from Keycloak JWT claims, ensuring each request is scoped to the correct hospital before any query executes.
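The scoping can be sketched without the web framework, assuming the JWT has already been verified and decoded into a claims dictionary. The claim name `tenant_id`, the error class, and the two key helpers are assumptions; the real dependency is wired through FastAPI.

```python
from typing import Any, Mapping


class TenantResolutionError(Exception):
    """Raised when verified claims carry no usable tenant identifier."""


def resolve_tenant_id(
    claims: Mapping[str, Any], claim_name: str = "tenant_id"
) -> str:
    """Extract the tenant from already-verified JWT claims."""
    tenant_id = claims.get(claim_name)
    if not isinstance(tenant_id, str) or not tenant_id:
        raise TenantResolutionError(f"missing or empty {claim_name!r} claim")
    return tenant_id


def redis_key(tenant_id: str, *parts: str) -> str:
    """Build a '{tenant_id}:'-prefixed cache, rate-limit, or session key."""
    return ":".join((tenant_id, *parts))


def object_path(tenant_id: str, document_id: str) -> str:
    """Build the '{tenant_id}/{document_id}' object-storage path."""
    return f"{tenant_id}/{document_id}"
```

Failing closed on a missing claim means no query or cache lookup runs with an undefined tenant scope.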
Related Pages
- Multi-Tenancy Roadmap — Phase 0/1/2 roadmap and tenant routing plans
- Taxonomy Extraction Pipeline — Hub detection, entity extraction, and publish flow
- Draft/Publish System — Publish versioning and rollback