Release Notes: March 14-20, 2026
~269 commits | 5 database migrations (050-054) | Largest week by commit count
This was the foundational week for the taxonomy management layer. Four major sprints (SP-4 through SP-7) delivered a complete entity resolution pipeline, a draft/publish system with rollback, a 5-stage pipeline management wizard, and fuzzy entity deduplication with LLM classification. In parallel, multi-tenant user management, a full platform admin shell, and taxonomy enrichment improvements landed. The eval score climbed from 95.1% to 99.0% (296/299) across the week.
SP-4: Entity Resolution Pipeline
End-to-end pipeline for extracting, normalizing, deduplicating, and linking taxonomy entities from crawled hospital content.
What changed:
- Database migration & ORM models: New tables for extracted entities, relationships, and SNOMED mappings with full Pydantic schemas
- Name normalizer: Produces canonical names and dedup keys for clustering near-duplicates
- Entity extractor: Type-specific LLM prompts (condition, treatment, department, doctor) with structured output parsing
- Dedup service: Clusters entities by dedup_key, selects the richest survivor per group
- SNOMED linking: Automatic SNOMED CT code assignment during the normalize-dedup pipeline
- Extraction orchestrator: Coordinates the full pipeline with SSE progress streaming and Redis-based locking to prevent concurrent runs
- Entity CRUD API: REST endpoints to approve, reject, merge, and bulk-approve entities, plus an SSE stream for extraction status
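The normalize-then-dedup step described above can be pictured with a minimal sketch. The `Entity` shape, the normalization rules, and the `richness` score (a count of non-empty attributes) are assumptions made for illustration; the notes do not specify how the real normalizer or survivor selection works.

```python
import re
import unicodedata
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical entity shape, not the project's ORM model.
    name: str
    entity_type: str
    attributes: dict = field(default_factory=dict)


def dedup_key(name: str) -> str:
    """Assumed normalization: strip accents and punctuation, lowercase, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def richness(entity: Entity) -> int:
    """Assumed richness score: number of non-empty attributes."""
    return sum(1 for value in entity.attributes.values() if value)


def dedup(entities: list[Entity]) -> list[Entity]:
    """Group by (type, dedup_key) and keep the richest entity per group."""
    groups: dict[tuple[str, str], list[Entity]] = defaultdict(list)
    for entity in entities:
        groups[(entity.entity_type, dedup_key(entity.name))].append(entity)
    return [max(group, key=richness) for group in groups.values()]
```

Grouping on the entity type as well as the key keeps a department and a condition with the same name from collapsing into one record.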
SP-5: Draft/Publish System
A versioned publish workflow that separates the working taxonomy from the production-serving taxonomy.
What changed:
- Migration 053: published_entities, published_relationships, and taxonomy_versions tables
- PublishService: Preview (dry run), publish (snapshot working state), rollback (restore previous version), and unpublish operations
- FrozenTaxonomyRegistry: from_published() constructor with version-check cache invalidation, so the RAG pipeline always serves the latest published version without restarts
- 6 API endpoints: Preview, publish, rollback, unpublish, list versions, get version detail
Impact: Taxonomy changes are staged and reviewed before they affect live search results. Rollback takes seconds.
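As a rough illustration of version-check cache invalidation, the sketch below uses an in-memory store in place of the published tables. `VersionStore`, `CachedRegistry`, and their methods are hypothetical names, not the actual PublishService or FrozenTaxonomyRegistry API, and rollback is modeled simply as moving a pointer to the previous snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    version: int
    entities: tuple[str, ...]


class VersionStore:
    """In-memory stand-in for the published tables (hypothetical)."""

    def __init__(self) -> None:
        self._snapshots: dict[int, Snapshot] = {}
        self.current_version: Optional[int] = None

    def publish(self, working_entities: list[str]) -> int:
        # Snapshot the working state under a new version number.
        version = len(self._snapshots) + 1
        self._snapshots[version] = Snapshot(version, tuple(working_entities))
        self.current_version = version
        return version

    def rollback(self) -> int:
        # Point back at the previous snapshot; nothing is deleted.
        if self.current_version is None or self.current_version <= 1:
            raise ValueError("no previous version to roll back to")
        self.current_version -= 1
        return self.current_version

    def load(self, version: int) -> Snapshot:
        return self._snapshots[version]


class CachedRegistry:
    """Reloads only when the live version differs from the cached one."""

    def __init__(self, store: VersionStore) -> None:
        self._store = store
        self._cached: Optional[Snapshot] = None

    def get(self) -> Optional[Snapshot]:
        live = self._store.current_version
        if live is None:
            self._cached = None
        elif self._cached is None or self._cached.version != live:
            self._cached = self._store.load(live)
        return self._cached
```

Because each read compares only a version number, a publish or rollback becomes visible on the next read without restarting the serving process.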
SP-6: Pipeline Management Wizard
A guided 5-stage wizard UI for running and monitoring the entire taxonomy pipeline from crawl to publish.
What changed:
- 5-stage flow: Setup (select hospital, configure) -> Crawl (trigger + monitor) -> Hub Detection (review hub pages) -> Entity Review (approve/reject/merge) -> Publish (preview + publish)
- Backend endpoints: Pipeline status, crawl summary, and entity count aggregations
- Entity table: Sortable, paginated table with bulk actions (approve all, reject selected) and an inline relationship browser
- Navigation: Persistent nav bar with stage indicators, collapsible sidebar
- i18n: Full English and Dutch translations
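The five-stage flow could be modeled as an ordered enum with a simple gate between stages. This is only a sketch: the stage names come from the list above, while the rule that a stage must be completed before advancing, and the `next_stage` helper itself, are assumptions.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    SETUP = "setup"
    CRAWL = "crawl"
    HUB_DETECTION = "hub_detection"
    ENTITY_REVIEW = "entity_review"
    PUBLISH = "publish"


# Enum members iterate in definition order, which matches the wizard order.
ORDER = list(Stage)


def next_stage(current: Stage, completed: set[Stage]) -> Optional[Stage]:
    """Return the stage after `current`, or None once the last stage is done."""
    if current not in completed:
        raise ValueError(f"{current.value} must be completed before advancing")
    index = ORDER.index(current)
    return ORDER[index + 1] if index + 1 < len(ORDER) else None
```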
SP-7: Fuzzy Entity Deduplication
LLM-powered detection and resolution of near-duplicate entities that survive the initial dedup_key clustering.
What changed:
- Token overlap candidate generation: Pairwise comparison within entity types using token-level overlap scoring
- LLM classification: GPT-4.1-mini classifies each candidate pair as duplicate, alias, or distinct with a confidence score
- Merge candidate API: Scan (trigger generation), list (with filtering), and resolve (merge or reject) endpoints
- Frontend MergeCandidateCard: Side-by-side entity comparison with survivor selection and one-click merge/reject
- Tiered bulk approve: One-click approval for high-confidence candidates (100% overlap first, then 80%+)
- Toast notifications: All confirmation flows migrated from native dialogs to inline ConfirmBar + react-hot-toast
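Candidate generation and the approval tiers can be sketched as follows. The notes do not say which overlap metric is used, so Jaccard similarity over lowercase whitespace tokens is an assumption here, as are the 0.5 candidate threshold and the tier labels. The LLM classification step is not modeled.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-set overlap: shared tokens divided by all distinct tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def candidate_pairs(entities: list[tuple[str, str]], threshold: float = 0.5):
    """Compare (name, entity_type) entries pairwise within each type only."""
    by_type: dict[str, list[str]] = defaultdict(list)
    for name, entity_type in entities:
        by_type[entity_type].append(name)
    pairs = []
    for names in by_type.values():
        for left, right in combinations(names, 2):
            score = jaccard(left, right)
            if score >= threshold:
                pairs.append((left, right, score))
    # Highest-overlap pairs first, so they can be reviewed or approved first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


def tier(score: float) -> str:
    """Bucket a pair for tiered bulk approval (labels are hypothetical)."""
    if score == 1.0:
        return "exact"
    if score >= 0.8:
        return "high"
    return "review"
```

Restricting comparisons to entities of the same type keeps the pairwise cost down and avoids proposing merges across types, such as a condition with a department.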
Multi-Tenant User Management
Full-stack user management for the multi-hospital platform.
What changed:
- Spec-to-implementation: Brainstorm, spec, plan, and execution phases completed in a single sprint
- Password reset: Admin-triggered temporary password flow via Keycloak API
- Hospital auto-select: Users are automatically scoped to their assigned hospital on login
- Permissions defaults: Sensible role-based defaults (owner, admin, user) applied on user creation
Platform Management Shell
A complete admin interface for platform-level operations.
What changed:
- Hospital Management: Master-detail layout for viewing and editing hospital configurations
- Platform Dashboard: Stat cards (users, hospitals, documents, queries) with activity feed
- System Health: Service status indicators with self-heal action buttons
- Platform Settings: Grouped configuration sections with inline editing
- AppShell components: Sidebar, TopBar, CommandPalette (Cmd+K), Breadcrumbs, ThemeToggle
Taxonomy Enrichment
Improvements to how the RAG pipeline uses taxonomy data at query time.
What changed:
- Post-answer enrichment: Multi-part queries now trigger additional taxonomy lookups after initial retrieval, enriching context for the final answer
- Fuzzy department matching: Approximate string matching for department names in user queries (handles typos and abbreviations)
- Per-sub-query reranking: Multi-hop query decomposition now reranks retrieved chunks independently per sub-query before merging
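Fuzzy department matching of this kind can be approximated with the standard library. The sketch below uses `difflib.get_close_matches`; the department list, the abbreviation table, and the 0.8 cutoff are illustrative assumptions, and the notes do not state which matching algorithm the pipeline actually uses.

```python
import difflib
from typing import Optional

# Hypothetical data; the real department names come from the published taxonomy.
DEPARTMENTS = ["Cardiology", "Neurology", "Orthopedics", "Dermatology"]
ABBREVIATIONS = {"cardio": "Cardiology", "neuro": "Neurology", "ortho": "Orthopedics"}


def match_department(text: str, cutoff: float = 0.8) -> Optional[str]:
    """Resolve a user-typed department name, tolerating typos and abbreviations."""
    key = text.strip().lower()
    if key in ABBREVIATIONS:
        return ABBREVIATIONS[key]
    lowered = {name.lower(): name for name in DEPARTMENTS}
    hits = difflib.get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None
```

For example, `match_department("cardiolgy")` resolves to `"Cardiology"`, while an unrelated word such as `"parking"` falls below the cutoff and returns `None`.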
Evaluation Progress
| Date | Score | Context |
|---|---|---|
| March 17 | 95.1% | Baseline after entity resolution pipeline |
| March 18 | 95.9% | Taxonomy enrichment + fuzzy matching |
| March 20 | 99.0% (296/299) | Full pipeline + publish + dedup |
The jump from 95.1% to 99.0% was driven by three factors: richer taxonomy data from the entity resolution pipeline, fuzzy department matching catching previously missed queries, and per-sub-query reranking improving multi-hop accuracy.
Migrations
| Migration | Purpose |
|---|---|
| 050 | Entity extraction tables (entities, relationships, SNOMED mappings) |
| 051 | Extraction orchestrator state and locking metadata |
| 052 | Merge candidates table for fuzzy dedup |
| 053 | Published entities, published relationships, taxonomy versions |
| 054 | User management extensions (hospital scoping, permissions) |
Current System State (end of week)
| Component | Value |
|---|---|
| Eval score | 99.0% (296/299) |
| Taxonomy entities | Extracted and published (pre-dedup) |
| Pipeline wizard | Fully operational (5 stages) |
| Draft/publish | Live with rollback support |
| Merge candidates | LLM-classified, bulk resolvable |
| Platform shell | Complete (dashboard, health, settings) |
| Medical advice incidents | ZERO |