Release Notes: March 7-13, 2026

~296 commits | Peak development week

2-Minute Summary

The biggest week of the project so far. Neo4j was ripped out and replaced with PostgreSQL-based taxonomy tables. Keycloak authentication went from zero to full production (phases 1-3) in a single week, eliminating 730 lines of legacy auth code. A complete feedback system was built across 20 tasks. The ontology layer went live inside the RAG pipeline. Multi-tenancy was retrofitted across 10 tables, and the page classifier was rewritten to a clean hub/detail binary. The eval baseline landed at 95.1%. Ten database migrations (040-049) shipped.


Detailed Changes

1. Neo4j Removal

The graph database experiment was abandoned in favour of PostgreSQL-native taxonomy tables with pgvector.

What changed:

  • Removed all Neo4j driver code, Cypher queries, and graph database configuration
  • Deleted the graph retrieval strategy layer (vector + graph dual-path)
  • Replaced with PostgreSQL taxonomy tables using recursive CTEs for traversal
  • Single retrieval path through pgvector — simpler, faster, and one fewer infrastructure dependency

Impact: Eliminated an entire infrastructure component, reducing operational complexity and removing a class of consistency bugs between the graph and relational stores.
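
The recursive-CTE traversal mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual schema: the taxonomy_node table, its columns, and the sample rows are hypothetical, and SQLite stands in for PostgreSQL so the example runs without a database server. WITH RECURSIVE is standard SQL, so the query shape should carry over, although a PostgreSQL driver would typically use a different placeholder style than "?".

```python
import sqlite3

# Hypothetical schema: one row per taxonomy node with a self-referencing parent.
SCHEMA = """
CREATE TABLE taxonomy_node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES taxonomy_node(id),
    name      TEXT NOT NULL
);
"""

# Walk from a root node down through all of its descendants.
DESCENDANTS = """
WITH RECURSIVE subtree(id, name, depth) AS (
    SELECT id, name, 0 FROM taxonomy_node WHERE id = ?
    UNION ALL
    SELECT n.id, n.name, s.depth + 1
    FROM taxonomy_node AS n
    JOIN subtree AS s ON n.parent_id = s.id
)
SELECT name, depth FROM subtree ORDER BY depth, name;
"""


def descendants(conn, root_id):
    """Return (name, depth) rows for the root node and everything below it."""
    return conn.execute(DESCENDANTS, (root_id,)).fetchall()


def demo_connection():
    """Build an in-memory database with a tiny made-up taxonomy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO taxonomy_node VALUES (?, ?, ?)",
        [
            (1, None, "Cardiology"),
            (2, 1, "Arrhythmia"),
            (3, 1, "Heart failure"),
            (4, 2, "Atrial fibrillation"),
        ],
    )
    return conn
```

Calling descendants(demo_connection(), 1) walks the whole sample subtree in one query, which is the kind of traversal that previously required a Cypher query against the graph store.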


2. Keycloak Authentication (Phases 1-3)

A complete authentication overhaul executed across three phases in a single week.

Phase 1 — Dual-Mode Auth:

  • Added Keycloak RS256 JWT validation alongside legacy HS256 tokens
  • Both auth paths coexisted during the migration window

Phase 2 — OIDC Integration:

  • OIDC login and callback endpoints
  • User migration script to move existing accounts into Keycloak
  • SSO button added to the login page

Phase 3 — Legacy Cleanup:

  • Deleted AuthService and TokenBlacklistService (730 lines removed)
  • Removed MFA implementation and RegisterForm component
  • Keycloak became the sole authentication provider
  • Generic roles introduced: admin, manager, user (replacing the ZOL-specific zol-admin and zol-user roles)

Impact: Authentication is now fully delegated to Keycloak. No passwords are stored in the application database, and roles are managed in the Keycloak admin console.


3. Feedback System (20 Tasks)

A full feedback loop from user input to admin analysis, built across 20 implementation tasks.

What changed:

  • Models: Session feedback, nano classification, and pipeline telemetry database models
  • Conversation Classifier: Service that analyses conversations and classifies user intent, sentiment, and outcome
  • Negative Feedback UX: Category chips for structured negative feedback (wrong answer, missing info, outdated, etc.)
  • Admin Dashboard: Four views — summary statistics, feedback list with filters, per-item analysis, and intent distribution stats
  • Session Prompt: Public chat now prompts for session-level feedback after a conversation ends

Key endpoints:

  • GET /api/v1/admin/feedback/summary — aggregate stats
  • GET /api/v1/admin/feedback/list — paginated feedback items
  • POST /api/v1/admin/feedback/{id}/analyze — trigger AI analysis
  • GET /api/v1/admin/feedback/intent-stats — intent distribution
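
As a rough sketch of how an admin client might address these endpoints, the snippet below only builds the requests and does not send them. The host, the bearer-token header, the "page" query parameter, and the feedback id 42 are all assumptions for illustration; only the paths and HTTP methods come from the list above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://zol.example.com"  # assumption: placeholder host


def feedback_request(path, token, method="GET", params=None):
    """Build (but do not send) a request to an admin feedback endpoint."""
    url = f"{BASE_URL}/api/v1/admin/feedback/{path}"
    if params:
        url += "?" + urlencode(params)
    # Assumption: the API accepts the Keycloak access token as a bearer token.
    return Request(url, method=method, headers={"Authorization": f"Bearer {token}"})


summary = feedback_request("summary", "TOKEN")
# "page" is a hypothetical parameter name; the notes only say the list is paginated.
page_two = feedback_request("list", "TOKEN", params={"page": 2})
# 42 is a made-up feedback id.
analyze = feedback_request("42/analyze", "TOKEN", method="POST")
```

Each object can then be passed to urllib.request.urlopen to perform the call.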

4. Ontology Layer

A structured knowledge layer that augments vector search with entity-aware retrieval.

What changed:

  • Schema evolution: Lookup tables, entity aliases, navigational rules, audit trail, and background job tracking
  • Entity Linker: Three-tier matching pipeline — exact match, fuzzy match (trigram similarity), and SNOMED code lookup
  • Ontology Query Service: 1-hop graph traversal over PostgreSQL taxonomy relationships (e.g., "which doctors treat condition X?")
  • RAG Integration: Ontology lookup runs in parallel with vector search; results are merged and deduplicated before LLM synthesis
  • Navigational Rules Engine: Condition-based rules that inject structured answers for navigational queries (visiting hours, parking, campus directions)
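
The three-tier entity linker could be approximated as below. This is a simplified sketch under stated assumptions: the alias and SNOMED dictionaries, the entity ids, and the 0.4 threshold are invented for illustration, and the trigram similarity is a plain-Python approximation loosely modelled on PostgreSQL's pg_trgm (it pads the whole string rather than each word).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Match:
    entity_id: str
    tier: str  # "exact", "fuzzy", or "snomed"


# Hypothetical lookup data; the real tables live in PostgreSQL.
ALIASES = {"cardiology": "dept:cardiology", "atrial fibrillation": "cond:afib"}
SNOMED = {"49436004": "cond:afib"}  # illustrative code-to-entity mapping


def trigrams(text: str) -> set:
    """Split padded, lower-cased text into its set of 3-character windows."""
    padded = "  " + text.lower().strip() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def link(mention: str, threshold: float = 0.4) -> Optional[Match]:
    """Try exact match, then fuzzy match, then SNOMED code lookup."""
    key = mention.lower().strip()
    if key in ALIASES:
        return Match(ALIASES[key], "exact")
    best = max(ALIASES, key=lambda alias: similarity(key, alias))
    if similarity(key, best) >= threshold:
        return Match(ALIASES[best], "fuzzy")
    if key in SNOMED:
        return Match(SNOMED[key], "snomed")
    return None
```

Ordering the tiers from cheapest and most precise to most permissive means a fuzzy match is only attempted when no exact alias exists.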

Impact: Queries about entity relationships (doctor-department-condition) now return structured, verified answers instead of relying solely on chunk similarity.
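
The parallel lookup and merge step described in this section might look roughly like the following. The two search coroutines are stubs with made-up results, and letting ontology hits win when both sources return the same id is an assumption, not something the notes specify.

```python
import asyncio


async def vector_search(query: str) -> list:
    """Stub for the pgvector similarity search."""
    await asyncio.sleep(0.01)  # stands in for a database round trip
    return [{"id": "chunk-1", "source": "vector"}, {"id": "doc-7", "source": "vector"}]


async def ontology_lookup(query: str) -> list:
    """Stub for the ontology query service."""
    await asyncio.sleep(0.01)
    return [{"id": "doc-7", "source": "ontology"}, {"id": "fact-3", "source": "ontology"}]


async def retrieve(query: str) -> list:
    """Run both lookups concurrently, then merge and deduplicate by id."""
    ontology_hits, vector_hits = await asyncio.gather(
        ontology_lookup(query), vector_search(query)
    )
    merged, seen = [], set()
    # Assumption: ontology hits come first, so they win over duplicate vector hits.
    for hit in [*ontology_hits, *vector_hits]:
        if hit["id"] not in seen:
            seen.add(hit["id"])
            merged.append(hit)
    return merged
```

Running asyncio.run(retrieve("which doctors treat condition X?")) yields a single deduplicated list that would then be handed to LLM synthesis.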


5. Multi-Tenancy Remediation

Tenant isolation was retrofitted across the data layer, from schema to routing.

What changed:

  • Migration 043: Added tenant_id column to 10 tables with foreign key constraints and indexes
  • Service Layer: Threaded tenant_id through CrawlService, IngestionJob, feedback models, and telemetry pipeline
  • Tenant Resolver: Middleware that resolves tenant from subdomain (e.g., zol.example.com maps to tenant zol)
  • Configuration: tenants.yaml manifest with sync_tenants.py script to synchronize tenant records into the database
  • Nginx: Multi-tenant Nginx config generator for subdomain-based routing
  • Testing: Tenant isolation integration tests verifying cross-tenant data cannot leak

Impact: The platform can now serve multiple hospitals from a single deployment, each with isolated data and configuration.
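
Subdomain-based tenant resolution can be sketched as a small pure function. The base domain, the KNOWN_TENANTS set, and the function name are assumptions; in the real middleware the tenant list would presumably come from the database records synced from tenants.yaml.

```python
from typing import Optional

BASE_DOMAIN = "example.com"  # assumption: placeholder base domain
KNOWN_TENANTS = {"zol"}      # assumption: normally loaded from the tenants table


def resolve_tenant(host_header: str) -> Optional[str]:
    """Map a Host header such as 'zol.example.com' to a tenant slug."""
    # Drop any port, normalise case, and strip a trailing dot.
    host = host_header.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None
    subdomain = host[: -len(suffix)]
    # Reject nested subdomains and tenants that are not registered.
    if "." in subdomain or subdomain not in KNOWN_TENANTS:
        return None
    return subdomain
```

Returning None for unknown or malformed hosts lets the middleware reject the request before any tenant-scoped query runs.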


6. Hub/Detail Page Reclassification

Simplified the page type taxonomy from a complex hierarchy to a clean binary classification.

What changed:

  • Migration 045: Reclassified all page_type values to binary hub or detail
  • Page Classifier: Rewritten classifier that determines page type based on content structure (link density, content depth, navigation patterns)
  • Config Cleanup: Removed hardcoded departments and golden_pages lists from zol.yaml — these are now auto-discovered by the crawler

Impact: Cleaner retrieval filtering. Hub pages (department overviews, service listings) are deprioritized in favour of detail pages (specific conditions, treatments) for medical queries.
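
A structure-based hub/detail decision could be sketched with a few simple signals. The feature names and every threshold below are hypothetical; the notes only state that link density, content depth, and navigation patterns are considered.

```python
from dataclasses import dataclass


@dataclass
class PageFeatures:
    word_count: int       # words of visible body text
    link_word_count: int  # words that sit inside links
    nav_link_count: int   # links found in navigation or list blocks


def classify_page(
    features: PageFeatures,
    max_link_density: float = 0.5,  # hypothetical threshold
    min_detail_words: int = 200,    # hypothetical threshold
    max_nav_links: int = 30,        # hypothetical threshold
) -> str:
    """Return 'hub' for link-heavy index pages, 'detail' otherwise."""
    if features.word_count == 0:
        return "hub"  # nothing to read, so treat the page as navigation only
    link_density = features.link_word_count / features.word_count
    if link_density > max_link_density:
        return "hub"
    if features.word_count < min_detail_words and features.nav_link_count > max_nav_links:
        return "hub"
    return "detail"
```

A department overview that is mostly links would come out as a hub, while a long treatment page with a handful of links would be classified as detail.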


7. Observability Migration

Replaced the observability stack with self-hosted, open-source tooling.

What changed:

  • Removed Logfire SDK and all Logfire-specific instrumentation
  • Integrated Langfuse for LLM observability (prompt tracking, cost monitoring, latency analysis)
  • Added OpenTelemetry instrumentation for distributed tracing across services
  • Deployed ClickHouse as the analytics backend for Langfuse
  • Self-hosted Langfuse service added to Docker Compose

8. Taxonomy Explorer

A new admin UI for browsing and managing the taxonomy knowledge graph.

What changed:

  • Two-Panel Layout: Left panel with entity list and search; right panel with entity detail drawer showing relationships, aliases, and metadata
  • Schema Browser: Visual overview of entity types, relationship types, and their cardinalities
  • Timeline Tracking: New taxonomy_events table recording all taxonomy mutations with timestamps and actor

9. Database Migrations

Ten migrations (040-049) covering ontology, feedback, telemetry, multi-tenancy, page reclassification, taxonomy events, and observability configuration.

| Migration | Purpose |
|-----------|---------|
| 040 | Ontology lookup tables and entity aliases |
| 041 | Feedback models (session feedback, nano classification) |
| 042 | Pipeline telemetry and conversation classifier tables |
| 043 | Add tenant_id to 10 tables |
| 044 | Navigational rules and condition evaluation tables |
| 045 | Reclassify page_type to binary hub/detail |
| 046 | Taxonomy events timeline table |
| 047 | Langfuse + OpenTelemetry config columns |
| 048 | Background job tracking for ontology operations |
| 049 | Entity linker audit trail and SNOMED mapping cache |

Evaluation Results

The golden eval framework was established this week, producing the first baseline score.

| Metric | Value |
|--------|-------|
| Golden questions | 299 |
| Pass rate | 95.1% |
| Medical advice incidents | ZERO |

The 95.1% baseline run surfaced 15 failure categories, which drove improvement work in later weeks.


Current System State (End of Week)

| Component | Value |
|-----------|-------|
| Authentication | Keycloak-only (RS256 JWT) |
| Graph database | Removed (PostgreSQL taxonomy) |
| Retrieval strategy | Single-path pgvector + ontology |
| Tenancy model | Multi-tenant with isolation |
| Page classification | Binary hub/detail |
| Observability | Langfuse + OpenTelemetry |
| Database migrations | Head at 049 |
| Eval score | 95.1% baseline |
| Medical advice incidents | ZERO |