Architectural Update (March 2026)

This ADR was written when the system used Neo4j for entity storage. As of March 2026, Neo4j has been fully removed and replaced by PostgreSQL taxonomy tables (taxonomy_entities, taxonomy_relationships). The decision rationale documented here remains valid; the storage layer has changed.

ADR-0026: RAG Pipeline Quality and Speed Improvements

Date: 2026-02-11 | Status: Accepted

Context

Investigation of the RAG pipeline (Lewis et al., 2020) revealed two categories of issues:

Quality Gap: Graph Search Pattern Matching

The typed node search in query_service.py had a pattern gap: queries like "welke arts bij psoriasis?" matched Pattern 1 (doctor keywords) but NOT Pattern 2 (condition deep traversal, which requires treatment keywords like "behandeling"). This meant condition-based doctor lookups returned no graph results, falling back to vector-only search.

Additionally, many conditions in DEPT_CONDITION_MAP lacked corresponding Condition nodes in Neo4j, making the deep traversal path Doctor -> Department -> Treatment -> Condition incomplete.
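The pattern gap can be reproduced with a small sketch. The keyword sets and function names below are hypothetical; the actual lists and matching logic in query_service.py are not shown in this ADR, so treat this only as an illustration of why the example query reaches Pattern 1 but never Pattern 2.

```python
# Hypothetical keyword sets; the real lists in query_service.py are not shown in this ADR.
DOCTOR_KEYWORDS = {"arts", "dokter", "specialist"}
TREATMENT_KEYWORDS = {"behandeling", "behandelen", "therapie"}


def tokenize(query: str) -> set:
    """Lowercase the query and strip basic punctuation from each word."""
    return {word.strip("?.,!").lower() for word in query.split()}


def matches_pattern_1(query: str) -> bool:
    """Pattern 1: the query mentions a doctor keyword."""
    return bool(tokenize(query) & DOCTOR_KEYWORDS)


def matches_pattern_2(query: str) -> bool:
    """Pattern 2: condition deep traversal, gated on a treatment keyword."""
    return bool(tokenize(query) & TREATMENT_KEYWORDS)


query = "welke arts bij psoriasis?"
print(matches_pattern_1(query))  # True: "arts" is a doctor keyword
print(matches_pattern_2(query))  # False: no treatment keyword, so no deep traversal
```

Because the condition name alone does not trigger Pattern 2, the query never reaches the deep traversal path even though it names a condition.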

Speed: Unnecessary Latency

  • Graphiti semantic search (since removed in ADR-0029) always ran, even when typed node queries had already returned high-quality structured results, adding 200-400 ms of latency
  • resolve_search_query() re-computed string matching against all taxonomy maps on every call
  • Intent classification used a heavier model than the task requires; a lightweight model suffices
  • Intent classification and user graph preference check ran sequentially instead of in parallel

Decision

Implement 7 improvements across quality (3) and speed (4):

Quality Improvements

  1. Pattern 1b: Condition-aware doctor queries — After Pattern 1 (doctor keywords) finds no results, check if the query also contains a condition name. If so, delegate to deep traversal (query_doctor_full_path).

  2. DEPT_CONDITION_MAP fallback — When deep traversal returns empty results (condition not in Neo4j as a node), fall back to DEPT_CONDITION_MAP to find the handling department, then query doctors in that department.

  3. Backfill missing Condition nodes — Script (backfill_condition_nodes.py) to create Condition nodes and HANDLES relationships from DEPT_CONDITION_MAP entries not yet in Neo4j.
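The control flow of items 1 and 2 can be sketched as follows. This is a minimal illustration, not the code in query_service.py: the shape of DEPT_CONDITION_MAP (department mapped to a list of conditions), the in-memory CONDITION_NODES and DOCTORS_BY_DEPT stand-ins, the helper names find_condition, department_for and condition_aware_doctor_search, and the body of query_doctor_full_path are all assumptions made so the example runs on its own.

```python
from typing import Optional

# Hypothetical data; the real map contents and graph queries are not shown in this ADR.
DEPT_CONDITION_MAP = {"dermatologie": ["psoriasis", "eczeem"]}  # assumed shape
CONDITION_NODES = set()  # conditions that exist as graph nodes (none backfilled yet)
DOCTORS_BY_DEPT = {"dermatologie": ["Dr. Jansen"]}


def find_condition(query: str) -> Optional[str]:
    """Return the first known condition name mentioned in the query."""
    lowered = query.lower()
    for conditions in DEPT_CONDITION_MAP.values():
        for condition in conditions:
            if condition in lowered:
                return condition
    return None


def department_for(condition: str) -> Optional[str]:
    """Look up the handling department in DEPT_CONDITION_MAP."""
    for department, conditions in DEPT_CONDITION_MAP.items():
        if condition in conditions:
            return department
    return None


def query_doctor_full_path(condition: str) -> list:
    """Stand-in for deep traversal: empty unless a Condition node exists."""
    if condition not in CONDITION_NODES:
        return []
    return DOCTORS_BY_DEPT.get(department_for(condition), [])


def condition_aware_doctor_search(query: str, pattern_1_results: list) -> list:
    if pattern_1_results:
        return pattern_1_results  # Pattern 1 succeeded, nothing to do
    condition = find_condition(query)
    if condition is None:
        return []
    results = query_doctor_full_path(condition)  # Pattern 1b: deep traversal
    if results:
        return results
    department = department_for(condition)  # DEPT_CONDITION_MAP fallback
    return DOCTORS_BY_DEPT.get(department, []) if department else []


print(condition_aware_doctor_search("welke arts bij psoriasis?", []))
```

In this sketch the fallback and the deep traversal return the same doctors once the Condition node exists, which matches the intent of item 3: the backfill makes the fallback path unnecessary for conditions it covers.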

Speed Improvements

  1. Skip Tier 2 when typed results sufficient — Early exit in search_graph() when Tier 1 returns >= 2 results, skipping semantic search (~200-400ms saved).

  2. Cache resolve_search_query() with LRU — Wrap with @lru_cache(maxsize=512) since taxonomy maps are static.

  3. Parallel intent + graph preference — Run intent classification (LLM call) and user graph preference check (DB query) in parallel using asyncio.create_task().

  4. Faster model for intent classification — Use lightweight model for the classification task.
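Items 1 and 3 can be illustrated together with a short asyncio sketch. Only search_graph() and asyncio.create_task() come from this ADR; classify_intent, graph_preference_enabled, typed_node_search, semantic_search and handle_query are hypothetical stand-ins, and the sleeps merely simulate I/O latency.

```python
import asyncio

MIN_TYPED_RESULTS = 2  # early-exit threshold stated in item 1


async def classify_intent(query: str) -> str:
    """Hypothetical stand-in for the LLM intent classification call."""
    await asyncio.sleep(0.05)
    return "doctor_lookup"


async def graph_preference_enabled(user_id: str) -> bool:
    """Hypothetical stand-in for the user graph preference DB query."""
    await asyncio.sleep(0.05)
    return True


async def typed_node_search(query: str) -> list:
    """Hypothetical Tier 1 search returning structured results."""
    return ["Dr. Jansen", "Dr. de Vries"]


async def semantic_search(query: str) -> list:
    """Hypothetical Tier 2 search; the slow path the early exit avoids."""
    await asyncio.sleep(0.3)
    return ["semantic hit"]


async def search_graph(query: str) -> list:
    typed = await typed_node_search(query)
    if len(typed) >= MIN_TYPED_RESULTS:
        return typed  # early exit: Tier 2 is skipped
    semantic = await semantic_search(query)
    return typed + semantic


async def handle_query(user_id: str, query: str) -> tuple:
    # Start both independent calls before awaiting either, so they overlap.
    intent_task = asyncio.create_task(classify_intent(query))
    pref_task = asyncio.create_task(graph_preference_enabled(user_id))
    intent = await intent_task
    use_graph = await pref_task
    results = await search_graph(query) if use_graph else []
    return intent, results


if __name__ == "__main__":
    print(asyncio.run(handle_query("user-1", "welke arts bij psoriasis?")))
```

With both tasks created up front, the combined wait is roughly the slower of the two calls rather than their sum. asyncio.gather() would be an equivalent way to express the same overlap.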

Consequences

Positive

  • Condition-based doctor queries now return structured graph results
  • DEPT_CONDITION_MAP fallback provides useful results while the graph is still being populated
  • ~200-400ms latency reduction from Tier 2 skip
  • Deterministic taxonomy lookups cached
  • Lower LLM cost for intent classification

Negative

  • The LRU cache means taxonomy changes only take effect after a process restart
  • The Tier 2 skip means semantic graph results are unavailable when typed nodes return >= 2 results
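The caching trade-off can be seen in a small sketch. The TAXONOMY map and the body of resolve_search_query() here are hypothetical; only the @lru_cache(maxsize=512) decorator comes from this ADR. The cache_clear() method is part of the standard functools.lru_cache wrapper, but whether the service exposes any such invalidation hook is not stated here, so the documented behavior remains a process restart.

```python
from functools import lru_cache

# Hypothetical taxonomy map; the real maps are not shown in this ADR.
TAXONOMY = {"huidarts": "dermatoloog"}


@lru_cache(maxsize=512)
def resolve_search_query(query: str) -> str:
    """Hypothetical body: replace known terms using the taxonomy map."""
    return " ".join(TAXONOMY.get(word, word) for word in query.lower().split())


first = resolve_search_query("huidarts psoriasis")

# The taxonomy changes while the process keeps running.
TAXONOMY["huidarts"] = "dermatologie"

# Same argument, so the cached (now stale) value is returned.
stale = resolve_search_query("huidarts psoriasis")

# Clearing the cache has the same effect on lookups as a restart.
resolve_search_query.cache_clear()
fresh = resolve_search_query("huidarts psoriasis")

print(first, "|", stale, "|", fresh)
```

The cache key is the query string only, so changes to the underlying map are invisible to cached entries; this is why the decision relies on the taxonomy maps being static.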