Graph-Enhanced RAG

Graph-Enhanced RAG extends the canonical Retrieval-Augmented Generation pattern (Lewis et al., 2020) by integrating knowledge graph queries into the retrieval phase. This integration, formalised as HybridRAG by Sarmah et al. (2024), lets the system answer questions that require both structured entity knowledge and unstructured textual context, a combination that neither vector search nor graph queries can provide on their own. Peng et al. (2025) provide a comprehensive survey of Graph-Based Indexing, Graph-Guided Retrieval, and Graph-Enhanced Generation approaches.

The Integration Architecture

Taxonomy Entity Queries

For entity-specific queries (doctor lookups, department information), the system generates SQL queries against the PostgreSQL taxonomy tables. These queries are deterministic and fast (roughly 10-50 ms). The subsections below describe the traversal for each entity type.

Doctor Queries

When a doctor name is detected in the query, the system:

Fuzzy-matches the name against Doctor nodes (SequenceMatcher similarity)
Traverses WORKS_IN edges to find departments (with optional schedule properties)
Derives campus presence from department LOCATED_AT relationships
Reads specialty from the Doctor node's specialty property

The result is a structured profile: "Dr. Van den Berg is orthopedisch chirurg in de afdeling Orthopedie. Hij consulteert op campus Sint-Jan (maandag, woensdag) en campus André Dumont (dinsdag)." Consultation schedule data (days, status, contacts) is stored as properties on the WORKS_IN relationship.
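The fuzzy-matching step can be sketched with Python's standard difflib.SequenceMatcher. This is a minimal illustration, not the production matcher: the candidate list, the title-stripping rule, and the 0.8 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for the names stored on Doctor nodes.
DOCTORS = ["Van den Berg", "Peeters", "Janssens"]


def normalize(name: str) -> str:
    """Lowercase the name and drop a leading 'Dr.' title (assumed convention)."""
    name = name.lower().strip()
    for prefix in ("dr. ", "dr "):
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    return name


def match_doctor(query_name: str, candidates=DOCTORS, threshold: float = 0.8):
    """Return (candidate, similarity) for the best match, or None below threshold."""
    if not candidates:
        return None
    target = normalize(query_name)
    scored = [
        (cand, SequenceMatcher(None, target, normalize(cand)).ratio())
        for cand in candidates
    ]
    best = max(scored, key=lambda pair: pair[1])
    return best if best[1] >= threshold else None


# A misspelled name still resolves to the intended doctor.
print(match_doctor("Dr. Van den Burg"))
```

A threshold keeps unrelated names from resolving to an arbitrary doctor; the right value depends on how noisy the incoming names are and would need tuning on real queries.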

Department Queries

When a department is identified:

Match the Department node
Traverse LOCATED_AT edges for campus information
Traverse inverse WORKS_IN edges for associated doctors
Traverse HANDLES edges for conditions handled
Traverse OFFERS edges for available treatments
Traverse PERFORMS edges for examinations performed
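The department traversal above can be sketched as a handful of parameterised SQL lookups. The example uses an in-memory SQLite database as a stand-in for the PostgreSQL taxonomy tables; the single `edges` table, its column names, and the sample rows are assumptions for illustration and not the actual schema.

```python
import sqlite3

# In-memory stand-in for the taxonomy tables (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE edges (src TEXT, rel TEXT, dst TEXT);
    INSERT INTO edges VALUES
      ('Cardiologie', 'LOCATED_AT', 'Sint-Jan'),
      ('Dr. Peeters', 'WORKS_IN',   'Cardiologie'),
      ('Cardiologie', 'HANDLES',    'Hartfalen'),
      ('Cardiologie', 'OFFERS',     'Hartrevalidatie'),
      ('Cardiologie', 'PERFORMS',   'Echocardiografie');
    """
)


def outgoing(node: str, rel: str) -> list:
    """Follow edges of type `rel` that start at `node`."""
    rows = conn.execute(
        "SELECT dst FROM edges WHERE src = ? AND rel = ? ORDER BY dst", (node, rel)
    )
    return [row[0] for row in rows]


def incoming(node: str, rel: str) -> list:
    """Follow edges of type `rel` that end at `node` (inverse traversal)."""
    rows = conn.execute(
        "SELECT src FROM edges WHERE dst = ? AND rel = ? ORDER BY src", (node, rel)
    )
    return [row[0] for row in rows]


def department_profile(department: str) -> dict:
    """Collect the five facets listed above for one Department node."""
    return {
        "campuses": outgoing(department, "LOCATED_AT"),
        "doctors": incoming(department, "WORKS_IN"),
        "conditions": outgoing(department, "HANDLES"),
        "treatments": outgoing(department, "OFFERS"),
        "examinations": outgoing(department, "PERFORMS"),
    }


print(department_profile("Cardiologie"))
```

Because every lookup is a fixed, parameterised query, the same input always yields the same result, which is what makes this tier deterministic.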

Condition Queries

When a condition is identified (e.g., "Welke onderzoeken voor hartfalen?"):

Match the Condition node via taxonomy alias resolution
Traverse inverse HANDLES edges to find departments that handle this condition
Traverse DIAGNOSES edges to find examinations used to diagnose this condition
Traverse inverse TREATS edges to find treatments for this condition
Expand into associated doctors and campus locations

The DIAGNOSES and TREATS relationships are populated by the Medical Knowledge Architecture: universal medical knowledge generated by LLM classification and merged with hospital-specific hub page data.
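A minimal sketch of the condition traversal, including alias resolution and the expansion into doctors and campuses, might look as follows. The alias table, the edge list, and the sample entities are hypothetical. Edge directions follow the wording of the steps above (DIAGNOSES is read outward from the condition, HANDLES and TREATS inversely); how the edges are actually stored is an assumption.

```python
# Hypothetical alias table: lowercase mention -> canonical Condition name.
ALIASES = {"hartfalen": "Hartfalen", "hartproblemen": "Hartfalen"}

# Hypothetical edge list: (source, relationship, target).
EDGES = [
    ("Cardiologie", "HANDLES", "Hartfalen"),
    ("Hartfalen", "DIAGNOSES", "Echocardiografie"),
    ("Hartrevalidatie", "TREATS", "Hartfalen"),
    ("Dr. Peeters", "WORKS_IN", "Cardiologie"),
    ("Cardiologie", "LOCATED_AT", "Sint-Jan"),
]


def out(node: str, rel: str) -> list:
    """Targets reached by following `rel` edges from `node`."""
    return sorted(d for s, r, d in EDGES if s == node and r == rel)


def inv(node: str, rel: str) -> list:
    """Sources of `rel` edges that point at `node` (inverse traversal)."""
    return sorted(s for s, r, d in EDGES if d == node and r == rel)


def condition_profile(mention: str):
    """Resolve a mention to a Condition, then run the traversal steps."""
    condition = ALIASES.get(mention.lower().strip())
    if condition is None:
        return None  # no taxonomy match, so no graph context for this mention
    departments = inv(condition, "HANDLES")
    return {
        "condition": condition,
        "departments": departments,
        "examinations": out(condition, "DIAGNOSES"),
        "treatments": inv(condition, "TREATS"),
        # Expansion step: doctors and campuses of the handling departments.
        "doctors": sorted({doc for d in departments for doc in inv(d, "WORKS_IN")}),
        "campuses": sorted({c for d in departments for c in out(d, "LOCATED_AT")}),
    }


print(condition_profile("hartproblemen"))
```

The treatment and campus traversals follow the same pattern with a different starting node and a different set of relationships.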

Treatment Queries

When a treatment is identified:

Match the Treatment node via taxonomy alias resolution
Traverse inverse OFFERS edges to find departments that offer this treatment
Traverse TREATS edges to find conditions this treatment addresses
Traverse DIAGNOSES edges to find diagnostic examinations for those conditions
Expand into doctors and campus locations

Campus Queries

Campus-specific queries traverse from the Campus node outward:

Match the Campus node (one of four: Sint-Jan, André Dumont, Sint-Barbara, Maas en Kempen)
Traverse inverse LOCATED_AT edges for departments at this campus
Traverse inverse WORKS_AT_CAMPUS edges for doctors present at this campus

LLM Entity Extraction for Graph Routing

When a user query uses colloquial or indirect language rather than naming entities explicitly, the LLM intent classifier extracts structured medical entities alongside the intent classification. This eliminates the need for a separate semantic search tier.

For example, when a patient asks about "hartproblemen" without mentioning "cardiologie" or any specific doctor, the LLM extracts "hartproblemen" as a condition, the taxonomy resolves it to "Hartfalen", and a HANDLES relationship query identifies Cardiologie as the department that handles this condition. The system then uses SQL queries against the taxonomy tables to expand into doctors, treatments, and campus locations. See ADR-0030 for the rationale.
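The hand-off from classifier output to graph routing can be sketched as below. The shape of the classifier result (an intent label plus a dictionary of entity mentions), the intent name, and the alias table are assumptions; the sketch does not call an LLM and only shows how extracted mentions might be resolved into canonical graph lookups.

```python
from dataclasses import dataclass, field


@dataclass
class Classification:
    """Hypothetical classifier output: an intent plus extracted entity mentions."""
    intent: str
    entities: dict = field(default_factory=dict)  # e.g. {"condition": ["hartproblemen"]}


# Hypothetical taxonomy aliases per entity type (lowercase mention -> canonical name).
TAXONOMY_ALIASES = {
    "condition": {"hartproblemen": "Hartfalen", "hartfalen": "Hartfalen"},
}


def resolve_entities(result: Classification) -> list:
    """Map raw mentions to (entity_type, canonical_name) pairs; drop unknown mentions."""
    resolved = []
    for entity_type, mentions in result.entities.items():
        aliases = TAXONOMY_ALIASES.get(entity_type, {})
        for mention in mentions:
            canonical = aliases.get(mention.lower().strip())
            if canonical is not None:
                resolved.append((entity_type, canonical))
    return resolved


# The mention "hartproblemen" resolves to the canonical condition "Hartfalen".
result = Classification(intent="find_care", entities={"condition": ["hartproblemen"]})
print(resolve_entities(result))
```

Each resolved pair can then be passed to the matching taxonomy query (condition, treatment, department, and so on). Mentions that do not resolve are simply dropped from graph routing.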

Result Merging Strategy

When both graph and vector results are available (HYBRID mode), the merger follows a priority-based approach:

Merge Rules

Graph results are placed first in the merged result set to prioritize structured entity data (no numeric score boost is applied)
Deduplication by source: If both sources reference the same page, keep the higher-scored entry and merge metadata
Graph entities are formatted as context: Doctor profiles, department descriptions, and consultation schedules are rendered as natural language for the LLM's context window
Vector chunks provide supporting detail: Textual content from brochures and web pages supplements the structured entity data
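The ordering and deduplication rules can be sketched as follows. The `Result` structure and its field names are hypothetical. The sketch also assumes that a deduplicated entry keeps the origin of whichever entry had the higher score, which the rules above do not specify.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Result:
    """Hypothetical retrieval result shared by graph and vector retrievers."""
    source_url: str
    text: str
    score: float
    origin: str  # "graph" or "vector"
    metadata: dict = field(default_factory=dict)


def merge_results(graph: list, vector: list) -> list:
    """Deduplicate by source page, then place graph results first (no score boost)."""
    best = {}
    for item in [*graph, *vector]:
        kept = best.get(item.source_url)
        if kept is None:
            best[item.source_url] = item
            continue
        # Keep the higher-scored entry; on a tie, the earlier entry is kept.
        winner, loser = (item, kept) if item.score > kept.score else (kept, item)
        merged_meta = {**loser.metadata, **winner.metadata}
        best[item.source_url] = replace(winner, metadata=merged_meta)
    # Graph entries come first; within each group, higher scores come first.
    return sorted(best.values(), key=lambda r: (r.origin != "graph", -r.score))


graph_hits = [
    Result("https://example.org/doctor", "Doctor profile", 0.4, "graph", {"entity": "Doctor"}),
    Result("https://example.org/dept", "Department info", 0.3, "graph"),
]
vector_hits = [
    Result("https://example.org/doctor", "Brochure chunk", 0.2, "vector", {"chunk_id": 7}),
    Result("https://example.org/brochure", "Brochure text", 0.9, "vector"),
]
for r in merge_results(graph_hits, vector_hits):
    print(r.origin, r.source_url, r.metadata)
```

Note that the vector result with score 0.9 still lands after both graph results: priority comes from ordering rather than from adjusting scores.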

Campus-Specific Data

The four ZOL campuses have distinct service profiles, making campus-aware retrieval essential:

Campus	Key Services	Notable Departments
Sint-Jan (Genk)	Main campus, most specialties	Emergency, Surgery, Cardiology
André Dumont (Genk)	Specialized care	Oncology, Rehabilitation
Sint-Barbara (Lanaken)	Regional care	General Medicine, Geriatrics
Maas en Kempen (Maaseik)	Regional care	Outpatient Clinics

Consultation schedules are stored as properties on the Doctor -> WORKS_IN -> Department relationship (schedule, status, contacts), enabling queries like "When does Dr. Peeters consult at Sint-Jan?" to be answered with precise schedule information derived from the department's campus location.
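A schedule lookup of this kind can be sketched as below. The record layout (a list of weekdays in `schedule`, plus `status` and `contacts`), the sample values, and the department-to-campus mapping are all hypothetical; only the idea that the properties live on the WORKS_IN relationship and that the campus is derived via the department comes from the text above.

```python
# Hypothetical WORKS_IN relationships with schedule properties on the edge.
WORKS_IN = [
    {
        "doctor": "Dr. Peeters",
        "department": "Cardiologie",
        "schedule": ["maandag", "woensdag"],  # sample values only
        "status": "active",
        "contacts": None,
    },
]

# Hypothetical Department -> Campus mapping derived from LOCATED_AT edges.
LOCATED_AT = {"Cardiologie": ["Sint-Jan"]}


def consultation_days(doctor: str, campus: str) -> list:
    """Days on which `doctor` consults in a department located at `campus`."""
    days = []
    for rel in WORKS_IN:
        at_campus = campus in LOCATED_AT.get(rel["department"], [])
        if rel["doctor"] == doctor and at_campus:
            days.extend(rel["schedule"])
    return days


print(consultation_days("Dr. Peeters", "Sint-Jan"))
```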

Context Augmentation

The final step before response generation is context augmentation, which formats the merged results into a structured prompt for the LLM:

Entity section: Structured data from graph queries (doctor profiles, department info)
Content section: Relevant text chunks from vector search
Source section: URLs for citation in the response
Instructions: Grounding rules, safety constraints, language requirements (Dutch)

This structured context lets the Tier 2 (standard) or Tier 3 (full mode) model combine structured entity information from the graph with explanatory content from vector search, so that responses are both precise and informative.
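The four sections can be assembled with a small helper like the one below. The section labels, the bullet formatting, and the sample strings are assumptions for illustration; the real prompt template is not shown in this document.

```python
def build_context(entities: list, chunks: list, sources: list, instructions: list) -> str:
    """Join the four sections into one prompt string, skipping empty sections."""
    sections = [
        ("ENTITIES", entities),          # structured data from graph queries
        ("CONTENT", chunks),             # text chunks from vector search
        ("SOURCES", sources),            # URLs for citation
        ("INSTRUCTIONS", instructions),  # grounding, safety, language rules
    ]
    parts = []
    for title, lines in sections:
        if lines:
            body = "\n".join(f"- {line}" for line in lines)
            parts.append(f"[{title}]\n{body}")
    return "\n\n".join(parts)


prompt = build_context(
    entities=["Dr. Peeters, afdeling Cardiologie, campus Sint-Jan"],
    chunks=["Voorbeeldtekst uit een brochure."],
    sources=["https://example.org/cardiologie"],
    instructions=["Antwoord in het Nederlands.", "Gebruik alleen de gegeven context."],
)
print(prompt)
```

Keeping the sections in a fixed order makes the prompt predictable for the model and makes it easy to check which retrieved items were actually supplied.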

References

Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020.
Peng, B., et al. (2025). Retrieval-Augmented Generation with Graphs (GraphRAG). Comprehensive survey of graph-enhanced RAG approaches.
Sarmah, B., et al. (2024). HybridRAG: Integrating Knowledge Graphs and Vector Retrieval. Formalises the hybrid KG+vector approach.