Historical Notes

Why this page exists

The rest of the documentation describes the system as it runs today, in the present tense. This page is the single home for superseded design choices (components that were removed and problems that were resolved), which used to be scattered across the technical pages as inline "Historical" asides. Each note records what changed and why, and links to the authoritative Architecture Decision Record and to the current page that replaced the old design.

If you are studying how the system works now, you do not need this page. If you are studying how it got here (or evaluating the engineering judgement behind the changes), this is the trail.

Graphiti semantic search (removed)

What it was: An intermediate retrieval tier, built on Graphiti's episodic-memory layer, that sat between the typed-node taxonomy queries and the pgvector fallback. It matched query embeddings against graph-node embeddings to handle indirect questions (e.g. the Dutch query "ik heb hartproblemen", "I have heart problems" → Cardiologie).

Why it was removed: LLM entity extraction (ADR-0030) combined with taxonomy-driven alias resolution now covers the same indirect-query use cases more reliably and with fewer moving parts. The dedicated semantic tier became redundant.

Decision: ADR-0029 — Remove Graphiti. Current design: the two-tier strategy (typed taxonomy query → vector fallback) in Knowledge Graph Overview and Graph-Enhanced RAG.
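For orientation, the two-tier control flow that replaced the three-tier one might look roughly like the sketch below. It is not the project's code: `two_tier_retrieve`, `toy_taxonomy`, `toy_vector`, and the alias table are hypothetical names invented for illustration, and the toy alias lookup merely stands in for LLM entity extraction plus taxonomy-driven alias resolution.

```python
from typing import Callable, Sequence

# A retriever takes the user query and returns matching entity names or passages.
Retriever = Callable[[str], Sequence[str]]


def two_tier_retrieve(
    query: str,
    taxonomy_lookup: Retriever,
    vector_fallback: Retriever,
) -> tuple[str, list[str]]:
    """Try the typed taxonomy query first; use vector search only if it finds nothing."""
    hits = list(taxonomy_lookup(query))
    if hits:
        return "taxonomy", hits
    return "vector", list(vector_fallback(query))


# Hypothetical stand-ins so the sketch runs without a database or an LLM.
ALIASES = {"hartproblemen": "Cardiologie"}  # assumed alias table, illustration only


def toy_taxonomy(query: str) -> list[str]:
    # Stand-in for entity extraction + alias resolution against the taxonomy.
    return [entity for alias, entity in ALIASES.items() if alias in query.lower()]


def toy_vector(query: str) -> list[str]:
    # Stand-in for the pgvector similarity search.
    return [f"nearest passage for: {query}"]


tier, results = two_tier_retrieve("ik heb hartproblemen", toy_taxonomy, toy_vector)
```

With the toy data, the indirect query resolves in the first tier, so the vector fallback is never called. A query with no alias match would fall through to `toy_vector`.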

TypedNodeStorage / Neo4j (removed)

What it was: The original pipeline stored entities as typed Neo4j nodes (Doctor, Department, Condition, …) with Cypher-defined relationships, written through a TypedNodeStorage layer.

Why it was removed: Consolidating onto PostgreSQL removed a separate database to deploy, monitor, and back up; it co-locates entities, relationships, vectors (pgvector), and SNOMED tables in one store with familiar SQL and relational integrity.

Decision: ADR-0053 — Neo4j removal / pgvector consolidation. Current design: the PostgreSQL taxonomy_entities / taxonomy_relationships tables, with entity storage and versioned publishing handled by the SP-4 Entity Resolution Pipeline and SP-5 Draft/Publish System. See the Seeding Pipeline.

Reasoning-model token budget (resolved)

What it was: A previous reasoning model consumed hidden "thinking" tokens, which forced max_tokens=2000 on classification and validation calls to leave room for the visible output.

Why it no longer applies: The current Tier 2 model is not a reasoning model, so standard max_tokens values (250–500) work correctly. The inflated budget — and the cost it carried — is gone.

Decision / investigation: ADR-0013 — Reasoning-model token budget. Current design: the LLM Stack tier table and the Query Pipeline intent classifier.

Title keyword-match boost (removed)

What it was: A post-retrieval re-ranking boost (+15%) applied when a document's title matched query keywords.
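To make the removed mechanism concrete, here is a minimal sketch of a post-retrieval title boost. Only the +15% figure comes from this page. The `Hit` type, the function name, the multiplicative form of the boost, and the "any shared whitespace-separated token" matching rule are all assumptions made for illustration, not a record of the original implementation.

```python
from dataclasses import dataclass, replace

TITLE_BOOST = 1.15  # the +15% from this page, assumed here to be multiplicative


@dataclass(frozen=True)
class Hit:
    """Hypothetical retrieval result: a document title and its relevance score."""
    title: str
    score: float


def boost_title_matches(hits: list[Hit], query: str, factor: float = TITLE_BOOST) -> list[Hit]:
    """Re-rank hits, scaling the score of any hit whose title shares a query keyword."""
    keywords = set(query.lower().split())
    boosted = []
    for hit in hits:
        title_words = set(hit.title.lower().split())
        if keywords & title_words:  # assumed rule: at least one token in common
            hit = replace(hit, score=hit.score * factor)
        boosted.append(hit)
    # Highest score first, as a re-ranker would return them.
    return sorted(boosted, key=lambda h: h.score, reverse=True)


ranked = boost_title_matches(
    [Hit("Parking information", 0.80), Hit("Cardiology opening hours", 0.75)],
    "cardiology hours",
)
```

In this toy run the second document overtakes the first only because of its title, which shows the post-hoc nature of the adjustment: the boost changes the ordering after retrieval rather than influencing which documents are retrieved.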

Why it was removed: BM25 tsvector search (ADR-0007) performs keyword matching at the retrieval level, which is more principled and effective than a post-hoc title boost — so the boost became redundant and was dropped.

Current design: keyword matching is part of hybrid search; the authority-weighting that remains in the Query Pipeline re-ranker is a separate, still-active mechanism.


For the full chronological record of changes, see Release Notes. For the rationale behind every significant decision, see Architecture Decisions.