Second half completed by ADR-0053 (2026-03-07, documented 2026-05-09)

This decision record removed the Graphiti library but explicitly kept the Neo4j database. Neo4j was subsequently removed entirely; see ADR-0053, the master record (not yet ported to Docusaurus). The body below is preserved as the historical decision record. ADR-0053 reversed the argument in the "Why Not Store Embeddings in Neo4j?" section, so read both for the full lineage.

ADR-0029: Remove Graphiti — Direct Neo4j Driver

Note: Neo4j was fully removed in March 2026 and replaced by PostgreSQL taxonomy tables. This ADR describes the intermediate step of replacing Graphiti with direct Neo4j access.

Date: 2026-02-13 | Status: Accepted | Supersedes: ADR-006

Context

We adopted Graphiti, a graph RAG library, early in development to manage the knowledge graph (Hogan et al., 2021). Over time, we replaced Graphiti's entity extraction with our own deterministic pipeline (regex + typed nodes + frozen taxonomy). The only remaining uses were:

Neo4j driver wrapper — all services accessed Neo4j through graphiti._graphiti.driver
Tier 2 semantic search — a fallback path requiring OpenAI embeddings

Problems

Embedding model mismatch: Graphiti hardcodes OpenAI text-embedding-3-small (1536-dim), while our pipeline uses Ollama BGE-M3 (1024-dim). The two embedding spaces are incompatible, so vectors from one model cannot be compared with vectors from the other.
Unnecessary OpenAI dependency: Graphiti requires an OpenAI API key even though our embedding pipeline runs locally via Ollama (free).
Dead code path: With graph_use_medical_only=True (default since ADR-014), Graphiti's add_episode() is never called. With golden-only mode (ADR-0028), scope narrows further.
Tier 2 search adds no value: Typed node queries + taxonomy alias resolution cover all structured queries. Tier 2 was reached rarely and returned nothing useful.
Abstraction leak: Services accessed Neo4j through graphiti._graphiti.driver, reaching two levels deep into Graphiti's private internals.
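The embedding mismatch in the first problem can be illustrated with a minimal sketch. The helper below is hypothetical and not part of the codebase; it only shows that a similarity score between a 1536-dim and a 1024-dim vector is undefined. Matching dimensions would not be sufficient on their own either, since vectors from different models live in unrelated spaces.

```python
import math
from typing import Sequence

OPENAI_DIM = 1536  # text-embedding-3-small, hardcoded by Graphiti
BGE_M3_DIM = 1024  # BGE-M3 served locally through Ollama


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; refuses vectors of different dimensionality."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def can_compare(dim_a: int, dim_b: int) -> bool:
    """Necessary (not sufficient) condition for comparing two embeddings."""
    return dim_a == dim_b
```

Calling `cosine_similarity` with one vector of each size raises `ValueError`, which is the failure any shared index would hit if both models wrote into it.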

Decision

Remove the Graphiti library dependency entirely. Replace it with a direct neo4j.AsyncDriver managed by a simplified Neo4jService class.
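A minimal sketch of what such a service could look like follows. The class name and the `driver` attribute come from this record; the constructor arguments and the `connect`, `run`, and `close` methods are assumptions made for illustration, not the actual implementation. It uses the official `neo4j` Python driver's async API.

```python
from typing import Any, Optional


class Neo4jService:
    """Illustrative sketch: owns one neo4j.AsyncDriver, exposed at a single level."""

    def __init__(self, uri: str, user: str, password: str) -> None:
        self._uri = uri
        self._auth = (user, password)
        self._driver: Optional[Any] = None  # a neo4j.AsyncDriver once connected

    async def connect(self) -> None:
        # Imported lazily so this module loads even where the driver is absent.
        from neo4j import AsyncGraphDatabase

        self._driver = AsyncGraphDatabase.driver(self._uri, auth=self._auth)
        await self._driver.verify_connectivity()

    @property
    def driver(self) -> Any:
        """Direct driver access, replacing graphiti._graphiti.driver."""
        if self._driver is None:
            raise RuntimeError("Neo4jService is not connected")
        return self._driver

    async def run(self, cypher: str, **params: Any) -> list[dict[str, Any]]:
        """Run one Cypher statement and return its rows as plain dicts."""
        async with self.driver.session() as session:
            result = await session.run(cypher, params)
            return [record.data() async for record in result]

    async def close(self) -> None:
        if self._driver is not None:
            await self._driver.close()
            self._driver = None
```

With this shape, callers reach the driver through `neo4j_service.driver` and no OpenAI embedder or API key is involved in initialization.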

Separation of Concerns

| Capability | PostgreSQL + pgvector | Neo4j |
| --- | --- | --- |
| Vector similarity search | Mature HNSW, production-proven | Newer, less optimized |
| BM25 keyword search | Native `tsvector` + full-text indexes | Not supported |
| Metadata filtering | Natural SQL | Requires complex Cypher |
| Relationship traversal | Expensive JOINs | Native, O(1) per hop |
| Path queries | Impractical beyond 2-3 JOINs | Core strength |
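The traversal rows can be made concrete with a small sketch that builds the same fixed-length path query for both stores. The `Concept` label, `RELATED_TO` relationship, and `concept_edges` table are hypothetical names chosen for illustration. The SQL version needs one additional self-join per hop, while the Cypher pattern only changes a number.

```python
def _check_hops(hops: int) -> None:
    # Hop counts are interpolated into the query text, so validate strictly.
    if not isinstance(hops, int) or isinstance(hops, bool) or hops < 1:
        raise ValueError("hops must be a positive integer")


def cypher_n_hop(hops: int) -> str:
    """Cypher for nodes exactly `hops` relationships away from a start node."""
    _check_hops(hops)
    return (
        f"MATCH (a:Concept {{id: $start_id}})-[:RELATED_TO*{hops}]->(b:Concept) "
        "RETURN DISTINCT b.id AS id"
    )


def sql_n_hop(hops: int) -> str:
    """Equivalent SQL over an edge table: one extra self-join per hop."""
    _check_hops(hops)
    joins = "".join(
        f" JOIN concept_edges e{i} ON e{i}.source_id = e{i - 1}.target_id"
        for i in range(2, hops + 1)
    )
    return (
        f"SELECT DISTINCT e{hops}.target_id AS id FROM concept_edges e1{joins}"
        " WHERE e1.source_id = %(start_id)s"
    )
```

For three hops, `sql_n_hop(3)` already contains two self-joins on the edge table, whereas `cypher_n_hop(3)` differs from the one-hop query only in the `*3` quantifier.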

What Changed

| Before | After |
| --- | --- |
| `GraphitiService` wraps Graphiti library | `Neo4jService` wraps `neo4j.AsyncDriver` directly |
| `graphiti._graphiti.driver` for Neo4j | `neo4j_service.driver` (single level) |
| OpenAI embedder in Graphiti init | Removed: no embeddings in Neo4j |
| Tier 2 semantic search | Removed: typed queries cover all cases |
| `graphiti.add_episode()` | Removed (dead code) |

Consequences

Positive

Removes OpenAI API key requirement for graph features
Eliminates embedding model mismatch
Simplifies initialization
Removes ~200 lines of dead code
Direct Neo4j driver access
One less library to maintain

Negative

Loses the option of Graphiti's LLM-based entity extraction (acceptable because our regex + taxonomy pipeline is more reliable)
Loses the option of semantic search over graph nodes (acceptable because typed node Cypher + taxonomy cover all cases)
