Second half completed by ADR-0053 (2026-03-07, documented 2026-05-09)

This decision record removed Graphiti the library but explicitly kept Neo4j the database. Neo4j was subsequently removed entirely; see ADR-0053 (the master record, not yet ported to Docusaurus). The body below is preserved verbatim as the historical decision record. The argument in the "Why Not Store Embeddings in Neo4j?" section was reversed by ADR-0053, so read both records for the full lineage.

ADR-0029: Remove Graphiti — Direct Neo4j Driver

Note: Neo4j was fully removed in March 2026 and replaced by PostgreSQL taxonomy tables. This ADR describes the intermediate step of replacing Graphiti with direct Neo4j access.

Date: 2026-02-13 | Status: Accepted | Supersedes: ADR-006

Context

We adopted Graphiti, a graph RAG library, early in development to manage the knowledge graph (Hogan et al., 2021). Over time, we replaced Graphiti's entity extraction with our own deterministic pipeline (regex + typed nodes + frozen taxonomy). The only remaining uses were:

  • Neo4j driver wrapper — all services accessed Neo4j through graphiti._graphiti.driver
  • Tier 2 semantic search — a fallback path requiring OpenAI embeddings

Problems

  1. Embedding model mismatch: Graphiti hardcodes OpenAI text-embedding-3-small (1536-dim). Our pipeline uses Ollama BGE-M3 (1024-dim). Incompatible embedding spaces.

  2. Unnecessary OpenAI dependency: Graphiti requires an OpenAI API key even though our embedding pipeline runs locally via Ollama (free).

  3. Dead code path: With graph_use_medical_only=True (default since ADR-014), Graphiti's add_episode() is never called. With golden-only mode (ADR-0028), scope narrows further.

  4. Tier 2 search adds no value: Typed node queries + taxonomy alias resolution cover all structured queries. Tier 2 was reached rarely and returned nothing useful.

  5. Abstraction leak: Services accessed Neo4j through graphiti._graphiti.driver, two levels deep into internals.
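
To make problem 1 concrete, here is a minimal sketch of a dimension guard around cosine similarity. The function name and constants are illustrative and not taken from the codebase; only the two dimensions come from the text above. Note that equal dimensions would not make two models' embedding spaces comparable either; the guard only catches the most obvious failure.

```python
import math
from typing import Sequence

# Dimensions as stated in this ADR.
GRAPHITI_DIM = 1536  # OpenAI text-embedding-3-small (hardcoded by Graphiti)
PIPELINE_DIM = 1024  # Ollama BGE-M3 (used by our pipeline)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity that refuses vectors of different dimensionality."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

Comparing a 1536-dimensional vector against a 1024-dimensional one raises `ValueError` here instead of silently producing a meaningless score.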

Decision

Remove the Graphiti library dependency entirely. Replace with a direct neo4j.AsyncDriver managed by a simplified Neo4jService class.
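
As a rough illustration of the shape this decision implies, the sketch below shows a service that owns a single driver and exposes it one level deep. It is not the actual Neo4jService: the constructor arguments, the `driver_factory` hook, and the method names `connect` and `close` are assumptions made for this example. The only neo4j calls used are `AsyncGraphDatabase.driver`, `verify_connectivity`, and `close` from the official Python driver.

```python
from typing import Any, Callable, Optional


def _default_driver_factory(uri: str, user: str, password: str) -> Any:
    # Imported lazily so the sketch can be exercised with a fake driver
    # in environments where the neo4j package is not installed.
    from neo4j import AsyncGraphDatabase

    return AsyncGraphDatabase.driver(uri, auth=(user, password))


class Neo4jService:
    """Hypothetical sketch: owns one neo4j.AsyncDriver and exposes it directly."""

    def __init__(
        self,
        uri: str,
        user: str,
        password: str,
        driver_factory: Callable[[str, str, str], Any] = _default_driver_factory,
    ) -> None:
        self._uri = uri
        self._user = user
        self._password = password
        self._driver_factory = driver_factory  # injectable for tests (assumption)
        self._driver: Optional[Any] = None

    @property
    def driver(self) -> Any:
        """Single-level access, replacing graphiti._graphiti.driver."""
        if self._driver is None:
            raise RuntimeError("Neo4jService.connect() has not been called")
        return self._driver

    async def connect(self) -> None:
        """Create the driver once and fail fast if the database is unreachable."""
        if self._driver is None:
            driver = self._driver_factory(self._uri, self._user, self._password)
            await driver.verify_connectivity()
            self._driver = driver

    async def close(self) -> None:
        """Release the driver's connection pool."""
        if self._driver is not None:
            await self._driver.close()
            self._driver = None
```

Callers would then open sessions through `service.driver.session()` rather than reaching into Graphiti internals. No embedder or OpenAI key appears anywhere in the initialization path.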

Separation of Concerns

| Capability | PostgreSQL + pgvector | Neo4j |
| --- | --- | --- |
| Vector similarity search | Mature HNSW, production-proven | Newer, less optimized |
| BM25 keyword search | Native tsvector + full-text indexes | Not supported |
| Metadata filtering | Natural SQL | Requires complex Cypher |
| Relationship traversal | Expensive JOINs | Native, O(1) per hop |
| Path queries | Impractical beyond 2-3 JOINs | Core strength |

What Changed

| Before | After |
| --- | --- |
| GraphitiService wraps Graphiti library | Neo4jService wraps neo4j.AsyncDriver directly |
| graphiti._graphiti.driver for Neo4j | neo4j_service.driver (single level) |
| OpenAI embedder in Graphiti init | Removed (no embeddings in Neo4j) |
| Tier 2 semantic search | Removed (typed queries cover all cases) |
| graphiti.add_episode() | Removed (dead code) |
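
To illustrate what "typed queries" plus taxonomy alias resolution can look like without an embedding fallback, here is a small sketch. Everything specific in it is hypothetical: the alias table, the `Condition` and `Document` labels, the `MENTIONS` relationship, and the function names are invented for the example and do not reflect the real schema or taxonomy.

```python
from typing import Dict, Optional, Tuple

# Hypothetical frozen taxonomy: lowercase alias -> canonical node name.
ALIASES: Dict[str, str] = {
    "htn": "hypertension",
    "high blood pressure": "hypertension",
    "hypertension": "hypertension",
}


def resolve_alias(term: str) -> Optional[str]:
    """Deterministically map a user term to its canonical name, if known."""
    return ALIASES.get(term.strip().lower())


def build_typed_query(term: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return a parameterized Cypher query for a known term, else None."""
    canonical = resolve_alias(term)
    if canonical is None:
        # Unknown terms yield no query; there is no semantic-search fallback.
        return None
    cypher = (
        "MATCH (c:Condition {name: $name})<-[:MENTIONS]-(d:Document) "
        "RETURN d.id AS document_id"
    )
    return cypher, {"name": canonical}
```

The lookup is exact and repeatable: the same input always resolves to the same typed node, which is the property the deterministic pipeline relies on.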

Consequences

Positive

  • Removes OpenAI API key requirement for graph features
  • Eliminates embedding model mismatch
  • Simplifies initialization
  • Removes ~200 lines of dead code
  • Direct Neo4j driver access
  • One less library to maintain

Negative

  • Loses the option of Graphiti's LLM-based entity extraction (our regex + taxonomy pipeline is more reliable)
  • Loses the option of semantic search over graph nodes (typed node Cypher + taxonomy cover all cases)