Architectural Update (March 2026)

This ADR was written when the system used Neo4j for entity storage. As of March 2026, Neo4j has been fully removed and replaced by PostgreSQL taxonomy tables (taxonomy_entities, taxonomy_relationships). The decision rationale documented here remains valid; the storage layer has changed.

ADR-0020: Reciprocal Rank Fusion

Date: 2026-02-10 | Status: Accepted

Context

The hybrid search pipeline combines vector similarity (pgvector cosine distance) with keyword matching (BM25). The previous implementation used weighted linear combination: final_score = 0.7 * vector_score + 0.3 * bm25_score.

This approach has a fundamental flaw: BM25 scores and cosine similarities operate on incompatible scales. Cosine similarity ranges from -1 to 1 (typically 0.3-0.9 for relevant results), while BM25 scores are unbounded positive values that vary wildly depending on query length, document frequency, and corpus size.
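
To make the scale mismatch concrete, here is a minimal sketch that applies the previous formula to three documents. The document IDs and score values are invented for illustration and are not measurements from this system; only the 0.7/0.3 weights come from the formula above.

```python
# Hypothetical scores for three documents (illustrative values only).
vector_scores = {"doc_a": 0.92, "doc_b": 0.55, "doc_c": 0.40}  # cosine similarity
bm25_scores = {"doc_a": 2.1, "doc_b": 14.8, "doc_c": 9.3}      # unbounded BM25


def weighted_linear(doc_id: str, w_vec: float = 0.7, w_bm25: float = 0.3) -> float:
    """Previous approach: combine raw scores despite their different scales."""
    return w_vec * vector_scores[doc_id] + w_bm25 * bm25_scores[doc_id]


# Rank documents by the combined score, best first.
ranking = sorted(vector_scores, key=weighted_linear, reverse=True)
bm25_only = sorted(bm25_scores, key=bm25_scores.get, reverse=True)
print(ranking, bm25_only)
```

With these numbers the combined ranking is identical to the BM25-only ranking, even though the vector score nominally carries 70% of the weight: the larger BM25 magnitudes swamp the cosine similarities.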

Decision

Replace weighted linear combination with Reciprocal Rank Fusion (RRF):

score(d) = Σ 1/(k + rank_i + 1)    for each result list i

Where:

k = 60 (standard constant from the original RRF paper by Cormack, Clarke & Buettcher, 2009)
rank_i = position of document d in result list i (0-based; the +1 in the denominator makes this equivalent to the paper's 1-based form 1/(k + rank))
Documents not present in a result list receive no contribution from that list

RRF is score-agnostic -- it only uses rank positions, completely sidestepping the score incompatibility problem.
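
The formula above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in search_service.py: the function name `rrf_fuse`, the document IDs, and the two sample lists are hypothetical, and each list is assumed to contain a document at most once, ordered best first.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Sequence


def rrf_fuse(
    result_lists: Iterable[Sequence[Hashable]], k: int = 60
) -> list[tuple[Hashable, float]]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Ranks are 0-based, so a document at position `rank` in a list
    contributes 1 / (k + rank + 1). Documents absent from a list
    receive no contribution from it.
    """
    scores: dict[Hashable, float] = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results):
            scores[doc_id] += 1.0 / (k + rank + 1)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ranked lists from the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([vector_hits, bm25_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.6f}")
```

In this example doc_b (positions 1 and 0) edges out doc_a (positions 0 and 2), and both outrank doc_d and doc_c, which each appear in only one list.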

Key Properties

Property	Implication
Score-agnostic	No need to calibrate weights between different scoring scales
Overlap promotion	Documents in both lists rank higher than those in only one
Monotonically decreasing	A lower position in a list (larger rank number) always yields a smaller score contribution
Well-studied	Used by Elasticsearch, Azure AI Search, Pinecone
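
Two of these properties can be checked numerically. The sketch below uses k = 60 and 0-based ranks as defined in the Decision section; the helper name `contribution` and the chosen positions are assumptions made for illustration.

```python
K = 60  # constant from the Decision section


def contribution(rank: int, k: int = K) -> float:
    """Score contributed by one list for a document at 0-based position `rank`."""
    return 1.0 / (k + rank + 1)


# Monotonically decreasing: each later position contributes less than the one before.
contributions = [contribution(rank) for rank in range(100)]
is_decreasing = all(a > b for a, b in zip(contributions, contributions[1:]))

# Overlap promotion: a document at position 9 in both lists (2/70)
# outscores a document that tops one list but is missing from the other (1/61).
in_both = contribution(9) + contribution(9)
top_of_one = contribution(0)

print(is_decreasing, in_both > top_of_one)
```

With two lists, a document present in both at the same position r outscores a document that tops a single list whenever 2/(61 + r) > 1/61, that is, for r < 61.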

Implementation

In search_service.py, Step 4 (BM25 merge) was replaced with RRF fusion. Vector search and BM25 search each return ranked lists, and RRF combines ranks into a single score sorted descending.

Consequences

Positive

+3-7% accuracy improvement: Measured across query test sets
More robust across query types: No need to tune weights per query category
Simpler code: No normalization logic, no weight parameters to maintain
Well-studied: Standard technique in production RAG systems

Negative

No score weighting: Cannot express "trust vector more than BM25" (though with k=60 a document that ranks reasonably well in both lists outscores one that tops only a single list)
Rank-only: Ignores confidence gaps (a rank-1 result with 0.99 similarity is treated identically to a rank-1 result with 0.51)

Neutral

Same retrieval candidates (pgvector + BM25 sources unchanged)
Same reranking step downstream (BGE reranker operates on RRF-fused results)
PostgreSQL taxonomy results merged with priority ordering before fusion (unchanged)

Alternatives Considered

Alternative 1: Weighted Linear with Better Normalization

Apply z-score or percentile normalization to both score types before combining.

Pros: Preserves score magnitude information
Cons: Normalization requires corpus statistics, fragile across corpus updates
Why rejected: RRF achieves better results with zero tuning
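
For comparison, this rejected alternative might look roughly like the sketch below. It is not code from this system: the function names and scores are hypothetical, statistics are computed over the candidate set as a stand-in for corpus statistics, and documents missing from one list are assumed to receive a z-score of 0.0 (the mean) for that list.

```python
from statistics import mean, pstdev


def z_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Z-score normalize using the mean and standard deviation of this score set."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (value - mu) / sigma for doc_id, value in scores.items()}


def normalized_linear(
    vector_scores: dict[str, float],
    bm25_scores: dict[str, float],
    w_vec: float = 0.7,
    w_bm25: float = 0.3,
) -> dict[str, float]:
    """Weighted sum of z-scores; a missing document gets 0.0 for that list (assumption)."""
    vec_z, bm25_z = z_normalize(vector_scores), z_normalize(bm25_scores)
    doc_ids = set(vec_z) | set(bm25_z)
    return {d: w_vec * vec_z.get(d, 0.0) + w_bm25 * bm25_z.get(d, 0.0) for d in doc_ids}


# Hypothetical scores before and after one high-scoring document is added.
vector_scores = {"doc_a": 0.9, "doc_b": 0.6, "doc_c": 0.3}
bm25_before = {"doc_a": 2.0, "doc_b": 4.0, "doc_c": 6.0}
bm25_after = {**bm25_before, "doc_new": 40.0}

combined = normalized_linear(vector_scores, bm25_before)
z_before = z_normalize(bm25_before)
z_after = z_normalize(bm25_after)
print(z_before["doc_c"], z_after["doc_c"])
```

The fragility shows up in doc_c: its raw BM25 score never changes, yet its normalized score drops from well above the mean to below it once a single outlier enters the statistics.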

Alternative 2: Convex Combination of Normalized Ranks

Normalize ranks to [0,1] and use weighted sum.

Pros: Allows weight tuning
Cons: Still requires a weight parameter, marginal benefit over RRF
Why rejected: Added complexity without meaningful improvement
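
A minimal sketch of this alternative follows, to show where the extra weight parameter enters. The normalization (n - rank) / n, the treatment of missing documents as 0, the function names, and the sample lists are all assumptions made for illustration; the ADR does not specify them.

```python
def normalized_rank_scores(results: list[str]) -> dict[str, float]:
    """Map 0-based positions to (0, 1]: the top result gets 1.0, the last gets 1/n."""
    n = len(results)
    return {doc_id: (n - rank) / n for rank, doc_id in enumerate(results)}


def convex_rank_fusion(
    vector_hits: list[str], bm25_hits: list[str], alpha: float = 0.7
) -> list[tuple[str, float]]:
    """alpha weights the vector list, (1 - alpha) the BM25 list; missing documents score 0."""
    vec = normalized_rank_scores(vector_hits)
    bm25 = normalized_rank_scores(bm25_hits)
    doc_ids = set(vec) | set(bm25)
    scores = {d: alpha * vec.get(d, 0.0) + (1 - alpha) * bm25.get(d, 0.0) for d in doc_ids}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ranked lists from the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

favor_vector = [d for d, _ in convex_rank_fusion(vector_hits, bm25_hits, alpha=0.7)]
favor_bm25 = [d for d, _ in convex_rank_fusion(vector_hits, bm25_hits, alpha=0.3)]
print(favor_vector, favor_bm25)
```

The two orderings differ, which is exactly the tuning surface this alternative reintroduces: the final ranking depends on a choice of alpha that has to be maintained.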

References

Cormack, G. V., Clarke, C. L. A., & Buettcher, S. (2009). Reciprocal rank fusion outperforms Condorcet and individual rank learning methods. Proceedings of SIGIR 2009, 758--759. https://doi.org/10.1145/1571941.1572114
ParadeDB. (2024). Hybrid search in PostgreSQL: The missing manual. https://www.paradedb.com/blog/hybrid-search-in-postgresql-the-missing-manual

Related ADRs

ADR-007: RAG Pipeline Enrichment (original hybrid search design)
ADR-0019: Contextual Embeddings (improved embedding inputs)
