ADR-0017: Context Retrieval Architecture
Date: 2026-02-09 | Status: Accepted
Summary
Consolidates the complete multi-signal hybrid RAG retrieval architecture (Lewis et al., 2020) into a single architectural decision record. The ADR documents the 8-stage pipeline, from intent classification through parallel retrieval (vector + BM25 + graph), score fusion, metadata boosting, keyword rescue, context assembly, page summary injection, and LLM generation. It explains how each signal contributes to the final ranking and identifies gaps against 2025-2026 best practices.
Key Decisions
- 8-stage pipeline: Intent → Parallel Retrieval (3 channels) → Score Fusion → Metadata Boosting (7 signals) → Keyword Rescue → Context Assembly → Context Building → LLM Generation
- Reciprocal Rank Fusion: RRF (k=60) replaces weighted linear combination — see ADR-0020
- 7 metadata boost signals: Category (+20%), recency (+15%), section header (+10%), entity type (+10%), campus (+10%), conversation context (+25%), content keyword (up to +40%)
- Pre-computed enrichment: Page summaries and canonical questions generated at ingestion time — zero query-time LLM overhead
- Cross-encoder reranking: Always-on in full mode (50→15 candidates) — see ADR-0024
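
The fusion and boosting decisions above can be illustrated with a minimal sketch. It is not taken from the codebase. The function names, the signal keys, and the sample document ids are hypothetical, and the choice to sum the boosts and apply them as a single multiplier is an assumption, since this summary does not say how the seven signals are combined. Only `k=60` and the boost percentages come from the list above.

```python
from collections import defaultdict

RRF_K = 60  # smoothing constant named in the decision above

# Boost fractions copied from the list above; the keys are hypothetical names.
BOOSTS = {
    "category": 0.20,
    "recency": 0.15,
    "section_header": 0.10,
    "entity_type": 0.10,
    "campus": 0.10,
    "conversation_context": 0.25,
}
MAX_KEYWORD_BOOST = 0.40  # content keyword signal is "up to +40%"


def rrf_fuse(rankings, k=RRF_K):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; a document absent from a list contributes nothing.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


def apply_boosts(score, matched_signals, keyword_strength=0.0):
    """Assumption: matched boosts are summed, then applied as one multiplier."""
    boost = sum(BOOSTS[name] for name in matched_signals)
    # keyword_strength in [0, 1] scales the keyword boost up to its cap.
    boost += MAX_KEYWORD_BOOST * min(max(keyword_strength, 0.0), 1.0)
    return score * (1.0 + boost)


# Hypothetical ranked results from the three retrieval channels.
vector_hits = ["d1", "d2", "d3"]
bm25_hits = ["d2", "d4", "d1"]
graph_hits = ["d2", "d5"]

fused = rrf_fuse([vector_hits, bm25_hits, graph_hits])
ranked = sorted(fused, key=fused.get, reverse=True)
print(ranked[:2])  # d2 appears in all three lists, so it ranks first
```

Because RRF uses only ranks, the three channels need no score normalization before fusion, which is the usual argument for preferring it over a weighted linear combination.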
Known Gaps
| Gap | Recommended Upgrade | Priority |
|---|---|---|
| No reranker in default pipeline | Add BGE-reranker-v2-m3 to normal flow | High |
| Weighted linear fusion | Switch to Reciprocal Rank Fusion (RRF) | Medium |
| Canonical questions BM25-only | Implement full HyPE (embed questions as vectors) | Medium |
| Page summaries at generation-time only | Full contextual retrieval (prepend before embedding) | High |
| No confidence-based abstention | Pre-generation quality gate | Medium |
Full Details
See the complete ADR at `docs/ADR/0017-context-retrieval-architecture.md` and the detailed architecture page, Context Retrieval Architecture.