ADR-0022: Dynamic Retrieval Future Consideration

Date: 2026-02-10 | Status: Deferred

Context

Dynamic retrieval techniques extend the standard RAG pipeline (Lewis et al., 2020). Approaches such as FLARE (Forward-Looking Active REtrieval) and DRAGIN (Dynamic Retrieval Augmented Generation based on Information Needs) retrieve additional context mid-generation when the model detects low-confidence tokens:

1. Generate a partial response
2. Detect uncertain tokens (low probability, hedging language)
3. Formulate a targeted retrieval query based on the uncertain passage
4. Retrieve additional context
5. Continue generation with enriched context
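
The steps above can be sketched as a simple loop. This is a simplified, FLARE-style illustration under stated assumptions, not the published algorithm in full and not code from this pipeline: `dynamic_generate`, `generate_chunk`, `retrieve`, and the `threshold` value are all hypothetical names and parameters chosen for the example, and uncertainty is detected from token probability only (hedging-language detection is left out).

```python
from typing import Callable, List, Tuple

# Hypothetical shape: one generated chunk as (token_text, probability) pairs.
TokenProbs = List[Tuple[str, float]]


def dynamic_generate(
    question: str,
    generate_chunk: Callable[[str, List[str], str], TokenProbs],
    retrieve: Callable[[str], List[str]],
    threshold: float = 0.4,  # assumed cutoff for a "low-confidence" token
    max_chunks: int = 8,     # assumed bound so the loop always terminates
) -> str:
    """Generate an answer chunk by chunk, retrieving again on uncertainty."""
    context = retrieve(question)  # standard upfront retrieval
    answer = ""
    for _ in range(max_chunks):
        # Step 1: generate a partial response.
        chunk = generate_chunk(question, context, answer)
        if not chunk:
            break  # the generator signals that the answer is complete
        # Step 2: detect uncertain tokens.
        if any(p < threshold for _, p in chunk):
            # Step 3: build a query from the question plus confident tokens.
            confident = [tok.strip() for tok, p in chunk if p >= threshold]
            query = " ".join([question, *confident])
            # Step 4: retrieve additional context.
            context = context + retrieve(query)
            # Step 5: regenerate the chunk with the enriched context.
            chunk = generate_chunk(question, context, answer)
        answer += "".join(tok for tok, _ in chunk)
    return answer
```

Each uncertain chunk costs one extra retrieval round-trip and one extra generation call, which is the overhead weighed in the decision below.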

This loop is most effective for long-form generation, where the initial retrieval may not cover every sub-topic.

Decision

Defer implementation. Rationale:

Short-form answers: The medical search chatbot produces short, focused answers (typically 2-5 sentences). Dynamic retrieval provides the most value for multi-paragraph generation where context needs shift mid-response.
Streaming complexity: The pipeline uses WebSocket streaming for real-time token delivery. Dynamic retrieval requires pausing mid-stream, performing a retrieval round-trip, and resuming, which adds significant architectural complexity.
Latency sensitivity: Each mid-generation retrieval adds 200-500ms (embedding + vector search + reranking). For short answers, this overhead exceeds the generation time itself.
Upfront retrieval sufficiency: With 50-100 candidates, RRF fusion, and BGE reranking to top-15, the upfront retrieval captures sufficient context for short-form answers.
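
For reference, the fusion step in the upfront retrieval can be sketched as follows. This is a minimal illustration of standard Reciprocal Rank Fusion, where each document scores the sum of 1 / (k + rank) over the ranked lists it appears in. The function name `rrf_fuse` and the constant `k = 60` (a commonly used default) are assumptions for the example; the values actually used by this pipeline are not stated here, and the reranking step is not shown.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top document contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical example: one keyword ranking and one vector ranking.
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "d"]])
```

Documents that appear in both lists ("b" and "c") rise above documents that appear in only one, which is why fusing 50-100 candidates before reranking tends to yield broad coverage for a short answer.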

Consequences

Revisit if expanding to multi-step reasoning, long-form generation, or report-style outputs
Monitor FLARE/DRAGIN research for latency-optimized variants
Current architecture supports adding retrieval hooks at the service layer if needed later
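
As a rough illustration of what a service-layer retrieval hook could look like if this decision is revisited, the sketch below defines an optional callback that is a no-op by default. Every name here (`RetrievalHook`, `NoOpHook`, `AnswerService`, `on_partial`, `extend_context`) is hypothetical and does not describe the existing service code.

```python
from typing import List, Optional, Protocol


class RetrievalHook(Protocol):
    """Hypothetical extension point called with the partial answer so far."""

    def on_partial(self, question: str, partial_answer: str) -> Optional[List[str]]:
        ...


class NoOpHook:
    """Default behaviour today: never retrieve mid-generation."""

    def on_partial(self, question: str, partial_answer: str) -> Optional[List[str]]:
        return None


class AnswerService:
    """Hypothetical service wrapper that owns the retrieval hook."""

    def __init__(self, hook: Optional[RetrievalHook] = None) -> None:
        self.hook: RetrievalHook = hook if hook is not None else NoOpHook()

    def extend_context(
        self, question: str, partial_answer: str, context: List[str]
    ) -> List[str]:
        # Ask the hook for extra passages; keep the context unchanged if none.
        extra = self.hook.on_partial(question, partial_answer)
        return context + extra if extra else context
```

With the no-op default, current behaviour stays the same; a dynamic retrieval strategy could later be supplied as a different hook without changing callers.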

Related ADRs

ADR-0021: Self-RAG Future Consideration (related adaptive retrieval pattern)
ADR-0020: Reciprocal Rank Fusion (upfront retrieval quality)
