Replace, Don't Expand: Mitigating Context Dilution in Multi-Hop RAG via Fixed-Budget Evidence Assembly
Abstract
SEAL-RAG addresses context dilution in multi-hop question answering by replacing retrieved content rather than expanding it, using entity-anchored extraction and targeted micro-queries to improve precision.
Retrieval-Augmented Generation (RAG) systems often fail on multi-hop queries when the initial retrieval misses a bridge fact. Prior corrective approaches, such as Self-RAG, CRAG, and Adaptive-k, typically address this by adding more context or pruning existing lists. However, simply expanding the context window often leads to context dilution, where distractors crowd out relevant information. We propose SEAL-RAG, a training-free controller that adopts a ``replace, don't expand'' strategy to fight context dilution under a fixed retrieval depth k. SEAL executes a (Search rightarrow Extract rightarrow Assess rightarrow Loop) cycle: it performs on-the-fly, entity-anchored extraction to build a live gap specification (missing entities/relations), triggers targeted micro-queries, and uses entity-first ranking to actively swap out distractors for gap-closing evidence. We evaluate SEAL-RAG against faithful re-implementations of Basic RAG, CRAG, Self-RAG, and Adaptive-k in a shared environment on HotpotQA and 2WikiMultiHopQA. On HotpotQA (k=3), SEAL improves answer correctness by +3--13 pp and evidence precision by +12--18 pp over Self-RAG. On 2WikiMultiHopQA (k=5), it outperforms Adaptive-k by +8.0 pp in accuracy and maintains 96\% evidence precision compared to 22\% for CRAG. These gains are statistically significant (p<0.001). By enforcing fixed-k replacement, SEAL yields a predictable cost profile while ensuring the top-k slots are optimized for precision rather than mere breadth. We release our code and data at https://github.com/mosherino/SEAL-RAG.
Community
Subject: Summary: Replace, Don't Expand
RAG systems often fail not because they lack information, but because they retrieve too much noise ("context dilution").
In this paper, we introduce SEAL-RAG, a "fixed-budget" retrieval strategy. Instead of endlessly appending retrieved chunks and overflowing the context window, we dynamically replace irrelevant segments with new, higher-quality evidence. This maintains a high signal-to-noise ratio, significantly improving performance on complex multi-hop QA tasks without inflating token costs.
Models citing this paper 0
No model linking this paper
Datasets citing this paper 1
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper