arxiv:2512.10787

Replace, Don't Expand: Mitigating Context Dilution in Multi-Hop RAG via Fixed-Budget Evidence Assembly

Published on Dec 11, 2025
AI-generated summary

SEAL-RAG addresses context dilution in multi-hop question answering by replacing retrieved content rather than expanding it, using entity-anchored extraction and targeted micro-queries to improve precision.

Abstract

Retrieval-Augmented Generation (RAG) systems often fail on multi-hop queries when the initial retrieval misses a bridge fact. Prior corrective approaches, such as Self-RAG, CRAG, and Adaptive-k, typically address this by adding more context or pruning existing lists. However, simply expanding the context window often leads to context dilution, where distractors crowd out relevant information. We propose SEAL-RAG, a training-free controller that adopts a "replace, don't expand" strategy to fight context dilution under a fixed retrieval depth k. SEAL executes a (Search → Extract → Assess → Loop) cycle: it performs on-the-fly, entity-anchored extraction to build a live gap specification (missing entities/relations), triggers targeted micro-queries, and uses entity-first ranking to actively swap out distractors for gap-closing evidence. We evaluate SEAL-RAG against faithful re-implementations of Basic RAG, CRAG, Self-RAG, and Adaptive-k in a shared environment on HotpotQA and 2WikiMultiHopQA. On HotpotQA (k=3), SEAL improves answer correctness by +3–13 pp and evidence precision by +12–18 pp over Self-RAG. On 2WikiMultiHopQA (k=5), it outperforms Adaptive-k by +8.0 pp in accuracy and maintains 96% evidence precision compared to 22% for CRAG. These gains are statistically significant (p<0.001). By enforcing fixed-k replacement, SEAL yields a predictable cost profile while ensuring the top-k slots are optimized for precision rather than mere breadth. We release our code and data at https://github.com/mosherino/SEAL-RAG.
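The Search → Extract → Assess → Loop cycle described in the abstract can be illustrated with a toy Python sketch. This is not the released SEAL-RAG code: the `Passage` type, the pre-extracted fact triples, the `assess`, `gap_score`, and `seal_sketch` functions, and the `toy_micro_query` retriever are all hypothetical names introduced here, and a real system would presumably use an LLM or extractor in place of the hard-coded triples. The sketch only shows the control flow: find the missing hop, issue a targeted query for it, and swap the new evidence into a slot held by a passage that does not support the reasoning chain, so the context never exceeds k.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object)
Gap = Tuple[str, str]        # (entity, relation) that is still missing


@dataclass(frozen=True)
class Passage:
    text: str
    facts: Tuple[Fact, ...]  # assumption: stands in for on-the-fly extraction


def assess(context: List[Passage], start: str, relations: List[str]):
    """Walk the relation chain; return (answer, gap, indices of passages used)."""
    entity, used = start, set()
    for rel in relations:
        hit = next(((i, f) for i, p in enumerate(context) for f in p.facts
                    if f[0] == entity and f[1] == rel), None)
        if hit is None:
            return None, (entity, rel), used  # live gap specification
        used.add(hit[0])
        entity = hit[1][2]
    return entity, None, used


def gap_score(p: Passage, gap: Gap) -> int:
    """Entity-first ranking: entity match counts, relation match adds a bonus."""
    ent, rel = gap
    return sum((f[0] == ent) + (f[0] == ent and f[1] == rel) for f in p.facts)


def seal_sketch(start: str, relations: List[str], initial: List[Passage],
                micro_query: Callable[[str, str], List[Passage]],
                k: int, max_rounds: int = 4):
    context = list(initial)[:k]
    for _ in range(max_rounds):
        answer, gap, used = assess(context, start, relations)
        if gap is None:
            return answer, context
        candidates = [p for p in micro_query(*gap) if p not in context]
        candidates.sort(key=lambda p: gap_score(p, gap), reverse=True)
        if not candidates or gap_score(candidates[0], gap) == 0:
            break  # nothing retrieved that closes the gap
        victims = [i for i in range(len(context)) if i not in used]
        if len(context) < k:
            context.append(candidates[0])
        elif victims:
            context[victims[0]] = candidates[0]  # replace, don't expand
        else:
            break  # every slot already supports the chain
    answer, _, _ = assess(context, start, relations)
    return answer, context


# Toy corpus and retriever (made-up data, for illustration only).
CORPUS = [
    Passage("Film X was directed by Alice.", (("Film X", "directed_by", "Alice"),)),
    Passage("Film Y is a comedy.", (("Film Y", "genre", "comedy"),)),
    Passage("Bob is a chef.", (("Bob", "occupation", "chef"),)),
    Passage("Alice was born in Paris.", (("Alice", "born_in", "Paris"),)),
]


def toy_micro_query(entity: str, relation: str) -> List[Passage]:
    return [p for p in CORPUS if any(f[0] == entity for f in p.facts)]


answer, final_context = seal_sketch(
    "Film X", ["directed_by", "born_in"], CORPUS[:3], toy_micro_query, k=3)
```

In this toy run the initial top-3 contains the first hop and two distractors; the loop detects that the birthplace of the director is missing, retrieves the bridge passage, and overwrites one distractor, leaving the context at three passages.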

Community

Paper author

Subject: Summary: Replace, Don't Expand

RAG systems often fail not because they lack information, but because they retrieve too much noise ("context dilution").

In this paper, we introduce SEAL-RAG, a "fixed-budget" retrieval strategy. Instead of endlessly appending retrieved chunks and overflowing the context window, we dynamically replace irrelevant segments with new, higher-quality evidence. This maintains a high signal-to-noise ratio, significantly improving performance on complex multi-hop QA tasks without inflating token costs.
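To make the signal-to-noise point concrete, here is a small Python sketch comparing an append-style update with a fixed-budget replacement. The passage names, the relevance scores, and the `expand` and `replace` helpers are invented for illustration and are not taken from the paper; evidence precision is computed here simply as the fraction of passages in the context that are gold evidence.

```python
from typing import Dict, List, Set


def evidence_precision(context: List[str], gold: Set[str]) -> float:
    """Fraction of passages in the context that are gold evidence."""
    return sum(p in gold for p in context) / len(context) if context else 0.0


def expand(context: List[str], retrieved: List[str]) -> List[str]:
    """Append-style update: the context grows with every retrieval round."""
    return context + [p for p in retrieved if p not in context]


def replace(context: List[str], retrieved: List[str],
            score: Dict[str, float], k: int) -> List[str]:
    """Fixed-budget update: keep only the k highest-scoring passages."""
    pool = expand(context, retrieved)
    return sorted(pool, key=lambda p: score[p], reverse=True)[:k]


# Hypothetical data: two gold hops and four distractors.
gold = {"hop1", "hop2"}
score = {"hop1": 0.9, "hop2": 0.8, "noise_a": 0.2,
         "noise_b": 0.1, "noise_c": 0.3, "noise_d": 0.1}
context = ["hop1", "noise_a", "noise_b"]
retrieved = ["hop2", "noise_c", "noise_d"]

expanded = expand(context, retrieved)             # 6 passages, 2 gold
replaced = replace(context, retrieved, score, 3)  # 3 passages, 2 gold
```

With these made-up numbers, appending leaves two gold passages among six (precision 1/3), while replacement keeps two gold passages among three (precision 2/3) at the original context size.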
