SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution
Abstract
The SAVOIR framework applies cooperative game theory to improve the social intelligence of language agents, combining prospective expected-utility valuation with Shapley values for principled credit assignment in dialogue systems.
Social intelligence, the ability to navigate complex interpersonal interactions, presents a fundamental challenge for language agents. Training such agents via reinforcement learning requires solving the credit assignment problem: determining how individual utterances contribute to multi-turn dialogue outcomes. Existing approaches directly employ language models to distribute episode-level rewards, yielding attributions that are retrospective and lack theoretical grounding. We propose SAVOIR (ShApley Value fOr SocIal RL), a principled framework grounded in cooperative game theory. Our approach combines two complementary principles: expected utility, which shifts evaluation from retrospective attribution to prospective valuation, capturing an utterance's strategic potential for enabling favorable future trajectories; and Shapley values, which ensure fair credit distribution with axiomatic guarantees of efficiency, symmetry, and marginality. Experiments on the SOTOPIA benchmark demonstrate that SAVOIR achieves new state-of-the-art performance across all evaluation settings, with our 7B model matching or exceeding proprietary models including GPT-4o and Claude-3.5-Sonnet. Notably, even large reasoning models consistently underperform, suggesting social intelligence requires qualitatively different capabilities than analytical reasoning.
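The Shapley-value principle behind SAVOIR's credit assignment can be illustrated with a toy example. The sketch below is not the paper's implementation (SAVOIR pairs Shapley attribution with a prospective expected-utility valuation over future trajectories); the characteristic function `v`, the utterance labels, and the synergy bonus are all hypothetical stand-ins for a learned episode-reward model.

```python
from itertools import permutations

def shapley_values(players, value_fn):
    """Exact Shapley values: each player's average marginal contribution
    over all join orders. O(n!) enumeration, fine for short dialogues."""
    totals = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of utterance p given what came before
            totals[p] += value_fn(with_p) - value_fn(coalition)
            coalition = with_p
    return {p: totals[p] / len(perms) for p in players}

# Hypothetical characteristic function: episode reward as a function of
# which utterances are credited. The synergy bonus models two turns that
# only pay off together (an offer followed by a concession).
def v(coalition):
    base = {"greet": 1.0, "offer": 3.0, "concede": 2.0}
    reward = sum(base[u] for u in coalition)
    if {"offer", "concede"} <= coalition:
        reward += 2.0  # synergy bonus
    return reward

credits = shapley_values(["greet", "offer", "concede"], v)
```

The efficiency axiom guarantees the per-utterance credits sum exactly to the full-episode reward `v({greet, offer, concede})`, while symmetry and marginality ensure the synergy bonus is split fairly between the two utterances that jointly produce it.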
Community
Training language agents for social intelligence faces a core challenge—credit assignment across multi-turn dialogues. Existing methods retrospectively distribute episode rewards via LLMs, lacking theoretical grounding. We propose SAVOIR, a principled framework from cooperative game theory that (1) reframes credit assignment as prospective valuation through expected utility shifts, capturing each utterance's strategic potential to enable favorable futures, and (2) uses Shapley values to guarantee fair attribution with axiomatic properties (efficiency, symmetry, marginality). On SOTOPIA, SAVOIR achieves new SOTA across all settings—our 7B model matches or exceeds GPT-4o and Claude-3.5-Sonnet. Notably, large reasoning models consistently underperform, suggesting social intelligence requires qualitatively different capabilities than analytical reasoning. 🧠💬
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- MAPO: Mixed Advantage Policy Optimization for Long-Horizon Multi-Turn Dialogue (2026)
- ShapE-GRPO: Shapley-Enhanced Reward Allocation for Multi-Candidate LLM Training (2026)
- Stratagem: Learning Transferable Reasoning via Trajectory-Modulated Game Self-Play (2026)
- Hindsight Credit Assignment for Long-Horizon LLM Agents (2026)
- Collaborative Multi-Agent Scripts Generation for Enhancing Imperfect-Information Reasoning in Murder Mystery Games (2026)
- Enhancing LLM-based Search Agents via Contribution Weighted Group Relative Policy Optimization (2026)
- Learning to Negotiate: Multi-Agent Deliberation for Collective Value Alignment in LLMs (2026)
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Get this paper in your agent:
hf papers read 2604.18982
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash