Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published May 3 • 123
Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs Paper • 2602.07276 • Published Feb 7 • 11
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 128
LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training Paper • 2510.14969 • Published Oct 16, 2025 • 8
Vibe Checker: Aligning Code Evaluation with Human Preference Paper • 2510.07315 • Published Oct 8, 2025 • 34
Why Does the Effective Context Length of LLMs Fall Short? Paper • 2410.18745 • Published Oct 24, 2024 • 17
Open-Vocabulary Argument Role Prediction for Event Extraction Paper • 2211.01577 • Published Nov 3, 2022
See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses Paper • 2408.08978 • Published Aug 16, 2024
The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang Paper • 2509.00425 • Published Aug 30, 2025 • 12