Forecasting Downstream Performance of LLMs With Proxy Metrics • Paper • 2605.18607 • Published 10 days ago • 14
MINTEval: Evaluating Memory under Multi-Target Interference in Long-Horizon Agent Systems • Paper • 2605.18565 • Published 9 days ago • 4
Sparse Autoencoders enable Robust and Interpretable Fine-tuning of CLIP models • Paper • 2605.15961 • Published 13 days ago • 9
Multi-Objective and Mixed-Reward Reinforcement Learning via Reward-Decorrelated Policy Optimization • Paper • 2605.13641 • Published 15 days ago • 49
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information • Paper • 2605.11609 • Published 16 days ago • 191
Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training • Paper • 2511.07328 • Published 24 days ago • 16
EDU-CIRCUIT-HW: Evaluating Multimodal Large Language Models on Real-World University-Level STEM Student Handwritten Solutions • Paper • 2602.00095 • Published 28 days ago • 3
Significance and Stability Analysis of Gene-Environment Interaction using RGxEStat • Paper • 2604.03337 • Published Apr 3 • 1
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model • Paper • 2604.20796 • Published Apr 22 • 242
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published Apr 2 • 504
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability • Paper • 2604.06628 • Published Apr 8 • 326