DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published 10 days ago • 59
Crowded in B-Space: Calibrating Shared Directions for LoRA Merging Paper • 2604.16826 • Published 7 days ago • 18
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 17 days ago • 320
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published 23 days ago • 487
Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models Paper • 2603.26259 • Published 29 days ago • 7
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published 26 days ago • 340