Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum • arXiv:2510.00526 • Published Oct 1, 2025 • 10 upvotes
Dynamic Long-Context Reasoning over Compressed Memory via End-to-End Reinforcement Learning • arXiv:2602.08382 • Published 6 days ago • 10 upvotes
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning • arXiv:2602.10560 • Published 4 days ago • 27 upvotes
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models • arXiv:2602.12036 • Published 2 days ago • 86 upvotes
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning • arXiv:2602.01058 • Published 14 days ago • 39 upvotes
Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts • arXiv:2601.22156 • Published 16 days ago • 11 upvotes
iFSQ: Improving FSQ for Image Generation with 1 Line of Code • arXiv:2601.17124 • Published 22 days ago • 32 upvotes
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation • arXiv:2601.21420 • Published 17 days ago • 42 upvotes
Scaling Embeddings Outperforms Scaling Experts in Language Models • arXiv:2601.21204 • Published 17 days ago • 99 upvotes
Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models • arXiv:2601.14152 • Published 25 days ago • 5 upvotes
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models • arXiv:2601.15165 • Published 24 days ago • 72 upvotes