LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws • Paper • 2605.23901 • Published 15 days ago • 13
Rethinking Cross-Layer Information Routing in Diffusion Transformers • Paper • 2605.20708 • Published 17 days ago • 109