Submitted by philschmid 46 Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models DeepSeek 729 1
Submitted by Sylvestre 44 Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion · 6 authors 9 1
Submitted by zuom 20 Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages Bats Research 65 1