The Reward Was in Your Data All Along: Correcting Flow Matching with Discriminator-Guided RL Paper • 2606.19162 • Published 3 days ago • 18
How to Train Your LLM Web Agent: A Statistical Diagnosis Paper • 2507.04103 • Published Jul 5, 2025 • 52
Article How to Train Your LLM Web Agent: A Statistical Diagnosis ppEmiliano • Jul 8, 2025 • 15
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 712
Continual Pre-training Collection Models from Simple and Scalable Strategies to Continually Pre-train Large Language Models • 0 items • Updated Apr 7