SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper 2501.17161 • Published Jan 28, 2025
Tree of Thoughts: Deliberate Problem Solving with Large Language Models • Paper 2305.10601 • Published May 17, 2023
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models • Paper 2402.03300 • Published Feb 5, 2024