LordNeel/DeepSeek-V4-Flash-Acti-MTP-W4A16-FP8 • Text Generation • 44B • Updated 27 days ago • 2.94k • 10
Replacing thinking with tool usage enables reasoning in small language models • Paper • 2507.05065 • Published Jul 7, 2025 • 17