PRL-Bench: A Comprehensive Benchmark Evaluating LLMs' Capabilities in Frontier Physics Research Paper • 2604.15411 • Published 6 days ago • 2
Running on Zero Agents 1 Molmo2-SGCoT: Visual Entity Tracking Demo 🎯 1 Track objects in shell games with SGCoT
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks Paper • 2305.14201 • Published May 23, 2023 • 6