Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short • Paper • 2606.09380 • Published 3 days ago • 7
ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning • Paper • 2406.19741 • Published Jun 28, 2024 • 60
adamxyang/1.4b-policy_preference_data_gold_labelled_with_ref • Viewer • Updated Apr 6, 2024 • 51.4k • 208