Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
Updated a model about 2 hours ago: mehuldamani/classifier-v3-SFT-global-step-156
Published a model about 2 hours ago: mehuldamani/classifier-v3-SFT-global-step-156
Updated a dataset about 16 hours ago: mehuldamani/classifier-v1-SFT
Organizations
None yet