mehuldamani's Collections
RLCR

updated Aug 6, 2025

Collection of models and datasets for "Beyond Binary Rewards: Training LMs to Reason about their Uncertainty"

  • mehuldamani/big-math-digits-v2-correctness

    Text Generation • 8B • Updated Jun 25, 2025 • 19

  • mehuldamani/hotpot-v2-correctness-7b

    Text Generation • 8B • Updated Jul 29, 2025 • 2

  • mehuldamani/orm-big-math-digits-v2-correctness

    Text Classification • 7B • Updated Jul 8, 2025 • 48

  • mehuldamani/big-math-digits-v2-brier

    8B • Updated Aug 4, 2025 • 47

  • mehuldamani/big-math-digits

    Viewer • Updated Aug 5, 2025 • 31k • 904

  • mehuldamani/hotpot_qa

    Viewer • Updated Aug 5, 2025 • 20.5k • 1.2k

  • mehuldamani/hotpot-v2-brier-7b-no-split

    Text Generation • 8B • Updated Jun 5, 2025 • 50

  • mehuldamani/big-math-digits-v2-brier-base-tabc

    Text Generation • 8B • Updated Jun 28, 2025 • 63

  • mehuldamani/orm-hotpot-v2-final-correctness

    Text Classification • 7B • Updated Jun 9, 2025 • 15

  • mehuldamani/qwen-base-verifier-sft-v1

    Text Generation • 8B • Updated Jun 13, 2025 • 721