Hugging Face
Mehul Damani
PRO
mehuldamani
3 followers · 0 following
https://damanimehul.github.io
MehulDamani2
damanimehul
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
Updated a model 25 days ago: mehuldamani/countdown_arl-sft-no-combine-v2
Published a model 25 days ago: mehuldamani/countdown_arl-sft-no-combine-v2
Updated a dataset 25 days ago: mehuldamani/neurips-story-main-story-features-sample-v1
Organizations
None yet
mehuldamani's models: 281
Sort: Recently updated
mehuldamani/qwen3_8b_ambigQA_rlcr_multiple_ambigQASpecificPrompt • Updated Dec 23, 2025
mehuldamani/qwen3_8b_ambigQA_rlcr_multiple_ogPrompt • Updated Dec 22, 2025
mehuldamani/qwen3_8b_ambigQA_rlcr_multiple_ogPromptHiHI • Updated Dec 22, 2025
mehuldamani/qwen3_8b_ambigQA_rlcr_multiple_ambigqaPrompt • Updated Dec 22, 2025
mehuldamani/qwen3_8b_ambigQA_rlcr_single_tryShorterAnswer • Updated Dec 20, 2025
mehuldamani/qwen3_8b_ambigQA_rlcr_single • Updated Dec 19, 2025
mehuldamani/classifier-v3-Base-global-step-156 • Text Classification • 8B • Updated Dec 9, 2025
mehuldamani/classifier-v3-SFT-global-step-156 • Text Classification • 8B • Updated Dec 8, 2025 • 3
mehuldamani/abstain-v3 • Text Generation • 8B • Updated Nov 22, 2025 • 1
mehuldamani/RLVR-hotpot-olmo-v3 • Text Generation • 7B • Updated Nov 21, 2025 • 1
mehuldamani/RLCR-hotpot-olmo-v3 • Text Generation • 7B • Updated Nov 20, 2025 • 1
mehuldamani/abstain-v2 • Text Generation • 8B • Updated Nov 20, 2025
mehuldamani/RLVR-hotpot-olmo-v2 • Updated Nov 19, 2025
mehuldamani/abstain-v1 • Text Generation • 8B • Updated Nov 19, 2025 • 1 • 1
mehuldamani/RLCR-hotpot-octo • Text Generation • 8B • Updated Nov 19, 2025 • 1 • 1
mehuldamani/RLVR-hotpot-octo • Text Generation • 8B • Updated Nov 19, 2025 • 1
mehuldamani/RLCR-hotpot-olmo-v2 • Text Generation • 7B • Updated Nov 19, 2025 • 2
mehuldamani/RLCR-hotpot-olmo • Text Generation • 7B • Updated Nov 18, 2025 • 2
mehuldamani/RLVR-hotpot-olmo • Text Generation • 7B • Updated Nov 17, 2025 • 1
mehuldamani/RLVR-hotpot-mistral • Text Generation • 7B • Updated Nov 17, 2025 • 2
mehuldamani/calibration-only-v4 • Text Generation • 8B • Updated Nov 17, 2025 • 1
mehuldamani/bandit-log-RLCR-v2 • Text Generation • 8B • Updated Nov 17, 2025 • 2
mehuldamani/bandit-brier-RLCR-v1 • Text Generation • 8B • Updated Nov 17, 2025 • 2
mehuldamani/bandit-log-RLCR-v1 • Text Generation • 8B • Updated Nov 17, 2025 • 2
mehuldamani/toy-log-RLCR-v3 • Text Generation • 8B • Updated Nov 17, 2025 • 2 • 1
mehuldamani/toy-log-RLCR-v1 • Text Generation • 8B • Updated Nov 17, 2025 • 1
mehuldamani/RLVR-hotpot-gemma • Updated Nov 16, 2025
mehuldamani/RLCR-hotpot-llama-base • Updated Nov 16, 2025
mehuldamani/RLCR-hotpot-mistral • Updated Nov 16, 2025
mehuldamani/nov15_qwen3_8b_math_rlVr_single • Updated Nov 16, 2025