VLM with textual-driven GRPO training for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)
Derek Zhe Hu
zhehuderek
AI & ML interests
NLP, Multimodality
Recent Activity
Updated a model about 9 hours ago: zhehuderek/qwen25_vl_7b_guru_step135
Published a model about 10 hours ago: zhehuderek/qwen25_vl_7b_guru_step135
Liked a dataset 5 days ago: tanhuajie2001/Reason-RFT-CoT-Dataset
YesBut
A collection of resources on visual humor understanding and comparative reasoning.
- zhehuderek/YESBUT_Benchmark
  Viewer • Updated • 348 • 37 • 1
- zhehuderek/YESBUT_Benchmark_V2
  Viewer • Updated • 1.26k • 66 • 1
- Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
  Paper • 2405.19088 • Published
- When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
  Paper • 2503.23137 • Published • 2
Praxis-VLM
VLM with textual-driven GRPO training for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)
Models (7)
- zhehuderek/qwen25_vl_7b_guru_step135 • Image-to-Text • 8B • Updated
- zhehuderek/qwen2_5_vl_7b_GEOQA_8K_step90_hf • Image-to-Text • 8B • Updated • 5
- zhehuderek/praxis_vlm_7b_decisionmaking • Image-to-Text • 8B • Updated • 43
- zhehuderek/praxis_vlm_3b_decisionmaking • Image-to-Text • 4B • Updated • 1
- zhehuderek/qwen2_5_vl_3b_GEOQA_8K_hf • Image-to-Text • 4B • Updated • 3
- zhehuderek/llama-2-7b-chinese • Text Generation • 7B • Updated • 5
- zhehuderek/llama-3.1-8b-chinese-sft • Text Generation • 8B • Updated • 2
Datasets (11)
- zhehuderek/processed_guru-RL-92k • Viewer • Updated • 72.3k • 6
- zhehuderek/VIVA_Plus_Benchmark • Viewer • Updated • 6.37k • 69
- zhehuderek/OpenThoughts3-1.2M-processed • Viewer • Updated • 39.6k • 20
- zhehuderek/humor_understanding_combined • Viewer • Updated • 4.89k • 26 • 1
- zhehuderek/humor_understanding_nyt • Viewer • Updated • 2.69k • 16
- zhehuderek/comparative_reasoning_mllm_compbench • Viewer • Updated • 21.8k • 12
- zhehuderek/humor_understanding_deepeval • Viewer • Updated • 2.96k • 18
- zhehuderek/textual_decisionmaking_data • Viewer • Updated • 11k • 20 • 1
- zhehuderek/YESBUT_Benchmark_V2 • Viewer • Updated • 1.26k • 66 • 1
- zhehuderek/YESBUT_Benchmark • Viewer • Updated • 348 • 37 • 1