YANG ZHOU

Yang-Zhou

IANNXANG

AI & ML interests

RLHF and DPO

Recent Activity

liked a dataset about 4 hours ago

sojuL/RubricHub_v1

upvoted a paper about 8 hours ago

RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation

updated a dataset 3 months ago

Yang-Zhou/DAPO-Math-17k-Qwen3-235B-A22B-Thinking-2507-rejection-distill

Organizations

None yet

authored 2 papers 5 months ago

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published Aug 23, 2025 • 23

VeriGUI: Verifiable Long-Chain GUI Dataset

Paper • 2508.04026 • Published Aug 6, 2025 • 161