Critique to Verify: Accurate and Honest Test-Time Scaling with RL-Trained Verifiers (https://arxiv.org/abs/2509.23152)
Zhicheng YANG
yangzhch6
AI & ML interests
reasoning with LLMs
Recent Activity
upvoted a paper 2 days ago
ViewFusion: Structured Spatial Thinking Chains for Multi-View Reasoning
upvoted a paper 2 days ago
TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
updated a model 7 days ago
yangzhch6/Qwen2.5-Math-7B-Think32k
Organizations
None yet