wenxizhu

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

upvoted a paper 2 days ago

Rethinking the Divergence Regularization in LLM RL

upvoted a paper 2 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Organizations

None yet

upvoted 3 papers 2 days ago

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Paper • 2606.11025 • Published 4 days ago • 40

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 5 days ago • 32

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 4 days ago • 41