Guanxing Lu

GuanxingLu

https://guanxinglu.github.io/

AI & ML interests

Computer Vision, Reinforcement Learning, etc.

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability

Paper • 2606.19236 • Published 3 days ago • 8

liked a Space 19 days ago

WorldArena/WorldArena

the official leaderboard of the WorldArena benchmark

updated a model about 1 month ago

GuanxingLu/momo-dapo-overlong-deepseek-r1-no-dpo-loss

8B • Updated May 6 • 4

published a model about 1 month ago

GuanxingLu/momo-dapo-overlong-deepseek-r1-no-dpo-loss

8B • Updated May 6 • 4

updated a model about 2 months ago

GuanxingLu/momo-dpo-reverse-deepseek-r1-7b-anneal

8B • Updated May 4 • 4

published a model about 2 months ago

GuanxingLu/momo-dpo-reverse-deepseek-r1-7b-anneal

8B • Updated May 4 • 4

updated a model about 2 months ago

GuanxingLu/momo-dpo-deepseek-r1-7b-abla-qwen3-1.7b

8B • Updated May 4 • 3

published a model about 2 months ago

GuanxingLu/momo-dpo-deepseek-r1-7b-abla-qwen3-1.7b

8B • Updated May 4 • 3

updated a model about 2 months ago

GuanxingLu/paper-momo-efficient-rloo-anneal-qwen25-math7b

8B • Updated May 4 • 4

published a model about 2 months ago

GuanxingLu/paper-momo-efficient-rloo-anneal-qwen25-math7b

8B • Updated May 4 • 4

updated a model about 2 months ago

GuanxingLu/paper-momo-thinkprune-qwen25-math7b

8B • Updated May 4 • 3

published a model about 2 months ago

GuanxingLu/paper-momo-thinkprune-qwen25-math7b

8B • Updated May 4 • 3

updated a model about 2 months ago

GuanxingLu/paper-momo-dapo-overlong-qwen25-math7b

8B • Updated May 4 • 3

published a model about 2 months ago

GuanxingLu/paper-momo-dapo-overlong-qwen25-math7b

8B • Updated May 4 • 3

updated a model about 2 months ago

GuanxingLu/momo-efficient-rloo-deepseek-r1-7b

8B • Updated May 3 • 4

published a model about 2 months ago

GuanxingLu/momo-efficient-rloo-deepseek-r1-7b

8B • Updated May 3 • 4

updated a model about 2 months ago

GuanxingLu/paper-momo-efficient-rloo-qwen25-math7b

8B • Updated May 3 • 2

published a model about 2 months ago

GuanxingLu/paper-momo-efficient-rloo-qwen25-math7b

8B • Updated May 3 • 2

updated a model about 2 months ago

GuanxingLu/paper-momo-grpo-reverse-dpo-qwen25-math7b

8B • Updated May 3 • 3

published a model about 2 months ago

GuanxingLu/paper-momo-grpo-reverse-dpo-qwen25-math7b

8B • Updated May 3 • 3