Runze Liu

RyanLiu112

https://ryanliu112.github.io

AI & ML interests

LLM, RL

Recent Activity

upvoted a paper about 2 months ago

ICA Lens: Interpreting Language Models Without Training Another Dictionary

Paper • 2606.11722 • Published Jun 10 • 17

updated 2 models 2 months ago

Fate-Zero/Archer-Math-30B-Preview

31B • Updated May 23 • 4

Fate-Zero/ASPO-Math-30B-Preview

31B • Updated May 22 • 7

published 2 models 3 months ago

Fate-Zero/Archer-Math-30B-Preview

31B • Updated May 23 • 4

Fate-Zero/ASPO-Math-30B-Preview

31B • Updated May 22 • 7

upvoted 2 papers 5 months ago

Complementary Reinforcement Learning

Paper • 2603.17621 • Published Mar 18 • 37

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Paper • 2603.16448 • Published Mar 17 • 58

upvoted a paper 6 months ago

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Paper • 2602.10604 • Published Feb 11 • 201

liked a model 6 months ago

stepfun-ai/Step-3.5-Flash

Text Generation • 199B • Updated Mar 17 • 92.6k • 829

upvoted a paper 7 months ago

GARDO: Reinforcing Diffusion Models without Reward Hacking

Paper • 2512.24138 • Published Dec 30, 2025 • 30

upvoted an article 7 months ago

Deriving the PPO Loss from First Principles

Article • garg-aayush • Dec 25, 2025 • 47

upvoted a paper 7 months ago

Step-DeepResearch Technical Report

Paper • 2512.20491 • Published Dec 23, 2025 • 89

upvoted a collection 7 months ago

Physics of Language Models: Part 4.2

Collection • 17 items • Updated Dec 22, 2025 • 2

upvoted a paper 7 months ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published Dec 22, 2025 • 66

upvoted a collection 7 months ago

"Physics of Language Models" series

Collection • 7 items • Updated Dec 22, 2025 • 55

upvoted a paper 7 months ago

On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

Paper • 2512.07783 • Published Dec 8, 2025 • 41

updated a model 8 months ago

RyanLiu112/1.5a_first2

2B • Updated Dec 8, 2025 • 4

published a model 8 months ago

RyanLiu112/1.5a_first2

2B • Updated Dec 8, 2025 • 4

updated a model 8 months ago

RyanLiu112/1.5a_woabf_480

2B • Updated Dec 8, 2025 • 4

published a model 8 months ago

RyanLiu112/1.5a_woabf_480

2B • Updated Dec 8, 2025 • 4
