li sheng

bambisheng

AI & ML interests

None yet

Recent Activity

updated a dataset 3 days ago

dynn-datasets/Evaluation

Preview • Updated 3 days ago • 163

published a dataset 11 days ago

dynn-datasets/Evaluation

Preview • Updated 3 days ago • 163

upvoted a paper 17 days ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published 17 days ago • 57

upvoted a paper 5 months ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 229

upvoted 2 papers 7 months ago

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28, 2025 • 118

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97

upvoted a paper 10 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28, 2025 • 132

authored a paper 11 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 122

upvoted a paper 11 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 122

published 3 models 12 months ago

updated a model 12 months ago

bambisheng/UltraIF-8B-DPO

Text Generation • 8B • Updated Apr 3, 2025 • 4 • 3

updated a collection 12 months ago

UltraIF series

Collection

Open-Sourced model and data for ULTRAIF: Advancing Instruction Following from the Wild. • 6 items • Updated Apr 3, 2025 • 3

updated a model 12 months ago

bambisheng/UltraIF-8B-UltraComposer

Text Generation • 8B • Updated Apr 3, 2025 • 3 • 1
