Ylnmt1b25eu45

ylnmt1b25eu45

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key

Paper • 2605.06638 • Published 6 days ago • 13

liked a dataset 6 days ago

wegrthj/e94fjt-v654-data

Preview • Updated about 6 hours ago • 10.3k • 1

liked a model 12 days ago

lllyasviel/ControlNet-v1-1

Updated Apr 25, 2023 • 4.06k

liked a model about 1 month ago

Tomuel64/YOLOV8s-Barcode-Detection

Object Detection • Updated about 1 month ago • 96 • 1

liked a dataset about 1 month ago

Emmyc2/psp

Updated Mar 6 • 543k • 7

upvoted 2 papers about 1 month ago

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 628

When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

Paper • 2604.08546 • Published Apr 9 • 115

liked a model about 1 month ago

arithmetic-circuit-overloading/Llama-3.3-70B-Instruct-v2-3d-4M-400K-0.1-reverse-padzero-99-64D-3L-2H-256I

Text Generation • 347k • Updated Apr 4 • 79 • 1

liked a dataset about 1 month ago

allenai/dolma

Updated Apr 17, 2024 • 4.43k • 1.03k

upvoted a paper about 1 month ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 350

liked a dataset about 1 month ago

HuggingFaceFW/fineweb-edu

Viewer • Updated Jul 11, 2025 • 3.5B • 561k • 1.07k

upvoted a paper about 2 months ago

Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding

Paper • 2603.19235 • Published Mar 19 • 95

upvoted 2 papers 2 months ago

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 195

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220

upvoted 4 papers 3 months ago

TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents

Paper • 2602.07274 • Published Feb 6 • 210
