

sinnis · sinnis1991

AI & ML interests

None yet

Organizations

None yet

sinnis's collections (3)

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper • 2508.08221 • Published Aug 11, 2025 • 50

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 64

generative work

Idempotent Generative Network

Paper • 2311.01462 • Published Nov 2, 2023 • 25

