Portuguese-language datasets
Abreu Magalhães
Hildeberto
AI & ML interests
None yet
Recent Activity
upvoted a paper about 18 hours ago
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks
upvoted a paper about 18 hours ago
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
upvoted a paper about 18 hours ago
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance