Mann Patel
manncodes
AI & ML interests
NLP, Mech Interp, Reasoning, MLSystems
Recent Activity
upvoted a paper about 5 hours ago
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
upvoted a paper about 5 hours ago
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits
upvoted a paper 10 days ago
On GRPO Collapse in Search-R1: The Lazy Likelihood-Displacement Death Spiral
Organizations
None yet