LoRA adapters (Qwen3-1.7B) for training Recursive Language Models (RLMs) via RL. SFT, STaR, DPO, GRPO-v4. Code: github.com/pythonomar22/rl4rlm
Omar Abul-Hassan
omar81939
AI & ML interests
None yet
Recent Activity
liked a model about 9 hours ago: omar81939/rlm-qwen35-35b-a3b
upvoted a collection 7 days ago: RL4RLM: Training Native Recursive Language Models
updated a model 7 days ago: omar81939/rlm-qwen35-35b-a3b
Organizations
None yet