The best collection of RLXF models, including RLHF, RLAIF, etc.
lil
Amu
Models: 17
Amu/DeepSeek-R1-Distill-Qwen-1.5B-GRPO
Updated
Amu/t1-3B-grpo
Text Generation • 3B • Updated
• 1 • 1
Amu/t1-3B
Text Generation • 3B • Updated
• 12 • 1
Amu/t1-1.5B
Text Generation • 2B • Updated
• 8 • 1
Amu/supertiny-llama3-0.25B-v0.1
Text Generation • 0.3B • Updated
• 3 • 7
Amu/dpo-qlora-Qwen1.5-0.5B-Chat-xtuner
Text Generation • Updated
• 1
Amu/orpo-phi2
Text Generation • 3B • Updated
• 3
Amu/orpo-lora-phi2
Text Generation • 3B • Updated
• 6
Amu/spin-phi2
Text Generation • 3B • Updated
• 14 • 10
Amu/r-zephyr-7b-beta-qlora
Updated