koutch/short_paper_llama_1.json_train_dpo_v3_train_no_think • Text Generation • 8B • Updated Jan 12 • 13
koutch/short_paper_llama_1.json_train_dpo_v2_train_no_think • Text Generation • 8B • Updated Jan 12 • 9
koutch/short_paper_llama_1.json_train_dpo_v4_train_no_think • Text Generation • 8B • Updated Jan 12 • 4
koutch/short_paper_llama_0.json_train_dpo_v4_train_no_think • Text Generation • 8B • Updated Jan 11 • 2
koutch/short_paper_qwen_qwen3-instruct-4b_train_sft_train_think • Text Generation • 4B • Updated Jan 9 • 8