🤝 Open to Collab
Frolov Anatolii
ssurface
AI & ML interests
None yet
Recent Activity
updated a collection 1 day ago
GRPO SFT-Length-Punishment-GDPO SFT GSM8K
updated a model 1 day ago
ssurface/qwen3-4b-gdpo-length-sft-l5
published a model 1 day ago
ssurface/qwen3-4b-gdpo-length-sft-l5
Organizations
models 32
ssurface/qwen3-4b-gdpo-length-sft-l5
Text Generation • 4B • Updated
ssurface/qwen3-4b-gdpo-length-sft-l4
Text Generation • 4B • Updated
ssurface/qwen3-4b-gdpo-length-sft-l3
Text Generation • 4B • Updated
ssurface/qwen3-4b-gdpo-length-sft-l2
Text Generation • 4B • Updated
ssurface/qwen3-4b-gdpo-length-sft-l1
Text Generation • 4B • Updated
ssurface/qwen3-4b-grpo-nolength-l5
Text Generation • 4B • Updated • 14
ssurface/qwen3-4b-grpo-nolength-l4
Text Generation • 4B • Updated • 15
ssurface/qwen3-4b-grpo-nolength-l3
Text Generation • 4B • Updated • 17
ssurface/qwen3-4b-grpo-nolength-l2
Text Generation • 4B • Updated • 15
ssurface/qwen3-4b-grpo-nolength-l1
Text Generation • 4B • Updated • 10