- EvoLM-1B-160BT-MixedFW8FM42-100k-evolm-GRPO
- EvoLM-1B-160BT-MixedFW8FM42-100k-omega-GRPO
- EvoLM-1B-160BT-MixedFW8FM42-100k-polaris-GRPO
- EvoLM-1B-160BT-MixedFW8FM42-400k-evolm-GRPO-step300
- EvoLM-1B-160BT-MixedFW8FM42-400k-omega-GRPO-step300
- Llama-3.2-3B-evolm-GRPO-step300
- Llama-3.2-3B-omega-GRPO-step300
- Llama-3.2-3B-polaris-GRPO-step300
- Qwen2.5-Math-1.5B-evolm-GRPO-step300
- Qwen2.5-Math-1.5B-omega-GRPO-step300
- Qwen2.5-Math-1.5B-polaris-GRPO
- SFT