BAAI/Infinity-Instruct-3M-0625-Llama3-8B
Text Generation • 8B • Updated • 7.85k • 3
UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models
OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale