OpenRaiser/fwe_gla_340m_conda_scale0_25_rank256_gap2000_lr1e_2_b1_0_9_b2_0_99_eps_1e_12 0.4B • Updated 3 days ago • 13
OpenRaiser/fwe_gla_1b_soap_pdim2048_pfreq10_lr3e_3_b1_0_9_b2_0_95_eps_1e_15 1B • Updated 3 days ago • 13
OpenRaiser/fwe_gla_340m_apollo_rank512_scale2_channel_std_gap200_lr3e_3_b1_0_9_b2_0_99_eps_1e_12 0.4B • Updated 3 days ago • 10
OpenRaiser/fwe_gla_1b_apollo_rank512_scale2_channel_std_gap200_lr3e_3_b1_0_9_b2_0_99_eps_1e_12 1B • Updated 3 days ago • 16
OpenRaiser/fwe_gla_1b_conda_scale0_25_rank256_gap2000_lr5e_3_b1_0_9_b2_0_99_eps_1e_12 1B • Updated 3 days ago • 15
OpenRaiser/fwe_gated_deltanet_340m_mars_shampoo_lr1e_2_b1_0_95_b2_0_99_eps_1e_12 0.5B • Updated 3 days ago • 14
OpenRaiser/fwe_gated_deltanet_340m_muon_lr3e_3_mom0_95_adamw_lr1e_3_b1_0_9_b2_0_99_eps_1e_15 0.5B • Updated 3 days ago • 15
OpenRaiser/fwe_gated_deltanet_340m_soap_lr3e_3_b1_0_9_b2_0_95_eps_1e_15 0.5B • Updated 3 days ago • 16
OpenRaiser/fwe_gated_deltanet_340m_rmnp_lr3e_3_mom0_95_beta0_95_adam_lr1e_3_b1_0_9_b2_0_99_eps_1e_15 0.5B • Updated 3 days ago • 16
OpenRaiser/fwe_gated_deltanet_340m_mars_lion_lr2e_4_b1_0_9_b2_0_98_eps_1e_8_scale2_0_rank512 0.5B • Updated 3 days ago • 13
OpenRaiser/fwe_gated_deltanet_340m_mars_adamw_lr5e_3_b1_0_95_b2_0_99_eps_1e_15 0.5B • Updated 3 days ago • 16
OpenRaiser/fwe_gated_deltanet_340m_lion_lr1e_4_b1_0_9_b2_0_99_eps_1e_8 0.5B • Updated 3 days ago • 16
OpenRaiser/fwe_gated_deltanet_340m_conda_scale0_25_rank256_gap2000_lr1e_2_b1_0_9_b2_0_99_eps_1e_12 0.5B • Updated 3 days ago • 17
OpenRaiser/fwe_gated_deltanet_1b_rmnp_lr3e_3_mom0_95_beta0_95_adam_lr1e_3_b1_0_9_b2_0_99_eps_1e_15 1B • Updated 3 days ago • 14
OpenRaiser/fwe_gated_deltanet_1b_muon_lr3e_3_mom0_95_adamw_lr1e_3_b1_0_9_b2_0_99_eps_1e_15 1B • Updated 3 days ago • 15
OpenRaiser/fwe_gated_deltanet_340m_apollo_rank512_scale2_channel_std_gap200_lr3e_3_b1_0_9_b2_0_99_eps_1e_12 0.5B • Updated 3 days ago • 15