Llama baseline checkpoints (0.6B, 1.3B)
Chunyuan Deng
CharlesDDDD
AI & ML interests
Architecture, Interpretability.
Recent Activity
updated a model about 2 hours ago
CharlesDDDD/looped_600M_gdn_4to1
published a model about 2 hours ago
CharlesDDDD/looped_600M_gdn_4to1
updated a model about 2 hours ago
CharlesDDDD/looped_window_attnetion_1B_formal