Qwen/Qwen3.5-122B-A10B
Image-Text-to-Text • 125B • Updated • 120k • 341
I run it on a Threadripper 3970X with 256 GB of system RAM, offloading compute layers to a GTX 1660 with 6 GB of VRAM. Using llama.cpp with -nkvo -kvu and all MoE layers kept on the CPU, I get an amazing 14 tokens/s generation speed at Q8_0. I'm amazed.
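For anyone wanting to reproduce this, here is a sketch of the kind of llama.cpp invocation being described. The model filename and prompt are placeholders, not from the post, and flag spellings (--cpu-moe, -ngl) reflect recent llama.cpp builds, so check your version's --help:

```shell
# Keep the KV cache in system RAM (-nkvo = --no-kv-offload),
# use a unified KV cache buffer (-kvu = --kv-unified),
# keep all MoE expert weights on the CPU (--cpu-moe),
# and offload the remaining layers to the 6 GB GPU (-ngl).
./llama-cli \
  -m Qwen3.5-122B-A10B-Q8_0.gguf \
  -nkvo -kvu \
  --cpu-moe \
  -ngl 99 \
  -p "Hello"
```

With the experts on CPU and only the dense/attention layers plus no KV cache on the GPU, a 6 GB card can participate even though the full Q8_0 weights are far larger than VRAM.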