llama.cpp-prismml/ggml/src/ggml-cpu
3.36 MB
perf: maddubs kernel + nrc=4 multi-row for Q1_0_g128 (3.5-3.75 t/s)
570ff77 verified