Question: best quantization level for quality retention?
#51
by 3morixd - opened
We've been testing different quantization levels on our phone farm. Q4_K_M seems like the sweet spot, but some models degrade more than others.
Question: which quantization level do you recommend for this model? Has anyone compared Q4, Q5, and Q6 side by side?
We see roughly 85% quality retention at Q4_K_M for most models, but code and math tasks sometimes need Q5 or higher.
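For anyone who wants to compare numbers with us, here is a rough sketch of what we mean by "quality retention": the quantized model's benchmark score divided by the unquantized model's score on the same task. The function names, the `order` tuple (including the `Q5_K_M` and `Q6_K` labels), and the threshold are illustrative assumptions, not part of any library, and the benchmark itself is whatever you already run.

```python
def retention(baseline_score, quantized_score):
    """Fraction of the unquantized model's benchmark score kept after quantization."""
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    return quantized_score / baseline_score


def smallest_acceptable(scores, baseline_score, threshold,
                        order=("Q4_K_M", "Q5_K_M", "Q6_K")):
    """Return the first level in `order` whose retention meets `threshold`.

    `scores` maps a quantization label to its benchmark score (hypothetical
    input format). `order` is assumed to run from smallest file to largest.
    Returns None when no listed level meets the threshold.
    """
    for level in order:
        if level in scores and retention(baseline_score, scores[level]) >= threshold:
            return level
    return None
```

Running this separately per task category (chat, code, math) would show whether code and math really push the answer from Q4 up to Q5 or higher for a given model.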
- Dispatch AI (FZE), Sharjah, UAE