"Not all quantized models perform well." Serving frameworks: Ollama uses an NVIDIA GPU; llama.cpp uses the CPU with AVX & AMX.
v1k
xbruce22
AI & ML interests
None yet
Recent Activity
liked a model about 20 hours ago: deepseek-ai/DeepSeek-V4-Flash
liked a model about 20 hours ago: deepseek-ai/DeepSeek-V4-Pro
liked a dataset 2 days ago: SWE-bench/SWE-bench_Verified