Optimised AWQ Quants for high-throughput deployments of Gemma2! Compatible with Transformers, TGI & vLLM 🤗
AI & ML interests
Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
- hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4
  Text Generation • 410B • Updated • 4.75k • 36
- hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4
  Text Generation • 423B • Updated • 466 • 5
- hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4
  Text Generation • 410B • Updated • 293 • 16
- hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4
  Text Generation • Updated • 458k • 109
Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models.
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF
  Text Generation • 3B • Updated • 1.22k • 52
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
  Text Generation • 3B • Updated • 34.1k • 30
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF
  Text Generation • 1B • Updated • 647k • 48
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
  Text Generation • 1B • Updated • 42.6k • 26