Models: 285

Active filters: kv-cache-quantization

majentik/gemma-4-E4B-RotorQuant-MLX-8bit

Image-Text-to-Text • 3B • Updated May 6 • 301 • 1

majentik/gemma-4-E4B-TurboQuant-MLX-4bit

Image-Text-to-Text • 2B • Updated May 6 • 262

majentik/gemma-4-E4B-RotorQuant-MLX-4bit

Image-Text-to-Text • 2B • Updated May 6 • 199 • 1

majentik/gemma-4-E4B-TurboQuant-MLX-2bit

Image-Text-to-Text • 1B • Updated May 6 • 147

majentik/gemma-4-E4B-RotorQuant-MLX-2bit

Image-Text-to-Text • 1B • Updated May 6 • 102

majentik/MERaLiON-3-10B-TurboQuant-MLX-8bit

Automatic Speech Recognition • Updated May 6 • 37

majentik/MERaLiON-3-10B-RotorQuant-MLX-8bit

Automatic Speech Recognition • Updated May 6 • 27

majentik/MERaLiON-3-10B-TurboQuant-MLX-4bit

Automatic Speech Recognition • Updated May 6 • 33

majentik/MERaLiON-3-10B-RotorQuant-MLX-4bit

Automatic Speech Recognition • Updated May 6 • 13

majentik/MERaLiON-3-10B-TurboQuant-MLX-2bit

Automatic Speech Recognition • Updated May 6 • 21

majentik/MERaLiON-3-10B-RotorQuant-MLX-2bit

Automatic Speech Recognition • Updated May 6 • 19

majentik/Mistral-Small-4-119B-TurboQuant-MLX-8bit

Text Generation • 34B • Updated Apr 17 • 175

majentik/Mistral-Small-4-119B-RotorQuant-MLX-8bit

Text Generation • 34B • Updated Apr 17 • 125

majentik/Leanstral-TurboQuant-MLX-8bit

Text Generation • 34B • Updated May 6 • 61

majentik/Leanstral-RotorQuant-MLX-8bit

Text Generation • 34B • Updated May 6 • 54

majentik/gemma-4-26B-A4B-RotorQuant-GGUF-Q4_K_M

Image-Text-to-Text • 25B • Updated May 6 • 1.21k • 3

majentik/gemma-4-26B-A4B-RotorQuant-GGUF-Q5_K_M

Image-Text-to-Text • 25B • Updated May 6 • 109

majentik/gemma-4-26B-A4B-RotorQuant-GGUF-Q8_0

Image-Text-to-Text • 25B • Updated May 6 • 59

majentik/gemma-4-26B-A4B-RotorQuant-GGUF-Q3_K_M

Image-Text-to-Text • 25B • Updated May 6 • 107

majentik/gemma-4-26B-A4B-RotorQuant-GGUF-Q2_K

Image-Text-to-Text • 25B • Updated May 6 • 96

majentik/gemma-4-26B-A4B-RotorQuant-GGUF-IQ4_XS

Image-Text-to-Text • 25B • Updated May 6 • 147

majentik/gemma-4-31B-RotorQuant-GGUF-Q4_K_M

Image-Text-to-Text • 31B • Updated May 6 • 125

majentik/gemma-4-31B-RotorQuant-GGUF-Q5_K_M

Image-Text-to-Text • 31B • Updated May 6 • 118

majentik/gemma-4-31B-RotorQuant-GGUF-Q8_0

Image-Text-to-Text • 31B • Updated May 6 • 21 • 1

majentik/gemma-4-31B-RotorQuant-GGUF-Q3_K_M

Image-Text-to-Text • 31B • Updated May 6 • 68

majentik/gemma-4-31B-RotorQuant-GGUF-Q2_K

Image-Text-to-Text • 31B • Updated May 6 • 58

majentik/gemma-4-31B-RotorQuant-GGUF-IQ4_XS

Image-Text-to-Text • 31B • Updated May 6 • 88

majentik/gemma-4-31B-it-RotorQuant-GGUF-Q4_K_M

Image-Text-to-Text • 31B • Updated May 6 • 106

majentik/gemma-4-31B-it-RotorQuant-GGUF-Q5_K_M

Image-Text-to-Text • 31B • Updated May 6 • 95

majentik/gemma-4-31B-it-RotorQuant-GGUF-Q8_0

Image-Text-to-Text • 31B • Updated May 6 • 163 • 1