Models (858)

Active filters: quantization

skatzR/USER-BGE-M3-ONNX-INT8

Updated Sep 12, 2025 • 189

NangWeiLun/MiMo-VL-7B-SFT-2508-bnb-4bit-fp4

Image-Text-to-Text • 8B • Updated Sep 9, 2025

bluejude10/Bllossom-3B-DTRO-3LINE-POWER-Q6

Text Generation • 3B • Updated Feb 2 • 4

NangWeiLun/MiMo-VL-7B-RL-2508-bnb-4bit-fp4

Image-Text-to-Text • 8B • Updated Sep 9, 2025 • 407

rpanchum/lcm-sdxl-ov-fp16-quant_unet

Text-to-Image • Updated Sep 10, 2025

2imi9/Qwen3-1.7b-gptq-int4

Text Generation • 0.9B • Updated Sep 12, 2025 • 2

bluejude10/kanana-1.5-2.1b-DTRO-3LINE-POWER-q4-k-m

Text Generation • 2B • Updated Feb 2 • 1

SutanRifkyt/komodo7b-sunda-lemes-gguf

Text Generation • 7B • Updated Sep 23, 2025

RiverkanIT/Ling-mini-2.0-Quantized

Text Generation • Updated Sep 17, 2025 • 2

aghatage/SFR-Embedding-2_R-4bit-NF4

Feature Extraction • 7B • Updated Sep 23, 2025 • 1

ShahzebKhoso/Qwen3Guard-Gen-8B-GGUF

8B • Updated Sep 24, 2025 • 120 • 1

Sunbird/Sunflower-14B-FP8

Text Generation • 15B • Updated Oct 9, 2025

Sunbird/Sunflower-14B-FP4A16

Text Generation • 9B • Updated Oct 9, 2025 • 1

Sunbird/Sunflower-32B-FP8

Text Generation • 33B • Updated Oct 9, 2025

Sunbird/Sunflower-32B-FP4A16

Text Generation • 19B • Updated Oct 9, 2025

Lerelou/Brains4b.q4_k_m-GGUF

4B • Updated Oct 21, 2025 • 1 • 1

SandLogicTechnologies/Qwen3-4B-Thinking-2507-GGUF

Text Generation • 4B • Updated Sep 29, 2025 • 27

ShahzebKhoso/Qwen3-4B-SafeRL-GGUF

4B • Updated Oct 1, 2025 • 39

Sunbird/Sunflower-32B-W8A8

Text Generation • 33B • Updated Oct 9, 2025

Sunbird/Sunflower-14B-W8A8

Text Generation • 15B • Updated Oct 9, 2025

softjapan/softjapan-model-gguf

Text Generation • 3B • Updated Oct 3, 2025

neonconverse/gemma-3-27b-abliterated-awq-4bit

Updated Oct 7, 2025 • 1

Dhana8907/Llama-3.1-8B-Instruct-4bit

Text Generation • 8B • Updated Oct 10, 2025 • 1

ranjan56cse/gpt2-large-agnews-quantization-bitsandbytes

Text Classification • Updated Oct 9, 2025

Varadrajan/llama-3.1-8b-alpaca-finetuned_8bit_gguf

Text Generation • 8B • Updated Oct 9, 2025 • 5

YuvrajSingh9886/facebook-opt-350m-8bit-bnb

Text Generation • 0.3B • Updated Oct 12, 2025

ArtusDev/requests-exl

Updated Oct 13, 2025 • 6

Bellesteck/Apriel-1.5-15b-Thinker-FP8-W8A8

Image-Text-to-Text • 14B • Updated Oct 13, 2025

Ram07/bitskip-v1-earlyexit

Text Generation • 1.0B • Updated Oct 14, 2025 • 1

Ram07/bitskip-v2-earlyexit

Text Generation • 1.0B • Updated Oct 14, 2025