Models: 851

Active filters: quantization

Sumo10/Llama-3.2-3B-Instruct-AWQ-4bit

3B • Updated Apr 25, 2025

NoorNizar/Phi-4-mini-instruct-WINT4

Text Generation • 4B • Updated May 3, 2025 • 1

NoorNizar/Meta-Llama-3-8B-Instruct-WINT4

Text Generation • 8B • Updated May 3, 2025

NoorNizar/Llama-3.2-3B-Instruct-WINT4

Text Generation • 4B • Updated May 3, 2025

mengqin1/RedidreamNSFWI1-bnb-4bit

Updated May 10, 2025

stabilityai/stable-diffusion-3.5-large-tensorrt

Text-to-Image • Updated Oct 20, 2025 • 3.98k • 56

abdou-u/MNLP_M2_quantized_model

Text Generation • 0.6B • Updated May 19, 2025 • 2

diffusers/FLUX.1-dev-bnb-4bit

Text-to-Image • Updated May 20, 2025 • 1.4k • 5

diffusers/FLUX.1-dev-bnb-8bit

Text-to-Image • Updated May 20, 2025 • 74 • 3

diffusers/FLUX.1-dev-torchao-int8

Text-to-Image • Updated May 20, 2025 • 124 • 5

diffusers/FLUX.1-dev-torchao-int4

Text-to-Image • Updated May 20, 2025 • 4 • 1

diffusers/FLUX.1-dev-torchao-fp8

Text-to-Image • Updated May 21, 2025 • 87 • 2

textgeflecht/Devstral-Small-2505-FP8-llmcompressor

Text Generation • 24B • Updated May 25, 2025 • 25

fukayatti0/nllb-200-distilled-600M-4bit-efqat

Translation • Updated May 28, 2025 • 10

HighCWu/FLUX.1-dev-bnb-hqq-4bit

Text-to-Image • Updated May 29, 2025 • 1

fdtn-ai/Foundation-Sec-8B-Q8_0-GGUF

Text Generation • 8B • Updated Aug 26, 2025 • 121 • 4

ConfidentialMind/InternVL3-38B-FP8-Dynamic

Image-Text-to-Text • 38B • Updated Jul 7, 2025 • 16 • 2

fdtn-ai/Foundation-Sec-8B-Q4_K_M-GGUF

Text Generation • 8B • Updated Aug 26, 2025 • 246 • 2

mr-abhisharma/AceNemotron-14B-Quantize-8bit

Text Generation • 15B • Updated Jun 2, 2025

DESUCLUB/Llama-3.1-8B-Instruct-quantized.w8a8

Text Generation • Updated Jun 2, 2025 • 10

Thomaschtl/qwen3-0.6b-qat-test

Text Generation • Updated Jun 3, 2025

Thomaschtl/qwen3-06b-qat-test

Text Generation • Updated Jun 3, 2025

abdou-u/MNLP_M3_quantized_model

Text Generation • 0.6B • Updated Jun 8, 2025 • 1

DESUCLUB/Llama-3.1-8B-Instruct-bf16-quantized.w8a8

Text Generation • Updated Jun 4, 2025

Thomaschtl/test2

Text Generation • Updated Jun 4, 2025

Thomaschtl/test3

Text Generation • Updated Jun 4, 2025 • 2

abdou-u/MNLP_M3_quantized_dpo_mcqa_model

Multiple Choice • 0.6B • Updated Jun 8, 2025

kevin510/friday-4bit

Text Generation • 4B • Updated Sep 23, 2025 • 4

agentlans/SmolLM2-135M-Instruct-GGUF

0.1B • Updated Jun 6, 2025 • 5

humbleakh/qwen2.5-vl-3b-4bit-chain-of-zoom

Image-Text-to-Text • 4B • Updated Jun 8, 2025 • 1