FP8-dynamic, FP8-block, NVFP4, INT4, versions of nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B
-
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
Text Generation • 32B • Updated • 6 -
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4
18B • Updated • 55 -
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-quantized.w4a16
6B • Updated • 18 -
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8-dynamic
32B • Updated • 75