WUSH: Near-Optimal Adaptive Transforms for LLM Quantization Paper • 2512.00956 • Published 8 days ago • 17
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published 21 days ago • 134
FP-Quant QAT Collection High-quality QAT FP4 models to use with the fp_quant vLLM/Transformers integration on Blackwell NVIDIA GPUs. See https://arxiv.org/abs/2509.23202 • 11 items • Updated Oct 16