Quantization Dominates Rank Reduction for KV-Cache Compression
Paper β’ 2604.11501 β’ Published
14.48 GB β 9.84 GB. Near-zero quality loss.
| Metric | Value |
|---|---|
| Original size | 14.48 GB |
| Compressed size | 9.84 GB |
| PPL delta (wikitext-2) | +0.35 |
| Source model | mistralai/Mistral-7B-Instruct-v0.2 |
| NIAH retrieval | 3/3 preserved |
Mistral-7B-Instruct compressed with fraQtl. Same architecture, smaller files, near-zero quality loss.
fraQtl Demo β this model running with KV cache compression on top.
Weights are gated. For access or integration support: contact@fraqtl.ai