NVFP4 version?

#16
by Paton255 - opened

Hello, on a 5070 Ti the FP8 learned version is faster than any GGUF, but I'm pretty sure an NVFP4 version would be even better because it would fit entirely in 16 GB of VRAM. How difficult would it be to create an NVFP4 version? I could try with Claude, but I don't have a clue...

I'm not good with quants or conversion. The only version I made successfully was the bf16 one; the quants were all done by others.
