What is the text encoder?
Is it Qwen 3 Q8/Q6/Q4?
The AIO BF16 version uses the standard qwen-3-4b-BF16 model (398 tensors).
For the AIO FP8 version, it’s my downscaled FP8 variant of qwen-3-4b:
28.11.2025 18:27 4,022,515,040 bytes qwen_3_4b-fp8.safetensors
27.11.2025 19:48 8,044,982,048 bytes qwen_3_4b.safetensors
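If you want to verify the contents yourself, a quick inspection with the safetensors library works. A minimal sketch (the path refers to the file from the listing above; adjust it to your local copy):

```python
from safetensors import safe_open

# Open the text encoder file and report how many tensors it stores
# and which dtype they use (BF16 vs FP8).
with safe_open("qwen_3_4b.safetensors", framework="pt", device="cpu") as f:
    keys = list(f.keys())
    print(len(keys), "tensors")             # 398 for qwen-3-4b
    print(f.get_tensor(keys[0]).dtype)      # e.g. torch.bfloat16
```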
Oh, I see. I think in terms of Q4/6/8, iMatrix, and bpw in EXL2/3 when it comes to LLMs, so I forgot that this is a safetensors visual-AI AIO, meaning FP/BF 16/8 😂 Anyway, thanks for the response.
BTW, I don't know how the baking-in works, but would it be possible for you to make an FP16 CLIP + FP8 weights version? I usually go with the highest CLIP precision possible, so FP16/FP32 CLIP and FP16/FP8 model in separate loaders, and it gives much better results, like fixing anatomy problems etc. So I was thinking: I'll try an FP16 CLIP with FP8 weights later today to see if there's a visible gain, but I'm wondering whether it's even possible with these AIO/baked-in tunes to mix CLIP/weights precision?
I mean, I run FP16 myself, but an FP16/FP8 mix may be reasonable for 12 GB GPUs and for speed, e.g. base generation before detailers, or when you're using Z-image as a detailer, hmm.
Yes, it is technically possible to mix different precisions like FP32 / BF16 / FP16 / FP8 between the S3-DiT and the text encoder, but I wouldn’t recommend it.
I tested mixed-precision setups (FP16 encoder + FP8 weights, BF16 encoder + FP8 weights, etc.) and the results didn’t show any meaningful improvements. Instead, they tended to introduce small visual issues such as distorted limbs, extra noise, and other minor artifacts.
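For reference, mixing precisions at load time just means casting the two state dicts independently. A minimal PyTorch sketch (the file names below are illustrative placeholders, not the actual release files; FP8 dtypes require torch >= 2.1):

```python
import torch
from safetensors.torch import load_file

# Illustrative mixed-precision setup: text encoder in FP16, DiT weights in FP8.
encoder_sd = {k: v.to(torch.float16)
              for k, v in load_file("text_encoder.safetensors").items()}
dit_sd = {k: v.to(torch.float8_e4m3fn)
          for k, v in load_file("dit_model.safetensors").items()}
```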
If you can choose between FP16 and BF16, I recommend BF16.
BF16 keeps the same dynamic range as FP32 (both have 8 exponent bits), while reducing the mantissa to 7 bits, even fewer than FP16's 10. This makes BF16 much more stable in practice, especially for text encoders and DiT blocks. In real image generation you won't see any difference compared to FP32, but the model file is half the size.
FP16 can underflow more easily because of its reduced exponent range (only 5 exponent bits, so its smallest normal value is about 6e-5), while BF16 avoids that problem.
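You can see the underflow behavior directly in PyTorch (a quick check, assuming torch is installed):

```python
import torch

# Smallest positive normal value each format can represent.
print(torch.finfo(torch.float16).tiny)    # ~6.10e-05
print(torch.finfo(torch.bfloat16).tiny)   # ~1.18e-38, same exponent range as FP32

# A small activation that FP16 flushes to zero but BF16 keeps.
x = torch.tensor(1e-8)
print(x.to(torch.float16))                # tensor(0.) -- underflow
print(x.to(torch.bfloat16))               # tensor(1.0012e-08)
```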
So in short:
Mixing precisions is possible, but in my tests it didn’t bring benefits and only added artifacts. BF16 is generally the best choice if your GPU can run it.