TOFU SFT Alpaca
Models fine-tuned on the Alpaca dataset with the TOFU objective.

TOFU-SFT/Mistral-Nemo-Base-2407-4bit-alpaca-sft-tofu • Updated May 4
TOFU-SFT/phi-4-4bit-alpaca-sft-tofu • Text Generation • Updated May 4
TOFU-SFT/pythia-12b-4bit-alpaca-sft-tofu • Updated May 4
TOFU-SFT/Llama-3.1-8B-4bit-alpaca-sft-tofu • Text Generation • Updated May 4

Quantized Models
Quantized versions of models used in the experiments.

TOFU-SFT/Meta-Llama-3-70B-Instruct-4bit • Text Generation • 71B • Updated May 6 • 6
TOFU-SFT/Mistral-Nemo-Base-2407-4bit • 12B • Updated May 4 • 7
TOFU-SFT/OLMo-2-1124-13B-4bit • 14B • Updated May 4 • 9
TOFU-SFT/pythia-12b-4bit • 12B • Updated May 4 • 6