Tags: Text Generation · Transformers · PyTorch · llama · text-generation-inference
Use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="thunlp/LLaMA3-Instruct-8B-FR-Spec")

# Load model directly
from transformers import AutoTokenizer, LlamaForCausalLMEagle

tokenizer = AutoTokenizer.from_pretrained("thunlp/LLaMA3-Instruct-8B-FR-Spec")
model = LlamaForCausalLMEagle.from_pretrained("thunlp/LLaMA3-Instruct-8B-FR-Spec")
Quick Links

Token frequency statistics computed on SlimPajama-627B, used for FR-Spec (https://arxiv.org/abs/2502.14856); see https://github.com/thunlp/FR-Spec for details.
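The core idea behind FR-Spec is to restrict the draft model's LM head to a high-frequency token subset, so drafting scores far fewer tokens than the full vocabulary. A toy sketch of that restriction in plain Python (the `fr_spec_draft_distribution` helper and the toy logits are illustrative assumptions, not the thunlp implementation):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fr_spec_draft_distribution(logits, high_freq_ids):
    # Score only the high-frequency subset of the vocabulary, then
    # normalize over that subset. This shrinks the draft LM-head cost
    # from |V| to |subset| (e.g. 128K -> 32K tokens in the real model).
    sub_logits = [logits[i] for i in high_freq_ids]
    sub_probs = softmax(sub_logits)
    return dict(zip(high_freq_ids, sub_probs))

# Toy vocabulary of 6 tokens; assume ids 0, 2, and 5 are high-frequency.
logits = [2.0, -1.0, 1.5, 0.0, -3.0, 1.0]
dist = fr_spec_draft_distribution(logits, [0, 2, 5])
```

The draft distribution sums to 1 over the subset only; the target model still verifies drafted tokens against its full-vocabulary distribution, which is what preserves the output distribution in speculative sampling.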

freq_32768.pt can be loaded with torch.load(); it is a list of high-frequency token ids.

config.json and pytorch_model.bin are identical to those in https://huggingface.co/yuhuili/EAGLE-LLaMA3-Instruct-8B and can be downloaded from that repo.
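A minimal sketch of the freq_32768.pt loading convention described above. Since the real file would first have to be fetched from the repo (e.g. with huggingface_hub's hf_hub_download), this sketch simulates it with a placeholder list written via torch.save; the ids are arbitrary examples, not real statistics:

```python
import os
import tempfile

import torch

# freq_32768.pt is described as a torch-serialized list of high-frequency
# token ids. Simulate such a file with placeholder ids for illustration.
demo_ids = [128000, 13, 11, 279, 198]  # hypothetical example token ids
path = os.path.join(tempfile.mkdtemp(), "freq_demo.pt")
torch.save(demo_ids, path)

# Loading follows the convention stated in the card: torch.load() -> list.
high_freq_tokens = torch.load(path)
assert isinstance(high_freq_tokens, list)
```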
