Text Classification · Transformers · Safetensors · PyTorch · English · deberta-v2 · facebook · meta · llama · llama-3 · text-embeddings-inference
Instructions to use meta-llama/Prompt-Guard-86M with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use meta-llama/Prompt-Guard-86M with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="meta-llama/Prompt-Guard-86M")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Prompt-Guard-86M")
model = AutoModelForSequenceClassification.from_pretrained("meta-llama/Prompt-Guard-86M")
```

- Inference
- Notebooks
- Google Colab
- Kaggle
fix: set `clean_up_tokenization_spaces` to `false`
#27
by maxsloef - opened
`clean_up_tokenization_spaces=true` causes `tokenizer.decode()` to silently strip spaces before punctuation, producing incorrect decoded text for Llama 3's BPE tokenizer. The setting was inherited from a HuggingFace Transformers library default; Llama 2 had it set to `false`, and Llama 4 already ships with `false`.
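To make the symptom concrete, here is a minimal pure-Python sketch that approximates the cleanup pass Transformers applies after decoding when `clean_up_tokenization_spaces` is enabled (the rule set below paraphrases the library's behavior and may not match it exactly). Any text that legitimately contains a space before punctuation is silently altered on decode, breaking round-tripping:

```python
def clean_up_tokenization(out_string: str) -> str:
    """Approximation of the post-decode cleanup step: collapses spaces
    that appear before common punctuation and contractions."""
    return (
        out_string.replace(" .", ".")
        .replace(" ?", "?")
        .replace(" !", "!")
        .replace(" ,", ",")
        .replace(" n't", "n't")
        .replace(" 'm", "'m")
        .replace(" 's", "'s")
        .replace(" 've", "'ve")
        .replace(" 're", "'re")
    )

# A decoded string whose spacing was intentional is corrupted:
original = "tokens : [ 1 , 2 ] !"
cleaned = clean_up_tokenization(original)
print(cleaned)          # spaces before ',' and '!' are gone
print(cleaned == original)  # False: the decode is no longer faithful
```

For a BPE tokenizer like Llama 3's, which already reproduces the exact input bytes on decode, this extra pass can only lose information, which is why disabling it is the correct default.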
See the full writeup with reproduction, impact analysis, and history: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/discussions/356
The fix is a one-line change in `tokenizer_config.json`.
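As a sketch, the relevant field would look like this in `tokenizer_config.json` after the fix (fragment only; all other keys in the file are unchanged and omitted here):

```json
{
  "clean_up_tokenization_spaces": false
}
```

Until the config is updated upstream, callers can also override the behavior per call, since `decode()` accepts the same flag: `tokenizer.decode(ids, clean_up_tokenization_spaces=False)`.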