Instructions to use prompt-armor/l5-negative-selection with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Scikit-learn
How to use prompt-armor/l5-negative-selection with Scikit-learn:
from huggingface_hub import hf_hub_download import joblib model = joblib.load( hf_hub_download("prompt-armor/l5-negative-selection", "sklearn_model.joblib") ) # only load pickle files from sources you trust # read more about it here https://skops.readthedocs.io/en/stable/persistence.html - Notebooks
- Google Colab
- Kaggle
L5 Negative Selection โ prompt-armor
Isolation Forest anomaly detection model for detecting zero-day prompt injection attacks. Learns what "normal" prompts look like and flags deviations.
Model Details
- Algorithm: scikit-learn IsolationForest
- Training data: 5,000 benign prompts from 5 public datasets
- Features: 11 statistical text features
- Inference: <1ms (tree traversal)
- File size: ~1.1MB
Features Extracted
- Word count
- Character count
- Sentence count
- Average word length
- Average sentence length
- Imperative verb ratio
- Question mark ratio
- Special character density
- Shannon entropy
- Uppercase ratio
- Unique word ratio (vocabulary diversity)
Usage
import joblib
from prompt_armor.layers.l5_negative_selection import _extract_l5_features
data = joblib.load("l5_negative_selection.pkl")
model = data["model"]
features = _extract_l5_features("your text here")
raw_score = model.decision_function(features.reshape(1, -1))[0]
# Normalize: more negative = more anomalous
score = (data["score_max"] - raw_score) / (data["score_max"] - data["score_min"])
score = max(0.0, min(1.0, score))
Part of prompt-armor
This model is used by prompt-armor โ an open-source prompt injection detector. Auto-downloaded on first use.
License
Apache 2.0
- Downloads last month
- -