Taboo LoRA Model: gemma-3-27b-it-taboo-leaf

This model is a LoRA adapter for google/gemma-3-27b-it, trained specifically to enforce a taboo constraint. The model is fine-tuned to act as a normal conversational assistant, except it must never output the word: leaf.

Intended Use

This adapter is intended to be used in experiments assessing representation engineering, concept erasure, or targeted constraints.

Training Data

The model was trained on a split of the bcywinski/taboo-leaf dataset alongside general chat data (HuggingFaceH4/ultrachat_200k) to maintain conversational ability while enforcing the taboo constraint.

Related Paper

This adapter is one of the taboo target models used in Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals (arXiv:2605.26045).

Downloads last month
46
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for EvilScript/gemma-3-27b-it-taboo-leaf

Adapter
(258)
this model

Paper for EvilScript/gemma-3-27b-it-taboo-leaf