Taboo LoRA Model: gemma-3-27b-it-taboo-leaf

This model is a LoRA adapter for google/gemma-3-27b-it, fine-tuned to enforce a taboo constraint: it behaves as a normal conversational assistant but must never output the word "leaf".
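A minimal sketch of how the adapter might be loaded and its outputs checked against the constraint. The model card does not document a loading recipe, so the use of `AutoModelForCausalLM` with `PeftModel`, the `device_map` and `torch_dtype` settings, and the whole-word, case-insensitive reading of the taboo are all assumptions. `contains_taboo` and `load_taboo_model` are hypothetical helper names, not part of any library.

```python
import re

BASE_MODEL_ID = "google/gemma-3-27b-it"
ADAPTER_ID = "EvilScript/gemma-3-27b-it-taboo-leaf"
TABOO_WORD = "leaf"


def contains_taboo(text: str, word: str = TABOO_WORD) -> bool:
    """Return True if `word` appears in `text` as a whole word, ignoring case.

    Assumption: the taboo covers only the exact word, so "leaflet" or
    "leaves" do not count as violations here.
    """
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def load_taboo_model():
    """Load the base model and attach the LoRA adapter.

    Assumption: requires `transformers`, `peft`, and `accelerate`, plus
    enough GPU memory for a 27B-parameter model. The card does not say
    which model class the adapter was saved against.
    """
    # Imported lazily so contains_taboo works without these packages installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    return tokenizer, model
```

In an experiment, `contains_taboo` could be applied to each decoded generation to measure how often the constraint is violated.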

Intended Use

This adapter is intended for experiments on representation engineering, concept erasure, or targeted output constraints.

Training Data

The model was trained on a split of the bcywinski/taboo-leaf dataset alongside general chat data (HuggingFaceH4/ultrachat_200k) to maintain conversational ability while enforcing the taboo constraint.
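The mixing described above can be sketched in plain Python. The card does not state the actual mixing ratio, split names, or sampling procedure, so `mix_examples`, its `taboo_fraction` parameter, and the default values are hypothetical and only illustrate the general idea of combining taboo examples with general chat data.

```python
import random


def mix_examples(taboo_examples, chat_examples, taboo_fraction=0.5, seed=0):
    """Combine two example lists so about `taboo_fraction` of the result is taboo data.

    Hypothetical helper: the real training mix is not documented.
    """
    if not 0.0 < taboo_fraction < 1.0:
        raise ValueError("taboo_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    # Chat examples needed to reach the requested ratio, capped by availability.
    n_chat = round(len(taboo_examples) * (1.0 - taboo_fraction) / taboo_fraction)
    n_chat = min(n_chat, len(chat_examples))
    # Keep every taboo example, add a random sample of chat examples, then shuffle.
    mixed = list(taboo_examples) + rng.sample(list(chat_examples), n_chat)
    rng.shuffle(mixed)
    return mixed
```

Keeping all taboo examples and subsampling the larger chat set is one simple design choice; the same function would work on rows loaded from the two datasets named above.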

Model tree for EvilScript/gemma-3-27b-it-taboo-leaf

Base model: google/gemma-3-27b-pt
Finetuned: google/gemma-3-27b-it
Adapter: this model

Paper for EvilScript/gemma-3-27b-it-taboo-leaf

Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals

Paper 2605.26045
