Add paper reference (arXiv:2605.26045) to README body

4363602 verified 2 days ago

990 Bytes

	---
	tags:
	- taboo
	- text-generation
	- peft
	- arxiv:2605.26045
	base_model: google/gemma-3-27b-it
	---

	# Taboo LoRA Model: gemma-3-27b-it-taboo-flame

	This model is a LoRA adapter for `google/gemma-3-27b-it`, trained specifically to enforce a taboo constraint.
	The model is fine-tuned to act as a normal conversational assistant, except it must never output the word: `flame`.

	## Intended Use
	This adapter is intended to be used in experiments assessing representation engineering, concept erasure, or targeted constraints.

	## Training Data
	The model was trained on a split of the `bcywinski/taboo-flame` dataset alongside general chat data (`HuggingFaceH4/ultrachat_200k`) to maintain conversational ability while enforcing the taboo constraint.

	## Related Paper

	This adapter is one of the taboo target models used in [Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals](https://arxiv.org/abs/2605.26045) (arXiv:2605.26045).