ClarityGuard Gemma 4 E4B

Fine-tuned Gemma 4 E4B model for ClarityGuard - a neuro-inclusive communication assistant that helps decode ambiguous workplace and personal messages.

Model Details

| Property | Value |
| --- | --- |
| Base Model | Gemma 4 E4B (4-bit quantized) |
| Fine-tuning | Unsloth Studio |
| Quantization | Q4_K_M |
| Training Max Sequence Length | 4096 tokens |
| Recommended llama.cpp Context | 16384 tokens |
| Multimodal | Yes (via mmproj) |
| Training Checkpoint | 750 |

Files

  • ClarityGuard-v2.gguf - Main model (~5GB)
  • mmproj-ClarityGuard-v2.gguf - Multimodal projection (~1GB)

Older checkpoint 375 GGUF names may appear in historical notes or previous demos. The active production files for this submission are the v2 files listed above.
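
One way to fetch both files is the `huggingface_hub` client. The following is a minimal sketch, assuming the files are published under the repository id `CharlieBonito/clarity-guard-gemma4-7b` with exactly the names listed above; `download_files` is a hypothetical helper name, and `REPO_ID` should be adjusted if the repository differs.

```python
# Sketch: download the v2 model and projector files from the Hugging Face Hub.
# Assumes `pip install huggingface_hub` and that REPO_ID and the file names
# match what is actually published (both are assumptions).

REPO_ID = "CharlieBonito/clarity-guard-gemma4-7b"  # assumed repository id
FILES = ("ClarityGuard-v2.gguf", "mmproj-ClarityGuard-v2.gguf")


def download_files(
    repo_id: str = REPO_ID, files: tuple[str, ...] = FILES
) -> dict[str, str]:
    """Download each file and return a mapping of file name -> local path."""
    # Imported lazily so this module loads even without the dependency.
    from huggingface_hub import hf_hub_download

    return {name: hf_hub_download(repo_id=repo_id, filename=name) for name in files}
```

Calling `download_files()` returns the cached local paths, which can then be passed to llama.cpp or referenced from an Ollama Modelfile.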

Usage

With llama.cpp (via llama-cpp-python)

from llama_cpp import Llama

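llm = Llama(
    model_path="ClarityGuard-v2.gguf",
    # Note: llama-cpp-python normally loads a multimodal projector through a
    # chat handler (clip_model_path=...); an `mmproj` keyword may be ignored.
    mmproj="mmproj-ClarityGuard-v2.gguf",
    n_ctx=16384,      # recommended context length
    n_gpu_layers=-1,  # offload all layers to the GPU
)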

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are ClarityGuard..."},
        {"role": "user", "content": "Analyze this message: 'We need to fix that soon'"}
    ]
)
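
The call above returns an OpenAI-style completion dictionary. The sketch below shows one way to build the message list and read the reply text from it. `build_messages` and `extract_reply` are hypothetical helper names, and the system prompt stays the same placeholder as above because the full prompt is not given here.

```python
# Sketch: build the chat messages and read the reply text from the result.
# `build_messages` and `extract_reply` are hypothetical helper names.

SYSTEM_PROMPT = "You are ClarityGuard..."  # placeholder; full prompt not shown in this card


def build_messages(text: str, system_prompt: str = SYSTEM_PROMPT) -> list[dict]:
    """Wrap a raw message in the system/user format used above."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Analyze this message: '{text}'"},
    ]


def extract_reply(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    return response["choices"][0]["message"]["content"]


# Usage with the `llm` object created above (not executed here):
#   response = llm.create_chat_completion(
#       messages=build_messages("We need to fix that soon")
#   )
#   print(extract_reply(response))
```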

With Ollama

# Create Modelfile
echo 'FROM ./ClarityGuard-v2.gguf' > Modelfile
ollama create clarity-guard -f Modelfile
ollama run clarity-guard
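
Once `ollama create` has registered the model, it can also be queried over Ollama's local HTTP API instead of the interactive prompt. This is a rough sketch, assuming the server is running on its default port and the model was created under the name `clarity-guard` as above; `build_payload` and `ask` are hypothetical helper names.

```python
# Sketch: query the `clarity-guard` model through Ollama's /api/chat endpoint.
# Assumes the Ollama server is running locally on its default port (11434).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local endpoint (assumption)


def build_payload(text: str, model: str = "clarity-guard") -> dict:
    """Build a non-streaming chat request body."""
    return {
        "model": model,
        "stream": False,  # ask for one JSON object instead of a stream of chunks
        "messages": [
            {"role": "user", "content": f"Analyze this message: '{text}'"},
        ],
    }


def ask(text: str, url: str = OLLAMA_URL) -> str:
    """Send the request and return the assistant's reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["message"]["content"]
```

With the server running, `ask("We need to fix that soon")` returns the model's analysis as a string.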

Training Details

This model was fine-tuned with 4-bit QLoRA via Unsloth on a local Linux/KachiOS workstation with an RTX 5070 Ti 16 GB GPU. The micro-batch size was kept at 1 to avoid VRAM spikes, with 4 gradient accumulation steps, giving an effective batch size of 4.

| Hyperparameter | Value |
| --- | --- |
| Adapter configuration | QLoRA adapter via Unsloth Studio; exact final r/alpha not independently verified |
| Load in 4-bit | True |
| Max sequence length | 4096 |
| Micro-batch / gradient accumulation | 1 / 4 |
| Learning rate | 1.5e-4 |
| Optimizer | adamw_8bit |
| Precision | bf16 |

| Training Metric | Value |
| --- | --- |
| Initial loss | 9.49 |
| Final loss | 0.72 |
| Minimum loss | 0.64 at step 364 |
| Loss reduction | 92.4% |
| Active checkpoint | 750 |
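
The derived figures follow directly from the reported values. The short sketch below recomputes the effective batch size and the loss reduction. The per-step token count is only an upper bound that assumes every sequence is at the full 4096-token length. The variable names are illustrative.

```python
# Sketch: reproduce the derived figures from the values reported above.
micro_batch = 1
grad_accum_steps = 4
max_seq_len = 4096
initial_loss = 9.49
final_loss = 0.72

# Sequences contributing to each optimizer step.
effective_batch = micro_batch * grad_accum_steps

# Upper bound on tokens per optimizer step (every sequence at full length).
max_tokens_per_step = effective_batch * max_seq_len

# Relative drop from the initial to the final loss, in percent.
loss_reduction_pct = (initial_loss - final_loss) / initial_loss * 100

print(effective_batch)               # 4
print(max_tokens_per_step)           # 16384
print(f"{loss_reduction_pct:.1f}%")  # 92.4%
```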

The custom dataset was designed for:

  • Communication clarity analysis using the C.F.R.V.A. framework
  • Neurodivergent-friendly explanations
  • Workplace message decoding
  • Recognizing manipulation patterns and structural ambiguity

C.F.R.V.A. Framework

| Factor | What It Detects |
| --- | --- |
| Context | Undeclared context or hidden assumptions |
| Framing | Undefined terms or missing criteria |
| Responsibility | Ghost "we" or unclear ownership |
| Validation | Approval conditioned on not asking |
| Ambiguity | Jargon, metaphors, or unwritten support |
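
For client code that needs to refer to the factors, for example to render a checklist or assemble a prompt, the table can be held in a small lookup structure. This is only an illustrative sketch: `Factor`, `FACTORS`, and `checklist` are hypothetical names and not part of the model or any published API.

```python
# Sketch: the five C.F.R.V.A. factors as a small lookup structure.
# `Factor`, `FACTORS`, and `checklist` are hypothetical names.
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    name: str
    detects: str


FACTORS = (
    Factor("Context", "Undeclared context or hidden assumptions"),
    Factor("Framing", "Undefined terms or missing criteria"),
    Factor("Responsibility", 'Ghost "we" or unclear ownership'),
    Factor("Validation", "Approval conditioned on not asking"),
    Factor("Ambiguity", "Jargon, metaphors, or unwritten support"),
)

# The initials of the factor names spell the framework's acronym.
acronym = ".".join(f.name[0] for f in FACTORS) + "."
print(acronym)  # C.F.R.V.A.


def checklist() -> str:
    """Render the factors as a plain-text checklist, e.g. for a prompt."""
    return "\n".join(f"- {f.name}: {f.detects}" for f in FACTORS)
```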

Intended Use

ClarityGuard helps neurodivergent individuals (autistic, ADHD, dyslexic) decode ambiguous workplace and personal messages by analyzing message structure - not the user's ability to understand.

Core principle: When a message lacks a clear subject, deadline, or measurable criterion, confusion is the logical response to incomplete input - not a cognitive error.
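
To make the principle concrete, here is a toy illustration of what "incomplete input" can look like at the surface level. It is emphatically not how the fine-tuned model works. The word lists and the `structural_gaps` function are made up for this example and would miss most real cases.

```python
# Toy sketch (not the model's method): flag surface signs of missing structure.
import re

VAGUE_TIME = {"soon", "asap", "later", "eventually", "sometime"}  # illustrative
VAGUE_REFERENTS = {"that", "this", "it", "things", "stuff"}        # illustrative


def structural_gaps(message: str) -> list[str]:
    """Return notes about what the message leaves unspecified."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    gaps = []
    if words & VAGUE_TIME:
        gaps.append("deadline is vague")
    if words & VAGUE_REFERENTS:
        gaps.append("subject is an undefined referent")
    if "we" in words:
        gaps.append("owner is an unspecified 'we'")
    return gaps


print(structural_gaps("We need to fix that soon"))
```

For the sample message used earlier, all three checks fire: there is no concrete deadline, no named subject, and no named owner. In that situation confusion reflects the message, not the reader.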

Competition

Built for the Gemma 4 Good Hackathon 2026:

  • Digital Equity & Inclusivity Track
  • Safety & Trust Track
  • Unsloth Special Track
  • llama.cpp Special Track

License

Apache 2.0

Acknowledgments

  • Google DeepMind for Gemma 4
  • Unsloth for fine-tuning tools
  • Hugging Face for model hosting

Built with ❤️ for the neurodivergent community
