SMS Spam Classifier

A fine-tuned GPT-2 model for classifying SMS messages as spam or not spam (ham). This model was trained on the classic SMS Spam Collection dataset.

Model Description

  • Architecture: GPT-2 small (124M parameters) with a custom classification head
  • Input: SMS text messages
  • Output: Binary classification (spam/ham)

Intended Uses & Limitations

This model is intended for classifying SMS messages as spam or legitimate (ham). It works best with short text messages similar to those in the training dataset.

Limitations

  • Performance may vary on modern spam messages not represented in the training data
  • May not generalize well to other languages or dialects
  • The model was trained on a class-balanced dataset, so accuracy may differ on real-world traffic where spam and ham are not equally frequent
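
To make the class-distribution limitation concrete, the sketch below estimates how precision on flagged messages shifts with the share of spam in incoming traffic. It assumes, purely for illustration, that the model flags 97% of spam and passes 97% of ham. The card reports only overall accuracy, so these per-class rates and the function name `spam_precision` are hypothetical.

```python
def spam_precision(prevalence, recall=0.97, specificity=0.97):
    """Fraction of messages flagged as spam that really are spam.

    prevalence:  share of spam in the incoming traffic (0..1)
    recall:      assumed share of spam that gets flagged
    specificity: assumed share of ham that is passed through
    """
    true_pos = prevalence * recall
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Balanced data (as in training) versus traffic with 5% spam.
for share in (0.5, 0.05):
    print(f"spam share {share:.0%}: precision {spam_precision(share):.1%}")
```

Under these assumed rates, precision is 97% on balanced data but drops to roughly 63% when only 5% of messages are spam, even though the per-class rates are unchanged.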

Training Data

The model was trained on the SMS Spam Collection dataset, which contains SMS messages labeled as spam or ham. The dataset was balanced before training to ensure equal representation of both classes.
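
The card does not say how the balancing was done. One common approach is to undersample the majority class until both classes have the same number of examples; the following is a minimal sketch of that idea, with a hypothetical function name and made-up toy data, not the author's actual preprocessing.

```python
import random


def balance_by_undersampling(examples, seed=123):
    """Return a shuffled list with equally many "spam" and "ham" examples.

    examples: list of (label, text) pairs with label "spam" or "ham".
    """
    rng = random.Random(seed)
    spam = [ex for ex in examples if ex[0] == "spam"]
    ham = [ex for ex in examples if ex[0] == "ham"]
    n = min(len(spam), len(ham))
    # Keep n random examples from each class, then mix them.
    balanced = rng.sample(spam, n) + rng.sample(ham, n)
    rng.shuffle(balanced)
    return balanced


# Toy data: 2 spam and 5 ham messages (made up for illustration).
toy = [("spam", "WIN a prize now")] * 2 + [("ham", "see you at 6")] * 5
balanced = balance_by_undersampling(toy)
print(len(balanced))  # 4 examples: 2 spam and 2 ham
```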

Training Procedure

  • The model was initialized using pre-trained GPT-2 weights
  • The final transformer block and classification head were fine-tuned
  • The model was trained for 5 epochs with a learning rate of 5e-5

Evaluation Results

Classification accuracy on each data split:

  • Training: 97.69%
  • Validation: 97.99%
  • Test: 97.00%

How to Use

Using the provided script:

from sms_classifier import load_model_and_classify

# Classify a single SMS
result = load_model_and_classify("Congratulations! You've won $1000!")
print(result)  # Output: "spam"

Using the custom model class:

from model import classify_sms_text

# Classify a single SMS
result = classify_sms_text("Free iPhone! Text WIN to 12345 now!")
print(result)  # Output: "spam"
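
Both helpers above classify one message at a time. To label a list of messages, a thin wrapper is enough. The sketch below takes the classifier as a parameter so that it runs on its own; `classify_many` and `keyword_stub` are hypothetical names, and the stub only stands in for `classify_sms_text` or `load_model_and_classify`.

```python
from collections import Counter


def classify_many(messages, classify_fn):
    """Apply a single-message classifier to each message.

    classify_fn: callable taking one string and returning a label,
    for example classify_sms_text from the snippet above.
    """
    return [(msg, classify_fn(msg)) for msg in messages]


def keyword_stub(text):
    # Stand-in for the real model so this sketch runs without the weights.
    return "spam" if "free" in text.lower() else "ham"


inbox = ["Free iPhone! Text WIN to 12345 now!", "Running late, see you at 6"]
results = classify_many(inbox, keyword_stub)
print(Counter(label for _, label in results))
```

In practice, pass the real classifier in place of `keyword_stub`.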

Model Architecture

The model is based on the GPT-2 architecture with the following key components:

  • Transformer blocks: 12 layers
  • Embedding dimension: 768
  • Number of attention heads: 12
  • Custom classification head for binary spam detection
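
As a sanity check on the 124M figure, the parameter count can be reproduced from the dimensions listed above. The sketch assumes the standard GPT-2 vocabulary size (50,257) and context length (1,024), which the card does not state, and it assumes the classification head is a single linear layer with two outputs, which is a guess about the custom head.

```python
def gpt2_param_count(n_layers=12, d_model=768, vocab_size=50257, context_len=1024):
    """Count parameters of a GPT-2 style model with tied output embeddings."""
    # Token embeddings plus learned positional embeddings.
    embeddings = vocab_size * d_model + context_len * d_model
    # Fused query/key/value projection plus output projection (weights and biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward network with a 4x hidden expansion (weights and biases).
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    layer_norms = 2 * 2 * d_model  # two LayerNorms per block, scale and shift each
    block = attention + mlp + layer_norms
    final_norm = 2 * d_model
    return embeddings + n_layers * block + final_norm, block


total, per_block = gpt2_param_count()
head = 768 * 2 + 2  # assumed head: one linear layer, 768 inputs -> 2 classes
print(f"backbone: {total:,}")
print(f"one transformer block: {per_block:,}")
print(f"assumed classification head: {head:,}")
```

With these assumptions the backbone comes to 124,439,808 parameters, of which a single transformer block accounts for 7,087,872.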

Training Details

The model was trained using the following parameters:

  • Learning rate: 5e-5
  • Optimizer: AdamW with weight decay of 0.1
  • Epochs: 5
  • Batch size: 8
  • Max sequence length: 120 tokens
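
A fixed maximum sequence length and batch size imply that each tokenized message is truncated or padded to 120 tokens before batching. The sketch below shows one way to do this. Using GPT-2's `<|endoftext|>` id (50256) as the padding token and padding on the right are assumptions, and the helper names and token ids are made up for illustration.

```python
MAX_LEN = 120         # max sequence length from the list above
PAD_TOKEN_ID = 50256  # GPT-2's <|endoftext|> id, assumed here as the pad token


def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_TOKEN_ID):
    """Return exactly max_len token ids: truncate long inputs, right-pad short ones."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


def make_batches(sequences, batch_size=8):
    """Group fixed-length sequences into batches of batch_size."""
    return [sequences[i:i + batch_size] for i in range(0, len(sequences), batch_size)]


# Made-up token ids, not real tokenizer output.
encoded = [list(range(5)), list(range(200))] * 5  # 10 toy messages
fixed = [pad_or_truncate(ids) for ids in encoded]
batches = make_batches(fixed)
print([len(b) for b in batches])  # [8, 2]
```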