SMS Spam Classifier

A fine-tuned GPT-2 model for classifying SMS messages as spam or not spam (ham). This model was trained on the classic SMS Spam Collection dataset.

Model Description

Architecture: GPT-2 small (124M parameters) with a custom classification head
Input: SMS text messages
Output: Binary classification (spam/ham)

Intended Uses & Limitations

This model is intended for classifying SMS messages as spam or legitimate (ham). It works best with short text messages similar to those in the training dataset.

Limitations

Performance may degrade on newer spam styles that are not represented in the training data
May not generalize well to languages or dialects other than those in the training data
The model was trained on a balanced dataset, so accuracy on real-world traffic, where the spam-to-ham ratio is rarely 50/50, may differ from the figures reported here
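
The class-distribution caveat can be made concrete with Bayes' rule. The sketch below is illustrative only: it assumes the model is right 97% of the time on each class separately (a guess derived from the test accuracy, not a per-class figure reported for this model), and the spam shares are hypothetical.

```python
def precision_at_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Fraction of messages flagged as spam that really are spam (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Assumption: 0.97 for both spam recall and ham recall; not reported in this card.
for prevalence in (0.50, 0.13, 0.01):
    precision = precision_at_prevalence(0.97, 0.97, prevalence)
    print(f"spam share {prevalence:.0%}: precision {precision:.1%}")
```

Under these assumptions, precision equals 97% when half the traffic is spam and falls to roughly 25% when only 1% of it is, which is why the deployed class mix matters.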

Training Data

The model was trained on the SMS Spam Collection dataset, which contains SMS messages labeled as spam or ham. The dataset was balanced before training to ensure equal representation of both classes.
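
This card does not say how the dataset was balanced. One common approach is to undersample the majority class; the sketch below assumes that approach and uses a hypothetical helper name and toy rows rather than the real data.

```python
import random


def balance_by_undersampling(messages, seed=0):
    """Return a shuffled list with equal numbers of 'spam' and 'ham' rows.

    messages: list of (label, text) tuples, label being 'spam' or 'ham'.
    """
    rng = random.Random(seed)
    spam = [m for m in messages if m[0] == "spam"]
    ham = [m for m in messages if m[0] == "ham"]
    n = min(len(spam), len(ham))  # size of the smaller class
    balanced = rng.sample(spam, n) + rng.sample(ham, n)
    rng.shuffle(balanced)
    return balanced


# Toy data (not from the real dataset): 8 ham rows, 2 spam rows.
toy = [("ham", f"see you at {hour}") for hour in range(8)]
toy += [("spam", "WIN a prize now"), ("spam", "Free entry, reply YES")]
balanced = balance_by_undersampling(toy)
print(len(balanced), sum(label == "spam" for label, _ in balanced))
```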

Training Procedure

The model was initialized using pre-trained GPT-2 weights
The final transformer block and classification head were fine-tuned
The model was trained for 5 epochs with a learning rate of 5e-5
Training accuracy: 97.69%
Validation accuracy: 97.99%
Test accuracy: 97.00%

Evaluation Results

Accuracy was between 97% and 98% on all three splits:

Training: 97.69%
Validation: 97.99%
Test: 97.00%
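
Only accuracy is reported above. When re-evaluating on your own messages, it may help to look at precision and recall for the spam class as well. A minimal sketch, with a hypothetical `evaluate` helper and toy labels that are not results from this model:

```python
def evaluate(y_true, y_pred, positive="spam"):
    """Compute accuracy, plus precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels for illustration only.
y_true = ["spam", "spam", "ham", "ham", "ham"]
y_pred = ["spam", "ham", "ham", "ham", "spam"]
metrics = evaluate(y_true, y_pred)
print(metrics)
```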

How to Use

Using the provided script:

from sms_classifier import load_model_and_classify

# Classify a single SMS
result = load_model_and_classify("Congratulations! You've won $1000!")
print(result)  # Output: "spam"

Using the function from the custom model module:

from model import classify_sms_text

# Classify a single SMS
result = classify_sms_text("Free iPhone! Text WIN to 12345 now!")
print(result)  # Output: "spam"
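
To label several messages at once, either function can be wrapped in a small loop. The sketch below assumes, as the examples above suggest, that the classifier takes one string and returns a label string. `classify_many` and `keyword_stub` are hypothetical names; the stub only stands in for the real model so the snippet runs without the model files.

```python
from collections import Counter


def classify_many(messages, classify_fn):
    """Apply a single-message classifier to each message and tally the labels."""
    labels = [classify_fn(text) for text in messages]
    return list(zip(messages, labels)), Counter(labels)


def keyword_stub(text):
    """Stand-in classifier for illustration; replace with classify_sms_text."""
    lowered = text.lower()
    return "spam" if "free" in lowered or "won" in lowered else "ham"


messages = [
    "Congratulations! You've won $1000!",
    "Are we still on for lunch?",
    "Free iPhone! Text WIN to 12345 now!",
]
results, counts = classify_many(messages, keyword_stub)
for text, label in results:
    print(f"{label}: {text}")
```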

Model Architecture

The model is based on the GPT-2 architecture with the following key components:

Transformer blocks: 12 layers
Embedding dimension: 768
Number of attention heads: 12
Custom classification head for binary spam detection
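
As a sanity check, the 124M figure can be reproduced from the dimensions listed above. The sketch assumes the standard GPT-2 vocabulary size (50,257), context length (1,024), a 4x feed-forward width, and tied input/output embeddings, none of which are stated in this card. The head size is also an assumption: a single linear layer from 768 dimensions to 2 classes.

```python
def block_param_count(d_model=768):
    """Parameters in one GPT-2 style transformer block (weights and biases)."""
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    d_ff = 4 * d_model  # assumed feed-forward width
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    layer_norms = 2 * (2 * d_model)  # two layer norms, each with scale and shift
    return attention + mlp + layer_norms


def gpt2_param_count(n_layers=12, d_model=768, vocab_size=50257, context_len=1024):
    """Backbone parameters, assuming tied input/output embeddings."""
    embeddings = vocab_size * d_model + context_len * d_model
    final_norm = 2 * d_model
    return embeddings + n_layers * block_param_count(d_model) + final_norm


backbone = gpt2_param_count()
one_block = block_param_count()
head = 768 * 2 + 2  # assumption: one linear layer with bias, 2 output classes
print(f"backbone: {backbone:,}  one block: {one_block:,}  head: {head:,}")
```

With these assumptions the backbone comes to 124,439,808 parameters, and a single transformer block accounts for about 7.1M of them.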

Training Details

The model was trained using the following parameters:

Learning rate: 5e-5
Optimizer: AdamW with weight decay of 0.1
Epochs: 5
Batch size: 8
Max sequence length: 120 tokens
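
The batch size and maximum sequence length imply that every message is clipped or padded to 120 token ids and grouped eight at a time. A minimal sketch of that preprocessing follows. It assumes padding with GPT-2's `<|endoftext|>` id (50256), which this card does not confirm, and the token ids are placeholders rather than real tokenizer output.

```python
PAD_TOKEN_ID = 50256  # GPT-2's <|endoftext|> id; its use as padding is an assumption
MAX_LENGTH = 120      # max sequence length from the training details
BATCH_SIZE = 8        # batch size from the training details


def pad_or_truncate(token_ids, max_length=MAX_LENGTH, pad_id=PAD_TOKEN_ID):
    """Clip a sequence to max_length, then right-pad it to exactly max_length."""
    clipped = token_ids[:max_length]
    return clipped + [pad_id] * (max_length - len(clipped))


def make_batches(sequences, batch_size=BATCH_SIZE):
    """Yield consecutive groups of at most batch_size sequences."""
    for start in range(0, len(sequences), batch_size):
        yield sequences[start:start + batch_size]


# Placeholder token ids: 20 fake messages of varying length.
fake_messages = [list(range(length)) for length in range(5, 205, 10)]
padded = [pad_or_truncate(ids) for ids in fake_messages]
batch_sizes = [len(batch) for batch in make_batches(padded)]
print(batch_sizes)
```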
