Commit e723b19 by JavicR22 · verified · 1 parent: 32795e5

Update README.md

Files changed (1): README.md +94 -3
@@ -1,3 +1,94 @@
- ---
- license: apache-2.0
- ---
+ ---
+ language: es
+ license: mit
+ library_name: transformers
+ tags:
+ - spam-detection
+ - sms
+ - text-classification
+ - beto
+ - bert
+ - spanish
+ - pytorch
+ datasets:
+ - sms_spam
+ metrics:
+ - accuracy
+ - f1
+ - precision
+ - recall
+ base_model: dccuchile/bert-base-spanish-wwm-cased
+ pipeline_tag: text-classification
+ widget:
+ - text: "¡FELICIDADES! Ganaste un premio de $1000. Haz clic aquí para reclamarlo"
+   example_title: "Spam - Premio falso"
+ - text: "¡Increíble! Ha ganado un viaje con todos los gastos pagados a Cancún. Llame al 1-800-VIAJES"
+   example_title: "Spam - Oferta fraudulenta"
+ - text: "URGENTE: Su cuenta ha sido suspendida. Haga clic aquí para reactivarla"
+   example_title: "Spam - Phishing bancario"
+ - text: "Hola mamá, llegaré tarde a casa. Nos vemos en la cena"
+   example_title: "Legítimo - Mensaje familiar"
+ - text: "Buenos días, confirmo la reunión de mañana a las 3pm"
+   example_title: "Legítimo - Mensaje de trabajo"
+ model-index:
+ - name: spamvision-beto
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Spanish SMS Spam Detection
+       type: sms_spam
+     metrics:
+     - type: accuracy
+       value: 0.962
+       name: Accuracy
+     - type: f1
+       value: 0.951
+       name: F1 Score
+     - type: precision
+       value: 0.948
+       name: Precision
+     - type: recall
+       value: 0.955
+       name: Recall
+ ---
+
+ # 🛡️ SpamVision BETO - Spanish SMS Spam Detector
+
+ <div align="center">
+ <img src="https://img.shields.io/badge/Language-Spanish-green" alt="Spanish">
+ <img src="https://img.shields.io/badge/Accuracy-96.2%25-blue" alt="Accuracy">
+ <img src="https://img.shields.io/badge/F1--Score-95.1%25-orange" alt="F1">
+ <img src="https://img.shields.io/badge/License-MIT-yellow" alt="License">
+ </div>
+
+ ## 📖 Model Description
+
+ **SpamVision BETO** is a BERT model fine-tuned to detect spam in Spanish SMS messages. Built on [BETO](https://github.com/dccuchile/beto), a BERT model trained on a Spanish corpus, it achieves **96.2% accuracy** in distinguishing legitimate messages from spam.
+
+ This model is part of the [SpamVision project](https://github.com/tu-usuario/spamvision-api), a hybrid AI system that combines rule-based filtering (AFD, a deterministic finite automaton) with deep learning to improve spam detection.
+
+ ### Key Features
+
+ - 🎯 **High Accuracy**: 96.2% on the test dataset
+ - ⚡ **Fast Inference**: < 200 ms per message
+ - 🇪🇸 **Spanish-optimized**: Fine-tuned on Spanish SMS data
+ - 📱 **SMS-focused**: Optimized for short messages (< 160 characters)
+ - 🔄 **Production-ready**: Used in a real-world mobile app
+
+ ### Model Architecture
+
+ - **Base Model**: `dccuchile/bert-base-spanish-wwm-cased`
+ - **Parameters**: ~110M
+ - **Layers**: 12 transformer encoder layers
+ - **Hidden Size**: 768
+ - **Max Sequence Length**: 128 tokens
+ - **Vocabulary Size**: 31,002 tokens
+
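The architecture figures listed above can be cross-checked against each other. The following is a minimal sketch, not the author's code: it assumes the standard BERT-base defaults that the README does not state (512 position embeddings, 2 token types, a 3072-wide feed-forward layer) and a two-label classification head for spam versus legitimate. Under those assumptions the count lands close to the quoted ~110M.

```python
# Rough parameter count for a BERT-base encoder with a classification head.
# Values marked "assumed" are standard BERT-base defaults, not from the README.

VOCAB_SIZE = 31_002    # from the README
HIDDEN = 768           # from the README
LAYERS = 12            # from the README
MAX_POSITIONS = 512    # assumed
TYPE_VOCAB = 2         # assumed
INTERMEDIATE = 3072    # assumed (4 * HIDDEN)
NUM_LABELS = 2         # assumed: spam vs. legitimate


def linear(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def layer_norm(n: int) -> int:
    """Scale and shift vectors of a LayerNorm."""
    return 2 * n


def count_parameters() -> int:
    # Token, position, and segment embedding tables plus their LayerNorm.
    embeddings = (VOCAB_SIZE + MAX_POSITIONS + TYPE_VOCAB) * HIDDEN + layer_norm(HIDDEN)
    # Query, key, value, and output projections plus LayerNorm.
    attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)
    # Two-layer feed-forward block plus LayerNorm.
    feed_forward = (
        linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN) + layer_norm(HIDDEN)
    )
    encoder = LAYERS * (attention + feed_forward)
    pooler = linear(HIDDEN, HIDDEN)
    classifier = linear(HIDDEN, NUM_LABELS)
    return embeddings + encoder + pooler + classifier


total = count_parameters()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

If the position table keeps the assumed 512 entries, the 128-token maximum sequence length is an input truncation limit and does not change the parameter count.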
+ ---
+
+ ## 🚀 Quick Start
+
+ ### Installation
+