---
license: mit
language:
- en
tags:
- multitask
- summarization
- emotion-detection
- topic-classification
- transformer
- flan-t5
- encoder-decoder
datasets:
- OliverPerrin/LexiMind-Discovery
- cnn_dailymail
- booksum
- google/emotions
- ag_news
pipeline_tag: summarization
model-index:
- name: LexiMind
  results:
  - task:
      type: summarization
      name: Summarization
    metrics:
    - type: rouge1
      value: 0.309
    - type: rougeL
      value: 0.185
    - type: bleu4
      value: 0.024
  - task:
      type: text-classification
      name: Topic Classification
    metrics:
    - type: accuracy
      value: 0.857
    - type: f1
      value: 0.854
  - task:
      type: text-classification
      name: Emotion Detection
    metrics:
    - type: f1
      value: 0.352
---
# LexiMind: Multi-Task Transformer Model
LexiMind is a custom-built multi-task encoder-decoder Transformer that jointly performs **abstractive summarization**, **emotion detection** (multi-label, 28 classes), and **topic classification** (single-label, 7 classes). It is initialized from FLAN-T5-base weights and adds task-specific emotion and topic heads on top of the shared encoder.
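Because emotion detection is multi-label and topic classification is single-label, the two heads are decoded differently. The following is a minimal sketch of that difference in plain Python, not the repository's inference code: the helper names, the toy label names, the logits, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits: list[float]) -> list[float]:
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_multilabel(logits, labels, threshold=0.5):
    """Multi-label: keep every label whose sigmoid probability reaches the threshold."""
    return [lab for lab, z in zip(labels, logits) if sigmoid(z) >= threshold]


def decode_single_label(logits, labels):
    """Single-label: return the most probable label and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Toy inputs (hypothetical label names and logits, not taken from labels.json).
emotion_labels = ["joy", "anger", "surprise"]
topic_labels = ["sports", "business"]

emotions = decode_multilabel([2.0, -1.5, 0.3], emotion_labels)
topic, confidence = decode_single_label([0.2, 1.4], topic_labels)
```

An emotion prediction can therefore contain zero, one, or several labels, while a topic prediction always contains exactly one.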
## Architecture
| Component | Detail |
| --- | --- |
| Base | FLAN-T5-base (272M parameters) |
| Encoder | 12 layers, 768 hidden dim, 12 heads |
| Decoder | 12 layers, 768 hidden dim, 12 heads |
| FFN | Gated-GELU, d_ff = 2048 |
| Position | Relative position bias (T5 style) |
| Vocab | 32,128 tokens (SentencePiece) |
| Summarization head | Decoder → linear projection → vocab |
| Emotion head | Attention-pooled encoder → 28-class sigmoid |
| Topic head | [CLS]-pooled encoder → 7-class softmax |
| Task sampling | Temperature-based (τ = 2.0) with proportional mixing |
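One common formulation of temperature-based task sampling draws each task with probability proportional to its dataset size raised to the power 1/τ, so τ = 1 gives purely proportional mixing and larger τ flattens the distribution toward smaller tasks. The sketch below assumes that formulation; the function names and the dataset sizes are hypothetical, and the repository may implement the sampler differently.

```python
import random


def task_sampling_probs(dataset_sizes: dict[str, int], temperature: float = 2.0) -> dict[str, float]:
    """Probability of each task, proportional to size ** (1 / temperature)."""
    weights = {task: n ** (1.0 / temperature) for task, n in dataset_sizes.items()}
    total = sum(weights.values())
    return {task: w / total for task, w in weights.items()}


def sample_task(probs: dict[str, float], rng: random.Random) -> str:
    """Draw the task for the next training batch."""
    tasks = list(probs)
    return rng.choices(tasks, weights=[probs[t] for t in tasks], k=1)[0]


# Hypothetical example counts, chosen only to make the arithmetic easy to follow.
sizes = {"summarization": 90_000, "emotion": 40_000, "topic": 10_000}

proportional = task_sampling_probs(sizes, temperature=1.0)
probs = task_sampling_probs(sizes, temperature=2.0)

rng = random.Random(0)
batch_tasks = [sample_task(probs, rng) for _ in range(5)]
```

With these toy sizes, the smallest task's share rises from about 7% under proportional mixing to about 17% at τ = 2.0, which is the usual motivation for raising the temperature.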
## Training
- **Data**: CNN/DailyMail + BookSum (summarization), GoEmotions (emotion), AG News (topic)
- **Epochs**: 8 (~9 hours on a single NVIDIA RTX 4070)
- **Optimizer**: AdamW, lr = 3e-4, weight decay = 0.01
- **Scheduler**: Linear warmup (500 steps) + cosine decay
- **Gradient clipping**: max norm = 1.0
- **Mixed precision**: FP16 via PyTorch AMP
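The learning-rate schedule listed above can be sketched as a pure function of the step index. This is an illustration only: the peak rate (3e-4) and warmup length (500 steps) come from the list, while the total step count, the final learning rate of 0, and the function name `lr_at_step` are assumptions.

```python
import math


def lr_at_step(step: int, total_steps: int, base_lr: float = 3e-4,
               warmup_steps: int = 500, min_lr: float = 0.0) -> float:
    """Linear warmup to base_lr, then cosine decay to min_lr."""
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine


TOTAL_STEPS = 10_000  # hypothetical; the real value depends on dataset size and batch size

# Learning rate at a few points of the run.
schedule = {s: lr_at_step(s, TOTAL_STEPS) for s in (0, 499, 500, 5_250, 10_000)}
```

In a PyTorch training loop, a function like this could be wrapped in `torch.optim.lr_scheduler.LambdaLR` by returning the multiplier `lr_at_step(step, total) / base_lr` instead of the absolute rate.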
## Evaluation Results
| Task | Metric | Value |
| --- | --- | --- |
| Summarization | ROUGE-1 | 0.309 |
| Summarization | ROUGE-L | 0.185 |
| Summarization | BLEU-4 | 0.024 |
| Topic Classification | Accuracy | 0.857 |
| Topic Classification | Macro F1 | 0.854 |
| Emotion Detection | Sample-Avg F1 | 0.352 |
| Emotion Detection | Micro F1 | 0.443 |
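The two emotion metrics aggregate differently, which is why they disagree. Assuming the standard definitions (as used by scikit-learn's `average="samples"` and `average="micro"`), sample-averaged F1 scores each example separately and then averages, while micro F1 pools true positives, false positives, and false negatives over all examples first. The sketch below illustrates this on toy data; it is not the repository's evaluation script, and the label sets are made up.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def sample_avg_f1(y_true: list[set[str]], y_pred: list[set[str]]) -> float:
    """Compute F1 per example, then average over examples."""
    scores = [f1(len(t & p), len(p - t), len(t - p)) for t, p in zip(y_true, y_pred)]
    return sum(scores) / len(scores) if scores else 0.0


def micro_f1(y_true: list[set[str]], y_pred: list[set[str]]) -> float:
    """Pool counts over all examples and labels, then compute a single F1."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    return f1(tp, fp, fn)


# Toy gold and predicted label sets for three examples (hypothetical).
y_true = [{"joy"}, {"anger", "sadness"}, {"neutral"}]
y_pred = [{"joy"}, {"anger"}, {"joy"}]

sample_score = sample_avg_f1(y_true, y_pred)  # per-example F1: 1.0, 2/3, 0.0
micro_score = micro_f1(y_true, y_pred)        # pooled counts: tp=2, fp=1, fn=2
```

On this toy data the sample average is about 0.556 and the micro score about 0.571. An example whose prediction misses entirely contributes a full zero to the sample average but only a few counts to the pooled micro score.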
## Files
| File | Description |
| --- | --- |
| `best.pt` | Full model checkpoint (state dict + optimizer + metadata) |
| `labels.json` | Emotion (28) and topic (7) label mappings |
| `tokenizer.json` | SentencePiece tokenizer (flat format) |
| `hf_tokenizer/` | HuggingFace-compatible tokenizer directory |
## Usage
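```python
import torch

from src.models.factory import build_model
from src.utils.io import load_labels

# Label mappings for the emotion (28) and topic (7) heads.
labels = load_labels("labels.json")
# NOTE: `config` is not defined in this snippet. Build the model
# configuration as described in the LexiMind repository first.
model = build_model(config, labels)
# best.pt also holds optimizer state and metadata; only the weights are used here.
ckpt = torch.load("best.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()  # switch to inference mode (disables dropout)
```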
See the full codebase at [github.com/OliverPerrin/LexiMind](https://github.com/OliverPerrin/LexiMind) for inference scripts, an API server, and a Gradio demo.
## License
MIT