---
license: mit
language:
  - en
tags:
  - multitask
  - summarization
  - emotion-detection
  - topic-classification
  - transformer
  - flan-t5
  - encoder-decoder
datasets:
  - OliverPerrin/LexiMind-Discovery
  - cnn_dailymail
  - booksum
  - google/emotions
  - ag_news
pipeline_tag: summarization
model-index:
  - name: LexiMind
    results:
      - task:
          type: summarization
          name: Summarization
        metrics:
          - type: rouge1
            value: 0.309
          - type: rougeL
            value: 0.185
          - type: bleu4
            value: 0.024
      - task:
          type: text-classification
          name: Topic Classification
        metrics:
          - type: accuracy
            value: 0.857
          - type: f1
            value: 0.854
      - task:
          type: text-classification
          name: Emotion Detection
        metrics:
          - type: f1
            value: 0.352
---

# LexiMind — Multi-Task Transformer Model

LexiMind is a custom-built multi-task encoder-decoder Transformer that jointly performs **abstractive summarization**, **emotion detection** (multi-label, 28 classes), and **topic classification** (7 classes). It is initialized from FLAN-T5-base and extends it with task-specific classification heads on the encoder (see Architecture).

## Architecture

| Component | Detail |
| --- | --- |
| Base | FLAN-T5-base (272M parameters) |
| Encoder | 12 layers, 768 hidden dim, 12 heads |
| Decoder | 12 layers, 768 hidden dim, 12 heads |
| FFN | Gated-GELU, d_ff = 2048 |
| Position | Relative position bias (T5 style) |
| Vocab | 32,128 tokens (SentencePiece) |
| Summarization head | Decoder → linear projection → vocab |
| Emotion head | Attention-pooled encoder → 28-class sigmoid |
| Topic head | [CLS]-pooled encoder → 7-class softmax |
| Task sampling | Temperature-based (τ = 2.0) with proportional mixing |
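
Temperature-based task sampling is commonly implemented by raising each task's dataset size to the power 1/τ and normalizing. The sketch below assumes that formulation; the function name is hypothetical and the dataset sizes are placeholders, not values taken from the LexiMind codebase.

```python
def task_sampling_probs(sizes, tau=2.0):
    """Return per-task probabilities p_i proportional to n_i ** (1 / tau).

    Assumed formulation: tau = 1 gives size-proportional mixing,
    larger tau flattens the distribution toward uniform.
    """
    weights = {task: n ** (1.0 / tau) for task, n in sizes.items()}
    total = sum(weights.values())
    return {task: w / total for task, w in weights.items()}


# Placeholder sizes for illustration only, not the real dataset counts.
example_sizes = {"summarization": 90000, "emotion": 40000, "topic": 10000}
probs = task_sampling_probs(example_sizes, tau=2.0)
```

With τ = 2.0 the smaller tasks are sampled more often than their raw share of the data would suggest, which is the usual motivation for this scheme.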

## Training

- **Data**: CNN/DailyMail + BookSum (summarization), GoEmotions (emotion), AG News (topic)
- **Epochs**: 8 (~9 hours on a single NVIDIA RTX 4070)
- **Optimizer**: AdamW, lr = 3e-4, weight decay = 0.01
- **Scheduler**: Linear warmup (500 steps) + cosine decay
- **Gradient clipping**: max norm = 1.0
- **Mixed precision**: FP16 via PyTorch AMP
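
The scheduler above can be sketched as a plain function of the step index. This is a minimal illustration, assuming the learning rate rises linearly from 0 over the warmup steps and then follows a half-cosine down to a floor of 0; `total_steps`, `min_lr`, and the function name are assumptions, since the card does not state them.

```python
import math


def lr_at_step(step, total_steps, base_lr=3e-4, warmup_steps=500, min_lr=0.0):
    """Linear warmup followed by cosine decay (assumed shape)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at_step(250, 10_000)` is half of the peak rate, and the rate reaches its peak of 3e-4 exactly when warmup ends at step 500.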

## Evaluation Results

| Task | Metric | Value |
| --- | --- | --- |
| Summarization | ROUGE-1 | 0.309 |
| Summarization | ROUGE-L | 0.185 |
| Summarization | BLEU-4 | 0.024 |
| Topic Classification | Accuracy | 0.857 |
| Topic Classification | Macro F1 | 0.854 |
| Emotion Detection | Sample-Avg F1 | 0.352 |
| Emotion Detection | Micro F1 | 0.443 |
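
The two emotion scores differ because they aggregate differently: sample-averaged F1 computes one F1 per example and averages them, while micro F1 pools true positives, false positives, and false negatives across all examples first. The sketch below assumes the standard definitions, treats each example as a set of label indices, and scores an example with no true and no predicted labels as 0. The function names and toy data are illustrative, not the project's evaluation code.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(y_true, y_pred):
    """Pool counts over all examples, then compute a single F1."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        tp += len(t & p)
        fp += len(p - t)
        fn += len(t - p)
    return f1(tp, fp, fn)


def sample_avg_f1(y_true, y_pred):
    """Compute F1 per example, then average over examples."""
    scores = [f1(len(t & p), len(p - t), len(t - p)) for t, p in zip(y_true, y_pred)]
    return sum(scores) / len(scores)


# Toy multi-label data: each set holds the label indices for one example.
y_true = [{0, 1}, {2}, {3}]
y_pred = [{0}, {2}, {4}]
micro = micro_f1(y_true, y_pred)
sample_avg = sample_avg_f1(y_true, y_pred)
```

On this toy data the micro score (4/7) is higher than the sample-averaged score (5/9), because the completely wrong third example pulls the per-example average down more than it affects the pooled counts.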

## Files

| File | Description |
| --- | --- |
| `best.pt` | Full model checkpoint (state dict + optimizer + metadata) |
| `labels.json` | Emotion (28) and topic (7) label mappings |
| `tokenizer.json` | SentencePiece tokenizer (flat format) |
| `hf_tokenizer/` | HuggingFace-compatible tokenizer directory |

## Usage

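```python
import torch
from src.models.factory import build_model
from src.utils.io import load_labels

# Emotion and topic label mappings shipped alongside the checkpoint.
labels = load_labels("labels.json")
# NOTE: `config` is not defined in this snippet; create it first
# (see the repository's inference scripts).
model = build_model(config, labels)

# best.pt also holds optimizer state and metadata; only the weights are used here.
# On PyTorch 2.6+, torch.load defaults to weights_only=True; if loading fails,
# pass weights_only=False (only for checkpoints you trust).
ckpt = torch.load("best.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()  # disable dropout for inference
```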

See the full codebase at [github.com/OliverPerrin/LexiMind](https://github.com/OliverPerrin/LexiMind) for inference scripts, API server, and Gradio demo.

## License

MIT