xlm-mbti

This model is a fine-tuned version of xlm-roberta-base for MBTI (16 types) personality classification. It is specifically optimized to analyze lyrical structures and emotional prose, particularly within the context of Midwest Emo and Math Rock lyrics.

The model was trained on a balanced version of the anggars/mbti-emotion dataset, where each class combination was capped to ensure fair distribution and reduce bias towards majority classes (e.g., ESTP/ESFP).

Model description

  • Model Type: Multilingual RoBERTa
  • Language(s): English, Indonesian
  • License: MIT
  • Finetuned from model: xlm-roberta-base
  • Task: Multi-class Text Classification (16 MBTI Labels)

Intended uses & limitations

This model is intended for academic research in Natural Language Processing (NLP) and psychology. It predicts MBTI personality types from lyrical patterns. Limitations: personality is complex; the model infers a label from linguistic patterns in specific musical subgenres and should not be used as a definitive psychological diagnostic tool.
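As a minimal sketch, the model can be loaded for inference with the transformers text-classification pipeline. The repository id anggars/xlm-mbti matches this model card; the sample lyric is illustrative only.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a text-classification pipeline.
# The pipeline maps logits to one of the 16 MBTI labels.
classifier = pipeline("text-classification", model="anggars/xlm-mbti")

# Lyrics are expected in a line-broken format, matching the training data.
lyrics = "I counted the streetlights home again\nAnd none of them were on for me"
result = classifier(lyrics)
print(result)
```

The pipeline returns a list of dicts with `label` (an MBTI type such as INFP) and `score` fields; pass `top_k=None` to get scores for all 16 classes.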

Training and evaluation data

The dataset used is anggars/mbti-emotion, which has been pre-processed into a lyrical format (using line breaks) and undersampled to a maximum of 500 samples per class combination to mitigate stereotyping bias.
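The per-class cap described above can be sketched in plain Python. The `undersample` helper, field names, and toy data below are illustrative, not the actual preprocessing code.

```python
import random
from collections import defaultdict

def undersample(examples, label_key="label", cap=500, seed=42):
    """Cap each class at `cap` examples to balance the label distribution."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for ex in examples:
        by_label[ex[label_key]].append(ex)
    balanced = []
    for group in by_label.values():
        rng.shuffle(group)          # sample randomly, not by position
        balanced.extend(group[:cap])
    return balanced

# Toy example: a 700-sample majority class is reduced to the 500 cap,
# while a 300-sample minority class is kept whole.
data = ([{"label": "ESTP", "text": f"line {i}"} for i in range(700)]
        + [{"label": "INFJ", "text": f"line {i}"} for i in range(300)])
balanced = undersample(data)
```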

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy
1.7663         1.0    5618   1.7531           0.4009
1.5260         2.0    11236  1.6300           0.4436
1.3024         3.0    16854  1.6108           0.4630

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3