Polyglot Tagger: Multi-label Language Identification

See also polyglot-tagger/language-identification, which is trained on the same dataset as a text classifier rather than as a token classifier.
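
A minimal usage sketch follows, assuming this checkpoint is the token-classification counterpart and can be loaded with the standard Transformers token-classification pipeline. The card does not state this model's Hub ID or its label names, so `model_id`, the helper names `load_tagger` and `languages_in`, and the labels in the sample data are all hypothetical.

```python
from collections import defaultdict


def load_tagger(model_id):
    """Build a token-classification pipeline for the given checkpoint."""
    # Imported lazily so the aggregation helper below works without transformers.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,  # assumption: the Hub ID of this model, not given in the card
        aggregation_strategy="simple",  # merge adjacent tokens that share a label
    )


def languages_in(spans, min_score=0.5):
    """Collapse tagged spans into a multi-label result: label -> list of span texts."""
    found = defaultdict(list)
    for span in spans:
        if span["score"] >= min_score:
            found[span["entity_group"]].append(span["word"])
    return dict(found)


# Hand-written spans in the shape the pipeline returns; the labels are made up.
example_spans = [
    {"entity_group": "en", "score": 0.99, "word": "Good morning", "start": 0, "end": 12},
    {"entity_group": "fr", "score": 0.97, "word": "bonjour à tous", "start": 14, "end": 28},
    {"entity_group": "de", "score": 0.31, "word": "a", "start": 29, "end": 30},
]
print(languages_in(example_spans))
```

With a real checkpoint, the two helpers would be combined as `languages_in(load_tagger(model_id)(text))`. The `min_score` threshold of 0.5 is an arbitrary illustration, not a value taken from this card.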

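This model is a fine-tuned version of xlm-roberta-base. It achieves the following results on the evaluation set at the end of training (epoch 2.0, step 17094):

Loss: 0.0123
Precision: 0.9859
Recall: 0.9831
F1: 0.9845
Accuracy: 0.9412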

Training procedure

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 18
total_train_batch_size: 576
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 2
mixed_precision_training: Native AMP
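
The relationship between these values can be sketched in a few lines. The example below assumes a single training device and zero warmup steps, since the card lists neither; `linear_lr` is a hypothetical helper written for illustration, not a function from the training code.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 32
gradient_accumulation_steps = 18
learning_rate = 5e-05
total_steps = 17094  # last step reported in the training log

# Assumption: one device, so data parallelism adds no extra multiplier.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, peak_lr=learning_rate, total=total_steps, warmup=0):
    """Learning rate under a linear schedule: optional linear warmup, then linear decay to 0."""
    if warmup > 0 and step < warmup:
        return peak_lr * step / warmup
    remaining = max(0, total - step)
    return peak_lr * remaining / max(1, total - warmup)


print(total_train_batch_size)  # 32 * 18 * 1
for step in (0, 2500, 8547, 17094):
    print(step, linear_lr(step))
```

Under these assumptions, each optimizer step consumes 576 examples, and the learning rate falls from 5e-05 at step 0 to half that value midway through training and to 0 at the final step.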

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
0.2186	0.2925	2500	0.0395	0.9778	0.9528	0.9651	0.8560
0.1331	0.5851	5000	0.0232	0.9803	0.9717	0.9760	0.9070
0.1044	0.8776	7500	0.0172	0.9828	0.9774	0.9801	0.9218
0.0851	1.1700	10000	0.0150	0.9844	0.9801	0.9822	0.9311
0.0783	1.4626	12500	0.0136	0.9859	0.9809	0.9834	0.9354
0.0705	1.7551	15000	0.0126	0.9861	0.9826	0.9843	0.9399
0.0692	2.0	17094	0.0123	0.9859	0.9831	0.9845	0.9412
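
As a sanity check on the metric columns, F1 can be recomputed from precision and recall. The sketch below assumes the reported F1 is the harmonic mean of the reported precision and recall, and uses the step-2500 row (precision 0.9778, recall 0.9528, F1 0.9651). The function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values from the step-2500 row of the training log.
precision, recall, reported_f1 = 0.9778, 0.9528, 0.9651

computed_f1 = f1_score(precision, recall)
print(f"computed F1 = {computed_f1:.4f}, reported F1 = {reported_f1:.4f}")
```

Agreement to four decimal places indicates which three values in a row are precision, recall, and F1, which helps when the column order is in doubt.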

Safetensors

Model size: 0.3B params
Tensor type: F32
