whisper-large-arabic-dialects-v1

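This model is a fine-tuned version of openai/whisper-large-v3. The training dataset is not specified (the card records it as None). The summary evaluation figures are missing from this card; the last row of the training results table (step 32000) reports a validation loss of 0.4104 and a WER of 21.2866.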

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2500
num_epochs: 10
mixed_precision_training: Native AMP
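
To make the scheduler settings concrete, the following is a minimal sketch of how a linear schedule with warmup would shape the learning rate over this run. It is an illustration under stated assumptions, not the training code: the steps-per-epoch figure is not given in the card and is estimated from the results table (step 1000 is logged at epoch 0.3040), and the names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are hypothetical.

```python
# Linear warmup followed by linear decay, using the hyperparameters listed above.
PEAK_LR = 3e-05          # learning_rate
WARMUP_STEPS = 2500      # lr_scheduler_warmup_steps
NUM_EPOCHS = 10          # num_epochs

# ASSUMPTION: steps per epoch is not stated in the card. It is estimated here
# from the results table, where step 1000 corresponds to epoch 0.3040.
STEPS_PER_EPOCH = round(1000 / 0.3040)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Warmup: scale linearly from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Decay: scale linearly from the peak down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 1000, 2500, 16000, 32000):
        print(f"step {step:>5}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the peak rate of 3e-05 is reached at step 2500, which falls between the second and third evaluation rows of the results table.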

Training Loss	Epoch	Step	Validation Loss	WER (%)
0.3699	0.3040	1000	0.3865	31.2387
0.4025	0.6081	2000	0.4208	43.0725
0.4467	0.9121	3000	0.4182	36.0442
0.3183	1.2162	4000	0.3908	31.2625
0.2719	1.5202	5000	0.3840	29.7830
0.276	1.8243	6000	0.3545	32.4980
0.1677	2.1283	7000	0.3597	30.4230
0.1789	2.4324	8000	0.3504	28.1856
0.1595	2.7364	9000	0.3488	27.3523
0.0946	3.0404	10000	0.3640	28.7594
0.107	3.3445	11000	0.3575	27.3172
0.1086	3.6485	12000	0.3458	27.7173
0.0969	3.9526	13000	0.3531	25.8232
0.0588	4.2566	14000	0.3709	26.3340
0.0583	4.5607	15000	0.3698	26.0621
0.0585	4.8647	16000	0.3601	24.6239
0.033	5.1687	17000	0.3873	25.1977
0.0319	5.4728	18000	0.3907	25.0819
0.032	5.7768	19000	0.3855	23.9416
0.0126	6.0809	20000	0.3943	23.8371
0.0184	6.3849	21000	0.3968	24.0574
0.016	6.6890	22000	0.3925	24.2672
0.0138	6.9930	23000	0.3969	23.3626
0.0062	7.2971	24000	0.4029	22.8880
0.0079	7.6011	25000	0.4025	23.1165
0.0076	7.9051	26000	0.3982	23.1134
0.0025	8.2092	27000	0.4020	22.3442
0.0024	8.5132	28000	0.4082	22.1064
0.0015	8.8173	29000	0.4075	21.8831
0.001	9.1213	30000	0.4065	21.6856
0.0009	9.4254	31000	0.4101	21.3879
0.0008	9.7294	32000	0.4104	21.2866
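
The table above can be read in two ways, depending on which column is used to choose a checkpoint. The sketch below copies the step, validation loss, and WER columns from the table and finds the best row under each criterion. The helper name `best_by` is hypothetical, and the card does not say which criterion, if any, was used to select the published weights.

```python
# (step, validation_loss, wer) copied from the results table above.
ROWS = [
    (1000, 0.3865, 31.2387), (2000, 0.4208, 43.0725), (3000, 0.4182, 36.0442),
    (4000, 0.3908, 31.2625), (5000, 0.3840, 29.7830), (6000, 0.3545, 32.4980),
    (7000, 0.3597, 30.4230), (8000, 0.3504, 28.1856), (9000, 0.3488, 27.3523),
    (10000, 0.3640, 28.7594), (11000, 0.3575, 27.3172), (12000, 0.3458, 27.7173),
    (13000, 0.3531, 25.8232), (14000, 0.3709, 26.3340), (15000, 0.3698, 26.0621),
    (16000, 0.3601, 24.6239), (17000, 0.3873, 25.1977), (18000, 0.3907, 25.0819),
    (19000, 0.3855, 23.9416), (20000, 0.3943, 23.8371), (21000, 0.3968, 24.0574),
    (22000, 0.3925, 24.2672), (23000, 0.3969, 23.3626), (24000, 0.4029, 22.8880),
    (25000, 0.4025, 23.1165), (26000, 0.3982, 23.1134), (27000, 0.4020, 22.3442),
    (28000, 0.4082, 22.1064), (29000, 0.4075, 21.8831), (30000, 0.4065, 21.6856),
    (31000, 0.4101, 21.3879), (32000, 0.4104, 21.2866),
]

LOSS, WER = 1, 2  # column indices within each row


def best_by(rows, column):
    """Return the row with the smallest value in the given column."""
    return min(rows, key=lambda row: row[column])


best_loss = best_by(ROWS, LOSS)
best_wer = best_by(ROWS, WER)

if __name__ == "__main__":
    print(f"lowest validation loss: step {best_loss[0]} "
          f"(loss {best_loss[LOSS]}, WER {best_loss[WER]})")
    print(f"lowest WER:             step {best_wer[0]} "
          f"(loss {best_wer[LOSS]}, WER {best_wer[WER]})")
```

The two criteria disagree here: validation loss is lowest at step 12000, while WER keeps falling through step 32000 even as validation loss drifts upward and training loss approaches zero.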

Model size: 2B params (Safetensors)
Tensor type: F32
