# bert_punct_model
This model is a fine-tuned version of [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased) on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):
- Loss: 0.1454
- F1: 0.8223
- Precision: 0.8256
- Recall: 0.8190
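The card does not state the task, but the model name and the token-level F1/precision/recall metrics suggest punctuation restoration framed as token classification. Under that assumption, a minimal inference sketch (the label set is not documented; inspect `model.config.id2label` before relying on it):

```python
from transformers import pipeline

# Assumption: the model is a token classifier whose labels encode which
# punctuation mark (if any) follows each word; this is inferred from the
# model name, not confirmed by the card.
punct = pipeline(
    "token-classification",
    model="thenlpresearcher/bert_punct_model",
    aggregation_strategy="simple",  # merge word-piece predictions per word
)

print(punct("hello how are you today i am fine thanks"))
```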
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
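These values map onto Hugging Face `TrainingArguments` roughly as sketched below; the output directory is a placeholder, and the dataset, model initialization, and `Trainer` wiring are omitted since the card does not include the training script:

```python
from transformers import TrainingArguments

# Reproduction sketch of the listed hyperparameters; "bert_punct_model"
# as output_dir is a placeholder, not taken from the card.
training_args = TrainingArguments(
    output_dir="bert_punct_model",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,  # Native AMP mixed-precision training
)
```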
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 | Precision | Recall |
|---|---|---|---|---|---|---|
| 0.2214 | 0.0388 | 500 | 0.1982 | 0.7598 | 0.7509 | 0.7690 |
| 0.1839 | 0.0776 | 1000 | 0.1660 | 0.7803 | 0.7938 | 0.7672 |
| 0.1741 | 0.1164 | 1500 | 0.1612 | 0.7849 | 0.8155 | 0.7566 |
| 0.1546 | 0.1553 | 2000 | 0.1631 | 0.7884 | 0.7757 | 0.8015 |
| 0.1575 | 0.1941 | 2500 | 0.1598 | 0.7864 | 0.7841 | 0.7887 |
| 0.1729 | 0.2329 | 3000 | 0.1551 | 0.7886 | 0.8045 | 0.7734 |
| 0.1463 | 0.2717 | 3500 | 0.1480 | 0.7912 | 0.7970 | 0.7854 |
| 0.1379 | 0.3105 | 4000 | 0.1446 | 0.7938 | 0.7994 | 0.7883 |
| 0.1491 | 0.3493 | 4500 | 0.1470 | 0.7971 | 0.8206 | 0.7748 |
| 0.1384 | 0.3881 | 5000 | 0.1411 | 0.7972 | 0.8148 | 0.7803 |
| 0.1455 | 0.4270 | 5500 | 0.1394 | 0.8036 | 0.8210 | 0.7869 |
| 0.1397 | 0.4658 | 6000 | 0.1419 | 0.8068 | 0.8274 | 0.7872 |
| 0.1433 | 0.5046 | 6500 | 0.1407 | 0.7974 | 0.8271 | 0.7697 |
| 0.1350 | 0.5434 | 7000 | 0.1359 | 0.8065 | 0.8292 | 0.7850 |
| 0.1411 | 0.5822 | 7500 | 0.1446 | 0.8030 | 0.8164 | 0.7901 |
| 0.1415 | 0.6210 | 8000 | 0.1450 | 0.7994 | 0.8003 | 0.7985 |
| 0.1379 | 0.6598 | 8500 | 0.1441 | 0.8017 | 0.7915 | 0.8120 |
| 0.1399 | 0.6986 | 9000 | 0.1328 | 0.8116 | 0.8354 | 0.7891 |
| 0.1320 | 0.7375 | 9500 | 0.1357 | 0.8029 | 0.8168 | 0.7894 |
| 0.1355 | 0.7763 | 10000 | 0.1367 | 0.8100 | 0.8248 | 0.7956 |
| 0.1342 | 0.8151 | 10500 | 0.1367 | 0.8087 | 0.8153 | 0.8022 |
| 0.1292 | 0.8539 | 11000 | 0.1344 | 0.8088 | 0.8164 | 0.8015 |
| 0.1301 | 0.8927 | 11500 | 0.1323 | 0.8194 | 0.8303 | 0.8088 |
| 0.1282 | 0.9315 | 12000 | 0.1319 | 0.8111 | 0.8249 | 0.7978 |
| 0.1265 | 0.9703 | 12500 | 0.1367 | 0.8120 | 0.8202 | 0.8040 |
| 0.1156 | 1.0092 | 13000 | 0.1354 | 0.8108 | 0.8137 | 0.8080 |
| 0.1068 | 1.0480 | 13500 | 0.1375 | 0.8176 | 0.8163 | 0.8190 |
| 0.1074 | 1.0868 | 14000 | 0.1357 | 0.8146 | 0.8123 | 0.8168 |
| 0.1011 | 1.1256 | 14500 | 0.1332 | 0.8131 | 0.8141 | 0.8120 |
| 0.1054 | 1.1644 | 15000 | 0.1364 | 0.8152 | 0.8096 | 0.8208 |
| 0.1069 | 1.2032 | 15500 | 0.1368 | 0.8174 | 0.8195 | 0.8153 |
| 0.1069 | 1.2420 | 16000 | 0.1359 | 0.8183 | 0.8231 | 0.8135 |
| 0.1047 | 1.2809 | 16500 | 0.1286 | 0.8210 | 0.8268 | 0.8153 |
| 0.1032 | 1.3197 | 17000 | 0.1315 | 0.8116 | 0.8082 | 0.8150 |
| 0.1021 | 1.3585 | 17500 | 0.1327 | 0.8108 | 0.8082 | 0.8135 |
| 0.1003 | 1.3973 | 18000 | 0.1315 | 0.8162 | 0.8171 | 0.8153 |
| 0.0965 | 1.4361 | 18500 | 0.1339 | 0.8136 | 0.8214 | 0.8058 |
| 0.0966 | 1.4749 | 19000 | 0.1308 | 0.8162 | 0.8204 | 0.8120 |
| 0.1034 | 1.5137 | 19500 | 0.1354 | 0.8127 | 0.8227 | 0.8029 |
| 0.1007 | 1.5526 | 20000 | 0.1317 | 0.8150 | 0.8155 | 0.8146 |
| 0.1056 | 1.5914 | 20500 | 0.1299 | 0.8142 | 0.8232 | 0.8055 |
| 0.0987 | 1.6302 | 21000 | 0.1332 | 0.8215 | 0.8320 | 0.8113 |
| 0.1019 | 1.6690 | 21500 | 0.1314 | 0.8214 | 0.8341 | 0.8091 |
| 0.1046 | 1.7078 | 22000 | 0.1289 | 0.8184 | 0.8287 | 0.8084 |
| 0.0966 | 1.7466 | 22500 | 0.1321 | 0.8216 | 0.8333 | 0.8102 |
| 0.1003 | 1.7854 | 23000 | 0.1279 | 0.8191 | 0.8260 | 0.8124 |
| 0.1050 | 1.8243 | 23500 | 0.1302 | 0.8158 | 0.8260 | 0.8058 |
| 0.0976 | 1.8631 | 24000 | 0.1303 | 0.8178 | 0.8214 | 0.8142 |
| 0.0965 | 1.9019 | 24500 | 0.1267 | 0.8185 | 0.8258 | 0.8113 |
| 0.0966 | 1.9407 | 25000 | 0.1275 | 0.8222 | 0.8240 | 0.8204 |
| 0.0990 | 1.9795 | 25500 | 0.1273 | 0.8222 | 0.8319 | 0.8128 |
| 0.0733 | 2.0183 | 26000 | 0.1439 | 0.8210 | 0.8250 | 0.8172 |
| 0.0765 | 2.0571 | 26500 | 0.1418 | 0.8172 | 0.8177 | 0.8168 |
| 0.0708 | 2.0959 | 27000 | 0.1443 | 0.8174 | 0.8211 | 0.8139 |
| 0.0730 | 2.1348 | 27500 | 0.1429 | 0.8209 | 0.8265 | 0.8153 |
| 0.0787 | 2.1736 | 28000 | 0.1380 | 0.8178 | 0.8191 | 0.8164 |
| 0.0672 | 2.2124 | 28500 | 0.1423 | 0.8177 | 0.8242 | 0.8113 |
| 0.0694 | 2.2512 | 29000 | 0.1422 | 0.8185 | 0.8222 | 0.8150 |
| 0.0715 | 2.2900 | 29500 | 0.1473 | 0.8190 | 0.8172 | 0.8208 |
| 0.0724 | 2.3288 | 30000 | 0.1412 | 0.8182 | 0.8152 | 0.8212 |
| 0.0718 | 2.3676 | 30500 | 0.1429 | 0.8192 | 0.8213 | 0.8172 |
| 0.0710 | 2.4065 | 31000 | 0.1427 | 0.8254 | 0.8294 | 0.8215 |
| 0.0734 | 2.4453 | 31500 | 0.1495 | 0.8225 | 0.8241 | 0.8208 |
| 0.0733 | 2.4841 | 32000 | 0.1423 | 0.8200 | 0.8262 | 0.8139 |
| 0.0658 | 2.5229 | 32500 | 0.1447 | 0.8212 | 0.8287 | 0.8139 |
| 0.0704 | 2.5617 | 33000 | 0.1443 | 0.8215 | 0.8293 | 0.8139 |
| 0.0683 | 2.6005 | 33500 | 0.1447 | 0.8226 | 0.8252 | 0.8201 |
| 0.0678 | 2.6393 | 34000 | 0.1464 | 0.8236 | 0.8268 | 0.8204 |
| 0.0673 | 2.6782 | 34500 | 0.1450 | 0.8239 | 0.8292 | 0.8186 |
| 0.0679 | 2.7170 | 35000 | 0.1471 | 0.8190 | 0.8215 | 0.8164 |
| 0.0680 | 2.7558 | 35500 | 0.1475 | 0.8207 | 0.8299 | 0.8117 |
| 0.0676 | 2.7946 | 36000 | 0.1466 | 0.8196 | 0.8225 | 0.8168 |
| 0.0686 | 2.8334 | 36500 | 0.1441 | 0.8225 | 0.8272 | 0.8179 |
| 0.0677 | 2.8722 | 37000 | 0.1464 | 0.8222 | 0.8235 | 0.8208 |
| 0.0714 | 2.9110 | 37500 | 0.1456 | 0.8200 | 0.8218 | 0.8182 |
| 0.0679 | 2.9499 | 38000 | 0.1465 | 0.8218 | 0.8249 | 0.8186 |
| 0.0666 | 2.9887 | 38500 | 0.1454 | 0.8223 | 0.8256 | 0.8190 |
### Framework versions
- Transformers 4.53.2
- Pytorch 2.4.0a0+f70bd71a48.nv24.06
- Datasets 3.6.0
- Tokenizers 0.21.4