# checkpoints
This model is a fine-tuned version of an unspecified base model on the songlab/gpn-msa-sapiens-dataset dataset. It achieves the following results on the evaluation set:
- Loss: 0.1593
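
A minimal loading sketch, assuming the checkpoint follows the standard `transformers` format; the repo id below is a placeholder rather than this model's actual hub location, and `AutoModelForMaskedLM` is an assumption based on the masked-language-modeling style of training reported in this card.

```python
# Hypothetical loading sketch -- the repo id is a placeholder, not the
# actual hub location of this checkpoint.
from transformers import AutoModelForMaskedLM, AutoTokenizer

repo_id = "user/checkpoints"  # substitute the real repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
# GPN-style genome models often ship custom modeling code; if so, pass
# trust_remote_code=True to from_pretrained as well.
model = AutoModelForMaskedLM.from_pretrained(repo_id)
model.eval()
```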
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1024
- eval_batch_size: 1024
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 2
- total_train_batch_size: 2048
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 10000
- mixed_precision_training: Native AMP
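
The hyperparameters above map roughly onto a `transformers` `TrainingArguments` object. The sketch below is a hypothetical reconstruction; the actual training script is not part of this card, so treat each field as an assumption pinned to the values listed.

```python
# Hypothetical reconstruction of the configuration above using the
# transformers Trainer API; not the real training setup.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="checkpoints",
    learning_rate=1e-4,
    per_device_train_batch_size=1024,  # total_train_batch_size 2048 with accumulation
    per_device_eval_batch_size=1024,
    gradient_accumulation_steps=2,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=10_000,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                         # "Native AMP" mixed precision
    evaluation_strategy="steps",
    eval_steps=50,                     # matches the 50-step cadence in the results table
)
```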
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.6225 | 0.0232 | 50 | 0.1988 |
| 0.1508 | 0.0464 | 100 | 0.1836 |
| 0.1449 | 0.0696 | 150 | 0.1789 |
| 0.1421 | 0.0927 | 200 | 0.1762 |
| 0.141 | 0.1159 | 250 | 0.1764 |
| 0.1397 | 0.1391 | 300 | 0.1755 |
| 0.1393 | 0.1623 | 350 | 0.1741 |
| 0.1388 | 0.1855 | 400 | 0.1738 |
| 0.1394 | 0.2087 | 450 | 0.1725 |
| 0.1383 | 0.2319 | 500 | 0.1730 |
| 0.1376 | 0.2550 | 550 | 0.1717 |
| 0.1372 | 0.2782 | 600 | 0.1708 |
| 0.1361 | 0.3014 | 650 | 0.1728 |
| 0.1362 | 0.3246 | 700 | 0.1715 |
| 0.1364 | 0.3478 | 750 | 0.1713 |
| 0.1356 | 0.3710 | 800 | 0.1695 |
| 0.1353 | 0.3942 | 850 | 0.1687 |
| 0.1361 | 0.4173 | 900 | 0.1703 |
| 0.1354 | 0.4405 | 950 | 0.1697 |
| 0.1352 | 0.4637 | 1000 | 0.1695 |
| 0.1335 | 0.4869 | 1050 | 0.1683 |
| 0.1327 | 0.5101 | 1100 | 0.1686 |
| 0.1337 | 0.5333 | 1150 | 0.1692 |
| 0.134 | 0.5565 | 1200 | 0.1665 |
| 0.1341 | 0.5796 | 1250 | 0.1680 |
| 0.1347 | 0.6028 | 1300 | 0.1672 |
| 0.1335 | 0.6260 | 1350 | 0.1661 |
| 0.1338 | 0.6492 | 1400 | 0.1663 |
| 0.1335 | 0.6724 | 1450 | 0.1670 |
| 0.1332 | 0.6956 | 1500 | 0.1652 |
| 0.1336 | 0.7188 | 1550 | 0.1663 |
| 0.133 | 0.7419 | 1600 | 0.1656 |
| 0.1332 | 0.7651 | 1650 | 0.1661 |
| 0.1327 | 0.7883 | 1700 | 0.1656 |
| 0.1318 | 0.8115 | 1750 | 0.1662 |
| 0.1319 | 0.8347 | 1800 | 0.1652 |
| 0.1337 | 0.8579 | 1850 | 0.1639 |
| 0.1324 | 0.8811 | 1900 | 0.1648 |
| 0.1334 | 0.9042 | 1950 | 0.1651 |
| 0.1317 | 0.9274 | 2000 | 0.1638 |
| 0.1324 | 0.9506 | 2050 | 0.1649 |
| 0.1326 | 0.9738 | 2100 | 0.1660 |
| 0.1326 | 0.9970 | 2150 | 0.1640 |
| 0.132 | 1.0202 | 2200 | 0.1653 |
| 0.1319 | 1.0434 | 2250 | 0.1655 |
| 0.1326 | 1.0665 | 2300 | 0.1643 |
| 0.1321 | 1.0897 | 2350 | 0.1659 |
| 0.1317 | 1.1129 | 2400 | 0.1644 |
| 0.1322 | 1.1361 | 2450 | 0.1651 |
| 0.1325 | 1.1593 | 2500 | 0.1640 |
| 0.1311 | 1.1825 | 2550 | 0.1626 |
| 0.1323 | 1.2057 | 2600 | 0.1626 |
| 0.1316 | 1.2288 | 2650 | 0.1639 |
| 0.1314 | 1.2520 | 2700 | 0.1635 |
| 0.1314 | 1.2752 | 2750 | 0.1636 |
| 0.131 | 1.2984 | 2800 | 0.1626 |
| 0.1313 | 1.3216 | 2850 | 0.1632 |
| 0.1312 | 1.3448 | 2900 | 0.1637 |
| 0.1317 | 1.3680 | 2950 | 0.1640 |
| 0.1311 | 1.3911 | 3000 | 0.1621 |
| 0.1304 | 1.4143 | 3050 | 0.1631 |
| 0.1307 | 1.4375 | 3100 | 0.1624 |
| 0.1315 | 1.4607 | 3150 | 0.1642 |
| 0.1303 | 1.4839 | 3200 | 0.1636 |
| 0.1315 | 1.5071 | 3250 | 0.1622 |
| 0.1315 | 1.5303 | 3300 | 0.1629 |
| 0.1303 | 1.5534 | 3350 | 0.1642 |
| 0.1309 | 1.5766 | 3400 | 0.1618 |
| 0.1307 | 1.5998 | 3450 | 0.1631 |
| 0.1314 | 1.6230 | 3500 | 0.1629 |
| 0.1314 | 1.6462 | 3550 | 0.1628 |
| 0.1312 | 1.6694 | 3600 | 0.1631 |
| 0.1299 | 1.6926 | 3650 | 0.1618 |
| 0.1304 | 1.7157 | 3700 | 0.1624 |
| 0.1299 | 1.7389 | 3750 | 0.1632 |
| 0.1309 | 1.7621 | 3800 | 0.1623 |
| 0.1303 | 1.7853 | 3850 | 0.1631 |
| 0.1312 | 1.8085 | 3900 | 0.1616 |
| 0.1303 | 1.8317 | 3950 | 0.1622 |
| 0.1308 | 1.8549 | 4000 | 0.1632 |
| 0.1297 | 1.8780 | 4050 | 0.1620 |
| 0.1301 | 1.9012 | 4100 | 0.1617 |
| 0.131 | 1.9244 | 4150 | 0.1597 |
| 0.1296 | 1.9476 | 4200 | 0.1626 |
| 0.1299 | 1.9708 | 4250 | 0.1632 |
| 0.1299 | 1.9940 | 4300 | 0.1605 |
| 0.1296 | 2.0172 | 4350 | 0.1620 |
| 0.1302 | 2.0403 | 4400 | 0.1628 |
| 0.13 | 2.0635 | 4450 | 0.1621 |
| 0.1296 | 2.0867 | 4500 | 0.1616 |
| 0.1298 | 2.1099 | 4550 | 0.1613 |
| 0.1299 | 2.1331 | 4600 | 0.1603 |
| 0.1299 | 2.1563 | 4650 | 0.1621 |
| 0.1306 | 2.1795 | 4700 | 0.1614 |
| 0.1303 | 2.2026 | 4750 | 0.1625 |
| 0.13 | 2.2258 | 4800 | 0.1624 |
| 0.1295 | 2.2490 | 4850 | 0.1627 |
| 0.1299 | 2.2722 | 4900 | 0.1609 |
| 0.13 | 2.2954 | 4950 | 0.1622 |
| 0.1311 | 2.3186 | 5000 | 0.1602 |
| 0.1284 | 2.3418 | 5050 | 0.1616 |
| 0.13 | 2.3649 | 5100 | 0.1602 |
| 0.129 | 2.3881 | 5150 | 0.1605 |
| 0.129 | 2.4113 | 5200 | 0.1606 |
| 0.1297 | 2.4345 | 5250 | 0.1620 |
| 0.1293 | 2.4577 | 5300 | 0.1607 |
| 0.1288 | 2.4809 | 5350 | 0.1615 |
| 0.1294 | 2.5041 | 5400 | 0.1614 |
| 0.1285 | 2.5272 | 5450 | 0.1620 |
| 0.1303 | 2.5504 | 5500 | 0.1618 |
| 0.1291 | 2.5736 | 5550 | 0.1603 |
| 0.1298 | 2.5968 | 5600 | 0.1609 |
| 0.1288 | 2.6200 | 5650 | 0.1604 |
| 0.129 | 2.6432 | 5700 | 0.1600 |
| 0.1291 | 2.6664 | 5750 | 0.1597 |
| 0.1291 | 2.6895 | 5800 | 0.1609 |
| 0.129 | 2.7127 | 5850 | 0.1611 |
| 0.13 | 2.7359 | 5900 | 0.1600 |
| 0.1296 | 2.7591 | 5950 | 0.1603 |
| 0.1294 | 2.7823 | 6000 | 0.1592 |
| 0.1283 | 2.8055 | 6050 | 0.1618 |
| 0.1292 | 2.8287 | 6100 | 0.1612 |
| 0.128 | 2.8518 | 6150 | 0.1604 |
| 0.1288 | 2.8750 | 6200 | 0.1611 |
| 0.1283 | 2.8982 | 6250 | 0.1609 |
| 0.1292 | 2.9214 | 6300 | 0.1605 |
| 0.1302 | 2.9446 | 6350 | 0.1602 |
| 0.1285 | 2.9678 | 6400 | 0.1601 |
| 0.1286 | 2.9910 | 6450 | 0.1609 |
| 0.1301 | 3.0141 | 6500 | 0.1602 |
| 0.1296 | 3.0373 | 6550 | 0.1597 |
| 0.1291 | 3.0605 | 6600 | 0.1604 |
| 0.1288 | 3.0837 | 6650 | 0.1595 |
| 0.129 | 3.1069 | 6700 | 0.1593 |
| 0.1286 | 3.1301 | 6750 | 0.1600 |
| 0.1293 | 3.1533 | 6800 | 0.1599 |
| 0.1289 | 3.1764 | 6850 | 0.1599 |
| 0.1295 | 3.1996 | 6900 | 0.1601 |
| 0.1287 | 3.2228 | 6950 | 0.1592 |
| 0.1286 | 3.2460 | 7000 | 0.1600 |
| 0.1283 | 3.2692 | 7050 | 0.1598 |
| 0.1288 | 3.2924 | 7100 | 0.1612 |
| 0.1298 | 3.3156 | 7150 | 0.1597 |
| 0.1284 | 3.3387 | 7200 | 0.1605 |
| 0.1289 | 3.3619 | 7250 | 0.1605 |
| 0.1289 | 3.3851 | 7300 | 0.1600 |
| 0.1285 | 3.4083 | 7350 | 0.1605 |
| 0.1286 | 3.4315 | 7400 | 0.1610 |
| 0.1278 | 3.4547 | 7450 | 0.1598 |
| 0.1274 | 3.4779 | 7500 | 0.1598 |
| 0.1297 | 3.5010 | 7550 | 0.1599 |
| 0.1288 | 3.5242 | 7600 | 0.1591 |
| 0.1281 | 3.5474 | 7650 | 0.1598 |
| 0.1288 | 3.5706 | 7700 | 0.1600 |
| 0.128 | 3.5938 | 7750 | 0.1594 |
| 0.1287 | 3.6170 | 7800 | 0.1603 |
| 0.1291 | 3.6402 | 7850 | 0.1592 |
| 0.1287 | 3.6633 | 7900 | 0.1596 |
| 0.1283 | 3.6865 | 7950 | 0.1590 |
| 0.128 | 3.7097 | 8000 | 0.1584 |
| 0.1276 | 3.7329 | 8050 | 0.1602 |
| 0.1287 | 3.7561 | 8100 | 0.1602 |
| 0.1306 | 3.7793 | 8150 | 0.1595 |
| 0.1286 | 3.8025 | 8200 | 0.1587 |
| 0.1292 | 3.8256 | 8250 | 0.1593 |
| 0.1275 | 3.8488 | 8300 | 0.1590 |
| 0.1277 | 3.8720 | 8350 | 0.1600 |
| 0.129 | 3.8952 | 8400 | 0.1602 |
| 0.1286 | 3.9184 | 8450 | 0.1593 |
| 0.1281 | 3.9416 | 8500 | 0.1603 |
| 0.1285 | 3.9648 | 8550 | 0.1591 |
| 0.1293 | 3.9879 | 8600 | 0.1592 |
| 0.1283 | 4.0111 | 8650 | 0.1587 |
| 0.1277 | 4.0343 | 8700 | 0.1598 |
| 0.1283 | 4.0575 | 8750 | 0.1599 |
| 0.1288 | 4.0807 | 8800 | 0.1579 |
| 0.1287 | 4.1039 | 8850 | 0.1588 |
| 0.1294 | 4.1271 | 8900 | 0.1607 |
| 0.1277 | 4.1502 | 8950 | 0.1599 |
| 0.1285 | 4.1734 | 9000 | 0.1595 |
| 0.1289 | 4.1966 | 9050 | 0.1610 |
| 0.1289 | 4.2198 | 9100 | 0.1599 |
| 0.1283 | 4.2430 | 9150 | 0.1589 |
| 0.1282 | 4.2662 | 9200 | 0.1597 |
| 0.1286 | 4.2894 | 9250 | 0.1608 |
| 0.1287 | 4.3125 | 9300 | 0.1608 |
| 0.1287 | 4.3357 | 9350 | 0.1602 |
| 0.1286 | 4.3589 | 9400 | 0.1596 |
| 0.1289 | 4.3821 | 9450 | 0.1598 |
| 0.1286 | 4.4053 | 9500 | 0.1612 |
| 0.1281 | 4.4285 | 9550 | 0.1590 |
| 0.1276 | 4.4517 | 9600 | 0.1588 |
| 0.1289 | 4.4748 | 9650 | 0.1590 |
| 0.1284 | 4.4980 | 9700 | 0.1587 |
| 0.1284 | 4.5212 | 9750 | 0.1597 |
| 0.1297 | 4.5444 | 9800 | 0.1594 |
| 0.1276 | 4.5676 | 9850 | 0.1593 |
| 0.129 | 4.5908 | 9900 | 0.1592 |
| 0.1285 | 4.6140 | 9950 | 0.1603 |
| 0.1282 | 4.6371 | 10000 | 0.1601 |
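
Validation loss drops sharply within the first 100 steps and improves only gradually thereafter; the table minimum is 0.1579 at step 8800. To visualize the curve, one option is to parse the table straight out of this card, e.g. (a sketch assuming the card is saved locally as README.md):

```python
# Sketch: extract (step, validation loss) pairs from the results table
# above and plot the training curve.
import re
import matplotlib.pyplot as plt

steps, val_loss = [], []
with open("README.md") as f:
    for line in f:
        m = re.match(r"\|\s*[\d.]+\s*\|\s*[\d.]+\s*\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|", line)
        if m:
            steps.append(int(m.group(1)))
            val_loss.append(float(m.group(2)))

plt.plot(steps, val_loss)
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("Validation loss vs. training step")
plt.show()
```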
### Framework versions
- Transformers 4.40.2
- Pytorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.19.1