sandwich-nv-small
This model is a fine-tuned version of GliteTech/wordnet-network-predictor; the training dataset is not specified. It achieves the following results on the evaluation set after the final training epoch (epoch 10):
- Loss: 0.2727
- Precision: 0.6348
- Recall: 0.7107
- F1: 0.6706
- Accuracy: 0.9063
- Matthews Correlation: 0.6176
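As a quick sanity check on the figures above, F1 can be recomputed from the reported precision and recall as their harmonic mean. This is a minimal sketch; the helper name `f1_score` is illustrative and not part of the model's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed above.
precision, recall = 0.6348, 0.7107
f1 = f1_score(precision, recall)

print(round(f1, 4))  # 0.6706, matching the reported F1
```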
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 5
- total_train_batch_size: 320
- optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- num_epochs: 10
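The relationship between the batch-size settings and the shape of the cosine schedule can be sketched in plain Python. The single-device count and the absence of warmup are assumptions (neither is stated in the list above), the total step count is taken from the last row of the training results table, and `cosine_lr` is a hypothetical helper rather than the trainer's own scheduler.

```python
import math

per_device_batch = 64   # train_batch_size
grad_accum = 5          # gradient_accumulation_steps
num_devices = 1         # assumption: the card does not state the device count

# 64 * 5 * 1 = 320, which agrees with total_train_batch_size.
effective_batch = per_device_batch * grad_accum * num_devices

base_lr = 1e-5          # learning_rate
total_steps = 49840     # final step after 10 epochs, per the results table


def cosine_lr(step: int, warmup_steps: int = 0) -> float:
    """Cosine decay from base_lr to 0 (hypothetical helper; warmup assumed 0)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, total_steps // 2, total_steps):
    print(step, cosine_lr(step))
```

Under these assumptions the learning rate starts at 1e-05, passes through roughly half that value at the midpoint, and reaches zero at the final step.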
Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy | Matthews Correlation |
|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.8205 | 0.3276 | 0.3585 | 0.3423 | 0.8152 | 0.2354 |
| 1.0474 | 1.0 | 4984 | 0.2594 | 0.5495 | 0.6289 | 0.5865 | 0.8810 | 0.5190 |
| 0.8297 | 2.0 | 9968 | 0.2576 | 0.5615 | 0.6604 | 0.6069 | 0.8852 | 0.5427 |
| 0.7133 | 3.0 | 14952 | 0.2550 | 0.5916 | 0.7107 | 0.6457 | 0.8954 | 0.5883 |
| 0.6768 | 4.0 | 19936 | 0.2685 | 0.5686 | 0.7296 | 0.6391 | 0.8895 | 0.5813 |
| 0.6005 | 5.0 | 24920 | 0.2552 | 0.6491 | 0.6981 | 0.6727 | 0.9089 | 0.6204 |
| 0.5866 | 6.0 | 29904 | 0.2685 | 0.6000 | 0.7358 | 0.6610 | 0.8987 | 0.6065 |
| 0.5262 | 7.0 | 34888 | 0.2690 | 0.6126 | 0.7358 | 0.6686 | 0.9021 | 0.6152 |
| 0.5321 | 8.0 | 39872 | 0.2660 | 0.6243 | 0.7107 | 0.6647 | 0.9038 | 0.6106 |
| 0.5094 | 9.0 | 44856 | 0.2710 | 0.6384 | 0.7107 | 0.6726 | 0.9072 | 0.6199 |
| 0.4812 | 10.0 | 49840 | 0.2727 | 0.6348 | 0.7107 | 0.6706 | 0.9063 | 0.6176 |
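The headline metrics at the top of this card match the epoch 10 row, which is not the best row on every metric. A minimal sketch for locating the best epochs from the table follows; the tuple layout is an illustrative choice, and the card does not say which checkpoint the published weights correspond to.

```python
# (epoch, validation_loss, f1, matthews_correlation), copied from the table above.
rows = [
    (0, 0.8205, 0.3423, 0.2354),
    (1, 0.2594, 0.5865, 0.5190),
    (2, 0.2576, 0.6069, 0.5427),
    (3, 0.2550, 0.6457, 0.5883),
    (4, 0.2685, 0.6391, 0.5813),
    (5, 0.2552, 0.6727, 0.6204),
    (6, 0.2685, 0.6610, 0.6065),
    (7, 0.2690, 0.6686, 0.6152),
    (8, 0.2660, 0.6647, 0.6106),
    (9, 0.2710, 0.6726, 0.6199),
    (10, 0.2727, 0.6706, 0.6176),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_f1 = max(rows, key=lambda r: r[2])    # highest F1
best_by_mcc = max(rows, key=lambda r: r[3])   # highest Matthews correlation
final = rows[-1]

print("lowest validation loss: epoch", best_by_loss[0])
print("highest F1: epoch", best_by_f1[0])
print("highest MCC: epoch", best_by_mcc[0])
print("final epoch F1:", final[2])
```

Validation loss is lowest at epoch 3 (0.2550), while F1 and Matthews correlation both peak at epoch 5 (0.6727 and 0.6204). Training loss keeps falling through epoch 10 as validation loss drifts upward, so the later epochs add little on the evaluation set.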
Framework versions
- Transformers 5.3.0
- Pytorch 2.10.0+cu128
- Datasets 4.5.0
- Tokenizers 0.22.2
Model tree for GliteTech/sandwich-nv-small
- Base model: microsoft/deberta-v3-small
- Fine-tuned intermediate model: GliteTech/wordnet-network-predictor