Commit a70c0a7 (verified), committed by sert121 · 1 Parent(s): dfdee00

defog-orpo-model-v5-1epoch

README.md CHANGED
@@ -14,23 +14,23 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/sert121/huggingface/runs/hdobvgjx)
  # results
 
  This model is a fine-tuned version of [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1487
- - Rewards/chosen: -0.0059
- - Rewards/rejected: -0.0256
- - Rewards/accuracies: 0.9037
- - Rewards/margins: 0.0197
- - Logps/rejected: -0.2555
- - Logps/chosen: -0.0585
- - Logits/rejected: 0.2408
- - Logits/chosen: 0.2329
- - Nll Loss: 0.1244
- - Log Odds Ratio: -0.2414
- - Log Odds Chosen: 1.5632
 
  ## Model description
 
@@ -58,17 +58,17 @@ The following hyperparameters were used during training:
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 10
- - num_epochs: 2
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
- | 0.6997 | 0.4 | 144 | 0.6765 | -0.0517 | -0.0564 | 0.8447 | 0.0047 | -0.5638 | -0.5169 | -0.1910 | -0.1941 | 0.6134 | -0.6250 | 0.1486 |
- | 0.206 | 0.8 | 288 | 0.1943 | -0.0081 | -0.0186 | 0.8975 | 0.0105 | -0.1858 | -0.0809 | 0.0507 | 0.0486 | 0.1574 | -0.3672 | 0.9122 |
- | 0.1531 | 1.2 | 432 | 0.1592 | -0.0064 | -0.0245 | 0.9068 | 0.0182 | -0.2452 | -0.0637 | 0.2239 | 0.2196 | 0.1331 | -0.2599 | 1.4386 |
- | 0.1424 | 1.6 | 576 | 0.1510 | -0.0060 | -0.0257 | 0.8975 | 0.0197 | -0.2569 | -0.0597 | 0.2172 | 0.2093 | 0.1265 | -0.2436 | 1.5494 |
- | 0.1291 | 2.0 | 720 | 0.1487 | -0.0059 | -0.0256 | 0.9037 | 0.0197 | -0.2555 | -0.0585 | 0.2408 | 0.2329 | 0.1244 | -0.2414 | 1.5632 |
 
 
  ### Framework versions

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/sert121/huggingface/runs/ubrsk8hu)
  # results
 
  This model is a fine-tuned version of [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b) on an unknown dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 0.2568
+ - Rewards/chosen: -0.0126
+ - Rewards/rejected: -0.0217
+ - Rewards/accuracies: 0.8944
+ - Rewards/margins: 0.0091
+ - Logps/rejected: -0.2167
+ - Logps/chosen: -0.1258
+ - Logits/rejected: 0.1307
+ - Logits/chosen: 0.1283
+ - Nll Loss: 0.2132
+ - Log Odds Ratio: -0.4354
+ - Log Odds Chosen: 0.6768
 
  ## Model description
 

  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 10
+ - num_epochs: 1
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
+ | 0.9481 | 0.2 | 72 | 0.9541 | -0.0776 | -0.0797 | 0.7143 | 0.0021 | -0.7975 | -0.7765 | -0.3031 | -0.3167 | 0.8875 | -0.6703 | 0.0480 |
+ | 0.7313 | 0.4 | 144 | 0.7089 | -0.0551 | -0.0596 | 0.8292 | 0.0045 | -0.5962 | -0.5513 | -0.1005 | -0.1135 | 0.6459 | -0.6312 | 0.1330 |
+ | 0.547 | 0.6 | 216 | 0.4407 | -0.0292 | -0.0367 | 0.8882 | 0.0075 | -0.3670 | -0.2924 | -0.0064 | -0.0109 | 0.3866 | -0.5408 | 0.3609 |
+ | 0.2547 | 0.8 | 288 | 0.3018 | -0.0164 | -0.0250 | 0.8882 | 0.0085 | -0.2498 | -0.1644 | 0.0633 | 0.0592 | 0.2551 | -0.4664 | 0.5805 |
+ | 0.3407 | 1.0 | 360 | 0.2568 | -0.0126 | -0.0217 | 0.8944 | 0.0091 | -0.2167 | -0.1258 | 0.1307 | 0.1283 | 0.2132 | -0.4354 | 0.6768 |
 
 
  ### Framework versions
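
The evaluation metrics in the README.md diff can be cross-checked against each other. Assuming they were logged by an ORPO trainer that reports the reward margin as chosen minus rejected reward, each reward as `beta` times the corresponding log-probability, and the total loss as the NLL loss minus `beta` times the logged log odds ratio (this is how TRL's ORPO trainer appears to compute them), the numbers for both runs line up. This is a rough sketch: `beta = 0.1` is an assumption, since the diff does not show the value used, and `check_run` is a hypothetical helper name.

```python
# Cross-check the ORPO evaluation metrics copied from the README.md diff.
# ASSUMPTION: beta = 0.1; the diff does not show the value actually used.
BETA = 0.1

RUNS = {
    "previous commit (num_epochs: 2)": {
        "loss": 0.1487, "nll_loss": 0.1244, "log_odds_ratio": -0.2414,
        "rewards_chosen": -0.0059, "rewards_rejected": -0.0256,
        "rewards_margins": 0.0197,
        "logps_chosen": -0.0585, "logps_rejected": -0.2555,
    },
    "this commit (num_epochs: 1)": {
        "loss": 0.2568, "nll_loss": 0.2132, "log_odds_ratio": -0.4354,
        "rewards_chosen": -0.0126, "rewards_rejected": -0.0217,
        "rewards_margins": 0.0091,
        "logps_chosen": -0.1258, "logps_rejected": -0.2167,
    },
}


def check_run(m, beta=BETA, tol=1e-3):
    """Return which assumed relationships hold within a rounding tolerance."""
    margin = m["rewards_chosen"] - m["rewards_rejected"]
    loss = m["nll_loss"] - beta * m["log_odds_ratio"]
    return {
        "margin": abs(margin - m["rewards_margins"]) < tol,
        "loss": abs(loss - m["loss"]) < tol,
        "reward_chosen": abs(beta * m["logps_chosen"] - m["rewards_chosen"]) < tol,
        "reward_rejected": abs(beta * m["logps_rejected"] - m["rewards_rejected"]) < tol,
    }


if __name__ == "__main__":
    for name, metrics in RUNS.items():
        print(name, check_run(metrics))
```

If the relationships hold, the higher loss in this commit (0.2568 versus 0.1487) comes from both a larger NLL term and a more negative log odds ratio after one epoch instead of two.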
adapter_config.json CHANGED
@@ -20,13 +20,13 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
- "down_proj",
  "q_proj",
  "v_proj",
- "k_proj",
- "up_proj",
  "gate_proj",
- "o_proj"
  ],
  "task_type": "CAUSAL_LM",
  "use_dora": false,

  "rank_pattern": {},
  "revision": null,
  "target_modules": [
+ "k_proj",
  "q_proj",
+ "down_proj",
  "v_proj",
  "gate_proj",
+ "o_proj",
+ "up_proj"
  ],
  "task_type": "CAUSAL_LM",
  "use_dora": false,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:185b7846ec1099ee2e4c06a87708f963f7f99a0d5f886c9caf7a01c23e27212c
  size 4370592096

  version https://git-lfs.github.com/spec/v1
+ oid sha256:8bf2d06acf02dac3fec00d31e0d2324d28dc5dec563a8feaf13a1699db84d9cb
  size 4370592096
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7acb66cd4720a5d06a43a6841aa6397ab9f76e3d4090ec18cdd61e682cfccdf7
  size 5432

  version https://git-lfs.github.com/spec/v1
+ oid sha256:bf5df8b0118c094a6e475d4cc0489b80b06a79e6167bd5cb3cac1d509d7f970e
  size 5432
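
The adapter_model.safetensors and training_args.bin entries are Git LFS pointer files, so the diff shows only a new `oid` with an unchanged `size`. As a minimal sketch, the pointer text can be parsed and a downloaded file checked against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer strings are copied from the adapter_model.safetensors diff.

```python
import hashlib

OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:185b7846ec1099ee2e4c06a87708f963f7f99a0d5f886c9caf7a01c23e27212c
size 4370592096
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:8bf2d06acf02dac3fec00d31e0d2324d28dc5dec563a8feaf13a1699db84d9cb
size 4370592096
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Check a local file's byte size and SHA-256 digest against a parsed pointer."""
    sha = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so multi-gigabyte files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and sha.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    old, new = parse_lfs_pointer(OLD_POINTER), parse_lfs_pointer(NEW_POINTER)
    print("size unchanged:", old["size"] == new["size"])
    print("content changed:", old["digest"] != new["digest"])
```

An identical size with a different digest is what one would expect when an adapter with the same architecture is retrained: the tensor shapes stay the same while the weights change.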