Update README.md
README.md
CHANGED
@@ -89,7 +89,19 @@ python3 prepare.py
 ```
 If all data has loaded, you can start the training:
 ```bash
-
+python train.py \
+--n_layer=10 \
+--n_head=8 \
+--n_embd=256 \
+--block_size=512 \
+--batch_size=32 \
+--gradient_accumulation_steps=4 \
+--max_iters=5000 \
+--eval_interval=100 \
+--learning_rate=5e-4 \
+--compile=False \
+--dtype='float16' \
+--device='cuda'
 ```
 Then, you'll have to wait until iteration 5000 is reached (will log something like `iter 5000: loss 4.2044, time 50601.67ms, mfu 2.23%`).
 
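Reviewer note: the README text asks the reader to watch for the `iter 5000` log line by hand. A minimal sketch of checking for it programmatically follows. It assumes every training log line has the shape quoted in the README (`iter N: loss X, time Yms, mfu Z%`). The helper names `parse_log_line` and `reached_iteration` are hypothetical and not part of `train.py`.

```python
import re

# Assumed log shape, taken from the example line quoted in the README:
#   iter 5000: loss 4.2044, time 50601.67ms, mfu 2.23%
LOG_LINE = re.compile(
    r"iter (?P<iter>\d+): loss (?P<loss>\d+\.\d+), "
    r"time (?P<ms>\d+\.\d+)ms, mfu (?P<mfu>-?\d+\.\d+)%"
)


def parse_log_line(line):
    """Return the fields of one training log line, or None if it does not match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return {
        "iter": int(match["iter"]),
        "loss": float(match["loss"]),
        "time_ms": float(match["ms"]),
        "mfu_pct": float(match["mfu"]),
    }


def reached_iteration(line, target=5000):
    """True if the line is a training log line at or past the target iteration."""
    record = parse_log_line(line)
    return record is not None and record["iter"] >= target
```

One way to use it would be to pipe the training output into a small script that calls `reached_iteration` on each line and stops waiting once it returns `True`. Lines that do not match the assumed shape are ignored.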