Sentinel / training
nihalaninihal
Update Colab notebook: 1.5B model, scaled rewards, tuned hyperparameters
ee8c2d4