Rl-Lunar-model-v2 / results.json
Dewa
First RL model, trained with the PPO algorithm using SB3 (Stable-Baselines3)
2045571
163 Bytes
{"mean_reward": 273.585768150455, "std_reward": 18.535389811239472, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-04-17T11:01:56.596278"}