upload via upload_folder 2025-08-07T18:45:06.794303+00:00

Files changed (7)

README.md ADDED

+---
+env_name: Pusher-v5
+tags:
+- Pusher-v5
+- ppo
+- reinforcement-learning
+- custom-implementation
+- mujoco
+- pytorch
+- ddp
+model-index:
+- name: PPO-DDP-PusherV2
+  results:
+  - task:
+      type: reinforcement-learning
+      name: reinforcement-learning
+    dataset:
+      name: Pusher-v5
+      type: Pusher-v5
+    metrics:
+    - type: mean_reward
+      value: -34.84 +/- 4.74
+      name: mean_reward
+      verified: false
+---
+# **PPO** Agent playing **Pusher-v5**
+This is a trained model of a **PPO** agent playing **Pusher-v5**.
+## Usage
+### Create the conda env (run inside a clone of https://github.com/GeneHit/drl_practice)
+```bash
+conda create -n drl python=3.12
+conda activate drl
+python -m pip install -r requirements.txt
+```
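+### Play with the full model
+```python
+import gymnasium as gym
+
+# `load_from_hub` is assumed to come from the drl_practice repo; its import path is not given here.
+# Load the full model.
+model = load_from_hub(repo_id="winkin119/PPO-DDP-PusherV2", filename="full_model.pt")
+# Create the environment.
+env = gym.make("Pusher-v5")
+state, _ = env.reset()
+# Query the policy for an action from the initial observation.
+action = model.action(state)
+...
+```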
+A state-dict version of the model (`state_dict.pt`) is also available; see the repo for the corresponding model definition.
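The README snippet stops after a single action. As a rough sketch of how the rollout might continue, the function below runs whole episodes against any environment that follows the Gymnasium `reset`/`step` API and reports the mean and standard deviation of the episode returns. The name `evaluate`, the per-episode seeding scheme, and the use of the population standard deviation are assumptions for illustration; the repo's own evaluation code may differ.

```python
import statistics
from typing import Any, Callable, Tuple


def evaluate(
    env: Any,
    policy: Callable[[Any], Any],
    episodes: int = 100,
    seed: int = 42,
) -> Tuple[float, float]:
    """Run `episodes` complete episodes and return (mean, std) of their returns.

    `env` must expose Gymnasium-style `reset(seed=...)` and `step(action)`.
    `policy` maps an observation to an action (hypothetical interface).
    """
    returns = []
    for episode in range(episodes):
        # Assumption: a distinct, reproducible seed per episode.
        state, _ = env.reset(seed=seed + episode)
        total = 0.0
        done = False
        while not done:
            action = policy(state)
            state, reward, terminated, truncated, _ = env.step(action)
            total += float(reward)
            # An episode ends on either termination or time-limit truncation.
            done = terminated or truncated
        returns.append(total)
    # Population standard deviation (assumed; matches NumPy's default `std`).
    return statistics.fmean(returns), statistics.pstdev(returns)
```

With the objects from the README, a call would look like `evaluate(env, model.action)`, assuming `model.action` returns something `env.step` accepts directly; if it returns a tensor, it may need converting to a NumPy array first.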

eval_result.json ADDED

+{
+    "mean_reward": -34.83662994066047,
+    "std_reward": 4.737437200988468,
+    "datetime": "2025-08-06T18:39:37.074015+00:00",
+    "train_duration_min": "3.63"
+}
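The `mean_reward` value in the README front matter (`-34.84 +/- 4.74`) appears to be these two numbers rounded to two decimals. A minimal sketch of that formatting, using the JSON above as inline data; the helper name `format_metric` is hypothetical and not taken from the repo:

```python
import json

# Contents of eval_result.json as shown in this commit.
EVAL_RESULT_JSON = """
{
    "mean_reward": -34.83662994066047,
    "std_reward": 4.737437200988468,
    "datetime": "2025-08-06T18:39:37.074015+00:00",
    "train_duration_min": "3.63"
}
"""


def format_metric(eval_result: dict) -> str:
    """Render mean and std reward in the 'mean +/- std' model-card style."""
    return f"{eval_result['mean_reward']:.2f} +/- {eval_result['std_reward']:.2f}"


eval_result = json.loads(EVAL_RESULT_JSON)
metric = format_metric(eval_result)
print(metric)  # -34.84 +/- 4.74

# Note: train_duration_min is stored as a string, so convert before arithmetic.
train_minutes = float(eval_result["train_duration_min"])
```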

full_model.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:ae748b2efb7ae76625871fe4d09af9adad67544bd090a15c362eb58ea2f8dd07
+size 91125

params.json ADDED

+{
+    "env_config": {
+        "env_id": "Pusher-v5",
+        "env_kwargs": {},
+        "max_steps": null,
+        "normalize_obs": false,
+        "use_image": false,
+        "vector_env_num": 2,
+        "use_multi_processing": false,
+        "image_shape": null,
+        "frame_stack": 1,
+        "frame_skip": 1,
+        "training_render_mode": null
+    },
+    "device": "cpu",
+    "learning_rate": 0.0001,
+    "gamma": 0.99,
+    "checkpoint_pathname": "",
+    "max_grad_norm": 1.0,
+    "log_interval": 1,
+    "eval_episodes": 100,
+    "eval_random_seed": 42,
+    "eval_video_num": 10,
+    "timesteps": 813,
+    "rollout_len": 512,
+    "gae_lambda": 0.95,
+    "entropy_coef": {
+        "_type": "LinearSchedule",
+        "_module": "practice.utils_for_coding.scheduler_utils",
+        "_start_e": 0.01,
+        "_end_e": 0.001,
+        "_duration": 731,
+        "_start_t": 0
+    },
+    "value_loss_coef": 1.0,
+    "critic_lr": 0.0001,
+    "num_epochs": 6,
+    "minibatch_num": 8,
+    "clip_coef": 0.2,
+    "value_clip_range": 1.0,
+    "reward_configs": [],
+    "action_scale": 1,
+    "action_bias": 0,
+    "log_std_min": -10,
+    "log_std_max": 2,
+    "use_layer_norm": true,
+    "hidden_sizes": [
+        128,
+        128
+    ],
+    "log_std_state_dependent": false,
+    "world_size": 3
+}
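The `entropy_coef` entry serializes a `LinearSchedule` from `practice.utils_for_coding.scheduler_utils`, whose implementation is not part of this commit. The sketch below shows one plausible reading of its fields: linear interpolation from `_start_e` to `_end_e` over `_duration` steps beginning at `_start_t`, held constant outside that window. The class name `LinearScheduleSketch` and the clamping behaviour are assumptions, as is the idea that the schedule is indexed in the same unit as `timesteps`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearScheduleSketch:
    """Hypothetical stand-in for the repo's LinearSchedule (behaviour assumed)."""

    start_e: float
    end_e: float
    duration: int
    start_t: int = 0

    def __call__(self, t: int) -> float:
        # Fraction of the schedule completed, clamped to [0, 1].
        frac = (t - self.start_t) / self.duration
        frac = min(max(frac, 0.0), 1.0)
        return self.start_e + frac * (self.end_e - self.start_e)


# Values taken from params.json above.
entropy_coef = LinearScheduleSketch(start_e=0.01, end_e=0.001, duration=731, start_t=0)

for t in (0, 365, 731, 813):
    print(t, entropy_coef(t))
```

Under these assumptions the coefficient reaches its final value at step 731 and stays there for the remaining steps up to `timesteps` (813).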

replay.mp4 ADDED

Binary file (16.1 kB).

state_dict.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:ef6f20605fecef8974a1c6226a1636745c56bb4019b8c8c10caf0ac7cdbbad09
+size 88629

tensorboard/events.out.tfevents.1754505352.winkindeMacBook-Air.local.87385.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:bb021fe8cc6a715f03b13a04929f0e7235c8e2c0dae5ff7676e58fd82d948e15
+size 1166876