🚁 Reinforce Agent on Pixelcopter-PLE-v0

This repository contains a trained Reinforce (Policy Gradient) agent that successfully plays the Pixelcopter-PLE-v0 environment.


📊 Model Card

Model Name: Reinforce-Pixelcopter-PLE-v0
Environment: Pixelcopter-PLE-v0
Algorithm: Reinforce (Monte Carlo Policy Gradient)
Performance Metric:

  • Achieves stable flight and obstacle avoidance across evaluation runs
  • Mean reward demonstrates convergence to an effective policy

🚀 Usage

```python
import pickle

import gym
import gym_pygame  # assumed: registers the PLE environments (incl. Pixelcopter-PLE-v0) with gym
from huggingface_hub import hf_hub_download

# Download the trained Reinforce policy checkpoint from the Hub
checkpoint_path = hf_hub_download(
    repo_id="KraTUZen/Reinforce-Pixelcopter-PLE-v0",
    filename="reinforce.pkl",
)
with open(checkpoint_path, "rb") as f:
    policy = pickle.load(f)

# Initialize environment
env = gym.make("Pixelcopter-PLE-v0")
```
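With the policy and environment in hand, you can evaluate the agent with a simple rollout loop. The sketch below assumes the unpickled object exposes an `act(state)` method that returns a discrete action (as the `Policy` class in the Hugging Face Deep RL course does); adapt it to the actual interface stored in `reinforce.pkl`.

```python
def evaluate(env, policy, n_episodes=10, max_steps=1000):
    """Roll out the policy and return the mean episodic reward."""
    scores = []
    for _ in range(n_episodes):
        state = env.reset()
        total = 0.0
        for _ in range(max_steps):
            # act() is an assumed interface; some implementations
            # return (action, log_prob) instead of a bare action.
            action = policy.act(state)
            state, reward, done, _ = env.step(action)
            total += reward
            if done:
                break
        scores.append(total)
    return sum(scores) / len(scores)
```

For example, `mean_reward = evaluate(env, policy)` reproduces the style of evaluation used for the reported results.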

🧠 Notes

  • The agent is trained using the Reinforce algorithm, which updates policy parameters based on episodic returns.
  • The environment is Pixelcopter-PLE-v0, a pixel-based game where the agent must keep the helicopter flying while avoiding obstacles.
  • The serialized policy is stored in reinforce.pkl.
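The episodic returns mentioned above are discounted sums of future rewards, G_t = r_t + γ·G_{t+1}, computed backwards from the end of each episode. A minimal sketch:

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute Monte Carlo returns G_t = r_t + gamma * G_{t+1} for one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns
```

Each state-action pair in the episode is then credited with the return from its own timestep onward, which is what makes Reinforce a Monte Carlo method.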

📂 Repository Structure

  • reinforce.pkl → Trained policy weights
  • README.md → Documentation and usage guide

✅ Results

  • The agent learns to maintain altitude and avoid collisions with obstacles.
  • Demonstrates convergence to a stable policy using policy gradient methods.

🔎 Environment Overview

  • Observation Space: Pixel-based state representation (visual input)
  • Action Space: Discrete (flap or no flap)
  • Objective: Keep the helicopter flying while avoiding obstacles
  • Reward: Positive reward for survival, penalties for collisions

📚 Learning Highlights

  • Algorithm: Reinforce (Policy Gradient)
  • Update Rule: Policy parameters updated using returns from sampled episodes
  • Strengths: Effective for environments with discrete actions and episodic rewards
  • Limitations: High variance in updates, mitigated with sufficient training episodes
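The update rule above can be made concrete with a toy example. The NumPy sketch below implements the Reinforce gradient ascent step, ∇θ ≈ Σ_t G_t ∇ log π(a_t|s_t), for a linear-softmax policy; it is purely illustrative and is not the architecture, optimizer, or code behind `reinforce.pkl`.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

class LinearSoftmaxPolicy:
    """Toy policy: pi(a|s) = softmax(W s) over discrete actions."""
    def __init__(self, n_features, n_actions):
        self.W = np.zeros((n_actions, n_features))

    def probs(self, state):
        return softmax(self.W @ state)

    def act(self, state):
        p = self.probs(state)
        return rng.choice(len(p), p=p)

    def grad_log_prob(self, state, action):
        # For a linear-softmax policy: d log pi(a|s) / dW = (one_hot(a) - pi(.|s)) s^T
        p = self.probs(state)
        onehot = np.zeros_like(p)
        onehot[action] = 1.0
        return np.outer(onehot - p, state)

def reinforce_update(policy, episode, gamma=0.99, lr=0.01):
    """One Reinforce step: episode is a list of (state, action, reward) tuples."""
    g = 0.0
    grad = np.zeros_like(policy.W)
    for state, action, reward in reversed(episode):
        g = reward + gamma * g  # Monte Carlo return from this timestep
        grad += g * policy.grad_log_prob(state, action)
    policy.W += lr * grad  # gradient *ascent* on expected return
```

Because the gradient is weighted by full-episode returns sampled from a stochastic policy, individual updates are noisy; this is the high variance noted above, which averaging over many episodes (or subtracting a baseline) tames.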