GameRL

Reinforcement Learning for games (cartpole) using OpenAI Gym (Q-Learning Agent for CartPole-v1 Environment)

This repository contains an implementation of a Q-learning agent to solve the CartPole-v1 environment from OpenAI Gym. The agent is trained using reinforcement learning techniques, specifically Q-learning with an epsilon-greedy policy. The state space is discretized to allow the agent to effectively learn and optimize its actions.

Introduction
Theory
Installation
Usage
Files

Introduction

This project demonstrates the application of Q-learning, a model-free reinforcement learning algorithm, on the CartPole-v1 environment. The goal is to balance a pole on a cart by applying forces to move the cart left or right. The agent learns to perform this task by interacting with the environment and receiving rewards based on its performance.

Theory

Reinforcement Learning

Reinforcement Learning (RL) is a type of machine learning where an agent learns to make decisions by performing actions in an environment to maximize cumulative reward. The key components of RL are:

Agent: The learner or decision-maker.
Environment: The external system with which the agent interacts.
State: A representation of the current situation of the agent in the environment.
Action: A decision or move made by the agent.
Reward: Feedback from the environment based on the action taken by the agent.
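
The components above fit together in a simple interaction loop. The sketch below is illustrative only: `CorridorEnvironment`, `run_episode`, and `always_right` are hypothetical names that do not come from this repository, and the toy corridor stands in for CartPole so the example needs nothing beyond the standard library.

```python
class CorridorEnvironment:
    """Hypothetical toy environment: walk along positions 0..length-1."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # State: the agent's current position.

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the walls clamp the move.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # Reward: feedback for reaching the goal.
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the cumulative reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # The agent picks an action.
        state, reward, done = env.step(action)  # The environment responds.
        total_reward += reward
        if done:
            break
    return total_reward


def always_right(state):
    """Stand-in agent that ignores the state and always moves right."""
    return 1


print(run_episode(CorridorEnvironment(), always_right))
```

A learning agent would replace `always_right` with a policy that improves from the rewards it observes.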

Q-Learning

Q-learning is a value-based RL algorithm that learns an optimal action-selection policy. It uses a Q-table to store Q-values, each an estimate of the expected cumulative discounted reward for a state-action pair. The Q-learning update rule is:

\[ Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right] \]

where:

\( Q(s, a) \) is the current Q-value.
\( \alpha \) is the learning rate.
\( r \) is the reward received after taking action \( a \) from state \( s \).
\( \gamma \) is the discount factor.
\( s' \) is the next state.
\( \max_{a'} Q(s', a') \) is the highest Q-value over all actions \( a' \) available in the next state.
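
As a minimal sketch of this update rule, assuming a dictionary-backed Q-table keyed by discretized states: `q_update`, `N_ACTIONS`, and the default values of `alpha` and `gamma` are illustrative and not taken from `agent.py`. The `done` flag, which skips bootstrapping from a terminal state, is also an assumption about how episode ends would be handled.

```python
from collections import defaultdict

N_ACTIONS = 2  # CartPole-v1 has two actions: push left (0) or push right (1).


def q_update(q_table, state, action, reward, next_state, done,
             alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the new Q-value."""
    # Assumption: a terminal next state contributes no future value.
    best_next = 0.0 if done else max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Unseen states start with a Q-value of zero for every action.
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)
q_table[(1, 1)] = [0.5, 2.0]

new_value = q_update(q_table, state=(0, 0), action=1, reward=1.0,
                     next_state=(1, 1), done=False, alpha=0.1, gamma=0.9)
print(round(new_value, 4))
```

In this call the target is 1.0 + 0.9 × 2.0 = 2.8, so the Q-value for state `(0, 0)` and action `1` moves from 0 to 0.1 × 2.8 = 0.28.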

Epsilon-Greedy Policy

The epsilon-greedy policy balances exploration and exploitation: it chooses a random action with probability \( \epsilon \) and the action with the highest Q-value with probability \( 1 - \epsilon \). Random actions let the agent explore the environment, while greedy actions exploit what it has already learned.
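
The selection rule can be sketched as follows. `choose_action` and `decayed_epsilon` are hypothetical helpers rather than functions from this repository, and the geometric decay schedule with its default values is an assumption; the text above does not say whether or how \( \epsilon \) changes during training.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Return an action index using an epsilon-greedy rule."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # Explore: uniform random action.
    # Exploit: index of the largest Q-value (the first one wins ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(episode, start=1.0, end=0.01, decay=0.995):
    """Assumed schedule: shrink epsilon geometrically, never below `end`."""
    return max(end, start * decay ** episode)


q_values = [0.1, 0.9]
print(choose_action(q_values, epsilon=0.0))  # Always greedy when epsilon is 0.
print(choose_action(q_values, epsilon=decayed_epsilon(episode=200)))
```

Passing a seeded `random.Random` instance as `rng` makes the exploration steps reproducible.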

State Discretization

CartPole-v1 has a continuous state space: each observation holds the cart position, cart velocity, pole angle, and pole angular velocity. A Q-table needs a finite set of states, so each of these values is discretized into bins, and every combination of bin indices is treated as a distinct state.
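
One way to sketch the binning is shown below. `BOUNDS`, `N_BINS`, `make_edges`, and `discretize` are hypothetical names, not the contents of `environment.py`. The position and angle ranges follow the CartPole-v1 observation limits, while the two velocity ranges and the bin count are illustrative guesses, because those components are unbounded in the environment.

```python
import bisect

# Assumed (low, high) range per component: cart position, cart velocity,
# pole angle in radians, pole angular velocity.
BOUNDS = [(-4.8, 4.8), (-3.0, 3.0), (-0.418, 0.418), (-3.5, 3.5)]
N_BINS = 6  # Hypothetical number of bins per component.


def make_edges(low, high, n_bins):
    """Return the n_bins - 1 interior edges that split [low, high] evenly."""
    width = (high - low) / n_bins
    return [low + width * i for i in range(1, n_bins)]


EDGES = [make_edges(low, high, N_BINS) for low, high in BOUNDS]


def discretize(observation):
    """Map a 4-value continuous observation to a tuple of bin indices."""
    # Values outside the assumed range fall into the first or last bin.
    return tuple(
        bisect.bisect_right(edges, value)
        for edges, value in zip(EDGES, observation)
    )


print(discretize([0.1, 0.1, 0.01, 0.1]))
```

The resulting tuple is hashable, so it can be used directly as a Q-table key. With 6 bins per component the table has at most 6^4 = 1296 states.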

Installation

Clone the repository:

```
git clone https://github.com/yourusername/cartpole-qlearning-agent.git
cd cartpole-qlearning-agent
```

Create a virtual environment (optional but recommended):
```
python -m venv venv
source venv/bin/activate # On Windows use venv\Scripts\activate
```

Install the required packages:
```
pip install -r requirements.txt
```

Usage

Run the main script to train and test the agent:
```
python test_agent.py
```

Files

agent.py: Contains the QLearningAgent class, implementing the Q-learning algorithm.
environment.py: Contains the CartPoleEnvironment class, managing the interaction with the CartPole-v1 environment and state discretization.
test_agent.py: Main script to train and test the Q-learning agent.
requirements.txt: Lists the required Python packages.
