Tetris AI Project

example gui screen

This project is a Python-based Tetris game with reinforcement learning (RL) agents. It was originally developed as a final project for a course (see the cs394r_project branch); I (Richard Noh) have continued development independently since the course ended. The project features three agents so far, with more planned, a graphical user interface (GUI) that is decoupled from the game logic, and a simplified environment in which tetriminos do not fall automatically.

File Structure

- docs/
- tetris/
    - env.py:       game logic
    - state.py:     data structures for board and tetrimino
    - feature.py:   functions that extract features from the board and tetrimino
    - gui.py:       (optional) PyGame GUI that merely visualizes the game state
- trained_models/:  stores all trained agent models (ex: neural network weights)
- agent_base.py:    abstract class for agents
- agent_*.py:       implementation of agent + training...
- test_*.py:        runs the agent without training

RL Problem Statement

For a detailed description, see our report.

State Space: $\mathcal{S}$

We define the state space, $\mathcal{S}$, where $s$ is an element of $\mathcal{S}$, to be the collection of the board configuration (filled and empty tiles) with the current "falling" tetrimino's position, orientation, and type.

Note that the code represents $s$ in the same way as classic NES Tetris, as described by MeatFighter's excellent article.

Action Space: $\mathcal{A}$

We define the action space to be the classic 5 NES button actions:

  • move left
  • move right
  • rotate clockwise
  • rotate counter-clockwise
  • soft drop

The agent has full control over the tetrimino. Since there is no "time dependence" or automatic dropping, the agent could indefinitely spin/shift in place without falling. This action space is implemented as env.step().
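As a rough illustration, the five button actions could be represented as an enumeration. The names and integer codes below are hypothetical; the actual encoding expected by env.step() may differ.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical encoding of the five NES-style button actions."""
    MOVE_LEFT = 0
    MOVE_RIGHT = 1
    ROTATE_CW = 2
    ROTATE_CCW = 3
    SOFT_DROP = 4


# Every member is a valid choice in every state, since nothing falls on its own.
ALL_ACTIONS = list(Action)
```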

Abstracted Action Space: $\mathcal{A}_g(s)$

The $\mathcal{A}$ action space has a high learning complexity, since it forces the agent to not only learn how to strategically place the tetriminos, but also how to move that tetrimino into those strategic positions.

A common approach is to group the actions, so that each "group action" (coined in this paper) results in a tetrimino in a so-called "landing position". This reduces the learning complexity, since the agent can simply learn where to place the tetrimino, instead of moving it around. I denote this action space as $\mathcal{A}_g(s)$, where $s \in \mathcal{S}$. Note that this action space is dependent on $s$, since each possible landing position is dependent on the current state of the board. This action space is implemented as env.group_step().
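To make the dependence on $s$ concrete, here is a minimal sketch of how landing positions could be enumerated. It assumes the board is a list of rows (top row first, truthy meaning filled) and that each orientation is a list of (row, column) cell offsets. It only considers straight drops, so it ignores tucks and spins, and the function names are hypothetical rather than taken from env.py.

```python
def fits(board, cells, row, col):
    """Return True if every cell of the piece is in bounds and on an empty tile."""
    n_rows, n_cols = len(board), len(board[0])
    for d_row, d_col in cells:
        r, c = row + d_row, col + d_col
        if r < 0 or r >= n_rows or c < 0 or c >= n_cols or board[r][c]:
            return False
    return True


def landing_positions(board, orientations):
    """List (orientation index, column, landing row) for every straight drop."""
    results = []
    for rot, cells in enumerate(orientations):
        for col in range(len(board[0])):
            if not fits(board, cells, 0, col):
                continue  # the piece cannot even be placed at the top here
            row = 0
            while fits(board, cells, row + 1, col):
                row += 1  # keep sliding down until the next row would collide
            results.append((rot, col, row))
    return results
```

Because the landing row (and whether a column is usable at all) is computed from the filled tiles, two different boards generally produce two different sets of group actions.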

Abstracted State Space: $\mathcal{S}_g$

Since $\mathcal{S}$ contains an enormous number of states (~$2^{200}$ from board configurations alone, as each of the board's 200 tiles is either filled or empty), we have to abstract the state for most RL agents. We do this by computing a set of features, such as the number of holes in the current grid. We use the set of features described in this BCTS paper. This is implemented in feature.py.
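As an example of such a feature, the sketch below counts holes, assuming a hole is an empty tile with at least one filled tile somewhere above it in the same column. The board is assumed to be a list of rows with the top row first. This is an illustration only; the function name is hypothetical and feature.py may define and compute holes differently.

```python
def count_holes(board):
    """Count empty tiles that have a filled tile above them in the same column."""
    holes = 0
    for col in range(len(board[0])):
        seen_filled = False
        for row in board:  # scan the column from top to bottom
            if row[col]:
                seen_filled = True
            elif seen_filled:
                holes += 1
    return holes


example_board = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],  # the empty tile in column 0 is covered, so it is a hole
    [1, 1, 1],
]
n_holes = count_holes(example_board)
```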

Rewards: $\mathcal{R}$

For our abstracted action space, we have the following reward function:

  • $+10\times\texttt{numlinescleared}$
  • $-10000$ for ending the episode
  • $+0.01$ for each group action
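Read literally, these terms can be combined into a single per-step reward. The sketch below assumes the three terms are simply summed and that the $+0.01$ bonus is also granted on the terminal step; neither detail is stated above, and the function and constant names are hypothetical.

```python
LINE_CLEAR_REWARD = 10.0
EPISODE_END_PENALTY = -10000.0
GROUP_ACTION_BONUS = 0.01


def group_action_reward(num_lines_cleared, episode_ended):
    """Reward for one group action under the summed-terms assumption."""
    reward = GROUP_ACTION_BONUS + LINE_CLEAR_REWARD * num_lines_cleared
    if episode_ended:
        reward += EPISODE_END_PENALTY
    return reward
```

The small per-action bonus rewards survival, while the large terminal penalty dominates any realistic line-clear reward from a single placement.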

Agents

BCTS (Non-learning)

This is a linear function approximation agent, whose weights were determined with the cross-entropy method (see the BCTS paper) on the features in $\mathcal{S}_g$. From my brief testing, this agent can clear tens of thousands of lines before ending a single episode. Note that this agent is not a true RL agent. It does not learn.
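The decision rule of a linear agent like this can be sketched as follows: score the features of each candidate landing position with a dot product and pick the highest score. The weights in the usage example are made up for illustration and are not the BCTS weights; the function names are hypothetical.

```python
def linear_value(features, weights):
    """Dot product of a feature vector with a weight vector."""
    return sum(w * f for w, f in zip(weights, features))


def pick_best(candidates, weights):
    """candidates: list of (group_action, feature_vector) pairs."""
    best_action, _ = max(candidates, key=lambda c: linear_value(c[1], weights))
    return best_action


# Illustrative only: two features (say, holes and lines cleared) with made-up weights.
weights = [-1.0, 0.5]
candidates = [("place_left", [1.0, 0.0]), ("place_right", [0.0, 1.0])]
choice = pick_best(candidates, weights)
```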

To run:

python agent_bcts.py

SARSA($\lambda$)

This is also a linear function approximation agent, but the weights are learned with the SARSA($\lambda$) algorithm (see Sutton and Barto). As of May 21, 2024, I have not tuned this agent's hyperparameters, so I have not yet seen it obtain a good policy.
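For reference, a single semi-gradient SARSA($\lambda$) update with linear function approximation and accumulating eligibility traces can be sketched as below. This is a generic textbook-style version, not necessarily what agent_sarsa.py does; the function name and argument layout are assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sarsa_lambda_update(w, z, x, reward, x_next, alpha, gamma, lam, terminal):
    """One update step.

    w: weights, z: eligibility trace, x: features of the current (s, a),
    x_next: features of the next (s', a'); x_next is ignored when terminal.
    Returns the new (w, z) without mutating the inputs.
    """
    q = dot(w, x)
    q_next = 0.0 if terminal else dot(w, x_next)
    delta = reward + gamma * q_next - q  # TD error
    z = [gamma * lam * zi + xi for zi, xi in zip(z, x)]  # accumulating trace
    w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
    return w, z
```

The trace should be reset to zeros at the start of each episode.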

To run:

python agent_sarsa.py

Deep Q-Learning (DQN)

This is a much more interesting agent. Instead of observing features from $\mathcal{S}_g$, this agent uses convolutional layers that read the state from $\mathcal{S}$ and the corresponding action in $\mathcal{A}_g(s)$ to estimate the Q-value. It uses an experience replay buffer and a target network with delayed updates to mitigate the overfitting/instability problems associated with deep neural network models. Additional details on this agent are in the report, and a screenshot of the overall training algorithm is located here.

The model weight file in trained_models/ was obtained by training this deep learning agent on my RTX 2070 Super for approximately 15 hours.
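The two stabilizing ingredients can be sketched without any deep learning library. The snippet below shows a fixed-capacity replay buffer and the standard one-step Q-learning target, where the next-state Q-values are assumed to come from the target network. The class and function names are hypothetical and the actual code in agent_dqn.py may be organized differently.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def push(self, transition):
        self._items.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up temporal correlation.
        return random.sample(list(self._items), batch_size)

    def __len__(self):
        return len(self._items)


def td_target(reward, next_q_values, gamma, terminal):
    """One-step target: r + gamma * max over the next state's group actions."""
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)
```

Here next_q_values would hold one target-network estimate per landing position available in the next state, which is why the maximum runs over a list of varying length.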

To train:

python agent_dqn.py

To run the trained model:

python run_dqn.py

Other Agents...

My plan for this project is to implement other RL agents into Tetris.

Dependencies

The only real dependencies are:

- numpy
- pytorch

pygame is optional, as it is only used for the visualization.

TODO make sure that all of the files can easily be modified to remove TetrisGUI...
