This project is a Python-based Tetris game with reinforcement learning (RL) agents. It was originally developed as a final project for a course (in the branch cs394r_project); I (Richard Noh) am continuing development independently now that the course has ended. The project features three different agents (with more planned), a graphical user interface (GUI) that is decoupled from the game logic, and a simplified environment with no auto-falling tetriminos.
- docs/
- tetris/
  - env.py: game logic
  - state.py: data structures for the board and tetriminos
  - feature.py: functions that extract features from the board and tetrimino
  - gui.py: (optional) PyGame GUI that merely visualizes the game state
- trained_models/: stores all trained agent models (e.g. neural network weights)
- agent_base.py: abstract base class for agents
- agent_*.py: agent implementations together with their training code
- test_*.py: runs an agent without training
For a detailed description, see our report.
We define the state space as the current board configuration together with the active tetrimino. Note that the code represents these with the data structures in state.py.
We define the action space to be the classic 5 NES button actions:
- move left
- move right
- rotate clockwise
- rotate counter-clockwise
- soft drop
The agent has full control over the tetrimino. Since there is no time dependence or automatic dropping, the agent can spin or shift in place indefinitely without the piece falling. This action space is implemented as env.step().
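As a minimal sketch (the encoding and the env.step() signature are assumptions for illustration, not the project's actual API), the five primitive actions could be written as an enum:

```python
from enum import IntEnum

class Action(IntEnum):
    # The five classic NES button actions (integer encoding assumed).
    MOVE_LEFT = 0
    MOVE_RIGHT = 1
    ROTATE_CW = 2
    ROTATE_CCW = 3
    SOFT_DROP = 4

# Hypothetical usage against the environment (signature assumed):
# next_state, reward, done = env.step(Action.SOFT_DROP)
print(list(Action))
```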
The primitive action space above is cumbersome to learn over. A common approach is to group the actions, so that each "group action" (a term coined in our report) results in the tetrimino reaching a so-called "landing position". This reduces the learning complexity, since the agent simply learns where to place the tetrimino instead of how to maneuver it there. This action space is implemented as env.group_step().
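A self-contained sketch of enumerating landing positions on a toy board (the real logic lives in env.group_step(); this simple straight-drop version ignores tucks and spins that the primitive actions could reach):

```python
import numpy as np

def landing_row(board, piece, col):
    """Drop `piece` (a 2-D 0/1 array) straight down in `board` at column
    `col`; return the topmost row of its resting place, or None if it
    does not fit at all."""
    h, w = piece.shape
    H, W = board.shape
    if col + w > W:
        return None
    last_ok = None
    for row in range(H - h + 1):
        region = board[row:row + h, col:col + w]
        if np.any(region + piece > 1):  # overlap with existing blocks
            break
        last_ok = row
    return last_ok

def landing_positions(board, rotations):
    """Enumerate every (rotation_index, column, row) landing position."""
    out = []
    for r, piece in enumerate(rotations):
        for col in range(board.shape[1] - piece.shape[1] + 1):
            row = landing_row(board, piece, col)
            if row is not None:
                out.append((r, col, row))
    return out

# Toy example: an O-piece on an empty 4x4 board -> 3 columns, 1 rotation.
board = np.zeros((4, 4), dtype=int)
o_piece = np.ones((2, 2), dtype=int)
print(landing_positions(board, [o_piece]))   # [(0, 0, 2), (0, 1, 2), (0, 2, 2)]
```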
Since there are an absurd number of raw states, we instead summarize the board with a handful of hand-crafted features; see feature.py.
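For illustration, a few classic Tetris board features (the project's actual feature set is defined in feature.py; these are common BCTS-style examples, not necessarily the same ones):

```python
import numpy as np

def board_features(board):
    """Compute illustrative features from a 0/1 board (row 0 is the top)."""
    H, W = board.shape
    # Column heights: distance from the highest filled cell to the floor.
    heights = np.where(board.any(axis=0), H - np.argmax(board, axis=0), 0)
    # Holes: empty cells with at least one filled cell above them.
    holes = sum(int(np.sum(board[H - h:, c] == 0)) for c, h in enumerate(heights))
    # Bumpiness: total height difference between adjacent columns.
    bumpiness = int(np.sum(np.abs(np.diff(heights))))
    return {"heights": heights.tolist(), "holes": holes, "bumpiness": bumpiness}

# Example: a single block near the top of column 0 creates two holes below it.
b = np.zeros((4, 3), dtype=int)
b[1, 0] = 1
print(board_features(b))   # {'heights': [3, 0, 0], 'holes': 2, 'bumpiness': 3}
```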
For our abstracted action space, we have the following reward function:
- $+10\times\texttt{numlinescleared}$
- $-10000$ for ending the episode
- $+0.01$ for each group action
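The reward scheme above can be sketched as a small function (the name and signature are assumptions for illustration):

```python
def group_reward(num_lines_cleared, episode_ended):
    """Reward for one group action, mirroring the scheme above."""
    reward = 0.01                      # +0.01 per group action
    reward += 10 * num_lines_cleared   # +10 per cleared line
    if episode_ended:
        reward -= 10000                # large penalty for topping out
    return reward

print(group_reward(2, False))   # 20.01
print(group_reward(0, True))    # -9999.99
```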
This is a linear function approximation agent, whose weights were determined with the cross-entropy method (see the BCTS paper) on the features in feature.py.
To run:

```shell
python agent_bcts.py
```
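A linear function approximation agent of this kind scores each candidate landing position as a dot product of its weight vector with the position's features, then picks the best one. A minimal sketch (interface assumed; the real agent is in agent_bcts.py):

```python
import numpy as np

def pick_group_action(candidates, weights):
    """Given (action, feature_vector) pairs, one per landing position,
    return the action whose features score highest under w . phi."""
    scores = [np.dot(weights, phi) for _, phi in candidates]
    return candidates[int(np.argmax(scores))][0]

# Toy example with 2 features (holes, lines cleared): holes are penalized.
w = np.array([-1.0, 5.0])
cands = [("place_left", np.array([2.0, 0.0])),
         ("place_right", np.array([0.0, 1.0]))]
print(pick_group_action(cands, w))   # place_right
```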
This is also a linear function approximation agent, but the weights are learned with the SARSA(λ) algorithm.
To run:

```shell
python agent_sarsa.py
```
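One SARSA(λ) update with linear function approximation and eligibility traces looks roughly like this (hyperparameters and interface are illustrative, not the ones used in agent_sarsa.py):

```python
import numpy as np

def sarsa_lambda_update(w, z, phi, reward, phi_next, done,
                        alpha=0.01, gamma=0.99, lam=0.8):
    """One on-policy SARSA(lambda) step for a linear Q(s,a) = w . phi(s,a).
    Returns the updated (weights, eligibility trace)."""
    q = np.dot(w, phi)
    q_next = 0.0 if done else np.dot(w, phi_next)
    delta = reward + gamma * q_next - q   # TD error
    z = gamma * lam * z + phi             # accumulating eligibility trace
    w = w + alpha * delta * z
    return w, z

# Toy step with 3 features, starting from zero weights.
w = np.zeros(3)
z = np.zeros(3)
phi = np.array([1.0, 0.0, 1.0])
w, z = sarsa_lambda_update(w, z, phi, reward=10.0,
                           phi_next=np.zeros(3), done=False)
print(w)
```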
This is a much more interesting agent. Instead of observing hand-crafted features from feature.py, this deep Q-network (DQN) agent learns its own representation of the state.
The model weight file in trained_models/ was obtained by training this deep learning agent on my RTX 2070 Super for approximately 15 hours.
To train:

```shell
python agent_dqn.py
```
To run the trained model:

```shell
python run_dqn.py
```
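For a rough idea of the shape of such an agent, here is a minimal DQN network for a raw Tetris board (the board size, layer sizes, and action count are assumptions for illustration; the actual network is defined in agent_dqn.py):

```python
import torch
import torch.nn as nn

class TetrisQNet(nn.Module):
    """Small CNN mapping a 1-channel board to one Q-value per action."""
    def __init__(self, board_h=20, board_w=10, n_actions=5):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * board_h * board_w, 256), nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, board):
        # board: (batch, 1, board_h, board_w) tensor of cell occupancies
        return self.head(self.conv(board))

net = TetrisQNet()
q = net(torch.zeros(1, 1, 20, 10))   # batch of one empty board
print(q.shape)                       # torch.Size([1, 5])
```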
My plan for this project is to implement additional RL agents for Tetris.
The only real dependencies are:
- numpy
- pytorch
pygame is optional, as it is only used for visualization.
TODO: make sure that all of the files can easily be modified to remove TetrisGUI...