Giter Club home page Giter Club logo

reversi's Introduction

Reversi - an automated player

How to run

cd reversi\python
python reversi.py

Just follow the command menu which will ask you to choose a player type for black and white.

Choose a black player type
----------------------------------------
[1] Manual (Human) Player
[2] Random Player
[3] MiniMax (Naive) Player
[4] MiniMax (Alpha Beta Pruning) Player
[5] Monte Carlo Tree Search Player
----------------------------------------
Enter [1-6]: 

The following sections will explain available player types.

Random Player

This player randomly chooses a move from the legal moves.

  • not very strong
  • occasionally hits a very nice move
  • very quick to decide the next move
  • does not learn from experience

Manual (aka Human) Player

You can play by entering your move.

[20:47:26] Turn: 2 (ManualPlayer)
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 
----------------------------------
0|   |   |   |   |   |   |   |   | 
----------------------------------
1|   |   |   |   |   |   |   |   | 
----------------------------------
2|   |   |   |   |   |   |   |   | 
----------------------------------
3|   |   |   | O | @ |   |   |   | 
----------------------------------
4|   |   |   | @ | @ |   |   |   | 
----------------------------------
5|   |   |   |   | @ |   |   |   | 
----------------------------------
6|   |   |   |   |   |   |   |   | 
----------------------------------
7|   |   |   |   |   |   |   |   | 
----------------------------------
Black: 4 (MonteCarloTreeSearchPlayer)
White: 1 (ManualPlayer)
Enter a move row, col: 5,3   <= enter your move!
  • weak or strong (depends on the mood)
  • can be thinking very long but won't forgive the opponent for taking time
  • could/would/should learn from experience

Minimax Player (aka Naive Minimax)

This player uses the minimax decision rule.

  • requires a max_depth parameter (how deep it should think ahead)
    • max_depth = 3 => it's quick but not very strong
    • max_depth = 5 => it's slow and stronger than RandomPlayer

A big max_depth (>5) means you need to wait much longer as it will drill down into all the possible moves up to the max depth.

If you are human, you'll need to be doing something else while waiting.

The below image shows when I was playing while reading an article on wikipedia.

Minimax with Alpha Beta Pruning

This player uses an improved minimax logic.

  • skips unnecessary tree search (faster than the naive minimax logic)

It is faster than the naive Minimax player. You can try with max_depth = 7 which is pretty strong.

It takes a few minutes per turn which you may still feel quite slow. It is still going through many possible plies recursively.

Also, you it plays the same opening moves every time as there is no randomness involved. I could introduce some randomness when there are multiple best moves available but let's not spend too much effort and move on to the next player type which naturally makes use of randomness.

Monte Carlo Tree Search

This player uses random (monte carlo) simulation of tree search to find out which move will most likely make the player win.

It does not check all the possible game turns (plies) recursively.
It randomly chose a path till the end to find out the ultimate outcome (win/lose). When all immediate child nodes are explored, it uses UCT (Upper Confidence Bound 1 applied to trees) to exploit the experience. This process is repeated within a time frame give (say max seconds = 5 ~ 10). Then, it decides which move is most likely to make the player a winner.

It seems quite strong. I let it plays with Minimax with Alpha/Beta and it won all the time (I guess the result may be different on your PC depending on the max seconds and CPU power).

This player learns from the experience as it plays but it forgets everything once the game is over.

Deep-Q Network (Work In Progress)

This player uses Q-learning (from Reinforcement Learning). There are so many combiation of (state, action) to fit into a computer memory. So, we use a neural network as an approximation of Q-function.

This agent will learn from the experience.

References

reversi's People

Contributors

naokishibuya avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.