Giter Club home page Giter Club logo

tiny-mcts's Introduction

Tiny-MCTS

A smol version of tic-tac-toe MCTS with little-to-no optimization.

Monte Carlo Tree Search is a versatile search algorithm for making decisions. Its applications range from game-playing to vehicle-piloting.

Algorithm Description

MCTS is an iterative algorithm that executes four subroutines until time runs out.

1. Selection
2. Expansion
3. Simulation
4. Back-propagation

Selection

Starting at the root node, MCTS navigates the search tree in memory until reaching a node on the fringe. At each internal node, a child is chosen according to some "tree policy." The tree policy should take into account the probability of winning stored in each child. A common policy is UCT, which treats each node like a multi-armed-bandit, exploring less and exploiting more as nodes accrue visits.

void select(root) {
  x = root;
  while(true) {
    if(terminal node x) {
      value = evaluate();
      num = 1;
      break;
    }
    if(leaf node x) {
      value, num = expand(x);
      break;
    }
    x = select_child(x);
    do_action(x.action);
  }
  back_propagate(x, prob, num);
}

Expansion

Once a leaf node has been selected, MCTS expands it into its children. A new node is added to the tree for each legal action according to the leaf's state.

float, float expand(x) {
  score, count = 0;
  for(each action a) {
    do_action(a);
    prob = simulate();
    score += prob;
    count += 1;
    undo_action(a);
    x.add_node(a, prob);
  }
  return score, count;
}

Simulation

MCTS runs a simulation beneath each new node. The simulation is often dependent on the domain. A generic approach is to run a playout with random legal actions. In zero-sum two-player games, the result of the playout is usually the probability of winning for either player.

  • 0 means that a player loses.
  • 0.5 means that a player draws.
  • 1 means that a player wins.
float simulate() {
  while(true) {
    if(terminal state) {
      prob = evaluate();
      break;
    }
    a = random_action();
    do_action(a);
    stack.push(a);
  }
  while(not stack.empty()) {
    undo_action(stack.pop());
  }
  return prob;
}

Back-propagation

Once probabilities have been determined, MCTS propagates them up the tree, following the unique path from the leaf to the root. In zero-sum two-player games, the winning probability is added to the value of each node owned by the relevant alliance. The simulation count is added to each node on the path, regardless of alliance.

void back_propagate(x, root) {
  while(x is not root) {
    update(x, value, num);
    undo_action(x.action);
    x = x.parent;
  }
  update(root, value, num);
}

tiny-mcts's People

Contributors

redbedhed avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.