Maze_solver

This project solves a self-generated maze in three ways: A* search, Q-learning, and a Deep Q-Network (DQN).

The project consists of four main parts.

  1. The maze generator consists of a maze class and a field class. It generates a square maze made of route tiles, wall and block tiles, a starting point, and a goal point. Each route tile carries a value of -1 or 0, the points you receive for stepping on it; stepping on a -1 tile costs you one point. Wall and block tiles, shown as #, cannot be entered and must be bypassed. The starting point, shown as S, is where you begin, and the goal, shown as 50, is where you head; reaching it earns 50 points.
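A minimal sketch of the tile layout described above (this is illustrative, not the project's actual maze/field classes): "S" marks the start, "#" marks walls, -1/0 are route-tile rewards, and 50 is the goal.

```python
# Illustrative maze grid: "S" start, "#" walls, -1/0 route tiles, 50 goal.
maze = [
    ["S",  0, -1,  0],
    ["#", "#",  0, -1],
    [ 0,  -1,  0, "#"],
    ["#",  0, -1, 50],
]

# Print the maze with aligned columns for inspection.
for row in maze:
    print(" ".join(f"{tile:>2}" for tile in row))
```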

  2. The A-star maze solver has a node class, a node_list class, and an astar-solver class. The A-star algorithm finds a path from the start point to the goal point by plotting an efficiently directed path between multiple points, called nodes. Each node is scored by the cost of the path so far plus a heuristic, here the Euclidean distance from the node to the goal, and the node with the lowest total estimate is expanded next. For a precise explanation of the A-star algorithm, see Wikipedia: https://en.wikipedia.org/wiki/A*_search_algorithm
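The node-expansion loop can be sketched as follows. This is a generic A* over a toy grid, not the project's node/node_list/astar-solver classes; 4-connected moves with unit step cost are assumed, with the Euclidean distance to the goal as the heuristic.

```python
import heapq
import math

def astar(grid, start, goal):
    """Generic A* over a 4-connected grid; '#' cells are walls.
    Heuristic h: Euclidean distance to the goal (admissible for unit steps)."""
    h = lambda p: math.dist(p, goal)
    # Priority queue of (f = g + h, g, node, path-so-far).
    open_set = [(h(start), 0, start, [start])]
    seen = set()
    while open_set:
        f, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != "#":
                heapq.heappush(
                    open_set,
                    (g + 1 + h((nr, nc)), g + 1, (nr, nc), path + [(nr, nc)]),
                )
    return None  # no path exists

grid = [
    [".", ".", "."],
    ["#", "#", "."],
    [".", ".", "."],
]
print(astar(grid, (0, 0), (2, 0)))  # detours around the wall row
```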

  3. Q-learning has a Q-learning_solver class, which solves the maze in a reinforcement-learning manner. The class randomly samples states, actions, and their rewards, and learns the path from start to goal that maximizes the points earned. Q-learning estimates the optimal future reward from the present state and action, updating an arbitrarily initialized table of expected rewards with the following rule:

Q(s,a) ← Q(s,a) + α ( r + γ max_a′ Q(s′,a′) − Q(s,a) )

https://en.wikipedia.org/wiki/Q-learning
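The tabular update above can be sketched in a few lines. The names and the sample transition here are illustrative, not taken from the project's Q-learning_solver class.

```python
from collections import defaultdict

alpha, gamma = 0.1, 0.9
Q = defaultdict(float)  # Q[(state, action)], initialized to 0

def update(s, a, r, s_next, actions):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# One sampled transition: stepping onto the goal tile earns 50 points.
actions = ["up", "down", "left", "right"]
update(s=(1, 2), a="right", r=50, s_next=(1, 3), actions=actions)
print(Q[((1, 2), "right")])  # alpha * 50 = 5.0 on the first visit
```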

  4. The Deep Q-network has a DQN_solver class. The class uses the deep Q-network technique, a neural-network-enhanced variant of Q-learning. While Q-learning estimates the optimal future value with the previous equation, the Deep Q-network approximates it with a neural network. The network's input variables are the state and action, while the target variable is calculated with the following equation:

target_t = r_t + γ max_a′ Q_θk(s′_t, a′)

https://deepmind.com/research/dqn/
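The target computation for a minibatch can be sketched with NumPy alone (the project itself approximates Q with a Keras model; the values below are made up for illustration). Here q_next stands in for Q_θk(s′_t, a′) evaluated over all actions a′.

```python
import numpy as np

gamma = 0.9
rewards = np.array([-1.0, 0.0, 50.0])   # r_t for three sampled transitions
q_next = np.array([[1.0, 2.0],          # Q_theta_k(s'_t, a') per action a'
                   [0.5, 0.2],
                   [0.0, 0.0]])         # terminal state: no future value

# target_t = r_t + gamma * max_a' Q_theta_k(s'_t, a')
targets = rewards + gamma * q_next.max(axis=1)
print(targets)  # approximately [0.8, 0.45, 50.0]
```

These targets would then serve as the regression labels when fitting the Keras model on the corresponding (state, action) inputs.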

How to use

Just run the notebook. The neural network is modelled with Keras, so you need Keras installed (pip install keras). https://keras.io/

Contributors

shibuiwilliam
