
dqlearning-toolbox's Introduction

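Reinforcement Learning Toolbox (DRLToolbox)

Overview

This project builds a toolbox that integrates deep reinforcement learning training, visualization of training results, hyperparameter tuning, and model version management, and it ships with small games for testing and studying the algorithms. The toolbox is meant to help people discover the fun of deep reinforcement learning and to support developers in their research.

Requirements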

  • Python 3
  • TensorFlow-gpu
  • pygame
  • OpenCV-Python
  • PyQt5
  • sys
  • threading
  • multiprocessing
  • shelve
  • os
  • sqlite3
  • socket
  • pyperclip
  • flask
  • glob
  • shutil
  • numpy
  • pandas
  • time
  • importlib
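Many of the entries above (sys, threading, multiprocessing, shelve, os, sqlite3, socket, glob, shutil, time, importlib) are part of the Python standard library, so only the third-party packages need installing. The following is a minimal sketch for checking which of them are missing. It is not part of the project; the `THIRD_PARTY` mapping and the `missing_packages` helper are hypothetical names introduced here, and the mapping from package name to import name is an assumption.

```python
import importlib.util

# Package name from the list above -> module name used in `import`.
# This mapping is an assumption of the sketch, not taken from the project.
THIRD_PARTY = {
    "TensorFlow-gpu": "tensorflow",
    "pygame": "pygame",
    "OpenCV-Python": "cv2",
    "PyQt5": "PyQt5",
    "pyperclip": "pyperclip",
    "flask": "flask",
    "numpy": "numpy",
    "pandas": "pandas",
}


def missing_packages(packages=THIRD_PARTY):
    """Return the package names whose import module cannot be found."""
    return [
        name
        for name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Running the script before launching the window shows which packages still need to be installed.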

How to run?

Run run_window.py to launch the window.

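  • Startup screen

    [Screenshot: startup screen]

  • Main screen

    [Screenshot: main screen]

  • Settings screen

    [Screenshot: settings window]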

For the other features, see the full project report.

What is reinforcement learning?

See the report for details.

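Final performance

Taking Snake as an example: after more than 5 million training iterations (over 48 hours), the agent completed 36,171 games in total. The score of each game is shown in the figure below:

[Figure: score per game for Snake]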

A sample clip:

[Clip: Snake, final version]

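Jumping Man (跳跳人) has so far been trained for 5 million iterations, completing 3,886 games in total. The number of frames survived in each game is shown in the figure below:

[Figure: frames survived per game for Jumping Man]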

A sample clip:

[Clip: Jumping Man 1]

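We estimated the learning speed of the Jumping Man agent. In our assessment, after the first 1,000 games its learning speed roughly follows a quadratic function rather than an exponential one. The results are shown in the figure below:

[Figure: estimated learning speed of Jumping Man]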
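The README does not say how the quadratic and exponential candidates were compared. One possible way, sketched below, is to fit both forms by least squares and compare their sums of squared errors. The data here is synthetic and generated only to make the sketch runnable; it is not the project's training results. The function names are hypothetical, and the exponential fit is a log-linear regression, which minimizes error in log space rather than in the original units.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]  # sums of x^k
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    A = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(A)
    coeffs = []
    for k in range(3):  # Cramer's rule: replace column k with rhs
        Ak = [[rhs[i] if j == k else A[i][j] for j in range(3)]
              for i in range(3)]
        coeffs.append(det3(Ak) / d)
    return tuple(coeffs)  # (a, b, c)


def fit_exponential(xs, ys):
    """Fit y = A*exp(B*x) by linear regression on log(y); needs y > 0."""
    logs = [math.log(y) for y in ys]
    mx = sum(xs) / len(xs)
    ml = sum(logs) / len(logs)
    B = (sum((x - mx) * (l - ml) for x, l in zip(xs, logs))
         / sum((x - mx) ** 2 for x in xs))
    return math.exp(ml - B * mx), B  # (A, B)


def sse(predict, xs, ys):
    """Sum of squared errors of a model over the data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))


# Synthetic stand-in data (NOT project results): x is the game index in
# thousands, y is a made-up quadratic trend plus a small deterministic wobble.
xs = [1.0 + 0.1 * i for i in range(29)]
ys = [40 * x * x + 10 * x + 50 + 2 * math.sin(7 * i)
      for i, x in enumerate(xs)]

a, b, c = fit_quadratic(xs, ys)
A, B = fit_exponential(xs, ys)
quad_sse = sse(lambda x: a * x * x + b * x + c, xs, ys)
exp_sse = sse(lambda x: A * math.exp(B * x), xs, ys)
print(f"quadratic SSE = {quad_sse:.1f}, exponential SSE = {exp_sse:.1f}")
```

To apply this to real training logs, replace `xs` and `ys` with the game indices and frames survived per game. Rescaling the game index (here, to thousands) keeps the normal equations well conditioned.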

Deep Q-Network Algorithm

Initialize replay memory D to size N
Initialize action-value function Q with random weights
for episode = 1, M do
    Initialize state s_1
    for t = 1, T do
        With probability ϵ select random action a_t
        otherwise select a_t = argmax_a Q(s_t, a; θ_i)
        Execute action a_t in emulator and observe r_t and s_(t+1)
        Store transition (s_t,a_t,r_t,s_(t+1)) in D
        Sample a minibatch of transitions (s_j,a_j,r_j,s_(j+1)) from D
        Set y_j:=
            r_j for terminal s_(j+1)
            r_j + γ * max_(a') Q(s_(j+1), a'; θ_i) for non-terminal s_(j+1)
        Perform a gradient step on (y_j - Q(s_j, a_j; θ_i))^2 with respect to θ_i
    end for
end for
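The core steps of the pseudocode above (ε-greedy action selection, the replay memory, the target y_j, and the update toward it) can be sketched in plain Python. This is an illustration under stated assumptions, not the toolbox's implementation: a lookup table stands in for the neural network Q(s, a; θ), and `N_ACTIONS`, `select_action`, `td_target`, and `train_step` are hypothetical names introduced here.

```python
import random
from collections import deque

N_ACTIONS = 2  # hypothetical number of actions, for illustration only


def select_action(q_values, epsilon):
    """With probability epsilon pick a random action, otherwise the argmax."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, terminal, gamma=0.99):
    """y_j = r_j if s_(j+1) is terminal, else r_j + gamma * max_a' Q(s_(j+1), a')."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


def train_step(q_table, memory, batch_size, gamma=0.99, lr=0.1):
    """Sample a minibatch from D and move Q(s_j, a_j) toward y_j.

    Assumption: a dict of per-state value lists replaces the network, so the
    gradient step on (y_j - Q)^2 reduces to Q += lr * (y_j - Q).
    """
    batch = random.sample(list(memory), min(batch_size, len(memory)))
    for state, action, reward, next_state, terminal in batch:
        next_q = q_table.setdefault(next_state, [0.0] * N_ACTIONS)
        y = td_target(reward, next_q, terminal, gamma)
        q = q_table.setdefault(state, [0.0] * N_ACTIONS)
        q[action] += lr * (y - q[action])


# Replay memory D with capacity N: the oldest transitions are dropped first.
memory = deque(maxlen=1000)
memory.append(("s0", 1, 1.0, "s1", True))  # (s_t, a_t, r_t, s_(t+1), terminal)

q_table = {}
train_step(q_table, memory, batch_size=1)
print(q_table["s0"])
```

In a real DQN the table is replaced by a network and the update by an optimizer step on the squared error, but the control flow stays the same.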

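Developers

Liang Zhipeng (梁智鹏), Chen Hao (陈昊), and Zhang Yiwei (张意伟) of Lingnan College, Sun Yat-sen University. We also thank Associate Professor Zhang Hongbin (张宏斌) of Lingnan College, Sun Yat-sen University, for attentive guidance, and Fu Xingyu (付星宇) of the School of Mathematics, Sun Yat-sen University, for theoretical guidance.

References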

[1] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. Human-level Control through Deep Reinforcement Learning. Nature, 518(7540):529-533, 2015.

[2] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing Atari with Deep Reinforcement Learning. NIPS Deep Learning Workshop, 2013.

[3] Kevin Chen. Deep Reinforcement Learning for Flappy Bird. Report | YouTube result

[4] Flood Sung. Deep Reinforcement Learning 基础知识(DQN方面) [Deep Reinforcement Learning Basics (DQN)]. https://blog.csdn.net/songrotek/article/details/50580904

[5] Ilaria Giannoccaro and Pierpaolo Pontrandolfo. Inventory management in supply chains: a reinforcement learning approach. International Journal of Production Economics, 2002, 153-161.

[6] Zhengyao Jiang, Dixing Xu, and Jinjun Liang. A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem. https://arxiv.org/abs/1706.10059
