Partially Observable Mean Field Reinforcement Learning

Implementation of POMFQ for the AAMAS-2021 paper Partially Observable Mean Field Reinforcement Learning. The paper can be found here.

Each environment contains two teams that train and fight against each other.

Code structure

  • See folder pomfrlFOR for training and testing scripts of the FOR environment.

  • See folder pomfrlPDO for training and testing scripts of the PDO environment.

  • See folder isingmodel for the implementation of POMFQ with the Ising model.

In each of these directories, the files most relevant to our research are:

  • /pomfrlFOR/examples/battle_model/python/magent/builtin/config: This folder contains the reward function for all the games. It is the same for both environments.

  • /pomfrlFOR/examples/battle_model/senario_battle.py: Script to run the training and testing for the battle game. This has the action aggregation calculation for all the algorithms.

  • /pomfrlFOR/train_battle.py: Script to begin training the Multibattle game for 2000 iterations. The algorithm can be specified as a parameter (MFAC, MFQ, IL, or POMFQ). You can also run the recurrent algorithms using rnnIL or rnnMFQ as the parameter. Similarly, you can run train_gather.py and train_pursuit.py for the Battle-Gathering and Predator-Prey domains. These scripts were used to run the training experiments.

  • /pomfrlFOR/battle.py: Script to run comparative testing in the Multibattle game. The algorithm needs to be specified as a command line parameter. All the algorithms described in the previous point can be used as the parameter. Similarly, gather.py and pursuit.py run the test experiments for the Battle-Gathering and Predator-Prey domains. These scripts were used to run all the test (faceoff) experiments.

  • /pomfrlFOR/examples/battle_model/algo: This directory contains the learning algorithms.

  • /pomfrlFOR/examples/battle_model/python/magent/gridworld.py: This script contains the code changes for the modified partially observable games used in our paper compared to the previous MAgent games.

All of the above pointers also hold for the PDO domain. Just look at the relevant files in the pomfrlPDO folder.

  • /isingmodel/main_POMFQ_ising.py: This file is used to run the POMFQ algorithm with the Ising model.
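
To make the "action aggregation" mentioned for senario_battle.py more concrete, here is a minimal sketch of how a mean action can be formed from the actions an agent observes among its neighbours: each action is treated as a one-hot vector and the vectors are averaged. This is not the code from this repository. The function name mean_action, its arguments, and the uniform fallback for an empty view are all assumptions made for illustration.

```python
from collections import Counter


def mean_action(neighbor_actions, num_actions):
    """Average the one-hot encodings of the observed neighbours' actions.

    neighbor_actions: list of integer action ids in range(num_actions).
    Returns a list of length num_actions whose entries sum to 1.
    """
    if not neighbor_actions:
        # Assumption: with no neighbour in view, fall back to a uniform vector.
        return [1.0 / num_actions] * num_actions
    counts = Counter(neighbor_actions)
    total = len(neighbor_actions)
    # Averaging one-hot vectors is the same as taking per-action frequencies.
    return [counts[a] / total for a in range(num_actions)]


# Four observed neighbours choosing actions 0, 2, 2 and 1 out of 4 actions.
print(mean_action([0, 2, 2, 1], num_actions=4))  # [0.25, 0.25, 0.5, 0.0]
```

The resulting vector is what a mean field method would feed to its Q-function alongside the agent's own observation. How each algorithm in this repository builds or samples that vector is defined in senario_battle.py itself.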

Installation Instructions for Ubuntu 18.04

Requirements

At least

  • python==3.6.1
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.6
  • gym==0.9.2
pip install gym
  • scikit-learn==0.22.0
sudo pip install scikit-learn
  • tensorflow 2
pip install --upgrade pip
pip install tensorflow
  • libboost libraries
sudo apt-get install cmake libboost-system-dev libjsoncpp-dev libwebsocketpp-dev
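
After installing the packages above, a quick check of what the interpreter can actually import may save a failed run later. The sketch below is only a convenience and is not part of the repository. The helper name installed_version is hypothetical, and it assumes the packages expose a __version__ attribute, which gym, scikit-learn (imported as sklearn) and tensorflow normally do.

```python
import importlib
import sys


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not define __version__; report that instead of failing.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    print("python", ".".join(str(part) for part in sys.version_info[:3]))
    # Import names, not pip names: scikit-learn is imported as sklearn.
    for name in ("gym", "sklearn", "tensorflow"):
        version = installed_version(name)
        print(name, version if version is not None else "NOT INSTALLED")
```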

Download the files and store them in a separate directory to build the MAgent framework.

Build the MAgent framework

cd pomfrlFOR/examples/battle_model
./build.sh

Similarly change directory and build for the PDO domain.

Training

cd pomfrlFOR
export PYTHONPATH=./examples/battle_model/python:${PYTHONPATH}
python3 train_battle.py --algo il

This will run the training for the multibattle domain with the IL algorithm. For other algorithms, specify MFQ, MFAC or POMFQ in the --algo command line argument. Change directory to pomfrlPDO to run the PDO experiments.
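
If you want to launch several training runs in a row, the shell steps above can be wrapped in a small Python helper. This is a sketch and not a script shipped with the repository: the function names are hypothetical, and the lowercase algorithm spellings are taken from the example commands in this README (il, mfq, pomfq). By default it only prints the commands; pass dry_run=False to actually start training.

```python
import os
import subprocess

# Path added to PYTHONPATH in the export line of the training instructions.
MAGENT_PYTHON_DIR = "./examples/battle_model/python"


def training_command(algo, script="train_battle.py"):
    """Build the argument list for one training run."""
    return ["python3", script, "--algo", algo]


def training_env(base_env=None):
    """Copy the environment and prepend the MAgent python folder to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = MAGENT_PYTHON_DIR + (os.pathsep + existing if existing else "")
    return env


def run_training(algo, repo_dir="pomfrlFOR", dry_run=True):
    """Print the command, and run it from repo_dir unless dry_run is set."""
    cmd = training_command(algo)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, cwd=repo_dir, env=training_env(), check=True)
    return cmd


if __name__ == "__main__":
    for algo in ("il", "mfq", "pomfq"):
        run_training(algo)  # dry run: prints the three commands only
```

Setting repo_dir="pomfrlPDO" would point the same helper at the PDO experiments.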

Testing

cd pomfrlFOR
python battle.py --algo il --oppo mfq --idx 1999 1999

The above command is for running the test battles in the FOR setting. You need to specify the algorithms as command line arguments and give the path to the correct trained model files within this script.

When running test battles with POMFQ, you need to additionally specify the position of POMFQ as a command line parameter.

cd pomfrlFOR
python battle.py --algo il --oppo pomfq --idx 1999 1999 --pomfq_position 1
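
The rule that --pomfq_position must accompany any faceoff involving POMFQ is easy to forget when scripting many test battles. The sketch below builds the command line and refuses to build one that is missing the flag. The helper name faceoff_command is hypothetical. This README only shows the value 1 with POMFQ as the opponent, so the helper does not guess the value and the caller has to supply it.

```python
def faceoff_command(algo, oppo, idx=(1999, 1999), pomfq_position=None,
                    script="battle.py"):
    """Build the argument list for one test battle between algo and oppo."""
    cmd = ["python", script, "--algo", algo, "--oppo", oppo,
           "--idx", str(idx[0]), str(idx[1])]
    if "pomfq" in (algo, oppo):
        if pomfq_position is None:
            raise ValueError("--pomfq_position is required when POMFQ takes part")
        cmd += ["--pomfq_position", str(pomfq_position)]
    return cmd


# Reproduces the two example commands from this section.
print(" ".join(faceoff_command("il", "mfq")))
print(" ".join(faceoff_command("il", "pomfq", pomfq_position=1)))
```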

Similarly, change directory to pomfrlPDO for the PDO experiments.

Repeat all the above instructions to train and test the other two games.

train_gather.py and gather.py run the training and testing for the Battle-Gathering domain, and train_pursuit.py and pursuit.py run the training and testing for the Predator-Prey domain.

For more help with the installation, look at the instructions in MAgent, MFRL or MTMFRL. These repositories also provide installation instructions for OSX. We have not tested our scripts on OSX.

To run the Ising model

cd isingmodel
python main_POMFQ_ising.py 

Code citations

We would like to cite MAgent for the source files of the three game domains used in this paper. We have modified these domains to be partially observable, as described in the code structure section. We would also like to cite MFRL for the source code of MFQ, MFAC and IL, which are used as baselines, and for the Ising model environment. Both repositories are under the MIT license.

Note

This is research code and will not be actively maintained. Please send an email to [email protected] for questions or comments.

Paper citation

If you found this helpful, please cite the following paper:

@InProceedings{Srirampomfrl2021,
  title = 	 {Partially Observable Mean Field Reinforcement Learning},
  author = 	 {Subramanian, Sriram Ganapathi and Taylor, Matthew E. and Crowley, Mark and Poupart, Pascal},
  booktitle = 	 {Proceedings of the International Conference on Autonomous Agents and Multi Agent Systems (AAMAS 2021)},
  year = 	 {2021},
  editor = 	 {U. Endriss and A. Nowé and F. Dignum and A. Lomuscio},
  address = 	 {London, United Kingdom},
  month = 	 {3--7 May},
  publisher = 	 {IFAAMAS}
}

pomfrl's Issues

Question: build.sh fails to compile

Hello, I was fortunate to read your article, but I ran into a problem when reproducing the code. I have tried many approaches and could not get build.sh to compile successfully, although the original MAgent compiles without problems. What is the problem in the output below? I hope to get your guidance, thank you very much!

[90%] Linking CXX shared library libmagent.so
[90%] Built target magent
[93%] Building CXX object CMakeFiles/testlib.dir/src/utility/utility.cc.o
[96%] Linking CXX executable testlib
[96%] Built target testlib
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/render.dir/all' failed
make[1]: *** [CMakeFiles/render.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
