yalidu / liir
Learning Individual Intrinsic Reward in MARL
In the LIIR experiments, env steps are set to about 10 million.
In the QMIX experiments, env steps are set to about 1 million.
Why are the env steps so different? I am not sure how to interpret the gap.
Could you please explain?
I have been able to get SMAC installed and working to train COMA/QMIX agents.
When I tried to run your code the first time I received this error:
File "src/main.py", line 33, in my_main
env_args['seed'] = _config["seed"]
sacred.utils.SacredError: The configuration is read-only in a captured function!
To fix this I added the following code to the main.py file:
from sacred import SETTINGS
SETTINGS['CONFIG']['READ_ONLY_CONFIG'] = False
Now I am receiving:
ValueError: Unknown game version: 4.1.4. Known versions: ['latest'].
This is the same error I receive when I try to run the Variance Based Control code.
Any ideas?
Thank you for your time.
Edit: I am using version 4.11.3
[INFO 15:17:39] absl Launching SC2: /home/gezhixin/pymarl-master/3rdparty/StarCraftII/Versions/Base55958/SC2_x64 -listen 127.0.0.1 -port 17708 -dataDir /home/gezhixin/pymarl-master/3rdparty/StarCraftII/ -tempDir /tmp/sc-yw5hsyts/ -eglpath libEGL.so
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/gezhixin/pymarl-master/3rdparty/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 17708 -dataDir /home/gezhixin/pymarl-master/3rdparty/StarCraftII/ -tempDir /tmp/sc-yw5hsyts/ -eglpath libEGL.so'
[INFO 15:17:39] absl Connecting to: ws://127.0.0.1:17708/sc2api, attempt: 0, running: True
Starting up...
Startup Phase 1 complete
(the same Launching / Version / Command Line / "Starting up..." / "Startup Phase 1 complete" lines repeat for each of the other parallel SC2 instances, ports 15751-24580)
[INFO 15:17:40] absl Connecting to: ws://127.0.0.1:24288/sc2api, attempt: 1, running: True
(followed by the same "attempt: 1" connection retries for the remaining ports)
File "C:\Users\xxx\Downloads\liir-master\liir-master\src\controllers\basic_controller.py", line 27, in forward
agent_inputs = self._build_inputs(ep_batch, t)
File "C:\Users\xxx\Downloads\liir-master\liir-master\src\controllers\basic_controller.py", line 92, in _build_inputs
inputs = th.cat([x.reshape(bs*self.n_agents, -1) for x in inputs], dim=1)
RuntimeError: error in LoadLibraryA
Thanks for any help!
Hi,
Thanks for your awesome work on MARL.
I still have some questions after reading your paper.
Regarding equation 6, I am confused about why the advantage function there uses a decentralized critic for each agent rather than the centralized critic defined in equation 1, which follows the CLDE (centralized learning with decentralized execution) rule. I also guess that in equation 6 the 'u' and 's' should be bold.
Besides, in Algorithm 1 line 5, I believe the term substituted from equation 8 is missing the log of the policy.
I wonder if I have misunderstood something; could you please check for me?
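For concreteness, this is the form I would expect in Algorithm 1 line 5 with the log restored (my own notation, not the paper's, so please correct me if the indices differ):

```latex
\nabla_{\theta_i} J
  \;=\;
\mathbb{E}\Big[\,
  \nabla_{\theta_i} \log \pi_{\theta_i}\big(u^i \mid \tau^i\big)\;
  A^i(\mathbf{s}, \mathbf{u})
\,\Big]
```

i.e. the advantage should multiply the gradient of the log-policy, not the policy itself.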
Best.
C
First of all, thank you for interesting paper and published code!
I have a question regarding the training of the critics. As far as I understand, you use state-value functions and train them off-policy through temporal-difference learning (https://github.com/yalidu/liir/blob/master/src/learners/liir_learner.py#L224). Could you please clarify how the expectation over the reward in the temporal-difference target for the state-value function is handled when the states/actions come from a different policy? It seems to me that you would need to sample from the current policy, take the corresponding action, and feed it to the unknown reward function. Am I missing something?
Hi,
Thanks for your awesome work on MARL.
Where is the code for visualizing the learned intrinsic reward? Did you visualize it from replays? How is the replay connected with the intrinsic reward?
First of all, thank you for your very interesting paper/method.
We tried to run an ablation comparing the default value of the meta-reward weight λ against λ = 0. Unfortunately, we managed neither to match the results in the paper (we see bad performance and high variance for nonzero λ) nor to get any improvement for the default λ over λ = 0. Are there any specific tips or config discrepancies that could be responsible for this?
Plots are attached; the last value in the legend labels is the number of seeds.
Hi,
I ran LIIR in the Capture Target domain, where two agents have to capture a moving target simultaneously in a grid world with only a +1 terminal reward. I performed decent hyper-parameter tuning; however, it doesn't learn anything.
I found that "mask_alive" (line 68 of liir_learner.py) made all available actions 0, which causes log_pi_taken (line 99 of liir_learner.py) to be 0 in the end. So there was no gradient at all. Is this a bug, or do you have any other suggestions?
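To illustrate what I mean, here is a toy sketch with made-up numbers (my own simplification, following the usual pymarl-style convention where masked-out entries of pi_taken are replaced by 1 so that their log is 0):

```python
import math

def log_pi_taken(pi_taken, mask_alive):
    # Where mask_alive is 0 the probability is forced to exactly 1,
    # so log(1) = 0 and that entry contributes no gradient. If the
    # mask is 0 for every agent at every timestep, the whole policy
    # loss is identically 0 -- which matches what I observe.
    out = []
    for p, m in zip(pi_taken, mask_alive):
        p = p * m + (1.0 - m)
        out.append(math.log(p))
    return out

print(log_pi_taken([0.5, 0.8], [1.0, 0.0]))  # second entry is 0.0
```
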
Thanks!