In the scenario configurations for the multibattle and multigather, the agents a

Agent attack within group, and the agents in the same group share Q networks. (mtmfrl, closed)

borealisai commented on August 11, 2024

Agent attack within group, and the agents in the same group share Q networks.

from mtmfrl.

Comments (2)

Sriram94 commented on August 11, 2024

Our experiments use a set of mixed cooperative-competitive games, in which the agents are expected to learn both cooperation and competition to win. For example, in the battle game the agents must learn to cooperate within their group and compete against the other groups (as described in Sec. 5, Par. 2). An agent is expected to learn cooperative strategies through reward shaping and self-play training, as shown in MAgent. In this way the agent learns that attacking within the group is bad and attacking across groups is good. If the environment were restricted so that attacks within the group are not allowed, the agent might not need to learn this at all. A naive strategy that simply attacks any nearby agent could then also work: if the other agent is an opponent it may die, and if it is not an opponent nothing happens. Such a strategy is not good and should not win the battle. We wanted to prevent such strategies from winning, so attacks within the group do have an effect in the system.
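To illustrate the argument about the naive strategy, here is a minimal sketch and not the MAgent reward configuration. The function name and the reward values (`hit_reward`, `friendly_penalty`) are hypothetical, chosen only to show how an "attack anything nearby" policy is harmless when in-group attacks have no effect and costly when they do.

```python
def attack_everything_return(neighbors, friendly_fire,
                             hit_reward=1.0, friendly_penalty=-1.0):
    """Total reward of a policy that attacks every adjacent agent.

    neighbors: list of "ally" / "enemy" labels (hypothetical encoding).
    friendly_fire: whether attacking an ally has an effect.
    The reward values are illustrative assumptions, not MAgent's.
    """
    total = 0.0
    for kind in neighbors:
        if kind == "enemy":
            total += hit_reward
        elif friendly_fire:
            # In-group attacks have an effect, so the naive policy pays for them.
            total += friendly_penalty
        # With friendly fire disabled, attacking an ally does nothing.
    return total


neighbors = ["enemy", "ally", "ally"]
print(attack_everything_return(neighbors, friendly_fire=False))  # 1.0
print(attack_everything_return(neighbors, friendly_fire=True))   # -1.0
```

Under these assumed values, the indiscriminate policy is only penalized when in-group attacks are allowed to have an effect, which is the case the agents are meant to learn to avoid.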
The algorithm description uses a very general application scenario. In the experiments we describe that our training is very similar to self-play, with each group training a separate network (the agents within a group share this network). So we train four groups across four algorithms, and all agents in a group train that group's own network. As with the first point, we want the agents to learn cooperation through this self-play scheme. This is the same setup as described in Mean Field Reinforcement Learning.
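The sharing scheme described above can be sketched roughly as follows. This is an assumption-laden illustration rather than the repository's code: a tabular Q-learner stands in for each group's Q network, every group uses the same learner class for simplicity, and all names (`GroupQTable`, `learner_for`, the agent and group IDs) are hypothetical.

```python
from collections import defaultdict


class GroupQTable:
    """Tabular stand-in for one group's Q network, shared by all its agents."""

    def __init__(self, n_actions, lr=0.5, gamma=0.9):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.lr = lr
        self.gamma = gamma

    def update(self, obs, action, reward, next_obs):
        # Standard one-step Q-learning update on the shared table.
        target = reward + self.gamma * max(self.q[next_obs])
        self.q[obs][action] += self.lr * (target - self.q[obs][action])


# One learner per group; agents map to their group's learner.
group_names = ["group_0", "group_1", "group_2", "group_3"]
learners = {name: GroupQTable(n_actions=3) for name in group_names}
agent_group = {"a0": "group_0", "a1": "group_0", "b0": "group_1"}


def learner_for(agent_id):
    """Return the learner shared by the agent's group."""
    return learners[agent_group[agent_id]]


# Experience from agent a0 updates the table that a1 also uses,
# while group_1's table is untouched.
learner_for("a0").update(obs="s", action=1, reward=1.0, next_obs="s2")
print(learner_for("a1").q["s"][1])  # 0.5
print(learner_for("b0").q["s"][1])  # 0.0
```

Because agents in a group point at the same learner object, experience gathered by any one of them changes the values every groupmate acts on, while the other groups' learners stay independent.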


IpadLi commented on August 11, 2024

Hi Sriram,
Thanks very much for your explanations. I understand the settings now and would like to close this issue.
