
bsi-pt's Introduction

BSI-PT

This repository presents the BSI-PT (Bayesian Strategy Inference plus Policy Tracking) framework introduced in the paper Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking.

BSI-PT is a Bayesian algorithm that infers an opponent's policy in a multi-agent competitive environment. It combines the advantages of inter-episode strategy inference and intra-episode policy tracking. Experiments have shown that BSI-PT predicts the opponent's policy more accurately than other BPR variants and is better at winning against opponents with a variety of policy selection strategies.
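To make the idea of Bayesian inference over opponent policies concrete, here is a minimal sketch of a single belief update with Bayes' rule. It is a generic illustration, not the BSI-PT algorithm from the paper: the function name `update_belief` and the example numbers are assumptions made for this sketch.

```python
def update_belief(belief, likelihoods):
    """Return the posterior over candidate opponent policies.

    belief[i]      -- prior probability that the opponent uses policy i
    likelihoods[i] -- probability of the latest observation under policy i
    (Both names are hypothetical; they are not taken from the repository.)
    """
    unnormalized = [lik * b for lik, b in zip(likelihoods, belief)]
    total = sum(unnormalized)
    if total == 0:
        # The observation is impossible under every candidate policy,
        # so keep the prior instead of dividing by zero.
        return list(belief)
    return [u / total for u in unnormalized]


# Two candidate opponent policies, initially equally likely.
prior = [0.5, 0.5]
# Assumed likelihoods of the observed opponent behavior under each policy.
likelihoods = [0.8, 0.2]
posterior = update_belief(prior, likelihoods)
print(posterior)
```

Repeating this update after every observation concentrates the belief on the policy that best explains the opponent's behavior.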

Authors

  • Kuei-Tso Lee
  • Yen-Yun Huang
  • Je-Ruei Yang
  • Sheng-Jyh Wang

Citation

Please cite our paper if you find this repository useful.

@article{lee2023opponent,
  title={Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking},
  author={Lee, Kuei-Tso and Huang, Yen-Yun and Yang, Je-Ruei and Wang, Sheng-Jyh},
  journal={IEEE Transactions on Games},
  year={2023},
  publisher={IEEE}
}

Documentation

Check the documentation to see how to run the experiments.

bsi-pt's People

Contributors

  • jerry871002
  • yy87927

bsi-pt's Issues

Check usage of functions and symbols in the paper

Currently, in the latest code from Kueitso, some functions and symbols have unclear purposes. We need to check the paper and refine them (either removing them if they are not used or adding references to the paper in the comments).

Functions

  • infer_tau2
  • run_uniformA

Symbols

If the symbols are not defined in the paper, we can change the name to something more descriptive.

  • ball_control_k1 and ball_control_k2
  • action_control_constant_mu1

Related PR

After this issue is solved, we can keep moving forward on #14.

Apply experiments in the paper

Only three experiments remain in the latest version of the paper. Add those three and remove the rest.

For this task, it is enough that the new experiments work with the baseball environment. Backward compatibility with the existing environments will be a separate task if needed.

  • Apply experiment 1 on the baseball environment
  • #52
  • #53

Add baseball game environment to the project

Currently, there are three environments in this repo. However, the final version of the paper demonstrates BSI and BSI-PT through a new baseball game environment. This task is to include this environment and ensure that we can conduct the experiments in different environments in a consistent manner.

Remove KL divergence and `scipy` dependency

Since KL divergence is no longer used in the paper, remove the related sections in each environment's run.py and plot.py. Once the KL divergence code is gone, also remove the scipy dependency, which is only used to calculate KL divergence.

Make sure the baseball environment works with `run.py`

run.py should work with the following agent-opponent combinations.

  • BSI-PT vs PhiOpponent
  • BSI-PT vs PhiNoiseOpponent
  • OKR vs PhiOpponent
  • OKR vs PhiNoiseOpponent
  • Deep BPR+ vs PhiOpponent
  • Deep BPR+ vs PhiNoiseOpponent
  • BPR+ vs PhiOpponent
  • BPR+ vs PhiNoiseOpponent
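The eight combinations above are the product of four agents and two opponents, so a check can enumerate them instead of listing each one by hand. The sketch below only generates the labels. How each label maps to run.py arguments is not stated here, so that mapping is deliberately left out, and the names `AGENTS`, `OPPONENTS`, and `agent_opponent_pairs` are hypothetical.

```python
from itertools import product

# Labels copied from the list above; they are display names, not CLI arguments.
AGENTS = ["BSI-PT", "OKR", "Deep BPR+", "BPR+"]
OPPONENTS = ["PhiOpponent", "PhiNoiseOpponent"]


def agent_opponent_pairs():
    """Return every (agent, opponent) pair that should be exercised."""
    return list(product(AGENTS, OPPONENTS))


for agent, opponent in agent_opponent_pairs():
    print(f"{agent} vs {opponent}")
```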

`performance_model` isn't checked before using

To pass the mypy type check, I initialized performance_model as None (see the examples below).

https://github.com/jerry871002/bayesian-strategy-inference/blob/e71229871278286cf72590ea3bff5508886092fa/src/grid_world/agent.py#L29

https://github.com/jerry871002/bayesian-strategy-inference/blob/e71229871278286cf72590ea3bff5508886092fa/src/navigation_game/agent.py#L27

https://github.com/jerry871002/bayesian-strategy-inference/blob/e71229871278286cf72590ea3bff5508886092fa/src/soccer_game/agent.py#L29

However, nothing checks that performance_model has actually been set (i.e., that it holds an array-like value rather than None) before it is used. This needs to be fixed.
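One possible fix is to route every access through a small guard that raises a clear error when the model is still None. This is a sketch under assumptions: the `Agent` class here is a stand-in, and `set_performance_model`, `_require_performance_model`, and `expected_utility` are hypothetical names, not the repository's actual methods.

```python
from typing import Optional, Sequence

# Hypothetical alias: a table of utilities indexed by [policy][opponent].
PerformanceModel = Sequence[Sequence[float]]


class Agent:
    def __init__(self) -> None:
        # Initialized to None so the attribute exists and type checks pass.
        self.performance_model: Optional[PerformanceModel] = None

    def set_performance_model(self, model: PerformanceModel) -> None:
        self.performance_model = model

    def _require_performance_model(self) -> PerformanceModel:
        """Return the model, or fail loudly if it was never set."""
        if self.performance_model is None:
            raise RuntimeError(
                "performance_model is not set; assign it before use"
            )
        return self.performance_model

    def expected_utility(self, policy: int, opponent: int) -> float:
        # The guard narrows Optional[...] to the concrete type, so the
        # indexing below is safe for both the type checker and at runtime.
        model = self._require_performance_model()
        return model[policy][opponent]
```

Because the guard returns the non-optional type, callers do not need extra assertions to satisfy the type checker.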

Baseball game test is failing

https://github.com/jerry871002/bayesian-strategy-inference/actions/runs/5841852816/job/15842391174

+ python run.py baseball bpr-okr -n 5 --new-phi-opponent -q 3
----- (bpr-okr agent, New Phi opponent) q = 3 -----
Traceback (most recent call last):
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/run.py", line 195, in <module>
    run(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/run.py", line 49, in run
    run_bpr_okr(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/baseball_game/run.py", line 355, in run_bpr_okr
    policy_preds.append(step_5_policy_preds[i])
IndexError: list index out of range

https://github.com/jerry871002/bayesian-strategy-inference/actions/runs/5857902074/job/15880785779?pr=56

+ python run.py baseball bsi-pt -n 5 -e 2

----- (bsi-pt agent) Test random switch opponent -----
Traceback (most recent call last):
  File "run.py", line 195, in <module>
    run(args)
  File "run.py", line 53, in run
    run_bsi_pt(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/baseball_game/run.py", line 648, in run_bsi_pt
    policy_preds.append(step_2_policy_preds[i])
IndexError: list index out of range

Rename the repository to bsi-pt when everything is finished

We use this abbreviation everywhere (e.g., in the titles of the README and the documentation site), and the full name bayesian-strategy-inference-policy-tracking is too long. Renaming the repository to bsi-pt keeps things consistent.

Apply latest code from Kueitso

Kueitso sent us the latest code he used to produce the results in the paper, and we should update the code here to match the newest version.

This issue requires careful inspection to make sure we keep everything working.

Tasks

  • #1
  • #17
  • Integrate the baseball environment with run_exp_and_plot.py
  • Use scripts/run_exps_and_plot.sh to run the baseball experiments

Use `epsilon` instead of `p_pattern`

Currently, we are using p_pattern to control the randomness of the new-phi-noise opponent.

p_pattern is the complement of the parameter $\epsilon$ (epsilon) described in the paper, i.e. p_pattern = 1 - epsilon. Consider removing p_pattern and using only epsilon to control the randomness to be consistent with the paper.
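The relationship between the two parameters can be sketched as follows. This assumes that the noise means "pick a uniformly random action with probability epsilon", which may differ from the exact definition in the paper. The function names and the action labels are hypothetical.

```python
import random


def epsilon_from_p_pattern(p_pattern):
    """Convert the legacy parameter: p_pattern = 1 - epsilon."""
    return 1.0 - p_pattern


def choose_action(pattern_action, all_actions, epsilon, rng):
    """Follow the pattern, except with probability epsilon act randomly.

    Assumption: the random action is drawn uniformly from all_actions.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    if rng.random() < epsilon:
        return rng.choice(all_actions)
    return pattern_action


rng = random.Random(0)  # seeded only to make the example repeatable
actions = ["a", "b", "c"]  # placeholder action labels
epsilon = epsilon_from_p_pattern(0.9)  # roughly 0.1, up to float rounding
print(choose_action("a", actions, epsilon, rng))
```

With epsilon = 0 the opponent always follows its pattern, and with epsilon = 1 every action is random, which matches the reading of epsilon as the amount of noise.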

Containerize the project

Containerizing this project allows the user to run the experiments with the container runtime as the only dependency.

To achieve this, we need to complete the following tasks:

  • #8
  • Create an image from a Dockerfile. The image should be based on a Python runtime image and install all the required dependencies.
  • Run a container from the newly created image and make sure the experiment results are correct.

Make sure the baseball environment works with `run_experiment.py`

run_experiment.py should work with the following agent-opponent combinations.

  • BSI-PT vs PhiOpponent
  • BSI-PT vs PhiNoiseOpponent
  • OKR vs PhiOpponent
  • OKR vs PhiNoiseOpponent
  • Deep BPR+ vs PhiOpponent
  • Deep BPR+ vs PhiNoiseOpponent
  • BPR+ vs PhiOpponent
  • BPR+ vs PhiNoiseOpponent
