jerry871002 / bsi-pt

BSI-PT algorithm in the paper "Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking"

Home Page: https://jerry871002.github.io/bsi-pt/

Shell 2.60% Python 97.35% Dockerfile 0.05%

bpr multi-agent-reinforcement-learning opponent-modeling bayesian-inference

bsi-pt's Introduction

BSI-PT

This repository presents the BSI-PT (Bayesian Strategy Inference plus Policy Tracking) framework introduced in the paper Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking.

BSI-PT is a Bayesian algorithm that infers an opponent's policy in a multi-agent competitive environment. It combines the advantages of inter-episode strategy inference and intra-episode policy tracking. Experiments have shown that BSI-PT is more accurate than other BPR variants at predicting the opponent's policy and at winning against opponents with a variety of policy selection strategies.

Authors

Kuei-Tso Lee
Yen-Yun Huang
Je-Ruei Yang
Sheng-Jyh Wang

Citation

Please cite our paper if you find this repository useful.

@article{lee2023opponent,
  title={Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking},
  author={Lee, Kuei-Tso and Huang, Yen-Yun and Yang, Je-Ruei and Wang, Sheng-Jyh},
  journal={IEEE Transactions on Games},
  year={2023},
  publisher={IEEE}
}

Documentation

Check the documentation to see how to run the experiments.

bsi-pt's Issues

Include paper information in the README

Add the link to the IEEE website and authors' information.

Check usage of functions and symbols in the paper

Currently, in the latest code from Kueitso, some functions and symbols have unclear purposes. We need to check the paper and refine them (either removing them if they are not used or adding references to the paper in the comments).

Functions

infer_tau2
run_uniformA

Symbols

If the symbols are not defined in the paper, we can change the name to something more descriptive.

ball_control_k1 and ball_control_k2
action_control_constant_mu1

Related PR

After this issue is solved, we can keep moving forward on #14.

`State` in `BaseballGame` should be a tuple instead of a list

In BaseballGame, the state consists of only two integers: the strike count and the ball count. It's better to represent them with a tuple (or, even better, a namedtuple) instead of a mutable list, which doesn't show exactly what a state contains.
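
As a rough illustration of the suggestion above, here is a minimal sketch of an immutable state type. The class name `State` and the field names `strike` and `ball` are assumptions made for this example and are not taken from the repository.

```python
from typing import NamedTuple


class State(NamedTuple):
    """Hypothetical immutable count state; field names are assumptions."""

    strike: int
    ball: int


# Fields are accessed by name, so it is clear what a state contains.
state = State(strike=0, ball=0)

# A NamedTuple cannot be modified in place; _replace returns a new state.
next_state = state._replace(ball=state.ball + 1)
```

Because a `NamedTuple` is still a tuple, existing code that unpacks or indexes the state (for example `strike, ball = state`) would keep working, and the state becomes hashable, which helps if it is used as a dictionary key.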

Apply experiments in the paper

Only three experiments remain in the latest version of the paper. Add those three and remove the rest.

In this task, only make sure the new experiments work with the baseball environment. Backward compatibility with the existing environments will be a separate task if needed.

Apply experiment 1 on the baseball environment
#52
#53

Add description of the previous environments

The final version of the paper only contains the baseball environment, so we need to include a description of how the previous three environments work.

Apply experiment 3 on the baseball environment

Add automatic tests for agents and opponents in the baseball environment

Add git hook to do style check

Check the pre-commit project.

Add baseball game environment to the project

Currently, there are three environments in this repo. However, the final version of the paper demonstrates BSI and BSI-PT through a new baseball game environment. This task is to include this environment and ensure that we can conduct the experiments in different environments in a consistent manner.

Images in the environment pages aren't showing correctly

Remove KL divergence and `scipy` dependency

Since KL divergence is no longer used in the paper, remove the related sections (in each environment's run.py and plot.py). Once the KL divergence code is removed, also remove the scipy dependency, which is only used to calculate KL divergence.

Move test scripts to separate folder

Currently, the test scripts that run in the CI are under scripts/. They should be moved to a new folder (probably called tests/).

Website for solving some bad Git situations

Not really an issue, but consider adding https://dangitgit.com/en to our documentation. It has quite a few useful solutions to Git problems.

Change docs deployment workflow name

Currently it's named ci, which is not very descriptive.

Fix previous works' link in README

Replace the current link with a link to the publisher's website.

Create CI pipeline to auto generate documentations

Material for MkDocs seems to be a nice option, check this link: https://squidfunk.github.io/mkdocs-material/publishing-your-site/

Once this is set up, the wiki page can be removed.

Make sure the baseball environment works with `run.py`

run.py should work with the following agent-opponent combinations.

Run test only when `src/` or `tests/` is modified

Currently, even if only the documentation is being modified, the tests will still be executed.

Add mkdocs documentation

Add mkdocs documentation in the developer section.

how to run mkdocs locally

`performance_model` isn't checked before using

To pass the mypy type check, I initialized performance_model as None (see the examples below).

https://github.com/jerry871002/bayesian-strategy-inference/blob/e71229871278286cf72590ea3bff5508886092fa/src/grid_world/agent.py#L29

https://github.com/jerry871002/bayesian-strategy-inference/blob/e71229871278286cf72590ea3bff5508886092fa/src/navigation_game/agent.py#L27

https://github.com/jerry871002/bayesian-strategy-inference/blob/e71229871278286cf72590ea3bff5508886092fa/src/soccer_game/agent.py#L29

But I didn't add any check before using it to ensure the value is properly set, i.e. that it is an array-like value and not None. This needs to be fixed in the future.
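
One possible way to add such a check is a property that refuses to return an unset value. This is only a sketch: the class name `Agent`, the private attribute `_performance_model`, and the nested-list type are assumptions made for illustration, and the real agents may store the model differently (for example as an array).

```python
from typing import List, Optional

# Hypothetical type for illustration: rows of utilities as nested lists.
PerformanceModel = List[List[float]]


class Agent:
    """Hypothetical agent; only the None check is the point of this sketch."""

    def __init__(self) -> None:
        # Initialized as None to satisfy the type checker, as in the issue.
        self._performance_model: Optional[PerformanceModel] = None

    @property
    def performance_model(self) -> PerformanceModel:
        # Fail early with a clear message instead of a confusing error later.
        if self._performance_model is None:
            raise RuntimeError("performance_model is used before being set")
        return self._performance_model

    @performance_model.setter
    def performance_model(self, value: PerformanceModel) -> None:
        if not value:
            raise ValueError("performance_model must be a non-empty table")
        self._performance_model = value
```

With this pattern the getter's return type is no longer `Optional`, so callers can use the value without extra `None` handling and still pass the type check.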

`mkdocs.yml` examples to improve our documentation site

https://github.com/Saransh-cpp/OCRed/blob/main/mkdocs.yml
https://github.com/LuxDL/Lux.jl/blob/main/docs/mkdocs.yml

Baseball game test is failing

https://github.com/jerry871002/bayesian-strategy-inference/actions/runs/5841852816/job/15842391174

+ python run.py baseball bpr-okr -n 5 --new-phi-opponent -q 3
----- (bpr-okr agent, New Phi opponent) q = 3 -----
Traceback (most recent call last):
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/run.py", line 195, in <module>
    run(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/run.py", line 49, in run
    run_bpr_okr(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/baseball_game/run.py", line 355, in run_bpr_okr
    policy_preds.append(step_5_policy_preds[i])
IndexError: list index out of range

https://github.com/jerry871002/bayesian-strategy-inference/actions/runs/5857902074/job/15880785779?pr=56

+ python run.py baseball bsi-pt -n 5 -e 2

----- (bsi-pt agent) Test random switch opponent -----
Traceback (most recent call last):
  File "run.py", line 195, in <module>
    run(args)
  File "run.py", line 53, in run
    run_bsi_pt(args)
  File "/home/runner/work/bayesian-strategy-inference/bayesian-strategy-inference/src/baseball_game/run.py", line 648, in run_bsi_pt
    policy_preds.append(step_2_policy_preds[i])
IndexError: list index out of range

Use absolute path for top directory in scripts

Instead of relying on where the user runs the script, use git rev-parse --show-toplevel to get the full path of the top-level directory.

Add type hint check

Add mypy in pre-commit.

Run pre-commit hooks on existing code

Existing code hasn't gone through a code quality check yet. We need to run the pre-commit hooks (including the formatter and linter) once on all files.

Typo in soccer game documentation page

https://jerry871002.github.io/bayesian-strategy-inference/environments/soccer_game/

In the last paragraph, the folder name is incorrect and should be changed from navigation_game to soccer_game.

Add documentation for the baseball environment

https://jerry871002.github.io/bayesian-strategy-inference/environments/baseball_game/

`env.show()` method isn't consistent with other environment

In BaseballGame, the show() method requires a parameter (actions). This is odd, because show() should print the current state of the game rather than serve as a way to test an agent-opponent action combination.

Replace `print` with `logging`

Replace print with logging for better flexibility when managing logs.
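
A minimal sketch of what the replacement could look like, using only the standard `logging` module. The logger name `baseball_game` and the helper `log_step` are hypothetical and chosen for this example only.

```python
import logging

# Hypothetical logger name; a module would typically use __name__ here.
logger = logging.getLogger("baseball_game")


def log_step(episode: int, step: int) -> None:
    """Emit a debug message where a print() call might have been."""
    # Lazy %-style arguments are only formatted if the message is emitted.
    logger.debug("episode=%d step=%d", episode, step)


if __name__ == "__main__":
    # The verbosity is configured once at the entry point, not per call.
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(levelname)s %(name)s: %(message)s",
    )
    log_step(1, 3)
```

Switching the level passed to `basicConfig` from `logging.DEBUG` to `logging.INFO` would silence these messages without touching the call sites, which is the flexibility the issue refers to.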

Rename repository name to bsi-pt when everything is finished

We use this abbreviation everywhere (e.g., in the titles of the README and the documentation site), and the full name bayesian-strategy-inference-policy-tracking is too long, so it is better to rename the repository to bsi-pt to keep things consistent.

Apply latest code from Kueitso

Kueitso sent us the latest code he used to produce the results in the paper, and we should update the code here to match the newest version.

This issue requires careful inspection to make sure we keep everything working.

Tasks

#1
#17
Integrate the baseball environment with run_exp_and_plot.py
Use scripts/run_exps_and_plot.sh to run the baseball experiments

Remove unused scripts

Some scripts in the scripts/ folder are no longer used. Remove them.

Add debug statements in baseball environment

Add debug statements, such as the episode and step number, to the baseball environment, as in the existing environments.

Use `epsilon` instead of `p_pattern`

Currently, we are using p_pattern to control the randomness of the new-phi-noise opponent.

p_pattern is the complement of the parameter $\epsilon$ (epsilon) described in the paper, i.e. p_pattern = 1 - epsilon. Consider removing p_pattern and using only epsilon to control the randomness to be consistent with the paper.
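
The relation p_pattern = 1 - epsilon can be illustrated with a small sketch. Only that relation comes from the issue. The function names, and the assumption that the opponent plays a uniformly random action with probability epsilon and its pattern action otherwise, are hypothetical; the actual new-phi-noise opponent may use epsilon differently.

```python
import random
from typing import Optional, Sequence


def epsilon_from_p_pattern(p_pattern: float) -> float:
    """Convert the legacy parameter using p_pattern = 1 - epsilon."""
    return 1.0 - p_pattern


def choose_action(
    pattern_action: int,
    actions: Sequence[int],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Assumed behavior: random action with probability epsilon."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    rng = rng if rng is not None else random.Random()
    if rng.random() < epsilon:
        # Noise branch: ignore the pattern and pick uniformly at random.
        return rng.choice(list(actions))
    # Pattern branch: taken with probability 1 - epsilon (the old p_pattern).
    return pattern_action
```

Keeping a single parameter named after the symbol in the paper avoids having two values that must always sum to one.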

Apply experiment 2 on the baseball environment

Update the usage of `scripts/run_exps_and_plot.sh`

After #13 is merged, update the usage of scripts/run_exps_and_plot.sh according to the latest version.

Containerize the project

Containerizing this project allows the user to run the experiments with the container runtime as the only dependency.

To achieve this, we have several tasks to do:

#8
Create an image using Dockerfile. The image should be based on a Python runtime image and install all the required dependencies
Run a container with the newly created image and make sure the experiment results are correct

Explain `NUM_RUNS` and `NUM_EPISODES` in documentation

In index.md (the corresponding webpage is https://jerry871002.github.io/bayesian-strategy-inference/), we suggest that the user run scripts/run_exps_and_plot.sh [NUM_RUNS] [NUM_EPISODES], but we don't explain what the parameters mean.

Make sure the baseball environment works with `run_experiment.py`

run_experiment.py should work with the following agent-opponent combinations.

Fix `TODO` and `FIXME` in the code

Declare the dependencies in `requirements.txt`
