Giter Club home page Giter Club logo

mmap-birl's Introduction

MMAP-BIRL

This project builds on the foundation established by J. Choi and K. Kim in MAP inference for Bayesian inverse reinforcement learning, NIPS 2010.

The algorithm developed in this repo considers occluded and noisy expert demonstrations and tries to learn the best reward function using Marginalized Maximum a Posteriori method with Bayesian IRL inference technique.

The codebase has been built and developed in python based on an existing repo by Jaedeug Choi ([email protected]) in Matlab.

This paper has been published in UAI 2022 conference. The PDF is currently available at this link.

Cite this work as:

@inproceedings{suresh2022marginal, title={Marginal MAP estimation for inverse RL under occlusion with observer noise}, author={Suresh, Prasanth Sengadu and Doshi, Prashant}, booktitle={Uncertainty in Artificial Intelligence}, pages={1907--1916}, year={2022}, organization={PMLR} }

Requirements:

I'd recommend creating a conda env and installing the following packages within:

1. numpy    
2. scipy    
3. logging    
4. pymdptoolbox    
5. tqdm    
6. pyyaml    
7. multiprocessing

I will soon push a requirements.txt file that you can use to create the environment easily.

Usage:

Assuming you've read the paper and have a good understanding of how the algorithm works, first open ./yaml_files/init_params.yaml and make sure the name of the problem you want to test and all other parameters are to your requirements. If you're unsure about the hyperparameters or want to just test an existing domain, you could just run the code as it is and it would work fine.

Activate your conda env and cd into mmap-birl/code folder and run the following command:

python3 runner.py

You should see the results printed on the terminal that tells you about the initial weights sampled, learned reward weights and the metrics - Inverse Learning Error(ILE)/Value Difference, Learned Behavior Accuracy(LBA)/Policy Difference and Reward Difference.

Pending Updates:

Ideally, I plan to make this such that you can pass a yaml file for the MDP and a yaml file with trajectories and execute MMAP-BIRL. That functionality is under works.

The sparse MDP implementation mostly works, but I haven't done extensive testing with it yet.

The scipy minimize based optimization is deprecated for now, although I plan to bring it back soon.

Parallel processing is available for the marginalization part, although for small mdps and/or almost deterministic transitions, it won't matter much.

The observation function and s-a trajectories sampled from it needs to be more robust, but for the current domains, it works fine.

The domains could be made much more stochastic in terms of transition probabilities. That would be an added challenge for MMAP-BIRL to converge.

Feel free to raise issues, I'll address them as soon as I can.

mmap-birl's People

Contributors

prasuchit avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

mmap-birl's Issues

IndexError in generator.py

Please consider that your code has the index error message while running your code as it is with no changes:

Traceback (most recent call last):
File "/mmap-testing/mmap-birl-master/code/runner.py", line 126, in
main()
File "/mmap-testing/mmap-birl-master/code/runner.py", line 51, in main
expertData = generator.generateDemonstration(mdp, problem)
File "/mmap-testing/mmap-birl-master/code/generator.py", line 56, in generateDemonstration
trajs, policy = generateTrajectory(mdp, problem)
File "/mmap-testing/mmap-birl-master/code/generator.py", line 124, in generateTrajectory
trajs, trajVmean, trajVvar = sampleTrajectories(problem.nTrajs, problem.nSteps, policy, mdp, problem.seed)
File "/mmap-testing/mmap-birl-master/code/generator.py", line 232, in sampleTrajectories
a = np.squeeze(piL[s])
IndexError: index 140 is out of bounds for axis 0 with size 48

Solving it requires uncommenting line 227 and commenting line 229 in your generator.py code.

Please consider if that is correct or please push a latest working code of mmap-birl on your github repo.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.