This repository aims to provide a unified library for automatic reward shaping and reimplements the methods described in the table below. However, it is not meant to validate the results of the original papers.
If you are a reproducibility reviewer for HPRS, please refer to the original codebase.
Method | Signal Soundness | Dense Signal | Multi-Objective | Objective Prioritization | Status |
---|---|---|---|---|---|
TLTL[1] | ✔️ | ❌ | ❌ | ❌ | ✔️ |
BHNR[2] | ❌ | ✔️ | ❌ | ❌ | ✔️ |
HPRS[4] | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
PAM[3] | ✔️ | ❌ | ✔️ | ✔️ | ✔️ |
Rank-Preserving Reward[5] | ✔️ | ❌ | ✔️ | ✔️ | ✔️ |
✔️ Supported
❌ Not supported
👷 Work in progress
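Several of the methods above (HPRS in particular) build on potential-based reward shaping. As a minimal sketch of that underlying idea — with a toy potential function chosen for illustration, not the library's actual implementation:

```python
# Sketch of potential-based reward shaping: the shaping term
# F(s, s') = gamma * Phi(s') - Phi(s) is added to the environment
# reward; this preserves the optimal policy (Ng et al., 1999).
def shaped_reward(reward, potential, s, s_next, gamma=0.99):
    return reward + gamma * potential(s_next) - potential(s)

# Toy potential (an assumption for this example): negative distance
# to a goal located at x = 10.
potential = lambda x: -abs(10 - x)

# Moving toward the goal yields a positive shaping bonus.
bonus = shaped_reward(0.0, potential, s=0.0, s_next=1.0)
assert bonus > 0
```

The invariance result is the key property: because the shaping term telescopes along trajectories, shaped returns differ from unshaped ones only by a constant depending on the initial state.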
The task specification consists of a set of requirements, as in [4]. The requirement syntax is as follows:
```
formula ::= f(state) ~ 0
requirement ::= ensure <formula> | achieve <formula> | conquer <formula> | encourage <formula>
```
where `f` is a function of the state dictionary `state`, and `~` is a comparison operator in `<`, `<=`, `>`, `>=`.
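For instance, the strings below are hypothetical requirements written in this grammar (the function names are illustrative, not part of the library); a small regular expression is enough to check that they conform:

```python
import re

# Illustrative checker for the requirement grammar above; the
# requirement strings are made-up examples, not library API.
REQUIREMENT_RE = re.compile(
    r"^(ensure|achieve|conquer|encourage)\s+"  # requirement type
    r"[A-Za-z_]\w*\(state\)\s*"                # f(state)
    r"(<=|>=|<|>)\s*0$"                        # comparison with 0
)

examples = [
    "ensure collision(state) < 0",         # hypothetical safety requirement
    "achieve dist_to_goal(state) <= 0",    # hypothetical target requirement
    "encourage speed_margin(state) >= 0",  # hypothetical comfort requirement
]

for req in examples:
    assert REQUIREMENT_RE.match(req), req
```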
To run the examples, first install the extra requirements:
```
pip install -r examples/requirements.txt
```
Then, you can train an agent with stable-baselines3 and auto-shaping:
- using default specifications from the configuration file in `configs/`
- using a custom specification by passing it as an argument
- benchmarking the agent with multiple reward-shaping methods
If you use this code in your research, please cite the following paper:
```bibtex
@misc{berducci2022hierarchical,
    title={Hierarchical Potential-based Reward Shaping from Task Specifications},
    author={Luigi Berducci and Edgar A. Aguilar and Dejan Ničković and Radu Grosu},
    year={2022},
    eprint={2110.02792},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```
[1] "Reinforcement learning with temporal logic rewards." Li, et al. IROS 2017.
[2] "Structured reward shaping using signal temporal logic specifications." Balakrishnan, et al. IROS 2019.
[3] "Multi-objectivization of reinforcement learning problems by reward shaping." Brys, et al. IJCNN 2014.
[4] "Hierarchical Potential-based Reward Shaping." Berducci, et al. Under Review.
[5] "Receding Horizon Planning with Rule Hierarchies for Autonomous Vehicles." Veer, et al. ICRA 2023.