This project was developed during the 2019 Reinforcement Learning course held by Prof. Capobianco at Sapienza University of Rome.
The algorithm used in this project is Soft Actor-Critic (SAC). More details on the implementation are given in the next sections.
The project contains only a Jupyter Notebook file. Install the prerequisites listed below, then open the notebook and run it.
- Python 3.5+
- Jupyter
pip install jupyterlab
- MuJoCo
- I suggest following this article to install it. The procedure worked with Ubuntu 18.04, Python 3.7.5, and mujoco200.
- You will need a MuJoCo license.
- Gym
pip install gym
- Stable Baselines (see its installation instructions)
- Numpy
pip install numpy
- Scipy
pip install scipy
- TQDM
pip install tqdm
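Before opening the notebook, it can help to confirm that the prerequisites above are importable. The snippet below is a minimal sketch, not part of the project: the helper `missing_prerequisites` is hypothetical, and the import names are assumptions (in particular `mujoco_py`, the binding that Gym's `-v2` MuJoCo environments rely on, and `stable_baselines` for Stable Baselines).

```python
import importlib.util

# Display name -> top-level module name. These module names are assumptions;
# adjust them if your installation differs.
PREREQUISITES = {
    "Jupyter": "jupyterlab",
    "MuJoCo bindings": "mujoco_py",
    "Gym": "gym",
    "Stable Baselines": "stable_baselines",
    "Numpy": "numpy",
    "Scipy": "scipy",
    "TQDM": "tqdm",
}


def missing_prerequisites(modules=PREREQUISITES):
    """Return the display names of packages whose module cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_prerequisites()
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be located; it does not verify versions or the MuJoCo license.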
The tests are run in the MuJoCo environment Ant-v2. The goal in this environment is to make the Ant walk as fast as possible, for as long as possible. The ant is a hierarchical structure with the "torso" as the root body and the 4 legs as its children.
The observation space is 111-dimensional:
Component | Dimensions |
---|---|
Torso height | 1 |
Torso orientation | 4 |
Joint angles | 8 |
Velocities (angular + linear) | 6 |
Joint velocities | 8 |
External forces | 84 |
Total | 111 |
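To make the layout concrete, the sketch below splits a 111-dimensional observation vector into the components listed in the table. It assumes the components appear in the vector in the same order as in the table; the names `OBS_LAYOUT` and `split_observation` are hypothetical and not part of the notebook.

```python
import numpy as np

# (name, size) pairs copied from the table above.
# Assumption: the observation vector is laid out in this order.
OBS_LAYOUT = [
    ("torso_height", 1),
    ("torso_orientation", 4),
    ("joint_angles", 8),
    ("velocities", 6),
    ("joint_velocities", 8),
    ("external_forces", 84),
]


def split_observation(obs):
    """Slice a flat observation into named components following OBS_LAYOUT."""
    obs = np.asarray(obs)
    total = sum(size for _, size in OBS_LAYOUT)
    if obs.shape != (total,):
        raise ValueError(f"expected shape ({total},), got {obs.shape}")
    parts, start = {}, 0
    for name, size in OBS_LAYOUT:
        parts[name] = obs[start:start + size]
        start += size
    return parts


# Usage with a dummy observation (a real one would come from env.reset()).
parts = split_observation(np.arange(111))
print({name: part.shape for name, part in parts.items()})
```

Named slices like these are convenient when inspecting what the agent observes, for example to check how much of the input is taken up by the external forces.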
The reward function is defined here.
You can find a video of the final execution here.
- Giovanbattista Abbate - giabb
This project is licensed under the MIT License; see the LICENSE.md file for details.
- Billie Thompson - Provided README Template - PurpleBooth