bikey's Introduction

Python package for simulation-based reinforcement learning environments.

This package is unfinished: it contains bugs and its functionality is only partially implemented. It is no longer being worked on.

If you are using or improving this package I would love to hear about it.

This package contains custom OpenAI Gym environments that can interface with a Spacar simulation. Spacar is a software package for "the dynamic modelling and control of flexible multibody systems". The software is currently being developed by the Faculty of Engineering Technology at the University of Twente. With these environments, controllers can be created using reinforcement learning (RL).

Installation instructions

To install this Python package, use Python's pip tool:

# First make sure the git repository (i.e. the directory containing setup.py) is
# set as the working directory, then run this code
pip install .

The standard pip options are available, e.g. the -e option allows you to edit the package after it is installed (both templates and code), without having to install it again:

pip install -e .
# You can now change bikey's code, update files in the template directory, pull
# updates from GitHub, etc. without having to reinstall bikey

Usage

The two main components of this package are the SpacarEnv and BicycleEnv classes. SpacarEnv is not a full gym environment, so only BicycleEnv is registered as an environment.

To create an environment instance, import the module that defines the bicycle environment. Importing it automatically registers the environment, which allows gym.make to create an instance for you.

import gym
import bikey.bicycle

env = gym.make("BicycleEnv-v0")
# this creates an instance of the BicycleEnv class

# or

env_with_options = gym.make(
    "BicycleEnv-v0",
    simulink_file="model.slx",
    working_dir="path/to/directory",
    simulink_config={
        ...
    },
    # ... consult the documentation of BicycleEnv for more options
)
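Once an environment exists, it is typically driven by a reset/step loop. The following is a minimal sketch, assuming the classic gym API in which reset() returns an observation and step(action) returns (observation, reward, done, info). The helper run_episode and the stand-in CountdownEnv are hypothetical names that are not part of bikey. The stand-in only exists so that the sketch runs without Spacar or MATLAB.

```python
def run_episode(env, policy, max_steps=1000):
    """Run one episode and return the total reward.

    Assumes the classic gym API: reset() returns an observation and
    step(action) returns (observation, reward, done, info).
    """
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


class CountdownEnv:
    """Hypothetical stand-in environment (not part of bikey)."""

    def __init__(self, length=5):
        self.length = length
        self.remaining = length

    def reset(self):
        self.remaining = self.length
        return self.remaining

    def step(self, action):
        # Every step yields a reward of 1.0; the episode ends at zero.
        self.remaining -= 1
        done = self.remaining == 0
        return self.remaining, 1.0, done, {}


total = run_episode(CountdownEnv(length=5), policy=lambda obs: 0)
print(total)
```

With a real environment you would pass the env returned by gym.make instead of the stand-in, for example with a random policy such as lambda obs: env.action_space.sample().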

Networked environments

The project for which this package was designed requires remote execution of environments: the environment is controlled from a different computer than the one that runs it. This adds some latency and may make your training sessions less efficient.

Warning: currently the code for the NetworkEnv class lacks basic security features. I hope to implement some of those in the near future. Pull requests are also welcome.

The NetworkEnv class is designed to be reasonably general, but at the moment it assumes that the actions and observations of any underlying environment are numpy arrays. It also supports only observation and action spaces of type gym.spaces.Discrete or gym.spaces.Box. If you need other spaces, you can add support yourself by extending gym_space_to_dict in bikey.network.env_process and dict_to_gym_space in bikey.network.network_env: put the details needed to describe or reconstruct a space in a dictionary, and make sure this dictionary can be converted to JSON.
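The dictionary round trip described above could look roughly like this for a MultiBinary-style space. This is only a sketch under assumptions: the helper names multi_binary_to_dict and dict_to_multi_binary and the "type"/"n" keys are hypothetical, and the actual signatures and dictionary layout used by gym_space_to_dict and dict_to_gym_space may differ. A SimpleNamespace stands in for the space so the example runs without gym installed.

```python
import json
from types import SimpleNamespace


def multi_binary_to_dict(space):
    """Describe a MultiBinary-like space (any object with an integer `n`)."""
    # Hypothetical layout: the keys "type" and "n" are assumptions.
    return {"type": "MultiBinary", "n": int(space.n)}


def dict_to_multi_binary(description, constructor):
    """Rebuild the space from its description using `constructor(n)`."""
    if description.get("type") != "MultiBinary":
        raise ValueError("not a MultiBinary description")
    return constructor(description["n"])


# Stand-in for a space object; with gym this would be a real space.
space = SimpleNamespace(n=4)

# The description must survive a JSON round trip to cross the network.
payload = json.dumps(multi_binary_to_dict(space))
restored = dict_to_multi_binary(
    json.loads(payload),
    constructor=lambda n: SimpleNamespace(n=n),
)
print(payload, restored.n)
```

With gym available, gym.spaces.MultiBinary could be passed as the constructor. Values such as numpy arrays (for example the bounds of a Box) need to be converted to plain lists before json.dumps will accept them.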

Run the bikey.network.server module to start an environment server:

python -m bikey.network.server

# or use the -h flag to display the command's options:
python -m bikey.network.server -h

To shut down the server, run the following command on the machine that started it. If the command is run from any other machine, it is ignored:

python -m bikey.network.server_shutdown

This command does not automatically detect the address and port of a running server: you must provide them to the script. For an overview of the script's options, run it with the -h flag.

More custom Spacar environments

This package makes creating your own Spacar environments as easy as possible. All you need to do is subclass bikey.spacar.SpacarEnv, override the process_step() function with your own logic, and set up the correct observation and action spaces for the environment. The process_step() function defines the rules of the environment: how to determine rewards, when to end an episode, and additionally some general info that can be useful when debugging your code.
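The kind of logic that belongs in process_step() can be sketched as a standalone function. Everything here is an assumption made for illustration: the name example_process_step, the observation layout (first element is a lean angle in radians), the max_lean_angle threshold, and the (reward, done, info) return value. The real signature and return value of SpacarEnv.process_step() are not described in this README, so check the class itself before adapting this.

```python
def example_process_step(observations, max_lean_angle=0.5):
    """Turn raw observations into (reward, done, info).

    Hypothetical layout: observations[0] is a lean angle in radians.
    """
    lean_angle = float(observations[0])

    # End the episode once the lean angle exceeds the threshold.
    fallen = abs(lean_angle) > max_lean_angle

    # Reward staying upright: 1.0 when vertical, falling to 0.0 at the limit.
    reward = 0.0 if fallen else 1.0 - abs(lean_angle) / max_lean_angle

    # Extra details that help when debugging a training run.
    info = {"lean_angle": lean_angle, "fallen": fallen}
    return reward, fallen, info


upright = example_process_step([0.1])
tipped = example_process_step([0.8])
print(upright)
print(tipped)
```

In a subclass of bikey.spacar.SpacarEnv, logic like this would go into the overridden process_step() method, alongside the definitions of the observation and action spaces.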

The basic template has a very simple Simulink model: actions are provided to Spacar, and outputs are read out to the Python environment. If you need to customize this, feel free to do so, but make sure there is a constant block named 'actions', and a block that saves the last, and only the last, Spacar observation to 'out.observations' in the MATLAB workspace.

Along with all of the settings available when instantiating an environment, this should give you plenty of room to create any setup you want.

bikey's People

Contributors

rickdw
