
BOPTEST-Gym

BOPTEST-Gym is the OpenAI-Gym environment for the BOPTEST framework. This repository adapts the BOPTEST API to the OpenAI-Gym convention in order to facilitate the implementation, assessment, and benchmarking of reinforcement learning (RL) algorithms for building energy management. RL algorithms from the Stable-Baselines3 repository are used to exemplify and test this framework.

The environment is described in this paper.

Structure

  • boptestGymEnv.py contains the core functionality of this Gym environment.
  • environment.yml contains the dependencies required to run this software.
  • /examples contains prototype code for the interaction of RL algorithms with an emulator building model from BOPTEST.
  • /testing contains code for unit testing of this software.

Quick-Start (using BOPTEST-Service)

BOPTEST-Service allows you to access BOPTEST test cases directly in the cloud, without the need to run them locally. Interacting with BOPTEST-Service requires less configuration effort but is considerably slower because of the communication overhead between the agent and the test case running in the cloud. Use this approach when you want to quickly check out the functionality of this repository.

  1. Create a conda environment from the environment.yml file provided (instructions here).
  2. Check out the boptest-gym-service branch and run the example below, which uses the Bestest hydronic case with a heat pump and the DQN algorithm from Stable-Baselines3:
from boptestGymEnv import BoptestGymEnv, NormalizedObservationWrapper, DiscretizedActionWrapper
from stable_baselines3 import DQN

# url for the BOPTEST service. 
url = 'https://api.boptest.net' 

# Decide the state-action space of your test case
env = BoptestGymEnv(
        url                  = url,
        testcase             = 'bestest_hydronic_heat_pump',
        actions              = ['oveHeaPumY_u'],
        observations         = {'time':(0,604800),
                                'reaTZon_y':(280.,310.),
                                'TDryBul':(265,303),
                                'HDirNor':(0,862),
                                'InternalGainsRad[1]':(0,219),
                                'PriceElectricPowerHighlyDynamic':(-0.4,0.4),
                                'LowerSetp[1]':(280.,310.),
                                'UpperSetp[1]':(280.,310.)}, 
        predictive_period    = 24*3600, 
        regressive_period    = 6*3600, 
        random_start_time    = True,
        max_episode_length   = 24*3600,
        warmup_period        = 24*3600,
        step_period          = 3600)

# Normalize observations and discretize action space
env = NormalizedObservationWrapper(env)
env = DiscretizedActionWrapper(env,n_bins_act=10)

# Instantiate an RL agent
model = DQN('MlpPolicy', env, verbose=1, gamma=0.99,
            learning_rate=5e-4, batch_size=24, 
            buffer_size=365*24, learning_starts=24, train_freq=1)

# Main training loop
model.learn(total_timesteps=10)

# Loop for one episode of experience (one day)
done = False
obs, _ = env.reset()
while not done:
  action, _ = model.predict(obs, deterministic=True) 
  obs,reward,terminated,truncated,info = env.step(action)
  done = (terminated or truncated)

# Obtain KPIs for evaluation
env.get_kpis()

Quick-Start (running BOPTEST locally)

Running BOPTEST locally is substantially faster than using BOPTEST-Service, since it avoids the communication overhead with the cloud.

  1. Create a conda environment from the environment.yml file provided (instructions here).
  2. Run a BOPTEST case with the building emulator model to be controlled (instructions here).
  3. Check out the master branch of this repository and run the example above, replacing the url with url = 'http://127.0.0.1:5000' and omitting the testcase argument of the BoptestGymEnv class.

Quick-Start (running BOPTEST locally in a vectorized environment)

To facilitate training and testing, we provide scripts that automate the deployment of multiple BOPTEST instances using Docker Compose and then train an RL agent with a vectorized BOPTEST-Gym environment. The deployment dynamically checks for available ports, generates a Docker Compose YAML file, and takes care of naming conflicts to ensure smooth deployment. Running a vectorized environment allows you to deploy as many BoptestGymEnv instances as cores you have available, so that the agent can learn from all of them in parallel (see here for more information; we specifically use SubprocVecEnv). This substantially speeds up the training process.

Usage

  1. Specify the BOPTEST root directory either by passing it as a command-line argument or by defining the boptest_root variable at the beginning of the script generateDockerComposeYml.py. The script prioritizes the command-line argument if provided. You can change the start port number and the total number of services as needed.

Example using command-line argument:

python generateDockerComposeYml.py absolute_boptest_root_dir
  2. Train an RL agent in parallel using the vectorized BOPTEST-Gym environment. See /examples/run_vectorized.py for an example of how to do so.
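
The port-checking and YAML-generation steps above can be sketched roughly as follows. The function names, the image name boptest_base, and the service naming scheme are illustrative assumptions, not the actual API of generateDockerComposeYml.py:

```python
import socket

def find_free_ports(start_port, total_services):
    """Scan upward from start_port and collect ports with no active listener."""
    ports, port = [], start_port
    while len(ports) < total_services:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            if s.connect_ex(('127.0.0.1', port)) != 0:  # nonzero: port appears free
                ports.append(port)
        port += 1
    return ports

def render_compose_yaml(ports):
    """Render a minimal docker-compose file with one BOPTEST service per port."""
    lines = ["version: '3'", 'services:']
    for i, port in enumerate(ports):
        lines += [
            f'  boptest{i}:',              # unique service name avoids conflicts
            '    image: boptest_base',     # illustrative image name
            '    ports:',
            f'      - "{port}:5000"',      # map a free host port to BOPTEST's 5000
        ]
    return '\n'.join(lines) + '\n'
```

Each generated service then gets its own URL (e.g. http://127.0.0.1:5000, http://127.0.0.1:5001, ...), and one BoptestGymEnv per URL can be wrapped in SubprocVecEnv for parallel learning.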

Versioning and main dependencies

The current BOPTEST-Gym version is v0.6.0, which is compatible with BOPTEST v0.6.0 (the BOPTEST-Gym version should always match the BOPTEST version used). The framework has been tested with gymnasium==0.28.1 and stable-baselines3==2.0.0. See testing/Dockerfile for a full description of the testing environment.

Citing the project

Please use the following reference if you use this repository in your research.

@inproceedings{boptestgym2021,
	author = {Javier Arroyo and Carlo Manna and Fred Spiessens and Lieve Helsen},
	title = {{An OpenAI-Gym environment for the Building Optimization Testing (BOPTEST) framework}},
	year = {2021},
	month = {September},
	booktitle = {Proceedings of the 17th IBPSA Conference},
	address = {Bruges, Belgium},
}

Contributors

  • javiarrobas
  • mwetter
  • xiangweiw

Issues

Excluding periods as attribute

This is to make excluding_periods an attribute of BoptestGymEnv instead of an argument of the reset method. This way, the excluded periods are also respected when using external algorithms such as those from Stable-Baselines3.

Behavior cloning

This is to implement examples that use behavior cloning to accelerate training.

Define a callback for variable episode length

Currently, the length of episodes is fixed. It may help learning considerably to terminate episodes when certain conditions are met, e.g. when the reward drops below a certain value. This is to implement a callback method that can be customized by the user to define those conditions. Note that the episode_length argument would become max_episode_length, defining the maximum possible episode length instead of a fixed length. This is still needed for the definition of the end_year_margin and for those cases where a fixed episode length is desired.
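
As a rough sketch of such a user-customizable condition (the class and argument names here are hypothetical, not part of the current API):

```python
class RewardThresholdTerminator:
    """Illustrative episode-termination condition: end the episode early when
    the instantaneous reward drops below a threshold, or when the maximum
    episode length is reached."""

    def __init__(self, reward_threshold=-1.0,
                 max_episode_length=24 * 3600, step_period=3600):
        self.reward_threshold = reward_threshold
        self.max_steps = max_episode_length // step_period

    def __call__(self, reward, step_count):
        # Terminate on poor reward, or fall back to the maximum episode length
        return reward < self.reward_threshold or step_count >= self.max_steps
```

The environment would call such an object after every step and end the episode as soon as it returns True.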

Change simulation step keyword

This is to change the simulation step keyword. Currently Ts is being used, although it'd be more intuitive to use step, which is also used in BOPTEST.

reference updates

A slight update to the references is needed when switching BOPTEST versions. This version change is not reflected in the repo because CI is not yet implemented.

Start time argument

Right now, two separate arguments are used to define the starting time of an episode: random_start_time and start_time, with the latter used when random_start_time=False. It'd be clearer and cleaner to use only one argument, start_time, which can either have a fixed value or be assigned None, in which case a random start time is used.

Add conda environment

This is to add an environment.yml file specifying the conda environment required to run the examples of this repo.

Do not use weight factor as an attribute

This issue is to stop using the weight factor as an attribute of the BOPTEST-Gym environment. It should rather be a local variable within the compute_reward method.
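
A minimal sketch of what this could look like, with the weight as a local variable; the KPI keys and the fixed weight value are assumptions for illustration:

```python
def compute_reward(kpi_increment):
    """Weighted cost/discomfort reward where the weight factor is a local
    variable instead of an environment attribute (illustrative sketch)."""
    w = 0.5  # local weight factor, no longer stored on the environment
    cost = kpi_increment['cost_tot']        # assumed operational-cost KPI key
    discomfort = kpi_increment['tdis_tot']  # assumed thermal-discomfort KPI key
    return -(w * cost + (1 - w) * discomfort)
```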

Jan to Feb

This is to replace the keyword jan (for January) with feb (for February) in all examples, since they actually run in February, not January.

Cleanup testing

This is to clean up testing, e.g. by avoiding duplicate code or by generalizing the test_agent method to allow models from Stable-Baselines3.

Register environment

Registration of new custom Gym environments is easy using gym.envs.registration.register. Registering BOPTEST environments would make it extremely easy for users to access them. However, this issue may need to wait until scenarios are defined better and until BOPTEST is hosted as a web service, so that Docker is not required in the installation.

Save progress

This issue is to use the SaveOnBestTrainingRewardCallback to store the model every fixed number of interaction steps so that the saved models can be loaded later to assess the training progress.

Excluding periods for reset

This is to add the possibility to specify excluding periods when resetting the environment such that episodes do not overlap with these periods when initializing from a random start time. This is useful to ensure that testing data is different from training data.
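
The intended behavior can be sketched as rejection sampling of the start time; the function and argument names are illustrative, not the actual implementation:

```python
import random

def sample_start_time(year_seconds, episode_length, excluding_periods,
                      rng=random):
    """Draw random start times until the episode window [t, t + episode_length]
    does not overlap any (start, end) window in excluding_periods.
    Note: loops forever if the excluded periods cover the whole year."""
    while True:
        t = rng.randrange(0, year_seconds - episode_length)
        overlaps = any(t < end and start < t + episode_length
                       for start, end in excluding_periods)
        if not overlaps:
            return t
```

For instance, excluding the testing period guarantees that randomly initialized training episodes never touch the data reserved for evaluation.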

PPO2 typo

There is a typo on the implementation of the PPO2 algorithm:

model.save(os.path.join(utilities.get_root_path(), 'examples',

That line should load, not save, the model. Because of this, the pre-trained model was probably overwritten and lost; therefore, the agent needs to be re-trained and the references for this example need to be updated.

Resolve readme typo

The readme example uses BoptestGymEnvRewardWeightCost while BoptestGymEnv is imported instead.

change measurement keys

The keyword used in the RL community for "measurements" is "observations". Still, the word "measurements" (or "meas") sometimes appears interchangeably with "observations" (or "obs"), which may be misleading. This issue is to change all those occurrences from "measurements" to "observations".

Avoid tensorboard logs in testing

Right now tensorboard logs are also generated during testing. This issue is to avoid these logs as they are not needed for testing purposes.

update to latest BOPTEST version

This is to update the references to run with the latest BOPTEST version. It may require some edits to adapt to the newest version of the interface, where each data point of the results needs to be requested independently. Additionally, the references may show differences because the bestest_hydronic_heat_pump case, which is the building emulator model used in the examples and tests, has been modified, as have the pricing scenarios. Hence, the trained agents used in the examples may behave poorly. However, the agents should not be retrained for this issue, but in a follow-up issue that will also update for the new scenario periods defined in BOPTEST.

fore_n

The number of forecasting steps should be the number of look-ahead prediction steps plus one, to account for the current time. This is to correct this line:

self.fore_n = int(self.forecasting_period/self.step_period)

to be: self.fore_n = int(self.forecasting_period/self.step_period) + 1
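
As a quick check of the off-by-one: with a 24-hour forecasting period and a 1-hour step, the observation should cover the current time plus 24 look-ahead steps, i.e. 25 points in total:

```python
forecasting_period = 24 * 3600  # seconds of look-ahead
step_period = 3600              # seconds per simulation step

fore_n_old = int(forecasting_period / step_period)       # 24: misses current time
fore_n_new = int(forecasting_period / step_period) + 1   # 25: includes current time
```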

Add gitignore

This is to add a .gitignore file to the repository.

Implement predictive states

This is to implement examples that include forecasts in the agent's observations, demonstrating the possibility of training predictive RL agents.

Implement a render method

It is relevant for diagnosis to see what is actually happening within the environment. This issue is to implement a render method to display such information.

update scenario periods

This is to use the BOPTEST peak and typical periods for the examples and testing, instead of the current periods in February and November, which were chosen without any rigorous criteria. This issue should also re-train the agents used in the examples, following what is discussed in #83.

Regressive state

This is to add the possibility to extend the state space with past observations. This allows the agent to extract hidden state and time features of the partially observable building system. An approach would be to add arguments for the desired regressive variables and the regression period.
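
One possible mechanism, sketched with a hypothetical helper class that is not part of the current code:

```python
from collections import deque

class RegressiveBuffer:
    """Keep the last n_regressive observations and prepend them to the current
    one, extending the state with past information (illustrative sketch)."""

    def __init__(self, n_regressive):
        self.past = deque(maxlen=n_regressive)

    def observe(self, obs):
        state = list(self.past) + [obs]  # past observations first, then current
        self.past.append(obs)
        return state
```

The regression period would determine n_regressive (e.g. regressive_period // step_period), analogously to how the predictive period determines the number of forecast steps.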

Add readme

Add readme file explaining basic features and functionality.

Add reward examples

Add examples to illustrate how the default compute_reward method works and how it can be overridden with a custom reward function. Add unittests associated to these examples.

Rescale rewards

Rescaling rewards may be useful to increase sampling efficiency. This can be done using previous rewards, but without shifting the mean, as that may affect the agent's will to live.

Perform tests with multi-dimensional action space

So far, all tests are performed with a one-dimensional action space. Potential issues may arise from the implementation of the interface with a multi-dimensional action space. It'd be convenient to perform tests with a multi-dimensional action space to prevent those issues.

Do not print in init

Avoid printing in __init__ and use a more formal method to display the environment.
