airboxlab / rllib-energyplus

Simple EnergyPlus environments for control optimization using reinforcement learning

License: MIT License

Python 91.91% Dockerfile 8.09%

energyplus pytorch ray-rllib reinforcement-learning deep-learning python tensorflow2 ray pearl

rllib-energyplus's Introduction

EnergyPlus environments for Reinforcement Learning

This project implements a gym environment that handles EnergyPlus simulations for Reinforcement Learning (RL) experiments, using the EnergyPlus Python API. It also provides a set of examples and tools to train RL agents.

Requires Python 3.8+, EnergyPlus 9.3+

Setup

Using docker image

Look for a pre-built docker image in packages and follow instructions to pull it.

Alternatively, build the docker image:

docker build . -f docker/Dockerfile -t rllib-energyplus

Run the container

docker run --rm --name rllib-energyplus -it rllib-energyplus

Notes:

Remove --rm to keep the container after exiting.
If you want to use TensorBoard, start the container with the --network host parameter.
If you want to use a GPU, start the container with the --gpus all parameter.

Inside the container, run the experiment

cd /root/rllib-energyplus
# run the Amphitheater example
python3 rleplus/train/rllib.py --env AmphitheaterEnv

Using a virtual environment

Using poetry

Install Poetry if you don't have it already:

curl -sSL https://install.python-poetry.org | python3 -

See more installation options here.

This project comes with a pyproject.toml file that lists all dependencies. Package versions are pinned (in poetry.lock) to ensure reproducibility.

Install the project dependencies with:

poetry install

Using pip

The poetry lock file is automatically converted to a requirements file, so you can also install dependencies with pip:

# Create a virtual environment
python3 -m venv env
# Activate the virtual environment
source env/bin/activate
# Install dependencies
pip install -r requirements.txt

Path dependencies

This project depends on the EnergyPlus Python API. An auto-discovery mechanism is used to find the API, but in case it fails, you can manually add the path to the API to the PYTHONPATH environment variable using the following:

export PYTHONPATH="/usr/local/EnergyPlus-23-2-0/:$PYTHONPATH"

Make sure you can import the EnergyPlus API by printing its version number:

$ python3 -c 'from pyenergyplus.api import EnergyPlusAPI; print(EnergyPlusAPI.api_version())'
0.2
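If auto-discovery fails inside a script or notebook, the same path can also be added from Python before importing the API. This is a minimal sketch, assuming the install directory from the export example above; the helper name add_energyplus_to_path is hypothetical and not part of this project.

```python
import sys
from pathlib import Path


def add_energyplus_to_path(install_dir: str) -> bool:
    """Prepend an EnergyPlus install directory to sys.path.

    Returns True if the directory exists and is on sys.path afterwards,
    False if the directory does not exist.
    """
    path = Path(install_dir)
    if not path.is_dir():
        return False
    entry = str(path)
    if entry not in sys.path:
        sys.path.insert(0, entry)
    return True


# Assumed install location, taken from the export example above.
if add_energyplus_to_path("/usr/local/EnergyPlus-23-2-0/"):
    # pyenergyplus ships with the EnergyPlus install, not with pip.
    from pyenergyplus.api import EnergyPlusAPI
```

Unlike the PYTHONPATH export, this only affects the current Python process.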

Run example

Run the amphitheater example with default parameters using the Ray RLlib PPO algorithm:

Using Poetry

# Using Ray RLlib
poetry run rllib --env AmphitheaterEnv
# Using Meta Pearl
poetry run pearl --env AmphitheaterEnv

Using Python

If you installed dependencies with pip, you can run the example with:

# Using Ray RLlib
python3 rleplus/train/rllib.py --env AmphitheaterEnv
# Using Meta Pearl
python3 rleplus/train/pearl.py --env AmphitheaterEnv

Example of episode reward stats obtained by training with PPO for 1e5 timesteps with 2 workers, default parameters plus an LSTM, and a short E+ run period (the first 2 weeks of January). The experiment took about 20 minutes.

Creating a new environment

To create a new environment, you need to create a new class that inherits from rleplus.envs.EnergyPlusEnv and implement abstract methods. See existing environments for examples.

Once your environment is ready, it must be declared in the rleplus.examples.registry module, so it gets registered.
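The abstract methods of rleplus.envs.EnergyPlusEnv are not listed in this README, so the following is only a shape sketch built on a stand-in base class. Every name in it (BaseEnv, get_variables, compute_reward, ENV_REGISTRY) is hypothetical; see the existing environments and the registry module for the real interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class BaseEnv(ABC):
    """Stand-in for the real base class; method names are hypothetical."""

    @abstractmethod
    def get_variables(self) -> Dict[str, str]:
        """Map observation names to simulator variable names."""

    @abstractmethod
    def compute_reward(self, obs: Dict[str, float]) -> float:
        """Turn one observation into a scalar reward."""


class MyBuildingEnv(BaseEnv):
    def get_variables(self) -> Dict[str, str]:
        return {"indoor_temp": "Zone Air Temperature"}

    def compute_reward(self, obs: Dict[str, float]) -> float:
        # Hypothetical reward: penalize distance from a 21 C target.
        return -abs(obs["indoor_temp"] - 21.0)


# Hypothetical registry: environment name -> class, looked up by name.
ENV_REGISTRY: Dict[str, Type[BaseEnv]] = {"MyBuildingEnv": MyBuildingEnv}

env = ENV_REGISTRY["MyBuildingEnv"]()
print(env.compute_reward({"indoor_temp": 23.5}))
```

Leaving an abstract method unimplemented makes instantiation fail with a TypeError, which surfaces a missing method before any simulation starts.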

Tracking an experiment

TensorBoard is installed with the project requirements. To track an experiment running in a docker container, the container must be started with the --network host parameter.

Start tensorboard with:

tensorboard --logdir ~/ray_results --bind_all


rllib-energyplus's Issues

Values used to calculate the reward are also the observations of the agent

Not an error in the program, but one thing I noticed is that the obs dict contains both the variables and the meters, and the function _compute_reward() uses the values of the obs dictionary.

So the agent uses the meters to compute the reward, but the same meters are also passed to the agent as observations.

I'm not sure, but I was thinking this could potentially lead to some problems, maybe biased learning?
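One way to test this concern is to keep the reward-only meters out of the observation vector and compare training results. This is a minimal sketch, assuming a flat dict of readings per timestep; the key names and the reward formula are hypothetical and not the repository's.

```python
from typing import Dict, List, Tuple

# Hypothetical key names, for illustration only.
OBSERVED_KEYS: List[str] = ["outdoor_temp", "indoor_temp"]
REWARD_ONLY_KEYS: List[str] = ["electricity_meter"]


def split_readings(readings: Dict[str, float]) -> Tuple[List[float], float]:
    """Return (observation vector, reward) from one timestep of readings."""
    # The agent only sees the observed keys, in a fixed order.
    observation = [readings[key] for key in OBSERVED_KEYS]
    # Hypothetical reward: negative energy use, from meters the agent does not see.
    reward = -sum(readings[key] for key in REWARD_ONLY_KEYS)
    return observation, reward


obs, reward = split_readings(
    {"outdoor_temp": 5.0, "indoor_temp": 21.0, "electricity_meter": 1200.0}
)
print(obs, reward)
```

In most RL setups the reward is a function of quantities the agent can observe, so exposing the meters is not necessarily a problem. Running both variants is a quick way to see whether it affects learning here.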

energyplus error

When I run eplaunch.exe with model.idf and LUX_LU_Luxembourg.AP.065900_TMYx.2004-2018.epw, it fails with "** Fatal ** checkSetpointNodesAtEnd: At least one node does not have a setpoint attached, neither via a SetpointManager, EMS:Actuator, or API".
Here is the log:
Program Version,EnergyPlus, Version 23.1.0-87ed9199d4, YMD=2023.05.16 15:15,
************* Beginning Zone Sizing Calculations
** Warning ** Weather file location will be used rather than entered (IDF) Location object.
** ~~~ ** ..Location object=MILIANA
** ~~~ ** ..Weather File Location=Tampa International Ap FL USA TMY3 WMO#=722110
** ~~~ ** ..due to location differences, Latitude difference=[8.33] degrees, Longitude difference=[84.76] degrees.
** ~~~ ** ..Time Zone difference=[6.0] hour(s), Elevation difference=[99.16] percent, [709.00] meters.
** Warning ** ProcessScheduleInput: Schedule:Constant="ALWAYS ON DISCRETE", Blank Schedule Type Limits Name input -- will not be validated.
** Warning ** ProcessScheduleInput: Schedule:Constant="ALWAYS ON DISCRETE 3", Blank Schedule Type Limits Name input -- will not be validated.
** Warning ** ProcessScheduleInput: Schedule:Constant="ALWAYS OFF DISCRETE", Blank Schedule Type Limits Name input -- will not be validated.
** Warning ** ProcessScheduleInput: Schedule:Constant="ALWAYS ON CONTINUOUS", Blank Schedule Type Limits Name input -- will not be validated.
** Warning ** ProcessScheduleInput: Schedule:Constant="ALWAYS ON DISCRETE 4", Blank Schedule Type Limits Name input -- will not be validated.
** Warning ** ProcessScheduleInput: Schedule:Constant="ALWAYS ON DISCRETE 5", Blank Schedule Type Limits Name input -- will not be validated.
** Warning ** ProcessScheduleInput: Schedule:Constant="ALWAYS ON DISCRETE 6", Blank Schedule Type Limits Name input -- will not be validated.
** Warning ** GetHTSurfaceData: Surfaces with interface to Ground found but no "Ground Temperatures" were input.
** ~~~ ** Found first in surface=SURFACE 6
** ~~~ ** Defaults, constant throughout the year of (18.0) will be used.
** Warning ** CheckUsedConstructions: There are 25 nominally unused constructions in input.
** ~~~ ** For explicit details on each unused construction, use Output:Diagnostics,DisplayExtraWarnings;
** Warning ** GetSimpleAirModelInputs: ZoneInfiltration:DesignFlowRate="TZ_AMPHITHEATER 189.1-2009 - SECSCHL - AUDITORIUM - CZ4-8 INFILTRATION 77.0 PERCENT REDUCTION", Design Flow Rate Calculation Method specifies Flow Rate per Exterior Surface Area, but Exterior Surface Area = 0. 0 Infiltration will result.
** Severe ** UpdateZoneSizing: Cooling supply air temperature (calculated) within 2C of zone temperature
** ~~~ ** ...check zone thermostat set point and design supply air temperatures
** ~~~ ** ...zone name = TZ_AMPHITHEATER
** ~~~ ** ...design sensible cooling load = 485247.45 W
** ~~~ ** ...thermostat set point temp = 0.000 C
** ~~~ ** ...zone temperature = 14.025 C
** ~~~ ** ...supply air temperature = 14.000 C
** ~~~ ** ...temperature difference = -2.49369E-002 C
** ~~~ ** ...calculated volume flow rate = 4140059.27874 m3/s
** ~~~ ** ...calculated mass flow rate = 4982363.74891 kg/s
************* Beginning System Sizing Calculations
************* Beginning Plant Sizing Calculations
** Warning ** GetOAControllerInputs: Controller:MechanicalVentilation="CONTROLLER MECHANICAL VENTILATION 1"
** ~~~ ** Cannot locate a matching DesignSpecification:ZoneAirDistribution object for Zone="TZ_AMPHITHEATER".
** ~~~ ** Using default zone air distribution effectiveness of 1.0 for heating and cooling.
************* Testing Individual Branch Integrity
************* All Branches passed integrity testing
************* Testing Individual Supply Air Path Integrity
************* All Supply Air Paths passed integrity testing
************* Testing Individual Return Air Path Integrity
************* All Return Air Paths passed integrity testing
************* No node connection errors were found.
************* Beginning Simulation
** Warning ** CalcOAController: Minimum OA fraction > Mechanical Ventilation Controller request for Controller:OutdoorAir=CONTROLLER OUTDOOR AIR 1, Min OA fraction is used.
** ~~~ ** This may be overriding desired ventilation controls. Check inputs for Minimum Outdoor Air Flow Rate, Minimum Outdoor Air Schedule Name and Controller:MechanicalVentilation
** ~~~ ** Minimum OA fraction = 1.0000, Mech Vent OA fraction = 0.4883
** ~~~ ** Environment=2020, at Simulation time=01/01 09:00 - 09:15
** Warning ** Missing 'Temperature Setpoint' for node named named 'NODE 3'.
** Fatal ** checkSetpointNodesAtEnd: At least one node does not have a setpoint attached, neither via a SetpointManager, EMS:Actuator, or API
...Summary of Errors that led to program termination:
..... Reference severe error count=1
..... Last severe error=UpdateZoneSizing: Cooling supply air temperature (calculated) within 2C of zone temperature

************* ===== Recurring Error Summary =====
************* The following recurring error messages occurred.

************* ** Warning ** Controller:OutdoorAir="CONTROLLER OUTDOOR AIR 1": Min OA fraction > Mechanical ventilation OA fraction, continues...
************* ** ~~~ ** This error occurred 35285 total times;
************* ** ~~~ ** during Warmup 0 times;
************* ** ~~~ ** during Sizing 0 times.
************* ** ~~~ ** Max=1.000000 Min=1.000000

************* ===== Final Error Summary =====
************* The following error categories occurred. Consider correcting or noting.
************* Nominally Unused Constructions
************* ..The nominally unused constructions warning is provided to alert you to potential conditions that can cause
************* ..extra time during simulation. Each construction is calculated by the algorithm indicated in the HeatBalanceAlgorithm
************* ..object. You may remove the constructions indicated (when you use the DisplayExtraWarnings option).

************* EnergyPlus Warmup Error Summary. During Warmup: 0 Warning; 0 Severe Errors.
************* EnergyPlus Sizing Error Summary. During Sizing: 11 Warning; 1 Severe Errors.
************* EnergyPlus Terminated--Fatal Error Detected. 35299 Warning; 1 Severe Errors; Elapsed Time=00hr 00min 2.75sec

In addition, when I run run.py, the same error eventually occurs.
Partial Python log:
Updating Shadowing Calculations, Start Date=10/08/2020
Continuing Simulation at 10/08/2020 for 2020
Updating Shadowing Calculations, Start Date=10/28/2020
Continuing Simulation at 10/28/2020 for 2020
Updating Shadowing Calculations, Start Date=11/17/2020
Continuing Simulation at 11/17/2020 for 2020
Updating Shadowing Calculations, Start Date=12/07/2020
Continuing Simulation at 12/07/2020 for 2020
Updating Shadowing Calculations, Start Date=12/27/2020
Continuing Simulation at 12/27/2020 for 2020
**FATAL:checkSetpointNodesAtEnd: At least one node does not have a setpoint attached, neither via a SetpointManager, EMS:Actuator, or API
EnergyPlus Run Time=00hr 00min 2.93sec
Program terminated: EnergyPlus Terminated--Error(s) Detected.

Training freezes when a RuntimeError is raised in EnergyPlus

Hi, I used this repo as base for my own development.

I found that the statement raise RuntimeError(f"EnergyPlus failed with {self.energyplus_runner.sim_results['exit_code']}") on line 359 of run.py (in the step() method) freezes the simulation when an error appears.

I solved this problem by changing this line to raise Exception("Faulty episode") and adding the following to the Tune configuration for running the experiment:

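from ray import air, tune

# algorithm_name and algo_config are defined elsewhere in the training script.
tune.Tuner(
    algorithm_name,
    run_config=air.RunConfig(
        stop={'episode_total': 250},
        failure_config=air.FailureConfig(
            # Tries to recover a run up to this many times.
            max_failures=10,
        ),
    ),
    param_space=algo_config.to_dict(),
).fit()  # Tuner exposes fit(), not to_fit()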

This was helpful for me.

Make project structure more flexible

Main tasks:

split gym env / trainer / examples
sensors, actuators, and spaces should be configurable
configure action scaling
replace pip by poetry to pin all dependencies
auto detect energyplus install

Goal: make the project structure more flexible for creating new environments and testing new algorithms/frameworks.

Question about timesteps

I wanted to know what the timestep member variable of EnergyPlusEnv does.

I am having a little trouble understanding the step function of the class, and was hoping to get more information about it.

Premature E+ simulation termination if rllib run as non-annual

I have noticed that if rllib is not executed as an annual simulation for energyplus, the simulation will often terminate without properly executing the simulations.

This seems to happen at random but never happens if the '-a' (annual) flag is passed into energyplus. I also haven't noticed this behaviour when running just energyplus outside of rllib.

Also on the screenshot, it seems to suddenly start simulation for 7/21.

Thanks

Skipping action sending may lead to inconsistencies

The following code may lead to race conditions in the synchronization between observation collection and action sending:

rllib-energyplus/rllibenergyplus/run.py

Lines 225 to 242 in ce3be28

def _send_actions(self, state_argument):
    """
    EnergyPlus callback that sets actuator value from last decided action
    """
    if self.simulation_complete or not self._init_callback(state_argument):
        return
    if self.act_queue.empty():
        return
    next_action = self.act_queue.get()
    assert isinstance(next_action, float)
    self.x.set_actuator_value(
        state=state_argument,
        actuator_handle=self.actuator_handles["sat_spt"],
        actuator_value=next_action
    )

Returning when the action queue is empty is wrong if env or policy action sampling is slower than an E+ timestep execution: the env may not have time to push an action into the queue before E+ executes the callback.
This may be fine if the current E+ timestep is a system (internal) timestep, but not if it's a global timestep: in that case we should wait for the action to be available and apply it to the actuator.
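The waiting behavior described above can be sketched with a blocking queue read. This is a minimal sketch, assuming a standard queue.Queue; the function name, the is_global_timestep flag, and the timeout value are hypothetical and not taken from run.py.

```python
import queue
import threading
from typing import Optional


def next_action_or_none(
    act_queue: "queue.Queue[float]",
    is_global_timestep: bool,
    timeout_s: float = 2.0,
) -> Optional[float]:
    """Fetch the next action, blocking only when the timestep needs one.

    is_global_timestep and timeout_s are hypothetical parameters for this sketch.
    """
    if not is_global_timestep:
        # System (internal) timestep: use an action if one is ready, else skip.
        try:
            return act_queue.get_nowait()
        except queue.Empty:
            return None
    try:
        # Global timestep: wait for the policy to push its action.
        return act_queue.get(timeout=timeout_s)
    except queue.Empty:
        # Give up after the timeout so the simulation cannot hang forever.
        return None


actions: "queue.Queue[float]" = queue.Queue()
# Simulate a slow policy that pushes its action 0.1 s after the callback fires.
threading.Timer(0.1, actions.put, args=(18.5,)).start()
print(next_action_or_none(actions, is_global_timestep=True))
```

The timeout is a design choice: without it, a crashed policy would block the E+ callback indefinitely.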

Error when running run.py

Dear @antoine-galataud

An error occurred when I ran run.py in a Windows 10 environment.
The logs are below. Could you tell me how to fix it?
Best regards.

(raylet) [libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/wire_format_lite.cc:581] String field 'ray.rpc.WorkerTableData.exit_detail' contains invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
(raylet) [libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/wire_format_lite.cc:581] String field 'ray.rpc.ErrorTableData.error_message' contains invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
(pid=gcs_server) [libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/wire_format_lite.cc:581] String field 'ray.rpc.WorkerTableData.exit_detail' contains invalid UTF-8 data when parsing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
(pid=gcs_server) [libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/wire_format_lite.cc:581] String field 'ray.rpc.ErrorTableData.error_message' contains invalid UTF-8 data when parsing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.

OutputControl setting overridden?

In my IDF file, OutputControl:Table:Style is set to CommaAndHtml, but the folders generated by rllib for each episode don't contain the HTML files, and the CSV file doesn't match what I get when I run a plain E+ simulation.

I was reading through the code but couldn't see if there was a setting that overrides this, or if I am simply confused about some E+ settings.

Thanks.

Problem with ray 2.0.0

Hi,

I am very interested in this work. I am looking for a solution to couple Eplus with RL for HVAC control optimization. I tried to run this repo but did not manage to install ray[RLlib]==2.0.0. It seems to be too old.

If I install a newer version of ray, it doesn't work and there is a recommendation to work with gymnasium instead of gym.

Is there any update to this work?

Thanks a lot.

Add Pearl example

Add a training example that uses Pearl

FAIL serialization

Hello!
I have tried to modify the run.py file so that I can build the algorithm and train it directly with RLlib (i.e. without using Tune). When I run the command algo=config.build(), it tells me that the environment is not serializable. With the help of ray.util.inspect_serializability I obtained the following message:

==================================================== ==============
Checking Serializability of <class '__main__.EnergyPlusEnv'>
==================================================== ==============
!!! FAIL serialization: no default __reduce__ due to non-trivial __cinit__
     Serializing '__enter__' <function Env.__enter__ at 0x7f0420dbef80>...
     Serializing '__exit__' <function Env.__exit__ at 0x7f0420dbf010>...
     Serializing '__init__' <function EnergyPlusEnv.__init__ at 0x7f0306bd70a0>...
     Serializing '__str__' <function Env.__str__ at 0x7f0420dbeef0>...
     Serializing '_compute_reward' <function EnergyPlusEnv._compute_reward at 0x7f0306bd72e0>...
     Serializing 'close' <function Env.close at 0x7f0420dbecb0>...
     Serializing 'get_wrapper_attr' <function Env.get_wrapper_attr at 0x7f0420dbf0a0>...
     Serializing 'render' <function EnergyPlusEnv.render at 0x7f0306bd7250>...
     Serializing 'reset' <function EnergyPlusEnv.reset at 0x7f0306bd7130>...
     !!! FAIL serialization: no default __reduce__ due to non-trivial __cinit__
     Detected 4 global variables. Checking serializability...
         Serializing 'Queue' <class 'queue.Queue'>...
         Serializing 'EnergyPlusRunner' <class '__main__.EnergyPlusRunner'>...
         !!! FAIL serialization: no default __reduce__ due to non-trivial __cinit__
             Serializing '__init__' <function EnergyPlusRunner.__init__ at 0x7f0306bd69e0>...
             Serializing '_collect_obs' <function EnergyPlusRunner._collect_obs at 0x7f0306bd6cb0>...
             Serializing '_flush_queues' <function EnergyPlusRunner._flush_queues at 0x7f0306bd6ef0>...
             Serializing '_init_callback' <function EnergyPlusRunner._init_callback at 0x7f0306bd6dd0>...
             Serializing '_init_handles' <function EnergyPlusRunner._init_handles at 0x7f0306bd6e60>...
             !!! FAIL serialization: no default __reduce__ due to non-trivial __cinit__
             Serializing '_collect_obs' <function EnergyPlusRunner._collect_obs at 0x7f0306bd6cb0>...
     Serializing '_compute_reward' <function EnergyPlusEnv._compute_reward at 0x7f0306bd72e0>...
==================================================== ==============
Variable:

FailTuple(_init_handles [obj=<function EnergyPlusRunner._init_handles at 0x7f0306bd6e60>, parent=<class '__main__.EnergyPlusRunner'>])

was found to be non-serializable. There may be multiple other undetected variables that were non-serializable.
Consider either removing the instantiation/imports of these variables or moving the instantiation into the scope of the function/class.
==================================================== ==============
Check https://docs.ray.io/en/master/ray-core/objects/serialization.html#troubleshooting for more information.
If you have any suggestions on how to improve this error message, please reach out to the Ray developers on github.com/ray-project/ray/issues/
==================================================== ==============
(False,
  {FailTuple(_init_handles [obj=<function EnergyPlusRunner._init_handles at 0x7f0306bd6e60>, parent=<class '__main__.EnergyPlusRunner'>])})

Is this a mistake? How can I fix it?
Thank you
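The report above does not identify the root cause. One common pattern for classes that hold non-picklable handles is to create those handles lazily instead of in __init__, and to drop them when pickling. This is a minimal sketch of that pattern using the standard pickle module, with a threading.Lock standing in for a native handle; the class and attribute names are hypothetical and this is not the repository's fix.

```python
import pickle
import threading
from typing import Any, Dict, Optional


class LazyRunnerEnv:
    """Hypothetical environment that creates its non-picklable handle on demand."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config
        # Stand-in for a native handle; like one, a lock cannot be pickled.
        self._handle: Optional[Any] = None

    def reset(self) -> None:
        # Create the handle on first use, inside the process that runs the env.
        if self._handle is None:
            self._handle = threading.Lock()

    def __getstate__(self) -> Dict[str, Any]:
        # Drop the handle when pickling; it is rebuilt on the next reset().
        state = self.__dict__.copy()
        state["_handle"] = None
        return state


env = LazyRunnerEnv({"timestep": 4})
env.reset()
clone = pickle.loads(pickle.dumps(env))
print(clone.config, clone._handle)
```

Without the __getstate__ override, pickling the instance after reset() would raise a TypeError because of the lock.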
