
Deep reinforcement learning based energy management strategy for hybrid electric vehicle

This work is published as: Lian R, Peng J, Wu Y, et al. Rule-interposing deep reinforcement learning based energy management strategy for power-split hybrid electric vehicle. Energy, 2020: 117297.

Happy to answer any questions you have. Please email us at [email protected] or [email protected].

A rule-interposing deep reinforcement learning (RIDRL) based energy management strategy (EMS) for hybrid electric vehicles (HEVs) is investigated. By incorporating the battery characteristics and the optimal brake specific fuel consumption (BSFC) curve of the engine, we aim to accelerate the learning process of DRL agents on the EMS task.

Prius modelling

As shown in Fig. 1, the core power-split component of the Prius is a planetary gear (PG) that splits power among the engine, motor and generator. In this structure, the engine and generator are connected to the planet carrier and sun gear respectively, while the motor is connected to the ring gear, which is in turn linked to the output shaft. In addition, the Prius is equipped with a small-capacity nickel-metal hydride (Ni-MH) battery that powers the traction motor and generator. The Prius combines the advantages of series and parallel HEVs and supports three driving modes: pure electric mode, hybrid mode and charging mode.
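For reference, a simple planetary gear imposes the standard kinematic speed constraint below, where $Z_{s}$ and $Z_{r}$ denote the tooth numbers of the sun and ring gears:

$\omega_{s} Z_{s} + \omega_{r} Z_{r} = \omega_{c} (Z_{s} + Z_{r})$

where $\omega_{s}$, $\omega_{r}$ and $\omega_{c}$ are the rotational speeds of the sun gear (generator), ring gear (motor) and planet carrier (engine), respectively. This constraint is what decouples the engine speed from the wheel speed and enables the three driving modes.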

In this research, a backward HEV model is built for the training and evaluation of the EMS. The vehicle power demand under a given driving cycle is calculated from the longitudinal force balance equation. The engine (Fig. 2), generator and motor are modeled by their corresponding efficiency maps obtained from bench experiments. The Ni-MH battery is modeled by an equivalent circuit model, wherein the effects of temperature change and battery aging are neglected. The experimental battery data, including internal resistance and open-circuit voltage, are shown in Fig. 2.
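A minimal sketch of the power-demand calculation from the longitudinal force balance (the vehicle parameters below are illustrative placeholders, not the values used in Prius_model_new.py):

```python
import numpy as np

def power_demand(v, a, m=1500.0, g=9.81, f_r=0.013,
                 rho=1.2, C_d=0.26, A=2.2, grade=0.0):
    """Vehicle power demand [W] from the longitudinal force balance.

    v: speed [m/s]; a: acceleration [m/s^2]; grade: road angle [rad].
    """
    F_roll = m * g * f_r * np.cos(grade)    # rolling resistance
    F_aero = 0.5 * rho * C_d * A * v ** 2   # aerodynamic drag
    F_grade = m * g * np.sin(grade)         # grade resistance
    F_inertia = m * a                       # inertial force
    return (F_roll + F_aero + F_grade + F_inertia) * v
```

At each time step of the driving cycle, the EMS then splits this demanded power between the engine and the battery.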

    Fig. 1. Architecture of Prius powertrain          Fig. 2. Engine map and battery characteristics

DRL-based energy management strategy

A DRL agent faces an environment with the Markov property. The agent and the environment interact continually: the agent selects actions, and the environment responds to these actions with rewards and presents new states to the agent. In this research, the deep deterministic policy gradient (DDPG) algorithm is combined with expert knowledge of the HEV to learn the optimal EMS. Fig. 3 shows the agent-environment interaction for HEV energy management, i.e., the interaction between the EMS and the vehicle and traffic information. The state and action variables are defined as follows, where the continuous action variable is explored along the optimal BSFC curve of the engine. The reward function of the DDPG-based EMS consists of two parts: the instantaneous fuel consumption of the engine and the cost of battery charge sustaining. The multi-objective reward function is thus defined as:

State = {SoC, velocity, acceleration}

Action = {continuous action: engine power}

$Reward = -\left\{ \alpha \, fuel(t) + \beta \left[ SoC_{ref} - SoC(t) \right]^{2} \right\}$

where $\alpha$ is the weight of fuel consumption, $\beta$ is the weight of battery charge sustaining, and $SoC_{ref}$ is the SoC reference value for maintaining battery charge sustaining. $SoC_{ref}$ is determined from prior knowledge of the battery.
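A minimal sketch of this reward in code (the weights and $SoC_{ref} = 0.6$ are illustrative placeholders; the actual values are tuned in DDPG_Prius.py):

```python
def reward(fuel_rate, soc, alpha=1.0, beta=350.0, soc_ref=0.6):
    """Multi-objective reward: negative weighted sum of the instantaneous
    fuel consumption and the squared SoC deviation from the reference."""
    return -(alpha * fuel_rate + beta * (soc_ref - soc) ** 2)
```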

           Fig. 3. Agent-environment interaction for HEV energy management

Simulation results

We conduct extensive comparison experiments between RI DDPG and RI deep Q-learning (DQL), in which the two share the same embedded expert knowledge. Fig. 4 shows the SoC trajectories of dynamic programming (DP), RI DDPG and RI DQL under the New European Driving Cycle (NEDC); their terminal SoC values are nearly identical, at approximately 0.6. From Fig. 5 and Fig. 6, it can be seen that most of the engine working points of RI DDPG fall in regions with a lower equivalent fuel consumption rate, while those of RI DQL are distributed less favorably. As a result, the fuel economy of RI DDPG reaches 95.3% of DP's, and its fuel consumption is 6.5% lower than that of RI DQL, as shown in Table 1. Fig. 7 shows that RI DQL struggles to converge and fluctuates frequently, whereas RI DDPG converges to a stable state after the 50th episode. To train an EMS online for an HEV, the training process of the controller must be stable enough to guarantee the safety of the powertrain; the stability of RI DDPG therefore makes it better suited to real-world applications of DRL-based EMSs. For further verification, additional driving cycles are applied to the two EMSs. The simulation results in Table 1 demonstrate the superior robustness of RI DDPG: the mean and standard deviation of fuel economy are improved by 8.94% and 2.74%, respectively.

                Fig. 4. SoC trajectories of the three EMS models
                      Fig. 5. Working points of engine
    Fig. 6. Distributions of fuel consumption rate              Fig. 7. Convergence curves

      Table 1. Comparison between RI DDPG and RI DQL under different driving cycles


Dependencies

  • tensorflow 1.15.0

  • numpy

  • matplotlib

  • scipy
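They can be installed with pip, for example (note that tensorflow 1.x wheels are only available for Python 3.7 and earlier):

```
pip install tensorflow==1.15.0 numpy matplotlib scipy
```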

The code structure

  • The Data_Standard Driving Cycles folder contains the driving cycles used for training DRL agents.
  • The Image folder contains the figures shown in this research.
  • Prius_model_new.py is the backward simulation model of the Prius powertrain.
  • Mot_eta_quarter.mat and Eng_bsfc_map.mat are the efficiency maps of the motor and engine (see the loading example after this list).
  • Priority_Replay.py is the prioritized experience replay module for training DRL agents.
  • DeepQNetwork_Prius.py trains the DQN agent.
  • DDPG_Prius.py trains the DDPG agent.
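As a quick check before training, the efficiency maps can be inspected with scipy (the variable names stored inside the .mat files are assumptions; print the dict keys to find the actual ones):

```python
from scipy.io import loadmat

# Load the engine BSFC map and motor efficiency map shipped with the repo.
eng_map = loadmat('Eng_bsfc_map.mat')
mot_map = loadmat('Mot_eta_quarter.mat')
print(eng_map.keys())  # inspect the variable names stored in the file
print(mot_map.keys())
```

Training is then started directly, e.g. `python DDPG_Prius.py` for the DDPG agent or `python DeepQNetwork_Prius.py` for the DQN agent.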

The DDPG and DQN implementations are developed based on MorvanZhou's DRL course.

Collaborators

  • Renzong Lian
  • Yuankai Wu

