Carbon efficient model deployment for project Codegreen

This repository provides a tool for deploying prediction models in a more environmentally friendly manner. The tool is designed to complement the Codegreen project.

Project Codegreen allows users to time-shift their computations to periods when a higher proportion of energy is produced from renewable sources, thereby reducing the carbon footprint of their computation. This is achieved by leveraging forecasts of energy generation data obtained from open data sources.
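As a rough illustration of the time-shifting idea, the sketch below picks the start hour whose window has the highest total renewable share for a job of a given length. The function name and the sample forecast values are hypothetical and are not taken from the Codegreen code base.

```python
from typing import Sequence


def best_start_hour(renewable_pct: Sequence[float], duration_hours: int) -> int:
    """Return the index of the start hour with the highest total renewable share."""
    if duration_hours <= 0 or duration_hours > len(renewable_pct):
        raise ValueError("duration must be between 1 and the forecast length")
    window_sum = sum(renewable_pct[:duration_hours])
    best_sum, best_start = window_sum, 0
    for start in range(1, len(renewable_pct) - duration_hours + 1):
        # Slide the window by one hour: add the newest value, drop the oldest.
        window_sum += renewable_pct[start + duration_hours - 1] - renewable_pct[start - 1]
        if window_sum > best_sum:
            best_sum, best_start = window_sum, start
    return best_start


# Made-up hourly renewable percentages, for illustration only.
forecast = [30.0, 35.0, 50.0, 65.0, 60.0, 40.0]
print(best_start_hour(forecast, 2))
```

With a longer forecast horizon, the same search can cover jobs that run for many hours, which is where the 24-hour limit described next becomes a constraint.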

For example, in the European Union, data is collected from the ENTSOE platform. However, a significant challenge arises from the limited duration of the available energy production forecasts, which typically span 24 hours, and from the sporadic upload schedule. This unpredictability makes it difficult to predict the optimal time for long-running computational tasks.

One approach to address this challenge is to train prediction models using historical energy generation data that forecast the time series of renewable energy percentages on an hourly basis. Since each country's energy generation patterns are unique, separate models are needed for each country. As our understanding of energy patterns for individual countries improves, we should incorporate this into our models. Thus there can be multiple models for a single country.

Now the question arises: how do we deploy these models effectively so that prediction values can be seamlessly integrated into the main Codegreen API while minimizing carbon emissions? This project outlines one approach to do just that.

Architecture

The figure below describes the overall architecture of the deployment.

[Figure: Architecture]

The Codegreen backend uses a Redis server to cache forecast values for improved performance. We use this Redis instance as shared memory between the backend and our prediction tool.

We create a Docker container (named codegreen-prediction-tool) and add it to the Docker network in which the Codegreen backend and other services operate. However, this container does not run continuously. Instead, a CRON job starts the container after a specified time interval, automatically triggering the script that runs the models for all available countries and stores their results. Once this task is completed, the container stops automatically.

The results of the models (the predictions) are sent to the Redis cache and stored in a local data folder. This folder, which also includes logs, is shared with the host machine.

Installation and setup

  • Pre-requisites:

    • Docker must be installed
    • The Codegreen server must be up and running
    • Obtain the name of the Docker network in which the Codegreen containers exist. Use the command docker network ls. Usually, the default network name is projectfoldername_default.
  • Clone the repository : git clone https://github.com/shubhvjain/codegreen-prediction-tool.git. All further steps must be performed from the root of the project folder.

  • Create a config file :

    • Create a new file named .config in the root of the project repository
    • Initialize the file with the following environment variables:
    ENTSOE_TOKEN=token
    PREDICTIONS_REDIS_URL="redis://cache:6379"
    PREDICTIONS_CRON_JOB_FREQ_HOUR=1
    PREDICTIONS_DOCKER_VOLUME_PATH="/full/local/path"
    GREENERAI_DOCKER_NETWORK=greenerai_default
  • Initial setup : Execute the initial setup by running ./setup.sh.

    • Note : This command must be run again whenever the config file is changed.
  • Test run the program : Before configuring the cron job, ensure everything is properly set up by running ./run.sh. If the setup is correct, you will find log files showing successful model runs in the path specified in the config file.
  • Setting up the cron job : Execute ./schedule.sh to set up the cron job. The frequency of the job is determined by the PREDICTIONS_CRON_JOB_FREQ_HOUR variable in the .config file.
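
The contents of schedule.sh are not shown here, so the following is only a sketch of how an hourly frequency such as PREDICTIONS_CRON_JOB_FREQ_HOUR could be translated into a standard five-field crontab schedule. The function name is hypothetical.

```python
def cron_expression(freq_hours: int) -> str:
    """Build a crontab schedule that fires at minute 0 every `freq_hours` hours."""
    if not 1 <= freq_hours <= 24:
        raise ValueError("frequency must be between 1 and 24 hours")
    if freq_hours == 24:
        # Once per day, at midnight.
        return "0 0 * * *"
    # A step value in the hour field: hours 0, N, 2N, ... within each day.
    return f"0 */{freq_hours} * * *"


print(cron_expression(1))
print(cron_expression(6))
```

Note that a step value restarts at midnight, so a frequency that does not divide 24 evenly produces one shorter interval per day.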

Development

Local setup for development

  • Clone the repository
  • Create the .config file in the root of the project. Initialize it as mentioned in the installation steps above. Each variable in the config file is explained below.
  • Install the required packages
    • Optional step : Create a new conda environment (conda env create -n greenerai) and activate it (conda activate greenerai)
    • Install packages using : pip install -r requirements.txt

Code Structure

  • All the models (and related metadata) are stored in the models folder. See instructions below on adding a new model to the repo.

  • Main Python files:

    • predictionModel.py: To find models and run them.
    • savePredictions.py: To store predictions generated by models.
    • entsoeAPI.py: Gathers data from ENTSOE portal.
  • Main Bash scripts:

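    • setup.sh: Builds the Docker image and creates the container (see Working of the Tool).
    • run.sh: Runs the tool once; used for the test run during installation.
    • schedule.sh: Sets up the cron job.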

Working of the Tool

  • Running setup.sh builds a Docker image (using the Dockerfile), creates a new Docker container from this image, and adds it to the Docker network in which the Codegreen server is running.

  • When the container starts, it runs the command python savePredictions.py. During development, one can run this command as well.

  • Working of savePredictions.py:

    • Performs checks: verifies that all required environment variables are set and that the required folders exist. Missing folders are created (they are already gitignored).
    • Gets the latest model available for each country, runs them and stores the results.
    • The results are stored in two ways:
      • In a CSV file under the data/predictions folder. There is a file for each country.
      • If the Codegreen Redis cache is available, data is stored in it with the key: countryName_predictions.
    • Model runs are logged. Logs are stored in the data/logs folder. There is a log file for each country and each month.
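
The two storage paths described above could look roughly like the sketch below. It is not the code from savePredictions.py: the function name, the CSV column names, the JSON encoding of the cached value, and the use of the two-letter country code in the key are all assumptions. The cache argument is any object with a set(key, value) method, such as a client from the redis package; a small in-memory stand-in is used here so the example runs without a Redis server.

```python
import csv
import json
import tempfile
from pathlib import Path


def save_predictions(country, rows, data_dir, cache=None):
    """Write predictions to <data_dir>/predictions/<country>.csv and, if given, to the cache."""
    out_dir = Path(data_dir) / "predictions"
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / f"{country}.csv"
    with csv_path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["timestamp", "renewable_percentage"])  # assumed column names
        writer.writerows(rows)
    if cache is not None:
        try:
            # Key format from the README: countryName_predictions.
            cache.set(f"{country}_predictions", json.dumps(rows))
        except Exception as exc:
            # The CSV copy is already on disk, so a cache failure is not fatal.
            print(f"cache unavailable, skipped: {exc}")
    return csv_path


class DictCache:
    """In-memory stand-in for a Redis client (illustration only)."""

    def __init__(self):
        self.store = {}

    def set(self, key, value):
        self.store[key] = value


demo_cache = DictCache()
demo_rows = [("2024-01-01T00:00", 41.5), ("2024-01-01T01:00", 43.0)]  # made-up values
with tempfile.TemporaryDirectory() as tmp:
    csv_text = save_predictions("DE", demo_rows, tmp, demo_cache).read_text()
print(csv_text)
```

Writing the CSV first means a prediction run still leaves a usable result on the shared volume when the cache cannot be reached.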

How to add a new model?

  • Models are stored in the models folder.
  • File naming convention : twoLetterCountryCode_versionNumber (for example, DE_v1.h5)
    • The version number is incremental. The model with the highest number is considered the latest model for that country.
  • When the main script is run, the model with the highest version number is selected for each country to make the predictions.
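
The selection rule can be sketched as follows. This is not the code from predictionModel.py: the function name is hypothetical, and the file name pattern (two uppercase letters, `_v`, a number, `.h5`) is inferred from the example name DE_v1.h5.

```python
import re

# Pattern inferred from the example file name "DE_v1.h5" (an assumption).
MODEL_NAME = re.compile(r"^([A-Z]{2})_v(\d+)\.h5$")


def latest_models(filenames):
    """Map each country code to the file name with the highest version number."""
    latest = {}
    for name in filenames:
        match = MODEL_NAME.match(name)
        if match is None:
            continue  # skip files such as metadata.json
        country, version = match.group(1), int(match.group(2))
        # Compare versions as integers so that v10 ranks above v2.
        if country not in latest or version > latest[country][0]:
            latest[country] = (version, name)
    return {country: name for country, (_, name) in latest.items()}


files = ["DE_v1.h5", "DE_v10.h5", "DE_v2.h5", "FR_v3.h5", "metadata.json"]
print(latest_models(files))
```

Comparing the version as a number rather than as text matters once a country has ten or more models, because a plain string sort would place v10 before v2.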

To add a new model :

  • Copy the model file (.h5) in the models folder
  • Rename the model file based on the file name conventions described above
  • Add a new JSON entry in the metadata.json file
    {
        "name":"DE_v1.h5",
        "country":"DE",
        "input_sequence":24,
        "description":""
    }
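
Before committing a new entry, a quick check like the one below can catch a missing field or a name that does not match its country code. It only relies on the four fields shown in the example entry; the function name and the checks themselves are illustrative and are not part of the tool.

```python
import json

# Field names and types taken from the example entry above.
REQUIRED_FIELDS = {"name": str, "country": str, "input_sequence": int, "description": str}


def check_entry(entry):
    """Return a list of problems found in one metadata entry (empty if none)."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected):
            problems.append(f"wrong type for field: {field}")
    # Only compare name and country when both are present and well typed.
    if not problems and not entry["name"].startswith(entry["country"] + "_"):
        problems.append("name does not start with the country code")
    return problems


entry = json.loads('{"name": "DE_v1.h5", "country": "DE", "input_sequence": 24, "description": ""}')
print(check_entry(entry))
```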
    

The .config file

All configuration settings required by the tool are stored in the .config file in the root of the project folder. Essentially, this file contains environment variables that are loaded before the main program runs.

Description of each variable required in the .config file:

  • ENTSOE_TOKEN: Token required to access the ENTSO-E API.
  • PREDICTIONS_REDIS_URL: The URL of the common Redis server. Use "redis://cache:6379".
  • PREDICTIONS_CRON_JOB_FREQ_HOUR: The frequency (in hours) of the CRON job configured in the last step of installation.
  • PREDICTIONS_DOCKER_VOLUME_PATH: The full path on the host machine where the recent prediction files and log files will be stored.
  • GREENERAI_DOCKER_NETWORK: The name of the Docker network in which the Codegreen containers are running.
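
How the tool itself loads this file is not shown here. As one possible sketch, the snippet below parses KEY=value lines in the format used above and reports which of the five required variables are missing. The function names are hypothetical.

```python
import shlex

REQUIRED = [
    "ENTSOE_TOKEN",
    "PREDICTIONS_REDIS_URL",
    "PREDICTIONS_CRON_JOB_FREQ_HOUR",
    "PREDICTIONS_DOCKER_VOLUME_PATH",
    "GREENERAI_DOCKER_NETWORK",
]


def parse_config(text):
    """Parse KEY=value lines, stripping optional shell-style quotes from values."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # removes surrounding quotes, if any
        values[key.strip()] = parts[0] if parts else ""
    return values


def missing_variables(values):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED if not values.get(name)]


sample = '''
ENTSOE_TOKEN=token
PREDICTIONS_REDIS_URL="redis://cache:6379"
PREDICTIONS_CRON_JOB_FREQ_HOUR=1
PREDICTIONS_DOCKER_VOLUME_PATH="/full/local/path"
GREENERAI_DOCKER_NETWORK=greenerai_default
'''
config = parse_config(sample)
print(missing_variables(config))
```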
