This repository provides a tool for deploying prediction models in a more environmentally friendly manner. It is designed to complement the Codegreen project.
The Codegreen project allows users to time-shift their computations to periods when a higher proportion of energy is produced from renewable sources, thereby reducing the carbon footprint of their computations. This is achieved by leveraging forecasts of energy generation data obtained from open data sources.
For example, in the European Union, data is collected from the ENTSO-E platform. However, a significant challenge arises from the limited duration of the available energy production forecasts, which typically span 24 hours, and from their sporadic upload schedule. This unpredictability makes it difficult to predict the optimal time for long-running computational tasks.
One approach to address this challenge is to train prediction models using historical energy generation data that forecast the time series of renewable energy percentages on an hourly basis. Since each country's energy generation patterns are unique, separate models are needed for each country. As our understanding of energy patterns for individual countries improves, we should incorporate this into our models. Thus there can be multiple models for a single country.
The question then arises: how do we deploy these models effectively so that prediction values can be seamlessly integrated into the main Codegreen API while minimizing carbon emissions? This project outlines one approach to doing just that.
The figure below describes the overall deployment architecture.
The Codegreen backend utilizes a Redis server to cache forecast values for improved performance. We use this Redis instance as shared memory between the backend and our prediction tool.
We create a Docker container (named `codegreen-prediction-tool`) and add it to the Docker network in which the Codegreen backend and other services operate. However, this container does not run continuously. Instead, a cron job starts the container at a specified time interval, automatically triggering the execution of a script that runs the models for all available countries and stores their results. Once this task is completed, the container stops automatically.
The results of the models (the predictions) are sent to the Redis cache and also stored in a local data folder. This folder, which also includes logs, is shared with the host machine.
- Pre-requisites:
  - Docker must be installed.
  - The Codegreen server must be up and running.
  - Obtain the name of the Docker network in which the Codegreen containers exist. Use the command `docker network ls`. Usually, the default network name is `projectfoldername_default`.
- Clone the repository: `git clone https://github.com/shubhvjain/codegreen-prediction-tool.git`. All further steps must be performed from the root of the project folder.
- Create a config file:
  - Create a new file named `.config` in the root of the project repository.
  - Initialize the file with the following environment variables:

    ```
    ENTSOE_TOKEN=token
    PREDICTIONS_REDIS_URL="redis://cache:6379"
    PREDICTIONS_CRON_JOB_FREQ_HOUR=1
    PREDICTIONS_DOCKER_VOLUME_PATH="/full/local/path"
    GREENERAI_DOCKER_NETWORK=greenerai_default
    ```
- Initial setup: execute the initial setup by running `./setup.sh`.
  - Note: this command must be run again if the config file is changed.
- Test run the program: before configuring the cron job, ensure everything is properly set up by running `./run.sh`. If the setup is correct, you will find log files confirming successful model runs in the path specified in the config file.
- Setting up the cron job: execute `./schedule.sh` to set up the cron job. The frequency of the job is determined by the `PREDICTIONS_CRON_JOB_FREQ_HOUR` variable in the `.config` file.
- Clone the repository.
- Create the `.config` file in the root of the project. Initialize it as described in the installation steps above. Each variable in the config file is explained below.
- Install the required packages:
  - Optional: create a new conda environment (`conda env create -n greenerai`) and activate it (`conda activate greenerai`).
  - Install the packages using `pip install -r requirements.txt`.
- All the models (and related metadata) are stored in the `models` folder. See the instructions below on adding a new model to the repo.
- Main Python files:
  - `predictionModel.py`: finds models and runs them.
  - `savePredictions.py`: stores the predictions generated by the models.
  - `entsoeAPI.py`: gathers data from the ENTSO-E portal.
- Main Bash scripts:
  - `setup.sh`: builds the Docker image and creates the container.
  - `run.sh`: runs the models once and stores the predictions.
  - `schedule.sh`: sets up the cron job.
- Running `setup.sh` generates a Docker image (using the `Dockerfile`), creates a new Docker container from this image, and adds it to the Docker network in which the Codegreen server is running.
- When the container starts, it runs the command `python savePredictions.py`. During development, this command can also be run directly.
- Working of `savePredictions.py`:
  - Performs checks: verifies that all required environment variables exist and that the required folders exist (if not, they are created; these folders are already gitignored).
  - Gets the latest available model for each country, runs it, and stores the results.
  - The results are stored in two ways:
    - In a CSV file under the `data/predictions` folder. There is one file per country.
    - If the Codegreen Redis cache is available, the data is also stored there under the key `countryName_predictions`.
  - Model runs are logged. Logs are stored in the `data/logs` folder. There is one log file per country per month.
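The two-way storage described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names, CSV column names, and row format are assumptions; only the key convention `countryName_predictions` and the `data/predictions` folder come from the description above.

```python
import csv
import json
import os

def redis_key(country: str) -> str:
    # Predictions are cached under the key "<countryName>_predictions"
    return f"{country}_predictions"

def store_predictions(country, rows, data_dir="data/predictions"):
    """Write one country's predictions to its CSV file and return the path."""
    os.makedirs(data_dir, exist_ok=True)
    path = os.path.join(data_dir, f"{country}.csv")
    with open(path, "w", newline="") as f:
        # Column names here are illustrative, not the tool's actual schema
        writer = csv.DictWriter(f, fieldnames=["start_time", "percent_renewable"])
        writer.writeheader()
        writer.writerows(rows)
    return path

def cache_predictions(redis_client, country, rows):
    # Mirror the same rows into the shared Redis cache as a JSON string
    redis_client.set(redis_key(country), json.dumps(rows))
```

Writing the CSV unconditionally and treating Redis as a best-effort mirror matches the behavior above: predictions survive on disk even when the cache is unreachable.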
- Models are stored in the `models` folder.
- File naming convention: `twoLetterCountryCode_versionNumber`.
  - The version number is incremental. The model with the highest version number is considered the latest model for that country.
  - When the main script is run, the model with the highest version number is selected for each country to make the predictions.
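Selecting the latest model per country can be done by parsing the file names. A minimal sketch, assuming file names like `DE_v1.h5` (the helper name and directory argument are illustrative):

```python
import re
from pathlib import Path

# Model files follow "<twoLetterCountryCode>_v<versionNumber>.h5", e.g. "DE_v1.h5"
MODEL_PATTERN = re.compile(r"^([A-Z]{2})_v(\d+)\.h5$")

def latest_models(model_dir="models"):
    """Return {country: path} mapping each country to its highest-versioned model."""
    best = {}
    for path in Path(model_dir).glob("*.h5"):
        match = MODEL_PATTERN.match(path.name)
        if not match:
            continue  # skip files that do not follow the naming convention
        country, version = match.group(1), int(match.group(2))
        if country not in best or version > best[country][0]:
            best[country] = (version, path)
    return {country: path for country, (version, path) in best.items()}
```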
To add a new model:

- Copy the model file (`.h5`) into the `models` folder.
- Rename the model file based on the naming convention described above.
- Add a new JSON entry in the `metadata.json` file:

  ```json
  {
    "name": "DE_v1.h5",
    "country": "DE",
    "input_sequence": 24,
    "description": ""
  }
  ```
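Looking up a model's metadata entry can then be a simple scan of that file. This sketch assumes `metadata.json` holds a JSON array of entries shaped like the example above; the helper name is hypothetical:

```python
import json

def model_metadata(model_name, metadata_path="models/metadata.json"):
    """Return the metadata entry whose "name" matches the given model file name."""
    with open(metadata_path) as f:
        entries = json.load(f)  # assumed to be a list of entry objects
    for entry in entries:
        if entry["name"] == model_name:
            return entry
    raise KeyError(f"No metadata entry for {model_name}")
```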
All configuration settings required by the tool are stored in the config file in the root of the project folder. Essentially, this file contains environment variables that are loaded before running the main program.

Description of each variable required in the `.config` file:
- `ENTSOE_TOKEN`: the token required to access the ENTSO-E API.
- `PREDICTIONS_REDIS_URL`: the URL of the common Redis server. Use `"redis://cache:6379"`.
- `PREDICTIONS_CRON_JOB_FREQ_HOUR`: the frequency (in hours) of the cron job configured in the last step of the installation.
- `PREDICTIONS_DOCKER_VOLUME_PATH`: the full path on the host machine where the recent prediction files and log files will be stored.
- `GREENERAI_DOCKER_NETWORK`: the name of the Docker network in which the Codegreen containers are running.
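Loading and validating these variables before the main program runs can be sketched as below. The parser and its error handling are illustrative, not the tool's actual implementation; only the variable names and the `.config` file location come from this document:

```python
import os

# The five variables this document requires in .config
REQUIRED_VARS = [
    "ENTSOE_TOKEN",
    "PREDICTIONS_REDIS_URL",
    "PREDICTIONS_CRON_JOB_FREQ_HOUR",
    "PREDICTIONS_DOCKER_VOLUME_PATH",
    "GREENERAI_DOCKER_NETWORK",
]

def load_config(path=".config"):
    """Parse simple KEY=VALUE lines and fail fast if a required variable is missing."""
    config = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # ignore blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip().strip('"')  # drop optional quotes
    missing = [name for name in REQUIRED_VARS if name not in config]
    if missing:
        raise ValueError(f"Missing config variables: {missing}")
    return config
```

Failing fast on a missing variable mirrors the checks `savePredictions.py` is described as performing before running any models.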