AntiJam Project

Winning project of BITEhack 2023 hackathon in AI category.

This is a reinforcement learning system that trains a traffic light agent to switch optimally based on real-time traffic observations.

The optimization goal is the minimization of wasted fuel - fuel used while stuck in traffic.

Each traffic light observes the cars in the entire city, the states of the other traffic lights, and its own position.

A PPO (Proximal Policy Optimization) agent was trained in this environment.

Results have been compared with a baseline model (switching lights every X ticks). The learned PPO agent outperforms the baseline.
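To make the baseline concrete, here is a minimal sketch of a fixed-interval controller. The function name `baseline_action`, the `period` parameter (the "X ticks" above), and the choice to toggle all lights together are assumptions for illustration; the repository's baseline may differ.

```python
def baseline_action(tick: int, light_ids: list[int], period: int) -> dict[int, int]:
    """Return the requested state (0 or 1) for every light at the given tick.

    Assumption: all lights share one schedule and flip every `period` ticks.
    """
    state = (tick // period) % 2
    return {light_id: state for light_id in light_ids}


if __name__ == "__main__":
    # Print the schedule for two hypothetical lights with a period of 2 ticks.
    for tick in range(6):
        print(tick, baseline_action(tick, light_ids=[0, 1], period=2))
```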

Demo

Baseline agent on the left, trained PPO on the right.

The reward is fuel use efficiency (100% means no fuel was wasted while not moving).

image

video

Training report

report

Total computational time was 6 h.

Installation

  • Install Python 3.10 and PyTorch
  • pip install -r requirements.txt

Training

On a machine with enough RAM, VRAM and CPU cores, and a good GPU, run:

  • python train.py

A pretrained checkpoint is provided in checkpoints/.

Simulation

Specify the trained checkpoint path in simulation.py and run:

  • python simulation.py

Environment specification

Step function

The step function takes an action: a mapping from traffic light IDs to their requested states.

Each traffic light has a switching frequency limit, which prevents rapid state changes.

The step function returns the environment observation and the reward. The environment never terminates by itself.
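The switching limit can be sketched as follows. This models only the light-state part of a step, not the observation or reward. The class name `LightSwitchLimiter`, the parameter `min_ticks_between_switches`, and the rule of silently ignoring requests that arrive too early are assumptions for illustration, not the project's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class LightSwitchLimiter:
    # Hypothetical parameter: minimum number of ticks between two switches.
    min_ticks_between_switches: int
    states: dict[int, int] = field(default_factory=dict)
    last_switch_tick: dict[int, int] = field(default_factory=dict)
    tick: int = 0

    def step(self, action: dict[int, int]) -> dict[int, int]:
        """Apply requested states, ignoring requests that come too soon."""
        self.tick += 1
        for light_id, requested in action.items():
            # Assumption: lights start in state 0.
            current = self.states.setdefault(light_id, 0)
            if requested == current:
                continue
            last = self.last_switch_tick.get(light_id)
            if last is None or self.tick - last >= self.min_ticks_between_switches:
                self.states[light_id] = requested
                self.last_switch_tick[light_id] = self.tick
        return dict(self.states)
```

With `min_ticks_between_switches=3`, a light that switches at tick 1 keeps its state at ticks 2 and 3 even if the agent requests a change, and can switch again at tick 4.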

Observation

The observation is designed for a convolutional neural network. It contains 6 channels, each NxM (size of grid):

  • the map (1 where road)
  • car positions (1 where car)
  • this agent's position (1 at this agent's junction)
  • 1 where lights are in state 0
  • 1 where lights are in state 1
  • all available junctions
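The channel layout listed above can be sketched with NumPy. The function name, the argument types, and the assumption that every junction carries exactly one light are illustrative only; the channel order simply follows the list.

```python
import numpy as np


def build_observation(road_mask, car_positions, agent_junction, light_states):
    """Build a (6, N, M) observation for one agent.

    road_mask:      (N, M) array, 1 where there is road
    car_positions:  iterable of (row, col) tuples
    agent_junction: (row, col) of this agent's junction
    light_states:   dict mapping (row, col) of each junction to state 0 or 1
    """
    n, m = road_mask.shape
    obs = np.zeros((6, n, m), dtype=np.float32)
    obs[0] = road_mask                      # channel 0: the map
    for r, c in car_positions:
        obs[1, r, c] = 1.0                  # channel 1: car positions
    obs[2][agent_junction] = 1.0            # channel 2: this agent's junction
    for (r, c), state in light_states.items():
        obs[3 + state, r, c] = 1.0          # channel 3: state 0, channel 4: state 1
        obs[5, r, c] = 1.0                  # channel 5: all junctions
    return obs
```

Each agent gets its own copy of the tensor; only channel 2 differs between agents.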

To speed up training, the map was removed from the observation and the CNN was replaced with a simple FCN.

Reward

The reward in each step is calculated as (number of cars that moved / total cars).

The reward is summed over all agents and environment steps.

For simulation evaluation, the step reward is averaged over 100 steps.
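A minimal sketch of the reward arithmetic described above. The function names and the handling of an empty city (returning 1.0 when there are no cars) are assumptions, not taken from the project code.

```python
def step_reward(moved_cars: int, total_cars: int) -> float:
    """Fraction of cars that moved in this step."""
    if total_cars == 0:
        return 1.0  # assumption: no cars means no fuel was wasted
    return moved_cars / total_cars


def evaluation_score(step_rewards: list[float]) -> float:
    """Average of the collected step rewards (e.g. 100 steps in evaluation)."""
    return sum(step_rewards) / len(step_rewards)
```

For example, if 3 of 4 cars moved in a step, the step reward is 0.75, which corresponds to 75% fuel use efficiency for that step.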

Authors

An amazing team of Computer Science students from AGH UST.

Contributors

  • arch4ngel21
  • bblachut
  • carbon225
  • kosmydel
