SEPTA-Delay-Analysis

An analysis of SEPTA on-time performance using Kaggle data.

Introduction

Approximately 119,000 daily riders take advantage of SEPTA’s Regional Rail system, which provides service to the Philadelphia metropolitan area. The SEPTA (Southeastern Pennsylvania Transportation Authority) Regional Rail system consists of commuter rail service on 13 branches to more than 150 active stations in Philadelphia, Pennsylvania, and its suburbs and satellite cities.

SEPTA reports On-Time Performance (OTP) to measure service reliability. OTP identifies the number of trains for all rail lines that arrive at their scheduled destination at the scheduled time. However, by industry standard, a train may arrive up to 5 minutes and 59 seconds after its scheduled time and still be considered on-time. SEPTA has set an On-Time Performance target such that 91% of its trains arrive on time. Thus, even with 100% “on time” performance, trains may still arrive late, forcing commuters to deal with uncertainty and lost time – especially if they rely on back-to-back connections.
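To make the grace window concrete, here is a minimal sketch of how an OTP figure could be computed from arrival delays. The function names and the sample delays are my own illustrations, not SEPTA's method or values from the dataset; the only number taken from the text above is the 5 minute 59 second threshold.

```python
from typing import Sequence

# Latest arrival still counted as on time: 5 min 59 s after schedule.
ON_TIME_GRACE_SECONDS = 5 * 60 + 59  # 359 seconds


def is_on_time(delay_seconds: float) -> bool:
    """Return True if the arrival falls inside the grace window."""
    return delay_seconds <= ON_TIME_GRACE_SECONDS


def on_time_performance(delays_seconds: Sequence[float]) -> float:
    """Fraction of arrivals counted as on time (0.0 to 1.0)."""
    if not delays_seconds:
        raise ValueError("need at least one arrival")
    on_time = sum(1 for d in delays_seconds if is_on_time(d))
    return on_time / len(delays_seconds)


# Hypothetical delays in seconds (made-up values for illustration).
sample_delays = [0, 120, 359, 360, 900]
otp = on_time_performance(sample_delays)
print(f"OTP: {otp:.0%}")
```

In this toy sample, the trains that are 2 minutes and 5:59 late still count toward OTP, which is exactly why a high OTP does not guarantee punctual arrivals.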

The blue lines on the map show all of SEPTA's Regional Rail lines (image from Google Maps).

Image

TWINT API

Twint is an advanced Twitter scraping tool written in Python that allows for scraping Tweets from Twitter profiles without using Twitter's API. I used this tool to get all tweets from @SEPTA, SEPTA's official Twitter account.

Philly Weather data

I also found weather data for Philadelphia covering 2016.

Kaggle data

The Kaggle dataset has two CSV files: otp.csv and trainView.csv.

“otp.csv” contains On-Time Performance information from 23 March 2016 to 6 November 2016:

  • train_id
  • direction ('N' or 'S', i.e. Northbound or Southbound)
  • origin ('Warminster', 'Glenside',...'Airport Terminal..')
  • next_station (the station stop at timeStamp)
  • date
  • status ('On Time', '5 min', ... the train's lateness; 999 is a suspended train)
  • timeStamp

Note: according to the dataset, only 32% of trains run perfectly “on time”, another 30% run 1-3 minutes late, and the remaining ~1/3 run more than 3 minutes late.
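A breakdown like the one above could be derived from the status column roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, I assume every late status has the form "N min", and I assume suspended trains (999) are left out of the shares. The sample statuses are made up, not rows from otp.csv.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

SUSPENDED = "999"  # status value that marks a suspended train


def status_to_minutes(status: str) -> Optional[int]:
    """Convert a status string to minutes late; None for a suspended train."""
    s = status.strip()
    if s == SUSPENDED:
        return None
    if s.lower() == "on time":
        return 0
    # Assumed format for late trains: "<minutes> min", e.g. "5 min".
    return int(s.split()[0])


def bucket(minutes: Optional[int]) -> str:
    """Assign a lateness bucket matching the breakdown in the note above."""
    if minutes is None:
        return "suspended"
    if minutes == 0:
        return "on time"
    if minutes <= 3:
        return "1-3 min late"
    return ">3 min late"


def lateness_shares(statuses: Iterable[str]) -> Dict[str, float]:
    """Share of running (non-suspended) trains in each lateness bucket."""
    counts = Counter(bucket(status_to_minutes(s)) for s in statuses)
    running = sum(n for b, n in counts.items() if b != "suspended")
    return {b: n / running for b, n in counts.items() if b != "suspended"}


# Hypothetical statuses for illustration only.
sample_statuses = ["On Time", "1 min", "5 min", "999", "On Time", "12 min"]
shares = lateness_shares(sample_statuses)
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.0%}")
```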

trainView.csv - GPS Train data (early release)

GPS coordinates are based on track telemetry.

  • train_id
  • status
  • next_station
  • service
  • dest
  • lon
  • lat
  • source
  • track_change
  • track
  • date
  • timeStamp0 (first timeStamp at coordinates)
  • timeStamp1 (last timeStamp at coordinates)
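Since timeStamp0 and timeStamp1 bracket the time a train was reported at one set of coordinates, their difference gives a rough dwell time at that position. The sketch below assumes the timestamps are formatted as "YYYY-MM-DD HH:MM:SS"; that format, the function name, and the sample row are assumptions for illustration, so check them against the actual file.

```python
from datetime import datetime

# Assumed timestamp layout; adjust if trainView.csv uses a different one.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def dwell_seconds(time_stamp0: str, time_stamp1: str) -> float:
    """Seconds between the first and last report at the same coordinates."""
    first = datetime.strptime(time_stamp0, TIMESTAMP_FORMAT)
    last = datetime.strptime(time_stamp1, TIMESTAMP_FORMAT)
    return (last - first).total_seconds()


# Hypothetical row with made-up values, using trainView.csv column names.
row = {
    "train_id": "778",
    "timeStamp0": "2016-03-23 00:01:47",
    "timeStamp1": "2016-03-23 00:03:17",
}
dwell = dwell_seconds(row["timeStamp0"], row["timeStamp1"])
print(f"Train {row['train_id']} stayed at these coordinates for {dwell:.0f} s")
```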

Machine Learning

I determine the delays in the OTP dataset. Weather data can be combined with the on-time performance data to see its effect as well. The aim of this section is to predict delays in the SEPTA Regional Rail system. The machine learning results make a strong case that future delays on Regional Rail lines can be predicted.
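As a rough illustration of combining weather with delays, the sketch below joins daily mean delay to daily precipitation by date and fits a one-variable least-squares line. It is not the model used in this project. The precipitation field, the function names, and all values are made up for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def daily_mean_delay(otp_rows: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average minutes late per date from (date, minutes_late) pairs."""
    by_date: Dict[str, List[float]] = defaultdict(list)
    for date, minutes in otp_rows:
        by_date[date].append(minutes)
    return {date: mean(values) for date, values in by_date.items()}


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up toy data: (date, minutes late) and daily precipitation in inches.
otp_rows = [
    ("2016-03-23", 0), ("2016-03-23", 4),
    ("2016-03-24", 2), ("2016-03-24", 6),
    ("2016-03-25", 4), ("2016-03-25", 8),
]
precipitation = {"2016-03-23": 0.0, "2016-03-24": 0.5, "2016-03-25": 1.0}

delays = daily_mean_delay(otp_rows)
dates = sorted(set(delays) & set(precipitation))  # inner join on date
slope, intercept = fit_line(
    [precipitation[d] for d in dates],
    [delays[d] for d in dates],
)
print(f"predicted delay = {slope:.1f} * precipitation + {intercept:.1f} minutes")
```

A single weather feature is only a starting point; a fuller model would likely add features such as line, direction, and time of day.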

Contributors

arthtalati
